Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2012-01-12 Thread Jean Spirat
Cutting the backup into four parts did the trick; I also moved the part
that took the longest over to tar. Curiously, it is not the part with the
most files but the part with the most directories that takes so long to
back up :)

  Anyway, the 8 million files are backed up now.

Thanks for your help.

regards,
Jean.



Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-19 Thread gagablubber
You could transfer the tar files to the BackupPC host, not into the pool
but into a temporary directory, and unpack them there, all via a pre-backup
script. Then BackupPC steps in and makes a local backup of these temporary
files, so you still get the pooling. In the post-backup script you flush
the temporary files again.

Better?
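
A rough sketch of what that pre/post pair could look like (all paths are
made up, and I am assuming the scripts would be wired in through BackupPC's
$Conf{DumpPreUserCmd} / $Conf{DumpPostUserCmd} hooks):

  #!/bin/sh
  # pre-backup.sh (hypothetical): rebuild a staging tree from the uploaded
  # tar files so BackupPC can back it up like a normal local share.
  set -e
  INCOMING=/var/backups/webapp-tars     # where the remote side drops its tar files
  STAGE=/var/backups/webapp-stage       # directory BackupPC is pointed at
  rm -rf "$STAGE"
  mkdir -p "$STAGE"
  for t in "$INCOMING"/*.tar; do
      tar -xf "$t" -C "$STAGE"          # unpack; identical files pool on the BackupPC side
  done

The post-backup counterpart would simply be rm -rf /var/backups/webapp-stage.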


Timothy J Massey wrote:
 I'd rather deal with a few tar files, too, but you'll lose pooling...
 unless the script that makes the tar files is intelligent, in which case
 BackupPC is somewhat overkill.

 Basically, your choices are poor no matter what.  Garbage in, garbage out,
 and all that...

 Timothy J. Massey
 Out of the Box Solutions Inc.

 Sent from my iPad

 On Dec 18, 2011, at 1:41 PM, gagablub...@vollbio.de wrote:

  Why don't you ask the developers to write a script that creates one or a
  few tar files out of this massive number of files?
  The execution of that script could be triggered via an HTTP request (with
  authentication). On the BackupPC side you could call this script via the
  pre-backup command before every backup... Then just transfer the files
  in whatever way you like.

  This way they don't need to give you access to their system, but you
  could still do fast and easy backups.

  On 16.12.11 16:57, Les Mikesell wrote:
   [...]


Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-19 Thread Les Mikesell
On Mon, Dec 19, 2011 at 2:27 AM,  gagablub...@vollbio.de wrote:
 You could transfer the tar files to the BackupPC host, not into the pool
 but into a temporary directory, and unpack them there, all via a pre-backup
 script. Then BackupPC steps in and makes a local backup of these temporary
 files, so you still get the pooling. In the post-backup script you flush
 the temporary files again.

This would take some clever timestamp management to do incremental tars,
or you would end up copying all the data every time, which I think is the
real problem here at the NFS level anyway.
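
For what it's worth, a minimal sketch of that timestamp management (paths
and names are invented, and GNU tar is assumed for --newer-mtime):

  #!/bin/sh
  # incr-tar.sh (hypothetical): pack only files changed since the last run,
  # so unchanged data is not copied over again on every backup.
  set -e
  SRC=/mnt/webapp-nfs                       # hypothetical mount of the app data
  OUT=/var/backups/webapp-tars
  STAMP=$OUT/.last-run
  NOW=$(date '+%Y-%m-%d %H:%M:%S')
  mkdir -p "$OUT"
  if [ -f "$STAMP" ]; then
      tar -cf "$OUT/incr-$(date +%Y%m%d%H%M).tar" \
          --newer-mtime="$(cat "$STAMP")" -C "$SRC" .
  else
      tar -cf "$OUT/full-$(date +%Y%m%d%H%M).tar" -C "$SRC" .
  fi
  echo "$NOW" > "$STAMP"                    # only reached if tar succeeded (set -e)

It still will not notice deletions or renames, which is part of why rsync
normally makes the better backup.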

-- 
  Les Mikesell
lesmikes...@gmail.com



Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-19 Thread Tim Fletcher
On Mon, 2011-12-19 at 12:32 -0600, Les Mikesell wrote:
 On Mon, Dec 19, 2011 at 12:04 PM, Jean Spirat jean.spi...@squirk.org wrote:

  I mount the NFS share directly on the backuppc server, so no need for
  rsyncd here; this is like a local backup, with the NFS overhead of course.

 The whole point of rsync is that it can read the files locally and use
 block checksums to decide what it really has to copy over the network.
 Doing it over NFS, you've already had to copy it over the network just so
 rsync at the wrong end can read it (and decide that it didn't have
 to...).

I think the real problem is the metadata access. After a bit of digging I
dug up the excerpt below; it compares iSCSI with NFS.

What might help is tweaking the NFS mount settings to improve the metadata
caching and so on.
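
Something along these lines on the backuppc server, for example (server
name and paths are invented; actimeo only lengthens the attribute cache,
so trade it off against how stale the metadata may get during the walk):

  # Hypothetical read-only mount of the web-app export with a longer
  # attribute cache, to cut down on GETATTR round trips while BackupPC
  # walks the tree.
  mount -t nfs -o ro,noatime,actimeo=600 nfsserver:/export/webapp /mnt/webapp-nfs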

6.2 Meta-data intensive applications

NFS and iSCSI show their greatest differences in their handling of
meta-data intensive applications. Overall, we find that iSCSI outperforms
NFS for meta-data intensive workloads (workloads where the network traffic
is dominated by meta-data accesses).

The better performance of iSCSI can be attributed to two factors. First,
NFS requires clients to update meta-data synchronously to the server. In
contrast, iSCSI, when used in conjunction with modern file systems,
updates meta-data asynchronously. An additional benefit of asynchronous
meta-data updates is that it enables update aggregation: multiple
meta-data updates to the same cached block are aggregated into a single
network write, yielding significant savings. Such optimizations are not
possible in NFS v2 or v3 due to their synchronous meta-data update
requirement.

Second, iSCSI also benefits from aggressive meta-data caching by the file
system. Since iSCSI reads are in the granularity of disk blocks, the file
system reads and caches entire blocks containing meta-data; applications
with meta-data locality benefit from such caching. Although the NFS client
can also cache meta-data, NFS clients need to perform periodic consistency
checks with the server to provide weak consistency guarantees across
client machines that share the same NFS namespace. Since the concept of
sharing does not exist in the SCSI architectural model, the iSCSI protocol
also does not pay the overhead of such a consistency protocol.

Full details are here: http://lass.cs.umass.edu/papers/pdf/FAST04.pdf

-- 
Tim Fletcher t...@night-shade.org.uk




Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-19 Thread Pedro M. S. Oliveira
Sorry to take so long to reply.
Yes, it saves me a lot of time; let me explain.
Although I have a fast SAN and servers, the time to fetch lots of small
files is high: the maximum bandwidth I could get was about 5 MB/s. By
increasing concurrency I get about 20-40 MB/s, depending on what I'm
backing up at the moment. This way I get more out of the SAN and the
backup server. If I increased concurrency even further I could reach
higher throughput, but I don't want to steal all the I/O available from
BackupPC; to be honest I don't need to anyway, as I already get really
good performance this way.
This setup is running at a large financial group and it outperforms very
expensive (and complex) proprietary solutions.

The backuppc server should have a fair amount of RAM and CPU, and should
not be virtualized. In my case it is a 4-core server with 8 GB of RAM
(although it swaps a bit). I'm also using ssh + rsync, which adds some
overhead, but nothing critical.

cheers
pedro
Sent from my galaxy nexus.
www.linux-geex.com
 On Dec 19, 2011 6:05 PM, Jean Spirat jean.spi...@squirk.org wrote:

  On 18/12/2011 20:44, Pedro M. S. Oliveira wrote:

   You may try to use rsyncd directly on the server; this may speed things
   up. Another thing is to split the large backup into several smaller
   ones. I have an email cluster with 8 TB and millions of small files
   (I'm using dovecot), and there is also a SAN involved. In order to use
   all the available bandwidth I configured the backups to run by username
   ranges, a to e, f to j and so on, and they all run at the same time.
   Incrementals take about 1 hour and a full about 5.
   cheers
   pedro

  I mount the NFS share directly on the backuppc server, so no need for
  rsyncd here; this is like a local backup, with the NFS overhead of
  course.

  Do you gain a lot from splitting instead of doing just one big backup?
  At least you seem to have the same kind of file counts I have.

  regards,
  Jean.



Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-18 Thread gagablub...@vollbio.de
Why don't you ask the developers to write a script that creates one or a
few tar files out of this massive number of files?
The execution of that script could be triggered via an HTTP request (with
authentication). On the BackupPC side you could call this script via the
pre-backup command before every backup... Then just transfer the files
in whatever way you like.

This way they don't need to give you access to their system, but you
could still do fast and easy backups.
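
The BackupPC side of that could be as small as a wrapper script run from
the pre-backup hook; the URL, credentials and timeout below are pure
placeholders for whatever the hosting side would expose:

  #!/bin/sh
  # trigger-tars.sh (hypothetical): ask the hosting side to (re)build the
  # tar files over an authenticated HTTP call before the backup starts.
  # Could be wired in with something like:
  #   $Conf{DumpPreUserCmd} = '/usr/local/bin/trigger-tars.sh';
  set -e
  curl --fail --user backup:secret --max-time 3600 \
       https://webapp.example.org/admin/rebuild-backup-tars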



On 16.12.11 16:57, Les Mikesell wrote:
 On Fri, Dec 16, 2011 at 9:00 AM, Jean Spirat jean.spi...@squirk.org wrote:

   Excuse my off-topicness, but with that many small files I kind of
   expect a filesystem to reach certain limits. Why is that webapp written
   to use many little files? Why not a database where all that stuff is
   kept in blobs? That would be easier to maintain and easier to back up.

   Have fun,

  Fortunately it is not in my power to discuss the developers' choices; my
  only job is to try to figure out a way to make backups of the thing work :)

  I am sure most of you know the pain. I already strive to exclude the
  caching directories, because the devs seem to invent new ways to name
  them every two days (temp, temporary, temporaire, <enter your custom
  name>, cache, mycache, appli-cache, img-cache, variations with
  plurals...).

  The backuppc server has 16 GB of RAM, so on the server side it should be
  okay memory-wise. On my monitoring I never go under 5 GB free (but I
  only have data points every 5 minutes).

 Aside from excluding unneeded parts, one other approach that can
 sometimes help is to split the target into separate runs for different
 subdirectories.  To do that, you can make different 'host' entries,
 then use the ClientNameAlias setting to point them back to the same
 real target.  If the files in question are split into some reasonable
 upper-level subdirectories, you may be able to get sets that complete
 in a reasonable amount of time.




Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-18 Thread Pedro M. S. Oliveira
You may try to use rsyncd directly on the server; this may speed things up.
Another thing is to split the large backup into several smaller ones. I
have an email cluster with 8 TB and millions of small files (I'm using
dovecot), and there is also a SAN involved. In order to use all the
available bandwidth I configured the backups to run by username ranges, a
to e, f to j and so on, and they all run at the same time. Incrementals
take about 1 hour and a full about 5.
cheers
pedro


Sent from my galaxy nexus.
www.linux-geex.com
 On Dec 16, 2011 9:47 AM, Jean Spirat jean.spi...@squirk.org wrote:

  hi,

  I use BackupPC to back up a webserver. The issue is that the application
  running on it creates thousands of little files, used by a game to build
  maps and various other things. We are now at 100 GB of data and
  8,030,000 files, so the backups take 48 hours and more (it does not help
  that the files are on an NFS share). I think I have reached the point
  where file-by-file backup is at its limit.

  Has any of you run into this kind of issue, and how did you solve it?
  By going for a different way of backing up the files, such as a
  block/partition-level backup system?

  The issue is that so many files make the file-by-file process very slow.
  I was thinking about block-level backup, but I do not even know of tools
  that do it, apart from R1Backup, which is a commercial one.

  If anyone here has met the same issue and can give some pointers, that
  would be great, even more so if someone has found a way to keep using
  backuppc in this extreme situation.

  ps: the backuppc server and the web server are Debian Linux; I use the
  rsync method and back up the NFS share that I mount locally on the
  backuppc server.

  regards,
  Jean.


 --
 Learn Windows Azure Live!  Tuesday, Dec 13, 2011
 Microsoft is holding a special Learn Windows Azure training event for
 developers. It will provide a great way to learn Windows Azure and what it
 provides. You can attend the event by watching it streamed LIVE online.
 Learn more at http://p.sf.net/sfu/ms-windowsazure
 ___
 BackupPC-users mailing list
 BackupPC-users@lists.sourceforge.net
 List:https://lists.sourceforge.net/lists/listinfo/backuppc-users
 Wiki:http://backuppc.wiki.sourceforge.net
 Project: http://backuppc.sourceforge.net/

--
Learn Windows Azure Live!  Tuesday, Dec 13, 2011
Microsoft is holding a special Learn Windows Azure training event for 
developers. It will provide a great way to learn Windows Azure and what it 
provides. You can attend the event by watching it streamed LIVE online.  
Learn more at http://p.sf.net/sfu/ms-windowsazure___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/


Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-18 Thread Timothy J Massey

I'd rather deal with a few tar files, too, but you'll lose pooling...
unless the script that makes the tar files is intelligent, in which case
BackupPC is somewhat overkill.

Basically, your choices are poor no matter what.  Garbage in, garbage out,
and all that...

Timothy J. Massey
Out of the Box Solutions Inc.

Sent from my iPad

On Dec 18, 2011, at 1:41 PM, gagablub...@vollbio.de wrote:

 Why don't you ask the developers to write a script that creates one or a
 few tar files out of this massive number of files?
 The execution of that script could be triggered via an HTTP request (with
 authentication). On the BackupPC side you could call this script via the
 pre-backup command before every backup... Then just transfer the files
 in whatever way you like.

 This way they don't need to give you access to their system, but you
 could still do fast and easy backups.

 On 16.12.11 16:57, Les Mikesell wrote:
  [...]


[BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-16 Thread Jean Spirat
hi,

I use BackupPC to back up a webserver. The issue is that the application
running on it creates thousands of little files, used by a game to build
maps and various other things. We are now at 100 GB of data and 8,030,000
files, so the backups take 48 hours and more (it does not help that the
files are on an NFS share). I think I have reached the point where
file-by-file backup is at its limit.

Has any of you run into this kind of issue, and how did you solve it? By
going for a different way of backing up the files, such as a
block/partition-level backup system?

The issue is that so many files make the file-by-file process very slow.
I was thinking about block-level backup, but I do not even know of tools
that do it, apart from R1Backup, which is a commercial one.

If anyone here has met the same issue and can give some pointers, that
would be great, even more so if someone has found a way to keep using
backuppc in this extreme situation.

ps: the backuppc server and the web server are Debian Linux; I use the
rsync method and back up the NFS share that I mount locally on the
backuppc server.

regards,
Jean.




Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-16 Thread Tim Fletcher
On Fri, 2011-12-16 at 10:42 +0100, Jean Spirat wrote:
 hi,

 I use BackupPC to back up a webserver. The issue is that the application
 running on it creates thousands of little files, used by a game to build
 maps and various other things. We are now at 100 GB of data and 8,030,000
 files, so the backups take 48 hours and more (it does not help that the
 files are on an NFS share). I think I have reached the point where
 file-by-file backup is at its limit.

 ps: the backuppc server and the web server are Debian Linux; I use the
 rsync method and back up the NFS share that I mount locally on the
 backuppc server.

I have a backup with a similar number of files in it and I have found that
tar is much better than rsync. Your issues are:

1. rsync will take a very long time and a very large amount of memory to
build the file tree, especially over NFS.

2. NFS isn't really a high-performance filesystem; you are better off
working locally on the server being backed up, via ssh.

I would suggest you try the following:

Move to tar over ssh on the remote webserver. The first full backup might
well take a long time, but the following ones should be faster.

tar+ssh backups do use more bandwidth, but as you are already using NFS I
am assuming you are on a local network of some sort.
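
In BackupPC terms that switch is only a couple of per-host settings; a
sketch, assuming BackupPC 3.x setting names, a Debian-style config path
and an invented share path, with key-based root ssh from the backup server
to the webserver already in place (the stock TarClientCmd already wraps
tar in ssh):

  # Hypothetical per-host override switching this target to tar over ssh.
  cat >> /etc/backuppc/webserver.pl <<'EOF'
  $Conf{XferMethod}   = 'tar';
  $Conf{TarShareName} = ['/var/www/app'];   # the tree as seen on the webserver itself
  EOF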

-- 
Tim Fletcher t...@night-shade.org.uk




Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-16 Thread Jean Spirat




 I would suggest you try the following:

 Move to tar over ssh on the remote webserver. The first full backup
 might well take a long time, but the following ones should be faster.

 tar+ssh backups do use more bandwidth, but as you are already using NFS
 I am assuming you are on a local network of some sort.


Hmm, I cannot use the FS directly: I have no access to the NFS server,
which is on the hosting company's side. I just have access to the
webserver that uses the NFS partition to store its content. Right now I
also mount the NFS share on the backup server, so I do not have the
overhead of ssh in the mix (but yes, I am on a 1000 Mbps LAN for the NFS
and backup servers).

To my understanding rsync has always seemed to be the more efficient of
the two, but I never challenged this fact ;p

I will have a look at tar and see if I can work with it.

If anyone else has some experience with it, feel free to contribute :)

regards,
Jean.





Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-16 Thread Tim Fletcher
On Fri, 2011-12-16 at 11:49 +0100, Jean Spirat wrote:

  I would suggest you try the following:

  tar+ssh backups do use more bandwidth, but as you are already using NFS
  I am assuming you are on a local network of some sort.

 To my understanding rsync has always seemed to be the more efficient of
 the two, but I never challenged this fact ;p

 I will have a look at tar and see if I can work with it.

http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg15217.html

This covers the pros and cons of tar and rsync in far more detail than I
can offer.

-- 
Tim Fletcher t...@night-shade.org.uk




Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-16 Thread Les Mikesell
On Fri, Dec 16, 2011 at 4:49 AM, Jean Spirat jean.spi...@squirk.org wrote:

 Hmm, I cannot use the FS directly: I have no access to the NFS server,
 which is on the hosting company's side. I just have access to the
 webserver that uses the NFS partition to store its content. Right now I
 also mount the NFS share on the backup server, so I do not have the
 overhead of ssh in the mix (but yes, I am on a 1000 Mbps LAN for the NFS
 and backup servers).

 To my understanding rsync has always seemed to be the more efficient of
 the two, but I never challenged this fact ;p

Rsync working natively is very efficient, but think about what it has to
do in your case.  It will have to read the entire file across NFS just so
rsync can compare contents and decide not to copy the content that
already exists in your backup.

 I will have a look at tar and see if I can work with it.

I'd try rsync over ssh first, at least if most of the files do not change
between runs.  If you don't have enough RAM to hold the directory listing,
or if there are changes to a large number of files per run, tar might be
faster.
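
A quick way to gauge that before touching the BackupPC config is a dry run
straight from the backup server against the webserver itself (host name
and path are invented):

  # Hypothetical probe: what would rsync-over-ssh transfer, bypassing the
  # NFS mount on the backup side? --dry-run changes nothing; --stats prints
  # file counts and the amount of data it would have sent.
  rsync -a --dry-run --stats -e ssh \
      root@webserver.example.org:/var/www/app/ /tmp/rsync-probe/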

-- 
   Les Mikesell
 lesmikes...@gmail.com



Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-16 Thread Steve
On Fri, Dec 16, 2011 at 4:42 AM, Jean Spirat jean.spi...@squirk.org wrote:
 The issue is that we are now at 100 GB of data and 8,030,000 files, so
 the backups take 48 hours and more (it does not help that the files are
 on an NFS share). I think I have reached the point where file-by-file
 backup is at its limit.

What about a script on the machine with all the files that uses tar to
put all (or some, or groups of) these little files into a few bigger
files, stored in a separate directory?  Run your script a few times a
day, exclude the directories with gazillions of files, and back up the
directory you created that holds the tar archives.
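
For instance, something as dumb as this run from cron would do (directory
names are made up):

  #!/bin/sh
  # roll-tars.sh (hypothetical): one tar per top-level data directory, kept
  # in a separate tree that BackupPC backs up instead of the original files.
  set -e
  SRC=/var/www/app/data            # the tree full of little files
  OUT=/srv/backup-tars             # what BackupPC actually backs up
  mkdir -p "$OUT"
  for d in "$SRC"/*/; do
      name=$(basename "$d")
      tar -cf "$OUT/$name.tar" -C "$SRC" "$name"
  done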

Steve

-- 
The universe is probably littered with the one-planet graves of
cultures which made the sensible economic decision that there's no
good reason to go into space--each discovered, studied, and remembered
by the ones who made the irrational decision. - Randall Munroe



Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-16 Thread Tim Fletcher
On Fri, 2011-12-16 at 07:33 -0600, Les Mikesell wrote:
 On Fri, Dec 16, 2011 at 4:49 AM, Jean Spirat jean.spi...@squirk.org wrote:

  To my understanding rsync has always seemed to be the more efficient of
  the two, but I never challenged this fact ;p

 Rsync working natively is very efficient, but think about what it has to
 do in your case.  It will have to read the entire file across NFS just so
 rsync can compare contents and decide not to copy the content that
 already exists in your backup.

  I will have a look at tar and see if I can work with it.

 I'd try rsync over ssh first, at least if most of the files do not change
 between runs.  If you don't have enough RAM to hold the directory listing,
 or if there are changes to a large number of files per run, tar might be
 faster.

The real issue with rsync is the memory usage for the 8 million entries
in the file list. This is because the first thing rsync does is walk the
tree, comparing against the already backed-up files to see whether the
date stamps have changed. This puts memory and disk load on both the
backup server and the backed-up client. The approach tar uses is just to
walk the directory tree and transfer everything newer than a timestamp
that backuppc passes to it.

This costs some extra network bandwidth but massively reduces the disk
and memory bandwidth needed on both the backuppc client and server.

The server that I am backing up, with ~7 million files, takes on the
order of 6000 minutes to back up with rsync; the bulk of that time is
taken up by rsync building the tree of files to transfer. The same server
takes about 2500 minutes with tar because of the simpler way of finding
files.

Overall rsync makes better backups, because it finds moved and deleted
files and is far, far more efficient with network bandwidth, but if you
understand the drawbacks and need the filesystem efficiency of tar then
it is still an excellent backup tool.

-- 
Tim Fletcher t...@night-shade.org.uk




Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-16 Thread Arnold Krille
Hi,

On Friday 16 December 2011 10:42:00 Jean Spirat wrote:
 I use BackupPC to back up a webserver. The issue is that the application
 running on it creates thousands of little files, used by a game to build
 maps and various other things. We are now at 100 GB of data and 8,030,000
 files, so the backups take 48 hours and more (it does not help that the
 files are on an NFS share). I think I have reached the point where
 file-by-file backup is at its limit.

Excuse my off-topicness, but with that many small files I kind of expect a
filesystem to reach certain limits. Why is that webapp written to use many
little files? Why not use a database where all that stuff is kept in blobs?
That would be easier to maintain and easier to back up.

Have fun,

Arnold




Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-16 Thread Jean Spirat

 Excuse my off-topicness, but with that many small files I kind of expect
 a filesystem to reach certain limits. Why is that webapp written to use
 many little files? Why not use a database where all that stuff is kept in
 blobs? That would be easier to maintain and easier to back up.

 Have fun,


Fortunately it is not in my power to discuss the developers' choices; my
only job is to try to figure out a way to make backups of the thing work :)

I am sure most of you know the pain. I already strive to exclude the
caching directories, because the devs seem to invent new ways to name them
every two days (temp, temporary, temporaire, <enter your custom name>,
cache, mycache, appli-cache, img-cache, variations with plurals...).


The backuppc server has 16 GB of RAM, so on the server side it should be
okay memory-wise. On my monitoring I never go under 5 GB free (but I only
have data points every 5 minutes).


regards,
Jean.



Re: [BackupPC-users] 8.030.000, Too much files to backup ?

2011-12-16 Thread Les Mikesell
On Fri, Dec 16, 2011 at 9:00 AM, Jean Spirat jean.spi...@squirk.org wrote:

  Excuse my off-topicness, but with that many small files I kind of expect
  a filesystem to reach certain limits. Why is that webapp written to use
  many little files? Why not use a database where all that stuff is kept
  in blobs? That would be easier to maintain and easier to back up.

  Have fun,


 Fortunately it is not in my power to discuss the developers' choices; my
 only job is to try to figure out a way to make backups of the thing work :)

 I am sure most of you know the pain. I already strive to exclude the
 caching directories, because the devs seem to invent new ways to name
 them every two days (temp, temporary, temporaire, <enter your custom
 name>, cache, mycache, appli-cache, img-cache, variations with
 plurals...).


 The backuppc server has 16 GB of RAM, so on the server side it should be
 okay memory-wise. On my monitoring I never go under 5 GB free (but I only
 have data points every 5 minutes).


Aside from excluding unneeded parts, one other approach that can
sometimes help is to split the target into separate runs for different
subdirectories.   To do that, you can make different 'host' entries,
then use the ClientNameAlias setting to point them back to the same
real target.  If the files in question are split into some reasonable
upper-level subdirectories, you may be able to get sets that complete
in a reasonable amount of time.
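
As an illustration only (BackupPC 3.x setting names, Debian-style config
paths, and invented host/share names), the split could look like two extra
entries in the hosts file, each with a small per-host override pointing
back at the same real machine:

  # Hypothetical: add "web-maps" and "web-tiles" as hosts in /etc/backuppc/hosts,
  # then give each one its own share but the same real target.
  cat > /etc/backuppc/web-maps.pl <<'EOF'
  $Conf{ClientNameAlias} = 'webserver.example.org';   # the one real target
  $Conf{RsyncShareName}  = ['/var/www/app/maps'];     # one upper-level subdirectory
  EOF
  cat > /etc/backuppc/web-tiles.pl <<'EOF'
  $Conf{ClientNameAlias} = 'webserver.example.org';
  $Conf{RsyncShareName}  = ['/var/www/app/tiles'];    # another subdirectory
  EOF

Each pseudo-host then gets its own schedule and its own (shorter) run.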

-- 
  Les Mikesell
lesmikes...@gmail.com
