Re: [BackupPC-users] 8.030.000, Too much files to backup ?
Cutting the backup into four pieces did the trick; I also moved the part that took the longest over to tar. Curiously, it is not the part with the most files but the part with the most directories that takes so long to back up :) Anyway, the 8 million files are backed up now. Thanks for your help.

regards,
Jean.
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
You could transfer the tars to the BackupPC host, not into the pool but into a temp directory, and unpack them there, all via a pre-backup script. BackupPC then steps in and creates a local backup of these temporary files, so you still get the pooling. In the post-backup script you flush the temp files. Better?

Timothy J Massey wrote:
> I'd rather deal with a few tarfiles, too, but you'll lose pooling... Unless the script that makes the tarfiles is intelligent. In which case BackupPC is somewhat overkill. Basically, your choices are poor no matter what. Garbage in, garbage out, and all that...
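In config.pl terms that could look roughly like this (the staging path and unpack script are made-up placeholders, not a tested recipe):

$Conf{ClientNameAlias}    = 'localhost';   # the "client" is the BackupPC host itself
$Conf{XferMethod}         = 'tar';
$Conf{TarShareName}       = ['/srv/backuppc-staging'];
# Pre-dump: unpack the transferred tars into the staging share.
$Conf{DumpPreUserCmd}     = '/usr/local/bin/unpack-app-tars /srv/backuppc-staging';
# Post-dump: flush the staging area again.
$Conf{DumpPostUserCmd}    = '/bin/rm -rf /srv/backuppc-staging/*';
$Conf{UserCmdCheckStatus} = 1;             # skip the backup if unpacking failed
# For a truly local dump, $Conf{TarClientCmd} would also need its ssh wrapper
# removed; see the BackupPC documentation for the exact command.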
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
On Mon, Dec 19, 2011 at 2:27 AM, gagablub...@vollbio.de wrote:
> You could transfer the tars to the BackupPC host, not into the pool but into a temp directory, and unpack them there, all via a pre-backup script. BackupPC then steps in and creates a local backup of these temporary files, so you still get the pooling. In the post-backup script you flush the temp files.

This would take some clever timestamp management to do incremental tars, or you would end up copying all the data every time - which I think is the real problem here at the NFS level anyway.

--
Les Mikesell
lesmikes...@gmail.com
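As an illustration of the kind of timestamp management meant here (a sketch only, with made-up paths, assuming GNU tar on the client): keep a stamp file and only archive files newer than the previous successful run.

#!/usr/bin/perl
# Sketch: build an incremental tar of everything changed since the last run.
use strict;
use warnings;
use POSIX qw(strftime);

my $src   = '/var/www/appdata';          # made-up source tree
my $dest  = '/srv/backuppc-staging';     # made-up staging directory
my $stamp = "$dest/.last-run";

# Time of the previous run; fall back to the epoch on the first run.
my $since = -e $stamp
    ? strftime('%Y-%m-%d %H:%M:%S', localtime((stat $stamp)[9]))
    : '1970-01-01 00:00:00';

# GNU tar's --newer-mtime limits the archive to files changed since $since.
system('tar', '--create', '--gzip',
       "--newer-mtime=$since",
       '--file', "$dest/app-incr.tar.gz",
       '-C', $src, '.') == 0
    or die "tar failed: $?\n";

# Update the stamp only after a successful tar so a failed run gets retried.
open my $fh, '>', $stamp or die "cannot update $stamp: $!\n";
close $fh;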
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
On Mon, 2011-12-19 at 12:32 -0600, Les Mikesell wrote:
> On Mon, Dec 19, 2011 at 12:04 PM, Jean Spirat jean.spi...@squirk.org wrote:
>> I directly mount the nfs share on the backuppc server, so no need for rsyncd here - this is like a local backup, with the NFS overhead of course.
>
> The whole point of rsync is that it can read the files locally with block checksums to decide what it really has to copy over the network. Doing it over NFS, you've already had to copy it over the network so rsync at the wrong end can read it (and decide that it didn't have to...).

I think the real problem is the metadata access, and after a bit of digging I've dug this up; it compares iSCSI with NFS. What might help is tweaking the NFS settings to improve the metadata caching etc.

  6.2 Meta-data intensive applications

  NFS and iSCSI show their greatest differences in their handling of
  meta-data intensive applications. Overall, we find that iSCSI outperforms
  NFS for meta-data intensive workloads - workloads where the network
  traffic is dominated by meta-data accesses.

  The better performance of iSCSI can be attributed to two factors. First,
  NFS requires clients to update meta-data synchronously to the server. In
  contrast, iSCSI, when used in conjunction with modern file systems,
  updates meta-data asynchronously. An additional benefit of asynchronous
  meta-data updates is that it enables update aggregation - multiple
  meta-data updates to the same cached block are aggregated into a single
  network write, yielding significant savings. Such optimizations are not
  possible in NFS v2 or v3 due to their synchronous meta-data update
  requirement.

  Second, iSCSI also benefits from aggressive meta-data caching by the
  file system. Since iSCSI reads are in granularity of disk blocks, the
  file system reads and caches entire blocks containing meta-data;
  applications with meta-data locality benefit from such caching. Although
  the NFS client can also cache meta-data, NFS clients need to perform
  periodic consistency checks with the server to provide weak consistency
  guarantees across client machines that share the same NFS namespace.
  Since the concept of sharing does not exist in the SCSI architectural
  model, the iSCSI protocol also does not pay the overhead of such a
  consistency protocol.

Full details are here: http://lass.cs.umass.edu/papers/pdf/FAST04.pdf

--
Tim Fletcher t...@night-shade.org.uk
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
Sorry to take so long to reply. Yes, it saves me a lot of time; let me explain. Although I have a fast SAN and servers, the time for fetching lots of small files is high: the max bandwidth I could get was about 5 MB/s; increasing concurrency I can get about 20-40 MB/s depending on what I'm backing up at the moment. This way I can get more out of the SAN and the backup server. If I increased concurrency even more I could reach higher performance, but I don't want to steal all the available I/O just for BackupPC - and to be sincere I don't need it anyway, as I get really good performance this way. This setup is running at a large financial group and it outperforms very expensive (and complex) proprietary solutions.

The BackupPC server should have a fair amount of RAM and CPU, and isn't virtualized - in my case a 4-core server with 8 GB of RAM (although it swaps a bit). I'm also using ssh + rsync, which adds some overhead, but nothing critical in any way.

cheers
pedro

Sent from my galaxy nexus. www.linux-geex.com

On Dec 19, 2011 6:05 PM, Jean Spirat jean.spi...@squirk.org wrote:
> On 18/12/2011 20:44, Pedro M. S. Oliveira wrote:
>> you may try to use rsyncd directly on the server. [...] in order to use all the bandwidth available I configured the backup to run for usernames starting a to e, f to j and so on, then they all run at the same time.
>
> I directly mount the nfs share on the backuppc server, so no need for rsyncd here - this is like a local backup, with the NFS overhead of course. Do you win a lot from splitting instead of doing just one big backup? At least you seem to have the same kind of file numbers I have.
>
> regards,
> Jean.
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
Why don't you ask the developers to write a script that creates one or a few tar files out of this massive number of files? The execution of that script could be triggered via an http request (with authentication). On the BackupPC side you could call this script via a pre-backup command before every backup... Then just transfer the files in whatever way you like. This way they don't need to give you access to their system, but you could do fast and easy backups.

On 16.12.11 16:57, Les Mikesell wrote:
> Aside from excluding unneeded parts, one other approach that can sometimes help is to split the target into separate runs for different subdirectories. To do that, you can make different 'host' entries, then use the ClientNameAlias setting to point them back to the same real target.
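To make the pre-backup hook concrete, something along these lines in config.pl would trigger such a script before each run (the URL, credentials and archive path are invented placeholders, not a real endpoint):

# Ask the web application to (re)build its tar archives before the dump.
$Conf{DumpPreUserCmd}     = '/usr/bin/curl --fail --user backup:SECRET'
                          . ' https://webserver.example/build-backup-tars';
$Conf{UserCmdCheckStatus} = 1;                     # abort the backup if that call fails
$Conf{BackupFilesOnly}    = ['/var/backups/tars']; # then back up only the archives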
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
You may try to use rsyncd directly on the server; this may speed things up. Another thing is to split the large backup into several smaller ones. I have an email cluster with 8 TB and millions of small files (I'm using dovecot); there's also a SAN involved. In order to use all the bandwidth available I configured the backup to run for usernames starting a to e, f to j and so on, then they all run at the same time. Incrementals take about 1 hour and fulls about 5.

cheers
pedro

Sent from my galaxy nexus. www.linux-geex.com

On Dec 16, 2011 9:47 AM, Jean Spirat jean.spi...@squirk.org wrote:
> hi, I use backuppc to save a webserver. The issue is that the application used on it is making thousands of little files used by a game to create maps and various things. The issue is that we are now at 100 GB of data and 8.030.000 files, so the backups take 48 hours and more (to 'help', the files are on an NFS share). I think I have come to the point where file backup is at its limit. [...]
>
> regards,
> Jean.
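For reference, pointing BackupPC at an rsync daemon instead of rsync over ssh is mostly a config.pl change like the following (the module name and password are placeholders, and rsyncd.conf on the web server has to export that module):

$Conf{XferMethod}     = 'rsyncd';
$Conf{RsyncShareName} = ['webdata'];    # rsyncd module name, not a filesystem path
$Conf{RsyncdUserName} = 'backuppc';
$Conf{RsyncdPasswd}   = 'CHANGEME';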
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
I'd rather deal with a few tarfiles, too, but you'll lose pooling... Unless the script that makes the tarfiles is intelligent. In which case BackupPC is somewhat overkill. Basically, your choices are poor no matter what. Garbage in, garbage out, and all that...

Timothy J. Massey
Out of the Box Solutions Inc.
Sent from my iPad

On Dec 18, 2011, at 1:41 PM, gagablub...@vollbio.de wrote:
> Why don't you ask the developers to write a script that creates one or a few tar files out of this massive number of files? The execution of that script could be triggered via an http request (with authentication). On the BackupPC side you could call this script via a pre-backup command before every backup... Then just transfer the files in whatever way you like. This way they don't need to give you access to their system, but you could do fast and easy backups.
[BackupPC-users] 8.030.000, Too much files to backup ?
hi,

I use backuppc to save a webserver. The issue is that the application used on it is making thousands of little files used by a game to create maps and various things. The issue is that we are now at 100 GB of data and 8.030.000 files, so the backups take 48 hours and more (to 'help', the files are on an NFS share). I think I have come to the point where file backup is at its limit.

Has any of you reached this kind of issue, and how did you solve it - by going for a different way of backing up the files, using a block/partition backup system, etc.? The issue is that so many files make the file-by-file process very slow. I was thinking about block backup, but I do not even know tools that do it, apart from R1backup, which is a commercial one.

If anyone here has met the same issue and can give some pointers it would be great - even, perhaps, if someone found a way to continue using backuppc in that extreme situation.

ps: the backuppc server and the web server are Debian Linux; I use the rsync method and back up the NFS share that I mount locally on the backuppc server.

regards,
Jean.
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
On Fri, 2011-12-16 at 10:42 +0100, Jean Spirat wrote:
> hi, I use backuppc to save a webserver. The issue is that the application used on it is making thousands of little files used by a game to create maps and various things. The issue is that we are now at 100 GB of data and 8.030.000 files, so the backups take 48 hours and more (to 'help', the files are on an NFS share). I think I have come to the point where file backup is at its limit.
>
> ps: the backuppc server and the web server are Debian Linux; I use the rsync method and back up the NFS share that I mount locally on the backuppc server.

I have a backup with a similar number of files in it and I have found that tar is much better than rsync. Your issues are:

1. rsync will take a very long time and a very large amount of memory to build the file tree, especially over NFS.
2. NFS isn't really a high-performance filesystem; you are better off working locally on the server being backed up, via ssh.

I would suggest you try the following: move to tar over ssh on the remote webserver. The first full backup might well take a long time, but the following ones should be faster. tar+ssh backups however use more bandwidth, but as you are already using NFS I am assuming you are on a local network of some sort.

--
Tim Fletcher t...@night-shade.org.uk
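In config.pl terms the switch is roughly this (the share path is a placeholder for wherever the application data lives on the web server):

$Conf{XferMethod}   = 'tar';
$Conf{TarShareName} = ['/var/www/appdata'];   # assumed path on the web server
# BackupPC's stock $Conf{TarClientCmd} already wraps tar in ssh; incrementals
# are driven by $Conf{TarIncrArgs}, which passes a --newer=$incrDate argument
# so only files changed since the reference backup are sent.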
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
> I would suggest you try the following: move to tar over ssh on the remote webserver. The first full backup might well take a long time, but the following ones should be faster. tar+ssh backups however use more bandwidth, but as you are already using NFS I am assuming you are on a local network of some sort.

Hum, I cannot directly use the FS: I have no access to the NFS server, which is on the hosting company side; I just have access to the webserver that uses the NFS partition to store its content. Right now I also mount the NFS share on the backup server, so that way I don't have the overhead of ssh in the mix (but yes, I am on a 1000 Mbps LAN for the NFS and backup servers).

To my understanding rsync had always seemed to be the more efficient of the two, but I never challenged this fact ;p I will have a look at tar and see if I can work with it. If any other person has some experience with it, feel free to contribute :)

regards,
Jean.
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
On Fri, 2011-12-16 at 11:49 +0100, Jean Spirat wrote:
>> I would suggest you try the following: [...] tar+ssh backups however use more bandwidth, but as you are already using NFS I am assuming you are on a local network of some sort.
>
> To my understanding rsync had always seemed to be the more efficient of the two, but I never challenged this fact ;p I will have a look at tar and see if I can work with it.

http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg15217.html

That covers the pros and cons of tar and rsync in far more detail than I can offer.

--
Tim Fletcher t...@night-shade.org.uk
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
On Fri, Dec 16, 2011 at 4:49 AM, Jean Spirat jean.spi...@squirk.org wrote:
> Hum, I cannot directly use the FS: I have no access to the NFS server, which is on the hosting company side; I just have access to the webserver that uses the NFS partition to store its content. Right now I also mount the NFS share on the backup server, so that way I don't have the overhead of ssh in the mix (but yes, I am on a 1000 Mbps LAN for the NFS and backup servers).
>
> To my understanding rsync had always seemed to be the more efficient of the two, but I never challenged this fact ;p

Rsync working natively is very efficient, but think about what it has to do in your case. It will have to read the entire file across NFS just so rsync can compare contents and decide not to copy the content that already exists in your backup.

> I will have a look at tar and see if I can work with it.

I'd try rsync over ssh first, at least if most of the files do not change between runs. If you don't have enough RAM to hold the directory listing, or if there are changes to a large number of files per run, tar might be faster.

--
Les Mikesell
lesmikes...@gmail.com
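For what it's worth, switching from the NFS mount to rsync over ssh is mostly this in config.pl (the path is a placeholder; the command shown is the stock ssh wrapper BackupPC ships with):

$Conf{XferMethod}     = 'rsync';
$Conf{RsyncShareName} = ['/var/www/appdata'];   # path as seen on the web server itself
$Conf{RsyncClientCmd} = '$sshPath -q -x -l root $host $rsyncPath $argList+';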
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
On Fri, Dec 16, 2011 at 4:42 AM, Jean Spirat jean.spi...@squirk.org wrote:
> The issue is that we are now at 100 GB of data and 8.030.000 files, so the backups take 48 hours and more (to 'help', the files are on an NFS share). I think I have come to the point where file backup is at its limit.

What about a script on this machine with all the files that uses tar to put all (or some, or groups of) these little files into a few bigger files, stored in a separate directory? Run your script a few times a day, and just exclude the directories with gazillions of files and back up the directory you created that has the tar archives in them.

Steve

--
The universe is probably littered with the one-planet graves of cultures which made the sensible economic decision that there's no good reason to go into space--each discovered, studied, and remembered by the ones who made the irrational decision.
- Randall Munroe
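In config.pl that boils down to something like the following (directory names invented for illustration): point BackupPC only at the directory holding the pre-built archives, or keep the original share and exclude the huge trees instead.

$Conf{RsyncShareName}     = ['/var/www/archives'];   # only the directory of tar files
# ...or keep the full share and exclude the gazillion-file trees:
# $Conf{BackupFilesExclude} = ['/maps', '/tiles'];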
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
On Fri, 2011-12-16 at 07:33 -0600, Les Mikesell wrote:
> On Fri, Dec 16, 2011 at 4:49 AM, Jean Spirat jean.spi...@squirk.org wrote:
>> To my understanding rsync had always seemed to be the more efficient of the two, but I never challenged this fact ;p
>
> Rsync working natively is very efficient, but think about what it has to do in your case. It will have to read the entire file across NFS just so rsync can compare contents and decide not to copy the content that already exists in your backup.
>
>> I will have a look at tar and see if I can work with it.
>
> I'd try rsync over ssh first, at least if most of the files do not change between runs. If you don't have enough RAM to hold the directory listing, or if there are changes to a large number of files per run, tar might be faster.

The real issue with rsync is the memory usage for the 8 million entries in the file list. This is because the first thing that happens is that rsync walks the tree, comparing with the already backed-up files to see if the date stamp has changed. This puts memory and disk load on both the backup server and the backed-up client.

The approach that tar uses is just to walk the directory tree and transfer everything newer than a timestamp that backuppc passes to it. This costs some extra network bandwidth but massively reduces the disk and memory bandwidth needed on both the backuppc client and server.

The server that I am backing up with ~7 million files takes on the order of 6000 minutes to back up with rsync; the bulk of that time is taken up by rsync building the tree of files to transfer. The same server takes about 2500 minutes with tar because of the simpler way of finding files.

Overall rsync makes better backups, because it finds moved and deleted files and is far, far more efficient with network bandwidth, but if you understand the drawbacks and need the filesystem efficiency of tar, then it is still an excellent backup tool.

--
Tim Fletcher t...@night-shade.org.uk
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
Hi,

On Friday 16 December 2011 10:42:00 Jean Spirat wrote:
> I use backuppc to save a webserver. The issue is that the application used on it is making thousands of little files used by a game to create maps and various things. The issue is that we are now at 100 GB of data and 8.030.000 files, so the backups take 48 hours and more (to 'help', the files are on an NFS share). I think I have come to the point where file backup is at its limit.

Excuse my off-topic-ness, but with that many small files I kind of expect a filesystem to reach certain limits. Why is that webapp written to use many little files? Why not with a database where all that stuff is in blobs? That would be easier to maintain and easier to back up.

Have fun,

Arnold
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
> Excuse my off-topic-ness, but with that many small files I kind of expect a filesystem to reach certain limits. Why is that webapp written to use many little files? Why not with a database where all that stuff is in blobs? That would be easier to maintain and easier to back up.

Fortunately, it is not in my power to discuss the choices of the developers; my only job is to try to figure out a way to make backup of the thing work :) I am sure most here know the pain - I already strive to exclude caching directories, because the devs seem to invent new ways to name them every two days (temp, temporary, temporaire, <enter your custom name>, cache, mycache, appli-cache, img-cache, variations with plurals...).

The backuppc server has 16 GB of RAM, so on the server side it should be okay memory-wise. In my monitoring I never go under 5 GB free (but I only have the data every 5 minutes).

regards,
Jean.
Re: [BackupPC-users] 8.030.000, Too much files to backup ?
On Fri, Dec 16, 2011 at 9:00 AM, Jean Spirat jean.spi...@squirk.org wrote:
> Fortunately, it is not in my power to discuss the choices of the developers; my only job is to try to figure out a way to make backup of the thing work :) I already strive to exclude caching directories, because the devs seem to invent new ways to name them every two days.
>
> The backuppc server has 16 GB of RAM, so on the server side it should be okay memory-wise. In my monitoring I never go under 5 GB free (but I only have the data every 5 minutes).

Aside from excluding unneeded parts, one other approach that can sometimes help is to split the target into separate runs for different subdirectories. To do that, you can make different 'host' entries, then use the ClientNameAlias setting to point them back to the same real target. If the files in question are split into some reasonable upper-level subdirectories, you may be able to get sets that complete in a reasonable amount of time.

--
Les Mikesell
lesmikes...@gmail.com
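As a rough sketch of what that looks like (host names and paths invented for illustration): each per-'host' config file aliases back to the same real machine and backs up a different top-level subdirectory, so the runs can be scheduled independently.

# hosts file: webserver-maps and webserver-users both listed as hosts.
# conf/webserver-maps.pl:
$Conf{ClientNameAlias} = 'webserver.example';
$Conf{RsyncShareName}  = ['/var/www/appdata/maps'];
# conf/webserver-users.pl:
$Conf{ClientNameAlias} = 'webserver.example';
$Conf{RsyncShareName}  = ['/var/www/appdata/users'];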