Re: [BackupPC-users] How to manage disk space?
On Wed, Apr 15, 2015 at 12:00 PM, Dave Sill de5-backu...@sws5.ornl.gov wrote:
> A corollary would be: how do I know that the space BackupPC is using doesn't include a bunch of cruft like files from systems that have been removed from BackupPC, or file systems that have been removed, ...

Somebody might have a script to check this, but you may have some backups that are lacking configured hosts. As I recall, when you delete a host from the interface, the data needs to be manually removed (and I believe there is a notification alluding to this). However, if it's just file systems, the files are likely pooled in other backups.

Kris Lou
k...@themusiclink.net

--
BPM Camp - Free Virtual Workshop May 6th at 10am PDT/1PM EDT
Develop your own process in accordance with the BPMN 2 standard
Learn Process modeling best practices with Bonita BPM through live exercises
http://www.bonitasoft.com/be-part-of-it/events/bpm-camp-virtual-event?utm_source=Sourceforge_BPM_Camp_5_6_15&utm_medium=email&utm_campaign=VA_SF
___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
Re: [BackupPC-users] How to manage disk space?
Thanks for replies. Don't know how I missed the Host Summary page, but that's useful.

Holger Parplies wb...@parplies.de wrote:
> Les Mikesell wrote on 2015-04-14 09:34:35 -0500 [Re: [BackupPC-users] How to manage disk space?]:
>> On Mon, Apr 13, 2015 at 4:57 PM, backu...@kosowsky.org wrote:
>>> Dave Sill wrote at about 15:28:49 -0400 on Monday, April 13, 2015:
>>>> We've been using BackupPC for a couple years and have just encountered the problem of insufficient disk space on the server. [...] What I'd like to know is (1) where is the disk space going,
>>> To store your backups
>>>> and (2) how can I adjust BackupPC to use less space?
>>> Save fewer backups or backup fewer machines
>
> Jeffrey has a point here. You don't give us much detail to guess on. A couple dozen Linux servers can mean just about anything.

Well, yeah, but rather than spend hours collecting all of the various information that could potentially help, being a newbie and not knowing which details really would help, I thought I'd let people request further info if it was needed. :-)

>> But more specifically, a likely problem is that you have some very large files like databases, log files, virtual machine images or mailboxes that change daily and thus are not pooled.
>
> That is one possibility. Another would be keeping several years' worth of daily history of large mail servers. Either your history is too long (for the disk space available), or your backups are too large, or most likely a combination of both. Backups may be too large either by design (you need to back up too much data) or by malfunction (you are backing up something you don't mean to back up).

I suspect they're too large by design. The user is the ORNL DAAC, a NASA data archive. Pooling helps a lot on system files, I'm sure, but the bulk of our holdings are data files that probably aren't stored many times. My immediate problem was that the disk was full and I needed to figure out how to get backups running again without adding more space because none was available.
I could take systems/filesystems out of BackupPC or adjust retention, but I had no idea how much space that would free up or how quickly that would happen.

> Yet other possibilities would be that BackupPC_nightly is not running, or that linking is not working. Then again, you might have meant to ask, "how do I find out where the disk space is going?".

I thought that's what I asked. A corollary would be: how do I know that the space BackupPC is using doesn't include a bunch of cruft like files from systems that have been removed from BackupPC, or file systems that have been removed, ...

> I can't think of a good answer to that. BackupPC's pooling mechanism means that if you have 100 copies of one file content (all linked to one pool file by BackupPC), deleting 99 of them won't save you anything, as long as one remains. Put differently, one host *might* seem very large in terms of total backup size, yet share all files with other seemingly smaller hosts.
>
> You really have to look at your source data: what are you backing up, how often does it change, how unique is it? And you have to know your constraints. If you *need* to keep a long history of a large amount of data, there is nothing much you can do (apart from getting more disk space). If you don't, the easiest option is to expire old backups and see what happens - just keep in mind that you don't get back any disk space for content still present in more recent backups. Reducing the size of existing backups is somewhat tricky, and reducing the size of future backups won't gain you anything until the old backups expire.
>
> Actually, there might be a way to shed some light. I'd probably look for large files with a low link count (-links 2 or 3) in the pc/ tree. You need to be aware that 'find' will take a *long* time to traverse such a large pool.
> It just might be worthwhile to run a rather general 'find' command with output redirected to a file and then filter that repeatedly to narrow down your search, rather than running several different 'find' invocations. Or even looking in the {c,}pool/ rather than the pc/ tree (faster, but you don't get any file paths, just file content).
>
> Running
>
>   find $topdir/pc/$host/$num -type f -links -3 -ls
>
> should give you an approximate list of files that would actually be deleted by deleting [only] backup $num of host $host ('-links -3' takes into account files for some reason not linked into the pool; in theory, these *should* all be zero length, but in case of some malfunction, they might not).
>
> Much of that might not make any sense for your particular case, but I hope some of it helps.

Thanks, Holger, that does help.

-Dave
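The one-pass approach described in the quoted advice (traverse once, save the listing, then filter the saved file as often as needed) could be sketched roughly as below. This is only an illustration, not something posted in the thread: it assumes GNU find (for -printf), the function names are made up for this example, and it will mis-handle file names containing newlines. The byte total is an estimate in the same approximate sense as the 'find ... -links -3' listing above.

```shell
#!/bin/sh
# Sketch only: assumes GNU find and a layout in which one backup lives
# under $topdir/pc/<host>/<num>. All function names are hypothetical.

# Walk the tree ONCE and record "links size path" for every regular file.
# Usage: snapshot_tree DIR OUTFILE
snapshot_tree() {
    find "$1" -type f -printf '%n %s %p\n' > "$2"
}

# From a saved snapshot, print "size path" for files with fewer than
# MAXLINKS hard links, largest first (MAXLINKS=3 mirrors '-links -3').
# Usage: low_link_files SNAPSHOT MAXLINKS
low_link_files() {
    awk -v max="$2" '$1 < max {
        s = $2                          # remember the size field
        sub(/^[0-9]+ [0-9]+ /, "")      # strip "links size ", keep the path
        print s, $0
    }' "$1" | sort -rn
}

# Sum the sizes (bytes) of those files: a rough estimate of what
# expiring that one backup might eventually free.
# Usage: low_link_bytes SNAPSHOT MAXLINKS
low_link_bytes() {
    awk -v max="$2" '$1 < max { total += $2 } END { print total + 0 }' "$1"
}

# Example (hypothetical paths):
#   snapshot_tree "$topdir/pc/$host/$num" /tmp/backup.lst
#   low_link_files /tmp/backup.lst 3 | head -n 20
#   low_link_bytes /tmp/backup.lst 3
```

Because the expensive traversal happens only in snapshot_tree, the two filters can be rerun with different thresholds against the saved file without touching the pool again.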
Re: [BackupPC-users] How to manage disk space?
Hi,

Kris Lou wrote on 2015-04-15 12:57:54 -0700 [Re: [BackupPC-users] How to manage disk space?]:
> On Wed, Apr 15, 2015 at 12:00 PM, Dave Sill de5-backu...@sws5.ornl.gov wrote:
>> A corollary would be: how do I know that the space BackupPC is using doesn't include a bunch of cruft like files from systems that have been removed from BackupPC, or file systems that have been removed, ...
>
> Somebody might have a script to check this,

I doubt that, because it seems to be impossible to exactly define what the script should look for :-). If you change a backup definition to no longer include part of the files it used to include, existing backups will still include those files, and that is how it should be. In some cases you may wish to remove those files from previous backups (because they were erroneously included), in others, you may simply not need to back them up in the future (e.g. they were previously created by hand, and now they're generated from data included in the backup). There is no automatic way to decide this. You can always delete files you do not need, but you could not undo the effect of files you would have needed being automatically purged from the backup.

Just imagine *accidentally* removing files from your backup definition. If that immediately mangled your backup history, you would undoubtedly immediately switch to another backup tool :-).

Yes, it would be possible (but complicated) to check if existing backups match the current backup definition and alert you to differences, but it seems like a *lot* of work without much gain. If you find out that your backups include something they shouldn't, you should really change the backup definition *and* remove the extraneous files (or decide that they won't do any harm until the backups expire). While BackupPC does not natively support changing existing backups, I believe there are user contributed scripts to do such things, probably written by Jeffrey ;-).
As for hosts that have been removed, that is really easy to check:

  ls -l $topdir/pc

If there are directories not corresponding to existing hosts, you can 'simply' remove them - if you don't want to wait, move them to $topdir/trash and BackupPC will take care of it for you. You won't immediately get back much space, because all files are linked to the pool. BackupPC_nightly will proceed to delete the pool files not referenced by other backups (which might take several nights, depending on your configuration).

Aside from that, you probably need to trust BackupPC to work as designed :-).

Regards,
Holger
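The manual comparison suggested above (directories under $topdir/pc versus the hosts BackupPC still knows about) could be scripted along the following lines. This is a hedged sketch, not a tool from the thread or from BackupPC itself: the function name is made up, the location of the hosts file varies by installation and is passed in as a parameter, and the parsing assumes one host per non-comment line with the host name in the first field and a column-header line whose first word is "host". Names are compared case-insensitively as a precaution. The function only lists candidates; it deletes nothing.

```shell
#!/bin/sh
# Sketch only. Assumptions: HOSTSFILE lists one host per non-comment
# line (name in the first field) and has a header line starting with
# the word "host". A machine literally named "host" would be missed.

# Print the names of directories under TOPDIR/pc with no hosts entry.
# Usage: orphan_host_dirs TOPDIR HOSTSFILE
orphan_host_dirs() {
    topdir=$1
    hostsfile=$2
    for dir in "$topdir"/pc/*/; do
        [ -d "$dir" ] || continue       # glob matched nothing
        name=$(basename "$dir")
        # Strip comments, take the first field, skip the header line,
        # then look for an exact (case-insensitive) match.
        if ! sed 's/#.*//' "$hostsfile" |
             awk '$1 != "" && $1 != "host" { print $1 }' |
             grep -qixF -- "$name"; then
            printf '%s\n' "$name"
        fi
    done
}

# Example (both paths are assumptions; adjust to your installation):
#   orphan_host_dirs /var/lib/backuppc /etc/BackupPC/hosts
```

Anything it prints still deserves a look by hand before it is removed or moved to $topdir/trash, since a directory may belong to a host that was only temporarily taken out of the configuration.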
Re: [BackupPC-users] How to manage disk space?
In my experience, just removing a server doesn't delete the files, nor reduce disk space on the system, even when you've gone into the BackupPC/pc/host/ dir and deleted the dir(s) that have a number corresponding to the backup you want to delete. The best way to remove "orphaned" files (i.e., files that do not have any other hard links), is to run:

  su -s /bin/bash -c "/usr/share/BackupPC/bin/BackupPC_nightly 0 255" backuppc

What this does is run the BackupPC_nightly script (should have come with your copy of BackupPC) as the user backuppc, with the appropriate flags. The script goes through all of the files in the pool, and if a file is not associated with a backup (i.e., it has no other hard links), removes it. This process can take a really long time (sometimes it takes me minutes; other times, it's taken me 12+ hours, all depending on how many files you have and how many are on the chopping block), and it can put a heavy load on your system while running, so be forewarned before running it.

In short, I'm not sure of a way to check ahead of time whether there are files that are "cruft", but you can be sure that after running this script, the number of "cruft" files will be zero.

Thanks,
--Mark

From: Kris Lou [mailto:k...@themusiclink.net]
Sent: Wednesday, April 15, 2015 3:58 PM
To: General list for user discussion, questions and support
Subject: Re: [BackupPC-users] How to manage disk space?

On Wed, Apr 15, 2015 at 12:00 PM, Dave Sill de5-backu...@sws5.ornl.gov wrote:
> A corollary would be: how do I know that the space BackupPC is using doesn't include a bunch of cruft like files from systems that have been removed from BackupPC, or file systems that have been removed, ...

Somebody might have a script to check this, but you may have some backups that are lacking configured hosts.
As I recall, when you delete a host from the interface, the data needs to be manually removed (and I believe there is a notification alluding to this). However, if it's just file systems, the files are likely pooled in other backups.

Kris Lou
k...@themusiclink.net
Re: [BackupPC-users] How to manage disk space?
Hi,

Mark Campbell wrote on 2015-04-15 14:37:53 -0700 [Re: [BackupPC-users] How to manage disk space?]:
> [...]
> The best way to remove "orphaned" files (i.e., files that do not have any other hard links), is to run:
>
>   su -s /bin/bash -c "/usr/share/BackupPC/bin/BackupPC_nightly 0 255" backuppc

as always when that is suggested: WRONG. NEVER call BackupPC_nightly directly. What this does is, possibly trash your pool. And no, you are very unlikely to notice.

> In short, I'm not sure of a way to check ahead of time whether there's files that are "cruft", but you can be sure that after running this script, the number of "cruft" files will be zero.

For an extremely meaningless definition of 'cruft'. BackupPC_nightly is automatically run by BackupPC when it is safe to do so every night (in particular, BackupPC won't interfere while it *knows* BackupPC_nightly is running, which it doesn't if you run it by hand), so this sort of 'cruft' is regularly removed anyway without any manual action.

Regards,
Holger