Re: [BackupPC-users] How to manage disk space?

2015-04-15 Thread Kris Lou
On Wed, Apr 15, 2015 at 12:00 PM, Dave Sill de5-backu...@sws5.ornl.gov
wrote:

 A corollary would be: how do I know that the space BackupPC is using
 doesn't include a bunch of cruft like files from systems that have been
 removed from BackupPC, or file systems that have been removed, ...


Somebody might have a script to check this, but you may have some backups
whose hosts are no longer configured.  As I recall, when you delete a host
from the interface, the data needs to be removed manually (and I believe
there is a notification alluding to this).  However, if it's just file
systems, the files are likely pooled in other backups.


Kris Lou
k...@themusiclink.net
--
BPM Camp - Free Virtual Workshop May 6th at 10am PDT/1PM EDT
Develop your own process in accordance with the BPMN 2 standard
Learn Process modeling best practices with Bonita BPM through live exercises
http://www.bonitasoft.com/be-part-of-it/events/bpm-camp-virtual- event?utm_
source=Sourceforge_BPM_Camp_5_6_15utm_medium=emailutm_campaign=VA_SF___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/


Re: [BackupPC-users] How to manage disk space?

2015-04-15 Thread Dave Sill
Thanks for the replies. Don't know how I missed the Host Summary page, but
that's useful.

Holger Parplies wb...@parplies.de wrote:
 
 Les Mikesell wrote on 2015-04-14 09:34:35 -0500 [Re: [BackupPC-users] How to 
 manage disk space?]:
  On Mon, Apr 13, 2015 at 4:57 PM,  backu...@kosowsky.org wrote:
   Dave Sill wrote at about 15:28:49 -0400 on Monday, April 13, 2015:
 We've been using BackupPC for a couple years and have just encountered
 the problem of insufficient disk space on the server. [...]

 What I'd like to know is (1) where is the disk space going,
   To store your backups
  
 and (2) how can I adjust BackupPC to use less space?
   Save fewer backups or back up fewer machines
 
 Jeffrey has a point here. You don't give us much detail to go on. A couple
 dozen Linux servers can mean just about anything.

Well, yeah, but rather than spend hours collecting all of the various
information that could potentially help, being a newbie and not
knowing which details really would help, I thought I'd let people
request further info if it was needed. :-)

  But more specifically, a likely problem is that you have some very
  large files like databases, log files, virtual machine images or
  mailboxes that change daily and thus are not pooled.
 
 That is one possibility. Another would be keeping several years' worth of daily
 history of large mail servers. Either your history is too long (for the disk
 space available), or your backups are too large, or most likely a combination
 of both. Backups may be too large either by design (you need to back up too
 much data) or by malfunction (you are backing up something you don't mean to
 back up).

I suspect they're too large by design. The user is the ORNL DAAC, a
NASA data archive. Pooling helps a lot on system files, I'm sure, but
the bulk of our holdings are data files that probably aren't stored
many times.

My immediate problem was that the disk was full and I needed to figure
out how to get backups running again without adding more space because
none was available. I could take systems/filesystems out of BackupPC
or adjust retention, but I had no idea how much space that would free
up or how quickly that would happen.

 Yet other possibilities would be that BackupPC_nightly is not running, or that
 linking is not working.
 
 Then again, you might have meant to ask, "how do I find out where the disk
 space is going?"

I thought that's what I asked.

A corollary would be: how do I know that the space BackupPC is using
doesn't include a bunch of cruft like files from systems that have been
removed from BackupPC, or file systems that have been removed, ...

 I can't think of a good answer to that. BackupPC's pooling
 mechanism means that if you have 100 copies of one file content (all linked
 to one pool file by BackupPC), deleting 99 of them won't save you anything, as
 long as one remains. Put differently, one host *might* seem very large in
 terms of total backup size, yet share all files with other seemingly smaller
 hosts. You really have to look at your source data: what are you backing up,
 how often does it change, how unique is it? And you have to know your
 constraints. If you *need* to keep a long history of a large amount of data,
 there is not much you can do (apart from getting more disk space). If you
 don't, the easiest option is to expire old backups and see what happens - just
 keep in mind that you don't get back any disk space for content still present
 in more recent backups.
 Reducing the size of existing backups is somewhat tricky, and reducing the
 size of future backups won't gain you anything until the old backups expire.
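
To make the pooling point concrete, here is a small throwaway sketch. It does not touch BackupPC at all: the "pool" and "pc" directories are created in a temporary directory purely for illustration, and the helper name link_count is made up. It shows that removing some of the names of a hard-linked file leaves the data on disk as long as at least one name remains.

```shell
#!/bin/sh
# Illustration only: everything happens in a temp directory, not in $topdir.

# Print the hard-link count of a file (second column of 'ls -ld').
link_count() {
    ls -ld "$1" | awk '{print $2}'
}

demo=$(mktemp -d)
mkdir "$demo/pool" "$demo/pc"

# One "pool" file plus three "backup" names that share its inode.
printf 'same content\n' > "$demo/pool/content"
for n in 1 2 3; do
    ln "$demo/pool/content" "$demo/pc/backup$n"
done
before=$(link_count "$demo/pool/content")

# Delete two of the three backup names. No data blocks are freed,
# because the pool entry and one backup still reference the inode.
rm "$demo/pc/backup1" "$demo/pc/backup2"
after=$(link_count "$demo/pool/content")

echo "links before: $before, after: $after"   # links before: 4, after: 2

rm -rf "$demo"
```

Only when the last backup name is gone does the pool entry drop to a link count of 1, and space is recovered only once that remaining entry is removed as well.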
 
 Actually, there might be a way to shed some light. I'd probably look for large
 files with a low link count (-links 2 or 3) in the pc/ tree. You need to be
 aware that 'find' will take a *long* time to traverse such a large pool. It
 just might be worthwhile to run a rather general 'find' command with output
 redirected to a file and then filter that repeatedly to narrow down your
 search, rather than running several different 'find' invocations. Or even
 looking in the {c,}pool/ rather than the pc/ tree (faster, but you don't get
 any file paths, just file content).
 
 Running 'find $topdir/pc/$host/$num -type f -links -3 -ls' should give you an
 approximate list of files that would actually be deleted by deleting [only]
 backup $num of host $host ('-links -3' takes into account files for some
 reason not linked into the pool; in theory, these *should* all be zero length,
 but in case of some malfunction, they might not).
 
 Much of that might not make any sense for your particular case, but I hope
 some of it helps.
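
The "one general find, then filter the saved output repeatedly" idea above could look roughly like the sketch below. It assumes GNU find (for -printf) plus awk and sort; the function names and the tab-separated listing format are invented for this example, and paths containing tabs or newlines would confuse the filter.

```shell
#!/bin/sh
# Sketch only: assumes GNU find; list_files and filter_listing are
# hypothetical helper names, not part of BackupPC.

# list_files DIR OUT
# Walk DIR once (the slow part) and save "links<TAB>bytes<TAB>path" per file.
list_files() {
    find "$1" -type f -printf '%n\t%s\t%p\n' > "$2"
}

# filter_listing LISTING MAX_LINKS MIN_BYTES
# Re-read the saved listing (fast) and print entries with at most MAX_LINKS
# hard links and at least MIN_BYTES bytes, largest first.
filter_listing() {
    awk -F '\t' -v max="$2" -v min="$3" '$1 <= max && $2 >= min' "$1" |
        sort -t "$(printf '\t')" -k2,2nr
}
```

For example, "list_files $topdir/pc/$host /tmp/listing" followed by "filter_listing /tmp/listing 2 104857600" would show files of 100 MiB or more with a link count of 1 or 2, and the thresholds can then be varied without traversing the tree again.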

Thanks, Holger, that does help.

-Dave


Re: [BackupPC-users] How to manage disk space?

2015-04-15 Thread Holger Parplies
Hi,

Kris Lou wrote on 2015-04-15 12:57:54 -0700 [Re: [BackupPC-users] How to manage 
disk space?]:
 On Wed, Apr 15, 2015 at 12:00 PM, Dave Sill de5-backu...@sws5.ornl.gov
 wrote:
 
  A corollary would be: how do I know that the space BackupPC is using
  doesn't include a bunch of cruft like files from systems that have been
  removed from BackupPC, or file systems that have been removed, ...
 
 Somebody might have a script to check this,

I doubt that, because it seems to be impossible to exactly define what the
script should look for :-).

If you change a backup definition to no longer include part of the files it
used to include, existing backups will still include those files, and that is
how it should be. In some cases you may wish to remove those files from
previous backups (because they were erroneously included); in others, you may
simply not need to back them up in the future (e.g. they were previously
created by hand, and now they're generated from data included in the backup).
There is no automatic way to decide this. You can always delete files you do
not need, but you could not undo the effect of files you would have needed
being automatically purged from the backup. Just imagine *accidentally*
removing files from your backup definition. If that immediately mangled
your backup history, you would undoubtedly switch to another
backup tool at once :-).

Yes, it would be possible (but complicated) to check if existing backups match
the current backup definition and alert you to differences, but it seems like
a *lot* of work without much gain. If you find out that your backups include
something they shouldn't, you should really change the backup definition *and*
remove the extraneous files (or decide that they won't do any harm until the
backups expire). While BackupPC does not natively support changing existing
backups, I believe there are user-contributed scripts to do such things,
probably written by Jeffrey ;-).

As for hosts that have been removed, that is really easy to check:
ls -l $topdir/pc

If there are directories not corresponding to existing hosts, you can 'simply'
remove them - if you don't want to wait, move them to $topdir/trash and
BackupPC will take care of it for you. You won't immediately get back much
space, because all files are linked to the pool. BackupPC_nightly will proceed
to delete the pool files not referenced by other backups (which might take
several nights, depending on your configuration).
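
The comparison of pc/ directories against configured hosts can be sketched as below. This is an illustration under assumptions, not a BackupPC tool: the function name orphaned_hosts is made up, and it assumes the hosts file uses the first whitespace-separated column for the host name, '#' for comments, and a header row beginning with "host". It only prints names and deletes nothing.

```shell
#!/bin/sh
# Sketch only: orphaned_hosts is a hypothetical helper, and the hosts
# file format described above is an assumption.

# orphaned_hosts HOSTS_FILE PC_DIR
# Print directories directly under PC_DIR whose names do not appear as a
# host in HOSTS_FILE. Names are compared exactly.
orphaned_hosts() {
    [ -r "$1" ] || { echo "cannot read hosts file: $1" >&2; return 1; }
    for d in "$2"/*/; do
        if [ -d "$d" ]; then basename "$d"; fi
    done | awk -v hosts="$1" '
        BEGIN {
            # Load configured host names, skipping comments, blank lines,
            # and the header row.
            while ((getline line < hosts) > 0) {
                sub(/#.*/, "", line)
                n = split(line, f, " ")
                if (n > 0 && f[1] != "host") known[f[1]] = 1
            }
            close(hosts)
        }
        !($0 in known)
    ' | sort
}
```

A call might look like "orphaned_hosts /etc/BackupPC/hosts $topdir/pc", where the hosts file location is a guess that varies by installation. Anything it prints should be reviewed by hand before being moved to $topdir/trash.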

Aside from that, you probably need to trust BackupPC to work as designed :-).

Regards,
Holger



Re: [BackupPC-users] How to manage disk space?

2015-04-15 Thread Mark Campbell
In my experience, just removing a server doesn’t delete the files or reduce 
disk usage on the system, even when you’ve gone into the BackupPC/pc/host/ 
dir and deleted the dir(s) whose number corresponds to the backup you 
want to delete.  The best way to remove “orphaned” files (i.e., files that do 
not have any other hard links) is to run:

su -s /bin/bash -c "/usr/share/BackupPC/bin/BackupPC_nightly 0 255" backuppc

What this does is run the BackupPC_nightly script (which should have come with 
your copy of BackupPC) as the user backuppc, with the appropriate arguments.  
The script goes through all of the files in the pool and removes any file that 
is not associated with a backup (i.e., has no other hard links).  This process 
can take a really long time (sometimes it takes me minutes; other times, it’s 
taken me 12+ hours, depending on how many files you have and how many are on 
the chopping block) and can put a heavy load on your system while it runs, so 
be forewarned before running it.

In short, I’m not sure of a way to check ahead of time whether there’s files 
that are “cruft”, but you can be sure that after running this script, the 
number of “cruft” files will be zero.

Thanks,

--Mark



Re: [BackupPC-users] How to manage disk space?

2015-04-15 Thread Holger Parplies
Hi,

Mark Campbell wrote on 2015-04-15 14:37:53 -0700 [Re: [BackupPC-users] How to 
manage disk space?]:
 [...] The best way to remove "orphaned" files (i.e., files that do not
 have any other hard links) is to run:
 
 su -s /bin/bash -c "/usr/share/BackupPC/bin/BackupPC_nightly 0 255" backuppc

as always when that is suggested: WRONG. NEVER call BackupPC_nightly directly.

 What this does is,

possibly trash your pool. And no, you are very unlikely to notice.

 In short, I'm not sure of a way to check ahead of time whether there's
 files that are "cruft", but you can be sure that after running this
 script, the number of "cruft" files will be zero.

For an extremely meaningless definition of 'cruft'. BackupPC_nightly is
automatically run by BackupPC when it is safe to do so every night (in
particular, BackupPC won't interfere while it *knows* BackupPC_nightly is
running, which it doesn't if you run it by hand), so this sort of 'cruft'
is regularly removed anyway without any manual action.

Regards,
Holger
