Pieter,

Thank you for kindly providing this script. I also have an accounting need for a reasonable estimate of how much space each host's backups are occupying, and I don't want to do a du over the whole tree.
Forgive my ignorance, but can you give human-readable explanations for the abbreviations in the output line below? I have guesses, but I would prefer to hear it from the guy who wrote it. (A rough sketch of my own reading of the algorithm follows below the quoted thread.)

    total: alloc=x dalloc=x dentries=x dsize=x falloc=x fcount=x fsize=x

Thanks!

Kyle Anderson
Tummy.com

Pieter Wuille wrote:
> On Tue, Dec 01, 2009 at 09:28:50AM -0500, Jeffrey J. Kosowsky wrote:
>> Pieter Wuille wrote at about 13:18:33 +0100 on Tuesday, December 1, 2009:
>> > What you can do is count the allocated space for each directory and file, but
>> > divide the numbers for files by (nHardlinks+1). This way you end up
>> > distributing the size each file takes on disk over the different backups it
>> > belongs to.
>> >
>> > I have a script that does this; if there's interest I'll attach it. It does
>> > take a day (wild guess, never accurately measured) to go over all pc/*
>> > directories (the pool is 370.65GB, comprising 4237093 files and 4369
>> > directories).
>>
>> I am surprised that it would take a day.
>
> The server is quite busy making backups, and rsync'ing to an offsite backup
> server at the same time -- especially the latter puts some serious load on
> I/O, I assume.
>
>> The only real cost should be that of doing a 'find' and a 'stat' on
>> the pc tree - which I would do in Perl so that I could do the
>> arithmetic in place (rather than having to use a *nix find -printf to
>> pass it off to another program).
>
> Yes, it is a Perl script.
>
>> Unless you have a huge number of PCs and backups, I can't imagine
>> this would take more than a couple of hours, since your total number of
>> unique files is only about 4 million.
>
> We have 4 million unique inodes. We do, however, have some 20-25 million
> directory entries, which is what the script needs to read through.
>
>> Given that you only have 4 million unique files, you could even avoid
>> the multiple stats, at the cost of that much memory, by caching the
>> nlinks and size by inode number.
>
> Except that the script already needs to do a stat per directory entry in order
> to know the inode number itself...
>
>> Can you post your script?
>
> See attachment. You can run e.g.:
>
> ./diffsize.pl /var/lib/backuppc/pc/*
>
> to see values per host, and a total.
>
> PS: it actually (correctly) divides by (nHardLinks-1) instead of +1 (as I
> claimed earlier).
>
> kind regards,
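In case it is useful to others while waiting for the authoritative answer, here is my own rough sketch of the technique Pieter describes: walk each pc/<host> tree, lstat every directory entry, and charge each file's allocated size divided by (nHardLinks-1) to that tree, so a pool file shared by N backups is spread evenly across them. To be clear, this is not Pieter's diffsize.pl; the script name, output format, and the treatment of directories are my own guesses.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Find;

    # Rough sketch only, NOT Pieter's diffsize.pl. Charges each file's
    # allocated size, divided by (nHardLinks - 1), to the tree being walked;
    # the "- 1" is the extra hardlink held by the pool itself.
    my $grand = 0;

    foreach my $tree (@ARGV) {
        my $total = 0;
        find(sub {
            my @st = lstat($_) or return;       # one stat per directory entry,
            my ($nlink, $blocks) = @st[3, 12];  # as Pieter notes is unavoidable
            my $alloc = $blocks * 512;          # allocated bytes, not file length
            if (-f _ && $nlink > 1) {
                $total += $alloc / ($nlink - 1);  # share among the backups
            } else {
                $total += $alloc;               # directories etc. counted in full
            }
        }, $tree);
        $grand += $total;
        printf "%-40s %12.1f MB\n", $tree, $total / 2**20;
    }
    printf "%-40s %12.1f MB\n", "total:", $grand / 2**20;

You would run it the same way, e.g. ./sketch.pl /var/lib/backuppc/pc/*, and like the real script it will be limited by stat I/O on a pool with tens of millions of directory entries.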