Tino Schwarze wrote at about 11:01:01 +0100 on Thursday, March 3, 2011:
> Hi Craig & Jeffrey,
>
> On Wed, Mar 02, 2011 at 09:17:26PM -0500, Jeffrey J. Kosowsky wrote:
>
> > > However, the benefits significantly outweigh the drawbacks:
> > >
> > >   - eliminating hardlinks means the backup storage is much easier
> > >     to replicate, copy or restore.
> >
> > TOTALLY AWESOME. This should reduce the traffic on the BackupPC
> > newslist by about 25% just by eliminating this FAQ and complaint.
>
> I'm looking forward to being able to backup the pool! :-) And maybe I'll
> start a new install for 4.0. Hm... Maybe a pool conversion tool wouldn't
> be too difficult?
>
> > >   - determining which pool files can be deleted is much more
> > >     efficient, since only the reference count database needs
> > >     to be searched for reference counts of 0. It is no longer
> > >     necessary to stat() every file in the pool, which is very
> > >     time consuming on large pools.
> >
> > Love these improvements in performance.
> >
> > > It is not necessary to update the reference counts in real time, so
> > > the implementation is a lot simpler and more efficient. In fact,
> > > the reference count updating is done as part of the BackupPC_nightly
> > > process.
> > >
> > > The reference count database is stored in 128 different files,
> > > based on the first byte of the digest ANDed with 0xfe. Therefore
> > > the file:
> > >
> > >     CPOOL_DIR/4e/poolCnt
> > >
> > > stores all the reference counts for digests that start with 0x4e or
> > > 0x4f. The file itself is the result of using Storable::store() on a
> > > hash whose key is the digest and value is the reference count. This
> > > is a compact format for storing the Perl data structure. The entire
> > > file is read or written in a batch-like manner - it is not intended
> > > for dynamic updates of individual entries.
> >
> > Why use only 7 bits (AND with 0xfe) rather than 8 bits (AND with 0xff)?
>
> I wondered the same.
> I just took a look - my pool has 9 million files (and I'm sure there
> are significantly larger pools in the wild), so 9 million / 128 is
> roughly 70,000 entries per poolCnt file. Digest is 32 bytes, IIRC,
> refcount is 8 bytes, so we've got roughly 42 bytes/entry ;-) which
> means almost 3 MB per file.
>
> We're talking about at least 270 MB of refcounts for the 9 million files!
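To make the quoted refcount scheme concrete, here is a rough sketch. It is Python purely for illustration: BackupPC itself is Perl and uses Storable::store(), so pickle, the CPOOL_DIR value and the helper names below are stand-ins of my own, not BackupPC's actual code. It also shows why ANDing with 0xfe yields 128 files: clearing the low bit of the first digest byte merges each adjacent pair of byte values into one bucket.

```python
import os
import pickle  # stand-in for Perl's Storable; BackupPC itself uses Storable::store()

CPOOL_DIR = "/var/lib/backuppc/cpool"  # illustrative location, not necessarily yours


def pool_cnt_file(digest: bytes) -> str:
    """Pick the poolCnt file for a digest.

    ANDing the first byte with 0xfe clears the low bit, so digests
    starting 0x4e and 0x4f (and every other adjacent pair) share one
    bucket: 128 files rather than the 256 that 0xff would give.
    """
    return os.path.join(CPOOL_DIR, f"{digest[0] & 0xFE:02x}", "poolCnt")


def write_counts(path: str, counts: dict) -> None:
    """Serialize the whole digest -> refcount hash in one batch write."""
    with open(path, "wb") as f:
        pickle.dump(counts, f)


def read_counts(path: str) -> dict:
    """Read the whole hash back; individual entries are never updated in place."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

With this mapping, a digest beginning 0x4e... and one beginning 0x4f... both resolve to CPOOL_DIR/4e/poolCnt, and finding deletable pool files means scanning 128 hashes for zero counts instead of stat()ing every file in the pool.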
In one of my earlier postings, I mentioned that for similar reasons it
might be helpful to make the number of pool levels configurable.
Currently, 2 levels seems to be hardwired, giving 128*128 = 16384 pool
directories. I argued for letting the number of levels be user-configurable
based on the (projected) pool size, since at some point the inefficiency of
very large directories outweighs the cost of another level of directories.

Similarly, it might be helpful to have the poolCnt files migrate down a
level, either in parallel with the number of pool levels or separately via
another user-settable variable. Depending on pool size, CPU speed, memory,
disk access speed, etc., different users will find different optima. It's
hard to believe that a one-size-fits-all choice of 128 poolCnt files can be
optimal across orders of magnitude of pool size, memory, etc.

_______________________________________________
BackupPC-devel mailing list
BackupPC-devel@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-devel
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
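To illustrate what a configurable pool depth might look like, here is a hypothetical sketch (Python for illustration only; pool_path and the on-disk naming are my invention, not BackupPC's actual layout code). Each directory level consumes one digest byte ANDed with 0xfe, so levels=2 reproduces the current 128*128 = 16384 directories and each extra level multiplies that by 128.

```python
import os


def pool_path(cpool_dir: str, digest: bytes, levels: int = 2) -> str:
    """Map a digest to a pool file path with a user-configurable depth.

    Each level takes one byte of the digest ANDed with 0xfe, giving
    128 possible subdirectory names per level: levels=2 yields the
    hardwired 128*128 = 16384 directories, levels=3 yields 128**3.
    """
    parts = [f"{digest[i] & 0xFE:02x}" for i in range(levels)]
    return os.path.join(cpool_dir, *parts, digest.hex())
```

A site could then pick the depth from its projected pool size, bumping to levels=3 once individual directories grow uncomfortably large.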