Craig Barratt wrote at about 00:32:57 -0800 on Wednesday, March 2, 2011:
> The next topic is the pool structure in 4.x.
>
> Here are the differences in pool file storage between 3.x and 4.x:
>
> - Digest changes from partial MD4 to full-file MD5. This will
> significantly reduce pool collisions - in almost all
> installations there will be no pool collisions. The most
> common exception will be if someone uses the now well-known
> constructed cases of different files with MD5 collisions.
AWESOME. The partial MD5 (not partial MD4, by the way -- you may want
to correct this in your documentation) checksums always seemed like
such a kludge and a PITA to me :P
> In 3.x a partial MD4 digest is used, so collisions are more
> common. Also, the file system's hardlink limit can also
> cause more entries in a pool file chain. In 4.x reference
> counting is done using a simple database, so the file system
> hardlink limit isn't relevant.
> - If pool files do collide, a chain is created by appending one or
> more bytes to the MD5 digest as a counter. The first instance of a
> pool file will have a regular 16 byte digest. The next file that is
> different but has the same MD5 digest will be stored as a 17 byte
> digest with an extra byte of 0x01. The 256th file in the chain
> (unlikely of course) will have two more bytes appended: 0x0100.
> The extension is basically the file index with leading 0x00 bytes
> removed.
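If I'm reading this right, the suffix is just the big-endian chain
index with its leading zero bytes stripped. A quick sketch (the
function name and Python are mine, not from the 4.x code):

```python
import hashlib

def chained_pool_name(data: bytes, index: int) -> str:
    """Hex pool-file name for the index-th distinct file sharing one
    MD5 digest: index 0 is the plain 16-byte digest; later entries get
    the big-endian index appended with leading 0x00 bytes stripped."""
    digest = hashlib.md5(data).digest()  # 16 bytes
    if index > 0:
        # e.g. index 1 -> b"\x01" (17 bytes total),
        #      index 256 -> b"\x01\x00" (18 bytes total)
        digest += index.to_bytes((index.bit_length() + 7) // 8, "big")
    return digest.hex()
```

So a chain never changes the base digest, only grows the suffix.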
As per my earlier post, I worry about the concept of still having
potential collisions (even if rare) and chains -- both because it is
inelegant and because it presumably requires pool files to be
decompressed and compared byte-by-byte against any new file to rule
out a collision. For large files, decompressing the entire file and
then comparing byte-by-byte is *slow* and could account for a
significant portion of the backup time.
Wouldn't it be better to pick a stronger hash such as SHA-256 or even
SHA-512, where the chance of a collision is so astronomically small as
to be less likely than a bit error in your RAM? No collisions have
ever been found for either of them, and they allow maximum message
sizes of up to 2^64-1 and 2^128-1 bits respectively. Presumably, the
chance of two given files colliding at random is 2^-256 or 2^-512
respectively -- numbers so small that physical hardware errors are far
more likely.
Eliminating any statistical likelihood of collision would then let you
simplify the code by eliminating the need for chains, while also
speeding up adding files to the pool, since you wouldn't need to check
for collisions but could just use the SHA digest.
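For what it's worth, these hashes are one call away in any modern
language, and the numbers are easy to eyeball (a Python sketch; the
billion-file pool size below is just an illustrative assumption):

```python
import hashlib

# Full-file SHA-256 digest, as I'm proposing instead of MD5.
digest = hashlib.sha256(b"").hexdigest()
print(digest)  # well-known SHA-256 of the empty string

# Probability that two *specific* files collide at random: 2**-256.
# Even a birthday-style bound across n pool files is roughly
# n**2 / 2**257 -- negligible for any realistic pool size.
n = 10**9                      # assume a billion pool files
p_birthday = n * n / 2.0**257  # approximate collision probability
print(p_birthday < 1e-50)      # True
```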
>
> - 4.x doesn't use hardlinks (except as inherited from existing 3.x
> pools).
>
> - In 4.x pool files are never renamed. In 3.x pool files in a chain
> of repeated digests will be renamed if one of the middle files is
> deleted. In the unlikely event there is a chain of repeated files in
> 4.x, and one of the files is deleted (ie: no longer referenced),
> then it is replaced by a zero-length file. That acts as a tag that
> searching through the chain should continue past that point, and
> also acts as a tag that that file can be replaced by a real pool
> file when the next file is added.
As above, I think it would be preferable and more elegant to use a
checksum that can be relied upon to be unique.
>
> - In 4.x the pool files are stored two-levels deep, with 128
> directories at each level. The directories are numbered in hex
> from 00 to fe in steps of 2. The directory names are based on
> the first two bytes of the MD5 digest, each anded with 0xfe.
> For example, a file with digest 0458d9d0e9ddd2b6b21a1e60b6cdf323
> will be stored in:
>
> CPOOL_DIR/04/58/0458d9d0e9ddd2b6b21a1e60b6cdf323
>
> while a file with digest 09682c6df94c87b1e9ee6e1d0d89e8f2 will be
> stored in:
>
> CPOOL_DIR/08/68/09682c6df94c87b1e9ee6e1d0d89e8f2
>
> (notice that 0x09 & 0xfe == 0x08).
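Just to check my understanding of the layout, here's a sketch of the
path computation (the naming is mine, not BackupPC's):

```python
def v4_pool_path(cpool_dir: str, hex_digest: str) -> str:
    """Two-level 4.x pool path: each level is the corresponding digest
    byte ANDed with 0xfe, giving 128 even-numbered dirs per level."""
    top = int(hex_digest[0:2], 16) & 0xFE
    sub = int(hex_digest[2:4], 16) & 0xFE
    return "%s/%02x/%02x/%s" % (cpool_dir, top, sub, hex_digest)

print(v4_pool_path("CPOOL_DIR", "0458d9d0e9ddd2b6b21a1e60b6cdf323"))
# -> CPOOL_DIR/04/58/0458d9d0e9ddd2b6b21a1e60b6cdf323
print(v4_pool_path("CPOOL_DIR", "09682c6df94c87b1e9ee6e1d0d89e8f2"))
# -> CPOOL_DIR/08/68/09682c6df94c87b1e9ee6e1d0d89e8f2
```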
When I was writing my code in BackupPC_copyPcPool.pl to create an
inode-based pool, I thought about the tradeoffs between fewer vs. more
levels of pool. It seemed to me that as the number of pool files
increases, more levels become preferable -- there is a tradeoff
between the number of directory levels traversed vs. the size of each
directory. In my code, this led me to allow for a user-specified pool
depth.
I wonder whether here, too, it may be best to allow a configurable
depth for the pool. This would not change the actual pool file names,
just how many levels they are distributed across.
Then for small installations one might choose just 2 levels, while for
truly humongous installations it might be better to have 3 or even
more levels. Presumably, different filesystems and hardware would also
influence this tradeoff.
It would seem that it wouldn't be too hard to modify the code to make
this a user-configurable parameter.
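For example, something along these lines would generalize the
two-level scheme without changing the pool file names (a sketch;
`depth` is the hypothetical config parameter):

```python
def pool_path(cpool_dir: str, hex_digest: str, depth: int = 2) -> str:
    """Distribute pool files across `depth` directory levels; level i
    uses digest byte i ANDed with 0xfe, as in the 4.x two-level
    layout, so there are 128**depth leaf directories."""
    parts = [
        "%02x" % (int(hex_digest[2 * i : 2 * i + 2], 16) & 0xFE)
        for i in range(depth)
    ]
    return "/".join([cpool_dir] + parts + [hex_digest])

# depth=2 reproduces the 4.x layout; depth=3 gives 128**3 leaf dirs
print(pool_path("CPOOL_DIR", "0458d9d0e9ddd2b6b21a1e60b6cdf323", depth=3))
# -> CPOOL_DIR/04/58/d8/0458d9d0e9ddd2b6b21a1e60b6cdf323
```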
>
> In 3.x the directories are three levels deep, with 16 directories
> at each level based on the first 3 hex digits of the partial
> MD4 digest. So in 3.x there are 16^3 = 4096 leaf directories,
> while in 4.x there are 128 * 128 = 16384 leaf directories.
>
> - The 3.x and 4.x CPOOL_DIR is the same. The trees below are
> separate because of the directory naming conventions.
>
> - In 4.x when pool file matching occurs the full-file MD5 digest
> is needed to match files. There is also a flag, $bpc->{PoolV3},
> that determines whether old 3.x pool files should be checked
> too. Currently that flag is hardcoded and I need to make it
> autodetect whether there are any old pool files (I guess based
> on BackupPC_nightly?). If PoolV3 is set and there are no
> candidate 4.x files, then the old digest is computed too and
> 3.x candidate pool files are also checked for matches.
>
> If an old pool 3.x file is matched, then that file is renamed to
> the corresponding 4.x pool file path (based on the MD5 digest).
> This file might still have multiple hardlinks due to the existing
> 3.x backups. As those backups are expired, eventually the link
> count on the pool file will decrease to 1.
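If I follow the matching order correctly, it amounts to something like
this sketch (the helper name and path arguments are mine; the real
code obviously also does the digest computation and candidate
comparison):

```python
import os

def find_pool_match(pool_v3_enabled, v4_path, v3_path):
    """Try the 4.x pool first (full-file MD5 path); only if nothing
    matches there and the PoolV3 flag is set, fall back to the 3.x
    pool (partial-digest path). Paths are computed by the caller;
    this just encodes the lookup order."""
    if os.path.exists(v4_path):
        return v4_path
    if pool_v3_enabled and os.path.exists(v3_path):
        return v3_path  # a V3 hit would then be renamed to v4_path
    return None
```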
While I appreciate the benefit of preserving backward compatibility
with 3.x and of allowing the two to coexist, I wonder whether you
might be better off requiring either separate instances or a one-time
conversion. This would presumably simplify the code and its ongoing
maintenance considerably, since you wouldn't have to maintain parallel
implementations within the same codebase. Also, I imagine that most
people will either want to convert everything to 4.x, to gain the
benefit of eliminating hard links, or be OK with maintaining a
separate archival 3.x instance.
In the mixed case, I imagine that most people retain a relatively long
tail on their backups, so the link count may never decrease to 1 --
which again argues, to me, for either a one-time conversion or a
separate instance.
>
> - For backing up the BackupPC store in a mixed V3/V4 environment it
> should be possible to just copy the new V4 pool and new V4 backups
> (without worrying about hardlinks that might remain on pool files
> from V3 backups). However, I need to devise a way of determining
> the paths of the V4 backups. Perhaps I should add a utility that
> lists all the directories that should be backed up?
>
Personally, I am most interested in a utility that would convert V3 to
V4. This could presumably run in the background, so long as all new
backups were created in V4 format. The fact that the deltas are
time-backward should make this easier, since one could work backwards
from the first V4 instance, or alternatively start by converting the
last V3 instance to a new filled version and work backwards from
there. Since this could run in the background, it wouldn't matter (at
least to me) how slowly it runs. After a while, I could be guaranteed
a nice, consistent, uniform V4 system with all of its advantages (save
that the old V3 versions would still obviously lack extended
attributes).
I really don't like the mixed approach, where V3 pool files gradually
get renamed, adding files with potentially thousands of hard links to
a clean new V4 pool. It would also make it very difficult to back up
the old V3 portion, since the pool for such backups would then be
split between the old V3 pool and the pool files transferred and
renamed into the new V4 pool.
_______________________________________________
BackupPC-devel mailing list
[email protected]
List: https://lists.sourceforge.net/lists/listinfo/backuppc-devel
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/