AHHHH OK - so no magic. I just coded up a new way that should in general be significantly faster.
Basically, I create a new inode-centered pool that I call 'ipool' that is a decimal-based tree (rather than the hexadecimal-based pool/cpool trees). You can set how many levels you want. Then I recurse through the pool/cpool and for every entry, I store a corresponding file in the ipool based on the pool/cpool *inode* number. The file's contents are set to the *name* of the pool/cpool file (actually the path relative to TopDir). Note that the ipool is indexed by the least significant digits of the inode number to ensure a more uniform distribution across the tree.

Then you can recurse through the pc tree and quickly look up each inode to find its pool/cpool location via my ipool construct.

I haven't benchmarked, but I have to believe that this will in general be significantly faster than (re)computing the partial file md5sum for each file in the pc tree (though caching does help of course). Also, my method requires constant memory, so it scales nicely.

Finally, I'm not sure if you implement it in BackupPC_tarPCCopy, but if for some reason a pc tree entry (other than backupInfo) does not have its inode in the ipool, then I flag it and optionally correct it by linking the file back into the pool/cpool.

By the way, this alone could be used as a much faster approach to solving Robin's question earlier, where she needed to check and fix a large pc tree where a number of files had nlinks > 1 but *none* of them were in the pool/cpool.

Craig Barratt wrote at about 01:56:31 -0800 on Friday, January 21, 2011:
> Jeffrey,
>
> > I am trying to understand how BackupPC_tarPCCopy figures out all the
> > hard links from the PC directory to the pool without doing a lot of
> > work and/or without caching lots of pool inodes.
>
> It just opens the file to compute the pool digest. If there are
> multiple files in the pool with the same digest it compares inode
> numbers to determine the match.
>
> Craig
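
P.S. For anyone who wants to see the ipool idea concretely, here is a minimal Python sketch of it. It is for illustration only and is not the actual code: the function names (ipool_path, build_ipool, lookup, scan_pc_tree), the one-decimal-digit-per-level layout, and the default of 3 levels are all assumptions. It only reports pc tree files whose inode is missing from the ipool; it does not relink them, and it does not special-case backupInfo.

```python
import os
import stat

def ipool_path(ipool_root, inode, levels=3):
    """Map an inode number to its ipool file.

    Assumed layout: one decimal digit per directory level, taken from the
    least significant digit upward, e.g. inode 123456 -> 6/5/4/123456.
    """
    digits = str(inode).zfill(levels)
    subdirs = [digits[-(i + 1)] for i in range(levels)]
    return os.path.join(ipool_root, *subdirs, str(inode))

def build_ipool(top_dir, ipool_root, pools=("pool", "cpool"), levels=3):
    """Walk pool/cpool and record each file's TopDir-relative path under its inode."""
    count = 0
    for pool in pools:
        for dirpath, _dirs, files in os.walk(os.path.join(top_dir, pool)):
            for name in files:
                full = os.path.join(dirpath, name)
                entry = ipool_path(ipool_root, os.lstat(full).st_ino, levels)
                os.makedirs(os.path.dirname(entry), exist_ok=True)
                with open(entry, "w") as fh:
                    fh.write(os.path.relpath(full, top_dir))
                count += 1
    return count

def lookup(ipool_root, inode, levels=3):
    """Return the pool/cpool path recorded for this inode, or None if absent."""
    try:
        with open(ipool_path(ipool_root, inode, levels)) as fh:
            return fh.read()
    except FileNotFoundError:
        return None

def scan_pc_tree(top_dir, ipool_root, levels=3):
    """Yield (pc_file, pool_path_or_None) for every regular file in the pc tree."""
    for dirpath, _dirs, files in os.walk(os.path.join(top_dir, "pc")):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if stat.S_ISREG(st.st_mode):
                yield (os.path.relpath(full, top_dir),
                       lookup(ipool_root, st.st_ino, levels))
```

Memory use stays constant because the lookup table lives on disk rather than in a hash, so each pc tree file costs one lstat plus one small file read. An entry that comes back as None is a candidate for the flag-and-relink step.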
