Holger Parplies wrote at about 05:39:15 +0200 on Sunday, October 2, 2011:
> Mike Dresser wrote on 2011-09-29 14:11:20 -0400 [[BackupPC-users] Fairly
> large backuppc pool (4TB) moved with backuppc_tarpccopy]:
> > [...] Did see a few errors, all of them were related to the attrib
> > files, similar to "Can't find xx/116/f%2f/fvar/flog/attrib in pool,
> > will copy file"
> > [...]
> > Out of curiosity, where are those errors (the attrib in pool ones)
> > coming from?
>
> (which is a question, and a good one).
>
> I can't promise that this is the correct answer, but it's a possibility:
> prior to BackupPC 3.2.0, *top-level* attrib files (i.e. those for the
> directory containing all the share subdirectories) were linked into the
> pool with an incorrect digest, presuming there was more than one share.
> This would mean that BackupPC_tarPCCopy would not find the content in
> the pool, because it would look for a file with the *correct* digest
> (i.e. file name). Please note that your quote above does *not* reference
> a *top-level* attrib file (that would be "xx/116/attrib"), and, beyond
> that, you don't seem to have multiple shares, so it might well be a
> different problem.
>
> According to the ChangeLog, Jeffrey should have pointed this out,
> because he discovered the bug and supplied a patch ;-).
>
> I notice this problem on my pool when investigating where the longest
> hash collision chain comes from: it's a chain of top-level attrib files,
> all for the same host and with different contents and thus certainly
> different digests.
As Holger points out, the bug I reported and suggested a patch for involved
top-level attrib files where you have more than one share. This has been
fixed in 3.2.0.

That being said, in the past I did find a couple of broken attrib file
md5sums out of many hundreds of thousands, but I assumed at the time that
it was an artifact of some other messing around I may have been doing. If
you are finding missing pooled attrib files that are not at the top level,
then it would be interesting to figure out what is causing it, since there
may be a real bug somewhere (though again, I haven't seen the problem
recently, but then I haven't really checked recently either).

If you want to troubleshoot, I would do the following (a rough sketch that
automates the inode/link-count lookup and the partial md5sum check is
appended at the end of this message):

- Look up the inode of the bad attrib file in the pc tree.

- Check how many links it has.

- Assuming it has nlinks > 1, search for that inode in the pool, using say:
      find <topdir>/cpool -inum <inode number>

- If the file is indeed hard-linked into the pool, calculate the actual
  partial file md5sum (not the *nix md5sum) using, say, one of my routines,
  and check whether the calculated partial-file md5sum matches the pool
  file name. Presumably it will be different.

- If the file is not in the pool at all, then that is another issue.

- Also, look back through your logs to see when the attrib file was
  actually created and written to the pool, and check whether anything
  looks weird/wrong there.

Assuming that you do have a real issue with non-top-level attrib file
md5sums or pool links, it would be interesting to see whether anybody has
encountered the same problem in versions >= 3.2.0.

> > I still have the old filesystem online if it's something I
> > should look at.
>
> I don't think it's really important. If the attrib file was not in the
> pool previously, then that may simply have wasted a small amount of
> space. As I understand the matter, the file will remain unpooled in the
> copy. You could fix that with one of Jeffrey's scripts or just live with
> a few wasted bytes. If you are running a BackupPC version < 3.2.0,
> pooling likely won't work for those attrib files anyway.
>
> It might be interesting to determine whether the non-top-level attrib
> files you got errors for are also, in fact, pooled under an incorrect
> pool file name, though that would involve finding the pool file by inode
> number and calculating the correct pool hash (or ruling out the existence
> of a pool file due to a link count of 1 :-).

Agreed - see my suggestion above.
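For reference, here is an untested sketch of the inode/link-count lookup
and the partial-md5sum check rolled into one script. It assumes BackupPC
3.x with a compressed pool, the BackupPC Perl libraries under
/usr/share/backuppc/lib (distro-dependent), and that you run it as the
backuppc user; the script name and the library path are just placeholders.
It decompresses the attrib file and computes BackupPC's partial-file MD5
(MD5 over the uncompressed length, the first 128KB and, for files larger
than 256KB, a second 128KB chunk ending at the 1MB point) so you can
compare the result against the cpool file name:

  #!/usr/bin/perl
  # check_attrib_digest.pl - sketch only, not a polished tool.
  # Given the path of a suspect attrib file in the pc tree, print its
  # inode and hard link count, then decompress it and compute BackupPC's
  # partial-file MD5 to compare with the pool file name.
  use strict;
  use warnings;
  use lib "/usr/share/backuppc/lib";   # assumption: adjust to your install
  use BackupPC::Lib;
  use BackupPC::FileZIO;
  use Digest::MD5;

  my $file = shift @ARGV
      or die "usage: $0 <topdir>/pc/host/nnn/.../attrib\n";
  my $bpc = BackupPC::Lib->new()
      or die "can't create BackupPC::Lib (run as the backuppc user?)\n";

  # Steps 1 and 2: inode and hard link count
  my @st = stat($file) or die "stat($file): $!\n";
  printf "inode=%d nlink=%d\n", $st[1], $st[3];

  # Decompress the attrib file (mode 0 = read, compress level 1)
  my $f = BackupPC::FileZIO->open($file, 0, 1)
      or die "can't open $file via FileZIO\n";
  my($buf, $data) = ("", "");
  $data .= $buf while ( $f->read(\$buf, 65536) > 0 );
  $f->close();

  # Compute the partial-file MD5 over the *uncompressed* data. Attrib
  # files are normally well under 256KB, so this is usually just
  # md5(length . contents).
  my $size = length($data);
  my $md5  = Digest::MD5->new;
  $md5->add($size);                   # length as a decimal string
  if ( $size > 262144 ) {
      my $end = $size > 1048576 ? 1048576 : $size;
      $md5->add(substr($data, 0, 131072));
      $md5->add(substr($data, $end - 131072, 131072));
  } else {
      $md5->add($data);
  }
  my $digest = $md5->hexdigest;
  print "partial-file md5: $digest\n";
  print "expected pool file: ", $bpc->MD52Path($digest, 1), "\n";

If the digest it prints doesn't match the cpool file name that
find -inum turns up, you've reproduced the incorrect-digest problem
outside the top-level case. Note that when there are hash collisions the
actual pool file may carry a _0, _1, ... suffix on that base name.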