Hi,

Jeffrey J. Kosowsky wrote on 2011-10-04 18:58:51 -0400 [[BackupPC-users] Bad md5sums due to zero size (uncompressed) cpool files - WEIRD BUG]:
> After the recent thread on bad md5sum file names, I ran a check on all
> my 1.1 million cpool files to check whether the md5sum file names are
> correct.
>
> I got a total of 71 errors out of 1.1 million files:
> [...]
> - 68 of the 71 were *zero* sized when decompressed
> [...]
> Each such cpool file has anywhere from 2 to several thousand links
> [...]
> It turns out though that none of those zero-length decompressed cpool
> files were originally zero length but somehow they were stored in the
> pool as zero length with an md5sum that is correct for the original
> non-zero length file.
> [...]
> Now it seems unlikely that the files were corrupted after the backups
> were completed since the header and trailers are correct and there is
> no way that the filesystem would just happen to zero out the data
> while leaving the header and trailers intact (including checksums).
> [...]
> Also, on my latest full backup a spot check shows that the files are
> backed up correctly to the right non-zero length cpool file which of
> course has the same (now correct) partial file md5sum. Though as you
> would expect, that cpool file has a _0 suffix since the earlier zero
> length is already stored (incorrectly) as the base of the chain.
> [...]
> In summary, what could possibly cause BackupPC to truncate the data
> sometime between reading the file/calculating the partial file md5sum
> and compressing/writing the file to the cpool?
the first and only thing that springs to my mind is a full disk. In some situations, BackupPC needs to create a temporary file (RStmp, I think) to reconstruct the remote file contents. This file can become quite large, I suppose.

Independent of that, I remember there is *at least* an "incorrect size" fixup which needs to copy already written content to a different hash chain (because the hash turns out to be incorrect *after* transmission/compression). Without looking closely at the code, I could imagine (but am not sure) that this could interact badly with a full disk:

* output file is already open, headers have been written
* huge RStmp file is written, filling up the disk
* received file contents are for some reason written to disk (which
  doesn't work - no space left) and read back for writing into the
  output file (giving zero-length contents)
* trailing information is written to the output file - this works,
  because there is enough space left in the already allocated block for
  the file
* RStmp file gets removed and the rest of the backup continues without
  apparent error

Actually, for the case I tried to invent above, this doesn't seem to fit, but the general idea could apply - at least the symptoms are "correct content stored somewhere but read back incorrectly". This would mean the result of a write operation would have to be unchecked by BackupPC somewhere (or handled incorrectly).

So, the question is: have you been running BackupPC with an almost full disk? Would there be at least one file in the backup set of which the *uncompressed* size is large in comparison to the reserved space (-> DfMaxUsagePct)?

For the moment, that's the most concrete thing I can think of. Of course, writing to a temporary location might be fine and reading could fail (you haven't modified your BackupPC code to use a signal handler for some arbitrary purposes, have you? ;-). Or your Perl version could have an obscure bug that occasionally trashes the contents of a string.
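Just to make the hypothesis concrete: the failure mode would be a body write that fails with ENOSPC, an error that goes unchecked, and a trailer that still fits once space is freed - leaving a structurally valid but zero-content pool file. A minimal sketch (Python rather than BackupPC's actual Perl; the FlakyFile wrapper and the header/trailer bytes are invented purely for illustration):

```python
# Hypothetical sketch, NOT BackupPC's actual code: simulate a disk that
# fills up between the header write and the body write.
import os
import tempfile

class FlakyFile:
    """Wraps a file object; writes fail on demand as if the disk were full."""
    def __init__(self, f):
        self.f = f
        self.disk_full = False

    def write(self, data):
        if self.disk_full:
            raise OSError(28, "No space left on device")  # ENOSPC
        return self.f.write(data)

path = os.path.join(tempfile.mkdtemp(), "poolfile")
with open(path, "wb") as raw:
    f = FlakyFile(raw)
    f.write(b"HDR")              # header written while space remains
    f.disk_full = True           # huge RStmp file fills the disk
    try:
        f.write(b"actual file contents")  # body write fails here ...
    except OSError:
        pass                     # ... but the failure is swallowed/unchecked
    f.disk_full = False          # RStmp removed, space available again
    f.write(b"TRL")              # trailer still fits

content = open(path, "rb").read()
# content is b"HDRTRL": header and trailer intact, body missing
```

That would match the symptoms exactly: correct header and trailer, zero-length contents, and no error reported for the backup.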
Doesn't sound very likely, though.

What *size* are the original files?

Ah, yes: how many backups are (or rather were) you running in parallel? No one said the RStmp file needs to be created by the affected backup ...

Regards,
Holger