After the recent thread on bad md5sum file names, I checked all 1.1 million of my cpool files to verify that their md5sum file names are correct.
I got a total of 71 errors out of 1.1 million files:
- 3 had data in them (though each file was only a few hundred bytes long)
- 68 of the 71 were *zero* sized when decompressed:
    29 were 8 bytes long, corresponding to zlib compression of a zero-length file
    39 were 57 bytes long, corresponding to a zero-length file with an rsync checksum

Each such cpool file has anywhere from 2 to several thousand links.

The 68 *zero*-length files should *not* be in the pool, since zero-length files are not pooled. So something is really messed up here.

It turns out, though, that none of those zero-length decompressed cpool files were originally zero length. Somehow they were stored in the pool as zero length with an md5sum that is correct for the original non-zero-length file.

Some are attrib files and some are regular files.

Now, it seems unlikely that the files were corrupted after the backups were completed, since the headers and trailers are correct and there is no way that the filesystem would just happen to zero out the data while leaving the headers and trailers intact (including checksums).

Also, it's not the rsync checksum caching causing the problem, since some of the zero-length files are without checksums.

Now, the fact that the md5sum file names are correct relative to the original data means that the file was originally read correctly by BackupPC.

So it seems that for some reason the data was truncated when compressing and writing the cpool/pc file, but after the partial file md5sum was calculated. And it seems to have happened multiple times for some of these files, since there are multiple pc files linked to the same pool file (and before linking to a cpool file, the actual content of the files is compared, since the partial file md5sum is not unique).

Also, on my latest full backup, a spot check shows that the files are backed up correctly to the right non-zero-length cpool file, which of course has the same (now correct) partial file md5sum.
Though, as you would expect, that cpool file has a _0 suffix, since the earlier zero-length file is already stored (incorrectly) as the base of the chain.

I am not sure what is going on with the other 3 files, since I have yet to find them in the pc tree (my 'find' routine is still running).

I will continue to investigate this, but it is very strange and worrying, since truncated cpool files mean data loss!

In summary, what could possibly cause BackupPC to truncate the data sometime between reading the file/calculating the partial file md5sum and compressing/writing the file to the cpool?
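
For anyone who wants to look for the same symptom in their own pool, here is a minimal sketch in Python. It is not part of BackupPC and is not the check I ran; it only flags pool files whose on-disk size matches the two sizes reported above (8 and 57 bytes) and prints their hard link counts. The function name, the constants and the example path are all hypothetical, and a size match is only a hint: each candidate still has to be decompressed with BackupPC's own tools to confirm that it is really empty.

```python
import os
import zlib

# Sizes taken from the observations in this message; these are assumptions
# for illustration, not constants defined anywhere in BackupPC.
EMPTY_ZLIB_SIZE = 8            # a zlib stream produced from zero bytes of input
EMPTY_WITH_CHECKSUM_SIZE = 57  # same, with an rsync checksum appended

# Sanity check: standard zlib at its default level emits 8 bytes for empty input.
assert len(zlib.compress(b"")) == EMPTY_ZLIB_SIZE


def suspicious_pool_files(pool_root):
    """Yield (path, size, link_count) for files whose size suggests an empty payload.

    This is a size-only heuristic: it does not decompress anything.
    """
    suspect_sizes = (EMPTY_ZLIB_SIZE, EMPTY_WITH_CHECKSUM_SIZE)
    for dirpath, _dirnames, filenames in os.walk(pool_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if st.st_size in suspect_sizes:
                # st_nlink shows how many directory entries point at this file
                yield path, st.st_size, st.st_nlink
```

A run over a pool could look like `for path, size, links in suspicious_pool_files("/path/to/cpool"): print(path, size, links)`, where the path is a placeholder for wherever the cpool lives on a given install. A high link count on a flagged file means many backups reference the truncated copy.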