I had a similar issue some months back. It turned out to be a bad RAM stick in my NAS. The problem would not show up right after a restart, but after some usage it would hit the RAM errors and :(
This may not be your issue, but I remember how annoying it was to figure out.

On Fri, May 22, 2026 at 9:53 AM Charles Curley <[email protected]> wrote:
>
> I have four four terabyte hard drives. Each has a partition on it. The
> four partitions comprise a RAID 5 array using mdadm. On top of that,
> LUKS encryption, then LVM with ext4 logical volumes.
>
> On one LVM partition I have a number of backup files, tarred,
> bzipped, and sha256 and sha512 summed. I have a script which will find
> checksum files, and execute the appropriate program to test the
> archives. It puts each program into the background, parallelising any
> number of checksum tests.
>
> Starting about a week ago, the script finds an error in one or more
> files out of several. Results are inconsistent: one pass may find an
> error in a given file, the next pass not find any errors in it. Running
> checksums manually, one at a time, does not turn up an error. Running
> "tar tvf" finds no error in a suspect file. Running "bunzip2 -t" also
> turns up no error. Only running the script turns up any errors.
>
> I create two checksum files when I create the backups, for sha256 and
> sha512. After this problem surfaced (about a week ago), I then made two
> new checksum files of a suspect file. The two checksum file pairs
> (e.g. both sha512sum files) show the same checksums. The script now
> tests using both the old and new checksum files. Sometimes only one pair
> of checksum files fail the suspect file.
>
> In addition to all of that, I also get the occasional "bad message"
> error. I have no idea what that means, but an fsck seems to deal with
> it.
>
> To be thorough, I have run extended SMART tests on the hard drives,
> kicked mdadm into testing the RAID array, and fscked the LVM partitions
> on the RAID array. Only fsck turned up issues, and that has not stopped.
>
> I also back some of this up to offsite USB drives. I ran the script on
> one of those, using a different computer. No errors reported.
>
> I have a hypothesis as to what is going on, but would like to hear from
> you before I discuss it.
>
> --
> Does anybody read signatures any more?
>
> https://charlescurley.com
> https://charlescurley.com/blog/
>

--
- Andrew "lathama" Latham -
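
P.S. In case it helps narrow things down, here is a rough sketch of one way to check whether the same unchanged file hashes differently from one pass to the next, which is the kind of symptom flaky RAM can produce. This is not the script described above. The function names (`sha256_of`, `repeated_digests`, `is_consistent`) are made up for illustration, and it only covers SHA-256. One caveat: on Linux, repeated reads of a small file may be served from the page cache, so a clean result here does not rule out hardware trouble.

```python
import hashlib
from collections import Counter


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_digests(path, passes=5):
    """Hash the same file several times and count each distinct digest."""
    return Counter(sha256_of(path) for _ in range(passes))


def is_consistent(counts):
    """True when every pass produced the same digest."""
    return len(counts) == 1
```

Calling `repeated_digests` on a suspect archive and getting more than one distinct digest, with the file itself untouched between passes, would point at something between the disk and the CPU rather than at the archive. A memory tester run overnight would be the more direct check.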

