Zak Kohler posted on Sun, 29 Oct 2017 21:57:00 -0400 as excerpted:

> So I ran memtest86+ 5.01 for >4 days:
> Pass: 39 Errors: 0

Be aware that memtest86+ will detect some types of errors but not others.

In particular, some years ago (DDR1/3-digit-Opteron era) I had some memory, registered (as Opterons required) and ECC, that passed that sort of memory test but was nonetheless bad: at its rated speed it was unreliable at memory /transfers/. It passed because what the test /tests/ is memory cell retention (if you put a value in, does it verify on read-back?).

Eventually that mobo got a BIOS update that could adjust memory clocking, and I downclocked it a notch (from its rated pc3200 to 3000, IIRC). At the lower clock it was rock stable, even with reduced wait-states to make up a bit of the performance I was losing to the lower clock. But I had to get a BIOS that could do it first, and as I said, in the meantime memtest86, etc, reported it was fine, because it was testing stable-state, not stress-testing memory transfers at full rated bandwidth.

Eventually I did a memory upgrade, and the new memory worked fine at full rated speed.

As for actual symptoms, that was well before I switched to btrfs (actually probably about the time btrfs development started, I'd guess), but the most frequent early symptoms were untar/bunzip2 checksum failures, with some system crashes during builds (gentoo, so I do a lot of untarballing of sources and builds). Later, when the kernel got support for the amd/opteron machine-check (MCE) error detection hardware, that would sometimes, but not always, trigger as well, but the most frequent symptom really was checksum verification failures untarring tbz2s.

Of course if btrfs had been around for me to try to run back then, I expect I'd have been triggering enough checksum failures that I'd have given up on running it. But FWIW, tho the reiserfs I was running wasn't checksummed, it didn't ever seem to significantly corrupt, with this or other hardware problems I had.

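FWIW, the kind of load that exposed the problem (lots of bunzip2 plus checksum verification) can be roughly imitated in a few lines. This is only a sketch, not anything memtest86+ does, and the function names are made up for illustration. It compresses a buffer once, then decompresses and re-hashes it repeatedly, counting mismatches. A single Python process is unlikely to saturate memory bandwidth, so a clean run proves little, but a nonzero failure count on a box that passes memtest86+ would point toward the sort of transfer trouble described above.

```python
import bz2
import hashlib


def make_payload(size_bytes: int, seed: bytes = b"memcheck") -> bytes:
    """Build a deterministic payload (hypothetical helper, for illustration)."""
    chunks = []
    block = seed
    total = 0
    while total < size_bytes:
        block = hashlib.sha256(block).digest()
        chunks.append(block * 64)  # repeat each digest so the data compresses
        total += len(block) * 64
    return b"".join(chunks)[:size_bytes]


def stress_roundtrip(payload: bytes, passes: int) -> int:
    """Compress once, then decompress and verify `passes` times.

    Returns the number of passes that failed, either because bz2's own
    check rejected the stream or because the SHA-256 did not match.
    """
    expected = hashlib.sha256(payload).hexdigest()
    archive = bz2.compress(payload)
    failures = 0
    for _ in range(passes):
        try:
            restored = bz2.decompress(archive)
        except (OSError, ValueError, EOFError):
            failures += 1
            continue
        if hashlib.sha256(restored).hexdigest() != expected:
            failures += 1
    return failures


if __name__ == "__main__":
    data = make_payload(2 * 1024 * 1024)  # 2 MiB; raise this for a real run
    bad = stress_roundtrip(data, passes=5)
    print(f"{bad} failed passes out of 5")
```

For a real soak you'd want a much larger payload, many more passes, and one copy running per core, ideally alongside an actual build.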
What really amazed me about reiserfs was that it remained stable thru not only that hardware problem but various others I've had over the years that would have killed other filesystems or made them unworkable. Once it got ordered journaling (as opposed to the original writeback journaling that gave it the bad rep... and later poked holes in ext3 reliability when they tried it there for a few kernel versions too) and switched to ordered by default, it really /was/ remarkably stable, even in the face of hardware issues that would take down many filesystems.

Of course one of the things that attracted me to btrfs years later was that the same Chris Mason who helped reiserfs get that ordered journaling, back when he was working for SuSE and they were using reiserfs by default, began btrfs. Tho btrfs is far more complex, and isn't yet as stable as reiserfs ended up being for me. And on bad hardware it may never be, even if eventually that's simply due to checksum failures making it effectively unusable.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman