On Thursday 12 May 2016 17:43:38 CEST, Austin S. Hemmelgarn wrote:
That's probably a good indication of the CPU and the MB being OK, but not necessarily the RAM. There are two other possible options for testing the RAM that haven't been mentioned yet though (which I hadn't thought of myself until now):

1. If you have access to Windows, try the Windows Memory Diagnostic. This runs yet another slightly different set of tests from memtest86 and memtest86+, so it may catch issues they don't. You can start this directly on an EFI system by loading /EFI/Microsoft/Boot/MEMTEST.EFI from the EFI system partition.
2. This is a Dell system. If you still have the utility partition which Dell ships all their pre-provisioned systems with, that should have a hardware diagnostics tool. I doubt that this will find anything (it's part of their QA procedure AFAICT), but it's probably worth trying, as the memory testing in that uses yet another slightly different implementation of the typical tests. You can usually find this in the boot interrupt menu accessed by hitting F12 before the boot-loader loads.

I tried the Dell system test, including the optional enhanced RAM tests, and it was fine. I also tried the Microsoft one, which passed. BUT if I select the advanced test in the Microsoft tool, it always stops at 21% of the first test. The test menus still work, but the fans go quiet and it keeps showing "test running... 21%" forever. I tried it many times and it always got stuck at 21%, so I suspect a test suite bug rather than a RAM failure.

I also noticed some other interesting behaviour. While I was running the usual scrub + check from the live CD (both were fine), I noticed this in dmesg:

[ 261.301159] BTRFS info (device dm-0): bdev /dev/mapper/cryptroot errs: wr 0, rd 0, flush 0, corrupt 4, gen 0

Corrupt? But both scrub and check were fine... I ran scrub and check again and they were still fine.
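
For anyone comparing such lines across boots, the counters in that dmesg message can be pulled apart with a short script. This is only a minimal sketch, assuming the message format is exactly as in the line quoted above; the function name parse_btrfs_errs is hypothetical and not part of any btrfs tool. As far as I know these are the same cumulative per-device counters that `btrfs device stats` reports, so a non-zero value may date from an earlier event rather than from the current scrub, but treat that as an assumption too.

```python
import re

# Assumed message format, copied from the dmesg line quoted above.
STATS_RE = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(line):
    """Return (device, counters) for a 'bdev ... errs:' line, or None if it does not match."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    counters = {k: int(v) for k, v in m.groupdict().items() if k != "dev"}
    return m.group("dev"), counters


line = (
    "[ 261.301159] BTRFS info (device dm-0): bdev /dev/mapper/cryptroot "
    "errs: wr 0, rd 0, flush 0, corrupt 4, gen 0"
)
dev, counters = parse_btrfs_errs(line)
# Keep only the counters that are non-zero.
nonzero = {name: value for name, value in counters.items() if value}
print(dev, nonzero)
```

Feeding it the output of `dmesg` line by line and keeping only the non-zero counters makes it easy to see whether the corrupt count grows between runs or stays at 4.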

This is what happened another time:

https://drive.google.com/open?id=0Bwe9Wtc-5xF1dGtPaWhTZ0w5aUU

I was making a backup of my partition USING DD from the live CD. It wasn't even mounted, if I recall correctly!

On Thursday 12 May 2016 18:48:17 CEST, Zygo Blaxell wrote:
That's what a RAM corruption problem looks like when you run btrfs scrub.
Maybe the RAM itself is OK, but *something* is scribbling on it.

Does the Arch live usb use the same kernel as your normal system?

Yes, except for the point release (the system is slightly ahead of the liveusb).

On Thursday 12 May 2016 18:48:17 CEST, Zygo Blaxell wrote:
Did you try an older (or newer) kernel?  I've been running 4.5.x on a few
canary systems, but so far none of them have survived more than a day.

No (except for point releases from 4.5.0 to 4.5.4), but I will try 4.4.

On Thursday 12 May 2016 18:48:17 CEST, Zygo Blaxell wrote:
It's possible there's a problem that affects only very specific chipsets.
You seem to have eliminated RAM in isolation, but there could be a problem
in the kernel that affects only your chipset.

Funny, considering it is sold as a Linux laptop. Unfortunately, they only tested it with the ancient Ubuntu 14.04.

Niccolò
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
