On 2020-12-28, Bastien Durel <[email protected]> wrote:
> On Monday 28 December 2020 at 09:23 +1000, Stuart Longland wrote:
>> On 28/12/20 3:56 am, Bastien Durel wrote:
>> > After that I got a (maybe) endless loop of panics inducing panics
>> > (I did
>> > not got the output, it was cycling fast), and after that the /bsd
>> > file
>> > was left empty :
>> >
>> > > OpenBSD/amd64 BOOT 3.52
>> > > boot> NOTE: random seed is being reused.
>> > > booting hd0a:/bsd: read header
>> > > failed(0). will try /bsd
>> …
>> > How can I figure out the cause of all these problems ?
>>
>> Seems awfully strange for `/bsd` to become zero-length out-of-the-blue.
>> Got a `memtest86` disk handy?
>>
>> I'd be checking:
>> - RAM
>> - disks
>> - CPU
>>
>> I think from the `dmesg` the storage device is a SSD? Could it be it
>> has failed early? Some do that, and they give practically no warning
>> when they do.
>
> SMART is OK on the disk
>
> I ran a memtest86 test, and got thousands of errors
>
> Test Start Time          2020-12-28 08:38:08
> Elapsed Time             0:01:11
> Memory Range Tested      0x0 - 16F000000 (5872MB)
> CPU Selection Mode       Parallel (All CPUs)
> ECC Polling              Enabled
>
> Lowest Error Address     0x12AA18018 (4778MB)
> Highest Error Address    0x12BFE7FF8 (4799MB)
> Bits in Error Mask       FF00000000000000
> Bits in Error            8
> Max Contiguous Errors    1
>
> Test                                          # Tests Passed   Errors
> Test 0 [Address test, walking ones, 1 CPU]    1/1 (100%)       0
> Test 1 [Address test, own address, 1 CPU]     0/0 (0%)         10988
>
> Last 10 Errors
> 2020-12-28 08:39:19 - [Data Error] Test: 1, CPU: 0, Address: 12BFE7FF8,
> Expected: 000000012BFE7FF8, Actual: 100000012BFE7FF8
> 2020-12-28 08:39:19 - [Data Error] Test: 1, CPU: 0, Address: 12BFE7FE8,
> Expected: 000000012BFE7FE8, Actual: 040000012BFE7FE8
> 2020-12-28 08:39:19 - [Data Error] Test: 1, CPU: 0, Address: 12BFE7F58,
> Expected: 000000012BFE7F58, Actual: 040000012BFE7F58
> 2020-12-28 08:39:19 - [Data Error] Test: 1, CPU: 0, Address: 12BFE7F48,
> Expected: 000000012BFE7F48, Actual: 080000012BFE7F48
> 2020-12-28 08:39:19 - [Data Error] Test: 1, CPU: 0, Address: 12BFE7EF8,
> Expected: 000000012BFE7EF8, Actual: 400000012BFE7EF8
> 2020-12-28 08:39:19 - [Data Error] Test: 1, CPU: 0, Address: 12BFE7EE8,
> Expected: 000000012BFE7EE8, Actual: C00000012BFE7EE8
> 2020-12-28 08:39:19 - [Data Error] Test: 1, CPU: 0, Address: 12BFE7EC8,
> Expected: 000000012BFE7EC8, Actual: 040000012BFE7EC8
> 2020-12-28 08:39:19 - [Data Error] Test: 1, CPU: 0, Address: 12BFE7E58,
> Expected: 000000012BFE7E58, Actual: 400000012BFE7E58
> 2020-12-28 08:39:19 - [Data Error] Test: 1, CPU: 0, Address: 12BFE7D58,
> Expected: 000000012BFE7D58, Actual: 080000012BFE7D58
> 2020-12-28 08:39:19 - [Data Error] Test: 1, CPU: 0, Address: 12BFE7D48,
> Expected: 000000012BFE7D48, Actual: 080000012BFE7D48
>
> So hardware failure confirmed :/ Do you think I can change the RAM or
> it's more likely a CPU/Chipset failure ?
>
> Thanks,
>

If you have multiple sticks of RAM, try removing some and re-running
memtest86 to see whether the errors follow one particular stick.
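
For what it's worth, every error quoted above seems to differ from the
expected value only in the top byte of the 64-bit word, which matches the
reported "Bits in Error Mask" of FF00000000000000. Here is a rough Python
sketch that XORs the expected and actual values from the "Last 10 Errors"
list to show this. The names ERRORS, flipped_bits and combined_mask are
mine, purely for illustration; they are not part of memtest86.

```python
# (expected, actual) pairs copied from the "Last 10 Errors" list quoted above.
ERRORS = [
    (0x000000012BFE7FF8, 0x100000012BFE7FF8),
    (0x000000012BFE7FE8, 0x040000012BFE7FE8),
    (0x000000012BFE7F58, 0x040000012BFE7F58),
    (0x000000012BFE7F48, 0x080000012BFE7F48),
    (0x000000012BFE7EF8, 0x400000012BFE7EF8),
    (0x000000012BFE7EE8, 0xC00000012BFE7EE8),
    (0x000000012BFE7EC8, 0x040000012BFE7EC8),
    (0x000000012BFE7E58, 0x400000012BFE7E58),
    (0x000000012BFE7D58, 0x080000012BFE7D58),
    (0x000000012BFE7D48, 0x080000012BFE7D48),
]

TOP_BYTE_MASK = 0xFF00000000000000  # the mask memtest86 reported


def flipped_bits(expected, actual):
    """Return the bit positions (0 = least significant) that differ."""
    diff = expected ^ actual
    return [bit for bit in range(64) if (diff >> bit) & 1]


def combined_mask(errors):
    """OR together the XOR difference of every (expected, actual) pair."""
    mask = 0
    for expected, actual in errors:
        mask |= expected ^ actual
    return mask


if __name__ == "__main__":
    for expected, actual in ERRORS:
        print(f"{expected:016X} -> bits {flipped_bits(expected, actual)}")
    mask = combined_mask(ERRORS)
    print(f"combined mask of these 10 errors: {mask:016X}")
    print("all inside top byte:", mask & ~TOP_BYTE_MASK == 0)
```

Errors confined to one byte lane like this would be consistent with a single
bad chip or data line on one module, but that is only a guess from ten
samples; swapping sticks and re-testing is what would actually confirm it.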

