On Wednesday, 5 July 2023 16:38:35 BST Michael wrote:

> Hmm ... you won't like what I have to say now about bad RAM:

:)

> Servers and specialised workstations with large amounts of RAM use RDIMM ECC
> to correct errors.  With modern PCs having even more RAM than some servers,
> it can take forever to test them thoroughly using memtest86+.  Perhaps if
> you remove all but one stick and test it overnight, then replace this with
> the next stick and so on until all are tested, you may find the problematic
> stick sooner. 

This version of memtest86 ran to completion after going through the whole 
64GB, and stopped with a success message.

> Or, carry on as you are and keep an eye out for errors.  A heavy round of
> emerge is just as likely to trip over it sooner or later.

Over the last...oh, many months, I've noticed an occasional package in a large 
batch failing for no obvious reason, only to succeed on its own. I haven't 
been able to diagnose this, but it's one factor behind my trying to find the 
best settings of jobs and load average, on the suspicion that job control is 
weaker at high loads.
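
For illustration only, here is a minimal sketch of how one might derive 
starting values for those two settings from the CPU count. The function name, 
the 0.9 load factor and the default of two parallel emerge jobs are 
assumptions made for the example, not tested recommendations. The only real 
options used are make's -j and -l and emerge's --jobs and --load-average.

```python
import os


def suggest_portage_settings(cpus=None, emerge_jobs=2, load_factor=0.9):
    """Return hypothetical starting values for make.conf (illustrative only)."""
    if cpus is None:
        cpus = os.cpu_count() or 1
    # Share the CPUs among the parallel emerge jobs, at least one make job each.
    make_jobs = max(1, cpus // emerge_jobs)
    # Cap the load average slightly below the CPU count (assumed factor).
    load = max(1.0, round(cpus * load_factor, 1))
    return {
        "MAKEOPTS": f"-j{make_jobs} -l{load}",
        "EMERGE_DEFAULT_OPTS": f"--jobs={emerge_jobs} --load-average={load}",
    }


if __name__ == "__main__":
    # Print lines in the form used by /etc/portage/make.conf.
    for name, value in suggest_portage_settings().items():
        print(f'{name}="{value}"')
```

Lowering the load factor, or emerge_jobs, would be the obvious first thing to 
try if the sporadic failures really do correlate with high load.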

> The other culprit to consider is a power supply problem, especially if this
> is not a laptop or a PC fed off a UPS.  A transient glitch could cause a
> one-off error and you wouldn't even notice it.

I do run a UPS. Just as well, too, as the village is fed over-ground and we 
get one or two brief blackouts every year. They're odd, because they last 
longer than delayed auto-reclose but shorter than I would expect manual 
switching to take (I used to work in that industry). Maybe the operators are 
quicker on their feet these days - it was a long time ago.  ;-)

-- 
Regards,
Peter.



