On 2014-07-01 17:33, O. Hartmann wrote:
On Tue, 01 Jul 2014 17:23:14 +0200
Willem Jan Withagen <w...@digiware.nl> wrote:
On 2014-07-01 16:48, Rang, Anton wrote:
DOT => DOD
444F54 => 444F44
That's a single-bit flip. Bad memory, perhaps?
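The single-bit claim is easy to check by XORing the two byte strings: 'T' is 0x54 and 'D' is 0x44, so they differ only in bit 4 (0x10). A minimal sketch in Python; the helper name differing_bits is made up for illustration and is not from any tool mentioned in this thread:

```python
def differing_bits(a: bytes, b: bytes):
    """Return (byte_index, bit_index) for every bit that differs between a and b."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    diffs = []
    for i, (x, y) in enumerate(zip(a, b)):
        delta = x ^ y  # set bits mark positions where the two bytes disagree
        for bit in range(8):
            if delta & (1 << bit):
                diffs.append((i, bit))
    return diffs


# "DOT" vs "DOD", i.e. 444F54 vs 444F44
print(differing_bits(b"DOT", b"DOD"))  # → [(2, 4)]
```

Exactly one entry in the result means exactly one flipped bit, here bit 4 of the third byte.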
Very likely, especially if the system does not have ECC....
It just happens on rare occasions that an alpha particle, power cycle, or
anything else disruptive damages a memory cell. And it could be that
it requires a special pattern of accesses to actually exhibit the error.
In the past (the 1990s) 'make buildworld' used to be a rather good memory
tester. But nowadays look at
This tool has found all of the bad memory in all the systems I have used
and/or built for others...
Note that it might take a few runs and some more heat to actually
trigger the faulty cell, but memtest86 will usually find it.
Note that on big systems with lots of memory it can take a loooooong
time to run just one full testset to completion.
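To illustrate why the access pattern matters, here is a rough sketch of a "walking ones" pass, one of the classic pattern styles used by memory testers. It is only an illustration: it runs in userland on an ordinary process buffer, goes through the CPU caches and virtual memory, and so cannot replace a bare-metal tool like memtest86. The function name and buffer size are arbitrary choices made for this example.

```python
def walking_ones_test(buf: bytearray):
    """Write each single-bit pattern to every byte, read it back, and
    return a list of (offset, expected, actual) mismatches."""
    errors = []
    for bit in range(8):
        pattern = 1 << bit  # 0x01, 0x02, ... 0x80
        for i in range(len(buf)):
            buf[i] = pattern
        for i in range(len(buf)):
            if buf[i] != pattern:
                errors.append((i, pattern, buf[i]))
    return errors


buf = bytearray(64 * 1024)  # 64 KiB of process memory, size chosen arbitrarily
print(len(walking_ones_test(buf)), "mismatches")
```

A cell that is stuck or flaky on one particular bit, as in the 0x54 => 0x44 case above, would only show up on the pass whose pattern sets that bit, which is why a full tester cycles through many patterns and takes so long.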
I already tested with memtest86+ (had to download the Linux image, the port is
broken on CURRENT). It didn't find anything strange so far.
I will do another test.
I realised that on that specific box, the chipset temperature is 81 degrees.
The chipset is an Eaglelake P45, which is where the memory controller resides
on that platform. dmidecode gives:
Manufacturer: ASUSTeK Computer INC.
Product Name: P5Q-WS
Version: Rev 1.xx
I've built several (5+) systems with these boards (from memory they date
from around 2009??). And if I recall right, one of them is still functional.
The first one broke down within a couple of weeks, and the others did not
survive over time either.
The auxiliary chips on that board do run hot, but I never realized they ran
this hot. Is 81C the CPU temp from sysctl, or did you measure the heat sink
on the motherboard? In the latter case it is probably just too hot.
But even if it is the temp on the chip itself, I've rarely seen temps
go up this high.
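For reference, a small sketch of reading the per-core CPU temperature through sysctl from Python. It assumes FreeBSD with the coretemp(4) driver loaded, which exposes dev.cpu.N.temperature with values like "45.0C". Note this is the CPU core sensor, not the chipset; whether the P45 chipset temperature is available through sysctl at all is not something this sketch can answer. The function names are made up for illustration.

```python
import re
import subprocess


def parse_temperature(text: str) -> float:
    """Extract the Celsius value from sysctl output such as
    'dev.cpu.0.temperature: 45.0C' (or just '45.0C' from sysctl -n)."""
    match = re.search(r"(-?\d+(?:\.\d+)?)C\s*$", text.strip())
    if match is None:
        raise ValueError(f"unrecognised temperature output: {text!r}")
    return float(match.group(1))


def read_cpu_temperature(cpu: int = 0) -> float:
    """Query one core's sensor; assumes FreeBSD with coretemp(4) loaded."""
    out = subprocess.run(
        ["sysctl", f"dev.cpu.{cpu}.temperature"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_temperature(out)
```

Calling read_cpu_temperature(0) while a memory test or buildworld is running would show whether the reading climbs under load.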
You may need to run memtest86 for more than 6-10 complete runs with
all the tests.
If the memtests do not reveal anything broken, then you get into even
more wizardry stuff, like bad power etc... Especially since it only
occurs on occasion, it is going to be a nightmare to find the root cause
of this. Other than replacing hardware piece by piece, which won't be
easy given the age of the board and parts.
You could go into the BIOS and try to configure RAM access at a slower
speed and see if the problem goes away. If it does, it could be that you are
running on the edge of the spec with regard to RAM timing.
But like I said, it is all lots of funky details that can interact in
strange and unexpected ways.
firstname.lastname@example.org mailing list
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"