On Tuesday, November 26, 2002, at 01:09 PM, MonMotha wrote:
What tests did you run? If you did the "all tests" (I think there's 12
of them), the last test is VERY thorough and should catch any memory
errors on the first go-around. Barring that, it never hurts to run it
longer, but likely the ram is good if it can pass all the memtest86
tests.
All tests. I was wary of the ram at one point, but, I trust the test.
I'll be doing it again, though. I never trust myself completely.
Another thing that happened to someone recently was the motherboard
not setting the voltage correctly with AUTO. Forcing the voltage to
the value in the spec sheets fixed his problems.
This will be investigated. It seems like I couldn't get 30 days with
bad voltage, but, perhaps this ultimately leads to suggestion 3,
thermal shutdown. I'll check.
What processor is this again? The Pentium 3 thermal shutdown is to
simply execute the HLT instruction, which would be like a hard lock,
though I believe it does throw an MCE just before that, which would
cause a kernel panic if MCE is enabled.
This is a dual Athlon MP board. Does it behave differently than the
Pentium 3?
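
For anyone following along, here is a minimal sketch of how one might check whether the CPU advertises the machine check (MCE) feature on a Linux box. It assumes an x86 system where /proc/cpuinfo carries a "flags" line; the helper names cpu_flags and has_mce are made up for illustration. Note that this only shows what the processor reports, not whether the running kernel was built with MCE handling enabled.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (non-x86 system or unexpected format).
    return set()


def has_mce(cpuinfo_text):
    """True if the CPU reports the machine check exception feature."""
    return "mce" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print("mce flag present:", has_mce(path.read_text()))
    else:
        print("/proc/cpuinfo not available on this system")
```

If the flag is missing, or the kernel was built without MCE support, a thermal event would more likely show up as a silent hard lock than as a panic.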
This definitely sounds like a hardware issue (possibly thermal
shutdown?). Normally the kernel manages to at least throw up an Oops
on hardware failure, but occasionally hard locks are the result. If
you can find something that reliably triggers the problem, you can go
a long way toward diagnosing the cause. Another possibility if it is
software is a problem in an interrupt handler or some other situation
where the kernel can't be interrupted but control is never returned
to the kernel by a driver.
I have theorized that my Realtek ethernet chipset may be substandard
for this application. A FreeBSD friend pointed out that the author of
the Realtek driver for FreeBSD made a few very negative comments about
the quality of the chipset in his man pages. He makes these two
comments:
snip comments re: realtek
The rl driver was written by Bill Paul <[EMAIL PROTECTED]>.
In your opinion, could this lead to a lock-down, and does realtek have
that bad of a reputation in the Linux community? It sounds pretty bad
to me.
The Linux community isn't usually as in tune with the hardware gurus
(other than the driver developers themselves), but I have seen a few
gripes about the quality of the spec sheets (and the chip in general)
of the Realtek 8139. However, I've used those cards without incident
for years, achieving uptimes of many months (power outages...).
However, depending on the application, you might want a nicer card like
a 3Com 3c905.
I too have a lot of them in use and have always thought it was a rather
stable chip. I chose to use it over the 3c905 for this reason. Asus
shipped this 3com nic with the motherboard. Perhaps for a reason.
Pending other information developments, I am quadrupling my file-max
setting. The extra 3com nic is strategy 2.
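
As a rough illustration of the file-max change, the sketch below reads the kernel's file handle counters and works out what a quadrupled limit would be. It assumes a Linux system exposing /proc/sys/fs/file-nr with three whitespace-separated numbers (allocated handles, allocated-but-unused handles, and the maximum); the function names are hypothetical, and the script only reports the value rather than applying it.

```python
from pathlib import Path

FILE_NR = Path("/proc/sys/fs/file-nr")


def parse_file_nr(text):
    """Split file-nr contents into (allocated, unused, maximum)."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return allocated, unused, maximum


def quadrupled(maximum):
    """Return the proposed limit: four times the current file-max."""
    return maximum * 4


if __name__ == "__main__":
    if FILE_NR.exists():
        allocated, unused, maximum = parse_file_nr(FILE_NR.read_text())
        print(f"allocated={allocated} unused={unused} file-max={maximum}")
        print(f"proposed file-max: {quadrupled(maximum)}")
        # Applying the new value needs root: write the number
        # to /proc/sys/fs/file-max.
    else:
        print("/proc/sys/fs/file-nr not available on this system")
```

Comparing the allocated count against the maximum before and after a lockup would show whether handle exhaustion is plausible at all.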
scott