Hi misc@,

I have two Supermicro X9SBAA servers in a rack. These are Atom
S1260 systems with 8GB of ECC memory.

I noticed that one of them was beeping while flashing an alarm
light. There were no symptoms otherwise, but obviously this is
concerning.

It was running OpenBSD 7.8, and I upgraded to 7.9 while trying to
diagnose it. The upgrade made no difference.

The trigger appears to be random. It doesn't seem to be thermally
related: the temperature, according to sysctl, is maybe 3C higher
than on the (basically) identical server next to it, and I can peg
all four threads without setting it off. The alarm starts and stops
fairly randomly, but I know that at one point during fw_get it
started as soon as ftp was going. That suggests it may be network
related, but downing the links and/or unplugging the Ethernet cables
did not reliably stop the alarm.

I tried with spread spectrum both on and off in the BIOS. PSU
voltages looked normal. I eventually lowered the clock with
hw.setperf=1 and did not hear the alarm after that, but I can't be
sure it has actually stopped. I would like to know for certain,
because I don't want to annoy anyone else in the datacenter. It's a
very loud alarm.

Are there any interfaces exposed to OpenBSD (and maybe accessible
through sysctl) that can tell me if an alarm is going off, or that
might help me diagnose it? The system has never crashed or done
anything weird otherwise. Maybe some way to see if there are ECC
errors that are being fixed on the fly?

Very perplexing. I would appreciate any tips you can suggest.

Please include me in replies as this account is not on misc@.

Thank you!

-Slow Servers
