Kris Kennaway wrote:
On Sat, Nov 11, 2006 at 11:15:54AM -0800, Chris wrote:
If your system is hanging then you need to configure additional
debugging to figure out the cause.  Read the chapter on kernel
debugging in the Developers' Handbook; without this information no
developer can help you.


P.S. In my testing SMP amd64 is quite stable even under exceptionally
heavy loads, so it's either something related to your hardware or your
particular workload.
Hadn't considered that a user level debugging solution. I'll give it a try.
That is indeed almost always failing hardware.

I think I'm having the same problems.
I'm running 6.1 (latest patch set)/amd64 on a dual-core Opteron Acer server with SCSI disks, and it hangs completely and without warning. Checking the hardware was the first thing I did, and it really seems OK (unless it's the second core on the processor). Among other things, I checked the HDs with the vendor's tools, the RAM with MemTest86+, and the CPU with several stress tools. If anyone can suggest other diagnostics, I'd be happy to run them.

I compiled the kernel with debug info, but that's useless here, since it won't dump anything; it just hangs. I don't think even DDB would help, since even the keyboard stops responding at that point. If I'm missing something, I'd be glad of any pointers.

The box has an on-board em NIC, but since it shows a lot of problems, I removed that driver from the kernel (it's not possible to turn it off in the BIOS, though) and put in a different add-on card. I also had some shared IRQs, but managed to resolve that (even though I think it should not matter).
Next, I'll try to disable SMP as soon as I can and see if it helps.

Of course upgrading to 6.2 should be attempted, but since this is a production server and 6.2 is still at RC1...

 bye & Thanks