On Thu, June 16, 2005 15:56, Charles Leggett said:
>
> My dual opteron system hangs at random intervals. Sometimes it's stable
> for a week, sometimes it hangs after just a few hours. The symptoms
> are always the same - NONE. Carefully scanning the system logs shows
> absolutely nothing occurred to cause a hang. No kernel oopses, no error
> messages. It's been this way ever since I installed debian 6 months ago -
> before that I was running CentOS, and it never died then, so I'm pretty
> sure it's not a hardware problem.

I don't really have a solution; this is more of a "me too" reply. I have also been seeing seemingly random lockups on a dual Opteron system. The screen is frozen and there is no keyboard response. Once it has locked up, however, I have been able to log in via ssh and look around. Most commands take about 1-2 minutes to complete: just typing "vim somefile" can take as long as 2 minutes, and logging in via ssh sometimes takes 30 seconds or more.

On one of the lockups I started killing off processes and finally determined that X was not stopping when given the HUP and SEGV signals. Running "/etc/init.d/kdm stop" (I use KDE) would end in an error about the xserver not responding. Luckily (for me) X would stop with a KILL signal (-9), and I was able to restart X with "/etc/init.d/kdm restart", which returned the local console and the frozen screen to normal. The machine operated normally from that point on.

There doesn't seem to be any visible connection between the lockups other than X and friends. The time between them can be anywhere from a few hours (rare) to a week. So far most of the lockups happened while I wasn't even in the office: one was in the middle of the night and another was during the day when I was away.

I have been seeing weird behavior from xscreensaver, which could be the root of the problem: sometimes it doesn't want to start; some screensavers (especially the opengl ones) bleed onto the screen in preview mode, forcing me to restart X to regain control; and sometimes (rare) it doesn't want to stop on keyboard or mouse activity. I have not yet tried running without xscreensaver, and if the lockups continue I may try stopping it. I have configured xscreensaver to use the "slide show" screensaver, and I've only seen one lockup since (knock on wood). The machine has never locked up while in use, only when idle. Otherwise this system has been rock solid.

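The signal escalation described above (gentler signals first, KILL as the last resort) could be scripted roughly as follows. This is only a sketch under assumptions: the names `pid_exists` and `escalate` are made up for illustration, the default order HUP, TERM, KILL and the 5-second wait are arbitrary choices (the attempt described above used HUP and SEGV before KILL), and finding the PID of the X server is left out.

```python
import os
import signal
import time


def pid_exists(pid):
    """Return True if a process with this PID exists (zombies count as existing)."""
    try:
        os.kill(pid, 0)  # signal 0 sends nothing; it only checks existence/permission
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def escalate(pid, signals=(signal.SIGHUP, signal.SIGTERM, signal.SIGKILL),
             wait=5.0, is_alive=None):
    """Send each signal in turn until the process is gone.

    Returns the last signal sent before the process disappeared (None if it
    was already gone). Raises TimeoutError if it survives every signal.
    The signal order and wait time are assumptions, not a recommendation.
    """
    if is_alive is None:
        is_alive = lambda: pid_exists(pid)
    last_sent = None
    for sig in signals:
        if not is_alive():
            return last_sent
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            return last_sent  # vanished between the check and the kill
        last_sent = sig
        # Give the process up to `wait` seconds to exit before escalating.
        deadline = time.monotonic() + wait
        while time.monotonic() < deadline:
            if not is_alive():
                return last_sent
            time.sleep(0.1)
    if not is_alive():
        return last_sent
    raise TimeoutError(f"process {pid} survived all signals")
```

Signalling the X server would need root, and a call would look like `escalate(x_pid)`, where `x_pid` is a placeholder for the server's PID. The return value shows which signal was finally needed, which may be worth noting down after each lockup.
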
The machine runs everything extremely well, including games like ut2004 (amd64) and doom3 (i386 chroot). When a lockup happens there is nothing in any of the logs. I am running sid and keep it up-to-date.

This machine has been heavily tested with a huge range of tasks (games, compiling, benchmarks, etc.) and none have shown signs of any problems. I like to compile large packages, both for comparison with other machines and to look for any signs of instability. After compiling xserver-xfree86 (33 minutes) there were no errors or unusual behavior (although I did not actually try using the compiled packages). Other packages I compiled have built and run fine (although none are as large as X), so I'm leaning towards a problem with X or xscreensaver.

Hardware list (to look for any possible common connections):

2 x Opteron 252
Tyan S2875
2GB DDR400 (2 x 1GB)
SATA (one seagate drive)
IDE (one sony DVDRW)
EVGA Nvidia Geforce 6800 Ultra 256MB (AGP 8x, FW, and SBA enabled)
no PCI devices installed

Right now the machine has been up 7 days without a lockup and I'll continue to track it, but narrowing the problem down is difficult when the lockups only happen about a week or two apart...

Regards,
Ron

