Yesterday, after about 17 hours of uptime, the system crashed during
heavy NFS load.  This was running kernel vmlinuz-2.6.37-8-server, with
command line:

Command line: root=UUID=5626f5d7-0210-432c-9200-ec6a1d599df3 ro
crashkernel=384M-2G:64M,2G-:128M
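
For anyone unfamiliar with the crashkernel= value above: as I understand
the kernel's kdump documentation, it is a comma-separated list of
range:size pairs, and the first range that contains the machine's total
RAM decides how much memory is reserved for the crash kernel. Below is a
rough Python sketch of that selection rule. The function names and the
sample RAM sizes are made up for illustration (they are not valhalla's
actual memory size), and this is not the kernel's own parser.

```python
import re

# Multipliers for the size suffixes used on the kernel command line.
UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '384M' or '2G' to bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"bad size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def crash_reservation(spec, ram_bytes):
    """Return the bytes reserved for the crash kernel, or 0 if no range matches.

    Assumption: a range is start-inclusive and end-exclusive, and an empty
    end (as in '2G-') means "no upper bound".
    """
    for entry in spec.split(","):
        range_part, size_part = entry.split(":")
        start_text, end_text = range_part.split("-")
        start = parse_size(start_text)
        end = parse_size(end_text) if end_text else None
        if ram_bytes >= start and (end is None or ram_bytes < end):
            return parse_size(size_part)
    return 0


SPEC = "384M-2G:64M,2G-:128M"
MIB = 1 << 20

# Hypothetical RAM sizes in MiB, chosen only to hit each case.
for ram_mib in (256, 1024, 4096):
    reserved_mib = crash_reservation(SPEC, ram_mib * MIB) // MIB
    print(f"{ram_mib} MiB RAM -> {reserved_mib} MiB reserved")
```

Read this way, a machine with less than 384M of RAM reserves nothing,
one with 384M up to (but not including) 2G reserves 64M, and one with 2G
or more reserves 128M.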

The heavy NFS load was this:

Client machine hlidskjalfe had mounted directory /more from server
'valhalla'. On hlidskjalfe, a non-privileged user ran the fdupes command
across the entire /more filesystem, directing output to /more/fdupes.out.

The machines are physically separate -- client machine hlidskjalfe is a
core2duo box connected through a gigabit ethernet switch to the Intel
e1000 NIC in server valhalla.

The server is also set to remote syslog to client machine hlidskjalfe.  It
continued to log a few messages remotely AFTER it had stopped logging them
locally on server valhalla (I noticed these in /var/log/syslog).
Accordingly, I have attached logs from valhalla, and logs from hlidskjalfe
with hlidskjalfe's own log messages removed.  If you like, I can upload the
unexpurgated logs from hlidskjalfe.

The tarfile extracts to bug688068-logs for simplicity.  I can also upload the
entire /var/log directory from both machines if that is helpful.  This 19MB
tarfile extracts to 210MB, and contains data from more than just the most
recent crash.  Note that the daemon.log file has sensor data, such as CPU
temperature.  auth.log shows the various nagios checks which are repeatedly
run against the system.

I am now back on 2.6.32-26-server, running cmdline:
root=UUID=5626f5d7-0210-432c-9200-ec6a1d599df3 ro pci=nomsi  
crashkernel=384M-2G:64M,2G-:128M

For the time being, I will avoid heavy NFS loads such as the one I did
last night -- I can always run jobs like this from an ssh session on the
local machine.




** Attachment added: "logfiles.tgz"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/688068/+attachment/1762592/+files/logfiles.tgz

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/688068

Title:
  lucid system randomly locks up, does not recover
