So, it just happened again: my server crashed. This time I am sure it
has nothing to do with the USB drive, since that is no longer attached.
It seems to be unfortunate timing between a kernel(?) problem and
heavy disk use.
Suddenly I get these messages in the log:
Oct 23 00:56:13 matrix kernel: [14573759.262982] ata1: link is slow to respond, please be patient (ready=0)
Oct 23 00:56:13 matrix kernel: [14573764.242683] ata1: device not ready (errno=-16), forcing hardreset
Oct 23 00:56:13 matrix kernel: [14573764.242721] ata1: soft resetting link
Oct 23 00:56:13 matrix kernel: [14573765.081129] ata1.00: configured for UDMA/133
Oct 23 00:56:13 matrix kernel: [14573765.081188] ata1: EH complete
Oct 23 00:56:13 matrix kernel: [14573765.082422] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
Oct 23 00:56:13 matrix kernel: [14573765.126583] sd 0:0:0:0: [sda] Write Protect is off
Oct 23 00:56:53 matrix kernel: [14573765.127506] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
These messages repeat until about 01:19, then the log goes quiet until a
final entry at 7:54, when the server finally crashes (it just stops
responding to network requests, the keyboard, and so on).
I just checked kern.log, which has a lot of entries like:
Oct 23 00:54:12 matrix kernel: [14573754.220270] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Oct 23 00:56:13 matrix kernel: [14573754.220348] ata1.00: cmd ca/00:50:14:9f:8d/00:00:00:00:00/e1 tag 0 dma 40960 out
Oct 23 00:56:13 matrix kernel: [14573754.220352] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 23 00:56:13 matrix kernel: [14573754.220465] ata1.00: status: { DRDY }
Oct 23 00:56:13 matrix kernel: [14573759.262982] ata1: link is slow to respond, please be patient (ready=0)
Oct 23 00:56:13 matrix kernel: [14573764.242683] ata1: device not ready (errno=-16), forcing hardreset
Oct 23 00:56:13 matrix kernel: [14573764.242721] ata1: soft resetting link
Oct 23 00:56:13 matrix kernel: [14573765.081129] ata1.00: configured for UDMA/133
Oct 23 00:56:13 matrix kernel: [14573765.081188] ata1: EH complete
Oct 23 00:56:13 matrix kernel: [14573765.082422] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
Oct 23 00:56:13 matrix kernel: [14573765.126583] sd 0:0:0:0: [sda] Write Protect is off
Oct 23 00:56:13 matrix kernel: [14573765.126598] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 23 00:56:53 matrix kernel: [14573765.127506] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
This adds some more info about an exception?
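To get a feel for how often each of these events shows up, something like the
following rough Python sketch could tally them per ATA port. It assumes the
log file holds one entry per line (the wrapping in this mail comes from the
mail client), and the function name and event labels are made up for
illustration; only the matched substrings come from the log above.

```python
import re
from collections import Counter

# Substrings copied from the kernel messages quoted above, mapped to
# short labels. The labels themselves are hypothetical, chosen for this sketch.
EVENTS = {
    "exception Emask": "exception",
    "link is slow to respond": "slow_link",
    "forcing hardreset": "hardreset",
    "soft resetting link": "softreset",
    "EH complete": "eh_complete",
}

# Matches the ATA port prefix, with or without a device suffix,
# e.g. "ata1:" or "ata1.00:". Only the port part ("ata1") is captured.
PORT_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?:")


def summarize_ata_events(lines):
    """Count known ATA error-handling events per port.

    `lines` is any iterable of log lines, one entry per line.
    Returns a Counter keyed by (port, label).
    """
    counts = Counter()
    for line in lines:
        match = PORT_RE.search(line)
        if not match:
            continue  # not an ataN message, e.g. the "sd 0:0:0:0" lines
        for needle, label in EVENTS.items():
            if needle in line:
                counts[(match.group(1), label)] += 1
    return counts
```

It could be run over the real file with something like
summarize_ata_events(open("/var/log/kern.log")), assuming that is where
kern.log lives on the machine in question.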
Searching for these entries turns up a lot of people reporting the same
problem:
And probably a solution: http://ubuntuforums.org/showthread.php?t=1145513
(The poster in that thread wonders why there haven't been many reports of this
issue...)
Also:
https://bugzilla.redhat.com/show_bug.cgi?id=462425
https://bugzilla.redhat.com/show_bug.cgi?id=404851
http://lkml.org/lkml/2008/11/9/22
http://forums.fedoraforum.org/showthread.php?t=219746
I'm running kernel 2.6.27-11-server. Someone suggests running kernel-rt
instead:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/279693 (comment
#23)
I haven't tried that. I will check whether a kernel 2.6.27-14 is available,
or eventually try the -rt suggestion.
It seems it is possible to crash the system by doing a "ls -lR /". Not
what I expect from a Linux system...
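For anyone who wants to generate a similar metadata-heavy load on a smaller
subtree instead of all of "/", here is a minimal Python sketch that walks a
directory and stats every entry, roughly what "ls -lR" does. The function name
is made up, and it is only an approximation of the load, not a confirmed
reproducer for this bug.

```python
import os


def stat_tree(root):
    """Recursively walk `root` and lstat every entry, roughly like `ls -lR`.

    Returns (entries_statted, errors). Unreadable directories and entries
    that vanish mid-walk are counted as errors instead of aborting the walk.
    """
    entries = 0
    errors = 0

    def on_error(err):
        # Called by os.walk when a directory cannot be listed.
        nonlocal errors
        errors += 1

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_error):
        for name in dirnames + filenames:
            try:
                # lstat does not follow symlinks, so links are counted
                # as entries but never traversed.
                os.lstat(os.path.join(dirpath, name))
                entries += 1
            except OSError:
                errors += 1
    return entries, errors
```

Calling stat_tree("/var"), for example, keeps the test to one filesystem
subtree; watching kern.log while it runs would show whether the ata1
messages start appearing.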
Kind regards
Torben
** Bug watch added: Red Hat Bugzilla #462425
https://bugzilla.redhat.com/show_bug.cgi?id=462425
** Bug watch added: Red Hat Bugzilla #404851
https://bugzilla.redhat.com/show_bug.cgi?id=404851
--
Consistent repeating [ata1: link is slow to respond, please be patient ]
https://bugs.launchpad.net/bugs/297058