Today I had another SCSI failure. This time I was able to capture a bit more
of the dmesg output, but I can't figure out what is going wrong.

In /var/log/messages, the unusual entries start with this, repeated a
couple of times:

Mar 28 12:00:45 mail kernel: (scsi0:0:2:0) Parity error during Message-In phase
Mar 28 12:00:45 mail kernel: (scsi0:0:2:0) Parity error during Data-In phase.

It goes on with a lot of messages similar to this (the pid, the id, and
everything to the right of 'lun 0' change):

Mar 28 12:00:45 mail kernel: scsi : aborting command due to timeout : pid 14301024, 
scsi0, channel 0, id 0, lun 0 Write (10) 00 00 6b 0f 14 00 00 08 00
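
For anyone reading along, the bytes after 'Write (10)' in the line above can be decoded by hand. This is only a sketch: it assumes the kernel prints the nine CDB bytes that follow the opcode, and that they use the usual WRITE(10) layout (flags, 4-byte big-endian LBA, group, 2-byte transfer length, control). The function name is made up for illustration.

```python
def decode_write10_tail(hex_bytes):
    """Decode the 9 bytes printed after the WRITE(10) opcode (assumed layout)."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    if len(raw) != 9:
        raise ValueError("expected 9 bytes after the opcode")
    return {
        "flags": raw[0],                            # DPO/FUA bits etc.
        "lba": int.from_bytes(raw[1:5], "big"),     # starting logical block
        "group": raw[5],
        "blocks": int.from_bytes(raw[6:8], "big"),  # transfer length in blocks
        "control": raw[8],
    }

# Bytes copied from the log line above.
info = decode_write10_tail("00 00 6b 0f 14 00 00 08 00")
print(info["lba"], info["blocks"])  # prints: 7016212 8
```

If the disk uses 512-byte blocks (an assumption, not something the log states), that is a 4 KB write to id 0 that timed out.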

Then this (a lot of lines):

Mar 28 12:00:45 mail kernel: SCSI host 0 abort (pid 14301062) timed out - resetting
Mar 28 12:00:45 mail kernel: SCSI bus is being reset for host 0 channel 0.

Somewhere in between this shows up:

Mar 28 12:00:45 mail kernel: (scsi0:0:2:0) Performing Domain validation.

Then this:

Mar 28 12:00:45 mail kernel: SCSI host 0 reset (pid 14301061) timed out again - 
Mar 28 12:00:45 mail kernel: probably an unrecoverable SCSI bus or device hang.

And finally this:

Mar 28 12:00:45 mail kernel: (scsi0:0:2:0) Successfully completed Domain validation.
Mar 28 12:00:45 mail kernel: (scsi0:0:2:0) Using asynchronous transfers.
Mar 28 12:00:45 mail kernel: (scsi0:0:1:0) Synchronous at 80.0 Mbyte/sec, offset 31.
Mar 28 12:00:45 mail kernel: (scsi0:0:0:0) Using asynchronous transfers.   

followed by some more lines like the previous messages. These are the last
entries I got in /var/log/messages before a hard reboot. The machine
was sort of alive (i.e. ping, httpd, php3...), but I was unable to log in
(even locally). On the one console I had open, 'ls', 'free' and 'dmesg'
still worked, but anything that touched the hard disk froze. Even
'shutdown' and 'reboot' failed to execute.

The weird thing is that all of these messages occurred within a single
second (12:00:45).
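
One way to check this against the full log is to count kernel lines per timestamp. A rough sketch, assuming the syslog line layout shown above ('Mon DD HH:MM:SS host kernel: ...'); the function name is hypothetical and the sample lines are the two parity errors quoted earlier.

```python
from collections import Counter

def count_kernel_lines_by_second(lines):
    """Count kernel messages per 'Mon DD HH:MM:SS' syslog timestamp."""
    counts = Counter()
    for line in lines:
        # Fields: month, day, time, host, then the rest of the message.
        parts = line.split(None, 4)
        if len(parts) == 5 and parts[4].startswith("kernel:"):
            counts[" ".join(parts[:3])] += 1
    return counts

sample = [
    "Mar 28 12:00:45 mail kernel: (scsi0:0:2:0) Parity error during Message-In phase",
    "Mar 28 12:00:45 mail kernel: (scsi0:0:2:0) Parity error during Data-In phase.",
]
counts = count_kernel_lines_by_second(sample)
for stamp, n in sorted(counts.items()):
    print(stamp, n)
```

To run it on the real file, pass `open("/var/log/messages")` instead of `sample`.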

Could someone with more SCSI experience help diagnose what might be
causing this?

  Thanks, D.

PS: More info about the machine:

CPU:    Dual P-III 500 MHz
Board:  Intel L440GX
Disks:  4x IBM DNES-309170Y (3 RAID5 + 1 spare)
LAN:    Integrated Intel EtherExpress Pro 10/100

cat /proc/interrupts
           CPU0       CPU1
  0:     253546     252241    IO-APIC-edge  timer
  1:         99        103    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  4:        473        472    IO-APIC-edge  serial
  8:          0          0    IO-APIC-edge  rtc
 13:          1          0          XT-PIC  fpu
 19:     358370     359232   IO-APIC-level  aic7xxx, aic7xxx
 21:     225846     225239   IO-APIC-level  Intel EtherExpress Pro 10/100 Ethernet

cat /proc/ioports
0000-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0070-007f : rtc
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
03c0-03df : vga+
03f8-03ff : serial(auto)
1080-109f : Intel Speedo3 Ethernet
1400-14be : aic7xxx
1800-18be : aic7xxx    

uname -a
Linux my.host.name 2.2.13 #1 SMP Tue Mar 14 11:55:56 CET 2000 i686 unknown
