Not an SMP problem, but because of the dual-CPU board maybe somebody
has already collected some experience ...

Hi!

We have some serious problems setting up our Linux server:
timeout problems (system hangs) with the aic7895.

When we run heavy file transfers
(tar - | tar -, or similar cp actions) we get this:

: scsi : aborting command due to timeout : pid 199204, scsi0, channel 0, id 1,
lun 0 Read (6) 17 03 17 02 00
: (scsi0:0:1:0) Parity error during Command phase.
: SCSI host 0 abort (pid 199204) timed out - resetting
: SCSI bus is being reset for host 0 channel 0.
: SCSI host 0 channel 0 reset (pid 199204) timed out - trying harder
: SCSI bus is being reset for host 0 channel 0.
: (scsi0:0:1:0) Synchronous at 40.0 Mbyte/sec, offset 8.
: (scsi0:0:0:0) Synchronous at 40.0 Mbyte/sec, offset 8.
: SCSI host 0 abort (pid 199204) timed out - resetting
: SCSI bus is being reset for host 0 channel 0.
: scsi : aborting command due to timeout : pid 200515, scsi0, channel 0, id 0,
lun 0 Read (6) 01 80 49 02 00
: scsi : aborting command due to timeout : pid 200562, scsi0, channel 0, id 0,
lun 0 Read (6) 03 ee 73 02 00
: SCSI host 0 channel 0 reset (pid 199204) timed out - trying harder
: SCSI bus is being reset for host 0 channel 0.
: SCSI host 0 reset (pid 199204) timed out again -
: probably an unrecoverable SCSI bus or device hang.

We also tried different transfer rates (set via BIOS) between 20 and 40
Mbyte/sec: same result as above. Without heavy traffic on the SCSI bus the
system seems to be stable.


Hardware:
- GigaByte BXDS - Dual Slot1 and aic7895 dual channel UW SCSI chip
- Currently running with only one CPU: Celeron 300A
- 64 MB SDRAM
- 2(3) Seagate ST34520W drives
(- Toshiba  Model: CD-ROM XM-3501TA Rev: 3054)
(- HP DAT   Model: C1533A           Rev: 9503)
- PCI NE2000 clone 'RealTek RTL-8029'
- CirrusLogic 5446 VGA

Linux:
RedHat 5.2 distribution with patches
Kernel 2.0.36 or 2.2.0-pre4 to -pre9
(slightly different behaviour with fdisk, but the problem persists)

For kernel 2.0.36 we applied patches for aic7xxx.c: 5.1.2 -> 5.1.6
(ftp://ftp.redhat.com/pub/aic/)

Tested:
Cabling, termination, swapping drive IDs and positions on both channels A
and B, and no BIOS control (booting with a 1542C instead): the problem
still persists.

With the 1542 controller, at most 8 Mbyte/sec is possible without any
timeouts; otherwise (at 10 Mbyte/sec) it won't recognize any devices (???)
in this GigaByte BXDS board.

Testing the aic7895 with narrow devices at up to 10 Mbyte/sec (CD and tape)
seems to be o.k., but we have not tested it with narrow HDDs yet...

The Seagate drives work well at 40.0 Mbyte/sec with the same peripherals
in a GigaByte BXS (single Slot 1 with aic7800 UW); the only difference is
the CPU: PII266.

Somewhat strange: each of our three identical Seagate drives behaves
differently with respect to failure rate (one fails very often, the
others rarely), independent of its position on the SCSI bus or whether
it is the boot or a non-boot device.
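
To put numbers on the per-drive failure rates, the timeout messages in the
kernel log could be tallied per target. The following is only a rough sketch:
the function name count_timeouts and the regular expression are our own
assumptions, derived purely from the message format quoted above, and the
sample text is just the three "aborting command" messages from that excerpt.

```python
import re
from collections import Counter

# Assumed pattern, based only on the "aborting command due to timeout"
# lines quoted in this post; other kernel versions may word it differently.
TIMEOUT_RE = re.compile(
    r"aborting command due to timeout : pid (\d+), "
    r"scsi(\d+), channel (\d+), id (\d+)"
)


def count_timeouts(log_text):
    """Count timeout aborts per (host, channel, id) in a kernel log text."""
    counts = Counter()
    for match in TIMEOUT_RE.finditer(log_text):
        _pid, host, channel, target = match.groups()
        counts[(int(host), int(channel), int(target))] += 1
    return counts


# The three timeout messages from the log excerpt above.
SAMPLE = """\
: scsi : aborting command due to timeout : pid 199204, scsi0, channel 0, id 1,
lun 0 Read (6) 17 03 17 02 00
: scsi : aborting command due to timeout : pid 200515, scsi0, channel 0, id 0,
lun 0 Read (6) 01 80 49 02 00
: scsi : aborting command due to timeout : pid 200562, scsi0, channel 0, id 0,
lun 0 Read (6) 03 ee 73 02 00
"""

if __name__ == "__main__":
    for (host, channel, target), n in sorted(count_timeouts(SAMPLE).items()):
        print(f"scsi{host} channel {channel} id {target}: {n} timeout(s)")
```

Fed with a complete log instead of SAMPLE, this would show whether one
target ID really accounts for most of the aborts, or whether the failures
follow a particular physical drive when the IDs are swapped.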


Any ideas / experience / help?

Thanks in advance!

===============================================================================
 Janos Palinkas : Institute f. Physical Biology
                  Heinrich-Heine-University
                  40225 Duesseldorf / Germany

          email : [EMAIL PROTECTED]

-- 
===============================================================================
 Janos Palinkas : Institut f. Physikalische Biologie
                  Heinrich-Heine-Universit"at D"usseldorf
                  Universit"atsstr. 1 / Geb"aude 26.12.U1
                  40225 D"usseldorf

          phone : +49/ (0)211 / 81-14927                                  
            Fax : +49/ (0)211 / 81-15167
          email : [EMAIL PROTECTED]                          
===============================================================================
