Hi!

Just wanted to confirm your problem. I've tried most of what you have tried
and my systems still hang (once in a while).


My setup (~ 15 systems):

* Asus P2B-DS with various PIII's (450-600 MHz).
* IBM 10k rpm disks (3 in each system)
* 2.2.14 + mingo RAID-patches
* Short, internal SCSI cables (the ones that come with the Asus motherboard).
* 2x3Com 90x ethernet adapters

Something is seriously *wrong* in either the Adaptec driver or in
the RAID code itself (I can't tell which).
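
One thing that might help tell the two apart, since nothing ever gets
logged: a tiny probe that does a small synced write once a second and
timestamps any write that takes too long. Run one copy against a
filesystem on the md device and another against a plain (non-RAID)
partition on the same controller; if both stall at the same moments, that
would point more toward the driver or hardware than the RAID code. This is
only a rough sketch, assuming a recent Python 3 is on the box. The function
names, the interval and the threshold are made up for illustration, not
taken from either of our setups.

```python
import os
import time


def timed_sync_write(path, payload=b"x" * 4096):
    """Write one small block, fsync it, and return the elapsed seconds."""
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # ask the kernel to push the data to the device
    finally:
        os.close(fd)
    return time.monotonic() - start


def watch(path, interval=1.0, threshold=2.0, rounds=None, log=print):
    """Probe `path` repeatedly and report every probe slower than `threshold`.

    `rounds=None` means run until interrupted. Returns a list of
    (timestamp, latency_in_seconds) pairs for the slow probes.
    """
    stalls = []
    done = 0
    while rounds is None or done < rounds:
        latency = timed_sync_write(path)
        if latency >= threshold:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            stalls.append((stamp, latency))
            log("%s stall on %s: %.1f s" % (stamp, path, latency))
        done += 1
        time.sleep(interval)
    return stalls
```

Leaving something like watch("/var/tmp/stall-probe.dat") running in the
background (the path is hypothetical) would at least give timestamps to
line up against the load on the server.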

I'm considering totally abandoning sw RAID. Getting tired of this.


Anyway, cheers!


/m


On Mon, 1 May 2000, Jeff Hill wrote:

> Based on advice from the list, I ordered a custom-built cable from CS
> Electronics, asking for the best quality they had: 3' LVD teflon with
> active negation terminator. $125 US later, the system still hangs. Bummer.
> 
> I have now spent five weeks off-and-on trying to find out why the system
> with a new Adaptec 2940U2W and 2xSeagate LVD drives hangs momentarily
> (using 2.2.14 kernel with raid-2.2.14-B1 patch). The closest thing I
> have to a cause is that it seems to happen only when there is a load
> (medium to heavy) on the server and it happens less when I reduce the
> speed in the SCSI controller from 80MB/s to 40MB/s.
> 
> I'd appreciate any suggestions other than using my disk drives for skeet
> practice and other than what I have already tried:
> 
>  * set up remote logging to catch any error message (none logged);
> 
>  * added verbose debugging (nothing);
> 
>  * upgraded the AIC7xxx driver from 5.1.21 to 5.1.28 (performance
>    improved but it still hangs);
> 
>  * removed the IDE drives and kernel support for IDE, based on 
>    someone's hunch;
> 
>  * lowered the front-side bus speed from 100.3MHz to 88.3MHz
>    (underclocking the CPU), based on another hunch;
> 
>  * checked IRQs for conflicts; 
> 
>  * compiled the kernel with and without tagged command queuing, 
>    with more and fewer max commands;
> 
>  * tried kernels 2.2.10 to 2.2.14 with appropriate RAID patches;
> 
>  * repeatedly searched the archives of mailing lists for aic7xxx,  
>    linux-scsi, Adaptec, and ASUS -- no luck.   
> 
>  * removed my RAID-1 and added it back in (unable to fully 
>    test this);
> 
> 
>  * checked cabinet temperature, which is the same as the room (about 72F),
>    but have not verified the exact temperature of the drives;
> 
>  * reduced SCSI controller speed (as previously mentioned) on Seagate
>    drives to 40MB/s from 80MB/s (appears to reduce but not eliminate
>    the problem).
> 
> 
> No matter what, the system hangs for 30 seconds to several minutes,
> several times per day. Programs continue to run, but any requests are
> halted until the disks begin responding again. 
> 
> 
> Thanks,
> 
> Jeff Hill
> 
> 
> Current Configuration:
> ----------------------
> 2.2.14 kernel with raid-2.2.14-B1 patch
> Adaptec 2940U2W (2.20.0 bios)
> Short, internal Ultra2-LVD cable.
> 2xSeagate ST39103LW (U2W drives)
> Single PII 400MHz
> ASUS P3B-F motherboard (Intel 440BX AGPset); CPU Bus/PCI Freq -
> 100.3/33.43
> 
> Tagged Command Queueing enabled; max 24 commands per device; reset delay
> at default 5 seconds
> 
> Standard Kernel Boot:
> ----------------------------
> Apr 17 08:00:51 apache kernel: Linux version 2.2.14 (root@apache) (gcc
> version 2.95.2 20000313 (Debian GNU/Linux)) #1 Wed Apr 12 21:43:09 EDT
> 2000
> Apr 17 08:00:51 apache kernel: Detected 333230761 Hz processor.
> Apr 17 08:00:51 apache kernel: Console: colour VGA+ 80x25
> Apr 17 08:00:51 apache kernel: Calibrating delay loop... 332.60 BogoMIPS
> Apr 17 08:00:51 apache kernel: Memory: 517412k/524288k available (1020k
> kernel code, 412k reserved, 5396k data, 48k init)
> Apr 17 08:00:51 apache kernel: Dentry hash table entries: 65536 (order
> 7, 512k)
> Apr 17 08:00:51 apache kernel: Buffer cache hash table entries: 524288
> (order 9, 2048k)
> Apr 17 08:00:51 apache kernel: Page cache hash table entries: 131072
> (order 7, 512k)
> Apr 17 08:00:51 apache kernel: CPU: Intel Pentium II (Deschutes)
> stepping 02
> Apr 17 08:00:51 apache kernel: Checking 386/387 coupling... OK, FPU
> using exception 16 error reporting.
> Apr 17 08:00:51 apache kernel: Checking 'hlt' instruction... OK.
> Apr 17 08:00:51 apache kernel: POSIX conformance testing by UNIFIX
> Apr 17 08:00:51 apache kernel: PCI: PCI BIOS revision 2.10 entry at
> 0xf08b0
> Apr 17 08:00:51 apache kernel: PCI: Using configuration type 1
> Apr 17 08:00:51 apache kernel: PCI: Probing PCI hardware
> Apr 17 08:00:51 apache kernel: Linux NET4.0 for Linux 2.2
> Apr 17 08:00:51 apache kernel: Based upon Swansea University Computer
> Society NET3.039
> Apr 17 08:00:51 apache kernel: NET4: Unix domain sockets 1.0 for Linux
> NET4.0.
> Apr 17 08:00:51 apache kernel: NET4: Linux TCP/IP 1.0 for NET4.0
> Apr 17 08:00:51 apache kernel: IP Protocols: ICMP, UDP, TCP
> Apr 17 08:00:51 apache kernel: TCP: Hash tables configured (ehash 524288
> bhash 65536)
> Apr 17 08:00:51 apache kernel: Starting kswapd v 1.5
> Apr 17 08:00:51 apache kernel: Serial driver version 4.27 with no serial
> options enabled
> Apr 17 08:00:51 apache kernel: ttyS00 at 0x03f8 (irq = 4) is a 16550A
> Apr 17 08:00:51 apache kernel: ttyS01 at 0x02f8 (irq = 3) is a 16550A
> Apr 17 08:00:51 apache kernel: pty: 256 Unix98 ptys configured
> Apr 17 08:00:51 apache kernel: RAM disk driver initialized:  16 RAM
> disks of 4096K size
> Apr 17 08:00:51 apache kernel: PIIX4: IDE controller on PCI bus 00 dev
> 21
> Apr 17 08:00:51 apache kernel: PIIX4: device not capable of full native
> PCI mode
> Apr 17 08:00:51 apache kernel: PIIX4: device disabled (BIOS)
> Apr 17 08:00:51 apache kernel: Floppy drive(s): fd0 is 1.44M
> Apr 17 08:00:51 apache kernel: FDC 0 is a post-1991 82077
> Apr 17 08:00:51 apache kernel: md driver 0.90.0 MAX_MD_DEVS=256,
> MAX_REAL=12 
> Apr 17 08:00:51 apache kernel: linear personality registered
> Apr 17 08:00:51 apache kernel: raid1 personality registered
> Apr 17 08:00:51 apache kernel: (scsi0) <Adaptec AHA-294X Ultra2 SCSI
> host adapter> found at PCI 0/12/0
> Apr 17 08:00:51 apache kernel: (scsi0) Wide Channel, SCSI ID=7, 32/255
> SCBs
> Apr 17 08:00:51 apache kernel: (scsi0) Downloading sequencer code... 396
> instructions downloaded
> Apr 17 08:00:51 apache kernel: scsi0 : Adaptec AHA274x/284x/294x
> (EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4
> Apr 17 08:00:51 apache kernel:        <Adaptec AHA-294X Ultra2 SCSI host
> adapter>
> Apr 17 08:00:51 apache kernel: scsi : 1 host.
> Apr 17 08:00:51 apache kernel: (scsi0:0:0:0) Synchronous at 80.0
> Mbyte/sec, offset 15.
> Apr 17 08:00:51 apache kernel:   Vendor: SEAGATE   Model:
> ST39103LW         Rev: 0002
> Apr 17 08:00:51 apache kernel:   Type: 
> Direct-Access                      ANSI SCSI revision: 02
> Apr 17 08:00:51 apache kernel: Detected scsi disk sda at scsi0,
> channel0, id 0, lun 0
> Apr 17 08:00:51 apache kernel: (scsi0:0:1:0) Synchronous at 80.0
> Mbyte/sec, offset 15.
> Apr 17 08:00:51 apache kernel:   Vendor: SEAGATE   Model:
> ST39103LW         Rev: 0002
> Apr 17 08:00:51 apache kernel:   Type: 
> Direct-Access                      ANSI SCSI revision: 02
> Apr 17 08:00:51 apache kernel: Detected scsi disk sdb at scsi0, channel
> 0, id 1, lun 0
> Apr 17 08:00:51 apache kernel: scsi : detected 2 SCSI disks total.
> Apr 17 08:00:51 apache kernel: SCSI device sda: hdwr sector= 512 bytes.
> Sectors= 17783240 [8683 MB] [8.7 GB]
> Apr 17 08:00:51 apache kernel: SCSI device sdb: hdwr sector= 512 bytes.
> Sectors= 17783240 [8683 MB] [8.7 GB]
> Apr 17 08:00:51 apache kernel: ne.c:v1.10 9/23/94 Donald Becker
> ([EMAIL PROTECTED])
> Apr 17 08:00:51 apache kernel: NE*000 ethercard probe at 0x300: 00 40 05
> 2c 0d 1d
> Apr 17 08:00:51 apache kernel: eth0: NE2000 found at 0x300, using IRQ 3.
> Apr 17 08:00:51 apache kernel: Partition check:
> Apr 17 08:00:51 apache kernel:  sda: sda1 sda2 sda3
> Apr 17 08:00:51 apache kernel:  sdb: sdb1 sdb2 sdb3
> Apr 17 08:00:51 apache kernel: md.c: sizeof(mdp_super_t) = 4096
> Apr 17 08:00:51 apache kernel: autodetecting RAID arrays
> Apr 17 08:00:51 apache kernel: (read) sda2's sb offset: 8739264 [events:
> 0000004b]
> Apr 17 08:00:51 apache kernel: (read) sda3's sb offset: 128448 [events:
> 00000002]
> Apr 17 08:00:51 apache kernel: (read) sdb2's sb offset: 8739264 [events:
> 0000004b]
> Apr 17 08:00:51 apache kernel: (read) sdb3's sb offset: 128448 [events:
> 00000002]
> Apr 17 08:00:51 apache kernel: autorun ...
> Apr 17 08:00:51 apache kernel: considering sdb3 ...
> Apr 17 08:00:51 apache kernel:   adding sdb3 ...
> Apr 17 08:00:51 apache kernel:   adding sda3 ...
> Apr 17 08:00:51 apache kernel: created md2
> Apr 17 08:00:51 apache kernel: bind<sda3,1>
> Apr 17 08:00:51 apache kernel: bind<sdb3,2>
> Apr 17 08:00:51 apache kernel: running: <sdb3><sda3>
> Apr 17 08:00:51 apache kernel: now!
> Apr 17 08:00:51 apache kernel: sdb3's event counter: 00000002
> Apr 17 08:00:51 apache kernel: sda3's event counter: 00000002
> Apr 17 08:00:51 apache kernel: md2: max total readahead window set to
> 128k
> Apr 17 08:00:51 apache kernel: md2: 1 data-disks, max readahead per
> data-disk: 128k
> Apr 17 08:00:51 apache kernel: raid1: device sdb3 operational as mirror
> 1
> Apr 17 08:00:51 apache kernel: raid1: device sda3 operational as mirror
> 0
> Apr 17 08:00:51 apache kernel: (checking disk 0) 
> Apr 17 08:00:51 apache kernel: (really checking disk 0)
> Apr 17 08:00:51 apache kernel: (checking disk 1)
> Apr 17 08:00:51 apache kernel: (really checking disk 1)
> Apr 17 08:00:51 apache kernel: (checking disk 2)
> Apr 17 08:00:51 apache kernel: (checking disk 3)
> Apr 17 08:00:51 apache kernel: (checking disk 4)
> Apr 17 08:00:51 apache kernel: (checking disk 5)
> Apr 17 08:00:51 apache kernel: (checking disk 6)
> Apr 17 08:00:51 apache kernel: (checking disk 6)
> Apr 17 08:00:51 apache kernel: (checking disk 10)
> Apr 17 08:00:51 apache kernel: (checking disk 11)
> Apr 17 08:00:51 apache kernel: md: updating md2 RAID superblock on
> device
> Apr 17 08:00:51 apache kernel: sda3 [events: 00000003](write) sda3's sb
> offset: 128448
> Apr 17 08:00:51 apache kernel: created md0
> Apr 17 08:00:51 apache kernel: running: <sdb2><sda2>
> Apr 17 08:00:51 apache kernel: sda2's event counter: 0000004b
> Apr 17 08:00:51 apache kernel: (checking disk 0)
> Apr 17 08:00:51 apache kernel: (really checking disk 1)
> Apr 17 08:00:51 apache kernel: (checking disk 4)
> Apr 17 08:00:51 apache kernel: (checking disk 9)
> Apr 17 08:00:51 apache kernel: (checking disk 10)
> Apr 17 08:00:51 apache kernel: raid1: raid set md0 active with 2 out of
> 2 mirrors
> Apr 17 08:00:51 apache kernel: sda2 [events: 0000004c](write) sda2's sb
> offset: 8739264
> Apr 17 08:00:51 apache kernel: Adding Swap: 128444k swap-space (priority
> -1)
> 
> 
> -- 
> ------------------------------------------------------------
> ------  HR On-Line:  The Network for Workplace Issues ------
> http://www.hronline.com - Ph:416-604-7251 - Fax:416-604-4708
> ------------------------------------------------------------
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]
> 
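
About the boot log above: it shows both drives negotiating 80.0 Mbyte/sec
(it may simply predate the change), so after lowering the speed in the
controller setup it could be worth checking what the driver actually
negotiated. A rough sketch, assuming Python 3 and assuming the aic7xxx
messages keep the "Synchronous at ... Mbyte/sec, offset ..." wording seen
in that log. The function name is mine.

```python
import re

# Matches e.g. "(scsi0:0:0:0) Synchronous at 80.0 Mbyte/sec, offset 15."
SYNC_RE = re.compile(
    r"\((scsi\d+:\d+:\d+:\d+)\)\s+Synchronous at\s+([\d.]+)"
    r"\s+Mbyte/sec,\s+offset\s+(\d+)"
)


def negotiated_rates(log_text):
    """Return {device: (mbyte_per_sec, offset)} found in a kernel log.

    Mail quoting ("> ") is stripped and wrapped lines are rejoined first,
    so a log pasted from an email works as well as a raw syslog file.
    """
    lines = [re.sub(r"^(>\s?)+", "", line) for line in log_text.splitlines()]
    joined = " ".join(lines)
    return {
        m.group(1): (float(m.group(2)), int(m.group(3)))
        for m in SYNC_RE.finditer(joined)
    }
```

If the numbers do not drop after the change, the lower setting never took
effect and the "hangs less often" observation would need another look.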






