stability problem with SCSI in linux 2.4.2
Hi, I think I've found a stability problem in the SCSI subsystem of actual kernel 2.4.2. The setup: Dual-Pentium2 512MB RAM onboard Adaptec Controller AIC7895 (Dual Channel) 3 internal disk IBM DDRS-39130D 1 external disk easyRAID II with 360GB net capacity The easyRAID II is a RAID system that works with IDE-Disk and has an SCSI interface to the outside world. This RAIDs SCSI interface is a Symbios 53C895. Moreover the system contains 2 network cards and a graphics card. I use SCSI low-level driver for AIC-7xxx built into kernel. I have updated all the necessary tools as described in the Documentation/Changes file. gcc-2.91.66 binutils-2.10.1 util-linux-2.10s e2fsprogs-1.19 The problem occurs when I try to create a file system (or copying lots of files to the existing filesystem) on the internal disk or the external RAID after a while there a lot of read error on the disks reported, and the system becomes unusable, and mostly the external RAID crashes. After a reboot of the machine and resetting the RAID box, everything seems to be fine, and also there are no disk error. I can run three or more badblocks scans in parallel and all sectors are OK, but when I now try to create a file system (ext2 or reiserfs) the same stuff as described above happens. Then I test another setup with Dual-Pentium3 with 1GB RAM and Adaptec 29160 SCSI-controller and the same disks and the RAID box connected, but the same errors came up, when I try to write large amounts of data, disk reads causes no problems. Little writes to the disk, i.e. editing a file and saving it, also causes no errors (in both described setups). The problem is definitly away when going back to 2.2.18. Perhaps it is a similar problem like with kernel 2.2.13 that doesn't recognize IBM DDYS-T18350 disk, but 2.2.18 do. Thanks for advertence. With best regards Sebastian Woelk Systemadministrator - Werner-Seelenbinder-Schule Berlin (Germany) Below I've pasted some messages from my system log. Mar 3 01:31:30 tenkei kernel: (scsi1:0:12:-1) Unexpected busfree, LASTPHASE = 0x0, SEQADDR = 0x110 Mar 3 01:31:36 tenkei kernel: (scsi1:0:12:-1) Unexpected busfree, LASTPHASE = 0x0, SEQADDR = 0x58 Mar 3 01:31:48 tenkei last message repeated 2 times Mar 3 01:31:54 tenkei kernel: (scsi1:0:12:-1) Unexpected busfree, LASTPHASE = 0x0, SEQADDR = 0x59 Mar 3 01:31:55 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 11 9c c4 50 00 04 00 00 Mar 3 01:31:55 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 11 9c c8 50 00 04 00 00 Mar 3 01:31:55 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 11 9c cc 50 00 04 00 00 Mar 3 01:31:55 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 11 9c d0 50 00 04 00 00 Mar 3 01:31:55 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 11 9c d4 50 00 02 e0 00 Mar 3 01:31:55 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0b eb c0 00 00 00 08 00 Mar 3 01:31:55 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0b f3 c0 30 00 00 08 00 Mar 3 01:31:55 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0b fb c0 30 00 00 08 00 Mar 3 01:32:15 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0b ff c0 38 00 00 08 00 Mar 3 01:32:23 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0c 13 c0 30 00 00 08 00 Mar 3 01:32:23 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0c 2f c0 30 00 00 08 00 Mar 3 01:32:23 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0c 33 c0 30 00 00 10 00 Mar 3 01:33:02 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0c 37 c0 30 00 00 08 00 Mar 3 01:33:16 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0c 3b c0 30 00 00 08 00 Mar 3 01:33:27 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0c 3f c0 30 00 00 08 00 Mar 3 01:33:38 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0c 43 c0 30 00 00 08 00 Mar 3 01:33:48 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12, lun 0 Write (10) 00 0c 47 c0 30 00 00 08 00 Mar 3 01:33:58 tenkei kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 12,
Re: scsi vs ide performance on fsync's
There is definitely something strange going on here. As the bonnie test below shows, the SCSI disk used for my tests should vastly outperform the old IDE one: ---Sequential Output ---Sequential Input-- --Random-- Seagate -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks--- ST318451LW MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU SCSI 200 21544 96.8 51367 51.4 11141 16.3 17729 58.2 40968 40.4 602.9 5.4 Quantum ---Sequential Output ---Sequential Input-- --Random-- Fireball -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks--- ST3.2A MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU IDE 200 3884 72.8 4513 86.0 1781 36.4 3144 89.9 4052 95.3 131.5 0.9 I used a program based on Mike Black's "Blah Blah" test (shown below) in which 200 write()+fdatasync()s are performed. Each write() outputs either 20 or 4096 bytes. On my Celeron 533 Mhz 128 MB ram hardware with an ext2 fs, the "block" size that is seen by the sd driver for each fdatasync() is 4096 bytes. lk 2.4.2 is being used. The fs/buffer.c __wait_on_buffer() routine waits for IO completion in response to fdatasync(). Timings have been done with Andrew Morton's timepegs (units are microseconds). Here are the IDE results: IDE 20*200 Destination Count Min Max Average Total enter __wait_on_buffer:0 - leave __wait_on_buffer:0 2001,037.23 6,487.72 1,252.19 250,439.80 leave __wait_on_buffer:0 - enter __wait_on_buffer:0 1997.32 21.05 7.821,557.05 IDE 4096*200 Destination Count Min Max Average Total enter __wait_on_buffer:0 - leave __wait_on_buffer:0 2001,037.06 7,354.21 1,243.78 248,756.64 leave __wait_on_buffer:0 - enter __wait_on_buffer:0 199 23.01 67.32 37.037,370.51 So the size of each transfer doesn't matter to this IDE disk. Now the same test for the SCSI disk: SCSI(20*200) Destination Count Min Max Average Total enter __wait_on_buffer:0 - enter sd_init_command:0 200 1.86 13.27 2.05 411.48 enter sd_init_command:0 - enter rw_intr:0 200320.87 5,398.56 3,417.30 683,461.25 enter rw_intr:0 - leave __wait_on_buffer:0 200 4.04 15.81 4.42 885.73 leave __wait_on_buffer:0 - enter __wait_on_buffer:0 199 8.78 14.39 9.261,844.23 SCSI(4096*200) Destination Count MinMax Average Total enter __wait_on_buffer:0 - enter sd_init_command:0 200 1.97 13.20 2.21 443.52 enter sd_init_command:0 - enter rw_intr:0 200109.53 13,997.50 1,327.47 265,495.87 enter rw_intr:0 - leave __wait_on_buffer:0 200 4.37 22.50 4.75 951.44 leave __wait_on_buffer:0 - enter __wait_on_buffer:0 199 22.40 42.20 24.274,831.34 The extra timepegs inside the SCSI subsystem show that the IO transaction to that disk really did take that long. [Initially I suspected a "plugging" type elevator bug, but that isn't supported by the above and various other timepegs not shown.] Since there is a wait on completion for every write, tagged queuing should not be involved. So writing more data to the SCSI disk speeds it up! I suspect the critical point in the "20*200" test is that the same sequence of 8 512 byte sectors are being written to disk 200 times. BTW That disk spins at 15K rpm so one rotation takes 4 ms and it has a 4 MB cache. Even though the SCSI disk's "cache" mode page indicates that the write cache is on, it would seem that writing the same sectors continually causes flushes to the medium (and hence the associated delay). Here is scu's output of the "cache" mode page: $ scu -f /dev/sda show page cache Cache Control Parameters (Page 0x8 - Current Values): Mode Parameter Header: Mode Data Length: 31 Medium Type: 0 (Default Medium Type) Device Specific Parameter: 0x10 (Supports DPO FUA bits) Block Descriptor Length: 8 Mode Parameter Block Descriptor: Density Code: 0x2 Number of Logical Blocks: 2289239 (1117.792 megabytes) Logical Block Length: 512 Page Header / Data: Page Code: 0x8 Parameters Savable: Yes Page Length: 18 Read Cache Disable (RCD): No Multiplication Factor (MF): Off Write Cache Enable (WCE): Yes Cache Segment Size Enable (SIZE): Off Discontinuity (DISC): On Caching Analysis Permitted (CAP): Disabled Abort Pre-Fetch (ABPF): Off Initiator Control Enable (IC): Off Write Retention Priority: 0 (Not distiguished) Demand Read Retention Priority: 0 (Not distiguished) Disable Prefetch
Re: stability problem with SCSI in linux 2.4.2
I think I've found a stability problem in the SCSI subsystem of actual kernel 2.4.2. The 2.4.2 adaptec driver has plenty of problems. Either use the -ac version and/or get Justin Gibbs new driver - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to [EMAIL PROTECTED]
Re: scsi vs ide performance on fsync's
Douglas Gilbert wrote: There is definitely something strange going on here. As the bonnie test below shows, the SCSI disk used for my tests should vastly outperform the old IDE one: First thank you and others with my clueless investigation about the module loading under Debian GNU/Linux. (I should have known that Debian uses a very special module setup.) Anyway, I used to think SCSI is better than IDE in general, and the post was quite surprising. So I ran the test on my PC. On my systems too, the IDE beats SCSI hand down with the test case. BTW, has anyone noticed that the elapsed time of SCSI case is TWICE as long if we let the previous output of the test program stay before running the second test? (I suspect fdatasync takes time proportional to the (then current) file size, but still why SCSI case is so long is beyond me.) Eg. ishikawa@duron$ ls -l /tmp/t.out ls: /tmp/t.out: No such file or directory ishikawa@duron$ time ./xlog /tmp/t.out fsync real0m38.673s=== my scsi disk is slow one to begin with... user0m0.050s sys 0m0.140s ishikawa@duron$ ls -l /tmp/t.out -rw-r--r--1 ishikawa users 112000 Mar 5 06:19 /tmp/t.out ishikawa@duron$ time ./xlog /tmp/t.out fsync real1m16.928s=== See TWICE as long! user0m0.060s sys 0m0.160s ishikawa@duron$ ls -l /tmp/t.out -rw-r--r--1 ishikawa users 112000 Mar 5 06:20 /tmp/t.out ishikawa@duron$ rm /tmp/t.out REMOVE the file and try again. ishikawa@duron$ time ./xlog /tmp/t.out fsync real0m40.667s Half as long and back to original. user0m0.040s sys 0m0.120s iishikawa@duron$ time ./xlog /tmp/t.out xxx real0m0.012s === very fast without fdatasync as it should be. user0m0.010s sys 0m0.010s ishikawa@duron$ - To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to [EMAIL PROTECTED]