RE: [Iscsitarget-devel] [BUG] Raid1/5 over iSCSI trouble
BERTRAND Joël wrote: Ross S. W. Walker wrote: BERTRAND Joël wrote: BERTRAND Joël wrote: Bill Davidsen wrote: Dan Williams wrote:

On Fri, 2007-10-19 at 01:04 -0700, BERTRAND Joël wrote:

I ran some dd's for 12 hours (read and write in nullio mode) between initiator and target without any disconnection, so the iSCSI code seems to be robust. Both initiator and target are alone on a single gigabit Ethernet link (without any switch). I'm investigating...

Can you reproduce on 2.6.22? Also, I do not think this is the cause of your failure, but you have CONFIG_DMA_ENGINE=y in your config. Setting this to 'n' will compile out the unneeded checks for offload engines in async_memcpy and async_xor.

Given that offload engines are far less tested code, I think this is a very good thing to try!

I'm trying without CONFIG_DMA_ENGINE=y. istd1 only uses 40% of one CPU while I rebuild my RAID1 array. 1% of this array has now been resynchronized without any hang.

Root gershwin:[/usr/scripts] cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md7 : active raid1 sdi1[2] md_d0p1[0]
      1464725632 blocks [2/1] [U_]
      recovery = 1.0% (15705536/1464725632) finish=1103.9min speed=21875K/sec

Same result...

connection2:0: iscsi: detected conn error (1011)
session2: iscsi: session recovery timed out after 120 secs
sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
sd 4:0:0:0: scsi: Device offlined - not ready after error recovery

I am unsure why you would want to set up an iSCSI RAID1, but before doing so I would try to verify that each independent iSCSI session is bulletproof.

I use one and only one iSCSI session.
The RAID1 array is built from a local volume and an iSCSI volume.

Oh, in that case you will be much better served by DRBD, which would provide what you want without creating a Frankenstein setup...

-Ross

__
This e-mail, and any attachments thereto, is intended only for use by the addressee(s) named herein and may contain legally privileged and/or confidential information. If you are not the intended recipient of this e-mail, you are hereby notified that any dissemination, distribution or copying of this e-mail, and any attachments thereto, is strictly prohibited. If you have received this e-mail in error, please immediately notify the sender and permanently delete the original and any copy or printout thereof.

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
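For reference, a mirror of this shape is normally assembled with mdadm once the initiator has logged in and the remote LUN has shown up as a local SCSI disk. The device names below (/dev/md_d0p1 for the local half, /dev/sdi1 for the iSCSI-backed half) match the mdstat output quoted earlier in the thread, but the target name and portal address are placeholders; treat this as a sketch of the setup, not the poster's exact commands.

```shell
# Log in to the target so the LUN appears as a local SCSI disk
# (open-iscsi initiator; IQN and portal are hypothetical).
iscsiadm -m node -T iqn.2007-10.example:target0 -p 192.168.0.2 --login

# Build the mirror from the local partition and the iSCSI-backed disk.
mdadm --create /dev/md7 --level=1 --raid-devices=2 /dev/md_d0p1 /dev/sdi1

# Watch the resync progress.
cat /proc/mdstat
```

Note that md treats the iSCSI-backed member like any local disk: a session drop during resync looks to md like a dead drive, which is exactly the failure mode seen in this thread.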
RE: [Iscsitarget-devel] [BUG] Raid1/5 over iSCSI trouble
BERTRAND Joël wrote: BERTRAND Joël wrote: BERTRAND Joël wrote: Bill Davidsen wrote: Dan Williams wrote:

[earlier quoted text snipped; it is identical to the previous message]

Sorry for this last mail. I have found another problem, but I don't know whether this bug comes from iscsi-target or from raid5 itself. The iSCSI target is disconnected because the istd1 and md_d0_raid5 kernel threads each use 100% of a CPU!
Tasks: 235 total, 6 running, 227 sleeping, 0 stopped, 2 zombie
Cpu(s): 0.1%us, 12.5%sy, 0.0%ni, 87.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:   4139032k total,  218424k used, 3920608k free,   10136k buffers
Swap:  7815536k total,       0k used, 7815536k free,   64808k cached

 PID USER  PR NI VIRT RES SHR S %CPU %MEM    TIME+ COMMAND
5824 root  15 -5    0   0   0 R  100  0.0 10:34.25 istd1
5599 root  15 -5    0   0   0 R  100  0.0  7:25.43 md_d0_raid5

Regards, JKB

If you have two iSCSI sessions mirrored, then any failure along either path will hose the setup. Plus, having iSCSI and MD RAID fight over the same resources in the kernel is a recipe for a race condition. How about exploring MPIO and DRBD?

-Ross
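For comparison, DRBD replicates a block device over the network with its own resync and failure handling, instead of stacking md on top of an iSCSI session. A minimal resource definition looks roughly like the following; every hostname, device path, and address here is a hypothetical placeholder, not taken from the thread.

```
resource r0 {
    protocol C;                      # synchronous replication
    on gershwin {
        device    /dev/drbd0;
        disk      /dev/md_d0p1;      # local backing device
        address   192.168.0.1:7788;
        meta-disk internal;
    }
    on peer-host {
        device    /dev/drbd0;
        disk      /dev/sda1;
        address   192.168.0.2:7788;
        meta-disk internal;
    }
}
```

With protocol C a write completes only after both nodes have it, which is the same durability the RAID1-over-iSCSI setup was trying to achieve.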
RE: [Iscsitarget-devel] Abort Task ?
Ming Zhang wrote: On Fri, 2007-10-19 at 16:30 +0200, BERTRAND Joël wrote: Ming Zhang wrote: On Fri, 2007-10-19 at 09:48 +0200, BERTRAND Joël wrote: Ross S. W. Walker wrote: BERTRAND Joël wrote: BERTRAND Joël wrote:

I can run mkfs.ext3 several times on a 1.5 TB volume over iSCSI without any trouble, and I can read and write on this virtual disk without any trouble. Now I have configured ietd with:

Lun 0 Sectors=1464725758,Type=nullio

and I run on the initiator side:

Root gershwin:[/dev] dd if=/dev/zero of=/dev/sdj bs=8192
479482+0 records in
479482+0 records out
3927916544 bytes (3.9 GB) copied, 153.222 seconds, 25.6 MB/s
Root gershwin:[/dev] dd if=/dev/zero of=/dev/sdj bs=8192

I'm waiting for a crash; none so far as I write these lines. I suspect an interaction between raid and iscsi. I simultaneously run:

Root gershwin:[/dev] dd if=/dev/zero of=/dev/sdj bs=8192
8397210+0 records in
8397210+0 records out
68789944320 bytes (69 GB) copied, 2732.55 seconds, 25.2 MB/s

and

Root gershwin:[~] dd if=/dev/sdj of=/dev/null bs=8192
739200+0 records in
739199+0 records out
6055518208 bytes (6.1 GB) copied, 447.178 seconds, 13.5 MB/s

without any trouble.

The speed can definitely be improved. Look at your network setup and use ping to try to get the network latency to a minimum.

# ping -A -s 8192 172.16.24.140
--- 172.16.24.140 ping statistics ---
14058 packets transmitted, 14057 received, 0% packet loss, time 9988ms
rtt min/avg/max/mdev = 0.234/0.268/2.084/0.041 ms, ipg/ewma 0.710/0.260 ms

gershwin:[~] ping -A -s 8192 192.168.0.2
PING 192.168.0.2 (192.168.0.2) 8192(8220) bytes of data.
8200 bytes from 192.168.0.2: icmp_seq=1 ttl=64 time=0.693 ms
8200 bytes from 192.168.0.2: icmp_seq=2 ttl=64 time=0.595 ms
8200 bytes from 192.168.0.2: icmp_seq=3 ttl=64 time=0.583 ms
8200 bytes from 192.168.0.2: icmp_seq=4 ttl=64 time=0.589 ms
8200 bytes from 192.168.0.2: icmp_seq=5 ttl=64 time=0.580 ms
8200 bytes from 192.168.0.2: icmp_seq=6 ttl=64 time=0.594 ms
8200 bytes from 192.168.0.2: icmp_seq=7 ttl=64 time=0.580 ms
8200 bytes from 192.168.0.2: icmp_seq=8 ttl=64 time=0.592 ms
8200 bytes from 192.168.0.2: icmp_seq=9 ttl=64 time=0.589 ms
8200 bytes from 192.168.0.2: icmp_seq=10 ttl=64 time=0.571 ms
8200 bytes from 192.168.0.2: icmp_seq=11 ttl=64 time=0.588 ms
8200 bytes from 192.168.0.2: icmp_seq=12 ttl=64 time=0.580 ms
8200 bytes from 192.168.0.2: icmp_seq=13 ttl=64 time=0.587 ms
--- 192.168.0.2 ping statistics ---
13 packets transmitted, 13 received, 0% packet loss, time 2400ms
rtt min/avg/max/mdev = 0.571/0.593/0.693/0.044 ms, ipg/ewma 200.022/0.607 ms
gershwin:[~]

Both initiator and target are alone on a gigabit NIC (Tigon3). On the target server, istd1 takes 100% of one CPU (and only one CPU, even though my T1000 can run 32 threads simultaneously). I think the limitation comes from istd1.

Usually istdX will not take 100% CPU on a 1G network, especially when using a disk as backing storage. Some kind of profiling work might help tell what happened...

Forgot to ask: what is your sparc64 platform's CPU spec?

Root gershwin:[/mnt/solaris] cat /proc/cpuinfo
cpu             : UltraSparc T1 (Niagara)
fpu             : UltraSparc T1 integrated FPU
prom            : OBP 4.23.4 2006/08/04 20:45
type            : sun4v
ncpus probed    : 24
ncpus active    : 24
D$ parity tl1   : 0
I$ parity tl1   : 0

Both servers are built with 1 GHz T1 processors (6 cores, 24 threads).

As Ross pointed out, many IO patterns have only one outstanding IO at any time, so only one worker thread is actively serving it, and it cannot exploit the multiple cores here. Do you see 100% with nullio or with fileio?
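The dd numbers quoted above line up with a simple latency model: with one outstanding 8 KiB IO at a time, each block costs one network round trip, so throughput is bounded by the ping RTT. A small sketch, using the two average RTTs measured in the thread (0.268 ms on the reference link, 0.593 ms on the Sparc link):

```python
# Estimate synchronous (queue depth 1) iSCSI throughput from ping RTT.
# RTT figures come from the ping output quoted in the thread.

def qd1_throughput_mb_s(rtt_ms: float, block_bytes: int = 8192) -> float:
    """One outstanding IO at a time: each block costs one round trip."""
    iops = 1000.0 / rtt_ms            # completions per second
    return iops * block_bytes / 1e6   # MB/s (decimal, as dd reports)

print(qd1_throughput_mb_s(0.268))    # ~30.6 MB/s on the 0.268 ms link
print(qd1_throughput_mb_s(0.593))    # ~13.8 MB/s on the 0.593 ms Sparc link
```

The second figure is close to the 13.5 MB/s the reader dd actually measured, which supports the view that latency, not bandwidth, is the bottleneck here.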
With disk, most of the time should be spent in iowait and CPU utilization should not be high at all.

Maybe it has to do with the endianness fix? Look at where the fix was implemented and whether there was a simpler way of implementing it (if that is the cause).

The network is still slower than expected. I don't know what chipset the Sparcs use for their interfaces; if it is e1000 then you can set low-latency interrupt throttling with InterruptThrottleRate=1, which works well. You can explore other interface module options around interrupt throttling or coalescence.

-Ross
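If the NICs really were e1000, the module option would typically go in the modprobe configuration. The thread, however, says these are Tigon3 (tg3) ports, which have no such module parameter; coalescing there is usually tuned through ethtool. Both lines below are illustrative sketches: the interface name and the coalescing values are assumptions, not settings from the thread.

```shell
# e1000 only: minimal interrupt throttling for low latency
# (takes effect after the module is reloaded)
echo "options e1000 InterruptThrottleRate=1" > /etc/modprobe.d/e1000.conf

# tg3 (Tigon3): tune interrupt coalescing via ethtool instead
ethtool -C eth0 rx-usecs 10 tx-usecs 10
```

Lower coalescing delays cut per-IO latency at the cost of higher interrupt rates, which matters for a queue-depth-1 workload like the dd runs above.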
RE: [Iscsitarget-devel] Abort Task ?
BERTRAND Joël wrote: BERTRAND Joël wrote:

[earlier quoted text snipped; it is identical to the previous message]

The speed can definitely be improved. Look at your network setup and use ping to try to get the network latency to a minimum.

# ping -A -s 8192 172.16.24.140
--- 172.16.24.140 ping statistics ---
14058 packets transmitted, 14057 received, 0% packet loss, time 9988ms
rtt min/avg/max/mdev = 0.234/0.268/2.084/0.041 ms, ipg/ewma 0.710/0.260 ms

You want your average ping time for 8192-byte payloads to be 300 µs or less:

1000 / 0.268 ms = 3731 IOPS; 3731 IOPS @ 8 KiB ≈ 30 MB/s

If you use apps that do overlapping asynchronous IO you can see better numbers.

-Ross
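Ross's point about overlapping asynchronous IO can be made concrete with the same back-of-the-envelope model: with N requests in flight, the round-trip latency is amortized over N blocks until the wire rate becomes the bottleneck. The 119 MB/s ceiling below is an assumed rough payload limit for gigabit Ethernet, not a measured figure from the thread.

```python
# Latency-bound throughput with overlapping IOs, capped at an
# assumed GigE payload rate. A rough model, not a measurement.

def throughput_mb_s(rtt_ms: float, queue_depth: int,
                    block_bytes: int = 8192, link_mb_s: float = 119.0) -> float:
    """Throughput with `queue_depth` IOs in flight over one session."""
    latency_bound = (1000.0 / rtt_ms) * queue_depth * block_bytes / 1e6
    return min(latency_bound, link_mb_s)

print(throughput_mb_s(0.268, 1))   # ~30.6 MB/s: one IO in flight, as with dd
print(throughput_mb_s(0.268, 4))   # link-limited: latency no longer dominates
```

By this model a queue depth of four is already enough to saturate the link at the measured 0.268 ms RTT, which is why asynchronous or multi-threaded IO "sees better numbers".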