RE: [Iscsitarget-devel] [BUG] Raid1/5 over iSCSI trouble

2007-10-19 Thread Ross S. W. Walker
BERTRAND Joël wrote:
 
 Ross S. W. Walker wrote:
  BERTRAND Joël wrote:
  BERTRAND Joël wrote:
  Bill Davidsen wrote:
  Dan Williams wrote:
  On Fri, 2007-10-19 at 01:04 -0700, BERTRAND Joël wrote:
   
  I ran some dd's for 12 hours (read and write in nullio) between
  initiator and target without any disconnection, so the iSCSI code
  seems to be robust. Both initiator and target are alone on a single
  gigabit ethernet link (without any switch). I'm investigating...
  
  Can you reproduce on 2.6.22?
 
  Also, I do not think this is the cause of your failure, but you have
  CONFIG_DMA_ENGINE=y in your config.  Setting this to 'n' will compile
  out the unneeded checks for offload engines in async_memcpy and
  async_xor.

  Given that offload engines are far less tested code, I think this is
  a very good thing to try!
  I'm trying without CONFIG_DMA_ENGINE=y. istd1 only uses 40% of one
  CPU when I rebuild my raid1 array. 1% of this array has now been
  resynchronized without any hang.
 
  Root gershwin:[/usr/scripts]  cat /proc/mdstat
  Personalities : [raid1] [raid6] [raid5] [raid4]
  md7 : active raid1 sdi1[2] md_d0p1[0]
        1464725632 blocks [2/1] [U_]
        [>....................]  recovery =  1.0% (15705536/1464725632) finish=1103.9min speed=21875K/sec
 Same result...
 
  connection2:0: iscsi: detected conn error (1011)
  session2: iscsi: session recovery timed out after 120 secs
  sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
  sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
  sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
  sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
  sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
  sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
  sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
  
  I am unsure why you would want to set up an iSCSI RAID1, but before
  doing so I would try to verify that each independent iSCSI session
  is bulletproof.
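
  For example, something like this (untested, and the device name is
  just whatever your iSCSI LUN shows up as on the initiator) makes a
  decent soak test:

      # Hypothetical soak test: alternate large sequential writes and
      # reads against the raw iSCSI device; stop on the first I/O error.
      while :; do
          dd if=/dev/zero of=/dev/sdj bs=8192 count=1000000 || break
          dd if=/dev/sdj of=/dev/null bs=8192 count=1000000 || break
      done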
 
 	I use one and only one iSCSI session. The RAID1 array is built
 between a local volume and an iSCSI volume.

Oh, in that case you will be much better served with DRBD, which
would provide you with what you want without creating a Frankenstein
setup...
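
Off the top of my head, a minimal drbd.conf for that looks something
like this (untested sketch; the peer hostname, addresses and the
remote backing disk are placeholders, only gershwin and md_d0p1 come
from your mdstat output):

    resource r0 {
        protocol C;                     # synchronous writes, like RAID1
        on gershwin {                   # local node
            device    /dev/drbd0;       # the device you actually mount
            disk      /dev/md_d0p1;     # local backing store
            address   192.168.0.1:7788;
            meta-disk internal;
        }
        on peer1 {                      # remote node (name is a placeholder)
            device    /dev/drbd0;
            disk      /dev/sda1;        # remote backing store (placeholder)
            address   192.168.0.2:7788;
            meta-disk internal;
        }
    }

DRBD then owns the replication end-to-end instead of making md talk
to a disk that can vanish mid-write.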

-Ross



RE: [Iscsitarget-devel] [BUG] Raid1/5 over iSCSI trouble

2007-10-19 Thread Ross S. W. Walker
BERTRAND Joël wrote:
 
 [...]
 
 	Sorry for my last mail. I have found another problem, but I
 don't know whether this bug comes from iscsi-target or from raid5
 itself. The iSCSI target is disconnected because the istd1 and
 md_d0_raid5 kernel threads each use 100% of a CPU!
 
 Tasks: 235 total,   6 running, 227 sleeping,   0 stopped,   2 zombie
 Cpu(s):  0.1%us, 12.5%sy,  0.0%ni, 87.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
 Mem:   4139032k total,   218424k used,  3920608k free,    10136k buffers
 Swap:  7815536k total,        0k used,  7815536k free,    64808k cached
 
   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  5824 root      15  -5     0    0    0 R  100  0.0  10:34.25 istd1
  5599 root      15  -5     0    0    0 R  100  0.0   7:25.43 md_d0_raid5
 
   Regards,
 
   JKB

If you have two iSCSI sessions mirrored, then any failure along either
path will hose the setup. Plus, having iSCSI and MD RAID fight over
the same resources in the kernel is a recipe for a race condition.

How about exploring MPIO and DRBD?
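
For the MPIO half, a minimal /etc/multipath.conf along these lines
(illustrative defaults, untested on your hardware) lets dm-multipath
bind two sessions to the same target, so path failover happens below
any mirroring:

    defaults {
        path_grouping_policy  multibus   # spread I/O across both sessions
        failback              immediate  # reuse a path as soon as it returns
        no_path_retry         12         # queue I/O ~1 min before erroring
    }

Then DRBD handles replication between the machines and multipath
handles link failures, so md and iSCSI never have to referee each
other.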

-Ross



RE: [Iscsitarget-devel] Abort Task ?

2007-10-19 Thread Ross S. W. Walker
Ming Zhang wrote:
 
 On Fri, 2007-10-19 at 16:30 +0200, BERTRAND Joël wrote:
  Ming Zhang wrote:
   On Fri, 2007-10-19 at 09:48 +0200, BERTRAND Joël wrote:
   Ross S. W. Walker wrote:
   BERTRAND Joël wrote:
   BERTRAND Joël wrote:
    I can format a 1.5 TB volume several times (mkfs.ext3) over iSCSI
    without any trouble. I can read and write on this virtual disk
    without any trouble.
  
    Now, I have configured ietd with:
  
   Lun 0 Sectors=1464725758,Type=nullio
  
    and I run on the initiator side:
  
   Root gershwin:[/dev]  dd if=/dev/zero of=/dev/sdj bs=8192
   479482+0 records in
   479482+0 records out
   3927916544 bytes (3.9 GB) copied, 153.222 seconds, 25.6 MB/s
  
   Root gershwin:[/dev]  dd if=/dev/zero of=/dev/sdj bs=8192
  
    I'm waiting for a crash. None so far as I write these lines.
   I suspect an interaction between raid and iscsi.
  I simultaneously run:
  
   Root gershwin:[/dev]  dd if=/dev/zero of=/dev/sdj bs=8192
   8397210+0 records in
   8397210+0 records out
   68789944320 bytes (69 GB) copied, 2732.55 seconds, 25.2 MB/s
  
   and
  
   Root gershwin:[~]  dd if=/dev/sdj of=/dev/null bs=8192
   739200+0 records in
   739199+0 records out
   6055518208 bytes (6.1 GB) copied, 447.178 seconds, 13.5 MB/s
  
  without any trouble.
   The speed can definitely be improved. Look at your network setup
   and use ping to try and get the network latency to a minimum.
  
   # ping -A -s 8192 172.16.24.140
   
   --- 172.16.24.140 ping statistics ---
   14058 packets transmitted, 14057 received, 0% packet loss, time 9988ms
   rtt min/avg/max/mdev = 0.234/0.268/2.084/0.041 ms, ipg/ewma 0.710/0.260 ms
   gershwin:[~]  ping -A -s 8192 192.168.0.2
   PING 192.168.0.2 (192.168.0.2) 8192(8220) bytes of data.
   8200 bytes from 192.168.0.2: icmp_seq=1 ttl=64 time=0.693 ms
   8200 bytes from 192.168.0.2: icmp_seq=2 ttl=64 time=0.595 ms
   8200 bytes from 192.168.0.2: icmp_seq=3 ttl=64 time=0.583 ms
   8200 bytes from 192.168.0.2: icmp_seq=4 ttl=64 time=0.589 ms
   8200 bytes from 192.168.0.2: icmp_seq=5 ttl=64 time=0.580 ms
   8200 bytes from 192.168.0.2: icmp_seq=6 ttl=64 time=0.594 ms
   8200 bytes from 192.168.0.2: icmp_seq=7 ttl=64 time=0.580 ms
   8200 bytes from 192.168.0.2: icmp_seq=8 ttl=64 time=0.592 ms
   8200 bytes from 192.168.0.2: icmp_seq=9 ttl=64 time=0.589 ms
   8200 bytes from 192.168.0.2: icmp_seq=10 ttl=64 time=0.571 ms
   8200 bytes from 192.168.0.2: icmp_seq=11 ttl=64 time=0.588 ms
   8200 bytes from 192.168.0.2: icmp_seq=12 ttl=64 time=0.580 ms
   8200 bytes from 192.168.0.2: icmp_seq=13 ttl=64 time=0.587 ms
  
   --- 192.168.0.2 ping statistics ---
   13 packets transmitted, 13 received, 0% packet loss, time 2400ms
   rtt min/avg/max/mdev = 0.571/0.593/0.693/0.044 ms, ipg/ewma 200.022/0.607 ms
   gershwin:[~] 
  
   Both initiator and target are alone on a gigabit NIC (Tigon3). On
   the target server, istd1 takes 100% of a CPU (and only one CPU, even
   though my T1000 can run 32 threads simultaneously). I think the
   limitation comes from istd1.
   
   Usually istdX will not take 100% CPU with a 1G network, especially
   when using a disk as backing storage; some profiling work might be
   helpful to tell what happened...
   
   Forgot to ask: what is your sparc64 platform's CPU spec?
  
  Root gershwin:[/mnt/solaris]  cat /proc/cpuinfo
  cpu : UltraSparc T1 (Niagara)
  fpu : UltraSparc T1 integrated FPU
  prom: OBP 4.23.4 2006/08/04 20:45
  type: sun4v
  ncpus probed: 24
  ncpus active: 24
  D$ parity tl1   : 0
  I$ parity tl1   : 0
  
   Both servers are built with 1 GHz T1 processors (6 cores, 24 threads).
  
 
 As Ross pointed out, many I/O patterns only have one outstanding I/O
 at any time, so there is only one worker thread actively serving it,
 and it cannot exploit the multiple cores here.
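 
 For illustration, one crude way to keep more than one I/O in flight
 from userspace is to run several transfers in parallel (device name,
 offsets and counts below are arbitrary placeholders):
 
     # Four concurrent dd streams, each writing a disjoint ~8 GB region,
     # so the target sees up to four outstanding I/Os instead of one.
     for i in 0 1 2 3; do
         dd if=/dev/zero of=/dev/sdj bs=8192 count=1000000 \
            seek=$((i * 1000000)) &
     done
     wait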
 
 
 Do you see 100% with nullio or with fileio? With a disk, most of the
 time should be spent in iowait and CPU utilization should not be high
 at all.

Maybe it has to do with the endianness fix?

Look at where the fix was implemented and whether there was a simpler
way of implementing it (if that is the cause).

The network is still slower than expected. I don't know what chipset
the Sparcs use for their interfaces; if it is e1000, then you can set
low-latency interrupt throttling with InterruptThrottleRate=1, which
works well. You can explore other interface module options around
interrupt throttling or coalescing.
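
For example (assuming the driver really is e1000; the Tigon3/tg3 you
mentioned has no such module option, so there you would try ethtool
coalescing instead, if the driver supports it):

    # e1000: favour latency over interrupt batching (/etc/modprobe.conf)
    options e1000 InterruptThrottleRate=1

    # tg3: shrink interrupt coalescing delays via ethtool
    ethtool -C eth0 rx-usecs 10 tx-usecs 10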

-Ross


RE: [Iscsitarget-devel] Abort Task ?

2007-10-18 Thread Ross S. W. Walker
BERTRAND Joël wrote:
 
 BERTRAND Joël wrote:
  I can format a 1.5 TB volume several times (mkfs.ext3) over iSCSI
  without any trouble. I can read and write on this virtual disk
  without any trouble.
  
  Now, I have configured ietd with:
  
  Lun 0 Sectors=1464725758,Type=nullio
  
  and I run on the initiator side:
  
  Root gershwin:[/dev]  dd if=/dev/zero of=/dev/sdj bs=8192
  479482+0 records in
  479482+0 records out
  3927916544 bytes (3.9 GB) copied, 153.222 seconds, 25.6 MB/s
  
  Root gershwin:[/dev]  dd if=/dev/zero of=/dev/sdj bs=8192
  
  I'm waiting for a crash. None so far as I write these lines. I
  suspect an interaction between raid and iscsi.
 
   I simultaneously run:
 
 Root gershwin:[/dev]  dd if=/dev/zero of=/dev/sdj bs=8192
 8397210+0 records in
 8397210+0 records out
 68789944320 bytes (69 GB) copied, 2732.55 seconds, 25.2 MB/s
 
 and
 
 Root gershwin:[~]  dd if=/dev/sdj of=/dev/null bs=8192
 739200+0 records in
 739199+0 records out
 6055518208 bytes (6.1 GB) copied, 447.178 seconds, 13.5 MB/s
 
   without any trouble.

The speed can definitely be improved. Look at your network setup
and use ping to try and get the network latency to a minimum.

# ping -A -s 8192 172.16.24.140

--- 172.16.24.140 ping statistics ---
14058 packets transmitted, 14057 received, 0% packet loss, time 9988ms
rtt min/avg/max/mdev = 0.234/0.268/2.084/0.041 ms, ipg/ewma 0.710/0.260 ms

You want your average ping time for 8192-byte payloads to be 300us or less.

1000 ms / 0.268 ms = 3731 IOPS @ 8k = ~30 MB/s
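
Spelled out (synchronous I/O completes one 8k request per network
round trip, so the RTT caps the request rate):

    # Reproduce the estimate above from the measured average RTT.
    awk 'BEGIN { rtt_ms = 0.268;          # avg round-trip time from ping
                 iops   = 1000 / rtt_ms;  # one sync 8k I/O per round trip
                 printf "%.0f IOPS, %.1f MB/s\n", iops, iops * 8192 / 1e6 }'
    # prints: 3731 IOPS, 30.6 MB/s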

If you use apps that do overlapping, asynchronous I/O, you can see
better numbers.

-Ross
