Re: iSCSI and FileSystem (ext2/ext3)

2009-04-17 Thread Ulrich Windl

On 16 Apr 2009 at 17:17, Bart Van Assche wrote:

 
 On Thu, Apr 16, 2009 at 2:07 PM, Pasi Kärkkäinen pa...@iki.fi wrote:
  IIRC it has been with RHEL5/CentOS5 2.6.18-based kernels..
 
  Mike Christie has been writing about this as well.. dunno about what kernels
  he has seen it with.
 
  Then again CFQ was designed for single disk workstations..
 
 I have seen the above statement about CFQ only once in the past, and
 that was in this post:
 http://www.mail-archive.com/cen...@centos.org/msg04648.html. But if
 CFQ really was designed for single disk workstations, I do not
 understand why it has been chosen as the default in Red Hat Enterprise
 Linux 4. There must have been a good reason behind that choice.

From what I know and what I've read (e.g. Deitel, Operating Systems, 3rd edition,
Case Study: Linux), the I/O scheduler manages requests per device, and CFQ has
the advantage over the default elevator algorithm that it avoids starvation of
requests. A possible consequence is that disk channel throughput is sub-optimal.
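
For reference, the scheduler can be inspected and changed per device through
sysfs on 2.6 kernels; a small sketch (sdb is only an example device name, and
the change does not persist across reboots):

cat /sys/block/sdb/queue/scheduler              # the bracketed entry is the active one
echo deadline > /sys/block/sdb/queue/scheduler  # switch just this device to deadline
cat /sys/block/sdb/queue/scheduler              # e.g.: noop anticipatory [deadline] cfq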

Regards,
Ulrich





Re: open-iscsi created only one disk but missing 2 more

2009-04-17 Thread Ulrich Windl

On 16 Apr 2009 at 16:02, sundar mahadevan wrote:

 
 I tried the same setting with switching the hard drive to system 2 and
 i still get the same result. It detects only one logical volume. There
 is some setting which i'm obviously missing out. Experts, please help.

Hi!

I have no idea about that, but I'd guess that IF the device files are all created,
you SHOULD restart the iSCSI target service so the devices are detected, and maybe
you'll have to define access permissions inside the target software as well.
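
A hedged sketch of that, assuming IET as the target software (the init script
name varies by distro) and a reasonably recent open-iscsi on the initiator:

/etc/init.d/iscsitarget restart                       # on the target box (IET; script name is distro-dependent)
iscsiadm -m session -R                                # on the initiator: rescan all logged-in sessions
iscsiadm -m session -P 3 | grep "Attached scsi disk"  # check which sdX devices each session now exposes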

However, this list is for the iSCSI INITIATOR (open-iscsi), so maybe you are
asking the wrong people. Did you read the documentation for your TARGET software
(which I don't know)?

Regards,
Ulrich


 
 On Thu, Apr 16, 2009 at 11:48 AM, sundar mahadevan
 sundarmahadeva...@gmail.com wrote:
  Could someone enlighten me on this question please:
  Do I have to install iscsitarget on system 1 and access it with
  open-iscsi (the iSCSI initiator) from system 2, or
  can I have open-iscsi installed on both system 1 and system 2 and get
  it working?
 
  On Thu, Apr 16, 2009 at 10:51 AM, sundar mahadevan
  sundarmahadeva...@gmail.com wrote:
  kernel: 2.6.27-11
 
  IET is your target.
  To my understanding, IET is the iSCSI Enterprise Target, which is similar to
  open-iscsi in implementing iSCSI targets. open-iscsi and IET are
  different organisations implementing iSCSI. Am I right?
 
 
 
  On Thu, Apr 16, 2009 at 10:46 AM, Mike Christie micha...@cs.wisc.edu 
  wrote:
 
  sundar mahadevan wrote:
  Are you building the open-iscsi tools or are they part of the distro you
  are using?
 
  I just used apt-get install open-iscsi on both systems.
 
 
  What kernel are you using (uname -a)?
 
  Are you also using qla4xxx or just using iscsi_tcp?
 
   I don't think I use qla4xxx. I think I use iscsi_tcp.
 
  If you are building your own tools make sure if you are using a 64 bit
  kernel then the tools are also compiled as 64 bits.
 
    I'm on a 32-bit system. In that case, of what use is 64-bit to me?
 
  Make sure that you only have one set of tools installed. Do a whereis
  iscsid and whereis iscsiadm.
 
  whereis iscsid
  iscsid: /sbin/iscsid /usr/share/man/man8/iscsid.8.gz
  whereis iscsiadm
  iscsiadm: /sbin/iscsiadm /usr/share/man/man8/iscsiadm.8.gz
 
  You are using IET right? If so it does not matter what disks you use.
  IET can handle it.
 
  No, I don't use IET. I use open-iscsi on both systems.
 
  IET is your target.
 
  
 
 
 
 
  






Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-17 Thread Ulrich Windl

On 16 Apr 2009 at 13:59, jnantel wrote:

 
  FINAL RESULTS *
  First of all I'd like to thank Mike Christie for all his help. Mike, I'll be
  tapping your brain again for some read performance help.
 
  This is for the benefit of anyone using the Dell Equallogic PS5000XV
  PS5000E with SLES10 SP2 / Redhat 5.3 / Centos 5.3 / Oracle Linux +
  Multipath (MPIO) and open-iscsi (iscsi). Sorry about the weird
  formatting; I am making sure this gets found by people who were in
  my predicament.

When seeing your settings, I wonder what your network latency for jumbo frames
is (e.g. using ping). The timing depends on packet size. Here is what I have if
everything is connected to one switch (and both ends are handling normal iSCSI
traffic at the same time), started from Domain-0 of a XEN-virtualized machine
that has 77 users logged on:

# ping -s 9000 172.20.76.1
PING 172.20.76.1 (172.20.76.1) 9000(9028) bytes of data.
9008 bytes from 172.20.76.1: icmp_seq=1 ttl=64 time=1.90 ms
9008 bytes from 172.20.76.1: icmp_seq=2 ttl=64 time=1.38 ms
9008 bytes from 172.20.76.1: icmp_seq=3 ttl=64 time=1.39 ms
9008 bytes from 172.20.76.1: icmp_seq=4 ttl=64 time=1.40 ms
9008 bytes from 172.20.76.1: icmp_seq=5 ttl=64 time=1.56 ms
9008 bytes from 172.20.76.1: icmp_seq=6 ttl=64 time=1.52 ms
9008 bytes from 172.20.76.1: icmp_seq=7 ttl=64 time=1.39 ms
9008 bytes from 172.20.76.1: icmp_seq=8 ttl=64 time=1.40 ms
9008 bytes from 172.20.76.1: icmp_seq=9 ttl=64 time=1.55 ms
9008 bytes from 172.20.76.1: icmp_seq=10 ttl=64 time=1.38 ms

--- 172.20.76.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9000ms
rtt min/avg/max/mdev = 1.384/1.491/1.900/0.154 ms
# ping 172.20.76.1
PING 172.20.76.1 (172.20.76.1) 56(84) bytes of data.
64 bytes from 172.20.76.1: icmp_seq=1 ttl=64 time=0.253 ms
64 bytes from 172.20.76.1: icmp_seq=2 ttl=64 time=0.214 ms
64 bytes from 172.20.76.1: icmp_seq=3 ttl=64 time=0.223 ms
64 bytes from 172.20.76.1: icmp_seq=4 ttl=64 time=0.214 ms
64 bytes from 172.20.76.1: icmp_seq=5 ttl=64 time=0.215 ms
64 bytes from 172.20.76.1: icmp_seq=6 ttl=64 time=0.208 ms
64 bytes from 172.20.76.1: icmp_seq=7 ttl=64 time=0.270 ms
64 bytes from 172.20.76.1: icmp_seq=8 ttl=64 time=0.313 ms

--- 172.20.76.1 ping statistics ---
8 packets transmitted, 8 received, 0% packet loss, time 6996ms
rtt min/avg/max/mdev = 0.208/0.238/0.313/0.039 ms

I think large queues are more important if the roundtrip delay is high. And
don't forget that queue sizes are per device or session, so this uses some RAM.
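
As a quick check on the initiator, the per-device value can be read back from
sysfs (sdb is just an example name):

cat /sys/block/sdb/device/queue_depth   # queue depth currently in effect for this device
iscsiadm -m session                     # one line per session; each session queues commands separately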

Regards,
Ulrich

 
  As described in this thread, my issue was amazingly slow performance with
  sequential writes with my multipath configuration, around 35 meg/s,
  when measured with IOMETER.  First things first... THROW OUT IOMETER
  FOR LINUX, it has problems with queue depth.  With that said, with the
  default iscsi and multipath setup we saw between 60-80 meg/sec
  with multipath. In essence it was slower than a single
  interface at certain block sizes. When I was done, my write performance
  was pushing 180-190 meg/sec with blocks as small as 4k (sequential
  write test using dt).
 
 Here are my tweaks:
 
  After making any multipath changes, run multipath -F and then multipath,
  otherwise your changes won't take effect.
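
Spelled out as commands (a sketch; the maps shown by -ll will of course differ
per setup):

multipath -F     # flush all unused multipath maps
multipath        # rebuild maps from the current /etc/multipath.conf
multipath -ll    # verify paths, path groups and the path selector in use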
 
 /etc/multipath.conf
 
  device {
          vendor                 "EQLOGIC"
          product                "100E-00"
          path_grouping_policy   multibus
          getuid_callout         "/sbin/scsi_id -g -u -s /block/%n"
          features               "1 queue_if_no_path"   <--- important
          path_checker           readsector0
          failback               immediate
          path_selector          "round-robin 0"
          rr_min_io              512   <--- important, only works with large
                                             queue depth and cmds in iscsi.conf
          rr_weight              priorities
  }
 
 
 /etc/iscsi/iscsi.conf   ( restarting iscsi seems to apply the configs
 fine)
 
 # To control how many commands the session will queue set
 # node.session.cmds_max to an integer between 2 and 2048 that is also
 # a power of 2. The default is 128.
 node.session.cmds_max = 1024
 
 # To control the device's queue depth set node.session.queue_depth
 # to a value between 1 and 128. The default is 32.
 node.session.queue_depth = 128
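
If you would rather push these onto already-discovered node records than
re-edit the file, iscsiadm can update them per node; the target name and portal
below are placeholders:

iscsiadm -m node -T iqn.2001-05.com.example:storage -p 192.0.2.10:3260 \
         -o update -n node.session.cmds_max -v 1024
iscsiadm -m node -T iqn.2001-05.com.example:storage -p 192.0.2.10:3260 \
         -o update -n node.session.queue_depth -v 128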
 
 Other changes I've made are basic gigabit network tuning for large
 transfers and turning off some congestion functions, some scheduler
 changes (noop is amazing for sub 4k blocks but awful for 4meg chunks
 or higher). I've turned off TSO on the network cards, apparently it's
 not supported with jumbo frames and actually slows down performance.
 
 
 dc1stgdb14:~ # ethtool -k eth7
 Offload parameters for eth7:
 rx-checksumming: off
 tx-checksumming: off
 scatter-gather: off
 tcp segmentation offload: off
 dc1stgdb14:~ # ethtool -k eth10
 Offload parameters for eth10:
 rx-checksumming: off
 tx-checksumming: off
 scatter-gather: off
 tcp segmentation offload: off
 dc1stgdb14:~ #
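
For reference, those offloads are normally toggled with ethtool -K; a sketch
using the interfaces from the output above (the settings do not survive a
reboot unless scripted):

ethtool -K eth7 tso off sg off tx off rx off
ethtool -K eth10 tso off sg off tx off rx off
ethtool -k eth7    # read back and verify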
 
 
 On Apr 13, 4:36 pm, jnantel nan...@hotmail.com wrote:
  I am having a major issue with multipath + iscsi writeperformance
  with anything random or any 

Re: open-iscsi created only one disk but missing 2 more

2009-04-17 Thread sundar mahadevan

Hi All,
I tried different things to go past the errors. Now I received a
different error.

I changed my setup. Earlier I had a crossover cable with a static IP
connection between the 2 systems, but now I have connected the systems
to a router with DHCP. The error is as follows; please help.

As usual I receive this error with the following command:
iscsiadm -m session

Apr 17 11:40:42 sunny1 iscsid: Could not get host for sid 1.
Apr 17 11:40:42 sunny1 iscsid: could not get host_no for session 6.
Apr 17 11:40:42 sunny1 iscsid: could not find session info for session1

And here is the major error: now there are no devices detected. I
tried switching to another hard disk too (thinking that it might be a
problem with the hard disk), but I get the same error again.

Apr 17 11:40:42 sunny1 iscsid: session
[iqn.2009-09.com.ubuntu:asm,192.168.2.3,3260] already running.
Apr 17 11:40:44 sunny1 kernel: [ 5661.360412] sd 2:0:0:0: [sdb]
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Apr 17 11:40:44 sunny1 kernel: [ 5661.360436] end_request: I/O error,
dev sdb, sector 0
Apr 17 11:40:44 sunny1 kernel: [ 5661.360449] __ratelimit: 2 callbacks
suppressed
Apr 17 11:40:44 sunny1 kernel: [ 5661.360457] Buffer I/O error on
device sdb, logical block 0
Apr 17 11:40:44 sunny1 kernel: [ 5661.360471] Buffer I/O error on
device sdb, logical block 1
Apr 17 11:40:44 sunny1 kernel: [ 5661.360478] Buffer I/O error on
device sdb, logical block 2
Apr 17 11:40:44 sunny1 kernel: [ 5661.360486] Buffer I/O error on
device sdb, logical block 3
Apr 17 11:40:44 sunny1 kernel: [ 5661.370342] sd 2:0:0:0: [sdb]
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Apr 17 11:40:44 sunny1 kernel: [ 5661.370359] end_request: I/O error,
dev sdb, sector 0
Apr 17 11:40:44 sunny1 kernel: [ 5661.370369] Buffer I/O error on
device sdb, logical block 0
Apr 17 11:40:44 sunny1 kernel: [ 5661.375106] sd 2:0:0:0: [sdb]
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Apr 17 11:40:44 sunny1 kernel: [ 5661.375125] end_request: I/O error,
dev sdb, sector 33554424
Apr 17 11:40:44 sunny1 kernel: [ 5661.375135] Buffer I/O error on
device sdb, logical block 4194303
Apr 17 11:40:44 sunny1 kernel: [ 5661.379983] sd 2:0:0:0: [sdb]
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Apr 17 11:40:44 sunny1 kernel: [ 5661.380022] end_request: I/O error,
dev sdb, sector 33554424
Apr 17 11:40:44 sunny1 kernel: [ 5661.380032] Buffer I/O error on
device sdb, logical block 4194303
Apr 17 11:40:44 sunny1 kernel: [ 5661.385049] sd 2:0:0:0: [sdb]
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Apr 17 11:40:44 sunny1 kernel: [ 5661.385069] end_request: I/O error,
dev sdb, sector 0
Apr 17 11:40:44 sunny1 kernel: [ 5661.385079] Buffer I/O error on
device sdb, logical block 0
Apr 17 11:40:44 sunny1 kernel: [ 5661.389754] sd 2:0:0:0: [sdb]
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Apr 17 11:40:44 sunny1 kernel: [ 5661.389773] end_request: I/O error,
dev sdb, sector 8
Apr 17 11:40:44 sunny1 kernel: [ 5661.389782] Buffer I/O error on
device sdb, logical block 1
Apr 17 11:40:44 sunny1 kernel: [ 5661.394365] sd 2:0:0:0: [sdb]
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Apr 17 11:40:44 sunny1 kernel: [ 5661.394383] end_request: I/O error,
dev sdb, sector 16
Apr 17 11:40:44 sunny1 kernel: [ 5661.394392] Buffer I/O error on
device sdb, logical block 2
Apr 17 11:40:44 sunny1 kernel: [ 5661.397706] sd 2:0:0:0: [sdb]
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Apr 17 11:40:44 sunny1 kernel: [ 5661.397722] end_request: I/O error,
dev sdb, sector 24
Apr 17 11:40:44 sunny1 kernel: [ 5661.397743] sd 2:0:0:0: [sdb]
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Apr 17 11:40:44 sunny1 kernel: [ 5661.397753] end_request: I/O error,
dev sdb, sector 0
r...@sunny1:/home/sunny#
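
One thing worth trying when the session stays up but the LUN keeps returning
DID_NO_CONNECT is a clean logout, re-discovery and login; a sketch using the
target and portal from the log above:

iscsiadm -m node -T iqn.2009-09.com.ubuntu:asm -p 192.168.2.3:3260 --logout
iscsiadm -m discovery -t sendtargets -p 192.168.2.3:3260
iscsiadm -m node -T iqn.2009-09.com.ubuntu:asm -p 192.168.2.3:3260 --login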




Re: Time of Log-In

2009-04-17 Thread jnantel

Are you aware of them fixing the Broadcom iscsi offload cards to
support Jumbo frames? We have 2 5709s we turfed because of it.

On Apr 16, 11:18 am, Mike Christie micha...@cs.wisc.edu wrote:
 Ulrich Windl wrote:
  Hello,

  while thinking about a udev/multipath timing problem with device discovery, I
  wondered how difficult it would be to record and report the time of session
  establishment (i.e. log-in). iscsiadm -m session -P 3 does not show that.
  Would that time be related to the SID?

 No.

 I can add it though. I am still busy with work stuff trying to finish
 adding bnx2i and making sure cxgb3i is ok. When that stuff gets finished
 I will work on all your and everyone else's requests more.
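
Until that exists, one rough workaround for recovering the login time after the
fact is syslog, since iscsid logs a message when a connection becomes
operational (log file name and exact wording may vary by distro and version):

grep "is operational now" /var/log/messages   # e.g. "iscsid: connection1:0 is operational now"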



Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-17 Thread jnantel

Across my SAN, on a tuned system:

ping -I eth2 -s 9000 10.1.253.48
PING 10.1.253.48 (10.1.253.48) from 10.1.253.48 eth2: 9000(9028) bytes
of data.
9008 bytes from 10.1.253.48: icmp_seq=1 ttl=64 time=0.074 ms
9008 bytes from 10.1.253.48: icmp_seq=2 ttl=64 time=0.013 ms
9008 bytes from 10.1.253.48: icmp_seq=3 ttl=64 time=0.012 ms
9008 bytes from 10.1.253.48: icmp_seq=4 ttl=64 time=0.011 ms
9008 bytes from 10.1.253.48: icmp_seq=5 ttl=64 time=0.012 ms
9008 bytes from 10.1.253.48: icmp_seq=6 ttl=64 time=0.012 ms
9008 bytes from 10.1.253.48: icmp_seq=7 ttl=64 time=0.012 ms
9008 bytes from 10.1.253.48: icmp_seq=8 ttl=64 time=0.011 ms
9008 bytes from 10.1.253.48: icmp_seq=9 ttl=64 time=0.012 ms


TSO, TCP checksum offload and things like that seem to have a big
effect on latency. If you look at how things like TSO work, their
intention is to save you CPU overhead... in my case I don't care about
overhead; I've got 24 cores.

On Apr 17, 3:25 am, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de
wrote:
 On 16 Apr 2009 at 13:59, jnantel wrote:



   FINAL RESULTS *
   First of all I'd like to thank Mike Christie for all his help. Mike, I'll be
   tapping your brain again for some read performance help.

   This is for the benefit of anyone using the Dell Equallogic PS5000XV
   PS5000E with SLES10 SP2 / Redhat 5.3 / Centos 5.3 / Oracle Linux +
   Multipath (MPIO) and open-iscsi (iscsi). Sorry about the weird
   formatting; I am making sure this gets found by people who were in
   my predicament.

  When seeing your settings, I wonder what your network latency for jumbo
  frames is (e.g. using ping). The timing depends on packet size. Here is what
  I have if everything is connected to one switch (and both ends are handling
  normal iSCSI traffic at the same time), started from Domain-0 of a
  XEN-virtualized machine that has 77 users logged on:

 # ping -s 9000 172.20.76.1
 PING 172.20.76.1 (172.20.76.1) 9000(9028) bytes of data.
 9008 bytes from 172.20.76.1: icmp_seq=1 ttl=64 time=1.90 ms
 9008 bytes from 172.20.76.1: icmp_seq=2 ttl=64 time=1.38 ms
 9008 bytes from 172.20.76.1: icmp_seq=3 ttl=64 time=1.39 ms
 9008 bytes from 172.20.76.1: icmp_seq=4 ttl=64 time=1.40 ms
 9008 bytes from 172.20.76.1: icmp_seq=5 ttl=64 time=1.56 ms
 9008 bytes from 172.20.76.1: icmp_seq=6 ttl=64 time=1.52 ms
 9008 bytes from 172.20.76.1: icmp_seq=7 ttl=64 time=1.39 ms
 9008 bytes from 172.20.76.1: icmp_seq=8 ttl=64 time=1.40 ms
 9008 bytes from 172.20.76.1: icmp_seq=9 ttl=64 time=1.55 ms
 9008 bytes from 172.20.76.1: icmp_seq=10 ttl=64 time=1.38 ms

 --- 172.20.76.1 ping statistics ---
 10 packets transmitted, 10 received, 0% packet loss, time 9000ms
 rtt min/avg/max/mdev = 1.384/1.491/1.900/0.154 ms
 # ping 172.20.76.1
 PING 172.20.76.1 (172.20.76.1) 56(84) bytes of data.
 64 bytes from 172.20.76.1: icmp_seq=1 ttl=64 time=0.253 ms
 64 bytes from 172.20.76.1: icmp_seq=2 ttl=64 time=0.214 ms
 64 bytes from 172.20.76.1: icmp_seq=3 ttl=64 time=0.223 ms
 64 bytes from 172.20.76.1: icmp_seq=4 ttl=64 time=0.214 ms
 64 bytes from 172.20.76.1: icmp_seq=5 ttl=64 time=0.215 ms
 64 bytes from 172.20.76.1: icmp_seq=6 ttl=64 time=0.208 ms
 64 bytes from 172.20.76.1: icmp_seq=7 ttl=64 time=0.270 ms
 64 bytes from 172.20.76.1: icmp_seq=8 ttl=64 time=0.313 ms

 --- 172.20.76.1 ping statistics ---
 8 packets transmitted, 8 received, 0% packet loss, time 6996ms
 rtt min/avg/max/mdev = 0.208/0.238/0.313/0.039 ms

 I think large queues are more important if the roundtrip delay is high. And
 don't forget that queue sizes are per device or session, so this uses some RAM.

 Regards,
 Ulrich



   As described in this thread, my issue was amazingly slow performance with
   sequential writes with my multipath configuration, around 35 meg/s,
   when measured with IOMETER.  First things first... THROW OUT IOMETER
   FOR LINUX, it has problems with queue depth.  With that said, with the
   default iscsi and multipath setup we saw between 60-80 meg/sec
   with multipath. In essence it was slower than a single
   interface at certain block sizes. When I was done, my write performance
   was pushing 180-190 meg/sec with blocks as small as 4k (sequential
   write test using dt).

  Here are my tweaks:

   After making any multipath changes, run multipath -F and then multipath,
   otherwise your changes won't take effect.

  /etc/multipath.conf

   device {
           vendor                 "EQLOGIC"
           product                "100E-00"
           path_grouping_policy   multibus
           getuid_callout         "/sbin/scsi_id -g -u -s /block/%n"
           features               "1 queue_if_no_path"   <--- important
           path_checker           readsector0
           failback               immediate
           path_selector          "round-robin 0"
           rr_min_io              512   <--- important, only works with large
                                              queue depth and cmds in iscsi.conf
           rr_weight              priorities
   }

  /etc/iscsi/iscsi.conf   ( restarting iscsi seems to apply the configs
  fine)

  # To control how many commands the session will queue set
  # node.session.cmds_max to an integer between 2 and 2048 that is also
  # 

Re: Time of Log-In

2009-04-17 Thread Mike Christie

jnantel wrote:
 Are you aware of them fixing the Broadcom iscsi offload cards to
 support Jumbo frames? We have 2 5709s we turfed because of it.

I have not heard anything.

 
 On Apr 16, 11:18 am, Mike Christie micha...@cs.wisc.edu wrote:
 Ulrich Windl wrote:
 Hello,
 while thinking about a udev/multipath timing problem with device discovery, I
 wondered how difficult it would be to record and report the time of session
 establishment (i.e. log-in). iscsiadm -m session -P 3 does not show that.
 Would that time be related to the SID?
 No.

 I can add it though. I am still busy with work stuff trying to finish
 adding bnx2i and making sure cxgb3i is ok. When that stuff gets finished
  I will work on all your and everyone else's requests more.
  





Advice on device recovery

2009-04-17 Thread dave

Long post, but please bear with me :)

This post is related to my previous post at:
http://groups.google.com/group/open-iscsi/browse_thread/thread/4569bc674383145a/e6c2e4320ec401f5?lnk=gstq=device+not+ready#e6c2e4320ec401f5

My situation:

I have linux initiators running open-iscsi 2.0-869 with dm-multipath,
queue_if_no_path enabled. The target is an OpenSolaris box sharing
zvols from a mirrored zpool, which means the target LUNs are virtual
devices with storage backed by the ZFS zpool.

The problem:

When one of the disks in the solaris zpool dies, ZFS halts reads/
writes to the zpool for a minute or two while it waits for the disk
controller/driver to determine if the device should be offlined. The
side effect of this is that because the iSCSI targets are virtual
devices with their data store being the ZFS zpool, the iSCSI read/
writes are also halted as long as ZFS is waiting for a device to fail.
The iSCSI targets don't disappear, they are just unable to complete
read/write ops - they still respond fine to logins and target
discovery. Once ZFS continues operation, the iSCSI devices also resume
normal operation. Since I am using multipath on the linux initiators,
the linux boxes can wait patiently for read/writes to resume, but it
seems that the scsi system does not retry TUR messages which can cause
the device to never be put back into operation on the initiator node.

The process:

- Linux initiator logs in to a iscsi target and maps it to /dev/sdc
- multipath maps /dev/sdc to /dev/mapper/xyz with queue_if_no_path
- Apps start reading/writing to /dev/mapper/xyz
- A disk in the Solaris server fails
- ZFS halts reads/writes to the zpool, also halts read/write of iSCSI
targets
- Linux reads/writes to /dev/mapper/xyz halt.
- Linux scsi layer waits for /sys/block/sdc/device/timeout seconds
before it runs the SCSI error handler (eh) code path
- scsi eh tries to abort any outstanding tasks issued to iscsi device.
This fails.
- scsi eh tries a lu reset. This fails.
- open-iscsi logs out and back into the iSCSI target. This works fine.
- scsi eh sends a TUR to the iscsi device. This fails because ZFS is
still waiting for the one device to timeout
- Solaris finally marks the device as faulted. ZFS resumes normal read/
writes from the zpool.
- Linux apps are still waiting for /dev/mapper/xyz to come back
online, but since the scsi layer only sends one TUR and never retries
if it fails, the device never comes back automatically

My questions:

- I want the linux initiators to queue or pause read/write requests up
to 24 hours and periodically (every 15-30 seconds) attempt to reset
and online the iscsi device. What is the best way to do this?
- I can extend the timeout period by setting /sys/block/sdc/device/
timeout to a larger value, but is this wise? What are the dangers of
setting this to a large value?
- I can online the device with 'echo running > /sys/block/sdc/device/
state'. This may be fine to do manually once I know ZFS has resumed
read/writes, but what if ZFS is still halted? What if I am just
blindly setting this to 'running' on each iscsi device every 15
seconds via a script (I can't imagine this would be optimal)? A sketch
of both knobs follows below.
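
A sketch of both knobs, assuming sdc as in the example above and 86400 seconds
for the 24-hour window:

cat /sys/block/sdc/device/timeout            # current SCSI command timeout in seconds
echo 86400 > /sys/block/sdc/device/timeout   # extend it to 24 hours
cat /sys/block/sdc/device/state              # e.g. running or offline
echo running > /sys/block/sdc/device/state   # force the device back online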

I want the linux box to act like it would if this were a problem with
NFS connections being disrupted. Just wait forever (up to 24 hours)
until the connection is recovered and then resume operation like
nothing happened.

Thanks in advance,

Dave



[PATCH] cxgb3i: fix ddp map overrun

2009-04-17 Thread kxie

[PATCH] cxgb3i: fix ddp map overrun

From: Karen Xie k...@chelsio.com

Fixed a bug in calculating the ddp map range when searching for free entries:
it was going beyond the end by one, thus corrupting gl_skb[0].

Signed-off-by: Karen Xie k...@chelsio.com
---

 drivers/scsi/cxgb3i/cxgb3i_ddp.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)


diff --git a/drivers/scsi/cxgb3i/cxgb3i_ddp.c b/drivers/scsi/cxgb3i/cxgb3i_ddp.c
index d06a661..43f6ece 100644
--- a/drivers/scsi/cxgb3i/cxgb3i_ddp.c
+++ b/drivers/scsi/cxgb3i/cxgb3i_ddp.c
@@ -367,7 +367,7 @@ int cxgb3i_ddp_tag_reserve(struct t3cdev *tdev, unsigned int tid,
 	}
 
 	npods = (gl->nelem + PPOD_PAGES_MAX - 1) >> PPOD_PAGES_SHIFT;
-	idx_max = ddp->nppods - npods + 1;
+	idx_max = ddp->nppods - npods;
 
 	if (ddp->idx_last == ddp->nppods)
 		idx = ddp_find_unused_entries(ddp, 0, idx_max, npods, gl);
@@ -376,7 +376,7 @@ int cxgb3i_ddp_tag_reserve(struct t3cdev *tdev, unsigned int tid,
 					      idx_max, npods, gl);
 		if (idx < 0 && ddp->idx_last >= npods)
 			idx = ddp_find_unused_entries(ddp, 0,
-						      ddp->idx_last - npods + 1,
+						      ddp->idx_last + npods - 1,
 					 	      npods, gl);
 	}
 	if (idx < 0) {




[PATCH v2] cxgb3i: fix ddp map overrun (v2)

2009-04-17 Thread kxie

[PATCH v2] cxgb3i: fix ddp map overrun (version 2)

From: Karen Xie k...@chelsio.com

Fixed a bug in calculating the ddp map range when searching for free entries:
it was going beyond the end by one, thus corrupting gl_skb[0].

Signed-off-by: Karen Xie k...@chelsio.com
---

 drivers/scsi/cxgb3i/cxgb3i_ddp.c |   32 +++-
 1 files changed, 19 insertions(+), 13 deletions(-)


diff --git a/drivers/scsi/cxgb3i/cxgb3i_ddp.c b/drivers/scsi/cxgb3i/cxgb3i_ddp.c
index d06a661..99c9125 100644
--- a/drivers/scsi/cxgb3i/cxgb3i_ddp.c
+++ b/drivers/scsi/cxgb3i/cxgb3i_ddp.c
@@ -120,20 +120,26 @@ static void clear_ddp_map(struct cxgb3i_ddp_info *ddp, unsigned int tag,
 }
 
 static inline int ddp_find_unused_entries(struct cxgb3i_ddp_info *ddp,
-					  int start, int max, int count,
+					  unsigned int start, unsigned int max,
+					  unsigned int count,
 					  struct cxgb3i_gather_list *gl)
 {
-	unsigned int i, j;
+	unsigned int i, j, k;
 
+	/* not enough entries */
+	if ((max - start) < count)
+		return -EBUSY;
+
+	max -= count;
 	spin_lock(&ddp->map_lock);
-	for (i = start; i <= max;) {
-		for (j = 0; j < count; j++) {
-			if (ddp->gl_map[i + j])
+	for (i = start; i < max;) {
+		for (j = 0, k = i; j < count; j++, k++) {
+			if (ddp->gl_map[k])
 				break;
 		}
 		if (j == count) {
-			for (j = 0; j < count; j++)
-				ddp->gl_map[i + j] = gl;
+			for (j = 0, k = i; j < count; j++, k++)
+				ddp->gl_map[k] = gl;
 			spin_unlock(&ddp->map_lock);
 			return i;
 		}
@@ -354,7 +360,7 @@ int cxgb3i_ddp_tag_reserve(struct t3cdev *tdev, unsigned int tid,
 	struct cxgb3i_ddp_info *ddp = tdev->ulp_iscsi;
 	struct pagepod_hdr hdr;
 	unsigned int npods;
-	int idx = -1, idx_max;
+	int idx = -1;
 	int err = -ENOMEM;
 	u32 sw_tag = *tagp;
 	u32 tag;
@@ -367,17 +373,17 @@ int cxgb3i_ddp_tag_reserve(struct t3cdev *tdev, unsigned int tid,
 	}
 
 	npods = (gl->nelem + PPOD_PAGES_MAX - 1) >> PPOD_PAGES_SHIFT;
-	idx_max = ddp->nppods - npods + 1;
 
 	if (ddp->idx_last == ddp->nppods)
-		idx = ddp_find_unused_entries(ddp, 0, idx_max, npods, gl);
+		idx = ddp_find_unused_entries(ddp, 0, ddp->nppods, npods, gl);
 	else {
 		idx = ddp_find_unused_entries(ddp, ddp->idx_last + 1,
-					      idx_max, npods, gl);
-		if (idx < 0 && ddp->idx_last >= npods)
+					      ddp->nppods, npods, gl);
+		if (idx < 0 && ddp->idx_last >= npods) {
 			idx = ddp_find_unused_entries(ddp, 0,
-						      ddp->idx_last - npods + 1,
+				min(ddp->idx_last + npods, ddp->nppods),
 					      npods, gl);
+		}
 	}
 	if (idx < 0) {
 		ddp_log_debug("xferlen %u, gl %u, npods %u NO DDP.\n",
