Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread Bart Van Assche

On Mon, Apr 13, 2009 at 10:33 PM, Mike Christie micha...@cs.wisc.edu wrote:
 I think linux is just not so good with smaller IO sizes like 4K. I do
 not see good performance with Fibre Channel or iscsi.

Can you elaborate on the above? I have already measured a throughput
of more than 60 MB/s when using the SRP protocol over an InfiniBand
network with a block size of 4 KB, which is definitely not bad.
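
For reference, a 4 KB sequential test of that kind can be reproduced with something
like the following (a sketch, assuming dd with direct I/O; the original message does
not say which tool was used, and /dev/sdX is a placeholder for the remote disk):

# ~1 GB sequential write, then read, 4 KB per request, bypassing the page cache
dd if=/dev/zero of=/dev/sdX bs=4k count=262144 oflag=direct
dd if=/dev/sdX of=/dev/null bs=4k count=262144 iflag=direct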

Bart.




Tuning iSCSI between Linux and NetAPP

2009-04-14 Thread Frank Bonnet

Hello

I'm setting up a Samba server that will use iSCSI to access
some shares on a NetApp filer (FAS 2050).

I would like to know if any of you have already built such a
configuration and whether there are any tricks to optimize it.

The Linux server is a quad-CPU HP ProLiant running
Debian Lenny; it has 16 GB of RAM.

Thanks a lot.
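
For what it's worth, getting the Debian initiator talking to the filer usually starts
with a sendtargets discovery and login along these lines (a sketch; the portal address
and target IQN below are placeholders):

iscsiadm -m discovery -t sendtargets -p 192.0.2.10:3260
iscsiadm -m node -T iqn.1992-08.com.netapp:sn.example -p 192.0.2.10:3260 --login
# Log in automatically at boot:
iscsiadm -m node -T iqn.1992-08.com.netapp:sn.example -p 192.0.2.10:3260 \
    -o update -n node.startup -v automatic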






Re: Open iSCSI Performance on IBM

2009-04-14 Thread Gonçalo Borges

Unfortunately not. I'm now trying to optimize some filesystem options
to check if that increases performance... but on the iSCSI side, I
do not know what else to optimize. I could be reaching the physical
limit of the system, but I'm not sure, since I do not have performance
numbers from any other systems to compare against.

On Apr 13, 7:00 pm, jnantel nan...@hotmail.com wrote:
 Have you made any headway with this issue? I'm having a write issue
 that seems to share some similarities with yours.

 On Apr 13, 8:14 am, Gonçalo Borges borges.gonc...@gmail.com wrote:

  Hi...

   Is /apoio04/b1 a scsi/iscsi disk or is it LVM/DM/RAID on top of a
   iscsi/scsi disk?

  /apoio04/ is a RAID1 of two disks accessible via iscsi (in the
  following tests, I changed the mount point from /apoio04/ to /iscsi04-
  lun0/ but they are exactly the same).

   Could you set the IO scheduler to noop
   echo noop > /sys/block/sdX/queue/scheduler and see if that makes a 
   difference.

  I checked the definition and I have

  [r...@core06 ~]# cat /sys/block/sdh/queue/scheduler
  noop anticipatory deadline [cfq]

  Now I've changed to

  [r...@core06 ~]# cat /sys/block/sdh/queue/scheduler
  [noop] anticipatory deadline cfq

  and I've run the tests again. This is what I got:

  [r...@core06 ~]# dd if=/dev/zero of=/iscsi04-lun0/b1 bs=64k
  count=125000
  125000+0 records in
  125000+0 records out
  8192000000 bytes (8.2 GB) copied, 470.332 seconds, 17.4 MB/s

  [r...@core06 ~]# dd if=/dev/zero of=/iscsi04-lun0/b2 bs=128k
  count=62500
  62500+0 records in
  62500+0 records out
  8192000000 bytes (8.2 GB) copied, 470.973 seconds, 17.4 MB/s

  Basically, the performance didn't increase :(

   And then also run
   iscsiadm -m session -P 3

  [r...@core06 ~]# iscsiadm -m session -P 3
  iSCSI Transport Class version 2.0-724
  iscsiadm version 2.0-868
  Target: iqn.1992-01.com.lsi:1535.600a0b80003ad11c490ade2d
          Current Portal: 10.131.2.14:3260,1
          Persistent Portal: 10.131.2.14:3260,1
                  **
                  Interface:
                  **
                  Iface Name: default
                  Iface Transport: tcp
                  Iface Initiatorname: iqn.1994-05.com.redhat:8c56e324f294
                  Iface IPaddress: 10.131.4.6
                  Iface HWaddress: default
                  Iface Netdev: default
                  SID: 37
                  iSCSI Connection State: LOGGED IN
                  iSCSI Session State: Unknown
                  Internal iscsid Session State: NO CHANGE
                  
                  Negotiated iSCSI params:
                  
                  HeaderDigest: None
                  DataDigest: None
                  MaxRecvDataSegmentLength: 131072
                  MaxXmitDataSegmentLength: 65536
                  FirstBurstLength: 8192
                  MaxBurstLength: 262144
                  ImmediateData: Yes
                  InitialR2T: Yes
                  MaxOutstandingR2T: 1
                  
                  Attached SCSI devices:
                  
                  Host Number: 38 State: running
                  scsi38 Channel 00 Id 0 Lun: 0
                  scsi38 Channel 00 Id 0 Lun: 1
                  scsi38 Channel 00 Id 0 Lun: 2
                  scsi38 Channel 00 Id 0 Lun: 3
                  scsi38 Channel 00 Id 0 Lun: 4
                  scsi38 Channel 00 Id 0 Lun: 5
                  scsi38 Channel 00 Id 0 Lun: 31
          Current Portal: 10.131.2.13:3260,1
          Persistent Portal: 10.131.2.13:3260,1
                  **
                  Interface:
                  **
                  Iface Name: default
                  Iface Transport: tcp
                  Iface Initiatorname: iqn.1994-05.com.redhat:8c56e324f294
                  Iface IPaddress: 10.131.4.6
                  Iface HWaddress: default
                  Iface Netdev: default
                  SID: 38
                  iSCSI Connection State: LOGGED IN
                  iSCSI Session State: Unknown
                  Internal iscsid Session State: NO CHANGE
                  
                  Negotiated iSCSI params:
                  
                  HeaderDigest: None
                  DataDigest: None
                  MaxRecvDataSegmentLength: 131072
                  MaxXmitDataSegmentLength: 65536
                  FirstBurstLength: 8192
                  MaxBurstLength: 262144
                  ImmediateData: Yes
                  InitialR2T: Yes
                  MaxOutstandingR2T: 1
                  
                  Attached SCSI devices:
                  
                  Host Number: 39 State: running
                  scsi39 Channel 00 Id 0 Lun: 0
                  scsi39 Channel 00 Id 0 Lun: 1
                 

Re: Tuning iSCSI between Linux and NetAPP

2009-04-14 Thread benoit plessis
First I would ask: why the hell?

The NetApp filer is a very good CIFS/SMB share server. Using it as an iSCSI
target -- which is not its primary function (NetApp filers are more NAS than
SAN) -- will only create limitations (unable to resize the volume on the fly,
unable to use WAFL attributes to store Windows security ACLs, ...) with no
visible gain.

Also, your server seems very overkill to me; I have to hope it won't be
just a Samba-to-iSCSI gateway ...

For iSCSI and NetApp in general, first make sure that you have at least
10% free space inside the volume and 10% free space inside the aggregate,
or else performance could suffer and, more importantly, you won't be able
to launch the reallocate process (defrag).

The following are the recommended NetApp iSCSI optimizations. However,
open-iscsi doesn't support multiple connections per session at the moment
(IIRC), so the best way to get parallel access is to use multipath:

iscsi.iswt.max_ios_per_session 64
iscsi.max_connections_per_session 16
iscsi.max_ios_per_session 64
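
These are filer-side settings; on the FAS console they would typically be applied
with the options command, roughly as follows (a sketch, assuming 7-mode Data ONTAP
syntax):

options iscsi.iswt.max_ios_per_session 64
options iscsi.max_connections_per_session 16
options iscsi.max_ios_per_session 64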

2009/4/14 Frank Bonnet f.bon...@esiee.fr


 Hello

 I'm setting up a Samba server that will use iSCSI to access
 some shares on a NetAPP filer ( FAS 2050 )

 I would like to know if some of you has already build such
 configuration and if there are some tricks to optimize it.

 The Linux server is a HP Proliant quad CPU and runs
 Debian Lenny, it has 16 Gb of RAM.

 Thanks a lot.



 





Re: Data Digest always None during iBFT based boot

2009-04-14 Thread Ulrich Windl

On 9 Apr 2009 at 14:11, Karen Xie wrote:

 But if the Target always requires Data digests on, the login will fail
 with a non-retryable login failure error.

I doubt whether the target will be iSCSI-compliant then. (Default is None for 
both HeaderDigest and DataDigest.)

Ulrich





Re: Tuning iSCSI between Linux and NetAPP

2009-04-14 Thread Frank Bonnet

benoit plessis wrote:
 First i would ask why the hell ?
 
 The netapp filer is a very good CIFS/SMB share server. Using it as an 
 iSCSI target -- which is not its primary
 function (NetApp filers are more NAS than SAN) -- will only create 
 limitations (unable to resize volume on the fly,
 unable to use wafl attributes to store windows security acl, ...) with 
 no visible gain ...

NetApp filers support SMB well as long as you have a M$ domain
PDC server or an AD server, which I don't.

If you use Samba as the PDC, SMB won't work natively (if you have a
solution that works, I'll be really happy to get it!)

 
  Also your server seem very overkill to me, i must hope it won't have to 
 be just a samba=iscsi interface ...

No, it's a complete server, not only disk access, with 800 M$ clients
connected to it.

 
 For iSCSI and netapp in general, first make sure that you have at least
 10% of free space inside the volume, and 10% of free space inside the 
 aggregate or else perf could
 suffer and more important you won't be able to launch the reallocate 
 process (defrag).
 
 The following is the recommended netapp/iscsi optimisations, however 
 open-iscsi doesn't support multiple
  connections per session now (iirc), so the best way to have parallel 
 access is to use multipath
 
 iscsi.iswt.max_ios_per_session 64
 iscsi.max_connections_per_session 16
 iscsi.max_ios_per_session 64
 

thank you for your help





Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread jnantel

Well, I've got some disconcerting news on this issue. No changes at
any level alter the 34 MB/s throughput I get. I flushed multipath and blew
away /var/lib/iscsi just in case. I also verified in /var/lib/iscsi that
the options got set. RHEL 5.3 took my renice with no problem.

Some observations:
Single-interface iSCSI gives me the exact same 34 MB/s.
Going with 2 interfaces gives me 17 MB/s per interface.
Going with 4 interfaces gives me 8 MB/s per interface... etc.
I can't seem to set node.conn[0].iscsi.MaxXmitDataSegmentLength =
262144 in a way that actually gets used.
node.session.iscsi.MaxConnections = 1 -- can't find any docs on this;
doubtful it is relevant.

iscsiadm -m session -P 3 still gives me the default 65536 for the xmit
segment length.

The Equallogic has all its interfaces on the same SAN network, which is
contrary to most multipath implementations I've done, but it is the
vendor-recommended deployment.

Whatever is choking performance, it's consistently choking it down to
the same level.




On Apr 13, 5:33 pm, Mike Christie micha...@cs.wisc.edu wrote:
 jnantel wrote:

  I am having a major issue with multipath + iscsi write performance
  with anything random or any sequential write with data sizes smaller
  than 4meg  (128k 64k 32k 16k 8k).  With 32k block size, I am able to
  get a maximum throughput of 33meg/s write.  My performance gets cut by
  a third with each smaller size, with 4k blocks giving me a whopping
  4meg/s combined throughput.  Now bumping the data size up to 32meg
  gets me 160meg/sec throughput, and 64 gives me 190meg/s and finally to
  top it out 128meg gives me 210megabytes/sec.  My question is what
  factors would limit my performance in the 4-128k range?

 I think linux is just not so good with smaller IO sizes like 4K. I do
 not see good performance with Fibre Channel or iscsi.

 64K+ should be fine, but you want to get lots of 64K+ IOs in flight. If
 you run iostat or blktrace you should see more than 1 IO in flight.
 While the test is running, if you
 cat /sys/class/scsi_host/hostX/host_busy
 you should also see lots of IO running.
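
A quick way to keep an eye on that while the test runs (a sketch; host38 is the host
number reported by iscsiadm -m session -P 3 earlier in this digest):

# Poll the outstanding-command count on the iSCSI host once per second.
watch -n 1 cat /sys/class/scsi_host/host38/host_busy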

 What limits the number of IO? On the iscsi initiator side, it could be
 params like node.session.cmds_max or node.session.queue_depth. For a
 decent target like the ones you have I would increase
 node.session.cmds_max to 1024 and increase node.session.queue_depth to 128.
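
Those two can be raised either in the iscsi config file before the session is created,
or on an existing node record with iscsiadm, roughly like this (a sketch; the target
IQN is a placeholder):

# Raise the command/queue limits for one node record; takes effect on the next login.
iscsiadm -m node -T iqn.2001-05.com.equallogic:example-target -o update \
    -n node.session.cmds_max -v 1024
iscsiadm -m node -T iqn.2001-05.com.equallogic:example-target -o update \
    -n node.session.queue_depth -v 128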

 What IO tool are you using? Are you doing direct IO or are you doing
 file system IO? If you just use something like dd with bs=64K then you
 are not going to get lots of IO running. I think you will get 1 64K IO
 in flight, so throughput is not going to be high. If you use something
 like disktest
 disktest -PT -T30 -h1 -K128 -B64k -ID /dev/sdb

 you should see a lot of IOs (depends on merging).

 If you were using dd with bs=128m then that IO is going to get broken
 down into lots of smaller IOs (probably around 256K), and so the pipe is
 nice and full.

 Another thing I noticed in RHEL is if you increase the nice value of the
 iscsi threads it will increase write performance sometimes. So for RHEL
 or Oracle do

 ps -u root | grep scsi_wq

 Then match the scsi_wq_%HOST_ID with the iscsiadm -m session -P 3 Host
 Number, and then renice the thread to -20.
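
Spelled out, that renice step looks something like this (a sketch; host number 38 is
borrowed from the session listing earlier in this digest):

# Find the iSCSI workqueue thread for host 38 and raise its priority.
PID=$(ps -u root -o pid=,comm= | awk '/scsi_wq_38/ {print $1}')
renice -20 -p "$PID"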

 Also check the logs and make sure you do not see any conn error messages.

 And then what do you get when running the IO test to the individual
 iscsi disks instead of the dm one? Is there any difference? You might
 want to change the rr_min_io. If you are sending smaller IOs then
 rr_min_io of 10 is probably too small. The path is not going to get lots
 of nice large IOs like you would want.



  Some basics about my performance lab:

  2 identical 1 gigabit paths (2  dual port intel pro 1000 MTs) in
  separate pcie slots.

  Hardware:
  2 x Dell R900 6 quad core, 128gig ram, 2 x Dual port Intel Pro MT
  Cisco 3750s with 32gigabit stackwise interconnect
  2 x Dell Equallogic PS5000XV arrays
  1 x Dell Equallogic PS5000E arrays

  Operating systems
  SLES 10 SP2 , RHEL5 Update 3, Oracle Linux 5 update 3

  /etc/multipath.conf

  defaults {
          udev_dir                /dev
          polling_interval        10
          selector                "round-robin 0"
          path_grouping_policy    multibus
          getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
          prio_callout            /bin/true
          path_checker            readsector0
          features                "1 queue_if_no_path"
          rr_min_io               10
          max_fds                 8192
  #       rr_weight               priorities
          failback                immediate
  #       no_path_retry           fail
  #       user_friendly_names     yes

  /etc/iscsi/iscsi.conf   (non default values)

  node.session.timeo.replacement_timeout = 15
  node.conn[0].timeo.noop_out_interval = 5
  node.conn[0].timeo.noop_out_timeout = 30
  node.session.cmds_max = 128
  node.session.queue_depth = 32
  node.session.iscsi.FirstBurstLength = 262144
  

Re: equallogic - load balancing and xfs

2009-04-14 Thread jnantel

I've had a similar issue with SLES10 SP2

This is what my multipathd config looks like now:

defaults {
        udev_dir                /dev
        polling_interval        10
        selector                "round-robin 0"
#       path_grouping_policy    multibus
#       getuid_callout          "/lib/udev/scsi_id -g -u -s /block/%n"
#       prio                    const
#       path_checker            directio
        no_path_retry           5
        features                "1 queue_if_no_path"
        rr_min_io               10
        max_fds                 8192
        rr_weight               priorities
        failback                immediate
#       no_path_retry           5
        user_friendly_names     no
}

no_path_retry has to be commented out in order for the
queue_if_no_path to work

I've also played with the noop_out timeouts:

node.conn[0].timeo.noop_out_interval = 10
node.conn[0].timeo.noop_out_timeout = 30

On Apr 13, 9:23 pm, Matthew Kent m...@bravenet.com wrote:
 On Mon, 2009-04-13 at 17:28 -0500, Mike Christie wrote:
  Matthew Kent wrote:
   On Mon, 2009-04-13 at 15:44 -0500, Mike Christie wrote:
   Matthew Kent wrote:
   Can anyone suggest a timeout I might be hitting or a setting I'm
   missing?

   The run down:

   - EqualLogic target
   - CentOS 5.2 client
   You will want to upgrade that to 5.3 when you can. The iscsi code in
   there fixes a bug where the initiator dropped the session when it should
   not.

   Will do, probably Wednesday night and we'll see if this goes away. I'll
   be sure to follow up for the archives.

   - xfs  lvm  iscsi

   During a period of high load the EqualLogic decides to load balance:

    INFO  4/13/09  12:08:29 AM  eql3    iSCSI session to target
   '20.20.20.31:3260,
   iqn.2001-05.com.equallogic:0-8a0906-b7f6d3801-2b2000d0f5347d9a-foo' from
   initiator '20.20.20.92:51274, iqn.1994-05.com.redhat:a62ba20db72' was
   closed.   Load balancing request was received on the array.  

   So is this what you get in the EQL log when it decides to load balance
   the initiator and send us to a different portal?

   Yes, a straight copy from event log in the java web interface.

    INFO  4/13/09  12:08:31 AM  eql3    iSCSI login to target
   '20.20.20.32:3260,
   iqn.2001-05.com.equallogic:0-8a0906-b7f6d3801-2b2000d0f5347d9a-foo' from
   initiator '20.20.20.92:44805, iqn.1994-05.com.redhat:a62ba20db72'
   successful, using standard frame length.  

   on the client see I get:

   Apr 13 00:08:29 moo kernel: [4576850.161324] sd 5:0:0:0: SCSI error:
   return code = 0x0002

   Apr 13 00:08:29 moo kernel: [4576850.161330] end_request: I/O error, dev
   sdc, sector 113287552

   Apr 13 00:08:32 moo kernel: [4576852.470879] I/O error in filesystem
   (dm-10) meta-data dev dm-10 block 0x6c0a000
   Are you using dm-multipath over iscsi? Does this load balance issue
   affect all the paths at the same time? What is your multipath
   no_path_retry value? I think you might want to set that higher to avoid
   the FS from getting IO errors at this time if all paths are affected at
   the same time.

   Not using multipath on this one.

  Do you have xfs on sdc or is there something like LVM or RAID on top of sdc?

  That is really strange then. 0x0002 is DID_BUS_BUSY. The iscsi
  initiator layer would return this when the target does its load
  balancing. The initiator does this to ask he scsi layer to retry the IO.
  If dm-multipath was used then it is failed to the multipath layer right
  away. If dm-multipath is not used then we get 5 retries so we should not
  see the error if there was only the one rebalancing at the time. If
  there was a bunch of load rebalancing within a couple minutes then it
  makes sense.

 Yeah xfs on top of lvm, no multipath.

 Logs only show the one load balancing request around that time.

 Funny thing is this system, and the load balancing etc, has been going
 error free for months now, but the last couple days it's flared up right
 around the time of some log rotation and heavy i/o.

 We'll see what happens after the centos 5.3 upgrade. We'll also be
 upgrading the firmware on all the equallogics to the latest version.
 --
 Matthew Kent \ SA \ bravenet.com



Re: Tuning iSCSI between Linux and NetAPP

2009-04-14 Thread jnantel

I use a netapp filer...where are these values set? Host or Array?

iscsi.iswt.max_ios_per_session 64
iscsi.max_connections_per_session 16
iscsi.max_ios_per_session 64


On Apr 14, 6:40 am, benoit plessis plessis.ben...@gmail.com wrote:
 First i would ask why the hell ?

 The netapp filer is a very good CIFS/SMB share server. Using it as an iSCSI
 target -- which is not its primary
 function (NetApp filers are more NAS than SAN) -- will only create
 limitations (unable to resize volume on the fly,
 unable to use wafl attributes to store windows security acl, ...) with no
 visible gain ...

  Also your server seem very overkill to me, i must hope it won't have to be
 just a samba=iscsi interface ...

 For iSCSI and netapp in general, first make sure that you have at least
 10% of free space inside the volume, and 10% of free space inside the
 aggregate or else perf could
 suffer and more important you won't be able to launch the reallocate process
 (defrag).

 The following is the recommended netapp/iscsi optimisations, however
 open-iscsi doesn't support multiple
  connections per session now (iirc), so the best way to have parallel access
 is to use multipath

 iscsi.iswt.max_ios_per_session 64
 iscsi.max_connections_per_session 16
 iscsi.max_ios_per_session    64

 2009/4/14 Frank Bonnet f.bon...@esiee.fr



  Hello

  I'm setting up a Samba server that will use iSCSI to access
  some shares on a NetAPP filer ( FAS 2050 )

  I would like to know if some of you has already build such
  configuration and if there are some tricks to optimize it.

  The Linux server is a HP Proliant quad CPU and runs
  Debian Lenny, it has 16 Gb of RAM.

  Thanks a lot.





Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread Mike Christie

Bart Van Assche wrote:
 On Mon, Apr 13, 2009 at 10:33 PM, Mike Christie micha...@cs.wisc.edu wrote:
 I think linux is just not so good with smaller IO sizes like 4K. I do
 not see good performance with Fibre Channel or iscsi.
 
 Can you elaborate on the above ? I have already measured a throughput
 of more than 60 MB/s when using the SRP protocol over an InfiniBand
 network with a block size of 4 KB blocks, which is definitely not bad.
 

How does that compare to Windows or Solaris?

Is that a 10 gig link?

What tool were you using and what command did you run? I will try to 
replicate it here and see what I get.




Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread Mike Christie

jnantel wrote:
 Well I've got some disconcerting news on this issue.  No changes at
 any level alter the 34/meg throughput I get. I flushed multipath, blew
 away /var/lib/iscsi just in case. I also verified in /var/lib/iscsi
 the options got set. RHEL53 took my renice no problem.
 


What were you using for the io test tool, and how did you run it?

 Some observations:
 Single interface iscsi gives me the exact same 34meg/sec
 Going with 2 interfaces it gives me 17meg/sec each interface
 Going with 4 interfaces it gives me 8meg/sec...etc..etc..etc.
 I can't seem to set node.conn[0].iscsi.MaxXmitDataSegmentLength =
 262144 in a way that actually gets used.


We will always take what the target wants to use, so you have to 
increase it there.
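
In other words, the initiator's MaxXmitDataSegmentLength is capped by whatever
MaxRecvDataSegmentLength the target advertises; a quick way to see what was actually
negotiated (a sketch):

iscsiadm -m session -P 3 | grep -E 'MaxRecvDataSegmentLength|MaxXmitDataSegmentLength'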


 node.session.iscsi.MaxConnections = 1 -- can't find any docs on this,
 doubtful it is relevant.
 
 iscsiadm -m session -P 3  still gives me the default 65536 for xmit
 segment.
 
 The Equallogic has all its interfaces on the same SAN network, this is
 contrary to most implementations of multipath I've done. This is the
 vendor recommended deployment.
 
 Whatever is choking performance its consistently choking it down to
 the same level.
 
 
 
 
 On Apr 13, 5:33 pm, Mike Christie micha...@cs.wisc.edu wrote:
 jnantel wrote:

 I am having a major issue with multipath + iscsi write performance
 with anything random or any sequential write with data sizes smaller
 than 4meg  (128k 64k 32k 16k 8k).  With 32k block size, I am able to
 get a maximum throughput of 33meg/s write.  My performance gets cut by
 a third with each smaller size, with 4k blocks giving me a whopping
 4meg/s combined throughput.  Now bumping the data size up to 32meg
 gets me 160meg/sec throughput, and 64 gives me 190meg/s and finally to
 top it out 128meg gives me 210megabytes/sec.  My question is what
 factors would limit my performance in the 4-128k range?
 I think linux is just not so good with smaller IO sizes like 4K. I do
 not see good performance with Fibre Channel or iscsi.

 64K+ should be fine, but you want to get lots of 64K+ IOs in flight. If
 you run iostat or blktrace you should see more than 1 IO in flight.
 While the test is running, if you
 cat /sys/class/scsi_host/hostX/host_busy
 you should also see lots of IO running.

 What limits the number of IO? On the iscsi initiator side, it could be
 params like node.session.cmds_max or node.session.queue_depth. For a
 decent target like the ones you have I would increase
 node.session.cmds_max to 1024 and increase node.session.queue_depth to 128.

 What IO tool are you using? Are you doing direct IO or are you doing
 file system IO? If you just use something like dd with bs=64K then you
 are not going to get lots of IO running. I think you will get 1 64K IO
 in flight, so throughput is not going to be high. If you use something
 like disktest
 disktest -PT -T30 -h1 -K128 -B64k -ID /dev/sdb

 you should see a lot of IOs (depends on merging).

 If you were using dd with bs=128m then that IO is going to get broken
 down into lots of smaller IOs (probably around 256K), and so the pipe is
 nice and full.

 Another thing I noticed in RHEL is if you increase the nice value of the
 iscsi threads it will increase write performance sometimes. So for RHEL
 or Oracle do

 ps -u root | grep scsi_wq

 Then match the scsi_wq_%HOST_ID with the iscsiadm -m session -P 3 Host
 Number, and then renice the thread to -20.

 Also check the logs and make sure you do not see any conn error messages.

 And then what do you get when running the IO test to the individual
 iscsi disks instead of the dm one? Is there any difference? You might
 want to change the rr_min_io. If you are sending smaller IOs then
 rr_min_io of 10 is probably too small. The path is not going to get lots
 of nice large IOs like you would want.



 Some basics about my performance lab:
 2 identical 1 gigabit paths (2  dual port intel pro 1000 MTs) in
 separate pcie slots.
 Hardware:
 2 x Dell R900 6 quad core, 128gig ram, 2 x Dual port Intel Pro MT
 Cisco 3750s with 32gigabit stackwise interconnect
 2 x Dell Equallogic PS5000XV arrays
 1 x Dell Equallogic PS5000E arrays
 Operating system
 SLES 10 SP2 , RHEL5 Update 3, Oracle Linux 5 update 3
 /etc/multipath.conf
 defaults {
         udev_dir                /dev
         polling_interval        10
         selector                "round-robin 0"
         path_grouping_policy    multibus
         getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
         prio_callout            /bin/true
         path_checker            readsector0
         features                "1 queue_if_no_path"
         rr_min_io               10
         max_fds                 8192
 #       rr_weight               priorities
         failback                immediate
 #       no_path_retry           fail
 #       user_friendly_names     yes
 /etc/iscsi/iscsi.conf   (non default values)
 node.session.timeo.replacement_timeout = 15
 node.conn[0].timeo.noop_out_interval = 5
 

Re: Tuning iSCSI between Linux and NetAPP

2009-04-14 Thread Bart Van Assche

On Tue, Apr 14, 2009 at 7:09 PM, Dmitry Yusupov dmitry_...@yahoo.com wrote:
 NexentaStor also is very good candidate for CIFS workgroups/AD
 environments with the whole SMB stack implemented in the kernel, which
 boosts performance over the top. And as far as iSCSI target - I would
 recommend to use COMSTAR, which is ZFS integrated in Nexenta [2].

Do you have any performance numbers available for the combination of
open-iscsi and the NexentaStor target?

Bart.




Re: Low IOPs for certain block sizes

2009-04-14 Thread Vladislav Bolkhovitin

Hello,

Bart Van Assche, on 04/12/2009 10:09 PM wrote:
 Hello,
 
 While running iSCSI performance tests I noticed that the performance
 for certain block sizes deviated significantly (more than ten times)
 from the performance for other block sizes, both larger and smaller.
 This surprised me.
 
 The test I ran was as follows:
 * A file of 1 GB residing on a tmpfs filesystem was exported via iSCSI
 target software. The test has been repeated with both SCST and STGT.
 * On the initiator system open-iscsi version 2.0.870 was used for
 performing reads and writes with dd via direct I/O. Read-ahead was set
 to zero.
 * Both systems were running kernel 2.6.29.1 in run level 3 (no X
 server) and the 1 GbE interfaces in the two systems were connected via
 a crossed cable. The MTU has been left to its default value, 1500
 bytes. Netperf reported a throughput of 600 Mbit/s = 75 MB/s for the
 TCP/IP stream test on this setup.
 * 128 MB of data has been transferred during each test.
 * Each measurement has been repeated three times.
 * All caches were flushed before each test.
 * The ratio of standard deviation to average was 2% or lower for all
 measurements.
 * The measurement result are as follows (transfer speeds in MB/s):
 
   Block    SCST    STGT    SCST    STGT
    size  writing writing reading reading
  ------- ------- ------- ------- -------
   64 MB    71.7    63.3    62.1    58.4
   32 MB    71.9    63.4    61.7    58.1
   16 MB    72.4    63.0    61.7    57.1
    8 MB    72.7    63.3    61.7    56.9
    4 MB    72.9    63.5    61.3    57.0
    2 MB    72.8    59.5    60.3    56.9
    1 MB    72.1    38.7    59.4    56.0
  512 KB    67.3    21.4    58.0    54.4
  256 KB    67.4    22.8    55.5    53.4
  128 KB    60.9    22.6    53.3    51.7
   64 KB    53.2    22.2    53.0    45.7
   32 KB    48.9    21.6    40.0    40.0
   16 KB    40.0    20.8     0.6     1.3
    8 KB    20.0    19.9    19.9    20.0
    4 KB     0.6     1.6    18.9    10.3
 
 All results look normal to me, except the write throughput for a block
 size of 4 KB and the read throughput for a block size of 16 KB.
 
 Regarding CPU load: during the 4 KB write test, the CPU load was 0.9
 on the initiator system and 0.1 on the target.

I would suggest making sure that any hardware interrupt coalescing is
disabled on both hosts. You can check that by using ethtool -c.
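
Checking, and if necessary disabling, interrupt coalescing would look roughly like
this (a sketch; eth0 is a placeholder and not every NIC driver accepts every -C
parameter):

ethtool -c eth0                                          # show current coalescing settings
ethtool -C eth0 adaptive-rx off rx-usecs 0 rx-frames 1   # hypothetical values: disable RX coalescing
ethtool -C eth0 adaptive-tx off tx-usecs 0 tx-frames 1   # hypothetical values: disable TX coalescing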

 Has anyone observed similar behavior before ?
 
 Bart.
 
  
 





Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread jnantel

iometer: 32k blocks, 0% read (all writes), 0% random -- Equallogic is using
this in their lab
iozone with the -I option and various settings
dd + iostat
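
For reference, a direct-I/O iozone run along those lines might look like this (a
sketch; the mount point and sizes are placeholders):

# 32 KB records, 1 GB file, write/rewrite (-i 0) and read/reread (-i 1), O_DIRECT (-I).
iozone -I -r 32k -s 1g -i 0 -i 1 -f /iscsi-mount/testfile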

On Apr 14, 1:57 pm, Mike Christie micha...@cs.wisc.edu wrote:
 Bart Van Assche wrote:
  On Mon, Apr 13, 2009 at 10:33 PM, Mike Christie micha...@cs.wisc.edu 
  wrote:
  I think linux is just not so good with smaller IO sizes like 4K. I do
  not see good performance with Fibre Channel or iscsi.

  Can you elaborate on the above ? I have already measured a throughput
  of more than 60 MB/s when using the SRP protocol over an InfiniBand
  network with a block size of 4 KB blocks, which is definitely not bad.

 How does that compare to Windows or Solaris?

 Is that a 10 gig link?

 What tool were you using and what command did you run? I will try to
 replicate it here and see what I get.



[PATCH] bind offloaded connection to port

2009-04-14 Thread Mike Christie
Hey offload guys,

If we are using an offload card, then iface_set_param will match the 
iface info to a scsi_host and pass that info down to set up the net 
settings of the port (currently we just set the IP address). When we 
create the tcp/ip connection by calling ep_connect, we currently just go 
by the routing table info.

I think there are two problems with this.

1. Some drivers do not have access to a routing table. Some drivers like 
qla4xxx do not even know about other ports.

2. If you have two initiator ports on the same subnet, the user may have 
set things up so that session1 was supposed to run through port1 and 
session2 was supposed to run through port2. It looks like we could 
end up with both sessions going through one of the ports.

Also, how do you edit the routing table for the offload cards? You cannot 
use normal net tools like route, can you?

3. If we set up hostA in the iface_set_param step, but then the routing 
info leads us to hostB, we are stuck.


I did the attached patches to fix this. Basically we just pass down the 
scsi host we want to go through. Well, ok I began to fix this :) For 
qla4xxx or serverengines I think this will work fine.

For bnx2i and cxgb3i, I am not sure. See the TODO and note in cxgb3i in 
kern-ep-connect-through-host.patch. bnx2i guys, you guys do something 
similar so will this work? In ep_connect can I control which host/port 
to use?

The patches were made against my iscsi trees. The kernel one was made 
over the iscsi branch, and that was just updated, so you might want to 
reclone.

The userspace one was made over the open-iscsi git tree head.


diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 75223f5..ffbe0c7 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -517,7 +517,8 @@ iscsi_iser_conn_get_stats(struct iscsi_cls_conn *cls_conn, struct iscsi_stats *s
 }
 
 static struct iscsi_endpoint *
-iscsi_iser_ep_connect(struct sockaddr *dst_addr, int non_blocking)
+iscsi_iser_ep_connect(struct Scsi_Host *shost, struct sockaddr *dst_addr,
+ int non_blocking)
 {
int err;
struct iser_conn *ib_conn;
diff --git a/drivers/scsi/cxgb3i/cxgb3i.h b/drivers/scsi/cxgb3i/cxgb3i.h
index 59b0958..e3133b5 100644
--- a/drivers/scsi/cxgb3i/cxgb3i.h
+++ b/drivers/scsi/cxgb3i/cxgb3i.h
@@ -144,7 +144,6 @@ struct cxgb3i_adapter *cxgb3i_adapter_find_by_tdev(struct t3cdev *);
 void cxgb3i_adapter_open(struct t3cdev *);
 void cxgb3i_adapter_close(struct t3cdev *);
 
-struct cxgb3i_hba *cxgb3i_hba_find_by_netdev(struct net_device *);
 struct cxgb3i_hba *cxgb3i_hba_host_add(struct cxgb3i_adapter *,
   struct net_device *);
 void cxgb3i_hba_host_remove(struct cxgb3i_hba *);
diff --git a/drivers/scsi/cxgb3i/cxgb3i_iscsi.c b/drivers/scsi/cxgb3i/cxgb3i_iscsi.c
index 9212400..f423c49 100644
--- a/drivers/scsi/cxgb3i/cxgb3i_iscsi.c
+++ b/drivers/scsi/cxgb3i/cxgb3i_iscsi.c
@@ -178,7 +178,7 @@ void cxgb3i_adapter_close(struct t3cdev *t3dev)
  * cxgb3i_hba_find_by_netdev - find the cxgb3i_hba structure via net_device
  * @t3dev: t3cdev adapter
  */
-struct cxgb3i_hba *cxgb3i_hba_find_by_netdev(struct net_device *ndev)
+static struct cxgb3i_hba *cxgb3i_hba_find_by_netdev(struct net_device *ndev)
 {
struct cxgb3i_adapter *snic;
int i;
@@ -261,12 +261,14 @@ void cxgb3i_hba_host_remove(struct cxgb3i_hba *hba)
 
 /**
  * cxgb3i_ep_connect - establish TCP connection to target portal
+ * @shost: scsi host to use
  * @dst_addr:  target IP address
  * @non_blocking:  blocking or non-blocking call
  *
  * Initiates a TCP/IP connection to the dst_addr
  */
-static struct iscsi_endpoint *cxgb3i_ep_connect(struct sockaddr *dst_addr,
+static struct iscsi_endpoint *cxgb3i_ep_connect(struct Scsi_Host *shost,
+   struct sockaddr *dst_addr,
int non_blocking)
 {
struct iscsi_endpoint *ep;
@@ -275,6 +277,13 @@ static struct iscsi_endpoint *cxgb3i_ep_connect(struct sockaddr *dst_addr,
struct s3_conn *c3cn = NULL;
int err = 0;
 
+	if (!shost) {
+		cxgb3i_log_error("Cannot connect. Missing host. Check that "
+				 "have the current iscsi tools.\n");
+		err = -EINVAL;
+		goto release_conn;
+	}
+
c3cn = cxgb3i_c3cn_create();
if (!c3cn) {