Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-17 Thread jnantel

Across my SAN, on a tuned system:

ping -I eth2 -s 9000 10.1.253.48
PING 10.1.253.48 (10.1.253.48) from 10.1.253.48 eth2: 9000(9028) bytes
of data.
9008 bytes from 10.1.253.48: icmp_seq=1 ttl=64 time=0.074 ms
9008 bytes from 10.1.253.48: icmp_seq=2 ttl=64 time=0.013 ms
9008 bytes from 10.1.253.48: icmp_seq=3 ttl=64 time=0.012 ms
9008 bytes from 10.1.253.48: icmp_seq=4 ttl=64 time=0.011 ms
9008 bytes from 10.1.253.48: icmp_seq=5 ttl=64 time=0.012 ms
9008 bytes from 10.1.253.48: icmp_seq=6 ttl=64 time=0.012 ms
9008 bytes from 10.1.253.48: icmp_seq=7 ttl=64 time=0.012 ms
9008 bytes from 10.1.253.48: icmp_seq=8 ttl=64 time=0.011 ms
9008 bytes from 10.1.253.48: icmp_seq=9 ttl=64 time=0.012 ms


TSO, TCP checksum offload and things like that seem to have a big
effect on latency. If you look at how things like TSO work, their
intention is to save you CPU overhead... in my case I don't care about
overhead, I've got 24 cores.
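
For comparison, here is a rough sketch of a payload-size sweep over the same
path (interface name and target IP taken from the example above; adjust to
your own SAN ports):

# Sweep ICMP payload sizes from standard frames up to jumbo and keep only
# the rtt summary for each size.
for size in 56 1472 4000 8972; do
    echo "payload ${size} bytes:"
    ping -c 5 -q -I eth2 -s ${size} 10.1.253.48 | tail -n 2
done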

On Apr 17, 3:25 am, "Ulrich Windl" 
wrote:
> On 16 Apr 2009 at 13:59, jnantel wrote:
>
>
>
> >  FINAL RESULTS *
> > First of all I'd like to thank Mike Christie for all his help. Mike, I'll
> > be tapping your brain again for some read performance help.
>
> > This is for the benefit of anyone using the Dell Equallogic PS5000XV /
> > PS5000E with SLES10 SP2 / Redhat 5.3 / Centos 5.3 / Oracle Linux +
> > Multipath ( MPIO ) and open-iscsi ( iscsi ).  Sorry about the weird
> > formatting; it is to make sure this gets found by people who were in
> > my predicament.
>
> Seeing your settings, I wonder what your network latency for jumbo frames is
> (e.g. using ping). The timing depends on packet size. Here is what I have if
> everything is connected to one switch (and both ends are handling normal iSCSI
> traffic at the same time), started from Domain-0 of a Xen-virtualized machine
> that has 77 users logged on:
>
> # ping -s 9000 172.20.76.1
> PING 172.20.76.1 (172.20.76.1) 9000(9028) bytes of data.
> 9008 bytes from 172.20.76.1: icmp_seq=1 ttl=64 time=1.90 ms
> 9008 bytes from 172.20.76.1: icmp_seq=2 ttl=64 time=1.38 ms
> 9008 bytes from 172.20.76.1: icmp_seq=3 ttl=64 time=1.39 ms
> 9008 bytes from 172.20.76.1: icmp_seq=4 ttl=64 time=1.40 ms
> 9008 bytes from 172.20.76.1: icmp_seq=5 ttl=64 time=1.56 ms
> 9008 bytes from 172.20.76.1: icmp_seq=6 ttl=64 time=1.52 ms
> 9008 bytes from 172.20.76.1: icmp_seq=7 ttl=64 time=1.39 ms
> 9008 bytes from 172.20.76.1: icmp_seq=8 ttl=64 time=1.40 ms
> 9008 bytes from 172.20.76.1: icmp_seq=9 ttl=64 time=1.55 ms
> 9008 bytes from 172.20.76.1: icmp_seq=10 ttl=64 time=1.38 ms
>
> --- 172.20.76.1 ping statistics ---
> 10 packets transmitted, 10 received, 0% packet loss, time 9000ms
> rtt min/avg/max/mdev = 1.384/1.491/1.900/0.154 ms
> # ping 172.20.76.1
> PING 172.20.76.1 (172.20.76.1) 56(84) bytes of data.
> 64 bytes from 172.20.76.1: icmp_seq=1 ttl=64 time=0.253 ms
> 64 bytes from 172.20.76.1: icmp_seq=2 ttl=64 time=0.214 ms
> 64 bytes from 172.20.76.1: icmp_seq=3 ttl=64 time=0.223 ms
> 64 bytes from 172.20.76.1: icmp_seq=4 ttl=64 time=0.214 ms
> 64 bytes from 172.20.76.1: icmp_seq=5 ttl=64 time=0.215 ms
> 64 bytes from 172.20.76.1: icmp_seq=6 ttl=64 time=0.208 ms
> 64 bytes from 172.20.76.1: icmp_seq=7 ttl=64 time=0.270 ms
> 64 bytes from 172.20.76.1: icmp_seq=8 ttl=64 time=0.313 ms
>
> --- 172.20.76.1 ping statistics ---
> 8 packets transmitted, 8 received, 0% packet loss, time 6996ms
> rtt min/avg/max/mdev = 0.208/0.238/0.313/0.039 ms
>
> I think large queues are more important if the roundtrip delay is high. And
> don't forget that queue sizes are per device or session, so it uses some RAM.
>
> Regards,
> Ulrich
>
>
>
> > As described in this thread, my issue was amazingly slow sequential-write
> > performance with my multipath configuration, around 35 meg/s, when
> > measured with IOMETER.  First things first... THROW OUT IOMETER
> > FOR LINUX, it has problems with queue depth.  With that said, with the
> > default iscsi and multipath setup we saw between 60-80meg/sec
> > performance with multipath. In essence it was slower than a single
> > interface at certain block sizes. When I was done, my write performance
> > was pushing 180-190meg/sec with blocks as small as 4k ( sequential
> > write test using "dt").
>
> > Here are my tweaks:
>
> > After making any multipath changes do "multipath -F"  then "multipath"
> > otherwise your changes won't take effect.
>
> > /etc/multipath.conf
>
> > device {
> >         vendor "EQLOGIC"
> >         product "100E-00"
> >         path_grouping_policy multibus
> >         getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
> >         features "1 queue_if_no_path"   < --- important
> >         path_checker readsector0
> >         failback immediate
> >         path_selector "round-robin 0"
> >         rr_min_io 512 < important, only works with a large queue
> > depth and cmds_max in iscsi.conf
> >         rr_weight priorities
> > }
>
> > /etc/iscsi/iscsi.conf   ( restarting iscsi seems to apply the configs
> > fine)
>
> > # To control

Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-16 Thread Ulrich Windl

On 16 Apr 2009 at 13:59, jnantel wrote:

> 
>  FINAL RESULTS *
> First of all I'd like to thank Mike Christie for all his help. Mike, I'll
> be tapping your brain again for some read performance help.
>
> This is for the benefit of anyone using the Dell Equallogic PS5000XV /
> PS5000E with SLES10 SP2 / Redhat 5.3 / Centos 5.3 / Oracle Linux +
> Multipath ( MPIO ) and open-iscsi ( iscsi ).  Sorry about the weird
> formatting; it is to make sure this gets found by people who were in
> my predicament.

Seeing your settings, I wonder what your network latency for jumbo frames is
(e.g. using ping). The timing depends on packet size. Here is what I have if
everything is connected to one switch (and both ends are handling normal iSCSI
traffic at the same time), started from Domain-0 of a Xen-virtualized machine
that has 77 users logged on:

# ping -s 9000 172.20.76.1
PING 172.20.76.1 (172.20.76.1) 9000(9028) bytes of data.
9008 bytes from 172.20.76.1: icmp_seq=1 ttl=64 time=1.90 ms
9008 bytes from 172.20.76.1: icmp_seq=2 ttl=64 time=1.38 ms
9008 bytes from 172.20.76.1: icmp_seq=3 ttl=64 time=1.39 ms
9008 bytes from 172.20.76.1: icmp_seq=4 ttl=64 time=1.40 ms
9008 bytes from 172.20.76.1: icmp_seq=5 ttl=64 time=1.56 ms
9008 bytes from 172.20.76.1: icmp_seq=6 ttl=64 time=1.52 ms
9008 bytes from 172.20.76.1: icmp_seq=7 ttl=64 time=1.39 ms
9008 bytes from 172.20.76.1: icmp_seq=8 ttl=64 time=1.40 ms
9008 bytes from 172.20.76.1: icmp_seq=9 ttl=64 time=1.55 ms
9008 bytes from 172.20.76.1: icmp_seq=10 ttl=64 time=1.38 ms

--- 172.20.76.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9000ms
rtt min/avg/max/mdev = 1.384/1.491/1.900/0.154 ms
# ping 172.20.76.1
PING 172.20.76.1 (172.20.76.1) 56(84) bytes of data.
64 bytes from 172.20.76.1: icmp_seq=1 ttl=64 time=0.253 ms
64 bytes from 172.20.76.1: icmp_seq=2 ttl=64 time=0.214 ms
64 bytes from 172.20.76.1: icmp_seq=3 ttl=64 time=0.223 ms
64 bytes from 172.20.76.1: icmp_seq=4 ttl=64 time=0.214 ms
64 bytes from 172.20.76.1: icmp_seq=5 ttl=64 time=0.215 ms
64 bytes from 172.20.76.1: icmp_seq=6 ttl=64 time=0.208 ms
64 bytes from 172.20.76.1: icmp_seq=7 ttl=64 time=0.270 ms
64 bytes from 172.20.76.1: icmp_seq=8 ttl=64 time=0.313 ms

--- 172.20.76.1 ping statistics ---
8 packets transmitted, 8 received, 0% packet loss, time 6996ms
rtt min/avg/max/mdev = 0.208/0.238/0.313/0.039 ms

I think large queues are more important if the roundtrip delay is high. And
don't forget that queue sizes are per device or session, so it uses some RAM.

Regards,
Ulrich

> 
> As described in this thread, my issue was amazingly slow sequential-write
> performance with my multipath configuration, around 35 meg/s, when
> measured with IOMETER.  First things first... THROW OUT IOMETER
> FOR LINUX, it has problems with queue depth.  With that said, with the
> default iscsi and multipath setup we saw between 60-80meg/sec
> performance with multipath. In essence it was slower than a single
> interface at certain block sizes. When I was done, my write performance
> was pushing 180-190meg/sec with blocks as small as 4k ( sequential
> write test using "dt").
> 
> Here are my tweaks:
> 
> After making any multipath changes do "multipath -F"  then "multipath"
> otherwise your changes won't take effect.
> 
> /etc/multipath.conf
> 
> device {
>         vendor "EQLOGIC"
>         product "100E-00"
>         path_grouping_policy multibus
>         getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
>         features "1 queue_if_no_path"   < --- important
>         path_checker readsector0
>         failback immediate
>         path_selector "round-robin 0"
>         rr_min_io 512 < important, only works with a large queue
> depth and cmds_max in iscsi.conf
>         rr_weight priorities
> }
> 
> 
> /etc/iscsi/iscsi.conf   ( restarting iscsi seems to apply the configs
> fine)
> 
> # To control how many commands the session will queue set
> # node.session.cmds_max to an integer between 2 and 2048 that is also
> # a power of 2. The default is 128.
> node.session.cmds_max = 1024
> 
> # To control the device's queue depth set node.session.queue_depth
> # to a value between 1 and 128. The default is 32.
> node.session.queue_depth = 128
> 
> Other changes I've made are basic gigabit network tuning for large
> transfers and turning off some congestion functions, some scheduler
> changes (noop is amazing for sub 4k blocks but awful for 4meg chunks
> or higher). I've turned off TSO on the network cards, apparently it's
> not supported with jumbo frames and actually slows down performance.
> 
> 
> dc1stgdb14:~ # ethtool -k eth7
> Offload parameters for eth7:
> rx-checksumming: off
> tx-checksumming: off
> scatter-gather: off
> tcp segmentation offload: off
> dc1stgdb14:~ # ethtool -k eth10
> Offload parameters for eth10:
> rx-checksumming: off
> tx-checksumming: off
> scatter-gather: off
> tcp segmentation offload: off
> dc1stgdb14:~ #
> 
> 
> On Apr 13, 4:36 pm, jnantel  wrote:
> > I am having a major 

Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-16 Thread jnantel

 FINAL RESULTS *
First of all I'd like to thank Mike Christie for all his help. Mike, I'll
be tapping your brain again for some read performance help.

This is for the benefit of anyone using the Dell Equallogic PS5000XV /
PS5000E with SLES10 SP2 / Redhat 5.3 / Centos 5.3 / Oracle Linux +
Multipath ( MPIO ) and open-iscsi ( iscsi ).  Sorry about the weird
formatting; it is to make sure this gets found by people who were in
my predicament.

As described in this thread, my issue was amazingly slow sequential-write
performance with my multipath configuration, around 35 meg/s, when
measured with IOMETER.  First things first... THROW OUT IOMETER
FOR LINUX, it has problems with queue depth.  With that said, with the
default iscsi and multipath setup we saw between 60-80meg/sec
performance with multipath. In essence it was slower than a single
interface at certain block sizes. When I was done, my write performance
was pushing 180-190meg/sec with blocks as small as 4k ( sequential
write test using "dt").

Here are my tweaks:

After making any multipath changes, run "multipath -F" and then "multipath";
otherwise your changes won't take effect.

/etc/multipath.conf

device {
        vendor "EQLOGIC"
        product "100E-00"
        path_grouping_policy multibus
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
        features "1 queue_if_no_path"   < --- important
        path_checker readsector0
        failback immediate
        path_selector "round-robin 0"
        rr_min_io 512 < important, only works with a large queue
depth and cmds_max in iscsi.conf
        rr_weight priorities
}
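
A minimal sketch of the flush-and-reload cycle described above, with a
verification step (device names and output will differ per host):

multipath -F     # flush all unused multipath maps
multipath        # rebuild maps from /etc/multipath.conf
multipath -ll    # confirm the EQLOGIC maps show "round-robin 0" and
                 # "queue_if_no_path"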


/etc/iscsi/iscsi.conf   ( restarting iscsi seems to apply the configs
fine)

# To control how many commands the session will queue set
# node.session.cmds_max to an integer between 2 and 2048 that is also
# a power of 2. The default is 128.
node.session.cmds_max = 1024

# To control the device's queue depth set node.session.queue_depth
# to a value between 1 and 128. The default is 32.
node.session.queue_depth = 128
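
If you would rather push the same values into existing node records with
iscsiadm instead of editing the config file, something like the following
should work (a sketch; the IQN is a placeholder, and the session has to be
logged out and back in before the new values apply):

# Update the queue settings on the node record, then re-login the session.
TARGET=iqn.2001-05.com.equallogic:example-volume   # placeholder IQN
iscsiadm -m node -T $TARGET -o update -n node.session.cmds_max -v 1024
iscsiadm -m node -T $TARGET -o update -n node.session.queue_depth -v 128
iscsiadm -m node -T $TARGET --logout
iscsiadm -m node -T $TARGET --login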

Other changes I've made are basic gigabit network tuning for large
transfers and turning off some congestion functions, some scheduler
changes (noop is amazing for sub 4k blocks but awful for 4meg chunks
or higher). I've turned off TSO on the network cards, apparently it's
not supported with jumbo frames and actually slows down performance.


dc1stgdb14:~ # ethtool -k eth7
Offload parameters for eth7:
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp segmentation offload: off
dc1stgdb14:~ # ethtool -k eth10
Offload parameters for eth10:
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp segmentation offload: off
dc1stgdb14:~ #
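
For reference, those offloads can be switched off at runtime with ethtool -K
(capital K sets, lowercase k only queries); this is a sketch and does not
persist across reboots unless your distro scripts it (e.g. ETHTOOL_OPTIONS on
SLES):

# Turn off TSO, checksum offload and scatter-gather on both SAN ports,
# then re-check the settings.
for dev in eth7 eth10; do
    ethtool -K ${dev} tso off sg off rx off tx off
    ethtool -k ${dev}
done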


On Apr 13, 4:36 pm, jnantel  wrote:
> I am having a major issue with multipath + iscsi write performance
> with anything random or any sequential write with data sizes smaller
> than 4meg  (128k 64k 32k 16k 8k).  With 32k block size, I am able to
> get a maximum throughput of 33meg/s write.  My performance gets cut by
> a third with each smaller size, with 4k blocks giving me a whopping
> 4meg/s combined throughput.  Now bumping the data size up to 32meg
> gets me 160meg/sec throughput, and 64 gives me 190meg/s and finally to
> top it out 128meg gives me 210megabytes/sec.  My question is what
> factors would limit my performance in the 4-128k range?
>
> Some basics about my performance lab:
>
> 2 identical 1 gigabit paths (2  dual port intel pro 1000 MTs) in
> separate pcie slots.
>
> Hardware:
> 2 x Dell R900 6 quad core, 128gig ram, 2 x Dual port Intel Pro MT
> Cisco 3750s with 32gigabit stackwise interconnect
> 2 x Dell Equallogic PS5000XV arrays
> 1 x Dell Equallogic PS5000E arrays
>
> Operating systems
> SLES 10 SP2 , RHEL5 Update 3, Oracle Linux 5 update 3
>
> /etc/multipath.conf
>
> defaults {
>         udev_dir                /dev
>         polling_interval        10
>         selector                "round-robin 0"
>         path_grouping_policy    multibus
>         getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
>         prio_callout            /bin/true
>         path_checker            readsector0
>         features "1 queue_if_no_path"
>         rr_min_io               10
>         max_fds                 8192
> #       rr_weight               priorities
>         failback                immediate
> #       no_path_retry           fail
> #       user_friendly_names     yes
>
> /etc/iscsi/iscsi.conf   (non default values)
>
> node.session.timeo.replacement_timeout = 15
> node.conn[0].timeo.noop_out_interval = 5
> node.conn[0].timeo.noop_out_timeout = 30
> node.session.cmds_max = 128
> node.session.queue_depth = 32
> node.session.iscsi.FirstBurstLength = 262144
> node.session.iscsi.MaxBurstLength = 16776192
> node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
> node.conn[0].iscsi.MaxXmitDataSegmentLength = 262144
>
> discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 65536
>
> Scheduler:
>
> cat /sys/block/sdb/queue/scheduler
> [noop] anticipatory deadline cfq

Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread Bart Van Assche

On Mon, Apr 13, 2009 at 10:33 PM, Mike Christie  wrote:
> I think linux is just not so good with smaller IO sizes like 4K. I do
> not see good performance with Fibre Channel or iscsi.

Most people run a filesystem on top of a block device imported via
open-iscsi. It is well known that a filesystem performs I/O to the
underlying block device using block sizes between 4 KB and 64 KB, with
a significant fraction being 4 KB I/O's. If there was a performance
problem in Linux with regard to small block sizes, filesystem
performance in Linux would suffer. I have not yet seen statistics that
show that Linux' filesystem performance is worse than for other
operating systems. But I have already seen measurements that show the
contrary.

Bart.




Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread jnantel

iometer, 32k write, 0 read, 0 random (Equallogic is using this in their
lab)
iozone with the -I option and various settings
dd + iostat

On Apr 14, 1:57 pm, Mike Christie  wrote:
> Bart Van Assche wrote:
> > On Mon, Apr 13, 2009 at 10:33 PM, Mike Christie  
> > wrote:
> >> I think linux is just not so good with smaller IO sizes like 4K. I do
> >> not see good performance with Fibre Channel or iscsi.
>
> > Can you elaborate on the above ? I have already measured a throughput
> > of more than 60 MB/s when using the SRP protocol over an InfiniBand
> > network with a block size of 4 KB blocks, which is definitely not bad.
>
> How does that compare to Windows or Solaris?
>
> Is that a 10 gig link?
>
> What tool were you using and what command did you run? I will try to
> replicate it here and see what I get.



Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread Mike Christie

jnantel wrote:
> Well I've got some disconcerting news on this issue.  No changes at
> any level alter the 34meg/s throughput I get. I flushed multipath, blew
> away /var/lib/iscsi just in case. I also verified in /var/lib/iscsi
> the options got set. RHEL53 took my renice no problem.
> 


What were you using for the io test tool, and how did you run it?

> Some observations:
> Single interface iscsi gives me the exact same 34meg/sec
> Going with 2 interfaces it gives me 17meg/sec each interface
> Going with 4 interfaces it gives me 8meg/sec...etc..etc..etc.
> I can't seem to set node.conn[0].iscsi.MaxXmitDataSegmentLength =
> 262144 in a way that actually gets used.


We will always take what the target wants to use, so you have to 
increase it there.


> node.session.iscsi.MaxConnections = 1    can't find any docs on this,
> doubtful it is relevant.
> 
> iscsiadm -m session -P 3  still gives me the default 65536 for xmit
> segment.
> 
> The Equallogic has all its interfaces on the same SAN network, this is
> contrary to most implementations of multipath I've done. This is the
> vendor recommended deployment.
> 
> Whatever is choking performance, it's consistently choking it down to
> the same level.
> 
> 
> 
> 
> On Apr 13, 5:33 pm, Mike Christie  wrote:
>> jnantel wrote:
>>
>>> I am having a major issue with multipath + iscsi write performance
>>> with anything random or any sequential write with data sizes smaller
>>> than 4meg  (128k 64k 32k 16k 8k).  With 32k block size, I am able to
>>> get a maximum throughput of 33meg/s write.  My performance gets cut by
>>> a third with each smaller size, with 4k blocks giving me a whopping
>>> 4meg/s combined throughput.  Now bumping the data size up to 32meg
>>> gets me 160meg/sec throughput, and 64 gives me 190meg/s and finally to
>>> top it out 128meg gives me 210megabytes/sec.  My question is what
>>> factors would limit my performance in the 4-128k range?
>> I think linux is just not so good with smaller IO sizes like 4K. I do
>> not see good performance with Fibre Channel or iscsi.
>>
>> 64K+ should be fine, but you want to get lots of 64K+ IOs in flight. If
>> you run iostat or blktrace you should see more than 1 IO in flight. While
>> the test is running, if you
>> cat /sys/class/scsi_host/hostX/host_busy
>> you should also see lots of IO running.
>>
>> What limits the number of IO? On the iscsi initiator side, it could be
>> params like node.session.cmds_max or node.session.queue_depth. For a
>> decent target like the ones you have I would increase
>> node.session.cmds_max to 1024 and increase node.session.queue_depth to 128.
>>
>> What IO tool are you using? Are you doing direct IO or are you doing
>> file system IO? If you just use something like dd with bs=64K then you
>> are not going to get lots of IO running. I think you will get 1 64K IO
>> in flight, so throughput is not going to be high. If you use something
>> like disktest
>> disktest -PT -T30 -h1 -K128 -B64k -ID /dev/sdb
>>
>> you should see a lot of IOs (depends on merging).
>>
>> If you were using dd with bs=128m then that IO is going to get broken
>> down into lots of smaller IOs (probably around 256K), and so the pipe is
>> nice and full.
>>
>> Another thing I noticed in RHEL is if you increase the nice value of the
>> iscsi threads it will increase write performance sometimes. So for RHEL
>> or Oracle do
>>
>> ps -u root | grep scsi_wq
>>
>> Then match the scsi_wq_%HOST_ID with the iscsiadm -m session -P 3 Host
>> Number. And then renice the thread to -20.
>>
>> Also check the logs and make sure you do not see any conn error messages.
>>
>> And then what do you get when running the IO test to the individual
>> iscsi disks instead of the dm one? Is there any difference? You might
>> want to change the rr_min_io. If you are sending smaller IOs then
>> rr_min_io of 10 is probably too small. The path is not going to get lots
>> of nice large IOs like you would want.
>>
>>
>>
>>> Some basics about my performance lab:
>>> 2 identical 1 gigabit paths (2  dual port intel pro 1000 MTs) in
>>> separate pcie slots.
>>> Hardware:
>>> 2 x Dell R900 6 quad core, 128gig ram, 2 x Dual port Intel Pro MT
>>> Cisco 3750s with 32gigabit stackwise interconnect
>>> 2 x Dell Equallogic PS5000XV arrays
>>> 1 x Dell Equallogic PS5000E arrays
>>> Operating system
>>> SLES 10 SP2 , RHEL5 Update 3, Oracle Linux 5 update 3
>>> /etc/multipath.conf
>>> defaults {
>>>         udev_dir                /dev
>>>         polling_interval        10
>>>         selector                "round-robin 0"
>>>         path_grouping_policy    multibus
>>>         getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
>>>         prio_callout            /bin/true
>>>         path_checker            readsector0
>>>         features "1 queue_if_no_path"
>>>         rr_min_io               10
>>>         max_fds                 8192
>>> #       rr_weight               priorities
>>>         failback                immediate
>>> #   

Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread Mike Christie

Mike Christie wrote:
> Bart Van Assche wrote:
>> On Mon, Apr 13, 2009 at 10:33 PM, Mike Christie  
>> wrote:
>>> I think linux is just not so good with smaller IO sizes like 4K. I do
>>> not see good performance with Fibre Channel or iscsi.
>>
>> Can you elaborate on the above ? I have already measured a throughput
>> of more than 60 MB/s when using the SRP protocol over an InfiniBand
>> network with a block size of 4 KB blocks, which is definitely not bad.
>>
> 
> How does that compare to Windows or Solaris?
> 
> Is that a 10 gig link?
> 
> What tool were you using and what command did you run? I will try to 
> replicate it here and see what I get.
> 

Oh yeah, how many IOPs can you get with that setup?




Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread Mike Christie

Bart Van Assche wrote:
> On Mon, Apr 13, 2009 at 10:33 PM, Mike Christie  wrote:
>> I think linux is just not so good with smaller IO sizes like 4K. I do
>> not see good performance with Fibre Channel or iscsi.
> 
> Can you elaborate on the above ? I have already measured a throughput
> of more than 60 MB/s when using the SRP protocol over an InfiniBand
> network with a block size of 4 KB blocks, which is definitely not bad.
> 

How does that compare to Windows or Solaris?

Is that a 10 gig link?

What tool were you using and what command did you run? I will try to 
replicate it here and see what I get.




Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread jnantel

In addition to this, I seem to be getting the following error when I
log in:

Apr 14 08:24:54 dc1stgdb15 iscsid: received iferror -38
Apr 14 08:24:54 dc1stgdb15 last message repeated 2 times
Apr 14 08:24:54 dc1stgdb15 iscsid: connection2:0 is operational now
Apr 14 08:24:54 dc1stgdb15 iscsid: received iferror -38
Apr 14 08:24:54 dc1stgdb15 last message repeated 2 times
Apr 14 08:24:54 dc1stgdb15 iscsid: connection1:0 is operational now


On Apr 14, 1:12 pm, jnantel  wrote:
> Well I've got some disconcerting news on this issue.  No changes at
> any level alter the 34meg/s throughput I get. I flushed multipath, blew
> away /var/lib/iscsi just in case. I also verified in /var/lib/iscsi
> the options got set. RHEL53 took my renice no problem.
>
> Some observations:
> Single interface iscsi gives me the exact same 34meg/sec
> Going with 2 interfaces it gives me 17meg/sec each interface
> Going with 4 interfaces it gives me 8meg/sec...etc..etc..etc.
> I can't seem to set node.conn[0].iscsi.MaxXmitDataSegmentLength =
> 262144 in a way that actually gets used.
> node.session.iscsi.MaxConnections = 1    can't find any docs on this,
> doubtful it is relevant.
>
> iscsiadm -m session -P 3  still gives me the default 65536 for xmit
> segment.
>
> The Equallogic has all its interfaces on the same SAN network, this is
> contrary to most implementations of multipath I've done. This is the
> vendor recommended deployment.
>
> Whatever is choking performance, it's consistently choking it down to
> the same level.
>
> On Apr 13, 5:33 pm, Mike Christie  wrote:
>
> > jnantel wrote:
>
> > > I am having a major issue with multipath + iscsi write performance
> > > with anything random or any sequential write with data sizes smaller
> > > than 4meg  (128k 64k 32k 16k 8k).  With 32k block size, I am able to
> > > get a maximum throughput of 33meg/s write.  My performance gets cut by
> > > a third with each smaller size, with 4k blocks giving me a whopping
> > > 4meg/s combined throughput.  Now bumping the data size up to 32meg
> > > gets me 160meg/sec throughput, and 64 gives me 190meg/s and finally to
> > > top it out 128meg gives me 210megabytes/sec.  My question is what
> > > factors would limit my performance in the 4-128k range?
>
> > I think linux is just not so good with smaller IO sizes like 4K. I do
> > not see good performance with Fibre Channel or iscsi.
>
> > 64K+ should be fine, but you want to get lots of 64K+ IOs in flight. If
> > you run iostat or blktrace you should see more than 1 IO in flight. While
> > the test is running, if you
> > cat /sys/class/scsi_host/hostX/host_busy
> > you should also see lots of IO running.
>
> > What limits the number of IO? On the iscsi initiator side, it could be
> > params like node.session.cmds_max or node.session.queue_depth. For a
> > decent target like the ones you have I would increase
> > node.session.cmds_max to 1024 and increase node.session.queue_depth to 128.
>
> > What IO tool are you using? Are you doing direct IO or are you doing
> > file system IO? If you just use something like dd with bs=64K then you
> > are not going to get lots of IO running. I think you will get 1 64K IO
> > in flight, so throughput is not going to be high. If you use something
> > like disktest
> > disktest -PT -T30 -h1 -K128 -B64k -ID /dev/sdb
>
> > you should see a lot of IOs (depends on merging).
>
> > If you were using dd with bs=128m then that IO is going to get broken
> > down into lots of smaller IOs (probably around 256K), and so the pipe is
> > nice and full.
>
> > Another thing I noticed in RHEL is if you increase the nice value of the
> > iscsi threads it will increase write performance sometimes. So for RHEL
> > or Oracle do
>
> > ps -u root | grep scsi_wq
>
> > Then match the scsi_wq_%HOST_ID with the iscsiadm -m session -P 3 Host
> > Number. And then renice the thread to -20.
>
> > Also check the logs and make sure you do not see any conn error messages.
>
> > And then what do you get when running the IO test to the individual
> > iscsi disks instead of the dm one? Is there any difference? You might
> > want to change the rr_min_io. If you are sending smaller IOs then
> > rr_min_io of 10 is probably too small. The path is not going to get lots
> > of nice large IOs like you would want.
>
> > > Some basics about my performance lab:
>
> > > 2 identical 1 gigabit paths (2  dual port intel pro 1000 MTs) in
> > > separate pcie slots.
>
> > > Hardware:
> > > 2 x Dell R900 6 quad core, 128gig ram, 2 x Dual port Intel Pro MT
> > > Cisco 3750s with 32gigabit stackwise interconnect
> > > 2 x Dell Equallogic PS5000XV arrays
> > > 1 x Dell Equallogic PS5000E arrays
>
> > > Operating systems
> > > SLES 10 SP2 , RHEL5 Update 3, Oracle Linux 5 update 3
>
> > > /etc/multipath.conf
>
> > > defaults {
> > >         udev_dir                /dev
> > >         polling_interval        10
> > >         selector                "round-robin 0"
> > >         path_grouping_policy    multibus
> > >         get

Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-14 Thread jnantel

Well I've got some disconcerting news on this issue.  No changes at
any level alter the 34meg/s throughput I get. I flushed multipath, blew
away /var/lib/iscsi just in case. I also verified in /var/lib/iscsi
the options got set. RHEL53 took my renice no problem.

Some observations:
Single interface iscsi gives me the exact same 34meg/sec
Going with 2 interfaces it gives me 17meg/sec each interface
Going with 4 interfaces it gives me 8meg/sec...etc..etc..etc.
I can't seem to set node.conn[0].iscsi.MaxXmitDataSegmentLength =
262144 in a way that actually gets used.
node.session.iscsi.MaxConnections = 1    can't find any docs on this,
doubtful it is relevant.

iscsiadm -m session -P 3  still gives me the default 65536 for xmit
segment.

The Equallogic has all its interfaces on the same SAN network, this is
contrary to most implementations of multipath I've done. This is the
vendor recommended deployment.

Whatever is choking performance, it's consistently choking it down to
the same level.




On Apr 13, 5:33 pm, Mike Christie  wrote:
> jnantel wrote:
>
> > I am having a major issue with multipath + iscsi write performance
> > with anything random or any sequential write with data sizes smaller
> > than 4meg  (128k 64k 32k 16k 8k).  With 32k block size, I am able to
> > get a maximum throughput of 33meg/s write.  My performance gets cut by
> > a third with each smaller size, with 4k blocks giving me a whopping
> > 4meg/s combined throughput.  Now bumping the data size up to 32meg
> > gets me 160meg/sec throughput, and 64 gives me 190meg/s and finally to
> > top it out 128meg gives me 210megabytes/sec.  My question is what
> > factors would limit my performance in the 4-128k range?
>
> I think linux is just not so good with smaller IO sizes like 4K. I do
> not see good performance with Fibre Channel or iscsi.
>
> 64K+ should be fine, but you want to get lots of 64K+ IOs in flight. If
> you run iostat or blktrace you should see more than 1 IO in flight. While
> the test is running, if you
> cat /sys/class/scsi_host/hostX/host_busy
> you should also see lots of IO running.
>
> What limits the number of IO? On the iscsi initiator side, it could be
> params like node.session.cmds_max or node.session.queue_depth. For a
> decent target like the ones you have I would increase
> node.session.cmds_max to 1024 and increase node.session.queue_depth to 128.
>
> What IO tool are you using? Are you doing direct IO or are you doing
> file system IO? If you just use something like dd with bs=64K then you
> are not going to get lots of IO running. I think you will get 1 64K IO
> in flight, so throughput is not going to be high. If you use something
> like disktest
> disktest -PT -T30 -h1 -K128 -B64k -ID /dev/sdb
>
> you should see a lot of IOs (depends on merging).
>
> If you were using dd with bs=128m then that IO is going to get broken
> down into lots of smaller IOs (probably around 256K), and so the pipe is
> nice and full.
>
> Another thing I noticed in RHEL is if you increase the nice value of the
> iscsi threads it will increase write performance sometimes. So for RHEL
> or Oracle do
>
> ps -u root | grep scsi_wq
>
> Then match the scsi_wq_%HOST_ID with the iscsiadm -m session -P 3 Host
> Number. And then renice the thread to -20.
>
> Also check the logs and make sure you do not see any conn error messages.
>
> And then what do you get when running the IO test to the individual
> iscsi disks instead of the dm one? Is there any difference? You might
> want to change the rr_min_io. If you are sending smaller IOs then
> rr_min_io of 10 is probably too small. The path is not going to get lots
> of nice large IOs like you would want.
>
>
>
> > Some basics about my performance lab:
>
> > 2 identical 1 gigabit paths (2  dual port intel pro 1000 MTs) in
> > separate pcie slots.
>
> > Hardware:
> > 2 x Dell R900 6 quad core, 128gig ram, 2 x Dual port Intel Pro MT
> > Cisco 3750s with 32gigabit stackwise interconnect
> > 2 x Dell Equallogic PS5000XV arrays
> > 1 x Dell Equallogic PS5000E arrays
>
> > Operating systems
> > SLES 10 SP2 , RHEL5 Update 3, Oracle Linux 5 update 3
>
> > /etc/multipath.conf
>
> > defaults {
> >         udev_dir                /dev
> >         polling_interval        10
> >         selector                "round-robin 0"
> >         path_grouping_policy    multibus
> >         getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
> >         prio_callout            /bin/true
> >         path_checker            readsector0
> >         features "1 queue_if_no_path"
> >         rr_min_io               10
> >         max_fds                 8192
> > #       rr_weight               priorities
> >         failback                immediate
> > #       no_path_retry           fail
> > #       user_friendly_names     yes
>
> > /etc/iscsi/iscsi.conf   (non default values)
>
> > node.session.timeo.replacement_timeout = 15
> > node.conn[0].timeo.noop_out_interval = 5
> > node.conn[0].timeo.noop_out_timeout = 30

Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-13 Thread Bart Van Assche

On Mon, Apr 13, 2009 at 10:33 PM, Mike Christie  wrote:
> I think linux is just not so good with smaller IO sizes like 4K. I do
> not see good performance with Fibre Channel or iscsi.

Can you elaborate on the above ? I have already measured a throughput
of more than 60 MB/s when using the SRP protocol over an InfiniBand
network with a block size of 4 KB blocks, which is definitely not bad.

Bart.




Re: Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-13 Thread Mike Christie

jnantel wrote:
> 
> 
> I am having a major issue with multipath + iscsi write performance
> with anything random or any sequential write with data sizes smaller
> than 4meg  (128k 64k 32k 16k 8k).  With 32k block size, I am able to
> get a maximum throughput of 33meg/s write.  My performance gets cut by
> a third with each smaller size, with 4k blocks giving me a whopping
> 4meg/s combined throughput.  Now bumping the data size up to 32meg
> gets me 160meg/sec throughput, and 64 gives me 190meg/s and finally to
> top it out 128meg gives me 210megabytes/sec.  My question is what
> factors would limit my performance in the 4-128k range?

I think linux is just not so good with smaller IO sizes like 4K. I do 
not see good performance with Fibre Channel or iscsi.

64K+ should be fine, but you want to get lots of 64K+ IOs in flight. If 
you run iostat or blktrace you should see more than 1 IO in flight. While
the test is running, if you
cat /sys/class/scsi_host/hostX/host_busy
you should also see lots of IO running.
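
One way to keep an eye on that counter during a test run (a sketch; host
numbers come from iscsiadm -m session -P 3):

# Print the outstanding-command count per SCSI host once a second.
watch -n 1 'grep . /sys/class/scsi_host/host*/host_busy'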

What limits the number of IO? On the iscsi initiator side, it could be 
params like node.session.cmds_max or node.session.queue_depth. For a 
decent target like the ones you have I would increase 
node.session.cmds_max to 1024 and increase node.session.queue_depth to 128.

What IO tool are you using? Are you doing direct IO or are you doing 
file system IO? If you just use something like dd with bs=64K then you 
are not going to get lots of IO running. I think you will get 1 64K IO 
in flight, so throughput is not going to be high. If you use something 
like disktest
disktest -PT -T30 -h1 -K128 -B64k -ID /dev/sdb

you should see a lot of IOs (depends on merging).

If you were using dd with bs=128m then that IO is going to get broken 
down into lots of smaller IOs (probably around 256K), and so the pipe is 
nice and full.
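
To illustrate the same point with plain dd (a sketch, not a command from this
thread; /dev/sdb is a placeholder lab device that will be overwritten): a
single dd at bs=64k keeps roughly one IO in flight, while several direct-IO
writers in parallel keep the queue much fuller.

# Destructive test: four parallel 64k direct-IO writers to different offsets.
for i in 0 1 2 3; do
    dd if=/dev/zero of=/dev/sdb bs=64k count=16384 oflag=direct \
       seek=$((i * 16384)) &
done
wait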

Another thing I noticed in RHEL is if you increase the nice value of the 
iscsi threads it will increase write performance sometimes. So for RHEL 
or Oracle do

ps -u root | grep scsi_wq

Then match the scsi_wq_%HOST_ID with the iscsiadm -m session -P 3 Host
Number. And then renice the thread to -20.
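
Spelled out as commands, that step looks roughly like this (a sketch; the
host number and PID are placeholders, take them from iscsiadm -m session -P 3
and the ps output on your own box):

ps -u root | grep scsi_wq        # e.g. shows "scsi_wq_5" with PID 1234
renice -20 -p 1234               # raise the priority of that workqueue thread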


Also check the logs and make sure you do not see any conn error messages.

And then what do you get when running the IO test to the individual 
iscsi disks instead of the dm one? Is there any difference? You might 
want to change the rr_min_io. If you are sending smaller IOs then 
rr_min_io of 10 is probably too small. The path is not going to get lots 
of nice large IOs like you would want.



> 
> 
> Some basics about my performance lab:
> 
> 2 identical 1 gigabit paths (2  dual port intel pro 1000 MTs) in
> separate pcie slots.
> 
> Hardware:
> 2 x Dell R900 6 quad core, 128gig ram, 2 x Dual port Intel Pro MT
> Cisco 3750s with 32gigabit stackwise interconnect
> 2 x Dell Equallogic PS5000XV arrays
> 1 x Dell Equallogic PS5000E arrays
> 
> Operating systems
> SLES 10 SP2 , RHEL5 Update 3, Oracle Linux 5 update 3
> 
> 
> /etc/multipath.conf
> 
> defaults {
>         udev_dir                /dev
>         polling_interval        10
>         selector                "round-robin 0"
>         path_grouping_policy    multibus
>         getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
>         prio_callout            /bin/true
>         path_checker            readsector0
>         features "1 queue_if_no_path"
>         rr_min_io               10
>         max_fds                 8192
> #       rr_weight               priorities
>         failback                immediate
> #       no_path_retry           fail
> #       user_friendly_names     yes
> 
> /etc/iscsi/iscsi.conf   (non default values)
> 
> node.session.timeo.replacement_timeout = 15
> node.conn[0].timeo.noop_out_interval = 5
> node.conn[0].timeo.noop_out_timeout = 30
> node.session.cmds_max = 128
> node.session.queue_depth = 32
> node.session.iscsi.FirstBurstLength = 262144
> node.session.iscsi.MaxBurstLength = 16776192
> node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
> node.conn[0].iscsi.MaxXmitDataSegmentLength = 262144
> 
> discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 65536
> 
> Scheduler:
> 
> cat /sys/block/sdb/queue/scheduler
> [noop] anticipatory deadline cfq
> cat /sys/block/sdc/queue/scheduler
> [noop] anticipatory deadline cfq
> 
> 
> Command outputs:
> 
> iscsiadm -m session -P 3
> iSCSI Transport Class version 2.0-724
> iscsiadm version 2.0-868
> Target: iqn.2001-05.com.equallogic:0-8a0906-2c82dfd03-64c000cfe2249e37-
> dc1stgdb15-sas-raid6
> Current Portal: 10.1.253.13:3260,1
> Persistent Portal: 10.1.253.10:3260,1
> **
> Interface:
> **
> Iface Name: ieth1
> Iface Transport: tcp
> Iface Initiatorname: iqn.2005-04.com.linux:dc1stgdb15
> Iface IPaddress: 10.1.253.148
> Iface HWaddress: default
> Iface Netdev: e

Multipath + iscsi + SLES10 SP2 / REDHAT 5.3 / Oracle Linux 5 update 3

2009-04-13 Thread jnantel



I am having a major issue with multipath + iscsi write performance
with anything random or any sequential write with data sizes smaller
than 4meg  (128k 64k 32k 16k 8k).  With 32k block size, I am able to
get a maximum throughput of 33meg/s write.  My performance gets cut by
a third with each smaller size, with 4k blocks giving me a whopping
4meg/s combined throughput.  Now bumping the data size up to 32meg
gets me 160meg/sec throughput, and 64 gives me 190meg/s and finally to
top it out 128meg gives me 210megabytes/sec.  My question is what
factors would limit my performance in the 4-128k range?


Some basics about my performance lab:

2 identical 1 gigabit paths (2  dual port intel pro 1000 MTs) in
separate pcie slots.

Hardware:
2 x Dell R900 6 quad core, 128gig ram, 2 x Dual port Intel Pro MT
Cisco 3750s with 32gigabit stackwise interconnect
2 x Dell Equallogic PS5000XV arrays
1 x Dell Equallogic PS5000E arrays

Operating systems
SLES 10 SP2 , RHEL5 Update 3, Oracle Linux 5 update 3


/etc/multipath.conf

defaults {
        udev_dir                /dev
        polling_interval        10
        selector                "round-robin 0"
        path_grouping_policy    multibus
        getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout            /bin/true
        path_checker            readsector0
        features "1 queue_if_no_path"
        rr_min_io               10
        max_fds                 8192
#       rr_weight               priorities
        failback                immediate
#       no_path_retry           fail
#       user_friendly_names     yes

/etc/iscsi/iscsi.conf   (non default values)

node.session.timeo.replacement_timeout = 15
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 30
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
node.conn[0].iscsi.MaxXmitDataSegmentLength = 262144

discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 65536

Scheduler:

cat /sys/block/sdb/queue/scheduler
[noop] anticipatory deadline cfq
cat /sys/block/sdc/queue/scheduler
[noop] anticipatory deadline cfq
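
For reference, the noop elevator shown above can be selected at runtime per
device (a sketch; it does not survive a reboot unless set via the elevator=
boot parameter or an init script):

# Switch both iSCSI block devices to noop and confirm the active scheduler.
for dev in sdb sdc; do
    echo noop > /sys/block/${dev}/queue/scheduler
    cat /sys/block/${dev}/queue/scheduler
done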


Command outputs:

iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-724
iscsiadm version 2.0-868
Target: iqn.2001-05.com.equallogic:0-8a0906-2c82dfd03-64c000cfe2249e37-
dc1stgdb15-sas-raid6
Current Portal: 10.1.253.13:3260,1
Persistent Portal: 10.1.253.10:3260,1
**
Interface:
**
Iface Name: ieth1
Iface Transport: tcp
Iface Initiatorname: iqn.2005-04.com.linux:dc1stgdb15
Iface IPaddress: 10.1.253.148
Iface HWaddress: default
Iface Netdev: eth1
SID: 3
iSCSI Connection State: LOGGED IN
iSCSI Session State: Unknown
Internal iscsid Session State: NO CHANGE

Negotiated iSCSI params:

HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 262144
MaxXmitDataSegmentLength: 65536
FirstBurstLength: 65536
MaxBurstLength: 262144
ImmediateData: Yes
InitialR2T: No
MaxOutstandingR2T: 1

Attached SCSI devices:

Host Number: 5  State: running
scsi5 Channel 00 Id 0 Lun: 0
Attached scsi disk sdb  State: running
Current Portal: 10.1.253.12:3260,1
Persistent Portal: 10.1.253.10:3260,1
**
Interface:
**
Iface Name: ieth2
Iface Transport: tcp
Iface Initiatorname: iqn.2005-04.com.linux:dc1stgdb15
Iface IPaddress: 10.1.253.48
Iface HWaddress: default
Iface Netdev: eth2
SID: 4
iSCSI Connection State: LOGGED IN
iSCSI Session State: Unknown
Internal iscsid Session State: NO CHANGE

Negotiated iSCSI params:

HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 262144
MaxXmitDataSegmentLength: 65536
FirstBurstLength: 65536
MaxBurstLength: 262144
ImmediateData: Yes
InitialR2T: No
MaxOutstandingR2T: 1

Attached SCSI devices:
*