Just an update on this issue: it still persists.  I get 210 meg/sec on raw
device reads but only 100-110 meg/sec throughput on filesystem reads, and
I'm a bit perplexed.  As I understand it, setting the stride and aligning
sectors helps write performance, not read performance.  So I'm left with
a question: what does ext3 add to the equation for reads that might cause
this behavior?
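
As a rough sanity check on the numbers, here is a small Python sketch that applies Little's law (requests in flight = completion rate x time per request) to the sdc row of the extended iostat output quoted further down. The figures are copied from that output. The helper names are made up for illustration, and it assumes iostat's usual units (avgrq-sz in 512-byte sectors, await in milliseconds).

```python
# Figures copied from the sdc row of the extended iostat output quoted
# below (ext3 read test). Assumed units: avgrq-sz in 512-byte sectors,
# await in milliseconds.
SECTOR_BYTES = 512


def avg_request_kb(avgrq_sz_sectors):
    """Average request size in kB from iostat's avgrq-sz."""
    return avgrq_sz_sectors * SECTOR_BYTES / 1024.0


def predicted_iops(avgqu_sz, await_ms):
    """Little's law: completions/s = requests in flight / time per request."""
    return avgqu_sz / (await_ms / 1000.0)


sdc = {"r_s": 110.80, "w_s": 0.40, "rkB_s": 50215.20,
       "avgrq_sz": 903.19, "avgqu_sz": 1.00, "await_ms": 8.97}

req_kb = avg_request_kb(sdc["avgrq_sz"])
iops = predicted_iops(sdc["avgqu_sz"], sdc["await_ms"])

print("avg request size : %.1f kB" % req_kb)
print("predicted req/s  : %.1f (observed %.1f)" % (iops, sdc["r_s"] + sdc["w_s"]))
print("predicted kB/s   : %.0f (observed %.0f)" % (iops * req_kb, sdc["rkB_s"]))
```

The predicted rate comes out within about 1% of the observed one. That would be consistent with each path keeping roughly one ~450 kB request in flight at about 9 ms apiece during the filesystem test. It is only a consistency check, and it does not say which layer keeps the queue that shallow.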

The layers I can think of so far:
- Block device
- File table
- Request merging (I'd like to tune this, but I haven't found any tunable
  parameters for it)

I'm obviously missing a lot more.
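
For the block-layer side, the request-queue settings are exposed under /sys/block/<dev>/queue on 2.6 kernels. Below is a minimal Python sketch that dumps the ones most likely to matter here. The attribute list is my assumption of what is relevant, and not every attribute exists on every kernel or device type (device-mapper devices in particular may expose fewer), so missing files are reported as None.

```python
import os

# Request-queue attributes under /sys/block/<dev>/queue. Which of these
# exist depends on the kernel version and the device type.
QUEUE_ATTRS = ("read_ahead_kb", "max_sectors_kb", "max_hw_sectors_kb",
               "nr_requests", "scheduler", "nomerges")


def read_queue_settings(dev, sysfs_root="/sys/block"):
    """Return {attr: value or None} for one block device, e.g. 'sdc'."""
    settings = {}
    for attr in QUEUE_ATTRS:
        path = os.path.join(sysfs_root, dev, "queue", attr)
        try:
            with open(path) as f:
                settings[attr] = f.read().strip()
        except (IOError, OSError):
            # Attribute not present on this kernel/device, or not readable.
            settings[attr] = None
    return settings


if __name__ == "__main__":
    # Device names taken from the iostat output quoted below.
    for dev in ("sdc", "sde", "dm-0"):
        print(dev, read_queue_settings(dev))
```

Note that blockdev --setra counts in 512-byte sectors, so a readahead of 256 corresponds to read_ahead_kb = 128. Comparing the values on the underlying sd devices with those on the multipath device may show whether a setting applied to one is actually in effect on the other.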

On Apr 24, 6:02 pm, jnantel <nan...@hotmail.com> wrote:
> Donald, thanks for the reply.  This issue has me baffled. I can goof
> with the read ahead all I want but it has no effect on the performance
> with a filesystem. I must be missing a key buffer section that is
> starving my filesystem reads.
>
> Here is the output from iostat -k 5 during an artificially generated read
> (dd if=/fs/disk0/testfile of=/dev/null bs=32k count=1000000):
>
> This is reading a file residing on the ext3 filesystem on my raid6
> volume.  Keep in mind I am using multipath:
>
> Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
> sdc             115.17     52090.22         0.00     260972          0
> sdd               0.00         0.00         0.00          0          0
> sde             109.78     49694.21         0.00     248968          0
> dm-0            249.30    101784.43         0.00     509940          0
>
> The same volume, reading from the device itself (dd if=/dev/mapper/raid6
> of=/dev/null bs=32k count=1000000):
>
> Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
> sdc             418.80    106905.60         0.00     534528          0
> sdd               0.00         0.00         0.00          0          0
> sde             901.80    106950.40         0.00     534752          0
> dm-0          53452.80    213811.20         0.00    1069056          0
>
> More detail on ext3 performance:
> Device:   rrqm/s  wrqm/s     r/s   w/s     rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  svctm   %util
> sda         0.00    0.40    0.00  0.40      0.00   4.00     20.00      0.00   0.00   0.00    0.00
> sdb         0.00    0.00    0.00  0.00      0.00   0.00      0.00      0.00   0.00   0.00    0.00
> sdc        12.40    0.20  110.80  0.40  50215.20   2.40    903.19      1.00   8.97   4.81   53.52
> sdd         0.00    0.00    0.00  0.00      0.00   0.00      0.00      0.00   0.00   0.00    0.00
> sde        11.20    0.00  104.00  0.00  47205.60   0.00    907.80      0.91   8.77   4.65   48.32
> dm-0        0.00    0.00  238.40  0.60  97375.20   2.40    814.88      2.08   8.70   4.18  100.00
>
> On Apr 24, 3:14 pm, Donald Williams <don.e.willi...@gmail.com> wrote:
>
> > Have you tried increasing the disk readahead value?
> > #blockdev --setra X /dev/<multipath device>
>
> >  The default is 256.    Use --getra to see current setting.
>
> >  Setting it too high will probably hurt your database performance, since
> > database I/O tends to be random, not sequential.
>
> >  Don
>
> > On Fri, Apr 24, 2009 at 11:07 AM, jnantel <nan...@hotmail.com> wrote:
>
> > > You may recall my thread on tuning performance for writes.  Now I am
> > > attempting to squeeze as much read performance as I can from my
> > > current setup.  I've read a lot of the previous threads, and there has
> > > been mention of "miracle" settings that resolved slow reads vs
> > > writes.  Unfortunately, most posts describe the effects and not the
> > > changes that produced them.  If I were tuning for read performance in
> > > the 4k to 128k block range, what would be the best way to go about it?
>
> > > Observed behavior:
> > > - Read performance seems to be capped at 110 meg/sec
> > > - Write performance reaches upwards of 190 meg/sec
>
> > > Tuning options I'll be trying:
> > > block alignment (stride)
> > > Receiving buffers
> > > multipath min io changes
> > > iscsi cmd depth
>
> > > Hardware:
> > > 2 x Cisco 3750  with 32gig interconnect
> > > 2 x Dell R900 with 128gig ram and 1 broadcom Quad (5709) and 2 dual
> > > port intels (pro 1000/MT)
> > > 2 x Dell Equallogic PS5000XV with 15 x SAS in raid 10 config
>
> > > multipath.conf:
>
> > > device {
> > >        vendor "EQLOGIC"
> > >        product "100E-00"
> > >        path_grouping_policy multibus
> > >        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
> > >        features "1 queue_if_no_path"
> > >        path_checker readsector0
> > >        failback immediate
> > >        path_selector "round-robin 0"
> > >        rr_min_io 128
> > >        rr_weight priorities
> > > }
>
> > > iscsi settings:
>
> > > node.tpgt = 1
> > > node.startup = automatic
> > > iface.hwaddress = default
> > > iface.iscsi_ifacename = ieth10
> > > iface.net_ifacename = eth10
> > > iface.transport_name = tcp
> > > node.discovery_address = 10.1.253.10
> > > node.discovery_port = 3260
> > > node.discovery_type = send_targets
> > > node.session.initial_cmdsn = 0
> > > node.session.initial_login_retry_max = 4
> > > node.session.cmds_max = 1024
> > > node.session.queue_depth = 128
> > > node.session.auth.authmethod = None
> > > node.session.timeo.replacement_timeout = 120
> > > node.session.err_timeo.abort_timeout = 15
> > > node.session.err_timeo.lu_reset_timeout = 30
> > > node.session.err_timeo.host_reset_timeout = 60
> > > node.session.iscsi.FastAbort = Yes
> > > node.session.iscsi.InitialR2T = No
> > > node.session.iscsi.ImmediateData = Yes
> > > node.session.iscsi.FirstBurstLength = 262144
> > > node.session.iscsi.MaxBurstLength = 16776192
> > > node.session.iscsi.DefaultTime2Retain = 0
> > > node.session.iscsi.DefaultTime2Wait = 2
> > > node.session.iscsi.MaxConnections = 1
> > > node.session.iscsi.MaxOutstandingR2T = 1
> > > node.session.iscsi.ERL = 0
> > > node.conn[0].address = 10.1.253.10
> > > node.conn[0].port = 3260
> > > node.conn[0].startup = manual
> > > node.conn[0].tcp.window_size = 524288
> > > node.conn[0].tcp.type_of_service = 0
> > > node.conn[0].timeo.logout_timeout = 15
> > > node.conn[0].timeo.login_timeout = 15
> > > node.conn[0].timeo.auth_timeout = 45
> > > node.conn[0].timeo.noop_out_interval = 10
> > > node.conn[0].timeo.noop_out_timeout = 30
> > > node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
> > > node.conn[0].iscsi.HeaderDigest = None,CRC32C
> > > node.conn[0].iscsi.DataDigest = None
> > > node.conn[0].iscsi.IFMarker = No
> > > node.conn[0].iscsi.OFMarker = No
>
> > > /etc/sysctl.conf
>
> > > net.core.rmem_default= 65536
> > > net.core.rmem_max=2097152
> > > net.core.wmem_default = 65536
> > > net.core.wmem_max = 262144
> > > net.ipv4.tcp_mem= 98304 131072 196608
> > > net.ipv4.tcp_window_scaling=1
>
> > > #
> > > # Additional options for Oracle database server
> > > #ORACLE
> > > kernel.panic = 2
> > > kernel.panic_on_oops = 1
> > > net.ipv4.ip_local_port_range = 1024 65000
> > > net.core.rmem_default=262144
> > > net.core.wmem_default=262144
> > > net.core.rmem_max=524288
> > > net.core.wmem_max=524288
> > > fs.aio-max-nr=524288
>
>