On Thu, Jan 07, 2010 at 10:05:57AM -0800, Jack Z wrote:
> Hi Pasi,
> 
> Thanks again for your reply!
> 
> > > > Try to play and experiment with these options:
> >
> > > > -B 64k (blocksize 64k, try also 4k)
> > > > -I BD (block device, direct IO (O_DIRECT))
> > > > -K 16 (16 threads, aka 16 outstanding IOs. -K 1 should be the same as 
> > > > dd)
> >
> > > > Examples:
> >
> > > > Sequential (linear) reads using blocksize 4k and 4 simultaneous 
> > > > threads, for 60 seconds:
> > > > disktest -B 4k -h 1 -I BD -K 4 -p l -P T -T 60 -r /dev/sdX
> >
> > > > Random writes:
> >
> > > > disktest -B 4k -h 1 -I BD -K 4 -p r -P T -T 60 -w /dev/sdX
> >
> > > > 30% random reads, 70% random writes:
> > > > disktest -r -w -D30:70 -K2 -E32 -B 8k -T 60 -pR -Ibd -PA /dev/md4
> >
> > > > Hopefully that helps..
> >
> > > That did help. I tried the following combinations of -B -K and -p at
> > > 20 ms RTT and the other options were -h 30 -I BD -P T -S0:(1 GB size)
> >
> > > -B 4k/64k -K 4/64 -p l
> >
> > > It seems that when I put -p l there the performance goes down
> > > drastically...
> >
> > That's really weird.. linear/sequential (-p l) should always be faster
> > than random.
> >
> > > -B 4k -K 4/64 -p r
> >
> > > The disk throughput is similar to the one I used in the previous post
> > > "disktest -w -S0:1k -B 1024 /dev/sdb " and it's much lower than dd
> > > could get.
> >
> > Like I said, weird.
> 
> I'll try to repeat more of these tests that yielded weird results.
> I'll let you know if anything new comes up. :)
> 

Yep.


> 
> > > -B 64k -K 4 -p r
> >
> > > The disk throughput is higher than the last one but still not as high
> > > as dd could get.
> >
> > > -B 64k -K 64 -p r
> >
> > > The disk throughput was boosted to 8.06 MB/s and the IOPS was 129.0.
> > > At the link layer, the traffic rate was 70.536 Mbps (the TCP baseline
> > > was 96.202 Mbps). At the same time, dd ( bs=64K count=(1 GB size)) got
> > > a throughput of 6.7 MB/s and the traffic rate on the link layer was
> > > 57.749 Mbps.
> >
> > Ok.
> >
> > 129 IOPS * 64kB = 8256 kB/sec, which pretty much matches the 8 MB/sec
> > you measured.
> >
> > this still means there was only 1 outstanding IO.. and definitely not 64 
> > (-K 64).
> 
> For this part, I did not quite understand... Following your previous
> calculations,
> 
> 1 s = 1000 ms
> 1000 / 129 = 7.75 ms/IO
> 
> And the link RTT is 20 ms.
> 
> 20/7.75 = 2.58 > 2. So there should be at least 2 outstanding IOs...
> Am I correct...?
> 

That's correct. I was wrong. I was too busy when replying to you :)
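
To put some rough numbers on it (assuming the 20 ms RTT is the bottleneck
and the disk itself can keep up): with a truly full queue of 64 outstanding
64k IOs you'd expect something on the order of

64 * 64 kB / 0.020 s = ~200 MB/sec

so the ~8 MB/sec you measured suggests the effective queue depth stays much
lower, roughly 8 MB/sec * 0.020 s / 64 kB = ~2.5 outstanding IOs on average.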

> And for the 64 outstanding IOs, I'll try more experiments and see why
> that is not happening.
> 

It could be because of the IO elevator/scheduler.. see below.

> 
> > > Although not much, it was still an improvement and it was the first
> > > improvement I have ever seen since I started my experiments! Thank you
> > > very much!
> >
> > > As for
> >
> > > > Oh, also make sure you have 'oflag=direct' for dd.
> >
> > > The result was surprisingly low again... Do you think the reason might
> > > be that I was running dd on a device file (/dev/sdb), which did not
> > > have any partitions/file systems on it?
> >
> > > Thanks a lot!
> >
> > oflag=direct makes dd use O_DIRECT, aka bypass all kernel/initiator caches 
> > for writing.
> > iflag=direct would bypass all caches for reading.
> >
> > It shouldn't matter whether you write to or read from /dev/sda1 instead of
> > /dev/sda, as long as it's a raw block device.
> > If you write/read to/from a filesystem, that obviously matters.
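
For reference, a raw-device write test with dd and direct IO would look
roughly like this (the block size and count here are just example values,
1 GB total):

dd if=/dev/zero of=/dev/sdX bs=64k count=16384 oflag=direct

and the read direction correspondingly with if=/dev/sdX of=/dev/null and
iflag=direct.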
> >
> > What kind of target are you using for this benchmark?
> 
> It is the iSCSI Enterprise Target, which came with ubuntu 9.04.
> (iscsitarget (0.4.16+svn162-3ubuntu1)).
> 
> Thank you very much!
> 

Make sure you use the 'deadline' elevator on the target machine!! This is
important, since the default 'cfq' doesn't perform well with IETD.

You can either set the kernel option 'elevator=deadline' for the target
machine in grub.conf and reboot, or you can change the setting on the fly
like this:

echo deadline > /sys/block/sdX/queue/scheduler

Do that for all the disks/devices you have in the target machine, i.e.
replace sdX with each disk.
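
If you have several disks, a small shell loop along these lines should do it
(adjust the sd* glob to match your actual devices):

for q in /sys/block/sd*/queue/scheduler; do echo deadline > "$q"; done

You can check the active scheduler with 'cat /sys/block/sdX/queue/scheduler';
the one shown in brackets is currently in use.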

Also if you're using fileio on IETD, change it to blockio.
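
For reference, a blockio LUN in /etc/ietd.conf looks roughly like this
(the target name below is just an example):

Target iqn.2010-01.com.example:storage.disk1
        Lun 0 Path=/dev/sdb,Type=blockio

and restart ietd/iscsitarget afterwards so the change takes effect.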


One more thing: on the initiator machine you should use the 'noop'
scheduler for the iSCSI disks.. so on the initiator, do this for each iSCSI disk:

echo noop > /sys/block/sdX/queue/scheduler

Then benchmark again after setting the correct schedulers/elevators on both
the target and the initiator, and blockio mode on IETD.
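
For example, rerunning the earlier sequential read test with the new settings
(sdX being the iSCSI disk on the initiator):

disktest -B 64k -h 1 -I BD -K 64 -p l -P T -T 60 -r /dev/sdX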

-- Pasi
