On Fri, Jan 08, 2010 at 02:27:30PM +0200, Delian Krustev wrote:

> To illustrate the numbers, first the local test:
> 
> # dd if=/dev/zero of=/dev/mapper/vg0-nbd6.0 bs=1M count=1000 seek=100
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 8.18595 s, 128 MB/s
> 
> At the same time on the nearby console iostat shows:
> 
> Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
> sda             147.00         4.00     73420.00          4      73420
> sdb             141.00         8.00     70724.00          8      70724
> md8           38662.00         0.00    154648.00          0     154648
> dm-2          38662.00         0.00    154648.00          0     154648
> 
> The physical devices (sda/sdb) do about 1/2 MB per transfer operation.

Yes, that's the default maximum request size for the disks, and since
"dd" gives nice big 1M requests to the kernel, it can easily submit such
large requests to the disks.
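
For reference, the per-device limit is visible in sysfs (device names taken
from your iostat output; the exact value may differ per controller):

# cat /sys/block/sda/queue/max_sectors_kb
# cat /sys/block/sdb/queue/max_sectors_kb

Given the ~0.5 MB per transfer above, I'd expect these to print something
around 512.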

> Then do the AoE test:
> 
> # dd if=/dev/zero of=/dev/etherd/e6.0 bs=1M count=100 seek=100
> 100+0 records in
> 100+0 records out
> 104857600 bytes (105 MB) copied, 4.36242 s, 24.0 MB/s
> 
> And the iostat results:
> 
> Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
> sda            2962.00         0.00     13145.00          0      13145
> sdb            2918.00         0.00     12971.00          0      12971
> md8            7658.00         0.00     25364.00          0      25364
> dm-2           7658.00         0.00     25364.00          0      25364
> 
> So this time sda&sdb do about 4 KB per transfer operation.
[...]
> # ggaoectl stats
> # Statistics for device nbd6.0
> read_cnt: 504
> read_bytes: 516096
> read_time: 4.79039
> write_cnt: 204800
> write_bytes: 209715200
> write_time: 103.907
> other_cnt: 34
> other_time: 0.00017897
> io_slots: 58789
> io_runs: 58789
> queue_length: 2128789
> queue_stall: 0
> queue_over: 0
> ata_err: 0
> proto_err: 0
> 
> # Statistics for interface eth1
> rx_cnt: 205337
> rx_bytes: 217120220
> rx_runs: 58822
> rx_buffers_full: 0
> tx_cnt: 205338
> tx_bytes: 12832088
> tx_runs: 0
> tx_buffers_full: 0
> dropped: 0
> ignored: 0
> broadcast: 11
[...]

> From the numbers above: ( 504 + 204800 ) / 58789 = 3.49

That's basically the same ratio as rx_cnt/rx_runs. It means that on average
there were 3.49 packets in the memory-mapped ring buffer whenever ggaoed got
woken up by the kernel, and ggaoed could almost always merge them into a
single I/O request.
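
In numbers: rx_cnt / rx_runs = 205337 / 58822 = 3.49, the same ratio as
(read_cnt + write_cnt) / io_runs above.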

So request merging works nicely; it's just that with an MTU of 1500 the
average merged request is still only about 3.5 kB, less than a page. That's
very small for a modern disk.
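
To spell out where the 3.5 kB comes from: write_bytes / write_cnt =
209715200 / 204800 = 1024 bytes of data per packet (two sectors, the most
that fits in a 1500-byte frame once the AoE and ATA headers are accounted
for), and write_bytes / io_runs = 209715200 / 58789 = ~3567 bytes = ~3.5 kB
per merged request.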

> I could conclude that I'm hitting a protocol limitation which you're trying
> to work around with GGAoEd (request merging)

It's not a workaround but an optimization: request merging should happen as
high in the stack as possible, and it's certainly possible to do it at the
AoE daemon level. However, it's not a magic bullet.
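
If you want to keep an eye on the merge ratio over time, something like this
should work, given the "ggaoectl stats" output format above (a rough sketch,
assuming a single exported device):

# ggaoectl stats | awk '
    /^(read|write)_cnt:/ { pkts += $2 }
    /^io_runs:/          { runs  = $2 }
    END { printf "%.2f packets merged per I/O run\n", pkts / runs }'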

Gabor

-- 
     ---------------------------------------------------------
     MTA SZTAKI Computer and Automation Research Institute
                Hungarian Academy of Sciences
     ---------------------------------------------------------
