On Wed, Jul 6, 2011 at 11:51, Adi Kriegisch <a...@cg.tuwien.ac.at> wrote:
> Hi!
>
>> >I should have mentioned that the AoE device is backed by a RAID setup
>> >that is able to write well above 120 MB/s.
>> >If I mount the same filesystem locally, on the server, bonnie tells me
>> >it's able to do sequential writes at ~370 MB/s.
>> >
>> >If I write straight to the AoE device, I can get the expected
>> >line-speed of the network, around ~110 MB/s.
>> >dd if=/dev/zero of=/dev/etherd/e1.1 bs=1M
>> >
>> >However, when mounting a filesystem, and copying a file onto the AoE
>> >device, I only see about ~70 MB/s.
>> >
>> >This leads me to think that the performance degradation I'm seeing is
>> >related to the filesystem or the network.
>> >Of course, I wouldn't expect a filesystem to give the same performance
>> >as the raw device, but I didn't expect to see a ~25% hit in performance,
>> >especially when doing a sequential write.
>> >
>> What filesystem do you use? XFS is known to be the recommended
>> filesystem for AoE.
> Actually I think this could be due to RAID block sizes: most AoE
> implementations assume a block size of 512 bytes. If you're using Linux
> software RAID5 with the default chunk size of 512K across 4 disks, a
> single stripe holds 3*512K of data; that is what has to be rewritten
> when changing data in a file, for example.
> mkfs.ext4 and mkfs.xfs respect those block sizes, stride sizes, stripe
> widths and so on (see the man pages) when the information is available
> (which is not the case when creating a file system on an AoE device).

I'm using an LSI MegaRAID SAS 9280 RAID controller that exposes a single
block device.
The RAID itself is a RAID6 configuration, using default settings.
MegaCLI says that the virtual drive has a "Strip Size" of 64KB.
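
For reference, if I were to pass that geometry to mkfs.ext4 by hand, I
believe it would look something like this (the drive count is assumed for
illustration, say 6 drives total, i.e. 4 data drives in RAID6):

# 64KB strip / 4KB fs block = stride 16; 4 data drives -> stripe-width 64
mkfs.ext4 -E stride=16,stripe-width=64 /dev/etherd/e7.1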

The virtual device from the RAID controller is used as a physical volume
for LVM, and the exported AoE devices are LVM logical volumes cut from
this physical volume.

It seems I get the same filesystem settings whether I create the
filesystem directly on the LVM volume or on the AoE device.
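
One way to double-check that is to dump the superblock on both sides and
compare; something like:

# show any RAID geometry recorded in the ext4 superblock
tune2fs -l /dev/aoepool0/aoetest | grep -Ei 'stride|stripe'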

Creating it on the server, mkfs.ext4 says:
root@storage01:~# mkfs.ext4 /dev/aoepool0/aoetest
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
1310720 inodes, 5242880 blocks

Creating it on the client, mkfs.ext4 says:
root@xen08:/home/torbjorn# mkfs.ext4 /dev/etherd/e7.1
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
1310720 inodes, 5242880 blocks

Using both of these filesystems on the client, I end up with pretty much
the same transfer rate, about 70 MB/s.

Using it on the server, that is, mounting the filesystem on the LVM
volume directly, I get the much preferable ~370 MB/s.
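
For the record, the copy test amounts to something like this (the mount
point and size here are illustrative):

# sequential write through the filesystem; fdatasync before dd reports a rate
dd if=/dev/zero of=/mnt/aoetest/bigfile bs=1M count=8192 conv=fdatasync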

>
> Checking whether you're hit by this is quite simple: install dstat or
> iostat on the server exporting the volume. Run your benchmark and watch
> the output of dstat/iostat: if you see massive reads while writing,
> congrats, you found the root cause. To improve things a little, create
> the file system on the server that is exporting the AoE targets. To
> improve things even more -- especially with RAID5 and RAID6 -- choose a
> smaller chunk size.
>
> I'd be glad if you could post back some numbers... :-)

I have iostat running continually, and I have seen that "massive read"
problem earlier.
However, when I'm doing these tests, there is a bare minimum of reads;
it's almost all writes.

The "%util" column from iostat is mostly around  ~10%, while at some
intervals peaking towards 100%.
I'm guessing there is some cache flushing going on when I'm seeing those spikes.
This is on the server, the client chugs stably along at ~70 MB/s.
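
What I'm watching is essentially the extended per-device statistics,
refreshed every second:

# -x gives r/s, w/s, await and %util per device
iostat -x 1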

>
> On a side note: linear performance isn't what counts when using
> network storage. You'd better measure IOPS (input/output operations per
> second). I use fio for benchmarks, which lets you define your I/O
> patterns to (kind of) fit real-world usage.
>
> -- Adi
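
For reference, a minimal fio random-write job along those lines might
look like this (all parameters are illustrative, not tuned to my setup):

# 4K random writes, direct I/O, 16 requests in flight, 60 second run
fio --name=randwrite --rw=randwrite --bs=4k --size=1G --ioengine=libaio \
    --iodepth=16 --direct=1 --runtime=60 --group_reporting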



-- 
Best regards
Torbjørn Thorsen
Developer / operations technician

Trollweb Solutions AS
- Professional Magento Partner
www.trollweb.no
