On Wed, Jul 6, 2011 at 11:51, Adi Kriegisch <a...@cg.tuwien.ac.at> wrote:
> Hi!
>
>> > I should have mentioned that the AoE device is backed by a RAID setup
>> > that is able to write well above 120 MB/s.
>> > If I mount the same filesystem locally, on the server, bonnie tells me
>> > it's able to do sequential writes at ~370 MB/s.
>> >
>> > If I write straight to the AoE device, I can get the expected
>> > line-speed of the network, around ~110 MB/s:
>> > dd if=/dev/zero of=/dev/etherd/e1.1 bs=1M
>> >
>> > However, when mounting a filesystem and copying a file onto the AoE
>> > device, I only see about ~70 MB/s.
>> >
>> > This leads me to think that the performance degradation I'm seeing
>> > is related to the filesystem or the network.
>> > Of course, I wouldn't expect a filesystem to give the same performance
>> > as the raw device, but I didn't expect to see a ~25% hit in
>> > performance, especially when doing a sequential write.
>> >
>> What filesystem do you use? XFS is known to be the recommended
>> filesystem for AoE.
>
> Actually I think this could be due to RAID block sizes: most AoE
> implementations assume a block size of 512 bytes. If you're using a
> Linux software RAID5 with a default chunk size of 512K and you're using
> 4 disks, a single "block" has 3*512K block size. This is what has to be
> written when changing data in a file, for example.
> mkfs.ext4 or mkfs.xfs respects those block sizes, stride sizes, stripe
> widths and so on (see the man pages) when the information is available
> (which is not the case when creating a filesystem on an AoE device).
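For reference, the stride/stripe-width arithmetic Adi is describing can be
sketched like this; the numbers are only an illustration (the data-disk
count below is hypothetical -- for RAID6 it would be total disks minus two,
and you'd check your own array with MegaCLI or mdadm):

```shell
# Sketch of the mkfs.ext4 stride / stripe-width math.
# STRIP_KB matches the 64 KB "Strip Size" from this thread; DATA_DISKS
# is an assumption (an 8-disk RAID6 would have 8 - 2 = 6 data disks).
STRIP_KB=64        # per-disk strip ("chunk") size reported by the controller
BLOCK_KB=4         # ext4 block size
DATA_DISKS=6       # hypothetical data-disk count

STRIDE=$((STRIP_KB / BLOCK_KB))          # filesystem blocks per strip
STRIPE_WIDTH=$((STRIDE * DATA_DISKS))    # filesystem blocks per full stripe

echo "stride=$STRIDE stripe-width=$STRIPE_WIDTH"
# mkfs.ext4 can't detect these through AoE, so they'd be passed by hand,
# along the lines of:
#   mkfs.ext4 -E stride=$STRIDE,stripe-width=$STRIPE_WIDTH /dev/etherd/e1.1
```

With a full stripe's worth of blocks laid out this way, the filesystem can
batch writes into whole stripes and avoid the read-modify-write penalty.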
I'm using an LSI MegaRAID SAS 9280 RAID controller that exposes a single
block device. The RAID itself is a RAID6 configuration, using default
settings. MegaCLI says the virtual drive has a "Strip Size" of 64 KB.

The virtual device from the RAID controller is used as a physical volume
for LVM, and the exported AoE devices are LVM logical volumes cut from
this physical volume.

It seems I get the same filesystem settings whether I create the
filesystem right on the LVM volume or on the AoE volume.

Creating it on the server, mkfs.ext4 says:

root@storage01:~# mkfs.ext4 /dev/aoepool0/aoetest
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
1310720 inodes, 5242880 blocks

Creating it on the client, mkfs.ext4 says:

root@xen08:/home/torbjorn# mkfs.ext4 /dev/etherd/e7.1
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
1310720 inodes, 5242880 blocks

Using either of these filesystems on the client, I end up with pretty
much the same transfer rate of about ~70 MB/s. Using it on the server,
that is, mounting the LVM volume directly, I get the much preferable
~370 MB/s.

> To check if you're hit by this is quite simple: install dstat or iostat
> on the server exporting the volume. Run your benchmark and watch the
> output of dstat/iostat: if you experience massive reads while writing,
> congrats, you found the root cause. To improve things a little, create
> the filesystem on the server that is exporting the AoE targets. To
> improve them even more -- especially with RAID5 and RAID6 -- choose a
> smaller chunk size.
>
> I'd be glad if you could post back some numbers... :-)

I have iostat running continually, and I have seen that "massive read"
problem earlier. However, when I'm doing these tests, I see a bare
minimum of reads; it's mostly all writes.
The "%util" column from iostat is mostly around ~10%, while at some
intervals peaking towards 100%. I'm guessing there is some cache flushing
going on when I see those spikes. This is on the server; the client chugs
stably along at ~70 MB/s.

> On a side note: linear performance isn't what counts when using network
> storage. You'd better measure IOPS (input/output operations per
> second). I use fio for benchmarks, which lets you define your I/O
> patterns to (kind of) fit real-world usage.
>
> -- Adi
>
> _______________________________________________
> Aoetools-discuss mailing list
> Aoetools-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/aoetools-discuss

-- 
Kind regards,
Torbjørn Thorsen
Developer / operations engineer

Trollweb Solutions AS - Professional Magento Partner
www.trollweb.no
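P.S. For the fio suggestion: a job file roughly along these lines could
approximate a random-write IOPS test against the exported volume. The
parameters here are only a sketch, not something from this thread; note
that writing to the raw AoE device destroys any filesystem on it.

```ini
; hypothetical fio job -- adjust filename, depth and runtime to taste
[aoe-randwrite]
filename=/dev/etherd/e7.1
rw=randwrite
bs=4k
direct=1
ioengine=libaio
iodepth=32
runtime=60
time_based
```

Run on the client with `fio aoe-randwrite.fio`; the reported IOPS figure
should say more about real-world behaviour than a sequential dd.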