I don't see any description of the real hardware sitting underneath the VM.
Are all three VMDKs on separate, independent disks? I have only seen adding extra controllers really matter when there is a large number of disks per controller, i.e. 5 disks per controller works better than 15 disks on one controller. Even then it only seemed to matter for read I/O where the application had many threads (i.e. not direct I/O). Parallel writes with direct I/O break down to "write one I/O and wait" per device, especially when the direct-I/O block size is similar to (or smaller than) the stripe size.

On Fri, Jul 18, 2025 at 2:14 AM Henry, Andrew <andrew.he...@eon.se> wrote:
>
> I'm trying to improve the performance of a virtual machine guest running
> RHEL 9.6. I've assigned 3 separate SAS controllers to the VM, and 3 vmdk
> disks, each connected to its own controller, so there is no kernel
> bottleneck on the controller side in the VM when writing through to each
> vmdk under high load.
>
> In the guest, I'm creating an LV as follows:
>
> lvcreate -n swap -i 3 -I 64k --type=striped -L 16G vg00
> mkfs -t xfs -b size=4K -d su=64k,sw=3 -f /dev/vg00/swap
>
> After mounting, I test write speeds using:
>
> time dd if=/dev/zero of=/mnt/testfile1 bs=8192 count=65536 oflag=direct
>
> 536870912 bytes (537 MB, 512 MiB) copied, 41.2043 s, 13.0 MB/s
> 536870912 bytes (537 MB, 512 MiB) copied, 27.3986 s, 19.6 MB/s
> 536870912 bytes (537 MB, 512 MiB) copied, 22.8359 s, 23.5 MB/s
>
> time dd if=/dev/zero of=/mnt/testfile bs=64k count=163840 oflag=direct
>
> 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 10.7506 s, 99.9 MB/s  ## only did one test here
>
> time dd if=/dev/zero of=/mnt/testfile bs=1M count=10240 oflag=direct
>
> 10737418240 bytes (11 GB, 10 GiB) copied, 19.158 s, 560 MB/s
> 10737418240 bytes (11 GB, 10 GiB) copied, 23.0384 s, 466 MB/s
> 10737418240 bytes (11 GB, 10 GiB) copied, 18.3788 s, 584 MB/s
>
> I monitor speeds to the individual disks and to the LV itself using:
>
> iostat -xdmt 2
>
> My question(s):
>
> When I create the LV using just one disk (linear LVM), I
> am seeing the same write speed results as above. In iostat I can see that,
> e.g. with a 1MB block size in dd, it's ~500MB/s to the dm-1 device and also
> ~500MB/s to the physical disk xvda. When I create the LV as striped, I'm
> getting the same throughput on the dm-1 device (~500MB/s), but it is
> splitting that throughput evenly across all three PVs, so about ~170MB/s to
> each xvd disk, totalling ~500MB/s at the LV side. This was not my
> expectation: I expected ~500MB/s to each device with the striped LV,
> totalling ~1500MB/s on the dm-1 device. What am I missing?
>
> Second question: I'm testing database throughput. I have the following setup:
>
> DB → 8KB block size
> XFS → 4KB block size, limited by the page size of the Linux kernel on x86
> LVM → 64KB stripe size; I have also tested 4KB, 8KB, 256KB, 512KB, 1MB and 4MB
> xvd[abc] → 4KB block size on these disks, provided by the virtual host
>
> The only part of that equation I can modify is the stripe size of the LVM.
> Irrespective of which stripe size I choose, I get similar results with dd
> (and fio, for that matter) when testing different block sizes in my write
> tests. In other words, it doesn't seem to make any difference whatsoever
> which stripe size I choose in LVM. Is this a limitation caused by my
> virtualisation layer? Would I have seen different (better) results if I had
> used 3 physical disks connected to a physical host?
>
> I suppose it's normal to get low speeds with small block sizes and higher
> speeds with larger block sizes, but I'm still concerned that 8KB blocks are
> writing to a fast disk system at only 13-20MB/s. Is this common?
>
> /AH
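To expand on my point about direct I/O: a single dd stream with oflag=direct issues one write and waits for it, so at bs=8k only one stripe member is ever busy at a time, and stripe size cannot help. Since you already have fio, a job with several writers and a real queue depth is a better way to see whether the stripe adds anything. A rough sketch of such a job file (the directory, file size, and job/depth counts are just guesses to adapt to your /mnt mount):

```ini
; parallel-write.fio -- sketch only; paths and sizes are assumptions
[global]
ioengine=libaio      ; async engine so iodepth > 1 actually queues I/O
direct=1             ; O_DIRECT, same as dd's oflag=direct
rw=write
bs=8k                ; the database-like block size from your tests
size=512m            ; per-job file size
directory=/mnt       ; the mounted striped LV
group_reporting=1

[db-writers]
numjobs=8            ; 8 concurrent writer processes
iodepth=16           ; up to 16 outstanding I/Os per writer
```

Run it with `fio parallel-write.fio` while watching `iostat -xdmt 2`. If aggregate throughput still matches the single-disk figure even with 128 I/Os in flight, the bottleneck is likely below LVM (the virtual controller or the datastore), not your stripe layout.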