Hi,

Which hypervisor OS is that? CentOS? Are you using the right tuned profile on the host? What about in the guest? Which I/O scheduler is in use?
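To answer the scheduler question quickly, something like the following could be run on both the host and the guest. It is only a sketch that assumes the usual Linux sysfs layout (`/sys/block/<dev>/queue/scheduler`, where the active scheduler is shown in square brackets); the helper names `active_scheduler` and `report` are made up for illustration. The active tuned profile can be checked separately with `tuned-adm active`.

```python
from pathlib import Path


def active_scheduler(text):
    """Return the scheduler marked with [brackets] in a sysfs scheduler line."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Some devices report a single unbracketed value such as "none".
    return text.strip()


def report(sys_block=Path("/sys/block")):
    """Map each block device name to its active I/O scheduler."""
    result = {}
    for sched in sorted(sys_block.glob("*/queue/scheduler")):
        # .../<dev>/queue/scheduler -> <dev>
        result[sched.parent.parent.name] = active_scheduler(sched.read_text())
    return result


if __name__ == "__main__":
    for dev, sched in report().items():
        print(f"{dev}: {sched}")
```

Comparing the output on the host with the output inside the VM shows whether the two sides are using different schedulers.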
--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Ivan Kudryavtsev" <kudryavtsev...@bw-sw.com>
> To: "users" <users@cloudstack.apache.org>
> Sent: Friday, 17 May, 2019 10:13:50
> Subject: Poor NVMe Performance with KVM

> Hello, colleagues.
>
> I hope someone can help me. I just deployed a new VM host with an Intel P4500
> NVMe drive as local storage.
>
> From the hypervisor host I get the expected performance with FIO: 200K read
> IOPS and 3 GB/s. Write performance is also as high as expected.
>
> I've created a new KVM VM service offering with the virtio-scsi controller
> (I tried virtio as well) and deployed a VM. Now I am benchmarking it with
> FIO. The results are very strange:
>
> 1. Read/write with large blocks (1M) shows the expected performance (my limits
> are R=1000/W=500 MB/s).
>
> 2. Write with direct=0 gives the expected 50K IOPS, while write with
> direct=1 gives a very moderate 2-3K IOPS.
>
> 3. Read with direct=0 and with direct=1 both give 3000 IOPS.
>
> During the benchmark I see IOWAIT=20% in the VM, while host IOWAIT is 0%, which is
> strange.
>
> So, basically, from inside the VM my NVMe is very slow for small-block I/O.
> From the host, it works great.
>
> I tried attaching the volume with NBD as /dev/nbd0 and benchmarking it. Read
> performance is good. Has anyone managed to get good small-block IOPS from NVMe
> under KVM?
>
> The filesystem is XFS; I previously tried EXT4 and the results are the same.
>
> This is part of the VM XML definition generated by CloudStack:
>
> <devices>
>   <emulator>/usr/bin/kvm-spice</emulator>
>   <disk type='file' device='disk'>
>     <driver name='qemu' type='qcow2' cache='none' discard='unmap'/>
>     <source file='/var/lib/libvirt/images/6809dbd0-4a15-4014-9322-fe9010695934'/>
>     <backingStore type='file' index='1'>
>       <format type='raw'/>
>       <source file='/var/lib/libvirt/images/ac43742c-3991-4be1-bff1-7617bf4fc6ef'/>
>       <backingStore/>
>     </backingStore>
>     <target dev='sda' bus='scsi'/>
>     <iotune>
>       <read_bytes_sec>1048576000</read_bytes_sec>
>       <write_bytes_sec>524288000</write_bytes_sec>
>       <read_iops_sec>100000</read_iops_sec>
>       <write_iops_sec>50000</write_iops_sec>
>     </iotune>
>     <serial>6809dbd04a1540149322</serial>
>     <alias name='scsi0-0-0-0'/>
>     <address type='drive' controller='0' bus='0' target='0' unit='0'/>
>   </disk>
>   <disk type='file' device='cdrom'>
>     <driver name='qemu' type='raw'/>
>     <backingStore/>
>     <target dev='hdc' bus='ide'/>
>     <readonly/>
>     <alias name='ide0-1-0'/>
>     <address type='drive' controller='0' bus='1' target='0' unit='0'/>
>   </disk>
>   <controller type='scsi' index='0' model='virtio-scsi'>
>     <alias name='scsi0'/>
>     <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
>   </controller>
>
> So what I see now is that it is slower than a pair of Samsung 960 PROs,
> which is extremely strange.
>
> Thanks in advance.
>
>
> --
> With best regards, Ivan Kudryavtsev
> Bitworks LLC
> Cell RU: +7-923-414-1515
> Cell USA: +1-201-257-1512
> WWW: http://bitworks.software/ <http://bw-sw.com/>
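
On the iotune limits in the quoted XML: a rough way to rule the throttle in or out is to compute the IOPS ceiling it implies for a given block size. The sketch below copies only the `<iotune>` values from the quoted definition; the 4 KiB block size is an assumption (the original message only says "small"), and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the quoted disk definition; only <iotune> matters here.
DISK_XML = """
<disk type='file' device='disk'>
  <target dev='sda' bus='scsi'/>
  <iotune>
    <read_bytes_sec>1048576000</read_bytes_sec>
    <write_bytes_sec>524288000</write_bytes_sec>
    <read_iops_sec>100000</read_iops_sec>
    <write_iops_sec>50000</write_iops_sec>
  </iotune>
</disk>
"""


def iotune_limits(disk_xml):
    """Return the <iotune> children as a {tag: int} mapping."""
    root = ET.fromstring(disk_xml)
    return {el.tag: int(el.text) for el in root.find("iotune")}


def effective_iops(limits, direction, block_size):
    """IOPS ceiling for one direction: the tighter of the IOPS and byte caps."""
    by_bytes = limits[f"{direction}_bytes_sec"] // block_size
    return min(limits[f"{direction}_iops_sec"], by_bytes)


limits = iotune_limits(DISK_XML)
for direction in ("read", "write"):
    # 4096 is an assumed block size, not taken from the original message.
    print(direction, effective_iops(limits, direction, 4096))
```

Under that assumption the throttle would allow far more than the roughly 3000 IOPS reported, which suggests the limits themselves are not what caps small-block performance.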