Thanks Stefan. Couldn't get to this earlier. I did another run and took a diff of /proc/interrupts inside the guest before and after the run. It shows all of the interrupts for 'virtio7-req.0' going to CPU1, which I guess explains the "CPU 1/KVM" vCPU utilization on the host.
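(For reference, the before/after snapshot was taken with something along these lines inside the guest; IRQ 34 is the one shown in the diff excerpt just below, and the smp_affinity read is simply the standard procfs way to see which vCPUs are allowed to service that interrupt:)

  cat /proc/interrupts > /tmp/irq.before
  # ... run the fio workload (command line further down) ...
  cat /proc/interrupts > /tmp/irq.after
  diff /tmp/irq.before /tmp/irq.after

  # CPU mask of vCPUs that may service the virtio7-req.0 interrupt (IRQ 34)
  cat /proc/irq/34/smp_affinity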
 34:        147     666085          0          0   PCI-MSI-edge   virtio7-req.0

The only remaining question is the high CPU utilization of the vCPU threads for this workload. Even when I run a light fio workload (queue depth of 1, which gives 8K IOPS), the vCPU threads are close to 100% utilization. Why is it so high, and does it have an impact on guest code that could be executing on the same CPU?

fio command line:

fio --time_based --ioengine=libaio --randrepeat=1 --direct=1 --invalidate=1 --verify=0 --offset=0 --verify_fatal=0 --group_reporting --numjobs=1 --name=randread --rw=randread --blocksize=8K --iodepth=1 --runtime=60 --filename=/dev/vdb

qemu command line:

qemu-system-x86_64 -L /usr/share/seabios/ -enable-kvm -name node1,debug-threads=on -name node1 -S -machine pc-i440fx-2.8,accel=kvm,usb=off -cpu SandyBridge -m 7680 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -object iothread,id=iothread1 -object iothread,id=iothread2 -object iothread,id=iothread3 -object iothread,id=iothread4 -uuid XX -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/node1fs.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device lsi,id=scsi0,bus=pci.0,addr=0x6 -device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x7 -device virtio-scsi-pci,id=scsi2,bus=pci.0,addr=0x8 -drive file=rhel7.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/dev/sdc,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=native -device virtio-blk-pci,iothread=iothread1,scsi=off,bus=pci.0,addr=0x17,drive=drive-virtio-disk1,id=virtio-disk1 -drive file=/dev/sdc,if=none,id=drive-scsi1-0-0-0,format=raw,cache=none,aio=native -device scsi-hd,bus=scsi1.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi1-0-0-0,id=scsi1-0-0-0 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=XXX,bus=pci.0,addr=0x2 -netdev tap,fd=26,id=hostnet1,vhost=on,vhostfd=27 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=YYY,bus=pci.0,multifunction=on,addr=0x15 -netdev tap,fd=28,id=hostnet2,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=ZZZ,bus=pci.0,multifunction=on,addr=0x16 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -msg timestamp=on

# qemu-system-x86_64 --version
QEMU emulator version 2.8.0(Debian 1:2.8+dfsg-3~bpo8+1)
Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers

Note that I had the same host block device (/dev/sdc in this case) exposed to the guest VM over both virtio-scsi and virtio-blk for perf comparisons. I see poor performance for 8K random reads inside the guest over both virtio-scsi and virtio-blk, compared to the host performance. Let me open another thread for that problem, but let me know if something obvious pops up based on the qemu command line.

~Padhu.
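P.S. On the vCPU utilization question: in case it helps the discussion, here is a rough sketch of host-side commands that should break down where those ~100% threads spend their time (the QEMU pid is a placeholder; the per-thread names come from debug-threads=on):

  # per-thread CPU usage of the QEMU process, sampled every second
  pidstat -t -p <qemu-pid> 1

  # count KVM exit reasons over ~10 seconds (e.g. HLT vs PIO/MSR exits)
  perf kvm stat record -p <qemu-pid> -- sleep 10
  perf kvm stat report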
-----Original Message-----
From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
Sent: Tuesday, July 11, 2017 5:19 AM
To: Nagarajan, Padhu (HPE Storage) <pa...@hpe.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Disparity between host and guest CPU utilization during disk IO benchmark

On Mon, Jul 10, 2017 at 05:27:15PM +0000, Nagarajan, Padhu (HPE Storage) wrote:
> Posted this in qemu-discuss and did not get a response. Hoping that someone
> here might be able to offer insights.
>
> I was running an 8K random-read fio benchmark inside the guest with
> iodepth=32. The device used inside the guest for the test was a virtio-blk
> device with iothread enabled, mapped on to a raw block device on the host.
> While this workload was running, I took a snapshot of the CPU utilization
> reported by the host and the guest. The guest had 4 cores. top inside guest
> shows 3 idle cores and one core being 74% utilized by fio (active on core 3).
> The host had 12 cores and three cores were completely consumed by three qemu
> threads. top inside host shows three qemu threads, each utilizing the CPU
> core to a near 100%. These threads are "CPU 1/KVM", "CPU 3/KVM" and "IO
> iothread1". The CPU utilization story on the host side is the same, even if I
> run a light fio workload inside the guest (for ex. iodepth=1).
>
> Why do I see two "CPU/KVM" threads occupying 100% CPU, even though only one
> core inside the guest is being utilized ? Note that I had 'accel=kvm' turned
> on for the guest.

fio might be submitting I/O requests on one vcpu and the completion interrupts are processed on another vcpu.

To discuss further, please post:

1. Full QEMU command-line
2. Full fio command-line and job file (if applicable)
3. Output of cat /proc/interrupts inside the guest after running the benchmark