## The Problem I Am Having
I'm using QEMU to virtualize Spectrum OS (https://spectrum-os.org). I'm finding that QEMU is very slow (data transfer rates of 30 MiB/s or so) and uses a lot of CPU (300-400% or thereabouts). `top` inside the VM shows low CPU usage, so I suspect that most of the CPU usage comes from QEMU itself.
## QEMU configuration
I'm using the following script to run QEMU (on NixOS).
It's a bash script with an odd #! that ensures that all dependencies are
available.
#!/usr/bin/env nix-shell
#! nix-shell -i bash -p qemu_kvm btrfs-progs weston
set -euo pipefail
case $0 in
/*) cd "${0%/*}/";;
*/*) cd "./${0%/*}";;
*) :;;
esac
case $1 in
install)
if [[ "$#" != 3 ]]; then echo "Usage run.sh [install install_media|run] root" >&2; exit 1; fi
install_image=$2
shift 2
unset b
dir=$(mktemp -d)
mkdir -- "$dir/tmp" "$dir/home"
rm -rf -- "$dir"
;;
run)
if [[ "$#" != 3 ]]; then echo "Usage run.sh [install install_media|run] root writable" >&2; exit 1; fi
shift
b=1
unset install_image
exec
;;
*) echo "Bad action (must be install or run)" >&2; exit 1;;
esac
f=$(type qemu-system-x86_64)
case $f in
('qemu-system-x86_64 is /nix/store/'*/bin/qemu-system-x86_64)
p=${f#'qemu-system-x86_64 is '}
p=${p%/bin/qemu-system-x86_64}/share/qemu/edk2-x86_64-code.fd
;;
(*)
printf 'Not able to find qemu-system-x86_64\n' >&2
exit 1
;;
esac
cpus=$(exec nproc)
DISPLAY=
if ! [[ "$XDG_RUNTIME_DIR" =~ ^/run/user/(0|[1-9][0-9]*)$ ]]; then
echo Bad XDG_RUNTIME_DIR>&2
exit 1
fi
qemu_args=(
-m 4G
-cpu max
-smp "$cpus"
-machine q35,accel=kvm,kernel-irqchip=split,pflash0=firmware
-blockdev
"driver=raw,node-name=firmware,file.driver=file,file.filename=${p//,/,,},read-only=on"
)
if [[ -v b ]]; then
qemu_args+=(
-device intel-iommu,intremap=on,device-iotlb=on
-device pcie-root-port,id=pcie.1
-device
virtio-net-pci,ats=on,iommu_platform=on,bus=pcie.1,netdev=net0,disable-legacy=on,disable-modern=off
-netdev
"stream,addr.path=${XDG_RUNTIME_DIR//,/,,}/passt/passt.sock,addr.type=unix,id=net0,server=off"
)
fi
qemu_args+=(
-device qemu-xhci
-device virtio-gpu
-device usb-tablet
-parallel none
-sandbox on
-vga none
-boot menu=on
-smbios
'type=11,value=io.systemd.stub.kernel-cmdline-extra=console=ttyS0 systemd.debug_shell=ttyS0'
-device "virtio-blk-pci,drive=drive2${b+,bootindex=1}"
-blockdev
"driver=raw,node-name=drive2,file.driver=file,file.filename=${1//,/,,},cache.direct=off,file.aio=io_uring"
-display
gtk,full-screen=off,gl=on,grab-on-hover=off,show-tabs=on,zoom-to-fit=on
)
if [[ -v install_image ]]; then
qemu_args+=(
-device
'usb-storage,drive=drive1,removable=true,bootindex=1'
-blockdev
"driver=raw,node-name=drive1,file.driver=file,file.filename=${install_image//,/,,},read-only=on,cache.direct=on,file.aio=io_uring"
)
else
qemu_args+=(
-device 'virtio-blk-pci-non-transitional,drive=drive3'
-blockdev
"driver=raw,node-name=drive3,file.driver=file,file.filename=${2//,/,,},cache.direct=off,file.aio=io_uring"
)
fi
exec qemu-system-x86_64 "${qemu_args[@]}"
## Final QEMU Command Line
For the `run` action, the script ends up executing `qemu-system-x86_64` with the following arguments:
-m 4G \
-cpu max \
-smp "$NUMBER_OF_CPUS_ON_HOST" \
-machine q35,accel=kvm,kernel-irqchip=split,pflash0=firmware \
-blockdev "driver=raw,node-name=firmware,file.driver=file,file.filename=${p//,/,,},read-only=on" \
-device intel-iommu,intremap=on,device-iotlb=on \
-device pcie-root-port,id=pcie.1 \
-device virtio-net-pci,ats=on,iommu_platform=on,bus=pcie.1,netdev=net0,disable-legacy=on,disable-modern=off \
-netdev "stream,addr.path=${XDG_RUNTIME_DIR//,/,,}/passt/passt.sock,addr.type=unix,id=net0,server=off" \
-device qemu-xhci \
-device virtio-gpu \
-device usb-tablet \
-parallel none \
-sandbox on \
-vga none \
-boot menu=on \
-smbios 'type=11,value=io.systemd.stub.kernel-cmdline-extra=console=ttyS0 systemd.debug_shell=ttyS0' \
-device "virtio-blk-pci,drive=drive2,bootindex=1" \
-blockdev "driver=raw,node-name=drive2,file.driver=file,file.filename=${1//,/,,},cache.direct=off,file.aio=io_uring" \
-display gtk,full-screen=off,gl=on,grab-on-hover=off,show-tabs=on,zoom-to-fit=on \
-device 'virtio-blk-pci-non-transitional,drive=drive3' \
-blockdev "driver=raw,node-name=drive3,file.driver=file,file.filename=${2//,/,,},cache.direct=off,file.aio=io_uring"
Here $1 is the boot image, $2 is a writable image, and $p is the path to OVMF.
Also, ${XDG_RUNTIME_DIR//,/,,}/passt/passt.sock is a passt listening socket.
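The `${...//,/,,}` expansions in the script double every comma, because QEMU treats a single comma as the separator between suboptions and reads `,,` as a literal comma. A minimal sketch of that escaping follows; the function name `qemu_escape` and the example path are hypothetical and only serve as illustration.

```bash
# Double every comma in the argument so that QEMU reads it as a
# literal character instead of a suboption separator.
# (qemu_escape is a made-up helper name, not part of the script above.)
qemu_escape() {
    printf '%s' "${1//,/,,}"
}

# Hypothetical image path containing commas, for illustration only.
image='/tmp/disk,with,commas.raw'
printf 'file.filename=%s\n' "$(qemu_escape "$image")"
```

A path without commas passes through unchanged, so the expansion is safe to apply unconditionally, which is what the script does.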
## The Guest's Workload
The guest's workload uses very little CPU. The guest:
1. Creates two VMs using Cloud Hypervisor.
2. Assigns the emulated NIC to one of them.
3. Sets up bridged networking between the two guests' virtio NICs.
4. Tells the other to download some files into a virtiofs-mounted
BTRFS subvolume.
5. Takes a BTRFS snapshot of that subvolume.
6. Copies these files to its OS drive.
`top` in the guest shows that the guest is idle over 90% of the time.
`top` on the host shows CPU usage of 200% or more when the download
is happening, and between 35% and 70% while installing the files to
the OS disk. The host `top` is standard `top`, while the guest uses
Busybox `top`.
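To see where the host-side CPU time goes, it can help to break the QEMU process down by thread. The sketch below is one way to do that, assuming procps `ps` and `awk` on the host. The helper name `sum_by_thread_name` is made up, and the `CPU <n>/KVM` naming of vCPU threads is an assumption that depends on the QEMU version and on whether thread naming is enabled.

```bash
# Read "<pcpu> <thread name>" lines on stdin and print the total CPU
# percentage per thread name, highest first.
# Assumption: vCPU threads are named "CPU <n>/KVM"; they are folded
# into a single "CPU N/KVM" bucket. Other names are kept as-is.
sum_by_thread_name() {
    awk '{
        cpu = $1
        $1 = ""                         # drop the pcpu column
        sub(/^ +/, "")                  # trim the leftover separator
        sub(/^CPU [0-9]+/, "CPU N")     # group all vCPU threads
        total[$0] += cpu
    }
    END { for (name in total) printf "%.1f %s\n", total[name], name }' |
        sort -rn
}

# Possible usage on the host (picks the oldest matching process):
#   ps -L -o pcpu=,comm= -p "$(pgrep -o qemu-system)" | sum_by_thread_name
```

If most of the time lands in the vCPU bucket, the guest is probably causing many VM exits; if it lands in the main loop or I/O threads, device emulation or the block layer is the more likely culprit.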
## Potential Causes
There are a few factors I can think of that could cause the high CPU usage:
- Spectrum uses PCI passthrough, so QEMU must emulate an IOMMU and
use an emulated NIC instead of a virtio one.
- I might have made a mistake and passed the wrong parameters to QEMU.
- My host drives might be slow.
Nevertheless, I would not expect any of these to make QEMU use such a
vast amount of CPU time. Given the above factors, I'm not expecting
near-native performance, but 200% CPU on the host seems wrong in
some way.
Is this a bug in QEMU, user error on my side, or just an inevitable
consequence of needing to use PCI passthrough inside a VM on x86?
On Arm I could use a virtio IOMMU, but that doesn't support interrupt
remapping on x86 and so isn't going to work.
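Of the candidate causes listed above, slow host drives are the easiest to check independently of QEMU. The following is a rough sketch of a sequential write test, assuming GNU `dd`; the function name `write_test` and its arguments are hypothetical. Writing zeros can overstate throughput on filesystems with transparent compression, so the figure is only a coarse sanity check.

```bash
# Write SIZE_MIB mebibytes of zeros to a scratch file in DIR, flush the
# data to disk, and let dd print its throughput summary on stderr.
# The scratch file is removed afterwards.
# (write_test is a made-up helper; conv=fdatasync requires GNU dd.)
write_test() {
    local dir=$1 size_mib=${2:-1024} file status
    file=$(mktemp -- "$dir/ddtest.XXXXXX") || return 1
    dd if=/dev/zero of="$file" bs=1M count="$size_mib" conv=fdatasync
    status=$?
    rm -f -- "$file"
    return "$status"
}

# Possible usage, run in the directory that holds the disk images:
#   write_test . 1024
```

A result far above the 30 MiB/s seen in the guest would suggest that the host drive itself is not the bottleneck.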
--
Sincerely,
Demi Marie Obenour (she/her/hers)
