RE: [raw] Guest stuck during live migration
Do you think that applying this patch (replacing the code with "#if 0" there: https://github.com/qemu/qemu/blob/master/block/file-posix.c#L2601) could affect the customer data for any reason? (See the sketch at the end of this mail for what the change amounts to.) As we are fully on NVMe and 10G networks, it should fix our VMs which completely freeze.

Quentin

From: Qemu-devel on behalf of Quentin Grolleau
Sent: Tuesday, 24 November 2020 13:58:53
To: Kevin Wolf
Cc: qemu-devel@nongnu.org; qemu-bl...@nongnu.org
Subject: RE: [raw] Guest stuck during live migration

Thanks Kevin,

> > Hello,
> >
> > In our company, we are hosting a large number of VMs, hosted behind
> > OpenStack (so libvirt/QEMU).
> > A large majority of our VMs are running with local data only, stored on
> > NVMe, and most of them are RAW disks.
> >
> > With QEMU 4.0 (it can even happen with older versions) we see strange
> > live-migration behaviour:

> First of all, 4.0 is relatively old. Generally it is worth retrying with
> the most recent code (git master or 5.2.0-rc2) before having a closer
> look at problems, because it is frustrating to spend considerable time
> debugging an issue and then find out it has already been fixed a year
> ago.

I will try to build with the most recent code, but it will take me some time to do it.

> > - some VMs live migrate at very high speed without issue (> 6 Gbps)
> > - some VMs are running correctly, but migrating at a strangely low speed
> > (3 Gbps)
> > - some VMs are migrating at a very low speed (1 Gbps, sometimes less) and
> > during the migration the guest is completely I/O stuck
> >
> > When this issue happens the VM is completely blocked; iostat in the VM
> > shows us a latency of 30 secs

> Can you get the stack backtraces of all QEMU threads while the VM is
> blocked (e.g. with gdb or pstack)?
(gdb) thread apply all bt

Thread 20 (Thread 0x7f8a0effd700 (LWP 201248)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x56520139878b in qemu_cond_wait_impl (cond=0x5652020f27b0, mutex=0x5652020f27e8, file=0x5652014e4178 "/root/qemu_debug_LSEEK/qemu_debug/qemu/ui/vnc-jobs.c", line=214) at /root/qemu_debug_LSEEK/qemu_debug/qemu/util/qemu-thread-posix.c:161
#2  0x5652012a264d in vnc_worker_thread_loop (queue=queue@entry=0x5652020f27b0) at /root/qemu_debug_LSEEK/qemu_debug/qemu/ui/vnc-jobs.c:214
#3  0x5652012a2c18 in vnc_worker_thread (arg=arg@entry=0x5652020f27b0) at /root/qemu_debug_LSEEK/qemu_debug/qemu/ui/vnc-jobs.c:324
#4  0x565201398116 in qemu_thread_start (args=<optimized out>) at /root/qemu_debug_LSEEK/qemu_debug/qemu/util/qemu-thread-posix.c:502
#5  0x7f8a5e24a6ba in start_thread (arg=0x7f8a0effd700) at pthread_create.c:333
#6  0x7f8a5df8041d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 19 (Thread 0x7f8a0700 (LWP 201222)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x7f8a5e24cdbd in __GI___pthread_mutex_lock (mutex=mutex@entry=0x565201adb680) at ../nptl/pthread_mutex_lock.c:80
#2  0x565201398263 in qemu_mutex_lock_impl (mutex=0x565201adb680, file=0x5652013d7c68 "/root/qemu_debug_LSEEK/qemu_debug/qemu/accel/kvm/kvm-all.c", line=2089) at /root/qemu_debug_LSEEK/qemu_debug/qemu/util/qemu-thread-posix.c:66
#3  0x565200f7d00e in qemu_mutex_lock_iothread_impl (file=file@entry=0x5652013d7c68 "/root/qemu_debug_LSEEK/qemu_debug/qemu/accel/kvm/kvm-all.c", line=line@entry=2089) at /root/qemu_debug_LSEEK/qemu_debug/qemu/cpus.c:1850
#4  0x565200fa7ca8 in kvm_cpu_exec (cpu=cpu@entry=0x565202354480) at /root/qemu_debug_LSEEK/qemu_debug/qemu/accel/kvm/kvm-all.c:2089
#5  0x565200f7d1ce in qemu_kvm_cpu_thread_fn (arg=arg@entry=0x565202354480) at /root/qemu_debug_LSEEK/qemu_debug/qemu/cpus.c:1281
#6  0x565201398116 in qemu_thread_start (args=<optimized out>) at /root/qemu_debug_LSEEK/qemu_debug/qemu/util/qemu-thread-posix.c:502
#7  0x7f8a5e24a6ba in start_thread (arg=0x7f8a0700) at pthread_create.c:333
#8  0x7f8a5df8041d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 18 (Thread 0x7f8a2cff9700 (LWP 201221)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x7f8a5e24cdbd in __GI___pthread_mutex_lock (mutex=mutex@entry=0x565201adb680) at ../nptl/pthread_mutex_lock.c:80
#2  0x565201398263 in qemu_mutex_lock_impl (mutex=0x565201adb680, file=0x5652013d7c68 "/root/qemu_debug_LSEEK/qemu_debug/qemu/accel/kvm/kvm-all.c", line=2089) at /root/qemu_debug_LSEEK/qemu_debug/qemu/util/qemu-thread-posix.c:66
#3  0x565200f7d00e in qemu_mutex_lock_iothread_impl (file=file@entry=0x5652013d7c68 "/root/qemu_debug_LSEEK/qemu_debug/qemu/accel/kvm/kvm-all.c", line=line@entry=208
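
Coming back to the question at the top of this mail about whether the "#if 0" change could affect customer data: my understanding is that, with the lseek probe compiled out, the raw driver's allocation probe reports "not supported" and the block-status code then simply treats the whole file as allocated data. Roughly sketched below with made-up stand-in names (RawState, find_allocation_sketch, block_status_sketch; this is a simplified illustration, not the real block/file-posix.c code): every byte is still mirrored unchanged to the destination, only hole detection is lost, so a sparse source would end up fully allocated on the destination.

#include <sys/types.h>
#include <errno.h>

/* Hypothetical stand-in for the driver state; the real code uses
 * BlockDriverState and lives in block/file-posix.c. */
typedef struct { int fd; } RawState;
enum { SKETCH_BLOCK_DATA = 1 };

/* With the SEEK_DATA/SEEK_HOLE probe disabled, the probe reports ENOTSUP... */
static int find_allocation_sketch(RawState *s, off_t start,
                                  off_t *data, off_t *hole)
{
#if 0
    /* original lseek(s->fd, start, SEEK_DATA) / lseek(..., SEEK_HOLE) probe */
#endif
    (void)s; (void)start; (void)data; (void)hole;
    return -ENOTSUP;
}

/* ...and the caller falls back to reporting the whole range as data, so the
 * mirror job copies everything instead of skipping holes. */
static int block_status_sketch(RawState *s, off_t offset, off_t bytes,
                               off_t *pnum)
{
    off_t data, hole;
    if (find_allocation_sketch(s, offset, &data, &hole) == -ENOTSUP) {
        *pnum = bytes;              /* treat the entire range as data */
        return SKETCH_BLOCK_DATA;   /* no holes reported, none preserved */
    }
    *pnum = bytes;                  /* (hole-aware path elided) */
    return SKETCH_BLOCK_DATA;
}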
RE: [raw] Guest stuck during live migration
> Please also provide your QEMU command line.

Here is the QEMU command line:

/usr/bin/qemu-system-x86_64 \
  -name guest=instance-0034d494,debug-threads=on \
  -S \
  -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-9-instance-0034d494/master-key.aes \
  -machine pc-i440fx-bionic,accel=kvm,usb=off,dump-guest-core=off \
  -cpu Broadwell-IBRS,md-clear=on,vmx=on \
  -m 6 \
  -overcommit mem-lock=off \
  -smp 16,sockets=16,cores=1,threads=1 \
  -uuid b959a460-84b0-4280-b851-96755027622b \
  -smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=14.1.1,serial=5b429103-2856-154f-1caf-5ffb5694cdc3,uuid=b959a460-84b0-4280-b851-96755027622b,family=Virtual Machine' \
  -no-user-config \
  -nodefaults \
  -chardev socket,id=charmonitor,fd=28,server,nowait \
  -mon chardev=charmonitor,id=monitor,mode=control \
  -rtc base=utc,driftfix=slew \
  -global kvm-pit.lost_tick_policy=delay \
  -no-hpet \
  -no-shutdown \
  -boot strict=on \
  -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
  -drive file=/home/instances/b959a460-84b0-4280-b851-96755027622b/disk,format=raw,if=none,id=drive-virtio-disk0,cache=none,discard=unmap,aio=native,throttling.iops-total=8 \
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=on \
  -add-fd set=1,fd=31 \
  -chardev file,id=charserial0,path=/dev/fdset/1,append=on \
  -device isa-serial,chardev=charserial0,id=serial0 \
  -chardev pty,id=charserial1 \
  -device isa-serial,chardev=charserial1,id=serial1 \
  -device usb-tablet,id=input0,bus=usb.0,port=1 \
  -vnc 10.224.27.81:0 \
  -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 \
  -s \
  -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
  -msg timestamp=on

> At the moment, my assumption is that this is during a mirror block job
> which is migrating the disk to your destination server. Not looking for
> holes would mean that a sparse source file would become fully allocated
> on the destination, which is usually not wanted (also we would
> potentially transfer a lot more data over the network).

The VM disk is already fully allocated, so in this case it doesn't change anything.

> Can you give us a snippet from your strace that shows the individual
> lseek syscalls? Depending on which ranges are queried, maybe we could
> optimise things by caching the previous result.

(A rough sketch of that caching idea is included at the end of this mail.)

Here is the strace during the block migration:

strace -c -p 32571
strace: Process 32571 attached
^Cstrace: Process 32571 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 94.15   13.754159        2503      5495           lseek
  3.47    0.507101          91      5549           sendmsg
  1.60    0.233516          84      2769           io_submit
  0.40    0.057817          11      5496           setsockopt
  0.18    0.025747           4      5730       184 recvmsg
  0.16    0.023560           6      4259           write
  0.02    0.002575           6       408           read
  0.02    0.002425           9       266        35 futex
  0.01    0.002136          12       184           ppoll
  0.00    0.000184          12        16           poll
  0.00    0.000038          38         1           clone
  0.00    0.000032          16         2           rt_sigprocmask
  0.00    0.000013          13         1           ioctl
------ ----------- ----------- --------- --------- ----------------
100.00   14.609303                 30176       219 total

> Also, a final remark, I know of some cases (on XFS) where lseeks were
> slow because the image file was heavily fragmented. Defragmenting the
> file resolved the problem, so this may be another thing to try.

> On XFS, newer QEMU versions set an extent size hint on newly created
> image files (during qemu-img create), which can reduce fragmentation
> considerably.
> Kevin

> Server hosting the VM:
> - Bi-Xeon hosts with NVMe storage and 10 Gb network card
> - QEMU 4.0 and libvirt 5.4
> - Kernel 4.18.0.25
>
> Guest having the issue:
> - raw image with Debian 8
>
> Here is the qemu-img info on the disk:
>
> qemu-img info disk
> image: disk
> file format: raw
> virtual size: 400G (429496729600 bytes)
> disk size: 400G
>
> Quentin GROLLEAU
>
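
As a rough illustration of the caching idea Kevin mentions above: remember the data extent returned by the last probe and answer queries that fall inside it without issuing new lseek() calls. The names ExtentCache and find_allocation_cached below are made up for this sketch; it is not QEMU code, just the shape such a cache could take.

#define _GNU_SOURCE
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <stdbool.h>

typedef struct {
    bool  valid;
    off_t data_start;   /* start of the cached data extent */
    off_t hole_start;   /* first byte of the hole that follows it */
} ExtentCache;

/* Answer "where does data start at or after 'start', and where does the next
 * hole begin?" from the cache when possible; otherwise fall back to the two
 * lseek() probes and remember the result. */
int find_allocation_cached(int fd, off_t start, ExtentCache *cache,
                           off_t *data, off_t *hole)
{
    if (cache->valid && start >= cache->data_start && start < cache->hole_start) {
        *data = start;              /* query falls inside the cached extent: */
        *hole = cache->hole_start;  /* no syscalls needed */
        return 0;
    }

    off_t d = lseek(fd, start, SEEK_DATA);
    if (d < 0) {
        return -errno;              /* ENXIO past EOF or in a trailing hole */
    }
    off_t h = lseek(fd, d, SEEK_HOLE);
    if (h < 0) {
        return -errno;
    }

    cache->valid = true;
    cache->data_start = d;
    cache->hole_start = h;
    *data = d;
    *hole = h;
    return 0;
}

A real implementation would also have to invalidate the cached extent whenever the guest writes or discards within it, which is the tricky part of such an optimisation.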
[raw] Guest stuck during live migration
Hello,

In our company, we are hosting a large number of VMs, hosted behind OpenStack (so libvirt/QEMU).
A large majority of our VMs are running with local data only, stored on NVMe, and most of them are RAW disks.

With QEMU 4.0 (it can even happen with older versions) we see strange live-migration behaviour:
- some VMs live migrate at very high speed without issue (> 6 Gbps)
- some VMs are running correctly, but migrating at a strangely low speed (3 Gbps)
- some VMs are migrating at a very low speed (1 Gbps, sometimes less) and during the migration the guest is completely I/O stuck

When this issue happens the VM is completely blocked; iostat in the VM shows us a latency of 30 secs.

First we thought it was related to a hardware issue, so we checked and compared different hardware, but no issue was found there.

So one of my colleagues had the idea to limit, with "tc", the bandwidth on the interface the migration was done over, and it worked: the VM didn't lose any ping nor get I/O stuck.

Important point: once the VM has been migrated (with the limitation) one time, if we migrate it again right after, the migration is done at full speed (8-9 Gb/s) without freezing the VM.

It only happens on existing VMs; we tried to reproduce it with a fresh instance with exactly the same spec and nothing happened. We also tried to replicate the workload inside the VM, but there was no way to reproduce the case. So it was not related to the workload nor to the server that hosts the VM.

So we thought about the disk of the instance: the raw file.

We also tried to strace -c the process during the live-migration; it was doing a lot of "lseek" (a small standalone demo of this probing is included after this mail), and we found this: https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00462.html

So I rebuilt QEMU with this patch and the live-migration went well, at high speed and with no VM freeze (https://github.com/qemu/qemu/blob/master/block/file-posix.c#L2601).

Do you have a way to avoid the "lseek" mechanism, as it consumes so many resources finding the holes in the disk that it doesn't leave any for the VM?

Server hosting the VM:
- Bi-Xeon hosts with NVMe storage and 10 Gb network card
- QEMU 4.0 and libvirt 5.4
- Kernel 4.18.0.25

Guest having the issue:
- raw image with Debian 8

Here is the qemu-img info on the disk:

> qemu-img info disk
image: disk
file format: raw
virtual size: 400G (429496729600 bytes)
disk size: 400G

Quentin GROLLEAU
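
For context on the "lseek" mechanism mentioned above: QEMU's raw (file-posix) driver maps out which parts of the image are allocated by repeatedly calling lseek() with SEEK_DATA and SEEK_HOLE, and those are the thousands of lseek calls visible in the strace. A minimal standalone demo of that probing loop (an illustration only, not the actual QEMU code; the file name walk_extents.c is made up):

/* walk_extents.c - print the data extents of a file using the same
 * lseek(SEEK_DATA)/lseek(SEEK_HOLE) probing that the raw driver relies on.
 * Build with: cc -o walk_extents walk_extents.c */
#define _GNU_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    off_t end = lseek(fd, 0, SEEK_END);
    off_t pos = 0;
    while (pos < end) {
        /* Where does the next run of data start at or after 'pos'? */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno != ENXIO) {       /* ENXIO: only a hole remains */
                perror("lseek(SEEK_DATA)");
            }
            break;
        }
        /* ...and where does the hole after that data begin? */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0) {
            perror("lseek(SEEK_HOLE)");
            break;
        }
        printf("data: %lld..%lld\n", (long long)data, (long long)hole);
        pos = hole;                     /* two lseek() calls per data extent */
    }
    close(fd);
    return 0;
}

Each data extent costs two lseek() calls, so a large or heavily fragmented raw image can generate a very large number of them while the mirror block job walks the disk.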