Re: [PATCH] migration: fix multifd_send_pages() next channel
Laurent Vivier wrote: > multifd_send_pages() loops around the available channels, > the next channel to use between two calls to multifd_send_pages() is stored > inside a local static variable, next_channel. > > It works well, except if the number of channels decreases between two calls > to multifd_send_pages(). In this case, the loop can try to access the > data of a channel that doesn't exist anymore. > > The problem can be triggered if we start a migration with a given number of > channels and then we cancel the migration to restart it with a lower number. > This ends generally with an error like: > qemu-system-ppc64: .../util/qemu-thread-posix.c:77: qemu_mutex_lock_impl: > Assertion `mutex->initialized' failed. > > This patch fixes the error by capping next_channel with the current number > of channels before using it. > > Signed-off-by: Laurent Vivier Reviewed-by: Juan Quintela
Re: [PATCH] migration: fix multifd_send_pages() next channel
Patchew URL: https://patchew.org/QEMU/20200617113154.593233-1-lviv...@redhat.com/ Hi, This series failed the asan build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #!/bin/bash export ARCH=x86_64 make docker-image-fedora V=1 NETWORK=1 time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1 === TEST SCRIPT END === CC qga/commands.o CC qga/guest-agent-command-state.o CC qga/main.o /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) CC qga/commands-posix.o CC qga/channel-posix.o CC qga/qapi-generated/qga-qapi-visit.o --- AR libvhost-user.a GEN docs/interop/qemu-ga-ref.html GEN docs/interop/qemu-ga-ref.txt /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) GEN docs/interop/qemu-ga-ref.7 AS pc-bios/optionrom/multiboot.o CC pc-bios/optionrom/linuxboot_dma.o --- SIGNpc-bios/optionrom/pvh.bin LINKqemu-ga LINKqemu-keymap /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINKivshmem-client LINKivshmem-server /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINKqemu-nbd LINKqemu-storage-daemon LINKqemu-img /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINKqemu-io /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINKqemu-edid LINKfsdev/virtfs-proxy-helper /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINKscsi/qemu-pr-helper /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) LINKqemu-bridge-helper LINKvirtiofsd LINKvhost-user-input /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o) /usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from
[PATCH] migration: fix multifd_send_pages() next channel
multifd_send_pages() loops around the available channels, the next channel to use between two calls to multifd_send_pages() is stored inside a local static variable, next_channel. It works well, except if the number of channels decreases between two calls to multifd_send_pages(). In this case, the loop can try to access the data of a channel that doesn't exist anymore. The problem can be triggered if we start a migration with a given number of channels and then we cancel the migration to restart it with a lower number. This ends generally with an error like: qemu-system-ppc64: .../util/qemu-thread-posix.c:77: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed. This patch fixes the error by capping next_channel with the current number of channels before using it. Signed-off-by: Laurent Vivier --- migration/multifd.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/migration/multifd.c b/migration/multifd.c index 5a3e4d0d46d1..d0441202aae9 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -415,6 +415,12 @@ static int multifd_send_pages(QEMUFile *f) } qemu_sem_wait(_send_state->channels_ready); +/* + * next_channel can remain from a previous migration that was + * using more channels, so ensure it doesn't overflow if the + * limit is lower now. + */ +next_channel %= migrate_multifd_channels(); for (i = next_channel;; i = (i + 1) % migrate_multifd_channels()) { p = _send_state->params[i]; -- 2.26.2