Peter Xu <pet...@redhat.com> writes:

> On Wed, Sep 13, 2023 at 04:42:31PM -0300, Fabiano Rosas wrote:
>> Stefan Hajnoczi <stefa...@redhat.com> writes:
>>
>> > Hi,
>> > The following intermittent failure occurred in the CI and I have filed
>> > an Issue for it:
>> > https://gitlab.com/qemu-project/qemu/-/issues/1886
>> >
>> > Output:
>> >
>> > >>> QTEST_QEMU_IMG=./qemu-img MALLOC_PERTURB_=116
>> > QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
>> > G_TEST_DBUS_DAEMON=/builds/qemu-project/qemu/tests/dbus-vmstate-daemon.sh
>> > QTEST_QEMU_BINARY=./qemu-system-x86_64
>> > /builds/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
>> > ――――――――――――――――――――――――――――――――――――― ✀ ―――――――――――――――――――――――――――――――――――――
>> > stderr:
>> > qemu-system-x86_64: Unable to read from socket: Connection reset by peer
>> > Memory content inconsistency at 5b43000 first_byte = bd last_byte = bc current = 4f hit_edge = 1
>> > **
>> > ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)
>> > (test program exited with status code -6)
>> >
>> > You can find the full output here:
>> > https://gitlab.com/qemu-project/qemu/-/jobs/5080200417
>>
>> This is the postcopy return path issue that I'm addressing here:
>>
>> https://lore.kernel.org/r/20230911171320.24372-1-faro...@suse.de
>> Subject: [PATCH v6 00/10] Fix segfault on migration return path
>> Message-ID: <20230911171320.24372-1-faro...@suse.de>
>
> Hmm, I just noticed one thing: Stefan's failure is a RAM check issue
> only, which means QEMU won't crash?
The source could have crashed and left the migration in an inconsistent
state, and then the destination saw corrupted memory?

> Fabiano, are you sure it's the same issue on your return-path fix?

I've been running the preempt tests on my branch for thousands of
iterations and didn't see any other errors. Since no code has gone into
the migration tree recently, I assume it's the same error.

I run the tests with GDB attached to QEMU, so I'll always see a crash
before any memory corruption.

> I'm also trying to reproduce either of them with some loads. I think I
> hit some but it's very hard to reproduce solidly.

Well, if you find anything else, let me know and we'll fix it.
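
For reference, the check that fired works roughly like this: the guest in
migration-test loops over the test region incrementing one byte per page
in address order, so a consistent snapshot of guest RAM must be two runs
of adjacent values with at most one transition between them (the
"hit_edge" in the log above; a second mismatch is counted as corruption
and trips the "bad == 0" assertion). Below is a minimal, self-contained
sketch of just that invariant, not the actual migration-test.c code; the
byte array here is a hypothetical stand-in for the one-byte-per-page
reads the real test performs over the qtest protocol.

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Sketch of the invariant behind check_guests_ram(): the guest
 * increments one byte per page, in address order, over and over.  A
 * consistent snapshot therefore holds some value N in the pages the
 * incrementer has already visited this pass and (N - 1) mod 256 in the
 * rest, with at most one transition ("edge") between the two runs.
 */
static int check_ram_pattern(const uint8_t *pages, size_t npages)
{
    uint8_t last_byte = pages[0];
    bool hit_edge = false;
    int bad = 0;

    for (size_t i = 1; i < npages; i++) {
        uint8_t b = pages[i];

        if (b == last_byte) {
            continue;
        }
        if (!hit_edge && (uint8_t)(b + 1) == last_byte) {
            /* The single allowed wrap point where the incrementer stopped. */
            hit_edge = true;
            last_byte = b;
        } else {
            /* Any other value is corruption, as in the CI failure above. */
            fprintf(stderr, "inconsistency at page %zu: current = %x "
                    "last_byte = %x hit_edge = %d\n",
                    i, b, last_byte, hit_edge);
            bad++;
        }
    }
    return bad;
}

int main(void)
{
    uint8_t good[] = { 5, 5, 5, 4, 4 };    /* one edge: consistent */
    uint8_t corrupt[] = { 5, 5, 3, 4, 4 }; /* 3 matches neither run: bad */

    assert(check_ram_pattern(good, sizeof(good)) == 0);
    assert(check_ram_pattern(corrupt, sizeof(corrupt)) != 0);
    printf("invariant check behaved as expected\n");
    return 0;
}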