Bug#1062044: qemu 7.2+dfsg-7+deb12u4 flagged for acceptance
On Tue, 6 Feb 2024 19:37:46 +0300 Michael Tokarev wrote:
03.02.2024 12:47, Michael Tokarev wrote:
>>> It looks like we broke suspend/resume in this version of qemu.
>> Oops. Is that related to the cryptsetup failure, or a separate issue?
>
> Yes, it is related to cryptsetup autopkgtest failure. It looks
> like this is the only place where suspend/resume code in qemu
> is actually being used, - it's rather rare to suspend (hibernate)
> a virtual machine, and cryptsetup performs testing of how the
> encrypted filesystem is unlocked (or not) on resume.

So, while the problematic upstream commit fixes quite a few real potential qemu lockups, it introduces a new lockup in suspend-resume-hibernate cycle. The problem isn't understood yet, and we're getting close to the 12.5 release.

The problematic upstream commit (on master) is this one:
https://gitlab.com/qemu-project/qemu/-/commit/effd60c878176bcaf97fa7ce2b12d04bb8ead6f7
It has links to 2 bugs it is fixing, and there are quite a few other bugs which are fixed too.

It turned out this commit was innocent. The bug most likely is somewhere in qemu, but it is triggered by the guest kernel, it looks like, not by this qemu commit. Current bookworm kernels (6.1.19 and 6.1.20) fail in this same suspend/resume test in all current versions of qemu, including the ones with this commit applied, including current qemu master, and including versions much older than that, - including original qemu as initially released with bookworm, before all updates.

It is not yet clear what's going on here. But we'll have to live with that somehow, and, - I guess - have to live with the broken cryptsetup autopkgtests.

I'm preparing a new upstream stable/bugfix version of qemu 7.2.x for bookworm, which will include a few CVE fixes, many other fixes all over the place, and re-introduction of this commit too, - which, as it turns out, has actually nothing to do with the broken suspend-resume.

Thanks,

/mjt
Bug#1062044: qemu 7.2+dfsg-7+deb12u4 flagged for acceptance
06.02.2024 20:55, Adam D. Barratt:
> On Tue, 2024-02-06 at 20:49 +0300, Michael Tokarev wrote:
..
>> The change isn't small per se, as the commit is rather large (mostly
>> due to many changed tests, - it changes order of output in quite some
>> places). Here's the diffstat:
>>
>> monitor/qmp.c | 17 +
>> qapi/qmp-dispatch.c | 24 +---
>
> This is the relevant bit for size IMO. If you're happy with the result
> then please upload as soon as you're ready.

Yes, I'm happy with the result. Well, - as much as one can be happy here, choosing between one bug or another, - but it is at least a status-quo and we don't have known regressions in debian stable due to this.

I just re-ran the upstream testsuite just to be extra sure, and am running my bunch of guests as well, everything works as expected so far. I won't try to reproduce the issues this patch (which I'm reverting) fixed, though ;)

Uploaded +deb12u5 now, waiting to be picked up.

Thank you for the patience and all the work!

/mjt
Bug#1062044: qemu 7.2+dfsg-7+deb12u4 flagged for acceptance
On Tue, 2024-02-06 at 20:49 +0300, Michael Tokarev wrote:
> 06.02.2024 20:33, Adam D. Barratt:
> > On Tue, 2024-02-06 at 19:37 +0300, Michael Tokarev wrote:
> > > problematic upstream commit (on master) is this one:
> > > https://gitlab.com/qemu-project/qemu/-/commit/effd60c878176bcaf97fa7ce2b12d04bb8ead6f7
> >
> > Technically we already froze p-u for 12.5 on Sunday evening, as
> > previously announced. If you could get an upload just fixing that
> > single issue with a small change uploaded today then I'd be tempted
> > to accept it anyway.
>
> Oh. I knew we're getting late, but not *that* late.

The point release(s) are on Saturday, and we always freeze a week beforehand.

> The change isn't small per se, as the commit is rather large (mostly
> due to many changed tests, - it changes order of output in quite some
> places). Here's the diffstat:
>
> monitor/qmp.c | 17 +
> qapi/qmp-dispatch.c | 24 +---

This is the relevant bit for size IMO. If you're happy with the result then please upload as soon as you're ready.

Regards,

Adam
Bug#1062044: qemu 7.2+dfsg-7+deb12u4 flagged for acceptance
06.02.2024 20:33, Adam D. Barratt:
> On Tue, 2024-02-06 at 19:37 +0300, Michael Tokarev wrote:
>> problematic upstream commit (on master) is this one:
>> https://gitlab.com/qemu-project/qemu/-/commit/effd60c878176bcaf97fa7ce2b12d04bb8ead6f7
>
> Technically we already froze p-u for 12.5 on Sunday evening, as
> previously announced. If you could get an upload just fixing that
> single issue with a small change uploaded today then I'd be tempted
> to accept it anyway.

Oh. I knew we're getting late, but not *that* late.

The change isn't small per se, as the commit is rather large (mostly due to many changed tests, - it changes order of output in quite some places). Here's the diffstat:

 monitor/qmp.c                         | 17 +
 qapi/qmp-dispatch.c                   | 24 +---
 tests/qemu-iotests/060.out            |  4 ++--
 tests/qemu-iotests/071.out            |  4 ++--
 tests/qemu-iotests/081.out            | 16
 tests/qemu-iotests/087.out            | 12 ++--
 tests/qemu-iotests/108.out            |  2 +-
 tests/qemu-iotests/109                |  4 ++--
 tests/qemu-iotests/109.out            | 78 +-
 tests/qemu-iotests/117.out            |  2 +-
 tests/qemu-iotests/120.out            |  2 +-
 tests/qemu-iotests/127.out            |  2 +-
 tests/qemu-iotests/140.out            |  2 +-
 tests/qemu-iotests/143.out            |  2 +-
 tests/qemu-iotests/156.out            |  2 +-
 tests/qemu-iotests/176.out            | 16
 tests/qemu-iotests/182.out            |  2 +-
 tests/qemu-iotests/183.out            |  4 ++--
 tests/qemu-iotests/184.out            | 32
 tests/qemu-iotests/185                |  6 +++---
 tests/qemu-iotests/185.out            | 45 +
 tests/qemu-iotests/191.out            | 16
 tests/qemu-iotests/195.out            | 16
 tests/qemu-iotests/223.out            | 12 ++--
 tests/qemu-iotests/227.out            | 32
 tests/qemu-iotests/247.out            |  2 +-
 tests/qemu-iotests/273.out            |  8
 tests/qemu-iotests/308                |  4 ++--
 tests/qemu-iotests/308.out            |  2 +-
 tests/qemu-iotests/tests/qsd-jobs.out |  4 ++--
 30 files changed, 173 insertions(+), 201 deletions(-)

(as you can see, first two are the gist of it, the rest are the consequences).

I'm including a complete revert of this single commit together with all the testsuite changes, i.e., exactly as it is, - while the upstream testsuite isn't used in debian directly, it still works, and I'm running it right now locally just to be sure (though it definitely worked before that commit was initially applied, so it should be okay).

> Presumably the bugs being fixed by that commit already exist in
> bookworm's qemu, so not including the commit isn't a regression?

Yes, exactly, that's why I wrote about the status-quo.

> Please also attach a debdiff against the previous upload.

Attached.

diff -Nru qemu-7.2+dfsg/debian/changelog qemu-7.2+dfsg/debian/changelog
--- qemu-7.2+dfsg/debian/changelog 2024-01-30 19:15:04.0 +0300
+++ qemu-7.2+dfsg/debian/changelog 2024-02-06 20:38:06.0 +0300
@@ -1,3 +1,12 @@
+qemu (1:7.2+dfsg-7+deb12u5) bookworm; urgency=medium
+
+  * +revert-monitor-only-run-coroutine-commands-in-qemu_aio_context.patch
+    Revert a single upstream change in 7.2.9 which, while fixed a few qemu
+    lockup bugs, introduced a regression in suspend-resume-hibernate cycle
+    (triggered by cryptsetup autopkgtest)
+
+ -- Michael Tokarev Tue, 06 Feb 2024 20:38:06 +0300
+
 qemu (1:7.2+dfsg-7+deb12u4) bookworm; urgency=medium

   [ Michael Tokarev ]
diff -Nru qemu-7.2+dfsg/debian/patches/revert-monitor-only-run-coroutine-commands-in-qemu_aio_context.patch qemu-7.2+dfsg/debian/patches/revert-monitor-only-run-coroutine-commands-in-qemu_aio_context.patch
--- qemu-7.2+dfsg/debian/patches/revert-monitor-only-run-coroutine-commands-in-qemu_aio_context.patch 1970-01-01 03:00:00.0 +0300
+++ qemu-7.2+dfsg/debian/patches/revert-monitor-only-run-coroutine-commands-in-qemu_aio_context.patch 2024-02-06 20:36:21.0 +0300
@@ -0,0 +1,1544 @@
+From 84a139b0289470994f8a518034d69186f5ad5bb9 Mon Sep 17 00:00:00 2001
+From: Michael Tokarev
+Date: Tue, 6 Feb 2024 20:35:22 +0300
+Subject: [PATCH] Revert "monitor: only run coroutine commands in
+ qemu_aio_context"
+
+This reverts commit 8ec90598e922a604c222bdbc6289bed7279dced6.
+Causes a regression at least in suspend-resume-hibernate cycle,
+let's revert it to restore the status quo for now.
+---
+ monitor/qmp.c | 17 ++
+ qapi/qmp-dispatch.c | 24 +
+ tests/qemu-iotests/060.out| 4 +-
+
Bug#1062044: qemu 7.2+dfsg-7+deb12u4 flagged for acceptance
On Tue, 2024-02-06 at 19:37 +0300, Michael Tokarev wrote:
> The problematic upstream commit (on master) is this one:
> https://gitlab.com/qemu-project/qemu/-/commit/effd60c878176bcaf97fa7ce2b12d04bb8ead6f7
> It has links to 2 bugs it is fixing, and there are quite a few
> other bugs which are fixed too.
>
> I can add a revert of this single commit (with all tests) for debian
> stable (for deb12u5 release) on top of current deb12u4. I think
> this would be best, despite the way it goes, - first the change is
> added in v7.2.9.diff, and next removed in a followup revert, -
> because this way we follow upstream releases, and this patch
> will be easy to remove in subsequent update.
[...]
> Sure thing, if the solution will be found in a couple of days,
> I'll try to push that one instead, but it also depends on the
> complexity and possible risks there, and timeline.

Technically we already froze p-u for 12.5 on Sunday evening, as previously announced. If you could get an upload just fixing that single issue with a small change uploaded today then I'd be tempted to accept it anyway.

Presumably the bugs being fixed by that commit already exist in bookworm's qemu, so not including the commit isn't a regression?

Please also attach a debdiff against the previous upload.

Regards,

Adam
Bug#1062044: qemu 7.2+dfsg-7+deb12u4 flagged for acceptance
03.02.2024 12:47, Michael Tokarev wrote:
>>> It looks like we broke suspend/resume in this version of qemu.
>> Oops. Is that related to the cryptsetup failure, or a separate issue?
>
> Yes, it is related to cryptsetup autopkgtest failure. It looks
> like this is the only place where suspend/resume code in qemu
> is actually being used, - it's rather rare to suspend (hibernate)
> a virtual machine, and cryptsetup performs testing of how the
> encrypted filesystem is unlocked (or not) on resume.

So, while the problematic upstream commit fixes quite a few real potential qemu lockups, it introduces a new lockup in suspend-resume-hibernate cycle. The problem isn't understood yet, and we're getting close to the 12.5 release.

The problematic upstream commit (on master) is this one:
https://gitlab.com/qemu-project/qemu/-/commit/effd60c878176bcaf97fa7ce2b12d04bb8ead6f7
It has links to 2 bugs it is fixing, and there are quite a few other bugs which are fixed too.

I can add a revert of this single commit (with all tests) for debian stable (for deb12u5 release) on top of current deb12u4. I think this would be best, despite the way it goes, - first the change is added in v7.2.9.diff, and next removed in a followup revert, - because this way we follow upstream releases, and this patch will be easy to remove in subsequent update.

Alternatively we probably can ignore cryptsetup autopkgtest failure, but this smells somewhat wrong, I think it's better to restore the status quo for now, even in such a weird way (applying and reverting a patch).

What do you think?

Sure thing, if the solution will be found in a couple of days, I'll try to push that one instead, but it also depends on the complexity and possible risks there, and timeline.

Thanks,

/mjt
Bug#1062044: qemu 7.2+dfsg-7+deb12u4 flagged for acceptance
On Sat, 2024-02-03 at 12:47 +0300, Michael Tokarev wrote:
> 03.02.2024 12:43, Adam D. Barratt :
> ..
> > > I'm aware of the autopkgtest failure with cryptsetup, working on it now.
> > > It looks like we broke suspend/resume in this version of qemu.
> >
> > Oops. Is that related to the cryptsetup failure, or a separate
> > issue?
>
> Yes, it is related to cryptsetup autopkgtest failure. It looks
> like this is the only place where suspend/resume code in qemu
> is actually being used, - it's rather rare to suspend (hibernate)
> a virtual machine, and cryptsetup performs testing of how the
> encrypted filesystem is unlocked (or not) on resume.
>
> I already found the upstream commit which broke this (in all
> supported versions of upstream qemu, including master), dunno
> yet what to do with it, - trying to reduce the cryptroot test
> to some manageable minimum.
>
> It'd be sad to avoid updating of qemu due to this. But let's
> see..

Thanks for the update, and for being proactive.

Regards,

Adam
Bug#1062044: qemu 7.2+dfsg-7+deb12u4 flagged for acceptance
03.02.2024 12:43, Adam D. Barratt :
..
>> I'm aware of the autopkgtest failure with cryptsetup, working on it now.
>> It looks like we broke suspend/resume in this version of qemu.
>
> Oops. Is that related to the cryptsetup failure, or a separate issue?

Yes, it is related to cryptsetup autopkgtest failure. It looks like this is the only place where suspend/resume code in qemu is actually being used, - it's rather rare to suspend (hibernate) a virtual machine, and cryptsetup performs testing of how the encrypted filesystem is unlocked (or not) on resume.

I already found the upstream commit which broke this (in all supported versions of upstream qemu, including master), dunno yet what to do with it, - trying to reduce the cryptroot test to some manageable minimum.

It'd be sad to avoid updating of qemu due to this. But let's see..

Thanks,

/mjt
Bug#1062044: qemu 7.2+dfsg-7+deb12u4 flagged for acceptance
On Sat, 2024-02-03 at 11:40 +0300, Michael Tokarev wrote:
> 01.02.2024 11:40, Adam D Barratt :
> ..
> > Package: qemu
> > Version: 7.2+dfsg-7+deb12u4
> >
> > Explanation: new upstream stable release; irtio-net: correctly copy
> > vnet header when flushing TX [CVE-2023-6693]; fix null pointer
> > dereference issue [CVE-2023-6683]
>
> There's a typo here, should be virtio-net.

Oops, copy-n-paste fail; fixed.

> I'm aware of the autopkgtest failure with cryptsetup, working on it
> now.

OK, thanks.

> It looks like we broke suspend/resume in this version of qemu.

Oops. Is that related to the cryptsetup failure, or a separate issue?

Regards,

Adam
Bug#1062044: qemu 7.2+dfsg-7+deb12u4 flagged for acceptance
01.02.2024 11:40, Adam D Barratt :
..
> Package: qemu
> Version: 7.2+dfsg-7+deb12u4
>
> Explanation: new upstream stable release; irtio-net: correctly copy
> vnet header when flushing TX [CVE-2023-6693]; fix null pointer
> dereference issue [CVE-2023-6683]

There's a typo here, should be virtio-net.

I'm aware of the autopkgtest failure with cryptsetup, working on it now.
It looks like we broke suspend/resume in this version of qemu.

Thanks,

/mjt
Bug#1062044: qemu 7.2+dfsg-7+deb12u4 flagged for acceptance
package release.debian.org
tags 1062044 = bookworm pending
thanks

Hi,

The upload referenced by this bug report has been flagged for acceptance into the proposed-updates queue for Debian bookworm.

Thanks for your contribution!

Upload details
==============

Package: qemu
Version: 7.2+dfsg-7+deb12u4

Explanation: new upstream stable release; irtio-net: correctly copy vnet header when flushing TX [CVE-2023-6693]; fix null pointer dereference issue [CVE-2023-6683]