>-----Original Message----- >From: Daniel P. Berrangé <[email protected]> >Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute >"query-balloon" after CPR transfer > >On Thu, Oct 09, 2025 at 10:32:34AM -0400, Steven Sistare wrote: >> On 10/9/2025 10:19 AM, Daniel P. Berrangé wrote: >> > On Thu, Oct 09, 2025 at 10:11:17AM -0400, Steven Sistare wrote: >> > > On 10/8/2025 6:22 AM, Duan, Zhenzhong wrote: >> > > > >> > > > >> > > > > -----Original Message----- >> > > > > From: Steven Sistare <[email protected]> >> > > > > Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute >> > > > > "query-balloon" after CPR transfer >> > > > > >> > > > > On 9/30/2025 2:00 AM, Duan, Zhenzhong wrote: >> > > > > > > -----Original Message----- >> > > > > > > From: Steven Sistare <[email protected]> >> > > > > > > Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when >execute >> > > > > > > "query-balloon" after CPR transfer >> > > > > > > >> > > > > > > On 9/28/2025 4:54 AM, Zhenzhong Duan wrote: >> > > > > > > > After CPR transfer, source QEMU closes kvm fd and sets >kvm_state to >> > > > > > > NULL, >> > > > > > > > "query-balloon" will check kvm_state->sync_mmu and trigger >NULL >> > > > > pointer >> > > > > > > > reference. >> > > > > > > > >> > > > > > > > We don't need to NULL kvm_state as all states in kvm_state >aren't >> > > > > released >> > > > > > > > actually. Just closing kvm fd is enough so we could still query >states >> > > > > > > > through "query_*" qmp command. >> > > > > > > >> > > > > > > IMO this does not make sense. Much of the state in kvm_state >was >> > > > > derived >> > > > > > >from ioctl's on the descriptors, and closing them invalidates it. >Asking >> > > > > > > historical questions about what used to be makes no sense. >> > > > > > >> > > > > > You also have your valid point. >> > > > > > >> > > > > > > >> > > > > > > Clearing kvm_state and setting kvm_allowed=false would be a >safer fix. >> > > > > >> > > > > Setting kvm_allowed=false causes kvm_enabled() to return false >which should >> > > > > prevent kvm_state from being dereferenced anywhere: >> > > > > >> > > > > #define kvm_enabled() (kvm_allowed) >> > > > > >> > > > > Eg for the balloon: >> > > > > >> > > > > static bool have_balloon(Error **errp) >> > > > > { >> > > > > if (kvm_enabled() && !kvm_has_sync_mmu()) { >> > > > >> > > > OK, will do, clearing kvm_allowed is a good idea. >> > > > >> > > > Currently there are two qmp commands "query-balloon" and >"query-cpu-definitions" >> > > > causing SIGSEGV after CPR-transfer, clearing kvm_allowed helps for >both. >> > > > But clearing both kvm_allowed and kvm_state cause SIGSEGV on >"query-cpu-definitions". >> > > > >> > > > I'll send a patch to clearing only kvm_allowed if you are fine with it. >> > > >> > > I still don't love the idea. kvm_state is no longer valid. >> > > >> > > It sounds like "query-cpu-definitions" is missing a check for >kvm_enabled(). >> > >> > This patch / bug feels like it is side-stepping a general conceptual >> > question. After cpr-transfer is complete what QMP commands are still >> > valid to call in general ? This thread mentions two which have caused >> > bugs, but beyond that it feels like a large subset of QMP comamnds >> > are conceptually invalid to use. >> >> Agreed. I don't see why these commands should be issued to the dead >instance. >> How about migration status, quit, and nothing else? >> Half serious. > >I was hoping you'd suggest such a short list :-) > >Essentially a case of calling qmp_for_each_command(), and in the callback >run qmp_disable_command() for everything not in the allow-list.
Thanks for your suggestion, I agree this is a perfect scheme, but is also heavy on this corner case😊 Just curious if we need to do same for live migration. E.g., send qmp command on source qemu after live migration. Thanks Zhenzhong
