>-----Original Message-----
>From: Steven Sistare <[email protected]>
>Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute
>"query-balloon" after CPR transfer
>
>On 10/9/2025 10:19 AM, Daniel P. Berrangé wrote:
>> On Thu, Oct 09, 2025 at 10:11:17AM -0400, Steven Sistare wrote:
>>> On 10/8/2025 6:22 AM, Duan, Zhenzhong wrote:
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Steven Sistare <[email protected]>
>>>>> Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute
>>>>> "query-balloon" after CPR transfer
>>>>>
>>>>> On 9/30/2025 2:00 AM, Duan, Zhenzhong wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Steven Sistare <[email protected]>
>>>>>>> Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute
>>>>>>> "query-balloon" after CPR transfer
>>>>>>>
>>>>>>> On 9/28/2025 4:54 AM, Zhenzhong Duan wrote:
>>>>>>>> After CPR transfer, source QEMU closes kvm fd and sets kvm_state
>to
>>>>>>> NULL,
>>>>>>>> "query-balloon" will check kvm_state->sync_mmu and trigger NULL
>>>>> pointer
>>>>>>>> reference.
>>>>>>>>
>>>>>>>> We don't need to NULL kvm_state as all states in kvm_state aren't
>>>>> released
>>>>>>>> actually. Just closing kvm fd is enough so we could still query states
>>>>>>>> through "query_*" qmp command.
>>>>>>>
>>>>>>> IMO this does not make sense. Much of the state in kvm_state was
>>>>> derived
>>>>>> >from ioctl's on the descriptors, and closing them invalidates it.
>Asking
>>>>>>> historical questions about what used to be makes no sense.
>>>>>>
>>>>>> You also have your valid point.
>>>>>>
>>>>>>>
>>>>>>> Clearing kvm_state and setting kvm_allowed=false would be a safer
>fix.
>>>>>
>>>>> Setting kvm_allowed=false causes kvm_enabled() to return false which
>should
>>>>> prevent kvm_state from being dereferenced anywhere:
>>>>>
>>>>> #define kvm_enabled() (kvm_allowed)
>>>>>
>>>>> Eg for the balloon:
>>>>>
>>>>> static bool have_balloon(Error **errp)
>>>>> {
>>>>> if (kvm_enabled() && !kvm_has_sync_mmu()) {
>>>>
>>>> OK, will do, clearing kvm_allowed is a good idea.
>>>>
>>>> Currently there are two qmp commands "query-balloon" and
>"query-cpu-definitions"
>>>> causing SIGSEGV after CPR-transfer, clearing kvm_allowed helps for both.
>>>> But clearing both kvm_allowed and kvm_state cause SIGSEGV on
>"query-cpu-definitions".
>>>>
>>>> I'll send a patch to clearing only kvm_allowed if you are fine with it.
>>>
>>> I still don't love the idea. kvm_state is no longer valid.
>>>
>>> It sounds like "query-cpu-definitions" is missing a check for kvm_enabled().
>>
>> This patch / bug feels like it is side-stepping a general conceptual
>> question. After cpr-transfer is complete what QMP commands are still
>> valid to call in general ? This thread mentions two which have caused
>> bugs, but beyond that it feels like a large subset of QMP comamnds
>> are conceptually invalid to use.
>
>Agreed. I don't see why these commands should be issued to the dead
>instance.
Uh, yes, that's why I hesitate if it's deserved to fix it.
>How about migration status, quit, and nothing else?
>Half serious.
We only have issues with "query-balloon" and "query-cpu-definitions", all other
"query-*" qmp commands work fine.
{"execute": "query-migrate"}
{"timestamp": {"seconds": 1760067694, "microseconds": 815419}, "event": "STOP"}
{"return": {"status": "completed", "setup-time": 3, "downtime": 14,
"total-time": 67, "ram": {"total": 0, "postcopy-requests": 0,
"dirty-sync-count": 2, "multifd-bytes": 0, "pages-per-second": 0,
"downtime-bytes": 0, "page-size": 4096, "remaining": 0, "postcopy-bytes": 0,
"mbps": 58.152999999999999, "transferred": 465224,
"dirty-sync-missed-zero-copy": 0, "precopy-bytes": 0, "duplicate": 0,
"dirty-pages-rate": 0, "normal-bytes": 0, "normal": 0}}}
{"execute": "quit"}
{"timestamp": {"seconds": 1760067734, "microseconds": 599830}, "event":
"SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
{"return": {}}
Connection closed by foreign host.Connection closed by foreign host.
Thanks
Zhenzhong