Re: [virtio-dev] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

Chen, Jiqian Wed, 27 Sep 2023 02:00:41 -0700

On 2023/9/26 19:09, Michael S. Tsirkin wrote:
> On Tue, Sep 26, 2023 at 11:00:28AM +0000, Chen, Jiqian wrote:
>> On 2023/9/26 10:51, Jason Wang wrote:
>>> On Tue, Sep 26, 2023 at 10:24 AM Chen, Jiqian <jiqian.c...@amd.com> wrote:
>>>>
>>>> On 2023/9/25 10:52, Jason Wang wrote:
>>>>> On Fri, Sep 22, 2023 at 4:07 PM Chen, Jiqian <jiqian.c...@amd.com> wrote:
>>>>>>
>>>>>> Hi Jason,
>>>>>>
>>>>>> On 2023/9/22 11:17, Jason Wang wrote:
>>>>>>> On Thu, Sep 21, 2023 at 2:28 PM Chen, Jiqian <jiqian.c...@amd.com> 
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Jason,
>>>>>>>>
>>>>>>>> On 2023/9/21 12:22, Jason Wang wrote:
>>>>>>>>> On Tue, Sep 19, 2023 at 7:43 PM Jiqian Chen <jiqian.c...@amd.com> 
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> When guest vm does S3, Qemu will reset and clear some things of 
>>>>>>>>>> virtio
>>>>>>>>>> devices, but guest can't aware that, so that may cause some problems.
>>>>>>>>>> For excample, Qemu calls virtio_reset->virtio_gpu_gl_reset when guest
>>>>>>>>>> resume, that function will destroy render resources of virtio-gpu. As
>>>>>>>>>> a result, after guest resume, the display can't come back and we only
>>>>>>>>>> saw a black screen. Due to guest can't re-create all the resources, 
>>>>>>>>>> so
>>>>>>>>>> we need to let Qemu not to destroy them when S3.
>>>>>>>>>>
>>>>>>>>>> For above purpose, we need a mechanism that allows guests and QEMU to
>>>>>>>>>> negotiate their reset behavior. So this patch add a new parameter
>>>>>>>>>> named freeze_mode to struct virtio_pci_common_cfg. And when guest
>>>>>>>>>> suspends, it can write freeze_mode to be FREEZE_S3, and then virtio
>>>>>>>>>> devices can change their reset behavior on Qemu side according to
>>>>>>>>>> freeze_mode. What's more, freeze_mode can be used for all virtio
>>>>>>>>>> devices to affect the behavior of Qemu, not just virtio gpu device.
>>>>>>>>>
>>>>>>>>> A simple question, why is this issue specific to pci?
>>>>>>>> I thought you possibly missed the previous version patches. At the 
>>>>>>>> beginning, I just wanted to add a new feature flag 
>>>>>>>> VIRTIO_GPU_F_FREEZE_S3 for virtio-gpu since I encountered virtio-gpu 
>>>>>>>> issue during guest S3, so that the guest and qemu can negotiate and 
>>>>>>>> change the reset behavior during S3. But Parav and Mikhail hoped me 
>>>>>>>> can improve the feature to a pci level, then other virtio devices can 
>>>>>>>> also benefit from it. Although I am not sure if expanding its 
>>>>>>>> influence is appropriate, I have not received any feedback from 
>>>>>>>> others, so I change it to the pci level and made this version.
>>>>>>>> If you are interested, please see the previous version: 
>>>>>>>> https://lists.oasis-open.org/archives/virtio-comment/202307/msg00209.html,
>>>>>>>>  thank you.
>>>>>>>
>>>>>>> This is not a good answer. Let me ask you differently, why don't you
>>>>>>> see it in other forms of transport like virtio-gpu-mmio?
>>>>>> Sorry for not understanding your question before. Did you mean why did I 
>>>>>> add freeze_mode for virtio-pci type devices instead of virtio-mmio type 
>>>>>> devices?
>>>>>
>>>>> Yes.
>>>>>
>>>>>> If that's the case, it isn't well thought out by me, because there is 
>>>>>> only virtio-gpu with pci support in my environment, so I, as a matter of 
>>>>>> course, improve it from virtio-gpu-pci to a virtio-pci level instead of 
>>>>>> virtio-mmio.
>>>>>> Do you think I need to add freeze_mode for virtio-mmio too?
>>>>>
>>>>> Because the function you describe is to preserve the render resources
>>>>> which have nothing specific to PCI. It's a core virtio function.
>>>>>
>>>>> I'd suggest fixing it at the virtio level instead of PCI level.
>>>> Yes, you are right, I will change to describe it to the virtio level in 
>>>> next version.
>>>>
>>>>>
>>>>>> Or improve it to a higher level, like virtio-bus? I am looking forward 
>>>>>> to getting your suggestions, thanks.
>>>>>
>>>>> As pointed out in other thread, reusing the existing status state
>>>>> machine is a perfect way for doing this.
>>>>>
>>>>> We can tweak the state machine to be able to do both migration and S3.
>>>> I noticed the other thread you said, it will add a SUSPEND status to into 
>>>> device_status field, but it seems not enough to meet all scenarios, like 
>>>> we need to set the unfreeze state of guest,
>>>
>>> It's just a resume by clearing the suspend bit?
>> It is not to trigger resume behavior to device by setting unfreeze in my 
>> scenario.
>> You may misunderstand my intention.
>> In current kernel and qemu code, it will trigger reset device behavior by 
>> setting device_status to 0 during resuming, and I found some operations of 
>> reset is not reasonable during resuming, like virtio_gpu_reset, it will 
>> destroy render resources and then caused the display gone after guest 
>> finishing reusming.
>> So, I will set guest to S3 when suspending, and then qemu will notice the S3 
>> state of guest, and then it will not destroy render resources for virtio-gpu 
>> during resuming. The unfreeze state will be set after guest completed 
>> resuming.
>> I think other virtio devices can also change the unreasonable behavior of 
>> resetting during resuming according the S3 state. And this will not affect 
>> other reset situation, just reset during resuming.
>> I think this patch doesn't conflict with the patch in another thread, and 
>> they have different roles.
>> Perhaps I need to use different words to avoid misunderstandings, not freeze 
>> or suspend, maybe state_s3, state_normal etc.
> 
> 
> And then I have to ask - aren't any of the standard power states
> appropriate here? I am not a PM expert but I think that it's unusual to
> have device get the system state as opposed to device power state. No?
I didn't express clearly.
It's not to have device get the system state. In my original patch, I just 
wanted virtio-gpu driver to let virtio-gpu device know the freeze status by 
setting VIRTIO_GPU_CMD_SET_FREEZE_MODE(virtio_gpu_freeze).
This version improves above mechanism to virtio-pci level, and now, it is to 
have virtio-pci devices get the freeze status(virtio_pci_freeze).


We can say:
Driver requirements:
If VIRTIO_PCI_COMMON_F_MODE is negotiated, driver MUST set freeze_mode into 
device when virtio_pci enter Freeze(virtio_pci_freeze is called).
If VIRTIO_PCI_COMMON_F_MODE is negotiated, driver MUST set unfreeze_mode into 
device when virtio_pci finish Restore(virtio_pci_restore is called).

Device requirements:
If VIRTIO_PCI_COMMON_F_MODE is negotiated, since reset(virtio_reset) will be 
done between freeze and restore, device can keep resources or change reset 
behavior according to the freeze_mode field during that period.

I will provide a more reasonable and detailed description in the next version.

> 
> 
>>>
>>>>
>>>
>>>> the S3 state of the guest, the S4 state of the guest etc.
>>>
>>> Hibernation is different from suspending, it probably requires
>>> setting/getting device states. And I don't get how this patch can help
>>> with hibernation.
>> The same reason as above, guest can set it to S4 state, and then we can 
>> change some behaviors on qemu side, so that to help guest resume from 
>> hibernation better.
>>
>>>
>>>> So, one bit is not enough to indicate what status the guest is, since the 
>>>> freeze mode is a series,
>>>
>>> We can tweak the existing status state machine, otherwise you need to
>>> explain the interaction between the two. For example what happens if
>>> we freeze after reset etc.
>>>
>>> And if you do that, it is actually a single state machine.
>>>
>>> Thanks
>>>
>>>> I think add a new field to describe it is more reasonable.
>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Jiqian Chen <jiqian.c...@amd.com>
>>>>>>>>>> ---
>>>>>>>>>>  transport-pci.tex | 7 +++++++
>>>>>>>>>>  1 file changed, 7 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/transport-pci.tex b/transport-pci.tex
>>>>>>>>>> index a5c6719..2543536 100644
>>>>>>>>>> --- a/transport-pci.tex
>>>>>>>>>> +++ b/transport-pci.tex
>>>>>>>>>> @@ -319,6 +319,7 @@ \subsubsection{Common configuration structure 
>>>>>>>>>> layout}\label{sec:Virtio Transport
>>>>>>>>>>          le64 queue_desc;                /* read-write */
>>>>>>>>>>          le64 queue_driver;              /* read-write */
>>>>>>>>>>          le64 queue_device;              /* read-write */
>>>>>>>>>> +        le16 freeze_mode;               /* read-write */
>>>>>>>>>>          le16 queue_notif_config_data;   /* read-only for driver */
>>>>>>>>>>          le16 queue_reset;               /* read-write */
>>>>>>>>>>
>>>>>>>>>> @@ -393,6 +394,12 @@ \subsubsection{Common configuration structure 
>>>>>>>>>> layout}\label{sec:Virtio Transport
>>>>>>>>>>  \item[\field{queue_device}]
>>>>>>>>>>          The driver writes the physical address of Device Area here. 
>>>>>>>>>>  See section \ref{sec:Basic Facilities of a Virtio Device / 
>>>>>>>>>> Virtqueues}.
>>>>>>>>>>
>>>>>>>>>> +\item[\field{freeze_mode}]
>>>>>>>>>> +        The driver writes this to set the freeze mode of virtio pci.
>>>>>>>>>> +        VIRTIO_PCI_FREEZE_MODE_UNFREEZE - virtio-pci is running;
>>>>>>>>>> +        VIRTIO_PCI_FREEZE_MODE_FREEZE_S3 - guest vm is doing S3, 
>>>>>>>>>> and virtio-pci enters S3 suspension;
>>>>>>>>>> +        Other values are reserved for future use, like S4, etc.
>>>>>>>>>> +
>>>>>>>>>>  \item[\field{queue_notif_config_data}]
>>>>>>>>>>          This field exists only if VIRTIO_F_NOTIF_CONFIG_DATA has 
>>>>>>>>>> been negotiated.
>>>>>>>>>>          The driver will use this value when driver sends available 
>>>>>>>>>> buffer
>>>>>>>>>> --
>>>>>>>>>> 2.34.1
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
>>>>>>>>> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best regards,
>>>>>>>> Jiqian Chen.
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Jiqian Chen.
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
>>>>> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
>>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Jiqian Chen.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
>>> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
>>>
>>
>> -- 
>> Best regards,
>> Jiqian Chen.
> 

-- 
Best regards,
Jiqian Chen.

Re: [virtio-dev] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

Reply via email to