On Fri, Jan 12, 2024 at 3:41 PM Chen, Jiqian <jiqian.c...@amd.com> wrote:
>
> Hi all,
> Sorry for the late reply.
> I don't know if you still remember this problem, so let me briefly describe it.
> I am working on implementing the virtgpu S3 function on Xen.
> Currently on Xen, if we start a guest through qemu with virtgpu enabled and
> then suspend and resume the guest, the guest kernel comes back, but the
> display does not. It just shows a black screen.
> That is because during suspend the guest calls into qemu, and qemu destroys
> all display resources and resets the renderer. This makes the display gone
> after the guest resumes.
> So I added a new command to virtio-gpu: when the guest is suspending, it
> notifies qemu and sets a parameter (preserver_resource) to 1, and qemu then
> preserves its resources; when resuming, the guest notifies qemu to set the
> parameter back to 0, and qemu returns to its normal behavior. That lets the
> guest's display come back.
> When I upstreamed the above implementation, Parav and MST suggested that I
> use the PM capability to fix this problem instead of adding a new command or
> state bit.
> I have now tried the PM capability of virtio-gpu, and it can't be used to
> solve this problem.
> The reason is:
> during guest suspend, the guest writes the D3 state through the PM cap, so I
> can save the virtio-gpu resources on the qemu side (set preserver_resource
> to 1), but during resume the state of the PM cap is cleared by qemu's reset
> (qemu_system_wakeup -> qemu_devices_reset -> virtio_vga_base_reset ->
> virtio_gpu_gl_reset).
> As a result, when the guest reads the state from the PM cap, it finds that
> virtio-gpu is already in the D0 state, so the guest never writes D0 through
> the PM cap, and I can't know when to restore the resources (set
> preserver_resource to 0).
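To make the proposal in the quoted message concrete, here is a minimal sketch
of the guest-side S3 hooks it implies, assuming a hypothetical
VIRTIO_GPU_CMD_PRESERVE_RESOURCE command; virtio_gpu_cmd_preserve() and its
flag argument are illustrative only, not the actual patch:

/*
 * Sketch only: guest virtio-gpu S3 freeze/restore hooks that tell the
 * device (QEMU) to keep or release display resources across suspend.
 */
#include <linux/virtio.h>

struct virtio_gpu_device;			/* defined in the virtio-gpu driver */
void virtio_gpu_cmd_preserve(struct virtio_gpu_device *vgdev,
			     unsigned int preserve);	/* hypothetical helper */

static int virtio_gpu_freeze(struct virtio_device *vdev)
{
	struct virtio_gpu_device *vgdev = vdev->priv;

	/* Suspending: ask the device to preserve resources across S3. */
	virtio_gpu_cmd_preserve(vgdev, 1);
	return 0;
}

static int virtio_gpu_restore(struct virtio_device *vdev)
{
	struct virtio_gpu_device *vgdev = vdev->priv;

	/* Resuming: let the device go back to normal resource handling. */
	virtio_gpu_cmd_preserve(vgdev, 0);
	return 0;
}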
Looks like a bug to be fixed?

> Do you have any other suggestions?
> Or can I just fall back to the version that adds a new command
> (VIRTIO_GPU_CMD_PRESERVE_RESOURCE) to virtio-gpu? I think that way is more
> reasonable and feasible for virtio-gpu to protect display resources during
> S3. As for other devices, if necessary, they can also follow the virtio-gpu
> implementation and add new commands to prevent resource loss during S3.

Note that there's recently a fix for no_soft_reset; a well-behaved device
should not suffer from this issue anymore. If I understand this correctly,
there's no need for any extension to the spec either:

https://lore.kernel.org/lkml/CACGkMEs_MajTFxGVcK5R8oqQzTxuL4Pm=uunoondczwzqau...@mail.gmail.com/T/

Thanks

>
> On 2023/10/27 11:03, Chen, Jiqian wrote:
> > Hi Michael S. Tsirkin and Parav Pandit,
> > Thank you for your detailed explanation. I will try to use the PM cap to
> > fix this issue.
> >
> > On 2023/10/26 18:30, Michael S. Tsirkin wrote:
> >> On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
> >>>
> >>> On 2023/10/25 11:51, Parav Pandit wrote:
> >>>>
> >>>>
> >>>>> From: Chen, Jiqian <jiqian.c...@amd.com>
> >>>>> Sent: Tuesday, October 24, 2023 5:44 PM
> >>>>>
> >>>>> On 2023/10/24 18:51, Parav Pandit wrote:
> >>>>>>
> >>>>>>> From: Chen, Jiqian <jiqian.c...@amd.com>
> >>>>>>> Sent: Tuesday, October 24, 2023 4:06 PM
> >>>>>>>
> >>>>>>> On 2023/10/23 21:35, Parav Pandit wrote:
> >>>>>>>>
> >>>>>>>>> From: Chen, Jiqian <jiqian.c...@amd.com>
> >>>>>>>>> Sent: Monday, October 23, 2023 4:09 PM
> >>>>>>>>
> >>>>>>>>>> I think this can be done without introducing the new register.
> >>>>>>>>>> Can you please check if the PM register itself can serve the
> >>>>>>>>>> purpose instead of a new virtio-level register?
> >>>>>>>>> Do you mean the system PM register?
> >>>>>>>> No, the device's PM register at the transport level.
> >>>>>>> I tried to find this register (at the pci level, virtio-pci level,
> >>>>>>> or virtio driver level), but I didn't find it in the Linux kernel or
> >>>>>>> Qemu code.
> >>>>>>> May I know which register you are referring to specifically? Or
> >>>>>>> which PM state bit did you mention below?
> >>>>>>>
> >>>>>> PCI spec's PCI Power Management Capability Structure in section 7.5.2.
> >>>>> Yes, what you point to is the PM capability for PCIe devices.
> >>>>> But the problem is still that the Qemu code checks the condition
> >>>>> (pci_bus_is_express or pci_is_express) for all virtio-pci devices in
> >>>>> virtio_pci_realize(); if a virtio device isn't a PCIe device, it will
> >>>>> not add the PM capability for it.
> >>>> PCI PM capability is a must for PCIe devices, so maybe the QEMU code
> >>>> has put it only under the is_pcie check.
> >>>> But it can be done outside of that check as well, because this
> >>>> capability has existed on PCI too for a long time and it is backward
> >>>> compatible.
> >>> Do you suggest that I implement the PM capability for virtio devices in
> >>> Qemu first, and then try whether the PM capability can work for this
> >>> scenario?
> >>
> >> virtio devices in qemu already have a PM capability.
> >>
> >>
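For comparison, a minimal sketch of the PM-capability route discussed in the
quoted thread, using the standard Linux PCI API; the wrapper function names
below are illustrative, and the problem reported above is precisely that
QEMU's wakeup reset clears this state before the guest ever writes D0:

/*
 * Sketch, not the actual virtio-pci code: moving the device between D0
 * and D3hot through the standard PCI PM capability (PCI spec 7.5.2).
 * pci_set_power_state() updates the PowerState field of the PM
 * Control/Status register, and the device model could key its
 * "preserve" / "restore" behaviour off these transitions.
 */
#include <linux/pci.h>

static int example_virtio_pci_suspend(struct pci_dev *pci_dev)
{
	/* D0 -> D3hot: the device side may start preserving its resources. */
	return pci_set_power_state(pci_dev, PCI_D3hot);
}

static int example_virtio_pci_resume(struct pci_dev *pci_dev)
{
	/* D3hot -> D0: the device side may restore the preserved resources. */
	return pci_set_power_state(pci_dev, PCI_D0);
}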
> >>> If so, we will complicate a simple problem. There is no other need to
> >>> add the PM capability to virtio devices for now; if we add it just for
> >>> preserving resources, it seems unnecessary and unreasonable. And we are
> >>> not sure whether there are other scenarios, outside of a PM state
> >>> change, that also need to preserve resources; if there are, the PM
> >>> register can't cover them, but a preserve_resources register can.
> >>
> >> One of the selling points of virtio is precisely reusing existing
> >> platform capabilities as opposed to coming up with our own.
> >> See abstract.tex
> >>
> >>
> >>> Can I add a note like "If the PM capability is implemented for virtio
> >>> devices, it may cover this scenario, and if there are no other scenarios
> >>> that PM can't cover, then we can remove this register" in the commit
> >>> message or spec description, and let us continue to add the
> >>> preserve_resources register?
> >>
> >> We can't remove registers.
> >>
> >>>>
> >>>>> And another problem: what about MMIO transport devices? Preserving
> >>>>> resources is needed for all transport types.
> >>>>>
> >>>> MMIO lacks such rich PM definitions. If in the future MMIO wants to
> >>>> support this, it will be extended to match other transports like PCI.
> >>>>
> >>>>>>>>
> >>>>>>>>> I think it is unreasonable to let a virtio device listen to the PM
> >>>>>>>>> state of the guest system.
> >>>>>>>> The guest driver performs any work on the guest system's PM
> >>>>>>>> callback events in the virtio driver.
> >>>>>>> I didn't find any PM state callback in the virtio driver.
> >>>>>>>
> >>>>>> There are virtio_suspend and virtio_resume in the case of Linux.
> >>>>> I think what you call virtio_suspend/resume are the freeze/restore
> >>>>> callbacks from "struct virtio_driver" or the suspend/resume callbacks
> >>>>> from "static const struct dev_pm_ops virtio_pci_pm_ops".
> >>>>> And yes, I agree, if virtio devices have the PM capability, maybe we
> >>>>> can set the PM state in those callback functions.
> >>>>>
> >>>>>>
> >>>>>>>>
> >>>>>>>>> It's more suitable that each device gets notifications from the
> >>>>>>>>> driver, and then does the resource-preserving operation.
> >>>>>>>> I agree that each device gets the notification from the driver.
> >>>>>>>> The question is, should it be the virtio driver, or is the existing
> >>>>>>>> pci driver, which transitions the state from d0->d3 and d3->d0,
> >>>>>>>> enough?
> >>>>>>> It seems there isn't an existing pci driver that transitions the d0
> >>>>>>> or d3 state. Could you please tell me which one it is specifically?
> >>>>>>> I am very willing to give it a try.
> >>>>>>>
> >>>>>> The virtio-pci modern driver of Linux should be able to.
> >>>>> Yes, I know, it is VIRTIO_PCI_FLAG_INIT_PM_BIT. But there are still
> >>>>> the two problems I described above.
> >>>>>
> >>>> Both can be resolved without switching to pcie.
> >>>>
> >>>>>>
> >>>>>>>> Can you please check that?
> >>>>>>>>
> >>>>>>>>>>> --- a/transport-pci.tex
> >>>>>>>>>>> +++ b/transport-pci.tex
> >>>>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
> >>>>>>>>>>>         /* About the administration virtqueue. */
> >>>>>>>>>>>         le16 admin_queue_index;         /* read-only for driver */
> >>>>>>>>>>>         le16 admin_queue_num;           /* read-only for driver */
> >>>>>>>>>>> +       le16 preserve_resources;        /* read-write */
> >>>>>>>>>> Preserving these resources in the device implementation takes a
> >>>>>>>>>> finite amount of time.
> >>>>>>>>>> Possibly more than 40 nsec (the time of a PCIe write TLP).
> >>>>>>>>>> Hence this register must be a polling register to indicate
> >>>>>>>>>> preservation_done.
> >>>>>>>>>> This will tell the guest when the preservation is done, and when
> >>>>>>>>>> the restoration is done, so that it can resume the upper layers.
> >>>>>>>>>>
> >>>>>>>>>> Please refer to the queue_reset definition to learn more about
> >>>>>>>>>> such a register definition.
> >>>>>>>>> Thanks, I will refer to "queue_reset". So I need three values: the
> >>>>>>>>> driver writes 1 to make the device preserve resources, the driver
> >>>>>>>>> writes 2 to make the device restore resources, and the device
> >>>>>>>>> writes 0 to tell the driver that preserving or restoring is done.
> >>>>>>>>> Am I right?
> >>>>>>>>>
> >>>>>>>> Right.
> >>>>>>>>
> >>>>>>>> And if the existing pcie pm state bits can do it, we can leverage
> >>>>>>>> that. If they cannot be used, let's add that reasoning in the
> >>>>>>>> commit log to describe this register.
> >>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Let's please make sure that the PCIe PM level registers are
> >>>>>>>>>> sufficient/not sufficient to decide the addition of this
> >>>>>>>>>> register.
> >>>>>>>>> But if the device is not a PCIe device, it doesn't have the PM
> >>>>>>>>> capability, and then this will not work. Actually, in my local
> >>>>>>>>> environment, pci_is_express() returns false in Qemu; they are not
> >>>>>>>>> PCIe devices.
> >>>>>>>> It is reasonable to ask to plug in as a PCIe device in 2023 to get
> >>>>>>>> new functionality, especially since you mentioned a gpu device. 😊
> >>>>>>>> Which does not have a very long history of backward compatibility.
> >>>>>>> Do you suggest that I add the PM capability for virtio-gpu, or
> >>>>>>> change virtio-gpu to a PCIe device?
> >>>>>>>
> >>>>>> The PCI Power Management Capability Structure does not seem to be
> >>>>>> limited to PCIe.
> >>>>> I am not sure, but in the current Qemu code I can see the
> >>>>> "pci_is_express" check for all virtio-pci devices. If we want to add
> >>>>> the PM capability for virtio-pci devices, I think we need to change
> >>>>> them to PCIe devices.
> >>>>>
> >>>> That is one option.
> >>>> The second option is to extend the PCI PM cap to non-PCIe devices,
> >>>> because the capability is supported on plain PCI as well.
> >>>>
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Best regards,
> >>>>> Jiqian Chen.
> >>>
> >>> --
> >>> Best regards,
> >>> Jiqian Chen.
> >>
> >
>
> --
> Best regards,
> Jiqian Chen.
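For reference, a minimal sketch of the polling flow agreed in the exchange
above (driver writes 1 or 2, device writes 0 when done), modeled loosely on
queue_reset; the preserve_resources register is only a proposal, and the
value names and accessors below are illustrative:

/*
 * Sketch of the proposed preserve_resources polling semantics (not part
 * of the current spec): the driver writes 1 to request preservation or
 * 2 to request restoration, then polls until the device writes 0 back.
 */
#include <linux/io.h>
#include <linux/types.h>

#define PRESERVE_DONE     0	/* written by the device when finished */
#define PRESERVE_START    1	/* driver: preserve resources before S3 */
#define PRESERVE_RESTORE  2	/* driver: restore resources after resume */

static void preserve_resources_request(void __iomem *reg, u16 op)
{
	iowrite16(op, reg);			/* start the operation */
	while (ioread16(reg) != PRESERVE_DONE)	/* wait for the device */
		cpu_relax();
}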