Hi everybody,

Here are the notes from the last Hypervisor Live Update call that happened on Monday, July 14. Thanks to everybody who was involved!
These notes are intended to bring people who could not attend the call up to speed, as well as to keep the conversation going in between meetings.

----->o-----

Mike Rapoport discussed a proposal about sticky preservation with KHO. A colleague had been working on integration with Microsoft Hyper-V (MSHV). They want to mark memory as persistent indefinitely so that pages do not need to be marked as persistent again for every kexec; this would continue until there was a request to stop persisting that memory. An example is a memfd backing guest memory, where the layout of memory doesn't change much; another example might be PCI configuration. This may be used for memory that is obtained from the HV but then not tracked anywhere anymore.

Pasha Tatashin suggested that this would be another user of KHO and not part of LUO. Mike suggested there may be states involved with this memory, since you could eventually mark it as not sticky anymore. Until it is marked as no longer sticky, it would be a permanently included fd. Jason Gunthorpe asked what we would get back on the other side of the kexec. Pratyush Yadav noted that we can't change the size of this memory without more effort. Jason suggested that MSHV would just use a KHO provider and preserve with LUO as normal, because you need to track the memory.

We might schedule additional time to discuss this in a follow-up when joined by MSHV developers.

----->o-----

We discussed the current status of LUO. Pasha at the time had sent out v1 of the patch series (non-RFC) and received some comments. The current version should address any concerns from the RFC version and also includes memfd preservation from Pratyush. There is an on-going discussion about using kexec fs instead of using tokens. Jason suggested against using filesystems if at all possible. Pasha noted this could be a future extension and does not need to be part of the core LUO.
Mike noted there were two opinions being expressed: that ioctls shouldn't be used, and how much dev tmpfs differs from kernel filesystems. Jason noted this is just a regular character device like hundreds of other devices in the kernel.

Pratyush brought up the previous conversation about handing out sessions so that when serialization is done, it is done per session. The agent decides who gets access to these sessions. David Matlack asked if the agent could hand out sessions to unprivileged processes and, if so, whether this would address concerns about all saved fds being sent to the agent and context being stored in the agent's process before being sent out again. There was general agreement on session fds; an open question remained about whether the fd passing would be done with sockets or bind mounts. An example Jason provided would be to create a session through a file in a filesystem and then give that to qemu; qemu would then open it, do an ioctl on it, and provide the kvmfd. Pasha believed that it was very clean to do the policy through a userspace agent rather than in the kernel.

----->o-----

We transitioned to discussing the agent, liveupdated, which the group thought was going to be needed. There was a desire to have both Christian and systemd people join the call, so I took the action item to reach out and see if they would be available. I asked where liveupdated would live: whether this is something that would be shipped along with the kernel itself or whether it would perhaps be its own open source entity.

Pratyush experimented with fdstore and was looking into whether a process could survive through the reboot call, or at least whether an fd could survive through that call in systemd. Pasha suggested there should be a way for the agent to survive until the reboot call if you modify the target to not kill the agent process. He also suggested that while it may be attractive to have this as part of the systemd tree, it would be separated from that entirely.

----->o-----

Chris Li gave a quick update on PCI preservation. He said the current support allows a PCI device to preserve its state across the kexec and allows for supporting DMA during reboot. He suggested the initial patch series to modify the PCI interfaces to allow registration will be straightforward. However, feedback and discussion will be needed for changing PCI initialization for devices that already have state; these changes could be invasive.

----->o-----

David Matlack shared that the live update microconference was accepted for LPC! It now shows up for the CFP so people can submit talks for this[1].

----->o-----

Next meeting will be on Monday, July 28 at 8am PDT (UTC-7), everybody is welcome: https://meet.google.com/rjn-dmzu-hgq

Topics for the next meeting:

 - follow-up on sticky preservations with KHO, any additional insight provided for MSHV use cases
 - discuss any on-going pushback against character devices and suggestions for using a filesystem instead of ioctls
 - status update on the liveupdated agent, design, and timelines, as well as open sourcing it and libluo in its own repository
 - Frank van der Linden: physical pool allocator, used to provide memory for hugetlb, guest_memfd, etc
 - Chris Li: update on PCI preservation, registration, and initialization, and the RFC patch series discussed in last meeting
 - later: testing methodology to allow downstream consumers to qualify that live update works from one version to another
 - later: reducing blackout window during live update

Please let me know if you'd like to propose additional topics for discussion, thank you!

[1] https://lpc.events/event/19/contributions/2004/