On Fri, 27 Feb 2026 09:19:28 -0800 David Matlack <[email protected]> wrote:
> On Fri, Feb 27, 2026 at 8:32 AM Alex Williamson <[email protected]> wrote:
> >
> > On Thu, 26 Feb 2026 00:28:28 +0000
> > David Matlack <[email protected]> wrote:
> > > > > +static int pci_flb_preserve(struct liveupdate_flb_op_args *args)
> > > > > +{
> > > > > +	struct pci_dev *dev = NULL;
> > > > > +	int max_nr_devices = 0;
> > > > > +	struct pci_ser *ser;
> > > > > +	unsigned long size;
> > > > > +
> > > > > +	for_each_pci_dev(dev)
> > > > > +		max_nr_devices++;
> > > >
> > > > How is this protected against hotplug?
> > >
> > > Pranjal raised this as well. Here was my reply:
> > >
> > > . Yes, it's possible to run out space to preserve devices if devices are
> > > . hot-plugged and then preserved. But I think it's better to defer
> > > . handling such a use-case exists (unless you see an obvious simple
> > > . solution). So far I am not seeing preserving hot-plugged devices
> > > . across Live Update as a high priority use-case to support.
> > >
> > > I am going to add a comment here in the next revision to clarify that.
> > > I will also add a comment clarifying why this code doesn't bother to
> > > account for VFs created after this call (preserving VFs are explicitly
> > > disallowed to be preserved in this patch since they require additional
> > > support).
> >
> > TBH, without SR-IOV support and some examples of in-kernel PF
> > preservation in support of vfio-pci VFs, it seems like this only
> > supports a very niche use case.
>
> The intent is to start by supporting a simple use-case and expand to
> more complex scenarios over time, including preserving VFs. Full GPU
> passthrough is common at cloud providers so even non-VF preservation
> support is valuable.
>
> > I expect the majority of vfio-pci
> > devices are VFs and I don't think we want to present a solution where
> > the requirement is to move the PF driver to userspace.
>
> JasonG recommended the upstream support for VF preservation be limited
> to cases where the PF is also bound to VFIO:
>
> https://lore.kernel.org/lkml/[email protected]/
>
> Within Google we have a way to support in-kernel PF drivers but we are
> trying to focus on simpler use-cases first upstream.
>
> > It's not clear,
> > for example, how we can have vfio-pci variant drivers relying on
> > in-kernel channels to PF drivers to support migration in this model.
>
> Agree this still needs to be fleshed out and designed. I think the
> roadmap will be something like:
>
> 1. Get non-VF preservation working end-to-end (device fully preserved
>    and doing DMA continuously during Live Update).
> 2. Extend to support VF preservation where the PF is also bound to vfio-pci.
> 3. (Maybe) Extend to support in-kernel PF drivers.
>
> This series is the first step of #1. I have line of sight to how #2
> could work since it's all VFIO.

Without 3, does this become a mainstream feature?

There's obviously a knee jerk reaction that moving PF drivers into
userspace is a means to circumvent the GPL that was evident at LPC,
even if the real reason is "in-kernel is hard".

Related to that, there's also not much difference between a userspace
driver and an out-of-tree driver when it comes to adding in-kernel code
for their specific support requirements. Therefore, unless migration is
entirely accomplished via a shared dmabuf between PF and VF,
orchestrated through userspace, I'm not sure how we get to migration,
making KHO vs migration a binary choice. I have trouble seeing how
that's a viable intermediate step. Thanks,

Alex

