Re: A lingering doubt on PCI-MMIO region of PCI-passthrough-device

Ajay Garg Sat, 20 Dec 2025 05:26:41 -0800

Is guest-acpi-mcfg/mmconfig tables the answer to my last question? :)

i.e. qemu-bios setting up acpi mcfg / mmconfig addresses, which are
backed by mmap'ed pci-config-space mappings on the host (while also
setting up ept-mappings for pci-config-space too)?



On Sat, Dec 20, 2025 at 6:22 PM Ajay Garg <[email protected]> wrote:
>
> Thanks Alex.
>
> I was/am aware of GPA-ranges backed by mmap'ed HVA-ranges.
> On further thought, I think I have all the missing pieces (except one,
> as mentioned at last in current email).
>
> I'll list the steps below :
>
> a)
> There are three stages :
>
>    * pre-configuration by host/qemu.
>    * guest-vm bios.
>    * guest-vm kernel.
>
> b)
> Host procures following memory-slots (amongst others) via mmap :
>
>   * guest-ram
>   * pci-config-space       : via vfio's ioctls' help.
>   * pci-bar-mmio-space : via vfio's ioctls' help.
>
> For the above memory-slots,
>
> *
> guest-ram physical-address is known (0), so ept-mappings for guest-ram
> are set up even before guest-vm begins to boot up.
>
> *
> there is no concept of guest-physical-address for pci-config-space.
>
> *
> pci-bar-mmio-space physical address is not known yet, so ept-mappings
> for pci-bar-mmio-space are not set up (yet).
>
> c)
> qemu starts the guest, and guest-vm-bios runs next.
>
> This bios is "owned by qemu", and is "definitely different" from the
> host-bios (qemu is an altogether different "hardware"). qemu-bios and
> host-bios handle pci bus/enumeration "completely differently".
>
> When the pci-enumeration runs during this guest-vm-bios stage, it
> accesses the pci-device config-space (backed on the host by mmap'ed
> mappings). Note that guest-kernel is still not in picture.
>
> "OBVIOUSLY", all accesses (reads/writes) to pci-config space go to the
> pci-config-space memory-slot (handled purely by qemu-bios code).
>
> Once the guest-vm bios carves out guest-physical-addresses for the
> pci-device-bars, it programs the bars by writing to bars-offsets in
> the pci-config-space. qemu detects this, and does the following :
>
>    * does not relay the actual-writes to physical bars on the host.
>    * since the bar-guest-physical-addresses are now known, so now the
> missing ept-mappings
>      for pci-bar-mmio-space are now set up.
>
> d)
> Finally, guest-kernel takes over, and
>
>    * all accesses to ram go through vanilla two-stages translation.
>    * all accesses to pci-bars-mmio go through vanilla two-stages translation.
>
>
> Requests :
>
> i)
> Alex / QEMU-experts : kindly correct me if I am wrong :) till now.
>
> ii)
> Once kernel boots up, how are accesses to pci-config-space handled? Is
> again qemu-bios involved in pci-config-space accesses after
> guest-kernel has booted up?
>
>
> Once again, many thanks to everyone for their time and help.
>
> Thanks and Regards,
> Ajay
>
>
> On Sat, Dec 20, 2025 at 5:36 AM Alex Williamson <[email protected]> wrote:
> >
> > On Fri, 19 Dec 2025 11:53:56 +0530
> > Ajay Garg <[email protected]> wrote:
> >
> > > Hi Alex.
> > > Kindly help if the steps listed in the previous email are correct.
> > >
> > > (Have added qemu mailing-list too, as it might be a qemu thing too as
> > > virtual-pci is in picture).
> > >
> > > On Mon, Dec 15, 2025 at 9:20 AM Ajay Garg <[email protected]> wrote:
> > > >
> > > > Thanks Alex.
> > > >
> > > > So does something like the following happen :
> > > >
> > > > i)
> > > > During bootup, guest starts pci-enumeration as usual.
> > > >
> > > > ii)
> > > > Upon discovering the "passthrough-device", guest carves the physical
> > > > MMIO regions (as usual) in the guest's physical-address-space, and
> > > > starts-to/attempts to program the BARs with the
> > > > guest-physical-base-addresses carved out.
> > > >
> > > > iii)
> > > > These attempts to program the BARs (lying in the
> > > > "passthrough-device"'s config-space), are intercepted by the
> > > > hypervisor instead (causing a VM-exit in the interim).
> > > >
> > > > iv)
> > > > The hypervisor uses the above info to update the EPT, to ensure GPA =>
> > > > HPA conversions go fine when the guest tries to access the PCI-MMIO
> > > > regions later (once gurst is fully booted up). Also, the hypervisor
> > > > marks the operation as success (without "really" re-programming the
> > > > BARs).
> > > >
> > > > v)
> > > > The VM-entry is called, and the guest resumes with the "impression"
> > > > that the BARs have been "programmed by guest".
> > > >
> > > > Is the above sequencing correct at a bird's view level?
> >
> > It's not far off.  The key is simply that we can create a host virtual
> > mapping to the device BARs, ie. an mmap.  The guest enumerates emulated
> > BARs, they're only used for sizing and locating the BARs in the guest
> > physical address space.  When the guest BAR is programmed and memory
> > enabled, the address space in QEMU is populated at the BAR indicated
> > GPA using the mmap backing.  KVM memory slots are used to fill the
> > mappings in the vCPU.  The same BAR mmap is also used to provide DMA
> > mapping of the BAR through the IOMMU in the legacy type1 IOMMU backend
> > case.  Barring a vIOMMU, the IOMMU IOVA space is the guest physical
> > address space.  Thanks,
> >
> > Alex

Re: A lingering doubt on PCI-MMIO region of PCI-passthrough-device

Reply via email to