On Tue, Oct 21, 2025 at 06:26:39PM +0200, Eric Auger wrote:
> Hi Nicolin,
>
> On 10/20/25 8:00 PM, Nicolin Chen wrote:
> > On Mon, Oct 20, 2025 at 06:14:33PM +0200, Eric Auger wrote:
> >>>> This will cause the device to be configured with wrong MSI doorbell
> >>>> address if it returns the system address space.
> >>> I think it'd be nicer to elaborate why a wrong address will be returned:
> >>>
> >>> --------------------------------------------------------------------------
> >>> On ARM, a device behind an IOMMU requires translation for its MSI doorbell
> >>> address. When HW nested translation is enabled, the translation will also
> >>> happen in two stages: gIOVA => gPA => ITS page.
> >>>
> >>> In the accelerated SMMUv3 mode, both stages are translated by the HW. So,
> >>> get_address_space() returns the system address space for stage-2 mappings,
> >>> as the smmuv3-accel model doesn't involve in either stage.
> >> I don't understand "doesn't involve in either stage". This is still not
> >> obvious to me that for an HW accelerated nested IOMMU get_address_space()
> >> shall return the system address space. I think this deserves to be
> >> explained and maybe documented along with the callback.
> > get_address_space() is used by pci_device_iommu_address_space(),
> > which is for attach or translation.
> >
> > In QEMU, we have an "iommu" type of memory region, to represent
> > the address space providing the stage-1 translation.
> >
> > In accel case excluding MSI, there is no need of "emulated iommu
> > translation" since HW/host SMMU takes care of both stages. Thus,
> > the system address is returned for get_address_space(), to avoid
> > stage-1 translation and to also allow VFIO devices to attach to
> > the system address space that the VFIO core will monitor to take
> > care of stage-2 mappings.
> but in general if you set as output 'as' the system_address_memory it
> rather means you have no translation in place. This is what I am not
> convinced about.

You mean you are not convinced about "no translation"?

> you say it aims at
> - avoiding stage-1 translation
> - allow VFIO devices to attach to the system address space that the
>   VFIO core will monitor to take care of stage-2 mappings.
> Can you achieve the same goals with a proper address space?

Would you please define "proper"?

The disagreement is seemingly about using the system address space,
or even address_space_memory, IIUC.

For our purpose here, so long as the VFIO core can set up a proper
listener to monitor the guest physical address space, we are fine
with any alternative.

The system address space just seems to be the simplest one. FWIW,
kvm_arch_fixup_msi_route() also checks at the beginning:
    if (as == &address_space_memory)

So, returning @address_space_memory seems to be straightforward?
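
To make the point concrete, here is a minimal standalone sketch of the
behaviour being discussed. It is not QEMU code: the Mock* types and
mock_* functions are hypothetical stand-ins that I made up for
illustration, and only the pointer comparison mirrors the check quoted
above. It shows that once the callback hands back the system address
space for an accel device, the MSI path takes the early "no IOMMU"
branch, so the doorbell address would be left untranslated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the real QEMU types. */
typedef struct MockAddressSpace {
    const char *name;
} MockAddressSpace;

typedef struct MockDevice {
    bool accel;                /* behind an accelerated (HW nested) SMMUv3? */
    MockAddressSpace iommu_as; /* per-device AS with an emulated IOMMU region */
} MockDevice;

/* Stand-in for the global system address space. */
static MockAddressSpace mock_address_space_memory = { "memory" };

/*
 * Models the get_address_space() behaviour described in this thread:
 * accel devices get the system address space, emulated ones get the
 * per-device IOMMU address space.
 */
static MockAddressSpace *mock_get_address_space(MockDevice *dev)
{
    assert(dev != NULL);
    return dev->accel ? &mock_address_space_memory : &dev->iommu_as;
}

/*
 * Models the early check in the MSI route fixup: returns true when the
 * doorbell address would be used as-is, without any IOMMU translation.
 */
static bool mock_msi_route_skips_translation(MockDevice *dev)
{
    return mock_get_address_space(dev) == &mock_address_space_memory;
}
```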

I think I also need some education to understand why we need
an indirect address space that eventually will be routed back to
address_space_memory.

Thanks
Nicolin