Hi Eric,

> -----Original Message-----
> From: Eric Auger <[email protected]>
> Sent: 04 November 2025 11:06
> To: Shameer Kolothum <[email protected]>; qemu-
> [email protected]; [email protected]
> Cc: [email protected]; Jason Gunthorpe <[email protected]>; Nicolin
> Chen <[email protected]>; [email protected]; [email protected];
> Nathan Chen <[email protected]>; Matt Ochs <[email protected]>;
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> Krishnakant Jaju <[email protected]>
> Subject: Re: [PATCH v5 13/32] hw/arm/smmuv3-accel: Add nested vSTE
> install/uninstall support
> 
> Hi Shameer,
> 
> On 10/31/25 11:49 AM, Shameer Kolothum wrote:
> > From: Nicolin Chen <[email protected]>
> >
> > A device placed behind a vSMMU instance must have corresponding vSTEs
> > (bypass, abort, or translate) installed. The bypass and abort proxy
> > nested HWPTs are pre-allocated.
> >
> > For the translate HWPT, a vDEVICE object is allocated and associated with
> > the vIOMMU for each guest device. This allows the host kernel to
> > establish a virtual SID to physical SID mapping, which is required for
> > handling invalidations and event reporting.
> >
> > A translate HWPT is allocated based on the guest STE configuration
> > and attached to the device when the guest issues SMMU_CMD_CFGI_STE or
> > SMMU_CMD_CFGI_STE_RANGE, provided the STE enables S1 translation.
> >
> > If the guest STE is invalid or S1 translation is disabled, the device
> > is attached to one of the pre-allocated ABORT or BYPASS HWPTs instead.
> >
> > While at it, export both smmu_find_ste() and smmuv3_flush_config() for
> > use here.
> >
> > Signed-off-by: Nicolin Chen <[email protected]>
> > Signed-off-by: Shameer Kolothum <[email protected]>
> > Reviewed-by: Jonathan Cameron <[email protected]>
> > Signed-off-by: Shameer Kolothum <[email protected]>
> > ---
> >  hw/arm/smmuv3-accel.c    | 193 +++++++++++++++++++++++++++++++++++++++
> >  hw/arm/smmuv3-accel.h    |  23 +++++
> >  hw/arm/smmuv3-internal.h |  20 ++++
> >  hw/arm/smmuv3.c          |  18 +++-
> >  hw/arm/trace-events      |   2 +
> >  5 files changed, 253 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> > index d4d65299a8..c74e95a0ea 100644
> > --- a/hw/arm/smmuv3-accel.c
> > +++ b/hw/arm/smmuv3-accel.c
> > @@ -28,6 +28,191 @@ MemoryRegion root;  MemoryRegion sysmem;
> > static AddressSpace *shared_as_sysmem;
> >
> > +static bool
> > +smmuv3_accel_alloc_vdev(SMMUv3AccelDevice *accel_dev, int sid, Error **errp)
> > +{
> > +    SMMUViommu *vsmmu = accel_dev->vsmmu;
> > +    IOMMUFDVdev *vdev;
> > +    uint32_t vdevice_id;
> > +
> > +    if (!accel_dev->idev || accel_dev->vdev) {
> > +        return true;
> > +    }
> > +
> > +    if (!iommufd_backend_alloc_vdev(vsmmu->iommufd, accel_dev->idev->devid,
> > +                                    vsmmu->viommu.viommu_id, sid,
> > +                                    &vdevice_id, errp)) {
> > +            return false;
> > +    }
> > +    if (!host_iommu_device_iommufd_attach_hwpt(accel_dev->idev,
> > +                                               vsmmu->bypass_hwpt_id, errp)) {
> > +        iommufd_backend_free_id(vsmmu->iommufd, vdevice_id);
> > +        return false;
> > +    }
> > +
> > +    vdev = g_new(IOMMUFDVdev, 1);
> > +    vdev->vdevice_id = vdevice_id;
> > +    vdev->virt_id = sid;
> > +    accel_dev->vdev = vdev;
> > +    return true;
> > +}
> > +
> > +static bool
> > +smmuv3_accel_dev_uninstall_nested_ste(SMMUv3AccelDevice *accel_dev, bool abort,
> > +                                      Error **errp)
> > +{
> > +    HostIOMMUDeviceIOMMUFD *idev = accel_dev->idev;
> > +    SMMUS1Hwpt *s1_hwpt = accel_dev->s1_hwpt;
> > +    uint32_t hwpt_id;
> > +
> > +    if (!s1_hwpt || !accel_dev->vsmmu) {
> > +        return true;
> > +    }
> > +
> > +    if (abort) {
> > +        hwpt_id = accel_dev->vsmmu->abort_hwpt_id;
> > +    } else {
> > +        hwpt_id = accel_dev->vsmmu->bypass_hwpt_id;
> > +    }
> > +
> > +    if (!host_iommu_device_iommufd_attach_hwpt(idev, hwpt_id, errp)) {
> > +        return false;
> > +    }
> > +    trace_smmuv3_accel_uninstall_nested_ste(smmu_get_sid(&accel_dev->sdev),
> > +                                            abort ? "abort" : "bypass",
> > +                                            hwpt_id);
> > +
> > +    iommufd_backend_free_id(s1_hwpt->iommufd, s1_hwpt->hwpt_id);
> > +    accel_dev->s1_hwpt = NULL;
> > +    g_free(s1_hwpt);
> > +    return true;
> > +}
> > +
> > +static bool
> > +smmuv3_accel_dev_install_nested_ste(SMMUv3AccelDevice *accel_dev,
> > +                                    uint32_t data_type, uint32_t data_len,
> > +                                    void *data, Error **errp)
> the name is very close to the caller function, i.e.
> smmuv3_accel_install_nested_ste, which also takes an sdev.
> I would rename it to smmuv3_accel_install_hwpt() or something similar.

This one is going to change a bit based on Nicolin's feedback on handling
the SMMUEN/GBPA values:
https://lore.kernel.org/all/aQVLzfaxxSfw1HBL@Asurada-Nvidia/

Probably smmuv3_accel_attach_nested_hwpt() suits it better, considering
that's what it finally ends up doing.
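
Roughly something like the below for picking the proxy HWPT (untested
sketch, just to illustrate the direction; smmu_enabled() and the GBPA
ABORT field are from my reading of hw/arm/smmuv3.c and may need to be
exported or accessed differently):

static uint32_t smmuv3_accel_proxy_hwpt_id(SMMUv3State *s, SMMUViommu *vsmmu,
                                           STE *ste)
{
    if (!smmu_enabled(s)) {
        /* SMMU disabled: GBPA.ABORT picks between abort and bypass */
        return FIELD_EX32(s->gbpa, GBPA, ABORT) ? vsmmu->abort_hwpt_id :
                                                  vsmmu->bypass_hwpt_id;
    }
    /* STE invalid or S1 disabled: follow the STE config as today */
    return STE_CFG_ABORT(STE_CONFIG(ste)) ? vsmmu->abort_hwpt_id :
                                            vsmmu->bypass_hwpt_id;
}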

> > +{
> > +    SMMUViommu *vsmmu = accel_dev->vsmmu;
> > +    SMMUS1Hwpt *s1_hwpt = accel_dev->s1_hwpt;
> > +    HostIOMMUDeviceIOMMUFD *idev = accel_dev->idev;
> > +    uint32_t flags = 0;
> > +
> > +    if (!idev || !vsmmu) {
> > +        error_setg(errp, "Device 0x%x has no associated IOMMU dev or vIOMMU",
> > +                   smmu_get_sid(&accel_dev->sdev));
> > +        return false;
> > +    }
> > +
> > +    if (s1_hwpt) {
> > +        if (!smmuv3_accel_dev_uninstall_nested_ste(accel_dev, true, errp)) {
> > +            return false;
> > +        }
> > +    }
> > +
> > +    s1_hwpt = g_new0(SMMUS1Hwpt, 1);
> > +    s1_hwpt->iommufd = idev->iommufd;
> > +    if (!iommufd_backend_alloc_hwpt(idev->iommufd, idev->devid,
> > +                                    vsmmu->viommu.viommu_id, flags,
> > +                                    data_type, data_len, data,
> > +                                    &s1_hwpt->hwpt_id, errp)) {
> > +        return false;
> > +    }
> > +
> > +    if (!host_iommu_device_iommufd_attach_hwpt(idev, s1_hwpt->hwpt_id, errp)) {
> > +        iommufd_backend_free_id(idev->iommufd, s1_hwpt->hwpt_id);
> > +        return false;
> > +    }
> > +    accel_dev->s1_hwpt = s1_hwpt;
> > +    return true;
> > +}
> > +
> > +bool
> > +smmuv3_accel_install_nested_ste(SMMUv3State *s, SMMUDevice *sdev, int sid,
> > +                                Error **errp)
> > +{
> > +    SMMUv3AccelDevice *accel_dev;
> > +    SMMUEventInfo event = {.type = SMMU_EVT_NONE, .sid = sid,
> > +                           .inval_ste_allowed = true};
> > +    struct iommu_hwpt_arm_smmuv3 nested_data = {};
> > +    uint64_t ste_0, ste_1;
> > +    uint32_t config;
> > +    STE ste;
> > +    int ret;
> > +
> > +    if (!s->accel) {
> don't you want to check !s->vsmmu as well, as done in
> smmuv3_accel_install_nested_ste_range()?

Nicolin has suggested merging struct SMMUViommu and SMMUv3AccelState
into one and avoiding the extra layering. I will attempt that, and all
these checks might change as a result.
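
Something along these lines for the merged state (very rough sketch; the
field names are the ones the current SMMUViommu already carries, the
IOMMUFDViommu member type is my guess and the final shape may differ):

typedef struct SMMUv3AccelState {
    IOMMUFDBackend *iommufd;
    IOMMUFDViommu viommu;        /* viommu.viommu_id as used above */
    uint32_t bypass_hwpt_id;     /* pre-allocated proxy HWPTs */
    uint32_t abort_hwpt_id;
    /* ... rest of the per-instance accel state ... */
} SMMUv3AccelState;

With that, the !s->accel and !accel_dev->vsmmu checks here would collapse
into a single check.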

> > +        return true;
> > +    }
> > +
> > +    accel_dev = container_of(sdev, SMMUv3AccelDevice, sdev);
> > +    if (!accel_dev->vsmmu) {
> > +        return true;
> > +    }
> > +
> > +    if (!smmuv3_accel_alloc_vdev(accel_dev, sid, errp)) {
> > +        return false;
> > +    }
> > +
> > +    ret = smmu_find_ste(sdev->smmu, sid, &ste, &event);
> > +    if (ret) {
> > +        error_setg(errp, "Failed to find STE for Device 0x%x", sid);
> > +        return true;
> returning true while setting errp looks wrong to me.

Right, I'll just return true from here without setting errp. I am not sure
under what circumstances we would hit this, though.
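
i.e. something like this (just dropping the error_setg):

    ret = smmu_find_ste(sdev->smmu, sid, &ste, &event);
    if (ret) {
        /* Nothing to install; leave the current proxy HWPT in place */
        return true;
    }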

> > +    }
> > +
> > +    config = STE_CONFIG(&ste);
> > +    if (!STE_VALID(&ste) || !STE_CFG_S1_ENABLED(config)) {
> > +        if (!smmuv3_accel_dev_uninstall_nested_ste(accel_dev,
> > +                                                   STE_CFG_ABORT(config),
> > +                                                   errp)) {
> > +            return false;
> > +        }
> > +        smmuv3_flush_config(sdev);
> > +        return true;
> > +    }
> > +
> > +    ste_0 = (uint64_t)ste.word[0] | (uint64_t)ste.word[1] << 32;
> > +    ste_1 = (uint64_t)ste.word[2] | (uint64_t)ste.word[3] << 32;
> > +    nested_data.ste[0] = cpu_to_le64(ste_0 & STE0_MASK);
> > +    nested_data.ste[1] = cpu_to_le64(ste_1 & STE1_MASK);
> > +
> > +    if (!smmuv3_accel_dev_install_nested_ste(accel_dev,
> > +                                             IOMMU_HWPT_DATA_ARM_SMMUV3,
> > +                                             sizeof(nested_data),
> > +                                             &nested_data, errp)) {
> > +        error_append_hint(errp, "Unable to install sid=0x%x nested STE="
> > +                          "0x%"PRIx64":=0x%"PRIx64"", sid,
> nit: why ":=" between both 64b?
> > +                          (uint64_t)le64_to_cpu(nested_data.ste[1]),
> > +                          (uint64_t)le64_to_cpu(nested_data.ste[0]));
> > +        return false;
> in case of various failure cases, do we need to free the vdev?

I don't think we need to free the vdev corresponding to this vSID on failures
here. I think the association between the vSID and the host SID can remain
intact even if the nested HWPT alloc/attach fails for whatever reason.

Thanks,
Shameer
