On Thu, Mar 18, 2021 at 07:36:50PM +0000, Robin Murphy wrote:
> On 2021-03-16 19:16, Jean-Philippe Brucker wrote:
> > The ACPI Virtual I/O Translation Table describes topology of
> > para-virtual platforms. For now it describes the relation between
> > virtio-iommu and the endpoints it manages. Supporting that requires
> > three steps:
> > 
> > (1) acpi_viot_init(): parse the VIOT table, build a list of endpoints
> >      and vIOMMUs.
> > 
> > (2) acpi_viot_set_iommu_ops(): when the vIOMMU driver is loaded and the
> >      device probed, register it to the VIOT driver. This step is required
> >      because unlike similar drivers, VIOT doesn't create the vIOMMU
> >      device.
> 
> Note that you're basically the same as the DT case in this regard, so I'd
> expect things to be closer to that pattern than to that of IORT.
> 
> [...]
> > @@ -1506,12 +1507,17 @@ int acpi_dma_configure_id(struct device *dev, enum 
> > dev_dma_attr attr,
> >   {
> >     const struct iommu_ops *iommu;
> >     u64 dma_addr = 0, size = 0;
> > +   int ret;
> >     if (attr == DEV_DMA_NOT_SUPPORTED) {
> >             set_dma_ops(dev, &dma_dummy_ops);
> >             return 0;
> >     }
> > +   ret = acpi_viot_dma_setup(dev, attr);
> > +   if (ret)
> > +           return ret > 0 ? 0 : ret;
> 
> I think things could do with a fair bit of refactoring here. Ideally we want
> to process a possible _DMA method (acpi_dma_get_range()) regardless of which
> flavour of IOMMU table might be present, and the amount of duplication we
> fork into at this point is unfortunate.
> 
> > +
> >     iort_dma_setup(dev, &dma_addr, &size);
> 
> For starters I think most of that should be dragged out to this level here -
> it's really only the {rc,nc}_dma_get_range() bit that deserves to be the
> IORT-specific call.

Makes sense, though I'll move it to drivers/acpi/arm64/dma.c instead of
here, because it has only ever run on CONFIG_ARM64. I don't want to
accidentally break some x86 platform with an invalid _DMA method (same
reason 7ad426398082 and 18b709beb503 kept this code in IORT)

> 
> >     iommu = iort_iommu_configure_id(dev, input_id);
> 
> Similarly, it feels like it's only the table scan part in the middle of that
> that needs dispatching between IORT/VIOT, and its head and tail pulled out
> into a common path.

Agreed

> 
> [...]
> > +static const struct iommu_ops *viot_iommu_setup(struct device *dev)
> > +{
> > +   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > +   struct viot_iommu *viommu = NULL;
> > +   struct viot_endpoint *ep;
> > +   u32 epid;
> > +   int ret;
> > +
> > +   /* Already translated? */
> > +   if (fwspec && fwspec->ops)
> > +           return NULL;
> > +
> > +   mutex_lock(&viommus_lock);
> > +   list_for_each_entry(ep, &viot_endpoints, list) {
> > +           if (viot_device_match(dev, &ep->dev_id, &epid)) {
> > +                   epid += ep->endpoint_id;
> > +                   viommu = ep->viommu;
> > +                   break;
> > +           }
> > +   }
> > +   mutex_unlock(&viommus_lock);
> > +   if (!viommu)
> > +           return NULL;
> > +
> > +   /* We're not translating ourself */
> > +   if (viot_device_match(dev, &viommu->dev_id, &epid))
> > +           return NULL;
> > +
> > +   /*
> > +    * If we found a PCI range managed by the viommu, we're the one that has
> > +    * to request ACS.
> > +    */
> > +   if (dev_is_pci(dev))
> > +           pci_request_acs();
> > +
> > +   if (!viommu->ops || WARN_ON(!viommu->dev))
> > +           return ERR_PTR(-EPROBE_DEFER);
> 
> Can you create (or look up) a viommu->fwnode when initially parsing the VIOT
> to represent the IOMMU devices to wait for, such that the
> viot_device_match() lookup can resolve to that and let you fall into the
> standard iommu_ops_from_fwnode() path? That's what I mean about following
> the DT pattern - I guess it might need a bit of trickery to rewrite things
> if iommu_device_register() eventually turns up with a new fwnode, so I doubt
> we can get away without *some* kind of private interface between
> virtio-iommu and VIOT, but it would be nice for the common(ish) DMA paths to
> stay as unaware of the specifics as possible.

Yes I can reuse iommu_ops_from_fwnode(). Turns out it's really easy: if we
move the VIOT initialization after acpi_scan_init(), we can use
pci_get_domain_bus_and_slot() directly and create missing fwnodes. That
gets rid of any need for a private interface between virtio-iommu and
VIOT.

> 
> > +
> > +   ret = iommu_fwspec_init(dev, viommu->dev->fwnode, viommu->ops);
> > +   if (ret)
> > +           return ERR_PTR(ret);
> > +
> > +   iommu_fwspec_add_ids(dev, &epid, 1);
> > +
> > +   /*
> > +    * If we have reason to believe the IOMMU driver missed the initial
> > +    * add_device callback for dev, replay it to get things in order.
> > +    */
> > +   if (dev->bus && !device_iommu_mapped(dev))
> > +           iommu_probe_device(dev);
> > +
> > +   return viommu->ops;
> > +}
> > +
> > +/**
> > + * acpi_viot_dma_setup - Configure DMA for an endpoint described in VIOT
> > + * @dev: the endpoint
> > + * @attr: coherency property of the endpoint
> > + *
> > + * Setup the DMA and IOMMU ops for an endpoint described by the VIOT table.
> > + *
> > + * Return:
> > + * * 0 - @dev doesn't match any VIOT node
> > + * * 1 - ops for @dev were successfully installed
> > + * * -EPROBE_DEFER - ops for @dev aren't yet available
> > + */
> > +int acpi_viot_dma_setup(struct device *dev, enum dev_dma_attr attr)
> > +{
> > +   const struct iommu_ops *iommu_ops = viot_iommu_setup(dev);
> > +
> > +   if (IS_ERR_OR_NULL(iommu_ops)) {
> > +           int ret = PTR_ERR(iommu_ops);
> > +
> > +           if (ret == -EPROBE_DEFER || ret == 0)
> > +                   return ret;
> > +           dev_err(dev, "error %d while setting up virt IOMMU\n", ret);
> > +           return 0;
> > +   }
> > +
> > +#ifdef CONFIG_ARCH_HAS_SETUP_DMA_OPS
> > +   arch_setup_dma_ops(dev, 0, ~0ULL, iommu_ops, attr == DEV_DMA_COHERENT);
> > +#else
> > +   iommu_setup_dma_ops(dev, 0, ~0ULL);
> > +#endif
> 
> Duplicating all of this feels particularly wrong... :(

Right, I still don't have a good solution for this last part. Ideally I'd
implement arch_setup_dma_ops() on x86 but virtio-iommu alone isn't enough
justification and changing DMAR and IVRS to use it is too much work. For
the next version I added a probe_finalize() method in virtio-iommu that
does the same as Vt-d and AMD IOMMU on x86. Hopefully the only wart in the
series.

Thanks,
Jean

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Reply via email to