Re: [PATCH v2 02/17] driver core: Add dma_unconfigure callback in bus_type
On 11/28/21 4:02 PM, Greg Kroah-Hartman wrote: On Sun, Nov 28, 2021 at 10:50:36AM +0800, Lu Baolu wrote: The bus_type structure defines dma_configure() callback for bus drivers to configure DMA on the devices. This adds the paired dma_unconfigure() callback and calls it during driver unbinding so that bus drivers can do some cleanup work. One use case for this paired DMA callbacks is for the bus driver to check for DMA ownership conflicts during driver binding, where multiple devices belonging to a same IOMMU group (the minimum granularity of isolation and protection) may be assigned to kernel drivers or user space respectively. Without this change, for example, the vfio driver has to listen to a bus BOUND_DRIVER event and then BUG_ON() in case of dma ownership conflict. This leads to bad user experience since careless driver binding operation may crash the system if the admin overlooks the group restriction. Aside from bad design, this leads to a security problem as a root user, even with lockdown=integrity, can force the kernel to BUG. With this change, the bus driver could check and set the DMA ownership in driver binding process and fail on ownership conflicts. The DMA ownership should be released during driver unbinding. Suggested-by: Jason Gunthorpe Link: https://lore.kernel.org/linux-iommu/20210922123931.gi327...@nvidia.com/ Link: https://lore.kernel.org/linux-iommu/20210928115751.gk964...@nvidia.com/ Signed-off-by: Lu Baolu --- include/linux/device/bus.h | 3 +++ drivers/base/dd.c | 7 ++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h index a039ab809753..ef54a71e5f8f 100644 --- a/include/linux/device/bus.h +++ b/include/linux/device/bus.h @@ -59,6 +59,8 @@ struct fwnode_handle; *bus supports. * @dma_configure:Called to setup DMA configuration on a device on *this bus. + * @dma_unconfigure: Called to cleanup DMA configuration on a device on + * this bus. "dma_cleanup()" is a better name for this, don't you think? I agree with you. dma_cleanup() is more explicit and better here. thanks, greg k-h Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier()
Hi Greg, On 11/28/21 4:10 PM, Greg Kroah-Hartman wrote: On Sun, Nov 28, 2021 at 10:50:34AM +0800, Lu Baolu wrote: The original post and intent of this series is here. https://lore.kernel.org/linux-iommu/2025020552.2378167-1-baolu...@linux.intel.com/ Please put the intent in the message, dont make us go and dig it up again. Sure! I will include the message next time. Thanks! thanks, greg k-h Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 04/17] driver core: platform: Add driver dma ownership management
On Sun, Nov 28, 2021 at 09:10:14AM +0100, Greg Kroah-Hartman wrote: > On Sun, Nov 28, 2021 at 10:50:38AM +0800, Lu Baolu wrote: > > Multiple platform devices may be placed in the same IOMMU group because > > they cannot be isolated from each other. These devices must either be > > entirely under kernel control or userspace control, never a mixture. This > > checks and sets DMA ownership during driver binding, and release the > > ownership during driver unbinding. > > > > Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto > > claiming DMA_OWNER_DMA_API ownership in the binding process. For instance, > > the userspace framework drivers (vfio etc.) which need to manually claim > > DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace. > > Why would any vfio driver be a platform driver? Why not? VFIO implements drivers for most physical device types these days. Why wouldn't platform be included? > That should never be the case as they obviously are not platform > drivers, they are virtual ones. Huh? > > diff --git a/include/linux/platform_device.h > > b/include/linux/platform_device.h > > index 7c96f169d274..779bcf2a851c 100644 > > +++ b/include/linux/platform_device.h > > @@ -210,6 +210,7 @@ struct platform_driver { > > struct device_driver driver; > > const struct platform_device_id *id_table; > > bool prevent_deferred_probe; > > + bool suppress_auto_claim_dma_owner; > > What platform driver needs this change? It is in patch 12: --- a/drivers/vfio/platform/vfio_platform.c +++ b/drivers/vfio/platform/vfio_platform.c @@ -76,6 +76,7 @@ static struct platform_driver vfio_platform_driver = { .driver = { .name = "vfio-platform", }, + .suppress_auto_claim_dma_owner = true, }; Which is how VFIO provides support to DPDK for some Ethernet controllers embedded in a few ARM SOCs. It is also used in patch 17 in five tegra platform_drivers to make their sharing of an iommu group between possibly related platform_driver's safer. > > USE_PLATFORM_PM_SLEEP_OPS > > @@ -1478,7 +1505,8 @@ struct bus_type platform_bus_type = { > > .probe = platform_probe, > > .remove = platform_remove, > > .shutdown = platform_shutdown, > > - .dma_configure = platform_dma_configure, > > + .dma_configure = _platform_dma_configure, > > What happened to the original platform_dma_configure() function? It is still called. The issue here is that platform_dma_configure has nothing to do with platform and is being re-used by AMBA. Probably the resolution to both remarks is to rename platform_dma_configure to something sensible (firwmare dma configure maybe?) and use it in all places that do the of & acpi stuff - pci/amba/platform at least. Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [patch 29/37] PCI/MSI: Use __msi_get_virq() in pci_get_vector()
On Sun, Nov 28 2021 at 19:37, Marc Zyngier wrote: > On Sat, 27 Nov 2021 01:22:03 +, > Thomas Gleixner wrote: > > I worked around it with the hack below, but I doubt this is the real > thing. portdrv_core.c does complicated things, and I don't completely > understand its logic. > > M. > > diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c > index 1f72bc734226..b15278a5fb4b 100644 > --- a/drivers/pci/msi/msi.c > +++ b/drivers/pci/msi/msi.c > @@ -1092,8 +1092,9 @@ int pci_irq_vector(struct pci_dev *dev, unsigned int nr) > int irq = __msi_get_virq(>dev, nr); > > switch (irq) { > - case -ENODEV: return !nr ? dev->irq : -EINVAL; > - case -ENOENT: return -EINVAL; > + case -ENOENT: > + case -ENODEV: > + return !nr ? dev->irq : -EINVAL; Hrm. ENODEV is returned when dev->msi.data == NULL, ENOENT when there is no MSI entry. But yes, that goes south when the device tried to enable MSI[X} and then ended up with INTx. It still has dev->msi.data, which causes it to return -ENOENT, which makes the above go belly up. Moo, what was I thinking? Thanks, tglx --- --- a/drivers/pci/msi/msi.c +++ b/drivers/pci/msi/msi.c @@ -1032,13 +1032,13 @@ EXPORT_SYMBOL(pci_free_irq_vectors); */ int pci_irq_vector(struct pci_dev *dev, unsigned int nr) { - int irq = __msi_get_virq(>dev, nr); + unsigned int irq; - switch (irq) { - case -ENODEV: return !nr ? dev->irq : -EINVAL; - case -ENOENT: return -EINVAL; - } - return irq; + if (!dev->msi_enabled && !dev->msix_enabled) + return !nr ? dev->irq : -EINVAL; + + irq = msi_get_virq(>dev, nr); + return irq ? irq : -EINVAL; } EXPORT_SYMBOL(pci_irq_vector); --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -169,21 +169,7 @@ static inline bool msi_device_has_proper } #endif -int __msi_get_virq(struct device *dev, unsigned int index); - -/** - * msi_get_virq - Return Linux interrupt number of a MSI interrupt - * @dev: Device to operate on - * @index: MSI interrupt index to look for (0-based) - * - * Return: The Linux interrupt number on success (> 0), 0 if not found - */ -static inline unsigned int msi_get_virq(struct device *dev, unsigned int index) -{ - int ret = __msi_get_virq(dev, index); - - return ret < 0 ? 0 : ret; -} +unsigned int msi_get_virq(struct device *dev, unsigned int index); /* Helpers to hide struct msi_desc implementation details */ #define msi_desc_to_dev(desc) ((desc)->dev) --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -119,21 +119,19 @@ int msi_setup_device_data(struct device } /** - * __msi_get_virq - Return Linux interrupt number of a MSI interrupt + * msi_get_virq - Return Linux interrupt number of a MSI interrupt * @dev: Device to operate on * @index: MSI interrupt index to look for (0-based) * - * Return: The Linux interrupt number on success (> 0) - *-ENODEV when the device is not using MSI - *-ENOENT if no such entry exists + * Return: The Linux interrupt number on success (> 0), 0 if not found */ -int __msi_get_virq(struct device *dev, unsigned int index) +unsigned int msi_get_virq(struct device *dev, unsigned int index) { struct msi_desc *desc; bool pcimsi; if (!dev->msi.data) - return -ENODEV; + return 0; pcimsi = msi_device_has_property(dev, MSI_PROP_PCI_MSI); @@ -152,9 +150,9 @@ int __msi_get_virq(struct device *dev, u if (desc->msi_index == index) return desc->irq; } - return -ENOENT; + return 0; } -EXPORT_SYMBOL_GPL(__msi_get_virq); +EXPORT_SYMBOL_GPL(msi_get_virq); #ifdef CONFIG_SYSFS static ssize_t msi_mode_show(struct device *dev, struct device_attribute *attr, ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [patch 00/37] genirq/msi, PCI/MSI: Spring cleaning - Part 2
On Sat, Nov 27 2021 at 20:39, Jason Gunthorpe wrote: > On Sat, Nov 27, 2021 at 02:21:17AM +0100, Thomas Gleixner wrote: >>4) Provide a function to retrieve the Linux interrupt number for a given >> MSI index similar to pci_irq_vector() and cleanup all open coded >> variants. > > The msi_get_virq() sure does make a big difference.. Though it does > highlight there is some asymmetry with how platform and PCI works here > where PCI fills some 'struct msix_entry *'. Many drivers would be > quite happy to just call msi_get_virq() and avoid the extra memory, so > I think the msi_get_virq() version is good. struct msix_entry should just go away. 90+% of the use cases fill it with a linear index range 0...N and then use the virq entry for request_irq(). So they can just use pci_alloc_irs_vectors_affinity() and retrieve the interrupt number via pci_irq_vector(). The few drivers which actually use it to allocate a sparse populated MSI index, e.g. 0, 12, 14 can be converted over to alloc vector 0 and then use the dynamic extenstion for the rest. Thanks, tglx ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [patch 29/37] PCI/MSI: Use __msi_get_virq() in pci_get_vector()
On Sat, 27 Nov 2021 01:22:03 +, Thomas Gleixner wrote: > > Use __msi_get_vector() and handle the return values to be compatible. > > No functional change intended. You wish ;-). [ 15.779540] pcieport 0001:00:01.0: AER: request AER IRQ -22 failed Notice how amusing the IRQ number is? > > Signed-off-by: Thomas Gleixner > --- > drivers/pci/msi/msi.c | 25 + > 1 file changed, 5 insertions(+), 20 deletions(-) > > --- a/drivers/pci/msi/msi.c > +++ b/drivers/pci/msi/msi.c > @@ -1023,28 +1023,13 @@ EXPORT_SYMBOL(pci_free_irq_vectors); > */ > int pci_irq_vector(struct pci_dev *dev, unsigned int nr) > { > - if (dev->msix_enabled) { > - struct msi_desc *entry; > + int irq = __msi_get_virq(>dev, nr); > > - for_each_pci_msi_entry(entry, dev) { > - if (entry->msi_index == nr) > - return entry->irq; > - } > - WARN_ON_ONCE(1); > - return -EINVAL; > + switch (irq) { > + case -ENODEV: return !nr ? dev->irq : -EINVAL; > + case -ENOENT: return -EINVAL; > } > - > - if (dev->msi_enabled) { > - struct msi_desc *entry = first_pci_msi_entry(dev); > - > - if (WARN_ON_ONCE(nr >= entry->nvec_used)) > - return -EINVAL; > - } else { > - if (WARN_ON_ONCE(nr > 0)) > - return -EINVAL; > - } > - > - return dev->irq + nr; > + return irq; > } > EXPORT_SYMBOL(pci_irq_vector); I worked around it with the hack below, but I doubt this is the real thing. portdrv_core.c does complicated things, and I don't completely understand its logic. M. diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c index 1f72bc734226..b15278a5fb4b 100644 --- a/drivers/pci/msi/msi.c +++ b/drivers/pci/msi/msi.c @@ -1092,8 +1092,9 @@ int pci_irq_vector(struct pci_dev *dev, unsigned int nr) int irq = __msi_get_virq(>dev, nr); switch (irq) { - case -ENODEV: return !nr ? dev->irq : -EINVAL; - case -ENOENT: return -EINVAL; + case -ENOENT: + case -ENODEV: + return !nr ? dev->irq : -EINVAL; } return irq; } -- Without deviation from the norm, progress is not possible. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [patch 02/37] device: Add device::msi_data pointer and struct msi_device_data
On Sat, Nov 27 2021 at 20:14, Jason Gunthorpe wrote: > On Sat, Nov 27, 2021 at 02:20:09AM +0100, Thomas Gleixner wrote: > >> +/** >> + * msi_setup_device_data - Setup MSI device data >> + * @dev:Device for which MSI device data should be set up >> + * >> + * Return: 0 on success, appropriate error code otherwise >> + * >> + * This can be called more than once for @dev. If the MSI device data is >> + * already allocated the call succeeds. The allocated memory is >> + * automatically released when the device is destroyed. > > I would say 'by devres when the driver is removed' rather than device > is destroyed - to me the latter implies it would happen at device_del > or perhaps during release.. Ah. Indeed it's when the driver unbinds because that's what disables MSI. >> +int msi_setup_device_data(struct device *dev) >> +{ >> +struct msi_device_data *md; >> + >> +if (dev->msi.data) >> +return 0; >> + >> +md = devres_alloc(msi_device_data_release, sizeof(*md), GFP_KERNEL); >> +if (!md) >> +return -ENOMEM; >> + >> +raw_spin_lock_init(>lock); > > I also couldn't guess why this needed to be raw? That lock is to replace the raw spinlock we have in struct device right now, which is protecting low level register access and that's called from within irq_desc::lock held regions with interrupts disabled. I had to stick something into the data structure because allocating 0 bytes is invalid. But yes, I should have mentioned it in the changelog. Thanks, tglx ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [git pull] IOMMU Fixes for Linux v5.16-rc2
The pull request you sent on Sun, 28 Nov 2021 14:32:59 +0100: > git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git > tags/iommu-fixes-v5.16-rc2 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/0757ca01d944001254a94ac1b25ced702a1e9ac5 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[git pull] IOMMU Fixes for Linux v5.16-rc2
Hi Linus, The following changes since commit 136057256686de39cc3a07c2e39ef6bc43003ff6: Linux 5.16-rc2 (2021-11-21 13:47:39 -0800) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git tags/iommu-fixes-v5.16-rc2 for you to fetch changes up to 86dc40c7ea9c22f64571e0e45f695de73a0e2644: iommu/vt-d: Fix unmap_pages support (2021-11-26 22:54:47 +0100) IOMMU Fixes for Linux v5.16-rc2: Including: - Intel VT-d fixes: - Remove unused PASID_DISABLED - Fix RCU locking - Fix for the unmap_pages call-back - Rockchip RK3568 address mask fix - AMD IOMMUv2 log message clarification Alex Bee (1): iommu/rockchip: Fix PAGE_DESC_HI_MASKs for RK3568 Alex Williamson (1): iommu/vt-d: Fix unmap_pages support Christophe JAILLET (1): iommu/vt-d: Fix an unbalanced rcu_read_lock/rcu_read_unlock() Joerg Roedel (2): iommu/vt-d: Remove unused PASID_DISABLED iommu/amd: Clarify AMD IOMMUv2 initialization messages arch/x86/include/asm/fpu/api.h | 6 -- drivers/iommu/amd/iommu_v2.c| 6 +++--- drivers/iommu/intel/cap_audit.c | 5 +++-- drivers/iommu/intel/iommu.c | 6 ++ drivers/iommu/rockchip-iommu.c | 4 ++-- 5 files changed, 10 insertions(+), 17 deletions(-) Please pull. Thanks, Joerg signature.asc Description: Digital signature ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier()
On Sun, Nov 28, 2021 at 10:50:34AM +0800, Lu Baolu wrote: > The original post and intent of this series is here. > https://lore.kernel.org/linux-iommu/2025020552.2378167-1-baolu...@linux.intel.com/ Please put the intent in the message, dont make us go and dig it up again. thanks, greg k-h ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 04/17] driver core: platform: Add driver dma ownership management
On Sun, Nov 28, 2021 at 10:50:38AM +0800, Lu Baolu wrote: > Multiple platform devices may be placed in the same IOMMU group because > they cannot be isolated from each other. These devices must either be > entirely under kernel control or userspace control, never a mixture. This > checks and sets DMA ownership during driver binding, and release the > ownership during driver unbinding. > > Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto > claiming DMA_OWNER_DMA_API ownership in the binding process. For instance, > the userspace framework drivers (vfio etc.) which need to manually claim > DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace. Why would any vfio driver be a platform driver? That should never be the case as they obviously are not platform drivers, they are virtual ones. > > Signed-off-by: Lu Baolu > --- > include/linux/platform_device.h | 1 + > drivers/base/platform.c | 30 +- > 2 files changed, 30 insertions(+), 1 deletion(-) > > diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h > index 7c96f169d274..779bcf2a851c 100644 > --- a/include/linux/platform_device.h > +++ b/include/linux/platform_device.h > @@ -210,6 +210,7 @@ struct platform_driver { > struct device_driver driver; > const struct platform_device_id *id_table; > bool prevent_deferred_probe; > + bool suppress_auto_claim_dma_owner; What platform driver needs this change? > }; > > #define to_platform_driver(drv) (container_of((drv), struct > platform_driver, \ > diff --git a/drivers/base/platform.c b/drivers/base/platform.c > index 598acf93a360..df4b385c8a52 100644 > --- a/drivers/base/platform.c > +++ b/drivers/base/platform.c > @@ -30,6 +30,7 @@ > #include > #include > #include > +#include > > #include "base.h" > #include "power/power.h" > @@ -1465,6 +1466,32 @@ int platform_dma_configure(struct device *dev) > return ret; > } > > +static int _platform_dma_configure(struct device *dev) > +{ > + struct platform_driver *drv = to_platform_driver(dev->driver); > + int ret; > + > + if (!drv->suppress_auto_claim_dma_owner) { > + ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL); > + if (ret) > + return ret; > + } > + > + ret = platform_dma_configure(dev); > + if (ret && !drv->suppress_auto_claim_dma_owner) > + iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API); > + > + return ret; > +} > + > +static void _platform_dma_unconfigure(struct device *dev) > +{ > + struct platform_driver *drv = to_platform_driver(dev->driver); > + > + if (!drv->suppress_auto_claim_dma_owner) > + iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API); > +} > + > static const struct dev_pm_ops platform_dev_pm_ops = { > SET_RUNTIME_PM_OPS(pm_generic_runtime_suspend, > pm_generic_runtime_resume, NULL) > USE_PLATFORM_PM_SLEEP_OPS > @@ -1478,7 +1505,8 @@ struct bus_type platform_bus_type = { > .probe = platform_probe, > .remove = platform_remove, > .shutdown = platform_shutdown, > - .dma_configure = platform_dma_configure, > + .dma_configure = _platform_dma_configure, What happened to the original platform_dma_configure() function? And single "_" prefixes are odd, please just spell out what the difference is in the function name, "_" gives us no hint at all. thnaks, greg k-h ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 02/17] driver core: Add dma_unconfigure callback in bus_type
On Sun, Nov 28, 2021 at 10:50:36AM +0800, Lu Baolu wrote: > The bus_type structure defines dma_configure() callback for bus drivers > to configure DMA on the devices. This adds the paired dma_unconfigure() > callback and calls it during driver unbinding so that bus drivers can do > some cleanup work. > > One use case for this paired DMA callbacks is for the bus driver to check > for DMA ownership conflicts during driver binding, where multiple devices > belonging to a same IOMMU group (the minimum granularity of isolation and > protection) may be assigned to kernel drivers or user space respectively. > > Without this change, for example, the vfio driver has to listen to a bus > BOUND_DRIVER event and then BUG_ON() in case of dma ownership conflict. > This leads to bad user experience since careless driver binding operation > may crash the system if the admin overlooks the group restriction. Aside > from bad design, this leads to a security problem as a root user, even with > lockdown=integrity, can force the kernel to BUG. > > With this change, the bus driver could check and set the DMA ownership in > driver binding process and fail on ownership conflicts. The DMA ownership > should be released during driver unbinding. > > Suggested-by: Jason Gunthorpe > Link: https://lore.kernel.org/linux-iommu/20210922123931.gi327...@nvidia.com/ > Link: https://lore.kernel.org/linux-iommu/20210928115751.gk964...@nvidia.com/ > Signed-off-by: Lu Baolu > --- > include/linux/device/bus.h | 3 +++ > drivers/base/dd.c | 7 ++- > 2 files changed, 9 insertions(+), 1 deletion(-) > > diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h > index a039ab809753..ef54a71e5f8f 100644 > --- a/include/linux/device/bus.h > +++ b/include/linux/device/bus.h > @@ -59,6 +59,8 @@ struct fwnode_handle; > * bus supports. > * @dma_configure: Called to setup DMA configuration on a device on > * this bus. > + * @dma_unconfigure: Called to cleanup DMA configuration on a device on > + * this bus. "dma_cleanup()" is a better name for this, don't you think? thanks, greg k-h ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu