Re: [PATCH 1/1] iommu/vt-d: Remove unnecessary WARN_ON_ONCE()

2020-01-17 Thread Lu Baolu

Hi Joerg,

On 1/17/20 5:59 PM, Joerg Roedel wrote:

On Thu, Jan 16, 2020 at 09:52:36AM +0800, Lu Baolu wrote:

Address field in device TLB invalidation descriptor is qualified
by the S field. If S field is zero, a single page at page address
specified by address [63:12] is requested to be invalidated. If S
field is set, the least significant bit in the address field with
value 0b (say bit N) indicates the invalidation address range. The
spec doesn't require the address [N - 1, 0] to be cleared, hence
remove the unnecessary WARN_ON_ONCE().

Otherwise, the caller might set "mask = MAX_AGAW_PFN_WIDTH" in order
to invalidate all the cached mappings on an endpoint, and the overflow
error below will be triggered.

[...]
UBSAN: Undefined behaviour in drivers/iommu/dmar.c:1354:3
shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]

Reported-and-tested-by: Frank 
Signed-off-by: Lu Baolu 


Does this need a Fixes and/or stable tag?



This doesn't cause any errors, just an unnecessary check of

"0 & ((1UL << 64) - 1)"

in some cases.
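
For reference, a minimal standalone illustration of the undefined shift
behind the UBSAN splat, assuming the usual VT-d definitions
(VTD_PAGE_SHIFT == 12, MAX_AGAW_PFN_WIDTH == 64 - VTD_PAGE_SHIFT); this is
a sketch, not the kernel code itself:

#include <stdint.h>

#define VTD_PAGE_SHIFT		12
#define MAX_AGAW_PFN_WIDTH	(64 - VTD_PAGE_SHIFT)

/* Mask of the address bits below the invalidation granule. */
static inline uint64_t low_addr_bits(unsigned int mask)
{
	/*
	 * With mask == MAX_AGAW_PFN_WIDTH the shift count becomes 64,
	 * which is undefined for a 64-bit type -- exactly what the
	 * removed WARN_ON_ONCE() ended up evaluating.
	 */
	return (1ULL << (VTD_PAGE_SHIFT + mask)) - 1;
}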



Regards,

Joerg


Best regards,
baolu


Re: [RFC PATCH 2/4] PCI: Add "pci=iommu_passthrough=" parameter for iommu passthrough

2020-01-17 Thread Lu Baolu

Hi Bjorn,

On 1/18/20 8:18 AM, Bjorn Helgaas wrote:

On Wed, Jan 01, 2020 at 01:26:46PM +0800, Lu Baolu wrote:

The new parameter takes a list of devices separated by a semicolon.
Each device specified will have its iommu_passthrough bit in struct
device set. This is very similar to the existing 'disable_acs_redir'
parameter.

Signed-off-by: Lu Baolu 
---
  .../admin-guide/kernel-parameters.txt |  5 +++
  drivers/pci/pci.c | 34 +++
  drivers/pci/pci.h |  1 +
  drivers/pci/probe.c   |  2 ++
  4 files changed, 42 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index ade4e6ec23e0..d3edc2cb6696 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3583,6 +3583,11 @@
may put more devices in an IOMMU group.
force_floating  [S390] Force usage of floating interrupts.
nomio   [S390] Do not use MIO instructions.
+   iommu_passthrough=<pci_dev>[; <pci_dev>...]
+   Specify one or more PCI devices (in the format
+   specified above) separated by semicolons.
+   Each device specified will bypass IOMMU DMA
+   translation.
 
 	pcie_aspm=	[PCIE] Forcibly enable or disable PCIe Active State Power
 			Management.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 90dbd7c70371..05bf3f4acc36 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6401,6 +6401,37 @@ void __weak pci_fixup_cardbus(struct pci_bus *bus)
  }
  EXPORT_SYMBOL(pci_fixup_cardbus);
  
+static const char *iommu_passthrough_param;

+bool pci_iommu_passthrough_match(struct pci_dev *dev)
+{
+   int ret = 0;
+   const char *p = iommu_passthrough_param;
+
+   if (!p)
+   return false;
+
+   while (*p) {
+   ret = pci_dev_str_match(dev, p, &p);
+   if (ret < 0) {
+   pr_info_once("PCI: Can't parse iommu_passthrough parameter: %s\n",
+    iommu_passthrough_param);
+
+   break;
+   } else if (ret == 1) {
+   pci_info(dev, "PCI: IOMMU passthrough\n");
+   return true;
+   }
+
+   if (*p != ';' && *p != ',') {
+   /* End of param or invalid format */
+   break;
+   }
+   p++;
+   }
+
+   return false;
+}


This duplicates a lot of the code in pci_disable_acs_redir().  That
needs to be factored out somehow so we don't duplicate it.



Sure. I will try to refactor the code in the next version.
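
A minimal sketch of what such a shared helper might look like (the name
pci_dev_str_match_list() is made up here; the real refactor may end up
different):

/*
 * Hypothetical common helper for pci_disable_acs_redir() and
 * pci_iommu_passthrough_match(): both walk a semicolon-separated
 * device list with pci_dev_str_match().
 */
static bool pci_dev_str_match_list(struct pci_dev *dev, const char *list,
				   const char *param_name)
{
	const char *p = list;
	int ret;

	if (!p)
		return false;

	while (*p) {
		ret = pci_dev_str_match(dev, p, &p);
		if (ret < 0) {
			pr_info_once("PCI: Can't parse %s parameter: %s\n",
				     param_name, list);
			break;
		} else if (ret == 1) {
			/* The device matched an entry in the list. */
			return true;
		}

		/* End of parameter or invalid format */
		if (*p != ';' && *p != ',')
			break;
		p++;
	}

	return false;
}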


Bjorn



Best regards,
baolu


Re: [RFC PATCH 2/4] PCI: Add "pci=iommu_passthrough=" parameter for iommu passthrough

2020-01-17 Thread Bjorn Helgaas
On Wed, Jan 01, 2020 at 01:26:46PM +0800, Lu Baolu wrote:
> The new parameter takes a list of devices separated by a semicolon.
> Each device specified will have its iommu_passthrough bit in struct
> device set. This is very similar to the existing 'disable_acs_redir'
> parameter.
> 
> Signed-off-by: Lu Baolu 
> ---
>  .../admin-guide/kernel-parameters.txt |  5 +++
>  drivers/pci/pci.c | 34 +++
>  drivers/pci/pci.h |  1 +
>  drivers/pci/probe.c   |  2 ++
>  4 files changed, 42 insertions(+)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> b/Documentation/admin-guide/kernel-parameters.txt
> index ade4e6ec23e0..d3edc2cb6696 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -3583,6 +3583,11 @@
>   may put more devices in an IOMMU group.
>   force_floating  [S390] Force usage of floating interrupts.
>   nomio   [S390] Do not use MIO instructions.
> + iommu_passthrough=<pci_dev>[; <pci_dev>...]
> + Specify one or more PCI devices (in the format
> + specified above) separated by semicolons.
> + Each device specified will bypass IOMMU DMA
> + translation.
>  
> 	pcie_aspm=  [PCIE] Forcibly enable or disable PCIe Active State Power
> 	Management.
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 90dbd7c70371..05bf3f4acc36 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -6401,6 +6401,37 @@ void __weak pci_fixup_cardbus(struct pci_bus *bus)
>  }
>  EXPORT_SYMBOL(pci_fixup_cardbus);
>  
> +static const char *iommu_passthrough_param;
> +bool pci_iommu_passthrough_match(struct pci_dev *dev)
> +{
> + int ret = 0;
> + const char *p = iommu_passthrough_param;
> +
> + if (!p)
> + return false;
> +
> + while (*p) {
> + ret = pci_dev_str_match(dev, p, &p);
> + if (ret < 0) {
> + pr_info_once("PCI: Can't parse iommu_passthrough parameter: %s\n",
> +  iommu_passthrough_param);
> +
> + break;
> + } else if (ret == 1) {
> + pci_info(dev, "PCI: IOMMU passthrough\n");
> + return true;
> + }
> +
> + if (*p != ';' && *p != ',') {
> + /* End of param or invalid format */
> + break;
> + }
> + p++;
> + }
> +
> + return false;
> +}

This duplicates a lot of the code in pci_disable_acs_redir().  That
needs to be factored out somehow so we don't duplicate it.

Bjorn


[PATCH v4 5/7] iommu/vt-d: Remove VMD child device sanity check

2020-01-17 Thread Jon Derrick
This removes the sanity check required for VMD child devices. The new
pci_real_dma_dev() DMA alias mechanism places them in the same IOMMU
group as the VMD endpoint. Assignment of the group would require
assigning the VMD endpoint, where unbinding the VMD endpoint removes the
child device domain from the hierarchy.

Signed-off-by: Jon Derrick 
---
 drivers/iommu/intel-iommu.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 01a1b0f..c055699 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -774,15 +774,7 @@ static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn)
if (dev_is_pci(dev)) {
struct pci_dev *pf_pdev;
 
-   pdev = to_pci_dev(dev);
-
-#ifdef CONFIG_X86
-   /* VMD child devices currently cannot be handled individually */
-   if (is_vmd(pdev->bus))
-   return NULL;
-#endif
-
-   pdev = pci_real_dma_dev(dev);
+   pdev = pci_real_dma_dev(to_pci_dev(dev));
 
/* VFs aren't listed in scope tables; we need to look up
 * the PF instead to find the IOMMU. */
-- 
1.8.3.1



[PATCH v4 4/7] iommu/vt-d: Use pci_real_dma_dev() for mapping

2020-01-17 Thread Jon Derrick
The PCI device may have a DMA requester on another bus, such as VMD
subdevices needing to use the VMD endpoint. This case requires the real
DMA device when mapping to IOMMU.

Signed-off-by: Jon Derrick 
---
 drivers/iommu/intel-iommu.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f..01a1b0f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -782,6 +782,8 @@ static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn)
return NULL;
 #endif
 
+   pdev = pci_real_dma_dev(dev);
+
/* VFs aren't listed in scope tables; we need to look up
 * the PF instead to find the IOMMU. */
pf_pdev = pci_physfn(pdev);
@@ -2428,6 +2430,13 @@ static struct dmar_domain *find_domain(struct device *dev)
 dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO))
return NULL;
 
+   if (dev_is_pci(dev)) {
+   struct pci_dev *pdev;
+
+   pdev = pci_real_dma_dev(to_pci_dev(dev));
+   dev = &pdev->dev;
+   }
+
/* No lock here, assumes no domain exit in normal case */
info = dev->archdata.iommu;
if (likely(info))
-- 
1.8.3.1



[PATCH v4 2/7] x86/PCI: Expose VMD's PCI Device in pci_sysdata

2020-01-17 Thread Jon Derrick
To be used by Intel-IOMMU code to find the correct domain.

CC: Christoph Hellwig 
Signed-off-by: Jon Derrick 
---
 arch/x86/include/asm/pci.h   | 4 ++--
 drivers/pci/controller/vmd.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index a4e09db60..6512c54 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -25,7 +25,7 @@ struct pci_sysdata {
void*fwnode;/* IRQ domain for MSI assignment */
 #endif
 #if IS_ENABLED(CONFIG_VMD)
-   bool vmd_domain;/* True if in Intel VMD domain */
+   struct pci_dev  *vmd_dev;   /* VMD Device if in Intel VMD domain */
 #endif
 };
 
@@ -64,7 +64,7 @@ static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
 #if IS_ENABLED(CONFIG_VMD)
 static inline bool is_vmd(struct pci_bus *bus)
 {
-   return to_pci_sysdata(bus)->vmd_domain;
+   return to_pci_sysdata(bus)->vmd_dev != NULL;
 }
 #else
 #define is_vmd(bus)		false
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 2128422..d67ad56 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -679,7 +679,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
.parent = res,
};
 
-   sd->vmd_domain = true;
+   sd->vmd_dev = vmd->dev;
sd->domain = vmd_find_free_domain();
if (sd->domain < 0)
return sd->domain;
-- 
1.8.3.1



[PATCH v4 1/7] x86/PCI: Add a to_pci_sysdata helper

2020-01-17 Thread Jon Derrick
From: Christoph Hellwig 

Various helpers need the pci_sysdata just to dereference a single field
in it.  Add a little helper that returns the properly typed sysdata
pointer to require a little less boilerplate code.

Signed-off-by: Christoph Hellwig 
[jonathan.derrick: to_pci_sysdata const argument]
Signed-off-by: Jon Derrick 
---
 arch/x86/include/asm/pci.h | 29 +
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 90d0731..a4e09db60 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -35,12 +35,15 @@ struct pci_sysdata {
 
 #ifdef CONFIG_PCI
 
+static inline struct pci_sysdata *to_pci_sysdata(const struct pci_bus *bus)
+{
+   return bus->sysdata;
+}
+
 #ifdef CONFIG_PCI_DOMAINS
 static inline int pci_domain_nr(struct pci_bus *bus)
 {
-   struct pci_sysdata *sd = bus->sysdata;
-
-   return sd->domain;
+   return to_pci_sysdata(bus)->domain;
 }
 
 static inline int pci_proc_domain(struct pci_bus *bus)
@@ -52,24 +55,20 @@ static inline int pci_proc_domain(struct pci_bus *bus)
 #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
 static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
 {
-   struct pci_sysdata *sd = bus->sysdata;
-
-   return sd->fwnode;
+   return to_pci_sysdata(bus)->fwnode;
 }
 
 #define pci_root_bus_fwnode	_pci_root_bus_fwnode
 #endif
 
+#if IS_ENABLED(CONFIG_VMD)
 static inline bool is_vmd(struct pci_bus *bus)
 {
-#if IS_ENABLED(CONFIG_VMD)
-   struct pci_sysdata *sd = bus->sysdata;
-
-   return sd->vmd_domain;
-#else
-   return false;
-#endif
+   return to_pci_sysdata(bus)->vmd_domain;
 }
+#else
+#define is_vmd(bus)		false
+#endif /* CONFIG_VMD */
 
 /* Can be used to override the logic in pci_scan_bus for skipping
already-configured bus numbers - to be used for buggy BIOSes
@@ -124,9 +123,7 @@ static inline void early_quirks(void) { }
 /* Returns the node based on pci bus */
 static inline int __pcibus_to_node(const struct pci_bus *bus)
 {
-   const struct pci_sysdata *sd = bus->sysdata;
-
-   return sd->node;
+   return to_pci_sysdata(bus)->node;
 }
 
 static inline const struct cpumask *
-- 
1.8.3.1



[PATCH v4 3/7] PCI: Introduce pci_real_dma_dev()

2020-01-17 Thread Jon Derrick
The current DMA alias implementation requires the aliased device be on
the same PCI bus as the requester ID. This introduces an arch-specific
mechanism to point to another PCI device when doing mapping and
PCI DMA alias search. The default case returns the actual device.

CC: Christoph Hellwig 
Signed-off-by: Jon Derrick 
---
 arch/x86/pci/common.c | 10 ++
 drivers/pci/pci.c | 19 ++-
 drivers/pci/search.c  |  6 ++
 include/linux/pci.h   |  1 +
 4 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 1e59df0..fe21a5c 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -736,3 +736,13 @@ int pci_ext_cfg_avail(void)
else
return 0;
 }
+
+#if IS_ENABLED(CONFIG_VMD)
+struct pci_dev *pci_real_dma_dev(struct pci_dev *dev)
+{
+   if (is_vmd(dev->bus))
+   return to_pci_sysdata(dev->bus)->vmd_dev;
+
+   return dev;
+}
+#endif
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 581b177..36d24f2 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6048,7 +6048,9 @@ bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2)
return (dev1->dma_alias_mask &&
test_bit(dev2->devfn, dev1->dma_alias_mask)) ||
   (dev2->dma_alias_mask &&
-   test_bit(dev1->devfn, dev2->dma_alias_mask));
+   test_bit(dev1->devfn, dev2->dma_alias_mask)) ||
+  pci_real_dma_dev(dev1) == dev2 ||
+  pci_real_dma_dev(dev2) == dev1;
 }
 
 bool pci_device_is_present(struct pci_dev *pdev)
@@ -6072,6 +6074,21 @@ void pci_ignore_hotplug(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_ignore_hotplug);
 
+/**
+ * pci_real_dma_dev - Get PCI DMA device for PCI device
+ * @dev: the PCI device that may have a PCI DMA alias
+ *
+ * Permits the platform to provide architecture-specific functionality to
+ * devices needing to alias DMA to another PCI device on another PCI bus. If
+ * the PCI device is on the same bus, it is recommended to use
+ * pci_add_dma_alias(). This is the default implementation. Architecture
+ * implementations can override this.
+ */
+struct pci_dev __weak *pci_real_dma_dev(struct pci_dev *dev)
+{
+   return dev;
+}
+
 resource_size_t __weak pcibios_default_alignment(void)
 {
return 0;
diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index e4dbdef..2061672 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -32,6 +32,12 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
struct pci_bus *bus;
int ret;
 
+   /*
+* The device may have an explicit alias requester ID for DMA where the
+* requester is on another PCI bus.
+*/
+   pdev = pci_real_dma_dev(pdev);
+
ret = fn(pdev, pci_dev_id(pdev), data);
if (ret)
return ret;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 930fab2..3840a54 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1202,6 +1202,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 int pci_select_bars(struct pci_dev *dev, unsigned long flags);
 bool pci_device_is_present(struct pci_dev *pdev);
 void pci_ignore_hotplug(struct pci_dev *dev);
+struct pci_dev *pci_real_dma_dev(struct pci_dev *dev);
 
 int __printf(6, 7) pci_request_irq(struct pci_dev *dev, unsigned int nr,
irq_handler_t handler, irq_handler_t thread_fn, void *dev_id,
-- 
1.8.3.1



[PATCH v4 7/7] x86/PCI: Remove X86_DEV_DMA_OPS

2020-01-17 Thread Jon Derrick
From: Christoph Hellwig 

There are no users of X86_DEV_DMA_OPS left, so remove the code.

Reviewed-by: Jon Derrick 
Signed-off-by: Christoph Hellwig 
---
 arch/x86/Kconfig  |  3 ---
 arch/x86/include/asm/device.h | 10 --
 arch/x86/pci/common.c | 38 --
 3 files changed, 51 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5e89499..77f9426 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2955,9 +2955,6 @@ config HAVE_ATOMIC_IOMAP
def_bool y
depends on X86_32
 
-config X86_DEV_DMA_OPS
-   bool
-
 source "drivers/firmware/Kconfig"
 
 source "arch/x86/kvm/Kconfig"
diff --git a/arch/x86/include/asm/device.h b/arch/x86/include/asm/device.h
index 5e12c63..7e31f7f 100644
--- a/arch/x86/include/asm/device.h
+++ b/arch/x86/include/asm/device.h
@@ -8,16 +8,6 @@ struct dev_archdata {
 #endif
 };
 
-#if defined(CONFIG_X86_DEV_DMA_OPS) && defined(CONFIG_PCI_DOMAINS)
-struct dma_domain {
-   struct list_head node;
-   const struct dma_map_ops *dma_ops;
-   int domain_nr;
-};
-void add_dma_domain(struct dma_domain *domain);
-void del_dma_domain(struct dma_domain *domain);
-#endif
-
 struct pdev_archdata {
 };
 
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index fe21a5c..df1d959 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -625,43 +625,6 @@ unsigned int pcibios_assign_all_busses(void)
return (pci_probe & PCI_ASSIGN_ALL_BUSSES) ? 1 : 0;
 }
 
-#if defined(CONFIG_X86_DEV_DMA_OPS) && defined(CONFIG_PCI_DOMAINS)
-static LIST_HEAD(dma_domain_list);
-static DEFINE_SPINLOCK(dma_domain_list_lock);
-
-void add_dma_domain(struct dma_domain *domain)
-{
-   spin_lock(&dma_domain_list_lock);
-   list_add(&domain->node, &dma_domain_list);
-   spin_unlock(&dma_domain_list_lock);
-}
-EXPORT_SYMBOL_GPL(add_dma_domain);
-
-void del_dma_domain(struct dma_domain *domain)
-{
-   spin_lock(&dma_domain_list_lock);
-   list_del(&domain->node);
-   spin_unlock(&dma_domain_list_lock);
-}
-EXPORT_SYMBOL_GPL(del_dma_domain);
-
-static void set_dma_domain_ops(struct pci_dev *pdev)
-{
-   struct dma_domain *domain;
-
-   spin_lock(&dma_domain_list_lock);
-   list_for_each_entry(domain, &dma_domain_list, node) {
-   if (pci_domain_nr(pdev->bus) == domain->domain_nr) {
-   pdev->dev.dma_ops = domain->dma_ops;
-   break;
-   }
-   }
-   spin_unlock(&dma_domain_list_lock);
-}
-#else
-static void set_dma_domain_ops(struct pci_dev *pdev) {}
-#endif
-
 static void set_dev_domain_options(struct pci_dev *pdev)
 {
if (is_vmd(pdev->bus))
@@ -697,7 +660,6 @@ int pcibios_add_device(struct pci_dev *dev)
pa_data = data->next;
memunmap(data);
}
-   set_dma_domain_ops(dev);
set_dev_domain_options(dev);
return 0;
 }
-- 
1.8.3.1



[PATCH v4 0/7] Clean up VMD DMA Map Ops

2020-01-17 Thread Jon Derrick
v3 Set: 
https://lore.kernel.org/linux-iommu/20200113181742.ga27...@e121166-lin.cambridge.arm.com/T/#t
v2 Set: 
https://lore.kernel.org/linux-iommu/1578580256-3483-1-git-send-email-jonathan.derr...@intel.com/T/#t
v1 Set: 
https://lore.kernel.org/linux-iommu/20200107134125.gd30...@8bytes.org/T/#t

VMD currently works with VT-d enabled by pointing DMA and IOMMU actions at the
VMD endpoint. The problem with this approach is that the VMD endpoint's
device-specific attributes, such as the DMA Mask Bits, are used instead of the
child device's attributes.

This set cleans up VMD by removing the override that redirects DMA map
operations to the VMD endpoint. Instead it introduces a new DMA alias mechanism
into the existing DMA alias infrastructure. This new DMA alias mechanism allows
an architecture-specific pci_real_dma_dev() function to provide a pointer from
a pci_dev to its PCI DMA device, where by default it returns the original
pci_dev.
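
As a rough illustration of the mechanism (a sketch, not one of the patches;
the helper name below is made up), a caller resolves the device whose
requester ID the IOMMU actually sees like this:

/*
 * For a VMD child device the x86 override of pci_real_dma_dev()
 * returns the VMD endpoint; for every other device it returns the
 * device itself.
 */
static struct device *dma_mapping_device(struct device *dev)
{
	if (dev_is_pci(dev)) {
		struct pci_dev *pdev = pci_real_dma_dev(to_pci_dev(dev));

		return &pdev->dev;
	}

	return dev;
}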

In addition, this set removes the sanity check that was added to prevent
assigning VMD child devices. By using the DMA alias mechanism, all child
devices are assigned the same IOMMU group as the VMD endpoint. This removes the
need for restricting VMD child devices from assignment, as the whole group
would have to be assigned, requiring unbinding the VMD driver and removing the
child device domain.

v1 added a pointer in struct pci_dev that pointed to the DMA alias' struct
pci_dev and did the necessary DMA alias and IOMMU modifications.

v2 introduced a new weak function to reference the 'Direct DMA Alias', and
removed the need to add a pointer in struct device or pci_dev. Weak functions
are generally frowned upon when it's a single architecture implementation, so I
am open to alternatives.

v3 referenced the pci_dev rather than the struct device for the PCI
'Direct DMA Alias' (pci_direct_dma_alias()). This revision also allowed
pci_for_each_dma_alias() to call any DMA aliases for the Direct DMA alias
device, though I don't expect the VMD endpoint to need intra-bus DMA aliases.

v4 changes the 'Direct DMA Alias' to instead refer to the 'Real DMA Dev', which
either returns the PCI device itself or the PCI DMA device.


Changes from v3:
Uses pci_real_dma_dev() instead of pci_direct_dma_alias()
Split IOMMU enabling, IOMMU VMD sanity check and VMD dma_map_ops cleanup into 
three patches

Changes from v2:
Uses struct pci_dev for PCI Device 'Direct DMA aliasing' (pci_direct_dma_alias)
Allows pci_for_each_dma_alias to iterate over the alias mask of the 'Direct DMA 
alias'

Changes from v1:
Removed 1/5 & 2/5 misc fix patches that were merged
Uses Christoph's staging/cleanup patches
Introduce weak function rather than including pointer in struct device or 
pci_dev.

Based on Bjorn's next:
https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/log/?h=next

Christoph Hellwig (2):
  x86/PCI: Add a to_pci_sysdata helper
  x86/PCI: Remove X86_DEV_DMA_OPS

Jon Derrick (5):
  x86/PCI: Expose VMD's PCI Device in pci_sysdata
  PCI: Introduce pci_real_dma_dev()
  iommu/vt-d: Use pci_real_dma_dev() for mapping
  iommu/vt-d: Remove VMD child device sanity check
  PCI: vmd: Stop overriding dma_map_ops

 arch/x86/Kconfig   |   3 -
 arch/x86/include/asm/device.h  |  10 ---
 arch/x86/include/asm/pci.h |  31 -
 arch/x86/pci/common.c  |  48 +++--
 drivers/iommu/intel-iommu.c|  15 ++--
 drivers/pci/controller/Kconfig |   1 -
 drivers/pci/controller/vmd.c   | 152 +
 drivers/pci/pci.c  |  19 +-
 drivers/pci/search.c   |   6 ++
 include/linux/pci.h|   1 +
 10 files changed, 58 insertions(+), 228 deletions(-)

-- 
1.8.3.1



[PATCH v4 6/7] PCI: vmd: Stop overriding dma_map_ops

2020-01-17 Thread Jon Derrick
Devices on the VMD domain use the VMD endpoint's requester ID and have
been relying on the VMD endpoint's DMA operations. The problem with this
was that VMD domain devices would use the VMD endpoint's attributes when
doing DMA and IOMMU mapping. We can be smarter about this by only using
the VMD endpoint when mapping and providing the correct child device's
attributes during DMA operations.

This patch removes the dma_map_ops redirect.

Signed-off-by: Jon Derrick 
---
 drivers/pci/controller/Kconfig |   1 -
 drivers/pci/controller/vmd.c   | 150 -
 2 files changed, 151 deletions(-)

diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index 918e283..20bf00f 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -239,7 +239,6 @@ config PCIE_TANGO_SMP8759
 
 config VMD
depends on PCI_MSI && X86_64 && SRCU
-   select X86_DEV_DMA_OPS
tristate "Intel Volume Management Device Driver"
---help---
  Adds support for the Intel Volume Management Device (VMD). VMD is a
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index d67ad56..fe1acb0 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -98,9 +98,6 @@ struct vmd_dev {
struct irq_domain   *irq_domain;
struct pci_bus  *bus;
u8  busn_start;
-
-   struct dma_map_ops  dma_ops;
-   struct dma_domain   dma_domain;
 };
 
 static inline struct vmd_dev *vmd_from_bus(struct pci_bus *bus)
@@ -295,151 +292,6 @@ static void vmd_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
.chip   = _msi_controller,
 };
 
-/*
- * VMD replaces the requester ID with its own.  DMA mappings for devices in a
- * VMD domain need to be mapped for the VMD, not the device requiring
- * the mapping.
- */
-static struct device *to_vmd_dev(struct device *dev)
-{
-   struct pci_dev *pdev = to_pci_dev(dev);
-   struct vmd_dev *vmd = vmd_from_bus(pdev->bus);
-
-   return &vmd->dev->dev;
-}
-
-static void *vmd_alloc(struct device *dev, size_t size, dma_addr_t *addr,
-  gfp_t flag, unsigned long attrs)
-{
-   return dma_alloc_attrs(to_vmd_dev(dev), size, addr, flag, attrs);
-}
-
-static void vmd_free(struct device *dev, size_t size, void *vaddr,
-dma_addr_t addr, unsigned long attrs)
-{
-   return dma_free_attrs(to_vmd_dev(dev), size, vaddr, addr, attrs);
-}
-
-static int vmd_mmap(struct device *dev, struct vm_area_struct *vma,
-   void *cpu_addr, dma_addr_t addr, size_t size,
-   unsigned long attrs)
-{
-   return dma_mmap_attrs(to_vmd_dev(dev), vma, cpu_addr, addr, size,
-   attrs);
-}
-
-static int vmd_get_sgtable(struct device *dev, struct sg_table *sgt,
-  void *cpu_addr, dma_addr_t addr, size_t size,
-  unsigned long attrs)
-{
-   return dma_get_sgtable_attrs(to_vmd_dev(dev), sgt, cpu_addr, addr, size,
-   attrs);
-}
-
-static dma_addr_t vmd_map_page(struct device *dev, struct page *page,
-  unsigned long offset, size_t size,
-  enum dma_data_direction dir,
-  unsigned long attrs)
-{
-   return dma_map_page_attrs(to_vmd_dev(dev), page, offset, size, dir,
-   attrs);
-}
-
-static void vmd_unmap_page(struct device *dev, dma_addr_t addr, size_t size,
-  enum dma_data_direction dir, unsigned long attrs)
-{
-   dma_unmap_page_attrs(to_vmd_dev(dev), addr, size, dir, attrs);
-}
-
-static int vmd_map_sg(struct device *dev, struct scatterlist *sg, int nents,
- enum dma_data_direction dir, unsigned long attrs)
-{
-   return dma_map_sg_attrs(to_vmd_dev(dev), sg, nents, dir, attrs);
-}
-
-static void vmd_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
-enum dma_data_direction dir, unsigned long attrs)
-{
-   dma_unmap_sg_attrs(to_vmd_dev(dev), sg, nents, dir, attrs);
-}
-
-static void vmd_sync_single_for_cpu(struct device *dev, dma_addr_t addr,
-   size_t size, enum dma_data_direction dir)
-{
-   dma_sync_single_for_cpu(to_vmd_dev(dev), addr, size, dir);
-}
-
-static void vmd_sync_single_for_device(struct device *dev, dma_addr_t addr,
-  size_t size, enum dma_data_direction dir)
-{
-   dma_sync_single_for_device(to_vmd_dev(dev), addr, size, dir);
-}
-
-static void vmd_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
-   int nents, enum dma_data_direction dir)
-{
-   dma_sync_sg_for_cpu(to_vmd_dev(dev), sg, nents, dir);
-}
-
-static void vmd_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
-  int nents, 

[PATCH v2] iommu/arm-smmu-v3: Add SMMUv3.2 range invalidation support

2020-01-17 Thread Rob Herring
Arm SMMUv3.2 adds support for TLB range invalidate operations.
Support for range invalidate is determined by the RIL bit in the IDR3
register.

The range invalidate is in units of the leaf page size and operates on
1-32 chunks of a power of 2 multiple pages. First, we determine from the
size what power of 2 multiple we can use. Then we calculate how many
chunks (1-31) of the power of 2 size for the range on the iteration. On
each iteration, we move up in size by at least 5 bits.
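
One way to picture that chunking (an illustrative sketch only, not the patch;
issue_range_tlbi() and pg_shift are placeholders):

/*
 * Split an invalidation of 'num_pages' leaf pages into range commands
 * that each cover (num + 1) << scale pages, where num is a 5-bit field,
 * i.e. 1-32 chunks of a power-of-2 number of pages per command.
 */
while (num_pages) {
	unsigned long scale = __ffs(num_pages);	/* largest aligned power of 2 */
	unsigned long chunks = num_pages >> scale;
	unsigned long num = (chunks > 32 ? 32 : chunks) - 1;

	issue_range_tlbi(iova, scale, num);	/* one TLBI range command */

	iova += (num + 1) << (scale + pg_shift);
	num_pages -= (num + 1) << scale;
}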

Cc: Eric Auger 
Cc: Jean-Philippe Brucker 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Joerg Roedel 
Signed-off-by: Rob Herring 
---
 drivers/iommu/arm-smmu-v3.c | 66 -
 1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index e91b4a098215..0ee561db7149 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -70,6 +70,9 @@
 #define IDR1_SSIDSIZE  GENMASK(10, 6)
 #define IDR1_SIDSIZE   GENMASK(5, 0)
 
+#define ARM_SMMU_IDR3  0xc
+#define IDR3_RIL   (1 << 10)
+
 #define ARM_SMMU_IDR5  0x14
 #define IDR5_STALL_MAX GENMASK(31, 16)
 #define IDR5_GRAN64K   (1 << 6)
@@ -327,9 +330,14 @@
 #define CMDQ_CFGI_1_LEAF   (1UL << 0)
 #define CMDQ_CFGI_1_RANGE  GENMASK_ULL(4, 0)
 
+#define CMDQ_TLBI_0_NUM		GENMASK_ULL(16, 12)
+#define CMDQ_TLBI_RANGE_NUM_MAX	31
+#define CMDQ_TLBI_0_SCALE  GENMASK_ULL(24, 20)
 #define CMDQ_TLBI_0_VMID   GENMASK_ULL(47, 32)
 #define CMDQ_TLBI_0_ASID   GENMASK_ULL(63, 48)
 #define CMDQ_TLBI_1_LEAF   (1UL << 0)
+#define CMDQ_TLBI_1_TTL		GENMASK_ULL(9, 8)
+#define CMDQ_TLBI_1_TG GENMASK_ULL(11, 10)
 #define CMDQ_TLBI_1_VA_MASKGENMASK_ULL(63, 12)
 #define CMDQ_TLBI_1_IPA_MASK   GENMASK_ULL(51, 12)
 
@@ -455,9 +463,13 @@ struct arm_smmu_cmdq_ent {
#define CMDQ_OP_TLBI_S2_IPA 0x2a
#define CMDQ_OP_TLBI_NSNH_ALL   0x30
struct {
+   u8  num;
+   u8  scale;
u16 asid;
u16 vmid;
boolleaf;
+   u8  ttl;
+   u8  tg;
u64 addr;
} tlbi;
 
@@ -595,6 +607,7 @@ struct arm_smmu_device {
 #define ARM_SMMU_FEAT_HYP  (1 << 12)
 #define ARM_SMMU_FEAT_STALL_FORCE  (1 << 13)
 #define ARM_SMMU_FEAT_VAX  (1 << 14)
+#define ARM_SMMU_FEAT_RANGE_INV	(1 << 15)
u32 features;
 
 #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
@@ -856,13 +869,21 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_RANGE, 31);
break;
case CMDQ_OP_TLBI_NH_VA:
+   cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_NUM, ent->tlbi.num);
+   cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_SCALE, ent->tlbi.scale);
cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_ASID, ent->tlbi.asid);
cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf);
+   cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_TTL, ent->tlbi.ttl);
+   cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_TG, ent->tlbi.tg);
cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_VA_MASK;
break;
case CMDQ_OP_TLBI_S2_IPA:
+   cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_NUM, ent->tlbi.num);
+   cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_SCALE, ent->tlbi.scale);
cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid);
cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf);
+   cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_TTL, ent->tlbi.ttl);
+   cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_TG, ent->tlbi.tg);
cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_IPA_MASK;
break;
case CMDQ_OP_TLBI_NH_ASID:
@@ -2003,7 +2024,7 @@ static void arm_smmu_tlb_inv_range(unsigned long iova, size_t size,
 {
u64 cmds[CMDQ_BATCH_ENTRIES * CMDQ_ENT_DWORDS];
struct arm_smmu_device *smmu = smmu_domain->smmu;
-   unsigned long start = iova, end = iova + size;
+   unsigned long start = iova, end = iova + size, num_pages = 0, tg = 0;
int i = 0;
struct arm_smmu_cmdq_ent cmd = {
.tlbi = {
@@ -2022,12 +2043,50 @@ static void arm_smmu_tlb_inv_range(unsigned long iova, size_t size,
cmd.tlbi.vmid   = smmu_domain->s2_cfg.vmid;
}
 
+   if (smmu->features & ARM_SMMU_FEAT_RANGE_INV) {
+   /* Get the leaf 

Re: [PATCH v3 4/5] PCI: vmd: Stop overriding dma_map_ops

2020-01-17 Thread Derrick, Jonathan
On Tue, 2020-01-14 at 09:54 +0100, Christoph Hellwig wrote:
> On Fri, Jan 10, 2020 at 10:21:12AM -0700, Jon Derrick wrote:
> > Devices on the VMD domain use the VMD endpoint's requester ID and have
> > been relying on the VMD endpoint's DMA operations. The problem with this
> > was that VMD domain devices would use the VMD endpoint's attributes when
> > doing DMA and IOMMU mapping. We can be smarter about this by only using
> > the VMD endpoint when mapping and providing the correct child device's
> > attributes during DMA operations.
> > 
> > This patch modifies Intel-IOMMU to check for a 'Direct DMA Alias' and
> > refer to it for mapping.
> > 
> > Signed-off-by: Jon Derrick 
> > ---
> >  drivers/iommu/intel-iommu.c|  18 +++--
> >  drivers/pci/controller/Kconfig |   1 -
> >  drivers/pci/controller/vmd.c   | 150 
> > -
> >  3 files changed, 13 insertions(+), 156 deletions(-)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index 716347e2..7ca807a 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -804,14 +804,14 @@ static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn)
> >  
> > if (dev_is_pci(dev)) {
> > struct pci_dev *pf_pdev;
> > +   struct pci_dev *dma_alias;
> >  
> > pdev = to_pci_dev(dev);
> >  
> > -#ifdef CONFIG_X86
> > -   /* VMD child devices currently cannot be handled individually */
> > -   if (is_vmd(pdev->bus))
> > -   return NULL;
> > -#endif
> 
> Don't we need this sanity check to prevent assingning vmd subdevices?
I don't think it's necessary now. The new code results in the child
devices being assigned the same IOMMU group as the VMD endpoint.
(AFAIK) You have to assign all devices in a group when doing
assignment, so by unbinding the VMD endpoint in order to assign it, you
would lose the child devices in the host.
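
A quick way to see that (sketch only; child_pdev and vmd_pdev stand for a VMD
child device and the VMD endpoint):

	struct iommu_group *child_grp = iommu_group_get(&child_pdev->dev);
	struct iommu_group *vmd_grp = iommu_group_get(&vmd_pdev->dev);

	/* With the new DMA alias both devices land in the same group. */
	WARN_ON(child_grp != vmd_grp);

	iommu_group_put(child_grp);
	iommu_group_put(vmd_grp);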


> 
> > +   /* DMA aliased devices use the DMA alias's IOMMU */
> > +   dma_alias = pci_direct_dma_alias(pdev);
> > +   if (dma_alias)
> > +   pdev = dma_alias;
> >  
> > /* VFs aren't listed in scope tables; we need to look up
> >  * the PF instead to find the IOMMU. */
> > @@ -2521,6 +2521,14 @@ struct dmar_domain *find_domain(struct device *dev)
> >  dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO))
> > return NULL;
> >  
> > +   if (dev_is_pci(dev)) {
> > +   struct pci_dev *pdev = to_pci_dev(dev);
> > +   struct pci_dev *dma_alias = pci_direct_dma_alias(pdev);
> > +
> > +   if (dma_alias)
> > +   dev = &dma_alias->dev;
> 
> Instead of all these duplicate calls, shouldn't pci_direct_dma_alias be
> replaced with a pci_real_dma_dev helper that either returns the
> dma parent if it exists, or the actual device?
> 
> Also I think this patch should be split - one for intel-iommu that
> just adds the real device checks, and then one that wires up vmd to
> the new mechanism and then removes all the cruft.


Thanks for the review. I'll work on the suggestion

Re: [rfc] dma-mapping: preallocate unencrypted DMA atomic pool

2020-01-17 Thread Tom Lendacky
On 12/31/19 7:54 PM, David Rientjes wrote:
> Christoph, Thomas, is something like this (without the diagnosic 
> information included in this patch) acceptable for these allocations?  
> Adding expansion support when the pool is half depleted wouldn't be *that* 
> hard.

Sorry for the delay in responding...  Overall, I think this will work
well. If you want the default size to be adjustable, you can always go
with a Kconfig setting or a command line parameter or both (I realize the
command line parameter is not optimal).

Just a couple of overall comments about the use of variable names and
messages using both unencrypted and encrypted, I think the use should be
consistent throughout the patch.

Thanks,
Tom

> 
> Or are there alternatives we should consider?  Thanks!
> 
> 
> 
> 
> When AMD SEV is enabled in the guest, all allocations through 
> dma_pool_alloc_page() must call set_memory_decrypted() for unencrypted 
> DMA.  This includes dma_pool_alloc() and dma_direct_alloc_pages().  These 
> calls may block which is not allowed in atomic allocation contexts such as 
> from the NVMe driver.
> 
> Preallocate a complementary unencrypted DMA atomic pool that is initially 
> 4MB in size.  This patch does not contain dynamic expansion, but that 
> could be added if necessary.
> 
> In our stress testing, our peak unencrypted DMA atomic allocation 
> requirements is ~1.4MB, so 4MB is plenty.  This pool is similar to the 
> existing DMA atomic pool but is unencrypted.
> 
> Signed-off-by: David Rientjes 
> ---
>  Based on v5.4 HEAD.
> 
>  This commit contains diagnostic information and is not intended for use 
>  in a production environment.
> 
>  arch/x86/Kconfig|   1 +
>  drivers/iommu/dma-iommu.c   |   5 +-
>  include/linux/dma-mapping.h |   7 ++-
>  kernel/dma/direct.c |  16 -
>  kernel/dma/remap.c  | 116 ++--
>  5 files changed, 108 insertions(+), 37 deletions(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1530,6 +1530,7 @@ config X86_CPA_STATISTICS
>  config AMD_MEM_ENCRYPT
>   bool "AMD Secure Memory Encryption (SME) support"
>   depends on X86_64 && CPU_SUP_AMD
> + select DMA_DIRECT_REMAP
>   select DYNAMIC_PHYSICAL_MASK
>   select ARCH_USE_MEMREMAP_PROT
>   select ARCH_HAS_FORCE_DMA_UNENCRYPTED
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -928,7 +928,7 @@ static void __iommu_dma_free(struct device *dev, size_t size, void *cpu_addr)
>  
>   /* Non-coherent atomic allocation? Easy */
>   if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
> - dma_free_from_pool(cpu_addr, alloc_size))
> + dma_free_from_pool(dev, cpu_addr, alloc_size))
>   return;
>  
>   if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
> @@ -1011,7 +1011,8 @@ static void *iommu_dma_alloc(struct device *dev, size_t size,
>  
>   if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
>   !gfpflags_allow_blocking(gfp) && !coherent)
> - cpu_addr = dma_alloc_from_pool(PAGE_ALIGN(size), &page, gfp);
> + cpu_addr = dma_alloc_from_pool(dev, PAGE_ALIGN(size), &page,
> +gfp);
>   else
>   cpu_addr = iommu_dma_alloc_pages(dev, size, &page, gfp, attrs);
>   if (!cpu_addr)
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -629,9 +629,10 @@ void *dma_common_pages_remap(struct page **pages, size_t size,
>   pgprot_t prot, const void *caller);
>  void dma_common_free_remap(void *cpu_addr, size_t size);
>  
> -bool dma_in_atomic_pool(void *start, size_t size);
> -void *dma_alloc_from_pool(size_t size, struct page **ret_page, gfp_t flags);
> -bool dma_free_from_pool(void *start, size_t size);
> +bool dma_in_atomic_pool(struct device *dev, void *start, size_t size);
> +void *dma_alloc_from_pool(struct device *dev, size_t size,
> +   struct page **ret_page, gfp_t flags);
> +bool dma_free_from_pool(struct device *dev, void *start, size_t size);
>  
>  int
>  dma_common_get_sgtable(struct device *dev, struct sg_table *sgt, void 
> *cpu_addr,
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -10,6 +10,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -131,6 +132,13 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
>   struct page *page;
>   void *ret;
>  
> + if (!gfpflags_allow_blocking(gfp) && force_dma_unencrypted(dev)) {
> + ret = dma_alloc_from_pool(dev, size, &page, gfp);
> + if (!ret)
> + return NULL;
> + goto done;
> + }
> +
>   page = 

Re: [linux-next/mainline][bisected 3acac06][ppc] Oops when unloading mpt3sas driver

2020-01-17 Thread Abdul Haleem
On Thu, 2020-01-16 at 09:44 -0800, Christoph Hellwig wrote:
> Hi Abdul,
> 
> I think the problem is that mpt3sas has some convoluted logic to do
> some DMA allocations with a 32-bit coherent mask, and then switches
> to a 63 or 64 bit mask, which is not supported by the DMA API.
> 
> Can you try the patch below?

Thank you Christoph, with the given patch applied the bug is not seen.

rmmod of mpt3sas driver is successful, no kernel Oops

Reported-and-tested-by: Abdul Haleem 
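
For reference, the rule Christoph points at amounts to choosing both masks
once, before any coherent allocation, rather than switching later -- roughly
(sketch only; buf, size and dma_handle are placeholders, error handling
trimmed):

	/* Pick the streaming and coherent masks up front ... */
	if (dma_set_mask(&pdev->dev, DMA_BIT_MASK(64)) ||
	    dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)))
		return -ENODEV;

	/* ... and only then allocate coherent buffers. */
	buf = dma_alloc_coherent(&pdev->dev, size, &dma_handle, GFP_KERNEL);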

> 
> ---
> From 0738b1704ed528497b41b0408325f6828a8e51f6 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig 
> Date: Thu, 16 Jan 2020 18:31:38 +0100
> Subject: mpt3sas: don't change the dma coherent mask after allocations
> 
> The DMA layer does not allow changing the DMA coherent mask after
> there are outstanding allocations.  Stop doing that and always
> use a 32-bit coherent DMA mask in mpt3sas.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/scsi/mpt3sas/mpt3sas_base.c | 67 -
>  drivers/scsi/mpt3sas/mpt3sas_base.h |  2 -
>  2 files changed, 19 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
> b/drivers/scsi/mpt3sas/mpt3sas_base.c
> index fea3cb6a090b..3b51bed05008 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
> @@ -2706,58 +2706,38 @@ _base_build_sg_ieee(struct MPT3SAS_ADAPTER *ioc, void *psge,
>  static int
>  _base_config_dma_addressing(struct MPT3SAS_ADAPTER *ioc, struct pci_dev *pdev)
>  {
> - u64 required_mask, coherent_mask;
>   struct sysinfo s;
> - /* Set 63 bit DMA mask for all SAS3 and SAS35 controllers */
> - int dma_mask = (ioc->hba_mpi_version_belonged > MPI2_VERSION) ? 63 : 64;
> -
> - if (ioc->is_mcpu_endpoint)
> - goto try_32bit;
> + int dma_mask;
> 
> - required_mask = dma_get_required_mask(&pdev->dev);
> - if (sizeof(dma_addr_t) == 4 || required_mask == 32)
> - goto try_32bit;
> -
> - if (ioc->dma_mask)
> - coherent_mask = DMA_BIT_MASK(dma_mask);
> + if (ioc->is_mcpu_endpoint ||
> + sizeof(dma_addr_t) == 4 ||
> + dma_get_required_mask(&pdev->dev) <= 32)
> + dma_mask = 32;
> + /* Set 63 bit DMA mask for all SAS3 and SAS35 controllers */
> + else if (ioc->hba_mpi_version_belonged > MPI2_VERSION)
> + dma_mask = 63;
>   else
> - coherent_mask = DMA_BIT_MASK(32);
> + dma_mask = 64;
> 
>   if (dma_set_mask(&pdev->dev, DMA_BIT_MASK(dma_mask)) ||
> - dma_set_coherent_mask(&pdev->dev, coherent_mask))
> - goto try_32bit;
> -
> - ioc->base_add_sg_single = &_base_add_sg_single_64;
> - ioc->sge_size = sizeof(Mpi2SGESimple64_t);
> - ioc->dma_mask = dma_mask;
> - goto out;
> -
> - try_32bit:
> - if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32)))
> + dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)))
>   return -ENODEV;
> 
> - ioc->base_add_sg_single = &_base_add_sg_single_32;
> - ioc->sge_size = sizeof(Mpi2SGESimple32_t);
> - ioc->dma_mask = 32;
> - out:
> + if (dma_mask > 32) {
> + ioc->base_add_sg_single = &_base_add_sg_single_64;
> + ioc->sge_size = sizeof(Mpi2SGESimple64_t);
> + } else {
> + ioc->base_add_sg_single = &_base_add_sg_single_32;
> + ioc->sge_size = sizeof(Mpi2SGESimple32_t);
> + }
> +
>   si_meminfo(&s);
>   ioc_info(ioc, "%d BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (%ld kB)\n",
> -  ioc->dma_mask, convert_to_kb(s.totalram));
> +  dma_mask, convert_to_kb(s.totalram));
> 
>   return 0;
>  }
> 
> -static int
> -_base_change_consistent_dma_mask(struct MPT3SAS_ADAPTER *ioc,
> -   struct pci_dev *pdev)
> -{
> - if (pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(ioc->dma_mask))) {
> - if (pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32)))
> - return -ENODEV;
> - }
> - return 0;
> -}
> -
>  /**
>   * _base_check_enable_msix - checks MSIX capabable.
>   * @ioc: per adapter object
> @@ -5030,14 +5010,6 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
>   total_sz += sz;
>   } while (ioc->rdpq_array_enable && (++i < ioc->reply_queue_count));
> 
> - if (ioc->dma_mask > 32) {
> - if (_base_change_consistent_dma_mask(ioc, ioc->pdev) != 0) {
> - ioc_warn(ioc, "no suitable consistent DMA mask for %s\n",
> -  pci_name(ioc->pdev));
> - goto out;
> - }
> - }
> -
>   ioc->scsiio_depth = ioc->hba_queue_depth -
>   ioc->hi_priority_depth - ioc->internal_depth;
> 
> @@ -6965,7 +6937,6 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc)
>   ioc->smp_affinity_enable = smp_affinity_enable;
> 
>   ioc->rdpq_array_enable_assigned = 0;
> - ioc->dma_mask = 0;
>   if 

Re: [RFC PATCH 3/4] iommu: Preallocate iommu group when probing devices

2020-01-17 Thread Joerg Roedel
On Wed, Jan 01, 2020 at 01:26:47PM +0800, Lu Baolu wrote:
> This splits iommu group allocation from adding devices. This makes
> it possible to determine the default domain type for each group as
> all devices belonging to the group have been determined.

I think its better to keep group allocation as it is and just defer
default domain allocation after each device is in its group. But take
care about the device hotplug path which might add new devices to groups
which already have a default domain, or add new groups that might need a
default domain too.

Regards,

Joerg



Re: [PATCH] iommu: amd: Fix IOMMU perf counter clobbering during init

2020-01-17 Thread Joerg Roedel
Adding Suravee, who wrote the IOMMU Perf Counter code.

On Tue, Jan 14, 2020 at 08:12:20AM -0700, Shuah Khan wrote:
> init_iommu_perf_ctr() clobbers the register when it checks write access
> to IOMMU perf counters and fails to restore when they are writable.
> 
> Add save and restore to fix it.
> 
> Signed-off-by: Shuah Khan 
> ---
>  drivers/iommu/amd_iommu_init.c | 22 --
>  1 file changed, 16 insertions(+), 6 deletions(-)

Suravee, can you please review this patch?

> 
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 568c52317757..c0ad4f293522 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -1655,27 +1655,37 @@ static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
>  static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>  {
>   struct pci_dev *pdev = iommu->dev;
> - u64 val = 0xabcd, val2 = 0;
> + u64 val = 0xabcd, val2 = 0, save_reg = 0;
>  
>   if (!iommu_feature(iommu, FEATURE_PC))
>   return;
>  
>   amd_iommu_pc_present = true;
>  
> + /* save the value to restore, if writable */
> + if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
> + goto pc_false;
> +
>   /* Check if the performance counters can be written to */
>   if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
>   (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
> - (val != val2)) {
> - pci_err(pdev, "Unable to write to IOMMU perf counter.\n");
> - amd_iommu_pc_present = false;
> - return;
> - }
> + (val != val2))
> + goto pc_false;
> +
> + /* restore */
> + if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
> + goto pc_false;
>  
>   pci_info(pdev, "IOMMU performance counters supported\n");
>  
>   val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
>   iommu->max_banks = (u8) ((val >> 12) & 0x3f);
>   iommu->max_counters = (u8) ((val >> 7) & 0xf);
> +
> +pc_false:
> + pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n");
> + amd_iommu_pc_present = false;
> + return;
>  }
>  
>  static ssize_t amd_iommu_show_cap(struct device *dev,
> -- 
> 2.20.1


Re: [PATCH 0/2] iommu/amd: Optimization and cleanup for unused variables

2020-01-17 Thread Joerg Roedel
On Thu, Jan 09, 2020 at 11:02:49AM +0800, Adrian Huang wrote:
> This series optimizes the register reading by using readq instead of
> readl and cleans up the unused variables.
> 
> Adrian Huang (2):
>   iommu/amd: Replace two consecutive readl calls with one readq
>   iommu/amd: Remove unused struct member

Applied, thanks Adrian.


Re: [GIT PULL] iommu/arm-smmu: Updates for 5.6

2020-01-17 Thread Joerg Roedel
On Thu, Jan 16, 2020 at 10:25:49AM +, Will Deacon wrote:
> Hi Joerg,
> 
> Please pull these Arm SMMU updates for 5.6. The branch is based on your
> arm/smmu branch and includes a patch addressing the feedback from Greg
> about setting the module 'owner' field in the 'iommu_ops'.
> 
> I've used a signed tag this time, so you can see the summary of the
> changes listed in there. The big deal is that we're laying the groundwork
> for PCIe PASID support in SMMUv3, and I expect to hook that up for PCIe
> masters in 5.7 once we've exported the necessary symbols to do so.
> 
> Cheers,
> 
> Will
> 
> --->8
> 
> The following changes since commit 1ea27ee2f76e67f98b9942988f1336a70d351317:
> 
>   iommu/arm-smmu: Update my email address in MODULE_AUTHOR() (2019-12-23 
> 14:06:06 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
> tags/arm-smmu-updates

Pulled, thanks Will.



Re: [PATCH 1/1] iommu/vt-d: Remove unnecessary WARN_ON_ONCE()

2020-01-17 Thread Joerg Roedel
On Thu, Jan 16, 2020 at 09:52:36AM +0800, Lu Baolu wrote:
> Address field in device TLB invalidation descriptor is qualified
> by the S field. If S field is zero, a single page at page address
> specified by address [63:12] is requested to be invalidated. If S
> field is set, the least significant bit in the address field with
> value 0b (say bit N) indicates the invalidation address range. The
> spec doesn't require the address [N - 1, 0] to be cleared, hence
> remove the unnecessary WARN_ON_ONCE().
> 
> Otherwise, the caller might set "mask = MAX_AGAW_PFN_WIDTH" in order
> to invalidate all the cached mappings on an endpoint, and the overflow
> error below will be triggered.
> 
> [...]
> UBSAN: Undefined behaviour in drivers/iommu/dmar.c:1354:3
> shift exponent 64 is too large for 64-bit type 'long long unsigned int'
> [...]
> 
> Reported-and-tested-by: Frank 
> Signed-off-by: Lu Baolu 

Does this need a Fixes and/or stable tag?


Regards,

Joerg



Re: [PULL REQUEST] iommu/vt-d: patches for v5.6 - 2nd wave

2020-01-17 Thread Joerg Roedel
On Wed, Jan 15, 2020 at 11:03:54AM +0800, Lu Baolu wrote:
> Barret Rhoden (2):
>   iommu/vt-d: Mark firmware tainted if RMRR fails sanity check
>   iommu/vt-d: Add RMRR base and end addresses sanity check
> 
> Lu Baolu (2):
>   iommu/vt-d: Allow devices with RMRRs to use identity domain
>   iommu/vt-d: Unnecessary to handle default identity domain
> 
> jimyan (1):
>   iommu/vt-d: Don't reject Host Bridge due to scope mismatch
> 
>  drivers/iommu/dmar.c|  2 +-
>  drivers/iommu/intel-iommu.c | 47 ++---
>  2 files changed, 24 insertions(+), 25 deletions(-)

All queued to the x86/vt-d branch, thanks.