Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address

2020-05-27 Thread Srinath Mannam via iommu
On Wed, May 27, 2020 at 11:00 PM Robin Murphy  wrote:
>
Thanks Robin for your quick response.
> On 2020-05-27 17:03, Srinath Mannam wrote:
> > This patch gives the provision to change default value of MSI IOVA base
> > to platform's suitable IOVA using module parameter. The present
> > hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
>
> That in itself doesn't seem entirely unreasonable; IIRC the current
> address is just an arbitrary choice to fit nicely into Qemu's memory
> map, and there was always the possibility that it wouldn't suit everything.
>
> > Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> > DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> > property are reserved.
>
> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> dynamically, so they aren't affected by reserved regions any more than
> regular DMA pages are. In fact, it explicitly ignores the software MSI
> region, since as the comment says, it *is* the software that manages those.
Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
>
> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> the kernel *doesn't* control the address space, but still needs some way
> to steal a bit of it for MSIs that the guest doesn't necessarily know
> about, and give userspace a fighting chance of knowing what it's taken.
> I think at the time we discussed the idea of adding something to the
> VFIO uapi such that userspace could move this around if it wanted or
> needed to, but decided we could live without that initially. Perhaps now
> the time has come?
Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
region is considered to map MSI registers. This patch helps us to fix the issue.

Thanks,
Srinath.
>
> Robin.
>
> > If any platform has the limitaion to access default MSI IOVA, then it can
> > be changed using "arm-smmu.msi_iova_base=0xa000" command line argument.
> >
> > Signed-off-by: Srinath Mannam 
> > ---
> >   drivers/iommu/arm-smmu.c | 5 -
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 4f1a350..5e59c9d 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -72,6 +72,9 @@ static bool disable_bypass =
> >   module_param(disable_bypass, bool, S_IRUGO);
> >   MODULE_PARM_DESC(disable_bypass,
> >   "Disable bypass streams such that incoming transactions from devices 
> > that are not attached to an iommu domain will report an abort back to the 
> > device and will not be allowed to pass through the SMMU.");
> > +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> > +module_param(msi_iova_base, ulong, S_IRUGO);
> > +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >
> >   struct arm_smmu_s2cr {
> >   struct iommu_group  *group;
> > @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device 
> > *dev,
> >   struct iommu_resv_region *region;
> >   int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> >
> > - region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> > + region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
> >prot, IOMMU_RESV_SW_MSI);
> >   if (!region)
> >   return;
> >
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/2] iommu: Remove iommu_sva_ops::mm_exit()

2020-05-27 Thread Lu Baolu

Sorry for the typo.

On 2020/5/28 11:32, Lu Baolu wrote:

Hi Jorge,

 ^
 Joerg




On 2020/5/27 20:42, Joerg Roedel wrote:

Hi Jean-Philippe,

On Wed, May 27, 2020 at 12:10:38PM +0200, Jean-Philippe Brucker wrote:

I was wondering if you could take these two patches for v5.8. The API
change is a precursor for the SVA support in SMMUv3, and the VT-d
implementation of the SVA API (queued for 5.8) doesn't implement
iommu_sva_ops.


I'd like some Acks on patch 2 (at least from the Intel people) before
going ahead with this.



Patch 2 in this series looks good to me.

Acked-by: Lu Baolu 

+Jacob, he participated in the discussions.

Best regards,
baolu

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/2] iommu: Remove iommu_sva_ops::mm_exit()

2020-05-27 Thread Lu Baolu

Hi Jorge,

On 2020/5/27 20:42, Joerg Roedel wrote:

Hi Jean-Philippe,

On Wed, May 27, 2020 at 12:10:38PM +0200, Jean-Philippe Brucker wrote:

I was wondering if you could take these two patches for v5.8. The API
change is a precursor for the SVA support in SMMUv3, and the VT-d
implementation of the SVA API (queued for 5.8) doesn't implement
iommu_sva_ops.


I'd like some Acks on patch 2 (at least from the Intel people) before
going ahead with this.



Patch 2 in this series looks good to me.

Acked-by: Lu Baolu 

+Jacob, he participated in the discussions.

Best regards,
baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

2020-05-27 Thread Bjorn Helgaas
On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote:
> Some platform devices appear as PCI but are actually on the AMBA bus,
> and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
> Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
> is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
> down iommu probing as all devices in fixup final list will be
> reprocessed, suggested by Joerg, [1]

Is this slowdown significant?  We already iterate over every device
when applying PCI_FIXUP_FINAL quirks, so if we used the existing
PCI_FIXUP_FINAL, we wouldn't be adding a new loop.  We would only be
adding two more iterations to the loop in pci_do_fixups() that tries
to match quirks against the current device.  I doubt that would be a
measurable slowdown.

> For example:
> Hisilicon platform device need fixup in
> drivers/pci/quirks.c handling fwspec->can_stall, which is introduced in [2]
> 
> +static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
> +{
> +struct iommu_fwspec *fwspec;
> +
> +pdev->eetlp_prefix_path = 1;
> +fwspec = dev_iommu_fwspec_get(>dev);
> +if (fwspec)
> +fwspec->can_stall = 1;
> +}
> +
> +DECLARE_PCI_FIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
> +DECLARE_PCI_iFIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa251, 
> quirk_huawei_pcie_sva); 
> 
> [1] https://www.spinics.net/lists/iommu/msg44591.html
> [2] https://www.spinics.net/lists/linux-pci/msg94559.html

If you reference these in the commit logs, please use lore.kernel.org
links instead of spinics.

> Zhangfei Gao (2):
>   PCI: Introduce PCI_FIXUP_IOMMU
>   iommu: calling pci_fixup_iommu in iommu_fwspec_init
> 
>  drivers/iommu/iommu.c | 4 
>  drivers/pci/quirks.c  | 7 +++
>  include/asm-generic/vmlinux.lds.h | 3 +++
>  include/linux/pci.h   | 8 
>  4 files changed, 22 insertions(+)
> 
> -- 
> 2.7.4
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address

2020-05-27 Thread Robin Murphy

On 2020-05-27 17:03, Srinath Mannam wrote:

This patch gives the provision to change default value of MSI IOVA base
to platform's suitable IOVA using module parameter. The present
hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.


That in itself doesn't seem entirely unreasonable; IIRC the current 
address is just an arbitrary choice to fit nicely into Qemu's memory 
map, and there was always the possibility that it wouldn't suit everything.



Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
property are reserved.


That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells 
dynamically, so they aren't affected by reserved regions any more than 
regular DMA pages are. In fact, it explicitly ignores the software MSI 
region, since as the comment says, it *is* the software that manages those.


The MSI_IOVA_BASE region exists for VFIO, precisely because in that case 
the kernel *doesn't* control the address space, but still needs some way 
to steal a bit of it for MSIs that the guest doesn't necessarily know 
about, and give userspace a fighting chance of knowing what it's taken. 
I think at the time we discussed the idea of adding something to the 
VFIO uapi such that userspace could move this around if it wanted or 
needed to, but decided we could live without that initially. Perhaps now 
the time has come?


Robin.


If any platform has the limitaion to access default MSI IOVA, then it can
be changed using "arm-smmu.msi_iova_base=0xa000" command line argument.

Signed-off-by: Srinath Mannam 
---
  drivers/iommu/arm-smmu.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 4f1a350..5e59c9d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -72,6 +72,9 @@ static bool disable_bypass =
  module_param(disable_bypass, bool, S_IRUGO);
  MODULE_PARM_DESC(disable_bypass,
"Disable bypass streams such that incoming transactions from devices that 
are not attached to an iommu domain will report an abort back to the device and will not 
be allowed to pass through the SMMU.");
+static unsigned long msi_iova_base = MSI_IOVA_BASE;
+module_param(msi_iova_base, ulong, S_IRUGO);
+MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
  
  struct arm_smmu_s2cr {

struct iommu_group  *group;
@@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
struct iommu_resv_region *region;
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
  
-	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,

+   region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
 prot, IOMMU_RESV_SW_MSI);
if (!region)
return;


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 09/14] device core: Add ability to handle multiple dma offsets

2020-05-27 Thread Nicolas Saenz Julienne
Hi Jim,

On Wed, 2020-05-27 at 11:43 -0400, Jim Quinlan wrote:
> Hi Nicolas,
> 
> On Wed, May 27, 2020 at 11:00 AM Nicolas Saenz Julienne
>  wrote:
> > Hi Jim,
> > one thing comes to mind, there is a small test suite in
> > drivers/of/unittest.c
> > (specifically of_unittest_pci_dma_ranges()) you could extend it to include
> > your
> > use cases.
> Sure, will check out.
> > On Tue, 2020-05-26 at 15:12 -0400, Jim Quinlan wrote:
> > > The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> > > the use of multiple pfn offsets between cpu addrs and dma addrs.  It is
> > > similar to 'dma_pfn_offset' except that the offset chosen depends on the
> > > cpu or dma address involved.
> > > 
> > > Signed-off-by: Jim Quinlan 
> > > ---
> > >  drivers/of/address.c| 65 +++--
> > >  drivers/usb/core/message.c  |  3 ++
> > >  drivers/usb/core/usb.c  |  3 ++
> > >  include/linux/device.h  | 10 +-
> > >  include/linux/dma-direct.h  | 10 --
> > >  include/linux/dma-mapping.h | 46 ++
> > >  kernel/dma/Kconfig  | 13 
> > >  7 files changed, 144 insertions(+), 6 deletions(-)
> > > 
> > 
> > [...]
> > 
> > > @@ -977,10 +1020,19 @@ int of_dma_get_range(struct device *dev, struct
> > > device_node *np, u64 *dma_addr,
> > >   pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> > >range.bus_addr, range.cpu_addr, range.size);
> > > 
> > > + num_ranges++;
> > >   if (dma_offset && range.cpu_addr - range.bus_addr !=
> > > dma_offset)
> > > {
> > > - pr_warn("Can't handle multiple dma-ranges with
> > > different
> > > offsets on node(%pOF)\n", node);
> > > - /* Don't error out as we'd break some existing DTs
> > > */
> > > - continue;
> > > + if (!IS_ENABLED(CONFIG_DMA_PFN_OFFSET_MAP)) {
> > > + pr_warn("Can't handle multiple dma-ranges
> > > with
> > > different offsets on node(%pOF)\n", node);
> > > + pr_warn("Perhaps set
> > > DMA_PFN_OFFSET_MAP=y?\n");
> > > + /*
> > > +  * Don't error out as we'd break some
> > > existing
> > > +  * DTs that are using configs w/o
> > > +  * CONFIG_DMA_PFN_OFFSET_MAP set.
> > > +  */
> > > + continue;
> > 
> > dev->bus_dma_limit is set in of_dma_configure(), this function's caller,
> > based
> > on dma_start's value (set after this continue). So you'd be effectively
> > setting
> > the dev->bus_dma_limit to whatever we get from the first dma-range.
> I'm not seeing that at all.  On the  evaluation of each dma-range,
> dma_start and dma_end are re-evaluated to be the lowest and highest
> bus values of the  dma-ranges seen so far.  After all dma-ranges are
> examined,  dev->bus_dma_limit being set to the highest.  In fact, the
> current code -- ie before my commits -- already does this for multiple
> dma-ranges as long as the cpu-bus offset is the same in the
> dma-ranges.

Sorry I got carried away, you're right.

So I understand there is an underlaying assumption that the non DMAble memory
space will always sit on top of the bus memory space, as intertwined as it
might be, so as to every phys_to_dma() call on non DMAble memory to fall above
bus_dma_limit.

Regards,
Nicolas



signature.asc
Description: This is a digitally signed message part
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v1 0/3] iommu/vt-d: real DMA sub-device info allocation

2020-05-27 Thread Jon Derrick
This set adds the support for real DMA sub-devices to have device_domain_info,
leading to the correct domain type being used.

This applies on Joerg's origin/next. This also applies against v5.6.12
and v5.7-rc7 with some API modifications, making it a stable candidate
that fixes the issue reported in [1].

For v5.6.12 and v5.7-rc7, identity_mapping() would return 0 for real DMA
sub-devices due to not having valid device_domain_info, leading to
__intel_map_single() paths. This is a problem if the real DMA device
started in IDENTITY, leading to a NULL Pointer Dereference:

__intel_map_single()
domain = find_domain(dev);
dev = _real_dma_dev(to_pci_dev(dev))->dev;
info = dev->archdata.iommu;
return info->domain;

iommu = domain_get_iommu(domain)
if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
return NULL;

cap_zlr(iommu->cap) <-- NULL Pointer Deref

This issue was also fixed by 6fc7020cf298 ("iommu/vt-d: Apply per-device
dma_ops") due to removing identity_mapping() paths.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=207575

Jon Derrick (3):
  iommu/vt-d: Only clear real DMA device's context entries
  iommu/vt-d: Allocate domain info for real DMA sub-devices
  iommu/vt-d: Remove real DMA lookup in find_domain

 drivers/iommu/intel-iommu.c | 31 +++
 include/linux/intel-iommu.h |  1 +
 2 files changed, 24 insertions(+), 8 deletions(-)

-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v1 3/3] iommu/vt-d: Remove real DMA lookup in find_domain

2020-05-27 Thread Jon Derrick
By removing the real DMA indirection in find_domain(), we can allow
sub-devices of a real DMA device to have their own valid
device_domain_info. The dmar lookup and context entry removal paths have
been fixed to account for sub-devices.

Fixes: 2b0140c69637 ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207575
Signed-off-by: Jon Derrick 
---
 drivers/iommu/intel-iommu.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 6d39b9b..5767882 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2436,9 +2436,6 @@ struct dmar_domain *find_domain(struct device *dev)
if (unlikely(attach_deferred(dev) || iommu_dummy(dev)))
return NULL;
 
-   if (dev_is_pci(dev))
-   dev = _real_dma_dev(to_pci_dev(dev))->dev;
-
/* No lock here, assumes no domain exit in normal case */
info = get_domain_info(dev);
if (likely(info))
-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v1 2/3] iommu/vt-d: Allocate domain info for real DMA sub-devices

2020-05-27 Thread Jon Derrick
Sub-devices of a real DMA device might exist on a separate segment than
the real DMA device and its IOMMU. These devices should still have a
valid device_domain_info, but the current dma alias model won't
allocate info for the subdevice.

This patch adds a segment member to struct device_domain_info and uses
the sub-device's BDF so that these sub-devices won't alias to other
devices.

Signed-off-by: Jon Derrick 
---
 drivers/iommu/intel-iommu.c | 19 +++
 include/linux/intel-iommu.h |  1 +
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 1ff45b2..6d39b9b 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2463,7 +2463,7 @@ static void do_deferred_attach(struct device *dev)
struct device_domain_info *info;
 
list_for_each_entry(info, _domain_list, global)
-   if (info->iommu->segment == segment && info->bus == bus &&
+   if (info->segment == segment && info->bus == bus &&
info->devfn == devfn)
return info;
 
@@ -2520,8 +2520,18 @@ static struct dmar_domain 
*dmar_insert_one_dev_info(struct intel_iommu *iommu,
if (!info)
return NULL;
 
-   info->bus = bus;
-   info->devfn = devfn;
+   if (!dev_is_real_dma_subdevice(dev)) {
+   info->bus = bus;
+   info->devfn = devfn;
+   info->segment = iommu->segment;
+   } else {
+   struct pci_dev *pdev = to_pci_dev(dev);
+
+   info->bus = pdev->bus->number;
+   info->devfn = pdev->devfn;
+   info->segment = pci_domain_nr(pdev->bus);
+   }
+
info->ats_supported = info->pasid_supported = info->pri_supported = 0;
info->ats_enabled = info->pasid_enabled = info->pri_enabled = 0;
info->ats_qdep = 0;
@@ -2561,7 +2571,8 @@ static struct dmar_domain 
*dmar_insert_one_dev_info(struct intel_iommu *iommu,
 
if (!found) {
struct device_domain_info *info2;
-   info2 = dmar_search_domain_by_dev_info(iommu->segment, bus, 
devfn);
+   info2 = dmar_search_domain_by_dev_info(info->segment, info->bus,
+  info->devfn);
if (info2) {
found  = info2->domain;
info2->dev = dev;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 21633ce..4100bd2 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -609,6 +609,7 @@ struct device_domain_info {
struct list_head auxiliary_domains; /* auxiliary domains
 * attached to this device
 */
+   u32 segment;/* PCI segment number */
u8 bus; /* PCI bus number */
u8 devfn;   /* PCI devfn number */
u16 pfsid;  /* SRIOV physical function source ID */
-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v1 1/3] iommu/vt-d: Only clear real DMA device's context entries

2020-05-27 Thread Jon Derrick
Domain context mapping can encounter issues with sub-devices of a real
DMA device. A sub-device cannot have a valid context entry due to it
potentially aliasing another device's 16-bit ID. It's expected that
sub-devices of the real DMA device uses the real DMA device's requester
when context mapping.

This is an issue when a sub-device is removed where the context entry is
cleared for all aliases. Other sub-devices are still valid, resulting in
those sub-devices being stranded without valid context entries.

The correct approach is to use the real DMA device when programming the
context entries. The insertion path is correct because device_to_iommu()
will return the bus and devfn of the real DMA device. The removal path
needs to only operate on the real DMA device, otherwise the entire
context entry would be cleared for all sub-devices of the real DMA
device.

This patch also adds a helper to determine if a struct device is a
sub-device of a real DMA device.

Signed-off-by: Jon Derrick 
---
 drivers/iommu/intel-iommu.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index ff5a30a..1ff45b2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2500,6 +2500,12 @@ static int domain_setup_first_level(struct intel_iommu 
*iommu,
 flags);
 }
 
+static bool dev_is_real_dma_subdevice(struct device *dev)
+{
+   return dev && dev_is_pci(dev) &&
+  pci_real_dma_dev(to_pci_dev(dev)) != to_pci_dev(dev);
+}
+
 static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
int bus, int devfn,
struct device *dev,
@@ -4975,7 +4981,8 @@ static void __dmar_remove_one_dev_info(struct 
device_domain_info *info)
PASID_RID2PASID, false);
 
iommu_disable_dev_iotlb(info);
-   domain_context_clear(iommu, info->dev);
+   if (!dev_is_real_dma_subdevice(info->dev))
+   domain_context_clear(iommu, info->dev);
intel_pasid_free_table(info->dev);
}
 
-- 
1.8.3.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address

2020-05-27 Thread Srinath Mannam via iommu
This patch gives the provision to change default value of MSI IOVA base
to platform's suitable IOVA using module parameter. The present
hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.

Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
property are reserved.

If any platform has the limitaion to access default MSI IOVA, then it can
be changed using "arm-smmu.msi_iova_base=0xa000" command line argument.

Signed-off-by: Srinath Mannam 
---
 drivers/iommu/arm-smmu.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 4f1a350..5e59c9d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -72,6 +72,9 @@ static bool disable_bypass =
 module_param(disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
"Disable bypass streams such that incoming transactions from devices 
that are not attached to an iommu domain will report an abort back to the 
device and will not be allowed to pass through the SMMU.");
+static unsigned long msi_iova_base = MSI_IOVA_BASE;
+module_param(msi_iova_base, ulong, S_IRUGO);
+MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
 
 struct arm_smmu_s2cr {
struct iommu_group  *group;
@@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
struct iommu_resv_region *region;
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
-   region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
+   region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
 prot, IOMMU_RESV_SW_MSI);
if (!region)
return;
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 09/14] device core: Add ability to handle multiple dma offsets

2020-05-27 Thread Jim Quinlan via iommu
Hi Nicolas,

On Wed, May 27, 2020 at 11:00 AM Nicolas Saenz Julienne
 wrote:
>
> Hi Jim,
> one thing comes to mind, there is a small test suite in drivers/of/unittest.c
> (specifically of_unittest_pci_dma_ranges()) you could extend it to include 
> your
> use cases.
Sure, will check out.
>
> On Tue, 2020-05-26 at 15:12 -0400, Jim Quinlan wrote:
> > The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> > the use of multiple pfn offsets between cpu addrs and dma addrs.  It is
> > similar to 'dma_pfn_offset' except that the offset chosen depends on the
> > cpu or dma address involved.
> >
> > Signed-off-by: Jim Quinlan 
> > ---
> >  drivers/of/address.c| 65 +++--
> >  drivers/usb/core/message.c  |  3 ++
> >  drivers/usb/core/usb.c  |  3 ++
> >  include/linux/device.h  | 10 +-
> >  include/linux/dma-direct.h  | 10 --
> >  include/linux/dma-mapping.h | 46 ++
> >  kernel/dma/Kconfig  | 13 
> >  7 files changed, 144 insertions(+), 6 deletions(-)
> >
>
> [...]
>
> > @@ -977,10 +1020,19 @@ int of_dma_get_range(struct device *dev, struct
> > device_node *np, u64 *dma_addr,
> >   pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
> >range.bus_addr, range.cpu_addr, range.size);
> >
> > + num_ranges++;
> >   if (dma_offset && range.cpu_addr - range.bus_addr != 
> > dma_offset)
> > {
> > - pr_warn("Can't handle multiple dma-ranges with 
> > different
> > offsets on node(%pOF)\n", node);
> > - /* Don't error out as we'd break some existing DTs */
> > - continue;
> > + if (!IS_ENABLED(CONFIG_DMA_PFN_OFFSET_MAP)) {
> > + pr_warn("Can't handle multiple dma-ranges with
> > different offsets on node(%pOF)\n", node);
> > + pr_warn("Perhaps set 
> > DMA_PFN_OFFSET_MAP=y?\n");
> > + /*
> > +  * Don't error out as we'd break some existing
> > +  * DTs that are using configs w/o
> > +  * CONFIG_DMA_PFN_OFFSET_MAP set.
> > +  */
> > + continue;
>
> dev->bus_dma_limit is set in of_dma_configure(), this function's caller, based
> on dma_start's value (set after this continue). So you'd be effectively 
> setting
> the dev->bus_dma_limit to whatever we get from the first dma-range.
I'm not seeing that at all.  On the  evaluation of each dma-range,
dma_start and dma_end are re-evaluated to be the lowest and highest
bus values of the  dma-ranges seen so far.  After all dma-ranges are
examined,  dev->bus_dma_limit being set to the highest.  In fact, the
current code -- ie before my commits -- already does this for multiple
dma-ranges as long as the cpu-bus offset is the same in the
dma-ranges.
>
> This can be troublesome depending on how the dma-ranges are setup, for example
> if the first dma-range doesn't include the CMA area, in arm64 generally set as
> high as possible in ZONE_DMA32, that would render it useless for
> dma/{direct/swiotlb}. Again depending on the bus_dma_limit value, if smaller
> than ZONE_DMA you'd be unable to allocate any DMA memory.
>
> IMO, a solution to this calls for a revamp of dma-direct's dma_capable(): 
> match
> the target DMA memory area with each dma-range we have to see if it fits.
>
> > + }
> > + dma_multi_pfn_offset = true;
> >   }
> >   dma_offset = range.cpu_addr - range.bus_addr;
> >
> > @@ -991,6 +1043,13 @@ int of_dma_get_range(struct device *dev, struct
> > device_node *np, u64 *dma_addr,
> >   dma_end = range.bus_addr + range.size;
> >   }
> >
> > + if (dma_multi_pfn_offset) {
> > + dma_offset = 0;
> > + ret = attach_dma_pfn_offset_map(dev, node, num_ranges);
> > + if (ret)
> > + return ret;
> > + }
> > +
> >   if (dma_start >= dma_end) {
> >   ret = -EINVAL;
> >   pr_debug("Invalid DMA ranges configuration on node(%pOF)\n",
> > diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> > index 6197938dcc2d..aaa3e58f5eb4 100644
> > --- a/drivers/usb/core/message.c
> > +++ b/drivers/usb/core/message.c
> > @@ -1960,6 +1960,9 @@ int usb_set_configuration(struct usb_device *dev, int
> > configuration)
> >*/
> >   intf->dev.dma_mask = dev->dev.dma_mask;
> >   intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
> > +#ifdef CONFIG_DMA_PFN_OFFSET_MAP
> > + intf->dev.dma_pfn_offset_map = dev->dev.dma_pfn_offset_map;
> > +#endif
>
> Thanks for looking at this, that said, I see more instances of drivers 
> changing
> dma_pfn_offset outside of the core code. Why not doing this there too?
>

Re: [PATCH v2 09/14] device core: Add ability to handle multiple dma offsets

2020-05-27 Thread Nicolas Saenz Julienne
Hi Jim,
one thing comes to mind, there is a small test suite in drivers/of/unittest.c
(specifically of_unittest_pci_dma_ranges()) you could extend it to include your
use cases.

On Tue, 2020-05-26 at 15:12 -0400, Jim Quinlan wrote:
> The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> the use of multiple pfn offsets between cpu addrs and dma addrs.  It is
> similar to 'dma_pfn_offset' except that the offset chosen depends on the
> cpu or dma address involved.
> 
> Signed-off-by: Jim Quinlan 
> ---
>  drivers/of/address.c| 65 +++--
>  drivers/usb/core/message.c  |  3 ++
>  drivers/usb/core/usb.c  |  3 ++
>  include/linux/device.h  | 10 +-
>  include/linux/dma-direct.h  | 10 --
>  include/linux/dma-mapping.h | 46 ++
>  kernel/dma/Kconfig  | 13 
>  7 files changed, 144 insertions(+), 6 deletions(-)
> 

[...]

> @@ -977,10 +1020,19 @@ int of_dma_get_range(struct device *dev, struct
> device_node *np, u64 *dma_addr,
>   pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
>range.bus_addr, range.cpu_addr, range.size);
>  
> + num_ranges++;
>   if (dma_offset && range.cpu_addr - range.bus_addr != dma_offset)
> {
> - pr_warn("Can't handle multiple dma-ranges with different
> offsets on node(%pOF)\n", node);
> - /* Don't error out as we'd break some existing DTs */
> - continue;
> + if (!IS_ENABLED(CONFIG_DMA_PFN_OFFSET_MAP)) {
> + pr_warn("Can't handle multiple dma-ranges with
> different offsets on node(%pOF)\n", node);
> + pr_warn("Perhaps set DMA_PFN_OFFSET_MAP=y?\n");
> + /*
> +  * Don't error out as we'd break some existing
> +  * DTs that are using configs w/o
> +  * CONFIG_DMA_PFN_OFFSET_MAP set.
> +  */
> + continue;

dev->bus_dma_limit is set in of_dma_configure(), this function's caller, based
on dma_start's value (set after this continue). So you'd be effectively setting
the dev->bus_dma_limit to whatever we get from the first dma-range.

This can be troublesome depending on how the dma-ranges are setup, for example
if the first dma-range doesn't include the CMA area, in arm64 generally set as
high as possible in ZONE_DMA32, that would render it useless for
dma/{direct/swiotlb}. Again depending on the bus_dma_limit value, if smaller
than ZONE_DMA you'd be unable to allocate any DMA memory.

IMO, a solution to this calls for a revamp of dma-direct's dma_capable(): match
the target DMA memory area with each dma-range we have to see if it fits.

> + }
> + dma_multi_pfn_offset = true;
>   }
>   dma_offset = range.cpu_addr - range.bus_addr;
>  
> @@ -991,6 +1043,13 @@ int of_dma_get_range(struct device *dev, struct
> device_node *np, u64 *dma_addr,
>   dma_end = range.bus_addr + range.size;
>   }
>  
> + if (dma_multi_pfn_offset) {
> + dma_offset = 0;
> + ret = attach_dma_pfn_offset_map(dev, node, num_ranges);
> + if (ret)
> + return ret;
> + }
> +
>   if (dma_start >= dma_end) {
>   ret = -EINVAL;
>   pr_debug("Invalid DMA ranges configuration on node(%pOF)\n",
> diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> index 6197938dcc2d..aaa3e58f5eb4 100644
> --- a/drivers/usb/core/message.c
> +++ b/drivers/usb/core/message.c
> @@ -1960,6 +1960,9 @@ int usb_set_configuration(struct usb_device *dev, int
> configuration)
>*/
>   intf->dev.dma_mask = dev->dev.dma_mask;
>   intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
> +#ifdef CONFIG_DMA_PFN_OFFSET_MAP
> + intf->dev.dma_pfn_offset_map = dev->dev.dma_pfn_offset_map;
> +#endif

Thanks for looking at this, that said, I see more instances of drivers changing
dma_pfn_offset outside of the core code. Why not doing this there too?

Also, are we 100% sure that dev->dev.dma_pfn_offset isn't going to be freed
before we're done using intf->dev? Maybe it's safer to copy the ranges?

>   INIT_WORK(>reset_ws, __usb_queue_reset_device);
>   intf->minor = -1;
>   device_initialize(>dev);
> diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
> index f16c26dc079d..d2ed4d90e56e 100644
> --- a/drivers/usb/core/usb.c
> +++ b/drivers/usb/core/usb.c
> @@ -612,6 +612,9 @@ struct usb_device *usb_alloc_dev(struct usb_device
> *parent,
>*/
>   dev->dev.dma_mask = bus->sysdev->dma_mask;
>   dev->dev.dma_pfn_offset = bus->sysdev->dma_pfn_offset;
> +#ifdef CONFIG_DMA_PFN_OFFSET_MAP
> + 

Re: [PATCH 1/2] ia64: Hide the archdata.iommu field behind generic IOMMU_API

2020-05-27 Thread Joerg Roedel
On Mon, May 18, 2020 at 02:08:54PM +0200, Krzysztof Kozlowski wrote:
> There is a generic, kernel wide configuration symbol for enabling the
> IOMMU specific bits: CONFIG_IOMMU_API.  Implementations (including
> INTEL_IOMMU driver) select it so use it here as well.
> 
> This makes the conditional archdata.iommu field consistent with other
> platforms and also fixes any compile test builds of other IOMMU drivers,
> when INTEL_IOMMU is not selected).
> 
> For the case when INTEL_IOMMU and COMPILE_TEST are not selected, this
> should create functionally equivalent code/choice.  With COMPILE_TEST
> this field could appear if other IOMMU drivers are chosen but
> INTEL_IOMMU not.
> 
> Reported-by: kbuild test robot 
> Signed-off-by: Krzysztof Kozlowski 
> 
> ---
> 
> See:
> https://lore.kernel.org/lkml/202005181412.frc4jufy%25...@intel.com/
> ---
>  arch/ia64/include/asm/device.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied both to iommu/fixes.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] x86: Hide the archdata.iommu field behind generic IOMMU_API

2020-05-27 Thread Borislav Petkov
On Mon, May 18, 2020 at 02:08:55PM +0200, Krzysztof Kozlowski wrote:
> There is a generic, kernel wide configuration symbol for enabling the
> IOMMU specific bits: CONFIG_IOMMU_API.  Implementations (including
> INTEL_IOMMU and AMD_IOMMU driver) select it so use it here as well.
> 
> This makes the conditional archdata.iommu field consistent with other
> platforms and also fixes any compile test builds of other IOMMU drivers,
> when INTEL_IOMMU or AMD_IOMMU are not selected).
> 
> For the case when INTEL_IOMMU/AMD_IOMMU and COMPILE_TEST are not
> selected, this should create functionally equivalent code/choice.  With
> COMPILE_TEST this field could appear if other IOMMU drivers are chosen
> but neither INTEL_IOMMU nor AMD_IOMMU are not.
> 
> Reported-by: kbuild test robot 
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  arch/x86/include/asm/device.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/device.h b/arch/x86/include/asm/device.h
> index 7e31f7f1bb06..49bd6cf3eec9 100644
> --- a/arch/x86/include/asm/device.h
> +++ b/arch/x86/include/asm/device.h
> @@ -3,7 +3,7 @@
>  #define _ASM_X86_DEVICE_H
>  
>  struct dev_archdata {
> -#if defined(CONFIG_INTEL_IOMMU) || defined(CONFIG_AMD_IOMMU)
> +#ifdef CONFIG_IOMMU_API
>   void *iommu; /* hook for IOMMU specific extension */
>  #endif
>  };

Acked-by: Borislav Petkov 

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

2020-05-27 Thread Zhangfei Gao



On 2020/5/27 下午5:53, Arnd Bergmann wrote:

On Wed, May 27, 2020 at 11:00 AM Greg Kroah-Hartman
 wrote:

On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote:

Some platform devices appear as PCI but are actually on the AMBA bus,

Why would these devices not just show up on the AMBA bus and use all of
that logic instead of being a PCI device and having to go through odd
fixes like this?

There is a general move to having hardware be discoverable even with
ARM processors. Having on-chip devices be discoverable using PCI config
space is how x86 SoCs usually do it, and that is generally a good thing
as it means we don't need to describe them in DT

I guess as the hardware designers are still learning about it, this is not
always done correctly. In general, we can also describe PCI devices on
DT and do fixups during the probing there, but I suspect that won't work
as easily using ACPI probing, so the fixup is keyed off the hardware ID,
again as is common for x86 on-chip devices.

  

Yes, thanks Arnd :)

In order to use pasid, io page fault has to be supported,
either by PCI PRI feature (from pci device) or stall mode from smmu 
(platform device).
Here is letting system know the platform device can support smmu stall 
mode, as a result support pasid.

While stall is not a pci capability, so we use a fixup here.

Thanks

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 0/2] iommu: Remove iommu_sva_ops::mm_exit()

2020-05-27 Thread Joerg Roedel
Hi Jean-Philippe,

On Wed, May 27, 2020 at 12:10:38PM +0200, Jean-Philippe Brucker wrote:
> I was wondering if you could take these two patches for v5.8. The API
> change is a precursor for the SVA support in SMMUv3, and the VT-d
> implementation of the SVA API (queued for 5.8) doesn't implement
> iommu_sva_ops.

I'd like some Acks on patch 2 (at least from the Intel people) before
going ahead with this.


Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/2] drivers/iommu: Constify structs

2020-05-27 Thread Joerg Roedel
On Mon, May 25, 2020 at 11:49:56PM +0200, Rikard Falkeborn wrote:
> Constify some structs with function pointers to allow the compiler to
> put them in read-only memory. There is not dependency between the
> patches.
> 
> Rikard Falkeborn (2):
>   iommu/hyper-v: Constify hyperv_ir_domain_ops
>   iommu/sun50i: Constify sun50i_iommu_ops
> 
>  drivers/iommu/hyperv-iommu.c | 2 +-
>  drivers/iommu/sun50i-iommu.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

Applied both, thanks.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/4] PCI, iommu: Factor 'untrusted' check for ATS enablement

2020-05-27 Thread Joerg Roedel
On Wed, May 20, 2020 at 05:21:59PM +0200, Jean-Philippe Brucker wrote:
> IOMMU drivers currently check themselves if a device is untrusted
> (plugged into an external-facing port) before enabling ATS. Move the
> check to drivers/pci. The only functional change should be to the AMD
> IOMMU driver. With this change all IOMMU drivers block 'Translated' PCIe
> transactions and Translation Requests from untrusted devices.
> 
> Since v1 [1] I added tags, addressed comments on patches 1 and 3, and
> fixed a regression in patch 3.
> 
> [1] 
> https://lore.kernel.org/linux-iommu/20200515104359.1178606-1-jean-phili...@linaro.org/
> 
> Jean-Philippe Brucker (4):
>   PCI/ATS: Only enable ATS for trusted devices
>   iommu/amd: Use pci_ats_supported()
>   iommu/arm-smmu-v3: Use pci_ats_supported()
>   iommu/vt-d: Use pci_ats_supported()
> 
>  include/linux/pci-ats.h |  3 +++
>  drivers/iommu/amd_iommu.c   | 12 
>  drivers/iommu/arm-smmu-v3.c | 20 +++-
>  drivers/iommu/intel-iommu.c |  9 +++--
>  drivers/pci/ats.c   | 18 +-
>  5 files changed, 34 insertions(+), 28 deletions(-)

Applied, thanks.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC] Use SMMU HTTU for DMA dirty page tracking

2020-05-27 Thread Zengtao (B)
> -Original Message-
> From: zhengxiang (A)
> Sent: Monday, May 25, 2020 7:34 PM
> To: Jean-Philippe Brucker
> Cc: linux-arm-ker...@lists.infradead.org; kvm...@lists.cs.columbia.edu;
> Will Deacon; Wanghaibin (D); Zengtao (B); m...@kernel.org; James Morse;
> julien.thierry.k...@gmail.com; Suzuki K Poulose; Wangzhou (B);
> iommu@lists.linux-foundation.org; Kirti Wankhede; Yan Zhao;
> alex.william...@redhat.com
> Subject: Re: [RFC] Use SMMU HTTU for DMA dirty page tracking
> 
> [+cc Kirti, Yan, Alex]
> 
> On 2020/5/23 1:14, Jean-Philippe Brucker wrote:
> > Hi,
> >
> > On Tue, May 19, 2020 at 05:42:55PM +0800, Xiang Zheng wrote:
> >> Hi all,
> >>
> >> Is there any plan for enabling SMMU HTTU?
> >
> > Not outside of SVA, as far as I know.
> >
> 
> >> I have seen the patch locates in the SVA series patch, which adds
> >> support for HTTU:
> >> https://www.spinics.net/lists/arm-kernel/msg798694.html
> >>
> >> HTTU reduces the number of access faults on SMMU fault queue
> >> (permission faults also benifit from it).
> >>
> >> Besides reducing the faults, HTTU also helps to track dirty pages for
> >> device DMA. Is it feasible to utilize HTTU to get dirty pages on device
> >> DMA during VFIO live migration?
> >
> > As you know there is a VFIO interface for this under discussion:
> >
> https://lore.kernel.org/kvm/1589781397-28368-1-git-send-email-kwankhe
> d...@nvidia.com/
> > It doesn't implement an internal API to communicate with the IOMMU
> driver
> > about dirty pages.
> 
> >
> >> If SMMU can track dirty pages, devices are not required to implement
> >> additional dirty pages tracking to support VFIO live migration.
> >
> > It seems feasible, though tracking it in the device might be more
> > efficient. I might have misunderstood but I think for live migration of
> > the Intel NIC they trap guest accesses to the device and introspect its
> > state to figure out which pages it is accessing.
> >
> > With HTTU I suppose (without much knowledge about live migration)
> that
> > you'd need several new interfaces to the IOMMU drivers:
> >
> > * A way for VFIO to query HTTU support in the SMMU. There are some
> >   discussions about communicating more IOMMU capabilities through
> VFIO but
> >   no implementation yet. When HTTU isn't supported the DIRTY_PAGES
> bitmap
> >   would report all pages as they do now.
> >
> > * VFIO_IOMMU_DIRTY_PAGES_FLAG_START/STOP would clear the dirty
> bit
> >   for all VFIO mappings (which is going to take some time). There is a
> >   walker in io-pgtable for iova_to_phys() which could be extended. I
> >   suppose it's also possible to atomically switch the HA and HD bits in
> >   context descriptors.
> 
> Maybe we need not switch HA and HD bits, just turn on them all the time?

I think we need START/STOP, start is called when migration is started and STOP
called when migration finished.

I think you mean that during the migration(between START and STOP), HA and HD
functionality is always turned on.

> 
> >
> > * VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP would query the
> dirty bit for all
> >   VFIO mappings.
> >
We need a clear during the query which mean we have to reset the dirty status 
after a query.

> 
> I think we need to consider the case of IOMMU dirty pages logging. We
> want
> to test Kirti's VFIO migration patches combined with SMMU HTTU, any
> suggestions?

@kirti @yan @alex
As we know, you have worked on this feature for a long time, and it seems that
 everything is going very well now, thanks very much for your VFIO migration 
framework, it really helps a lot, and we want to start with SMMU HTTU enabled
 based on your jobs, it's kind of hardware dirty page tracking, and expected to 
bring us
 better performance, any suggestions? 

Thank you.
Zengtao 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 02/10] iommu/amd: Unexport get_dev_data()

2020-05-27 Thread Joerg Roedel
From: Joerg Roedel 

This function is internal to the AMD IOMMU driver and only exported
because the amd_iommu_v2 modules calls it. But the reason it is called
from there could better be handled by amd_iommu_is_attach_deferred().
So unexport get_dev_data() and use amd_iommu_is_attach_deferred()
instead.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd/amd_iommu_proto.h |  3 ++-
 drivers/iommu/amd/iommu.c   |  9 +
 drivers/iommu/amd/iommu_v2.c| 10 --
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_proto.h 
b/drivers/iommu/amd/amd_iommu_proto.h
index 92c2ba6468a0..1c6c12c11368 100644
--- a/drivers/iommu/amd/amd_iommu_proto.h
+++ b/drivers/iommu/amd/amd_iommu_proto.h
@@ -92,5 +92,6 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
 }
 
 extern bool translation_pre_enabled(struct amd_iommu *iommu);
-extern struct iommu_dev_data *get_dev_data(struct device *dev);
+extern bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
+struct device *dev);
 #endif /* _ASM_X86_AMD_IOMMU_PROTO_H  */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 39155f550f18..8368f6b9c17f 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -280,11 +280,10 @@ static struct iommu_dev_data *find_dev_data(u16 devid)
return dev_data;
 }
 
-struct iommu_dev_data *get_dev_data(struct device *dev)
+static struct iommu_dev_data *get_dev_data(struct device *dev)
 {
return dev->archdata.iommu;
 }
-EXPORT_SYMBOL(get_dev_data);
 
 /*
 * Find or create an IOMMU group for a acpihid device.
@@ -2706,12 +2705,14 @@ static void amd_iommu_get_resv_regions(struct device 
*dev,
list_add_tail(>list, head);
 }
 
-static bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
-struct device *dev)
+bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
+ struct device *dev)
 {
struct iommu_dev_data *dev_data = dev->archdata.iommu;
+
return dev_data->defer_attach;
 }
+EXPORT_SYMBOL_GPL(amd_iommu_is_attach_deferred);
 
 static void amd_iommu_flush_iotlb_all(struct iommu_domain *domain)
 {
diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index d6d85debd01b..9b6e038150c1 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -517,13 +517,12 @@ static int ppr_notifier(struct notifier_block *nb, 
unsigned long e, void *data)
struct amd_iommu_fault *iommu_fault;
struct pasid_state *pasid_state;
struct device_state *dev_state;
+   struct pci_dev *pdev = NULL;
unsigned long flags;
struct fault *fault;
bool finish;
u16 tag, devid;
int ret;
-   struct iommu_dev_data *dev_data;
-   struct pci_dev *pdev = NULL;
 
iommu_fault = data;
tag = iommu_fault->tag & 0x1ff;
@@ -534,12 +533,11 @@ static int ppr_notifier(struct notifier_block *nb, 
unsigned long e, void *data)
   devid & 0xff);
if (!pdev)
return -ENODEV;
-   dev_data = get_dev_data(>dev);
 
-   /* In kdump kernel pci dev is not initialized yet -> send INVALID */
ret = NOTIFY_DONE;
-   if (translation_pre_enabled(amd_iommu_rlookup_table[devid])
-   && dev_data->defer_attach) {
+
+   /* In kdump kernel pci dev is not initialized yet -> send INVALID */
+   if (amd_iommu_is_attach_deferred(NULL, >dev)) {
amd_iommu_complete_ppr(pdev, iommu_fault->pasid,
   PPR_INVALID, tag);
goto out;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 05/10] iommu/amd: Free page-table in protection_domain_free()

2020-05-27 Thread Joerg Roedel
From: Joerg Roedel 

Align release of the page-table with the place where it is allocated.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd/iommu.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 0d5a5dbee9f3..5282ff6b8ea0 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2387,12 +2387,18 @@ static void cleanup_domain(struct protection_domain 
*domain)
 
 static void protection_domain_free(struct protection_domain *domain)
 {
+   struct domain_pgtable pgtable;
+
if (!domain)
return;
 
if (domain->id)
domain_id_free(domain->id);
 
+   amd_iommu_domain_get_pgtable(domain, );
+   atomic64_set(>pt_root, 0);
+   free_pagetable();
+
kfree(domain);
 }
 
@@ -2476,7 +2482,6 @@ static struct iommu_domain 
*amd_iommu_domain_alloc(unsigned type)
 static void amd_iommu_domain_free(struct iommu_domain *dom)
 {
struct protection_domain *domain;
-   struct domain_pgtable pgtable;
 
domain = to_pdomain(dom);
 
@@ -2494,10 +2499,6 @@ static void amd_iommu_domain_free(struct iommu_domain 
*dom)
dma_ops_domain_free(domain);
break;
default:
-   amd_iommu_domain_get_pgtable(domain, );
-   atomic64_set(>pt_root, 0);
-   free_pagetable();
-
if (domain->flags & PD_IOMMUV2_MASK)
free_gcr3_table(domain);
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 04/10] iommu/amd: Allocate page-table in protection_domain_init()

2020-05-27 Thread Joerg Roedel
From: Joerg Roedel 

Consolidate the allocation of the domain page-table in one place.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd/iommu.c | 48 ++-
 1 file changed, 22 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index c7e47a7f0d45..0d5a5dbee9f3 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -71,6 +71,8 @@
  */
 #define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
 
+#define DEFAULT_PGTABLE_LEVEL  PAGE_MODE_3_LEVEL
+
 static DEFINE_SPINLOCK(pd_bitmap_lock);
 
 /* List of all available dev_data structures */
@@ -99,7 +101,7 @@ struct iommu_cmd {
 struct kmem_cache *amd_iommu_irq_cache;
 
 static void update_domain(struct protection_domain *domain);
-static int protection_domain_init(struct protection_domain *domain);
+static int protection_domain_init(struct protection_domain *domain, int mode);
 static void detach_device(struct device *dev);
 static void update_and_flush_device_table(struct protection_domain *domain,
  struct domain_pgtable *pgtable);
@@ -1847,21 +1849,14 @@ static void dma_ops_domain_free(struct 
protection_domain *domain)
 static struct protection_domain *dma_ops_domain_alloc(void)
 {
struct protection_domain *domain;
-   u64 *pt_root, root;
 
domain = kzalloc(sizeof(struct protection_domain), GFP_KERNEL);
if (!domain)
return NULL;
 
-   if (protection_domain_init(domain))
-   goto free_domain;
-
-   pt_root = (void *)get_zeroed_page(GFP_KERNEL);
-   if (!pt_root)
+   if (protection_domain_init(domain, DEFAULT_PGTABLE_LEVEL))
goto free_domain;
 
-   root = amd_iommu_domain_encode_pgtable(pt_root, PAGE_MODE_3_LEVEL);
-   atomic64_set(>pt_root, root);
domain->flags = PD_DMA_OPS_MASK;
 
if (iommu_get_dma_cookie(>domain) == -ENOMEM)
@@ -2401,18 +2396,31 @@ static void protection_domain_free(struct 
protection_domain *domain)
kfree(domain);
 }
 
-static int protection_domain_init(struct protection_domain *domain)
+static int protection_domain_init(struct protection_domain *domain, int mode)
 {
+   u64 *pt_root = NULL, root;
+
+   BUG_ON(mode < PAGE_MODE_NONE || mode > PAGE_MODE_6_LEVEL);
+
spin_lock_init(>lock);
domain->id = domain_id_alloc();
if (!domain->id)
return -ENOMEM;
INIT_LIST_HEAD(>dev_list);
 
+   if (mode != PAGE_MODE_NONE) {
+   pt_root = (void *)get_zeroed_page(GFP_KERNEL);
+   if (!pt_root)
+   return -ENOMEM;
+   }
+
+   root = amd_iommu_domain_encode_pgtable(pt_root, mode);
+   atomic64_set(>pt_root, root);
+
return 0;
 }
 
-static struct protection_domain *protection_domain_alloc(void)
+static struct protection_domain *protection_domain_alloc(int mode)
 {
struct protection_domain *domain;
 
@@ -2420,7 +2428,7 @@ static struct protection_domain 
*protection_domain_alloc(void)
if (!domain)
return NULL;
 
-   if (protection_domain_init(domain))
+   if (protection_domain_init(domain, mode))
goto out_err;
 
return domain;
@@ -2434,23 +2442,13 @@ static struct protection_domain 
*protection_domain_alloc(void)
 static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
 {
struct protection_domain *pdomain;
-   u64 *pt_root, root;
 
switch (type) {
case IOMMU_DOMAIN_UNMANAGED:
-   pdomain = protection_domain_alloc();
+   pdomain = protection_domain_alloc(DEFAULT_PGTABLE_LEVEL);
if (!pdomain)
return NULL;
 
-   pt_root = (void *)get_zeroed_page(GFP_KERNEL);
-   if (!pt_root) {
-   protection_domain_free(pdomain);
-   return NULL;
-   }
-
-   root = amd_iommu_domain_encode_pgtable(pt_root, 
PAGE_MODE_3_LEVEL);
-   atomic64_set(>pt_root, root);
-
pdomain->domain.geometry.aperture_start = 0;
pdomain->domain.geometry.aperture_end   = ~0ULL;
pdomain->domain.geometry.force_aperture = true;
@@ -2464,11 +2462,9 @@ static struct iommu_domain 
*amd_iommu_domain_alloc(unsigned type)
}
break;
case IOMMU_DOMAIN_IDENTITY:
-   pdomain = protection_domain_alloc();
+   pdomain = protection_domain_alloc(PAGE_MODE_NONE);
if (!pdomain)
return NULL;
-
-   atomic64_set(>pt_root, PAGE_MODE_NONE);
break;
default:
return NULL;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 08/10] iommu/amd: Merge private header files

2020-05-27 Thread Joerg Roedel
From: Joerg Roedel 

Merge amd_iommu_proto.h into amd_iommu.h.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd/amd_iommu.h   | 96 +++-
 drivers/iommu/amd/amd_iommu_proto.h | 97 -
 drivers/iommu/amd/debugfs.c |  5 +-
 drivers/iommu/amd/init.c|  4 +-
 drivers/iommu/amd/iommu.c   |  4 +-
 drivers/iommu/amd/iommu_v2.c|  4 +-
 6 files changed, 100 insertions(+), 110 deletions(-)
 delete mode 100644 drivers/iommu/amd/amd_iommu_proto.h

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 12d540d9b59b..f892992c8744 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -1,9 +1,103 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2009-2010 Advanced Micro Devices, Inc.
+ * Author: Joerg Roedel 
+ */
 
 #ifndef AMD_IOMMU_H
 #define AMD_IOMMU_H
 
-int __init add_special_device(u8 type, u8 id, u16 *devid, bool cmd_line);
+#include 
+
+#include "amd_iommu_types.h"
+
+extern int amd_iommu_get_num_iommus(void);
+extern int amd_iommu_init_dma_ops(void);
+extern int amd_iommu_init_passthrough(void);
+extern irqreturn_t amd_iommu_int_thread(int irq, void *data);
+extern irqreturn_t amd_iommu_int_handler(int irq, void *data);
+extern void amd_iommu_apply_erratum_63(u16 devid);
+extern void amd_iommu_reset_cmd_buffer(struct amd_iommu *iommu);
+extern int amd_iommu_init_devices(void);
+extern void amd_iommu_uninit_devices(void);
+extern void amd_iommu_init_notifier(void);
+extern int amd_iommu_init_api(void);
+
+#ifdef CONFIG_AMD_IOMMU_DEBUGFS
+void amd_iommu_debugfs_setup(struct amd_iommu *iommu);
+#else
+static inline void amd_iommu_debugfs_setup(struct amd_iommu *iommu) {}
+#endif
+
+/* Needed for interrupt remapping */
+extern int amd_iommu_prepare(void);
+extern int amd_iommu_enable(void);
+extern void amd_iommu_disable(void);
+extern int amd_iommu_reenable(int);
+extern int amd_iommu_enable_faulting(void);
+extern int amd_iommu_guest_ir;
+
+/* IOMMUv2 specific functions */
+struct iommu_domain;
+
+extern bool amd_iommu_v2_supported(void);
+extern int amd_iommu_register_ppr_notifier(struct notifier_block *nb);
+extern int amd_iommu_unregister_ppr_notifier(struct notifier_block *nb);
+extern void amd_iommu_domain_direct_map(struct iommu_domain *dom);
+extern int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
+extern int amd_iommu_flush_page(struct iommu_domain *dom, int pasid,
+   u64 address);
+extern int amd_iommu_flush_tlb(struct iommu_domain *dom, int pasid);
+extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid,
+unsigned long cr3);
+extern int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, int pasid);
+extern struct iommu_domain *amd_iommu_get_v2_domain(struct pci_dev *pdev);
+
+#ifdef CONFIG_IRQ_REMAP
+extern int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
+#else
+static inline int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
+{
+   return 0;
+}
+#endif
+
+#define PPR_SUCCESS0x0
+#define PPR_INVALID0x1
+#define PPR_FAILURE0xf
+
+extern int amd_iommu_complete_ppr(struct pci_dev *pdev, int pasid,
+ int status, int tag);
+
+static inline bool is_rd890_iommu(struct pci_dev *pdev)
+{
+   return (pdev->vendor == PCI_VENDOR_ID_ATI) &&
+  (pdev->device == PCI_DEVICE_ID_RD890_IOMMU);
+}
+
+static inline bool iommu_feature(struct amd_iommu *iommu, u64 f)
+{
+   if (!(iommu->cap & (1 << IOMMU_CAP_EFR)))
+   return false;
+
+   return !!(iommu->features & f);
+}
+
+static inline u64 iommu_virt_to_phys(void *vaddr)
+{
+   return (u64)__sme_set(virt_to_phys(vaddr));
+}
+
+static inline void *iommu_phys_to_virt(unsigned long paddr)
+{
+   return phys_to_virt(__sme_clr(paddr));
+}
+
+extern bool translation_pre_enabled(struct amd_iommu *iommu);
+extern bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
+struct device *dev);
+extern int __init add_special_device(u8 type, u8 id, u16 *devid,
+bool cmd_line);
 
 #ifdef CONFIG_DMI
 void amd_iommu_apply_ivrs_quirks(void);
diff --git a/drivers/iommu/amd/amd_iommu_proto.h 
b/drivers/iommu/amd/amd_iommu_proto.h
deleted file mode 100644
index 1c6c12c11368..
--- a/drivers/iommu/amd/amd_iommu_proto.h
+++ /dev/null
@@ -1,97 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (C) 2009-2010 Advanced Micro Devices, Inc.
- * Author: Joerg Roedel 
- */
-
-#ifndef _ASM_X86_AMD_IOMMU_PROTO_H
-#define _ASM_X86_AMD_IOMMU_PROTO_H
-
-#include "amd_iommu_types.h"
-
-extern int amd_iommu_get_num_iommus(void);
-extern int amd_iommu_init_dma_ops(void);
-extern int amd_iommu_init_passthrough(void);
-extern irqreturn_t amd_iommu_int_thread(int irq, void *data);

[PATCH 07/10] iommu/amd: Remove PD_DMA_OPS_MASK

2020-05-27 Thread Joerg Roedel
From: Joerg Roedel 

This is covered by IOMMU_DOMAIN_DMA from the IOMMU core code already,
so remove it.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd/iommu.c | 24 +++-
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 9e0737932e0c..7c87ef78f26a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1817,15 +1817,6 @@ static void free_gcr3_table(struct protection_domain 
*domain)
free_page((unsigned long)domain->gcr3_tbl);
 }
 
-/*
- * little helper function to check whether a given protection domain is a
- * dma_ops domain
- */
-static bool dma_ops_domain(struct protection_domain *domain)
-{
-   return domain->flags & PD_DMA_OPS_MASK;
-}
-
 static void set_dte_entry(u16 devid, struct protection_domain *domain,
  struct domain_pgtable *pgtable,
  bool ats, bool ppr)
@@ -2408,11 +2399,9 @@ static struct iommu_domain 
*amd_iommu_domain_alloc(unsigned type)
domain->domain.geometry.aperture_end   = ~0ULL;
domain->domain.geometry.force_aperture = true;
 
-   if (type == IOMMU_DOMAIN_DMA) {
-   if (iommu_get_dma_cookie(>domain) == -ENOMEM)
-   goto free_domain;
-   domain->flags = PD_DMA_OPS_MASK;
-   }
+   if (type == IOMMU_DOMAIN_DMA &&
+   iommu_get_dma_cookie(>domain) == -ENOMEM)
+   goto free_domain;
 
return >domain;
 
@@ -3024,17 +3013,18 @@ struct iommu_domain *amd_iommu_get_v2_domain(struct 
pci_dev *pdev)
if (!check_device(dev))
return NULL;
 
-   pdomain = get_dev_data(dev)->domain;
+   pdomain   = get_dev_data(dev)->domain;
+   io_domain = iommu_get_domain_for_dev(dev);
if (pdomain == NULL && get_dev_data(dev)->defer_attach) {
get_dev_data(dev)->defer_attach = false;
-   io_domain = iommu_get_domain_for_dev(dev);
pdomain = to_pdomain(io_domain);
attach_device(dev, pdomain);
}
+
if (pdomain == NULL)
return NULL;
 
-   if (!dma_ops_domain(pdomain))
+   if (io_domain->type != IOMMU_DOMAIN_DMA)
return NULL;
 
/* Only return IOMMUv2 domains */
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 00/10] iommu/amd: Updates and Cleanups

2020-05-27 Thread Joerg Roedel
Hi,

here is a collection of patches that clean up a few things
in the AMD IOMMU driver. Foremost, it moves all related
files of the driver into a separate subdirectory.

But the patches also remove usage of dev->archdata.iommu and
clean up dev_data handling and domain allocation.

Patches are runtime-tested, including device-assignment.

Please review.

Regards,

Joerg

Joerg Roedel (10):
  iommu/amd: Move AMD IOMMU driver to a subdirectory
  iommu/amd: Unexport get_dev_data()
  iommu/amd: Let free_pagetable() not rely on domain->pt_root
  iommu/amd: Allocate page-table in protection_domain_init()
  iommu/amd: Free page-table in protection_domain_free()
  iommu/amd: Consolidate domain allocation/freeing
  iommu/amd: Remove PD_DMA_OPS_MASK
  iommu/amd: Merge private header files
  iommu/amd: Store dev_data as device iommu private data
  iommu/amd: Remove redundant devid checks

 MAINTAINERS   |   2 +-
 drivers/iommu/Makefile|   6 +-
 .../{amd_iommu_proto.h => amd/amd_iommu.h}|  20 +-
 drivers/iommu/{ => amd}/amd_iommu_types.h |   0
 .../{amd_iommu_debugfs.c => amd/debugfs.c}|   5 +-
 .../iommu/{amd_iommu_init.c => amd/init.c}|   6 +-
 drivers/iommu/{amd_iommu.c => amd/iommu.c}| 265 ++
 .../iommu/{amd_iommu_v2.c => amd/iommu_v2.c}  |  14 +-
 .../{amd_iommu_quirks.c => amd/quirks.c}  |   0
 drivers/iommu/amd_iommu.h |  14 -
 10 files changed, 117 insertions(+), 215 deletions(-)
 rename drivers/iommu/{amd_iommu_proto.h => amd/amd_iommu.h} (88%)
 rename drivers/iommu/{ => amd}/amd_iommu_types.h (100%)
 rename drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} (89%)
 rename drivers/iommu/{amd_iommu_init.c => amd/init.c} (99%)
 rename drivers/iommu/{amd_iommu.c => amd/iommu.c} (95%)
 rename drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} (98%)
 rename drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c} (100%)
 delete mode 100644 drivers/iommu/amd_iommu.h

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 10/10] iommu/amd: Remove redundant devid checks

2020-05-27 Thread Joerg Roedel
From: Joerg Roedel 

Checking the return value of get_device_id() in a code-path which has
already done check_device() is not needed, as check_device() does the
same check and bails out if it fails.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd/iommu.c | 13 ++---
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2504aa184837..db149c1a35bf 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -413,13 +413,8 @@ static void iommu_ignore_device(struct device *dev)
 static void amd_iommu_uninit_device(struct device *dev)
 {
struct iommu_dev_data *dev_data;
-   int devid;
-
-   devid = get_device_id(dev);
-   if (devid < 0)
-   return;
 
-   dev_data = search_dev_data(devid);
+   dev_data = dev_iommu_priv_get(dev);
if (!dev_data)
return;
 
@@ -2173,16 +2168,12 @@ static void amd_iommu_probe_finalize(struct device *dev)
 
 static void amd_iommu_release_device(struct device *dev)
 {
+   int devid = get_device_id(dev);
struct amd_iommu *iommu;
-   int devid;
 
if (!check_device(dev))
return;
 
-   devid = get_device_id(dev);
-   if (devid < 0)
-   return;
-
iommu = amd_iommu_rlookup_table[devid];
 
amd_iommu_uninit_device(dev);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 01/10] iommu/amd: Move AMD IOMMU driver to a subdirectory

2020-05-27 Thread Joerg Roedel
From: Joerg Roedel 

The driver consists of five C files and three header files by now.
Move them to their own subdirectory to not clutter to iommu top-level
directory with them.

Signed-off-by: Joerg Roedel 
---
 MAINTAINERS  | 2 +-
 drivers/iommu/Makefile   | 6 +++---
 drivers/iommu/{ => amd}/amd_iommu.h  | 0
 drivers/iommu/{ => amd}/amd_iommu_proto.h| 0
 drivers/iommu/{ => amd}/amd_iommu_types.h| 0
 drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} | 0
 drivers/iommu/{amd_iommu_init.c => amd/init.c}   | 2 +-
 drivers/iommu/{amd_iommu.c => amd/iommu.c}   | 2 +-
 drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} | 0
 drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c}   | 0
 10 files changed, 6 insertions(+), 6 deletions(-)
 rename drivers/iommu/{ => amd}/amd_iommu.h (100%)
 rename drivers/iommu/{ => amd}/amd_iommu_proto.h (100%)
 rename drivers/iommu/{ => amd}/amd_iommu_types.h (100%)
 rename drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} (100%)
 rename drivers/iommu/{amd_iommu_init.c => amd/init.c} (99%)
 rename drivers/iommu/{amd_iommu.c => amd/iommu.c} (99%)
 rename drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} (100%)
 rename drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c} (100%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 50659d76976b..dd59ec6676d9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -876,7 +876,7 @@ M:  Joerg Roedel 
 L: iommu@lists.linux-foundation.org
 S: Maintained
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
-F: drivers/iommu/amd_iommu*.[ch]
+F: drivers/iommu/amd/
 F: include/linux/amd-iommu.h
 
 AMD KFD
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 57cf4ba5e27c..3af7e374b0cb 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -11,9 +11,9 @@ obj-$(CONFIG_IOASID) += ioasid.o
 obj-$(CONFIG_IOMMU_IOVA) += iova.o
 obj-$(CONFIG_OF_IOMMU) += of_iommu.o
 obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
-obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o amd_iommu_quirks.o
-obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o
-obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
+obj-$(CONFIG_AMD_IOMMU) += amd/iommu.o amd/init.o amd/quirks.o
+obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd/debugfs.o
+obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o
 obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
 arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
diff --git a/drivers/iommu/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
similarity index 100%
rename from drivers/iommu/amd_iommu.h
rename to drivers/iommu/amd/amd_iommu.h
diff --git a/drivers/iommu/amd_iommu_proto.h 
b/drivers/iommu/amd/amd_iommu_proto.h
similarity index 100%
rename from drivers/iommu/amd_iommu_proto.h
rename to drivers/iommu/amd/amd_iommu_proto.h
diff --git a/drivers/iommu/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
similarity index 100%
rename from drivers/iommu/amd_iommu_types.h
rename to drivers/iommu/amd/amd_iommu_types.h
diff --git a/drivers/iommu/amd_iommu_debugfs.c b/drivers/iommu/amd/debugfs.c
similarity index 100%
rename from drivers/iommu/amd_iommu_debugfs.c
rename to drivers/iommu/amd/debugfs.c
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd/init.c
similarity index 99%
rename from drivers/iommu/amd_iommu_init.c
rename to drivers/iommu/amd/init.c
index 5b81fd16f5fa..fda80fd1d9a6 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd/init.c
@@ -35,7 +35,7 @@
 #include "amd_iommu.h"
 #include "amd_iommu_proto.h"
 #include "amd_iommu_types.h"
-#include "irq_remapping.h"
+#include "../irq_remapping.h"
 
 /*
  * definitions for the ACPI scanning code
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd/iommu.c
similarity index 99%
rename from drivers/iommu/amd_iommu.c
rename to drivers/iommu/amd/iommu.c
index 1b36c40d0712..39155f550f18 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -45,7 +45,7 @@
 
 #include "amd_iommu_proto.h"
 #include "amd_iommu_types.h"
-#include "irq_remapping.h"
+#include "../irq_remapping.h"
 
 #define CMD_SET_TYPE(cmd, t) ((cmd)->data[1] |= ((t) << 28))
 
diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
similarity index 100%
rename from drivers/iommu/amd_iommu_v2.c
rename to drivers/iommu/amd/iommu_v2.c
diff --git a/drivers/iommu/amd_iommu_quirks.c b/drivers/iommu/amd/quirks.c
similarity index 100%
rename from drivers/iommu/amd_iommu_quirks.c
rename to drivers/iommu/amd/quirks.c
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 03/10] iommu/amd: Let free_pagetable() not rely on domain->pt_root

2020-05-27 Thread Joerg Roedel
From: Joerg Roedel 

Use 'struct domain_pgtable' instead to free_pagetable(). This solves
the problem that amd_iommu_domain_direct_map() needs to restore
domain->pt_root after the device table has been updated just to make
free_pagetable release the domain page-table.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd/iommu.c | 36 
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 8368f6b9c17f..c7e47a7f0d45 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1391,20 +1391,19 @@ static struct page *free_sub_pt(unsigned long root, int 
mode,
return freelist;
 }
 
-static void free_pagetable(struct protection_domain *domain)
+static void free_pagetable(struct domain_pgtable *pgtable)
 {
-   struct domain_pgtable pgtable;
struct page *freelist = NULL;
unsigned long root;
 
-   amd_iommu_domain_get_pgtable(domain, );
-   atomic64_set(>pt_root, 0);
+   if (pgtable->mode == PAGE_MODE_NONE)
+   return;
 
-   BUG_ON(pgtable.mode < PAGE_MODE_NONE ||
-  pgtable.mode > PAGE_MODE_6_LEVEL);
+   BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
+  pgtable->mode > PAGE_MODE_6_LEVEL);
 
-   root = (unsigned long)pgtable.root;
-   freelist = free_sub_pt(root, pgtable.mode, freelist);
+   root = (unsigned long)pgtable->root;
+   freelist = free_sub_pt(root, pgtable->mode, freelist);
 
free_page_list(freelist);
 }
@@ -1823,12 +1822,16 @@ static void free_gcr3_table(struct protection_domain 
*domain)
  */
 static void dma_ops_domain_free(struct protection_domain *domain)
 {
+   struct domain_pgtable pgtable;
+
if (!domain)
return;
 
iommu_put_dma_cookie(>domain);
 
-   free_pagetable(domain);
+   amd_iommu_domain_get_pgtable(domain, );
+   atomic64_set(>pt_root, 0);
+   free_pagetable();
 
if (domain->id)
domain_id_free(domain->id);
@@ -2496,9 +2499,8 @@ static void amd_iommu_domain_free(struct iommu_domain 
*dom)
break;
default:
amd_iommu_domain_get_pgtable(domain, );
-
-   if (pgtable.mode != PAGE_MODE_NONE)
-   free_pagetable(domain);
+   atomic64_set(>pt_root, 0);
+   free_pagetable();
 
if (domain->flags & PD_IOMMUV2_MASK)
free_gcr3_table(domain);
@@ -2796,7 +2798,6 @@ void amd_iommu_domain_direct_map(struct iommu_domain *dom)
struct protection_domain *domain = to_pdomain(dom);
struct domain_pgtable pgtable;
unsigned long flags;
-   u64 pt_root;
 
spin_lock_irqsave(>lock, flags);
 
@@ -2804,18 +2805,13 @@ void amd_iommu_domain_direct_map(struct iommu_domain 
*dom)
amd_iommu_domain_get_pgtable(domain, );
 
/* Update data structure */
-   pt_root = amd_iommu_domain_encode_pgtable(NULL, PAGE_MODE_NONE);
-   atomic64_set(>pt_root, pt_root);
+   atomic64_set(>pt_root, 0);
 
/* Make changes visible to IOMMUs */
update_domain(domain);
 
-   /* Restore old pgtable in domain->ptroot to free page-table */
-   pt_root = amd_iommu_domain_encode_pgtable(pgtable.root, pgtable.mode);
-   atomic64_set(>pt_root, pt_root);
-
/* Page-table is not visible to IOMMU anymore, so free it */
-   free_pagetable(domain);
+   free_pagetable();
 
spin_unlock_irqrestore(>lock, flags);
 }
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 09/10] iommu/amd: Store dev_data as device iommu private data

2020-05-27 Thread Joerg Roedel
From: Joerg Roedel 

Do not use dev->archdata.iommu anymore and switch to using the private
per-device pointer provided by the IOMMU core code.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd/iommu.c | 44 +++
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index e3fdc7a0e853..2504aa184837 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -279,11 +279,6 @@ static struct iommu_dev_data *find_dev_data(u16 devid)
return dev_data;
 }
 
-static struct iommu_dev_data *get_dev_data(struct device *dev)
-{
-   return dev->archdata.iommu;
-}
-
 /*
 * Find or create an IOMMU group for a acpihid device.
 */
@@ -334,7 +329,7 @@ static bool pdev_pri_erratum(struct pci_dev *pdev, u32 
erratum)
 {
struct iommu_dev_data *dev_data;
 
-   dev_data = get_dev_data(>dev);
+   dev_data = dev_iommu_priv_get(>dev);
 
return dev_data->errata & (1 << erratum) ? true : false;
 }
@@ -369,7 +364,7 @@ static int iommu_init_device(struct device *dev)
struct iommu_dev_data *dev_data;
int devid;
 
-   if (dev->archdata.iommu)
+   if (dev_iommu_priv_get(dev))
return 0;
 
devid = get_device_id(dev);
@@ -396,7 +391,7 @@ static int iommu_init_device(struct device *dev)
dev_data->iommu_v2 = iommu->is_iommu_v2;
}
 
-   dev->archdata.iommu = dev_data;
+   dev_iommu_priv_set(dev, dev_data);
 
return 0;
 }
@@ -431,6 +426,8 @@ static void amd_iommu_uninit_device(struct device *dev)
if (dev_data->domain)
detach_device(dev);
 
+   dev_iommu_priv_set(dev, NULL);
+
/*
 * We keep dev_data around for unplugged devices and reuse it when the
 * device is re-plugged - not doing so would introduce a ton of races.
@@ -493,7 +490,7 @@ static void amd_iommu_report_page_fault(u16 devid, u16 
domain_id,
pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid),
   devid & 0xff);
if (pdev)
-   dev_data = get_dev_data(>dev);
+   dev_data = dev_iommu_priv_get(>dev);
 
if (dev_data && __ratelimit(_data->rs)) {
pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x 
address=0x%llx flags=0x%04x]\n",
@@ -2033,7 +2030,7 @@ static int attach_device(struct device *dev,
 
spin_lock_irqsave(>lock, flags);
 
-   dev_data = get_dev_data(dev);
+   dev_data = dev_iommu_priv_get(dev);
 
spin_lock(_data->lock);
 
@@ -2097,7 +2094,7 @@ static void detach_device(struct device *dev)
struct iommu_dev_data *dev_data;
unsigned long flags;
 
-   dev_data = get_dev_data(dev);
+   dev_data = dev_iommu_priv_get(dev);
domain   = dev_data->domain;
 
spin_lock_irqsave(>lock, flags);
@@ -2146,7 +2143,7 @@ static struct iommu_device *amd_iommu_probe_device(struct 
device *dev)
 
iommu = amd_iommu_rlookup_table[devid];
 
-   if (get_dev_data(dev))
+   if (dev_iommu_priv_get(dev))
return >iommu;
 
ret = iommu_init_device(dev);
@@ -2435,7 +2432,7 @@ static void amd_iommu_domain_free(struct iommu_domain 
*dom)
 static void amd_iommu_detach_device(struct iommu_domain *dom,
struct device *dev)
 {
-   struct iommu_dev_data *dev_data = dev->archdata.iommu;
+   struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
struct amd_iommu *iommu;
int devid;
 
@@ -2473,7 +2470,7 @@ static int amd_iommu_attach_device(struct iommu_domain 
*dom,
if (!check_device(dev))
return -EINVAL;
 
-   dev_data = dev->archdata.iommu;
+   dev_data = dev_iommu_priv_get(dev);
dev_data->defer_attach = false;
 
iommu = amd_iommu_rlookup_table[dev_data->devid];
@@ -2632,7 +2629,7 @@ static void amd_iommu_get_resv_regions(struct device *dev,
 bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
  struct device *dev)
 {
-   struct iommu_dev_data *dev_data = dev->archdata.iommu;
+   struct iommu_dev_data *dev_data = dev_iommu_priv_get(dev);
 
return dev_data->defer_attach;
 }
@@ -2659,7 +2656,7 @@ static int amd_iommu_def_domain_type(struct device *dev)
 {
struct iommu_dev_data *dev_data;
 
-   dev_data = get_dev_data(dev);
+   dev_data = dev_iommu_priv_get(dev);
if (!dev_data)
return 0;
 
@@ -2992,7 +2989,7 @@ int amd_iommu_complete_ppr(struct pci_dev *pdev, int 
pasid,
struct amd_iommu *iommu;
struct iommu_cmd cmd;
 
-   dev_data = get_dev_data(>dev);
+   dev_data = dev_iommu_priv_get(>dev);
iommu= amd_iommu_rlookup_table[dev_data->devid];
 
build_complete_ppr(, dev_data->devid, pasid, status,
@@ -3005,16 +3002,19 @@ EXPORT_SYMBOL(amd_iommu_complete_ppr);
 struct 

[PATCH 06/10] iommu/amd: Consolidate domain allocation/freeing

2020-05-27 Thread Joerg Roedel
From: Joerg Roedel 

Merge the allocation code paths of DMA and UNMANAGED domains and
remove code duplication.

Signed-off-by: Joerg Roedel 
---
 drivers/iommu/amd/iommu.c | 116 +-
 1 file changed, 27 insertions(+), 89 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 5282ff6b8ea0..9e0737932e0c 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -101,7 +101,6 @@ struct iommu_cmd {
 struct kmem_cache *amd_iommu_irq_cache;
 
 static void update_domain(struct protection_domain *domain);
-static int protection_domain_init(struct protection_domain *domain, int mode);
 static void detach_device(struct device *dev);
 static void update_and_flush_device_table(struct protection_domain *domain,
  struct domain_pgtable *pgtable);
@@ -1818,58 +1817,6 @@ static void free_gcr3_table(struct protection_domain 
*domain)
free_page((unsigned long)domain->gcr3_tbl);
 }
 
-/*
- * Free a domain, only used if something went wrong in the
- * allocation path and we need to free an already allocated page table
- */
-static void dma_ops_domain_free(struct protection_domain *domain)
-{
-   struct domain_pgtable pgtable;
-
-   if (!domain)
-   return;
-
-   iommu_put_dma_cookie(>domain);
-
-   amd_iommu_domain_get_pgtable(domain, );
-   atomic64_set(>pt_root, 0);
-   free_pagetable();
-
-   if (domain->id)
-   domain_id_free(domain->id);
-
-   kfree(domain);
-}
-
-/*
- * Allocates a new protection domain usable for the dma_ops functions.
- * It also initializes the page table and the address allocator data
- * structures required for the dma_ops interface
- */
-static struct protection_domain *dma_ops_domain_alloc(void)
-{
-   struct protection_domain *domain;
-
-   domain = kzalloc(sizeof(struct protection_domain), GFP_KERNEL);
-   if (!domain)
-   return NULL;
-
-   if (protection_domain_init(domain, DEFAULT_PGTABLE_LEVEL))
-   goto free_domain;
-
-   domain->flags = PD_DMA_OPS_MASK;
-
-   if (iommu_get_dma_cookie(>domain) == -ENOMEM)
-   goto free_domain;
-
-   return domain;
-
-free_domain:
-   dma_ops_domain_free(domain);
-
-   return NULL;
-}
-
 /*
  * little helper function to check whether a given protection domain is a
  * dma_ops domain
@@ -2447,36 +2394,32 @@ static struct protection_domain 
*protection_domain_alloc(int mode)
 
 static struct iommu_domain *amd_iommu_domain_alloc(unsigned type)
 {
-   struct protection_domain *pdomain;
-
-   switch (type) {
-   case IOMMU_DOMAIN_UNMANAGED:
-   pdomain = protection_domain_alloc(DEFAULT_PGTABLE_LEVEL);
-   if (!pdomain)
-   return NULL;
+   struct protection_domain *domain;
+   int mode = DEFAULT_PGTABLE_LEVEL;
 
-   pdomain->domain.geometry.aperture_start = 0;
-   pdomain->domain.geometry.aperture_end   = ~0ULL;
-   pdomain->domain.geometry.force_aperture = true;
+   if (type == IOMMU_DOMAIN_IDENTITY)
+   mode = PAGE_MODE_NONE;
 
-   break;
-   case IOMMU_DOMAIN_DMA:
-   pdomain = dma_ops_domain_alloc();
-   if (!pdomain) {
-   pr_err("Failed to allocate\n");
-   return NULL;
-   }
-   break;
-   case IOMMU_DOMAIN_IDENTITY:
-   pdomain = protection_domain_alloc(PAGE_MODE_NONE);
-   if (!pdomain)
-   return NULL;
-   break;
-   default:
+   domain = protection_domain_alloc(mode);
+   if (!domain)
return NULL;
+
+   domain->domain.geometry.aperture_start = 0;
+   domain->domain.geometry.aperture_end   = ~0ULL;
+   domain->domain.geometry.force_aperture = true;
+
+   if (type == IOMMU_DOMAIN_DMA) {
+   if (iommu_get_dma_cookie(>domain) == -ENOMEM)
+   goto free_domain;
+   domain->flags = PD_DMA_OPS_MASK;
}
 
-   return >domain;
+   return >domain;
+
+free_domain:
+   protection_domain_free(domain);
+
+   return NULL;
 }
 
 static void amd_iommu_domain_free(struct iommu_domain *dom)
@@ -2493,18 +2436,13 @@ static void amd_iommu_domain_free(struct iommu_domain 
*dom)
if (!dom)
return;
 
-   switch (dom->type) {
-   case IOMMU_DOMAIN_DMA:
-   /* Now release the domain */
-   dma_ops_domain_free(domain);
-   break;
-   default:
-   if (domain->flags & PD_IOMMUV2_MASK)
-   free_gcr3_table(domain);
+   if (dom->type == IOMMU_DOMAIN_DMA)
+   iommu_put_dma_cookie(>domain);
 
-   protection_domain_free(domain);
-   break;
-   }
+   if (domain->flags & PD_IOMMUV2_MASK)
+   

Re: [RFC 0/2] iommu: arm-smmu: Add support for early direct mappings

2020-05-27 Thread Will Deacon
Hi John, Bjorn,

On Tue, May 26, 2020 at 01:34:45PM -0700, John Stultz wrote:
> On Thu, May 14, 2020 at 12:34 PM  wrote:
> >
> > On Thu 27 Feb 18:57 PST 2020, Bjorn Andersson wrote:
> >
> > Rob, Will, we're reaching the point where upstream has enough
> > functionality that this is becoming a critical issue for us.
> >
> > E.g. Lenovo Yoga C630 is lacking this and a single dts patch to boot
> > mainline with display, GPU, WiFi and audio working and the story is
> > similar on several devboards.
> >
> > As previously described, the only thing I want is the stream mapping
> > related to the display controller in place, either with the CB with
> > translation disabled or possibly with a way to specify the framebuffer
> > region (although this turns out to mess things up in the display
> > driver...)
> >
> > I did pick this up again recently and concluded that by omitting the
> > streams for the USB controllers causes an instability issue seen on one
> > of the controller to disappear. So I would prefer if we somehow could
> > have a mechanism to only pick the display streams and the context
> > allocation for this.
> >
> >
> > Can you please share some pointers/insights/wishes for how we can
> > conclude on this subject?
> 
> Ping? I just wanted to follow up on this discussion as this small
> series is crucial for booting mainline on the Dragonboard 845c
> devboard. It would be really valuable to be able to get some solution
> upstream so we can test mainline w/o adding additional patches.

Sorry, it's been insanely busy recently and I haven't had a chance to think
about this on top of everything else. We're also carrying a hack in Android
for you :)

> The rest of the db845c series has been moving forward smoothly, but
> this set seems to be very stuck with no visible progress since Dec.
> 
> Are there any pointers for what folks would prefer to see?

I've had a chat with Robin about this. Originally, I was hoping that
people would all work together towards an idyllic future where firmware
would be able to describe arbitrary pre-existing mappings for devices,
irrespective of the IOMMU through which they master and Linux could
inherit this configuration. However, that hasn't materialised (there was
supposed to be an IORT update, but I don't know what happened to that)
and, in actual fact, the problem that you have on db845 is /far/ more
restricted than the general problem.

Could you please try hacking something along the following lines and see
how you get on? You may need my for-joerg/arm-smmu/updates branch for
all the pieces:

  1. Use the ->cfg_probe() callback to reserve the SMR/S2CRs you need
 "pinning" and configure for bypass.

  2. Use the ->def_domain_type() callback to return IOMMU_DOMAIN_IDENTITY
 for the display controller

I /think/ that's sufficient, but note that it differs from the current
approach because we don't end up reserving a CB -- bypass is configured
in the S2CR instead. Some invalidation might therefore be needed in
->cfg_probe() after unhooking the CB.

Thanks, and please yell if you run into problems with this approach.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/2] iommu: Remove iommu_sva_ops::mm_exit()

2020-05-27 Thread Jean-Philippe Brucker
Hi Joerg,

I was wondering if you could take these two patches for v5.8. The API
change is a precursor for the SVA support in SMMUv3, and the VT-d
implementation of the SVA API (queued for 5.8) doesn't implement
iommu_sva_ops.

Thanks,
Jean

On Thu, Apr 23, 2020 at 02:53:27PM +0200, Jean-Philippe Brucker wrote:
> The IOMMU SVA API currently requires device drivers to implement an
> mm_exit() callback, which stops device jobs that do DMA. This function
> is called in the release() MMU notifier, when an address space that is
> shared with a device exits.
> 
> It has been noted several time during discussions about SVA that
> cancelling DMA jobs can be slow and complex, and doing it in the
> release() notifier might cause synchronization issues. Device drivers
> must in any case call unbind() to remove their bond, after stopping DMA
> from a more favorable context (release of a file descriptor).
> 
> Patch 1 removes the mm_exit() callback from the uacce module, and patch
> 2 removes it from the IOMMU API. Since v1 [1] I fixed the uacce unbind
> reported by Zhangfei and added details in the commit message of patch 2.

[1] 
https://lore.kernel.org/linux-iommu/20200408140427.212807-1-jean-phili...@linaro.org/

> Jean-Philippe Brucker (2):
>   uacce: Remove mm_exit() op
>   iommu: Remove iommu_sva_ops::mm_exit()
> 
>  include/linux/iommu.h  |  30 ---
>  include/linux/uacce.h  |  34 ++--
>  drivers/iommu/iommu.c  |  11 ---
>  drivers/misc/uacce/uacce.c | 172 +
>  4 files changed, 51 insertions(+), 196 deletions(-)
> 
> -- 
> 2.26.0
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

2020-05-27 Thread Arnd Bergmann
On Wed, May 27, 2020 at 11:00 AM Greg Kroah-Hartman
 wrote:
>
> On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote:
> > Some platform devices appear as PCI but are actually on the AMBA bus,
>
> Why would these devices not just show up on the AMBA bus and use all of
> that logic instead of being a PCI device and having to go through odd
> fixes like this?

There is a general move to having hardware be discoverable even with
ARM processors. Having on-chip devices be discoverable using PCI config
space is how x86 SoCs usually do it, and that is generally a good thing
as it means we don't need to describe them in DT

I guess as the hardware designers are still learning about it, this is not
always done correctly. In general, we can also describe PCI devices on
DT and do fixups during the probing there, but I suspect that won't work
as easily using ACPI probing, so the fixup is keyed off the hardware ID,
again as is common for x86 on-chip devices.

  Arnd
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/4] PCI, iommu: Factor 'untrusted' check for ATS enablement

2020-05-27 Thread Jean-Philippe Brucker
Hi Joerg, Bjorn,

On Wed, May 20, 2020 at 05:21:59PM +0200, Jean-Philippe Brucker wrote:
> IOMMU drivers currently check themselves if a device is untrusted
> (plugged into an external-facing port) before enabling ATS. Move the
> check to drivers/pci. The only functional change should be to the AMD
> IOMMU driver. With this change all IOMMU drivers block 'Translated' PCIe
> transactions and Translation Requests from untrusted devices.

This seems ready for v5.8. I guess it could go through the IOMMU tree
since there are a little more IOMMU changes?

Thanks,
Jean
> 
> Since v1 [1] I added tags, addressed comments on patches 1 and 3, and
> fixed a regression in patch 3.
> 
> [1] 
> https://lore.kernel.org/linux-iommu/20200515104359.1178606-1-jean-phili...@linaro.org/
> 
> Jean-Philippe Brucker (4):
>   PCI/ATS: Only enable ATS for trusted devices
>   iommu/amd: Use pci_ats_supported()
>   iommu/arm-smmu-v3: Use pci_ats_supported()
>   iommu/vt-d: Use pci_ats_supported()
> 
>  include/linux/pci-ats.h |  3 +++
>  drivers/iommu/amd_iommu.c   | 12 
>  drivers/iommu/arm-smmu-v3.c | 20 +++-
>  drivers/iommu/intel-iommu.c |  9 +++--
>  drivers/pci/ats.c   | 18 +-
>  5 files changed, 34 insertions(+), 28 deletions(-)
> 
> -- 
> 2.26.2
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC] Use SMMU HTTU for DMA dirty page tracking

2020-05-27 Thread Jean-Philippe Brucker
On Wed, May 27, 2020 at 08:40:47AM +, Tian, Kevin wrote:
> > From: Xiang Zheng 
> > Sent: Wednesday, May 27, 2020 2:45 PM
> > 
> > 
> > On 2020/5/27 11:27, Tian, Kevin wrote:
> > >> From: Xiang Zheng
> > >> Sent: Monday, May 25, 2020 7:34 PM
> > >>
> > >> [+cc Kirti, Yan, Alex]
> > >>
> > >> On 2020/5/23 1:14, Jean-Philippe Brucker wrote:
> > >>> Hi,
> > >>>
> > >>> On Tue, May 19, 2020 at 05:42:55PM +0800, Xiang Zheng wrote:
> >  Hi all,
> > 
> >  Is there any plan for enabling SMMU HTTU?
> > >>>
> > >>> Not outside of SVA, as far as I know.
> > >>>
> > >>
> >  I have seen the patch locates in the SVA series patch, which adds
> >  support for HTTU:
> >  https://www.spinics.net/lists/arm-kernel/msg798694.html
> > 
> >  HTTU reduces the number of access faults on SMMU fault queue
> >  (permission faults also benifit from it).
> > 
> >  Besides reducing the faults, HTTU also helps to track dirty pages for
> >  device DMA. Is it feasible to utilize HTTU to get dirty pages on device
> >  DMA during VFIO live migration?
> > >>>
> > >>> As you know there is a VFIO interface for this under discussion:
> > >>> https://lore.kernel.org/kvm/1589781397-28368-1-git-send-email-
> > >> kwankh...@nvidia.com/
> > >>> It doesn't implement an internal API to communicate with the IOMMU
> > >> driver
> > >>> about dirty pages.
> > >
> > > We plan to add such API later, e.g. to utilize A/D bit in VT-d 2nd-level
> > > page tables (Rev 3.0).
> > >
> > 
> > Thank you, Kevin.
> > 
> > When will you send this series patches? Maybe(Hope) we can also support
> > hardware-based dirty pages tracking via common APIs based on your
> > patches. :)
> 
> Yan is working with Kirti on basic live migration support now. After that
> part is done, we will start working on A/D bit support. Yes, common APIs
> are definitely the goal here.
> 
> > 
> > >>
> > >>>
> >  If SMMU can track dirty pages, devices are not required to implement
> >  additional dirty pages tracking to support VFIO live migration.
> > >>>
> > >>> It seems feasible, though tracking it in the device might be more
> > >>> efficient. I might have misunderstood but I think for live migration of
> > >>> the Intel NIC they trap guest accesses to the device and introspect its
> > >>> state to figure out which pages it is accessing.
> > >
> > > Does HTTU implement A/D-like mechanism in SMMU page tables, or just
> > > report dirty pages in a log buffer? Either way tracking dirty pages in 
> > > IOMMU
> > > side is generic thus doesn't require device-specific tweak like in Intel 
> > > NIC.
> > >
> > 
> > Currently HTTU just implement A/D-like mechanism in SMMU page tables.
> > We certainly
> > expect SMMU can also implement PML-like feature so that we can avoid
> > walking the
> > whole page table to get the dirty pages.

There is no reporting of dirty pages in log buffer. It might be possible
to do software logging based on PRI or Stall, but that requires special
support in the endpoint as well as the SMMU.

> Is there a link to HTTU introduction?

I don't know any gentle introduction, but there are sections D5.4.11
"Hardware management of the Access flag and dirty state" in the ARM
Architecture Reference Manual (DDI0487E), and section 3.13 "Translation
table entries and Access/Dirty flags" in the SMMU specification
(IHI0070C). HTTU stands for "Hardware Translation Table Update".

In short, when HTTU is enabled, the SMMU translation performs an atomic
read-modify-write on the leaf translation table descriptor, setting some
bits depending on the type of memory access. This can be enabled
independently on both stage-1 and stage-2 tables (equivalent to your 1st
and 2nd page tables levels, I think).

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC 0/2] iommu: arm-smmu: Add support for early direct mappings

2020-05-27 Thread Laurentiu Tudor


On 5/26/2020 11:34 PM, John Stultz wrote:
> On Thu, May 14, 2020 at 12:34 PM  wrote:
>>
>> On Thu 27 Feb 18:57 PST 2020, Bjorn Andersson wrote:
>>
>> Rob, Will, we're reaching the point where upstream has enough
>> functionality that this is becoming a critical issue for us.
>>
>> E.g. Lenovo Yoga C630 is lacking this and a single dts patch to boot
>> mainline with display, GPU, WiFi and audio working and the story is
>> similar on several devboards.
>>
>> As previously described, the only thing I want is the stream mapping
>> related to the display controller in place, either with the CB with
>> translation disabled or possibly with a way to specify the framebuffer
>> region (although this turns out to mess things up in the display
>> driver...)
>>
>> I did pick this up again recently and concluded that by omitting the
>> streams for the USB controllers causes an instability issue seen on one
>> of the controller to disappear. So I would prefer if we somehow could
>> have a mechanism to only pick the display streams and the context
>> allocation for this.
>>
>>
>> Can you please share some pointers/insights/wishes for how we can
>> conclude on this subject?
> 
> Ping? I just wanted to follow up on this discussion as this small
> series is crucial for booting mainline on the Dragonboard 845c
> devboard. It would be really valuable to be able to get some solution
> upstream so we can test mainline w/o adding additional patches.

+1

There are also some NXP chips that depend on this. Also, I've submitted
a v2 [1] a while back that tries to address the feedback on the initial
implementation.

[1] https://patchwork.ozlabs.org/project/linux-tegra/list/?series=164853

---
Best Regards, Laurentiu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3] dma: Fix max PFN arithmetic overflow on 32 bit systems

2020-05-27 Thread Greg KH
On Tue, May 26, 2020 at 07:57:49PM +0200, Alexander Dahl wrote:
> The intermediate result of the old term (4UL * 1024 * 1024 * 1024) is
> 4 294 967 296 or 0x1 which is no problem on 64 bit systems.  The
> patch does not change the later overall result of 0x10 for
> MAX_DMA32_PFN.  The new calculation yields the same result, but does not
> require 64 bit arithmetic.
> 
> On 32 bit systems the old calculation suffers from an arithmetic
> overflow in that intermediate term in braces: 4UL aka unsigned long int
> is 4 byte wide and an arithmetic overflow happens (the 0x1 does
> not fit in 4 bytes), the in braces result is truncated to zero, the
> following right shift does not alter that, so MAX_DMA32_PFN evaluates to
> 0 on 32 bit systems.
> 
> That wrong value is a problem in a comparision against MAX_DMA32_PFN in
> the init code for swiotlb in 'pci_swiotlb_detect_4gb()' to decide if
> swiotlb should be active.  That comparison yields the opposite result,
> when compiling on 32 bit systems.
> 
> This was not possible before 1b7e03ef7570 ("x86, NUMA: Enable emulation
> on 32bit too") when that MAX_DMA32_PFN was first made visible to x86_32
> (and which landed in v3.0).
> 
> In practice this wasn't a problem, unless you activated CONFIG_SWIOTLB
> on x86 (32 bit).
> 
> However for ARCH=x86 (32 bit) and if you have set CONFIG_IOMMU_INTEL,
> since c5a5dc4cbbf4 ("iommu/vt-d: Don't switch off swiotlb if bounce page
> is used") there's a dependency on CONFIG_SWIOTLB, which was not
> necessarily active before.  That landed in v5.4, where we noticed it in
> the fli4l Linux distribution.  We have CONFIG_IOMMU_INTEL active on both
> 32 and 64 bit kernel configs there (I could not find out why, so let's
> just say historical reasons).
> 
> The effect is at boot time 64 MiB (default size) were allocated for
> bounce buffers now, which is a noticeable amount of memory on small
> systems like pcengines ALIX 2D3 with 256 MiB memory, which are still
> frequently used as home routers.
> 
> We noticed this effect when migrating from kernel v4.19 (LTS) to v5.4
> (LTS) in fli4l and got that kernel messages for example:
> 
>   Linux version 5.4.22 (buildroot@buildroot) (gcc version 7.3.0 (Buildroot 
> 2018.02.8)) #1 SMP Mon Nov 26 23:40:00 CET 2018
>   …
>   Memory: 183484K/261756K available (4594K kernel code, 393K rwdata, 1660K 
> rodata, 536K init, 456K bss , 78272K reserved, 0K cma-reserved, 0K highmem)
>   …
>   PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
>   software IO TLB: mapped [mem 0x0bb78000-0x0fb78000] (64MB)
> 
> The initial analysis and the suggested fix was done by user 'sourcejedi'
> at stackoverflow and explicitly marked as GPLv2 for inclusion in the
> Linux kernel:
> 
>   https://unix.stackexchange.com/a/520525/50007
> 
> The new calculation, which does not suffer from that overflow, is the
> same as for arch/mips now as suggested by Robin Murphy.
> 
> The fix was tested by fli4l users on round about two dozen different
> systems, including both 32 and 64 bit archs, bare metal and virtualized
> machines.
> 
> Fixes: 1b7e03ef7570 ("x86, NUMA: Enable emulation on 32bit too")
> Fixes: https://web.nettworks.org/bugs/browse/FFL-2560
> Fixes: https://unix.stackexchange.com/q/520065/50007
> Reported-by: Alan Jenkins 
> Suggested-by: Robin Murphy 
> Signed-off-by: Alexander Dahl 
> Cc: sta...@vger.kernel.org

Reviewed-by: Greg Kroah-Hartman 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/2] PCI: Introduce PCI_FIXUP_IOMMU

2020-05-27 Thread Greg Kroah-Hartman
On Tue, May 26, 2020 at 11:09:57PM +0800, Zhangfei Gao wrote:
> Hi, Christoph
> 
> On 2020/5/26 下午10:46, Christoph Hellwig wrote:
> > On Tue, May 26, 2020 at 07:49:08PM +0800, Zhangfei Gao wrote:
> > > Some platform devices appear as PCI but are actually on the AMBA bus,
> > > and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
> > > Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
> > > is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
> > > down iommu probing as all devices in fixup final list will be
> > > reprocessed.
> > Who is going to use this?  I don't see a single user in the series.
> We will add iommu fixup in drivers/pci/quirks.c, handling
> 
> fwspec->can_stall, which is introduced in
> 
> https://www.spinics.net/lists/linux-pci/msg94559.html
> 
> Unfortunately, the patch does not catch v5.8, so we have to wait.
> And we want to check whether this is a right method to solve this issue.

We can't take new apis without a real user, so please submit them all at
once.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/2] iommu: calling pci_fixup_iommu in iommu_fwspec_init

2020-05-27 Thread Greg Kroah-Hartman
On Tue, May 26, 2020 at 07:49:09PM +0800, Zhangfei Gao wrote:
> Calling pci_fixup_iommu in iommu_fwspec_init, which alloc
> iommu_fwnode. Some platform devices appear as PCI but are
> actually on the AMBA bus, and they need fixup in
> drivers/pci/quirks.c handling iommu_fwnode.
> So calling pci_fixup_iommu after iommu_fwnode is allocated.
> 
> Signed-off-by: Zhangfei Gao 
> ---
>  drivers/iommu/iommu.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 7b37542..fb84c42 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct 
> fwnode_handle *iommu_fwnode,
>   fwspec->iommu_fwnode = iommu_fwnode;
>   fwspec->ops = ops;
>   dev_iommu_fwspec_set(dev, fwspec);
> +
> + if (dev_is_pci(dev))
> + pci_fixup_device(pci_fixup_iommu, to_pci_dev(dev));

Why can't the caller do this as it "knows" it is a PCI device at that
point in time, right?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

2020-05-27 Thread Greg Kroah-Hartman
On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote:
> Some platform devices appear as PCI but are actually on the AMBA bus,

Why would these devices not just show up on the AMBA bus and use all of
that logic instead of being a PCI device and having to go through odd
fixes like this?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 18/24] iommu/arm-smmu-v3: Add support for Hardware Translation Table Update

2020-05-27 Thread Jean-Philippe Brucker
On Wed, May 27, 2020 at 11:00:29AM +0800, Xiang Zheng wrote:
> Hi Jean,
> 
> This patch only enables HTTU bits in CDs. Is it also neccessary to enable
> HTTU bits in STEs in this patch?

Only if you need HTTU for stage-2 tables. This series is only about
sharing stage-1 page tables, for which HTTU is enabled in the CD. I'll add
a statement in the commit message.

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC] Use SMMU HTTU for DMA dirty page tracking

2020-05-27 Thread Tian, Kevin
> From: Xiang Zheng 
> Sent: Wednesday, May 27, 2020 2:45 PM
> 
> 
> On 2020/5/27 11:27, Tian, Kevin wrote:
> >> From: Xiang Zheng
> >> Sent: Monday, May 25, 2020 7:34 PM
> >>
> >> [+cc Kirti, Yan, Alex]
> >>
> >> On 2020/5/23 1:14, Jean-Philippe Brucker wrote:
> >>> Hi,
> >>>
> >>> On Tue, May 19, 2020 at 05:42:55PM +0800, Xiang Zheng wrote:
>  Hi all,
> 
>  Is there any plan for enabling SMMU HTTU?
> >>>
> >>> Not outside of SVA, as far as I know.
> >>>
> >>
>  I have seen the patch locates in the SVA series patch, which adds
>  support for HTTU:
>  https://www.spinics.net/lists/arm-kernel/msg798694.html
> 
>  HTTU reduces the number of access faults on SMMU fault queue
>  (permission faults also benifit from it).
> 
>  Besides reducing the faults, HTTU also helps to track dirty pages for
>  device DMA. Is it feasible to utilize HTTU to get dirty pages on device
>  DMA during VFIO live migration?
> >>>
> >>> As you know there is a VFIO interface for this under discussion:
> >>> https://lore.kernel.org/kvm/1589781397-28368-1-git-send-email-
> >> kwankh...@nvidia.com/
> >>> It doesn't implement an internal API to communicate with the IOMMU
> >> driver
> >>> about dirty pages.
> >
> > We plan to add such API later, e.g. to utilize A/D bit in VT-d 2nd-level
> > page tables (Rev 3.0).
> >
> 
> Thank you, Kevin.
> 
> When will you send this series patches? Maybe(Hope) we can also support
> hardware-based dirty pages tracking via common APIs based on your
> patches. :)

Yan is working with Kirti on basic live migration support now. After that
part is done, we will start working on A/D bit support. Yes, common APIs
are definitely the goal here.

> 
> >>
> >>>
>  If SMMU can track dirty pages, devices are not required to implement
>  additional dirty pages tracking to support VFIO live migration.
> >>>
> >>> It seems feasible, though tracking it in the device might be more
> >>> efficient. I might have misunderstood but I think for live migration of
> >>> the Intel NIC they trap guest accesses to the device and introspect its
> >>> state to figure out which pages it is accessing.
> >
> > Does HTTU implement A/D-like mechanism in SMMU page tables, or just
> > report dirty pages in a log buffer? Either way tracking dirty pages in IOMMU
> > side is generic thus doesn't require device-specific tweak like in Intel 
> > NIC.
> >
> 
> Currently HTTU just implement A/D-like mechanism in SMMU page tables.
> We certainly
> expect SMMU can also implement PML-like feature so that we can avoid
> walking the
> whole page table to get the dirty pages.

Is there a link to HTTU introduction?

> 
> By the way, I'm not sure whether HTTU or SLAD can help for mediated deivce.
> 

A/D bit applies to mediated device on VT-d.

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC] Use SMMU HTTU for DMA dirty page tracking

2020-05-27 Thread Xiang Zheng


On 2020/5/27 11:27, Tian, Kevin wrote:
>> From: Xiang Zheng
>> Sent: Monday, May 25, 2020 7:34 PM
>>
>> [+cc Kirti, Yan, Alex]
>>
>> On 2020/5/23 1:14, Jean-Philippe Brucker wrote:
>>> Hi,
>>>
>>> On Tue, May 19, 2020 at 05:42:55PM +0800, Xiang Zheng wrote:
 Hi all,

 Is there any plan for enabling SMMU HTTU?
>>>
>>> Not outside of SVA, as far as I know.
>>>
>>
 I have seen the patch locates in the SVA series patch, which adds
 support for HTTU:
 https://www.spinics.net/lists/arm-kernel/msg798694.html

 HTTU reduces the number of access faults on SMMU fault queue
 (permission faults also benifit from it).

 Besides reducing the faults, HTTU also helps to track dirty pages for
 device DMA. Is it feasible to utilize HTTU to get dirty pages on device
 DMA during VFIO live migration?
>>>
>>> As you know there is a VFIO interface for this under discussion:
>>> https://lore.kernel.org/kvm/1589781397-28368-1-git-send-email-
>> kwankh...@nvidia.com/
>>> It doesn't implement an internal API to communicate with the IOMMU
>> driver
>>> about dirty pages.
> 
> We plan to add such API later, e.g. to utilize A/D bit in VT-d 2nd-level 
> page tables (Rev 3.0). 
> 

Thank you, Kevin.

When will you send this series patches? Maybe(Hope) we can also support
hardware-based dirty pages tracking via common APIs based on your patches. :)

>>
>>>
 If SMMU can track dirty pages, devices are not required to implement
 additional dirty pages tracking to support VFIO live migration.
>>>
>>> It seems feasible, though tracking it in the device might be more
>>> efficient. I might have misunderstood but I think for live migration of
>>> the Intel NIC they trap guest accesses to the device and introspect its
>>> state to figure out which pages it is accessing.
> 
> Does HTTU implement A/D-like mechanism in SMMU page tables, or just
> report dirty pages in a log buffer? Either way tracking dirty pages in IOMMU
> side is generic thus doesn't require device-specific tweak like in Intel NIC.
> 

Currently HTTU just implement A/D-like mechanism in SMMU page tables. We 
certainly
expect SMMU can also implement PML-like feature so that we can avoid walking the
whole page table to get the dirty pages.

By the way, I'm not sure whether HTTU or SLAD can help for mediated deivce.

-- 
Thanks,
Xiang

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/iova: Free global iova rcache on iova alloc failure

2020-05-27 Thread vjitta
From: Vijayanand Jitta 

When ever an iova alloc request fails we free the iova
ranges present in the percpu iova rcaches and then retry
but the global iova rcache is not freed as a result we
could still see iova alloc failure even after retry as
global rcache is still holding the iova's which can cause
fragmentation. So, free the global iova rcache as well
and then go for the retry.

Signed-off-by: Vijayanand Jitta 
---
 drivers/iommu/iova.c | 22 ++
 include/linux/iova.h |  6 ++
 2 files changed, 28 insertions(+)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 0e6a953..5ae0328 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -431,6 +431,7 @@ struct iova *find_iova(struct iova_domain *iovad, unsigned 
long pfn)
flush_rcache = false;
for_each_online_cpu(cpu)
free_cpu_cached_iovas(cpu, iovad);
+   free_global_cached_iovas(iovad);
goto retry;
}
 
@@ -1044,5 +1045,26 @@ void free_cpu_cached_iovas(unsigned int cpu, struct 
iova_domain *iovad)
}
 }
 
+/*
+ * free all the IOVA ranges of global cache
+ */
+void free_global_cached_iovas(struct iova_domain *iovad)
+{
+   struct iova_rcache *rcache;
+   int i, j;
+
+   for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+   rcache = >rcaches[i];
+   spin_lock(>lock);
+   for (j = 0; j < rcache->depot_size; ++j) {
+   iova_magazine_free_pfns(rcache->depot[j], iovad);
+   iova_magazine_free(rcache->depot[j]);
+   rcache->depot[j] = NULL;
+   }
+   rcache->depot_size = 0;
+   spin_unlock(>lock);
+   }
+}
+
 MODULE_AUTHOR("Anil S Keshavamurthy ");
 MODULE_LICENSE("GPL");
diff --git a/include/linux/iova.h b/include/linux/iova.h
index a0637ab..a905726 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -163,6 +163,7 @@ int init_iova_flush_queue(struct iova_domain *iovad,
 struct iova *split_and_remove_iova(struct iova_domain *iovad,
struct iova *iova, unsigned long pfn_lo, unsigned long pfn_hi);
 void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
+void free_global_cached_iovas(struct iova_domain *iovad);
 #else
 static inline int iova_cache_get(void)
 {
@@ -270,6 +271,11 @@ static inline void free_cpu_cached_iovas(unsigned int cpu,
 struct iova_domain *iovad)
 {
 }
+
+static inline void free_global_cached_iovas(struct iova_domain *iovad)
+{
+}
+
 #endif
 
 #endif
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu