Re: [PATCH v11 1/9] iommu: Introduce a callback to struct iommu_resv_region

2022-04-22 Thread Christoph Hellwig
Looks good:

Reviewed-by: Christoph Hellwig 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 1/9] iommu: Introduce a callback to struct iommu_resv_region

2022-04-22 Thread Christoph Hellwig
On Sat, Apr 23, 2022 at 10:04:39AM +0800, Lu Baolu wrote:
> The generic_iommu_put_resv_regions() is itself a callback. Why bother
> adding another callback from the same iommu driver inside it? Or are you
> going to remove put_resv_regions from the iommu ops?

It is a driver method, but these reserved entries are not actually
allocated by the driver.  And I do have a patch pending that removes this
driver method, which should never have been a driver method; check
the iommu list archives for

iommu: remove the put_resv_regions method



Re: [PATCH v11 1/9] iommu: Introduce a callback to struct iommu_resv_region

2022-04-22 Thread Lu Baolu

On 2022/4/23 00:28, Shameer Kolothum via iommu wrote:

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f2c45b85b9fc..ffcfa684e80c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2597,16 +2597,22 @@ void iommu_put_resv_regions(struct device *dev, struct list_head *list)
  * @list: reserved region list for device
  *
  * IOMMU drivers can use this to implement their .put_resv_regions() callback
- * for simple reservations. Memory allocated for each reserved region will be
- * freed. If an IOMMU driver allocates additional resources per region, it is
- * going to have to implement a custom callback.
+ * for simple reservations. If a per region callback is provided that will be
+ * used to free all memory allocations associated with the reserved region or
+ * else just free up the memory for the regions. If an IOMMU driver allocates
+ * additional resources per region, it is going to have to implement a custom
+ * callback.
  */
 void generic_iommu_put_resv_regions(struct device *dev, struct list_head *list)
 {
 	struct iommu_resv_region *entry, *next;
 
-	list_for_each_entry_safe(entry, next, list, list)
-		kfree(entry);
+	list_for_each_entry_safe(entry, next, list, list) {
+		if (entry->free)
+			entry->free(dev, entry);
+		else
+			kfree(entry);
+	}
 }
 EXPORT_SYMBOL(generic_iommu_put_resv_regions);


The generic_iommu_put_resv_regions() is itself a callback. Why bother
adding another callback from the same iommu driver inside it? Or are you
going to remove put_resv_regions from the iommu ops?

Best regards,
baolu
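
For readers following the thread, the per-region callback idea in the quoted
hunk can be tried out in plain userspace C. What follows is only a sketch
under stated assumptions: struct resv_region, region_alloc() and
put_resv_regions() are hypothetical stand-ins rather than kernel API, a
singly linked list replaces list_head, and the callback drops the dev
argument that the patch passes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct iommu_resv_region. */
struct resv_region {
	struct resv_region *next;
	void *priv;				/* extra per-region allocation, may be NULL */
	void (*free)(struct resv_region *r);	/* optional per-region destructor */
};

static int custom_free_calls;	/* counts destructor calls, for illustration only */

/* Destructor for regions that own an extra allocation. */
static void region_free_with_priv(struct resv_region *r)
{
	custom_free_calls++;
	free(r->priv);
	free(r);
}

/* Allocate a region and push it in front of next. */
static struct resv_region *region_alloc(struct resv_region *next, int with_priv)
{
	struct resv_region *r = calloc(1, sizeof(*r));

	assert(r);
	r->next = next;
	if (with_priv) {
		r->priv = malloc(16);
		assert(r->priv);
		r->free = region_free_with_priv;
	}
	return r;
}

/*
 * Same shape as the patched loop: use the entry's own callback when it has
 * one, otherwise fall back to a plain free(). Returns the number of regions
 * released.
 */
static int put_resv_regions(struct resv_region *head)
{
	int released = 0;

	while (head) {
		struct resv_region *next = head->next;	/* read before freeing */

		if (head->free)
			head->free(head);
		else
			free(head);
		head = next;
		released++;
	}
	return released;
}
```

The point of the design is that whoever allocates a region also decides how
it is released, so the generic loop does not need to know about any extra
per-region resources.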


[PATCH] iommu/mediatek: fix NULL pointer dereference when printing dev_name

2022-04-22 Thread Miles Chen via iommu
When larbdev is NULL (in the case I hit, the node was incorrectly set to
iommus = < NUM>), device_link_add() fails and the kernel crashes when we
try to print dev_name(larbdev).

Fix it by adding a NULL pointer check before
device_link_add/device_link_remove.

This keeps correctly configured device trees working and avoids the crash
caused by my incorrect setting.
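
The guard described above can be illustrated outside the kernel. This is a
minimal sketch, not the driver code: struct device, fake_device_link_add(),
fake_dev_name() and link_to_larb() are hypothetical userspace stand-ins, and
the can_link field exists only so the failure path can be exercised.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's struct device API. */
struct device {
	const char *name;
	int can_link;		/* 0 makes the fake link call fail */
};

struct device_link {
	struct device *consumer;
	struct device *supplier;
};

static struct device_link fake_link;	/* one static link is enough for a sketch */

/* Stand-in for device_link_add(): returns NULL on failure. */
static struct device_link *fake_device_link_add(struct device *consumer,
						struct device *supplier)
{
	if (!consumer || !supplier || !supplier->can_link)
		return NULL;
	fake_link.consumer = consumer;
	fake_link.supplier = supplier;
	return &fake_link;
}

/* Stand-in for dev_name(): it must not be handed a NULL device. */
static const char *fake_dev_name(const struct device *dev)
{
	assert(dev);
	return dev->name;
}

/*
 * Returns 1 if a link was created, 0 otherwise. *failed_name is set to the
 * supplier's name only when linking was attempted and failed.
 */
static int link_to_larb(struct device *dev, struct device *larbdev,
			const char **failed_name)
{
	*failed_name = NULL;
	if (!larbdev)
		return 0;	/* nothing to link; the name lookup is never reached */

	if (!fake_device_link_add(dev, larbdev)) {
		*failed_name = fake_dev_name(larbdev);	/* larbdev is non-NULL here */
		return 0;
	}
	return 1;
}
```

With the check in front, the error path that dereferences larbdev for its
name can only run when larbdev is valid.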

Error log:
[   18.189042][  T301] Unable to handle kernel NULL pointer dereference at virtual address 0050
[   18.190247][  T301] Mem abort info:
[   18.190255][  T301]   ESR = 0x9605
[   18.190263][  T301]   EC = 0x25: DABT (current EL), IL = 32 bits
[   18.192142][  T301]   SET = 0, FnV = 0
[   18.192151][  T301]   EA = 0, S1PTW = 0
[   18.194710][  T301]   FSC = 0x05: level 1 translation fault
[   18.195424][  T301] Data abort info:
[   18.195888][  T301]   ISV = 0, ISS = 0x0005
[   18.196500][  T301]   CM = 0, WnR = 0
[   18.196977][  T301] user pgtable: 4k pages, 39-bit VAs, pgdp=000104f9e000
[   18.197889][  T301] [0050] pgd=, p4d=, pud=
[   18.199220][  T301] Internal error: Oops: 9605 [#1] PREEMPT SMP
[   18.343152][  T301] Kernel Offset: 0x144408 from 0xffc00800
[   18.343988][  T301] PHYS_OFFSET: 0x4000
[   18.344519][  T301] pstate: a045 (NzCv daif +PAN -UAO)
[   18.345213][  T301] pc : mtk_iommu_probe_device+0xf8/0x118 [mtk_iommu]
[   18.346050][  T301] lr : mtk_iommu_probe_device+0xd0/0x118 [mtk_iommu]
[   18.346884][  T301] sp : ffc00a5635e0
[   18.347392][  T301] x29: ffc00a5635e0 x28: ffd44a46c1d8
[   18.348156][  T301] x27: ff80c39a8000 x26: ffd44a80cc38
[   18.348917][  T301] x25:  x24: ffd44a80cc38
[   18.349677][  T301] x23: ffd44e4da4c6 x22: ffd44a80cc38
[   18.350438][  T301] x21: ff80cecd1880 x20: 
[   18.351198][  T301] x19: ff80c439f010 x18: ffc00a50d0c0
[   18.351959][  T301] x17:  x16: 0004
[   18.352719][  T301] x15: 0004 x14: ffd44eb5d420
[   18.353480][  T301] x13: 0ad2 x12: 0003
[   18.354241][  T301] x11: fad2 x10: c000fad2
[   18.355003][  T301] x9 : a0d288d8d7142d00 x8 : a0d288d8d7142d00
[   18.355763][  T301] x7 : ffd44c2bc640 x6 : 
[   18.356524][  T301] x5 : 0080 x4 : 0001
[   18.357284][  T301] x3 :  x2 : 0005
[   18.358045][  T301] x1 :  x0 : 
[   18.360208][  T301] Hardware name: MT6873 (DT)
[   18.360771][  T301] Call trace:
[   18.361168][  T301]  dump_backtrace+0xf8/0x1f0
[   18.361737][  T301]  dump_stack_lvl+0xa8/0x11c
[   18.362305][  T301]  dump_stack+0x1c/0x2c
[   18.362816][  T301]  mrdump_common_die+0x184/0x40c [mrdump]
[   18.363575][  T301]  ipanic_die+0x24/0x38 [mrdump]
[   18.364230][  T301]  atomic_notifier_call_chain+0x128/0x2b8
[   18.364937][  T301]  die+0x16c/0x568
[   18.365394][  T301]  __do_kernel_fault+0x1e8/0x214
[   18.365402][  T301]  do_page_fault+0xb8/0x678
[   18.366934][  T301]  do_translation_fault+0x48/0x64
[   18.368645][  T301]  do_mem_abort+0x68/0x148
[   18.368652][  T301]  el1_abort+0x40/0x64
[   18.368660][  T301]  el1h_64_sync_handler+0x54/0x88
[   18.368668][  T301]  el1h_64_sync+0x68/0x6c
[   18.368673][  T301]  mtk_iommu_probe_device+0xf8/0x118 [mtk_iommu]
[   18.369840][  T301]  __iommu_probe_device+0x12c/0x358
[   18.370880][  T301]  iommu_probe_device+0x3c/0x31c
[   18.372026][  T301]  of_iommu_configure+0x200/0x274
[   18.373587][  T301]  of_dma_configure_id+0x1b8/0x230
[   18.375200][  T301]  platform_dma_configure+0x24/0x3c
[   18.376456][  T301]  really_probe+0x110/0x504
[   18.376464][  T301]  __driver_probe_device+0xb4/0x188
[   18.376472][  T301]  driver_probe_device+0x5c/0x2b8
[   18.376481][  T301]  __driver_attach+0x338/0x42c
[   18.377992][  T301]  bus_add_driver+0x218/0x4c8
[   18.379389][  T301]  driver_register+0x84/0x17c
[   18.380580][  T301]  __platform_driver_register+0x28/0x38
...

Fixes: 635319a4a744 ("media: iommu/mediatek: Add device_link between the consumer and the larb devices")
Signed-off-by: Miles Chen 
---
 drivers/iommu/mtk_iommu.c| 16 ++--
 drivers/iommu/mtk_iommu_v1.c | 16 ++--
 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 6fd75a60abd6..1405502118ca 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -581,10 +581,12 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
 		}
 	}
 	larbdev = data->larb_imu[larbid].dev;
-	link = device_link_add(dev, larbdev,
-			       DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
-	if (!link)
-		dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
+	if (larbdev) {
+		link = device_link_add(dev,

Re: [PATCH] Documentation: x86: rework IOMMU documentation

2022-04-22 Thread Robin Murphy

On 2022-04-22 21:06, Alex Deucher wrote:

Add preliminary documentation for AMD IOMMU and combine
with the existing Intel IOMMU documentation and clean
up and modernize some of the existing documentation to
align with the current state of the kernel.


FWIW,

Reviewed-by: Robin Murphy 


Signed-off-by: Alex Deucher 
---

V2: Incorporate feedback from Robin to clarify IOMMU vs DMA engine (e.g.,
 a device) and document proper DMA API.  Also correct the fact that
 the AMD IOMMU is not limited to managing PCI devices.
v3: Fix spelling and rework text as suggested by Vasant
v4: Combine Intel and AMD documents into a single document as suggested
 by Dave Hansen
v5: Clarify that keywords are related to ACPI, grammatical fixes
v6: Make more stuff common based on feedback from Robin

  Documentation/x86/index.rst   |   2 +-
  Documentation/x86/intel-iommu.rst | 115 
  Documentation/x86/iommu.rst   | 143 ++
  3 files changed, 144 insertions(+), 116 deletions(-)
  delete mode 100644 Documentation/x86/intel-iommu.rst
  create mode 100644 Documentation/x86/iommu.rst

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index f498f1d36cd3..6f8409fe0674 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -21,7 +21,7 @@ x86-specific Documentation
 tlb
 mtrr
 pat
-   intel-iommu
+   iommu
 intel_txt
 amd-memory-encryption
 pti
diff --git a/Documentation/x86/intel-iommu.rst b/Documentation/x86/intel-iommu.rst
deleted file mode 100644
index 099f13d51d5f..
--- a/Documentation/x86/intel-iommu.rst
+++ /dev/null
@@ -1,115 +0,0 @@
-===
-Linux IOMMU Support
-===
-
-The architecture spec can be obtained from the below location.
-
-http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
-
-This guide gives a quick cheat sheet for some basic understanding.
-
-Some Keywords
-
-- DMAR - DMA remapping
-- DRHD - DMA Remapping Hardware Unit Definition
-- RMRR - Reserved memory Region Reporting Structure
-- ZLR  - Zero length reads from PCI devices
-- IOVA - IO Virtual address.
-
-Basic stuff

-
-ACPI enumerates and lists the different DMA engines in the platform, and
-device scope relationships between PCI devices and which DMA engine  controls
-them.
-
-What is RMRR?
--
-
-There are some devices the BIOS controls, for e.g USB devices to perform
-PS2 emulation. The regions of memory used for these devices are marked
-reserved in the e820 map. When we turn on DMA translation, DMA to those
-regions will fail. Hence BIOS uses RMRR to specify these regions along with
-devices that need to access these regions. OS is expected to setup
-unity mappings for these regions for these devices to access these regions.
-
-How is IOVA generated?
---
-
-Well behaved drivers call pci_map_*() calls before sending command to device
-that needs to perform DMA. Once DMA is completed and mapping is no longer
-required, device performs a pci_unmap_*() calls to unmap the region.
-
-The Intel IOMMU driver allocates a virtual address per domain. Each PCIE
-device has its own domain (hence protection). Devices under p2p bridges
-share the virtual address with all devices under the p2p bridge due to
-transaction id aliasing for p2p bridges.
-
-IOVA generation is pretty generic. We used the same technique as vmalloc()
-but these are not global address spaces, but separate for each domain.
-Different DMA engines may support different number of domains.
-
-We also allocate guard pages with each mapping, so we can attempt to catch
-any overflow that might happen.
-
-
-Graphics Problems?
---
-If you encounter issues with graphics devices, you can try adding
-option intel_iommu=igfx_off to turn off the integrated graphics engine.
-If this fixes anything, please ensure you file a bug reporting the problem.
-
-Some exceptions to IOVA

-Interrupt ranges are not address translated, (0xfee0 - 0xfeef).
-The same is true for peer to peer transactions. Hence we reserve the
-address from PCI MMIO ranges so they are not allocated for IOVA addresses.
-
-
-Fault reporting

-When errors are reported, the DMA engine signals via an interrupt. The fault
-reason and device that caused it with fault reason is printed on console.
-
-See below for sample.
-
-
-Boot Message Sample

-
-Something like this gets printed indicating presence of DMAR tables
-in ACPI.
-
-ACPI: DMAR (v001 A M I  OEMDMAR  0x0001 MSFT 0x0097) @ 
0x7f5b5ef0
-
-When DMAR is being processed and initialized by ACPI, prints DMAR locations
-and any RMRR's processed::
-
-   ACPI DMAR:Host address width 36
-   ACPI DMAR:DRHD (flags: 0x)base: 0xfed9
-   ACPI DMAR:DRHD (flags: 0x)base: 0xfed91000
-   

Re: fully convert arm to use dma-direct

2022-04-22 Thread Linus Walleij
On Thu, Apr 21, 2022 at 9:42 AM Christoph Hellwig  wrote:

> arm is the last platform not using the dma-direct code for directly
> mapped DMA.  With the dmabounce removal from Arnd we can easily switch
> arm to always use dma-direct now (it already does for LPAE configs
> and nommu).  I'd love to merge this series through the dma-mapping tree
> as it gives us the opportunity for additional core dma-mapping
> improvements.
(...)

>  b/arch/arm/mach-footbridge/Kconfig   |1
>  b/arch/arm/mach-footbridge/common.c  |   19
>  b/arch/arm/mach-footbridge/include/mach/dma-direct.h |8
>  b/arch/arm/mach-footbridge/include/mach/memory.h |4

I think Marc Z has a Netwinder that he can test this on. Marc?
I have one too, I'm just not in my office much because of parental leave.

Yours,
Linus Walleij


[PATCH] Documentation: x86: rework IOMMU documentation

2022-04-22 Thread Alex Deucher via iommu
Add preliminary documentation for AMD IOMMU and combine
with the existing Intel IOMMU documentation and clean
up and modernize some of the existing documentation to
align with the current state of the kernel.

Signed-off-by: Alex Deucher 
---

V2: Incorporate feedback from Robin to clarify IOMMU vs DMA engine (e.g.,
a device) and document proper DMA API.  Also correct the fact that
the AMD IOMMU is not limited to managing PCI devices.
v3: Fix spelling and rework text as suggested by Vasant
v4: Combine Intel and AMD documents into a single document as suggested
by Dave Hansen
v5: Clarify that keywords are related to ACPI, grammatical fixes
v6: Make more stuff common based on feedback from Robin

 Documentation/x86/index.rst   |   2 +-
 Documentation/x86/intel-iommu.rst | 115 
 Documentation/x86/iommu.rst   | 143 ++
 3 files changed, 144 insertions(+), 116 deletions(-)
 delete mode 100644 Documentation/x86/intel-iommu.rst
 create mode 100644 Documentation/x86/iommu.rst

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index f498f1d36cd3..6f8409fe0674 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -21,7 +21,7 @@ x86-specific Documentation
tlb
mtrr
pat
-   intel-iommu
+   iommu
intel_txt
amd-memory-encryption
pti
diff --git a/Documentation/x86/intel-iommu.rst b/Documentation/x86/intel-iommu.rst
deleted file mode 100644
index 099f13d51d5f..
--- a/Documentation/x86/intel-iommu.rst
+++ /dev/null
@@ -1,115 +0,0 @@
-===
-Linux IOMMU Support
-===
-
-The architecture spec can be obtained from the below location.
-
-http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
-
-This guide gives a quick cheat sheet for some basic understanding.
-
-Some Keywords
-
-- DMAR - DMA remapping
-- DRHD - DMA Remapping Hardware Unit Definition
-- RMRR - Reserved memory Region Reporting Structure
-- ZLR  - Zero length reads from PCI devices
-- IOVA - IO Virtual address.
-
-Basic stuff

-
-ACPI enumerates and lists the different DMA engines in the platform, and
-device scope relationships between PCI devices and which DMA engine  controls
-them.
-
-What is RMRR?
--
-
-There are some devices the BIOS controls, for e.g USB devices to perform
-PS2 emulation. The regions of memory used for these devices are marked
-reserved in the e820 map. When we turn on DMA translation, DMA to those
-regions will fail. Hence BIOS uses RMRR to specify these regions along with
-devices that need to access these regions. OS is expected to setup
-unity mappings for these regions for these devices to access these regions.
-
-How is IOVA generated?
---
-
-Well behaved drivers call pci_map_*() calls before sending command to device
-that needs to perform DMA. Once DMA is completed and mapping is no longer
-required, device performs a pci_unmap_*() calls to unmap the region.
-
-The Intel IOMMU driver allocates a virtual address per domain. Each PCIE
-device has its own domain (hence protection). Devices under p2p bridges
-share the virtual address with all devices under the p2p bridge due to
-transaction id aliasing for p2p bridges.
-
-IOVA generation is pretty generic. We used the same technique as vmalloc()
-but these are not global address spaces, but separate for each domain.
-Different DMA engines may support different number of domains.
-
-We also allocate guard pages with each mapping, so we can attempt to catch
-any overflow that might happen.
-
-
-Graphics Problems?
---
-If you encounter issues with graphics devices, you can try adding
-option intel_iommu=igfx_off to turn off the integrated graphics engine.
-If this fixes anything, please ensure you file a bug reporting the problem.
-
-Some exceptions to IOVA

-Interrupt ranges are not address translated, (0xfee0 - 0xfeef).
-The same is true for peer to peer transactions. Hence we reserve the
-address from PCI MMIO ranges so they are not allocated for IOVA addresses.
-
-
-Fault reporting

-When errors are reported, the DMA engine signals via an interrupt. The fault
-reason and device that caused it with fault reason is printed on console.
-
-See below for sample.
-
-
-Boot Message Sample

-
-Something like this gets printed indicating presence of DMAR tables
-in ACPI.
-
-ACPI: DMAR (v001 A M I  OEMDMAR  0x0001 MSFT 0x0097) @ 
0x7f5b5ef0
-
-When DMAR is being processed and initialized by ACPI, prints DMAR locations
-and any RMRR's processed::
-
-   ACPI DMAR:Host address width 36
-   ACPI DMAR:DRHD (flags: 0x)base: 0xfed9
-   ACPI DMAR:DRHD (flags: 0x)base: 0xfed91000
-   ACPI DMAR:DRHD (flags: 0x0001)base: 0xfed93000
-   ACPI DMAR:RMRR base: 

RE: [PATCH v4] Documentation: x86: rework IOMMU documentation

2022-04-22 Thread Deucher, Alexander via iommu
[Public]

> -----Original Message-----
> From: Robin Murphy 
> Sent: Friday, April 22, 2022 3:41 PM
> To: Deucher, Alexander ; linux-
> d...@vger.kernel.org; linux-ker...@vger.kernel.org; cor...@lwn.net;
> h...@zytor.com; x...@kernel.org; dave.han...@linux.intel.com;
> b...@alien8.de; mi...@redhat.com; t...@linutronix.de; j...@8bytes.org;
> Suthikulpanit, Suravee ; w...@kernel.org;
> iommu@lists.linux-foundation.org; Hegde, Vasant 
> Subject: Re: [PATCH v4] Documentation: x86: rework IOMMU documentation
> 
> On 2022-04-22 18:54, Alex Deucher wrote:
> [...]
> > +Intel Specific Notes
> > +
> > +
> > +Graphics Problems?
> > +^^
> > +
> > +If you encounter issues with graphics devices, you can try adding
> > +option intel_iommu=igfx_off to turn off the integrated graphics engine.
> > +If this fixes anything, please ensure you file a bug reporting the problem.
> > +
> > +Some exceptions to IOVA
> > +^^^
> > +
> > +Interrupt ranges are not address translated, (0xfee0 - 0xfeef).
> > +The same is true for peer to peer transactions. Hence we reserve the
> > +address from PCI MMIO ranges so they are not allocated for IOVA
> addresses.
> 
> Note that this should be true for both drivers.
> 
> > +
> > +AMD Specific Notes
> > +--
> > +
> > +Graphics Problems?
> > +^^
> > +
> > +If you encounter issues with integrated graphics devices, you can try
> > +adding option iommu=pt to the kernel command line use a 1:1 mapping
> > +for the IOMMU.  If this fixes anything, please ensure you file a bug
> reporting the problem.
> 
> And indeed this is a generic option. I reckon we could simply merge these two
> sections together, with the first paragraph being something like:
> 
> If you encounter issues with integrated graphics devices, you can try adding
> the option "iommu.passthrough=1", or the equivalent "iommu=pt", to the
> kernel command line to use a 1:1 mapping for the IOMMU in general.  On
> Intel you can also try "intel_iommu=igfx_off" to turn off translation
> specifically for the integrated graphics engine only.  If this fixes
> anything, please ensure you file a bug reporting the problem.
> 
> > +
> > +Fault reporting
> > +---
> > +When errors are reported, the IOMMU signals via an interrupt. The
> > +fault reason and device that caused it is printed on the console.
> > +
> > +
> > +Kernel Log Samples
> > +--
> > +
> > +Intel Boot Messages
> > +^^^
> > +
> > +Something like this gets printed indicating presence of DMAR tables
> > +in ACPI.
> > +
> > +::
> > +
> > +   ACPI: DMAR (v001 A M I  OEMDMAR  0x0001 MSFT
> 0x0097) @
> > +0x7f5b5ef0
> > +
> > +When DMAR is being processed and initialized by ACPI, prints DMAR
> > +locations and any RMRR's processed
> > +
> > +::
> > +
> > +   ACPI DMAR:Host address width 36
> > +   ACPI DMAR:DRHD (flags: 0x)base: 0xfed9
> > +   ACPI DMAR:DRHD (flags: 0x)base: 0xfed91000
> > +   ACPI DMAR:DRHD (flags: 0x0001)base: 0xfed93000
> > +   ACPI DMAR:RMRR base: 0x000ed000 end:
> 0x000e
> > +   ACPI DMAR:RMRR base: 0x7f60 end:
> 0x7fff
> > +
> > +When DMAR is enabled for use, you will notice
> > +
> > +::
> > +
> > +   PCI-DMA: Using DMAR IOMMU
> > +
> > +Intel Fault reporting
> > +^
> > +
> > +::
> > +
> > +   DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
> > +   DMAR:[fault reason 05] PTE Write access is not set
> > +   DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
> > +   DMAR:[fault reason 05] PTE Write access is not set
> > +
> > +AMD Boot Messages
> > +^
> > +
> > +Something like this gets printed indicating presence of the IOMMU.
> > +
> > +::
> > +
> > +   iommu: Default domain type: Translated
> > +   iommu: DMA domain TLB invalidation policy: lazy mode
> 
> Similarly, that's common IOMMU API reporting which will be seen on all
> architectures (let alone IOMMU drivers). Maybe some of the messages from
> print_iommu_info() might be better AMD-specific examples?
> 

All good points.  I've integrated these suggestions and will send out a
new version.

Thanks!

Alex

> Cheers,
> Robin.
> 
> > +
> > +AMD Fault reporting
> > +^^^
> > +
> > +::
> > +
> > +   AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007
> address=0xc02000 flags=0x]
> > +   AMD-Vi: Event logged [IO_PAGE_FAULT device=07:00.0
> domain=0x0007
> > +address=0xc02000 flags=0x]


Re: [PATCH v4] Documentation: x86: rework IOMMU documentation

2022-04-22 Thread Robin Murphy

On 2022-04-22 18:54, Alex Deucher wrote:
[...]

+Intel Specific Notes
+
+
+Graphics Problems?
+^^
+
+If you encounter issues with graphics devices, you can try adding
+option intel_iommu=igfx_off to turn off the integrated graphics engine.
+If this fixes anything, please ensure you file a bug reporting the problem.
+
+Some exceptions to IOVA
+^^^
+
+Interrupt ranges are not address translated, (0xfee0 - 0xfeef).
+The same is true for peer to peer transactions. Hence we reserve the
+address from PCI MMIO ranges so they are not allocated for IOVA addresses.


Note that this should be true for both drivers.


+
+AMD Specific Notes
+--
+
+Graphics Problems?
+^^
+
+If you encounter issues with integrated graphics devices, you can try adding
+option iommu=pt to the kernel command line use a 1:1 mapping for the IOMMU.  If
+this fixes anything, please ensure you file a bug reporting the problem.


And indeed this is a generic option. I reckon we could simply merge 
these two sections together, with the first paragraph being something like:


If you encounter issues with integrated graphics devices, you can try 
adding the option "iommu.passthrough=1", or the equivalent "iommu=pt", 
to the kernel command line to use a 1:1 mapping for the IOMMU in 
general.  On Intel you can also try "intel_iommu=igfx_off" to turn off 
translation specifically for the integrated graphics engine only.  If 
this fixes anything, please ensure you file a bug reporting the problem.
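
When following up on a bug report, it can help to confirm which of these
options a system was actually booted with. A rough userspace sketch,
assuming the command line has already been read into a string (for example
from /proc/cmdline); cmdline_has_option() and passthrough_requested() are
hypothetical helper names, and only exact space-separated tokens are
matched, so combined forms of the iommu= option are not handled.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if the space-separated command line contains exactly the token opt. */
static int cmdline_has_option(const char *cmdline, const char *opt)
{
	size_t len = strlen(opt);
	const char *p = cmdline;

	assert(len > 0);
	while ((p = strstr(p, opt)) != NULL) {
		/* The match must start and end on a token boundary. */
		int starts = (p == cmdline) || p[-1] == ' ';
		int ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return 1;
		p++;
	}
	return 0;
}

/* Either spelling named in the text above requests a 1:1 mapping. */
static int passthrough_requested(const char *cmdline)
{
	return cmdline_has_option(cmdline, "iommu=pt") ||
	       cmdline_has_option(cmdline, "iommu.passthrough=1");
}
```

The boundary check matters because a plain substring search would also
match "iommu=pt" inside a longer token that merely ends the same way.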



+
+Fault reporting
+---
+When errors are reported, the IOMMU signals via an interrupt. The fault
+reason and device that caused it is printed on the console.
+
+
+Kernel Log Samples
+--
+
+Intel Boot Messages
+^^^
+
+Something like this gets printed indicating presence of DMAR tables
+in ACPI.
+
+::
+
+   ACPI: DMAR (v001 A M I  OEMDMAR  0x0001 MSFT 0x0097) @ 
0x7f5b5ef0
+
+When DMAR is being processed and initialized by ACPI, prints DMAR locations
+and any RMRR's processed
+
+::
+
+   ACPI DMAR:Host address width 36
+   ACPI DMAR:DRHD (flags: 0x)base: 0xfed9
+   ACPI DMAR:DRHD (flags: 0x)base: 0xfed91000
+   ACPI DMAR:DRHD (flags: 0x0001)base: 0xfed93000
+   ACPI DMAR:RMRR base: 0x000ed000 end: 0x000e
+   ACPI DMAR:RMRR base: 0x7f60 end: 0x7fff
+
+When DMAR is enabled for use, you will notice
+
+::
+
+   PCI-DMA: Using DMAR IOMMU
+
+Intel Fault reporting
+^
+
+::
+
+   DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
+   DMAR:[fault reason 05] PTE Write access is not set
+   DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
+   DMAR:[fault reason 05] PTE Write access is not set
+
+AMD Boot Messages
+^
+
+Something like this gets printed indicating presence of the IOMMU.
+
+::
+
+   iommu: Default domain type: Translated
+   iommu: DMA domain TLB invalidation policy: lazy mode


Similarly, that's common IOMMU API reporting which will be seen on all 
architectures (let alone IOMMU drivers). Maybe some of the messages from 
print_iommu_info() might be better AMD-specific examples?


Cheers,
Robin.


+
+AMD Fault reporting
+^^^
+
+::
+
+   AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007 address=0xc02000 
flags=0x]
+   AMD-Vi: Event logged [IO_PAGE_FAULT device=07:00.0 domain=0x0007 
address=0xc02000 flags=0x]



[PATCH v5] Documentation: x86: rework IOMMU documentation

2022-04-22 Thread Alex Deucher via iommu
Add preliminary documentation for AMD IOMMU and combine
with the existing Intel IOMMU documentation and clean
up and modernize some of the existing documentation to
align with the current state of the kernel.

Signed-off-by: Alex Deucher 
---

V2: Incorporate feedback from Robin to clarify IOMMU vs DMA engine (e.g.,
a device) and document proper DMA API.  Also correct the fact that
the AMD IOMMU is not limited to managing PCI devices.
v3: Fix spelling and rework text as suggested by Vasant
v4: Combine Intel and AMD documents into a single document as suggested
by Dave Hansen
v5: Flag keywords as ACPI related.  Some grammatical fixes.

 Documentation/x86/index.rst   |   2 +-
 Documentation/x86/intel-iommu.rst | 115 ---
 Documentation/x86/iommu.rst   | 151 ++
 3 files changed, 152 insertions(+), 116 deletions(-)
 delete mode 100644 Documentation/x86/intel-iommu.rst
 create mode 100644 Documentation/x86/iommu.rst

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index f498f1d36cd3..6f8409fe0674 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -21,7 +21,7 @@ x86-specific Documentation
tlb
mtrr
pat
-   intel-iommu
+   iommu
intel_txt
amd-memory-encryption
pti
diff --git a/Documentation/x86/intel-iommu.rst b/Documentation/x86/intel-iommu.rst
deleted file mode 100644
index 099f13d51d5f..
--- a/Documentation/x86/intel-iommu.rst
+++ /dev/null
@@ -1,115 +0,0 @@
-===
-Linux IOMMU Support
-===
-
-The architecture spec can be obtained from the below location.
-
-http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
-
-This guide gives a quick cheat sheet for some basic understanding.
-
-Some Keywords
-
-- DMAR - DMA remapping
-- DRHD - DMA Remapping Hardware Unit Definition
-- RMRR - Reserved memory Region Reporting Structure
-- ZLR  - Zero length reads from PCI devices
-- IOVA - IO Virtual address.
-
-Basic stuff

-
-ACPI enumerates and lists the different DMA engines in the platform, and
-device scope relationships between PCI devices and which DMA engine  controls
-them.
-
-What is RMRR?
--
-
-There are some devices the BIOS controls, for e.g USB devices to perform
-PS2 emulation. The regions of memory used for these devices are marked
-reserved in the e820 map. When we turn on DMA translation, DMA to those
-regions will fail. Hence BIOS uses RMRR to specify these regions along with
-devices that need to access these regions. OS is expected to setup
-unity mappings for these regions for these devices to access these regions.
-
-How is IOVA generated?
---
-
-Well behaved drivers call pci_map_*() calls before sending command to device
-that needs to perform DMA. Once DMA is completed and mapping is no longer
-required, device performs a pci_unmap_*() calls to unmap the region.
-
-The Intel IOMMU driver allocates a virtual address per domain. Each PCIE
-device has its own domain (hence protection). Devices under p2p bridges
-share the virtual address with all devices under the p2p bridge due to
-transaction id aliasing for p2p bridges.
-
-IOVA generation is pretty generic. We used the same technique as vmalloc()
-but these are not global address spaces, but separate for each domain.
-Different DMA engines may support different number of domains.
-
-We also allocate guard pages with each mapping, so we can attempt to catch
-any overflow that might happen.
-
-
-Graphics Problems?
---
-If you encounter issues with graphics devices, you can try adding
-option intel_iommu=igfx_off to turn off the integrated graphics engine.
-If this fixes anything, please ensure you file a bug reporting the problem.
-
-Some exceptions to IOVA

-Interrupt ranges are not address translated, (0xfee0 - 0xfeef).
-The same is true for peer to peer transactions. Hence we reserve the
-address from PCI MMIO ranges so they are not allocated for IOVA addresses.
-
-
-Fault reporting

-When errors are reported, the DMA engine signals via an interrupt. The fault
-reason and device that caused it with fault reason is printed on console.
-
-See below for sample.
-
-
-Boot Message Sample

-
-Something like this gets printed indicating presence of DMAR tables
-in ACPI.
-
-ACPI: DMAR (v001 A M I  OEMDMAR  0x0001 MSFT 0x0097) @ 
0x7f5b5ef0
-
-When DMAR is being processed and initialized by ACPI, prints DMAR locations
-and any RMRR's processed::
-
-   ACPI DMAR:Host address width 36
-   ACPI DMAR:DRHD (flags: 0x)base: 0xfed9
-   ACPI DMAR:DRHD (flags: 0x)base: 0xfed91000
-   ACPI DMAR:DRHD (flags: 0x0001)base: 0xfed93000
-   ACPI DMAR:RMRR base: 0x000ed000 end: 0x000e
-   ACPI DMAR:RMRR base: 

RE: [PATCH v4] Documentation: x86: rework IOMMU documentation

2022-04-22 Thread Deucher, Alexander via iommu
[Public]

> -----Original Message-----
> From: Deucher, Alexander 
> Sent: Friday, April 22, 2022 1:54 PM
> To: linux-...@vger.kernel.org; linux-ker...@vger.kernel.org;
> cor...@lwn.net; h...@zytor.com; x...@kernel.org;
> dave.han...@linux.intel.com; b...@alien8.de; mi...@redhat.com;
> t...@linutronix.de; j...@8bytes.org; Suthikulpanit, Suravee
> ; w...@kernel.org; iommu@lists.linux-
> foundation.org; robin.mur...@arm.com; Hegde, Vasant
> 
> Cc: Deucher, Alexander 
> Subject: [PATCH v4] Documentation: x86: rework IOMMU documentation
> 
> Add preliminary documentation for AMD IOMMU and combine with the
> existing Intel IOMMU documentation and clean up and modernize some of the
> existing documentation to align with the current state of the kernel.
> 
> Signed-off-by: Alex Deucher 
> ---
> 
> V2: Incorporate feedback from Robin to clarify IOMMU vs DMA engine (e.g.,
> a device) and document proper DMA API.  Also correct the fact that
> the AMD IOMMU is not limited to managing PCI devices.
> v3: Fix spelling and rework text as suggested by Vasant
> v4: Combine Intel and AMD documents into a single document as suggested
> by Dave Hansen
> 
>  Documentation/x86/index.rst       |   2 +-
>  Documentation/x86/intel-iommu.rst | 115 --
>  Documentation/x86/iommu.rst       | 153 ++
>  3 files changed, 154 insertions(+), 116 deletions(-)
>  delete mode 100644 Documentation/x86/intel-iommu.rst
>  create mode 100644 Documentation/x86/iommu.rst
> 
> diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
> index f498f1d36cd3..6f8409fe0674 100644
> --- a/Documentation/x86/index.rst
> +++ b/Documentation/x86/index.rst
> @@ -21,7 +21,7 @@ x86-specific Documentation
> tlb
> mtrr
> pat
> -   intel-iommu
> +   iommu
> intel_txt
> amd-memory-encryption
> pti
> diff --git a/Documentation/x86/intel-iommu.rst b/Documentation/x86/intel-iommu.rst
> deleted file mode 100644
> index 099f13d51d5f..
> --- a/Documentation/x86/intel-iommu.rst
> +++ /dev/null
> @@ -1,115 +0,0 @@
> -===
> -Linux IOMMU Support
> -===
> -
> -The architecture spec can be obtained from the below location.
> -
> -http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
> -
> -This guide gives a quick cheat sheet for some basic understanding.
> -
> -Some Keywords
> -
> -- DMAR - DMA remapping
> -- DRHD - DMA Remapping Hardware Unit Definition
> -- RMRR - Reserved memory Region Reporting Structure
> -- ZLR  - Zero length reads from PCI devices
> -- IOVA - IO Virtual address.
> -
> -Basic stuff
> 
> -
> -ACPI enumerates and lists the different DMA engines in the platform, and
> -device scope relationships between PCI devices and which DMA engine controls
> -them.
> -
> -What is RMRR?
> --
> -
> -There are some devices the BIOS controls, for e.g USB devices to perform
> -PS2 emulation. The regions of memory used for these devices are marked
> -reserved in the e820 map. When we turn on DMA translation, DMA to those
> -regions will fail. Hence BIOS uses RMRR to specify these regions along with
> -devices that need to access these regions. OS is expected to setup
> -unity mappings for these regions for these devices to access these regions.
> -
> -How is IOVA generated?
> ---
> -
> -Well behaved drivers call pci_map_*() calls before sending command to device
> -that needs to perform DMA. Once DMA is completed and mapping is no longer
> -required, device performs a pci_unmap_*() calls to unmap the region.
> -
> -The Intel IOMMU driver allocates a virtual address per domain. Each PCIE
> -device has its own domain (hence protection). Devices under p2p bridges
> -share the virtual address with all devices under the p2p bridge due to
> -transaction id aliasing for p2p bridges.
> -
> -IOVA generation is pretty generic. We used the same technique as vmalloc()
> -but these are not global address spaces, but separate for each domain.
> -Different DMA engines may support different number of domains.
> -
> -We also allocate guard pages with each mapping, so we can attempt to catch
> -any overflow that might happen.
> -
> -
> -Graphics Problems?
> ---
> -If you encounter issues with graphics devices, you can try adding
> -option intel_iommu=igfx_off to turn off the integrated graphics engine.
> -If this fixes anything, please ensure you file a bug reporting the problem.
> -
> -Some exceptions to IOVA
> 
> -Interrupt ranges are not address translated, (0xfee0 - 0xfeef).
> -The same is true for peer to peer transactions. Hence we reserve the
> -address from PCI MMIO ranges so they are not allocated for IOVA addresses.
> -
> -
> -Fault reporting
> 
> -When errors are reported, the DMA engine signals via an interrupt. The fault
> -reason and device that caused it with 

[PATCH] dt-bindings: iommu: Drop client node in examples

2022-04-22 Thread Rob Herring
There's no need to show consumer side in provider examples. The ones
used here are either undocumented or not documented in schemas, which
results in warnings.

Signed-off-by: Rob Herring 
---
 .../devicetree/bindings/iommu/mediatek,iommu.yaml  | 10 --
 .../devicetree/bindings/iommu/samsung,sysmmu.yaml  | 10 --
 2 files changed, 20 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
index 97e8c471a5e8..e0389539194f 100644
--- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
+++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
@@ -173,13 +173,3 @@ examples:
  <>, <>, <>;
 #iommu-cells = <1>;
 };
-
-  - |
-#include 
-
-/* Example for a client device */
-display {
-   compatible = "mediatek,mt8173-disp";
-   iommus = < M4U_PORT_DISP_OVL0>,
-< M4U_PORT_DISP_RDMA0>;
- };
diff --git a/Documentation/devicetree/bindings/iommu/samsung,sysmmu.yaml b/Documentation/devicetree/bindings/iommu/samsung,sysmmu.yaml
index 783c6b37c9f0..672a0beea600 100644
--- a/Documentation/devicetree/bindings/iommu/samsung,sysmmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/samsung,sysmmu.yaml
@@ -86,16 +86,6 @@ examples:
   - |
 #include 
 
-gsc_0: scaler@13e0 {
-  compatible = "samsung,exynos5-gsc";
-  reg = <0x13e0 0x1000>;
-  interrupts = <0 85 0>;
-  power-domains = <_gsc>;
-  clocks = < CLK_GSCL0>;
-  clock-names = "gscl";
-  iommus = <_gsc0>;
-};
-
 sysmmu_gsc0: iommu@13e8 {
   compatible = "samsung,exynos-sysmmu";
   reg = <0x13E8 0x1000>;
-- 
2.32.0



Re: [PATCH 02/13] iommu: Move bus setup to IOMMU device registration

2022-04-22 Thread Robin Murphy

On 2022-04-22 19:37, Krishna Reddy wrote:

Good effort to isolate bus config from smmu drivers.
Reviewed-By: Krishna Reddy 


Thanks!


I have an orthogonal question here.
Can the following code handle the case where different buses have different
types of SMMU instances (like one bus has SMMUv2 and another bus has SMMUv3)?
If it needs to handle the above case, can the smmu device bus be matched with
a specific bus here and ops set only for that bus?


Not yet, but that is one of the end goals that this is all working 
towards. I think the stuff that I've added to the dev branch[1] today 
should have reached the point where that becomes viable, but I'll need 
to rig up a system to test it next week.


Intermediate solutions aren't worth it because in practice you 
inevitably end up needing both IOMMU drivers to share the platform "bus" 
anyway.


Cheers,
Robin.

[1] https://gitlab.arm.com/linux-arm/linux-rm/-/commits/iommu/bus





+   for (int i = 0; i < ARRAY_SIZE(iommu_buses); i++) {
+   struct bus_type *bus = iommu_buses[i];
+   const struct iommu_ops *bus_ops = bus->iommu_ops;
+   int err;
+
+   WARN_ON(bus_ops && bus_ops != ops);
+   bus->iommu_ops = ops;
+   err = bus_iommu_probe(bus);
+   if (err) {
+   bus_for_each_dev(bus, NULL, iommu, remove_iommu_group);
+   bus->iommu_ops = bus_ops;
+   return err;
+   }
+   }



-KR



RE: [PATCH 02/13] iommu: Move bus setup to IOMMU device registration

2022-04-22 Thread Krishna Reddy via iommu
Good effort to isolate bus config from smmu drivers.
Reviewed-By: Krishna Reddy 

I have an orthogonal question here.
Can the following code handle the case where different buses have different
types of SMMU instances (like one bus has SMMUv2 and another bus has SMMUv3)?
If it needs to handle the above case, can the smmu device bus be matched with
a specific bus here and ops set only for that bus?


> +   for (int i = 0; i < ARRAY_SIZE(iommu_buses); i++) {
> +   struct bus_type *bus = iommu_buses[i];
> +   const struct iommu_ops *bus_ops = bus->iommu_ops;
> +   int err;
> +
> +   WARN_ON(bus_ops && bus_ops != ops);
> +   bus->iommu_ops = ops;
> +   err = bus_iommu_probe(bus);
> +   if (err) {
> +   bus_for_each_dev(bus, NULL, iommu, remove_iommu_group);
> +   bus->iommu_ops = bus_ops;
> +   return err;
> +   }
> +   }


-KR


[PATCH v4] Documentation: x86: rework IOMMU documentation

2022-04-22 Thread Alex Deucher via iommu
Add preliminary documentation for AMD IOMMU and combine
with the existing Intel IOMMU documentation and clean
up and modernize some of the existing documentation to
align with the current state of the kernel.

Signed-off-by: Alex Deucher 
---

V2: Incorporate feedback from Robin to clarify IOMMU vs DMA engine (e.g.,
a device) and document proper DMA API.  Also correct the fact that
the AMD IOMMU is not limited to managing PCI devices.
v3: Fix spelling and rework text as suggested by Vasant
v4: Combine Intel and AMD documents into a single document as suggested
by Dave Hansen

 Documentation/x86/index.rst   |   2 +-
 Documentation/x86/intel-iommu.rst | 115 --
 Documentation/x86/iommu.rst   | 153 ++
 3 files changed, 154 insertions(+), 116 deletions(-)
 delete mode 100644 Documentation/x86/intel-iommu.rst
 create mode 100644 Documentation/x86/iommu.rst

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index f498f1d36cd3..6f8409fe0674 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -21,7 +21,7 @@ x86-specific Documentation
tlb
mtrr
pat
-   intel-iommu
+   iommu
intel_txt
amd-memory-encryption
pti
diff --git a/Documentation/x86/intel-iommu.rst b/Documentation/x86/intel-iommu.rst
deleted file mode 100644
index 099f13d51d5f..
--- a/Documentation/x86/intel-iommu.rst
+++ /dev/null
@@ -1,115 +0,0 @@
-===
-Linux IOMMU Support
-===
-
-The architecture spec can be obtained from the below location.
-
-http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
-
-This guide gives a quick cheat sheet for some basic understanding.
-
-Some Keywords
-
-- DMAR - DMA remapping
-- DRHD - DMA Remapping Hardware Unit Definition
-- RMRR - Reserved memory Region Reporting Structure
-- ZLR  - Zero length reads from PCI devices
-- IOVA - IO Virtual address.
-
-Basic stuff

-
-ACPI enumerates and lists the different DMA engines in the platform, and
-device scope relationships between PCI devices and which DMA engine  controls
-them.
-
-What is RMRR?
--
-
-There are some devices the BIOS controls, for e.g USB devices to perform
-PS2 emulation. The regions of memory used for these devices are marked
-reserved in the e820 map. When we turn on DMA translation, DMA to those
-regions will fail. Hence BIOS uses RMRR to specify these regions along with
-devices that need to access these regions. OS is expected to setup
-unity mappings for these regions for these devices to access these regions.
-
-How is IOVA generated?
---
-
-Well behaved drivers call pci_map_*() calls before sending command to device
-that needs to perform DMA. Once DMA is completed and mapping is no longer
-required, device performs a pci_unmap_*() calls to unmap the region.
-
-The Intel IOMMU driver allocates a virtual address per domain. Each PCIE
-device has its own domain (hence protection). Devices under p2p bridges
-share the virtual address with all devices under the p2p bridge due to
-transaction id aliasing for p2p bridges.
-
-IOVA generation is pretty generic. We used the same technique as vmalloc()
-but these are not global address spaces, but separate for each domain.
-Different DMA engines may support different number of domains.
-
-We also allocate guard pages with each mapping, so we can attempt to catch
-any overflow that might happen.
-
-
-Graphics Problems?
---
-If you encounter issues with graphics devices, you can try adding
-option intel_iommu=igfx_off to turn off the integrated graphics engine.
-If this fixes anything, please ensure you file a bug reporting the problem.
-
-Some exceptions to IOVA

-Interrupt ranges are not address translated, (0xfee0 - 0xfeef).
-The same is true for peer to peer transactions. Hence we reserve the
-address from PCI MMIO ranges so they are not allocated for IOVA addresses.
-
-
-Fault reporting

-When errors are reported, the DMA engine signals via an interrupt. The fault
-reason and device that caused it with fault reason is printed on console.
-
-See below for sample.
-
-
-Boot Message Sample

-
-Something like this gets printed indicating presence of DMAR tables
-in ACPI.
-
-ACPI: DMAR (v001 A M I  OEMDMAR  0x0001 MSFT 0x0097) @ 0x7f5b5ef0
-
-When DMAR is being processed and initialized by ACPI, prints DMAR locations
-and any RMRR's processed::
-
-   ACPI DMAR:Host address width 36
-   ACPI DMAR:DRHD (flags: 0x)base: 0xfed9
-   ACPI DMAR:DRHD (flags: 0x)base: 0xfed91000
-   ACPI DMAR:DRHD (flags: 0x0001)base: 0xfed93000
-   ACPI DMAR:RMRR base: 0x000ed000 end: 0x000e
-   ACPI DMAR:RMRR base: 0x7f60 end: 0x7fff
-
-When DMAR is 

[PATCH v11 9/9] iommu/arm-smmu: Get associated RMR info and install bypass SMR

2022-04-22 Thread Shameer Kolothum via iommu
From: Jon Nettleton 

Check if there is any RMR info associated with the devices behind
the SMMU and if any, install bypass SMRs for them. This is to
keep any ongoing traffic associated with these devices alive
when we enable/reset SMMU during probe().

Signed-off-by: Jon Nettleton 
Signed-off-by: Steven Price 
Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 52 +++
 1 file changed, 52 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 568cce590ccc..e02cc2d4fb4e 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -2068,6 +2068,54 @@ err_reset_platform_ops: __maybe_unused;
return err;
 }
 
+static void arm_smmu_rmr_install_bypass_smr(struct arm_smmu_device *smmu)
+{
+   struct list_head rmr_list;
+   struct iommu_resv_region *e;
+   int idx, cnt = 0;
+   u32 reg;
+
+   INIT_LIST_HEAD(&rmr_list);
+   iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
+
+   /*
+* Rather than trying to look at existing mappings that
+* are setup by the firmware and then invalidate the ones
+* that do not have matching RMR entries, just disable the
+* SMMU until it gets enabled again in the reset routine.
+*/
+   reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sCR0);
+   reg |= ARM_SMMU_sCR0_CLIENTPD;
+   arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sCR0, reg);
+
+   list_for_each_entry(e, &rmr_list, list) {
+   struct iommu_iort_rmr_data *rmr;
+   int i;
+
+   rmr = container_of(e, struct iommu_iort_rmr_data, rr);
+   for (i = 0; i < rmr->num_sids; i++) {
+   idx = arm_smmu_find_sme(smmu, rmr->sids[i], ~0);
+   if (idx < 0)
+   continue;
+
+   if (smmu->s2crs[idx].count == 0) {
+   smmu->smrs[idx].id = rmr->sids[i];
+   smmu->smrs[idx].mask = 0;
+   smmu->smrs[idx].valid = true;
+   }
+   smmu->s2crs[idx].count++;
+   smmu->s2crs[idx].type = S2CR_TYPE_BYPASS;
+   smmu->s2crs[idx].privcfg = S2CR_PRIVCFG_DEFAULT;
+
+   cnt++;
+   }
+   }
+
+   dev_notice(smmu->dev, "\tpreserved %d boot mapping%s\n", cnt,
+  cnt == 1 ? "" : "s");
+   iort_put_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
+}
+
 static int arm_smmu_device_probe(struct platform_device *pdev)
 {
struct resource *res;
@@ -2189,6 +2237,10 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
}
 
platform_set_drvdata(pdev, smmu);
+
+   /* Check for RMRs and install bypass SMRs if any */
+   arm_smmu_rmr_install_bypass_smr(smmu);
+
arm_smmu_device_reset(smmu);
arm_smmu_test_smr_masks(smmu);
 
-- 
2.17.1
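
The bypass installation in the patch above can be pictured with a small stand-alone model. This is only a sketch under stated assumptions, not the driver code: `toy_sme`, `toy_find_sme()` and `toy_install_bypass()` are hypothetical names, the table has four entries, and stream matching ignores SMR masks (the real driver resolves entries through `arm_smmu_find_sme()`). It shows the same bookkeeping as the loop in `arm_smmu_rmr_install_bypass_smr()`: claim an entry for each StreamID, bump its use count, mark it bypass, and count how many mappings were preserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_SME 4 /* hypothetical number of stream mapping entries */

enum toy_s2cr_type { TOY_S2CR_FAULT, TOY_S2CR_BYPASS };

/* One combined SMR/S2CR slot (the driver keeps these in two arrays). */
struct toy_sme {
	uint16_t id;
	bool valid;
	unsigned int count;
	enum toy_s2cr_type type;
};

/* Return the slot already holding sid, else the first unused slot, else -1. */
static int toy_find_sme(const struct toy_sme *tbl, uint16_t sid)
{
	int free_idx = -1;

	for (int i = 0; i < TOY_NUM_SME; i++) {
		if (tbl[i].valid && tbl[i].id == sid)
			return i;
		if (!tbl[i].valid && free_idx < 0)
			free_idx = i;
	}
	return free_idx;
}

/* Install bypass entries for every SID; return how many were installed. */
static int toy_install_bypass(struct toy_sme *tbl, const uint16_t *sids,
			      size_t num_sids)
{
	int cnt = 0;

	for (size_t i = 0; i < num_sids; i++) {
		int idx = toy_find_sme(tbl, sids[i]);

		if (idx < 0)
			continue; /* table full: skip this SID */

		if (tbl[idx].count == 0) {
			tbl[idx].id = sids[i];
			tbl[idx].valid = true;
		}
		tbl[idx].count++;
		tbl[idx].type = TOY_S2CR_BYPASS;
		cnt++;
	}
	return cnt;
}
```

A SID that appears twice reuses its slot and only raises the count, while a SID that finds no slot is skipped, mirroring the `continue` on `idx < 0` in the patch.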



[PATCH v11 8/9] iommu/arm-smmu-v3: Get associated RMR info and install bypass STE

2022-04-22 Thread Shameer Kolothum via iommu
Check if there is any RMR info associated with the devices behind
the SMMUv3 and if any, install bypass STEs for them. This is to
keep any ongoing traffic associated with these devices alive
when we enable/reset SMMUv3 during probe().

Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a939d9e0f747..8a5dfd078e95 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3754,6 +3754,36 @@ static void __iomem *arm_smmu_ioremap(struct device *dev, resource_size_t start,
return devm_ioremap_resource(dev, );
 }
 
+static void arm_smmu_rmr_install_bypass_ste(struct arm_smmu_device *smmu)
+{
+   struct list_head rmr_list;
+   struct iommu_resv_region *e;
+
+   INIT_LIST_HEAD(&rmr_list);
+   iort_get_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
+
+   list_for_each_entry(e, &rmr_list, list) {
+   __le64 *step;
+   struct iommu_iort_rmr_data *rmr;
+   int ret, i;
+
+   rmr = container_of(e, struct iommu_iort_rmr_data, rr);
+   for (i = 0; i < rmr->num_sids; i++) {
+   ret = arm_smmu_init_sid_strtab(smmu, rmr->sids[i]);
+   if (ret) {
+   dev_err(smmu->dev, "RMR SID(0x%x) bypass failed\n",
+   rmr->sids[i]);
+   continue;
+   }
+
+   step = arm_smmu_get_step_for_sid(smmu, rmr->sids[i]);
+   arm_smmu_init_bypass_stes(step, 1, true);
+   }
+   }
+
+   iort_put_rmr_sids(dev_fwnode(smmu->dev), &rmr_list);
+}
+
 static int arm_smmu_device_probe(struct platform_device *pdev)
 {
int irq, ret;
@@ -3835,6 +3865,9 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
/* Record our private device structure */
platform_set_drvdata(pdev, smmu);
 
+   /* Check for RMRs and install bypass STEs if any */
+   arm_smmu_rmr_install_bypass_ste(smmu);
+
/* Reset the device */
ret = arm_smmu_device_reset(smmu, bypass);
if (ret)
-- 
2.17.1



[PATCH v11 7/9] iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force bypass

2022-04-22 Thread Shameer Kolothum via iommu
By default, disable_bypass flag is set and any dev without
an iommu domain installs STE with CFG_ABORT during
arm_smmu_init_bypass_stes(). Introduce a "force" flag and
move the STE update logic to arm_smmu_init_bypass_stes()
so that we can force it to install CFG_BYPASS STE for specific
SIDs.

This will be useful in a follow-up patch to install bypass
for IORT RMR SIDs.

Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index df326d8f02c6..a939d9e0f747 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1380,12 +1380,21 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid,
arm_smmu_cmdq_issue_cmd(smmu, _cmd);
 }
 
-static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent)
+static void arm_smmu_init_bypass_stes(__le64 *strtab, unsigned int nent, bool force)
 {
unsigned int i;
+   u64 val = STRTAB_STE_0_V;
+
+   if (disable_bypass && !force)
+   val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
+   else
+   val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS);
 
for (i = 0; i < nent; ++i) {
-   arm_smmu_write_strtab_ent(NULL, -1, strtab);
+   strtab[0] = cpu_to_le64(val);
+   strtab[1] = cpu_to_le64(FIELD_PREP(STRTAB_STE_1_SHCFG,
+  STRTAB_STE_1_SHCFG_INCOMING));
+   strtab[2] = 0;
strtab += STRTAB_STE_DWORDS;
}
 }
@@ -1413,7 +1422,7 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid)
return -ENOMEM;
}
 
-   arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT);
+   arm_smmu_init_bypass_stes(desc->l2ptr, 1 << STRTAB_SPLIT, false);
arm_smmu_write_strtab_l1_desc(strtab, desc);
return 0;
 }
@@ -3051,7 +3060,7 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
reg |= FIELD_PREP(STRTAB_BASE_CFG_LOG2SIZE, smmu->sid_bits);
cfg->strtab_base_cfg = reg;
 
-   arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents);
+   arm_smmu_init_bypass_stes(strtab, cfg->num_l1_ents, false);
return 0;
 }
 
-- 
2.17.1
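
The effect of the new "force" flag can be checked in isolation with a short sketch. The constants below are illustrative stand-ins rather than the driver's definitions (the real ones live in the SMMUv3 driver header), and `toy_ste_dword0()` and `toy_init_bypass_stes()` are hypothetical names. The sketch also leaves out the SHCFG word that the patch writes to `strtab[1]`. The point it illustrates is the decision table: an STE is set to abort only when `disable_bypass` is set and the caller did not force bypass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative encodings, assumed for this sketch only. */
#define TOY_STE_V          (1ULL << 0) /* valid bit */
#define TOY_STE_CFG_SHIFT  1
#define TOY_STE_CFG_ABORT  0ULL
#define TOY_STE_CFG_BYPASS 4ULL
#define TOY_STE_DWORDS     8           /* 64-bit words per STE */

/* First STE word for the given policy flags. */
static uint64_t toy_ste_dword0(bool disable_bypass, bool force)
{
	uint64_t cfg = (disable_bypass && !force) ? TOY_STE_CFG_ABORT
						  : TOY_STE_CFG_BYPASS;

	return TOY_STE_V | (cfg << TOY_STE_CFG_SHIFT);
}

/* Fill nent consecutive STEs with the same abort/bypass configuration. */
static void toy_init_bypass_stes(uint64_t *strtab, unsigned int nent,
				 bool disable_bypass, bool force)
{
	uint64_t val = toy_ste_dword0(disable_bypass, force);

	for (unsigned int i = 0; i < nent; i++) {
		strtab[0] = val;
		strtab[2] = 0;
		strtab += TOY_STE_DWORDS;
	}
}
```

With `disable_bypass` set, a table initialised without `force` holds abort entries, and re-initialising a single entry with `force` switches only that entry to bypass, which is how the follow-up RMR patch uses the helper.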



[PATCH v11 6/9] iommu/arm-smmu-v3: Introduce strtab init helper

2022-04-22 Thread Shameer Kolothum via iommu
Introduce a helper to check the SID range and to init the l2 strtab
entries (bypass). This will be useful when we have to initialize the
l2 strtab with bypass for RMR SIDs.

Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 28 +++--
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 627a3ed5ee8f..df326d8f02c6 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2537,6 +2537,19 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid)
return sid < limit;
 }
 
+static int arm_smmu_init_sid_strtab(struct arm_smmu_device *smmu, u32 sid)
+{
+   /* Check the SIDs are in range of the SMMU and our stream table */
+   if (!arm_smmu_sid_in_range(smmu, sid))
+   return -ERANGE;
+
+   /* Ensure l2 strtab is initialised */
+   if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB)
+   return arm_smmu_init_l2_strtab(smmu, sid);
+
+   return 0;
+}
+
 static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
  struct arm_smmu_master *master)
 {
@@ -2560,20 +2573,9 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
new_stream->id = sid;
new_stream->master = master;
 
-   /*
-* Check the SIDs are in range of the SMMU and our stream table
-*/
-   if (!arm_smmu_sid_in_range(smmu, sid)) {
-   ret = -ERANGE;
+   ret = arm_smmu_init_sid_strtab(smmu, sid);
+   if (ret)
break;
-   }
-
-   /* Ensure l2 strtab is initialised */
-   if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
-   ret = arm_smmu_init_l2_strtab(smmu, sid);
-   if (ret)
-   break;
-   }
 
/* Insert into SID tree */
new_node = &(smmu->streams.rb_node);
-- 
2.17.1
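
The control flow that the new helper factors out is small enough to model directly. The sketch below is an assumption-laden stand-in, not the driver: `toy_smmu`, `toy_init_l2()` and `toy_init_sid_strtab()` are hypothetical, and the level-2 initialisation is reduced to a counter. It shows the two steps in order: reject an out-of-range SID with `-ERANGE`, then initialise the level-2 table only when the two-level stream table feature is present.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_smmu {
	uint32_t sid_limit;    /* valid SIDs are 0 .. sid_limit - 1 */
	bool two_level;        /* models ARM_SMMU_FEAT_2_LVL_STRTAB */
	unsigned int l2_inits; /* number of level-2 init calls made */
};

/* Stand-in for the level-2 table setup; always succeeds here. */
static int toy_init_l2(struct toy_smmu *smmu, uint32_t sid)
{
	(void)sid;
	smmu->l2_inits++;
	return 0;
}

/* Range check first, then level-2 init if the feature is enabled. */
static int toy_init_sid_strtab(struct toy_smmu *smmu, uint32_t sid)
{
	if (sid >= smmu->sid_limit)
		return -ERANGE;

	if (smmu->two_level)
		return toy_init_l2(smmu, sid);

	return 0;
}
```

Because the range check comes first, an invalid SID never reaches the level-2 setup, which is the property both `arm_smmu_insert_master()` and the later RMR code rely on.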



[PATCH v11 5/9] ACPI/IORT: Add a helper to retrieve RMR info directly

2022-04-22 Thread Shameer Kolothum via iommu
This will provide a way for SMMU drivers to retrieve StreamIDs
associated with IORT RMR nodes and use that to set bypass settings
for those IDs.

Signed-off-by: Shameer Kolothum 
---
 drivers/acpi/arm64/iort.c | 28 
 include/linux/acpi_iort.h |  8 
 2 files changed, 36 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 5be6e8ecca38..c0de17034dff 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1393,6 +1393,34 @@ int iort_dma_get_ranges(struct device *dev, u64 *size)
return nc_dma_get_range(dev, size);
 }
 
+/**
+ * iort_get_rmr_sids - Retrieve IORT RMR node reserved regions with
+ * associated StreamIDs information.
+ * @iommu_fwnode: fwnode associated with IOMMU
+ * @head: Reserved region list
+ */
+void iort_get_rmr_sids(struct fwnode_handle *iommu_fwnode,
+  struct list_head *head)
+{
+   iort_iommu_rmr_get_resv_regions(iommu_fwnode, NULL, head);
+}
+EXPORT_SYMBOL_GPL(iort_get_rmr_sids);
+
+/**
+ * iort_put_rmr_sids - Free memory allocated for RMR reserved regions.
+ * @iommu_fwnode: fwnode associated with IOMMU
+ * @head: Reserved region list
+ */
+void iort_put_rmr_sids(struct fwnode_handle *iommu_fwnode,
+  struct list_head *head)
+{
+   struct iommu_resv_region *entry, *next;
+
+   list_for_each_entry_safe(entry, next, head, list)
+   entry->free(NULL, entry);
+}
+EXPORT_SYMBOL_GPL(iort_put_rmr_sids);
+
 static void __init acpi_iort_register_irq(int hwirq, const char *name,
  int trigger,
  struct resource *res)
diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
index e5d2de9caf7f..b43be0987b19 100644
--- a/include/linux/acpi_iort.h
+++ b/include/linux/acpi_iort.h
@@ -33,6 +33,10 @@ struct irq_domain *iort_get_device_domain(struct device *dev, u32 id,
  enum irq_domain_bus_token bus_token);
 void acpi_configure_pmsi_domain(struct device *dev);
 int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id);
+void iort_get_rmr_sids(struct fwnode_handle *iommu_fwnode,
+  struct list_head *head);
+void iort_put_rmr_sids(struct fwnode_handle *iommu_fwnode,
+  struct list_head *head);
 /* IOMMU interface */
 int iort_dma_get_ranges(struct device *dev, u64 *size);
 int iort_iommu_configure_id(struct device *dev, const u32 *id_in);
@@ -46,6 +50,10 @@ static inline struct irq_domain *iort_get_device_domain(
struct device *dev, u32 id, enum irq_domain_bus_token bus_token)
 { return NULL; }
 static inline void acpi_configure_pmsi_domain(struct device *dev) { }
+static inline
+void iort_get_rmr_sids(struct fwnode_handle *iommu_fwnode, struct list_head *head) { }
+static inline
+void iort_put_rmr_sids(struct fwnode_handle *iommu_fwnode, struct list_head *head) { }
 /* IOMMU interface */
 static inline int iort_dma_get_ranges(struct device *dev, u64 *size)
 { return -ENODEV; }
-- 
2.17.1
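
The release path in `iort_put_rmr_sids()` depends on two things: each region carries its own free callback, and the list walk reads the next pointer before the current entry is freed. A minimal user-space sketch of that pattern follows. All names (`toy_region`, `toy_rmr_region`, `toy_rmr_alloc()`, `toy_put_regions()`) are hypothetical, a singly linked list replaces the kernel's `struct list_head`, and the callback takes only the region, whereas the kernel callback in this series also takes a device pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Generic region with a per-region release callback. */
struct toy_region {
	struct toy_region *next;
	void (*free)(struct toy_region *region);
};

/* RMR-specific wrapper: the generic region is embedded as first member. */
struct toy_rmr_region {
	struct toy_region rr;
	unsigned int *sids;
};

static int toy_freed; /* counts released regions, for demonstration */

static void toy_rmr_free(struct toy_region *region)
{
	/* Valid cast because rr is the first member of toy_rmr_region. */
	struct toy_rmr_region *rmr = (struct toy_rmr_region *)region;

	free(rmr->sids);
	free(rmr);
	toy_freed++;
}

/* Allocate a region plus its SID array and push it onto the list. */
static struct toy_region *toy_rmr_alloc(struct toy_region *head,
					unsigned int num_sids)
{
	struct toy_rmr_region *rmr = malloc(sizeof(*rmr));

	if (!rmr)
		return head;

	rmr->sids = calloc(num_sids, sizeof(*rmr->sids));
	if (!rmr->sids) {
		free(rmr);
		return head;
	}
	rmr->rr.free = toy_rmr_free;
	rmr->rr.next = head;
	return &rmr->rr;
}

/* Safe walk: fetch next before the callback frees the current entry. */
static void toy_put_regions(struct toy_region *head)
{
	struct toy_region *entry = head, *next;

	while (entry) {
		next = entry->next;
		entry->free(entry);
		entry = next;
	}
}
```

The caller never needs to know that an entry owns a separate SID array; the callback hides that, which is the reason the series adds a per-region callback instead of relying on a plain `kfree()` of the region.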



[PATCH v11 3/9] ACPI/IORT: Provide a generic helper to retrieve reserve regions

2022-04-22 Thread Shameer Kolothum via iommu
Currently IORT provides a helper to retrieve HW MSI reserve regions.
Change this to a generic helper to retrieve any IORT related reserve
regions. This will be useful when we add support for RMR nodes in
subsequent patches.

[Lorenzo: For ACPI IORT]
Reviewed-by: Lorenzo Pieralisi 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Shameer Kolothum 
---
 drivers/acpi/arm64/iort.c | 22 +++---
 drivers/iommu/dma-iommu.c |  2 +-
 include/linux/acpi_iort.h |  4 ++--
 3 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 213f61cae176..cd5d1d7823cb 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -806,15 +806,13 @@ static struct acpi_iort_node *iort_get_msi_resv_iommu(struct device *dev)
return NULL;
 }
 
-/**
- * iort_iommu_msi_get_resv_regions - Reserved region driver helper
- * @dev: Device from iommu_get_resv_regions()
- * @head: Reserved region list from iommu_get_resv_regions()
- *
+/*
+ * Retrieve platform specific HW MSI reserve regions.
  * The ITS interrupt translation spaces (ITS_base + SZ_64K, SZ_64K)
  * associated with the device are the HW MSI reserved regions.
  */
-void iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
+static void iort_iommu_msi_get_resv_regions(struct device *dev,
+   struct list_head *head)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct acpi_iort_its_group *its;
@@ -863,6 +861,16 @@ void iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
}
 }
 
+/**
+ * iort_iommu_get_resv_regions - Generic helper to retrieve reserved regions.
+ * @dev: Device from iommu_get_resv_regions()
+ * @head: Reserved region list from iommu_get_resv_regions()
+ */
+void iort_iommu_get_resv_regions(struct device *dev, struct list_head *head)
+{
+   iort_iommu_msi_get_resv_regions(dev, head);
+}
+
 static inline bool iort_iommu_driver_enabled(u8 type)
 {
switch (type) {
@@ -1027,7 +1035,7 @@ int iort_iommu_configure_id(struct device *dev, const u32 *id_in)
 }
 
 #else
-void iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
+void iort_iommu_get_resv_regions(struct device *dev, struct list_head *head)
 { }
 int iort_iommu_configure_id(struct device *dev, const u32 *input_id)
 { return -ENODEV; }
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 09f6e1c0f9c0..93d76b666888 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -384,7 +384,7 @@ void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 {
 
if (!is_of_node(dev_iommu_fwspec_get(dev)->iommu_fwnode))
-   iort_iommu_msi_get_resv_regions(dev, list);
+   iort_iommu_get_resv_regions(dev, list);
 
 }
 EXPORT_SYMBOL(iommu_dma_get_resv_regions);
diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
index a8198b83753d..e5d2de9caf7f 100644
--- a/include/linux/acpi_iort.h
+++ b/include/linux/acpi_iort.h
@@ -36,7 +36,7 @@ int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id);
 /* IOMMU interface */
 int iort_dma_get_ranges(struct device *dev, u64 *size);
 int iort_iommu_configure_id(struct device *dev, const u32 *id_in);
-void iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head);
+void iort_iommu_get_resv_regions(struct device *dev, struct list_head *head);
 phys_addr_t acpi_iort_dma_get_max_cpu_address(void);
 #else
 static inline void acpi_iort_init(void) { }
@@ -52,7 +52,7 @@ static inline int iort_dma_get_ranges(struct device *dev, u64 *size)
 static inline int iort_iommu_configure_id(struct device *dev, const u32 *id_in)
 { return -ENODEV; }
 static inline
-void iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
+void iort_iommu_get_resv_regions(struct device *dev, struct list_head *head)
 { }
 
 static inline phys_addr_t acpi_iort_dma_get_max_cpu_address(void)
-- 
2.17.1


[PATCH v11 4/9] ACPI/IORT: Add support to retrieve IORT RMR reserved regions

2022-04-22 Thread Shameer Kolothum via iommu
Parse through the IORT RMR nodes and populate the reserve region list
corresponding to a given IOMMU and device (optional). Also, go through
the ID mappings of the RMR node and retrieve all the SIDs associated
with it.

Signed-off-by: Shameer Kolothum 
---
 drivers/acpi/arm64/iort.c | 290 ++
 include/linux/iommu.h |   8 ++
 2 files changed, 298 insertions(+)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index cd5d1d7823cb..5be6e8ecca38 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -788,6 +788,293 @@ void acpi_configure_pmsi_domain(struct device *dev)
 }
 
 #ifdef CONFIG_IOMMU_API
+static void iort_rmr_free(struct device *dev,
+ struct iommu_resv_region *region)
+{
+   struct iommu_iort_rmr_data *rmr_data;
+
+   rmr_data = container_of(region, struct iommu_iort_rmr_data, rr);
+   kfree(rmr_data->sids);
+   kfree(rmr_data);
+}
+
+struct iommu_iort_rmr_data *iort_rmr_alloc(struct acpi_iort_rmr_desc *rmr_desc,
+  int prot, enum iommu_resv_type type,
+  u32 *sids, u32 num_sids)
+{
+   struct iommu_iort_rmr_data *rmr_data;
+   struct iommu_resv_region *region;
+   u32 *sids_copy;
+   u64 addr = rmr_desc->base_address, size = rmr_desc->length;
+
+   rmr_data = kmalloc(sizeof(*rmr_data), GFP_KERNEL);
+   if (!rmr_data)
+   return NULL;
+
+   /* Create a copy of SIDs array to associate with this rmr_data */
+   sids_copy = kmemdup(sids, num_sids * sizeof(*sids), GFP_KERNEL);
+   if (!sids_copy) {
+   kfree(rmr_data);
+   return NULL;
+   }
+   rmr_data->sids = sids_copy;
+   rmr_data->num_sids = num_sids;
+
+   if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) {
+   /* PAGE align base addr and size */
+   addr &= PAGE_MASK;
+   size = PAGE_ALIGN(size + offset_in_page(rmr_desc->base_address));
+
+   pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not aligned to 64K, continue with [0x%llx - 0x%llx]\n",
+  rmr_desc->base_address,
+  rmr_desc->base_address + rmr_desc->length - 1,
+  addr, addr + size - 1);
+   }
+
+   region = &rmr_data->rr;
+   INIT_LIST_HEAD(&region->list);
+   region->start = addr;
+   region->length = size;
+   region->prot = prot;
+   region->type = type;
+   region->free = iort_rmr_free;
+
+   return rmr_data;
+}
+
+static void iort_rmr_desc_check_overlap(struct acpi_iort_rmr_desc *desc,
+   u32 count)
+{
+   int i, j;
+
+   for (i = 0; i < count; i++) {
+   u64 end, start = desc[i].base_address, length = desc[i].length;
+
+   if (!length) {
+   pr_err(FW_BUG "RMR descriptor[0x%llx] with zero length, continue anyway\n",
+  start);
+   continue;
+   }
+
+   end = start + length - 1;
+
+   /* Check for address overlap */
+   for (j = i + 1; j < count; j++) {
+   u64 e_start = desc[j].base_address;
+   u64 e_end = e_start + desc[j].length - 1;
+
+   if (start <= e_end && end >= e_start)
+   pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] overlaps, continue anyway\n",
+  start, end);
+   }
+   }
+}
+
+/*
+ * Please note, we will keep the already allocated RMR reserve
+ * regions in case of a memory allocation failure.
+ */
+static void iort_get_rmrs(struct acpi_iort_node *node,
+ struct acpi_iort_node *smmu,
+ u32 *sids, u32 num_sids,
+ struct list_head *head)
+{
+   struct acpi_iort_rmr *rmr = (struct acpi_iort_rmr *)node->node_data;
+   struct acpi_iort_rmr_desc *rmr_desc;
+   int i;
+
+   rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, node,
+   rmr->rmr_offset);
+
+   iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count);
+
+   for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) {
+   struct iommu_iort_rmr_data *rmr_data;
+   enum iommu_resv_type type;
+   int prot = IOMMU_READ | IOMMU_WRITE;
+
+   if (rmr->flags & ACPI_IORT_RMR_REMAP_PERMITTED)
+   type = IOMMU_RESV_DIRECT_RELAXABLE;
+   else
+   type = IOMMU_RESV_DIRECT;
+
+   if (rmr->flags & ACPI_IORT_RMR_ACCESS_PRIVILEGE)
+   prot |= IOMMU_PRIV;
+
+   /* Attributes 0x00 - 0x03 represents device memory */
+   if (ACPI_IORT_RMR_ACCESS_ATTRIBUTES(rmr->flags) <=
+   

[PATCH v11 2/9] ACPI/IORT: Make iort_iommu_msi_get_resv_regions() return void

2022-04-22 Thread Shameer Kolothum via iommu
At present iort_iommu_msi_get_resv_regions() returns the number of
MSI reserved regions on success and there are no users for this.
The reserved region list will get populated anyway for platforms
that require the HW MSI region reservation. Hence, change the
function to return void instead.

Reviewed-by: Christoph Hellwig 
Signed-off-by: Shameer Kolothum 
---
 drivers/acpi/arm64/iort.c | 25 +
 include/linux/acpi_iort.h |  6 +++---
 2 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index f2f8f05662de..213f61cae176 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -811,22 +811,19 @@ static struct acpi_iort_node *iort_get_msi_resv_iommu(struct device *dev)
  * @dev: Device from iommu_get_resv_regions()
  * @head: Reserved region list from iommu_get_resv_regions()
  *
- * Returns: Number of msi reserved regions on success (0 if platform
- *  doesn't require the reservation or no associated msi regions),
- *  appropriate error value otherwise. The ITS interrupt translation
- *  spaces (ITS_base + SZ_64K, SZ_64K) associated with the device
- *  are the msi reserved regions.
+ * The ITS interrupt translation spaces (ITS_base + SZ_64K, SZ_64K)
+ * associated with the device are the HW MSI reserved regions.
  */
-int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
+void iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
 {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct acpi_iort_its_group *its;
struct acpi_iort_node *iommu_node, *its_node = NULL;
-   int i, resv = 0;
+   int i;
 
iommu_node = iort_get_msi_resv_iommu(dev);
if (!iommu_node)
-   return 0;
+   return;
 
/*
 * Current logic to reserve ITS regions relies on HW topologies
@@ -846,7 +843,7 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
}
 
if (!its_node)
-   return 0;
+   return;
 
/* Move to ITS specific data */
its = (struct acpi_iort_its_group *)its_node->node_data;
@@ -860,14 +857,10 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
 
region = iommu_alloc_resv_region(base + SZ_64K, SZ_64K,
 prot, IOMMU_RESV_MSI);
-   if (region) {
+   if (region)
list_add_tail(&region->list, head);
-   resv++;
-   }
}
}
-
-   return (resv == its->its_count) ? resv : -ENODEV;
 }
 
 static inline bool iort_iommu_driver_enabled(u8 type)
@@ -1034,8 +1027,8 @@ int iort_iommu_configure_id(struct device *dev, const u32 *id_in)
 }
 
 #else
-int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
-{ return 0; }
+void iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
+{ }
 int iort_iommu_configure_id(struct device *dev, const u32 *input_id)
 { return -ENODEV; }
 #endif
diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
index f1f0842a2cb2..a8198b83753d 100644
--- a/include/linux/acpi_iort.h
+++ b/include/linux/acpi_iort.h
@@ -36,7 +36,7 @@ int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id);
 /* IOMMU interface */
 int iort_dma_get_ranges(struct device *dev, u64 *size);
 int iort_iommu_configure_id(struct device *dev, const u32 *id_in);
-int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head);
+void iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head);
 phys_addr_t acpi_iort_dma_get_max_cpu_address(void);
 #else
 static inline void acpi_iort_init(void) { }
@@ -52,8 +52,8 @@ static inline int iort_dma_get_ranges(struct device *dev, u64 *size)
 static inline int iort_iommu_configure_id(struct device *dev, const u32 *id_in)
 { return -ENODEV; }
 static inline
-int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
-{ return 0; }
+void iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
+{ }
 
 static inline phys_addr_t acpi_iort_dma_get_max_cpu_address(void)
 { return PHYS_ADDR_MAX; }
-- 
2.17.1



[PATCH v11 1/9] iommu: Introduce a callback to struct iommu_resv_region

2022-04-22 Thread Shameer Kolothum via iommu
A callback is introduced to struct iommu_resv_region to free memory
allocations associated with the reserved region. This will be useful
when we introduce support for IORT RMR based reserved regions.
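
A minimal userspace sketch of the per-region free callback described
above, not the kernel code itself. It assumes a simplified singly linked
list instead of struct list_head, and the callback drops the struct
device argument that the real one takes. The names resv_region,
rmr_data, put_regions() and rmr_free_calls are hypothetical and only
serve the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same idea as the kernel's container_of(): recover the outer struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct iommu_resv_region. */
struct resv_region {
    struct resv_region *next;
    unsigned long start;
    unsigned long length;
    void (*free)(struct resv_region *region); /* optional callback */
};

/* Hypothetical container that owns extra memory besides the region. */
struct rmr_data {
    struct resv_region rr;
    unsigned int *sids;
    unsigned int num_sids;
};

static int rmr_free_calls; /* counts callback invocations, for checking */

static void rmr_free(struct resv_region *region)
{
    struct rmr_data *data;

    assert(region);
    data = container_of(region, struct rmr_data, rr);
    free(data->sids);
    free(data);
    rmr_free_calls++;
}

/* A simple region with no extra allocations and no callback. */
static struct resv_region *plain_region_alloc(unsigned long start,
                                              unsigned long length)
{
    struct resv_region *r = calloc(1, sizeof(*r));

    if (!r)
        return NULL;
    r->start = start;
    r->length = length;
    return r;
}

/* A region embedded in a container; it installs the free callback. */
static struct resv_region *rmr_region_alloc(unsigned long start,
                                            unsigned long length,
                                            unsigned int num_sids)
{
    struct rmr_data *data = calloc(1, sizeof(*data));

    if (!data)
        return NULL;
    data->sids = calloc(num_sids, sizeof(*data->sids));
    if (!data->sids) {
        free(data);
        return NULL;
    }
    data->num_sids = num_sids;
    data->rr.start = start;
    data->rr.length = length;
    data->rr.free = rmr_free;
    return &data->rr;
}

/* Release every region: use the callback if present, else plain free(). */
static int put_regions(struct resv_region *head)
{
    int released = 0;

    while (head) {
        struct resv_region *next = head->next;

        if (head->free)
            head->free(head);
        else
            free(head);
        released++;
        head = next;
    }
    return released;
}
```

The generic release loop never needs to know which kind of region it
holds; only regions with extra allocations pay for a callback.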

Signed-off-by: Shameer Kolothum 
---
 drivers/iommu/iommu.c | 16 +++-
 include/linux/iommu.h |  2 ++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f2c45b85b9fc..ffcfa684e80c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2597,16 +2597,22 @@ void iommu_put_resv_regions(struct device *dev, struct list_head *list)
  * @list: reserved region list for device
  *
  * IOMMU drivers can use this to implement their .put_resv_regions() callback
- * for simple reservations. Memory allocated for each reserved region will be
- * freed. If an IOMMU driver allocates additional resources per region, it is
- * going to have to implement a custom callback.
+ * for simple reservations. If a per region callback is provided that will be
+ * used to free all memory allocations associated with the reserved region or
+ * else just free up the memory for the regions. If an IOMMU driver allocates
+ * additional resources per region, it is going to have to implement a custom
+ * callback.
  */
 void generic_iommu_put_resv_regions(struct device *dev, struct list_head *list)
 {
struct iommu_resv_region *entry, *next;
 
-   list_for_each_entry_safe(entry, next, list, list)
-   kfree(entry);
+   list_for_each_entry_safe(entry, next, list, list) {
+   if (entry->free)
+   entry->free(dev, entry);
+   else
+   kfree(entry);
+   }
 }
 EXPORT_SYMBOL(generic_iommu_put_resv_regions);
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 9208eca4b0d1..68bcfb3a06d7 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -134,6 +134,7 @@ enum iommu_resv_type {
  * @length: Length of the region in bytes
  * @prot: IOMMU Protection flags (READ/WRITE/...)
  * @type: Type of the reserved region
+ * @free: Callback to free associated memory allocations
  */
 struct iommu_resv_region {
struct list_headlist;
@@ -141,6 +142,7 @@ struct iommu_resv_region {
size_t  length;
int prot;
enum iommu_resv_typetype;
+   void (*free)(struct device *dev, struct iommu_resv_region *region);
 };
 
 /**
-- 
2.17.1


[PATCH v11 0/9] ACPI/IORT: Support for IORT RMR node

2022-04-22 Thread Shameer Kolothum via iommu
Hi

v10 --> v11
 -Addressed Christoph's comments. We now have a callback to 
  struct iommu_resv_region to free all related memory and also dropped
  the FW specific union and now has a container struct iommu_iort_rmr_data.
  See patches #1 & #4
 -Added R-by from Christoph.
 -Dropped R-by from Lorenzo for patches #4 & #5 due to the above changes.
 -Also dropped T-by from Steve and Laurentiu. Many thanks for your test
  efforts. I have done basic sanity testing on my platform but please
  give it a try at your end as well.

As mentioned in v10, this now has a dependency on the ACPICA header patch
here[1]. 

Please take a look and let me know.

Thanks,
Shameer
[1] https://lore.kernel.org/all/44610361.fMDQidcC6G@kreacher/

From old:
We have faced issues with 3408iMR RAID controller cards which
fail to boot when SMMU is enabled. This is because these
controllers make use of host memory for various caching related
purposes and when SMMU is enabled the iMR firmware fails to
access these memory regions as there is no mapping for them.
IORT RMR provides a way for UEFI to describe and report these
memory regions so that the kernel can make a unity mapping for
these in SMMU.

Change History:

v9 --> v10
 - Dropped patch #1 ("Add temporary RMR node flag definitions") since
   the ACPICA header updates patch is now in the mailing list
 - Based on the suggestion from Christoph, introduced a 
   resv_region_free_fw_data() callback in struct iommu_resv_region and
   used that to free RMR specific memory allocations.

v8 --> v9
 - Addressed comments from Robin on interfaces.
 - Addressed comments from Lorenzo.

v7 --> v8
  - Patch #1 has temp definitions for RMR related changes till
    the ACPICA header changes are part of kernel.
  - No early parsing of RMR node info and is only parsed at the
    time of use.
  - Changes to the RMR get/put API format compared to the
    previous version.
  - Support for RMR descriptor shared by multiple stream IDs.

v6 --> v7
 -fix pointed out by Steve to the SMMUv2 SMR bypass install in patch #8.

v5 --> v6
- Addressed comments from Robin & Lorenzo.
  : Moved iort_parse_rmr() to acpi_iort_init() from
    iort_init_platform_devices().
  : Removed use of struct iort_rmr_entry during the initial
    parse. Using struct iommu_resv_region instead.
  : Report RMR address alignment and overlap errors, but continue.
  : Reworked arm_smmu_init_bypass_stes() (patch # 6).
- Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8).
- Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based
  on Type of RMR region. Suggested by Jon N.
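
The alignment and overlap handling mentioned in the v5 --> v6 notes can
be sketched in userspace as below. This is only an illustration under
stated assumptions: a 4K page size, and hypothetical names (struct range,
page_align_range(), ranges_overlap()). The posted patch applies the
widening only when a descriptor is not 64K aligned, and it reports the
problem and continues rather than failing.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4K pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical descriptor: base address plus length in bytes. */
struct range {
    uint64_t base;
    uint64_t length;
};

/*
 * Widen a range outward to page boundaries: round the base down and
 * round up the length plus the offset that the base had in its page.
 */
static struct range page_align_range(struct range in)
{
    struct range out;
    uint64_t offset = in.base & (SKETCH_PAGE_SIZE - 1);

    /* The masking below only works for a power-of-two page size. */
    assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);

    out.base = in.base & SKETCH_PAGE_MASK;
    out.length = (in.length + offset + SKETCH_PAGE_SIZE - 1) &
                 SKETCH_PAGE_MASK;
    return out;
}

/*
 * Inclusive-end overlap test, the same comparison the patch uses:
 * two ranges overlap if each one starts no later than the other ends.
 * Zero-length ranges are treated as never overlapping in this sketch.
 */
static int ranges_overlap(struct range a, struct range b)
{
    uint64_t a_end, b_end;

    if (!a.length || !b.length)
        return 0;

    a_end = a.base + a.length - 1;
    b_end = b.base + b.length - 1;
    return a.base <= b_end && a_end >= b.base;
}
```

For example, a range at 0x1f00 of length 0x200 becomes 0x1000 with
length 0x2000, which covers both pages the original bytes touch.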

v4 --> v5
 -Added a fw_data union to struct iommu_resv_region and removed
  struct iommu_rmr (Based on comments from Joerg/Robin).
 -Added iommu_put_rmrs() to release mem.
 -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by
  yet because of the above changes.

v3 -->v4
-Included the SMMUv2 SMR bypass install changes suggested by
 Steve(patch #7)
-As per Robin's comments, RMR reserve implementation is now
 more generic  (patch #8) and dropped v3 patches 8 and 10.
-Rebase to 5.13-rc1

RFC v2 --> v3
 -Dropped RFC tag as the ACPICA header changes are now ready to be
  part of 5.13[0]. But this series still has a dependency on that patch.
 -Added IORT E.b related changes(node flags, _DSM function 5 checks for
  PCIe).
 -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
  discussion here[1].
 -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!)

Jon Nettleton (1):
  iommu/arm-smmu: Get associated RMR info and install bypass SMR

Shameer Kolothum (8):
  iommu: Introduce a callback to struct iommu_resv_region
  ACPI/IORT: Make iort_iommu_msi_get_resv_regions() return void
  ACPI/IORT: Provide a generic helper to retrieve reserve regions
  ACPI/IORT: Add support to retrieve IORT RMR reserved regions
  ACPI/IORT: Add a helper to retrieve RMR info directly
  iommu/arm-smmu-v3: Introduce strtab init helper
  iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force
bypass
  iommu/arm-smmu-v3: Get associated RMR info and install bypass STE

 drivers/acpi/arm64/iort.c   | 359 ++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  78 -
 drivers/iommu/arm/arm-smmu/arm-smmu.c   |  52 +++
 drivers/iommu/dma-iommu.c   |   2 +-
 drivers/iommu/iommu.c   |  16 +-
 include/linux/acpi_iort.h   |  14 +-
 include/linux/iommu.h   |  10 +
 7 files changed, 485 insertions(+), 46 deletions(-)

-- 
2.17.1


Re: Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-22 Thread Jean-Philippe Brucker
On Fri, Apr 22, 2022 at 09:15:01PM +0800, zhangfei@foxmail.com wrote:
> > I'm trying to piece together what happens from the kernel point of view.
> > 
> > * master process with mm A opens a queue fd through uacce, which calls
> >iommu_sva_bind_device(dev, A) -> PASID 1
> > 
> > * master forks and exits. Child (daemon) gets mm B, inherits the queue fd.
> >The device is still bound to mm A with PASID 1, since the queue fd is
> >still open.
> 
> > We discussed this before, but I don't remember where we left off. The
> > child can't use the queue because its mappings are not copied on fork(),
> > and the queue is still bound to the parent mm A. The child either needs to
> > open a new queue or take ownership of the old one with a new uacce ioctl.
> Yes, currently nginx aligned with the case.
> Child process (worker process) reopen uacce,
> 
> Master process (do init) open uacce, iommu_sva_bind_device(dev, A) -> PASID
> 1
> Master process fork Child (daemon) and exit.
> 
> Child (daemon)  does not use PASID 1 any more, only fork and manage worker
> process.
> Worker process reopen uacce, iommu_sva_bind_device(dev, B) PASID 2
> 
> So it is expected.

Yes, that's fine

> > Is that the "IMPLEMENT_DYNAMIC_BIND_FN()" you mention, something out of
> > tree?  This operation should unbind from A before binding to B, no?
> > Otherwise we leak PASID 1.
> In 5.16 PASID 1 from master is hold until nginx service stop.
> nginx start
> master:
> iommu_sva_alloc_pasid mm->pasid=1      // master process
> 
> lynx https start:
> iommu_sva_alloc_pasid mm->pasid=2    //worker process
> 
> nginx stop:  from fops_release
> iommu_sva_free_pasid mm->pasid=2   // worker process
> iommu_sva_free_pasid mm->pasid=1  // master process

That's the expected behavior (master could close its fd before forking, in
order to free things up earlier, but it's not mandatory)

> Have one silly question.
> 
> kerne driver
> fops_open
> iommu_sva_bind_device
> 
> fops_release
> iommu_sva_unbind_device
> 
> application
> main()
> fd = open
> return;
> 
> Application exit but not close(fd), is it expected fops_release will be
> called automatically by system?

Yes, the application doesn't have to call close() explicitly, the file
descriptor is closed automatically on exit. Note that the fd is copied on
fork(), so it is only released once parent and all child processes exit.
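
The fd lifetime rule above can be observed from userspace with a plain
pipe standing in for the uacce queue fd (an analogy only; nothing here
touches uacce or SVA). In this sketch the reader sees EOF, which is the
pipe's equivalent of the final release, only after the forked child that
inherited the write end has exited, even though the child never calls
close() on it. The function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the write side of the pipe stays open while the child
 * still holds its inherited copy and is released once the child exits,
 * 0 if not, and -1 on a setup error.
 */
static int release_waits_for_last_holder(void)
{
    int data[2], ctl[2];
    int before_is_eagain;
    ssize_t after;
    pid_t pid;
    char c;

    if (pipe(data) < 0)
        return -1;
    if (pipe(ctl) < 0) {
        close(data[0]);
        close(data[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(data[0]);
        close(data[1]);
        close(ctl[0]);
        close(ctl[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: keeps the inherited data[1] and never closes it. */
        close(data[0]);
        close(ctl[1]);
        /* Block until the parent closes its end of the control pipe. */
        if (read(ctl[0], &c, 1) < 0)
            _exit(1);
        _exit(0); /* exit drops data[1] implicitly */
    }

    close(ctl[0]);
    close(data[1]); /* parent drops its own copy of the write end */

    if (fcntl(data[0], F_SETFL, O_NONBLOCK) < 0) {
        close(ctl[1]);
        waitpid(pid, NULL, 0);
        close(data[0]);
        return -1;
    }

    /* Child still holds a write end: no data, but no EOF either. */
    before_is_eagain = read(data[0], &c, 1) < 0 && errno == EAGAIN;

    close(ctl[1]); /* let the child exit */
    if (waitpid(pid, NULL, 0) < 0) {
        close(data[0]);
        return -1;
    }

    /* Last holder is gone: read() now reports EOF. */
    after = read(data[0], &c, 1);
    close(data[0]);

    return before_is_eagain && after == 0;
}

/* Usage: on Linux the helper is expected to report 1. */
static void check_release_semantics(void)
{
    assert(release_waits_for_last_holder() == 1);
}
```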

> On 5.17
> fops_release is called automatically, as well as iommu_sva_unbind_device.
> On 5.18-rc1.
> fops_release is not called, have to manually call close(fd)

Right that's weird

> Since nginx may have a issue, it does not call close(fd) when nginx -s quit.

And you're sure that none of the processes are still alive or in zombie
state?  Just to cover every possibility.

Thanks,
Jean

[PATCH 0/2] dma-mapping, remoteproc: Fix dma_mem leak after rproc_shutdown

2022-04-22 Thread Mark-PK Tsai via iommu
Release DMA coherent memory before rvdev is freed in
rproc_rvdev_release().

Below is the kmemleak report:
unreferenced object 0xff8051c1a980 (size 128):
  comm "sh", pid 4895, jiffies 4295026604 (age 15481.896s)
  hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  backtrace:
[<3a0f3ec0>] dma_declare_coherent_memory+0x44/0x11c
[] rproc_add_virtio_dev+0xb8/0x20c
[] rproc_vdev_do_start+0x18/0x24
[] rproc_start+0x22c/0x3e0
[<0b938941>] rproc_boot+0x4a4/0x860
[<3c4dc532>] state_store.52856+0x10c/0x1b8
[] dev_attr_store+0x34/0x84
[<83a53bdb>] sysfs_kf_write+0x60/0xbc
[<8ed830df>] kernfs_fop_write+0x198/0x458
[<72b9ad06>] __vfs_write+0x50/0x210
[<377d7469>] vfs_write+0xe4/0x1a8
[] ksys_write+0x78/0x144
[<9aef6f4b>] __arm64_sys_write+0x1c/0x28
[<03496a98>] el0_svc_common+0xc8/0x22c
[] el0_svc_compat_handler+0x1c/0x28
[] el0_svc_compat+0x8/0x24

Mark-PK Tsai (2):
  dma-mapping: Add dma_release_coherent_memory to DMA API
  remoteproc: Fix dma_mem leak after rproc_shutdown

 drivers/remoteproc/remoteproc_core.c |  1 +
 include/linux/dma-map-ops.h  |  3 +++
 kernel/dma/coherent.c| 10 --
 3 files changed, 12 insertions(+), 2 deletions(-)

-- 
2.18.0



[PATCH 1/2] dma-mapping: Add dma_release_coherent_memory to DMA API

2022-04-22 Thread Mark-PK Tsai via iommu
Add dma_release_coherent_memory() to the DMA API so that DMA users
can call it to release dev->dma_mem when the device is
removed.
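
A rough userspace sketch of the wrapper shape this patch introduces: a
device-based, NULL-tolerant release function layered over an internal
helper that frees the memory descriptor. All names (coherent_mem,
dma_device_sketch, declare_coherent_memory(), mem_release_calls) are
hypothetical. Note one deliberate difference: the sketch clears
dev->dma_mem after release so that a second call is harmless, which the
posted diff does not do.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct dma_coherent_mem and struct device. */
struct coherent_mem {
    void *bitmap;
};

struct dma_device_sketch {
    struct coherent_mem *dma_mem;
};

static int mem_release_calls; /* counts real releases, for checking */

/* Internal helper: frees the descriptor, tolerates NULL. */
static void _release_coherent_memory(struct coherent_mem *mem)
{
    if (!mem)
        return;
    free(mem->bitmap);
    free(mem);
    mem_release_calls++;
}

/* Attach a freshly allocated descriptor to the device. */
static int declare_coherent_memory(struct dma_device_sketch *dev,
                                   size_t bitmap_bytes)
{
    struct coherent_mem *mem;

    assert(dev);
    if (dev->dma_mem)
        return -1; /* already declared */

    mem = calloc(1, sizeof(*mem));
    if (!mem)
        return -1;
    mem->bitmap = calloc(1, bitmap_bytes);
    if (!mem->bitmap) {
        free(mem);
        return -1;
    }
    dev->dma_mem = mem;
    return 0;
}

/* Public wrapper: takes the device, tolerates NULL. */
static void release_coherent_memory(struct dma_device_sketch *dev)
{
    if (!dev)
        return;
    _release_coherent_memory(dev->dma_mem);
    dev->dma_mem = NULL; /* sketch-only addition, not in the posted diff */
}
```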

Signed-off-by: Mark-PK Tsai 
---
 include/linux/dma-map-ops.h |  3 +++
 kernel/dma/coherent.c   | 10 --
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 0d5b06b3a4a6..53db9655efe9 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -166,6 +166,7 @@ static inline void dma_pernuma_cma_reserve(void) { }
 #ifdef CONFIG_DMA_DECLARE_COHERENT
 int dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
dma_addr_t device_addr, size_t size);
+void dma_release_coherent_memory(struct device *dev);
 int dma_alloc_from_dev_coherent(struct device *dev, ssize_t size,
dma_addr_t *dma_handle, void **ret);
 int dma_release_from_dev_coherent(struct device *dev, int order, void *vaddr);
@@ -177,6 +178,8 @@ static inline int dma_declare_coherent_memory(struct device *dev,
 {
return -ENOSYS;
 }
+
+#define dma_release_coherent_memory(dev) (0)
 #define dma_alloc_from_dev_coherent(dev, size, handle, ret) (0)
 #define dma_release_from_dev_coherent(dev, order, vaddr) (0)
 #define dma_mmap_from_dev_coherent(dev, vma, vaddr, order, ret) (0)
diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 375fb3c9538d..c21abc77c53e 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -74,7 +74,7 @@ static struct dma_coherent_mem *dma_init_coherent_memory(phys_addr_t phys_addr,
return ERR_PTR(-ENOMEM);
 }
 
-static void dma_release_coherent_memory(struct dma_coherent_mem *mem)
+static void _dma_release_coherent_memory(struct dma_coherent_mem *mem)
 {
if (!mem)
return;
@@ -126,10 +126,16 @@ int dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
 
ret = dma_assign_coherent_memory(dev, mem);
if (ret)
-   dma_release_coherent_memory(mem);
+   _dma_release_coherent_memory(mem);
return ret;
 }
 
+void dma_release_coherent_memory(struct device *dev)
+{
+   if (dev)
+   _dma_release_coherent_memory(dev->dma_mem);
+}
+
 static void *__dma_alloc_from_coherent(struct device *dev,
   struct dma_coherent_mem *mem,
   ssize_t size, dma_addr_t *dma_handle)
-- 
2.18.0



[PATCH 2/2] remoteproc: Fix dma_mem leak after rproc_shutdown

2022-04-22 Thread Mark-PK Tsai via iommu
Release DMA coherent memory before rvdev is freed in
rproc_rvdev_release().

Below is the kmemleak report:
unreferenced object 0xff8051c1a980 (size 128):
  comm "sh", pid 4895, jiffies 4295026604 (age 15481.896s)
  hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  backtrace:
[<3a0f3ec0>] dma_declare_coherent_memory+0x44/0x11c
[] rproc_add_virtio_dev+0xb8/0x20c
[] rproc_vdev_do_start+0x18/0x24
[] rproc_start+0x22c/0x3e0
[<0b938941>] rproc_boot+0x4a4/0x860
[<3c4dc532>] state_store.52856+0x10c/0x1b8
[] dev_attr_store+0x34/0x84
[<83a53bdb>] sysfs_kf_write+0x60/0xbc
[<8ed830df>] kernfs_fop_write+0x198/0x458
[<72b9ad06>] __vfs_write+0x50/0x210
[<377d7469>] vfs_write+0xe4/0x1a8
[] ksys_write+0x78/0x144
[<9aef6f4b>] __arm64_sys_write+0x1c/0x28
[<03496a98>] el0_svc_common+0xc8/0x22c
[] el0_svc_compat_handler+0x1c/0x28
[] el0_svc_compat+0x8/0x24

Signed-off-by: Mark-PK Tsai 
---
 drivers/remoteproc/remoteproc_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index c510125769b9..8e28290eefa9 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -461,6 +461,7 @@ static void rproc_rvdev_release(struct device *dev)
struct rproc_vdev *rvdev = container_of(dev, struct rproc_vdev, dev);
 
of_reserved_mem_device_release(dev);
+   dma_release_coherent_memory(dev);
 
kfree(rvdev);
 }
-- 
2.18.0



Re: Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-22 Thread zhangfei....@foxmail.com


Hi, Jean

On 2022/4/22 下午6:11, Jean-Philippe Brucker wrote:

On Fri, Apr 22, 2022 at 05:03:10PM +0800, zhangfei@foxmail.com wrote:
[...]

Have tested, still got some issue with our openssl-engine.

1. If openssl-engine does not register rsa, nginx works well.

2. If openssl-engine register rsa, nginx also works, but ioasid is not
freed when nginx stop.

IMPLEMENT_DYNAMIC_BIND_FN(bind_fn)
bind_fn
ENGINE_set_RSA(e, rsa_methods())

destroy_fn

If ENGINE_set_RSA is set, nginx start and stop will NOT call destroy_fn.
Even rsa_methods is almost new via RSA_meth_new.

In 5.18-rcx, this caused ioasid  not freed in nginx start and stop.
In 5.17, though destroy_fn is not called, but ioasid is freed when nginx
stop, so not noticed this issue before.

1. uacce_fops_release
In 5.16 or 5.17
In fact, we also have the issue: the openssl engine does not call destroy_fn ->
close(uacce_fd)
But system will automatically close all opened fd,
so uacce_fops_release is also called and free ioasid.

Have one experiment, not call close fd

log: open uacce fd but no close
[ 2583.471225]  dump_backtrace+0x0/0x1a0
[ 2583.474876]  show_stack+0x20/0x30
[ 2583.478178]  dump_stack_lvl+0x8c/0xb8
[ 2583.481825]  dump_stack+0x18/0x34
[ 2583.485126]  uacce_fops_release+0x44/0xdc
[ 2583.489117]  __fput+0x78/0x240
[ 2583.492159]  fput+0x18/0x28
[ 2583.495288]  task_work_run+0x88/0x160
[ 2583.498936]  do_notify_resume+0x214/0x490
[ 2583.502927]  el0_svc+0x58/0x70
[ 2583.505968]  el0t_64_sync_handler+0xb0/0xb8
[ 2583.510132]  el0t_64_sync+0x1a0/0x1a4
[ 2583.582292]  uacce_fops_release q=d6674128

In 5.18, since the refcount was added,
the opened uacce fd is not closed automatically by the system,
so we see the issue.

log: open uacce fd but no close
  [  106.360140]  uacce_fops_open q=ccc38d74
[  106.364929]  ioasid_alloc ioasid=1
[  106.368585]  iommu_sva_alloc_pasid pasid=1
[  106.372943]  iommu_sva_bind_device handle=6cca298a
// ioasid is not free

I'm trying to piece together what happens from the kernel point of view.

* master process with mm A opens a queue fd through uacce, which calls
   iommu_sva_bind_device(dev, A) -> PASID 1

* master forks and exits. Child (daemon) gets mm B, inherits the queue fd.
   The device is still bound to mm A with PASID 1, since the queue fd is
   still open.



We discussed this before, but I don't remember where we left off. The
child can't use the queue because its mappings are not copied on fork(),
and the queue is still bound to the parent mm A. The child either needs to
open a new queue or take ownership of the old one with a new uacce ioctl.

Yes, currently nginx aligned with the case.
Child process (worker process) reopen uacce,

Master process (do init) open uacce, iommu_sva_bind_device(dev, A) -> 
PASID 1

Master process fork Child (daemon) and exit.

Child (daemon)  does not use PASID 1 any more, only fork and manage 
worker process.

Worker process reopen uacce, iommu_sva_bind_device(dev, B) PASID 2

So it is expected.

Is that the "IMPLEMENT_DYNAMIC_BIND_FN()" you mention, something out of
tree?  This operation should unbind from A before binding to B, no?
Otherwise we leak PASID 1.

In 5.16 PASID 1 from master is hold until nginx service stop.
nginx start
master:
iommu_sva_alloc_pasid mm->pasid=1      // master process

lynx https start:
iommu_sva_alloc_pasid mm->pasid=2    //worker process

nginx stop:  from fops_release
iommu_sva_free_pasid mm->pasid=2   // worker process
iommu_sva_free_pasid mm->pasid=1  // master process


Have one silly question.

kernel driver
fops_open
iommu_sva_bind_device

fops_release
iommu_sva_unbind_device

application
main()
fd = open
return;

If the application exits without calling close(fd), is fops_release
expected to be called automatically by the system?


On 5.17
fops_release is called automatically, as well as iommu_sva_unbind_device.
On 5.18-rc1.
fops_release is not called, have to manually call close(fd)

Since nginx may have an issue, it does not call close(fd) when nginx -s quit.

Thanks




Re: [PATCH v2 2/4] iommu/vt-d: Set PGSNP bit in pasid table entry for SVA binding

2022-04-22 Thread Lu Baolu

On 2022/4/22 11:05, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Thursday, April 21, 2022 7:36 PM

This field make the requests snoop processor caches irrespective of
other attributes in the request or other fields in paging structure
entries used to translate the request.


I think you want to first point out the fact that SVA wants snoop
cache instead of just talking about the effect of PGSNP.

But thinking more I wonder why PGSNP is ever required. This is
similar to DMA API case. x86 is already cache coherent for normal
DMA (if not setting PCI no-snoop) and if the driver knows no-snoop
is incompatible to SVA API then it should avoid triggering no-snoop
traffic for SVA usage. In this case it is pointless for IOMMU driver
to enable force-snooping. Even in the future certain platform allows
no-snoop usage w/ SVA (I'm not sure how it works) this again should
be reflected by additional SVA APIs for driver to explicitly manage.

force-snoop should be enabled only in device assignment case IMHO,
orthogonal to whether vSVA is actually used.

Did I misunderstand the motivation here?


No, you didn't.

Let's talk with the arch guys for more details before move this patch
ahead. Thanks for pointing this out.

Best regards,
baolu





Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel/svm.c | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 23a38763c1d1..c720d1be992d 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -391,9 +391,12 @@ static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
}

/* Setup the pasid table: */
-   sflags = (flags & SVM_FLAG_SUPERVISOR_MODE) ?
-   PASID_FLAG_SUPERVISOR_MODE : 0;
-   sflags |= cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
+   sflags = PASID_FLAG_PAGE_SNOOP;
+   if (flags & SVM_FLAG_SUPERVISOR_MODE)
+   sflags |= PASID_FLAG_SUPERVISOR_MODE;
+   if (cpu_feature_enabled(X86_FEATURE_LA57))
+   sflags |= PASID_FLAG_FL5LP;
+
spin_lock_irqsave(&iommu->lock, iflags);
ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, mm->pasid,
FLPT_DEFAULT_DID, sflags);
--
2.25.1





Re: [PATCH v2 1/4] iommu/vt-d: Check before setting PGSNP bit in pasid table entry

2022-04-22 Thread Lu Baolu

On 2022/4/22 10:47, Tian, Kevin wrote:

From: Lu Baolu
Sent: Thursday, April 21, 2022 7:36 PM

The latest VT-d specification states that the PGSNP field in the pasid
table entry should be treated as Reserved(0) for implementations not
supporting Snoop Control (SC=0 in the Extended Capability Register).
This adds a check before setting the field.
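
The check described above can be sketched as follows. This is a
simplified userspace model, not the driver code: the flag value, the
pasid_entry_sketch layout and setup_first_level() are hypothetical, and
the position of the SC bit in the extended capability value (bit 7) is
an assumption that should be verified against the VT-d specification.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_FLAG_PAGE_SNOOP (1u << 0)   /* hypothetical flag value */
#define SKETCH_ECAP_SC         (1ull << 7) /* assumption: SC bit position */

/* Hypothetical, heavily reduced PASID table entry. */
struct pasid_entry_sketch {
    int present;
    int pgsnp;
};

/*
 * Set up an entry. If page snooping is requested but the hardware does
 * not advertise Snoop Control, leave the entry clear and fail instead
 * of writing a field that must be treated as Reserved(0).
 */
static int setup_first_level(uint64_t ecap, unsigned int flags,
                             struct pasid_entry_sketch *pte)
{
    assert(pte);

    pte->present = 0;
    pte->pgsnp = 0;

    if (flags & SKETCH_FLAG_PAGE_SNOOP) {
        if (!(ecap & SKETCH_ECAP_SC))
            return -EINVAL; /* entry stays cleared */
        pte->pgsnp = 1;
    }

    pte->present = 1;
    return 0;
}
```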

Signed-off-by: Lu Baolu
---
  drivers/iommu/intel/pasid.c | 13 ++---
  1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index f8d215d85695..5cb2daa2b8cb 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -625,8 +625,14 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
}
}

-   if (flags & PASID_FLAG_PAGE_SNOOP)
-   pasid_set_pgsnp(pte);
+   if (flags & PASID_FLAG_PAGE_SNOOP) {
+   if (ecap_sc_support(iommu->ecap)) {
+   pasid_set_pgsnp(pte);
+   } else {
+   pasid_clear_entry(pte);
+   return -EINVAL;
+   }
+   }

pasid_set_domain_id(pte, did);
pasid_set_address_width(pte, iommu->agaw);
@@ -710,7 +716,8 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
pasid_set_fault_enable(pte);
pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));

-   if (domain->domain.type == IOMMU_DOMAIN_UNMANAGED)
+   if (ecap_sc_support(iommu->ecap) &&
+   domain->domain.type == IOMMU_DOMAIN_UNMANAGED)
pasid_set_pgsnp(pte);


This should be rebased on top of Jason's enforce coherency series
instead of blindly setting it. No matter whether it's legacy mode
where we set SNP in PTE or scalable mode where we set PGSNP
in PASID entry for entire page table, the trigger point should be
same i.e. when someone calls enforce_cache_coherency().


With Jason's enforce coherency series merged, we even don't need to set
PGSNP bit of a pasid entry for second level translation. 2nd level
always supports SNP in PTEs, so set PGSNP in pasid table entry is
unnecessary.

Any thoughts?

Best regards,
baolu


Re: [PATCH] iommu: Use dev_iommu_ops() for probe_finalize

2022-04-22 Thread Lu Baolu

On 2022/4/22 20:33, Robin Murphy wrote:

The ->probe_finalize hook only runs after ->probe_device succeeds,
so we can move that over to the new dev_iommu_ops() as well.

Signed-off-by: Robin Murphy 


Reviewed-by: Lu Baolu 

Best regards,
baolu


---

Another cheeky little one which doesn't need to wait...

  drivers/iommu/iommu.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1b8dcda5fbe4..8825a4628e46 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -315,7 +315,7 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
  
  int iommu_probe_device(struct device *dev)

  {
-   const struct iommu_ops *ops = dev->bus->iommu_ops;
+   const struct iommu_ops *ops;
struct iommu_group *group;
int ret;
  
@@ -352,6 +352,7 @@ int iommu_probe_device(struct device *dev)

mutex_unlock(&group->mutex);
iommu_group_put(group);
  
+	ops = dev_iommu_ops(dev);

if (ops->probe_finalize)
ops->probe_finalize(dev);
  



[PATCH] iommu: Use dev_iommu_ops() for probe_finalize

2022-04-22 Thread Robin Murphy
The ->probe_finalize hook only runs after ->probe_device succeeds,
so we can move that over to the new dev_iommu_ops() as well.
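
A small userspace model of the ordering argument above. It assumes that
dev_iommu_ops() resolves the ops through per-device state which is only
populated once probing has succeeded; the struct and function names
below are hypothetical stand-ins and do not mirror the kernel
definitions.

```c
#include <assert.h>
#include <stddef.h>

struct probed_device;

/* Hypothetical reduced ops table with only the hook of interest. */
struct iommu_ops_sketch {
    void (*probe_finalize)(struct probed_device *dev);
};

struct iommu_device_sketch {
    const struct iommu_ops_sketch *ops;
};

/* Per-device IOMMU state, attached by a successful probe. */
struct dev_iommu_sketch {
    struct iommu_device_sketch *iommu_dev;
};

struct probed_device {
    struct dev_iommu_sketch *iommu;
};

static int finalize_calls; /* counts hook invocations, for checking */

static void count_finalize(struct probed_device *dev)
{
    (void)dev;
    finalize_calls++;
}

/* Probe: attaches the per-device state only if it is complete. */
static int probe_device(struct probed_device *dev,
                        struct dev_iommu_sketch *state)
{
    if (!state || !state->iommu_dev || !state->iommu_dev->ops)
        return -1;
    dev->iommu = state;
    return 0;
}

/* Ops lookup through the device; only valid after a successful probe. */
static const struct iommu_ops_sketch *dev_ops(struct probed_device *dev)
{
    assert(dev->iommu && dev->iommu->iommu_dev);
    return dev->iommu->iommu_dev->ops;
}

/* The lookup is deferred until probing is known to have succeeded. */
static int probe_and_finalize(struct probed_device *dev,
                              struct dev_iommu_sketch *state)
{
    const struct iommu_ops_sketch *ops;
    int ret = probe_device(dev, state);

    if (ret)
        return ret;

    ops = dev_ops(dev);
    if (ops->probe_finalize)
        ops->probe_finalize(dev);
    return 0;
}
```

Because the lookup happens after the early-return on probe failure, the
per-device pointer chain is never dereferenced for an unprobed device.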

Signed-off-by: Robin Murphy 
---

Another cheeky little one which doesn't need to wait...

 drivers/iommu/iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1b8dcda5fbe4..8825a4628e46 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -315,7 +315,7 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list
 
 int iommu_probe_device(struct device *dev)
 {
-   const struct iommu_ops *ops = dev->bus->iommu_ops;
+   const struct iommu_ops *ops;
struct iommu_group *group;
int ret;
 
@@ -352,6 +352,7 @@ int iommu_probe_device(struct device *dev)
mutex_unlock(>mutex);
iommu_group_put(group);
 
+   ops = dev_iommu_ops(dev);
if (ops->probe_finalize)
ops->probe_finalize(dev);
 
-- 
2.35.3.dirty
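
The ordering argument in this patch can be sketched in plain C outside the
kernel. The structs below are simplified stand-ins rather than the kernel
definitions, the body of dev_iommu_ops() is an assumption about how the
helper resolves the ops, and probe_then_finalize() and mark_finalized() are
hypothetical names. The only point is that the lookup is safe once a
successful probe has attached the per-device IOMMU data.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; every field is hypothetical. */
struct device;

struct iommu_ops {
	int  (*probe_device)(struct device *dev);
	void (*probe_finalize)(struct device *dev);
};

struct iommu_device { const struct iommu_ops *ops; };
struct dev_iommu    { struct iommu_device *iommu_dev; };

struct device {
	struct dev_iommu *iommu;	/* NULL until probing succeeds */
	int finalized;
};

/* Assumed shape of the helper: only meaningful after a successful probe. */
static const struct iommu_ops *dev_iommu_ops(struct device *dev)
{
	return dev->iommu->iommu_dev->ops;
}

/* Same order as the patch: probe first, then look up ops, then finalize. */
static int probe_then_finalize(struct device *dev, struct dev_iommu *param,
			       int probe_ret)
{
	const struct iommu_ops *ops;

	if (probe_ret)
		return probe_ret;	/* probe failed: dev->iommu is never touched */
	dev->iommu = param;

	ops = dev_iommu_ops(dev);
	if (ops->probe_finalize)
		ops->probe_finalize(dev);
	return 0;
}

/* Hypothetical finalize hook that just records that it ran. */
static void mark_finalized(struct device *dev)
{
	dev->finalized = 1;
}
```

Looking the ops up late means the failure path never dereferences
dev->iommu, which is the property the commit message relies on.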



[GIT PULL] iommu/arm-smmu: Fixes for 5.18

2022-04-22 Thread Will Deacon
Hi Joerg,

Unusually, we've got some SMMU driver fixes this time around. Summary in
the tag -- please can you pull these for 5.18?

Cheers,

Will

--->8

The following changes since commit 3123109284176b1532874591f7c81f3837bbdc17:

  Linux 5.18-rc1 (2022-04-03 14:08:21 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git tags/arm-smmu-fixes

for you to fetch changes up to 4a25f2ea0e030b2fc852c4059a50181bfc5b2f57:

  iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu (2022-04-22 11:21:30 +0100)


Arm SMMU fixes for 5.18

- Fix off-by-one in SMMUv3 SVA TLB invalidation

- Disable large mappings to work around an NVIDIA erratum


Ashish Mhetre (1):
  iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

Nicolin Chen (1):
  iommu/arm-smmu-v3: Fix size calculation in arm_smmu_mm_invalidate_range()

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c |  9 +++-
 drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c    | 30 +
 2 files changed, 38 insertions(+), 1 deletion(-)


Re: [Patch v2] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

2022-04-22 Thread Will Deacon
On Thu, 21 Apr 2022 13:45:04 +0530, Ashish Mhetre wrote:
> Tegra194 and Tegra234 SoCs have an erratum that causes walk cache
> entries to not be invalidated correctly. The problem is that the walk
> cache index generated for an IOVA is not the same across translation and
> invalidation requests. This leads to page faults when a PMD entry is
> released during unmap and populated with a new PTE table during a
> subsequent map request. Disabling large page mappings avoids the release
> of the PMD entry and keeps translations from seeing a stale PMD entry in
> the walk cache. Fix this by limiting the page mappings to PAGE_SIZE for
> Tegra194 and Tegra234 devices. This is the fix recommended by the Tegra
> hardware design team.
> 
> [...]
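
As a rough illustration of "limiting the page mappings to PAGE_SIZE", the
sketch below masks a bitmap of supported mapping sizes down to the base
page size. It is not the applied patch: the SZ_* values assume a 4K
granule, and restrict_to_page_size() is a hypothetical helper name.

```c
/* Assumed encoding: bit n set means a 2^n-byte mapping size is supported. */
#define SZ_4K	(1UL << 12)
#define SZ_2M	(1UL << 21)
#define SZ_1G	(1UL << 30)

/*
 * Keep only the base page size, so no block (PMD-level) mappings are ever
 * created and no PMD entry has to be released and repopulated later.
 */
static unsigned long restrict_to_page_size(unsigned long pgsize_bitmap,
					   unsigned long page_size)
{
	return pgsize_bitmap & page_size;
}
```

With only the base page size advertised, every mapping goes through a PTE
table, which is what keeps the stale walk cache entry from mattering.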

Applied to will (for-joerg/arm-smmu/fixes), thanks!

[1/1] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu
  https://git.kernel.org/will/c/4a25f2ea0e03

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-22 Thread Jean-Philippe Brucker
On Fri, Apr 22, 2022 at 05:03:10PM +0800, zhangfei@foxmail.com wrote:
[...]
> > I have tested it and still see an issue with our openssl-engine.
> > 
> > 1. If openssl-engine does not register rsa, nginx works well.
> > 
> > 2. If openssl-engine registers rsa, nginx also works, but the ioasid is
> > not freed when nginx stops.
> > 
> > IMPLEMENT_DYNAMIC_BIND_FN(bind_fn)
> > bind_fn
> > ENGINE_set_RSA(e, rsa_methods())
> > 
> > destroy_fn
> > 
> > If ENGINE_set_RSA is set, nginx start and stop will NOT call destroy_fn.
> > Even rsa_methods is almost new via RSA_meth_new.
> > 
> > In 5.18-rcx, this causes the ioasid to not be freed across nginx start
> > and stop.
> > In 5.17, although destroy_fn is not called, the ioasid is freed when
> > nginx stops, so we did not notice this issue before.
> 
> 1. uacce_fops_release
> In 5.16 or 5.17
> In fact, we also have the issue there: the openssl engine does not call
> destroy_fn -> close(uacce_fd).
> But the system automatically closes all opened fds,
> so uacce_fops_release is still called and frees the ioasid.
> 
> As an experiment, I did not call close on the fd:
> 
> log: open uacce fd but no close
> [ 2583.471225]  dump_backtrace+0x0/0x1a0
> [ 2583.474876]  show_stack+0x20/0x30
> [ 2583.478178]  dump_stack_lvl+0x8c/0xb8
> [ 2583.481825]  dump_stack+0x18/0x34
> [ 2583.485126]  uacce_fops_release+0x44/0xdc
> [ 2583.489117]  __fput+0x78/0x240
> [ 2583.492159]  fput+0x18/0x28
> [ 2583.495288]  task_work_run+0x88/0x160
> [ 2583.498936]  do_notify_resume+0x214/0x490
> [ 2583.502927]  el0_svc+0x58/0x70
> [ 2583.505968]  el0t_64_sync_handler+0xb0/0xb8
> [ 2583.510132]  el0t_64_sync+0x1a0/0x1a4
> [ 2583.582292]  uacce_fops_release q=d6674128
> 
> In 5.18, since the refcount was added,
> the opened uacce fd is not closed automatically by the system,
> so we see the issue.
> 
> log: open uacce fd but no close
> [  106.360140]  uacce_fops_open q=ccc38d74
> [  106.364929]  ioasid_alloc ioasid=1
> [  106.368585]  iommu_sva_alloc_pasid pasid=1
> [  106.372943]  iommu_sva_bind_device handle=6cca298a
> // ioasid is not freed

I'm trying to piece together what happens from the kernel point of view.

* master process with mm A opens a queue fd through uacce, which calls
  iommu_sva_bind_device(dev, A) -> PASID 1

* master forks and exits. Child (daemon) gets mm B, inherits the queue fd.
  The device is still bound to mm A with PASID 1, since the queue fd is
  still open.

We discussed this before, but I don't remember where we left off. The
child can't use the queue because its mappings are not copied on fork(),
and the queue is still bound to the parent mm A. The child either needs to
open a new queue or take ownership of the old one with a new uacce ioctl.
Is that the "IMPLEMENT_DYNAMIC_BIND_FN()" you mention, something out of
tree?  This operation should unbind from A before binding to B, no?
Otherwise we leak PASID 1.
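
The sequence above can be modelled in a few lines of plain C. This is only
a sketch under stated assumptions: struct queue_file and the queue_*()
helpers are hypothetical names, mm A is represented by the id 1, and the
release step is reduced to clearing two fields.

```c
#include <stdbool.h>

/* Hypothetical model: one queue file shared by every process holding its fd. */
struct queue_file {
	int fd_refs;	/* open references to the queue fd */
	int bound_mm;	/* id of the mm the device is bound to, 0 if none */
	int pasid;	/* PASID held by the binding, 0 if none */
};

/* The master opens the queue: bind the device to its mm and record the PASID. */
static void queue_open(struct queue_file *q, int mm, int pasid)
{
	q->fd_refs = 1;
	q->bound_mm = mm;
	q->pasid = pasid;
}

/* fork(): the child inherits the fd, but the binding still names the parent mm. */
static void queue_fork(struct queue_file *q)
{
	q->fd_refs++;
}

/* close() or process exit: the release path runs only on the last reference. */
static void queue_put(struct queue_file *q)
{
	if (--q->fd_refs == 0) {
		q->bound_mm = 0;
		q->pasid = 0;
	}
}

static bool pasid_in_use(const struct queue_file *q)
{
	return q->pasid != 0;
}
```

In this model the master's exit drops only one of two references, so the
binding to mm A and its PASID outlive the master for as long as the daemon
keeps the inherited fd open.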

Thanks,
Jean

Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-04-22 Thread zhangfei....@foxmail.com

Hi, Jean

On 2022/4/21 2:47 PM, zhangfei@foxmail.com wrote:

On 2022/4/21 12:45 AM, Jean-Philippe Brucker wrote:

Hi,

On Fri, Apr 15, 2022 at 02:51:08AM -0700, Fenghua Yu wrote:

 From a6444e1e5bd8076f5e5c5e950d3192de327f0c9c Mon Sep 17 00:00:00 2001
From: Fenghua Yu 
Date: Fri, 15 Apr 2022 00:51:33 -0700
Subject: [RFC PATCH] iommu/sva: Fix PASID use-after-free issue

A PASID might be still used even though it is freed on mm exit.

process A:
	sva_bind();
	ioasid_alloc() = N; // Get PASID N for the mm
	fork(); // spawn process B
	exit();
	ioasid_free(N);

process B:
	device uses PASID N -> failure
	sva_unbind();

Dave Hansen suggests taking a refcount on the mm whenever the PASID is
bound to a device and dropping the refcount on unbind. The mm won't be
dropped if the PASID is still bound to it.
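
A minimal sketch of that refcounting idea, in plain C rather than kernel
code: struct mm, mm_get(), mm_put(), sva_bind() and sva_unbind() here are
hypothetical stand-ins, and freeing the PASID is modelled as clearing a
field when the last user goes away.

```c
/* Hypothetical model: the PASID lives exactly as long as the mm has users. */
struct mm {
	int users;	/* models mm_users */
	int pasid;	/* 0 means no PASID is allocated */
};

static void mm_get(struct mm *mm)
{
	mm->users++;
}

static void mm_put(struct mm *mm)
{
	if (--mm->users == 0)
		mm->pasid = 0;	/* models freeing the PASID on mm exit */
}

/* bind: record the PASID and pin the mm while the device may still use it. */
static int sva_bind(struct mm *mm, int pasid)
{
	mm->pasid = pasid;
	mm_get(mm);
	return mm->pasid;
}

/* unbind: drop the pin; the PASID goes away only if this was the last user. */
static void sva_unbind(struct mm *mm)
{
	mm_put(mm);
}
```

With the extra reference held by the binding, the exit of process A no
longer frees PASID N underneath the device; the free is deferred until the
unbind.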

Fixes: 701fac40384f ("iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit")
Reported-by: Zhangfei Gao 
Suggested-by: Dave Hansen 
Signed-off-by: Fenghua Yu 
---
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 6 ++
  drivers/iommu/intel/svm.c   | 4 
  2 files changed, 10 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 22ddd05bbdcd..3fcb842a0df0 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "arm-smmu-v3.h"
 #include "../../iommu-sva-lib.h"
@@ -363,6 +364,9 @@ arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm, void *drvdata)
 
 	mutex_lock(&sva_lock);
 	handle = __arm_smmu_sva_bind(dev, mm);
+	/* Take an mm refcount on a successful bind. */
+	if (!IS_ERR(handle))
+		mmget(mm);
 	mutex_unlock(&sva_lock);
 	return handle;
 }
@@ -372,6 +376,8 @@ void arm_smmu_sva_unbind(struct iommu_sva *handle)
 	struct arm_smmu_bond *bond = sva_to_bond(handle);
 
 	mutex_lock(&sva_lock);
+	/* Drop an mm refcount. */
+	mmput(bond->mm);

I do like the idea because it will simplify the driver. We can't call
mmput() here, though, because it may call the release() MMU notifier which
will try to grab sva_lock, already held.
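
The locking problem can be shown with a small model in plain C. Everything
here is a hypothetical stand-in: the lock is a flag that records a second
acquisition instead of hanging, mm_put() plays the role of mmput(), and
release_notifier() plays the role of the release() MMU notifier.

```c
#include <stdbool.h>

/* Hypothetical model of a non-recursive lock: taking it twice would deadlock. */
static bool sva_lock_held;
static bool deadlock_detected;

static void lock_sva(void)
{
	if (sva_lock_held)
		deadlock_detected = true;	/* a real mutex would hang here */
	sva_lock_held = true;
}

static void unlock_sva(void)
{
	sva_lock_held = false;
}

struct mm { int users; };

/* Stand-in for the release() notifier, which needs the lock itself. */
static void release_notifier(void)
{
	lock_sva();
	unlock_sva();
}

/* Stand-in for mmput(): dropping the last user runs the notifier. */
static void mm_put(struct mm *mm)
{
	if (--mm->users == 0)
		release_notifier();
}

/* Problematic ordering: drop the reference while still holding the lock. */
static void unbind_put_inside(struct mm *mm)
{
	lock_sva();
	mm_put(mm);
	unlock_sva();
}

/* Safer ordering: finish the locked section first, then drop the reference. */
static void unbind_put_after(struct mm *mm)
{
	lock_sva();
	/* ... tear down the bond under the lock ... */
	unlock_sva();
	mm_put(mm);
}
```

Only the first ordering can re-enter the lock, and only when the put
happens to be the last one, which is why the problem is easy to miss in
testing.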

I also found another use-after-free in arm_smmu_free_shared_cd(), where we
call arm64_mm_context_put() when the mm could already be freed. There used
to be an mmgrab() preventing this but it went away during a rewrite.

To fix both we could just move mmput() to the end of unbind(), but I'd
rather do a proper cleanup, removing the release() notifier right away.
Zhangfei, could you try the patch below?

Thanks,
Jean

--- 8< ---

 From 4e09c0d71dfb35fc90915bd1e36545027fbf8a03 Mon Sep 17 00:00:00 2001
From: Jean-Philippe Brucker 
Date: Wed, 20 Apr 2022 10:19:24 +0100
Subject: [PATCH] iommu/arm-smmu-v3-sva: Fix PASID and mm use-after-free issues

Commit 701fac40384f ("iommu/sva: Assign a PASID to mm on PASID
allocation and free it on mm exit") frees the PASID earlier than what
the SMMUv3 driver expects. At the moment the SMMU driver handles mm exit
in the release() MMU notifier by quiescing the context descriptor. The
context descriptor is only made invalid in unbind(), after the device
driver ensured the PASID is not used anymore. Releasing the PASID on mm
exit may cause it to be reallocated while it is still used by the
context descriptor.

There is another use-after-free, present since the beginning, where we
call arm64_mm_context_put() without a guarantee that mm_count is held.

Dave Hansen suggests grabbing mm_users whenever binding the mm to a
device and dropping it on unbind. With that we can fix both issues and
simplify the driver by removing the release() notifier.

Fixes: 32784a9562fb ("iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()")
Reported-by: Zhangfei Gao 
Suggested-by: Dave Hansen 
Signed-off-by: Fenghua Yu 
Signed-off-by: Jean-Philippe Brucker 
---
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  1 -
  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 49 +--
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 14 +-
  3 files changed, 15 insertions(+), 49 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index cd48590ada30..d50d215d946c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -735,7 +735,6 @@ static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
 
 extern struct xarray arm_smmu_asid_xa;
 extern struct mutex arm_smmu_asid_lock;
-extern struct arm_smmu_ctx_desc quiet_cd;
 
 int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
 			    struct arm_smmu_ctx_desc *cd);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 22ddd05bbdcd..f9dff0f6cdd4 100644
index 22ddd05bbdcd..f9dff0f6cdd4 100644
---