Re: [Patch v2] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

2022-04-26 Thread Jon Hunter via iommu

Hi Will,

On 22/04/2022 11:55, Will Deacon wrote:

On Thu, 21 Apr 2022 13:45:04 +0530, Ashish Mhetre wrote:

Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
entries to not be invalidated correctly. The problem is that the walk
cache index generated for IOVA is not same across translation and
invalidation requests. This is leading to page faults when PMD entry is
released during unmap and populated with new PTE table during subsequent
map request. Disabling large page mappings avoids the release of PMD
entry and avoid translations seeing stale PMD entry in walk cache.
Fix this by limiting the page mappings to PAGE_SIZE for Tegra194 and
Tegra234 devices. This is recommended fix from Tegra hardware design
team.

[...]


Applied to will (for-joerg/arm-smmu/fixes), thanks!

[1/1] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu
   https://git.kernel.org/will/c/4a25f2ea0e03



Thanks for applying. Sorry to be late to the party, but feel free
to add my ...

Reviewed-by: Jon Hunter 

Also any chance we could tag for stable? Probably the most
appropriate fixes-tag would be ...

Fixes: aab5a1c88276 ("iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 
usage")

Thanks!
Jon

--
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Patch v1] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

2022-04-19 Thread Jon Hunter via iommu



On 17/04/2022 10:04, Ashish Mhetre wrote:

Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
entries to not be invalidated correctly. The problem is that the walk
cache index generated for IOVA is not same across translation and
invalidation requests. This is leading to page faults when PMD entry is
released during unmap and populated with new PTE table during subsequent
map request. Disabling large page mappings avoids the release of PMD
entry and avoid translations seeing stale PMD entry in walk cache.
Fix this by limiting the page mappings to PAGE_SIZE for Tegra194 and
Tegra234 devices. This is recommended fix from Tegra hardware design
team.

Co-developed-by: Pritesh Raithatha 
Signed-off-by: Pritesh Raithatha 
Signed-off-by: Ashish Mhetre 
---
  drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c | 23 
  drivers/iommu/arm/arm-smmu/arm-smmu.c|  3 +++
  drivers/iommu/arm/arm-smmu/arm-smmu.h|  1 +
  3 files changed, 27 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
index 01e9b50b10a1..b7a3d06da2f4 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
@@ -258,6 +258,27 @@ static void nvidia_smmu_probe_finalize(struct 
arm_smmu_device *smmu, struct devi
dev_name(dev), err);
  }
  
+static void nvidia_smmu_cfg_pgsize_bitmap(struct arm_smmu_device *smmu)

+{
+   const struct device_node *np = smmu->dev->of_node;
+
+   /*
+* Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
+* entries to not be invalidated correctly. The problem is that the walk
+* cache index generated for IOVA is not same across translation and
+* invalidation requests. This is leading to page faults when PMD entry
+* is released during unmap and populated with new PTE table during
+* subsequent map request. Disabling large page mappings avoids the
+* release of PMD entry and avoid translations seeing stale PMD entry in
+* walk cache.
+* Fix this by limiting the page mappings to PAGE_SIZE on Tegra194 and
+* Tegra234.
+*/
+   if (of_device_is_compatible(np, "nvidia,tegra234-smmu") ||
+   of_device_is_compatible(np, "nvidia,tegra194-smmu"))
+   smmu->pgsize_bitmap = PAGE_SIZE;
+}
+
  static const struct arm_smmu_impl nvidia_smmu_impl = {
.read_reg = nvidia_smmu_read_reg,
.write_reg = nvidia_smmu_write_reg,
@@ -268,10 +289,12 @@ static const struct arm_smmu_impl nvidia_smmu_impl = {
.global_fault = nvidia_smmu_global_fault,
.context_fault = nvidia_smmu_context_fault,
.probe_finalize = nvidia_smmu_probe_finalize,
+   .cfg_pgsize_bitmap = nvidia_smmu_cfg_pgsize_bitmap,
  };
  
  static const struct arm_smmu_impl nvidia_smmu_single_impl = {

.probe_finalize = nvidia_smmu_probe_finalize,
+   .cfg_pgsize_bitmap = nvidia_smmu_cfg_pgsize_bitmap,
  };
  
  struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 568cce590ccc..3692a19a588a 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1872,6 +1872,9 @@ static int arm_smmu_device_cfg_probe(struct 
arm_smmu_device *smmu)
if (smmu->features & ARM_SMMU_FEAT_FMT_AARCH64_64K)
smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
  
+	if (smmu->impl && smmu->impl->cfg_pgsize_bitmap)

+   smmu->impl->cfg_pgsize_bitmap(smmu);
+
if (arm_smmu_ops.pgsize_bitmap == -1UL)
arm_smmu_ops.pgsize_bitmap = smmu->pgsize_bitmap;
else
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 2b9b42fb6f30..5d9b03024969 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -442,6 +442,7 @@ struct arm_smmu_impl {
void (*write_s2cr)(struct arm_smmu_device *smmu, int idx);
void (*write_sctlr)(struct arm_smmu_device *smmu, int idx, u32 reg);
void (*probe_finalize)(struct arm_smmu_device *smmu, struct device 
*dev);
+   void (*cfg_pgsize_bitmap)(struct arm_smmu_device *smmu);
  };
  
  #define INVALID_SMENDX			-1



Reviewed-by: Jon Hunter 
Tested-by: Jon Hunter 

Thanks!
Jon

--
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/8] Host1x context isolation support

2021-12-17 Thread Jon Hunter via iommu



On 14/12/2021 15:38, Robin Murphy wrote:

...

IOMMU/DT folks, any thoughts about this approach? The patches that are 
of interest outside of Host1x/TegraDRM specifics are patches 1, 2, 4, 
and 5.


FWIW it looks fairly innocuous to me. I don't understand host1x - 
neither hardware nor driver abstractions - well enough to meaningfully 
review it all (e.g. maybe it's deliberate that the bus .dma_configure 
method isn't used?), but the SMMU patch seems fine given the Kconfig 
solution to avoid module linkage problems.



Thanks Robin! Will, Joerg, is OK with you?

Mikko, I believe we are missing a dt-binding change to document the 
'memory-contexts' node which I assume you will add if everyone is OK 
with this?


Cheers
Jon

--
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/8] Host1x context isolation support

2021-12-14 Thread Jon Hunter via iommu

Hi all,

Still no response on this :-(

On 06/12/2021 09:55, Jon Hunter wrote:

Will, Joerg, Rob,

On 08/11/2021 10:36, Mikko Perttunen wrote:

On 9/16/21 5:32 PM, Mikko Perttunen wrote:

Hi all,

***
New in v2:

Added support for Tegra194
Use standard iommu-map property instead of custom mechanism
***

this series adds support for Host1x 'context isolation'. Since
when programming engines through Host1x, userspace can program in
any addresses it wants, we need some way to isolate the engines'
memory spaces. Traditionally this has either been done imperfectly
with a single shared IOMMU domain, or by copying and verifying the
programming command stream at submit time (Host1x firewall).

Since Tegra186 there is a privileged (only usable by kernel)
Host1x opcode that allows setting the stream ID sent by the engine
to the SMMU. So, by allocating a number of context banks and stream
IDs for this purpose, and using this opcode at the beginning of
each job, we can implement isolation. Due to the limited number of
context banks only each process gets its own context, and not
each channel.

This feature also allows sharing engines among multiple VMs when
used with Host1x's hardware virtualization support - up to 8 VMs
can be configured with a subset of allowed stream IDs, enforced
at hardware level.

To implement this, this series adds a new host1x context bus, which
will contain the 'struct device's corresponding to each context
bank / stream ID, changes to device tree and SMMU code to allow
registering the devices and using the bus, as well as the Host1x
stream ID programming code and support in TegraDRM.

Device tree bindings are not updated yet pending consensus that the
proposed changes make sense.

Thanks,
Mikko

Mikko Perttunen (8):
   gpu: host1x: Add context bus
   gpu: host1x: Add context device management code
   gpu: host1x: Program context stream ID on submission
   iommu/arm-smmu: Attach to host1x context device bus
   arm64: tegra: Add Host1x context stream IDs on Tegra186+
   drm/tegra: falcon: Set DMACTX field on DMA transactions
   drm/tegra: vic: Implement get_streamid_offset
   drm/tegra: Support context isolation

  arch/arm64/boot/dts/nvidia/tegra186.dtsi  |  12 ++
  arch/arm64/boot/dts/nvidia/tegra194.dtsi  |  12 ++
  drivers/gpu/Makefile  |   3 +-
  drivers/gpu/drm/tegra/drm.h   |   2 +
  drivers/gpu/drm/tegra/falcon.c    |   8 +
  drivers/gpu/drm/tegra/falcon.h    |   1 +
  drivers/gpu/drm/tegra/submit.c    |  13 ++
  drivers/gpu/drm/tegra/uapi.c  |  34 -
  drivers/gpu/drm/tegra/vic.c   |  38 +
  drivers/gpu/host1x/Kconfig    |   5 +
  drivers/gpu/host1x/Makefile   |   2 +
  drivers/gpu/host1x/context.c  | 174 ++
  drivers/gpu/host1x/context.h  |  27 
  drivers/gpu/host1x/context_bus.c  |  31 
  drivers/gpu/host1x/dev.c  |  12 +-
  drivers/gpu/host1x/dev.h  |   2 +
  drivers/gpu/host1x/hw/channel_hw.c    |  52 ++-
  drivers/gpu/host1x/hw/host1x06_hardware.h |  10 ++
  drivers/gpu/host1x/hw/host1x07_hardware.h |  10 ++
  drivers/iommu/arm/arm-smmu/arm-smmu.c |  13 ++
  include/linux/host1x.h    |  21 +++
  include/linux/host1x_context_bus.h    |  15 ++
  22 files changed, 488 insertions(+), 9 deletions(-)
  create mode 100644 drivers/gpu/host1x/context.c
  create mode 100644 drivers/gpu/host1x/context.h
  create mode 100644 drivers/gpu/host1x/context_bus.c
  create mode 100644 include/linux/host1x_context_bus.h



IOMMU/DT folks, any thoughts about this approach? The patches that are 
of interest outside of Host1x/TegraDRM specifics are patches 1, 2, 4, 
and 5.



Any feedback on this?

Jon



--
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 0/8] Host1x context isolation support

2021-12-06 Thread Jon Hunter via iommu

Will, Joerg, Rob,

On 08/11/2021 10:36, Mikko Perttunen wrote:

On 9/16/21 5:32 PM, Mikko Perttunen wrote:

Hi all,

***
New in v2:

Added support for Tegra194
Use standard iommu-map property instead of custom mechanism
***

this series adds support for Host1x 'context isolation'. Since
when programming engines through Host1x, userspace can program in
any addresses it wants, we need some way to isolate the engines'
memory spaces. Traditionally this has either been done imperfectly
with a single shared IOMMU domain, or by copying and verifying the
programming command stream at submit time (Host1x firewall).

Since Tegra186 there is a privileged (only usable by kernel)
Host1x opcode that allows setting the stream ID sent by the engine
to the SMMU. So, by allocating a number of context banks and stream
IDs for this purpose, and using this opcode at the beginning of
each job, we can implement isolation. Due to the limited number of
context banks only each process gets its own context, and not
each channel.

This feature also allows sharing engines among multiple VMs when
used with Host1x's hardware virtualization support - up to 8 VMs
can be configured with a subset of allowed stream IDs, enforced
at hardware level.

To implement this, this series adds a new host1x context bus, which
will contain the 'struct device's corresponding to each context
bank / stream ID, changes to device tree and SMMU code to allow
registering the devices and using the bus, as well as the Host1x
stream ID programming code and support in TegraDRM.

Device tree bindings are not updated yet pending consensus that the
proposed changes make sense.

Thanks,
Mikko

Mikko Perttunen (8):
   gpu: host1x: Add context bus
   gpu: host1x: Add context device management code
   gpu: host1x: Program context stream ID on submission
   iommu/arm-smmu: Attach to host1x context device bus
   arm64: tegra: Add Host1x context stream IDs on Tegra186+
   drm/tegra: falcon: Set DMACTX field on DMA transactions
   drm/tegra: vic: Implement get_streamid_offset
   drm/tegra: Support context isolation

  arch/arm64/boot/dts/nvidia/tegra186.dtsi  |  12 ++
  arch/arm64/boot/dts/nvidia/tegra194.dtsi  |  12 ++
  drivers/gpu/Makefile  |   3 +-
  drivers/gpu/drm/tegra/drm.h   |   2 +
  drivers/gpu/drm/tegra/falcon.c    |   8 +
  drivers/gpu/drm/tegra/falcon.h    |   1 +
  drivers/gpu/drm/tegra/submit.c    |  13 ++
  drivers/gpu/drm/tegra/uapi.c  |  34 -
  drivers/gpu/drm/tegra/vic.c   |  38 +
  drivers/gpu/host1x/Kconfig    |   5 +
  drivers/gpu/host1x/Makefile   |   2 +
  drivers/gpu/host1x/context.c  | 174 ++
  drivers/gpu/host1x/context.h  |  27 
  drivers/gpu/host1x/context_bus.c  |  31 
  drivers/gpu/host1x/dev.c  |  12 +-
  drivers/gpu/host1x/dev.h  |   2 +
  drivers/gpu/host1x/hw/channel_hw.c    |  52 ++-
  drivers/gpu/host1x/hw/host1x06_hardware.h |  10 ++
  drivers/gpu/host1x/hw/host1x07_hardware.h |  10 ++
  drivers/iommu/arm/arm-smmu/arm-smmu.c |  13 ++
  include/linux/host1x.h    |  21 +++
  include/linux/host1x_context_bus.h    |  15 ++
  22 files changed, 488 insertions(+), 9 deletions(-)
  create mode 100644 drivers/gpu/host1x/context.c
  create mode 100644 drivers/gpu/host1x/context.h
  create mode 100644 drivers/gpu/host1x/context_bus.c
  create mode 100644 include/linux/host1x_context_bus.h



IOMMU/DT folks, any thoughts about this approach? The patches that are 
of interest outside of Host1x/TegraDRM specifics are patches 1, 2, 4, 
and 5.



Any feedback on this?

Jon

--
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 2/2] arm64: tegra: Add ISO SMMU controller for Tegra194

2021-12-01 Thread Jon Hunter via iommu
The display controllers are attached to a separate ARM SMMU instance
that is dedicated to servicing isochronous memory clients. Add this ISO
instance of the ARM SMMU to device tree.

Please note that the display controllers are not hooked up to this SMMU
yet, because we are still missing a means to transition framebuffers
used by the bootloader to the kernel.

This based upon an initial patch by Thierry Reding .

Signed-off-by: Jon Hunter 
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 76 
 1 file changed, 76 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index d78c9ed08c47..496e31b5c637 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -1474,6 +1474,82 @@ pmc: pmc@c36 {
interrupt-controller;
};
 
+   iommu@1000 {
+   compatible = "nvidia,tegra194-smmu", "nvidia,smmu-500";
+   reg = <0x1000 0x80>;
+   interrupts = ,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+;
+   stream-match-mask = <0x7f80>;
+   #global-interrupts = <1>;
+   #iommu-cells = <1>;
+
+   nvidia,memory-controller = <>;
+   status = "okay";
+   };
+
smmu: iommu@1200 {
compatible = "nvidia,tegra194-smmu", "nvidia,smmu-500";
reg = <0x1200 0x80>,
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/2] dt-bindings: arm-smmu: Fix json-schema for Tegra

2021-12-01 Thread Jon Hunter via iommu
The dt_binding_check currently issues the following warnings for the
Tegra186 and Tegra194 SMMUs ...

 arch/arm64/boot/dts/nvidia/tegra186-p2771-.dt.yaml: iommu@1200:
  'nvidia,memory-controller' does not match any of the regexes: 'pinctrl-[0-9]+'
From schema: Documentation/devicetree/bindings/iommu/arm,smmu.yaml
  DTC arch/arm64/boot/dts/nvidia/tegra186-p3509-+p3636-0001.dt.yaml
  CHECK   arch/arm64/boot/dts/nvidia/tegra186-p3509-+p3636-0001.dt.yaml

Add the missing 'nvidia,memory-controller' property to fix the above
warning.

Signed-off-by: Jon Hunter 
---
 Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index f66a3effba73..5dce07b12cd7 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -155,6 +155,10 @@ properties:
   power-domains:
 maxItems: 1
 
+  nvidia,memory-controller:
+$ref: /schemas/types.yaml#/definitions/phandle
+description: phandle of the memory controller node
+
 required:
   - compatible
   - reg
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v10 5/5] iommu/arm-smmu: Add global/context fault implementation hooks

2020-07-08 Thread Jon Hunter
  }
> + }
> +
> + return ret;
> +}
> +
>  static const struct arm_smmu_impl nvidia_smmu_impl = {
>   .read_reg = nvidia_smmu_read_reg,
>   .write_reg = nvidia_smmu_write_reg,
> @@ -134,6 +231,8 @@ static const struct arm_smmu_impl nvidia_smmu_impl = {
>   .write_reg64 = nvidia_smmu_write_reg64,
>   .reset = nvidia_smmu_reset,
>   .tlb_sync = nvidia_smmu_tlb_sync,
> + .global_fault = nvidia_smmu_global_fault,
> + .context_fault = nvidia_smmu_context_fault,
>  };
>  
>  struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index c123a5814f70..020afddfaa0f 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -670,6 +670,7 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   enum io_pgtable_fmt fmt;
>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>   struct arm_smmu_cfg *cfg = _domain->cfg;
> + irqreturn_t (*context_fault)(int irq, void *dev);
>  
>   mutex_lock(_domain->init_mutex);
>   if (smmu_domain->smmu)
> @@ -832,7 +833,13 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>* handler seeing a half-initialised domain state.
>*/
>   irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
> - ret = devm_request_irq(smmu->dev, irq, arm_smmu_context_fault,
> +
> + if (smmu->impl && smmu->impl->context_fault)
> + context_fault = smmu->impl->context_fault;
> + else
> + context_fault = arm_smmu_context_fault;
> +
> + ret = devm_request_irq(smmu->dev, irq, context_fault,
>  IRQF_SHARED, "arm-smmu-context-fault", domain);
>   if (ret < 0) {
>   dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
> @@ -2105,6 +2112,7 @@ static int arm_smmu_device_probe(struct platform_device 
> *pdev)
>   struct arm_smmu_device *smmu;
>   struct device *dev = >dev;
>   int num_irqs, i, err;
> + irqreturn_t (*global_fault)(int irq, void *dev);
>  
>   smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
>   if (!smmu) {
> @@ -2191,9 +2199,14 @@ static int arm_smmu_device_probe(struct 
> platform_device *pdev)
>   smmu->num_context_irqs = smmu->num_context_banks;
>   }
>  
> + if (smmu->impl && smmu->impl->global_fault)
> + global_fault = smmu->impl->global_fault;
> + else
> + global_fault = arm_smmu_global_fault;
> +
>   for (i = 0; i < smmu->num_global_irqs; ++i) {
>   err = devm_request_irq(smmu->dev, smmu->irqs[i],
> -arm_smmu_global_fault,
> +global_fault,
>  IRQF_SHARED,
>  "arm-smmu global fault",
>          smmu);
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index fad63efa1a72..d890a4a968e8 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -18,6 +18,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -389,6 +390,8 @@ struct arm_smmu_impl {
>   void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
>int status);
>   int (*def_domain_type)(struct device *dev);
> + irqreturn_t (*global_fault)(int irq, void *dev);
> + irqreturn_t (*context_fault)(int irq, void *dev);
>  };
>  
>  static inline void __iomem *arm_smmu_page(struct arm_smmu_device *smmu, int 
> n)
> 


Reviewed-by: Jon Hunter 

Thanks!
Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v10 4/5] dt-bindings: arm-smmu: add binding for Tegra194 SMMU

2020-07-08 Thread Jon Hunter


On 08/07/2020 06:00, Krishna Reddy wrote:
> Add binding for NVIDIA's Tegra194 SoC SMMU.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  .../devicetree/bindings/iommu/arm,smmu.yaml| 18 ++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
> b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> index d7ceb4c34423..ac1f526c3424 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> @@ -38,6 +38,11 @@ properties:
>- qcom,sc7180-smmu-500
>- qcom,sdm845-smmu-500
>- const: arm,mmu-500
> +  - description: NVIDIA SoCs that program two ARM MMU-500s identically
> +items:
> +  - enum:
> +  - nvidia,tegra194-smmu
> +  - const: nvidia,smmu-500
>- items:
>- const: arm,mmu-500
>- const: arm,smmu-v2
> @@ -138,6 +143,19 @@ required:
>  
>  additionalProperties: false
>  
> +allOf:
> +  - if:
> +  properties:
> +compatible:
> +  contains:
> +enum:
> +  - nvidia,tegra194-smmu
> +then:
> +  properties:
> +reg:
> +  minItems: 2
> +      maxItems: 2
> +
>  examples:
>- |+
>  /* SMMU with stream matching or stream indexing */
> 

Reviewed-by: Jon Hunter 

Thanks
Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v10 3/5] iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage

2020-07-08 Thread Jon Hunter
+}
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index e03e873d3bca..c123a5814f70 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1943,6 +1943,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
>   { .compatible = "arm,mmu-401", .data = _mmu401 },
>   { .compatible = "arm,mmu-500", .data = _mmu500 },
>   { .compatible = "cavium,smmu-v2", .data = _smmuv2 },
> + { .compatible = "nvidia,smmu-500", .data = _mmu500 },
>   { .compatible = "qcom,smmu-v2", .data = _smmuv2 },
>   { },
>  };
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index c7d0122a7c6c..fad63efa1a72 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -452,6 +452,7 @@ static inline void arm_smmu_writeq(struct arm_smmu_device 
> *smmu, int page,
>   arm_smmu_writeq((s), ARM_SMMU_CB((s), (n)), (o), (v))
>  
>  struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu);
> +struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu);
>  struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu);
>  
>  int arm_mmu500_reset(struct arm_smmu_device *smmu);
> 

Reviewed-by: Jon Hunter 

Thanks
Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v10 2/5] iommu/arm-smmu: ioremap smmu mmio region before implementation init

2020-07-08 Thread Jon Hunter


On 08/07/2020 06:00, Krishna Reddy wrote:
> ioremap smmu mmio region before calling into implementation init.
> This is necessary to allow mapped address available during vendor
> specific implementation init.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  drivers/iommu/arm-smmu.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index d2054178df35..e03e873d3bca 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -2120,10 +2120,6 @@ static int arm_smmu_device_probe(struct 
> platform_device *pdev)
>   if (err)
>   return err;
>  
> - smmu = arm_smmu_impl_init(smmu);
> - if (IS_ERR(smmu))
> - return PTR_ERR(smmu);
> -
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>   ioaddr = res->start;
>   smmu->base = devm_ioremap_resource(dev, res);
> @@ -2135,6 +2131,10 @@ static int arm_smmu_device_probe(struct 
> platform_device *pdev)
>*/
>   smmu->numpage = resource_size(res);
>  
> + smmu = arm_smmu_impl_init(smmu);
> + if (IS_ERR(smmu))
> + return PTR_ERR(smmu);
> +
>   num_irqs = 0;
>   while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) {
>   num_irqs++;
> 


Reviewed-by: Jon Hunter 

Thanks!
Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v10 1/5] iommu/arm-smmu: move TLB timeout and spin count macros

2020-07-08 Thread Jon Hunter


On 08/07/2020 06:00, Krishna Reddy wrote:
> Move TLB timeout and spin count macros to header file to
> allow using the same from vendor specific implementations.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  drivers/iommu/arm-smmu.c | 3 ---
>  drivers/iommu/arm-smmu.h | 2 ++
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 243bc4cb2705..d2054178df35 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -52,9 +52,6 @@
>   */
>  #define QCOM_DUMMY_VAL -1
>  
> -#define TLB_LOOP_TIMEOUT 100 /* 1s! */
> -#define TLB_SPIN_COUNT   10
> -
>  #define MSI_IOVA_BASE0x800
>  #define MSI_IOVA_LENGTH  0x10
>  
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index d172c024be61..c7d0122a7c6c 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -236,6 +236,8 @@ enum arm_smmu_cbar_type {
>  /* Maximum number of context banks per SMMU */
>  #define ARM_SMMU_MAX_CBS 128
>  
> +#define TLB_LOOP_TIMEOUT 100 /* 1s! */
> +#define TLB_SPIN_COUNT       10
>  
>  /* Shared driver definitions */
>  enum arm_smmu_arch_version {
> 


Reviewed-by: Jon Hunter 

Thanks!
Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 2/3] dt-bindings: arm-smmu: Add binding for Tegra194 SMMU

2020-07-01 Thread Jon Hunter


On 01/07/2020 20:00, Krishna Reddy wrote:
> +items:
> +  - enum:
> +  - nvdia,tegra194-smmu
> +  - const: arm,mmu-500
>>>
 Is the fallback compatible appropriate here? If software treats this as a 
 standard MMU-500 it will only program the first instance (because the 
 second isn't presented as a separate MMU-500) - is there any way that 
 isn't going to blow up?
>>>
>>> When compatible is set to both nvidia,tegra194-smmu and arm,mmu-500, 
>>> implementation override ensure that both instances are programmed. Isn't 
>>> it? I am not sure I follow your comment fully.
> 
>> The problem is, if for some reason someone had a Tegra194, but only set the 
>> compatible string to 'arm,mmu-500' it would assume that it was a normal 
>> arm,mmu-500 and only one instance would be programmed. We always want at 
>> least 2 of the 3 instances >programmed and so we should only match 
>> 'nvidia,tegra194-smmu'. In fact, I think that we also need to update the 
>> arm_smmu_of_match table to add 'nvidia,tegra194-smmu' with the data set to 
>> _mmu500.
> 
> In that case, new binding "nvidia,smmu-v2" can be added with data set to 
> _mmu500 and enumeration would have nvidia,tegra194-smmu and another 
> variant for next generation SoC in future. 

I think you would be better off with nvidia,smmu-500 as smmu-v2 appears
to be something different. I see others have a smmu-v2 but I am not sure
if that is legacy. We have an smmu-500 and so that would seem more
appropriate.

Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 2/3] dt-bindings: arm-smmu: Add binding for Tegra194 SMMU

2020-07-01 Thread Jon Hunter


On 01/07/2020 19:28, Krishna Reddy wrote:
>>> +  - description: NVIDIA SoCs that use more than one "arm,mmu-500"
>> Hmm, there must be a better way to word that to express that it only applies 
>> to the sets of SMMUs that must be programmed identically, and not any other 
>> independent MMU-500s that might also happen to be in the same SoC.
> 
> Let me reword it to "NVIDIA SoCs that must program multiple MMU-500s 
> identically".
> 
>>> +items:
>>> +  - enum:
>>> +  - nvdia,tegra194-smmu
>>> +  - const: arm,mmu-500
> 
>> Is the fallback compatible appropriate here? If software treats this as a 
>> standard MMU-500 it will only program the first instance (because the second 
>> isn't presented as a separate MMU-500) - is there any way that isn't going 
>> to blow up?
> 
> When compatible is set to both nvidia,tegra194-smmu and arm,mmu-500, 
> implementation override ensure that both instances are programmed. Isn't it? 
> I am not sure I follow your comment fully.

The problem is, if for some reason someone had a Tegra194, but only set
the compatible string to 'arm,mmu-500' it would assume that it was a
normal arm,mmu-500 and only one instance would be programmed. We always
want at least 2 of the 3 instances programmed and so we should only
match 'nvidia,tegra194-smmu'. In fact, I think that we also need to
update the arm_smmu_of_match table to add 'nvidia,tegra194-smmu' with
the data set to _mmu500.

Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v9 3/4] dt-bindings: arm-smmu: add binding for Tegra194 SMMU

2020-07-01 Thread Jon Hunter


On 01/07/2020 00:57, Krishna Reddy wrote:
> Add binding for NVIDIA's Tegra194 SoC SMMU topology that is based
> on ARM MMU-500.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  .../devicetree/bindings/iommu/arm,smmu.yaml| 18 ++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
> b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> index d7ceb4c34423b..662c46e16f07d 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> @@ -38,6 +38,11 @@ properties:
>- qcom,sc7180-smmu-500
>- qcom,sdm845-smmu-500
>- const: arm,mmu-500
> +  - description: NVIDIA SoCs that use more than one "arm,mmu-500"
> +items:
> +  - enum:
> +  - nvidia,tegra194-smmu
> +  - const: arm,mmu-500

We should also address Robin's comments here. I don't think that we ever
want to fail back to just the regular 'arm,mmu-500' and should be should
probably just drop this part.

Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v9 2/4] iommu/arm-smmu: add NVIDIA implementation for ARM MMU-500 usage

2020-07-01 Thread Jon Hunter


On 01/07/2020 00:57, Krishna Reddy wrote:
> NVIDIA's Tegra194 SoC has three ARM MMU-500 instances.
> It uses two of ARM MMU-500s together to interleave IOVA accesses
> across them and must be programmed identically.
> The third SMMU instance is used as a regular ARM MMU-500 and it
> can either be programmed independently or identical to other
> two ARM MMU-500s.
> 
> This implementation supports programming two or three ARM MMU-500s
> identically as per DT config.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  MAINTAINERS |   2 +
>  drivers/iommu/Makefile  |   2 +-
>  drivers/iommu/arm-smmu-impl.c   |   3 +
>  drivers/iommu/arm-smmu-nvidia.c | 206 
>  drivers/iommu/arm-smmu.h|   1 +
>  5 files changed, 213 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/iommu/arm-smmu-nvidia.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7b5ffd646c6b9..64c37dbdd4426 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16808,8 +16808,10 @@ F:   drivers/i2c/busses/i2c-tegra.c
>  
>  TEGRA IOMMU DRIVERS
>  M:   Thierry Reding 
> +R:   Krishna Reddy 
>  L:   linux-te...@vger.kernel.org
>  S:   Supported
> +F:   drivers/iommu/arm-smmu-nvidia.c
>  F:   drivers/iommu/tegra*
>  
>  TEGRA KBC DRIVER
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 342190196dfb0..2b8203db73ec3 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -15,7 +15,7 @@ obj-$(CONFIG_AMD_IOMMU) += amd/iommu.o amd/init.o 
> amd/quirks.o
>  obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd/debugfs.o
>  obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o
>  obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
> -arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
> +arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-nvidia.o arm-smmu-qcom.o
>  obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
>  obj-$(CONFIG_DMAR_TABLE) += intel/dmar.o
>  obj-$(CONFIG_INTEL_IOMMU) += intel/iommu.o intel/pasid.o
> diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
> index c75b9d957b702..f15571d05474e 100644
> --- a/drivers/iommu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm-smmu-impl.c
> @@ -171,6 +171,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
> arm_smmu_device *smmu)
>   if (of_property_read_bool(np, "calxeda,smmu-secure-config-access"))
>   smmu->impl = _impl;
>  
> + if (of_device_is_compatible(np, "nvidia,tegra194-smmu"))
> + return nvidia_smmu_impl_init(smmu);
> +
>   if (of_device_is_compatible(np, "qcom,sdm845-smmu-500") ||
>   of_device_is_compatible(np, "qcom,sc7180-smmu-500"))
>   return qcom_smmu_impl_init(smmu);
> diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
> new file mode 100644
> index 0..5c874912e1c1a
> --- /dev/null
> +++ b/drivers/iommu/arm-smmu-nvidia.c
> @@ -0,0 +1,206 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// NVIDIA ARM SMMU v2 implementation quirks
> +// Copyright (C) 2019-2020 NVIDIA CORPORATION.  All rights reserved.
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "arm-smmu.h"
> +
> +/*
> + * Tegra194 has three ARM MMU-500 Instances.
> + * Two of them are used together for interleaved IOVA accesses and
> + * used by non-isochronous HW devices for SMMU translations.
> + * Third one is used for SMMU translations from isochronous HW devices.
> + * It is possible to use this implementation to program either
> + * all three or two of the instances identically as desired through
> + * DT node.
> + *
> + * Programming all the three instances identically comes with redundant TLB
> + * invalidations as all three never need to be TLB invalidated for a HW 
> device.
> + *
> + * When Linux kernel supports multiple SMMU devices, the SMMU device used for
> + * isochornous HW devices should be added as a separate ARM MMU-500 device
> + * in DT and be programmed independently for efficient TLB invalidates.
> + */

We should address Robin's comment about the 'When' above.

> +#define MAX_SMMU_INSTANCES 3
> +
> +struct nvidia_smmu {
> + struct arm_smmu_device  smmu;
> + unsigned intnum_inst;
> + void __iomem*bases[MAX_SMMU_INSTANCES];
> +};
> +
> +static inline struct nvidia_smmu *to_nvidia_smmu(struct arm_smmu_device 
> *smmu)
> +{
> + return container_of(smmu, struct nvidia_smmu, smmu);
> +}
> +
> +static inline void __iomem *nvidia_smmu_page(struct arm_smmu_device *smmu,
> +  unsigned int inst, int page)
> +{
> + struct nvidia_smmu *nvidia_smmu = to_nvidia_smmu(smmu);
> +
> + if (!nvidia_smmu->bases[0])
> + nvidia_smmu->bases[0] = smmu->base;


Robin said that he would accept a patch to move the
devm_ioremap_resource() call in arm_smmu_device_probe() so that we do
not need to do this. See the V7 series. I think that this would be a
good improvement so that we could avoid doing the above.

Cheers

Re: [PATCH v8 1/3] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage

2020-06-30 Thread Jon Hunter


On 30/06/2020 18:16, Krishna Reddy wrote:
>> OK, well I see what you are saying, but if we intended to support all 3 for 
>> Tegra194, then we should ensure all 3 are initialised correctly.
> 
> The driver intend to support up to 3 instances. It doesn't really mandate 
> that all three instances be present in same DT node.
> Each mmio aperture in "reg" property is an instance here. reg =  size>, , ;
> The reg can have all three or less and driver just configures based on reg 
> and it works fine.

So it sounds like we need at least 2 SMMUs (for non-iso and iso) but we
have up to 3 (for Tegra194). So the question is do we have a use-case
where we only use 2 and not 3? If not, then it still seems that we
should require that all 3 are present.

The other problem I see here is that currently the arm-smmu binding
defines the 'reg' with a 'maxItems' of 1, whereas we have 3. I believe
that this will get caught by the 'dt_binding_check' when we try to
populate the binding.

>> It would be better to query the number of SMMUs populated in device-tree and 
>> then ensure that all are initialised correctly.
> 
> Getting the IORESOURCE_MEM is the way to count the instances driver need to 
> support.  
> In a way, It is already querying through IORESOURCE_MEM here. 

Yes I was wondering that. I think we just need to decide if the 3rd SMMU
is optional or not. The DT binding should detail and min and max supported. 

Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 1/3] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage

2020-06-30 Thread Jon Hunter


On 30/06/2020 17:32, Jon Hunter wrote:
> On 30/06/2020 17:23, Krishna Reddy wrote:
>>>> +struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device 
>>>> +*smmu) {
>>>> +  unsigned int i;
>> 
>>>> +  for (i = 1; i < MAX_SMMU_INSTANCES; i++) {
>>>> +  struct resource *res;
>>>> +
>>>> +  res = platform_get_resource(pdev, IORESOURCE_MEM, i);
>>>> +  if (!res)
>>>> +  break;
>>
>>> Currently this driver is only supported for Tegra194 which I understand has 
>>> 3 SMMUs. Therefore, I don't feel that we should fail silently here, I think 
>>> it is better to return an error if all 3 cannot be initialised.
>>
>> Initialization of all the three SMMU instances is not necessary here.
> 
> That is not what I am saying.
> 
>> The driver can work with all the possible number of instances 1, 2 and 3 
>> based on the DT config though it doesn't make much sense to use it with 1 
>> instance.
>> There is no silent failure here from driver point of view. If there is 
>> misconfig in DT, SMMU faults would catch issues.
> 
> I disagree and you should return a proper error here.

OK, well I see what you are saying, but if we intended to support all 3
for Tegra194, then we should ensure all 3 are initialised correctly.

My concern here is testing, because when things break in upstream I am
usually the one that tracks it down. Not having clear warning/error
messages when something is not initialised as expected makes it harder.

It would be better to query the number of SMMUs populated in device-tree
and then ensure that all are initialised correctly.

Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 1/3] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage

2020-06-30 Thread Jon Hunter



On 30/06/2020 17:23, Krishna Reddy wrote:
>>> +struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device 
>>> +*smmu) {
>>> +   unsigned int i;
> 
>>> +   for (i = 1; i < MAX_SMMU_INSTANCES; i++) {
>>> +   struct resource *res;
>>> +
>>> +   res = platform_get_resource(pdev, IORESOURCE_MEM, i);
>>> +   if (!res)
>>> +   break;
> 
>> Currently this driver is only supported for Tegra194 which I understand has 
>> 3 SMMUs. Therefore, I don't feel that we should fail silently here, I think 
>> it is better to return an error if all 3 cannot be initialised.
> 
> Initialization of all the three SMMU instances is not necessary here.

That is not what I am saying.

> The driver can work with all the possible number of instances 1, 2 and 3 
> based on the DT config though it doesn't make much sense to use it with 1 
> instance.
> There is no silent failure here from driver point of view. If there is 
> misconfig in DT, SMMU faults would catch issues.

I disagree and you should return a proper error here.

>>> +   nvidia_smmu->bases[i] = devm_ioremap_resource(smmu->dev, res);
>>> +   if (IS_ERR(nvidia_smmu->bases[i]))
>>> +   return ERR_CAST(nvidia_smmu->bases[i]);
> 
>> You want to use PTR_ERR() here.
> 
> PTR_ERR() returns long integer. 
> This function returns a pointer. ERR_CAST is the right one to use here. 

Ah yes, indeed. OK that's fine.

Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 1/3] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage

2020-06-30 Thread Jon Hunter

On 30/06/2020 15:53, Robin Murphy wrote:
> On 2020-06-30 09:19, Jon Hunter wrote:
>>
>> On 30/06/2020 01:10, Krishna Reddy wrote:
>>> NVIDIA's Tegra194 SoC uses two ARM MMU-500s together to interleave
>>> IOVA accesses across them.
>>> Add NVIDIA implementation for dual ARM MMU-500s and add new compatible
>>> string for Tegra194 SoC SMMU topology.
>>
>> There is no description here of the 3rd SMMU that you mention below.
>> I think that we should describe the full picture here.
>>  
>>> Signed-off-by: Krishna Reddy 

...

>>> +static void nvidia_smmu_tlb_sync(struct arm_smmu_device *smmu, int
>>> page,
>>> +   int sync, int status)
>>> +{
>>> +    unsigned int delay;
>>> +
>>> +    arm_smmu_writel(smmu, page, sync, 0);
>>> +
>>> +    for (delay = 1; delay < TLB_LOOP_TIMEOUT_IN_US; delay *= 2) {
>>
>> So we are doubling the delay every time? Is this better than just using
>> the same on each loop?
> 
> This is the same logic as the main driver (see 8513c8930069) - the sync
> is expected to complete relatively quickly, hence why we have the inner
> spin loop to avoid the delay entirely in the typical case, and the
> longer it's taking, the more likely it is that something's wrong and it
> will never complete anyway. Realistically, a heavily loaded SMMU at a
> modest clock rate might take us through a couple of iterations of the
> outer loop, but beyond that we're pretty much just killing time until we
> declare it wedged and give up, and by then there's not much point in
> burning power frantically hamering on the interconnect.

Ah OK. Then maybe we should move the definitions for TLB_LOOP_TIMEOUT
and TLB_SPIN_COUNT into the arm-smmu.h so that we can use them directly
in this file instead of redefining them. Then it maybe clear that these
are part of the main driver.

 >>> +struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device
>>> *smmu)
>>> +{
>>> +    unsigned int i;
>>> +    struct nvidia_smmu *nvidia_smmu;
>>> +    struct platform_device *pdev = to_platform_device(smmu->dev);
>>> +
>>> +    nvidia_smmu = devm_kzalloc(smmu->dev, sizeof(*nvidia_smmu),
>>> GFP_KERNEL);
>>> +    if (!nvidia_smmu)
>>> +    return ERR_PTR(-ENOMEM);
>>> +
>>> +    nvidia_smmu->smmu = *smmu;
>>> +    /* Instance 0 is ioremapped by arm-smmu.c after this function
>>> returns */
>>> +    nvidia_smmu->num_inst = 1;
>>> +
>>> +    for (i = 1; i < MAX_SMMU_INSTANCES; i++) {
>>> +    struct resource *res;
>>> +
>>> +    res = platform_get_resource(pdev, IORESOURCE_MEM, i);
>>> +    if (!res)
>>> +    break;
>>> +
>>> +    nvidia_smmu->bases[i] = devm_ioremap_resource(smmu->dev, res);
>>> +    if (IS_ERR(nvidia_smmu->bases[i]))
>>> +    return ERR_CAST(nvidia_smmu->bases[i]);
>>> +
>>> +    nvidia_smmu->num_inst++;
>>> +    }
>>> +
>>> +    nvidia_smmu->smmu.impl = _smmu_impl;
>>> +    /*
>>> + * Free the arm_smmu_device struct allocated in arm-smmu.c.
>>> + * Once this function returns, arm-smmu.c would use arm_smmu_device
>>> + * allocated as part of nvidia_smmu struct.
>>> + */
>>> +    devm_kfree(smmu->dev, smmu);
>>
>> Why don't we just store the pointer of the smmu struct passed to this
>> function
>> in the nvidia_smmu struct and then we do not need to free this here.
>> In other
>> words make ...
>>
>>   struct nvidia_smmu {
>> struct arm_smmu_device    *smmu;
>> unsigned int    num_inst;
>> void __iomem    *bases[MAX_SMMU_INSTANCES];
>>   };
>>
>> This seems more appropriate, than copying the struct and freeing memory
>> allocated else-where.
> 
> But then how do you get back to struct nvidia_smmu given just a pointer
> to struct arm_smmu_device?

Ah yes of course that is what I was missing. I wondered what was going
on here. So I think we should add a nice comment in the above function
of why we are copying this and cannot simply store the pointer.

Cheers
Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v8 3/3] iommu/arm-smmu: Add global/context fault implementation hooks

2020-06-30 Thread Jon Hunter

On 30/06/2020 13:13, Robin Murphy wrote:
> On 2020-06-30 09:37, Jon Hunter wrote:
>>
>> On 30/06/2020 01:10, Krishna Reddy wrote:
>>> Add global/context fault hooks to allow NVIDIA SMMU implementation
>>> handle faults across multiple SMMUs.
>>
>> Nit ... this is not just for NVIDIA, but this allows anyone to add
>> custom global/context and fault hooks. So I think that the changelog
>> should be clear that this change permits custom fault hooks and that
>> custom fault hooks are needed for the Tegra194 SMMU. You may also want
>> to say why.
>>
>>>
>>> Signed-off-by: Krishna Reddy 
>>> ---
>>>   drivers/iommu/arm-smmu-nvidia.c | 98 +
>>>   drivers/iommu/arm-smmu.c    | 17 +-
>>>   drivers/iommu/arm-smmu.h    |  3 +
>>>   3 files changed, 116 insertions(+), 2 deletions(-)

...

>>> @@ -835,7 +836,13 @@ static int arm_smmu_init_domain_context(struct
>>> iommu_domain *domain,
>>>    * handler seeing a half-initialised domain state.
>>>    */
>>>   irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
>>> -    ret = devm_request_irq(smmu->dev, irq, arm_smmu_context_fault,
>>> +
>>> +    if (smmu->impl && smmu->impl->context_fault)
>>> +    context_fault = smmu->impl->context_fault;
>>> +    else
>>> +    context_fault = arm_smmu_context_fault;
>>
>> Why not see the default smmu->impl->context_fault to
>> arm_smmu_context_fault in arm_smmu_impl_init() and then allow the
>> various implementations to override as necessary? Then you can get rid
>> of this context_fault variable here and just use
>> smmu->impl->context_fault below.
> 
> Because the default smmu->impl is NULL. And as I've said before, NAK to
> forcing the common case to allocate a set of "quirks" purely to override
> the default IRQ handler with the default IRQ handler ;)


Ah OK, makes sense. Sorry I am a bit late to the review :-)

Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 1/3] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage

2020-06-30 Thread Jon Hunter


On 29/06/2020 23:49, Krishna Reddy wrote:
>>> + if (!nvidia_smmu->bases[0])
>>> + nvidia_smmu->bases[0] = smmu->base;
>>> +
>>> + return nvidia_smmu->bases[inst] + (page << smmu->pgshift); }
> 
>> Not critical -- just a nit: why not put the bases[0] in init()?
> 
> smmu->base is not available during nvidia_smmu_impl_init() call. It is set 
> afterwards in arm-smmu.c.
> It can't be avoided without changing the devm_ioremap() and impl_init() call 
> order in arm-smmu.c.


Why don't we move the call to devm_ioremap_resource() to before
arm_smmu_impl_init() in arm_smmu_device_probe()? From a quick look I
don't see why we cannot do this and seems better than what we are
currently doing which is quite confusing and hard to understand.

Jon


-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 1/3] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage

2020-06-30 Thread Jon Hunter


On 30/06/2020 01:10, Krishna Reddy wrote:
> NVIDIA's Tegra194 SoC uses two ARM MMU-500s together to interleave
> IOVA accesses across them.
> Add NVIDIA implementation for dual ARM MMU-500s and add new compatible
> string for Tegra194 SoC SMMU topology.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  MAINTAINERS |   2 +
>  drivers/iommu/Makefile  |   2 +-
>  drivers/iommu/arm-smmu-impl.c   |   3 +
>  drivers/iommu/arm-smmu-nvidia.c | 196 
>  drivers/iommu/arm-smmu.h|   1 +
>  5 files changed, 203 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/iommu/arm-smmu-nvidia.c

...

> +struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
> +{
> + unsigned int i;
> + struct nvidia_smmu *nvidia_smmu;
> + struct platform_device *pdev = to_platform_device(smmu->dev);
> +
> + nvidia_smmu = devm_kzalloc(smmu->dev, sizeof(*nvidia_smmu), GFP_KERNEL);
> + if (!nvidia_smmu)
> + return ERR_PTR(-ENOMEM);
> +
> + nvidia_smmu->smmu = *smmu;
> + /* Instance 0 is ioremapped by arm-smmu.c after this function returns */
> + nvidia_smmu->num_inst = 1;
> +
> + for (i = 1; i < MAX_SMMU_INSTANCES; i++) {
> + struct resource *res;
> +
> + res = platform_get_resource(pdev, IORESOURCE_MEM, i);
> + if (!res)
> + break;

Currently this driver is only supported for Tegra194 which I understand
has 3 SMMUs. Therefore, I don't feel that we should fail silently here,
I think it is better to return an error if all 3 cannot be initialised.
In the future if there is an SoC that has less (hopefully not more) than
Tegra194 then we should handle this via the DT compatible string. In
other words, we should always know how many SMMUs there are for a given
SoC and how many we should initialise.

> +
> + nvidia_smmu->bases[i] = devm_ioremap_resource(smmu->dev, res);
> + if (IS_ERR(nvidia_smmu->bases[i]))
> + return ERR_CAST(nvidia_smmu->bases[i]);

You want to use PTR_ERR() here.

Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 3/3] iommu/arm-smmu: Add global/context fault implementation hooks

2020-06-30 Thread Jon Hunter


On 30/06/2020 01:10, Krishna Reddy wrote:
> Add global/context fault hooks to allow NVIDIA SMMU implementation
> handle faults across multiple SMMUs.

Nit ... this is not just for NVIDIA, but this allows anyone to add
custom global/context and fault hooks. So I think that the changelog
should be clear that this change permits custom fault hooks and that
custom fault hooks are needed for the Tegra194 SMMU. You may also want
to say why.

> 
> Signed-off-by: Krishna Reddy 
> ---
>  drivers/iommu/arm-smmu-nvidia.c | 98 +
>  drivers/iommu/arm-smmu.c| 17 +-
>  drivers/iommu/arm-smmu.h|  3 +
>  3 files changed, 116 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
> index 1124f0ac1823a..c9423b4199c65 100644
> --- a/drivers/iommu/arm-smmu-nvidia.c
> +++ b/drivers/iommu/arm-smmu-nvidia.c
> @@ -147,6 +147,102 @@ static int nvidia_smmu_reset(struct arm_smmu_device 
> *smmu)
>   return 0;
>  }
>  
> +static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
> +{
> + return container_of(dom, struct arm_smmu_domain, domain);
> +}
> +
> +static irqreturn_t nvidia_smmu_global_fault_inst(int irq,
> +struct arm_smmu_device *smmu,
> +int inst)
> +{
> + u32 gfsr, gfsynr0, gfsynr1, gfsynr2;
> + void __iomem *gr0_base = nvidia_smmu_page(smmu, inst, 0);
> +
> + gfsr = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSR);
> + if (!gfsr)
> + return IRQ_NONE;
> +
> + gfsynr0 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR0);
> + gfsynr1 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR1);
> + gfsynr2 = readl_relaxed(gr0_base + ARM_SMMU_GR0_sGFSYNR2);
> +
> + dev_err_ratelimited(smmu->dev,
> + "Unexpected global fault, this could be serious\n");
> + dev_err_ratelimited(smmu->dev,
> + "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 
> 0x%08x\n",
> + gfsr, gfsynr0, gfsynr1, gfsynr2);
> +
> + writel_relaxed(gfsr, gr0_base + ARM_SMMU_GR0_sGFSR);
> + return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t nvidia_smmu_global_fault(int irq, void *dev)
> +{
> + int inst;

Should be unsigned

> + irqreturn_t irq_ret = IRQ_NONE;
> + struct arm_smmu_device *smmu = dev;
> + struct nvidia_smmu *nvidia_smmu = to_nvidia_smmu(smmu);
> +
> + for (inst = 0; inst < nvidia_smmu->num_inst; inst++) {
> + irq_ret = nvidia_smmu_global_fault_inst(irq, smmu, inst);
> + if (irq_ret == IRQ_HANDLED)
> + return irq_ret;

Any chance there could be more than one SMMU faulting by the time we
service the interrupt?

> + }
> +
> + return irq_ret;
> +}
> +
> +static irqreturn_t nvidia_smmu_context_fault_bank(int irq,
> + struct arm_smmu_device *smmu,
> + int idx, int inst)
> +{
> + u32 fsr, fsynr, cbfrsynra;
> + unsigned long iova;
> + void __iomem *gr1_base = nvidia_smmu_page(smmu, inst, 1);
> + void __iomem *cb_base = nvidia_smmu_page(smmu, inst, smmu->numpage + 
> idx);
> +
> + fsr = readl_relaxed(cb_base + ARM_SMMU_CB_FSR);
> + if (!(fsr & ARM_SMMU_FSR_FAULT))
> + return IRQ_NONE;
> +
> + fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
> + iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
> + cbfrsynra = readl_relaxed(gr1_base + ARM_SMMU_GR1_CBFRSYNRA(idx));
> +
> + dev_err_ratelimited(smmu->dev,
> + "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, 
> cbfrsynra=0x%x, cb=%d\n",
> + fsr, iova, fsynr, cbfrsynra, idx);
> +
> + writel_relaxed(fsr, cb_base + ARM_SMMU_CB_FSR);
> + return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t nvidia_smmu_context_fault(int irq, void *dev)
> +{
> + int inst, idx;

Unsigned

> + irqreturn_t irq_ret = IRQ_NONE;
> + struct iommu_domain *domain = dev;
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> + struct arm_smmu_device *smmu = smmu_domain->smmu;
> +
> + for (inst = 0; inst < to_nvidia_smmu(smmu)->num_inst; inst++) {
> + /*
> +  * Interrupt line shared between all context faults.
> +  * Check for faults across all contexts.
> +  */
> + for (idx = 0; idx < smmu->num_context_banks; idx++) {
> + irq_ret = nvidia_smmu_context_fault_bank(irq, smmu,
> +  idx, inst);
> +
> + if (irq_ret == IRQ_HANDLED)
> + return irq_ret;

Any reason why we don't check all banks?

> + }
> + }
> +
> + return irq_ret;
> +}
> +
>  static const struct arm_smmu_impl nvidia_smmu_impl = {
>   .read_reg = nvidia_smmu_read_reg,
>   .write_reg = 

Re: [PATCH v8 2/3] dt-bindings: arm-smmu: Add binding for Tegra194 SMMU

2020-06-30 Thread Jon Hunter


On 30/06/2020 01:10, Krishna Reddy wrote:
> Add binding for NVIDIA's Tegra194 SoC SMMU topology that is based
> on ARM MMU-500.
> 
> Signed-off-by: Krishna Reddy 
> ---
>  Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
> b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> index d7ceb4c34423b..5b2586ac715ed 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> @@ -38,6 +38,11 @@ properties:
>- qcom,sc7180-smmu-500
>- qcom,sdm845-smmu-500
>- const: arm,mmu-500
> +  - description: NVIDIA SoCs that use more than one "arm,mmu-500"
> +items:
> +  - enum:
> +  - nvdia,tegra194-smmu

s/nvdia/nvidia

Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 1/3] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage

2020-06-30 Thread Jon Hunter


On 30/06/2020 01:10, Krishna Reddy wrote:
> NVIDIA's Tegra194 SoC uses two ARM MMU-500s together to interleave
> IOVA accesses across them.
> Add NVIDIA implementation for dual ARM MMU-500s and add new compatible
> string for Tegra194 SoC SMMU topology.

There is no description here of the 3rd SMMU that you mention below.
I think that we should describe the full picture here.
 
> Signed-off-by: Krishna Reddy 
> ---
>  MAINTAINERS |   2 +
>  drivers/iommu/Makefile  |   2 +-
>  drivers/iommu/arm-smmu-impl.c   |   3 +
>  drivers/iommu/arm-smmu-nvidia.c | 196 
>  drivers/iommu/arm-smmu.h|   1 +
>  5 files changed, 203 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/iommu/arm-smmu-nvidia.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7b5ffd646c6b9..64c37dbdd4426 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16808,8 +16808,10 @@ F:   drivers/i2c/busses/i2c-tegra.c
>  
>  TEGRA IOMMU DRIVERS
>  M:   Thierry Reding 
> +R:   Krishna Reddy 
>  L:   linux-te...@vger.kernel.org
>  S:   Supported
> +F:   drivers/iommu/arm-smmu-nvidia.c
>  F:   drivers/iommu/tegra*
>  
>  TEGRA KBC DRIVER
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 342190196dfb0..2b8203db73ec3 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -15,7 +15,7 @@ obj-$(CONFIG_AMD_IOMMU) += amd/iommu.o amd/init.o 
> amd/quirks.o
>  obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd/debugfs.o
>  obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o
>  obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
> -arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
> +arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-nvidia.o arm-smmu-qcom.o
>  obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
>  obj-$(CONFIG_DMAR_TABLE) += intel/dmar.o
>  obj-$(CONFIG_INTEL_IOMMU) += intel/iommu.o intel/pasid.o
> diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
> index c75b9d957b702..70f7318017617 100644
> --- a/drivers/iommu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm-smmu-impl.c
> @@ -171,6 +171,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
> arm_smmu_device *smmu)
>   if (of_property_read_bool(np, "calxeda,smmu-secure-config-access"))
>   smmu->impl = _impl;
>  
> + if (of_device_is_compatible(smmu->dev->of_node, "nvidia,tegra194-smmu"))
> + return nvidia_smmu_impl_init(smmu);
> +
>   if (of_device_is_compatible(np, "qcom,sdm845-smmu-500") ||
>   of_device_is_compatible(np, "qcom,sc7180-smmu-500"))
>   return qcom_smmu_impl_init(smmu);
> diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
> new file mode 100644
> index 0..1124f0ac1823a
> --- /dev/null
> +++ b/drivers/iommu/arm-smmu-nvidia.c
> @@ -0,0 +1,196 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// NVIDIA ARM SMMU v2 implementation quirks
> +// Copyright (C) 2019-2020 NVIDIA CORPORATION.  All rights reserved.
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "arm-smmu.h"
> +
> +/*
> + * Tegra194 has three ARM MMU-500 Instances.
> + * Two of them are used together for interleaved IOVA accesses and
> + * used by non-isochronous HW devices for SMMU translations.
> + * Third one is used for SMMU translations from isochronous HW devices.
> + * It is possible to use this implementation to program either
> + * all three or two of the instances identically as desired through
> + * DT node.
> + *
> + * Programming all the three instances identically comes with redundant TLB
> + * invalidations as all three never need to be TLB invalidated for a HW 
> device.
> + *
> + * When Linux kernel supports multiple SMMU devices, the SMMU device used for
> + * isochornous HW devices should be added as a separate ARM MMU-500 device
> + * in DT and be programmed independently for efficient TLB invalidates.
> + */
> +#define MAX_SMMU_INSTANCES 3
> +
> +#define TLB_LOOP_TIMEOUT_IN_US   100 /* 1s! */
> +#define TLB_SPIN_COUNT   10
> +
> +struct nvidia_smmu {
> + struct arm_smmu_device  smmu;
> + unsigned intnum_inst;
> + void __iomem*bases[MAX_SMMU_INSTANCES];
> +};
> +
> +static inline struct nvidia_smmu *to_nvidia_smmu(struct arm_smmu_device 
> *smmu)
> +{
> + return container_of(smmu, struct nvidia_smmu, smmu);
> +}
> +
> +static inline void __iomem *nvidia_smmu_page(struct arm_smmu_device *smmu,
> +unsigned int inst, int page)

If you run checkpatch --strict on these you will get a lot of ...

CHECK: Alignment should match open parenthesis
#116: FILE: drivers/iommu/arm-smmu-nvidia.c:46:
+static inline void __iomem *nvidia_smmu_page(struct arm_smmu_device *smmu,
+  unsigned int inst, int page)

We should fix these.

> +{
> + struct nvidia_smmu *nvidia_smmu = to_nvidia_smmu(smmu);
> +
> + if (!nvidia_smmu->bases[0])
> + 

Re: [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code

2019-06-04 Thread Jon Hunter


On 04/06/2019 07:05, Christoph Hellwig wrote:
> On Mon, Jun 03, 2019 at 08:47:57PM +0100, Jon Hunter wrote:
>> Since next-20190529 one of our tests for MMC has started failing, where
>> the symptom is that the data written to the MMC does not match the
>> source. Bisecting this is pointing to this commit. Unfortunately, I am
>> not able to cleanly revert this on top of -next, but wanted to report
>> this if case you have any ideas.
> 
> Does this fix your problem?
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=generic-dma-ops=1b961423158caaae49d3900b7c9c37477bbfa9b3

Yes I can confirm with this patch on today's -next the issue is no
longer seen, and reverting this patch on top of today's -next causes the
problem to occur again. So yes this fixes my problem.

Thanks!
Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code

2019-06-03 Thread Jon Hunter


On 29/04/2019 13:56, Robin Murphy wrote:
> On 22/04/2019 18:59, Christoph Hellwig wrote:
>> There is nothing really arm64 specific in the iommu_dma_ops
>> implementation, so move it to dma-iommu.c and keep a lot of symbols
>> self-contained.  Note the implementation does depend on the
>> DMA_DIRECT_REMAP infrastructure for now, so we'll have to make the
>> DMA_IOMMU support depend on it, but this will be relaxed soon.
> 
> Nothing looks objectionable, and boot testing with this much of the
> series merged has my coherent and non-coherent IOMMU-backed devices
> appearing to still work OK, so:
> 
> Acked-by: Robin Murphy 

Since next-20190529 one of our tests for MMC has started failing, where
the symptom is that the data written to the MMC does not match the
source. Bisecting this is pointing to this commit. Unfortunately, I am
not able to cleanly revert this on top of -next, but wanted to report
this if case you have any ideas.

Cheers
Jon

-- 
nvpublic


Re: [PATCH] Revert "dma-contiguous: do not allocate a single page from CMA area"

2019-02-27 Thread Jon Hunter


On 26/02/2019 20:23, Nicolin Chen wrote:
> This reverts commit d222e42e88168fd67e6d131984b86477af1fc256.
> 
> The original change breaks omap dss:
> omapdss_dispc 58001000.dispc:
> dispc_errata_i734_wa_init: dma_alloc_writecombine failed
> 
> Let's revert it first and then find a safer solution instead.
> 
> Reported-by: Tony Lindgren 
> Signed-off-by: Nicolin Chen 
> ---
> Tony,
>   
> Would you please test and verify? Thanks!

This also fixes various memory allocation failures we have seen on
32-bit Tegra as well.

Tested-by: Jon Hunter 

Cheers
Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/tegra: Add support for struct iommu_device

2017-08-31 Thread Jon Hunter
Hi Joerg,

On 30/08/17 16:30, Joerg Roedel wrote:
> Hi Jon,
> 
> On Wed, Aug 30, 2017 at 03:22:05PM +0100, Jon Hunter wrote:
>> Yes I can confirm that this fixes the crash. I assume that you will fix
>> the error path for bus_set_iommu() above as I believe now it needs to
>> call iommu_device_sysfs_remove().
> 
> Thanks for testing the patch. I updated the error-path and put the
> commit below into the iommu-tree:
Today's -next is still failing and I don't see this fix in your public
tree yet [0]. Can you push this out?

Cheers
Jon

[0] https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/tegra: Add support for struct iommu_device

2017-08-30 Thread Jon Hunter

On 30/08/17 16:30, Joerg Roedel wrote:
> Hi Jon,
> 
> On Wed, Aug 30, 2017 at 03:22:05PM +0100, Jon Hunter wrote:
>> Yes I can confirm that this fixes the crash. I assume that you will fix
>> the error path for bus_set_iommu() above as I believe now it needs to
>> call iommu_device_sysfs_remove().
> 
> Thanks for testing the patch. I updated the error-path and put the
> commit below into the iommu-tree:
Great! Thanks, Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/tegra: Add support for struct iommu_device

2017-08-30 Thread Jon Hunter
Hi Joerg,

On 30/08/17 13:09, Joerg Roedel wrote:
> Hi Jon,
> 
> On Wed, Aug 30, 2017 at 12:04:38PM +0100, Jon Hunter wrote:
>> With next-20170829 I am seeing several Tegra boards crashing [0][1] on
>> boot in tegra_smmu_probe() and the bisect is point to this commit. Looks
>> like there maybe a sequence problem between calls to bus_set_iommu() and
>> iommu_device_add_sysfs() which results in a NULL pointer dereference.
> 
> Thanks for the report. Can you please test whether the patch below
> fixes the problem?
> 
> Thanks,
> 
>   Joerg
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 2802e12e6a54..48ffbe95a663 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -949,10 +949,6 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>  
>   tegra_smmu_ahb_enable();
>  
> - err = bus_set_iommu(_bus_type, _smmu_ops);
> - if (err < 0)
> - return ERR_PTR(err);
> -
>   err = iommu_device_sysfs_add(>iommu, dev, NULL, dev_name(dev));
>   if (err)
>   return ERR_PTR(err);
> @@ -965,6 +961,10 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>   return ERR_PTR(err);
>   }
>  
> + err = bus_set_iommu(_bus_type, _smmu_ops);
> + if (err < 0)
> + return ERR_PTR(err);
> +

Yes I can confirm that this fixes the crash. I assume that you will fix
the error path for bus_set_iommu() above as I believe now it needs to
call iommu_device_sysfs_remove().

Cheers!
Jon

-- 
nvpublic
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/tegra: Add support for struct iommu_device

2017-08-30 Thread Jon Hunter

On 09/08/17 23:29, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Add a struct iommu_device to each tegra-smmu and register it
> with the iommu-core. Also link devices added to the driver
> to their respective hardware iommus.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  drivers/iommu/tegra-smmu.c | 25 +
>  1 file changed, 25 insertions(+)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index faa9c1e..2802e12 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -36,6 +36,8 @@ struct tegra_smmu {
>   struct list_head list;
>  
>   struct dentry *debugfs;
> +
> + struct iommu_device iommu;  /* IOMMU Core code handle */
>  };
>  
>  struct tegra_smmu_as {
> @@ -720,6 +722,9 @@ static int tegra_smmu_add_device(struct device *dev)
>* first match.
>*/
>   dev->archdata.iommu = smmu;
> +
> + iommu_device_link(>iommu, dev);
> +
>   break;
>   }
>  
> @@ -737,6 +742,11 @@ static int tegra_smmu_add_device(struct device *dev)
>  
>  static void tegra_smmu_remove_device(struct device *dev)
>  {
> + struct tegra_smmu *smmu = dev->archdata.iommu;
> +
> + if (smmu)
> + iommu_device_unlink(>iommu, dev);
> +
>   dev->archdata.iommu = NULL;
>   iommu_group_remove_device(dev);
>  }
> @@ -943,6 +953,18 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>   if (err < 0)
>   return ERR_PTR(err);
>  
> + err = iommu_device_sysfs_add(>iommu, dev, NULL, dev_name(dev));
> + if (err)
> + return ERR_PTR(err);
> +
> + iommu_device_set_ops(>iommu, _smmu_ops);
> +
> + err = iommu_device_register(>iommu);
> + if (err) {
> + iommu_device_sysfs_remove(>iommu);
> + return ERR_PTR(err);
> + }
> +
>   if (IS_ENABLED(CONFIG_DEBUG_FS))
>   tegra_smmu_debugfs_init(smmu);
>  
> @@ -951,6 +973,9 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>  
>  void tegra_smmu_remove(struct tegra_smmu *smmu)
>  {
> + iommu_device_unregister(>iommu);
> + iommu_device_sysfs_remove(>iommu);
> +
>   if (IS_ENABLED(CONFIG_DEBUG_FS))
>   tegra_smmu_debugfs_exit(smmu);
>  }
> 

With next-20170829 I am seeing several Tegra boards crashing [0][1] on
boot in tegra_smmu_probe() and the bisect is point to this commit. Looks
like there maybe a sequence problem between calls to bus_set_iommu() and
iommu_device_add_sysfs() which results in a NULL pointer dereference.

You can see the full crash log here [1].

Cheers
Jon

[0] https://nvtb.github.io//linux-next/
[1]
https://nvtb.github.io//linux-next/test_next-20170829/20170829024534/boot/tegra124-jetson-tk1/tegra124-jetson-tk1/tegra_defconfig_log.txt

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu