date:20200930

This patch mainly adds support for mt8192 IOMMU and SMI.

mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor translation
table format. The M4U-SMI HW diagram is as below:

  EMI
   |
  M4U
   |
  
   SMI Common
  
   |
  +---+--+--+--+---+
  |   |  |  |   .. |   |
  |   |  |  |  |   |
larb0   larb1  larb2  larb4 ..  larb19   larb20
disp0   disp1   mdpvdec   IPE  IPE

All the connections are HW fixed, SW can NOT adjust it.

Comparing with the preview SoC, this patchset mainly adds two new functions:
a) add iova 34 bits support.
b) add multi domains support since several HW has the special iova
region requirement.

this patchset depend on v5.9-rc1.

change note:
v3:
  a) Fix DT schema issue commented from Rob.
  b) Fix a v7s issue. Use "_lvl" instead of "_l" in the 
macro(ARM_V7S_PTES_PER_LVL) since 
  it is called in ARM_V7S_LVL_IDX which has already used "_l".
  c) Fix a PM suspend issue: Avoid pm suspend in pm runtime case.

v2: 
https://lore.kernel.org/linux-iommu/20200905080920.13396-1-yong...@mediatek.com/
  a) Convert IOMMU/SMI dt-binding to DT schema.
  b) Fix some comment from Pi-Hsun and Nicolas. like use
  generic_iommu_put_resv_regions.
  c) Reword some comment, like add how to use domain-id.

v1: 
https://lore.kernel.org/linux-iommu/20200711064846.16007-1-yong...@mediatek.com/

Yong Wu (24):
  dt-bindings: iommu: mediatek: Convert IOMMU to DT schema
  dt-bindings: memory: mediatek: Convert SMI to DT schema
  dt-bindings: memory: mediatek: Add a common larb-port header file
  dt-bindings: memory: mediatek: Extend LARB_NR_MAX to 32
  dt-bindings: memory: mediatek: Add domain definition
  dt-bindings: mediatek: Add binding for mt8192 IOMMU
  iommu/mediatek: Use the common mtk-smi-larb-port.h
  iommu/io-pgtable-arm-v7s: Use ias to check the valid iova in unmap
  iommu/io-pgtable-arm-v7s: Extend PA34 for MediaTek
  iommu/io-pgtable-arm-v7s: Add cfg as a param in some macros
  iommu/io-pgtable-arm-v7s: Quad lvl1 pgtable for MediaTek
  iommu/mediatek: Move hw_init into attach_device
  iommu/mediatek: Add device link for smi-common and m4u
  iommu/mediatek: Add pm runtime callback
  iommu/mediatek: Add power-domain operation
  iommu/mediatek: Add iova reserved function
  iommu/mediatek: Add single domain
  iommu/mediatek: Support master use iova over 32bit
  iommu/mediatek: Support up to 34bit iova in tlb flush
  iommu/mediatek: Support report iova 34bit translation fault in ISR
  iommu/mediatek: Add support for multi domain
  iommu/mediatek: Adjust the structure
  iommu/mediatek: Add mt8192 support
  memory: mtk-smi: Add mt8192 support

 .../bindings/iommu/mediatek,iommu.txt | 103 ---
 .../bindings/iommu/mediatek,iommu.yaml| 159 +++
 .../mediatek,smi-common.txt   |  49 
 .../mediatek,smi-common.yaml  | 101 +++
 .../memory-controllers/mediatek,smi-larb.txt  |  49 
 .../memory-controllers/mediatek,smi-larb.yaml |  92 +++
 drivers/iommu/io-pgtable-arm-v7s.c|  57 ++--
 drivers/iommu/mtk_iommu.c | 256 +++---
 drivers/iommu/mtk_iommu.h |  11 +-
 drivers/memory/mtk-smi.c  |  27 ++
 include/dt-bindings/memory/mt2712-larb-port.h |   2 +-
 include/dt-bindings/memory/mt6779-larb-port.h |   2 +-
 include/dt-bindings/memory/mt8173-larb-port.h |   2 +-
 include/dt-bindings/memory/mt8183-larb-port.h |   2 +-
 include/dt-bindings/memory/mt8192-larb-port.h | 239 
 .../dt-bindings/memory/mtk-smi-larb-port.h|  22 ++
 include/linux/io-pgtable.h|   4 +-
 include/soc/mediatek/smi.h|   3 +-
 18 files changed, 903 insertions(+), 277 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
 create mode 100644 Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
 delete mode 100644 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
 create mode 100644 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
 delete mode 100644 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt
 create mode 100644 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
 create mode 100644 include/dt-bindings/memory/mt8192-larb-port.h
 create mode 100644 include/dt-bindings/memory/mtk-smi-larb-port.h

-- 
2.18.0


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 01/24] dt-bindings: iommu: mediatek: Convert IOMMU to DT schema

Convert MediaTek IOMMU to DT schema.

Signed-off-by: Yong Wu 
---
 .../bindings/iommu/mediatek,iommu.txt | 103 
 .../bindings/iommu/mediatek,iommu.yaml| 154 ++
 2 files changed, 154 insertions(+), 103 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
 create mode 100644 Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml

diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt 
b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
deleted file mode 100644
index c1ccd8582eb2..
--- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
+++ /dev/null
@@ -1,103 +0,0 @@
-* Mediatek IOMMU Architecture Implementation
-
-  Some Mediatek SOCs contain a Multimedia Memory Management Unit (M4U), and
-this M4U have two generations of HW architecture. Generation one uses flat
-pagetable, and only supports 4K size page mapping. Generation two uses the
-ARM Short-Descriptor translation table format for address translation.
-
-  About the M4U Hardware Block Diagram, please check below:
-
-  EMI (External Memory Interface)
-   |
-  m4u (Multimedia Memory Management Unit)
-   |
-  ++
-  ||
-  gals0-rx   gals1-rx(Global Async Local Sync rx)
-  ||
-  ||
-  gals0-tx   gals1-tx(Global Async Local Sync tx)
-  ||  Some SoCs may have GALS.
-  ++
-   |
-   SMI Common(Smart Multimedia Interface Common)
-   |
-   ++---
-   ||
-   | gals-rxThere may be GALS in some larbs.
-   ||
-   ||
-   | gals-tx
-   ||
-   SMI larb0SMI larb1   ... SoCs have several SMI local arbiter(larb).
-   (display) (vdec)
-   ||
-   ||
- +-+-+ +++
- | | | |||
- | | |...  |||  ... There are different ports in each larb.
- | | | |||
-OVL0 RDMA0 WDMA0  MC   PP   VLD
-
-  As above, The Multimedia HW will go through SMI and M4U while it
-access EMI. SMI is a bridge between m4u and the Multimedia HW. It contain
-smi local arbiter and smi common. It will control whether the Multimedia
-HW should go though the m4u for translation or bypass it and talk
-directly with EMI. And also SMI help control the power domain and clocks for
-each local arbiter.
-  Normally we specify a local arbiter(larb) for each multimedia HW
-like display, video decode, and camera. And there are different ports
-in each larb. Take a example, There are many ports like MC, PP, VLD in the
-video decode local arbiter, all these ports are according to the video HW.
-  In some SoCs, there may be a GALS(Global Async Local Sync) module between
-smi-common and m4u, and additional GALS module between smi-larb and
-smi-common. GALS can been seen as a "asynchronous fifo" which could help
-synchronize for the modules in different clock frequency.
-
-Required properties:
-- compatible : must be one of the following string:
-   "mediatek,mt2701-m4u" for mt2701 which uses generation one m4u HW.
-   "mediatek,mt2712-m4u" for mt2712 which uses generation two m4u HW.
-   "mediatek,mt6779-m4u" for mt6779 which uses generation two m4u HW.
-   "mediatek,mt7623-m4u", "mediatek,mt2701-m4u" for mt7623 which uses
-generation one m4u HW.
-   "mediatek,mt8173-m4u" for mt8173 which uses generation two m4u HW.
-   "mediatek,mt8183-m4u" for mt8183 which uses generation two m4u HW.
-- reg : m4u register base and size.
-- interrupts : the interrupt of m4u.
-- clocks : must contain one entry for each clock-names.
-- clock-names : Only 1 optional clock:
-  - "bclk": the block clock of m4u.
-  Here is the list which require this "bclk":
-  - mt2701, mt2712, mt7623 and mt8173.
-  Note that m4u use the EMI clock which always has been enabled before kernel
-  if there is no this "bclk".
-- mediatek,larbs : List of phandle to the local arbiters in the current Socs.
-   Refer to bindings/memory-controllers/mediatek,smi-larb.txt. It must sort
-   according to the local arbiter index, like larb0, larb1, larb2...
-- iommu-cells : must be 1. This is the mtk_m4u_id according to the HW.
-   Specifies the mtk_m4u_id as defined in
-   dt-binding/memory/mt2701-larb-port.h for mt2701, mt7623
-   dt-binding/memory/mt2712-larb-port.h for mt2712,
-   dt-binding/memory/mt6779-larb-port.h for mt6779,
-   dt-binding/memory/mt8173-larb-port.h for mt8173, and
-   dt-binding/memory/mt8183-larb-port.h for mt8183.
-
-Example:
-   iommu: iommu@10205000 {
-   compatible = "mediatek,mt8173-m4u";
-   reg = <0 0x10205000 0 0x10

[PATCH v3 02/24] dt-bindings: memory: mediatek: Convert SMI to DT schema

Convert MediaTek SMI to DT schema.

Signed-off-by: Yong Wu 
---
 .../mediatek,smi-common.txt   |  49 -
 .../mediatek,smi-common.yaml  | 100 ++
 .../memory-controllers/mediatek,smi-larb.txt  |  49 -
 .../memory-controllers/mediatek,smi-larb.yaml |  91 
 4 files changed, 191 insertions(+), 98 deletions(-)
 delete mode 100644 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
 create mode 100644 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
 delete mode 100644 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt
 create mode 100644 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml

diff --git 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
deleted file mode 100644
index b64573680b42..
--- 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
+++ /dev/null
@@ -1,49 +0,0 @@
-SMI (Smart Multimedia Interface) Common
-
-The hardware block diagram please check bindings/iommu/mediatek,iommu.txt
-
-Mediatek SMI have two generations of HW architecture, here is the list
-which generation the SoCs use:
-generation 1: mt2701 and mt7623.
-generation 2: mt2712, mt6779, mt8173 and mt8183.
-
-There's slight differences between the two SMI, for generation 2, the
-register which control the iommu port is at each larb's register base. But
-for generation 1, the register is at smi ao base(smi always on register
-base). Besides that, the smi async clock should be prepared and enabled for
-SMI generation 1 to transform the smi clock into emi clock domain, but that is
-not needed for SMI generation 2.
-
-Required properties:
-- compatible : must be one of :
-   "mediatek,mt2701-smi-common"
-   "mediatek,mt2712-smi-common"
-   "mediatek,mt6779-smi-common"
-   "mediatek,mt7623-smi-common", "mediatek,mt2701-smi-common"
-   "mediatek,mt8173-smi-common"
-   "mediatek,mt8183-smi-common"
-- reg : the register and size of the SMI block.
-- power-domains : a phandle to the power domain of this local arbiter.
-- clocks : Must contain an entry for each entry in clock-names.
-- clock-names : must contain 3 entries for generation 1 smi HW and 2 entries
-  for generation 2 smi HW as follows:
-  - "apb" : Advanced Peripheral Bus clock, It's the clock for setting
-   the register.
-  - "smi" : It's the clock for transfer data and command.
-   They may be the same if both source clocks are the same.
-  - "async" : asynchronous clock, it help transform the smi clock into the emi
- clock domain, this clock is only needed by generation 1 smi HW.
-  and these 2 option clocks for generation 2 smi HW:
-  - "gals0": the path0 clock of GALS(Global Async Local Sync).
-  - "gals1": the path1 clock of GALS(Global Async Local Sync).
-  Here is the list which has this GALS: mt6779 and mt8183.
-
-Example:
-   smi_common: smi@14022000 {
-   compatible = "mediatek,mt8173-smi-common";
-   reg = <0 0x14022000 0 0x1000>;
-   power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
-   clocks = <&mmsys CLK_MM_SMI_COMMON>,
-<&mmsys CLK_MM_SMI_COMMON>;
-   clock-names = "apb", "smi";
-   };
diff --git 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
new file mode 100644
index ..76ecc7205438
--- /dev/null
+++ 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
@@ -0,0 +1,100 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/memory-controllers/mediatek,smi-common.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: SMI (Smart Multimedia Interface) Common
+
+maintainers:
+  - Yong Wu 
+
+description: |+
+  The hardware block diagram please check bindings/iommu/mediatek,iommu.yaml
+
+  MediaTek SMI have two generations of HW architecture, here is the list
+  which generation the SoCs use:
+  generation 1: mt2701 and mt7623.
+  generation 2: mt2712, mt6779, mt8173 and mt8183.
+
+  There's slight differences between the two SMI, for generation 2, the
+  register which control the iommu port is at each larb's register base. But
+  for generation 1, the register is at smi ao base(smi always on register
+  base). Besides that, the smi async clock should be prepared and enabled for
+  SMI generation 1 to transform the smi clock into emi clock domain, but that 
is
+  not needed for SMI generation 2.
+
+properties:
+  compatible:
+oneOf:
+  - enum:
+  - mediatek,mt2701-smi-common
+  - mediatek,mt2712-smi-common
+  - mediatek,mt6779-smi-common
+

[PATCH v3 06/24] dt-bindings: mediatek: Add binding for mt8192 IOMMU

This patch adds decriptions for mt8192 IOMMU and SMI.

mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor translation
table format. The M4U-SMI HW diagram is as below:

  EMI
   |
  M4U
   |
  
   SMI Common
  
   |
  +---+--+--+--+---+
  |   |  |  |   .. |   |
  |   |  |  |  |   |
larb0   larb1  larb2  larb4 ..  larb19   larb20
disp0   disp1   mdpvdec   IPE  IPE

All the connections are HW fixed, SW can NOT adjust it.

mt8192 M4U support 0~16GB iova range. we preassign different engines
into different iova ranges:

domain-id  module iova-range  larbs
   0   disp0 ~ 4G  larb0/1
   1   vcodec  4G ~ 8G larb4/5/7
   2   cam/mdp 8G ~ 12G larb2/9/11/13/14/16/17/18/19/20
   3   CCU00x4000_ ~ 0x43ff_ larb13: port 9/10
   4   CCU10x4400_ ~ 0x47ff_ larb14: port 4/5

The iova range for CCU0/1(camera control unit) is HW requirement.

Signed-off-by: Yong Wu 
Reviewed-by: Rob Herring 
---
 .../bindings/iommu/mediatek,iommu.yaml|   9 +-
 .../mediatek,smi-common.yaml  |   5 +-
 .../memory-controllers/mediatek,smi-larb.yaml |   3 +-
 include/dt-bindings/memory/mt8192-larb-port.h | 239 ++
 4 files changed, 251 insertions(+), 5 deletions(-)
 create mode 100644 include/dt-bindings/memory/mt8192-larb-port.h

diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml 
b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
index eae773ad53a3..a7f41cdf3aca 100644
--- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
+++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
@@ -75,6 +75,7 @@ properties:
   - mediatek,mt6779-m4u # mt6779 generation two HW
   - mediatek,mt8173-m4u # mt8173 generation two HW
   - mediatek,mt8183-m4u # mt8183 generation two HW
+  - mediatek,mt8192-m4u # mt8192 generation two HW
 
   - description: mt7623 generation one HW
 items:
@@ -90,7 +91,7 @@ properties:
   clocks:
 description: |
   bclk is optional. here is the list which require this bclk:
-  mt2701, mt2712, mt7623 and mt8173.
+  mt2701, mt2712, mt7623, mt8173 and mt8192.
   M4U will use the EMI clock which always has been enabled before
   kernel if there is no this bclk.
 items:
@@ -116,7 +117,11 @@ properties:
   dt-binding/memory/mt2712-larb-port.h for mt2712,
   dt-binding/memory/mt6779-larb-port.h for mt6779,
   dt-binding/memory/mt8173-larb-port.h for mt8173,
-  dt-binding/memory/mt8183-larb-port.h for mt8183.
+  dt-binding/memory/mt8183-larb-port.h for mt8183,
+  dt-binding/memory/mt8192-larb-port.h for mt8192.
+
+  power-domains:
+maxItems: 1
 
 required:
   - compatible
diff --git 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
index 76ecc7205438..d3cb4d853e71 100644
--- 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
+++ 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
@@ -15,7 +15,7 @@ description: |+
   MediaTek SMI have two generations of HW architecture, here is the list
   which generation the SoCs use:
   generation 1: mt2701 and mt7623.
-  generation 2: mt2712, mt6779, mt8173 and mt8183.
+  generation 2: mt2712, mt6779, mt8173, mt8183 and mt8192.
 
   There's slight differences between the two SMI, for generation 2, the
   register which control the iommu port is at each larb's register base. But
@@ -33,6 +33,7 @@ properties:
   - mediatek,mt6779-smi-common
   - mediatek,mt8173-smi-common
   - mediatek,mt8183-smi-common
+  - mediatek,mt8192-smi-common
 
   - description: for mt7623
 items:
@@ -46,7 +47,7 @@ properties:
 description: |
   apb and smi are mandatory. the async is only for generation 1 smi HW.
   gals(global async local sync) also is optional, here is the list which
-  require gals: mt6779 and mt8183.
+  require gals: mt6779, mt8183 and mt8192.
 minItems: 2
 maxItems: 4
 items:
diff --git 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
index ea418113bf27..f5ba43638c37 100644
--- 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
+++ 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
@@ -21,6 +21,7 @@ properties:
   - mediatek,mt

[PATCH v3 05/24] dt-bindings: memory: mediatek: Add domain definition

In the latest SoC, there are several HW IP require a sepecial iova
range, mainly CCU and VPU has this requirement. Take CCU as a example,
CCU require its iova locate in the range(0x4000_ ~ 0x43ff_).

In this patch we add a domain definition for the special port. In the
example of CCU, If we preassign CCU port in domain1, then iommu driver
will prepare a independent iommu domain of the special iova range for it,
then the iova got from dma_alloc_attrs(ccu-dev) will locate in its special
range.

This is a preparing patch for multi-domain support.

Signed-off-by: Yong Wu 
---
 include/dt-bindings/memory/mtk-smi-larb-port.h | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/dt-bindings/memory/mtk-smi-larb-port.h 
b/include/dt-bindings/memory/mtk-smi-larb-port.h
index f4d8e3aed0bc..d00f5de8438b 100644
--- a/include/dt-bindings/memory/mtk-smi-larb-port.h
+++ b/include/dt-bindings/memory/mtk-smi-larb-port.h
@@ -7,9 +7,16 @@
 #define __DTS_MTK_IOMMU_PORT_H_
 
 #define MTK_LARB_NR_MAX32
+#define MTK_M4U_DOM_NR_MAX 8
+
+#define MTK_M4U_DOM_ID(domid, larb, port)  \
+   (((domid) & 0x7) << 16 | (((larb) & 0x1f) << 5) | ((port) & 0x1f))
+
+/* The default dom id is 0. */
+#define MTK_M4U_ID(larb, port) MTK_M4U_DOM_ID(0, larb, port)
 
-#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
 #define MTK_M4U_TO_LARB(id)(((id) >> 5) & 0x1f)
 #define MTK_M4U_TO_PORT(id)((id) & 0x1f)
+#define MTK_M4U_TO_DOM(id) (((id) >> 16) & 0x7)
 
 #endif
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 04/24] dt-bindings: memory: mediatek: Extend LARB_NR_MAX to 32

Extend the max larb number definition as mt8192 has larb_nr over 16.

Signed-off-by: Yong Wu 
Acked-by: Rob Herring 
---
 .../bindings/memory-controllers/mediatek,smi-larb.yaml| 2 +-
 include/dt-bindings/memory/mtk-smi-larb-port.h| 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
index 50793a0e6759..ea418113bf27 100644
--- 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
+++ 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
@@ -62,7 +62,7 @@ properties:
   mediatek,larb-id:
 $ref: /schemas/types.yaml#/definitions/uint32
 minimum: 0
-maximum: 15
+maximum: 31
 description: the hardware id of this larb.
   Required property for mt2701, mt2712, mt6779 and mt7623.
 
diff --git a/include/dt-bindings/memory/mtk-smi-larb-port.h 
b/include/dt-bindings/memory/mtk-smi-larb-port.h
index 2ec7fe5ce4e9..f4d8e3aed0bc 100644
--- a/include/dt-bindings/memory/mtk-smi-larb-port.h
+++ b/include/dt-bindings/memory/mtk-smi-larb-port.h
@@ -6,10 +6,10 @@
 #ifndef __DTS_MTK_IOMMU_PORT_H_
 #define __DTS_MTK_IOMMU_PORT_H_
 
-#define MTK_LARB_NR_MAX16
+#define MTK_LARB_NR_MAX32
 
 #define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
-#define MTK_M4U_TO_LARB(id)(((id) >> 5) & 0xf)
+#define MTK_M4U_TO_LARB(id)(((id) >> 5) & 0x1f)
 #define MTK_M4U_TO_PORT(id)((id) & 0x1f)
 
 #endif
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 03/24] dt-bindings: memory: mediatek: Add a common larb-port header file

Put all the macros about smi larb/port togethers, this is a preparing
patch for extending LARB_NR and adding new dom-id support.

Signed-off-by: Yong Wu 
Acked-by: Rob Herring 
---
 include/dt-bindings/memory/mt2712-larb-port.h  |  2 +-
 include/dt-bindings/memory/mt6779-larb-port.h  |  2 +-
 include/dt-bindings/memory/mt8173-larb-port.h  |  2 +-
 include/dt-bindings/memory/mt8183-larb-port.h  |  2 +-
 include/dt-bindings/memory/mtk-smi-larb-port.h | 15 +++
 5 files changed, 19 insertions(+), 4 deletions(-)
 create mode 100644 include/dt-bindings/memory/mtk-smi-larb-port.h

diff --git a/include/dt-bindings/memory/mt2712-larb-port.h 
b/include/dt-bindings/memory/mt2712-larb-port.h
index 6f9aa7349cef..b6b2c6bf4459 100644
--- a/include/dt-bindings/memory/mt2712-larb-port.h
+++ b/include/dt-bindings/memory/mt2712-larb-port.h
@@ -6,7 +6,7 @@
 #ifndef __DTS_IOMMU_PORT_MT2712_H
 #define __DTS_IOMMU_PORT_MT2712_H
 
-#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
+#include 
 
 #define M4U_LARB0_ID   0
 #define M4U_LARB1_ID   1
diff --git a/include/dt-bindings/memory/mt6779-larb-port.h 
b/include/dt-bindings/memory/mt6779-larb-port.h
index 2ad0899fbf2f..60f57f54393e 100644
--- a/include/dt-bindings/memory/mt6779-larb-port.h
+++ b/include/dt-bindings/memory/mt6779-larb-port.h
@@ -7,7 +7,7 @@
 #ifndef _DTS_IOMMU_PORT_MT6779_H_
 #define _DTS_IOMMU_PORT_MT6779_H_
 
-#define MTK_M4U_ID(larb, port)  (((larb) << 5) | (port))
+#include 
 
 #define M4U_LARB0_ID0
 #define M4U_LARB1_ID1
diff --git a/include/dt-bindings/memory/mt8173-larb-port.h 
b/include/dt-bindings/memory/mt8173-larb-port.h
index 9f31ccfeca21..d8c99c946053 100644
--- a/include/dt-bindings/memory/mt8173-larb-port.h
+++ b/include/dt-bindings/memory/mt8173-larb-port.h
@@ -6,7 +6,7 @@
 #ifndef __DTS_IOMMU_PORT_MT8173_H
 #define __DTS_IOMMU_PORT_MT8173_H
 
-#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
+#include 
 
 #define M4U_LARB0_ID   0
 #define M4U_LARB1_ID   1
diff --git a/include/dt-bindings/memory/mt8183-larb-port.h 
b/include/dt-bindings/memory/mt8183-larb-port.h
index 2c579f305162..275c095a6fd6 100644
--- a/include/dt-bindings/memory/mt8183-larb-port.h
+++ b/include/dt-bindings/memory/mt8183-larb-port.h
@@ -6,7 +6,7 @@
 #ifndef __DTS_IOMMU_PORT_MT8183_H
 #define __DTS_IOMMU_PORT_MT8183_H
 
-#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
+#include 
 
 #define M4U_LARB0_ID   0
 #define M4U_LARB1_ID   1
diff --git a/include/dt-bindings/memory/mtk-smi-larb-port.h 
b/include/dt-bindings/memory/mtk-smi-larb-port.h
new file mode 100644
index ..2ec7fe5ce4e9
--- /dev/null
+++ b/include/dt-bindings/memory/mtk-smi-larb-port.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2020 MediaTek Inc.
+ * Author: Yong Wu 
+ */
+#ifndef __DTS_MTK_IOMMU_PORT_H_
+#define __DTS_MTK_IOMMU_PORT_H_
+
+#define MTK_LARB_NR_MAX16
+
+#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
+#define MTK_M4U_TO_LARB(id)(((id) >> 5) & 0xf)
+#define MTK_M4U_TO_PORT(id)((id) & 0x1f)
+
+#endif
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 09/24] iommu/io-pgtable-arm-v7s: Extend PA34 for MediaTek

MediaTek extend the bit5 in lvl1 and lvl2 descriptor as PA34.

Signed-off-by: Yong Wu 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 9 +++--
 drivers/iommu/mtk_iommu.c  | 2 +-
 include/linux/io-pgtable.h | 4 ++--
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 4c9d8dccfc5a..a3b3e9147b8d 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -112,9 +112,10 @@
 #define ARM_V7S_TEX_MASK   0x7
 #define ARM_V7S_ATTR_TEX(val)  (((val) & ARM_V7S_TEX_MASK) << 
ARM_V7S_TEX_SHIFT)
 
-/* MediaTek extend the two bits for PA 32bit/33bit */
+/* MediaTek extend the bits below for PA 32bit/33bit/34bit */
 #define ARM_V7S_ATTR_MTK_PA_BIT32  BIT(9)
 #define ARM_V7S_ATTR_MTK_PA_BIT33  BIT(4)
+#define ARM_V7S_ATTR_MTK_PA_BIT34  BIT(5)
 
 /* *well, except for TEX on level 2 large pages, of course :( */
 #define ARM_V7S_CONT_PAGE_TEX_SHIFT6
@@ -194,6 +195,8 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int 
lvl,
pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
if (paddr & BIT_ULL(33))
pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
+   if (paddr & BIT_ULL(34))
+   pte |= ARM_V7S_ATTR_MTK_PA_BIT34;
return pte;
 }
 
@@ -218,6 +221,8 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int 
lvl,
paddr |= BIT_ULL(32);
if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
paddr |= BIT_ULL(33);
+   if (pte & ARM_V7S_ATTR_MTK_PA_BIT34)
+   paddr |= BIT_ULL(34);
return paddr;
 }
 
@@ -753,7 +758,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
if (cfg->ias > ARM_V7S_ADDR_BITS)
return NULL;
 
-   if (cfg->oas > (arm_v7s_is_mtk_enabled(cfg) ? 34 : ARM_V7S_ADDR_BITS))
+   if (cfg->oas > (arm_v7s_is_mtk_enabled(cfg) ? 35 : ARM_V7S_ADDR_BITS))
return NULL;
 
if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index fd8322ee4980..f6a2e3eb59d2 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -317,7 +317,7 @@ static int mtk_iommu_domain_finalise(struct 
mtk_iommu_domain *dom)
IO_PGTABLE_QUIRK_ARM_MTK_EXT,
.pgsize_bitmap = mtk_iommu_ops.pgsize_bitmap,
.ias = 32,
-   .oas = 34,
+   .oas = 35,
.tlb = &mtk_iommu_flush_ops,
.iommu_dev = data->dev,
};
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 23285ba645db..8c9a889d8df9 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -77,8 +77,8 @@ struct io_pgtable_cfg {
 *  TLB maintenance when mapping as well as when unmapping.
 *
 * IO_PGTABLE_QUIRK_ARM_MTK_EXT: (ARM v7s format) MediaTek IOMMUs extend
-*  to support up to 34 bits PA where the bit32 and bit33 are
-*  encoded in the bit9 and bit4 of the PTE respectively.
+*  to support up to 35 bits PA where the bit32, bit33 and bit34 are
+*  encoded in the bit9, bit4 and bit5 of the PTE respectively.
 *
 * IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs
 *  on unmap, for DMA domains using the flush queue mechanism for
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 10/24] iommu/io-pgtable-arm-v7s: Add cfg as a param in some macros

Add "cfg" as a parameter for some macros. This is a preparing patch for
mediatek extend the lvl1 pgtable. No functional change.

Signed-off-by: Yong Wu 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 34 +++---
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index a3b3e9147b8d..8362fdf76657 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -53,17 +53,17 @@
 #define ARM_V7S_LVL_SHIFT(lvl) (ARM_V7S_ADDR_BITS - (4 + 8 * (lvl)))
 #define ARM_V7S_TABLE_SHIFT10
 
-#define ARM_V7S_PTES_PER_LVL(lvl)  (1 << _ARM_V7S_LVL_BITS(lvl))
-#define ARM_V7S_TABLE_SIZE(lvl)
\
-   (ARM_V7S_PTES_PER_LVL(lvl) * sizeof(arm_v7s_iopte))
+#define ARM_V7S_PTES_PER_LVL(lvl, cfg) (1 << _ARM_V7S_LVL_BITS(lvl))
+#define ARM_V7S_TABLE_SIZE(lvl, cfg)   \
+   (ARM_V7S_PTES_PER_LVL(lvl, cfg) * sizeof(arm_v7s_iopte))
 
 #define ARM_V7S_BLOCK_SIZE(lvl)(1UL << ARM_V7S_LVL_SHIFT(lvl))
 #define ARM_V7S_LVL_MASK(lvl)  ((u32)(~0U << ARM_V7S_LVL_SHIFT(lvl)))
 #define ARM_V7S_TABLE_MASK ((u32)(~0U << ARM_V7S_TABLE_SHIFT))
-#define _ARM_V7S_IDX_MASK(lvl) (ARM_V7S_PTES_PER_LVL(lvl) - 1)
-#define ARM_V7S_LVL_IDX(addr, lvl) ({  \
+#define _ARM_V7S_IDX_MASK(lvl, cfg)(ARM_V7S_PTES_PER_LVL(lvl, cfg) - 1)
+#define ARM_V7S_LVL_IDX(addr, lvl, cfg)({  \
int _l = lvl;   \
-   ((u32)(addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l); \
+   ((u32)(addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l, cfg); \
 })
 
 /*
@@ -239,7 +239,7 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
struct device *dev = cfg->iommu_dev;
phys_addr_t phys;
dma_addr_t dma;
-   size_t size = ARM_V7S_TABLE_SIZE(lvl);
+   size_t size = ARM_V7S_TABLE_SIZE(lvl, cfg);
void *table = NULL;
 
if (lvl == 1)
@@ -285,7 +285,7 @@ static void __arm_v7s_free_table(void *table, int lvl,
 {
struct io_pgtable_cfg *cfg = &data->iop.cfg;
struct device *dev = cfg->iommu_dev;
-   size_t size = ARM_V7S_TABLE_SIZE(lvl);
+   size_t size = ARM_V7S_TABLE_SIZE(lvl, cfg);
 
if (!cfg->coherent_walk)
dma_unmap_single(dev, __arm_v7s_dma_addr(table), size,
@@ -429,7 +429,7 @@ static int arm_v7s_init_pte(struct arm_v7s_io_pgtable *data,
arm_v7s_iopte *tblp;
size_t sz = ARM_V7S_BLOCK_SIZE(lvl);
 
-   tblp = ptep - ARM_V7S_LVL_IDX(iova, lvl);
+   tblp = ptep - ARM_V7S_LVL_IDX(iova, lvl, cfg);
if (WARN_ON(__arm_v7s_unmap(data, NULL, iova + i * sz,
sz, lvl, tblp) != sz))
return -EINVAL;
@@ -482,7 +482,7 @@ static int __arm_v7s_map(struct arm_v7s_io_pgtable *data, 
unsigned long iova,
int num_entries = size >> ARM_V7S_LVL_SHIFT(lvl);
 
/* Find our entry at the current level */
-   ptep += ARM_V7S_LVL_IDX(iova, lvl);
+   ptep += ARM_V7S_LVL_IDX(iova, lvl, cfg);
 
/* If we can install a leaf entry at this level, then do so */
if (num_entries)
@@ -554,7 +554,7 @@ static void arm_v7s_free_pgtable(struct io_pgtable *iop)
struct arm_v7s_io_pgtable *data = io_pgtable_to_data(iop);
int i;
 
-   for (i = 0; i < ARM_V7S_PTES_PER_LVL(1); i++) {
+   for (i = 0; i < ARM_V7S_PTES_PER_LVL(1, &data->iop.cfg); i++) {
arm_v7s_iopte pte = data->pgd[i];
 
if (ARM_V7S_PTE_IS_TABLE(pte, 1))
@@ -606,9 +606,9 @@ static size_t arm_v7s_split_blk_unmap(struct 
arm_v7s_io_pgtable *data,
if (!tablep)
return 0; /* Bytes unmapped */
 
-   num_ptes = ARM_V7S_PTES_PER_LVL(2);
+   num_ptes = ARM_V7S_PTES_PER_LVL(2, cfg);
num_entries = size >> ARM_V7S_LVL_SHIFT(2);
-   unmap_idx = ARM_V7S_LVL_IDX(iova, 2);
+   unmap_idx = ARM_V7S_LVL_IDX(iova, 2, cfg);
 
pte = arm_v7s_prot_to_pte(arm_v7s_pte_to_prot(blk_pte, 1), 2, cfg);
if (num_entries > 1)
@@ -650,7 +650,7 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable 
*data,
if (WARN_ON(lvl > 2))
return 0;
 
-   idx = ARM_V7S_LVL_IDX(iova, lvl);
+   idx = ARM_V7S_LVL_IDX(iova, lvl, &iop->cfg);
ptep += idx;
do {
pte[i] = READ_ONCE(ptep[i]);
@@ -736,7 +736,7 @@ static phys_addr_t arm_v7s_iova_to_phys(struct 
io_pgtable_ops *ops,
u32 mask;
 
do {
-   ptep += ARM_V7S_LVL_IDX(iova, ++lvl);
+   ptep += ARM_V7S_LVL_IDX(iova, ++lvl, &data->iop.cfg);
pte = READ_ONCE(*ptep);
ptep = iopte_deref(pte,

[PATCH v3 11/24] iommu/io-pgtable-arm-v7s: Quad lvl1 pgtable for MediaTek

The standard input iova bits is 32. MediaTek quad the lvl1 pagetable
(4 * lvl1). No change for lvl2 pagetable. Then the iova bits can reach
34bit.

Signed-off-by: Yong Wu 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 13 ++---
 drivers/iommu/mtk_iommu.c  |  2 +-
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 8362fdf76657..306bae2755ed 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -50,10 +50,17 @@
  */
 #define ARM_V7S_ADDR_BITS  32
 #define _ARM_V7S_LVL_BITS(lvl) (16 - (lvl) * 4)
+/* MediaTek: totally 34bits, 14bits at lvl1 and 8bits at lvl2. */
+#define _ARM_V7S_LVL_BITS_MTK(lvl) (20 - (lvl) * 6)
 #define ARM_V7S_LVL_SHIFT(lvl) (ARM_V7S_ADDR_BITS - (4 + 8 * (lvl)))
 #define ARM_V7S_TABLE_SHIFT10
 
-#define ARM_V7S_PTES_PER_LVL(lvl, cfg) (1 << _ARM_V7S_LVL_BITS(lvl))
+#define ARM_V7S_PTES_PER_LVL(lvl, cfg) ({  \
+   int _lvl = lvl; \
+   !arm_v7s_is_mtk_enabled(cfg) ?  \
+(1 << _ARM_V7S_LVL_BITS(_lvl)) : (1 << _ARM_V7S_LVL_BITS_MTK(_lvl));\
+})
+
 #define ARM_V7S_TABLE_SIZE(lvl, cfg)   \
(ARM_V7S_PTES_PER_LVL(lvl, cfg) * sizeof(arm_v7s_iopte))
 
@@ -63,7 +70,7 @@
 #define _ARM_V7S_IDX_MASK(lvl, cfg)(ARM_V7S_PTES_PER_LVL(lvl, cfg) - 1)
 #define ARM_V7S_LVL_IDX(addr, lvl, cfg)({  \
int _l = lvl;   \
-   ((u32)(addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l, cfg); \
+   ((addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l, cfg); \
 })
 
 /*
@@ -755,7 +762,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
 {
struct arm_v7s_io_pgtable *data;
 
-   if (cfg->ias > ARM_V7S_ADDR_BITS)
+   if (cfg->ias > (arm_v7s_is_mtk_enabled(cfg) ? 34 : ARM_V7S_ADDR_BITS))
return NULL;
 
if (cfg->oas > (arm_v7s_is_mtk_enabled(cfg) ? 35 : ARM_V7S_ADDR_BITS))
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index f6a2e3eb59d2..6e85c9976a33 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -316,7 +316,7 @@ static int mtk_iommu_domain_finalise(struct 
mtk_iommu_domain *dom)
IO_PGTABLE_QUIRK_TLBI_ON_MAP |
IO_PGTABLE_QUIRK_ARM_MTK_EXT,
.pgsize_bitmap = mtk_iommu_ops.pgsize_bitmap,
-   .ias = 32,
+   .ias = 34,
.oas = 35,
.tlb = &mtk_iommu_flush_ops,
.iommu_dev = data->dev,
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 07/24] iommu/mediatek: Use the common mtk-smi-larb-port.h

Use the common larb-port header in the source code.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c  | 7 ---
 drivers/iommu/mtk_iommu.h  | 1 +
 drivers/memory/mtk-smi.c   | 1 +
 include/soc/mediatek/smi.h | 2 --
 4 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 785b228d39a6..fd8322ee4980 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -101,13 +101,6 @@
 
 #define MTK_PROTECT_PA_ALIGN   256
 
-/*
- * Get the local arbiter ID and the portid within the larb arbiter
- * from mtk_m4u_id which is defined by MTK_M4U_ID.
- */
-#define MTK_M4U_TO_LARB(id)(((id) >> 5) & 0xf)
-#define MTK_M4U_TO_PORT(id)((id) & 0x1f)
-
 #define HAS_4GB_MODE   BIT(0)
 /* HW will use the EMI clock if there isn't the "bclk". */
 #define HAS_BCLK   BIT(1)
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 122925dbe547..a2e2c844b96e 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define MTK_LARB_COM_MAX   8
 #define MTK_LARB_SUBCOM_MAX4
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index c21262502581..ec83f1ac48b1 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /* mt8173 */
diff --git a/include/soc/mediatek/smi.h b/include/soc/mediatek/smi.h
index 5a34b87d89e3..9371bf572ab8 100644
--- a/include/soc/mediatek/smi.h
+++ b/include/soc/mediatek/smi.h
@@ -11,8 +11,6 @@
 
 #ifdef CONFIG_MTK_SMI
 
-#define MTK_LARB_NR_MAX16
-
 #define MTK_SMI_MMU_EN(port)   BIT(port)
 
 struct mtk_smi_larb_iommu {
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 08/24] iommu/io-pgtable-arm-v7s: Use ias to check the valid iova in unmap

Use the ias for the valid iova checking in arm_v7s_unmap. This is a
preparing patch for supporting iova 34bit for MediaTek.
BTW, change the ias/oas checking format in arm_v7s_map.

Signed-off-by: Yong Wu 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index a688f22cbe3b..4c9d8dccfc5a 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -526,8 +526,7 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned 
long iova,
if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
return 0;
 
-   if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
-   paddr >= (1ULL << data->iop.cfg.oas)))
+   if (WARN_ON(iova >> data->iop.cfg.ias || paddr >> data->iop.cfg.oas))
return -ERANGE;
 
ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd, gfp);
@@ -717,7 +716,7 @@ static size_t arm_v7s_unmap(struct io_pgtable_ops *ops, 
unsigned long iova,
 {
struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
 
-   if (WARN_ON(upper_32_bits(iova)))
+   if (WARN_ON(iova >> data->iop.cfg.ias))
return 0;
 
return __arm_v7s_unmap(data, gather, iova, size, 1, data->pgd);
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 12/24] iommu/mediatek: Move hw_init into attach_device

In attach device, it will update the pagetable base address register.
Move the hw_init function also here. Then it only need call
pm_runtime_get/put one time here if m4u has power domain.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 6e85c9976a33..940b7a9191b2 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -122,6 +122,8 @@ struct mtk_iommu_domain {
 
 static const struct iommu_ops mtk_iommu_ops;
 
+static int mtk_iommu_hw_init(const struct mtk_iommu_data *data);
+
 /*
  * In M4U 4GB mode, the physical address is remapped as below:
  *
@@ -377,12 +379,16 @@ static int mtk_iommu_attach_device(struct iommu_domain 
*domain,
 {
struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
+   int ret;
 
if (!data)
return -ENODEV;
 
/* Update the pgtable base address register of the M4U HW */
if (!data->m4u_dom) {
+   ret = mtk_iommu_hw_init(data);
+   if (ret)
+   return ret;
data->m4u_dom = dom;
writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
   data->base + REG_MMU_PT_BASE_ADDR);
@@ -705,10 +711,6 @@ static int mtk_iommu_probe(struct platform_device *pdev)
 
platform_set_drvdata(pdev, data);
 
-   ret = mtk_iommu_hw_init(data);
-   if (ret)
-   return ret;
-
ret = iommu_device_sysfs_add(&data->iommu, dev, NULL,
 "mtk-iommu.%pa", &ioaddr);
if (ret)
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 13/24] iommu/mediatek: Add device link for smi-common and m4u

In the lastest SoC, M4U has its special power domain. thus, If the engine
begin to work, it should help enable the power for M4U firstly.
Currently if the engine work, it always enable the power/clocks for
smi-larbs/smi-common. This patch adds device_link for smi-common and M4U.
then, if smi-common power is enabled, the M4U power also is powered on
automatically.

Normally M4U connect with several smi-larbs and their smi-common always
are the same, In this patch it get smi-common dev from the first smi-larb
device(i==0), then add the device_link only while m4u has power-domain.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 24 +++-
 drivers/iommu/mtk_iommu.h |  1 +
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 940b7a9191b2..5625458b21ba 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -681,7 +682,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
return larb_nr;
 
for (i = 0; i < larb_nr; i++) {
-   struct device_node *larbnode;
+   struct device_node *larbnode, *smicomm_node;
struct platform_device *plarbdev;
u32 id;
 
@@ -707,6 +708,15 @@ static int mtk_iommu_probe(struct platform_device *pdev)
 
component_match_add_release(dev, &match, release_of,
compare_of, larbnode);
+   if (!i) {
+   smicomm_node = of_parse_phandle(larbnode, 
"mediatek,smi", 0);
+   if (!smicomm_node)
+   return -EINVAL;
+
+   plarbdev = of_find_device_by_node(smicomm_node);
+   of_node_put(smicomm_node);
+   data->smicomm_dev = &plarbdev->dev;
+   }
}
 
platform_set_drvdata(pdev, data);
@@ -729,6 +739,16 @@ static int mtk_iommu_probe(struct platform_device *pdev)
if (!iommu_present(&platform_bus_type))
bus_set_iommu(&platform_bus_type, &mtk_iommu_ops);
 
+   if (dev->pm_domain) {
+   struct device_link *link;
+
+   link = device_link_add(data->smicomm_dev, dev,
+  DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME);
+   if (!link) {
+   dev_err(dev, "Unable link %s.\n", 
dev_name(data->smicomm_dev));
+   return -EINVAL;
+   }
+   }
return component_master_add_with_match(dev, &mtk_iommu_com_ops, match);
 }
 
@@ -743,6 +763,8 @@ static int mtk_iommu_remove(struct platform_device *pdev)
bus_set_iommu(&platform_bus_type, NULL);
 
clk_disable_unprepare(data->bclk);
+   if (pdev->dev.pm_domain)
+   device_link_remove(data->smicomm_dev, &pdev->dev);
devm_free_irq(&pdev->dev, data->irq, data);
component_master_del(&pdev->dev, &mtk_iommu_com_ops);
return 0;
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index a2e2c844b96e..ae7909815cdb 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -67,6 +67,7 @@ struct mtk_iommu_data {
 
struct iommu_device iommu;
const struct mtk_iommu_plat_data *plat_data;
+   struct device   *smicomm_dev;
 
struct dma_iommu_mapping*mapping; /* For mtk_iommu_v1.c */
 
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 17/24] iommu/mediatek: Add single domain

Defaultly the iova range is 0-4G. here we add a single-domain(0-4G)
for the previous SoC. this also is a preparing patch for supporting
multi-domains.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 8e2a8e29d13c..87ca4f47e494 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -158,6 +158,10 @@ struct mtk_iommu_iova_region {
unsigned long long  size;
 };
 
+static const struct mtk_iommu_iova_region single_domain[] = {
+   {.iova_base = 0,.size = SZ_4G},
+};
+
 /*
  * There may be 1 or 2 M4U HWs, But we always expect they are in the same 
domain
  * for the performance.
@@ -886,6 +890,8 @@ static const struct mtk_iommu_plat_data mt2712_data = {
.m4u_plat = M4U_MT2712,
.flags= HAS_4GB_MODE | HAS_BCLK | HAS_VLD_PA_RNG,
.inv_sel_reg  = REG_MMU_INV_SEL_GEN1,
+   .iova_region  = single_domain,
+   .iova_region_nr = ARRAY_SIZE(single_domain),
.larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}},
 };
 
@@ -893,6 +899,8 @@ static const struct mtk_iommu_plat_data mt6779_data = {
.m4u_plat  = M4U_MT6779,
.flags = HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN,
.inv_sel_reg   = REG_MMU_INV_SEL_GEN2,
+   .iova_region   = single_domain,
+   .iova_region_nr = ARRAY_SIZE(single_domain),
.larbid_remap  = {{0}, {1}, {2}, {3}, {5}, {7, 8}, {10}, {9}},
 };
 
@@ -900,6 +908,8 @@ static const struct mtk_iommu_plat_data mt8173_data = {
.m4u_plat = M4U_MT8173,
.flags= HAS_4GB_MODE | HAS_BCLK | RESET_AXI,
.inv_sel_reg  = REG_MMU_INV_SEL_GEN1,
+   .iova_region  = single_domain,
+   .iova_region_nr = ARRAY_SIZE(single_domain),
.larbid_remap = {{0}, {1}, {2}, {3}, {4}, {5}}, /* Linear mapping. */
 };
 
@@ -907,6 +917,8 @@ static const struct mtk_iommu_plat_data mt8183_data = {
.m4u_plat = M4U_MT8183,
.flags= RESET_AXI,
.inv_sel_reg  = REG_MMU_INV_SEL_GEN1,
+   .iova_region  = single_domain,
+   .iova_region_nr = ARRAY_SIZE(single_domain),
.larbid_remap = {{0}, {4}, {5}, {6}, {7}, {2}, {3}, {1}},
 };
 
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 15/24] iommu/mediatek: Add power-domain operation

In the previous SoC, the M4U HW is in the EMI power domain which is
always on. the latest M4U is in the display power domain which may be
turned on/off, thus we have to add pm_runtime interface for it.

When the engine work, the engine always enable the power and clocks for
smi-larb/smi-common, then the M4U's power will always be powered on
automatically via the device link with smi-common.

Note: we don't enable the M4U power in iommu_map/unmap for tlb flush.
If its power already is on, of course it is ok. if the power is off,
the main tlb will be reset while M4U power on, thus the tlb flush while
m4u power off is unnecessary, just skip it.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 052eb72acf69..1e6e6d3fa7f1 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -196,6 +196,10 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long 
iova, size_t size,
u32 tmp;
 
for_each_m4u(data) {
+   /* skip tlb flush when pm is not active. */
+   if (!pm_runtime_active(data->dev))
+   continue;
+
spin_lock_irqsave(&data->tlb_lock, flags);
writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
   data->base + data->plat_data->inv_sel_reg);
@@ -380,6 +384,7 @@ static int mtk_iommu_attach_device(struct iommu_domain 
*domain,
 {
struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
+   struct device *m4udev = data->dev;
int ret;
 
if (!data)
@@ -387,12 +392,18 @@ static int mtk_iommu_attach_device(struct iommu_domain 
*domain,
 
/* Update the pgtable base address register of the M4U HW */
if (!data->m4u_dom) {
+   ret = pm_runtime_get_sync(m4udev);
+   if (ret < 0)
+   return ret;
ret = mtk_iommu_hw_init(data);
-   if (ret)
+   if (ret) {
+   pm_runtime_put(m4udev);
return ret;
+   }
data->m4u_dom = dom;
writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
   data->base + REG_MMU_PT_BASE_ADDR);
+   pm_runtime_put(m4udev);
}
 
mtk_iommu_config(data, dev, true);
@@ -742,10 +753,13 @@ static int mtk_iommu_probe(struct platform_device *pdev)
if (dev->pm_domain) {
struct device_link *link;
 
+   pm_runtime_enable(dev);
+
link = device_link_add(data->smicomm_dev, dev,
   DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME);
if (!link) {
dev_err(dev, "Unable link %s.\n", 
dev_name(data->smicomm_dev));
+   pm_runtime_disable(dev);
return -EINVAL;
}
}
@@ -763,8 +777,10 @@ static int mtk_iommu_remove(struct platform_device *pdev)
bus_set_iommu(&platform_bus_type, NULL);
 
clk_disable_unprepare(data->bclk);
-   if (pdev->dev.pm_domain)
+   if (pdev->dev.pm_domain) {
device_link_remove(data->smicomm_dev, &pdev->dev);
+   pm_runtime_disable(&pdev->dev);
+   }
devm_free_irq(&pdev->dev, data->irq, data);
component_master_del(&pdev->dev, &mtk_iommu_com_ops);
return 0;
@@ -796,6 +812,9 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct 
device *dev)
void __iomem *base = data->base;
int ret;
 
+   /* Avoid first resume to affect the default value of registers below. */
+   if (!m4u_dom)
+   return 0;
ret = clk_prepare_enable(data->bclk);
if (ret) {
dev_err(data->dev, "Failed to enable clk(%d) in resume\n", ret);
@@ -809,9 +828,7 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct 
device *dev)
writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL);
writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG);
-   if (m4u_dom)
-   writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
-  base + REG_MMU_PT_BASE_ADDR);
+   writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK, base + 
REG_MMU_PT_BASE_ADDR);
return 0;
 }
 
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 16/24] iommu/mediatek: Add iova reserved function

For multiple iommu_domains, we need to reserve some iova regions. Take a
example, If the default iova region is 0 ~ 4G, but the 0x4000_ ~
0x43ff_ is only for the special CCU0 domain. Thus we should exclude
this region for the default iova region.

This patch adds iova reserved flow. It's a preparing patch for supporting
multi-domain.

Signed-off-by: Anan sun 
Signed-off-by: Chao Hao 
Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 28 
 drivers/iommu/mtk_iommu.h |  5 +
 2 files changed, 33 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 1e6e6d3fa7f1..8e2a8e29d13c 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -153,6 +153,11 @@ static LIST_HEAD(m4ulist); /* List all the M4U HWs */
 
 #define for_each_m4u(data) list_for_each_entry(data, &m4ulist, list)
 
+struct mtk_iommu_iova_region {
+   dma_addr_t  iova_base;
+   unsigned long long  size;
+};
+
 /*
  * There may be 1 or 2 M4U HWs, But we always expect they are in the same 
domain
  * for the performance.
@@ -539,6 +544,27 @@ static int mtk_iommu_of_xlate(struct device *dev, struct 
of_phandle_args *args)
return iommu_fwspec_add_ids(dev, args->args, 1);
 }
 
+static void mtk_iommu_get_resv_regions(struct device *dev,
+  struct list_head *head)
+{
+   struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
+   const struct mtk_iommu_iova_region *resv;
+   struct iommu_resv_region *region;
+   int prot = IOMMU_WRITE | IOMMU_READ;
+   unsigned int i;
+
+   for (i = 0; i < data->plat_data->iova_region_nr; i++) {
+   resv = data->plat_data->iova_region + i;
+
+   region = iommu_alloc_resv_region(resv->iova_base, resv->size,
+prot, IOMMU_RESV_RESERVED);
+   if (!region)
+   return;
+
+   list_add_tail(®ion->list, head);
+   }
+}
+
 static const struct iommu_ops mtk_iommu_ops = {
.domain_alloc   = mtk_iommu_domain_alloc,
.domain_free= mtk_iommu_domain_free,
@@ -553,6 +579,8 @@ static const struct iommu_ops mtk_iommu_ops = {
.release_device = mtk_iommu_release_device,
.device_group   = mtk_iommu_device_group,
.of_xlate   = mtk_iommu_of_xlate,
+   .get_resv_regions = mtk_iommu_get_resv_regions,
+   .put_resv_regions = generic_iommu_put_resv_regions,
.pgsize_bitmap  = SZ_4K | SZ_64K | SZ_1M | SZ_16M,
 };
 
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index ae7909815cdb..d45c13c9d324 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -44,10 +44,15 @@ enum mtk_iommu_plat {
M4U_MT8183,
 };
 
+struct mtk_iommu_iova_region;
+
 struct mtk_iommu_plat_data {
enum mtk_iommu_plat m4u_plat;
u32 flags;
u32 inv_sel_reg;
+
+   unsigned intiova_region_nr;
+   const struct mtk_iommu_iova_region   *iova_region;
unsigned char   larbid_remap[MTK_LARB_COM_MAX][MTK_LARB_SUBCOM_MAX];
 };
 
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 18/24] iommu/mediatek: Support master use iova over 32bit

After extending v7s, our pagetable already support iova reach
16GB(34bit). the master got the iova via dma_alloc_attrs may reach
34bits, but its HW register still is 32bit. then how to set the
bit32/bit33 iova? this depend on a SMI larb setting(bank_sel).

we separate whole 16GB iova to four banks:
bank: 0: 0~4G; 1: 4~8G; 2: 8-12G; 3: 12-16G;
The bank number is (iova >> 32).

We will preassign which bank the larbs belong to. currently we don't
have a interface for master to adjust its bank number.

Each a bank is a iova_region which is a independent iommu-domain.
the iova range for each iommu-domain can't cross 4G.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c  | 12 +---
 drivers/memory/mtk-smi.c   |  7 +++
 include/soc/mediatek/smi.h |  1 +
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 87ca4f47e494..2f26a8242428 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -303,17 +303,23 @@ static void mtk_iommu_config(struct mtk_iommu_data *data,
 struct device *dev, bool enable)
 {
struct mtk_smi_larb_iommu*larb_mmu;
-   unsigned int larbid, portid;
+   unsigned int larbid, portid, domid;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   const struct mtk_iommu_iova_region *region;
int i;
 
for (i = 0; i < fwspec->num_ids; ++i) {
larbid = MTK_M4U_TO_LARB(fwspec->ids[i]);
portid = MTK_M4U_TO_PORT(fwspec->ids[i]);
+   domid = MTK_M4U_TO_DOM(fwspec->ids[i]);
+
larb_mmu = &data->larb_imu[larbid];
+   region = data->plat_data->iova_region + domid;
+   larb_mmu->bank[portid] = upper_32_bits(region->iova_base);
 
-   dev_dbg(dev, "%s iommu port: %d\n",
-   enable ? "enable" : "disable", portid);
+   dev_dbg(dev, "%s iommu for larb(%s) port %d dom %d bank %d.\n",
+   enable ? "enable" : "disable", dev_name(larb_mmu->dev),
+   portid, domid, larb_mmu->bank[portid]);
 
if (enable)
larb_mmu->mmu |= MTK_SMI_MMU_EN(portid);
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index ec83f1ac48b1..e94c99ca2883 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -41,6 +41,10 @@
 /* mt2712 */
 #define SMI_LARB_NONSEC_CON(id)(0x380 + ((id) * 4))
 #define F_MMU_EN   BIT(0)
+#define BANK_SEL(id)   ({  \
+   u32 _id = (id) & 0x3;   \
+   (_id << 8 | _id << 10 | _id << 12 | _id << 14); \
+})
 
 /* SMI COMMON */
 #define SMI_BUS_SEL0x220
@@ -85,6 +89,7 @@ struct mtk_smi_larb { /* larb: local arbiter */
const struct mtk_smi_larb_gen   *larb_gen;
int larbid;
u32 *mmu;
+   unsigned char   *bank;
 };
 
 static int mtk_smi_clk_enable(const struct mtk_smi *smi)
@@ -151,6 +156,7 @@ mtk_smi_larb_bind(struct device *dev, struct device 
*master, void *data)
if (dev == larb_mmu[i].dev) {
larb->larbid = i;
larb->mmu = &larb_mmu[i].mmu;
+   larb->bank = larb_mmu[i].bank;
return 0;
}
}
@@ -169,6 +175,7 @@ static void mtk_smi_larb_config_port_gen2_general(struct 
device *dev)
for_each_set_bit(i, (unsigned long *)larb->mmu, 32) {
reg = readl_relaxed(larb->base + SMI_LARB_NONSEC_CON(i));
reg |= F_MMU_EN;
+   reg |= BANK_SEL(larb->bank[i]);
writel(reg, larb->base + SMI_LARB_NONSEC_CON(i));
}
 }
diff --git a/include/soc/mediatek/smi.h b/include/soc/mediatek/smi.h
index 9371bf572ab8..4cf445dbbdaa 100644
--- a/include/soc/mediatek/smi.h
+++ b/include/soc/mediatek/smi.h
@@ -16,6 +16,7 @@
 struct mtk_smi_larb_iommu {
struct device *dev;
unsigned int   mmu;
+   unsigned char  bank[32];
 };
 
 /*
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 14/24] iommu/mediatek: Add pm runtime callback

This patch adds pm runtime callback.

In pm runtime case, all the registers backup/restore and bclk are
controlled in the pm_runtime callback, then pm_suspend is not needed in
this case.

runtime PM is disabled when suspend, thus we call
pm_runtime_status_suspended instead of pm_runtime_suspended.

And, m4u doesn't have its special pm runtime domain in previous SoC, in
this case dev->power.runtime_status is RPM_SUSPENDED defaultly, thus add
a "dev->pm_domain" checking for the SoC that has pm runtime domain.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 5625458b21ba..052eb72acf69 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -770,7 +770,7 @@ static int mtk_iommu_remove(struct platform_device *pdev)
return 0;
 }
 
-static int __maybe_unused mtk_iommu_suspend(struct device *dev)
+static int __maybe_unused mtk_iommu_runtime_suspend(struct device *dev)
 {
struct mtk_iommu_data *data = dev_get_drvdata(dev);
struct mtk_iommu_suspend_reg *reg = &data->reg;
@@ -788,7 +788,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device 
*dev)
return 0;
 }
 
-static int __maybe_unused mtk_iommu_resume(struct device *dev)
+static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
 {
struct mtk_iommu_data *data = dev_get_drvdata(dev);
struct mtk_iommu_suspend_reg *reg = &data->reg;
@@ -815,7 +815,25 @@ static int __maybe_unused mtk_iommu_resume(struct device 
*dev)
return 0;
 }
 
+static int __maybe_unused mtk_iommu_suspend(struct device *dev)
+{
+   /* runtime PM is disabled when suspend in pm_runtime case. */
+   if (dev->pm_domain && pm_runtime_status_suspended(dev))
+   return 0;
+
+   return mtk_iommu_runtime_suspend(dev);
+}
+
+static int __maybe_unused mtk_iommu_resume(struct device *dev)
+{
+   if (dev->pm_domain && pm_runtime_status_suspended(dev))
+   return 0;
+
+   return mtk_iommu_runtime_resume(dev);
+}
+
 static const struct dev_pm_ops mtk_iommu_pm_ops = {
+   SET_RUNTIME_PM_OPS(mtk_iommu_runtime_suspend, mtk_iommu_runtime_resume, 
NULL)
SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(mtk_iommu_suspend, mtk_iommu_resume)
 };
 
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 19/24] iommu/mediatek: Support up to 34bit iova in tlb flush

If the iova is 34bit, the iova[32][33] is the bit0/1 in the tlb flush
register. Add a new macro for this.

there is a minor change unrelated with this patch. it also use the new
macro.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 2f26a8242428..a2e519c86ce9 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -125,6 +125,9 @@ static const struct iommu_ops mtk_iommu_ops;
 
 static int mtk_iommu_hw_init(const struct mtk_iommu_data *data);
 
+#define MTK_IOMMU_ADDR(addr) ({unsigned long _addr = addr; \
+ (lower_32_bits(_addr) | upper_32_bits(_addr)); })
+
 /*
  * In M4U 4GB mode, the physical address is remapped as below:
  *
@@ -213,8 +216,9 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long 
iova, size_t size,
writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
   data->base + data->plat_data->inv_sel_reg);
 
-   writel_relaxed(iova, data->base + REG_MMU_INVLD_START_A);
-   writel_relaxed(iova + size - 1,
+   writel_relaxed(MTK_IOMMU_ADDR(iova),
+  data->base + REG_MMU_INVLD_START_A);
+   writel_relaxed(MTK_IOMMU_ADDR(iova + size - 1),
   data->base + REG_MMU_INVLD_END_A);
writel_relaxed(F_MMU_INV_RANGE,
   data->base + REG_MMU_INVALIDATE);
@@ -634,8 +638,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
*data)
if (data->plat_data->m4u_plat == M4U_MT8173)
regval = (data->protect_base >> 1) | (data->enable_4GB << 31);
else
-   regval = lower_32_bits(data->protect_base) |
-upper_32_bits(data->protect_base);
+   regval = MTK_IOMMU_ADDR(data->protect_base);
writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
 
if (data->enable_4GB &&
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 20/24] iommu/mediatek: Support report iova 34bit translation fault in ISR

If the iova is over 32bit, the fault status register bit is a little
different. Add a flag for the special register bits.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index a2e519c86ce9..16760e318648 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -4,6 +4,7 @@
  * Author: Yong Wu 
  */
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -87,6 +88,9 @@
 #define F_REG_MMU1_FAULT_MASK  GENMASK(13, 7)
 
 #define REG_MMU0_FAULT_VA  0x13c
+#define F_MMU_INVAL_VA_31_12_MASK  GENMASK(31, 12)
+#define F_MMU_INVAL_VA_34_32_MASK  GENMASK(11, 9)
+#define F_MMU_INVAL_PA_34_32_MASK  GENMASK(8, 6)
 #define F_MMU_FAULT_VA_WRITE_BIT   BIT(1)
 #define F_MMU_FAULT_VA_LAYER_BIT   BIT(0)
 
@@ -110,6 +114,7 @@
 #define OUT_ORDER_WR_ENBIT(4)
 #define HAS_SUB_COMM   BIT(5)
 #define WR_THROT_ENBIT(6)
+#define IOVA_34_EN BIT(7)
 
 #define MTK_IOMMU_HAS_FLAG(pdata, _x) \
pdata)->flags) & (_x)) == (_x))
@@ -258,8 +263,9 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
 {
struct mtk_iommu_data *data = dev_id;
struct mtk_iommu_domain *dom = data->m4u_dom;
-   u32 int_state, regval, fault_iova, fault_pa;
unsigned int fault_larb, fault_port, sub_comm = 0;
+   u32 int_state, regval, va34_32, pa34_32;
+   u64 fault_iova, fault_pa;
bool layer, write;
 
/* Read error info from registers */
@@ -275,6 +281,14 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
}
layer = fault_iova & F_MMU_FAULT_VA_LAYER_BIT;
write = fault_iova & F_MMU_FAULT_VA_WRITE_BIT;
+   if (MTK_IOMMU_HAS_FLAG(data->plat_data, IOVA_34_EN)) {
+   va34_32 = FIELD_GET(F_MMU_INVAL_VA_34_32_MASK, fault_iova);
+   pa34_32 = FIELD_GET(F_MMU_INVAL_PA_34_32_MASK, fault_iova);
+   fault_iova = fault_iova & F_MMU_INVAL_VA_31_12_MASK;
+   fault_iova |=  (u64)va34_32 << 32;
+   fault_pa |= (u64)pa34_32 << 32;
+   }
+
fault_port = F_MMU_INT_ID_PORT_ID(regval);
if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_SUB_COMM)) {
fault_larb = F_MMU_INT_ID_COMM_ID(regval);
@@ -288,7 +302,7 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
   write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
dev_err_ratelimited(
data->dev,
-   "fault type=0x%x iova=0x%x pa=0x%x larb=%d port=%d 
layer=%d %s\n",
+   "fault type=0x%x iova=0x%llx pa=0x%llx larb=%d port=%d 
layer=%d %s\n",
int_state, fault_iova, fault_pa, fault_larb, fault_port,
layer, write ? "write" : "read");
}
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 21/24] iommu/mediatek: Add support for multi domain

Some HW IP(ex: CCU) require the special iova range. That means the
iova got from dma_alloc_attrs for that devices must locate in his
special range. In this patch, we allocate a special iova_range for
each a special requirement and create each a iommu domain for each
a iova_range.

meanwhile we still use one pagetable which support 16GB iova.

After this patch, If the iova range of a master is over 4G, the master
should:
a) Declare its special dma_ranges in its dtsi node. For example, If we
   preassign the iova 4G-8G for vcodec, then the vcodec dtsi node should
   add this:
   /*
* iova start at 0x1__, pa still start at 0x4000_
* size is 0x1__.
*/
   dma-ranges = <0x1 0x0 0x0 0x4000 0x1 0x0>;  /* 4G ~ 8G */
 Note: we don't have a actual bus concept here. the master doesn't have its
 special parent node, thus this dma-ranges can only be put in the master's
 node.

b) Update the dma_mask:
  dma_set_mask_and_coherent(dev, DMA_BIT_MASK(33));

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 47 +++
 drivers/iommu/mtk_iommu.h |  3 ++-
 2 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 16760e318648..1d5d3e76d2d1 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -350,6 +350,14 @@ static int mtk_iommu_domain_finalise(struct 
mtk_iommu_domain *dom)
 {
struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
 
+   /* Use the exist domain as there is only one m4u pgtable here. */
+   if (data->m4u_dom) {
+   dom->iop = data->m4u_dom->iop;
+   dom->cfg = data->m4u_dom->cfg;
+   dom->domain.pgsize_bitmap = data->m4u_dom->cfg.pgsize_bitmap;
+   return 0;
+   }
+
dom->cfg = (struct io_pgtable_cfg) {
.quirks = IO_PGTABLE_QUIRK_ARM_NS |
IO_PGTABLE_QUIRK_NO_PERMS |
@@ -375,6 +383,8 @@ static int mtk_iommu_domain_finalise(struct 
mtk_iommu_domain *dom)
 
 static struct iommu_domain *mtk_iommu_domain_alloc(unsigned type)
 {
+   struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
+   const struct mtk_iommu_iova_region *region;
struct mtk_iommu_domain *dom;
 
if (type != IOMMU_DOMAIN_DMA)
@@ -390,8 +400,9 @@ static struct iommu_domain *mtk_iommu_domain_alloc(unsigned 
type)
if (mtk_iommu_domain_finalise(dom))
goto  put_dma_cookie;
 
-   dom->domain.geometry.aperture_start = 0;
-   dom->domain.geometry.aperture_end = DMA_BIT_MASK(32);
+   region = data->plat_data->iova_region + data->cur_domid;
+   dom->domain.geometry.aperture_start = region->iova_base;
+   dom->domain.geometry.aperture_end = region->iova_base + region->size - 
1;
dom->domain.geometry.force_aperture = true;
 
return &dom->domain;
@@ -535,19 +546,31 @@ static void mtk_iommu_release_device(struct device *dev)
 static struct iommu_group *mtk_iommu_device_group(struct device *dev)
 {
struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct iommu_group *group;
+   int domid;
 
if (!data)
return ERR_PTR(-ENODEV);
 
-   /* All the client devices are in the same m4u iommu-group */
-   if (!data->m4u_group) {
-   data->m4u_group = iommu_group_alloc();
-   if (IS_ERR(data->m4u_group))
+   domid = MTK_M4U_TO_DOM(fwspec->ids[0]);
+   if (domid >= data->plat_data->iova_region_nr) {
+   dev_err(dev, "iommu domain id(%d/%d) is error.\n", domid,
+   data->plat_data->iova_region_nr);
+   return ERR_PTR(-EINVAL);
+   }
+
+   group = data->m4u_group[domid];
+   if (!group) {
+   group = iommu_group_alloc();
+   if (IS_ERR(group))
dev_err(dev, "Failed to allocate M4U IOMMU group\n");
+   data->m4u_group[domid] = group;
} else {
-   iommu_group_ref_get(data->m4u_group);
+   iommu_group_ref_get(group);
}
-   return data->m4u_group;
+   data->cur_domid = domid;
+   return group;
 }
 
 static int mtk_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
@@ -576,14 +599,20 @@ static void mtk_iommu_get_resv_regions(struct device *dev,
   struct list_head *head)
 {
struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
-   const struct mtk_iommu_iova_region *resv;
+   const struct mtk_iommu_iova_region *resv, *curdom;
struct iommu_resv_region *region;
int prot = IOMMU_WRITE | IOMMU_READ;
unsigned int i;
 
+   curdom = data->plat_data->iova_region + data->cur_domid;
for (i = 0; i < data->plat_data->iova_region_nr; i++) {
resv = data->plat_data->iova_region + i;
 
+   /* Onl

[PATCH v3 23/24] iommu/mediatek: Add mt8192 support

Add mt8192 iommu support.

For multi domain, Add 1M gap for the vdec domain size. That is because
vdec HW has a end address register which require (start_addr +
len) rather than (start_addr + len - 1). Take a example, if the start_addr
is 0xfff0, size is 0x10, then the end_address is 0xfff0 +
0x10 = 0x1  . but the register only is 32bit. thus HW will get
the end address is 0. To avoid this issue, I add 1M gap for this.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 22 ++
 drivers/iommu/mtk_iommu.h |  1 +
 2 files changed, 23 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 7f8d39f8ac29..8bf5d4370792 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -171,6 +171,16 @@ static const struct mtk_iommu_iova_region single_domain[] 
= {
{.iova_base = 0,.size = SZ_4G},
 };
 
+static const struct mtk_iommu_iova_region mt8192_multi_dom[] = {
+   { .iova_base = 0x0, .size = SZ_4G}, /* disp: 0 ~ 4G 
*/
+   #if IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT)
+   { .iova_base = SZ_4G,   .size = SZ_4G - SZ_1M}, /* vdec: 4G ~ 
8G gap: 1M */
+   { .iova_base = SZ_4G * 2,   .size = SZ_4G - SZ_1M}, /* CAM/MDP: 8G 
~ 12G */
+   { .iova_base = 0x24000ULL,  .size = 0x400}, /* CCU0 */
+   { .iova_base = 0x24400ULL,  .size = 0x400}, /* CCU1 */
+   #endif
+};
+
 /*
  * There may be 1 or 2 M4U HWs, But we always expect they are in the same 
domain
  * for the performance.
@@ -976,11 +986,23 @@ static const struct mtk_iommu_plat_data mt8183_data = {
.larbid_remap = {{0}, {4}, {5}, {6}, {7}, {2}, {3}, {1}},
 };
 
+static const struct mtk_iommu_plat_data mt8192_data = {
+   .m4u_plat   = M4U_MT8192,
+   .flags  = HAS_BCLK | HAS_SUB_COMM | OUT_ORDER_WR_EN |
+ WR_THROT_EN | IOVA_34_EN,
+   .inv_sel_reg= REG_MMU_INV_SEL_GEN2,
+   .iova_region= mt8192_multi_dom,
+   .iova_region_nr = ARRAY_SIZE(mt8192_multi_dom),
+   .larbid_remap   = {{0}, {1}, {4, 5}, {7}, {2}, {9, 11, 19, 20},
+  {0, 14, 16}, {0, 13, 18, 17}},
+};
+
 static const struct of_device_id mtk_iommu_of_ids[] = {
{ .compatible = "mediatek,mt2712-m4u", .data = &mt2712_data},
{ .compatible = "mediatek,mt6779-m4u", .data = &mt6779_data},
{ .compatible = "mediatek,mt8173-m4u", .data = &mt8173_data},
{ .compatible = "mediatek,mt8183-m4u", .data = &mt8183_data},
+   { .compatible = "mediatek,mt8192-m4u", .data = &mt8192_data},
{}
 };
 
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 5e346464cdf8..d2702eda25d4 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -42,6 +42,7 @@ enum mtk_iommu_plat {
M4U_MT6779,
M4U_MT8173,
M4U_MT8183,
+   M4U_MT8192,
 };
 
 struct mtk_iommu_iova_region;
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 22/24] iommu/mediatek: Adjust the structure

Add "struct mtk_iommu_data *" in the "struct mtk_iommu_domain",
reduce the call mtk_iommu_get_m4u_data().
No functional change.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 1d5d3e76d2d1..7f8d39f8ac29 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -123,6 +123,7 @@ struct mtk_iommu_domain {
struct io_pgtable_cfg   cfg;
struct io_pgtable_ops   *iop;
 
+   struct mtk_iommu_data   *data;
struct iommu_domain domain;
 };
 
@@ -348,7 +349,7 @@ static void mtk_iommu_config(struct mtk_iommu_data *data,
 
 static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
 {
-   struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
+   struct mtk_iommu_data *data = dom->data;
 
/* Use the exist domain as there is only one m4u pgtable here. */
if (data->m4u_dom) {
@@ -397,6 +398,7 @@ static struct iommu_domain *mtk_iommu_domain_alloc(unsigned 
type)
if (iommu_get_dma_cookie(&dom->domain))
goto  free_dom;
 
+   dom->data = data;
if (mtk_iommu_domain_finalise(dom))
goto  put_dma_cookie;
 
@@ -469,10 +471,9 @@ static int mtk_iommu_map(struct iommu_domain *domain, 
unsigned long iova,
 phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
 {
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
-   struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
 
/* The "4GB mode" M4U physically can not use the lower remap of Dram. */
-   if (data->enable_4GB)
+   if (dom->data->enable_4GB)
paddr |= BIT_ULL(32);
 
/* Synchronize with the tlb_lock */
@@ -490,31 +491,32 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
 
 static void mtk_iommu_flush_iotlb_all(struct iommu_domain *domain)
 {
-   mtk_iommu_tlb_flush_all(mtk_iommu_get_m4u_data());
+   struct mtk_iommu_domain *dom = to_mtk_domain(domain);
+
+   mtk_iommu_tlb_flush_all(dom->data);
 }
 
 static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
 struct iommu_iotlb_gather *gather)
 {
-   struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
+   struct mtk_iommu_domain *dom = to_mtk_domain(domain);
size_t length = gather->end - gather->start;
 
if (gather->start == ULONG_MAX)
return;
 
mtk_iommu_tlb_flush_range_sync(gather->start, length, gather->pgsize,
-  data);
+  dom->data);
 }
 
 static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
  dma_addr_t iova)
 {
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
-   struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
phys_addr_t pa;
 
pa = dom->iop->iova_to_phys(dom->iop, iova);
-   if (data->enable_4GB && pa >= MTK_IOMMU_4GB_MODE_REMAP_BASE)
+   if (dom->data->enable_4GB && pa >= MTK_IOMMU_4GB_MODE_REMAP_BASE)
pa &= ~BIT_ULL(32);
 
return pa;
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 24/24] memory: mtk-smi: Add mt8192 support

Add mt8192 smi support.

Signed-off-by: Yong Wu 
---
 drivers/memory/mtk-smi.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index e94c99ca2883..0ec3eff4d92d 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -261,6 +261,10 @@ static const struct mtk_smi_larb_gen mtk_smi_larb_mt8183 = 
{
  /* IPU0 | IPU1 | CCU */
 };
 
+static const struct mtk_smi_larb_gen mtk_smi_larb_mt8192 = {
+   .config_port= mtk_smi_larb_config_port_gen2_general,
+};
+
 static const struct of_device_id mtk_smi_larb_of_ids[] = {
{
.compatible = "mediatek,mt8173-smi-larb",
@@ -282,6 +286,10 @@ static const struct of_device_id mtk_smi_larb_of_ids[] = {
.compatible = "mediatek,mt8183-smi-larb",
.data = &mtk_smi_larb_mt8183
},
+   {
+   .compatible = "mediatek,mt8192-smi-larb",
+   .data = &mtk_smi_larb_mt8192
+   },
{}
 };
 
@@ -421,6 +429,13 @@ static const struct mtk_smi_common_plat 
mtk_smi_common_mt8183 = {
F_MMU1_LARB(7),
 };
 
+static const struct mtk_smi_common_plat mtk_smi_common_mt8192 = {
+   .gen  = MTK_SMI_GEN2,
+   .has_gals = true,
+   .bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(5) |
+   F_MMU1_LARB(6),
+};
+
 static const struct of_device_id mtk_smi_common_of_ids[] = {
{
.compatible = "mediatek,mt8173-smi-common",
@@ -442,6 +457,10 @@ static const struct of_device_id mtk_smi_common_of_ids[] = 
{
.compatible = "mediatek,mt8183-smi-common",
.data = &mtk_smi_common_mt8183,
},
+   {
+   .compatible = "mediatek,mt8192-smi-common",
+   .data = &mtk_smi_common_mt8192,
+   },
{}
 };
 
-- 
2.18.0
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [Linaro-mm-sig] [PATCH v5 05/38] drm: prime: use sgtable iterators in drm_prime_sg_to_page_addr_arrays()

2020-09-30 Thread Marek Szyprowski

Hi All,

On 25.09.2020 23:23, Alex Deucher wrote:
> On Tue, Sep 22, 2020 at 2:28 AM Marek Szyprowski
>  wrote:
>> On 22.09.2020 01:15, Alex Goins wrote:
>>> Tested-by: Alex Goins 
>>>
>>> This change fixes a regression with drm_prime_sg_to_page_addr_arrays() and
>>> AMDGPU in v5.9.
>> Thanks for testing!
>>
>>> Commit 39913934 similarly revamped AMDGPU to use sgtable helper functions. 
>>> When
>>> it changed from dma_map_sg_attrs() to dma_map_sgtable(), as a side effect it
>>> started correctly updating sgt->nents to the return value of 
>>> dma_map_sg_attrs().
>>> However, drm_prime_sg_to_page_addr_arrays() incorrectly uses sgt->nents to
>>> iterate over pages, rather than sgt->orig_nents, resulting in it now 
>>> returning
>>> the incorrect number of pages on AMDGPU.
>>>
>>> I had written a patch that changes drm_prime_sg_to_page_addr_arrays() to use
>>> for_each_sgtable_sg() instead of for_each_sg(), iterating using 
>>> sgt->orig_nents:
>>>
>>> -   for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>>> +   for_each_sgtable_sg(sgt, sg, count) {
>>>
>>> This patch takes it further, but still has the effect of fixing the number 
>>> of
>>> pages that drm_prime_sg_to_page_addr_arrays() returns. Something like this
>>> should be included in v5.9 to prevent a regression with AMDGPU.
>> Probably the easiest way to handle a fix for v5.9 would be to simply
>> merge the latest version of this patch also to v5.9-rcX:
>> https://lore.kernel.org/dri-devel/20200904131711.12950-3-m.szyprow...@samsung.com/
>>
>>
>> This way we would get it fixed and avoid possible conflict in the -next.
>> Do you have any AMDGPU fixes for v5.9 in the queue? Maybe you can add
>> that patch to the queue? Dave: would it be okay that way?
> I think this should go into drm-misc for 5.9 since it's an update to
> drm_prime.c.  Is that patch ready to merge?
> Acked-by: Alex Deucher 

Maarten, Maxime or Thomas: could you take this one:

https://lore.kernel.org/dri-devel/20200904131711.12950-3-m.szyprow...@samsung.com/

also to drm-misc-fixes for v5.9-rc?

Best regards

-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 3/3] iommu/tegra-smmu: Add PCI support

On Wed, Sep 30, 2020 at 09:01:09AM +0300, Dmitry Osipenko wrote:
> 30.09.2020 08:34, Nicolin Chen пишет:
> > On Wed, Sep 30, 2020 at 08:10:35AM +0300, Dmitry Osipenko wrote:
> >> 30.09.2020 03:30, Nicolin Chen пишет:
> >>>  void tegra_smmu_remove(struct tegra_smmu *smmu)
> >>>  {
> >>> + bus_set_iommu(&platform_bus_type, NULL);
> >>
> >> Why only platform_bus? Is this really needed at all?
> > 
> > I see qcom_iommu.c file set to NULL in remove(), Probably should
> > have added pci_bus_type too though.
> > 
> > Or are you sure that there's no need at all?
> > 
> 
> It wasn't here before this patch and platform_bus is unrelated to the
> topic of this patch. But it probably should be there.
> 
> On the other hand, the tegra_smmu_remove() is unused and maybe it could
> be better to get rid of this unused function at all.

I will drop this line then, since no one is calling it. And maybe
we can submit a clean up patch later.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 1/3] memory: tegra: Add helper function tegra_get_memory_controller

On Wed, 30 Sep 2020 at 02:35, Nicolin Chen  wrote:
>
> This can be used by both tegra-smmu and tegra20-devfreq drivers.
>
> Suggested-by: Dmitry Osipenko 
> Signed-off-by: Nicolin Chen 
> ---
>
> Changelog
> v1->v2
>  * N/A
>
>  drivers/memory/tegra/mc.c | 23 +++
>  include/soc/tegra/mc.h|  1 +
>  2 files changed, 24 insertions(+)
>
> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> index ec8403557ed4..09352ad66dcc 100644
> --- a/drivers/memory/tegra/mc.c
> +++ b/drivers/memory/tegra/mc.c
> @@ -42,6 +42,29 @@ static const struct of_device_id tegra_mc_of_match[] = {
>  };
>  MODULE_DEVICE_TABLE(of, tegra_mc_of_match);

> +struct tegra_mc *tegra_get_memory_controller(void)
> +{

Add kerneldoc and mention dropping of reference to the device after use.

Best regards,
Krzysztof

> +   struct platform_device *pdev;
> +   struct device_node *np;
> +   struct tegra_mc *mc;
> +
> +   np = of_find_matching_node_and_match(NULL, tegra_mc_of_match, NULL);
> +   if (!np)
> +   return ERR_PTR(-ENOENT);
> +
> +   pdev = of_find_device_by_node(np);
> +   of_node_put(np);
> +   if (!pdev)
> +   return ERR_PTR(-ENODEV);
> +
> +   mc = platform_get_drvdata(pdev);
> +   if (!mc)
> +   return ERR_PTR(-EPROBE_DEFER);
> +
> +   return mc;
> +}
> +EXPORT_SYMBOL_GPL(tegra_get_memory_controller);
> +
>  static int tegra_mc_block_dma_common(struct tegra_mc *mc,
>  const struct tegra_mc_reset *rst)
>  {
> diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
> index 1238e35653d1..c72718fd589f 100644
> --- a/include/soc/tegra/mc.h
> +++ b/include/soc/tegra/mc.h
> @@ -183,5 +183,6 @@ struct tegra_mc {
>
>  int tegra_mc_write_emem_configuration(struct tegra_mc *mc, unsigned long 
> rate);
>  unsigned int tegra_mc_get_emem_device_count(struct tegra_mc *mc);
> +struct tegra_mc *tegra_get_memory_controller(void);
>
>  #endif /* __SOC_TEGRA_MC_H__ */
> --
> 2.17.1
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 3/3] iommu/tegra-smmu: Add PCI support

On Wed, Sep 30, 2020 at 08:58:50AM +0300, Dmitry Osipenko wrote:
> 30.09.2020 08:29, Nicolin Chen пишет:
> > Hi Dmitry,
> > 
> > On Wed, Sep 30, 2020 at 08:10:07AM +0300, Dmitry Osipenko wrote:
> >> 30.09.2020 03:30, Nicolin Chen пишет:
> >>> - group->group = iommu_group_alloc();
> >>> + group->group = pci ? pci_device_group(dev) : iommu_group_alloc();
> >>
> >> This will be nicer to write as:
> >>
> >> if (dev_is_pci(dev))
> >> group->group = pci_device_group(dev);
> >> else
> >> group->group = generic_device_group(dev);
> > 
> > Why is that nicer? I don't feel mine is hard to read at all...
> > 
> 
> Previously you said that you're going to add USB support, so I assume
> there will be something about USB.

I was hoping that it'd work for USB. Yet USB doesn't seem to have
an different function for device_group().

> It's also a kinda common style that majority of Tegra drivers use in
> upstream kernel. But yours variant is good too if there won't be a need
> to change it later on.

Okay..I'll use yours then.

Thanks
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 1/3] memory: tegra: Add helper function tegra_get_memory_controller

Hi Krzysztof,

On Wed, Sep 30, 2020 at 09:21:39AM +0200, Krzysztof Kozlowski wrote:
> On Wed, 30 Sep 2020 at 02:35, Nicolin Chen  wrote:
> >
> > This can be used by both tegra-smmu and tegra20-devfreq drivers.
> >
> > Suggested-by: Dmitry Osipenko 
> > Signed-off-by: Nicolin Chen 
> > ---
> >
> > Changelog
> > v1->v2
> >  * N/A
> >
> >  drivers/memory/tegra/mc.c | 23 +++
> >  include/soc/tegra/mc.h|  1 +
> >  2 files changed, 24 insertions(+)
> >
> > diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> > index ec8403557ed4..09352ad66dcc 100644
> > --- a/drivers/memory/tegra/mc.c
> > +++ b/drivers/memory/tegra/mc.c
> > @@ -42,6 +42,29 @@ static const struct of_device_id tegra_mc_of_match[] = {
> >  };
> >  MODULE_DEVICE_TABLE(of, tegra_mc_of_match);
> 
> > +struct tegra_mc *tegra_get_memory_controller(void)
> > +{
> 
> Add kerneldoc and mention dropping of reference to the device after use.

I am abort to use Dmitry's devm_ one in my next version:
https://github.com/grate-driver/linux/commit/2105e7664063772d72fefe9696bdab0b688b9de2

Could I just skip the kerneldoc part? Otherwise, would you please
tell me which kerneldoc file I should update?

Thanks
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 1/3] memory: tegra: Add helper function tegra_get_memory_controller

30.09.2020 09:38, Nicolin Chen пишет:
> On Wed, Sep 30, 2020 at 09:32:20AM +0300, Dmitry Osipenko wrote:
>> 30.09.2020 08:44, Nicolin Chen пишет:
>>> On Wed, Sep 30, 2020 at 08:12:10AM +0300, Dmitry Osipenko wrote:
 30.09.2020 03:30, Nicolin Chen пишет:
 ...
>  int tegra_mc_write_emem_configuration(struct tegra_mc *mc, unsigned long 
> rate);
>  unsigned int tegra_mc_get_emem_device_count(struct tegra_mc *mc);
> +struct tegra_mc *tegra_get_memory_controller(void);
>  
>  #endif /* __SOC_TEGRA_MC_H__ */
>

 This will conflict with the tegra20-devfreq driver, you should change it
 as well.
>>>
>>> Will remove that in v3.
>>>
>>> Thanks
>>>
>>
>> Please also consider to add a resource-managed variant, similar to what
>> I did here:
>>
>> https://github.com/grate-driver/linux/commit/2105e7664063772d72fefe9696bdab0b688b9de2
>>
>> I was going to use it in the next iteration of the memory interconnect
>> series.
>>
>> But now it also will be good if you could add the devm variant to yours
>> SMMU patchset since you're already about to touch the tegra20-devfreq
>> driver. I'll then rebase my patches on top of yours patch.
> 
> Just saw this reply. Yea, I think this'd be better. Thanks
> 

Please don't forget to add a stub for !MC as well since devfreq drivers
use COMPILE_TEST and don't directly depend on the MC driver.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v5 2/2] iommu/iova: Free global iova rcache on iova alloc failure

2020-09-30 Thread vjitta

From: Vijayanand Jitta 

When ever an iova alloc request fails we free the iova
ranges present in the percpu iova rcaches and then retry
but the global iova rcache is not freed as a result we could
still see iova alloc failure even after retry as global
rcache is holding the iova's which can cause fragmentation.
So, free the global iova rcache as well and then go for the
retry.

Signed-off-by: Vijayanand Jitta 
---
 drivers/iommu/iova.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index c3a1a8e..faf9b13 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -25,6 +25,7 @@ static void init_iova_rcaches(struct iova_domain *iovad);
 static void free_iova_rcaches(struct iova_domain *iovad);
 static void fq_destroy_all_entries(struct iova_domain *iovad);
 static void fq_flush_timeout(struct timer_list *t);
+static void free_global_cached_iovas(struct iova_domain *iovad);
 
 void
 init_iova_domain(struct iova_domain *iovad, unsigned long granule,
@@ -442,6 +443,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long 
size,
flush_rcache = false;
for_each_online_cpu(cpu)
free_cpu_cached_iovas(cpu, iovad);
+   free_global_cached_iovas(iovad);
goto retry;
}
 
@@ -1057,5 +1059,26 @@ void free_cpu_cached_iovas(unsigned int cpu, struct 
iova_domain *iovad)
}
 }
 
+/*
+ * free all the IOVA ranges of global cache
+ */
+static void free_global_cached_iovas(struct iova_domain *iovad)
+{
+   struct iova_rcache *rcache;
+   unsigned long flags;
+   int i, j;
+
+   for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+   rcache = &iovad->rcaches[i];
+   spin_lock_irqsave(&rcache->lock, flags);
+   for (j = 0; j < rcache->depot_size; ++j) {
+   iova_magazine_free_pfns(rcache->depot[j], iovad);
+   iova_magazine_free(rcache->depot[j]);
+   rcache->depot[j] = NULL;
+   }
+   rcache->depot_size = 0;
+   spin_unlock_irqrestore(&rcache->lock, flags);
+   }
+}
 MODULE_AUTHOR("Anil S Keshavamurthy ");
 MODULE_LICENSE("GPL");
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v5 1/2] iommu/iova: Retry from last rb tree node if iova search fails

2020-09-30 Thread vjitta

From: Vijayanand Jitta 

When ever a new iova alloc request comes iova is always searched
from the cached node and the nodes which are previous to cached
node. So, even if there is free iova space available in the nodes
which are next to the cached node iova allocation can still fail
because of this approach.

Consider the following sequence of iova alloc and frees on
1GB of iova space

1) alloc - 500MB
2) alloc - 12MB
3) alloc - 499MB
4) free -  12MB which was allocated in step 2
5) alloc - 13MB

After the above sequence we will have 12MB of free iova space and
cached node will be pointing to the iova pfn of last alloc of 13MB
which will be the lowest iova pfn of that iova space. Now if we get an
alloc request of 2MB we just search from cached node and then look
for lower iova pfn's for free iova and as they aren't any, iova alloc
fails though there is 12MB of free iova space.

To avoid such iova search failures do a retry from the last rb tree node
when iova search fails, this will search the entire tree and get an iova
if its available.

Signed-off-by: Vijayanand Jitta 
Reviewed-by: Robin Murphy 
---
 drivers/iommu/iova.c | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 30d969a..c3a1a8e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -184,8 +184,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain 
*iovad,
struct rb_node *curr, *prev;
struct iova *curr_iova;
unsigned long flags;
-   unsigned long new_pfn;
+   unsigned long new_pfn, retry_pfn;
unsigned long align_mask = ~0UL;
+   unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
 
if (size_aligned)
align_mask <<= fls_long(size - 1);
@@ -198,15 +199,25 @@ static int __alloc_and_insert_iova_range(struct 
iova_domain *iovad,
 
curr = __get_cached_rbnode(iovad, limit_pfn);
curr_iova = rb_entry(curr, struct iova, node);
+   retry_pfn = curr_iova->pfn_hi + 1;
+
+retry:
do {
-   limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
-   new_pfn = (limit_pfn - size) & align_mask;
+   high_pfn = min(high_pfn, curr_iova->pfn_lo);
+   new_pfn = (high_pfn - size) & align_mask;
prev = curr;
curr = rb_prev(curr);
curr_iova = rb_entry(curr, struct iova, node);
-   } while (curr && new_pfn <= curr_iova->pfn_hi);
-
-   if (limit_pfn < size || new_pfn < iovad->start_pfn) {
+   } while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
+
+   if (high_pfn < size || new_pfn < low_pfn) {
+   if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
+   high_pfn = limit_pfn;
+   low_pfn = retry_pfn;
+   curr = &iovad->anchor.node;
+   curr_iova = rb_entry(curr, struct iova, node);
+   goto retry;
+   }
iovad->max32_alloc_size = size;
goto iova32_full;
}
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 1/3] memory: tegra: Add helper function tegra_get_memory_controller

On Wed, 30 Sep 2020 at 09:31, Nicolin Chen  wrote:
>
> Hi Krzysztof,
>
> On Wed, Sep 30, 2020 at 09:21:39AM +0200, Krzysztof Kozlowski wrote:
> > On Wed, 30 Sep 2020 at 02:35, Nicolin Chen  wrote:
> > >
> > > This can be used by both tegra-smmu and tegra20-devfreq drivers.
> > >
> > > Suggested-by: Dmitry Osipenko 
> > > Signed-off-by: Nicolin Chen 
> > > ---
> > >
> > > Changelog
> > > v1->v2
> > >  * N/A
> > >
> > >  drivers/memory/tegra/mc.c | 23 +++
> > >  include/soc/tegra/mc.h|  1 +
> > >  2 files changed, 24 insertions(+)
> > >
> > > diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> > > index ec8403557ed4..09352ad66dcc 100644
> > > --- a/drivers/memory/tegra/mc.c
> > > +++ b/drivers/memory/tegra/mc.c
> > > @@ -42,6 +42,29 @@ static const struct of_device_id tegra_mc_of_match[] = 
> > > {
> > >  };
> > >  MODULE_DEVICE_TABLE(of, tegra_mc_of_match);
> >
> > > +struct tegra_mc *tegra_get_memory_controller(void)
> > > +{
> >
> > Add kerneldoc and mention dropping of reference to the device after use.
>
> I am abort to use Dmitry's devm_ one in my next version:
> https://github.com/grate-driver/linux/commit/2105e7664063772d72fefe9696bdab0b688b9de2
>
> Could I just skip the kerneldoc part? Otherwise, would you please
> tell me which kerneldoc file I should update?

His version is almost the same as yours so it does not matter - you
declare an exported function, so you need to document it. kerneldoc
goes to the C file.
https://elixir.bootlin.com/linux/latest/source/Documentation/doc-guide/kernel-doc.rst

Best regards,
Krzysztof
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[GIT PULL] iommu/arm-smmu: Updates for 5.10

2020-09-30 Thread Will Deacon

Hi Joerg,

Please pull these arm-smmu updates for 5.10. Summary in the tag, but the
big thing here is the long-awaited SVM enablement from Jean-Philippe.
We're not quite done yet, but this pull extends the SMMUv3 driver so that
we're very close to being able to share page-tables directly with the CPU.

Other than that, there are a couple of things to note:

  1. My PGP subkeys expired. I've updated them here:


https://mirrors.edge.kernel.org/pub/linux/kernel/people/will/3E542FD9.asc

 and I've also mailed an updated copy for inclusion in the pgpkeys
 repository on kernel.org, but it hasn't landed yet:


https://lore.kernel.org/keys/20200929222707.GA14916@willie-the-truck/T/#u

  2. The SVM enablement has a dependency on some ASID allocation rework
 in the arm64 tree, so I made a shared branch and pulled that in here
 too.

Please shout if you run into any problems.

Will

--->8

The following changes since commit f75aef392f869018f78cfedf3c320a6b3fcfda6b:

  Linux 5.9-rc3 (2020-08-30 16:01:54 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
tags/arm-smmu-updates

for you to fetch changes up to e2eae09939a89e0994f7965ba3c676a5eac8b4b0:

  iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate() 
(2020-09-29 16:25:52 +0100)


Arm SMMU updates for 5.10

- Continued SVM enablement, where page-table is shared with CPU

- Groundwork to support integrated SMMU with Adreno GPU

- Allow disabling of MSI-based polling on the kernel command-line

- Minor driver fixes and cleanups (octal permissions, error messages, ...)


Barry Song (3):
  iommu/arm-smmu-v3: replace symbolic permissions by octal permissions for 
module parameter
  iommu/arm-smmu-v3: replace module_param_named by module_param for 
disable_bypass
  iommu/arm-smmu-v3: permit users to disable msi polling

Jean-Philippe Brucker (9):
  iommu/arm-smmu-v3: Fix endianness annotations
  arm64: mm: Pin down ASIDs for sharing mm with devices
  arm64: cpufeature: Export symbol read_sanitised_ftr_reg()
  iommu/io-pgtable-arm: Move some definitions to a header
  iommu/arm-smmu-v3: Move definitions to a header
  iommu/arm-smmu-v3: Share process page tables
  iommu/arm-smmu-v3: Seize private ASID
  iommu/arm-smmu-v3: Check for SVA features
  iommu/arm-smmu-v3: Add SVA device feature

Jordan Crouse (3):
  iommu/arm-smmu: Pass io-pgtable config to implementation specific function
  iommu/arm-smmu: Add support for split pagetables
  iommu/arm-smmu: Prepare for the adreno-smmu implementation

Rob Clark (1):
  iommu/arm-smmu: Constify some helpers

Will Deacon (1):
  Merge branch 'for-next/svm' of git://git.kernel.org/.../arm64/linux into 
for-joerg/arm-smmu/updates

Yu Kuai (1):
  iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()

Zenghui Yu (1):
  iommu/arm-smmu-v3: Fix l1 stream table size in the error message

Zhou Wang (1):
  iommu/arm-smmu-v3: Ensure queue is read after updating prod pointer

 MAINTAINERS |   3 +-
 arch/arm64/include/asm/barrier.h|   1 +
 arch/arm64/include/asm/io.h |   1 +
 arch/arm64/include/asm/mmu.h|   3 +
 arch/arm64/include/asm/mmu_context.h|  11 +-
 arch/arm64/kernel/cpufeature.c  |   1 +
 arch/arm64/mm/context.c | 105 ++-
 drivers/iommu/Kconfig   |  10 +
 drivers/iommu/arm/arm-smmu-v3/Makefile  |   5 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 248 +++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 843 +---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 723 
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c  |   3 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c   | 102 ++-
 drivers/iommu/arm/arm-smmu/arm-smmu.h   |  84 ++-
 drivers/iommu/arm/arm-smmu/qcom_iommu.c |   8 +-
 drivers/iommu/io-pgtable-arm.c  |  27 +-
 drivers/iommu/io-pgtable-arm.h  |  30 +
 18 files changed, 1410 insertions(+), 798 deletions(-)
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
 create mode 100644 drivers/iommu/io-pgtable-arm.h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 3/3] iommu/tegra-smmu: Add PCI support

This patch simply adds support for PCI devices.

Signed-off-by: Nicolin Chen 
---

Changelog
v2->v3
 * Replaced ternary conditional operator with if-else in .device_group()
 * Dropped change in tegra_smmu_remove()
v1->v2
 * Added error-out labels in tegra_smmu_probe()
 * Dropped pci_request_acs() since IOMMU core would call it.

 drivers/iommu/tegra-smmu.c | 35 ++-
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 636dc3b89545..db98ca334eae 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -882,7 +883,11 @@ static struct iommu_group *tegra_smmu_device_group(struct 
device *dev)
group->smmu = smmu;
group->soc = soc;
 
-   group->group = iommu_group_alloc();
+   if (dev_is_pci(dev))
+   group->group = pci_device_group(dev);
+   else
+   group->group = generic_device_group(dev);
+
if (IS_ERR(group->group)) {
devm_kfree(smmu->dev, group);
mutex_unlock(&smmu->lock);
@@ -1079,22 +1084,34 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
iommu_device_set_fwnode(&smmu->iommu, dev->fwnode);
 
err = iommu_device_register(&smmu->iommu);
-   if (err) {
-   iommu_device_sysfs_remove(&smmu->iommu);
-   return ERR_PTR(err);
-   }
+   if (err)
+   goto err_sysfs;
 
err = bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
-   if (err < 0) {
-   iommu_device_unregister(&smmu->iommu);
-   iommu_device_sysfs_remove(&smmu->iommu);
-   return ERR_PTR(err);
+   if (err < 0)
+   goto err_unregister;
+
+#ifdef CONFIG_PCI
+   if (!iommu_present(&pci_bus_type)) {
+   err = bus_set_iommu(&pci_bus_type, &tegra_smmu_ops);
+   if (err < 0)
+   goto err_bus_set;
}
+#endif
 
if (IS_ENABLED(CONFIG_DEBUG_FS))
tegra_smmu_debugfs_init(smmu);
 
return smmu;
+
+err_bus_set:
+   bus_set_iommu(&platform_bus_type, NULL);
+err_unregister:
+   iommu_device_unregister(&smmu->iommu);
+err_sysfs:
+   iommu_device_sysfs_remove(&smmu->iommu);
+
+   return ERR_PTR(err);
 }
 
 void tegra_smmu_remove(struct tegra_smmu *smmu)
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 0/3] iommu/tegra-smmu: Add PCI support

This series is to add PCI support with three changes:
PATCH-1 adds a helper function to get mc pointer
PATCH-2 adds support for clients that don't exist in DTB
PATCH-3 adds PCI support accordingly

Changelog (Detail in each patch)
v2->v3
 * Replaced with devm_tegra_get_memory_controller
 * Updated changes by following Dmitry's comments
v1->v2
 * Added PATCH-1 suggested by Dmitry
 * Reworked PATCH-2 to unify certain code

Dmitry Osipenko (1):
  memory: tegra: Add devm_tegra_get_memory_controller()

Nicolin Chen (2):
  iommu/tegra-smmu: Rework .probe_device and .attach_dev
  iommu/tegra-smmu: Add PCI support

 drivers/iommu/tegra-smmu.c | 179 +
 drivers/memory/tegra/mc.c  |  39 
 include/soc/tegra/mc.h |  17 
 3 files changed, 118 insertions(+), 117 deletions(-)

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

From: Dmitry Osipenko 

Multiple Tegra drivers need to retrieve Memory Controller and hence there
is quite some duplication of the retrieval code among the drivers. Let's
add a new common helper for the retrieval of the MC.

Signed-off-by: Dmitry Osipenko 
Signed-off-by: Nicolin Chen 
---

Changelog
v2->v3:
 * Replaced with Dimtry's devm_tegra_get_memory_controller()
v1->v2:
 * N/A

 drivers/memory/tegra/mc.c | 39 +++
 include/soc/tegra/mc.h| 17 +
 2 files changed, 56 insertions(+)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index ec8403557ed4..dd691dc3738e 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -42,6 +42,45 @@ static const struct of_device_id tegra_mc_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, tegra_mc_of_match);
 
+static void tegra_mc_devm_action_put_device(void *data)
+{
+   struct tegra_mc *mc = data;
+
+   put_device(mc->dev);
+}
+
+struct tegra_mc *devm_tegra_get_memory_controller(struct device *dev)
+{
+   struct platform_device *pdev;
+   struct device_node *np;
+   struct tegra_mc *mc;
+   int err;
+
+   np = of_find_matching_node_and_match(NULL, tegra_mc_of_match, NULL);
+   if (!np)
+   return ERR_PTR(-ENOENT);
+
+   pdev = of_find_device_by_node(np);
+   of_node_put(np);
+   if (!pdev)
+   return ERR_PTR(-ENODEV);
+
+   mc = platform_get_drvdata(pdev);
+   if (!mc) {
+   put_device(mc->dev);
+   return ERR_PTR(-EPROBE_DEFER);
+   }
+
+   err = devm_add_action(dev, tegra_mc_devm_action_put_device, mc);
+   if (err) {
+   put_device(mc->dev);
+   return ERR_PTR(err);
+   }
+
+   return mc;
+}
+EXPORT_SYMBOL_GPL(devm_tegra_get_memory_controller);
+
 static int tegra_mc_block_dma_common(struct tegra_mc *mc,
 const struct tegra_mc_reset *rst)
 {
diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
index 1238e35653d1..c05142e3e244 100644
--- a/include/soc/tegra/mc.h
+++ b/include/soc/tegra/mc.h
@@ -184,4 +184,21 @@ struct tegra_mc {
 int tegra_mc_write_emem_configuration(struct tegra_mc *mc, unsigned long rate);
 unsigned int tegra_mc_get_emem_device_count(struct tegra_mc *mc);
 
+#ifdef CONFIG_TEGRA_MC
+/**
+ * devm_tegra_get_memory_controller() - Get the tegra_mc pointer.
+ * @dev: Device that will be interacted with
+ *
+ * Return: ERR_PTR() on error or a valid pointer to a struct tegra_mc.
+ *
+ * The mc->dev counter will be automatically put by the device management code.
+ */
+struct tegra_mc *devm_tegra_get_memory_controller(struct device *dev);
+#else
+static inline struct tegra_mc *devm_tegra_get_memory_controller(struct device 
*dev)
+{
+   return ERR_PTR(-ENODEV);
+}
+#endif
+
 #endif /* __SOC_TEGRA_MC_H__ */
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

Previously the driver relies on bus_set_iommu() in .probe() to call
in .probe_device() function so each client can poll iommus property
in DTB to configure fwspec via tegra_smmu_configure(). According to
the comments in .probe(), this is a bit of a hack. And this doesn't
work for a client that doesn't exist in DTB, PCI device for example.

Actually when a device/client gets probed, the of_iommu_configure()
will call in .probe_device() function again, with a prepared fwspec
from of_iommu_configure() that reads the SWGROUP id in DTB as we do
in tegra-smmu driver.

Additionally, as a new helper devm_tegra_get_memory_controller() is
introduced, there's no need to poll the iommus property in order to
get mc->smmu pointers or SWGROUP id.

This patch reworks .probe_device() and .attach_dev() by doing:
1) Using fwspec to get swgroup id in .attach_dev/.dettach_dev()
2) Removing DT polling code, tegra_smmu_find/tegra_smmu_configure()
3) Calling devm_tegra_get_memory_controller() in .probe_device()
4) Also dropping the hack in .probe() that's no longer needed.

Signed-off-by: Nicolin Chen 
---

Changelog
v2->v3
 * Used devm_tegra_get_memory_controller() to get mc pointer
 * Replaced IS_ERR_OR_NULL with IS_ERR in .probe_device()
v1->v2
 * Replaced in .probe_device() tegra_smmu_find/tegra_smmu_configure()
   with tegra_get_memory_controller call.
 * Dropped the hack in tegra_smmu_probe().

 drivers/iommu/tegra-smmu.c | 144 ++---
 1 file changed, 36 insertions(+), 108 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 6a3ecc334481..636dc3b89545 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -61,6 +61,8 @@ struct tegra_smmu_as {
u32 attr;
 };
 
+static const struct iommu_ops tegra_smmu_ops;
+
 static struct tegra_smmu_as *to_smmu_as(struct iommu_domain *dom)
 {
return container_of(dom, struct tegra_smmu_as, domain);
@@ -484,60 +486,50 @@ static void tegra_smmu_as_unprepare(struct tegra_smmu 
*smmu,
 static int tegra_smmu_attach_dev(struct iommu_domain *domain,
 struct device *dev)
 {
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
struct tegra_smmu_as *as = to_smmu_as(domain);
-   struct device_node *np = dev->of_node;
-   struct of_phandle_args args;
unsigned int index = 0;
int err = 0;
 
-   while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
-  &args)) {
-   unsigned int swgroup = args.args[0];
-
-   if (args.np != smmu->dev->of_node) {
-   of_node_put(args.np);
-   continue;
-   }
-
-   of_node_put(args.np);
+   if (!fwspec || fwspec->ops != &tegra_smmu_ops)
+   return -ENOENT;
 
+   for (index = 0; index < fwspec->num_ids; index++) {
err = tegra_smmu_as_prepare(smmu, as);
-   if (err < 0)
-   return err;
+   if (err)
+   goto disable;
 
-   tegra_smmu_enable(smmu, swgroup, as->id);
-   index++;
+   tegra_smmu_enable(smmu, fwspec->ids[index], as->id);
}
 
if (index == 0)
return -ENODEV;
 
return 0;
+
+disable:
+   while (index--) {
+   tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
+   tegra_smmu_as_unprepare(smmu, as);
+   }
+
+   return err;
 }
 
 static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device 
*dev)
 {
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct tegra_smmu_as *as = to_smmu_as(domain);
-   struct device_node *np = dev->of_node;
struct tegra_smmu *smmu = as->smmu;
-   struct of_phandle_args args;
unsigned int index = 0;
 
-   while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
-  &args)) {
-   unsigned int swgroup = args.args[0];
-
-   if (args.np != smmu->dev->of_node) {
-   of_node_put(args.np);
-   continue;
-   }
-
-   of_node_put(args.np);
+   if (!fwspec || fwspec->ops != &tegra_smmu_ops)
+   return;
 
-   tegra_smmu_disable(smmu, swgroup, as->id);
+   for (index = 0; index < fwspec->num_ids; index++) {
+   tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
tegra_smmu_as_unprepare(smmu, as);
-   index++;
}
 }
 
@@ -807,80 +799,26 @@ static phys_addr_t tegra_smmu_iova_to_phys(struct 
iommu_domain *domain,
return SMMU_PFN_PHYS(pfn) + SMMU_OFFSET_IN_PAGE(iova);
 }
 
-static struct tegra_smmu *tegra_smmu_find(struct device_node *np)
-{
-   struct platform_device *pdev;

[PATCH 6/9] dma-mapping: remove

Just provide a weak default definition of dma_contiguous_early_fixup and
let arm override it.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/include/asm/dma-contiguous.h | 15 ---
 arch/arm/mm/dma-mapping.c |  1 -
 include/asm-generic/Kbuild|  1 -
 include/asm-generic/dma-contiguous.h  | 10 --
 include/linux/dma-map-ops.h   |  2 ++
 kernel/dma/contiguous.c   |  6 +-
 6 files changed, 7 insertions(+), 28 deletions(-)
 delete mode 100644 arch/arm/include/asm/dma-contiguous.h
 delete mode 100644 include/asm-generic/dma-contiguous.h

diff --git a/arch/arm/include/asm/dma-contiguous.h 
b/arch/arm/include/asm/dma-contiguous.h
deleted file mode 100644
index d785187a6f8ac1..00
--- a/arch/arm/include/asm/dma-contiguous.h
+++ /dev/null
@@ -1,15 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef ASMARM_DMA_CONTIGUOUS_H
-#define ASMARM_DMA_CONTIGUOUS_H
-
-#ifdef __KERNEL__
-#ifdef CONFIG_DMA_CMA
-
-#include 
-
-void dma_contiguous_early_fixup(phys_addr_t base, unsigned long size);
-
-#endif
-#endif
-
-#endif
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 154c24cec94c74..911fc6ea26071e 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -34,7 +34,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 #include "dma.h"
diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild
index 74b0612601dd1b..62ebdc731ee239 100644
--- a/include/asm-generic/Kbuild
+++ b/include/asm-generic/Kbuild
@@ -16,7 +16,6 @@ mandatory-y += current.h
 mandatory-y += delay.h
 mandatory-y += device.h
 mandatory-y += div64.h
-mandatory-y += dma-contiguous.h
 mandatory-y += dma-mapping.h
 mandatory-y += dma.h
 mandatory-y += emergency-restart.h
diff --git a/include/asm-generic/dma-contiguous.h 
b/include/asm-generic/dma-contiguous.h
deleted file mode 100644
index f24b0f9a4f05b6..00
--- a/include/asm-generic/dma-contiguous.h
+++ /dev/null
@@ -1,10 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_GENERIC_DMA_CONTIGUOUS_H
-#define _ASM_GENERIC_DMA_CONTIGUOUS_H
-
-#include 
-
-static inline void
-dma_contiguous_early_fixup(phys_addr_t base, unsigned long size) { }
-
-#endif
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 474fc81bd4921c..7912f5d00ed950 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -116,6 +116,8 @@ bool dma_release_from_contiguous(struct device *dev, struct 
page *pages,
 int count);
 struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp);
 void dma_free_contiguous(struct device *dev, struct page *page, size_t size);
+
+void dma_contiguous_early_fixup(phys_addr_t base, unsigned long size);
 #else /* CONFIG_DMA_CMA */
 static inline struct cma *dev_get_cma_area(struct device *dev)
 {
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 6bfb763fff6fca..a2ee330a3749ec 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -44,7 +44,6 @@
 #endif
 
 #include 
-#include 
 
 #include 
 #include 
@@ -212,6 +211,11 @@ void __init dma_contiguous_reserve(phys_addr_t limit)
}
 }
 
+void __weak
+dma_contiguous_early_fixup(phys_addr_t base, unsigned long size)
+{
+}
+
 /**
  * dma_contiguous_reserve_area() - reserve custom contiguous area
  * @size: Size of the reserved area (in bytes),
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

clean up the DMA mapping headers

Hi all,

this series cleans up the dma-mapping headers by moving everything not
required by normal drivers out of  into a new
 and then folding most other DMA mapping related
headers either into the new dma-map-ops.h one, or by moving them to
kernel/dma/ and thus out of the global scope.  A bunch of cleanups
for the DMA CMA code are thrown in as well, as they help keeping the
exposed bits in the header small.

Diffstat:
 arch/arm/include/asm/dma-contiguous.h |   15 
 b/Documentation/admin-guide/kernel-parameters.txt |2 
 b/MAINTAINERS |2 
 b/arch/alpha/kernel/pci_iommu.c   |2 
 b/arch/arc/mm/dma.c   |2 
 b/arch/arm/common/dmabounce.c |1 
 b/arch/arm/include/asm/dma-iommu.h|1 
 b/arch/arm/include/asm/dma-mapping.h  |1 
 b/arch/arm/mach-davinci/devices-da8xx.c   |   18 -
 b/arch/arm/mach-highbank/highbank.c   |2 
 b/arch/arm/mach-imx/mach-imx27_visstrim_m10.c |2 
 b/arch/arm/mach-imx/mach-mx31moboard.c|2 
 b/arch/arm/mach-mvebu/coherency.c |2 
 b/arch/arm/mach-shmobile/setup-rcar-gen2.c|2 
 b/arch/arm/mm/dma-mapping-nommu.c |1 
 b/arch/arm/mm/dma-mapping.c   |5 
 b/arch/arm/mm/init.c  |2 
 b/arch/arm/xen/mm.c   |2 
 b/arch/arm64/mm/dma-mapping.c |2 
 b/arch/arm64/mm/init.c|3 
 b/arch/c6x/mm/dma-coherent.c  |2 
 b/arch/csky/kernel/setup.c|2 
 b/arch/csky/mm/dma-mapping.c  |4 
 b/arch/hexagon/kernel/dma.c   |2 
 b/arch/ia64/hp/common/sba_iommu.c |2 
 b/arch/ia64/kernel/dma-mapping.c  |2 
 b/arch/ia64/mm/init.c |2 
 b/arch/m68k/kernel/dma.c  |2 
 b/arch/microblaze/kernel/dma.c|3 
 b/arch/microblaze/mm/consistent.c |2 
 b/arch/microblaze/mm/init.c   |2 
 b/arch/mips/jazz/jazzdma.c|2 
 b/arch/mips/kernel/setup.c|2 
 b/arch/mips/mm/dma-noncoherent.c  |3 
 b/arch/nds32/kernel/dma.c |2 
 b/arch/openrisc/kernel/dma.c  |2 
 b/arch/parisc/kernel/drivers.c|1 
 b/arch/parisc/kernel/pci-dma.c|2 
 b/arch/powerpc/include/asm/iommu.h|2 
 b/arch/powerpc/include/asm/pci.h  |2 
 b/arch/powerpc/mm/dma-noncoherent.c   |2 
 b/arch/powerpc/platforms/ps3/system-bus.c |2 
 b/arch/powerpc/platforms/pseries/ibmebus.c|2 
 b/arch/powerpc/platforms/pseries/vio.c|2 
 b/arch/s390/kernel/setup.c|2 
 b/arch/s390/pci/pci_dma.c |2 
 b/arch/sh/boards/mach-ap325rxa/setup.c|1 
 b/arch/sh/boards/mach-ecovec24/setup.c|1 
 b/arch/sh/boards/mach-kfr2r09/setup.c |2 
 b/arch/sh/boards/mach-migor/setup.c   |2 
 b/arch/sh/boards/mach-se/7724/setup.c |1 
 b/arch/sh/drivers/pci/fixups-dreamcast.c  |2 
 b/arch/sh/drivers/pci/pci.c   |1 
 b/arch/sh/kernel/dma-coherent.c   |2 
 b/arch/sparc/kernel/iommu.c   |2 
 b/arch/sparc/kernel/ioport.c  |2 
 b/arch/sparc/kernel/pci_sun4v.c   |1 
 b/arch/sparc/mm/io-unit.c |2 
 b/arch/sparc/mm/iommu.c   |2 
 b/arch/x86/include/asm/dma-mapping.h  |2 
 b/arch/x86/kernel/amd_gart_64.c   |1 
 b/arch/x86/kernel/pci-dma.c   |2 
 b/arch/x86/kernel/setup.c |2 
 b/arch/x86/xen/pci-swiotlb-xen.c  |2 
 b/arch/xtensa/kernel/pci-dma.c|3 
 b/arch/xtensa/mm/init.c   |2 
 b/drivers/acpi/arm64/iort.c   |2 
 b/drivers/acpi/scan.c |2 
 b/drivers/base/dd.c   |2 
 b/drivers/dma-buf/heaps/cma_heap.c|2 
 b/drivers/gpu/drm/exynos/exynos_drm_dma.c |2 
 b/drivers/gpu/drm/msm/msm_gem.c   |1 
 b/drivers/iommu/amd/iommu.c   |3 
 b/drivers/iommu/dma-iommu.c   |3 
 b/drivers/iommu/intel/iommu.c |4 
 b/drivers/media/platform/exynos4-is/fimc-is.c |1 
 b/drivers/misc/mic/bus/mic_bus.c  |1 
 b/drivers/misc/mic/bus/scif_bu

[PATCH 8/9] dma-mapping: move large parts of to kernel/dma

Most of the dma_direct symbols should only be used by direct.c and
mapping.c, so move them to kernel/dma.  In fact more of dma-direct.h
should eventually move, but that will require more coordination with
other subsystems.

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-direct.h | 106 -
 kernel/dma/direct.c|   2 +-
 kernel/dma/direct.h| 119 +
 kernel/dma/mapping.c   |   2 +-
 4 files changed, 121 insertions(+), 108 deletions(-)
 create mode 100644 kernel/dma/direct.h

diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index 38ed3b55034d50..a2d6640c42c04e 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -120,114 +120,8 @@ struct page *dma_direct_alloc_pages(struct device *dev, 
size_t size,
 void dma_direct_free_pages(struct device *dev, size_t size,
struct page *page, dma_addr_t dma_addr,
enum dma_data_direction dir);
-int dma_direct_get_sgtable(struct device *dev, struct sg_table *sgt,
-   void *cpu_addr, dma_addr_t dma_addr, size_t size,
-   unsigned long attrs);
-bool dma_direct_can_mmap(struct device *dev);
-int dma_direct_mmap(struct device *dev, struct vm_area_struct *vma,
-   void *cpu_addr, dma_addr_t dma_addr, size_t size,
-   unsigned long attrs);
 int dma_direct_supported(struct device *dev, u64 mask);
-bool dma_direct_need_sync(struct device *dev, dma_addr_t dma_addr);
-int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
-   enum dma_data_direction dir, unsigned long attrs);
 dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
size_t size, enum dma_data_direction dir, unsigned long attrs);
-size_t dma_direct_max_mapping_size(struct device *dev);
 
-#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
-defined(CONFIG_SWIOTLB)
-void dma_direct_sync_sg_for_device(struct device *dev, struct scatterlist *sgl,
-   int nents, enum dma_data_direction dir);
-#else
-static inline void dma_direct_sync_sg_for_device(struct device *dev,
-   struct scatterlist *sgl, int nents, enum dma_data_direction dir)
-{
-}
-#endif
-
-#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
-defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) || \
-defined(CONFIG_SWIOTLB)
-void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
-   int nents, enum dma_data_direction dir, unsigned long attrs);
-void dma_direct_sync_sg_for_cpu(struct device *dev,
-   struct scatterlist *sgl, int nents, enum dma_data_direction 
dir);
-#else
-static inline void dma_direct_unmap_sg(struct device *dev,
-   struct scatterlist *sgl, int nents, enum dma_data_direction dir,
-   unsigned long attrs)
-{
-}
-static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
-   struct scatterlist *sgl, int nents, enum dma_data_direction dir)
-{
-}
-#endif
-
-static inline void dma_direct_sync_single_for_device(struct device *dev,
-   dma_addr_t addr, size_t size, enum dma_data_direction dir)
-{
-   phys_addr_t paddr = dma_to_phys(dev, addr);
-
-   if (unlikely(is_swiotlb_buffer(paddr)))
-   swiotlb_tbl_sync_single(dev, paddr, size, dir, SYNC_FOR_DEVICE);
-
-   if (!dev_is_dma_coherent(dev))
-   arch_sync_dma_for_device(paddr, size, dir);
-}
-
-static inline void dma_direct_sync_single_for_cpu(struct device *dev,
-   dma_addr_t addr, size_t size, enum dma_data_direction dir)
-{
-   phys_addr_t paddr = dma_to_phys(dev, addr);
-
-   if (!dev_is_dma_coherent(dev)) {
-   arch_sync_dma_for_cpu(paddr, size, dir);
-   arch_sync_dma_for_cpu_all();
-   }
-
-   if (unlikely(is_swiotlb_buffer(paddr)))
-   swiotlb_tbl_sync_single(dev, paddr, size, dir, SYNC_FOR_CPU);
-
-   if (dir == DMA_FROM_DEVICE)
-   arch_dma_mark_clean(paddr, size);
-}
-
-static inline dma_addr_t dma_direct_map_page(struct device *dev,
-   struct page *page, unsigned long offset, size_t size,
-   enum dma_data_direction dir, unsigned long attrs)
-{
-   phys_addr_t phys = page_to_phys(page) + offset;
-   dma_addr_t dma_addr = phys_to_dma(dev, phys);
-
-   if (unlikely(swiotlb_force == SWIOTLB_FORCE))
-   return swiotlb_map(dev, phys, size, dir, attrs);
-
-   if (unlikely(!dma_capable(dev, dma_addr, size, true))) {
-   if (swiotlb_force != SWIOTLB_NO_FORCE)
-   return swiotlb_map(dev, phys, size, dir, attrs);
-
-   dev_WARN_ONCE(dev, 1,
-"DMA addr %pad+%zu overflow (mask %llx, bus limit 
%llx).\n",
-&dma_addr, size, *dev->dma_mask, 
dev->bus_dma_limit);
-   return DMA_MAPPING_ERROR;
-   }
-
-   if (!dev_is_dma

[PATCH 7/9] dma-mapping: move dma-debug.h to kernel/dma/

Most of dma-debug.h is not required by anything outside of kernel/dma.
Move the four declarations needed by dma-mappin.h or dma-ops providers
into dma-mapping.h and dma-map-ops.h, and move the remainder of the
file to kernel/dma/debug.h.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/include/asm/dma-iommu.h  |  1 -
 arch/arm/include/asm/dma-mapping.h|  1 -
 arch/microblaze/kernel/dma.c  |  1 -
 arch/sh/drivers/pci/pci.c |  1 -
 arch/x86/include/asm/dma-mapping.h|  1 -
 arch/x86/kernel/pci-dma.c |  1 -
 drivers/pci/pci-driver.c  |  1 +
 include/linux/dma-map-ops.h   | 12 +
 include/linux/dma-mapping.h   | 16 ++-
 kernel/dma/debug.c|  5 +--
 .../linux/dma-debug.h => kernel/dma/debug.h   | 44 ++-
 kernel/dma/mapping.c  |  1 +
 mm/memory.c   |  1 -
 13 files changed, 34 insertions(+), 52 deletions(-)
 rename include/linux/dma-debug.h => kernel/dma/debug.h (81%)

diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h
index 86405cc81385c8..fe9ef6f79e9cfe 100644
--- a/arch/arm/include/asm/dma-iommu.h
+++ b/arch/arm/include/asm/dma-iommu.h
@@ -6,7 +6,6 @@
 
 #include 
 #include 
-#include 
 #include 
 
 struct dma_iommu_mapping {
diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index 0a1a536368c3a4..77082246a5e121 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -6,7 +6,6 @@
 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
index d7bebd04247b72..a564863db06e98 100644
--- a/arch/microblaze/kernel/dma.c
+++ b/arch/microblaze/kernel/dma.c
@@ -10,7 +10,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/arch/sh/drivers/pci/pci.c b/arch/sh/drivers/pci/pci.c
index 6ab0b7377f6634..a3903304f33faa 100644
--- a/arch/sh/drivers/pci/pci.c
+++ b/arch/sh/drivers/pci/pci.c
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/arch/x86/include/asm/dma-mapping.h 
b/arch/x86/include/asm/dma-mapping.h
index e0c380b3ec1407..bb1654fe0ce74c 100644
--- a/arch/x86/include/asm/dma-mapping.h
+++ b/arch/x86/include/asm/dma-mapping.h
@@ -8,7 +8,6 @@
  */
 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index 4892dd043d414c..de234e7a8962eb 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -1,7 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 449466f71040d2..d1b7169c068403 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "pci.h"
 #include "pcie/portdrv.h"
 
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 7912f5d00ed950..9891def42da71f 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -220,6 +220,18 @@ static inline void arch_teardown_dma_ops(struct device 
*dev)
 }
 #endif /* CONFIG_ARCH_HAS_TEARDOWN_DMA_OPS */
 
+#ifdef CONFIG_DMA_API_DEBUG
+void dma_debug_add_bus(struct bus_type *bus);
+void debug_dma_dump_mappings(struct device *dev);
+#else
+static inline void dma_debug_add_bus(struct bus_type *bus)
+{
+}
+static inline void debug_dma_dump_mappings(struct device *dev)
+{
+}
+#endif /* CONFIG_DMA_API_DEBUG */
+
 extern const struct dma_map_ops dma_dummy_ops;
 
 #endif /* _LINUX_DMA_MAP_OPS_H */
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 9591cd482d7c2d..3f029afdc9dc6a 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -6,7 +6,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -76,6 +75,21 @@
 
 #define DMA_BIT_MASK(n)(((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
 
+#ifdef CONFIG_DMA_API_DEBUG
+void debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr);
+void debug_dma_map_single(struct device *dev, const void *addr,
+   unsigned long len);
+#else
+static inline void debug_dma_mapping_error(struct device *dev,
+   dma_addr_t dma_addr)
+{
+}
+static inline void debug_dma_map_single(struct device *dev, const void *addr,
+   unsigned long len)
+{
+}
+#endif /* CONFIG_DMA_API_DEBUG */
+
 #ifdef CONFIG_HAS_DMA
 static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 {
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 4211800d9f3e4d..14de1271463fd0 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -9,10 +9,9 @@
 
 #include 
 #include 
-#include 
+#include 
 #include 
 #incl

[PATCH 2/9] dma-contiguous: remove dma_declare_contiguous

dma_declare_contiguous is a trivial wrapper around
dma_contiguous_reserve_area and just has a single caller.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/mach-davinci/devices-da8xx.c | 16 +-
 include/linux/dma-contiguous.h| 32 ---
 2 files changed, 10 insertions(+), 38 deletions(-)

diff --git a/arch/arm/mach-davinci/devices-da8xx.c 
b/arch/arm/mach-davinci/devices-da8xx.c
index feb206bdf6e172..2e2853582b459e 100644
--- a/arch/arm/mach-davinci/devices-da8xx.c
+++ b/arch/arm/mach-davinci/devices-da8xx.c
@@ -884,6 +884,7 @@ early_param("rproc_mem", early_rproc_mem);
 
 void __init da8xx_rproc_reserve_cma(void)
 {
+   struct cma *cma;
int ret;
 
if (!rproc_base || !rproc_size) {
@@ -897,13 +898,16 @@ void __init da8xx_rproc_reserve_cma(void)
pr_info("%s: reserving 0x%lx @ 0x%lx...\n",
__func__, rproc_size, (unsigned long)rproc_base);
 
-   ret = dma_declare_contiguous(&da8xx_dsp.dev, rproc_size, rproc_base, 0);
-   if (ret)
-   pr_err("%s: dma_declare_contiguous failed %d\n", __func__, ret);
-   else
-   rproc_mem_inited = true;
+   ret = dma_contiguous_reserve_area(rproc_size, rproc_base, 0, &cma,
+   true);
+   if (ret) {
+   pr_err("%s: dma_contiguous_reserve_area failed %d\n",
+   __func__, ret);
+   return;
+   }
+   dev_set_cma_area(&da8xx_dsp.dev, cma);
+   rproc_mem_inited = true;
 }
-
 #else
 
 void __init da8xx_rproc_reserve_cma(void)
diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
index fe55e004f1f433..62fd55d0723546 100644
--- a/include/linux/dma-contiguous.h
+++ b/include/linux/dma-contiguous.h
@@ -83,31 +83,6 @@ int __init dma_contiguous_reserve_area(phys_addr_t size, 
phys_addr_t base,
   phys_addr_t limit, struct cma **res_cma,
   bool fixed);
 
-/**
- * dma_declare_contiguous() - reserve area for contiguous memory handling
- *   for particular device
- * @dev:   Pointer to device structure.
- * @size:  Size of the reserved memory.
- * @base:  Start address of the reserved memory (optional, 0 for any).
- * @limit: End address of the reserved memory (optional, 0 for any).
- *
- * This function reserves memory for specified device. It should be
- * called by board specific code when early allocator (memblock or bootmem)
- * is still activate.
- */
-
-static inline int dma_declare_contiguous(struct device *dev, phys_addr_t size,
-phys_addr_t base, phys_addr_t limit)
-{
-   struct cma *cma;
-   int ret;
-   ret = dma_contiguous_reserve_area(size, base, limit, &cma, true);
-   if (ret == 0)
-   dev_set_cma_area(dev, cma);
-
-   return ret;
-}
-
 struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
   unsigned int order, bool no_warn);
 bool dma_release_from_contiguous(struct device *dev, struct page *pages,
@@ -135,13 +110,6 @@ static inline int dma_contiguous_reserve_area(phys_addr_t 
size, phys_addr_t base
return -ENOSYS;
 }
 
-static inline
-int dma_declare_contiguous(struct device *dev, phys_addr_t size,
-  phys_addr_t base, phys_addr_t limit)
-{
-   return -ENOSYS;
-}
-
 static inline
 struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
   unsigned int order, bool no_warn)
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 4/9] dma-contiguous: remove dma_contiguous_set_default

dma_contiguous_set_default contains a trivial assignment, and has a
single caller that is compiled if CONFIG_CMA_DMA is enabled.

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-contiguous.h | 7 ---
 kernel/dma/contiguous.c| 2 +-
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
index 41ec08d81bc317..f9ce1ee58d4180 100644
--- a/include/linux/dma-contiguous.h
+++ b/include/linux/dma-contiguous.h
@@ -66,11 +66,6 @@ static inline struct cma *dev_get_cma_area(struct device 
*dev)
return dma_contiguous_default_area;
 }
 
-static inline void dma_contiguous_set_default(struct cma *cma)
-{
-   dma_contiguous_default_area = cma;
-}
-
 void dma_contiguous_reserve(phys_addr_t addr_limit);
 
 int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t base,
@@ -91,8 +86,6 @@ static inline struct cma *dev_get_cma_area(struct device *dev)
return NULL;
 }
 
-static inline void dma_contiguous_set_default(struct cma *cma) { }
-
 static inline void dma_contiguous_reserve(phys_addr_t limit) { }
 
 static inline int dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t 
base,
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 95adcee972e85c..bf05ec2256e149 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -407,7 +407,7 @@ static int __init rmem_cma_setup(struct reserved_mem *rmem)
dma_contiguous_early_fixup(rmem->base, rmem->size);
 
if (default_cma)
-   dma_contiguous_set_default(cma);
+   dma_contiguous_default_area = cma;
 
rmem->ops = &rmem_cma_ops;
rmem->priv = cma;
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 9/9] dma-mapping: merge into

Move more nitty gritty DMA implementation details into the common
internal header.

Signed-off-by: Christoph Hellwig 
---
 MAINTAINERS   |   1 -
 arch/arc/mm/dma.c |   1 -
 arch/arm/mm/dma-mapping.c |   1 -
 arch/arm/xen/mm.c |   2 +-
 arch/arm64/mm/dma-mapping.c   |   1 -
 arch/c6x/mm/dma-coherent.c|   2 +-
 arch/csky/mm/dma-mapping.c|   1 -
 arch/hexagon/kernel/dma.c |   2 +-
 arch/ia64/mm/init.c   |   2 +-
 arch/m68k/kernel/dma.c|   2 +-
 arch/microblaze/kernel/dma.c  |   2 +-
 arch/microblaze/mm/consistent.c   |   2 +-
 arch/mips/jazz/jazzdma.c  |   1 -
 arch/mips/mm/dma-noncoherent.c|   1 -
 arch/nds32/kernel/dma.c   |   2 +-
 arch/openrisc/kernel/dma.c|   2 +-
 arch/parisc/kernel/pci-dma.c  |   2 +-
 arch/powerpc/mm/dma-noncoherent.c |   2 +-
 arch/sh/kernel/dma-coherent.c |   2 +-
 arch/sparc/kernel/ioport.c|   2 +-
 arch/xtensa/kernel/pci-dma.c  |   1 -
 drivers/iommu/dma-iommu.c |   1 -
 drivers/xen/swiotlb-xen.c |   3 +-
 include/linux/dma-direct.h|   2 +-
 include/linux/dma-map-ops.h   | 102 
 include/linux/dma-noncoherent.h   | 109 --
 kernel/dma/ops_helpers.c  |   1 -
 kernel/dma/pool.c |   1 -
 kernel/dma/swiotlb.c  |   2 +-
 29 files changed, 118 insertions(+), 137 deletions(-)
 delete mode 100644 include/linux/dma-noncoherent.h

diff --git a/MAINTAINERS b/MAINTAINERS
index b13fc17943079a..c0dbe6e9de6549 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5203,7 +5203,6 @@ F:include/asm-generic/dma-mapping.h
 F: include/linux/dma-direct.h
 F: include/linux/dma-mapping.h
 F: include/linux/dma-map-ops.h
-F: include/linux/dma-noncoherent.h
 F: kernel/dma/
 
 DMA-BUF HEAPS FRAMEWORK
diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index a8c453e98d753b..517988e60cfc4d 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -4,7 +4,6 @@
  */
 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 911fc6ea26071e..c4b8df2ad32850 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -16,7 +16,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 396797ffe2b1df..5c80088db13b59 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 3afd3bd659d8de..93e87b2875567e 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -7,7 +7,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/arch/c6x/mm/dma-coherent.c b/arch/c6x/mm/dma-coherent.c
index a5909091cb1424..03df07a831fc99 100644
--- a/arch/c6x/mm/dma-coherent.c
+++ b/arch/c6x/mm/dma-coherent.c
@@ -15,7 +15,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include 
diff --git a/arch/csky/mm/dma-mapping.c b/arch/csky/mm/dma-mapping.c
index 3d9ff26c4bb4d8..c3a775a7e8f9d8 100644
--- a/arch/csky/mm/dma-mapping.c
+++ b/arch/csky/mm/dma-mapping.c
@@ -3,7 +3,6 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c
index 25f388d9cfcc36..00b9a81075dd54 100644
--- a/arch/hexagon/kernel/dma.c
+++ b/arch/hexagon/kernel/dma.c
@@ -5,7 +5,7 @@
  * Copyright (c) 2010-2012, The Linux Foundation. All rights reserved.
  */
 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 02e5aa08294ee0..ccba04d1267182 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -8,7 +8,7 @@
 #include 
 #include 
 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/m68k/kernel/dma.c b/arch/m68k/kernel/dma.c
index b1ca3522eccc2f..1c1b875fadc180 100644
--- a/arch/m68k/kernel/dma.c
+++ b/arch/m68k/kernel/dma.c
@@ -6,7 +6,7 @@
 
 #undef DEBUG
 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c
index a564863db06e98..04d091ade4172b 100644
--- a/arch/microblaze/kernel/dma.c
+++ b/arch/microblaze/kernel/dma.c
@@ -8,7 +8,7 @@
  */
 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/microblaze/mm/consistent.c b/arch/microblaze/mm/consistent.c
index e09b66e43cb63f..81dffe43b18c1e 100644
--- a/arch/microblaze/mm/consistent.c
+++ b/arch/microblaze/mm/consistent.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 
diff --git a/arch/mips/jazz/jazzdma.c b/arch/mips/jazz/jazzdma.c
index b8fb42e56da0a8..461457b289828a

[PATCH 1/9] dma-mapping: split

Split out all the bits that are purely for dma_map_ops implementations
and related code into a new  header so that they
don't get pulled into all the drivers.  That also means the architecture
specific  is not pulled in by 
any more, which leads to a missing includes that were pulled in by the
x86 or arm versions in a few not overly portable drivers.

Signed-off-by: Christoph Hellwig 
---
 MAINTAINERS |   1 +
 arch/alpha/kernel/pci_iommu.c   |   2 +-
 arch/arc/mm/dma.c   |   1 +
 arch/arm/common/dmabounce.c |   1 +
 arch/arm/mach-highbank/highbank.c   |   2 +-
 arch/arm/mach-imx/mach-imx27_visstrim_m10.c |   2 +-
 arch/arm/mach-imx/mach-mx31moboard.c|   2 +-
 arch/arm/mach-mvebu/coherency.c |   2 +-
 arch/arm/mm/dma-mapping-nommu.c |   1 +
 arch/arm/mm/dma-mapping.c   |   2 +-
 arch/arm64/mm/dma-mapping.c |   1 +
 arch/ia64/hp/common/sba_iommu.c |   2 +-
 arch/ia64/kernel/dma-mapping.c  |   2 +-
 arch/mips/jazz/jazzdma.c|   1 +
 arch/mips/mm/dma-noncoherent.c  |   1 +
 arch/parisc/kernel/drivers.c|   1 +
 arch/powerpc/include/asm/iommu.h|   2 +-
 arch/powerpc/include/asm/pci.h  |   2 +-
 arch/powerpc/platforms/ps3/system-bus.c |   2 +-
 arch/powerpc/platforms/pseries/ibmebus.c|   2 +-
 arch/powerpc/platforms/pseries/vio.c|   2 +-
 arch/s390/pci/pci_dma.c |   2 +-
 arch/sh/boards/mach-ap325rxa/setup.c|   1 +
 arch/sh/boards/mach-ecovec24/setup.c|   1 +
 arch/sh/boards/mach-kfr2r09/setup.c |   2 +-
 arch/sh/boards/mach-migor/setup.c   |   2 +-
 arch/sh/boards/mach-se/7724/setup.c |   1 +
 arch/sh/drivers/pci/fixups-dreamcast.c  |   2 +-
 arch/sparc/kernel/iommu.c   |   2 +-
 arch/sparc/kernel/pci_sun4v.c   |   1 +
 arch/sparc/mm/io-unit.c |   2 +-
 arch/sparc/mm/iommu.c   |   2 +-
 arch/x86/kernel/amd_gart_64.c   |   1 +
 arch/x86/kernel/pci-dma.c   |   1 +
 arch/x86/kernel/setup.c |   2 +
 arch/x86/xen/pci-swiotlb-xen.c  |   2 +-
 drivers/acpi/arm64/iort.c   |   2 +-
 drivers/acpi/scan.c |   2 +-
 drivers/base/dd.c   |   2 +-
 drivers/gpu/drm/exynos/exynos_drm_dma.c |   2 +-
 drivers/gpu/drm/msm/msm_gem.c   |   1 +
 drivers/iommu/dma-iommu.c   |   1 +
 drivers/iommu/intel/iommu.c |   2 +-
 drivers/misc/mic/bus/mic_bus.c  |   1 +
 drivers/misc/mic/bus/scif_bus.c |   2 +-
 drivers/misc/mic/bus/scif_bus.h |   2 +-
 drivers/misc/mic/bus/vop_bus.c  |   2 +-
 drivers/misc/mic/host/mic_boot.c|   1 +
 drivers/of/device.c |   1 +
 drivers/parisc/ccio-dma.c   |   1 +
 drivers/parisc/sba_iommu.c  |   1 +
 drivers/pci/xen-pcifront.c  |   1 +
 drivers/remoteproc/remoteproc_core.c|   1 +
 drivers/remoteproc/remoteproc_virtio.c  |   2 +-
 drivers/vdpa/vdpa_sim/vdpa_sim.c|   2 +-
 drivers/xen/swiotlb-xen.c   |   1 +
 include/linux/dma-map-ops.h | 158 ++
 include/linux/dma-mapping.h | 168 +---
 kernel/dma/coherent.c   |   1 +
 kernel/dma/direct.c |   1 +
 kernel/dma/dummy.c  |   2 +-
 kernel/dma/mapping.c|   2 +-
 kernel/dma/ops_helpers.c|   1 +
 kernel/dma/virt.c   |   2 +-
 64 files changed, 223 insertions(+), 200 deletions(-)
 create mode 100644 include/linux/dma-map-ops.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 190c7fa2ea010a..b13fc17943079a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5202,6 +5202,7 @@ T:git 
git://git.infradead.org/users/hch/dma-mapping.git
 F: include/asm-generic/dma-mapping.h
 F: include/linux/dma-direct.h
 F: include/linux/dma-mapping.h
+F: include/linux/dma-map-ops.h
 F: include/linux/dma-noncoherent.h
 F: kernel/dma/
 
diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 447e0fd0ed3895..d84b19aa8e9da7 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include 
diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index e947572a521ec0..a8c453e98d753b 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -3,6 +3,7 @@
  * Copyright (C) 2004, 2007-2010, 2011-2012 Synopsys, Inc. (www.synopsys.com)
  */
 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/arm/common/dmabounce.c b/arch

[PATCH 3/9] dma-contiguous: remove dev_set_cma_area

dev_set_cma_area contains a trivial assignment.  It has just three
callers that all have a non-NULL device and depend on CONFIG_DMA_CMA,
so remove the wrapper.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/mach-davinci/devices-da8xx.c | 2 +-
 include/linux/dma-contiguous.h| 8 
 kernel/dma/contiguous.c   | 4 ++--
 3 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/arch/arm/mach-davinci/devices-da8xx.c 
b/arch/arm/mach-davinci/devices-da8xx.c
index 2e2853582b459e..1207eabe8d1cc4 100644
--- a/arch/arm/mach-davinci/devices-da8xx.c
+++ b/arch/arm/mach-davinci/devices-da8xx.c
@@ -905,7 +905,7 @@ void __init da8xx_rproc_reserve_cma(void)
__func__, ret);
return;
}
-   dev_set_cma_area(&da8xx_dsp.dev, cma);
+   da8xx_dsp.dev.cma_area = cma;
rproc_mem_inited = true;
 }
 #else
diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
index 62fd55d0723546..41ec08d81bc317 100644
--- a/include/linux/dma-contiguous.h
+++ b/include/linux/dma-contiguous.h
@@ -66,12 +66,6 @@ static inline struct cma *dev_get_cma_area(struct device 
*dev)
return dma_contiguous_default_area;
 }
 
-static inline void dev_set_cma_area(struct device *dev, struct cma *cma)
-{
-   if (dev)
-   dev->cma_area = cma;
-}
-
 static inline void dma_contiguous_set_default(struct cma *cma)
 {
dma_contiguous_default_area = cma;
@@ -97,8 +91,6 @@ static inline struct cma *dev_get_cma_area(struct device *dev)
return NULL;
 }
 
-static inline void dev_set_cma_area(struct device *dev, struct cma *cma) { }
-
 static inline void dma_contiguous_set_default(struct cma *cma) { }
 
 static inline void dma_contiguous_reserve(phys_addr_t limit) { }
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index f4c150810fd25c..95adcee972e85c 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -359,14 +359,14 @@ void dma_free_contiguous(struct device *dev, struct page 
*page, size_t size)
 
 static int rmem_cma_device_init(struct reserved_mem *rmem, struct device *dev)
 {
-   dev_set_cma_area(dev, rmem->priv);
+   dev->cma_area = rmem->priv;
return 0;
 }
 
 static void rmem_cma_device_release(struct reserved_mem *rmem,
struct device *dev)
 {
-   dev_set_cma_area(dev, NULL);
+   dev->cma_area = NULL;
 }
 
 static const struct reserved_mem_ops rmem_cma_ops = {
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 5/9] dma-mapping: merge into

Merge dma-contiguous.h into dma-map-ops.h, after removing the comment
describing the contiguous allocator into kernel/dma/contigous.c.

Signed-off-by: Christoph Hellwig 
---
 .../admin-guide/kernel-parameters.txt |   2 +-
 arch/arm/mach-davinci/devices-da8xx.c |   2 +-
 arch/arm/mach-shmobile/setup-rcar-gen2.c  |   2 +-
 arch/arm/mm/dma-mapping.c |   1 -
 arch/arm/mm/init.c|   2 +-
 arch/arm64/mm/init.c  |   3 +-
 arch/csky/kernel/setup.c  |   2 +-
 arch/csky/mm/dma-mapping.c|   3 +-
 arch/microblaze/mm/init.c |   2 +-
 arch/mips/kernel/setup.c  |   2 +-
 arch/mips/mm/dma-noncoherent.c|   1 -
 arch/s390/kernel/setup.c  |   2 +-
 arch/x86/include/asm/dma-mapping.h|   1 -
 arch/x86/kernel/setup.c   |   2 +-
 arch/xtensa/kernel/pci-dma.c  |   2 +-
 arch/xtensa/mm/init.c |   2 +-
 drivers/dma-buf/heaps/cma_heap.c  |   2 +-
 drivers/iommu/amd/iommu.c |   3 +-
 drivers/iommu/dma-iommu.c |   1 -
 drivers/iommu/intel/iommu.c   |   2 +-
 drivers/media/platform/exynos4-is/fimc-is.c   |   1 -
 include/linux/dma-contiguous.h| 135 --
 include/linux/dma-map-ops.h   |  65 +
 kernel/dma/Kconfig|   2 +-
 kernel/dma/contiguous.c   |  30 +++-
 kernel/dma/direct.c   |   1 -
 kernel/dma/ops_helpers.c  |   1 -
 kernel/dma/pool.c |   2 +-
 28 files changed, 112 insertions(+), 164 deletions(-)
 delete mode 100644 include/linux/dma-contiguous.h

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index e464cf0b502591..7657e00e83ca38 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -597,7 +597,7 @@
placement constraint by the physical address range of
memory allocations. A value of 0 disables CMA
altogether. For more information, see
-   include/linux/dma-contiguous.h
+   kernel/dma/contiguous.c
 
cma_pernuma=nn[MG]
[ARM64,KNL]
diff --git a/arch/arm/mach-davinci/devices-da8xx.c 
b/arch/arm/mach-davinci/devices-da8xx.c
index 1207eabe8d1cc4..bb368938fc4921 100644
--- a/arch/arm/mach-davinci/devices-da8xx.c
+++ b/arch/arm/mach-davinci/devices-da8xx.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/arm/mach-shmobile/setup-rcar-gen2.c 
b/arch/arm/mach-shmobile/setup-rcar-gen2.c
index c42ff8c314c82d..e00f5b3b929362 100644
--- a/arch/arm/mach-shmobile/setup-rcar-gen2.c
+++ b/arch/arm/mach-shmobile/setup-rcar-gen2.c
@@ -9,7 +9,7 @@
 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 8bf0bc6bc31172..154c24cec94c74 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 000c1b48e9734f..ab1d1931a3525e 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -18,7 +18,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f1c75957ff3c24..1b591ddb28b01b 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -21,8 +21,7 @@
 #include 
 #include 
 #include 
-#include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/csky/kernel/setup.c b/arch/csky/kernel/setup.c
index 0481f4e34538ea..e4cab16056d606 100644
--- a/arch/csky/kernel/setup.c
+++ b/arch/csky/kernel/setup.c
@@ -7,7 +7,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/csky/mm/dma-mapping.c b/arch/csky/mm/dma-mapping.c
index 8f6571ae27c867..3d9ff26c4bb4d8 100644
--- a/arch/csky/mm/dma-mapping.c
+++ b/arch/csky/mm/dma-mapping.c
@@ -2,8 +2,7 @@
 // Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd.
 
 #include 
-#include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
index 3344d4a1fe890c..7659a8b86580fd 100644
--- a/arch/microblaze/mm/init.c
+++ b/arch/microblaze/mm/init.c
@@ -7,7 +7,7 @@
  * for more details.
  */
 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index bf5f5acab0a82f..464bfd3957ae96 100644
--- a/arch/mips/kerne

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

"On Wed, 30 Sep 2020 at 10:48, Nicolin Chen  wrote:
>
> From: Dmitry Osipenko 
>
> Multiple Tegra drivers need to retrieve Memory Controller and hence there
> is quite some duplication of the retrieval code among the drivers. Let's
> add a new common helper for the retrieval of the MC.
>
> Signed-off-by: Dmitry Osipenko 
> Signed-off-by: Nicolin Chen 
> ---
>
> Changelog
> v2->v3:
>  * Replaced with Dimtry's devm_tegra_get_memory_controller()
> v1->v2:
>  * N/A
>
>  drivers/memory/tegra/mc.c | 39 +++
>  include/soc/tegra/mc.h| 17 +
>  2 files changed, 56 insertions(+)
>
> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> index ec8403557ed4..dd691dc3738e 100644
> --- a/drivers/memory/tegra/mc.c
> +++ b/drivers/memory/tegra/mc.c
> @@ -42,6 +42,45 @@ static const struct of_device_id tegra_mc_of_match[] = {
>  };
>  MODULE_DEVICE_TABLE(of, tegra_mc_of_match);
>
> +static void tegra_mc_devm_action_put_device(void *data)

devm_tegra_memory_controller_put()

> +{
> +   struct tegra_mc *mc = data;
> +
> +   put_device(mc->dev);
> +}
> +
> +struct tegra_mc *devm_tegra_get_memory_controller(struct device *dev)

Usually 'get' is a suffix (e.g. clk, gpiod, iio, led), so:
devm_tegra_memory_controller_get()

> +{
> +   struct platform_device *pdev;
> +   struct device_node *np;
> +   struct tegra_mc *mc;
> +   int err;
> +
> +   np = of_find_matching_node_and_match(NULL, tegra_mc_of_match, NULL);
> +   if (!np)
> +   return ERR_PTR(-ENOENT);
> +
> +   pdev = of_find_device_by_node(np);
> +   of_node_put(np);
> +   if (!pdev)
> +   return ERR_PTR(-ENODEV);
> +
> +   mc = platform_get_drvdata(pdev);
> +   if (!mc) {
> +   put_device(mc->dev);
> +   return ERR_PTR(-EPROBE_DEFER);
> +   }
> +
> +   err = devm_add_action(dev, tegra_mc_devm_action_put_device, mc);
> +   if (err) {
> +   put_device(mc->dev);
> +   return ERR_PTR(err);
> +   }
> +
> +   return mc;
> +}
> +EXPORT_SYMBOL_GPL(devm_tegra_get_memory_controller);
> +
>  static int tegra_mc_block_dma_common(struct tegra_mc *mc,
>  const struct tegra_mc_reset *rst)
>  {
> diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
> index 1238e35653d1..c05142e3e244 100644
> --- a/include/soc/tegra/mc.h
> +++ b/include/soc/tegra/mc.h
> @@ -184,4 +184,21 @@ struct tegra_mc {
>  int tegra_mc_write_emem_configuration(struct tegra_mc *mc, unsigned long 
> rate);
>  unsigned int tegra_mc_get_emem_device_count(struct tegra_mc *mc);
>
> +#ifdef CONFIG_TEGRA_MC
> +/**
> + * devm_tegra_get_memory_controller() - Get the tegra_mc pointer.
> + * @dev: Device that will be interacted with

This is not precise enough and there is no interaction with 'dev' in
devm_tegra_get_memory_controller(). Something like: "Device that owns
the pointer to tegra memory controller"

> + *
> + * Return: ERR_PTR() on error or a valid pointer to a struct tegra_mc.
> + *
> + * The mc->dev counter will be automatically put by the device management 
> code.

1. s/mc/tegra_mc/ (it's the first occurence of word mc here)
2. "kerneldoc goes to the C file". Not to the header.

Best regards,
Krzysztof
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v10 00/13] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)

2020-09-30 Thread Jean-Philippe Brucker

On Mon, Sep 28, 2020 at 11:39:02PM +0100, Will Deacon wrote:
> On Mon, Sep 28, 2020 at 06:23:15PM +0100, Will Deacon wrote:
> > On Mon, Sep 28, 2020 at 06:47:31PM +0200, Jean-Philippe Brucker wrote:
> > > On Fri, Sep 18, 2020 at 12:18:40PM +0200, Jean-Philippe Brucker wrote:
> > > > This is version 10 of the page table sharing support for Arm SMMUv3.
> > > > Patch 1 still needs an Ack from mm maintainers. However patches 4-11 do
> > > > not depend on it, and could get merged for v5.10 regardless.
> > > 
> > > Are you OK with taking patches 4-11 for v5.10?
> > > 
> > > The rest depends on patch 1 which hasn't been acked yet. It's
> > > uncontroversial and I'm sure it will eventually make it. In case it
> > > doesn't, we'll keep track of mm->pasid within the IOMMU subsystem instead.
> > 
> > I was off most of last week, but I plan to see how much of this I can queue
> > tonight. Stay tuned...
> 
> I've queued 4-11 locally, but I've put 4 and 6 on a shared branch with arm64
> (for-next/svm) so I'd like that to hit next before I push out the merge into
> the branch for Joerg.

Great, thanks! I'll split the remainder into two or three small series

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v10 01/13] mm: Define pasid in mm

2020-09-30 Thread Jean-Philippe Brucker

On Mon, Sep 28, 2020 at 10:43:26PM +, Fenghua Yu wrote:
> Hi, Will and Jean,
> 
> On Mon, Sep 28, 2020 at 11:22:51PM +0100, Will Deacon wrote:
> > On Fri, Sep 18, 2020 at 12:18:41PM +0200, Jean-Philippe Brucker wrote:
> > > From: Fenghua Yu 
> > > 
> > > PASID is shared by all threads in a process. So the logical place to keep
> > > track of it is in the "mm". Both ARM and X86 need to use the PASID in the
> > > "mm".
> > > 
> > > Suggested-by: Christoph Hellwig 
> > > Signed-off-by: Fenghua Yu 
> > > Reviewed-by: Tony Luck 
> > > ---
> > > https://lore.kernel.org/linux-iommu/1600187413-163670-8-git-send-email-fenghua...@intel.com/
> > > ---
> > >  include/linux/mm_types.h | 4 
> > >  1 file changed, 4 insertions(+)
> > 
> > Acked-by: Will Deacon 
> 
> FYI. This patch is in x86 maintainers tree tip:x86/pasid now as part of
> the x86 PASID MSR series.

Ah I missed that, glad to see it in v5.10

Thanks,
Jean
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

On Wed, 30 Sep 2020 at 10:48, Nicolin Chen  wrote:
>
> Previously the driver relies on bus_set_iommu() in .probe() to call
> in .probe_device() function so each client can poll iommus property
> in DTB to configure fwspec via tegra_smmu_configure(). According to
> the comments in .probe(), this is a bit of a hack. And this doesn't
> work for a client that doesn't exist in DTB, PCI device for example.
>
> Actually when a device/client gets probed, the of_iommu_configure()
> will call in .probe_device() function again, with a prepared fwspec
> from of_iommu_configure() that reads the SWGROUP id in DTB as we do
> in tegra-smmu driver.
>
> Additionally, as a new helper devm_tegra_get_memory_controller() is
> introduced, there's no need to poll the iommus property in order to
> get mc->smmu pointers or SWGROUP id.
>
> This patch reworks .probe_device() and .attach_dev() by doing:
> 1) Using fwspec to get swgroup id in .attach_dev/.dettach_dev()
> 2) Removing DT polling code, tegra_smmu_find/tegra_smmu_configure()
> 3) Calling devm_tegra_get_memory_controller() in .probe_device()
> 4) Also dropping the hack in .probe() that's no longer needed.
>
> Signed-off-by: Nicolin Chen 
> ---
>
> Changelog
> v2->v3
>  * Used devm_tegra_get_memory_controller() to get mc pointer
>  * Replaced IS_ERR_OR_NULL with IS_ERR in .probe_device()
> v1->v2
>  * Replaced in .probe_device() tegra_smmu_find/tegra_smmu_configure()
>with tegra_get_memory_controller call.
>  * Dropped the hack in tegra_smmu_probe().
>
>  drivers/iommu/tegra-smmu.c | 144 ++---
>  1 file changed, 36 insertions(+), 108 deletions(-)
>
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 6a3ecc334481..636dc3b89545 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -61,6 +61,8 @@ struct tegra_smmu_as {
> u32 attr;
>  };
>
> +static const struct iommu_ops tegra_smmu_ops;

I cannot find in this patch where this is assigned.

Best regards,
Krzysztof
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

On Wed, Sep 30, 2020 at 11:21:14AM +0200, Krzysztof Kozlowski wrote:
> On Wed, 30 Sep 2020 at 10:48, Nicolin Chen  wrote:
> >
> > Previously the driver relies on bus_set_iommu() in .probe() to call
> > in .probe_device() function so each client can poll iommus property
> > in DTB to configure fwspec via tegra_smmu_configure(). According to
> > the comments in .probe(), this is a bit of a hack. And this doesn't
> > work for a client that doesn't exist in DTB, PCI device for example.
> >
> > Actually when a device/client gets probed, the of_iommu_configure()
> > will call in .probe_device() function again, with a prepared fwspec
> > from of_iommu_configure() that reads the SWGROUP id in DTB as we do
> > in tegra-smmu driver.
> >
> > Additionally, as a new helper devm_tegra_get_memory_controller() is
> > introduced, there's no need to poll the iommus property in order to
> > get mc->smmu pointers or SWGROUP id.
> >
> > This patch reworks .probe_device() and .attach_dev() by doing:
> > 1) Using fwspec to get swgroup id in .attach_dev/.dettach_dev()
> > 2) Removing DT polling code, tegra_smmu_find/tegra_smmu_configure()
> > 3) Calling devm_tegra_get_memory_controller() in .probe_device()
> > 4) Also dropping the hack in .probe() that's no longer needed.
> >
> > Signed-off-by: Nicolin Chen 
> > ---
> >
> > Changelog
> > v2->v3
> >  * Used devm_tegra_get_memory_controller() to get mc pointer
> >  * Replaced IS_ERR_OR_NULL with IS_ERR in .probe_device()
> > v1->v2
> >  * Replaced in .probe_device() tegra_smmu_find/tegra_smmu_configure()
> >with tegra_get_memory_controller call.
> >  * Dropped the hack in tegra_smmu_probe().
> >
> >  drivers/iommu/tegra-smmu.c | 144 ++---
> >  1 file changed, 36 insertions(+), 108 deletions(-)
> >
> > diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> > index 6a3ecc334481..636dc3b89545 100644
> > --- a/drivers/iommu/tegra-smmu.c
> > +++ b/drivers/iommu/tegra-smmu.c
> > @@ -61,6 +61,8 @@ struct tegra_smmu_as {
> > u32 attr;
> >  };
> >
> > +static const struct iommu_ops tegra_smmu_ops;
> 
> I cannot find in this patch where this is assigned.

Because it's already set in probe():
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/iommu/tegra-smmu.c#n1162

And my PATCH-3 sets it for PCI bus also.

Thanks
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

On Wed, Sep 30, 2020 at 11:07:32AM +0200, Krzysztof Kozlowski wrote:
> "On Wed, 30 Sep 2020 at 10:48, Nicolin Chen  wrote:
> >
> > From: Dmitry Osipenko 
> >
> > Multiple Tegra drivers need to retrieve Memory Controller and hence there
> > is quite some duplication of the retrieval code among the drivers. Let's
> > add a new common helper for the retrieval of the MC.
> >
> > Signed-off-by: Dmitry Osipenko 
> > Signed-off-by: Nicolin Chen 
> > ---
> >
> > Changelog
> > v2->v3:
> >  * Replaced with Dimtry's devm_tegra_get_memory_controller()
> > v1->v2:
> >  * N/A
> >
> >  drivers/memory/tegra/mc.c | 39 +++
> >  include/soc/tegra/mc.h| 17 +
> >  2 files changed, 56 insertions(+)
> >
> > diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> > index ec8403557ed4..dd691dc3738e 100644
> > --- a/drivers/memory/tegra/mc.c
> > +++ b/drivers/memory/tegra/mc.c
> > @@ -42,6 +42,45 @@ static const struct of_device_id tegra_mc_of_match[] = {
> >  };
> >  MODULE_DEVICE_TABLE(of, tegra_mc_of_match);
> >
> > +static void tegra_mc_devm_action_put_device(void *data)
> 
> devm_tegra_memory_controller_put()
> 
> > +{
> > +   struct tegra_mc *mc = data;
> > +
> > +   put_device(mc->dev);
> > +}
> > +
> > +struct tegra_mc *devm_tegra_get_memory_controller(struct device *dev)
> 
> Usually 'get' is a suffix (e.g. clk, gpiod, iio, led), so:
> devm_tegra_memory_controller_get()
> 
> > +{
> > +   struct platform_device *pdev;
> > +   struct device_node *np;
> > +   struct tegra_mc *mc;
> > +   int err;
> > +
> > +   np = of_find_matching_node_and_match(NULL, tegra_mc_of_match, NULL);
> > +   if (!np)
> > +   return ERR_PTR(-ENOENT);
> > +
> > +   pdev = of_find_device_by_node(np);
> > +   of_node_put(np);
> > +   if (!pdev)
> > +   return ERR_PTR(-ENODEV);
> > +
> > +   mc = platform_get_drvdata(pdev);
> > +   if (!mc) {
> > +   put_device(mc->dev);
> > +   return ERR_PTR(-EPROBE_DEFER);
> > +   }
> > +
> > +   err = devm_add_action(dev, tegra_mc_devm_action_put_device, mc);
> > +   if (err) {
> > +   put_device(mc->dev);
> > +   return ERR_PTR(err);
> > +   }
> > +
> > +   return mc;
> > +}
> > +EXPORT_SYMBOL_GPL(devm_tegra_get_memory_controller);
> > +
> >  static int tegra_mc_block_dma_common(struct tegra_mc *mc,
> >  const struct tegra_mc_reset *rst)
> >  {
> > diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
> > index 1238e35653d1..c05142e3e244 100644
> > --- a/include/soc/tegra/mc.h
> > +++ b/include/soc/tegra/mc.h
> > @@ -184,4 +184,21 @@ struct tegra_mc {
> >  int tegra_mc_write_emem_configuration(struct tegra_mc *mc, unsigned long 
> > rate);
> >  unsigned int tegra_mc_get_emem_device_count(struct tegra_mc *mc);
> >
> > +#ifdef CONFIG_TEGRA_MC
> > +/**
> > + * devm_tegra_get_memory_controller() - Get the tegra_mc pointer.
> > + * @dev: Device that will be interacted with
> 
> This is not precise enough and there is no interaction with 'dev' in
> devm_tegra_get_memory_controller(). Something like: "Device that owns
> the pointer to tegra memory controller"
> 
> > + *
> > + * Return: ERR_PTR() on error or a valid pointer to a struct tegra_mc.
> > + *
> > + * The mc->dev counter will be automatically put by the device management 
> > code.
> 
> 1. s/mc/tegra_mc/ (it's the first occurence of word mc here)
> 2. "kerneldoc goes to the C file". Not to the header.

I will send v4 after changing all of the places.

Thanks for the comments!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 0/3] iommu: Reserved regions for IOVAs beyond dma_mask and iommu aperture

2020-09-30 Thread Auger Eric

Hi Alex,

On 9/29/20 8:18 PM, Alex Williamson wrote:
> On Tue, 29 Sep 2020 09:18:22 +0200
> Auger Eric  wrote:
> 
>> Hi all,
>>
>> [also correcting some outdated email addresses + adding Lorenzo in cc]
>>
>> On 9/29/20 12:42 AM, Alex Williamson wrote:
>>> On Mon, 28 Sep 2020 21:50:34 +0200
>>> Eric Auger  wrote:
>>>   
 VFIO currently exposes the usable IOVA regions through the
 VFIO_IOMMU_GET_INFO ioctl / VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE
 capability. However it fails to take into account the dma_mask
 of the devices within the container. The top limit currently is
 defined by the iommu aperture.  
>>>
>>> I think that dma_mask is traditionally a DMA API interface for a device
>>> driver to indicate to the DMA layer which mappings are accessible to the
>>> device.  On the other hand, vfio makes use of the IOMMU API where the
>>> driver is in userspace.  That userspace driver has full control of the
>>> IOVA range of the device, therefore dma_mask is mostly irrelevant to
>>> vfio.  I think the issue you're trying to tackle is that the IORT code
>>> is making use of the dma_mask to try to describe a DMA address
>>> limitation imposed by the PCI root bus, living between the endpoint
>>> device and the IOMMU.  Therefore, if the IORT code is exposing a
>>> topology or system imposed device limitation, this seems much more akin
>>> to something like an MSI reserved range, where it's not necessarily the
>>> device or the IOMMU with the limitation, but something that sits
>>> between them.  
>>
>> First I think I failed to explain the context. I worked on NVMe
>> passthrough on ARM. The QEMU NVMe backend uses VFIO to program the
>> physical device. The IOVA allocator there currently uses an IOVA range
>> within [0x1, 1ULL << 39]. This IOVA layout rather is arbitrary if I
>> understand correctly.
> 
> 39 bits is the minimum available on some VT-d systems, so it was
> probably considered a reasonable minimum address width to consider.
OK
> 
>> I noticed we rapidly get some VFIO MAP DMA
>> failures because the allocated IOVA collide with the ARM MSI reserved
>> IOVA window [0x800, 0x810]. Since  9b77e5c79840 ("vfio/type1:
>> Check reserved region conflict and update iova list"), such VFIO MAP DMA
>> attempts to map IOVAs belonging to host reserved IOVA windows fail. So,
>> by using the VFIO_IOMMU_GET_INFO ioctl /
>> VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE I can change the IOVA allocator to
>> avoid allocating within this range and others. While working on this, I
>> tried to automatically compute the min/max IOVAs and change the
>> arbitrary [0x1, 1ULL << 39]. My SMMUv2 supports up to 48b so
>> naturally the max IOVA was computed as 1ULL << 48. The QEMU NVMe backend
>> allocates at the bottom and at the top of the range. I noticed the use
>> case was not working as soon as the top IOVA was more than 1ULL << 42.
>> And then we noticed the dma_mask was set to 42 by using
>> cat  /sys/bus/pci/devices/0005:01:00.0/dma_mask_bits. So my
>> interpretation is the dma_mask was somehow containing the info the
>> device couldn't handle IOVAs beyond a certain limit.
> 
> I see that there are both OF and ACPI hooks in pci_dma_configure() and
> both modify dev->dma_mask, which is what pci-sysfs is exposing here,
> but I'm not convinced this even does what it's intended to do.  The
> driver core calls this via the bus->dma_configure callback before
> probing a driver, but then what happens when the driver calls
> pci_set_dma_mask()?  This is just a wrapper for dma_set_mask() and I
> don't see anywhere that would take into account the existing
> dev->dma_mask.  It seems for example that pci_dma_configure() could
> produce a 42 bit mask as we have here, then the driver could override
> that with anything that the dma_ops.dma_supported() callback finds
> acceptable, and I don't see any instances where the current
> dev->dma_mask is considered.  Am I overlooking something? 

I don't see it either. So the dma_mask set by the driver would never be
checked against the dma_mask limited found when parsing OF/ACPI?
>  
>> In my case the 42b limit is computed in iort_dma_setup() by
>> acpi_dma_get_range(dev, &dmaaddr, &offset, &size);
>>
>> Referring to the comment, it does "Evaluate DMA regions and return
>> respectively DMA region start, offset and size in dma_addr, offset and
>> size on parsing success". This parses the ACPI table, looking for ACPI
>> companions with _DMA methods.
>>
>> But as Alex mentioned, the IORT also allows to define limits on "the
>> number of address bits, starting from the least significant bit that can
>> be generated by a device when it accesses memory". See Named component
>> node.Device Memory Address Size limit or PCI root complex node. Memory
>> address size limit.
>>
>> ret = acpi_dma_get_range(dev, &dmaaddr, &offset, &size);
>> if (ret == -ENODEV)
>> ret = dev_is_pci(dev) ? rc_dma_get_range(dev, &size)
>>

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

On Wed, Sep 30, 2020 at 02:40:32AM -0700, Nicolin Chen wrote:
> On Wed, Sep 30, 2020 at 11:21:14AM +0200, Krzysztof Kozlowski wrote:
> > On Wed, 30 Sep 2020 at 10:48, Nicolin Chen  wrote:
> > >
> > > Previously the driver relies on bus_set_iommu() in .probe() to call
> > > in .probe_device() function so each client can poll iommus property
> > > in DTB to configure fwspec via tegra_smmu_configure(). According to
> > > the comments in .probe(), this is a bit of a hack. And this doesn't
> > > work for a client that doesn't exist in DTB, PCI device for example.
> > >
> > > Actually when a device/client gets probed, the of_iommu_configure()
> > > will call in .probe_device() function again, with a prepared fwspec
> > > from of_iommu_configure() that reads the SWGROUP id in DTB as we do
> > > in tegra-smmu driver.
> > >
> > > Additionally, as a new helper devm_tegra_get_memory_controller() is
> > > introduced, there's no need to poll the iommus property in order to
> > > get mc->smmu pointers or SWGROUP id.
> > >
> > > This patch reworks .probe_device() and .attach_dev() by doing:
> > > 1) Using fwspec to get swgroup id in .attach_dev/.dettach_dev()
> > > 2) Removing DT polling code, tegra_smmu_find/tegra_smmu_configure()
> > > 3) Calling devm_tegra_get_memory_controller() in .probe_device()
> > > 4) Also dropping the hack in .probe() that's no longer needed.
> > >
> > > Signed-off-by: Nicolin Chen 
> > > ---
> > >
> > > Changelog
> > > v2->v3
> > >  * Used devm_tegra_get_memory_controller() to get mc pointer
> > >  * Replaced IS_ERR_OR_NULL with IS_ERR in .probe_device()
> > > v1->v2
> > >  * Replaced in .probe_device() tegra_smmu_find/tegra_smmu_configure()
> > >with tegra_get_memory_controller call.
> > >  * Dropped the hack in tegra_smmu_probe().
> > >
> > >  drivers/iommu/tegra-smmu.c | 144 ++---
> > >  1 file changed, 36 insertions(+), 108 deletions(-)
> > >
> > > diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> > > index 6a3ecc334481..636dc3b89545 100644
> > > --- a/drivers/iommu/tegra-smmu.c
> > > +++ b/drivers/iommu/tegra-smmu.c
> > > @@ -61,6 +61,8 @@ struct tegra_smmu_as {
> > > u32 attr;
> > >  };
> > >
> > > +static const struct iommu_ops tegra_smmu_ops;
> > 
> > I cannot find in this patch where this is assigned.
> 
> Because it's already set in probe():
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/iommu/tegra-smmu.c#n1162
> 
> And my PATCH-3 sets it for PCI bus also.

OK, good point. Thanks for explanation.

Best regards,
Krzysztof
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

On Wed, Sep 30, 2020 at 02:41:45AM -0700, Nicolin Chen wrote:
> On Wed, Sep 30, 2020 at 11:07:32AM +0200, Krzysztof Kozlowski wrote:
> > "On Wed, 30 Sep 2020 at 10:48, Nicolin Chen  wrote:
> > >
> > > From: Dmitry Osipenko 
> > >
> > > Multiple Tegra drivers need to retrieve Memory Controller and hence there
> > > is quite some duplication of the retrieval code among the drivers. Let's
> > > add a new common helper for the retrieval of the MC.
> > >
> > > Signed-off-by: Dmitry Osipenko 
> > > Signed-off-by: Nicolin Chen 
> > > ---
> > >
> > > Changelog
> > > v2->v3:
> > >  * Replaced with Dimtry's devm_tegra_get_memory_controller()
> > > v1->v2:
> > >  * N/A
> > >
> > >  drivers/memory/tegra/mc.c | 39 +++
> > >  include/soc/tegra/mc.h| 17 +
> > >  2 files changed, 56 insertions(+)
> > >
> > > diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> > > index ec8403557ed4..dd691dc3738e 100644
> > > --- a/drivers/memory/tegra/mc.c
> > > +++ b/drivers/memory/tegra/mc.c
> > > @@ -42,6 +42,45 @@ static const struct of_device_id tegra_mc_of_match[] = 
> > > {
> > >  };
> > >  MODULE_DEVICE_TABLE(of, tegra_mc_of_match);
> > >
> > > +static void tegra_mc_devm_action_put_device(void *data)
> > 
> > devm_tegra_memory_controller_put()

My bad here, this is not a "put" helper so the previous name was
actually good. No need to change.

> > 
> > > +{
> > > +   struct tegra_mc *mc = data;
> > > +
> > > +   put_device(mc->dev);
> > > +}
> > > +
> > > +struct tegra_mc *devm_tegra_get_memory_controller(struct device *dev)
> > 
> > Usually 'get' is a suffix (e.g. clk, gpiod, iio, led), so:
> > devm_tegra_memory_controller_get()
> > 
> > > +{
> > > +   struct platform_device *pdev;
> > > +   struct device_node *np;
> > > +   struct tegra_mc *mc;
> > > +   int err;
> > > +
> > > +   np = of_find_matching_node_and_match(NULL, tegra_mc_of_match, 
> > > NULL);
> > > +   if (!np)
> > > +   return ERR_PTR(-ENOENT);
> > > +
> > > +   pdev = of_find_device_by_node(np);
> > > +   of_node_put(np);
> > > +   if (!pdev)
> > > +   return ERR_PTR(-ENODEV);
> > > +
> > > +   mc = platform_get_drvdata(pdev);
> > > +   if (!mc) {
> > > +   put_device(mc->dev);
> > > +   return ERR_PTR(-EPROBE_DEFER);
> > > +   }
> > > +
> > > +   err = devm_add_action(dev, tegra_mc_devm_action_put_device, mc);

This can be simpler with devm_add_action_or_reset.

Best regards,
Krzysztof
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [patch V2 00/46] x86, PCI, XEN, genirq ...: Prepare for device MSI

2020-09-30 Thread Jason Gunthorpe

On Wed, Sep 30, 2020 at 08:41:48AM +0200, Thomas Gleixner wrote:
> On Tue, Sep 29 2020 at 16:03, Megha Dey wrote:
> > On 8/26/2020 4:16 AM, Thomas Gleixner wrote:
> >> #9 is obviously just for the folks interested in IMS
> >>
> >
> > I see that the tip tree (as of 9/29) has most of these patches but 
> > notice that the DEV_MSI related patches
> >
> > haven't made it. I have tested the tip tree(x86/irq branch) with your
> > DEV_MSI infra patches and our IMS patches with the IDXD driver and was
> 
> Your IMS patches? Why do you need something special again?
> 
> > wondering if we should push out those patches as part of our patchset?
> 
> As I don't have any hardware to test that, I was waiting for you and
> Jason to confirm that this actually works for the two different IMS
> implementations.

How urgently do you need this? The code looked good from what I
understood. It will be a while before we have all the parts to send an
actual patch though.

We might be able to put together a mockup just to prove it

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [patch V2 24/46] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI

2020-09-30 Thread Derrick, Jonathan

Hi Jason

On Mon, 2020-08-31 at 11:39 -0300, Jason Gunthorpe wrote:
> On Wed, Aug 26, 2020 at 01:16:52PM +0200, Thomas Gleixner wrote:
> > From: Thomas Gleixner 
> > 
> > Devices on the VMD bus use their own MSI irq domain, but it is not
> > distinguishable from regular PCI/MSI irq domains. This is required
> > to exclude VMD devices from getting the irq domain pointer set by
> > interrupt remapping.
> > 
> > Override the default bus token.
> > 
> > Signed-off-by: Thomas Gleixner 
> > Acked-by: Bjorn Helgaas 
> >  drivers/pci/controller/vmd.c |6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
> > return -ENODEV;
> > }
> >  
> > +   /*
> > +* Override the irq domain bus token so the domain can be distinguished
> > +* from a regular PCI/MSI domain.
> > +*/
> > +   irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> > +
> 
> Having the non-transparent-bridge hold a MSI table and
> multiplex/de-multiplex IRQs looks like another good use case for
> something close to pci_subdevice_msi_create_irq_domain()?
> 
> If each device could have its own internal MSI-X table programmed
> properly things would work alot better. Disable capture/remap of the
> MSI range in the NTB.
We can disable the capture and remap in newer devices so we don't even
need the irq domain. Legacy VMD will automatically remap based on the
APIC dest bits in the MSI address.

I would however like to determine if the MSI data bits could be used
for individual devices to have the IRQ domain subsystem demultiplex the
virq from that and eliminate the IRQ list iteration.

A concern I have about that scheme is virtualization as I think those
bits are used to route to the virtual vector.

> 
> Then each device could have a proper non-multiplexed interrupt
> delivered to the CPU.. Affinity would work properly, no need to share
> IRQs (eg vmd_irq() goes away)/etc.
> 
> Something for the VMD maintainers to think about at least.
> 
> As I hear more about NTB these days a full MSI scheme for NTB seems
> interesting to have in the PCI-E core code..
> 
> Jason


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [patch V2 24/46] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI

2020-09-30 Thread Jason Gunthorpe

On Wed, Sep 30, 2020 at 12:45:30PM +, Derrick, Jonathan wrote:
> Hi Jason
> 
> On Mon, 2020-08-31 at 11:39 -0300, Jason Gunthorpe wrote:
> > On Wed, Aug 26, 2020 at 01:16:52PM +0200, Thomas Gleixner wrote:
> > > From: Thomas Gleixner 
> > > 
> > > Devices on the VMD bus use their own MSI irq domain, but it is not
> > > distinguishable from regular PCI/MSI irq domains. This is required
> > > to exclude VMD devices from getting the irq domain pointer set by
> > > interrupt remapping.
> > > 
> > > Override the default bus token.
> > > 
> > > Signed-off-by: Thomas Gleixner 
> > > Acked-by: Bjorn Helgaas 
> > >  drivers/pci/controller/vmd.c |6 ++
> > >  1 file changed, 6 insertions(+)
> > > 
> > > +++ b/drivers/pci/controller/vmd.c
> > > @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
> > >   return -ENODEV;
> > >   }
> > >  
> > > + /*
> > > +  * Override the irq domain bus token so the domain can be distinguished
> > > +  * from a regular PCI/MSI domain.
> > > +  */
> > > + irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> > > +
> > 
> > Having the non-transparent-bridge hold a MSI table and
> > multiplex/de-multiplex IRQs looks like another good use case for
> > something close to pci_subdevice_msi_create_irq_domain()?
> > 
> > If each device could have its own internal MSI-X table programmed
> > properly things would work alot better. Disable capture/remap of the
> > MSI range in the NTB.

> We can disable the capture and remap in newer devices so we don't even
> need the irq domain.

You'd still need an irq domain, it just comes from
pci_subdevice_msi_create_irq_domain() instead of internal to this
driver.

> I would however like to determine if the MSI data bits could be used
> for individual devices to have the IRQ domain subsystem demultiplex the
> virq from that and eliminate the IRQ list iteration.

Yes, exactly. This new pci_subdevice_msi_create_irq_domain() creates
*proper* fully functional interrupts, no need for any IRQ handler in
this driver.

> A concern I have about that scheme is virtualization as I think those
> bits are used to route to the virtual vector.

It should be fine with this patch series, consult with Megha how
virtualization is working with IDXD/etc

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [patch V2 24/46] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI

2020-09-30 Thread Derrick, Jonathan

+Megha

On Wed, 2020-09-30 at 09:57 -0300, Jason Gunthorpe wrote:
> On Wed, Sep 30, 2020 at 12:45:30PM +, Derrick, Jonathan wrote:
> > Hi Jason
> > 
> > On Mon, 2020-08-31 at 11:39 -0300, Jason Gunthorpe wrote:
> > > On Wed, Aug 26, 2020 at 01:16:52PM +0200, Thomas Gleixner wrote:
> > > > From: Thomas Gleixner 
> > > > 
> > > > Devices on the VMD bus use their own MSI irq domain, but it is not
> > > > distinguishable from regular PCI/MSI irq domains. This is required
> > > > to exclude VMD devices from getting the irq domain pointer set by
> > > > interrupt remapping.
> > > > 
> > > > Override the default bus token.
> > > > 
> > > > Signed-off-by: Thomas Gleixner 
> > > > Acked-by: Bjorn Helgaas 
> > > >  drivers/pci/controller/vmd.c |6 ++
> > > >  1 file changed, 6 insertions(+)
> > > > 
> > > > +++ b/drivers/pci/controller/vmd.c
> > > > @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
> > > > return -ENODEV;
> > > > }
> > > >  
> > > > +   /*
> > > > +* Override the irq domain bus token so the domain can be 
> > > > distinguished
> > > > +* from a regular PCI/MSI domain.
> > > > +*/
> > > > +   irq_domain_update_bus_token(vmd->irq_domain, 
> > > > DOMAIN_BUS_VMD_MSI);
> > > > +
> > > 
> > > Having the non-transparent-bridge hold a MSI table and
> > > multiplex/de-multiplex IRQs looks like another good use case for
> > > something close to pci_subdevice_msi_create_irq_domain()?
> > > 
> > > If each device could have its own internal MSI-X table programmed
> > > properly things would work alot better. Disable capture/remap of the
> > > MSI range in the NTB.
> > We can disable the capture and remap in newer devices so we don't even
> > need the irq domain.
> 
> You'd still need an irq domain, it just comes from
> pci_subdevice_msi_create_irq_domain() instead of internal to this
> driver.
I have this set which disables remapping and bypasses the creation of
the IRQ domain:
https://patchwork.ozlabs.org/project/linux-pci/list/?series=192936

It allows the end-devices like NVMe to directly manager their own
interrupts and eliminates the VMD interrupt completely. The only quirk
was that kernel has to program IRTE with the VMD device ID as it still
remaps Requester-ID from subdevices.

> 
> > I would however like to determine if the MSI data bits could be used
> > for individual devices to have the IRQ domain subsystem demultiplex the
> > virq from that and eliminate the IRQ list iteration.
> 
> Yes, exactly. This new pci_subdevice_msi_create_irq_domain() creates
> *proper* fully functional interrupts, no need for any IRQ handler in
> this driver.
> 
> > A concern I have about that scheme is virtualization as I think those
> > bits are used to route to the virtual vector.
> 
> It should be fine with this patch series, consult with Megha how
> virtualization is working with IDXD/etc
> 
> Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

...
> +struct tegra_mc *devm_tegra_get_memory_controller(struct device *dev)
> +{
> + struct platform_device *pdev;
> + struct device_node *np;
> + struct tegra_mc *mc;
> + int err;
> +
> + np = of_find_matching_node_and_match(NULL, tegra_mc_of_match, NULL);
> + if (!np)
> + return ERR_PTR(-ENOENT);
> +
> + pdev = of_find_device_by_node(np);
> + of_node_put(np);
> + if (!pdev)
> + return ERR_PTR(-ENODEV);
> +
> + mc = platform_get_drvdata(pdev);
> + if (!mc) {
> + put_device(mc->dev);

This should be put_device(&pdev->dev). Please always be careful while
copying someones else code :)
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

...
> + struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>  
> - of_node_put(args.np);
> - index++;
> - }
> + /* An invalid mc pointer means mc and smmu drivers are not ready */
> + if (IS_ERR(mc))
> + return ERR_PTR(-EPROBE_DEFER);
>  
> - if (!smmu)
> + /*
> +  * IOMMU core allows -ENODEV return to carry on. So bypass any call
> +  * from bus_set_iommu() during tegra_smmu_probe(), as a device will
> +  * call in again via of_iommu_configure when fwspec is prepared.
> +  */
> + if (!mc->smmu || !fwspec || fwspec->ops != &tegra_smmu_ops)
>   return ERR_PTR(-ENODEV);
>  
> - return &smmu->iommu;
> + dev_iommu_priv_set(dev, mc->smmu);
> +
> + return &mc->smmu->iommu;
>  }

Is it really okay to use devm_tegra_get_memory_controller() here?

I assume it should be more preferred to do it only for devices that have
fwspec->ops == &tegra_smmu_ops.

Secondly, it also looks to me that a non-devm variant should be more
appropriate here because tegra_smmu_probe_device() isn't invoked by the
devices themselves.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

On Wed, 30 Sep 2020 at 16:41, Dmitry Osipenko  wrote:
>
> ...
> > +struct tegra_mc *devm_tegra_get_memory_controller(struct device *dev)
> > +{
> > + struct platform_device *pdev;
> > + struct device_node *np;
> > + struct tegra_mc *mc;
> > + int err;
> > +
> > + np = of_find_matching_node_and_match(NULL, tegra_mc_of_match, NULL);
> > + if (!np)
> > + return ERR_PTR(-ENOENT);
> > +
> > + pdev = of_find_device_by_node(np);
> > + of_node_put(np);
> > + if (!pdev)
> > + return ERR_PTR(-ENODEV);
> > +
> > + mc = platform_get_drvdata(pdev);
> > + if (!mc) {
> > + put_device(mc->dev);
>
> This should be put_device(&pdev->dev). Please always be careful while
> copying someones else code :)

Good catch. I guess devm_add_action_or_reset() would also work... or
running Smatch on new code. Smatch should point it out.

Best regards,
Krzysztof
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 3/3] iommu/tegra-smmu: Add PCI support

...
> +#ifdef CONFIG_PCI
> + if (!iommu_present(&pci_bus_type)) {


In the previous reply you said that you're borrowing this check from the
arm-smmu driver, but arm-smmu also has a similar check for
platform_bus_type, while Tegra SMMU driver doesn't have it. Hence I'm
objecting the necessity of having this check.

Please give a rationale for having this check in the code. And please
note that cargo cult isn't a rationale to me.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

...
>  static int tegra_smmu_attach_dev(struct iommu_domain *domain,
>struct device *dev)
>  {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
>   struct tegra_smmu_as *as = to_smmu_as(domain);
> - struct device_node *np = dev->of_node;
> - struct of_phandle_args args;
>   unsigned int index = 0;
>   int err = 0;
>  
> - while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> -&args)) {
> - unsigned int swgroup = args.args[0];
> -
> - if (args.np != smmu->dev->of_node) {
> - of_node_put(args.np);
> - continue;
> - }
> -
> - of_node_put(args.np);
> + if (!fwspec || fwspec->ops != &tegra_smmu_ops)
> + return -ENOENT;

In previous reply you said that these fwspec checks are borrowed from
the arm-smmu driver, but I'm now looking at what other drivers do and I
don't see them having this check.

Hence I'm objecting the need to have this check here. You either should
give a rational to having the check or it should be removed.

Please never blindly copy foreign code, you should always double-check it.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [patch V2 00/46] x86, PCI, XEN, genirq ...: Prepare for device MSI

2020-09-30 Thread Thomas Gleixner

On Wed, Sep 30 2020 at 08:43, Jason Gunthorpe wrote:
> On Wed, Sep 30, 2020 at 08:41:48AM +0200, Thomas Gleixner wrote:
>> On Tue, Sep 29 2020 at 16:03, Megha Dey wrote:
>> > On 8/26/2020 4:16 AM, Thomas Gleixner wrote:
>> >> #9is obviously just for the folks interested in IMS
>> >>
>> >
>> > I see that the tip tree (as of 9/29) has most of these patches but 
>> > notice that the DEV_MSI related patches
>> >
>> > haven't made it. I have tested the tip tree(x86/irq branch) with your
>> > DEV_MSI infra patches and our IMS patches with the IDXD driver and was
>> 
>> Your IMS patches? Why do you need something special again?
>> 
>> > wondering if we should push out those patches as part of our patchset?
>> 
>> As I don't have any hardware to test that, I was waiting for you and
>> Jason to confirm that this actually works for the two different IMS
>> implementations.
>
> How urgently do you need this? The code looked good from what I
> understood. It will be a while before we have all the parts to send an
> actual patch though.

I personally do not need it at all :) Megha might have different
thoughts... 

> We might be able to put together a mockup just to prove it

If that makes Megha's stuff going that would of course be appreciated,
but we can defer the IMS_QUEUE part for later. It's orthogonal to the
IMS_ARRAY stuff.

Thanks,

tglx
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

30.09.2020 17:45, Krzysztof Kozlowski пишет:
> On Wed, 30 Sep 2020 at 16:41, Dmitry Osipenko  wrote:
>>
>> ...
>>> +struct tegra_mc *devm_tegra_get_memory_controller(struct device *dev)
>>> +{
>>> + struct platform_device *pdev;
>>> + struct device_node *np;
>>> + struct tegra_mc *mc;
>>> + int err;
>>> +
>>> + np = of_find_matching_node_and_match(NULL, tegra_mc_of_match, NULL);
>>> + if (!np)
>>> + return ERR_PTR(-ENOENT);
>>> +
>>> + pdev = of_find_device_by_node(np);
>>> + of_node_put(np);
>>> + if (!pdev)
>>> + return ERR_PTR(-ENODEV);
>>> +
>>> + mc = platform_get_drvdata(pdev);
>>> + if (!mc) {
>>> + put_device(mc->dev);
>>
>> This should be put_device(&pdev->dev). Please always be careful while
>> copying someones else code :)
> 
> Good catch. I guess devm_add_action_or_reset() would also work... or
> running Smatch on new code. Smatch should point it out.

The devm_add_action_or_reset() shouldn't help much in this particular
case because it's too early for the devm_add_action here.

Having an explicit put_device() in all error code paths also helps with
improving readability of the code a tad, IMO.

Smatch indeed should catch this bug, but Smatch usually isn't a part of
the developers workflow. More often Smatch is a part of maintainers
workflow, hence such problems usually are getting caught before patches
are applied.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
> From: Dmitry Osipenko 
> 
> Multiple Tegra drivers need to retrieve Memory Controller and hence there
> is quite some duplication of the retrieval code among the drivers. Let's
> add a new common helper for the retrieval of the MC.
> 
> Signed-off-by: Dmitry Osipenko 
> Signed-off-by: Nicolin Chen 
> ---
> 
> Changelog
> v2->v3:
>  * Replaced with Dimtry's devm_tegra_get_memory_controller()
> v1->v2:
>  * N/A
> 
>  drivers/memory/tegra/mc.c | 39 +++
>  include/soc/tegra/mc.h| 17 +
>  2 files changed, 56 insertions(+)

Let's not add this helper, please. If a device needs a reference to the
memory controller, it should have a phandle to the memory controller in
device tree so that it can be looked up explicitly.

Adding this helper is officially sanctioning that it's okay not to have
that reference and that's a bad idea.

Thierry


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

30.09.2020 18:23, Thierry Reding пишет:
> On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
>> From: Dmitry Osipenko 
>>
>> Multiple Tegra drivers need to retrieve Memory Controller and hence there
>> is quite some duplication of the retrieval code among the drivers. Let's
>> add a new common helper for the retrieval of the MC.
>>
>> Signed-off-by: Dmitry Osipenko 
>> Signed-off-by: Nicolin Chen 
>> ---
>>
>> Changelog
>> v2->v3:
>>  * Replaced with Dimtry's devm_tegra_get_memory_controller()
>> v1->v2:
>>  * N/A
>>
>>  drivers/memory/tegra/mc.c | 39 +++
>>  include/soc/tegra/mc.h| 17 +
>>  2 files changed, 56 insertions(+)
> 
> Let's not add this helper, please. If a device needs a reference to the
> memory controller, it should have a phandle to the memory controller in
> device tree so that it can be looked up explicitly.
> 
> Adding this helper is officially sanctioning that it's okay not to have
> that reference and that's a bad idea.

Maybe that's because the reference isn't really needed for the lookup? :)

Secondly, we could use the reference and then fall back to the
of-matching for devices that don't have the reference, like all Tegra20
devices + Tegra30/124 ACTMON devices.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

On Wed, Sep 30, 2020 at 01:42:57AM -0700, Nicolin Chen wrote:
> Previously the driver relies on bus_set_iommu() in .probe() to call
> in .probe_device() function so each client can poll iommus property
> in DTB to configure fwspec via tegra_smmu_configure(). According to
> the comments in .probe(), this is a bit of a hack. And this doesn't
> work for a client that doesn't exist in DTB, PCI device for example.
> 
> Actually when a device/client gets probed, the of_iommu_configure()
> will call in .probe_device() function again, with a prepared fwspec
> from of_iommu_configure() that reads the SWGROUP id in DTB as we do
> in tegra-smmu driver.
> 
> Additionally, as a new helper devm_tegra_get_memory_controller() is
> introduced, there's no need to poll the iommus property in order to
> get mc->smmu pointers or SWGROUP id.
> 
> This patch reworks .probe_device() and .attach_dev() by doing:
> 1) Using fwspec to get swgroup id in .attach_dev/.dettach_dev()
> 2) Removing DT polling code, tegra_smmu_find/tegra_smmu_configure()
> 3) Calling devm_tegra_get_memory_controller() in .probe_device()
> 4) Also dropping the hack in .probe() that's no longer needed.
> 
> Signed-off-by: Nicolin Chen 
> ---
> 
> Changelog
> v2->v3
>  * Used devm_tegra_get_memory_controller() to get mc pointer
>  * Replaced IS_ERR_OR_NULL with IS_ERR in .probe_device()
> v1->v2
>  * Replaced in .probe_device() tegra_smmu_find/tegra_smmu_configure()
>with tegra_get_memory_controller call.
>  * Dropped the hack in tegra_smmu_probe().
> 
>  drivers/iommu/tegra-smmu.c | 144 ++---
>  1 file changed, 36 insertions(+), 108 deletions(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 6a3ecc334481..636dc3b89545 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -61,6 +61,8 @@ struct tegra_smmu_as {
>   u32 attr;
>  };
>  
> +static const struct iommu_ops tegra_smmu_ops;
> +
>  static struct tegra_smmu_as *to_smmu_as(struct iommu_domain *dom)
>  {
>   return container_of(dom, struct tegra_smmu_as, domain);
> @@ -484,60 +486,50 @@ static void tegra_smmu_as_unprepare(struct tegra_smmu 
> *smmu,
>  static int tegra_smmu_attach_dev(struct iommu_domain *domain,
>struct device *dev)
>  {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
>   struct tegra_smmu_as *as = to_smmu_as(domain);
> - struct device_node *np = dev->of_node;
> - struct of_phandle_args args;
>   unsigned int index = 0;
>   int err = 0;
>  
> - while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> -&args)) {
> - unsigned int swgroup = args.args[0];
> -
> - if (args.np != smmu->dev->of_node) {
> - of_node_put(args.np);
> - continue;
> - }
> -
> - of_node_put(args.np);
> + if (!fwspec || fwspec->ops != &tegra_smmu_ops)
> + return -ENOENT;
>  
> + for (index = 0; index < fwspec->num_ids; index++) {
>   err = tegra_smmu_as_prepare(smmu, as);
> - if (err < 0)
> - return err;
> + if (err)
> + goto disable;
>  
> - tegra_smmu_enable(smmu, swgroup, as->id);
> - index++;
> + tegra_smmu_enable(smmu, fwspec->ids[index], as->id);
>   }
>  
>   if (index == 0)
>   return -ENODEV;
>  
>   return 0;
> +
> +disable:
> + while (index--) {
> + tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
> + tegra_smmu_as_unprepare(smmu, as);
> + }
> +
> + return err;
>  }
>  
>  static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device 
> *dev)
>  {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   struct tegra_smmu_as *as = to_smmu_as(domain);
> - struct device_node *np = dev->of_node;
>   struct tegra_smmu *smmu = as->smmu;
> - struct of_phandle_args args;
>   unsigned int index = 0;
>  
> - while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> -&args)) {
> - unsigned int swgroup = args.args[0];
> -
> - if (args.np != smmu->dev->of_node) {
> - of_node_put(args.np);
> - continue;
> - }
> -
> - of_node_put(args.np);
> + if (!fwspec || fwspec->ops != &tegra_smmu_ops)
> + return;
>  
> - tegra_smmu_disable(smmu, swgroup, as->id);
> + for (index = 0; index < fwspec->num_ids; index++) {
> + tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
>   tegra_smmu_as_unprepare(smmu, as);
> - index++;
>   }
>  }
>  
> @@ -807,80 +799,26 @@ static phys_addr_t tegra_smmu_iova_to_phy

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

 I'...
>> +struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
>> +struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> 
> It looks to me like the only reason why you need this new global API is
> because PCI devices may not have a device tree node with a phandle to
> the IOMMU. However, SMMU support for PCI will only be enabled if the
> root complex has an iommus property, right? In that case, can't we
> simply do something like this:
> 
>   if (dev_is_pci(dev))
>   np = find_host_bridge(dev)->of_node;
>   else
>   np = dev->of_node;
> 
> ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> sure that exists.
> 
> Once we have that we can still iterate over the iommus property and do
> not need to rely on this global variable.

This sounds more complicated than the current variant.

Secondly, I'm already about to use the new tegra_get_memory_controller()
API for all the T20/30/124/210 EMC and devfreq drivers.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

30.09.2020 18:23, Thierry Reding пишет:
> On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
>> From: Dmitry Osipenko 
>>
>> Multiple Tegra drivers need to retrieve Memory Controller and hence there
>> is quite some duplication of the retrieval code among the drivers. Let's
>> add a new common helper for the retrieval of the MC.
>>
>> Signed-off-by: Dmitry Osipenko 
>> Signed-off-by: Nicolin Chen 
>> ---
>>
>> Changelog
>> v2->v3:
>>  * Replaced with Dimtry's devm_tegra_get_memory_controller()
>> v1->v2:
>>  * N/A
>>
>>  drivers/memory/tegra/mc.c | 39 +++
>>  include/soc/tegra/mc.h| 17 +
>>  2 files changed, 56 insertions(+)
> 
> Let's not add this helper, please. If a device needs a reference to the
> memory controller, it should have a phandle to the memory controller in
> device tree so that it can be looked up explicitly.
> 
> Adding this helper is officially sanctioning that it's okay not to have
> that reference and that's a bad idea.

And please explain why it's a bad idea, I don't see anything bad here at
all.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

On Wed, Sep 30, 2020 at 06:53:06PM +0300, Dmitry Osipenko wrote:
> 30.09.2020 18:23, Thierry Reding пишет:
> > On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
> >> From: Dmitry Osipenko 
> >>
> >> Multiple Tegra drivers need to retrieve Memory Controller and hence there
> >> is quite some duplication of the retrieval code among the drivers. Let's
> >> add a new common helper for the retrieval of the MC.
> >>
> >> Signed-off-by: Dmitry Osipenko 
> >> Signed-off-by: Nicolin Chen 
> >> ---
> >>
> >> Changelog
> >> v2->v3:
> >>  * Replaced with Dimtry's devm_tegra_get_memory_controller()
> >> v1->v2:
> >>  * N/A
> >>
> >>  drivers/memory/tegra/mc.c | 39 +++
> >>  include/soc/tegra/mc.h| 17 +
> >>  2 files changed, 56 insertions(+)
> > 
> > Let's not add this helper, please. If a device needs a reference to the
> > memory controller, it should have a phandle to the memory controller in
> > device tree so that it can be looked up explicitly.
> > 
> > Adding this helper is officially sanctioning that it's okay not to have
> > that reference and that's a bad idea.
> 
> And please explain why it's a bad idea, I don't see anything bad here at
> all.

Well, you said yourself in a recent comment that we should avoid global
variables. devm_tegra_get_memory_controller() is nothing but a glorified
global variable.

Thierry


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

On Wed, Sep 30, 2020 at 06:36:52PM +0300, Dmitry Osipenko wrote:
>  I'...
> >> +  struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> >> +  struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > 
> > It looks to me like the only reason why you need this new global API is
> > because PCI devices may not have a device tree node with a phandle to
> > the IOMMU. However, SMMU support for PCI will only be enabled if the
> > root complex has an iommus property, right? In that case, can't we
> > simply do something like this:
> > 
> > if (dev_is_pci(dev))
> > np = find_host_bridge(dev)->of_node;
> > else
> > np = dev->of_node;
> > 
> > ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> > sure that exists.
> > 
> > Once we have that we can still iterate over the iommus property and do
> > not need to rely on this global variable.
> 
> This sounds more complicated than the current variant.
> 
> Secondly, I'm already about to use the new tegra_get_memory_controller()
> API for all the T20/30/124/210 EMC and devfreq drivers.

Why do we need it there? They seem to work fine without it right now. If
it is required for new functionality, we can always make the dependent
on a DT reference via phandle without breaking any existing code.

Thierry


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

30.09.2020 19:03, Thierry Reding пишет:
> On Wed, Sep 30, 2020 at 06:53:06PM +0300, Dmitry Osipenko wrote:
>> 30.09.2020 18:23, Thierry Reding пишет:
>>> On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
 From: Dmitry Osipenko 

 Multiple Tegra drivers need to retrieve Memory Controller and hence there
 is quite some duplication of the retrieval code among the drivers. Let's
 add a new common helper for the retrieval of the MC.

 Signed-off-by: Dmitry Osipenko 
 Signed-off-by: Nicolin Chen 
 ---

 Changelog
 v2->v3:
  * Replaced with Dimtry's devm_tegra_get_memory_controller()
 v1->v2:
  * N/A

  drivers/memory/tegra/mc.c | 39 +++
  include/soc/tegra/mc.h| 17 +
  2 files changed, 56 insertions(+)
>>>
>>> Let's not add this helper, please. If a device needs a reference to the
>>> memory controller, it should have a phandle to the memory controller in
>>> device tree so that it can be looked up explicitly.
>>>
>>> Adding this helper is officially sanctioning that it's okay not to have
>>> that reference and that's a bad idea.
>>
>> And please explain why it's a bad idea, I don't see anything bad here at
>> all.
> 
> Well, you said yourself in a recent comment that we should avoid global
> variables. devm_tegra_get_memory_controller() is nothing but a glorified
> global variable.

This is not a variable, but a common helper function which will remove
the duplicated code and will help to avoid common mistakes like a missed
put_device().
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

dma_alloc_pages / dma_alloc_noncoherent fixups

Hi all,

this series has a bunch of fixups for the noncoherent DMA allocator
rework that recently landed in linux-next.

I think the most important part is that the idea of vmap()ing
non-contiguous allocations in dma_alloc_noncoherent doesn't work very
well after all.  It means we can't just rely on virt_to_page to get
the page and just use remap_pfn_range or stuff it into other APIs,
but on the other hand it also isn't really generic enought for what
the media APIs seems to want.

So the first patch reverts that change, and the last patch suggests
a different lower level API which should allow the media code to do
all it wants.

I'd suggest all but the last patch for the current merge window, and
we should have a discussion on how well the last one suits the media
subsystem, and probably merge it together with any media changes to
use the required API.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 7/8] dma-iommu: remove __iommu_dma_mmap

The function has a single caller, so open code it there and take
advantage of the precalculated page count variable.

Signed-off-by: Christoph Hellwig 
---
 drivers/iommu/dma-iommu.c | 17 +
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index b363b20a9f41ce..7922f545cd5eef 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -656,21 +656,6 @@ static void *iommu_dma_alloc_remap(struct device *dev, 
size_t size,
return NULL;
 }
 
-/**
- * __iommu_dma_mmap - Map a buffer into provided user VMA
- * @pages: Array representing buffer from __iommu_dma_alloc()
- * @size: Size of buffer in bytes
- * @vma: VMA describing requested userspace mapping
- *
- * Maps the pages of the buffer in @pages into @vma. The caller is responsible
- * for verifying the correct size and protection of @vma beforehand.
- */
-static int __iommu_dma_mmap(struct page **pages, size_t size,
-   struct vm_area_struct *vma)
-{
-   return vm_map_pages(vma, pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
-}
-
 static void iommu_dma_sync_single_for_cpu(struct device *dev,
dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
 {
@@ -1075,7 +1060,7 @@ static int iommu_dma_mmap(struct device *dev, struct 
vm_area_struct *vma,
struct page **pages = dma_common_find_pages(cpu_addr);
 
if (pages)
-   return __iommu_dma_mmap(pages, size, vma);
+   return vm_map_pages(vma, pages, nr_pages);
pfn = vmalloc_to_pfn(cpu_addr);
} else {
pfn = page_to_pfn(virt_to_page(cpu_addr));
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 4/8] dma-direct: use __GFP_ZERO in dma_direct_alloc_pages

Prepare for supporting the DMA_ATTR_NO_KERNEL_MAPPING flag in
dma_alloc_pages.

Signed-off-by: Christoph Hellwig 
---
 kernel/dma/direct.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index b5f20781d3a96f..b5d56810130b22 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -296,9 +296,10 @@ struct page *dma_direct_alloc_pages(struct device *dev, 
size_t size,
dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp)
 {
struct page *page;
-   void *ret;
 
if (dma_should_alloc_from_pool(dev, gfp, 0)) {
+   void *ret;
+
page = dma_alloc_from_pool(dev, size, &ret, gfp,
dma_coherent_ok);
if (!page)
@@ -306,7 +307,7 @@ struct page *dma_direct_alloc_pages(struct device *dev, 
size_t size,
goto done;
}
 
-   page = __dma_direct_alloc_pages(dev, size, gfp);
+   page = __dma_direct_alloc_pages(dev, size, gfp | __GFP_ZERO);
if (!page)
return NULL;
if (PageHighMem(page)) {
@@ -320,13 +321,11 @@ struct page *dma_direct_alloc_pages(struct device *dev, 
size_t size,
goto out_free_pages;
}
 
-   ret = page_address(page);
if (force_dma_unencrypted(dev)) {
-   if (set_memory_decrypted((unsigned long)ret,
+   if (set_memory_decrypted((unsigned long)page_address(page),
1 << get_order(size)))
goto out_free_pages;
}
-   memset(ret, 0, size);
 done:
*dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
return page;
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 3/8] dma-direct check for highmem pages in dma_direct_alloc_pages

Check for highmem pages from CMA, just like in the dma_direct_alloc path.

Signed-off-by: Christoph Hellwig 
---
 kernel/dma/direct.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 121a9c1969dd3a..b5f20781d3a96f 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -309,6 +309,17 @@ struct page *dma_direct_alloc_pages(struct device *dev, 
size_t size,
page = __dma_direct_alloc_pages(dev, size, gfp);
if (!page)
return NULL;
+   if (PageHighMem(page)) {
+   /*
+* Depending on the cma= arguments and per-arch setup
+* dma_alloc_contiguous could return highmem pages.
+* Without remapping there is no way to return them here,
+* so log an error and fail.
+*/
+   dev_info(dev, "Rejecting highmem page from CMA.\n");
+   goto out_free_pages;
+   }
+
ret = page_address(page);
if (force_dma_unencrypted(dev)) {
if (set_memory_decrypted((unsigned long)ret,
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 8/8] WIP: add a dma_alloc_contiguous API

Add a new API that returns a virtually non-contigous array of pages
and dma address.  This API is only implemented for dma-iommu and will
not be implemented for non-iommu DMA API instances that have to allocate
contiguous memory.  It is up to the caller to check if the API is
available.

The intent is that media drivers can use this API if either:

 - no kernel mapping or only temporary kernel mappings are required.
   That is as a better replacement for DMA_ATTR_NO_KERNEL_MAPPING
 - a kernel mapping is required for cached and DMA mapped pages, but
   the driver also needs the pages to e.g. map them to userspace.
   In that sense it is a replacement for some aspects of the recently
   removed and never fully implemented DMA_ATTR_NON_CONSISTENT

Signed-off-by: Christoph Hellwig 
---
 drivers/iommu/dma-iommu.c   | 73 +
 include/linux/dma-mapping.h |  9 +
 kernel/dma/mapping.c| 35 ++
 3 files changed, 93 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 7922f545cd5eef..158026a856622c 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -565,23 +565,12 @@ static struct page **__iommu_dma_alloc_pages(struct 
device *dev,
return pages;
 }
 
-/**
- * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space
- * @dev: Device to allocate memory for. Must be a real device
- *  attached to an iommu_dma_domain
- * @size: Size of buffer in bytes
- * @dma_handle: Out argument for allocated DMA handle
- * @gfp: Allocation flags
- * @prot: pgprot_t to use for the remapped mapping
- * @attrs: DMA attributes for this allocation
- *
- * If @size is less than PAGE_SIZE, then a full CPU page will be allocated,
+/*
+ * If size is less than PAGE_SIZE, then a full CPU page will be allocated,
  * but an IOMMU which supports smaller pages might not map the whole thing.
- *
- * Return: Mapped virtual address, or NULL on failure.
  */
-static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
-   dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
+static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
+   size_t size, dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
unsigned long attrs)
 {
struct iommu_domain *domain = iommu_get_dma_domain(dev);
@@ -593,7 +582,6 @@ static void *iommu_dma_alloc_remap(struct device *dev, 
size_t size,
struct page **pages;
struct sg_table sgt;
dma_addr_t iova;
-   void *vaddr;
 
*dma_handle = DMA_MAPPING_ERROR;
 
@@ -636,17 +624,10 @@ static void *iommu_dma_alloc_remap(struct device *dev, 
size_t size,
< size)
goto out_free_sg;
 
-   vaddr = dma_common_pages_remap(pages, size, prot,
-   __builtin_return_address(0));
-   if (!vaddr)
-   goto out_unmap;
-
*dma_handle = iova;
sg_free_table(&sgt);
-   return vaddr;
+   return pages;
 
-out_unmap:
-   __iommu_dma_unmap(dev, iova, size);
 out_free_sg:
sg_free_table(&sgt);
 out_free_iova:
@@ -656,6 +637,46 @@ static void *iommu_dma_alloc_remap(struct device *dev, 
size_t size,
return NULL;
 }
 
+static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
+   dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot,
+   unsigned long attrs)
+{
+   struct page **pages;
+   void *vaddr;
+
+   pages = __iommu_dma_alloc_noncontiguous(dev, size, dma_handle, gfp,
+   prot, attrs);
+   if (!pages)
+   return NULL;
+   vaddr = dma_common_pages_remap(pages, size, prot,
+   __builtin_return_address(0));
+   if (!vaddr)
+   goto out_unmap;
+   return vaddr;
+
+out_unmap:
+   __iommu_dma_unmap(dev, *dma_handle, size);
+   __iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
+   return NULL;
+}
+
+#ifdef CONFIG_DMA_REMAP
+static struct page **iommu_dma_alloc_noncontiguous(struct device *dev,
+   size_t size, dma_addr_t *dma_handle, gfp_t gfp,
+   unsigned long attrs)
+{
+   return __iommu_dma_alloc_noncontiguous(dev, size, dma_handle, gfp,
+  PAGE_KERNEL, attrs);
+}
+
+static void iommu_dma_free_noncontiguous(struct device *dev, size_t size,
+   struct page **pages, dma_addr_t dma_handle)
+{
+   __iommu_dma_unmap(dev, dma_handle, size);
+   __iommu_dma_free_pages(pages, PAGE_ALIGN(size) >> PAGE_SHIFT);
+}
+#endif
+
 static void iommu_dma_sync_single_for_cpu(struct device *dev,
dma_addr_t dma_handle, size_t size, enum dma_data_direction dir)
 {
@@ -1110,6 +1131,10 @@ static const struct dma_map_ops iommu_dma_ops = {
.free   = iommu_dma_free,
.alloc_pages= dma_common_all

[PATCH 5/8] dma-direct: factor out a dma_direct_alloc_from_pool helper

This ensures dma_direct_alloc_pages will use the right gfp mask, as
well as keeping the code for that common.

Signed-off-by: Christoph Hellwig 
---
 kernel/dma/direct.c | 41 -
 1 file changed, 20 insertions(+), 21 deletions(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index b5d56810130b22..ace9159c992f65 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -147,6 +147,22 @@ static struct page *__dma_direct_alloc_pages(struct device 
*dev, size_t size,
return page;
 }
 
+static void *dma_direct_alloc_from_pool(struct device *dev, size_t size,
+   dma_addr_t *dma_handle, gfp_t gfp)
+{
+   struct page *page;
+   u64 phys_mask;
+   void *ret;
+
+   gfp |= dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
+  &phys_mask);
+   page = dma_alloc_from_pool(dev, size, &ret, gfp, dma_coherent_ok);
+   if (!page)
+   return NULL;
+   *dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
+   return ret;
+}
+
 void *dma_direct_alloc(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
 {
@@ -163,17 +179,8 @@ void *dma_direct_alloc(struct device *dev, size_t size,
if (attrs & DMA_ATTR_NO_WARN)
gfp |= __GFP_NOWARN;
 
-   if (dma_should_alloc_from_pool(dev, gfp, attrs)) {
-   u64 phys_mask;
-
-   gfp |= dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
-   &phys_mask);
-   page = dma_alloc_from_pool(dev, size, &ret, gfp,
-   dma_coherent_ok);
-   if (!page)
-   return NULL;
-   goto done;
-   }
+   if (dma_should_alloc_from_pool(dev, gfp, attrs))
+   return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);
 
/* we always manually zero the memory once we are done */
page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO);
@@ -297,15 +304,8 @@ struct page *dma_direct_alloc_pages(struct device *dev, 
size_t size,
 {
struct page *page;
 
-   if (dma_should_alloc_from_pool(dev, gfp, 0)) {
-   void *ret;
-
-   page = dma_alloc_from_pool(dev, size, &ret, gfp,
-   dma_coherent_ok);
-   if (!page)
-   return NULL;
-   goto done;
-   }
+   if (dma_should_alloc_from_pool(dev, gfp, 0))
+   return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);
 
page = __dma_direct_alloc_pages(dev, size, gfp | __GFP_ZERO);
if (!page)
@@ -326,7 +326,6 @@ struct page *dma_direct_alloc_pages(struct device *dev, 
size_t size,
1 << get_order(size)))
goto out_free_pages;
}
-done:
*dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
return page;
 out_free_pages:
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 2/8] dma-mapping: document dma_{alloc,free}_pages

Document the new dma_alloc_pages and dma_free_pages APIs, and fix
up the documentation for dma_alloc_noncoherent and dma_free_noncoherent.

Reported-by: Robin Murphy 
Signed-off-by: Christoph Hellwig 
---
 Documentation/core-api/dma-api.rst | 45 ++
 1 file changed, 40 insertions(+), 5 deletions(-)

diff --git a/Documentation/core-api/dma-api.rst 
b/Documentation/core-api/dma-api.rst
index ea0413276ddb70..a75c469dbcaa7c 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -534,11 +534,9 @@ an I/O device, you should not be using this part of the 
API.
dma_addr_t *dma_handle, enum dma_data_direction dir,
gfp_t gfp)
 
-This routine allocates a region of  bytes of consistent memory.  It
+This routine allocates a region of  bytes of non-coherent memory.  It
 returns a pointer to the allocated region (in the processor's virtual address
-space) or NULL if the allocation failed.  The returned memory may or may not
-be in the kernels direct mapping.  Drivers must not call virt_to_page on
-the returned memory region.
+space) or NULL if the allocation failed.
 
 It also returns a  which may be cast to an unsigned integer the
 same width as the bus and given to the device as the DMA address base of
@@ -565,7 +563,44 @@ reused.
 Free a region of memory previously allocated using dma_alloc_noncoherent().
 dev, size and dma_handle and dir must all be the same as those passed into
 dma_alloc_noncoherent().  cpu_addr must be the virtual address returned by
-the dma_alloc_noncoherent().
+dma_alloc_noncoherent().
+
+::
+
+   struct page *
+   dma_alloc_pages(struct device *dev, size_t size, dma_addr_t *dma_handle,
+   enum dma_data_direction dir, gfp_t gfp)
+
+This routine allocates a region of  bytes of non-coherent memory.  It
+returns a pointer to first struct page for the region, or NULL if the
+allocation failed.
+
+It also returns a  which may be cast to an unsigned integer the
+same width as the bus and given to the device as the DMA address base of
+the region.
+
+The dir parameter specified if data is read and/or written by the device,
+see dma_map_single() for details.
+
+The gfp parameter allows the caller to specify the ``GFP_`` flags (see
+kmalloc()) for the allocation, but rejects flags used to specify a memory
+zone such as GFP_DMA or GFP_HIGHMEM.
+
+Before giving the memory to the device, dma_sync_single_for_device() needs
+to be called, and before reading memory written by the device,
+dma_sync_single_for_cpu(), just like for streaming DMA mappings that are
+reused.
+
+::
+
+   void
+   dma_free_pages(struct device *dev, size_t size, struct page *page,
+   dma_addr_t dma_handle, enum dma_data_direction dir)
+
+Free a region of memory previously allocated using dma_alloc_pages().
+dev, size and dma_handle and dir must all be the same as those passed into
+dma_alloc_noncoherent().  page must be the pointer returned by
+dma_alloc_pages().
 
 ::
 
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 6/8] dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling

Use and entirely separate code path for the DMA_ATTR_NO_KERNEL_MAPPING
path.  This avoids any confusion about the ret type, and avoids lots of
attr checks and helpers that can be significantly simplified now.

It also ensures that common handling is applied to architetures still
using the arch alloc/free hooks.

Signed-off-by: Christoph Hellwig 
---
 include/linux/dma-noncoherent.h |  13 -
 kernel/dma/direct.c | 100 +---
 2 files changed, 39 insertions(+), 74 deletions(-)

diff --git a/include/linux/dma-noncoherent.h b/include/linux/dma-noncoherent.h
index e61283e06576a8..73ac149fa181b4 100644
--- a/include/linux/dma-noncoherent.h
+++ b/include/linux/dma-noncoherent.h
@@ -21,19 +21,6 @@ static inline bool dev_is_dma_coherent(struct device *dev)
 }
 #endif /* CONFIG_ARCH_HAS_DMA_COHERENCE_H */
 
-/*
- * Check if an allocation needs to be marked uncached to be coherent.
- */
-static __always_inline bool dma_alloc_need_uncached(struct device *dev,
-   unsigned long attrs)
-{
-   if (dev_is_dma_coherent(dev))
-   return false;
-   if (attrs & DMA_ATTR_NO_KERNEL_MAPPING)
-   return false;
-   return true;
-}
-
 void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
gfp_t gfp, unsigned long attrs);
 void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index ace9159c992f65..a3c619b424edf0 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -75,39 +75,6 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t 
phys, size_t size)
min_not_zero(dev->coherent_dma_mask, dev->bus_dma_limit);
 }
 
-/*
- * Decrypting memory is allowed to block, so if this device requires
- * unencrypted memory it must come from atomic pools.
- */
-static inline bool dma_should_alloc_from_pool(struct device *dev, gfp_t gfp,
- unsigned long attrs)
-{
-   if (!IS_ENABLED(CONFIG_DMA_COHERENT_POOL))
-   return false;
-   if (gfpflags_allow_blocking(gfp))
-   return false;
-   if (force_dma_unencrypted(dev))
-   return true;
-   if (!IS_ENABLED(CONFIG_DMA_DIRECT_REMAP))
-   return false;
-   if (dma_alloc_need_uncached(dev, attrs))
-   return true;
-   return false;
-}
-
-static inline bool dma_should_free_from_pool(struct device *dev,
-unsigned long attrs)
-{
-   if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL))
-   return true;
-   if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
-   !force_dma_unencrypted(dev))
-   return false;
-   if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP))
-   return true;
-   return false;
-}
-
 static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
gfp_t gfp)
 {
@@ -170,35 +137,45 @@ void *dma_direct_alloc(struct device *dev, size_t size,
void *ret;
int err;
 
-   if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) &&
-   !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
-   dma_alloc_need_uncached(dev, attrs))
-   return arch_dma_alloc(dev, size, dma_handle, gfp, attrs);
-
size = PAGE_ALIGN(size);
if (attrs & DMA_ATTR_NO_WARN)
gfp |= __GFP_NOWARN;
 
-   if (dma_should_alloc_from_pool(dev, gfp, attrs))
-   return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);
-
-   /* we always manually zero the memory once we are done */
-   page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO);
-   if (!page)
-   return NULL;
-
if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
!force_dma_unencrypted(dev)) {
+   page = __dma_direct_alloc_pages(dev, size, gfp);
+   if (!page)
+   return NULL;
/* remove any dirty cache lines on the kernel alias */
if (!PageHighMem(page))
arch_dma_prep_coherent(page, size);
+   *dma_handle = phys_to_dma_direct(dev, page_to_phys(page));
/* return the page pointer as the opaque cookie */
-   ret = page;
-   goto done;
+   return page;
}
 
+   if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) &&
+   !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
+   !dev_is_dma_coherent(dev))
+   return arch_dma_alloc(dev, size, dma_handle, gfp, attrs);
+
+   /*
+* Remapping or decrypting memory may block. If either is required and
+* we can't block, allocate the memory from the atomic pools.
+*/
+   if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) &&
+   !gfpflags_allow_blocking(gfp) &&
+   (force_dma_unencrypted(dev) ||
+(IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && 
!d

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

On Wed, Sep 30, 2020 at 06:36:52PM +0300, Dmitry Osipenko wrote:
>  I'...
> >> +  struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> >> +  struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > 
> > It looks to me like the only reason why you need this new global API is
> > because PCI devices may not have a device tree node with a phandle to
> > the IOMMU. However, SMMU support for PCI will only be enabled if the
> > root complex has an iommus property, right? In that case, can't we
> > simply do something like this:
> > 
> > if (dev_is_pci(dev))
> > np = find_host_bridge(dev)->of_node;
> > else
> > np = dev->of_node;
> > 
> > ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> > sure that exists.
> > 
> > Once we have that we can still iterate over the iommus property and do
> > not need to rely on this global variable.
> 
> This sounds more complicated than the current variant.

I don't think so. It's actually very clear and explicit. And yes, this
might be a little more work (and honestly, this is what? a handful of
lines?) than accessing a global variable, but that's a fair price to pay
for proper encapsulation.

> Secondly, I'm already about to use the new tegra_get_memory_controller()
> API for all the T20/30/124/210 EMC and devfreq drivers.

Also, this really proves the point I was trying to make about how this
is going to proliferate...

Thierry


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 1/8] dma-mapping: remove the {alloc,free}_noncoherent methods

It turns out allowing non-contigous allocations here was a rather bad
idea, as we'll now need to define ways to get the pages for mmaping
or dma_buf sharing.  Revert this change and stick to the original
concept.  A different API for the use case of non-contigous allocations
will be added back later.

Signed-off-by: Christoph Hellwig 
---
 drivers/iommu/dma-iommu.c   | 30 --
 include/linux/dma-mapping.h |  5 -
 kernel/dma/mapping.c| 33 ++---
 3 files changed, 6 insertions(+), 62 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index c12c1dc43d312e..b363b20a9f41ce 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1055,34 +1055,6 @@ static void *iommu_dma_alloc(struct device *dev, size_t 
size,
return cpu_addr;
 }
 
-#ifdef CONFIG_DMA_REMAP
-static void *iommu_dma_alloc_noncoherent(struct device *dev, size_t size,
-   dma_addr_t *handle, enum dma_data_direction dir, gfp_t gfp)
-{
-   if (!gfpflags_allow_blocking(gfp)) {
-   struct page *page;
-
-   page = dma_common_alloc_pages(dev, size, handle, dir, gfp);
-   if (!page)
-   return NULL;
-   return page_address(page);
-   }
-
-   return iommu_dma_alloc_remap(dev, size, handle, gfp | __GFP_ZERO,
-PAGE_KERNEL, 0);
-}
-
-static void iommu_dma_free_noncoherent(struct device *dev, size_t size,
-   void *cpu_addr, dma_addr_t handle, enum dma_data_direction dir)
-{
-   __iommu_dma_unmap(dev, handle, size);
-   __iommu_dma_free(dev, size, cpu_addr);
-}
-#else
-#define iommu_dma_alloc_noncoherentNULL
-#define iommu_dma_free_noncoherent NULL
-#endif /* CONFIG_DMA_REMAP */
-
 static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs)
@@ -1153,8 +1125,6 @@ static const struct dma_map_ops iommu_dma_ops = {
.free   = iommu_dma_free,
.alloc_pages= dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
-   .alloc_noncoherent  = iommu_dma_alloc_noncoherent,
-   .free_noncoherent   = iommu_dma_free_noncoherent,
.mmap   = iommu_dma_mmap,
.get_sgtable= iommu_dma_get_sgtable,
.map_page   = iommu_dma_map_page,
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 7c77cd6f3604a7..4b9b1d64f5ec9e 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -74,11 +74,6 @@ struct dma_map_ops {
gfp_t gfp);
void (*free_pages)(struct device *dev, size_t size, struct page *vaddr,
dma_addr_t dma_handle, enum dma_data_direction dir);
-   void* (*alloc_noncoherent)(struct device *dev, size_t size,
-   dma_addr_t *dma_handle, enum dma_data_direction dir,
-   gfp_t gfp);
-   void (*free_noncoherent)(struct device *dev, size_t size, void *vaddr,
-   dma_addr_t dma_handle, enum dma_data_direction dir);
int (*mmap)(struct device *, struct vm_area_struct *,
  void *, dma_addr_t, size_t,
  unsigned long attrs);
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 9669550656a0b4..06115f59f4ffbf 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -513,40 +513,19 @@ EXPORT_SYMBOL_GPL(dma_free_pages);
 void *dma_alloc_noncoherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp)
 {
-   const struct dma_map_ops *ops = get_dma_ops(dev);
-   void *vaddr;
-
-   if (!ops || !ops->alloc_noncoherent) {
-   struct page *page;
-
-   page = dma_alloc_pages(dev, size, dma_handle, dir, gfp);
-   if (!page)
-   return NULL;
-   return page_address(page);
-   }
+   struct page *page;
 
-   size = PAGE_ALIGN(size);
-   vaddr = ops->alloc_noncoherent(dev, size, dma_handle, dir, gfp);
-   if (vaddr)
-   debug_dma_map_page(dev, virt_to_page(vaddr), 0, size, dir,
-  *dma_handle);
-   return vaddr;
+   page = dma_alloc_pages(dev, size, dma_handle, dir, gfp);
+   if (!page)
+   return NULL;
+   return page_address(page);
 }
 EXPORT_SYMBOL_GPL(dma_alloc_noncoherent);
 
 void dma_free_noncoherent(struct device *dev, size_t size, void *vaddr,
dma_addr_t dma_handle, enum dma_data_direction dir)
 {
-   const struct dma_map_ops *ops = get_dma_ops(dev);
-
-   if (!ops || !ops->free_noncoherent) {
-   dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir)

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

On Wed, Sep 30, 2020 at 07:06:27PM +0300, Dmitry Osipenko wrote:
> 30.09.2020 19:03, Thierry Reding пишет:
> > On Wed, Sep 30, 2020 at 06:53:06PM +0300, Dmitry Osipenko wrote:
> >> 30.09.2020 18:23, Thierry Reding пишет:
> >>> On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
>  From: Dmitry Osipenko 
> 
>  Multiple Tegra drivers need to retrieve Memory Controller and hence there
>  is quite some duplication of the retrieval code among the drivers. Let's
>  add a new common helper for the retrieval of the MC.
> 
>  Signed-off-by: Dmitry Osipenko 
>  Signed-off-by: Nicolin Chen 
>  ---
> 
>  Changelog
>  v2->v3:
>   * Replaced with Dimtry's devm_tegra_get_memory_controller()
>  v1->v2:
>   * N/A
> 
>   drivers/memory/tegra/mc.c | 39 +++
>   include/soc/tegra/mc.h| 17 +
>   2 files changed, 56 insertions(+)
> >>>
> >>> Let's not add this helper, please. If a device needs a reference to the
> >>> memory controller, it should have a phandle to the memory controller in
> >>> device tree so that it can be looked up explicitly.
> >>>
> >>> Adding this helper is officially sanctioning that it's okay not to have
> >>> that reference and that's a bad idea.
> >>
> >> And please explain why it's a bad idea, I don't see anything bad here at
> >> all.
> > 
> > Well, you said yourself in a recent comment that we should avoid global
> > variables. devm_tegra_get_memory_controller() is nothing but a glorified
> > global variable.
> 
> This is not a variable, but a common helper function which will remove
> the duplicated code and will help to avoid common mistakes like a missed
> put_device().

Yeah, you're right: this is actually much worse than a global variable.
It's a helper function that needs 50+ lines in order to effectively
access a global variable.

You could write this much simpler by doing something like this:

static struct tegra_mc *global_mc;

int tegra_mc_probe(...)
{
...

global_mc = mc;

...
}

struct tegra_mc *tegra_get_memory_controller(void)
{
return global_mc;
}

The result is *exactly* the same, except that this is actually more
honest. Nicolin's patch *pretends* that it isn't using a global variable
by wrapping a lot of complicated code around it.

But that doesn't change the fact that this accesses a singleton object
without actually being able to tie it to the device in the first place.

Thierry


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

30.09.2020 19:15, Thierry Reding пишет:
> On Wed, Sep 30, 2020 at 07:06:27PM +0300, Dmitry Osipenko wrote:
>> 30.09.2020 19:03, Thierry Reding пишет:
>>> On Wed, Sep 30, 2020 at 06:53:06PM +0300, Dmitry Osipenko wrote:
 30.09.2020 18:23, Thierry Reding пишет:
> On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
>> From: Dmitry Osipenko 
>>
>> Multiple Tegra drivers need to retrieve Memory Controller and hence there
>> is quite some duplication of the retrieval code among the drivers. Let's
>> add a new common helper for the retrieval of the MC.
>>
>> Signed-off-by: Dmitry Osipenko 
>> Signed-off-by: Nicolin Chen 
>> ---
>>
>> Changelog
>> v2->v3:
>>  * Replaced with Dimtry's devm_tegra_get_memory_controller()
>> v1->v2:
>>  * N/A
>>
>>  drivers/memory/tegra/mc.c | 39 +++
>>  include/soc/tegra/mc.h| 17 +
>>  2 files changed, 56 insertions(+)
>
> Let's not add this helper, please. If a device needs a reference to the
> memory controller, it should have a phandle to the memory controller in
> device tree so that it can be looked up explicitly.
>
> Adding this helper is officially sanctioning that it's okay not to have
> that reference and that's a bad idea.

 And please explain why it's a bad idea, I don't see anything bad here at
 all.
>>>
>>> Well, you said yourself in a recent comment that we should avoid global
>>> variables. devm_tegra_get_memory_controller() is nothing but a glorified
>>> global variable.
>>
>> This is not a variable, but a common helper function which will remove
>> the duplicated code and will help to avoid common mistakes like a missed
>> put_device().
> 
> Yeah, you're right: this is actually much worse than a global variable.
> It's a helper function that needs 50+ lines in order to effectively
> access a global variable.
> 
> You could write this much simpler by doing something like this:
> 
>   static struct tegra_mc *global_mc;
> 
>   int tegra_mc_probe(...)
>   {
>   ...
> 
>   global_mc = mc;
> 
>   ...
>   }
> 
>   struct tegra_mc *tegra_get_memory_controller(void)
>   {
>   return global_mc;
>   }
> 
> The result is *exactly* the same, except that this is actually more
> honest. Nicolin's patch *pretends* that it isn't using a global variable
> by wrapping a lot of complicated code around it.
> 
> But that doesn't change the fact that this accesses a singleton object
> without actually being able to tie it to the device in the first place.

I don't think that the MC driver will stay built-in forever, although
its modularization is complicated right now. Hence something shall keep
the reference to the MC device resources while they are in use and this
patch takes care of doing that.

Secondly, the Nicolin's patch doesn't pretend on anything, but rather
brings the already existing duplicated code to a single common place.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

30.09.2020 19:06, Thierry Reding пишет:
> On Wed, Sep 30, 2020 at 06:36:52PM +0300, Dmitry Osipenko wrote:
>>  I'...
 +  struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
 +  struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>>>
>>> It looks to me like the only reason why you need this new global API is
>>> because PCI devices may not have a device tree node with a phandle to
>>> the IOMMU. However, SMMU support for PCI will only be enabled if the
>>> root complex has an iommus property, right? In that case, can't we
>>> simply do something like this:
>>>
>>> if (dev_is_pci(dev))
>>> np = find_host_bridge(dev)->of_node;
>>> else
>>> np = dev->of_node;
>>>
>>> ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
>>> sure that exists.
>>>
>>> Once we have that we can still iterate over the iommus property and do
>>> not need to rely on this global variable.
>>
>> This sounds more complicated than the current variant.
>>
>> Secondly, I'm already about to use the new tegra_get_memory_controller()
>> API for all the T20/30/124/210 EMC and devfreq drivers.
> 
> Why do we need it there? They seem to work fine without it right now.

All the Tegra30/124/210 EMC drivers are already duplicating that MC
lookup code and only the recent T210 driver does it properly.

> If it is required for new functionality, we can always make the dependent
> on a DT reference via phandle without breaking any existing code.

That's correct, it will be also needed for the new functionality as
well, hence even more drivers will need to perform the MC lookup.

I don't quite understand why you're asking for the phandle reference,
it's absolutely not needed for the MC lookup and won't work for the
older DTs if DT change will be needed. Please give a detailed explanation.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

...
>> Secondly, I'm already about to use the new tegra_get_memory_controller()
>> API for all the T20/30/124/210 EMC and devfreq drivers.
> 
> Also, this really proves the point I was trying to make about how this
> is going to proliferate...

Sorry, I'm probably totally missing yours point.. "what" exactly will
proliferate?
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

On Wed, Sep 30, 2020 at 07:26:00PM +0300, Dmitry Osipenko wrote:
> 30.09.2020 19:15, Thierry Reding пишет:
> > On Wed, Sep 30, 2020 at 07:06:27PM +0300, Dmitry Osipenko wrote:
> >> 30.09.2020 19:03, Thierry Reding пишет:
> >>> On Wed, Sep 30, 2020 at 06:53:06PM +0300, Dmitry Osipenko wrote:
>  30.09.2020 18:23, Thierry Reding пишет:
> > On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
> >> From: Dmitry Osipenko 
> >>
> >> Multiple Tegra drivers need to retrieve Memory Controller and hence 
> >> there
> >> is quite some duplication of the retrieval code among the drivers. 
> >> Let's
> >> add a new common helper for the retrieval of the MC.
> >>
> >> Signed-off-by: Dmitry Osipenko 
> >> Signed-off-by: Nicolin Chen 
> >> ---
> >>
> >> Changelog
> >> v2->v3:
> >>  * Replaced with Dimtry's devm_tegra_get_memory_controller()
> >> v1->v2:
> >>  * N/A
> >>
> >>  drivers/memory/tegra/mc.c | 39 +++
> >>  include/soc/tegra/mc.h| 17 +
> >>  2 files changed, 56 insertions(+)
> >
> > Let's not add this helper, please. If a device needs a reference to the
> > memory controller, it should have a phandle to the memory controller in
> > device tree so that it can be looked up explicitly.
> >
> > Adding this helper is officially sanctioning that it's okay not to have
> > that reference and that's a bad idea.
> 
>  And please explain why it's a bad idea, I don't see anything bad here at
>  all.
> >>>
> >>> Well, you said yourself in a recent comment that we should avoid global
> >>> variables. devm_tegra_get_memory_controller() is nothing but a glorified
> >>> global variable.
> >>
> >> This is not a variable, but a common helper function which will remove
> >> the duplicated code and will help to avoid common mistakes like a missed
> >> put_device().
> > 
> > Yeah, you're right: this is actually much worse than a global variable.
> > It's a helper function that needs 50+ lines in order to effectively
> > access a global variable.
> > 
> > You could write this much simpler by doing something like this:
> > 
> > static struct tegra_mc *global_mc;
> > 
> > int tegra_mc_probe(...)
> > {
> > ...
> > 
> > global_mc = mc;
> > 
> > ...
> > }
> > 
> > struct tegra_mc *tegra_get_memory_controller(void)
> > {
> > return global_mc;
> > }
> > 
> > The result is *exactly* the same, except that this is actually more
> > honest. Nicolin's patch *pretends* that it isn't using a global variable
> > by wrapping a lot of complicated code around it.
> > 
> > But that doesn't change the fact that this accesses a singleton object
> > without actually being able to tie it to the device in the first place.
> 
> I don't think that the MC driver will stay built-in forever, although
> its modularization is complicated right now. Hence something shall keep
> the reference to the MC device resources while they are in use and this
> patch takes care of doing that.

It looks to me like all the other places where we get a reference to the
MC also keep a reference to the device. That's obviously not going to be
enough once the code is turned into a module. At that point we need to
make sure to also grab a reference to the module. But that's orthogonal
to this discussion.

> Secondly, the Nicolin's patch doesn't pretend on anything, but rather

Yes, the patch does pretend to "look up" the memory controller device,
but in reality it will always return a singleton object, which can just
as easily be achieved by using a global variable.

> brings the already existing duplicated code to a single common place.

Where exactly is that duplicated code? The only places I see where we
get a reference to the memory controller are from the EMC drivers and
they properly look up the MC via the nvidia,memory-controller device
tree property.

But that's not what this new helper does. This code will use the OF
lookup table to find any match and then returns that, completely
ignoring any links established by the device tree.

Thierry


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

On Wed, Sep 30, 2020 at 07:25:41PM +0300, Dmitry Osipenko wrote:
> 30.09.2020 19:06, Thierry Reding пишет:
> > On Wed, Sep 30, 2020 at 06:36:52PM +0300, Dmitry Osipenko wrote:
> >>  I'...
>  +struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
>  +struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> >>>
> >>> It looks to me like the only reason why you need this new global API is
> >>> because PCI devices may not have a device tree node with a phandle to
> >>> the IOMMU. However, SMMU support for PCI will only be enabled if the
> >>> root complex has an iommus property, right? In that case, can't we
> >>> simply do something like this:
> >>>
> >>>   if (dev_is_pci(dev))
> >>>   np = find_host_bridge(dev)->of_node;
> >>>   else
> >>>   np = dev->of_node;
> >>>
> >>> ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> >>> sure that exists.
> >>>
> >>> Once we have that we can still iterate over the iommus property and do
> >>> not need to rely on this global variable.
> >>
> >> This sounds more complicated than the current variant.
> >>
> >> Secondly, I'm already about to use the new tegra_get_memory_controller()
> >> API for all the T20/30/124/210 EMC and devfreq drivers.
> > 
> > Why do we need it there? They seem to work fine without it right now.
> 
> All the Tegra30/124/210 EMC drivers are already duplicating that MC
> lookup code and only the recent T210 driver does it properly.
> 
> > If it is required for new functionality, we can always make the dependent
> > on a DT reference via phandle without breaking any existing code.
> 
> That's correct, it will be also needed for the new functionality as
> well, hence even more drivers will need to perform the MC lookup.

I don't have any issues with adding a helper if we need it from several
different locations. But the helper should be working off of a given
device and look up the device via the device tree node referenced by
phandle. We already have those phandles in place for the EMC devices,
and any other device that needs to interoperate with the MC should also
get such a reference.

> I don't quite understand why you're asking for the phandle reference,
> it's absolutely not needed for the MC lookup and won't work for the

We need that phandle in order to establish a link between the devices.
Yes, you can probably do it without the phandle and just match by
compatible string. But we don't do that for other types of devices
either, right? For a display driver we reference the attached panel via
phandle, but we could also just look it up via name or absolute path or
some other heuristic. But a phandle is just a much more explicit way of
linking the devices, so why not use it?

> older DTs if DT change will be needed. Please give a detailed explanation.

New functionality doesn't have to work with older DTs.

Thierry


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [patch V2 00/46] x86, PCI, XEN, genirq ...: Prepare for device MSI

2020-09-30 Thread Dey, Megha

Hi Thomas/Jason,

On 9/30/2020 8:20 AM, Thomas Gleixner wrote:

On Wed, Sep 30 2020 at 08:43, Jason Gunthorpe wrote:

On Wed, Sep 30, 2020 at 08:41:48AM +0200, Thomas Gleixner wrote:

On Tue, Sep 29 2020 at 16:03, Megha Dey wrote:

On 8/26/2020 4:16 AM, Thomas Gleixner wrote:

#9 is obviously just for the folks interested in IMS

I see that the tip tree (as of 9/29) has most of these patches but
notice that the DEV_MSI related patches

haven't made it. I have tested the tip tree(x86/irq branch) with your
DEV_MSI infra patches and our IMS patches with the IDXD driver and was

Your IMS patches? Why do you need something special again?

By IMS patches, I meant your IMS driver patch that was updated (as it
was untested, it had some compile

errors and we removed the IMS_QUEUE parts) :

https://lore.kernel.org/lkml/160021246221.67751.16280230469654363209.st...@djiang5-desk3.ch.intel.com/

and some iommu related changes required by IMS.

https://lore.kernel.org/lkml/160021246905.67751.1674517279122764758.st...@djiang5-desk3.ch.intel.com/

The whole patchset can be found here:

https://lore.kernel.org/lkml/f4a085f1-f6de-2539-12fe-c7308d243...@intel.com/

It would be great if you could review the IMS patches :)

wondering if we should push out those patches as part of our patchset?

As I don't have any hardware to test that, I was waiting for you and
Jason to confirm that this actually works for the two different IMS
implementations.

How urgently do you need this? The code looked good from what I
understood. It will be a while before we have all the parts to send an
actual patch though.

I personally do not need it at all :) Megha might have different
thoughts...

I have tested these patches and it works fine (I had to add a couple of
EXPORT_SYMBOLS).

We were hoping to get IMS in the 5.10 merge window :)

We might be able to put together a mockup just to prove it

If that makes Megha's stuff going that would of course be appreciated,
but we can defer the IMS_QUEUE part for later. It's orthogonal to the
IMS_ARRAY stuff.

In our patch series, we have removed the IMS_QUEUE stuff and retained
only the IMS_ARRAY parts

as that was sufficient for us.

Thanks,

tglx

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

...
>> I don't think that the MC driver will stay built-in forever, although
>> its modularization is complicated right now. Hence something shall keep
>> the reference to the MC device resources while they are in use and this
>> patch takes care of doing that.
> 
> It looks to me like all the other places where we get a reference to the
> MC also keep a reference to the device. That's obviously not going to be
> enough once the code is turned into a module. At that point we need to
> make sure to also grab a reference to the module. But that's orthogonal
> to this discussion.
> 
>> Secondly, the Nicolin's patch doesn't pretend on anything, but rather
> 
> Yes, the patch does pretend to "look up" the memory controller device,
> but in reality it will always return a singleton object, which can just
> as easily be achieved by using a global variable.
> 
>> brings the already existing duplicated code to a single common place.
> 
> Where exactly is that duplicated code? The only places I see where we
> get a reference to the memory controller are from the EMC drivers and
> they properly look up the MC via the nvidia,memory-controller device
> tree property.

Yours observation is correct, all the drivers *do the lookup*. My point
is that the nvidia,memory-controller usage isn't strictly necessary.

The tegra20-devfreq driver doesn't use the phandle lookup because
Tegra20 DTs don't have it, instead it uses the compatible lookup. Hence
this patch doesn't really change the already existing behaviour for the
drivers. The phandle usage is optional.

This patch adds a common API that is usable by *all* the already
existing drivers, starting from the Tegra20 drivers.

> But that's not what this new helper does. This code will use the OF
> lookup table to find any match and then returns that, completely
> ignoring any links established by the device tree.

As I already said in the other reply, it should be fine to add lookup by
the phandle and then fall back to the compatible matching. On the other
hand, this is not strictly necessary because we always have only a
single MC device at a time.

Please note that I don't have any objections to improving this patch. In
the end either way will work, so we just need to choose the best option.
I was merely explaining the rationale behind this patch and not trying
to defend it.

Yours suggestion about using static mc variable is also good to me since
currently MC driver is built-in and at least that won't be a
globally-visible kernel variable, which doesn't feel great to have.

I think we can follow approach of the static mc variable for the starter
and improve it once there will be a real need. This should be the
simplest option right now.

But again, we may slightly future-proof the API by adding the
resource-managed variant. Either way will be good, IMO :) Currently I
don't have a strong preference.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [patch V2 00/46] x86, PCI, XEN, genirq ...: Prepare for device MSI

2020-09-30 Thread Thomas Gleixner

Megha,

On Wed, Sep 30 2020 at 10:25, Megha Dey wrote:
> On 9/30/2020 8:20 AM, Thomas Gleixner wrote:
 Your IMS patches? Why do you need something special again?
>
> By IMS patches, I meant your IMS driver patch that was updated (as it 
> was untested, it had some compile errors and we removed the IMS_QUEUE
> parts) :

Ok.

> The whole patchset can be found here:
>
> https://lore.kernel.org/lkml/f4a085f1-f6de-2539-12fe-c7308d243...@intel.com/
>
> It would be great if you could review the IMS patches :)

It somehow slipped through the cracks. I'll have a look.

> We were hoping to get IMS in the 5.10 merge window :)

Hope dies last, right?

>>> We might be able to put together a mockup just to prove it
>> If that makes Megha's stuff going that would of course be appreciated,
>> but we can defer the IMS_QUEUE part for later. It's orthogonal to the
>> IMS_ARRAY stuff.
>
> In our patch series, we have removed the IMS_QUEUE stuff and retained 
> only the IMS_ARRAY parts > as that was sufficient for us.

That works. We can add that back when Jason has his puzzle pieces
sorted.

Thanks,

tglx
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [patch V2 24/46] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI

2020-09-30 Thread Jason Gunthorpe

On Wed, Sep 30, 2020 at 01:08:27PM +, Derrick, Jonathan wrote:
> +Megha
> 
> On Wed, 2020-09-30 at 09:57 -0300, Jason Gunthorpe wrote:
> > On Wed, Sep 30, 2020 at 12:45:30PM +, Derrick, Jonathan wrote:
> > > Hi Jason
> > > 
> > > On Mon, 2020-08-31 at 11:39 -0300, Jason Gunthorpe wrote:
> > > > On Wed, Aug 26, 2020 at 01:16:52PM +0200, Thomas Gleixner wrote:
> > > > > From: Thomas Gleixner 
> > > > > 
> > > > > Devices on the VMD bus use their own MSI irq domain, but it is not
> > > > > distinguishable from regular PCI/MSI irq domains. This is required
> > > > > to exclude VMD devices from getting the irq domain pointer set by
> > > > > interrupt remapping.
> > > > > 
> > > > > Override the default bus token.
> > > > > 
> > > > > Signed-off-by: Thomas Gleixner 
> > > > > Acked-by: Bjorn Helgaas 
> > > > >  drivers/pci/controller/vmd.c |6 ++
> > > > >  1 file changed, 6 insertions(+)
> > > > > 
> > > > > +++ b/drivers/pci/controller/vmd.c
> > > > > @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
> > > > >   return -ENODEV;
> > > > >   }
> > > > >  
> > > > > + /*
> > > > > +  * Override the irq domain bus token so the domain can be 
> > > > > distinguished
> > > > > +  * from a regular PCI/MSI domain.
> > > > > +  */
> > > > > + irq_domain_update_bus_token(vmd->irq_domain, 
> > > > > DOMAIN_BUS_VMD_MSI);
> > > > > +
> > > > 
> > > > Having the non-transparent-bridge hold a MSI table and
> > > > multiplex/de-multiplex IRQs looks like another good use case for
> > > > something close to pci_subdevice_msi_create_irq_domain()?
> > > > 
> > > > If each device could have its own internal MSI-X table programmed
> > > > properly things would work alot better. Disable capture/remap of the
> > > > MSI range in the NTB.
> > > We can disable the capture and remap in newer devices so we don't even
> > > need the irq domain.
> > 
> > You'd still need an irq domain, it just comes from
> > pci_subdevice_msi_create_irq_domain() instead of internal to this
> > driver.
> I have this set which disables remapping and bypasses the creation of
> the IRQ domain:
> https://patchwork.ozlabs.org/project/linux-pci/list/?series=192936

After Thomas's series VMD needs to supply a hierarchical IRQ domain to
get the correct PCI originator. This instead of the x86 patch in that
series. I think that domain should be created by
pci_subdevice_msi_create_irq_domain(), at least I'd start there.

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 3/3] iommu/tegra-smmu: Add PCI support

On Wed, Sep 30, 2020 at 05:53:20PM +0300, Dmitry Osipenko wrote:
> ...
> > +#ifdef CONFIG_PCI
> > +   if (!iommu_present(&pci_bus_type)) {
> 
> 
> In the previous reply you said that you're borrowing this check from the
> arm-smmu driver, but arm-smmu also has a similar check for
> platform_bus_type, while Tegra SMMU driver doesn't have it. Hence I'm
> objecting the necessity of having this check.
> 
> Please give a rationale for having this check in the code. And please
> note that cargo cult isn't a rationale to me.

Okay, okayI will remove it upon testing.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

On Wed, Sep 30, 2020 at 05:31:31PM +0200, Thierry Reding wrote:
> On Wed, Sep 30, 2020 at 01:42:57AM -0700, Nicolin Chen wrote:
> > Previously the driver relies on bus_set_iommu() in .probe() to call
> > in .probe_device() function so each client can poll iommus property
> > in DTB to configure fwspec via tegra_smmu_configure(). According to
> > the comments in .probe(), this is a bit of a hack. And this doesn't
> > work for a client that doesn't exist in DTB, PCI device for example.
> > 
> > Actually when a device/client gets probed, the of_iommu_configure()
> > will call in .probe_device() function again, with a prepared fwspec
> > from of_iommu_configure() that reads the SWGROUP id in DTB as we do
> > in tegra-smmu driver.
> > 
> > Additionally, as a new helper devm_tegra_get_memory_controller() is
> > introduced, there's no need to poll the iommus property in order to
> > get mc->smmu pointers or SWGROUP id.
> > 
> > This patch reworks .probe_device() and .attach_dev() by doing:
> > 1) Using fwspec to get swgroup id in .attach_dev/.dettach_dev()
> > 2) Removing DT polling code, tegra_smmu_find/tegra_smmu_configure()
> > 3) Calling devm_tegra_get_memory_controller() in .probe_device()
> > 4) Also dropping the hack in .probe() that's no longer needed.
> > 
> > Signed-off-by: Nicolin Chen 
[...]
> >  static struct iommu_device *tegra_smmu_probe_device(struct device *dev)
> >  {
> > -   struct device_node *np = dev->of_node;
> > -   struct tegra_smmu *smmu = NULL;
> > -   struct of_phandle_args args;
> > -   unsigned int index = 0;
> > -   int err;
> > -
> > -   while (of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> > - &args) == 0) {
> > -   smmu = tegra_smmu_find(args.np);
> > -   if (smmu) {
> > -   err = tegra_smmu_configure(smmu, dev, &args);
> > -   of_node_put(args.np);
> > -
> > -   if (err < 0)
> > -   return ERR_PTR(err);
> > -
> > -   /*
> > -* Only a single IOMMU master interface is currently
> > -* supported by the Linux kernel, so abort after the
> > -* first match.
> > -*/
> > -   dev_iommu_priv_set(dev, smmu);
> > -
> > -   break;
> > -   }
> > +   struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> > +   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> 
> It looks to me like the only reason why you need this new global API is
> because PCI devices may not have a device tree node with a phandle to
> the IOMMU. However, SMMU support for PCI will only be enabled if the
> root complex has an iommus property, right? In that case, can't we
> simply do something like this:
> 
>   if (dev_is_pci(dev))
>   np = find_host_bridge(dev)->of_node;
>   else
>   np = dev->of_node;
> 
> ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> sure that exists.
> 
> Once we have that we can still iterate over the iommus property and do
> not need to rely on this global variable.

I agree that it'd work. But I was hoping to simplify the code
here if it's possible. Looks like we have an argument on this
so I will choose to go with your suggestion above for now.

> > -   of_node_put(args.np);
> > -   index++;
> > -   }
> > +   /* An invalid mc pointer means mc and smmu drivers are not ready */
> > +   if (IS_ERR(mc))
> > +   return ERR_PTR(-EPROBE_DEFER);
> >  
> > -   if (!smmu)
> > +   /*
> > +* IOMMU core allows -ENODEV return to carry on. So bypass any call
> > +* from bus_set_iommu() during tegra_smmu_probe(), as a device will
> > +* call in again via of_iommu_configure when fwspec is prepared.
> > +*/
> > +   if (!mc->smmu || !fwspec || fwspec->ops != &tegra_smmu_ops)
> > return ERR_PTR(-ENODEV);
> >  
> > -   return &smmu->iommu;
> > +   dev_iommu_priv_set(dev, mc->smmu);
> > +
> > +   return &mc->smmu->iommu;
> >  }
> >  
> >  static void tegra_smmu_release_device(struct device *dev)
> > @@ -1089,16 +1027,6 @@ struct tegra_smmu *tegra_smmu_probe(struct device 
> > *dev,
> > if (!smmu)
> > return ERR_PTR(-ENOMEM);
> >  
> > -   /*
> > -* This is a bit of a hack. Ideally we'd want to simply return this
> > -* value. However the IOMMU registration process will attempt to add
> > -* all devices to the IOMMU when bus_set_iommu() is called. In order
> > -* not to rely on global variables to track the IOMMU instance, we
> > -* set it here so that it can be looked up from the .probe_device()
> > -* callback via the IOMMU device's .drvdata field.
> > -*/
> > -   mc->smmu = smmu;
> 
> I don't think this is going to work. I distinctly remember putting this
> here because we needed access to this before ->probe_device() had been
> called for any of the devices.

Do you remember

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev