Re: [PATCH 5/5] iommu: virt: Use iommu_put_resv_regions_simple()

2019-09-02 Thread Christoph Hellwig
I think the subject should say virtio instead of virt.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 -next] iommu/arm-smmu-v3: Fix build error without CONFIG_PCI_ATS

2019-09-02 Thread YueHaibing
If CONFIG_PCI_ATS is not set, building fails:

drivers/iommu/arm-smmu-v3.c: In function arm_smmu_ats_supported:
drivers/iommu/arm-smmu-v3.c:2325:35: error: struct pci_dev has no member named ats_cap; did you mean msi_cap?
  return !pdev->untrusted && pdev->ats_cap;
   ^~~

ats_cap should only be used when CONFIG_PCI_ATS is defined,
so use an #ifdef block to guard it.

Fixes: bfff88ec1afe ("iommu/arm-smmu-v3: Rework enabling/disabling of ATS for 
PCI masters")
Signed-off-by: YueHaibing 
---
v2: Add a stub arm_smmu_ats_supported() for builds without CONFIG_PCI_ATS
---
 drivers/iommu/arm-smmu-v3.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 66bf641..8da93e7 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2311,6 +2311,7 @@ static void arm_smmu_install_ste_for_dev(struct 
arm_smmu_master *master)
}
 }
 
+#ifdef CONFIG_PCI_ATS
 static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
 {
struct pci_dev *pdev;
@@ -2324,6 +2325,12 @@ static bool arm_smmu_ats_supported(struct 
arm_smmu_master *master)
pdev = to_pci_dev(master->dev);
return !pdev->untrusted && pdev->ats_cap;
 }
+#else
+static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
+{
+   return false;
+}
+#endif
 
 static void arm_smmu_enable_ats(struct arm_smmu_master *master)
 {
-- 
2.7.4
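
The pattern this v2 patch uses (a full function when the option is enabled, a stub that always returns false otherwise) can be tried outside the kernel. The following is only a userspace sketch under stated assumptions: FEATURE_ATS stands in for CONFIG_PCI_ATS, and struct mock_pci_dev and mock_ats_supported() are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace mock-up, not kernel code. FEATURE_ATS is a hypothetical
 * stand-in for CONFIG_PCI_ATS; build with -DFEATURE_ATS to get the
 * "enabled" variant.
 */
struct mock_pci_dev {
	bool untrusted;
#ifdef FEATURE_ATS
	unsigned short ats_cap;	/* member exists only when the feature is built in */
#endif
};

#ifdef FEATURE_ATS
static bool mock_ats_supported(const struct mock_pci_dev *pdev)
{
	/* Safe here: ats_cap is guaranteed to exist in this branch. */
	return !pdev->untrusted && pdev->ats_cap;
}
#else
/* Stub: never touches ats_cap and always has a return statement. */
static bool mock_ats_supported(const struct mock_pci_dev *pdev)
{
	(void)pdev;
	return false;
}
#endif
```

Compared with guarding only the last two statements of the function, this keeps every build variant free of references to the missing member and leaves no path that falls off the end of a non-void function.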




RE: [PATCH v10 3/4] block: add a helper function to merge the segments

2019-09-02 Thread Yoshihiro Shimoda
Hi Christoph,

> > Now this patch series got {Ack,Review}ed-by from each maintainer.
> > https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=166501
> >
> > So, would you pick this up through the dma-mapping tree as you said before?
> 
> I've applied it to the dma-mapping tree for 5.4 now, thanks a lot!

Thank you very much for your support!

Best regards,
Yoshihiro Shimoda



Re: "Rework enabling/disabling of ATS for PCI masters" failed to compile on arm64

2019-09-02 Thread Will Deacon
On Mon, Sep 02, 2019 at 10:10:30PM -0400, Qian Cai wrote:
> The linux-next commit “iommu/arm-smmu-v3: Rework enabling/disabling of ATS 
> for PCI masters” [1] causes a compilation error when PCI_ATS=n on arm64.
> 
> [1] 
> https://lore.kernel.org/linux-iommu/20190820154549.17018-3-w...@kernel.org/
> 
> drivers/iommu/arm-smmu-v3.c:2325:35: error: no member named 'ats_cap' in 
> 'struct pci_dev'
> return !pdev->untrusted && pdev->ats_cap;
>  ^
> 
> For example,
> 
> Symbol: PCI_ATS [=n]
>   │ Type  : bool
>   │   Defined at drivers/pci/Kconfig:118
>   │   Depends on: PCI [=y] 
>   │   Selected by [n]: 
>   │   - PCI_IOV [=n] && PCI [=y] 
>   │   - PCI_PRI [=n] && PCI [=y]│  
>   │   - PCI_PASID [=n] && PCI [=y] │  
>   │   - AMD_IOMMU [=n] && IOMMU_SUPPORT [=y] && X86_64 && PCI [=y] && ACPI 
> [=y]

https://lkml.kernel.org/r/20190903063028.6ryuk5dmaohi2fqa@willie-the-truck

Will

Re: [PATCH -next] iommu/arm-smmu-v3: Fix build error without CONFIG_PCI_ATS

2019-09-02 Thread Yuehaibing
On 2019/9/3 14:30, Will Deacon wrote:
> On Tue, Sep 03, 2019 at 10:42:12AM +0800, YueHaibing wrote:
>> If CONFIG_PCI_ATS is not set, building fails:
>>
>> drivers/iommu/arm-smmu-v3.c: In function arm_smmu_ats_supported:
>> drivers/iommu/arm-smmu-v3.c:2325:35: error: struct pci_dev has no member 
>> named ats_cap; did you mean msi_cap?
>>   return !pdev->untrusted && pdev->ats_cap;
>>^~~
>>
>> ats_cap should only used when CONFIG_PCI_ATS is defined,
>> so use #ifdef block to guard this.
>>
>> Fixes: bfff88ec1afe ("iommu/arm-smmu-v3: Rework enabling/disabling of ATS 
>> for PCI masters")
>> Signed-off-by: YueHaibing 
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 66bf641..44ac9ac 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -2313,7 +2313,7 @@ static void arm_smmu_install_ste_for_dev(struct 
>> arm_smmu_master *master)
>>  
>>  static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
>>  {
>> -struct pci_dev *pdev;
>> +struct pci_dev *pdev __maybe_unused;
>>  struct arm_smmu_device *smmu = master->smmu;
>>  struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
>>  
>> @@ -2321,8 +2321,10 @@ static bool arm_smmu_ats_supported(struct 
>> arm_smmu_master *master)
>>  !(fwspec->flags & IOMMU_FWSPEC_PCI_RC_ATS) || pci_ats_disabled())
>>  return false;
>>  
>> +#ifdef CONFIG_PCI_ATS
>>  pdev = to_pci_dev(master->dev);
>>  return !pdev->untrusted && pdev->ats_cap;
>> +#endif
>>  }
> 
> Hmm, I really don't like the missing return statement here, even though we
> never get this far thanks to the feature not getting set during ->probe().
> I'd actually prefer just to duplicate the function:
> 
> #ifndef CONFIG_PCI_ATS
> static bool
> arm_smmu_ats_supported(struct arm_smmu_master *master) { return false; }
> #else
> 
> #endif
> 
> Can you send a v2 like that, please?

OK, I will send a v2 as you suggested.

> 
> Will
> 
> .
> 



Re: [PATCH v10 3/4] block: add a helper function to merge the segments

2019-09-02 Thread h...@lst.de
On Tue, Sep 03, 2019 at 04:59:59AM +, Yoshihiro Shimoda wrote:
> Hi Christoph,
> 
> Now this patch series got {Ack,Review}ed-by from each maintainer.
> https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=166501
> 
> So, would you pick this up through the dma-mapping tree as you said before?

I've applied it to the dma-mapping tree for 5.4 now, thanks a lot!


Re: [PATCH -next] iommu/arm-smmu-v3: Fix build error without CONFIG_PCI_ATS

2019-09-02 Thread Will Deacon
On Tue, Sep 03, 2019 at 10:42:12AM +0800, YueHaibing wrote:
> If CONFIG_PCI_ATS is not set, building fails:
> 
> drivers/iommu/arm-smmu-v3.c: In function arm_smmu_ats_supported:
> drivers/iommu/arm-smmu-v3.c:2325:35: error: struct pci_dev has no member 
> named ats_cap; did you mean msi_cap?
>   return !pdev->untrusted && pdev->ats_cap;
>^~~
> 
> ats_cap should only used when CONFIG_PCI_ATS is defined,
> so use #ifdef block to guard this.
> 
> Fixes: bfff88ec1afe ("iommu/arm-smmu-v3: Rework enabling/disabling of ATS for 
> PCI masters")
> Signed-off-by: YueHaibing 
> ---
>  drivers/iommu/arm-smmu-v3.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 66bf641..44ac9ac 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2313,7 +2313,7 @@ static void arm_smmu_install_ste_for_dev(struct 
> arm_smmu_master *master)
>  
>  static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
>  {
> - struct pci_dev *pdev;
> + struct pci_dev *pdev __maybe_unused;
>   struct arm_smmu_device *smmu = master->smmu;
>   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
>  
> @@ -2321,8 +2321,10 @@ static bool arm_smmu_ats_supported(struct 
> arm_smmu_master *master)
>   !(fwspec->flags & IOMMU_FWSPEC_PCI_RC_ATS) || pci_ats_disabled())
>   return false;
>  
> +#ifdef CONFIG_PCI_ATS
>   pdev = to_pci_dev(master->dev);
>   return !pdev->untrusted && pdev->ats_cap;
> +#endif
>  }

Hmm, I really don't like the missing return statement here, even though we
never get this far thanks to the feature not getting set during ->probe().
I'd actually prefer just to duplicate the function:

#ifndef CONFIG_PCI_ATS
static bool
arm_smmu_ats_supported(struct arm_smmu_master *master) { return false; }
#else

#endif

Can you send a v2 like that, please?

Will


RE: [PATCH v10 3/4] block: add a helper function to merge the segments

2019-09-02 Thread Yoshihiro Shimoda
Hi Christoph,

Now this patch series got {Ack,Review}ed-by from each maintainer.
https://patchwork.kernel.org/project/linux-renesas-soc/list/?series=166501

So, would you pick this up through the dma-mapping tree as you said before?

> From: Jens Axboe, Sent: Tuesday, September 3, 2019 6:47 AM
> 
> On 8/28/19 6:35 AM, Yoshihiro Shimoda wrote:
> > This patch adds a helper function whether a queue can merge
> > the segments by the DMA MAP layer (e.g. via IOMMU).
> 
> Reviewed-by: Jens Axboe 

Jens, thank you for your review!

Best regards,
Yoshihiro Shimoda



Re: [PATCH 2/2] media: i2c: dw9768: Add DW9768 VCM driver

2019-09-02 Thread Tomasz Figa
Hi Dongchun,

On Tue, Sep 3, 2019 at 12:02 AM Dongchun Zhu  wrote:
>
> Hi Tomasz,
>
> On Fri, 2019-08-23 at 17:17 +0900, Tomasz Figa wrote:
> > Hi Dongchun,
> >
> > On Mon, Jul 08, 2019 at 06:06:41PM +0800, dongchun@mediatek.com wrote:
> > > From: Dongchun Zhu 
> > >
> > > This patch adds a V4L2 sub-device driver for DW9768 lens voice coil,
> > > and provides control to set the desired focus.
> > >
> > > The DW9807 is a 10 bit DAC from Dongwoon, designed for linear
> > > control of voice coil motor.
> > >
> > > Signed-off-by: Dongchun Zhu 
> > > ---
> > >  MAINTAINERS|   1 +
> > >  drivers/media/i2c/Kconfig  |  10 +
> > >  drivers/media/i2c/Makefile |   1 +
> > >  drivers/media/i2c/dw9768.c | 458 
> > > +
> > >  4 files changed, 470 insertions(+)
> > >  create mode 100644 drivers/media/i2c/dw9768.c
> > >
> >
> > Thanks for the patch! Please see my comments inline.
> >
> > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > index 8f6ac93..17152d7 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -4877,6 +4877,7 @@ M:Dongchun Zhu 
> > >  L: linux-me...@vger.kernel.org
> > >  T: git git://linuxtv.org/media_tree.git
> > >  S: Maintained
> > > +F: drivers/media/i2c/dw9768.c
> > >  F: Documentation/devicetree/bindings/media/i2c/dongwoon,dw9768.txt
> > >
> > >  DONGWOON DW9807 LENS VOICE COIL DRIVER
> > > diff --git a/drivers/media/i2c/Kconfig b/drivers/media/i2c/Kconfig
> > > index 7793358..8ff6c95 100644
> > > --- a/drivers/media/i2c/Kconfig
> > > +++ b/drivers/media/i2c/Kconfig
> > > @@ -1014,6 +1014,16 @@ config VIDEO_DW9714
> > >   capability. This is designed for linear control of
> > >   voice coil motors, controlled via I2C serial interface.
> > >
> > > +config VIDEO_DW9768
> > > +   tristate "DW9768 lens voice coil support"
> > > +   depends on I2C && VIDEO_V4L2 && MEDIA_CONTROLLER
> > > +   depends on VIDEO_V4L2_SUBDEV_API
> > > +   help
> > > + This is a driver for the DW9768 camera lens voice coil.
> > > + DW9768 is a 10 bit DAC with 100mA output current sink
> > > + capability. This is designed for linear control of
> > > + voice coil motors, controlled via I2C serial interface.
> > > +
> > >  config VIDEO_DW9807_VCM
> > > tristate "DW9807 lens voice coil support"
> > > depends on I2C && VIDEO_V4L2 && MEDIA_CONTROLLER
> > > diff --git a/drivers/media/i2c/Makefile b/drivers/media/i2c/Makefile
> > > index d8ad9da..944fbf6 100644
> > > --- a/drivers/media/i2c/Makefile
> > > +++ b/drivers/media/i2c/Makefile
> > > @@ -24,6 +24,7 @@ obj-$(CONFIG_VIDEO_SAA6752HS) += saa6752hs.o
> > >  obj-$(CONFIG_VIDEO_AD5820)  += ad5820.o
> > >  obj-$(CONFIG_VIDEO_AK7375)  += ak7375.o
> > >  obj-$(CONFIG_VIDEO_DW9714)  += dw9714.o
> > > +obj-$(CONFIG_VIDEO_DW9768)  += dw9768.o
> > >  obj-$(CONFIG_VIDEO_DW9807_VCM)  += dw9807-vcm.o
> > >  obj-$(CONFIG_VIDEO_ADV7170) += adv7170.o
> > >  obj-$(CONFIG_VIDEO_ADV7175) += adv7175.o
> > > diff --git a/drivers/media/i2c/dw9768.c b/drivers/media/i2c/dw9768.c
> > > new file mode 100644
> > > index 000..f5b5591
> > > --- /dev/null
> > > +++ b/drivers/media/i2c/dw9768.c
> > > @@ -0,0 +1,458 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * Copyright (c) 2018 MediaTek Inc.
> > > + */
> > > +
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +
> > > +#define DW9768_VOLTAGE_ANALOG  280
> >
> > This is a platform detail and should be defined in the platform data, for
> > example DTS on platforms using DT.
> >
>
> Thanks for your reminder.
> This would be fixed in next release.
>
> > > +#define DW9768_NAME"dw9768"
> >
> > The chip we seem to be using this driver for is called gt9769. Shouldn't we
> > call the driver the same?
> >
>
> It is also called DW9768 from camera module specification, which was
> initially confirmed with vendor.
>

Okay, thanks for clarifying.

Best regards,
Tomasz


[PATCH v2 7/7] arm64: tegra: enable SMMU for SDHCI and EQOS on T194

2019-09-02 Thread Krishna Reddy
Enable SMMU translations for SDHCI and EQOS transactions on T194.

Signed-off-by: Krishna Reddy 
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index 5ae3bbf..cac3462 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -51,6 +51,7 @@
clock-names = "master_bus", "slave_bus", "rx", "tx", 
"ptp_ref";
resets = <&bpmp TEGRA194_RESET_EQOS>;
reset-names = "eqos";
+   iommus = <&smmu TEGRA186_SID_EQOS>;
status = "disabled";
 
snps,write-requests = <1>;
@@ -381,6 +382,7 @@
clock-names = "sdhci";
resets = <&bpmp TEGRA194_RESET_SDMMC1>;
reset-names = "sdhci";
+   iommus = <&smmu TEGRA186_SID_SDMMC1>;
nvidia,pad-autocal-pull-up-offset-3v3-timeout =
<0x07>;
nvidia,pad-autocal-pull-down-offset-3v3-timeout =
@@ -403,6 +405,7 @@
clock-names = "sdhci";
resets = <&bpmp TEGRA194_RESET_SDMMC3>;
reset-names = "sdhci";
+   iommus = <&smmu TEGRA186_SID_SDMMC3>;
nvidia,pad-autocal-pull-up-offset-1v8 = <0x00>;
nvidia,pad-autocal-pull-down-offset-1v8 = <0x7a>;
nvidia,pad-autocal-pull-up-offset-3v3-timeout = <0x07>;
@@ -430,6 +433,7 @@
  <&bpmp TEGRA194_CLK_PLLC4>;
resets = <&bpmp TEGRA194_RESET_SDMMC4>;
reset-names = "sdhci";
+   iommus = <&smmu TEGRA186_SID_SDMMC4>;
nvidia,pad-autocal-pull-up-offset-hs400 = <0x00>;
nvidia,pad-autocal-pull-down-offset-hs400 = <0x00>;
nvidia,pad-autocal-pull-up-offset-1v8-timeout = <0x0a>;
-- 
2.1.4



[PATCH v2 3/7] dt-bindings: arm-smmu: Add binding for Tegra194 SMMU

2019-09-02 Thread Krishna Reddy
Add a binding for NVIDIA's Tegra194 SoC SMMU, which is based
on ARM MMU-500.

Signed-off-by: Krishna Reddy 
---
 Documentation/devicetree/bindings/iommu/arm,smmu.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 3133f3b..1d72fac 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -31,6 +31,10 @@ conditions.
   as below, SoC-specific compatibles:
   "qcom,sdm845-smmu-500", "arm,mmu-500"
 
+  NVIDIA SoCs that use more than one ARM MMU-500 together
+  needs following SoC-specific compatibles along with 
"arm,mmu-500":
+  "nvidia,tegra194-smmu"
+
 - reg   : Base address and size of the SMMU.
 
 - #global-interrupts : The number of global interrupts exposed by the
-- 
2.1.4



[PATCH v2 0/7] Nvidia Arm SMMUv2 Implementation

2019-09-02 Thread Krishna Reddy
Changes in v2:
- Prepare arm_smmu_flush_ops for override.
- Remove NVIDIA_SMMUv2 and use the ARM_SMMUv2 model, as the T194 SMMU hasn't
  modified ARM MMU-500.
- Add a T194-specific compatible string: "nvidia,tegra194-smmu".
- Remove the tlb_sync hook added in v1 and override
  arm_smmu_flush_ops->tlb_sync() from the implementation.
- Register implementation-specific context/global fault hooks directly for irq
  handling.
- Update the global/context interrupt list in DT and the relevant fault handling
  code in arm-smmu-nvidia.c.
- Implement a reset hook in arm-smmu-nvidia.c to clear irq status and sync the tlb.

v1 - https://lkml.org/lkml/2019/8/29/1588

Krishna Reddy (7):
  iommu/arm-smmu: prepare arm_smmu_flush_ops for override
  iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage
  dt-bindings: arm-smmu: Add binding for Tegra194 SMMU
  iommu/arm-smmu: Add global/context fault implementation hooks
  arm64: tegra: Add Memory controller DT node on T194
  arm64: tegra: Add DT node for T194 SMMU
  arm64: tegra: enable SMMU for SDHCI and EQOS on T194

 .../devicetree/bindings/iommu/arm,smmu.txt |   4 +
 MAINTAINERS|   2 +
 arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi |   4 +
 arch/arm64/boot/dts/nvidia/tegra194.dtsi   |  88 +++
 drivers/iommu/Makefile |   2 +-
 drivers/iommu/arm-smmu-impl.c  |   3 +
 drivers/iommu/arm-smmu-nvidia.c| 287 +
 drivers/iommu/arm-smmu.c   |  27 +-
 drivers/iommu/arm-smmu.h   |   8 +-
 9 files changed, 413 insertions(+), 12 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-nvidia.c

-- 
2.1.4



[PATCH v2 2/7] iommu/arm-smmu: add NVIDIA implementation for dual ARM MMU-500 usage

2019-09-02 Thread Krishna Reddy
NVIDIA's Tegra194 SoC uses two ARM MMU-500s together to interleave
IOVA accesses across them.
Add an NVIDIA implementation for dual ARM MMU-500s and add a new compatible
string for the Tegra194 SoC.

Signed-off-by: Krishna Reddy 
---
 MAINTAINERS |   2 +
 drivers/iommu/Makefile  |   2 +-
 drivers/iommu/arm-smmu-impl.c   |   3 +
 drivers/iommu/arm-smmu-nvidia.c | 187 
 drivers/iommu/arm-smmu.h|   1 +
 5 files changed, 194 insertions(+), 1 deletion(-)
 create mode 100644 drivers/iommu/arm-smmu-nvidia.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 74e9d9c..c9b802a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15807,9 +15807,11 @@ F: drivers/i2c/busses/i2c-tegra.c
 
 TEGRA IOMMU DRIVERS
 M: Thierry Reding 
+R: Krishna Reddy 
 L: linux-te...@vger.kernel.org
 S: Supported
 F: drivers/iommu/tegra*
+F: drivers/iommu/arm-smmu-nvidia.c
 
 TEGRA KBC DRIVER
 M: Laxman Dewangan 
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 7caad48..556b94c 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -13,7 +13,7 @@ obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
 obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o amd_iommu_quirks.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o
 obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
-obj-$(CONFIG_ARM_SMMU) += arm-smmu.o arm-smmu-impl.o
+obj-$(CONFIG_ARM_SMMU) += arm-smmu.o arm-smmu-impl.o arm-smmu-nvidia.o
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
 obj-$(CONFIG_DMAR_TABLE) += dmar.o
 obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o
diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index 5c87a38..1a19687 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -158,6 +158,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
arm_smmu_device *smmu)
 */
switch (smmu->model) {
case ARM_MMU500:
+   if (of_device_is_compatible(smmu->dev->of_node,
+   "nvidia,tegra194-smmu"))
+   return nvidia_smmu_impl_init(smmu);
smmu->impl = &arm_mmu500_impl;
break;
case CAVIUM_SMMUV2:
diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
new file mode 100644
index 000..ca871dc
--- /dev/null
+++ b/drivers/iommu/arm-smmu-nvidia.c
@@ -0,0 +1,187 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Nvidia ARM SMMU v2 implementation quirks
+// Copyright (C) 2019 NVIDIA CORPORATION.  All rights reserved.
+
+#define pr_fmt(fmt) "nvidia-smmu: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "arm-smmu.h"
+
+/* Tegra194 has three ARM MMU-500 Instances.
+ * Two of them are used together for Interleaved IOVA accesses and
+ * used by Non-Isochronous Hw devices for SMMU translations.
+ * Third one is used for SMMU translations from Isochronous HW devices.
+ * It is possible to use this Implementation to program either
+ * all three or two of the instances identically as desired through
+ * DT node.
+ *
+ * Programming all the three instances identically comes with redundant tlb
+ * invalidations as all three never need to be tlb invalidated for a HW device.
+ *
+ * When Linux Kernel supports multiple SMMU devices, The SMMU device used for
+ * Isochornous HW devices should be added as a separate ARM MMU-500 device
+ * in DT and be programmed independently for efficient tlb invalidates.
+ *
+ */
+#define MAX_SMMU_INSTANCES 3
+
+struct nvidia_smmu {
+   struct arm_smmu_device  smmu;
+   unsigned intnum_inst;
+   void __iomem*bases[MAX_SMMU_INSTANCES];
+};
+
+#define to_nvidia_smmu(s) container_of(s, struct nvidia_smmu, smmu)
+
+#define nsmmu_page(smmu, inst, page) \
+   (((inst) ? to_nvidia_smmu(smmu)->bases[(inst)] : smmu->base) + \
+   ((page) << smmu->pgshift))
+
+static u32 nsmmu_read_reg(struct arm_smmu_device *smmu,
+ int page, int offset)
+{
+   return readl_relaxed(nsmmu_page(smmu, 0, page) + offset);
+}
+
+static void nsmmu_write_reg(struct arm_smmu_device *smmu,
+   int page, int offset, u32 val)
+{
+   unsigned int i;
+
+   for (i = 0; i < to_nvidia_smmu(smmu)->num_inst; i++)
+   writel_relaxed(val, nsmmu_page(smmu, i, page) + offset);
+}
+
+static u64 nsmmu_read_reg64(struct arm_smmu_device *smmu,
+   int page, int offset)
+{
+   return readq_relaxed(nsmmu_page(smmu, 0, page) + offset);
+}
+
+static void nsmmu_write_reg64(struct arm_smmu_device *smmu,
+ int page, int offset, u64 val)
+{
+   unsigned int i;
+
+   for (i = 0; i < to_nvidia_smmu(smmu)->num_inst; i++)
+   writeq_relaxed(val, nsmmu_page(smmu, i, page) + offset);
+}
+
+static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, int page,
+  int s
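
The register accessors in the (truncated) patch above read from instance 0 only and replicate every write to all configured instances. A minimal userspace sketch of that idea follows. It is an illustration under stated assumptions, not the driver code: all mock_* names are hypothetical, and plain arrays stand in for the memory-mapped register pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_INSTANCES 3
#define MOCK_NUM_REGS      4

/* Recover the containing struct from a pointer to one of its members. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mock_smmu {		/* stands in for struct arm_smmu_device */
	int id;
};

struct mock_nvidia_smmu {	/* stands in for struct nvidia_smmu */
	struct mock_smmu smmu;
	unsigned int num_inst;
	uint32_t regs[MOCK_MAX_INSTANCES][MOCK_NUM_REGS]; /* fake register files */
};

static struct mock_nvidia_smmu *to_mock_nvidia(struct mock_smmu *smmu)
{
	return mock_container_of(smmu, struct mock_nvidia_smmu, smmu);
}

/* Reads come from instance 0 only. */
static uint32_t mock_read_reg(struct mock_smmu *smmu, unsigned int reg)
{
	return to_mock_nvidia(smmu)->regs[0][reg];
}

/* Writes are replicated to every configured instance. */
static void mock_write_reg(struct mock_smmu *smmu, unsigned int reg,
			   uint32_t val)
{
	struct mock_nvidia_smmu *n = to_mock_nvidia(smmu);
	unsigned int i;

	for (i = 0; i < n->num_inst; i++)
		n->regs[i][reg] = val;
}
```

Because every write goes to all instances, reading back from instance 0 is representative as long as nothing else programs the instances differently; that is an assumption of this sketch rather than something the patch text states.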

[PATCH v2 6/7] arm64: tegra: Add DT node for T194 SMMU

2019-09-02 Thread Krishna Reddy
Add DT node for T194 SMMU to enable SMMU support.

Signed-off-by: Krishna Reddy 
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 77 
 1 file changed, 77 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index d906958..5ae3bbf 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -1401,6 +1401,83 @@
  0x8200 0x0  0x4000 0x1f 0x4000 0x0 
0xc000>; /* non-prefetchable memory (3GB) */
};
 
+   smmu: iommu@1200 {
+   compatible = "arm,mmu-500","nvidia,tegra194-smmu";
+   reg = <0 0x1200 0 0x80>,
+ <0 0x1100 0 0x80>,
+ <0 0x1000 0 0x80>;
+   interrupts = ,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+;
+   stream-match-mask = <0x7f80>;
+   #global-interrupts = <3>;
+   #iommu-cells = <1>;
+   };
+
sysram@4000 {
compatible = "nvidia,tegra194-sysram", "mmio-sram";
reg = <0x0 0x4000 0x0 0x5>;
-- 
2.1.4



[PATCH v2 1/7] iommu/arm-smmu: prepare arm_smmu_flush_ops for override

2019-09-02 Thread Krishna Reddy
Remove const keyword for arm_smmu_flush_ops in arm_smmu_domain
and replace direct references to arm_smmu_tlb_sync* functions with
arm_smmu_flush_ops->tlb_sync().
This is necessary for vendor specific implementations that
need to override arm_smmu_flush_ops in part or full.

Signed-off-by: Krishna Reddy 
---
 drivers/iommu/arm-smmu.c | 16 
 drivers/iommu/arm-smmu.h |  4 +++-
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5b93c79..16b5c54 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -52,9 +52,6 @@
  */
 #define QCOM_DUMMY_VAL -1
 
-#define TLB_LOOP_TIMEOUT   100 /* 1s! */
-#define TLB_SPIN_COUNT 10
-
 #define MSI_IOVA_BASE  0x800
 #define MSI_IOVA_LENGTH0x10
 
@@ -290,6 +287,8 @@ static void arm_smmu_tlb_sync_vmid(void *cookie)
 static void arm_smmu_tlb_inv_context_s1(void *cookie)
 {
struct arm_smmu_domain *smmu_domain = cookie;
+   const struct arm_smmu_flush_ops *ops = smmu_domain->flush_ops;
+
/*
 * The TLBI write may be relaxed, so ensure that PTEs cleared by the
 * current CPU are visible beforehand.
@@ -297,18 +296,19 @@ static void arm_smmu_tlb_inv_context_s1(void *cookie)
wmb();
arm_smmu_cb_write(smmu_domain->smmu, smmu_domain->cfg.cbndx,
  ARM_SMMU_CB_S1_TLBIASID, smmu_domain->cfg.asid);
-   arm_smmu_tlb_sync_context(cookie);
+   ops->tlb_sync(cookie);
 }
 
 static void arm_smmu_tlb_inv_context_s2(void *cookie)
 {
struct arm_smmu_domain *smmu_domain = cookie;
struct arm_smmu_device *smmu = smmu_domain->smmu;
+   const struct arm_smmu_flush_ops *ops = smmu_domain->flush_ops;
 
/* See above */
wmb();
arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_TLBIVMID, smmu_domain->cfg.vmid);
-   arm_smmu_tlb_sync_global(smmu);
+   ops->tlb_sync(cookie);
 }
 
 static void arm_smmu_tlb_inv_range_s1(unsigned long iova, size_t size,
@@ -410,7 +410,7 @@ static void arm_smmu_tlb_add_page(struct iommu_iotlb_gather 
*gather,
ops->tlb_inv_range(iova, granule, granule, true, cookie);
 }
 
-static const struct arm_smmu_flush_ops arm_smmu_s1_tlb_ops = {
+static struct arm_smmu_flush_ops arm_smmu_s1_tlb_ops = {
.tlb = {
.tlb_flush_all  = arm_smmu_tlb_inv_context_s1,
.tlb_flush_walk = arm_smmu_tlb_inv_walk,
@@ -421,7 +421,7 @@ static const struct arm_smmu_flush_ops arm_smmu_s1_tlb_ops 
= {
.tlb_sync   = arm_smmu_tlb_sync_context,
 };
 
-static const struct arm_smmu_flush_ops arm_smmu_s2_tlb_ops_v2 = {
+static struct arm_smmu_flush_ops arm_smmu_s2_tlb_ops_v2 = {
.tlb = {
.tlb_flush_all  = arm_smmu_tlb_inv_context_s2,
.tlb_flush_walk = arm_smmu_tlb_inv_walk,
@@ -432,7 +432,7 @@ static const struct arm_smmu_flush_ops 
arm_smmu_s2_tlb_ops_v2 = {
.tlb_sync   = arm_smmu_tlb_sync_context,
 };
 
-static const struct arm_smmu_flush_ops arm_smmu_s2_tlb_ops_v1 = {
+static struct arm_smmu_flush_ops arm_smmu_s2_tlb_ops_v1 = {
.tlb = {
.tlb_flush_all  = arm_smmu_tlb_inv_context_s2,
.tlb_flush_walk = arm_smmu_tlb_inv_walk,
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index b19b6ca..b2d6c7f 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -207,6 +207,8 @@ enum arm_smmu_cbar_type {
 /* Maximum number of context banks per SMMU */
 #define ARM_SMMU_MAX_CBS   128
 
+#define TLB_LOOP_TIMEOUT   100 /* 1s! */
+#define TLB_SPIN_COUNT 10
 
 /* Shared driver definitions */
 enum arm_smmu_arch_version {
@@ -314,7 +316,7 @@ struct arm_smmu_flush_ops {
 struct arm_smmu_domain {
struct arm_smmu_device  *smmu;
struct io_pgtable_ops   *pgtbl_ops;
-   const struct arm_smmu_flush_ops *flush_ops;
+   struct arm_smmu_flush_ops   *flush_ops;
struct arm_smmu_cfg cfg;
enum arm_smmu_domain_stage  stage;
boolnon_strict;
-- 
2.1.4
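
The effect of dropping const from the flush ops and calling tlb_sync through the ops pointer can be shown with a small userspace sketch. Everything named mock_* below is hypothetical and only illustrates the mechanism; it makes no claim about how arm-smmu-nvidia.c actually installs its callback.

```c
#include <assert.h>

struct mock_flush_ops {
	void (*tlb_sync)(void *cookie);
};

struct mock_domain {
	struct mock_flush_ops *flush_ops; /* non-const so it can be patched */
	int default_syncs;
	int vendor_syncs;
};

static void mock_default_sync(void *cookie)
{
	((struct mock_domain *)cookie)->default_syncs++;
}

static void mock_vendor_sync(void *cookie)
{
	((struct mock_domain *)cookie)->vendor_syncs++;
}

static struct mock_flush_ops mock_ops = {
	.tlb_sync = mock_default_sync,
};

/* Callers go through the ops table instead of naming the sync function. */
static void mock_inv_context(struct mock_domain *domain)
{
	domain->flush_ops->tlb_sync(domain);
}

/* What a vendor init hook might do: swap in its own sync callback. */
static void mock_vendor_init(struct mock_domain *domain)
{
	domain->flush_ops->tlb_sync = mock_vendor_sync;
}
```

Note that in this sketch the ops table is a shared static object, so patching it changes behaviour for every domain that points at it. Whether that is acceptable depends on the driver; a per-domain copy would be the alternative.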



[PATCH v2 4/7] iommu/arm-smmu: Add global/context fault implementation hooks

2019-09-02 Thread Krishna Reddy
Add global/context fault hooks to allow the NVIDIA SMMU implementation
to handle faults across multiple SMMUs.

Signed-off-by: Krishna Reddy 
---
 drivers/iommu/arm-smmu-nvidia.c | 100 
 drivers/iommu/arm-smmu.c|  11 -
 drivers/iommu/arm-smmu.h|   3 ++
 3 files changed, 112 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
index ca871dc..2a19d41 100644
--- a/drivers/iommu/arm-smmu-nvidia.c
+++ b/drivers/iommu/arm-smmu-nvidia.c
@@ -143,6 +143,104 @@ static int nsmmu_init_context(struct arm_smmu_domain 
*smmu_domain)
return 0;
 }
 
+static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
+{
+   return container_of(dom, struct arm_smmu_domain, domain);
+}
+
+static irqreturn_t nsmmu_global_fault_inst(int irq,
+  struct arm_smmu_device *smmu,
+  int inst)
+{
+   u32 gfsr, gfsynr0, gfsynr1, gfsynr2;
+
+   gfsr = readl_relaxed(nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR);
+   gfsynr0 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
+   ARM_SMMU_GR0_sGFSYNR0);
+   gfsynr1 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
+   ARM_SMMU_GR0_sGFSYNR1);
+   gfsynr2 = readl_relaxed(nsmmu_page(smmu, inst, 0) +
+   ARM_SMMU_GR0_sGFSYNR2);
+
+   if (!gfsr)
+   return IRQ_NONE;
+
+   dev_err_ratelimited(smmu->dev,
+   "Unexpected global fault, this could be serious\n");
+   dev_err_ratelimited(smmu->dev,
+   "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 
0x%08x\n",
+   gfsr, gfsynr0, gfsynr1, gfsynr2);
+
+   writel_relaxed(gfsr, nsmmu_page(smmu, inst, 0) + ARM_SMMU_GR0_sGFSR);
+   return IRQ_HANDLED;
+}
+
+static irqreturn_t nsmmu_global_fault(int irq, void *dev)
+{
+   int inst;
+   irqreturn_t irq_ret = IRQ_NONE;
+   struct arm_smmu_device *smmu = dev;
+
+   for (inst = 0; inst < to_nvidia_smmu(smmu)->num_inst; inst++) {
+   irq_ret = nsmmu_global_fault_inst(irq, smmu, inst);
+   if (irq_ret == IRQ_HANDLED)
+   return irq_ret;
+   }
+
+   return irq_ret;
+}
+
+static irqreturn_t nsmmu_context_fault_bank(int irq,
+   struct arm_smmu_device *smmu,
+   int idx, int inst)
+{
+   u32 fsr, fsynr, cbfrsynra;
+   unsigned long iova;
+
+   fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR);
+   if (!(fsr & FSR_FAULT))
+   return IRQ_NONE;
+
+   fsynr = readl_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) +
+ ARM_SMMU_CB_FSYNR0);
+   iova = readq_relaxed(nsmmu_page(smmu, inst, smmu->numpage + idx) +
+ARM_SMMU_CB_FAR);
+   cbfrsynra = readl_relaxed(nsmmu_page(smmu, inst, 1) +
+ ARM_SMMU_GR1_CBFRSYNRA(idx));
+
+   dev_err_ratelimited(smmu->dev,
+   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, 
cbfrsynra=0x%x, cb=%d\n",
+   fsr, iova, fsynr, cbfrsynra, idx);
+
+   writel_relaxed(fsr, nsmmu_page(smmu, inst, smmu->numpage + idx) +
+   ARM_SMMU_CB_FSR);
+   return IRQ_HANDLED;
+}
+
+static irqreturn_t nsmmu_context_fault(int irq, void *dev)
+{
+   int inst, idx;
+   irqreturn_t irq_ret = IRQ_NONE;
+   struct iommu_domain *domain = dev;
+   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+   for (inst = 0; inst < to_nvidia_smmu(smmu)->num_inst; inst++) {
+   /* Interrupt line shared between all context faults.
+* Check for faults across all contexts.
+*/
+   for (idx = 0; idx < smmu->num_context_banks; idx++) {
+   irq_ret = nsmmu_context_fault_bank(irq, smmu,
+  idx, inst);
+
+   if (irq_ret == IRQ_HANDLED)
+   return irq_ret;
+   }
+   }
+
+   return irq_ret;
+}
+
 static const struct arm_smmu_impl nvidia_smmu_impl = {
.read_reg = nsmmu_read_reg,
.write_reg = nsmmu_write_reg,
@@ -150,6 +248,8 @@ static const struct arm_smmu_impl nvidia_smmu_impl = {
.write_reg64 = nsmmu_write_reg64,
.reset = nsmmu_reset,
.init_context = nsmmu_init_context,
+   .global_fault = nsmmu_global_fault,
+   .context_fault = nsmmu_context_fault,
 };
 
 struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 16b5c54..7811e7d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-s
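
The global fault handler in the (truncated) patch above walks the instances in order and returns as soon as one of them reports a fault. The following userspace sketch mimics that control flow. The mock_* names and the array of fake status registers are assumptions made for illustration, not driver code.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_NUM_INST 2

enum mock_irqreturn { MOCK_IRQ_NONE = 0, MOCK_IRQ_HANDLED = 1 };

/* One fake global fault status register per instance. */
static uint32_t mock_gfsr[MOCK_NUM_INST];

static enum mock_irqreturn mock_fault_inst(unsigned int inst)
{
	uint32_t gfsr = mock_gfsr[inst];

	if (!gfsr)
		return MOCK_IRQ_NONE;

	/*
	 * Acknowledge the fault. The patch writes the value back to the
	 * status register; the mock simply zeroes its fake register.
	 */
	mock_gfsr[inst] = 0;
	return MOCK_IRQ_HANDLED;
}

static enum mock_irqreturn mock_global_fault(void)
{
	unsigned int inst;

	/* Stop at the first instance that had a pending fault. */
	for (inst = 0; inst < MOCK_NUM_INST; inst++) {
		if (mock_fault_inst(inst) == MOCK_IRQ_HANDLED)
			return MOCK_IRQ_HANDLED;
	}
	return MOCK_IRQ_NONE;
}
```

One consequence of returning early is that when several instances have faults pending, a single invocation acknowledges only the first one; in this sketch the remaining ones are picked up by later invocations.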

[PATCH v2 5/7] arm64: tegra: Add Memory controller DT node on T194

2019-09-02 Thread Krishna Reddy
Add a memory controller DT node on T194 and enable it.
This patch is a prerequisite for enabling the SMMU on T194.

Signed-off-by: Krishna Reddy 
---
 arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi | 4 
 arch/arm64/boot/dts/nvidia/tegra194.dtsi   | 7 +++
 2 files changed, 11 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
index 62e07e11..4b3441b 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
@@ -47,6 +47,10 @@
};
};
 
+   memory-controller@2c0 {
+   status = "okay";
+   };
+
serial@311 {
status = "okay";
};
diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index adebbbf..d906958 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 
 / {
compatible = "nvidia,tegra194";
@@ -130,6 +131,12 @@
};
};
 
+   memory-controller@2c0 {
+   compatible = "nvidia,tegra186-mc";
+   reg = <0x02c0 0xb>;
+   status = "disabled";
+   };
+
uarta: serial@310 {
compatible = "nvidia,tegra194-uart", 
"nvidia,tegra20-uart";
reg = <0x0310 0x40>;
-- 
2.1.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH -next] iommu/arm-smmu-v3: Fix build error without CONFIG_PCI_ATS

2019-09-02 Thread YueHaibing
If CONFIG_PCI_ATS is not set, building fails:

drivers/iommu/arm-smmu-v3.c: In function arm_smmu_ats_supported:
drivers/iommu/arm-smmu-v3.c:2325:35: error: struct pci_dev has no member named 
ats_cap; did you mean msi_cap?
  return !pdev->untrusted && pdev->ats_cap;
   ^~~

ats_cap should only be used when CONFIG_PCI_ATS is defined,
so use an #ifdef block to guard this.

Fixes: bfff88ec1afe ("iommu/arm-smmu-v3: Rework enabling/disabling of ATS for 
PCI masters")
Signed-off-by: YueHaibing 
---
 drivers/iommu/arm-smmu-v3.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 66bf641..44ac9ac 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2313,7 +2313,7 @@ static void arm_smmu_install_ste_for_dev(struct 
arm_smmu_master *master)
 
 static bool arm_smmu_ats_supported(struct arm_smmu_master *master)
 {
-   struct pci_dev *pdev;
+   struct pci_dev *pdev __maybe_unused;
struct arm_smmu_device *smmu = master->smmu;
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
 
@@ -2321,8 +2321,10 @@ static bool arm_smmu_ats_supported(struct 
arm_smmu_master *master)
!(fwspec->flags & IOMMU_FWSPEC_PCI_RC_ATS) || pci_ats_disabled())
return false;
 
+#ifdef CONFIG_PCI_ATS
pdev = to_pci_dev(master->dev);
return !pdev->untrusted && pdev->ats_cap;
+#endif
 }
 
 static void arm_smmu_enable_ats(struct arm_smmu_master *master)
-- 
2.7.4
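
One hazard of guarding only the tail of the function body, as in the hunk above, is that with CONFIG_PCI_ATS unset the function no longer ends in a return statement. A minimal user-space sketch of the alternative, supplying a complete stub function for the disabled case, might look like the following. CONFIG_FAKE_ATS, struct fake_pci_dev and fake_ats_supported() are hypothetical names for illustration only, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: the ats_cap member only
 * exists when the (hypothetical) CONFIG_FAKE_ATS option is defined. */
struct fake_pci_dev {
	bool untrusted;
#ifdef CONFIG_FAKE_ATS
	unsigned short ats_cap;
#endif
};

#ifdef CONFIG_FAKE_ATS
/* Full version: may touch ats_cap because the member exists. */
static bool fake_ats_supported(const struct fake_pci_dev *pdev)
{
	assert(pdev != NULL);
	return !pdev->untrusted && pdev->ats_cap;
}
#else
/* Stub version: never references ats_cap and always returns a value. */
static bool fake_ats_supported(const struct fake_pci_dev *pdev)
{
	assert(pdev != NULL);
	return false;
}
#endif
```

Built without -DCONFIG_FAKE_ATS, the stub compiles cleanly and every caller sees "not supported"; built with it, the real check is used. Both variants return on every path.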




"Rework enabling/disabling of ATS for PCI masters" failed to compile on arm64

2019-09-02 Thread Qian Cai
The linux-next commit “iommu/arm-smmu-v3: Rework enabling/disabling of ATS for 
PCI masters” [1] causes a compilation error when PCI_ATS=n on arm64.

[1] https://lore.kernel.org/linux-iommu/20190820154549.17018-3-w...@kernel.org/

drivers/iommu/arm-smmu-v3.c:2325:35: error: no member named 'ats_cap' in 
'struct pci_dev'
return !pdev->untrusted && pdev->ats_cap;
     ^

For example,

Symbol: PCI_ATS [=n]
  │ Type  : bool
  │   Defined at drivers/pci/Kconfig:118
  │   Depends on: PCI [=y] 
  │   Selected by [n]: 
  │   - PCI_IOV [=n] && PCI [=y] 
  │   - PCI_PRI [=n] && PCI [=y]
  │   - PCI_PASID [=n] && PCI [=y]
  │   - AMD_IOMMU [=n] && IOMMU_SUPPORT [=y] && X86_64 && PCI [=y] && ACPI [=y]

Re: [RFC PATCH] iommu/vt-d: Fix IOMMU field not populated on device hot re-plug

2019-09-02 Thread Lu Baolu

Hi Janusz,

On 9/2/19 4:37 PM, Janusz Krzysztofik wrote:

I am not saying that keeping data is not acceptable. I just want to
check whether there are any other solutions.

Then reverting 458b7c8e0dde and applying this patch still resolves the issue
for me.  No errors appear when mappings are unmapped on device close after the
device has been removed, and domain info preserved on device removal is
successfully reused on device re-plug.


This patch doesn't look good to me although I agree that keeping data is
acceptable. It updates dev->archdata.iommu, but leaves the hardware
context/pasid table unchanged. This might cause problems somewhere.



Is there anything else I can do to help?


Can you please tell me how to reproduce the problem? Keeping the per
device domain info while device is unplugged is a bit dangerous because
info->dev might be a wild pointer. We need to work out a clean fix.



Thanks,
Janusz



Best regards,
Baolu


RE: [PATCH 1/7] iommu/arm-smmu: add Nvidia SMMUv2 implementation

2019-09-02 Thread Krishna Reddy
>>> +ARM_SMMU_MATCH_DATA(nvidia_smmuv2, ARM_SMMU_V2, NVIDIA_SMMUV2);
 
>> The ARM MMU-500 implementation is unmodified.  It is the way they are 
>> integrated and used together (for interleaved accesses) that is different 
>> from a regular ARM MMU-500.
>> I have added it to get the model number and to be able to differentiate the 
>> SMMU implementation in arm-smmu-impl.c.

>In that case, I would rather keep smmu->model representing the MMU-500 
>microarchitecture - 
>since you'll still want to pick up errata workarounds etc. for that - and 
>detect the Tegra integration via an explicit of_device_is_compatible()
> check in arm_smmu_impl_init().

Looks good to me. 

>For comparison, under ACPI we'd probably have to detect integration details by 
>looking at table headers, separately
> from the IORT "Model" field, so I'd prefer if the DT vs. ACPI handling didn't 
> diverge more than necessary.

ACPI support for T194 can be added based on need in subsequent patches. For 
now, I am updating it for DT support.

-KR


Re: [PATCH v10 3/4] block: add a helper function to merge the segments

2019-09-02 Thread Jens Axboe
On 8/28/19 6:35 AM, Yoshihiro Shimoda wrote:
> This patch adds a helper function whether a queue can merge
> the segments by the DMA MAP layer (e.g. via IOMMU).

Reviewed-by: Jens Axboe 

-- 
Jens Axboe



Re: [PATCH] PCI: Move ATS declarations to linux/pci.h

2019-09-02 Thread Bjorn Helgaas
[+cc Kelsey]

On Mon, Sep 02, 2019 at 04:11:00PM -0500, Bjorn Helgaas wrote:
> On Fri, Aug 30, 2019 at 09:18:40AM -0700, Christoph Hellwig wrote:
> > On Fri, Aug 30, 2019 at 05:07:56PM +0200, Krzysztof Wilczynski wrote:
> > > Move ATS function prototypes from include/linux/pci-ats.h to
> > > include/linux/pci.h so users only need to include :
> > 
> > Why is that so important?  Very few PCI(e) device drivers use ATS,
> > so keeping it out of everyones include hell doesn't seem all bad.
> 
> This was my idea, and it wasn't a good one, sorry.
> 
> The ATS, PRI, and PASID interfaces are all sort of related and are
> used only by the IOMMU drivers, so it probably makes sense to put them
> all together.  Right now the ATS stuff is in linux/pci.h and PRI/PASID
> stuff is in linux/pci-ats.h.  Maybe the right thing would be to move
> the ATS stuff to pci-ats.h.
> 
> I previously moved it from pci-ats.h to pci.h with ff9bee895c4d ("PCI:
> Move ATS declarations to linux/pci.h so they're all together") with
> the excuse of putting the external ATS interfaces next to
> pci_ats_init().  But that really looks like it was a mistake because
> pci_ats_init() is a PCI-internal thing and its declaration should
> probably be in drivers/pci/pci.h instead.

Never mind the pci_ats_init() part; Kelsey has already moved that:
https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/commit/?id=b92b512a435d


Re: [PATCH] PCI: Move ATS declarations to linux/pci.h

2019-09-02 Thread Bjorn Helgaas
On Fri, Aug 30, 2019 at 09:18:40AM -0700, Christoph Hellwig wrote:
> On Fri, Aug 30, 2019 at 05:07:56PM +0200, Krzysztof Wilczynski wrote:
> > Move ATS function prototypes from include/linux/pci-ats.h to
> > include/linux/pci.h so users only need to include :
> 
> Why is that so important?  Very few PCI(e) device drivers use ATS,
> so keeping it out of everyones include hell doesn't seem all bad.

This was my idea, and it wasn't a good one, sorry.

The ATS, PRI, and PASID interfaces are all sort of related and are
used only by the IOMMU drivers, so it probably makes sense to put them
all together.  Right now the ATS stuff is in linux/pci.h and PRI/PASID
stuff is in linux/pci-ats.h.  Maybe the right thing would be to move
the ATS stuff to pci-ats.h.

I previously moved it from pci-ats.h to pci.h with ff9bee895c4d ("PCI:
Move ATS declarations to linux/pci.h so they're all together") with
the excuse of putting the external ATS interfaces next to
pci_ats_init().  But that really looks like it was a mistake because
pci_ats_init() is a PCI-internal thing and its declaration should
probably be in drivers/pci/pci.h instead.

There's also a useless "struct pci_ats" forward declaration in
linux/pci.h that I should have removed with d544d75ac96a ("PCI: Embed
ATS info directly into struct pci_dev").

Bjorn


[PATCH 4/4] dma-mapping: remove the dma_declare_coherent_memory export

2019-09-02 Thread Christoph Hellwig
dma_declare_coherent_memory is something that the platform setup code
(which pretty much means the device tree these days) needs to do so that
drivers can use the memory as declared by the platform.  Drivers
themselves have no business calling this function.

Signed-off-by: Christoph Hellwig 
---
 kernel/dma/coherent.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 7cafe1affdc9..545e3869b0e3 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -122,7 +122,6 @@ int dma_declare_coherent_memory(struct device *dev, 
phys_addr_t phys_addr,
dma_release_coherent_memory(mem);
return ret;
 }
-EXPORT_SYMBOL(dma_declare_coherent_memory);
 
 static void *__dma_alloc_from_coherent(struct dma_coherent_mem *mem,
ssize_t size, dma_addr_t *dma_handle)
-- 
2.20.1



[PATCH 2/4] dma-mapping: remove the dma_mmap_from_dev_coherent export

2019-09-02 Thread Christoph Hellwig
dma_mmap_from_dev_coherent is only used by dma_map_ops instances,
none of which is modular.

Signed-off-by: Christoph Hellwig 
---
 kernel/dma/coherent.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 7271cda86a37..7cafe1affdc9 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -277,7 +277,6 @@ int dma_mmap_from_dev_coherent(struct device *dev, struct 
vm_area_struct *vma,
 
return __dma_mmap_from_coherent(mem, vma, vaddr, size, ret);
 }
-EXPORT_SYMBOL(dma_mmap_from_dev_coherent);
 
 int dma_mmap_from_global_coherent(struct vm_area_struct *vma, void *vaddr,
   size_t size, int *ret)
-- 
2.20.1



[PATCH 3/4] remoteproc: don't allow modular build

2019-09-02 Thread Christoph Hellwig
Remoteproc started using dma_declare_coherent_memory recently, which is
a bad idea from drivers, and the maintainers agreed to fix that.  But
until that is fixed only allow building the driver built in so that we
can remove the dma_declare_coherent_memory export and prevent other
drivers from "accidentally" using it like remoteproc.  Note that the
driver would also leak the declared coherent memory on unload if it
actually was built as a module at the moment.

Signed-off-by: Christoph Hellwig 
---
 drivers/remoteproc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 28ed306982f7..94afdde4bc9f 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -2,7 +2,7 @@
 menu "Remoteproc drivers"
 
 config REMOTEPROC
-   tristate "Support for Remote Processor subsystem"
+   bool "Support for Remote Processor subsystem"
depends on HAS_DMA
select CRC32
select FW_LOADER
-- 
2.20.1



remove various dma_declare_coherent related exports

2019-09-02 Thread Christoph Hellwig
Hi all,

this is a refresh of an older series that tries to ensure that
drivers don't use the dma_declare_coherent function, which is
intended for platform code.  Unfortunately we've actually grown
a user in remoteproc since then.  While the maintainers have
promised to fix that up, that hasn't happened so far, so for now
this disables the modular build for remoteproc until that has been
solved.


[PATCH 1/4] dma-mapping: remove dma_release_declared_memory

2019-09-02 Thread Christoph Hellwig
This function is entirely unused given that declared memory is
generally provided by platform setup code.

Signed-off-by: Christoph Hellwig 
---
 Documentation/DMA-API.txt   | 11 ---
 include/linux/dma-mapping.h |  6 --
 kernel/dma/coherent.c   | 11 ---
 3 files changed, 28 deletions(-)

diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index e47c63bd4887..c0865ca664b8 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
@@ -595,17 +595,6 @@ For reasons of efficiency, most platforms choose to track 
the declared
 region only at the granularity of a page.  For smaller allocations,
 you should use the dma_pool() API.
 
-::
-
-   void
-   dma_release_declared_memory(struct device *dev)
-
-Remove the memory region previously declared from the system.  This
-API performs *no* in-use checking for this region and will return
-unconditionally having removed all the required structures.  It is the
-driver's job to ensure that no parts of this memory region are
-currently in use.
-
 Part III - Debug drivers use of the DMA-API
 ---
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 48ebe8295987..165cd61f1c6e 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -753,7 +753,6 @@ static inline int dma_get_cache_alignment(void)
 #ifdef CONFIG_DMA_DECLARE_COHERENT
 int dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
dma_addr_t device_addr, size_t size);
-void dma_release_declared_memory(struct device *dev);
 #else
 static inline int
 dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
@@ -761,11 +760,6 @@ dma_declare_coherent_memory(struct device *dev, 
phys_addr_t phys_addr,
 {
return -ENOSYS;
 }
-
-static inline void
-dma_release_declared_memory(struct device *dev)
-{
-}
 #endif /* CONFIG_DMA_DECLARE_COHERENT */
 
 static inline void *dmam_alloc_coherent(struct device *dev, size_t size,
diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 29fd6590dc1e..7271cda86a37 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -124,17 +124,6 @@ int dma_declare_coherent_memory(struct device *dev, 
phys_addr_t phys_addr,
 }
 EXPORT_SYMBOL(dma_declare_coherent_memory);
 
-void dma_release_declared_memory(struct device *dev)
-{
-   struct dma_coherent_mem *mem = dev->dma_mem;
-
-   if (!mem)
-   return;
-   dma_release_coherent_memory(mem);
-   dev->dma_mem = NULL;
-}
-EXPORT_SYMBOL(dma_release_declared_memory);
-
 static void *__dma_alloc_from_coherent(struct dma_coherent_mem *mem,
ssize_t size, dma_addr_t *dma_handle)
 {
-- 
2.20.1



Re: [PATCH 1/2] iommu: Implement of_iommu_get_resv_regions()

2019-09-02 Thread Thierry Reding
On Mon, Sep 02, 2019 at 02:54:23PM +0100, Robin Murphy wrote:
> On 29/08/2019 12:14, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > This is an implementation that IOMMU drivers can use to obtain reserved
> > memory regions from a device tree node. It uses the reserved-memory DT
> > bindings to find the regions associated with a given device. These
> > regions will be used to create 1:1 mappings in the IOMMU domain that
> > the devices will be attached to.
> > 
> > Cc: Rob Herring 
> > Cc: Frank Rowand 
> > Cc: devicet...@vger.kernel.org
> > Signed-off-by: Thierry Reding 
> > ---
> >   drivers/iommu/of_iommu.c | 39 +++
> >   include/linux/of_iommu.h |  8 
> >   2 files changed, 47 insertions(+)
> > 
> > diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> > index 614a93aa5305..0d47f626b854 100644
> > --- a/drivers/iommu/of_iommu.c
> > +++ b/drivers/iommu/of_iommu.c
> > @@ -9,6 +9,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >   #include 
> >   #include 
> >   #include 
> > @@ -225,3 +226,41 @@ const struct iommu_ops *of_iommu_configure(struct 
> > device *dev,
> > return ops;
> >   }
> > +
> > +/**
> > + * of_iommu_get_resv_regions - reserved region driver helper for device 
> > tree
> > + * @dev: device for which to get reserved regions
> > + * @list: reserved region list
> > + *
> > + * IOMMU drivers can use this to implement their .get_resv_regions() 
> > callback
> > + * for memory regions attached to a device tree node. See the 
> > reserved-memory
> > + * device tree bindings on how to use these:
> > + *
> > + *   Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > + */
> > +void of_iommu_get_resv_regions(struct device *dev, struct list_head *list)
> > +{
> > +   struct of_phandle_iterator it;
> > +   int err;
> > +
> > +   of_for_each_phandle(&it, err, dev->of_node, "memory-region", NULL, 0) {
> > +   struct iommu_resv_region *region;
> > +   struct resource res;
> > +
> > +   err = of_address_to_resource(it.node, 0, &res);
> > +   if (err < 0) {
> > +   dev_err(dev, "failed to parse memory region %pOF: %d\n",
> > +   it.node, err);
> > +   continue;
> > +   }
> 
> What if the device node has memory regions for other purposes, like private
> CMA carveouts? We wouldn't want to force mappings of those (and in the very
> worst case doing so could even render them unusable).

I suppose we could come up with additional properties to mark such
memory regions and skip them here.
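
As a rough user-space illustration of that idea (not the kernel implementation), the filtering could look like the sketch below. The iommu_direct flag stands in for a marker property that does not exist in any binding today, and struct mem_region and collect_direct_regions() are likewise hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one "memory-region" reference on a device node. */
struct mem_region {
	uint64_t start;
	uint64_t size;
	bool iommu_direct;	/* hypothetical marker: region wants a 1:1 mapping */
};

/*
 * Copy only the regions marked for direct mapping into out[], skipping
 * everything else (for example private CMA carveouts).
 * Returns the number of regions copied, at most out_len.
 */
static size_t collect_direct_regions(const struct mem_region *in, size_t n,
				     struct mem_region *out, size_t out_len)
{
	size_t count = 0;

	assert(in != NULL && out != NULL);
	for (size_t i = 0; i < n && count < out_len; i++) {
		if (!in[i].iommu_direct)
			continue;	/* not marked: do not force a mapping */
		out[count++] = in[i];
	}
	return count;
}
```

Unmarked regions are simply left alone here, so a carveout used for some other purpose would never get an identity mapping forced onto it.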

One other alternative might be to make sure that the driver claims
the memory regions that have been mapped and then tells the IOMMU to
undo the mappings for them. That way the driver could set up the new
mappings, reprogram the hardware and then have the old mappings torn
down. I'm not sure that could always be done in a race-free way. For
example, what if the new mappings need to be in a region, such as a
private CMA carveout, that's already mapped. Can we temporarily map
one physical address to two DMA addresses?

The details here probably depend on the IOMMU hardware.

Thierry



Re: [PATCH 2/2] iommu: dma: Use of_iommu_get_resv_regions()

2019-09-02 Thread Thierry Reding
On Mon, Sep 02, 2019 at 03:22:35PM +0100, Robin Murphy wrote:
> On 29/08/2019 12:14, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > For device tree nodes, use the standard of_iommu_get_resv_regions()
> > implementation to obtain the reserved memory regions associated with a
> > device.
> 
> This covers the window between iommu_probe_device() setting up a default
> domain and the device's driver finally probing and taking control, but
> iommu_probe_device() represents the point that the IOMMU driver first knows
> about this device - there's still a window from whenever the IOMMU driver
> itself probed up to here where the "unidentified" traffic may have already
> been disrupted. Some IOMMU drivers have no option but to make the necessary
> configuration during their own probe routine, at which point a struct device
> for the display/etc. endpoint may not even exist yet.

Yeah, I think I'm actually running into this issue with the ARM SMMU
driver. The above works fine with the Tegra SMMU driver, though, because
it doesn't touch the SMMU configuration until a device is attached to a
domain.

For anything earlier than iommu_probe_device(), I don't see a way of
doing this generically. I've been working on a prototype to make these
reserved memory regions early on for ARM SMMU but I've been failing so
far. I think it would possibly work if we just switched the default for
stream IDs to be "bypass" if they have any devices that have reserved
memory regions, but again, this isn't quite working (yet).

Thierry

> > Cc: Rob Herring 
> > Cc: Frank Rowand 
> > Cc: devicet...@vger.kernel.org
> > Signed-off-by: Thierry Reding 
> > ---
> >   drivers/iommu/dma-iommu.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index de68b4a02aea..31d48e55ab55 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -19,6 +19,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >   #include 
> >   #include 
> >   #include 
> > @@ -164,6 +165,8 @@ void iommu_dma_get_resv_regions(struct device *dev, 
> > struct list_head *list)
> > if (!is_of_node(dev_iommu_fwspec_get(dev)->iommu_fwnode))
> > iort_iommu_msi_get_resv_regions(dev, list);
> > +   if (dev->of_node)
> > +   of_iommu_get_resv_regions(dev, list);
> >   }
> >   EXPORT_SYMBOL(iommu_dma_get_resv_regions);
> > 



Re: [PATCH v1 2/2] of: Let of_for_each_phandle fallback to non-negative cell_count

2019-09-02 Thread Rob Herring
On Sat, Aug 24, 2019 at 03:28:46PM +0200, Uwe Kleine-König wrote:
> Referencing device tree nodes from a property allows to pass arguments.
> This is for example used for referencing gpios. This looks as follows:
> 
>   gpio_ctrl: gpio-controller {
>   #gpio-cells = <2>
>   ...
>   }
> 
>   someothernode {
>   gpios = <&gpio_ctrl 5 0 &gpio_ctrl 3 0>;
>   ...
>   }
> 
> To know the number of arguments this must be either fixed, or the
> referenced node is checked for a $cells_name (here: "#gpio-cells")
> property and with this information the start of the second reference can
> be determined.
> 
> Currently regulators are referenced with no additional arguments. To
> allow some optional arguments without having to change all referenced
> nodes this change introduces a way to specify a default cell_count. So
> when a phandle is parsed we check for the $cells_name property and use
> it as before if present. If it is not present we fall back to
> cells_count if non-negative and only fail if cells_count is smaller than
> zero.
> 
> Signed-off-by: Uwe Kleine-König 
> ---
>  drivers/of/base.c | 25 +
>  1 file changed, 17 insertions(+), 8 deletions(-)

Looks fine to me. I can apply with an ack from the iommu folks on patch 
1.

Rob
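
For reference, the fallback rule from the quoted commit message can be sketched in plain C as below. This is a simplified user-space model, not the code in drivers/of/base.c: resolve_cell_count() and count_references() are hypothetical helpers, an absent "#...-cells" property is encoded as -1 by assumption, and every reference is assumed to carry the same number of argument cells.

```c
#include <assert.h>
#include <errno.h>

/*
 * Decide how many argument cells follow a phandle.
 * cells_prop: value of the target's "#...-cells" property, or -1 if absent
 * fallback:   default cell count, or a negative value for "no default"
 */
static int resolve_cell_count(int cells_prop, int fallback)
{
	if (cells_prop >= 0)
		return cells_prop;	/* property present: use it as before */
	if (fallback >= 0)
		return fallback;	/* property missing: use the default */
	return -EINVAL;			/* missing and no default: fail */
}

/*
 * Count the references in a flattened list of len cells in which each
 * phandle cell is followed by nargs argument cells.
 * Returns -EINVAL if the list does not divide evenly.
 */
static int count_references(int len, int nargs)
{
	assert(len >= 0 && nargs >= 0);
	if (len % (nargs + 1) != 0)
		return -EINVAL;
	return len / (nargs + 1);
}
```

With the gpios example from the commit message, the property holds six cells and #gpio-cells is 2, so the list splits into two references of three cells each.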


Re: [PATCH 2/2] iommu: dma: Use of_iommu_get_resv_regions()

2019-09-02 Thread Robin Murphy

On 29/08/2019 12:14, Thierry Reding wrote:

From: Thierry Reding 

For device tree nodes, use the standard of_iommu_get_resv_regions()
implementation to obtain the reserved memory regions associated with a
device.


This covers the window between iommu_probe_device() setting up a default 
domain and the device's driver finally probing and taking control, but 
iommu_probe_device() represents the point that the IOMMU driver first 
knows about this device - there's still a window from whenever the IOMMU 
driver itself probed up to here where the "unidentified" traffic may 
have already been disrupted. Some IOMMU drivers have no option but to 
make the necessary configuration during their own probe routine, at 
which point a struct device for the display/etc. endpoint may not even 
exist yet.


Robin.


Cc: Rob Herring 
Cc: Frank Rowand 
Cc: devicet...@vger.kernel.org
Signed-off-by: Thierry Reding 
---
  drivers/iommu/dma-iommu.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index de68b4a02aea..31d48e55ab55 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -19,6 +19,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -164,6 +165,8 @@ void iommu_dma_get_resv_regions(struct device *dev, struct 
list_head *list)
if (!is_of_node(dev_iommu_fwspec_get(dev)->iommu_fwnode))
iort_iommu_msi_get_resv_regions(dev, list);
  
+   if (dev->of_node)
+   of_iommu_get_resv_regions(dev, list);
  }
  EXPORT_SYMBOL(iommu_dma_get_resv_regions);
  



Re: [PATCH 1/2] iommu: Implement of_iommu_get_resv_regions()

2019-09-02 Thread Robin Murphy

On 29/08/2019 12:14, Thierry Reding wrote:

From: Thierry Reding 

This is an implementation that IOMMU drivers can use to obtain reserved
memory regions from a device tree node. It uses the reserved-memory DT
bindings to find the regions associated with a given device. These
regions will be used to create 1:1 mappings in the IOMMU domain that
the devices will be attached to.

Cc: Rob Herring 
Cc: Frank Rowand 
Cc: devicet...@vger.kernel.org
Signed-off-by: Thierry Reding 
---
  drivers/iommu/of_iommu.c | 39 +++
  include/linux/of_iommu.h |  8 
  2 files changed, 47 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 614a93aa5305..0d47f626b854 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -9,6 +9,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -225,3 +226,41 @@ const struct iommu_ops *of_iommu_configure(struct device 
*dev,
  
  return ops;
  }
+
+/**
+ * of_iommu_get_resv_regions - reserved region driver helper for device tree
+ * @dev: device for which to get reserved regions
+ * @list: reserved region list
+ *
+ * IOMMU drivers can use this to implement their .get_resv_regions() callback
+ * for memory regions attached to a device tree node. See the reserved-memory
+ * device tree bindings on how to use these:
+ *
+ *   Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+ */
+void of_iommu_get_resv_regions(struct device *dev, struct list_head *list)
+{
+   struct of_phandle_iterator it;
+   int err;
+
+   of_for_each_phandle(&it, err, dev->of_node, "memory-region", NULL, 0) {
+   struct iommu_resv_region *region;
+   struct resource res;
+
+   err = of_address_to_resource(it.node, 0, &res);
+   if (err < 0) {
+   dev_err(dev, "failed to parse memory region %pOF: %d\n",
+   it.node, err);
+   continue;
+   }


What if the device node has memory regions for other purposes, like 
private CMA carveouts? We wouldn't want to force mappings of those (and 
in the very worst case doing so could even render them unusable).


Robin.


+
+   region = iommu_alloc_resv_region(res.start, resource_size(&res),
+IOMMU_READ | IOMMU_WRITE,
+IOMMU_RESV_DIRECT_RELAXABLE);
+   if (!region)
+   continue;
+
+   list_add_tail(&region->list, list);
+   }
+}
+EXPORT_SYMBOL(of_iommu_get_resv_regions);
diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
index f3d40dd7bb66..fa16b26f55bc 100644
--- a/include/linux/of_iommu.h
+++ b/include/linux/of_iommu.h
@@ -15,6 +15,9 @@ extern int of_get_dma_window(struct device_node *dn, const 
char *prefix,
  extern const struct iommu_ops *of_iommu_configure(struct device *dev,
struct device_node *master_np);
  
+extern void of_iommu_get_resv_regions(struct device *dev,
+ struct list_head *list);
+
  #else
  
  static inline int of_get_dma_window(struct device_node *dn, const char *prefix,

@@ -30,6 +33,11 @@ static inline const struct iommu_ops 
*of_iommu_configure(struct device *dev,
return NULL;
  }
  
+static inline void of_iommu_get_resv_regions(struct device *dev,
+struct list_head *list)
+{
+}
+
  #endif/* CONFIG_OF_IOMMU */
  
  #endif /* __OF_IOMMU_H */





Re: [PATCH 1/7] iommu/arm-smmu: add Nvidia SMMUv2 implementation

2019-09-02 Thread Robin Murphy

On 30/08/2019 19:16, Krishna Reddy wrote:

+ARM_SMMU_MATCH_DATA(nvidia_smmuv2, ARM_SMMU_V2, NVIDIA_SMMUV2);



From the previous discussions, I got the impression that other than the 
'novel' way they're integrated, the actual SMMU implementations were unmodified 
Arm MMU-500s. Is that the case, or have I misread something?


The ARM MMU-500 implementation is unmodified.  It is the way they are integrated 
and used together (for interleaved accesses) that is different from a regular 
ARM MMU-500.
I have added it to get the model number and to be able to differentiate the 
SMMU implementation in arm-smmu-impl.c.


In that case, I would rather keep smmu->model representing the MMU-500 
microarchitecture - since you'll still want to pick up errata 
workarounds etc. for that - and detect the Tegra integration via an 
explicit of_device_is_compatible() check in arm_smmu_impl_init(). For 
comparison, under ACPI we'd probably have to detect integration details 
by looking at table headers, separately from the IORT "Model" field, so 
I'd prefer if the DT vs. ACPI handling didn't diverge more than necessary.


Of course, that immediately opens the question of how best to combine 
arm_mmu500_impl with nsmmu_impl, but hey, one step at a time :)


Robin.
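
A rough user-space sketch of the split suggested above, with the model kept as the microarchitecture and the integration detected separately from the compatible string, might look like this. It is only an illustration: struct smmu_desc, detect_integration() and the "vendor,example-smmu" string are hypothetical, and strcmp() stands in for of_device_is_compatible().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum smmu_model { GENERIC_SMMU, ARM_MMU500 };
enum smmu_integration { INTEGRATION_NONE, INTEGRATION_VENDOR };

struct smmu_desc {
	enum smmu_model model;	/* microarchitecture, drives errata handling */
	const char *compatible;	/* stand-in for the DT compatible string */
};

/* Hypothetical compatible string, used only for this illustration. */
#define VENDOR_SMMU_COMPATIBLE "vendor,example-smmu"

/* Detect the SoC integration without touching smmu->model. */
static enum smmu_integration detect_integration(const struct smmu_desc *smmu)
{
	assert(smmu != NULL && smmu->compatible != NULL);
	if (strcmp(smmu->compatible, VENDOR_SMMU_COMPATIBLE) == 0)
		return INTEGRATION_VENDOR;
	return INTEGRATION_NONE;
}
```

Because the model field is left untouched, errata handling keyed on the MMU-500 model would still apply to a vendor-integrated instance.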


Re: [PATCH 1/2] iommu: Implement of_iommu_get_resv_regions()

2019-09-02 Thread Rob Herring
On Thu, 29 Aug 2019 13:14:06 +0200, Thierry Reding wrote:
> From: Thierry Reding 
> 
> This is an implementation that IOMMU drivers can use to obtain reserved
> memory regions from a device tree node. It uses the reserved-memory DT
> bindings to find the regions associated with a given device. These
> regions will be used to create 1:1 mappings in the IOMMU domain that
> the devices will be attached to.
> 
> Cc: Rob Herring 
> Cc: Frank Rowand 
> Cc: devicet...@vger.kernel.org
> Signed-off-by: Thierry Reding 
> ---
>  drivers/iommu/of_iommu.c | 39 +++
>  include/linux/of_iommu.h |  8 
>  2 files changed, 47 insertions(+)
> 

Reviewed-by: Rob Herring 



Re: [PATCH] PCI: Remove unused includes and superfluous struct declaration

2019-09-02 Thread Rob Herring
On Sun,  1 Sep 2019 13:25:06 +0200, Krzysztof Wilczynski wrote:
> Remove  and  from being included
> directly as part of the include/linux/of_pci.h, and remove
> superfluous declaration of struct of_phandle_args.
> 
> Move users of include  to include 
> and  directly rather than rely on both being
> included transitively through .
> 
> Signed-off-by: Krzysztof Wilczynski 
> ---
>  drivers/iommu/of_iommu.c  | 2 ++
>  drivers/pci/controller/dwc/pcie-designware-host.c | 1 +
>  drivers/pci/controller/pci-aardvark.c | 1 +
>  drivers/pci/pci.c | 1 +
>  drivers/pci/probe.c   | 1 +
>  include/linux/of_pci.h| 4 +---
>  6 files changed, 7 insertions(+), 3 deletions(-)
> 

Acked-by: Rob Herring 



[PATCH 13/13] arm64: use asm-generic/dma-mapping.h

2019-09-02 Thread Christoph Hellwig
Now that the Xen special cases are gone nothing worth mentioning is
left in the arm64  file, so switch to use the
asm-generic version instead.

Signed-off-by: Christoph Hellwig 
Acked-by: Will Deacon 
Reviewed-by: Stefano Stabellini 
---
 arch/arm64/include/asm/Kbuild|  1 +
 arch/arm64/include/asm/dma-mapping.h | 22 --
 arch/arm64/mm/dma-mapping.c  |  1 +
 3 files changed, 2 insertions(+), 22 deletions(-)
 delete mode 100644 arch/arm64/include/asm/dma-mapping.h

diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild
index c52e151afab0..98a5405c8558 100644
--- a/arch/arm64/include/asm/Kbuild
+++ b/arch/arm64/include/asm/Kbuild
@@ -4,6 +4,7 @@ generic-y += delay.h
 generic-y += div64.h
 generic-y += dma.h
 generic-y += dma-contiguous.h
+generic-y += dma-mapping.h
 generic-y += early_ioremap.h
 generic-y += emergency-restart.h
 generic-y += hw_irq.h
diff --git a/arch/arm64/include/asm/dma-mapping.h 
b/arch/arm64/include/asm/dma-mapping.h
deleted file mode 100644
index 67243255a858..
--- a/arch/arm64/include/asm/dma-mapping.h
+++ /dev/null
@@ -1,22 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (C) 2012 ARM Ltd.
- */
-#ifndef __ASM_DMA_MAPPING_H
-#define __ASM_DMA_MAPPING_H
-
-#ifdef __KERNEL__
-
-#include 
-#include 
-
-#include 
-#include 
-
-static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-{
-   return NULL;
-}
-
-#endif /* __KERNEL__ */
-#endif /* __ASM_DMA_MAPPING_H */
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 4b244a037349..6578abcfbbc7 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
-- 
2.20.1



[PATCH 12/13] swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page

2019-09-02 Thread Christoph Hellwig
No need for a no-op wrapper.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Stefano Stabellini 
---
 drivers/xen/swiotlb-xen.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 95911ff9c11c..384304a77020 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -414,9 +414,8 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, 
struct page *page,
  * After this call, reads by the cpu to the buffer are guaranteed to see
  * whatever the device wrote there.
  */
-static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr,
-size_t size, enum dma_data_direction dir,
-unsigned long attrs)
+static void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
+   size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
phys_addr_t paddr = xen_bus_to_phys(dev_addr);
 
@@ -430,13 +429,6 @@ static void xen_unmap_single(struct device *hwdev, 
dma_addr_t dev_addr,
swiotlb_tbl_unmap_single(hwdev, paddr, size, dir, attrs);
 }
 
-static void xen_swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
-   size_t size, enum dma_data_direction dir,
-   unsigned long attrs)
-{
-   xen_unmap_single(hwdev, dev_addr, size, dir, attrs);
-}
-
 static void
 xen_swiotlb_sync_single_for_cpu(struct device *dev, dma_addr_t dma_addr,
size_t size, enum dma_data_direction dir)
@@ -477,7 +469,8 @@ xen_swiotlb_unmap_sg(struct device *hwdev, struct 
scatterlist *sgl, int nelems,
BUG_ON(dir == DMA_NONE);
 
for_each_sg(sgl, sg, nelems, i)
-   xen_unmap_single(hwdev, sg->dma_address, sg_dma_len(sg), dir, 
attrs);
+   xen_swiotlb_unmap_page(hwdev, sg->dma_address, sg_dma_len(sg),
+   dir, attrs);
 
 }
 
-- 
2.20.1



[PATCH 11/13] swiotlb-xen: remove page-coherent.h

2019-09-02 Thread Christoph Hellwig
The only thing left of page-coherent.h is two functions implemented by
the architecture for non-coherent DMA support that are never called for
fully coherent architectures.  Just move the prototypes for those to
swiotlb-xen.h instead.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Stefano Stabellini 
---
 arch/arm/include/asm/xen/page-coherent.h   |  2 --
 arch/arm64/include/asm/xen/page-coherent.h |  2 --
 arch/x86/include/asm/xen/page-coherent.h   | 11 ---
 drivers/xen/swiotlb-xen.c  |  3 ---
 include/Kbuild |  1 -
 include/xen/arm/page-coherent.h| 10 --
 include/xen/swiotlb-xen.h  |  6 ++
 7 files changed, 6 insertions(+), 29 deletions(-)
 delete mode 100644 arch/arm/include/asm/xen/page-coherent.h
 delete mode 100644 arch/arm64/include/asm/xen/page-coherent.h
 delete mode 100644 arch/x86/include/asm/xen/page-coherent.h
 delete mode 100644 include/xen/arm/page-coherent.h

diff --git a/arch/arm/include/asm/xen/page-coherent.h 
b/arch/arm/include/asm/xen/page-coherent.h
deleted file mode 100644
index 27e984977402..
--- a/arch/arm/include/asm/xen/page-coherent.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include 
diff --git a/arch/arm64/include/asm/xen/page-coherent.h 
b/arch/arm64/include/asm/xen/page-coherent.h
deleted file mode 100644
index 27e984977402..
--- a/arch/arm64/include/asm/xen/page-coherent.h
+++ /dev/null
@@ -1,2 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include 
diff --git a/arch/x86/include/asm/xen/page-coherent.h 
b/arch/x86/include/asm/xen/page-coherent.h
deleted file mode 100644
index c9c8398a31ff..
--- a/arch/x86/include/asm/xen/page-coherent.h
+++ /dev/null
@@ -1,11 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_X86_XEN_PAGE_COHERENT_H
-#define _ASM_X86_XEN_PAGE_COHERENT_H
-
-static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
-   dma_addr_t handle, size_t size, enum dma_data_direction dir) { }
-
-static inline void xen_dma_sync_single_for_device(struct device *hwdev,
-   dma_addr_t handle, size_t size, enum dma_data_direction dir) { }
-
-#endif /* _ASM_X86_XEN_PAGE_COHERENT_H */
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index a642e284f1e2..95911ff9c11c 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -35,9 +35,6 @@
 #include 
 #include 
 
-#include 
-#include 
-
 #include 
 /*
  * Used to do a quick range check in swiotlb_tbl_unmap_single and
diff --git a/include/Kbuild b/include/Kbuild
index c38f0d46b267..cce5cf6abf89 100644
--- a/include/Kbuild
+++ b/include/Kbuild
@@ -1189,7 +1189,6 @@ header-test-  += video/vga.h
 header-test-   += video/w100fb.h
 header-test-   += xen/acpi.h
 header-test-   += xen/arm/hypercall.h
-header-test-   += xen/arm/page-coherent.h
 header-test-   += xen/arm/page.h
 header-test-   += xen/balloon.h
 header-test-   += xen/events.h
diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h
deleted file mode 100644
index 635492d41ebe..
--- a/include/xen/arm/page-coherent.h
+++ /dev/null
@@ -1,10 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _XEN_ARM_PAGE_COHERENT_H
-#define _XEN_ARM_PAGE_COHERENT_H
-
-void xen_dma_sync_for_cpu(struct device *dev, dma_addr_t handle,
-   phys_addr_t paddr, size_t size, enum dma_data_direction dir);
-void xen_dma_sync_for_device(struct device *dev, dma_addr_t handle,
-   phys_addr_t paddr, size_t size, enum dma_data_direction dir);
-
-#endif /* _XEN_ARM_PAGE_COHERENT_H */
diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h
index 5e4b83f83dbc..a7c642872568 100644
--- a/include/xen/swiotlb-xen.h
+++ b/include/xen/swiotlb-xen.h
@@ -2,8 +2,14 @@
 #ifndef __LINUX_SWIOTLB_XEN_H
 #define __LINUX_SWIOTLB_XEN_H
 
+#include 
 #include 
 
+void xen_dma_sync_for_cpu(struct device *dev, dma_addr_t handle,
+   phys_addr_t paddr, size_t size, enum dma_data_direction dir);
+void xen_dma_sync_for_device(struct device *dev, dma_addr_t handle,
+   phys_addr_t paddr, size_t size, enum dma_data_direction dir);
+
 extern int xen_swiotlb_init(int verbose, bool early);
 extern const struct dma_map_ops xen_swiotlb_dma_ops;
 
-- 
2.20.1



[PATCH 10/13] swiotlb-xen: simplify cache maintainance

2019-09-02 Thread Christoph Hellwig
Now that we know the dma-noncoherent.h helpers are always available
on architectures that support non-coherent devices, we can call them
directly and remove the calls to the dma-direct routines.  This also
removes the oddity of calling dma_direct_map_page but ignoring its
return value.  Instead we now have Xen wrappers for the
arch_sync_dma_for_{device,cpu} helpers that call the special Xen
versions of those routines for foreign pages.

Note that the new helpers get the physical address passed in addition
to the dma address to avoid another translation for the local cache
maintenance.  The pfn_valid checks remain on the dma address as in
the old code, even if that looks a little funny.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Stefano Stabellini 
---
 arch/arm/xen/mm.c| 64 ++
 arch/x86/include/asm/xen/page-coherent.h | 11 
 drivers/xen/swiotlb-xen.c| 20 +++
 include/xen/arm/page-coherent.h  | 69 ++--
 4 files changed, 31 insertions(+), 133 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 9d73fa4a5991..2b2c208408bb 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -60,63 +60,33 @@ static void dma_cache_maint(dma_addr_t handle, size_t size, 
u32 op)
} while (size);
 }
 
-static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t handle,
-   size_t size, enum dma_data_direction dir)
+/*
+ * Dom0 is mapped 1:1, and while the Linux page can span across multiple Xen
+ * pages, it is not possible for it to contain a mix of local and foreign Xen
+ * pages.  Calling pfn_valid on a foreign mfn will always return false, so if
+ * pfn_valid returns true the pages is local and we can use the native
+ * dma-direct functions, otherwise we call the Xen specific version.
+ */
+void xen_dma_sync_for_cpu(struct device *dev, dma_addr_t handle,
+   phys_addr_t paddr, size_t size, enum dma_data_direction dir)
 {
-   if (dir != DMA_TO_DEVICE)
+   if (pfn_valid(PFN_DOWN(handle)))
+   arch_sync_dma_for_cpu(dev, paddr, size, dir);
+   else if (dir != DMA_TO_DEVICE)
dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL);
 }
 
-static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
-   size_t size, enum dma_data_direction dir)
+void xen_dma_sync_for_device(struct device *dev, dma_addr_t handle,
+   phys_addr_t paddr, size_t size, enum dma_data_direction dir)
 {
-   if (dir == DMA_FROM_DEVICE)
+   if (pfn_valid(PFN_DOWN(handle)))
+   arch_sync_dma_for_device(dev, paddr, size, dir);
+   else if (dir == DMA_FROM_DEVICE)
dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL);
else
dma_cache_maint(handle, size, GNTTAB_CACHE_CLEAN);
 }
 
-void __xen_dma_map_page(struct device *hwdev, struct page *page,
-dma_addr_t dev_addr, unsigned long offset, size_t size,
-enum dma_data_direction dir, unsigned long attrs)
-{
-   if (dev_is_dma_coherent(hwdev))
-   return;
-   if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
-   return;
-
-   __xen_dma_page_cpu_to_dev(hwdev, dev_addr, size, dir);
-}
-
-void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t handle,
-   size_t size, enum dma_data_direction dir,
-   unsigned long attrs)
-
-{
-   if (dev_is_dma_coherent(hwdev))
-   return;
-   if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
-   return;
-
-   __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
-}
-
-void __xen_dma_sync_single_for_cpu(struct device *hwdev,
-   dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-   if (dev_is_dma_coherent(hwdev))
-   return;
-   __xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
-}
-
-void __xen_dma_sync_single_for_device(struct device *hwdev,
-   dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-   if (dev_is_dma_coherent(hwdev))
-   return;
-   __xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
-}
-
 bool xen_arch_need_swiotlb(struct device *dev,
   phys_addr_t phys,
   dma_addr_t dev_addr)
diff --git a/arch/x86/include/asm/xen/page-coherent.h 
b/arch/x86/include/asm/xen/page-coherent.h
index 8ee33c5edded..c9c8398a31ff 100644
--- a/arch/x86/include/asm/xen/page-coherent.h
+++ b/arch/x86/include/asm/xen/page-coherent.h
@@ -2,17 +2,6 @@
 #ifndef _ASM_X86_XEN_PAGE_COHERENT_H
 #define _ASM_X86_XEN_PAGE_COHERENT_H
 
-#include 
-#include 
-
-static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
-dma_addr_t dev_addr, unsigned long offset, size_t size,
-enum dma_data_direction dir, unsigned long attrs) { }
-
-static inline void xen_dma_unmap_page(struct device *hwdev, dma_addr_t ha
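
The dispatch rule that the commit message and the new arch/arm/xen/mm.c
helpers describe can be sketched in plain user-space C. This is only an
illustration, not kernel code: the `sketch_*` names are hypothetical, the
`local` flag stands in for the result of pfn_valid(PFN_DOWN(handle)), and
the returned enum merely names which maintenance path would be taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for enum dma_data_direction values. */
enum sketch_dir {
	SKETCH_BIDIRECTIONAL,
	SKETCH_TO_DEVICE,
	SKETCH_FROM_DEVICE,
};

/* Which cache maintenance path a sync call would take. */
enum sketch_action {
	SKETCH_NATIVE_SYNC, /* local page: arch_sync_dma_for_{cpu,device} */
	SKETCH_HYP_INVAL,   /* foreign page: hypercall, invalidate */
	SKETCH_HYP_CLEAN,   /* foreign page: hypercall, clean */
	SKETCH_NOTHING,     /* foreign page: no maintenance needed */
};

/* Mirrors the branch structure of xen_dma_sync_for_cpu in the diff. */
enum sketch_action sketch_sync_for_cpu(bool local, enum sketch_dir dir)
{
	if (local)
		return SKETCH_NATIVE_SYNC;
	if (dir != SKETCH_TO_DEVICE)
		return SKETCH_HYP_INVAL;
	return SKETCH_NOTHING;
}

/* Mirrors the branch structure of xen_dma_sync_for_device in the diff. */
enum sketch_action sketch_sync_for_device(bool local, enum sketch_dir dir)
{
	if (local)
		return SKETCH_NATIVE_SYNC;
	if (dir == SKETCH_FROM_DEVICE)
		return SKETCH_HYP_INVAL;
	return SKETCH_HYP_CLEAN;
}

/* Small self-check: a local page always takes the native path. */
void sketch_self_check(void)
{
	assert(sketch_sync_for_cpu(true, SKETCH_TO_DEVICE) == SKETCH_NATIVE_SYNC);
	assert(sketch_sync_for_device(true, SKETCH_FROM_DEVICE) == SKETCH_NATIVE_SYNC);
}
```

The point of the sketch is that locality is decided first and the DMA
direction only matters on the foreign-page path.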

[PATCH 09/13] swiotlb-xen: use the same foreign page check everywhere

2019-09-02 Thread Christoph Hellwig
xen_dma_map_page uses a different and more complicated check for foreign
pages than the other three cache maintenance helpers.  Switch it to the
simpler pfn_valid method as well, and document the scheme with a single
improved comment in xen_dma_map_page.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Stefano Stabellini 
---
 include/xen/arm/page-coherent.h | 31 +--
 1 file changed, 9 insertions(+), 22 deletions(-)

diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h
index 0e244f4fec1a..07c104dbc21f 100644
--- a/include/xen/arm/page-coherent.h
+++ b/include/xen/arm/page-coherent.h
@@ -41,23 +41,17 @@ static inline void xen_dma_map_page(struct device *hwdev, 
struct page *page,
 dma_addr_t dev_addr, unsigned long offset, size_t size,
 enum dma_data_direction dir, unsigned long attrs)
 {
-   unsigned long page_pfn = page_to_xen_pfn(page);
-   unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr);
-   unsigned long compound_pages =
-   (1

[PATCH 02/13] xen/arm: consolidate page-coherent.h

2019-09-02 Thread Christoph Hellwig
Share the duplicate arm/arm64 code in include/xen/arm/page-coherent.h.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/include/asm/xen/page-coherent.h   | 75 
 arch/arm64/include/asm/xen/page-coherent.h | 75 
 include/xen/arm/page-coherent.h| 80 ++
 3 files changed, 80 insertions(+), 150 deletions(-)

diff --git a/arch/arm/include/asm/xen/page-coherent.h 
b/arch/arm/include/asm/xen/page-coherent.h
index 602ac02f154c..27e984977402 100644
--- a/arch/arm/include/asm/xen/page-coherent.h
+++ b/arch/arm/include/asm/xen/page-coherent.h
@@ -1,77 +1,2 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_ARM_XEN_PAGE_COHERENT_H
-#define _ASM_ARM_XEN_PAGE_COHERENT_H
-
-#include 
-#include 
 #include 
-
-static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
-   dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs)
-{
-   return dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
-}
-
-static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
-   void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs)
-{
-   dma_direct_free(hwdev, size, cpu_addr, dma_handle, attrs);
-}
-
-static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
-   dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-   unsigned long pfn = PFN_DOWN(handle);
-
-   if (pfn_valid(pfn))
-   dma_direct_sync_single_for_cpu(hwdev, handle, size, dir);
-   else
-   __xen_dma_sync_single_for_cpu(hwdev, handle, size, dir);
-}
-
-static inline void xen_dma_sync_single_for_device(struct device *hwdev,
-   dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-   unsigned long pfn = PFN_DOWN(handle);
-   if (pfn_valid(pfn))
-   dma_direct_sync_single_for_device(hwdev, handle, size, dir);
-   else
-   __xen_dma_sync_single_for_device(hwdev, handle, size, dir);
-}
-
-static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
-dma_addr_t dev_addr, unsigned long offset, size_t size,
-enum dma_data_direction dir, unsigned long attrs)
-{
-   unsigned long page_pfn = page_to_xen_pfn(page);
-   unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr);
-   unsigned long compound_pages =
-   (1<
-#include 
 #include 
-
-static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
-   dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs)
-{
-   return dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
-}
-
-static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
-   void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs)
-{
-   dma_direct_free(hwdev, size, cpu_addr, dma_handle, attrs);
-}
-
-static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
-   dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-   unsigned long pfn = PFN_DOWN(handle);
-
-   if (pfn_valid(pfn))
-   dma_direct_sync_single_for_cpu(hwdev, handle, size, dir);
-   else
-   __xen_dma_sync_single_for_cpu(hwdev, handle, size, dir);
-}
-
-static inline void xen_dma_sync_single_for_device(struct device *hwdev,
-   dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-   unsigned long pfn = PFN_DOWN(handle);
-   if (pfn_valid(pfn))
-   dma_direct_sync_single_for_device(hwdev, handle, size, dir);
-   else
-   __xen_dma_sync_single_for_device(hwdev, handle, size, dir);
-}
-
-static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
-dma_addr_t dev_addr, unsigned long offset, size_t size,
-enum dma_data_direction dir, unsigned long attrs)
-{
-   unsigned long page_pfn = page_to_xen_pfn(page);
-   unsigned long dev_pfn = XEN_PFN_DOWN(dev_addr);
-   unsigned long compound_pages =
-   (1<
+#include 
+
 void __xen_dma_map_page(struct device *hwdev, struct page *page,
 dma_addr_t dev_addr, unsigned long offset, size_t size,
 enum dma_data_direction dir, unsigned long attrs);
@@ -13,4 +16,81 @@ void __xen_dma_sync_single_for_cpu(struct device *hwdev,
 void __xen_dma_sync_single_for_device(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir);
 
+static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
+   dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs)
+{
+   return dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
+}
+
+static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
+   void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs)
+{
+   dma_direct_free(hwdev, size, cpu_addr, dma_handle, attrs);
+}
+
+static inline void xen_dm

[PATCH 05/13] xen/arm: remove xen_dma_ops

2019-09-02 Thread Christoph Hellwig
arm and arm64 can just use xen_swiotlb_dma_ops directly like x86, no
need for a pointer indirection.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Julien Grall 
Reviewed-by: Stefano Stabellini 
---
 arch/arm/mm/dma-mapping.c| 3 ++-
 arch/arm/xen/mm.c| 4 
 arch/arm64/mm/dma-mapping.c  | 3 ++-
 include/xen/arm/hypervisor.h | 2 --
 4 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 738097396445..2661cad36359 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "dma.h"
 #include "mm.h"
@@ -2360,7 +2361,7 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, 
u64 size,
 
 #ifdef CONFIG_XEN
if (xen_initial_domain())
-   dev->dma_ops = xen_dma_ops;
+   dev->dma_ops = &xen_swiotlb_dma_ops;
 #endif
dev->archdata.dma_ops_setup = true;
 }
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 2fde161733b0..11d5ad26fcfe 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -162,16 +162,12 @@ void xen_destroy_contiguous_region(phys_addr_t pstart, 
unsigned int order)
 }
 EXPORT_SYMBOL_GPL(xen_destroy_contiguous_region);
 
-const struct dma_map_ops *xen_dma_ops;
-EXPORT_SYMBOL(xen_dma_ops);
-
 int __init xen_mm_init(void)
 {
struct gnttab_cache_flush cflush;
if (!xen_initial_domain())
return 0;
xen_swiotlb_init(1, false);
-   xen_dma_ops = &xen_swiotlb_dma_ops;
 
cflush.op = 0;
cflush.a.dev_bus_addr = 0;
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index bd2b039f43a6..4b244a037349 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -64,6 +65,6 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 
size,
 
 #ifdef CONFIG_XEN
if (xen_initial_domain())
-   dev->dma_ops = xen_dma_ops;
+   dev->dma_ops = &xen_swiotlb_dma_ops;
 #endif
 }
diff --git a/include/xen/arm/hypervisor.h b/include/xen/arm/hypervisor.h
index 2982571f7cc1..43ef24dd030e 100644
--- a/include/xen/arm/hypervisor.h
+++ b/include/xen/arm/hypervisor.h
@@ -19,8 +19,6 @@ static inline enum paravirt_lazy_mode 
paravirt_get_lazy_mode(void)
return PARAVIRT_LAZY_NONE;
 }
 
-extern const struct dma_map_ops *xen_dma_ops;
-
 #ifdef CONFIG_XEN
 void __init xen_early_init(void);
 #else
-- 
2.20.1



[PATCH 06/13] xen: remove the exports for xen_{create,destroy}_contiguous_region

2019-09-02 Thread Christoph Hellwig
These routines are only used by swiotlb-xen, which cannot be modular.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Stefano Stabellini 
---
 arch/arm/xen/mm.c | 2 --
 arch/x86/xen/mmu_pv.c | 2 --
 2 files changed, 4 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 11d5ad26fcfe..9d73fa4a5991 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -154,13 +154,11 @@ int xen_create_contiguous_region(phys_addr_t pstart, 
unsigned int order,
*dma_handle = pstart;
return 0;
 }
-EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
 
 void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
 {
return;
 }
-EXPORT_SYMBOL_GPL(xen_destroy_contiguous_region);
 
 int __init xen_mm_init(void)
 {
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 26e8b326966d..c8dbee62ec2a 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -2625,7 +2625,6 @@ int xen_create_contiguous_region(phys_addr_t pstart, 
unsigned int order,
*dma_handle = virt_to_machine(vstart).maddr;
return success ? 0 : -ENOMEM;
 }
-EXPORT_SYMBOL_GPL(xen_create_contiguous_region);
 
 void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
 {
@@ -2660,7 +2659,6 @@ void xen_destroy_contiguous_region(phys_addr_t pstart, 
unsigned int order)
 
spin_unlock_irqrestore(&xen_reservation_lock, flags);
 }
-EXPORT_SYMBOL_GPL(xen_destroy_contiguous_region);
 
 static noinline void xen_flush_tlb_all(void)
 {
-- 
2.20.1



[PATCH 07/13] swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable

2019-09-02 Thread Christoph Hellwig
There is no need to wrap the common versions; just wire them up directly.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Stefano Stabellini 
---
 drivers/xen/swiotlb-xen.c | 29 ++---
 1 file changed, 2 insertions(+), 27 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index eee86cc7046b..b8808677ae1d 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -547,31 +547,6 @@ xen_swiotlb_dma_supported(struct device *hwdev, u64 mask)
return xen_virt_to_bus(xen_io_tlb_end - 1) <= mask;
 }
 
-/*
- * Create userspace mapping for the DMA-coherent memory.
- * This function should be called with the pages from the current domain only,
- * passing pages mapped from other domains would lead to memory corruption.
- */
-static int
-xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
-void *cpu_addr, dma_addr_t dma_addr, size_t size,
-unsigned long attrs)
-{
-   return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
-}
-
-/*
- * This function should be called with the pages from the current domain only,
- * passing pages mapped from other domains would lead to memory corruption.
- */
-static int
-xen_swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
-   void *cpu_addr, dma_addr_t handle, size_t size,
-   unsigned long attrs)
-{
-   return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size, attrs);
-}
-
 const struct dma_map_ops xen_swiotlb_dma_ops = {
.alloc = xen_swiotlb_alloc_coherent,
.free = xen_swiotlb_free_coherent,
@@ -584,6 +559,6 @@ const struct dma_map_ops xen_swiotlb_dma_ops = {
.map_page = xen_swiotlb_map_page,
.unmap_page = xen_swiotlb_unmap_page,
.dma_supported = xen_swiotlb_dma_supported,
-   .mmap = xen_swiotlb_dma_mmap,
-   .get_sgtable = xen_swiotlb_get_sgtable,
+   .mmap = dma_common_mmap,
+   .get_sgtable = dma_common_get_sgtable,
 };
-- 
2.20.1
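
The change in this patch, pointing the ops table at the common function
instead of at a wrapper that only forwards its arguments, can be shown
with a small stand-alone C sketch. Everything here is hypothetical
(`sketch_ops`, `sketch_common_mmap`, the simplified two-argument
signature); it does not reproduce struct dma_map_ops or dma_common_mmap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified ops table with a single callback. */
struct sketch_ops {
	int (*mmap)(void *cpu_addr, size_t size);
};

/* Stand-in for the shared implementation: fails on a zero-sized request. */
int sketch_common_mmap(void *cpu_addr, size_t size)
{
	(void)cpu_addr;
	return size ? 0 : -1;
}

/* Before: a wrapper that adds nothing beyond forwarding its arguments. */
int sketch_wrapped_mmap(void *cpu_addr, size_t size)
{
	return sketch_common_mmap(cpu_addr, size);
}

const struct sketch_ops sketch_ops_before = { .mmap = sketch_wrapped_mmap };

/* After: the table refers to the common function directly. */
const struct sketch_ops sketch_ops_after = { .mmap = sketch_common_mmap };

/* Both tables must behave identically for any input. */
void sketch_check_equivalent(void *cpu_addr, size_t size)
{
	assert(sketch_ops_before.mmap(cpu_addr, size) ==
	       sketch_ops_after.mmap(cpu_addr, size));
}
```

This only works because the wrapper and the common function share the
same signature, which is what the designated initializers rely on.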



[PATCH 04/13] xen/arm: simplify dma_cache_maint

2019-09-02 Thread Christoph Hellwig
Calculate the required operation in the caller, and pass it directly
instead of recalculating it for each page, and use simple arithmetic
to get from the physical address to Xen-page-size-aligned chunks.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Stefano Stabellini 
---
 arch/arm/xen/mm.c | 61 ---
 1 file changed, 21 insertions(+), 40 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 90574d89d0d4..2fde161733b0 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -35,64 +35,45 @@ unsigned long xen_get_swiotlb_free_pages(unsigned int order)
return __get_free_pages(flags, order);
 }
 
-enum dma_cache_op {
-   DMA_UNMAP,
-   DMA_MAP,
-};
 static bool hypercall_cflush = false;
 
-/* functions called by SWIOTLB */
-
-static void dma_cache_maint(dma_addr_t handle, unsigned long offset,
-   size_t size, enum dma_data_direction dir, enum dma_cache_op op)
+/* buffers in highmem or foreign pages cannot cross page boundaries */
+static void dma_cache_maint(dma_addr_t handle, size_t size, u32 op)
 {
struct gnttab_cache_flush cflush;
-   unsigned long xen_pfn;
-   size_t left = size;
 
-   xen_pfn = (handle >> XEN_PAGE_SHIFT) + offset / XEN_PAGE_SIZE;
-   offset %= XEN_PAGE_SIZE;
+   cflush.a.dev_bus_addr = handle & XEN_PAGE_MASK;
+   cflush.offset = xen_offset_in_page(handle);
+   cflush.op = op;
 
do {
-   size_t len = left;
-   
-   /* buffers in highmem or foreign pages cannot cross page
-* boundaries */
-   if (len + offset > XEN_PAGE_SIZE)
-   len = XEN_PAGE_SIZE - offset;
-
-   cflush.op = 0;
-   cflush.a.dev_bus_addr = xen_pfn << XEN_PAGE_SHIFT;
-   cflush.offset = offset;
-   cflush.length = len;
-
-   if (op == DMA_UNMAP && dir != DMA_TO_DEVICE)
-   cflush.op = GNTTAB_CACHE_INVAL;
-   if (op == DMA_MAP) {
-   if (dir == DMA_FROM_DEVICE)
-   cflush.op = GNTTAB_CACHE_INVAL;
-   else
-   cflush.op = GNTTAB_CACHE_CLEAN;
-   }
-   if (cflush.op)
-   HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, 
&cflush, 1);
+   if (size + cflush.offset > XEN_PAGE_SIZE)
+   cflush.length = XEN_PAGE_SIZE - cflush.offset;
+   else
+   cflush.length = size;
+
+   HYPERVISOR_grant_table_op(GNTTABOP_cache_flush, &cflush, 1);
 
-   offset = 0;
-   xen_pfn++;
-   left -= len;
-   } while (left);
+   cflush.offset = 0;
+   cflush.a.dev_bus_addr += cflush.length;
+   size -= cflush.length;
+   } while (size);
 }
 
 static void __xen_dma_page_dev_to_cpu(struct device *hwdev, dma_addr_t handle,
size_t size, enum dma_data_direction dir)
 {
-   dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, 
DMA_UNMAP);
+   if (dir != DMA_TO_DEVICE)
+   dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL);
 }
 
 static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t handle,
size_t size, enum dma_data_direction dir)
 {
-   dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, 
DMA_MAP);
+   if (dir == DMA_FROM_DEVICE)
+   dma_cache_maint(handle, size, GNTTAB_CACHE_INVAL);
+   else
+   dma_cache_maint(handle, size, GNTTAB_CACHE_CLEAN);
 }
 
 void __xen_dma_map_page(struct device *hwdev, struct page *page,
-- 
2.20.1
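
The chunking arithmetic that the commit message refers to can be tried
out in user space. The following is a sketch under stated assumptions,
not a copy of the loop in the patch: it assumes a 4 KiB Xen page, uses
hypothetical `sketch_*` names, records the chunks in an array instead of
issuing a hypercall, and advances the page address by one whole page per
iteration so that every recorded address stays page aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB Xen pages; the kernel takes this from its headers. */
#define SKETCH_XEN_PAGE_SIZE 4096u
#define SKETCH_XEN_PAGE_MASK (~(uint64_t)(SKETCH_XEN_PAGE_SIZE - 1))

/* One piece of the buffer that lies entirely within a single page. */
struct sketch_chunk {
	uint64_t page_addr; /* page-aligned address of the containing page */
	uint32_t offset;    /* start of the piece within that page */
	uint32_t length;    /* number of bytes in this piece */
};

/*
 * Split [handle, handle + size) into pieces that never cross a page
 * boundary.  Stores at most max pieces in out and returns how many
 * pieces the range needs in total.
 */
size_t sketch_split(uint64_t handle, size_t size,
		    struct sketch_chunk *out, size_t max)
{
	uint64_t page_addr = handle & SKETCH_XEN_PAGE_MASK;
	uint32_t offset = (uint32_t)(handle & ~SKETCH_XEN_PAGE_MASK);
	size_t n = 0;

	assert(out != NULL || max == 0);

	while (size) {
		/* Bytes left in the current page, capped by what remains. */
		uint32_t length = SKETCH_XEN_PAGE_SIZE - offset;

		if (size < length)
			length = (uint32_t)size;

		if (n < max) {
			out[n].page_addr = page_addr;
			out[n].offset = offset;
			out[n].length = length;
		}
		n++;

		/* Only the first piece can start in the middle of a page. */
		page_addr += SKETCH_XEN_PAGE_SIZE;
		offset = 0;
		size -= length;
	}
	return n;
}
```

Calling `sketch_split(0x1F00, 0x300, chunks, 4)` on a four-entry array
yields two pieces: 0x100 bytes at offset 0xF00 of page 0x1000, then
0x200 bytes at offset 0 of page 0x2000.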



[PATCH 08/13] swiotlb-xen: always use dma-direct helpers to alloc coherent pages

2019-09-02 Thread Christoph Hellwig
x86 currently calls alloc_pages, but using dma-direct works as well
there, with the added benefit of using the CMA pool if available.
The biggest advantage is of course to remove a pointless bit of
architecture-specific code.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Stefano Stabellini 
---
 arch/x86/include/asm/xen/page-coherent.h | 16 
 drivers/xen/swiotlb-xen.c|  7 +++
 include/xen/arm/page-coherent.h  | 12 
 3 files changed, 3 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/xen/page-coherent.h 
b/arch/x86/include/asm/xen/page-coherent.h
index 116777e7f387..8ee33c5edded 100644
--- a/arch/x86/include/asm/xen/page-coherent.h
+++ b/arch/x86/include/asm/xen/page-coherent.h
@@ -5,22 +5,6 @@
 #include 
 #include 
 
-static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
-   dma_addr_t *dma_handle, gfp_t flags,
-   unsigned long attrs)
-{
-   void *vstart = (void*)__get_free_pages(flags, get_order(size));
-   *dma_handle = virt_to_phys(vstart);
-   return vstart;
-}
-
-static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
-   void *cpu_addr, dma_addr_t dma_handle,
-   unsigned long attrs)
-{
-   free_pages((unsigned long) cpu_addr, get_order(size));
-}
-
 static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
 dma_addr_t dev_addr, unsigned long offset, size_t size,
 enum dma_data_direction dir, unsigned long attrs) { }
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b8808677ae1d..f9dd4cb6e4b3 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -299,8 +299,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t 
size,
 * address. In fact on ARM virt_to_phys only works for kernel direct
 * mapped RAM memory. Also see comment below.
 */
-   ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs);
-
+   ret = dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
if (!ret)
return ret;
 
@@ -319,7 +318,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t 
size,
else {
if (xen_create_contiguous_region(phys, order,
 fls64(dma_mask), dma_handle) 
!= 0) {
-   xen_free_coherent_pages(hwdev, size, ret, 
(dma_addr_t)phys, attrs);
+   dma_direct_free(hwdev, size, ret, (dma_addr_t)phys, 
attrs);
return NULL;
}
SetPageXenRemapped(virt_to_page(ret));
@@ -351,7 +350,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t 
size, void *vaddr,
TestClearPageXenRemapped(virt_to_page(vaddr)))
xen_destroy_contiguous_region(phys, order);
 
-   xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
+   dma_direct_free(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
 }
 
 /*
diff --git a/include/xen/arm/page-coherent.h b/include/xen/arm/page-coherent.h
index a840d6949a87..0e244f4fec1a 100644
--- a/include/xen/arm/page-coherent.h
+++ b/include/xen/arm/page-coherent.h
@@ -16,18 +16,6 @@ void __xen_dma_sync_single_for_cpu(struct device *hwdev,
 void __xen_dma_sync_single_for_device(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir);
 
-static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
-   dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs)
-{
-   return dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
-}
-
-static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
-   void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs)
-{
-   dma_direct_free(hwdev, size, cpu_addr, dma_handle, attrs);
-}
-
 static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
 {
-- 
2.20.1



[PATCH 01/13] xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance

2019-09-02 Thread Christoph Hellwig
Copy the arm64 code that uses the dma-direct/swiotlb helpers for DMA
non-coherent devices.

Signed-off-by: Christoph Hellwig 
---
 arch/arm/include/asm/device.h|  3 -
 arch/arm/include/asm/xen/page-coherent.h | 72 +---
 arch/arm/mm/dma-mapping.c|  8 +--
 drivers/xen/swiotlb-xen.c| 20 ---
 4 files changed, 28 insertions(+), 75 deletions(-)

diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index f6955b55c544..c675bc0d5aa8 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -14,9 +14,6 @@ struct dev_archdata {
 #endif
 #ifdef CONFIG_ARM_DMA_USE_IOMMU
struct dma_iommu_mapping*mapping;
-#endif
-#ifdef CONFIG_XEN
-   const struct dma_map_ops *dev_dma_ops;
 #endif
unsigned int dma_coherent:1;
unsigned int dma_ops_setup:1;
diff --git a/arch/arm/include/asm/xen/page-coherent.h 
b/arch/arm/include/asm/xen/page-coherent.h
index 2c403e7c782d..602ac02f154c 100644
--- a/arch/arm/include/asm/xen/page-coherent.h
+++ b/arch/arm/include/asm/xen/page-coherent.h
@@ -6,23 +6,37 @@
 #include 
 #include 
 
-static inline const struct dma_map_ops *xen_get_dma_ops(struct device *dev)
-{
-   if (dev && dev->archdata.dev_dma_ops)
-   return dev->archdata.dev_dma_ops;
-   return get_arch_dma_ops(NULL);
-}
-
 static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
dma_addr_t *dma_handle, gfp_t flags, unsigned long attrs)
 {
-   return xen_get_dma_ops(hwdev)->alloc(hwdev, size, dma_handle, flags, 
attrs);
+   return dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
 }
 
 static inline void xen_free_coherent_pages(struct device *hwdev, size_t size,
void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs)
 {
-   xen_get_dma_ops(hwdev)->free(hwdev, size, cpu_addr, dma_handle, attrs);
+   dma_direct_free(hwdev, size, cpu_addr, dma_handle, attrs);
+}
+
+static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
+   dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+   unsigned long pfn = PFN_DOWN(handle);
+
+   if (pfn_valid(pfn))
+   dma_direct_sync_single_for_cpu(hwdev, handle, size, dir);
+   else
+   __xen_dma_sync_single_for_cpu(hwdev, handle, size, dir);
+}
+
+static inline void xen_dma_sync_single_for_device(struct device *hwdev,
+   dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+   unsigned long pfn = PFN_DOWN(handle);
+   if (pfn_valid(pfn))
+   dma_direct_sync_single_for_device(hwdev, handle, size, dir);
+   else
+   __xen_dma_sync_single_for_device(hwdev, handle, size, dir);
 }
 
 static inline void xen_dma_map_page(struct device *hwdev, struct page *page,
@@ -36,17 +50,8 @@ static inline void xen_dma_map_page(struct device *hwdev, 
struct page *page,
bool local = (page_pfn <= dev_pfn) &&
(dev_pfn - page_pfn < compound_pages);
 
-   /*
-* Dom0 is mapped 1:1, while the Linux page can span across
-* multiple Xen pages, it's not possible for it to contain a
-* mix of local and foreign Xen pages. So if the first xen_pfn
-* == mfn the page is local otherwise it's a foreign page
-* grant-mapped in dom0. If the page is local we can safely
-* call the native dma_ops function, otherwise we call the xen
-* specific function.
-*/
if (local)
-   xen_get_dma_ops(hwdev)->map_page(hwdev, page, offset, size, 
dir, attrs);
+   dma_direct_map_page(hwdev, page, offset, size, dir, attrs);
else
__xen_dma_map_page(hwdev, page, dev_addr, offset, size, dir, 
attrs);
 }
@@ -63,33 +68,10 @@ static inline void xen_dma_unmap_page(struct device *hwdev, 
dma_addr_t handle,
 * safely call the native dma_ops function, otherwise we call the xen
 * specific function.
 */
-   if (pfn_valid(pfn)) {
-   if (xen_get_dma_ops(hwdev)->unmap_page)
-   xen_get_dma_ops(hwdev)->unmap_page(hwdev, handle, size, 
dir, attrs);
-   } else
+   if (pfn_valid(pfn))
+   dma_direct_unmap_page(hwdev, handle, size, dir, attrs);
+   else
__xen_dma_unmap_page(hwdev, handle, size, dir, attrs);
 }
 
-static inline void xen_dma_sync_single_for_cpu(struct device *hwdev,
-   dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
-   unsigned long pfn = PFN_DOWN(handle);
-   if (pfn_valid(pfn)) {
-   if (xen_get_dma_ops(hwdev)->sync_single_for_cpu)
-   xen_get_dma_ops(hwdev)->sync_single_for_cpu(hwdev, 
handle, size, dir);
-   } else
-   __xen_dma_sync_single_for_cpu(hwdev, handle, size, dir);
-}
-
-static inline void xen_dma_sync_single_for_device(struct device *hwdev,
-

[PATCH 03/13] xen/arm: use dev_is_dma_coherent

2019-09-02 Thread Christoph Hellwig
Use the dma-noncoherent dev_is_dma_coherent helper instead of the home
grown variant.  Note that both are always initialized to the same
value in arch_setup_dma_ops.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Julien Grall 
Reviewed-by: Stefano Stabellini 
---
 arch/arm/include/asm/dma-mapping.h   |  6 --
 arch/arm/xen/mm.c| 12 ++--
 arch/arm64/include/asm/dma-mapping.h |  9 -
 3 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index dba9355e2484..bdd80ddbca34 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -91,12 +91,6 @@ static inline dma_addr_t virt_to_dma(struct device *dev, 
void *addr)
 }
 #endif
 
-/* do not use this function in a driver */
-static inline bool is_device_dma_coherent(struct device *dev)
-{
-   return dev->archdata.dma_coherent;
-}
-
 /**
  * arm_dma_alloc - allocate consistent memory for DMA
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index d33b77e9add3..90574d89d0d4 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -99,7 +99,7 @@ void __xen_dma_map_page(struct device *hwdev, struct page 
*page,
 dma_addr_t dev_addr, unsigned long offset, size_t size,
 enum dma_data_direction dir, unsigned long attrs)
 {
-   if (is_device_dma_coherent(hwdev))
+   if (dev_is_dma_coherent(hwdev))
return;
if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
return;
@@ -112,7 +112,7 @@ void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t 
handle,
unsigned long attrs)
 
 {
-   if (is_device_dma_coherent(hwdev))
+   if (dev_is_dma_coherent(hwdev))
return;
if (attrs & DMA_ATTR_SKIP_CPU_SYNC)
return;
@@ -123,7 +123,7 @@ void __xen_dma_unmap_page(struct device *hwdev, dma_addr_t 
handle,
 void __xen_dma_sync_single_for_cpu(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
 {
-   if (is_device_dma_coherent(hwdev))
+   if (dev_is_dma_coherent(hwdev))
return;
__xen_dma_page_dev_to_cpu(hwdev, handle, size, dir);
 }
@@ -131,7 +131,7 @@ void __xen_dma_sync_single_for_cpu(struct device *hwdev,
 void __xen_dma_sync_single_for_device(struct device *hwdev,
dma_addr_t handle, size_t size, enum dma_data_direction dir)
 {
-   if (is_device_dma_coherent(hwdev))
+   if (dev_is_dma_coherent(hwdev))
return;
__xen_dma_page_cpu_to_dev(hwdev, handle, size, dir);
 }
@@ -159,7 +159,7 @@ bool xen_arch_need_swiotlb(struct device *dev,
 * memory and we are not able to flush the cache.
 */
return (!hypercall_cflush && (xen_pfn != bfn) &&
-   !is_device_dma_coherent(dev));
+   !dev_is_dma_coherent(dev));
 }
 
 int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
diff --git a/arch/arm64/include/asm/dma-mapping.h 
b/arch/arm64/include/asm/dma-mapping.h
index bdcb0922a40c..67243255a858 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -18,14 +18,5 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
return NULL;
 }
 
-/*
- * Do not use this function in a driver, it is only provided for
- * arch/arm/mm/xen.c, which is used by arm64 as well.
- */
-static inline bool is_device_dma_coherent(struct device *dev)
-{
-   return dev->dma_coherent;
-}
-
 #endif /* __KERNEL__ */
 #endif /* __ASM_DMA_MAPPING_H */
-- 
2.20.1



swiotlb-xen cleanups v3

2019-09-02 Thread Christoph Hellwig
Hi Xen maintainers and friends,

please take a look at this series that cleans up the parts of swiotlb-xen
that deal with non-coherent caches.

Boris and Juergen, can you take a look at patch 8, which touches x86
as well?

Changes since v2:
 - further dma_cache_maint improvements
 - split the previous patch 1 into 3 patches

Changes since v1:
 - rewrite dma_cache_maint to be much simpler
 - improve various comments and commit logs
 - remove page-coherent.h entirely


Re: [PATCH v2 01/11] asm-generic: add dma_zone_size

2019-09-02 Thread Christoph Hellwig
On Fri, Aug 30, 2019 at 07:24:25PM +0200, Nicolas Saenz Julienne wrote:
> I'll be happy to implement it that way. I agree it's a good compromise.
> 
> @Christoph, do you still want the patch where I create 'zone_dma_bits'? With a
> hardcoded ZONE_DMA it's not absolutely necessary. Though I remember you said 
> it
> was a first step towards being able to initialize dma-direct's min_mask in
> meminit.

I do like the variable better than the current #define.  I wonder if we
really want a mask or a max_zone_dma_address like variable.  So for this
series feel free to drop the patch.   I'll see if I'll pick it up
later or if we can find some way to automatically propagate that
information from the zone initialization.
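
To make the mask-versus-address question concrete, here is a minimal sketch of how a bit count relates to both representations. The helper names zone_dma_bits_to_mask() and zone_dma_addr_to_bits() are hypothetical and not kernel API; the first only mirrors the general shape of DMA_BIT_MASK().

```c
#include <stdint.h>

/*
 * Hypothetical helper: inclusive address mask covered by an n-bit DMA zone.
 * A width of 64 or more is special-cased to avoid a full-width shift.
 */
static inline uint64_t zone_dma_bits_to_mask(unsigned int bits)
{
	return bits >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/*
 * Hypothetical helper: number of address bits needed to reach max_addr
 * (inclusive), i.e. the inverse direction of the helper above.
 */
static inline unsigned int zone_dma_addr_to_bits(uint64_t max_addr)
{
	unsigned int bits = 0;

	while (max_addr) {
		bits++;
		max_addr >>= 1;
	}
	return bits;
}
```

A bit count and a power-of-two-minus-one address carry the same information, so either form could be derived from the other at zone initialization time.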


Re: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook

2019-09-02 Thread Robin Murphy

On 30/08/2019 23:49, Krishna Reddy wrote:
>>> +   if (smmu->impl->tlb_sync) {
>>> +   smmu->impl->tlb_sync(smmu, page, sync, status);
>>
>> What I'd hoped is that rather than needing a hook for this, you could just
>> override smmu_domain->tlb_ops from .init_context to wire up the alternate
>> .sync method directly. That would save this extra level of indirection.
>
> Hi Robin, overriding the tlb_ops->tlb_sync function is not enough here.
> There are direct references to the arm_smmu_tlb_sync_context() and
> arm_smmu_tlb_sync_global() functions in arm-smmu.c. We can replace these
> direct references with tlb_ops->tlb_sync(), except in one function,
> arm_smmu_device_reset().
> When arm_smmu_device_reset() gets called, domains are not initialized and
> tlb_ops is not available.
> Should we add a new hook for arm_smmu_tlb_sync_global(), or make this the
> responsibility of the impl->reset() hook, as it gets called at the same
> place?


Ah, right, I'd forgotten about the TLB maintenance on reset. Looking 
again, though, I think you might need to implement impl->reset anyway 
for the sake of clearing GFSR correctly - since the value read from the 
base instance technically may not match whatever bits might happen to be 
set in the other instances - so I don't see any issue with just calling 
nsmmu_tlb_sync() from there as well.


Robin.
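
The suggestion above can be sketched roughly as follows. This is a simplified stand-alone model, not the arm-smmu driver: struct smmu, struct smmu_impl, the counters and the ordering of the two sync steps are all assumptions made for illustration.

```c
#include <stddef.h>

struct smmu;	/* stand-in for struct arm_smmu_device */

/* Hypothetical subset of an implementation-hook table */
struct smmu_impl {
	int (*reset)(struct smmu *smmu);
};

struct smmu {
	const struct smmu_impl *impl;
	int generic_syncs;	/* number of default global syncs issued */
	int impl_syncs;		/* number of implementation-specific syncs */
};

/* Default global TLB sync against the base instance */
static void generic_tlb_sync_global(struct smmu *smmu)
{
	smmu->generic_syncs++;
}

/*
 * Implementation reset hook: it is needed anyway for per-instance state
 * such as GFSR, so it also performs the sync for the other instances.
 */
static int multi_instance_reset(struct smmu *smmu)
{
	smmu->impl_syncs++;
	return 0;
}

/* Reset path: no tlb_ops exist yet, so only the impl hook can be used */
static int smmu_device_reset(struct smmu *smmu)
{
	generic_tlb_sync_global(smmu);
	if (smmu->impl && smmu->impl->reset)
		return smmu->impl->reset(smmu);
	return 0;
}
```

The point of the design is that the reset path needs no new hook: an implementation that must touch several instances does so from the reset hook it already has to provide.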


Re: [RFC PATCH] iommu/vt-d: Fix IOMMU field not populated on device hot re-plug

2019-09-02 Thread Janusz Krzysztofik
Hi Baolu,

On Thursday, August 29, 2019 11:08:18 AM CEST Lu Baolu wrote:
> Hi,
> 
> On 8/29/19 3:58 PM, Janusz Krzysztofik wrote:
> > Hi Baolu,
> > 
> > On Thursday, August 29, 2019 3:43:31 AM CEST Lu Baolu wrote:
> >> Hi Janusz,
> >>
> >> On 8/28/19 10:17 PM, Janusz Krzysztofik wrote:
> >>>> We should avoid kernel panic when a intel_unmap() is called against
> >>>> a non-existent domain.
> >>> Does that mean you suggest to replace
> >>>   BUG_ON(!domain);
> >>> with something like
> >>>   if (WARN_ON(!domain))
> >>>   return;
> >>> and to not care of orphaned mappings left allocated?  Is there a way to inform
> >>> users that their active DMA mappings are no longer valid and they shouldn't
> >>> call dma_unmap_*()?
> >>>
> >>>> But we shouldn't expect the IOMMU driver not
> >>>> cleaning up the domain info when a device remove notification comes and
> >>>> wait until all file descriptors being closed, right?
> >>> Shouldn't then the IOMMU driver take care of cleaning up resources still
> >>> allocated on device remove before it invalidates and forgets their pointers?
> >>>
> >>
> >> You are right. We need to wait until all allocated resources (iova and
> >> mappings) to be released.
> >>
> >> How about registering a callback for BUS_NOTIFY_UNBOUND_DRIVER, and
> >> removing the domain info when the driver detachment completes?
> > 
> > Device core calls BUS_NOTIFY_UNBOUND_DRIVER on each driver unbind, regardless
> > of a device being removed or not.  As long as the device is not unplugged and
> > the BUS_NOTIFY_REMOVED_DEVICE notification not generated, an unbound driver is
> > not a problem here.
> > Moreover, BUS_NOTIFY_UNBOUND_DRIVER is called even before
> > BUS_NOTIFY_REMOVED_DEVICE so that wouldn't help anyway.
> > Last but not least, bus events are independent of the IOMMU driver use via
> > DMA-API it exposes.
> 
> Fair enough.
> 
> > 
> > If keeping data for unplugged devices and reusing it on device re-plug is not
> > acceptable then maybe the IOMMU driver should perform reference counting of
> > its internal resources occupied by DMA-API users and perform cleanups on last
> > release?
> 
> I am not saying that keeping data is not acceptable. I just want to
> check whether there are any other solutions.

Then reverting 458b7c8e0dde and applying this patch still resolves the issue 
for me.  No errors appear when mappings are unmapped on device close after the 
device has been removed, and domain info preserved on device removal is 
successfully reused on device re-plug.

Is there anything else I can do to help?

Thanks,
Janusz

> 
> Best regards,
> Baolu
> 
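
The reference-counting alternative raised in the thread could look roughly like the sketch below. It is a user-space model under stated assumptions: struct domain_info and every function here are hypothetical, the counter is not atomic, and nothing in it is the intel-iommu driver's actual code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-device domain info kept alive by its users */
struct domain_info {
	unsigned int refs;	/* one for the device plus one per live mapping */
	bool device_present;
};

static struct domain_info *domain_info_create(void)
{
	struct domain_info *info = calloc(1, sizeof(*info));

	if (info) {
		info->refs = 1;		/* reference held by the device */
		info->device_present = true;
	}
	return info;
}

/* Returns true when the last reference was dropped and info was freed */
static bool domain_info_put(struct domain_info *info)
{
	if (--info->refs)
		return false;
	free(info);
	return true;
}

/* Each DMA mapping pins the domain info */
static void domain_info_map(struct domain_info *info)
{
	info->refs++;
}

/* Unmapping stays valid after removal; the last unmap does the cleanup */
static bool domain_info_unmap(struct domain_info *info)
{
	return domain_info_put(info);
}

/* Device removal only drops the device's own reference */
static bool domain_info_remove_device(struct domain_info *info)
{
	info->device_present = false;
	return domain_info_put(info);
}
```

With this shape, an unmap that arrives after the device has been removed still finds a valid object, and cleanup happens on the last release rather than at removal time.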






[PATCH v2] swiotlb-xen: Convert to use macro

2019-09-02 Thread Souptick Joarder
Rather than using static int max_dma_bits, this
can be converted to a macro.

Signed-off-by: Souptick Joarder 
Reviewed-by: Juergen Gross 
---
 drivers/xen/swiotlb-xen.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index ae1df49..d1eced5 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -38,6 +38,7 @@
 #include 
 
 #include 
+#define MAX_DMA_BITS 32
 /*
  * Used to do a quick range check in swiotlb_tbl_unmap_single and
  * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by 
this
@@ -114,8 +115,6 @@ static int is_xen_swiotlb_buffer(dma_addr_t dma_addr)
return 0;
 }
 
-static int max_dma_bits = 32;
-
 static int
 xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
 {
@@ -135,7 +134,7 @@ static int is_xen_swiotlb_buffer(dma_addr_t dma_addr)
p + (i << IO_TLB_SHIFT),
get_order(slabs << IO_TLB_SHIFT),
dma_bits, &dma_handle);
-   } while (rc && dma_bits++ < max_dma_bits);
+   } while (rc && dma_bits++ < MAX_DMA_BITS);
if (rc)
return rc;
 
-- 
1.9.1
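
For readers unfamiliar with the retry loop this patch touches, the sketch below models its control flow in isolation. try_alloc() is a hypothetical stand-in for the real allocation call, and find_dma_bits() exists only for this illustration.

```c
#define MAX_DMA_BITS 32

/*
 * Hypothetical stand-in for the contiguous-region allocation: succeeds
 * (returns 0) only once the address width reaches needed_bits.
 */
static int try_alloc(unsigned int dma_bits, unsigned int needed_bits)
{
	return dma_bits >= needed_bits ? 0 : -1;
}

/*
 * Retry with a progressively wider address limit. Returns the width that
 * succeeded, or -1 if every width up to MAX_DMA_BITS failed.
 */
static int find_dma_bits(unsigned int start_bits, unsigned int needed_bits)
{
	unsigned int dma_bits = start_bits;
	unsigned int tried;
	int rc;

	do {
		tried = dma_bits;
		rc = try_alloc(dma_bits, needed_bits);
		/* post-increment: the last width attempted is MAX_DMA_BITS */
	} while (rc && dma_bits++ < MAX_DMA_BITS);

	return rc ? -1 : (int)tried;
}
```

Because the limit is only ever read as a loop bound, a compile-time constant behaves the same as the former static variable.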



Re: [PATCH 2/7] dt-bindings: arm-smmu: Add binding for nvidia, smmu-v2

2019-09-02 Thread Thierry Reding
On Fri, Aug 30, 2019 at 06:12:08PM +, Krishna Reddy wrote:
> >> +"nidia,smmu-v2"
> >>   "qcom,smmu-v2"
> 
> >I agree with Mikko that the compatible must be at least SoC-specific, but 
> >potentially even instance-specific (e.g. "nvidia,tegra194-gpu-smmu")
> > depending on how many of these parallel-SMMU configurations might be hiding 
> > in current and future SoCs.
> 
> I am correcting the spelling mistake pointed by Mikko.  The NVIDIA SMMUv2 
> implementation is getting used beyond  Tegra194 SOC.  
> To be able to use the smmu compatible string across multiple SOC's, 
> "nvidia,smmu-v2" compatible string is chosen.
> Are you suggesting to make it soc specific and add another one in future?

Yeah, I think that's the safest thing to do. Even if we're using the
same implementation in future SoCs, chances are there will be some
changes. Even if the changes are just fixes, having a SoC-specific
compatible string will ensure we can apply workarounds only to the
implementations that are missing the fixes.

So I think "nvidia,tegra194-smmu" is a good candidate. It uniquely
identifies the instantiation of the IP in Tegra194. Also, if it ever
turns out that the instantiation of the SMMU in the next Tegra
generation is *exactly* the same (even if highly unlikely), there's
nothing wrong with reusing the "nvidia,tegra194-smmu".

We've done similar things in the past, where some new IP was mostly
compatible with old IP. Typically we still include a new compatible
string in case any errata are discovered subsequently. It's not uncommon
to see things like:

compatible = "nvidia,tegra124-xyz", "nvidia,tegra20-xyz";

Basically this means that this is the IP that was also used in Tegra20
and the same Tegra20 driver can be used to drive this hardware on
Tegra124. The Tegra124-specific compatible string may enable newer
features if there's a driver that supports it.

Thierry
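
The fallback scheme described above can be illustrated with a small stand-alone sketch. The match table, the flag name and match_compatible() are hypothetical; this is not the kernel's OF matching code, only a model of "most specific string first".

```c
#include <stddef.h>
#include <string.h>

#define FLAG_T124_EXTRAS 1u	/* illustrative only */

/* Hypothetical driver match table: compatible string -> feature flags */
struct match_entry {
	const char *compatible;
	unsigned int flags;
};

static const struct match_entry driver_matches[] = {
	{ "nvidia,tegra124-xyz", FLAG_T124_EXTRAS },
	{ "nvidia,tegra20-xyz",  0 },
};

/*
 * Walk the node's compatible list from most to least specific and return
 * the first entry the driver knows about, or NULL if none matches.
 */
static const struct match_entry *
match_compatible(const char *const *compat, size_t n)
{
	size_t count = sizeof(driver_matches) / sizeof(driver_matches[0]);

	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < count; j++)
			if (!strcmp(compat[i], driver_matches[j].compatible))
				return &driver_matches[j];
	return NULL;
}
```

A node listing a SoC-specific string that the driver does not know yet still binds through the generic fallback, and a later driver update can attach workarounds to the specific string without touching the device tree.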



Re: [PATCH v8 3/7] swiotlb: Zero out bounce buffer for untrusted device

2019-09-02 Thread Christoph Hellwig
On Mon, Sep 02, 2019 at 09:58:27AM +0800, Lu Baolu wrote:
> The untrusted flag is introduced in another series. I agree that we
> could consider to move it to struct device, but I think making it
> in a separated patch looks better.

A separate patch is of course a good idea.  But it needs to happen
before we can use the flag in the swiotlb code.



Re: [PATCH] swiotlb-zen: Convert to use macro

2019-09-02 Thread Juergen Gross

On 01.09.19 21:28, Souptick Joarder wrote:

Rather than using static int max_dma_bits, this
can be coverted to use as macro.

Signed-off-by: Souptick Joarder 


s/zen/xen/ in the patch title, other than that:

Reviewed-by: Juergen Gross 


Juergen


Re: [PATCH v8 7/7] iommu/vt-d: Use bounce buffer for untrusted devices

2019-09-02 Thread Lu Baolu

Hi David,

On 8/30/19 9:39 PM, David Laight wrote:
> From: Lu Baolu
>> Sent: 30 August 2019 08:17
>>
>> The Intel VT-d hardware uses paging for DMA remapping.
>> The minimum mapped window is a page size. The device
>> drivers may map buffers not filling the whole IOMMU
>> window. This allows the device to access to possibly
>> unrelated memory and a malicious device could exploit
>> this to perform DMA attacks. To address this, the
>> Intel IOMMU driver will use bounce pages for those
>> buffers which don't fill whole IOMMU pages.
>
> Won't this completely kill performance?
>
> I'd expect to see something for dma_alloc_coherent() (etc)
> that tries to give the driver page sized buffers.

Bounce pages won't be used if the driver requests page-sized buffers.

> Either that or the driver could allocate page sized buffers
> even though it only passes fragments of these buffers to
> the dma functions (to avoid excessive cache invalidates).

Yes, agreed. One possible solution is to add a DMA attribute so that the
device driver could hint that the buffer being mapped is part of a
page-sized buffer and the IOMMU driver doesn't need to use a bounce buffer
for it. This is on the todo list. We need to figure out which device
drivers really need this.

> Since you have to trust the driver, why not actually trust it?

In the Thunderbolt case, we trust the driver, but we don't trust the
hot-added devices.

Best regards,
Baolu
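
As a rough illustration of the rule discussed above (bounce only buffers that do not fill whole IOMMU pages), here is a minimal sketch. IOMMU_PAGE_SIZE and needs_bounce() are hypothetical names chosen for this example, and the 4 KiB granule is an assumption about the minimum mapping size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IOMMU_PAGE_SIZE 4096u	/* assumed minimum mapping granule */

/*
 * True if [addr, addr + size) does not start and end on IOMMU page
 * boundaries, so mapping it directly would expose neighbouring data in
 * the same page to the device.
 */
static bool needs_bounce(uint64_t addr, size_t size)
{
	uint64_t mask = IOMMU_PAGE_SIZE - 1;

	return ((addr | size) & mask) != 0;
}
```

A driver that hands over page-aligned, page-sized buffers never takes the bounce path under this check, which is the case referred to in the reply above.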