Re: [PATCH] iommu: of: skip iommu_device_list traversal in of_iommu_xlate()

2020-09-23 Thread Charan Teja Kalla


On 9/23/2020 9:54 PM, Robin Murphy wrote:
> On 2020-09-23 15:53, Charan Teja Reddy wrote:
>> In of_iommu_xlate(), check if the iommu device is enabled before traversing
>> the iommu_device_list through iommu_ops_from_fwnode(). There is no point
>> in traversing the iommu_device_list only to return NO_IOMMU because the
>> iommu device node is disabled.
> 
> Well, the "use" is that it keeps the code that much smaller and simpler
> to have a single path for returning this condition. This whole callstack
> isn't exactly a high-performance code path to begin with, and we've
> always assumed that IOMMUs present but disabled in DT would be a pretty
> rare exception. 

Fine. I thought that returning early when the IOMMU DT node is disabled
was more logical, even over code simplicity. And I agree that it is not a
high-performance path.

> Do you have a system that challenges those assumptions
> and shows any benefit from this change?

No, I don't have a system that challenges these assumptions.

> 
> Robin.
> 
>> Signed-off-by: Charan Teja Reddy 
>> ---
>>   drivers/iommu/of_iommu.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>> index e505b91..225598c 100644
>> --- a/drivers/iommu/of_iommu.c
>> +++ b/drivers/iommu/of_iommu.c
>> @@ -94,9 +94,10 @@ static int of_iommu_xlate(struct device *dev,
>>  	struct fwnode_handle *fwnode = &iommu_spec->np->fwnode;
>>  	int ret;
>>  
>> +	if (!of_device_is_available(iommu_spec->np))
>> +		return NO_IOMMU;
>>  	ops = iommu_ops_from_fwnode(fwnode);
>> -	if ((ops && !ops->of_xlate) ||
>> -	    !of_device_is_available(iommu_spec->np))
>> +	if (ops && !ops->of_xlate)
>>  		return NO_IOMMU;
>>  
>>  	ret = iommu_fwspec_init(dev, &iommu_spec->np->fwnode, ops);
>>

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project

Re: [PATCH v3 0/6] Convert the intel iommu driver to the dma-iommu api

2020-09-23 Thread Lu Baolu

Hi Tvrtko,

On 9/15/20 4:31 PM, Tvrtko Ursulin wrote:
With the previous version of the series I hit a problem on Ivybridge 
where apparently the dma engine width is not respected. At least that 
is my layman interpretation of the errors. From the older thread:


<3> [209.526605] DMAR: intel_iommu_map: iommu width (39) is not 
sufficient for the mapped address (008000)


Relevant iommu boot related messages are:

<6>[    0.184234] DMAR: Host address width 36
<6>[    0.184245] DMAR: DRHD base: 0x00fed9 flags: 0x0
<6>[    0.184288] DMAR: dmar0: reg_base_addr fed9 ver 1:0 cap 
c020e60262 ecap f0101a

<6>[    0.184308] DMAR: DRHD base: 0x00fed91000 flags: 0x1
<6>[    0.184337] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap 
c9008020660262 ecap f0105a
<6>[    0.184357] DMAR: RMRR base: 0x00d8d28000 end: 
0x00d8d46fff
<6>[    0.184377] DMAR: RMRR base: 0x00db00 end: 
0x00df1f
<6>[    0.184398] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 
IOMMU 1

<6>[    0.184414] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
<6>[    0.184428] DMAR-IR: Queued invalidation will be enabled to 
support x2apic and Intr-remapping.

<6>[    0.185173] DMAR-IR: Enabled IRQ remapping in x2apic mode

<6>[    0.878934] DMAR: No ATSR found
<6>[    0.878966] DMAR: dmar0: Using Queued invalidation
<6>[    0.879007] DMAR: dmar1: Using Queued invalidation

<6>[    0.915032] DMAR: Intel(R) Virtualization Technology for 
Directed I/O
<6>[    0.915060] PCI-DMA: Using software bounce buffering for IO 
(SWIOTLB)
<6>[    0.915084] software IO TLB: mapped [mem 0xc80d4000-0xcc0d4000] 
(64MB)


(Full boot log at 
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7054/fi-ivb-3770/boot0.txt, 
failures at 
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7054/fi-ivb-3770/igt@i915_selftest@l...@blt.html.) 



Does this look familiar or at least plausible to you? Is this 
something your new series has fixed?


This happens while attaching a domain to a device. It has nothing to do
with this patch series. I will look into this issue, but not in the
context of this email thread.


I am not sure which step attaches the domain to the device, but these
types of messages:


<3> [209.526605] DMAR: intel_iommu_map: iommu width (39) is not
sufficient for the mapped address (008000)

They definitely appear to happen at runtime, as i915 is getting 
exercised by userspace.


Can you please check whether the change below helps here?

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index c8323a9f8bde..0484c539debc 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -724,6 +724,7 @@ static int domain_update_device_node(struct dmar_domain *domain)
 /* Some capabilities may be different across iommus */
 static void domain_update_iommu_cap(struct dmar_domain *domain)
 {
+	domain->geometry.aperture_end = __DOMAIN_MAX_ADDR(domain->gaw);
 	domain_update_iommu_coherency(domain);
 	domain->iommu_snooping = domain_update_iommu_snooping(NULL);
 	domain->iommu_superpage = domain_update_iommu_superpage(domain, NULL);


Best regards,
baolu

Re: [bugzilla-dae...@bugzilla.kernel.org: [Bug 209149] New: "iommu/vt-d: Enable PCI ACS for platform opt in hint" makes NVMe config space not accessible after S3]

2020-09-23 Thread Rajat Jain via iommu
On Wed, Sep 23, 2020 at 9:19 AM Raj, Ashok  wrote:
>
> Hi Bjorn
>
>
> On Wed, Sep 23, 2020 at 11:03:27AM -0500, Bjorn Helgaas wrote:
> > [+cc IOMMU and NVMe folks]
> >
> > Sorry, I forgot to forward this to linux-pci when it was first
> > reported.
> >
> > Apparently this happens with v5.9-rc3, and may be related to
> > 50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt in hint"),
> > which appeared in v5.8-rc3.
> >
> > There are several dmesg logs and proposed patches in the bugzilla, but
> > no analysis yet of what the problem is.  From the first dmesg
> > attachment (https://bugzilla.kernel.org/attachment.cgi?id=292327):
>
> We have been investigating this internally as well. It appears the
> spec update for Comet Lake may be missing the errata documentation. The
> offsets were wrong in some of them, and if it's the same issue, that is
> the likely cause.

Can you please also confirm whether the errata applies to Tiger Lake?

Thanks,

Rajat

>
> Will nudge the hw folks to hunt that down :-(.
>
> Cheers,
> Ashok


Re: [bugzilla-dae...@bugzilla.kernel.org: [Bug 209149] New: "iommu/vt-d: Enable PCI ACS for platform opt in hint" makes NVMe config space not accessible after S3]

2020-09-23 Thread Kai-Heng Feng
[+Cc Christoph]

> On Sep 24, 2020, at 00:03, Bjorn Helgaas  wrote:
> 
> [+cc IOMMU and NVMe folks]
> 
> Sorry, I forgot to forward this to linux-pci when it was first
> reported.
> 
> Apparently this happens with v5.9-rc3, and may be related to
> 50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt in hint"),
> which appeared in v5.8-rc3.
> 
> There are several dmesg logs and proposed patches in the bugzilla, but
> no analysis yet of what the problem is.  From the first dmesg
> attachment (https://bugzilla.kernel.org/attachment.cgi?id=292327):

AFAIK Intel is working on it internally.
Comet Lake probably needs an ACS quirk like older-generation chips.
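
For reference, "an ACS quirk like older-generation chips" usually means an
entry in the pci_dev_acs_enabled[] table in drivers/pci/quirks.c, pointing
at a handler such as pci_quirk_intel_spt_pch_acs(). A rough sketch only:
whether Comet Lake needs exactly this, and for which device IDs, is what
Intel is still confirming (0x06ac below is just the root port ID from the
log above, used for illustration):

static const struct pci_dev_acs_enabled {
	u16 vendor;
	u16 device;
	int (*acs_enabled)(struct pci_dev *dev, u16 acs_flags);
} pci_dev_acs_enabled[] = {
	/* ... existing entries ... */
	{ PCI_VENDOR_ID_INTEL, 0x06ac, pci_quirk_intel_spt_pch_acs },
	{ 0 }
};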

> 
>  [   50.434945] PM: suspend entry (deep)
>  [   50.802086] nvme :01:00.0: saving config space at offset 0x0 (reading 
> 0x11e0f)
>  [   50.842775] ACPI: Preparing to enter system sleep state S3
>  [   50.858922] ACPI: Waking up from system sleep state S3
>  [   50.883622] nvme :01:00.0: can't change power state from D3hot to D0 
> (config space inaccessible)
>  [   50.947352] nvme :01:00.0: restoring config space at offset 0x0 (was 
> 0x, writing 0x11e0f)
>  [   50.947816] pcieport :00:1b.0: DPC: containment event, status:0x1f01 
> source:0x
>  [   50.947817] pcieport :00:1b.0: DPC: unmasked uncorrectable error 
> detected
>  [   50.947829] pcieport :00:1b.0: PCIe Bus Error: severity=Uncorrected 
> (Non-Fatal), type=Transaction Layer, (Receiver ID)
>  [   50.947830] pcieport :00:1b.0:   device [8086:06ac] error 
> status/mask=0020/0001
>  [   50.947831] pcieport :00:1b.0:[21] ACSViol(First)
>  [   50.947841] pcieport :00:1b.0: AER: broadcast error_detected message
>  [   50.947843] nvme nvme0: frozen state error detected, reset controller
> 
> I suspect the nvme "can't change power state" and restore config space
> errors are a consequence of the DPC event.  If DPC disables the link,
> the device is inaccessible.
> 
> I don't know what caused the ACS Violation.  The AER TLP Header Log
> might have a clue, but unfortunately we didn't print it.
> 
> Tangent:
> 
>  The fact that we didn't print the AER TLP Header log looks like
>  a bug in itself.  PCIe r5.0, sec 6.2.7, table 6-5, says many
>  errors, including ACS Violation, should log the TLP header.  But
>  aer_get_device_error_info() only reads the log for error bits in
>  AER_LOG_TLP_MASKS, which doesn't include PCI_ERR_UNC_ACSV.
> 
>  I don't think there's a "TLP Header Log Valid" bit, and it's ugly to
>  have to update AER_LOG_TLP_MASKS if new errors are added.  I think
>  maybe we should always print the header log.

I can attach TLP Header if there's a patch...

Kai-Heng

> 
> - Forwarded message from bugzilla-dae...@bugzilla.kernel.org -
> 
> Date: Fri, 04 Sep 2020 14:31:20 +
> From: bugzilla-dae...@bugzilla.kernel.org
> To: bj...@helgaas.com
> Subject: [Bug 209149] New: "iommu/vt-d: Enable PCI ACS for platform opt in
>   hint" makes NVMe config space not accessible after S3
> Message-ID: 
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=209149
> 
>Bug ID: 209149
>   Summary: "iommu/vt-d: Enable PCI ACS for platform opt in hint"
>makes NVMe config space not accessible after S3
>   Product: Drivers
>   Version: 2.5
>Kernel Version: mainline
>  Hardware: All
>OS: Linux
>  Tree: Mainline
>Status: NEW
>  Severity: normal
>  Priority: P1
> Component: PCI
>  Assignee: drivers_...@kernel-bugs.osdl.org
>  Reporter: kai.heng.f...@canonical.com
>Regression: No
> 
> Here's the error:
> [   50.947816] pcieport :00:1b.0: DPC: containment event, status:0x1f01
> source:0x
> [   50.947817] pcieport :00:1b.0: DPC: unmasked uncorrectable error
> detected
> [   50.947829] pcieport :00:1b.0: PCIe Bus Error: severity=Uncorrected
> (Non-Fatal), type=Transaction Layer, (Receiver ID)
> [   50.947830] pcieport :00:1b.0:   device [8086:06ac] error
> status/mask=0020/0001
> [   50.947831] pcieport :00:1b.0:[21] ACSViol(First)
> [   50.947841] pcieport :00:1b.0: AER: broadcast error_detected message
> [   50.947843] nvme nvme0: frozen state error detected, reset controller
> 
> -- 
> You are receiving this mail because:
> You are watching the assignee of the bug.
> 
> - End forwarded message -



Re: [PATCH] iommu: of: skip iommu_device_list traversal in of_iommu_xlate()

2020-09-23 Thread Robin Murphy

On 2020-09-23 15:53, Charan Teja Reddy wrote:

In of_iommu_xlate(), check if the iommu device is enabled before traversing
the iommu_device_list through iommu_ops_from_fwnode(). There is no point
in traversing the iommu_device_list only to return NO_IOMMU because the
iommu device node is disabled.


Well, the "use" is that it keeps the code that much smaller and simpler 
to have a single path for returning this condition. This whole callstack 
isn't exactly a high-performance code path to begin with, and we've 
always assumed that IOMMUs present but disabled in DT would be a pretty 
rare exception. Do you have a system that challenges those assumptions 
and shows any benefit from this change?


Robin.


Signed-off-by: Charan Teja Reddy 
---
  drivers/iommu/of_iommu.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index e505b91..225598c 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -94,9 +94,10 @@ static int of_iommu_xlate(struct device *dev,
 	struct fwnode_handle *fwnode = &iommu_spec->np->fwnode;
 	int ret;
 
+	if (!of_device_is_available(iommu_spec->np))
+		return NO_IOMMU;
 	ops = iommu_ops_from_fwnode(fwnode);
-	if ((ops && !ops->of_xlate) ||
-	    !of_device_is_available(iommu_spec->np))
+	if (ops && !ops->of_xlate)
 		return NO_IOMMU;
 
 	ret = iommu_fwspec_init(dev, &iommu_spec->np->fwnode, ops);





Re: [bugzilla-dae...@bugzilla.kernel.org: [Bug 209149] New: "iommu/vt-d: Enable PCI ACS for platform opt in hint" makes NVMe config space not accessible after S3]

2020-09-23 Thread Raj, Ashok
Hi Bjorn


On Wed, Sep 23, 2020 at 11:03:27AM -0500, Bjorn Helgaas wrote:
> [+cc IOMMU and NVMe folks]
> 
> Sorry, I forgot to forward this to linux-pci when it was first
> reported.
> 
> Apparently this happens with v5.9-rc3, and may be related to
> 50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt in hint"),
> which appeared in v5.8-rc3.
> 
> There are several dmesg logs and proposed patches in the bugzilla, but
> no analysis yet of what the problem is.  From the first dmesg
> attachment (https://bugzilla.kernel.org/attachment.cgi?id=292327):

We have been investigating this internally as well. It appears the
spec update for Comet Lake may be missing the errata documentation. The
offsets were wrong in some of them, and if it's the same issue, that is
the likely cause.

Will nudge the hw folks to hunt that down :-(.

Cheers,
Ashok


[bugzilla-dae...@bugzilla.kernel.org: [Bug 209149] New: "iommu/vt-d: Enable PCI ACS for platform opt in hint" makes NVMe config space not accessible after S3]

2020-09-23 Thread Bjorn Helgaas
[+cc IOMMU and NVMe folks]

Sorry, I forgot to forward this to linux-pci when it was first
reported.

Apparently this happens with v5.9-rc3, and may be related to
50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt in hint"),
which appeared in v5.8-rc3.

There are several dmesg logs and proposed patches in the bugzilla, but
no analysis yet of what the problem is.  From the first dmesg
attachment (https://bugzilla.kernel.org/attachment.cgi?id=292327):

  [   50.434945] PM: suspend entry (deep)
  [   50.802086] nvme :01:00.0: saving config space at offset 0x0 (reading 
0x11e0f)
  [   50.842775] ACPI: Preparing to enter system sleep state S3
  [   50.858922] ACPI: Waking up from system sleep state S3
  [   50.883622] nvme :01:00.0: can't change power state from D3hot to D0 
(config space inaccessible)
  [   50.947352] nvme :01:00.0: restoring config space at offset 0x0 (was 
0x, writing 0x11e0f)
  [   50.947816] pcieport :00:1b.0: DPC: containment event, status:0x1f01 
source:0x
  [   50.947817] pcieport :00:1b.0: DPC: unmasked uncorrectable error 
detected
  [   50.947829] pcieport :00:1b.0: PCIe Bus Error: severity=Uncorrected 
(Non-Fatal), type=Transaction Layer, (Receiver ID)
  [   50.947830] pcieport :00:1b.0:   device [8086:06ac] error 
status/mask=0020/0001
  [   50.947831] pcieport :00:1b.0:[21] ACSViol(First)
  [   50.947841] pcieport :00:1b.0: AER: broadcast error_detected message
  [   50.947843] nvme nvme0: frozen state error detected, reset controller

I suspect the nvme "can't change power state" and restore config space
errors are a consequence of the DPC event.  If DPC disables the link,
the device is inaccessible.

I don't know what caused the ACS Violation.  The AER TLP Header Log
might have a clue, but unfortunately we didn't print it.

Tangent:

  The fact that we didn't print the AER TLP Header log looks like
  a bug in itself.  PCIe r5.0, sec 6.2.7, table 6-5, says many
  errors, including ACS Violation, should log the TLP header.  But
  aer_get_device_error_info() only reads the log for error bits in
  AER_LOG_TLP_MASKS, which doesn't include PCI_ERR_UNC_ACSV.

  I don't think there's a "TLP Header Log Valid" bit, and it's ugly to
  have to update AER_LOG_TLP_MASKS if new errors are added.  I think
  maybe we should always print the header log.
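
One way to do that, sketched against the v5.9-era aer_get_device_error_info()
(the severity check and its placement are illustrative, not a tested patch):

-	if (info->status & AER_LOG_TLP_MASKS) {
+	/* PCIe r5.0 sec 6.2.7 logs the TLP header for most uncorrectable
+	 * errors, so read it unconditionally rather than gating on
+	 * AER_LOG_TLP_MASKS. */
+	if (info->severity != AER_CORRECTABLE) {
 		info->tlp_header_valid = 1;
 		pci_read_config_dword(dev, aer + PCI_ERR_HEADER_LOG, &info->tlp.dw0);
 		pci_read_config_dword(dev, aer + PCI_ERR_HEADER_LOG + 4, &info->tlp.dw1);
 		pci_read_config_dword(dev, aer + PCI_ERR_HEADER_LOG + 8, &info->tlp.dw2);
 		pci_read_config_dword(dev, aer + PCI_ERR_HEADER_LOG + 12, &info->tlp.dw3);
 	}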

- Forwarded message from bugzilla-dae...@bugzilla.kernel.org -

Date: Fri, 04 Sep 2020 14:31:20 +
From: bugzilla-dae...@bugzilla.kernel.org
To: bj...@helgaas.com
Subject: [Bug 209149] New: "iommu/vt-d: Enable PCI ACS for platform opt in
hint" makes NVMe config space not accessible after S3
Message-ID: 

https://bugzilla.kernel.org/show_bug.cgi?id=209149

Bug ID: 209149
   Summary: "iommu/vt-d: Enable PCI ACS for platform opt in hint"
makes NVMe config space not accessible after S3
   Product: Drivers
   Version: 2.5
Kernel Version: mainline
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: PCI
  Assignee: drivers_...@kernel-bugs.osdl.org
  Reporter: kai.heng.f...@canonical.com
Regression: No

Here's the error:
[   50.947816] pcieport :00:1b.0: DPC: containment event, status:0x1f01
source:0x
[   50.947817] pcieport :00:1b.0: DPC: unmasked uncorrectable error
detected
[   50.947829] pcieport :00:1b.0: PCIe Bus Error: severity=Uncorrected
(Non-Fatal), type=Transaction Layer, (Receiver ID)
[   50.947830] pcieport :00:1b.0:   device [8086:06ac] error
status/mask=0020/0001
[   50.947831] pcieport :00:1b.0:[21] ACSViol(First)
[   50.947841] pcieport :00:1b.0: AER: broadcast error_detected message
[   50.947843] nvme nvme0: frozen state error detected, reset controller

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

- End forwarded message -


Re: [PATCHv5 5/6] iommu: arm-smmu-impl: Use table to list QCOM implementations

2020-09-23 Thread Robin Murphy

On 2020-09-22 07:18, Sai Prakash Ranjan wrote:

Use a table and of_match_node() to match the qcom implementation
instead of multiple of_device_is_compatible() calls for each
QCOM SMMU implementation.

Signed-off-by: Sai Prakash Ranjan 
---
  drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 12 
  1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index d199b4bff15d..ce78295cfa78 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -9,6 +9,13 @@
 
 #include "arm-smmu.h"
 
+static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
+	{ .compatible = "qcom,sc7180-smmu-500" },
+	{ .compatible = "qcom,sdm845-smmu-500" },
+	{ .compatible = "qcom,sm8150-smmu-500" },
+	{ .compatible = "qcom,sm8250-smmu-500" },
+	{ }
+};


Can you push the table itself into arm-smmu-qcom? That way you'll be 
free to add new SoCs willy-nilly without any possibility of conflicting 
with anything else.


Bonus points if you can fold in the Adreno variant and keep everything 
together ;)


Robin.
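
For illustration, pushing the table into arm-smmu-qcom.c could look roughly
like this (a sketch of the suggestion, not the final code; the allocation
details stay whatever qcom_smmu_impl_init() already does):

static const struct of_device_id qcom_smmu_impl_of_match[] = {
	{ .compatible = "qcom,sc7180-smmu-500" },
	{ .compatible = "qcom,sdm845-smmu-500" },
	{ .compatible = "qcom,sm8150-smmu-500" },
	{ .compatible = "qcom,sm8250-smmu-500" },
	{ }
};

struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
{
	/* Not a known QCOM implementation: leave the SMMU untouched. */
	if (!of_match_node(qcom_smmu_impl_of_match, smmu->dev->of_node))
		return smmu;

	/* ... existing qcom_smmu allocation and impl hookup ... */
}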


 static int arm_smmu_gr0_ns(int offset)
 {
@@ -217,10 +224,7 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
 	if (of_device_is_compatible(np, "nvidia,tegra194-smmu"))
 		return nvidia_smmu_impl_init(smmu);
 
-	if (of_device_is_compatible(np, "qcom,sdm845-smmu-500") ||
-	    of_device_is_compatible(np, "qcom,sc7180-smmu-500") ||
-	    of_device_is_compatible(np, "qcom,sm8150-smmu-500") ||
-	    of_device_is_compatible(np, "qcom,sm8250-smmu-500"))
+	if (of_match_node(qcom_smmu_impl_of_match, np))
 		return qcom_smmu_impl_init(smmu);
 
 	if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))





Re: [PATCHv5 4/6] drm/msm/a6xx: Add support for using system cache(LLC)

2020-09-23 Thread Jordan Crouse
On Tue, Sep 22, 2020 at 11:48:17AM +0530, Sai Prakash Ranjan wrote:
> From: Sharat Masetty 
> 
> The last level system cache can be partitioned into 32 different
> slices, of which the GPU has two preallocated. One slice is
> used for caching GPU buffers and the other slice is used for
> caching the GPU SMMU pagetables. This patch talks to the core system
> cache driver to acquire the slice handles, configures the SCIDs
> for those slices, and activates and deactivates the slices upon
> GPU power collapse and restore.
> 
> Some support from the IOMMU driver is also needed to make use
> of the system cache and set the right TCR attributes. The GPU then
> has the ability to override a few cacheability parameters, which it
> uses to change write-allocate to write-no-allocate, as the GPU
> hardware does not benefit much from it.
> 
> DOMAIN_ATTR_SYS_CACHE is another domain level attribute used by the
> IOMMU driver to set the right attributes to cache the hardware
> pagetables into the system cache.
> 
> Signed-off-by: Sharat Masetty 
> [saiprakash.ranjan: fix to set attr before device attach to iommu and rebase]
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 83 +
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |  4 ++
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 17 +
>  3 files changed, 104 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 8915882e..151190ff62f7 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -8,7 +8,9 @@
>  #include "a6xx_gpu.h"
>  #include "a6xx_gmu.xml.h"
>  
> +#include 
>  #include 
> +#include 
>  
>  #define GPU_PAS_ID 13
>  
> @@ -1022,6 +1024,79 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
>   return IRQ_HANDLED;
>  }
>  
> +static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask, u32 
> or)
> +{
> + return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or);
> +}
> +
> +static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 value)
> +{
> + return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2));
> +}
> +
> +static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu)
> +{
> + llcc_slice_deactivate(a6xx_gpu->llc_slice);
> + llcc_slice_deactivate(a6xx_gpu->htw_llc_slice);
> +}
> +
> +static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu)
> +{
> + u32 cntl1_regval = 0;
> +
> + if (IS_ERR(a6xx_gpu->llc_mmio))
> + return;
> +
> + if (!llcc_slice_activate(a6xx_gpu->llc_slice)) {
> + u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice);
> +
> + gpu_scid &= 0x1f;
> + cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << 
> 10) |
> +(gpu_scid << 15) | (gpu_scid << 20);
> + }
> +
> + if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) {
> + u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice);
> +
> + gpuhtw_scid &= 0x1f;
> + cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid);
> + }
> +
> + if (cntl1_regval) {
> + /*
> +  * Program the slice IDs for the various GPU blocks and GPU MMU
> +  * pagetables
> +  */
> + a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1, 
> cntl1_regval);
> +
> + /*
> +  * Program cacheability overrides to not allocate cache lines on
> +  * a write miss
> +  */
> + a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 
> 0xF, 0x03);
> + }
> +}

This code has been around long enough that it pre-dates a650. On a650 and other
MMU-500 targets the htw_llc is configured by the firmware and the llc_slice is
configured in a different register.

I don't think we need to pause everything and add support for the MMU-500 path,
but we do need a way to disallow LLCC on affected targets until such time that
we can get it fixed up.

Jordan
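
A minimal gate, as a sketch (adreno_is_a650() already exists in the msm
driver; where exactly the check belongs is open): leave llc_mmio as an
error pointer on the affected targets, so a6xx_llc_activate() bails out
early and no slices are configured.

static void a6xx_llc_slices_init(struct platform_device *pdev,
				 struct a6xx_gpu *a6xx_gpu)
{
	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;

	/* On MMU-500 targets (e.g. a650) htw_llc is configured by firmware
	 * and the GPU slice lives in a different register: disallow for now. */
	if (adreno_is_a650(adreno_gpu)) {
		a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
		return;
	}

	/* ... existing slice setup from the patch above ... */
}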

> +
> +static void a6xx_llc_slices_destroy(struct a6xx_gpu *a6xx_gpu)
> +{
> + llcc_slice_putd(a6xx_gpu->llc_slice);
> + llcc_slice_putd(a6xx_gpu->htw_llc_slice);
> +}
> +
> +static void a6xx_llc_slices_init(struct platform_device *pdev,
> + struct a6xx_gpu *a6xx_gpu)
> +{
> + a6xx_gpu->llc_mmio = msm_ioremap(pdev, "cx_mem", "gpu_cx");
> + if (IS_ERR(a6xx_gpu->llc_mmio))
> + return;
> +
> + a6xx_gpu->llc_slice = llcc_slice_getd(LLCC_GPU);
> + a6xx_gpu->htw_llc_slice = llcc_slice_getd(LLCC_GPUHTW);
> +
> + if (IS_ERR(a6xx_gpu->llc_slice) && IS_ERR(a6xx_gpu->htw_llc_slice))
> + a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> +}
> +
>  static int a6xx_pm_resume(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> @@ -1038,6 +1113,8 @@ static int a6xx_pm_resume(struct msm_gpu *gpu)
>  
>   msm_gpu_resume_devfreq(gpu);
>  

[PATCH] iommu: of: skip iommu_device_list traversal in of_iommu_xlate()

2020-09-23 Thread Charan Teja Reddy
In of_iommu_xlate(), check if the iommu device is enabled before traversing
the iommu_device_list through iommu_ops_from_fwnode(). There is no point
in traversing the iommu_device_list only to return NO_IOMMU because the
iommu device node is disabled.

Signed-off-by: Charan Teja Reddy 
---
 drivers/iommu/of_iommu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index e505b91..225598c 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -94,9 +94,10 @@ static int of_iommu_xlate(struct device *dev,
 	struct fwnode_handle *fwnode = &iommu_spec->np->fwnode;
 	int ret;
 
+	if (!of_device_is_available(iommu_spec->np))
+		return NO_IOMMU;
 	ops = iommu_ops_from_fwnode(fwnode);
-	if ((ops && !ops->of_xlate) ||
-	    !of_device_is_available(iommu_spec->np))
+	if (ops && !ops->of_xlate)
 		return NO_IOMMU;
 
 	ret = iommu_fwspec_init(dev, &iommu_spec->np->fwnode, ops);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH v2 0/2] iommu/arm-smmu-v3: Improve cmdq lock efficiency

2020-09-23 Thread John Garry

On 21/09/2020 14:58, John Garry wrote:



>> Could you try to adapt the hacks I sent before, please? I know they
>> weren't quite right (I have no hardware to test on
>
> Could the ARM Rev C FVP be used to at least functionally test? Can't
> seem to access myself, even though it's gratis...
>
>> ), but the basic idea is to fall back to a spinlock if the cmpxchg()
>> fails. The queueing in the spinlock implementation should avoid the
>> contention.




So I modified that suggested change to get it functioning, and it looks 
like this:


diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c

index 7196207be7ea..f907b7c233a2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -560,6 +560,7 @@ struct arm_smmu_cmdq {
atomic_long_t   *valid_map;
atomic_towner_prod;
atomic_tlock;
+   spinlock_t  slock;
 };

 struct arm_smmu_cmdq_batch {
@@ -1378,7 +1379,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct 
arm_smmu_device *smmu,

u64 cmd_sync[CMDQ_ENT_DWORDS];
u32 prod;
unsigned long flags;
-   bool owner;
+   bool owner, locked = false;
struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
struct arm_smmu_ll_queue llq = {
.max_n_shift = cmdq->q.llq.max_n_shift,
@@ -1387,26 +1388,42 @@ static int arm_smmu_cmdq_issue_cmdlist(struct 
arm_smmu_device *smmu,


/* 1. Allocate some space in the queue */
local_irq_save(flags);
-   llq.val = READ_ONCE(cmdq->q.llq.val);
do {
u64 old;

-   while (!queue_has_space(&llq, n + sync)) {
+   llq.val = READ_ONCE(cmdq->q.llq.val);
+
+   if (queue_has_space(&llq, n + sync))
+   goto try_cas;
+
+   if (locked) {
+   spin_unlock(&cmdq->slock);
+   locked = false; // added
+   }
+
+   do {
local_irq_restore(flags);
if (arm_smmu_cmdq_poll_until_not_full(smmu, &llq))
dev_err_ratelimited(smmu->dev, "CMDQ 
timeout\n");
local_irq_save(flags);
-   }
+   } while (!queue_has_space(&llq, n + sync));

+try_cas:
head.cons = llq.cons;
head.prod = queue_inc_prod_n(&llq, n + sync) |
 CMDQ_PROD_OWNED_FLAG;

old = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);
-   if (old == llq.val)
+   if (old == llq.val) { // was if (old != llq.val)
+   if (locked)   //   break;
+   spin_unlock(&cmdq->slock);//
break;//
+   }//

-   llq.val = old;
+   if (!locked) {
+   spin_lock(&cmdq->slock);
+   locked = true;
+   }
} while (1);
owner = !(llq.prod & CMDQ_PROD_OWNED_FLAG);
head.prod &= ~CMDQ_PROD_OWNED_FLAG;
@@ -3192,6 +3209,7 @@ static int arm_smmu_cmdq_init(struct 
arm_smmu_device *smmu)


atomic_set(&cmdq->owner_prod, 0);
atomic_set(&cmdq->lock, 0);
+   spin_lock_init(&cmdq->slock);

bitmap = (atomic_long_t *)bitmap_zalloc(nents, GFP_KERNEL);
if (!bitmap) {
--
2.26.2

I annotated my mods with comments. Maybe those mods would not be as you 
intend.


So I'm not sure that this solves the problem of a new CPU coming along and
trying the cmpxchg immediately, while another CPU holds the slock and will
also try the cmpxchg.
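
To spell out the scenario (an illustrative timeline only):

/*
 *   CPU0: cmpxchg() fails -> spin_lock(&cmdq->slock) -> retries cmpxchg()
 *   CPU1: arrives fresh, skips the slock, issues cmpxchg() immediately
 *
 * The slock only queues CPUs that have already lost a race, so a steady
 * stream of newcomers can still keep cmdq->q.llq.val contended.
 */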


Anyway, the results are a bit mixed depending on the CPU count, but 
generally positive compared to mainline:


CPUs              2     4     8     16    32    64    96
v5.9-rc1          453K  409K  295K  157K  33.6K 9.5K  5.2K
Will's change     459K  414K  281K  131K  44K   15.5K 8.6K
$subject change   481K  406K  305K  190K  81K   30K   18.7K

(Unit is DMA map+unmap operations per CPU per second, using a test harness.
Higher is better.)


Please let me know of any way to progress.

Thanks,
John


RE: [External] Re: [PATCH] Revert "iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions"

2020-09-23 Thread Adrian Huang12
Hi Baoquan,

> -Original Message-
> From: Baoquan He 
> Sent: Wednesday, September 23, 2020 10:33 AM
> To: j...@8bytes.org; Adrian Huang12 
> Cc: iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> jsnit...@redhat.com
> Subject: [External] Re: [PATCH] Revert "iommu/amd: Treat per-device exclusion
> ranges as r/w unity-mapped regions"
> 
> Forgot CC-ing Jerry, add him.
> 
> On 09/23/20 at 10:26am, Baoquan He wrote:
> > A regression failure of kdump kernel boot was reported on a HPE system.
> > Bisect points at commit 387caf0b759ac43 ("iommu/amd: Treat per-device
> > exclusion ranges as r/w unity-mapped regions") as the culprit. Reverting
> > it fixes the failure.
> >
> > With the commit, the kdump kernel will always print the error message
> > below, so naturally the AMD IOMMU can't function normally during kdump
> > kernel bootup.
> >
> >   ~
> >   AMD-Vi: [Firmware Bug]: IVRS invalid checksum
> >
> > Why commit 387caf0b759ac43 causes this hasn't been made clear yet.
> 
> Hi Joerg, Adrian
> 
> We only have one machine which can reproduce the issue; it's an HPE
> gen10-01. If any logs or info are needed, please let me know and I can
> attach them here.

Could you please provide the following info?
1. The boot logs for both the system kernel and the kdump kernel, with the
kernel parameter 'amd_iommu_dump' appended.
2. The ACPI tables (# acpidump > acpi-table) -> then send out the file 'acpi-table'.
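
Something along these lines should produce both (a sketch assuming GRUB;
acpidump comes from the acpica-tools package):

# 1. Append 'amd_iommu_dump' to the kernel command line for both the
#    system and kdump kernels (e.g. via /etc/default/grub), then:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
# 2. After rebooting, capture the boot log and the ACPI tables:
dmesg > boot-log
sudo acpidump > acpi-table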

-- Adrian
 



RE: Re: dma_alloc_coherent not allocating memory from CMA Reserved

2020-09-23 Thread Sathyavathi M



Hi Robin,
 
My question is: during boot-up, the CMA was reserved at  [ 0.014538] *cma: Reserved 400 MiB at 0x000205c0* 
But when we do dma_alloc_coherent in the driver, the address is different:  f800
The address  f800 doesn't lie in the range of the CMA reserved at 0x000205c0. Is this still valid?
 
I have one more question regarding CMA. If the system has Intel VT-d enabled, then dma_alloc_coherent for the VF fails. 
 
Regards,
Sathya
 
- Original Message -
Sender : Robin Murphy 
Date : 2020-09-23 19:23 (GMT+5:30)
Title : Re: dma_alloc_coherent not allocating memory from CMA Reserved
To : Sathyavathi M
 
On 2020-09-23 08:43, Sathyavathi M wrote:
> Hi All,
> 
> I am trying to allocate 33 MB of coherent memory in a kernel driver, and for that
> I have reserved a 1024 MB CMA. But from dmesg I can see that the address reserved
> for CMA is different from what I get with dma_alloc_coherent. My PC
> is an Intel x86 machine and I have tried different motherboards, but this issue
> occurs only on some specific PCs. Please help me in debugging the actual issue;
> below are the kernel boot-up logs.
> 
> 
> [ 0.014362] No NUMA configuration found
> [ 0.014363] Faking a node at [mem 0x-0x00021edf]
> [ 0.014374] NODE_DATA(0) allocated [mem 0x21edd5000-0x21edf]
> [ 0.014538] *cma: Reserved 400 MiB at 0x000205c0*
> [ 0.014541] Reserving 640MB of memory at 2512MB for crashkernel (System RAM: 8046MB)
> [ 0.014553] Zone ranges:
> [ 0.014554] DMA [mem 0x1000-0x00ff]
> [ 0.014554] DMA32 [mem 0x0100-0x]
> [ 0.014555] Normal [mem 0x0001-0x00021edf]
> 
> 
> *at dma_alloc_coherent call
> [ 27.816062] dev->dma_33M_addr is f800*
 
I'm confused - if you got a DMA address back, then the allocation must 
have succeeded, so what exactly is the issue there? If the allocator 
managed to find a suitable amount of memory that your device can 
address, does it really matter exactly where it came from?
 
Also bear in mind that a DMA address is not necessarily the same as a 
physical address anyway, if for instance you have an IOMMU or other 
forms of interconnect translation. In fact the default behaviour of the 
intel-iommu driver for a PCI device is to allocate DMA addresses in 
naturally-aligned regions downwards from 4GB - if the first request in 
an empty address space was for 33MB, it would get rounded up to 64MB, 
wouldn't fit at 4GB - 64MB (0xfc000000) because that would clash with 
the IOAPIC region, so would end up at 4GB - 128MB (0xf8000000). Now 
where have I seen that before?...
 
Robin.
 
> ---
> and below are the logs in working case, at driver dma_alloc_coherent api call we
> have address which is in range of what reserved for cma.
> at boot i get..
> Faking a node at [mem 0x-0x00019fdf]
> NODE_DATA(0) allocated [mem 0x19fdd3000-0x19fdfdfff]
> *cma: Reserved 800 MiB at 0x00016dc0*
> Reserving 640MB of memory at 1792MB for crashkernel (System RAM: 6016MB)
> Zone ranges:
> DMA [mem 0x1000-0x00ff]
> DMA32 [mem 0x0100-0x]
> Normal [mem 0x0001-0x00019fdf]
> 
> *at dma_alloc_coherent call
> dev->dma_33M_addr is 16e20*
> 
> Please help me in solving this issue, or suggest any alternative way to
> allocate a big coherent buffer.
> 
> Regards,
> 
> Sathya
> 
> 
 
 
Regards,
Sathya


 


Re: dma_alloc_coherent not allocating memory from CMA Reserved

2020-09-23 Thread Robin Murphy

On 2020-09-23 08:43, Sathyavathi M wrote:

Hi All,

I am trying to allocate 33 MB of coherent memory in a kernel driver, and for that
I have reserved a 1024 MB CMA. But from dmesg I can see that the address reserved
for CMA is different from what I get with dma_alloc_coherent. My PC
is an Intel x86 machine and I have tried different motherboards, but this issue
occurs only on some specific PCs. Please help me in debugging the actual issue;
below are the kernel boot-up logs.


[ 0.014362] No NUMA configuration found
[ 0.014363] Faking a node at [mem 0x-0x00021edf]
[ 0.014374] NODE_DATA(0) allocated [mem 0x21edd5000-0x21edf]
[ 0.014538] *cma: Reserved 400 MiB at 0x000205c0*
[ 0.014541] Reserving 640MB of memory at 2512MB for crashkernel (System RAM: 
8046MB)
[ 0.014553] Zone ranges:
[ 0.014554] DMA [mem 0x1000-0x00ff]
[ 0.014554] DMA32 [mem 0x0100-0x]
[ 0.014555] Normal [mem 0x0001-0x00021edf]


*at dma_alloc_coherent call
[ 27.816062] dev->dma_33M_addr is f800*


I'm confused - if you got a DMA address back, then the allocation must 
have succeeded, so what exactly is the issue there? If the allocator 
managed to find a suitable amount of memory that your device can 
address, does it really matter exactly where it came from?


Also bear in mind that a DMA address is not necessarily the same as a 
physical address anyway, if for instance you have an IOMMU or other 
forms of interconnect translation. In fact the default behaviour of the 
intel-iommu driver for a PCI device is to allocate DMA addresses in 
naturally-aligned regions downwards from 4GB - if the first request in 
an empty address space was for 33MB, it would get rounded up to 64MB, 
wouldn't fit at 4GB - 64MB (0xfc000000) because that would clash with 
the IOAPIC region, so would end up at 4GB - 128MB (0xf8000000). Now 
where have I seen that before?...


Robin.
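
A standalone check of the arithmetic above (not kernel code): a 33MB
request rounds up to a 64MB naturally-aligned region, 4GB - 64MB collides
with the IOAPIC area, so the allocator lands one region lower:

#include <stdio.h>

int main(void)
{
	unsigned long long four_gb = 1ULL << 32;

	printf("4GB - 64MB  = 0x%llx\n", four_gb - (64ULL << 20));  /* 0xfc000000 */
	printf("4GB - 128MB = 0x%llx\n", four_gb - (128ULL << 20)); /* 0xf8000000 */
	return 0;
}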


---
and below are the logs in working case, at driver dma_alloc_coherent api call we
have address which is in range of what reserved for cma.
at boot i get..
Faking a node at [mem 0x-0x00019fdf]
NODE_DATA(0) allocated [mem 0x19fdd3000-0x19fdfdfff]
*cma: Reserved 800 MiB at 0x00016dc0*
Reserving 640MB of memory at 1792MB for crashkernel (System RAM: 6016MB)
Zone ranges:
DMA [mem 0x1000-0x00ff]
DMA32 [mem 0x0100-0x]
Normal [mem 0x0001-0x00019fdf]

*at dma_alloc_coherent call
dev->dma_33M_addr is 16e20*

Please help me in solving this issue, or suggest any alternative way to
allocate a big coherent buffer.

Regards,

Sathya




Re: IOVA allocation dependency between firmware buffer and remaining buffers

2020-09-23 Thread Christoph Hellwig
On Wed, Sep 23, 2020 at 01:15:33PM +0530, Ajay kumar wrote:
> Hello all,
> 
> We pretty much tried to solve the same issue here with a new API in DMA-IOMMU:
> https://lore.kernel.org/linux-iommu/20200811054912.ga...@infradead.org/T/
> 
> Christoph - the user part would be MFC devices on Exynos platforms

I still think we:

 a) need to wire it up through the DMA API with an ops vector,
and an error for devices not using dma-iommu
 b) submit it together with actual users (like the series from Marek)
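
A very rough sketch of (a), with every name hypothetical rather than an
existing kernel interface: an optional dma_map_ops callback that only the
dma-iommu backend would implement, so every other device gets a clean error.

struct dma_map_ops {
	/* ... existing callbacks ... */
	int (*reserve_iova_region)(struct device *dev, dma_addr_t base,
				   u64 size);	/* hypothetical */
};

static inline int dma_reserve_iova_region(struct device *dev,
					  dma_addr_t base, u64 size)
{
	const struct dma_map_ops *ops = get_dma_ops(dev);

	/* Error out for devices not backed by dma-iommu. */
	if (!ops || !ops->reserve_iova_region)
		return -EOPNOTSUPP;
	return ops->reserve_iova_region(dev, base, size);
}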


[PATCH v2 3/3] iommu: amd: Re-purpose Exclusion range registers to support SNP CWWB

2020-09-23 Thread Suravee Suthikulpanit
When the IOMMU SNP support bit is set in the IOMMU Extended Features
register, hardware re-purposes the following registers:

1. IOMMU Exclusion Base register (MMIO offset 0020h) to
   Completion Wait Write-Back (CWWB) Base register

2. IOMMU Exclusion Range Limit (MMIO offset 0028h) to
   Completion Wait Write-Back (CWWB) Range Limit register

and requires the IOMMU CWWB semaphore base and range to be programmed
in the register offset 0020h and 0028h accordingly.

Cc: Brijesh Singh 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu_types.h |  1 +
 drivers/iommu/amd/init.c| 26 ++
 2 files changed, 27 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 1e7966c73707..f696ac7c5f89 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -93,6 +93,7 @@
 #define FEATURE_PC (1ULL<<9)
 #define FEATURE_GAM_VAPIC  (1ULL<<21)
 #define FEATURE_EPHSUP (1ULL<<50)
+#define FEATURE_SNP(1ULL<<63)
 
 #define FEATURE_PASID_SHIFT32
 #define FEATURE_PASID_MASK (0x1fULL << FEATURE_PASID_SHIFT)
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index febc072f2717..c55df4347487 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -359,6 +359,29 @@ static void iommu_set_exclusion_range(struct amd_iommu 
*iommu)
&entry, sizeof(entry));
 }
 
+static void iommu_set_cwwb_range(struct amd_iommu *iommu)
+{
+   u64 start = iommu_virt_to_phys((void *)iommu->cmd_sem);
+   u64 entry = start & PM_ADDR_MASK;
+
+   if (!iommu_feature(iommu, FEATURE_SNP))
+   return;
+
+   /* Note:
+* Re-purpose Exclusion base/limit registers for Completion wait
+* write-back base/limit.
+*/
+   memcpy_toio(iommu->mmio_base + MMIO_EXCL_BASE_OFFSET,
+   &entry, sizeof(entry));
+
+   /* Note:
+* Default to 4 Kbytes, which can be specified by setting base
+* address equal to the limit address.
+*/
+   memcpy_toio(iommu->mmio_base + MMIO_EXCL_LIMIT_OFFSET,
+   &entry, sizeof(entry));
+}
+
 /* Programs the physical address of the device table into the IOMMU hardware */
 static void iommu_set_device_table(struct amd_iommu *iommu)
 {
@@ -1901,6 +1924,9 @@ static int __init amd_iommu_init_pci(void)
ret = iommu_init_pci(iommu);
if (ret)
break;
+
+   /* Need to setup range after PCI init */
+   iommu_set_cwwb_range(iommu);
}
 
/*
-- 
2.17.1



[PATCH v2 0/3] amd : iommu : Initial IOMMU support for SNP

2020-09-23 Thread Suravee Suthikulpanit
Introducing support for AMD Secure Nested Paging (SNP) with the IOMMU,
which mainly affects the use of the IOMMU Exclusion Base and Range Limit
registers. Note that these registers are no longer used by the Linux IOMMU
driver. Patches 2 and 3 are SNP-specific and discuss the detail of
the implementation.

In order to support SNP, the current Completion Wait Write-back logic
is modified (patch 1/3). This change is independent of SNP.

Please see the following white paper for more info on SNP:
  
https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf
 

Changes from V1: (https://lkml.org/lkml/2020/9/16/455)
- Patch 2/3: Fix up per Joerg's comments

Thank you,
Suravee

Suravee Suthikulpanit (3):
  iommu: amd: Use 4K page for completion wait write-back semaphore
  iommu: amd: Add support for RMP_PAGE_FAULT and RMP_HW_ERR
  iommu: amd: Re-purpose Exclusion range registers to support SNP CWWB

 drivers/iommu/amd/amd_iommu_types.h |  6 +-
 drivers/iommu/amd/init.c| 44 ++
 drivers/iommu/amd/iommu.c   | 90 +
 3 files changed, 127 insertions(+), 13 deletions(-)

-- 
2.17.1



[PATCH v2 1/3] iommu: amd: Use 4K page for completion wait write-back semaphore

2020-09-23 Thread Suravee Suthikulpanit
IOMMU SNP support requires the completion wait write-back semaphore to be
implemented using a 4K-aligned page, where the page address is to be
programmed into the newly introduced MMIO base/range registers.

This new scheme uses a per-iommu atomic variable to store the current
semaphore value, which is incremented for every completion wait command.

Since this new scheme is also compatible with non-SNP mode,
generalize the driver to use 4K page for completion-wait semaphore in
both modes.

Cc: Brijesh Singh 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu_types.h |  3 ++-
 drivers/iommu/amd/init.c| 18 ++
 drivers/iommu/amd/iommu.c   | 23 +++
 3 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 30a5d412255a..4c80483e78ec 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -595,7 +595,8 @@ struct amd_iommu {
 #endif
 
u32 flags;
-   volatile u64 __aligned(8) cmd_sem;
+   volatile u64 *cmd_sem;
+   u64 cmd_sem_val;
 
 #ifdef CONFIG_AMD_IOMMU_DEBUGFS
/* DebugFS Info */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 445a08d23fed..febc072f2717 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -813,6 +813,19 @@ static int iommu_init_ga(struct amd_iommu *iommu)
return ret;
 }
 
+static int __init alloc_cwwb_sem(struct amd_iommu *iommu)
+{
+   iommu->cmd_sem = (void *)get_zeroed_page(GFP_KERNEL);
+
+   return iommu->cmd_sem ? 0 : -ENOMEM;
+}
+
+static void __init free_cwwb_sem(struct amd_iommu *iommu)
+{
+   if (iommu->cmd_sem)
+   free_page((unsigned long)iommu->cmd_sem);
+}
+
 static void iommu_enable_xt(struct amd_iommu *iommu)
 {
 #ifdef CONFIG_IRQ_REMAP
@@ -1395,6 +1408,7 @@ static int __init init_iommu_from_acpi(struct amd_iommu 
*iommu,
 
 static void __init free_iommu_one(struct amd_iommu *iommu)
 {
+   free_cwwb_sem(iommu);
free_command_buffer(iommu);
free_event_buffer(iommu);
free_ppr_log(iommu);
@@ -1481,6 +1495,7 @@ static int __init init_iommu_one(struct amd_iommu *iommu, 
struct ivhd_header *h)
int ret;
 
raw_spin_lock_init(>lock);
+   iommu->cmd_sem_val = 0;
 
/* Add IOMMU to internal data structures */
list_add_tail(>list, _iommu_list);
@@ -1558,6 +1573,9 @@ static int __init init_iommu_one(struct amd_iommu *iommu, 
struct ivhd_header *h)
if (!iommu->mmio_base)
return -ENOMEM;
 
+   if (alloc_cwwb_sem(iommu))
+   return -ENOMEM;
+
if (alloc_command_buffer(iommu))
return -ENOMEM;
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 10e4200d3552..9e9898683537 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -792,11 +792,11 @@ irqreturn_t amd_iommu_int_handler(int irq, void *data)
  *
  /
 
-static int wait_on_sem(volatile u64 *sem)
+static int wait_on_sem(struct amd_iommu *iommu, u64 data)
 {
int i = 0;
 
-   while (*sem == 0 && i < LOOP_TIMEOUT) {
+   while (*iommu->cmd_sem != data && i < LOOP_TIMEOUT) {
udelay(1);
i += 1;
}
@@ -827,16 +827,16 @@ static void copy_cmd_to_buffer(struct amd_iommu *iommu,
writel(tail, iommu->mmio_base + MMIO_CMD_TAIL_OFFSET);
 }
 
-static void build_completion_wait(struct iommu_cmd *cmd, u64 address)
+static void build_completion_wait(struct iommu_cmd *cmd,
+ struct amd_iommu *iommu,
+ u64 data)
 {
-   u64 paddr = iommu_virt_to_phys((void *)address);
-
-   WARN_ON(address & 0x7ULL);
+   u64 paddr = iommu_virt_to_phys((void *)iommu->cmd_sem);
 
memset(cmd, 0, sizeof(*cmd));
cmd->data[0] = lower_32_bits(paddr) | CMD_COMPL_WAIT_STORE_MASK;
cmd->data[1] = upper_32_bits(paddr);
-   cmd->data[2] = 1;
+   cmd->data[2] = data;
CMD_SET_TYPE(cmd, CMD_COMPL_WAIT);
 }
 
@@ -1045,22 +1045,21 @@ static int iommu_completion_wait(struct amd_iommu 
*iommu)
struct iommu_cmd cmd;
unsigned long flags;
int ret;
+   u64 data;
 
if (!iommu->need_sync)
return 0;
 
-
-   build_completion_wait(&cmd, (u64)&iommu->cmd_sem);
-
raw_spin_lock_irqsave(&iommu->lock, flags);
 
-   iommu->cmd_sem = 0;
+   data = ++iommu->cmd_sem_val;
+   build_completion_wait(&cmd, iommu, data);
 
ret = __iommu_queue_command_sync(iommu, &cmd, false);
if (ret)
goto out_unlock;
 
-   ret = wait_on_sem(&iommu->cmd_sem);
+   ret = wait_on_sem(iommu, data);
 
 out_unlock:
raw_spin_unlock_irqrestore(&iommu->lock, flags);
-- 
2.17.1


[PATCH v2 2/3] iommu: amd: Add support for RMP_PAGE_FAULT and RMP_HW_ERR

2020-09-23 Thread Suravee Suthikulpanit
IOMMU SNP support introduces two new IOMMU events:
  * RMP Page Fault event
  * RMP Hardware Error event

Hence, add reporting functions for these events.

Cc: Brijesh Singh 
Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu_types.h |  2 +
 drivers/iommu/amd/iommu.c   | 67 +
 2 files changed, 69 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 4c80483e78ec..1e7966c73707 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -128,6 +128,8 @@
 #define EVENT_TYPE_IOTLB_INV_TO0x7
 #define EVENT_TYPE_INV_DEV_REQ 0x8
 #define EVENT_TYPE_INV_PPR_REQ 0x9
+#define EVENT_TYPE_RMP_FAULT   0xd
+#define EVENT_TYPE_RMP_HW_ERR  0xe
 #define EVENT_DEVID_MASK   0x
 #define EVENT_DEVID_SHIFT  0
 #define EVENT_DOMID_MASK_LO0x
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 9e9898683537..ea64fa8a9418 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -486,6 +486,67 @@ static void dump_command(unsigned long phys_addr)
pr_err("CMD[%d]: %08x\n", i, cmd->data[i]);
 }
 
+static void amd_iommu_report_rmp_hw_error(volatile u32 *event)
+{
+   struct iommu_dev_data *dev_data = NULL;
+   int devid, vmg_tag, flags;
+   struct pci_dev *pdev;
+   u64 spa;
+
+   devid   = (event[0] >> EVENT_DEVID_SHIFT) & EVENT_DEVID_MASK;
+   vmg_tag = (event[1]) & 0x;
+   flags   = (event[1] >> EVENT_FLAGS_SHIFT) & EVENT_FLAGS_MASK;
+   spa = ((u64)event[3] << 32) | (event[2] & 0xFFF8);
+
+   pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid),
+  devid & 0xff);
+   if (pdev)
+   dev_data = dev_iommu_priv_get(&pdev->dev);
+
+   if (dev_data && __ratelimit(&dev_data->rs)) {
+   pci_err(pdev, "Event logged [RMP_HW_ERROR vmg_tag=0x%04x, 
spa=0x%llx, flags=0x%04x]\n",
+   vmg_tag, spa, flags);
+   } else {
+   pr_err_ratelimited("Event logged [RMP_HW_ERROR 
device=%02x:%02x.%x, vmg_tag=0x%04x, spa=0x%llx, flags=0x%04x]\n",
+   PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+   vmg_tag, spa, flags);
+   }
+
+   if (pdev)
+   pci_dev_put(pdev);
+}
+
+static void amd_iommu_report_rmp_fault(volatile u32 *event)
+{
+   struct iommu_dev_data *dev_data = NULL;
+   int devid, flags_rmp, vmg_tag, flags;
+   struct pci_dev *pdev;
+   u64 gpa;
+
+   devid = (event[0] >> EVENT_DEVID_SHIFT) & EVENT_DEVID_MASK;
+   flags_rmp = (event[0] >> EVENT_FLAGS_SHIFT) & 0xFF;
+   vmg_tag   = (event[1]) & 0x;
+   flags = (event[1] >> EVENT_FLAGS_SHIFT) & EVENT_FLAGS_MASK;
+   gpa   = ((u64)event[3] << 32) | event[2];
+
+   pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid),
+  devid & 0xff);
+   if (pdev)
+   dev_data = dev_iommu_priv_get(&pdev->dev);
+
+   if (dev_data && __ratelimit(&dev_data->rs)) {
+   pci_err(pdev, "Event logged [RMP_PAGE_FAULT vmg_tag=0x%04x, 
gpa=0x%llx, flags_rmp=0x%04x, flags=0x%04x]\n",
+   vmg_tag, gpa, flags_rmp, flags);
+   } else {
+   pr_err_ratelimited("Event logged [RMP_PAGE_FAULT 
device=%02x:%02x.%x, vmg_tag=0x%04x, gpa=0x%llx, flags_rmp=0x%04x, 
flags=0x%04x]\n",
+   PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+   vmg_tag, gpa, flags_rmp, flags);
+   }
+
+   if (pdev)
+   pci_dev_put(pdev);
+}
+
 static void amd_iommu_report_page_fault(u16 devid, u16 domain_id,
u64 address, int flags)
 {
@@ -577,6 +638,12 @@ static void iommu_print_event(struct amd_iommu *iommu, 
void *__evt)
PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
pasid, address, flags);
break;
+   case EVENT_TYPE_RMP_FAULT:
+   amd_iommu_report_rmp_fault(event);
+   break;
+   case EVENT_TYPE_RMP_HW_ERR:
+   amd_iommu_report_rmp_hw_error(event);
+   break;
case EVENT_TYPE_INV_PPR_REQ:
pasid = PPR_PASID(*((u64 *)__evt));
tag = event[1] & 0x03FF;
-- 
2.17.1



Re: [PATCH v10 01/11] vfio: VFIO_IOMMU_SET_PASID_TABLE

2020-09-23 Thread Auger Eric
Hi Zenghui,

On 9/23/20 1:27 PM, Zenghui Yu wrote:
> Hi Eric,
> 
> On 2020/3/21 0:19, Eric Auger wrote:
>> From: "Liu, Yi L" 
>>
>> This patch adds a VFIO_IOMMU_SET_PASID_TABLE ioctl
>> which aims to pass the virtual iommu guest configuration
>> to the host. The latter takes the form of the so-called
>> PASID table.
>>
>> Signed-off-by: Jacob Pan 
>> Signed-off-by: Liu, Yi L 
>> Signed-off-by: Eric Auger 
> 
> [...]
> 
>> diff --git a/drivers/vfio/vfio_iommu_type1.c
>> b/drivers/vfio/vfio_iommu_type1.c
>> index a177bf2c6683..bfacbd876ee1 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -2172,6 +2172,43 @@ static int vfio_iommu_iova_build_caps(struct
>> vfio_iommu *iommu,
>>   return ret;
>>   }
>>   +static void
>> +vfio_detach_pasid_table(struct vfio_iommu *iommu)
>> +{
>> +    struct vfio_domain *d;
>> +
>> +    mutex_lock(&iommu->lock);
>> +
>> +    list_for_each_entry(d, &iommu->domain_list, next) {
>> +    iommu_detach_pasid_table(d->domain);
>> +    }
>> +    mutex_unlock(&iommu->lock);
>> +}
>> +
>> +static int
>> +vfio_attach_pasid_table(struct vfio_iommu *iommu,
>> +    struct vfio_iommu_type1_set_pasid_table *ustruct)
>> +{
>> +    struct vfio_domain *d;
>> +    int ret = 0;
>> +
>> +    mutex_lock(&iommu->lock);
>> +
>> +    list_for_each_entry(d, &iommu->domain_list, next) {
>> +    ret = iommu_attach_pasid_table(d->domain, &ustruct->config);
>> +    if (ret)
>> +    goto unwind;
>> +    }
>> +    goto unlock;
>> +unwind:
>> +    list_for_each_entry_continue_reverse(d, &iommu->domain_list, next) {
>> +    iommu_detach_pasid_table(d->domain);
>> +    }
>> +unlock:
>> +    mutex_unlock(&iommu->lock);
>> +    return ret;
>> +}
>> +
>>   static long vfio_iommu_type1_ioctl(void *iommu_data,
>>  unsigned int cmd, unsigned long arg)
>>   {
>> @@ -2276,6 +2313,25 @@ static long vfio_iommu_type1_ioctl(void
>> *iommu_data,
>>   return copy_to_user((void __user *)arg, &info, minsz) ?
>>   -EFAULT : 0;
>> +    } else if (cmd == VFIO_IOMMU_SET_PASID_TABLE) {
>> +    struct vfio_iommu_type1_set_pasid_table ustruct;
>> +
>> +    minsz = offsetofend(struct vfio_iommu_type1_set_pasid_table,
>> +    config);
>> +
>> +    if (copy_from_user(&ustruct, (void __user *)arg, minsz))
>> +    return -EFAULT;
>> +
>> +    if (ustruct.argsz < minsz)
>> +    return -EINVAL;
>> +
>> +    if (ustruct.flags & VFIO_PASID_TABLE_FLAG_SET)
>> +    return vfio_attach_pasid_table(iommu, &ustruct);
>> +    else if (ustruct.flags & VFIO_PASID_TABLE_FLAG_UNSET) {
>> +    vfio_detach_pasid_table(iommu);
>> +    return 0;
>> +    } else
>> +    return -EINVAL;
> 
> Nit:
> 
> What if user-space blindly set both flags? Should we check that only one
> flag is allowed to be set at this stage, and return error otherwise?
Indeed I can check that.
> 
> Besides, before going through the whole series [1][2], I'd like to know
> if this is the latest version of your Nested-Stage-Setup work in case I
> had missed something.
> 
> [1] https://lore.kernel.org/r/20200320161911.27494-1-eric.au...@redhat.com
> [2] https://lore.kernel.org/r/20200414150607.28488-1-eric.au...@redhat.com

yes those 2 series are the last ones. Thank you for reviewing.

FYI, I intend to respin within a week or two on top of Jacob's [PATCH v10
0/7] IOMMU user API enhancement. But functionally there won't be a lot of
changes.

Thanks

Eric
> 
> 
> Thanks,
> Zenghui
> 


Re: [PATCH v10 01/11] vfio: VFIO_IOMMU_SET_PASID_TABLE

2020-09-23 Thread Zenghui Yu

Hi Eric,

On 2020/3/21 0:19, Eric Auger wrote:

From: "Liu, Yi L" 

This patch adds a VFIO_IOMMU_SET_PASID_TABLE ioctl
which aims to pass the virtual iommu guest configuration
to the host. The latter takes the form of the so-called
PASID table.

Signed-off-by: Jacob Pan 
Signed-off-by: Liu, Yi L 
Signed-off-by: Eric Auger 


[...]


diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index a177bf2c6683..bfacbd876ee1 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2172,6 +2172,43 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu 
*iommu,
return ret;
  }
  
+static void
+vfio_detach_pasid_table(struct vfio_iommu *iommu)
+{
+   struct vfio_domain *d;
+
+   mutex_lock(&iommu->lock);
+
+   list_for_each_entry(d, &iommu->domain_list, next) {
+   iommu_detach_pasid_table(d->domain);
+   }
+   mutex_unlock(&iommu->lock);
+}
+
+static int
+vfio_attach_pasid_table(struct vfio_iommu *iommu,
+   struct vfio_iommu_type1_set_pasid_table *ustruct)
+{
+   struct vfio_domain *d;
+   int ret = 0;
+
+   mutex_lock(&iommu->lock);
+
+   list_for_each_entry(d, &iommu->domain_list, next) {
+   ret = iommu_attach_pasid_table(d->domain, &ustruct->config);
+   if (ret)
+   goto unwind;
+   }
+   goto unlock;
+unwind:
+   list_for_each_entry_continue_reverse(d, &iommu->domain_list, next) {
+   iommu_detach_pasid_table(d->domain);
+   }
+unlock:
+   mutex_unlock(&iommu->lock);
+   return ret;
+}
+
  static long vfio_iommu_type1_ioctl(void *iommu_data,
   unsigned int cmd, unsigned long arg)
  {
@@ -2276,6 +2313,25 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
  
		return copy_to_user((void __user *)arg, &info, minsz) ?

-EFAULT : 0;
+   } else if (cmd == VFIO_IOMMU_SET_PASID_TABLE) {
+   struct vfio_iommu_type1_set_pasid_table ustruct;
+
+   minsz = offsetofend(struct vfio_iommu_type1_set_pasid_table,
+   config);
+
+   if (copy_from_user(&ustruct, (void __user *)arg, minsz))
+   return -EFAULT;
+
+   if (ustruct.argsz < minsz)
+   return -EINVAL;
+
+   if (ustruct.flags & VFIO_PASID_TABLE_FLAG_SET)
+   return vfio_attach_pasid_table(iommu, &ustruct);
+   else if (ustruct.flags & VFIO_PASID_TABLE_FLAG_UNSET) {
+   vfio_detach_pasid_table(iommu);
+   return 0;
+   } else
+   return -EINVAL;


Nit:

What if user-space blindly set both flags? Should we check that only one
flag is allowed to be set at this stage, and return error otherwise?
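
One way to enforce that, as a sketch: require exactly one of the two flags
before dispatching.

	u32 flags = ustruct.flags & (VFIO_PASID_TABLE_FLAG_SET |
				     VFIO_PASID_TABLE_FLAG_UNSET);

	if (hweight32(flags) != 1)
		return -EINVAL;

	if (flags & VFIO_PASID_TABLE_FLAG_SET)
		return vfio_attach_pasid_table(iommu, &ustruct);
	vfio_detach_pasid_table(iommu);
	return 0;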

Besides, before going through the whole series [1][2], I'd like to know
if this is the latest version of your Nested-Stage-Setup work in case I
had missed something.

[1] https://lore.kernel.org/r/20200320161911.27494-1-eric.au...@redhat.com
[2] https://lore.kernel.org/r/20200414150607.28488-1-eric.au...@redhat.com


Thanks,
Zenghui
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/3] iommu: amd: Add support for RMP_PAGE_FAULT and RMP_HW_ERR

2020-09-23 Thread Suravee Suthikulpanit




On 9/18/20 4:31 PM, Joerg Roedel wrote:

Hi Suravee,

On Wed, Sep 16, 2020 at 01:55:48PM +, Suravee Suthikulpanit wrote:

+static void amd_iommu_report_rmp_hw_error(volatile u32 *event)
+{
+   struct pci_dev *pdev;
+   struct iommu_dev_data *dev_data = NULL;
+   int devid = (event[0] >> EVENT_DEVID_SHIFT) & EVENT_DEVID_MASK;
+   int vmg_tag   = (event[1]) & 0xFFFF;
+   int flags = (event[1] >> EVENT_FLAGS_SHIFT) & EVENT_FLAGS_MASK;
+   u64 spa   = ((u64)event[3] << 32) | (event[2] & 0xFFFFFFF8);


Please write this as:

struct iommu_dev_data *dev_data = NULL;
int devid, vmg_tag, flags;
struct pci_dev *pdev;
u64 spa;

devid   = (event[0] >> EVENT_DEVID_SHIFT) & EVENT_DEVID_MASK;
vmg_tag = (event[1]) & 0xFFFF;
flags   = (event[1] >> EVENT_FLAGS_SHIFT) & EVENT_FLAGS_MASK;
spa = ((u64)event[3] << 32) | (event[2] & 0xFFFFFFF8);

Same applies to the next function.


+
+   pdev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(devid),
+  devid & 0xff);
+   if (pdev)
+   dev_data = dev_iommu_priv_get(&pdev->dev);
+
+   if (dev_data && __ratelimit(&dev_data->rs)) {
+   pci_err(pdev, "Event logged [RMP_HW_ERROR devid=0x%04x, vmg_tag=0x%04x, spa=0x%llx, flags=0x%04x]\n",
+   devid, vmg_tag, spa, flags);


Printing the devid is not really needed here, no? Same issue in the next
function.


I'll update the patch and will send out V2.

Thanks,
Suravee
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 11/13] iommu: amd: Introduce iommu_v1_iova_to_phys

2020-09-23 Thread Suravee Suthikulpanit
This implements iova_to_phys for AMD IOMMU v1 pagetable,
which will be used by the IO page table framework.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/io_pgtable.c | 21 +
 drivers/iommu/amd/iommu.c  | 16 +---
 2 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index f913fc7b1e58..2f36bab23516 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -525,6 +525,26 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
return unmapped;
 }
 
+static phys_addr_t iommu_v1_iova_to_phys(struct io_pgtable_ops *ops, unsigned long iova)
+{
+   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
+   unsigned long offset_mask, pte_pgsize;
+   u64 *pte, __pte;
+
+   if (pgtable->mode == PAGE_MODE_NONE)
+   return iova;
+
+   pte = fetch_pte(pgtable, iova, &pte_pgsize);
+
+   if (!pte || !IOMMU_PTE_PRESENT(*pte))
+   return 0;
+
+   offset_mask = pte_pgsize - 1;
+   __pte   = __sme_clr(*pte & PM_ADDR_MASK);
+
+   return (__pte & ~offset_mask) | (iova & offset_mask);
+}
+
 struct io_pgtable_ops *amd_iommu_setup_io_pgtable_ops(struct iommu_dev_data *dev_data,
 struct protection_domain *domain)
 {
@@ -551,6 +571,7 @@ static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
 {
struct protection_domain *pdom = (struct protection_domain *)cookie;
 
+   pdom->iop.iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
return &pdom->iop.iop;
 }
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 87cea1cde414..9a1a16031e00 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2079,22 +2079,8 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 {
struct protection_domain *domain = to_pdomain(dom);
struct io_pgtable_ops *ops = &domain->iop.iop.ops;
-   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
-   unsigned long offset_mask, pte_pgsize;
-   u64 *pte, __pte;
 
-   if (domain->iop.mode == PAGE_MODE_NONE)
-   return iova;
-
-   pte = fetch_pte(pgtable, iova, &pte_pgsize);
-
-   if (!pte || !IOMMU_PTE_PRESENT(*pte))
-   return 0;
-
-   offset_mask = pte_pgsize - 1;
-   __pte   = __sme_clr(*pte & PM_ADDR_MASK);
-
-   return (__pte & ~offset_mask) | (iova & offset_mask);
+   return ops->iova_to_phys(ops, iova);
 }
 
 static bool amd_iommu_capable(enum iommu_cap cap)
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 03/13] iommu: amd: Move pt_root to struct amd_io_pgtable

2020-09-23 Thread Suravee Suthikulpanit
To better organize the data structure, since it contains IO page
table-related information.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   | 2 +-
 drivers/iommu/amd/amd_iommu_types.h | 2 +-
 drivers/iommu/amd/iommu.c   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 97cdb235ce69..da6e09657e00 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -96,7 +96,7 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
 static inline
 void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
 {
-   atomic64_set(&domain->pt_root, root);
+   atomic64_set(&domain->iop.pt_root, root);
 }
 
 static inline
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 77cd8d966fbc..5d53b7bec256 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -490,6 +490,7 @@ struct amd_io_pgtable {
struct io_pgtable   iop;
int mode;
u64 *root;
+   atomic64_t pt_root; /* pgtable root and pgtable mode */
 };
 
 /*
@@ -503,7 +504,6 @@ struct protection_domain {
struct amd_io_pgtable iop;
spinlock_t lock;/* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
-   atomic64_t pt_root; /* pgtable root and pgtable mode */
int glx;/* Number of levels for GCR3 table */
u64 *gcr3_tbl;  /* Guest CR3 table */
unsigned long flags;/* flags to find out type of domain */
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2b7eb51dcbb8..c8b8619cc744 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -146,7 +146,7 @@ static struct protection_domain *to_pdomain(struct iommu_domain *dom)
 static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
 struct domain_pgtable *pgtable)
 {
-   u64 pt_root = atomic64_read(&domain->pt_root);
+   u64 pt_root = atomic64_read(&domain->iop.pt_root);
 
pgtable->root = (u64 *)(pt_root & PAGE_MASK);
pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 12/13] iommu: amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page

2020-09-23 Thread Suravee Suthikulpanit
These implement map and unmap for AMD IOMMU v1 pagetable, which
will be used by the IO pagetable framework.

Also clean up unused extern function declarations.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h  | 13 -
 drivers/iommu/amd/io_pgtable.c | 25 -
 drivers/iommu/amd/iommu.c  |  7 ---
 3 files changed, 16 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 69996e57fae2..2e8dc2a1ec0f 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -124,19 +124,6 @@ void amd_iommu_apply_ivrs_quirks(void);
 static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif
 
-/* TODO: These are temporary and will be removed once fully transition */
-extern int iommu_map_page(struct protection_domain *dom,
- unsigned long bus_addr,
- unsigned long phys_addr,
- unsigned long page_size,
- int prot,
- gfp_t gfp);
-extern unsigned long iommu_unmap_page(struct protection_domain *dom,
- unsigned long bus_addr,
- unsigned long page_size);
-extern u64 *fetch_pte(struct amd_io_pgtable *pgtable,
- unsigned long address,
- unsigned long *page_size);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode);
 extern void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable);
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 2f36bab23516..7f7be302c538 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -348,9 +348,9 @@ static u64 *alloc_pte(struct protection_domain *domain,
  * This function checks if there is a PTE for a given dma address. If
  * there is one, it returns the pointer to it.
  */
-u64 *fetch_pte(struct amd_io_pgtable *pgtable,
-  unsigned long address,
-  unsigned long *page_size)
+static u64 *fetch_pte(struct amd_io_pgtable *pgtable,
+ unsigned long address,
+ unsigned long *page_size)
 {
int level;
u64 *pte;
@@ -423,13 +423,10 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
  * supporting all features of AMD IOMMU page tables like level skipping
  * and full 64 bit address spaces.
  */
-int iommu_map_page(struct protection_domain *dom,
-  unsigned long iova,
-  unsigned long paddr,
-  unsigned long size,
-  int prot,
-  gfp_t gfp)
+static int iommu_v1_map_page(struct io_pgtable_ops *ops, unsigned long iova,
+ phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
 {
+   struct protection_domain *dom = io_pgtable_ops_to_domain(ops);
struct page *freelist = NULL;
bool updated = false;
u64 __pte, *pte;
@@ -492,11 +489,11 @@ int iommu_map_page(struct protection_domain *dom,
return ret;
 }
 
-unsigned long iommu_unmap_page(struct protection_domain *dom,
-  unsigned long iova,
-  unsigned long size)
+static unsigned long iommu_v1_unmap_page(struct io_pgtable_ops *ops,
+ unsigned long iova,
+ size_t size,
+ struct iommu_iotlb_gather *gather)
 {
-   struct io_pgtable_ops *ops = &dom->iop.iop.ops;
struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
unsigned long long unmapped;
unsigned long unmap_size;
@@ -571,6 +568,8 @@ static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
 {
struct protection_domain *pdom = (struct protection_domain *)cookie;
 
+   pdom->iop.iop.ops.map  = iommu_v1_map_page;
+   pdom->iop.iop.ops.unmap= iommu_v1_unmap_page;
pdom->iop.iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
return >iop.iop;
 }
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 9a1a16031e00..77f44b927ae7 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2044,6 +2044,7 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
 gfp_t gfp)
 {
struct protection_domain *domain = to_pdomain(dom);
+   struct io_pgtable_ops *ops = &domain->iop.iop.ops;
int prot = 0;
int ret;
 
@@ -2055,8 +2056,7 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
if (iommu_prot & IOMMU_WRITE)
prot |= IOMMU_PROT_IW;
 
-   ret = iommu_map_page(domain, iova, paddr, page_size, prot, gfp);
-
+   ret = ops->map(ops, iova, paddr, page_size, prot, gfp);
  

[PATCH 06/13] iommu: amd: Move IO page table related functions

2020-09-23 Thread Suravee Suthikulpanit
Preparing to migrate to the IO page table framework.
There is no functional change.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h  |  18 ++
 drivers/iommu/amd/io_pgtable.c | 473 
 drivers/iommu/amd/iommu.c  | 476 +
 3 files changed, 493 insertions(+), 474 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 8b7be9171030..ee7ff4d827e1 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -122,4 +122,22 @@ void amd_iommu_apply_ivrs_quirks(void);
 static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif
 
+/* TODO: These are temporary and will be removed once fully transition */
+extern void free_pagetable(struct domain_pgtable *pgtable);
+extern int iommu_map_page(struct protection_domain *dom,
+ unsigned long bus_addr,
+ unsigned long phys_addr,
+ unsigned long page_size,
+ int prot,
+ gfp_t gfp);
+extern unsigned long iommu_unmap_page(struct protection_domain *dom,
+ unsigned long bus_addr,
+ unsigned long page_size);
+extern u64 *fetch_pte(struct protection_domain *domain,
+ unsigned long address,
+ unsigned long *page_size);
+extern void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
+struct domain_pgtable *pgtable);
+extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
+u64 *root, int mode);
 #endif
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 452cad26a2b3..7e4154e26757 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -54,6 +54,479 @@ static const struct iommu_flush_ops amd_flush_ops = {
.tlb_add_page   = v1_tlb_add_page,
 };
 
+/*
+ * Helper function to get the first pte of a large mapping
+ */
+static u64 *first_pte_l7(u64 *pte, unsigned long *page_size,
+			 unsigned long *count)
+{
+   unsigned long pte_mask, pg_size, cnt;
+   u64 *fpte;
+
+   pg_size  = PTE_PAGE_SIZE(*pte);
+   cnt  = PAGE_SIZE_PTE_COUNT(pg_size);
+   pte_mask = ~((cnt << 3) - 1);
+   fpte = (u64 *)(((unsigned long)pte) & pte_mask);
+
+   if (page_size)
+   *page_size = pg_size;
+
+   if (count)
+   *count = cnt;
+
+   return fpte;
+}
+
+/****************************************************************************
+ *
+ * The functions below are used to create the page table mappings for
+ * unity mapped regions.
+ *
+ ****************************************************************************/
+
+static void free_page_list(struct page *freelist)
+{
+   while (freelist != NULL) {
+   unsigned long p = (unsigned long)page_address(freelist);
+
+   freelist = freelist->freelist;
+   free_page(p);
+   }
+}
+
+static struct page *free_pt_page(unsigned long pt, struct page *freelist)
+{
+   struct page *p = virt_to_page((void *)pt);
+
+   p->freelist = freelist;
+
+   return p;
+}
+
+#define DEFINE_FREE_PT_FN(LVL, FN)						\
+static struct page *free_pt_##LVL (unsigned long __pt, struct page *freelist)	\
+{										\
+	unsigned long p;							\
+	u64 *pt;								\
+	int i;									\
+										\
+	pt = (u64 *)__pt;							\
+										\
+	for (i = 0; i < 512; ++i) {						\
+		/* PTE present? */						\
+		if (!IOMMU_PTE_PRESENT(pt[i]))					\
+			continue;						\
+										\
+		/* Large PTE? */						\
+		if (PM_PTE_LEVEL(pt[i]) == 0 ||					\
+		    PM_PTE_LEVEL(pt[i]) == 7)					\
+			continue;						\
+										\
+		p = (unsigned long)IOMMU_PTE_PAGE(pt[i]);

[PATCH 08/13] iommu: amd: Remove amd_iommu_domain_get_pgtable

2020-09-23 Thread Suravee Suthikulpanit
Since the IO page table root and mode parameters have been moved into
the struct amd_io_pgtable, the function is no longer needed. Therefore,
remove it along with the struct domain_pgtable.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h   |  4 ++--
 drivers/iommu/amd/amd_iommu_types.h |  6 -
 drivers/iommu/amd/io_pgtable.c  | 36 ++---
 drivers/iommu/amd/iommu.c   | 34 ---
 4 files changed, 19 insertions(+), 61 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 8dff7d85be79..2059e64fdc53 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -101,6 +101,8 @@ static inline
 void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
 {
atomic64_set(&domain->iop.pt_root, root);
+   domain->iop.root = (u64 *)(root & PAGE_MASK);
+   domain->iop.mode = root & 7; /* lowest 3 bits encode pgtable mode */
 }
 
 static inline
@@ -135,8 +137,6 @@ extern unsigned long iommu_unmap_page(struct protection_domain *dom,
 extern u64 *fetch_pte(struct protection_domain *domain,
  unsigned long address,
  unsigned long *page_size);
-extern void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
-struct domain_pgtable *pgtable);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode);
 extern void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable);
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 5d53b7bec256..a07af389eae1 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -511,12 +511,6 @@ struct protection_domain {
unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
 };
 
-/* For decocded pt_root */
-struct domain_pgtable {
-   int mode;
-   u64 *root;
-};
-
 /*
  * Structure where we save information about one hardware AMD IOMMU in the
  * system.
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 8ce2f0325123..524c5406ccd6 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -215,30 +215,27 @@ static bool increase_address_space(struct protection_domain *domain,
   unsigned long address,
   gfp_t gfp)
 {
-   struct domain_pgtable pgtable;
unsigned long flags;
bool ret = true;
u64 *pte;
 
spin_lock_irqsave(&domain->lock, flags);
 
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-   if (address <= PM_LEVEL_SIZE(pgtable.mode))
+   if (address <= PM_LEVEL_SIZE(domain->iop.mode))
goto out;
 
ret = false;
-   if (WARN_ON_ONCE(pgtable.mode == PAGE_MODE_6_LEVEL))
+   if (WARN_ON_ONCE(domain->iop.mode == PAGE_MODE_6_LEVEL))
goto out;
 
pte = (void *)get_zeroed_page(gfp);
if (!pte)
goto out;
 
-   *pte = PM_LEVEL_PDE(pgtable.mode, iommu_virt_to_phys(pgtable.root));
+   *pte = PM_LEVEL_PDE(domain->iop.mode, iommu_virt_to_phys(domain->iop.root));
 
-   pgtable.root  = pte;
-   pgtable.mode += 1;
+   domain->iop.root  = pte;
+   domain->iop.mode += 1;
amd_iommu_update_and_flush_device_table(domain);
amd_iommu_domain_flush_complete(domain);
 
@@ -246,7 +243,7 @@ static bool increase_address_space(struct protection_domain *domain,
 * Device Table needs to be updated and flushed before the new root can
 * be published.
 */
-   amd_iommu_domain_set_pgtable(domain, pte, pgtable.mode);
+   amd_iommu_domain_set_pgtable(domain, pte, domain->iop.mode);
 
ret = true;
 
@@ -263,29 +260,23 @@ static u64 *alloc_pte(struct protection_domain *domain,
  gfp_t gfp,
  bool *updated)
 {
-   struct domain_pgtable pgtable;
int level, end_lvl;
u64 *pte, *page;
 
BUG_ON(!is_power_of_2(page_size));
 
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-   while (address > PM_LEVEL_SIZE(pgtable.mode)) {
+   while (address > PM_LEVEL_SIZE(domain->iop.mode)) {
/*
 * Return an error if there is no memory to update the
 * page-table.
 */
if (!increase_address_space(domain, address, gfp))
return NULL;
-
-   /* Read new values to check if update was successful */
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
}
 
 
-   level   = pgtable.mode - 1;
-   pte = &pgtable.root[PM_LEVEL_INDEX(level, address)];
+   level   = domain->iop.mode - 1;
+   pte = &domain->iop.root[PM_LEVEL_INDEX(level, address)];
address = PAGE_SIZE_ALIGN(address, page_size);
end_lvl 

[PATCH 13/13] iommu: amd: Adopt IO page table framework

2020-09-23 Thread Suravee Suthikulpanit
Switch to using IO page table framework for AMD IOMMU v1 page table.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h |  3 +++
 drivers/iommu/amd/iommu.c | 10 ++
 2 files changed, 13 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 2e8dc2a1ec0f..046ea81a3b77 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -127,4 +127,7 @@ static inline void amd_iommu_apply_ivrs_quirks(void) { }
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode);
 extern void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable);
+extern struct io_pgtable_ops *
+amd_iommu_setup_io_pgtable_ops(struct iommu_dev_data *dev_data,
+  struct protection_domain *pdom);
 #endif
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 77f44b927ae7..df304d8a630a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1580,6 +1581,7 @@ static int pdev_iommuv2_enable(struct pci_dev *pdev)
 static int attach_device(struct device *dev,
 struct protection_domain *domain)
 {
+   struct io_pgtable_ops *pgtbl_ops;
struct iommu_dev_data *dev_data;
struct pci_dev *pdev;
unsigned long flags;
@@ -1623,6 +1625,12 @@ static int attach_device(struct device *dev,
 skip_ats_check:
ret = 0;
 
+   pgtbl_ops = amd_iommu_setup_io_pgtable_ops(dev_data, domain);
+   if (!pgtbl_ops) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
do_attach(dev_data, domain);
 
/*
@@ -1958,6 +1966,8 @@ static void amd_iommu_domain_free(struct iommu_domain *dom)
if (domain->dev_cnt > 0)
cleanup_domain(domain);
 
+   free_io_pgtable_ops(&domain->iop.iop.ops);
+
BUG_ON(domain->dev_cnt != 0);
 
if (!dom)
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 01/13] iommu: amd: Re-define amd_iommu_domain_encode_pgtable as inline

2020-09-23 Thread Suravee Suthikulpanit
Move the function to header file to allow inclusion in other files.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h | 13 +
 drivers/iommu/amd/iommu.c | 10 --
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 57309716fd18..97cdb235ce69 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -93,6 +93,19 @@ static inline void *iommu_phys_to_virt(unsigned long paddr)
return phys_to_virt(__sme_clr(paddr));
 }
 
+static inline
+void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
+{
+   atomic64_set(&domain->pt_root, root);
+}
+
+static inline
+void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
+{
+   amd_iommu_domain_set_pt_root(domain, 0);
+}
+
+
 extern bool translation_pre_enabled(struct amd_iommu *iommu);
 extern bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
 struct device *dev);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index db4fb840c59c..e92b3f744292 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -162,16 +162,6 @@ static void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
pgtable->mode = pt_root & 7; /* lowest 3 bits encode pgtable mode */
 }
 
-static void amd_iommu_domain_set_pt_root(struct protection_domain *domain, u64 root)
-{
-   atomic64_set(&domain->pt_root, root);
-}
-
-static void amd_iommu_domain_clr_pt_root(struct protection_domain *domain)
-{
-   amd_iommu_domain_set_pt_root(domain, 0);
-}
-
 static void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode)
 {
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 05/13] iommu: amd: Declare functions as extern

2020-09-23 Thread Suravee Suthikulpanit
And move the declarations to a header file so that they can be included across
multiple files. There is no functional change.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h |  3 +++
 drivers/iommu/amd/iommu.c | 39 +--
 2 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 22ecacb71675..8b7be9171030 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -48,6 +48,9 @@ extern int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
 extern int amd_iommu_flush_page(struct iommu_domain *dom, int pasid,
u64 address);
 extern void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
+extern void amd_iommu_domain_update(struct protection_domain *domain);
+extern void amd_iommu_domain_flush_complete(struct protection_domain *domain);
+extern void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain);
 extern int amd_iommu_flush_tlb(struct iommu_domain *dom, int pasid);
 extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid,
 unsigned long cr3);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 09da37c4c9c4..f91f35edb7ba 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -88,7 +88,6 @@ struct iommu_cmd {
 
 struct kmem_cache *amd_iommu_irq_cache;
 
-static void update_domain(struct protection_domain *domain);
 static void detach_device(struct device *dev);
 
 /
@@ -1294,12 +1293,12 @@ static void domain_flush_pages(struct protection_domain *domain,
 }
 
 /* Flush the whole IO/TLB for a given protection domain - including PDE */
-static void domain_flush_tlb_pde(struct protection_domain *domain)
+void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain)
 {
__domain_flush_pages(domain, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS, 1);
 }
 
-static void domain_flush_complete(struct protection_domain *domain)
+void amd_iommu_domain_flush_complete(struct protection_domain *domain)
 {
int i;
 
@@ -1324,7 +1323,7 @@ static void domain_flush_np_cache(struct protection_domain *domain,
 
spin_lock_irqsave(&domain->lock, flags);
domain_flush_pages(domain, iova, size);
-   domain_flush_complete(domain);
+   amd_iommu_domain_flush_complete(domain);
spin_unlock_irqrestore(&domain->lock, flags);
}
 }
@@ -1481,7 +1480,7 @@ static bool increase_address_space(struct protection_domain *domain,
pgtable.root  = pte;
pgtable.mode += 1;
amd_iommu_update_and_flush_device_table(domain);
-   domain_flush_complete(domain);
+   amd_iommu_domain_flush_complete(domain);
 
/*
 * Device Table needs to be updated and flushed before the new root can
@@ -1734,8 +1733,8 @@ static int iommu_map_page(struct protection_domain *dom,
 * Updates and flushing already happened in
 * increase_address_space().
 */
-   domain_flush_tlb_pde(dom);
-   domain_flush_complete(dom);
+   amd_iommu_domain_flush_tlb_pde(dom);
+   amd_iommu_domain_flush_complete(dom);
spin_unlock_irqrestore(&dom->lock, flags);
}
 
@@ -1978,10 +1977,10 @@ static void do_detach(struct iommu_dev_data *dev_data)
device_flush_dte(dev_data);
 
/* Flush IOTLB */
-   domain_flush_tlb_pde(domain);
+   amd_iommu_domain_flush_tlb_pde(domain);
 
/* Wait for the flushes to finish */
-   domain_flush_complete(domain);
+   amd_iommu_domain_flush_complete(domain);
 
/* decrease reference counters - needs to happen after the flushes */
domain->dev_iommu[iommu->index] -= 1;
@@ -2114,9 +2113,9 @@ static int attach_device(struct device *dev,
 * left the caches in the IOMMU dirty. So we have to flush
 * here to evict all dirty stuff.
 */
-   domain_flush_tlb_pde(domain);
+   amd_iommu_domain_flush_tlb_pde(domain);
 
-   domain_flush_complete(domain);
+   amd_iommu_domain_flush_complete(domain);
 
 out:
spin_unlock(&dev_data->lock);
@@ -2277,7 +2276,7 @@ void amd_iommu_update_and_flush_device_table(struct protection_domain *domain)
domain_flush_devices(domain);
 }
 
-static void update_domain(struct protection_domain *domain)
+void amd_iommu_domain_update(struct protection_domain *domain)
 {
struct domain_pgtable pgtable;
 
@@ -2286,8 +2285,8 @@ static void update_domain(struct protection_domain *domain)
amd_iommu_update_and_flush_device_table(domain);
 
/* Flush domain TLB(s) and wait for completion */
-   domain_flush_tlb_pde(domain);
-   domain_flush_complete(domain);
+   

[PATCH 04/13] iommu: amd: Convert to using amd_io_pgtable

2020-09-23 Thread Suravee Suthikulpanit
Make use of the new struct amd_io_pgtable in preparation to remove
the struct domain_pgtable.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/iommu.c | 25 ++---
 2 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index da6e09657e00..22ecacb71675 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -47,6 +47,7 @@ extern void amd_iommu_domain_direct_map(struct iommu_domain *dom);
 extern int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids);
 extern int amd_iommu_flush_page(struct iommu_domain *dom, int pasid,
u64 address);
+extern void amd_iommu_update_and_flush_device_table(struct protection_domain *domain);
 extern int amd_iommu_flush_tlb(struct iommu_domain *dom, int pasid);
 extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid,
 unsigned long cr3);
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index c8b8619cc744..09da37c4c9c4 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -90,8 +90,6 @@ struct kmem_cache *amd_iommu_irq_cache;
 
 static void update_domain(struct protection_domain *domain);
 static void detach_device(struct device *dev);
-static void update_and_flush_device_table(struct protection_domain *domain,
- struct domain_pgtable *pgtable);
 
 /
  *
@@ -1482,7 +1480,7 @@ static bool increase_address_space(struct protection_domain *domain,
 
pgtable.root  = pte;
pgtable.mode += 1;
-   update_and_flush_device_table(domain, &pgtable);
+   amd_iommu_update_and_flush_device_table(domain);
domain_flush_complete(domain);
 
/*
@@ -1857,17 +1855,16 @@ static void free_gcr3_table(struct protection_domain *domain)
 }
 
 static void set_dte_entry(u16 devid, struct protection_domain *domain,
- struct domain_pgtable *pgtable,
  bool ats, bool ppr)
 {
u64 pte_root = 0;
u64 flags = 0;
u32 old_domid;
 
-   if (pgtable->mode != PAGE_MODE_NONE)
-   pte_root = iommu_virt_to_phys(pgtable->root);
+   if (domain->iop.mode != PAGE_MODE_NONE)
+   pte_root = iommu_virt_to_phys(domain->iop.root);
 
-   pte_root |= (pgtable->mode & DEV_ENTRY_MODE_MASK)
+   pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
<< DEV_ENTRY_MODE_SHIFT;
pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
 
@@ -1957,7 +1954,7 @@ static void do_attach(struct iommu_dev_data *dev_data,
 
/* Update device table */
amd_iommu_domain_get_pgtable(domain, &pgtable);
-   set_dte_entry(dev_data->devid, domain, &pgtable,
+   set_dte_entry(dev_data->devid, domain,
  ats, dev_data->iommu_v2);
clone_aliases(dev_data->pdev);
 
@@ -2263,22 +2260,20 @@ static int amd_iommu_domain_get_attr(struct iommu_domain *domain,
  *
  */
 
-static void update_device_table(struct protection_domain *domain,
-   struct domain_pgtable *pgtable)
+static void update_device_table(struct protection_domain *domain)
 {
struct iommu_dev_data *dev_data;
 
list_for_each_entry(dev_data, &domain->dev_list, list) {
-   set_dte_entry(dev_data->devid, domain, pgtable,
+   set_dte_entry(dev_data->devid, domain,
  dev_data->ats.enabled, dev_data->iommu_v2);
clone_aliases(dev_data->pdev);
}
 }
 
-static void update_and_flush_device_table(struct protection_domain *domain,
- struct domain_pgtable *pgtable)
+void amd_iommu_update_and_flush_device_table(struct protection_domain *domain)
 {
-   update_device_table(domain, pgtable);
+   update_device_table(domain);
domain_flush_devices(domain);
 }
 
@@ -2288,7 +2283,7 @@ static void update_domain(struct protection_domain *domain)
 
/* Update device table */
amd_iommu_domain_get_pgtable(domain, &pgtable);
-   update_and_flush_device_table(domain, &pgtable);
+   amd_iommu_update_and_flush_device_table(domain);
 
/* Flush domain TLB(s) and wait for completion */
domain_flush_tlb_pde(domain);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 10/13] iommu: amd: Refactor fetch_pte to use struct amd_io_pgtable

2020-09-23 Thread Suravee Suthikulpanit
To simplify the fetch_pte function. There is no functional change.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h  |  2 +-
 drivers/iommu/amd/io_pgtable.c | 13 +++--
 drivers/iommu/amd/iommu.c  |  4 +++-
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 2059e64fdc53..69996e57fae2 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -134,7 +134,7 @@ extern int iommu_map_page(struct protection_domain *dom,
 extern unsigned long iommu_unmap_page(struct protection_domain *dom,
  unsigned long bus_addr,
  unsigned long page_size);
-extern u64 *fetch_pte(struct protection_domain *domain,
+extern u64 *fetch_pte(struct amd_io_pgtable *pgtable,
  unsigned long address,
  unsigned long *page_size);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 5da5ce98b7b3..f913fc7b1e58 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -348,7 +348,7 @@ static u64 *alloc_pte(struct protection_domain *domain,
  * This function checks if there is a PTE for a given dma address. If
  * there is one, it returns the pointer to it.
  */
-u64 *fetch_pte(struct protection_domain *domain,
+u64 *fetch_pte(struct amd_io_pgtable *pgtable,
   unsigned long address,
   unsigned long *page_size)
 {
@@ -357,11 +357,11 @@ u64 *fetch_pte(struct protection_domain *domain,
 
*page_size = 0;
 
-   if (address > PM_LEVEL_SIZE(domain->iop.mode))
+   if (address > PM_LEVEL_SIZE(pgtable->mode))
return NULL;
 
-   level  =  domain->iop.mode - 1;
-   pte= &domain->iop.root[PM_LEVEL_INDEX(level, address)];
+   level  =  pgtable->mode - 1;
+   pte= &pgtable->root[PM_LEVEL_INDEX(level, address)];
*page_size =  PTE_LEVEL_PAGE_SIZE(level);
 
while (level > 0) {
@@ -496,6 +496,8 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
   unsigned long iova,
   unsigned long size)
 {
+   struct io_pgtable_ops *ops = &dom->iop.iop.ops;
+   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
unsigned long long unmapped;
unsigned long unmap_size;
u64 *pte;
@@ -505,8 +507,7 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
unmapped = 0;
 
while (unmapped < size) {
-   pte = fetch_pte(dom, iova, &unmap_size);
-
+   pte = fetch_pte(pgtable, iova, &unmap_size);
if (pte) {
int i, count;
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3f6ede1e572c..87cea1cde414 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2078,13 +2078,15 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
  dma_addr_t iova)
 {
struct protection_domain *domain = to_pdomain(dom);
+   struct io_pgtable_ops *ops = &domain->iop.iop.ops;
+   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
unsigned long offset_mask, pte_pgsize;
u64 *pte, __pte;
 
if (domain->iop.mode == PAGE_MODE_NONE)
return iova;
 
-   pte = fetch_pte(domain, iova, &pte_pgsize);
+   pte = fetch_pte(pgtable, iova, _pgsize);
 
if (!pte || !IOMMU_PTE_PRESENT(*pte))
return 0;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 07/13] iommu: amd: Restructure code for freeing page table

2020-09-23 Thread Suravee Suthikulpanit
Introduce the amd_iommu_free_pgtable() helper function, which consolidates
the logic for freeing a page table.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu.h  |  2 +-
 drivers/iommu/amd/io_pgtable.c | 12 +++-
 drivers/iommu/amd/iommu.c  | 19 ++-
 3 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index ee7ff4d827e1..8dff7d85be79 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -123,7 +123,6 @@ static inline void amd_iommu_apply_ivrs_quirks(void) { }
 #endif
 
 /* TODO: These are temporary and will be removed once fully transition */
-extern void free_pagetable(struct domain_pgtable *pgtable);
 extern int iommu_map_page(struct protection_domain *dom,
  unsigned long bus_addr,
  unsigned long phys_addr,
@@ -140,4 +139,5 @@ extern void amd_iommu_domain_get_pgtable(struct protection_domain *domain,
 struct domain_pgtable *pgtable);
 extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
 u64 *root, int mode);
+extern void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable);
 #endif
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 7e4154e26757..8ce2f0325123 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -167,14 +167,24 @@ static struct page *free_sub_pt(unsigned long root, int mode,
return freelist;
 }
 
-void free_pagetable(struct domain_pgtable *pgtable)
+void amd_iommu_free_pgtable(struct amd_io_pgtable *pgtable)
 {
+   struct protection_domain *dom;
struct page *freelist = NULL;
unsigned long root;
 
if (pgtable->mode == PAGE_MODE_NONE)
return;
 
+   dom = container_of(pgtable, struct protection_domain, iop);
+
+   /* Update data structure */
+   amd_iommu_domain_clr_pt_root(dom);
+
+   /* Make changes visible to IOMMUs */
+   amd_iommu_domain_update(dom);
+
+   /* Page-table is not visible to IOMMU anymore, so free it */
BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
   pgtable->mode > PAGE_MODE_6_LEVEL);
 
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 4d65f64236b6..cbbea7b952fb 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1882,17 +1882,13 @@ static void cleanup_domain(struct protection_domain *domain)
 
 static void protection_domain_free(struct protection_domain *domain)
 {
-   struct domain_pgtable pgtable;
-
if (!domain)
return;
 
if (domain->id)
domain_id_free(domain->id);
 
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-   amd_iommu_domain_clr_pt_root(domain);
-   free_pagetable(&pgtable);
+   amd_iommu_free_pgtable(&domain->iop);
 
kfree(domain);
 }
@@ -2281,22 +2277,11 @@ EXPORT_SYMBOL(amd_iommu_unregister_ppr_notifier);
 void amd_iommu_domain_direct_map(struct iommu_domain *dom)
 {
struct protection_domain *domain = to_pdomain(dom);
-   struct domain_pgtable pgtable;
unsigned long flags;
 
spin_lock_irqsave(>lock, flags);
 
-   /* First save pgtable configuration */
-   amd_iommu_domain_get_pgtable(domain, &pgtable);
-
-   /* Remove page-table from domain */
-   amd_iommu_domain_clr_pt_root(domain);
-
-   /* Make changes visible to IOMMUs */
-   amd_iommu_domain_update(domain);
-
-   /* Page-table is not visible to IOMMU anymore, so free it */
-   free_pagetable(&pgtable);
+   amd_iommu_free_pgtable(&domain->iop);
 
spin_unlock_irqrestore(>lock, flags);
 }
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 09/13] iommu: amd: Rename variables to be consistent with struct io_pgtable_ops

2020-09-23 Thread Suravee Suthikulpanit
There is no functional change.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/io_pgtable.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 524c5406ccd6..5da5ce98b7b3 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -424,9 +424,9 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
  * and full 64 bit address spaces.
  */
 int iommu_map_page(struct protection_domain *dom,
-  unsigned long bus_addr,
-  unsigned long phys_addr,
-  unsigned long page_size,
+  unsigned long iova,
+  unsigned long paddr,
+  unsigned long size,
   int prot,
   gfp_t gfp)
 {
@@ -435,15 +435,15 @@ int iommu_map_page(struct protection_domain *dom,
u64 __pte, *pte;
int ret, i, count;
 
-   BUG_ON(!IS_ALIGNED(bus_addr, page_size));
-   BUG_ON(!IS_ALIGNED(phys_addr, page_size));
+   BUG_ON(!IS_ALIGNED(iova, size));
+   BUG_ON(!IS_ALIGNED(paddr, size));
 
ret = -EINVAL;
if (!(prot & IOMMU_PROT_MASK))
goto out;
 
-   count = PAGE_SIZE_PTE_COUNT(page_size);
-   pte   = alloc_pte(dom, bus_addr, page_size, NULL, gfp, );
+   count = PAGE_SIZE_PTE_COUNT(size);
+   pte   = alloc_pte(dom, iova, size, NULL, gfp, );
 
ret = -ENOMEM;
if (!pte)
@@ -456,10 +456,10 @@ int iommu_map_page(struct protection_domain *dom,
updated = true;
 
if (count > 1) {
-   __pte = PAGE_SIZE_PTE(__sme_set(phys_addr), page_size);
+   __pte = PAGE_SIZE_PTE(__sme_set(paddr), size);
__pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_PR | IOMMU_PTE_FC;
} else
-   __pte = __sme_set(phys_addr) | IOMMU_PTE_PR | IOMMU_PTE_FC;
+   __pte = __sme_set(paddr) | IOMMU_PTE_PR | IOMMU_PTE_FC;
 
if (prot & IOMMU_PROT_IR)
__pte |= IOMMU_PTE_IR;
@@ -493,20 +493,19 @@ int iommu_map_page(struct protection_domain *dom,
 }
 
 unsigned long iommu_unmap_page(struct protection_domain *dom,
-  unsigned long bus_addr,
-  unsigned long page_size)
+  unsigned long iova,
+  unsigned long size)
 {
unsigned long long unmapped;
unsigned long unmap_size;
u64 *pte;
 
-   BUG_ON(!is_power_of_2(page_size));
+   BUG_ON(!is_power_of_2(size));
 
unmapped = 0;
 
-   while (unmapped < page_size) {
-
-   pte = fetch_pte(dom, bus_addr, &unmap_size);
+   while (unmapped < size) {
+   pte = fetch_pte(dom, iova, &unmap_size);
 
if (pte) {
int i, count;
@@ -516,7 +515,7 @@ unsigned long iommu_unmap_page(struct protection_domain *dom,
pte[i] = 0ULL;
}
 
-   bus_addr  = (bus_addr & ~(unmap_size - 1)) + unmap_size;
+   iova = (iova & ~(unmap_size - 1)) + unmap_size;
unmapped += unmap_size;
}
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 00/13] iommu: amd: Add Generic IO Page Table Framework Support

2020-09-23 Thread Suravee Suthikulpanit
The framework allows a pluggable implementation of the IO page table.
This allows the AMD IOMMU driver to switch between different types
of AMD IOMMU page tables (e.g. v1 vs. v2).

This series refactors the current implementation of AMD IOMMU v1 page table
to adopt the framework. There should be no functional change.
Subsequent series will introduce support for the AMD IOMMU v2 page table.
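
As a rough illustration of what adopting the framework means for the driver
(a sketch only; AMD_IOMMU_V1, AMD_IOMMU_PGSIZES and the address-size macros
are names this series introduces, the helper name is assumed):

	#include <linux/io-pgtable.h>

	/* Sketch: allocate v1 page-table ops through the io-pgtable framework. */
	static struct io_pgtable_ops *example_alloc_v1_ops(struct protection_domain *pdom)
	{
		struct io_pgtable_cfg cfg = {
			.pgsize_bitmap	= AMD_IOMMU_PGSIZES,
			.ias		= IOMMU_IN_ADDR_BIT_SIZE,
			.oas		= IOMMU_OUT_ADDR_BIT_SIZE,
		};

		/* The cookie is handed back to the v1 map/unmap/iova_to_phys ops. */
		return alloc_io_pgtable_ops(AMD_IOMMU_V1, &cfg, pdom);
	}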

Thanks,
Suravee

Suravee Suthikulpanit (13):
  iommu: amd: Re-define amd_iommu_domain_encode_pgtable as inline
  iommu: amd: Prepare for generic IO page table framework
  iommu: amd: Move pt_root to struct amd_io_pgtable
  iommu: amd: Convert to using amd_io_pgtable
  iommu: amd: Declare functions as extern
  iommu: amd: Move IO page table related functions
  iommu: amd: Restructure code for freeing page table
  iommu: amd: Remove amd_iommu_domain_get_pgtable
  iommu: amd: Rename variables to be consistent with struct
io_pgtable_ops
  iommu: amd: Refactor fetch_pte to use struct amd_io_pgtable
  iommu: amd: Introduce iommu_v1_iova_to_phys
  iommu: amd: Introduce iommu_v1_map_page and iommu_v1_unmap_page
  iommu: amd: Adopt IO page table framework

 drivers/iommu/amd/Kconfig   |   1 +
 drivers/iommu/amd/Makefile  |   2 +-
 drivers/iommu/amd/amd_iommu.h   |  25 ++
 drivers/iommu/amd/amd_iommu_types.h |  40 +-
 drivers/iommu/amd/io_pgtable.c  | 580 +
 drivers/iommu/amd/iommu.c   | 630 ++--
 drivers/iommu/io-pgtable.c  |   3 +
 include/linux/io-pgtable.h  |   2 +
 8 files changed, 691 insertions(+), 592 deletions(-)
 create mode 100644 drivers/iommu/amd/io_pgtable.c

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 02/13] iommu: amd: Prepare for generic IO page table framework

2020-09-23 Thread Suravee Suthikulpanit
Add initial hook-up code to implement the generic IO page table framework.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/Kconfig   |  1 +
 drivers/iommu/amd/Makefile  |  2 +-
 drivers/iommu/amd/amd_iommu_types.h | 32 +++
 drivers/iommu/amd/io_pgtable.c  | 89 +
 drivers/iommu/amd/iommu.c   | 10 
 drivers/iommu/io-pgtable.c  |  3 +
 include/linux/io-pgtable.h  |  2 +
 7 files changed, 128 insertions(+), 11 deletions(-)
 create mode 100644 drivers/iommu/amd/io_pgtable.c

diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig
index 626b97d0dd21..a3cbafb603f5 100644
--- a/drivers/iommu/amd/Kconfig
+++ b/drivers/iommu/amd/Kconfig
@@ -10,6 +10,7 @@ config AMD_IOMMU
select IOMMU_API
select IOMMU_IOVA
select IOMMU_DMA
+   select IOMMU_IO_PGTABLE
depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE
help
  With this option you can enable support for AMD IOMMU hardware in
diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile
index dc5a2fa4fd37..a935f8f4b974 100644
--- a/drivers/iommu/amd/Makefile
+++ b/drivers/iommu/amd/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
-obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o
+obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o
 obj-$(CONFIG_AMD_IOMMU_V2) += iommu_v2.o
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index f696ac7c5f89..77cd8d966fbc 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Maximum number of IOMMUs supported
@@ -252,6 +253,19 @@
 
 #define GA_GUEST_NR0x1
 
+#define IOMMU_IN_ADDR_BIT_SIZE  52
+#define IOMMU_OUT_ADDR_BIT_SIZE 52
+
+/*
+ * This bitmap is used to advertise the page sizes our hardware support
+ * to the IOMMU core, which will then use this information to split
+ * physically contiguous memory regions it is mapping into page sizes
+ * that we support.
+ *
+ * 512GB Pages are not supported due to a hardware bug
+ */
+#define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
+
 /* Bit value definition for dte irq remapping fields*/
 #define DTE_IRQ_PHYS_ADDR_MASK (((1ULL << 45)-1) << 6)
 #define DTE_IRQ_REMAP_INTCTL_MASK  (0x3ULL << 60)
@@ -461,6 +475,23 @@ struct amd_irte_ops;
 
 #define AMD_IOMMU_FLAG_TRANS_PRE_ENABLED  (1 << 0)
 
+#define io_pgtable_to_data(x) \
+   container_of((x), struct amd_io_pgtable, iop)
+
+#define io_pgtable_ops_to_data(x) \
+   io_pgtable_to_data(io_pgtable_ops_to_pgtable(x))
+
+#define io_pgtable_ops_to_domain(x) \
+   container_of(io_pgtable_ops_to_data(x), \
+struct protection_domain, iop)
+
+struct amd_io_pgtable {
+   struct io_pgtable_cfg   pgtbl_cfg;
+   struct io_pgtable   iop;
+   int mode;
+   u64 *root;
+};
+
 /*
  * This structure contains generic data for  IOMMU protection domains
  * independent of their use.
@@ -469,6 +500,7 @@ struct protection_domain {
struct list_head dev_list; /* List of all devices in this domain */
struct iommu_domain domain; /* generic domain handle used by
   iommu core code */
+   struct amd_io_pgtable iop;
spinlock_t lock;/* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
atomic64_t pt_root; /* pgtable root and pgtable mode */
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
new file mode 100644
index ..452cad26a2b3
--- /dev/null
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -0,0 +1,89 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * CPU-agnostic AMD IO page table allocator.
+ *
+ * Copyright (C) 2020 Advanced Micro Devices, Inc.
+ * Author: Suravee Suthikulpanit 
+ */
+
+#define pr_fmt(fmt) "AMD-Vi: " fmt
+#define dev_fmt(fmt)pr_fmt(fmt)
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "amd_iommu_types.h"
+#include "amd_iommu.h"
+
+static void v1_tlb_flush_all(void *cookie)
+{
+}
+
+static void v1_tlb_flush_walk(unsigned long iova, size_t size,
+ size_t granule, void *cookie)
+{
+}
+
+static void v1_tlb_flush_leaf(unsigned long iova, size_t size,
+ size_t granule, void *cookie)
+{
+}
+
+static void v1_tlb_add_page(struct iommu_iotlb_gather *gather,
+unsigned long iova, size_t granule,
+void *cookie)
+{
+   struct protection_domain *pdom = cookie;
+   struct iommu_domain *domain = &pdom->domain;
+
+   iommu_iotlb_gather_add_page(domain, gather, iova, 

Re: [PATCH 1/4] ARM/omap1: switch to use dma_direct_set_offset for lbus DMA offsets

2020-09-23 Thread Russell King - ARM Linux admin
On Mon, Sep 21, 2020 at 08:47:23AM +0200, Christoph Hellwig wrote:
> On Mon, Sep 21, 2020 at 09:44:18AM +0300, Tony Lindgren wrote:
> > * Janusz Krzysztofik  [200919 22:29]:
> > > Hi Tony,
> > > 
> > > On Friday, September 18, 2020 7:49:33 A.M. CEST Tony Lindgren wrote:
> > > > * Christoph Hellwig  [200917 17:37]:
> > > > > Switch the omap1510 platform ohci device to use dma_direct_set_offset
> > > > > to set the DMA offset instead of using direct hooks into the DMA
> > > > > mapping code and remove the now unused hooks.
> > > > 
> > > > Looks nice to me :) I still can't test this probably for few more weeks
> > > > though but hopefully Aaro or Janusz (Added to Cc) can test it.
> > > 
> > > Works for me on Amstrad Delta (tested with a USB ethernet adapter).
> > > 
> > > Tested-by: Janusz Krzysztofik 
> > 
> > Great, good to hear! And thanks for testing it.
> > 
> > Christoph, feel free to queue this along with the other patches:
> > 
> > Acked-by: Tony Lindgren 
> > 
> > Or let me know if you want me to pick it up.
> 
> I'd prefer to pick it up through the dma-mapping tree, but preferably
> I'd pick the whole series up once Russell has tested footbridge.

I don't think that's going to happen very soon... seems way too much
effort to pull down the appropriate tree to build and test.  Sorry.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
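
For readers following the thread: the conversion under discussion replaces
per-platform DMA hooks with a constant CPU-to-bus offset registered with the
DMA core. A hedged sketch of the idea (the device variable and window size
are assumptions, not the exact code from the patch):

	#include <linux/dma-mapping.h>
	#include <linux/sizes.h>

	/* Sketch: register the OMAP1510 local-bus offset once, at probe time. */
	ret = dma_direct_set_offset(&ohci_device.dev, PHYS_OFFSET,
				    OMAP1510_LB_OFFSET, SZ_64M);
	if (ret)
		pr_err("failed to register local-bus DMA offset\n");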
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu



Re: IOVA allocation dependency between firmware buffer and remaining buffers

2020-09-23 Thread Ajay Kumar
Hello all,

We pretty much tried to solve the same issue here with a new API in DMA-IOMMU:
https://lore.kernel.org/linux-iommu/20200811054912.ga...@infradead.org/T/

Christoph - the user part would be MFC devices on exynos platforms

Thanks,
Ajay


On Wed, Sep 23, 2020 at 12:28 PM Christoph Hellwig  wrote:
>
> On Wed, Sep 23, 2020 at 08:48:26AM +0200, Marek Szyprowski wrote:
> > Hi Shaik,
> >
> > I've run into similar problem while adapting S5P-MFC and Exynos4-IS
> > drivers for generic IOMMU-DMA framework. Here is my first solution:
> > https://lore.kernel.org/linux-samsung-soc/20200918144833.14618-1-m.szyprow...@samsung.com/T/
> >
> >
> > It allows to remap given buffer at the specific IOVA address, although
> > it doesn't guarantee that those specific addresses won't be later used
> > by the IOVA allocator. Probably it would make sense to add an API for
> > generic IOMMU-DMA framework to mark the given IOVA range as
> > reserved/unused to protect them.
>
> If you want to use IOVA addresses in a device otherwise managed by
> dma-iommu we need to expose an API through the dma API.  Can you please
> include the iommu list in the discussion of your series?
>
> I don't think using the raw IOMMU API is a very good idea in these drivers,
> we probably want a way to change the allocator algorithm or hint the
> next IOVA and keep using the normal DMA API.  Maybe Robin has a better
> idea.
> ___
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


dma_alloc_coherent not allocating memory from CMA Reserved

2020-09-23 Thread Sathyavathi M



Hi All,
 
I am trying to allocate 33 MB of coherent memory in a kernel driver, and for that I have reserved a 1024 MB CMA region. But from dmesg I can see that the address reserved for CMA is different from the one I get back from dma_alloc_coherent(). My PC is an Intel x86 machine; I have tried different motherboards, and the issue occurs only on some specific PCs. Please help me debug the actual issue. Below are the kernel boot logs:

[ 0.014362] No NUMA configuration found
[ 0.014363] Faking a node at [mem 0x-0x00021edf]
[ 0.014374] NODE_DATA(0) allocated [mem 0x21edd5000-0x21edf]
[ 0.014538] cma: Reserved 400 MiB at 0x000205c0
[ 0.014541] Reserving 640MB of memory at 2512MB for crashkernel (System RAM: 8046MB)
[ 0.014553] Zone ranges:
[ 0.014554] DMA [mem 0x1000-0x00ff]
[ 0.014554] DMA32 [mem 0x0100-0x]
[ 0.014555] Normal [mem 0x0001-0x00021edf]

At the dma_alloc_coherent() call:
[ 27.816062] dev->dma_33M_addr is f800

Below are the logs in the working case, where the address we get at the dma_alloc_coherent() call is within the range reserved for CMA. At boot I get:

Faking a node at [mem 0x-0x00019fdf]
NODE_DATA(0) allocated [mem 0x19fdd3000-0x19fdfdfff]
cma: Reserved 800 MiB at 0x00016dc0
Reserving 640MB of memory at 1792MB for crashkernel (System RAM: 6016MB)
Zone ranges:
DMA [mem 0x1000-0x00ff]
DMA32 [mem 0x0100-0x]
Normal [mem 0x0001-0x00019fdf]

At the dma_alloc_coherent() call:
dev->dma_33M_addr is 16e20
 
Please help me solve this issue, or suggest an alternative way to allocate a big coherent buffer.
 
 
Regards,
Sathya
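
(A minimal sketch of the allocation path described above, with an assumed
device name; setting the coherent mask first is the usual prerequisite:)

	#include <linux/dma-mapping.h>
	#include <linux/sizes.h>

	/* Sketch: widen the coherent mask, then ask for the 33 MB buffer. */
	ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
	if (ret)
		return ret;

	vaddr = dma_alloc_coherent(&pdev->dev, 33 * SZ_1M, &dma_handle, GFP_KERNEL);
	if (!vaddr)
		return -ENOMEM;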


 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: IOVA allocation dependency between firmware buffer and remaining buffers

2020-09-23 Thread Christoph Hellwig
On Wed, Sep 23, 2020 at 08:48:26AM +0200, Marek Szyprowski wrote:
> Hi Shaik,
> 
> I've run into similar problem while adapting S5P-MFC and Exynos4-IS 
> drivers for generic IOMMU-DMA framework. Here is my first solution: 
> https://lore.kernel.org/linux-samsung-soc/20200918144833.14618-1-m.szyprow...@samsung.com/T/
>  
> 
> 
> It allows to remap given buffer at the specific IOVA address, although 
> it doesn't guarantee that those specific addresses won't be later used 
> by the IOVA allocator. Probably it would make sense to add an API for 
> generic IOMMU-DMA framework to mark the given IOVA range as 
> reserved/unused to protect them.

If you want to use IOVA addresses in a device otherwise managed by
dma-iommu we need to expose an API through the dma API.  Can you please
include the iommu list in the discussion of your series?

I don't think using the raw IOMMU API is a very good idea in these drivers,
we probably want a way to change the allocator algorithm or hint the
next IOVA and keep using the normal DMA API.  Maybe Robin has a better
idea.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: IOVA allocation dependency between firmware buffer and remaining buffers

2020-09-23 Thread Marek Szyprowski
Hi Shaik,

I've run into similar problem while adapting S5P-MFC and Exynos4-IS 
drivers for generic IOMMU-DMA framework. Here is my first solution: 
https://lore.kernel.org/linux-samsung-soc/20200918144833.14618-1-m.szyprow...@samsung.com/T/
 


It allows remapping a given buffer at a specific IOVA address, although 
it doesn't guarantee that those specific addresses won't later be used 
by the IOVA allocator. It would probably make sense to add an API to the 
generic IOMMU-DMA framework to mark a given IOVA range as 
reserved/unused to protect it.
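
A sketch of what such an API could look like on top of the existing IOVA
allocator (the helper name is hypothetical; reserve_iova() is the existing
low-level primitive):

	#include <linux/iova.h>

	/* Carve a range out of the allocator so dma-iommu never hands it out. */
	static int example_reserve_iova_range(struct iova_domain *iovad,
					      dma_addr_t start, size_t size)
	{
		unsigned long lo = start >> iova_shift(iovad);
		unsigned long hi = (start + size - 1) >> iova_shift(iovad);

		return reserve_iova(iovad, lo, hi) ? 0 : -ENOMEM;
	}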

Best regards
Marek Szyprowski, PhD
Samsung R&D Institute Poland

On 24.04.2020 18:15, Shaik Ameer Basha wrote:
> On Fri, Apr 24, 2020 at 8:59 PM Robin Murphy  wrote:
>> On 2020-04-24 4:04 pm, Ajay kumar wrote:
>>> Can someone check this?
>>>
>>> On Mon, Apr 20, 2020 at 9:24 PM Ajay kumar  wrote:
 Hi All,

 I have an IOMMU master which has limitations as mentioned below:
 1) The IOMMU master internally executes a firmware, and the firmware memory
 is allocated by the same master driver.
 The firmware buffer address should be of the lowest range than other 
 address
 allocated by the device, or in other words, all the remaining buffer 
 addresses
 should always be in a higher range than the firmware address.
 2) None of the buffer addresses should go beyond 0xC000_
>> That particular constraint could (and perhaps should) be expressed as a
>> DMA mask/limit for the device, but if you have specific requirements to
> Yes Robin. We do use 0xC000_ address to set the DMA mask in our driver.
>
>> place buffers at particular addresses then you might be better off
>> managing your own IOMMU domain like some other (mostly DRM) drivers do.
> If you remember any such drivers, can you please point out the driver path?
>
>> The DMA APIs don't offer any guarantees about what addresses you'll get
>> other than that they won't exceed the appropriate mask.
> True, we have gone through most of the APIs and didn't find any way to
> match our requirements with the existing DMA APIs.
>
 Example:
 If the firmware buffer address is buf_fw = 0x8000_5000, then
 all other addresses given to the device should be greater than
 (0x8000_5000 + firmware size) and less than 0xC000_
>> Out of curiosity, how do you control that in the no-IOMMU or IOMMU
>> passthrough cases?
> We manage the no-IOMMU and passthrough cases using reserved-memory.
>
>> Robin.
>>
 Currently, this is being handled with one of the hacks below:
 1) By keeping the dma_mask in a lower range while allocating the firmware
 buffer, and then increasing the dma_mask to a higher range for the other
 buffers.
 2) By reserving an IOVA for the firmware at the lowest range and creating
 direct mappings for it.

 I want to know if there is a better way this can be handled in the
 current framework, or, if anybody is facing similar problems with their
 devices, please share how they are taken care of.

 I also think there should be some way for masters to specify the IOVA
 range they want to limit the current allocation to.
 Something like a new iommu_ops callback, as below:
 limit_iova_alloc_range(dev, iova_start, iova_end)

 And, in my driver, the sequence will be:
 limit_iova_alloc_range(dev, 0x_, 0x1000_); /* via helpers */
 alloc( ) firmware buffer using DMA API
 limit_iova_alloc_range(dev, 0x1000_, 0xC000_); /* via helpers */
 alloc( ) other buffers using DMA API

> Just want to understand more from you about the new iommu_ops we suggested.
> Shouldn't the device have the flexibility to allocate IOVAs as per its
> requirements?
> Taking our device as an example, we need to have control over the
> allocated IOVA region based on where the device uses the buffer.
>
> If we have these callbacks in place, then the low-level IOMMU driver
> can implement and manage such requests when needed.
>
> If this can't be taken forward for good reasons, then we will definitely
> try to understand how to manage the IOMMU domain from our driver, as per
> your suggestion.
>
> - Shaik.
>
 Thanks,
 Ajay Kumar
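To make the proposal above concrete, the driver-side sequence would look
roughly like this; limit_iova_alloc_range() is the hypothetical interface
from Ajay's mail and does not exist upstream, and the window addresses are
purely illustrative:

#include <linux/dma-mapping.h>
#include <linux/sizes.h>

/* HYPOTHETICAL -- sketched from the proposal above, not an upstream API. */
int limit_iova_alloc_range(struct device *dev, dma_addr_t iova_start,
			   dma_addr_t iova_end);

static int alloc_in_order(struct device *dev, dma_addr_t *fw_dma,
			  dma_addr_t *buf_dma)
{
	void *fw, *buf;

	/* Firmware first, pinned to the lowest IOVA window. */
	limit_iova_alloc_range(dev, 0x0, 0x10000000);
	fw = dma_alloc_coherent(dev, SZ_1M, fw_dma, GFP_KERNEL);

	/* Everything else goes above the firmware window. */
	limit_iova_alloc_range(dev, 0x10000000, 0xC0000000);
	buf = dma_alloc_coherent(dev, SZ_4M, buf_dma, GFP_KERNEL);

	return (fw && buf) ? 0 : -ENOMEM;
}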

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] iommu/arm: Add module parameter to set msi iova address

2020-09-23 Thread Auger Eric
Hi Will,

On 9/21/20 10:45 PM, Will Deacon wrote:
> On Mon, Sep 14, 2020 at 11:13:07AM -0700, Vennila Megavannan wrote:
>> From: Srinath Mannam 
>>
>> Add a provision to change the default MSI IOVA base to an IOVA suitable
>> for the platform using a module parameter. The present hardcoded MSI IOVA
>> base may not be within the platform's accessible IOVA ranges.
>>
>> If a platform has a limitation accessing the default MSI IOVA base, it can
>> be changed using the "arm-smmu.msi_iova_base=0xa000" command line argument.
>>
>> Signed-off-by: Srinath Mannam 
>> Co-developed-by: Vennila Megavannan 
>> Signed-off-by: Vennila Megavannan 
>> ---
>>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 5 -
>>  drivers/iommu/arm/arm-smmu/arm-smmu.c   | 5 -
>>  2 files changed, 8 insertions(+), 2 deletions(-)
> 
> This feels pretty fragile. Wouldn't it be better to realise that there's
> a region conflict with iommu_dma_get_resv_regions() and move the MSI window
> accordingly at runtime?

Since cd2c9fcf5c66 ("iommu/dma: Move PCI window region reservation back
into dma specific path"), the PCI host bridge windows are not exposed by
iommu_dma_get_resv_regions() anymore. If I understood correctly, what is
attempted here is to avoid a collision between such a PCI host bridge
window and the MSI IOVA range.
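For readers without the diff at hand, the approach under discussion is
essentially a module parameter feeding the driver's reserved-region
reporting. A sketch of the idea, not the actual patch; the 0x8000000
default and 1 MiB length mirror the hardcoded MSI window constants in
the SMMU drivers:

#include <linux/module.h>
#include <linux/iommu.h>
#include <linux/sizes.h>

static unsigned long msi_iova_base = 0x8000000;	/* upstream default */
module_param(msi_iova_base, ulong, 0444);
MODULE_PARM_DESC(msi_iova_base, "Base address of the MSI IOVA window");

/* Report the (possibly overridden) window as a software MSI region. */
static void example_get_resv_regions(struct device *dev,
				     struct list_head *head)
{
	struct iommu_resv_region *region;

	region = iommu_alloc_resv_region(msi_iova_base, SZ_1M,
					 IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO,
					 IOMMU_RESV_SW_MSI);
	if (region)
		list_add_tail(&region->list, head);
}

A runtime alternative along the lines Will suggests would presumably pick
the base only after checking it against the regions already reserved for
the device.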

Thanks

Eric
> 
> Will
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu