date:20200911

Re: [PATCH 04/12] dma-mapping: fix DMA_OPS dependencies

2020-09-11 Thread Christoph Hellwig

On Thu, Sep 10, 2020 at 01:55:37PM +0100, Robin Murphy wrote:
> AFAICS all three of these bus drivers are only proxying a struct 
> dma_map_ops * pointer around, so if they used the set_dma_ops() helper they 
> shouldn't even need these selects at all. Only INTEL_MIC_HOST appears to 
> have a logical dependency on DMA_OPS for actual functionality.
>
> However, I have a vague feeling you might not be fond of those dma_ops 
> helpers, and I have no great objection to this one-liner as-is, so (modulo 
> the couple of commit message typos),

The problem with these inherіtances is that they don't actually work
for the general case.  You'd also need to inherity things like the
dma ranges, the bus limits, etc, etc.  So we need to kill them instead.
That whole mic/vop case is even worse than that with it's weird set
of chained dma ops that seems to implement some kind of device side
iommu that isn't in scope for the DMA API at all.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 04/12] dma-mapping: fix DMA_OPS dependencies

2020-09-11 Thread Christoph Hellwig

On Tue, Sep 08, 2020 at 09:04:17PM +0300, Sergei Shtylyov wrote:
> Hello!
> 
> On 9/8/20 7:47 PM, Christoph Hellwig wrote:
> 
> > Driver that select DMA_OPS need to depend on HAS_DMA support to
> > work.  The vop driver was missing that dependency, so add it, and also
> > add a nother depends in DMA_OPS itself.  That won't fix the issue due
> 
>Another? :-)

Yes, fixed.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 09/12] dma-direct: remove __dma_to_phys

2020-09-11 Thread Christoph Hellwig

On Thu, Sep 10, 2020 at 02:26:03PM +0100, Robin Murphy wrote:
> On 2020-09-08 17:47, Christoph Hellwig wrote:
>> There is no harm in just always clearing the SME encryption bit, while
>> significantly simplifying the interface.
>
> After a 10-minute diversion into "but hang on, force_dma_unencrypted() is 
> meaningful on PPC and S390 too..." before realising that it all does just 
> come back to __sme_clr(), which is indeed a no-op for everyone other than 
> AMD, any simplification of this mess is indeed welcome :)
>
> Unless I've massively misunderstood how SME is supposed to work,

Exactly.  This weird encryption bit in AMD SME causes all kinds of harm,
and I'm glad no one picked it up.  I've also been wondering if we
should change the interface to explicit set/clear the bit, but I'll
leave that for another pass as fixing up the SME interfaces would turn
into a massive disgression.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 11/12] dma-mapping: move dma_common_{mmap,get_sgtable} out of mapping.c

2020-09-11 Thread Christoph Hellwig

On Thu, Sep 10, 2020 at 02:34:18PM +0100, Robin Murphy wrote:
> On 2020-09-08 17:47, Christoph Hellwig wrote:
>> Add a new file that contains helpera for misc DMA ops, which is only
>
> The Latin plural of the singular "helperum", I guess? :P

I've switched it to helpers, that might be easier to understand :)
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 12/12] dma-mapping: move the dma_declare_coherent_memory documentation

2020-09-11 Thread Christoph Hellwig

On Thu, Sep 10, 2020 at 02:51:47PM +0100, Robin Murphy wrote:
> On 2020-09-08 17:47, Christoph Hellwig wrote:
>> dma_declare_coherent_memory should not be in a DMA API guide aimed
>> at driver writers (that is consumers of the API).  Move it to a comment
>> near the function instead.
>
> I still think there might be an occasional valid use for device-local 
> memory outside the scope of platform code without the driver having to go 
> full ZONE_DEVICE/HMM/TTM, e.g. with stuff like PCIe-based FPGA prototyping 
> cards, but the kind of driver I'm imagining for that case would never be 
> upstream anyway (if it were even written, rather than just using hard-coded 
> hacks), so meh.

And I'm not sure this would be the right interface for it.  E.g. NVMe
has the concept of a Controller Memory buffer (and a similar persistent
variant not supported by Linux), where the device can do this local DMA
(in a completely broken way that relies on correlating addresses seen
by the device and those by the host, but that's another disgression).
But that memory obviously can also be addresses by other devices using
PCIe P2P transactions which would also be useful for any HMM-ish devices,
so we'd need to expose it as P2P memory anyay..
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 3/3] iommu/tegra-smmu: Allow to group clients in same swgroup

2020-09-11 Thread Nicolin Chen

There can be clients using the same swgroup in DT, for example i2c0
and i2c1. The current driver will add them to separate IOMMU groups,
though it has implemented device_group() callback which is to group
devices using different swgroups like DC and DCB.

All clients having the same swgroup should be also added to the same
IOMMU group so as to share an asid. Otherwise, the asid register may
get overwritten every time a new device is attached.

Signed-off-by: Nicolin Chen 
---
 drivers/iommu/tegra-smmu.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 50b962b0647e..84fdee473873 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -23,6 +23,7 @@ struct tegra_smmu_group {
struct tegra_smmu *smmu;
const struct tegra_smmu_group_soc *soc;
struct iommu_group *group;
+   unsigned int swgroup;
 };
 
 struct tegra_smmu {
@@ -909,14 +910,14 @@ static struct iommu_group *tegra_smmu_group_get(struct 
tegra_smmu *smmu,
struct tegra_smmu_group *group;
struct iommu_group *grp;
 
+   /* Find group_soc associating with swgroup */
soc = tegra_smmu_find_group(smmu, swgroup);
-   if (!soc)
-   return NULL;
 
mutex_lock(&smmu->lock);
 
+   /* Find existing iommu_group associating with swgroup or group_soc */
list_for_each_entry(group, &smmu->groups, list)
-   if (group->soc == soc) {
+   if ((group->swgroup == swgroup) || (soc && group->soc == soc)) {
grp = iommu_group_ref_get(group->group);
mutex_unlock(&smmu->lock);
return grp;
@@ -929,6 +930,7 @@ static struct iommu_group *tegra_smmu_group_get(struct 
tegra_smmu *smmu,
}
 
INIT_LIST_HEAD(&group->list);
+   group->swgroup = swgroup;
group->smmu = smmu;
group->soc = soc;
 
@@ -940,7 +942,8 @@ static struct iommu_group *tegra_smmu_group_get(struct 
tegra_smmu *smmu,
}
 
iommu_group_set_iommudata(group->group, group, 
tegra_smmu_group_release);
-   iommu_group_set_name(group->group, soc->name);
+   if (soc)
+   iommu_group_set_name(group->group, soc->name);
list_add_tail(&group->list, &smmu->groups);
mutex_unlock(&smmu->lock);
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 2/3] iommu/tegra-smmu: Fix iova->phys translation

2020-09-11 Thread Nicolin Chen

IOVA might not be always 4KB aligned. So tegra_smmu_iova_to_phys
function needs to add on the lower 12-bit offset from input iova.

Signed-off-by: Nicolin Chen 
---
 drivers/iommu/tegra-smmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 789d21c01b77..50b962b0647e 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -795,7 +795,7 @@ static phys_addr_t tegra_smmu_iova_to_phys(struct 
iommu_domain *domain,
 
pfn = *pte & as->smmu->pfn_mask;
 
-   return SMMU_PFN_PHYS(pfn);
+   return SMMU_PFN_PHYS(pfn) + SMMU_OFFSET_IN_PAGE(iova);
 }
 
 static struct tegra_smmu *tegra_smmu_find(struct device_node *np)
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 0/3] iommu/tegra-smmu: Some small fixes

2020-09-11 Thread Nicolin Chen

These are a series of small fixes for tegra-smmu driver.
They might not be critial bugs as current mainline does
not enable a lot of clients, but be nicer to have since
we are going to enable the DMA domain for other clients
in the near future.

Nicolin Chen (3):
  iommu/tegra-smmu: Do not use PAGE_SHIFT and PAGE_MASK
  iommu/tegra-smmu: Fix iova->phys translation
  iommu/tegra-smmu: Allow to group clients in same swgroup

 drivers/iommu/tegra-smmu.c | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 1/3] iommu/tegra-smmu: Do not use PAGE_SHIFT and PAGE_MASK

2020-09-11 Thread Nicolin Chen

PAGE_SHIFT and PAGE_MASK are defined corresponding to the page size
for CPU virtual addresses, which means PAGE_SHIFT could be a number
other than 12, but tegra-smmu maintains fixed 4KB IOVA pages and has
fixed [21:12] bit range for PTE entries.

So this patch replaces all PAGE_SHIFT/PAGE_MASK references with the
macros defined with SMMU_PTE_SHIFT.

Signed-off-by: Nicolin Chen 
---
 drivers/iommu/tegra-smmu.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 046add7acb61..789d21c01b77 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -130,6 +130,11 @@ static inline u32 smmu_readl(struct tegra_smmu *smmu, 
unsigned long offset)
 #define SMMU_PDE_SHIFT 22
 #define SMMU_PTE_SHIFT 12
 
+#define SMMU_PAGE_MASK (~(SMMU_SIZE_PT-1))
+#define SMMU_OFFSET_IN_PAGE(x) ((unsigned long)(x) & ~SMMU_PAGE_MASK)
+#define SMMU_PFN_PHYS(x)   ((phys_addr_t)(x) << SMMU_PTE_SHIFT)
+#define SMMU_PHYS_PFN(x)   ((unsigned long)((x) >> SMMU_PTE_SHIFT))
+
 #define SMMU_PD_READABLE   (1 << 31)
 #define SMMU_PD_WRITABLE   (1 << 30)
 #define SMMU_PD_NONSECURE  (1 << 29)
@@ -644,7 +649,7 @@ static void tegra_smmu_set_pte(struct tegra_smmu_as *as, 
unsigned long iova,
   u32 *pte, dma_addr_t pte_dma, u32 val)
 {
struct tegra_smmu *smmu = as->smmu;
-   unsigned long offset = offset_in_page(pte);
+   unsigned long offset = SMMU_OFFSET_IN_PAGE(pte);
 
*pte = val;
 
@@ -726,7 +731,7 @@ __tegra_smmu_map(struct iommu_domain *domain, unsigned long 
iova,
pte_attrs |= SMMU_PTE_WRITABLE;
 
tegra_smmu_set_pte(as, iova, pte, pte_dma,
-  __phys_to_pfn(paddr) | pte_attrs);
+  SMMU_PHYS_PFN(paddr) | pte_attrs);
 
return 0;
 }
@@ -790,7 +795,7 @@ static phys_addr_t tegra_smmu_iova_to_phys(struct 
iommu_domain *domain,
 
pfn = *pte & as->smmu->pfn_mask;
 
-   return PFN_PHYS(pfn);
+   return SMMU_PFN_PHYS(pfn);
 }
 
 static struct tegra_smmu *tegra_smmu_find(struct device_node *np)
@@ -1108,7 +1113,8 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
smmu->dev = dev;
smmu->mc = mc;
 
-   smmu->pfn_mask = BIT_MASK(mc->soc->num_address_bits - PAGE_SHIFT) - 1;
+   smmu->pfn_mask =
+   BIT_MASK(mc->soc->num_address_bits - SMMU_PTE_SHIFT) - 1;
dev_dbg(dev, "address bits: %u, PFN mask: %#lx\n",
mc->soc->num_address_bits, smmu->pfn_mask);
smmu->tlb_mask = (smmu->soc->num_tlb_lines << 1) - 1;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 0/8] iommu/arm-smmu: Support maintaining bootloader mappings

2020-09-11 Thread Sai Prakash Ranjan


Hi Bjorn,

On 2020-09-04 21:25, Bjorn Andersson wrote:
Based on previous attempts and discussions this is the latest attempt 
at
inheriting stream mappings set up by the bootloader, for e.g. boot 
splash or

efifb.

Per Will's request this builds on the work by Jordan and Rob for the 
Adreno
SMMU support. It applies cleanly ontop of v16 of their series, which 
can be

found at
https://lore.kernel.org/linux-arm-msm/20200901164707.2645413-1-robdcl...@gmail.com/



Thanks for working on this, I have tested this on qcom platforms
where firmware does these shenanigans(most android) and this series
works well and where firmware doesn't do all this (chrome) and no
regressions there. Review and test tags given on individual patches.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 1/8] iommu/arm-smmu: Refactor context bank allocation

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-04 21:25, Bjorn Andersson wrote:

Extract the conditional invocation of the platform defined
alloc_context_bank() to a separate function to keep
arm_smmu_init_domain_context() cleaner.

Instead pass a reference to the arm_smmu_device as parameter to the
call. Also remove the count parameter, as this can be read from the
newly passed object.

This allows us to not assign smmu_domain->smmu before attempting to
allocate the context bank and as such we don't need to roll back this
assignment on failure.

Signed-off-by: Bjorn Andersson 
---


Reviewed-by: Sai Prakash Ranjan 
Tested-by: Sai Prakash Ranjan 

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 2/8] iommu/arm-smmu: Delay modifying domain during init

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-04 21:25, Bjorn Andersson wrote:

Delay modifications to the domain during arm_smmu_init_domain_context()
until we've allocated a context bank. This will allow us to postpone 
the
special handling of identity domains until the platform specific 
context

bank allocator has been executed, in a later patch.

Signed-off-by: Bjorn Andersson 
---


Reviewed-by: Sai Prakash Ranjan 
Tested-by: Sai Prakash Ranjan 

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 3/8] iommu/arm-smmu: Consult context bank allocator for identify domains

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-04 21:25, Bjorn Andersson wrote:
For implementations of the ARM SMMU where stream mappings of bypass 
type

are prohibited identity domains can be implemented by using context
banks with translation disabled.

Postpone the decision to skip allocating a context bank until the
implementation specific context bank allocator has been consulted and 
if

it decides to use a context bank for the identity map, don't enable
translation (i.e. omit ARM_SMMU_SCTLR_M).

Signed-off-by: Bjorn Andersson 
---



Minor nit in the subject: identify -> identity

Reviewed-by: Sai Prakash Ranjan 
Tested-by: Sai Prakash Ranjan 

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 3/8] iommu/arm-smmu: Consult context bank allocator for identify domains

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-04 21:25, Bjorn Andersson wrote:
For implementations of the ARM SMMU where stream mappings of bypass 
type

are prohibited identity domains can be implemented by using context
banks with translation disabled.

Postpone the decision to skip allocating a context bank until the
implementation specific context bank allocator has been consulted and 
if

it decides to use a context bank for the identity map, don't enable
translation (i.e. omit ARM_SMMU_SCTLR_M).

Signed-off-by: Bjorn Andersson 
---



...


diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h
b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index ddf2ca4c923d..235d9a3a6ab6 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -243,6 +243,8 @@ enum arm_smmu_cbar_type {
 #define TLB_LOOP_TIMEOUT   100 /* 1s! */
 #define TLB_SPIN_COUNT 10

+#define ARM_SMMU_CBNDX_BYPASS  0x
+
 /* Shared driver definitions */
 enum arm_smmu_arch_version {
ARM_SMMU_V1,
@@ -346,6 +348,7 @@ struct arm_smmu_cfg {
u32 sctlr_clr;/* bits to mask in SCTLR 
*/
enum arm_smmu_cbar_type cbar;
enum arm_smmu_context_fmt   fmt;
+   boolm;


Can we use mmu_enable instead of m here to be more descriptive?

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 4/8] iommu/arm-smmu-qcom: Emulate bypass by using context banks

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-04 21:25, Bjorn Andersson wrote:

Some firmware found on various Qualcomm platforms traps writes to S2CR
of type BYPASS and writes FAULT into the register. In particular, this
prevents us from marking the streams for the display controller as
BYPASS to allow continued scanout of the screen through the
initialization of the ARM SMMU.

This adds a Qualcomm specific cfg_probe function, which probes for the
broken behavior of the S2CR registers and implements a custom
alloc_context_bank() that when necessary allocates a context bank
(without translation) for these domains as well.

Signed-off-by: Bjorn Andersson 
---

Changes since v2:
- Move quirk from arm_smmudevice to qcom_smmu, as we localize the quirk
  handling to the Qualcomm specific implemntation.

 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 52 ++
 1 file changed, 52 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 229fc8ff8cea..284761a1cd8e 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -11,8 +11,14 @@

 struct qcom_smmu {
struct arm_smmu_device smmu;
+   bool bypass_broken;
 };

+static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu)
+{
+   return container_of(smmu, struct qcom_smmu, smmu);
+}
+
 #define QCOM_ADRENO_SMMU_GPU_SID 0

 static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
@@ -162,6 +168,50 @@ static const struct of_device_id
qcom_smmu_client_of_match[] __maybe_unused = {
{ }
 };

+static int qcom_smmu_alloc_context_bank(struct arm_smmu_domain 
*smmu_domain,

+   struct arm_smmu_device *smmu,
+   struct device *dev, int start)
+{
+   struct iommu_domain *domain = &smmu_domain->domain;
+   struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
+
+   /* Keep identity domains as bypass, unless bypass is broken */
+   if (domain->type == IOMMU_DOMAIN_IDENTITY && !qsmmu->bypass_broken)
+   return ARM_SMMU_CBNDX_BYPASS;
+
+   /*
+* The identity domain to emulate bypass is the only domain without a
+* dev, use the last context bank for this to avoid collisions with
+* active contexts during initialization.
+*/
+   if (!dev)
+   start = smmu->num_context_banks - 1;
+
+   return __arm_smmu_alloc_bitmap(smmu->context_map, start,
smmu->num_context_banks);
+}
+
+static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
+{
+	unsigned int last_s2cr = ARM_SMMU_GR0_S2CR(smmu->num_mapping_groups - 
1);

+   struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
+   u32 reg;
+
+   /*
+* With some firmware writes to S2CR of type FAULT are ignored, and
+	 * writing BYPASS will end up as FAULT in the register. Perform a 
write

+* to S2CR to detect if this is the case with the current firmware.
+*/
+   reg = FIELD_PREP(ARM_SMMU_S2CR_TYPE, S2CR_TYPE_BYPASS) |
+ FIELD_PREP(ARM_SMMU_S2CR_CBNDX, 0xff) |
+ FIELD_PREP(ARM_SMMU_S2CR_PRIVCFG, S2CR_PRIVCFG_DEFAULT);
+   arm_smmu_gr0_write(smmu, last_s2cr, reg);
+   reg = arm_smmu_gr0_read(smmu, last_s2cr);
+   if (FIELD_GET(ARM_SMMU_S2CR_TYPE, reg) != S2CR_TYPE_BYPASS)
+   qsmmu->bypass_broken = true;
+


Clever :)

Reviewed-by: Sai Prakash Ranjan 
Tested-by: Sai Prakash Ranjan 

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 5/8] iommu/arm-smmu-qcom: Consistently initialize stream mappings

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-04 21:25, Bjorn Andersson wrote:

Firmware that traps writes to S2CR to translate BYPASS into FAULT also
ignores writes of type FAULT. As such booting with "disable_bypass" set
will result in all S2CR registers left as configured by the bootloader.

This has been seen to result in indeterministic results, as these
mappings might linger and reference context banks that Linux is
reconfiguring.

Use the fact that BYPASS writes result in FAULT type to force all 
stream

mappings to FAULT.

Signed-off-by: Bjorn Andersson 
---


Reviewed-by: Sai Prakash Ranjan 
Tested-by: Sai Prakash Ranjan 


--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 6/8] iommu/arm-smmu: Add impl hook for inherit boot mappings

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-04 21:25, Bjorn Andersson wrote:

Add a new operation to allow platform implementations to inherit any
stream mappings from the boot loader.

Signed-off-by: Bjorn Andersson 
---



Reviewed-by: Sai Prakash Ranjan 
Tested-by: Sai Prakash Ranjan 

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 7/8] iommu/arm-smmu: Provide helper for allocating identity domain

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-04 21:25, Bjorn Andersson wrote:

Some platform implementations needs to be able to allocate a domain for
emulating identity mappings using a context bank without translation.
Provide a helper function to allocate such a domain.

Signed-off-by: Bjorn Andersson 
---



Reviewed-by: Sai Prakash Ranjan 
Tested-by: Sai Prakash Ranjan 

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 8/8] iommu/arm-smmu-qcom: Setup identity domain for boot mappings

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-04 21:25, Bjorn Andersson wrote:

With many Qualcomm platforms not having functional S2CR BYPASS a
temporary IOMMU domain, without translation, needs to be allocated in
order to allow these memory transactions.

Unfortunately the boot loader uses the first few context banks, so
rather than overwriting a active bank the last context bank is used and
streams are diverted here during initialization.

This also performs the readback of SMR registers for the Qualcomm
platform, to trigger the mechanism.

This is based on prior work by Thierry Reding and Laurentiu Tudor.

Signed-off-by: Bjorn Andersson 
---



Reviewed-by: Sai Prakash Ranjan 
Tested-by: Sai Prakash Ranjan 

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH] iommu: Kconfig: Update help description for IPMMU_VMSA config

2020-09-11 Thread Lad Prabhakar

ipmmu-vmsa driver is also used on Renesas RZ/G{1,2} Soc's, update the
same to reflect the help description for IPMMU_VMSA config.

Signed-off-by: Lad Prabhakar 
Reviewed-by: Chris Paterson 
---
 drivers/iommu/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index bef5d75e306b..d8f71bf31786 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -232,7 +232,7 @@ config IPMMU_VMSA
select ARM_DMA_USE_IOMMU
help
  Support for the Renesas VMSA-compatible IPMMU found in the R-Mobile
- APE6, R-Car Gen2, and R-Car Gen3 SoCs.
+ APE6, R-Car Gen{2,3} and RZ/G{1,2} SoCs.
 
  If unsure, say N.
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v2] iommu/dma: Fix IOVA reserve dma ranges

2020-09-11 Thread Srinath Mannam via iommu

Fix IOVA reserve failure in the case when address of first memory region
listed in dma-ranges is equal to 0x0.

Fixes: aadad097cd46f ("iommu/dma: Reserve IOVA for PCIe inaccessible DMA 
address")
Signed-off-by: Srinath Mannam 
---
Changes from v1:
   Removed unnecessary changes based on Robin's review comments.

 drivers/iommu/dma-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 5141d49a046b..682068a9aae7 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -217,7 +217,7 @@ static int iova_reserve_pci_windows(struct pci_dev *dev,
lo = iova_pfn(iovad, start);
hi = iova_pfn(iovad, end);
reserve_iova(iovad, lo, hi);
-   } else {
+   } else if (end < start) {
/* dma_ranges list should be sorted */
dev_err(&dev->dev, "Failed to reserve IOVA\n");
return -EINVAL;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/3] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE

2020-09-11 Thread Robin Murphy


On 2020-09-10 06:40, Christoph Hellwig wrote:

The DMA offset notifier can only be used if PHYS_OFFSET is at least
KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit
phys_addr_t.  Currently the code compiles fine despite that, a pending
change to the DMA offset handling would create a compiler warning for
this case.  Add an ifdef to not compile the code except for LPAE
configs.


Seems reasonable - once again I wonder whether this notifier is really 
needed any more since "dma-ranges" should now be handled properly for 
PCI devices as well, but that's not something to worry about in this series.


Reviewed-by: Robin Murphy 


Signed-off-by: Christoph Hellwig 
---
  arch/arm/mach-keystone/keystone.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/arch/arm/mach-keystone/keystone.c 
b/arch/arm/mach-keystone/keystone.c
index 638808c4e12247..dcd031ba84c2e0 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -24,6 +24,7 @@
  
  #include "keystone.h"
  
+#ifdef CONFIG_ARM_LPAE

  static unsigned long keystone_dma_pfn_offset __read_mostly;
  
  static int keystone_platform_notifier(struct notifier_block *nb,

@@ -48,14 +49,17 @@ static int keystone_platform_notifier(struct notifier_block 
*nb,
  static struct notifier_block platform_nb = {
.notifier_call = keystone_platform_notifier,
  };
+#endif /* CONFIG_ARM_LPAE */
  
  static void __init keystone_init(void)

  {
+#ifdef CONFIG_ARM_LPAE
if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) {
keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START -
   KEYSTONE_LOW_PHYS_START);
bus_register_notifier(&platform_bus_type, &platform_nb);
}
+#endif
keystone_pm_runtime_init();
  }
  


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/3] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE

2020-09-11 Thread Robin Murphy


On 2020-09-11 12:15, Russell King - ARM Linux admin wrote:

On Thu, Sep 10, 2020 at 07:40:37AM +0200, Christoph Hellwig wrote:

The DMA offset notifier can only be used if PHYS_OFFSET is at least
KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit
phys_addr_t.  Currently the code compiles fine despite that, a pending
change to the DMA offset handling would create a compiler warning for
this case.  Add an ifdef to not compile the code except for LPAE
configs.


However, to have use of the high physical offset, LPAE needs to be
enabled, which ensures that phys_addr_t is 64-bit.

I believe that DMA is non-coherent on this platform unless the high
physical address is used. Or something like that.


Yeah, it's probably not a configuration that anyone would actually want 
to use in anger on Keystone itself, but as long as folks might have 
ARCH_KEYSTONE selected in their non-LPAE multiplatform config we should 
avoid build regressions. I did wonder if Keystone should simply have a 
hard dependency on LPAE, but there does appear to be some explicit 
support for running directly from the non-coherent low 2G alias :/


Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/3] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE

2020-09-11 Thread Russell King - ARM Linux admin

On Thu, Sep 10, 2020 at 07:40:37AM +0200, Christoph Hellwig wrote:
> The DMA offset notifier can only be used if PHYS_OFFSET is at least
> KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit
> phys_addr_t.  Currently the code compiles fine despite that, a pending
> change to the DMA offset handling would create a compiler warning for
> this case.  Add an ifdef to not compile the code except for LPAE
> configs.

However, to have use of the high physical offset, LPAE needs to be
enabled, which ensures that phys_addr_t is 64-bit.

I believe that DMA is non-coherent on this platform unless the high
physical address is used. Or something like that.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH] iommu: fsl_pamu: Replace use of kzfree with kfree_sensitive

2020-09-11 Thread Alex Dewar

kzfree() is effectively deprecated as of commit 453431a54934 ("mm,
treewide: rename kzfree() to kfree_sensitive()") and is now simply an
alias for kfree_sensitive(). So just replace it with kfree_sensitive().

Signed-off-by: Alex Dewar 
---
 drivers/iommu/fsl_pamu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/fsl_pamu.c b/drivers/iommu/fsl_pamu.c
index 099a11a35fb9..b9a974d97831 100644
--- a/drivers/iommu/fsl_pamu.c
+++ b/drivers/iommu/fsl_pamu.c
@@ -1174,7 +1174,7 @@ static int fsl_pamu_probe(struct platform_device *pdev)
if (irq != NO_IRQ)
free_irq(irq, data);
 
-   kzfree(data);
+   kfree_sensitive(data);
 
if (pamu_regs)
iounmap(pamu_regs);
-- 
2.28.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCHv4 0/6] System Cache support for GPU and required SMMU support

2020-09-11 Thread Sai Prakash Ranjan

Some hardware variants contain a system cache or the last level
cache(llc). This cache is typically a large block which is shared
by multiple clients on the SOC. GPU uses the system cache to cache
both the GPU data buffers(like textures) as well the SMMU pagetables.
This helps with improved render performance as well as lower power
consumption by reducing the bus traffic to the system memory.

The system cache architecture allows the cache to be split into slices
which then be used by multiple SOC clients. This patch series is an
effort to enable and use two of those slices perallocated for the GPU,
one for the GPU data buffers and another for the GPU SMMU hardware
pagetables.

Patch 1 - Patch 4 adds system cache support in SMMU and GPU driver.
Patch 5 and 6 are minor cleanups for arm-smmu impl.

The series is based on top of 
https://gitlab.freedesktop.org/drm/msm/-/tree/msm-next-pgtables

Changes in v4:
 * Drop IOMMU_SYS_CACHE prot flag
 * Rebase on top of 
https://gitlab.freedesktop.org/drm/msm/-/tree/msm-next-pgtables

Changes in v3:
 * Fix domain attribute setting to before iommu_attach_device()
 * Fix few code style and checkpatch warnings
 * Rebase on top of Jordan's latest split pagetables and per-instance
   pagetables support

Changes in v2:
 * Addressed review comments and rebased on top of Jordan's split
   pagetables series

Sai Prakash Ranjan (4):
  iommu/io-pgtable-arm: Add support to use system cache
  iommu/arm-smmu: Add domain attribute for system cache
  iommu: arm-smmu-impl: Use table to list QCOM implementations
  iommu: arm-smmu-impl: Remove unwanted extra blank lines

Sharat Masetty (2):
  drm/msm: rearrange the gpu_rmw() function
  drm/msm/a6xx: Add support for using system cache(LLC)

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 83 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h  |  4 ++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c| 21 +-
 drivers/gpu/drm/msm/msm_drv.c  |  8 +++
 drivers/gpu/drm/msm/msm_drv.h  |  1 +
 drivers/gpu/drm/msm/msm_gpu.h  |  5 +-
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 17 ++---
 drivers/iommu/arm/arm-smmu/arm-smmu.c  | 17 +
 drivers/iommu/arm/arm-smmu/arm-smmu.h  |  1 +
 drivers/iommu/io-pgtable-arm.c |  7 +-
 include/linux/io-pgtable.h |  4 ++
 include/linux/iommu.h  |  1 +
 12 files changed, 155 insertions(+), 14 deletions(-)


base-commit: 11e579ab6a3c2003efa2cfd1f0b3b4395f041618
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCHv4 1/6] iommu/io-pgtable-arm: Add support to use system cache

2020-09-11 Thread Sai Prakash Ranjan

Add a quirk IO_PGTABLE_QUIRK_SYS_CACHE to override the
attributes set in TCR for the page table walker when
using system cache.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/iommu/io-pgtable-arm.c | 7 ++-
 include/linux/io-pgtable.h | 4 
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index dc7bcf858b6d..828426c16fa9 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -789,7 +789,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
void *cookie)
 
if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
IO_PGTABLE_QUIRK_NON_STRICT |
-   IO_PGTABLE_QUIRK_ARM_TTBR1))
+   IO_PGTABLE_QUIRK_ARM_TTBR1 |
+   IO_PGTABLE_QUIRK_SYS_CACHE))
return NULL;
 
data = arm_lpae_alloc_pgtable(cfg);
@@ -801,6 +802,10 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
void *cookie)
tcr->sh = ARM_LPAE_TCR_SH_IS;
tcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
+   } else if (cfg->quirks & IO_PGTABLE_QUIRK_SYS_CACHE) {
+   tcr->sh = ARM_LPAE_TCR_SH_OS;
+   tcr->irgn = ARM_LPAE_TCR_RGN_NC;
+   tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
} else {
tcr->sh = ARM_LPAE_TCR_SH_OS;
tcr->irgn = ARM_LPAE_TCR_RGN_NC;
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 23285ba645db..ecc9d2248b84 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -86,6 +86,9 @@ struct io_pgtable_cfg {
 *
 * IO_PGTABLE_QUIRK_ARM_TTBR1: (ARM LPAE format) Configure the table
 *  for use in the upper half of a split address space.
+*
+* IO_PGTABLE_QUIRK_SYS_CACHE: Override the attributes set in TCR for
+*  the page table walker when using system cache.
 */
#define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
#define IO_PGTABLE_QUIRK_NO_PERMS   BIT(1)
@@ -93,6 +96,7 @@ struct io_pgtable_cfg {
#define IO_PGTABLE_QUIRK_ARM_MTK_EXTBIT(3)
#define IO_PGTABLE_QUIRK_NON_STRICT BIT(4)
#define IO_PGTABLE_QUIRK_ARM_TTBR1  BIT(5)
+   #define IO_PGTABLE_QUIRK_SYS_CACHE  BIT(6)
unsigned long   quirks;
unsigned long   pgsize_bitmap;
unsigned intias;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCHv4 2/6] iommu/arm-smmu: Add domain attribute for system cache

2020-09-11 Thread Sai Prakash Ranjan

Add iommu domain attribute for using system cache aka last level
cache by client drivers like GPU to set right attributes for caching
the hardware pagetables into the system cache.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 +
 drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
 include/linux/iommu.h |  1 +
 3 files changed, 19 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 1f06ab219819..d449c895ba16 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -789,6 +789,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain 
*domain,
if (smmu_domain->non_strict)
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
 
+   if (smmu_domain->sys_cache)
+   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
+
pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
if (!pgtbl_ops) {
ret = -ENOMEM;
@@ -1513,6 +1516,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
*domain,
case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
*(int *)data = smmu_domain->non_strict;
return 0;
+   case DOMAIN_ATTR_SYS_CACHE:
+   *((int *)data) = smmu_domain->sys_cache;
+   return 0;
default:
return -ENODEV;
}
@@ -1544,6 +1550,17 @@ static int arm_smmu_domain_set_attr(struct iommu_domain 
*domain,
else
smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
break;
+   case DOMAIN_ATTR_SYS_CACHE:
+   if (smmu_domain->smmu) {
+   ret = -EPERM;
+   goto out_unlock;
+   }
+
+   if (*((int *)data))
+   smmu_domain->sys_cache = true;
+   else
+   smmu_domain->sys_cache = false;
+   break;
default:
ret = -ENODEV;
}
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index ddf2ca4c923d..93593e164e44 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -373,6 +373,7 @@ struct arm_smmu_domain {
struct mutexinit_mutex; /* Protects smmu pointer */
spinlock_t  cb_lock; /* Serialises ATS1* ops and 
TLB syncs */
struct iommu_domain domain;
+   boolsys_cache;
 };
 
 struct arm_smmu_master_cfg {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fee209efb756..a580dfe9c68d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -118,6 +118,7 @@ enum iommu_attr {
DOMAIN_ATTR_FSL_PAMUV1,
DOMAIN_ATTR_NESTING,/* two stages of translation */
DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
+   DOMAIN_ATTR_SYS_CACHE,
DOMAIN_ATTR_MAX,
 };
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCHv4 3/6] drm/msm: rearrange the gpu_rmw() function

2020-09-11 Thread Sai Prakash Ranjan

From: Sharat Masetty 

The register read-modify-write construct is generic enough
that it can be used by other subsystems as needed, create
a more generic rmw() function and have the gpu_rmw() use
this new function.

Signed-off-by: Sharat Masetty 
Reviewed-by: Jordan Crouse 
Signed-off-by: Sai Prakash Ranjan 
---
 drivers/gpu/drm/msm/msm_drv.c | 8 
 drivers/gpu/drm/msm/msm_drv.h | 1 +
 drivers/gpu/drm/msm/msm_gpu.h | 5 +
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index abf5799d9a22..03caafa7c7b2 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -180,6 +180,14 @@ u32 msm_readl(const void __iomem *addr)
return val;
 }
 
+void msm_rmw(void __iomem *addr, u32 mask, u32 or)
+{
+   u32 val = msm_readl(addr);
+
+   val &= ~mask;
+   msm_writel(val | or, addr);
+}
+
 struct msm_vblank_work {
struct work_struct work;
int crtc_id;
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 2ca9c3c03845..aa07900d9ac4 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -419,6 +419,7 @@ void __iomem *msm_ioremap_quiet(struct platform_device 
*pdev, const char *name,
const char *dbgname);
 void msm_writel(u32 data, void __iomem *addr);
 u32 msm_readl(const void __iomem *addr);
+void msm_rmw(void __iomem *addr, u32 mask, u32 or);
 
 struct msm_gpu_submitqueue;
 int msm_submitqueue_init(struct drm_device *drm, struct msm_file_private *ctx);
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 5ee358b480e6..1d446b2e5746 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -245,10 +245,7 @@ static inline u32 gpu_read(struct msm_gpu *gpu, u32 reg)
 
 static inline void gpu_rmw(struct msm_gpu *gpu, u32 reg, u32 mask, u32 or)
 {
-   uint32_t val = gpu_read(gpu, reg);
-
-   val &= ~mask;
-   gpu_write(gpu, reg, val | or);
+   msm_rmw(gpu->mmio + (reg << 2), mask, or);
 }
 
 static inline u64 gpu_read64(struct msm_gpu *gpu, u32 lo, u32 hi)
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCHv4 5/6] iommu: arm-smmu-impl: Use table to list QCOM implementations

2020-09-11 Thread Sai Prakash Ranjan

Use table and of_match_node() to match qcom implementation
instead of multiple of_device_compatible() calls for each
QCOM SMMU implementation.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index d199b4bff15d..ce78295cfa78 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -9,6 +9,13 @@
 
 #include "arm-smmu.h"
 
+static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
+   { .compatible = "qcom,sc7180-smmu-500" },
+   { .compatible = "qcom,sdm845-smmu-500" },
+   { .compatible = "qcom,sm8150-smmu-500" },
+   { .compatible = "qcom,sm8250-smmu-500" },
+   { }
+};
 
 static int arm_smmu_gr0_ns(int offset)
 {
@@ -217,10 +224,7 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
arm_smmu_device *smmu)
if (of_device_is_compatible(np, "nvidia,tegra194-smmu"))
return nvidia_smmu_impl_init(smmu);
 
-   if (of_device_is_compatible(np, "qcom,sdm845-smmu-500") ||
-   of_device_is_compatible(np, "qcom,sc7180-smmu-500") ||
-   of_device_is_compatible(np, "qcom,sm8150-smmu-500") ||
-   of_device_is_compatible(np, "qcom,sm8250-smmu-500"))
+   if (of_match_node(qcom_smmu_impl_of_match, np))
return qcom_smmu_impl_init(smmu);
 
if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines

2020-09-11 Thread Sai Prakash Ranjan

There are few places in arm-smmu-impl where there are
extra blank lines, remove them and while at it fix the
checkpatch warning for space required before the open
parenthesis.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index ce78295cfa78..f5b5218cbe5b 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -19,7 +19,7 @@ static const struct of_device_id __maybe_unused 
qcom_smmu_impl_of_match[] = {
 
 static int arm_smmu_gr0_ns(int offset)
 {
-   switch(offset) {
+   switch (offset) {
case ARM_SMMU_GR0_sCR0:
case ARM_SMMU_GR0_sACR:
case ARM_SMMU_GR0_sGFSR:
@@ -54,7 +54,6 @@ static const struct arm_smmu_impl calxeda_impl = {
.write_reg = arm_smmu_write_ns,
 };
 
-
 struct cavium_smmu {
struct arm_smmu_device smmu;
u32 id_base;
@@ -110,7 +109,6 @@ static struct arm_smmu_device *cavium_smmu_impl_init(struct 
arm_smmu_device *smm
return &cs->smmu;
 }
 
-
 #define ARM_MMU500_ACTLR_CPRE  (1 << 1)
 
 #define ARM_MMU500_ACR_CACHE_LOCK  (1 << 26)
@@ -197,7 +195,6 @@ static const struct arm_smmu_impl mrvl_mmu500_impl = {
.reset = arm_mmu500_reset,
 };
 
-
 struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
 {
const struct device_node *np = smmu->dev->of_node;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCHv4 4/6] drm/msm/a6xx: Add support for using system cache(LLC)

2020-09-11 Thread Sai Prakash Ranjan

From: Sharat Masetty 

The last level system cache can be partitioned to 32 different
slices of which GPU has two slices preallocated. One slice is
used for caching GPU buffers and the other slice is used for
caching the GPU SMMU pagetables. This talks to the core system
cache driver to acquire the slice handles, configure the SCID's
to those slices and activates and deactivates the slices upon
GPU power collapse and restore.

Some support from the IOMMU driver is also needed to make use
of the system cache to set the right TCR attributes. GPU then
has the ability to override a few cacheability parameters which
it does to override write-allocate to write-no-allocate as the
GPU hardware does not benefit much from it.

DOMAIN_ATTR_SYS_CACHE is another domain level attribute used by the
IOMMU driver to set the right attributes to cache the hardware
pagetables into the system cache.

Signed-off-by: Sharat Masetty 
(saiprakash.ranjan: fix to set attr before device attach to iommu and rebase)
Signed-off-by: Sai Prakash Ranjan 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 83 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |  4 ++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 21 ++-
 3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index a3a8d6fd06bb..6241b93ed375 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -9,6 +9,8 @@
 #include "a6xx_gmu.xml.h"
 
 #include 
+#include 
+#include 
 
 #define GPU_PAS_ID 13
 
@@ -969,6 +971,79 @@ static const u32 
a6xx_register_offsets[REG_ADRENO_REGISTER_MAX] = {
REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_CNTL, REG_A6XX_CP_RB_CNTL),
 };
 
+static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask, u32 or)
+{
+   return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or);
+}
+
+static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 value)
+{
+   return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2));
+}
+
+static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu)
+{
+   llcc_slice_deactivate(a6xx_gpu->llc_slice);
+   llcc_slice_deactivate(a6xx_gpu->htw_llc_slice);
+}
+
+static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu)
+{
+   u32 cntl1_regval = 0;
+
+   if (IS_ERR(a6xx_gpu->llc_mmio))
+   return;
+
+   if (!llcc_slice_activate(a6xx_gpu->llc_slice)) {
+   u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice);
+
+   gpu_scid &= 0x1f;
+   cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << 
10) |
+  (gpu_scid << 15) | (gpu_scid << 20);
+   }
+
+   if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) {
+   u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice);
+
+   gpuhtw_scid &= 0x1f;
+   cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid);
+   }
+
+   if (cntl1_regval) {
+   /*
+* Program the slice IDs for the various GPU blocks and GPU MMU
+* pagetables
+*/
+   a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1, 
cntl1_regval);
+
+   /*
+* Program cacheability overrides to not allocate cache lines on
+* a write miss
+*/
+   a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 
0xF, 0x03);
+   }
+}
+
+static void a6xx_llc_slices_destroy(struct a6xx_gpu *a6xx_gpu)
+{
+   llcc_slice_putd(a6xx_gpu->llc_slice);
+   llcc_slice_putd(a6xx_gpu->htw_llc_slice);
+}
+
+static void a6xx_llc_slices_init(struct platform_device *pdev,
+   struct a6xx_gpu *a6xx_gpu)
+{
+   a6xx_gpu->llc_mmio = msm_ioremap(pdev, "cx_mem", "gpu_cx");
+   if (IS_ERR(a6xx_gpu->llc_mmio))
+   return;
+
+   a6xx_gpu->llc_slice = llcc_slice_getd(LLCC_GPU);
+   a6xx_gpu->htw_llc_slice = llcc_slice_getd(LLCC_GPUHTW);
+
+   if (IS_ERR(a6xx_gpu->llc_slice) && IS_ERR(a6xx_gpu->htw_llc_slice))
+   a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
+}
+
 static int a6xx_pm_resume(struct msm_gpu *gpu)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -985,6 +1060,8 @@ static int a6xx_pm_resume(struct msm_gpu *gpu)
 
msm_gpu_resume_devfreq(gpu);
 
+   a6xx_llc_activate(a6xx_gpu);
+
return 0;
 }
 
@@ -995,6 +1072,8 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
 
trace_msm_gpu_suspend(0);
 
+   a6xx_llc_deactivate(a6xx_gpu);
+
devfreq_suspend_device(gpu->devfreq.devfreq);
 
return a6xx_gmu_stop(a6xx_gpu);
@@ -1033,6 +1112,8 @@ static void a6xx_destroy(struct msm_gpu *gpu)
drm_gem_object_put(a6xx_gpu->sqe_bo);
}
 
+   a6xx_llc_slices_destroy(a6xx_gpu);
+
a6xx_gmu_remove(a6xx_gpu);
 
adreno_gpu_cleanup(adreno_gpu);
@@ -1132,6 +1213,8 @

Re: [PATCH v2] iommu/dma: Fix IOVA reserve dma ranges

2020-09-11 Thread Bjorn Helgaas

On Fri, Sep 11, 2020 at 03:55:34PM +0530, Srinath Mannam wrote:
> Fix IOVA reserve failure in the case when address of first memory region
> listed in dma-ranges is equal to 0x0.
> 
> Fixes: aadad097cd46f ("iommu/dma: Reserve IOVA for PCIe inaccessible DMA 
> address")
> Signed-off-by: Srinath Mannam 
> ---
> Changes from v1:
>Removed unnecessary changes based on Robin's review comments.
> 
>  drivers/iommu/dma-iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 5141d49a046b..682068a9aae7 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -217,7 +217,7 @@ static int iova_reserve_pci_windows(struct pci_dev *dev,
>   lo = iova_pfn(iovad, start);
>   hi = iova_pfn(iovad, end);
>   reserve_iova(iovad, lo, hi);
> - } else {
> + } else if (end < start) {
>   /* dma_ranges list should be sorted */
>   dev_err(&dev->dev, "Failed to reserve IOVA\n");

You didn't actually change the error message, but the message would be
way more useful if it included the IOVA address range, e.g., the
format used in pci_register_host_bridge():

  bus address [%#010llx-%#010llx]

Incidentally, the pr_err() in copy_reserved_iova() looks bogus; it
prints iova->pfn_low twice, when it should probably print the base and
size or (my preference) something like the above:

pr_err("Reserve iova range %lx@%lx failed\n",
   iova->pfn_lo, iova->pfn_lo);

>   return -EINVAL;
> -- 
> 2.17.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines

2020-09-11 Thread Robin Murphy


On 2020-09-11 15:28, Sai Prakash Ranjan wrote:

There are few places in arm-smmu-impl where there are
extra blank lines, remove them


FWIW those were deliberate - sometimes I like a bit of subtle space to 
visually delineate distinct groups of definitions. I suppose it won't be 
to everyone's taste :/



and while at it fix the
checkpatch warning for space required before the open
parenthesis.


That one, however, was not ;)

BTW am I supposed to have received 3 copies of everything? Because I did...

Robin.


Signed-off-by: Sai Prakash Ranjan 
---
  drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 5 +
  1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index ce78295cfa78..f5b5218cbe5b 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -19,7 +19,7 @@ static const struct of_device_id __maybe_unused 
qcom_smmu_impl_of_match[] = {
  
  static int arm_smmu_gr0_ns(int offset)

  {
-   switch(offset) {
+   switch (offset) {
case ARM_SMMU_GR0_sCR0:
case ARM_SMMU_GR0_sACR:
case ARM_SMMU_GR0_sGFSR:
@@ -54,7 +54,6 @@ static const struct arm_smmu_impl calxeda_impl = {
.write_reg = arm_smmu_write_ns,
  };
  
-

  struct cavium_smmu {
struct arm_smmu_device smmu;
u32 id_base;
@@ -110,7 +109,6 @@ static struct arm_smmu_device *cavium_smmu_impl_init(struct 
arm_smmu_device *smm
return &cs->smmu;
  }
  
-

  #define ARM_MMU500_ACTLR_CPRE (1 << 1)
  
  #define ARM_MMU500_ACR_CACHE_LOCK	(1 << 26)

@@ -197,7 +195,6 @@ static const struct arm_smmu_impl mrvl_mmu500_impl = {
.reset = arm_mmu500_reset,
  };
  
-

  struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
  {
const struct device_node *np = smmu->dev->of_node;


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines

2020-09-11 Thread Will Deacon

On Fri, Sep 11, 2020 at 05:03:06PM +0100, Robin Murphy wrote:
> BTW am I supposed to have received 3 copies of everything? Because I did...

Yeah, this seems to be happening for all of Sai's emails :/

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 0/8] iommu/arm-smmu: Support maintaining bootloader mappings

2020-09-11 Thread Amit Pundir

On Fri, 4 Sep 2020 at 21:25, Bjorn Andersson  wrote:
>
> Based on previous attempts and discussions this is the latest attempt at
> inheriting stream mappings set up by the bootloader, for e.g. boot splash or
> efifb.
>
> Per Will's request this builds on the work by Jordan and Rob for the Adreno
> SMMU support. It applies cleanly ontop of v16 of their series, which can be
> found at
> https://lore.kernel.org/linux-arm-msm/20200901164707.2645413-1-robdcl...@gmail.com/
>

Boot tested the series on Xiaomi Poco F1 phone (sdm845)

Tested-by: Amit Pundir 

> Bjorn Andersson (8):
>   iommu/arm-smmu: Refactor context bank allocation
>   iommu/arm-smmu: Delay modifying domain during init
>   iommu/arm-smmu: Consult context bank allocator for identify domains
>   iommu/arm-smmu-qcom: Emulate bypass by using context banks
>   iommu/arm-smmu-qcom: Consistently initialize stream mappings
>   iommu/arm-smmu: Add impl hook for inherit boot mappings
>   iommu/arm-smmu: Provide helper for allocating identity domain
>   iommu/arm-smmu-qcom: Setup identity domain for boot mappings
>
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 111 ++-
>  drivers/iommu/arm/arm-smmu/arm-smmu.c  | 122 ++---
>  drivers/iommu/arm/arm-smmu/arm-smmu.h  |  14 ++-
>  3 files changed, 205 insertions(+), 42 deletions(-)
>
> --
> 2.28.0
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 3/3] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset

2020-09-11 Thread Robin Murphy

(apologies to Jim - I did look through one of the previous versions 
since I last commented and thought it looked OK, but never actually 
replied as such)


On 2020-09-10 06:40, Christoph Hellwig wrote:

From: Jim Quinlan 

The new field 'dma_range_map' in struct device is used to facilitate the
use of single or multiple offsets between mapping regions of cpu addrs and
dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
capable of holding a single uniform offset and had no region bounds
checking.

The function of_dma_get_range() has been modified so that it takes a single
argument -- the device node -- and returns a map, NULL, or an error code.
The map is an array that holds the information regarding the DMA regions.
Each range entry contains the address offset, the cpu_start address, the
dma_start address, and the size of the region.

of_dma_configure() is the typical manner to set range offsets but there are
a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
driver code.  These cases now invoke the function
dma_attach_offset_range(dev, cpu_addr, dma_addr, size).


This is now called dma_direct_set_offset(), right?


Signed-off-by: Jim Quinlan 
[hch: various interface cleanups]
Signed-off-by: Christoph Hellwig 
Tested-by: Nathan Chancellor 
---
  arch/arm/include/asm/dma-direct.h |  9 +--
  arch/arm/mach-keystone/keystone.c | 17 ++--
  arch/sh/drivers/pci/pcie-sh7786.c |  9 ++-
  arch/x86/pci/sta2x11-fixup.c  |  6 +-
  drivers/acpi/arm64/iort.c |  6 +-
  drivers/base/core.c   |  2 +
  drivers/gpu/drm/sun4i/sun4i_backend.c |  8 +-
  drivers/iommu/io-pgtable-arm.c|  2 +-
  .../platform/sunxi/sun4i-csi/sun4i_csi.c  |  9 ++-
  .../platform/sunxi/sun6i-csi/sun6i_csi.c  | 11 ++-
  drivers/of/address.c  | 73 -
  drivers/of/device.c   | 44 ++
  drivers/of/of_private.h   | 11 +--
  drivers/of/unittest.c | 34 +---
  drivers/remoteproc/remoteproc_core.c  |  4 +-
  .../staging/media/sunxi/cedrus/cedrus_hw.c| 10 ++-
  drivers/usb/core/message.c|  5 +-
  drivers/usb/core/usb.c|  3 +-
  include/linux/device.h|  4 +-
  include/linux/dma-direct.h| 52 ++--
  include/linux/dma-mapping.h   | 19 -
  kernel/dma/coherent.c |  7 +-
  kernel/dma/direct.c   | 81 ++-
  23 files changed, 303 insertions(+), 123 deletions(-)

diff --git a/arch/arm/include/asm/dma-direct.h 
b/arch/arm/include/asm/dma-direct.h
index de0f4ff9279615..a443d5257a21ed 100644
--- a/arch/arm/include/asm/dma-direct.h
+++ b/arch/arm/include/asm/dma-direct.h
@@ -16,8 +16,8 @@
  #ifndef __arch_pfn_to_dma
  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
  {
-   if (dev)
-   pfn -= dev->dma_pfn_offset;
+   if (dev && dev->dma_range_map)
+   pfn = PFN_DOWN(translate_phys_to_dma(dev, PFN_PHYS(pfn)));
return (dma_addr_t)__pfn_to_bus(pfn);
  }
  
@@ -25,9 +25,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)

  {
unsigned long pfn = __bus_to_pfn(addr);
  
-	if (dev)

-   pfn += dev->dma_pfn_offset;
-
+   if (dev && dev->dma_range_map)
+   pfn = PFN_DOWN(translate_dma_to_phys(dev, PFN_PHYS(pfn)));
return pfn;
  }
  
diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-keystone/keystone.c

index dcd031ba84c2e0..09a65c2dfd7327 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -8,6 +8,7 @@
   */
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -25,8 +26,6 @@
  #include "keystone.h"
  
  #ifdef CONFIG_ARM_LPAE

-static unsigned long keystone_dma_pfn_offset __read_mostly;
-
  static int keystone_platform_notifier(struct notifier_block *nb,
  unsigned long event, void *data)
  {
@@ -39,9 +38,12 @@ static int keystone_platform_notifier(struct notifier_block 
*nb,
return NOTIFY_BAD;
  
  	if (!dev->of_node) {

-   dev->dma_pfn_offset = keystone_dma_pfn_offset;
-   dev_err(dev, "set dma_pfn_offset%08lx\n",
-   dev->dma_pfn_offset);
+   int ret = dma_direct_set_offset(dev, KEYSTONE_HIGH_PHYS_START,
+   KEYSTONE_LOW_PHYS_START,
+   KEYSTONE_HIGH_PHYS_SIZE);
+   dev_err(dev, "set dma_offset%08llx%s\n",
+   KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START,
+   ret ? " failed" : "");


FWIW I've already been thinking of some optimisations which would have

Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-11 21:33, Robin Murphy wrote:

On 2020-09-11 15:28, Sai Prakash Ranjan wrote:

There are few places in arm-smmu-impl where there are
extra blank lines, remove them


FWIW those were deliberate - sometimes I like a bit of subtle space to
visually delineate distinct groups of definitions. I suppose it won't
be to everyone's taste :/



Ah ok, I thought it was not intentional, I can drop it.


and while at it fix the
checkpatch warning for space required before the open
parenthesis.


That one, however, was not ;)



I'll keep this one.

BTW am I supposed to have received 3 copies of everything? Because I 
did...




Ugh no, I just sent it once but something seems to have gone wrong.
Apologies again if you receive this message also multiple times.
I'll check further what's going wrong with my setup.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-11 21:37, Will Deacon wrote:

On Fri, Sep 11, 2020 at 05:03:06PM +0100, Robin Murphy wrote:
BTW am I supposed to have received 3 copies of everything? Because I 
did...


Yeah, this seems to be happening for all of Sai's emails :/



Sorry, I am not sure what went wrong as I only sent this once
and there are no recent changes to any of my configs, I'll
check it further.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines

2020-09-11 Thread Robin Murphy


On 2020-09-11 17:21, Sai Prakash Ranjan wrote:

On 2020-09-11 21:37, Will Deacon wrote:

On Fri, Sep 11, 2020 at 05:03:06PM +0100, Robin Murphy wrote:
BTW am I supposed to have received 3 copies of everything? Because I 
did...


Yeah, this seems to be happening for all of Sai's emails :/



Sorry, I am not sure what went wrong as I only sent this once
and there are no recent changes to any of my configs, I'll
check it further.


Actually on closer inspection it appears to be "correct" behaviour. I'm 
still subscribed to LAKML and the IOMMU list on this account, but 
normally Office 365 deduplicates so aggressively that I have rules set 
up to copy list mails that I'm cc'ed on back to my inbox, in case they 
arrive first and cause the direct copy to get eaten - apparently there's 
something unique about your email setup that manages to defeat the 
deduplicator and make it deliver all 3 copies intact... :/


Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines

2020-09-11 Thread Sai Prakash Ranjan


On 2020-09-11 22:04, Robin Murphy wrote:

On 2020-09-11 17:21, Sai Prakash Ranjan wrote:

On 2020-09-11 21:37, Will Deacon wrote:

On Fri, Sep 11, 2020 at 05:03:06PM +0100, Robin Murphy wrote:
BTW am I supposed to have received 3 copies of everything? Because I 
did...


Yeah, this seems to be happening for all of Sai's emails :/



Sorry, I am not sure what went wrong as I only sent this once
and there are no recent changes to any of my configs, I'll
check it further.


Actually on closer inspection it appears to be "correct" behaviour.
I'm still subscribed to LAKML and the IOMMU list on this account, but
normally Office 365 deduplicates so aggressively that I have rules set
up to copy list mails that I'm cc'ed on back to my inbox, in case they
arrive first and cause the direct copy to get eaten - apparently
there's something unique about your email setup that manages to defeat
the deduplicator and make it deliver all 3 copies intact... :/



No changes in my local setup atleast, but in the past we have
had cases with codeaurora mail acting weird or it could be my vpn,
will have to check.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 6/8] iommu/arm-smmu: Add impl hook for inherit boot mappings

2020-09-11 Thread Robin Murphy


On 2020-09-04 16:55, Bjorn Andersson wrote:

Add a new operation to allow platform implementations to inherit any
stream mappings from the boot loader.


Is there a reason we need an explicit step for this? The aim of the 
cfg_probe hook is that the SMMU software state should all be set up by 
then, and you can mess about with it however you like before 
arm_smmu_reset() actually commits anything to hardware. I would have 
thought you could permanently steal a context bank, configure it as your 
bypass hole, read out the previous SME configuration and tweak 
smmu->smrs and smmu->s2crs appropriately all together "invisibly" at 
that point. If that can't work, I'm very curious as to what I've overlooked.


Robin.


Signed-off-by: Bjorn Andersson 
---

Changes since v2:
- New patch/interface

  drivers/iommu/arm/arm-smmu/arm-smmu.c | 11 ++-
  drivers/iommu/arm/arm-smmu/arm-smmu.h |  6 ++
  2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index eb5c6ca5c138..4c4d302cd747 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -85,11 +85,6 @@ static inline void arm_smmu_rpm_put(struct arm_smmu_device 
*smmu)
pm_runtime_put_autosuspend(smmu->dev);
  }
  
-static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)

-{
-   return container_of(dom, struct arm_smmu_domain, domain);
-}
-
  static struct platform_driver arm_smmu_driver;
  static struct iommu_ops arm_smmu_ops;
  
@@ -2188,6 +2183,12 @@ static int arm_smmu_device_probe(struct platform_device *pdev)

if (err)
return err;
  
+	if (smmu->impl->inherit_mappings) {

+   err = smmu->impl->inherit_mappings(smmu);
+   if (err)
+   return err;
+   }
+
if (smmu->version == ARM_SMMU_V2) {
if (smmu->num_context_banks > smmu->num_context_irqs) {
dev_err(dev,
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 235d9a3a6ab6..f58164976e74 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -378,6 +378,11 @@ struct arm_smmu_domain {
struct iommu_domain domain;
  };
  
+static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)

+{
+   return container_of(dom, struct arm_smmu_domain, domain);
+}
+
  struct arm_smmu_master_cfg {
struct arm_smmu_device  *smmu;
s16 smendx[];
@@ -442,6 +447,7 @@ struct arm_smmu_impl {
int (*alloc_context_bank)(struct arm_smmu_domain *smmu_domain,
  struct arm_smmu_device *smmu,
  struct device *dev, int start);
+   int (*inherit_mappings)(struct arm_smmu_device *smmu);
  };
  
  #define INVALID_SMENDX			-1



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3 8/8] iommu/arm-smmu-qcom: Setup identity domain for boot mappings

2020-09-11 Thread Robin Murphy


On 2020-09-04 16:55, Bjorn Andersson wrote:

With many Qualcomm platforms not having functional S2CR BYPASS a
temporary IOMMU domain, without translation, needs to be allocated in
order to allow these memory transactions.

Unfortunately the boot loader uses the first few context banks, so
rather than overwriting a active bank the last context bank is used and
streams are diverted here during initialization.

This also performs the readback of SMR registers for the Qualcomm
platform, to trigger the mechanism.

This is based on prior work by Thierry Reding and Laurentiu Tudor.

Signed-off-by: Bjorn Andersson 
---

Changes since v2:
- Combined from pieces spread between the Qualcomm impl and generic code in v2.
- Moved to use the newly introduced inherit_mapping op.

  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 33 ++
  1 file changed, 33 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 70a1eaa52e14..a54302190932 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -12,6 +12,7 @@
  struct qcom_smmu {
struct arm_smmu_device smmu;
bool bypass_broken;
+   struct iommu_domain *identity;
  };
  
  static struct qcom_smmu *to_qcom_smmu(struct arm_smmu_device *smmu)

@@ -228,6 +229,37 @@ static int qcom_smmu_cfg_probe(struct arm_smmu_device 
*smmu)
return 0;
  }
  
+static int qcom_smmu_inherit_mappings(struct arm_smmu_device *smmu)

+{
+   struct qcom_smmu *qsmmu = to_qcom_smmu(smmu);
+   int cbndx;
+   u32 smr;
+   int i;
+
+   qsmmu->identity = arm_smmu_alloc_identity_domain(smmu);
+   if (IS_ERR(qsmmu->identity))
+   return PTR_ERR(qsmmu->identity);
+
+   cbndx = to_smmu_domain(qsmmu->identity)->cfg.cbndx;


I don't really get the point of going through the dance of allocating a 
whole iommu_domain() just to get a context. If you don't want to simply 
statically reserve a context at probe time, then just allocate from 
smmu->context_map here (where AFAICS "here" should be in cfg_probe 
anyway). This is entirely driver-internal, so there shouldn't be any 
need for IOMMU-API-level stuff to be involved.


Robin.


+
+   for (i = 0; i < smmu->num_mapping_groups; i++) {
+   smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
+
+   if (FIELD_GET(ARM_SMMU_SMR_VALID, smr)) {
+   smmu->smrs[i].id = FIELD_GET(ARM_SMMU_SMR_ID, smr);
+   smmu->smrs[i].mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr);
+   smmu->smrs[i].valid = true;
+
+   smmu->s2crs[i].type = S2CR_TYPE_TRANS;
+   smmu->s2crs[i].privcfg = S2CR_PRIVCFG_DEFAULT;
+   smmu->s2crs[i].cbndx = cbndx;
+   smmu->s2crs[i].count++;
+   }
+   }
+
+   return 0;
+}
+
  static int qcom_smmu_def_domain_type(struct device *dev)
  {
const struct of_device_id *match =
@@ -270,6 +302,7 @@ static const struct arm_smmu_impl qcom_smmu_impl = {
.cfg_probe = qcom_smmu_cfg_probe,
.def_domain_type = qcom_smmu_def_domain_type,
.reset = qcom_smmu500_reset,
+   .inherit_mappings = qcom_smmu_inherit_mappings,
  };
  
  static const struct arm_smmu_impl qcom_adreno_smmu_impl = {



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/3] ARM/keystone: move the DMA offset handling under ifdef CONFIG_ARM_LPAE

2020-09-11 Thread santosh . shilimkar


On 9/11/20 4:15 AM, Russell King - ARM Linux admin wrote:

On Thu, Sep 10, 2020 at 07:40:37AM +0200, Christoph Hellwig wrote:

The DMA offset notifier can only be used if PHYS_OFFSET is at least
KEYSTONE_HIGH_PHYS_START, which can't be represented by a 32-bit
phys_addr_t.  Currently the code compiles fine despite that, a pending
change to the DMA offset handling would create a compiler warning for
this case.  Add an ifdef to not compile the code except for LPAE
configs.


However, to have use of the high physical offset, LPAE needs to be
enabled, which ensures that phys_addr_t is 64-bit.

I believe that DMA is non-coherent on this platform unless the high
physical address is used. Or something like that.


Exactly. Higher address ranges needs to be used for DMA coherency.

Regards,
Santosh
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 01/16] iommu: Report domain nesting info

2020-09-11 Thread Alex Williamson

On Thu, 10 Sep 2020 03:45:18 -0700
Liu Yi L  wrote:

> IOMMUs that support nesting translation needs report the capability info
> to userspace. It gives information about requirements the userspace needs
> to implement plus other features characterizing the physical implementation.
> 
> This patch introduces a new IOMMU UAPI struct that gives information about
> the nesting capabilities and features. This struct is supposed to be returned
> by iommu_domain_get_attr() with DOMAIN_ATTR_NESTING attribute parameter, with
> one domain whose type has been set to DOMAIN_ATTR_NESTING.
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Jacob Pan 
> ---
> v6 -> v7:
> *) rephrase the commit message, replace the @data[] field in struct
>iommu_nesting_info with union per comments from Eric Auger.
> 
> v5 -> v6:
> *) rephrase the feature notes per comments from Eric Auger.
> *) rename @size of struct iommu_nesting_info to @argsz.
> 
> v4 -> v5:
> *) address comments from Eric Auger.
> 
> v3 -> v4:
> *) split the SMMU driver changes to be a separate patch
> *) move the @addr_width and @pasid_bits from vendor specific
>part to generic part.
> *) tweak the description for the @features field of struct
>iommu_nesting_info.
> *) add description on the @data[] field of struct iommu_nesting_info
> 
> v2 -> v3:
> *) remvoe cap/ecap_mask in iommu_nesting_info.
> *) reuse DOMAIN_ATTR_NESTING to get nesting info.
> *) return an empty iommu_nesting_info for SMMU drivers per Jean'
>suggestion.
> ---
>  include/uapi/linux/iommu.h | 76 
> ++
>  1 file changed, 76 insertions(+)
> 
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> index 1ebc23d..ff987e4 100644
> --- a/include/uapi/linux/iommu.h
> +++ b/include/uapi/linux/iommu.h
> @@ -341,4 +341,80 @@ struct iommu_gpasid_bind_data {
>   } vendor;
>  };
>  
> +/*
> + * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info.
> + *
> + * @flags:   VT-d specific flags. Currently reserved for future
> + *   extension. must be set to 0.
> + * @cap_reg: Describe basic capabilities as defined in VT-d capability
> + *   register.
> + * @ecap_reg:Describe the extended capabilities as defined in VT-d
> + *   extended capability register.
> + */
> +struct iommu_nesting_info_vtd {
> + __u32   flags;
> + __u64   cap_reg;
> + __u64   ecap_reg;
> +};


The vendor union has 8-byte alignment, so flags here will be 8-byte
aligned, followed by a compiler dependent gap before the 8-byte fields.
We should fill that gap with padding to make it deterministic for
userspace.  Thanks,

Alex

> +
> +/*
> + * struct iommu_nesting_info - Information for nesting-capable IOMMU.
> + *  userspace should check it before using
> + *  nesting capability.
> + *
> + * @argsz:   size of the whole structure.
> + * @flags:   currently reserved for future extension. must set to 0.
> + * @format:  PASID table entry format, the same definition as struct
> + *   iommu_gpasid_bind_data @format.
> + * @features:supported nesting features.
> + * @addr_width:  The output addr width of first level/stage translation
> + * @pasid_bits:  Maximum supported PASID bits, 0 represents no PASID
> + *   support.
> + * @vendor:  vendor specific data, structure type can be deduced from
> + *   @format field.
> + *
> + * +===+==+
> + * | feature   |  Notes   |
> + * +===+==+
> + * | SYSWIDE_PASID |  IOMMU vendor driver sets it to mandate userspace|
> + * |   |  to allocate PASID from kernel. All PASID allocation |
> + * |   |  free must be mediated through the IOMMU UAPI.   |
> + * +---+--+
> + * | BIND_PGTBL|  IOMMU vendor driver sets it to mandate userspace to |
> + * |   |  bind the first level/stage page table to associated |
> + * |   |  PASID (either the one specified in bind request or  |
> + * |   |  the default PASID of iommu domain), through IOMMU   |
> + * |   |  UAPI.   |
> + * +---+--+
> + * | CACHE_INVLD   |  IOMMU vendor driver sets it to mandate userspace to |
> + * |   |  explicitly invalidate the IOMMU cache through IOMMU |
> + * |   |  UAPI according to vendor-specific requirement when  |
> + * |   |  changing the 1st level/stage page table.|
> + * +---+

Re: [PATCH v7 03/16] vfio/type1: Report iommu nesting info to userspace

2020-09-11 Thread Alex Williamson

On Thu, 10 Sep 2020 03:45:20 -0700
Liu Yi L  wrote:

> This patch exports iommu nesting capability info to user space through
> VFIO. Userspace is expected to check this info for supported uAPIs (e.g.
> PASID alloc/free, bind page table, and cache invalidation) and the vendor
> specific format information for first level/stage page table that will be
> bound to.
> 
> The nesting info is available only after container set to be NESTED type.
> Current implementation imposes one limitation - one nesting container
> should include at most one iommu group. The philosophy of vfio container
> is having all groups/devices within the container share the same IOMMU
> context. When vSVA is enabled, one IOMMU context could include one 2nd-
> level address space and multiple 1st-level address spaces. While the
> 2nd-level address space is reasonably sharable by multiple groups, blindly
> sharing 1st-level address spaces across all groups within the container
> might instead break the guest expectation. In the future sub/super container
> concept might be introduced to allow partial address space sharing within
> an IOMMU context. But for now let's go with this restriction by requiring
> singleton container for using nesting iommu features. Below link has the
> related discussion about this decision.
> 
> https://lore.kernel.org/kvm/20200515115924.37e69...@w520.home/
> 
> This patch also changes the NESTING type container behaviour. Something
> that would have succeeded before will now fail: Before this series, if
> user asked for a VFIO_IOMMU_TYPE1_NESTING, it would have succeeded even
> if the SMMU didn't support stage-2, as the driver would have silently
> fallen back on stage-1 mappings (which work exactly the same as stage-2
> only since there was no nesting supported). After the series, we do check
> for DOMAIN_ATTR_NESTING so if user asks for VFIO_IOMMU_TYPE1_NESTING and
> the SMMU doesn't support stage-2, the ioctl fails. But it should be a good
> fix and completely harmless. Detail can be found in below link as well.
> 
> https://lore.kernel.org/kvm/20200717090900.GC4850@myrica/
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Liu Yi L 
> ---
> v6 -> v7:
> *) using vfio_info_add_capability() for adding nesting cap per suggestion
>from Eric.
> 
> v5 -> v6:
> *) address comments against v5 from Eric Auger.
> *) don't report nesting cap to userspace if the nesting_info->format is
>invalid.
> 
> v4 -> v5:
> *) address comments from Eric Auger.
> *) return struct iommu_nesting_info for VFIO_IOMMU_TYPE1_INFO_CAP_NESTING as
>cap is much "cheap", if needs extension in future, just define another cap.
>https://lore.kernel.org/kvm/20200708132947.5b7ee...@x1.home/
> 
> v3 -> v4:
> *) address comments against v3.
> 
> v1 -> v2:
> *) added in v2
> ---
>  drivers/vfio/vfio_iommu_type1.c | 92 
> +++--
>  include/uapi/linux/vfio.h   | 19 +
>  2 files changed, 99 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index c992973..3c0048b 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -62,18 +62,20 @@ MODULE_PARM_DESC(dma_entry_limit,
>"Maximum number of user DMA mappings per container (65535).");
>  
>  struct vfio_iommu {
> - struct list_headdomain_list;
> - struct list_headiova_list;
> - struct vfio_domain  *external_domain; /* domain for external user */
> - struct mutexlock;
> - struct rb_root  dma_list;
> - struct blocking_notifier_head notifier;
> - unsigned intdma_avail;
> - uint64_tpgsize_bitmap;
> - boolv2;
> - boolnesting;
> - booldirty_page_tracking;
> - boolpinned_page_dirty_scope;
> + struct list_headdomain_list;
> + struct list_headiova_list;
> + /* domain for external user */
> + struct vfio_domain  *external_domain;
> + struct mutexlock;
> + struct rb_root  dma_list;
> + struct blocking_notifier_head   notifier;
> + unsigned intdma_avail;
> + uint64_tpgsize_bitmap;
> + boolv2;
> + boolnesting;
> + booldirty_page_tracking;
> + boolpinned_page_dirty_scope;
> + struct iommu_nesting_info   *nesting_info;

Nit, not as important as the previous alignment, but might as well move
this up with the uint64_t pgsize_bitmap with the bools at the end of
the structure to avoid adding new gaps.

>  };
>  
>  struct vfio_domain {
> @@ -130,6 +132,9 @@

Re: [PATCH v7 04/16] vfio: Add PASID allocation/free support

2020-09-11 Thread Alex Williamson

On Thu, 10 Sep 2020 03:45:21 -0700
Liu Yi L  wrote:

> Shared Virtual Addressing (a.k.a Shared Virtual Memory) allows sharing
> multiple process virtual address spaces with the device for simplified
> programming model. PASID is used to tag an virtual address space in DMA
> requests and to identify the related translation structure in IOMMU. When
> a PASID-capable device is assigned to a VM, we want the same capability
> of using PASID to tag guest process virtual address spaces to achieve
> virtual SVA (vSVA).
> 
> PASID management for guest is vendor specific. Some vendors (e.g. Intel
> VT-d) requires system-wide managed PASIDs across all devices, regardless
> of whether a device is used by host or assigned to guest. Other vendors
> (e.g. ARM SMMU) may allow PASIDs managed per-device thus could be fully
> delegated to the guest for assigned devices.
> 
> For system-wide managed PASIDs, this patch introduces a vfio module to
> handle explicit PASID alloc/free requests from guest. Allocated PASIDs
> are associated to a process (or, mm_struct) in IOASID core. A vfio_mm
> object is introduced to track mm_struct. Multiple VFIO containers within
> a process share the same vfio_mm object.
> 
> A quota mechanism is provided to prevent malicious user from exhausting
> available PASIDs. Currently the quota is a global parameter applied to
> all VFIO devices. In the future per-device quota might be supported too.
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Suggested-by: Alex Williamson 
> Signed-off-by: Liu Yi L 
> Reviewed-by: Eric Auger 
> ---
> v6 -> v7:
> *) remove "#include " and add r-b from Eric Auger.
> 
> v5 -> v6:
> *) address comments from Eric. Add vfio_unlink_pasid() to be consistent
>with vfio_unlink_dma(). Add a comment in vfio_pasid_exit().
> 
> v4 -> v5:
> *) address comments from Eric Auger.
> *) address the comments from Alex on the pasid free range support. Added
>per vfio_mm pasid r-b tree.
>https://lore.kernel.org/kvm/20200709082751.32074...@x1.home/
> 
> v3 -> v4:
> *) fix lock leam in vfio_mm_get_from_task()
> *) drop pasid_quota field in struct vfio_mm
> *) vfio_mm_get_from_task() returns ERR_PTR(-ENOTTY) when !CONFIG_VFIO_PASID
> 
> v1 -> v2:
> *) added in v2, split from the pasid alloc/free support of v1
> ---
>  drivers/vfio/Kconfig  |   5 +
>  drivers/vfio/Makefile |   1 +
>  drivers/vfio/vfio_pasid.c | 247 
> ++
>  include/linux/vfio.h  |  28 ++
>  4 files changed, 281 insertions(+)
>  create mode 100644 drivers/vfio/vfio_pasid.c
> 
> diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> index fd17db9..3d8a108 100644
> --- a/drivers/vfio/Kconfig
> +++ b/drivers/vfio/Kconfig
> @@ -19,6 +19,11 @@ config VFIO_VIRQFD
>   depends on VFIO && EVENTFD
>   default n
>  
> +config VFIO_PASID
> + tristate
> + depends on IOASID && VFIO
> + default n
> +
>  menuconfig VFIO
>   tristate "VFIO Non-Privileged userspace driver framework"
>   depends on IOMMU_API
> diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
> index de67c47..bb836a3 100644
> --- a/drivers/vfio/Makefile
> +++ b/drivers/vfio/Makefile
> @@ -3,6 +3,7 @@ vfio_virqfd-y := virqfd.o
>  
>  obj-$(CONFIG_VFIO) += vfio.o
>  obj-$(CONFIG_VFIO_VIRQFD) += vfio_virqfd.o
> +obj-$(CONFIG_VFIO_PASID) += vfio_pasid.o
>  obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
>  obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
>  obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o
> diff --git a/drivers/vfio/vfio_pasid.c b/drivers/vfio/vfio_pasid.c
> new file mode 100644
> index 000..44ecdd5
> --- /dev/null
> +++ b/drivers/vfio/vfio_pasid.c
> @@ -0,0 +1,247 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2020 Intel Corporation.
> + * Author: Liu Yi L 
> + *
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DRIVER_VERSION  "0.1"
> +#define DRIVER_AUTHOR   "Liu Yi L "
> +#define DRIVER_DESC "PASID management for VFIO bus drivers"
> +
> +#define VFIO_DEFAULT_PASID_QUOTA 1000

I'm not sure we really need a macro to define this since it's only used
once, but a comment discussing the basis for this default value would be
useful.  Also, since Matthew Rosato is finding it necessary to expose
the available DMA mapping counter to userspace, is this also a
limitation that userspace might be interested in knowing such that we
should plumb it through an IOMMU info capability?

> +static int pasid_quota = VFIO_DEFAULT_PASID_QUOTA;
> +module_param_named(pasid_quota, pasid_quota, uint, 0444);
> +MODULE_PARM_DESC(pasid_quota,
> +  "Set the quota for max number of PASIDs that an application is 
> allowed to request (default 1000)");
> +
> +struct vfio_mm_token {
> + unsigned long long val;
> +};
> +
> +struct vfio_mm {
> + struct kref kref;
> + struct ioasi

Re: [PATCH v7 07/16] vfio/type1: Add VFIO_IOMMU_PASID_REQUEST (alloc/free)

2020-09-11 Thread Alex Williamson

On Thu, 10 Sep 2020 03:45:24 -0700
Liu Yi L  wrote:

> This patch allows userspace to request PASID allocation/free, e.g. when
> serving the request from the guest.
> 
> PASIDs that are not freed by userspace are automatically freed when the
> IOASID set is destroyed when process exits.
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Yi Sun 
> Signed-off-by: Jacob Pan 
> ---
> v6 -> v7:
> *) current VFIO returns allocated pasid via signed int, thus VFIO UAPI
>can only support 31 bits pasid. If user space gives min,max which is
>wider than 31 bits, should fail the allocation or free request.
> 
> v5 -> v6:
> *) address comments from Eric against v5. remove the alloc/free helper.
> 
> v4 -> v5:
> *) address comments from Eric Auger.
> *) the comments for the PASID_FREE request is addressed in patch 5/15 of
>this series.
> 
> v3 -> v4:
> *) address comments from v3, except the below comment against the range
>of PASID_FREE request. needs more help on it.
> "> +if (req.range.min > req.range.max)  
> 
>  Is it exploitable that a user can spin the kernel for a long time in
>  the case of a free by calling this with [0, MAX_UINT] regardless of
>  their actual allocations?"
> https://lore.kernel.org/linux-iommu/20200702151832.048b4...@x1.home/
> 
> v1 -> v2:
> *) move the vfio_mm related code to be a seprate module
> *) use a single structure for alloc/free, could support a range of PASIDs
> *) fetch vfio_mm at group_attach time instead of at iommu driver open time
> ---
>  drivers/vfio/Kconfig|  1 +
>  drivers/vfio/vfio_iommu_type1.c | 76 
> +
>  drivers/vfio/vfio_pasid.c   | 10 ++
>  include/linux/vfio.h|  6 
>  include/uapi/linux/vfio.h   | 43 +++
>  5 files changed, 136 insertions(+)
> 
> diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> index 3d8a108..95d90c6 100644
> --- a/drivers/vfio/Kconfig
> +++ b/drivers/vfio/Kconfig
> @@ -2,6 +2,7 @@
>  config VFIO_IOMMU_TYPE1
>   tristate
>   depends on VFIO
> + select VFIO_PASID if (X86)
>   default n
>  
>  config VFIO_IOMMU_SPAPR_TCE
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 3c0048b..bd4b668 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -76,6 +76,7 @@ struct vfio_iommu {
>   booldirty_page_tracking;
>   boolpinned_page_dirty_scope;
>   struct iommu_nesting_info   *nesting_info;
> + struct vfio_mm  *vmm;
>  };
>  
>  struct vfio_domain {
> @@ -2000,6 +2001,11 @@ static void vfio_iommu_iova_insert_copy(struct 
> vfio_iommu *iommu,
>  
>  static void vfio_iommu_release_nesting_info(struct vfio_iommu *iommu)
>  {
> + if (iommu->vmm) {
> + vfio_mm_put(iommu->vmm);
> + iommu->vmm = NULL;
> + }
> +
>   kfree(iommu->nesting_info);
>   iommu->nesting_info = NULL;
>  }
> @@ -2127,6 +2133,26 @@ static int vfio_iommu_type1_attach_group(void 
> *iommu_data,
>   iommu->nesting_info);
>   if (ret)
>   goto out_detach;
> +
> + if (iommu->nesting_info->features &
> + IOMMU_NESTING_FEAT_SYSWIDE_PASID) {
> + struct vfio_mm *vmm;
> + struct ioasid_set *set;
> +
> + vmm = vfio_mm_get_from_task(current);
> + if (IS_ERR(vmm)) {
> + ret = PTR_ERR(vmm);
> + goto out_detach;
> + }
> + iommu->vmm = vmm;
> +
> + set = vfio_mm_ioasid_set(vmm);
> + ret = iommu_domain_set_attr(domain->domain,
> + DOMAIN_ATTR_IOASID_SET,
> + set);
> + if (ret)
> + goto out_detach;
> + }
>   }
>  
>   /* Get aperture info */
> @@ -2908,6 +2934,54 @@ static int vfio_iommu_type1_dirty_pages(struct 
> vfio_iommu *iommu,
>   return -EINVAL;
>  }
>  
> +static int vfio_iommu_type1_pasid_request(struct vfio_iommu *iommu,
> +   unsigned long arg)
> +{
> + struct vfio_iommu_type1_pasid_request req;
> + unsigned long minsz;
> + int ret;
> +
> + minsz = offsetofend(struct vfio_iommu_type1_pasid_request, range);
> +
> + if (copy_from_user(&req, (void __user *)arg, minsz))
> + return -EFAULT;
> +
> + if (req.argsz < minsz || (req.flags & ~VFIO_PASID_REQUEST_MASK))
> + return -EINVAL;
> +
> + /*
> +  * Current VFIO_IOMMU_PASID

[PATCH v9 1/7] docs: IOMMU user API

2020-09-11 Thread Jacob Pan

IOMMU UAPI is newly introduced to support communications between guest
virtual IOMMU and host IOMMU. There has been lots of discussions on how
it should work with VFIO UAPI and userspace in general.

This document is intended to clarify the UAPI design and usage. The
mechanics of how future extensions should be achieved are also covered
in this documentation.

Reviewed-by: Eric Auger 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 Documentation/userspace-api/iommu.rst | 211 ++
 MAINTAINERS   |   1 +
 2 files changed, 212 insertions(+)
 create mode 100644 Documentation/userspace-api/iommu.rst

diff --git a/Documentation/userspace-api/iommu.rst 
b/Documentation/userspace-api/iommu.rst
new file mode 100644
index ..1e68e8f05bb3
--- /dev/null
+++ b/Documentation/userspace-api/iommu.rst
@@ -0,0 +1,211 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. iommu:
+
+=
+IOMMU Userspace API
+=
+
+IOMMU UAPI is used for virtualization cases where communications are
+needed between physical and virtual IOMMU drivers. For baremetal
+usage, the IOMMU is a system device which does not need to communicate
+with user space directly.
+
+The primary use cases are guest Shared Virtual Address (SVA) and
+guest IO virtual address (IOVA), wherin the vIOMMU implementation
+relies on the physical IOMMU and for this reason requires interactions
+with the host driver.
+
+.. contents:: :local:
+
+Functionalities
+===
+Communications of user and kernel involve both directions. The
+supported user-kernel APIs are as follows:
+
+1. Alloc/Free PASID
+2. Bind/Unbind guest PASID (e.g. Intel VT-d)
+3. Bind/Unbind guest PASID table (e.g. ARM SMMU)
+4. Invalidate IOMMU caches upon guest requests
+5. Report errors to the guest and serve page requests
+
+Requirements
+
+The IOMMU UAPIs are generic and extensible to meet the following
+requirements:
+
+1. Emulated and para-virtualised vIOMMUs
+2. Multiple vendors (Intel VT-d, ARM SMMU, etc.)
+3. Extensions to the UAPI shall not break existing user space
+
+Interfaces
+==
+Although the data structures defined in IOMMU UAPI are self-contained,
+there is no user API functions introduced. Instead, IOMMU UAPI is
+designed to work with existing user driver frameworks such as VFIO.
+
+Extension Rules & Precautions
+-
+When IOMMU UAPI gets extended, the data structures can *only* be
+modified in two ways:
+
+1. Adding new fields by re-purposing the padding[] field. No size change.
+2. Adding new union members at the end. May increase the structure sizes.
+
+No new fields can be added *after* the variable sized union in that it
+will break backward compatibility when offset moves. A new flag must
+be introduced whenever a change affects the structure using either
+method. The IOMMU driver processes the data based on flags which
+ensures backward compatibility.
+
+Version field is only reserved for the unlikely event of UAPI upgrade
+at its entirety.
+
+It's *always* the caller's responsibility to indicate the size of the
+structure passed by setting argsz appropriately.
+Though at the same time, argsz is user provided data which is not
+trusted. The argsz field allows the user app to indicate how much data
+it is providing, it's still the kernel's responsibility to validate
+whether it's correct and sufficient for the requested operation.
+
+Compatibility Checking
+--
+When IOMMU UAPI extension results in some structure size increase,
+IOMMU UAPI code shall handle the following cases:
+
+1. User and kernel has exact size match
+2. An older user with older kernel header (smaller UAPI size) running on a
+   newer kernel (larger UAPI size)
+3. A newer user with newer kernel header (larger UAPI size) running
+   on an older kernel.
+4. A malicious/misbehaving user pass illegal/invalid size but within
+   range. The data may contain garbage.
+
+Feature Checking
+
+While launching a guest with vIOMMU, it is strongly advised to check
+the compatibility upfront, as some subsequent errors happening during
+vIOMMU operation, such as cache invalidation failures cannot be nicely
+escaladated to the guest due to IOMMU specifications. This can lead to
+catastrophic failures for the users.
+
+User applications such as QEMU are expected to import kernel UAPI
+headers. Backward compatibility is supported per feature flags.
+For example, an older QEMU (with older kernel header) can run on newer
+kernel. Newer QEMU (with new kernel header) may refuse to initialize
+on an older kernel if new feature flags are not supported by older
+kernel. Simply recompiling existing code with newer kernel header should
+not be an issue in that only existing flags are used.
+
+IOMMU vendor driver should report the below features to IOMMU UAPI
+consumers (e.g. via VFIO).
+
+1. IOMMU_NESTING_FEAT_SYSWIDE_PASID

[PATCH v9 4/7] iommu/uapi: Use named union for user data

2020-09-11 Thread Jacob Pan

IOMMU UAPI data size is filled by the user space which must be validated
by the kernel. To ensure backward compatibility, user data can only be
extended by either re-purpose padding bytes or extend the variable sized
union at the end. No size change is allowed before the union. Therefore,
the minimum size is the offset of the union.

To use offsetof() on the union, we must make it named.

Link: https://lore.kernel.org/linux-iommu/20200611145518.0c281...@x1.home/
Signed-off-by: Jacob Pan 
Reviewed-by: Lu Baolu 
Reviewed-by: Eric Auger 
---
 drivers/iommu/intel/iommu.c | 22 +++---
 drivers/iommu/intel/svm.c   |  2 +-
 include/uapi/linux/iommu.h  |  4 ++--
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 87b17bac04c2..461f3a6864d4 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5434,8 +5434,8 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
 
/* Size is only valid in address selective invalidation */
if (inv_info->granularity == IOMMU_INV_GRANU_ADDR)
-   size = to_vtd_size(inv_info->addr_info.granule_size,
-  inv_info->addr_info.nb_granules);
+   size = to_vtd_size(inv_info->granu.addr_info.granule_size,
+  inv_info->granu.addr_info.nb_granules);
 
for_each_set_bit(cache_type,
 (unsigned long *)&inv_info->cache,
@@ -5456,20 +5456,20 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
 * granularity.
 */
if (inv_info->granularity == IOMMU_INV_GRANU_PASID &&
-   (inv_info->pasid_info.flags & IOMMU_INV_PASID_FLAGS_PASID))
-   pasid = inv_info->pasid_info.pasid;
+   (inv_info->granu.pasid_info.flags & 
IOMMU_INV_PASID_FLAGS_PASID))
+   pasid = inv_info->granu.pasid_info.pasid;
else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR &&
-(inv_info->addr_info.flags & 
IOMMU_INV_ADDR_FLAGS_PASID))
-   pasid = inv_info->addr_info.pasid;
+(inv_info->granu.addr_info.flags & 
IOMMU_INV_ADDR_FLAGS_PASID))
+   pasid = inv_info->granu.addr_info.pasid;
 
switch (BIT(cache_type)) {
case IOMMU_CACHE_INV_TYPE_IOTLB:
/* HW will ignore LSB bits based on address mask */
if (inv_info->granularity == IOMMU_INV_GRANU_ADDR &&
size &&
-   (inv_info->addr_info.addr & ((BIT(VTD_PAGE_SHIFT + 
size)) - 1))) {
+   (inv_info->granu.addr_info.addr & 
((BIT(VTD_PAGE_SHIFT + size)) - 1))) {
pr_err_ratelimited("User address not aligned, 
0x%llx, size order %llu\n",
-  inv_info->addr_info.addr, 
size);
+  
inv_info->granu.addr_info.addr, size);
}
 
/*
@@ -5477,9 +5477,9 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
 * We use npages = -1 to indicate that.
 */
qi_flush_piotlb(iommu, did, pasid,
-   mm_to_dma_pfn(inv_info->addr_info.addr),
+   
mm_to_dma_pfn(inv_info->granu.addr_info.addr),
(granu == QI_GRAN_NONG_PASID) ? -1 : 1 
<< size,
-   inv_info->addr_info.flags & 
IOMMU_INV_ADDR_FLAGS_LEAF);
+   inv_info->granu.addr_info.flags & 
IOMMU_INV_ADDR_FLAGS_LEAF);
 
if (!info->ats_enabled)
break;
@@ -5502,7 +5502,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
size = 64 - VTD_PAGE_SHIFT;
addr = 0;
} else if (inv_info->granularity == 
IOMMU_INV_GRANU_ADDR) {
-   addr = inv_info->addr_info.addr;
+   addr = inv_info->granu.addr_info.addr;
}
 
if (info->ats_enabled)
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 95c3164a2302..99353d6468fa 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -370,7 +370,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, 
struct device *dev,
spin_lock(&iommu->lock);
ret = intel_pasid_setup_nested(iommu, dev,
   (pgd_t *)(uintptr_t)data->gpgd,
-  data->hpasid,

[PATCH v9 5/7] iommu/uapi: Rename uapi functions

2020-09-11 Thread Jacob Pan

User APIs such as iommu_sva_unbind_gpasid() may also be used by the
kernel. Since we introduced user pointer to the UAPI functions,
in-kernel callers cannot share the same APIs. In-kernel callers are also
trusted, there is no need to validate the data.

We plan to have two flavors of the same API functions, one called
through ioctls, carrying a user pointer and one called directly with
valid IOMMU UAPI structs. To differentiate both, let's rename existing
functions with an iommu_uapi_ prefix.

Suggested-by: Alex Williamson 
Reviewed-by: Eric Auger 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/iommu.c | 18 +-
 include/linux/iommu.h | 31 ---
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 609bd25bf154..4ae02291ccc2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1961,35 +1961,35 @@ int iommu_attach_device(struct iommu_domain *domain, 
struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_attach_device);
 
-int iommu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
-  struct iommu_cache_invalidate_info *inv_info)
+int iommu_uapi_cache_invalidate(struct iommu_domain *domain, struct device 
*dev,
+   struct iommu_cache_invalidate_info *inv_info)
 {
if (unlikely(!domain->ops->cache_invalidate))
return -ENODEV;
 
return domain->ops->cache_invalidate(domain, dev, inv_info);
 }
-EXPORT_SYMBOL_GPL(iommu_cache_invalidate);
+EXPORT_SYMBOL_GPL(iommu_uapi_cache_invalidate);
 
-int iommu_sva_bind_gpasid(struct iommu_domain *domain,
-  struct device *dev, struct iommu_gpasid_bind_data 
*data)
+int iommu_uapi_sva_bind_gpasid(struct iommu_domain *domain,
+  struct device *dev, struct 
iommu_gpasid_bind_data *data)
 {
if (unlikely(!domain->ops->sva_bind_gpasid))
return -ENODEV;
 
return domain->ops->sva_bind_gpasid(domain, dev, data);
 }
-EXPORT_SYMBOL_GPL(iommu_sva_bind_gpasid);
+EXPORT_SYMBOL_GPL(iommu_uapi_sva_bind_gpasid);
 
-int iommu_sva_unbind_gpasid(struct iommu_domain *domain, struct device *dev,
-ioasid_t pasid)
+int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain, struct device 
*dev,
+ioasid_t pasid)
 {
if (unlikely(!domain->ops->sva_unbind_gpasid))
return -ENODEV;
 
return domain->ops->sva_unbind_gpasid(dev, pasid);
 }
-EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid);
+EXPORT_SYMBOL_GPL(iommu_uapi_sva_unbind_gpasid);
 
 static void __iommu_detach_device(struct iommu_domain *domain,
  struct device *dev)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fee209efb756..710d5d2691eb 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -424,13 +424,13 @@ extern int iommu_attach_device(struct iommu_domain 
*domain,
   struct device *dev);
 extern void iommu_detach_device(struct iommu_domain *domain,
struct device *dev);
-extern int iommu_cache_invalidate(struct iommu_domain *domain,
- struct device *dev,
- struct iommu_cache_invalidate_info *inv_info);
-extern int iommu_sva_bind_gpasid(struct iommu_domain *domain,
-   struct device *dev, struct iommu_gpasid_bind_data *data);
-extern int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
-   struct device *dev, ioasid_t pasid);
+extern int iommu_uapi_cache_invalidate(struct iommu_domain *domain,
+  struct device *dev,
+  struct iommu_cache_invalidate_info 
*inv_info);
+extern int iommu_uapi_sva_bind_gpasid(struct iommu_domain *domain,
+ struct device *dev, struct 
iommu_gpasid_bind_data *data);
+extern int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain,
+   struct device *dev, ioasid_t pasid);
 extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
 extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
 extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
@@ -1032,21 +1032,22 @@ static inline int iommu_sva_get_pasid(struct iommu_sva 
*handle)
return IOMMU_PASID_INVALID;
 }
 
-static inline int
-iommu_cache_invalidate(struct iommu_domain *domain,
-  struct device *dev,
-  struct iommu_cache_invalidate_info *inv_info)
+static inline int iommu_uapi_cache_invalidate(struct iommu_domain *domain,
+ struct device *dev,
+ struct 
iommu_cache_invalidate_info *inv_info)
 {
return -ENODEV;
 }
-static inline int iom

[PATCH v9 3/7] iommu/uapi: Introduce enum type for PASID data format

2020-09-11 Thread Jacob Pan

There can be multiple vendor-specific PASID data formats used in UAPI
structures. This patch adds enum type with a last entry which makes
range checking much easier.

Suggested-by: Alex Williamson 
Reviewed-by: Eric Auger 
Signed-off-by: Jacob Pan 
---
 include/uapi/linux/iommu.h | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index b42acc8fe007..7cc6ee6c41f7 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -298,11 +298,16 @@ struct iommu_gpasid_bind_data_vtd {
 IOMMU_SVA_VTD_GPASID_PCD |  \
 IOMMU_SVA_VTD_GPASID_PWT)
 
+enum iommu_pasid_data_format {
+   IOMMU_PASID_FORMAT_INTEL_VTD = 1,
+   IOMMU_PASID_FORMAT_LAST,
+};
+
 /**
  * struct iommu_gpasid_bind_data - Information about device and guest PASID 
binding
  * @argsz: User filled size of this data
  * @version:   Version of this data structure
- * @format:PASID table entry format
+ * @format:PASID table entry format of enum iommu_pasid_data_format type
  * @flags: Additional information on guest bind request
  * @gpgd:  Guest page directory base of the guest mm to bind
  * @hpasid:Process address space ID used for the guest mm in host IOMMU
@@ -321,7 +326,6 @@ struct iommu_gpasid_bind_data {
__u32 argsz;
 #define IOMMU_GPASID_BIND_VERSION_11
__u32 version;
-#define IOMMU_PASID_FORMAT_INTEL_VTD   1
__u32 format;
__u32 addr_width;
 #define IOMMU_SVA_GPASID_VAL   (1 << 0) /* guest PASID valid */
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v9 6/7] iommu/uapi: Handle data and argsz filled by users

2020-09-11 Thread Jacob Pan

IOMMU user APIs are responsible for processing user data. This patch
changes the interface such that user pointers can be passed into IOMMU
code directly. Separate kernel APIs without user pointers are introduced
for in-kernel users of the UAPI functionality.

IOMMU UAPI data has a user filled argsz field which indicates the data
length of the structure. User data is not trusted, argsz must be
validated based on the current kernel data size, mandatory data size,
and feature flags.

User data may also be extended, resulting in possible argsz increase.
Backward compatibility is ensured based on size and flags (or
the functional equivalent fields) checking.

This patch adds sanity checks in the IOMMU layer. In addition to argsz,
reserved/unused fields in padding, flags, and version are also checked.
Details are documented in Documentation/userspace-api/iommu.rst

Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/iommu.c | 199 --
 include/linux/iommu.h |  28 ---
 2 files changed, 211 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4ae02291ccc2..5c1b7ae48aae 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1961,34 +1961,219 @@ int iommu_attach_device(struct iommu_domain *domain, 
struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_attach_device);
 
+/*
+ * Check flags and other user provided data for valid combinations. We also
+ * make sure no reserved fields or unused flags are set. This is to ensure
+ * not breaking userspace in the future when these fields or flags are used.
+ */
+static int iommu_check_cache_invl_data(struct iommu_cache_invalidate_info 
*info)
+{
+   u32 mask;
+   int i;
+
+   if (info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
+   return -EINVAL;
+
+   mask = (1 << IOMMU_CACHE_INV_TYPE_NR) - 1;
+   if (info->cache & ~mask)
+   return -EINVAL;
+
+   if (info->granularity >= IOMMU_INV_GRANU_NR)
+   return -EINVAL;
+
+   switch (info->granularity) {
+   case IOMMU_INV_GRANU_ADDR:
+   if (info->cache & IOMMU_CACHE_INV_TYPE_PASID)
+   return -EINVAL;
+
+   mask = IOMMU_INV_ADDR_FLAGS_PASID |
+   IOMMU_INV_ADDR_FLAGS_ARCHID |
+   IOMMU_INV_ADDR_FLAGS_LEAF;
+
+   if (info->granu.addr_info.flags & ~mask)
+   return -EINVAL;
+   break;
+   case IOMMU_INV_GRANU_PASID:
+   mask = IOMMU_INV_PASID_FLAGS_PASID |
+   IOMMU_INV_PASID_FLAGS_ARCHID;
+   if (info->granu.pasid_info.flags & ~mask)
+   return -EINVAL;
+
+   break;
+   case IOMMU_INV_GRANU_DOMAIN:
+   if (info->cache & IOMMU_CACHE_INV_TYPE_DEV_IOTLB)
+   return -EINVAL;
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   /* Check reserved padding fields */
+   for (i = 0; i < sizeof(info->padding); i++) {
+   if (info->padding[i])
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 int iommu_uapi_cache_invalidate(struct iommu_domain *domain, struct device 
*dev,
-   struct iommu_cache_invalidate_info *inv_info)
+   void __user *uinfo)
 {
+   struct iommu_cache_invalidate_info inv_info = { 0 };
+   u32 minsz;
+   int ret = 0;
+
if (unlikely(!domain->ops->cache_invalidate))
return -ENODEV;
 
-   return domain->ops->cache_invalidate(domain, dev, inv_info);
+   /*
+* No new spaces can be added before the variable sized union, the
+* minimum size is the offset to the union.
+*/
+   minsz = offsetof(struct iommu_cache_invalidate_info, granu);
+
+   /* Copy minsz from user to get flags and argsz */
+   if (copy_from_user(&inv_info, uinfo, minsz))
+   return -EFAULT;
+
+   /* Fields before variable size union is mandatory */
+   if (inv_info.argsz < minsz)
+   return -EINVAL;
+
+   /* PASID and address granu require additional info beyond minsz */
+   if (inv_info.argsz == minsz &&
+   ((inv_info.granularity == IOMMU_INV_GRANU_PASID) ||
+   (inv_info.granularity == IOMMU_INV_GRANU_ADDR)))
+   return -EINVAL;
+
+   if (inv_info.granularity == IOMMU_INV_GRANU_PASID &&
+   inv_info.argsz < offsetofend(struct iommu_cache_invalidate_info, 
granu.pasid_info))
+   return -EINVAL;
+
+   if (inv_info.granularity == IOMMU_INV_GRANU_ADDR &&
+   inv_info.argsz < offsetofend(struct iommu_cache_invalidate_info, 
granu.addr_info))
+   return -EINVAL;
+
+   /*
+* User might be using a newer UAPI header which has a larger data
+* size, we shall support the existing flag

[PATCH v9 0/7] IOMMU user API enhancement

2020-09-11 Thread Jacob Pan

IOMMU user API header was introduced to support nested DMA translation and
related fault handling. The current UAPI data structures consist of three
areas that cover the interactions between host kernel and guest:
 - fault handling
 - cache invalidation
 - bind guest page tables, i.e. guest PASID

Future extensions are likely to support more architectures and vIOMMU features.

In the previous discussion, using user-filled data size and feature flags is
made a preferred approach over a unified version number.
https://lkml.org/lkml/2020/1/29/45

In addition to introduce argsz field to data structures, this patchset is also
trying to document the UAPI design, usage, and extension rules. VT-d driver
changes to utilize the new argsz field is included, VFIO usage is to follow.

This set is available at:
https://github.com/jacobpan/linux.git vsva_v5.9_uapi_v9

Thanks,

Jacob


Changelog:
v9
- Directly pass PASID value to iommu_sva_unbind_gpasid() without
  the superfluous data in struct iommu_gpasid_bind_data.
v8
- Rebased to v5.9-rc2
- Addressed review comments from Eric Auger
  1. added a check for the unused vendor flags
  2. commit message improvements
v7
- Added PASID data format enum for range checking
- Tidy up based on reviews from Alex W.
- Removed doc section for vIOMMU fault handling
v6
- Renamed all UAPI functions with iommu_uapi_ prefix
- Replaced argsz maxsz checking with flag specific size checks
- Documentation improvements based on suggestions by Eric Auger
  Replaced example code with a pointer to the actual code
- Added more checks for illegal flags combinations
- Added doc file to MAINTAINERS
v5
- Addjusted paddings in UAPI data to be 8 byte aligned
- Do not clobber argsz in IOMMU core before passing on to vendor driver
- Removed pr_warn_ for invalid UAPI data check, just return -EINVAL
- Clarified VFIO responsibility in UAPI data handling
- Use iommu_uapi prefix to differentiate APIs has in-kernel caller
- Added comment for unchecked flags of invalidation granularity
- Added example in doc to show vendor data checking

v4
- Added checks of UAPI data for reserved fields, version, and flags.
- Removed version check from vendor driver (vt-d)
- Relaxed argsz check to match the UAPI struct size instead of variable
  union size
- Updated documentation

v3:
- Rewrote backward compatibility rule to support existing code
  re-compiled with newer kernel UAPI header that runs on older
  kernel. Based on review comment from Alex W.
  https://lore.kernel.org/linux-iommu/20200611094741.6d118...@w520.home/
- Take user pointer directly in UAPI functions. Perform argsz check
  and copy_from_user() in IOMMU driver. Eliminate the need for
  VFIO or other upper layer to parse IOMMU data.
- Create wrapper function for in-kernel users of UAPI functions
v2:
- Removed unified API version and helper
- Introduced argsz for each UAPI data
- Introduced UAPI doc


Jacob Pan (7):
  docs: IOMMU user API
  iommu/uapi: Add argsz for user filled data
  iommu/uapi: Introduce enum type for PASID data format
  iommu/uapi: Use named union for user data
  iommu/uapi: Rename uapi functions
  iommu/uapi: Handle data and argsz filled by users
  iommu/vt-d: Check UAPI data processed by IOMMU core

 Documentation/userspace-api/iommu.rst | 211 ++
 MAINTAINERS   |   1 +
 drivers/iommu/intel/iommu.c   |  25 ++--
 drivers/iommu/intel/svm.c |  13 ++-
 drivers/iommu/iommu.c | 201 ++--
 include/linux/iommu.h |  35 --
 include/uapi/linux/iommu.h|  25 ++--
 7 files changed, 468 insertions(+), 43 deletions(-)
 create mode 100644 Documentation/userspace-api/iommu.rst

-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v9 7/7] iommu/vt-d: Check UAPI data processed by IOMMU core

2020-09-11 Thread Jacob Pan

IOMMU generic layer already does sanity checks on UAPI data for version
match and argsz range based on generic information.

This patch adjusts the following data checking responsibilities:
- removes the redundant version check from VT-d driver
- removes the check for vendor specific data size
- adds check for the use of reserved/undefined flags

Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/iommu.c |  3 +--
 drivers/iommu/intel/svm.c   | 11 +--
 include/uapi/linux/iommu.h  |  1 +
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 461f3a6864d4..18ed3b3c70d7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5408,8 +5408,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, 
struct device *dev,
int ret = 0;
u64 size = 0;
 
-   if (!inv_info || !dmar_domain ||
-   inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
+   if (!inv_info || !dmar_domain)
return -EINVAL;
 
if (!dev || !dev_is_pci(dev))
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 99353d6468fa..0cb9a15f1112 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -284,8 +284,15 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, 
struct device *dev,
if (WARN_ON(!iommu) || !data)
return -EINVAL;
 
-   if (data->version != IOMMU_GPASID_BIND_VERSION_1 ||
-   data->format != IOMMU_PASID_FORMAT_INTEL_VTD)
+   if (data->format != IOMMU_PASID_FORMAT_INTEL_VTD)
+   return -EINVAL;
+
+   /* IOMMU core ensures argsz is more than the start of the union */
+   if (data->argsz < offsetofend(struct iommu_gpasid_bind_data, 
vendor.vtd))
+   return -EINVAL;
+
+   /* Make sure no undefined flags are used in vendor data */
+   if (data->vendor.vtd.flags & ~(IOMMU_SVA_VTD_GPASID_LAST - 1))
return -EINVAL;
 
if (!dev_is_pci(dev))
diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index c64bca5af419..1ebc23df4fbc 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -288,6 +288,7 @@ struct iommu_gpasid_bind_data_vtd {
 #define IOMMU_SVA_VTD_GPASID_PWT   (1 << 3) /* page-level write through */
 #define IOMMU_SVA_VTD_GPASID_EMTE  (1 << 4) /* extended mem type enable */
 #define IOMMU_SVA_VTD_GPASID_CD(1 << 5) /* PASID-level cache 
disable */
+#define IOMMU_SVA_VTD_GPASID_LAST  (1 << 6)
__u64 flags;
__u32 pat;
__u32 emt;
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v9 2/7] iommu/uapi: Add argsz for user filled data

2020-09-11 Thread Jacob Pan

As IOMMU UAPI gets extended, user data size may increase. To support
backward compatibiliy, this patch introduces a size field to each UAPI
data structures. It is *always* the responsibility for the user to fill in
the correct size. Padding fields are adjusted to ensure 8 byte alignment.

Specific scenarios for user data handling are documented in:
Documentation/userspace-api/iommu.rst

As there is no current users of the API, struct version is not
incremented.

Reviewed-by: Eric Auger 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 include/uapi/linux/iommu.h | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index c2b2caf9ed41..b42acc8fe007 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -139,6 +139,7 @@ enum iommu_page_response_code {
 
 /**
  * struct iommu_page_response - Generic page response information
+ * @argsz: User filled size of this data
  * @version: API version of this structure
  * @flags: encodes whether the corresponding fields are valid
  * (IOMMU_FAULT_PAGE_RESPONSE_* values)
@@ -147,6 +148,7 @@ enum iommu_page_response_code {
  * @code: response code from &enum iommu_page_response_code
  */
 struct iommu_page_response {
+   __u32   argsz;
 #define IOMMU_PAGE_RESP_VERSION_1  1
__u32   version;
 #define IOMMU_PAGE_RESP_PASID_VALID(1 << 0)
@@ -222,6 +224,7 @@ struct iommu_inv_pasid_info {
 /**
  * struct iommu_cache_invalidate_info - First level/stage invalidation
  * information
+ * @argsz: User filled size of this data
  * @version: API version of this structure
  * @cache: bitfield that allows to select which caches to invalidate
  * @granularity: defines the lowest granularity used for the invalidation:
@@ -250,6 +253,7 @@ struct iommu_inv_pasid_info {
  * must support the used granularity.
  */
 struct iommu_cache_invalidate_info {
+   __u32   argsz;
 #define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
__u32   version;
 /* IOMMU paging structure cache */
@@ -259,7 +263,7 @@ struct iommu_cache_invalidate_info {
 #define IOMMU_CACHE_INV_TYPE_NR(3)
__u8cache;
__u8granularity;
-   __u8padding[2];
+   __u8padding[6];
union {
struct iommu_inv_pasid_info pasid_info;
struct iommu_inv_addr_info addr_info;
@@ -296,6 +300,7 @@ struct iommu_gpasid_bind_data_vtd {
 
 /**
  * struct iommu_gpasid_bind_data - Information about device and guest PASID 
binding
+ * @argsz: User filled size of this data
  * @version:   Version of this data structure
  * @format:PASID table entry format
  * @flags: Additional information on guest bind request
@@ -313,17 +318,18 @@ struct iommu_gpasid_bind_data_vtd {
  * PASID to host PASID based on this bind data.
  */
 struct iommu_gpasid_bind_data {
+   __u32 argsz;
 #define IOMMU_GPASID_BIND_VERSION_11
__u32 version;
 #define IOMMU_PASID_FORMAT_INTEL_VTD   1
__u32 format;
+   __u32 addr_width;
 #define IOMMU_SVA_GPASID_VAL   (1 << 0) /* guest PASID valid */
__u64 flags;
__u64 gpgd;
__u64 hpasid;
__u64 gpasid;
-   __u32 addr_width;
-   __u8  padding[12];
+   __u8  padding[8];
/* Vendor specific data */
union {
struct iommu_gpasid_bind_data_vtd vtd;
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 10/16] vfio/type1: Support binding guest page tables to PASID

2020-09-11 Thread Alex Williamson

On Thu, 10 Sep 2020 03:45:27 -0700
Liu Yi L  wrote:

> Nesting translation allows two-levels/stages page tables, with 1st level
> for guest translations (e.g. GVA->GPA), 2nd level for host translations
> (e.g. GPA->HPA). This patch adds interface for binding guest page tables
> to a PASID. This PASID must have been allocated by the userspace before
> the binding request.
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Jean-Philippe Brucker 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Jacob Pan 
> ---
> v6 -> v7:
> *) introduced @user in struct domain_capsule to simplify the code per Eric's
>suggestion.
> *) introduced VFIO_IOMMU_NESTING_OP_NUM for sanitizing op from userspace.
> *) corrected the @argsz value of unbind_data in vfio_group_unbind_gpasid_fn().
> 
> v5 -> v6:
> *) dropped vfio_find_nesting_group() and add 
> vfio_get_nesting_domain_capsule().
>per comment from Eric.
> *) use iommu_uapi_sva_bind/unbind_gpasid() and iommu_sva_unbind_gpasid() in
>linux/iommu.h for userspace operation and in-kernel operation.
> 
> v3 -> v4:
> *) address comments from Alex on v3
> 
> v2 -> v3:
> *) use __iommu_sva_unbind_gpasid() for unbind call issued by VFIO
>
> https://lore.kernel.org/linux-iommu/1592931837-58223-6-git-send-email-jacob.jun@linux.intel.com/
> 
> v1 -> v2:
> *) rename subject from "vfio/type1: Bind guest page tables to host"
> *) remove VFIO_IOMMU_BIND, introduce VFIO_IOMMU_NESTING_OP to support bind/
>unbind guet page table
> *) replaced vfio_iommu_for_each_dev() with a group level loop since this
>series enforces one group per container w/ nesting type as start.
> *) rename vfio_bind/unbind_gpasid_fn() to vfio_dev_bind/unbind_gpasid_fn()
> *) vfio_dev_unbind_gpasid() always successful
> *) use vfio_mm->pasid_lock to avoid race between PASID free and page table
>bind/unbind
> ---
>  drivers/vfio/vfio_iommu_type1.c | 163 
> 
>  drivers/vfio/vfio_pasid.c   |  26 +++
>  include/linux/vfio.h|  20 +
>  include/uapi/linux/vfio.h   |  36 +
>  4 files changed, 245 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index bd4b668..11f1156 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -149,6 +149,39 @@ struct vfio_regions {
>  #define DIRTY_BITMAP_PAGES_MAX((u64)INT_MAX)
>  #define DIRTY_BITMAP_SIZE_MAX 
> DIRTY_BITMAP_BYTES(DIRTY_BITMAP_PAGES_MAX)
>  
> +struct domain_capsule {
> + struct vfio_group   *group;
> + struct iommu_domain *domain;
> + /* set if @data contains a user pointer*/
> + booluser;
> + void*data;
> +};

Put the hole in the structure at the end, but I suspect we might lose
the user field when the internal api drops the unnecessary structure
for unbind anyway.

> +
> +/* iommu->lock must be held */
> +static int vfio_prepare_nesting_domain_capsule(struct vfio_iommu *iommu,
> +struct domain_capsule *dc)
> +{
> + struct vfio_domain *domain = NULL;
> + struct vfio_group *group = NULL;

Unnecessary initialization.

> +
> + if (!iommu->nesting_info)
> + return -EINVAL;
> +
> + /*
> +  * Only support singleton container with nesting type. If
> +  * nesting_info is non-NULL, the container is non-empty.
> +  * Also domain is non-empty.
> +  */
> + domain = list_first_entry(&iommu->domain_list,
> +   struct vfio_domain, next);
> + group = list_first_entry(&domain->group_list,
> +  struct vfio_group, next);
> + dc->group = group;
> + dc->domain = domain->domain;
> + dc->user = true;
> + return 0;
> +}
> +
>  static int put_pfn(unsigned long pfn, int prot);
>  
>  static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu 
> *iommu,
> @@ -2405,6 +2438,49 @@ static int vfio_iommu_resv_refresh(struct vfio_iommu 
> *iommu,
>   return ret;
>  }
>  
> +static int vfio_dev_bind_gpasid_fn(struct device *dev, void *data)
> +{
> + struct domain_capsule *dc = (struct domain_capsule *)data;
> + unsigned long arg = *(unsigned long *)dc->data;
> +
> + return iommu_uapi_sva_bind_gpasid(dc->domain, dev,
> +   (void __user *)arg);
> +}
> +
> +static int vfio_dev_unbind_gpasid_fn(struct device *dev, void *data)
> +{
> + struct domain_capsule *dc = (struct domain_capsule *)data;
> +
> + if (dc->user) {
> + unsigned long arg = *(unsigned long *)dc->data;
> +
> + iommu_uapi_sva_unbind_gpasid(dc->domain,
> +  dev, (void __user *)arg);
> + } else {
> + struct iommu_gpasid_bind_data *unbind_data =
> +

Re: [PATCH v7 13/16] vfio/pci: Expose PCIe PASID capability to guest

2020-09-11 Thread Alex Williamson

On Thu, 10 Sep 2020 03:45:30 -0700
Liu Yi L  wrote:

> This patch exposes PCIe PASID capability to guest for assigned devices.
> Existing vfio_pci driver hides it from guest by setting the capability
> length as 0 in pci_ext_cap_length[].

This exposes the PASID capability, but it's still read-only, so this
largely just helps userspace know where to emulate the capability,
right?  Thanks,

Alex
 
> And this patch only exposes PASID capability for devices which has PCIe
> PASID extended struture in its configuration space. VFs will not expose
> the PASID capability as they do not implement the PASID extended structure
> in their config space. It is a TODO in future. Related discussion can be
> found in below link:
> 
> https://lore.kernel.org/kvm/20200407095801.648b1...@w520.home/
> 
> Cc: Kevin Tian 
> CC: Jacob Pan 
> Cc: Alex Williamson 
> Cc: Eric Auger 
> Cc: Jean-Philippe Brucker 
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Signed-off-by: Liu Yi L 
> Reviewed-by: Eric Auger 
> ---
> v5 -> v6:
> *) add review-by from Eric Auger.
> 
> v1 -> v2:
> *) added in v2, but it was sent in a separate patchseries before
> ---
>  drivers/vfio/pci/vfio_pci_config.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_config.c 
> b/drivers/vfio/pci/vfio_pci_config.c
> index d98843f..07ff2e6 100644
> --- a/drivers/vfio/pci/vfio_pci_config.c
> +++ b/drivers/vfio/pci/vfio_pci_config.c
> @@ -95,7 +95,7 @@ static const u16 pci_ext_cap_length[PCI_EXT_CAP_ID_MAX + 1] 
> = {
>   [PCI_EXT_CAP_ID_LTR]=   PCI_EXT_CAP_LTR_SIZEOF,
>   [PCI_EXT_CAP_ID_SECPCI] =   0,  /* not yet */
>   [PCI_EXT_CAP_ID_PMUX]   =   0,  /* not yet */
> - [PCI_EXT_CAP_ID_PASID]  =   0,  /* not yet */
> + [PCI_EXT_CAP_ID_PASID]  =   PCI_EXT_CAP_PASID_SIZEOF,
>  };
>  
>  /*

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH V2 2/5] iommu: Add iommu_dma_free_cpu_cached_iovas function

2020-09-11 Thread Lu Baolu


On 2020/9/9 15:05, Christoph Hellwig wrote:

+static inline void iommu_dma_free_cpu_cached_iovas(unsigned int cpu,
+  struct iommu_domain
*domain)


This adds a crazy long line.  Which is rather pointless as other
bits of code in the patch use the more compact two tab indentations
for the prototype continuation lines anyway.



Okay. I will use two tabs instead.

Best regards,
baolu

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH V2 5/5] DO NOT MERGE: iommu: disable list appending in dma-iommu

2020-09-11 Thread Lu Baolu


On 2020/9/9 15:06, Christoph Hellwig wrote:

On Wed, Sep 09, 2020 at 09:43:09AM +0800, Lu Baolu wrote:

+   /*
+* The Intel graphic device driver is used to assume that the
returned
+* sg list is not combound. This blocks the efforts of converting
the


This adds pointless overly long lines.


+* Intel IOMMU driver to dma-iommu api's. Add this quirk to make the
+* device driver work and should be removed once it's fixed in i915
+* driver.
+*/
+   if (dev_is_pci(dev) &&
+   to_pci_dev(dev)->vendor == PCI_VENDOR_ID_INTEL &&
+   (to_pci_dev(dev)->class >> 16) == PCI_BASE_CLASS_DISPLAY) {
+   for_each_sg(sg, s, nents, i) {
+   unsigned int s_iova_off = sg_dma_address(s);
+   unsigned int s_length = sg_dma_len(s);
+   unsigned int s_iova_len = s->length;
+
+   s->offset += s_iova_off;
+   s->length = s_length;
+   sg_dma_address(s) = dma_addr + s_iova_off;
+   sg_dma_len(s) = s_length;
+   dma_addr += s_iova_len;
+   }
+
+   return nents;
+   }


This wants an IS_ENABLED() check.  And probably a pr_once reminding
of the workaround.



Will fix in the next version.

Best regards,
baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 5/6] iommu/vt-d: Convert intel iommu driver to the iommu ops

2020-09-11 Thread Lu Baolu

From: Tom Murphy 

Convert the intel iommu driver to the dma-iommu api. Remove the iova
handling and reserve region code from the intel iommu driver.

Signed-off-by: Tom Murphy 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel/Kconfig |   1 +
 drivers/iommu/intel/iommu.c | 742 ++--
 2 files changed, 43 insertions(+), 700 deletions(-)

diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig
index 5337ee1584b0..28a3d1596c76 100644
--- a/drivers/iommu/intel/Kconfig
+++ b/drivers/iommu/intel/Kconfig
@@ -13,6 +13,7 @@ config INTEL_IOMMU
select DMAR_TABLE
select SWIOTLB
select IOASID
+   select IOMMU_DMA
help
  DMA remapping (DMAR) devices support enables independent address
  translations for Direct Memory Access (DMA) from devices.
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 63ee30c689a7..adc231790e0a 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -41,7 +42,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -382,9 +382,6 @@ struct device_domain_info *get_domain_info(struct device 
*dev)
 DEFINE_SPINLOCK(device_domain_lock);
 static LIST_HEAD(device_domain_list);
 
-#define device_needs_bounce(d) (!intel_no_bounce && dev_is_pci(d) &&   \
-   to_pci_dev(d)->untrusted)
-
 /*
  * Iterate over elements in device_domain_list and call the specified
  * callback @fn against each element.
@@ -1242,13 +1239,6 @@ static void dma_free_pagelist(struct page *freelist)
}
 }
 
-static void iova_entry_free(unsigned long data)
-{
-   struct page *freelist = (struct page *)data;
-
-   dma_free_pagelist(freelist);
-}
-
 /* iommu handling */
 static int iommu_alloc_root_entry(struct intel_iommu *iommu)
 {
@@ -1613,19 +1603,17 @@ static inline void __mapping_notify_one(struct 
intel_iommu *iommu,
iommu_flush_write_buffer(iommu);
 }
 
-static void iommu_flush_iova(struct iova_domain *iovad)
+static void intel_flush_iotlb_all(struct iommu_domain *domain)
 {
-   struct dmar_domain *domain;
+   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
int idx;
 
-   domain = container_of(iovad, struct dmar_domain, iovad);
-
-   for_each_domain_iommu(idx, domain) {
+   for_each_domain_iommu(idx, dmar_domain) {
struct intel_iommu *iommu = g_iommus[idx];
-   u16 did = domain->iommu_did[iommu->seq_id];
+   u16 did = dmar_domain->iommu_did[iommu->seq_id];
 
-   if (domain_use_first_level(domain))
-   domain_flush_piotlb(iommu, domain, 0, -1, 0);
+   if (domain_use_first_level(dmar_domain))
+   domain_flush_piotlb(iommu, dmar_domain, 0, -1, 0);
else
iommu->flush.flush_iotlb(iommu, did, 0, 0,
 DMA_TLB_DSI_FLUSH);
@@ -1907,48 +1895,6 @@ static int domain_detach_iommu(struct dmar_domain 
*domain,
return count;
 }
 
-static struct iova_domain reserved_iova_list;
-static struct lock_class_key reserved_rbtree_key;
-
-static int dmar_init_reserved_ranges(void)
-{
-   struct pci_dev *pdev = NULL;
-   struct iova *iova;
-   int i;
-
-   init_iova_domain(&reserved_iova_list, VTD_PAGE_SIZE, IOVA_START_PFN);
-
-   lockdep_set_class(&reserved_iova_list.iova_rbtree_lock,
-   &reserved_rbtree_key);
-
-   /* IOAPIC ranges shouldn't be accessed by DMA */
-   iova = reserve_iova(&reserved_iova_list, IOVA_PFN(IOAPIC_RANGE_START),
-   IOVA_PFN(IOAPIC_RANGE_END));
-   if (!iova) {
-   pr_err("Reserve IOAPIC range failed\n");
-   return -ENODEV;
-   }
-
-   /* Reserve all PCI MMIO to avoid peer-to-peer access */
-   for_each_pci_dev(pdev) {
-   struct resource *r;
-
-   for (i = 0; i < PCI_NUM_RESOURCES; i++) {
-   r = &pdev->resource[i];
-   if (!r->flags || !(r->flags & IORESOURCE_MEM))
-   continue;
-   iova = reserve_iova(&reserved_iova_list,
-   IOVA_PFN(r->start),
-   IOVA_PFN(r->end));
-   if (!iova) {
-   pci_err(pdev, "Reserve iova for %pR failed\n", 
r);
-   return -ENODEV;
-   }
-   }
-   }
-   return 0;
-}
-
 static inline int guestwidth_to_adjustwidth(int gaw)
 {
int agaw;
@@ -1971,7 +1917,7 @@ static void domain_exit(struct dmar_domain *domain)
 
/* destroy iovas */
if (domain->domain.type == IOMMU_DOMAIN_DMA)
-   put_iova_domain(&domain->iovad);
+

[PATCH v3 2/6] iommu: Add iommu_dma_free_cpu_cached_iovas()

2020-09-11 Thread Lu Baolu

From: Tom Murphy 

Add a iommu_dma_free_cpu_cached_iovas function to allow drivers which
use the dma-iommu ops to free cached cpu iovas.

Signed-off-by: Tom Murphy 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/dma-iommu.c | 9 +
 include/linux/dma-iommu.h | 8 
 2 files changed, 17 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 82c071b2d5c8..d06411bd5e08 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -50,6 +50,15 @@ struct iommu_dma_cookie {
struct iommu_domain *fq_domain;
 };
 
+void iommu_dma_free_cpu_cached_iovas(unsigned int cpu,
+   struct iommu_domain *domain)
+{
+   struct iommu_dma_cookie *cookie = domain->iova_cookie;
+   struct iova_domain *iovad = &cookie->iovad;
+
+   free_cpu_cached_iovas(cpu, iovad);
+}
+
 static void iommu_dma_entry_dtor(unsigned long data)
 {
struct page *freelist = (struct page *)data;
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 2112f21f73d8..706b68d1359b 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -37,6 +37,9 @@ void iommu_dma_compose_msi_msg(struct msi_desc *desc,
 
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 
+void iommu_dma_free_cpu_cached_iovas(unsigned int cpu,
+   struct iommu_domain *domain);
+
 #else /* CONFIG_IOMMU_DMA */
 
 struct iommu_domain;
@@ -78,5 +81,10 @@ static inline void iommu_dma_get_resv_regions(struct device 
*dev, struct list_he
 {
 }
 
+static inline void iommu_dma_free_cpu_cached_iovas(unsigned int cpu,
+   struct iommu_domain *domain)
+{
+}
+
 #endif /* CONFIG_IOMMU_DMA */
 #endif /* __DMA_IOMMU_H */
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 6/6] iommu/vt-d: Cleanup after converting to dma-iommu ops

2020-09-11 Thread Lu Baolu

Some cleanups after converting the driver to use dma-iommu ops.
- Remove nobounce option;
- Cleanup and simplify the path in domain mapping.

Signed-off-by: Lu Baolu 
---
 .../admin-guide/kernel-parameters.txt |  5 --
 drivers/iommu/intel/iommu.c   | 90 ++-
 2 files changed, 28 insertions(+), 67 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index a1068742a6df..0d11ef43d314 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1866,11 +1866,6 @@
Note that using this option lowers the security
provided by tboot because it makes the system
vulnerable to DMA attacks.
-   nobounce [Default off]
-   Disable bounce buffer for untrusted devices such as
-   the Thunderbolt devices. This will treat the untrusted
-   devices as the trusted ones, hence might expose security
-   risks of DMA attacks.
 
intel_idle.max_cstate=  [KNL,HW,ACPI,X86]
0   disables intel_idle and fall back on acpi_idle.
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index adc231790e0a..fe2544c95013 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -355,7 +355,6 @@ static int dmar_forcedac;
 static int intel_iommu_strict;
 static int intel_iommu_superpage = 1;
 static int iommu_identity_mapping;
-static int intel_no_bounce;
 static int iommu_skip_te_disable;
 
 #define IDENTMAP_GFX   2
@@ -457,9 +456,6 @@ static int __init intel_iommu_setup(char *str)
} else if (!strncmp(str, "tboot_noforce", 13)) {
pr_info("Intel-IOMMU: not forcing on after tboot. This 
could expose security risk for tboot\n");
intel_iommu_tboot_noforce = 1;
-   } else if (!strncmp(str, "nobounce", 8)) {
-   pr_info("Intel-IOMMU: No bounce buffer. This could 
expose security risks of DMA attacks\n");
-   intel_no_bounce = 1;
}
 
str += strcspn(str, ",");
@@ -2230,15 +2226,14 @@ static inline int hardware_largepage_caps(struct 
dmar_domain *domain,
return level;
 }
 
-static int __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
-   struct scatterlist *sg, unsigned long phys_pfn,
-   unsigned long nr_pages, int prot)
+static int
+__domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
+unsigned long phys_pfn, unsigned long nr_pages, int prot)
 {
struct dma_pte *first_pte = NULL, *pte = NULL;
-   phys_addr_t pteval;
-   unsigned long sg_res = 0;
unsigned int largepage_lvl = 0;
unsigned long lvl_pages = 0;
+   phys_addr_t pteval;
u64 attr;
 
BUG_ON(!domain_pfn_supported(domain, iov_pfn + nr_pages - 1));
@@ -2250,26 +2245,14 @@ static int __domain_mapping(struct dmar_domain *domain, 
unsigned long iov_pfn,
if (domain_use_first_level(domain))
attr |= DMA_FL_PTE_PRESENT | DMA_FL_PTE_XD | DMA_FL_PTE_US;
 
-   if (!sg) {
-   sg_res = nr_pages;
-   pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | attr;
-   }
+   pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | attr;
 
while (nr_pages > 0) {
uint64_t tmp;
 
-   if (!sg_res) {
-   unsigned int pgoff = sg->offset & ~PAGE_MASK;
-
-   sg_res = aligned_nrpages(sg->offset, sg->length);
-   sg->dma_address = ((dma_addr_t)iov_pfn << 
VTD_PAGE_SHIFT) + pgoff;
-   sg->dma_length = sg->length;
-   pteval = (sg_phys(sg) - pgoff) | attr;
-   phys_pfn = pteval >> VTD_PAGE_SHIFT;
-   }
-
if (!pte) {
-   largepage_lvl = hardware_largepage_caps(domain, 
iov_pfn, phys_pfn, sg_res);
+   largepage_lvl = hardware_largepage_caps(domain, iov_pfn,
+   phys_pfn, nr_pages);
 
first_pte = pte = pfn_to_dma_pte(domain, iov_pfn, 
&largepage_lvl);
if (!pte)
@@ -2281,7 +2264,7 @@ static int __domain_mapping(struct dmar_domain *domain, 
unsigned long iov_pfn,
pteval |= DMA_PTE_LARGE_PAGE;
lvl_pages = lvl_to_nr_pages(largepage_lvl);
 
-   nr_superpages = sg_res / lvl_pages;
+   nr_superpages = nr_pages / lvl_pages;
end_pfn = iov_pfn + nr_superpages * lvl_pages - 
1;
 
/*
@@ -2315,48 +2298,4

[PATCH v3 3/6] iommu: Allow the dma-iommu api to use bounce buffers

2020-09-11 Thread Lu Baolu

From: Tom Murphy 

Allow the dma-iommu api to use bounce buffers for untrusted devices.
This is a copy of the intel bounce buffer code.

Signed-off-by: Tom Murphy 
Co-developed-by: Lu Baolu 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/dma-iommu.c | 163 +++---
 1 file changed, 150 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d06411bd5e08..1a1da22e5a5e 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -21,9 +21,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
+#include 
 
 struct iommu_dma_msi_page {
struct list_headlist;
@@ -498,6 +500,31 @@ static void __iommu_dma_unmap(struct device *dev, 
dma_addr_t dma_addr,
iommu_dma_free_iova(cookie, dma_addr, size, iotlb_gather.freelist);
 }
 
+static void __iommu_dma_unmap_swiotlb(struct device *dev, dma_addr_t dma_addr,
+   size_t size, enum dma_data_direction dir,
+   unsigned long attrs)
+{
+   struct iommu_domain *domain = iommu_get_dma_domain(dev);
+   struct iommu_dma_cookie *cookie = domain->iova_cookie;
+   struct iova_domain *iovad = &cookie->iovad;
+   phys_addr_t phys;
+
+   phys = iommu_iova_to_phys(domain, dma_addr);
+   if (WARN_ON(!phys))
+   return;
+
+   __iommu_dma_unmap(dev, dma_addr, size);
+
+   if (unlikely(is_swiotlb_buffer(phys)))
+   swiotlb_tbl_unmap_single(dev, phys, size,
+   iova_align(iovad, size), dir, attrs);
+}
+
+static bool dev_is_untrusted(struct device *dev)
+{
+   return dev_is_pci(dev) && to_pci_dev(dev)->untrusted;
+}
+
 static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
size_t size, int prot, u64 dma_mask)
 {
@@ -523,6 +550,55 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
phys_addr_t phys,
return iova + iova_off;
 }
 
+static dma_addr_t __iommu_dma_map_swiotlb(struct device *dev, phys_addr_t phys,
+   size_t org_size, dma_addr_t dma_mask, bool coherent,
+   enum dma_data_direction dir, unsigned long attrs)
+{
+   int prot = dma_info_to_prot(dir, coherent, attrs);
+   struct iommu_domain *domain = iommu_get_dma_domain(dev);
+   struct iommu_dma_cookie *cookie = domain->iova_cookie;
+   struct iova_domain *iovad = &cookie->iovad;
+   size_t aligned_size = org_size;
+   void *padding_start;
+   size_t padding_size;
+   dma_addr_t iova;
+
+   /*
+* If both the physical buffer start address and size are
+* page aligned, we don't need to use a bounce page.
+*/
+   if (IS_ENABLED(CONFIG_SWIOTLB) && dev_is_untrusted(dev) &&
+   iova_offset(iovad, phys | org_size)) {
+   aligned_size = iova_align(iovad, org_size);
+   phys = swiotlb_tbl_map_single(dev,
+   __phys_to_dma(dev, io_tlb_start),
+   phys, org_size, aligned_size, dir, attrs);
+
+   if (phys == DMA_MAPPING_ERROR)
+   return DMA_MAPPING_ERROR;
+
+   /* Cleanup the padding area. */
+   padding_start = phys_to_virt(phys);
+   padding_size = aligned_size;
+
+   if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+   (dir == DMA_TO_DEVICE ||
+dir == DMA_BIDIRECTIONAL)) {
+   padding_start += org_size;
+   padding_size -= org_size;
+   }
+
+   memset(padding_start, 0, padding_size);
+   }
+
+   iova = __iommu_dma_map(dev, phys, aligned_size, prot, dma_mask);
+   if ((iova == DMA_MAPPING_ERROR) && is_swiotlb_buffer(phys))
+   swiotlb_tbl_unmap_single(dev, phys, org_size,
+   aligned_size, dir, attrs);
+
+   return iova;
+}
+
 static void __iommu_dma_free_pages(struct page **pages, int count)
 {
while (count--)
@@ -698,11 +774,15 @@ static void iommu_dma_sync_single_for_cpu(struct device 
*dev,
 {
phys_addr_t phys;
 
-   if (dev_is_dma_coherent(dev))
+   if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
return;
 
phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
-   arch_sync_dma_for_cpu(phys, size, dir);
+   if (!dev_is_dma_coherent(dev))
+   arch_sync_dma_for_cpu(phys, size, dir);
+
+   if (is_swiotlb_buffer(phys))
+   swiotlb_tbl_sync_single(dev, phys, size, dir, SYNC_FOR_CPU);
 }
 
 static void iommu_dma_sync_single_for_device(struct device *dev,
@@ -710,11 +790,15 @@ static void iommu_dma_sync_single_for_device(struct 
device *dev,
 {
phys_addr_t phys;
 
-   if (dev_is_dma_coherent(dev))
+   if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
return;
 
phys = iommu_iova_to_phys(iommu_get_dma_domain(dev

[PATCH v3 0/6] Convert the intel iommu driver to the dma-iommu api

2020-09-11 Thread Lu Baolu

Tom Murphy has almost done all the work. His latest patch series was
posted here.

https://lore.kernel.org/linux-iommu/20200903201839.7327-1-murph...@tcd.ie/

Thanks a lot!

This series is a follow-up with below changes:

1. Add a quirk for the i915 driver issue described in Tom's cover
letter.
2. Fix several bugs in patch "iommu: Allow the dma-iommu api to use
bounce buffers" to make the bounce buffer work for untrusted devices.
3. Several cleanups in iommu/vt-d driver after the conversion.

Please review and test.

Best regards,
baolu

Lu Baolu (2):
  iommu: Add quirk for Intel graphic devices in map_sg
  iommu/vt-d: Cleanup after converting to dma-iommu ops

Tom Murphy (4):
  iommu: Handle freelists when using deferred flushing in iommu drivers
  iommu: Add iommu_dma_free_cpu_cached_iovas()
  iommu: Allow the dma-iommu api to use bounce buffers
  iommu/vt-d: Convert intel iommu driver to the iommu ops

 .../admin-guide/kernel-parameters.txt |   5 -
 drivers/iommu/dma-iommu.c | 229 -
 drivers/iommu/intel/Kconfig   |   1 +
 drivers/iommu/intel/iommu.c   | 885 +++---
 include/linux/dma-iommu.h |   8 +
 include/linux/iommu.h |   1 +
 6 files changed, 323 insertions(+), 806 deletions(-)

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v3 1/6] iommu: Handle freelists when using deferred flushing in iommu drivers

2020-09-11 Thread Lu Baolu

From: Tom Murphy 

Allow the iommu_unmap_fast to return newly freed page table pages and
pass the freelist to queue_iova in the dma-iommu ops path.

This is useful for iommu drivers (in this case the intel iommu driver)
which need to wait for the ioTLB to be flushed before newly
free/unmapped page table pages can be freed. This way we can still batch
ioTLB free operations and handle the freelists.

Signed-off-by: Tom Murphy 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/dma-iommu.c   | 30 ++--
 drivers/iommu/intel/iommu.c | 55 -
 include/linux/iommu.h   |  1 +
 3 files changed, 59 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 5141d49a046b..82c071b2d5c8 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -50,6 +50,18 @@ struct iommu_dma_cookie {
struct iommu_domain *fq_domain;
 };
 
+static void iommu_dma_entry_dtor(unsigned long data)
+{
+   struct page *freelist = (struct page *)data;
+
+   while (freelist) {
+   unsigned long p = (unsigned long)page_address(freelist);
+
+   freelist = freelist->freelist;
+   free_page(p);
+   }
+}
+
 static inline size_t cookie_msi_granule(struct iommu_dma_cookie *cookie)
 {
if (cookie->type == IOMMU_DMA_IOVA_COOKIE)
@@ -344,7 +356,8 @@ static int iommu_dma_init_domain(struct iommu_domain 
*domain, dma_addr_t base,
if (!cookie->fq_domain && !iommu_domain_get_attr(domain,
DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE, &attr) && attr) {
cookie->fq_domain = domain;
-   init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all, NULL);
+   init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all,
+ iommu_dma_entry_dtor);
}
 
if (!dev)
@@ -438,7 +451,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain 
*domain,
 }
 
 static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
-   dma_addr_t iova, size_t size)
+   dma_addr_t iova, size_t size, struct page *freelist)
 {
struct iova_domain *iovad = &cookie->iovad;
 
@@ -447,7 +460,8 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie 
*cookie,
cookie->msi_iova -= size;
else if (cookie->fq_domain) /* non-strict mode */
queue_iova(iovad, iova_pfn(iovad, iova),
-   size >> iova_shift(iovad), 0);
+   size >> iova_shift(iovad),
+   (unsigned long)freelist);
else
free_iova_fast(iovad, iova_pfn(iovad, iova),
size >> iova_shift(iovad));
@@ -472,7 +486,7 @@ static void __iommu_dma_unmap(struct device *dev, 
dma_addr_t dma_addr,
 
if (!cookie->fq_domain)
iommu_tlb_sync(domain, &iotlb_gather);
-   iommu_dma_free_iova(cookie, dma_addr, size);
+   iommu_dma_free_iova(cookie, dma_addr, size, iotlb_gather.freelist);
 }
 
 static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
@@ -494,7 +508,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
phys_addr_t phys,
return DMA_MAPPING_ERROR;
 
if (iommu_map_atomic(domain, iova, phys - iova_off, size, prot)) {
-   iommu_dma_free_iova(cookie, iova, size);
+   iommu_dma_free_iova(cookie, iova, size, NULL);
return DMA_MAPPING_ERROR;
}
return iova + iova_off;
@@ -649,7 +663,7 @@ static void *iommu_dma_alloc_remap(struct device *dev, 
size_t size,
 out_free_sg:
sg_free_table(&sgt);
 out_free_iova:
-   iommu_dma_free_iova(cookie, iova, size);
+   iommu_dma_free_iova(cookie, iova, size, NULL);
 out_free_pages:
__iommu_dma_free_pages(pages, count);
return NULL;
@@ -900,7 +914,7 @@ static int iommu_dma_map_sg(struct device *dev, struct 
scatterlist *sg,
return __finalise_sg(dev, sg, nents, iova);
 
 out_free_iova:
-   iommu_dma_free_iova(cookie, iova, iova_len);
+   iommu_dma_free_iova(cookie, iova, iova_len, NULL);
 out_restore_sg:
__invalidate_sg(sg, nents);
return 0;
@@ -1194,7 +1208,7 @@ static struct iommu_dma_msi_page 
*iommu_dma_get_msi_page(struct device *dev,
return msi_page;
 
 out_free_iova:
-   iommu_dma_free_iova(cookie, iova, size);
+   iommu_dma_free_iova(cookie, iova, size, NULL);
 out_free_page:
kfree(msi_page);
return NULL;
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 87b17bac04c2..63ee30c689a7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1208,17 +1208,17 @@ static struct page *dma_pte_clear_level(struct 
dmar_domain *domain, int level,
pages can only be freed after the IOTLB flush has been done. */
 static struct page *domain_unmap(struct dmar_domain

[PATCH v3 4/6] iommu: Add quirk for Intel graphic devices in map_sg

2020-09-11 Thread Lu Baolu

Combining the sg segments exposes a bug in the Intel i915 driver which
causes visual artifacts and the screen to freeze. This is most likely
because of how the i915 handles the returned list. It probably doesn't
respect the returned value specifying the number of elements in the list
and instead depends on the previous behaviour of the Intel iommu driver
which would return the same number of elements in the output list as in
the input list.

Signed-off-by: Tom Murphy 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/dma-iommu.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 1a1da22e5a5e..fc19f1fb9413 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -880,6 +880,33 @@ static int __finalise_sg(struct device *dev, struct 
scatterlist *sg, int nents,
unsigned int cur_len = 0, max_len = dma_get_max_seg_size(dev);
int i, count = 0;
 
+   /*
+* The Intel graphic driver is used to assume that the returned
+* sg list is not combound. This blocks the efforts of converting
+* Intel IOMMU driver to dma-iommu api's. Add this quirk to make the
+* device driver work and should be removed once it's fixed in i915
+* driver.
+*/
+   if (IS_ENABLED(CONFIG_DRM_I915) && dev_is_pci(dev) &&
+   to_pci_dev(dev)->vendor == PCI_VENDOR_ID_INTEL &&
+   (to_pci_dev(dev)->class >> 16) == PCI_BASE_CLASS_DISPLAY) {
+   for_each_sg(sg, s, nents, i) {
+   unsigned int s_iova_off = sg_dma_address(s);
+   unsigned int s_length = sg_dma_len(s);
+   unsigned int s_iova_len = s->length;
+
+   s->offset += s_iova_off;
+   s->length = s_length;
+   sg_dma_address(s) = dma_addr + s_iova_off;
+   sg_dma_len(s) = s_length;
+   dma_addr += s_iova_len;
+
+   pr_info_once("sg combining disabled due to i915 
driver\n");
+   }
+
+   return nents;
+   }
+
for_each_sg(sg, s, nents, i) {
/* Restore this segment's original unaligned fields first */
unsigned int s_iova_off = sg_dma_address(s);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [PATCH v7 10/16] vfio/type1: Support binding guest page tables to PASID

2020-09-11 Thread Liu, Yi L

Hi Alex,

> From: Alex Williamson 
> Sent: Saturday, September 12, 2020 6:04 AM
> 
> On Thu, 10 Sep 2020 03:45:27 -0700
> Liu Yi L  wrote:
> 
> > Nesting translation allows two-levels/stages page tables, with 1st
> > level for guest translations (e.g. GVA->GPA), 2nd level for host
> > translations (e.g. GPA->HPA). This patch adds interface for binding
> > guest page tables to a PASID. This PASID must have been allocated by
> > the userspace before the binding request.
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Jean-Philippe Brucker 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> > v6 -> v7:
> > *) introduced @user in struct domain_capsule to simplify the code per Eric's
> >suggestion.
> > *) introduced VFIO_IOMMU_NESTING_OP_NUM for sanitizing op from userspace.
> > *) corrected the @argsz value of unbind_data in 
> > vfio_group_unbind_gpasid_fn().
> >
> > v5 -> v6:
> > *) dropped vfio_find_nesting_group() and add 
> > vfio_get_nesting_domain_capsule().
> >per comment from Eric.
> > *) use iommu_uapi_sva_bind/unbind_gpasid() and iommu_sva_unbind_gpasid() in
> >linux/iommu.h for userspace operation and in-kernel operation.
> >
> > v3 -> v4:
> > *) address comments from Alex on v3
> >
> > v2 -> v3:
> > *) use __iommu_sva_unbind_gpasid() for unbind call issued by VFIO
> >
> > https://lore.kernel.org/linux-iommu/1592931837-58223-6-git-send-email-
> > jacob.jun@linux.intel.com/
> >
> > v1 -> v2:
> > *) rename subject from "vfio/type1: Bind guest page tables to host"
> > *) remove VFIO_IOMMU_BIND, introduce VFIO_IOMMU_NESTING_OP to support
> bind/
> >unbind guet page table
> > *) replaced vfio_iommu_for_each_dev() with a group level loop since this
> >series enforces one group per container w/ nesting type as start.
> > *) rename vfio_bind/unbind_gpasid_fn() to
> > vfio_dev_bind/unbind_gpasid_fn()
> > *) vfio_dev_unbind_gpasid() always successful
> > *) use vfio_mm->pasid_lock to avoid race between PASID free and page table
> >bind/unbind
> > ---
> >  drivers/vfio/vfio_iommu_type1.c | 163
> 
> >  drivers/vfio/vfio_pasid.c   |  26 +++
> >  include/linux/vfio.h|  20 +
> >  include/uapi/linux/vfio.h   |  36 +
> >  4 files changed, 245 insertions(+)
> >
> > diff --git a/drivers/vfio/vfio_iommu_type1.c
> > b/drivers/vfio/vfio_iommu_type1.c index bd4b668..11f1156 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -149,6 +149,39 @@ struct vfio_regions {
> >  #define DIRTY_BITMAP_PAGES_MAX  ((u64)INT_MAX)
> >  #define DIRTY_BITMAP_SIZE_MAX
> DIRTY_BITMAP_BYTES(DIRTY_BITMAP_PAGES_MAX)
> >
> > +struct domain_capsule {
> > +   struct vfio_group   *group;
> > +   struct iommu_domain *domain;
> > +   /* set if @data contains a user pointer*/
> > +   booluser;
> > +   void*data;
> > +};
> 
> Put the hole in the structure at the end, but I suspect we might lose the 
> user field
> when the internal api drops the unnecessary structure for unbind anyway.

I see. will move @user and its comment to the end of this struct. As it's
used to imply the @data field user pointer or not, I guess it's still useful
to keep it. The difference would be the @data is a pasid not a bind_data
struct.

> > +
> > +/* iommu->lock must be held */
> > +static int vfio_prepare_nesting_domain_capsule(struct vfio_iommu *iommu,
> > +  struct domain_capsule *dc) {
> > +   struct vfio_domain *domain = NULL;
> > +   struct vfio_group *group = NULL;
> 
> Unnecessary initialization.

will remove them. :-)

> > +
> > +   if (!iommu->nesting_info)
> > +   return -EINVAL;
> > +
> > +   /*
> > +* Only support singleton container with nesting type. If
> > +* nesting_info is non-NULL, the container is non-empty.
> > +* Also domain is non-empty.
> > +*/
> > +   domain = list_first_entry(&iommu->domain_list,
> > + struct vfio_domain, next);
> > +   group = list_first_entry(&domain->group_list,
> > +struct vfio_group, next);
> > +   dc->group = group;
> > +   dc->domain = domain->domain;
> > +   dc->user = true;
> > +   return 0;
> > +}
> > +
> >  static int put_pfn(unsigned long pfn, int prot);
> >
> >  static struct vfio_group *vfio_iommu_find_iommu_group(struct
> > vfio_iommu *iommu, @@ -2405,6 +2438,49 @@ static int
> vfio_iommu_resv_refresh(struct vfio_iommu *iommu,
> > return ret;
> >  }
> >
> > +static int vfio_dev_bind_gpasid_fn(struct device *dev, void *data) {
> > +   struct domain_capsule *dc = (struct domain_capsule *)data;
> > +   unsigned long arg = *(unsigned long *)dc->data;
> > +
> > +   return iommu_uapi_sva_bind_gpasid(dc->domain, dev,
> > +

RE: [PATCH v7 07/16] vfio/type1: Add VFIO_IOMMU_PASID_REQUEST (alloc/free)

2020-09-11 Thread Liu, Yi L

Hi Alex,

> From: Alex Williamson 
> Sent: Saturday, September 12, 2020 5:38 AM
> 
> On Thu, 10 Sep 2020 03:45:24 -0700
> Liu Yi L  wrote:
> 
> > This patch allows userspace to request PASID allocation/free, e.g. when
> > serving the request from the guest.
> >
> > PASIDs that are not freed by userspace are automatically freed when the
> > IOASID set is destroyed when process exits.
> >
> > Cc: Kevin Tian 
> > CC: Jacob Pan 
> > Cc: Alex Williamson 
> > Cc: Eric Auger 
> > Cc: Jean-Philippe Brucker 
> > Cc: Joerg Roedel 
> > Cc: Lu Baolu 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Yi Sun 
> > Signed-off-by: Jacob Pan 
> > ---
> > v6 -> v7:
> > *) current VFIO returns allocated pasid via signed int, thus VFIO UAPI
> >can only support 31 bits pasid. If user space gives min,max which is
> >wider than 31 bits, should fail the allocation or free request.
> >
> > v5 -> v6:
> > *) address comments from Eric against v5. remove the alloc/free helper.
> >
> > v4 -> v5:
> > *) address comments from Eric Auger.
> > *) the comments for the PASID_FREE request is addressed in patch 5/15 of
> >this series.
> >
> > v3 -> v4:
> > *) address comments from v3, except the below comment against the range
> >of PASID_FREE request. needs more help on it.
> > "> +if (req.range.min > req.range.max)
> >
> >  Is it exploitable that a user can spin the kernel for a long time in
> >  the case of a free by calling this with [0, MAX_UINT] regardless of
> >  their actual allocations?"
> > https://lore.kernel.org/linux-iommu/20200702151832.048b4...@x1.home/
> >
> > v1 -> v2:
> > *) move the vfio_mm related code to be a seprate module
> > *) use a single structure for alloc/free, could support a range of PASIDs
> > *) fetch vfio_mm at group_attach time instead of at iommu driver open time
> > ---
> >  drivers/vfio/Kconfig|  1 +
> >  drivers/vfio/vfio_iommu_type1.c | 76
> +
> >  drivers/vfio/vfio_pasid.c   | 10 ++
> >  include/linux/vfio.h|  6 
> >  include/uapi/linux/vfio.h   | 43 +++
> >  5 files changed, 136 insertions(+)
> >
> > diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> > index 3d8a108..95d90c6 100644
> > --- a/drivers/vfio/Kconfig
> > +++ b/drivers/vfio/Kconfig
> > @@ -2,6 +2,7 @@
> >  config VFIO_IOMMU_TYPE1
> > tristate
> > depends on VFIO
> > +   select VFIO_PASID if (X86)
> > default n
> >
> >  config VFIO_IOMMU_SPAPR_TCE
> > diff --git a/drivers/vfio/vfio_iommu_type1.c 
> > b/drivers/vfio/vfio_iommu_type1.c
> > index 3c0048b..bd4b668 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -76,6 +76,7 @@ struct vfio_iommu {
> > booldirty_page_tracking;
> > boolpinned_page_dirty_scope;
> > struct iommu_nesting_info   *nesting_info;
> > +   struct vfio_mm  *vmm;
> >  };
> >
> >  struct vfio_domain {
> > @@ -2000,6 +2001,11 @@ static void vfio_iommu_iova_insert_copy(struct
> vfio_iommu *iommu,
> >
> >  static void vfio_iommu_release_nesting_info(struct vfio_iommu *iommu)
> >  {
> > +   if (iommu->vmm) {
> > +   vfio_mm_put(iommu->vmm);
> > +   iommu->vmm = NULL;
> > +   }
> > +
> > kfree(iommu->nesting_info);
> > iommu->nesting_info = NULL;
> >  }
> > @@ -2127,6 +2133,26 @@ static int vfio_iommu_type1_attach_group(void
> *iommu_data,
> > iommu->nesting_info);
> > if (ret)
> > goto out_detach;
> > +
> > +   if (iommu->nesting_info->features &
> > +   IOMMU_NESTING_FEAT_SYSWIDE_PASID)
> {
> > +   struct vfio_mm *vmm;
> > +   struct ioasid_set *set;
> > +
> > +   vmm = vfio_mm_get_from_task(current);
> > +   if (IS_ERR(vmm)) {
> > +   ret = PTR_ERR(vmm);
> > +   goto out_detach;
> > +   }
> > +   iommu->vmm = vmm;
> > +
> > +   set = vfio_mm_ioasid_set(vmm);
> > +   ret = iommu_domain_set_attr(domain->domain,
> > +   DOMAIN_ATTR_IOASID_SET,
> > +   set);
> > +   if (ret)
> > +   goto out_detach;
> > +   }
> > }
> >
> > /* Get aperture info */
> > @@ -2908,6 +2934,54 @@ static int vfio_iommu_type1_dirty_pages(struct
> vfio_iommu *iommu,
> > return -EINVAL;
> >  }
> >
> > +static int vfio_iommu_type1_pasid_request(struct vfio_iommu *iommu,
> > + unsigned long arg)
> > +{
> > +   struct vfio_iommu_type1_pasid_request req;
> > +   unsigned long minsz;
> > +   int ret;
> > +
> > +   minsz = offsetofend(struct vfio_iommu_type1_pasid_request, range);
> > +
> > +

Re: [PATCH 3/3] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset

2020-09-11 Thread Christoph Hellwig

On Fri, Sep 11, 2020 at 05:12:36PM +0100, Robin Murphy wrote:
> (apologies to Jim - I did look through one of the previous versions since I 
> last commented and thought it looked OK, but never actually replied as 
> such)
>
> On 2020-09-10 06:40, Christoph Hellwig wrote:
>> From: Jim Quinlan 
>>
>> The new field 'dma_range_map' in struct device is used to facilitate the
>> use of single or multiple offsets between mapping regions of cpu addrs and
>> dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
>> capable of holding a single uniform offset and had no region bounds
>> checking.
>>
>> The function of_dma_get_range() has been modified so that it takes a single
>> argument -- the device node -- and returns a map, NULL, or an error code.
>> The map is an array that holds the information regarding the DMA regions.
>> Each range entry contains the address offset, the cpu_start address, the
>> dma_start address, and the size of the region.
>>
>> of_dma_configure() is the typical manner to set range offsets but there are
>> a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
>> driver code.  These cases now invoke the function
>> dma_attach_offset_range(dev, cpu_addr, dma_addr, size).
>
> This is now called dma_direct_set_offset(), right?

Yes.

>> +int ret = dma_direct_set_offset(dev, KEYSTONE_HIGH_PHYS_START,
>> +KEYSTONE_LOW_PHYS_START,
>> +KEYSTONE_HIGH_PHYS_SIZE);
>> +dev_err(dev, "set dma_offset%08llx%s\n",
>> +KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START,
>> +ret ? " failed" : "");
>
> FWIW I've already been thinking of some optimisations which would have the 
> happy side-effect of removing many of these allocation failure scenarios, 
> but at this point I reckon it's more practical to just get the current 
> implementation landed and working.

Given that no one deals or can easily deal with these failures we
should probably take care of that.  IMHO we could just allocate a
single static range and point all the devices to it, what would
you think of that?

>>   @@ -811,8 +812,13 @@ static int sun4i_backend_bind(struct device *dev, 
>> struct device *master,
>>   * because of an old DT, we need to set the DMA offset by hand
>>   * on our device since the RAM mapping is at 0 for the DMA bus,
>>   * unlike the CPU.
>> + *
>> + * XXX(hch): this has no business in a driver and needs to move
>> + * to the device tree.
>
> As the context implies, this has actually grown a proper DT description of 
> the funky interconnect layout (see 564d6fd611f9 and the linked patch 
> series), and this is just an ugly fallback path to prevent regressions with 
> old DTBs that are already out there. So unless you can fire up the time 
> machine to fix those, this extra comment is really just beating a dead 
> horse :(

Well, I need to beat the dead horse to avoid having to export the
function, which will really just bread new users.  So at the minimum
we'd need to move the setup to platform code that is always built in.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

70 matches

Mail list logo