Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2020-04-15 Thread Zhangfei Gao



On 2020/4/14 11:05 PM, Eric Auger wrote:

This version fixes an issue observed by Shameer on an SMMU 3.2,
when moving from dual stage config to stage 1 only config.
The 2 high 64b of the STE now get reset. Otherwise, leaving the
S2TTB set may cause a C_BAD_STE error.
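
For illustration, the fix amounts to something like the sketch below when
tearing an STE back down to a stage 1 only (or bypass) configuration; the
names follow the arm-smmu-v3 driver, but the exact v11 hunk may differ:

	/* dst points at the STE's 64-bit words; sketch, not the exact hunk */
	dst[2] = 0;	/* clear stale stage 2 fields */
	dst[3] = 0;	/* clear the stale S2TTB pointer */
	arm_smmu_sync_ste_for_sid(smmu, sid);	/* publish the update */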

This series can be found at:
https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1
(including the VFIO part)
The companion QEMU series can still be found at:
https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6

Users have expressed interest in that work and tested v9/v10:
- https://patchwork.kernel.org/cover/11039995/#23012381
- https://patchwork.kernel.org/cover/11039995/#23197235

Background:

This series brings the IOMMU part of HW nested paging support
in the SMMUv3. The VFIO part is submitted separately.

The IOMMU API is extended to support 2 new API functionalities:
1) pass the guest stage 1 configuration
2) pass stage 1 MSI bindings

Those capabilities then get implemented in the SMMUv3 driver.

The virtualizer passes information through the VFIO user API,
which cascades it to the iommu subsystem. This allows the guest
to own stage 1 tables and context descriptors (so-called PASID
table) while the host owns stage 2 tables and main configuration
structures (STE).
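
As a rough sketch of the resulting call flow (illustrative only; the config
struct comes from the new uapi header, and the constant/field values shown
here are assumptions rather than the exact series code):

	struct iommu_pasid_table_config cfg = {
		.version  = PASID_TABLE_CFG_VERSION_1,	/* assumed name */
		.format   = IOMMU_PASID_FORMAT_SMMUV3,	/* assumed name */
		.base_ptr = guest_cd_table_addr,	/* guest-owned PASID/CD table */
	};

	/* VFIO traps the guest configuration update and ends up calling: */
	ret = iommu_attach_pasid_table(domain, &cfg);
	...
	iommu_detach_pasid_table(domain);	/* revert to host-only config */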




Thanks Eric

Tested v11 on the HiSilicon Kunpeng 920 board via the hardware zip accelerator.
1. no-sva works, where the guest app directly uses physical addresses via ioctl.
2. vSVA still does not work, same as v10.
3. The v10 issue reported by Shameer has been solved: first start qemu
with iommu=smmuv3, then start qemu without iommu=smmuv3.
4. no-sva also works without iommu=smmuv3.

Test details in https://docs.qq.com/doc/DRU5oR1NtUERseFNL

Thanks

Re: [PATCH v2 6/7] iommu/vt-d: Add page request draining support

2020-04-15 Thread Lu Baolu

On 2020/4/15 19:10, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Wednesday, April 15, 2020 1:26 PM

When a PASID is stopped or terminated, there can be pending
PRQs (requests that haven't received responses) in the remapping
hardware. This adds an interface to drain page requests and
calls it when a PASID is terminated.

Signed-off-by: Jacob Pan 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel-svm.c   | 90 ++---
  include/linux/intel-iommu.h |  1 +
  2 files changed, 86 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 05aeb8ea51c4..736dd39fb52b 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -23,6 +23,7 @@
  #include "intel-pasid.h"

  static irqreturn_t prq_event_thread(int irq, void *d);
+static void intel_svm_drain_prq(struct device *dev, int pasid);

  #define PRQ_ORDER 0

@@ -210,6 +211,7 @@ static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
	rcu_read_lock();
	list_for_each_entry_rcu(sdev, &svm->devs, list) {
		intel_pasid_tear_down_entry(svm->iommu, sdev->dev, svm->pasid);
+		intel_svm_drain_prq(sdev->dev, svm->pasid);


I feel there is a problem here. If you clear the PASID entry before draining,
in-flight requests will hit an unrecoverable fault instead, due to the
invalid PASID entry.


The in-flight requests will be ignored by the IOMMU if the PASID entry is
empty. They won't result in an unrecoverable fault.
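
(For clarity, the teardown order resulting from the hunks in this patch is:

	intel_pasid_tear_down_entry(iommu, dev, svm->pasid);	/* new PRQs ignored */
	intel_svm_drain_prq(dev, svm->pasid);			/* wait out in-flight PRQs */
	intel_flush_svm_range_dev(svm, sdev, 0, -1, 0);		/* flush caches */

so by the time the drain runs, the hardware no longer queues recoverable
faults for this PASID.)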




intel_flush_svm_range_dev(svm, sdev, 0, -1, 0);
}
rcu_read_unlock();
@@ -403,12 +405,8 @@ int intel_svm_unbind_gpasid(struct device *dev, int pasid)
	if (!sdev->users) {
		list_del_rcu(&sdev->list);
		intel_pasid_tear_down_entry(iommu, dev, svm->pasid);
+		intel_svm_drain_prq(dev, svm->pasid);
		intel_flush_svm_range_dev(svm, sdev, 0, -1, 0);
-		/* TODO: Drain in flight PRQ for the PASID since it
-		 * may get reused soon, we don't want to
-		 * confuse with its previous life.
-		 * intel_svm_drain_prq(dev, pasid);
-		 */
		kfree_rcu(sdev, rcu);

	if (list_empty(&svm->devs)) {
@@ -646,6 +644,7 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
		 * large and has to be physically contiguous. So it's
		 * hard to be as defensive as we might like. */
		intel_pasid_tear_down_entry(iommu, dev, svm->pasid);
+		intel_svm_drain_prq(dev, svm->pasid);
intel_flush_svm_range_dev(svm, sdev, 0, -1, 0);
kfree_rcu(sdev, rcu);

@@ -703,6 +702,7 @@ struct page_req_dsc {
  struct page_req {
struct list_head list;
struct page_req_dsc desc;
+   struct completion complete;
unsigned int processing:1;
unsigned int drained:1;
unsigned int completed:1;
@@ -732,9 +732,83 @@ static bool is_canonical_address(u64 addr)
return (((saddr << shift) >> shift) == saddr);
  }

+/**
+ * intel_svm_drain_prq:
+ *
+ * Drain all pending page requests related to a specific pasid in both
+ * software and hardware. The caller must guarantee that no more page
+ * requests related to this pasid will arrive.
+ */
+static void intel_svm_drain_prq(struct device *dev, int pasid)
+{
+   struct device_domain_info *info;
+   struct dmar_domain *domain;
+   struct intel_iommu *iommu;
+   struct qi_desc desc[3];
+   struct pci_dev *pdev;
+   struct page_req *req;
+   unsigned long flags;
+   u16 sid, did;
+   int qdep;
+
+   info = get_domain_info(dev);
+   if (WARN_ON(!info || !dev_is_pci(dev)))
+   return;
+
+   iommu = info->iommu;
+   domain = info->domain;
+   pdev = to_pci_dev(dev);
+
+	/* Mark all related pending requests drained. */
+	spin_lock_irqsave(&iommu->prq_lock, flags);
+	list_for_each_entry(req, &iommu->prq_list, list)
+		if (req->desc.pasid_present && req->desc.pasid == pasid)
+			req->drained = true;
+	spin_unlock_irqrestore(&iommu->prq_lock, flags);
+
+	/* Wait until all related pending requests complete. */
+retry:
+	spin_lock_irqsave(&iommu->prq_lock, flags);
+	list_for_each_entry(req, &iommu->prq_list, list) {
+		if (req->desc.pasid_present &&
+		    req->desc.pasid == pasid &&
+		    !req->completed) {
+			spin_unlock_irqrestore(&iommu->prq_lock, flags);
+			wait_for_completion_timeout(&req->complete, 5 * HZ);
+			goto retry;
+		}
+	}
+	spin_unlock_irqrestore(&iommu->prq_lock, flags);
+
+   /*
+* Perform steps described in VT-d spec CH7.10 to drain page
+* 

Re: [PATCH v2 5/7] iommu/vt-d: Save prq descriptors in an internal list

2020-04-15 Thread Lu Baolu

On 2020/4/15 17:30, Tian, Kevin wrote:

From: Lu Baolu
Sent: Wednesday, April 15, 2020 1:26 PM

Currently, the page request interrupt thread handles the page
requests in the queue in this way:

- Clear PPR bit to ensure new interrupt could come in;
- Read and record the head and tail registers;
- Handle all descriptors between head and tail;
- Write tail to head register.

This might cause some descriptors to be handled multiple times.
An example sequence:

- Thread A gets scheduled with PRQ_1 and PRQ_2 in the queue;
- Thread A clears the PPR bit and records the head and tail;
- A new PRQ_3 comes and Thread B gets scheduled;
- Thread B records the head and tail, which include PRQ_1
   and PRQ_2.

I may be overlooking something, but isn't the prq interrupt thread
per iommu? Then why would two prq threads contend here?


The prq interrupt can be masked by the PPR (Pending Page Request) bit
in the Page Request Status Register. In the interrupt handling thread,
once this bit is cleared, new prq interrupts are allowed to be generated.

So, if a page request is in process and the PPR bit has been cleared,
another page request from any device under the same iommu could trigger
another interrupt thread.
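
In code terms, the window is roughly the following (a sketch of the existing
handler shown in patch 4, not new code):

	/* thread A */
	writel(DMA_PRS_PPR, iommu->reg + DMAR_PRS_REG);	/* re-arms the interrupt */
	tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
	head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
	/*
	 * A request landing here raises a new interrupt, so thread B can
	 * read an overlapping [head, tail) window before thread A writes
	 * the head register back; both threads then handle the same
	 * descriptors.
	 */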

Best regards,
baolu


Re: [PATCH v2 4/7] iommu/vt-d: Refactor prq_event_thread()

2020-04-15 Thread Lu Baolu

On 2020/4/15 17:15, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Wednesday, April 15, 2020 1:26 PM

Move the software processing of page request descriptors out of
prq_event_thread() into a separate function. No functional
changes.

Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel-svm.c | 256 --
  1 file changed, 135 insertions(+), 121 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 83dc4319f661..a1921b462783 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -722,142 +722,156 @@ static bool is_canonical_address(u64 addr)
return (((saddr << shift) >> shift) == saddr);
  }

-static irqreturn_t prq_event_thread(int irq, void *d)
+static void process_single_prq(struct intel_iommu *iommu,
+  struct page_req_dsc *req)
  {
-   struct intel_iommu *iommu = d;
-   struct intel_svm *svm = NULL;
-   int head, tail, handled = 0;
-
-	/* Clear PPR bit before reading head/tail registers, to
-	 * ensure that we get a new interrupt if needed. */
-	writel(DMA_PRS_PPR, iommu->reg + DMAR_PRS_REG);
-
-	tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
-	head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
-   while (head != tail) {
-   struct intel_svm_dev *sdev;
-   struct vm_area_struct *vma;
-   struct page_req_dsc *req;
-   struct qi_desc resp;
-   int result;
-   vm_fault_t ret;
-   u64 address;
-
-   handled = 1;
-
-		req = &iommu->prq[head / sizeof(*req)];
+   int result = QI_RESP_FAILURE;
+   struct intel_svm_dev *sdev;
+   struct vm_area_struct *vma;
+   struct intel_svm *svm;
+   struct qi_desc resp;
+   vm_fault_t ret;
+   u64 address;
+
+   address = (u64)req->addr << VTD_PAGE_SHIFT;
+   if (!req->pasid_present) {
+   pr_err("%s: Page request without PASID: %08llx %08llx\n",
+  iommu->name, ((unsigned long long *)req)[0],
+  ((unsigned long long *)req)[1]);
+   goto no_pasid;
+   }

-   result = QI_RESP_FAILURE;
-   address = (u64)req->addr << VTD_PAGE_SHIFT;
-   if (!req->pasid_present) {
-   pr_err("%s: Page request without
PASID: %08llx %08llx\n",
-  iommu->name, ((unsigned long long *)req)[0],
-  ((unsigned long long *)req)[1]);
-   goto no_pasid;
-   }
+   rcu_read_lock();
+   svm = ioasid_find(NULL, req->pasid, NULL);
+   /*
+* It *can't* go away, because the driver is not permitted
+* to unbind the mm while any page faults are outstanding.
+* So we only need RCU to protect the internal idr code.
+*/
+   rcu_read_unlock();

-		if (!svm || svm->pasid != req->pasid) {
-			rcu_read_lock();
-			svm = ioasid_find(NULL, req->pasid, NULL);
-			/* It *can't* go away, because the driver is not permitted
-			 * to unbind the mm while any page faults are outstanding.
-			 * So we only need RCU to protect the internal idr code. */
-			rcu_read_unlock();
-			if (IS_ERR_OR_NULL(svm)) {
-				pr_err("%s: Page request for invalid PASID %d: %08llx %08llx\n",
-				       iommu->name, req->pasid, ((unsigned long long *)req)[0],
-				       ((unsigned long long *)req)[1]);
-				goto no_pasid;
-			}
-		}
+   if (IS_ERR_OR_NULL(svm)) {
+   pr_err("%s: Page request for invalid
PASID %d: %08llx %08llx\n",
+  iommu->name, req->pasid, ((unsigned long long *)req)[0],
+  ((unsigned long long *)req)[1]);
+   goto no_pasid;
+   }

-   result = QI_RESP_INVALID;
-	/* Since we're using init_mm.pgd directly, we should never take
-	 * any faults on kernel addresses. */
-   if (!svm->mm)
-   goto bad_req;
+   result = QI_RESP_INVALID;
+   /* Since we're using init_mm.pgd directly, we should never take
+* any faults on kernel addresses. */
+   if (!svm->mm)
+   goto bad_req;
+
+   /* If address is not canonical, return invalid response */
+   if (!is_canonical_address(address))
+   goto bad_req;
+
+   /* If the mm is already defunct, don't handle faults. */
+   if (!mmget_not_zero(svm->mm))
+   goto bad_req;
+
+	down_read(&svm->mm->mmap_sem);
+   vma = find_extend_vma(svm->mm, address);
+   if (!vma || address < vma->vm_start)
+   

RE: [PATCH v2 1/3] iommu/uapi: Define uapi version and capabilities

2020-04-15 Thread Tian, Kevin
> From: Jacob Pan 
> Sent: Wednesday, April 15, 2020 11:39 PM
> 
> On Tue, 14 Apr 2020 23:47:40 +
> "Tian, Kevin"  wrote:
> 
> > > From: Jacob Pan 
> > > Sent: Wednesday, April 15, 2020 6:32 AM
> > >
> > > On Tue, 14 Apr 2020 10:13:04 -0700
> > > Jacob Pan  wrote:
> > >
> > > > > > >  In any of the proposed solutions, the
> > > > > > > IOMMU driver is ultimately responsible for validating the
> > > > > > > user data, so do we want vfio performing the
> > > > > > > copy_from_user() to an object that could later be assumed
> > > > > > > to be sanitized, or should vfio just pass a user pointer to
> > > > > > > make it obvious that the consumer is responsible for all
> > > > > > > the user protections?  Seems like the latter.
> > > > > > I like the latter as well.
> > > > > >
> > > On second thought, I think the former is better. Two reasons:
> > >
> > > 1. IOMMU API such as page_response is also used in baremetal. So it
> > > is not suitable to pass a __user *.
> > > https://www.spinics.net/lists/arm-kernel/msg798677.html
> >
> > You can have a wrapped version accepting a __user* and an internal
> > version for kernel pointers.
> >
> I have thought about that also but the problem is that some of the
> flags are processed in the vendor IOMMU ops so it is hard to do that in
> a generic wrapper.

All vendor IOMMU ops are defined in the same header file, so they
can be verified in one common IOMMU wrapper, just like how you
dealt with it in VFIO originally...
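
For example, something along these lines (a sketch only; iommu_page_response()
is the existing kernel-pointer API, while the __user wrapper and its checks
are hypothetical):

	/* internal version keeps taking a kernel pointer (fine for baremetal) */
	int iommu_page_response(struct device *dev, struct iommu_page_response *msg);

	/* hypothetical wrapper that VFIO would call with a user pointer */
	static int iommu_page_response_user(struct device *dev,
					    void __user *uptr, size_t minsz)
	{
		struct iommu_page_response msg;

		if (copy_from_user(&msg, uptr, minsz))
			return -EFAULT;
		/* validate version/flags here, in one common place */
		return iommu_page_response(dev, &msg);
	}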

> 
> > >
> > > 2. Some data are in the mandatory (fixed offset, never removed or
> > > extended) portion of the uAPI structure. It is simpler for VFIO to
> > > extract that and pass it to IOMMU API. For example, the PASID value
> > > used for unbind_gpasid(). VFIO also needs to sanitize the PASID
> > > value to make sure it belongs to the same VM that did the
> > > allocation.
> >
> > I don't think this makes much difference. If you still plan to
> > let the IOMMU driver parse some user pointers anyway, why not make a
> > clean split and have it parse all IOMMU-specific fields?
> >
> The plan is not to have IOMMU driver parse user pointers. This is the
> "former" case in Alex's comment. I.e. vfio performing the
> copy_from_user based on argsz in IOMMU uAPI.
> 

I'm confused. I thought Alex proposed the latter one:
---[quote]
> So, __user * will be passed to the IOMMU driver if VFIO checks that
> minsz includes flags and they are valid.
> IOMMU driver can copy the rest based on the mandatory version/minsz and
> flags in the IOMMU uAPI structs.
> Does it sound right? This is really choice #2.

Sounds like each IOMMU UAPI struct just needs to have an embedded size
and flags field, but yes.


Thanks
Kevin


RE: [EXT] Re: [RFC PATCH 1/4] bus: fsl-mc: add custom .dma_configure implementation

2020-04-15 Thread Makarand Pawagi



> -Original Message-
> From: Lorenzo Pieralisi 
> Sent: Tuesday, April 14, 2020 8:02 PM
> To: Laurentiu Tudor 
> Cc: linux-ker...@vger.kernel.org; iommu@lists.linux-foundation.org; linux-arm-
> ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> robin.mur...@arm.com; ard.biesheu...@linaro.org; Ioana Ciornei
> ; Diana Madalina Craciun (OSS)
> ; m...@kernel.org; j...@solid-run.com; Pankaj
> Bansal ; Makarand Pawagi
> ; Calvin Johnson ;
> Varun Sethi ; Cristi Sovaiala ;
> stuart.yo...@arm.com; jeremy.lin...@arm.com; j...@8bytes.org;
> t...@linutronix.de; ja...@lakedaemon.net
> Subject: [EXT] Re: [RFC PATCH 1/4] bus: fsl-mc: add custom .dma_configure
> implementation
> 
> 
> On Wed, Mar 25, 2020 at 06:48:55PM +0200, Laurentiu Tudor wrote:
> > Hi Lorenzo,
> >
> > On 3/25/2020 2:51 PM, Lorenzo Pieralisi wrote:
> > > On Thu, Feb 27, 2020 at 12:05:39PM +0200, laurentiu.tu...@nxp.com wrote:
> > >> From: Laurentiu Tudor 
> > >>
> > >> The devices on this bus are not discovered by way of device tree
> > >> but by queries to the firmware. It makes little sense to trick the
> > > >> generic OF layer into thinking that these devices are OF-related so
> > >> that we can get our dma configuration. Instead of doing that, add
> > >> our custom dma configuration implementation.
> > >>
> > >> Signed-off-by: Laurentiu Tudor 
> > >> ---
> > >>  drivers/bus/fsl-mc/fsl-mc-bus.c | 31
> > >> ++-
> > >>  1 file changed, 30 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c
> > >> b/drivers/bus/fsl-mc/fsl-mc-bus.c index 36eb25f82c8e..eafaa0e0b906
> > >> 100644
> > >> --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> > >> +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> > >> @@ -132,11 +132,40 @@ static int fsl_mc_bus_uevent(struct device
> > >> *dev, struct kobj_uevent_env *env)  static int
> > >> fsl_mc_dma_configure(struct device *dev)  {
> > >>struct device *dma_dev = dev;
> > >> +  struct iommu_fwspec *fwspec;
> > > >> +  const struct iommu_ops *iommu_ops;
> > > >> +  struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
> > > >> +  int ret;
> > >> +  u32 icid;
> > >>
> > >>while (dev_is_fsl_mc(dma_dev))
> > >>dma_dev = dma_dev->parent;
> > >>
> > >> -  return of_dma_configure(dev, dma_dev->of_node, 0);
> > > >> +  fwspec = dev_iommu_fwspec_get(dma_dev);
> > > >> +  if (!fwspec)
> > >> +  return -ENODEV;
> > >> +  iommu_ops = iommu_ops_from_fwnode(fwspec->iommu_fwnode);
> > >> +  if (!iommu_ops)
> > >> +  return -ENODEV;
> > >> +
> > >> +  ret = iommu_fwspec_init(dev, fwspec->iommu_fwnode, iommu_ops);
> > > >> +  if (ret)
> > >> +  return ret;
> > >> +
> > >> +  icid = mc_dev->icid;
> > > >> +  ret = iommu_fwspec_add_ids(dev, &icid, 1);
> > >
> > > I see. So with this patch we would use the MC named component only
> > > to retrieve the iommu_ops
> >
> > Right. I'd also add that the implementation tries to follow the
> > existing standard .dma_configure implementations, e.g.
> > of_dma_configure + of_iommu_configure. I'd also note that similarly to
> > the ACPI case, this MC FW device is probed as a platform device in the
> > DT scenario, binding here [1].
> > A similar approach is used for the retrieval of the msi irq domain,
> > see following patch.
> >
> > > - the stream IDs are injected directly here, bypassing OF/IORT binding
> translations altogether.
> >
> > Actually I've submitted a v2 [2] that calls into .of_xlate() to allow
> > the smmu driver to do some processing on the raw streamid coming from
> > the firmware. I have not yet tested this with ACPI but expect it to
> > work, however, it's debatable how valid is this approach in the
> > context of ACPI.
> 
> Actually, what I think you need is of_map_rid() (and an IORT equivalent, that 
> I
> am going to write - generalizing iort_msi_map_rid()).
> 

That would help.

> Would that be enough to enable IORT "normal" mappings in the MC bus named
> components ?
> 

But the question still remains unanswered: how are we going to represent the MC?
As a platform device with the single ID mapping flag?

> Thanks,
> Lorenzo


Re: [RFC PATCH 1/4] bus: fsl-mc: add custom .dma_configure implementation

2020-04-15 Thread Robin Murphy

On 2020-04-15 4:44 pm, Laurentiu Tudor wrote:



On 4/14/2020 5:32 PM, Lorenzo Pieralisi wrote:

On Wed, Mar 25, 2020 at 06:48:55PM +0200, Laurentiu Tudor wrote:

Hi Lorenzo,

On 3/25/2020 2:51 PM, Lorenzo Pieralisi wrote:

On Thu, Feb 27, 2020 at 12:05:39PM +0200, laurentiu.tu...@nxp.com wrote:

From: Laurentiu Tudor 

The devices on this bus are not discovered by way of device tree
but by queries to the firmware. It makes little sense to trick the
generic OF layer into thinking that these devices are OF-related so
that we can get our dma configuration. Instead of doing that, add
our custom dma configuration implementation.

Signed-off-by: Laurentiu Tudor 
---
  drivers/bus/fsl-mc/fsl-mc-bus.c | 31 ++-
  1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 36eb25f82c8e..eafaa0e0b906 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -132,11 +132,40 @@ static int fsl_mc_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
  static int fsl_mc_dma_configure(struct device *dev)
  {
struct device *dma_dev = dev;
+   struct iommu_fwspec *fwspec;
+   const struct iommu_ops *iommu_ops;
+   struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+   int ret;
+   u32 icid;
  
  	while (dev_is_fsl_mc(dma_dev))

dma_dev = dma_dev->parent;
  
-	return of_dma_configure(dev, dma_dev->of_node, 0);

+   fwspec = dev_iommu_fwspec_get(dma_dev);
+   if (!fwspec)
+   return -ENODEV;
+   iommu_ops = iommu_ops_from_fwnode(fwspec->iommu_fwnode);
+   if (!iommu_ops)
+   return -ENODEV;
+
+   ret = iommu_fwspec_init(dev, fwspec->iommu_fwnode, iommu_ops);
+   if (ret)
+   return ret;
+
+   icid = mc_dev->icid;
+   ret = iommu_fwspec_add_ids(dev, &icid, 1);


I see. So with this patch we would use the MC named component only to
retrieve the iommu_ops


Right. I'd also add that the implementation tries to follow the existing
standard .dma_configure implementations, e.g. of_dma_configure +
of_iommu_configure. I'd also note that similarly to the ACPI case, this
MC FW device is probed as a platform device in the DT scenario, binding
here [1].
A similar approach is used for the retrieval of the msi irq domain, see
following patch.


- the stream IDs are injected directly here, bypassing OF/IORT binding
translations altogether.


Actually I've submitted a v2 [2] that calls into .of_xlate() to allow
the smmu driver to do some processing on the raw streamid coming from
the firmware. I have not yet tested this with ACPI but expect it to
work; however, it's debatable how valid this approach is in the context
of ACPI.


Actually, what I think you need is of_map_rid() (and an IORT
equivalent, that I am going to write - generalizing iort_msi_map_rid()).

Would that be enough to enable IORT "normal" mappings in the MC bus
named components ?



At first glance, this looks like it could very well fix the ACPI
scenario, but some points of the approach are unclear to me:
  * are we going to rely in DT and ACPI generic layers even if these
devices are not published / enumerated through DT or ACPI tables?


Assuming you mean the DPRC devices rather than the MC itself, then yes; 
in that sense it's exactly the same as how we treat 
dynamically-discovered PCI devices.



  * the firmware manages and provides discrete streamids for the devices
it exposes so there's no translation involved. There's no
requestor_id / input_id involved but it seems that we would still do
some kind of translation relying for this on the DT/ACPI functions.


Wrong - last time I looked, what that firmware actually manages are 
*ICIDs* for the devices, not SMMU Stream IDs or GIC Device IDs; what 
DT/ACPI specifies is a translation from ICID to Stream ID/Device ID. The 
ICID is very much the requester/input ID for that translation. Yes, in 
practice the "translation" is effectively always a trivial identity 
mapping, but conceptually it most definitely exists. Yes, the subtlety 
is incredibly easy to overlook because it's basically drawing a 
distinction between one end of some wires vs. the other end, but it matters.


(and of course "trivial 1:1 translation" isn't even true in the case of 
SMMU Stream ID values, since IIRC they are really composed of 5 
different inputs, only one of which is (part of) the incoming ICID)
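
(In code terms, that conceptual translation is what of_map_rid() already
walks on the DT side; a sketch of the lookup being generalized, with the
mc_node/ICID plumbing purely illustrative:

	struct device_node *iommu_np;
	u32 sid;
	int err;

	/* translate the device's ICID (its requester/input ID) through the
	 * "iommu-map" property into an SMMU Stream ID */
	err = of_map_rid(mc_node, mc_dev->icid, "iommu-map", "iommu-map-mask",
			 &iommu_np, &sid);
)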



  * MC firmware has its own stream_id (e.g. on some chips 0x4000, others
0xf00, so outside the range of stream_ids used for the mc devices)
while for the devices on this bus, MC allocates stream_ids from a
range (e.g. 0x17 - 0x3f). Is it possible to describe this in the IORT table?


If it represents a unique ICID allocated to the MC itself, then sure, it 
simply goes through the mapping like anything else. Just like a PCI host 
bridge owns requester ID 0:0.0 and thus whatever Stream ID/Device ID 
that 

Re: [RFC PATCH 1/4] bus: fsl-mc: add custom .dma_configure implementation

2020-04-15 Thread Lorenzo Pieralisi
On Wed, Apr 15, 2020 at 06:44:37PM +0300, Laurentiu Tudor wrote:
> 
> 
> On 4/14/2020 5:32 PM, Lorenzo Pieralisi wrote:
> > On Wed, Mar 25, 2020 at 06:48:55PM +0200, Laurentiu Tudor wrote:
> >> Hi Lorenzo,
> >>
> >> On 3/25/2020 2:51 PM, Lorenzo Pieralisi wrote:
> >>> On Thu, Feb 27, 2020 at 12:05:39PM +0200, laurentiu.tu...@nxp.com wrote:
>  From: Laurentiu Tudor 
> 
>  The devices on this bus are not discovered by way of device tree
>  but by queries to the firmware. It makes little sense to trick the
>  generic OF layer into thinking that these devices are OF-related so
>  that we can get our dma configuration. Instead of doing that, add
>  our custom dma configuration implementation.
> 
>  Signed-off-by: Laurentiu Tudor 
>  ---
>   drivers/bus/fsl-mc/fsl-mc-bus.c | 31 ++-
>   1 file changed, 30 insertions(+), 1 deletion(-)
> 
>  diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c 
>  b/drivers/bus/fsl-mc/fsl-mc-bus.c
>  index 36eb25f82c8e..eafaa0e0b906 100644
>  --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
>  +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
>  @@ -132,11 +132,40 @@ static int fsl_mc_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
>   static int fsl_mc_dma_configure(struct device *dev)
>   {
>   struct device *dma_dev = dev;
>  +struct iommu_fwspec *fwspec;
>  +const struct iommu_ops *iommu_ops;
>  +struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
>  +int ret;
>  +u32 icid;
>   
>   while (dev_is_fsl_mc(dma_dev))
>   dma_dev = dma_dev->parent;
>   
>  -return of_dma_configure(dev, dma_dev->of_node, 0);
>  +fwspec = dev_iommu_fwspec_get(dma_dev);
>  +if (!fwspec)
>  +return -ENODEV;
>  +iommu_ops = iommu_ops_from_fwnode(fwspec->iommu_fwnode);
>  +if (!iommu_ops)
>  +return -ENODEV;
>  +
>  +ret = iommu_fwspec_init(dev, fwspec->iommu_fwnode, iommu_ops);
>  +if (ret)
>  +return ret;
>  +
>  +icid = mc_dev->icid;
>  +ret = iommu_fwspec_add_ids(dev, &icid, 1);
> >>>
> >>> I see. So with this patch we would use the MC named component only to
> >>> retrieve the iommu_ops
> >>
> >> Right. I'd also add that the implementation tries to follow the existing
> >> standard .dma_configure implementations, e.g. of_dma_configure +
> >> of_iommu_configure. I'd also note that similarly to the ACPI case, this
> >> MC FW device is probed as a platform device in the DT scenario, binding
> >> here [1].
> >> A similar approach is used for the retrieval of the msi irq domain, see
> >> following patch.
> >>
> >>> - the stream IDs are injected directly here, bypassing OF/IORT binding
> >>> translations altogether. 
> >>
> >> Actually I've submitted a v2 [2] that calls into .of_xlate() to allow
> >> the smmu driver to do some processing on the raw streamid coming from
> >> the firmware. I have not yet tested this with ACPI but expect it to
> >> work; however, it's debatable how valid this approach is in the context
> >> of ACPI.
> > 
> > Actually, what I think you need is of_map_rid() (and an IORT
> > equivalent, that I am going to write - generalizing iort_msi_map_rid()).
> > 
> > Would that be enough to enable IORT "normal" mappings in the MC bus
> > named components ?
> > 
> 
> At first glance, this looks like it could very well fix the ACPI
> scenario, but some points of the approach are unclear to me:
>  * are we going to rely in DT and ACPI generic layers even if these
> devices are not published / enumerated through DT or ACPI tables?
>  * the firmware manages and provides discrete streamids for the devices
> it exposes so there's no translation involved. There's no
>requestor_id / input_id involved but it seems that we would still do
> some kind of translation relying for this on the DT/ACPI functions.
>  * MC firmware has its own stream_id (e.g. on some chips 0x4000, others
> 0xf00, so outside the range of stream_ids used for the mc devices)
>while for the devices on this bus, MC allocates stream_ids from a
> range (e.g. 0x17 - 0x3f). Is it possible to describe this in the IORT table?
>  * Regarding the of_map_rid() use you mentioned, I was planning to
> decouple the mc bus from the DT layer by dropping the use of
> of_map_rid(), see patch 4.
> I briefly glanced over the iort code and spotted this static function:
> iort_iommu_xlate(). Wouldn't it also help, of course after making it public?

Guys, I have honestly lost you. I don't understand what you really need
to do with DT and ACPI here. Are they needed to describe what you need
or not? If the MC dma configure function does not need any DT/ACPI
bindings, that's fine by me; I don't understand, though, why you are still
asking how to represent the MC in ACPI then, what 

Re: [PATCH v11 01/13] iommu: Introduce attach/detach_pasid_table API

2020-04-15 Thread Auger Eric
Hi Jacob,

On 4/15/20 5:59 PM, Jacob Pan wrote:
> On Wed, 15 Apr 2020 16:52:10 +0200
> Auger Eric  wrote:
> 
>> Hi Jacob,
>> On 4/15/20 12:15 AM, Jacob Pan wrote:
>>> Hi Eric,
>>>
>>> There are some discussions about how to size the uAPI data.
>>> https://lkml.org/lkml/2020/4/14/939
>>>
>>> I think the problem with the current scheme is that when uAPI data
>>> gets extended, if VFIO continues to use:
>>>
>>> minsz = offsetofend(struct vfio_iommu_type1_set_pasid_table,
>>> config); if (copy_from_user(, (void __user *)arg, minsz))
>>>
>>> It may copy more data from user than what was setup by the user.
>>>
>>> So, as suggested by Alex, we could add argsz to the IOMMU uAPI
>>> struct. So if argsz > minsz, then fail the attach_table since the
>>> kernel might be old and doesn't know about the extra data.
>>> If argsz <= minsz, kernel can support the attach_table but must
>>> process the data based on flags or config.  
>>
>> So I guess we would need both an argsz __u32 + a new flags __u32, right?
>>
> Yes.
>> I am ok with that idea. Besides, how will you manage existing IOMMU
>> UAPIs?
> I plan to add argsz and flags (if not already present)
> 
>> At some point you envisioned having a getter at the iommu api
>> level to retrieve the size of a structure for a given version, right?
>>
> That idea has been shot down. There is no version-size lookup.
> So the current plan is for the user to fill out argsz in each IOMMU uAPI
> struct. VFIO does the copy_from_user() based on argsz (sanitized
> against the size of the current kernel struct).
> 
> The IOMMU vendor driver processes the data based on flags, which indicate
> new capabilities/extensions.
OK. Sounds sensible

Thanks

Eric
> 
>> Thanks
>>
>> Eric
>>>
>>> Does it make sense to you?
>>>
>>>
>>> On Tue, 14 Apr 2020 17:05:55 +0200
>>> Eric Auger  wrote:
>>>   
 From: Jacob Pan 

 In the virtualization use case, when a guest is assigned
 a PCI host device, protected by a virtual IOMMU on the guest,
 the physical IOMMU must be programmed to be consistent with
 the guest mappings. If the physical IOMMU supports two
 translation stages it makes sense to program guest mappings
 onto the first stage/level (ARM/Intel terminology) while the host
 owns the stage/level 2.

 In that case, it is mandated to trap on guest configuration
 settings and pass those to the physical iommu driver.

 This patch adds a new API to the iommu subsystem that allows
 setting/unsetting the pasid table information.

 A generic iommu_pasid_table_config struct is introduced in
 a new iommu.h uapi header. This is going to be used by the VFIO
 user API.

 Signed-off-by: Jean-Philippe Brucker
  Signed-off-by: Liu, Yi L
  Signed-off-by: Ashok Raj
  Signed-off-by: Jacob Pan
  Signed-off-by: Eric Auger
  Reviewed-by: Jean-Philippe Brucker
  ---
  drivers/iommu/iommu.c  | 19 ++
  include/linux/iommu.h  | 18 ++
  include/uapi/linux/iommu.h | 51 ++
  3 files changed, 88 insertions(+)

 diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
 index 2b471419e26c..b71ad56f8c99 100644
 --- a/drivers/iommu/iommu.c
 +++ b/drivers/iommu/iommu.c
 @@ -1723,6 +1723,25 @@ int iommu_sva_unbind_gpasid(struct iommu_domain *domain, struct device *dev,
  }
  EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid);
  
 +int iommu_attach_pasid_table(struct iommu_domain *domain,
 +   struct iommu_pasid_table_config *cfg)
 +{
 +  if (unlikely(!domain->ops->attach_pasid_table))
 +  return -ENODEV;
 +
 +  return domain->ops->attach_pasid_table(domain, cfg);
 +}
 +EXPORT_SYMBOL_GPL(iommu_attach_pasid_table);
 +
 +void iommu_detach_pasid_table(struct iommu_domain *domain)
 +{
 +  if (unlikely(!domain->ops->detach_pasid_table))
 +  return;
 +
 +  domain->ops->detach_pasid_table(domain);
 +}
 +EXPORT_SYMBOL_GPL(iommu_detach_pasid_table);
 +
  static void __iommu_detach_device(struct iommu_domain *domain,
  struct device *dev)
  {
 diff --git a/include/linux/iommu.h b/include/linux/iommu.h
 index 7ef8b0bda695..3e1057c3585a 100644
 --- a/include/linux/iommu.h
 +++ b/include/linux/iommu.h
 @@ -248,6 +248,8 @@ struct iommu_iotlb_gather {
   * @cache_invalidate: invalidate translation caches
   * @sva_bind_gpasid: bind guest pasid and mm
   * @sva_unbind_gpasid: unbind guest pasid and mm
 + * @attach_pasid_table: attach a pasid table
 + * @detach_pasid_table: detach the pasid table
   * @pgsize_bitmap: bitmap of all possible supported page sizes
   * @owner: Driver module providing these ops
   */
 @@ -307,6 +309,9 @@ struct iommu_ops {
  void *drvdata);
void (*sva_unbind)(struct iommu_sva *handle);

Re: [PATCH v11 01/13] iommu: Introduce attach/detach_pasid_table API

2020-04-15 Thread Jacob Pan
On Wed, 15 Apr 2020 16:52:10 +0200
Auger Eric  wrote:

> Hi Jacob,
> On 4/15/20 12:15 AM, Jacob Pan wrote:
> > Hi Eric,
> > 
> > There are some discussions about how to size the uAPI data.
> > https://lkml.org/lkml/2020/4/14/939
> > 
> > I think the problem with the current scheme is that when uAPI data
> > gets extended, if VFIO continues to use:
> > 
> > minsz = offsetofend(struct vfio_iommu_type1_set_pasid_table,
> > config); if (copy_from_user(, (void __user *)arg, minsz))
> > 
> > It may copy more data from user than what was setup by the user.
> > 
> > So, as suggested by Alex, we could add argsz to the IOMMU uAPI
> > struct. So if argsz > minsz, then fail the attach_table since the
> > kernel might be old and doesn't know about the extra data.
> > If argsz <= minsz, kernel can support the attach_table but must
> > process the data based on flags or config.  
> 
> So I guess we would need both an argsz __u32 + a new flags __u32, right?
> 
Yes.
> I am ok with that idea. Besides, how will you manage existing IOMMU
> UAPIs?
I plan to add argsz and flags (if not already present)

> At some point you envisioned having a getter at the iommu api
> level to retrieve the size of a structure for a given version, right?
> 
That idea has been shot down. There is no version-size lookup.
So the current plan is for the user to fill out argsz in each IOMMU uAPI
struct. VFIO does the copy_from_user() based on argsz (sanitized
against the size of the current kernel struct).

The IOMMU vendor driver processes the data based on flags, which indicate
new capabilities/extensions.

> Thanks
> 
> Eric
> > 
> > Does it make sense to you?
> > 
> > 
> > On Tue, 14 Apr 2020 17:05:55 +0200
> > Eric Auger  wrote:
> >   
> >> From: Jacob Pan 
> >>
> >> In the virtualization use case, when a guest is assigned
> >> a PCI host device, protected by a virtual IOMMU on the guest,
> >> the physical IOMMU must be programmed to be consistent with
> >> the guest mappings. If the physical IOMMU supports two
> >> translation stages it makes sense to program guest mappings
> >> onto the first stage/level (ARM/Intel terminology) while the host
> >> owns the stage/level 2.
> >>
> >> In that case, it is mandated to trap on guest configuration
> >> settings and pass those to the physical iommu driver.
> >>
> >> This patch adds a new API to the iommu subsystem that allows
> >> setting/unsetting the pasid table information.
> >>
> >> A generic iommu_pasid_table_config struct is introduced in
> >> a new iommu.h uapi header. This is going to be used by the VFIO
> >> user API.
> >>
> >> Signed-off-by: Jean-Philippe Brucker
> >>  Signed-off-by: Liu, Yi L
> >>  Signed-off-by: Ashok Raj
> >>  Signed-off-by: Jacob Pan
> >>  Signed-off-by: Eric Auger
> >>  Reviewed-by: Jean-Philippe Brucker
> >>  ---
> >>  drivers/iommu/iommu.c  | 19 ++
> >>  include/linux/iommu.h  | 18 ++
> >>  include/uapi/linux/iommu.h | 51 ++
> >>  3 files changed, 88 insertions(+)
> >>
> >> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> >> index 2b471419e26c..b71ad56f8c99 100644
> >> --- a/drivers/iommu/iommu.c
> >> +++ b/drivers/iommu/iommu.c
> >> @@ -1723,6 +1723,25 @@ int iommu_sva_unbind_gpasid(struct iommu_domain *domain, struct device *dev,
> >>  }
> >>  EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid);
> >>  
> >> +int iommu_attach_pasid_table(struct iommu_domain *domain,
> >> +   struct iommu_pasid_table_config *cfg)
> >> +{
> >> +  if (unlikely(!domain->ops->attach_pasid_table))
> >> +  return -ENODEV;
> >> +
> >> +  return domain->ops->attach_pasid_table(domain, cfg);
> >> +}
> >> +EXPORT_SYMBOL_GPL(iommu_attach_pasid_table);
> >> +
> >> +void iommu_detach_pasid_table(struct iommu_domain *domain)
> >> +{
> >> +  if (unlikely(!domain->ops->detach_pasid_table))
> >> +  return;
> >> +
> >> +  domain->ops->detach_pasid_table(domain);
> >> +}
> >> +EXPORT_SYMBOL_GPL(iommu_detach_pasid_table);
> >> +
> >>  static void __iommu_detach_device(struct iommu_domain *domain,
> >>  struct device *dev)
> >>  {
> >> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> >> index 7ef8b0bda695..3e1057c3585a 100644
> >> --- a/include/linux/iommu.h
> >> +++ b/include/linux/iommu.h
> >> @@ -248,6 +248,8 @@ struct iommu_iotlb_gather {
> >>   * @cache_invalidate: invalidate translation caches
> >>   * @sva_bind_gpasid: bind guest pasid and mm
> >>   * @sva_unbind_gpasid: unbind guest pasid and mm
> >> + * @attach_pasid_table: attach a pasid table
> >> + * @detach_pasid_table: detach the pasid table
> >>   * @pgsize_bitmap: bitmap of all possible supported page sizes
> >>   * @owner: Driver module providing these ops
> >>   */
> >> @@ -307,6 +309,9 @@ struct iommu_ops {
> >>  void *drvdata);
> >>void (*sva_unbind)(struct iommu_sva *handle);
> >>int (*sva_get_pasid)(struct iommu_sva *handle);
> >> +  int (*attach_pasid_table)(struct iommu_domain *domain,

Re: [RFC PATCH 1/4] bus: fsl-mc: add custom .dma_configure implementation

2020-04-15 Thread Laurentiu Tudor



On 4/14/2020 5:32 PM, Lorenzo Pieralisi wrote:
> On Wed, Mar 25, 2020 at 06:48:55PM +0200, Laurentiu Tudor wrote:
>> Hi Lorenzo,
>>
>> On 3/25/2020 2:51 PM, Lorenzo Pieralisi wrote:
>>> On Thu, Feb 27, 2020 at 12:05:39PM +0200, laurentiu.tu...@nxp.com wrote:
 From: Laurentiu Tudor 

 The devices on this bus are not discovered by way of device tree
 but by queries to the firmware. It makes little sense to trick the
 generic OF layer into thinking that these devices are OF-related so
 that we can get our dma configuration. Instead of doing that, add
 our custom dma configuration implementation.

 Signed-off-by: Laurentiu Tudor 
 ---
  drivers/bus/fsl-mc/fsl-mc-bus.c | 31 ++-
  1 file changed, 30 insertions(+), 1 deletion(-)

 diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c 
 b/drivers/bus/fsl-mc/fsl-mc-bus.c
 index 36eb25f82c8e..eafaa0e0b906 100644
 --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
 +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
 @@ -132,11 +132,40 @@ static int fsl_mc_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
  static int fsl_mc_dma_configure(struct device *dev)
  {
struct device *dma_dev = dev;
 +  struct iommu_fwspec *fwspec;
 +  const struct iommu_ops *iommu_ops;
 +  struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
 +  int ret;
 +  u32 icid;
  
while (dev_is_fsl_mc(dma_dev))
dma_dev = dma_dev->parent;
  
 -  return of_dma_configure(dev, dma_dev->of_node, 0);
 +  fwspec = dev_iommu_fwspec_get(dma_dev);
 +  if (!fwspec)
 +  return -ENODEV;
 +  iommu_ops = iommu_ops_from_fwnode(fwspec->iommu_fwnode);
 +  if (!iommu_ops)
 +  return -ENODEV;
 +
 +  ret = iommu_fwspec_init(dev, fwspec->iommu_fwnode, iommu_ops);
 +  if (ret)
 +  return ret;
 +
 +  icid = mc_dev->icid;
 +  ret = iommu_fwspec_add_ids(dev, &icid, 1);
>>>
>>> I see. So with this patch we would use the MC named component only to
>>> retrieve the iommu_ops
>>
>> Right. I'd also add that the implementation tries to follow the existing
>> standard .dma_configure implementations, e.g. of_dma_configure +
>> of_iommu_configure. I'd also note that similarly to the ACPI case, this
>> MC FW device is probed as a platform device in the DT scenario, binding
>> here [1].
>> A similar approach is used for the retrieval of the msi irq domain, see
>> following patch.
>>
> >>> - the stream IDs are injected directly here, bypassing OF/IORT binding
>>> translations altogether. 
>>
>> Actually I've submitted a v2 [2] that calls into .of_xlate() to allow
>> the smmu driver to do some processing on the raw streamid coming from
>> the firmware. I have not yet tested this with ACPI but expect it to
> >> work; however, it's debatable how valid this approach is in the context
>> of ACPI.
> 
> Actually, what I think you need is of_map_rid() (and an IORT
> equivalent, that I am going to write - generalizing iort_msi_map_rid()).
> 
> Would that be enough to enable IORT "normal" mappings in the MC bus
> named components ?
> 

At first glance, this looks like it could very well fix the ACPI
scenario, but some points of the approach are unclear to me:
 * are we going to rely in DT and ACPI generic layers even if these
devices are not published / enumerated through DT or ACPI tables?
 * the firmware manages and provides discrete streamids for the devices
it exposes so there's no translation involved. There's no
   requestor_id / input_id involved but it seems that we would still do
some kind of translation relying for this on the DT/ACPI functions.
 * MC firmware has its own stream_id (e.g. on some chips 0x4000, others
0xf00, so outside the range of stream_ids used for the mc devices)
   while for the devices on this bus, MC allocates stream_ids from a
range (e.g. 0x17 - 0x3f). Is it possible to describe this in the IORT table?
 * Regarding the of_map_rid() use you mentioned, I was planning to
decouple the mc bus from the DT layer by dropping the use of
of_map_rid(), see patch 4.
I briefly glanced over the iort code and spotted this static function:
iort_iommu_xlate(). Wouldn't it also help, of course after making it public?

---
Thanks & Best Regards, Laurentiu



Re: [EXT] Re: [RFC PATCH 1/4] bus: fsl-mc: add custom .dma_configure implementation

2020-04-15 Thread Lorenzo Pieralisi
On Wed, Apr 15, 2020 at 05:42:03AM +, Makarand Pawagi wrote:
> 
> 
> > -Original Message-
> > From: Lorenzo Pieralisi 
> > Sent: Tuesday, April 14, 2020 8:02 PM
> > To: Laurentiu Tudor 
> > Cc: linux-ker...@vger.kernel.org; iommu@lists.linux-foundation.org; 
> > linux-arm-
> > ker...@lists.infradead.org; linux-a...@vger.kernel.org;
> > robin.mur...@arm.com; ard.biesheu...@linaro.org; Ioana Ciornei
> > ; Diana Madalina Craciun (OSS)
> > ; m...@kernel.org; j...@solid-run.com; Pankaj
> > Bansal ; Makarand Pawagi
> > ; Calvin Johnson ;
> > Varun Sethi ; Cristi Sovaiala ;
> > stuart.yo...@arm.com; jeremy.lin...@arm.com; j...@8bytes.org;
> > t...@linutronix.de; ja...@lakedaemon.net
> > Subject: [EXT] Re: [RFC PATCH 1/4] bus: fsl-mc: add custom .dma_configure
> > implementation
> > 
> > 
> > On Wed, Mar 25, 2020 at 06:48:55PM +0200, Laurentiu Tudor wrote:
> > > Hi Lorenzo,
> > >
> > > On 3/25/2020 2:51 PM, Lorenzo Pieralisi wrote:
> > > > On Thu, Feb 27, 2020 at 12:05:39PM +0200, laurentiu.tu...@nxp.com wrote:
> > > >> From: Laurentiu Tudor 
> > > >>
> > > >> The devices on this bus are not discovered by way of device tree
> > > >> but by queries to the firmware. It makes little sense to trick the
> > > >> generic OF layer into thinking that these devices are OF-related so
> > > >> that we can get our dma configuration. Instead of doing that, add
> > > >> our custom dma configuration implementation.
> > > >>
> > > >> Signed-off-by: Laurentiu Tudor 
> > > >> ---
> > > >>  drivers/bus/fsl-mc/fsl-mc-bus.c | 31
> > > >> ++-
> > > >>  1 file changed, 30 insertions(+), 1 deletion(-)
> > > >>
> > > >> diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c
> > > >> b/drivers/bus/fsl-mc/fsl-mc-bus.c index 36eb25f82c8e..eafaa0e0b906
> > > >> 100644
> > > >> --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> > > >> +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> > > >> @@ -132,11 +132,40 @@ static int fsl_mc_bus_uevent(struct device *dev, struct kobj_uevent_env *env)
> > > >>  static int fsl_mc_dma_configure(struct device *dev)
> > > >>  {
> > > >>struct device *dma_dev = dev;
> > > >> +  struct iommu_fwspec *fwspec;
> > > >> +  const struct iommu_ops *iommu_ops;
> > > >> +  struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
> > > >> +  int ret;
> > > >> +  u32 icid;
> > > >>
> > > >>while (dev_is_fsl_mc(dma_dev))
> > > >>dma_dev = dma_dev->parent;
> > > >>
> > > >> -  return of_dma_configure(dev, dma_dev->of_node, 0);
> > > >> +  fwspec = dev_iommu_fwspec_get(dma_dev);
> > > >> +  if (!fwspec)
> > > >> +  return -ENODEV;
> > > >> +  iommu_ops = iommu_ops_from_fwnode(fwspec->iommu_fwnode);
> > > >> +  if (!iommu_ops)
> > > >> +  return -ENODEV;
> > > >> +
> > > >> +  ret = iommu_fwspec_init(dev, fwspec->iommu_fwnode, iommu_ops);
> > > >> +  if (ret)
> > > >> +  return ret;
> > > >> +
> > > >> +  icid = mc_dev->icid;
> > > >> +  ret = iommu_fwspec_add_ids(dev, &icid, 1);
> > > >
> > > > I see. So with this patch we would use the MC named component only
> > > > to retrieve the iommu_ops
> > >
> > > Right. I'd also add that the implementation tries to follow the
> > > existing standard .dma_configure implementations, e.g.
> > > of_dma_configure + of_iommu_configure. I'd also note that similarly to
> > > the ACPI case, this MC FW device is probed as a platform device in the
> > > DT scenario, binding here [1].
> > > A similar approach is used for the retrieval of the msi irq domain,
> > > see following patch.
> > >
> > > > - the stream IDs are injected directly here, bypassing OF/IORT binding
> > translations altogether.
> > >
> > > Actually I've submitted a v2 [2] that calls into .of_xlate() to allow
> > > the smmu driver to do some processing on the raw streamid coming from
> > > the firmware. I have not yet tested this with ACPI but expect it to
> > > work; however, it's debatable how valid this approach is in the
> > > context of ACPI.
> > 
> > Actually, what I think you need is of_map_rid() (and an IORT equivalent, 
> > that I
> > am going to write - generalizing iort_msi_map_rid()).
> > 
> 
> That would help.
> 
> > Would that be enough to enable IORT "normal" mappings in the MC bus named
> > components ?
> > 
> 
> But the question still remains unanswered: how are we going to represent the
> MC? As a platform device with the single ID mapping flag?

No, "normal" mappings, that's what I wrote above and it is not a
platform device it is a named component in ACPI/IORT terms.

Thanks,
Lorenzo


Re: [PATCH v2 1/3] iommu/uapi: Define uapi version and capabilities

2020-04-15 Thread Jacob Pan
On Tue, 14 Apr 2020 23:47:40 +
"Tian, Kevin"  wrote:

> > From: Jacob Pan 
> > Sent: Wednesday, April 15, 2020 6:32 AM
> > 
> > On Tue, 14 Apr 2020 10:13:04 -0700
> > Jacob Pan  wrote:
> >   
> > > > > >  In any of the proposed solutions, the
> > > > > > IOMMU driver is ultimately responsible for validating the
> > > > > > user data, so do we want vfio performing the
> > > > > > copy_from_user() to an object that could later be assumed
> > > > > > to be sanitized, or should vfio just pass a user pointer to
> > > > > > make it obvious that the consumer is responsible for all
> > > > > > the user protections?  Seems like the latter.  
> > > > > I like the latter as well.
> > > > >  
> > On second thought, I think the former is better. Two reasons:
> > 
> > 1. IOMMU API such as page_response is also used in baremetal. So it
> > is not suitable to pass a __user *.
> > https://www.spinics.net/lists/arm-kernel/msg798677.html  
> 
> You can have a wrapped version accepting a __user* and an internal
> version for kernel pointers.
> 
I have thought about that also but the problem is that some of the
flags are processed in the vendor IOMMU ops so it is hard to do that in
a generic wrapper.

> > 
> > 2. Some data are in the mandatory (fixed offset, never removed or
> > extended) portion of the uAPI structure. It is simpler for VFIO to
> > extract that and pass it to IOMMU API. For example, the PASID value
> > used for unbind_gpasid(). VFIO also need to sanitize the PASID
> > value to make sure it belongs to the same VM that did the
> > allocation.  
> 
> I don't think this makes much difference. If you still plan to
> let the IOMMU driver parse some user pointers anyway, why not make a
> clean split and have it parse all IOMMU-specific fields?
>
The plan is not to have IOMMU driver parse user pointers. This is the
"former" case in Alex's comment. I.e. vfio performing the
copy_from_user based on argsz in IOMMU uAPI.
 
> Thanks
> Kevin
> 
> > 
> >   
> > > > > >  That still really
> > > > > > doesn't address what's in that user data blob yet, but the
> > > > > > vfio interface could be:
> > > > > >
> > > > > > struct {
> > > > > > __u32 argsz;
> > > > > > __u32 flags;
> > > > > > __u8  data[];
> > > > > > }
> > > > > >
> > > > > > Where flags might be partitioned like we do for
> > > > > > DEVICE_FEATURE to indicate the format of data and what vfio
> > > > > > should do with it, and data might simply be defined as a
> > > > > > (__u64 __user *). 
> > > > > So, __user * will be passed to the IOMMU driver if VFIO checks
> > > > > that minsz includes flags and they are valid.
> > > > > IOMMU driver can copy the rest based on the mandatory
> > > > > version/minsz and flags in the IOMMU uAPI structs.
> > > > > Does it sound right? This is really choice #2.  
> > > >
> > > > Sounds like each IOMMU UAPI struct just needs to have an
> > > > embedded size and flags field, but yes.
> > > >  
> > > Yes, an argsz field can be added to each UAPI. There are already
> > > flags or the equivalent. IOMMU driver can process the __user *
> > > based on the argsz, flags, check argsz against
> > > offsetofend(iommu_uapi_struct, last_element), etc.;  
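
In other words, roughly (a sketch of the scheme under discussion; the struct
name and the exact checks are illustrative):

	struct iommu_uapi_foo {
		__u32 argsz;	/* user fills in the size of its struct */
		__u32 flags;	/* indicates which extensions are present */
		/* mandatory fields ..., optional extensions appended later */
	};

	minsz = offsetofend(struct iommu_uapi_foo, flags);
	if (copy_from_user(&foo, (void __user *)arg, minsz))
		return -EFAULT;
	if (foo.argsz < minsz || foo.argsz > sizeof(foo))
		return -EINVAL;	/* larger than the kernel struct: kernel too old */
	/* copy the remainder based on foo.argsz/flags before calling the IOMMU API */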

[Jacob Pan]


Re: [patch 5/7] dma-pool: add pool sizes to debugfs

2020-04-15 Thread Tom Lendacky




On 4/14/20 7:04 PM, David Rientjes wrote:

The atomic DMA pools can dynamically expand based on non-blocking
allocations that need to use them.

Export the sizes of each of these pools, in bytes, through debugfs for
measurement.

Suggested-by: Christoph Hellwig 
Signed-off-by: David Rientjes 
---
  kernel/dma/pool.c | 41 +
  1 file changed, 41 insertions(+)

diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c
index cf052314d9e4..3e22022c933b 100644
--- a/kernel/dma/pool.c
+++ b/kernel/dma/pool.c
@@ -2,6 +2,7 @@
  /*
   * Copyright (C) 2020 Google LLC
   */
+#include 
  #include 
  #include 
  #include 
@@ -15,6 +16,11 @@
  static struct gen_pool *atomic_pool_dma __ro_after_init;
  static struct gen_pool *atomic_pool_dma32 __ro_after_init;
  static struct gen_pool *atomic_pool_kernel __ro_after_init;
+#ifdef CONFIG_DEBUG_FS


I don't think you need the #ifdef any more unless you just want to save 
space. All of the debugfs routines have versions for whether 
CONFIG_DEBUG_FS is defined or not.



+static unsigned long pool_size_dma;
+static unsigned long pool_size_dma32;
+static unsigned long pool_size_kernel;
+#endif
  
  #define DEFAULT_DMA_COHERENT_POOL_SIZE  SZ_256K

  static size_t atomic_pool_size = DEFAULT_DMA_COHERENT_POOL_SIZE;
@@ -29,6 +35,38 @@ static int __init early_coherent_pool(char *p)
  }
  early_param("coherent_pool", early_coherent_pool);
  
+#ifdef CONFIG_DEBUG_FS

+static void __init dma_atomic_pool_debugfs_init(void)
+{
+   struct dentry *root;
+
+   root = debugfs_create_dir("dma_pools", NULL);
+   if (IS_ERR_OR_NULL(root))
+   return;


I believe GregKH went through and removed a lot of these error checks (see 
9e3926df8779 ("xgbe: no need to check return value of debugfs_create 
functions") for an example).


Thanks,
Tom


+
+   debugfs_create_ulong("pool_size_dma", 0400, root, _size_dma);
+   debugfs_create_ulong("pool_size_dma32", 0400, root, _size_dma32);
+   debugfs_create_ulong("pool_size_kernel", 0400, root, _size_kernel);
+}
+
+static void dma_atomic_pool_size_add(gfp_t gfp, size_t size)
+{
+   if (gfp & __GFP_DMA)
+   pool_size_dma += size;
+   else if (gfp & __GFP_DMA32)
+   pool_size_dma32 += size;
+   else
+   pool_size_kernel += size;
+}
+#else
+static inline void dma_atomic_pool_debugfs_init(void)
+{
+}
+static inline void dma_atomic_pool_size_add(gfp_t gfp, size_t size)
+{
+}
+#endif /* CONFIG_DEBUG_FS */
+
  static int atomic_pool_expand(struct gen_pool *pool, size_t pool_size,
  gfp_t gfp)
  {
@@ -76,6 +114,7 @@ static int atomic_pool_expand(struct gen_pool *pool, size_t pool_size,
if (ret)
goto encrypt_mapping;
  
+	dma_atomic_pool_size_add(gfp, pool_size);

return 0;
  
  encrypt_mapping:

@@ -160,6 +199,8 @@ static int __init dma_atomic_pool_init(void)
if (!atomic_pool_dma32)
ret = -ENOMEM;
}
+
+   dma_atomic_pool_debugfs_init();
return ret;
  }
  postcore_initcall(dma_atomic_pool_init);




Re: [PATCH 05/10] iommu/ioasid: Create an IOASID set for host SVA use

2020-04-15 Thread Jean-Philippe Brucker
On Mon, Apr 13, 2020 at 03:06:31PM -0700, Jacob Pan wrote:
> > > > But quotas are only necessary for VMs, when the host shares the
> > > > PASID space with them (which isn't a use-case for Arm systems as
> > > > far as I know, each VM gets its own PASID space).  
> > > Is there a host-guest PASID translation? Or is the PASID used by the
> > > VM a physical PASID? When a page request comes in to the SMMU, how
> > > does it know the owner of the PASID if the PASID range can overlap
> > > between host and guest?  
> > 
> > We assign PCI functions to VMs, so Page Requests are routed with
> > RID:PASID, not PASID alone. The SMMU finds the struct device
> > associated with the RID, and submits the fault with
> > iommu_report_device_fault(). If the VF is assigned to a VM, then the
> > page request gets injected into the VM, otherwise it uses the host
> > IOPF handler
> > 
> Got it, VM private PASID space works then.
> For VM, the IOASID search is within the VM ioasid_set.
> For SVA, the IOASID search is within host default set.
> Should be faster than global search once we have per set xarray.
> I guess the PASID table is per VM instead of per RID (device)? Sorry if
> you already answered it before.

The PASID table is per IOMMU domain, so it's closer to per RID than per
VM, unless userspace puts all devices in the same VFIO container (hence in
the same IOMMU domain).
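
(For reference, the fault path described above in sketch form; the RID lookup
helper name is illustrative of what the SMMU driver does internally:

	/* a Page Request arrives carrying RID:PASID */
	master = arm_smmu_find_master(smmu, sid);	/* RID -> struct device */
	ret = iommu_report_device_fault(master->dev, &fault_evt);
	/* if the device is assigned, the fault is injected into the VM;
	 * otherwise the host IOPF handler services it */
)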

Thanks,
Jean


Re: [PATCH v11 01/13] iommu: Introduce attach/detach_pasid_table API

2020-04-15 Thread Auger Eric
Hi Jacob,
On 4/15/20 12:15 AM, Jacob Pan wrote:
> Hi Eric,
> 
> There are some discussions about how to size the uAPI data.
> https://lkml.org/lkml/2020/4/14/939
> 
> I think the problem with the current scheme is that when uAPI data gets
> extended, if VFIO continues to use:
> 
> minsz = offsetofend(struct vfio_iommu_type1_set_pasid_table, config);
> if (copy_from_user(, (void __user *)arg, minsz))
> 
> It may copy more data from user than what was setup by the user.
> 
> So, as suggested by Alex, we could add argsz to the IOMMU uAPI struct.
> So if argsz > minsz, then fail the attach_table since the kernel might be
> old and doesn't know about the extra data.
> If argsz <= minsz, kernel can support the attach_table but must process
> the data based on flags or config.

So I guess we would need both an argsz __u32 + a new flags __u32, right?

I am ok with that idea. Besides, how will you manage existing IOMMU
UAPIs? At some point you envisioned having a getter at the iommu api level
to retrieve the size of a structure for a given version, right?

Thanks

Eric
> 
> Does it make sense to you?
> 
> 
> On Tue, 14 Apr 2020 17:05:55 +0200
> Eric Auger  wrote:
> 
>> From: Jacob Pan 
>>
>> In the virtualization use case, when a guest is assigned
>> a PCI host device, protected by a virtual IOMMU on the guest,
>> the physical IOMMU must be programmed to be consistent with
>> the guest mappings. If the physical IOMMU supports two
>> translation stages it makes sense to program guest mappings
>> onto the first stage/level (ARM/Intel terminology) while the host
>> owns the stage/level 2.
>>
>> In that case, it is mandated to trap on guest configuration
>> settings and pass those to the physical iommu driver.
>>
>> This patch adds a new API to the iommu subsystem that allows
>> setting/unsetting the pasid table information.
>>
>> A generic iommu_pasid_table_config struct is introduced in
>> a new iommu.h uapi header. This is going to be used by the VFIO
>> user API.
>>
>> Signed-off-by: Jean-Philippe Brucker 
>> Signed-off-by: Liu, Yi L 
>> Signed-off-by: Ashok Raj 
>> Signed-off-by: Jacob Pan 
>> Signed-off-by: Eric Auger 
>> Reviewed-by: Jean-Philippe Brucker 
>> ---
>>  drivers/iommu/iommu.c      | 19 ++++++++
>>  include/linux/iommu.h      | 18 ++++++++
>>  include/uapi/linux/iommu.h | 51 ++++++++++++++
>>  3 files changed, 88 insertions(+)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 2b471419e26c..b71ad56f8c99 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -1723,6 +1723,25 @@ int iommu_sva_unbind_gpasid(struct iommu_domain *domain, struct device *dev,
>>  }
>>  EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid);
>>  
>> +int iommu_attach_pasid_table(struct iommu_domain *domain,
>> + struct iommu_pasid_table_config *cfg)
>> +{
>> +if (unlikely(!domain->ops->attach_pasid_table))
>> +return -ENODEV;
>> +
>> +return domain->ops->attach_pasid_table(domain, cfg);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_attach_pasid_table);
>> +
>> +void iommu_detach_pasid_table(struct iommu_domain *domain)
>> +{
>> +if (unlikely(!domain->ops->detach_pasid_table))
>> +return;
>> +
>> +domain->ops->detach_pasid_table(domain);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_detach_pasid_table);
>> +
>>  static void __iommu_detach_device(struct iommu_domain *domain,
>>  struct device *dev)
>>  {
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 7ef8b0bda695..3e1057c3585a 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -248,6 +248,8 @@ struct iommu_iotlb_gather {
>>   * @cache_invalidate: invalidate translation caches
>>   * @sva_bind_gpasid: bind guest pasid and mm
>>   * @sva_unbind_gpasid: unbind guest pasid and mm
>> + * @attach_pasid_table: attach a pasid table
>> + * @detach_pasid_table: detach the pasid table
>>   * @pgsize_bitmap: bitmap of all possible supported page sizes
>>   * @owner: Driver module providing these ops
>>   */
>> @@ -307,6 +309,9 @@ struct iommu_ops {
>>void *drvdata);
>>  void (*sva_unbind)(struct iommu_sva *handle);
>>  int (*sva_get_pasid)(struct iommu_sva *handle);
>> +int (*attach_pasid_table)(struct iommu_domain *domain,
>> +  struct iommu_pasid_table_config *cfg);
>> +void (*detach_pasid_table)(struct iommu_domain *domain);
>>  
>>  int (*page_response)(struct device *dev,
>>   struct iommu_fault_event *evt,
>> @@ -446,6 +451,9 @@ extern int iommu_sva_bind_gpasid(struct iommu_domain *domain,
>>  struct device *dev, struct iommu_gpasid_bind_data *data);
>>  extern int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
>>  struct device *dev, ioasid_t pasid);
>> +extern int iommu_attach_pasid_table(struct iommu_domain *domain,
>> +  struct iommu_pasid_table_config *cfg);
>> +extern 

Re: [PATCH v2] dma: Fix max PFN arithmetic overflow on 32 bit systems

2020-04-15 Thread Alexander Dahl
Hello,

Now that v5.7-rc1 is out, I would kindly ask if anyone has had time to
review this one-line patch. Is anything wrong with that fix?

(I added the current fli4l kernel package maintainer Florian to Cc to
let him know I'm still keeping an eye on this.)

Greets
Alex

On Sat, Mar 21, 2020 at 07:28:23PM +0100, Alexander Dahl wrote:
> For ARCH=x86 (32 bit) when you set CONFIG_IOMMU_INTEL since c5a5dc4cbbf4
> ("iommu/vt-d: Don't switch off swiotlb if bounce page is used") there's
> a dependency on CONFIG_SWIOTLB, which was not necessarily active before.
> 
> The init code for swiotlb in 'pci_swiotlb_detect_4gb()' compares the
> maximum possible PFN against MAX_DMA32_PFN to decide if it should be active.
> However that define suffers from an arithmetic overflow since
> 1b7e03ef7570 ("x86, NUMA: Enable emulation on 32bit too") when it was
> first made visible to x86_32.
> 
> The effect is that at boot time 64 MiB (the default size) are now
> allocated for bounce buffers, which is a noticeable amount of memory on
> small systems. We noticed this effect on the fli4l Linux distribution when
> migrating from kernel v4.19 (LTS) to v5.4 (LTS) on boards like pcengines
> ALIX 2D3 with 256 MiB memory for example:
> 
>   Linux version 5.4.22 (buildroot@buildroot) (gcc version 7.3.0 (Buildroot 
> 2018.02.8)) #1 SMP Mon Nov 26 23:40:00 CET 2018
>   …
>   Memory: 183484K/261756K available (4594K kernel code, 393K rwdata, 1660K 
> rodata, 536K init, 456K bss , 78272K reserved, 0K cma-reserved, 0K highmem)
>   …
>   PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
>   software IO TLB: mapped [mem 0x0bb78000-0x0fb78000] (64MB)
> 
> The initial analysis and the suggested fix were done by user 'sourcejedi'
> at stackoverflow and explicitly marked as GPLv2 for inclusion in the
> Linux kernel:
> 
>   https://unix.stackexchange.com/a/520525/50007
> 
> The actual calculation, however, is now the same as for arch/mips,
> as suggested by Robin Murphy.
> 
> Fixes: https://web.nettworks.org/bugs/browse/FFL-2560
> Fixes: https://unix.stackexchange.com/q/520065/50007
> Reported-by: Alan Jenkins 
> Suggested-by: Robin Murphy 
> Signed-off-by: Alexander Dahl 
> ---
> 
> Notes:
> v1 -> v2:
>   - use the same calculation as with arch/mips (Robin Murphy)
> 
>  arch/x86/include/asm/dma.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/dma.h b/arch/x86/include/asm/dma.h
> index 00f7cf45e699..8e95aa4b0d17 100644
> --- a/arch/x86/include/asm/dma.h
> +++ b/arch/x86/include/asm/dma.h
> @@ -74,7 +74,7 @@
>  #define MAX_DMA_PFN   ((16UL * 1024 * 1024) >> PAGE_SHIFT)
>  
>  /* 4GB broken PCI/AGP hardware bus master zone */
> -#define MAX_DMA32_PFN ((4UL * 1024 * 1024 * 1024) >> PAGE_SHIFT)
> +#define MAX_DMA32_PFN (1UL << (32 - PAGE_SHIFT))
>  
>  #ifdef CONFIG_X86_32
>  /* The maximum address that we can perform a DMA transfer to on this 
> platform */
> -- 
> 2.20.1
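
[Editorial note: the overflow is easy to demonstrate in plain C by
modelling a 32-bit unsigned long with uint32_t; PAGE_SHIFT == 12 is
assumed.]

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		const unsigned int page_shift = 12;
		/* Old expression: 4UL * 1024 * 1024 * 1024 wraps to 0 when
		 * unsigned long is 32 bits wide, so the PFN limit is 0 too. */
		uint32_t old_pfn = (uint32_t)(4ULL * 1024 * 1024 * 1024) >> page_shift;
		/* New expression: shift the PFN count, not the byte count. */
		uint32_t new_pfn = (uint32_t)1 << (32 - page_shift);

		printf("old=%#x new=%#x\n", (unsigned)old_pfn, (unsigned)new_pfn);
		return 0;	/* prints old=0 new=0x100000 */
	}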
> 

-- 
/"\ ASCII RIBBON | »With the first link, the chain is forged. The first
\ / CAMPAIGN | speech censured, the first thought forbidden, the
 X  AGAINST  | first freedom denied, chains us all irrevocably.«
/ \ HTML MAIL| (Jean-Luc Picard, quoting Judge Aaron Satie)



Re: [PATCH] dt-bndings: iommu: renesas, ipmmu-vmsa: convert to json-schema

2020-04-15 Thread Geert Uytterhoeven
Hi Shimoda-san,

On Tue, Apr 14, 2020 at 2:26 AM Yoshihiro Shimoda
 wrote:
> Convert Renesas VMSA-Compatible IOMMU bindings documentation
> to json-schema.
>
> Signed-off-by: Yoshihiro Shimoda 

> --- /dev/null
> +++ b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
> @@ -0,0 +1,90 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/iommu/renesas,ipmmu-vmsa.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Renesas VMSA-Compatible IOMMU
> +
> +maintainers:
> +  - Yoshihiro Shimoda 
> +
> +description:
> +  The IPMMU is an IOMMU implementation compatible with the ARM VMSA page 
> tables.
> +  It provides address translation for bus masters outside of the CPU, each
> +  connected to the IPMMU through a port called micro-TLB.
> +
> +properties:
> +  compatible:
> +oneOf:
> +  - items:
> +  - enum:
> +  - renesas,ipmmu-r8a7743  # RZ/G1M
> +  - renesas,ipmmu-r8a7744  # RZ/G1N
> +  - renesas,ipmmu-r8a7745  # RZ/G1E
> +  - renesas,ipmmu-r8a7790  # R-Car H2
> +  - renesas,ipmmu-r8a7791  # R-Car M2-W
> +  - renesas,ipmmu-r8a7793  # R-Car M2-N
> +  - renesas,ipmmu-r8a7794  # R-Car E2
> +  - renesas,ipmmu-r8a7795  # R-Car H3
> +  - const: renesas,ipmmu-vmsa  # R-Car Gen2 or RZ/G1
> +  - items:
> +  - enum:
> +  - renesas,ipmmu-r8a73a4  # R-Mobile APE6

I believe the R-Mobile APE6 IPMMU is similar to the R-Car Gen2 IPMMU,
and thus belongs in the section above instead.

> +  - renesas,ipmmu-r8a774a1 # RZ/G2M
> +  - renesas,ipmmu-r8a774b1 # RZ/G2N
> +  - renesas,ipmmu-r8a774c0 # RZ/G2E
> +  - renesas,ipmmu-r8a7796  # R-Car M3-W
> +  - renesas,ipmmu-r8a77965 # R-Car M3-N
> +  - renesas,ipmmu-r8a77970 # R-Car V3M
> +  - renesas,ipmmu-r8a77980 # R-Car V3H
> +  - renesas,ipmmu-r8a77990 # R-Car E3
> +  - renesas,ipmmu-r8a77995 # R-Car D3
> +
> +  reg:
> +maxItems: 1
> +
> +  interrupts:
> +minItems: 1
> +maxItems: 2
> +description:
> +  Specifiers for the MMU fault interrupts. For instances that support
> +  secure mode two interrupts must be specified, for non-secure and secure
> +  mode, in that order. For instances that don't support secure mode a
> +  single interrupt must be specified. Not required for cache IPMMUs.

items:
  - description: 
  - description: 

> +
> +  '#iommu-cells':
> +const: 1
> +
> +  power-domains:
> +maxItems: 1
> +
> +  renesas,ipmmu-main:
> +$ref: /schemas/types.yaml#/definitions/phandle-array
> +description:
> +  Reference to the main IPMMU instance in two cells. The first cell is
> +  a phandle to the main IPMMU and the second cell is the interrupt bit
> +  number associated with the particular cache IPMMU device. The interrupt
> +  bit number needs to match the main IPMMU IMSSTR register. Only used by
> +  cache IPMMU instances.

This property is not valid on R-Car Gen2 and R-Mobile APE6.

(untested)

oneOf:
  - properties:
      compatible:
        contains:
          const: renesas,ipmmu-vmsa
  - properties:
      renesas,ipmmu-main:
        $ref: /schemas/types.yaml#/definitions/phandle-array
        description:
          [...]

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v2 07/33] iommu: Add probe_device() and remove_device() call-backs

2020-04-15 Thread Joerg Roedel
On Wed, Apr 15, 2020 at 02:36:20PM +0800, Lu Baolu wrote:
> On 2020/4/14 21:15, Joerg Roedel wrote:
> > From: Joerg Roedel
> > 
> > Add call-backs to 'struct iommu_ops' as an alternative to the
> > add_device() and remove_device() call-backs, which will be removed when
> > all drivers are converted.
> > 
> > The new call-backs will not set up IOMMU groups and domains anymore,
> > so also add a probe_finalize() call-back where the IOMMU driver can do
> > per-device setup work which requires the device to be set up with a
> > group and a domain.
> 
> The subject is inaccurate. probe_device() and release_device() are
> added to replace the add and remove pair.

This patch does not replace them yet, it just adds the new call-backs.
The removal of add_device()/remove_device() happens later in the
patch-set when all drivers are converted.

Regards,

Joerg



Re: [PATCH v2 13/33] iommu: Export bus_iommu_probe() and make it safe for re-probing

2020-04-15 Thread Joerg Roedel
Hi Baolu,

On Wed, Apr 15, 2020 at 02:10:03PM +0800, Lu Baolu wrote:
> On 2020/4/14 21:15, Joerg Roedel wrote:
> > + /* Device is probed already if in a group */
> > + if (iommu_group_get(dev) != NULL)
> 
> Same as
>   if (iommu_group_get(dev))
> ?
> 
> By the way, do we need to put the group if the device has already been
> probed?

Right, fixed both, thank you.
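
[Editorial sketch of what the fixed check could look like; hypothetical,
not necessarily the final code:]

	struct iommu_group *group;

	/* Device is probed already if in a group. iommu_group_get()
	 * takes a reference, so drop it again before returning. */
	group = iommu_group_get(dev);
	if (group) {
		iommu_group_put(group);
		return 0;
	}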


Regards,

Joerg


[PATCH AUTOSEL 4.4 14/14] iommu/amd: Fix the configuration of GCR3 table root pointer

2020-04-15 Thread Sasha Levin
From: Adrian Huang 

[ Upstream commit c20f36534666e37858a14e591114d93cc1be0d34 ]

The mask for the SPA of the GCR3 table root pointer [51:31] only covers
20 bits, but the field requires 21 bits (please see the AMD IOMMU
specification). This leads to a potential failure when bit 51 of the SPA
of the GCR3 table root pointer is 1'b1.

Signed-off-by: Adrian Huang 
Fixes: 52815b75682e2 ("iommu/amd: Add support for IOMMUv2 domain mode")
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index b08cf57bf4554..695d4e235438c 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -303,7 +303,7 @@
 
 #define DTE_GCR3_VAL_A(x)  (((x) >> 12) & 0x00007ULL)
 #define DTE_GCR3_VAL_B(x)  (((x) >> 15) & 0x0ffffULL)
-#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0xfffffULL)
+#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0x1fffffULL)
 
 #define DTE_GCR3_INDEX_A   0
 #define DTE_GCR3_INDEX_B   1
-- 
2.20.1
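
[Editorial sanity check of the field width, as standalone C11;
GENMASK_ULL is redefined locally for illustration:]

	#define GENMASK_ULL(h, l) \
		(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

	/* Bits [51:31] span 51 - 31 + 1 = 21 bits, so extracting them
	 * needs a 21-bit mask: 0x1fffff, not the 20-bit 0xfffff. */
	_Static_assert(GENMASK_ULL(51, 31) >> 31 == 0x1fffffULL,
		       "GCR3 root pointer field C is 21 bits wide");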



[PATCH AUTOSEL 4.9 21/21] iommu/amd: Fix the configuration of GCR3 table root pointer

2020-04-15 Thread Sasha Levin
From: Adrian Huang 

[ Upstream commit c20f36534666e37858a14e591114d93cc1be0d34 ]

The mask for the SPA of the GCR3 table root pointer [51:31] only covers
20 bits, but the field requires 21 bits (please see the AMD IOMMU
specification). This leads to a potential failure when bit 51 of the SPA
of the GCR3 table root pointer is 1'b1.

Signed-off-by: Adrian Huang 
Fixes: 52815b75682e2 ("iommu/amd: Add support for IOMMUv2 domain mode")
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 0d91785ebdc34..da3fbf82d1cf4 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -329,7 +329,7 @@
 
 #define DTE_GCR3_VAL_A(x)  (((x) >> 12) & 0x00007ULL)
 #define DTE_GCR3_VAL_B(x)  (((x) >> 15) & 0x0ffffULL)
-#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0xfffffULL)
+#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0x1fffffULL)
 
 #define DTE_GCR3_INDEX_A   0
 #define DTE_GCR3_INDEX_B   1
-- 
2.20.1



[PATCH AUTOSEL 4.14 26/30] iommu/vt-d: Fix mm reference leak

2020-04-15 Thread Sasha Levin
From: Jacob Pan 

[ Upstream commit 902baf61adf6b187f0a6b789e70d788ea71ff5bc ]

Move canonical address check before mmget_not_zero() to avoid mm
reference leak.

Fixes: 9d8c3af31607 ("iommu/vt-d: IOMMU Page Request needs to check if address 
is canonical.")
Signed-off-by: Jacob Pan 
Acked-by: Lu Baolu 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-svm.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index f5573bb9f450e..837459762eb39 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -613,14 +613,15 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 * any faults on kernel addresses. */
if (!svm->mm)
goto bad_req;
-   /* If the mm is already defunct, don't handle faults. */
-   if (!mmget_not_zero(svm->mm))
-   goto bad_req;
 
/* If address is not canonical, return invalid response */
if (!is_canonical_address(address))
goto bad_req;
 
+   /* If the mm is already defunct, don't handle faults. */
+   if (!mmget_not_zero(svm->mm))
+   goto bad_req;
+
down_read(&svm->mm->mmap_sem);
vma = find_extend_vma(svm->mm, address);
if (!vma || address < vma->vm_start)
-- 
2.20.1



[PATCH AUTOSEL 4.14 30/30] iommu/amd: Fix the configuration of GCR3 table root pointer

2020-04-15 Thread Sasha Levin
From: Adrian Huang 

[ Upstream commit c20f36534666e37858a14e591114d93cc1be0d34 ]

The mask for the SPA of the GCR3 table root pointer [51:31] only covers
20 bits, but the field requires 21 bits (please see the AMD IOMMU
specification). This leads to a potential failure when bit 51 of the SPA
of the GCR3 table root pointer is 1'b1.

Signed-off-by: Adrian Huang 
Fixes: 52815b75682e2 ("iommu/amd: Add support for IOMMUv2 domain mode")
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 3054c0971759f..74c8638aac2b9 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -348,7 +348,7 @@
 
 #define DTE_GCR3_VAL_A(x)  (((x) >> 12) & 0x00007ULL)
 #define DTE_GCR3_VAL_B(x)  (((x) >> 15) & 0x0ffffULL)
-#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0xfffffULL)
+#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0x1fffffULL)
 
 #define DTE_GCR3_INDEX_A   0
 #define DTE_GCR3_INDEX_B   1
-- 
2.20.1



[PATCH AUTOSEL 4.19 39/40] iommu/amd: Fix the configuration of GCR3 table root pointer

2020-04-15 Thread Sasha Levin
From: Adrian Huang 

[ Upstream commit c20f36534666e37858a14e591114d93cc1be0d34 ]

The mask for the SPA of the GCR3 table root pointer [51:31] only covers
20 bits, but the field requires 21 bits (please see the AMD IOMMU
specification). This leads to a potential failure when bit 51 of the SPA
of the GCR3 table root pointer is 1'b1.

Signed-off-by: Adrian Huang 
Fixes: 52815b75682e2 ("iommu/amd: Add support for IOMMUv2 domain mode")
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 69f3d4c95b530..859b06424e5c4 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -352,7 +352,7 @@
 
 #define DTE_GCR3_VAL_A(x)  (((x) >> 12) & 0x00007ULL)
 #define DTE_GCR3_VAL_B(x)  (((x) >> 15) & 0x0ffffULL)
-#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0xfffffULL)
+#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0x1fffffULL)
 
 #define DTE_GCR3_INDEX_A   0
 #define DTE_GCR3_INDEX_B   1
-- 
2.20.1



[PATCH AUTOSEL 5.4 70/84] iommu/vt-d: Fix mm reference leak

2020-04-15 Thread Sasha Levin
From: Jacob Pan 

[ Upstream commit 902baf61adf6b187f0a6b789e70d788ea71ff5bc ]

Move canonical address check before mmget_not_zero() to avoid mm
reference leak.

Fixes: 9d8c3af31607 ("iommu/vt-d: IOMMU Page Request needs to check if address 
is canonical.")
Signed-off-by: Jacob Pan 
Acked-by: Lu Baolu 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-svm.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 518d0b2d12afd..3020506180c10 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -583,14 +583,15 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 * any faults on kernel addresses. */
if (!svm->mm)
goto bad_req;
-   /* If the mm is already defunct, don't handle faults. */
-   if (!mmget_not_zero(svm->mm))
-   goto bad_req;
 
/* If address is not canonical, return invalid response */
if (!is_canonical_address(address))
goto bad_req;
 
+   /* If the mm is already defunct, don't handle faults. */
+   if (!mmget_not_zero(svm->mm))
+   goto bad_req;
+
down_read(&svm->mm->mmap_sem);
vma = find_extend_vma(svm->mm, address);
if (!vma || address < vma->vm_start)
-- 
2.20.1



[PATCH AUTOSEL 5.4 74/84] iommu/vt-d: Fix page request descriptor size

2020-04-15 Thread Sasha Levin
From: Jacob Pan 

[ Upstream commit 52355fb1919ef7ed9a38e0f3de6e928de1f57217 ]

Intel VT-d might support PRS (Page Request Support) when it's
running in the scalable mode. Each page request descriptor
occupies 32 bytes and is 32-byte aligned. The page request
descriptor offset mask should therefore be 32-byte aligned.

Fixes: 5b438f4ba315d ("iommu/vt-d: Support page request in scalable mode")
Signed-off-by: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 3020506180c10..1d3816cd65d57 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -502,7 +502,7 @@ struct page_req_dsc {
u64 priv_data[2];
 };
 
-#define PRQ_RING_MASK ((0x1000 << PRQ_ORDER) - 0x10)
+#define PRQ_RING_MASK  ((0x1000 << PRQ_ORDER) - 0x20)
 
 static bool access_error(struct vm_area_struct *vma, struct page_req_dsc *req)
 {
-- 
2.20.1
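
[Editorial check of the arithmetic, assuming PRQ_ORDER is 0 as in the
driver: the queue is then 0x1000 bytes and descriptors are 32 bytes.]

	#define PRQ_ORDER	0
	#define PRQ_RING_MASK	((0x1000 << PRQ_ORDER) - 0x20)

	/* 0x1000 - 0x20 = 0xfe0, the last valid 32-byte-aligned offset.
	 * The old "- 0x10" yielded 0xff0, which is only 16-byte aligned
	 * and points into the middle of the last descriptor. */
	_Static_assert((PRQ_RING_MASK & 0x1f) == 0,
		       "ring mask keeps offsets 32-byte aligned");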



[PATCH AUTOSEL 5.4 73/84] iommu/vt-d: Silence RCU-list debugging warning in dmar_find_atsr()

2020-04-15 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit c6f4ebdeba4cff590594df931ff1ee610c426431 ]

dmar_find_atsr() calls list_for_each_entry_rcu() outside of an RCU read
side critical section but with dmar_global_lock held. Silence this
false positive.

 drivers/iommu/intel-iommu.c:4504 RCU-list traversed in non-reader section!!
 1 lock held by swapper/0/1:
 #0: ffffffff9755bee8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x1a6/0xe19

 Call Trace:
  dump_stack+0xa4/0xfe
  lockdep_rcu_suspicious+0xeb/0xf5
  dmar_find_atsr+0x1ab/0x1c0
  dmar_parse_one_atsr+0x64/0x220
  dmar_walk_remapping_entries+0x130/0x380
  dmar_table_init+0x166/0x243
  intel_iommu_init+0x1ab/0xe19
  pci_iommu_init+0x1a/0x44
  do_one_initcall+0xae/0x4d0
  kernel_init_freeable+0x412/0x4c5
  kernel_init+0x19/0x193

Signed-off-by: Qian Cai 
Acked-by: Lu Baolu 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9d47b227e5578..86b1fecdd6865 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4346,7 +4346,8 @@ static struct dmar_atsr_unit *dmar_find_atsr(struct acpi_dmar_atsr *atsr)
struct dmar_atsr_unit *atsru;
struct acpi_dmar_atsr *tmp;
 
-   list_for_each_entry_rcu(atsru, &dmar_atsr_units, list) {
+   list_for_each_entry_rcu(atsru, &dmar_atsr_units, list,
+   dmar_rcu_check()) {
tmp = (struct acpi_dmar_atsr *)atsru->hdr;
if (atsr->segment != tmp->segment)
continue;
-- 
2.20.1
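
[Editorial note: list_for_each_entry_rcu() accepts an optional lockdep
condition describing an alternative protection to rcu_read_lock(). A
generic illustration, not the actual dmar_rcu_check() definition:]

	/* Traversal is safe either under RCU or while holding my_lock. */
	list_for_each_entry_rcu(pos, &my_list, list,
				lockdep_is_held(&my_lock)) {
		/* ... */
	}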



[PATCH AUTOSEL 4.19 33/40] iommu/vt-d: Fix mm reference leak

2020-04-15 Thread Sasha Levin
From: Jacob Pan 

[ Upstream commit 902baf61adf6b187f0a6b789e70d788ea71ff5bc ]

Move canonical address check before mmget_not_zero() to avoid mm
reference leak.

Fixes: 9d8c3af31607 ("iommu/vt-d: IOMMU Page Request needs to check if address 
is canonical.")
Signed-off-by: Jacob Pan 
Acked-by: Lu Baolu 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-svm.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 5944d3b4dca37..ef3aadec980ee 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -620,14 +620,15 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 * any faults on kernel addresses. */
if (!svm->mm)
goto bad_req;
-   /* If the mm is already defunct, don't handle faults. */
-   if (!mmget_not_zero(svm->mm))
-   goto bad_req;
 
/* If address is not canonical, return invalid response */
if (!is_canonical_address(address))
goto bad_req;
 
+   /* If the mm is already defunct, don't handle faults. */
+   if (!mmget_not_zero(svm->mm))
+   goto bad_req;
+
down_read(&svm->mm->mmap_sem);
vma = find_extend_vma(svm->mm, address);
if (!vma || address < vma->vm_start)
-- 
2.20.1



[PATCH AUTOSEL 5.4 83/84] iommu/amd: Fix the configuration of GCR3 table root pointer

2020-04-15 Thread Sasha Levin
From: Adrian Huang 

[ Upstream commit c20f36534666e37858a14e591114d93cc1be0d34 ]

The mask for the SPA of the GCR3 table root pointer [51:31] only covers
20 bits, but the field requires 21 bits (please see the AMD IOMMU
specification). This leads to a potential failure when bit 51 of the SPA
of the GCR3 table root pointer is 1'b1.

Signed-off-by: Adrian Huang 
Fixes: 52815b75682e2 ("iommu/amd: Add support for IOMMUv2 domain mode")
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index daeabd98c60e2..0679896b9e2e1 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -348,7 +348,7 @@
 
 #define DTE_GCR3_VAL_A(x)  (((x) >> 12) & 0x00007ULL)
 #define DTE_GCR3_VAL_B(x)  (((x) >> 15) & 0x0ffffULL)
-#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0xfffffULL)
+#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0x1fffffULL)
 
 #define DTE_GCR3_INDEX_A   0
 #define DTE_GCR3_INDEX_B   1
-- 
2.20.1



[PATCH AUTOSEL 5.4 69/84] iommu/virtio: Fix freeing of incomplete domains

2020-04-15 Thread Sasha Levin
From: Jean-Philippe Brucker 

[ Upstream commit 7062af3ed2ba451029e3733d9f677c68f5ea9e77 ]

Calling viommu_domain_free() on a domain that hasn't been finalised (not
attached to any device, for example) can currently cause an Oops,
because we attempt to call ida_free() on ID 0, which may either be
unallocated or used by another domain.

Only initialise the vdomain->viommu pointer, which denotes a finalised
domain, at the end of a successful viommu_domain_finalise().

Fixes: edcd69ab9a32 ("iommu: Add virtio-iommu driver")
Reported-by: Eric Auger 
Signed-off-by: Jean-Philippe Brucker 
Reviewed-by: Robin Murphy 
Link: 
https://lore.kernel.org/r/20200326093558.2641019-3-jean-phili...@linaro.org
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/virtio-iommu.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 3ea9d76829995..6c340a4f4fd28 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -614,18 +614,20 @@ static int viommu_domain_finalise(struct viommu_dev *viommu,
int ret;
struct viommu_domain *vdomain = to_viommu_domain(domain);
 
-   vdomain->viommu = viommu;
-   vdomain->map_flags  = viommu->map_flags;
+   ret = ida_alloc_range(&viommu->domain_ids, viommu->first_domain,
+ viommu->last_domain, GFP_KERNEL);
+   if (ret < 0)
+   return ret;
+
+   vdomain->id = (unsigned int)ret;
 
domain->pgsize_bitmap   = viommu->pgsize_bitmap;
domain->geometry= viommu->geometry;
 
-   ret = ida_alloc_range(&viommu->domain_ids, viommu->first_domain,
- viommu->last_domain, GFP_KERNEL);
-   if (ret >= 0)
-   vdomain->id = (unsigned int)ret;
+   vdomain->map_flags  = viommu->map_flags;
+   vdomain->viommu = viommu;
 
-   return ret > 0 ? 0 : ret;
+   return 0;
 }
 
 static void viommu_domain_free(struct iommu_domain *domain)
-- 
2.20.1
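
[Editorial aside on why the ordering matters: the free path uses
vdomain->viommu as the "finalised" marker, so it must only be set once
every allocation in finalise() has succeeded. A sketch of that free
path, simplified from the driver:]

	static void viommu_domain_free(struct iommu_domain *domain)
	{
		struct viommu_domain *vdomain = to_viommu_domain(domain);

		/* Only finalised domains own an ID in the IDA. */
		if (vdomain->viommu)
			ida_free(&vdomain->viommu->domain_ids, vdomain->id);

		kfree(vdomain);
	}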



[PATCH AUTOSEL 5.5 089/106] iommu/vt-d: Fix mm reference leak

2020-04-15 Thread Sasha Levin
From: Jacob Pan 

[ Upstream commit 902baf61adf6b187f0a6b789e70d788ea71ff5bc ]

Move canonical address check before mmget_not_zero() to avoid mm
reference leak.

Fixes: 9d8c3af31607 ("iommu/vt-d: IOMMU Page Request needs to check if address 
is canonical.")
Signed-off-by: Jacob Pan 
Acked-by: Lu Baolu 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-svm.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 518d0b2d12afd..3020506180c10 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -583,14 +583,15 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 * any faults on kernel addresses. */
if (!svm->mm)
goto bad_req;
-   /* If the mm is already defunct, don't handle faults. */
-   if (!mmget_not_zero(svm->mm))
-   goto bad_req;
 
/* If address is not canonical, return invalid response */
if (!is_canonical_address(address))
goto bad_req;
 
+   /* If the mm is already defunct, don't handle faults. */
+   if (!mmget_not_zero(svm->mm))
+   goto bad_req;
+
down_read(&svm->mm->mmap_sem);
vma = find_extend_vma(svm->mm, address);
if (!vma || address < vma->vm_start)
-- 
2.20.1



[PATCH AUTOSEL 5.4 15/84] dma-coherent: fix integer overflow in the reserved-memory dma allocation

2020-04-15 Thread Sasha Levin
From: Kevin Grandemange 

[ Upstream commit 286c21de32b904131f8cf6a36ce40b8b0c9c5da3 ]

pageno is an int, so the PAGE_SHIFT shift is done on an int and
overflows if the memory is bigger than 2G.

This can be reproduced using for example a reserved-memory of 4G

reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;

reserved_dma: buffer@0 {
compatible = "shared-dma-pool";
no-map;
reg = <0x5 0x00000000 0x1 0x0>;
};
};

Signed-off-by: Kevin Grandemange 
Signed-off-by: Christoph Hellwig 
Signed-off-by: Sasha Levin 
---
 kernel/dma/coherent.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 551b0eb7028a3..2a0c4985f38e4 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -134,7 +134,7 @@ static void *__dma_alloc_from_coherent(struct device *dev,
 
spin_lock_irqsave(&mem->spinlock, flags);
 
-   if (unlikely(size > (mem->size << PAGE_SHIFT)))
+   if (unlikely(size > ((dma_addr_t)mem->size << PAGE_SHIFT)))
goto err;
 
pageno = bitmap_find_free_region(mem->bitmap, mem->size, order);
@@ -144,8 +144,9 @@ static void *__dma_alloc_from_coherent(struct device *dev,
/*
 * Memory was found in the coherent area.
 */
-   *dma_handle = dma_get_device_base(dev, mem) + (pageno << PAGE_SHIFT);
-   ret = mem->virt_base + (pageno << PAGE_SHIFT);
+   *dma_handle = dma_get_device_base(dev, mem) +
+   ((dma_addr_t)pageno << PAGE_SHIFT);
+   ret = mem->virt_base + ((dma_addr_t)pageno << PAGE_SHIFT);
spin_unlock_irqrestore(&mem->spinlock, flags);
memset(ret, 0, size);
return ret;
@@ -194,7 +195,7 @@ static int __dma_release_from_coherent(struct dma_coherent_mem *mem,
   int order, void *vaddr)
 {
if (mem && vaddr >= mem->virt_base && vaddr <
-  (mem->virt_base + (mem->size << PAGE_SHIFT))) {
+  (mem->virt_base + ((dma_addr_t)mem->size << PAGE_SHIFT))) {
int page = (vaddr - mem->virt_base) >> PAGE_SHIFT;
unsigned long flags;
 
@@ -238,10 +239,10 @@ static int __dma_mmap_from_coherent(struct dma_coherent_mem *mem,
struct vm_area_struct *vma, void *vaddr, size_t size, int *ret)
 {
if (mem && vaddr >= mem->virt_base && vaddr + size <=
-  (mem->virt_base + (mem->size << PAGE_SHIFT))) {
+  (mem->virt_base + ((dma_addr_t)mem->size << PAGE_SHIFT))) {
unsigned long off = vma->vm_pgoff;
int start = (vaddr - mem->virt_base) >> PAGE_SHIFT;
-   int user_count = vma_pages(vma);
+   unsigned long user_count = vma_pages(vma);
int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 
*ret = -ENXIO;
-- 
2.20.1
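
[Editorial note: the overflow is easy to demonstrate in isolation; with
a 4 GiB pool and PAGE_SHIFT == 12, mem->size is 0x100000 pages.]

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		int pages = 0x100000;	/* 4 GiB of 4 KiB pages */

		/* Shift performed in (32-bit) int: undefined behaviour,
		 * typically wrapping to 0. */
		uint64_t bad  = (uint64_t)(pages << 12);
		/* Promote to a 64-bit type first, as the patch does
		 * with dma_addr_t. */
		uint64_t good = (uint64_t)pages << 12;

		printf("bad=%#llx good=%#llx\n",
		       (unsigned long long)bad, (unsigned long long)good);
		return 0;
	}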



[PATCH AUTOSEL 5.5 088/106] iommu/virtio: Fix freeing of incomplete domains

2020-04-15 Thread Sasha Levin
From: Jean-Philippe Brucker 

[ Upstream commit 7062af3ed2ba451029e3733d9f677c68f5ea9e77 ]

Calling viommu_domain_free() on a domain that hasn't been finalised (not
attached to any device, for example) can currently cause an Oops,
because we attempt to call ida_free() on ID 0, which may either be
unallocated or used by another domain.

Only initialise the vdomain->viommu pointer, which denotes a finalised
domain, at the end of a successful viommu_domain_finalise().

Fixes: edcd69ab9a32 ("iommu: Add virtio-iommu driver")
Reported-by: Eric Auger 
Signed-off-by: Jean-Philippe Brucker 
Reviewed-by: Robin Murphy 
Link: 
https://lore.kernel.org/r/20200326093558.2641019-3-jean-phili...@linaro.org
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/virtio-iommu.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 315c7cc4f99d8..779a6025b5962 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -613,18 +613,20 @@ static int viommu_domain_finalise(struct viommu_dev *viommu,
int ret;
struct viommu_domain *vdomain = to_viommu_domain(domain);
 
-   vdomain->viommu = viommu;
-   vdomain->map_flags  = viommu->map_flags;
+   ret = ida_alloc_range(&viommu->domain_ids, viommu->first_domain,
+ viommu->last_domain, GFP_KERNEL);
+   if (ret < 0)
+   return ret;
+
+   vdomain->id = (unsigned int)ret;
 
domain->pgsize_bitmap   = viommu->pgsize_bitmap;
domain->geometry= viommu->geometry;
 
-   ret = ida_alloc_range(&viommu->domain_ids, viommu->first_domain,
- viommu->last_domain, GFP_KERNEL);
-   if (ret >= 0)
-   vdomain->id = (unsigned int)ret;
+   vdomain->map_flags  = viommu->map_flags;
+   vdomain->viommu = viommu;
 
-   return ret > 0 ? 0 : ret;
+   return 0;
 }
 
 static void viommu_domain_free(struct iommu_domain *domain)
-- 
2.20.1



[PATCH AUTOSEL 5.5 095/106] iommu/vt-d: Fix page request descriptor size

2020-04-15 Thread Sasha Levin
From: Jacob Pan 

[ Upstream commit 52355fb1919ef7ed9a38e0f3de6e928de1f57217 ]

Intel VT-d might support PRS (Page Request Support) when it's
running in the scalable mode. Each page request descriptor
occupies 32 bytes and is 32-byte aligned. The page request
descriptor offset mask should therefore be 32-byte aligned.

Fixes: 5b438f4ba315d ("iommu/vt-d: Support page request in scalable mode")
Signed-off-by: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 3020506180c10..1d3816cd65d57 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -502,7 +502,7 @@ struct page_req_dsc {
u64 priv_data[2];
 };
 
-#define PRQ_RING_MASK ((0x1000 << PRQ_ORDER) - 0x10)
+#define PRQ_RING_MASK  ((0x1000 << PRQ_ORDER) - 0x20)
 
 static bool access_error(struct vm_area_struct *vma, struct page_req_dsc *req)
 {
-- 
2.20.1



[PATCH AUTOSEL 5.5 104/106] iommu/amd: Fix the configuration of GCR3 table root pointer

2020-04-15 Thread Sasha Levin
From: Adrian Huang 

[ Upstream commit c20f36534666e37858a14e591114d93cc1be0d34 ]

The mask for the SPA of the GCR3 table root pointer [51:31] only covers
20 bits, but the field requires 21 bits (please see the AMD IOMMU
specification). This leads to a potential failure when bit 51 of the SPA
of the GCR3 table root pointer is 1'b1.

Signed-off-by: Adrian Huang 
Fixes: 52815b75682e2 ("iommu/amd: Add support for IOMMUv2 domain mode")
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 798e1533a1471..d336ae8f5c73b 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -348,7 +348,7 @@
 
 #define DTE_GCR3_VAL_A(x)  (((x) >> 12) & 0x00007ULL)
 #define DTE_GCR3_VAL_B(x)  (((x) >> 15) & 0x0ffffULL)
-#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0xfffffULL)
+#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0x1fffffULL)
 
 #define DTE_GCR3_INDEX_A   0
 #define DTE_GCR3_INDEX_B   1
-- 
2.20.1



[PATCH AUTOSEL 5.5 094/106] iommu/vt-d: Silence RCU-list debugging warning in dmar_find_atsr()

2020-04-15 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit c6f4ebdeba4cff590594df931ff1ee610c426431 ]

dmar_find_atsr() calls list_for_each_entry_rcu() outside of an RCU read
side critical section but with dmar_global_lock held. Silence this
false positive.

 drivers/iommu/intel-iommu.c:4504 RCU-list traversed in non-reader section!!
 1 lock held by swapper/0/1:
 #0: ffffffff9755bee8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x1a6/0xe19

 Call Trace:
  dump_stack+0xa4/0xfe
  lockdep_rcu_suspicious+0xeb/0xf5
  dmar_find_atsr+0x1ab/0x1c0
  dmar_parse_one_atsr+0x64/0x220
  dmar_walk_remapping_entries+0x130/0x380
  dmar_table_init+0x166/0x243
  intel_iommu_init+0x1ab/0xe19
  pci_iommu_init+0x1a/0x44
  do_one_initcall+0xae/0x4d0
  kernel_init_freeable+0x412/0x4c5
  kernel_init+0x19/0x193

Signed-off-by: Qian Cai 
Acked-by: Lu Baolu 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 7f31775e9b554..65f8005589d36 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4370,7 +4370,8 @@ static struct dmar_atsr_unit *dmar_find_atsr(struct acpi_dmar_atsr *atsr)
struct dmar_atsr_unit *atsru;
struct acpi_dmar_atsr *tmp;
 
-   list_for_each_entry_rcu(atsru, &dmar_atsr_units, list) {
+   list_for_each_entry_rcu(atsru, &dmar_atsr_units, list,
+   dmar_rcu_check()) {
tmp = (struct acpi_dmar_atsr *)atsru->hdr;
if (atsr->segment != tmp->segment)
continue;
-- 
2.20.1



[PATCH AUTOSEL 5.5 024/106] dma-coherent: fix integer overflow in the reserved-memory dma allocation

2020-04-15 Thread Sasha Levin
From: Kevin Grandemange 

[ Upstream commit 286c21de32b904131f8cf6a36ce40b8b0c9c5da3 ]

pageno is an int, so the PAGE_SHIFT shift is done on an int and
overflows if the memory is bigger than 2G.

This can be reproduced using for example a reserved-memory of 4G

reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;

reserved_dma: buffer@0 {
compatible = "shared-dma-pool";
no-map;
reg = <0x5 0x00000000 0x1 0x0>;
};
};

Signed-off-by: Kevin Grandemange 
Signed-off-by: Christoph Hellwig 
Signed-off-by: Sasha Levin 
---
 kernel/dma/coherent.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 551b0eb7028a3..2a0c4985f38e4 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -134,7 +134,7 @@ static void *__dma_alloc_from_coherent(struct device *dev,
 
spin_lock_irqsave(&mem->spinlock, flags);
 
-   if (unlikely(size > (mem->size << PAGE_SHIFT)))
+   if (unlikely(size > ((dma_addr_t)mem->size << PAGE_SHIFT)))
goto err;
 
pageno = bitmap_find_free_region(mem->bitmap, mem->size, order);
@@ -144,8 +144,9 @@ static void *__dma_alloc_from_coherent(struct device *dev,
/*
 * Memory was found in the coherent area.
 */
-   *dma_handle = dma_get_device_base(dev, mem) + (pageno << PAGE_SHIFT);
-   ret = mem->virt_base + (pageno << PAGE_SHIFT);
+   *dma_handle = dma_get_device_base(dev, mem) +
+   ((dma_addr_t)pageno << PAGE_SHIFT);
+   ret = mem->virt_base + ((dma_addr_t)pageno << PAGE_SHIFT);
spin_unlock_irqrestore(&mem->spinlock, flags);
memset(ret, 0, size);
return ret;
@@ -194,7 +195,7 @@ static int __dma_release_from_coherent(struct dma_coherent_mem *mem,
   int order, void *vaddr)
 {
if (mem && vaddr >= mem->virt_base && vaddr <
-  (mem->virt_base + (mem->size << PAGE_SHIFT))) {
+  (mem->virt_base + ((dma_addr_t)mem->size << PAGE_SHIFT))) {
int page = (vaddr - mem->virt_base) >> PAGE_SHIFT;
unsigned long flags;
 
@@ -238,10 +239,10 @@ static int __dma_mmap_from_coherent(struct dma_coherent_mem *mem,
struct vm_area_struct *vma, void *vaddr, size_t size, int *ret)
 {
if (mem && vaddr >= mem->virt_base && vaddr + size <=
-  (mem->virt_base + (mem->size << PAGE_SHIFT))) {
+  (mem->virt_base + ((dma_addr_t)mem->size << PAGE_SHIFT))) {
unsigned long off = vma->vm_pgoff;
int start = (vaddr - mem->virt_base) >> PAGE_SHIFT;
-   int user_count = vma_pages(vma);
+   unsigned long user_count = vma_pages(vma);
int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 
*ret = -ENXIO;
-- 
2.20.1



[PATCH AUTOSEL 5.6 116/129] iommu/vt-d: Fix page request descriptor size

2020-04-15 Thread Sasha Levin
From: Jacob Pan 

[ Upstream commit 52355fb1919ef7ed9a38e0f3de6e928de1f57217 ]

Intel VT-d might support PRS (Page Request Support) when it's
running in the scalable mode. Each page request descriptor
occupies 32 bytes and is 32-byte aligned. The page request
descriptor offset mask should therefore be 32-byte aligned.

Fixes: 5b438f4ba315d ("iommu/vt-d: Support page request in scalable mode")
Signed-off-by: Lu Baolu 
Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index fc7d78876e021..2998418f0a383 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -531,7 +531,7 @@ struct page_req_dsc {
u64 priv_data[2];
 };
 
-#define PRQ_RING_MASK ((0x1000 << PRQ_ORDER) - 0x10)
+#define PRQ_RING_MASK  ((0x1000 << PRQ_ORDER) - 0x20)
 
 static bool access_error(struct vm_area_struct *vma, struct page_req_dsc *req)
 {
-- 
2.20.1



[PATCH AUTOSEL 5.6 115/129] iommu/vt-d: Silence RCU-list debugging warning in dmar_find_atsr()

2020-04-15 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit c6f4ebdeba4cff590594df931ff1ee610c426431 ]

dmar_find_atsr() calls list_for_each_entry_rcu() outside of an RCU read
side critical section but with dmar_global_lock held. Silence this
false positive.

 drivers/iommu/intel-iommu.c:4504 RCU-list traversed in non-reader section!!
 1 lock held by swapper/0/1:
 #0: ffffffff9755bee8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x1a6/0xe19

 Call Trace:
  dump_stack+0xa4/0xfe
  lockdep_rcu_suspicious+0xeb/0xf5
  dmar_find_atsr+0x1ab/0x1c0
  dmar_parse_one_atsr+0x64/0x220
  dmar_walk_remapping_entries+0x130/0x380
  dmar_table_init+0x166/0x243
  intel_iommu_init+0x1ab/0xe19
  pci_iommu_init+0x1a/0x44
  do_one_initcall+0xae/0x4d0
  kernel_init_freeable+0x412/0x4c5
  kernel_init+0x19/0x193

Signed-off-by: Qian Cai 
Acked-by: Lu Baolu 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 4be5494786918..ef0a5246700e5 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4501,7 +4501,8 @@ static struct dmar_atsr_unit *dmar_find_atsr(struct acpi_dmar_atsr *atsr)
struct dmar_atsr_unit *atsru;
struct acpi_dmar_atsr *tmp;
 
-   list_for_each_entry_rcu(atsru, &dmar_atsr_units, list) {
+   list_for_each_entry_rcu(atsru, &dmar_atsr_units, list,
+   dmar_rcu_check()) {
tmp = (struct acpi_dmar_atsr *)atsru->hdr;
if (atsr->segment != tmp->segment)
continue;
-- 
2.20.1



[PATCH AUTOSEL 5.6 126/129] iommu/amd: Fix the configuration of GCR3 table root pointer

2020-04-15 Thread Sasha Levin
From: Adrian Huang 

[ Upstream commit c20f36534666e37858a14e591114d93cc1be0d34 ]

The mask for the SPA of the GCR3 table root pointer [51:31] only covers
20 bits, but the field requires 21 bits (please see the AMD IOMMU
specification). This leads to a potential failure when bit 51 of the SPA
of the GCR3 table root pointer is 1'b1.

Signed-off-by: Adrian Huang 
Fixes: 52815b75682e2 ("iommu/amd: Add support for IOMMUv2 domain mode")
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/amd_iommu_types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index f8d01d6b00da7..ca8c4522045b3 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -348,7 +348,7 @@
 
 #define DTE_GCR3_VAL_A(x)  (((x) >> 12) & 0x00007ULL)
 #define DTE_GCR3_VAL_B(x)  (((x) >> 15) & 0x0ffffULL)
-#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0xfffffULL)
+#define DTE_GCR3_VAL_C(x)  (((x) >> 31) & 0x1fffffULL)
 
 #define DTE_GCR3_INDEX_A   0
 #define DTE_GCR3_INDEX_B   1
-- 
2.20.1



[PATCH AUTOSEL 5.6 107/129] iommu/virtio: Fix freeing of incomplete domains

2020-04-15 Thread Sasha Levin
From: Jean-Philippe Brucker 

[ Upstream commit 7062af3ed2ba451029e3733d9f677c68f5ea9e77 ]

Calling viommu_domain_free() on a domain that hasn't been finalised (not
attached to any device, for example) can currently cause an Oops,
because we attempt to call ida_free() on ID 0, which may either be
unallocated or used by another domain.

Only initialise the vdomain->viommu pointer, which denotes a finalised
domain, at the end of a successful viommu_domain_finalise().

Fixes: edcd69ab9a32 ("iommu: Add virtio-iommu driver")
Reported-by: Eric Auger 
Signed-off-by: Jean-Philippe Brucker 
Reviewed-by: Robin Murphy 
Link: 
https://lore.kernel.org/r/20200326093558.2641019-3-jean-phili...@linaro.org
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/virtio-iommu.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index cce329d71fbad..5eed75cd121f1 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -613,18 +613,20 @@ static int viommu_domain_finalise(struct viommu_dev *viommu,
int ret;
struct viommu_domain *vdomain = to_viommu_domain(domain);
 
-   vdomain->viommu = viommu;
-   vdomain->map_flags  = viommu->map_flags;
+   ret = ida_alloc_range(&viommu->domain_ids, viommu->first_domain,
+ viommu->last_domain, GFP_KERNEL);
+   if (ret < 0)
+   return ret;
+
+   vdomain->id = (unsigned int)ret;
 
domain->pgsize_bitmap   = viommu->pgsize_bitmap;
domain->geometry= viommu->geometry;
 
-   ret = ida_alloc_range(&viommu->domain_ids, viommu->first_domain,
- viommu->last_domain, GFP_KERNEL);
-   if (ret >= 0)
-   vdomain->id = (unsigned int)ret;
+   vdomain->map_flags  = viommu->map_flags;
+   vdomain->viommu = viommu;
 
-   return ret > 0 ? 0 : ret;
+   return 0;
 }
 
 static void viommu_domain_free(struct iommu_domain *domain)
-- 
2.20.1



[PATCH AUTOSEL 5.6 108/129] iommu/vt-d: Add build dependency on IOASID

2020-04-15 Thread Sasha Levin
From: Jacob Pan 

[ Upstream commit 4a663dae47316ae8b97d5b77025fe7dfd9d3487f ]

IOASID code is needed by VT-d scalable mode for PASID allocation.
Add explicit dependency such that IOASID is built-in whenever Intel
IOMMU is enabled.
Otherwise, aux domain code will fail when IOMMU is built-in and IOASID
is compiled as a module.

Fixes: 59a623374dc38 ("iommu/vt-d: Replace Intel specific PASID allocator with 
IOASID")
Signed-off-by: Jacob Pan 
Acked-by: Lu Baolu 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d2fade9849997..25149544d57c9 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -188,6 +188,7 @@ config INTEL_IOMMU
select NEED_DMA_MAP_STATE
select DMAR_TABLE
select SWIOTLB
+   select IOASID
help
  DMA remapping (DMAR) devices support enables independent address
  translations for Direct Memory Access (DMA) from devices.
-- 
2.20.1



[PATCH AUTOSEL 5.6 109/129] iommu/vt-d: Fix mm reference leak

2020-04-15 Thread Sasha Levin
From: Jacob Pan 

[ Upstream commit 902baf61adf6b187f0a6b789e70d788ea71ff5bc ]

Move canonical address check before mmget_not_zero() to avoid mm
reference leak.

Fixes: 9d8c3af31607 ("iommu/vt-d: IOMMU Page Request needs to check if address 
is canonical.")
Signed-off-by: Jacob Pan 
Acked-by: Lu Baolu 
Signed-off-by: Joerg Roedel 
Signed-off-by: Sasha Levin 
---
 drivers/iommu/intel-svm.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index d7f2a53589002..fc7d78876e021 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -611,14 +611,15 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 * any faults on kernel addresses. */
if (!svm->mm)
goto bad_req;
-   /* If the mm is already defunct, don't handle faults. */
-   if (!mmget_not_zero(svm->mm))
-   goto bad_req;
 
/* If address is not canonical, return invalid response */
if (!is_canonical_address(address))
goto bad_req;
 
+   /* If the mm is already defunct, don't handle faults. */
+   if (!mmget_not_zero(svm->mm))
+   goto bad_req;
+
down_read(&svm->mm->mmap_sem);
vma = find_extend_vma(svm->mm, address);
if (!vma || address < vma->vm_start)
-- 
2.20.1



[PATCH AUTOSEL 5.6 032/129] dma-coherent: fix integer overflow in the reserved-memory dma allocation

2020-04-15 Thread Sasha Levin
From: Kevin Grandemange 

[ Upstream commit 286c21de32b904131f8cf6a36ce40b8b0c9c5da3 ]

pageno is an int, so the PAGE_SHIFT shift is done on an int and
overflows if the memory is bigger than 2G.

This can be reproduced using for example a reserved-memory of 4G

reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;

reserved_dma: buffer@0 {
compatible = "shared-dma-pool";
no-map;
reg = <0x5 0x00000000 0x1 0x0>;
};
};

Signed-off-by: Kevin Grandemange 
Signed-off-by: Christoph Hellwig 
Signed-off-by: Sasha Levin 
---
 kernel/dma/coherent.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index 551b0eb7028a3..2a0c4985f38e4 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -134,7 +134,7 @@ static void *__dma_alloc_from_coherent(struct device *dev,
 
spin_lock_irqsave(&mem->spinlock, flags);
 
-   if (unlikely(size > (mem->size << PAGE_SHIFT)))
+   if (unlikely(size > ((dma_addr_t)mem->size << PAGE_SHIFT)))
goto err;
 
pageno = bitmap_find_free_region(mem->bitmap, mem->size, order);
@@ -144,8 +144,9 @@ static void *__dma_alloc_from_coherent(struct device *dev,
/*
 * Memory was found in the coherent area.
 */
-   *dma_handle = dma_get_device_base(dev, mem) + (pageno << PAGE_SHIFT);
-   ret = mem->virt_base + (pageno << PAGE_SHIFT);
+   *dma_handle = dma_get_device_base(dev, mem) +
+   ((dma_addr_t)pageno << PAGE_SHIFT);
+   ret = mem->virt_base + ((dma_addr_t)pageno << PAGE_SHIFT);
spin_unlock_irqrestore(&mem->spinlock, flags);
memset(ret, 0, size);
return ret;
@@ -194,7 +195,7 @@ static int __dma_release_from_coherent(struct dma_coherent_mem *mem,
   int order, void *vaddr)
 {
if (mem && vaddr >= mem->virt_base && vaddr <
-  (mem->virt_base + (mem->size << PAGE_SHIFT))) {
+  (mem->virt_base + ((dma_addr_t)mem->size << PAGE_SHIFT))) {
int page = (vaddr - mem->virt_base) >> PAGE_SHIFT;
unsigned long flags;
 
@@ -238,10 +239,10 @@ static int __dma_mmap_from_coherent(struct dma_coherent_mem *mem,
struct vm_area_struct *vma, void *vaddr, size_t size, int *ret)
 {
if (mem && vaddr >= mem->virt_base && vaddr + size <=
-  (mem->virt_base + (mem->size << PAGE_SHIFT))) {
+  (mem->virt_base + ((dma_addr_t)mem->size << PAGE_SHIFT))) {
unsigned long off = vma->vm_pgoff;
int start = (vaddr - mem->virt_base) >> PAGE_SHIFT;
-   int user_count = vma_pages(vma);
+   unsigned long user_count = vma_pages(vma);
int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 
*ret = -ENXIO;
-- 
2.20.1



Re: [PATCH 1/4] dma-mapping: move the remaining DMA API calls out of line

2020-04-15 Thread Alexey Kardashevskiy



On 15/04/2020 16:18, Christoph Hellwig wrote:
> On Wed, Apr 15, 2020 at 12:26:04PM +1000, Alexey Kardashevskiy wrote:
>> May be this is correct and allowed (no idea) but removing exported
>> symbols at least deserves a mention in the commit log, does not it?
>>
>> The rest of the series is fine and works. Thanks,
> 
> Maybe I can throw in a line, but the point is that dma_direct_*
> was exported because dma_* called it inline.  Now dma_* is out of line
> and exported instead, which was always the actual API.

They become inline in 2/4.

And the fact that they were exported leaves the possibility that there is
a driver somewhere relying on these symbols, or that a distro kernel won't
build because a symbol disappeared from its exports (I do not know what
KABI guarantees or whether the mainline kernel cares). I do not care in
particular, but some might; a line set off by empty lines in the commit
log would do.


-- 
Alexey


RE: [PATCH v2 6/7] iommu/vt-d: Add page request draining support

2020-04-15 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Wednesday, April 15, 2020 1:26 PM
> 
> When a PASID is stopped or terminated, there can be pending
> PRQs (requests that haven't received responses) in remapping
> hardware. This adds the interface to drain page requests and
> call it when a PASID is terminated.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel-svm.c   | 90 ++---
>  include/linux/intel-iommu.h |  1 +
>  2 files changed, 86 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 05aeb8ea51c4..736dd39fb52b 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -23,6 +23,7 @@
>  #include "intel-pasid.h"
> 
>  static irqreturn_t prq_event_thread(int irq, void *d);
> +static void intel_svm_drain_prq(struct device *dev, int pasid);
> 
>  #define PRQ_ORDER 0
> 
> @@ -210,6 +211,7 @@ static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
>   rcu_read_lock();
>   list_for_each_entry_rcu(sdev, &svm->devs, list) {
>   intel_pasid_tear_down_entry(svm->iommu, sdev->dev, svm->pasid);
> + intel_svm_drain_prq(sdev->dev, svm->pasid);

I feel there is a problem here. If you clear the PASID entry before
draining, in-flight requests will hit an unrecoverable fault instead, due
to the invalid PASID entry.
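
[Editorial sketch of the ordering concern being raised; timing is
illustrative only:]

	/*
	 * unbind path                     remapping hardware
	 * ----------------------------    -------------------------------
	 * tear_down_entry(pasid)          page request for pasid already
	 *                                 queued / in flight
	 *                                 -> PASID entry now invalid, so
	 *                                    the request is reported as an
	 *                                    unrecoverable fault instead
	 *                                    of a recoverable one
	 * drain_prq(pasid)
	 */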

>   intel_flush_svm_range_dev(svm, sdev, 0, -1, 0);
>   }
>   rcu_read_unlock();
> @@ -403,12 +405,8 @@ int intel_svm_unbind_gpasid(struct device *dev, int pasid)
>   if (!sdev->users) {
>   list_del_rcu(&sdev->list);
>   intel_pasid_tear_down_entry(iommu, dev, svm->pasid);
> + intel_svm_drain_prq(dev, svm->pasid);
>   intel_flush_svm_range_dev(svm, sdev, 0, -1, 0);
> - /* TODO: Drain in flight PRQ for the PASID since it
> -  * may get reused soon, we don't want to
> -  * confuse with its previous life.
> -  * intel_svm_drain_prq(dev, pasid);
> -  */
>   kfree_rcu(sdev, rcu);
> 
>   if (list_empty(>devs)) {
> @@ -646,6 +644,7 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
>* large and has to be physically contiguous. So it's
>* hard to be as defensive as we might like. */
>   intel_pasid_tear_down_entry(iommu, dev, svm->pasid);
> + intel_svm_drain_prq(dev, svm->pasid);
>   intel_flush_svm_range_dev(svm, sdev, 0, -1, 0);
>   kfree_rcu(sdev, rcu);
> 
> @@ -703,6 +702,7 @@ struct page_req_dsc {
>  struct page_req {
>   struct list_head list;
>   struct page_req_dsc desc;
> + struct completion complete;
>   unsigned int processing:1;
>   unsigned int drained:1;
>   unsigned int completed:1;
> @@ -732,9 +732,83 @@ static bool is_canonical_address(u64 addr)
>   return (((saddr << shift) >> shift) == saddr);
>  }
> 
> +/**
> + * intel_svm_drain_prq:
> + *
> + * Drain all pending page requests related to a specific pasid in both
> + * software and hardware. The caller must guarantee that no more page
> + * requests related to this pasid coming.
> + */
> +static void intel_svm_drain_prq(struct device *dev, int pasid)
> +{
> + struct device_domain_info *info;
> + struct dmar_domain *domain;
> + struct intel_iommu *iommu;
> + struct qi_desc desc[3];
> + struct pci_dev *pdev;
> + struct page_req *req;
> + unsigned long flags;
> + u16 sid, did;
> + int qdep;
> +
> + info = get_domain_info(dev);
> + if (WARN_ON(!info || !dev_is_pci(dev)))
> + return;
> +
> + iommu = info->iommu;
> + domain = info->domain;
> + pdev = to_pci_dev(dev);
> +
> + /* Mark all related pending requests drained. */
> + spin_lock_irqsave(&iommu->prq_lock, flags);
> + list_for_each_entry(req, &iommu->prq_list, list)
> + if (req->desc.pasid_present && req->desc.pasid == pasid)
> + req->drained = true;
> + spin_unlock_irqrestore(&iommu->prq_lock, flags);
> +
> + /* Wait until all related pending requests complete. */
> +retry:
> + spin_lock_irqsave(&iommu->prq_lock, flags);
> + list_for_each_entry(req, &iommu->prq_list, list) {
> + if (req->desc.pasid_present &&
> + req->desc.pasid == pasid &&
> + !req->completed) {
> + spin_unlock_irqrestore(&iommu->prq_lock, flags);
> + wait_for_completion_timeout(&req->complete, 5 * HZ);
> + goto retry;
> + }
> + }
> + spin_unlock_irqrestore(&iommu->prq_lock, flags);
> +
> + /*
> +  * Perform steps described in VT-d spec CH7.10 to drain page
> +  * request and responses in hardware.
> +  */
> + 
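
The quoted function is cut off above. Given the declared sid/did/qdep
variables, the three-entry desc array, and the QI_OPT_WAIT_DRAIN option
introduced in patch 2/7, the hardware draining step plausibly batches a
PASID-based IOTLB invalidation and a PASID-based device-IOTLB invalidation
behind a drain-flagged wait descriptor. A sketch under those assumptions,
not the code that was actually posted:

	/* Sketch only: assumed continuation of intel_svm_drain_prq(). */
	sid = PCI_DEVID(info->bus, info->devfn);
	did = domain->iommu_did[iommu->seq_id];
	qdep = pci_ats_queue_depth(pdev);

	memset(desc, 0, sizeof(desc));
	/* PASID-based IOTLB invalidation for the terminated PASID */
	desc[0].qw0 = QI_EIOTLB_PASID(pasid) | QI_EIOTLB_DID(did) |
		      QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) | QI_EIOTLB_TYPE;
	/* PASID-based device-IOTLB invalidation */
	desc[1].qw0 = QI_DEV_EIOTLB_PASID(pasid) | QI_DEV_EIOTLB_SID(sid) |
		      QI_DEV_EIOTLB_QDEP(qdep) | QI_DEIOTLB_TYPE;
	/*
	 * Submit as one batch; QI_OPT_WAIT_DRAIN makes the appended wait
	 * descriptor carry the page-request-drain flag (VT-d spec 7.10).
	 * The desc[3] declaration suggests the posted version may place an
	 * explicit wait descriptor in the third slot instead.
	 */
	qi_submit_sync(iommu, desc, 2, QI_OPT_WAIT_DRAIN);
}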

Re: [PATCH 0/2] iommu/arm-smmu: Allow client devices to select direct mapping

2020-04-15 Thread Sai Prakash Ranjan

Hi Joerg,

On 2020-04-13 20:42, Jordan Crouse wrote:

On Thu, Apr 09, 2020 at 04:31:24PM -0700, Matthias Kaehlcke wrote:

On Tue, Feb 04, 2020 at 11:12:17PM +0530, Sai Prakash Ranjan wrote:
> Hello Robin, Will
>
> On 2020-01-22 17:18, Sai Prakash Ranjan wrote:
> > This series allows drm devices to set a default identity
> > mapping using iommu_request_dm_for_dev(). First patch is
> > a cleanup to support other SoCs to call into QCOM specific
> > implementation and preparation for second patch.
> > Second patch sets the default identity domain for drm devices.
> >
> > Jordan Crouse (1):
> >   iommu/arm-smmu: Allow client devices to select direct mapping
> >
> > Sai Prakash Ranjan (1):
> >   iommu: arm-smmu-impl: Convert to a generic reset implementation
> >
> >  drivers/iommu/arm-smmu-impl.c |  8 +++--
> >  drivers/iommu/arm-smmu-qcom.c | 55 +--
> >  drivers/iommu/arm-smmu.c  |  3 ++
> >  drivers/iommu/arm-smmu.h  |  5 
> >  4 files changed, 65 insertions(+), 6 deletions(-)
>
> Any review comments?

Ping

What is the status of this series, is it ready to land or are any changes
needed?

Thanks

Matthias


I think this is up in the air following the changes that Joerg 
suggested:

https://lists.linuxfoundation.org/pipermail/iommu/2020-April/043017.html



The 1st patch for generic reset in this series is independent and can be
merged. But it seems like requesting direct mapping fails with Joerg's
patch series.


Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


RE: [PATCH v2 5/7] iommu/vt-d: Save prq descriptors in an internal list

2020-04-15 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Wednesday, April 15, 2020 1:26 PM
> 
> Currently, the page request interrupt thread handles the page
> requests in the queue in this way:
> 
> - Clear PPR bit to ensure new interrupt could come in;
> - Read and record the head and tail registers;
> - Handle all descriptors between head and tail;
> - Write tail to head register.
> 
> This might cause some descriptors to be handled multiple times.
> An example sequence:
> 
> - Thread A got scheduled with PRQ_1 and PRQ_2 in the queue;
> - Thread A clears the PPR bit and records the head and tail;
> - A new PRQ_3 comes and Thread B gets scheduled;
> - Thread B records the head and tail, which include PRQ_1
>   and PRQ_2.

I may have overlooked something, but isn't the prq interrupt thread
per-iommu? Then why would two prq threads contend here?

Thanks,
Kevin

> 
> As a result, PRQ_1 and PRQ_2 are handled twice in Thread_A and
> Thread_B.
> 
>          Thread_A                  Thread_B
>         .--------.                .--------.
>         |        |                |        |
>         .--------.                .--------.
>    head | PRQ_1  |           head | PRQ_1  |
>         .--------.                .--------.
>         | PRQ_2  |                | PRQ_2  |
>         .--------.                .--------.
>    tail |        |                | PRQ_3  |
>         .--------.                .--------.
>         |        |           tail |        |
>         '--------'                '--------'
> 
> To avoid this, probably, we need to apply a spinlock to ensure
> that PRQs are handled in a serialized way. But that means the
> intel_svm_process_prq() will be called with a spinlock held.
> This causes extra complexities in intel_svm_process_prq().
> 
> This aims to make PRQ descriptors be handled in a serialized
> way while removing the requirement of holding the spin lock in
> intel_svm_process_prq(), by saving the descriptors in a list.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel-svm.c   | 58 ++---
>  include/linux/intel-iommu.h |  2 ++
>  2 files changed, 49 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index a1921b462783..05aeb8ea51c4 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -50,6 +50,8 @@ int intel_svm_enable_prq(struct intel_iommu *iommu)
>   return ret;
>   }
>   iommu->pr_irq = irq;
> + INIT_LIST_HEAD(&iommu->prq_list);
> + spin_lock_init(&iommu->prq_lock);
> 
>   snprintf(iommu->prq_name, sizeof(iommu->prq_name), "dmar%d-
> prq", iommu->seq_id);
> 
> @@ -698,6 +700,14 @@ struct page_req_dsc {
> 
>  #define PRQ_RING_MASK	((0x1000 << PRQ_ORDER) - 0x20)
> 
> +struct page_req {
> + struct list_head list;
> + struct page_req_dsc desc;
> + unsigned int processing:1;
> + unsigned int drained:1;
> + unsigned int completed:1;
> +};
> +
>  static bool access_error(struct vm_area_struct *vma, struct page_req_dsc
> *req)
>  {
>   unsigned long requested = 0;
> @@ -842,34 +852,60 @@ static void process_single_prq(struct intel_iommu
> *iommu,
>   }
>  }
> 
> -static void intel_svm_process_prq(struct intel_iommu *iommu,
> -   struct page_req_dsc *prq,
> -   int head, int tail)
> +static void intel_svm_process_prq(struct intel_iommu *iommu)
>  {
> - struct page_req_dsc *req;
> -
> - while (head != tail) {
> - req = &iommu->prq[head / sizeof(*req)];
> - process_single_prq(iommu, req);
> - head = (head + sizeof(*req)) & PRQ_RING_MASK;
> + struct page_req *req;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&iommu->prq_lock, flags);
> + while (!list_empty(&iommu->prq_list)) {
> + req = list_first_entry(&iommu->prq_list, struct page_req, list);
> + if (!req->processing) {
> + req->processing = true;
> + spin_unlock_irqrestore(&iommu->prq_lock, flags);
> + process_single_prq(iommu, &req->desc);
> + spin_lock_irqsave(&iommu->prq_lock, flags);
> + req->completed = true;
> + } else if (req->completed) {
> + list_del(&req->list);
> + kfree(req);
> + } else {
> + break;
> + }
>   }
> + spin_unlock_irqrestore(&iommu->prq_lock, flags);
>  }
> 
>  static irqreturn_t prq_event_thread(int irq, void *d)
>  {
>   struct intel_iommu *iommu = d;
> + unsigned long flags;
>   int head, tail;
> 
> + spin_lock_irqsave(&iommu->prq_lock, flags);
>   /*
>* Clear PPR bit before reading head/tail registers, to
>* ensure that we get a new interrupt if needed.
>*/
>   writel(DMA_PRS_PPR, iommu->reg + DMAR_PRS_REG);
> -
>   tail = dmar_readq(iommu->reg + DMAR_PQT_REG) &
> PRQ_RING_MASK;
>   head = dmar_readq(iommu->reg + DMAR_PQH_REG) &
> PRQ_RING_MASK;
> - intel_svm_process_prq(iommu, iommu->prq, head, tail);
> + while (head != tail) {
> +  
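
The quoted hunk is truncated inside the new while loop. Given the page_req
definition above, the loop body presumably copies each descriptor out of
the hardware ring into a list entry before handing the ring back to
hardware. A sketch under that assumption, not the posted code:

		/* Sketch only: assumed body of the copy loop. */
		struct page_req_dsc *dsc = &iommu->prq[head / sizeof(*dsc)];
		struct page_req *req = kzalloc(sizeof(*req), GFP_ATOMIC);

		if (req) {
			req->desc = *dsc;	/* detach from the hardware ring */
			list_add_tail(&req->list, &iommu->prq_list);
		}
		head = (head + sizeof(*dsc)) & PRQ_RING_MASK;
	}
	/* Let hardware reuse the ring, then walk the saved list. */
	dmar_writeq(iommu->reg + DMAR_PQH_REG, tail);
	spin_unlock_irqrestore(&iommu->prq_lock, flags);
	intel_svm_process_prq(iommu);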

RE: [PATCH v2 4/7] iommu/vt-d: Refactor prq_event_thread()

2020-04-15 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Wednesday, April 15, 2020 1:26 PM
> 
> Move the software part of processing page request descriptors out of
> prq_event_thread() into a separate function. No functional
> changes.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel-svm.c | 256 --
>  1 file changed, 135 insertions(+), 121 deletions(-)
> 
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 83dc4319f661..a1921b462783 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -722,142 +722,156 @@ static bool is_canonical_address(u64 addr)
>   return (((saddr << shift) >> shift) == saddr);
>  }
> 
> -static irqreturn_t prq_event_thread(int irq, void *d)
> +static void process_single_prq(struct intel_iommu *iommu,
> +struct page_req_dsc *req)
>  {
> - struct intel_iommu *iommu = d;
> - struct intel_svm *svm = NULL;
> - int head, tail, handled = 0;
> -
> - /* Clear PPR bit before reading head/tail registers, to
> -  * ensure that we get a new interrupt if needed. */
> - writel(DMA_PRS_PPR, iommu->reg + DMAR_PRS_REG);
> -
> - tail = dmar_readq(iommu->reg + DMAR_PQT_REG) &
> PRQ_RING_MASK;
> - head = dmar_readq(iommu->reg + DMAR_PQH_REG) &
> PRQ_RING_MASK;
> - while (head != tail) {
> - struct intel_svm_dev *sdev;
> - struct vm_area_struct *vma;
> - struct page_req_dsc *req;
> - struct qi_desc resp;
> - int result;
> - vm_fault_t ret;
> - u64 address;
> -
> - handled = 1;
> -
> - req = &iommu->prq[head / sizeof(*req)];
> + int result = QI_RESP_FAILURE;
> + struct intel_svm_dev *sdev;
> + struct vm_area_struct *vma;
> + struct intel_svm *svm;
> + struct qi_desc resp;
> + vm_fault_t ret;
> + u64 address;
> +
> + address = (u64)req->addr << VTD_PAGE_SHIFT;
> + if (!req->pasid_present) {
> + pr_err("%s: Page request without PASID: %08llx %08llx\n",
> +iommu->name, ((unsigned long long *)req)[0],
> +((unsigned long long *)req)[1]);
> + goto no_pasid;
> + }
> 
> - result = QI_RESP_FAILURE;
> - address = (u64)req->addr << VTD_PAGE_SHIFT;
> - if (!req->pasid_present) {
> - pr_err("%s: Page request without
> PASID: %08llx %08llx\n",
> -iommu->name, ((unsigned long long *)req)[0],
> -((unsigned long long *)req)[1]);
> - goto no_pasid;
> - }
> + rcu_read_lock();
> + svm = ioasid_find(NULL, req->pasid, NULL);
> + /*
> +  * It *can't* go away, because the driver is not permitted
> +  * to unbind the mm while any page faults are outstanding.
> +  * So we only need RCU to protect the internal idr code.
> +  */
> + rcu_read_unlock();
> 
> - if (!svm || svm->pasid != req->pasid) {
> - rcu_read_lock();
> - svm = ioasid_find(NULL, req->pasid, NULL);
> - /* It *can't* go away, because the driver is not
> permitted
> -  * to unbind the mm while any page faults are
> outstanding.
> -  * So we only need RCU to protect the internal idr
> code. */
> - rcu_read_unlock();
> - if (IS_ERR_OR_NULL(svm)) {
> - pr_err("%s: Page request for invalid
> PASID %d: %08llx %08llx\n",
> -iommu->name, req->pasid, ((unsigned
> long long *)req)[0],
> -((unsigned long long *)req)[1]);
> - goto no_pasid;
> - }
> - }
> + if (IS_ERR_OR_NULL(svm)) {
> + pr_err("%s: Page request for invalid
> PASID %d: %08llx %08llx\n",
> +iommu->name, req->pasid, ((unsigned long long *)req)[0],
> +((unsigned long long *)req)[1]);
> + goto no_pasid;
> + }
> 
> - result = QI_RESP_INVALID;
> - /* Since we're using init_mm.pgd directly, we should never
> take
> -  * any faults on kernel addresses. */
> - if (!svm->mm)
> - goto bad_req;
> + result = QI_RESP_INVALID;
> + /* Since we're using init_mm.pgd directly, we should never take
> +  * any faults on kernel addresses. */
> + if (!svm->mm)
> + goto bad_req;
> +
> + /* If address is not canonical, return invalid response */
> + if (!is_canonical_address(address))
> + goto bad_req;
> +
> + /* If the mm is already defunct, don't handle faults. */
> + if (!mmget_not_zero(svm->mm))
> + goto bad_req;
> +
> + down_read(&svm->mm->mmap_sem);
> + vma = find_extend_vma(svm->mm, address);
> + if (!vma || 

[PATCH v2] dt-bindings: iommu: renesas, ipmmu-vmsa: convert to json-schema

2020-04-15 Thread Yoshihiro Shimoda
Convert Renesas VMSA-Compatible IOMMU bindings documentation
to json-schema.

Signed-off-by: Yoshihiro Shimoda 
---
 Changes from v1:
 - Fix typo in the subject.
 - Add a description on #iommu-cells.
 https://patchwork.kernel.org/patch/11485415/

 .../bindings/iommu/renesas,ipmmu-vmsa.txt  | 73 -
 .../bindings/iommu/renesas,ipmmu-vmsa.yaml | 94 ++
 2 files changed, 94 insertions(+), 73 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt
 create mode 100644 Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml

diff --git a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt
deleted file mode 100644
index 020d6f2..
--- a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt
+++ /dev/null
@@ -1,73 +0,0 @@
-* Renesas VMSA-Compatible IOMMU
-
-The IPMMU is an IOMMU implementation compatible with the ARM VMSA page tables.
-It provides address translation for bus masters outside of the CPU, each
-connected to the IPMMU through a port called micro-TLB.
-
-
-Required Properties:
-
-  - compatible: Must contain SoC-specific and generic entry below in case
-the device is compatible with the R-Car Gen2 VMSA-compatible IPMMU.
-
-- "renesas,ipmmu-r8a73a4" for the R8A73A4 (R-Mobile APE6) IPMMU.
-- "renesas,ipmmu-r8a7743" for the R8A7743 (RZ/G1M) IPMMU.
-- "renesas,ipmmu-r8a7744" for the R8A7744 (RZ/G1N) IPMMU.
-- "renesas,ipmmu-r8a7745" for the R8A7745 (RZ/G1E) IPMMU.
-- "renesas,ipmmu-r8a774a1" for the R8A774A1 (RZ/G2M) IPMMU.
-- "renesas,ipmmu-r8a774b1" for the R8A774B1 (RZ/G2N) IPMMU.
-- "renesas,ipmmu-r8a774c0" for the R8A774C0 (RZ/G2E) IPMMU.
-- "renesas,ipmmu-r8a7790" for the R8A7790 (R-Car H2) IPMMU.
-- "renesas,ipmmu-r8a7791" for the R8A7791 (R-Car M2-W) IPMMU.
-- "renesas,ipmmu-r8a7793" for the R8A7793 (R-Car M2-N) IPMMU.
-- "renesas,ipmmu-r8a7794" for the R8A7794 (R-Car E2) IPMMU.
-- "renesas,ipmmu-r8a7795" for the R8A7795 (R-Car H3) IPMMU.
-- "renesas,ipmmu-r8a7796" for the R8A7796 (R-Car M3-W) IPMMU.
-- "renesas,ipmmu-r8a77965" for the R8A77965 (R-Car M3-N) IPMMU.
-- "renesas,ipmmu-r8a77970" for the R8A77970 (R-Car V3M) IPMMU.
-- "renesas,ipmmu-r8a77980" for the R8A77980 (R-Car V3H) IPMMU.
-- "renesas,ipmmu-r8a77990" for the R8A77990 (R-Car E3) IPMMU.
-- "renesas,ipmmu-r8a77995" for the R8A77995 (R-Car D3) IPMMU.
-- "renesas,ipmmu-vmsa" for generic R-Car Gen2 or RZ/G1 VMSA-compatible
-  IPMMU.
-
-  - reg: Base address and size of the IPMMU registers.
-  - interrupts: Specifiers for the MMU fault interrupts. For instances that
-support secure mode two interrupts must be specified, for non-secure and
-secure mode, in that order. For instances that don't support secure mode a
-single interrupt must be specified. Not required for cache IPMMUs.
-
-  - #iommu-cells: Must be 1.
-
-Optional properties:
-
-  - renesas,ipmmu-main: reference to the main IPMMU instance in two cells.
-The first cell is a phandle to the main IPMMU and the second cell is
-the interrupt bit number associated with the particular cache IPMMU device.
-The interrupt bit number needs to match the main IPMMU IMSSTR register.
-Only used by cache IPMMU instances.
-
-
-Each bus master connected to an IPMMU must reference the IPMMU in its device
-node with the following property:
-
-  - iommus: A reference to the IPMMU in two cells. The first cell is a phandle
-to the IPMMU and the second cell the number of the micro-TLB that the
-device is connected to.
-
-
-Example: R8A7791 IPMMU-MX and VSP1-D0 bus master
-
-   ipmmu_mx: mmu@fe951000 {
-   compatible = "renasas,ipmmu-r8a7791", "renasas,ipmmu-vmsa";
-   reg = <0 0xfe951000 0 0x1000>;
-   interrupts = <0 222 IRQ_TYPE_LEVEL_HIGH>,
-<0 221 IRQ_TYPE_LEVEL_HIGH>;
-   #iommu-cells = <1>;
-   };
-
-   vsp@fe928000 {
-   ...
-   iommus = <&ipmmu_mx 13>;
-   ...
-   };
diff --git a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
new file mode 100644
index ..2c69f1f
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
@@ -0,0 +1,94 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/iommu/renesas,ipmmu-vmsa.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Renesas VMSA-Compatible IOMMU
+
+maintainers:
+  - Yoshihiro Shimoda 
+
+description:
+  The IPMMU is an IOMMU implementation compatible with the ARM VMSA page
+  tables. It provides address translation for bus masters outside of the CPU,
+  each connected to the IPMMU through a port called micro-TLB.

RE: [PATCH v2 2/7] iommu/vt-d: Multiple descriptors per qi_submit_sync()

2020-04-15 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Wednesday, April 15, 2020 4:30 PM
> 
> On 2020/4/15 16:18, Tian, Kevin wrote:
> >> From: Lu Baolu
> >> Sent: Wednesday, April 15, 2020 1:26 PM
> >>
> >> Extend qi_submit_sync() function to support multiple descriptors.
> >>
> >> Signed-off-by: Jacob Pan
> >> Signed-off-by: Lu Baolu
> >> ---
> >>   drivers/iommu/dmar.c| 39 +++--
> >>   include/linux/intel-iommu.h |  1 +
> >>   2 files changed, 25 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> >> index bb42177e2369..61d049e91f84 100644
> >> --- a/drivers/iommu/dmar.c
> >> +++ b/drivers/iommu/dmar.c
> >> @@ -1157,12 +1157,11 @@ static inline void reclaim_free_desc(struct
> >> q_inval *qi)
> >>}
> >>   }
> >>
> >> -static int qi_check_fault(struct intel_iommu *iommu, int index)
> >> +static int qi_check_fault(struct intel_iommu *iommu, int index, int
> >> wait_index)
> >>   {
> >>u32 fault;
> >>int head, tail;
> >>struct q_inval *qi = iommu->qi;
> >> -  int wait_index = (index + 1) % QI_LENGTH;
> >>int shift = qi_shift(iommu);
> >>
> >>if (qi->desc_status[wait_index] == QI_ABORT)
> >> @@ -1234,12 +1233,12 @@ static int qi_check_fault(struct intel_iommu
> >> *iommu, int index)
> >>   int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
> >>   unsigned int count, unsigned long options)
> >>   {
> >> -  int rc;
> >>struct q_inval *qi = iommu->qi;
> >> -  int offset, shift, length;
> >>struct qi_desc wait_desc;
> >>int wait_index, index;
> >>unsigned long flags;
> >> +  int offset, shift;
> >> +  int rc, i;
> >>
> >>if (!qi)
> >>return 0;
> >> @@ -1248,32 +1247,41 @@ int qi_submit_sync(struct intel_iommu
> *iommu,
> >> struct qi_desc *desc,
> >>rc = 0;
> >>
> >>	raw_spin_lock_irqsave(&qi->q_lock, flags);
> >> -  while (qi->free_cnt < 3) {
> >> +  /*
> >> +   * Check if we have enough empty slots in the queue to submit,
> >> +   * the calculation is based on:
> >> +   * # of desc + 1 wait desc + 1 space between head and tail
> >> +   */
> >> +  while (qi->free_cnt < count + 2) {
> >>	raw_spin_unlock_irqrestore(&qi->q_lock, flags);
> >>	cpu_relax();
> >>	raw_spin_lock_irqsave(&qi->q_lock, flags);
> >>}
> >>
> >>index = qi->free_head;
> >> -  wait_index = (index + 1) % QI_LENGTH;
> >> +  wait_index = (index + count) % QI_LENGTH;
> >>shift = qi_shift(iommu);
> >> -  length = 1 << shift;
> >>
> >> -  qi->desc_status[index] = qi->desc_status[wait_index] = QI_IN_USE;
> >> +  for (i = 0; i < count; i++) {
> >> +  offset = ((index + i) % QI_LENGTH) << shift;
> >> + memcpy(qi->desc + offset, &desc[i], 1 << shift);
> >> +  qi->desc_status[(index + i) % QI_LENGTH] = QI_IN_USE;
> >> +  }
> > what about doing one memcpy and leaving the loop only for updating
> > the qi status?
> >
> 
> One memcpy might cross the table boundary.
> 

Thanks. you are right.
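
The boundary point is easiest to see with the wrap-around arithmetic
spelled out: the queue is a circular ring of QI_LENGTH slots, so a batch
that begins in the last slot continues at slot 0, and a single flat memcpy
of count << shift bytes starting there would write past the end of the
table. A self-contained sketch with illustrative values, not driver code:

#include <stdio.h>

#define QI_LENGTH 256

int main(void)
{
	int index = QI_LENGTH - 1;	/* batch starts in the last slot */
	int count = 2;

	for (int i = 0; i < count; i++) {
		/* per-descriptor copy target: wraps from 255 back to 0 */
		printf("desc %d -> slot %d\n", i, (index + i) % QI_LENGTH);
	}
	/*
	 * A single memcpy to (qi->desc + (index << shift)) would instead
	 * write count << shift bytes past slot 255, off the end of the
	 * table -- hence the per-descriptor loop in the patch.
	 */
	return 0;
}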


Re: [PATCH v2 1/7] iommu/vt-d: Refactor parameters for qi_submit_sync()

2020-04-15 Thread Lu Baolu

On 2020/4/15 16:02, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Wednesday, April 15, 2020 1:26 PM

Current qi_submit_sync() supports single invalidation descriptor
per submission and appends wait descriptor after each submission
to poll hardware completion. This patch adjusts the parameters
of this function so that multiple descriptors per submission can
be supported.

Signed-off-by: Jacob Pan 
Signed-off-by: Lu Baolu 
---
  drivers/iommu/dmar.c| 24 ++--
  drivers/iommu/intel-pasid.c |  4 ++--
  drivers/iommu/intel-svm.c   |  6 +++---
  drivers/iommu/intel_irq_remapping.c |  2 +-
  include/linux/intel-iommu.h |  8 +++-
  5 files changed, 27 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index d9dc787feef7..bb42177e2369 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1225,10 +1225,14 @@ static int qi_check_fault(struct intel_iommu
*iommu, int index)
  }

  /*
- * Submit the queued invalidation descriptor to the remapping
- * hardware unit and wait for its completion.
+ * Function to submit invalidation descriptors of all types to the queued
+ * invalidation interface (QI). Multiple descriptors can be submitted at a
+ * time, a wait descriptor will be appended to each submission to ensure
+ * hardware has completed the invalidation before return. Wait descriptors
+ * can be part of the submission but they will not be polled for completion.
   */
-int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu)
+int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
+  unsigned int count, unsigned long options)


Adding parameters w/o actually using them is not a typical way of splitting
patches. Better squash this with 2/7 together.


My original thought was to make it easier for code review. No particular
preference. Both are okay to me. :-)

Best regards,
baolu




  {
int rc;
struct q_inval *qi = iommu->qi;
@@ -1318,7 +1322,7 @@ void qi_global_iec(struct intel_iommu *iommu)
desc.qw3 = 0;

/* should never fail */
-   qi_submit_sync(&desc, iommu);
+   qi_submit_sync(iommu, &desc, 1, 0);
  }

  void qi_flush_context(struct intel_iommu *iommu, u16 did, u16 sid, u8 fm,
@@ -1332,7 +1336,7 @@ void qi_flush_context(struct intel_iommu *iommu,
u16 did, u16 sid, u8 fm,
desc.qw2 = 0;
desc.qw3 = 0;

-   qi_submit_sync(&desc, iommu);
+   qi_submit_sync(iommu, &desc, 1, 0);
  }

  void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
@@ -1356,7 +1360,7 @@ void qi_flush_iotlb(struct intel_iommu *iommu,
u16 did, u64 addr,
desc.qw2 = 0;
desc.qw3 = 0;

-   qi_submit_sync(&desc, iommu);
+   qi_submit_sync(iommu, &desc, 1, 0);
  }

  void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
@@ -1378,7 +1382,7 @@ void qi_flush_dev_iotlb(struct intel_iommu
*iommu, u16 sid, u16 pfsid,
desc.qw2 = 0;
desc.qw3 = 0;

-   qi_submit_sync(&desc, iommu);
+   qi_submit_sync(iommu, &desc, 1, 0);
  }

  /* PASID-based IOTLB invalidation */
@@ -1419,7 +1423,7 @@ void qi_flush_piotlb(struct intel_iommu *iommu,
u16 did, u32 pasid, u64 addr,
QI_EIOTLB_AM(mask);
}

-   qi_submit_sync(&desc, iommu);
+   qi_submit_sync(iommu, &desc, 1, 0);
  }

  /* PASID-based device IOTLB Invalidate */
@@ -1448,7 +1452,7 @@ void qi_flush_dev_iotlb_pasid(struct intel_iommu
*iommu, u16 sid, u16 pfsid,
if (size_order)
desc.qw1 |= QI_DEV_EIOTLB_SIZE;

-   qi_submit_sync(&desc, iommu);
+   qi_submit_sync(iommu, &desc, 1, 0);
  }

  void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did,
@@ -1458,7 +1462,7 @@ void qi_flush_pasid_cache(struct intel_iommu
*iommu, u16 did,

desc.qw0 = QI_PC_PASID(pasid) | QI_PC_DID(did) |
QI_PC_GRAN(granu) | QI_PC_TYPE;
-   qi_submit_sync(&desc, iommu);
+   qi_submit_sync(iommu, &desc, 1, 0);
  }

  /*
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 48cc9ca5f3dc..7969e3dac2ad 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -498,7 +498,7 @@ pasid_cache_invalidation_with_pasid(struct
intel_iommu *iommu,
desc.qw2 = 0;
desc.qw3 = 0;

-   qi_submit_sync(&desc, iommu);
+   qi_submit_sync(iommu, &desc, 1, 0);
  }

  static void
@@ -512,7 +512,7 @@ iotlb_invalidation_with_pasid(struct intel_iommu
*iommu, u16 did, u32 pasid)
desc.qw2 = 0;
desc.qw3 = 0;

-   qi_submit_sync(&desc, iommu);
+   qi_submit_sync(iommu, &desc, 1, 0);
  }

  static void
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index e9f4e979a71f..83dc4319f661 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -138,7 +138,7 @@ static void intel_flush_svm_range_dev (struct
intel_svm *svm, struct intel_svm_d
}
desc.qw2 = 0;
desc.qw3 = 0;
-   qi_submit_sync(&desc, svm->iommu);
+   

Re: [PATCH v2 2/7] iommu/vt-d: Multiple descriptors per qi_submit_sync()

2020-04-15 Thread Lu Baolu

On 2020/4/15 16:18, Tian, Kevin wrote:

From: Lu Baolu
Sent: Wednesday, April 15, 2020 1:26 PM

Extend qi_submit_sync() function to support multiple descriptors.

Signed-off-by: Jacob Pan
Signed-off-by: Lu Baolu
---
  drivers/iommu/dmar.c| 39 +++--
  include/linux/intel-iommu.h |  1 +
  2 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index bb42177e2369..61d049e91f84 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1157,12 +1157,11 @@ static inline void reclaim_free_desc(struct
q_inval *qi)
}
  }

-static int qi_check_fault(struct intel_iommu *iommu, int index)
+static int qi_check_fault(struct intel_iommu *iommu, int index, int
wait_index)
  {
u32 fault;
int head, tail;
struct q_inval *qi = iommu->qi;
-   int wait_index = (index + 1) % QI_LENGTH;
int shift = qi_shift(iommu);

if (qi->desc_status[wait_index] == QI_ABORT)
@@ -1234,12 +1233,12 @@ static int qi_check_fault(struct intel_iommu
*iommu, int index)
  int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
   unsigned int count, unsigned long options)
  {
-   int rc;
struct q_inval *qi = iommu->qi;
-   int offset, shift, length;
struct qi_desc wait_desc;
int wait_index, index;
unsigned long flags;
+   int offset, shift;
+   int rc, i;

if (!qi)
return 0;
@@ -1248,32 +1247,41 @@ int qi_submit_sync(struct intel_iommu *iommu,
struct qi_desc *desc,
rc = 0;

	raw_spin_lock_irqsave(&qi->q_lock, flags);
-   while (qi->free_cnt < 3) {
+   /*
+    * Check if we have enough empty slots in the queue to submit,
+    * the calculation is based on:
+    * # of desc + 1 wait desc + 1 space between head and tail
+    */
+   while (qi->free_cnt < count + 2) {
	raw_spin_unlock_irqrestore(&qi->q_lock, flags);
	cpu_relax();
	raw_spin_lock_irqsave(&qi->q_lock, flags);
}

index = qi->free_head;
-   wait_index = (index + 1) % QI_LENGTH;
+   wait_index = (index + count) % QI_LENGTH;
shift = qi_shift(iommu);
-   length = 1 << shift;

-   qi->desc_status[index] = qi->desc_status[wait_index] = QI_IN_USE;
+   for (i = 0; i < count; i++) {
+   offset = ((index + i) % QI_LENGTH) << shift;
+   memcpy(qi->desc + offset, &desc[i], 1 << shift);
+   qi->desc_status[(index + i) % QI_LENGTH] = QI_IN_USE;
+   }

what about doing one memcpy and leaving the loop only for updating
the qi status?



One memcpy might cross the table boundary.

Best regards,
baolu


Re: [PATCH v2 0/7] iommu/vt-d: Add page request draining support

2020-04-15 Thread Lu Baolu

Hi Kevin,

On 2020/4/15 15:57, Tian, Kevin wrote:

From: Lu Baolu
Sent: Wednesday, April 15, 2020 1:26 PM

When a PASID is stopped or terminated, there can be pending PRQs
(requests that haven't received responses) in the software and
remapping hardware. The pending page requests must be drained
so that the pasid could be reused. The register level interface
for page request draining is defined in 7.11 of the VT-d spec.
This series adds the support for page requests draining.

7.11 doesn't include register-level interface. It just talks about
the general requirements on system software, endpoint device
and its driver.



I will replace this with "spec 7.10 specifies the software steps to
drain page requests and responses".

Best regards,
baolu


RE: [PATCH v2 2/7] iommu/vt-d: Multiple descriptors per qi_submit_sync()

2020-04-15 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Wednesday, April 15, 2020 1:26 PM
> 
> Extend qi_submit_sync() function to support multiple descriptors.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/dmar.c| 39 +++--
>  include/linux/intel-iommu.h |  1 +
>  2 files changed, 25 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index bb42177e2369..61d049e91f84 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -1157,12 +1157,11 @@ static inline void reclaim_free_desc(struct
> q_inval *qi)
>   }
>  }
> 
> -static int qi_check_fault(struct intel_iommu *iommu, int index)
> +static int qi_check_fault(struct intel_iommu *iommu, int index, int
> wait_index)
>  {
>   u32 fault;
>   int head, tail;
>   struct q_inval *qi = iommu->qi;
> - int wait_index = (index + 1) % QI_LENGTH;
>   int shift = qi_shift(iommu);
> 
>   if (qi->desc_status[wait_index] == QI_ABORT)
> @@ -1234,12 +1233,12 @@ static int qi_check_fault(struct intel_iommu
> *iommu, int index)
>  int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
>  unsigned int count, unsigned long options)
>  {
> - int rc;
>   struct q_inval *qi = iommu->qi;
> - int offset, shift, length;
>   struct qi_desc wait_desc;
>   int wait_index, index;
>   unsigned long flags;
> + int offset, shift;
> + int rc, i;
> 
>   if (!qi)
>   return 0;
> @@ -1248,32 +1247,41 @@ int qi_submit_sync(struct intel_iommu *iommu,
> struct qi_desc *desc,
>   rc = 0;
> 
>   raw_spin_lock_irqsave(&qi->q_lock, flags);
> - while (qi->free_cnt < 3) {
> + /*
> +  * Check if we have enough empty slots in the queue to submit,
> +  * the calculation is based on:
> +  * # of desc + 1 wait desc + 1 space between head and tail
> +  */
> + while (qi->free_cnt < count + 2) {
>   raw_spin_unlock_irqrestore(&qi->q_lock, flags);
>   cpu_relax();
>   raw_spin_lock_irqsave(&qi->q_lock, flags);
>   }
> 
>   index = qi->free_head;
> - wait_index = (index + 1) % QI_LENGTH;
> + wait_index = (index + count) % QI_LENGTH;
>   shift = qi_shift(iommu);
> - length = 1 << shift;
> 
> - qi->desc_status[index] = qi->desc_status[wait_index] = QI_IN_USE;
> + for (i = 0; i < count; i++) {
> + offset = ((index + i) % QI_LENGTH) << shift;
> + memcpy(qi->desc + offset, &desc[i], 1 << shift);
> + qi->desc_status[(index + i) % QI_LENGTH] = QI_IN_USE;
> + }

what about doing one memcpy and leaving the loop only for updating
the qi status?

> + qi->desc_status[wait_index] = QI_IN_USE;
> 
> - offset = index << shift;
> - memcpy(qi->desc + offset, desc, length);
>   wait_desc.qw0 = QI_IWD_STATUS_DATA(QI_DONE) |
>   QI_IWD_STATUS_WRITE | QI_IWD_TYPE;
> + if (options & QI_OPT_WAIT_DRAIN)
> + wait_desc.qw0 |= QI_IWD_PRQ_DRAIN;
> + wait_desc.qw1 = virt_to_phys(&qi->desc_status[wait_index]);
>   wait_desc.qw2 = 0;
>   wait_desc.qw3 = 0;
> 
>   offset = wait_index << shift;
> + memcpy(qi->desc + offset, &wait_desc, 1 << shift);
> + memcpy(qi->desc + offset, _desc, 1 << shift);
> 
> - qi->free_head = (qi->free_head + 2) % QI_LENGTH;
> - qi->free_cnt -= 2;
> + qi->free_head = (qi->free_head + count + 1) % QI_LENGTH;
> + qi->free_cnt -= count + 1;
> 
>   /*
>* update the HW tail register indicating the presence of
> @@ -1289,7 +1297,7 @@ int qi_submit_sync(struct intel_iommu *iommu,
> struct qi_desc *desc,
>* a deadlock where the interrupt context can wait
> indefinitely
>* for free slots in the queue.
>*/
> - rc = qi_check_fault(iommu, index);
> + rc = qi_check_fault(iommu, index, wait_index);
>   if (rc)
>   break;
> 
> @@ -1298,7 +1306,8 @@ int qi_submit_sync(struct intel_iommu *iommu,
> struct qi_desc *desc,
>   raw_spin_lock(&qi->q_lock);
>   }
> 
> - qi->desc_status[index] = QI_DONE;
> + for (i = 0; i < count; i++)
> + qi->desc_status[(index + i) % QI_LENGTH] = QI_DONE;
> 
>   reclaim_free_desc(qi);
>   raw_spin_unlock_irqrestore(>q_lock, flags);
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index ee2d5cdd8339..cca1e5f9aeaa 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -333,6 +333,7 @@ enum {
> 
>  #define QI_IWD_STATUS_DATA(d)	(((u64)d) << 32)
>  #define QI_IWD_STATUS_WRITE	(((u64)1) << 5)
> +#define QI_IWD_PRQ_DRAIN	(((u64)1) << 7)
> 
>  #define QI_IOTLB_DID(did)	(((u64)did) << 16)
>  #define QI_IOTLB_DR(dr)	(((u64)dr) << 7)
> --
> 2.17.1


RE: [PATCH v2 1/7] iommu/vt-d: Refactor parameters for qi_submit_sync()

2020-04-15 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Wednesday, April 15, 2020 1:26 PM
> 
> Current qi_submit_sync() supports single invalidation descriptor
> per submission and appends wait descriptor after each submission
> to poll hardware completion. This patch adjusts the parameters
> of this function so that multiple descriptors per submission can
> be supported.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/dmar.c| 24 ++--
>  drivers/iommu/intel-pasid.c |  4 ++--
>  drivers/iommu/intel-svm.c   |  6 +++---
>  drivers/iommu/intel_irq_remapping.c |  2 +-
>  include/linux/intel-iommu.h |  8 +++-
>  5 files changed, 27 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index d9dc787feef7..bb42177e2369 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -1225,10 +1225,14 @@ static int qi_check_fault(struct intel_iommu
> *iommu, int index)
>  }
> 
>  /*
> - * Submit the queued invalidation descriptor to the remapping
> - * hardware unit and wait for its completion.
> + * Function to submit invalidation descriptors of all types to the queued
> + * invalidation interface (QI). Multiple descriptors can be submitted at a
> + * time, a wait descriptor will be appended to each submission to ensure
> + * hardware has completed the invalidation before return. Wait descriptors
> + * can be part of the submission but they will not be polled for completion.
>   */
> -int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu)
> +int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
> +unsigned int count, unsigned long options)

Adding parameters w/o actually using them is not a typical way of splitting
patches. Better squash this with 2/7 together.

>  {
>   int rc;
>   struct q_inval *qi = iommu->qi;
> @@ -1318,7 +1322,7 @@ void qi_global_iec(struct intel_iommu *iommu)
>   desc.qw3 = 0;
> 
>   /* should never fail */
> - qi_submit_sync(&desc, iommu);
> + qi_submit_sync(iommu, &desc, 1, 0);
>  }
> 
>  void qi_flush_context(struct intel_iommu *iommu, u16 did, u16 sid, u8 fm,
> @@ -1332,7 +1336,7 @@ void qi_flush_context(struct intel_iommu *iommu,
> u16 did, u16 sid, u8 fm,
>   desc.qw2 = 0;
>   desc.qw3 = 0;
> 
> - qi_submit_sync(&desc, iommu);
> + qi_submit_sync(iommu, &desc, 1, 0);
>  }
> 
>  void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
> @@ -1356,7 +1360,7 @@ void qi_flush_iotlb(struct intel_iommu *iommu,
> u16 did, u64 addr,
>   desc.qw2 = 0;
>   desc.qw3 = 0;
> 
> - qi_submit_sync(&desc, iommu);
> + qi_submit_sync(iommu, &desc, 1, 0);
>  }
> 
>  void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
> @@ -1378,7 +1382,7 @@ void qi_flush_dev_iotlb(struct intel_iommu
> *iommu, u16 sid, u16 pfsid,
>   desc.qw2 = 0;
>   desc.qw3 = 0;
> 
> - qi_submit_sync(&desc, iommu);
> + qi_submit_sync(iommu, &desc, 1, 0);
>  }
> 
>  /* PASID-based IOTLB invalidation */
> @@ -1419,7 +1423,7 @@ void qi_flush_piotlb(struct intel_iommu *iommu,
> u16 did, u32 pasid, u64 addr,
>   QI_EIOTLB_AM(mask);
>   }
> 
> - qi_submit_sync(&desc, iommu);
> + qi_submit_sync(iommu, &desc, 1, 0);
>  }
> 
>  /* PASID-based device IOTLB Invalidate */
> @@ -1448,7 +1452,7 @@ void qi_flush_dev_iotlb_pasid(struct intel_iommu
> *iommu, u16 sid, u16 pfsid,
>   if (size_order)
>   desc.qw1 |= QI_DEV_EIOTLB_SIZE;
> 
> - qi_submit_sync(&desc, iommu);
> + qi_submit_sync(iommu, &desc, 1, 0);
>  }
> 
>  void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did,
> @@ -1458,7 +1462,7 @@ void qi_flush_pasid_cache(struct intel_iommu
> *iommu, u16 did,
> 
>   desc.qw0 = QI_PC_PASID(pasid) | QI_PC_DID(did) |
>   QI_PC_GRAN(granu) | QI_PC_TYPE;
> - qi_submit_sync(&desc, iommu);
> + qi_submit_sync(iommu, &desc, 1, 0);
>  }
> 
>  /*
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index 48cc9ca5f3dc..7969e3dac2ad 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -498,7 +498,7 @@ pasid_cache_invalidation_with_pasid(struct
> intel_iommu *iommu,
>   desc.qw2 = 0;
>   desc.qw3 = 0;
> 
> - qi_submit_sync(&desc, iommu);
> + qi_submit_sync(iommu, &desc, 1, 0);
>  }
> 
>  static void
> @@ -512,7 +512,7 @@ iotlb_invalidation_with_pasid(struct intel_iommu
> *iommu, u16 did, u32 pasid)
>   desc.qw2 = 0;
>   desc.qw3 = 0;
> 
> - qi_submit_sync(&desc, iommu);
> + qi_submit_sync(iommu, &desc, 1, 0);
>  }
> 
>  static void
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index e9f4e979a71f..83dc4319f661 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -138,7 +138,7 @@ static void intel_flush_svm_range_dev (struct
> intel_svm *svm, struct intel_svm_d
>   }
>   desc.qw2 = 0;
>   desc.qw3 = 0;
> - qi_submit_sync(&desc, svm->iommu);
> + 
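
The message is truncated above. For reference, a minimal sketch of how a
caller could batch two descriptors with the reworked signature; iommu,
pasid, did and granu are assumed to be in scope, and the two descriptors
are placeholders:

	/* Sketch: two invalidations behind one appended wait descriptor. */
	struct qi_desc desc[2];

	memset(desc, 0, sizeof(desc));
	/* e.g. a PASID-cache flush followed by a PASID-based IOTLB flush */
	desc[0].qw0 = QI_PC_PASID(pasid) | QI_PC_DID(did) |
		      QI_PC_GRAN(granu) | QI_PC_TYPE;
	desc[1].qw0 = QI_EIOTLB_PASID(pasid) | QI_EIOTLB_DID(did) |
		      QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) | QI_EIOTLB_TYPE;
	qi_submit_sync(iommu, desc, 2, 0);	/* 0 = no special options */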

RE: [PATCH v2 0/7] iommu/vt-d: Add page request draining support

2020-04-15 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Wednesday, April 15, 2020 1:26 PM
> 
> When a PASID is stopped or terminated, there can be pending PRQs
> (requests that haven't received responses) in the software and
> remapping hardware. The pending page requests must be drained
> so that the pasid could be reused. The register level interface
> for page request draining is defined in 7.11 of the VT-d spec.
> This series adds the support for page requests draining.

7.11 doesn't include a register-level interface. It just talks about
the general requirements on system software, the endpoint device
and its driver.

Thanks
Kevin

> 
> This includes two parts:
>  - PATCH 1/7 ~ 3/7: refactor the qi_submit_sync() to support
>multiple descriptors per submission which will be used by
>PATCH 6/7.
>  - PATCH 4/7 ~ 7/7: add page request drain support after a
>pasid entry is torn down due to an unbind operation.
> 
> Please help to review.
> 
> Best regards,
> baolu
> 
> Change log:
>  v1->v2:
>   - Fix race between multiple prq handling threads
> 
> Lu Baolu (7):
>   iommu/vt-d: Refactor parameters for qi_submit_sync()
>   iommu/vt-d: Multiple descriptors per qi_submit_sync()
>   iommu/vt-d: debugfs: Add support to show inv queue internals
>   iommu/vt-d: Refactor prq_event_thread()
>   iommu/vt-d: Save prq descriptors in an internal list
>   iommu/vt-d: Add page request draining support
>   iommu/vt-d: Remove redundant IOTLB flush
> 
>  drivers/iommu/dmar.c|  63 +++--
>  drivers/iommu/intel-iommu-debugfs.c |  62 +
>  drivers/iommu/intel-pasid.c |   4 +-
>  drivers/iommu/intel-svm.c   | 383 ++--
>  drivers/iommu/intel_irq_remapping.c |   2 +-
>  include/linux/intel-iommu.h |  12 +-
>  6 files changed, 369 insertions(+), 157 deletions(-)
> 
> --
> 2.17.1



Re: [PATCH 0/2] iommu: Remove iommu_sva_ops::mm_exit()

2020-04-15 Thread Jean-Philippe Brucker
On Fri, Apr 10, 2020 at 08:52:49AM -0700, Jacob Pan wrote:
> On Thu, 9 Apr 2020 16:50:58 +0200
> Jean-Philippe Brucker  wrote:
> 
> > > So unbind is coming anyway, the difference in handling in mmu
> > > release notifier is whether we silently drop DMA fault vs.
> > > reporting fault?  
> > 
> > What I meant is, between mmu release notifier and unbind(), we can't
> > print any error from DMA fault on dmesg, because an mm exit is easily
> > triggered by userspace. Look at the lifetime of the bond:
> > 
> > bind()
> >  |
> >  : Here any DMA fault is handled by mm, and on error we don't print
> >  : anything to dmesg. Userspace can easily trigger faults by issuing
> >  : DMA on unmapped buffers.
> >  |
> > mm exit -> clear pgd, invalidate IOTLBs
> >  |
> >  : Here the PASID descriptor doesn't have the pgd anymore, but we
> >  : don't print out any error to dmesg either. DMA is likely still
> >  : running but any fault has to be ignored.
> >  :
> >  : We also can't free the PASID yet, since transactions are still
> >  : coming in with this PASID.
> >  |
> > unbind() -> clear context descriptor, release PASID and mmu notifier
> >  |
> >  : Here the PASID descriptor is clear. If DMA is still running the
> >  : device driver really messed up and we have to print out any fault.
> > 
> > For that middle state I had to introduce a new pasid descriptor state
> > in the SMMU driver, to avoid reporting errors between mm exit and
> > unbind().
> I must have missed something, but why bother with a state when you can
> always check if the mm is dead by mmget_not_zero()? You would not
> handle IOPF if the mm is dead anyway, similarly for other DMA errors.

In the SMMU a cleared PASID descriptor results in unrecoverable faults,
which do not go through the I/O page fault handler. I've been thinking
about injecting everything to the IOPF handler, recoverable or not, but
filtering down the stream is complicated. Most of the time outside this
small window, we really need to print out those messages because they
would indicate serious bugs.

> Also, since you are not freeing ioasid in mmu_notifier release anymore,
> does it mean the IOASID notifier chain can be non-atomic?

Unfortunately not, ioasid_free() is called from
mmu_notifier_ops::free_notifier() in the RCU callback that results from
mmu_notifier_put(). 
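
In other words, the free path itself runs in atomic context. A sketch of
the shape of that callback, with struct io_bond and mn_to_bond() as
hypothetical stand-ins for the driver's real types:

static void io_mm_free_notifier(struct mmu_notifier *mn)
{
	struct io_bond *bond = mn_to_bond(mn);	/* hypothetical helper */

	/*
	 * This runs from the RCU callback scheduled by mmu_notifier_put(),
	 * i.e. in atomic context, so the IOASID notifier chain triggered
	 * by ioasid_free() has to remain an atomic one.
	 */
	ioasid_free(bond->pasid);
}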

Thanks,
Jean


Re: [PATCH v2 07/33] iommu: Add probe_device() and remove_device() call-backs

2020-04-15 Thread Lu Baolu

On 2020/4/14 21:15, Joerg Roedel wrote:

From: Joerg Roedel

Add call-backs to 'struct iommu_ops' as an alternative to the
add_device() and remove_device() call-backs, which will be removed when
all drivers are converted.

The new call-backs will not set up IOMMU groups and domains anymore,
so also add a probe_finalize() call-back where the IOMMU driver can do
per-device setup work which requires the device to be set up with a
group and a domain.


The subject is inaccurate. probe_device() and release_device() are
added to replace the add and remove pair.

Best regards,
baolu


Re: [PATCH 1/4] dma-mapping: move the remaining DMA API calls out of line

2020-04-15 Thread Christoph Hellwig
On Wed, Apr 15, 2020 at 12:26:04PM +1000, Alexey Kardashevskiy wrote:
> May be this is correct and allowed (no idea) but removing exported
> symbols at least deserves a mention in the commit log, does not it?
> 
> The rest of the series is fine and works. Thanks,

Maybe I can throw in a line, but the point is that dma_direct_*
was exported because dma_* called them inline.  Now dma_* is out of line
and exported instead, which always was the actual API.
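
Schematically, with made-up names standing in for the dma_* wrapper and
its dma_direct_* backend, the export moves like this (a sketch of the
pattern, not the actual patch):

/* Before: the API wrapper is a static inline in a header, so it expands
 * into every caller and the backend must be exported for modules. */
static inline int api_call(int arg)
{
	return backend_call(arg);
}
EXPORT_SYMBOL(backend_call);

/* After: the API wrapper is out of line, so the wrapper itself is
 * exported and the backend can drop out of the module ABI. */
int api_call(int arg)
{
	return backend_call(arg);
}
EXPORT_SYMBOL(api_call);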


Re: [PATCH v2 13/33] iommu: Export bus_iommu_probe() and make is safe for re-probing

2020-04-15 Thread Lu Baolu

On 2020/4/14 21:15, Joerg Roedel wrote:

From: Joerg Roedel 

Add a check to the bus_iommu_probe() call-path to make sure it ignores
devices which have already been successfully probed. Then export the
bus_iommu_probe() function so it can be used by IOMMU drivers.

Signed-off-by: Joerg Roedel 
---
  drivers/iommu/iommu.c | 6 +-
  include/linux/iommu.h | 1 +
  2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 834a45da0ed0..a2ff95424044 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1615,6 +1615,10 @@ static int probe_iommu_group(struct device *dev, void *data)
if (!dev_iommu_get(dev))
return -ENOMEM;
  
+	/* Device is probed already if in a group */
+	if (iommu_group_get(dev) != NULL)

Same as
if (iommu_group_get(dev))
?

By the way, do we need to put the group if the device has already been
probed?
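
A minimal sketch of a balanced variant, hypothetical and not part of the
posted series: take the reference only to test for a group, then drop it
before skipping the device.

	struct iommu_group *group;

	/* Device is probed already if in a group */
	group = iommu_group_get(dev);		/* takes a reference */
	if (group) {
		iommu_group_put(group);		/* balance the reference */
		return 0;
	}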

Best regards,
baolu