Re: [PATCH v10 4/4] crypto: hisilicon - register zip engine to uacce

2020-01-09 Thread zhangfei



On 2020/1/10 1:48 AM, Jonathan Cameron wrote:

On Mon, 16 Dec 2019 11:08:17 +0800
Zhangfei Gao  wrote:


Register qm to uacce framework for user crypto driver

Signed-off-by: Zhangfei Gao 
Signed-off-by: Zhou Wang 

Very nice to see how minimal the changes are.

Whilst uacce in general isn't crypto specific, as we are looking
at changes in a crypto driver, this will need a crypto Ack.

Herbert, this is about as non invasive as things can get and
provide a user space shared virtual addressing based interface.
What do you think?

 From my side, for what it's worth...

Reviewed-by: Jonathan Cameron 


Thanks Jonathan

Re: [PATCH v4 00/13] iommu: Add PASID support to Arm SMMUv3

2020-01-09 Thread Jean-Philippe Brucker
On Thu, Jan 09, 2020 at 02:41:01PM +, Will Deacon wrote:
> On Thu, Jan 09, 2020 at 03:36:18PM +0100, Jean-Philippe Brucker wrote:
> > On Thu, Dec 19, 2019 at 05:30:20PM +0100, Jean-Philippe Brucker wrote:
> > > Add support for Substream ID and PASIDs to the SMMUv3 driver. Since v3
> > > [1], I added review and tested tags where appropriate and applied the
> > > suggested changes, shown in the diff below. Thanks all!
> > > 
> > > I'm testing using the zip accelerator on the Hisilicon KunPeng920 and
> > > Zhangfei's uacce module [2]. The full SVA support, which I'll send out
> > > early next year, is available on my branch sva/zip-devel at
> > > https://jpbrucker.net/git/linux/
> > 
> > Is there anything more I should do for the PASID support? Ideally I'd like
> > to get this in v5.6 so I can focus on the rest of the SVA work and on
> > performance improvements.
> 
> Apologies, I'm just behind with review what with the timing of the new
> year. You're on the list, but I was hoping to get Robin's TCR stuff dusted
> off so that Jordan doesn't have to depend on patches languishing on the
> mailing list, and there's also the nvidia stuff to review as well.
> 
> Going as fast as I can!

No worries, I just wanted to check that it didn't slip through the cracks.

Thanks,
Jean


Re: Bug 205201 - Booting halts if Dawicontrol DC-2976 UW SCSI board installed, unless RAM size limited to 3500M

2020-01-09 Thread Christian Zigotzky
Hi All,

The SCSI cards work again. [1, 2]

Sorry for bothering you.

Thanks,
Christian

[1] 
http://forum.hyperion-entertainment.com/viewtopic.php?f=58=49603=1adf9e6d558c136c1ad4ff15c44212ba#p49599
[2] https://bugzilla.kernel.org/show_bug.cgi?id=205201


Re: [PATCH v10 2/4] uacce: add uacce driver

2020-01-09 Thread zhangfei....@foxmail.com



On 2020/1/10 1:38 AM, Jonathan Cameron wrote:

On Mon, 16 Dec 2019 11:08:15 +0800
Zhangfei Gao  wrote:


From: Kenneth Lee 

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from conventional data sharing between the CPU and an I/O
device, which shares only data content rather than addresses.
With a unified address space, the hardware and the user space of a process
can share the same virtual addresses during communication.

Uacce creates a chrdev for every registered device, and a queue is
allocated to a process when it opens the chrdev. The process can then
access the hardware resources by interacting with the queue file. By
mmapping the queue file into user space, the process can submit requests
directly to the hardware without any syscall into the kernel.
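For illustration, a user-space consumer of such a queue might look roughly
like the sketch below (the device node name and the 4 KiB mapping size are
assumptions for the example; the real names come from the registering
driver):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
            /* Opening the chrdev allocates a queue to this process. */
            int fd = open("/dev/uacce-example-0", O_RDWR);
            if (fd < 0)
                    return 1;

            /* The mmap offset selects the queue file region type;
             * offset 0 is assumed to be the MMIO (doorbell) region. */
            void *db = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
            if (db == MAP_FAILED) {
                    close(fd);
                    return 1;
            }

            /* Requests can now be written to the mapped region and
             * kicked off without any further syscall. */

            munmap(db, 4096);
            close(fd);
            return 0;
    }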

The IOMMU core only tracks mm<->device bonds at the moment, because it
only needs to handle IOTLB invalidation and PASID table entries. However
uacce needs a finer granularity since multiple queues from the same
device can be bound to an mm. When the mm exits, all bound queues must
be stopped so that the IOMMU can safely clear the PASID table entry and
reallocate the PASID.

An intermediate struct uacce_mm links uacce devices and queues.
Note that an mm may be bound to multiple devices but an uacce_mm
structure only ever belongs to a single device, because we don't need
anything more complex (if multiple devices are bound to one mm, then
we'll create one uacce_mm for each bond).

 uacce_device --+-- uacce_mm --+-- uacce_queue
                |              '-- uacce_queue
                |
                '-- uacce_mm --+-- uacce_queue
                               +-- uacce_queue
                               '-- uacce_queue
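A rough sketch of the intermediate structure implied by this diagram
(field names are illustrative, not necessarily the exact v10 definitions):

    struct uacce_mm {
            struct list_head list;          /* node in the device's mm list */
            struct list_head queues;        /* uacce_queues bound to this mm */
            struct mm_struct *mm;           /* the process address space */
            int pasid;                      /* PASID shared by these queues */
    };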

Signed-off-by: Kenneth Lee 
Signed-off-by: Zaibo Xu 
Signed-off-by: Zhou Wang 
Signed-off-by: Jean-Philippe Brucker 
Signed-off-by: Zhangfei Gao 

Hi,

Two small things I'd missed previously.  Fix those and for
what it's worth

Reviewed-by: Jonathan Cameron 

Thanks Jonathan



---
  Documentation/ABI/testing/sysfs-driver-uacce |  37 ++
  drivers/misc/Kconfig |   1 +
  drivers/misc/Makefile|   1 +
  drivers/misc/uacce/Kconfig   |  13 +
  drivers/misc/uacce/Makefile  |   2 +
  drivers/misc/uacce/uacce.c   | 628 +++
  include/linux/uacce.h| 161 +++
  include/uapi/misc/uacce/uacce.h  |  38 ++
  8 files changed, 881 insertions(+)
  create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
  create mode 100644 drivers/misc/uacce/Kconfig
  create mode 100644 drivers/misc/uacce/Makefile
  create mode 100644 drivers/misc/uacce/uacce.c
  create mode 100644 include/linux/uacce.h
  create mode 100644 include/uapi/misc/uacce/uacce.h


...

+
+What:   /sys/class/uacce/<dev_name>/available_instances
+Date:   Dec 2019
+KernelVersion:  5.6
+Contact:linux-accelerat...@lists.ozlabs.org
+Description:    Available instances left of the device
+Return -ENODEV if uacce_ops get_available_instances is not provided
+

See below. It doesn't "return" it; it currently prints it.

Will update to
'unknown' if uacce_ops get_available_instances is not provided
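A minimal sketch of what the updated attribute could look like
(to_uacce_device() is a hypothetical accessor used for the example; the
real driver code may differ):

    static ssize_t available_instances_show(struct device *dev,
                                            struct device_attribute *attr,
                                            char *buf)
    {
            struct uacce_device *uacce = to_uacce_device(dev);

            /* Print "unknown" instead of failing with -ENODEV when the
             * parent driver does not implement the hook. */
            if (!uacce->ops->get_available_instances)
                    return sprintf(buf, "unknown\n");

            return sprintf(buf, "%d\n",
                           uacce->ops->get_available_instances(uacce));
    }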


...


+static int uacce_fops_mmap(struct file *filep, struct vm_area_struct *vma)
+{
+   struct uacce_queue *q = filep->private_data;
+   struct uacce_device *uacce = q->uacce;
+   struct uacce_qfile_region *qfr;
+   enum uacce_qfrt type = UACCE_MAX_REGION;
+   int ret = 0;
+
+   if (vma->vm_pgoff < UACCE_MAX_REGION)
+   type = vma->vm_pgoff;
+   else
+   return -EINVAL;
+
+   qfr = kzalloc(sizeof(*qfr), GFP_KERNEL);
+   if (!qfr)
+   return -ENOMEM;
+
+   vma->vm_flags |= VM_DONTCOPY | VM_DONTEXPAND | VM_WIPEONFORK;
+   vma->vm_ops = &uacce_vm_ops;
+   vma->vm_private_data = q;
+   qfr->type = type;
+
+   mutex_lock(&uacce_mutex);
+
+   if (q->state != UACCE_Q_INIT && q->state != UACCE_Q_STARTED) {
+   ret = -EINVAL;
+   goto out_with_lock;
+   }
+
+   if (q->qfrs[type]) {
+   ret = -EEXIST;
+   goto out_with_lock;
+   }
+
+   switch (type) {
+   case UACCE_QFRT_MMIO:
+   if (!uacce->ops->mmap) {
+   ret = -EINVAL;
+   goto out_with_lock;
+   }
+
+   ret = uacce->ops->mmap(q, vma, qfr);
+   if (ret)
+   goto out_with_lock;
+
+   break;
+
+   case UACCE_QFRT_DUS:
+   if (uacce->flags & UACCE_DEV_SVA) {
+   if (!uacce->ops->mmap) {
+   ret = -EINVAL;
+   goto out_with_lock;
+

Re: [PATCH v10 0/4] Add uacce module for Accelerator

2020-01-09 Thread zhangfei



On 2020/1/10 1:49 AM, Jonathan Cameron wrote:

On Mon, 16 Dec 2019 11:08:13 +0800
Zhangfei Gao  wrote:


Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from conventional data sharing between the CPU and an I/O
device, which shares data content rather than addresses.
Because the address space is unified, the hardware and the user space of
a process can share the same virtual addresses during communication.

Uacce is intended to be used with Jean-Philippe Brucker's SVA
patchset [1], which enables I/O-side page faults and PASID support.
We keep verifying against Jean's SVA patchset [2], and we also keep
verifying against Eric's SMMUv3 nested-stage patches [3].

Hi Zhangfei Gao,

Just to check my understanding...

This patch set is not dependent on either 2 or 3?

To use it on our hardware, we need 2, but the interfaces used are already
upstream, so this could move forwards in parallel.



Yes.
Patches 1 and 2 are for uacce itself.
Patches 3 and 4 are an example using uacce, which happens to be crypto.

Thanks

[PATCH 2/2] iommu/vt-d: Unnecessary to handle default identity domain

2020-01-09 Thread Lu Baolu
The iommu default domain framework has been designed to take
care of setting the identity default domain type. It's unnecessary
to handle this again in the VT-d driver. Hence, remove it.

Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel-iommu.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 6aeaad2cf8a9..e7942bc05e4e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -387,7 +387,6 @@ static int intel_iommu_superpage = 1;
 static int iommu_identity_mapping;
 static int intel_no_bounce;
 
-#define IDENTMAP_ALL   1
 #define IDENTMAP_GFX   2
 #define IDENTMAP_AZALIA4
 
@@ -3079,8 +3078,7 @@ static int device_def_domain_type(struct device *dev)
return IOMMU_DOMAIN_DMA;
}
 
-   return (iommu_identity_mapping & IDENTMAP_ALL) ?
-   IOMMU_DOMAIN_IDENTITY : 0;
+   return 0;
 }
 
 static void intel_iommu_init_qi(struct intel_iommu *iommu)
@@ -3424,9 +3422,6 @@ static int __init init_dmars(void)
iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH);
}
 
-   if (iommu_default_passthrough())
-   iommu_identity_mapping |= IDENTMAP_ALL;
-
 #ifdef CONFIG_INTEL_IOMMU_BROKEN_GFX_WA
dmar_map_gfx = 0;
 #endif
@@ -5017,7 +5012,7 @@ static int __init platform_optin_force_iommu(void)
 * map for all devices except those marked as being untrusted.
 */
if (dmar_disabled)
-   iommu_identity_mapping |= IDENTMAP_ALL;
+   iommu_set_default_passthrough(false);
 
dmar_disabled = 0;
no_iommu = 0;
-- 
2.17.1



[PATCH 1/2] iommu/vt-d: Allow devices with RMRRs to use identity domain

2020-01-09 Thread Lu Baolu
Since commit ea2447f700cab ("intel-iommu: Prevent devices with
RMRRs from being placed into SI Domain"), the Intel IOMMU driver
doesn't allow any device with a locked RMRR to use the identity
domain. This was added to fix the issue where the RMRR info for
devices being placed in and out of the identity domain gets lost.
This patch identity-maps all RMRRs when setting up the identity
domain, so that devices with RMRRs can also use it.

Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel-iommu.c | 15 ++-
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index e0bba32a60e3..6aeaad2cf8a9 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2893,10 +2893,8 @@ static int __init si_domain_init(int hw)
}
 
/*
-* Normally we use DMA domains for devices which have RMRRs. But we
-* loose this requirement for graphic and usb devices. Identity map
-* the RMRRs for graphic and USB devices so that they could use the
-* si_domain.
+* Identity map the RMRRs so that devices with RMRRs could also use
+* the si_domain.
 */
for_each_rmrr_units(rmrr) {
for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
@@ -2904,9 +2902,6 @@ static int __init si_domain_init(int hw)
unsigned long long start = rmrr->base_address;
unsigned long long end = rmrr->end_address;
 
-   if (device_is_rmrr_locked(dev))
-   continue;
-
if (WARN_ON(end < start ||
end >> agaw_to_width(si_domain->agaw)))
continue;
@@ -3045,9 +3040,6 @@ static int device_def_domain_type(struct device *dev)
if (dev_is_pci(dev)) {
struct pci_dev *pdev = to_pci_dev(dev);
 
-   if (device_is_rmrr_locked(dev))
-   return IOMMU_DOMAIN_DMA;
-
/*
 * Prevent any device marked as untrusted from getting
 * placed into the statically identity mapping domain.
@@ -3085,9 +3077,6 @@ static int device_def_domain_type(struct device *dev)
return IOMMU_DOMAIN_DMA;
} else if (pci_pcie_type(pdev) == PCI_EXP_TYPE_PCI_BRIDGE)
return IOMMU_DOMAIN_DMA;
-   } else {
-   if (device_has_rmrr(dev))
-   return IOMMU_DOMAIN_DMA;
}
 
return (iommu_identity_mapping & IDENTMAP_ALL) ?
-- 
2.17.1



Re: [PATCH v8 08/10] iommu/vt-d: Add custom allocator for IOASID

2020-01-09 Thread Lu Baolu

Hi,

On 1/10/20 6:06 AM, Jacob Pan wrote:

+static void register_pasid_allocator(struct intel_iommu *iommu)
+{
+   if (!intel_iommu_sm) {

Use sm_supported(iommu) instead.


sounds good, seems we could separate the sm code more cleanly in the
future to avoid all these checks.



Agreed.

Best regards,
baolu


Re: [PATCH v8 04/10] iommu/vt-d: Support flushing more translation cache types

2020-01-09 Thread Lu Baolu

Hi,

On 1/10/20 5:50 AM, Jacob Pan wrote:

On Thu, 19 Dec 2019 10:46:51 +0800
Lu Baolu  wrote:


Hi,

On 12/17/19 3:24 AM, Jacob Pan wrote:

When Shared Virtual Memory is exposed to a guest via vIOMMU,
scalable IOTLB invalidation may be passed down from outside IOMMU
subsystems. This patch adds invalidation functions that can be used
for additional translation cache types.

Signed-off-by: Jacob Pan
---
  drivers/iommu/dmar.c        | 46 ++++++++++++++++++++++++++++++++++
  drivers/iommu/intel-pasid.c |  3 ++-
  include/linux/intel-iommu.h | 21 +++++++++++++++++
  3 files changed, 65 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 3acfa6a25fa2..f2f5d75da94a 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1348,6 +1348,20 @@ void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
   qi_submit_sync(&desc, iommu);
   }
   
+/* PASID-based IOTLB Invalidate */

+void qi_flush_iotlb_pasid(struct intel_iommu *iommu, u16 did, u64 addr, u32 pasid,
+   unsigned int size_order, u64 granu, int ih)
+{
+   struct qi_desc desc = {.qw2 = 0, .qw3 = 0};
+
+   desc.qw0 = QI_EIOTLB_PASID(pasid) | QI_EIOTLB_DID(did) |
+   QI_EIOTLB_GRAN(granu) | QI_EIOTLB_TYPE;
+   desc.qw1 = QI_EIOTLB_ADDR(addr) | QI_EIOTLB_IH(ih) |
+   QI_EIOTLB_AM(size_order);
+
+   qi_submit_sync(&desc, iommu);
+}

There's another version of pasid-based iotlb invalidation.

https://lkml.org/lkml/2019/12/10/2128

Let's consider merging them.


Absolutely. The difference I see is that the granularity is explicit
here, since here we handle invalidation requests from the guest. Perhaps
we can look at consolidating them once this use case is supported?



Looks good to me. :-)

Best regards,
baolu


Re: [PATCH v8 02/10] iommu/vt-d: Add nested translation helper function

2020-01-09 Thread Lu Baolu

Hi Jacob,

On 1/10/20 2:39 AM, Jacob Pan wrote:

On Wed, 18 Dec 2019 10:41:53 +0800
Lu Baolu  wrote:


Hi again,

On 12/17/19 3:24 AM, Jacob Pan wrote:

+/**
+ * intel_pasid_setup_nested() - Set up PASID entry for nested translation
+ * which is used for vSVA. The first level page tables are used for
+ * GVA-GPA or GIOVA-GPA translation in the guest, second level page tables
+ *  are used for GPA-HPA translation.
+ *
+ * @iommu:  Iommu which the device belong to
+ * @dev:Device to be set up for translation
+ * @gpgd:   FLPTPTR: First Level Page translation pointer in GPA
+ * @pasid:  PASID to be programmed in the device PASID table
+ * @pasid_data: Additional PASID info from the guest bind request
+ * @domain: Domain info for setting up second level page tables
+ * @addr_width: Address width of the first level (guest)
+ */
+int intel_pasid_setup_nested(struct intel_iommu *iommu,
+   struct device *dev, pgd_t *gpgd,
+   int pasid, struct iommu_gpasid_bind_data_vtd *pasid_data,
+   struct dmar_domain *domain,
+   int addr_width)
+{
+   struct pasid_entry *pte;
+   struct dma_pte *pgd;
+   u64 pgd_val;
+   int agaw;
+   u16 did;
+
+   if (!ecap_nest(iommu->ecap)) {
+   pr_err("IOMMU: %s: No nested translation support\n",
+  iommu->name);
+   return -EINVAL;
+   }
+
+   pte = intel_pasid_get_entry(dev, pasid);
+   if (WARN_ON(!pte))
+   return -EINVAL;
+
+   pasid_clear_entry(pte);


In some cases, e.g. nested mode for GIOVA-HPA, the PASID entry might
have already been setup for second level translation. (This could be
checked with the Present bit.) Hence, it's safe to flush caches here.

Or, maybe intel_pasid_tear_down_entry() is more suitable?


We don't allow binding the same device-PASID twice, so if the PASID
entry was used for GIOVA/RID2PASID, it should be unbound first, with
the teardown flush included, right?



Fair enough. Can you please add this as a comment to this function, so
that callers of this interface are aware of it? Or add a check in this
function that returns an error if the PASID entry has already been bound.
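The suggested check might look roughly like this (pasid_pte_is_present()
stands in for whatever helper tests the Present bit of the entry):

    pte = intel_pasid_get_entry(dev, pasid);
    if (WARN_ON(!pte))
            return -EINVAL;

    /* Reject a PASID entry that is already set up, i.e. already bound. */
    if (pasid_pte_is_present(pte))
            return -EBUSY;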

Best regards,
baolu


Re: [PATCH v2 3/5] PCI: Introduce direct dma alias

2020-01-09 Thread Derrick, Jonathan
On Thu, 2020-01-09 at 17:11 -0600, Bjorn Helgaas wrote:
> In subject:
> s/Introduce direct dma alias/Add pci_direct_dma_alias()/
> 
> On Thu, Jan 09, 2020 at 07:30:54AM -0700, Jon Derrick wrote:
> > The current dma alias implementation requires the aliased device be on
> > the same bus as the dma parent. This introduces an arch-specific
> > mechanism to point to an arbitrary struct device when doing mapping and
> > pci alias search.
> 
> "arbitrary struct device" is a little weird since an arbitrary device
> doesn't have to be a PCI device, but these mappings and aliases only
> make sense in the PCI domain.
> 
> Maybe it has something to do with pci_sysdata.vmd_dev being a
> "struct device *" rather than a "struct pci_dev *"?  I don't know why
> that is, because it looks like every place you use it, you use
> to_pci_dev() to get the pci_dev pointer back anyway.  But I assume you
> have some good reason for that.
No particular reason other than to align with the suggestion in the
last set to use the struct device. It does make sense to reference
the struct device, as that provides the DMA context; however, as you
have pointed out, the implementation here really needs the device's
pci_dev.

> 
> s/dma/DMA/
> s/pci/PCI/
> (above and also in code comments below)
> 
> > 


Re: [PATCH v2 3/5] PCI: Introduce direct dma alias

2020-01-09 Thread Bjorn Helgaas
In subject:
s/Introduce direct dma alias/Add pci_direct_dma_alias()/

On Thu, Jan 09, 2020 at 07:30:54AM -0700, Jon Derrick wrote:
> The current dma alias implementation requires the aliased device be on
> the same bus as the dma parent. This introduces an arch-specific
> mechanism to point to an arbitrary struct device when doing mapping and
> pci alias search.

"arbitrary struct device" is a little weird since an arbitrary device
doesn't have to be a PCI device, but these mappings and aliases only
make sense in the PCI domain.

Maybe it has something to do with pci_sysdata.vmd_dev being a
"struct device *" rather than a "struct pci_dev *"?  I don't know why
that is, because it looks like every place you use it, you use
to_pci_dev() to get the pci_dev pointer back anyway.  But I assume you
have some good reason for that.

s/dma/DMA/
s/pci/PCI/
(above and also in code comments below)

> Signed-off-by: Jon Derrick 
> ---
>  arch/x86/pci/common.c |  7 +++
>  drivers/pci/pci.c | 17 -
>  drivers/pci/search.c  |  9 +
>  include/linux/pci.h   |  1 +
>  4 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
> index 1e59df0..565cc17 100644
> --- a/arch/x86/pci/common.c
> +++ b/arch/x86/pci/common.c
> @@ -736,3 +736,10 @@ int pci_ext_cfg_avail(void)
>   else
>   return 0;
>  }
> +
> +#if IS_ENABLED(CONFIG_VMD)
> +struct device *pci_direct_dma_alias(struct pci_dev *dev)
> +{
> + return to_pci_sysdata(dev->bus)->vmd_dev;
> +}
> +#endif
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index ad746d9..e4269e9 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> > @@ -6034,7 +6034,9 @@ bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2)
>   return (dev1->dma_alias_mask &&
>   test_bit(dev2->devfn, dev1->dma_alias_mask)) ||
>  (dev2->dma_alias_mask &&
> - test_bit(dev1->devfn, dev2->dma_alias_mask));
> + test_bit(dev1->devfn, dev2->dma_alias_mask)) ||
> > +(pci_direct_dma_alias(dev1) == &dev2->dev) ||
> > +(pci_direct_dma_alias(dev2) == &dev1->dev);
>  }
>  
>  bool pci_device_is_present(struct pci_dev *pdev)
> @@ -6058,6 +6060,19 @@ void pci_ignore_hotplug(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL_GPL(pci_ignore_hotplug);
>  
> +/**
> + * pci_direct_dma_alias - Get dma alias for pci device
> + * @dev: the PCI device that may have a dma alias
> + *
> + * Permits the platform to provide architecture-specific functionality to
> + * devices needing to alias dma to another device. This is the default
> + * implementation. Architecture implementations can override this.
> + */
> +struct device __weak *pci_direct_dma_alias(struct pci_dev *dev)
> +{
> + return NULL;
> +}
> +
>  resource_size_t __weak pcibios_default_alignment(void)
>  {
>   return 0;
> diff --git a/drivers/pci/search.c b/drivers/pci/search.c
> index bade140..6d61209 100644
> --- a/drivers/pci/search.c
> +++ b/drivers/pci/search.c
> @@ -32,6 +32,15 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
>   struct pci_bus *bus;
>   int ret;
>  
> + if (unlikely(pci_direct_dma_alias(pdev))) {
> + struct device *dev = pci_direct_dma_alias(pdev);
> +
> + if (dev_is_pci(dev))
> + pdev = to_pci_dev(dev);
> + return fn(pdev, PCI_DEVID(pdev->bus->number, pdev->devfn),
> +   data);
> + }
> +
>   ret = fn(pdev, pci_dev_id(pdev), data);
>   if (ret)
>   return ret;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index c393dff..82494d3 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> > @@ -1202,6 +1202,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>  int pci_select_bars(struct pci_dev *dev, unsigned long flags);
>  bool pci_device_is_present(struct pci_dev *pdev);
>  void pci_ignore_hotplug(struct pci_dev *dev);
> +struct device *pci_direct_dma_alias(struct pci_dev *dev);
>  
>  int __printf(6, 7) pci_request_irq(struct pci_dev *dev, unsigned int nr,
>   irq_handler_t handler, irq_handler_t thread_fn, void *dev_id,
> -- 
> 1.8.3.1
> 


Re: [PATCH v8 08/10] iommu/vt-d: Add custom allocator for IOASID

2020-01-09 Thread Jacob Pan
On Wed, 18 Dec 2019 12:10:55 +0800
Lu Baolu  wrote:

> Hi,
> 
> On 12/17/19 3:24 AM, Jacob Pan wrote:
> > When VT-d driver runs in the guest, PASID allocation must be
> > performed via virtual command interface. This patch registers a
> > custom IOASID allocator which takes precedence over the default
> > XArray based allocator. The resulting IOASID allocation will always
> > come from the host. This ensures that PASID namespace is system-
> > wide.
> > 
> > Signed-off-by: Lu Baolu 
> > Signed-off-by: Liu, Yi L 
> > Signed-off-by: Jacob Pan 
> > ---
> >   drivers/iommu/intel-iommu.c | 75 +++++++++++++++++++++++++++++++++
> >   include/linux/intel-iommu.h |  2 ++
> >   2 files changed, 77 insertions(+)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index e90102c7540d..b0c0bb6f740e 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -1700,6 +1700,9 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
> >     if (ecap_prs(iommu->ecap))
> >             intel_svm_finish_prq(iommu);
> >     }
> > +   if (ecap_vcs(iommu->ecap) && vccap_pasid(iommu->vccap))
> > +           ioasid_unregister_allocator(&iommu->pasid_allocator);
> > +
> >   #endif
> >   }
> >   
> > @@ -3181,6 +3184,75 @@ static int copy_translation_tables(struct intel_iommu *iommu)
> >     return ret;
> >   }
> >   
> > +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +static ioasid_t intel_ioasid_alloc(ioasid_t min, ioasid_t max, void *data)
> > +{
> > +   struct intel_iommu *iommu = data;
> > +   ioasid_t ioasid;
> > +  
> 
> Check !iommu just like the free api?
> 
sounds good, will return INVALID_IOASID if NULL.
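The agreed change would presumably mirror the free path, roughly:

    static ioasid_t intel_ioasid_alloc(ioasid_t min, ioasid_t max, void *data)
    {
            struct intel_iommu *iommu = data;
            ioasid_t ioasid;

            /* New: mirror the NULL check done in intel_ioasid_free(). */
            if (!iommu)
                    return INVALID_IOASID;

            if (min < PASID_MIN || max > intel_pasid_max_id)
                    return INVALID_IOASID;

            if (vcmd_alloc_pasid(iommu, &ioasid))
                    return INVALID_IOASID;

            return ioasid;
    }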

> > +   /*
> > +* VT-d virtual command interface always uses the full 20 bit
> > +* PASID range. Host can partition guest PASID range based on
> > +* policies but it is out of guest's control.
> > +*/
> > +   if (min < PASID_MIN || max > intel_pasid_max_id)
> > +   return INVALID_IOASID;
> > +
> > +   if (vcmd_alloc_pasid(iommu, &ioasid))
> > +   return INVALID_IOASID;
> > +
> > +   return ioasid;
> > +}
> > +
> > +static void intel_ioasid_free(ioasid_t ioasid, void *data)
> > +{
> > +   struct intel_iommu *iommu = data;
> > +
> > +   if (!iommu)
> > +   return;
> > +   /*
> > +* Sanity check the ioasid owner is done at upper layer, e.g. VFIO
> > +* We can only free the PASID when all the devices are unbound.
> > +*/
> > +   if (ioasid_find(NULL, ioasid, NULL)) {
> > +   pr_alert("Cannot free active IOASID %d\n", ioasid);
> > +   return;
> > +   }
> > +   vcmd_free_pasid(iommu, ioasid);
> > +}
> > +
> > +static void register_pasid_allocator(struct intel_iommu *iommu)
> > +{
> > +   if (!intel_iommu_sm) {  
> 
> Use sm_supported(iommu) instead.
> 
sounds good, seems we could separate the sm code more cleanly in the
future to avoid all these checks.

> > +   pr_warn("VT-d scalable mode not enabled\n");
> > +   return;
> > +   }
> > +
> > +   /*
> > +* Register a custom PASID allocator if we are running in a guest,
> > +* guest PASID must be obtained via virtual command interface.
> > +* There can be multiple vIOMMUs in each guest but only one allocator
> > +* is active. All vIOMMU allocators will eventually be calling the same
> > +* host allocator.
> > +*/
> > +   if (ecap_vcs(iommu->ecap) && vccap_pasid(iommu->vccap)) {
> > +   pr_info("Register custom PASID allocator\n");
> > +   iommu->pasid_allocator.alloc = intel_ioasid_alloc;
> > +   iommu->pasid_allocator.free = intel_ioasid_free;
> > +   iommu->pasid_allocator.pdata = (void *)iommu;
> > +   if (!ioasid_register_allocator(&iommu->pasid_allocator)) {
> > +   pr_warn("Custom PASID allocator failed, scalable mode disabled\n");
> > +   /*
> > +* Disable scalable mode on this IOMMU if there
> > +* is no custom allocator. Mixing SM capable vIOMMU
> > +* and non-SM vIOMMU are not supported.
> > +*/
> > +   intel_iommu_sm = 0;
> > +   }
> > +   }
> > +}
> > +#endif
> > +
> >   static int __init init_dmars(void)
> >   {
> > struct dmar_drhd_unit *drhd;
> > @@ -3298,6 +3370,9 @@ static int __init init_dmars(void)
> >  */
> > for_each_active_iommu(iommu, drhd) {
> > iommu_flush_write_buffer(iommu);
> > +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +   register_pasid_allocator(iommu);
> > +#endif
> > iommu_set_root_entry(iommu);
> > iommu->flush.flush_context(iommu, 0, 0, 0, DMA_CCMD_GLOBAL_INVL);
> > iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH);
> > diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> > index 1e11560b0e59..8c30b23bd838 100644
> > --- a/include/linux/intel-iommu.h
> > +++ b/include/linux/intel-iommu.h
> > @@ -19,6 +19,7 @@

Re: [PATCH v8 06/10] iommu/vt-d: Cache virtual command capability register

2020-01-09 Thread Jacob Pan
On Wed, 18 Dec 2019 11:25:27 +0800
Lu Baolu  wrote:

> Hi,
> 
> On 12/17/19 3:24 AM, Jacob Pan wrote:
> > Virtual command registers are used in the guest only, to prevent
> > vmexit cost, we cache the capability and store it during
> > initialization.
> > 
> > Signed-off-by: Jacob Pan 
> > ---
> >   drivers/iommu/dmar.c| 1 +
> >   include/linux/intel-iommu.h | 4 
> >   2 files changed, 5 insertions(+)
> > 
> > diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> > index f2f5d75da94a..3f98dd9ad004 100644
> > --- a/drivers/iommu/dmar.c
> > +++ b/drivers/iommu/dmar.c
> > @@ -953,6 +953,7 @@ static int map_iommu(struct intel_iommu *iommu, u64 phys_addr)
> >     warn_invalid_dmar(phys_addr, " returns all ones");
> > goto unmap;
> > }
> > +   iommu->vccap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
> >   
> > /* the registers might be more than one page */
> > map_size = max_t(int, ecap_max_iotlb_offset(iommu->ecap),
> > diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> > index ee26989df008..4d25141ec3df 100644
> > --- a/include/linux/intel-iommu.h
> > +++ b/include/linux/intel-iommu.h
> > @@ -189,6 +189,9 @@
> >   #define ecap_max_handle_mask(e) ((e >> 20) & 0xf)
> >   #define ecap_sc_support(e)    ((e >> 7) & 0x1) /* Snooping Control */
> > 
> > +/* Virtual command interface capabilities */
> > +#define vccap_pasid(v) ((v & DMA_VCS_PAS)) /* PASID allocation */
> 
> Has DMA_VCS_PAS ever been defined?
> 
Good catch. It is defined in the next patch; I need to move the #define here.
Thanks!
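For reference, the definition being moved is a single bit in
DMAR_VCCAP_REG (as it appears in the next patch of the series):

    /* PASID allocation support in the virtual command capability register */
    #define DMA_VCS_PAS     ((u64)1)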

> Best regards,
> baolu
> 
>  [...]  

[Jacob Pan]


Re: [PATCH v8 04/10] iommu/vt-d: Support flushing more translation cache types

2020-01-09 Thread Jacob Pan
On Thu, 19 Dec 2019 10:46:51 +0800
Lu Baolu  wrote:

> Hi,
> 
> On 12/17/19 3:24 AM, Jacob Pan wrote:
> > When Shared Virtual Memory is exposed to a guest via vIOMMU,
> > scalable IOTLB invalidation may be passed down from outside IOMMU
> > subsystems. This patch adds invalidation functions that can be used
> > for additional translation cache types.
> > 
> > Signed-off-by: Jacob Pan
> > ---
> >   drivers/iommu/dmar.c        | 46 ++++++++++++++++++++++++++++++++++
> >   drivers/iommu/intel-pasid.c |  3 ++-
> >   include/linux/intel-iommu.h | 21 +++++++++++++++++
> >   3 files changed, 65 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> > index 3acfa6a25fa2..f2f5d75da94a 100644
> > --- a/drivers/iommu/dmar.c
> > +++ b/drivers/iommu/dmar.c
> > @@ -1348,6 +1348,20 @@ void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
> >     qi_submit_sync(&desc, iommu);
> >   }
> >   
> > +/* PASID-based IOTLB Invalidate */
> > +void qi_flush_iotlb_pasid(struct intel_iommu *iommu, u16 did, u64 addr, u32 pasid,
> > +   unsigned int size_order, u64 granu, int ih)
> > +{
> > +   struct qi_desc desc = {.qw2 = 0, .qw3 = 0};
> > +
> > +   desc.qw0 = QI_EIOTLB_PASID(pasid) | QI_EIOTLB_DID(did) |
> > +   QI_EIOTLB_GRAN(granu) | QI_EIOTLB_TYPE;
> > +   desc.qw1 = QI_EIOTLB_ADDR(addr) | QI_EIOTLB_IH(ih) |
> > +   QI_EIOTLB_AM(size_order);
> > +
> > +   qi_submit_sync(&desc, iommu);
> > +}  
> 
> There's another version of pasid-based iotlb invalidation.
> 
> https://lkml.org/lkml/2019/12/10/2128
> 
> Let's consider merging them.
> 
Absolutely. The difference I see is that the granularity is explicit
here, since here we handle invalidation requests from the guest. Perhaps
we can look at consolidating them once this use case is supported?

> Best regards,
> baolu

[Jacob Pan]


Re: [PATCH v8 03/10] iommu/vt-d: Add bind guest PASID support

2020-01-09 Thread Jacob Pan
On Wed, 18 Dec 2019 11:14:59 +0800
Lu Baolu  wrote:

> Hi,
> 
> On 12/17/19 3:24 AM, Jacob Pan wrote:
> > When supporting guest SVA with emulated IOMMU, the guest PASID
> > table is shadowed in VMM. Updates to guest vIOMMU PASID table
> > will result in PASID cache flush which will be passed down to
> > the host as bind guest PASID calls.
> > 
> > For the SL page tables, it will be harvested from device's
> > default domain (request w/o PASID), or aux domain in case of
> > mediated device.
> > 
> >  .-.  .---.
> >  |   vIOMMU|  | Guest process CR3, FL only|
> >  | |  '---'
> >  ./
> >  | PASID Entry |--- PASID cache flush -
> >  '-'   |
> >  | |   V
> >  | |CR3 in GPA
> >  '-'
> > Guest
> > --| Shadow |--|
> >vv  v
> > Host
> >  .-.  .--.
> >  |   pIOMMU|  | Bind FL for GVA-GPA  |
> >  | |  '--'
> >  ./  |
> >  | PASID Entry | V (Nested xlate)
> >  '\.--.
> >  | |   |SL for GPA-HPA, default domain|
> >  | |   '--'
> >  '-'
> > Where:
> >   - FL = First level/stage one page tables
> >   - SL = Second level/stage two page tables
> > 
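To make the bind call concrete, a hypothetical VMM-side caller might fill
the bind data along these lines (field names follow the proposed uapi;
the values are purely illustrative):

    struct iommu_gpasid_bind_data data = {
            .version    = IOMMU_GPASID_BIND_VERSION_1,
            .format     = IOMMU_PASID_FORMAT_INTEL_VTD,
            .gpgd       = guest_cr3_gpa,    /* FL table root, in GPA */
            .hpasid     = host_pasid,       /* from the host allocator */
            .gpasid     = guest_pasid,
            .addr_width = 48,               /* 4-level first level */
    };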
> > Signed-off-by: Jacob Pan 
> > Signed-off-by: Liu, Yi L 
> > ---
> >   drivers/iommu/intel-iommu.c |   4 +
> >   drivers/iommu/intel-svm.c   | 214 ++++++++++++++++++++++++++++++++
> >   include/linux/intel-iommu.h |   8 +-
> >   include/linux/intel-svm.h   |  17 ++++
> >   4 files changed, 242 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index cc89791d807c..304654dbc622 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -5993,6 +5993,10 @@ const struct iommu_ops intel_iommu_ops = {
> > .dev_disable_feat   = intel_iommu_dev_disable_feat,
> >     .is_attach_deferred     = intel_iommu_is_attach_deferred,
> >     .pgsize_bitmap          = INTEL_IOMMU_PGSIZES,
> > +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +   .sva_bind_gpasid= intel_svm_bind_gpasid,
> > +   .sva_unbind_gpasid  = intel_svm_unbind_gpasid,
> > +#endif
> >   };
> >   
> >   static void quirk_iommu_igfx(struct pci_dev *dev)
> > diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> > index 0fcbe631cd5f..f580b7be63c5 100644
> > --- a/drivers/iommu/intel-svm.c
> > +++ b/drivers/iommu/intel-svm.c
> > @@ -230,6 +230,220 @@ static LIST_HEAD(global_svm_list);
> > list_for_each_entry((sdev), &(svm)->devs, list) \
> > if ((d) != (sdev)->dev) {} else
> >   
> > +int intel_svm_bind_gpasid(struct iommu_domain *domain,
> > +   struct device *dev,
> > +   struct iommu_gpasid_bind_data *data)
> > +{
> > +   struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> > +   struct dmar_domain *ddomain;
> > +   struct intel_svm_dev *sdev;
> > +   struct intel_svm *svm;
> > +   int ret = 0;
> > +
> > +   if (WARN_ON(!iommu) || !data)
> > +   return -EINVAL;
> > +
> > +   if (data->version != IOMMU_GPASID_BIND_VERSION_1 ||
> > +   data->format != IOMMU_PASID_FORMAT_INTEL_VTD)
> > +   return -EINVAL;
> > +
> > +   if (dev_is_pci(dev)) {
> > +   /* VT-d supports devices with full 20 bit PASIDs only */
> > +   if (pci_max_pasids(to_pci_dev(dev)) != PASID_MAX)
> > +   return -EINVAL;
> > +   } else {
> > +   return -ENOTSUPP;
> > +   }
> > +
> > +   /*
> > +* We only check host PASID range, we have no knowledge to check
> > +* guest PASID range nor do we use the guest PASID.
> > +*/
> > +   if (data->hpasid <= 0 || data->hpasid >= PASID_MAX)
> > +   return -EINVAL;
> > +
> > +   ddomain = to_dmar_domain(domain);
> > +
> > +   /* Sanity check paging mode support match between host and guest */
> > +   if (data->addr_width == ADDR_WIDTH_5LEVEL &&
> > +   !cap_5lp_support(iommu->cap)) {
> > +   pr_err("Cannot support 5 level paging requested by guest!\n");
> > +   return -EINVAL;
> > +   }
> > +
> > +   mutex_lock(&pasid_mutex);
> > +   svm = ioasid_find(NULL, data->hpasid, NULL);
> > +   if (IS_ERR(svm)) {
> > +   ret = PTR_ERR(svm);
> > +   goto out;
> > +   }
> > +
> > +   if (svm) {
> > +   /*
> > +* If we found svm for the PASID, there must be at
> > +* least one device bond, otherwise svm should be freed.
> > +*/
> > +   if (WARN_ON(list_empty(&svm->devs)))
> > +   return -EINVAL;
> > +
> > +   if (svm->mm == 

[PATCH v2 1/5] x86/pci: Add a to_pci_sysdata helper

2020-01-09 Thread Jon Derrick
From: Christoph Hellwig 


Various helpers need the pci_sysdata just to dereference a single field
in it.  Add a little helper that returns the properly typed sysdata
pointer to require a little less boilerplate code.

Signed-off-by: Christoph Hellwig 
[jonathan.derrick: added un-const cast]
Signed-off-by: Jon Derrick 
---
 arch/x86/include/asm/pci.h | 28 +---
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index 90d0731..cf680c5 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -35,12 +35,15 @@ struct pci_sysdata {
 
 #ifdef CONFIG_PCI
 
+static inline struct pci_sysdata *to_pci_sysdata(struct pci_bus *bus)
+{
+   return bus->sysdata;
+}
+
 #ifdef CONFIG_PCI_DOMAINS
 static inline int pci_domain_nr(struct pci_bus *bus)
 {
-   struct pci_sysdata *sd = bus->sysdata;
-
-   return sd->domain;
+   return to_pci_sysdata(bus)->domain;
 }
 
 static inline int pci_proc_domain(struct pci_bus *bus)
@@ -52,23 +55,20 @@ static inline int pci_proc_domain(struct pci_bus *bus)
 #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
 static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
 {
-   struct pci_sysdata *sd = bus->sysdata;
-
-   return sd->fwnode;
+   return to_pci_sysdata(bus)->fwnode;
 }
 
 #define pci_root_bus_fwnode    _pci_root_bus_fwnode
 #endif
 
+#if IS_ENABLED(CONFIG_VMD)
 static inline bool is_vmd(struct pci_bus *bus)
 {
-#if IS_ENABLED(CONFIG_VMD)
-   struct pci_sysdata *sd = bus->sysdata;
-
-   return sd->vmd_domain;
+   return to_pci_sysdata(bus)->vmd_domain;
+}
 #else
-   return false;
-#endif
+#define is_vmd(bus)    false
+#endif /* CONFIG_VMD */
 }
 
 /* Can be used to override the logic in pci_scan_bus for skipping
@@ -124,9 +124,7 @@ static inline void early_quirks(void) { }
 /* Returns the node based on pci bus */
 static inline int __pcibus_to_node(const struct pci_bus *bus)
 {
-   const struct pci_sysdata *sd = bus->sysdata;
-
-   return sd->node;
+   return to_pci_sysdata((struct pci_bus *) bus)->node;
 }
 
 static inline const struct cpumask *
-- 
1.8.3.1



[PATCH v2 4/5] PCI: vmd: Stop overriding dma_map_ops

2020-01-09 Thread Jon Derrick
Devices on the VMD domain use the VMD endpoint's requester-id and have
been relying on the VMD endpoint's dma operations. The downside of this
was that VMD domain devices would use the VMD endpoint's attributes when
doing DMA and IOMMU mapping. We can be smarter about this by only using
the VMD endpoint when mapping and providing the correct child device's
attributes during dma operations.

This patch modifies intel-iommu to check for a 'direct dma alias' and
refer to it for mapping.

Signed-off-by: Jon Derrick 
---
 drivers/iommu/intel-iommu.c|  17 +++--
 drivers/pci/controller/Kconfig |   1 -
 drivers/pci/controller/vmd.c   | 150 -
 3 files changed, 12 insertions(+), 156 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index b2526a4..c812594 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -773,14 +773,13 @@ static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn)
 
if (dev_is_pci(dev)) {
struct pci_dev *pf_pdev;
+   struct device *dma_alias;
 
pdev = to_pci_dev(dev);
 
-#ifdef CONFIG_X86
-   /* VMD child devices currently cannot be handled individually */
-   if (is_vmd(pdev->bus))
-   return NULL;
-#endif
+   dma_alias = pci_direct_dma_alias(pdev);
+   if (dma_alias && dev_is_pci(dma_alias))
+   pdev = to_pci_dev(dma_alias);
 
/* VFs aren't listed in scope tables; we need to look up
 * the PF instead to find the IOMMU. */
@@ -2428,6 +2427,14 @@ static struct dmar_domain *find_domain(struct device *dev)
 dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO))
return NULL;
 
+   if (dev_is_pci(dev)) {
+   struct pci_dev *pdev = to_pci_dev(dev);
+   struct device *dma_alias = pci_direct_dma_alias(pdev);
+
+   if (dma_alias)
+   dev = dma_alias;
+   }
+
/* No lock here, assumes no domain exit in normal case */
info = dev->archdata.iommu;
if (likely(info))
diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index c77069c..55671429 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -239,7 +239,6 @@ config PCIE_TANGO_SMP8759
 
 config VMD
depends on PCI_MSI && X86_64 && SRCU
-   select X86_DEV_DMA_OPS
tristate "Intel Volume Management Device Driver"
---help---
  Adds support for the Intel Volume Management Device (VMD). VMD is a
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 907b5bd..5824a39 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -98,9 +98,6 @@ struct vmd_dev {
struct irq_domain   *irq_domain;
struct pci_bus  *bus;
u8  busn_start;
-
-   struct dma_map_ops  dma_ops;
-   struct dma_domain   dma_domain;
 };
 
 static inline struct vmd_dev *vmd_from_bus(struct pci_bus *bus)
@@ -295,151 +292,6 @@ static void vmd_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
    .chip   = &vmd_msi_controller,
 };
 
-/*
- * VMD replaces the requester ID with its own.  DMA mappings for devices in a
- * VMD domain need to be mapped for the VMD, not the device requiring
- * the mapping.
- */
-static struct device *to_vmd_dev(struct device *dev)
-{
-   struct pci_dev *pdev = to_pci_dev(dev);
-   struct vmd_dev *vmd = vmd_from_bus(pdev->bus);
-
-   return &vmd->dev->dev;
-}
-
-static void *vmd_alloc(struct device *dev, size_t size, dma_addr_t *addr,
-  gfp_t flag, unsigned long attrs)
-{
-   return dma_alloc_attrs(to_vmd_dev(dev), size, addr, flag, attrs);
-}
-
-static void vmd_free(struct device *dev, size_t size, void *vaddr,
-dma_addr_t addr, unsigned long attrs)
-{
-   return dma_free_attrs(to_vmd_dev(dev), size, vaddr, addr, attrs);
-}
-
-static int vmd_mmap(struct device *dev, struct vm_area_struct *vma,
-   void *cpu_addr, dma_addr_t addr, size_t size,
-   unsigned long attrs)
-{
-   return dma_mmap_attrs(to_vmd_dev(dev), vma, cpu_addr, addr, size,
-   attrs);
-}
-
-static int vmd_get_sgtable(struct device *dev, struct sg_table *sgt,
-  void *cpu_addr, dma_addr_t addr, size_t size,
-  unsigned long attrs)
-{
-   return dma_get_sgtable_attrs(to_vmd_dev(dev), sgt, cpu_addr, addr, size,
-   attrs);
-}
-
-static dma_addr_t vmd_map_page(struct device *dev, struct page *page,
-  unsigned long offset, size_t size,
-  enum dma_data_direction dir,
-  unsigned long attrs)
-{
-   return 

[PATCH v2 0/5] Clean up VMD DMA Map Ops

2020-01-09 Thread Jon Derrick
v1 Set: 
https://lore.kernel.org/linux-iommu/20200107134125.gd30...@8bytes.org/T/#t

This revision introduces a new weak function to reference the DMA alias,
and uses the struct device rather than the pci_dev. By using the weak
function in this manner, we remove the need to add a pointer to struct
device or pci_dev. Weak functions are generally frowned upon when there
is only a single architecture implementation, so I am open to alternatives.
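For readers less familiar with the mechanism: a weak symbol provides a
default that the linker silently replaces whenever a strong definition
with the same signature exists. A stand-alone sketch of the pattern (not
the VMD code itself):

    /* drivers/pci/pci.c: weak default, used when no arch override exists */
    struct device * __attribute__((weak)) pci_direct_dma_alias(struct pci_dev *pdev)
    {
            return NULL;    /* no direct DMA alias by default */
    }

    /* arch/x86/pci/common.c: strong definition with the same signature;
     * the linker picks this one over the weak default */
    struct device *pci_direct_dma_alias(struct pci_dev *pdev)
    {
            return to_pci_sysdata(pdev->bus)->vmd_dev;
    }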

v1 Blurb:
VMD currently works with VT-d enabled by pointing DMA and IOMMU actions at the
VMD endpoint. The problem with this approach is that the VMD endpoint's
device-specific attributes, such as the dma mask, are used instead.

This set cleans up VMD by removing the override that redirects dma map
operations to the VMD endpoint. Instead it introduces a new dma alias mechanism
into the existing dma alias infrastructure.

Changes from v1:
Removed 1/5 & 2/5 misc fix patches that were merged
Uses Christoph's staging/cleanup patches
Introduce weak function rather than including pointer in struct device or 
pci_dev.

Based on Joerg's next:
https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/

Christoph Hellwig (2):
  x86/pci: Add a to_pci_sysdata helper
  x86/pci: Replace the vmd_domain field with a vmd_dev pointer

Jon Derrick (3):
  PCI: Introduce direct dma alias
  PCI: vmd: Stop overriding dma_map_ops
  x86/pci: Remove X86_DEV_DMA_OPS

 arch/x86/Kconfig   |   3 -
 arch/x86/include/asm/device.h  |  10 ---
 arch/x86/include/asm/pci.h |  31 -
 arch/x86/pci/common.c  |  45 ++--
 drivers/iommu/intel-iommu.c|  17 +++--
 drivers/pci/controller/Kconfig |   1 -
 drivers/pci/controller/vmd.c   | 152 +
 drivers/pci/pci.c  |  17 -
 drivers/pci/search.c   |   9 +++
 include/linux/pci.h|   1 +
 10 files changed, 60 insertions(+), 226 deletions(-)

-- 
1.8.3.1



[PATCH v2 2/5] x86/pci: Replace the vmd_domain field with a vmd_dev pointer

2020-01-09 Thread Jon Derrick
From: Christoph Hellwig 


Store the actual VMD device in struct pci_sysdata, so that we can later
use it directly for DMA mappings.

Reviewed-by: Jon Derrick 
Signed-off-by: Christoph Hellwig 
---
 arch/x86/include/asm/pci.h   | 5 ++---
 drivers/pci/controller/vmd.c | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
index cf680c5..7c6c7d7 100644
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -25,7 +25,7 @@ struct pci_sysdata {
void*fwnode;/* IRQ domain for MSI assignment */
 #endif
 #if IS_ENABLED(CONFIG_VMD)
-   bool vmd_domain;/* True if in Intel VMD domain */
+   struct device   *vmd_dev;   /* Main device if in Intel VMD domain */
 #endif
 };
 
@@ -64,12 +64,11 @@ static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
 #if IS_ENABLED(CONFIG_VMD)
 static inline bool is_vmd(struct pci_bus *bus)
 {
-   return to_pci_sysdata(bus)->vmd_domain;
+   return to_pci_sysdata(bus)->vmd_dev != NULL;
 }
 #else
 #define is_vmd(bus)    false
 #endif /* CONFIG_VMD */
-}
 
 /* Can be used to override the logic in pci_scan_bus for skipping
already-configured bus numbers - to be used for buggy BIOSes
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 2128422..907b5bd 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -679,7 +679,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
.parent = res,
};
 
-   sd->vmd_domain = true;
+   sd->vmd_dev = &vmd->dev->dev;
sd->domain = vmd_find_free_domain();
if (sd->domain < 0)
return sd->domain;
-- 
1.8.3.1



[PATCH v2 5/5] x86/pci: Remove X86_DEV_DMA_OPS

2020-01-09 Thread Jon Derrick
From: Christoph Hellwig 

There are no users of X86_DEV_DMA_OPS left, so remove the code.

Reviewed-by: Jon Derrick 
Signed-off-by: Christoph Hellwig 
---
 arch/x86/Kconfig  |  3 ---
 arch/x86/include/asm/device.h | 10 --
 arch/x86/pci/common.c | 38 --
 3 files changed, 51 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5e89499..77f9426 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2955,9 +2955,6 @@ config HAVE_ATOMIC_IOMAP
def_bool y
depends on X86_32
 
-config X86_DEV_DMA_OPS
-   bool
-
 source "drivers/firmware/Kconfig"
 
 source "arch/x86/kvm/Kconfig"
diff --git a/arch/x86/include/asm/device.h b/arch/x86/include/asm/device.h
index 5e12c63..7e31f7f 100644
--- a/arch/x86/include/asm/device.h
+++ b/arch/x86/include/asm/device.h
@@ -8,16 +8,6 @@ struct dev_archdata {
 #endif
 };
 
-#if defined(CONFIG_X86_DEV_DMA_OPS) && defined(CONFIG_PCI_DOMAINS)
-struct dma_domain {
-   struct list_head node;
-   const struct dma_map_ops *dma_ops;
-   int domain_nr;
-};
-void add_dma_domain(struct dma_domain *domain);
-void del_dma_domain(struct dma_domain *domain);
-#endif
-
 struct pdev_archdata {
 };
 
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 565cc17..5a9fb00 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -625,43 +625,6 @@ unsigned int pcibios_assign_all_busses(void)
return (pci_probe & PCI_ASSIGN_ALL_BUSSES) ? 1 : 0;
 }
 
-#if defined(CONFIG_X86_DEV_DMA_OPS) && defined(CONFIG_PCI_DOMAINS)
-static LIST_HEAD(dma_domain_list);
-static DEFINE_SPINLOCK(dma_domain_list_lock);
-
-void add_dma_domain(struct dma_domain *domain)
-{
-   spin_lock(&dma_domain_list_lock);
-   list_add(&domain->node, &dma_domain_list);
-   spin_unlock(&dma_domain_list_lock);
-}
-EXPORT_SYMBOL_GPL(add_dma_domain);
-
-void del_dma_domain(struct dma_domain *domain)
-{
-   spin_lock(&dma_domain_list_lock);
-   list_del(&domain->node);
-   spin_unlock(&dma_domain_list_lock);
-}
-EXPORT_SYMBOL_GPL(del_dma_domain);
-
-static void set_dma_domain_ops(struct pci_dev *pdev)
-{
-   struct dma_domain *domain;
-
-   spin_lock(&dma_domain_list_lock);
-   list_for_each_entry(domain, &dma_domain_list, node) {
-   if (pci_domain_nr(pdev->bus) == domain->domain_nr) {
-   pdev->dev.dma_ops = domain->dma_ops;
-   break;
-   }
-   }
-   spin_unlock(&dma_domain_list_lock);
-}
-#else
-static void set_dma_domain_ops(struct pci_dev *pdev) {}
-#endif
-
 static void set_dev_domain_options(struct pci_dev *pdev)
 {
if (is_vmd(pdev->bus))
@@ -697,7 +660,6 @@ int pcibios_add_device(struct pci_dev *dev)
pa_data = data->next;
memunmap(data);
}
-   set_dma_domain_ops(dev);
set_dev_domain_options(dev);
return 0;
 }
-- 
1.8.3.1



[PATCH v2 3/5] PCI: Introduce direct dma alias

2020-01-09 Thread Jon Derrick
The current dma alias implementation requires the aliased device be on
the same bus as the dma parent. This introduces an arch-specific
mechanism to point to an arbitrary struct device when doing mapping and
pci alias search.

Signed-off-by: Jon Derrick 
---
 arch/x86/pci/common.c |  7 +++
 drivers/pci/pci.c | 17 -
 drivers/pci/search.c  |  9 +
 include/linux/pci.h   |  1 +
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 1e59df0..565cc17 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -736,3 +736,10 @@ int pci_ext_cfg_avail(void)
else
return 0;
 }
+
+#if IS_ENABLED(CONFIG_VMD)
+struct device *pci_direct_dma_alias(struct pci_dev *dev)
+{
+   return to_pci_sysdata(dev->bus)->vmd_dev;
+}
+#endif
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ad746d9..e4269e9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6034,7 +6034,9 @@ bool pci_devs_are_dma_aliases(struct pci_dev *dev1, struct pci_dev *dev2)
return (dev1->dma_alias_mask &&
test_bit(dev2->devfn, dev1->dma_alias_mask)) ||
   (dev2->dma_alias_mask &&
-   test_bit(dev1->devfn, dev2->dma_alias_mask));
+   test_bit(dev1->devfn, dev2->dma_alias_mask)) ||
+  (pci_direct_dma_alias(dev1) == &dev2->dev) ||
+  (pci_direct_dma_alias(dev2) == &dev1->dev);
 }
 
 bool pci_device_is_present(struct pci_dev *pdev)
@@ -6058,6 +6060,19 @@ void pci_ignore_hotplug(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_ignore_hotplug);
 
+/**
+ * pci_direct_dma_alias - Get dma alias for pci device
+ * @dev: the PCI device that may have a dma alias
+ *
+ * Permits the platform to provide architecture-specific functionality to
+ * devices needing to alias dma to another device. This is the default
+ * implementation. Architecture implementations can override this.
+ */
+struct device __weak *pci_direct_dma_alias(struct pci_dev *dev)
+{
+   return NULL;
+}
+
 resource_size_t __weak pcibios_default_alignment(void)
 {
return 0;
diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index bade140..6d61209 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -32,6 +32,15 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
struct pci_bus *bus;
int ret;
 
+   if (unlikely(pci_direct_dma_alias(pdev))) {
+   struct device *dev = pci_direct_dma_alias(pdev);
+
+   if (dev_is_pci(dev))
+   pdev = to_pci_dev(dev);
+   return fn(pdev, PCI_DEVID(pdev->bus->number, pdev->devfn),
+ data);
+   }
+
ret = fn(pdev, pci_dev_id(pdev), data);
if (ret)
return ret;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c393dff..82494d3 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1202,6 +1202,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 int pci_select_bars(struct pci_dev *dev, unsigned long flags);
 bool pci_device_is_present(struct pci_dev *pdev);
 void pci_ignore_hotplug(struct pci_dev *dev);
+struct device *pci_direct_dma_alias(struct pci_dev *dev);
 
 int __printf(6, 7) pci_request_irq(struct pci_dev *dev, unsigned int nr,
irq_handler_t handler, irq_handler_t thread_fn, void *dev_id,
-- 
1.8.3.1



Re: [PATCH v3 2/5] iommu/arm-smmu: Add support for split pagetables

2020-01-09 Thread Jordan Crouse
On Thu, Jan 09, 2020 at 02:33:34PM +, Will Deacon wrote:
> On Mon, Dec 16, 2019 at 09:37:48AM -0700, Jordan Crouse wrote:
> > Add support to enable split pagetables (TTBR1) if the supporting driver
> > requests it via the DOMAIN_ATTR_SPLIT_TABLES flag. When enabled, the driver
> > will set up the TTBR0 and TTBR1 regions and program the default domain
> > pagetable on TTBR1.
> > 
> > After attaching the device, the value of he domain attribute can
> > be queried to see if the split pagetables were successfully programmed.
> > Furthermore the domain geometry will be updated so that the caller can
> > determine the active region for the pagetable that was programmed.
> > 
> > Signed-off-by: Jordan Crouse 
> > ---
> > 
> >  drivers/iommu/arm-smmu.c | 40 +++-
> >  drivers/iommu/arm-smmu.h | 45 +++--
> >  2 files changed, 74 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index c106406..7b59116 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -538,9 +538,17 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
> > cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr;
> > cb->ttbr[1] = 0;
> > } else {
> > -   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
> > -   cb->ttbr[0] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
> > -   cb->ttbr[1] = FIELD_PREP(TTBRn_ASID, cfg->asid);
> > +   if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
> > +   cb->ttbr[0] = FIELD_PREP(TTBRn_ASID, cfg->asid);
> > +   cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
> > +   cb->ttbr[1] |=
> > +   FIELD_PREP(TTBRn_ASID, cfg->asid);
> > +   } else {
> > +   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
> > +   cb->ttbr[0] |=
> > +   FIELD_PREP(TTBRn_ASID, cfg->asid);
> > +   cb->ttbr[1] = FIELD_PREP(TTBRn_ASID, cfg->asid);
> > +   }
> 
> I still don't understand why you have to set the ASID in both of the TTBRs.
> Assuming TCR.A1 is clear, then we should only need to set the field in
> TTBR0.

This is mostly out of a sense of symmetry with the non-split configuration. I'll
clean it up.
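If TCR.A1 stays clear, the hardware takes the ASID from TTBR0 only, so
the cleaned-up TTBR1 case could plausibly reduce to something like (a
sketch, not the final patch):

    if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
            /* ASID lives in TTBR0 when TCR.A1 == 0 */
            cb->ttbr[0] = FIELD_PREP(TTBRn_ASID, cfg->asid);
            cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
    }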

> 
> > }
> > } else {
> > cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
> > > @@ -651,6 +659,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> > enum io_pgtable_fmt fmt;
> > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > > struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> > +   u32 quirks = 0;
> >  
> > mutex_lock(_domain->init_mutex);
> > if (smmu_domain->smmu)
> > > @@ -719,6 +728,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> > oas = smmu->ipa_size;
> > if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64) {
> > fmt = ARM_64_LPAE_S1;
> > +   if (smmu_domain->split_pagetables)
> > +   quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;
> > } else if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_L) {
> > fmt = ARM_32_LPAE_S1;
> > ias = min(ias, 32UL);
> > > @@ -788,6 +799,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> > .coherent_walk  = smmu->features & ARM_SMMU_FEAT_COHERENT_WALK,
> > .tlb= smmu_domain->flush_ops,
> > .iommu_dev  = smmu->dev,
> > +   .quirks = quirks,
> > };
> >  
> > if (smmu_domain->non_strict)
> > > @@ -801,8 +813,15 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> >  
> > /* Update the domain's page sizes to reflect the page table format */
> > domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> > -   domain->geometry.aperture_end = (1UL << ias) - 1;
> > -   domain->geometry.force_aperture = true;
> > +
> > +   if (pgtbl_cfg.quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
> > +   domain->geometry.aperture_start = ~((1ULL << ias) - 1);
> > +   domain->geometry.aperture_end = ~0UL;
> > +   } else {
> > +   domain->geometry.aperture_end = (1UL << ias) - 1;
> > +   domain->geometry.force_aperture = true;
> > +   smmu_domain->split_pagetables = false;
> > +   }
> >  
> > /* Initialise the context bank with our page table cfg */
> > arm_smmu_init_context_bank(smmu_domain, _cfg);
> > > @@ -1484,6 +1503,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
> > case DOMAIN_ATTR_NESTING:
> > *(int *)data = (smmu_domain->stage == 
> > 

Re: [rfc] dma-mapping: preallocate unencrypted DMA atomic pool

2020-01-09 Thread David Rientjes via iommu
On Thu, 9 Jan 2020, Christoph Hellwig wrote:

> > I'll rely on Thomas to chime in if this doesn't make sense for the SEV 
> > usecase.
> > 
> > I think the sizing of the single atomic pool needs to be determined.  Our 
> > peak usage that we have measured from NVMe is ~1.4MB and atomic_pool is 
> > currently sized to 256KB by default.  I'm unsure at this time if we need 
> > to be able to dynamically expand this pool with a kworker.
> >
> > Maybe when CONFIG_AMD_MEM_ENCRYPT is enabled this atomic pool should be 
> > sized to 2MB or so and then when it reaches half capacity we schedule some 
> > background work to dynamically increase it?  That wouldn't be hard unless 
> > the pool can be rapidly depleted.
> > 
> 
> Note that a non-coherent architecture with the same workload would need
> the same size.
> 
> > Do we want to increase the atomic pool size by default and then do 
> > background dynamic expansion?
> 
> For now I'd just scale with system memory size.
> 

Thanks Christoph and Robin for the help, we're running some additional 
stress tests to double check that our required amount of memory from this 
pool is accurate.  Once that's done, I'll refresh the patch with the
suggestions and propose it formally.
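
As a concrete reading of "scale with system memory size", here is a minimal
sketch (the 128KB base, 1GB threshold and 4MB cap are illustrative
assumptions, not measured values):

	#include <linux/mm.h>
	#include <linux/sizes.h>

	/* Double the pool once per power of two of RAM above 1GB, capped. */
	static unsigned long __init atomic_pool_size_from_ram(void)
	{
		unsigned long pages = totalram_pages();
		unsigned long size = SZ_128K;

		while (pages > (SZ_1G >> PAGE_SHIFT) && size < SZ_4M) {
			size <<= 1;
			pages >>= 1;
		}
		return size;
	}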


Re: [PATCH v8 02/10] iommu/vt-d: Add nested translation helper function

2020-01-09 Thread Jacob Pan
On Wed, 18 Dec 2019 10:41:53 +0800
Lu Baolu  wrote:

> Hi again,
> 
> On 12/17/19 3:24 AM, Jacob Pan wrote:
> > +/**
> > + * intel_pasid_setup_nested() - Set up PASID entry for nested
> > translation
> > + * which is used for vSVA. The first level page tables are used for
> > + * GVA-GPA or GIOVA-GPA translation in the guest, second level
> > page tables
> > + *  are used for GPA-HPA translation.
> > + *
> > + * @iommu:  Iommu which the device belong to
> > + * @dev:Device to be set up for translation
> > + * @gpgd:   FLPTPTR: First Level Page translation pointer in
> > GPA
> > + * @pasid:  PASID to be programmed in the device PASID table
> > + * @pasid_data: Additional PASID info from the guest bind request
> > + * @domain: Domain info for setting up second level page tables
> > + * @addr_width: Address width of the first level (guest)
> > + */
> > +int intel_pasid_setup_nested(struct intel_iommu *iommu,
> > +   struct device *dev, pgd_t *gpgd,
> > +   int pasid, struct
> > iommu_gpasid_bind_data_vtd *pasid_data,
> > +   struct dmar_domain *domain,
> > +   int addr_width)
> > +{
> > +   struct pasid_entry *pte;
> > +   struct dma_pte *pgd;
> > +   u64 pgd_val;
> > +   int agaw;
> > +   u16 did;
> > +
> > +   if (!ecap_nest(iommu->ecap)) {
> > +   pr_err("IOMMU: %s: No nested translation
> > support\n",
> > +  iommu->name);
> > +   return -EINVAL;
> > +   }
> > +
> > +   pte = intel_pasid_get_entry(dev, pasid);
> > +   if (WARN_ON(!pte))
> > +   return -EINVAL;
> > +
> > +   pasid_clear_entry(pte);  
> 
> In some cases, e.g. nested mode for GIOVA-HPA, the PASID entry might
> have already been setup for second level translation. (This could be
> checked with the Present bit.) Hence, it's safe to flush caches here.
> 
> Or, maybe intel_pasid_tear_down_entry() is more suitable?
> 
We don't allow binding the same device-PASID pair twice, so if the PASID
entry was used for GIOVA/RID2PASID, it should be unbound first, with the
teardown flush included, right?
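
To make that assumption explicit, a defensive check along these lines could
precede the clear (a sketch; pasid_pte_is_present() is an assumed helper):

	pte = intel_pasid_get_entry(dev, pasid);
	if (WARN_ON(!pte))
		return -EINVAL;

	/* A prior GIOVA/RID2PASID user must have been torn down already;
	 * refuse to silently overwrite a live entry. */
	if (pasid_pte_is_present(pte))		/* assumed helper */
		return -EBUSY;

	pasid_clear_entry(pte);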

> Best regards,
> baolu

[Jacob Pan]


Re: [PATCH v4 05/16] drivers/iommu: Take a ref to the IOMMU driver prior to ->add_device()

2020-01-09 Thread Greg Kroah-Hartman
On Thu, Jan 09, 2020 at 02:16:03PM +, Will Deacon wrote:
> Hi Greg,
> 
> On Thu, Dec 19, 2019 at 03:44:37PM +0100, Greg Kroah-Hartman wrote:
> > On Thu, Dec 19, 2019 at 12:03:41PM +, Will Deacon wrote:
> > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > index f2223cbb5fd5..e9f94d3f7a04 100644
> > > --- a/include/linux/iommu.h
> > > +++ b/include/linux/iommu.h
> > > @@ -246,9 +246,10 @@ struct iommu_iotlb_gather {
> > >   * @sva_get_pasid: Get PASID associated to a SVA handle
> > >   * @page_response: handle page request response
> > >   * @cache_invalidate: invalidate translation caches
> > > - * @pgsize_bitmap: bitmap of all possible supported page sizes
> > >   * @sva_bind_gpasid: bind guest pasid and mm
> > >   * @sva_unbind_gpasid: unbind guest pasid and mm
> > > + * @pgsize_bitmap: bitmap of all possible supported page sizes
> > > + * @owner: Driver module providing these ops
> > >   */
> > >  struct iommu_ops {
> > >   bool (*capable)(enum iommu_cap);
> > > @@ -318,6 +319,7 @@ struct iommu_ops {
> > >   int (*sva_unbind_gpasid)(struct device *dev, int pasid);
> > >  
> > >   unsigned long pgsize_bitmap;
> > > + struct module *owner;
> > 
> > Everyone is always going to forget to set this field.  I don't think you
> > even set it for all of the different iommu_ops possible in this series,
> > right?
> 
> I only initialised the field for those drivers which can actually be built
> as a module, but I take your point about this being error-prone.
> 
> > The "trick" we did to keep people from having to remember this is to do
> > what we did for the bus registering functions.
> > 
> > Look at pci_register_driver in pci.h:
> > #define pci_register_driver(driver) \
> > __pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME)
> > 
> > Then we set the .owner field in the "real" __pci_register_driver() call.
> > 
> > Same thing for USB and lots, if not all, other driver register
> > functions.
> > 
> > You can do the same thing here, and I would recommend it.
> 
> Yes, that makes sense, cheers. Diff below. I'll send it to Joerg along
> with some other SMMU patches that have come in since the holiday.
> 
> Will
> 
> --->8
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 03dc97842875..e82997a705a8 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2733,7 +2733,6 @@ static struct iommu_ops arm_smmu_ops = {
>   .get_resv_regions   = arm_smmu_get_resv_regions,
>   .put_resv_regions   = arm_smmu_put_resv_regions,
>   .pgsize_bitmap  = -1UL, /* Restricted during device attach */
> - .owner  = THIS_MODULE,
>  };
>  
>  /* Probing and initialisation functions */
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 5ef1f2e100d7..93d332423f6f 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1623,7 +1623,6 @@ static struct iommu_ops arm_smmu_ops = {
>   .get_resv_regions   = arm_smmu_get_resv_regions,
>   .put_resv_regions   = arm_smmu_put_resv_regions,
>   .pgsize_bitmap  = -1UL, /* Restricted during device attach */
> - .owner  = THIS_MODULE,
>  };
>  
>  static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index e9f94d3f7a04..90007c92ad2d 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -388,12 +388,19 @@ void iommu_device_sysfs_remove(struct iommu_device 
> *iommu);
>  int  iommu_device_link(struct iommu_device   *iommu, struct device *link);
>  void iommu_device_unlink(struct iommu_device *iommu, struct device *link);
>  
> -static inline void iommu_device_set_ops(struct iommu_device *iommu,
> - const struct iommu_ops *ops)
> +static inline void __iommu_device_set_ops(struct iommu_device *iommu,
> +   const struct iommu_ops *ops)
>  {
>   iommu->ops = ops;
>  }
>  
> +#define iommu_device_set_ops(iommu, ops) \
> +do { \
> + struct iommu_ops *__ops = (struct iommu_ops *)(ops);\
> + __ops->owner = THIS_MODULE; \
> + __iommu_device_set_ops(iommu, __ops);   \
> +} while (0)
> +
>  static inline void iommu_device_set_fwnode(struct iommu_device *iommu,
>  struct fwnode_handle *fwnode)
>  {

Looks good:

Reviewed-by: Greg Kroah-Hartman 


Re: [PATCH v10 0/4] Add uacce module for Accelerator

2020-01-09 Thread Jonathan Cameron
On Mon, 16 Dec 2019 11:08:13 +0800
Zhangfei Gao  wrote:

> Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
> provide Shared Virtual Addressing (SVA) between accelerators and processes.
> So an accelerator can access any data structure of the main cpu.
> This differs from the data sharing between cpu and io device, which share
> data content rather than addresses.
> Because of the unified address space, hardware and the user space of a
> process can share the same virtual addresses in the communication.
> 
> Uacce is intended to be used with Jean Philippe Brucker's SVA
> patchset[1], which enables IO side page fault and PASID support. 
> We have kept verifying with Jean's sva patchset [2]
> We also keep verifying with Eric's SMMUv3 Nested Stage patches [3]

Hi Zhangfei Gao,

Just to check my understanding...

This patch set is not dependent on either 2 or 3?

To use it on our hardware, we need 2, but the interfaces used are already
upstream, so this could move forwards in parallel.

Given interest from Dave it would be great if it can!

Thanks,

Jonathan

> 
> This series and related zip & qm driver
> https://github.com/Linaro/linux-kernel-warpdrive/tree/v5.5-rc1-uacce-v10
> 
> The library and user application:
> https://github.com/Linaro/warpdrive/tree/wdprd-upstream-v10
> 
> References:
> [1] http://jpbrucker.net/sva/
> [2] http://jpbrucker.net/git/linux/log/?h=sva/zip-devel
> [3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9
> 
> Change History:
> v10:
> Modify the include header to fix the kbuild test error on other arches.
> 
> v9:
> Suggested by Jonathan
> 1. Remove sysfs: numa_distance, node_id, id, also add is_visible callback
> 2. Split the api to solve the potential race
> struct uacce_device *uacce_alloc(struct device *parent,
>struct uacce_interface *interface)
> int uacce_register(struct uacce_device *uacce)
> void uacce_remove(struct uacce_device *uacce)
> 3. Split clean up patch 03
> 
> v8:
> Address some comments from Jonathan
> Merge Jean's patch, using uacce_mm instead of pid for sva_exit
> 
> v7:
> As suggested by Jean and Jerome
> Only consider sva case and remove unused dma apis for the first patch.
> Also add mm_exit for sva and vm_ops.close etc
> 
> 
> v6: https://lkml.org/lkml/2019/10/16/231
> Change sys qfrs_size to different file, suggested by Jonathan
> Fix crypto daily build issue and based on crypto code base, also 5.4-rc1.
> 
> v5: https://lkml.org/lkml/2019/10/14/74
> Add an example patch using the uacce interface, suggested by Greg
> 0003-crypto-hisilicon-register-zip-engine-to-uacce.patch
> 
> v4: https://lkml.org/lkml/2019/9/17/116
> Based on 5.4-rc1
> Considering other drivers integrating uacce:
> if uacce is not compiled in, uacce_register returns an error and uacce_unregister is
> an empty stub.
> Simplify uacce flag: UACCE_DEV_SVA.
> Address Greg's comments: 
> Fix state machine, remove potential syslog triggered from user space etc.
> 
> v3: https://lkml.org/lkml/2019/9/2/990
> As recommended by Greg, use struct uacce_device instead of struct uacce,
> and use struct *cdev in struct uacce_device, as a result, 
> cdev can be released by itself when refcount decreased to 0.
> So the two structures are decoupled and each maintains itself.
> Also add dev.release for put_device.
> 
> v2: https://lkml.org/lkml/2019/8/28/565
> Address comments from Greg and Jonathan
> Modify interface uacce_register
> Drop noiommu mode first
> 
> v1: https://lkml.org/lkml/2019/8/14/277
> 1. Rebase to 5.3-rc1
> 2. Build on iommu interface
> 3. Verifying with Jean's sva and Eric's nested mode iommu.
> 4. User library has developed a lot: support zlib, openssl etc.
> 5. Move to misc first
> 
> RFC3:
> https://lkml.org/lkml/2018/11/12/1951
> 
> RFC2:
> https://lwn.net/Articles/763990/
> 
> 
> Background of why Uacce:
> A Von Neumann processor is not good at general data manipulation.
> It is designed for control-bound rather than data-bound applications.
> The latter need less control-path facility and more/specific ALUs.
> So there are more and more heterogeneous processors, such as
> encryption/decryption accelerators, TPUs, or
> EDGE (Explicated Data Graph Execution) processors, introduced to gain
> better performance or power efficiency for particular applications
> these days.
> 
> There are generally two ways to make use of these heterogeneous processors:
> 
> The first is to make them co-processors, just like FPU.
> This is good for some application but it has its own cons:
> It changes the ISA set permanently.
> You must save all state elements when the process is switched out.
> But most data-bound processors have a huge set of state elements.
> It makes the kernel scheduler more complex.
> 
> The second is Accelerator.
> It is taken as an IO device from the CPU's point of view
> (but it need not be physically). The process, running on the CPU,
> holds a context of the accelerator and sends instructions to it as if
> calling a function or running a thread on an FPU.

Re: [PATCH v10 4/4] crypto: hisilicon - register zip engine to uacce

2020-01-09 Thread Jonathan Cameron
On Mon, 16 Dec 2019 11:08:17 +0800
Zhangfei Gao  wrote:

> Register qm to uacce framework for user crypto driver
> 
> Signed-off-by: Zhangfei Gao 
> Signed-off-by: Zhou Wang 

Very nice to see how minimal the changes are.

Whilst uacce in general isn't crypto specific, as we are looking
at changes in a crypto driver, this will need a crypto Ack.

Herbert, this is about as non invasive as things can get and
provide a user space shared virtual addressing based interface.
What do you think?

From my side, for what it's worth...

Reviewed-by: Jonathan Cameron 

> ---
>  drivers/crypto/hisilicon/qm.c   | 236 
> +++-
>  drivers/crypto/hisilicon/qm.h   |  11 ++
>  drivers/crypto/hisilicon/zip/zip_main.c |  16 ++-
>  include/uapi/misc/uacce/hisi_qm.h   |  23 
>  4 files changed, 278 insertions(+), 8 deletions(-)
>  create mode 100644 include/uapi/misc/uacce/hisi_qm.h
> 
> diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c
> index b57da5e..1e923bc 100644
> --- a/drivers/crypto/hisilicon/qm.c
> +++ b/drivers/crypto/hisilicon/qm.c
> @@ -9,6 +9,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
>  #include "qm.h"
>  
>  /* eq/aeq irq enable */
> @@ -465,9 +468,14 @@ static void qm_cq_head_update(struct hisi_qp *qp)
>  
>  static void qm_poll_qp(struct hisi_qp *qp, struct hisi_qm *qm)
>  {
> - struct qm_cqe *cqe = qp->cqe + qp->qp_status.cq_head;
> + if (qp->event_cb) {
> + qp->event_cb(qp);
> + return;
> + }
>  
>   if (qp->req_cb) {
> + struct qm_cqe *cqe = qp->cqe + qp->qp_status.cq_head;
> +
>   while (QM_CQE_PHASE(cqe) == qp->qp_status.cqc_phase) {
>   dma_rmb();
>   qp->req_cb(qp, qp->sqe + qm->sqe_size *
> @@ -1269,7 +1277,7 @@ static int qm_qp_ctx_cfg(struct hisi_qp *qp, int qp_id, 
> int pasid)
>   * @qp: The qp we want to start to run.
>   * @arg: Accelerator specific argument.
>   *
> - * After this function, qp can receive request from user. Return qp_id if
> + * After this function, qp can receive request from user. Return 0 if
>   * successful, Return -EBUSY if failed.
>   */
>  int hisi_qm_start_qp(struct hisi_qp *qp, unsigned long arg)
> @@ -1314,7 +1322,7 @@ int hisi_qm_start_qp(struct hisi_qp *qp, unsigned long 
> arg)
>  
>   dev_dbg(dev, "queue %d started\n", qp_id);
>  
> - return qp_id;
> + return 0;
>  }
>  EXPORT_SYMBOL_GPL(hisi_qm_start_qp);
>  
> @@ -1395,6 +1403,213 @@ static void hisi_qm_cache_wb(struct hisi_qm *qm)
>   }
>  }
>  
> +static void qm_qp_event_notifier(struct hisi_qp *qp)
> +{
> +	wake_up_interruptible(&qp->uacce_q->wait);
> +}
> +
> +static int hisi_qm_get_available_instances(struct uacce_device *uacce)
> +{
> + int i, ret;
> + struct hisi_qm *qm = uacce->priv;
> +
> +	read_lock(&qm->qps_lock);
> + for (i = 0, ret = 0; i < qm->qp_num; i++)
> + if (!qm->qp_array[i])
> + ret++;
> +	read_unlock(&qm->qps_lock);
> +
> + return ret;
> +}
> +
> +static int hisi_qm_uacce_get_queue(struct uacce_device *uacce,
> +unsigned long arg,
> +struct uacce_queue *q)
> +{
> + struct hisi_qm *qm = uacce->priv;
> + struct hisi_qp *qp;
> + u8 alg_type = 0;
> +
> + qp = hisi_qm_create_qp(qm, alg_type);
> + if (IS_ERR(qp))
> + return PTR_ERR(qp);
> +
> + q->priv = qp;
> + q->uacce = uacce;
> + qp->uacce_q = q;
> + qp->event_cb = qm_qp_event_notifier;
> + qp->pasid = arg;
> +
> + return 0;
> +}
> +
> +static void hisi_qm_uacce_put_queue(struct uacce_queue *q)
> +{
> + struct hisi_qp *qp = q->priv;
> +
> + hisi_qm_cache_wb(qp->qm);
> + hisi_qm_release_qp(qp);
> +}
> +
> +/* map sq/cq/doorbell to user space */
> +static int hisi_qm_uacce_mmap(struct uacce_queue *q,
> +   struct vm_area_struct *vma,
> +   struct uacce_qfile_region *qfr)
> +{
> + struct hisi_qp *qp = q->priv;
> + struct hisi_qm *qm = qp->qm;
> + size_t sz = vma->vm_end - vma->vm_start;
> + struct pci_dev *pdev = qm->pdev;
> +	struct device *dev = &pdev->dev;
> + unsigned long vm_pgoff;
> + int ret;
> +
> + switch (qfr->type) {
> + case UACCE_QFRT_MMIO:
> + if (qm->ver == QM_HW_V2) {
> + if (sz > PAGE_SIZE * (QM_DOORBELL_PAGE_NR +
> + QM_DOORBELL_SQ_CQ_BASE_V2 / PAGE_SIZE))
> + return -EINVAL;
> + } else {
> + if (sz > PAGE_SIZE * QM_DOORBELL_PAGE_NR)
> + return -EINVAL;
> + }
> +
> + vma->vm_flags |= VM_IO;
> +
> + return remap_pfn_range(vma, vma->vm_start,
> +qm->phys_base >> PAGE_SHIFT,
> +sz, 

Re: [PATCH v8 02/10] iommu/vt-d: Add nested translation helper function

2020-01-09 Thread Jacob Pan
Hi Baolu,

Appreciate the review. Comments inline below.

On Wed, 18 Dec 2019 10:01:17 +0800
Lu Baolu  wrote:

> Hi Jacob,
> 
> On 12/17/19 3:24 AM, Jacob Pan wrote:
> > Nested translation mode is supported in VT-d 3.0 Spec.CH 3.8.
> > With PASID granular translation type set to 0x11b, translation
> > result from the first level(FL) also subject to a second level(SL)
> > page table translation. This mode is used for SVA virtualization,
> > where FL performs guest virtual to guest physical translation and
> > SL performs guest physical to host physical translation.
> > 
> > Signed-off-by: Jacob Pan 
> > Signed-off-by: Liu, Yi L 
> > ---
> >   drivers/iommu/intel-pasid.c | 213
> > 
> > drivers/iommu/intel-pasid.h |  12 +++ include/linux/intel-iommu.h
> > |   3 + include/uapi/linux/iommu.h  |   5 +-
> >   4 files changed, 232 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/intel-pasid.c
> > b/drivers/iommu/intel-pasid.c index 3cb569e76642..b178ad9e47ae
> > 100644 --- a/drivers/iommu/intel-pasid.c
> > +++ b/drivers/iommu/intel-pasid.c
> > @@ -359,6 +359,76 @@ pasid_set_flpm(struct pasid_entry *pe, u64
> > value) pasid_set_bits(&pe->val[2], GENMASK_ULL(3, 2), value << 2);
> >   }
> >   
> > +/*
> > + * Setup the Extended Memory Type(EMT) field (Bits 91-93)
> > + * of a scalable mode PASID entry.
> > + */
> > +static inline void
> > +pasid_set_emt(struct pasid_entry *pe, u64 value)
> > +{
> > +   pasid_set_bits(&pe->val[1], GENMASK_ULL(29, 27), value <<
> > 27); +}
> > +
> > +/*
> > + * Setup the Page Attribute Table (PAT) field (Bits 96-127)
> > + * of a scalable mode PASID entry.
> > + */
> > +static inline void
> > +pasid_set_pat(struct pasid_entry *pe, u64 value)
> > +{
> > +   pasid_set_bits(&pe->val[1], GENMASK_ULL(63, 32), value <<
> > 27);  
> 
> The last input should be "value << 32".
> 
you are right. will fix.
> > +}
> > +
> > +/*
> > + * Setup the Cache Disable (CD) field (Bit 89)
> > + * of a scalable mode PASID entry.
> > + */
> > +static inline void
> > +pasid_set_cd(struct pasid_entry *pe)
> > +{
> > +   pasid_set_bits(&pe->val[1], 1 << 25, 1);
> 
> The last input should be "1 << 25".
> 
Right, I misunderstood the arguments of pasid_set_bits(); same for the
other bits below.
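
For reference, the helper masks first and then ORs, so the third argument
must already be shifted into position (signature as used in this series):

	static inline void pasid_set_bits(u64 *ptr, u64 mask, u64 bits)
	{
		u64 old;

		old = READ_ONCE(*ptr);
		WRITE_ONCE(*ptr, (old & ~mask) | bits);
	}

	/* e.g. CD is bit 89 overall, i.e. bit 25 of val[1]: */
	pasid_set_bits(&pe->val[1], 1 << 25, 1 << 25);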
> > +}
> > +
> > +/*
> > + * Setup the Extended Memory Type Enable (EMTE) field (Bit 90)
> > + * of a scalable mode PASID entry.
> > + */
> > +static inline void
> > +pasid_set_emte(struct pasid_entry *pe)
> > +{
> > +   pasid_set_bits(&pe->val[1], 1 << 26, 1);
> 
> The last input should be "1 << 26".
> 
> > +}
> > +
> > +/*
> > + * Setup the Extended Access Flag Enable (EAFE) field (Bit 135)
> > + * of a scalable mode PASID entry.
> > + */
> > +static inline void
> > +pasid_set_eafe(struct pasid_entry *pe)
> > +{
> > +   pasid_set_bits(&pe->val[2], 1 << 7, 1);
> 
> The last input should be "1 << 7".
> 
> > +}
> > +
> > +/*
> > + * Setup the Page-level Cache Disable (PCD) field (Bit 95)
> > + * of a scalable mode PASID entry.
> > + */
> > +static inline void
> > +pasid_set_pcd(struct pasid_entry *pe)
> > +{
> > +   pasid_set_bits(&pe->val[1], 1 << 31, 1);
> 
> The last input should be "1 << 31".
> 
> > +}
> > +
> > +/*
> > + * Setup the Page-level Write-Through (PWT)) field (Bit 94)
> > + * of a scalable mode PASID entry.
> > + */
> > +static inline void
> > +pasid_set_pwt(struct pasid_entry *pe)
> > +{
> > +   pasid_set_bits(&pe->val[1], 1 << 30, 1);
> 
> The last input should be "1 << 30".
> 
> > +}
> > +
> >   static void
> >   pasid_cache_invalidation_with_pasid(struct intel_iommu *iommu,
> > u16 did, int pasid)
> > @@ -599,3 +669,146 @@ int intel_pasid_setup_pass_through(struct
> > intel_iommu *iommu, 
> > return 0;
> >   }
> > +
> > +static int intel_pasid_setup_bind_data(struct intel_iommu *iommu,
> > +   struct pasid_entry *pte,
> > +   struct iommu_gpasid_bind_data_vtd
> > *pasid_data) +{
> > +   /*
> > +* Not all guest PASID table entry fields are passed down
> > during bind,
> > +* here we only set up the ones that are dependent on
> > guest settings.
> > +* Execution related bits such as NXE, SMEP are not
> > meaningful to IOMMU,
> > +* therefore not set. Other fields, such as snoop related,
> > are set based
> > +* on host needs regardless of  guest settings.
> > +*/
> > +   if (pasid_data->flags & IOMMU_SVA_VTD_GPASID_SRE) {
> > +   if (!ecap_srs(iommu->ecap)) {
> > +   pr_err("No supervisor request support on
> > %s\n",
> > +  iommu->name);
> > +   return -EINVAL;
> > +   }
> > +   pasid_set_sre(pte);
> > +   }
> > +
> > +   if (pasid_data->flags & IOMMU_SVA_VTD_GPASID_EAFE) {
> > +   if (!ecap_eafs(iommu->ecap)) {
> > +   pr_err("No extended access flag support on
> > %s\n",
> > +   iommu->name);
> > +   return -EINVAL;
> > 

Re: [PATCH v10 3/4] crypto: hisilicon - Remove module_param uacce_mode

2020-01-09 Thread Jonathan Cameron
On Mon, 16 Dec 2019 11:08:16 +0800
Zhangfei Gao  wrote:

> Remove the module_param uacce_mode, which is not used currently.
> 
> Signed-off-by: Zhangfei Gao 
> Signed-off-by: Zhou Wang 


Reviewed-by: Jonathan Cameron 

> ---
>  drivers/crypto/hisilicon/zip/zip_main.c | 31 ++-
>  1 file changed, 6 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/crypto/hisilicon/zip/zip_main.c 
> b/drivers/crypto/hisilicon/zip/zip_main.c
> index e1bab1a..93345f0 100644
> --- a/drivers/crypto/hisilicon/zip/zip_main.c
> +++ b/drivers/crypto/hisilicon/zip/zip_main.c
> @@ -297,9 +297,6 @@ static u32 pf_q_num = HZIP_PF_DEF_Q_NUM;
>  module_param_cb(pf_q_num, &pf_q_num_ops, &pf_q_num, 0444);
>  MODULE_PARM_DESC(pf_q_num, "Number of queues in PF(v1 1-4096, v2 1-1024)");
>  
> -static int uacce_mode;
> -module_param(uacce_mode, int, 0);
> -
>  static u32 vfs_num;
>  module_param(vfs_num, uint, 0444);
>  MODULE_PARM_DESC(vfs_num, "Number of VFs to enable(1-63)");
> @@ -791,6 +788,7 @@ static int hisi_zip_probe(struct pci_dev *pdev, const 
> struct pci_device_id *id)
>   pci_set_drvdata(pdev, hisi_zip);
>  
>   qm = &hisi_zip->qm;
> + qm->use_dma_api = true;
>   qm->pdev = pdev;
>   qm->ver = rev_id;
>  
> @@ -798,20 +796,6 @@ static int hisi_zip_probe(struct pci_dev *pdev, const 
> struct pci_device_id *id)
>   qm->dev_name = hisi_zip_name;
>   qm->fun_type = (pdev->device == PCI_DEVICE_ID_ZIP_PF) ? QM_HW_PF :
>   QM_HW_VF;
> - switch (uacce_mode) {
> - case 0:
> - qm->use_dma_api = true;
> - break;
> - case 1:
> - qm->use_dma_api = false;
> - break;
> - case 2:
> - qm->use_dma_api = true;
> - break;
> - default:
> - return -EINVAL;
> - }
> -
>   ret = hisi_qm_init(qm);
>   if (ret) {
>   dev_err(&pdev->dev, "Failed to init qm!\n");
> @@ -1010,12 +994,10 @@ static int __init hisi_zip_init(void)
>   goto err_pci;
>   }
>  
> - if (uacce_mode == 0 || uacce_mode == 2) {
> - ret = hisi_zip_register_to_crypto();
> - if (ret < 0) {
> - pr_err("Failed to register driver to crypto.\n");
> - goto err_crypto;
> - }
> + ret = hisi_zip_register_to_crypto();
> + if (ret < 0) {
> + pr_err("Failed to register driver to crypto.\n");
> + goto err_crypto;
>   }
>  
>   return 0;
> @@ -1030,8 +1012,7 @@ static int __init hisi_zip_init(void)
>  
>  static void __exit hisi_zip_exit(void)
>  {
> - if (uacce_mode == 0 || uacce_mode == 2)
> - hisi_zip_unregister_from_crypto();
> + hisi_zip_unregister_from_crypto();
>   pci_unregister_driver(&hisi_zip_pci_driver);
>   hisi_zip_unregister_debugfs();
>  }




Re: [PATCH v10 2/4] uacce: add uacce driver

2020-01-09 Thread Jonathan Cameron
On Mon, 16 Dec 2019 11:08:15 +0800
Zhangfei Gao  wrote:

> From: Kenneth Lee 
> 
> Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
> provide Shared Virtual Addressing (SVA) between accelerators and processes.
> So accelerator can access any data structure of the main cpu.
> This differs from the data sharing between cpu and io device, which share
> only data content rather than address.
> Since unified address, hardware and user space of process can share the
> same virtual address in the communication.
> 
> Uacce create a chrdev for every registration, the queue is allocated to
> the process when the chrdev is opened. Then the process can access the
> hardware resource by interacting with the queue file. By mmapping the queue
> file space to user space, the process can put requests directly to the
> hardware without a syscall into kernel space.
> 
> The IOMMU core only tracks mm<->device bonds at the moment, because it
> only needs to handle IOTLB invalidation and PASID table entries. However
> uacce needs a finer granularity since multiple queues from the same
> device can be bound to an mm. When the mm exits, all bound queues must
> be stopped so that the IOMMU can safely clear the PASID table entry and
> reallocate the PASID.
> 
> An intermediate struct uacce_mm links uacce devices and queues.
> Note that an mm may be bound to multiple devices but an uacce_mm
> structure only ever belongs to a single device, because we don't need
> anything more complex (if multiple devices are bound to one mm, then
> we'll create one uacce_mm for each bond).
> 
> uacce_device --+-- uacce_mm --+-- uacce_queue
>|  '-- uacce_queue
>|
>'-- uacce_mm --+-- uacce_queue
>   +-- uacce_queue
>   '-- uacce_queue
> 
> Signed-off-by: Kenneth Lee 
> Signed-off-by: Zaibo Xu 
> Signed-off-by: Zhou Wang 
> Signed-off-by: Jean-Philippe Brucker 
> Signed-off-by: Zhangfei Gao 

Hi,

Two small things I'd missed previously.  Fix those and for
what it's worth

Reviewed-by: Jonathan Cameron 

> ---
>  Documentation/ABI/testing/sysfs-driver-uacce |  37 ++
>  drivers/misc/Kconfig |   1 +
>  drivers/misc/Makefile|   1 +
>  drivers/misc/uacce/Kconfig   |  13 +
>  drivers/misc/uacce/Makefile  |   2 +
>  drivers/misc/uacce/uacce.c   | 628 
> +++
>  include/linux/uacce.h| 161 +++
>  include/uapi/misc/uacce/uacce.h  |  38 ++
>  8 files changed, 881 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-driver-uacce
>  create mode 100644 drivers/misc/uacce/Kconfig
>  create mode 100644 drivers/misc/uacce/Makefile
>  create mode 100644 drivers/misc/uacce/uacce.c
>  create mode 100644 include/linux/uacce.h
>  create mode 100644 include/uapi/misc/uacce/uacce.h
> 
...
> +
> +What:   /sys/class/uacce/<dev_name>/available_instances
> +Date:   Dec 2019
> +KernelVersion:  5.6
> +Contact:linux-accelerat...@lists.ozlabs.org
> +Description:Available instances left of the device
> +Return -ENODEV if uacce_ops get_available_instances is not 
> provided
> +

See below.  It doesn't "return" it; it prints it currently.
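
A sketch of a show() that propagates the error instead of printing it
(to_uacce_device() and the ops layout assumed from this patch):

	static ssize_t available_instances_show(struct device *dev,
						struct device_attribute *attr,
						char *buf)
	{
		struct uacce_device *uacce = to_uacce_device(dev);

		if (!uacce->ops->get_available_instances)
			return -ENODEV;

		return sprintf(buf, "%d\n",
			       uacce->ops->get_available_instances(uacce));
	}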

...

> +static int uacce_fops_mmap(struct file *filep, struct vm_area_struct *vma)
> +{
> + struct uacce_queue *q = filep->private_data;
> + struct uacce_device *uacce = q->uacce;
> + struct uacce_qfile_region *qfr;
> + enum uacce_qfrt type = UACCE_MAX_REGION;
> + int ret = 0;
> +
> + if (vma->vm_pgoff < UACCE_MAX_REGION)
> + type = vma->vm_pgoff;
> + else
> + return -EINVAL;
> +
> + qfr = kzalloc(sizeof(*qfr), GFP_KERNEL);
> + if (!qfr)
> + return -ENOMEM;
> +
> + vma->vm_flags |= VM_DONTCOPY | VM_DONTEXPAND | VM_WIPEONFORK;
> +	vma->vm_ops = &uacce_vm_ops;
> + vma->vm_private_data = q;
> + qfr->type = type;
> +
> +	mutex_lock(&uacce_mutex);
> +
> + if (q->state != UACCE_Q_INIT && q->state != UACCE_Q_STARTED) {
> + ret = -EINVAL;
> + goto out_with_lock;
> + }
> +
> + if (q->qfrs[type]) {
> + ret = -EEXIST;
> + goto out_with_lock;
> + }
> +
> + switch (type) {
> + case UACCE_QFRT_MMIO:
> + if (!uacce->ops->mmap) {
> + ret = -EINVAL;
> + goto out_with_lock;
> + }
> +
> + ret = uacce->ops->mmap(q, vma, qfr);
> + if (ret)
> + goto out_with_lock;
> +
> + break;
> +
> + case UACCE_QFRT_DUS:
> + if (uacce->flags & UACCE_DEV_SVA) {
> + if (!uacce->ops->mmap) {
> + ret = -EINVAL;
> + goto out_with_lock;
> +  
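
(The quoted hunk is truncated in the archive.) For context, since vm_pgoff
selects the region type, user space picks a region purely by mmap offset.
A minimal user-side sketch (region sizes and device node are the caller's):

	#include <sys/mman.h>
	#include <unistd.h>

	/* mmap offset = region type * page size (UACCE_QFRT_MMIO, _DUS, ...) */
	static void *map_uacce_region(int fd, int type, size_t size)
	{
		return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
			    fd, (off_t)type * sysconf(_SC_PAGESIZE));
	}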

Re: [RFC 3/5] x86/PCI: Expose VMD's device in pci_sysdata

2020-01-09 Thread Derrick, Jonathan
On Thu, 2020-01-09 at 15:33 +0100, Christoph Hellwig wrote:
> On Tue, Dec 31, 2019 at 01:24:21PM -0700, Jon Derrick wrote:
> > To be used by intel-iommu code to find the correct domain.
> 
> Any reason to prefer this version over my patches 2 and 3 from the
> series in August?

2 & 3 of your set is fine.


Re: [RFC 4/5] PCI: vmd: Stop overriding dma_map_ops

2020-01-09 Thread Derrick, Jonathan
On Thu, 2020-01-09 at 15:36 +0100, Christoph Hellwig wrote:
> On Tue, Dec 31, 2019 at 01:24:22PM -0700, Jon Derrick wrote:
> > Devices on the VMD domain use the VMD endpoint's requester-id and have
> > been relying on the VMD endpoint's dma operations. The downside of this
> > was that VMD domain devices would use the VMD endpoint's attributes when
> > doing DMA and IOMMU mapping. We can be smarter about this by only using
> > the VMD endpoint when mapping and providing the correct child device's
> > attributes during dma operations.
> > 
> > This patch adds a new dma alias mechanism by adding a hint to a pci_dev
> > to point to a singular DMA requester's pci_dev. This integrates into the
> > existing dma alias infrastructure to reduce the impact of the changes
> > required to support this mode.
> 
> If we want to lift this check into common code I think it should go
> into struct device, as that is what DMA operates on normally.
I thought about that too, but the dma alias mechanism was in pci_dev. I
can prepare a new version with struct device.

>   That
> being said given that this insane hack only exists for braindamage in
> Intel hardware I'd rather keep it as isolated as possible. 
JMHO, but the footprint of the new set is pretty minimal and it removes a
lot of dubious code in vmd.c.
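
For readers following the thread, the hint being discussed amounts to
roughly this shape (field and helper names are assumptions, not the final
interface):

	/* A pci_dev points at the single device that actually issues its
	 * DMA; mapping code resolves through it. */
	static inline struct pci_dev *pci_real_dma_dev(struct pci_dev *dev)
	{
		return dev->dma_real_dev ?: dev;	/* assumed field */
	}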


Re: [RFC 3/5] x86/PCI: Expose VMD's device in pci_sysdata

2020-01-09 Thread Derrick, Jonathan
On Thu, 2020-01-09 at 15:33 +0100, Christoph Hellwig wrote:
> On Tue, Dec 31, 2019 at 01:24:21PM -0700, Jon Derrick wrote:
> > To be used by intel-iommu code to find the correct domain.
> 
> Any reason to prefer this version over my patches 2 and 3 from the
> series in August?

Mine uses the correct device's dma mask


Re: [RFC 5/5] x86/PCI: Remove unused X86_DEV_DMA_OPS

2020-01-09 Thread Derrick, Jonathan
On Thu, 2020-01-09 at 15:37 +0100, Christoph Hellwig wrote:
> On Tue, Dec 31, 2019 at 01:24:23PM -0700, Jon Derrick wrote:
> > VMD was the only user of device dma operations. Now that the IOMMU has
> > been made aware of direct DMA aliases, VMD domain devices can reference
> > the VMD endpoint directly and the VMD device dma operations has been
> > made obsolete.
> > 
> > Signed-off-by: Jon Derrick 
> 
> This seems to be a 1:1 copy of my patch from August?
Sorry, I didn't notice that. I'll give you attribution.


Re: [PATCH 2/2] arm: use swiotlb for bounce buffer on LPAE configs

2020-01-09 Thread Christoph Hellwig
On Wed, Jan 08, 2020 at 03:20:07PM +, Robin Murphy wrote:
>> The problem - I think - is that the DMA_BIT_MASK(32) from
>> dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)) is treated as physical
>> address along the call path so the dma_pfn_offset is applied to it and
>> the check will fail, saying that DMA_BIT_MASK(32) can not be supported.
>
> But that's the thing - in isolation, that is entirely correct. Considering 
> ZONE_DMA32 for simplicity, in general the zone is expected to cover the 
> physical address range 0x0000_0000 - 0xffff_ffff (because DMA offsets are
> relatively rare), and a device with a dma_pfn_offset of more than 
> (0x1_0000_0000 >> PAGE_SHIFT) *cannot* support that range with any mask,
> because the DMA address itself would have to be negative.
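
A worked example of that constraint (all values assumed for illustration):

	#define PAGE_SHIFT 12

	/* Assume RAM starts at PA 0x8_0000_0000 and the device's
	 * dma_pfn_offset maps that base to DMA address 0. */
	static const unsigned long dma_pfn_offset = 0x800000000ULL >> PAGE_SHIFT;

	/* Any PA below the offset would need a negative DMA address, so a
	 * mask covering 0x0000_0000 - 0xffff_ffff can never be satisfied: */
	static long long dma_addr(unsigned long long pa)
	{
		return (long long)pa - ((long long)dma_pfn_offset << PAGE_SHIFT);
	}
	/* dma_addr(0x80000000ULL) == -0x780000000, so the check rightly fails */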

Note that ZONE_DMA32 is irrelevant in this particular case, as we are
talking about arm32.  But with ZONE_DMA instead this roughly makes sense.

> The problem is that platforms with esoteric memory maps have no right thing 
> to do. If the base of RAM is at 0x1_0000_0000 or higher, the "correct"
> ZONE_DMA32 would be empty while ZONE_NORMAL above it would not, and last 
> time I looked that makes the page allocator break badly. So the standard 
> bodge on such platforms is to make ZONE_DMA32 cover not the first 4GB of 
> *PA space*, but the first 4GB of *RAM*, wherever that happens to be. That 
> then brings different problems - now the page allocator is happy and 
> successfully returns GFP_DMA32 allocations from the range 0x8_0000_0000 -
> 0x8_ffff_ffff that are utterly useless to 32-bit devices with zero
> dma_pfn_offset - see the AMD Seattle SoC for the prime example of that. If 
> on the other hand all devices are guaranteed to have a dma_pfn_offset that 
> puts the base of RAM at DMA address 0 then GFP_DMA32 allocations do end up 
> working as expected, but now the original assumption of where ZONE_DMA32 
> actually is is broken, so generic code unaware of the 
> platform/architecture-specific bodge will be misled - that's the case 
> you're running into.
>
> Having thought this far, if there's a non-hacky way to reach in and grab 
> ZONE_DMA{32} such that dma_direct_supported() could use zone_end_pfn() 
> instead of trying to assume either way, that might be the most robust 
> general solution.

zone_dma_bits is our somewhat ugly way to try to poke into this
information, although the way it is done right now sucks pretty badly.

The patch I sent to Peter in December was trying to convey that
information in a way similar to what the arm32 legacy dma code does, but
it didn't work, so I'll need to find some time to sit down and figure out
why.
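
A rough sketch of the zone_end_pfn() idea (illustrative only; ZONE_DMA32
used for brevity where arm32 would consult ZONE_DMA, and this is not the
patch mentioned above):

	#include <linux/mmzone.h>
	#include <linux/pfn.h>

	static u64 dma_zone_limit_sketch(void)
	{
		struct zone *zone = &NODE_DATA(0)->node_zones[ZONE_DMA32];

		if (!populated_zone(zone))
			zone = &NODE_DATA(0)->node_zones[ZONE_NORMAL];

		/* Highest PA the zone can hand out, wherever it starts. */
		return PFN_PHYS(zone_end_pfn(zone)) - 1;
	}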


Re: [PATCH v4 00/13] iommu: Add PASID support to Arm SMMUv3

2020-01-09 Thread Will Deacon
On Thu, Jan 09, 2020 at 03:36:18PM +0100, Jean-Philippe Brucker wrote:
> On Thu, Dec 19, 2019 at 05:30:20PM +0100, Jean-Philippe Brucker wrote:
> > Add support for Substream ID and PASIDs to the SMMUv3 driver. Since v3
> > [1], I added review and tested tags where appropriate and applied the
> > suggested changes, shown in the diff below. Thanks all!
> > 
> > I'm testing using the zip accelerator on the Hisilicon KunPeng920 and
> > Zhangfei's uacce module [2]. The full SVA support, which I'll send out
> > early next year, is available on my branch sva/zip-devel at
> > https://jpbrucker.net/git/linux/
> 
> Is there anything more I should do for the PASID support? Ideally I'd like
> to get this in v5.6 so I can focus on the rest of the SVA work and on
> performance improvements.

Apologies, I'm just behind with review what with the timing of the new
year. You're on the list, but I was hoping to get Robin's TCR stuff dusted
off so that Jordan doesn't have to depend on patches languishing on the
mailing list and there's also the nvidia stuff to review as well.

Going as fast as I can!

Will


Re: [RFC 5/5] x86/PCI: Remove unused X86_DEV_DMA_OPS

2020-01-09 Thread Christoph Hellwig
On Tue, Dec 31, 2019 at 01:24:23PM -0700, Jon Derrick wrote:
> VMD was the only user of device dma operations. Now that the IOMMU has
> been made aware of direct DMA aliases, VMD domain devices can reference
> the VMD endpoint directly and the VMD device dma operations has been
> made obsolete.
> 
> Signed-off-by: Jon Derrick 

This seems to be a 1:1 copy of my patch from August?


Re: [RFC 4/5] PCI: vmd: Stop overriding dma_map_ops

2020-01-09 Thread Christoph Hellwig
On Tue, Dec 31, 2019 at 01:24:22PM -0700, Jon Derrick wrote:
> Devices on the VMD domain use the VMD endpoint's requester-id and have
> been relying on the VMD endpoint's dma operations. The downside of this
> was that VMD domain devices would use the VMD endpoint's attributes when
> doing DMA and IOMMU mapping. We can be smarter about this by only using
> the VMD endpoint when mapping and providing the correct child device's
> attributes during dma operations.
> 
> This patch adds a new dma alias mechanism by adding a hint to a pci_dev
> to point to a singular DMA requester's pci_dev. This integrates into the
> existing dma alias infrastructure to reduce the impact of the changes
> required to support this mode.

If we want to lift this check into common code I think it should go
into struct device, as that is what DMA operates on normally.  That
being said given that this insane hack only exists for braindamage in
Intel hardware I'd rather keep it as isolated as possible.


Re: [PATCH v3 1/5] iommu: Add DOMAIN_ATTR_SPLIT_TABLES

2020-01-09 Thread Will Deacon
On Mon, Dec 16, 2019 at 09:37:47AM -0700, Jordan Crouse wrote:
> Add a new attribute to enable and query the state of split pagetables
> for the domain.
> 
> Signed-off-by: Jordan Crouse 
> ---
> 
>  include/linux/iommu.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index f2223cb..18c861e 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -126,6 +126,7 @@ enum iommu_attr {
>   DOMAIN_ATTR_FSL_PAMUV1,
>   DOMAIN_ATTR_NESTING,/* two stages of translation */
>   DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> + DOMAIN_ATTR_SPLIT_TABLES,
>   DOMAIN_ATTR_MAX,
>  };

Acked-by: Will Deacon 

Although a comment describing what this does wouldn't hurt.
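
For instance (wording assumed):

	DOMAIN_ATTR_SPLIT_TABLES,	/* split pagetables: default domain in TTBR1 */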

Will


Re: [PATCH v4 00/13] iommu: Add PASID support to Arm SMMUv3

2020-01-09 Thread Jean-Philippe Brucker
Hi Will,

On Thu, Dec 19, 2019 at 05:30:20PM +0100, Jean-Philippe Brucker wrote:
> Add support for Substream ID and PASIDs to the SMMUv3 driver. Since v3
> [1], I added review and tested tags where appropriate and applied the
> suggested changes, shown in the diff below. Thanks all!
> 
> I'm testing using the zip accelerator on the Hisilicon KunPeng920 and
> Zhangfei's uacce module [2]. The full SVA support, which I'll send out
> early next year, is available on my branch sva/zip-devel at
> https://jpbrucker.net/git/linux/

Is there anything more I should do for the PASID support? Ideally I'd like
to get this in v5.6 so I can focus on the rest of the SVA work and on
performance improvements.

Thanks,
Jean

> 
> [1] 
> https://lore.kernel.org/linux-iommu/20191209180514.272727-1-jean-phili...@linaro.org/
> [2] 
> https://lore.kernel.org/linux-iommu/1576465697-27946-1-git-send-email-zhangfei@linaro.org/
> 
> Jean-Philippe Brucker (13):
>   iommu/arm-smmu-v3: Drop __GFP_ZERO flag from DMA allocation
>   dt-bindings: document PASID property for IOMMU masters
>   iommu/arm-smmu-v3: Parse PASID devicetree property of platform devices
>   ACPI/IORT: Parse SSID property of named component node
>   iommu/arm-smmu-v3: Prepare arm_smmu_s1_cfg for SSID support
>   iommu/arm-smmu-v3: Add context descriptor tables allocators
>   iommu/arm-smmu-v3: Add support for Substream IDs
>   iommu/arm-smmu-v3: Propagate ssid_bits
>   iommu/arm-smmu-v3: Prepare for handling arm_smmu_write_ctx_desc()
> failure
>   iommu/arm-smmu-v3: Add second level of context descriptor table
>   iommu/arm-smmu-v3: Improve add_device() error handling
>   PCI/ATS: Add PASID stubs
>   iommu/arm-smmu-v3: Add support for PCI PASID
> 
>  .../devicetree/bindings/iommu/iommu.txt   |   6 +
>  drivers/acpi/arm64/iort.c |  18 +
>  drivers/iommu/arm-smmu-v3.c   | 467 +++---
>  drivers/iommu/of_iommu.c  |   6 +-
>  include/linux/iommu.h |   2 +
>  include/linux/pci-ats.h   |   3 +
>  6 files changed, 442 insertions(+), 60 deletions(-)
> 
> -- 
> Diff since v3:
> #diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index cde7af39681c..8e95ecad4c9a 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -589,8 +589,10 @@ struct arm_smmu_cd_table {
>  };
> 
>  struct arm_smmu_s1_cfg {
> + /* Leaf tables or linear table */
>   struct arm_smmu_cd_table*tables;
>   size_t  num_tables;
> + /* First level tables, when two levels are used */
>   __le64  *l1ptr;
>   dma_addr_t  l1ptr_dma;
>   struct arm_smmu_ctx_desccd;
> @@ -1561,27 +1563,22 @@ static __le64 *arm_smmu_get_cd_ptr(struct 
> arm_smmu_domain *smmu_domain,
>   struct arm_smmu_device *smmu = smmu_domain->smmu;
> 	struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg;
> 
> - if (cfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR) {
> -	table = &cfg->tables[0];
> - idx = ssid;
> - } else {
> - idx = ssid >> CTXDESC_SPLIT;
> - if (idx >= cfg->num_tables)
> - return NULL;
> + if (cfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
> + return cfg->tables[0].ptr + ssid * CTXDESC_CD_DWORDS;
> 
> -	table = &cfg->tables[idx];
> - if (!table->ptr) {
> - if (arm_smmu_alloc_cd_leaf_table(smmu, table,
> -  CTXDESC_L2_ENTRIES))
> - return NULL;
> + idx = ssid >> CTXDESC_SPLIT;
> +	table = &cfg->tables[idx];
> + if (!table->ptr) {
> + if (arm_smmu_alloc_cd_leaf_table(smmu, table,
> +  CTXDESC_L2_ENTRIES))
> + return NULL;
> 
> - l1ptr = cfg->l1ptr + idx * CTXDESC_L1_DESC_DWORDS;
> - arm_smmu_write_cd_l1_desc(l1ptr, table);
> - /* An invalid L1CD can be cached */
> - arm_smmu_sync_cd(smmu_domain, ssid, false);
> - }
> - idx = ssid & (CTXDESC_L2_ENTRIES - 1);
> + l1ptr = cfg->l1ptr + idx * CTXDESC_L1_DESC_DWORDS;
> + arm_smmu_write_cd_l1_desc(l1ptr, table);
> + /* An invalid L1CD can be cached */
> + arm_smmu_sync_cd(smmu_domain, ssid, false);
>   }
> + idx = ssid & (CTXDESC_L2_ENTRIES - 1);
>   return table->ptr + idx * CTXDESC_CD_DWORDS;
>  }
> 
> @@ -1617,8 +1614,12 @@ static int arm_smmu_write_ctx_desc(struct 
> arm_smmu_domain *smmu_domain,
>   u64 val;
>   bool cd_live;
>   struct arm_smmu_device *smmu = smmu_domain->smmu;
> - __le64 *cdptr = arm_smmu_get_cd_ptr(smmu_domain, ssid);
> + __le64 *cdptr;
> 
> + if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg.s1cdmax)))
> + return 

Re: [RFC 3/5] x86/PCI: Expose VMD's device in pci_sysdata

2020-01-09 Thread Christoph Hellwig
On Tue, Dec 31, 2019 at 01:24:21PM -0700, Jon Derrick wrote:
> To be used by intel-iommu code to find the correct domain.

Any reason to prefer this version over my patches 2 and 3 from the
series in August?


Re: [PATCH v3 2/5] iommu/arm-smmu: Add support for split pagetables

2020-01-09 Thread Will Deacon
On Mon, Dec 16, 2019 at 09:37:48AM -0700, Jordan Crouse wrote:
> Add support to enable split pagetables (TTBR1) if the supporting driver
> requests it via the DOMAIN_ATTR_SPLIT_TABLES flag. When enabled, the driver
> will set up the TTBR0 and TTBR1 regions and program the default domain
> pagetable on TTBR1.
> 
> After attaching the device, the value of he domain attribute can
> be queried to see if the split pagetables were successfully programmed.
> Furthermore the domain geometry will be updated so that the caller can
> determine the active region for the pagetable that was programmed.
> 
> Signed-off-by: Jordan Crouse 
> ---
> 
>  drivers/iommu/arm-smmu.c | 40 +++-
>  drivers/iommu/arm-smmu.h | 45 +++--
>  2 files changed, 74 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index c106406..7b59116 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -538,9 +538,17 @@ static void arm_smmu_init_context_bank(struct 
> arm_smmu_domain *smmu_domain,
>   cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr;
>   cb->ttbr[1] = 0;
>   } else {
> - cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
> - cb->ttbr[0] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
> - cb->ttbr[1] = FIELD_PREP(TTBRn_ASID, cfg->asid);
> + if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
> + cb->ttbr[0] = FIELD_PREP(TTBRn_ASID, cfg->asid);
> + cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
> + cb->ttbr[1] |=
> + FIELD_PREP(TTBRn_ASID, cfg->asid);
> + } else {
> + cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
> + cb->ttbr[0] |=
> + FIELD_PREP(TTBRn_ASID, cfg->asid);
> + cb->ttbr[1] = FIELD_PREP(TTBRn_ASID, cfg->asid);
> + }

I still don't understand why you have to set the ASID in both of the TTBRs.
Assuming TCR.A1 is clear, then we should only need to set the field in
TTBR0.

>   }
>   } else {
>   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
> @@ -651,6 +659,7 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   enum io_pgtable_fmt fmt;
>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> + u32 quirks = 0;
>  
> 	mutex_lock(&smmu_domain->init_mutex);
>   if (smmu_domain->smmu)
> @@ -719,6 +728,8 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   oas = smmu->ipa_size;
>   if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64) {
>   fmt = ARM_64_LPAE_S1;
> + if (smmu_domain->split_pagetables)
> + quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;
>   } else if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_L) {
>   fmt = ARM_32_LPAE_S1;
>   ias = min(ias, 32UL);
> @@ -788,6 +799,7 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   .coherent_walk  = smmu->features & ARM_SMMU_FEAT_COHERENT_WALK,
>   .tlb= smmu_domain->flush_ops,
>   .iommu_dev  = smmu->dev,
> + .quirks = quirks,
>   };
>  
>   if (smmu_domain->non_strict)
> @@ -801,8 +813,15 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>  
>   /* Update the domain's page sizes to reflect the page table format */
>   domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> - domain->geometry.aperture_end = (1UL << ias) - 1;
> - domain->geometry.force_aperture = true;
> +
> + if (pgtbl_cfg.quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
> + domain->geometry.aperture_start = ~((1ULL << ias) - 1);
> + domain->geometry.aperture_end = ~0UL;
> + } else {
> + domain->geometry.aperture_end = (1UL << ias) - 1;
> + domain->geometry.force_aperture = true;
> + smmu_domain->split_pagetables = false;
> + }
>  
>   /* Initialise the context bank with our page table cfg */
> 	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
> @@ -1484,6 +1503,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
> *domain,
>   case DOMAIN_ATTR_NESTING:
>   *(int *)data = (smmu_domain->stage == 
> ARM_SMMU_DOMAIN_NESTED);
>   return 0;
> + case DOMAIN_ATTR_SPLIT_TABLES:
> + *(int *)data = smmu_domain->split_pagetables;
> + return 0;
>   default:
>   return 

Re: [rfc] dma-mapping: preallocate unencrypted DMA atomic pool

2020-01-09 Thread Christoph Hellwig
On Tue, Jan 07, 2020 at 11:57:24AM -0800, David Rientjes wrote:
> I'll rely on Thomas to chime in if this doesn't make sense for the SEV 
> usecase.
> 
> I think the sizing of the single atomic pool needs to be determined.  Our 
> peak usage that we have measured from NVMe is ~1.4MB and atomic_pool is 
> currently sized to 256KB by default.  I'm unsure at this time if we need 
> to be able to dynamically expand this pool with a kworker.
>
> Maybe when CONFIG_AMD_MEM_ENCRYPT is enabled this atomic pool should be 
> sized to 2MB or so and then when it reaches half capacity we schedule some 
> background work to dynamically increase it?  That wouldn't be hard unless 
> the pool can be rapidly depleted.
> 

Note that a non-coherent architecture with the same workload would need
the same size.

> Do we want to increase the atomic pool size by default and then do 
> background dynamic expansion?

For now I'd just scale with system memory size.


Re: [PATCH v4 05/16] drivers/iommu: Take a ref to the IOMMU driver prior to ->add_device()

2020-01-09 Thread Will Deacon
Hi Greg,

On Thu, Dec 19, 2019 at 03:44:37PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Dec 19, 2019 at 12:03:41PM +, Will Deacon wrote:
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index f2223cbb5fd5..e9f94d3f7a04 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -246,9 +246,10 @@ struct iommu_iotlb_gather {
> >   * @sva_get_pasid: Get PASID associated to a SVA handle
> >   * @page_response: handle page request response
> >   * @cache_invalidate: invalidate translation caches
> > - * @pgsize_bitmap: bitmap of all possible supported page sizes
> >   * @sva_bind_gpasid: bind guest pasid and mm
> >   * @sva_unbind_gpasid: unbind guest pasid and mm
> > + * @pgsize_bitmap: bitmap of all possible supported page sizes
> > + * @owner: Driver module providing these ops
> >   */
> >  struct iommu_ops {
> > bool (*capable)(enum iommu_cap);
> > @@ -318,6 +319,7 @@ struct iommu_ops {
> > int (*sva_unbind_gpasid)(struct device *dev, int pasid);
> >  
> > unsigned long pgsize_bitmap;
> > +   struct module *owner;
> 
> Everyone is always going to forget to set this field.  I don't think you
> even set it for all of the different iommu_ops possible in this series,
> right?

I only initialised the field for those drivers which can actually be built
as a module, but I take your point about this being error-prone.

> The "trick" we did to keep people from having to remember this is to do
> what we did for the bus registering functions.
> 
> Look at pci_register_driver in pci.h:
> #define pci_register_driver(driver) \
> __pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME)
> 
> Then we set the .owner field in the "real" __pci_register_driver() call.
> 
> Same thing for USB and lots, if not all, other driver register
> functions.
> 
> You can do the same thing here, and I would recommend it.

Yes, that makes sense, cheers. Diff below. I'll send it to Joerg along
with some other SMMU patches that have come in since the holiday.

Will

--->8

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 03dc97842875..e82997a705a8 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2733,7 +2733,6 @@ static struct iommu_ops arm_smmu_ops = {
.get_resv_regions   = arm_smmu_get_resv_regions,
.put_resv_regions   = arm_smmu_put_resv_regions,
.pgsize_bitmap  = -1UL, /* Restricted during device attach */
-   .owner  = THIS_MODULE,
 };
 
 /* Probing and initialisation functions */
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5ef1f2e100d7..93d332423f6f 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1623,7 +1623,6 @@ static struct iommu_ops arm_smmu_ops = {
.get_resv_regions   = arm_smmu_get_resv_regions,
.put_resv_regions   = arm_smmu_put_resv_regions,
.pgsize_bitmap  = -1UL, /* Restricted during device attach */
-   .owner  = THIS_MODULE,
 };
 
 static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index e9f94d3f7a04..90007c92ad2d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -388,12 +388,19 @@ void iommu_device_sysfs_remove(struct iommu_device 
*iommu);
 int  iommu_device_link(struct iommu_device   *iommu, struct device *link);
 void iommu_device_unlink(struct iommu_device *iommu, struct device *link);
 
-static inline void iommu_device_set_ops(struct iommu_device *iommu,
-   const struct iommu_ops *ops)
+static inline void __iommu_device_set_ops(struct iommu_device *iommu,
+ const struct iommu_ops *ops)
 {
iommu->ops = ops;
 }
 
+#define iommu_device_set_ops(iommu, ops)   \
+do {   \
+   struct iommu_ops *__ops = (struct iommu_ops *)(ops);\
+   __ops->owner = THIS_MODULE; \
+   __iommu_device_set_ops(iommu, __ops);   \
+} while (0)
+
 static inline void iommu_device_set_fwnode(struct iommu_device *iommu,
   struct fwnode_handle *fwnode)
 {
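
With that in place a driver's call site stays unchanged while the owner is
captured automatically, e.g. (illustrative):

	/* THIS_MODULE expands at this call site, so the macro fills in
	 * arm_smmu_ops.owner without the driver touching it. */
	iommu_device_set_ops(&smmu->iommu, &arm_smmu_ops);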


Re: [PATCH 2/2] iommu/arm-smmu-v3: simplify parse_driver_options()

2020-01-09 Thread Robin Murphy

On 26/12/2019 9:51 am, Masahiro Yamada wrote:

Using ARRAY_SIZE() instead of the sentinel is slightly simpler, IMHO.


Given that it's fairly well-decided that we don't want to add any more 
of these anyway, I'd be inclined to lose the array/loop machinery 
altogether. As it is we'd need a lot more options for it to actually 
offer any kind of code size saving.


Robin.


Signed-off-by: Masahiro Yamada 
---

  drivers/iommu/arm-smmu-v3.c | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index ed9933960370..b27489b7f9d8 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -676,7 +676,6 @@ struct arm_smmu_option_prop {
  static const struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" },
{ ARM_SMMU_OPT_PAGE0_REGS_ONLY, "cavium,cn9900-broken-page1-regspace"},
-   { 0, NULL},
  };
  
  static inline void __iomem *arm_smmu_page1_fixup(unsigned long offset,

@@ -696,16 +695,16 @@ static struct arm_smmu_domain *to_smmu_domain(struct 
iommu_domain *dom)
  
  static void parse_driver_options(struct arm_smmu_device *smmu)

  {
-   int i = 0;
+   int i;
  
-	do {

+   for (i = 0; i < ARRAY_SIZE(arm_smmu_options); i++) {
if (of_property_read_bool(smmu->dev->of_node,
arm_smmu_options[i].prop)) {
smmu->options |= arm_smmu_options[i].opt;
dev_notice(smmu->dev, "option %s\n",
arm_smmu_options[i].prop);
}
-   } while (arm_smmu_options[++i].opt);
+   };
  }
  
  /* Low-level queue manipulation functions */
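
To illustrate Robin's suggestion, dropping the machinery entirely could
look like this (a sketch using only the two options from the table above):

	static void parse_driver_options(struct arm_smmu_device *smmu)
	{
		struct device_node *np = smmu->dev->of_node;

		if (of_property_read_bool(np, "hisilicon,broken-prefetch-cmd"))
			smmu->options |= ARM_SMMU_OPT_SKIP_PREFETCH;

		if (of_property_read_bool(np, "cavium,cn9900-broken-page1-regspace"))
			smmu->options |= ARM_SMMU_OPT_PAGE0_REGS_ONLY;
	}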





Re: [PATCH] iommu/arm-smmu-v3: fix resource_size check

2020-01-09 Thread Robin Murphy

On 26/12/2019 9:50 am, Masahiro Yamada wrote:

This is an off-by-one mistake.

resource_size() returns res->end - res->start + 1.


Heh, despite the optimism of "Avoid one-off errors by introducing a 
resource_size() function", life finds a way...
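
To spell the off-by-one out with assumed numbers:

	/* A region spanning 0x0 .. 0x1ffff is 0x20000 bytes, and
	 * resource_size() already includes the +1: */
	struct resource res = { .start = 0x0, .end = 0x1ffff };
	/* resource_size(&res) == 0x20000, so the old check
	 *	resource_size(res) + 1 < arm_smmu_resource_size(smmu)
	 * let a region one byte too small slip through. */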


Reviewed-by: Robin Murphy 


Signed-off-by: Masahiro Yamada 
---

  drivers/iommu/arm-smmu-v3.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d9e0d9c19b4f..b463599559d2 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -3599,7 +3599,7 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
  
  	/* Base address */

res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-   if (resource_size(res) + 1 < arm_smmu_resource_size(smmu)) {
+   if (resource_size(res) < arm_smmu_resource_size(smmu)) {
dev_err(dev, "MMIO region too small (%pr)\n", res);
return -EINVAL;
}




Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices

2020-01-09 Thread Jim,Yan
Hi Baolu,

> -----Original Message-----
> From: Lu Baolu [mailto:baolu...@linux.intel.com]
> Sent: January 9, 2020 16:53
> To: Christoph Hellwig 
> Cc: baolu...@linux.intel.com; Joerg Roedel ; Roland
> Dreier ; Jim,Yan ;
> iommu@lists.linux-foundation.org
> Subject: Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched
> devices
> 
> On 1/9/20 3:06 PM, Christoph Hellwig wrote:
> > On Thu, Jan 09, 2020 at 07:28:41AM +0800, Lu Baolu wrote:
> >> This patch has been replaced with this one.
> >>
> >> https://lkml.org/lkml/2020/1/5/103
> >
> > That still mentions a "nvme host device", which despite the different
> > spelling still doesn't make any sense.
> >
> 
> Jim, can you please refine it accordingly?
> 
> Best regards,
> Baolu


Yes, I am working on it.

Regards
Jim

Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices

2020-01-09 Thread Lu Baolu

On 1/9/20 3:06 PM, Christoph Hellwig wrote:

On Thu, Jan 09, 2020 at 07:28:41AM +0800, Lu Baolu wrote:

This patch has been replaced with this one.

https://lkml.org/lkml/2020/1/5/103


That still mentions a "nvme host device", which despite the different
spelling still doesn't make any sense.



Jim, can you please refine it accordingly?

Best regards,
baolu