Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted

2019-03-24 Thread David Gibson
On Sat, Mar 23, 2019 at 05:01:35PM -0400, Michael S. Tsirkin wrote:
> On Thu, Mar 21, 2019 at 09:05:04PM -0300, Thiago Jung Bauermann wrote:
> > Michael S. Tsirkin  writes:
[snip]
> > >> > Is there any justification to doing that beyond someone putting
> > >> > out slow code in the past?
> > >>
> > >> The definition of the ACCESS_PLATFORM flag is generic and captures the
> > >> notion of memory access restrictions for the device. Unfortunately, on
> > >> powerpc pSeries guests it also implies that the IOMMU is turned on
> > >
> > > IIUC that's really because on pSeries IOMMU is *always* turned on.
> > > Platform has no way to say what you want it to say
> > > which is bypass the iommu for the specific device.
> > 
> > Yes, that's correct. pSeries guests running on KVM are in a gray area
> > where theoretically they use an IOMMU but in practice KVM ignores it.
> > It's unfortunate but it's the reality on the ground today. :-/

Um.. I'm not sure what you mean by this.  As far as I'm concerned
there is always a guest-visible (paravirtualized) IOMMU, and that will
be backed onto the host IOMMU when necessary.

[Actually there is an IOMMU bypass hack that's used by the guest
 firmware, but I don't think we want to expose that]

> Well it's not just the reality, virt setups need something that
> emulated IOMMUs don't provide. That is not uncommon, e.g.
> intel's VTD has a "cache mode" field which AFAIK is only used for virt.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

2019-03-24 Thread Lu Baolu

Hi James,

On 3/22/19 6:05 PM, James Sewart wrote:

Hey Lu,


On 20 Mar 2019, at 01:26, Lu Baolu  wrote:

Hi James,

On 3/19/19 9:35 PM, James Sewart wrote:

Hey Lu,

On 15 Mar 2019, at 03:13, Lu Baolu  wrote:

Hi James,

On 3/14/19 7:56 PM, James Sewart wrote:

Patches 1 and 2 are the same as v1.
v1-v2:
   Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
   Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
reserved regions.
   Integrated patches by Lu to remove more unused code in cleanup.
Lu: I didn't integrate your patch to set the default domain type as it
isn't directly related to the aim of this patchset. Instead patch 4


Without those patches, user experience will be affected and some devices
will not work on Intel platforms anymore.

For a long time, the Intel IOMMU driver has had its own logic to determine
whether a device requires an identity domain. For example, when the user
specifies "iommu=pt" as a kernel parameter, all devices will be attached
to the identity domain. Furthermore, some quirky devices require
an identity domain to be used before enabling DMA remapping; otherwise,
they will not work. This was done by adding quirk bits in the Intel IOMMU
driver.

So from my point of view, one way is to port all those quirks and kernel
parameters into the generic IOMMU layer; the other is to open a door for the
vendor IOMMU driver to determine the default domain type on its own. I prefer
the latter option since it will not impact any behaviors on other
architectures.

I see your point. I’m not confident that using the proposed door to set a
group's default domain has the desired behaviour. As discussed before, the
default domain type will be set based on the desired type for only the
first device attached to a group. I think to change the default domain
type you would need a slightly different door that wasn’t conditioned on
device.


I think of this as another problem. Just a summary for ease of
discussion. We saw two problems:

1. When allocating a new group for a device, how should we determine the
type of the default domain? This is what my proposal patches trying to
address.


This will work as expected only if all devices within a group get the same
result from is_identity_map. Otherwise we see issue 2.



2. If we need to put a device into an existing group which uses a
different type of domain from what the device desires to use, we might
break the functionality of the device. For this problem I'd second your
proposal below if I get your point correctly.


For situations where individual devices require an identity domain because
of quirks then maybe calling is_identity_map per device in
iommu_group_get_for_dev is a better solution than the one I proposed.


Do you mean if we see a quirky device requires a different domain type
other than the default domain type of the group, we will assign a new
group to it? That looks good to me as far as I can see. I suppose this
should be done in vt-d's ops callback.


I have thought about this a bit and I think the cleanest approach is to
put devices that require an identity domain into their own group. This can
be done in the device_group callback, avoiding any situation where we have
to deal with creating a new group based on domain type in
iommu_group_get_for_dev. This way we shouldn’t ever be in a situation with
multiple different domain types per group, and your patches will work
as expected. See below for a possible implementation.



Best regards,
Lu Baolu


Cheers,
James.

Devices that require an identity map because of quirks or other reasons
should be put in their own IOMMU group so as to not end up with multiple
different domains per group.


Yeah! This looks good to me.

Best regards,
Lu Baolu



Signed-off-by: James Sewart 

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3cb8b36abf50..0f5a121d31a0 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5421,6 +5421,22 @@ struct intel_iommu *intel_svm_device_to_iommu(struct device *dev)
  }
  #endif /* CONFIG_INTEL_IOMMU_SVM */

+static struct iommu_group *intel_identity_group;
+
+struct iommu_group *intel_iommu_pci_device_group(struct device *dev)
+{
+   if (iommu_no_mapping(dev)) {
+   if (!intel_identity_group) {
+   intel_identity_group = iommu_group_alloc();
+   if (IS_ERR(intel_identity_group))
+   return NULL;
+   }
+   return intel_identity_group;
+   }
+
+   return pci_device_group(dev);
+}
+
  const struct iommu_ops intel_iommu_ops = {
 .capable= intel_iommu_capable,
 .domain_alloc   = intel_iommu_domain_alloc,
@@ -5435,7 +5451,7 @@ const struct iommu_ops intel_iommu_ops = {
 .get_resv_regions   = intel_iommu_get_resv_regions,
 .put_resv_regions   = intel_iommu_put_resv_regions,
 .apply_resv_region  = 

Re: [PATCH 1/1] iommu: Add config option to set lazy mode as default

2019-03-24 Thread Leizhen (ThunderTown)



On 2019/3/22 22:42, John Garry wrote:
> On 22/03/2019 14:11, Zhen Lei wrote:
> 
>> This allows the default behaviour to be controlled by a kernel config
>> option instead of changing the command line for the kernel to include
>> "iommu.strict=0" on ARM64 where this is desired.
>>
>> This is similar to CONFIG_IOMMU_DEFAULT_PASSTHROUGH
>>
>> Note: At present, intel_iommu, amd_iommu and s390_iommu use lazy mode as
>> defalut, so there is no need to add code for them.
> 
> /s/defalut/default/
> 
>>
>> Signed-off-by: Zhen Lei 
>> ---
>>  drivers/iommu/Kconfig | 14 ++
>>  drivers/iommu/iommu.c |  5 +
> 
> Do we need to update kernel-parameters.txt for iommu.strict?
> 
>>  2 files changed, 19 insertions(+)
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index 6f07f3b21816c64..5ec9780f564eaf8 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -85,6 +85,20 @@ config IOMMU_DEFAULT_PASSTHROUGH
>>
>>If unsure, say N here.
>>
>> +config IOMMU_LAZY_MODE
> 
> maybe should add "DMA" to the name, and even "DEFAULT"
OK, thanks

> 
>> +bool "IOMMU use lazy mode to flush IOTLB and free IOVA"
>> +depends on IOMMU_API
>> +help
>> +  For every IOMMU unmap operation, the flush operation of IOTLB and the free
>> +  operation of IOVA are deferred.
> 
> This is a bit unclear, as there is no context. I think that you need to say 
> something like, "Support lazy mode, where for every IOMMU DMA unmap 
> operation, the flush operation of IOTLB and the free operation of IOVA are 
> deferred. "
> 
> 
>> +  They are only guaranteed to be done before
>> +  the related IOVA will be reused. Removing the need to pass in iommu.strict=0
>> +  through command line on ARM64(Now, intel_iommu, amd_iommu, s390_iommu use
>> +  lazy mode as deault).
> 
> prone to going out-of-date
> 
>> +  If this is enabled, you can still disable with kernel
>> +  parameters, such as iommu.strict=1, intel_iommu=strict, amd_iommu=fullflush
>> +  or s390_iommu=strict depending on the architecture.
>> +
>> +  If unsure, say N here.
>> +
>>  config OF_IOMMU
>> def_bool y
>> depends on OF && IOMMU_API
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 33a982e33716369..e307d70d1578b3b 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -43,7 +43,12 @@
>>  #else
>>  static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_DMA;
>>  #endif
>> +
>> +#ifdef CONFIG_IOMMU_LAZY_MODE
>> +static bool iommu_dma_strict __read_mostly;
>> +#else
>>  static bool iommu_dma_strict __read_mostly = true;
>> +#endif
>>
>>  struct iommu_callback_data {
>>  const struct iommu_ops *ops;
>> -- 
>> 1.8.3
>>
>>
>>
> 
> Cheers
> 
> 
> 
> 
> .
> 

-- 
Thanks!
Best Regards



Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

2019-03-24 Thread Lu Baolu

Hi James,

On 3/22/19 5:57 PM, James Sewart wrote:

Hey Lu,


On 15 Mar 2019, at 02:19, Lu Baolu  wrote:

Hi James,

On 3/14/19 7:58 PM, James Sewart wrote:

To support mapping ISA region via iommu_group_create_direct_mappings,
make sure its exposed by iommu_get_resv_regions. This allows
deduplication of reserved region mappings
Signed-off-by: James Sewart 
---
  drivers/iommu/intel-iommu.c | 42 +
  1 file changed, 33 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8e0a4e2ff77f..2e00e8708f06 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
  #define for_each_rmrr_units(rmrr) \
	list_for_each_entry(rmrr, &dmar_rmrr_units, list)
  +static struct iommu_resv_region *isa_resv_region;
+
  /* bitmap for indexing intel_iommus */
  static int g_num_of_iommus;
@@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
  rmrr->end_address);
  }
  +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
+{
+   if (!isa_resv_region)
+   isa_resv_region = iommu_alloc_resv_region(0,
+   16*1024*1024,
+   0, IOMMU_RESV_DIRECT);
+
+   return isa_resv_region;
+}
+
  #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
-static inline void iommu_prepare_isa(void)
+static inline void iommu_prepare_isa(struct pci_dev *pdev)
  {
-   struct pci_dev *pdev;
int ret;
+   struct iommu_resv_region *reg = iommu_get_isa_resv_region();
  - pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
-   if (!pdev)
+   if (!reg)
return;
pr_info("Prepare 0-16MiB unity mapping for LPC\n");
-   ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
+   ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
+				    reg->start + reg->length - 1);
if (ret)
		pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
-
-   pci_dev_put(pdev);
  }
  #else
-static inline void iommu_prepare_isa(void)
+static inline void iommu_prepare_isa(struct pci_dev *pdev)
  {
return;
  }
@@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
struct dmar_rmrr_unit *rmrr;
bool copied_tables = false;
struct device *dev;
+   struct pci_dev *pdev;
struct intel_iommu *iommu;
int i, ret;
  @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
}
}
  - iommu_prepare_isa();
+   pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
+   if (pdev) {
+   iommu_prepare_isa(pdev);
+   pci_dev_put(pdev);
+   }
domains_done:
@@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
struct iommu_resv_region *reg;
struct dmar_rmrr_unit *rmrr;
struct device *i_dev;
+   struct pci_dev *pdev;
int i;
rcu_read_lock();
@@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
}
rcu_read_unlock();
  + if (dev_is_pci(device)) {
+   pdev = to_pci_dev(device);
+   if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
+   reg = iommu_get_isa_resv_region();
			list_add_tail(&reg->list, head);
+   }
+   }
+


Just wondering why not just

+#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
+   if (dev_is_pci(device)) {
+   pdev = to_pci_dev(device);
+   if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
+   reg = iommu_alloc_resv_region(0,
+   16*1024*1024,
+   0, IOMMU_RESV_DIRECT);
+   if (reg)
+			list_add_tail(&reg->list, head);
+   }
+   }
+#endif

and, remove all other related code?


At this point in the patchset if we remove iommu_prepare_isa then the ISA
region won’t be mapped to the device. Only once the dma domain is allocable
will the reserved regions be mapped by iommu_group_create_direct_mappings.


Yes. So if we put the allocation code here, it won't impact anything and
will take effect as soon as the dma domain is allocatable.



There's an issue that if we choose to alloc a new resv_region with type
IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
to free this entry type which means refactoring the rmrr regions in
get_resv_regions. Should this work be in this patchset?


Do you mean the rmrr regions are not allocated in get_resv_regions, but
are freed in put_resv_regions? I think we should fix this in this patch
set since this might impact the device passthrough if we don't do it.

Best regards,
Lu Baolu

[PATCH v8 9/9] vfio/type1: Handle different mdev isolation type

2019-03-24 Thread Lu Baolu
This adds support for determining the isolation type
of a mediated device group by checking whether it has
an iommu device. If an iommu device exists, an iommu
domain will be allocated and then attached to the iommu
device. Otherwise, the existing behavior is kept.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Sanjay Kumar 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
---
 drivers/vfio/vfio_iommu_type1.c | 55 +
 1 file changed, 42 insertions(+), 13 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index ccc4165474aa..b91cafcd5181 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -559,7 +559,7 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
	mutex_lock(&iommu->lock);
 
/* Fail if notifier list is empty */
-   if ((!iommu->external_domain) || (!iommu->notifier.head)) {
+   if (!iommu->notifier.head) {
ret = -EINVAL;
goto pin_done;
}
@@ -641,11 +641,6 @@ static int vfio_iommu_type1_unpin_pages(void *iommu_data,
 
	mutex_lock(&iommu->lock);
 
-   if (!iommu->external_domain) {
-		mutex_unlock(&iommu->lock);
-   return -EINVAL;
-   }
-
do_accounting = !IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu);
for (i = 0; i < npage; i++) {
struct vfio_dma *dma;
@@ -1368,13 +1363,40 @@ static void vfio_iommu_detach_group(struct vfio_domain *domain,
iommu_detach_group(domain->domain, group->iommu_group);
 }
 
+static bool vfio_bus_is_mdev(struct bus_type *bus)
+{
+   struct bus_type *mdev_bus;
+   bool ret = false;
+
+   mdev_bus = symbol_get(mdev_bus_type);
+   if (mdev_bus) {
+   ret = (bus == mdev_bus);
+   symbol_put(mdev_bus_type);
+   }
+
+   return ret;
+}
+
+static int vfio_mdev_iommu_device(struct device *dev, void *data)
+{
+   struct device **old = data, *new;
+
+   new = vfio_mdev_get_iommu_device(dev);
+   if (!new || (*old && *old != new))
+   return -EINVAL;
+
+   *old = new;
+
+   return 0;
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 struct iommu_group *iommu_group)
 {
struct vfio_iommu *iommu = iommu_data;
struct vfio_group *group;
struct vfio_domain *domain, *d;
-   struct bus_type *bus = NULL, *mdev_bus;
+   struct bus_type *bus = NULL;
int ret;
bool resv_msi, msi_remap;
phys_addr_t resv_msi_base;
@@ -1409,23 +1431,30 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
if (ret)
goto out_free;
 
-   mdev_bus = symbol_get(mdev_bus_type);
+   if (vfio_bus_is_mdev(bus)) {
+   struct device *iommu_device = NULL;
 
-   if (mdev_bus) {
-   if ((bus == mdev_bus) && !iommu_present(bus)) {
-   symbol_put(mdev_bus_type);
+   group->mdev_group = true;
+
+   /* Determine the isolation type */
+		ret = iommu_group_for_each_dev(iommu_group, &iommu_device,
+  vfio_mdev_iommu_device);
+   if (ret || !iommu_device) {
if (!iommu->external_domain) {
				INIT_LIST_HEAD(&domain->group_list);
iommu->external_domain = domain;
-   } else
+   } else {
kfree(domain);
+   }
 
			list_add(&group->next,
				 &iommu->external_domain->group_list);
			mutex_unlock(&iommu->lock);
+
return 0;
}
-   symbol_put(mdev_bus_type);
+
+   bus = iommu_device->bus;
}
 
domain->domain = iommu_domain_alloc(bus);
-- 
2.17.1



[PATCH v8 8/9] vfio/type1: Add domain at(de)taching group helpers

2019-03-24 Thread Lu Baolu
This adds helpers to attach a domain to, or detach it
from, a group. These will replace iommu_attach_group(),
which only works for non-mdev devices.

If a domain is attached to a group which includes
mediated devices, it should attach to the iommu device
(a pci device which represents the mdev in iommu scope)
instead. The added helpers support attaching a domain to
groups for both pci and mdev devices.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Sanjay Kumar 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
---
 drivers/vfio/vfio_iommu_type1.c | 84 ++---
 1 file changed, 77 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 73652e21efec..ccc4165474aa 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -91,6 +91,7 @@ struct vfio_dma {
 struct vfio_group {
struct iommu_group  *iommu_group;
struct list_headnext;
+   boolmdev_group; /* An mdev group */
 };
 
 /*
@@ -1298,6 +1299,75 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group *group, phys_addr_t *base)
return ret;
 }
 
+static struct device *vfio_mdev_get_iommu_device(struct device *dev)
+{
+   struct device *(*fn)(struct device *dev);
+   struct device *iommu_device;
+
+   fn = symbol_get(mdev_get_iommu_device);
+   if (fn) {
+   iommu_device = fn(dev);
+   symbol_put(mdev_get_iommu_device);
+
+   return iommu_device;
+   }
+
+   return NULL;
+}
+
+static int vfio_mdev_attach_domain(struct device *dev, void *data)
+{
+   struct iommu_domain *domain = data;
+   struct device *iommu_device;
+
+   iommu_device = vfio_mdev_get_iommu_device(dev);
+   if (iommu_device) {
+   if (iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
+   return iommu_aux_attach_device(domain, iommu_device);
+   else
+   return iommu_attach_device(domain, iommu_device);
+   }
+
+   return -EINVAL;
+}
+
+static int vfio_mdev_detach_domain(struct device *dev, void *data)
+{
+   struct iommu_domain *domain = data;
+   struct device *iommu_device;
+
+   iommu_device = vfio_mdev_get_iommu_device(dev);
+   if (iommu_device) {
+   if (iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
+   iommu_aux_detach_device(domain, iommu_device);
+   else
+   iommu_detach_device(domain, iommu_device);
+   }
+
+   return 0;
+}
+
+static int vfio_iommu_attach_group(struct vfio_domain *domain,
+  struct vfio_group *group)
+{
+   if (group->mdev_group)
+   return iommu_group_for_each_dev(group->iommu_group,
+   domain->domain,
+   vfio_mdev_attach_domain);
+   else
+   return iommu_attach_group(domain->domain, group->iommu_group);
+}
+
+static void vfio_iommu_detach_group(struct vfio_domain *domain,
+   struct vfio_group *group)
+{
+   if (group->mdev_group)
+   iommu_group_for_each_dev(group->iommu_group, domain->domain,
+vfio_mdev_detach_domain);
+   else
+   iommu_detach_group(domain->domain, group->iommu_group);
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 struct iommu_group *iommu_group)
 {
@@ -1373,7 +1443,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
goto out_domain;
}
 
-   ret = iommu_attach_group(domain->domain, iommu_group);
+   ret = vfio_iommu_attach_group(domain, group);
if (ret)
goto out_domain;
 
@@ -1405,8 +1475,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
	list_for_each_entry(d, &iommu->domain_list, next) {
if (d->domain->ops == domain->domain->ops &&
d->prot == domain->prot) {
-   iommu_detach_group(domain->domain, iommu_group);
-   if (!iommu_attach_group(d->domain, iommu_group)) {
+   vfio_iommu_detach_group(domain, group);
+   if (!vfio_iommu_attach_group(d, group)) {
				list_add(&group->next, &d->group_list);
iommu_domain_free(domain->domain);
kfree(domain);
@@ -1414,7 +1484,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
return 0;
}
 
-   ret = iommu_attach_group(domain->domain, iommu_group);
+   ret = vfio_iommu_attach_group(domain, group);
if (ret)
  

[PATCH v8 6/9] iommu/vt-d: Return ID associated with an auxiliary domain

2019-03-24 Thread Lu Baolu
This adds support for returning the default pasid associated with
an auxiliary domain. The PCI device which is bound to this
domain should use this value as the pasid for all DMA requests
of the subset of the device which is isolated and protected by
this domain.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Sanjay Kumar 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel-iommu.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 28a998afaf74..c137f0f2cf49 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5697,6 +5697,15 @@ intel_iommu_dev_feat_enabled(struct device *dev, enum iommu_dev_features feat)
return false;
 }
 
+static int
+intel_iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
+{
+   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+
+   return dmar_domain->default_pasid > 0 ?
+   dmar_domain->default_pasid : -EINVAL;
+}
+
 const struct iommu_ops intel_iommu_ops = {
.capable= intel_iommu_capable,
.domain_alloc   = intel_iommu_domain_alloc,
@@ -5705,6 +5714,7 @@ const struct iommu_ops intel_iommu_ops = {
.detach_dev = intel_iommu_detach_device,
.aux_attach_dev = intel_iommu_aux_attach_device,
.aux_detach_dev = intel_iommu_aux_detach_device,
+   .aux_get_pasid  = intel_iommu_aux_get_pasid,
.map= intel_iommu_map,
.unmap  = intel_iommu_unmap,
.iova_to_phys   = intel_iommu_iova_to_phys,
-- 
2.17.1



[PATCH v8 7/9] vfio/mdev: Add iommu related member in mdev_device

2019-03-24 Thread Lu Baolu
A parent device might create different types of mediated
devices. For example, a mediated device could be created
by the parent device with full isolation and protection
provided by the IOMMU. One use case can be found on
Intel platforms, where a mediated device is an assignable
subset of a PCI device and the DMA requests on behalf of
it are all tagged with a PASID. Since the IOMMU supports
PASID-granular translations (scalable mode in VT-d 3.0),
such a mediated device can be individually protected and
isolated by the IOMMU.

This patch adds a new member in the struct mdev_device to
indicate that the mediated device represented by mdev could
be isolated and protected by attaching a domain to a device
represented by mdev->iommu_device. It also adds a helper to
add or set the iommu device.

* mdev_device->iommu_device
  - This, if set, indicates that the mediated device could
be fully isolated and protected by IOMMU via attaching
an iommu domain to this device. If empty, it indicates
using vendor defined isolation, hence bypass IOMMU.

* mdev_set/get_iommu_device(dev, iommu_device)
  - Set or get the iommu device which represents this mdev
in IOMMU's device scope. Drivers don't need to set the
iommu device if it uses vendor defined isolation.
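To make the intended opt-in flow concrete, here is a minimal, hypothetical sketch of a parent driver calling the new helper from its @create() callback. Only mdev_set_iommu_device() comes from this patch; the function name, the use of the parent as the iommu device, and the elided ADI/PASID setup are illustrative assumptions.

```c
/*
 * Hypothetical parent-driver @create() callback opting in to
 * IOMMU-backed isolation for the mdev being created.
 */
static int my_parent_create(struct kobject *kobj, struct mdev_device *mdev)
{
	struct device *dev = mdev_dev(mdev);
	/* Assumption: the PCI parent represents this mdev in IOMMU scope. */
	struct device *iommu_device = mdev_parent_dev(mdev);
	int ret;

	/* ... allocate ADI resources, assign a PASID, etc. (elided) ... */

	/*
	 * Tell the mdev core that DMA from this mdev can be isolated by
	 * attaching an IOMMU domain to @iommu_device.  Leaving it unset
	 * keeps the vendor-defined isolation path.
	 */
	ret = mdev_set_iommu_device(dev, iommu_device);
	if (ret)
		return ret;

	return 0;
}
```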

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Cc: Liu Yi L 
Suggested-by: Kevin Tian 
Suggested-by: Alex Williamson 
Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
---
 drivers/vfio/mdev/mdev_core.c| 18 ++
 drivers/vfio/mdev/mdev_private.h |  1 +
 include/linux/mdev.h | 14 ++
 3 files changed, 33 insertions(+)

diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index b96fedc77ee5..1b6435529166 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -390,6 +390,24 @@ int mdev_device_remove(struct device *dev, bool force_remove)
return 0;
 }
 
+int mdev_set_iommu_device(struct device *dev, struct device *iommu_device)
+{
+   struct mdev_device *mdev = to_mdev_device(dev);
+
+   mdev->iommu_device = iommu_device;
+
+   return 0;
+}
+EXPORT_SYMBOL(mdev_set_iommu_device);
+
+struct device *mdev_get_iommu_device(struct device *dev)
+{
+   struct mdev_device *mdev = to_mdev_device(dev);
+
+   return mdev->iommu_device;
+}
+EXPORT_SYMBOL(mdev_get_iommu_device);
+
 static int __init mdev_init(void)
 {
return mdev_bus_register();
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index 379758c52b1b..bfb7b22a7cb6 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -34,6 +34,7 @@ struct mdev_device {
struct list_head next;
struct kobject *type_kobj;
bool active;
+   struct device *iommu_device;
 };
 
 #define to_mdev_device(dev)container_of(dev, struct mdev_device, dev)
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index d7aee90e5da5..df2ea39f47ee 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -15,6 +15,20 @@
 
 struct mdev_device;
 
+/*
+ * Called by the parent device driver to set the device which represents
+ * this mdev in iommu protection scope. By default, the iommu device is
+ * NULL, that indicates using vendor defined isolation.
+ *
+ * @dev: the mediated device that iommu will isolate.
+ * @iommu_device: a pci device which represents the iommu for @dev.
+ *
+ * Return 0 for success, otherwise negative error value.
+ */
+int mdev_set_iommu_device(struct device *dev, struct device *iommu_device);
+
+struct device *mdev_get_iommu_device(struct device *dev);
+
 /**
 * struct mdev_parent_ops - Structure to be registered for each parent device to
  * register the device to mdev module.
-- 
2.17.1



[PATCH v8 4/9] iommu/vt-d: Move common code out of iommu_attach_device()

2019-03-24 Thread Lu Baolu
This part of the code could be used by both the normal and the
aux-domain-specific attach entry points. Hence move it into a
common function to avoid duplication.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel-iommu.c | 60 ++---
 1 file changed, 36 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index ba5f2e5b3a03..a0f9c748ca9f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5066,35 +5066,14 @@ static void intel_iommu_domain_free(struct iommu_domain *domain)
domain_exit(to_dmar_domain(domain));
 }
 
-static int intel_iommu_attach_device(struct iommu_domain *domain,
-struct device *dev)
+static int prepare_domain_attach_device(struct iommu_domain *domain,
+   struct device *dev)
 {
struct dmar_domain *dmar_domain = to_dmar_domain(domain);
struct intel_iommu *iommu;
int addr_width;
u8 bus, devfn;
 
-   if (device_is_rmrr_locked(dev)) {
-		dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.\n");
-   return -EPERM;
-   }
-
-   /* normally dev is not mapped */
-   if (unlikely(domain_context_mapped(dev))) {
-   struct dmar_domain *old_domain;
-
-   old_domain = find_domain(dev);
-   if (old_domain) {
-   rcu_read_lock();
-   dmar_remove_one_dev_info(dev);
-   rcu_read_unlock();
-
-   if (!domain_type_is_vm_or_si(old_domain) &&
-		    list_empty(&old_domain->devices))
-   domain_exit(old_domain);
-   }
-   }
-
	iommu = device_to_iommu(dev, &bus, &devfn);
if (!iommu)
return -ENODEV;
@@ -5127,7 +5106,40 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
dmar_domain->agaw--;
}
 
-   return domain_add_dev_info(dmar_domain, dev);
+   return 0;
+}
+
+static int intel_iommu_attach_device(struct iommu_domain *domain,
+struct device *dev)
+{
+   int ret;
+
+   if (device_is_rmrr_locked(dev)) {
+		dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.\n");
+   return -EPERM;
+   }
+
+   /* normally dev is not mapped */
+   if (unlikely(domain_context_mapped(dev))) {
+   struct dmar_domain *old_domain;
+
+   old_domain = find_domain(dev);
+   if (old_domain) {
+   rcu_read_lock();
+   dmar_remove_one_dev_info(dev);
+   rcu_read_unlock();
+
+   if (!domain_type_is_vm_or_si(old_domain) &&
+		    list_empty(&old_domain->devices))
+   domain_exit(old_domain);
+   }
+   }
+
+   ret = prepare_domain_attach_device(domain, dev);
+   if (ret)
+   return ret;
+
+   return domain_add_dev_info(to_dmar_domain(domain), dev);
 }
 
 static void intel_iommu_detach_device(struct iommu_domain *domain,
-- 
2.17.1



[PATCH v8 0/9] vfio/mdev: IOMMU aware mediated device

2019-03-24 Thread Lu Baolu
Hi,

The Mediated Device (mdev) framework enables fine-grained physical
device sharing across isolated domains. Currently the mdev framework
is designed to be independent of the platform IOMMU support. As a
result, the DMA isolation relies on the mdev parent device in a
vendor specific way.

There are several cases where a mediated device could be protected
and isolated by the platform IOMMU. For example, Intel vt-d rev3.0
[1] introduces a new translation mode called 'scalable mode', which
enables PASID-granular translations. The vt-d scalable mode is the
key ingredient for Scalable I/O Virtualization [2] [3] which allows
sharing a device in minimal possible granularity (ADI - Assignable
Device Interface).

A mediated device backed by an ADI could be protected and isolated
by the IOMMU since 1) the parent device supports tagging a unique
PASID to all DMA traffic out of the mediated device; and 2) the DMA
translation unit (IOMMU) supports PASID-granular translation.
We can apply IOMMU protection and isolation to this kind of device
just as we do with an assignable PCI device.

In order to distinguish the IOMMU-capable mediated devices from those
which still need to rely on parent devices, this patch set adds one
new member in struct mdev_device.

* iommu_device
  - This, if set, indicates that the mediated device could
be fully isolated and protected by IOMMU via attaching
an iommu domain to this device. If empty, it indicates
using vendor defined isolation.

Below helpers are added to set and get above iommu device in mdev core
implementation.

* mdev_set/get_iommu_device(dev, iommu_device)
  - Set or get the iommu device which represents this mdev
in IOMMU's device scope. Drivers don't need to set the
iommu device if it uses vendor defined isolation.

The mdev parent device driver can opt in to having the mdev fully
isolated and protected by the IOMMU by invoking
mdev_set_iommu_device() in its @create() callback when the mdev is
being created.

In vfio_iommu_type1_attach_group(), a domain allocated through
iommu_domain_alloc() will be attached to the mdev iommu device if
an iommu device has been set. Otherwise, the dummy external domain
will be used and all DMA isolation and protection are routed to the
parent driver as a result.

On the IOMMU side, a basic requirement is to allow attaching multiple
domains to a PCI device if the device advertises the capability
and the IOMMU hardware supports finer-granularity translations than
the normal PCI Source-ID-based translation.

As a result, a PCI device can work in two modes: normal mode
and auxiliary mode. In normal mode, a PCI device is
isolated at Source ID granularity; the PCI device itself can
be assigned to a user application by attaching a single domain
to it. In auxiliary mode, a PCI device is isolated at a
finer granularity, hence subsets of the device can be assigned
to different user-level applications by attaching a different domain
to each subset.

The following APIs are introduced in the generic IOMMU layer for
aux-domain support:

* iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Detect both IOMMU and PCI endpoint devices supporting
the feature (aux-domain here) without the host driver
dependency.

* iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)
  - Check the enabling status of the feature (aux-domain
here). The aux-domain interfaces are available only
if this returns true.

* iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Enable/disable device specific aux-domain feature.

* iommu_aux_attach_device(domain, dev)
  - Attaches @domain to @dev in the auxiliary mode. Multiple
domains could be attached to a single device in the
auxiliary mode with each domain representing an isolated
address space for an assignable subset of the device.

* iommu_aux_detach_device(domain, dev)
  - Detach @domain which has been attached to @dev in the
auxiliary mode.

* iommu_aux_get_pasid(domain, dev)
  - Return ID used for finer-granularity DMA translation.
For the Intel Scalable IOV usage model, this will be
a PASID. The device which supports Scalable IOV needs
to write this ID to the device register so that DMA
requests can be tagged with the correct PASID prefix.

For ease of discussion, we sometimes say 'a domain in
auxiliary mode' or simply 'an auxiliary domain' when a domain is
attached to a device for finer-granularity translations. But we need
to keep in mind that this doesn't mean there is a different domain
type. The same domain can be bound to one device for Source-ID-based
translation, and bound to another device for finer-granularity
translation at the same time.

This patch series extends both the IOMMU and vfio components to
support mdev device passthrough when the mdev can be isolated and
protected by the IOMMU units. The first part of this series
(PATCH 1/09~6/09) adds the interfaces and implementation of the
multiple domains per device.

[PATCH v8 1/9] iommu: Add APIs for multiple domains per device

2019-03-24 Thread Lu Baolu
Sharing a physical PCI device at a finer granularity is
becoming an industry consensus. IOMMU vendors are also
making efforts to support such sharing as well as
possible. Among these efforts, the capability of supporting
finer-granularity DMA isolation is a common requirement
for security reasons. With finer-granularity DMA isolation,
subsets of a PCI function can be isolated from each other
by the IOMMU. As a result, there is a request in software
to attach multiple domains to a physical PCI device. One
example of such a usage model is Intel Scalable IOV [1] [2].
The Intel VT-d 3.0 spec [3] introduces scalable mode, which
enables PASID-granularity DMA isolation.

This adds the APIs to support multiple domains per device.
To ease the discussion, we call it 'a domain in
auxiliary mode' or simply 'an auxiliary domain' when multiple
domains are attached to a physical device.

The APIs include:

* iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Detect both IOMMU and PCI endpoint devices supporting
the feature (aux-domain here) without the host driver
dependency.

* iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)
  - Check the enabling status of the feature (aux-domain
here). The aux-domain interfaces are available only
if this returns true.

* iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Enable/disable device specific aux-domain feature.

* iommu_aux_attach_device(domain, dev)
  - Attaches @domain to @dev in the auxiliary mode. Multiple
domains could be attached to a single device in the
auxiliary mode with each domain representing an isolated
address space for an assignable subset of the device.

* iommu_aux_detach_device(domain, dev)
  - Detach @domain which has been attached to @dev in the
auxiliary mode.

* iommu_aux_get_pasid(domain, dev)
  - Return ID used for finer-granularity DMA translation.
For the Intel Scalable IOV usage model, this will be
a PASID. The device which supports Scalable IOV needs
to write this ID to the device register so that DMA
requests can be tagged with the correct PASID prefix.

This has been updated with the latest proposal from Joerg
posted here [5].

Many people were involved in discussions of this design:

Kevin Tian 
Liu Yi L 
Ashok Raj 
Sanjay Kumar 
Jacob Pan 
Alex Williamson 
Jean-Philippe Brucker 
Joerg Roedel 

and some discussions can be found here [4] [5].

[1] 
https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[2] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
[3] 
https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[4] https://lkml.org/lkml/2018/7/26/4
[5] https://www.spinics.net/lists/iommu/msg31874.html

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Cc: Liu Yi L 
Suggested-by: Kevin Tian 
Suggested-by: Jean-Philippe Brucker 
Suggested-by: Joerg Roedel 
Signed-off-by: Lu Baolu 
Reviewed-by: Jean-Philippe Brucker 
---
 drivers/iommu/iommu.c | 96 +++
 include/linux/iommu.h | 70 +++
 2 files changed, 166 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1164b9926a2b..1b697feb3e30 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2030,3 +2030,99 @@ int iommu_fwspec_add_ids(struct device *dev, u32 *ids, 
int num_ids)
return 0;
 }
 EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids);
+
+/*
+ * Per device IOMMU features.
+ */
+bool iommu_dev_has_feature(struct device *dev, enum iommu_dev_features feat)
+{
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+
+   if (ops && ops->dev_has_feat)
+   return ops->dev_has_feat(dev, feat);
+
+   return false;
+}
+EXPORT_SYMBOL_GPL(iommu_dev_has_feature);
+
+int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features feat)
+{
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+
+   if (ops && ops->dev_enable_feat)
+   return ops->dev_enable_feat(dev, feat);
+
+   return -ENODEV;
+}
+EXPORT_SYMBOL_GPL(iommu_dev_enable_feature);
+
+/*
+ * The device drivers should do the necessary cleanups before calling this.
+ * For example, before disabling the aux-domain feature, the device driver
+ * should detach all aux-domains. Otherwise, this will return -EBUSY.
+ */
+int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features feat)
+{
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+
+   if (ops && ops->dev_disable_feat)
+   return ops->dev_disable_feat(dev, feat);
+
+   return -EBUSY;
+}
+EXPORT_SYMBOL_GPL(iommu_dev_disable_feature);
+
+bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features 
feat)
+{
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+
+   if (ops && ops->dev_feat_enabled)
+   return ops->dev_feat_enabled(dev, feat);
+
+   return false;

[PATCH v8 3/9] iommu/vt-d: Add per-device IOMMU feature ops entries

2019-03-24 Thread Lu Baolu
This adds the iommu ops entries for aux-domain per-device
feature query and enable/disable.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Sanjay Kumar 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel-iommu.c | 159 
 include/linux/intel-iommu.h |   1 +
 2 files changed, 160 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index d2e613875b3a..ba5f2e5b3a03 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2485,6 +2485,7 @@ static struct dmar_domain 
*dmar_insert_one_dev_info(struct intel_iommu *iommu,
info->domain = domain;
info->iommu = iommu;
info->pasid_table = NULL;
+   info->auxd_enabled = 0;
 
if (dev && dev_is_pci(dev)) {
struct pci_dev *pdev = to_pci_dev(info->dev);
@@ -5223,6 +5224,42 @@ static phys_addr_t intel_iommu_iova_to_phys(struct 
iommu_domain *domain,
return phys;
 }
 
+static inline bool scalable_mode_support(void)
+{
+   struct dmar_drhd_unit *drhd;
+   struct intel_iommu *iommu;
+   bool ret = true;
+
+   rcu_read_lock();
+   for_each_active_iommu(iommu, drhd) {
+   if (!sm_supported(iommu)) {
+   ret = false;
+   break;
+   }
+   }
+   rcu_read_unlock();
+
+   return ret;
+}
+
+static inline bool iommu_pasid_support(void)
+{
+   struct dmar_drhd_unit *drhd;
+   struct intel_iommu *iommu;
+   bool ret = true;
+
+   rcu_read_lock();
+   for_each_active_iommu(iommu, drhd) {
+   if (!pasid_supported(iommu)) {
+   ret = false;
+   break;
+   }
+   }
+   rcu_read_unlock();
+
+   return ret;
+}
+
 static bool intel_iommu_capable(enum iommu_cap cap)
 {
if (cap == IOMMU_CAP_CACHE_COHERENCY)
@@ -5380,6 +5417,124 @@ struct intel_iommu *intel_svm_device_to_iommu(struct 
device *dev)
 }
 #endif /* CONFIG_INTEL_IOMMU_SVM */
 
+static int intel_iommu_enable_auxd(struct device *dev)
+{
+   struct device_domain_info *info;
+   struct intel_iommu *iommu;
+   unsigned long flags;
+   u8 bus, devfn;
+   int ret;
+
+   iommu = device_to_iommu(dev, &bus, &devfn);
+   if (!iommu || dmar_disabled)
+   return -EINVAL;
+
+   if (!sm_supported(iommu) || !pasid_supported(iommu))
+   return -EINVAL;
+
+   ret = intel_iommu_enable_pasid(iommu, dev);
+   if (ret)
+   return -ENODEV;
+
+   spin_lock_irqsave(&device_domain_lock, flags);
+   info = dev->archdata.iommu;
+   info->auxd_enabled = 1;
+   spin_unlock_irqrestore(&device_domain_lock, flags);
+
+   return 0;
+}
+
+static int intel_iommu_disable_auxd(struct device *dev)
+{
+   struct device_domain_info *info;
+   unsigned long flags;
+
+   spin_lock_irqsave(&device_domain_lock, flags);
+   info = dev->archdata.iommu;
+   if (!WARN_ON(!info))
+   info->auxd_enabled = 0;
+   spin_unlock_irqrestore(&device_domain_lock, flags);
+
+   return 0;
+}
+
+/*
+ * A PCI express designated vendor specific extended capability is defined
+ * in the section 3.7 of Intel scalable I/O virtualization technical spec
+ * for system software and tools to detect endpoint devices supporting the
+ * Intel scalable IO virtualization without host driver dependency.
+ *
+ * Returns the address of the matching extended capability structure within
+ * the device's PCI configuration space or 0 if the device does not support
+ * it.
+ */
+static int siov_find_pci_dvsec(struct pci_dev *pdev)
+{
+   int pos;
+   u16 vendor, id;
+
+   pos = pci_find_next_ext_capability(pdev, 0, 0x23);
+   while (pos) {
+   pci_read_config_word(pdev, pos + 4, &vendor);
+   pci_read_config_word(pdev, pos + 8, &id);
+   if (vendor == PCI_VENDOR_ID_INTEL && id == 5)
+   return pos;
+
+   pos = pci_find_next_ext_capability(pdev, pos, 0x23);
+   }
+
+   return 0;
+}
+
+static bool
+intel_iommu_dev_has_feat(struct device *dev, enum iommu_dev_features feat)
+{
+   if (feat == IOMMU_DEV_FEAT_AUX) {
+   int ret;
+
+   if (!dev_is_pci(dev) || dmar_disabled ||
+   !scalable_mode_support() || !iommu_pasid_support())
+   return false;
+
+   ret = pci_pasid_features(to_pci_dev(dev));
+   if (ret < 0)
+   return false;
+
+   return !!siov_find_pci_dvsec(to_pci_dev(dev));
+   }
+
+   return false;
+}
+
+static int
+intel_iommu_dev_enable_feat(struct device *dev, enum iommu_dev_features feat)
+{
+   if (feat == IOMMU_DEV_FEAT_AUX)
+   return intel_iommu_enable_auxd(dev);
+
+   return -ENODEV;
+}
+
+static int
+intel_iommu_dev_disable_feat(struct device *dev, enum iommu_dev_features feat)
+{
+   if (feat == 

[PATCH v8 2/9] iommu/vt-d: Make intel_iommu_enable_pasid() more generic

2019-03-24 Thread Lu Baolu
This moves intel_iommu_enable_pasid() out of the scope of
CONFIG_INTEL_IOMMU_SVM with more and more features requiring
pasid function.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel-iommu.c | 21 +++--
 drivers/iommu/intel-svm.c   | 19 ++-
 include/linux/intel-iommu.h |  2 +-
 3 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 28cb713d728c..d2e613875b3a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5307,8 +5307,7 @@ static void intel_iommu_put_resv_regions(struct device 
*dev,
}
 }
 
-#ifdef CONFIG_INTEL_IOMMU_SVM
-int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev 
*sdev)
+int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device *dev)
 {
struct device_domain_info *info;
struct context_entry *context;
@@ -5317,7 +5316,7 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, 
struct intel_svm_dev *sd
u64 ctx_lo;
int ret;
 
-   domain = get_valid_domain_for_dev(sdev->dev);
+   domain = get_valid_domain_for_dev(dev);
if (!domain)
return -EINVAL;
 
@@ -5325,7 +5324,7 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, 
struct intel_svm_dev *sd
spin_lock(&iommu->lock);
 
ret = -EINVAL;
-   info = sdev->dev->archdata.iommu;
+   info = dev->archdata.iommu;
if (!info || !info->pasid_supported)
goto out;
 
@@ -5335,14 +5334,13 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, 
struct intel_svm_dev *sd
 
ctx_lo = context[0].lo;
 
-   sdev->did = FLPT_DEFAULT_DID;
-   sdev->sid = PCI_DEVID(info->bus, info->devfn);
-
if (!(ctx_lo & CONTEXT_PASIDE)) {
ctx_lo |= CONTEXT_PASIDE;
context[0].lo = ctx_lo;
wmb();
-   iommu->flush.flush_context(iommu, sdev->did, sdev->sid,
+   iommu->flush.flush_context(iommu,
+  domain->iommu_did[iommu->seq_id],
+  PCI_DEVID(info->bus, info->devfn),
   DMA_CCMD_MASK_NOBIT,
   DMA_CCMD_DEVICE_INVL);
}
@@ -5351,12 +5349,6 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, 
struct intel_svm_dev *sd
if (!info->pasid_enabled)
iommu_enable_dev_iotlb(info);
 
-   if (info->ats_enabled) {
-   sdev->dev_iotlb = 1;
-   sdev->qdep = info->ats_qdep;
-   if (sdev->qdep >= QI_DEV_EIOTLB_MAX_INVS)
-   sdev->qdep = 0;
-   }
ret = 0;
 
  out:
@@ -5366,6 +5358,7 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, 
struct intel_svm_dev *sd
return ret;
 }
 
+#ifdef CONFIG_INTEL_IOMMU_SVM
 struct intel_iommu *intel_svm_device_to_iommu(struct device *dev)
 {
struct intel_iommu *iommu;
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 3a4b09ae8561..8f87304f915c 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -228,6 +228,7 @@ static LIST_HEAD(global_svm_list);
 int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct 
svm_dev_ops *ops)
 {
struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
+   struct device_domain_info *info;
struct intel_svm_dev *sdev;
struct intel_svm *svm = NULL;
struct mm_struct *mm = NULL;
@@ -291,13 +292,29 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int 
flags, struct svm_dev_
}
sdev->dev = dev;
 
-   ret = intel_iommu_enable_pasid(iommu, sdev);
+   ret = intel_iommu_enable_pasid(iommu, dev);
if (ret || !pasid) {
/* If they don't actually want to assign a PASID, this is
 * just an enabling check/preparation. */
kfree(sdev);
goto out;
}
+
+   info = dev->archdata.iommu;
+   if (!info || !info->pasid_supported) {
+   kfree(sdev);
+   goto out;
+   }
+
+   sdev->did = FLPT_DEFAULT_DID;
+   sdev->sid = PCI_DEVID(info->bus, info->devfn);
+   if (info->ats_enabled) {
+   sdev->dev_iotlb = 1;
+   sdev->qdep = info->ats_qdep;
+   if (sdev->qdep >= QI_DEV_EIOTLB_MAX_INVS)
+   sdev->qdep = 0;
+   }
+
/* Finish the setup now we know we're keeping it */
sdev->users = 1;
sdev->ops = ops;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index fa364de9db18..b7d1e2fbb9ca 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -650,6 +650,7 @@ struct intel_iommu *domain_get_iommu(struct dmar_domain 
*domain);
 int for_each_device_domain(int (*fn)(struct device_domain_info 

[PATCH] arm64: dts: Add m4u and smi-larbs nodes for mt8183

2019-03-24 Thread Yong Wu
Add nodes for M4U, smi-common, and smi-larbs.

Signed-off-by: Yong Wu 
---
This one is based on MTK IOMMU v6[1] and basical nodes of clocks and powers.

[1] https://patchwork.kernel.org/patch/10816827/
[2] https://patchwork.kernel.org/patch/10858941/
---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 85 
 1 file changed, 85 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi 
b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 75c4881..1138da2 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 / {
@@ -269,6 +270,15 @@
clock-names = "spi", "wrap";
};
 
+   iommu: iommu@10205000 {
+   compatible = "mediatek,mt8183-m4u";
+   reg = <0 0x10205000 0 0x1000>;
+   interrupts = ;
+   mediatek,larbs = <   
+   >;
+   #iommu-cells = <1>;
+   };
+
uart0: serial@11002000 {
compatible = "mediatek,mt8183-uart",
 "mediatek,mt6577-uart";
@@ -317,9 +327,25 @@
#clock-cells = <1>;
};
 
+   larb0: larb@14017000 {
+   compatible = "mediatek,mt8183-smi-larb";
+   reg = <0 0x14017000 0 0x1000>;
+   mediatek,smi = <&smi_common>;
+   clocks = <&mmsys CLK_MM_SMI_LARB0>,
+<&mmsys CLK_MM_SMI_LARB0>;
+   power-domains = <&spm MT8183_POWER_DOMAIN_DISP>;
+   };
+
smi_common: smi@14019000 {
compatible = "mediatek,mt8183-smi-common", "syscon";
reg = <0 0x14019000 0 0x1000>;
+   clocks = <&mmsys CLK_MM_SMI_COMMON>,
+<&mmsys CLK_MM_SMI_COMMON>,
+<&mmsys CLK_MM_GALS_COMM0>,
+<&mmsys CLK_MM_GALS_COMM1>;
+   clock-names = "apb", "smi", "gals0", "gals1";
+   power-domains = <&spm MT8183_POWER_DOMAIN_DISP>;
};
 
imgsys: syscon@15020000 {
@@ -328,18 +354,57 @@
#clock-cells = <1>;
};
 
+   larb5: larb@15021000 {
+   compatible = "mediatek,mt8183-smi-larb";
+   reg = <0 0x15021000 0 0x1000>;
+   mediatek,smi = <&smi_common>;
+   clocks = <&imgsys CLK_IMG_LARB5>,
+<&imgsys CLK_IMG_LARB5>,
+<&mmsys CLK_MM_GALS_IMG2MM>;
+   clock-names = "apb", "smi", "gals";
+   power-domains = <&spm MT8183_POWER_DOMAIN_ISP>;
+   };
+
+   larb2: larb@1502f000 {
+   compatible = "mediatek,mt8183-smi-larb";
+   reg = <0 0x1502f000 0 0x1000>;
+   mediatek,smi = <&smi_common>;
+   clocks = <&imgsys CLK_IMG_LARB2>,
+<&imgsys CLK_IMG_LARB2>,
+<&mmsys CLK_MM_GALS_IPU2MM>;
+   clock-names = "apb", "smi", "gals";
+   power-domains = <&spm MT8183_POWER_DOMAIN_ISP>;
+   };
+
vdecsys: syscon@16000000 {
compatible = "mediatek,mt8183-vdecsys", "syscon";
reg = <0 0x16000000 0 0x1000>;
#clock-cells = <1>;
};
 
+   larb1: larb@16010000 {
+   compatible = "mediatek,mt8183-smi-larb";
+   reg = <0 0x16010000 0 0x1000>;
+   mediatek,smi = <&smi_common>;
+   clocks = <&vdecsys CLK_VDEC_VDEC>,
+<&vdecsys CLK_VDEC_LARB1>;
+   clock-names = "apb", "smi";
+   power-domains = <&spm MT8183_POWER_DOMAIN_VDEC>;
+   };
+
vencsys: syscon@17000000 {
compatible = "mediatek,mt8183-vencsys", "syscon";
reg = <0 0x17000000 0 0x1000>;
#clock-cells = <1>;
};
 
+   larb4: larb@17010000 {
+   compatible = "mediatek,mt8183-smi-larb";
+   reg = <0 0x17010000 0 0x1000>;
+   mediatek,smi = <&smi_common>;
+   clocks = <&vencsys CLK_VENC_LARB>,
+<&vencsys CLK_VENC_LARB>;
+   clock-names = "apb", "smi";
+   power-domains = <&spm MT8183_POWER_DOMAIN_VENC>;
+   };
+
ipu_conn: syscon@19000000 {
compatible = "mediatek,mt8183-ipu_conn", "syscon";
reg = <0 0x19000000 0