[PATCH V5 3/3] iommu: Document usage of "/sys/kernel/iommu_groups//type" file

2020-07-24 Thread Sai Praneeth Prakhya
The default domain type of an iommu group can be changed by writing to
"/sys/kernel/iommu_groups//type" file. Hence, document it's usage
and more importantly spell out its limitations.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Reviewed-by: Lu Baolu 
Signed-off-by: Sai Praneeth Prakhya 
---
 .../ABI/testing/sysfs-kernel-iommu_groups | 30 +++
 1 file changed, 30 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups 
b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 017f5bc3920c..5b50f0ddd2a1 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -33,3 +33,33 @@ Description:In case an RMRR is used only by graphics or 
USB devices
it is now exposed as "direct-relaxable" instead of "direct".
In device assignment use case, for instance, those RMRR
are considered to be relaxable and safe.
+
+What:  /sys/kernel/iommu_groups//type
+Date:  August 2020
+KernelVersion: v5.9
+Contact:   Sai Praneeth Prakhya 
+Description:   Let the user know the type of default domain in use by iommu
+   for this group. A privileged user could request kernel to change
+   the group type by writing to this file. Presently, only three
+   types are supported
+   1. DMA: All the DMA transactions from the device in this group
+   are translated by the iommu.
+   2. identity: All the DMA transactions from the device in this
+group are *not* translated by the iommu.
+   3. auto: Change to the type the device was booted with. When the
+user reads the file he would never see "auto". This is
+just a write only value.
+   Note:
+   -
+   A group type could be modified only when
+   1. The group has *only* one device
+   2. The device in the group is not bound to any device driver.
+  So, the user must first unbind the appropriate driver and
+  then change the default domain type.
+   Caution:
+   
+   Unbinding a device driver will take away the driver's control
+   over the device and if done on devices that host root file
+   system could lead to catastrophic effects (the user might
+   need to reboot the machine to get it to normal state). So, it's
+   expected that the user understands what he is doing.
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V5 2/3] iommu: Take lock before reading iommu group default domain type

2020-07-24 Thread Sai Praneeth Prakhya
"/sys/kernel/iommu_groups//type" file could be read to find out the
default domain type of an iommu group. The default domain of an iommu group
doesn't change after booting and hence could be read directly. But,
after addding support to dynamically change iommu group default domain, the
above assumption no longer stays valid.

iommu group default domain type could be changed at any time by writing to
"/sys/kernel/iommu_groups//type". So, take group mutex before
reading iommu group default domain type so that the user wouldn't see stale
values or iommu_group_show_type() doesn't try to derefernce stale pointers.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Reviewed-by: Lu Baolu 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 939b37727722..b1607832ede9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -501,6 +501,7 @@ static ssize_t iommu_group_show_type(struct iommu_group 
*group,
 {
char *type = "unknown\n";
 
+   mutex_lock(>mutex);
if (group->default_domain) {
switch (group->default_domain->type) {
case IOMMU_DOMAIN_BLOCKED:
@@ -517,6 +518,7 @@ static ssize_t iommu_group_show_type(struct iommu_group 
*group,
break;
}
}
+   mutex_unlock(>mutex);
strcpy(buf, type);
 
return strlen(type);
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V5 1/3] iommu: Add support to change default domain of an iommu group

2020-07-24 Thread Sai Praneeth Prakhya
Presently, the default domain of an iommu group is allocated during boot
time and it cannot be changed later. So, the device would typically be
either in identity (also known as pass_through) mode or the device would be
in DMA mode as long as the machine is up and running. There is no way to
change the default domain type dynamically i.e. after booting, a device
cannot switch between identity mode and DMA mode.

But, assume a use case wherein the user trusts the device and believes that
the OS is secure enough and hence wants *only* this device to bypass IOMMU
(so that it could be high performing) whereas all the other devices to go
through IOMMU (so that the system is protected). Presently, this use case
is not supported. It will be helpful if there is some way to change the
default domain of an iommu group dynamically. Hence, add such support.

A privileged user could request the kernel to change the default domain
type of a iommu group by writing to
"/sys/kernel/iommu_groups//type" file. Presently, only three values
are supported
1. identity: all the DMA transactions from the device in this group are
 *not* translated by the iommu
2. DMA: all the DMA transactions from the device in this group are
translated by the iommu
3. auto: change to the type the device was booted with

Note:
1. Default domain of an iommu group with two or more devices cannot be
   changed.
2. The device in the iommu group shouldn't be bound to any driver.
3. The device shouldn't be assigned to user for direct access.
4. The vendor iommu driver is required to add def_domain_type() callback.
   The change request will fail if the request type conflicts with that
   returned from the callback.

Please see "Documentation/ABI/testing/sysfs-kernel-iommu_groups" for more
information.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Reviewed-by: Lu Baolu 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/iommu.c | 225 +-
 1 file changed, 224 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 31fd91717ec5..939b37727722 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -93,6 +93,8 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 static int iommu_create_device_direct_mappings(struct iommu_group *group,
   struct device *dev);
 static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
+static ssize_t iommu_group_store_type(struct iommu_group *group,
+ const char *buf, size_t count);
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)  \
 struct iommu_group_attribute iommu_group_attr_##_name =\
@@ -525,7 +527,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO, 
iommu_group_show_name, NULL);
 static IOMMU_GROUP_ATTR(reserved_regions, 0444,
iommu_group_show_resv_regions, NULL);
 
-static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
+static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
+   iommu_group_store_type);
 
 static void iommu_group_release(struct kobject *kobj)
 {
@@ -2849,3 +2852,223 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
return ops->sva_get_pasid(handle);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+/*
+ * Changes the default domain of an iommu group that has *only* one device
+ *
+ * @group: The group for which the default domain should be changed
+ * @prev_dev: The device in the group (this is used to make sure that the 
device
+ *  hasn't changed after the caller has called this function)
+ * @type: The type of the new default domain that gets associated with the 
group
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Note:
+ * 1. Presently, this function is called only when user requests to change the
+ *group's default domain type through 
/sys/kernel/iommu_groups//type
+ *Please take a closer look if intended to use for other purposes.
+ */
+static int iommu_change_dev_def_domain(struct iommu_group *group,
+  struct device *prev_dev, int type)
+{
+   struct iommu_domain *prev_dom;
+   struct group_device *grp_dev;
+   const struct iommu_ops *ops;
+   int ret, dev_def_dom;
+   struct device *dev;
+
+   if (!group)
+   return -EINVAL;
+
+   mutex_lock(>mutex);
+
+   if (group->default_domain != group->domain) {
+   pr_err_ratelimited("Group not assigned to default domain\n");
+   ret = -EBUSY;
+   goto out;
+   }
+
+   /*
+* iommu group wasn't locked while acquiring device lock in
+* iommu_group_store_type(). So, make sure that the device count hasn't
+* changed while acquiring device lo

[PATCH V5 0/3] iommu: Add support to change default domain of an iommu group

2020-07-24 Thread Sai Praneeth Prakhya
Presently, the default domain of an iommu group is allocated during boot time
and it cannot be changed later. So, the device would typically be either in
identity (pass_through) mode or the device would be in DMA mode as long as the
system is up and running. There is no way to change the default domain type
dynamically i.e. after booting, a device cannot switch between identity mode and
DMA mode.

Assume a use case wherein the privileged user would want to use the device in
pass-through mode when the device is used for host so that it would be high
performing. Presently, this is not supported. Hence add support to change the
default domain of an iommu group dynamically.

Support this by writing to a sysfs file, namely
"/sys/kernel/iommu_groups//type".

Testing:

Tested by dynamically changing storage device (nvme) from
1. identity mode to DMA and making sure file transfer works
2. DMA mode to identity mode and making sure file transfer works
Tested only for intel_iommu/vt-d. Would appreciate if someone could test on AMD
and ARM based machines.

Based on iommu maintainer's 'next' branch.

Changes from V4:

1. Created device direct mappings before attaching the device to the domain
2. Used list_first_entry() instead of list_for_each_entry() to get the first
   element of a linked list.
3. Used get_device() and put_device() before and after device_lock()
4. Passed device as an argument to iommu_change_dev_def_domain() to check that
   the device hasn't changed between calls.
5. Changed error message from "Group assigned to user level for direct access"
   to "Group not assigned to default domain".
6. Changed error message from "Cannot change default domain of a group with two
   or more devices" to "Cannot change default domain: Group has more than one
   device".
7. Removed printing error message "'def_domain_type' call back isn't registered"

Changes from V3:

1. Made changes to commit message as suggested by Baolu.
2. Don't pass "prev_dom" and "dev" as parameters to
   iommu_change_dev_def_domain(). Instead get them from group.
3. Sanitize the logic to validate user default domain type request. The logic
   remains same but is implmented differently.
4. Push lot of error checking into iommu_change_dev_def_domain() from
   iommu_group_store_type().
5. iommu_change_dev_def_domain() takes/releases group mutex as needed. So, it
   shouldn't be called holding a group mutex.
6. Use pr_err_ratelimited() instead of pr_err() to avoid DOS attack.

Changes from V2:

1. Change the logic of updating default domain from V2 because
   ops->probe_finalize() could be used to update dma_ops.
2. Drop 1st and 2nd patch of V2 series because they are no longer needed on
   iommu maintainer's 'next' branch.
3. Limit this feature to iommu groups with only one device.
4. Hold device_lock and group mutex until the default domain is changed.

Changes from V1:

1. V1 patch set wasn't updating dma_ops for some vendors (Eg: AMD), hence,
   change the logic of updating default domain as below (because adding a device
   to iommu_group automatically updates dma_ops)
   a. Allocate a new domain
   b. For every device in the group
i. Remove the device from the group
ii. Add the device back to the group
   c. Free previous domain
2. Drop 1st patch of V1 (iommu/vt-d: Modify device_def_domain_type() to use at
   runtime) because "iommu=pt" has no effect on this function anymore.
3. Added a patch to take/release lock while reading 
iommu_group->default_domain->type
   because it can be changed any time by user.
4. Before changing default domain type of a group, check if the group is
   directly assigned for user level access. If so, abort.
5. Sanitize return path (using ternary operator) in iommu_group_store_type()
6. Split 2nd patch of V1 (iommu: Add device_def_domain_type() call back function
   to iommu_ops) into two patches such that iommu generic changes are now in 1st
   patch of V2 and vt-d specific changes are in 2nd patch of V2.
7. Rename device_def_domain_type() to dev_def_domain_type()
8. Remove example from documentation
9. Change the value written to file "/sys/kernel/iommu_groups//type"
   from "dma" to "DMA".

Changes from RFC:
-
1. Added support for "auto" type, so that kernel selects one among identity or
   dma mode.
2. Use "system_state" in device_def_domain_type() instead of an argument.

Sai Praneeth Prakhya (3):
  iommu: Add support to change default domain of an iommu group
  iommu: Take lock before reading iommu group default domain type
  iommu: Document usage of "/sys/kernel/iommu_groups//type" file

 .../ABI/testing/sysfs-kernel-iommu_groups |  30 +++
 drivers/iommu/iommu.c | 227 +-
 2 files changed, 256 insert

[PATCH V4 0/3] iommu: Add support to change default domain of an iommu group

2020-06-04 Thread Sai Praneeth Prakhya
Presently, the default domain of an iommu group is allocated during boot time
and it cannot be changed later. So, the device would typically be either in
identity (pass_through) mode or the device would be in DMA mode as long as the
system is up and running. There is no way to change the default domain type
dynamically i.e. after booting, a device cannot switch between identity mode and
DMA mode.

Assume a use case wherein the privileged user would want to use the device in
pass-through mode when the device is used for host so that it would be high
performing. Presently, this is not supported. Hence add support to change the
default domain of an iommu group dynamically.

Support this by writing to a sysfs file, namely
"/sys/kernel/iommu_groups//type".

Testing:

Tested by dynamically changing storage device (nvme) from
1. identity mode to DMA and making sure file transfer works
2. DMA mode to identity mode and making sure file transfer works
Tested only for intel_iommu/vt-d. Would appreciate if someone could test on AMD
and ARM based machines.

Based on iommu maintainer's 'next' branch.

Changes from V3:

1. Made changes to commit message as suggested by Baolu.
2. Don't pass "prev_dom" and "dev" as parameters to
   iommu_change_dev_def_domain(). Instead get them from group.
3. Sanitize the logic to validate user default domain type request. The logic
   remains same but is implmented differently.
4. Push lot of error checking into iommu_change_dev_def_domain() from
   iommu_group_store_type().
5. iommu_change_dev_def_domain() takes/releases group mutex as needed. So, it
   shouldn't be called holding a group mutex.
6. Use pr_err_ratelimited() instead of pr_err() to avoid DOS attack.

Changes from V2:

1. Change the logic of updating default domain from V2 because
   ops->probe_finalize() could be used to update dma_ops.
2. Drop 1st and 2nd patch of V2 series because they are no longer needed on
   iommu maintainer's 'next' branch.
3. Limit this feature to iommu groups with only one device.
4. Hold device_lock and group mutex until the default domain is changed.

Changes from V1:

1. V1 patch set wasn't updating dma_ops for some vendors (Eg: AMD), hence,
   change the logic of updating default domain as below (because adding a device
   to iommu_group automatically updates dma_ops)
   a. Allocate a new domain
   b. For every device in the group
i. Remove the device from the group
ii. Add the device back to the group
   c. Free previous domain
2. Drop 1st patch of V1 (iommu/vt-d: Modify device_def_domain_type() to use at
   runtime) because "iommu=pt" has no effect on this function anymore.
3. Added a patch to take/release lock while reading 
iommu_group->default_domain->type
   because it can be changed any time by user.
4. Before changing default domain type of a group, check if the group is
   directly assigned for user level access. If so, abort.
5. Sanitize return path (using ternary operator) in iommu_group_store_type()
6. Split 2nd patch of V1 (iommu: Add device_def_domain_type() call back function
   to iommu_ops) into two patches such that iommu generic changes are now in 1st
   patch of V2 and vt-d specific changes are in 2nd patch of V2.
7. Rename device_def_domain_type() to dev_def_domain_type()
8. Remove example from documentation
9. Change the value written to file "/sys/kernel/iommu_groups//type"
   from "dma" to "DMA".

Changes from RFC:
-
1. Added support for "auto" type, so that kernel selects one among identity or
   dma mode.
2. Use "system_state" in device_def_domain_type() instead of an argument.

Sai Praneeth Prakhya (3):
  iommu: Add support to change default domain of an iommu_group
  iommu: Take lock before reading iommu_group default domain type
  iommu: Document usage of "/sys/kernel/iommu_groups//type" file

 .../ABI/testing/sysfs-kernel-iommu_groups |  30 +++
 drivers/iommu/iommu.c | 217 +-
 2 files changed, 246 insertions(+), 1 deletion(-)

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 

-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V4 3/3] iommu: Document usage of "/sys/kernel/iommu_groups//type" file

2020-06-04 Thread Sai Praneeth Prakhya
The default domain type of an iommu group can be changed by writing to
"/sys/kernel/iommu_groups//type" file. Hence, document it's usage
and more importantly spell out its limitations.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 .../ABI/testing/sysfs-kernel-iommu_groups | 30 +++
 1 file changed, 30 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups 
b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 017f5bc3920c..a498daffeb0c 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -33,3 +33,33 @@ Description:In case an RMRR is used only by graphics or 
USB devices
it is now exposed as "direct-relaxable" instead of "direct".
In device assignment use case, for instance, those RMRR
are considered to be relaxable and safe.
+
+What:  /sys/kernel/iommu_groups//type
+Date:  June 2020
+KernelVersion: v5.8
+Contact:   Sai Praneeth Prakhya 
+Description:   Let the user know the type of default domain in use by iommu
+   for this group. A privileged user could request kernel to change
+   the group type by writing to this file. Presently, only three
+   types are supported
+   1. DMA: All the DMA transactions from the device in this group
+   are translated by the iommu.
+   2. identity: All the DMA transactions from the device in this
+group are *not* translated by the iommu.
+   3. auto: Change to the type the device was booted with. When the
+user reads the file he would never see "auto". This is
+just a write only value.
+   Note:
+   -
+   A group type could be modified only when
+   1. The group has *only* one device
+   2. The device in the group is not bound to any device driver.
+  So, the user must first unbind the appropriate driver and
+  then change the default domain type.
+   Caution:
+   
+   Unbinding a device driver will take away the driver's control
+   over the device and if done on devices that host root file
+   system could lead to catastrophic effects (the user might
+   need to reboot the machine to get it to normal state). So, it's
+   expected that the user understands what he is doing.
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V4 1/3] iommu: Add support to change default domain of an iommu group

2020-06-04 Thread Sai Praneeth Prakhya
Presently, the default domain of an iommu group is allocated during boot
time and it cannot be changed later. So, the device would typically be
either in identity (also known as pass_through) mode or the device would be
in DMA mode as long as the machine is up and running. There is no way to
change the default domain type dynamically i.e. after booting, a device
cannot switch between identity mode and DMA mode.

But, assume a use case wherein the user trusts the device and believes that
the OS is secure enough and hence wants *only* this device to bypass IOMMU
(so that it could be high performing) whereas all the other devices to go
through IOMMU (so that the system is protected). Presently, this use case
is not supported. It will be helpful if there is some way to change the
default domain of an iommu group dynamically. Hence, add such support.

A privileged user could request the kernel to change the default domain
type of a iommu group by writing to
"/sys/kernel/iommu_groups//type" file. Presently, only three values
are supported
1. identity: all the DMA transactions from the device in this group are
 *not* translated by the iommu
2. DMA: all the DMA transactions from the device in this group are
translated by the iommu
3. auto: change to the type the device was booted with

Note:
1. Default domain of an iommu group with two or more devices cannot be
   changed.
2. The device in the iommu group shouldn't be bound to any driver.
3. The device shouldn't be assigned to user for direct access.
4. The vendor iommu driver is required to add def_domain_type() callback.
   The change request will fail if the request type conflicts with that
   returned from the callback.

Please see "Documentation/ABI/testing/sysfs-kernel-iommu_groups" for more
information.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/iommu.c | 215 +-
 1 file changed, 214 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d43120eb1dc5..b023f06f12d6 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -93,6 +93,8 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 static int iommu_create_device_direct_mappings(struct iommu_group *group,
   struct device *dev);
 static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
+static ssize_t iommu_group_store_type(struct iommu_group *group,
+ const char *buf, size_t count);
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)  \
 struct iommu_group_attribute iommu_group_attr_##_name =\
@@ -525,7 +527,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO, 
iommu_group_show_name, NULL);
 static IOMMU_GROUP_ATTR(reserved_regions, 0444,
iommu_group_show_resv_regions, NULL);
 
-static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
+static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
+   iommu_group_store_type);
 
 static void iommu_group_release(struct kobject *kobj)
 {
@@ -2838,3 +2841,213 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
return ops->sva_get_pasid(handle);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+/*
+ * Changes the default domain of an iommu group
+ *
+ * @group: The group for which the default domain should be changed
+ * @type: The type of the new default domain that gets associated with the 
group
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Note:
+ * 1. Presently, this function is called only when user requests to change the
+ *group's default domain type through 
/sys/kernel/iommu_groups//type
+ *Please take a closer look if intended to use for other purposes.
+ */
+static int iommu_change_dev_def_domain(struct iommu_group *group, int type)
+{
+   struct iommu_domain *prev_dom;
+   struct group_device *grp_dev;
+   const struct iommu_ops *ops;
+   int ret, dev_def_dom;
+   struct device *dev;
+
+   if (!group)
+   return -EINVAL;
+
+   mutex_lock(>mutex);
+
+   if (group->default_domain != group->domain) {
+   pr_err_ratelimited("Group assigned to user level for direct 
access\n");
+   ret = -EBUSY;
+   goto out;
+   }
+
+   /*
+* iommu group wasn't locked while acquiring device lock in
+* iommu_group_store_type(). So, make sure that the device count hasn't
+* changed while acquiring device lock.
+*
+* Changing default domain of an iommu group with two or more devices
+* isn't supported because there could be a potential deadlock. Consider
+* the following scenario. T1 is trying to acquire device locks of all
+   

[PATCH V4 2/3] iommu: Take lock before reading iommu group default domain type

2020-06-04 Thread Sai Praneeth Prakhya
"/sys/kernel/iommu_groups//type" file could be read to find out the
default domain type of an iommu group. The default domain of an iommu group
doesn't change after booting and hence could be read directly. But,
after addding support to dynamically change iommu group default domain, the
above assumption no longer stays valid.

iommu group default domain type could be changed at any time by writing to
"/sys/kernel/iommu_groups//type". So, take group mutex before
reading iommu group default domain type so that the user wouldn't see stale
values or iommu_group_show_type() doesn't try to derefernce stale pointers.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b023f06f12d6..eb133ee5c13a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -501,6 +501,7 @@ static ssize_t iommu_group_show_type(struct iommu_group 
*group,
 {
char *type = "unknown\n";
 
+   mutex_lock(>mutex);
if (group->default_domain) {
switch (group->default_domain->type) {
case IOMMU_DOMAIN_BLOCKED:
@@ -517,6 +518,7 @@ static ssize_t iommu_group_show_type(struct iommu_group 
*group,
break;
}
}
+   mutex_unlock(>mutex);
strcpy(buf, type);
 
return strlen(type);
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V3 3/3] iommu: Document usage of "/sys/kernel/iommu_groups//type" file

2020-05-28 Thread Sai Praneeth Prakhya
The default domain type of an iommu group can be changed by writing to
"/sys/kernel/iommu_groups//type" file. Hence, document it's usage
and more importantly spell out its limitations.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 .../ABI/testing/sysfs-kernel-iommu_groups | 30 +++
 1 file changed, 30 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups 
b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 017f5bc3920c..d238fca4408f 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -33,3 +33,33 @@ Description:In case an RMRR is used only by graphics or 
USB devices
it is now exposed as "direct-relaxable" instead of "direct".
In device assignment use case, for instance, those RMRR
are considered to be relaxable and safe.
+
+What:  /sys/kernel/iommu_groups//type
+Date:  June 2020
+KernelVersion: v5.8
+Contact:   Sai Praneeth Prakhya 
+Description:   Lets the user know the type of default domain in use by iommu
+   for this group. A privileged user could request kernel to change
+   the group type by writing to this file. Presently, only three
+   types are supported
+   1. DMA: All the DMA transactions from the device in this group
+   are translated by the iommu.
+   2. identity: All the DMA transactions from the device in this
+group are *not* translated by the iommu.
+   3. auto: Change to the type the device was booted with. When the
+user reads the file he would never see "auto". This is
+just a write only value.
+   Note:
+   -
+   A group type could be modified only when
+   1. The group has *only* one device
+   2. The device in the group is not bound to any device driver.
+  So, the user must first unbind the appropriate driver and
+  then change the default domain type.
+   Caution:
+   
+   Unbinding a device driver will take away the driver's control
+   over the device and if done on devices that host root file
+   system could lead to catastrophic effects (the user might
+   need to reboot the machine to get it to normal state). So, it's
+   expected that the user understands what he is doing.
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V3 0/3] iommu: Add support to change default domain of an iommu

2020-05-28 Thread Sai Praneeth Prakhya
Presently, the default domain of a iommu group is allocated during boot time and
it cannot be changed later. So, the device would typically be either in identity
(pass_through) mode or the device would be in DMA mode as long as the system is
up and running. There is no way to change the default domain type dynamically
i.e. after booting, a device cannot switch between identity mode and DMA mode.

Assume a use case wherein the privileged user would want to use the device in
pass-through mode when the device is used for host so that it would be high
performing. Presently, this is not supported. Hence add support to change the
default domain of an iommu group dynamically.

Support this by writing to a sysfs file, namely
"/sys/kernel/iommu_groups//type".

Testing:

Tested by dynamically changing storage device (nvme) from
1. identity mode to DMA and making sure file transfer works
2. DMA mode to identity mode and making sure file transfer works
Tested only for intel_iommu/vt-d. Would appreciate if someone could test on AMD
and ARM based machines.

Based on iommu maintainer's 'next' branch.

Changes from V2:

1. Change the logic of updating default domain from V2 because
   ops->probe_finalize() could be used to update dma_ops.
2. Drop 1st and 2nd patch of V2 series because they are no longer needed on
   iommu maintainer's 'next' branch.
3. Limit this feature to iommu groups with only one device.
4. Hold device_lock and group mutex until the default domain is changed.

Changes from V1:

1. V1 patch set wasn't updating dma_ops for some vendors (Eg: AMD), hence,
   change the logic of updating default domain as below (because adding a device
   to iommu_group automatically updates dma_ops)
   a. Allocate a new domain
   b. For every device in the group
i. Remove the device from the group
ii. Add the device back to the group
   c. Free previous domain
2. Drop 1st patch of V1 (iommu/vt-d: Modify device_def_domain_type() to use at
   runtime) because "iommu=pt" has no effect on this function anymore.
3. Added a patch to take/release lock while reading 
iommu_group->default_domain->type
   because it can be changed any time by user.
4. Before changing default domain type of a group, check if the group is
   directly assigned for user level access. If so, abort.
5. Sanitize return path (using ternary operator) in iommu_group_store_type()
6. Split 2nd patch of V1 (iommu: Add device_def_domain_type() call back function
   to iommu_ops) into two patches such that iommu generic changes are now in 1st
   patch of V2 and vt-d specific changes are in 2nd patch of V2.
7. Rename device_def_domain_type() to dev_def_domain_type()
8. Remove example from documentation
9. Change the value written to file "/sys/kernel/iommu_groups//type"
   from "dma" to "DMA".

Changes from RFC:
-
1. Added support for "auto" type, so that kernel selects one among identity or
   dma mode.
2. Use "system_state" in device_def_domain_type() instead of an argument.

Sai Praneeth Prakhya (3):
  iommu: Add support to change default domain of an iommu_group
  iommu: Take lock before reading iommu_group default domain type
  iommu: Document usage of "/sys/kernel/iommu_groups//type" file

 .../ABI/testing/sysfs-kernel-iommu_groups |  30 +++
 drivers/iommu/iommu.c | 213 +-
 2 files changed, 242 insertions(+), 1 deletion(-)

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 

-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V3 1/3] iommu: Add support to change default domain of an iommu_group

2020-05-28 Thread Sai Praneeth Prakhya
Presently, the default domain of an iommu_group is allocated during boot
time (i.e. when a device is being added to a group) and it cannot be
changed later. So, the device would typically be either in identity (also
known as pass_through) mode (controlled by "iommu=pt" kernel command line
argument) or the device would be in DMA mode as long as the machine is up
and running. There is no way to change the default domain type dynamically
i.e. after booting, a device cannot switch between identity mode and DMA
mode.

But, assume a use case wherein the user trusts the device and believes that
the OS is secure enough and hence wants *only* this device to bypass IOMMU
(so that it could be high performing) whereas all the other devices to go
through IOMMU (so that the system is protected). Presently, this use case
is not supported. It will be helpful if there is some way to change the
default domain of a B:D.F dynamically. Hence, add such support.

A privileged user could request the kernel to change the default domain
type of a iommu_group by writing to
"/sys/kernel/iommu_groups//type" file. Presently, only three values
are supported
1. identity: all the DMA transactions from the device in this group are
 *not* translated by the iommu
2. DMA: all the DMA transactions from the device in this group are
translated by the iommu
3. auto: change to the type the device was booted with

Note:
1. Default domain of an iommu group with two or more devices cannot be
   changed.
2. The device in the iommu group shouldn't be bound to any driver.
3. The device shouldn't be assigned to user for direct access.

Please see "Documentation/ABI/testing/sysfs-kernel-iommu_groups" for more
information.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/iommu.c | 211 +-
 1 file changed, 210 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index a4c2f122eb8b..2b6cca799055 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -92,6 +92,8 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 static int iommu_create_device_direct_mappings(struct iommu_group *group,
   struct device *dev);
 static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
+static ssize_t iommu_group_store_type(struct iommu_group *group,
+ const char *buf, size_t count);
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)  \
 struct iommu_group_attribute iommu_group_attr_##_name =\
@@ -524,7 +526,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO, 
iommu_group_show_name, NULL);
 static IOMMU_GROUP_ATTR(reserved_regions, 0444,
iommu_group_show_resv_regions, NULL);
 
-static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
+static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
+   iommu_group_store_type);
 
 static void iommu_group_release(struct kobject *kobj)
 {
@@ -2847,3 +2850,209 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
return ops->sva_get_pasid(handle);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+/*
+ * Changes the default domain of a device
+ *
+ * @dev: The *only* device in the group
+ * @group: The group for which the default domain should be changed
+ * @prev_domain: The previous domain that is being switched from
+ * @type: The type of the new default domain that gets associated with the 
group
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Note:
+ * 1. Presently, this function is called only when user requests to change the
+ *group's default domain type through 
/sys/kernel/iommu_groups//type
+ *Please take a closer look if intended to use for other purposes.
+ * 2. Assumes that group->mutex is already taken and releases before returning
+ */
+static int iommu_change_dev_def_domain(struct device *dev,
+  struct iommu_group *group,
+  struct iommu_domain *prev_dom, int type)
+{
+   int ret = 0;
+
+   /* Sets group->default_domain to the newly allocated domain */
+   ret = iommu_group_alloc_default_domain(dev->bus, group, type);
+   if (ret)
+   goto out;
+
+   ret = __iommu_attach_device(group->default_domain, dev);
+   if (ret)
+   goto free_new_domain;
+
+   group->domain = group->default_domain;
+
+   ret = iommu_create_device_direct_mappings(group, dev);
+   if (ret)
+   goto free_new_domain;
+
+   /*
+* Release the mutex here because ops->probe_finalize() call-back of
+* some vendor IOMMU drivers calls arm_iommu_attach_device() which
+* in-turn might 

[PATCH V3 2/3] iommu: Take lock before reading iommu_group default domain type

2020-05-28 Thread Sai Praneeth Prakhya
"/sys/kernel/iommu_groups//type" file could be read to find out the
default domain type of an iommu group. The default domain of an iommu group
doesn't change after booting and hence could be read directly. But,
after addding support to dynamically change iommu group default domain, the
above assumption no longer stays valid.

iommu group default domain type could be changed at any time by writing to
"/sys/kernel/iommu_groups//type". So, take group mutex before
reading iommu group default domain type so that the user wouldn't see stale
values or iommu_group_show_type() doesn't try to derefernce stale pointers.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 2b6cca799055..fae8a4e1c7ab 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -500,6 +500,7 @@ static ssize_t iommu_group_show_type(struct iommu_group 
*group,
 {
char *type = "unknown\n";
 
+   mutex_lock(>mutex);
if (group->default_domain) {
switch (group->default_domain->type) {
case IOMMU_DOMAIN_BLOCKED:
@@ -516,6 +517,7 @@ static ssize_t iommu_group_show_type(struct iommu_group 
*group,
break;
}
}
+   mutex_unlock(>mutex);
strcpy(buf, type);
 
return strlen(type);
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu: Remove functions that support private domain

2020-05-13 Thread Sai Praneeth Prakhya
After moving iommu_group setup to iommu core code [1][2] and removing
private domain support in vt-d [3], there are no users for functions such
as iommu_request_dm_for_dev(), iommu_request_dma_domain_for_dev() and
request_default_domain_for_dev(). So, remove these functions.

[1] commit dce8d6964ebd ("iommu/amd: Convert to probe/release_device()
call-backs")
[2] commit e5d1841f18b2 ("iommu/vt-d: Convert to probe/release_device()
call-backs")
[3] commit 327d5b2fee91 ("iommu/vt-d: Allow 32bit devices to uses DMA
domain")

Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/iommu.c | 65 ---
 include/linux/iommu.h | 12 
 2 files changed, 77 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4050569188be..374b34fd6fac 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2536,71 +2536,6 @@ struct iommu_resv_region 
*iommu_alloc_resv_region(phys_addr_t start,
 }
 EXPORT_SYMBOL_GPL(iommu_alloc_resv_region);
 
-static int
-request_default_domain_for_dev(struct device *dev, unsigned long type)
-{
-   struct iommu_domain *domain;
-   struct iommu_group *group;
-   int ret;
-
-   /* Device must already be in a group before calling this function */
-   group = iommu_group_get(dev);
-   if (!group)
-   return -EINVAL;
-
-   mutex_lock(>mutex);
-
-   ret = 0;
-   if (group->default_domain && group->default_domain->type == type)
-   goto out;
-
-   /* Don't change mappings of existing devices */
-   ret = -EBUSY;
-   if (iommu_group_device_count(group) != 1)
-   goto out;
-
-   ret = -ENOMEM;
-   domain = __iommu_domain_alloc(dev->bus, type);
-   if (!domain)
-   goto out;
-
-   /* Attach the device to the domain */
-   ret = __iommu_attach_group(domain, group);
-   if (ret) {
-   iommu_domain_free(domain);
-   goto out;
-   }
-
-   /* Make the domain the default for this group */
-   if (group->default_domain)
-   iommu_domain_free(group->default_domain);
-   group->default_domain = domain;
-
-   iommu_create_device_direct_mappings(group, dev);
-
-   dev_info(dev, "Using iommu %s mapping\n",
-type == IOMMU_DOMAIN_DMA ? "dma" : "direct");
-
-   ret = 0;
-out:
-   mutex_unlock(>mutex);
-   iommu_group_put(group);
-
-   return ret;
-}
-
-/* Request that a device is direct mapped by the IOMMU */
-int iommu_request_dm_for_dev(struct device *dev)
-{
-   return request_default_domain_for_dev(dev, IOMMU_DOMAIN_IDENTITY);
-}
-
-/* Request that a device can't be direct mapped by the IOMMU */
-int iommu_request_dma_domain_for_dev(struct device *dev)
-{
-   return request_default_domain_for_dev(dev, IOMMU_DOMAIN_DMA);
-}
-
 void iommu_set_default_passthrough(bool cmd_line)
 {
if (cmd_line)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 7cfd2dddb49d..78a26ae5c2b6 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -482,8 +482,6 @@ extern void iommu_get_resv_regions(struct device *dev, 
struct list_head *list);
 extern void iommu_put_resv_regions(struct device *dev, struct list_head *list);
 extern void generic_iommu_put_resv_regions(struct device *dev,
   struct list_head *list);
-extern int iommu_request_dm_for_dev(struct device *dev);
-extern int iommu_request_dma_domain_for_dev(struct device *dev);
 extern void iommu_set_default_passthrough(bool cmd_line);
 extern void iommu_set_default_translated(bool cmd_line);
 extern bool iommu_default_passthrough(void);
@@ -804,16 +802,6 @@ static inline int iommu_get_group_resv_regions(struct 
iommu_group *group,
return -ENODEV;
 }
 
-static inline int iommu_request_dm_for_dev(struct device *dev)
-{
-   return -ENODEV;
-}
-
-static inline int iommu_request_dma_domain_for_dev(struct device *dev)
-{
-   return -ENODEV;
-}
-
 static inline void iommu_set_default_passthrough(bool cmd_line)
 {
 }
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V2 3/5] iommu: Add support to change default domain of an iommu_group

2020-02-16 Thread Sai Praneeth Prakhya
Presently, the default domain of an iommu_group is allocated during boot
time (i.e. when a device is being added to a group) and it cannot be
changed later. So, the device would typically be either in identity (also
known as pass_through) mode (controlled by "iommu=pt" kernel command line
argument) or the device would be in DMA mode as long as the machine is up
and running. There is no way to change the default domain type dynamically
i.e. after booting, a device cannot switch between identity mode and DMA
mode.

But, assume a use case wherein the user trusts the device and also believes
that the OS is secure enough and hence wants *only* this device to bypass
IOMMU (so that it could be high performing) whereas all the other devices
to go through IOMMU (so that the system is protected). Presently, this use
case is not supported. Hence it will be helpful if there is some way to
change the default domain of a B:D.F dynamically. Since, linux iommu
subsystem prefers to deal at iommu_group level instead of B:D.F level, it
might be helpful if there is some way to change the default domain of a
*iommu_group* dynamically. Hence, add such support.

A privileged user could request the kernel to change the default domain
type of a iommu_group by writing to
"/sys/kernel/iommu_groups//type"
file. Presently, only three values are supported
1. identity: all the DMA transactions from the devices in this group are
 *not* translated by the iommu
2. DMA: all the DMA transactions from the devices in this group are
translated by the iommu
3. auto: kernel decides one among DMA/identity

Also please note that a group type could be modified only when *all* the
devices in the group are not binded to any device driver. Please see
"Documentation/ABI/testing/sysfs-kernel-iommu_groups" for more information.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/iommu.c | 227 +-
 1 file changed, 226 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3e3528436e0b..56a29076871f 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -225,6 +225,8 @@ static int __iommu_attach_group(struct iommu_domain *domain,
struct iommu_group *group);
 static void __iommu_detach_group(struct iommu_domain *domain,
 struct iommu_group *group);
+static ssize_t iommu_group_store_type(struct iommu_group *group,
+ const char *buf, size_t count);
 
 static int __init iommu_set_def_domain_type(char *str)
 {
@@ -448,7 +450,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO, 
iommu_group_show_name, NULL);
 static IOMMU_GROUP_ATTR(reserved_regions, 0444,
iommu_group_show_resv_regions, NULL);
 
-static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
+static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
+   iommu_group_store_type);
 
 static void iommu_group_release(struct kobject *kobj)
 {
@@ -2654,3 +2657,225 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
return ops->sva_get_pasid(handle);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+/*
+ * Changes the default domain of a group
+ *
+ * @group: The group for which the default domain should be changed
+ * @prv_dom: The previous domain that is being switched from
+ * @type: The type of the new default domain that gets associated with the 
group
+ *
+ * Returns 0 on success and error code on failure
+ */
+static int iommu_group_change_def_domain(struct iommu_group *group,
+struct iommu_domain *prv_dom,
+int type)
+{
+   struct group_device *grp_dev, *temp;
+   struct iommu_domain *new_domain;
+   int ret;
+
+   /*
+* iommu_domain_alloc() takes "struct bus_type" as an argument which is
+* a member in "struct device". Changing a group's default domain type
+* deals at iommu_group level rather than device level and hence there
+* is no straight forward way to get "bus_type" of an iommu_group that
+* could be passed to iommu_domain_alloc(). So, instead of directly
+* calling iommu_domain_alloc(), use iommu_ops from previous default
+* domain.
+*/
+   if (!prv_dom || !prv_dom->ops || !prv_dom->ops->domain_alloc || !type)
+   return -EINVAL;
+
+   /* Allocate a new domain of requested type */
+   new_domain = prv_dom->ops->domain_alloc(type);
+   if (!new_domain) {
+   pr_err("Unable to allocate memory for the new domain\n");
+   return -ENOMEM;
+   }
+
+   new_domain->type = type;
+  

[PATCH V2 4/5] iommu: Take lock before reading iommu_group default domain type

2020-02-16 Thread Sai Praneeth Prakhya
Since user could change default domain type of an iommu group through
sysfs, reading the default domain type now requires taking the lock first.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 56a29076871f..10e14cb9c32b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -424,6 +424,7 @@ static ssize_t iommu_group_show_type(struct iommu_group 
*group,
 {
char *type = "unknown\n";
 
+   mutex_lock(>mutex);
if (group->default_domain) {
switch (group->default_domain->type) {
case IOMMU_DOMAIN_BLOCKED:
@@ -440,6 +441,7 @@ static ssize_t iommu_group_show_type(struct iommu_group 
*group,
break;
}
}
+   mutex_unlock(>mutex);
strcpy(buf, type);
 
return strlen(type);
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V2 1/5] iommu: Add dev_def_domain_type() call back function to iommu_ops

2020-02-16 Thread Sai Praneeth Prakhya
When user requests kernel to change the default domain type of a group
through sysfs, kernel has to make sure that it's ok to change the domain
type of every device in the group to the requested domain (every device may
not support both the domain types i.e. DMA and identity). Hence, add a call
back function that could be implemented per architecture that performs the
above check.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 include/linux/iommu.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d1b5f4d98569..3f4aaad0aeb7 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -248,6 +248,7 @@ struct iommu_iotlb_gather {
  * @cache_invalidate: invalidate translation caches
  * @sva_bind_gpasid: bind guest pasid and mm
  * @sva_unbind_gpasid: unbind guest pasid and mm
+ * @dev_def_domain_type: Return the required default domain type for a device
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  * @owner: Driver module providing these ops
  */
@@ -318,6 +319,8 @@ struct iommu_ops {
 
int (*sva_unbind_gpasid)(struct device *dev, int pasid);
 
+   int (*dev_def_domain_type)(struct device *dev);
+
unsigned long pgsize_bitmap;
struct module *owner;
 };
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V2 0/5] iommu: Add support to change default domain of a group

2020-02-16 Thread Sai Praneeth Prakhya
Presently, the default domain of a group is allocated during boot time and it
cannot be changed later. So, the device would typically be either in identity
(pass_through) mode or the device would be in DMA mode as long as the
system is up and running. There is no way to change the default domain type
dynamically i.e. after booting, a device cannot switch between identity mode and
DMA mode.

Assume a use case wherein the privileged user would want to use the device in
pass-through mode when the device is used for host so that it would be high
performing. Presently, this is not supported. Hence add support to change the
default domain of a group dynamically.

Support this through a sysfs file, namely 
"/sys/kernel/iommu_groups//type".

Hi Joerg,
Sorry! for _huge_ delay in posting a V2 of this patch set. Was stuck with some
internal PoC work. Will be consistent from now.

Changes from V1:

1. V1 patch set wasn't updating dma_ops for some vendors (Eg: AMD), hence,
   change the logic of updating default domain as below (because adding a device
   to iommu_group automatically updates dma_ops)
   a. Allocate a new domain
   b. For every device in the group
i. Remove the device from the group
ii. Add the device back to the group
   c. Free previous domain
2. Drop 1st patch of V1 (iommu/vt-d: Modify device_def_domain_type() to use at
   runtime) because "iommu=pt" has no effect on this function anymore.
3. Added a patch to take/release lock while reading 
iommu_group->default_domain->type
   because it can be changed any time by user.
4. Before changing default domain type of a group, check if the group is
   directly assigned for user level access. If so, abort.
5. Sanitize return path (using ternary operator) in iommu_group_store_type()
6. Split 2nd patch of V1 (iommu: Add device_def_domain_type() call back function
   to iommu_ops) into two patches such that iommu generic changes are now in 1st
   patch of V2 and vt-d specific changes are in 2nd patch of V2.
7. Rename device_def_domain_type() to dev_def_domain_type()
8. Remove example from documentation
9. Change the value written to file "/sys/kernel/iommu_groups//type"
   from "dma" to "DMA".

Changes from RFC:
-
1. Added support for "auto" type, so that kernel selects one among identity or
   dma mode.
2. Use "system_state" in device_def_domain_type() instead of an argument.

Testing:

Tested by dynamically changing storage device (nvme) from
1. identity mode to dma and making sure file transfer works
2. dma mode to identity mode and making sure file transfer works
Tested only for intel_iommu/vt-d.

Based on iommu maintainer's 'next' branch.

Sai Praneeth Prakhya (5):
  iommu: Add dev_def_domain_type() call back function to iommu_ops
  iommu/vt-d: Rename device_def_domain_type() to
intel_iommu_dev_def_domain_type()
  iommu: Add support to change default domain of an iommu_group
  iommu: Take lock before reading iommu_group default domain type
  iommu: Document usage of "/sys/kernel/iommu_groups//type" file

 .../ABI/testing/sysfs-kernel-iommu_groups  |  29 +++
 drivers/iommu/intel-iommu.c|   8 +-
 drivers/iommu/iommu.c  | 229 -
 include/linux/iommu.h  |   3 +
 4 files changed, 265 insertions(+), 4 deletions(-)

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 

-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V2 5/5] iommu: Document usage of "/sys/kernel/iommu_groups//type" file

2020-02-16 Thread Sai Praneeth Prakhya
The default domain type of an iommu group can be changed using file
"/sys/kernel/iommu_groups//type". Hence, document it's usage and
more importantly spell out it's limitations.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 .../ABI/testing/sysfs-kernel-iommu_groups  | 29 ++
 1 file changed, 29 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups 
b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 017f5bc3920c..808a9507b281 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -33,3 +33,32 @@ Description:In case an RMRR is used only by graphics or 
USB devices
it is now exposed as "direct-relaxable" instead of "direct".
In device assignment use case, for instance, those RMRR
are considered to be relaxable and safe.
+
+What:  /sys/kernel/iommu_groups//type
+Date:  February 2020
+KernelVersion: v5.6
+Contact:   Sai Praneeth Prakhya 
+Description:   Lets the user know the type of default domain in use by iommu
+   for this group. A privileged user could request kernel to change
+   the group type by writing to this file. Presently, only three
+   types are supported
+   1. DMA: All the DMA transactions from the devices in this group
+   are translated by the iommu.
+   2. identity: All the DMA transactions from the devices in this
+group are *not* translated by the iommu.
+   3. auto: Kernel decides one among DMA/identity mode and hence
+when the user reads the file he would never see "auto".
+This is just a write only value.
+   Note:
+   -
+   A group type could be modified only when *all* the devices in
+   the group are not binded to any device driver. So, the user has
+   to first unbind the appropriate drivers and then change the
+   default domain type.
+   Caution:
+   
+   Unbinding a device driver will take away the drivers control
+   over the device and if done on devices that host root file
+   system could lead to catastrophic effects (the user might
+   need to reboot the machine to get it to normal state). So, it's
+   expected that the user understands what he is doing.
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V2 2/5] iommu/vt-d: Rename device_def_domain_type() to intel_iommu_dev_def_domain_type()

2020-02-16 Thread Sai Praneeth Prakhya
The functionality needed for iommu_ops->dev_def_domain_type() is already
provided by device_def_domain_type() in intel_iommu.c. But, every call back
function in intel_iommu_ops starts with intel_iommu prefix, hence rename
device_def_domain_type() to intel_iommu_dev_def_domain_type() so that it
follows the same semantics.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 64ddccf1d5fe..68f10d271ac0 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3034,7 +3034,7 @@ static bool device_is_rmrr_locked(struct device *dev)
  *  - IOMMU_DOMAIN_IDENTITY: device requires an identical mapping domain
  *  - 0: both identity and dynamic domains work for this device
  */
-static int device_def_domain_type(struct device *dev)
+static int intel_iommu_dev_def_domain_type(struct device *dev)
 {
if (dev_is_pci(dev)) {
struct pci_dev *pdev = to_pci_dev(dev);
@@ -5796,7 +5796,8 @@ static int intel_iommu_add_device(struct device *dev)
domain = iommu_get_domain_for_dev(dev);
dmar_domain = to_dmar_domain(domain);
if (domain->type == IOMMU_DOMAIN_DMA) {
-   if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
+   if (intel_iommu_dev_def_domain_type(dev) ==
+   IOMMU_DOMAIN_IDENTITY) {
ret = iommu_request_dm_for_dev(dev);
if (ret) {
dmar_remove_one_dev_info(dev);
@@ -5807,7 +5808,7 @@ static int intel_iommu_add_device(struct device *dev)
}
}
} else {
-   if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
+   if (intel_iommu_dev_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
ret = iommu_request_dma_domain_for_dev(dev);
if (ret) {
dmar_remove_one_dev_info(dev);
@@ -6194,6 +6195,7 @@ const struct iommu_ops intel_iommu_ops = {
.dev_enable_feat= intel_iommu_dev_enable_feat,
.dev_disable_feat   = intel_iommu_dev_disable_feat,
.is_attach_deferred = intel_iommu_is_attach_deferred,
+   .dev_def_domain_type= intel_iommu_dev_def_domain_type,
.pgsize_bitmap  = INTEL_IOMMU_PGSIZES,
 };
 
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 4/4] iommu: Document usage of "/sys/kernel/iommu_groups//type" file

2019-08-21 Thread Sai Praneeth Prakhya
I don't know why people are being dropped randomly from CC'list. So, adding
them back

+ Sohil, Jacob, Joerg, Baolu

> > +What:  /sys/kernel/iommu_groups//type
> > +Date:  August 2019
> > +KernelVersion: v5.4
> > +Contact:   Sai Praneeth Prakhya 
> > +Description:   /sys/kernel/iommu_groups//type lets the user
> > know the
> > +   type of default domain in use by iommu for this group. A
> > +   privileged user could request kernel to change the group type
> > by
> > +   writing to this file. Presently, only three types are
> > supported
> 
> It's unclear whether the following is a list of all values the user 
> could both read and write (which it isn't).

Thanks for bringing this up.
True.. user would never see "auto" when he reads this file. I will make it
clear in V2.

> 
> > +   1. dma: All the DMA transactions from the devices in this
> > group
> > +   are translated by the iommu.
> 
> Why "dma", and not "DMA" (which is what we would read for DMA type)?

Makes sense.. Will change to "DMA" in V2.

> 
> > +   2. identity: All the DMA transactions from the devices in this
> > +group are *not* translated by the iommu.
> > +   3. auto: Kernel decides one among dma/identity
> 
> Isn't this the same as reset value, which we could read and remember?

Agreed. But, I think (assuming it's done manually and the user hasn't stored
default value in some script), remembering would be difficult if the system
had been running for weeks together and the user had already changed group
type multiple times.

> (And the kernel could reject when we attempt to change to a disallowed 
> type).

It already does and I see your point.
Since there are only two options, user might try to write "DMA" and if the
kernel disallows he would write "identity", simple enough.

I think of "auto" as an add-on feature. Since it's simple enough to implement
in kernel and reduces one extra step to user.

> 
> > +   Note:
> > +   -
> > +   A group type could be modified only when *all* the devices in
> > +   the group are not binded to any device driver. So, the user
> > has
> > +   to first unbind the appropriate drivers and then change the
> > +   default domain type.
> > +   Caution:
> > +   
> > +   Unbinding a device driver will take away the drivers control
> > +   over the device and if done on devices that host root file
> > +   system could lead to catastrophic effects (the user might
> > +   need to reboot the machine to get it to normal state). So,
> > it's
> > +   expected that the user understands what he is doing.
> 
> I think that this goes without saying.

Haha.. Yes, it does. But, just wanted to be explicit so that new users are
warned well before :)

Regards,
Sai

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 4/4] iommu: Document usage of "/sys/kernel/iommu_groups//type" file

2019-08-20 Thread Sai Praneeth Prakhya
The default domain type of an iommu group can be changed using file
"/sys/kernel/iommu_groups//type". Hence, document it's usage and
more importantly spell out it's limitations and an example.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 .../ABI/testing/sysfs-kernel-iommu_groups | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups 
b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 017f5bc3920c..0a70b3a66ff3 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -33,3 +33,38 @@ Description:In case an RMRR is used only by graphics or 
USB devices
it is now exposed as "direct-relaxable" instead of "direct".
In device assignment use case, for instance, those RMRR
are considered to be relaxable and safe.
+
+What:  /sys/kernel/iommu_groups//type
+Date:  August 2019
+KernelVersion: v5.4
+Contact:   Sai Praneeth Prakhya 
+Description:   /sys/kernel/iommu_groups//type lets the user know the
+   type of default domain in use by iommu for this group. A
+   privileged user could request kernel to change the group type by
+   writing to this file. Presently, only three types are supported
+   1. dma: All the DMA transactions from the devices in this group
+   are translated by the iommu.
+   2. identity: All the DMA transactions from the devices in this
+group are *not* translated by the iommu.
+   3. auto: Kernel decides one among dma/identity
+   Note:
+   -
+   A group type could be modified only when *all* the devices in
+   the group are not binded to any device driver. So, the user has
+   to first unbind the appropriate drivers and then change the
+   default domain type.
+   Caution:
+   
+   Unbinding a device driver will take away the drivers control
+   over the device and if done on devices that host root file
+   system could lead to catastrophic effects (the user might
+   need to reboot the machine to get it to normal state). So, it's
+   expected that the user understands what he is doing.
+   Example:
+   
+   # Unbind USB device driver
+   1. echo ":00:14.0" > /sys/bus/pci/drivers/xhci_hcd/unbind
+   # Put the USB device in identity mode (a.k.a pass through)
+   2. echo "identity" > /sys/kernel/iommu_groups//type
+   # Re-bind the driver
+   3. echo ":00:14.0" > /sys/bus/pci/drivers/xhci_hcd/bind
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 3/4] iommu: Add support to change default domain of an iommu_group

2019-08-20 Thread Sai Praneeth Prakhya
Presently, the default domain of an iommu_group is allocated during boot
time (i.e. when a device is being added to a group) and it cannot be
changed later. So, the device would typically be either in identity (also
known as pass_through) mode (controlled by "iommu=pt" kernel command line
argument) or the device would be in DMA mode as long as the machine is up
and running. There is no way to change the default domain type dynamically
i.e. after booting, a device cannot switch between identity mode and DMA
mode.

But, assume a use case where in the user trusts the device and also
believes that his OS is secure enough and hence wants *only* this device to
bypass IOMMU (so that it could be high performing) whereas all the other
devices to go through IOMMU (so that the system is protected). Presently,
this is not supported and hence it will be helpful if there is some way to
change the default domain of a B:D.F dynamically. Since, linux iommu
subsystem prefers to deal at iommu_group level instead of B:D.F level, it
might be helpful if there is some way to change the default domain of a
*iommu_group* dynamically. Hence, add such support.

A privileged user could request the kernel to change the default domain
type of a group by writing to "/sys/kernel/iommu_groups//type"
file. Presently, only three values are supported
1. identity: all the DMA transactions from the devices in this group are
 *not* translated by the iommu
2. dma: all the DMA transactions from the devices in this group are
translated by the iommu
3. auto: kernel decides one among dma/identity

Also please note that a group type could be modified only when *all*
the devices in the group are not binded to any device driver. Please see
"Documentation/ABI/testing/sysfs-kernel-iommu_groups" for more information.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/iommu.c | 232 +-
 1 file changed, 231 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 70bfbcc09248..f4a5981e570b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -155,6 +155,8 @@ static int __iommu_attach_group(struct iommu_domain *domain,
struct iommu_group *group);
 static void __iommu_detach_group(struct iommu_domain *domain,
 struct iommu_group *group);
+static ssize_t iommu_group_store_type(struct iommu_group *group,
+ const char *buf, size_t count);
 
 static int __init iommu_set_def_domain_type(char *str)
 {
@@ -376,7 +378,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO, 
iommu_group_show_name, NULL);
 static IOMMU_GROUP_ATTR(reserved_regions, 0444,
iommu_group_show_resv_regions, NULL);
 
-static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
+static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
+   iommu_group_store_type);
 
 static void iommu_group_release(struct kobject *kobj)
 {
@@ -2470,3 +2473,230 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
return ops->sva_get_pasid(handle);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+/*
+ * Changes the default domain of a group. This function is heavily inspired 
from
+ * request_default_domain_for_dev() and couldn't re-use the same because:
+ * 1. The domain should be changed even if there are devices under this group
+ *because the driver is already unbinded and it's safe to do so. Also, note
+ *that *only* default_domain is being changed and hence the devices list in
+ *the group need not be changed.
+ * 2. Unlike request_default_domain_for_dev(), a domain is allocated only once
+ *for the whole group, where as the former allocates a domain per device.
+ *
+ * @group: The group for which the default domain should be changed
+ * @prev_domain: The previous domain that is being switched from
+ * @type: The type of the new default domain that gets associated with the 
group
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Note:
+ * 1. Presently, this function is called only when user requests to change the
+ *group's default domain type through 
/sys/kernel/iommu_groups//type
+ *Be aware to take a closer look if intended to use for other cases.
+ * 2. Assumes group->mutex is already taken
+ */
+static int iommu_group_change_def_domain(struct iommu_group *group,
+struct iommu_domain *prev_domain,
+int type)
+{
+   struct group_device *grp_dev;
+   struct iommu_domain *new_domain;
+   int ret = 0;
+
+   /*
+* iommu_domain_alloc() takes "struct bus_type" as an argument which is
+* a member in "struct device".

[PATCH 1/4] iommu/vt-d: Modify device_def_domain_type() to use at runtime

2019-08-20 Thread Sai Praneeth Prakhya
device_def_domain_type() determines the domain type a device could have and
currently it's called only during boot time. Since the function is called
only during boot time, it *always* considers command line arguments like
"intel_iommu=igfx_off" and "iommu=pt" to determine the domain type of a
device. To support changing the domain of an iommu_group through sysfs,
this function has to be called at runtime as well. Hence, modify this
function such that command line arguments are considered to determine the
domain type of a device *only* when the system is booting and ignored if
the system is already up and running.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu.c | 35 +--
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index b7454ca4a87c..27c3c893530a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2859,15 +2859,14 @@ static bool device_is_rmrr_locked(struct device *dev)
 }
 
 /*
- * Return the required default domain type for a specific device.
+ * Returns the possible default domain types a device could have.
  *
- * @dev: the device in query
- * @startup: true if this is during early boot
+ * @dev: The device in query
  *
  * Returns:
- *  - IOMMU_DOMAIN_DMA: device requires a dynamic mapping domain
- *  - IOMMU_DOMAIN_IDENTITY: device requires an identical mapping domain
- *  - 0: both identity and dynamic domains work for this device
+ *  - IOMMU_DOMAIN_DMA: Device should always be in dynamic mapping domain
+ *  - IOMMU_DOMAIN_IDENTITY: Device should always be in identity mapping domain
+ *  - 0: Device could be in any domain (i.e. identity or dynamic)
  */
 static int device_def_domain_type(struct device *dev)
 {
@@ -2884,11 +2883,22 @@ static int device_def_domain_type(struct device *dev)
if (pdev->untrusted)
return IOMMU_DOMAIN_DMA;
 
+   /*
+* Azalia device should always be in identity domain if
+* check_tylersburg_isoch() decides so.
+*/
if ((iommu_identity_mapping & IDENTMAP_AZALIA) && 
IS_AZALIA(pdev))
return IOMMU_DOMAIN_IDENTITY;
 
-   if ((iommu_identity_mapping & IDENTMAP_GFX) && 
IS_GFX_DEVICE(pdev))
-   return IOMMU_DOMAIN_IDENTITY;
+   /*
+* intel_iommu=igfx_off should have it's effect only during
+* boot.
+*/
+   if (system_state < SYSTEM_RUNNING) {
+   if ((iommu_identity_mapping & IDENTMAP_GFX) &&
+   IS_GFX_DEVICE(pdev))
+   return IOMMU_DOMAIN_IDENTITY;
+   }
 
/*
 * We want to start off with all devices in the 1:1 domain, and
@@ -2919,8 +2929,13 @@ static int device_def_domain_type(struct device *dev)
return IOMMU_DOMAIN_DMA;
}
 
-   return (iommu_identity_mapping & IDENTMAP_ALL) ?
-   IOMMU_DOMAIN_IDENTITY : 0;
+   /* iommu=pt should have it's effect only during boot */
+   if (system_state < SYSTEM_RUNNING) {
+   if (iommu_identity_mapping & IDENTMAP_ALL)
+   return IOMMU_DOMAIN_IDENTITY;
+   }
+
+   return 0;
 }
 
 static void intel_iommu_init_qi(struct intel_iommu *iommu)
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/4] iommu: Add device_def_domain_type() call back function to iommu_ops

2019-08-20 Thread Sai Praneeth Prakhya
When user requests kernel to change the default domain type of a group
through sysfs, kernel has to make sure that it's ok to change the domain
type of every device in the group to the requested domain (every device may
not support both the domain types i.e. DMA and identity). Hence, add a call
back function that could be implemented per architecture that performs the
above check.

For intel_iommu, this is already done by device_def_domain_type(), but
every call back function in intel_iommu_ops starts with intel_iommu prefix,
hence modify device_def_domain_type() so that it follows the same
semantics.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu.c | 9 ++---
 include/linux/iommu.h   | 4 
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 27c3c893530a..3e8655197aad 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2868,7 +2868,7 @@ static bool device_is_rmrr_locked(struct device *dev)
  *  - IOMMU_DOMAIN_IDENTITY: Device should always be in identity mapping domain
  *  - 0: Device could be in any domain (i.e. identity or dynamic)
  */
-static int device_def_domain_type(struct device *dev)
+static int intel_iommu_device_def_domain_type(struct device *dev)
 {
if (dev_is_pci(dev)) {
struct pci_dev *pdev = to_pci_dev(dev);
@@ -5297,7 +5297,8 @@ static int intel_iommu_add_device(struct device *dev)
domain = iommu_get_domain_for_dev(dev);
dmar_domain = to_dmar_domain(domain);
if (domain->type == IOMMU_DOMAIN_DMA) {
-   if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
+   if (intel_iommu_device_def_domain_type(dev) ==
+   IOMMU_DOMAIN_IDENTITY) {
ret = iommu_request_dm_for_dev(dev);
if (ret) {
dmar_remove_one_dev_info(dev);
@@ -5308,7 +5309,8 @@ static int intel_iommu_add_device(struct device *dev)
}
}
} else {
-   if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
+   if (intel_iommu_device_def_domain_type(dev) ==
+   IOMMU_DOMAIN_DMA) {
ret = iommu_request_dma_domain_for_dev(dev);
if (ret) {
dmar_remove_one_dev_info(dev);
@@ -5652,6 +5654,7 @@ const struct iommu_ops intel_iommu_ops = {
.dev_enable_feat= intel_iommu_dev_enable_feat,
.dev_disable_feat   = intel_iommu_dev_disable_feat,
.is_attach_deferred = intel_iommu_is_attach_deferred,
+   .device_def_domain_type = intel_iommu_device_def_domain_type,
.pgsize_bitmap  = INTEL_IOMMU_PGSIZES,
 };
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 64ebaff33455..ae4755ad2937 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -244,6 +244,8 @@ struct iommu_iotlb_gather {
  * @sva_unbind: Unbind process address space from device
  * @sva_get_pasid: Get PASID associated to a SVA handle
  * @page_response: handle page request response
+ * @device_def_domain_type: Return the possible default domain types a device
+ * could have.
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  */
 struct iommu_ops {
@@ -307,6 +309,8 @@ struct iommu_ops {
 struct iommu_fault_event *evt,
 struct iommu_page_response *msg);
 
+   int (*device_def_domain_type)(struct device *dev);
+
unsigned long pgsize_bitmap;
 };
 
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 0/4] iommu: Add support to change default domain of a group

2019-08-20 Thread Sai Praneeth Prakhya
Presently, the default domain of a group is allocated during boot time and it
cannot be changed later. So, the device would typically be either in identity
(pass_through) mode or the device would be in DMA mode as long as the
system is up and running. There is no way to change the default domain type
dynamically i.e. after booting, a device cannot switch between identity mode and
DMA mode.

Assume a use case where-in the priviliged user would want to use the device in
pass-through mode when the device is used for host so that it would be high
performing. Presently, this is not supported and hence add support to change the
default domain of a group dynamically.

Support this through a sysfs file, namely 
"/sys/kernel/iommu_groups//type".

Changes from RFC:
-
1. Added support for "auto" type, so that kernel selects one among identity or
   dma mode.
2. Use "system_state" in device_def_domain_type() instead of an argument.

Testing:

Tested by dynamically changing network device from
1. identity mode to dma and making sure ping works
2. dma mode to identity mode and making sure ping works
Tested only for intel_iommu/vt-d. Haven't tested on other architectures.

Sai Praneeth Prakhya (4):
  iommu/vt-d: Modify device_def_domain_type() to use at runtime
  iommu: Add device_def_domain_type() call back function to iommu_ops
  iommu: Add support to change default domain of an iommu_group
  iommu: Document usage of "/sys/kernel/iommu_groups//type" file

 .../ABI/testing/sysfs-kernel-iommu_groups |  35 +++
 drivers/iommu/intel-iommu.c   |  44 +++-
 drivers/iommu/iommu.c | 232 +-
 include/linux/iommu.h |   4 +
 4 files changed, 301 insertions(+), 14 deletions(-)

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 

-- 
2.19.1
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/vt-d: Print pasid table entries MSB to LSB in debugfs

2019-07-21 Thread Sai Praneeth Prakhya
Commit dd5142ca5d24 ("iommu/vt-d: Add debugfs support to show scalable mode
DMAR table internals") prints content of pasid table entries from LSB to
MSB where as other entries are printed MSB to LSB. So, to maintain
uniformity among all entries and to not confuse the user, print MSB first.

Cc: Joerg Roedel 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu-debugfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu-debugfs.c 
b/drivers/iommu/intel-iommu-debugfs.c
index 73a552914455..2b25d9c59336 100644
--- a/drivers/iommu/intel-iommu-debugfs.c
+++ b/drivers/iommu/intel-iommu-debugfs.c
@@ -162,9 +162,9 @@ static inline void print_tbl_walk(struct seq_file *m)
   (u64)0, (u64)0, (u64)0);
else
seq_printf(m, "%-6d\t0x%016llx:0x%016llx:0x%016llx\n",
-  tbl_wlk->pasid, tbl_wlk->pasid_tbl_entry->val[0],
+  tbl_wlk->pasid, tbl_wlk->pasid_tbl_entry->val[2],
   tbl_wlk->pasid_tbl_entry->val[1],
-  tbl_wlk->pasid_tbl_entry->val[2]);
+  tbl_wlk->pasid_tbl_entry->val[0]);
 }
 
 static void pasid_tbl_walk(struct seq_file *m, struct pasid_entry *tbl_entry,
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH RFC 2/4] iommu: Add device_def_domain_type() call back function to iommu_ops

2019-07-02 Thread Sai Praneeth Prakhya
When user requests kernel to change the default domain type of a group
through sysfs, kernel has to make sure if it's ok to change the domain type
of every device in the group to the requested domain (every device may not
support both the domain types i.e. DMA and identity). Hence, add a call
back function that could be implemented per architecture that performs the
above check.

For intel_iommu, this is already done by device_def_domain_type(), but
every call back function in intel_iommu_ops starts with intel_iommu prefix,
hence modify device_def_domain_type() so that it follows the same semantics.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu.c | 9 ++---
 include/linux/iommu.h   | 3 +++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 54e82415e396..0c991eb5034a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2874,7 +2874,7 @@ static bool device_is_rmrr_locked(struct device *dev)
  *  - IOMMU_DOMAIN_IDENTITY: device requires an identical mapping domain
  *  - 0: both identity and dynamic domains work for this device
  */
-static int device_def_domain_type(struct device *dev, bool startup)
+static int intel_iommu_device_def_domain_type(struct device *dev, bool startup)
 {
if (dev_is_pci(dev)) {
struct pci_dev *pdev = to_pci_dev(dev);
@@ -5286,7 +5286,8 @@ static int intel_iommu_add_device(struct device *dev)
domain = iommu_get_domain_for_dev(dev);
dmar_domain = to_dmar_domain(domain);
if (domain->type == IOMMU_DOMAIN_DMA) {
-   if (device_def_domain_type(dev, true) == IOMMU_DOMAIN_IDENTITY) 
{
+   if (intel_iommu_device_def_domain_type(dev, true) ==
+   IOMMU_DOMAIN_IDENTITY) {
ret = iommu_request_dm_for_dev(dev);
if (ret) {
dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
@@ -5296,7 +5297,8 @@ static int intel_iommu_add_device(struct device *dev)
}
}
} else {
-   if (device_def_domain_type(dev, true) == IOMMU_DOMAIN_DMA) {
+   if (intel_iommu_device_def_domain_type(dev, true) ==
+   IOMMU_DOMAIN_DMA) {
ret = iommu_request_dma_domain_for_dev(dev);
if (ret) {
dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
@@ -5637,6 +5639,7 @@ const struct iommu_ops intel_iommu_ops = {
.dev_enable_feat= intel_iommu_dev_enable_feat,
.dev_disable_feat   = intel_iommu_dev_disable_feat,
.is_attach_deferred = intel_iommu_is_attach_deferred,
+   .device_def_domain_type = intel_iommu_device_def_domain_type,
.pgsize_bitmap  = INTEL_IOMMU_PGSIZES,
 };
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fdc355ccc570..2d172c02be05 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -228,6 +228,7 @@ struct iommu_sva_ops {
  * @sva_unbind: Unbind process address space from device
  * @sva_get_pasid: Get PASID associated to a SVA handle
  * @page_response: handle page request response
+ * @device_def_domain_type: Return the required default domain type for a 
device
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  */
 struct iommu_ops {
@@ -292,6 +293,8 @@ struct iommu_ops {
 struct iommu_fault_event *evt,
 struct iommu_page_response *msg);
 
+   int (*device_def_domain_type)(struct device *dev, bool startup);
+
unsigned long pgsize_bitmap;
 };
 
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH RFC 0/4] iommu: Add support to change default domain of a group

2019-07-02 Thread Sai Praneeth Prakhya
Presently, the default domain of a group is allocated during boot time and it
cannot be changed later. So, the device would typically be either in
identity/pass through mode or the device would be in DMA mode as long as the
system is up and running. There is no way to change the default domain type
dynamically i.e. after booting, a device cannot switch between identity mode and
DMA mode.

Assume a use case where-in the priviliged user would want to use the device in
pass-through mode when the device is used for host but would want to switch to
dma protected mode when switching for VFIO in user space. Presently, this is not
supported and hence add support to change default domain of a group dynamically.

Support this through a sysfs file, namely 
"/sys/kernel/iommu_groups//type".

Testing:

Tested by dynamically changing USB device from identity mode to dma and from dma
to identity. Only for x86_64 (i.e. intel_iommu/vt-d). Haven't tested for other
architectures.

Sai Praneeth Prakhya (4):
  iommu/vt-d: Modify device_def_domain_type() to use at runtime
  iommu: Add device_def_domain_type() call back function to iommu_ops
  iommu: Add support to change default domain of a group
  iommu: Document usage of "/sys/kernel/iommu_groups//type" file

 .../ABI/testing/sysfs-kernel-iommu_groups |  34 
 drivers/iommu/intel-iommu.c   |  32 +++-
 drivers/iommu/iommu.c | 178 +-
 include/linux/iommu.h |   3 +
 4 files changed, 237 insertions(+), 10 deletions(-)

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 

-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH RFC 3/4] iommu: Add support to change default domain of a group

2019-07-02 Thread Sai Praneeth Prakhya
Presently, the default domain of a group is allocated during boot time
(i.e. when a device is being added to a group) and it cannot be changed
later. So, the device would typically be either in identity/pass_through
mode (controlled by "iommu=pt" kernel command line argument) or the device
would be in DMA mode (as long as the machine is up and running). There is
no way to change the default domain type dynamically i.e. after booting, a
device cannot switch between identity mode and DMA mode.

But, assume a use case where in there is an SR-IOV device and if the
privileged user decides to use some VF's natively (i.e. they are available
only to host) and he wants them to be high performing and also believes
that his OS is secure enough. In this scenario, some VF's should bypass
IOMMU. Presently, this is not supported and hence it will be helpful if
there is some way to change the default domain of a B:D.F dynamically.
Since, linux iommu subsystem prefers to deal at group level instead of
B:D.F level, it might be helpful if there is some way to change the default
domain of a *group* dynamically. Hence, add such support.

A privileged user could request the kernel to change the default domain
type of a group by writing to "/sys/kernel/iommu_groups//type"
file. Presently, only two values are supported "identity" and "dma".
"identity" means that all the DMA transactions from the devices in this
group are *not* translated by the iommu and where as "dma" means that all
the DMA transactions from the devices in this group are translated by the
iommu. Also please note that a group type could be modified only when *all*
the devices in the group are not binded to any device driver.

Please see "Documentation/ABI/testing/sysfs-kernel-iommu_groups" for more
information.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/iommu.c | 178 +-
 1 file changed, 177 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0c674d80c37f..92fadbea36b1 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -155,6 +155,8 @@ static int __iommu_attach_group(struct iommu_domain *domain,
struct iommu_group *group);
 static void __iommu_detach_group(struct iommu_domain *domain,
 struct iommu_group *group);
+static ssize_t iommu_group_store_type(struct iommu_group *group,
+ const char *buf, size_t count);
 
 static int __init iommu_set_def_domain_type(char *str)
 {
@@ -376,7 +378,8 @@ static IOMMU_GROUP_ATTR(name, S_IRUGO, 
iommu_group_show_name, NULL);
 static IOMMU_GROUP_ATTR(reserved_regions, 0444,
iommu_group_show_resv_regions, NULL);
 
-static IOMMU_GROUP_ATTR(type, 0444, iommu_group_show_type, NULL);
+static IOMMU_GROUP_ATTR(type, 0644, iommu_group_show_type,
+   iommu_group_store_type);
 
 static void iommu_group_release(struct kobject *kobj)
 {
@@ -2468,3 +2471,176 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
return ops->sva_get_pasid(handle);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
+
+/*
+ * Changes the default domain of a group. This function is heavily inspired 
from
+ * request_default_domain_for_dev() and couldn't re-use the same because:
+ * 1. The domain should be changed even if there are devices under this group
+ *because the driver is already unbinded and it's safe to do so. Also, note
+ *that *only* default_domain is being changed and hence the devices list in
+ *the group need not be changed.
+ * 2. Unlike request_default_domain_for_dev(), a domain is allocated only once
+ *for the whole group, where as the former allocates a domain per device.
+ *
+ * @group: The group for which the default domain should be changed
+ * @prev_domain: The previous domain that is being switched from
+ * @type: The type of the new default domain that gets associated with the 
group
+ *
+ * Returns 0 on success and error code on failure
+ *
+ * Note:
+ * 1. Presently, this function is called only when user requests to change the
+ *group's default domain type through 
/sys/kernel/iommu_groups//type
+ *Be aware to take a closer look if intended to use for other cases.
+ * 2. Assumes group->mutex is already taken
+ */
+static int iommu_group_change_def_domain(struct iommu_group *group,
+struct iommu_domain *prev_domain,
+int type)
+{
+   struct group_device *grp_dev;
+   struct iommu_domain *new_domain;
+   int ret = 0;
+
+   /*
+* iommu_domain_alloc() takes "struct bus_type" as an argument which is
+* a member in "struct device

[PATCH RFC 4/4] iommu: Document usage of "/sys/kernel/iommu_groups//type" file

2019-07-02 Thread Sai Praneeth Prakhya
The default domain type of an iommu group can be changed using file
"/sys/kernel/iommu_groups//type". Hence, document it's usage and
more importantly spell out it's limitations and an example.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 .../ABI/testing/sysfs-kernel-iommu_groups | 34 +++
 1 file changed, 34 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups 
b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 017f5bc3920c..7eb1b784c5e3 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -33,3 +33,37 @@ Description:In case an RMRR is used only by graphics or 
USB devices
it is now exposed as "direct-relaxable" instead of "direct".
In device assignment use case, for instance, those RMRR
are considered to be relaxable and safe.
+
+What:  /sys/kernel/iommu_groups//type
+Date:  July 2019
+KernelVersion: v5.4
+Contact:   Sai Praneeth Prakhya 
+Description:   /sys/kernel/iommu_groups//type lets the user know the
+   type of default domain in use by iommu for this group. A
+   privileged user could request kernel to change the group type by
+   writing to this file. Presently, only two types are supported
+   1. dma: All the DMA transactions from the devices in this group
+   are translated by the iommu.
+   2. identity: All the DMA transactions from the devices in this
+group are *not* translated by the iommu.
+   Note:
+   -
+   A group type could be modified only when *all* the devices in
+   the group are not binded to any device driver. So, the user has
+   to first unbind the appropriate drivers and then change the
+   default domain type.
+   Caution:
+   
+   Unbinding a device driver will take away the drivers control
+   over the device and if done on devices that host root file
+   system could lead to catastrophic effects (the user might
+   need to reboot the machine to get it to normal state). So, it's
+   expected that the user understands what he is doing.
+   Example:
+   
+   # Unbind USB device driver
+   1. echo ":00:14.0" > /sys/bus/pci/drivers/xhci_hcd/unbind
+   # Put the USB device in identity mode (a.k.a pass through)
+   2. echo "identity" > /sys/kernel/iommu_groups//type
+   # Re-bind the driver
+   3. echo ":00:14.0" > /sys/bus/pci/drivers/xhci_hcd/bind
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH RFC 1/4] iommu/vt-d: Modify device_def_domain_type() to use at runtime

2019-07-02 Thread Sai Praneeth Prakhya
device_def_domain_type() determines the domain type a device could have and
it's called only during boot. But, to change the domain of a group through
sysfs, kernel has to call this function during runtime. Hence, add an
argument to the function which lets the function know if it's being called
at boot time or runtime.

Normally, device_def_domain_type() considers the value of command line
arguments like "intel_iommu=igfx_off" or "iommu=pt" to determine the domain
type of a device, but at runtime, user request should take higher priority
over command line arguments. Hence, use 'startup' argument to modify
device_def_domain_type() such that command line argument settings apply only at
boot time.

Cc: Christoph Hellwig 
Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Will Deacon 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: Robin Murphy 
Cc: Jacob Pan 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu.c | 29 -
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index ac4172c02244..54e82415e396 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2862,15 +2862,19 @@ static bool device_is_rmrr_locked(struct device *dev)
 /*
  * Return the required default domain type for a specific device.
  *
- * @dev: the device in query
- * @startup: true if this is during early boot
+ * @dev:   The device in query
+ * @startup:   Should command line arguments (E.g: igfx_off or iommu=pt) be
+ * considered to decide the domain type of a device? Which is
+ * 'true' while booting but 'false' if the user requests kernel to
+ * change the domain type by writing to
+ * /sys/kernel/iommu_groups//type.
  *
  * Returns:
  *  - IOMMU_DOMAIN_DMA: device requires a dynamic mapping domain
  *  - IOMMU_DOMAIN_IDENTITY: device requires an identical mapping domain
  *  - 0: both identity and dynamic domains work for this device
  */
-static int device_def_domain_type(struct device *dev)
+static int device_def_domain_type(struct device *dev, bool startup)
 {
if (dev_is_pci(dev)) {
struct pci_dev *pdev = to_pci_dev(dev);
@@ -2888,8 +2892,11 @@ static int device_def_domain_type(struct device *dev)
if ((iommu_identity_mapping & IDENTMAP_AZALIA) && 
IS_AZALIA(pdev))
return IOMMU_DOMAIN_IDENTITY;
 
-   if ((iommu_identity_mapping & IDENTMAP_GFX) && 
IS_GFX_DEVICE(pdev))
-   return IOMMU_DOMAIN_IDENTITY;
+   if (startup) {
+   if ((iommu_identity_mapping & IDENTMAP_GFX) &&
+   IS_GFX_DEVICE(pdev))
+   return IOMMU_DOMAIN_IDENTITY;
+   }
 
/*
 * We want to start off with all devices in the 1:1 domain, and
@@ -2920,8 +2927,12 @@ static int device_def_domain_type(struct device *dev)
return IOMMU_DOMAIN_DMA;
}
 
-   return (iommu_identity_mapping & IDENTMAP_ALL) ?
-   IOMMU_DOMAIN_IDENTITY : 0;
+   if (startup) {
+   if (iommu_identity_mapping & IDENTMAP_ALL)
+   return IOMMU_DOMAIN_IDENTITY;
+   }
+
+   return 0;
 }
 
 static void intel_iommu_init_qi(struct intel_iommu *iommu)
@@ -5275,7 +5286,7 @@ static int intel_iommu_add_device(struct device *dev)
domain = iommu_get_domain_for_dev(dev);
dmar_domain = to_dmar_domain(domain);
if (domain->type == IOMMU_DOMAIN_DMA) {
-   if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
+   if (device_def_domain_type(dev, true) == IOMMU_DOMAIN_IDENTITY) 
{
ret = iommu_request_dm_for_dev(dev);
if (ret) {
dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
@@ -5285,7 +5296,7 @@ static int intel_iommu_add_device(struct device *dev)
}
}
} else {
-   if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
+   if (device_def_domain_type(dev, true) == IOMMU_DOMAIN_DMA) {
ret = iommu_request_dma_domain_for_dev(dev);
if (ret) {
dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 5/6] iommu/vt-d: Cleanup after delegating DMA domain to generic iommu

2019-06-10 Thread Sai Praneeth Prakhya
On Mon, 2019-06-10 at 11:45 -0700, Mehta, Sohil wrote:
> On Sun, 2019-06-09 at 10:38 +0800, Lu Baolu wrote:
> >  static int __init si_domain_init(int hw)
> > @@ -3306,14 +3252,13 @@ static int __init init_dmars(void)
> > if (pasid_supported(iommu))
> > intel_svm_init(iommu);
> >  #endif
> > -   }
> >  
> > -   /*
> > -* Now that qi is enabled on all iommus, set the root entry
> > and flush
> > -* caches. This is required on some Intel X58 chipsets,
> > otherwise the
> > -* flush_context function will loop forever and the boot
> > hangs.
> > -*/
> > -   for_each_active_iommu(iommu, drhd) {
> > +   /*
> > +* Now that qi is enabled on all iommus, set the root
> > entry and
> > +* flush caches. This is required on some Intel X58
> > chipsets,
> > +* otherwise the flush_context function will loop
> > forever and
> > +* the boot hangs.
> > +*/
> > iommu_flush_write_buffer(iommu);
> > iommu_set_root_entry(iommu);
> > iommu->flush.flush_context(iommu, 0, 0, 0,
> > DMA_CCMD_GLOBAL_INVL);
> 
> This changes the intent of the original code. As the comment says
> enable QI on all IOMMUs, then flush the caches and set the root entry.
> The order of setting the root entries has changed now.
> 
> Refer: 
> Commit a4c34ff1c029 ('iommu/vt-d: Enable QI on all IOMMUs before
> setting root entry')

Thanks Sohil! for catching the bug.
Will send a V2 to Lu Baolu fixing this.

Regards,
Sai



Re: Device specific pass through in host systems - discuss user interface

2019-06-08 Thread Sai Praneeth Prakhya
On Sat, 2019-06-08 at 09:27 +0200, h...@lst.de wrote:
> Just curious, what exactly is the use case?  Explaining how someone
> would wan to use this should drive the way we design an interface for it.

Makes sense.

Some example use cases:
1. Assume an SR-IOV device and if the admin decides to use some VF's natively
(i.e. they are available only to host) and he wants them to be high performing
and also believes that his OS is secure enough (so decides to by pass IOMMU).

Presently, we don't support this use case because "iommu=pt" kernel command
line argument is an all or none feature i.e. either all BDF's are translated
through IOMMU or none. So, we would like to propose a per-BDF on/off feature.

We would also want it to be run-time (i.e. sysfs based interface) rather than
just boot-time interface (kernel command line argument).

Also, want to be clear that it's not just SR-IOV devices that we are looking
at but could be any unrelated PCIe devices i.e. a SATA device, USB, NIC or
GFx. Wherein the admin wants to selectively put NIC and GFx through IOMMU and
does not want transactions by SATA and USB to be translated by IOMMU.

Regards,
Sai

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: Device specific pass through in host systems - discuss user interface

2019-06-07 Thread Sai Praneeth Prakhya


> It's interesting to see this from a fresh angle which isn't clouded by 
> other SoC GPU details - thanks for the proposal! A couple more thoughts 
> jump out immediately...
> 

Thanks a lot! for jumping in and providing your valuable insights :)
Sorry! for taking time to reply on this. Did some quick testing on
unbinding/binding stuff.

> > Since, this feature could be generic across architectures, we thought it
> > would be better if the user interface is discussed in the community first.
> > We are envisioning this to be used both during boot time and runtime and
> > hence having a kernel command line argument along with a sysfs entry are
> > needed. So, please pour in your suggestions on how the user interface
> > should look like to make it architecture agnostic.
> > 
> > 
> > 1.  Have a kernel command line argument that takes a list of BDF's as
> > an input and puts them in pass through mode
> > 
> > a.  Accepting BDF as an input has a downside - BDF is dynamic and
> > could change if BIOS/OS enumerates a new device in next reboot
> > 
> > b.  Accepting  pair as an input has a downside -
> > What to do when there are multiple such devices and user would like to put
> > only some of them in PT mode
> > 
> 
> c. Not all devices are PCI in the first place.

Agreed, maybe we could limit this feature only to PCIe devices :(

> 
> > 2.  Have a sysfs file which takes 1 or 0 as an input to enable/disable
> > pass through mode. Some places that seem to be reasonable are
> > 
> > a.  /sys/class/iommu/dmar0/devices/
> > 
> > b.  /sys/kernel/iommu_groups//devices
> 
> Note that this this works out a bit tricky to actually use, since 
> changing the meaning of DMA addresses under the device's feet would be 
> disastrous.

Yes, that makes sense.

> It can only safely take effect by unbinding and rebinding 
> the driver and/or resetting the device (as a side note, this starts to 
> overlap with the IOMMU-drivers-as-modules concept, and whatever solution 
> to this we end up with may be helpful in making that work too). In most 
> cases that's probably viable, but not every driver supports unbinding, 
> and either way what if the device in question is the one hosting the 
> root filesystem... :/

We would like to propose something like this

1. User will first unbind the device driver by
echo "BDF" > /sys/bus/pci/drivers//unbind

2. Set the Pass Through bit (this is to say the intent of the user that upon
the next rebind, please make this device as pass through)
echo 1 > /sys/class/iommu/dmar0/devices//pt

3. Re-bind the driver again
echo "BDF" > /sys/bus/pci/drivers//bind

So, if the driver doesn't support unbind then the user cannot put the device
in pass through mode, in which case he has to use kernel command line
argument.

I did unbind on my device that has root file system and after that I was
unable to run any command (like cat or echo), which made sense. So, it's upto
the user to decide which devices can go through unbind and hence can be put in
pass through mode. Since this feature requires sudo privileges, I think, we
can assume admin does the right thing.

Please let me know what you think about this and also please do suggest if you
have any other better ways to deal with this.

Regards,
Sai

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V3 3/3] iommu/vt-d: Add debugfs support to show scalable mode DMAR table internals

2019-05-24 Thread Sai Praneeth Prakhya
A DMAR table walk would typically follow the below process.
1. Bus number is used to index into root table which points to a context
   table.
2. Device number and Function number are used together to index into
   context table which then points to a pasid directory.
3. PASID[19:6] is used to index into PASID directory which points to a
   PASID table.
4. PASID[5:0] is used to index into PASID table which points to all levels
   of page tables.

Whenever a user opens the file
"/sys/kernel/debug/iommu/intel/dmar_translation_struct", the above
described DMAR table walk is performed and the contents of the table are
dumped into the file. The dump could be handy while dealing with devices
that use PASID.

Example of such dump:
cat /sys/kernel/debug/iommu/intel/dmar_translation_struct

(Please note that because of 80 char limit, entries that should have been
in the same line are broken into different lines)

IOMMU dmar0: Root Table Address: 0x436f7c000
B.D.F   Root_entry  Context_entry
PASID   PASID_table_entry
00:0a.0 0x:0x00044dd3f001   
0x0010:0x000435460e1d
0   0x00044d6e1089:0x0003:0x0001
00:0a.0 0x:0x00044dd3f001   
0x0010:0x000435460e1d
1   0x0049:0x0001:0x03c0e001

Note that the above format is followed even for legacy DMAR table dump
which doesn't support PASID and hence in such cases PASID is defaulted to
-1 indicating that PASID and it's related fields are invalid.

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Reviewed-by: Lu Baolu 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu-debugfs.c | 78 +++--
 1 file changed, 75 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-iommu-debugfs.c 
b/drivers/iommu/intel-iommu-debugfs.c
index 3f5399b5e6c0..73a552914455 100644
--- a/drivers/iommu/intel-iommu-debugfs.c
+++ b/drivers/iommu/intel-iommu-debugfs.c
@@ -14,11 +14,15 @@
 
 #include 
 
+#include "intel-pasid.h"
+
 struct tbl_walk {
u16 bus;
u16 devfn;
+   u32 pasid;
struct root_entry *rt_entry;
struct context_entry *ctx_entry;
+   struct pasid_entry *pasid_tbl_entry;
 };
 
 struct iommu_regset {
@@ -142,21 +146,82 @@ static inline void print_tbl_walk(struct seq_file *m)
 {
struct tbl_walk *tbl_wlk = m->private;
 
-   seq_printf(m, 
"%02x:%02x.%x\t0x%016llx:0x%016llx\t0x%016llx:0x%016llx\n",
+   seq_printf(m, 
"%02x:%02x.%x\t0x%016llx:0x%016llx\t0x%016llx:0x%016llx\t",
   tbl_wlk->bus, PCI_SLOT(tbl_wlk->devfn),
   PCI_FUNC(tbl_wlk->devfn), tbl_wlk->rt_entry->hi,
   tbl_wlk->rt_entry->lo, tbl_wlk->ctx_entry->hi,
   tbl_wlk->ctx_entry->lo);
+
+   /*
+* A legacy mode DMAR doesn't support PASID, hence default it to -1
+* indicating that it's invalid. Also, default all PASID related fields
+* to 0.
+*/
+   if (!tbl_wlk->pasid_tbl_entry)
+   seq_printf(m, "%-6d\t0x%016llx:0x%016llx:0x%016llx\n", -1,
+  (u64)0, (u64)0, (u64)0);
+   else
+   seq_printf(m, "%-6d\t0x%016llx:0x%016llx:0x%016llx\n",
+  tbl_wlk->pasid, tbl_wlk->pasid_tbl_entry->val[0],
+  tbl_wlk->pasid_tbl_entry->val[1],
+  tbl_wlk->pasid_tbl_entry->val[2]);
+}
+
+static void pasid_tbl_walk(struct seq_file *m, struct pasid_entry *tbl_entry,
+  u16 dir_idx)
+{
+   struct tbl_walk *tbl_wlk = m->private;
+   u8 tbl_idx;
+
+   for (tbl_idx = 0; tbl_idx < PASID_TBL_ENTRIES; tbl_idx++) {
+   if (pasid_pte_is_present(tbl_entry)) {
+   tbl_wlk->pasid_tbl_entry = tbl_entry;
+   tbl_wlk->pasid = (dir_idx << PASID_PDE_SHIFT) + tbl_idx;
+   print_tbl_walk(m);
+   }
+
+   tbl_entry++;
+   }
+}
+
+static void pasid_dir_walk(struct seq_file *m, u64 pasid_dir_ptr,
+  u16 pasid_dir_size)
+{
+   struct pasid_dir_entry *dir_entry = phys_to_virt(pasid_dir_ptr);
+   struct pasid_entry *pasid_tbl;
+   u16 dir_idx;
+
+   for (dir_idx = 0; dir_idx < pasid_dir_size; dir_idx++) {
+   pasid_tbl = get_pasid_table_from_pde(dir_entry);
+   if (pasid_tbl)
+   pasid_tbl_walk(m, pasid_tbl, dir_idx);
+
+   dir_entry++;
+   }
 }
 
 static void ctx_tbl_walk(struct seq_file *m, struct intel_iommu *iommu, u16 
bus)
 {
struct context_entry *context;
-   u16 devfn;
+   u16 dev

[PATCH V3 0/3] Add debugfs support to show scalable mode DMAR table

2019-05-24 Thread Sai Praneeth Prakhya
Presently, "/sys/kernel/debug/iommu/intel/dmar_translation_struct" file dumps
only legacy DMAR table which consists of root table and context table. Scalable
mode DMAR table adds PASID directory and PASID table. Hence, add support to dump
these tables as well.

Directly extending the present dumping format for PASID tables will make the
output look clumsy. Hence, the first patch modifies the present format to a
tabular format. The second patch introduces macros that are used during PASID
table walk and the third patch actually adds support to dump scalable mode DMAR
table.

Changes from V1 to V2:
--
1. Make my name consistent in "From" and "Signed-off-by"
2. Fix comment regarding scalable mode context entries
3. Add reviewed by tags

Changes from V2 to V3:
--
Presently, for V2 patches if kernel command line argument "iommu=pt" is passed,
dumping DMAR table seg faults. This happens because in pass through mode (for
non-scalable DMAR's) 3rd bit of context entry is set and it is misinterpreted as
PASID enabled by debugfs code and hence tries to dereference PASID directory
pointer which leads to seg fault (PASID directory pointer is undefined for
non-scalable DMAR's). To fix this, dereference PASID directory pointer only when
1. PASID is supported and
2. PASID is enabled.

This patch is tested on
1. Non-scalable DMAR with and without iommu=pt
2. Scalable DMAR with and without iommu=pt

Sai Praneeth Prakhya (3):
  iommu/vt-d: Modify the format of intel DMAR tables dump
  iommu/vt-d: Introduce macros useful for dumping DMAR table
  iommu/vt-d: Add debugfs support to show scalable mode DMAR table
internals

 drivers/iommu/intel-iommu-debugfs.c | 137 +---
 drivers/iommu/intel-iommu.c |   6 +-
 drivers/iommu/intel-pasid.c |  17 -
 drivers/iommu/intel-pasid.h |  26 +++
 include/linux/intel-iommu.h |   6 ++
 5 files changed, 146 insertions(+), 46 deletions(-)

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Reviewed-by: Lu Baolu 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 

-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V3 1/3] iommu/vt-d: Modify the format of intel DMAR tables dump

2019-05-24 Thread Sai Praneeth Prakhya
Presently, "/sys/kernel/debug/iommu/intel/dmar_translation_struct" file
dumps DMAR tables in the below format

IOMMU dmar2: Root Table Address:4362cc000
Root Table Entries:
 Bus: 0 H: 0 L: 4362f0001
 Context Table Entries for Bus: 0
  Entry B:D.F   HighLow
  160   00:14.0 102 4362ef001
  184   00:17.0 302 435ec4001
  248   00:1f.0 202 43631

This format has few short comings like
1. When extended for dumping scalable mode DMAR table it will quickly be
   very clumsy, making it unreadable.
2. It has information like the Bus number and Entry which are basically
   part of B:D.F, hence are a repetition and are not so useful.

So, change it to a new format which could be easily extended to dump
scalable mode DMAR table. The new format looks as below:

IOMMU dmar2: Root Table Address: 0x436f7d000
B.D.F   Root_entry  Context_entry
00:14.0 0x:0x000436fbd001   
0x0102:0x000436fbc001
00:17.0 0x:0x000436fbd001   
0x0302:0x000436af4001
00:1f.0 0x:0x000436fbd001   
0x0202:0x000436fcd001

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Reviewed-by: Lu Baolu 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu-debugfs.c | 65 +++--
 1 file changed, 41 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/intel-iommu-debugfs.c 
b/drivers/iommu/intel-iommu-debugfs.c
index 7fabf9b1c2dc..3f5399b5e6c0 100644
--- a/drivers/iommu/intel-iommu-debugfs.c
+++ b/drivers/iommu/intel-iommu-debugfs.c
@@ -14,6 +14,13 @@
 
 #include 
 
+struct tbl_walk {
+   u16 bus;
+   u16 devfn;
+   struct root_entry *rt_entry;
+   struct context_entry *ctx_entry;
+};
+
 struct iommu_regset {
int offset;
const char *regs;
@@ -131,16 +138,25 @@ static int iommu_regset_show(struct seq_file *m, void 
*unused)
 }
 DEFINE_SHOW_ATTRIBUTE(iommu_regset);
 
-static void ctx_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu,
-  int bus)
+static inline void print_tbl_walk(struct seq_file *m)
 {
-   struct context_entry *context;
-   int devfn;
+   struct tbl_walk *tbl_wlk = m->private;
 
-   seq_printf(m, " Context Table Entries for Bus: %d\n", bus);
-   seq_puts(m, "  Entry\tB:D.F\tHigh\tLow\n");
+   seq_printf(m, 
"%02x:%02x.%x\t0x%016llx:0x%016llx\t0x%016llx:0x%016llx\n",
+  tbl_wlk->bus, PCI_SLOT(tbl_wlk->devfn),
+  PCI_FUNC(tbl_wlk->devfn), tbl_wlk->rt_entry->hi,
+  tbl_wlk->rt_entry->lo, tbl_wlk->ctx_entry->hi,
+  tbl_wlk->ctx_entry->lo);
+}
+
+static void ctx_tbl_walk(struct seq_file *m, struct intel_iommu *iommu, u16 
bus)
+{
+   struct context_entry *context;
+   u16 devfn;
 
for (devfn = 0; devfn < 256; devfn++) {
+   struct tbl_walk tbl_wlk = {0};
+
context = iommu_context_addr(iommu, bus, devfn, 0);
if (!context)
return;
@@ -148,33 +164,34 @@ static void ctx_tbl_entry_show(struct seq_file *m, struct 
intel_iommu *iommu,
if (!context_present(context))
continue;
 
-   seq_printf(m, "  %-5d\t%02x:%02x.%x\t%-6llx\t%llx\n", devfn,
-  bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
-  context[0].hi, context[0].lo);
+   tbl_wlk.bus = bus;
+   tbl_wlk.devfn = devfn;
+   tbl_wlk.rt_entry = >root_entry[bus];
+   tbl_wlk.ctx_entry = context;
+   m->private = _wlk;
+
+   print_tbl_walk(m);
}
 }
 
-static void root_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu)
+static void root_tbl_walk(struct seq_file *m, struct intel_iommu *iommu)
 {
unsigned long flags;
-   int bus;
+   u16 bus;
 
spin_lock_irqsave(>lock, flags);
-   seq_printf(m, "IOMMU %s: Root Table Address:%llx\n", iommu->name,
+   seq_printf(m, "IOMMU %s: Root Table Address: 0x%llx\n", iommu->name,
   (u64)virt_to_phys(iommu->root_entry));
-   seq_puts(m, "Root Table Entries:\n");
-
-   for (bus = 0; bus < 256; bus++) {
-   if (!(iommu->root_entry[bus].lo & 1))
-   continue;
+   seq_puts(m, "B.D.F\tRoot_entry\t\t\t\tContext_entry\n");
 
-   seq_printf(m, " Bus: %d H: %llx L: %llx\n", bus,
-  iommu->root_entry[bus].hi,
-  iommu->root_entry[bus].lo);
+   /*
+* No need to check if the root entry is present or not because
+* iommu_con

[PATCH V3 2/3] iommu/vt-d: Introduce macros useful for dumping DMAR table

2019-05-24 Thread Sai Praneeth Prakhya
A scalable mode DMAR table walk would involve looking at bits in each stage
of walk, like,
1. Is PASID enabled in the context entry?
2. What's the size of PASID directory?
3. Is the PASID directory entry present?
4. Is the PASID table entry present?
5. Number of PASID table entries?

Hence, add these macros that will later be used during this walk.
Apart from adding new macros, move existing macros (like
pasid_pde_is_present(), get_pasid_table_from_pde() and pasid_supported())
to appropriate header files so that they could be reused.

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Reviewed-by: Lu Baolu 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu.c |  6 +-
 drivers/iommu/intel-pasid.c | 17 -
 drivers/iommu/intel-pasid.h | 26 ++
 include/linux/intel-iommu.h |  6 ++
 4 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a209199f3af6..f284732fe62b 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -357,6 +357,7 @@ int dmar_disabled = 0;
 int dmar_disabled = 1;
 #endif /*CONFIG_INTEL_IOMMU_DEFAULT_ON*/
 
+int intel_iommu_sm;
 int intel_iommu_enabled = 0;
 EXPORT_SYMBOL_GPL(intel_iommu_enabled);
 
@@ -364,17 +365,12 @@ static int dmar_map_gfx = 1;
 static int dmar_forcedac;
 static int intel_iommu_strict;
 static int intel_iommu_superpage = 1;
-static int intel_iommu_sm;
 static int iommu_identity_mapping;
 
 #define IDENTMAP_ALL   1
 #define IDENTMAP_GFX   2
 #define IDENTMAP_AZALIA4
 
-#define sm_supported(iommu)(intel_iommu_sm && ecap_smts((iommu)->ecap))
-#define pasid_supported(iommu) (sm_supported(iommu) && \
-ecap_pasid((iommu)->ecap))
-
 int intel_iommu_gfx_mapped;
 EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
 
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 2fefeafda437..6895a23b2157 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -169,23 +169,6 @@ int intel_pasid_alloc_table(struct device *dev)
return 0;
 }
 
-/* Get PRESENT bit of a PASID directory entry. */
-static inline bool
-pasid_pde_is_present(struct pasid_dir_entry *pde)
-{
-   return READ_ONCE(pde->val) & PASID_PTE_PRESENT;
-}
-
-/* Get PASID table from a PASID directory entry. */
-static inline struct pasid_entry *
-get_pasid_table_from_pde(struct pasid_dir_entry *pde)
-{
-   if (!pasid_pde_is_present(pde))
-   return NULL;
-
-   return phys_to_virt(READ_ONCE(pde->val) & PDE_PFN_MASK);
-}
-
 void intel_pasid_free_table(struct device *dev)
 {
struct device_domain_info *info;
diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
index 23537b3f34e3..fc8cd8f17de1 100644
--- a/drivers/iommu/intel-pasid.h
+++ b/drivers/iommu/intel-pasid.h
@@ -18,6 +18,10 @@
 #define PDE_PFN_MASK   PAGE_MASK
 #define PASID_PDE_SHIFT6
 #define MAX_NR_PASID_BITS  20
+#define PASID_TBL_ENTRIES  BIT(PASID_PDE_SHIFT)
+
+#define is_pasid_enabled(entry)(((entry)->lo >> 3) & 0x1)
+#define get_pasid_dir_size(entry)  (1 << entry)->lo >> 9) & 0x7) + 7))
 
 /*
  * Domain ID reserved for pasid entries programmed for first-level
@@ -49,6 +53,28 @@ struct pasid_table {
struct list_headdev;/* device list */
 };
 
+/* Get PRESENT bit of a PASID directory entry. */
+static inline bool pasid_pde_is_present(struct pasid_dir_entry *pde)
+{
+   return READ_ONCE(pde->val) & PASID_PTE_PRESENT;
+}
+
+/* Get PASID table from a PASID directory entry. */
+static inline struct pasid_entry *
+get_pasid_table_from_pde(struct pasid_dir_entry *pde)
+{
+   if (!pasid_pde_is_present(pde))
+   return NULL;
+
+   return phys_to_virt(READ_ONCE(pde->val) & PDE_PFN_MASK);
+}
+
+/* Get PRESENT bit of a PASID table entry. */
+static inline bool pasid_pte_is_present(struct pasid_entry *pte)
+{
+   return READ_ONCE(pte->val[0]) & PASID_PTE_PRESENT;
+}
+
 extern u32 intel_pasid_max_id;
 int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp);
 void intel_pasid_free_id(int pasid);
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 6925a18a5ca3..4140726867a9 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -447,6 +447,12 @@ enum {
 #define VTD_FLAG_TRANS_PRE_ENABLED (1 << 0)
 #define VTD_FLAG_IRQ_REMAP_PRE_ENABLED (1 << 1)
 
+extern int intel_iommu_sm;
+
+#define sm_supported(iommu)(intel_iommu_sm && ecap_smts((iommu)->ecap))
+#define pasid_supported(iommu) (sm_supported(iommu) && \
+ 

[PATCH V2 1/3] iommu/vt-d: Modify the format of intel DMAR tables dump

2019-05-10 Thread Sai Praneeth Prakhya
Presently, "/sys/kernel/debug/iommu/intel/dmar_translation_struct" file
dumps DMAR tables in the below format

IOMMU dmar2: Root Table Address:4362cc000
Root Table Entries:
 Bus: 0 H: 0 L: 4362f0001
 Context Table Entries for Bus: 0
  Entry B:D.F   HighLow
  160   00:14.0 102 4362ef001
  184   00:17.0 302 435ec4001
  248   00:1f.0 202 43631

This format has few short comings like
1. When extended for dumping scalable mode DMAR table it will quickly be
   very clumsy, making it unreadable.
2. It has information like the Bus number and Entry which are basically
   part of B:D.F, hence are a repetition and are not so useful.

So, change it to a new format which could be easily extended to dump
scalable mode DMAR table. The new format looks as below:

IOMMU dmar2: Root Table Address: 0x436f7d000
B.D.F   Root_entry  Context_entry
00:14.0 0x:0x000436fbd001   
0x0102:0x000436fbc001
00:17.0 0x:0x000436fbd001   
0x0302:0x000436af4001
00:1f.0 0x:0x000436fbd001   
0x0202:0x000436fcd001

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Reviewed-by: Lu Baolu 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu-debugfs.c | 65 +++--
 1 file changed, 41 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/intel-iommu-debugfs.c 
b/drivers/iommu/intel-iommu-debugfs.c
index 7fabf9b1c2dc..3f5399b5e6c0 100644
--- a/drivers/iommu/intel-iommu-debugfs.c
+++ b/drivers/iommu/intel-iommu-debugfs.c
@@ -14,6 +14,13 @@
 
 #include 
 
+struct tbl_walk {
+   u16 bus;
+   u16 devfn;
+   struct root_entry *rt_entry;
+   struct context_entry *ctx_entry;
+};
+
 struct iommu_regset {
int offset;
const char *regs;
@@ -131,16 +138,25 @@ static int iommu_regset_show(struct seq_file *m, void 
*unused)
 }
 DEFINE_SHOW_ATTRIBUTE(iommu_regset);
 
-static void ctx_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu,
-  int bus)
+static inline void print_tbl_walk(struct seq_file *m)
 {
-   struct context_entry *context;
-   int devfn;
+   struct tbl_walk *tbl_wlk = m->private;
 
-   seq_printf(m, " Context Table Entries for Bus: %d\n", bus);
-   seq_puts(m, "  Entry\tB:D.F\tHigh\tLow\n");
+   seq_printf(m, 
"%02x:%02x.%x\t0x%016llx:0x%016llx\t0x%016llx:0x%016llx\n",
+  tbl_wlk->bus, PCI_SLOT(tbl_wlk->devfn),
+  PCI_FUNC(tbl_wlk->devfn), tbl_wlk->rt_entry->hi,
+  tbl_wlk->rt_entry->lo, tbl_wlk->ctx_entry->hi,
+  tbl_wlk->ctx_entry->lo);
+}
+
+static void ctx_tbl_walk(struct seq_file *m, struct intel_iommu *iommu, u16 
bus)
+{
+   struct context_entry *context;
+   u16 devfn;
 
for (devfn = 0; devfn < 256; devfn++) {
+   struct tbl_walk tbl_wlk = {0};
+
context = iommu_context_addr(iommu, bus, devfn, 0);
if (!context)
return;
@@ -148,33 +164,34 @@ static void ctx_tbl_entry_show(struct seq_file *m, struct 
intel_iommu *iommu,
if (!context_present(context))
continue;
 
-   seq_printf(m, "  %-5d\t%02x:%02x.%x\t%-6llx\t%llx\n", devfn,
-  bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
-  context[0].hi, context[0].lo);
+   tbl_wlk.bus = bus;
+   tbl_wlk.devfn = devfn;
+   tbl_wlk.rt_entry = >root_entry[bus];
+   tbl_wlk.ctx_entry = context;
+   m->private = _wlk;
+
+   print_tbl_walk(m);
}
 }
 
-static void root_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu)
+static void root_tbl_walk(struct seq_file *m, struct intel_iommu *iommu)
 {
unsigned long flags;
-   int bus;
+   u16 bus;
 
spin_lock_irqsave(>lock, flags);
-   seq_printf(m, "IOMMU %s: Root Table Address:%llx\n", iommu->name,
+   seq_printf(m, "IOMMU %s: Root Table Address: 0x%llx\n", iommu->name,
   (u64)virt_to_phys(iommu->root_entry));
-   seq_puts(m, "Root Table Entries:\n");
-
-   for (bus = 0; bus < 256; bus++) {
-   if (!(iommu->root_entry[bus].lo & 1))
-   continue;
+   seq_puts(m, "B.D.F\tRoot_entry\t\t\t\tContext_entry\n");
 
-   seq_printf(m, " Bus: %d H: %llx L: %llx\n", bus,
-  iommu->root_entry[bus].hi,
-  iommu->root_entry[bus].lo);
+   /*
+* No need to check if the root entry is present or not because
+* iommu_con

[PATCH V2 3/3] iommu/vt-d: Add debugfs support to show scalable mode DMAR table internals

2019-05-10 Thread Sai Praneeth Prakhya
A DMAR table walk would typically follow the below process.
1. Bus number is used to index into root table which points to a context
   table.
2. Device number and Function number are used together to index into
   context table which then points to a pasid directory.
3. PASID[19:6] is used to index into PASID directory which points to a
   PASID table.
4. PASID[5:0] is used to index into PASID table which points to all levels
   of page tables.

Whenever a user opens the file
"/sys/kernel/debug/iommu/intel/dmar_translation_struct", the above
described DMAR table walk is performed and the contents of the table are
dumped into the file. The dump could be handy while dealing with devices
that use PASID.

Example of such dump:
cat /sys/kernel/debug/iommu/intel/dmar_translation_struct

(Please note that because of 80 char limit, entries that should have been
in the same line are broken into different lines)

IOMMU dmar0: Root Table Address: 0x436f7c000
B.D.F   Root_entry  Context_entry
PASID   PASID_table_entry
00:0a.0 0x:0x00044dd3f001   
0x0010:0x000435460e1d
0   0x00044d6e1089:0x0003:0x0001
00:0a.0 0x:0x00044dd3f001   
0x0010:0x000435460e1d
1   0x0049:0x0001:0x03c0e001

Note that the above format is followed even for legacy DMAR table dump
which doesn't support PASID and hence in such cases PASID is defaulted to
-1 indicating that PASID and it's related fields are invalid.

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Reviewed-by: Lu Baolu 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu-debugfs.c | 80 +++--
 1 file changed, 76 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel-iommu-debugfs.c 
b/drivers/iommu/intel-iommu-debugfs.c
index 3f5399b5e6c0..f1b0f82373bb 100644
--- a/drivers/iommu/intel-iommu-debugfs.c
+++ b/drivers/iommu/intel-iommu-debugfs.c
@@ -14,11 +14,15 @@
 
 #include 
 
+#include "intel-pasid.h"
+
 struct tbl_walk {
u16 bus;
u16 devfn;
+   u32 pasid;
struct root_entry *rt_entry;
struct context_entry *ctx_entry;
+   struct pasid_entry *pasid_tbl_entry;
 };
 
 struct iommu_regset {
@@ -142,21 +146,82 @@ static inline void print_tbl_walk(struct seq_file *m)
 {
struct tbl_walk *tbl_wlk = m->private;
 
-   seq_printf(m, 
"%02x:%02x.%x\t0x%016llx:0x%016llx\t0x%016llx:0x%016llx\n",
+   seq_printf(m, 
"%02x:%02x.%x\t0x%016llx:0x%016llx\t0x%016llx:0x%016llx\t",
   tbl_wlk->bus, PCI_SLOT(tbl_wlk->devfn),
   PCI_FUNC(tbl_wlk->devfn), tbl_wlk->rt_entry->hi,
   tbl_wlk->rt_entry->lo, tbl_wlk->ctx_entry->hi,
   tbl_wlk->ctx_entry->lo);
+
+   /*
+* A legacy mode DMAR doesn't support PASID, hence default it to -1
+* indicating that it's invalid. Also, default all PASID related fields
+* to 0.
+*/
+   if (!tbl_wlk->pasid_tbl_entry)
+   seq_printf(m, "%-6d\t0x%016llx:0x%016llx:0x%016llx\n", -1,
+  (u64)0, (u64)0, (u64)0);
+   else
+   seq_printf(m, "%-6d\t0x%016llx:0x%016llx:0x%016llx\n",
+  tbl_wlk->pasid, tbl_wlk->pasid_tbl_entry->val[0],
+  tbl_wlk->pasid_tbl_entry->val[1],
+  tbl_wlk->pasid_tbl_entry->val[2]);
+}
+
+static void pasid_tbl_walk(struct seq_file *m, struct pasid_entry *tbl_entry,
+  u16 dir_idx)
+{
+   struct tbl_walk *tbl_wlk = m->private;
+   u8 tbl_idx;
+
+   for (tbl_idx = 0; tbl_idx < PASID_TBL_ENTRIES; tbl_idx++) {
+   if (pasid_pte_is_present(tbl_entry)) {
+   tbl_wlk->pasid_tbl_entry = tbl_entry;
+   tbl_wlk->pasid = (dir_idx << PASID_PDE_SHIFT) + tbl_idx;
+   print_tbl_walk(m);
+   }
+
+   tbl_entry++;
+   }
+}
+
+static void pasid_dir_walk(struct seq_file *m, u64 pasid_dir_ptr,
+  u16 pasid_dir_size)
+{
+   struct pasid_dir_entry *dir_entry = phys_to_virt(pasid_dir_ptr);
+   struct pasid_entry *pasid_tbl;
+   u16 dir_idx;
+
+   for (dir_idx = 0; dir_idx < pasid_dir_size; dir_idx++) {
+   pasid_tbl = get_pasid_table_from_pde(dir_entry);
+   if (pasid_tbl)
+   pasid_tbl_walk(m, pasid_tbl, dir_idx);
+
+   dir_entry++;
+   }
 }
 
 static void ctx_tbl_walk(struct seq_file *m, struct intel_iommu *iommu, u16 
bus)
 {
struct context_entry *context;
-   u16 devfn;
+   u16 dev

[PATCH V2 2/3] iommu/vt-d: Introduce macros useful for dumping DMAR table

2019-05-10 Thread Sai Praneeth Prakhya
A scalable mode DMAR table walk would involve looking at bits in each stage
of walk, like,
1. Is PASID enabled in the context entry?
2. What's the size of PASID directory?
3. Is the PASID directory entry present?
4. Is the PASID table entry present?
5. Number of PASID table entries?

Hence, add these macros that will later be used during this walk.
Apart from adding new macros, move existing macros (like
pasid_pde_is_present() and get_pasid_table_from_pde()) from pasid.c file
to pasid.h header file so that they could be reused.

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Reviewed-by: Lu Baolu 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-pasid.c | 17 -
 drivers/iommu/intel-pasid.h | 26 ++
 2 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 03b12d2ee213..0be00ff53d25 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -167,23 +167,6 @@ int intel_pasid_alloc_table(struct device *dev)
return 0;
 }
 
-/* Get PRESENT bit of a PASID directory entry. */
-static inline bool
-pasid_pde_is_present(struct pasid_dir_entry *pde)
-{
-   return READ_ONCE(pde->val) & PASID_PTE_PRESENT;
-}
-
-/* Get PASID table from a PASID directory entry. */
-static inline struct pasid_entry *
-get_pasid_table_from_pde(struct pasid_dir_entry *pde)
-{
-   if (!pasid_pde_is_present(pde))
-   return NULL;
-
-   return phys_to_virt(READ_ONCE(pde->val) & PDE_PFN_MASK);
-}
-
 void intel_pasid_free_table(struct device *dev)
 {
struct device_domain_info *info;
diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
index 23537b3f34e3..fc8cd8f17de1 100644
--- a/drivers/iommu/intel-pasid.h
+++ b/drivers/iommu/intel-pasid.h
@@ -18,6 +18,10 @@
 #define PDE_PFN_MASK   PAGE_MASK
 #define PASID_PDE_SHIFT6
 #define MAX_NR_PASID_BITS  20
+#define PASID_TBL_ENTRIES  BIT(PASID_PDE_SHIFT)
+
+#define is_pasid_enabled(entry)(((entry)->lo >> 3) & 0x1)
+#define get_pasid_dir_size(entry)  (1 << entry)->lo >> 9) & 0x7) + 7))
 
 /*
  * Domain ID reserved for pasid entries programmed for first-level
@@ -49,6 +53,28 @@ struct pasid_table {
struct list_headdev;/* device list */
 };
 
+/* Get PRESENT bit of a PASID directory entry. */
+static inline bool pasid_pde_is_present(struct pasid_dir_entry *pde)
+{
+   return READ_ONCE(pde->val) & PASID_PTE_PRESENT;
+}
+
+/* Get PASID table from a PASID directory entry. */
+static inline struct pasid_entry *
+get_pasid_table_from_pde(struct pasid_dir_entry *pde)
+{
+   if (!pasid_pde_is_present(pde))
+   return NULL;
+
+   return phys_to_virt(READ_ONCE(pde->val) & PDE_PFN_MASK);
+}
+
+/* Get PRESENT bit of a PASID table entry. */
+static inline bool pasid_pte_is_present(struct pasid_entry *pte)
+{
+   return READ_ONCE(pte->val[0]) & PASID_PTE_PRESENT;
+}
+
 extern u32 intel_pasid_max_id;
 int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp);
 void intel_pasid_free_id(int pasid);
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH V2 0/3] Add debugfs support to show scalable mode DMAR table

2019-05-10 Thread Sai Praneeth Prakhya
Presently, "/sys/kernel/debug/iommu/intel/dmar_translation_struct" file dumps
only legacy DMAR table which consists of root table and context table. Scalable
mode DMAR table adds PASID directory and PASID table. Hence, add support to dump
these tables as well.

Directly extending the present dumping format for PASID tables will make the
output look clumsy. Hence, the first patch modifies the present format to a
tabular format. The second patch introduces macros that are used during PASID
table walk and the third patch actually adds support to dump scalable mode DMAR
table.

Changes from V1 to V2:
--
1. Make my name consistent in "From" and "Signed-off-by"
2. Fix comment regarding scalable mode context entries
3. Add reviewed by tags

Sai Praneeth Prakhya (3):
  iommu/vt-d: Modify the format of intel DMAR tables dump
  iommu/vt-d: Introduce macros useful for dumping DMAR table
  iommu/vt-d: Add debugfs support to show scalable mode DMAR table
internals

 drivers/iommu/intel-iommu-debugfs.c | 137 +---
 drivers/iommu/intel-pasid.c |  17 -
 drivers/iommu/intel-pasid.h |  26 +++
 3 files changed, 139 insertions(+), 41 deletions(-)

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Reviewed-by: Lu Baolu 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 

-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 3/3] iommu/vt-d: Add debugfs support to show scalable mode DMAR table internals

2019-05-09 Thread Sai Praneeth Prakhya
From: Sai Praneeth 

A DMAR table walk would typically follow the below process.
1. Bus number is used to index into root table which points to a context
   table.
2. Device number and Function number are used together to index into
   context table which then points to a pasid directory.
3. PASID[19:6] is used to index into PASID directory which points to a
   PASID table.
4. PASID[5:0] is used to index into PASID table which points to all levels
   of page tables.

Whenever a user opens the file
"/sys/kernel/debug/iommu/intel/dmar_translation_struct", the above
described DMAR table walk is performed and the contents of the table are
dumped into the file. The dump could be handy while dealing with devices
that use PASID.

Example of such dump:
cat /sys/kernel/debug/iommu/intel/dmar_translation_struct

(Please note that because of 80 char limit, entries that should have been
in the same line are broken into different lines)

IOMMU dmar0: Root Table Address: 0x436f7c000
B.D.F   Root_entry  Context_entry
PASID   PASID_table_entry
00:0a.0 0x:0x00044dd3f001   
0x0010:0x000435460e1d
0   0x00044d6e1089:0x0003:0x0001
00:0a.0 0x:0x00044dd3f001   
0x0010:0x000435460e1d
1   0x0049:0x0001:0x03c0e001

Note that the above format is followed even for legacy DMAR table dump
which doesn't support PASID and hence in such cases PASID is defaulted to
-1 indicating that PASID and it's related fields are invalid.

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu-debugfs.c | 75 +++--
 1 file changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel-iommu-debugfs.c 
b/drivers/iommu/intel-iommu-debugfs.c
index 3f5399b5e6c0..8982e93a50d7 100644
--- a/drivers/iommu/intel-iommu-debugfs.c
+++ b/drivers/iommu/intel-iommu-debugfs.c
@@ -14,11 +14,15 @@
 
 #include 
 
+#include "intel-pasid.h"
+
 struct tbl_walk {
u16 bus;
u16 devfn;
+   u32 pasid;
struct root_entry *rt_entry;
struct context_entry *ctx_entry;
+   struct pasid_entry *pasid_tbl_entry;
 };
 
 struct iommu_regset {
@@ -142,21 +146,77 @@ static inline void print_tbl_walk(struct seq_file *m)
 {
struct tbl_walk *tbl_wlk = m->private;
 
-   seq_printf(m, 
"%02x:%02x.%x\t0x%016llx:0x%016llx\t0x%016llx:0x%016llx\n",
+   seq_printf(m, 
"%02x:%02x.%x\t0x%016llx:0x%016llx\t0x%016llx:0x%016llx\t",
   tbl_wlk->bus, PCI_SLOT(tbl_wlk->devfn),
   PCI_FUNC(tbl_wlk->devfn), tbl_wlk->rt_entry->hi,
   tbl_wlk->rt_entry->lo, tbl_wlk->ctx_entry->hi,
   tbl_wlk->ctx_entry->lo);
+
+   /*
+* A legacy mode DMAR doesn't support PASID, hence default it to -1
+* indicating that it's invalid. Also, default all PASID related fields
+* to 0.
+*/
+   if (!tbl_wlk->pasid_tbl_entry)
+   seq_printf(m, "%-6d\t0x%016llx:0x%016llx:0x%016llx\n", -1,
+  (u64)0, (u64)0, (u64)0);
+   else
+   seq_printf(m, "%-6d\t0x%016llx:0x%016llx:0x%016llx\n",
+  tbl_wlk->pasid, tbl_wlk->pasid_tbl_entry->val[0],
+  tbl_wlk->pasid_tbl_entry->val[1],
+  tbl_wlk->pasid_tbl_entry->val[2]);
+}
+
+static void pasid_tbl_walk(struct seq_file *m, struct pasid_entry *tbl_entry,
+  u16 dir_idx)
+{
+   struct tbl_walk *tbl_wlk = m->private;
+   u8 tbl_idx;
+
+   for (tbl_idx = 0; tbl_idx < PASID_TBL_ENTRIES; tbl_idx++) {
+   if (pasid_pte_is_present(tbl_entry)) {
+   tbl_wlk->pasid_tbl_entry = tbl_entry;
+   tbl_wlk->pasid = (dir_idx << PASID_PDE_SHIFT) + tbl_idx;
+   print_tbl_walk(m);
+   }
+
+   tbl_entry++;
+   }
+}
+
+static void pasid_dir_walk(struct seq_file *m, u64 pasid_dir_ptr,
+  u16 pasid_dir_size)
+{
+   struct pasid_dir_entry *dir_entry = phys_to_virt(pasid_dir_ptr);
+   struct pasid_entry *pasid_tbl;
+   u16 dir_idx;
+
+   for (dir_idx = 0; dir_idx < pasid_dir_size; dir_idx++) {
+   pasid_tbl = get_pasid_table_from_pde(dir_entry);
+   if (pasid_tbl)
+   pasid_tbl_walk(m, pasid_tbl, dir_idx);
+
+   dir_entry++;
+   }
 }
 
 static void ctx_tbl_walk(struct seq_file *m, struct intel_iommu *iommu, u16 
bus)
 {
struct context_entry *context;
-   u16 devfn;
+   u16 dev

[PATCH 2/3] iommu/vt-d: Introduce macros useful for dumping DMAR table

2019-05-09 Thread Sai Praneeth Prakhya
From: Sai Praneeth 

A scalable mode DMAR table walk would involve looking at bits in each stage
of walk, like,
1. Is PASID enabled in the context entry?
2. What's the size of PASID directory?
3. Is the PASID directory entry present?
4. Is the PASID table entry present?
5. Number of PASID table entries?

Hence, add these macros that will later be used during this walk.
Apart from adding new macros, move existing macros (like
pasid_pde_is_present() and get_pasid_table_from_pde()) from pasid.c file
to pasid.h header file so that they could be reused.

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-pasid.c | 17 -
 drivers/iommu/intel-pasid.h | 26 ++
 2 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 03b12d2ee213..0be00ff53d25 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -167,23 +167,6 @@ int intel_pasid_alloc_table(struct device *dev)
return 0;
 }
 
-/* Get PRESENT bit of a PASID directory entry. */
-static inline bool
-pasid_pde_is_present(struct pasid_dir_entry *pde)
-{
-   return READ_ONCE(pde->val) & PASID_PTE_PRESENT;
-}
-
-/* Get PASID table from a PASID directory entry. */
-static inline struct pasid_entry *
-get_pasid_table_from_pde(struct pasid_dir_entry *pde)
-{
-   if (!pasid_pde_is_present(pde))
-   return NULL;
-
-   return phys_to_virt(READ_ONCE(pde->val) & PDE_PFN_MASK);
-}
-
 void intel_pasid_free_table(struct device *dev)
 {
struct device_domain_info *info;
diff --git a/drivers/iommu/intel-pasid.h b/drivers/iommu/intel-pasid.h
index 23537b3f34e3..fc8cd8f17de1 100644
--- a/drivers/iommu/intel-pasid.h
+++ b/drivers/iommu/intel-pasid.h
@@ -18,6 +18,10 @@
 #define PDE_PFN_MASK   PAGE_MASK
 #define PASID_PDE_SHIFT6
 #define MAX_NR_PASID_BITS  20
+#define PASID_TBL_ENTRIES  BIT(PASID_PDE_SHIFT)
+
+#define is_pasid_enabled(entry)(((entry)->lo >> 3) & 0x1)
+#define get_pasid_dir_size(entry)  (1 << entry)->lo >> 9) & 0x7) + 7))
 
 /*
  * Domain ID reserved for pasid entries programmed for first-level
@@ -49,6 +53,28 @@ struct pasid_table {
struct list_headdev;/* device list */
 };
 
+/* Get PRESENT bit of a PASID directory entry. */
+static inline bool pasid_pde_is_present(struct pasid_dir_entry *pde)
+{
+   return READ_ONCE(pde->val) & PASID_PTE_PRESENT;
+}
+
+/* Get PASID table from a PASID directory entry. */
+static inline struct pasid_entry *
+get_pasid_table_from_pde(struct pasid_dir_entry *pde)
+{
+   if (!pasid_pde_is_present(pde))
+   return NULL;
+
+   return phys_to_virt(READ_ONCE(pde->val) & PDE_PFN_MASK);
+}
+
+/* Get PRESENT bit of a PASID table entry. */
+static inline bool pasid_pte_is_present(struct pasid_entry *pte)
+{
+   return READ_ONCE(pte->val[0]) & PASID_PTE_PRESENT;
+}
+
 extern u32 intel_pasid_max_id;
 int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp);
 void intel_pasid_free_id(int pasid);
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/3] iommu/vt-d: Modify the format of intel DMAR tables dump

2019-05-09 Thread Sai Praneeth Prakhya
From: Sai Praneeth 

Presently, "/sys/kernel/debug/iommu/intel/dmar_translation_struct" file
dumps DMAR tables in the below format

IOMMU dmar2: Root Table Address:4362cc000
Root Table Entries:
 Bus: 0 H: 0 L: 4362f0001
 Context Table Entries for Bus: 0
  Entry B:D.F   HighLow
  160   00:14.0 102 4362ef001
  184   00:17.0 302 435ec4001
  248   00:1f.0 202 43631

This format has few short comings like
1. When extended for dumping scalable mode DMAR table it will quickly be
   very clumsy, making it unreadable.
2. It has information like the Bus number and Entry which are basically
   part of B:D.F, hence are a repetition and are not so useful.

So, change it to a new format which could be easily extended to dump
scalable mode DMAR table. The new format looks as below:

IOMMU dmar2: Root Table Address: 0x436f7d000
B.D.F   Root_entry  Context_entry
00:14.0 0x:0x000436fbd001   
0x0102:0x000436fbc001
00:17.0 0x:0x000436fbd001   
0x0302:0x000436af4001
00:1f.0 0x:0x000436fbd001   
0x0202:0x000436fcd001

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 
---
 drivers/iommu/intel-iommu-debugfs.c | 65 +++--
 1 file changed, 41 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/intel-iommu-debugfs.c 
b/drivers/iommu/intel-iommu-debugfs.c
index 7fabf9b1c2dc..3f5399b5e6c0 100644
--- a/drivers/iommu/intel-iommu-debugfs.c
+++ b/drivers/iommu/intel-iommu-debugfs.c
@@ -14,6 +14,13 @@
 
 #include 
 
+struct tbl_walk {
+   u16 bus;
+   u16 devfn;
+   struct root_entry *rt_entry;
+   struct context_entry *ctx_entry;
+};
+
 struct iommu_regset {
int offset;
const char *regs;
@@ -131,16 +138,25 @@ static int iommu_regset_show(struct seq_file *m, void 
*unused)
 }
 DEFINE_SHOW_ATTRIBUTE(iommu_regset);
 
-static void ctx_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu,
-  int bus)
+static inline void print_tbl_walk(struct seq_file *m)
 {
-   struct context_entry *context;
-   int devfn;
+   struct tbl_walk *tbl_wlk = m->private;
 
-   seq_printf(m, " Context Table Entries for Bus: %d\n", bus);
-   seq_puts(m, "  Entry\tB:D.F\tHigh\tLow\n");
+   seq_printf(m, 
"%02x:%02x.%x\t0x%016llx:0x%016llx\t0x%016llx:0x%016llx\n",
+  tbl_wlk->bus, PCI_SLOT(tbl_wlk->devfn),
+  PCI_FUNC(tbl_wlk->devfn), tbl_wlk->rt_entry->hi,
+  tbl_wlk->rt_entry->lo, tbl_wlk->ctx_entry->hi,
+  tbl_wlk->ctx_entry->lo);
+}
+
+static void ctx_tbl_walk(struct seq_file *m, struct intel_iommu *iommu, u16 
bus)
+{
+   struct context_entry *context;
+   u16 devfn;
 
for (devfn = 0; devfn < 256; devfn++) {
+   struct tbl_walk tbl_wlk = {0};
+
context = iommu_context_addr(iommu, bus, devfn, 0);
if (!context)
return;
@@ -148,33 +164,34 @@ static void ctx_tbl_entry_show(struct seq_file *m, struct 
intel_iommu *iommu,
if (!context_present(context))
continue;
 
-   seq_printf(m, "  %-5d\t%02x:%02x.%x\t%-6llx\t%llx\n", devfn,
-  bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
-  context[0].hi, context[0].lo);
+   tbl_wlk.bus = bus;
+   tbl_wlk.devfn = devfn;
+   tbl_wlk.rt_entry = >root_entry[bus];
+   tbl_wlk.ctx_entry = context;
+   m->private = _wlk;
+
+   print_tbl_walk(m);
}
 }
 
-static void root_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu)
+static void root_tbl_walk(struct seq_file *m, struct intel_iommu *iommu)
 {
unsigned long flags;
-   int bus;
+   u16 bus;
 
spin_lock_irqsave(>lock, flags);
-   seq_printf(m, "IOMMU %s: Root Table Address:%llx\n", iommu->name,
+   seq_printf(m, "IOMMU %s: Root Table Address: 0x%llx\n", iommu->name,
   (u64)virt_to_phys(iommu->root_entry));
-   seq_puts(m, "Root Table Entries:\n");
-
-   for (bus = 0; bus < 256; bus++) {
-   if (!(iommu->root_entry[bus].lo & 1))
-   continue;
+   seq_puts(m, "B.D.F\tRoot_entry\t\t\t\tContext_entry\n");
 
-   seq_printf(m, " Bus: %d H: %llx L: %llx\n", bus,
-  iommu->root_entry[bus].hi,
-  iommu->root_entry[bus].lo);
+   /*
+* No need to check if the root entry is present or not because
+* iommu_context_addr() performs the same check bef

[PATCH 0/3] Add debugfs support to show scalable mode DMAR table

2019-05-09 Thread Sai Praneeth Prakhya
From: Sai Praneeth 

Presently, "/sys/kernel/debug/iommu/intel/dmar_translation_struct" file dumps
only legacy DMAR table which consists of root table and context table. Scalable
mode DMAR table adds PASID directory and PASID table. Hence, add support to dump
these tables as well.

Directly extending the present dumping format for PASID tables will make the
output look clumsy. Hence, the first patch modifies the present format to a
tabular format. The second patch introduces macros that are used during PASID
table walk and the third patch actually adds support to dump scalable mode DMAR
table.

Sai Praneeth (3):
  iommu/vt-d: Modify the format of intel DMAR tables dump
  iommu/vt-d: Introduce macros useful for dumping DMAR table
  iommu/vt-d: Add debugfs support to show scalable mode DMAR table
internals

 drivers/iommu/intel-iommu-debugfs.c | 132 +---
 drivers/iommu/intel-pasid.c |  17 -
 drivers/iommu/intel-pasid.h |  26 +++
 3 files changed, 134 insertions(+), 41 deletions(-)

Cc: Joerg Roedel 
Cc: Ashok Raj 
Cc: Lu Baolu 
Cc: Sohil Mehta 
Cc: David Woodhouse 
Cc: Jacob Pan 
Cc: Andy Shevchenko 
Signed-off-by: Sai Praneeth Prakhya 

-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu