Re: [PATCH v4 0/3] Replace private domain with per-group default domain

2020-05-05 Thread Daniel Drake
On Wed, May 6, 2020 at 10:03 AM Lu Baolu  wrote:
> https://lkml.org/lkml/2020/4/14/616
> [This has been applied in iommu/next.]
>
> Hence, there is no need to keep the private domain implementation
> in the Intel IOMMU driver. This patch series aims to remove it.

I applied these patches on top of Joerg's branch and confirmed that
they fix the issue discussed in the thread:

[PATCH v2] iommu/vt-d: consider real PCI device when checking if
mapping is needed
(the patch there is no longer needed)

Tested-by: Daniel Drake 

Thanks!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code

2020-04-16 Thread Daniel Drake
Hi Joerg,

> Hi,
> 
> here is the second version of this patch-set. The first version with
> some more introductory text can be found here:
> 
>   https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/

Thanks for the continued improvements in this area!

I may have spotted a problem with setups like VMD.

The core PCI bus is set up during early boot.
Then, for the PCI bus, we reach iommu_bus_init() -> bus_iommu_probe().
In there, we call probe_iommu_group() -> dev_iommu_get() for each PCI
device, which allocates dev->iommu in each case. So far so good.

The problem is that this is the last time that we'll call dev_iommu_get().
If any PCI bus devices get added after this point, they do not get passed
to dev_iommu_get().

So when the vmd module gets loaded later, and creates more PCI devices,
we end up in iommu_bus_notifier() -> iommu_probe_device()
-> __iommu_probe_device() which does:

dev->iommu->iommu_dev = iommu_dev;

dev->iommu-> is a NULL dereference because dev_iommu_get() was never
called for this new device.

Daniel

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] iommu/vt-d: use DMA domain for real DMA devices and subdevices

2020-04-12 Thread Daniel Drake
On Fri, Apr 10, 2020 at 9:22 AM Lu Baolu  wrote:
> This is caused by the fragile private domain implementation. We are in
> process of removing it by enhancing the iommu subsystem with per-group
> default domain.
>
> https://www.spinics.net/lists/iommu/msg42976.html
>
> So ultimately VMD subdevices should have their own per-device iommu data
> and support per-device dma ops.

Interesting. There's also this patchset you posted:
[PATCH 00/19] [PULL REQUEST] iommu/vt-d: patches for v5.7
https://lists.linuxfoundation.org/pipermail/iommu/2020-April/042967.html
(to be pushed out to 5.8)

In there you have:
> iommu/vt-d: Don't force 32bit devices to uses DMA domain
which seems to clash with the approach being explored in this thread.

And:
> iommu/vt-d: Apply per-device dma_ops
This effectively solves the trip point that caused me to open these
discussions, where intel_map_page() -> iommu_need_mapping() would
incorrectly determine that a intel-iommu DMA mapping was needed for a
PCI subdevice running in identity mode. After this patch, a PCI
subdevice in identity mode uses the default system dma_ops and
completely avoids intel-iommu.

So that solves the issues I was looking at. Jon, you might want to
check if the problems you see are likewise solved for you by these
patches.

I didn't try Joerg's iommu group rework yet as it conflicts with those
patches above.

Daniel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] iommu/vt-d: use DMA domain for real DMA devices and subdevices

2020-04-11 Thread Daniel Drake
Hi Jon,

Thanks for picking this up. Apologies for my absence here - I wasn't
able to work on this recently, but I'm back again now.

On Fri, Apr 10, 2020 at 3:32 AM Jon Derrick  wrote:
> This becomes problematic if the real DMA device and the subdevices have
> different addressing capabilities and some require translation. Instead we can
> put the real DMA dev and any subdevices on the DMA domain. This change assigns
> subdevices to the DMA domain, and moves the real DMA device to the DMA domain
> if necessary.

Have you tested this with the real DMA device in identity mode?
It is not quite working for me. (Again, I'm not using VMD here, but
have looked closely and believe we're working under the same
constraints)

First, the real DMA device gets added to the group:
 pci :00:17.0: Adding to iommu group 9
(it's in IDENTITY mode here)

Then later, the first subdevice comes along, and these are the results:
 pci 1:00:00.0: [8086:02d7] type 00 class 0x010601
 pci 1:00:00.0: reg 0x10: [mem 0xae1a-0xae1a7fff]
 pci 1:00:00.0: reg 0x14: [mem 0xae1a8000-0xae1a80ff]
 pci 1:00:00.0: reg 0x18: [io  0x3090-0x3097]
 pci 1:00:00.0: reg 0x1c: [io  0x3080-0x3083]
 pci 1:00:00.0: reg 0x20: [io  0x3060-0x307f]
 pci 1:00:00.0: reg 0x24: [mem 0xae10-0xae103fff]
 pci 1:00:00.0: PME# supported from D3hot
 pci 1:00:00.0: Adding to iommu group 9
 pci 1:00:00.0: DMAR: Failed to get a private domain.

That final message is added by your patch and indicates that it's not working.

This is because the subdevice got added to the iommu group before the
code you added tried to change to the DMA domain.

It first gets added to the group through this call path:
intel_iommu_add_device
-> iommu_group_get_for_dev
-> iommu_group_add_device

Then, continuing within intel_iommu_add_device we get to the code you
added, which tries to move the real DMA dev to DMA mode instead. It
calls:

   intel_iommu_request_dma_domain_for_dev
-> iommu_request_dma_domain_for_dev
-> request_default_domain_for_dev

Which fails here:
/* Don't change mappings of existing devices */
ret = -EBUSY;
if (iommu_group_device_count(group) != 1)
goto out;

because we already have 2 devices in the group (the real DMA dev, plus
the subdevice we're in the process of handling now).

Next I'll look into the iommu group rework that Baolu mentioned.

Thanks,
Daniel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed

2020-02-20 Thread Daniel Drake
> On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu  wrote:
> > With respect, this is problematical. The parent and all subdevices share
> > a single translation entry. The DMA mask should be consistent.
> >
> > Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> > an identity domain for DMA translation. While subdevice B has 32-bit DMA
> > capability and is forced to switch to DMA domain. Subdevice A will be
> > impacted without any notification.

Looking closer, this problematic codepath may already be happening for VMD,
under intel_iommu_add_device(). Consider this function running for a VMD
subdevice, we hit:

    domain = iommu_get_domain_for_dev(dev);

I can't quite grasp the code flow here, but domain->type now always seems
to return the domain type of the real DMA device, which seems like pretty
reasonable behaviour.

    if (domain->type == IOMMU_DOMAIN_DMA) {

and as detailed in previous mails, the real VMD device seems to be in a DMA
domain by default, so we continue.

        if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {

Now we checked the default domain type of the subdevice. This seems rather
likely to return IDENTITY because that's effectively the default type...

            ret = iommu_request_dm_for_dev(dev);
            if (ret) {
                dmar_remove_one_dev_info(dev);
                dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
                domain_add_dev_info(si_domain, dev);
                dev_info(dev,
                     "Device uses a private identity domain.\n");
            }
        }

and now we're doing the bad stuff that Lu pointed out: we only have one
mapping shared for all the subdevices, so if we end up changing it for one
subdevice, we're likely to be breaking another.
In this case iommu_request_dm_for_dev() returns -EBUSY without doing anything
and the following private identity code fortunately seems to have no
consequential effects - the real DMA device continues to operate in the DMA
domain, and all subdevice DMA requests go through the DMA mapping codepath.
That's probably why VMD appears to be working fine anyway, but this seems
fragile.

The following changes enforce that the real DMA device is in the DMA domain,
and avoid the intel_iommu_add_device() codepaths that would try to change
it to a different domain type. Let me know if I'm on the right lines...
---
 drivers/iommu/intel-iommu.c   | 16 
 drivers/pci/controller/intel-nvme-remap.c |  6 ++
 2 files changed, 22 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9644a5b3e0ae..8872b8d1780d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2911,6 +2911,9 @@ static int device_def_domain_type(struct device *dev)
if (dev_is_pci(dev)) {
struct pci_dev *pdev = to_pci_dev(dev);
 
+   if (pci_real_dma_dev(pdev) != pdev)
+   return IOMMU_DOMAIN_DMA;
+
if (device_is_rmrr_locked(dev))
return IOMMU_DOMAIN_DMA;
 
@@ -5580,6 +5583,7 @@ static bool intel_iommu_capable(enum iommu_cap cap)
 
 static int intel_iommu_add_device(struct device *dev)
 {
+   struct device *real_dev = dev;
struct dmar_domain *dmar_domain;
struct iommu_domain *domain;
struct intel_iommu *iommu;
@@ -5591,6 +5595,17 @@ static int intel_iommu_add_device(struct device *dev)
if (!iommu)
return -ENODEV;
 
+   if (dev_is_pci(dev))
+   real_dev = _real_dma_dev(to_pci_dev(dev))->dev;
+
+   if (real_dev != dev) {
+   domain = iommu_get_domain_for_dev(real_dev);
+   if (domain->type != IOMMU_DOMAIN_DMA) {
+   dev_err(dev, "Real DMA device not in DMA domain; can't 
handle DMA\n");
+   return -ENODEV;
+   }
+   }
+
iommu_device_link(>iommu, dev);
 
if (translation_pre_enabled(iommu))

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed

2020-02-19 Thread Daniel Drake
On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu  wrote:
> With respect, this is problematical. The parent and all subdevices share
> a single translation entry. The DMA mask should be consistent.
>
> Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> an identity domain for DMA translation. While subdevice B has 32-bit DMA
> capability and is forced to switch to DMA domain. Subdevice A will be
> impacted without any notification.

I see what you mean.

Perhaps we should just ensure that setups involving such real DMA
devices and subdevices should always use the DMA domain, avoiding this
type of complication. That's apparently even the default for VMD. This
is probably something that should be forced/ensured when the real DMA
device gets registered, because similarly to the noted case, we can't
risk any identity mappings having been created on the real device if
we later decide to move it into the DMA domain based on the appearance
of subdevices.

Jon, any thoughts?

Daniel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed

2020-02-18 Thread Daniel Drake
From: Jon Derrick 

The PCI devices handled by intel-iommu may have a DMA requester on
another bus, such as VMD subdevices needing to use the VMD endpoint.

The real DMA device is now used for the DMA mapping, but one case was
missed earlier: if the VMD device (and hence subdevices too) are under
IOMMU_DOMAIN_IDENTITY, mappings do not work.

Codepaths like intel_map_page() handle the IOMMU_DOMAIN_DMA case by
creating an iommu DMA mapping, and fall back on dma_direct_map_page()
for the IOMMU_DOMAIN_IDENTITY case. However, handling of the IDENTITY
case is broken when intel_page_page() handles a subdevice.

We observe that at iommu attach time, dmar_insert_one_dev_info() for
the subdevices will never set dev->archdata.iommu. This is because
that function uses find_domain() to check if there is already an IOMMU
for the device, and find_domain() then defers to the real DMA device
which does have one. Thus dmar_insert_one_dev_info() returns without
assigning dev->archdata.iommu.

Then, later:

1. intel_map_page() checks if an IOMMU mapping is needed by calling
   iommu_need_mapping() on the subdevice. identity_mapping() returns
   false because dev->archdata.iommu is NULL, so this function
   returns false indicating that mapping is needed.
2. __intel_map_single() is called to create the mapping.
3. __intel_map_single() calls find_domain(). This function now returns
   the IDENTITY domain corresponding to the real DMA device.
4. __intel_map_single() calls domain_get_iommu() on this "real" domain.
   A failure is hit and the entire operation is aborted, because this
   codepath is not intended to handle IDENTITY mappings:
   if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
   return NULL;

Fix this by using the real DMA device when checking if a mapping is
needed. The IDENTITY case will then directly fall back on
dma_direct_map_page(). The subdevice DMA mask is still considered in
order to handle any situations where (e.g.) the subdevice only supports
32-bit DMA with the real DMA requester having a 64-bit DMA mask.

Reported-by: Daniel Drake 
Fixes: b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Signed-off-by: Jon Derrick 
Signed-off-by: Daniel Drake 
---

Notes:
v2: switch to Jon's approach instead.
v3: shortcut mask check in non-identity case
v4: amend commit msg to explain why subdevice DMA mask is still considered

This problem was originally detected with a non-upstream patch which
creates PCI devices similar to VMD:
"PCI: Add Intel remapped NVMe device support"
(https://marc.info/?l=linux-ide=156015271021615=2)

This patch has now been tested on VMD forced into identity mode.

 drivers/iommu/intel-iommu.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..7ffd252bf835 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3582,12 +3582,16 @@ static struct dmar_domain 
*get_private_domain_for_dev(struct device *dev)
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static bool iommu_need_mapping(struct device *dev)
 {
+   struct device *dma_dev = dev;
int ret;
 
if (iommu_dummy(dev))
return false;
 
-   ret = identity_mapping(dev);
+   if (dev_is_pci(dev))
+   dma_dev = _real_dma_dev(to_pci_dev(dev))->dev;
+
+   ret = identity_mapping(dma_dev);
if (ret) {
u64 dma_mask = *dev->dma_mask;
 
@@ -3601,19 +3605,19 @@ static bool iommu_need_mapping(struct device *dev)
 * 32 bit DMA is removed from si_domain and fall back to
 * non-identity mapping.
 */
-   dmar_remove_one_dev_info(dev);
-   ret = iommu_request_dma_domain_for_dev(dev);
+   dmar_remove_one_dev_info(dma_dev);
+   ret = iommu_request_dma_domain_for_dev(dma_dev);
if (ret) {
struct iommu_domain *domain;
struct dmar_domain *dmar_domain;
 
-   domain = iommu_get_domain_for_dev(dev);
+   domain = iommu_get_domain_for_dev(dma_dev);
if (domain) {
dmar_domain = to_dmar_domain(domain);
dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
}
-   dmar_remove_one_dev_info(dev);
-   get_private_domain_for_dev(dev);
+   dmar_remove_one_dev_info(dma_dev);
+   get_private_domain_for_dev(dma_dev);
}
 
dev_info(dev, "32bit DMA uses non-identity mapping\n");
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v3] iommu/vt-d: consider real PCI device when checking if mapping is needed

2020-02-17 Thread Daniel Drake
From: Jon Derrick 

The PCI devices handled by intel-iommu may have a DMA requester on
another bus, such as VMD subdevices needing to use the VMD endpoint.

The real DMA device is now used for the DMA mapping, but one case was
missed earlier: if the VMD device (and hence subdevices too) are under
IOMMU_DOMAIN_IDENTITY, mappings do not work.

Codepaths like intel_map_page() handle the IOMMU_DOMAIN_DMA case by
creating an iommu DMA mapping, and fall back on dma_direct_map_page()
for the IOMMU_DOMAIN_IDENTITY case. However, handling of the IDENTITY
case is broken when intel_page_page() handles a subdevice.

We observe that at iommu attach time, dmar_insert_one_dev_info() for
the subdevices will never set dev->archdata.iommu. This is because
that function uses find_domain() to check if there is already an IOMMU
for the device, and find_domain() then defers to the real DMA device
which does have one. Thus dmar_insert_one_dev_info() returns without
assigning dev->archdata.iommu.

Then, later:

1. intel_map_page() checks if an IOMMU mapping is needed by calling
   iommu_need_mapping() on the subdevice. identity_mapping() returns
   false because dev->archdata.iommu is NULL, so this function
   returns false indicating that mapping is needed.
2. __intel_map_single() is called to create the mapping.
3. __intel_map_single() calls find_domain(). This function now returns
   the IDENTITY domain corresponding to the real DMA device.
4. __intel_map_single() calls domain_get_iommu() on this "real" domain.
   A failure is hit and the entire operation is aborted, because this
   codepath is not intended to handle IDENTITY mappings:
   if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
   return NULL;

Fix this by using the real DMA device when checking if a mapping is
needed, while also considering the subdevice DMA mask.
The IDENTITY case will then directly fall back on dma_direct_map_page().

Reported-by: Daniel Drake 
Fixes: b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Signed-off-by: Jon Derrick 
Signed-off-by: Daniel Drake 
---
 drivers/iommu/intel-iommu.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..7ffd252bf835 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3582,12 +3582,16 @@ static struct dmar_domain 
*get_private_domain_for_dev(struct device *dev)
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static bool iommu_need_mapping(struct device *dev)
 {
+   struct device *dma_dev = dev;
int ret;
 
if (iommu_dummy(dev))
return false;
 
-   ret = identity_mapping(dev);
+   if (dev_is_pci(dev))
+   dma_dev = _real_dma_dev(to_pci_dev(dev))->dev;
+
+   ret = identity_mapping(dma_dev);
if (ret) {
u64 dma_mask = *dev->dma_mask;
 
@@ -3601,19 +3605,19 @@ static bool iommu_need_mapping(struct device *dev)
 * 32 bit DMA is removed from si_domain and fall back to
 * non-identity mapping.
 */
-   dmar_remove_one_dev_info(dev);
-   ret = iommu_request_dma_domain_for_dev(dev);
+   dmar_remove_one_dev_info(dma_dev);
+   ret = iommu_request_dma_domain_for_dev(dma_dev);
if (ret) {
struct iommu_domain *domain;
struct dmar_domain *dmar_domain;
 
-   domain = iommu_get_domain_for_dev(dev);
+   domain = iommu_get_domain_for_dev(dma_dev);
if (domain) {
dmar_domain = to_dmar_domain(domain);
dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
}
-   dmar_remove_one_dev_info(dev);
-   get_private_domain_for_dev(dev);
+   dmar_remove_one_dev_info(dma_dev);
+   get_private_domain_for_dev(dma_dev);
}
 
dev_info(dev, "32bit DMA uses non-identity mapping\n");
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2] iommu/vt-d: consider real PCI device when checking if mapping is needed

2020-02-14 Thread Daniel Drake
From: Jon Derrick 

The PCI devices handled by intel-iommu may have a DMA requester on
another bus, such as VMD subdevices needing to use the VMD endpoint.

The real DMA device is now used for the DMA mapping, but one case was
missed earlier: if the VMD device (and hence subdevices too) are under
IOMMU_DOMAIN_IDENTITY, mappings do not work.

Codepaths like intel_map_page() handle the IOMMU_DOMAIN_DMA case by
creating an iommu DMA mapping, and fall back on dma_direct_map_page()
for the IOMMU_DOMAIN_IDENTITY case. However, handling of the IDENTITY
case is broken when intel_page_page() handles a subdevice.

We observe that at iommu attach time, dmar_insert_one_dev_info() for
the subdevices will never set dev->archdata.iommu. This is because
that function uses find_domain() to check if there is already an IOMMU
for the device, and find_domain() then defers to the real DMA device
which does have one. Thus dmar_insert_one_dev_info() returns without
assigning dev->archdata.iommu.

Then, later:

1. intel_map_page() checks if an IOMMU mapping is needed by calling
   iommu_need_mapping() on the subdevice. identity_mapping() returns
   false because dev->archdata.iommu is NULL, so this function
   returns false indicating that mapping is needed.
2. __intel_map_single() is called to create the mapping.
3. __intel_map_single() calls find_domain(). This function now returns
   the IDENTITY domain corresponding to the real DMA device.
4. __intel_map_single() calls domain_get_iommu() on this "real" domain.
   A failure is hit and the entire operation is aborted, because this
   codepath is not intended to handle IDENTITY mappings:
   if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
   return NULL;

Fix this by using the real DMA device when checking if a mapping is
needed, while also considering the subdevice DMA mask.
The IDENTITY case will then directly fall back on dma_direct_map_page().

Reported-by: Daniel Drake 
Fixes: b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Signed-off-by: Daniel Drake 
---

Notes:
v2: switch to Jon's approach instead.

This problem was detected with a non-upstream patch
"PCI: Add Intel remapped NVMe device support"
(https://marc.info/?l=linux-ide=156015271021615=2)

This patch creates PCI devices a bit like VMD, and hence
I believe VMD would hit this class of problem for any cases where
the VMD device is in the IDENTITY domain. (I presume the reason this
bug was not seen already there is that it is in a DMA iommu domain).

However this hasn't actually been tested on VMD (don't have the hardware)
so if I've missed anything and/or it's not a real issue then feel free to
drop this patch.

 drivers/iommu/intel-iommu.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..edbe2866b515 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3582,19 +3582,23 @@ static struct dmar_domain 
*get_private_domain_for_dev(struct device *dev)
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static bool iommu_need_mapping(struct device *dev)
 {
+   u64 dma_mask, required_dma_mask;
int ret;
 
if (iommu_dummy(dev))
return false;
 
-   ret = identity_mapping(dev);
-   if (ret) {
-   u64 dma_mask = *dev->dma_mask;
+   dma_mask = *dev->dma_mask;
+   if (dev->coherent_dma_mask && dev->coherent_dma_mask < dma_mask)
+   dma_mask = dev->coherent_dma_mask;
+   required_dma_mask = dma_direct_get_required_mask(dev);
 
-   if (dev->coherent_dma_mask && dev->coherent_dma_mask < dma_mask)
-   dma_mask = dev->coherent_dma_mask;
+   if (dev_is_pci(dev))
+   dev = _real_dma_dev(to_pci_dev(dev))->dev;
 
-   if (dma_mask >= dma_direct_get_required_mask(dev))
+   ret = identity_mapping(dev);
+   if (ret) {
+   if (dma_mask >= required_dma_mask)
return false;
 
/*
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/vt-d: consider real PCI device when checking if mapping is needed

2020-02-11 Thread Daniel Drake
On Wed, Feb 12, 2020 at 12:03 AM Derrick, Jonathan
 wrote:
> On Tue, 2020-02-11 at 17:13 +0800, Daniel Drake wrote:
> > The PCI devices handled by intel-iommu may have a DMA requester on
> > another bus, such as VMD subdevices needing to use the VMD endpoint.
> >
> > The real DMA device is now used for the DMA mapping, but one case was
> > missed earlier, when allocating memory through (e.g.) intel_map_page().
> > Confusion ensues if the iommu domain type for the subdevice does not match
> > the iommu domain type for the real DMA device.
> Is there a way to force this situation for my testing?

I think you could hack device_def_domain_type() to return
IOMMU_DOMAIN_IDENTITY for the real device, and IOMMU_DOMAIN_DMA for
the subdevice.

But I got curious as to why my subdevice might be IOMMU_DOMAIN_DMA, so
I checked, and found out that my assumptions weren't quite correct.
The subdevice has no iommu domain recorded at all. Before applying any
patches here, what's actually happening is:

1. Real DMA device gets registered with the iommu as
IOMMU_DOMAIN_IDENTITY using si_domain.
2. When the subdevice gets registered, the relevant code flow is
inside dmar_insert_one_dev_info():
 - it creates a new device_domain_info and domain->domain.type == IDENTITY, but
 - it then calls find_domain(dev) which successfully defers to the
real DMA device and returns the real DMA device's dmar_domain
 - since found != NULL (dmar_domain was found for this device) the
function bails out before setting dev->archdata.iommu

The results at this point are that the real DMA device is fully
registered as IOMMU_DOMAIN_IDENTITY using si_domain, but all of the
subdevices will always have dev->archdata.iommu == NULL.

Then when intel_map_page() is reached for the subdevice, it calls
iommu_need_mapping() for the subdevice.
This calls identity_mapping() on the subdevice, but that will always
return 0 because dev->archdata.iommu == NULL.
Following on from there, iommu_need_mapping() will then *always*
return true (mapping needed) for subdevices.

That will then lead to the situation described in my last mail, where
later down the allocation chain the request for creating a mapping
will be handed towards the real DMA dev, but that will then fail
because the real DMA dev is using IOMMU_DOMAIN_IDENTITY where no
mapping is needed.

Unless I missed anything that seems pretty clear to me now, and I
guess the only reason why you may not have already faced this in the
vmd case is if the real DMA device is not using IOMMU_DOMAIN_IDENTITY.
(To check this, you could log the value of the real dev
domain->domain.type in dmar_insert_one_dev_info(), and/or observe the
return value of identity_mapping() in iommu_need_mapping for the real
dev).

In any case it seems increasingly clear to me that
iommu_need_mapping() should be consulting the real DMA device in the
identity_mapping check, and your patch likewise solves the problem
faced here.

Thanks
Daniel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/vt-d: consider real PCI device when checking if mapping is needed

2020-02-11 Thread Daniel Drake
The PCI devices handled by intel-iommu may have a DMA requester on
another bus, such as VMD subdevices needing to use the VMD endpoint.

The real DMA device is now used for the DMA mapping, but one case was
missed earlier, when allocating memory through (e.g.) intel_map_page().
Confusion ensues if the iommu domain type for the subdevice does not match
the iommu domain type for the real DMA device.

For example, take the case of the subdevice handled by intel_map_page()
in a IOMMU_DOMAIN_DMA, with the real DMA device in a
IOMMU_DOMAIN_IDENTITY:

1. intel_map_page() checks if an IOMMU mapping is needed by calling
   iommu_need_mapping() on the subdevice. Result: mapping is needed.
2. __intel_map_single() is called to create the mapping:
  - __intel_map_single() calls find_domain(). This function now returns
the IDENTITY domain corresponding to the real DMA device.
  - __intel_map_single() then calls domain_get_iommu() on this "real"
domain. A failure is hit and the entire operation is aborted, because
this codepath is not intended to handle IDENTITY mappings:
if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
 return NULL;

Fix this by using the real DMA device when checking if a mapping is
needed. The above case will then directly fall back on
dma_direct_map_page().

Fixes: 2b0140c69637 ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Signed-off-by: Daniel Drake 
---

Notes:
This problem was detected with a non-upstream patch
"PCI: Add Intel remapped NVMe device support"
(https://marc.info/?l=linux-ide=156015271021615=2)

This patch creates PCI devices in the same way as VMD, and hence
I believe VMD would hit this class of problem for any cases where
iommu domain type may mismatch between subdevice and real device,
which we have run into here.

However this hasn't actually been tested on VMD (don't have the hardware)
so if I've missed anything and/or it's not a real issue then feel free to
drop this patch.

 drivers/iommu/intel-iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..713810f8350c 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3587,6 +3587,9 @@ static bool iommu_need_mapping(struct device *dev)
if (iommu_dummy(dev))
return false;
 
+   if (dev_is_pci(dev))
+   dev = _real_dma_dev(to_pci_dev(dev))->dev;
+
ret = identity_mapping(dev);
if (ret) {
u64 dma_mask = *dev->dma_mask;
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/intel-iommu: set as DUMMY_DEVICE_DOMAIN_INFO if no IOMMU

2020-02-11 Thread Daniel Drake
On Sat, Feb 8, 2020 at 2:30 PM Lu Baolu  wrote:
> > The devices under segment 1 are fake devices produced by
> > intel-nvme-remap mentioned here https://lkml.org/lkml/2020/2/5/139
>
> Has this series been accepted?

Sadly not - we didn't find any consensus on the right approach, and
further conversation is hindered by the questionable hardware design
and lack of further communication from Intel in explaining it. It's
one of the exceptional cases where we carry a significant non-upstream
kernel change, because unfortunately most of the current day consumer
PCs we work with have this BIOS option on by default and hence
unmodified Linux can't access the storage devices. On the offchance
that you have some energy to bubble this up inside Intel please let me
know and we will talk more... :)

That said, this thread was indeed only opened since we thought we had
found a more general issue that would potentially affect other cases.
The issue described does seem to highlight a possible imperfection in
code flow, although it may also be reasonable to say that (without
crazy downstream patches at play) if intel_iommu_add_device() fails
then we have bigger problems to face anyway...

> Will this help here? https://www.spinics.net/lists/iommu/msg41300.html

Yes! Very useful info and a real improvement. We'll follow the same
approach here. That does solve the problem we were facing, although we
needed one more fixup which I've sent separately for your
consideration: iommu/vt-d: consider real PCI device when checking if
mapping is needed

Thanks!
Daniel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/dmar: ignore devices with out-of-spec domain number

2020-02-11 Thread Daniel Drake
VMD subdevices are created with a PCI domain ID of 0x1 or
higher.

These subdevices are also handled like all other PCI devices by
dmar_pci_bus_notifier().

However, when dmar_alloc_pci_notify_info() take records of such devices,
it will truncate the domain ID to a u16 value (in info->seg).
The device at (e.g.) 1:00:02.0 is then treated by the DMAR code as if
it is :00:02.0.

In the unlucky event that a real device also exists at :00:02.0 and
also has a device-specific entry in the DMAR table,
dmar_insert_dev_scope() will crash on:
   BUG_ON(i >= devices_cnt);

That's basically a sanity check that only one PCI device matches a
single DMAR entry; in this case we seem to have two matching devices.

Fix this by ignoring devices that have a domain number higher than
what can be looked up in the DMAR table.

This problem was carefully diagnosed by Jian-Hong Pan.

Signed-off-by: Daniel Drake 
---

Notes:
This problem was detected with a non-upstream patch
"PCI: Add Intel remapped NVMe device support"
(https://marc.info/?l=linux-ide=156015271021615=2)

This patch creates PCI devices in the same way as VMD, and hence
I believe VMD would hit the same problem that we encountered here, when
a VMD-using product comes along that meets the mentioned conditions.

However this hasn't actually been tested on VMD (don't have the hardware)
so if I've missed anything and/or it's not a real issue then feel free to
drop this patch.

 drivers/iommu/dmar.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 071bb42bbbc5..8f94c817a7b5 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -128,6 +129,13 @@ dmar_alloc_pci_notify_info(struct pci_dev *dev, unsigned 
long event)
 
BUG_ON(dev->is_virtfn);
 
+   /*
+* Ignore devices that have a domain number higher than what can
+* be looked up in DMAR, e.g. VMD subdevices with domain 0x1
+*/
+   if (pci_domain_nr(dev->bus) > U16_MAX)
+   return NULL;
+
/* Only generate path[] for device addition event */
if (event == BUS_NOTIFY_ADD_DEVICE)
for (tmp = dev; tmp; tmp = tmp->bus->self)
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v3)

2017-09-12 Thread Daniel Drake
Hi,

On Tue, May 30, 2017 at 3:38 PM, Nath, Arindam  wrote:
>>-Original Message-
>>From: Joerg Roedel [mailto:j...@8bytes.org]
>>Sent: Monday, May 29, 2017 8:09 PM
>>To: Nath, Arindam ; Lendacky, Thomas
>>
>>Cc: iommu@lists.linux-foundation.org; amd-...@lists.freedesktop.org;
>>Deucher, Alexander ; Bridgman, John
>>; dr...@endlessm.com; Suthikulpanit, Suravee
>>; li...@endlessm.com; Craig Stein
>>; mic...@daenzer.net; Kuehling, Felix
>>; sta...@vger.kernel.org
>>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v3)
>>
>>Hi Arindam,
>>
>>I met Tom Lendacky last week in Nuremberg last week and he told me he is
>>working on the same area of the code that this patch is for. His reason
>>for touching this code was to solve some locking problems. Maybe you two
>>can work together on a joint approach to improve this?
>
> Sure Joerg, I will work with Tom.

What was the end result here? I see that the code has been reworked in
4.13 so your original patch no longer applies. Is the reworked version
also expected to solve the original issue?

Thanks
Daniel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


DMAR table missing, Intel IOMMU not available

2017-08-14 Thread Daniel Drake
Hi,

We're working with a number of platforms based on Intel Apollo Lake
and there are some clues suggesting that the IR-PCI-MSI irqchip
functionality would be able to get us out of a tricky situation
described at:

ath9k hardware corrupts MSI Message Data, raises wrong interrupt
http://marc.info/?l=linux-pci=150238260726797=2

However the affected platforms do not have a DMAR table present. And I
read in the
Intel® Virtualization Technology for Directed I/O spec: "The system
BIOS is responsible for detecting the remapping hardware functions in
the platform and for locating the memory-mapped remapping hardware
registers in the host system address space. The BIOS reports the
remapping hardware units in a platform to system software through the
DMA
Remapping Reporting (DMAR) ACPI table".

Unfortunately since the BIOS authors have not done what the spec
asked, this nice hardware functionality is completely unavailable :(

For now we will have to find an alternative approach to solve the
problem (BIOS can't be changed), but I am curious if there are plans
to have Linux automatically probe the IOMMU through some other means,
given that BIOS authors are apparently not providing the DMAR table in
many cases.

Thanks
Daniel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu/amd: flush IOTLB for specific domains only

2017-05-08 Thread Daniel Drake
On Wed, Apr 5, 2017 at 9:01 AM, Nath, Arindam <arindam.n...@amd.com> wrote:
>
> >-Original Message-
> >From: Daniel Drake [mailto:dr...@endlessm.com]
> >Sent: Thursday, March 30, 2017 7:15 PM
> >To: Nath, Arindam
> >Cc: j...@8bytes.org; Deucher, Alexander; Bridgman, John; amd-
> >g...@lists.freedesktop.org; iommu@lists.linux-foundation.org; Suthikulpanit,
> >Suravee; Linux Upstreaming Team
> >Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only
> >
> >On Thu, Mar 30, 2017 at 12:23 AM, Nath, Arindam <arindam.n...@amd.com>
> >wrote:
> >> Daniel, did you get chance to test this patch?
> >
> >Not yet. Should we test it alone or alongside "PCI: Blacklist AMD
> >Stoney GPU devices for ATS"?
>
> Daniel, any luck with this patch?

Sorry for the delay. The patch appears to be working fine.

Daniel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/amd: flush IOTLB for specific domains only

2017-03-27 Thread Daniel Drake
Hi Arindam,

You CC'd me on this - does this mean that it is a fix for the issue
described in the thread "amd-iommu: can't boot with amdgpu, AMD-Vi:
Completion-Wait loop timed out" ?

Thanks
Daniel


On Mon, Mar 27, 2017 at 12:17 AM,   wrote:
> From: Arindam Nath 
>
> The idea behind flush queues is to defer the IOTLB flushing
> for domains for which the mappings are no longer valid. We
> add such domains in queue_add(), and when the queue size
> reaches FLUSH_QUEUE_SIZE, we perform __queue_flush().
>
> Since we have already taken lock before __queue_flush()
> is called, we need to make sure the IOTLB flushing is
> performed as quickly as possible.
>
> In the current implementation, we perform IOTLB flushing
> for all domains irrespective of which ones were actually
> added in the flush queue initially. This can be quite
> expensive especially for domains for which unmapping is
> not required at this point of time.
>
> This patch makes use of domain information in
> 'struct flush_queue_entry' to make sure we only flush
> IOTLBs for domains who need it, skipping others.
>
> Signed-off-by: Arindam Nath 
> ---
>  drivers/iommu/amd_iommu.c | 15 ---
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 98940d1..6a9a048 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2227,15 +2227,16 @@ static struct iommu_group 
> *amd_iommu_device_group(struct device *dev)
>
>  static void __queue_flush(struct flush_queue *queue)
>  {
> -   struct protection_domain *domain;
> -   unsigned long flags;
> int idx;
>
> -   /* First flush TLB of all known domains */
> -   spin_lock_irqsave(_iommu_pd_lock, flags);
> -   list_for_each_entry(domain, _iommu_pd_list, list)
> -   domain_flush_tlb(domain);
> -   spin_unlock_irqrestore(_iommu_pd_lock, flags);
> +   /* First flush TLB of all domains which were added to flush queue */
> +   for (idx = 0; idx < queue->next; ++idx) {
> +   struct flush_queue_entry *entry;
> +
> +   entry = queue->entries + idx;
> +
> +   domain_flush_tlb(>dma_dom->domain);
> +   }
>
> /* Wait until flushes have completed */
> domain_flush_complete(NULL);
> --
> 1.9.1
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait loop timed out

2017-03-27 Thread Daniel Drake
Hi Joerg,

Thanks for looking into this. We confirm that this workaround avoids
the iommu log spam and that amdgpu appears to be working fine with it.

Daniel


On Wed, Mar 22, 2017 at 5:22 AM, j...@8bytes.org  wrote:
> On Tue, Mar 21, 2017 at 04:30:55PM +, Deucher, Alexander wrote:
>> > I am preparing a debug-patch that disables ATS for these GPUs so someone
>> > with such a chip can test it.
>>
>> Thanks Joerg.
>
> Here is a debug patch, using the hard hammer of disabling the use of ATS
> completly in the AMD IOMMU driver. If it fixes the issue I am going to
> write a more upstreamable version.
>
> But for now, please test if this fixes the issue.
>
> Thanks,
>
> Joerg
>
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 98940d1..f019aa6 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -467,7 +467,7 @@ static int iommu_init_device(struct device *dev)
> struct amd_iommu *iommu;
>
> iommu = amd_iommu_rlookup_table[dev_data->devid];
> -   dev_data->iommu_v2 = iommu->is_iommu_v2;
> +   dev_data->iommu_v2 = false;
> }
>
> dev->archdata.iommu = dev_data;
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 6130278..41d0e64 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -171,7 +171,7 @@ int amd_iommus_present;
>
>  /* IOMMUs have a non-present cache? */
>  bool amd_iommu_np_cache __read_mostly;
> -bool amd_iommu_iotlb_sup __read_mostly = true;
> +bool amd_iommu_iotlb_sup __read_mostly = false;
>
>  u32 amd_iommu_max_pasid __read_mostly = ~0;
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait loop timed out

2017-03-17 Thread Daniel Drake
Hi,

On Mon, Mar 13, 2017 at 2:01 PM, Deucher, Alexander
 wrote:
> > We are unable to boot Acer Aspire E5-553G (AMD FX-9800P RADEON R7) nor
> > Acer Aspire E5-523 with standard configurations because during boot
> > the screen is flooded with the following error message over and over:
> >
> >   AMD-Vi: Completion-Wait loop timed out
>
> We ran into similar issues and bisected it to commit 
> b1516a14657acf81a587e9a6e733a881625eee53.  I'm not too familiar with the 
> IOMMU hardware to know if this is an iommu or display driver issue yet.

We can confirm that reverting this commit solves the issue.

Given that that commit is an optimization, but it has introduced a
regression on multiple platforms, and has been like this for 8 months,
it would be common practice to now revert this patch upstream until
the regression is fixed. Could you please send a new patch to do this?

Also, we would be happy to test any real solutions to this issue while
we still have the affected units in hand.

Thanks
Daniel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait loop timed out

2017-03-13 Thread Daniel Drake
Hi,

We are unable to boot Acer Aspire E5-553G (AMD FX-9800P RADEON R7) nor
Acer Aspire E5-523 with standard configurations because during boot
the screen is flooded with the following error message over and over:

  AMD-Vi: Completion-Wait loop timed out

We have left the system for quite a while but the message spam does
not stop and the system doesn't complete the boot sequence.

We have reproduced on Linux 4.8 and Linux 4.10.

To avoid this, we can boot with iommu=soft or just disable the amdgpu
display driver.

Looks like this may also affect HP 15-ba012no :
https://bugzilla.redhat.com/show_bug.cgi?id=1409201

Earlier during boot the iommu is detected as:

[1.274518] AMD-Vi: Found IOMMU at :00:00.2 cap 0x40
[1.274519] AMD-Vi: Extended features (0x37ef22294ada):
[1.274519]  PPR NX GT IA GA PC GA_vAPIC
[1.274523] AMD-Vi: Interrupt remapping enabled
[1.274523] AMD-Vi: virtual APIC enabled
[1.275144] AMD-Vi: Lazy IO/TLB flushing enabled
[1.276498] perf: AMD NB counters detected
[1.278096] LVT offset 0 assigned for vector 0x400
[1.278963] perf: AMD IBS detected (0x07ff)
[1.278977] perf: amd_iommu: Detected. (0 banks, 0 counters/bank)

Any suggestions for how we can fix this, or get more useful debug info?

Thanks
Daniel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu