Re: [PATCH v4 0/3] Replace private domain with per-group default domain
On Wed, May 6, 2020 at 10:03 AM Lu Baolu wrote:
> https://lkml.org/lkml/2020/4/14/616
> [This has been applied in iommu/next.]
>
> Hence, there is no need to keep the private domain implementation
> in the Intel IOMMU driver. This patch series aims to remove it.

I applied these patches on top of Joerg's branch and confirmed that they
fix the issue discussed in the thread:

  [PATCH v2] iommu/vt-d: consider real PCI device when checking if
  mapping is needed

(the patch there is no longer needed)

Tested-by: Daniel Drake

Thanks!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code
Hi Joerg,

> Hi,
>
> here is the second version of this patch-set. The first version with
> some more introductory text can be found here:
>
> https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/

Thanks for the continued improvements in this area!

I may have spotted a problem with setups like VMD. The core PCI bus is
set up during early boot. Then, for the PCI bus, we reach
iommu_bus_init() -> bus_iommu_probe(). In there, we call
probe_iommu_group() -> dev_iommu_get() for each PCI device, which
allocates dev->iommu in each case. So far so good.

The problem is that this is the last time that we'll call
dev_iommu_get(). If any PCI bus devices get added after this point, they
do not get passed to dev_iommu_get(). So when the vmd module gets loaded
later, and creates more PCI devices, we end up in iommu_bus_notifier()
-> iommu_probe_device() -> __iommu_probe_device(), which does:

	dev->iommu->iommu_dev = iommu_dev;

Here, dereferencing dev->iommu is a NULL pointer dereference, because
dev_iommu_get() was never called for this new device.

Daniel
Re: [PATCH 1/1] iommu/vt-d: use DMA domain for real DMA devices and subdevices
On Fri, Apr 10, 2020 at 9:22 AM Lu Baolu wrote:
> This is caused by the fragile private domain implementation. We are in
> process of removing it by enhancing the iommu subsystem with per-group
> default domain.
>
> https://www.spinics.net/lists/iommu/msg42976.html
>
> So ultimately VMD subdevices should have their own per-device iommu data
> and support per-device dma ops.

Interesting. There's also this patchset you posted:

  [PATCH 00/19] [PULL REQUEST] iommu/vt-d: patches for v5.7
  https://lists.linuxfoundation.org/pipermail/iommu/2020-April/042967.html
  (to be pushed out to 5.8)

In there you have:

> iommu/vt-d: Don't force 32bit devices to uses DMA domain

which seems to clash with the approach being explored in this thread.
And:

> iommu/vt-d: Apply per-device dma_ops

This effectively solves the trip point that caused me to open these
discussions, where intel_map_page() -> iommu_need_mapping() would
incorrectly determine that an intel-iommu DMA mapping was needed for a
PCI subdevice running in identity mode. After this patch, a PCI
subdevice in identity mode uses the default system dma_ops and
completely avoids intel-iommu.

So that solves the issues I was looking at. Jon, you might want to check
if the problems you see are likewise solved for you by these patches.

I didn't try Joerg's iommu group rework yet as it conflicts with those
patches above.

Daniel
Re: [PATCH 1/1] iommu/vt-d: use DMA domain for real DMA devices and subdevices
Hi Jon,

Thanks for picking this up. Apologies for my absence here - I wasn't
able to work on this recently, but I'm back again now.

On Fri, Apr 10, 2020 at 3:32 AM Jon Derrick wrote:
> This becomes problematic if the real DMA device and the subdevices have
> different addressing capabilities and some require translation. Instead
> we can put the real DMA dev and any subdevices on the DMA domain. This
> change assigns subdevices to the DMA domain, and moves the real DMA
> device to the DMA domain if necessary.

Have you tested this with the real DMA device in identity mode? It is
not quite working for me. (Again, I'm not using VMD here, but have
looked closely and believe we're working under the same constraints)

First, the real DMA device gets added to the group:

  pci 0000:00:17.0: Adding to iommu group 9

(it's in IDENTITY mode here)

Then later, the first subdevice comes along, and these are the results:

  pci 10000:00:00.0: [8086:02d7] type 00 class 0x010601
  pci 10000:00:00.0: reg 0x10: [mem 0xae1a0000-0xae1a7fff]
  pci 10000:00:00.0: reg 0x14: [mem 0xae1a8000-0xae1a80ff]
  pci 10000:00:00.0: reg 0x18: [io 0x3090-0x3097]
  pci 10000:00:00.0: reg 0x1c: [io 0x3080-0x3083]
  pci 10000:00:00.0: reg 0x20: [io 0x3060-0x307f]
  pci 10000:00:00.0: reg 0x24: [mem 0xae100000-0xae103fff]
  pci 10000:00:00.0: PME# supported from D3hot
  pci 10000:00:00.0: Adding to iommu group 9
  pci 10000:00:00.0: DMAR: Failed to get a private domain.

That final message is added by your patch and indicates that it's not
working. This is because the subdevice got added to the iommu group
before the code you added tried to change to the DMA domain.

It first gets added to the group through this call path:

  intel_iommu_add_device
   -> iommu_group_get_for_dev
    -> iommu_group_add_device

Then, continuing within intel_iommu_add_device, we get to the code you
added, which tries to move the real DMA dev to DMA mode instead. It
calls:

  intel_iommu_request_dma_domain_for_dev
   -> iommu_request_dma_domain_for_dev
    -> request_default_domain_for_dev

Which fails here:

	/* Don't change mappings of existing devices */
	ret = -EBUSY;
	if (iommu_group_device_count(group) != 1)
		goto out;

because we already have 2 devices in the group (the real DMA dev, plus
the subdevice we're in the process of handling now).

Next I'll look into the iommu group rework that Baolu mentioned.

Thanks,
Daniel
Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
> On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu wrote:
> > With respect, this is problematical. The parent and all subdevices share
> > a single translation entry. The DMA mask should be consistent.
> >
> > Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> > an identity domain for DMA translation. While subdevice B has 32-bit DMA
> > capability and is forced to switch to DMA domain. Subdevice A will be
> > impacted without any notification.

Looking closer, this problematic codepath may already be happening for
VMD, under intel_iommu_add_device(). Consider this function running for
a VMD subdevice; we hit:

	domain = iommu_get_domain_for_dev(dev);

I can't quite grasp the code flow here, but domain->type now always
seems to return the domain type of the real DMA device, which seems like
pretty reasonable behaviour.

	if (domain->type == IOMMU_DOMAIN_DMA) {

and as detailed in previous mails, the real VMD device seems to be in a
DMA domain by default, so we continue.

		if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {

Now we checked the default domain type of the subdevice. This seems
rather likely to return IDENTITY because that's effectively the default
type...

			ret = iommu_request_dm_for_dev(dev);
			if (ret) {
				dmar_remove_one_dev_info(dev);
				dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
				domain_add_dev_info(si_domain, dev);
				dev_info(dev, "Device uses a private identity domain.\n");
			}
		}

and now we're doing the bad stuff that Lu pointed out: we only have one
mapping shared for all the subdevices, so if we end up changing it for
one subdevice, we're likely to be breaking another.

In this case iommu_request_dm_for_dev() returns -EBUSY without doing
anything, and the following private identity code fortunately seems to
have no consequential effects - the real DMA device continues to operate
in the DMA domain, and all subdevice DMA requests go through the DMA
mapping codepath.
That's probably why VMD appears to be working fine anyway, but this
seems fragile. The following changes enforce that the real DMA device is
in the DMA domain, and avoid the intel_iommu_add_device() codepaths that
would try to change it to a different domain type. Let me know if I'm on
the right lines...
---
 drivers/iommu/intel-iommu.c               | 16 ++++++++++++++++
 drivers/pci/controller/intel-nvme-remap.c |  6 ++++++
 2 files changed, 22 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9644a5b3e0ae..8872b8d1780d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2911,6 +2911,9 @@ static int device_def_domain_type(struct device *dev)
 	if (dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(dev);
 
+		if (pci_real_dma_dev(pdev) != pdev)
+			return IOMMU_DOMAIN_DMA;
+
 		if (device_is_rmrr_locked(dev))
 			return IOMMU_DOMAIN_DMA;
 
@@ -5580,6 +5583,7 @@ static bool intel_iommu_capable(enum iommu_cap cap)
 
 static int intel_iommu_add_device(struct device *dev)
 {
+	struct device *real_dev = dev;
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
 	struct intel_iommu *iommu;
@@ -5591,6 +5595,17 @@ static int intel_iommu_add_device(struct device *dev)
 	if (!iommu)
 		return -ENODEV;
 
+	if (dev_is_pci(dev))
+		real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+	if (real_dev != dev) {
+		domain = iommu_get_domain_for_dev(real_dev);
+		if (domain->type != IOMMU_DOMAIN_DMA) {
+			dev_err(dev, "Real DMA device not in DMA domain; can't handle DMA\n");
+			return -ENODEV;
+		}
+	}
+
 	iommu_device_link(&iommu->iommu, dev);
 
 	if (translation_pre_enabled(iommu))
Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu wrote:
> With respect, this is problematical. The parent and all subdevices share
> a single translation entry. The DMA mask should be consistent.
>
> Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> an identity domain for DMA translation. While subdevice B has 32-bit DMA
> capability and is forced to switch to DMA domain. Subdevice A will be
> impacted without any notification.

I see what you mean. Perhaps we should just ensure that setups involving
such real DMA devices and subdevices should always use the DMA domain,
avoiding this type of complication. That's apparently even the default
for VMD.

This is probably something that should be forced/ensured when the real
DMA device gets registered, because similarly to the noted case, we
can't risk any identity mappings having been created on the real device
if we later decide to move it into the DMA domain based on the
appearance of subdevices.

Jon, any thoughts?

Daniel
[PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
From: Jon Derrick

The PCI devices handled by intel-iommu may have a DMA requester on
another bus, such as VMD subdevices needing to use the VMD endpoint. The
real DMA device is now used for the DMA mapping, but one case was missed
earlier: if the VMD device (and hence subdevices too) are under
IOMMU_DOMAIN_IDENTITY, mappings do not work.

Codepaths like intel_map_page() handle the IOMMU_DOMAIN_DMA case by
creating an iommu DMA mapping, and fall back on dma_direct_map_page()
for the IOMMU_DOMAIN_IDENTITY case. However, handling of the IDENTITY
case is broken when intel_map_page() handles a subdevice.

We observe that at iommu attach time, dmar_insert_one_dev_info() for the
subdevices will never set dev->archdata.iommu. This is because that
function uses find_domain() to check if there is already an IOMMU for
the device, and find_domain() then defers to the real DMA device which
does have one. Thus dmar_insert_one_dev_info() returns without assigning
dev->archdata.iommu.

Then, later:

1. intel_map_page() checks if an IOMMU mapping is needed by calling
   iommu_need_mapping() on the subdevice. identity_mapping() returns
   false because dev->archdata.iommu is NULL, so iommu_need_mapping()
   returns true, indicating that a mapping is needed.
2. __intel_map_single() is called to create the mapping.
3. __intel_map_single() calls find_domain(). This function now returns
   the IDENTITY domain corresponding to the real DMA device.
4. __intel_map_single() calls domain_get_iommu() on this "real" domain.
   A failure is hit and the entire operation is aborted, because this
   codepath is not intended to handle IDENTITY mappings:

	if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
		return NULL;

Fix this by using the real DMA device when checking if a mapping is
needed. The IDENTITY case will then directly fall back on
dma_direct_map_page(). The subdevice DMA mask is still considered in
order to handle any situations where (e.g.) the subdevice only supports
32-bit DMA with the real DMA requester having a 64-bit DMA mask.

Reported-by: Daniel Drake
Fixes: b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Signed-off-by: Jon Derrick
Signed-off-by: Daniel Drake
---

Notes:
    v2: switch to Jon's approach instead.
    v3: shortcut mask check in non-identity case
    v4: amend commit msg to explain why subdevice DMA mask is still
        considered

    This problem was originally detected with a non-upstream patch which
    creates PCI devices similar to VMD:
    "PCI: Add Intel remapped NVMe device support"
    (https://marc.info/?l=linux-ide&m=156015271021615&w=2)

    This patch has now been tested on VMD forced into identity mode.

 drivers/iommu/intel-iommu.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..7ffd252bf835 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3582,12 +3582,16 @@ static struct dmar_domain *get_private_domain_for_dev(struct device *dev)
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static bool iommu_need_mapping(struct device *dev)
 {
+	struct device *dma_dev = dev;
 	int ret;
 
 	if (iommu_dummy(dev))
 		return false;
 
-	ret = identity_mapping(dev);
+	if (dev_is_pci(dev))
+		dma_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+	ret = identity_mapping(dma_dev);
 	if (ret) {
 		u64 dma_mask = *dev->dma_mask;
 
@@ -3601,19 +3605,19 @@ static bool iommu_need_mapping(struct device *dev)
 		 * 32 bit DMA is removed from si_domain and fall back to
 		 * non-identity mapping.
 		 */
-		dmar_remove_one_dev_info(dev);
-		ret = iommu_request_dma_domain_for_dev(dev);
+		dmar_remove_one_dev_info(dma_dev);
+		ret = iommu_request_dma_domain_for_dev(dma_dev);
 		if (ret) {
 			struct iommu_domain *domain;
 			struct dmar_domain *dmar_domain;
 
-			domain = iommu_get_domain_for_dev(dev);
+			domain = iommu_get_domain_for_dev(dma_dev);
 			if (domain) {
 				dmar_domain = to_dmar_domain(domain);
 				dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
 			}
-			dmar_remove_one_dev_info(dev);
-			get_private_domain_for_dev(dev);
+			dmar_remove_one_dev_info(dma_dev);
+			get_private_domain_for_dev(dma_dev);
 		}
 
 		dev_info(dev, "32bit DMA uses non-identity mapping\n");
-- 
2.20.1
[PATCH v3] iommu/vt-d: consider real PCI device when checking if mapping is needed
From: Jon Derrick

The PCI devices handled by intel-iommu may have a DMA requester on
another bus, such as VMD subdevices needing to use the VMD endpoint. The
real DMA device is now used for the DMA mapping, but one case was missed
earlier: if the VMD device (and hence subdevices too) are under
IOMMU_DOMAIN_IDENTITY, mappings do not work.

Codepaths like intel_map_page() handle the IOMMU_DOMAIN_DMA case by
creating an iommu DMA mapping, and fall back on dma_direct_map_page()
for the IOMMU_DOMAIN_IDENTITY case. However, handling of the IDENTITY
case is broken when intel_map_page() handles a subdevice.

We observe that at iommu attach time, dmar_insert_one_dev_info() for the
subdevices will never set dev->archdata.iommu. This is because that
function uses find_domain() to check if there is already an IOMMU for
the device, and find_domain() then defers to the real DMA device which
does have one. Thus dmar_insert_one_dev_info() returns without assigning
dev->archdata.iommu.

Then, later:

1. intel_map_page() checks if an IOMMU mapping is needed by calling
   iommu_need_mapping() on the subdevice. identity_mapping() returns
   false because dev->archdata.iommu is NULL, so iommu_need_mapping()
   returns true, indicating that a mapping is needed.
2. __intel_map_single() is called to create the mapping.
3. __intel_map_single() calls find_domain(). This function now returns
   the IDENTITY domain corresponding to the real DMA device.
4. __intel_map_single() calls domain_get_iommu() on this "real" domain.
   A failure is hit and the entire operation is aborted, because this
   codepath is not intended to handle IDENTITY mappings:

	if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
		return NULL;

Fix this by using the real DMA device when checking if a mapping is
needed, while also considering the subdevice DMA mask. The IDENTITY case
will then directly fall back on dma_direct_map_page().

Reported-by: Daniel Drake
Fixes: b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Signed-off-by: Jon Derrick
Signed-off-by: Daniel Drake
---
 drivers/iommu/intel-iommu.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..7ffd252bf835 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3582,12 +3582,16 @@ static struct dmar_domain *get_private_domain_for_dev(struct device *dev)
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static bool iommu_need_mapping(struct device *dev)
 {
+	struct device *dma_dev = dev;
 	int ret;
 
 	if (iommu_dummy(dev))
 		return false;
 
-	ret = identity_mapping(dev);
+	if (dev_is_pci(dev))
+		dma_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+	ret = identity_mapping(dma_dev);
 	if (ret) {
 		u64 dma_mask = *dev->dma_mask;
 
@@ -3601,19 +3605,19 @@ static bool iommu_need_mapping(struct device *dev)
 		 * 32 bit DMA is removed from si_domain and fall back to
 		 * non-identity mapping.
 		 */
-		dmar_remove_one_dev_info(dev);
-		ret = iommu_request_dma_domain_for_dev(dev);
+		dmar_remove_one_dev_info(dma_dev);
+		ret = iommu_request_dma_domain_for_dev(dma_dev);
 		if (ret) {
 			struct iommu_domain *domain;
 			struct dmar_domain *dmar_domain;
 
-			domain = iommu_get_domain_for_dev(dev);
+			domain = iommu_get_domain_for_dev(dma_dev);
 			if (domain) {
 				dmar_domain = to_dmar_domain(domain);
 				dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
 			}
-			dmar_remove_one_dev_info(dev);
-			get_private_domain_for_dev(dev);
+			dmar_remove_one_dev_info(dma_dev);
+			get_private_domain_for_dev(dma_dev);
 		}
 
 		dev_info(dev, "32bit DMA uses non-identity mapping\n");
-- 
2.20.1
[PATCH v2] iommu/vt-d: consider real PCI device when checking if mapping is needed
From: Jon Derrick

The PCI devices handled by intel-iommu may have a DMA requester on
another bus, such as VMD subdevices needing to use the VMD endpoint. The
real DMA device is now used for the DMA mapping, but one case was missed
earlier: if the VMD device (and hence subdevices too) are under
IOMMU_DOMAIN_IDENTITY, mappings do not work.

Codepaths like intel_map_page() handle the IOMMU_DOMAIN_DMA case by
creating an iommu DMA mapping, and fall back on dma_direct_map_page()
for the IOMMU_DOMAIN_IDENTITY case. However, handling of the IDENTITY
case is broken when intel_map_page() handles a subdevice.

We observe that at iommu attach time, dmar_insert_one_dev_info() for the
subdevices will never set dev->archdata.iommu. This is because that
function uses find_domain() to check if there is already an IOMMU for
the device, and find_domain() then defers to the real DMA device which
does have one. Thus dmar_insert_one_dev_info() returns without assigning
dev->archdata.iommu.

Then, later:

1. intel_map_page() checks if an IOMMU mapping is needed by calling
   iommu_need_mapping() on the subdevice. identity_mapping() returns
   false because dev->archdata.iommu is NULL, so iommu_need_mapping()
   returns true, indicating that a mapping is needed.
2. __intel_map_single() is called to create the mapping.
3. __intel_map_single() calls find_domain(). This function now returns
   the IDENTITY domain corresponding to the real DMA device.
4. __intel_map_single() calls domain_get_iommu() on this "real" domain.
   A failure is hit and the entire operation is aborted, because this
   codepath is not intended to handle IDENTITY mappings:

	if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
		return NULL;

Fix this by using the real DMA device when checking if a mapping is
needed, while also considering the subdevice DMA mask. The IDENTITY case
will then directly fall back on dma_direct_map_page().
Reported-by: Daniel Drake
Fixes: b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Signed-off-by: Daniel Drake
---

Notes:
    v2: switch to Jon's approach instead.

    This problem was detected with a non-upstream patch
    "PCI: Add Intel remapped NVMe device support"
    (https://marc.info/?l=linux-ide&m=156015271021615&w=2)
    This patch creates PCI devices a bit like VMD, and hence I believe
    VMD would hit this class of problem for any cases where the VMD
    device is in the IDENTITY domain. (I presume the reason this bug was
    not seen already there is that it is in a DMA iommu domain). However
    this hasn't actually been tested on VMD (don't have the hardware) so
    if I've missed anything and/or it's not a real issue then feel free
    to drop this patch.

 drivers/iommu/intel-iommu.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..edbe2866b515 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3582,19 +3582,23 @@ static struct dmar_domain *get_private_domain_for_dev(struct device *dev)
 /* Check if the dev needs to go through non-identity map and unmap process.*/
 static bool iommu_need_mapping(struct device *dev)
 {
+	u64 dma_mask, required_dma_mask;
 	int ret;
 
 	if (iommu_dummy(dev))
 		return false;
 
-	ret = identity_mapping(dev);
-	if (ret) {
-		u64 dma_mask = *dev->dma_mask;
+	dma_mask = *dev->dma_mask;
+	if (dev->coherent_dma_mask && dev->coherent_dma_mask < dma_mask)
+		dma_mask = dev->coherent_dma_mask;
+	required_dma_mask = dma_direct_get_required_mask(dev);
 
-		if (dev->coherent_dma_mask && dev->coherent_dma_mask < dma_mask)
-			dma_mask = dev->coherent_dma_mask;
+	if (dev_is_pci(dev))
+		dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
 
-		if (dma_mask >= dma_direct_get_required_mask(dev))
+	ret = identity_mapping(dev);
+	if (ret) {
+		if (dma_mask >= required_dma_mask)
 			return false;
 
 	/*
-- 
2.20.1
Re: [PATCH] iommu/vt-d: consider real PCI device when checking if mapping is needed
On Wed, Feb 12, 2020 at 12:03 AM Derrick, Jonathan wrote:
> On Tue, 2020-02-11 at 17:13 +0800, Daniel Drake wrote:
> > The PCI devices handled by intel-iommu may have a DMA requester on
> > another bus, such as VMD subdevices needing to use the VMD endpoint.
> >
> > The real DMA device is now used for the DMA mapping, but one case was
> > missed earlier, when allocating memory through (e.g.) intel_map_page().
> > Confusion ensues if the iommu domain type for the subdevice does not
> > match the iommu domain type for the real DMA device.
> Is there a way to force this situation for my testing?

I think you could hack device_def_domain_type() to return
IOMMU_DOMAIN_IDENTITY for the real device, and IOMMU_DOMAIN_DMA for the
subdevice.

But I got curious as to why my subdevice might be IOMMU_DOMAIN_DMA, so I
checked, and found out that my assumptions weren't quite correct. The
subdevice has no iommu domain recorded at all. Before applying any
patches here, what's actually happening is:

1. Real DMA device gets registered with the iommu as
   IOMMU_DOMAIN_IDENTITY using si_domain.

2. When the subdevice gets registered, the relevant code flow is inside
   dmar_insert_one_dev_info():
   - it creates a new device_domain_info and domain->domain.type ==
     IDENTITY, but
   - it then calls find_domain(dev) which successfully defers to the
     real DMA device and returns the real DMA device's dmar_domain
   - since found != NULL (dmar_domain was found for this device) the
     function bails out before setting dev->archdata.iommu

The results at this point are that the real DMA device is fully
registered as IOMMU_DOMAIN_IDENTITY using si_domain, but all of the
subdevices will always have dev->archdata.iommu == NULL.

Then when intel_map_page() is reached for the subdevice, it calls
iommu_need_mapping() for the subdevice. This calls identity_mapping() on
the subdevice, but that will always return 0 because
dev->archdata.iommu == NULL.
Following on from there, iommu_need_mapping() will then *always* return
true (mapping needed) for subdevices. That will then lead to the
situation described in my last mail, where later down the allocation
chain the request for creating a mapping will be handed towards the real
DMA dev, but that will then fail because the real DMA dev is using
IOMMU_DOMAIN_IDENTITY where no mapping is needed.

Unless I missed anything, that seems pretty clear to me now, and I guess
the only reason why you may not have already faced this in the vmd case
is if the real DMA device is not using IOMMU_DOMAIN_IDENTITY. (To check
this, you could log the value of the real dev domain->domain.type in
dmar_insert_one_dev_info(), and/or observe the return value of
identity_mapping() in iommu_need_mapping() for the real dev).

In any case it seems increasingly clear to me that iommu_need_mapping()
should be consulting the real DMA device in the identity_mapping check,
and your patch likewise solves the problem faced here.

Thanks
Daniel
[PATCH] iommu/vt-d: consider real PCI device when checking if mapping is needed
The PCI devices handled by intel-iommu may have a DMA requester on
another bus, such as VMD subdevices needing to use the VMD endpoint.

The real DMA device is now used for the DMA mapping, but one case was
missed earlier, when allocating memory through (e.g.) intel_map_page().
Confusion ensues if the iommu domain type for the subdevice does not
match the iommu domain type for the real DMA device.

For example, take the case of the subdevice handled by intel_map_page()
in a IOMMU_DOMAIN_DMA, with the real DMA device in a
IOMMU_DOMAIN_IDENTITY:

1. intel_map_page() checks if an IOMMU mapping is needed by calling
   iommu_need_mapping() on the subdevice. Result: mapping is needed.
2. __intel_map_single() is called to create the mapping:
   - __intel_map_single() calls find_domain(). This function now
     returns the IDENTITY domain corresponding to the real DMA device.
   - __intel_map_single() then calls domain_get_iommu() on this "real"
     domain. A failure is hit and the entire operation is aborted,
     because this codepath is not intended to handle IDENTITY mappings:

	if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
		return NULL;

Fix this by using the real DMA device when checking if a mapping is
needed. The above case will then directly fall back on
dma_direct_map_page().

Fixes: 2b0140c69637 ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Signed-off-by: Daniel Drake
---

Notes:
    This problem was detected with a non-upstream patch
    "PCI: Add Intel remapped NVMe device support"
    (https://marc.info/?l=linux-ide&m=156015271021615&w=2)
    This patch creates PCI devices in the same way as VMD, and hence I
    believe VMD would hit this class of problem for any cases where the
    iommu domain type may mismatch between subdevice and real device,
    which we have run into here. However this hasn't actually been
    tested on VMD (don't have the hardware) so if I've missed anything
    and/or it's not a real issue then feel free to drop this patch.
 drivers/iommu/intel-iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..713810f8350c 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3587,6 +3587,9 @@ static bool iommu_need_mapping(struct device *dev)
 	if (iommu_dummy(dev))
 		return false;
 
+	if (dev_is_pci(dev))
+		dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
 	ret = identity_mapping(dev);
 	if (ret) {
 		u64 dma_mask = *dev->dma_mask;
-- 
2.20.1
Re: [PATCH] iommu/intel-iommu: set as DUMMY_DEVICE_DOMAIN_INFO if no IOMMU
On Sat, Feb 8, 2020 at 2:30 PM Lu Baolu wrote:
> > The devices under segment 1 are fake devices produced by
> > intel-nvme-remap mentioned here https://lkml.org/lkml/2020/2/5/139
>
> Has this series been accepted?

Sadly not - we didn't find any consensus on the right approach, and
further conversation is hindered by the questionable hardware design and
lack of further communication from Intel in explaining it. It's one of
the exceptional cases where we carry a significant non-upstream kernel
change, because unfortunately most of the current day consumer PCs we
work with have this BIOS option on by default and hence unmodified Linux
can't access the storage devices. On the offchance that you have some
energy to bubble this up inside Intel please let me know and we will
talk more... :)

That said, this thread was indeed only opened since we thought we had
found a more general issue that would potentially affect other cases.
The issue described does seem to highlight a possible imperfection in
code flow, although it may also be reasonable to say that (without crazy
downstream patches at play) if intel_iommu_add_device() fails then we
have bigger problems to face anyway...

> Will this help here? https://www.spinics.net/lists/iommu/msg41300.html

Yes! Very useful info and a real improvement. We'll follow the same
approach here. That does solve the problem we were facing, although we
needed one more fixup which I've sent separately for your consideration:

  iommu/vt-d: consider real PCI device when checking if mapping is needed

Thanks!
Daniel
[PATCH] iommu/dmar: ignore devices with out-of-spec domain number
VMD subdevices are created with a PCI domain ID of 0x10000 or higher.

These subdevices are also handled like all other PCI devices by
dmar_pci_bus_notifier(). However, when dmar_alloc_pci_notify_info()
takes records of such devices, it truncates the domain ID to a u16
value (in info->seg). The device at (e.g.) 10000:00:02.0 is then
treated by the DMAR code as if it were 0000:00:02.0.

In the unlucky event that a real device also exists at 0000:00:02.0
and also has a device-specific entry in the DMAR table,
dmar_insert_dev_scope() will crash on:

    BUG_ON(i >= devices_cnt);

That's basically a sanity check that only one PCI device matches a
single DMAR entry; in this case we seem to have two matching devices.

Fix this by ignoring devices that have a domain number higher than
what can be looked up in the DMAR table.

This problem was carefully diagnosed by Jian-Hong Pan.

Signed-off-by: Daniel Drake
---

Notes:
    This problem was detected with a non-upstream patch
    "PCI: Add Intel remapped NVMe device support"
    (https://marc.info/?l=linux-ide=156015271021615=2)

    This patch creates PCI devices in the same way as VMD, and hence I
    believe VMD would hit the same problem that we encountered here,
    when a VMD-using product comes along that meets the mentioned
    conditions. However this hasn't actually been tested on VMD (don't
    have the hardware) so if I've missed anything and/or it's not a
    real issue then feel free to drop this patch.

 drivers/iommu/dmar.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 071bb42bbbc5..8f94c817a7b5 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -28,6 +28,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -128,6 +129,13 @@ dmar_alloc_pci_notify_info(struct pci_dev *dev, unsigned long event)
 
 	BUG_ON(dev->is_virtfn);
 
+	/*
+	 * Ignore devices that have a domain number higher than what can
+	 * be looked up in DMAR, e.g. VMD subdevices with domain 0x10000
+	 */
+	if (pci_domain_nr(dev->bus) > U16_MAX)
+		return NULL;
+
 	/* Only generate path[] for device addition event */
 	if (event == BUS_NOTIFY_ADD_DEVICE)
 		for (tmp = dev; tmp; tmp = tmp->bus->self)
-- 
2.20.1
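The truncation at the heart of this bug is easy to demonstrate in
isolation: storing a PCI domain number of 0x10000 into a 16-bit field
silently yields 0, which is exactly how the VMD subdevice comes to
alias segment 0000. A userspace sketch (only `seg` mirrors the kernel's
`info->seg`; the surrounding names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the u16 info->seg field filled by dmar_alloc_pci_notify_info();
 * everything else is an illustrative userspace model, not kernel code. */
struct notify_info {
	uint16_t seg;
};

/* Without a range check, assigning a 32-bit PCI domain number into the
 * 16-bit seg field silently truncates: 0x10000 becomes 0. */
static uint16_t truncated_seg(uint32_t domain)
{
	return (uint16_t)domain;
}

/* With the guard from the patch, out-of-spec domains are rejected
 * before they can alias segment 0. Returns 0 on rejection, 1 if the
 * domain was recorded. */
static int record_device(struct notify_info *info, uint32_t domain)
{
	if (domain > UINT16_MAX)
		return 0;
	info->seg = (uint16_t)domain;
	return 1;
}
```

With the guard in place, an out-of-spec domain leaves the record
untouched instead of masquerading as segment 0.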
Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v3)
Hi,

On Tue, May 30, 2017 at 3:38 PM, Nath, Arindam wrote:
>>-Original Message-
>>From: Joerg Roedel [mailto:j...@8bytes.org]
>>Sent: Monday, May 29, 2017 8:09 PM
>>To: Nath, Arindam ; Lendacky, Thomas
>>Cc: iommu@lists.linux-foundation.org; amd-...@lists.freedesktop.org;
>>Deucher, Alexander ; Bridgman, John ; dr...@endlessm.com;
>>Suthikulpanit, Suravee ; li...@endlessm.com; Craig Stein ;
>>mic...@daenzer.net; Kuehling, Felix ; sta...@vger.kernel.org
>>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v3)
>>
>>Hi Arindam,
>>
>>I met Tom Lendacky last week in Nuremberg and he told me he is
>>working on the same area of the code that this patch is for. His reason
>>for touching this code was to solve some locking problems. Maybe you two
>>can work together on a joint approach to improve this?
>
> Sure Joerg, I will work with Tom.

What was the end result here? I see that the code has been reworked in
4.13, so your original patch no longer applies. Is the reworked version
also expected to solve the original issue?

Thanks
Daniel
DMAR table missing, Intel IOMMU not available
Hi,

We're working with a number of platforms based on Intel Apollo Lake,
and there are some clues suggesting that the IR-PCI-MSI irqchip
functionality would be able to get us out of a tricky situation
described at:

  ath9k hardware corrupts MSI Message Data, raises wrong interrupt
  http://marc.info/?l=linux-pci=150238260726797=2

However, the affected platforms do not have a DMAR table present. And I
read in the Intel® Virtualization Technology for Directed I/O spec:

  "The system BIOS is responsible for detecting the remapping hardware
  functions in the platform and for locating the memory-mapped
  remapping hardware registers in the host system address space. The
  BIOS reports the remapping hardware units in a platform to system
  software through the DMA Remapping Reporting (DMAR) ACPI table."

Unfortunately, since the BIOS authors have not done what the spec asks,
this nice hardware functionality is completely unavailable :(

For now we will have to find an alternative approach to solve the
problem (the BIOS can't be changed), but I am curious whether there are
plans to have Linux automatically probe the IOMMU through some other
means, given that BIOS authors are apparently not providing the DMAR
table in many cases.

Thanks
Daniel
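The dependency described above is purely a table-discovery one:
software finds VT-d by locating an ACPI table whose signature is
"DMAR". As a rough sketch of the kind of check firmware-parsing code
performs (the struct below follows the common ACPI table header layout
from the ACPI spec, trimmed for brevity; this is a userspace model, not
the kernel's ACPI code):

```c
#include <string.h>

/* First bytes of every ACPI System Description Table, including DMAR.
 * Checksum, OEM IDs, etc. are omitted here for brevity. */
struct acpi_table_header {
	char signature[4];	/* "DMAR" for DMA Remapping Reporting */
	unsigned int length;
	unsigned char revision;
};

/* Returns 1 if the header announces a DMAR table. If the BIOS never
 * publishes such a table, there is nothing to find and the remapping
 * hardware stays unused - which is exactly the situation described. */
static int is_dmar_table(const struct acpi_table_header *hdr)
{
	return memcmp(hdr->signature, "DMAR", 4) == 0;
}
```

This is why no kernel-side cleverness helps by default: the probe keys
off a table the firmware simply never provided.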
Re: [PATCH] iommu/amd: flush IOTLB for specific domains only
On Wed, Apr 5, 2017 at 9:01 AM, Nath, Arindam <arindam.n...@amd.com> wrote:
> >-Original Message-
> >From: Daniel Drake [mailto:dr...@endlessm.com]
> >Sent: Thursday, March 30, 2017 7:15 PM
> >To: Nath, Arindam
> >Cc: j...@8bytes.org; Deucher, Alexander; Bridgman, John;
> >amd-g...@lists.freedesktop.org; iommu@lists.linux-foundation.org;
> >Suthikulpanit, Suravee; Linux Upstreaming Team
> >Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only
> >
> >On Thu, Mar 30, 2017 at 12:23 AM, Nath, Arindam <arindam.n...@amd.com>
> >wrote:
> >> Daniel, did you get chance to test this patch?
> >
> >Not yet. Should we test it alone or alongside "PCI: Blacklist AMD
> >Stoney GPU devices for ATS"?
>
> Daniel, any luck with this patch?

Sorry for the delay. The patch appears to be working fine.

Daniel
Re: [PATCH] iommu/amd: flush IOTLB for specific domains only
Hi Arindam,

You CC'd me on this - does this mean that it is a fix for the issue
described in the thread "amd-iommu: can't boot with amdgpu, AMD-Vi:
Completion-Wait loop timed out"?

Thanks
Daniel

On Mon, Mar 27, 2017 at 12:17 AM, Arindam Nath wrote:
> From: Arindam Nath
>
> The idea behind flush queues is to defer the IOTLB flushing
> for domains for which the mappings are no longer valid. We
> add such domains in queue_add(), and when the queue size
> reaches FLUSH_QUEUE_SIZE, we perform __queue_flush().
>
> Since we have already taken the lock before __queue_flush()
> is called, we need to make sure the IOTLB flushing is
> performed as quickly as possible.
>
> In the current implementation, we perform IOTLB flushing
> for all domains irrespective of which ones were actually
> added to the flush queue initially. This can be quite
> expensive, especially for domains for which unmapping is
> not required at this point in time.
>
> This patch makes use of the domain information in
> 'struct flush_queue_entry' to make sure we only flush
> IOTLBs for domains that need it, skipping others.
>
> Signed-off-by: Arindam Nath
> ---
>  drivers/iommu/amd_iommu.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 98940d1..6a9a048 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2227,15 +2227,16 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev)
>
>  static void __queue_flush(struct flush_queue *queue)
>  {
> -	struct protection_domain *domain;
> -	unsigned long flags;
>  	int idx;
>
> -	/* First flush TLB of all known domains */
> -	spin_lock_irqsave(&amd_iommu_pd_lock, flags);
> -	list_for_each_entry(domain, &amd_iommu_pd_list, list)
> -		domain_flush_tlb(domain);
> -	spin_unlock_irqrestore(&amd_iommu_pd_lock, flags);
> +	/* First flush TLB of all domains which were added to flush queue */
> +	for (idx = 0; idx < queue->next; ++idx) {
> +		struct flush_queue_entry *entry;
> +
> +		entry = queue->entries + idx;
> +
> +		domain_flush_tlb(&entry->dma_dom->domain);
> +	}
>
>  	/* Wait until flushes have completed */
>  	domain_flush_complete(NULL);
> --
> 1.9.1
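The before/after behaviour of the patch quoted above can be modelled in
a few lines of userspace C: a fixed-size queue of entries, where
flushing walks only the entries actually queued rather than every known
domain. All names below are a simplified model, not the kernel's real
structures:

```c
#include <assert.h>

#define FLUSH_QUEUE_SIZE 4

/* Stand-in for a protection domain; 'flushed' records whether an
 * IOTLB flush was issued for it. */
struct domain {
	int flushed;
};

struct flush_queue_entry {
	struct domain *dom;
};

struct flush_queue {
	int next;	/* number of occupied entries */
	struct flush_queue_entry entries[FLUSH_QUEUE_SIZE];
};

/* Defer a flush: remember the domain whose mappings became stale. */
static void queue_add(struct flush_queue *q, struct domain *d)
{
	if (q->next < FLUSH_QUEUE_SIZE)
		q->entries[q->next++].dom = d;
}

/* Mirrors the patched __queue_flush(): flush only queued domains,
 * skipping every domain that was never added. */
static void queue_flush(struct flush_queue *q)
{
	for (int idx = 0; idx < q->next; ++idx)
		q->entries[idx].dom->flushed = 1;
	q->next = 0;
}
```

The pre-patch behaviour would have marked every domain flushed; the
per-entry walk leaves untouched domains alone, which is the whole cost
saving being claimed.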
Re: amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait loop timed out
Hi Joerg,

Thanks for looking into this. We confirm that this workaround avoids
the iommu log spam and that amdgpu appears to be working fine with it.

Daniel

On Wed, Mar 22, 2017 at 5:22 AM, j...@8bytes.org wrote:
> On Tue, Mar 21, 2017 at 04:30:55PM +0000, Deucher, Alexander wrote:
>> > I am preparing a debug-patch that disables ATS for these GPUs so
>> > someone with such a chip can test it.
>>
>> Thanks Joerg.
>
> Here is a debug patch, using the hard hammer of disabling the use of
> ATS completely in the AMD IOMMU driver. If it fixes the issue I am
> going to write a more upstreamable version.
>
> But for now, please test if this fixes the issue.
>
> Thanks,
>
> Joerg
>
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 98940d1..f019aa6 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -467,7 +467,7 @@ static int iommu_init_device(struct device *dev)
>  		struct amd_iommu *iommu;
>
>  		iommu = amd_iommu_rlookup_table[dev_data->devid];
> -		dev_data->iommu_v2 = iommu->is_iommu_v2;
> +		dev_data->iommu_v2 = false;
>  	}
>
>  	dev->archdata.iommu = dev_data;
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 6130278..41d0e64 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -171,7 +171,7 @@ int amd_iommus_present;
>
>  /* IOMMUs have a non-present cache? */
>  bool amd_iommu_np_cache __read_mostly;
> -bool amd_iommu_iotlb_sup __read_mostly = true;
> +bool amd_iommu_iotlb_sup __read_mostly = false;
>
>  u32 amd_iommu_max_pasid __read_mostly = ~0;
Re: amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait loop timed out
Hi,

On Mon, Mar 13, 2017 at 2:01 PM, Deucher, Alexander wrote:
> > We are unable to boot Acer Aspire E5-553G (AMD FX-9800P RADEON R7) nor
> > Acer Aspire E5-523 with standard configurations because during boot
> > the screen is flooded with the following error message over and over:
> >
> > AMD-Vi: Completion-Wait loop timed out
>
> We ran into similar issues and bisected it to commit
> b1516a14657acf81a587e9a6e733a881625eee53. I'm not too familiar with the
> IOMMU hardware to know if this is an iommu or display driver issue yet.

We can confirm that reverting this commit solves the issue.

Given that the commit is an optimization, that it has introduced a
regression on multiple platforms, and that it has been like this for 8
months, it would be common practice to now revert the patch upstream
until the regression is fixed. Could you please send a new patch to do
this?

Also, we would be happy to test any real solutions to this issue while
we still have the affected units in hand.

Thanks
Daniel
amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait loop timed out
Hi,

We are unable to boot the Acer Aspire E5-553G (AMD FX-9800P RADEON R7)
or the Acer Aspire E5-523 with standard configurations, because during
boot the screen is flooded with the following error message over and
over:

  AMD-Vi: Completion-Wait loop timed out

We have left the system for quite a while but the message spam does not
stop and the system doesn't complete the boot sequence. We have
reproduced this on Linux 4.8 and Linux 4.10.

To avoid this, we can boot with iommu=soft or just disable the amdgpu
display driver.

Looks like this may also affect the HP 15-ba012no:
https://bugzilla.redhat.com/show_bug.cgi?id=1409201

Earlier during boot the iommu is detected as:

[    1.274518] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[    1.274519] AMD-Vi: Extended features (0x37ef22294ada):
[    1.274519]  PPR NX GT IA GA PC GA_vAPIC
[    1.274523] AMD-Vi: Interrupt remapping enabled
[    1.274523] AMD-Vi: virtual APIC enabled
[    1.275144] AMD-Vi: Lazy IO/TLB flushing enabled
[    1.276498] perf: AMD NB counters detected
[    1.278096] LVT offset 0 assigned for vector 0x400
[    1.278963] perf: AMD IBS detected (0x07ff)
[    1.278977] perf: amd_iommu: Detected. (0 banks, 0 counters/bank)

Any suggestions for how we can fix this, or get more useful debug info?

Thanks
Daniel
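For readers unfamiliar with the message: it comes from the driver
polling for acknowledgement of a completion-wait command and giving up
after a bounded number of iterations. A userspace sketch of that
pattern (the names and loop bound are illustrative assumptions, not the
kernel's actual code):

```c
#include <stdio.h>

/* Illustrative bound; the real driver uses its own timeout policy. */
#define LOOP_TIMEOUT 100000

/* Poll *sem until the (simulated) hardware sets it, or give up.
 * Returns 0 on completion, -1 on timeout. When the hardware never
 * acknowledges, every flush emits the timeout message - which is the
 * flood seen during boot. */
static int wait_on_sem(volatile const int *sem)
{
	for (int i = 0; i < LOOP_TIMEOUT; ++i) {
		if (*sem)
			return 0;
	}
	fprintf(stderr, "Completion-Wait loop timed out\n");
	return -1;
}
```

Since the wait is retried for every batch of queued flushes, a device
that never responds (as amdgpu's ATS traffic apparently provoked here)
turns one hardware fault into an endless stream of the same line.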