Re: [GIT PULL] dma-mapping fixes for 5.3-rc
The pull request you sent on Sun, 25 Aug 2019 07:50:10 +0900:

> git://git.infradead.org/users/hch/dma-mapping.git tags/dma-mapping-5.3-5

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/e67095fd2f727c35e510d831c588696f2138a1bb

Thank you!

--
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker
Re: [BISECTED REGRESSION 5.3-rc2] Marvell 88SE9128 SATA controller unusable with intel_iommu=on
Hi,

I am looking into this and will come up with a fix.

Best regards,
Baolu

On 8/25/19 12:20 AM, Stijn Tintel wrote:
> Hi,
>
> There is a bug in kernel 5.3-rc2 and later that breaks my Marvell
> 88SE9128 SATA controller. The problem does not occur when I boot with
> intel_iommu=off. This seems to be a regression of
> https://bugzilla.kernel.org/show_bug.cgi?id=42679. A quirk was added to
> fix this, in cc346a4714a59d08c118e8f33fd86692d3563133. This quirk is
> still in place, but it appears that it is no longer working.
>
> [...]
>
> Stijn
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 2/2] dma-contiguous: Use fallback alloc_pages for single pages
On Fri, Aug 23, 2019 at 09:56:52PM +0900, Masahiro Yamada wrote:
> + linux-mmc, Ulf Hansson, Adrian Hunter
>
> ADMA of SDHCI is not working
> since bd2e75633c8012fc8a7431c82fda66237133bf7e

Does it work for you with this commit:

http://git.infradead.org/users/hch/dma-mapping.git/commitdiff/90ae409f9eb3bcaf38688f9ec22375816053a08e
[GIT PULL] dma-mapping fixes for 5.3-rc
The following changes since commit d1abaeb3be7b5fa6d7a1fbbd2e14e3310005c4c1:

  Linux 5.3-rc5 (2019-08-18 14:31:08 -0700)

are available in the Git repository at:

  git://git.infradead.org/users/hch/dma-mapping.git tags/dma-mapping-5.3-5

for you to fetch changes up to 90ae409f9eb3bcaf38688f9ec22375816053a08e:

  dma-direct: fix zone selection after an unaddressable CMA allocation (2019-08-21 07:14:10 +0900)

----------------------------------------------------------------
dma-mapping fixes for 5.3-rc

Two fixes for regressions in this merge window:

 - select the Kconfig symbols for the noncoherent dma arch helpers
   on arm if swiotlb is selected, not just for LPAE, to not break
   the Xen build that uses swiotlb indirectly through swiotlb-xen
 - fix the page allocator fallback in dma_alloc_contiguous if the
   CMA allocation fails

----------------------------------------------------------------
Christoph Hellwig (2):
      arm: select the dma-noncoherent symbols for all swiotlb builds
      dma-direct: fix zone selection after an unaddressable CMA allocation

 arch/arm/Kconfig               |  4 ++--
 arch/arm/mm/Kconfig            |  4 ++--
 drivers/iommu/dma-iommu.c      |  3 +++
 include/linux/dma-contiguous.h |  5 +----
 kernel/dma/contiguous.c        |  8 ++++++--
 kernel/dma/direct.c            | 10 +++++-----
 6 files changed, 19 insertions(+), 15 deletions(-)
Re: [PATCH V5 1/5] iommu/amd: Remove unnecessary locking from AMD iommu driver
Thanks for the explanation, Tom. It might make sense to add a condensed
version of it to the commit log for the next iteration.
Re: cleanup the dma_pgprot handling
On Fri, Aug 23, 2019 at 09:58:04PM +, Paul Burton wrote:
> So I believe uncached & uncached accelerated are another case like that
> described above - they're 2 different CCAs but the same "access type",
> namely uncached.
>
> Section 4.9 then goes on to forbid mixing access types, but not CCAs.
>
> It would be nice if the precise mapping from CCA to access type was
> provided, but I don't see that anywhere. I can check with the
> architecture team to be sure, but to my knowledge we're fine to mix
> access via kseg1 (ie. uncached) & mappings with CCA=7 (uncached
> accelerated).

Ok. Looks like we can keep it then, and I'll add a comment to the code
with the above reference.
[BISECTED REGRESSION 5.3-rc2] Marvell 88SE9128 SATA controller unusable with intel_iommu=on
Hi,

There is a bug in kernel 5.3-rc2 and later that breaks my Marvell
88SE9128 SATA controller. The problem does not occur when I boot with
intel_iommu=off. This seems to be a regression of
https://bugzilla.kernel.org/show_bug.cgi?id=42679. A quirk was added to
fix this, in cc346a4714a59d08c118e8f33fd86692d3563133. This quirk is
still in place, but it appears that it is no longer working.

aug 23 18:04:17 taz kernel: DMAR: DRHD: handling fault status reg 2
aug 23 18:04:17 taz kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
aug 23 18:04:17 taz kernel: ata9: SATA link down (SStatus 0 SControl 300)
aug 23 18:04:17 taz kernel: ata8: SATA link down (SStatus 0 SControl 300)
aug 23 18:04:17 taz kernel: ata10: SATA link down (SStatus 0 SControl 300)
aug 23 18:04:17 taz kernel: ata11: SATA link down (SStatus 0 SControl 300)
aug 23 18:04:17 taz kernel: ata5.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 32), AA
aug 23 18:04:17 taz kernel: DMAR: [DMA Read] Request device [09:00.1] fault addr fff0 [fault reason 02] Present bit in context entry is clear
...
aug 23 18:04:17 taz kernel: sd 5:0:0:0: [sde] Attached SCSI disk
aug 23 18:04:17 taz kernel: ata14.00: qc timeout (cmd 0xa1)
aug 23 18:04:17 taz kernel: ata7.00: qc timeout (cmd 0xec)
aug 23 18:04:17 taz kernel: ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
aug 23 18:04:17 taz kernel: ata7.00: failed to IDENTIFY (I/O error, err_mask=0x4)
aug 23 18:04:17 taz kernel: ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
aug 23 18:04:17 taz kernel: ata7: link is slow to respond, please be patient (ready=0)
aug 23 18:04:17 taz kernel: ata7: COMRESET failed (errno=-16)
aug 23 18:04:17 taz kernel: ata14.00: qc timeout (cmd 0xa1)
aug 23 18:04:17 taz kernel: ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
aug 23 18:04:17 taz kernel: ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
aug 23 18:04:17 taz kernel: ata7: link is slow to respond, please be patient (ready=0)
aug 23 18:04:17 taz kernel: ata7: COMRESET failed (errno=-16)
aug 23 18:04:17 taz kernel: ata7: link is slow to respond, please be patient (ready=0)
aug 23 18:04:17 taz kernel: ata14.00: qc timeout (cmd 0xa1)
aug 23 18:04:17 taz kernel: ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
aug 23 18:04:17 taz kernel: ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
aug 23 18:04:17 taz kernel: ata7: COMRESET failed (errno=-16)
aug 23 18:04:17 taz kernel: ata7: limiting SATA link speed to 3.0 Gbps
aug 23 18:04:17 taz kernel: ata7: COMRESET failed (errno=-16)
aug 23 18:04:17 taz kernel: ata7: reset failed, giving up

This is the outcome of git bisect:

557529494d79f3f1fadd486dd18d2de0b19be4da is the first bad commit
commit 557529494d79f3f1fadd486dd18d2de0b19be4da
Author: Lu Baolu
Date:   Tue Jul 9 13:22:45 2019 +0800

    iommu/vt-d: Avoid duplicated pci dma alias consideration

    As we have abandoned the home-made lazy domain allocation and
    delegated the DMA domain life cycle up to the default domain
    mechanism defined in the generic iommu layer, we needn't consider
    pci alias anymore when mapping/unmapping the context entries.

    Without this fix, we see kernel NULL pointer dereference during
    pci device hot-plug test.

    Cc: Ashok Raj
    Cc: Jacob Pan
    Cc: Kevin Tian
    Fixes: fa954e6831789 ("iommu/vt-d: Delegate the dma domain to upper layer")
    Signed-off-by: Lu Baolu
    Reported-and-tested-by: Xu Pengfei
    Signed-off-by: Joerg Roedel

:04 04 010e7057b8401481e7258948786a2658f9f14037 18aeac50a60d8b8424fcdccd0b3118f565ce3909 M	drivers

Stijn
[PATCH v1 1/2] iommu: pass cell_count = -1 to of_for_each_phandle with cells_name
Currently of_for_each_phandle ignores the cell_count parameter when a
cells_name is given. I intend to change that and let the iterator fall
back to a non-negative cell_count if the cells_name property is missing
in the referenced node.

To keep the iteration behaviour of existing of_for_each_phandle users
unchanged, fix them to pass cell_count = -1 when a cells_name is also
given, which yields the expected behaviour both with and without my
change.

Signed-off-by: Uwe Kleine-König
---
 drivers/iommu/arm-smmu.c     | 2 +-
 drivers/iommu/mtk_iommu_v1.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 64977c131ee6..81b7734654b3 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -333,7 +333,7 @@ static int __find_legacy_master_phandle(struct device *dev, void *data)
 	int err;
 
 	of_for_each_phandle(it, err, dev->of_node, "mmu-masters",
-			    "#stream-id-cells", 0)
+			    "#stream-id-cells", -1)
 		if (it->node == np) {
 			*(void **)data = dev;
 			return 1;
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index abeeac488372..68d1de70de0c 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -426,7 +426,7 @@ static int mtk_iommu_add_device(struct device *dev)
 	int err;
 
 	of_for_each_phandle(&it, err, dev->of_node, "iommus",
-			    "#iommu-cells", 0) {
+			    "#iommu-cells", -1) {
 		int count = of_phandle_iterator_args(&it, iommu_spec.args,
 						     MAX_PHANDLE_ARGS);
 		iommu_spec.np = of_node_get(it.node);
-- 
2.20.1
[PATCH v1 2/2] of: Let of_for_each_phandle fallback to non-negative cell_count
Referencing device tree nodes from a property allows passing arguments.
This is for example used for referencing gpios. This looks as follows:

	gpio_ctrl: gpio-controller {
		#gpio-cells = <2>;
		...
	}

	someothernode {
		gpios = <&gpio_ctrl 5 0 &gpio_ctrl 3 0>;
		...
	}

To know the number of arguments, this must either be fixed, or the
referenced node is checked for a $cells_name (here: "#gpio-cells")
property, and with this information the start of the second reference
can be determined.

Currently regulators are referenced with no additional arguments. To
allow some optional arguments without having to change all referenced
nodes, this change introduces a way to specify a default cell_count. So
when a phandle is parsed we check for the $cells_name property and use
it as before if present. If it is not present we fall back to
cell_count if non-negative, and only fail if cell_count is smaller than
zero.

Signed-off-by: Uwe Kleine-König
---
 drivers/of/base.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 55e7f5bb0549..2f25d2dfecfa 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1335,11 +1335,20 @@ int of_phandle_iterator_next(struct of_phandle_iterator *it)
 		if (of_property_read_u32(it->node, it->cells_name,
					 &count)) {
-			pr_err("%pOF: could not get %s for %pOF\n",
-			       it->parent,
-			       it->cells_name,
-			       it->node);
-			goto err;
+			/*
+			 * If both cell_count and cells_name is given,
+			 * fall back to cell_count in absence
+			 * of the cells_name property
+			 */
+			if (it->cell_count >= 0) {
+				count = it->cell_count;
+			} else {
+				pr_err("%pOF: could not get %s for %pOF\n",
+				       it->parent,
+				       it->cells_name,
+				       it->node);
+				goto err;
+			}
 		}
 	} else {
 		count = it->cell_count;
@@ -1505,7 +1514,7 @@ int of_parse_phandle_with_args(const struct device_node *np, const char *list_name,
 {
 	if (index < 0)
 		return -EINVAL;
-	return __of_parse_phandle_with_args(np, list_name, cells_name, 0,
+	return __of_parse_phandle_with_args(np, list_name, cells_name, -1,
 					    index, out_args);
 }
 EXPORT_SYMBOL(of_parse_phandle_with_args);
@@ -1588,7 +1597,7 @@ int of_parse_phandle_with_args_map(const struct device_node *np,
 	if (!pass_name)
 		goto free;
 
-	ret = __of_parse_phandle_with_args(np, list_name, cells_name, 0, index,
+	ret = __of_parse_phandle_with_args(np, list_name, cells_name, -1, index,
 					   out_args);
 	if (ret)
 		goto free;
@@ -1756,7 +1765,7 @@ int of_count_phandle_with_args(const struct device_node *np, const char *list_name,
 	struct of_phandle_iterator it;
 	int rc, cur_index = 0;
 
-	rc = of_phandle_iterator_init(&it, np, list_name, cells_name, 0);
+	rc = of_phandle_iterator_init(&it, np, list_name, cells_name, -1);
 	if (rc)
 		return rc;
-- 
2.20.1
Re: [PATCH v11 00/23] MT8183 IOMMU SUPPORT
On Sat, Aug 24, 2019 at 11:01:45AM +0800, Yong Wu wrote:
> This patchset mainly adds support for mt8183 IOMMU and SMI.

Thanks for persevering with this, and sorry it took me so long to get
to grips with the io-pgtable changes.

Joerg -- this is good for you to pick up from my side now, but if you
run into any fiddly conflicts with any of my other changes then I'm
happy to resolve them on a separate branch for you to pull. Just let me
know.

Cheers,
Will
Re: [PATCH V5 1/5] iommu/amd: Remove unnecessary locking from AMD iommu driver
> I have to admit I don't fully understand the concurrency issues here,
> but neither do I understand what the mutex you removed might have
> helped to start with.

Each range in the page tables is protected by the IO virtual address
(IOVA) allocator. The iommu driver allocates an IOVA range using locks
before it writes to a page table range, so the IOVA allocator acts like
a lock on a specific range of the page tables. That way we can handle
most of the concurrency issues in the IOVA allocator and avoid locking
while writing to a range in the page tables.

However, because we have multiple levels of pages we might have to
allocate a middle page (a PMD) which covers more than the IOVA range we
have allocated. To solve this we could use locks:

	//pseudo code
	lock_page_table()
	if (we need to allocate middle pages) {
		//allocate the page
		//set the PMD value
	}
	unlock_page_table()

but we can actually avoid any locking by doing the following:

	//pseudo code
	if (we need to allocate middle pages) {
		//allocate the page
		//cmpxchg64 to set the PMD if it wasn't already set since we last checked
		if (the PMD was set since we last checked)
			//free the page we just allocated
	}

In this case we can end up doing a pointless page allocation and free,
but it's worth it to avoid using locks. You can see this in the intel
iommu code here:
https://github.com/torvalds/linux/blob/9140d8bdd4c5a04abe181bb300378355d56990a4/drivers/iommu/intel-iommu.c#L904

> what the mutex you removed might have helped to start with.

The mutex I removed is arguably completely useless. In the dma ops path
we handle the IOVA allocations in the driver, so we can be sure a
certain range is protected by the IOVA allocator. Because the iommu ops
path doesn't handle the IOVA allocations, it seems reasonable to lock
the page tables to avoid two writers writing to the same range at the
same time. Without the lock it's complete chaos: all writers can be
writing to the same range at the same time, resulting in complete
garbage.

BUT the locking doesn't actually make any real difference. Even with
locking we still have a race condition if two writers want to write to
the same range at the same time; the race is just whoever gets the lock
first, and we still can't be sure what the result will be. So the
result is still garbage, just slightly more usable garbage because at
least the range is correct for one writer. It just makes no sense to
ever have two writers writing to the same range, and adding a lock
doesn't fix that.

The Intel iommu ops path already doesn't use locks for its page tables,
so this isn't a new idea; I'm just doing the same for the AMD iommu
driver.

Does all that make sense?

On Tue, 20 Aug 2019 at 10:41, Christoph Hellwig wrote:
>
> On Thu, Aug 15, 2019 at 12:09:39PM +0100, Tom Murphy wrote:
> > We can remove the mutex lock from amd_iommu_map and amd_iommu_unmap.
> > iommu_map doesn't lock while mapping and so no two calls should touch
> > the same iova range. The AMD driver already handles the page table page
> > allocations without locks so we can safely remove the locks.
>
> I've been looking over the code and trying to understand how the
> synchronization works. I guess the cmpxchg64 in free_clear_pte
> is the important point here? I have to admit I don't fully understand
> the concurrency issues here, but neither do I understand what the
> mutex you removed might have helped to start with.