Re: [PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in debugfs
On 2022/5/30 20:14, Jason Gunthorpe wrote:
> On Sun, May 29, 2022 at 01:14:46PM +0800, Baolu Lu wrote:
>> From 1e87b5df40c6ce9414cdd03988c3b52bfb17af5f Mon Sep 17 00:00:00 2001
>> From: Lu Baolu
>> Date: Sun, 29 May 2022 10:18:56 +0800
>> Subject: [PATCH 1/1] iommu/vt-d: debugfs: Remove device_domain_lock usage
>>
>> The domain_translation_struct debugfs node is used to dump static
>> mappings of PCI devices. It potentially races with setting new domains
>> to devices and with the iommu_map/unmap() interfaces. The existing code
>> tries to use the global spinlock device_domain_lock to avoid the races,
>> but this is problematic as that lock is only meant to protect the
>> device tracking lists of the domains.
>>
>> Instead of using an immature lock to cover up the problem, it's better
>> to explicitly restrict the use of this debugfs node. This also makes
>> device_domain_lock static.
>
> What does "explicitly restrict" mean?

I originally thought about adding restrictions on this interface to a
document. But obviously, that was a naive idea. :-)

I went over the code again. The races exist in two paths:

1. Dumping a page table while a new page table is being set for the
   device.
2. A high-level page table entry has been marked non-present, but the
   dumping code has already walked down into the low-level tables.

For case 1, we can solve it by dumping the tables while holding the
group->mutex.

Case 2 is a bit weird. I tried adding a rwsem to make iommu_unmap() and
the debugfs table dump mutually exclusive. That does not work because
debugfs may depend on the DMA of the devices it is dumping. It seems
that what we can do is to allow this race, but check the validity of the
physical address retrieved from each page table entry while traversing
the page table in debugfs. The worst case is then printing some useless
information.
The real code looks like this:

From 3feb0727f9d7095729ef75ab1967270045b3a38c Mon Sep 17 00:00:00 2001
From: Lu Baolu
Date: Sun, 29 May 2022 10:18:56 +0800
Subject: [PATCH 1/1] iommu/vt-d: debugfs: Remove device_domain_lock usage

The domain_translation_struct debugfs node is used to dump the DMAR page
tables for PCI devices. It potentially races with setting domains to
devices and with the iommu_unmap() interface. The existing code uses the
global spinlock device_domain_lock to avoid the races, but this is
problematic as that lock is only meant to protect the device tracking
lists of each domain.

This replaces device_domain_lock with group->mutex to protect the page
table traversal from a new domain being set, and always checks the
physical address retrieved from a page table entry before traversing to
the next-level page table. As a cleanup, this also makes
device_domain_lock static.

Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/debugfs.c | 42 ++-
 drivers/iommu/intel/iommu.c   |  2 +-
 drivers/iommu/intel/iommu.h   |  1 -
 3 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index d927ef10641b..e6f4835b8d9f 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -333,25 +333,28 @@ static void pgtable_walk_level(struct seq_file *m, struct dma_pte *pde,
 			continue;
 
 		path[level] = pde->val;
-		if (dma_pte_superpage(pde) || level == 1)
+		if (dma_pte_superpage(pde) || level == 1) {
 			dump_page_info(m, start, path);
-		else
-			pgtable_walk_level(m, phys_to_virt(dma_pte_addr(pde)),
+		} else {
+			unsigned long phys_addr;
+
+			phys_addr = (unsigned long)dma_pte_addr(pde);
+			if (!pfn_valid(__phys_to_pfn(phys_addr)))
+				break;
+			pgtable_walk_level(m, phys_to_virt(phys_addr),
 					   level - 1, start, path);
+		}
 		path[level] = 0;
 	}
 }
 
-static int show_device_domain_translation(struct device *dev, void *data)
+static int __show_device_domain_translation(struct device *dev, void *data)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 	struct dmar_domain *domain = info->domain;
 	struct seq_file *m = data;
 	u64 path[6] = { 0 };
 
-	if (!domain)
-		return 0;
-
 	seq_printf(m, "Device %s @0x%llx\n", dev_name(dev),
 		   (u64)virt_to_phys(domain->pgd));
 	seq_puts(m, "IOVA_PFN\t\tPML5E\t\t\tPML4E\t\t\tPDPE\t\t\tPDE\t\t\tPTE\n");
@@ -359,20 +362,27 @@ static int show_device_domain_translation(struct device *dev, void *data)
 	pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
 	seq_putc(m, '\n');
 
-	return 0;
+	return 1;
 }
 
-static int domain_translation_struct_show(struct seq_file *m, void *unused)
+static int show_device_domain_translation(struct
[PATCH V3 5/8] dt-bindings: Add xen,grant-dma IOMMU description for xen-grant DMA ops
From: Oleksandr Tyshchenko

The main purpose of this binding is to communicate Xen specific
information using generic IOMMU device tree bindings (which are a good
fit here) rather than introducing a custom property.

Introduce a Xen specific IOMMU for virtualized devices (e.g. virtio) to
be used by the Xen grant DMA-mapping layer in a subsequent commit.

The reference to the Xen specific IOMMU node using the "iommus" property
indicates that Xen grant mappings need to be enabled for the device, and
it specifies the ID of the domain where the corresponding backend
resides. The domid (domain ID) is used as an argument to the Xen grant
mapping APIs.

This is needed for the option to restrict memory access using Xen grant
mappings to work, whose primary goal is to enable using virtio devices
in Xen guests.

Signed-off-by: Oleksandr Tyshchenko
---
Changes RFC -> V1:
 - update commit subject/description and text in description
 - move to devicetree/bindings/arm/

Changes V1 -> V2:
 - update text in description
 - change the maintainer of the binding
 - fix validation issue
 - reference xen,dev-domid.yaml schema from virtio/mmio.yaml

Changes V2 -> V3:
 - Stefano already gave his Reviewed-by, I dropped it due to the
   (significant) changes
 - use generic IOMMU device tree bindings instead of custom property
   "xen,dev-domid"
 - change commit subject and description, was "dt-bindings: Add
   xen,dev-domid property description for xen-grant DMA ops"
---
 .../devicetree/bindings/iommu/xen,grant-dma.yaml | 49 ++
 1 file changed, 49 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/xen,grant-dma.yaml

diff --git a/Documentation/devicetree/bindings/iommu/xen,grant-dma.yaml b/Documentation/devicetree/bindings/iommu/xen,grant-dma.yaml
new file mode 100644
index ..ab5765c
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/xen,grant-dma.yaml
@@ -0,0 +1,49 @@
+# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/iommu/xen,grant-dma.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Xen specific IOMMU for virtualized devices (e.g. virtio)
+
+maintainers:
+  - Stefano Stabellini
+
+description:
+  The reference to Xen specific IOMMU node using "iommus" property indicates
+  that Xen grant mappings need to be enabled for the device, and it specifies
+  the ID of the domain where the corresponding backend resides.
+  The binding is required to restrict memory access using Xen grant mappings.
+
+properties:
+  compatible:
+    const: xen,grant-dma
+
+  '#iommu-cells':
+    const: 1
+    description:
+      The Xen specific IOMMU is a multiple-master IOMMU device.
+      The single cell describes the domid (domain ID) of the domain where
+      the backend is running.
+
+required:
+  - compatible
+  - "#iommu-cells"
+
+additionalProperties: false
+
+examples:
+  - |
+    xen_iommu {
+        compatible = "xen,grant-dma";
+        #iommu-cells = <1>;
+    };
+
+    virtio@3000 {
+        compatible = "virtio,mmio";
+        reg = <0x3000 0x100>;
+        interrupts = <41>;
+
+        /* The backend is located in Xen domain with ID 1 */
+        iommus = <&xen_iommu 1>;
+    };
--
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 2/3] iommu: mtk_iommu: add support for 6-bit encoded port IDs
Until now, the port ID was always encoded as 5-bit data. On MT8365, the
port ID is encoded as 6-bit data. This requires reworking the macros
F_MMU_INT_ID_LARB_ID and F_MMU_INT_ID_PORT_ID in order to support both
5-bit and 6-bit encoded port IDs.

Signed-off-by: Fabien Parent
---
 drivers/iommu/mtk_iommu.c | 17 +
 drivers/iommu/mtk_iommu.h |  1 +
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 6fd75a60abd6..b692347d8d56 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -103,8 +103,10 @@
 #define REG_MMU1_INT_ID			0x154
 #define F_MMU_INT_ID_COMM_ID(a)		(((a) >> 9) & 0x7)
 #define F_MMU_INT_ID_SUB_COMM_ID(a)	(((a) >> 7) & 0x3)
-#define F_MMU_INT_ID_LARB_ID(a)		(((a) >> 7) & 0x7)
-#define F_MMU_INT_ID_PORT_ID(a)		(((a) >> 2) & 0x1f)
+#define F_MMU_INT_ID_LARB_ID(a, port_width)	\
+	(((a) >> ((port_width) + 2)) & 0x7)
+#define F_MMU_INT_ID_PORT_ID(a, port_width)	\
+	(((a) >> 2) & GENMASK((port_width) - 1, 0))
 
 #define MTK_PROTECT_PA_ALIGN		256
 
@@ -291,12 +293,13 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
 		fault_pa |= (u64)pa34_32 << 32;
 	}
 
-	fault_port = F_MMU_INT_ID_PORT_ID(regval);
+	fault_port = F_MMU_INT_ID_PORT_ID(regval, data->plat_data->port_width);
 	if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_SUB_COMM)) {
 		fault_larb = F_MMU_INT_ID_COMM_ID(regval);
 		sub_comm = F_MMU_INT_ID_SUB_COMM_ID(regval);
 	} else {
-		fault_larb = F_MMU_INT_ID_LARB_ID(regval);
+		fault_larb = F_MMU_INT_ID_LARB_ID(regval,
						  data->plat_data->port_width);
 	}
 	fault_larb = data->plat_data->larbid_remap[fault_larb][sub_comm];
 
@@ -1034,6 +1037,7 @@ static const struct mtk_iommu_plat_data mt2712_data = {
 	.iova_region    = single_domain,
 	.iova_region_nr = ARRAY_SIZE(single_domain),
 	.larbid_remap   = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}},
+	.port_width     = 5,
 };
 
 static const struct mtk_iommu_plat_data mt6779_data = {
@@ -1043,6 +1047,7 @@ static const struct mtk_iommu_plat_data mt6779_data = {
 	.iova_region    = single_domain,
 	.iova_region_nr = ARRAY_SIZE(single_domain),
 	.larbid_remap   = {{0}, {1}, {2}, {3}, {5}, {7, 8}, {10}, {9}},
+	.port_width     = 5,
 };
 
 static const struct mtk_iommu_plat_data mt8167_data = {
@@ -1052,6 +1057,7 @@ static const struct mtk_iommu_plat_data mt8167_data = {
 	.iova_region    = single_domain,
 	.iova_region_nr = ARRAY_SIZE(single_domain),
 	.larbid_remap   = {{0}, {1}, {2}}, /* Linear mapping. */
+	.port_width     = 5,
 };
 
 static const struct mtk_iommu_plat_data mt8173_data = {
@@ -1062,6 +1068,7 @@ static const struct mtk_iommu_plat_data mt8173_data = {
 	.iova_region    = single_domain,
 	.iova_region_nr = ARRAY_SIZE(single_domain),
 	.larbid_remap   = {{0}, {1}, {2}, {3}, {4}, {5}}, /* Linear mapping. */
+	.port_width     = 5,
 };
 
 static const struct mtk_iommu_plat_data mt8183_data = {
@@ -1071,6 +1078,7 @@ static const struct mtk_iommu_plat_data mt8183_data = {
 	.iova_region    = single_domain,
 	.iova_region_nr = ARRAY_SIZE(single_domain),
 	.larbid_remap   = {{0}, {4}, {5}, {6}, {7}, {2}, {3}, {1}},
+	.port_width     = 5,
 };
 
 static const struct mtk_iommu_plat_data mt8192_data = {
@@ -1082,6 +1090,7 @@ static const struct mtk_iommu_plat_data mt8192_data = {
 	.iova_region_nr = ARRAY_SIZE(mt8192_multi_dom),
 	.larbid_remap   = {{0}, {1}, {4, 5}, {7}, {2}, {9, 11, 19, 20},
			   {0, 14, 16}, {0, 13, 18, 17}},
+	.port_width     = 5,
 };
 
 static const struct of_device_id mtk_iommu_of_ids[] = {
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index b742432220c5..84cecaf6d61c 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -54,6 +54,7 @@ struct mtk_iommu_plat_data {
 	enum mtk_iommu_plat	m4u_plat;
 	u32			flags;
 	u32			inv_sel_reg;
+	u8			port_width;
 	unsigned int		iova_region_nr;
 	const struct mtk_iommu_iova_region	*iova_region;
--
2.36.1
[PATCH 3/3] iommu: mtk_iommu: add support for MT8365 SoC
Add IOMMU support for the MT8365 SoC.

Signed-off-by: Fabien Parent
---
 drivers/iommu/mtk_iommu.c | 11 +++
 drivers/iommu/mtk_iommu.h |  1 +
 2 files changed, 12 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index b692347d8d56..039b8f9d5022 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -1093,6 +1093,16 @@ static const struct mtk_iommu_plat_data mt8192_data = {
 	.port_width     = 5,
 };
 
+static const struct mtk_iommu_plat_data mt8365_data = {
+	.m4u_plat       = M4U_MT8365,
+	.flags          = RESET_AXI,
+	.inv_sel_reg    = REG_MMU_INV_SEL_GEN1,
+	.iova_region    = single_domain,
+	.iova_region_nr = ARRAY_SIZE(single_domain),
+	.larbid_remap   = {{0}, {1}, {2}, {3}, {4}, {5}}, /* Linear mapping. */
+	.port_width     = 6,
+};
+
 static const struct of_device_id mtk_iommu_of_ids[] = {
 	{ .compatible = "mediatek,mt2712-m4u", .data = &mt2712_data},
 	{ .compatible = "mediatek,mt6779-m4u", .data = &mt6779_data},
@@ -1100,6 +1110,7 @@ static const struct of_device_id mtk_iommu_of_ids[] = {
 	{ .compatible = "mediatek,mt8173-m4u", .data = &mt8173_data},
 	{ .compatible = "mediatek,mt8183-m4u", .data = &mt8183_data},
 	{ .compatible = "mediatek,mt8192-m4u", .data = &mt8192_data},
+	{ .compatible = "mediatek,mt8365-m4u", .data = &mt8365_data},
 	{}
 };
 
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 84cecaf6d61c..cb174fa6f2ab 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -46,6 +46,7 @@ enum mtk_iommu_plat {
 	M4U_MT8173,
 	M4U_MT8183,
 	M4U_MT8192,
+	M4U_MT8365,
 };
 
 struct mtk_iommu_iova_region;
--
2.36.1
[PATCH 1/3] dt-bindings: iommu: mediatek: add binding documentation for MT8365 SoC
Add IOMMU binding documentation for the MT8365 SoC.

Signed-off-by: Fabien Parent
---
 .../bindings/iommu/mediatek,iommu.yaml        |  2 +
 include/dt-bindings/memory/mt8365-larb-port.h | 96 +++
 2 files changed, 98 insertions(+)
 create mode 100644 include/dt-bindings/memory/mt8365-larb-port.h

diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
index 97e8c471a5e8..5ba688365da5 100644
--- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
+++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
@@ -77,6 +77,7 @@ properties:
           - mediatek,mt8173-m4u  # generation two
           - mediatek,mt8183-m4u  # generation two
           - mediatek,mt8192-m4u  # generation two
+          - mediatek,mt8365-m4u  # generation two
 
       - description: mt7623 generation one
         items:
@@ -120,6 +121,7 @@ properties:
       dt-binding/memory/mt8173-larb-port.h for mt8173,
       dt-binding/memory/mt8183-larb-port.h for mt8183,
       dt-binding/memory/mt8192-larb-port.h for mt8192.
+      dt-binding/memory/mt8365-larb-port.h for mt8365.
 
   power-domains:
     maxItems: 1
diff --git a/include/dt-bindings/memory/mt8365-larb-port.h b/include/dt-bindings/memory/mt8365-larb-port.h
new file mode 100644
index ..e7d5637aa38e
--- /dev/null
+++ b/include/dt-bindings/memory/mt8365-larb-port.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2022 MediaTek Inc.
+ * Author: Yong Wu
+ */
+#ifndef _DT_BINDINGS_MEMORY_MT8365_LARB_PORT_H_
+#define _DT_BINDINGS_MEMORY_MT8365_LARB_PORT_H_
+
+#include <dt-bindings/memory/mtk-memory-port.h>
+
+#define M4U_LARB0_ID			0
+#define M4U_LARB1_ID			1
+#define M4U_LARB2_ID			2
+#define M4U_LARB3_ID			3
+#define M4U_LARB4_ID			4
+#define M4U_LARB5_ID			5
+#define M4U_LARB6_ID			6
+#define M4U_LARB7_ID			7
+
+/* larb0 */
+#define M4U_PORT_DISP_OVL0		MTK_M4U_ID(0, 0)
+#define M4U_PORT_DISP_OVL0_2L		MTK_M4U_ID(0, 1)
+#define M4U_PORT_DISP_RDMA0		MTK_M4U_ID(0, 2)
+#define M4U_PORT_DISP_WDMA0		MTK_M4U_ID(0, 3)
+#define M4U_PORT_DISP_RDMA1		MTK_M4U_ID(0, 4)
+#define M4U_PORT_MDP_RDMA0		MTK_M4U_ID(0, 5)
+#define M4U_PORT_MDP_WROT1		MTK_M4U_ID(0, 6)
+#define M4U_PORT_MDP_WROT0		MTK_M4U_ID(0, 7)
+#define M4U_PORT_MDP_RDMA1		MTK_M4U_ID(0, 8)
+#define M4U_PORT_DISP_FAKE0		MTK_M4U_ID(0, 9)
+
+/* larb1 */
+#define M4U_PORT_VENC_RCPU		MTK_M4U_ID(1, 0)
+#define M4U_PORT_VENC_REC		MTK_M4U_ID(1, 1)
+#define M4U_PORT_VENC_BSDMA		MTK_M4U_ID(1, 2)
+#define M4U_PORT_VENC_SV_COMV		MTK_M4U_ID(1, 3)
+#define M4U_PORT_VENC_RD_COMV		MTK_M4U_ID(1, 4)
+#define M4U_PORT_VENC_NBM_RDMA		MTK_M4U_ID(1, 5)
+#define M4U_PORT_VENC_NBM_RDMA_LITE	MTK_M4U_ID(1, 6)
+#define M4U_PORT_JPGENC_Y_RDMA		MTK_M4U_ID(1, 7)
+#define M4U_PORT_JPGENC_C_RDMA		MTK_M4U_ID(1, 8)
+#define M4U_PORT_JPGENC_Q_TABLE		MTK_M4U_ID(1, 9)
+#define M4U_PORT_JPGENC_BSDMA		MTK_M4U_ID(1, 10)
+#define M4U_PORT_JPGDEC_WDMA		MTK_M4U_ID(1, 11)
+#define M4U_PORT_JPGDEC_BSDMA		MTK_M4U_ID(1, 12)
+#define M4U_PORT_VENC_NBM_WDMA		MTK_M4U_ID(1, 13)
+#define M4U_PORT_VENC_NBM_WDMA_LITE	MTK_M4U_ID(1, 14)
+#define M4U_PORT_VENC_CUR_LUMA		MTK_M4U_ID(1, 15)
+#define M4U_PORT_VENC_CUR_CHROMA	MTK_M4U_ID(1, 16)
+#define M4U_PORT_VENC_REF_LUMA		MTK_M4U_ID(1, 17)
+#define M4U_PORT_VENC_REF_CHROMA	MTK_M4U_ID(1, 18)
+
+/* larb2 */
+#define M4U_PORT_CAM_IMGO		MTK_M4U_ID(2, 0)
+#define M4U_PORT_CAM_RRZO		MTK_M4U_ID(2, 1)
+#define M4U_PORT_CAM_AAO		MTK_M4U_ID(2, 2)
+#define M4U_PORT_CAM_LCS		MTK_M4U_ID(2, 3)
+#define M4U_PORT_CAM_ESFKO		MTK_M4U_ID(2, 4)
+#define M4U_PORT_CAM_CAM_SV0		MTK_M4U_ID(2, 5)
+#define M4U_PORT_CAM_CAM_SV1		MTK_M4U_ID(2, 6)
+#define M4U_PORT_CAM_LSCI		MTK_M4U_ID(2, 7)
+#define M4U_PORT_CAM_LSCI_D		MTK_M4U_ID(2, 8)
+#define M4U_PORT_CAM_AFO		MTK_M4U_ID(2, 9)
+#define M4U_PORT_CAM_SPARE		MTK_M4U_ID(2, 10)
+#define M4U_PORT_CAM_BPCI		MTK_M4U_ID(2, 11)
+#define M4U_PORT_CAM_BPCI_D		MTK_M4U_ID(2, 12)
+#define M4U_PORT_CAM_UFDI		MTK_M4U_ID(2, 13)
+#define M4U_PORT_CAM_IMGI		MTK_M4U_ID(2, 14)
+#define M4U_PORT_CAM_IMG2O		MTK_M4U_ID(2, 15)
+#define M4U_PORT_CAM_IMG3O		MTK_M4U_ID(2, 16)
+#define M4U_PORT_CAM_WPE0_I		MTK_M4U_ID(2, 17)
+#define M4U_PORT_CAM_WPE1_I		MTK_M4U_ID(2, 18)
+#define M4U_PORT_CAM_WPE_O		MTK_M4U_ID(2, 19)
+#define M4U_PORT_CAM_FD0_I		MTK_M4U_ID(2, 20)
+#define M4U_PORT_CAM_FD1_I		MTK_M4U_ID(2, 21)
[syzbot] WARNING in dma_map_sgtable (2)
Hello, syzbot found the following issue on: HEAD commit:7e062cda7d90 Merge tag 'net-next-5.19' of git://git.kernel.. git tree: upstream console+strace: https://syzkaller.appspot.com/x/log.txt?x=172151d3f0 kernel config: https://syzkaller.appspot.com/x/.config?x=e9d71d3c07c36588 dashboard link: https://syzkaller.appspot.com/bug?extid=3ba551855046ba3b3806 compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12918503f0 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1386fa39f0 Bisection is inconclusive: the issue happens on the oldest tested release. bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14107ee5f0 final oops: https://syzkaller.appspot.com/x/report.txt?x=16107ee5f0 console output: https://syzkaller.appspot.com/x/log.txt?x=12107ee5f0 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+3ba551855046ba3b3...@syzkaller.appspotmail.com [ cut here ] WARNING: CPU: 0 PID: 3610 at kernel/dma/mapping.c:188 dma_map_sgtable+0x203/0x260 kernel/dma/mapping.c:264 Modules linked in: CPU: 0 PID: 3610 Comm: syz-executor162 Not tainted 5.18.0-syzkaller-04943-g7e062cda7d90 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__dma_map_sg_attrs kernel/dma/mapping.c:188 [inline] RIP: 0010:dma_map_sgtable+0x203/0x260 kernel/dma/mapping.c:264 Code: 75 15 e8 50 5f 14 00 eb cb e8 49 5f 14 00 eb c4 e8 42 5f 14 00 eb bd e8 3b 5f 14 00 0f 0b bd fb ff ff ff eb af e8 2d 5f 14 00 <0f> 0b 31 ed 48 bb 00 00 00 00 00 fc ff df e9 7b ff ff ff 89 e9 80 RSP: 0018:c9000305fd40 EFLAGS: 00010293 RAX: 81723873 RBX: dc00 RCX: 88801fbb8000 RDX: RSI: 0001 RDI: 0002 RBP: 8881487e5408 R08: 81723743 R09: ed1003592c9e R10: ed1003592c9e R11: 111003592c9c R12: 8881487e5000 R13: 88801ac964e0 R14: R15: 0001 FS: 56c2a300() GS:8880b9a0() knlGS: CS: 0010 DS: ES: CR0: 
80050033 CR2: 005d84c8 CR3: 1f1ef000 CR4: 003506f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: get_sg_table+0xf9/0x150 drivers/dma-buf/udmabuf.c:72 begin_cpu_udmabuf+0xf5/0x160 drivers/dma-buf/udmabuf.c:126 dma_buf_begin_cpu_access+0xd8/0x170 drivers/dma-buf/dma-buf.c:1172 dma_buf_ioctl+0x2a0/0x2f0 drivers/dma-buf/dma-buf.c:363 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f8bf9c6dc19 Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:7ffd7cfae1d8 EFLAGS: 0246 ORIG_RAX: 0010 RAX: ffda RBX: RCX: 7f8bf9c6dc19 RDX: 2100 RSI: 40086200 RDI: 0006 RBP: 7f8bf9c31dc0 R08: R09: R10: R11: 0246 R12: 7f8bf9c31e50 R13: R14: R15: --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. For information about bisection process see: https://goo.gl/tpsmEJ#bisection syzbot can test patches for this issue, for details see: https://goo.gl/tpsmEJ#testing-patches ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH AUTOSEL 4.19 26/38] dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATOMIC
From: Mikulas Patocka

[ Upstream commit 84bc4f1d5f8aa68706a96711dccb28b518e5 ]

We observed the error "cacheline tracking ENOMEM, dma-debug disabled"
during a light system load (copying some files). The reason for this
error is that the dma_active_cacheline radix tree uses GFP_NOWAIT
allocation - so it can't access the emergency memory reserves and it
fails as soon as anybody reaches the watermark.

This patch changes GFP_NOWAIT to GFP_ATOMIC, so that it can access the
emergency memory reserves.

Signed-off-by: Mikulas Patocka
Signed-off-by: Christoph Hellwig
Signed-off-by: Sasha Levin
---
 kernel/dma/debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 9c9a5b12f92f..7c6cd00d0fca 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -469,7 +469,7 @@ void debug_dma_dump_mappings(struct device *dev)
  * At any time debug_dma_assert_idle() can be called to trigger a
  * warning if any cachelines in the given page are in the active set.
  */
-static RADIX_TREE(dma_active_cacheline, GFP_NOWAIT);
+static RADIX_TREE(dma_active_cacheline, GFP_ATOMIC);
 static DEFINE_SPINLOCK(radix_lock);
 #define ACTIVE_CACHELINE_MAX_OVERLAP	((1 << RADIX_TREE_MAX_TAGS) - 1)
 #define CACHELINE_PER_PAGE_SHIFT	(PAGE_SHIFT - L1_CACHE_SHIFT)
--
2.35.1
[PATCH AUTOSEL 5.4 37/55] dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATOMIC
From: Mikulas Patocka

[ Upstream commit 84bc4f1d5f8aa68706a96711dccb28b518e5 ]

We observed the error "cacheline tracking ENOMEM, dma-debug disabled"
during a light system load (copying some files). The reason for this
error is that the dma_active_cacheline radix tree uses GFP_NOWAIT
allocation - so it can't access the emergency memory reserves and it
fails as soon as anybody reaches the watermark.

This patch changes GFP_NOWAIT to GFP_ATOMIC, so that it can access the
emergency memory reserves.

Signed-off-by: Mikulas Patocka
Signed-off-by: Christoph Hellwig
Signed-off-by: Sasha Levin
---
 kernel/dma/debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 4dc3bbfd3e3f..1c133f610f59 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -450,7 +450,7 @@ void debug_dma_dump_mappings(struct device *dev)
  * At any time debug_dma_assert_idle() can be called to trigger a
  * warning if any cachelines in the given page are in the active set.
  */
-static RADIX_TREE(dma_active_cacheline, GFP_NOWAIT);
+static RADIX_TREE(dma_active_cacheline, GFP_ATOMIC);
 static DEFINE_SPINLOCK(radix_lock);
 #define ACTIVE_CACHELINE_MAX_OVERLAP	((1 << RADIX_TREE_MAX_TAGS) - 1)
 #define CACHELINE_PER_PAGE_SHIFT	(PAGE_SHIFT - L1_CACHE_SHIFT)
--
2.35.1
[PATCH AUTOSEL 5.10 50/76] dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATOMIC
From: Mikulas Patocka

[ Upstream commit 84bc4f1d5f8aa68706a96711dccb28b518e5 ]

We observed the error "cacheline tracking ENOMEM, dma-debug disabled"
during a light system load (copying some files). The reason for this
error is that the dma_active_cacheline radix tree uses GFP_NOWAIT
allocation - so it can't access the emergency memory reserves and it
fails as soon as anybody reaches the watermark.

This patch changes GFP_NOWAIT to GFP_ATOMIC, so that it can access the
emergency memory reserves.

Signed-off-by: Mikulas Patocka
Signed-off-by: Christoph Hellwig
Signed-off-by: Sasha Levin
---
 kernel/dma/debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index f8ae54679865..ee7da1f2462f 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -448,7 +448,7 @@ void debug_dma_dump_mappings(struct device *dev)
  * other hand, consumes a single dma_debug_entry, but inserts 'nents'
  * entries into the tree.
  */
-static RADIX_TREE(dma_active_cacheline, GFP_NOWAIT);
+static RADIX_TREE(dma_active_cacheline, GFP_ATOMIC);
 static DEFINE_SPINLOCK(radix_lock);
 #define ACTIVE_CACHELINE_MAX_OVERLAP	((1 << RADIX_TREE_MAX_TAGS) - 1)
 #define CACHELINE_PER_PAGE_SHIFT	(PAGE_SHIFT - L1_CACHE_SHIFT)
--
2.35.1
[PATCH AUTOSEL 5.10 01/76] iommu/vt-d: Add RPLS to quirk list to skip TE disabling
From: Tejas Upadhyay

[ Upstream commit 0a967f5bfd9134b89681cae58deb222e20840e76 ]

The VT-d spec requires (10.4.4 Global Command Register, TE field) that:

  Hardware implementations supporting DMA draining must drain any
  in-flight DMA read/write requests queued within the Root-Complex
  before completing the translation enable command and reflecting the
  status of the command through the TES field in the Global Status
  register.

Unfortunately, some integrated graphics devices fail to do so after some
kind of power state transition. As a result, the system might get stuck
in iommu_disable_translation(), waiting for the completion of the TE
transition.

This adds RPLS to a quirk list for those devices and skips TE disabling
if the quirk hits.

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4898
Tested-by: Raviteja Goud Talla
Cc: Rodrigo Vivi
Acked-by: Lu Baolu
Signed-off-by: Tejas Upadhyay
Reviewed-by: Rodrigo Vivi
Signed-off-by: Rodrigo Vivi
Link: https://patchwork.freedesktop.org/patch/msgid/20220302043256.191529-1-tejaskumarx.surendrakumar.upadh...@intel.com
Signed-off-by: Sasha Levin
---
 drivers/iommu/intel/iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 21749859ad45..477dde39823c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -6296,7 +6296,7 @@ static void quirk_igfx_skip_te_disable(struct pci_dev *dev)
 	ver = (dev->device >> 8) & 0xff;
 	if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
 	    ver != 0x4e && ver != 0x8a && ver != 0x98 &&
-	    ver != 0x9a)
+	    ver != 0x9a && ver != 0xa7)
 		return;
 
 	if (risky_device(dev))
--
2.35.1
[PATCH AUTOSEL 5.15 069/109] dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATOMIC
From: Mikulas Patocka

[ Upstream commit 84bc4f1d5f8aa68706a96711dccb28b518e5 ]

We observed the error "cacheline tracking ENOMEM, dma-debug disabled"
during a light system load (copying some files). The reason for this
error is that the dma_active_cacheline radix tree uses GFP_NOWAIT
allocation - so it can't access the emergency memory reserves and it
fails as soon as anybody reaches the watermark.

This patch changes GFP_NOWAIT to GFP_ATOMIC, so that it can access the
emergency memory reserves.

Signed-off-by: Mikulas Patocka
Signed-off-by: Christoph Hellwig
Signed-off-by: Sasha Levin
---
 kernel/dma/debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index f8ff598596b8..ac740630c79c 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -448,7 +448,7 @@ void debug_dma_dump_mappings(struct device *dev)
  * other hand, consumes a single dma_debug_entry, but inserts 'nents'
  * entries into the tree.
  */
-static RADIX_TREE(dma_active_cacheline, GFP_NOWAIT);
+static RADIX_TREE(dma_active_cacheline, GFP_ATOMIC);
 static DEFINE_SPINLOCK(radix_lock);
 #define ACTIVE_CACHELINE_MAX_OVERLAP	((1 << RADIX_TREE_MAX_TAGS) - 1)
 #define CACHELINE_PER_PAGE_SHIFT	(PAGE_SHIFT - L1_CACHE_SHIFT)
--
2.35.1
[PATCH AUTOSEL 5.15 001/109] iommu/vt-d: Add RPLS to quirk list to skip TE disabling
From: Tejas Upadhyay

[ Upstream commit 0a967f5bfd9134b89681cae58deb222e20840e76 ]

The VT-d spec requires (10.4.4 Global Command Register, TE field) that:

  Hardware implementations supporting DMA draining must drain any
  in-flight DMA read/write requests queued within the Root-Complex
  before completing the translation enable command and reflecting the
  status of the command through the TES field in the Global Status
  register.

Unfortunately, some integrated graphics devices fail to do so after some
kind of power state transition. As a result, the system might get stuck
in iommu_disable_translation(), waiting for the completion of the TE
transition.

This adds RPLS to a quirk list for those devices and skips TE disabling
if the quirk hits.

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4898
Tested-by: Raviteja Goud Talla
Cc: Rodrigo Vivi
Acked-by: Lu Baolu
Signed-off-by: Tejas Upadhyay
Reviewed-by: Rodrigo Vivi
Signed-off-by: Rodrigo Vivi
Link: https://patchwork.freedesktop.org/patch/msgid/20220302043256.191529-1-tejaskumarx.surendrakumar.upadh...@intel.com
Signed-off-by: Sasha Levin
---
 drivers/iommu/intel/iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 91a5c75966f3..a1ffb3d6d901 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5728,7 +5728,7 @@ static void quirk_igfx_skip_te_disable(struct pci_dev *dev)
 	ver = (dev->device >> 8) & 0xff;
 	if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
 	    ver != 0x4e && ver != 0x8a && ver != 0x98 &&
-	    ver != 0x9a)
+	    ver != 0x9a && ver != 0xa7)
 		return;
 
 	if (risky_device(dev))
--
2.35.1
[PATCH AUTOSEL 5.17 083/135] dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATOMIC
From: Mikulas Patocka

[ Upstream commit 84bc4f1d5f8aa68706a96711dccb28b518e5 ]

We observed the error "cacheline tracking ENOMEM, dma-debug disabled"
during a light system load (copying some files). The reason for this
error is that the dma_active_cacheline radix tree uses GFP_NOWAIT
allocation - so it can't access the emergency memory reserves and it
fails as soon as anybody reaches the watermark.

This patch changes GFP_NOWAIT to GFP_ATOMIC, so that it can access the
emergency memory reserves.

Signed-off-by: Mikulas Patocka
Signed-off-by: Christoph Hellwig
Signed-off-by: Sasha Levin
---
 kernel/dma/debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index f8ff598596b8..ac740630c79c 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -448,7 +448,7 @@ void debug_dma_dump_mappings(struct device *dev)
  * other hand, consumes a single dma_debug_entry, but inserts 'nents'
  * entries into the tree.
  */
-static RADIX_TREE(dma_active_cacheline, GFP_NOWAIT);
+static RADIX_TREE(dma_active_cacheline, GFP_ATOMIC);
 static DEFINE_SPINLOCK(radix_lock);
 #define ACTIVE_CACHELINE_MAX_OVERLAP	((1 << RADIX_TREE_MAX_TAGS) - 1)
 #define CACHELINE_PER_PAGE_SHIFT	(PAGE_SHIFT - L1_CACHE_SHIFT)
--
2.35.1
[PATCH AUTOSEL 5.17 001/135] iommu/vt-d: Add RPLS to quirk list to skip TE disabling
From: Tejas Upadhyay [ Upstream commit 0a967f5bfd9134b89681cae58deb222e20840e76 ] The VT-d spec requires (10.4.4 Global Command Register, TE field) that: Hardware implementations supporting DMA draining must drain any in-flight DMA read/write requests queued within the Root-Complex before completing the translation enable command and reflecting the status of the command through the TES field in the Global Status register. Unfortunately, some integrated graphics devices fail to do so after some kind of power state transition. As a result, the system might get stuck in iommu_disable_translation(), waiting for the completion of the TE transition. This adds RPLS to the quirk list for those devices and skips TE disabling if the quirk hits. Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4898 Tested-by: Raviteja Goud Talla Cc: Rodrigo Vivi Acked-by: Lu Baolu Signed-off-by: Tejas Upadhyay Reviewed-by: Rodrigo Vivi Signed-off-by: Rodrigo Vivi Link: https://patchwork.freedesktop.org/patch/msgid/20220302043256.191529-1-tejaskumarx.surendrakumar.upadh...@intel.com Signed-off-by: Sasha Levin --- drivers/iommu/intel/iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index ab2273300346..e3f15e0cae34 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -5764,7 +5764,7 @@ static void quirk_igfx_skip_te_disable(struct pci_dev *dev) ver = (dev->device >> 8) & 0xff; if (ver != 0x45 && ver != 0x46 && ver != 0x4c && ver != 0x4e && ver != 0x8a && ver != 0x98 && - ver != 0x9a) + ver != 0x9a && ver != 0xa7) return; if (risky_device(dev)) -- 2.35.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH AUTOSEL 5.18 100/159] dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATOMIC
From: Mikulas Patocka [ Upstream commit 84bc4f1d5f8aa68706a96711dccb28b518e5 ] We observed the error "cacheline tracking ENOMEM, dma-debug disabled" during a light system load (copying some files). The reason for this error is that the dma_active_cacheline radix tree uses GFP_NOWAIT allocation - so it can't access the emergency memory reserves and it fails as soon as anybody reaches the watermark. This patch changes GFP_NOWAIT to GFP_ATOMIC, so that it can access the emergency memory reserves. Signed-off-by: Mikulas Patocka Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin --- kernel/dma/debug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c index f8ff598596b8..ac740630c79c 100644 --- a/kernel/dma/debug.c +++ b/kernel/dma/debug.c @@ -448,7 +448,7 @@ void debug_dma_dump_mappings(struct device *dev) * other hand, consumes a single dma_debug_entry, but inserts 'nents' * entries into the tree. */ -static RADIX_TREE(dma_active_cacheline, GFP_NOWAIT); +static RADIX_TREE(dma_active_cacheline, GFP_ATOMIC); static DEFINE_SPINLOCK(radix_lock); #define ACTIVE_CACHELINE_MAX_OVERLAP ((1 << RADIX_TREE_MAX_TAGS) - 1) #define CACHELINE_PER_PAGE_SHIFT (PAGE_SHIFT - L1_CACHE_SHIFT) -- 2.35.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH V2 2/6] iommu: iova: properly handle 0 as a valid IOVA address
Hi Robin On Mon, May 23, 2022 at 11:00 PM Robin Murphy wrote: > > On 2022-05-11 13:15, Ajay Kumar wrote: > > From: Marek Szyprowski > > > > Zero is a valid DMA and IOVA address on many architectures, so adjust the > > IOVA management code to properly handle it. A new value IOVA_BAD_ADDR > > (~0UL) is introduced as a generic value for the error case. Adjust all > > callers of the alloc_iova_fast() function for the new return value. > > And when does anything actually need this? In fact if you were to stop > iommu-dma from reserving IOVA 0 - which you don't - it would only show > how patch #3 is broken. Right! Since the IOVA allocation happens from higher addr to lower addr, hitting this (IOVA==0) case means out of IOVA space which is highly unlikely. > Also note that it's really nothing to do with architectures either way; > iommu-dma simply chooses to reserve IOVA 0 for its own convenience, > mostly because it can. Much the same way that 0 is typically a valid CPU > VA, but mapping something meaningful there is just asking for a world of > pain debugging NULL-dereference bugs. > > Robin. This makes sense, let me think about managing the PFN at lowest address in some other way. 
Thanks, Ajay Kumar > > Signed-off-by: Marek Szyprowski > > Signed-off-by: Ajay Kumar > > --- > > drivers/iommu/dma-iommu.c | 16 +--- > > drivers/iommu/iova.c | 13 + > > include/linux/iova.h | 1 + > > 3 files changed, 19 insertions(+), 11 deletions(-) > > > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > > index 1ca85d37eeab..16218d6a0703 100644 > > --- a/drivers/iommu/dma-iommu.c > > +++ b/drivers/iommu/dma-iommu.c > > @@ -605,7 +605,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct > > iommu_domain *domain, > > { > > struct iommu_dma_cookie *cookie = domain->iova_cookie; > > struct iova_domain *iovad = >iovad; > > - unsigned long shift, iova_len, iova = 0; > > + unsigned long shift, iova_len, iova = IOVA_BAD_ADDR; > > > > if (cookie->type == IOMMU_DMA_MSI_COOKIE) { > > cookie->msi_iova += size; > > @@ -625,11 +625,13 @@ static dma_addr_t iommu_dma_alloc_iova(struct > > iommu_domain *domain, > > iova = alloc_iova_fast(iovad, iova_len, > > DMA_BIT_MASK(32) >> shift, false); > > > > - if (!iova) > > + if (iova == IOVA_BAD_ADDR) > > iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift, > > true); > > > > - return (dma_addr_t)iova << shift; > > + if (iova != IOVA_BAD_ADDR) > > + return (dma_addr_t)iova << shift; > > + return DMA_MAPPING_ERROR; > > } > > > > static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie, > > @@ -688,7 +690,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, > > phys_addr_t phys, > > size = iova_align(iovad, size + iova_off); > > > > iova = iommu_dma_alloc_iova(domain, size, dma_mask, dev); > > - if (!iova) > > + if (iova == DMA_MAPPING_ERROR) > > return DMA_MAPPING_ERROR; > > > > if (iommu_map_atomic(domain, iova, phys - iova_off, size, prot)) { > > @@ -799,7 +801,7 @@ static struct page > > **__iommu_dma_alloc_noncontiguous(struct device *dev, > > > > size = iova_align(iovad, size); > > iova = iommu_dma_alloc_iova(domain, size, dev->coherent_dma_mask, > > dev); > > - if (!iova) > > + if (iova == 
DMA_MAPPING_ERROR) > > goto out_free_pages; > > > > if (sg_alloc_table_from_pages(sgt, pages, count, 0, size, GFP_KERNEL)) > > @@ -1204,7 +1206,7 @@ static int iommu_dma_map_sg(struct device *dev, > > struct scatterlist *sg, > > } > > > > iova = iommu_dma_alloc_iova(domain, iova_len, dma_get_mask(dev), dev); > > - if (!iova) { > > + if (iova == DMA_MAPPING_ERROR) { > > ret = -ENOMEM; > > goto out_restore_sg; > > } > > @@ -1516,7 +1518,7 @@ static struct iommu_dma_msi_page > > *iommu_dma_get_msi_page(struct device *dev, > > return NULL; > > > > iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev); > > - if (!iova) > > + if (iova == DMA_MAPPING_ERROR) > > goto out_free_page; > > > > if (iommu_map(domain, iova, msi_addr, size, prot)) > > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c > > index db77aa675145..ae0fe0a6714e 100644 > > --- a/drivers/iommu/iova.c > > +++ b/drivers/iommu/iova.c > > @@ -429,6 +429,8 @@ EXPORT_SYMBOL_GPL(free_iova); > >* This function tries to satisfy an iova allocation from the rcache, > >* and falls back to regular allocation on failure. If regular allocation > >* fails too and the flush_rcache flag is set then the rcache will be > > flushed. > > + * Returns a pfn the allocated iova starts at or IOVA_BAD_ADDR in the case > > + * of a failure. > > */ > > unsigned long > > alloc_iova_fast(struct
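The core point of this thread — that pfn 0 is a legitimate allocation result, so failure needs a distinct sentinel like IOVA_BAD_ADDR (~0UL) rather than 0 — can be shown with a minimal top-down bump allocator (hypothetical, far simpler than the kernel's rbtree/rcache implementation):

```c
#include <assert.h>

#define IOVA_BAD_ADDR (~0UL)    /* sentinel from the patch; 0 stays allocatable */

/* Toy top-down allocator over pfns [0, limit]: it hands out the highest
 * free pfn first, so pfn 0 is the *last* valid allocation, not an error. */
struct iova_model {
    unsigned long next;         /* highest still-free pfn */
    int exhausted;
};

static unsigned long alloc_iova_model(struct iova_model *d)
{
    unsigned long pfn;

    if (d->exhausted)
        return IOVA_BAD_ADDR;   /* out of space: signal with ~0UL, NOT 0 */
    pfn = d->next;
    if (d->next == 0)
        d->exhausted = 1;       /* 0 was the final valid pfn */
    else
        d->next--;
    return pfn;
}
```

This also illustrates Ajay's observation: since allocation proceeds from higher addresses downward, returning pfn 0 only happens when the space is nearly exhausted — which is exactly when conflating 0 with "allocation failed" would bite.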
[PATCH AUTOSEL 5.18 001/159] iommu/vt-d: Add RPLS to quirk list to skip TE disabling
From: Tejas Upadhyay [ Upstream commit 0a967f5bfd9134b89681cae58deb222e20840e76 ] The VT-d spec requires (10.4.4 Global Command Register, TE field) that: Hardware implementations supporting DMA draining must drain any in-flight DMA read/write requests queued within the Root-Complex before completing the translation enable command and reflecting the status of the command through the TES field in the Global Status register. Unfortunately, some integrated graphics devices fail to do so after some kind of power state transition. As a result, the system might get stuck in iommu_disable_translation(), waiting for the completion of the TE transition. This adds RPLS to the quirk list for those devices and skips TE disabling if the quirk hits. Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4898 Tested-by: Raviteja Goud Talla Cc: Rodrigo Vivi Acked-by: Lu Baolu Signed-off-by: Tejas Upadhyay Reviewed-by: Rodrigo Vivi Signed-off-by: Rodrigo Vivi Link: https://patchwork.freedesktop.org/patch/msgid/20220302043256.191529-1-tejaskumarx.surendrakumar.upadh...@intel.com Signed-off-by: Sasha Levin --- drivers/iommu/intel/iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 0ea47e17b379..ba9a63cac47c 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -5031,7 +5031,7 @@ static void quirk_igfx_skip_te_disable(struct pci_dev *dev) ver = (dev->device >> 8) & 0xff; if (ver != 0x45 && ver != 0x46 && ver != 0x4c && ver != 0x4e && ver != 0x8a && ver != 0x98 && - ver != 0x9a) + ver != 0x9a && ver != 0xa7) return; if (risky_device(dev)) -- 2.35.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 1/6] iommu: Add a per domain PASID for DMA API
On Tue, May 24, 2022 at 08:17:27AM -0700, Jacob Pan wrote: > Hi Jason, > > On Tue, 24 May 2022 10:50:34 -0300, Jason Gunthorpe wrote: > > > On Wed, May 18, 2022 at 11:21:15AM -0700, Jacob Pan wrote: > > > DMA requests tagged with PASID can target individual IOMMU domains. > > > Introduce a domain-wide PASID for DMA API, it will be used on the same > > > mapping as legacy DMA without PASID. Let it be IOVA or PA in case of > > > identity domain. > > > > Huh? I can't understand what this is trying to say or why this patch > > makes sense. > > > > We really should not have pasid's like this attached to the domains.. > > > This is the same "DMA API global PASID" you reviewed in v3, I just > singled it out as a standalone patch and renamed it. Here is your previous > review comment. > > > +++ b/include/linux/iommu.h > > @@ -105,6 +105,8 @@ struct iommu_domain { > > enum iommu_page_response_code (*iopf_handler)(struct iommu_fault *fault, > > void *data); > > void *fault_data; > > + ioasid_t pasid; /* Used for DMA requests with PASID */ > > + atomic_t pasid_users; > > These are poorly named, this is really the DMA API global PASID and > shouldn't be used for other things. > > > > Perhaps I misunderstood, do you mind explaining more? You still haven't really explained what this is for in this patch, maybe it just needs a better commit message, or maybe something is wrong. I keep saying the DMA API usage is not special, so why do we need to create a new global pasid and refcount? Realistically this is only going to be used by IDXD, why can't we just allocate a PASID and return it to the driver every time a driver asks for DMA API on PASID mode? Why does the core need to do anything special? Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/dma: Fix race condition during iova_domain initialization
From: Yunfei Wang When many devices share the same iova domain, iommu_dma_init_domain() may be called at the same time. The check of iovad->start_pfn then returns false in each caller, and all of them enter init_iova_domain() to do the iovad initialization. Fix this by protecting init_iova_domain() with iommu_dma_cookie->mutex. Exception backtrace: rb_insert_color(param1=0xFF80CD2BDB40, param3=1) + 64 init_iova_domain() + 180 iommu_setup_dma_ops() + 260 arch_setup_dma_ops() + 132 of_dma_configure_id() + 468 platform_dma_configure() + 32 really_probe() + 1168 driver_probe_device() + 268 __device_attach_driver() + 524 __device_attach() + 524 bus_probe_device() + 64 deferred_probe_work_func() + 260 process_one_work() + 580 worker_thread() + 1076 kthread() + 332 ret_from_fork() + 16 Signed-off-by: Ning Li Signed-off-by: Yunfei Wang --- drivers/iommu/dma-iommu.c | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 09f6e1c0f9c0..b38c5041eeab 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -63,6 +63,7 @@ struct iommu_dma_cookie { /* Domain for flush queue callback; NULL if flush queue not in use */ struct iommu_domain *fq_domain; + struct mutex mutex; }; static DEFINE_STATIC_KEY_FALSE(iommu_deferred_attach_enabled); @@ -309,6 +310,7 @@ int iommu_get_dma_cookie(struct iommu_domain *domain) if (!domain->iova_cookie) return -ENOMEM; + mutex_init(&domain->iova_cookie->mutex); return 0; } @@ -549,26 +551,33 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, } /* start_pfn is always nonzero for an already-initialised domain */ + mutex_lock(&cookie->mutex); if (iovad->start_pfn) { if (1UL << order != iovad->granule || base_pfn != iovad->start_pfn) { pr_warn("Incompatible range for DMA domain\n"); - return -EFAULT; + ret = -EFAULT; + goto done_unlock; } - return 0; + ret = 0; + goto done_unlock; } init_iova_domain(iovad, 1UL << order, base_pfn); ret =
iova_domain_init_rcaches(iovad); if (ret) - return ret; + goto done_unlock; /* If the FQ fails we can simply fall back to strict mode */ if (domain->type == IOMMU_DOMAIN_DMA_FQ && iommu_dma_init_fq(domain)) domain->type = IOMMU_DOMAIN_DMA; - return iova_reserve_iommu_regions(dev, domain); + ret = iova_reserve_iommu_regions(dev, domain); + +done_unlock: + mutex_unlock(&cookie->mutex); + return ret; } /** -- 2.18.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
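The shape of the fix is classic one-time initialisation under a lock: take the mutex, re-check the "already initialised" condition, and only then run the init path. A userspace sketch with stand-in bodies (field names mirror the patch; init_iova_domain() is reduced to setting start_pfn, and -14 stands in for -EFAULT):

```c
#include <assert.h>
#include <pthread.h>

/* Sketch of the race fix: two devices sharing one iova_domain must not
 * both see start_pfn == 0 and both run the one-time initialisation. */
struct cookie_model {
    pthread_mutex_t mutex;
    unsigned long start_pfn;     /* nonzero once initialised */
    int init_calls;              /* counts how often the init path ran */
};

static int init_domain_model(struct cookie_model *c, unsigned long base_pfn)
{
    int ret = 0;

    pthread_mutex_lock(&c->mutex);
    if (c->start_pfn) {                  /* already initialised? */
        if (c->start_pfn != base_pfn)
            ret = -14;                   /* incompatible range, as in the patch */
        goto done_unlock;
    }
    c->start_pfn = base_pfn;             /* stands in for init_iova_domain() */
    c->init_calls++;
done_unlock:
    pthread_mutex_unlock(&c->mutex);
    return ret;
}
```

Without the lock, two concurrent callers can both pass the `if (c->start_pfn)` check before either sets it — the double init_iova_domain() the backtrace above shows crashing inside rb_insert_color().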
Re: [PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in debugfs
On Sun, May 29, 2022 at 01:14:46PM +0800, Baolu Lu wrote: > From 1e87b5df40c6ce9414cdd03988c3b52bfb17af5f Mon Sep 17 00:00:00 2001 > From: Lu Baolu > Date: Sun, 29 May 2022 10:18:56 +0800 > Subject: [PATCH 1/1] iommu/vt-d: debugfs: Remove device_domain_lock usage > > The domain_translation_struct debugfs node is used to dump static > mappings of PCI devices. It potentially races with setting new > domains to devices and the iommu_map/unmap() interfaces. The existing > code tries to use the global spinlock device_domain_lock to avoid the > races, but this is problematical as this lock is only used to protect > the device tracking lists of the domains. > > Instead of using an immature lock to cover up the problem, it's better > to explicitly restrict the use of this debugfs node. This also makes > device_domain_lock static. What does "explicitly restrict" mean? Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH v1 7/9] driver core: Add fw_devlink_unblock_may_probe() helper function
On Thu, May 26, 2022 at 1:22 PM Saravana Kannan wrote: > > This function can be used during the kernel boot sequence to forcefully > override fw_devlink=on and unblock the probing of all devices that have > a driver. > > It's mainly meant to be called from late_initcall() or > late_initcall_sync() where a device needs to probe before the kernel can > mount rootfs. ... > diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h > index 9a81c4410b9f..0770edda7068 100644 > --- a/include/linux/fwnode.h > +++ b/include/linux/fwnode.h > @@ -13,6 +13,7 @@ > #include > #include > #include > +#include > > struct fwnode_operations; > struct device; > @@ -199,5 +200,6 @@ extern bool fw_devlink_is_strict(void); > int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup); > void fwnode_links_purge(struct fwnode_handle *fwnode); > void fw_devlink_purge_absent_suppliers(struct fwnode_handle *fwnode); > +void __init fw_devlink_unblock_may_probe(void); I don't think you need init.h and __init here. Important is that you have it in the C-file. Am I wrong? -- With Best Regards, Andy Shevchenko ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH v1 0/9] deferred_probe_timeout logic clean up
Hi Saravana, On Thu, May 26, 2022 at 10:15 AM Saravana Kannan wrote: > This series is based on linux-next + these 2 small patches applies on top: > https://lore.kernel.org/lkml/20220526034609.480766-1-sarava...@google.com/ > > A lot of the deferred_probe_timeout logic is redundant with > fw_devlink=on. Also, enabling deferred_probe_timeout by default breaks > a few cases. > > This series tries to delete the redundant logic, simplify the frameworks > that use driver_deferred_probe_check_state(), enable > deferred_probe_timeout=10 by default, and fixes the nfsroot failure > case. > > Patches 1 to 3 are fairly straightforward and can probably be applied > right away. > > Patches 4 to 9 are related and are the complicated bits of this series. > > Patch 8 is where someone with more knowledge of the IP auto config code > can help rewrite the patch to limit the scope of the workaround by > running the work around only if IP auto config fails the first time > around. But it's also something that can be optimized in the future > because it's already limited to the case where IP auto config is enabled > using the kernel commandline. Thanks for your series! > Yoshihiro/Geert, > > If you can test this patch series and confirm that the NFS root case > works, I'd really appreciate that. On Salvator-XS, Micrel KSZ9031 Gigabit PHY probe is no longer delayed by 9s after applying the two earlier patches, and the same is true after applying this series on top. Tested-by: Geert Uytterhoeven I will do testing on more boards, but that may take a while, as we're in the middle of the merge window. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH v1 2/9] pinctrl: devicetree: Delete usage of driver_deferred_probe_check_state()
Hi Saravana, Thanks for your patch! On Thu, May 26, 2022 at 10:16 AM Saravana Kannan wrote: > Now that fw_devlink=on by default and fw_devlink supports > "pinctrl-[0-8]" property, the execution will never get to the point 0-9? oh, it's really 0-8: drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl0, "pinctrl-0", NULL) drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl1, "pinctrl-1", NULL) drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl2, "pinctrl-2", NULL) drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl3, "pinctrl-3", NULL) drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl4, "pinctrl-4", NULL) drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL) drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL) drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL) drivers/of/property.c:DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL) Looks fragile, especially since we now have: arch/arm64/boot/dts/microchip/sparx5_pcb134_board.dtsi: pinctrl-9 = <_9>; arch/arm64/boot/dts/microchip/sparx5_pcb134_board.dtsi: pinctrl-10 = <_10>; arch/arm64/boot/dts/microchip/sparx5_pcb134_board.dtsi: pinctrl-11 = <_11>; arch/arm64/boot/dts/microchip/sparx5_pcb134_board.dtsi: pinctrl-12 = <_pins_i>; > where driver_deferred_probe_check_state() is called before the supplier > has probed successfully or before deferred probe timeout has expired. > > So, delete the call and replace it with -ENODEV. > > Signed-off-by: Saravana Kannan Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH v1 4/9] Revert "driver core: Set default deferred_probe_timeout back to 0."
Hi Saravana, On Thu, May 26, 2022 at 10:16 AM Saravana Kannan wrote: > This reverts commit 11f7e7ef553b6b93ac1aa74a3c2011b9cc8aeb61. scripts/checkpatch.pl says: WARNING: Unknown commit id '11f7e7ef553b6b93ac1aa74a3c2011b9cc8aeb61', maybe rebased or not pulled? I assume this is your local copy of https://lore.kernel.org/r/20220526034609.480766-3-sarava...@google.com? Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH v1 0/9] deferred_probe_timeout logic clean up
On 2022-05-26 01:15:39 [-0700], Saravana Kannan wrote: > Yoshihiro/Geert, Hi Saravana, > If you can test this patch series and confirm that the NFS root case > works, I'd really appreciate that. The two patches you sent earlier, plus this series, plus diff --git a/drivers/base/core.c b/drivers/base/core.c index 7ff7fbb006431..829d9b1f7403f 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -1697,8 +1697,6 @@ static int fw_devlink_may_probe(struct device *dev, void *data) */ void __init fw_devlink_unblock_may_probe(void) { - struct device_link *link, *ln; - if (!fw_devlink_flags || fw_devlink_is_permissive()) return; and it compiles + boots without a delay. Sebastian ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v7 1/2] iommu/io-pgtable-arm-v7s: Add a quirk to allow pgtable PA up to 35bit
From: Yunfei Wang Single memory zone feature will remove ZONE_DMA32 and ZONE_DMA and cause pgtable PA size larger than 32bit. Since Mediatek IOMMU hardware support at most 35bit PA in pgtable, so add a quirk to allow the PA of pgtables support up to bit35. Signed-off-by: Ning Li Signed-off-by: Yunfei Wang --- drivers/iommu/io-pgtable-arm-v7s.c | 48 +- include/linux/io-pgtable.h | 17 +++ 2 files changed, 45 insertions(+), 20 deletions(-) diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c index be066c1503d3..9a7671a89fd7 100644 --- a/drivers/iommu/io-pgtable-arm-v7s.c +++ b/drivers/iommu/io-pgtable-arm-v7s.c @@ -182,14 +182,8 @@ static bool arm_v7s_is_mtk_enabled(struct io_pgtable_cfg *cfg) (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT); } -static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl, - struct io_pgtable_cfg *cfg) +static arm_v7s_iopte to_iopte_mtk(phys_addr_t paddr, arm_v7s_iopte pte) { - arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl); - - if (!arm_v7s_is_mtk_enabled(cfg)) - return pte; - if (paddr & BIT_ULL(32)) pte |= ARM_V7S_ATTR_MTK_PA_BIT32; if (paddr & BIT_ULL(33)) @@ -199,6 +193,17 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl, return pte; } +static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl, + struct io_pgtable_cfg *cfg) +{ + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl); + + if (!arm_v7s_is_mtk_enabled(cfg)) + return pte; + + return to_iopte_mtk(paddr, pte); +} + static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl, struct io_pgtable_cfg *cfg) { @@ -234,6 +239,7 @@ static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl, static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, struct arm_v7s_io_pgtable *data) { + gfp_t gfp_l1 = __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA; struct io_pgtable_cfg *cfg = >iop.cfg; struct device *dev = cfg->iommu_dev; phys_addr_t phys; @@ -241,9 +247,11 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, size_t size = 
ARM_V7S_TABLE_SIZE(lvl, cfg); void *table = NULL; + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT) + gfp_l1 = GFP_KERNEL | __GFP_ZERO; + if (lvl == 1) - table = (void *)__get_free_pages( - __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size)); + table = (void *)__get_free_pages(gfp_l1, get_order(size)); else if (lvl == 2) table = kmem_cache_zalloc(data->l2_tables, gfp); @@ -251,7 +259,8 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp, return NULL; phys = virt_to_phys(table); - if (phys != (arm_v7s_iopte)phys) { + if (phys != (arm_v7s_iopte)phys && + !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)) { /* Doesn't fit in PTE */ dev_err(dev, "Page table does not fit in PTE: %pa", ); goto out_free; @@ -457,9 +466,14 @@ static arm_v7s_iopte arm_v7s_install_table(arm_v7s_iopte *table, arm_v7s_iopte curr, struct io_pgtable_cfg *cfg) { + phys_addr_t phys = virt_to_phys(table); arm_v7s_iopte old, new; - new = virt_to_phys(table) | ARM_V7S_PTE_TYPE_TABLE; + new = phys | ARM_V7S_PTE_TYPE_TABLE; + + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT) + new = to_iopte_mtk(phys, new); + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS) new |= ARM_V7S_ATTR_NS_TABLE; @@ -778,7 +792,9 @@ static phys_addr_t arm_v7s_iova_to_phys(struct io_pgtable_ops *ops, static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie) { + slab_flags_t slab_flag = ARM_V7S_TABLE_SLAB_FLAGS; struct arm_v7s_io_pgtable *data; + phys_addr_t paddr; if (cfg->ias > (arm_v7s_is_mtk_enabled(cfg) ? 34 : ARM_V7S_ADDR_BITS)) return NULL; @@ -788,7 +804,8 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg, if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_PERMS | - IO_PGTABLE_QUIRK_ARM_MTK_EXT)) + IO_PGTABLE_QUIRK_ARM_MTK_EXT | + IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)) return NULL; /* If ARM_MTK_4GB is enabled, the NO_PERMS is also expected. 
*/ @@ -801,10 +818,12 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg, return NULL;
[PATCH v7 2/2] iommu/mediatek: Allow page table PA up to 35bit
From: Yunfei Wang Single memory zone feature will remove ZONE_DMA32 and ZONE_DMA. So add the quirk IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT to let level 1 and level 2 pgtable support at most 35bit PA. Signed-off-by: Ning Li Signed-off-by: Yunfei Wang --- drivers/iommu/mtk_iommu.c | 29 + 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 6fd75a60abd6..dd9661690ca6 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -33,6 +33,9 @@ #define REG_MMU_PT_BASE_ADDR 0x000 #define MMU_PT_ADDR_MASK GENMASK(31, 7) +/* Mediatek extend ttbr bits[2:0] for PA bits[34:32] */ +#define MMU_PT_35BIT_PA(pa)\ + ((pa & GENMASK_ULL(31, 7)) | ((pa & GENMASK_ULL(34, 32)) >> 32)) #define REG_MMU_INVALIDATE 0x020 #define F_ALL_INVLD0x2 @@ -118,6 +121,7 @@ #define WR_THROT_ENBIT(6) #define HAS_LEGACY_IVRP_PADDR BIT(7) #define IOVA_34_EN BIT(8) +#define PGTABLE_PA_35_EN BIT(9) #define MTK_IOMMU_HAS_FLAG(pdata, _x) \ pdata)->flags) & (_x)) == (_x)) @@ -401,6 +405,9 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom, .iommu_dev = data->dev, }; + if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN)) + dom->cfg.quirks |= IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT; + if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) dom->cfg.oas = data->enable_4GB ? 
33 : 32; else @@ -450,6 +457,7 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain, struct mtk_iommu_domain *dom = to_mtk_domain(domain); struct device *m4udev = data->dev; int ret, domid; + u32 regval; domid = mtk_iommu_get_domain_id(dev, data->plat_data); if (domid < 0) @@ -472,8 +480,13 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain, return ret; } data->m4u_dom = dom; - writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK, - data->base + REG_MMU_PT_BASE_ADDR); + + /* Bits[6:3] are invalid for mediatek platform */ + if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN)) + regval = MMU_PT_35BIT_PA(dom->cfg.arm_v7s_cfg.ttbr); + else + regval = dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK; + writel(regval, data->base + REG_MMU_PT_BASE_ADDR); pm_runtime_put(m4udev); } @@ -987,6 +1000,7 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) struct mtk_iommu_suspend_reg *reg = >reg; struct mtk_iommu_domain *m4u_dom = data->m4u_dom; void __iomem *base = data->base; + u32 regval; int ret; ret = clk_prepare_enable(data->bclk); @@ -1010,7 +1024,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL); writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR); writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG); - writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK, base + REG_MMU_PT_BASE_ADDR); + + /* Bits[6:3] are invalid for mediatek platform */ + if (MTK_IOMMU_HAS_FLAG(data->plat_data, PGTABLE_PA_35_EN)) + regval = MMU_PT_35BIT_PA(m4u_dom->cfg.arm_v7s_cfg.ttbr); + else + regval = m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK; + writel(regval, base + REG_MMU_PT_BASE_ADDR); /* * Users may allocate dma buffer before they call pm_runtime_get, @@ -1038,7 +1058,8 @@ static const struct mtk_iommu_plat_data mt2712_data = { static const struct mtk_iommu_plat_data mt6779_data = { .m4u_plat = M4U_MT6779, - .flags = 
HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN, + .flags = HAS_SUB_COMM | OUT_ORDER_WR_EN | WR_THROT_EN | +PGTABLE_PA_35_EN, .inv_sel_reg = REG_MMU_INV_SEL_GEN2, .iova_region = single_domain, .iova_region_nr = ARRAY_SIZE(single_domain), -- 2.18.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
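The MMU_PT_35BIT_PA() packing from patch 2/2 is easy to sanity-check in isolation: PA bits [31:7] stay in place and PA bits [34:32] move into register bits [2:0], since bits [6:3] of the TTBR register are invalid on these MediaTek platforms. A userspace restatement (GENMASK_ULL redefined locally so the snippet is self-contained):

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's GENMASK_ULL(h, l): bits h..l set. */
#define GENMASK_ULL(h, l) (((~0ULL) >> (63 - (h))) & (~0ULL << (l)))

/* MediaTek extends the 32-bit TTBR: PA[31:7] in place, PA[34:32] -> bits[2:0]. */
static uint32_t mmu_pt_35bit_pa(uint64_t pa)
{
    return (uint32_t)((pa & GENMASK_ULL(31, 7)) |
                      ((pa & GENMASK_ULL(34, 32)) >> 32));
}
```

Note the packing only works because a level-1 table is 16KB-aligned, so PA bits [6:0] are always zero and bits [2:0] of the register are free to carry the high PA bits.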
Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
On Wed, May 25, 2022 at 12:49:00PM -0700, Saravana Kannan wrote: > On Wed, May 25, 2022 at 12:12 AM Sebastian Andrzej Siewior > wrote: > > > > On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote: > > > > Removing probe_timeout_waitqueue (as suggested) or setting the timeout > > > > to 0 avoids the delay. > > > > > > In your case, I think it might be working as intended? Curious, what > > > was the call stack in your case where it was blocked? > > > > Why is then there 10sec delay during boot? The backtrace is > > |[ cut here ] > > |WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 > > wait_for_device_probe+0x30/0x110 > > |Modules linked in: > > |CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154 > > |RIP: 0010:wait_for_device_probe+0x30/0x110 > > |Call Trace: > > | > > | prepare_namespace+0x2b/0x160 > > | kernel_init_freeable+0x2b3/0x2dd > > | kernel_init+0x11/0x110 > > | ret_from_fork+0x22/0x30 > > | > > > > Looking closer, it can't access init. This in particular box boots > > directly the kernel without an initramfs so the kernel later mounts > > /dev/sda1 and everything is good. So that seems to be the reason… > Hello there, My (QEMU) boot times were recently extended by 10 seconds. Looking at the timestamps, it looks like nothing is being done for 10 whole seconds. 
A git bisect landed me at the patch in $subject: 2b28a1a84a0e ("driver core: Extend deferred probe timeout on driver registration") Adding a WARN_ON(1) in wait_for_device_probe(), as requested by the patch author from the others seeing a regression with this patch, gives two different stacktraces during boot: [0.459633] printk: console [netcon0] enabled [0.459636] printk: console [netcon0] printing thread started [0.459637] netconsole: network logging started [0.459896] cfg80211: Loading compiled-in X.509 certificates for regulatory database [0.460230] kworker/u8:6 (105) used greatest stack depth: 14744 bytes left [0.461031] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7' [0.461077] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2 [0.461085] cfg80211: failed to load regulatory.db [0.461113] ALSA device list: [0.461116] No soundcards found. [0.461614] [ cut here ] [0.461615] WARNING: CPU: 2 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160 [0.485809] Modules linked in: [0.486089] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220526-4-g74f936013b08-dirty #20 [0.486842] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 [0.487707] RIP: 0010:wait_for_device_probe+0x1a/0x160 [0.488103] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f [0.489539] RSP: :9c7900013ed8 EFLAGS: 00010246 [0.489965] RAX: RBX: 0008 RCX: 0d02 [0.490597] RDX: 0cc2 RSI: RDI: 0002e990 [0.491181] RBP: 0214 R08: 000f R09: 0064 [0.491788] R10: 9c7900013c6c R11: R12: 8964c0343640 [0.492384] R13: 9e51791c R14: R15: [0.492960] FS: () GS:896637d0() knlGS: [0.493658] CS: 0010 DS: ES: CR0: 80050033 [0.494501] CR2: CR3: 0001ed20c001 CR4: 00370ee0 [0.495621] Call Trace: [0.496059] [0.496266] ? 
init_eaccess+0x3b/0x76 [0.496657] prepare_namespace+0x30/0x16a [0.497016] kernel_init_freeable+0x207/0x212 [0.497407] ? rest_init+0xc0/0xc0 [0.497714] kernel_init+0x16/0x120 [0.498250] ret_from_fork+0x1f/0x30 [0.498898] [0.499307] ---[ end trace ]--- [0.748413] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [0.749053] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [0.749461] ata2.00: ATA-7: QEMU HARDDISK, version, max UDMA/100 [0.749470] ata2.00: 732 sectors, multi 16: LBA48 NCQ (depth 32) [0.749479] ata2.00: applying bridge limits [0.750915] ata4: SATA link down (SStatus 0 SControl 300) [0.752110] ata5: SATA link down (SStatus 0 SControl 300) [0.753424] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [0.754877] ata3: SATA link down (SStatus 0 SControl 300) [0.755342] ata1.00: ATA-7: QEMU HARDDISK, version, max UDMA/100 [0.755377] ata1.00: 268435456 sectors, multi 16: LBA48 NCQ (depth 32) [0.755387] ata1.00: applying bridge limits [0.755486] ata6.00: ATA-7: QEMU HARDDISK, version, max UDMA/100 [0.755492] ata6.00: 8388608 sectors, multi 16: LBA48 NCQ (depth 32) [0.755500] ata6.00: applying bridge limits [