Re: [PATCH] dma-pool: Do not allocate pool memory from CMA
Hi Nicolas, Sorry I got stuck on other things yesterday. On Tue, 21 Jul 2020 at 21:57, Nicolas Saenz Julienne wrote: > > On Tue, 2020-07-21 at 20:52 +0530, Amit Pundir wrote: > > [...] > > > > > > Can you try booting *without* my patch and this in the kernel > > > > > command > > > > > line: "cma=16M@0x1-0x2". > > > > > > > > It doesn't boot with this added kernel command line. > > > > > > For the record, this placed the CMA in the [4GB, 8GB] address space > > > instead of you setup's default: [3GB, 4GB]. All atomic pools fall > > > in > > > that memory area without my patch, which makes me think some of the > > > devices on your board might not like higher addresses. > > > > > > > Thank you Nicolas for the details. Though we don't set the CMA > > alloc-ranges explicitly in upstream sdm845 dts, but I dug around and > > found that CMA alloc-ranges in the downstream kernel are indeed in > > lower address space. > > https://github.com/MiCode/Xiaomi_Kernel_OpenSource/blob/dipper-q-oss/arch/arm64/boot/dts/qcom/sdm845.dtsi#L662 > > > > /* global autoconfigured region for contiguous allocations */ > > linux,cma { > > compatible = "shared-dma-pool"; > > alloc-ranges = <0 0x 0 0x>; > > reusable; > > alignment = <0 0x40>; > > size = <0 0x200>; > > linux,cma-default; > > }; > > Pretty standard, and similar to what it's being used upstream by > default. > > > > > > What happens if you boot with my troublesome patch with this in > > > your > > > device tree? (insert it at the bottom of sdm845-beryllium.dts) > > > > > > { > > > dma-ranges = <0 0 0 0 0x1 0>; > > > }; > > > > > > > Device still doesn't boot up to adb shell. > > Let's get a bigger hammer, I'm just looking for clues here. Can you > apply this and provide the dmesg output. > > diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c > index 6bc74a2d5127..2160676bf488 100644 > --- a/kernel/dma/pool.c > +++ b/kernel/dma/pool.c > @@ -268,6 +268,8 @@ void *dma_alloc_from_pool(struct device *dev, size_t size, > schedule_work(_pool_work); > } > > + dev_info(dev, "%s: size %lx, phys addr %llx, flags 0x%x\n", __func__, > size, phys, flags); > + > return ptr; > } I never made it to dma_alloc_from_pool() call from dma_direct_alloc_pages(), dma_should_alloc_from_pool() returns False from gfpflags_allow_blocking() block. Regards, Amit Pundir > > > Regards, > Nicolas > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v4] iommu/mediatek: check 4GB mode by reading infracfg
In previous discussion [1] and [2], we found that it is risky to use max_pfn or totalram_pages to tell if 4GB mode is enabled. Check 4GB mode by reading infracfg register, remove the usage of the un-exported symbol max_pfn. This is a step towards building mtk_iommu as a kernel module. [1] https://lore.kernel.org/lkml/20200603161132.2441-1-miles.c...@mediatek.com/ [2] https://lore.kernel.org/lkml/20200604080120.2628-1-miles.c...@mediatek.com/ [3] https://lore.kernel.org/lkml/20200715205120.GA778876@bogus/ Cc: Mike Rapoport Cc: David Hildenbrand Cc: Yong Wu Cc: Yingjoe Chen Cc: Christoph Hellwig Cc: Rob Herring Cc: Matthias Brugger Signed-off-by: Miles Chen --- Change since v3 - use lore.kernel.org links - move "change since..." after "---" Change since v2: - determine compatible string by m4u_plat - rebase to next-20200720 - add "---" Change since v1: - remove the phandle usage, search for infracfg instead [3] - use infracfg instead of infracfg_regmap - move infracfg definitaions to linux/soc/mediatek/infracfg.h - update enable_4GB only when has_4gb_mode --- drivers/iommu/mtk_iommu.c | 34 +++ include/linux/soc/mediatek/infracfg.h | 3 +++ 2 files changed, 32 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 59e5a62a34db..9ec666168822 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -3,7 +3,6 @@ * Copyright (c) 2015-2016 MediaTek Inc. * Author: Yong Wu */ -#include #include #include #include @@ -15,13 +14,16 @@ #include #include #include +#include #include #include #include #include #include +#include #include #include +#include #include #include @@ -640,8 +642,11 @@ static int mtk_iommu_probe(struct platform_device *pdev) struct resource *res; resource_size_t ioaddr; struct component_match *match = NULL; + struct regmap *infracfg; void*protect; int i, larb_nr, ret; + u32 val; + char*p; data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); if (!data) @@ -655,10 +660,29 @@ static int mtk_iommu_probe(struct platform_device *pdev) return -ENOMEM; data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN); - /* Whether the current dram is over 4GB */ - data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT)); - if (!MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) - data->enable_4GB = false; + data->enable_4GB = false; + if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) { + switch (data->plat_data->m4u_plat) { + case M4U_MT2712: + p = "mediatek,mt2712-infracfg"; + break; + case M4U_MT8173: + p = "mediatek,mt8173-infracfg"; + break; + default: + p = NULL; + } + + infracfg = syscon_regmap_lookup_by_compatible(p); + + if (IS_ERR(infracfg)) + return PTR_ERR(infracfg); + + ret = regmap_read(infracfg, REG_INFRA_MISC, ); + if (ret) + return ret; + data->enable_4GB = !!(val & F_DDR_4GB_SUPPORT_EN); + } res = platform_get_resource(pdev, IORESOURCE_MEM, 0); data->base = devm_ioremap_resource(dev, res); diff --git a/include/linux/soc/mediatek/infracfg.h b/include/linux/soc/mediatek/infracfg.h index fd25f0148566..233463d789c6 100644 --- a/include/linux/soc/mediatek/infracfg.h +++ b/include/linux/soc/mediatek/infracfg.h @@ -32,6 +32,9 @@ #define MT7622_TOP_AXI_PROT_EN_WB (BIT(2) | BIT(6) | \ BIT(7) | BIT(8)) +#define REG_INFRA_MISC 0xf00 +#define F_DDR_4GB_SUPPORT_EN BIT(13) + int mtk_infracfg_set_bus_protection(struct regmap *infracfg, u32 mask, bool reg_update); int mtk_infracfg_clear_bus_protection(struct regmap *infracfg, u32 mask, -- 2.18.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3] iommu/mediatek: check 4GB mode by reading infracfg
On Wed, 2020-07-22 at 17:19 +0200, Matthias Brugger wrote: > > On 22/07/2020 16:19, Miles Chen wrote: > > In previous discussion [1] and [2], we found that it is risky to > > use max_pfn or totalram_pages to tell if 4GB mode is enabled. > > > > Check 4GB mode by reading infracfg register, remove the usage > > of the un-exported symbol max_pfn. > > > > This is a step towards building mtk_iommu as a kernel module. > > > > --- > > That's wrong. The commit message would be cut after this '---' so we would > loose > the Cc and Signed-of-by tags. Thanks for the comment. understood, I will fix that in patch v4. > > > > > Change since v2: > > - determine compatible string by m4u_plat > > - rebase to next-20200720 > > - add "---" > > > > Change since v1: > > - remove the phandle usage, search for infracfg instead [3] > > - use infracfg instead of infracfg_regmap > > - move infracfg definitaions to linux/soc/mediatek/infracfg.h > > - update enable_4GB only when has_4gb_mode > > > > [1] > > https://urldefense.com/v3/__https://lkml.org/lkml/2020/6/3/733__;!!CTRNKA9wMg0ARbw!073_W_0qmeQnHgSGJRNPTbK2KnPa4VzaPqFBffFn12odyEL1LDaQtZEmrMY4tB_vBw$ > > > > [2] > > https://urldefense.com/v3/__https://lkml.org/lkml/2020/6/4/136__;!!CTRNKA9wMg0ARbw!073_W_0qmeQnHgSGJRNPTbK2KnPa4VzaPqFBffFn12odyEL1LDaQtZEmrMZ7PRs7yw$ > > > > I think using links to lore.kernel.org would make sure that the URL does not > change over time. As the commit log will stay there for ever, but who konws > what > happens with lkml.org I will use lore.kernel.org links > > > [3] > > https://urldefense.com/v3/__https://lkml.org/lkml/2020/7/15/1147__;!!CTRNKA9wMg0ARbw!073_W_0qmeQnHgSGJRNPTbK2KnPa4VzaPqFBffFn12odyEL1LDaQtZEmrMYreY-qqA$ > > > > > > Cc: Mike Rapoport > > Cc: David Hildenbrand > > Cc: Yong Wu > > Cc: Yingjoe Chen > > Cc: Christoph Hellwig > > Cc: Rob Herring > > Cc: Matthias Brugger > > Signed-off-by: Miles Chen > > The formating should look like this: > In previous discussion [1] and [2], we found that it is risky to > use max_pfn or totalram_pages to tell if 4GB mode is enabled. > > Check 4GB mode by reading infracfg register, remove the usage > of the un-exported symbol max_pfn. > > This is a step towards building mtk_iommu as a kernel module. > > [1] > https://urldefense.com/v3/__https://lkml.org/lkml/2020/6/3/733__;!!CTRNKA9wMg0ARbw!073_W_0qmeQnHgSGJRNPTbK2KnPa4VzaPqFBffFn12odyEL1LDaQtZEmrMY4tB_vBw$ > > [2] > https://urldefense.com/v3/__https://lkml.org/lkml/2020/6/4/136__;!!CTRNKA9wMg0ARbw!073_W_0qmeQnHgSGJRNPTbK2KnPa4VzaPqFBffFn12odyEL1LDaQtZEmrMZ7PRs7yw$ > > > Cc: Mike Rapoport > Cc: David Hildenbrand > Cc: Yong Wu > Cc: Yingjoe Chen > Cc: Christoph Hellwig > Cc: Rob Herring > Cc: Matthias Brugger > Signed-off-by: Miles Chen > --- > > Change since v2: > - determine compatible string by m4u_plat > - rebase to next-20200720 > - add "---" > > Change since v1: > - remove the phandle usage, search for infracfg instead > https://urldefense.com/v3/__https://lkml.org/lkml/2020/7/15/1147__;!!CTRNKA9wMg0ARbw!073_W_0qmeQnHgSGJRNPTbK2KnPa4VzaPqFBffFn12odyEL1LDaQtZEmrMYreY-qqA$ > > - use infracfg instead of infracfg_regmap > - move infracfg definitaions to linux/soc/mediatek/infracfg.h > - update enable_4GB only when has_4gb_mode > > > > > --- > > drivers/iommu/mtk_iommu.c | 34 +++ > > include/linux/soc/mediatek/infracfg.h | 3 +++ > > 2 files changed, 32 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > index 59e5a62a34db..9ec666168822 100644 > > --- a/drivers/iommu/mtk_iommu.c > > +++ b/drivers/iommu/mtk_iommu.c > > @@ -3,7 +3,6 @@ > >* Copyright (c) 2015-2016 MediaTek Inc. > >* Author: Yong Wu > >*/ > > -#include > > #include > > #include > > #include > > @@ -15,13 +14,16 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > #include > > #include > > #include > > +#include > > #include > > #include > > +#include > > #include > > #include > > > > @@ -640,8 +642,11 @@ static int mtk_iommu_probe(struct platform_device > > *pdev) > > struct resource *res; > > resource_size_t ioaddr; > > struct component_match *match = NULL; > > + struct regmap *infracfg; > > void*protect; > > int i, larb_nr, ret; > > + u32 val; > > + char*p; > > > > data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); > > if (!data) > > @@ -655,10 +660,29 @@ static int mtk_iommu_probe(struct platform_device > > *pdev) > > return -ENOMEM; > > data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN); > > > > - /* Whether the current dram is over 4GB */ > > - data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT)); > > - if
Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu
On 7/22/20 7:45 AM, Limonciello, Mario wrote: -Original Message- From: Lu Baolu Sent: Tuesday, July 21, 2020 6:07 PM To: Limonciello, Mario; Joerg Roedel Cc: baolu...@linux.intel.com; Ashok Raj; linux-ker...@vger.kernel.org; sta...@vger.kernel.org; Koba Ko; iommu@lists.linux-foundation.org Subject: Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu [EXTERNAL EMAIL] Hi Limonciello, On 7/21/20 10:44 PM, Limonciello, Mario wrote: -Original Message- From: iommu On Behalf Of Lu Baolu Sent: Monday, July 20, 2020 7:17 PM To: Joerg Roedel Cc: Ashok Raj;linux-ker...@vger.kernel.org;sta...@vger.kernel.org; Koba Ko;iommu@lists.linux-foundation.org Subject: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu The VT-d spec requires (10.4.4 Global Command Register, TE field) that: Hardware implementations supporting DMA draining must drain any in-flight DMA read/write requests queued within the Root-Complex before completing the translation enable command and reflecting the status of the command through the TES field in the Global Status register. Unfortunately, some integrated graphic devices fail to do so after some kind of power state transition. As the result, the system might stuck in iommu_disable_translation(), waiting for the completion of TE transition. This provides a quirk list for those devices and skips TE disabling if the qurik hits. Fixes:https://bugzilla.kernel.org/show_bug.cgi?id=208363 That one is for TGL. I think you also want to add this one for ICL: Fixes:https://bugzilla.kernel.org/show_bug.cgi?id=206571 Do you mean someone have tested that this patch also fixes the problem described in 206571? Yes, confusingly https://bugzilla.kernel.org/show_bug.cgi?id=208363#c31 actually is the XPS 9300 ICL system and issue. I also have a private confirmation from another person that it resolves it for them on another ICL platform. Okay! Thank you very much! I just posted v2 with this tag added. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu
The VT-d spec requires (10.4.4 Global Command Register, TE field) that: Hardware implementations supporting DMA draining must drain any in-flight DMA read/write requests queued within the Root-Complex before completing the translation enable command and reflecting the status of the command through the TES field in the Global Status register. Unfortunately, some integrated graphic devices fail to do so after some kind of power state transition. As the result, the system might stuck in iommu_disable_translation(), waiting for the completion of TE transition. This provides a quirk list for those devices and skips TE disabling if the qurik hits. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208363 Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206571 Tested-by: Koba Ko Tested-by: Jun Miao Cc: Ashok Raj Cc: sta...@vger.kernel.org Signed-off-by: Lu Baolu --- Change since v1: - Add below tags: Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206571 Tested-by: Jun Miao drivers/iommu/intel/dmar.c | 1 + drivers/iommu/intel/iommu.c | 27 +++ include/linux/dmar.h| 1 + include/linux/intel-iommu.h | 2 ++ 4 files changed, 31 insertions(+) diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c index 683b812c5c47..16f47041f1bf 100644 --- a/drivers/iommu/intel/dmar.c +++ b/drivers/iommu/intel/dmar.c @@ -1102,6 +1102,7 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd) } drhd->iommu = iommu; + iommu->drhd = drhd; return 0; diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index d759e7234e98..a459eac96754 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -356,6 +356,7 @@ static int intel_iommu_strict; static int intel_iommu_superpage = 1; static int iommu_identity_mapping; static int intel_no_bounce; +static int iommu_skip_te_disable; #define IDENTMAP_GFX 2 #define IDENTMAP_AZALIA4 @@ -1629,6 +1630,10 @@ static void iommu_disable_translation(struct intel_iommu *iommu) u32 sts; unsigned long flag; + if (iommu_skip_te_disable && iommu->drhd->gfx_dedicated && + (cap_read_drain(iommu->cap) || cap_write_drain(iommu->cap))) + return; + raw_spin_lock_irqsave(>register_lock, flag); iommu->gcmd &= ~DMA_GCMD_TE; writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG); @@ -4039,6 +4044,7 @@ static void __init init_no_remapping_devices(void) /* This IOMMU has *only* gfx devices. Either bypass it or set the gfx_mapped flag, as appropriate */ + drhd->gfx_dedicated = 1; if (!dmar_map_gfx) { drhd->ignored = 1; for_each_active_dev_scope(drhd->devices, @@ -6182,6 +6188,27 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0044, quirk_calpella_no_shadow_g DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0062, quirk_calpella_no_shadow_gtt); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x006a, quirk_calpella_no_shadow_gtt); +static void quirk_igfx_skip_te_disable(struct pci_dev *dev) +{ + unsigned short ver; + + if (!IS_GFX_DEVICE(dev)) + return; + + ver = (dev->device >> 8) & 0xff; + if (ver != 0x45 && ver != 0x46 && ver != 0x4c && + ver != 0x4e && ver != 0x8a && ver != 0x98 && + ver != 0x9a) + return; + + if (risky_device(dev)) + return; + + pci_info(dev, "Skip IOMMU disabling for graphics\n"); + iommu_skip_te_disable = 1; +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, quirk_igfx_skip_te_disable); + /* On Tylersburg chipsets, some BIOSes have been known to enable the ISOCH DMAR unit for the Azalia sound device, but not give it any TLB entries, which causes it to deadlock. Check for that. We do diff --git a/include/linux/dmar.h b/include/linux/dmar.h index d7bf029df737..65565820328a 100644 --- a/include/linux/dmar.h +++ b/include/linux/dmar.h @@ -48,6 +48,7 @@ struct dmar_drhd_unit { u16 segment;/* PCI domain */ u8 ignored:1; /* ignore drhd */ u8 include_all:1; + u8 gfx_dedicated:1;/* graphic dedicated*/ struct intel_iommu *iommu; }; diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h index 3e8fa1c7a1e6..04bd9279c3fb 100644 --- a/include/linux/intel-iommu.h +++ b/include/linux/intel-iommu.h @@ -600,6 +600,8 @@ struct intel_iommu { struct iommu_device iommu; /* IOMMU core code handle */ int node; u32 flags; /* Software defined flags */ + + struct dmar_drhd_unit *drhd; }; /* PCI domain-device relationship */ -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org
Re: [PATCH v5 3/7] iommu/vt-d: Fix PASID devTLB invalidation
On 7/23/20 3:26 AM, Jacob Pan wrote: DevTLB flush can be used for both DMA request with and without PASIDs. The former uses PASID#0 (RID2PASID), latter uses non-zero PASID for SVA usage. This patch adds a check for PASID value such that devTLB flush with PASID is used for SVA case. This is more efficient in that multiple PASIDs can be used by a single device, when tearing down a PASID entry we shall flush only the devTLB specific to a PASID. Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table") Signed-off-by: Jacob Pan Reviewed-by: Eric Auger Acked-by: Lu Baolu Best regards, baolu --- drivers/iommu/intel/pasid.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c index c81f0f17c6ba..fa0154cce537 100644 --- a/drivers/iommu/intel/pasid.c +++ b/drivers/iommu/intel/pasid.c @@ -486,7 +486,16 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu, qdep = info->ats_qdep; pfsid = info->pfsid; - qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0, 64 - VTD_PAGE_SHIFT); + /* +* When PASID 0 is used, it indicates RID2PASID(DMA request w/o PASID), +* devTLB flush w/o PASID should be used. For non-zero PASID under +* SVA usage, device could do DMA with multiple PASIDs. It is more +* efficient to flush devTLB specific to the PASID. +*/ + if (pasid == PASID_RID2PASID) + qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0, 64 - VTD_PAGE_SHIFT); + else + qi_flush_dev_iotlb_pasid(iommu, sid, pfsid, pasid, qdep, 0, 64 - VTD_PAGE_SHIFT); } void intel_pasid_tear_down_entry(struct intel_iommu *iommu, struct device *dev, ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 08/12] device core: Introduce DMA range map, supplanting dma_pfn_offset
On Tue, Jul 21, 2020 at 8:51 AM Christoph Hellwig wrote: > > On Wed, Jul 15, 2020 at 10:35:11AM -0400, Jim Quinlan wrote: > > The new field 'dma_range_map' in struct device is used to facilitate the > > use of single or multiple offsets between mapping regions of cpu addrs and > > dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was only > > capable of holding a single uniform offset and had no region bounds > > checking. > > > > The function of_dma_get_range() has been modified so that it takes a single > > argument -- the device node -- and returns a map, NULL, or an error code. > > The map is an array that holds the information regarding the DMA regions. > > Each range entry contains the address offset, the cpu_start address, the > > dma_start address, and the size of the region. > > > > of_dma_configure() is the typical manner to set range offsets but there are > > a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel > > driver code. These cases now invoke the function > > dma_attach_offset_range(dev, cpu_addr, dma_addr, size). > > So my main higher level issue here is the dma_attach_offset_range > function. I think it should keep the old functionality and just > set a global range from 0 to (phys_addr_t)-1, and bail out if there > are DMA ranges already: > > int dma_set_global_offset(struct device *dev, u64 offset); Hi Christoph, I had it this way in [V1...V5] but Robin requested that for V6 I should change this function to o add bounds to the call o if there is a mapping already, check if what is requested is already covered and return success. Can you and Robin please discuss this and let me know which way to move forward? > > > otherwise there is all kinds of minor nitpicks that aren't too > substantial, let me know what you think of something like this > hacked up version: Kind of hard to see what you have changed but I will diff both of our diffs and make the changes. Thanks, Jim Quinlan Broadcom STB > > > diff --git a/arch/arm/include/asm/dma-mapping.h > b/arch/arm/include/asm/dma-mapping.h > index bdd80ddbca3451..2405afeb79573a 100644 > --- a/arch/arm/include/asm/dma-mapping.h > +++ b/arch/arm/include/asm/dma-mapping.h > @@ -35,8 +35,11 @@ static inline const struct dma_map_ops > *get_arch_dma_ops(struct bus_type *bus) > #ifndef __arch_pfn_to_dma > static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn) > { > - if (dev) > - pfn -= dev->dma_pfn_offset; > + if (dev) { > + phys_addr_t paddr = PFN_PHYS(pfn); > + > + pfn -= (dma_offset_from_phys_addr(dev, paddr) >> PAGE_SHIFT); > + } > return (dma_addr_t)__pfn_to_bus(pfn); > } > > @@ -45,8 +48,7 @@ static inline unsigned long dma_to_pfn(struct device *dev, > dma_addr_t addr) > unsigned long pfn = __bus_to_pfn(addr); > > if (dev) > - pfn += dev->dma_pfn_offset; > - > + pfn += (dma_offset_from_dma_addr(dev, addr) >> PAGE_SHIFT); > return pfn; > } > > diff --git a/arch/arm/mach-keystone/keystone.c > b/arch/arm/mach-keystone/keystone.c > index 638808c4e12247..7539679205fbf7 100644 > --- a/arch/arm/mach-keystone/keystone.c > +++ b/arch/arm/mach-keystone/keystone.c > @@ -8,6 +8,7 @@ > */ > #include > #include > +#include > #include > #include > #include > @@ -24,8 +25,6 @@ > > #include "keystone.h" > > -static unsigned long keystone_dma_pfn_offset __read_mostly; > - > static int keystone_platform_notifier(struct notifier_block *nb, > unsigned long event, void *data) > { > @@ -38,9 +37,12 @@ static int keystone_platform_notifier(struct > notifier_block *nb, > return NOTIFY_BAD; > > if (!dev->of_node) { > - dev->dma_pfn_offset = keystone_dma_pfn_offset; > - dev_err(dev, "set dma_pfn_offset%08lx\n", > - dev->dma_pfn_offset); > + int ret = dma_set_offset_range(dev, KEYSTONE_HIGH_PHYS_START, > + KEYSTONE_LOW_PHYS_START, > + KEYSTONE_HIGH_PHYS_SIZE); > + dev_err(dev, "set dma_offset%08llx%s\n", > + KEYSTONE_HIGH_PHYS_START - KEYSTONE_LOW_PHYS_START, > + ret ? " failed" : ""); > } > return NOTIFY_OK; > } > @@ -51,11 +53,8 @@ static struct notifier_block platform_nb = { > > static void __init keystone_init(void) > { > - if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) { > - keystone_dma_pfn_offset = PFN_DOWN(KEYSTONE_HIGH_PHYS_START - > - KEYSTONE_LOW_PHYS_START); > + if (PHYS_OFFSET >= KEYSTONE_HIGH_PHYS_START) > bus_register_notifier(_bus_type, _nb); > - } > keystone_pm_runtime_init(); > } > > diff --git a/arch/sh/drivers/pci/pcie-sh7786.c >
RE: [PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA
> -Original Message- > From: Christoph Hellwig [mailto:h...@lst.de] > Sent: Thursday, July 23, 2020 2:30 AM > To: Song Bao Hua (Barry Song) > Cc: h...@lst.de; m.szyprow...@samsung.com; robin.mur...@arm.com; > w...@kernel.org; ganapatrao.kulka...@cavium.com; > catalin.mari...@arm.com; iommu@lists.linux-foundation.org; Linuxarm > ; linux-arm-ker...@lists.infradead.org; > linux-ker...@vger.kernel.org; Jonathan Cameron > ; Nicolas Saenz Julienne > ; Steve Capper ; Andrew > Morton ; Mike Rapoport > Subject: Re: [PATCH v3 1/2] dma-direct: provide the ability to reserve > per-numa CMA > +cc Prime and Daode who are interested in this patchset. > On Sun, Jun 28, 2020 at 11:12:50PM +1200, Barry Song wrote: > > struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t > gfp) > > { > > size_t count = size >> PAGE_SHIFT; > > struct page *page = NULL; > > struct cma *cma = NULL; > > + int nid = dev ? dev_to_node(dev) : NUMA_NO_NODE; > > + bool alloc_from_pernuma = false; > > + > > + if ((count <= 1) && !(dev && dev->cma_area)) > > + return NULL; > > > > if (dev && dev->cma_area) > > cma = dev->cma_area; > > - else if (count > 1) > > + else if ((nid != NUMA_NO_NODE) && > dma_contiguous_pernuma_area[nid] > > + && !(gfp & (GFP_DMA | GFP_DMA32))) { > > + cma = dma_contiguous_pernuma_area[nid]; > > + alloc_from_pernuma = true; > > + } else { > > cma = dma_contiguous_default_area; > > + } > > I find the function rather confusing now. What about something > like (this relies on the fact that dev should never be NULL in the > DMA API) > > struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp) > { > size_t cma_align = min_t(size_t, get_order(size), > CONFIG_CMA_ALIGNMENT); > size_t count = size >> PAGE_SHIFT; > > if (gfpflags_allow_blocking(gfp)) > return NULL; > gfp &= __GFP_NOWARN; > > if (dev->cma_area) I got a kernel robot warning which said dev should be checked before being accessed when I did a similar change in v1. Probably it was an invalid warning if dev should never be null. > return cma_alloc(dev->cma_area, count, cma_align, gfp); > if (count <= 1) > return NULL; > > if (IS_ENABLED(CONFIG_PERNODE_CMA) && !(gfp & (GFP_DMA | > GFP_DMA32)) { > int nid = dev_to_node(dev); > struct cma *cma = dma_contiguous_pernuma_area[nid]; > struct page *page; > > if (cma) { > page = cma_alloc(cma, count, cma_align, gfp); > if (page) > return page; > } > } > > return cma_alloc(dma_contiguous_default_area, count, cma_align, gfp); > } Yes, it looks much better. > > > + /* > > +* otherwise, page is from either per-numa cma or default cma > > +*/ > > + int nid = page_to_nid(page); > > + > > + if (nid != NUMA_NO_NODE) { > > + if (cma_release(dma_contiguous_pernuma_area[nid], page, > > + PAGE_ALIGN(size) >> PAGE_SHIFT)) > > + return; > > + } > > + > > + if (cma_release(dma_contiguous_default_area, page, > > How can page_to_nid ever return NUMA_NO_NODE? I thought page_to_nid would return NUMA_NO_NODE if CONFIG_NUMA is not enabled. Probably I was wrong. Will get it fixed in v4. Thanks Barry ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA
> -Original Message- > From: Christoph Hellwig [mailto:h...@lst.de] > Sent: Thursday, July 23, 2020 2:17 AM > To: Song Bao Hua (Barry Song) > Cc: h...@lst.de; m.szyprow...@samsung.com; robin.mur...@arm.com; > w...@kernel.org; ganapatrao.kulka...@cavium.com; > catalin.mari...@arm.com; iommu@lists.linux-foundation.org; Linuxarm > ; linux-arm-ker...@lists.infradead.org; > linux-ker...@vger.kernel.org; Jonathan Cameron > ; Nicolas Saenz Julienne > ; Steve Capper ; Andrew > Morton ; Mike Rapoport > Subject: Re: [PATCH v3 1/2] dma-direct: provide the ability to reserve > per-numa CMA > +cc Prime and Daode who are interested in this patchset. > On Sun, Jun 28, 2020 at 11:12:50PM +1200, Barry Song wrote: > > This is useful for at least two scenarios: > > 1. ARM64 smmu will get memory from local numa node, it can save its > > command queues and page tables locally. Tests show it can decrease > > dma_unmap latency at lot. For example, without this patch, smmu on > > node2 will get memory from node0 by calling dma_alloc_coherent(), > > typically, it has to wait for more than 560ns for the completion of > > CMD_SYNC in an empty command queue; with this patch, it needs 240ns > > only. > > 2. when we set iommu passthrough, drivers will get memory from CMA, > > local memory means much less latency. > > I really don't like the config options. With the boot parameters > you can always hardcode that in CONFIG_CMDLINE anyway. I understand your concern. Anyway, The primary purpose of this patchset is providing a general way for users like IOMMU to get local coherent dma buffers to put their command queue and page tables in. The first user case is what really made me begin to prepare this patchset. For the second case, it is probably a positive side effect of this patchset for those users who have more concern on performance than dma security, then they maybe skip IOMMU by iommu.passthrough= [ARM64, X86] Configure DMA to bypass the IOMMU by default. Format: { "0" | "1" } 0 - Use IOMMU translation for DMA. 1 - Bypass the IOMMU for DMA. unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH. In this case, they can get local memory and get better performance. However, it is not the primary purpose of this patchset. Thanks Barry ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/1] iommu/arm-smmu: Implement qcom,skip-init
>Is the problem on SDM630 that when you write to SMR/S2CR the device >reboots? Or that when you start writing out the context bank >configuration that trips the display and the device reboots? I added some debug prints and the phone hangs after reaching the seventh CB (with i=6) at arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_FSR_FAULT); line in arm_smmu_device_reset. Konrad ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 5/7] iommu/vt-d: Fix devTLB flush for vSVA
From: Liu Yi L For guest SVA usage, in order to optimize for less VMEXIT, guest request of IOTLB flush also includes device TLB. On the host side, IOMMU driver performs IOTLB and implicit devTLB invalidation. When PASID-selective granularity is requested by the guest we need to derive the equivalent address range for devTLB instead of using the address information in the UAPI data. The reason for that is, unlike IOTLB flush, devTLB flush does not support PASID-selective granularity. This is to say, we need to set the following in the PASID based devTLB invalidation descriptor: - entire 64 bit range in address ~(0x1 << 63) - S bit = 1 (VT-d CH 6.5.2.6). Without this fix, device TLB flush range is not set properly for PASID selective granularity. This patch also merged devTLB flush code for both implicit and explicit cases. Fixes: 6ee1b77ba3ac ("iommu/vt-d: Add svm/sva invalidate function") Acked-by: Lu Baolu Reviewed-by: Eric Auger Signed-off-by: Liu Yi L Signed-off-by: Jacob Pan --- drivers/iommu/intel/iommu.c | 30 -- 1 file changed, 20 insertions(+), 10 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index bdd1e7d81178..c6999b9b6265 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -5416,7 +5416,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, sid = PCI_DEVID(bus, devfn); /* Size is only valid in address selective invalidation */ - if (inv_info->granularity != IOMMU_INV_GRANU_PASID) + if (inv_info->granularity == IOMMU_INV_GRANU_ADDR) size = to_vtd_size(inv_info->addr_info.granule_size, inv_info->addr_info.nb_granules); @@ -5425,6 +5425,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, IOMMU_CACHE_INV_TYPE_NR) { int granu = 0; u64 pasid = 0; + u64 addr = 0; granu = to_vtd_granularity(cache_type, inv_info->granularity); if (granu == -EINVAL) { @@ -5464,24 +5465,33 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, (granu == QI_GRAN_NONG_PASID) ? -1 : 1 << size, inv_info->addr_info.flags & IOMMU_INV_ADDR_FLAGS_LEAF); + if (!info->ats_enabled) + break; /* * Always flush device IOTLB if ATS is enabled. vIOMMU * in the guest may assume IOTLB flush is inclusive, * which is more efficient. */ - if (info->ats_enabled) - qi_flush_dev_iotlb_pasid(iommu, sid, - info->pfsid, pasid, - info->ats_qdep, - inv_info->addr_info.addr, - size); - break; + fallthrough; case IOMMU_CACHE_INV_TYPE_DEV_IOTLB: + /* +* PASID based device TLB invalidation does not support +* IOMMU_INV_GRANU_PASID granularity but only supports +* IOMMU_INV_GRANU_ADDR. +* The equivalent of that is we set the size to be the +* entire range of 64 bit. User only provides PASID info +* without address info. So we set addr to 0. +*/ + if (inv_info->granularity == IOMMU_INV_GRANU_PASID) { + size = 64 - VTD_PAGE_SHIFT; + addr = 0; + } else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR) + addr = inv_info->addr_info.addr; + if (info->ats_enabled) qi_flush_dev_iotlb_pasid(iommu, sid, info->pfsid, pasid, - info->ats_qdep, - inv_info->addr_info.addr, + info->ats_qdep, addr, size); else pr_warn_ratelimited("Passdown device IOTLB flush w/o ATS!\n"); -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 4/7] iommu/vt-d: Handle non-page aligned address
From: Liu Yi L Address information for device TLB invalidation comes from userspace when device is directly assigned to a guest with vIOMMU support. VT-d requires page aligned address. This patch checks and enforce address to be page aligned, otherwise reserved bits can be set in the invalidation descriptor. Unrecoverable fault will be reported due to non-zero value in the reserved bits. Fixes: 61a06a16e36d8 ("iommu/vt-d: Support flushing more translation cache types") Acked-by: Lu Baolu Signed-off-by: Liu Yi L Signed-off-by: Jacob Pan Reviewed-by: Eric Auger --- drivers/iommu/intel/dmar.c | 20 ++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c index 9be08b9400ee..6038d02c5066 100644 --- a/drivers/iommu/intel/dmar.c +++ b/drivers/iommu/intel/dmar.c @@ -1456,9 +1456,25 @@ void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid, * Max Invs Pending (MIP) is set to 0 for now until we have DIT in * ECAP. */ - desc.qw1 |= addr & ~mask; - if (size_order) + if (addr & GENMASK_ULL(size_order + VTD_PAGE_SHIFT, 0)) + pr_warn_ratelimited("Invalidate non-aligned address %llx, order %d\n", addr, size_order); + + /* Take page address */ + desc.qw1 = QI_DEV_EIOTLB_ADDR(addr); + + if (size_order) { + /* +* Existing 0s in address below size_order may be the least +* significant bit, we must set them to 1s to avoid having +* smaller size than desired. +*/ + desc.qw1 |= GENMASK_ULL(size_order + VTD_PAGE_SHIFT - 1, + VTD_PAGE_SHIFT); + /* Clear size_order bit to indicate size */ + desc.qw1 &= ~mask; + /* Set the S bit to indicate flushing more than 1 page */ desc.qw1 |= QI_DEV_EIOTLB_SIZE; + } qi_submit_sync(iommu, , 1, 0); } -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 7/7] iommu/vt-d: Disable multiple GPASID-dev bind
For the unlikely use case where multiple aux domains from the same pdev are attached to a single guest and then bound to a single process (thus same PASID) within that guest, we cannot easily support this case by refcounting the number of users. As there is only one SL page table per PASID while we have multiple aux domains thus multiple SL page tables for the same PASID. Extra unbinding guest PASID can happen due to race between normal and exception cases. Termination of one aux domain may affect others unless we actively track and switch aux domains to ensure the validity of SL page tables and TLB states in the shared PASID entry. Support for sharing second level PGDs across domains can reduce the complexity but this is not available due to the limitations on VFIO container architecture. We can revisit this decision once sharing PGDs are available. Overall, the complexity and potential glitch do not warrant this unlikely use case thereby removed by this patch. Fixes: 56722a4398a30 ("iommu/vt-d: Add bind guest PASID support") Acked-by: Lu Baolu Cc: Kevin Tian Cc: Lu Baolu Reviewed-by: Eric Auger Signed-off-by: Liu Yi L Signed-off-by: Jacob Pan --- drivers/iommu/intel/svm.c | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 6c87c807a0ab..d386853121a2 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -277,20 +277,16 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev, goto out; } + /* +* Do not allow multiple bindings of the same device-PASID since +* there is only one SL page tables per PASID. We may revisit +* once sharing PGD across domains are supported. +*/ for_each_svm_dev(sdev, svm, dev) { - /* -* For devices with aux domains, we should allow -* multiple bind calls with the same PASID and pdev. -*/ - if (iommu_dev_feature_enabled(dev, - IOMMU_DEV_FEAT_AUX)) { - sdev->users++; - } else { - dev_warn_ratelimited(dev, -"Already bound with PASID %u\n", -svm->pasid); - ret = -EBUSY; - } + dev_warn_ratelimited(dev, +"Already bound with PASID %u\n", +svm->pasid); + ret = -EBUSY; goto out; } } else { -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 6/7] iommu/vt-d: Warn on out-of-range invalidation address
For guest requested IOTLB invalidation, address and mask are provided as part of the invalidation data. VT-d HW silently ignores any address bits below the mask. SW shall also allow such case but give warning if address does not align with the mask. This patch relax the fault handling from error to warning and proceed with invalidation request with the given mask. Fixes: 6ee1b77ba3ac0 ("iommu/vt-d: Add svm/sva invalidate function") Acked-by: Lu Baolu Reviewed-by: Eric Auger Signed-off-by: Jacob Pan --- drivers/iommu/intel/iommu.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index c6999b9b6265..92c7ad66e64c 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -5447,13 +5447,12 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, switch (BIT(cache_type)) { case IOMMU_CACHE_INV_TYPE_IOTLB: + /* HW will ignore LSB bits based on address mask */ if (inv_info->granularity == IOMMU_INV_GRANU_ADDR && size && (inv_info->addr_info.addr & ((BIT(VTD_PAGE_SHIFT + size)) - 1))) { - pr_err_ratelimited("Address out of range, 0x%llx, size order %llu\n", - inv_info->addr_info.addr, size); - ret = -ERANGE; - goto out_unlock; + pr_err_ratelimited("User address not aligned, 0x%llx, size order %llu\n", + inv_info->addr_info.addr, size); } /* -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 2/7] iommu/vt-d: Remove global page support in devTLB flush
Global pages support is removed from VT-d spec 3.0 for dev TLB invalidation. This patch is to remove the bits for vSVA. Similar change already made for the native SVA. See the link below. Link: https://lkml.org/lkml/2019/8/26/651 Acked-by: Lu Baolu Reviewed-by: Eric Auger Signed-off-by: Jacob Pan --- drivers/iommu/intel/dmar.c | 4 +--- drivers/iommu/intel/iommu.c | 4 ++-- include/linux/intel-iommu.h | 3 +-- 3 files changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c index 683b812c5c47..9be08b9400ee 100644 --- a/drivers/iommu/intel/dmar.c +++ b/drivers/iommu/intel/dmar.c @@ -1438,8 +1438,7 @@ void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr, /* PASID-based device IOTLB Invalidate */ void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid, - u32 pasid, u16 qdep, u64 addr, - unsigned int size_order, u64 granu) + u32 pasid, u16 qdep, u64 addr, unsigned int size_order) { unsigned long mask = 1UL << (VTD_PAGE_SHIFT + size_order - 1); struct qi_desc desc = {.qw1 = 0, .qw2 = 0, .qw3 = 0}; @@ -1447,7 +1446,6 @@ void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid, desc.qw0 = QI_DEV_EIOTLB_PASID(pasid) | QI_DEV_EIOTLB_SID(sid) | QI_DEV_EIOTLB_QDEP(qdep) | QI_DEIOTLB_TYPE | QI_DEV_IOTLB_PFSID(pfsid); - desc.qw1 = QI_DEV_EIOTLB_GLOB(granu); /* * If S bit is 0, we only flush a single page. If S bit is set, diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index d759e7234e98..bdd1e7d81178 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -5474,7 +5474,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, info->pfsid, pasid, info->ats_qdep, inv_info->addr_info.addr, - size, granu); + size); break; case IOMMU_CACHE_INV_TYPE_DEV_IOTLB: if (info->ats_enabled) @@ -5482,7 +5482,7 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, info->pfsid, pasid, info->ats_qdep, inv_info->addr_info.addr, - size, granu); + size); else pr_warn_ratelimited("Passdown device IOTLB flush w/o ATS!\n"); break; diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h index 37b50e93..c7a8aae36771 100644 --- a/include/linux/intel-iommu.h +++ b/include/linux/intel-iommu.h @@ -381,7 +381,6 @@ enum { #define QI_DEV_EIOTLB_ADDR(a) ((u64)(a) & VTD_PAGE_MASK) #define QI_DEV_EIOTLB_SIZE (((u64)1) << 11) -#define QI_DEV_EIOTLB_GLOB(g) ((u64)(g) & 0x1) #define QI_DEV_EIOTLB_PASID(p) ((u64)((p) & 0xf) << 32) #define QI_DEV_EIOTLB_SID(sid) ((u64)((sid) & 0x) << 16) #define QI_DEV_EIOTLB_QDEP(qd) ((u64)((qd) & 0x1f) << 4) @@ -705,7 +704,7 @@ void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr, void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid, u32 pasid, u16 qdep, u64 addr, - unsigned int size_order, u64 granu); + unsigned int size_order); void qi_flush_pasid_cache(struct intel_iommu *iommu, u16 did, u64 granu, int pasid); -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 0/7] iommu/vt-d: Misc tweaks and fixes for vSVA
Hi Baolu and all, This is a series to address some of the issues we found in vSVA support. Most of the patches deal with exception handling, we also removed some bits that are not currently supported. Many thanks to Kevin Tian's review. Jacob & Yi Changelog: v5 Fix off by one bug in 4/7 v4 Address reviews from Eric Auger - Additional warning for unaligned mask and address in guest IOTLB invalidation - Comments rewrite and code simplification v3 - Use pr_err instead of WARN() for invalid user address range (6/7) - Fix logic in PASID selective devTLB flush (3/7) v2 Address reviews from Baolu - Fixed addr field in devTLB flush (5/7) - Assign address for single page devTLB invalidation (4/7) - Coding style tweaks Jacob Pan (4): iommu/vt-d: Remove global page support in devTLB flush iommu/vt-d: Fix PASID devTLB invalidation iommu/vt-d: Warn on out-of-range invalidation address iommu/vt-d: Disable multiple GPASID-dev bind Liu Yi L (3): iommu/vt-d: Enforce PASID devTLB field mask iommu/vt-d: Handle non-page aligned address iommu/vt-d: Fix devTLB flush for vSVA drivers/iommu/intel/dmar.c | 24 +++- drivers/iommu/intel/iommu.c | 39 --- drivers/iommu/intel/pasid.c | 11 ++- drivers/iommu/intel/svm.c | 22 +- include/linux/intel-iommu.h | 5 ++--- 5 files changed, 64 insertions(+), 37 deletions(-) -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 1/7] iommu/vt-d: Enforce PASID devTLB field mask
From: Liu Yi L Set proper masks to avoid invalid input spillover to reserved bits. Acked-by: Lu Baolu Reviewed-by: Eric Auger Signed-off-by: Liu Yi L Signed-off-by: Jacob Pan --- include/linux/intel-iommu.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h index 3e8fa1c7a1e6..37b50e93 100644 --- a/include/linux/intel-iommu.h +++ b/include/linux/intel-iommu.h @@ -381,8 +381,8 @@ enum { #define QI_DEV_EIOTLB_ADDR(a) ((u64)(a) & VTD_PAGE_MASK) #define QI_DEV_EIOTLB_SIZE (((u64)1) << 11) -#define QI_DEV_EIOTLB_GLOB(g) ((u64)g) -#define QI_DEV_EIOTLB_PASID(p) (((u64)p) << 32) +#define QI_DEV_EIOTLB_GLOB(g) ((u64)(g) & 0x1) +#define QI_DEV_EIOTLB_PASID(p) ((u64)((p) & 0xf) << 32) #define QI_DEV_EIOTLB_SID(sid) ((u64)((sid) & 0x) << 16) #define QI_DEV_EIOTLB_QDEP(qd) ((u64)((qd) & 0x1f) << 4) #define QI_DEV_EIOTLB_PFSID(pfsid) (((u64)(pfsid & 0xf) << 12) | \ -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v5 3/7] iommu/vt-d: Fix PASID devTLB invalidation
DevTLB flush can be used for both DMA request with and without PASIDs. The former uses PASID#0 (RID2PASID), latter uses non-zero PASID for SVA usage. This patch adds a check for PASID value such that devTLB flush with PASID is used for SVA case. This is more efficient in that multiple PASIDs can be used by a single device, when tearing down a PASID entry we shall flush only the devTLB specific to a PASID. Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table") Signed-off-by: Jacob Pan Reviewed-by: Eric Auger --- drivers/iommu/intel/pasid.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c index c81f0f17c6ba..fa0154cce537 100644 --- a/drivers/iommu/intel/pasid.c +++ b/drivers/iommu/intel/pasid.c @@ -486,7 +486,16 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu, qdep = info->ats_qdep; pfsid = info->pfsid; - qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0, 64 - VTD_PAGE_SHIFT); + /* +* When PASID 0 is used, it indicates RID2PASID(DMA request w/o PASID), +* devTLB flush w/o PASID should be used. For non-zero PASID under +* SVA usage, device could do DMA with multiple PASIDs. It is more +* efficient to flush devTLB specific to the PASID. +*/ + if (pasid == PASID_RID2PASID) + qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0, 64 - VTD_PAGE_SHIFT); + else + qi_flush_dev_iotlb_pasid(iommu, sid, pfsid, pasid, qdep, 0, 64 - VTD_PAGE_SHIFT); } void intel_pasid_tear_down_entry(struct intel_iommu *iommu, struct device *dev, -- 2.7.4 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 4/7] iommu/vt-d: Handle non-page aligned address
On Wed, 22 Jul 2020 09:01:27 +0800 Lu Baolu wrote: > > > > Not sure what state is this patch in, there is a bug in this patch > > (see below), shall I send out an updated version of this one only? > > or another incremental patch. > > Please send an updated version. I hope Joerg could pick these as 5.8 > fix. OK, will do. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6 01/12] iommu: Change type of pasid to u32
Hi, Joerg, On Wed, Jul 22, 2020 at 04:03:40PM +0200, Joerg Roedel wrote: > On Mon, Jul 13, 2020 at 04:47:56PM -0700, Fenghua Yu wrote: > > PASID is defined as a few different types in iommu including "int", > > "u32", and "unsigned int". To be consistent and to match with uapi > > definitions, define PASID and its variations (e.g. max PASID) as "u32". > > "u32" is also shorter and a little more explicit than "unsigned int". > > > > No PASID type change in uapi although it defines PASID as __u64 in > > some places. > > > > Suggested-by: Thomas Gleixner > > Signed-off-by: Fenghua Yu > > Reviewed-by: Tony Luck > > Reviewed-by: Lu Baolu > > Acked-by: Felix Kuehling > > For the IOMMU parts: > > Acked-by: Joerg Roedel Thank you! -Fenghua ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH V4 1/3] iommu: Add support to change default domain of an iommu group
Hi Joerg, Thanks for the reply. I will spin another version of the patch addressing your comments. > -Original Message- > From: Joerg Roedel > Sent: Wednesday, July 22, 2020 6:53 AM > To: Prakhya, Sai Praneeth > Cc: Raj, Ashok ; Will Deacon ; > iommu@lists.linux-foundation.org; Robin Murphy ; > Christoph Hellwig > Subject: Re: [PATCH V4 1/3] iommu: Add support to change default domain of > an iommu group > > On Tue, Jul 14, 2020 at 06:23:54PM +, Prakhya, Sai Praneeth wrote: > > Q1: > > > Presently, iommu_change_dev_def_domain() checks if the iommu group > > > still has only one device or not. Hence, checking if iommu group has > > > one device or not is done twice, once before taking device_lock() > > > and the other, after taking device_lock(). > > > > > > I agree that the code isn't checking if the iommu group still has > > > the _same_ device or not. > > > One way, I could think of doing it is by storing "dev" temporarily > > > and checking for it. > > > Do you think that's ok? Or would you rather suggest something else? > > That sounds reasonable, get the device from the group, lock it, take > group->mutex, and check whether the same device is still alone in the > group. Sounds good! I will implement this. Regards, Sai ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/qcom: Use domain rather than dev as tlb cookie
On Wed, Jul 22, 2020 at 07:54:40AM -0700, Rob Clark wrote: > On Wed, Jul 22, 2020 at 6:10 AM Joerg Roedel wrote: > > Is this needed for v5.8/stable? A fixes tag would be great too. > > looks like, yes: > > Fixes: 09b5dfff9ad6 ("iommu/qcom: Use accessor functions for iommu > private data") Thanks, applied to fixes branch. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3] iommu/mediatek: check 4GB mode by reading infracfg
On 22/07/2020 16:19, Miles Chen wrote: In previous discussion [1] and [2], we found that it is risky to use max_pfn or totalram_pages to tell if 4GB mode is enabled. Check 4GB mode by reading infracfg register, remove the usage of the un-exported symbol max_pfn. This is a step towards building mtk_iommu as a kernel module. --- That's wrong. The commit message would be cut after this '---' so we would loose the Cc and Signed-of-by tags. Change since v2: - determine compatible string by m4u_plat - rebase to next-20200720 - add "---" Change since v1: - remove the phandle usage, search for infracfg instead [3] - use infracfg instead of infracfg_regmap - move infracfg definitaions to linux/soc/mediatek/infracfg.h - update enable_4GB only when has_4gb_mode [1] https://lkml.org/lkml/2020/6/3/733 [2] https://lkml.org/lkml/2020/6/4/136 I think using links to lore.kernel.org would make sure that the URL does not change over time. As the commit log will stay there for ever, but who konws what happens with lkml.org [3] https://lkml.org/lkml/2020/7/15/1147 Cc: Mike Rapoport Cc: David Hildenbrand Cc: Yong Wu Cc: Yingjoe Chen Cc: Christoph Hellwig Cc: Rob Herring Cc: Matthias Brugger Signed-off-by: Miles Chen The formating should look like this: In previous discussion [1] and [2], we found that it is risky to use max_pfn or totalram_pages to tell if 4GB mode is enabled. Check 4GB mode by reading infracfg register, remove the usage of the un-exported symbol max_pfn. This is a step towards building mtk_iommu as a kernel module. [1] https://lkml.org/lkml/2020/6/3/733 [2] https://lkml.org/lkml/2020/6/4/136 Cc: Mike Rapoport Cc: David Hildenbrand Cc: Yong Wu Cc: Yingjoe Chen Cc: Christoph Hellwig Cc: Rob Herring Cc: Matthias Brugger Signed-off-by: Miles Chen --- Change since v2: - determine compatible string by m4u_plat - rebase to next-20200720 - add "---" Change since v1: - remove the phandle usage, search for infracfg instead https://lkml.org/lkml/2020/7/15/1147 - use infracfg instead of infracfg_regmap - move infracfg definitaions to linux/soc/mediatek/infracfg.h - update enable_4GB only when has_4gb_mode --- drivers/iommu/mtk_iommu.c | 34 +++ include/linux/soc/mediatek/infracfg.h | 3 +++ 2 files changed, 32 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 59e5a62a34db..9ec666168822 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -3,7 +3,6 @@ * Copyright (c) 2015-2016 MediaTek Inc. * Author: Yong Wu */ -#include #include #include #include @@ -15,13 +14,16 @@ #include #include #include +#include #include #include #include #include #include +#include #include #include +#include #include #include @@ -640,8 +642,11 @@ static int mtk_iommu_probe(struct platform_device *pdev) struct resource *res; resource_size_t ioaddr; struct component_match *match = NULL; + struct regmap *infracfg; void*protect; int i, larb_nr, ret; + u32 val; + char*p; data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); if (!data) @@ -655,10 +660,29 @@ static int mtk_iommu_probe(struct platform_device *pdev) return -ENOMEM; data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN); - /* Whether the current dram is over 4GB */ - data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT)); - if (!MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) - data->enable_4GB = false; + data->enable_4GB = false; + if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) { + switch (data->plat_data->m4u_plat) { + case M4U_MT2712: + p = "mediatek,mt2712-infracfg"; + break; + case M4U_MT8173: + p = "mediatek,mt8173-infracfg"; + break; + default: + p = NULL; + } + + infracfg = syscon_regmap_lookup_by_compatible(p); if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) { switch (data->plat_data->m4u_plat) { case M4U_MT2712: infracfg = syscon_regmap_lookup_by_compatible("mediatek,mt2712-infracfg"); break; case M4U_MT8173: infracfg = syscon_regmap_lookup_by_compatible("mediatek,mt8173-infracfg"); break; default: infracfg = -ENODEV; } + + if (IS_ERR(infracfg)) + return PTR_ERR(infracfg); + + ret = regmap_read(infracfg, REG_INFRA_MISC,
Re: [PATCH] iommu/qcom: Use domain rather than dev as tlb cookie
On Wed, Jul 22, 2020 at 6:10 AM Joerg Roedel wrote: > > On Tue, Jul 21, 2020 at 12:45:17AM +0530, Naresh Kamboju wrote: > > On Mon, 20 Jul 2020 at 21:21, Rob Clark wrote: > > > > > > From: Rob Clark > > > > > > The device may be torn down, but the domain should still be valid. Lets > > > use that as the tlb flush ops cookie. > > > > > > Fixes a problem reported in [1] > > > > This proposed fix patch applied on top of linux mainline master > > and boot test PASS on db410c. > > > > The reported problem got fixed. > > Is this needed for v5.8/stable? A fixes tag would be great too. looks like, yes: Fixes: 09b5dfff9ad6 ("iommu/qcom: Use accessor functions for iommu private data") BR, -R > > Regards, > > Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] dma-contiguous: cleanup dma_alloc_contiguous
Split out a cma_alloc_aligned helper to deal with the "interesting" calling conventions for cma_alloc, which then allows to the main function to be written straight forward. This also takes advantage of the fact that NULL dev arguments have been gone from the DMA API for a while. Signed-off-by: Christoph Hellwig --- kernel/dma/contiguous.c | 31 ++- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c index 15bc5026c485f2..f16b8d3f9932de 100644 --- a/kernel/dma/contiguous.c +++ b/kernel/dma/contiguous.c @@ -215,6 +215,13 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages, return cma_release(dev_get_cma_area(dev), pages, count); } +static struct page *cma_alloc_aligned(struct cma *cma, size_t size, gfp_t gfp) +{ + return cma_alloc(dma_contiguous_default_area, size >> PAGE_SHIFT, +min_t(size_t, get_order(size), CONFIG_CMA_ALIGNMENT), +gfp & __GFP_NOWARN); +} + /** * dma_alloc_contiguous() - allocate contiguous pages * @dev: Pointer to device for which the allocation is performed. @@ -231,24 +238,14 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages, */ struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp) { - size_t count = size >> PAGE_SHIFT; - struct page *page = NULL; - struct cma *cma = NULL; - - if (dev && dev->cma_area) - cma = dev->cma_area; - else if (count > 1) - cma = dma_contiguous_default_area; - /* CMA can be used only in the context which permits sleeping */ - if (cma && gfpflags_allow_blocking(gfp)) { - size_t align = get_order(size); - size_t cma_align = min_t(size_t, align, CONFIG_CMA_ALIGNMENT); - - page = cma_alloc(cma, count, cma_align, gfp & __GFP_NOWARN); - } - - return page; + if (!gfpflags_allow_blocking(gfp)) + return NULL; + if (dev->cma_area) + return cma_alloc_aligned(dev->cma_area, size, gfp); + if (size <= PAGE_SIZE || !dma_contiguous_default_area) + return NULL; + return cma_alloc_aligned(dma_contiguous_default_area, size, gfp); } /** -- 2.27.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA
On Sun, Jun 28, 2020 at 11:12:50PM +1200, Barry Song wrote: > struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp) > { > size_t count = size >> PAGE_SHIFT; > struct page *page = NULL; > struct cma *cma = NULL; > + int nid = dev ? dev_to_node(dev) : NUMA_NO_NODE; > + bool alloc_from_pernuma = false; > + > + if ((count <= 1) && !(dev && dev->cma_area)) > + return NULL; > > if (dev && dev->cma_area) > cma = dev->cma_area; > - else if (count > 1) > + else if ((nid != NUMA_NO_NODE) && dma_contiguous_pernuma_area[nid] > + && !(gfp & (GFP_DMA | GFP_DMA32))) { > + cma = dma_contiguous_pernuma_area[nid]; > + alloc_from_pernuma = true; > + } else { > cma = dma_contiguous_default_area; > + } I find the function rather confusing now. What about something like (this relies on the fact that dev should never be NULL in the DMA API) struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp) { size_t cma_align = min_t(size_t, get_order(size), CONFIG_CMA_ALIGNMENT); size_t count = size >> PAGE_SHIFT; if (gfpflags_allow_blocking(gfp)) return NULL; gfp &= __GFP_NOWARN; if (dev->cma_area) return cma_alloc(dev->cma_area, count, cma_align, gfp); if (count <= 1) return NULL; if (IS_ENABLED(CONFIG_PERNODE_CMA) && !(gfp & (GFP_DMA | GFP_DMA32)) { int nid = dev_to_node(dev); struct cma *cma = dma_contiguous_pernuma_area[nid]; struct page *page; if (cma) { page = cma_alloc(cma, count, cma_align, gfp); if (page) return page; } } return cma_alloc(dma_contiguous_default_area, count, cma_align, gfp); } > + /* > + * otherwise, page is from either per-numa cma or default cma > + */ > + int nid = page_to_nid(page); > + > + if (nid != NUMA_NO_NODE) { > + if (cma_release(dma_contiguous_pernuma_area[nid], page, > + PAGE_ALIGN(size) >> PAGE_SHIFT)) > + return; > + } > + > + if (cma_release(dma_contiguous_default_area, page, How can page_to_nid ever return NUMA_NO_NODE? ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3] iommu/mediatek: check 4GB mode by reading infracfg
In previous discussion [1] and [2], we found that it is risky to use max_pfn or totalram_pages to tell if 4GB mode is enabled. Check 4GB mode by reading infracfg register, remove the usage of the un-exported symbol max_pfn. This is a step towards building mtk_iommu as a kernel module. --- Change since v2: - determine compatible string by m4u_plat - rebase to next-20200720 - add "---" Change since v1: - remove the phandle usage, search for infracfg instead [3] - use infracfg instead of infracfg_regmap - move infracfg definitaions to linux/soc/mediatek/infracfg.h - update enable_4GB only when has_4gb_mode [1] https://lkml.org/lkml/2020/6/3/733 [2] https://lkml.org/lkml/2020/6/4/136 [3] https://lkml.org/lkml/2020/7/15/1147 Cc: Mike Rapoport Cc: David Hildenbrand Cc: Yong Wu Cc: Yingjoe Chen Cc: Christoph Hellwig Cc: Rob Herring Cc: Matthias Brugger Signed-off-by: Miles Chen --- drivers/iommu/mtk_iommu.c | 34 +++ include/linux/soc/mediatek/infracfg.h | 3 +++ 2 files changed, 32 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 59e5a62a34db..9ec666168822 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -3,7 +3,6 @@ * Copyright (c) 2015-2016 MediaTek Inc. * Author: Yong Wu */ -#include #include #include #include @@ -15,13 +14,16 @@ #include #include #include +#include #include #include #include #include #include +#include #include #include +#include #include #include @@ -640,8 +642,11 @@ static int mtk_iommu_probe(struct platform_device *pdev) struct resource *res; resource_size_t ioaddr; struct component_match *match = NULL; + struct regmap *infracfg; void*protect; int i, larb_nr, ret; + u32 val; + char*p; data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); if (!data) @@ -655,10 +660,29 @@ static int mtk_iommu_probe(struct platform_device *pdev) return -ENOMEM; data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN); - /* Whether the current dram is over 4GB */ - data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT)); - if (!MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) - data->enable_4GB = false; + data->enable_4GB = false; + if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) { + switch (data->plat_data->m4u_plat) { + case M4U_MT2712: + p = "mediatek,mt2712-infracfg"; + break; + case M4U_MT8173: + p = "mediatek,mt8173-infracfg"; + break; + default: + p = NULL; + } + + infracfg = syscon_regmap_lookup_by_compatible(p); + + if (IS_ERR(infracfg)) + return PTR_ERR(infracfg); + + ret = regmap_read(infracfg, REG_INFRA_MISC, ); + if (ret) + return ret; + data->enable_4GB = !!(val & F_DDR_4GB_SUPPORT_EN); + } res = platform_get_resource(pdev, IORESOURCE_MEM, 0); data->base = devm_ioremap_resource(dev, res); diff --git a/include/linux/soc/mediatek/infracfg.h b/include/linux/soc/mediatek/infracfg.h index fd25f0148566..233463d789c6 100644 --- a/include/linux/soc/mediatek/infracfg.h +++ b/include/linux/soc/mediatek/infracfg.h @@ -32,6 +32,9 @@ #define MT7622_TOP_AXI_PROT_EN_WB (BIT(2) | BIT(6) | \ BIT(7) | BIT(8)) +#define REG_INFRA_MISC 0xf00 +#define F_DDR_4GB_SUPPORT_EN BIT(13) + int mtk_infracfg_set_bus_protection(struct regmap *infracfg, u32 mask, bool reg_update); int mtk_infracfg_clear_bus_protection(struct regmap *infracfg, u32 mask, -- 2.18.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA
On Sun, Jun 28, 2020 at 11:12:50PM +1200, Barry Song wrote: > This is useful for at least two scenarios: > 1. ARM64 smmu will get memory from local numa node, it can save its > command queues and page tables locally. Tests show it can decrease > dma_unmap latency at lot. For example, without this patch, smmu on > node2 will get memory from node0 by calling dma_alloc_coherent(), > typically, it has to wait for more than 560ns for the completion of > CMD_SYNC in an empty command queue; with this patch, it needs 240ns > only. > 2. when we set iommu passthrough, drivers will get memory from CMA, > local memory means much less latency. I really don't like the config options. With the boot parameters you can always hardcode that in CONFIG_CMDLINE anyway. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH -next] iommu: Make some functions static
On Mon, Jul 13, 2020 at 10:25:42PM +0800, Wei Yongjun wrote: > The sparse tool complains as follows: > > drivers/iommu/iommu.c:386:5: warning: > symbol 'iommu_insert_resv_region' was not declared. Should it be static? > drivers/iommu/iommu.c:2182:5: warning: > symbol '__iommu_map' was not declared. Should it be static? > > Those functions are not used outside of iommu.c, so mark them static. > > Reported-by: Hulk Robot > Signed-off-by: Wei Yongjun > --- > drivers/iommu/iommu.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6 01/12] iommu: Change type of pasid to u32
On Mon, Jul 13, 2020 at 04:47:56PM -0700, Fenghua Yu wrote: > PASID is defined as a few different types in iommu including "int", > "u32", and "unsigned int". To be consistent and to match with uapi > definitions, define PASID and its variations (e.g. max PASID) as "u32". > "u32" is also shorter and a little more explicit than "unsigned int". > > No PASID type change in uapi although it defines PASID as __u64 in > some places. > > Suggested-by: Thomas Gleixner > Signed-off-by: Fenghua Yu > Reviewed-by: Tony Luck > Reviewed-by: Lu Baolu > Acked-by: Felix Kuehling For the IOMMU parts: Acked-by: Joerg Roedel ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 0/3] iommu/amd: I/O VA address limits
On Wed, Jul 22, 2020 at 12:34:57PM +, Sironi, Filippo wrote: > On Wed, 2020-07-22 at 14:19 +0200, j...@8bytes.org wrote: > I wouldn't be surprised if a PCIe device raises a PCIe SERR if it is > asked to do DMA beyond its abilities. Yeah, but that would also make it impossible to safely assign the device to any untrusted entity, like a guest of user-space driver. > I think that this discussion is orthogonal wrt the spirit of the > patches. They are actually adding a few bits to the AMD IOMMU driver > (and propagating them to the right places) to implement a portion of the > specification that wasn't implemented, i.e., relying on the IVRS table. > These patches are valuable independently on the content of the IVRS > table, be it 32, 48, or 64 bits. You are right from a technical point of view, and the patches are as well. The problem I see is that there are a lot of systems out there with an AMD IOMMU and possibly broken ACPI tables. And if the driver starts checking this field now it is possible that it breaks formerly working setups. So doing this needs a strong reason, like upcoming hardware that has lower limits in the supported address space size than before. The use-case you have described is not a strong enough reason to take the risk. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH V4 1/3] iommu: Add support to change default domain of an iommu group
On Tue, Jul 14, 2020 at 06:23:54PM +, Prakhya, Sai Praneeth wrote: > Q1: > > Presently, iommu_change_dev_def_domain() checks if the iommu group still has > > only one device or not. Hence, checking if iommu group has one device or > > not is > > done twice, once before taking device_lock() and the other, after taking > > device_lock(). > > > > I agree that the code isn't checking if the iommu group still has the _same_ > > device or not. > > One way, I could think of doing it is by storing "dev" temporarily and > > checking > > for it. > > Do you think that's ok? Or would you rather suggest something else? That sounds reasonable, get the device from the group, lock it, take group->mutex, and check whether the same device is still alone in the group. > Q2: > > The reason for taking iommu_group->mutex in the beginning of > > iommu_change_dev_def_domain() is that the function > > > > 1. Checks if the group is being directly used by user level drivers (i.e. > > if (group- > > >default_domain != group->domain)) > > > > 2. Uses iommu_ops > > (prev_dom = group->default_domain; > > if (!prev_dom || !prev_dom->ops || !prev_dom->ops->def_domain_type)) > > > > 3. Sets iomu_group->domain to iommu_group->default_domain > > > > I wanted to make sure that iommu_group->domain and iommu_group- > > >default_domain aren't changed by some other thread while this thread is > > working on it. So, please let me know if I misunderstood something. This looks correct as well. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [GIT PULL] iommu/arm-smmu: Updates for 5.9
Hi Will, On Tue, Jul 21, 2020 at 09:03:53AM +0100, Will Deacon wrote: > Please pull these Arm SMMU driver updates for 5.9. Summary is in the tag, > but the main thing is support for two new SoC integrations, one of which > is considerably more brain-dead than the other (determining which one is > left as an exercise to the reader although the diffstat is fairly revealing). :) > The following changes since commit 9ebcfadb0610322ac537dd7aa5d9cbc2b2894c68: > > Linux 5.8-rc3 (2020-06-28 15:00:24 -0700) > > are available in the Git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git > tags/arm-smmu-updates Pulled, thanks. Given the number of arm-smmu* files it probably makes sense to create a drivers/iommu/arm-smmu/ directory and move it all there. If you agree feel free to send this as an additional patch on-top of this pull-request. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/qcom: Use domain rather than dev as tlb cookie
On Tue, Jul 21, 2020 at 12:45:17AM +0530, Naresh Kamboju wrote: > On Mon, 20 Jul 2020 at 21:21, Rob Clark wrote: > > > > From: Rob Clark > > > > The device may be torn down, but the domain should still be valid. Lets > > use that as the tlb flush ops cookie. > > > > Fixes a problem reported in [1] > > This proposed fix patch applied on top of linux mainline master > and boot test PASS on db410c. > > The reported problem got fixed. Is this needed for v5.8/stable? A fixes tag would be great too. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] OMAP: iommu: check for failure of a call to omap_iommu_dump_ctx
On Tue, Jul 14, 2020 at 08:22:11PM +0100, Colin King wrote: > From: Colin Ian King > > It is possible for the call to omap_iommu_dump_ctx to return > a negative error number, so check for the failure and return > the error number rather than pass the negative value to > simple_read_from_buffer. > > Addresses-Coverity: ("Improper use of negative value") > Fixes: 14e0e6796a0d ("OMAP: iommu: add initial debugfs support") > Signed-off-by: Colin Ian King > --- > drivers/iommu/omap-iommu-debug.c | 3 +++ > 1 file changed, 3 insertions(+) Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/exynos: Rename update_pte()
On Tue, Jul 14, 2020 at 12:59:28PM +0100, Robin Murphy wrote: > The name "update_pte" is a little too generic, and can end up clashing > with architecture pagetable code leaked out of common mm headers. Rename > it to something more appropriately namespaced. > > Reported-by: kernel test robot > Acked-by: Marek Szyprowski > Signed-off-by: Robin Murphy > > --- > v2: fix typo - thanks Marek! > --- > drivers/iommu/exynos-iommu.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/2] iommu/ipmmu-vmsa: Add entry for R8A774E1 and r8a77961
On Tue, Jul 14, 2020 at 11:20:53AM +0100, Lad Prabhakar wrote: > Lad Prabhakar (1): > iommu/ipmmu-vmsa: Add an entry for r8a77961 in soc_rcar_gen3[] > > Marian-Cristian Rotariu (1): > iommu/ipmmu-vmsa: Hook up R8A774E1 DT matching code Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/2] iommu/dma: Avoid SAC address trick for PCIe devices
On Tue, Jul 14, 2020 at 12:42:36PM +0100, Robin Murphy wrote: > Oh bother - yes, this could have been masking all manner of bugs. That > system will presumably also break if you managed to exhaust the 32-bit IOVA > space such that the allocator moved up to the higher range anyway, or if you > passed the XHCI through to a VM with a sufficiently wacky GPA layout, but I > guess those are cases that simply nobody's run into yet. > > Does the firmware actually report any upper address constraint such that > Sebastian's IVRS aperture patches might help? No, it doesn't. I am not sure what the best way is to get these issues found and fixed. I doubt they will be getting fixed when the allocation pattern isn't changed, maybe we can put your changes behind a config variable and start testing/reporting bugs/etc. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 0/3] iommu/amd: I/O VA address limits
On Wed, 2020-07-22 at 14:19 +0200, j...@8bytes.org wrote: > > On Fri, Jul 17, 2020 at 03:15:43PM +, Sironi, Filippo wrote: > > I don't believe that we want to trust a userspace driver here, this > > may > > result in hosts becoming unstable because devices are asked to do > > things > > they aren't meant to do (e.g., DMA beyond 48 bits). > > How is the hosts stability affected when a device is programmed with > handles outside of its DMA mask? I wouldn't be surprised if a PCIe device raises a PCIe SERR if it is asked to do DMA beyond its abilities. > Anyway, someone needs to know how to use the device in the end, and > this > entity also needs to know the DMA mask of the device and pass it down > to > path to the dma-iommu code. > > Regards, > > Joerg I agree, device drivers should do the right thing and if we have generic device drivers they may need "quirks" based on VID:DID to do the right thing. Still, I believe that the VFIO case is special because the device driver is effectively in userspace I really think that trusting userspace isn't quite correct (and we can keep discussing on this front). I think that this discussion is orthogonal wrt the spirit of the patches. They are actually adding a few bits to the AMD IOMMU driver (and propagating them to the right places) to implement a portion of the specification that wasn't implemented, i.e., relying on the IVRS table. These patches are valuable independently on the content of the IVRS table, be it 32, 48, or 64 bits. Filippo Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 0/3] iommu/amd: I/O VA address limits
On Fri, Jul 17, 2020 at 03:15:43PM +, Sironi, Filippo wrote: > I don't believe that we want to trust a userspace driver here, this may > result in hosts becoming unstable because devices are asked to do things > they aren't meant to do (e.g., DMA beyond 48 bits). How is the hosts stability affected when a device is programmed with handles outside of its DMA mask? Anyway, someone needs to know how to use the device in the end, and this entity also needs to know the DMA mask of the device and pass it down to path to the dma-iommu code. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu
Hi [This is an automated email] This commit has been processed because it contains a "Fixes:" tag fixing commit: . The bot has tested the following trees: v5.7.9, v5.4.52, v4.19.133, v4.14.188, v4.9.230, v4.4.230. v5.7.9: Failed to apply! Possible dependencies: Unable to calculate v5.4.52: Failed to apply! Possible dependencies: Unable to calculate v4.19.133: Failed to apply! Possible dependencies: e5e04d051979d ("iommu/vt-d: Check whether device requires bounce buffer") v4.14.188: Failed to apply! Possible dependencies: 85319dcc8955f ("iommu/vt-d: Add for_each_device_domain() helper") 9ddbfb42138d8 ("iommu/vt-d: Move device_domain_info to header") e5e04d051979d ("iommu/vt-d: Check whether device requires bounce buffer") v4.9.230: Failed to apply! Possible dependencies: 161b28aae1651 ("iommu/vt-d: Make sure IOMMUs are off when intel_iommu=off") 61012985eb132 ("iommu/vt-d: Use lo_hi_readq() / lo_hi_writeq()") 85319dcc8955f ("iommu/vt-d: Add for_each_device_domain() helper") 9ddbfb42138d8 ("iommu/vt-d: Move device_domain_info to header") a7fdb6e648fb1 ("iommu/vt-d: Fix crash when accessing VT-d sysfs entries") b0119e870837d ("iommu: Introduce new 'struct iommu_device'") b316d02a13c3a ("iommu/vt-d: Unwrap __get_valid_domain_for_dev()") bfd20f1cc8501 ("x86, iommu/vt-d: Add an option to disable Intel IOMMU force on") e5e04d051979d ("iommu/vt-d: Check whether device requires bounce buffer") v4.4.230: Failed to apply! Possible dependencies: 0824c5920b16f ("iommu/vt-d: avoid dev iotlb logic for domains with no dev iotlbs") 161b28aae1651 ("iommu/vt-d: Make sure IOMMUs are off when intel_iommu=off") 314f1dc140844 ("iommu/vt-d: refactoring of deferred flush entries") 53c92d793395f ("iommu: of: enforce const-ness of struct iommu_ops") 57f98d2f61e19 ("iommu: Introduce iommu_fwspec") 592033790e827 ("iommu/vt-d: Check the return value of iommu_device_create()") 85319dcc8955f ("iommu/vt-d: Add for_each_device_domain() helper") 8d54d6c8b8f3e ("iommu/amd: Implement apply_dm_region call-back") 9ddbfb42138d8 ("iommu/vt-d: Move device_domain_info to header") a7fdb6e648fb1 ("iommu/vt-d: Fix crash when accessing VT-d sysfs entries") aa4732406e129 ("iommu/vt-d: per-cpu deferred invalidation queues") b0119e870837d ("iommu: Introduce new 'struct iommu_device'") b996444cf35e7 ("iommu/of: Handle iommu-map property for PCI") bc8474549e94e ("iommu/vt-d: Fix up error handling in alloc_iommu") bfd20f1cc8501 ("x86, iommu/vt-d: Add an option to disable Intel IOMMU force on") e5e04d051979d ("iommu/vt-d: Check whether device requires bounce buffer") NOTE: The patch will not be queued to stable trees until it is upstream. How should we proceed with this patch? -- Thanks Sasha ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Subject: Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu
Hi Lu, limonciello. Yestoday i just verified the issue with the patch. and just iommu Subscription today.This is my test log. [Hardware info] Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz 1.20GHz ICLSFWR1.R00.3162.A00.1904162000 BIOS Information BIOS Vendo Intel Core Version 1.5.2.0 RP01 Client Silicon Version 0.2.0.15 Project Version ICLSFWR1.R00.3162.A00.1904162000 Build Date 20:00 04/16/2019 Board Name IceLake U DDR4 SODIMM PD RVP TLC Processor Information Name IceLake UL [S3(mem) failed] $ echo deep > /sys/power/mem_sleep $ rtcwake -m mem -s 10 ACPI: EC: interrupt blocked e1000e :00:1f.6: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 14317 usecs ec PNP0C09:00: acpi_ec_suspend_noirq+0x0/0x50 returned 0 after 355319 usecs wdat_wdt wdat_wdt: calling wdat_wdt_suspend_noirq+0x0/0x66 [wdat_wdt] @ 347, parent: platform ahci :00:17.0: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 383843 usecs intel-lpss :00:1e.3: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 384062 usecs wdat_wdt wdat_wdt: wdat_wdt_suspend_noirq+0x0/0x66 [wdat_wdt] returned 0 after 11 usecs intel-lpss :00:1e.0: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 414466 usecs xhci_hcd :00:14.0: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 414023 usecs sdhci-pci :00:14.5: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 429325 usecs pcieport :00:07.3: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 429026 usecs pcieport :00:07.1: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 429675 usecs pcieport :00:07.2: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 430309 usecs pcieport :00:07.0: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 430213 usecs thunderbolt :00:0d.2: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 432523 usecs thunderbolt :00:0d.3: pci_pm_suspend_noirq+0x0/0x250 returned 0 after 432815 usecs ACPI: Preparing to enter system sleep state S3 ACPI: EC: event blocked ACPI: EC: EC stopped PM: Saving platform NVS memory Disabling non-boot CPUs ... smpboot: CPU 1 is now offline smpboot: CPU 2 is now offline smpboot: CPU 3 is now offline smpboot: CPU 4 is now offline smpboot: CPU 5 is now offline smpboot: CPU 6 is now offline smpboot: CPU 7 is now offline PM: Calling mce_syscore_suspend+0x0/0x20 PM: Calling nmi_suspend+0x0/0x20 PM: Calling timekeeping_suspend+0x0/0x2d0 PM: Calling save_ioapic_entries+0x0/0x90 PM: Calling i8259A_suspend+0x0/0x30 PM: Calling iommu_suspend+0x0/0x1b0 Kernel panic - not syncing: DMAR hardware is malfunctioning CPU: 0 PID: 347 Comm: rtcwake Not tainted 5.4.0-yocto-standard #124 Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3162.A00.1904162000 04/16/2019 Call Trace: dump_stack+0x59/0x75 panic+0xff/0x2d4 iommu_disable_translation+0x88/0x90 iommu_suspend+0x12f/0x1b0 syscore_suspend+0x6c/0x220 suspend_devices_and_enter+0x313/0x840 pm_suspend+0x30d/0x390 state_store+0x82/0xf0 kobj_attr_store+0x12/0x20 sysfs_kf_write+0x3c/0x50 kernfs_fop_write+0x11d/0x190 __vfs_write+0x1b/0x40 vfs_write+0xc6/0x1d0 ksys_write+0x5e/0xe0 __x64_sys_write+0x1a/0x20 do_syscall_64+0x4d/0x150 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f97b8080113 Code: 8b 15 81 bd 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18 RSP: 002b:7ffcfa6f48b8 EFLAGS: 0246 ORIG_RAX: 0001 RAX: ffda RBX: 0004 RCX: 7f97b8080113 RDX: 0004 RSI: 55e7db03b700 RDI: 0004 RBP: 55e7db03b700 R08: 55e7db03b700 R09: 0004 R10: 0004 R11: 0246 R12: 0004 R13: 55e7db039380 R14: 0004 R15: 7f97b814d700 Kernel Offset: 0x38a0 from 0x8100 (relocation range: 0x8000-0xbfff) ---[ end Kernel panic - not syncing: DMAR hardware is malfunctioning ]--- [S3 successfully with the patch] sh-5.0# uname -a Linux intel-x86-64 5.8.0-rc6-yoctodev-standard+ #128 SMP PREEMPT Tue Jul 21 12:14:39 CST 2020 x86_64 x86_64 x86_64 GNU/Linux sh-5.0# sh-5.0# lsmod |grep -i thunderbolt intel_wmi_thunderbolt16384 0 thunderbolt 167936 0 wmi24576 2 intel_wmi_thunderbolt,wmi_bmof sh-5.0# sh-5.0# sh-5.0# sh-5.0# modinfo thunderbolt filename: /lib/modules/5.8.0-rc6-yoctodev-standard+/kernel/drivers/thunderbolt/thunderbolt.ko license:GPL alias: pci:v*d*sv*sd*bc0Csc03i40* alias: pci:v8086d9A1Dsv*sd*bc*sc*i* alias: pci:v8086d9A1Bsv*sd*bc*sc*i* alias: pci:v8086d8A0Dsv*sd*bc*sc*i* alias: pci:v8086d8A17sv*sd*bc*sc*i* alias:
Re: [PATCH v2] iommu/mediatek: check 4GB mode by reading infracfg
On Wed, 2020-07-22 at 15:17 +0800, Miles Chen wrote: > On Tue, 2020-07-21 at 23:19 +0200, Matthias Brugger wrote: > > > > On 21/07/2020 13:24, Yong Wu wrote: > > > On Tue, 2020-07-21 at 11:40 +0200, Matthias Brugger wrote: > > >> > > >> On 21/07/2020 04:16, Miles Chen wrote: > > >>> In previous discussion [1] and [2], we found that it is risky to > > >>> use max_pfn or totalram_pages to tell if 4GB mode is enabled. > > >>> > > >>> Check 4GB mode by reading infracfg register, remove the usage > > >>> of the un-exported symbol max_pfn. > > >>> > > >>> This is a step towards building mtk_iommu as a kernel module. > > >>> > > >>> Change since v1: > > >>> 1. remove the phandle usage, search for infracfg instead [3] > > >>> 2. use infracfg instead of infracfg_regmap > > >>> 3. move infracfg definitaions to linux/soc/mediatek/infracfg.h > > >>> 4. update enable_4GB only when has_4gb_mode > > >>> > > >>> [1] > > >>> https://urldefense.com/v3/__https://lkml.org/lkml/2020/6/3/733__;!!CTRNKA9wMg0ARbw!w5YjY83YRL9_ijgXHwB1x2Dnb5BqiFUI8H5IAyAWWFMvUJKI9Qbj_zta2AaiFZejiQ$ > > >>> > > >>> [2] > > >>> https://urldefense.com/v3/__https://lkml.org/lkml/2020/6/4/136__;!!CTRNKA9wMg0ARbw!w5YjY83YRL9_ijgXHwB1x2Dnb5BqiFUI8H5IAyAWWFMvUJKI9Qbj_zta2Aa9U2yQyg$ > > >>> > > >>> [3] > > >>> https://urldefense.com/v3/__https://lkml.org/lkml/2020/7/15/1147__;!!CTRNKA9wMg0ARbw!w5YjY83YRL9_ijgXHwB1x2Dnb5BqiFUI8H5IAyAWWFMvUJKI9Qbj_zta2Aaxpk_Wjw$ > > >>> > > >>> > > >>> Cc: Mike Rapoport > > >>> Cc: David Hildenbrand > > >>> Cc: Yong Wu > > >>> Cc: Yingjoe Chen > > >>> Cc: Christoph Hellwig > > >>> Cc: Yong Wu > > >>> Cc: Chao Hao > > >>> Cc: Rob Herring > > >>> Cc: Matthias Brugger > > >>> Signed-off-by: Miles Chen > > >>> --- > > >>>drivers/iommu/mtk_iommu.c | 26 +- > > >>>include/linux/soc/mediatek/infracfg.h | 3 +++ > > >>>2 files changed, 24 insertions(+), 5 deletions(-) > > >>> > > >>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > >>> index 2be96f1cdbd2..16765f532853 100644 > > >>> --- a/drivers/iommu/mtk_iommu.c > > >>> +++ b/drivers/iommu/mtk_iommu.c > > >>> @@ -3,7 +3,6 @@ > > >>> * Copyright (c) 2015-2016 MediaTek Inc. > > >>> * Author: Yong Wu > > >>> */ > > >>> -#include > > >>>#include > > >>>#include > > >>>#include > > >>> @@ -15,13 +14,16 @@ > > >>>#include > > >>>#include > > >>>#include > > >>> +#include > > >>>#include > > >>>#include > > >>>#include > > >>>#include > > >>>#include > > >>> +#include > > >>>#include > > >>>#include > > >>> +#include > > >>>#include > > >>>#include > > >>> > > >>> @@ -599,8 +601,10 @@ static int mtk_iommu_probe(struct platform_device > > >>> *pdev) > > >>> struct resource *res; > > >>> resource_size_t ioaddr; > > >>> struct component_match *match = NULL; > > >>> + struct regmap *infracfg; > > >>> void*protect; > > >>> int i, larb_nr, ret; > > >>> + u32 val; > > >>> > > >>> data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); > > >>> if (!data) > > >>> @@ -614,10 +618,22 @@ static int mtk_iommu_probe(struct platform_device > > >>> *pdev) > > >>> return -ENOMEM; > > >>> data->protect_base = ALIGN(virt_to_phys(protect), > > >>> MTK_PROTECT_PA_ALIGN); > > >>> > > >>> - /* Whether the current dram is over 4GB */ > > >>> - data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT)); > > >>> - if (!data->plat_data->has_4gb_mode) > > >>> - data->enable_4GB = false; > > >>> + data->enable_4GB = false; > > >>> + if (data->plat_data->has_4gb_mode) { > > >>> + infracfg = syscon_regmap_lookup_by_compatible( > > >>> + "mediatek,mt8173-infracfg"); > > >>> + if (IS_ERR(infracfg)) { > > >>> + infracfg = syscon_regmap_lookup_by_compatible( > > >>> + "mediatek,mt2712-infracfg"); > > >>> + if (IS_ERR(infracfg)) > > >>> + return PTR_ERR(infracfg); > > >> > > >> I think we should check m4u_plat instead to decide which compatible we > > >> have to > > >> look for. > > >> Another option would be to add a general compatible something like > > >> "mtk-infracfg" and search for that. That would need an update of all DTS > > >> having > > >> a infracfg compatible right now. After thinking twice, this would break > > >> newer > > >> kernel with older device tree, so maybe it's better to go with m4u_plat > > >> switch > > >> statement. > > > > > > Add a "char *infracfg" in the plat_data, Use the mt2712, mt8173 > > > corresponding string in it. If it is NULL, It means the "enable_4GB" > > > always is false. Then we also can remove the
Re: Subject: Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu
On 7/22/20 11:07 AM, Lu Baolu wrote: On 7/22/20 11:03 AM, Jun Miao wrote: On 7/22/20 10:40 AM, Lu Baolu wrote: Hi Jun, On 7/22/20 10:26 AM, Miao, Jun wrote: Kernel panic - not syncing: DMAR hardware is malfunctioning CPU: 0 PID: 347 Comm: rtcwake Not tainted 5.4.0-yocto-standard #124 Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3162.A00.1904162000 04/16/2019 Call Trace: dump_stack+0x59/0x75 panic+0xff/0x2d4 iommu_disable_translation+0x88/0x90 iommu_suspend+0x12f/0x1b0 syscore_suspend+0x6c/0x220 suspend_devices_and_enter+0x313/0x840 pm_suspend+0x30d/0x390 state_store+0x82/0xf0 kobj_attr_store+0x12/0x20 sysfs_kf_write+0x3c/0x50 kernfs_fop_write+0x11d/0x190 __vfs_write+0x1b/0x40 vfs_write+0xc6/0x1d0 ksys_write+0x5e/0xe0 __x64_sys_write+0x1a/0x20 do_syscall_64+0x4d/0x150 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f97b8080113 Code: 8b 15 81 bd 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18 RSP: 002b:7ffcfa6f48b8 EFLAGS: 0246 ORIG_RAX: 0001 RAX: ffda RBX: 0004 RCX: 7f97b8080113 RDX: 0004 RSI: 55e7db03b700 RDI: 0004 RBP: 55e7db03b700 R08: 55e7db03b700 R09: 0004 R10: 0004 R11: 0246 R12: 0004 R13: 55e7db039380 R14: 0004 R15: 7f97b814d700 Kernel Offset: 0x38a0 from 0x8100 (relocation range: 0x8000-0xbfff) ---[ end Kernel panic - not syncing: DMAR hardware is malfunctioning ]--- Do you mean that system hangs in iommu_disable_translation() without this fix. Yes ,From the call trace and i also read the DMARD_GCMD_RGS is wrong without this patch. Okay! Thanks a lot for confirming this. Best regards, baolu [S3 successfully with the patch] And, this failure disappeared after you applied this fix? YES , the log is too long , only head and tail . this failure disappereared. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Subject: Re: [PATCH 1/1] iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu
On 7/22/20 10:40 AM, Lu Baolu wrote: Hi Jun, On 7/22/20 10:26 AM, Miao, Jun wrote: Kernel panic - not syncing: DMAR hardware is malfunctioning CPU: 0 PID: 347 Comm: rtcwake Not tainted 5.4.0-yocto-standard #124 Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3162.A00.1904162000 04/16/2019 Call Trace: dump_stack+0x59/0x75 panic+0xff/0x2d4 iommu_disable_translation+0x88/0x90 iommu_suspend+0x12f/0x1b0 syscore_suspend+0x6c/0x220 suspend_devices_and_enter+0x313/0x840 pm_suspend+0x30d/0x390 state_store+0x82/0xf0 kobj_attr_store+0x12/0x20 sysfs_kf_write+0x3c/0x50 kernfs_fop_write+0x11d/0x190 __vfs_write+0x1b/0x40 vfs_write+0xc6/0x1d0 ksys_write+0x5e/0xe0 __x64_sys_write+0x1a/0x20 do_syscall_64+0x4d/0x150 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f97b8080113 Code: 8b 15 81 bd 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18 RSP: 002b:7ffcfa6f48b8 EFLAGS: 0246 ORIG_RAX: 0001 RAX: ffda RBX: 0004 RCX: 7f97b8080113 RDX: 0004 RSI: 55e7db03b700 RDI: 0004 RBP: 55e7db03b700 R08: 55e7db03b700 R09: 0004 R10: 0004 R11: 0246 R12: 0004 R13: 55e7db039380 R14: 0004 R15: 7f97b814d700 Kernel Offset: 0x38a0 from 0x8100 (relocation range: 0x8000-0xbfff) ---[ end Kernel panic - not syncing: DMAR hardware is malfunctioning ]--- Do you mean that system hangs in iommu_disable_translation() without this fix. Yes ,From the call trace and i also read the DMARD_GCMD_RGS is wrong without this patch. [S3 successfully with the patch] And, this failure disappeared after you applied this fix? Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/mediatek: check 4GB mode by reading infracfg
On Tue, 2020-07-21 at 23:19 +0200, Matthias Brugger wrote: > > On 21/07/2020 13:24, Yong Wu wrote: > > On Tue, 2020-07-21 at 11:40 +0200, Matthias Brugger wrote: > >> > >> On 21/07/2020 04:16, Miles Chen wrote: > >>> In previous discussion [1] and [2], we found that it is risky to > >>> use max_pfn or totalram_pages to tell if 4GB mode is enabled. > >>> > >>> Check 4GB mode by reading infracfg register, remove the usage > >>> of the un-exported symbol max_pfn. > >>> > >>> This is a step towards building mtk_iommu as a kernel module. > >>> > >>> Change since v1: > >>> 1. remove the phandle usage, search for infracfg instead [3] > >>> 2. use infracfg instead of infracfg_regmap > >>> 3. move infracfg definitaions to linux/soc/mediatek/infracfg.h > >>> 4. update enable_4GB only when has_4gb_mode > >>> > >>> [1] > >>> https://urldefense.com/v3/__https://lkml.org/lkml/2020/6/3/733__;!!CTRNKA9wMg0ARbw!w5YjY83YRL9_ijgXHwB1x2Dnb5BqiFUI8H5IAyAWWFMvUJKI9Qbj_zta2AaiFZejiQ$ > >>> > >>> [2] > >>> https://urldefense.com/v3/__https://lkml.org/lkml/2020/6/4/136__;!!CTRNKA9wMg0ARbw!w5YjY83YRL9_ijgXHwB1x2Dnb5BqiFUI8H5IAyAWWFMvUJKI9Qbj_zta2Aa9U2yQyg$ > >>> > >>> [3] > >>> https://urldefense.com/v3/__https://lkml.org/lkml/2020/7/15/1147__;!!CTRNKA9wMg0ARbw!w5YjY83YRL9_ijgXHwB1x2Dnb5BqiFUI8H5IAyAWWFMvUJKI9Qbj_zta2Aaxpk_Wjw$ > >>> > >>> > >>> Cc: Mike Rapoport > >>> Cc: David Hildenbrand > >>> Cc: Yong Wu > >>> Cc: Yingjoe Chen > >>> Cc: Christoph Hellwig > >>> Cc: Yong Wu > >>> Cc: Chao Hao > >>> Cc: Rob Herring > >>> Cc: Matthias Brugger > >>> Signed-off-by: Miles Chen > >>> --- > >>>drivers/iommu/mtk_iommu.c | 26 +- > >>>include/linux/soc/mediatek/infracfg.h | 3 +++ > >>>2 files changed, 24 insertions(+), 5 deletions(-) > >>> > >>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > >>> index 2be96f1cdbd2..16765f532853 100644 > >>> --- a/drivers/iommu/mtk_iommu.c > >>> +++ b/drivers/iommu/mtk_iommu.c > >>> @@ -3,7 +3,6 @@ > >>> * Copyright (c) 2015-2016 MediaTek Inc. > >>> * Author: Yong Wu > >>> */ > >>> -#include > >>>#include > >>>#include > >>>#include > >>> @@ -15,13 +14,16 @@ > >>>#include > >>>#include > >>>#include > >>> +#include > >>>#include > >>>#include > >>>#include > >>>#include > >>>#include > >>> +#include > >>>#include > >>>#include > >>> +#include > >>>#include > >>>#include > >>> > >>> @@ -599,8 +601,10 @@ static int mtk_iommu_probe(struct platform_device > >>> *pdev) > >>> struct resource *res; > >>> resource_size_t ioaddr; > >>> struct component_match *match = NULL; > >>> + struct regmap *infracfg; > >>> void*protect; > >>> int i, larb_nr, ret; > >>> + u32 val; > >>> > >>> data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); > >>> if (!data) > >>> @@ -614,10 +618,22 @@ static int mtk_iommu_probe(struct platform_device > >>> *pdev) > >>> return -ENOMEM; > >>> data->protect_base = ALIGN(virt_to_phys(protect), > >>> MTK_PROTECT_PA_ALIGN); > >>> > >>> - /* Whether the current dram is over 4GB */ > >>> - data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT)); > >>> - if (!data->plat_data->has_4gb_mode) > >>> - data->enable_4GB = false; > >>> + data->enable_4GB = false; > >>> + if (data->plat_data->has_4gb_mode) { > >>> + infracfg = syscon_regmap_lookup_by_compatible( > >>> + "mediatek,mt8173-infracfg"); > >>> + if (IS_ERR(infracfg)) { > >>> + infracfg = syscon_regmap_lookup_by_compatible( > >>> + "mediatek,mt2712-infracfg"); > >>> + if (IS_ERR(infracfg)) > >>> + return PTR_ERR(infracfg); > >> > >> I think we should check m4u_plat instead to decide which compatible we > >> have to > >> look for. > >> Another option would be to add a general compatible something like > >> "mtk-infracfg" and search for that. That would need an update of all DTS > >> having > >> a infracfg compatible right now. After thinking twice, this would break > >> newer > >> kernel with older device tree, so maybe it's better to go with m4u_plat > >> switch > >> statement. > > > > Add a "char *infracfg" in the plat_data, Use the mt2712, mt8173 > > corresponding string in it. If it is NULL, It means the "enable_4GB" > > always is false. Then we also can remove the flag "has_4gb_mode". > > > > is this OK? > > > > It's an option, but I personally find that a bit hacky. Thanks Yong and Matthias for your comment. I will try adding a char *infracfg in patch v3. > > Regards, > Matthias ___ iommu mailing list iommu@lists.linux-foundation.org
Re: [PATCH v2] iommu/mediatek: check 4GB mode by reading infracfg
On Tue, 2020-07-21 at 11:10 +0200, David Hildenbrand wrote: > On 21.07.20 04:16, Miles Chen wrote: > > In previous discussion [1] and [2], we found that it is risky to > > use max_pfn or totalram_pages to tell if 4GB mode is enabled. > > > > Check 4GB mode by reading infracfg register, remove the usage > > of the un-exported symbol max_pfn. > > > > This is a step towards building mtk_iommu as a kernel module. > > > > Change since v1: > > 1. remove the phandle usage, search for infracfg instead [3] > > 2. use infracfg instead of infracfg_regmap > > 3. move infracfg definitaions to linux/soc/mediatek/infracfg.h > > 4. update enable_4GB only when has_4gb_mode > > Nit: We tend to place such information below the "---" (adding a second > one) such that this information is discarded when the patch is picked up. Thanks, I'll add a "---" before the "Change since..." in patch v3. > > > > > [1] > > https://urldefense.com/v3/__https://lkml.org/lkml/2020/6/3/733__;!!CTRNKA9wMg0ARbw!wrP2H0azuZkTHSnM_ETki4kT_FMF0Cl7abQX8tIHX0iTr65JkvMCe4jDt84d_YhBrA$ > > > > [2] > > https://urldefense.com/v3/__https://lkml.org/lkml/2020/6/4/136__;!!CTRNKA9wMg0ARbw!wrP2H0azuZkTHSnM_ETki4kT_FMF0Cl7abQX8tIHX0iTr65JkvMCe4jDt86IGhmouQ$ > > > > [3] > > https://urldefense.com/v3/__https://lkml.org/lkml/2020/7/15/1147__;!!CTRNKA9wMg0ARbw!wrP2H0azuZkTHSnM_ETki4kT_FMF0Cl7abQX8tIHX0iTr65JkvMCe4jDt855z4xdqw$ > > > > > > Cc: Mike Rapoport > > Cc: David Hildenbrand > > Cc: Yong Wu > > Cc: Yingjoe Chen > > Cc: Christoph Hellwig > > Cc: Yong Wu > > Cc: Chao Hao > > Cc: Rob Herring > > Cc: Matthias Brugger > > Signed-off-by: Miles Chen > > --- > > drivers/iommu/mtk_iommu.c | 26 +- > > include/linux/soc/mediatek/infracfg.h | 3 +++ > > 2 files changed, 24 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > index 2be96f1cdbd2..16765f532853 100644 > > --- a/drivers/iommu/mtk_iommu.c > > +++ b/drivers/iommu/mtk_iommu.c > > @@ -3,7 +3,6 @@ > > * Copyright (c) 2015-2016 MediaTek Inc. > > * Author: Yong Wu > > */ > > -#include > > #include > > #include > > #include > > @@ -15,13 +14,16 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > #include > > #include > > #include > > +#include > > #include > > #include > > +#include > > #include > > #include > > > > @@ -599,8 +601,10 @@ static int mtk_iommu_probe(struct platform_device > > *pdev) > > struct resource *res; > > resource_size_t ioaddr; > > struct component_match *match = NULL; > > + struct regmap *infracfg; > > void*protect; > > int i, larb_nr, ret; > > + u32 val; > > > > data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL); > > if (!data) > > @@ -614,10 +618,22 @@ static int mtk_iommu_probe(struct platform_device > > *pdev) > > return -ENOMEM; > > data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN); > > > > - /* Whether the current dram is over 4GB */ > > - data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT)); > > - if (!data->plat_data->has_4gb_mode) > > - data->enable_4GB = false; > > + data->enable_4GB = false; > > + if (data->plat_data->has_4gb_mode) { > > + infracfg = syscon_regmap_lookup_by_compatible( > > + "mediatek,mt8173-infracfg"); > > + if (IS_ERR(infracfg)) { > > + infracfg = syscon_regmap_lookup_by_compatible( > > + "mediatek,mt2712-infracfg"); > > + if (IS_ERR(infracfg)) > > + return PTR_ERR(infracfg); > > + > > + } > > + ret = regmap_read(infracfg, REG_INFRA_MISC, ); > > + if (ret) > > + return ret; > > + data->enable_4GB = !!(val & F_DDR_4GB_SUPPORT_EN); > > (I am absolutely not familiar with syscon_regmap_lookup_by ..., I am > missing some context, so sorry for the stupid questions) > > Who sets the regmap value and based on what? Who decides that 4GB mode > is supported or not? And who decides if 4GB mode is *required* or not? > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu