Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
On Thu, Jan 09, 2020 at 07:28:41AM +0800, Lu Baolu wrote:
> This patch has been replaced with this one.
>
> https://lkml.org/lkml/2020/1/5/103

That still mentions a "nvme host device", which despite the different spelling still doesn't make any sense.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 0/2] iommu/amd: Optimization and cleanup for unused variables
This series optimizes the register reading by using readq instead of readl and cleans up the unused variables.

Adrian Huang (2):
  iommu/amd: Replace two consecutive readl calls with one readq
  iommu/amd: Remove unused struct member

 drivers/iommu/amd_iommu_init.c  | 6 +-
 drivers/iommu/amd_iommu_types.h | 1 -
 2 files changed, 1 insertion(+), 6 deletions(-)

-- 
2.17.1
[PATCH 1/2] iommu/amd: Replace two consecutive readl calls with one readq
From: Adrian Huang

Optimize the register reading by using readq instead of the two consecutive readl calls.

Signed-off-by: Adrian Huang
---
 drivers/iommu/amd_iommu_init.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index c5167fe0bd5f..cfdc4b60ccbe 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1718,7 +1718,6 @@ static const struct attribute_group *amd_iommu_groups[] = {
 static int __init iommu_init_pci(struct amd_iommu *iommu)
 {
 	int cap_ptr = iommu->cap_ptr;
-	u32 low, high;
 	int ret;
 
 	iommu->dev = pci_get_domain_bus_and_slot(0, PCI_BUS_NUM(iommu->devid),
@@ -1736,10 +1735,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
 		amd_iommu_iotlb_sup = false;
 
 	/* read extended feature bits */
-	low = readl(iommu->mmio_base + MMIO_EXT_FEATURES);
-	high = readl(iommu->mmio_base + MMIO_EXT_FEATURES + 4);
-
-	iommu->features = ((u64)high << 32) | low;
+	iommu->features = readq(iommu->mmio_base + MMIO_EXT_FEATURES);
 
 	if (iommu_feature(iommu, FEATURE_GT)) {
 		int glxval;
-- 
2.17.1
[PATCH 2/2] iommu/amd: Remove unused struct member
From: Adrian Huang

Commit c805b428f206 ("iommu/amd: Remove amd_iommu_pd_list") removes the global list for the allocated protection domains. The corresponding member 'list' of the protection_domain struct is not used anymore, so it can be removed.

Signed-off-by: Adrian Huang
---
 drivers/iommu/amd_iommu_types.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 500f0b78879d..f8d01d6b00da 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -463,7 +463,6 @@ struct amd_irte_ops;
  * independent of their use.
  */
 struct protection_domain {
-	struct list_head list;	/* for list of all protection domains */
 	struct list_head dev_list;	/* List of all devices in this domain */
 	struct iommu_domain domain;	/* generic domain handle used by iommu core code */
-- 
2.17.1
[PATCH RFC] ARM: dts: qcom: msm8974: add mdp5 iommu support
This adds preliminary IOMMU support for the MDP5 on msm8974. It appears that the qcom-iommu driver in upstream can be used on this SoC. I marked this patch as an RFC since the frame buffer becomes corrupted when I boot the Nexus 5 phone with this patch:

https://raw.githubusercontent.com/masneyb/nexus-5-upstream/master/images/broken-mdp5-iommu.jpg

A quick note about the ranges of the context banks below: based on the downstream sources, I believe that the memory addresses should be mapped out like this:

	mdp_iommu: iommu@fd928000 {
		reg = <0xfd928000 0x8000>;
		ranges = <0 0xfd93 0x8000>;
		...
		iommu-ctx@0 {
			reg = <0x0 0x1000>;
			...
		};
		iommu-ctx@1000 {
			reg = <0x1000 0x1000>;
			...
		};
		iommu-ctx@2000 {
			reg = <0x2000 0x1000>;
			...
		};
	};

However, the qcom-iommu driver in upstream expects the first context bank to exist at address 0x1000, and the address space identifier (asid) to be greater than 0. See get_asid() and qcom_iommu_of_xlate() in the upstream qcom-iommu.c driver. I put in the patch below what the driver expects. I modified the driver in my local tree to allow the mapping that I have above so that the extra 0x1000 of memory is mapped into the global address space, and I still experience the same screen corruption issue.

Downstream MSM 3.4 IOMMU dts snippet for reference:
https://github.com/AICP/kernel_lge_hammerhead/blob/n7.1/arch/arm/boot/dts/msm-iommu-v1.dtsi#L110

I'm hoping that someone who's more familiar with this hardware has a suggestion for something to try.
Signed-off-by: Brian Masney
---
 arch/arm/boot/dts/qcom-msm8974.dtsi | 44 +
 1 file changed, 44 insertions(+)

diff --git a/arch/arm/boot/dts/qcom-msm8974.dtsi b/arch/arm/boot/dts/qcom-msm8974.dtsi
index 4b161b809dd5..2515a3bd4aa7 100644
--- a/arch/arm/boot/dts/qcom-msm8974.dtsi
+++ b/arch/arm/boot/dts/qcom-msm8974.dtsi
@@ -1305,6 +1305,46 @@ etm3_out: endpoint {
 		};
 	};
 
+	mdp_iommu: iommu@fd928000 {
+		compatible = "qcom,msm8974-iommu",
+			     "qcom,msm-iommu-v1";
+		reg = <0xfd928000 0x7000>;
+		#address-cells = <1>;
+		#size-cells = <1>;
+		#iommu-cells = <1>;
+		ranges = <0 0xfd92f000 0x9000>;
+
+		clocks = < MDSS_AHB_CLK>,
+			 < MDSS_AXI_CLK>;
+		clock-names = "iface",
+			      "bus";
+
+		qcom,iommu-secure-id = <1>;
+
+		// mdp_0
+		iommu-ctx@1000 {
+			compatible = "qcom,msm-iommu-v1-ns";
+			reg = <0x1000 0x1000>;
+			interrupts = ;
+		};
+
+		// mdp_1
+		iommu-ctx@2000 {
+			compatible = "qcom,msm-iommu-v1-sec";
+			reg = <0x2000 0x1000>;
+			interrupts = ,
+				     ;
+		};
+
+		// mdp_2
+		iommu-ctx@3000 {
+			compatible = "qcom,msm-iommu-v1-sec";
+			reg = <0x3000 0x1000>;
+			interrupts = ,
+				     ;
+		};
+	};
+
 	ocmem@fdd0 {
 		compatible = "qcom,msm8974-ocmem";
 		reg = <0xfdd0 0x2000>,
@@ -1427,6 +1467,10 @@ mdp: mdp@fd90 {
 		interconnects = < MNOC_MAS_MDP_PORT0 BIMC_SLV_EBI_CH0>;
 		interconnect-names = "mdp0-mem";
 
+		iommus = <_iommu 1
+			  _iommu 2
+			  _iommu 3>;
+
 		ports {
 			#address-cells = <1>;
 			#size-cells = <0>;
-- 
2.24.1
Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
> Are you willing to add your reviewed-by for Jim's v2 patch? I will
> queue it for v5.6 if you both agree.

Sure:

Reviewed-by: Roland Dreier
Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
Hi Christoph, On 1/8/20 10:16 PM, Christoph Hellwig wrote: +/* + * We expect devices with endpoint scope to have normal PCI + * headers, and devices with bridge scope to have bridge PCI + * headers. However some PCI devices may be listed in the + * DMAR table with bridge scope, even though they have a + * normal PCI header. We don't declare a socpe mismatch for + * below special cases. + */ Please use up all 80 lines for comments. +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2f0d, /* NTB devices */ +quirk_dmar_scope_mismatch); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2020, /* NVME host */ +quirk_dmar_scope_mismatch); As said before "NVME host" host. Besides the wrong spelling of NVMe, the NVMe host is the Linux kernel, so describing a device as such seems rather bogus. This patch has been replaced with this one. https://lkml.org/lkml/2020/1/5/103 Best regards, baolu
Re: [PATCH v2 2/2] iommu/vt-d: skip invalid RMRR entries
On 1/7/20 8:27 PM, Lu Baolu wrote: diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index a8bb458845bc..32c3c6338a3d 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -4315,13 +4315,25 @@ static void __init init_iommu_pm_ops(void) static inline void init_iommu_pm_ops(void) {} #endif /* CONFIG_PM */ +static int rmrr_validity_check(struct acpi_dmar_reserved_memory *rmrr) +{ + if ((rmrr->base_address & PAGE_MASK) || + (rmrr->end_address <= rmrr->base_address) || + ((rmrr->end_address - rmrr->base_address + 1) & PAGE_MASK)) { + pr_err(FW_BUG "Broken RMRR base: %#018Lx end: %#018Lx\n", + rmrr->base_address, rmrr->end_address); Since you will WARN_TAINT below, do you still want an error message here? I'm fine either way. I put it in since arch_rmrr_sanity_check() also has a pr_err(): pr_err(FW_BUG "No firmware reserved region can cover this RMRR [%#018Lx-%#018Lx], contact BIOS vendor for fixes\n", start, end - 1); Thanks, Barret
Re: [PATCH 1/2] dt-bindings: iommu: Add binding for MediaTek MT8167 IOMMU
On Fri, 3 Jan 2020 17:26:31 +0100, Fabien Parent wrote: > This commit adds IOMMU binding documentation for the MT8167 SoC. > > Signed-off-by: Fabien Parent > --- > Documentation/devicetree/bindings/iommu/mediatek,iommu.txt | 1 + > 1 file changed, 1 insertion(+) > Acked-by: Rob Herring
Re: [PATCH v10 0/4] Add uacce module for Accelerator
On 12/15/19 8:08 PM, Zhangfei Gao wrote: Uacce (Unified/User-space-access-intended Accelerator Framework) aims to provide Shared Virtual Addressing (SVA) between accelerators and processes. So an accelerator can access any data structure of the main CPU. This differs from the data sharing between CPU and IO device, which share data content rather than address. Because of the unified address, hardware and user space of a process can share the same virtual address in the communication. Uacce is intended to be used with Jean Philippe Brucker's SVA patchset[1], which enables IO side page fault and PASID support. We have kept verifying with Jean's sva patchset [2] We also keep verifying with Eric's SMMUv3 Nested Stage patches [3] Looking forward to this common framework going upstream. I'm currently in the process of upstreaming the device driver (idxd) [1] for the recently announced Intel Data Streaming Accelerator [2] [3] that also supports SVA. [1] https://lkml.org/lkml/2020/1/7/905 [2] https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator [3] https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification And I think with this framework upstream I can potentially drop the in driver exported char device support code and use this framework directly. This series and related zip & qm driver https://github.com/Linaro/linux-kernel-warpdrive/tree/v5.5-rc1-uacce-v10 The library and user application: https://github.com/Linaro/warpdrive/tree/wdprd-upstream-v10 References: [1] http://jpbrucker.net/sva/ [2] http://jpbrucker.net/git/linux/log/?h=sva/zip-devel [3] https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9 Change History: v10: Modify the include header to fix kbuild test error in other arch. v9: Suggested by Jonathan 1. Remove sysfs: numa_distance, node_id, id, also add is_visible callback 2. 
Split the api to solve the potential race struct uacce_device *uacce_alloc(struct device *parent, struct uacce_interface *interface) int uacce_register(struct uacce_device *uacce) void uacce_remove(struct uacce_device *uacce) 3. Split clean up patch 03 v8: Address some comments from Jonathan Merge Jean's patch, using uacce_mm instead of pid for sva_exit v7: As suggested by Jean and Jerome Only consider sva case and remove unused dma apis for the first patch. Also add mm_exit for sva and vm_ops.close etc v6: https://lkml.org/lkml/2019/10/16/231 Change sys qfrs_size to different file, suggested by Jonathan Fix crypto daily build issue and based on crypto code base, also 5.4-rc1. v5: https://lkml.org/lkml/2019/10/14/74 Add an example patch using the uacce interface, suggested by Greg 0003-crypto-hisilicon-register-zip-engine-to-uacce.patch v4: https://lkml.org/lkml/2019/9/17/116 Based on 5.4-rc1 Considering other driver integrating uacce, if uacce not compiled, uacce_register return error and uacce_unregister is empty. Simplify uacce flag: UACCE_DEV_SVA. Address Greg's comments: Fix state machine, remove potential syslog triggered from user space etc. v3: https://lkml.org/lkml/2019/9/2/990 Recommended by Greg, use struct uacce_device instead of struct uacce, and use struct *cdev in struct uacce_device, as a result, cdev can be released by itself when refcount decreased to 0. So the two structures are decoupled and self-maintained by themselves. Also add dev.release for put_device. v2: https://lkml.org/lkml/2019/8/28/565 Address comments from Greg and Jonathan Modify interface uacce_register Drop noiommu mode first v1: https://lkml.org/lkml/2019/8/14/277 1. Rebase to 5.3-rc1 2. Build on iommu interface 3. Verifying with Jean's sva and Eric's nested mode iommu. 4. User library has developed a lot: support zlib, openssl etc. 5. 
Move to misc first RFC3: https://lkml.org/lkml/2018/11/12/1951 RFC2: https://lwn.net/Articles/763990/ Background of why Uacce: A Von Neumann processor is not good at general data manipulation. It is designed for control-bound rather than data-bound applications. The latter needs less control-path facility and more/specific ALUs. So there are more and more heterogeneous processors, such as encryption/decryption accelerators, TPUs, or EDGE (Explicit Data Graph Execution) processors, introduced to gain better performance or power efficiency for particular applications these days. There are generally two ways to make use of these heterogeneous processors: The first is to make them co-processors, just like the FPU. This is good for some applications but it has its own cons: It changes the ISA set permanently. You must save all state elements when the process is switched out. But most data-bound processors have a huge set of state elements. It makes the kernel scheduler more complex. The second is Accelerator. It is taken as an IO device from the CPU's point of view (but it need not be physically). The process, running on the CPU, holds a context of the accelerator and send
Re: [PATCH 2/2] arm: use swiotlb for bounce buffer on LPAE configs
On 08/01/2020 2:00 pm, Peter Ujfalusi wrote: Robin, On 08/01/2020 14.21, Robin Murphy wrote: On 08/01/2020 8:28 am, Peter Ujfalusi via iommu wrote: Hi Christoph, On 19/12/2019 17.20, Peter Ujfalusi wrote: Hi Christoph, On 19/12/2019 17.02, Christoph Hellwig wrote: Hi Peter, can you try the patch below (it will need to be split into two): Thank you! Unfortunately it does not help: [ 0.596208] edma: probe of 270.edma failed with error -5 [ 0.596626] edma: probe of 2728000.edma failed with error -5 ... [ 2.108602] sdhci-omap 2300.mmc: Got CD GPIO [ 2.113899] mmc0: Failed to set 32-bit DMA mask. [ 2.118592] mmc0: No suitable DMA available - falling back to PIO [ 2.159038] mmc0: SDHCI controller on 2300.mmc [2300.mmc] using PIO [ 2.167531] mmc1: Failed to set 32-bit DMA mask. [ 2.172192] mmc1: No suitable DMA available - falling back to PIO [ 2.213841] mmc1: SDHCI controller on 2310.mmc [2310.mmc] using PIO Do you have idea on how to fix this in a proper way? IMHO when drivers are setting the dma_mask and coherent_mask the dma_pfn_offset should not be applied to the mask at all. If I understand it correctly for EDMA as example: I set dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)); since it can only address memory in this range. It does not matter if dma_pfn_offset is 0 or not 0 (like in k2g, where it is 0x78) the EDMA still can only address within 32 bits. The dma_pfn_offset will tell us that the memory location's physical address is seen by the DMA at (phys_pfn - dma_pfn_offset) -> dma_pfn. The dma_mask should be checked against the dma_pfn. That's exactly what dma_direct_supported() does, though. Yes, this is my understanding as well. What it doesn't do, AFAICS, is account for weird cases where the DMA zone *starts* way, way above 32 physical address bits because of an implicit assumption that *all* devices have a dma_pfn_offset equal to min_low_pfn. 
The problem - I think - is that the DMA_BIT_MASK(32) from dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)) is treated as a physical address along the call path so the dma_pfn_offset is applied to it and the check will fail, saying that DMA_BIT_MASK(32) can not be supported. But that's the thing - in isolation, that is entirely correct. Considering ZONE_DMA32 for simplicity, in general the zone is expected to cover the physical address range 0x_ - 0x_ (because DMA offsets are relatively rare), and a device with a dma_pfn_offset of more than (0x1__ >> PAGE_SHIFT) *cannot* support that range with any mask, because the DMA address itself would have to be negative. The problem is that platforms with esoteric memory maps have no right thing to do. If the base of RAM is at 0x1__ or higher, the "correct" ZONE_DMA32 would be empty while ZONE_NORMAL above it would not, and last time I looked that makes the page allocator break badly. So the standard bodge on such platforms is to make ZONE_DMA32 cover not the first 4GB of *PA space*, but the first 4GB of *RAM*, wherever that happens to be. That then brings different problems - now the page allocator is happy and successfully returns GFP_DMA32 allocations from the range 0x8__ - 0x8__ that are utterly useless to 32-bit devices with zero dma_pfn_offset - see the AMD Seattle SoC for the prime example of that. If on the other hand all devices are guaranteed to have a dma_pfn_offset that puts the base of RAM at DMA address 0 then GFP_DMA32 allocations do end up working as expected, but now the original assumption of where ZONE_DMA32 actually is, is broken, so generic code unaware of the platform/architecture-specific bodge will be misled - that's the case you're running into. Having thought this far, if there's a non-hacky way to reach in and grab ZONE_DMA{32} such that dma_direct_supported() could use zone_end_pfn() instead of trying to assume either way, that might be the most robust general solution. Robin. 
Fyi, this is what I have gathered on k2g via prints: dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)); Prints inside dma_direct_supported(): sdhci-omap 2300.mmc: max_pfn: 88 sdhci-omap 2300.mmc: min_mask #1: ff sdhci-omap 2300.mmc: min_mask #2: ff sdhci-omap 2300.mmc: dev->dma_pfn_offset: 78 sdhci-omap 2300.mmc: PAGE_SHIFT: 12 sdhci-omap 2300.mmc: __phys_to_dma(dev, min_mask): ff880ff sdhci-omap 2300.mmc: mask: Print in dma_supported() after returning from dma_direct_supported(): sdhci-omap 2300.mmc: dma_is_direct, ret = 0 sdhci-omap 2310.mmc: DMA is not supported for the device keystone-k2g have this in soc0 node: dma-ranges = <0x8000 0x8 0x 0x8000>; DDR starts at 0x8 (8G) and 2G is aliased at 0x8000 . This gives the 0x78 for dma_pfn_offset for all devices underneath it. The DMA_BIT_MASK(24) is passed to __phys_to_dma() because CONFIG_ZONE_DMA is
Re: [PATCH 2/2] arm: use swiotlb for bounce buffer on LPAE configs
Robin, On 08/01/2020 14.21, Robin Murphy wrote: > On 08/01/2020 8:28 am, Peter Ujfalusi via iommu wrote: >> Hi Christoph, >> >> On 19/12/2019 17.20, Peter Ujfalusi wrote: >>> Hi Christoph, >>> >>> On 19/12/2019 17.02, Christoph Hellwig wrote: Hi Peter, can you try the patch below (it will need to be split into two): >>> >>> Thank you! >>> >>> Unfortunately it does not help: >>> [ 0.596208] edma: probe of 270.edma failed with error -5 >>> [ 0.596626] edma: probe of 2728000.edma failed with error -5 >>> ... >>> [ 2.108602] sdhci-omap 2300.mmc: Got CD GPIO >>> [ 2.113899] mmc0: Failed to set 32-bit DMA mask. >>> [ 2.118592] mmc0: No suitable DMA available - falling back to PIO >>> [ 2.159038] mmc0: SDHCI controller on 2300.mmc [2300.mmc] >>> using PIO >>> [ 2.167531] mmc1: Failed to set 32-bit DMA mask. >>> [ 2.172192] mmc1: No suitable DMA available - falling back to PIO >>> [ 2.213841] mmc1: SDHCI controller on 2310.mmc [2310.mmc] >>> using PIO >> >> Do you have idea on how to fix this in a proper way? >> >> IMHO when drivers are setting the dma_mask and coherent_mask the >> dma_pfn_offset should not be applied to the mask at all. >> >> If I understand it correctly for EDMA as example: >> >> I set dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)); >> since it can only address memory in this range. >> >> It does not matter if dma_pfn_offset is 0 or not 0 (like in k2g, where >> it is 0x78) the EDMA still can only address within 32 bits. >> >> The dma_pfn_offset will tell us that the memory location's physical >> address is seen by the DMA at (phys_pfn - dma_pfn_offset) -> dma_pfn. >> >> The dma_mask should be checked against the dma_pfn. > > That's exactly what dma_direct_supported() does, though. Yes, this is my understanding as well. > What it doesn't > do, AFAICS, is account for weird cases where the DMA zone *starts* way, > way above 32 physical address bits because of an implicit assumption > that *all* devices have a dma_pfn_offset equal to min_low_pfn. 
The problem - I think - is that the DMA_BIT_MASK(32) from dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)) is treated as physical address along the call path so the dma_pfn_offset is applied to it and the check will fail, saying that DMA_BIT_MASK(32) can not be supported. Fyi, this is what I have gathered on k2g via prints: dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)); Prints inside dma_direct_supported(): sdhci-omap 2300.mmc: max_pfn: 88 sdhci-omap 2300.mmc: min_mask #1: ff sdhci-omap 2300.mmc: min_mask #2: ff sdhci-omap 2300.mmc: dev->dma_pfn_offset: 78 sdhci-omap 2300.mmc: PAGE_SHIFT: 12 sdhci-omap 2300.mmc: __phys_to_dma(dev, min_mask): ff880ff sdhci-omap 2300.mmc: mask: Print in dma_supported() after returning from dma_direct_supported(): sdhci-omap 2300.mmc: dma_is_direct, ret = 0 sdhci-omap 2310.mmc: DMA is not supported for the device keystone-k2g have this in soc0 node: dma-ranges = <0x8000 0x8 0x 0x8000>; DDR starts at 0x8 (8G) and 2G is aliased at 0x8000 . This gives the 0x78 for dma_pfn_offset for all devices underneath it. The DMA_BIT_MASK(24) is passed to __phys_to_dma() because CONFIG_ZONE_DMA is enabled. SWIOTLB is enabled in kconfig. > I'm not sure how possible it is to cope with that generically. Me neither, but k2g is failing since v5.3-rc3 and the bisect pointed to this commit. > > Robin. > >> We can not 'move' the dma_mask with dma_pfn_offset when setting the mask >> since it is not correct. The DMA can access in 32 bits range and we have >> the peripherals under 0x8000 . >> >> I might be missing something, but it looks to me that the way we set the >> dma_mask and the coherent_mask is the place where this can be fixed. 
>> >> Best regards, >> - Péter >> >>> >>> - Péter >>> >>> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index e822af0d9219..30b9c6786ce3 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -221,7 +221,8 @@ EXPORT_SYMBOL(arm_coherent_dma_ops); static int __dma_supported(struct device *dev, u64 mask, bool warn) { - unsigned long max_dma_pfn = min(max_pfn, arm_dma_pfn_limit); + unsigned long max_dma_pfn = + min_t(unsigned long, max_pfn, zone_dma_limit >> PAGE_SHIFT); /* * Translate the device's DMA mask to a PFN limit. This diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 3ef204137e73..dd0e169a1bb1 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include #include @@ -84,15 +85,6 @@ static void __init find_limits(unsigned long *min, unsigned long *max_low, phys_addr_t arm_dma_zone_size __read_mostly;
Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices
> +/* > + * We expect devices with endpoint scope to have normal PCI > + * headers, and devices with bridge scope to have bridge PCI > + * headers. However some PCI devices may be listed in the > + * DMAR table with bridge scope, even though they have a > + * normal PCI header. We don't declare a socpe mismatch for > + * below special cases. > + */ Please use up all 80 lines for comments. > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2f0d,/* NTB devices > */ > + quirk_dmar_scope_mismatch); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2020,/* NVME host */ > + quirk_dmar_scope_mismatch); As said before "NVME host" host. Besides the wrong spelling of NVMe, the NVMe host is the Linux kernel, so describing a device as such seems rather bogus.
Re: [PATCH 2/2] arm: use swiotlb for bounce buffer on LPAE configs
On 08/01/2020 8:28 am, Peter Ujfalusi via iommu wrote: Hi Christoph, On 19/12/2019 17.20, Peter Ujfalusi wrote: Hi Christoph, On 19/12/2019 17.02, Christoph Hellwig wrote: Hi Peter, can you try the patch below (it will need to be split into two): Thank you! Unfortunately it does not help: [0.596208] edma: probe of 270.edma failed with error -5 [0.596626] edma: probe of 2728000.edma failed with error -5 ... [2.108602] sdhci-omap 2300.mmc: Got CD GPIO [2.113899] mmc0: Failed to set 32-bit DMA mask. [2.118592] mmc0: No suitable DMA available - falling back to PIO [2.159038] mmc0: SDHCI controller on 2300.mmc [2300.mmc] using PIO [2.167531] mmc1: Failed to set 32-bit DMA mask. [2.172192] mmc1: No suitable DMA available - falling back to PIO [2.213841] mmc1: SDHCI controller on 2310.mmc [2310.mmc] using PIO Do you have idea on how to fix this in a proper way? IMHO when drivers are setting the dma_mask and coherent_mask the dma_pfn_offset should not be applied to the mask at all. If I understand it correctly for EDMA as example: I set dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)); since it can only address memory in this range. It does not matter if dma_pfn_offset is 0 or not 0 (like in k2g, where it is 0x78) the EDMA still can only address within 32 bits. The dma_pfn_offset will tell us that the memory location's physical address is seen by the DMA at (phys_pfn - dma_pfn_offset) -> dma_pfn. The dma_mask should be checked against the dma_pfn. That's exactly what dma_direct_supported() does, though. What it doesn't do, AFAICS, is account for weird cases where the DMA zone *starts* way, way above 32 physical address bits because of an implicit assumption that *all* devices have a dma_pfn_offset equal to min_low_pfn. I'm not sure how possible it is to cope with that generically. Robin. We can not 'move' the dma_mask with dma_pfn_offset when setting the mask since it is not correct. The DMA can access in 32 bits range and we have the peripherals under 0x8000 . 
I might be missing something, but it looks to me that the way we set the dma_mask and the coherent_mask is the place where this can be fixed. Best regards, - Péter - Péter diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index e822af0d9219..30b9c6786ce3 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -221,7 +221,8 @@ EXPORT_SYMBOL(arm_coherent_dma_ops); static int __dma_supported(struct device *dev, u64 mask, bool warn) { - unsigned long max_dma_pfn = min(max_pfn, arm_dma_pfn_limit); + unsigned long max_dma_pfn = + min_t(unsigned long, max_pfn, zone_dma_limit >> PAGE_SHIFT); /* * Translate the device's DMA mask to a PFN limit. This diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 3ef204137e73..dd0e169a1bb1 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include #include @@ -84,15 +85,6 @@ static void __init find_limits(unsigned long *min, unsigned long *max_low, phys_addr_t arm_dma_zone_size __read_mostly; EXPORT_SYMBOL(arm_dma_zone_size); -/* - * The DMA mask corresponding to the maximum bus address allocatable - * using GFP_DMA. The default here places no restriction on DMA - * allocations. This must be the smallest DMA mask in the system, - * so a successful GFP_DMA allocation will always satisfy this. 
- */ -phys_addr_t arm_dma_limit; -unsigned long arm_dma_pfn_limit; - static void __init arm_adjust_dma_zone(unsigned long *size, unsigned long *hole, unsigned long dma_size) { @@ -108,14 +100,14 @@ static void __init arm_adjust_dma_zone(unsigned long *size, unsigned long *hole, void __init setup_dma_zone(const struct machine_desc *mdesc) { -#ifdef CONFIG_ZONE_DMA - if (mdesc->dma_zone_size) { + if (!IS_ENABLED(CONFIG_ZONE_DMA)) { + zone_dma_limit = ((phys_addr_t)~0); + } else if (mdesc->dma_zone_size) { arm_dma_zone_size = mdesc->dma_zone_size; - arm_dma_limit = PHYS_OFFSET + arm_dma_zone_size - 1; - } else - arm_dma_limit = 0x; - arm_dma_pfn_limit = arm_dma_limit >> PAGE_SHIFT; -#endif + zone_dma_limit = PHYS_OFFSET + arm_dma_zone_size - 1; + } else { + zone_dma_limit = 0x; + } } static void __init zone_sizes_init(unsigned long min, unsigned long max_low, @@ -279,7 +271,7 @@ void __init arm_memblock_init(const struct machine_desc *mdesc) early_init_fdt_scan_reserved_mem(); /* reserve memory for DMA contiguous allocations */ - dma_contiguous_reserve(arm_dma_limit); + dma_contiguous_reserve(zone_dma_limit); arm_memblock_steal_permitted = false; memblock_dump_all(); diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h index
Re: [PATCH 0/3] iommu/arm-smmu: Qualcomm bootsplash/efifb
On Thu, Dec 26, 2019 at 02:17:06PM -0800, Bjorn Andersson wrote: > These patches implements the stream mapping inheritance that's necessary in > order to not hit a security violation as the display hardware looses its > stream > mapping during initialization of arm-smmu in various Qualcomm platforms. > > This was previously posted as an RFC [1], changes since then involves the > rebase and migration of the read-back code to the Qualcomm specific > implementation, the mapping is maintained indefinitely - to handle probe > deferring clients - and rewritten commit messages. I don't think we should solve this in a Qualcomm-specific manner. Please can you take a look at the proposal from Thierry [1] and see whether or not it works for you? Thanks, Will [1] https://lore.kernel.org/lkml/20191209150748.2471814-1-thierry.red...@gmail.com
Re: [PATCH 2/2] arm: use swiotlb for bounce buffer on LPAE configs
Hi Christoph, On 19/12/2019 17.20, Peter Ujfalusi wrote: > Hi Christoph, > > On 19/12/2019 17.02, Christoph Hellwig wrote: >> Hi Peter, >> >> can you try the patch below (it will need to be split into two): > > Thank you! > > Unfortunately it does not help: > [0.596208] edma: probe of 270.edma failed with error -5 > [0.596626] edma: probe of 2728000.edma failed with error -5 > ... > [2.108602] sdhci-omap 2300.mmc: Got CD GPIO > [2.113899] mmc0: Failed to set 32-bit DMA mask. > [2.118592] mmc0: No suitable DMA available - falling back to PIO > [2.159038] mmc0: SDHCI controller on 2300.mmc [2300.mmc] > using PIO > [2.167531] mmc1: Failed to set 32-bit DMA mask. > [2.172192] mmc1: No suitable DMA available - falling back to PIO > [2.213841] mmc1: SDHCI controller on 2310.mmc [2310.mmc] > using PIO Do you have idea on how to fix this in a proper way? IMHO when drivers are setting the dma_mask and coherent_mask the dma_pfn_offset should not be applied to the mask at all. If I understand it correctly for EDMA as example: I set dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)); since it can only address memory in this range. It does not matter if dma_pfn_offset is 0 or not 0 (like in k2g, where it is 0x78) the EDMA still can only address within 32 bits. The dma_pfn_offset will tell us that the memory location's physical address is seen by the DMA at (phys_pfn - dma_pfn_offset) -> dma_pfn. The dma_mask should be checked against the dma_pfn. We can not 'move' the dma_mask with dma_pfn_offset when setting the mask since it is not correct. The DMA can access in 32 bits range and we have the peripherals under 0x8000 . I might be missing something, but it looks to me that the way we set the dma_mask and the coherent_mask is the place where this can be fixed. 
Best regards, - Péter > > - Péter > > >> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c >> index e822af0d9219..30b9c6786ce3 100644 >> --- a/arch/arm/mm/dma-mapping.c >> +++ b/arch/arm/mm/dma-mapping.c >> @@ -221,7 +221,8 @@ EXPORT_SYMBOL(arm_coherent_dma_ops); >> >> static int __dma_supported(struct device *dev, u64 mask, bool warn) >> { >> -unsigned long max_dma_pfn = min(max_pfn, arm_dma_pfn_limit); >> +unsigned long max_dma_pfn = >> +min_t(unsigned long, max_pfn, zone_dma_limit >> PAGE_SHIFT); >> >> /* >> * Translate the device's DMA mask to a PFN limit. This >> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c >> index 3ef204137e73..dd0e169a1bb1 100644 >> --- a/arch/arm/mm/init.c >> +++ b/arch/arm/mm/init.c >> @@ -19,6 +19,7 @@ >> #include >> #include >> #include >> +#include >> #include >> #include >> #include >> @@ -84,15 +85,6 @@ static void __init find_limits(unsigned long *min, >> unsigned long *max_low, >> phys_addr_t arm_dma_zone_size __read_mostly; >> EXPORT_SYMBOL(arm_dma_zone_size); >> >> -/* >> - * The DMA mask corresponding to the maximum bus address allocatable >> - * using GFP_DMA. The default here places no restriction on DMA >> - * allocations. This must be the smallest DMA mask in the system, >> - * so a successful GFP_DMA allocation will always satisfy this. 
>> - */ >> -phys_addr_t arm_dma_limit; >> -unsigned long arm_dma_pfn_limit; >> - >> static void __init arm_adjust_dma_zone(unsigned long *size, unsigned long >> *hole, >> unsigned long dma_size) >> { >> @@ -108,14 +100,14 @@ static void __init arm_adjust_dma_zone(unsigned long >> *size, unsigned long *hole, >> >> void __init setup_dma_zone(const struct machine_desc *mdesc) >> { >> -#ifdef CONFIG_ZONE_DMA >> -if (mdesc->dma_zone_size) { >> +if (!IS_ENABLED(CONFIG_ZONE_DMA)) { >> +zone_dma_limit = ((phys_addr_t)~0); >> +} else if (mdesc->dma_zone_size) { >> arm_dma_zone_size = mdesc->dma_zone_size; >> -arm_dma_limit = PHYS_OFFSET + arm_dma_zone_size - 1; >> -} else >> -arm_dma_limit = 0x; >> -arm_dma_pfn_limit = arm_dma_limit >> PAGE_SHIFT; >> -#endif >> +zone_dma_limit = PHYS_OFFSET + arm_dma_zone_size - 1; >> +} else { >> +zone_dma_limit = 0x; >> +} >> } >> >> static void __init zone_sizes_init(unsigned long min, unsigned long max_low, >> @@ -279,7 +271,7 @@ void __init arm_memblock_init(const struct machine_desc >> *mdesc) >> early_init_fdt_scan_reserved_mem(); >> >> /* reserve memory for DMA contiguous allocations */ >> -dma_contiguous_reserve(arm_dma_limit); >> +dma_contiguous_reserve(zone_dma_limit); >> >> arm_memblock_steal_permitted = false; >> memblock_dump_all(); >> diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h >> index 88c121ac14b3..7dbd77554273 100644 >> --- a/arch/arm/mm/mm.h >> +++ b/arch/arm/mm/mm.h >> @@ -82,14 +82,6 @@ extern __init void add_static_vm_early(struct static_vm >> *svm); >> >> #endif >> >> -#ifdef CONFIG_ZONE_DMA >> -extern