Re: [PATCH v8 07/21] iommu/io-pgtable-arm-v7s: Extend MediaTek 4GB Mode
On Wed, 2019-07-17 at 15:23 +0100, Will Deacon wrote:
> On Wed, Jul 17, 2019 at 08:44:19PM +0800, Yong Wu wrote:
> > On Mon, 2019-07-15 at 10:51 +0100, Will Deacon wrote:
> > > On Sun, Jul 14, 2019 at 12:41:20PM +0800, Yong Wu wrote:
> > > > @@ -742,7 +763,9 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
> > > >  {
> > > >  	struct arm_v7s_io_pgtable *data;
> > > >
> > > > -	if (cfg->ias > ARM_V7S_ADDR_BITS || cfg->oas > ARM_V7S_ADDR_BITS)
> > > > +	if (cfg->ias > ARM_V7S_ADDR_BITS ||
> > > > +	    (cfg->oas > ARM_V7S_ADDR_BITS &&
> > > > +	     !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT)))
> > > >  		return NULL;
> > >
> > > I think you can rework this to do something like:
> > >
> > > 	if (cfg->ias > ARM_V7S_ADDR_BITS)
> > > 		return NULL;
> > >
> > > 	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
> > > 		if (!IS_ENABLED(CONFIG_PHYS_ADDR_T_64BIT))
> > > 			cfg->oas = min(cfg->oas, ARM_V7S_ADDR_BITS);
> > > 		else if (cfg->oas > 34)
> > > 			return NULL;
> > > 	} else if (cfg->oas > ARM_V7S_ADDR_BITS) {
> > > 		return NULL;
> > > 	}
> > >
> > > so that we clamp the oas when phys_addr_t is 32-bit for you. That should
> > > allow you to remove lots of the checking from iopte_to_paddr() too if you
> > > check against oas in the map() function.
> > >
> > > Does that make sense?
> >
> > Of course I'm ok for this. I'm only afraid that this function already
> > has 3 checks of the form "if (x) return NULL", and here we add a new
> > one and so many lines... Maybe the user should guarantee the right
> > value of oas. How about moving it into mtk_iommu.c?
> >
> > About the checking in iopte_to_paddr, I can not remove it. I know it
> > may be a bit special and not readable. Hmm, I guess I should use a
> > macro instead of the hard-coded 33 for the special 4GB mode case.
>
> Why can't you just do something like:
>
> 	if (!(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT))
> 		return paddr;
>
> 	if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> 		paddr |= BIT_ULL(33);

OK here.
> 	if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
> 		paddr |= BIT_ULL(32);

No here. The flow is a bit special for 4GB mode. This is the detailed
remap relationship for our 4GB mode:

         CPU PA                        -> HW PA (register)
         0x0 ~ 0x3fff_ffff
dram 1G: 0x4000_0000 ~ 0x7fff_ffff     -> 0x1_4000_0000 ~ 0x1_7fff_ffff (add bit32)
dram 2G: 0x8000_0000 ~ 0xbfff_ffff     -> 0x1_8000_0000 ~ 0x1_bfff_ffff (add bit32)
dram 3G: 0xc000_0000 ~ 0xffff_ffff     -> 0x1_c000_0000 ~ 0x1_ffff_ffff (add bit32)
dram 4G: 0x1_0000_0000 ~ 0x1_3fff_ffff -> 0x1_0000_0000 ~ 0x1_3fff_ffff

Thus, in 4GB mode we should always set bit 9 in the pte (it carries PA
bit 32), but we cannot always add bit 32 back in iova_to_phys. The valid
PA range is 0x4000_0000 - 0x1_3fff_ffff, so we can only add bit 32 when
the PA in the pte is below 0x4000_0000, and must keep it as-is when the
PA in the pte lies between 0x4000_0000 and 0xffff_ffff. This issue has
existed ever since we added 4GB mode for mt8173. Thus, I have to add a
special flow for 4GB mode here:

	/* Workaround for MTK 4GB Mode: Add BIT32 only when PA < 0x4000_0000. */
	if (cfg->oas == ARM_V7S_MTK_4GB_OAS && paddr < 0x40000000UL)
		paddr |= BIT_ULL(32);
	else if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
		paddr |= BIT_ULL(32);

> 	return paddr;
>
> The diff I sent previously sanitises the oas at init time, and then you
> can just enforce it in map().
>
> Will

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
Hello,

Just going back to this question, which I wasn't able to answer.

Thiago Jung Bauermann writes:
> Michael S. Tsirkin writes:
>
>> So far so good, but now a question:
>>
>> how are we handling guest address width limitations?
>> Is VIRTIO_F_ACCESS_PLATFORM_IDENTITY_ADDRESS subject to
>> guest address width limitations?
>> I am guessing we can make them so ...
>> This needs to be documented.
>
> I'm not sure. I will get back to you on this.

We don't have address width limitations between host and guest.

--
Thiago Jung Bauermann
IBM Linux Technology Center
[PATCH v3 0/6] Remove x86-specific code from generic headers
Hello,

This version is mostly about splitting up patch 2/3 into three separate
patches, as suggested by Christoph Hellwig. Two other changes are a fix in
patch 1, which wasn't selecting ARCH_HAS_MEM_ENCRYPT for s390 (spotted by
Janani), and the removal of the sme_active and sev_active symbol exports,
as suggested by Christoph Hellwig.

These patches are applied on top of today's dma-mapping/for-next. I don't
have a way to test SME, SEV, nor s390's PEF, so the patches have only been
build tested.

Changelog

Since v2:

- Patch "x86,s390: Move ARCH_HAS_MEM_ENCRYPT definition to arch/Kconfig"
  - Added "select ARCH_HAS_MEM_ENCRYPT" to config S390. Suggested by Janani.

- Patch "DMA mapping: Move SME handling to x86-specific files"
  - Split up into 3 new patches. Suggested by Christoph Hellwig.

- Patch "swiotlb: Remove call to sme_active()"
  - New patch.

- Patch "dma-mapping: Remove dma_check_mask()"
  - New patch.

- Patch "x86,s390/mm: Move sme_active() and sme_me_mask to x86-specific header"
  - New patch.
  - Removed export of sme_active symbol. Suggested by Christoph Hellwig.

- Patch "fs/core/vmcore: Move sev_active() reference to x86 arch code"
  - Removed export of sev_active symbol. Suggested by Christoph Hellwig.

- Patch "s390/mm: Remove sev_active() function"
  - New patch.

Since v1:

- Patch "x86,s390: Move ARCH_HAS_MEM_ENCRYPT definition to arch/Kconfig"
  - Remove definition of ARCH_HAS_MEM_ENCRYPT from s390/Kconfig as well.
  - Reworded patch title and message a little bit.

- Patch "DMA mapping: Move SME handling to x86-specific files"
  - Adapt s390's <asm/mem_encrypt.h> as well.
  - Remove dma_check_mask() from kernel/dma/mapping.c. Suggested by
    Christoph Hellwig.
Thiago Jung Bauermann (6):
  x86,s390: Move ARCH_HAS_MEM_ENCRYPT definition to arch/Kconfig
  swiotlb: Remove call to sme_active()
  dma-mapping: Remove dma_check_mask()
  x86,s390/mm: Move sme_active() and sme_me_mask to x86-specific header
  fs/core/vmcore: Move sev_active() reference to x86 arch code
  s390/mm: Remove sev_active() function

 arch/Kconfig                        |  3 +++
 arch/s390/Kconfig                   |  4 +---
 arch/s390/include/asm/mem_encrypt.h |  5 +
 arch/s390/mm/init.c                 |  8 +---
 arch/x86/Kconfig                    |  4 +---
 arch/x86/include/asm/mem_encrypt.h  | 10 ++
 arch/x86/kernel/crash_dump_64.c     |  5 +
 arch/x86/mm/mem_encrypt.c           |  2 --
 fs/proc/vmcore.c                    |  8
 include/linux/crash_dump.h          | 14 ++
 include/linux/mem_encrypt.h         | 15 +--
 kernel/dma/mapping.c                |  8
 kernel/dma/swiotlb.c                |  3 +--
 13 files changed, 42 insertions(+), 47 deletions(-)
[PATCH v3 5/6] fs/core/vmcore: Move sev_active() reference to x86 arch code
Secure Encrypted Virtualization is an x86-specific feature, so it shouldn't
appear in generic kernel code because it forces non-x86 architectures to
define the sev_active() function, which doesn't make a lot of sense.

To solve this problem, add an x86 elfcorehdr_read() function to override
the generic weak implementation. To do that, it's necessary to make
read_from_oldmem() public so that it can be used outside of vmcore.c.

Also, remove the export for sev_active() since it's only used in files
that won't be built as modules.

Signed-off-by: Thiago Jung Bauermann
---
 arch/x86/kernel/crash_dump_64.c |  5 +
 arch/x86/mm/mem_encrypt.c       |  1 -
 fs/proc/vmcore.c                |  8
 include/linux/crash_dump.h      | 14 ++
 include/linux/mem_encrypt.h     |  1 -
 5 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
index 22369dd5de3b..045e82e8945b 100644
--- a/arch/x86/kernel/crash_dump_64.c
+++ b/arch/x86/kernel/crash_dump_64.c
@@ -70,3 +70,8 @@ ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf, size_t csize,
 {
 	return __copy_oldmem_page(pfn, buf, csize, offset, userbuf, true);
 }
+
+ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
+{
+	return read_from_oldmem(buf, count, ppos, 0, sev_active());
+}
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 7139f2f43955..b1e823441093 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -349,7 +349,6 @@ bool sev_active(void)
 {
 	return sme_me_mask && sev_enabled;
 }
-EXPORT_SYMBOL(sev_active);

 /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
 bool force_dma_unencrypted(struct device *dev)
diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index 57957c91c6df..ca1f20bedd8c 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -100,9 +100,9 @@ static int pfn_is_ram(unsigned long pfn)
 }

 /* Reads a page from the oldmem device from given offset.
  */
-static ssize_t read_from_oldmem(char *buf, size_t count,
-				u64 *ppos, int userbuf,
-				bool encrypted)
+ssize_t read_from_oldmem(char *buf, size_t count,
+			 u64 *ppos, int userbuf,
+			 bool encrypted)
 {
 	unsigned long pfn, offset;
 	size_t nr_bytes;
@@ -166,7 +166,7 @@ void __weak elfcorehdr_free(unsigned long long addr)
  */
 ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos)
 {
-	return read_from_oldmem(buf, count, ppos, 0, sev_active());
+	return read_from_oldmem(buf, count, ppos, 0, false);
 }

 /*
diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
index f774c5eb9e3c..4664fc1871de 100644
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -115,4 +115,18 @@ static inline int vmcore_add_device_dump(struct vmcoredd_data *data)
 	return -EOPNOTSUPP;
 }
 #endif /* CONFIG_PROC_VMCORE_DEVICE_DUMP */
+
+#ifdef CONFIG_PROC_VMCORE
+ssize_t read_from_oldmem(char *buf, size_t count,
+			 u64 *ppos, int userbuf,
+			 bool encrypted);
+#else
+static inline ssize_t read_from_oldmem(char *buf, size_t count,
+				       u64 *ppos, int userbuf,
+				       bool encrypted)
+{
+	return -EOPNOTSUPP;
+}
+#endif /* CONFIG_PROC_VMCORE */
+
 #endif /* LINUX_CRASHDUMP_H */
diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
index 0c5b0ff9eb29..5c4a18a91f89 100644
--- a/include/linux/mem_encrypt.h
+++ b/include/linux/mem_encrypt.h
@@ -19,7 +19,6 @@
 #else	/* !CONFIG_ARCH_HAS_MEM_ENCRYPT */

 static inline bool mem_encrypt_active(void) { return false; }
-static inline bool sev_active(void) { return false; }

 #endif	/* CONFIG_ARCH_HAS_MEM_ENCRYPT */
[PATCH v3 3/6] dma-mapping: Remove dma_check_mask()
sme_active() is an x86-specific function so it's better not to call it
from generic code. Christoph Hellwig mentioned that "There is no reason
why we should have a special debug printk just for one specific reason
why there is a requirement for a large DMA mask.", so just remove
dma_check_mask().

Signed-off-by: Thiago Jung Bauermann
---
 kernel/dma/mapping.c | 8
 1 file changed, 8 deletions(-)

diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 1f628e7ac709..61eeefbfcb36 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -291,12 +291,6 @@ void dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 }
 EXPORT_SYMBOL(dma_free_attrs);

-static inline void dma_check_mask(struct device *dev, u64 mask)
-{
-	if (sme_active() && (mask < (((u64)sme_get_me_mask() << 1) - 1)))
-		dev_warn(dev, "SME is active, device will require DMA bounce buffers\n");
-}
-
 int dma_supported(struct device *dev, u64 mask)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);
@@ -327,7 +321,6 @@ int dma_set_mask(struct device *dev, u64 mask)
 		return -EIO;

 	arch_dma_set_mask(dev, mask);
-	dma_check_mask(dev, mask);
 	*dev->dma_mask = mask;
 	return 0;
 }
@@ -345,7 +338,6 @@ int dma_set_coherent_mask(struct device *dev, u64 mask)
 	if (!dma_supported(dev, mask))
 		return -EIO;

-	dma_check_mask(dev, mask);
 	dev->coherent_dma_mask = mask;
 	return 0;
 }
[PATCH v3 4/6] x86, s390/mm: Move sme_active() and sme_me_mask to x86-specific header
Now that generic code doesn't reference them, move sme_active() and
sme_me_mask to x86's <asm/mem_encrypt.h>.

Also remove the export for sme_active() since it's only used in files
that won't be built as modules. sme_me_mask, on the other hand, is used
in arch/x86/kvm/svm.c (via __sme_set() and __psp_pa()), which can be
built as a module, so its export needs to stay.

Signed-off-by: Thiago Jung Bauermann
---
 arch/s390/include/asm/mem_encrypt.h |  4 +---
 arch/x86/include/asm/mem_encrypt.h  | 10 ++
 arch/x86/mm/mem_encrypt.c           |  1 -
 include/linux/mem_encrypt.h         | 14 +-
 4 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h
index 3eb018508190..ff813a56bc30 100644
--- a/arch/s390/include/asm/mem_encrypt.h
+++ b/arch/s390/include/asm/mem_encrypt.h
@@ -4,9 +4,7 @@

 #ifndef __ASSEMBLY__

-#define sme_me_mask	0ULL
-
-static inline bool sme_active(void) { return false; }
+static inline bool mem_encrypt_active(void) { return false; }
 extern bool sev_active(void);

 int set_memory_encrypted(unsigned long addr, int numpages);
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 0c196c47d621..848ce43b9040 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -92,6 +92,16 @@ early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
 extern char __start_bss_decrypted[], __end_bss_decrypted[],
 	__start_bss_decrypted_unused[];

+static inline bool mem_encrypt_active(void)
+{
+	return sme_me_mask;
+}
+
+static inline u64 sme_get_me_mask(void)
+{
+	return sme_me_mask;
+}
+
 #endif	/* __ASSEMBLY__ */

 #endif	/* __X86_MEM_ENCRYPT_H__ */
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index c805f0a5c16e..7139f2f43955 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -344,7 +344,6 @@ bool sme_active(void)
 {
 	return sme_me_mask && !sev_enabled;
 }
-EXPORT_SYMBOL(sme_active);

 bool sev_active(void)
 {
diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
index 470bd53a89df..0c5b0ff9eb29 100644
--- a/include/linux/mem_encrypt.h
+++ b/include/linux/mem_encrypt.h
@@ -18,23 +18,11 @@

 #else	/* !CONFIG_ARCH_HAS_MEM_ENCRYPT */

-#define sme_me_mask	0ULL
-
-static inline bool sme_active(void) { return false; }
+static inline bool mem_encrypt_active(void) { return false; }
 static inline bool sev_active(void) { return false; }

 #endif	/* CONFIG_ARCH_HAS_MEM_ENCRYPT */

-static inline bool mem_encrypt_active(void)
-{
-	return sme_me_mask;
-}
-
-static inline u64 sme_get_me_mask(void)
-{
-	return sme_me_mask;
-}
-
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 /*
  * The __sme_set() and __sme_clr() macros are useful for adding or removing
[PATCH v3 6/6] s390/mm: Remove sev_active() function
All references to sev_active() were moved to arch/x86, so we don't need
to define it for s390 anymore.

Signed-off-by: Thiago Jung Bauermann
---
 arch/s390/include/asm/mem_encrypt.h | 1 -
 arch/s390/mm/init.c                 | 8 +---
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h
index ff813a56bc30..2542cbf7e2d1 100644
--- a/arch/s390/include/asm/mem_encrypt.h
+++ b/arch/s390/include/asm/mem_encrypt.h
@@ -5,7 +5,6 @@
 #ifndef __ASSEMBLY__

 static inline bool mem_encrypt_active(void) { return false; }
-extern bool sev_active(void);

 int set_memory_encrypted(unsigned long addr, int numpages);
 int set_memory_decrypted(unsigned long addr, int numpages);
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 78c319c5ce48..6286eb3e815b 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -155,15 +155,9 @@ int set_memory_decrypted(unsigned long addr, int numpages)
 	return 0;
 }

-/* are we a protected virtualization guest? */
-bool sev_active(void)
-{
-	return is_prot_virt_guest();
-}
-
 bool force_dma_unencrypted(struct device *dev)
 {
-	return sev_active();
+	return is_prot_virt_guest();
 }

 /* protected virtualization */
[PATCH v3 2/6] swiotlb: Remove call to sme_active()
sme_active() is an x86-specific function so it's better not to call it
from generic code. There's no need to mention which memory encryption
feature is active, so just use a more generic message. Besides, other
architectures will have different names for similar technology.

Signed-off-by: Thiago Jung Bauermann
---
 kernel/dma/swiotlb.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 62fa5a82a065..e52401f94e91 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -459,8 +459,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 		panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");

 	if (mem_encrypt_active())
-		pr_warn_once("%s is active and system is using DMA bounce buffers\n",
-			     sme_active() ? "SME" : "SEV");
+		pr_warn_once("Memory encryption is active and system is using DMA bounce buffers\n");

 	mask = dma_get_seg_boundary(hwdev);
[PATCH v3 1/6] x86, s390: Move ARCH_HAS_MEM_ENCRYPT definition to arch/Kconfig
powerpc is also going to use this feature, so put it in a generic location.

Signed-off-by: Thiago Jung Bauermann
Reviewed-by: Thomas Gleixner
Reviewed-by: Christoph Hellwig
---
 arch/Kconfig      | 3 +++
 arch/s390/Kconfig | 4 +---
 arch/x86/Kconfig  | 4 +---
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index e8d19c3cb91f..8fc285180848 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -935,6 +935,9 @@ config LOCK_EVENT_COUNTS
 	  the chance of application behavior change because of timing
 	  differences. The counts are reported via debugfs.

+config ARCH_HAS_MEM_ENCRYPT
+	bool
+
 source "kernel/gcov/Kconfig"

 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index a4ad2733eedf..f43319c44454 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -1,7 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
-config ARCH_HAS_MEM_ENCRYPT
-	def_bool y
-
 config MMU
 	def_bool y

@@ -68,6 +65,7 @@ config S390
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_GIGANTIC_PAGE
 	select ARCH_HAS_KCOV
+	select ARCH_HAS_MEM_ENCRYPT
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_SET_MEMORY
 	select ARCH_HAS_STRICT_KERNEL_RWX
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c9f331bb538b..5d3295f2df94 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -68,6 +68,7 @@ config X86
 	select ARCH_HAS_FORTIFY_SOURCE
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_KCOV			if X86_64
+	select ARCH_HAS_MEM_ENCRYPT
 	select ARCH_HAS_MEMBARRIER_SYNC_CORE
 	select ARCH_HAS_PMEM_API		if X86_64
 	select ARCH_HAS_PTE_SPECIAL
@@ -1520,9 +1521,6 @@ config X86_CPA_STATISTICS
 	  helps to determine the effectiveness of preserving large and huge
 	  page mappings when mapping protections are changed.

-config ARCH_HAS_MEM_ENCRYPT
-	def_bool y
-
 config AMD_MEM_ENCRYPT
 	bool "AMD Secure Memory Encryption (SME) support"
 	depends on X86_64 && CPU_SUP_AMD
Re: [PATCH v4 12/15] iommu/vt-d: Cleanup get_valid_domain_for_dev()
On Sat, 25 May 2019 13:41:33 +0800
Lu Baolu wrote:

> Previously, get_valid_domain_for_dev() is used to retrieve the
> DMA domain which has been attached to the device or allocate one
> if no domain has been attached yet. As we have delegated the DMA
> domain management to the upper layer, this function is used purely to
> allocate a private DMA domain if the default domain doesn't work
> for this device. Cleanup the code for readability.
>
> Signed-off-by: Lu Baolu
> ---
>  drivers/iommu/intel-iommu.c | 18 --
>  include/linux/intel-iommu.h |  1 -
>  2 files changed, 8 insertions(+), 11 deletions(-)

System fails to boot, bisected to this commit:

commit 4ec066c7b1476e0ca66a7acdb575627a5d1a1ee6 (HEAD, refs/bisect/bad)
Author: Lu Baolu
Date:   Sat May 25 13:41:33 2019 +0800

    iommu/vt-d: Cleanup get_valid_domain_for_dev()

[    6.857697] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    6.870723] ehci-pci: EHCI PCI platform driver
[    6.879666] ehci-pci 0000:00:1a.0: EHCI Host Controller
[    6.890122] ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 1
[    6.904893] ehci-pci 0000:00:1a.0: debug port 2
[    6.913961] ehci-pci 0000:00:1a.0: DMAR: Setting identity map [0xbe8d1000 - 0xbe8dffff]
[    6.929953] ehci-pci 0000:00:1a.0: DMAR: 32bit DMA uses non-identity mapping
[    6.944033] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:600 domain_get_iommu+0x55/0x60
[    6.945017] Modules linked in:
[    6.945017] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc2+ #37
[    6.945017] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
[    6.945017] RIP: 0010:domain_get_iommu+0x55/0x60
[    6.945017] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 e0 f5 44 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[    6.945017] RSP: 0018:bd02418bba98 EFLAGS: 00010293
[    6.945017] RAX:  RBX:  RCX:
[    6.945017] RDX: 1000 RSI: 313ea000 RDI: 9ce9f3026e00
[    6.945017] RBP: 9cea144f80b0 R08:  R09: 9ce9f13ea000
[    6.945017] R10: 0010 R11: 0100 R12: 313ea000
[    6.945017] R13: 9ce9f3026e00 R14: 1000 R15: 313ea000
[    6.945017] FS:  () GS:9ceddf40() knlGS:
[    6.945017] CS:  0010 DS:  ES:  CR0: 80050033
[    6.945017] CR2: 7fa72ed421f8 CR3: 00011b20a001 CR4: 001606f0
[    6.945017] Call Trace:
[    6.945017]  __intel_map_single+0x4e/0x120
[    6.945017]  intel_alloc_coherent+0xa7/0x110
[    6.945017]  dma_pool_alloc+0xd9/0x1e0
[    6.945017]  ehci_qh_alloc+0x55/0x130
[    6.945017]  ehci_setup+0x284/0x7b0
[    6.945017]  ehci_pci_setup+0x8b/0x41d
[    6.945017]  usb_add_hcd.cold.40+0x1c5/0x7b7
[    6.945017]  usb_hcd_pci_probe+0x2a4/0x420
[    6.945017]  local_pci_probe+0x42/0x80
[    6.945017]  pci_device_probe+0x115/0x190
[    6.945017]  really_probe+0xef/0x390
[    6.945017]  driver_probe_device+0xb4/0x100
[    6.945017]  device_driver_attach+0x4f/0x60
[    6.945017]  __driver_attach+0x86/0x140
[    6.945017]  ? device_driver_attach+0x60/0x60
[    6.945017]  bus_for_each_dev+0x77/0xc0
[    6.945017]  ? klist_add_tail+0x3b/0x60
[    6.945017]  bus_add_driver+0x14a/0x1e0
[    6.945017]  ? ehci_hcd_init+0xaa/0xaa
[    6.945017]  ? do_early_param+0x8e/0x8e
[    6.945017]  driver_register+0x6b/0xb0
[    6.945017]  ? ehci_hcd_init+0xaa/0xaa
[    6.945017]  do_one_initcall+0x46/0x1f4
[    6.945017]  ? do_early_param+0x8e/0x8e
[    6.945017]  kernel_init_freeable+0x1ac/0x255
[    6.945017]  ? rest_init+0x9f/0x9f
[    6.945017]  kernel_init+0xa/0x101
[    6.945017]  ret_from_fork+0x35/0x40
[    6.945017] ---[ end trace 1aa3219cfb92242b ]---

Kernel command line includes intel_iommu=on iommu=pt

On the merge commit of Joerg's pull request for 5.3 (6b04014f3f15),
including:

commit c57b260a7d7d60dfbcf794dd9836c1d9fdbf5434
Author: Lu Baolu
Date:   Wed Jun 12 08:28:46 2019 +0800

    iommu/vt-d: Set domain type for a private domain

    Otherwise, domain_get_iommu() will be broken.

    Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private")

the fail-to-boot error shows as:

[    7.042371] general protection fault:  [#1] SMP PTI
[    7.042968] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0+ #39
[    7.042968] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
[    7.042968] RIP: 0010:domain_context_mapping_one+0x212/0x4c0
[    7.042968] Code: 39 c2 0f 8d fc 01 00 00 48 8b 15 49 4d b5 00 eb 0c 83 e8 01 41 39 c2 0f 84 ea 01 00 00 4d 8b 00 49 81 e0 00 f0 ff ff 49 01 d0 <41> f6 00 03 75 e1 b8 f4 ff ff ff e9 be fe ff ff 44 89 e1 45 89 e0
[    7.042968] RSP: 0018:b021818bb928 EFLAGS: 00010087
[    7.042968] RAX:
[RFC 0/4] Raspberry Pi 4 DMA addressing support
Hi all,

this series attempts to address some issues we found while bringing up
the new Raspberry Pi 4 in arm64, and it's intended to serve as a follow
up to this:
https://www.spinics.net/lists/arm-kernel/msg740650.html

The new Raspberry Pi 4 has up to 4GB of RAM, but most devices can only
address the first GB: the DMA address range is 0xc0000000-0xfc000000,
which is aliased to the first GB of memory, 0x00000000-0x3c000000. Note
that only some devices have this limitation; the ARM cores, PCIe, GENET,
and the 40-bit DMA channels have a wider view of the address space.

This is solved in arm32 by setting up the correct '.dma_zone_size =
SZ_1G', which takes care of allocating the coherent memory area at the
right spot, and is also taken into account in the arch-specific
'dma_map_ops'.

Unfortunately there is no such thing as '.dma_zone_size' in arm64. To
make things worse, it's assumed that all devices will be able to address
the first 4GB of memory. This raises two issues: the coherent memory
reserves are located in an area not accessible by most devices, and DMA
streaming's dma_supported() fails for most devices since its min_mask
isn't properly set. Note that the rest of DMA streaming works fine
thanks to the help of swiotlb.

On one hand I've implemented a function that parses 'dma-ranges' on all
interconnects and tries to select a location for the coherent memory
reserves that'll fit all devices. I made the algorithm as simple as
possible, based on the existing devices' limitations.

On the other hand I've added a new variable in dma-direct that allows
modifying the min_mask during the init process, and taken care of
setting it accordingly in arm64's init code.

Regards,
Nicolas

---

Nicolas Saenz Julienne (4):
  arm64: mm: use arm64_dma_phys_limit instead of calling
    max_zone_dma_phys()
  arm64: mm: parse dma-ranges in order to better estimate
    arm64_dma_phys_limit
  dma-direct: add dma_direct_min_mask
  arm64: mm: set direct_dma_min_mask according to dma-ranges

 arch/arm64/mm/init.c | 69
 kernel/dma/direct.c  |  4 ++-
 2 files changed, 67 insertions(+), 6 deletions(-)

--
2.22.0
[RFC 3/4] dma-direct: add dma_direct_min_mask
Historically, devices with ZONE_DMA32 have been assumed to be able to
address at least the lower 4GB of RAM for DMA. This is still the default
behavior, yet the Raspberry Pi 4 is limited to the first GB of memory.
This has been observed to trigger failures in dma_direct_supported() as
the 'min_mask' isn't properly set.

We create 'dma_direct_min_mask' in order for the arch init code to be
able to fine-tune dma-direct's 'min_mask'.

Signed-off-by: Nicolas Saenz Julienne
---
 kernel/dma/direct.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index b90e1aede743..3c8cd730648b 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -23,6 +23,8 @@
 #define ARCH_ZONE_DMA_BITS 24
 #endif

+u64 dma_direct_min_mask __ro_after_init = DMA_BIT_MASK(32);
+
 /*
  * For AMD SEV all DMA must be to unencrypted addresses.
  */
@@ -393,7 +395,7 @@ int dma_direct_supported(struct device *dev, u64 mask)
 	if (IS_ENABLED(CONFIG_ZONE_DMA))
 		min_mask = DMA_BIT_MASK(ARCH_ZONE_DMA_BITS);
 	else
-		min_mask = DMA_BIT_MASK(32);
+		min_mask = dma_direct_min_mask;

 	min_mask = min_t(u64, min_mask, (max_pfn - 1) << PAGE_SHIFT);

--
2.22.0
Re: [PATCH v8 07/21] iommu/io-pgtable-arm-v7s: Extend MediaTek 4GB Mode
On Wed, Jul 17, 2019 at 08:44:19PM +0800, Yong Wu wrote:
> On Mon, 2019-07-15 at 10:51 +0100, Will Deacon wrote:
> > On Sun, Jul 14, 2019 at 12:41:20PM +0800, Yong Wu wrote:
> > > @@ -742,7 +763,9 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
> > >  {
> > >  	struct arm_v7s_io_pgtable *data;
> > >
> > > -	if (cfg->ias > ARM_V7S_ADDR_BITS || cfg->oas > ARM_V7S_ADDR_BITS)
> > > +	if (cfg->ias > ARM_V7S_ADDR_BITS ||
> > > +	    (cfg->oas > ARM_V7S_ADDR_BITS &&
> > > +	     !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT)))
> > >  		return NULL;
> >
> > I think you can rework this to do something like:
> >
> > 	if (cfg->ias > ARM_V7S_ADDR_BITS)
> > 		return NULL;
> >
> > 	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
> > 		if (!IS_ENABLED(CONFIG_PHYS_ADDR_T_64BIT))
> > 			cfg->oas = min(cfg->oas, ARM_V7S_ADDR_BITS);
> > 		else if (cfg->oas > 34)
> > 			return NULL;
> > 	} else if (cfg->oas > ARM_V7S_ADDR_BITS) {
> > 		return NULL;
> > 	}
> >
> > so that we clamp the oas when phys_addr_t is 32-bit for you. That should
> > allow you to remove lots of the checking from iopte_to_paddr() too if you
> > check against oas in the map() function.
> >
> > Does that make sense?
>
> Of course I'm ok for this. I'm only afraid that this function has
> already 3 checks of the form "if (x) return NULL", and here we add a new
> one and so many lines... Maybe the user should guarantee the right value
> of oas. How about moving it into mtk_iommu.c?
>
> About the checking of iopte_to_paddr, I can not remove them. I know it
> may be a bit special and not readable. Hmm, I guess I should use a macro
> instead of the hard-coded 33 for the special 4GB mode case.

Why can't you just do something like:

	if (!(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT))
		return paddr;

	if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
		paddr |= BIT_ULL(33);
	if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
		paddr |= BIT_ULL(32);

	return paddr;

The diff I sent previously sanitises the oas at init time, and then you
can just enforce it in map().

Will
Re: add swiotlb support to arm32
Hi,

On 09/07/19 7:50 PM, Christoph Hellwig wrote:
> Hi Russell,
>
> This series adds swiotlb support to the 32-bit arm port to ensure
> platforms with LPAE support can support DMA mapping for all devices
> using 32-bit dma masks, just like we do on other ports that support
> > 32-bit physical addressing and don't have an iommu.

This series fixes the SATA errors seen on TI platforms with LPAE, like
the DRA7 Rev H EVM.

Tested-by: Vignesh Raghavendra

Thanks for the fix!

--
Regards
Vignesh
Re: [PATCH v8 07/21] iommu/io-pgtable-arm-v7s: Extend MediaTek 4GB Mode
On Mon, 2019-07-15 at 10:51 +0100, Will Deacon wrote:
> On Sun, Jul 14, 2019 at 12:41:20PM +0800, Yong Wu wrote:
> > On Thu, 2019-07-11 at 13:31 +0100, Will Deacon wrote:
> > > This looks like the right sort of idea. Basically, I was thinking that you
> > > can use the oas in conjunction with the quirk to specify whether or not
> > > your two magic bits should be set. You could also then cap the oas using
> > > the size of phys_addr_t to deal with my other comment.
> > >
> > > Finally, I was hoping you could drop the |= BIT_ULL(32) and the &=
> > > ~BIT_ULL(32) bits of the mtk driver if the pgtable code now accepts higher
> > > addresses. Did that not work out?
> >
> > After the current patch, the pgtable has accepted the higher address.
> > The "|= BIT_ULL(32)" and "&= ~BIT_ULL(32)" are for a special case (we
> > call it 4GB mode).
> >
> > Now MediaTek IOMMU supports 2 kinds of memory:
> > 1) Normal case: PA is 0x4000_0000 - 0x3_ffff_ffff. The PA won't be
> > remapped. mt8183 and the non-4GB mode of mt8173/mt2712 use this mode.
> >
> > 2) 4GB mode: PA is 0x4000_0000 - 0x1_3fff_ffff, but the PA will be
> > remapped to 0x1_0000_0000 - 0x1_ffff_ffff. This is for the 4GB mode of
> > mt8173/mt2712. This case is so special that we should change the PA
> > manually (add bit32).
> > (mt2712 and mt8173 have both modes: 4GB and non-4GB.)
> >
> > If we try to use oas and our quirk to cover these two cases, then I can
> > use "oas == 33" only for this 4GB mode, and "oas == 34" for the normal
> > case even though the PA may not reach 34 bits. Also I should add some
> > "workaround" for the 4GB mode (oas == 33).
> >
> > I copy the new patch in the mail below (having dropped the "|= BIT_ULL(32)"
> > and the "&= ~BIT_ULL(32)" in mtk iommu). Please help have a look if it
> > is ok.
> > (Another thing: currently the PA can go over 4GB, so the quirk name
> > "MTK_4GB" doesn't look suitable; I renamed it to "MTK_EXT" in a new
> > patch.)
>
> Makes sense, thanks. One comment below.

> > @@ -205,7 +216,20 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> >  	else
> >  		mask = ARM_V7S_LVL_MASK(lvl);
> >
> > -	return pte & mask;
> > +	paddr = pte & mask;
> > +	if (IS_ENABLED(CONFIG_PHYS_ADDR_T_64BIT) &&
> > +	    (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT)) {
> > +		/*
> > +		 * Workaround for MTK 4GB Mode:
> > +		 * Add BIT32 only when PA < 0x4000_0000.
> > +		 */
> > +		if ((cfg->oas == 33 && paddr < 0x40000000UL) ||
> > +		    (cfg->oas > 33 && (pte & ARM_V7S_ATTR_MTK_PA_BIT32)))
> > +			paddr |= BIT_ULL(32);
> > +		if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
> > +			paddr |= BIT_ULL(33);
> > +	}
> > +	return paddr;
> >  }
> >
> >  static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
> > @@ -326,9 +350,6 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
> >  	if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
> >  		pte |= ARM_V7S_ATTR_NS_SECTION;
> >
> > -	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT)
> > -		pte |= ARM_V7S_ATTR_MTK_4GB;
> > -
> >  	return pte;
> >  }
> >
> > @@ -742,7 +763,9 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
> >  {
> >  	struct arm_v7s_io_pgtable *data;
> >
> > -	if (cfg->ias > ARM_V7S_ADDR_BITS || cfg->oas > ARM_V7S_ADDR_BITS)
> > +	if (cfg->ias > ARM_V7S_ADDR_BITS ||
> > +	    (cfg->oas > ARM_V7S_ADDR_BITS &&
> > +	     !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT)))
> >  		return NULL;
>
> I think you can rework this to do something like:
>
> 	if (cfg->ias > ARM_V7S_ADDR_BITS)
> 		return NULL;
>
> 	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT) {
> 		if (!IS_ENABLED(CONFIG_PHYS_ADDR_T_64BIT))
> 			cfg->oas = min(cfg->oas, ARM_V7S_ADDR_BITS);
> 		else if (cfg->oas > 34)
> 			return NULL;
> 	} else if (cfg->oas > ARM_V7S_ADDR_BITS) {
> 		return NULL;
> 	}
>
> so that we clamp the oas when phys_addr_t is 32-bit for you. That should
> allow you to remove lots of the checking from iopte_to_paddr() too if you
> check against oas in the map() function.
>
> Does that make sense?

Of course I'm ok for this. I'm only afraid that this function already
has 3 checks of the form "if (x) return NULL", and here we add a new one
and so many lines... Maybe the user should guarantee the right value of
oas. How about moving it into mtk_iommu.c?

About the checking in iopte_to_paddr, I can not remove it. I know it may
be a bit special and not readable. Hmm, I guess I should use a macro
instead of the hard-coded 33 for the special 4GB mode case. Then the
patch would be like:

//=
static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
				  struct io_pgtable_cfg *cfg)
{
	arm_v7s_iopte mask;
+	phys_addr_t
[PATCH 2/2] dma-direct: only limit the mapping size if swiotlb could be used
Don't just check for a swiotlb buffer, but also if buffering might be
required for this particular device.

Fixes: 133d624b1cee ("dma: Introduce dma_max_mapping_size()")
Reported-by: Benjamin Herrenschmidt
Signed-off-by: Christoph Hellwig
Tested-by: Benjamin Herrenschmidt
---
 kernel/dma/direct.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index d7cec866d16b..e269b6f9b444 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -399,11 +399,9 @@ int dma_direct_supported(struct device *dev, u64 mask)

 size_t dma_direct_max_mapping_size(struct device *dev)
 {
-	size_t size = SIZE_MAX;
-
 	/* If SWIOTLB is active, use its maximum mapping size */
-	if (is_swiotlb_active())
-		size = swiotlb_max_mapping_size(dev);
-
-	return size;
+	if (is_swiotlb_active() &&
+	    (dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE))
+		return swiotlb_max_mapping_size(dev);
+	return SIZE_MAX;
 }
--
2.20.1
fix nvme performance regression due to dma_max_mapping_size()
Hi all,

the new dma_max_mapping_size function is a little too eager to limit the
I/O size if the swiotlb buffer is present but the device is not
addressing-limited. Fix this by adding an additional check.