Re: [PATCH] swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm
On 4/28/22 6:49 PM, Stefano Stabellini wrote:
> On Thu, 28 Apr 2022, Boris Ostrovsky wrote:
>> On 4/28/22 5:49 PM, Stefano Stabellini wrote:
>>> On Thu, 28 Apr 2022, Christoph Hellwig wrote:
>>>> On Tue, Apr 26, 2022 at 04:07:45PM -0700, Stefano Stabellini wrote:
>>>>> Reported-by: Rahul Singh
>>>>> Signed-off-by: Christoph Hellwig
>>>>> Reviewed-by: Stefano Stabellini
>>>> Do you want to take this through the Xen tree or should I pick it up?
>>>> Either way I'd love to see some testing on x86 as well.
>>> I agree on the x86 testing. Juergen, Boris?
>>> I'd say to take this patch via the Xen tree, but Juergen has just sent a Xen pull request to Linux last Saturday. Juergen, do you plan to send another one? Do you have something else lined up? If not, it might be better to let Christoph pick it up.
>> We don't have anything pending. I can test it, but at best tomorrow, so not sure we can get this into rc5. Do you consider this an urgent fix or can we wait until 5.19? Because it's a bit too much IMO for rc6.
> On one hand, Linux doesn't boot on a platform without this fix. On the other hand, I totally see that this patch could introduce regressions on x86, so I think it is fair that we are careful with it. From my point of view, it might be better to wait for 5.19 and mark it as backport.

No problems uncovered during testing.

-boris
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm
On 4/28/22 5:49 PM, Stefano Stabellini wrote:
> On Thu, 28 Apr 2022, Christoph Hellwig wrote:
>> On Tue, Apr 26, 2022 at 04:07:45PM -0700, Stefano Stabellini wrote:
>>> Reported-by: Rahul Singh
>>> Signed-off-by: Christoph Hellwig
>>> Reviewed-by: Stefano Stabellini
>> Do you want to take this through the Xen tree or should I pick it up?
>> Either way I'd love to see some testing on x86 as well.
> I agree on the x86 testing. Juergen, Boris?
> I'd say to take this patch via the Xen tree, but Juergen has just sent a Xen pull request to Linux last Saturday. Juergen, do you plan to send another one? Do you have something else lined up? If not, it might be better to let Christoph pick it up.

We don't have anything pending. I can test it, but at best tomorrow, so not sure we can get this into rc5. Do you consider this an urgent fix or can we wait until 5.19? Because it's a bit too much IMO for rc6.

-boris
Re: cleanup swiotlb initialization v8
On 4/4/22 1:05 AM, Christoph Hellwig wrote:
> Hi all,
>
> this series tries to clean up the swiotlb initialization, including that of swiotlb-xen. To get there it also removes the x86 iommu table infrastructure that massively obfuscates the initialization path.
>
> Git tree: git://git.infradead.org/users/hch/misc.git swiotlb-init-cleanup
> Gitweb: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/swiotlb-init-cleanup

Tested-by: Boris Ostrovsky
Re: [PATCH 12/15] swiotlb: provide swiotlb_init variants that remap the buffer
On 3/15/22 2:36 AM, Christoph Hellwig wrote:
> @@ -271,12 +273,23 @@ void __init swiotlb_init(bool addressing_limit, unsigned int flags)
>          * allow to pick a location everywhere for hypervisors with guest
>          * memory encryption.
>          */
> +retry:
> +       bytes = PAGE_ALIGN(default_nslabs << IO_TLB_SHIFT);
>         if (flags & SWIOTLB_ANY)
>                 tlb = memblock_alloc(bytes, PAGE_SIZE);
>         else
>                 tlb = memblock_alloc_low(bytes, PAGE_SIZE);
>         if (!tlb)
>                 goto fail;
> +       if (remap && remap(tlb, nslabs) < 0) {
> +               memblock_free(tlb, PAGE_ALIGN(bytes));
> +
> +               if (nslabs <= IO_TLB_MIN_SLABS)
> +                       panic("%s: Failed to remap %zu bytes\n",
> +                             __func__, bytes);
> +               nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));

I spoke with Konrad (who wrote the original patch --- f4b2f07b2ed9b469ead87e06fc2fc3d12663a725) and apparently the reason for 2MB was to optimize for Xen's slab allocator; it had nothing to do with IO_TLB_MIN_SLABS. Since this is now common code we should not expose Xen-specific optimizations here, and smaller values will still work, so IO_TLB_MIN_SLABS is fine. I think this should be mentioned in the commit message though, probably best in the next patch where you switch to this code.

As far as the hunk above, I don't think we need the max() here: with IO_TLB_MIN_SLABS being 512 we may get stuck in an infinite loop. Something like

        nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
        if (nslabs <= IO_TLB_MIN_SLABS)
                panic()

should be sufficient.

> +               goto retry;
> +       }
>         if (swiotlb_init_with_tbl(tlb, default_nslabs, flags))
>                 goto fail_free_mem;
>         return;
> @@ -287,12 +300,18 @@ void __init swiotlb_init(bool addressing_limit, unsigned int flags)
>         pr_warn("Cannot allocate buffer");
>  }
>
> +void __init swiotlb_init(bool addressing_limit, unsigned int flags)
> +{
> +       return swiotlb_init_remap(addressing_limit, flags, NULL);
> +}
> +
>  /*
>   * Systems with larger DMA zones (those that don't support ISA) can
>   * initialize the swiotlb later using the slab allocator if needed.
>   * This should be just like above, but with some error catching.
>   */
> -int swiotlb_init_late(size_t size, gfp_t gfp_mask)
> +int swiotlb_init_late(size_t size, gfp_t gfp_mask,
> +               int (*remap)(void *tlb, unsigned long nslabs))
>  {
>         unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
>         unsigned long bytes;
> @@ -303,6 +322,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
>         if (swiotlb_force_disable)
>                 return 0;
>
> +retry:
>         order = get_order(nslabs << IO_TLB_SHIFT);
>         nslabs = SLABS_PER_PAGE << order;
>         bytes = nslabs << IO_TLB_SHIFT;
> @@ -317,6 +337,16 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
>         if (!vstart)
>                 return -ENOMEM;
>
> +       if (remap)
> +               rc = remap(vstart, nslabs);
> +       if (rc) {
> +               free_pages((unsigned long)vstart, order);
> +
> +               if (IO_TLB_MIN_SLABS <= 1024)
> +                       return rc;
> +               nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));

Same here. (The 'if' check above is wrong anyway.)

Patches 13 and 14 look good.

-boris

> +               goto retry;
> +       }
>         if (order != get_order(bytes)) {
>                 pr_warn("only able to allocate %ld MB\n",
Re: [PATCH 14/15] swiotlb: remove swiotlb_init_with_tbl and swiotlb_init_late_with_tbl
On 3/14/22 3:31 AM, Christoph Hellwig wrote:
> @@ -314,6 +293,7 @@ void __init swiotlb_init(bool addressing_limit, unsigned int flags)
>  int swiotlb_init_late(size_t size, gfp_t gfp_mask,
>                 int (*remap)(void *tlb, unsigned long nslabs))
>  {
> +       struct io_tlb_mem *mem = &io_tlb_default_mem;
>         unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
>         unsigned long bytes;
>         unsigned char *vstart = NULL;
> @@ -355,33 +335,28 @@
>                         (PAGE_SIZE << order) >> 20);
>                 nslabs = SLABS_PER_PAGE << order;
>         }
> -       rc = swiotlb_late_init_with_tbl(vstart, nslabs);
> -       if (rc)
> -               free_pages((unsigned long)vstart, order);
> -
> -       return rc;
> -}
> -
> -int
> -swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
> -{
> -       struct io_tlb_mem *mem = &io_tlb_default_mem;
> -       unsigned long bytes = nslabs << IO_TLB_SHIFT;
> -
> -       if (swiotlb_force_disable)
> -               return 0;
> +       if (remap)
> +               rc = remap(vstart, nslabs);
> +       if (rc) {
> +               free_pages((unsigned long)vstart, order);
>
> -       /* protect against double initialization */
> -       if (WARN_ON_ONCE(mem->nslabs))
> -               return -ENOMEM;
> +               /* Min is 2MB */
> +               if (nslabs <= 1024)
> +                       return rc;
> +               nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
> +               goto retry;
> +       }

We now end up with two attempts to remap. I think this second one is what we want, since it solves the problem I pointed out in the previous patch.

-boris
Re: [PATCH 13/15] swiotlb: merge swiotlb-xen initialization into swiotlb
On 3/14/22 3:31 AM, Christoph Hellwig wrote:
> -
>  static void __init pci_xen_swiotlb_init(void)
>  {
>         if (!xen_initial_domain() && !x86_swiotlb_enable)
>                 return;
>         x86_swiotlb_enable = true;
> -       xen_swiotlb = true;
> -       xen_swiotlb_init_early();
> +       swiotlb_init_remap(true, x86_swiotlb_flags, xen_swiotlb_fixup);

I think we need to have SWIOTLB_ANY set in x86_swiotlb_flags here.

>         dma_ops = &xen_swiotlb_dma_ops;
>         if (IS_ENABLED(CONFIG_PCI))
>                 pci_request_acs();
> @@ -88,14 +85,16 @@ static void __init pci_xen_swiotlb_init(void)
>  int pci_xen_swiotlb_init_late(void)
>  {
> -       int rc;
> -
> -       if (xen_swiotlb)
> +       if (dma_ops == &xen_swiotlb_dma_ops)
>                 return 0;
>
> -       rc = xen_swiotlb_init();
> -       if (rc)
> -               return rc;
> +       /* we can work with the default swiotlb */
> +       if (!io_tlb_default_mem.nslabs) {
> +               int rc = swiotlb_init_late(swiotlb_size_or_default(),
> +                                          GFP_KERNEL, xen_swiotlb_fixup);

This may be a comment for the previous patch, but looking at swiotlb_init_late():

retry:
        order = get_order(nslabs << IO_TLB_SHIFT);
        nslabs = SLABS_PER_PAGE << order;
        bytes = nslabs << IO_TLB_SHIFT;

        while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
                vstart = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN, order);
                if (vstart)
                        break;
                order--;
        }

        if (!vstart)
                return -ENOMEM;

        if (remap)
                rc = remap(vstart, nslabs);
        if (rc) {
                free_pages((unsigned long)vstart, order);

                /* Min is 2MB */
                if (nslabs <= 1024)
                        return rc;
                nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
                goto retry;
        }

        if (order != get_order(bytes)) {
                pr_warn("only able to allocate %ld MB\n",
                        (PAGE_SIZE << order) >> 20);
                nslabs = SLABS_PER_PAGE << order;      <===
        }
        rc = swiotlb_late_init_with_tbl(vstart, nslabs);

Notice that we don't do remap() after the final update to nslabs. We should.

-boris
Re: [PATCH 12/15] swiotlb: provide swiotlb_init variants that remap the buffer
On 3/14/22 3:31 AM, Christoph Hellwig wrote:
> -void __init swiotlb_init(bool addressing_limit, unsigned int flags)
> +void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
> +               int (*remap)(void *tlb, unsigned long nslabs))
>  {
> -       size_t bytes = PAGE_ALIGN(default_nslabs << IO_TLB_SHIFT);
> +       unsigned long nslabs = default_nslabs;
> +       size_t bytes;
>         void *tlb;
>
>         if (!addressing_limit && !swiotlb_force_bounce)
> @@ -271,12 +273,24 @@
>          * allow to pick a location everywhere for hypervisors with guest
>          * memory encryption.
>          */
> +retry:
> +       bytes = PAGE_ALIGN(default_nslabs << IO_TLB_SHIFT);
>         if (flags & SWIOTLB_ANY)
>                 tlb = memblock_alloc(bytes, PAGE_SIZE);
>         else
>                 tlb = memblock_alloc_low(bytes, PAGE_SIZE);
>         if (!tlb)
>                 goto fail;
> +       if (remap && remap(tlb, nslabs) < 0) {
> +               memblock_free(tlb, PAGE_ALIGN(bytes));
> +
> +               /* Min is 2MB */
> +               if (nslabs <= 1024)

This is IO_TLB_MIN_SLABS, isn't it? (The Xen code didn't say so, but that's what it meant to say, I believe.)

> +                       panic("%s: Failed to remap %zu bytes\n",
> +                             __func__, bytes);
> +               nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
> +               goto retry;
> +       }
> @@ -303,6 +323,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
>         if (swiotlb_force_disable)
>                 return 0;
>
> +retry:
>         order = get_order(nslabs << IO_TLB_SHIFT);
>         nslabs = SLABS_PER_PAGE << order;
>         bytes = nslabs << IO_TLB_SHIFT;
> @@ -317,6 +338,17 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
>         if (!vstart)
>                 return -ENOMEM;
>
> +       if (remap)
> +               rc = remap(vstart, nslabs);
> +       if (rc) {
> +               free_pages((unsigned long)vstart, order);
> +
> +               /* Min is 2MB */
> +               if (nslabs <= 1024)

Same here.

-boris

> +                       return rc;
> +               nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
> +               goto retry;
> +       }
>         if (order != get_order(bytes)) {
>                 pr_warn("only able to allocate %ld MB\n",
Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb
On 3/9/22 1:18 AM, Christoph Hellwig wrote:
> On Tue, Mar 08, 2022 at 04:38:21PM -0500, Boris Ostrovsky wrote:
>> On 3/1/22 5:53 AM, Christoph Hellwig wrote:
>>> Allow to pass a remap argument to the swiotlb initialization functions to handle the Xen/x86 remap case. ARM/ARM64 never did any remapping from xen_swiotlb_fixup, so we don't even need that quirk.
>> Any chance this patch could be split? Lots of things are happening here and it's somewhat hard to review. (Patch 7 too, BTW, but I think I managed to get through it)
> What would be your preferred split? swiotlb_init() rework to be done separately?
>
>> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
>> index e0def4b1c3181..2f2c468acb955 100644
>> --- a/arch/x86/kernel/pci-dma.c
>> +++ b/arch/x86/kernel/pci-dma.c
>> @@ -71,15 +71,12 @@ static inline void __init pci_swiotlb_detect(void)
>>  #endif /* CONFIG_SWIOTLB */
>>
>>  #ifdef CONFIG_SWIOTLB_XEN
>> -static bool xen_swiotlb;
>> -
>>  static void __init pci_xen_swiotlb_init(void)
>>  {
>>         if (!xen_initial_domain() && !x86_swiotlb_enable)
>>                 return;
>>
>> Now that there is a single call site for this routine I think this check can be dropped. We are only called here for xen_initial_domain()==true.
> The callsite just checks xen_pv_domain() and itself is called unconditionally during initialization.

Oh, right, nevermind. *pv* domain.

-boris
Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb
On 3/1/22 5:53 AM, Christoph Hellwig wrote:
> Allow to pass a remap argument to the swiotlb initialization functions to handle the Xen/x86 remap case. ARM/ARM64 never did any remapping from xen_swiotlb_fixup, so we don't even need that quirk.

Any chance this patch could be split? Lots of things are happening here and it's somewhat hard to review. (Patch 7 too, BTW, but I think I managed to get through it)

> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> index e0def4b1c3181..2f2c468acb955 100644
> --- a/arch/x86/kernel/pci-dma.c
> +++ b/arch/x86/kernel/pci-dma.c
> @@ -71,15 +71,12 @@ static inline void __init pci_swiotlb_detect(void)
>  #endif /* CONFIG_SWIOTLB */
>
>  #ifdef CONFIG_SWIOTLB_XEN
> -static bool xen_swiotlb;
> -
>  static void __init pci_xen_swiotlb_init(void)
>  {
>         if (!xen_initial_domain() && !x86_swiotlb_enable)
>                 return;

Now that there is a single call site for this routine I think this check can be dropped. We are only called here for xen_initial_domain()==true.

-boris
Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb
On 3/4/22 12:43 PM, Christoph Hellwig wrote:
> On Fri, Mar 04, 2022 at 12:36:17PM -0500, Boris Ostrovsky wrote:
>> I bisected it to "x86: remove the IOMMU table infrastructure" but haven't actually looked at the code yet.
>>>> That looks like the swiotlb buffer did not get initialized at all, but I can't really explain why. Can you stick in a printk and see if xen_swiotlb_init_early gets called at all?
>> Actually, that's the only thing I did do so far and yes, it does get called.
> So, specifically for "x86: remove the IOMMU table infrastructure" I think we need the one-liner below so that swiotlb_exit doesn't get called for the Xen case. But that should have been fixed up by the next patch already.

This indeed allows dom0 to boot. Not sure I see where in the next patch this would have been fixed?

(BTW, just noticed in iommu_setup() you set this variable to 1. Should be 'true'.)

-boris
Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb
On 3/4/22 12:28 PM, Christoph Hellwig wrote:
> On Wed, Mar 02, 2022 at 08:15:03AM -0500, Boris Ostrovsky wrote:
>> Not for me, I fail to boot with
>>
>> [   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), total 0 (slots), used 0 (slots)
>>
>> (this is iscsi root so I need the NIC). I bisected it to "x86: remove the IOMMU table infrastructure" but haven't actually looked at the code yet.
> That looks like the swiotlb buffer did not get initialized at all, but I can't really explain why. Can you stick in a printk and see if xen_swiotlb_init_early gets called at all?

Actually, that's the only thing I did do so far and yes, it does get called.

-boris
Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb
On 3/3/22 5:57 AM, Christoph Hellwig wrote:
> On Wed, Mar 02, 2022 at 08:15:03AM -0500, Boris Ostrovsky wrote:
>> Not for me, I fail to boot with
>>
>> [   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), total 0 (slots), used 0 (slots)
>>
>> (this is iscsi root so I need the NIC). I bisected it to "x86: remove the IOMMU table infrastructure" but haven't actually looked at the code yet.
> Thanks. Looks like the sizing is going wrong. Just to confirm, this is dom0 on x86 and no special command line options?

Right.

module2 /boot/vmlinuz-5.17.0-rc6swiotlb placeholder root=UUID=dbef1262-8c8a-43db-8055-7d9bec7bece0 ro crashkernel=auto LANG=en_US.UTF-8 rd.luks=0 rd.lvm=0 rd.md=0 rd.dm=0 netroot=iscsi:169.254.0.2:::1:iqn.2015-02.oracle.boot:uefi iscsi_param=node.session.timeo.replacement_timeout=6000 net.ifnames=1 nvme_core.shutdown_timeout=10 ipmi_si.tryacpi=0 ipmi_si.trydmi=0 ipmi_si.trydefaults=0 libiscsi.debug_libiscsi_eh=1 panic=20 nokaslr earlyprintk=xen console=hvc0 loglevel=8 4
Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb
On 3/2/22 8:15 AM, Boris Ostrovsky wrote:
> On 3/1/22 9:55 PM, Stefano Stabellini wrote:
>> On Tue, 1 Mar 2022, Christoph Hellwig wrote:
>>> Allow to pass a remap argument to the swiotlb initialization functions to handle the Xen/x86 remap case. ARM/ARM64 never did any remapping from xen_swiotlb_fixup, so we don't even need that quirk.
>> Aside from that the rest looks OK. Also, you can add my:
>> Tested-by: Stefano Stabellini
> Not for me, I fail to boot with
>
> [   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), total 0 (slots), used 0 (slots)
>
> (this is iscsi root so I need the NIC). I bisected it to "x86: remove the IOMMU table infrastructure" but haven't actually looked at the code yet.

Again, this is as dom0. Baremetal is fine.

-boris
Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb
On 3/1/22 9:55 PM, Stefano Stabellini wrote:
> On Tue, 1 Mar 2022, Christoph Hellwig wrote:
>> Allow to pass a remap argument to the swiotlb initialization functions to handle the Xen/x86 remap case. ARM/ARM64 never did any remapping from xen_swiotlb_fixup, so we don't even need that quirk.
> Aside from that the rest looks OK. Also, you can add my:
> Tested-by: Stefano Stabellini

Not for me, I fail to boot with

[   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), total 0 (slots), used 0 (slots)

(this is iscsi root so I need the NIC). I bisected it to "x86: remove the IOMMU table infrastructure" but haven't actually looked at the code yet.

-boris
Re: cleanup swiotlb initialization
On 2/25/22 3:47 AM, Christoph Hellwig wrote:
> On Thu, Feb 24, 2022 at 12:07:26PM -0500, Boris Ostrovsky wrote:
>>>> Just to be clear: this crashes only as dom0. Boots fine as baremetal.
>>> Ah. I can guess what this might be. On Xen the hypervisor controls the IOMMU and we should never end up initializing it in Linux, right?
>> Right, we shouldn't be in that code path.
> Can you try the patch below on top of the series?

Yes, this makes dom0 boot fine.

(It also addresses something I wanted to mention while looking at the patches, which was to remove Xen initialization code from pci_swiotlb_detect_4gb(), since it had nothing to do with what the routine's name implies.)

-boris
Re: cleanup swiotlb initialization
On 2/24/22 11:39 AM, Christoph Hellwig wrote:
> On Thu, Feb 24, 2022 at 11:18:33AM -0500, Boris Ostrovsky wrote:
>> On 2/24/22 10:58 AM, Christoph Hellwig wrote:
>>> Thanks. This looks really strange as early_amd_iommu_init should not interact much with the changes. I'll see if I can find an AMD system to test on.
>> Just to be clear: this crashes only as dom0. Boots fine as baremetal.
> Ah. I can guess what this might be. On Xen the hypervisor controls the IOMMU and we should never end up initializing it in Linux, right?

Right, we shouldn't be in that code path.

-boris
Re: cleanup swiotlb initialization
On 2/24/22 10:58 AM, Christoph Hellwig wrote:
> Thanks. This looks really strange as early_amd_iommu_init should not interact much with the changes. I'll see if I can find an AMD system to test on.

Just to be clear: this crashes only as dom0. Boots fine as baremetal.

-boris

> On Wed, Feb 23, 2022 at 07:57:49PM -0500, Boris Ostrovsky wrote:
>> [   37.377313] BUG: unable to handle page fault for address: c90042880018
>> [   37.378219] RIP: e030:init_iommu_one+0x248/0x2f0
>> [...full oops quoted in the original report below...]
---end quoted text---
Re: cleanup swiotlb initialization
On 2/22/22 10:35 AM, Christoph Hellwig wrote:
> Hi all,
>
> this series tries to clean up the swiotlb initialization, including that of swiotlb-xen. To get there it also removes the x86 iommu table infrastructure that massively obfuscates the initialization path.
>
> Git tree: git://git.infradead.org/users/hch/misc.git swiotlb-init-cleanup

I haven't had a chance to look at this yet, but this crashes as dom0:

[   37.377313] BUG: unable to handle page fault for address: c90042880018
[   37.378219] #PF: supervisor read access in kernel mode
[   37.378219] #PF: error_code(0x) - not-present page
[   37.378219] PGD 7c2f2ee067 P4D 7c2f2ee067 PUD 7bf019b067 PMD 105a30067 PTE 0
[   37.378219] Oops: [#1] PREEMPT SMP NOPTI
[   37.378219] CPU: 14 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc5swiotlb #9
[   37.378219] Hardware name: Oracle Corporation ORACLE SERVER E1-2c/ASY,Generic,SM,E1-2c, BIOS 49004900 12/23/2020
[   37.378219] RIP: e030:init_iommu_one+0x248/0x2f0
[   37.378219] Code: 48 89 43 68 48 85 c0 74 c4 be 00 20 00 00 48 89 df e8 ea ee ff ff 48 89 43 78 48 85 c0 74 ae c6 83 98 00 00 00 00 48 8b 43 38 <48> 8b 40 18 a8 01 74 07 83 8b a8 04 00 00 01 f6 83 a8 04 00 00 01
[   37.378219] RSP: e02b:c9004044bd18 EFLAGS: 00010286
[   37.378219] RAX: c9004288 RBX: 888107260800 RCX:
[   37.378219] RDX: 8000 RSI: ea00041cab80 RDI:
[   37.378219] RBP: c9004044bd38 R08: 0901 R09: ea00041cab00
[   37.378219] R10: 0002 R11: R12: c90040435008
[   37.378219] R13: 0008 R14: efa0 R15:
[   37.378219] FS: () GS:88fef418() knlGS:
[   37.378219] CS: e030 DS: ES: CR0: 80050033
[   37.378219] CR2: c90042880018 CR3: 0260a000 CR4: 00050660
[   37.378219] Call Trace:
[   37.378219]
[   37.378219]  early_amd_iommu_init+0x3c5/0x72d
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  state_next+0x158/0x68f
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  iommu_go_to_state+0x28/0x2d
[   37.378219]  amd_iommu_init+0x15/0x4b
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  pci_iommu_init+0x12/0x37
[   37.378219]  do_one_initcall+0x48/0x210
[   37.378219]  kernel_init_freeable+0x229/0x28c
[   37.378219]  ? rest_init+0xe0/0xe0
[   37.963966]  kernel_init+0x1a/0x130
[   37.979415]  ret_from_fork+0x22/0x30
[   37.991436]
[   37.999465] Modules linked in:
[   38.007413] CR2: c90042880018
[   38.019416] ---[ end trace ]---
[   38.023418] RIP: e030:init_iommu_one+0x248/0x2f0
[   38.023418] Code: 48 89 43 68 48 85 c0 74 c4 be 00 20 00 00 48 89 df e8 ea ee ff ff 48 89 43 78 48 85 c0 74 ae c6 83 98 00 00 00 00 48 8b 43 38 <48> 8b 40 18 a8 01 74 07 83 8b a8 04 00 00 01 f6 83 a8 04 00 00 01
[   38.023418] RSP: e02b:c9004044bd18 EFLAGS: 00010286
[   38.023418] RAX: c9004288 RBX: 888107260800 RCX:
[   38.155413] RDX: 8000 RSI: ea00041cab80 RDI:
[   38.175965] Freeing initrd memory: 62640K
[   38.155413] RBP: c9004044bd38 R08: 0901 R09: ea00041cab00
[   38.155413] R10: 0002 R11: R12: c90040435008
[   38.155413] R13: 0008 R14: efa0 R15:
[   38.155413] FS: () GS:88fef418() knlGS:
[   38.287414] CS: e030 DS: ES: CR0: 80050033
[   38.309557] CR2: c90042880018 CR3: 0260a000 CR4: 00050660
[   38.332403] Kernel panic - not syncing: Fatal exception
[   38.351414] Rebooting in 20 seconds..

-boris
Re: [GIT PULL] dma-mapping fix for Linux 5.14
On 7/25/21 12:50 PM, Linus Torvalds wrote:
> On Sat, Jul 24, 2021 at 11:03 PM Christoph Hellwig wrote:
>
>> - handle vmalloc addresses in dma_common_{mmap,get_sgtable} (Roman Skakun)
> I've pulled this, but my reaction is that we've tried to avoid this in the past. Why is Xen using vmalloc'ed addresses and passing those in to the dma mapping routines?
>
> It *smells* to me like a Xen-swiotlb bug, and it would have been better to try to fix it there. Was that just too painful?

Stefano will probably know better, but this appears to have something to do with how the Pi (and possibly more ARM systems?) manage DMA memory: https://lore.kernel.org/xen-devel/CADz_WD5Ln7Pe1WAFp73d2Mz9wxspzTE3WgAJusp5S8LX4=8...@mail.gmail.com/

-boris
Re: [PATCH v1 13/16] xen: swiotlb: return error code from xen_swiotlb_map_sg()
On 7/15/21 12:45 PM, Logan Gunthorpe wrote:
> From: Martin Oliveira
>
> The .map_sg() op now expects an error code instead of zero on failure.
>
> xen_swiotlb_map_sg() may only fail if xen_swiotlb_map_page() fails, but xen_swiotlb_map_page() only supports returning errors as DMA_MAPPING_ERROR. So coalesce all errors into EINVAL.

Reviewed-by: Boris Ostrovsky
Re: [PATCH v2] dma-mapping: use vmalloc_to_page for vmalloc addresses
On 7/15/21 3:39 AM, Roman Skakun wrote:
>> This looks like it wasn't picked up? Should it go in rc1?
> Hi, Konrad!
>
> This looks like an unambiguous bug, and should be in rc1.

Looks like you didn't copy Christoph, which could be part of the problem. Adding him.

-boris

> On Wed, 14 Jul 2021 at 03:15, Konrad Rzeszutek Wilk wrote:
>> On Tue, Jun 22, 2021 at 04:34:14PM +0300, Roman Skakun wrote:
>>> This commit is dedicated to fix incorrect conversion from cpu_addr to page address in cases when we get virtual address which allocated in the vmalloc range. As the result, virt_to_page() cannot convert this address properly and return incorrect page address.
>>>
>>> Need to detect such cases and obtains the page address using vmalloc_to_page() instead.
>>>
>>> Signed-off-by: Roman Skakun
>>> Reviewed-by: Andrii Anisov
>>> ---
>>> Hey!
>>> Thanks for suggestions, Christoph!
>>> I updated the patch according to your advice.
>>> But, I'm so surprised because nobody catches this problem in the common code before. It looks a bit strange as for me.
>> This looks like it wasn't picked up? Should it go in rc1?
>>>
>>>  kernel/dma/ops_helpers.c | 12 ++--
>>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c
>>> index 910ae69cae77..782728d8a393 100644
>>> --- a/kernel/dma/ops_helpers.c
>>> +++ b/kernel/dma/ops_helpers.c
>>> @@ -5,6 +5,14 @@
>>>   */
>>>  #include
>>>
>>> +static struct page *cpu_addr_to_page(void *cpu_addr)
>>> +{
>>> +       if (is_vmalloc_addr(cpu_addr))
>>> +               return vmalloc_to_page(cpu_addr);
>>> +       else
>>> +               return virt_to_page(cpu_addr);
>>> +}
>>> +
>>>  /*
>>>   * Create scatter-list for the already allocated DMA buffer.
>>>   */
>>> @@ -12,7 +20,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
>>>                 void *cpu_addr, dma_addr_t dma_addr, size_t size,
>>>                 unsigned long attrs)
>>>  {
>>> -       struct page *page = virt_to_page(cpu_addr);
>>> +       struct page *page = cpu_addr_to_page(cpu_addr);
>>>         int ret;
>>>
>>>         ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
>>> @@ -43,7 +51,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
>>>                 return -ENXIO;
>>>
>>>         return remap_pfn_range(vma, vma->vm_start,
>>> -                       page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
>>> +                       page_to_pfn(cpu_addr_to_page(cpu_addr)) + vma->vm_pgoff,
>>>                         user_count << PAGE_SHIFT, vma->vm_page_prot);
>>>  #else
>>>         return -ENXIO;
>>> --
>>> 2.25.1
Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops
On 6/16/21 11:35 AM, Christoph Hellwig wrote:
> On Wed, Jun 16, 2021 at 11:33:50AM -0400, Boris Ostrovsky wrote:
>> Isn't the expectation of virt_to_page() that it only works on non-vmalloc'd addresses? (This is not a rhetorical question, I actually don't know.)
> Yes. That is why I'd suggest to just do the vmalloc_to_page or virt_to_page dance in ops_helpers.c and just continue using that.

Ah, OK, so something along the lines of what I suggested. (I thought by "helpers" you meant virt_to_page().)

-boris
Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops
On 6/16/21 10:21 AM, Christoph Hellwig wrote: > On Wed, Jun 16, 2021 at 10:12:55AM -0400, Boris Ostrovsky wrote: >> I wonder now whether we could avoid code duplication between here and >> dma_common_mmap()/dma_common_get_sgtable() and use your helper there. >> >> >> Christoph, would that work? I.e. something like > You should not duplicate the code at all, and just make the common > helpers work with vmalloc addresses. Isn't the expectation of virt_to_page() that it only works on non-vmalloc'd addresses? (This is not a rhetorical question, I actually don't know). -boris
Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops
On 6/16/21 7:42 AM, Roman Skakun wrote: > This commit is dedicated to fix incorrect conversion from > cpu_addr to page address in cases when we get virtual > address which allocated through xen_swiotlb_alloc_coherent() > and can be mapped in the vmalloc range. > As the result, virt_to_page() cannot convert this address > properly and return incorrect page address. > > Need to detect such cases and obtains the page address using > vmalloc_to_page() instead. > > The reference code for mmap() and get_sgtable() was copied > from kernel/dma/ops_helpers.c and modified to provide > additional detections as described above. > > In order to simplify code there was added a new > dma_cpu_addr_to_page() helper. > > Signed-off-by: Roman Skakun > Reviewed-by: Andrii Anisov > --- > drivers/xen/swiotlb-xen.c | 42 +++ > 1 file changed, 34 insertions(+), 8 deletions(-) > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index 90bc5fc321bc..9331a8500547 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -118,6 +118,14 @@ static int is_xen_swiotlb_buffer(struct device *dev, > dma_addr_t dma_addr) > return 0; > } > > +static struct page *cpu_addr_to_page(void *cpu_addr) > +{ > + if (is_vmalloc_addr(cpu_addr)) > + return vmalloc_to_page(cpu_addr); > + else > + return virt_to_page(cpu_addr); > +} > + > static int > xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs) > { > @@ -337,7 +345,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t > size, void *vaddr, > int order = get_order(size); > phys_addr_t phys; > u64 dma_mask = DMA_BIT_MASK(32); > - struct page *page; > + struct page *page = cpu_addr_to_page(vaddr); > > if (hwdev && hwdev->coherent_dma_mask) > dma_mask = hwdev->coherent_dma_mask; > @@ -349,11 +357,6 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t > size, void *vaddr, > /* Convert the size to actually allocated. 
*/ > size = 1UL << (order + XEN_PAGE_SHIFT); > > - if (is_vmalloc_addr(vaddr)) > - page = vmalloc_to_page(vaddr); > - else > - page = virt_to_page(vaddr); > - > if (!WARN_ON((dev_addr + size - 1 > dma_mask) || >range_straddles_page_boundary(phys, size)) && > TestClearPageXenRemapped(page)) > @@ -573,7 +576,23 @@ xen_swiotlb_dma_mmap(struct device *dev, struct > vm_area_struct *vma, >void *cpu_addr, dma_addr_t dma_addr, size_t size, >unsigned long attrs) > { > - return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs); > + unsigned long user_count = vma_pages(vma); > + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > + unsigned long off = vma->vm_pgoff; > + struct page *page = cpu_addr_to_page(cpu_addr); > + int ret; > + > + vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs); > + > + if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret)) > + return ret; > + > + if (off >= count || user_count > count - off) > + return -ENXIO; > + > + return remap_pfn_range(vma, vma->vm_start, > + page_to_pfn(page) + vma->vm_pgoff, > + user_count << PAGE_SHIFT, vma->vm_page_prot); > } I wonder now whether we could avoid code duplication between here and dma_common_mmap()/dma_common_get_sgtable() and use your helper there. Christoph, would that work? I.e.
something like diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c index 910ae69cae77..43411c2fa47b 100644 --- a/kernel/dma/ops_helpers.c +++ b/kernel/dma/ops_helpers.c @@ -12,7 +12,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt, void *cpu_addr, dma_addr_t dma_addr, size_t size, unsigned long attrs) { - struct page *page = virt_to_page(cpu_addr); + struct page *page = cpu_addr_to_page(cpu_addr); int ret; ret = sg_alloc_table(sgt, 1, GFP_KERNEL); @@ -43,7 +43,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma, return -ENXIO; return remap_pfn_range(vma, vma->vm_start, - page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff, + page_to_pfn(cpu_addr_to_page(cpu_addr)) + vma->vm_pgoff, user_count << PAGE_SHIFT, vma->vm_page_prot); #else return -ENXIO; -boris > > /* > @@ -585,7 +604,14 @@ xen_swiotlb_get_sgtable(struct device *dev, struct > sg_table *sgt, > void *cpu_addr, dma_addr_t handle, size_t size, > unsigned long attrs) > { > - return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size, attrs); > + struct page *page = cpu_addr_to_page(cpu_addr); > + int ret; > + > + ret = sg_alloc_table(sgt, 1,
Re: [PATCH] swiotlb-xen: override common mmap and get_sgtable dma ops
On 6/14/21 8:47 AM, Roman Skakun wrote: > Hey, Boris! > Thanks for the review. > > I have an additional question about the current implementation that bothers me. > I think we can have cases where the mapped memory is not > physically contiguous. > To handle this correctly, we need to apply some additional steps > to the current implementation as well. > > In mmap(): > 1. Is the memory region physically contiguous? > 2. Remap multiple ranges if it is not. > > In get_sgtable(): > 1. Is the memory region physically contiguous? > 2. Create an sgt that will include multiple contiguous ranges if it is not. > > What do you think about it? We make sure that we allocate contiguous memory in xen_swiotlb_alloc_coherent(). -boris
Re: [PATCH] swiotlb-xen: override common mmap and get_sgtable dma ops
On 6/11/21 5:55 AM, Roman Skakun wrote: > > +static int > +xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma, > + void *cpu_addr, dma_addr_t dma_addr, size_t size, > + unsigned long attrs) > +{ > + unsigned long user_count = vma_pages(vma); > + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; > + unsigned long off = vma->vm_pgoff; > + struct page *page; > + > + if (is_vmalloc_addr(cpu_addr)) > + page = vmalloc_to_page(cpu_addr); > + else > + page = virt_to_page(cpu_addr); > + > + vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs); > + > + if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret)) > + return -ENXIO; > + > + if (off >= count || user_count > count - off) > + return -ENXIO; > + > + return remap_pfn_range(vma, vma->vm_start, > + page_to_pfn(page) + vma->vm_pgoff, > + user_count << PAGE_SHIFT, vma->vm_page_prot); > +} I suggest you create a helper for computing the page value and then revert 922659ea771b3fd728149262c5ea15608fab9719 and pass the result of the helper instead of cpu_addr. Here and in xen_swiotlb_dma_get_sgtable(). And use this new helper in xen_swiotlb_free_coherent() too. I am curious though why this was not a problem when Stefano was looking at the problem that introduced this vmalloc check (i.e. 8b1e868f66076490189a36d984fcce286cdd6295). Stefano? -boris
Re: [RFC PATCH V3 09/11] HV/IOMMU: Enable swiotlb bounce buffer for Isolation VM
On 6/3/21 11:37 AM, Tianyu Lan wrote: > > Yes, the dependency is between hyperv_swiotlb_detect() and > pci_swiotlb_detect_override()/pci_swiotlb_detect_4gb(). Now > pci_swiotlb_detect_override() and pci_swiotlb_detect_4gb() depends on > pci_xen_swiotlb_detect(). To keep dependency between > hyperv_swiotlb_detect() and pci_swiotlb_detect_override/4gb(), make > pci_xen_swiotlb_detect() depends on hyperv_swiotlb_detect() and just to > keep order in the IOMMU table. Current iommu_table_entry only has one > depend callback and this is why I put xen depends on hyperv detect function. > Ah, ok. Thanks. -boris
Re: [RFC PATCH V3 09/11] HV/IOMMU: Enable swiotlb bounce buffer for Isolation VM
On 6/2/21 11:01 AM, Tianyu Lan wrote: > Hi Boris: > Thanks for your review. > > On 6/2/2021 9:16 AM, Boris Ostrovsky wrote: >> >> On 5/30/21 11:06 AM, Tianyu Lan wrote: >>> @@ -91,6 +92,6 @@ int pci_xen_swiotlb_init_late(void) >>> EXPORT_SYMBOL_GPL(pci_xen_swiotlb_init_late); >>> IOMMU_INIT_FINISH(2, >>> - NULL, >>> + hyperv_swiotlb_detect, >>> pci_xen_swiotlb_init, >>> NULL); >> >> >> Could you explain this change? > > Hyper-V allocates its own swiotlb bounce buffer and the default > swiotlb buffer should not be allocated. swiotlb_init() in pci_swiotlb_init() > is to allocate default swiotlb buffer. > To achieve this, put hyperv_swiotlb_detect() as the first entry in the > iommu_table_entry list. The detect loop in the pci_iommu_alloc() will exit > once hyperv_swiotlb_detect() is called in Hyper-V VM and other > iommu_table_entry callback will not be called. Right. But pci_xen_swiotlb_detect() will only do something for Xen PV guests, and those guests don't run on hyperV. It's either xen_pv_domain() (i.e. hypervisor_is_type(X86_HYPER_XEN_PV)) or hypervisor_is_type(X86_HYPER_MS_HYPERV) but never both. So I don't think there needs to be a dependency between the two callbacks. -boris
Re: [RFC PATCH V3 09/11] HV/IOMMU: Enable swiotlb bounce buffer for Isolation VM
On 5/30/21 11:06 AM, Tianyu Lan wrote: > @@ -91,6 +92,6 @@ int pci_xen_swiotlb_init_late(void) > EXPORT_SYMBOL_GPL(pci_xen_swiotlb_init_late); > > IOMMU_INIT_FINISH(2, > - NULL, > + hyperv_swiotlb_detect, > pci_xen_swiotlb_init, > NULL); Could you explain this change? -boris ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH RFC v1 5/6] xen-swiotlb: convert variables to arrays
On 2/19/21 3:32 PM, Konrad Rzeszutek Wilk wrote: > On Sun, Feb 07, 2021 at 04:56:01PM +0100, Christoph Hellwig wrote: >> On Thu, Feb 04, 2021 at 09:40:23AM +0100, Christoph Hellwig wrote: >>> So one thing that has been on my mind for a while: I'd really like >>> to kill the separate dma ops in Xen swiotlb. If we compare xen-swiotlb >>> to swiotlb the main difference seems to be: >>> >>> - additional reasons to bounce I/O vs the plain DMA capable >>> - the possibility to do a hypercall on arm/arm64 >>> - an extra translation layer before doing the phys_to_dma and vice >>> versa >>> - a special memory allocator >>> >>> I wonder if between a few jump labels or other no-overhead enablement >>> options and possibly better use of the dma_range_map we could kill >>> off most of swiotlb-xen instead of maintaining all this code duplication? >> So I looked at this a bit more. >> >> For x86 with XENFEAT_auto_translated_physmap (how common is that?) > Juergen, Boris please correct me if I am wrong, but that > XENFEAT_auto_translated_physmap > only works for PVH guests? That's both HVM and PVH (for dom0 it's only PVH). -boris > >> pfn_to_gfn is a nop, so plain phys_to_dma/dma_to_phys do work as-is. >> >> xen_arch_need_swiotlb always returns true for x86, and >> range_straddles_page_boundary should never be true for the >> XENFEAT_auto_translated_physmap case. > Correct. The kernel should have no clue of what the real MFNs are > for PFNs. >> So as far as I can tell the mapping fast path for the >> XENFEAT_auto_translated_physmap can be trivially reused from swiotlb. >> >> That leaves us with the next more complicated case, x86 or fully cache >> coherent arm{,64} without XENFEAT_auto_translated_physmap. In that case >> we need to patch in a phys_to_dma/dma_to_phys that performs the MFN >> lookup, which could be done using alternatives or jump labels. 
>> I think if that is done right we should also be able to let that cover >> the foreign pages in is_xen_swiotlb_buffer/is_swiotlb_buffer, but >> in the worst case that would need another alternative / jump label. >> >> For non-coherent arm{,64} we'd also need to use alternatives or jump >> labels for the cache maintenance ops, but that isn't a hard problem >> either. >>
Re: [PATCH 09/11] swiotlb-xen: simplify cache maintainance
On 9/6/19 10:43 AM, Konrad Rzeszutek Wilk wrote: > On Fri, Sep 06, 2019 at 10:19:01AM -0400, Boris Ostrovsky wrote: >> On 9/6/19 10:01 AM, Christoph Hellwig wrote: >>> On Fri, Sep 06, 2019 at 09:52:12AM -0400, Boris Ostrovsky wrote: >>>> We need nop definitions of these two for x86. >>>> >>>> Everything builds now but that's probably because the calls are under >>>> 'if (!dev_is_dma_coherent(dev))' which is always false so the compiler >>>> optimizes it out. I don't think we should rely on that. >>> That is how a lot of the kernel works. Provide prototypes only for code >>> that is semantically compiled, but can't ever be called due to >>> IS_ENABLED() checks. It took me a while to get used to it, but it >>> actually is pretty nice as the linker does the work for you to check >>> that it really is never called. Much better than say a BUILD_BUG_ON(). >> >> (with corrected Juergen's email) >> >> I know about IS_ENABLED() but I didn't realize that this is allowed for >> compile-time inlines and such as well. >> >> Anyway, for non-ARM bits >> >> Reviewed-by: Boris Ostrovsky > Acked-by: Konrad Rzeszutek Wilk > > as well. > > Albeit folks have tested this under x86 Xen with 'swiotlb=force' right? Yes, I did. -boris > > I can test it myself but it will take a couple of days. >> If this goes via Xen tree then the first couple of patches need an ack >> from ARM maintainers. >> >> -boris
Re: [PATCH 09/11] swiotlb-xen: simplify cache maintainance
On 9/6/19 10:01 AM, Christoph Hellwig wrote: > On Fri, Sep 06, 2019 at 09:52:12AM -0400, Boris Ostrovsky wrote: >> We need nop definitions of these two for x86. >> >> Everything builds now but that's probably because the calls are under >> 'if (!dev_is_dma_coherent(dev))' which is always false so the compiler >> optimizes it out. I don't think we should rely on that. > That is how a lot of the kernel works. Provide prototypes only for code > that is semantically compiled, but can't ever be called due to > IS_ENABLED() checks. It took me a while to get used to it, but it > actually is pretty nice as the linker does the work for you to check > that it really is never called. Much better than say a BUILD_BUG_ON(). (with corrected Juergen's email) I know about IS_ENABLED() but I didn't realize that this is allowed for compile-time inlines and such as well. Anyway, for non-ARM bits Reviewed-by: Boris Ostrovsky If this goes via Xen tree then the first couple of patches need an ack from ARM maintainers. -boris
Re: [PATCH 09/11] swiotlb-xen: simplify cache maintainance
On 9/5/19 7:34 AM, Christoph Hellwig wrote: > diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h > index 5e4b83f83dbc..d71380f6ed0b 100644 > --- a/include/xen/swiotlb-xen.h > +++ b/include/xen/swiotlb-xen.h > @@ -4,6 +4,11 @@ > > #include > > +void xen_dma_sync_for_cpu(struct device *dev, dma_addr_t handle, > + phys_addr_t paddr, size_t size, enum dma_data_direction dir); > +void xen_dma_sync_for_device(struct device *dev, dma_addr_t handle, > + phys_addr_t paddr, size_t size, enum dma_data_direction dir); > + > extern int xen_swiotlb_init(int verbose, bool early); > extern const struct dma_map_ops xen_swiotlb_dma_ops; We need nop definitions of these two for x86. Everything builds now but that's probably because the calls are under 'if (!dev_is_dma_coherent(dev))' which is always false so the compiler optimizes it out. I don't think we should rely on that. -boris
Re: [PATCH v2] swiotlb-xen: Convert to use macro
On 9/6/19 8:27 AM, Souptick Joarder wrote: > On Mon, Sep 2, 2019 at 2:04 PM Souptick Joarder wrote: >> Rather than using static int max_dma_bits, this >> can be converted to use a macro. >> >> Signed-off-by: Souptick Joarder >> Reviewed-by: Juergen Gross > If it is not too late, can we get this patch in the queue for 5.4? Yes, I will queue it later today. -boris
Re: [PATCH 08/13] swiotlb-xen: always use dma-direct helpers to alloc coherent pages
(Now with correct address for Juergen) On 9/3/19 6:15 PM, Boris Ostrovsky wrote: > On 9/2/19 9:03 AM, Christoph Hellwig wrote: >> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c >> index b8808677ae1d..f9dd4cb6e4b3 100644 >> --- a/drivers/xen/swiotlb-xen.c >> +++ b/drivers/xen/swiotlb-xen.c >> @@ -299,8 +299,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t >> size, >> * address. In fact on ARM virt_to_phys only works for kernel direct >> * mapped RAM memory. Also see comment below. >> */ >> -ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs); >> - >> +ret = dma_direct_alloc(hwdev, size, dma_handle, flags, attrs); > > If I am reading __dma_direct_alloc_pages() correctly there is a path > that will force us to use GFP_DMA32 and as Juergen pointed out in > another message that may not be desirable. > > -boris > > >
Re: [PATCH 08/13] swiotlb-xen: always use dma-direct helpers to alloc coherent pages
On 9/2/19 9:03 AM, Christoph Hellwig wrote: > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index b8808677ae1d..f9dd4cb6e4b3 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -299,8 +299,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t > size, >* address. In fact on ARM virt_to_phys only works for kernel direct >* mapped RAM memory. Also see comment below. >*/ > - ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs); > - > + ret = dma_direct_alloc(hwdev, size, dma_handle, flags, attrs); If I am reading __dma_direct_alloc_pages() correctly there is a path that will force us to use GFP_DMA32 and as Juergen pointed out in another message that may not be desirable. -boris
Re: [PATCH v2 3/3] xen/swiotlb: remember having called xen_create_contiguous_region()
On 5/30/19 5:04 AM, Christoph Hellwig wrote: > Please don't add your private flag to page-flags.h. The whole point of > the private flag is that you can use it in any way you want without > touching the common code. There is already a bunch of aliases for various sub-components (including Xen) in page-flags.h for private flags, which is why I suggested we do the same for the new use case. Adding this new alias will keep flag usage consistent. -boris
Re: [PATCH 3/3] xen/swiotlb: remember having called xen_create_contiguous_region()
On 4/23/19 6:54 AM, Juergen Gross wrote: > Instead of always calling xen_destroy_contiguous_region() in case the > memory is DMA-able for the used device, do so only in case it has been > made DMA-able via xen_create_contiguous_region() before. > > This will avoid a lot of xen_destroy_contiguous_region() calls for > 64-bit capable devices. > > As the memory in question is owned by swiotlb-xen the PG_owner_priv_1 > flag of the first allocated page can be used for remembering. I think a new enum in pageflags would be useful, and be consistent with other flag uses. -boris > > Signed-off-by: Juergen Gross > --- > drivers/xen/swiotlb-xen.c | 9 ++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index 43b6e65ae256..a72f181d8e20 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -321,6 +321,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t > size, > xen_free_coherent_pages(hwdev, size, ret, > (dma_addr_t)phys, attrs); > return NULL; > } > + SetPageOwnerPriv1(virt_to_page(ret)); > } > memset(ret, 0, size); > return ret; > @@ -344,9 +345,11 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t > size, void *vaddr, > /* Convert the size to actually allocated. */ > size = 1UL << (order + XEN_PAGE_SHIFT); > > - if ((dev_addr + size - 1 <= dma_mask) && > - !WARN_ON(range_straddles_page_boundary(phys, size))) > - xen_destroy_contiguous_region(phys, order); > + if (PageOwnerPriv1(virt_to_page(vaddr))) { > + if (!WARN_ON(range_straddles_page_boundary(phys, size))) > + xen_destroy_contiguous_region(phys, order); > + ClearPageOwnerPriv1(virt_to_page(vaddr)); > + } > > xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs); > }
Re: [PATCH 2/3] xen/swiotlb: simplify range_straddles_page_boundary()
On 4/23/19 6:54 AM, Juergen Gross wrote: > range_straddles_page_boundary() is open coding several macros from > include/xen/page.h. Use those instead. Additionally there is no need > to have check_pages_physically_contiguous() as a separate function as > it is used only once, so merge it into range_straddles_page_boundary(). > > Signed-off-by: Juergen Gross Reviewed-by: Boris Ostrovsky
Re: [PATCH 1/3] xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
On 4/23/19 6:54 AM, Juergen Gross wrote: > The condition in xen_swiotlb_free_coherent() for deciding whether to > call xen_destroy_contiguous_region() is wrong: in case the region to > be freed is not contiguous calling xen_destroy_contiguous_region() is > the wrong thing to do: it would result in inconsistent mappings of > multiple PFNs to the same MFN. This will lead to various strange > crashes or data corruption. > > Instead of calling xen_destroy_contiguous_region() in that case a > warning should be issued as that situation should never occur. > > Cc: sta...@vger.kernel.org > Signed-off-by: Juergen Gross Reviewed-by: Boris Ostrovsky
Re: [PATCH -next] xen-swiotlb: Make two functions static
On 4/16/19 10:49 AM, Yue Haibing wrote: > From: YueHaibing > > Fix sparse warnings: > > drivers/xen/swiotlb-xen.c:489:1: warning: > symbol 'xen_swiotlb_sync_single_for_cpu' was not declared. Should it be > static? > drivers/xen/swiotlb-xen.c:496:1: warning: > symbol 'xen_swiotlb_sync_single_for_device' was not declared. Should it be > static? > > Reported-by: Hulk Robot > Signed-off-by: YueHaibing I think latest patches from Christoph take care of this. -boris > --- > drivers/xen/swiotlb-xen.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index 877baf2..e741df4 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -485,14 +485,14 @@ xen_swiotlb_sync_single(struct device *hwdev, > dma_addr_t dev_addr, > xen_dma_sync_single_for_device(hwdev, dev_addr, size, dir); > } > > -void > +static void > xen_swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr, > size_t size, enum dma_data_direction dir) > { > xen_swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU); > } > > -void > +static void > xen_swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr, > size_t size, enum dma_data_direction dir) > {
Re: [PATCH 0/9] Use vm_insert_range
On 11/21/18 2:56 PM, Souptick Joarder wrote: > On Thu, Nov 22, 2018 at 1:08 AM Boris Ostrovsky > wrote: >> On 11/21/18 1:24 AM, Souptick Joarder wrote: >>> On Thu, Nov 15, 2018 at 9:09 PM Souptick Joarder >>> wrote: >>>> Previously drivers have their own way of mapping a range of >>>> kernel pages/memory into a user vma and this was done by >>>> invoking vm_insert_page() within a loop. >>>> >>>> As this pattern is common across different drivers, it can >>>> be generalized by creating a new function and using it across >>>> the drivers. >>>> >>>> vm_insert_range is the new API which will be used to map a >>>> range of kernel memory/pages to a user vma. >>>> >>>> All the applicable places are converted to use the new vm_insert_range >>>> in this patch series. >>>> >>>> Souptick Joarder (9): >>>> mm: Introduce new vm_insert_range API >>>> arch/arm/mm/dma-mapping.c: Convert to use vm_insert_range >>>> drivers/firewire/core-iso.c: Convert to use vm_insert_range >>>> drm/rockchip/rockchip_drm_gem.c: Convert to use vm_insert_range >>>> drm/xen/xen_drm_front_gem.c: Convert to use vm_insert_range >>>> iommu/dma-iommu.c: Convert to use vm_insert_range >>>> videobuf2/videobuf2-dma-sg.c: Convert to use vm_insert_range >>>> xen/gntdev.c: Convert to use vm_insert_range >>>> xen/privcmd-buf.c: Convert to use vm_insert_range >>> Any further comments on the driver changes? >> Xen drivers (the last two patches) look fine to me. > Thanks, can I consider this as Reviewed-by? Reviewed-by: Boris Ostrovsky
Re: [PATCH 0/9] Use vm_insert_range
On 11/21/18 1:24 AM, Souptick Joarder wrote: > On Thu, Nov 15, 2018 at 9:09 PM Souptick Joarder wrote: >> Previously drivers have their own way of mapping a range of >> kernel pages/memory into a user vma and this was done by >> invoking vm_insert_page() within a loop. >> >> As this pattern is common across different drivers, it can >> be generalized by creating a new function and using it across >> the drivers. >> >> vm_insert_range is the new API which will be used to map a >> range of kernel memory/pages to a user vma. >> >> All the applicable places are converted to use the new vm_insert_range >> in this patch series. >> >> Souptick Joarder (9): >> mm: Introduce new vm_insert_range API >> arch/arm/mm/dma-mapping.c: Convert to use vm_insert_range >> drivers/firewire/core-iso.c: Convert to use vm_insert_range >> drm/rockchip/rockchip_drm_gem.c: Convert to use vm_insert_range >> drm/xen/xen_drm_front_gem.c: Convert to use vm_insert_range >> iommu/dma-iommu.c: Convert to use vm_insert_range >> videobuf2/videobuf2-dma-sg.c: Convert to use vm_insert_range >> xen/gntdev.c: Convert to use vm_insert_range >> xen/privcmd-buf.c: Convert to use vm_insert_range > Any further comments on the driver changes? Xen drivers (the last two patches) look fine to me. -boris >> arch/arm/mm/dma-mapping.c | 21 ++--- >> drivers/firewire/core-iso.c | 15 ++-- >> drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 20 ++-- >> drivers/gpu/drm/xen/xen_drm_front_gem.c | 20 +--- >> drivers/iommu/dma-iommu.c | 12 ++ >> drivers/media/common/videobuf2/videobuf2-dma-sg.c | 23 ++- >> drivers/xen/gntdev.c | 11 - >> drivers/xen/privcmd-buf.c | 8 ++- >> include/linux/mm_types.h | 3 +++ >> mm/memory.c | 28 >> +++ >> mm/nommu.c | 7 ++ >> 11 files changed, 70 insertions(+), 98 deletions(-) >> >> -- >> 1.9.1 >>
Re: [Xen-devel] [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3
>> >> PV guests don't go through Linux x86 early boot code. They start at >> xen_start_kernel() (well, xen-head.S:startup_xen(), really) and merge >> with baremetal path at x86_64_start_reservations() (for 64-bit). >> > > Ok, I don't think anything needs to be done then. The sme_me_mask is set > in sme_enable() which is only called from head_64.S. If the sme_me_mask > isn't set then SME won't be active. The feature will just report the > capability of the processor, but that doesn't mean it is active. If you > still want the feature to be clobbered we can do that, though. I'd prefer to explicitly clear it to avoid any ambiguity. -boris
Re: [Xen-devel] [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3
On 06/09/2017 02:36 PM, Tom Lendacky wrote: > On 6/8/2017 5:01 PM, Andrew Cooper wrote: >> On 08/06/2017 22:17, Boris Ostrovsky wrote: >>> On 06/08/2017 05:02 PM, Tom Lendacky wrote: >>>> On 6/8/2017 3:51 PM, Boris Ostrovsky wrote: >>>>>>> What may be needed is making sure X86_FEATURE_SME is not set for PV >>>>>>> guests. >>>>>> And that may be something that Xen will need to control through >>>>>> either >>>>>> CPUID or MSR support for the PV guests. >>>>> >>>>> Only on newer versions of Xen. On earlier versions (2-3 years old) >>>>> leaf >>>>> 0x80000007 is passed to the guest unchanged. And so is MSR_K8_SYSCFG. >>>> The SME feature is in leaf 0x8000001f, is that leaf passed to the >>>> guest >>>> unchanged? >>> Oh, I misread the patch where X86_FEATURE_SME is defined. Then all >>> versions, including the current one, pass it unchanged. >>> >>> All that's needed is setup_clear_cpu_cap(X86_FEATURE_SME) in >>> xen_init_capabilities(). >> >> AMD processors still don't support CPUID Faulting (or at least, I >> couldn't find any reference to it in the latest docs), so we cannot >> actually hide SME from a guest which goes looking at native CPUID. >> Furthermore, I'm not aware of any CPUID masking support covering that >> leaf. >> >> However, if Linux is using the paravirtual cpuid hook, things are >> slightly better. >> >> On Xen 4.9 and later, no guests will see the feature. On earlier >> versions of Xen (before I fixed the logic), plain domUs will not see the >> feature, while dom0 will. >> >> For safety, I'd recommend unilaterally clobbering the feature as Boris >> suggested. There is no way SME will be supportable on a per-PV guest > > That may be too late. Early boot support in head_64.S will make calls to > check for the feature (through CPUID and MSR), set the sme_me_mask and > encrypt the kernel in place. Is there another way to approach this? PV guests don't go through Linux x86 early boot code. 
They start at xen_start_kernel() (well, xen-head.S:startup_xen(), really) and merge with baremetal path at x86_64_start_reservations() (for 64-bit). -boris > >> basis, although (as far as I am aware) Xen as a whole would be able to >> encompass itself and all of its PV guests inside one single SME >> instance. > > Yes, that is correct. > > Thanks, > Tom > >> >> ~Andrew >>
Re: [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3
On 06/08/2017 05:02 PM, Tom Lendacky wrote: > On 6/8/2017 3:51 PM, Boris Ostrovsky wrote: >> >>> >>>> What may be needed is making sure X86_FEATURE_SME is not set for PV >>>> guests. >>> >>> And that may be something that Xen will need to control through either >>> CPUID or MSR support for the PV guests. >> >> >> Only on newer versions of Xen. On earlier versions (2-3 years old) leaf >> 0x80000007 is passed to the guest unchanged. And so is MSR_K8_SYSCFG. > > The SME feature is in leaf 0x8000001f, is that leaf passed to the guest > unchanged? Oh, I misread the patch where X86_FEATURE_SME is defined. Then all versions, including the current one, pass it unchanged. All that's needed is setup_clear_cpu_cap(X86_FEATURE_SME) in xen_init_capabilities(). -boris
Re: [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3
> >> What may be needed is making sure X86_FEATURE_SME is not set for PV >> guests. > > And that may be something that Xen will need to control through either > CPUID or MSR support for the PV guests. Only on newer versions of Xen. On earlier versions (2-3 years old) leaf 0x80000007 is passed to the guest unchanged. And so is MSR_K8_SYSCFG. -boris