Re: [RFC PATCH 3/7] module: prepare to handle ROX allocations for text

2024-04-18 Thread Nadav Amit
> On 18 Apr 2024, at 13:20, Mike Rapoport wrote: > > On Tue, Apr 16, 2024 at 12:36:08PM +0300, Nadav Amit wrote: >> >> >> >> I might be missing something, but it seems a bit racy. >> >> IIUC, module_finalize() calls alternatives_smp_module_

Re: [RFC PATCH 3/7] module: prepare to handle ROX allocations for text

2024-04-16 Thread Nadav Amit
> On 11 Apr 2024, at 19:05, Mike Rapoport wrote: > > @@ -2440,7 +2479,24 @@ static int post_relocation(struct module *mod, const > struct load_info *info) > add_kallsyms(mod, info); > > /* Arch-specific module finalizing. */ > - return module_finalize(info->hdr,

Re: [PATCH] iommu/amd: page-specific invalidations for more than one page

2021-04-08 Thread Nadav Amit
> On Apr 8, 2021, at 12:18 AM, Joerg Roedel wrote: > > Hi Nadav, > > On Wed, Apr 07, 2021 at 05:57:31PM +0000, Nadav Amit wrote: >> I tested it on real bare-metal hardware. I ran some basic I/O workloads >> with the IOMMU enabled, checkers enabled/disabled, and so

Re: [PATCH] iommu/amd: page-specific invalidations for more than one page

2021-04-07 Thread Nadav Amit
> On Apr 7, 2021, at 3:01 AM, Joerg Roedel wrote: > > On Tue, Mar 23, 2021 at 02:06:19PM -0700, Nadav Amit wrote: >> From: Nadav Amit >> >> Currently, IOMMU invalidations and device-IOTLB invalidations using >> AMD IOMMU fall back to full address-space inva

Re: [RFC] NUMA balancing: reduce TLB flush via delaying mapping on hint page fault

2021-04-01 Thread Nadav Amit
> On Apr 1, 2021, at 1:38 AM, Mel Gorman wrote: > > On Wed, Mar 31, 2021 at 09:36:04AM -0700, Nadav Amit wrote: >> >> >>> On Mar 31, 2021, at 6:16 AM, Mel Gorman wrote: >>> >>> On Wed, Mar 31, 2021 at 07:20:09PM +0800, Huang, Ying wrote: >

Re: [RFC] NUMA balancing: reduce TLB flush via delaying mapping on hint page fault

2021-03-31 Thread Nadav Amit
> On Mar 31, 2021, at 6:16 AM, Mel Gorman wrote: > > On Wed, Mar 31, 2021 at 07:20:09PM +0800, Huang, Ying wrote: >> Mel Gorman writes: >> >>> On Mon, Mar 29, 2021 at 02:26:51PM +0800, Huang Ying wrote: For NUMA balancing, in hint page fault handler, the faulting page will be

Re: A problem of Intel IOMMU hardware ?

2021-03-26 Thread Nadav Amit
> On Mar 26, 2021, at 7:31 PM, Lu Baolu wrote: > > Hi Nadav, > > On 3/19/21 12:46 AM, Nadav Amit wrote: >> So here is my guess: >> Intel probably used as a basis for the IOTLB an implementation of >> some other (regular) TLB design. >> Intel SDM say

[PATCH] iommu/amd: page-specific invalidations for more than one page

2021-03-23 Thread Nadav Amit
From: Nadav Amit Currently, IOMMU invalidations and device-IOTLB invalidations using AMD IOMMU fall back to full address-space invalidation if more than a single page needs to be flushed. Full flushes are especially inefficient when the IOMMU is virtualized by a hypervisor, since it requires

Re: A problem of Intel IOMMU hardware ?

2021-03-18 Thread Nadav Amit
e, Cloud Infrastructure Service Product Dept.) >> ; Nadav Amit >> Cc: chenjiashang ; David Woodhouse >> ; io...@lists.linux-foundation.org; LKML >> ; alex.william...@redhat.com; Gonglei (Arei) >> ; w...@kernel.org >> Subject: RE: A problem of Intel IOMMU hardwar

Re: A problem of Intel IOMMU hardware ?

2021-03-18 Thread Nadav Amit
> On Mar 17, 2021, at 9:46 PM, Longpeng (Mike, Cloud Infrastructure Service > Product Dept.) wrote: > [Snip] > > NOTE, the magical thing happens...(*Operation-4*) we write the PTE > of Operation-1 from 0 to 0x3, which means it can Read/Write, and then > we trigger DMA read again, it succeeds and

Re: A problem of Intel IOMMU hardware ?

2021-03-17 Thread Nadav Amit
> On Mar 17, 2021, at 2:35 AM, Longpeng (Mike, Cloud Infrastructure Service > Product Dept.) wrote: > > Hi Nadav, > >> -Original Message- >> From: Nadav Amit [mailto:nadav.a...@gmail.com] >>> reproduce the problem with high probability (~50%). >

Re: A problem of Intel IOMMU hardware ?

2021-03-16 Thread Nadav Amit
> On Mar 16, 2021, at 8:16 PM, Longpeng (Mike, Cloud Infrastructure Service > Product Dept.) wrote: > > Hi guys, > > We find the Intel iommu cache (i.e. iotlb) may work incorrectly in a special > situation, which can cause DMA failures or wrong data. > > The reproducer (based on Alex's vfio

[tip: x86/mm] x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: 6035152d8eebe16a5bb60398d3e05dc7799067b0 Gitweb: https://git.kernel.org/tip/6035152d8eebe16a5bb60398d3e05dc7799067b0 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:06 -08:00 Committer

[tip: x86/mm] smp: Run functions concurrently in smp_call_function_many_cond()

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: a32a4d8a815c4eb6dc64b8962dc13a9dfae70868 Gitweb: https://git.kernel.org/tip/a32a4d8a815c4eb6dc64b8962dc13a9dfae70868 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:04 -08:00 Committer

[tip: x86/mm] x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote()

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: 4c1ba3923e6c8aa736e40f481a278c21b956c072 Gitweb: https://git.kernel.org/tip/4c1ba3923e6c8aa736e40f481a278c21b956c072 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:05 -08:00 Committer

[tip: x86/mm] x86/mm/tlb: Privatize cpu_tlbstate

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: 2f4305b19fe6a2a261d76c21856c5598f7d878fe Gitweb: https://git.kernel.org/tip/2f4305b19fe6a2a261d76c21856c5598f7d878fe Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:08 -08:00 Committer

[tip: x86/mm] x86/mm/tlb: Flush remote and local TLBs concurrently

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: 4ce94eabac16b1d2c95762b40f49e5654ab288d7 Gitweb: https://git.kernel.org/tip/4ce94eabac16b1d2c95762b40f49e5654ab288d7 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:07 -08:00 Committer

[tip: x86/mm] x86/mm/tlb: Remove unnecessary uses of the inline keyword

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: 1608e4cf31b88c8c448ce13aa1d77969dda6bdb7 Gitweb: https://git.kernel.org/tip/1608e4cf31b88c8c448ce13aa1d77969dda6bdb7 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:11 -08:00 Committer

[tip: x86/mm] cpumask: Mark functions as pure

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: 291c4011dd7ac0cd0cebb727a75ee5a50d16dcf7 Gitweb: https://git.kernel.org/tip/291c4011dd7ac0cd0cebb727a75ee5a50d16dcf7 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:10 -08:00 Committer

[tip: x86/mm] smp: Inline on_each_cpu_cond() and on_each_cpu()

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: a5aa5ce300597224ec76dacc8e63ba3ad7a18bbd Gitweb: https://git.kernel.org/tip/a5aa5ce300597224ec76dacc8e63ba3ad7a18bbd Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:12 -08:00 Committer

[tip: x86/mm] x86/mm/tlb: Do not make is_lazy dirty for no reason

2021-03-06 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: 09c5272e48614a30598e759c3c7bed126d22037d Gitweb: https://git.kernel.org/tip/09c5272e48614a30598e759c3c7bed126d22037d Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:09 -08:00 Committer

[PATCH v4] mm/userfaultfd: fix memory corruption due to writeprotect

2021-03-04 Thread Nadav Amit
From: Nadav Amit Userfaultfd self-test fails occasionally, indicating a memory corruption. Analyzing this problem indicates that there is a real bug since mmap_lock is only taken for read in mwriteprotect_range() and defers flushes, and since there is insufficient consideration of concurrent

Re: [PATCH RESEND v3] mm/userfaultfd: fix memory corruption due to writeprotect

2021-03-03 Thread Nadav Amit
> On Mar 3, 2021, at 11:03 AM, Peter Xu wrote: > > On Wed, Mar 03, 2021 at 01:57:02AM -0800, Nadav Amit wrote: >> From: Nadav Amit >> >> Userfaultfd self-test fails occasionally, indicating a memory >> corruption. > > It's failing very constantly now

[PATCH v3] mm/userfaultfd: fix memory corruption due to writeprotect

2021-03-03 Thread Nadav Amit
From: Nadav Amit Userfaultfd self-test fails occasionally, indicating a memory corruption. Analyzing this problem indicates that there is a real bug since mmap_lock is only taken for read in mwriteprotect_range() and defers flushes, and since there is insufficient consideration of concurrent

[PATCH RESEND v3] mm/userfaultfd: fix memory corruption due to writeprotect

2021-03-03 Thread Nadav Amit
From: Nadav Amit Userfaultfd self-test fails occasionally, indicating a memory corruption. Analyzing this problem indicates that there is a real bug since mmap_lock is only taken for read in mwriteprotect_range() and defers flushes, and since there is insufficient consideration of concurrent

Re: [PATCH v3] mm/userfaultfd: fix memory corruption due to writeprotect

2021-03-03 Thread Nadav Amit
> On Mar 3, 2021, at 1:51 AM, Nadav Amit wrote: > > From: Nadav Amit > > Userfaultfd self-test fails occasionally, indicating a memory > corruption. Please ignore - I will resend. signature.asc Description: Message signed with OpenPGP

Re: [RFC PATCH v2 0/2] mm: fix races due to deferred TLB flushes

2021-03-02 Thread Nadav Amit
> On Mar 2, 2021, at 2:13 PM, Peter Xu wrote: > > On Fri, Dec 25, 2020 at 01:25:27AM -0800, Nadav Amit wrote: >> From: Nadav Amit >> >> This patch-set went from v1 to RFCv2, as there is still an ongoing >> discussion regarding the way of solving the recent

[tip: x86/mm] smp: Run functions concurrently in smp_call_function_many_cond()

2021-03-02 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: b54d50640ca698383fc5b711487f303c17f4b47f Gitweb: https://git.kernel.org/tip/b54d50640ca698383fc5b711487f303c17f4b47f Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:04 -08:00 Committer

[tip: x86/mm] x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()

2021-03-02 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: bc51e8e6f9c387d8dda1d8dea2b8856d0ade4101 Gitweb: https://git.kernel.org/tip/bc51e8e6f9c387d8dda1d8dea2b8856d0ade4101 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:06 -08:00 Committer

[tip: x86/mm] x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote()

2021-03-02 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: f4f14f7c20440a442b4eaeb7b6f25cd0fc437e36 Gitweb: https://git.kernel.org/tip/f4f14f7c20440a442b4eaeb7b6f25cd0fc437e36 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:05 -08:00 Committer

[tip: x86/mm] x86/mm/tlb: Flush remote and local TLBs concurrently

2021-03-02 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: efa72447b0b95cd5e8b2bd7cf55ae23c716f8702 Gitweb: https://git.kernel.org/tip/efa72447b0b95cd5e8b2bd7cf55ae23c716f8702 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:07 -08:00 Committer

[tip: x86/mm] x86/mm/tlb: Privatize cpu_tlbstate

2021-03-02 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: fe978069739b59804c911fc9e9645ce768ec5b9e Gitweb: https://git.kernel.org/tip/fe978069739b59804c911fc9e9645ce768ec5b9e Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:08 -08:00 Committer

[tip: x86/mm] smp: Inline on_each_cpu_cond() and on_each_cpu()

2021-03-02 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: 28344ab0a282a5ab5e4d56bfbcb2b363f4c15447 Gitweb: https://git.kernel.org/tip/28344ab0a282a5ab5e4d56bfbcb2b363f4c15447 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:12 -08:00 Committer

[tip: x86/mm] x86/mm/tlb: Do not make is_lazy dirty for no reason

2021-03-02 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: db73f8099a502be8ed46f6332c91754c74ac76c2 Gitweb: https://git.kernel.org/tip/db73f8099a502be8ed46f6332c91754c74ac76c2 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:09 -08:00 Committer

[tip: x86/mm] cpumask: Mark functions as pure

2021-03-02 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: 1028a5918cbaae6b9d7f0a04b6a200b9e67aec14 Gitweb: https://git.kernel.org/tip/1028a5918cbaae6b9d7f0a04b6a200b9e67aec14 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:10 -08:00 Committer

[tip: x86/mm] x86/mm/tlb: Remove unnecessary uses of the inline keyword

2021-03-02 Thread tip-bot2 for Nadav Amit
The following commit has been merged into the x86/mm branch of tip: Commit-ID: 327db7a160b33865e086f7fff73e08f6d8d47005 Gitweb: https://git.kernel.org/tip/327db7a160b33865e086f7fff73e08f6d8d47005 Author: Nadav Amit AuthorDate: Sat, 20 Feb 2021 15:17:11 -08:00 Committer

Re: [PATCH v6 1/9] smp: Run functions concurrently in smp_call_function_many_cond()

2021-03-01 Thread Nadav Amit
> On Mar 1, 2021, at 9:10 AM, Peter Zijlstra wrote: > > On Sat, Feb 20, 2021 at 03:17:04PM -0800, Nadav Amit wrote: >> +/* >> + * Choose the most efficient way to send an IPI. Note that the >> + * number of CPUs might be zer

Re: [RFC 1/6] vdso/extable: fix calculation of base

2021-02-28 Thread Nadav Amit
> On Feb 26, 2021, at 9:47 AM, Sean Christopherson wrote: > > On Fri, Feb 26, 2021, Nadav Amit wrote: >> >>> On Feb 25, 2021, at 1:16 PM, Sean Christopherson wrote: >>> It's been literally years since I wrote this code, but I distinctly >>>

Re: [RFC 1/6] vdso/extable: fix calculation of base

2021-02-26 Thread Nadav Amit
> On Feb 25, 2021, at 1:16 PM, Sean Christopherson wrote: > > On Wed, Feb 24, 2021, Nadav Amit wrote: >> From: Nadav Amit >> >> Apparently, the assembly considers __ex_table as the location when the >> pushsection directive was issued. Therefore when the

Re: [RFC 0/6] x86: prefetch_page() vDSO call

2021-02-25 Thread Nadav Amit
> On Feb 25, 2021, at 9:32 AM, Matthew Wilcox wrote: > > On Thu, Feb 25, 2021 at 04:56:50PM +0000, Nadav Amit wrote: >> >>> On Feb 25, 2021, at 4:16 AM, Matthew Wilcox wrote: >>> >>> On Wed, Feb 24, 2021 at 11:29:04PM -0800, Nadav Amit wrote:

Re: [RFC 0/6] x86: prefetch_page() vDSO call

2021-02-25 Thread Nadav Amit
> On Feb 25, 2021, at 4:16 AM, Matthew Wilcox wrote: > > On Wed, Feb 24, 2021 at 11:29:04PM -0800, Nadav Amit wrote: >> Just as applications can use prefetch instructions to overlap >> computations and memory accesses, applications may want to overlap the >> page-fa

Re: [RFC 0/6] x86: prefetch_page() vDSO call

2021-02-25 Thread Nadav Amit
> On Feb 25, 2021, at 12:52 AM, Nadav Amit wrote: > > > >> On Feb 25, 2021, at 12:40 AM, Peter Zijlstra wrote: >> >> On Wed, Feb 24, 2021 at 11:29:04PM -0800, Nadav Amit wrote: >>> From: Nadav Amit >>> >>> Just as applications

Re: [RFC 0/6] x86: prefetch_page() vDSO call

2021-02-25 Thread Nadav Amit
> On Feb 25, 2021, at 12:40 AM, Peter Zijlstra wrote: > > On Wed, Feb 24, 2021 at 11:29:04PM -0800, Nadav Amit wrote: >> From: Nadav Amit >> >> Just as applications can use prefetch instructions to overlap >> computations and memory accesses, application

[RFC 5/6] mm: use lightweight reclaim on FAULT_FLAG_RETRY_NOWAIT

2021-02-24 Thread Nadav Amit
From: Nadav Amit When FAULT_FLAG_RETRY_NOWAIT is set, the caller arguably wants only a lightweight reclaim to avoid a long reclamation, which would not respect the "NOWAIT" semantic. Regard the request in swap and file-backed page-faults accordingly during the first try. Cc: Andy Luto

[PATCH 6/6] testing/selftest: test vDSO prefetch_page()

2021-02-24 Thread Nadav Amit
From: Nadav Amit Test prefetch_page() in cases of invalid pointer, file-mmap and anonymous memory. Partial checks are also done with mincore syscall to ensure the output of prefetch_page() is consistent with mincore (taking into account the different semantics of the two). The tests

[RFC 4/6] mm/swap_state: respect FAULT_FLAG_RETRY_NOWAIT

2021-02-24 Thread Nadav Amit
From: Nadav Amit Certain use-cases (e.g., prefetch_page()) may want to avoid polling while a page is brought from the swap. Yet, swap_cluster_readahead() and swap_vma_readahead() do not respect FAULT_FLAG_RETRY_NOWAIT. Add support to respect FAULT_FLAG_RETRY_NOWAIT by not polling in these cases

[RFC 3/6] x86/vdso: introduce page_prefetch()

2021-02-24 Thread Nadav Amit
From: Nadav Amit Introduce a new vDSO function, page_prefetch(), which is to be used when certain memory, which might be paged out, is expected to be used soon. The function prefetches the page if needed. The function returns zero if the page is accessible after the call and -1 otherwise

[RFC 1/6] vdso/extable: fix calculation of base

2021-02-24 Thread Nadav Amit
From: Nadav Amit Apparently, the assembly considers __ex_table as the location when the pushsection directive was issued. Therefore when there is more than a single entry in the vDSO exception table, the calculations of the base and fixup are wrong. Fix the calculations of the expected fault IP

[RFC 2/6] x86/vdso: add mask and flags to extable

2021-02-24 Thread Nadav Amit
From: Nadav Amit Add a "mask" field to vDSO exception tables that says which exceptions should be handled. Add a "flags" field to vDSO as well to provide additional information about the exception. The existing preprocessor macro _ASM_VDSO_EXTABLE_HANDLE for assembl

[RFC 0/6] x86: prefetch_page() vDSO call

2021-02-24 Thread Nadav Amit
From: Nadav Amit Just as applications can use prefetch instructions to overlap computations and memory accesses, applications may want to overlap the page-faults and compute or overlap the I/O accesses that are required for page-faults of different pages. Applications can use multiple threads

[PATCH v6 9/9] smp: inline on_each_cpu_cond() and on_each_cpu()

2021-02-20 Thread Nadav Amit
From: Nadav Amit Simplify the code and avoid having an additional function on the stack by inlining on_each_cpu_cond() and on_each_cpu(). Cc: Andy Lutomirski Cc: Thomas Gleixner Suggested-by: Peter Zijlstra Signed-off-by: Nadav Amit --- include/linux/smp.h | 50

[PATCH v6 8/9] x86/mm/tlb: Remove unnecessary uses of the inline keyword

2021-02-20 Thread Nadav Amit
From: Nadav Amit The compiler is smart enough without these hints. Cc: Andy Lutomirski Cc: Peter Zijlstra Suggested-by: Dave Hansen Reviewed-by: Dave Hansen Signed-off-by: Nadav Amit --- arch/x86/mm/tlb.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86

[PATCH v6 7/9] cpumask: Mark functions as pure

2021-02-20 Thread Nadav Amit
From: Nadav Amit cpumask_next_and() and cpumask_any_but() are pure, and marking them as such seems to generate different and presumably better code for native_flush_tlb_multi(). Reviewed-by: Dave Hansen Signed-off-by: Nadav Amit --- include/linux/cpumask.h | 6 +++--- 1 file changed, 3

[PATCH v6 6/9] x86/mm/tlb: Do not make is_lazy dirty for no reason

2021-02-20 Thread Nadav Amit
From: Nadav Amit Blindly writing to is_lazy for no reason, when the written value is identical to the old value, makes the cacheline dirty for no reason. Avoid making such writes to prevent cache coherency traffic for no reason. Cc: Andy Lutomirski Cc: Peter Zijlstra Suggested-by: Dave Hansen

[PATCH v6 5/9] x86/mm/tlb: Privatize cpu_tlbstate

2021-02-20 Thread Nadav Amit
From: Nadav Amit cpu_tlbstate is mostly private and only the variable is_lazy is shared. This causes some false-sharing when TLB flushes are performed. Break cpu_tlbstate into cpu_tlbstate and cpu_tlbstate_shared, and mark each one accordingly. Cc: Andy Lutomirski Cc: Peter Zijlstra

[PATCH v6 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2021-02-20 Thread Nadav Amit
From: Nadav Amit To improve TLB shootdown performance, flush the remote and local TLBs concurrently. Introduce flush_tlb_multi() that does so. Introduce paravirtual versions of flush_tlb_multi() for KVM, Xen and hyper-v (Xen and hyper-v are only compile-tested). While the updated smp

[PATCH v6 3/9] x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()

2021-02-20 Thread Nadav Amit
From: Nadav Amit Open-code on_each_cpu_cond_mask() in native_flush_tlb_others() to optimize the code. Open-coding eliminates the need for the indirect branch that is used to call is_lazy(), and in CPUs that are vulnerable to Spectre v2, it eliminates the retpoline. In addition, it allows to use

[PATCH v6 2/9] x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote()

2021-02-20 Thread Nadav Amit
From: Nadav Amit The unification of these two functions allows using them in the updated SMP infrastructure. To do so, remove the reason argument from flush_tlb_func_local(), add a member to struct tlb_flush_info that says which CPU initiated the flush and act accordingly. Optimize the size

[PATCH v6 1/9] smp: Run functions concurrently in smp_call_function_many_cond()

2021-02-20 Thread Nadav Amit
From: Nadav Amit Currently, on_each_cpu() and similar functions do not exploit the potential of concurrency: the function is first executed remotely and only then it is executed locally. Functions such as TLB flush can take considerable time, so this provides an opportunity for performance

[PATCH v6 0/9] x86/tlb: Concurrent TLB flushes

2021-02-20 Thread Nadav Amit
From: Nadav Amit The series improves TLB shootdown by flushing the local TLB concurrently with remote TLBs, overlapping the IPI delivery time with the local flush. Performance numbers can be found in the previous version [1]. v5 was rebased on 5.11 (long time after v4), and had some bugs

Re: [PATCH v5 1/8] smp: Run functions concurrently in smp_call_function_many_cond()

2021-02-18 Thread Nadav Amit
> On Feb 18, 2021, at 12:09 AM, Christoph Hellwig wrote: > > On Tue, Feb 09, 2021 at 02:16:46PM -0800, Nadav Amit wrote: >> +/* >> + * Flags to be used as scf_flags argument of smp_call_function_many_cond(). >> + */ >> +#define SCF_WAIT(1U << 0)

Re: [PATCH v5 3/8] x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()

2021-02-18 Thread Nadav Amit
> On Feb 18, 2021, at 12:16 AM, Christoph Hellwig wrote: > > On Tue, Feb 09, 2021 at 02:16:48PM -0800, Nadav Amit wrote: >> +/* >> + * Although we could have used on_each_cpu_cond_mask(), >> + * open-coding it has performance a

Re: [PATCH v5 1/8] smp: Run functions concurrently in smp_call_function_many_cond()

2021-02-16 Thread Nadav Amit
> On Feb 16, 2021, at 10:59 AM, Peter Zijlstra wrote: > > On Tue, Feb 16, 2021 at 06:53:09PM +0000, Nadav Amit wrote: >>> On Feb 16, 2021, at 8:32 AM, Peter Zijlstra wrote: > >>> I'm not sure I can explain it yet. It did get me looking at >>>

Local execution of ipi_sync_rq_state() on sync_runqueues_membarrier_state()

2021-02-16 Thread Nadav Amit
Hello Mathieu, While trying to find some unrelated bug, something in sync_runqueues_membarrier_state() caught my eye: static int sync_runqueues_membarrier_state(struct mm_struct *mm) { if (atomic_read(&mm->mm_users) == 1 || num_online_cpus() == 1) {

Re: [PATCH] drivers: vmw_balloon: remove dentry pointer for debugfs

2021-02-16 Thread Nadav Amit
> storage and make things a bit simpler. > > Cc: Nadav Amit > Cc: "VMware, Inc." > Cc: Arnd Bergmann > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Greg Kroah-Hartman > --- Thanks for the cleanup. Acked-by: Nadav Amit

Re: [PATCH v5 4/8] x86/mm/tlb: Flush remote and local TLBs concurrently

2021-02-16 Thread Nadav Amit
> On Feb 16, 2021, at 4:10 AM, Peter Zijlstra wrote: > > On Tue, Feb 09, 2021 at 02:16:49PM -0800, Nadav Amit wrote: >> @@ -816,8 +821,8 @@ STATIC_NOPV void native_flush_tlb_others(const struct >> cpumask *cpumask, >> * doing a speculative memory access. &

Re: [PATCH v5 1/8] smp: Run functions concurrently in smp_call_function_many_cond()

2021-02-16 Thread Nadav Amit
> On Feb 16, 2021, at 10:59 AM, Peter Zijlstra wrote: > > On Tue, Feb 16, 2021 at 06:53:09PM +0000, Nadav Amit wrote: >>> On Feb 16, 2021, at 8:32 AM, Peter Zijlstra wrote: > >>> I'm not sure I can explain it yet. It did get me looking at >>>

Re: [PATCH v5 1/8] smp: Run functions concurrently in smp_call_function_many_cond()

2021-02-16 Thread Nadav Amit
> On Feb 16, 2021, at 8:32 AM, Peter Zijlstra wrote: > > On Tue, Feb 09, 2021 at 02:16:46PM -0800, Nadav Amit wrote: >> From: Nadav Amit >> >> Currently, on_each_cpu() and similar functions do not exploit the >> potential of concurrency: the function is fi

Re: [PATCH v5 1/8] smp: Run functions concurrently in smp_call_function_many_cond()

2021-02-16 Thread Nadav Amit
> On Feb 16, 2021, at 4:04 AM, Peter Zijlstra wrote: > > On Tue, Feb 09, 2021 at 02:16:46PM -0800, Nadav Amit wrote: >> @@ -894,17 +911,12 @@ EXPORT_SYMBOL(on_each_cpu_mask); >> void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func, >>

[PATCH v5 3/8] x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()

2021-02-09 Thread Nadav Amit
From: Nadav Amit Open-code on_each_cpu_cond_mask() in native_flush_tlb_others() to optimize the code. Open-coding eliminates the need for the indirect branch that is used to call is_lazy(), and in CPUs that are vulnerable to Spectre v2, it eliminates the retpoline. In addition, it allows to use

[PATCH v5 6/8] x86/mm/tlb: Do not make is_lazy dirty for no reason

2021-02-09 Thread Nadav Amit
From: Nadav Amit Blindly writing to is_lazy for no reason, when the written value is identical to the old value, makes the cacheline dirty for no reason. Avoid making such writes to prevent cache coherency traffic for no reason. Cc: Andy Lutomirski Cc: Peter Zijlstra Suggested-by: Dave Hansen

[PATCH v5 7/8] cpumask: Mark functions as pure

2021-02-09 Thread Nadav Amit
From: Nadav Amit cpumask_next_and() and cpumask_any_but() are pure, and marking them as such seems to generate different and presumably better code for native_flush_tlb_multi(). Reviewed-by: Dave Hansen Signed-off-by: Nadav Amit --- include/linux/cpumask.h | 6 +++--- 1 file changed, 3

[PATCH v5 8/8] x86/mm/tlb: Remove unnecessary uses of the inline keyword

2021-02-09 Thread Nadav Amit
From: Nadav Amit The compiler is smart enough without these hints. Cc: Andy Lutomirski Cc: Peter Zijlstra Suggested-by: Dave Hansen Reviewed-by: Dave Hansen Signed-off-by: Nadav Amit --- arch/x86/mm/tlb.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86

[PATCH v5 5/8] x86/mm/tlb: Privatize cpu_tlbstate

2021-02-09 Thread Nadav Amit
From: Nadav Amit cpu_tlbstate is mostly private and only the variable is_lazy is shared. This causes some false-sharing when TLB flushes are performed. Break cpu_tlbstate into cpu_tlbstate and cpu_tlbstate_shared, and mark each one accordingly. Cc: Andy Lutomirski Cc: Peter Zijlstra

[PATCH v5 4/8] x86/mm/tlb: Flush remote and local TLBs concurrently

2021-02-09 Thread Nadav Amit
From: Nadav Amit To improve TLB shootdown performance, flush the remote and local TLBs concurrently. Introduce flush_tlb_multi() that does so. Introduce paravirtual versions of flush_tlb_multi() for KVM, Xen and hyper-v (Xen and hyper-v are only compile-tested). While the updated smp

[PATCH v5 0/8] x86/tlb: Concurrent TLB flushes

2021-02-09 Thread Nadav Amit
From: Nadav Amit This is a respin of a rebased version of an old series, which I did not follow, as I was preoccupied with personal issues (sorry). The series improves TLB shootdown by flushing the local TLB concurrently with remote TLBs, overlapping the IPI delivery time with the local flush

[PATCH v5 1/8] smp: Run functions concurrently in smp_call_function_many_cond()

2021-02-09 Thread Nadav Amit
From: Nadav Amit Currently, on_each_cpu() and similar functions do not exploit the potential of concurrency: the function is first executed remotely and only then it is executed locally. Functions such as TLB flush can take considerable time, so this provides an opportunity for performance

[PATCH v5 2/8] x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote()

2021-02-09 Thread Nadav Amit
From: Nadav Amit The unification of these two functions allows using them in the updated SMP infrastructure. To do so, remove the reason argument from flush_tlb_func_local(), add a member to struct tlb_flush_info that says which CPU initiated the flush and act accordingly. Optimize the size

Re: [RFC 01/20] mm/tlb: fix fullmm semantics

2021-02-03 Thread Nadav Amit
> On Feb 3, 2021, at 1:44 AM, Will Deacon wrote: > > On Tue, Feb 02, 2021 at 01:35:38PM -0800, Nadav Amit wrote: >>> On Feb 2, 2021, at 3:00 AM, Peter Zijlstra wrote: >>> >>> On Tue, Feb 02, 2021 at 01:32:36AM -0800, Nadav Amit wrote: >>>>>

Re: [RFC 01/20] mm/tlb: fix fullmm semantics

2021-02-02 Thread Nadav Amit
> On Feb 2, 2021, at 3:00 AM, Peter Zijlstra wrote: > > On Tue, Feb 02, 2021 at 01:32:36AM -0800, Nadav Amit wrote: >>> On Feb 1, 2021, at 3:36 AM, Peter Zijlstra wrote: >>> >>> >>> https://lkml.kernel.org/r/20210127235347.1402-1-w...@kernel.org &

Re: [RFC 15/20] mm: detect deferred TLB flushes in vma granularity

2021-02-02 Thread Nadav Amit
> On Feb 1, 2021, at 4:14 PM, Andy Lutomirski wrote: > > >> On Feb 1, 2021, at 2:04 PM, Nadav Amit wrote: >> >> Andy’s comments managed to make me realize this code is wrong. We must >> call inc_mm_tlb_gen(mm) every time. >> >> Otherwise, a CPU t

Re: [RFC 11/20] mm/tlb: remove arch-specific tlb_start/end_vma()

2021-02-02 Thread Nadav Amit
> On Feb 2, 2021, at 1:31 AM, Peter Zijlstra wrote: > > On Tue, Feb 02, 2021 at 07:20:55AM +0000, Nadav Amit wrote: >> Arm does not define tlb_end_vma, and consequently it flushes the TLB after >> each VMA. I suspect it is not intentional. > > ARM is one of those t

Re: [RFC 01/20] mm/tlb: fix fullmm semantics

2021-02-02 Thread Nadav Amit
> On Feb 1, 2021, at 3:36 AM, Peter Zijlstra wrote: > > > https://lkml.kernel.org/r/20210127235347.1402-1-w...@kernel.org I have seen this series, and applied my patches on it. Despite Will’s patches, there were still inconsistencies between fullmm and need_flush_all. Am I missing something?

Re: [RFC 11/20] mm/tlb: remove arch-specific tlb_start/end_vma()

2021-02-01 Thread Nadav Amit
> On Feb 1, 2021, at 10:41 PM, Nicholas Piggin wrote: > > Excerpts from Peter Zijlstra's message of February 1, 2021 10:09 pm: >> I also don't think AGRESSIVE_FLUSH_BATCHING quite captures what it does. >> How about: >> >> CONFIG_MMU_GATHER_NO_PER_VMA_FLUSH > > Yes please, have to have

Re: [RFC 13/20] mm/tlb: introduce tlb_start_ptes() and tlb_end_ptes()

2021-02-01 Thread Nadav Amit
> On Feb 1, 2021, at 5:19 AM, Peter Zijlstra wrote: > > On Sat, Jan 30, 2021 at 04:11:25PM -0800, Nadav Amit wrote: >> +#define tlb_start_ptes(tlb) \ >> +do {\ &

Re: [RFC 15/20] mm: detect deferred TLB flushes in vma granularity

2021-02-01 Thread Nadav Amit
> On Jan 30, 2021, at 4:11 PM, Nadav Amit wrote: > > From: Nadav Amit > > Currently, deferred TLB flushes are detected in the mm granularity: if > there is any deferred TLB flush in the entire address space due to NUMA > migration, pte_accessible() in

Re: [RFC 01/20] mm/tlb: fix fullmm semantics

2021-01-31 Thread Nadav Amit
> On Jan 30, 2021, at 6:57 PM, Andy Lutomirski wrote: > > On Sat, Jan 30, 2021 at 5:19 PM Nadav Amit wrote: >>> On Jan 30, 2021, at 5:02 PM, Andy Lutomirski wrote: >>> >>> On Sat, Jan 30, 2021 at 4:16 PM Nadav Amit wrote: >>>> From: Nadav A

Re: [RFC 13/20] mm/tlb: introduce tlb_start_ptes() and tlb_end_ptes()

2021-01-31 Thread Nadav Amit
> On Jan 31, 2021, at 2:07 AM, Damian Tometzki wrote: > > On Sat, 30. Jan 16:11, Nadav Amit wrote: >> From: Nadav Amit >> >> Introduce tlb_start_ptes() and tlb_end_ptes() which would be called >> before and after PTEs are updated and TLB flushes are deferr

Re: [RFC 08/20] mm: store completed TLB generation

2021-01-31 Thread Nadav Amit
> On Jan 31, 2021, at 12:32 PM, Andy Lutomirski wrote: > > On Sat, Jan 30, 2021 at 4:16 PM Nadav Amit wrote: >> From: Nadav Amit >> >> To detect deferred TLB flushes in fine granularity, we need to keep >> track on the completed TLB flush generation for e

Re: [RFC 03/20] mm/mprotect: do not flush on permission promotion

2021-01-31 Thread Nadav Amit
> On Sat, Jan 30, 2021 at 4:16 PM Nadav Amit wrote: >>> diff --git a/mm/mprotect.c b/mm/mprotect.c >>> index 632d5a677d3f..b7473d2c9a1f 100644 >>> --- a/mm/mprotect.c >>> +++ b/mm/mprotect.c >>> @@ -139,7 +139,8 @@ static unsigned long chang

Re: [RFC 00/20] TLB batching consolidation and enhancements

2021-01-31 Thread Nadav Amit
> On Jan 30, 2021, at 11:57 PM, Nadav Amit wrote: > >> On Jan 30, 2021, at 7:30 PM, Nicholas Piggin wrote: >> >> Excerpts from Nadav Amit's message of January 31, 2021 10:11 am: >>> From: Nadav Amit >>> >>> There are currently (at least?)

Re: [RFC 00/20] TLB batching consolidation and enhancements

2021-01-31 Thread Nadav Amit
> On Jan 30, 2021, at 7:30 PM, Nicholas Piggin wrote: > > Excerpts from Nadav Amit's message of January 31, 2021 10:11 am: >> From: Nadav Amit >> >> There are currently (at least?) 5 different TLB batching schemes in the >> kernel: >> >> 1. Using

Re: [RFC 01/20] mm/tlb: fix fullmm semantics

2021-01-30 Thread Nadav Amit
> On Jan 30, 2021, at 5:02 PM, Andy Lutomirski wrote: > > On Sat, Jan 30, 2021 at 4:16 PM Nadav Amit wrote: >> From: Nadav Amit >> >> fullmm in mmu_gather is supposed to indicate that the mm is torn-down >> (e.g., on process exit) and can therefore allow

Re: [RFC 03/20] mm/mprotect: do not flush on permission promotion

2021-01-30 Thread Nadav Amit
> On Jan 30, 2021, at 5:07 PM, Andy Lutomirski wrote: > > Adding Andrew Cooper, who has a distressingly extensive understanding > of the x86 PTE magic. > > On Sat, Jan 30, 2021 at 4:16 PM Nadav Amit wrote: >> From: Nadav Amit >> >> Currently, using mpr

Re: [RFC 00/20] TLB batching consolidation and enhancements

2021-01-30 Thread Nadav Amit
> On Jan 30, 2021, at 4:39 PM, Andy Lutomirski wrote: > > On Sat, Jan 30, 2021 at 4:16 PM Nadav Amit wrote: >> From: Nadav Amit >> >> There are currently (at least?) 5 different TLB batching schemes in the >> kernel: >> >> 1. Using mmu_gather (

[RFC 18/20] mm: make mm_cpumask() volatile

2021-01-30 Thread Nadav Amit
From: Nadav Amit mm_cpumask() is volatile: a bit might be turned on or off at any given moment, and it is not protected by any lock. While the kernel coding guidelines strongly discourage the use of volatile, not marking mm_cpumask() as volatile seems wrong. Cpumask and bitmap

[RFC 16/20] mm/tlb: per-page table generation tracking

2021-01-30 Thread Nadav Amit
From: Nadav Amit Detecting deferred TLB flushes per-VMA has two drawbacks: 1. It requires an atomic cmpxchg to record mm's TLB generation at the time of the last TLB flush, as two deferred TLB flushes on the same VMA can race. 2. It might be too coarse-grained for large VMAs. On 64-bit

[RFC 19/20] lib/cpumask: introduce cpumask_atomic_or()

2021-01-30 Thread Nadav Amit
From: Nadav Amit Introduce cpumask_atomic_or() and bitmask_atomic_or() to perform OR operations atomically on cpumasks. This will be used by the next patch. To be more efficient, skip atomic operations when no changes are needed. Signed-off-by: Nadav Amit Cc: Mel Gorman Cc
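The "skip atomic operations when no changes are needed" optimization can be sketched with C11 atomics. This is a hypothetical stand-alone version of the bitmask-level helper, not the kernel code from the patch:

```c
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical sketch of the bitmask_atomic_or() idea: OR the bits of
 * @src into @dst atomically, word by word, but skip the expensive
 * atomic read-modify-write for words that would not change. A racing
 * writer may still set bits concurrently; the fetch-or keeps each word
 * update atomic.
 */
static void bitmask_atomic_or(_Atomic unsigned long *dst,
			      const unsigned long *src, size_t nwords)
{
	for (size_t i = 0; i < nwords; i++) {
		unsigned long cur = atomic_load_explicit(&dst[i],
							 memory_order_relaxed);

		/* Fast path: every bit in src[i] is already set in dst[i]. */
		if ((cur | src[i]) == cur)
			continue;
		atomic_fetch_or_explicit(&dst[i], src[i],
					 memory_order_relaxed);
	}
}
```

The fast-path check is only a heuristic: a word that looks up to date costs one plain load, while a word needing changes pays for both the load and the atomic fetch-or, which is the trade-off the commit message describes.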

[RFC 20/20] mm/rmap: avoid potential races

2021-01-30 Thread Nadav Amit
From: Nadav Amit flush_tlb_batched_pending() appears to have a theoretical race: tlb_flush_batched is being cleared after the TLB flush, and if in between another core calls set_tlb_ubc_flush_pending() and sets the pending TLB flush indication, this indication might be lost. Holding the page
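The lost-update race described above — a boolean cleared after the flush can swallow a concurrent set — can be illustrated with a sketch. The patch itself relies on the page table lock; the counter-plus-cmpxchg variant below is one alternative way to avoid the lost update, shown here purely as an illustration (names mirror the kernel's but the code is hypothetical):

```c
#include <stdatomic.h>

/*
 * Hypothetical sketch: instead of a boolean that is blindly cleared
 * after the flush, make the pending indication a counter. The flusher
 * snapshots the counter before flushing and resets it with a cmpxchg,
 * so a set_tlb_ubc_flush_pending() that lands during the flush makes
 * the cmpxchg fail and the indication is preserved.
 */
static _Atomic unsigned int tlb_flush_batched;

static void set_tlb_ubc_flush_pending(void)
{
	atomic_fetch_add_explicit(&tlb_flush_batched, 1,
				  memory_order_relaxed);
}

static void flush_tlb_batched_pending(void)
{
	unsigned int batch = atomic_load_explicit(&tlb_flush_batched,
						  memory_order_acquire);
	if (!batch)
		return;

	/* ... the actual TLB flush would happen here ... */

	/* Reset only if no new batch arrived while we were flushing. */
	atomic_compare_exchange_strong_explicit(&tlb_flush_batched,
						&batch, 0,
						memory_order_relaxed,
						memory_order_relaxed);
}
```

If another core increments the counter between the snapshot and the reset, the compare-exchange fails and the next caller still sees pending work, which is exactly the indication the boolean version could lose.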

[RFC 17/20] mm/tlb: updated completed deferred TLB flush conditionally

2021-01-30 Thread Nadav Amit
From: Nadav Amit If all the deferred TLB flushes were completed, there is no need to update the completed TLB flush generation. This update requires an atomic cmpxchg, so we would like to skip it. To do so, save for each mm the last TLB generation in which TLB flushes were deferred. While saving
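Because two flushers can race, the completed-generation record can only ever be advanced, never moved backward. A minimal stand-alone sketch of such a monotonic update (the struct and function names are hypothetical, not the patch's):

```c
#include <stdatomic.h>

/* Hypothetical per-mm record of the newest fully completed TLB flush. */
struct mm_tlb_gen {
	_Atomic unsigned long completed_gen;
};

/*
 * Sketch of recording a completed TLB generation: advance the counter
 * monotonically with a cmpxchg loop. If a racing flusher already
 * recorded a newer generation, our update becomes a no-op - which is
 * also the "skip the cmpxchg" fast path the commit message describes.
 */
static void tlb_update_completed_gen(struct mm_tlb_gen *mm,
				     unsigned long gen)
{
	unsigned long cur = atomic_load_explicit(&mm->completed_gen,
						 memory_order_relaxed);

	/* cur is refreshed by each failed compare-exchange. */
	while (cur < gen &&
	       !atomic_compare_exchange_weak_explicit(&mm->completed_gen,
						      &cur, gen,
						      memory_order_release,
						      memory_order_relaxed))
		;
}
```

The `cur < gen` test before the compare-exchange is what lets an already-up-to-date record skip the atomic entirely, at the cost of one relaxed load.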
