Re: [RFC PATCH v2 3/3] x86/mm/tlb: Avoid deferring PTI flushes on shootdown

2019-08-27 Thread Nadav Amit
> On Aug 27, 2019, at 4:07 PM, Andy Lutomirski wrote: > > On Fri, Aug 23, 2019 at 11:13 PM Nadav Amit wrote: >> When a shootdown is initiated, the initiating CPU has cycles to burn as >> it waits for the responding CPUs to receive the IPI and acknowledge it. >> I

Re: [RFC PATCH v2 2/3] x86/mm/tlb: Defer PTI flushes

2019-08-27 Thread Nadav Amit
> On Aug 27, 2019, at 4:13 PM, Andy Lutomirski wrote: > > On Fri, Aug 23, 2019 at 11:13 PM Nadav Amit wrote: >> INVPCID is considerably slower than INVLPG of a single PTE. Using it to >> flush the user page-tables when PTI is enabled therefore introduces >> significa

Re: [RFC PATCH 0/3] x86/mm/tlb: Defer TLB flushes with PTI

2019-08-27 Thread Nadav Amit
> On Aug 27, 2019, at 4:18 PM, Andy Lutomirski wrote: > > On Fri, Aug 23, 2019 at 11:07 PM Nadav Amit wrote: >> INVPCID is considerably slower than INVLPG of a single PTE, but it is >> currently used to flush PTEs in the user page-table when PTI is used. >> >>

Re: [RFC PATCH v2 2/3] x86/mm/tlb: Defer PTI flushes

2019-08-27 Thread Nadav Amit
> On Aug 27, 2019, at 11:28 AM, Dave Hansen wrote: > > On 8/23/19 3:52 PM, Nadav Amit wrote: >> INVPCID is considerably slower than INVLPG of a single PTE. Using it to >> flush the user page-tables when PTI is enabled therefore introduces >> significant overhead. >

Re: [PATCH] vmw_balloon: Fix offline page marking with compaction

2019-08-26 Thread Nadav Amit
> On Aug 21, 2019, at 1:30 AM, David Hildenbrand wrote: > > On 20.08.19 18:01, Nadav Amit wrote: >> The compaction code already marks pages as offline when it enqueues >> pages in the ballooned page list, and removes the mapping when the pages >> are removed from th

Re: [RFC PATCH v2 1/3] x86/mm/tlb: Change __flush_tlb_one_user interface

2019-08-26 Thread Nadav Amit
> On Aug 26, 2019, at 12:51 AM, Juergen Gross wrote: > > On 24.08.19 00:52, Nadav Amit wrote: >> __flush_tlb_one_user() currently flushes a single entry, and flushes it >> both in the kernel and user page-tables, when PTI is enabled. >> Change __flush_tlb_one_us

Re: [PATCH v4 1/9] smp: Run functions concurrently in smp_call_function_many()

2019-08-26 Thread Nadav Amit
> On Aug 23, 2019, at 3:41 PM, Nadav Amit wrote: > > Currently, on_each_cpu() and similar functions do not exploit the > potential of concurrency: the function is first executed remotely and > only then it is executed locally. Functions such as TLB flush can take > con

Re: [PATCH] x86/mm: Do not split_large_page() for set_kernel_text_rw()

2019-08-26 Thread Nadav Amit
> On Aug 26, 2019, at 8:56 AM, Steven Rostedt wrote: > > On Mon, 26 Aug 2019 15:41:24 +0000 > Nadav Amit wrote: > >>> Anyway, I believe Nadav has some patches that converts ftrace to use >>> the shadow page modification trick somewhere. >> >>

Re: [PATCH] x86/mm: Do not split_large_page() for set_kernel_text_rw()

2019-08-26 Thread Nadav Amit
> On Aug 26, 2019, at 4:33 AM, Steven Rostedt wrote: > > On Fri, 23 Aug 2019 11:36:37 +0200 > Peter Zijlstra wrote: > >> On Thu, Aug 22, 2019 at 10:23:35PM -0700, Song Liu wrote: >>> As 4k pages check was removed from cpa [1], set_kernel_text_rw() leads to >>> split_large_page() for all kernel

[RFC PATCH v2 1/3] x86/mm/tlb: Change __flush_tlb_one_user interface

2019-08-24 Thread Nadav Amit
Ostrovsky Cc: Juergen Gross Cc: Stefano Stabellini Cc: xen-de...@lists.xenproject.org Signed-off-by: Nadav Amit --- arch/x86/include/asm/paravirt.h | 5 ++-- arch/x86/include/asm/paravirt_types.h | 3 ++- arch/x86/include/asm/tlbflush.h | 24 + arch/x86/kernel

[RFC PATCH v2 3/3] x86/mm/tlb: Avoid deferring PTI flushes on shootdown

2019-08-24 Thread Nadav Amit
-table flushes, but it prevents performance regression. Signed-off-by: Nadav Amit --- arch/x86/include/asm/tlbflush.h | 1 + arch/x86/mm/tlb.c | 10 +- 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm

[RFC PATCH v2 2/3] x86/mm/tlb: Defer PTI flushes

2019-08-24 Thread Nadav Amit
ranges before the kernel returns to userspace, the overhead of tracking them can exceed the benefit. In these cases, perform a full TLB flush. It is possible to avoid them in some cases, but the benefit in doing so is questionable. Signed-off-by: Nadav Amit --- arch/x86/entry/calling.h

[RFC PATCH v2 0/3] x86/mm/tlb: Defer TLB flushes with PTI

2019-08-24 Thread Nadav Amit
it might be easier to merge this one first ] RFC v1 -> RFC v2: * Wrong patches were sent before Nadav Amit (3): x86/mm/tlb: Change __flush_tlb_one_user interface x86/mm/tlb: Defer PTI flushes x86/mm/tlb: Avoid deferring PTI flushes on shootdown arch/x86/entry/calling.h |

Re: [RFC PATCH 0/3] x86/mm/tlb: Defer TLB flushes with PTI

2019-08-24 Thread Nadav Amit
Sorry, I made a mistake and included the wrong patches. I will send RFC v2 in a few minutes. > On Aug 23, 2019, at 3:46 PM, Nadav Amit wrote: > > INVPCID is considerably slower than INVLPG of a single PTE, but it is > currently used to flush PTEs in the user page-table when

[RFC PATCH 1/3] x86/mm/tlb: Defer PTI flushes

2019-08-24 Thread Nadav Amit
ranges before the kernel returns to userspace, the overhead of tracking them can exceed the benefit. In these cases, perform a full TLB flush. It is possible to avoid them in some cases, but the benefit in doing so is questionable. Signed-off-by: Nadav Amit --- arch/x86/entry/calling.h

[RFC PATCH 2/3] x86/mm/tlb: Avoid deferring PTI flushes on shootdown

2019-08-24 Thread Nadav Amit
-table flushes, but it prevents performance regression. Signed-off-by: Nadav Amit --- arch/x86/include/asm/tlbflush.h | 1 + arch/x86/mm/tlb.c | 10 +- 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm

[RFC PATCH 0/3] x86/mm/tlb: Defer TLB flushes with PTI

2019-08-24 Thread Nadav Amit
it might be easier to merge this one first ] Nadav Amit (3): x86/mm/tlb: Defer PTI flushes x86/mm/tlb: Avoid deferring PTI flushes on shootdown x86/mm/tlb: Use lockdep irq assertions arch/x86/entry/calling.h| 52 +++-- arch/x86/include/asm/tlbflush.h | 31

[RFC PATCH 3/3] x86/mm/tlb: Use lockdep irq assertions

2019-08-24 Thread Nadav Amit
The assertions that check whether IRQs are disabled depend currently on different debug features. Use instead lockdep_assert_irqs_disabled(), which is standard, enabled by the same debug feature, and provides more information upon failures. Signed-off-by: Nadav Amit --- arch/x86/mm/tlb.c | 5

[PATCH 3/7] x86/percpu: Use C for percpu accesses when possible

2019-08-24 Thread Nadav Amit
in call_timer_fn(). Signed-off-by: Nadav Amit --- arch/x86/include/asm/percpu.h | 115 +++--- 1 file changed, 105 insertions(+), 10 deletions(-) diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h index 1fe348884477..13987f9bc82f 100644 --- a/arch

[PATCH 5/7] percpu: Assume preemption is disabled on per_cpu_ptr()

2019-08-24 Thread Nadav Amit
s to allow further, per-arch optimizations. Signed-off-by: Nadav Amit --- include/asm-generic/percpu.h | 12 include/linux/percpu-defs.h | 33 - 2 files changed, 44 insertions(+), 1 deletion(-) diff --git a/include/asm-generic/percpu.h b/include/asm-generi

[PATCH 1/7] compiler: Report x86 segment support

2019-08-24 Thread Nadav Amit
GCC v6+ supports x86 segment qualifiers (__seg_gs and __seg_fs). Define COMPILER_HAS_X86_SEG_SUPPORT when it is supported. Signed-off-by: Nadav Amit --- include/linux/compiler-gcc.h | 4 1 file changed, 4 insertions(+) diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler
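
The detection described in this patch can be sketched in user space. This is a minimal sketch, not the patch's code: it assumes GCC's documented behavior of predefining `__SEG_FS` and `__SEG_GS` when the `__seg_fs` and `__seg_gs` address spaces are available, whereas the patch keys off the GCC version. `HAS_X86_SEG_SUPPORT` and `has_x86_seg_support()` are hypothetical names used only for illustration.

```c
/*
 * GCC documents that it predefines __SEG_FS and __SEG_GS when the
 * __seg_fs and __seg_gs named address spaces are available.
 * HAS_X86_SEG_SUPPORT is a hypothetical name for this sketch, not the
 * macro the patch adds (COMPILER_HAS_X86_SEG_SUPPORT).
 */
#if defined(__SEG_FS) && defined(__SEG_GS)
#define HAS_X86_SEG_SUPPORT 1
#else
#define HAS_X86_SEG_SUPPORT 0
#endif

/* Returns 1 when the compiler advertises both qualifiers, 0 otherwise. */
static inline int has_x86_seg_support(void)
{
	return HAS_X86_SEG_SUPPORT;
}
```

Code that wants to use the qualifiers can then be guarded with `#if HAS_X86_SEG_SUPPORT` and fall back to inline assembly otherwise.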

[PATCH 4/7] x86: Fix possible caching of current_task

2019-08-24 Thread Nadav Amit
of current in __switch_to()'s dynamic extent. Signed-off-by: Nadav Amit --- arch/x86/include/asm/fpu/internal.h| 7 --- arch/x86/include/asm/resctrl_sched.h | 14 +++--- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 ++-- arch/x86/kernel/process_32.c | 4 ++-- arch/x86

[PATCH 7/7] x86/current: Aggressive caching of current

2019-08-24 Thread Nadav Amit
in a different compilation unit to avoid the compiler from assuming that the value is constant during compilation. Signed-off-by: Nadav Amit --- arch/x86/include/asm/current.h | 30 ++ arch/x86/kernel/cpu/Makefile | 1 + arch/x86/kernel/cpu/common.c | 7 +-- arch

[PATCH 0/7] x86/percpu: Use segment qualifiers

2019-08-24 Thread Nadav Amit
1310 (0.09%). RFC->v1: * Fixing i386 build bug * Moving chunk to the right place [Peter] Nadav Amit (7): compiler: Report x86 segment support x86/percpu: Use compiler segment prefix qualifier x86/percpu: Use C for percpu accesses when possible x86: Fix possible caching of current_task p

[PATCH 2/7] x86/percpu: Use compiler segment prefix qualifier

2019-08-24 Thread Nadav Amit
variables, and do casting using the segment qualifier instead. Signed-off-by: Nadav Amit --- arch/x86/include/asm/percpu.h | 153 ++--- arch/x86/include/asm/preempt.h | 3 +- 2 files changed, 104 insertions(+), 52 deletions(-) diff --git a/arch/x86/include/asm

[PATCH 6/7] x86/percpu: Optimized arch_raw_cpu_ptr()

2019-08-24 Thread Nadav Amit
in rcu_dynticks_eqs_exit(), the following code: mov$0x2bbc0,%rax add%gs:0x7ef07570(%rip),%rax # 0x10358 lock xadd %edx,0xd8(%rax) Turns with this patch into: mov%gs:0x7ef08aa5(%rip),%rax # 0x10358 lock xadd %edx,0x2bc58(%rax) Signed-off-by: Nadav Amit --- arch

[PATCH v4 2/9] x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote()

2019-08-24 Thread Nadav Amit
: Thomas Gleixner Cc: Andy Lutomirski Cc: Josh Poimboeuf Signed-off-by: Nadav Amit --- arch/x86/include/asm/tlbflush.h | 5 +- arch/x86/mm/tlb.c | 85 +++-- 2 files changed, 41 insertions(+), 49 deletions(-) diff --git a/arch/x86/include/asm/tlbflush.h b

[PATCH v4 5/9] x86/mm/tlb: Privatize cpu_tlbstate

2019-08-24 Thread Nadav Amit
Signed-off-by: Nadav Amit --- arch/x86/include/asm/tlbflush.h | 39 ++--- arch/x86/mm/init.c | 2 +- arch/x86/mm/tlb.c | 15 - 3 files changed, 31 insertions(+), 25 deletions(-) diff --git a/arch/x86/include/asm/tlbflush.h b/arch

[PATCH v4 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019-08-24 Thread Nadav Amit
.org Cc: k...@vger.kernel.org Cc: xen-de...@lists.xenproject.org Reviewed-by: Michael Kelley # Hyper-v parts Reviewed-by: Juergen Gross # Xen and paravirt parts Reviewed-by: Dave Hansen Signed-off-by: Nadav Amit --- arch/x86/hyperv/mmu.c | 10 +++--- arch/x86/include/asm/paravirt.h

[PATCH v4 9/9] x86/mm/tlb: Remove unnecessary uses of the inline keyword

2019-08-24 Thread Nadav Amit
The compiler is smart enough without these hints. Cc: Andy Lutomirski Cc: Peter Zijlstra Suggested-by: Dave Hansen Reviewed-by: Dave Hansen Signed-off-by: Nadav Amit --- arch/x86/mm/tlb.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86

[PATCH v4 1/9] smp: Run functions concurrently in smp_call_function_many()

2019-08-24 Thread Nadav Amit
Hansen Cc: Peter Zijlstra Cc: Rik van Riel Cc: Thomas Gleixner Cc: Andy Lutomirski Cc: Josh Poimboeuf Signed-off-by: Nadav Amit --- include/linux/smp.h | 34 --- kernel/smp.c| 138 +--- 2 files changed, 92 insertions(+), 80 deletions

[PATCH v4 3/9] x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()

2019-08-24 Thread Nadav Amit
-off-by: Nadav Amit --- arch/x86/mm/tlb.c | 40 +--- 1 file changed, 33 insertions(+), 7 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 2674f55ed9a1..c3ca3545d78a 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -653,11 +653,13

[PATCH v4 0/9] x86/tlb: Concurrent TLB flushes

2019-08-24 Thread Nadav Amit
ualizat...@lists.linux-foundation.org Cc: x...@kernel.org Cc: xen-de...@lists.xenproject.org Nadav Amit (9): smp: Run functions concurrently in smp_call_function_many() x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote() x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy

[PATCH v4 6/9] x86/mm/tlb: Do not make is_lazy dirty for no reason

2019-08-24 Thread Nadav Amit
Hansen Signed-off-by: Nadav Amit --- arch/x86/mm/tlb.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 24c9839e3d9b..1393b3cd3697 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -322,7 +322,8 @@ void switch_mm_irqs_off

[PATCH v4 7/9] cpumask: Mark functions as pure

2019-08-24 Thread Nadav Amit
cpumask_next_and() and cpumask_any_but() are pure, and marking them as such seems to generate different and presumably better code for native_flush_tlb_multi(). Reviewed-by: Dave Hansen Signed-off-by: Nadav Amit --- include/linux/cpumask.h | 6 +++--- 1 file changed, 3 insertions(+), 3
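
What marking a function as pure means can be illustrated outside the kernel. The sketch below uses GCC's `pure` function attribute on a hypothetical bit-search helper, `next_set_bit()`, which only stands in for the cpumask functions named above and is not their implementation. A pure function may read memory but has no side effects, so the compiler is allowed to merge repeated calls with the same arguments when nothing is written in between.

```c
#include <limits.h>

#define BITS_PER_WORD (CHAR_BIT * sizeof(unsigned long))

/*
 * Hypothetical helper: index of the first set bit at or after 'start',
 * or 'nbits' if there is none. It reads 'mask' but modifies nothing,
 * which is what the pure attribute promises to the compiler.
 */
static __attribute__((pure)) unsigned int
next_set_bit(const unsigned long *mask, unsigned int nbits, unsigned int start)
{
	for (unsigned int i = start; i < nbits; i++) {
		if (mask[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))
			return i;
	}
	return nbits;
}

/*
 * Usage: both calls take identical arguments and no store happens in
 * between, so the compiler may evaluate next_set_bit() only once.
 */
static unsigned int twice_first_bit(const unsigned long *mask, unsigned int nbits)
{
	return next_set_bit(mask, nbits, 0) + next_set_bit(mask, nbits, 0);
}
```

Whether the calls are actually merged depends on the compiler and optimization level; the attribute only grants permission.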

[PATCH v4 8/9] x86/mm/tlb: Remove UV special case

2019-08-24 Thread Nadav Amit
SGI UV TLB flushes is outdated and will be replaced with compatible smp_call_many APIC function in the future. For now, simplify the code by removing the UV special case. Cc: Peter Zijlstra Suggested-by: Andy Lutomirski Acked-by: Mike Travis Reviewed-by: Dave Hansen Signed-off-by: Nadav Amit

Re: [PATCH] KVM: x86: Don't update RIP or do single-step on faulting emulation

2019-08-23 Thread Nadav Amit
ction > fixes a bug reported by Andy Lutomirski where a #UD on SYSCALL due to > invalid state with RFLAGS.RF=1 would loop indefinitely due to emulation > overwriting the #UD with #DB and thus restarting the bad SYSCALL over > and over. > > Cc: Nadav Amit > Cc: sta...@vger.kernel.org >

Re: [PATCH] mm/balloon_compaction: suppress allocation warnings

2019-08-21 Thread Nadav Amit
> On Aug 21, 2019, at 12:13 PM, David Hildenbrand wrote: > > On 21.08.19 18:34, Nadav Amit wrote: >>> On Aug 21, 2019, at 9:29 AM, David Hildenbrand wrote: >>> >>> On 21.08.19 18:23, Nadav Amit wrote: >>>>> On Aug 21, 2019, at 9:05 AM, David

Re: [PATCH v2] mm/balloon_compaction: Informative allocation warnings

2019-08-21 Thread Nadav Amit
> On Aug 21, 2019, at 12:12 PM, David Hildenbrand wrote: > > On 21.08.19 21:10, Nadav Amit wrote: >>> On Aug 21, 2019, at 12:06 PM, David Hildenbrand wrote: >>> >>> On 21.08.19 20:59, Nadav Amit wrote: >>>>> On Aug 21, 2019, at 11:57 AM, David

Re: [PATCH v2] mm/balloon_compaction: Informative allocation warnings

2019-08-21 Thread Nadav Amit
> On Aug 21, 2019, at 12:06 PM, David Hildenbrand wrote: > > On 21.08.19 20:59, Nadav Amit wrote: >>> On Aug 21, 2019, at 11:57 AM, David Hildenbrand wrote: >>> >>> On 21.08.19 11:41, Nadav Amit wrote: >>>> There is no reason to print generic wa

Re: [PATCH v2] mm/balloon_compaction: Informative allocation warnings

2019-08-21 Thread Nadav Amit
> On Aug 21, 2019, at 11:57 AM, David Hildenbrand wrote: > > On 21.08.19 11:41, Nadav Amit wrote: >> There is no reason to print generic warnings when balloon memory >> allocation fails, as failures are expected and can be handled >> gracefully. Since VMware balloon

[PATCH v2] mm/balloon_compaction: Informative allocation warnings

2019-08-21 Thread Nadav Amit
the same behavior that the balloon had before. Since such warnings can still be useful to indicate that the balloon is over-inflated, print more informative and less frightening warning if allocation fails instead. Cc: David Hildenbrand Cc: Jason Wang Signed-off-by: Nadav Amit --- v1->

Re: [PATCH] mm/balloon_compaction: suppress allocation warnings

2019-08-21 Thread Nadav Amit
> On Aug 21, 2019, at 9:29 AM, David Hildenbrand wrote: > > On 21.08.19 18:23, Nadav Amit wrote: >>> On Aug 21, 2019, at 9:05 AM, David Hildenbrand wrote: >>> >>> On 20.08.19 11:16, Nadav Amit wrote: >>>> There is no reason to p

Re: [PATCH] mm/balloon_compaction: suppress allocation warnings

2019-08-21 Thread Nadav Amit
> On Aug 21, 2019, at 9:05 AM, David Hildenbrand wrote: > > On 20.08.19 11:16, Nadav Amit wrote: >> There is no reason to print warnings when balloon page allocation fails, >> as they are expected and can be handled gracefully. Since VMware >> balloon now uses balloo

[PATCH] VMCI: Release resource if the work is already queued

2019-08-20 Thread Nadav Amit
s not check status"). Fixes: 83e2ec765be03 ("VMCI: doorbell implementation.") Reported-by: Francois Rigault Cc: Jorgen Hansen Cc: Adit Ranadive Cc: Alexios Zavras Cc: Vishnu DASA Cc: sta...@vger.kernel.org Signed-off-by: Nadav Amit --- drivers/misc/vmw_vmci/vmci_doorbell.c | 6

[PATCH] vmw_balloon: Fix offline page marking with compaction

2019-08-20 Thread Nadav Amit
rectly by the balloon compaction logic. Fixes: 83a8afa72e9c ("vmw_balloon: Compaction support") Cc: David Hildenbrand Reported-by: Thomas Hellstrom Signed-off-by: Nadav Amit --- drivers/misc/vmw_balloon.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/misc

[PATCH] mm/balloon_compaction: suppress allocation warnings

2019-08-20 Thread Nadav Amit
behavior that the balloon had before. Cc: Jason Wang Signed-off-by: Nadav Amit --- mm/balloon_compaction.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c index 798275a51887..26de020aae7b 100644 --- a/mm/balloon_compaction.c

[PATCH] iommu/vt-d: Fix wrong analysis whether devices share the same bus

2019-08-20 Thread Nadav Amit
mu/vt-d: Allow interrupts from the entire bus for aliased devices") Cc: sta...@vger.kernel.org Cc: Logan Gunthorpe Cc: David Woodhouse Cc: Joerg Roedel Cc: Jacob Pan Signed-off-by: Nadav Amit --- drivers/iommu/intel_irq_remapping.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletion

Re: [PATCH v3 8/9] x86/mm/tlb: Remove UV special case

2019-07-30 Thread Nadav Amit
> On Jul 18, 2019, at 7:25 PM, Mike Travis wrote: > > It is a fact that the UV is still the UV and SGI is now part of HPE. The > current external product is known as SuperDome Flex. It is both up to date > as well as very well maintained. The ACK I provided was an okay to change > the code, but

Re: [PATCH v3 1/9] smp: Run functions concurrently in smp_call_function_many()

2019-07-25 Thread Nadav Amit
> On Jul 25, 2019, at 5:36 AM, Thomas Gleixner wrote: > > On Mon, 22 Jul 2019, Nadav Amit wrote: >>> On Jul 22, 2019, at 11:51 AM, Thomas Gleixner wrote: >>> void on_each_cpu(void (*func) (void *info), void *info, int wait) >>> { >>> unsign

Re: [RFC 7/7] x86/current: Aggressive caching of current

2019-07-22 Thread Nadav Amit
> On Jul 22, 2019, at 2:07 PM, Peter Zijlstra wrote: > > On Thu, Jul 18, 2019 at 10:41:10AM -0700, Nadav Amit wrote: >> The current_task is supposed to be constant in each thread and therefore >> does not need to be reread. There is already an attempt to cache it >> u

Re: [RFC 3/7] x86/percpu: Use C for percpu accesses when possible

2019-07-22 Thread Nadav Amit
> On Jul 22, 2019, at 1:52 PM, Peter Zijlstra wrote: > > On Thu, Jul 18, 2019 at 10:41:06AM -0700, Nadav Amit wrote: > >> diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h >> index 99a7fa9ab0a3..60f97b288004 100644 >> --- a/arch/x86/inclu

Re: [PATCH v3 3/9] x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()

2019-07-22 Thread Nadav Amit
> On Jul 22, 2019, at 12:47 PM, Rasmus Villemoes > wrote: > > On 19/07/2019 02.58, Nadav Amit wrote: > >> /* >> @@ -865,7 +893,7 @@ void arch_tlbbatch_flush(struct >> arch_tlbflush_unmap_batch *batch) >> if (cpumask_test_cpu(cpu, &batch->cpumask)) {

Re: [PATCH v3 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019-07-22 Thread Nadav Amit
> On Jul 22, 2019, at 12:14 PM, Peter Zijlstra wrote: > > On Thu, Jul 18, 2019 at 05:58:32PM -0700, Nadav Amit wrote: >> @@ -709,8 +716,9 @@ void native_flush_tlb_others(const struct cpumask >> *cpumask, >> * doing a speculative memory access. >>

Re: [PATCH v3 1/9] smp: Run functions concurrently in smp_call_function_many()

2019-07-22 Thread Nadav Amit
> On Jul 22, 2019, at 11:51 AM, Thomas Gleixner wrote: > > On Mon, 22 Jul 2019, Nadav Amit wrote: >>> On Jul 22, 2019, at 11:37 AM, Thomas Gleixner wrote: >>> >>> On Mon, 22 Jul 2019, Peter Zijlstra wrote: >>> >>>> O

Re: [PATCH v3 1/9] smp: Run functions concurrently in smp_call_function_many()

2019-07-22 Thread Nadav Amit
> On Jul 22, 2019, at 11:16 AM, Peter Zijlstra wrote: > > On Fri, Jul 19, 2019 at 11:23:06AM -0700, Dave Hansen wrote: >> On 7/18/19 5:58 PM, Nadav Amit wrote: >>> @@ -624,16 +622,11 @@ EXPORT_SYMBOL(on_each_cpu); >>> void on_each_cpu_mask(const struct cp

Re: [PATCH v3 1/9] smp: Run functions concurrently in smp_call_function_many()

2019-07-22 Thread Nadav Amit
> On Jul 22, 2019, at 11:37 AM, Thomas Gleixner wrote: > > On Mon, 22 Jul 2019, Peter Zijlstra wrote: > >> On Thu, Jul 18, 2019 at 05:58:29PM -0700, Nadav Amit wrote: >>> +/* >>> + * Call a function on all processors. May be used during early boot while >

Re: [PATCH v3 1/9] smp: Run functions concurrently in smp_call_function_many()

2019-07-22 Thread Nadav Amit
> On Jul 22, 2019, at 11:21 AM, Peter Zijlstra wrote: > > On Thu, Jul 18, 2019 at 05:58:29PM -0700, Nadav Amit wrote: >> +/* >> + * Call a function on all processors. May be used during early boot while >> + * early_boot_irqs_disabled is set. >> + */

Re: [PATCH v3 5/9] x86/mm/tlb: Privatize cpu_tlbstate

2019-07-21 Thread Nadav Amit
> On Jul 19, 2019, at 11:38 AM, Dave Hansen wrote: > > On 7/18/19 5:58 PM, Nadav Amit wrote: >> +struct tlb_state_shared { >> +/* >> + * We can be in one of several states: >> + * >> + * - Actively using an mm. Our CPU's bit will be se

Re: [PATCH v3 3/9] x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()

2019-07-19 Thread Nadav Amit
> On Jul 19, 2019, at 3:44 PM, Joe Perches wrote: > > On Fri, 2019-07-19 at 18:41 +0000, Nadav Amit wrote: >>> On Jul 19, 2019, at 11:36 AM, Dave Hansen wrote: >>> >>> On 7/18/19 5:58 PM, Nadav Amit wrote: >>>> @@ -865,7 +893,7 @@ void arch_tlbba

Re: [PATCH v3 5/9] x86/mm/tlb: Privatize cpu_tlbstate

2019-07-19 Thread Nadav Amit
> On Jul 19, 2019, at 11:48 AM, Dave Hansen wrote: > > On 7/19/19 11:43 AM, Nadav Amit wrote: >> Andy said that for the lazy tlb optimizations there might soon be more >> shared state. If you prefer, I can move is_lazy outside of tlb_state, and >> not set it in any

Re: [PATCH v3 5/9] x86/mm/tlb: Privatize cpu_tlbstate

2019-07-19 Thread Nadav Amit
> On Jul 19, 2019, at 11:38 AM, Dave Hansen wrote: > > On 7/18/19 5:58 PM, Nadav Amit wrote: >> +struct tlb_state_shared { >> +/* >> + * We can be in one of several states: >> + * >> + * - Actively using an mm. Our CPU's bit will be se

Re: [PATCH v3 3/9] x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()

2019-07-19 Thread Nadav Amit
> On Jul 19, 2019, at 11:36 AM, Dave Hansen wrote: > > On 7/18/19 5:58 PM, Nadav Amit wrote: >> @@ -865,7 +893,7 @@ void arch_tlbbatch_flush(struct >> arch_tlbflush_unmap_batch *batch) >> if (cpumask_test_cpu(cpu, &batch->cpumask)) { >>

Re: [PATCH v3 8/9] x86/mm/tlb: Remove UV special case

2019-07-18 Thread Nadav Amit
> On Jul 18, 2019, at 7:25 PM, Mike Travis wrote: > > It is a fact that the UV is still the UV and SGI is now part of HPE. The > current external product is known as SuperDome Flex. It is both up to date > as well as very well maintained. The ACK I provided was an okay to change > the code,

[RFC 6/7] x86/percpu: Optimized arch_raw_cpu_ptr()

2019-07-18 Thread Nadav Amit
in rcu_dynticks_eqs_exit(), the following code: mov$0x2bbc0,%rax add%gs:0x7ef07570(%rip),%rax # 0x10358 lock xadd %edx,0xd8(%rax) Turns with this patch into: mov%gs:0x7ef08aa5(%rip),%rax # 0x10358 lock xadd %edx,0x2bc58(%rax) Signed-off-by: Nadav Amit --- arch

[RFC 4/7] x86: Fix possible caching of current_task

2019-07-18 Thread Nadav Amit
of current in __switch_to()'s dynamic extent. Signed-off-by: Nadav Amit --- arch/x86/include/asm/fpu/internal.h| 7 --- arch/x86/include/asm/resctrl_sched.h | 14 +++--- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 ++-- arch/x86/kernel/process_32.c | 4 ++-- arch/x86

[RFC 3/7] x86/percpu: Use C for percpu accesses when possible

2019-07-18 Thread Nadav Amit
in call_timer_fn(). Signed-off-by: Nadav Amit --- arch/x86/include/asm/percpu.h | 115 ++--- arch/x86/include/asm/preempt.h | 3 +- 2 files changed, 107 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h index

[RFC 1/7] compiler: Report x86 segment support

2019-07-18 Thread Nadav Amit
GCC v6+ supports x86 segment qualifiers (__seg_gs and __seg_fs). Define COMPILER_HAS_X86_SEG_SUPPORT when it is supported. Signed-off-by: Nadav Amit --- include/linux/compiler-gcc.h | 4 1 file changed, 4 insertions(+) diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler

[RFC 0/7] x86/percpu: Use segment qualifiers

2019-07-18 Thread Nadav Amit
9840 (0.09%). Nadav Amit (7): compiler: Report x86 segment support x86/percpu: Use compiler segment prefix qualifier x86/percpu: Use C for percpu accesses when possible x86: Fix possible caching of current_task percpu: Assume preemption is disabled on per_cpu_ptr() x86/percpu: Optimized arch_r

[RFC 2/7] x86/percpu: Use compiler segment prefix qualifier

2019-07-18 Thread Nadav Amit
variables, and do casting using the segment qualifier instead. Signed-off-by: Nadav Amit --- arch/x86/include/asm/percpu.h | 153 ++ 1 file changed, 102 insertions(+), 51 deletions(-) diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h index

[RFC 5/7] percpu: Assume preemption is disabled on per_cpu_ptr()

2019-07-18 Thread Nadav Amit
s to allow further, per-arch optimizations. Signed-off-by: Nadav Amit --- include/asm-generic/percpu.h | 12 include/linux/percpu-defs.h | 33 - 2 files changed, 44 insertions(+), 1 deletion(-) diff --git a/include/asm-generic/percpu.h b/include/asm-generi

[RFC 7/7] x86/current: Aggressive caching of current

2019-07-18 Thread Nadav Amit
in a different compilation unit to avoid the compiler from assuming that the value is constant during compilation. Signed-off-by: Nadav Amit --- arch/x86/include/asm/current.h | 30 ++ arch/x86/kernel/cpu/Makefile | 1 + arch/x86/kernel/cpu/common.c | 7 +-- arch

[PATCH v3 2/9] x86/mm/tlb: Remove reason as argument for flush_tlb_func_local()

2019-07-18 Thread Nadav Amit
Poimboeuf Signed-off-by: Nadav Amit --- arch/x86/mm/tlb.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 4de9704c4aaf..233f3d8308db 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -635,9 +635,12 @@ static void

[PATCH v3 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019-07-18 Thread Nadav Amit
ists.linux-foundation.org Cc: k...@vger.kernel.org Cc: xen-de...@lists.xenproject.org Signed-off-by: Nadav Amit --- arch/x86/hyperv/mmu.c | 10 +++--- arch/x86/include/asm/paravirt.h | 6 ++-- arch/x86/include/asm/paravirt_types.h | 4 +-- arch/x86/include/asm/tlbflush.h

[PATCH v3 9/9] x86/mm/tlb: Remove unnecessary uses of the inline keyword

2019-07-18 Thread Nadav Amit
The compiler is smart enough without these hints. Cc: Andy Lutomirski Cc: Peter Zijlstra Suggested-by: Dave Hansen Signed-off-by: Nadav Amit --- arch/x86/mm/tlb.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index

[PATCH v3 5/9] x86/mm/tlb: Privatize cpu_tlbstate

2019-07-18 Thread Nadav Amit
-off-by: Nadav Amit --- arch/x86/include/asm/tlbflush.h | 39 ++--- arch/x86/mm/init.c | 2 +- arch/x86/mm/tlb.c | 15 - 3 files changed, 31 insertions(+), 25 deletions(-) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86

[PATCH v3 3/9] x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()

2019-07-18 Thread Nadav Amit
-by: Nadav Amit --- arch/x86/mm/tlb.c | 40 ++-- 1 file changed, 34 insertions(+), 6 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 233f3d8308db..abbf55fa8b81 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -658,11 +658,13 @@ static

[PATCH v3 8/9] x86/mm/tlb: Remove UV special case

2019-07-18 Thread Nadav Amit
SGI UV support is outdated and not maintained, and it is not clear how it performs relatively to non-UV. Remove the code to simplify the code. Cc: Peter Zijlstra Cc: Dave Hansen Acked-by: Mike Travis Suggested-by: Andy Lutomirski Signed-off-by: Nadav Amit --- arch/x86/mm/tlb.c | 25

[PATCH v3 6/9] x86/mm/tlb: Do not make is_lazy dirty for no reason

2019-07-18 Thread Nadav Amit
-by: Nadav Amit --- arch/x86/mm/tlb.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index af80c274c88d..89f83ad19507 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -322,7 +322,8 @@ void switch_mm_irqs_off(struct mm_struct *prev

[PATCH v3 7/9] cpumask: Mark functions as pure

2019-07-18 Thread Nadav Amit
cpumask_next_and() and cpumask_any_but() are pure, and marking them as such seems to generate different and presumably better code for native_flush_tlb_multi(). Signed-off-by: Nadav Amit --- include/linux/cpumask.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git

[PATCH v3 0/9] x86: Concurrent TLB flushes

2019-07-18 Thread Nadav Amit
izat...@lists.linux-foundation.org Cc: x...@kernel.org Cc: xen-de...@lists.xenproject.org Nadav Amit (9): smp: Run functions concurrently in smp_call_function_many() x86/mm/tlb: Remove reason as argument for flush_tlb_func_local() x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy() x86/m

[PATCH v3 1/9] smp: Run functions concurrently in smp_call_function_many()

2019-07-18 Thread Nadav Amit
Cc: Dave Hansen Cc: Rik van Riel Cc: Thomas Gleixner Cc: Andy Lutomirski Cc: Josh Poimboeuf Signed-off-by: Nadav Amit --- include/linux/smp.h | 27 ++--- kernel/smp.c| 133 +--- 2 files changed, 83 insertions(+), 77 deletions(-) diff --git

Re: [x86/modules] f2c65fb322: will-it-scale.per_process_ops -2.9% regression

2019-07-18 Thread Nadav Amit
> On Jul 18, 2019, at 2:50 AM, kernel test robot wrote: > > Greeting, > > FYI, we noticed a -2.9% regression of will-it-scale.per_process_ops due to > commit: > > > commit: f2c65fb3221adc6b73b0549fc7ba892022db9797 ("x86/modules: Avoid > breaking W^X while loading modules") >

Re: [PATCH v2] mm/balloon_compaction: avoid duplicate page removal

2019-07-18 Thread Nadav Amit
anges. Pls > take a look. Thanks (Wei, Michael) for taking care of it. Please cc me on future iterations of the patch. Acked-by: Nadav Amit

Re: [PATCH 0/3] resource: find_next_iomem_res() improvements

2019-07-16 Thread Nadav Amit
> On Jul 16, 2019, at 3:20 PM, Dan Williams wrote: > > On Tue, Jul 16, 2019 at 3:13 PM Nadav Amit wrote: >>> On Jul 16, 2019, at 3:07 PM, Dan Williams wrote: >>> >>> On Tue, Jul 16, 2019 at 3:01 PM Andrew Morton >>> wrote: >>>

Re: [PATCH 0/3] resource: find_next_iomem_res() improvements

2019-07-16 Thread Nadav Amit
> On Jul 16, 2019, at 3:07 PM, Dan Williams wrote: > > On Tue, Jul 16, 2019 at 3:01 PM Andrew Morton > wrote: >> On Tue, 18 Jun 2019 21:56:43 +0000 Nadav Amit wrote: >> >>>> ...and is constant for the life of the device and all subsequent mappings.

Re: [PATCH 0/3] resource: find_next_iomem_res() improvements

2019-07-16 Thread Nadav Amit
> On Jul 16, 2019, at 3:00 PM, Andrew Morton wrote: > > On Tue, 18 Jun 2019 21:56:43 +0000 Nadav Amit wrote: > >>> ...and is constant for the life of the device and all subsequent mappings. >>> >>>> Perhaps you want to cache the cachability-mode in

Re: [PATCH v2] x86/paravirt: Drop {read,write}_cr8() hooks

2019-07-15 Thread Nadav Amit
> On Jul 15, 2019, at 4:30 PM, Andrew Cooper wrote: > > On 15/07/2019 19:17, Nadav Amit wrote: >>> On Jul 15, 2019, at 8:16 AM, Andrew Cooper >>> wrote: >>> >>> There is a lot of infrastructure for functionality which is used >>

Re: [PATCH] vmw_balloon: Remove Julien from the maintainers list

2019-07-15 Thread Nadav Amit
> On Jul 14, 2019, at 6:41 AM, Julien Freche wrote: > > > > On Jul 14, 2019, at 12:18 AM, Nadav Amit wrote: > >>> On Jul 13, 2019, at 7:49 AM, Greg Kroah-Hartman >>> wrote: >>> >>>> On Tue, Jul 02, 2019 at 03:05:19AM -0700, Nadav

Re: [PATCH v2] x86/paravirt: Drop {read,write}_cr8() hooks

2019-07-15 Thread Nadav Amit
> On Jul 15, 2019, at 8:16 AM, Andrew Cooper wrote: > > There is a lot of infrastructure for functionality which is used > exclusively in __{save,restore}_processor_state() on the suspend/resume > path. > > cr8 is an alias of APIC_TASKPRI, and APIC_TASKPRI is saved/restored by >

Re: [PATCH] x86/apic: Initialize TPR to block interrupts 16-31

2019-07-14 Thread Nadav Amit
icious hardware. Any PCI or similar hardware that > can be controlled by an attacker MUST be behind a functional IOMMU > that remaps interrupts. The purpose of this patch is to reduce the > chance that a certain class of device malfunctions crashes the > kernel in hard-to-debug ways.

Re: [PATCH] vmw_balloon: Remove Julien from the maintainers list

2019-07-14 Thread Nadav Amit
> On Jul 13, 2019, at 7:49 AM, Greg Kroah-Hartman > wrote: > > On Tue, Jul 02, 2019 at 03:05:19AM -0700, Nadav Amit wrote: >> Julien will not be a maintainer anymore. >> >> Signed-off-by: Nadav Amit >> --- >> MAINTAINERS | 1 - >> 1 file changed,

Re: [GIT PULL] x86/topology changes for v5.3

2019-07-11 Thread Nadav Amit
> On Jul 11, 2019, at 8:08 AM, Kees Cook wrote: > > On Thu, Jul 11, 2019 at 10:01:34AM +0200, Peter Zijlstra wrote: >> On Thu, Jul 11, 2019 at 07:11:19AM +0000, Nadav Amit wrote: >>>> On Jul 10, 2019, at 7:22 AM, Jiri Kosina wrote: >>>> >>

Re: [GIT PULL] x86/topology changes for v5.3

2019-07-11 Thread Nadav Amit
> On Jul 10, 2019, at 7:22 AM, Jiri Kosina wrote: > > On Wed, 10 Jul 2019, Peter Zijlstra wrote: > >> If we mark the key as RO after init, and then try and modify the key to >> link module usage sites, things might go bang as described. >> >> Thanks! >> >> >> diff --git

Re: [PATCH v2 8/9] x86/mm/tlb: Remove UV special case

2019-07-09 Thread Nadav Amit
> On Jul 9, 2019, at 1:29 PM, Mike Travis wrote: > > > > On 7/9/2019 1:09 PM, Russ Anderson wrote: >> On Tue, Jul 09, 2019 at 09:50:27PM +0200, Thomas Gleixner wrote: >>> On Tue, 2 Jul 2019, Nadav Amit wrote: >>> >>>> SGI UV support is o

Re: linux-next: manual merge of the char-misc tree with the driver-core tree

2019-07-08 Thread Nadav Amit
> On Jul 8, 2019, at 4:20 PM, Stephen Rothwell wrote: > > Hi all, > > On Thu, 13 Jun 2019 15:53:44 +1000 Stephen Rothwell > wrote: >> Today's linux-next merge of the char-misc tree got a conflict in: >> >> drivers/misc/vmw_balloon.c >> >> between commit: >> >> 225afca60b8a ("vmw_balloon:

Re: [patch V2 04/25] x86/apic: Make apic_pending_intr_clear() more robust

2019-07-05 Thread Nadav Amit
> On Jul 5, 2019, at 8:47 AM, Andrew Cooper wrote: > > On 04/07/2019 16:51, Thomas Gleixner wrote: >> 2) The loop termination logic is interesting at best. >> >> If the machine has no TSC or cpu_khz is not known yet it tries 1 >> million times to ack stale IRR/ISR bits. What? >> >>

Re: [PATCH] KVM: LAPIC: ARBPRI is a reserved register for x2APIC

2019-07-05 Thread Nadav Amit
> On Jul 5, 2019, at 6:43 AM, Paolo Bonzini wrote: > > On 05/07/19 15:37, Nadav Amit wrote: >>> On Jul 5, 2019, at 5:14 AM, Paolo Bonzini wrote: >>> >>> kvm-unit-tests were adjusted to match bare metal behavior, but KVM >>> itself was not doing wh

Re: [PATCH] KVM: LAPIC: ARBPRI is a reserved register for x2APIC

2019-07-05 Thread Nadav Amit
> On Jul 5, 2019, at 5:14 AM, Paolo Bonzini wrote: > > kvm-unit-tests were adjusted to match bare metal behavior, but KVM > itself was not doing what bare metal does; fix that. > > Signed-off-by: Paolo Bonzini Reported-by ?

Re: [patch V2 21/25] x86/smp: Enhance native_send_call_func_ipi()

2019-07-04 Thread Nadav Amit
de. That allows to remove the extra cpumask comparison with > cpu_callout_mask. > > Reported-by: Nadav Amit > Signed-off-by: Thomas Gleixner > --- > V2: New patch > --- > arch/x86/kernel/apic/ipi.c | 24 +++- > 1 file changed, 11 insertions(+), 13 deletions(-) >
