[RFC PATCH 66/73] x86/pvm: Use new cpu feature to describe XENPV and PVM

2024-02-26 Thread Lai Jiangshan
it is not a paravirtual guest. Signed-off-by: Hou Wenlong Signed-off-by: Lai Jiangshan --- arch/x86/entry/entry_64.S | 5 ++--- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/paravirt.h | 14 +++--- arch/x86/kernel/pvm.c | 1 + arch/x86/xen

Re: [PATCH 2/2] KVM: x86: Fix split-irqchip vs interrupt injection window request

2021-04-15 Thread Lai Jiangshan
On Thu, Apr 15, 2021 at 2:07 PM Paolo Bonzini wrote: > > On 15/04/21 02:59, Lai Jiangshan wrote: > > The next call to inject_pending_event() will reach here AT FIRST with > > vcpu->arch.exception.injected==false and vcpu->arch.exception.pending==false > >

Re: [PATCH 2/2] KVM: x86: Fix split-irqchip vs interrupt injection window request

2021-04-14 Thread Lai Jiangshan
On Thu, Apr 15, 2021 at 12:58 AM Paolo Bonzini wrote: > > On 14/04/21 04:28, Lai Jiangshan wrote: > > On Tue, Apr 13, 2021 at 8:15 PM Paolo Bonzini wrote: > >> > >> On 13/04/21 13:03, Lai Jiangshan wrote: > >>> This patch claims that it has a pla

Re: [PATCH 2/2] KVM: x86: Fix split-irqchip vs interrupt injection window request

2021-04-13 Thread Lai Jiangshan
On Tue, Apr 13, 2021 at 8:15 PM Paolo Bonzini wrote: > > On 13/04/21 13:03, Lai Jiangshan wrote: > > This patch claims that it has a place to > > stash the IRQ when EFLAGS.IF=0, but inject_pending_event() seems to ignore > > EFLAGS.IF and queues the IRQ to the guest dire

Re: [PATCH 2/2] KVM: x86: Fix split-irqchip vs interrupt injection window request

2021-04-13 Thread Lai Jiangshan
On Tue, Apr 13, 2021 at 5:43 AM Sean Christopherson wrote: > > On Fri, Apr 09, 2021, Lai Jiangshan wrote: > > On Fri, Nov 27, 2020 at 7:26 PM Paolo Bonzini wrote: > > > > > > kvm_cpu_accept_dm_intr and kvm_vcpu_ready_for_interrupt_injection are > > > a hod

Re: [PATCH 2/2] KVM: x86: Fix split-irqchip vs interrupt injection window request

2021-04-09 Thread Lai Jiangshan
On Fri, Nov 27, 2020 at 7:26 PM Paolo Bonzini wrote: > > kvm_cpu_accept_dm_intr and kvm_vcpu_ready_for_interrupt_injection are > a hodge-podge of conditions, hacked together to get something that > more or less works. But what is actually needed is much simpler; > in both cases the fundamental
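
The simplification the commit message is driving at can be sketched as below — a hedged reconstruction from the description above, not a verbatim quote of the merged patch:

    /*
     * Sketch: userspace may request interrupt injection iff the vCPU
     * can accept one right now; the window-request logic falls out of
     * this single predicate.
     */
    static int kvm_vcpu_ready_for_interrupt_injection(struct kvm_vcpu *vcpu)
    {
            return kvm_arch_interrupt_allowed(vcpu) &&
                    kvm_cpu_accept_dm_intr(vcpu);
    }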

[tip: x86/cleanups] x86/process/64: Move cpu_current_top_of_stack out of TSS

2021-03-28 Thread tip-bot2 for Lai Jiangshan
The following commit has been merged into the x86/cleanups branch of tip: Commit-ID: 1591584e2e762edecefde403c44d9c26c9ff72c9 Gitweb: https://git.kernel.org/tip/1591584e2e762edecefde403c44d9c26c9ff72c9 Author: Lai Jiangshan AuthorDate: Tue, 26 Jan 2021 01:34:29 +08:00

Re: [RFC PATCH 0/6] [RFC] Faultable tracepoints (v2)

2021-02-25 Thread Lai Jiangshan
On Thu, Feb 25, 2021 at 9:15 AM Mathieu Desnoyers wrote: > > - On Feb 24, 2021, at 11:22 AM, Michael Jeanson mjean...@efficios.com > wrote: > > > [ Adding Mathieu Desnoyers in CC ] > > > > On 2021-02-23 21 h 16, Steven Rostedt wrote: > >> On Thu, 18 Feb 2021 17:21:19 -0500 > >> Michael

Re: [PATCH] workqueue: Remove rcu_read_lock/unlock() in workqueue_congested()

2021-02-17 Thread Lai Jiangshan
+CC Paul On Wed, Feb 17, 2021 at 7:58 PM wrote: > > From: Zqiang > > The RCU read-side critical section is already marked by preempt_disable/enable() > (equivalent to rcu_read_lock_sched/unlock_sched()), so remove > rcu_read_lock/unlock(). I think we can leave it since it acts like documentation, especially
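
The pattern under discussion, as a minimal standalone sketch (the struct and variable names are made up for illustration):

    #include <linux/preempt.h>
    #include <linux/rcupdate.h>

    struct item { int value; };              /* hypothetical */
    static struct item __rcu *cur_item;      /* hypothetical */

    static int read_item_value(void)
    {
            struct item *p;
            int val = 0;

            preempt_disable();  /* already an RCU-sched read-side section */
            rcu_read_lock();    /* redundant here, but documents the RCU use */
            p = rcu_dereference(cur_item);
            if (p)
                    val = p->value;
            rcu_read_unlock();
            preempt_enable();
            return val;
    }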

Re: [PATCH] workqueue: Move the position of debug_work_activate() in __queue_work()

2021-02-17 Thread Lai Jiangshan
improve destroy_workqueue() debuggability") The code looks good to me. Reviewed-by: Lai Jiangshan > Signed-off-by: Zqiang > --- > kernel/workqueue.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/workqueue.c b/kernel/workqueue.c > in

Re: [PATCH V4 0/6] x86: Don't abuse tss.sp1

2021-02-10 Thread Lai Jiangshan
Hi Mark, Thank you for your reply. On Thu, Feb 11, 2021 at 7:42 AM mark gross wrote: > > On Wed, Feb 10, 2021 at 09:39:11PM +0800, Lai Jiangshan wrote: > > From: Lai Jiangshan > > > > In x86_64, tss.sp1 is reused as cpu_current_top_of_stack. We'd better > >

[PATCH V4 4/6] x86/entry/32: Restore %fs before switching stack

2021-02-10 Thread Lai Jiangshan
From: Lai Jiangshan entry_SYSENTER_32 saves the user %fs in the entry stack and restores the kernel %fs before loading the task stack for stack switching, so that it can use percpu before switching stack in the next patch. Signed-off-by: Lai Jiangshan --- arch/x86/entry/entry_32.S | 22

[PATCH V4 6/6] x86/entry/32: Introduce cpu_current_thread_sp0 to replace cpu_tss_rw.x86_tss.sp1

2021-02-10 Thread Lai Jiangshan
From: Lai Jiangshan TSS sp1 is not used by hardware and is used as a copy of thread.sp0. It should just use a percpu variable instead, so we introduce cpu_current_thread_sp0 for it. And we remove the unneeded TSS_sp1. Signed-off-by: Lai Jiangshan --- arch/x86/entry/entry_32.S| 6

[PATCH V4 5/6] x86/entry/32: Use percpu to get thread.sp0 in SYSENTER

2021-02-10 Thread Lai Jiangshan
From: Lai Jiangshan TSS_entry2task_stack is used to refer to tss.sp1 which is a copy of thread.sp0. When TSS_entry2task_stack is used in entry_SYSENTER_32, the CR3 is already kernel CR3 and the kernel %fs is loaded. So it directly uses percpu instead of offset-calculation via

[PATCH V4 3/6] x86/entry/32: Switch to the task stack without emptying the entry stack

2021-02-10 Thread Lai Jiangshan
From: Lai Jiangshan Like the way x86_64 uses the entry stack when switching to the task stack, entry_SYSENTER_32 can also save the entry stack pointer to a register and then switch to the task stack, so that it doesn't need to empty the entry stack by popping contents into registers, and it has

[PATCH V4 2/6] x86/entry/32: Use percpu instead of offset-calculation to get thread.sp0 in SWITCH_TO_KERNEL_STACK

2021-02-10 Thread Lai Jiangshan
From: Lai Jiangshan TSS_entry2task_stack is used to refer to tss.sp1 which is a copy of thread.sp0. When TSS_entry2task_stack is used in SWITCH_TO_KERNEL_STACK, the CR3 is already kernel CR3 and the kernel segments are loaded. So it directly uses percpu to get tss.sp1(thread.sp0) instead

[PATCH V4 1/6] x86/entry/64: Move cpu_current_top_of_stack out of TSS

2021-02-10 Thread Lai Jiangshan
From: Lai Jiangshan In x86_64, cpu_current_top_of_stack is an alias of cpu_tss_rw.x86_tss.sp1. When the CPU has the meltdown vulnerability (X86_BUG_CPU_MELTDOWN), it would become a coveted fruit even if kernel pagetable isolation is enabled, since the CPU TSS must also be in the user CR3. An attacker
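
The core of the change can be sketched as follows (the real patch also touches entry code and asm-offsets; this shows only the idea):

    /*
     * Before: cpu_current_top_of_stack aliases cpu_tss_rw.x86_tss.sp1,
     * and the TSS must be mapped into the user CR3 under KPTI.
     * After (sketch): a plain percpu variable, mapped only in the
     * kernel CR3.
     */
    DEFINE_PER_CPU(unsigned long, cpu_current_top_of_stack);

    static inline unsigned long current_top_of_stack(void)
    {
            return this_cpu_read(cpu_current_top_of_stack);
    }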

[PATCH V4 0/6] x86: Don't abuse tss.sp1

2021-02-10 Thread Lai Jiangshan
From: Lai Jiangshan In x86_64, tss.sp1 is reused as cpu_current_top_of_stack. We'd better directly use percpu since CR3 and gs_base are correct when it is used. In x86_32, tss.sp1 is reused as thread.sp0 in three places in entry code. We have the correct CR3 and %fs at two of the places

[tip: x86/urgent] x86/debug: Prevent data breakpoints on cpu_dr7

2021-02-05 Thread tip-bot2 for Lai Jiangshan
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: 3943abf2dbfae9ea4d2da05c1db569a0603f76da Gitweb: https://git.kernel.org/tip/3943abf2dbfae9ea4d2da05c1db569a0603f76da Author: Lai Jiangshan AuthorDate: Thu, 04 Feb 2021 23:27:07 +08:00

[tip: x86/urgent] x86/debug: Prevent data breakpoints on __per_cpu_offset

2021-02-05 Thread tip-bot2 for Lai Jiangshan
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: c4bed4b96918ff1d062ee81fdae4d207da4fa9b0 Gitweb: https://git.kernel.org/tip/c4bed4b96918ff1d062ee81fdae4d207da4fa9b0 Author: Lai Jiangshan AuthorDate: Thu, 04 Feb 2021 23:27:06 +08:00

Re: [patch 11/12] softirq: Allow inlining do_softirq_own_stack()

2021-02-05 Thread Lai Jiangshan
On Fri, Feb 5, 2021 at 10:04 AM Thomas Gleixner wrote: > > The function to switch to the irq stack on x86 is now minimal and there is > only a single caller. Allow the stack switch to be inlined. > > Signed-off-by: Thomas Gleixner > --- > include/linux/interrupt.h |2 ++ > kernel/softirq.c

[PATCH 2/2] x86/hw_breakpoint: Prevent data breakpoints on cpu_dr7

2021-02-04 Thread Lai Jiangshan
From: Lai Jiangshan When in a guest (X86_FEATURE_HYPERVISOR), local_db_save() will read the per-cpu cpu_dr7 before clearing the dr7 register. local_db_save() is called at the start of exc_debug_kernel(). To avoid a recursive #DB, we have to disallow data breakpoints on cpu_dr7. Fixes: 84b6a3491567a("

[PATCH 1/2] x86/hw_breakpoint: Prevent data breakpoints on __per_cpu_offset

2021-02-04 Thread Lai Jiangshan
From: Lai Jiangshan When FSGSBASE is enabled, paranoid_entry() fetches the per-CPU GSBASE value via __per_cpu_offset or pcpu_unit_offsets. When a data breakpoint is set on __per_cpu_offset[cpu] (a read-write operation), that cpu will be stuck in an infinite #DB loop. RCU will try to send
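
Both breakpoint-prevention patches follow the same shape: reject any breakpoint whose range overlaps per-CPU data that the #DB path itself must read. A sketch of the check, modeled on the existing GDT/IDT exclusions in arch/x86/kernel/hw_breakpoint.c (details approximate):

    /* sketch: does [addr, end] overlap [base, base + size)? */
    static inline bool within_area(unsigned long addr, unsigned long end,
                                   unsigned long base, unsigned long size)
    {
            return end >= base && addr < (base + size);
    }

    /* sketch: veto breakpoints on data the #DB handler reads itself */
    static bool within_cpu_entry(unsigned long addr, unsigned long end)
    {
            int cpu;

            for_each_possible_cpu(cpu) {
                    if (within_area(addr, end,
                                    (unsigned long)&per_cpu(cpu_dr7, cpu),
                                    sizeof(unsigned long)))
                            return true;
                    if (within_area(addr, end,
                                    (unsigned long)&__per_cpu_offset[cpu],
                                    sizeof(unsigned long)))
                            return true;
            }
            return false;
    }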

Re: [PATCH V3 0/6] x86: don't abuse tss.sp1

2021-01-29 Thread Lai Jiangshan
On Sat, Jan 30, 2021 at 12:43 AM Borislav Petkov wrote: > > On Fri, Jan 29, 2021 at 11:35:46PM +0800, Lai Jiangshan wrote: > > Any feedback? > > Yes: be patient please. > > Thx. > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/

[PATCH V3 2/6] x86_32: use percpu instead of offset-calculation to get thread.sp0 when SWITCH_TO_KERNEL_STACK

2021-01-27 Thread Lai Jiangshan
From: Lai Jiangshan TSS_entry2task_stack is used to refer to tss.sp1, which stores the value of thread.sp0. At the code where TSS_entry2task_stack is used in SWITCH_TO_KERNEL_STACK, the CR3 is already the kernel CR3 and the kernel segments are loaded. So we can directly use percpu to get tss.sp1

[PATCH V3 3/6] x86_32/sysenter: switch to the task stack without emptying the entry stack

2021-01-27 Thread Lai Jiangshan
From: Lai Jiangshan Like the way x86_64 uses the "old" stack, we can save the entry stack pointer to a register and switch to the task stack. So that we have space on the "old" stack to save more things or scratch registers. Signed-off-by: Lai Jiangshan --- arch/x86/e

[PATCH V3 5/6] x86_32/sysenter: use percpu to get thread.sp0 when sysenter

2021-01-27 Thread Lai Jiangshan
From: Lai Jiangshan TSS_entry2task_stack is used to refer to tss.sp1, which stores the value of thread.sp0. At the code where TSS_entry2task_stack is used in sysenter, the CR3 is already the kernel CR3 and the kernel segments are loaded. So we can directly use percpu for it instead of offset

[PATCH V3 4/6] x86_32/sysenter: restore %fs before switching stack

2021-01-27 Thread Lai Jiangshan
From: Lai Jiangshan Prepare for using percpu and removing TSS_entry2task_stack. Signed-off-by: Lai Jiangshan --- arch/x86/entry/entry_32.S | 22 +- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index

[PATCH V3 6/6] x86_32: use cpu_current_thread_sp0 instead of cpu_tss_rw.x86_tss.sp1

2021-01-27 Thread Lai Jiangshan
From: Lai Jiangshan sp1 is not used by hardware and is used as a copy of thread.sp0. We should just use a new percpu variable, and remove the unneeded TSS_sp1. Signed-off-by: Lai Jiangshan --- arch/x86/entry/entry_32.S | 6 +++--- arch/x86/include/asm/processor.h | 2 ++ arch/x86/include/asm

[PATCH V3 1/6] x86_64: move cpu_current_top_of_stack out of TSS

2021-01-27 Thread Lai Jiangshan
From: Lai Jiangshan With X86_BUG_CPU_MELTDOWN & KPTI, cpu_current_top_of_stack lives in the TSS, which is also in the user CR3, so it becomes a coveted fruit. An attacker can fetch the kernel stack top from it and continue the next steps of the attack based on the kernel stack. The address m

[PATCH V3 0/6] x86: don't abuse tss.sp1

2021-01-27 Thread Lai Jiangshan
From: Lai Jiangshan In x86_64, tss.sp1 is reused as cpu_current_top_of_stack. But we can directly use percpu since CR3 and gs_base is correct when it is used. In x86_32, tss.sp1 is resued as thread.sp0 in three places in entry code. We have the correct CR3 and %fs at two of the places

[PATCH V2 3/6] x86_32/sysenter: switch to the task stack without emptying the entry stack

2021-01-25 Thread Lai Jiangshan
From: Lai Jiangshan Like the way x86_64 uses the "old" stack, we can save the entry stack pointer to a register and switch to the task stack. So that we have space on the "old" stack to save more things or scratch registers. Signed-off-by: Lai Jiangshan --- arch/x86/e

[PATCH V2 5/6] x86_32/sysenter: use percpu to get thread.sp0 when sysenter

2021-01-25 Thread Lai Jiangshan
From: Lai Jiangshan TSS_entry2task_stack is used to refer to tss.sp1, which stores the value of thread.sp0. At the code where TSS_entry2task_stack is used in sysenter, the CR3 is already the kernel CR3 and the kernel segments are loaded. So we can directly use percpu for it instead of offset

[PATCH V2 2/6] x86_32: use percpu instead of offset-calculation to get thread.sp0 when SWITCH_TO_KERNEL_STACK

2021-01-25 Thread Lai Jiangshan
From: Lai Jiangshan TSS_entry2task_stack is used to refer to tss.sp1, which stores the value of thread.sp0. At the code where TSS_entry2task_stack is used in SWITCH_TO_KERNEL_STACK, the CR3 is already the kernel CR3 and the kernel segments are loaded. So we can directly use percpu to get tss.sp1

[PATCH V2 6/6] x86_32: use cpu_current_thread_sp0 instead of cpu_tss_rw.x86_tss.sp1

2021-01-25 Thread Lai Jiangshan
From: Lai Jiangshan sp1 is not used by hardware and is used as a copy of thread.sp0. We should just use a new percpu variable, and remove the unneeded TSS_sp1. Signed-off-by: Lai Jiangshan --- arch/x86/entry/entry_32.S | 6 +++--- arch/x86/include/asm/processor.h | 2 ++ arch/x86/include/asm

[PATCH V2 4/6] x86_32/sysenter: restore %fs before switching stack

2021-01-25 Thread Lai Jiangshan
From: Lai Jiangshan Prepare for using percpu and removing TSS_entry2task_stack. Signed-off-by: Lai Jiangshan --- arch/x86/entry/entry_32.S | 22 +- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index

[PATCH V2 1/6] x86_64: move cpu_current_top_of_stack out of TSS

2021-01-25 Thread Lai Jiangshan
From: Lai Jiangshan With X86_BUG_CPU_MELTDOWN & KPTI, cpu_current_top_of_stack lives in the TSS, which is also in the user CR3, so it becomes a coveted fruit. An attacker can fetch the kernel stack top from it and continue the next steps of the attack based on the kernel stack. The address m

[PATCH V2 0/6] x86: don't abuse tss.sp1

2021-01-25 Thread Lai Jiangshan
From: Lai Jiangshan In x86_64, tss.sp1 is reused as cpu_current_top_of_stack. But we can directly use percpu since CR3 and gs_base is correct when it is used. In x86_32, tss.sp1 is resued as thread.sp0 in three places in entry code. We have the correct CR3 and %fs at two of the places

[PATCH V2] x86/entry/64: De-Xen-ify our NMI code further

2021-01-24 Thread Lai Jiangshan
From: Lai Jiangshan The commit 929bacec21478 ("x86/entry/64: De-Xen-ify our NMI code") simplified the NMI code by changing paravirt code into native code and left a comment about "inspecting RIP instead". But until now, "inspecting RIP instead" has not been made

[PATCH] x86/entry/64: De-Xen-ify our NMI code further

2021-01-24 Thread Lai Jiangshan
From: Lai Jiangshan The commit 929bacec21478 ("x86/entry/64: De-Xen-ify our NMI code") simplified the NMI code by changing paravirt code into native code and left a comment about "inspecting RIP instead". But until now, "inspecting RIP instead" has not been made

Re: [PATCH v7 45/72] x86/entry/64: Add entry code for #VC handler

2021-01-24 Thread Lai Jiangshan
> + > + /* > +* No need to switch back to the IST stack. The current stack is > either > +* identical to the stack in the IRET frame or the VC fall-back stack, > +* so it is definitely mapped even with PTI enabled. > +*/ > + jmp paranoid_exit > + >

[PATCH] x86_64: move cpu_current_top_of_stack out of TSS

2021-01-22 Thread Lai Jiangshan
From: Lai Jiangshan With X86_BUG_CPU_MELTDOWN & KPTI, cpu_current_top_of_stack lives in the TSS, which is also in the user CR3, so it becomes a coveted fruit. An attacker can fetch the kernel stack top from it and continue the next steps of the attack based on the kernel stack. The address m

[tip: sched/urgent] workqueue: Use cpu_possible_mask instead of cpu_active_mask to break affinity

2021-01-22 Thread tip-bot2 for Lai Jiangshan
The following commit has been merged into the sched/urgent branch of tip: Commit-ID: 547a77d02f8cfb345631ce23b5b548d27afa0fc4 Gitweb: https://git.kernel.org/tip/547a77d02f8cfb345631ce23b5b548d27afa0fc4 Author: Lai Jiangshan AuthorDate: Mon, 11 Jan 2021 23:26:33 +08:00

Re: [PATCH] workqueue: fix annotation for WQ_SYSFS

2021-01-18 Thread Lai Jiangshan
On Mon, Jan 18, 2021 at 4:05 PM wrote: > > From: Menglong Dong > > 'wq_sysfs_register()' in annotation for 'WQ_SYSFS' is unavailable, > change it to 'workqueue_sysfs_register()'. > > Signed-off-by: Menglong Dong Reviewed-by: Lai Jiangshan > --- > include/linux/w

Re: [PATCH 8/8] sched: Relax the set_cpus_allowed_ptr() semantics

2021-01-16 Thread Lai Jiangshan
ces I listed, which can really simplify hotplug code in the workqueue and maybe other hotplug code. Reviewed-by: Lai Jiangshan > --- > kernel/sched/core.c | 20 +--- > 1 file changed, 9 insertions(+), 11 deletions(-) > > --- a/kernel/sched/core.c > +++ b/kernel

Re: [PATCH 3/4] workqueue: Tag bound workers with KTHREAD_IS_PER_CPU

2021-01-16 Thread Lai Jiangshan
On Sat, Jan 16, 2021 at 11:16 PM Peter Zijlstra wrote: > > On Sat, Jan 16, 2021 at 10:45:04PM +0800, Lai Jiangshan wrote: > > On Sat, Jan 16, 2021 at 8:45 PM Peter Zijlstra wrote: > > > It is also the exact sequence normal per-cpu threads (smpboot) use to > > > pr

Re: [PATCH 3/4] workqueue: Tag bound workers with KTHREAD_IS_PER_CPU

2021-01-16 Thread Lai Jiangshan
On Sat, Jan 16, 2021 at 8:45 PM Peter Zijlstra wrote: > > On Sat, Jan 16, 2021 at 02:27:09PM +0800, Lai Jiangshan wrote: > > On Thu, Jan 14, 2021 at 11:35 PM Peter Zijlstra > > wrote: > > > > > > > > -void kthread_set_per_cpu(struct task_struct *k,

Re: [PATCH 3/4] workqueue: Tag bound workers with KTHREAD_IS_PER_CPU

2021-01-15 Thread Lai Jiangshan
On Thu, Jan 14, 2021 at 11:35 PM Peter Zijlstra wrote: > > -void kthread_set_per_cpu(struct task_struct *k, bool set) > +void kthread_set_per_cpu(struct task_struct *k, int cpu) > { > struct kthread *kthread = to_kthread(k); > if (!kthread) > return; > > -
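
The interface change in the quoted diff, sketched with an illustrative call site (rebind_worker is a made-up helper name):

    /* before: void kthread_set_per_cpu(struct task_struct *k, bool set); */

    /* after: remember the CPU so unpark can re-bind the kthread to it;
     * a negative cpu clears the per-cpu state */
    void kthread_set_per_cpu(struct task_struct *k, int cpu);

    /* illustrative call site in a workqueue rebind path */
    static void rebind_worker(struct worker *worker, struct worker_pool *pool)
    {
            kthread_set_per_cpu(worker->task, pool->cpu);
            WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
                                              pool->attrs->cpumask) < 0);
    }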

Re: [PATCH -tip V3 0/8] workqueue: break affinity initiatively

2021-01-15 Thread Lai Jiangshan
On Fri, Jan 15, 2021 at 9:05 PM Peter Zijlstra wrote: > > On Fri, Jan 15, 2021 at 10:11:51AM +0100, Peter Zijlstra wrote: > > On Tue, Jan 12, 2021 at 03:53:24PM -0800, Paul E. McKenney wrote: > > > An SRCU-P run on the new series reproduced the warning below. Repeat-by: > > > > > >

[PATCH] workqueue: keep unbound rescuer's cpumask to be default cpumask

2021-01-15 Thread Lai Jiangshan
From: Lai Jiangshan When we attach a rescuer to a pool, we will set the rescuer's cpumask to the pool's cpumask. If hotplug is ongoing, it may cause the rescuer to run on the dying CPU and cause a bug, or it may cause a warning about setting an online&!active cpumask. So we have to find a reli

Re: [PATCH 3/4] workqueue: Tag bound workers with KTHREAD_IS_PER_CPU

2021-01-13 Thread Lai Jiangshan
On Tue, Jan 12, 2021 at 10:51 PM Peter Zijlstra wrote: > > Mark the per-cpu workqueue workers as KTHREAD_IS_PER_CPU. > > Workqueues have unfortunate semantics in that per-cpu workers are not > default flushed and parked during hotplug, however a subset does > manual flush on hotplug and hard

Re: [PATCH -tip V3 0/8] workqueue: break affinity initiatively

2021-01-13 Thread Lai Jiangshan
On 2021/1/13 19:10, Peter Zijlstra wrote: On Tue, Jan 12, 2021 at 11:38:12PM +0800, Lai Jiangshan wrote: But the hard problem is "how to suppress the warning of online&!active in __set_cpus_allowed_ptr()" for late spawned unbound workers during hotplug. I cannot see create_w

Re: [PATCH -tip V3 0/8] workqueue: break affinity initiatively

2021-01-13 Thread Lai Jiangshan
On Wed, Jan 13, 2021 at 7:11 PM Peter Zijlstra wrote: > > On Tue, Jan 12, 2021 at 11:38:12PM +0800, Lai Jiangshan wrote: > > > But the hard problem is "how to suppress the warning of > > online&!active in __set_cpus_allowed_ptr()" for late spawned > > unb

Re: [PATCH 3/4] workqueue: Tag bound workers with KTHREAD_IS_PER_CPU

2021-01-12 Thread Lai Jiangshan
n hotplug and hard relies on them for correctness. > > Therefore play silly games.. > > Signed-off-by: Peter Zijlstra (Intel) > Tested-by: Paul E. McKenney > --- Reviewed-by: Lai Jiangshan I like this patchset in that the scheduler takes care of the affinities of the tasks

Re: [PATCH -tip V3 0/8] workqueue: break affinity initiatively

2021-01-12 Thread Lai Jiangshan
On Tue, Jan 12, 2021 at 10:53 PM Peter Zijlstra wrote: > > On Tue, Jan 12, 2021 at 12:33:03PM +0800, Lai Jiangshan wrote: > > > Well yes, but afaict the workqueue stuff hasn't been settled yet, and > > > the rcutorture patch Paul did was just plain racy and who knows what

Re: [PATCH -tip V3 0/8] workqueue: break affinity initiatively

2021-01-11 Thread Lai Jiangshan
> Well yes, but afaict the workqueue stuff hasn't been settled yet, and > the rcutorture patch Paul did was just plain racy and who knows what > other daft kthread users are out there. That and we're at -rc3. I just sent the V4 patchset for the workqueue. Please take a look. > @@ -1861,6

[PATCH -tip V4 7/8] workqueue: Manually break affinity on hotplug for unbound pool

2021-01-11 Thread Lai Jiangshan
From: Lai Jiangshan It is possible that a per-node pool/worker's affinity is a single CPU. It can happen when the workqueue user changes the cpumask of the workqueue or when wq_unbound_cpumask is changed by the system admin via /sys/devices/virtual/workqueue/cpumask. And pool->attrs->c

[PATCH -tip V4 8/8] workqueue: Fix affinity of kworkers when attaching into pool

2021-01-11 Thread Lai Jiangshan
From: Lai Jiangshan When worker_attach_to_pool() is called, we should not bind the workers to pool->attrs->cpumask when there is, or will be, no CPU online in it. Otherwise, it may cause BUG_ON(): (quote from Valentin:) Per-CPU kworkers forcefully migrated away by hotpl
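
The shape of the fix, sketched (simplified: the series actually consults its cached online mask under wq_pool_attach_mutex to dodge the online-but-inactive race discussed elsewhere in the thread):

    /* sketch: only apply the pool's cpumask if a CPU in it is online */
    static void worker_attach_to_pool(struct worker *worker,
                                      struct worker_pool *pool)
    {
            mutex_lock(&wq_pool_attach_mutex);

            if (cpumask_intersects(pool->attrs->cpumask, cpu_online_mask))
                    set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
            else
                    set_cpus_allowed_ptr(worker->task, cpu_possible_mask);

            list_add_tail(&worker->node, &pool->workers);
            worker->pool = pool;

            mutex_unlock(&wq_pool_attach_mutex);
    }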

[PATCH -tip V4 4/8] workqueue: Manually break affinity on pool detachment

2021-01-11 Thread Lai Jiangshan
From: Lai Jiangshan The pool->attrs->cpumask might be a single CPU and it may go down after detachment, and the scheduler won't forcibly break affinity for us since it is a per-cpu kthread. So we have to do it on our own and unbind this worker which can't be unbound by workqueue_offli

[PATCH -tip V4 6/8] workqueue: use wq_unbound_online_cpumask in restore_unbound_workers_cpumask()

2021-01-11 Thread Lai Jiangshan
From: Lai Jiangshan restore_unbound_workers_cpumask() is called on CPU_ONLINE, where wq_online_cpumask equals cpu_online_mask. So no functionality changed. Acked-by: Tejun Heo Tested-by: Paul E. McKenney Signed-off-by: Lai Jiangshan --- kernel/workqueue.c | 3 ++- 1 file changed, 2

[PATCH -tip V4 5/8] workqueue: introduce wq_unbound_online_cpumask

2021-01-11 Thread Lai Jiangshan
From: Lai Jiangshan wq_unbound_online_cpumask is the cached result of cpu_online_mask with the going-down cpu cleared before the cpu is cleared from cpu_active_mask. It is used to track the cpu hotplug process so the creation/attachment of unbound workers can know where it is in the process when
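
A sketch of the bookkeeping described (hook points and locking simplified; the real callbacks do much more):

    static cpumask_var_t wq_unbound_online_cpumask;

    /* going down: clear the CPU before it leaves cpu_active_mask */
    static int workqueue_offline_cpu(unsigned int cpu)
    {
            mutex_lock(&wq_pool_attach_mutex);
            cpumask_clear_cpu(cpu, wq_unbound_online_cpumask);
            mutex_unlock(&wq_pool_attach_mutex);
            return 0;
    }

    /* coming up: set the CPU once it is online again */
    static int workqueue_online_cpu(unsigned int cpu)
    {
            mutex_lock(&wq_pool_attach_mutex);
            cpumask_set_cpu(cpu, wq_unbound_online_cpumask);
            mutex_unlock(&wq_pool_attach_mutex);
            return 0;
    }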

[PATCH -tip V4 3/8] workqueue: use cpu_possible_mask instead of cpu_active_mask to break affinity

2021-01-11 Thread Lai Jiangshan
From: Lai Jiangshan The scheduler won't break affinity for us any more, and we should "emulate" the behavior it used to provide when it broke affinity for us. The behavior is "changing the cpumask to cpu_possible_mask". And there might be some other CPUs online later while
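
The "emulation" described above, sketched (break_pool_affinity is a made-up name for illustration):

    /* sketch: what the scheduler used to do for us on hotplug -- if a
     * pool's workers would be left with no runnable CPU, fall back to
     * cpu_possible_mask so CPUs onlined later can still run them */
    static void break_pool_affinity(struct worker_pool *pool)
    {
            struct worker *worker;

            mutex_lock(&wq_pool_attach_mutex);
            for_each_pool_worker(worker, pool)
                    WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
                                                      cpu_possible_mask) < 0);
            mutex_unlock(&wq_pool_attach_mutex);
    }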

[PATCH -tip V4 1/8] workqueue: split cpuhotplug callbacks for unbound workqueue

2021-01-11 Thread Lai Jiangshan
From: Lai Jiangshan Unbound workers are normally not per-cpu kthreads, but on cpu hotplug we also need to update the pools of unbound workqueues based on whether the relevant node has CPUs online or not, for every workqueue. The code reuses the current cpu hotplug callbacks which

[PATCH -tip V4 2/8] workqueue: set pool->attr->cpumask to workers when cpu online

2021-01-11 Thread Lai Jiangshan
From: Lai Jiangshan The commit d945b5e9f0e ("workqueue: Fix setting affinity of unbound worker threads") fixed a problem of set_cpus_allowed_ptr() with online&!active cpumasks (more than one CPU) in restore_unbound_workers_cpumask() in the online callback. But now the new o

[PATCH -tip V4 0/8] workqueue: break affinity initiatively

2021-01-11 Thread Lai Jiangshan
From: Lai Jiangshan 06249738a41a ("workqueue: Manually break affinity on hotplug") said that the scheduler will not force-break affinity for us. But workqueue highly depends on the old behavior. Many parts of the code rely on it, 06249738a41a ("workqueue: Manually break affi

Re: [PATCH -tip V3 3/8] workqueue: introduce wq_online_cpumask

2021-01-05 Thread Lai Jiangshan
On Tue, Jan 5, 2021 at 10:37 PM Lai Jiangshan wrote: > > On Tue, Jan 5, 2021 at 9:17 PM Peter Zijlstra wrote: > > > > On Tue, Jan 05, 2021 at 04:23:44PM +0800, Lai Jiangshan wrote: > > > On Tue, Jan 5, 2021 at 10:41 AM Lai Jiangshan > > > wrote: > >

Re: [PATCH -tip V3 3/8] workqueue: introduce wq_online_cpumask

2021-01-05 Thread Lai Jiangshan
On Tue, Jan 5, 2021 at 9:17 PM Peter Zijlstra wrote: > > On Tue, Jan 05, 2021 at 04:23:44PM +0800, Lai Jiangshan wrote: > > On Tue, Jan 5, 2021 at 10:41 AM Lai Jiangshan > > wrote: > > > On Mon, Jan 4, 2021 at 9:56 PM Peter Zijlstra > > > wrote: > &

Re: [PATCH -tip V3 3/8] workqueue: introduce wq_online_cpumask

2021-01-05 Thread Lai Jiangshan
On Tue, Jan 5, 2021 at 10:41 AM Lai Jiangshan wrote: > > On Mon, Jan 4, 2021 at 9:56 PM Peter Zijlstra wrote: > > > > On Sat, Dec 26, 2020 at 10:51:11AM +0800, Lai Jiangshan wrote: > > > From: Lai Jiangshan > > > > > > wq_online_cp

Re: [PATCH -tip V3 3/8] workqueue: introduce wq_online_cpumask

2021-01-04 Thread Lai Jiangshan
On Tue, Jan 5, 2021 at 10:41 AM Lai Jiangshan wrote: > > On Mon, Jan 4, 2021 at 9:56 PM Peter Zijlstra wrote: > > > > On Sat, Dec 26, 2020 at 10:51:11AM +0800, Lai Jiangshan wrote: > > > From: Lai Jiangshan > > > > > > wq_online_cp

Re: [PATCH -tip V3 3/8] workqueue: introduce wq_online_cpumask

2021-01-04 Thread Lai Jiangshan
On Mon, Jan 4, 2021 at 9:56 PM Peter Zijlstra wrote: > > On Sat, Dec 26, 2020 at 10:51:11AM +0800, Lai Jiangshan wrote: > > From: Lai Jiangshan > > > > wq_online_cpumask is the cached result of cpu_online_mask with the > > going-down cpu cleared. > > You can'

Re: [PATCH -tip V3 8/8] workqueue: Fix affinity of kworkers when attaching into pool

2020-12-29 Thread Lai Jiangshan
On Tue, Dec 29, 2020 at 6:06 PM Hillf Danton wrote: > > On Sat, 26 Dec 2020 10:51:16 +0800 > > From: Lai Jiangshan > > > > When worker_attach_to_pool() is called, we should not put the workers > > to pool->attrs->cpumask when there is no CPU

Re: [PATCH -tip V2 00/10] workqueue: break affinity initiatively

2020-12-27 Thread Lai Jiangshan
e following in sched_cpu_dying() in kernel/sched/core.c, > exactly the same as for Lai Jiangshan: > > BUG_ON(rq->nr_running != 1 || rq_has_pinned_tasks(rq)) > > Which is in fact the "this" in my earlier "rcutorture hits this". ;-) > >

Re: [PATCH -tip V3 5/8] workqueue: Manually break affinity on hotplug for unbound pool

2020-12-27 Thread Lai Jiangshan
On Sat, Dec 26, 2020 at 6:16 PM Hillf Danton wrote: > > Sat, 26 Dec 2020 10:51:13 +0800 > > From: Lai Jiangshan > > > > It is possible that a per-node pool/worker's affinity is a single > > CPU. It can happen when the workqueue user changes the cpumas

[PATCH -tip V3 1/8] workqueue: use cpu_possible_mask instead of cpu_active_mask to break affinity

2020-12-25 Thread Lai Jiangshan
From: Lai Jiangshan The scheduler won't break affinity for us any more, and we should "emulate" the behavior it used to provide when it broke affinity for us. The behavior is "changing the cpumask to cpu_possible_mask". And there might be some other CPUs online later while

[PATCH -tip V3 7/8] workqueue: reorganize workqueue_offline_cpu() unbind_workers()

2020-12-25 Thread Lai Jiangshan
From: Lai Jiangshan Just move around the code, no functionality changed. Only wq_pool_attach_mutex protected region becomes a little larger. It prepares for later patch protecting wq_online_cpumask in wq_pool_attach_mutex. Acked-by: Tejun Heo Signed-off-by: Lai Jiangshan --- kernel

[PATCH -tip V3 6/8] workqueue: reorganize workqueue_online_cpu()

2020-12-25 Thread Lai Jiangshan
From: Lai Jiangshan Just move around the code, no functionality changed. It prepares for later patch protecting wq_online_cpumask in wq_pool_attach_mutex. Acked-by: Tejun Heo Signed-off-by: Lai Jiangshan --- kernel/workqueue.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions

[PATCH -tip V3 5/8] workqueue: Manually break affinity on hotplug for unbound pool

2020-12-25 Thread Lai Jiangshan
From: Lai Jiangshan It is possible that a per-node pool/worker's affinity is a single CPU. It can happen when the workqueue user changes the cpumask of the workqueue or when wq_unbound_cpumask is changed by the system admin via /sys/devices/virtual/workqueue/cpumask. And pool->attrs->c

[PATCH -tip V3 8/8] workqueue: Fix affinity of kworkers when attaching into pool

2020-12-25 Thread Lai Jiangshan
From: Lai Jiangshan When worker_attach_to_pool() is called, we should not bind the workers to pool->attrs->cpumask when there is no CPU online in it. We have to use wq_online_cpumask in worker_attach_to_pool() to check if pool->attrs->cpumask is valid rather than cpu

[PATCH -tip V3 3/8] workqueue: introduce wq_online_cpumask

2020-12-25 Thread Lai Jiangshan
From: Lai Jiangshan wq_online_cpumask is the cached result of cpu_online_mask with the going-down cpu cleared. It is needed for later patches for setting the correct cpumask for workers and breaking affinity initiatively. The first usage of wq_online_cpumask is also in this patch

[PATCH -tip V3 2/8] workqueue: Manually break affinity on pool detachment

2020-12-25 Thread Lai Jiangshan
From: Lai Jiangshan The pool->attrs->cpumask might be a single CPU and it may go down after detachment, and the scheduler won't forcibly break affinity for us since it is a per-cpu kthread. So we have to do it on our own and unbind this worker which can't be unbound by workqueue_offli

[PATCH -tip V3 4/8] workqueue: use wq_online_cpumask in restore_unbound_workers_cpumask()

2020-12-25 Thread Lai Jiangshan
From: Lai Jiangshan restore_unbound_workers_cpumask() is called on CPU_ONLINE, where wq_online_cpumask equals cpu_online_mask. So no functionality changed. Acked-by: Tejun Heo Signed-off-by: Lai Jiangshan --- kernel/workqueue.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion

[PATCH -tip V3 0/8] workqueue: break affinity initiatively

2020-12-25 Thread Lai Jiangshan
From: Lai Jiangshan 06249738a41a ("workqueue: Manually break affinity on hotplug") said that the scheduler will not force-break affinity for us. But workqueue highly depends on the old behavior. Many parts of the code rely on it, 06249738a41a ("workqueue: Manually break affi

Re: [PATCH -tip V2 00/10] workqueue: break affinity initiatively

2020-12-23 Thread Lai Jiangshan
On Wed, Dec 23, 2020 at 5:39 AM Dexuan-Linux Cui wrote: > > On Fri, Dec 18, 2020 at 8:11 AM Lai Jiangshan wrote: > > > > From: Lai Jiangshan > > > > 06249738a41a ("workqueue: Manually break affinity on hotplug") > > said that scheduler will not forc

Re: [PATCH -tip V2 10/10] workqueue: Fix affinity of kworkers when attaching into pool

2020-12-18 Thread Lai Jiangshan
On Sat, Dec 19, 2020 at 1:59 AM Valentin Schneider wrote: > > > On 18/12/20 17:09, Lai Jiangshan wrote: > > From: Lai Jiangshan > > > > When worker_attach_to_pool() is called, we should not put the workers > > to pool->attrs->cpumask when there is not C

[PATCH -tip V2 08/10] workqueue: reorganize workqueue_online_cpu()

2020-12-18 Thread Lai Jiangshan
From: Lai Jiangshan Just move around the code, no functionality changed. It prepares for later patch protecting wq_online_cpumask in wq_pool_attach_mutex. Acked-by: Tejun Heo Signed-off-by: Lai Jiangshan --- kernel/workqueue.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions

[PATCH -tip V2 09/10] workqueue: reorganize workqueue_offline_cpu() unbind_workers()

2020-12-18 Thread Lai Jiangshan
From: Lai Jiangshan Just move around the code, no functionality changed. Only wq_pool_attach_mutex protected region becomes a little larger. It prepares for later patch protecting wq_online_cpumask in wq_pool_attach_mutex. Acked-by: Tejun Heo Signed-off-by: Lai Jiangshan --- kernel

[PATCH -tip V2 10/10] workqueue: Fix affinity of kworkers when attaching into pool

2020-12-18 Thread Lai Jiangshan
From: Lai Jiangshan When worker_attach_to_pool() is called, we should not bind the workers to pool->attrs->cpumask when there is no CPU online in it. We have to use wq_online_cpumask in worker_attach_to_pool() to check if pool->attrs->cpumask is valid rather than cpu

[PATCH -tip V2 05/10] workqueue: introduce wq_online_cpumask

2020-12-18 Thread Lai Jiangshan
From: Lai Jiangshan wq_online_cpumask is the cached result of cpu_online_mask with the going-down cpu cleared. It is needed for later patches for setting the correct cpumask for workers and breaking affinity initiatively. The first usage of wq_online_cpumask is also in this patch

[PATCH -tip V2 04/10] workqueue: don't set the worker's cpumask when kthread_bind_mask()

2020-12-18 Thread Lai Jiangshan
From: Lai Jiangshan There might be no online cpu in the pool->attrs->cpumask. We will set the worker's cpumask later in worker_attach_to_pool(). Cc: Peter Zijlstra Acked-by: Tejun Heo Signed-off-by: Lai Jiangshan --- kernel/workqueue.c | 10 +- 1 file changed, 9 insertions

[PATCH -tip V2 07/10] workqueue: Manually break affinity on hotplug for unbound pool

2020-12-18 Thread Lai Jiangshan
From: Lai Jiangshan It is possible that a per-node pool/worker's affinity is a single CPU. It can happen when wq_unbound_cpumask is changed by the system admin via /sys/devices/virtual/workqueue/cpumask. And pool->attrs->cpumask is wq_unbound_cpumask & possible_cpumask_of_the_node,

[PATCH -tip V2 02/10] workqueue: use cpu_possible_mask instead of cpu_active_mask to break affinity

2020-12-18 Thread Lai Jiangshan
From: Lai Jiangshan The scheduler won't break affinity for us any more, and we should "emulate" the behavior it used to provide when it broke affinity for us. The behavior is "changing the cpumask to cpu_possible_mask". And there might be some other CPUs online later while

[PATCH -tip V2 03/10] workqueue: Manually break affinity on pool detachment

2020-12-18 Thread Lai Jiangshan
From: Lai Jiangshan The pool->attrs->cpumask might be a single CPU and it may go down after detachment, and the scheduler won't forcibly break affinity for us since it is a per-cpu kthread. So we have to do it on our own and unbind this worker which can't be unbound by workqueue_offli

[PATCH -tip V2 06/10] workqueue: use wq_online_cpumask in restore_unbound_workers_cpumask()

2020-12-18 Thread Lai Jiangshan
From: Lai Jiangshan restore_unbound_workers_cpumask() is called on CPU_ONLINE, where wq_online_cpumask equals cpu_online_mask. So no functionality changed. Acked-by: Tejun Heo Signed-off-by: Lai Jiangshan --- kernel/workqueue.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion

[PATCH -tip V2 00/10] workqueue: break affinity initiatively

2020-12-18 Thread Lai Jiangshan
From: Lai Jiangshan 06249738a41a ("workqueue: Manually break affinity on hotplug") said that the scheduler will not force-break affinity for us. But workqueue highly depends on the old behavior. Many parts of the code rely on it, 06249738a41a ("workqueue: Manually break affi

[PATCH -tip V2 01/10] workqueue: restore unbound_workers' cpumask correctly

2020-12-18 Thread Lai Jiangshan
From: Lai Jiangshan When we restore the workers' cpumask, we should restore them to the designated pool->attrs->cpumask, and we only need to do it the first time. Cc: Hillf Danton Reported-by: Hillf Danton Acked-by: Tejun Heo Signed-off-by: Lai Jiangshan --- kernel/workqueue.c | 6 +++

Re: [PATCH V2 1/3] x86/mm/pti: handle unaligned address for pmd clone in pti_clone_pagetable()

2020-12-18 Thread Lai Jiangshan
Hello Dave Hansen, Could you help review the patches, please? I think they meet your suggestion except for forcing alignment in the caller. The reason is in the code. Thanks Lai On Thu, Dec 10, 2020 at 9:34 PM Lai Jiangshan wrote: > > From: Lai Jiangshan > > The commit 825d0b73c

[PATCH V3] kvm: check tlbs_dirty directly

2020-12-17 Thread Lai Jiangshan
From: Lai Jiangshan In kvm_mmu_notifier_invalidate_range_start(), tlbs_dirty is used as: need_tlb_flush |= kvm->tlbs_dirty; with need_tlb_flush's type being int and tlbs_dirty's type being long. It means that tlbs_dirty is always used as an int and the high 32 bits are useless. We n
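
The truncation at issue, reduced to a few lines (flush() stands in for kvm_flush_remote_tlbs()):

    int need_tlb_flush = 0;           /* int ... */
    long tlbs_dirty = 0x100000000L;   /* ... OR-ed with a long: 2^32 */

    need_tlb_flush |= tlbs_dirty;     /* truncated to 0: dirty count lost */

    /* the fix: check the long directly instead of propagating it */
    if (need_tlb_flush || tlbs_dirty)
            flush();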

[PATCH 2/2] selftest: parse the max cpu correctly from cpu list string

2020-12-17 Thread Lai Jiangshan
From: Lai Jiangshan "," is allowed in cpu list strings, such as "0-3,5". We need to handle these cases. Signed-off-by: Lai Jiangshan --- tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/too

[PATCH 1/2] selftest: don't offline the last CPU in cpu hotplug test

2020-12-17 Thread Lai Jiangshan
From: Lai Jiangshan On my box, all CPUs are allowed to be offlined. The test tries to offline all offline-able CPUs and fails on the last one. We should just skip offlining the last CPU. Signed-off-by: Lai Jiangshan --- tools/testing/selftests/cpu-hotplug/cpu-on-off-test.sh | 5 + 1
