This series fixes some bad lock-spinning behaviour seen on an
over-committed guest.

Test case:
perf record -a perf bench sched messaging -g 400 -p && perf report

18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
 3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
 2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call

osq_lock() spins for a long time with preemption disabled, which is
really bad. This happens because vCPU A holds the osq lock and yields
out, while vCPU B waits for its per-cpu node->locked to be set. IOW,
vCPU B is waiting for vCPU A to run again and unlock the osq lock. Even
though need_resched() is checked, it does not help in this scenario.

We may also need to fix the other XXX_spin_on_owner variants later,
based on this patch set. These spin_on_owner variants cause RCU stalls.

Pan Xinhui (1):
  locking/osq: Drop the overload of osq_lock()

pan xinhui (1):
  kernel/sched: introduce vcpu preempted interface

 include/linux/sched.h     | 34 ++++++++++++++++++++++++++++++++++
 kernel/locking/osq_lock.c | 18 +++++++++++++++---
 2 files changed, 49 insertions(+), 3 deletions(-)

-- 
2.4.11
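
To make the osq_lock() problem described above concrete, here is a minimal
user-space model of the wait loop. It is a sketch under stated assumptions,
not the code in this series: struct vcpu_state, struct osq_node,
osq_wait_model() and the check_preempted switch are hypothetical names
invented for illustration, and the real interface added to
include/linux/sched.h may look different. The model only shows why a waiter
that can see "the vCPU I am waiting on has been preempted" can stop spinning
early, while a waiter without that information keeps spinning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-vCPU state that a hypervisor could expose to the guest. */
struct vcpu_state {
	bool preempted;		/* true while the host has descheduled this vCPU */
};

/* Simplified stand-in for the per-cpu queue node used by the osq lock. */
struct osq_node {
	int locked;		/* set to 1 when the lock is handed to this waiter */
	int prev_cpu;		/* CPU (vCPU) this waiter is waiting on */
};

/*
 * Model of the wait loop. Returns how many iterations were spent spinning
 * before the waiter either got the lock or gave up. max_spins bounds the
 * loop so the model terminates; a real spin loop has no such bound.
 */
static int osq_wait_model(const struct osq_node *node,
			  const struct vcpu_state *vcpus,
			  bool check_preempted, int max_spins)
{
	int spins = 0;

	while (spins < max_spins) {
		if (node->locked)
			break;		/* lock was handed over */
		if (check_preempted && vcpus[node->prev_cpu].preempted)
			break;		/* the vCPU we wait on is not running: stop spinning */
		spins++;
	}
	return spins;
}

/* Self-check: vCPU 0 is preempted and vCPU 1 waits on it. */
static void osq_model_selftest(void)
{
	struct vcpu_state vcpus[2] = { { true }, { false } };
	struct osq_node waiter = { 0, 0 };

	/* Without the check, the waiter burns the whole spin budget. */
	assert(osq_wait_model(&waiter, vcpus, false, 1000) == 1000);
	/* With the check, the waiter gives up immediately. */
	assert(osq_wait_model(&waiter, vcpus, true, 1000) == 0);
}
```

In this model the waiter on vCPU 1 wastes its entire spin budget when it
cannot tell that vCPU 0 is preempted, and gives up at once when it can.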

