Re: [PATCH 0/6] x86: reduce paravirtualized spinlock overhead

2015-04-30 Thread Jeremy Fitzhardinge
On 04/30/2015 03:53 AM, Juergen Gross wrote: Paravirtualized spinlocks produce some overhead even if the kernel is running on bare metal. The main reason is the more complex locking and unlocking functions. Unlocking in particular is no longer just one instruction but so complex that it is no

Re: [PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015-04-30 Thread Waiman Long
On 04/29/2015 02:27 PM, Linus Torvalds wrote: On Wed, Apr 29, 2015 at 11:11 AM, Peter Zijlstra pet...@infradead.org wrote: On Fri, Apr 24, 2015 at 02:56:42PM -0400, Waiman Long wrote: In the pv_scan_next() function, the slow cmpxchg atomic operation is performed even if the other CPU is not

Re: [PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015-04-30 Thread Waiman Long
On 04/29/2015 02:11 PM, Peter Zijlstra wrote: On Fri, Apr 24, 2015 at 02:56:42PM -0400, Waiman Long wrote: In the pv_scan_next() function, the slow cmpxchg atomic operation is performed even if the other CPU is not even close to being halted. This extra cmpxchg can harm slowpath performance.
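
The idea quoted above, skipping an expensive cmpxchg when the other CPU is nowhere near halted, can be illustrated with a plain load placed in front of the compare-and-swap. This is a minimal sketch in portable C11 atomics, not the code from the patch: the state names and the try_kick() helper are hypothetical and exist only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical vCPU states, named for illustration only. */
enum { VCPU_RUNNING = 0, VCPU_HALTED = 1, VCPU_KICKED = 2 };

/*
 * Try to move a waiter from HALTED to KICKED.
 * A cheap relaxed load filters out the common case first, so the
 * costly compare-and-swap is issued only when the state currently
 * reads as HALTED.
 */
static bool try_kick(atomic_int *state)
{
    if (atomic_load_explicit(state, memory_order_relaxed) != VCPU_HALTED)
        return false;               /* early exit: no cmpxchg issued */

    int expected = VCPU_HALTED;
    /* The state may have changed since the load; the CAS re-checks it. */
    return atomic_compare_exchange_strong(state, &expected, VCPU_KICKED);
}
```

The load is only a filter: correctness still rests on the compare-and-swap, which fails harmlessly if the state changed in between.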

[PATCH 2/6] x86: move decision about clearing slowpath flag into arch_spin_lock()

2015-04-30 Thread Juergen Gross
The decision whether the slowpath flag is to be cleared for paravirtualized spinlocks is located in __ticket_check_and_clear_slowpath() today. Move that decision into arch_spin_lock() and add an unlikely attribute to it to avoid calling a function in case the compiler chooses not to inline
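
As a rough illustration of the pattern this patch describes (test the flag in the caller, mark the branch unlikely, and call the helper only when needed), consider the sketch below. It assumes GCC or Clang for __builtin_expect and the noinline attribute. SLOWPATH_FLAG, lock_acquired(), clear_slowpath_flag() and the call counter are hypothetical stand-ins, and the real kernel helper performs further checks that are omitted here.

```c
#include <stdint.h>

/* Branch hint in the style of the kernel macro (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

#define SLOWPATH_FLAG 1u            /* hypothetical flag in bit 0 */

static int slowpath_calls;          /* demo only: counts out-of-line calls */

/* Out-of-line helper, reached only when the flag is set. */
static void __attribute__((noinline)) clear_slowpath_flag(uint32_t *head)
{
    slowpath_calls++;
    *head &= ~SLOWPATH_FLAG;
}

/* The caller makes the decision, so the common path costs one test. */
static void lock_acquired(uint32_t *head)
{
    if (unlikely(*head & SLOWPATH_FLAG))
        clear_slowpath_flag(head);
}
```

With the test hoisted into the caller, an uncontended lock never pays for a function call even if the compiler declines to inline the helper.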

[PATCH 1/6] x86: use macro instead of 0 for setting TICKET_SLOWPATH_FLAG

2015-04-30 Thread Juergen Gross
For paravirtualized spinlocks, setting the slowpath flag in __ticket_enter_slowpath() is done via setting bit 0 in lock->tickets.head instead of using a macro. Change this by defining an appropriate macro. Signed-off-by: Juergen Gross jgr...@suse.com --- arch/x86/include/asm/spinlock.h | 3
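
The change described here, naming the flag bit instead of hard-coding bit 0, might look roughly like the following. This is a sketch under assumptions rather than the patch itself: the ticket_t width, the struct layout, the SLOWPATH_FLAG_DEMO name and value, and both helpers are illustrative.

```c
#include <stdint.h>

typedef uint16_t ticket_t;                  /* assumed ticket width */

/* Named flag instead of a bare bit number (name and value assumed). */
#define SLOWPATH_FLAG_DEMO ((ticket_t)1)

struct ticket_pair {
    ticket_t head;
    ticket_t tail;
};

/* Set the flag through the macro, so the bit position lives in one place. */
static void enter_slowpath(struct ticket_pair *t)
{
    t->head |= SLOWPATH_FLAG_DEMO;
}

/* Report whether the flag is currently set. */
static int in_slowpath(const struct ticket_pair *t)
{
    return (t->head & SLOWPATH_FLAG_DEMO) != 0;
}
```

Unlike the kernel code, this sketch uses a non-atomic OR; it only shows the naming change, not the required atomicity.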

[PATCH 3/6] x86: introduce new pvops function clear_slowpath

2015-04-30 Thread Juergen Gross
To speed up paravirtualized spinlock handling when running on bare metal, introduce a new pvops function, clear_slowpath. This is a nop when the kernel is running on bare metal. As the clear_slowpath function is common for all users, add a new initialization function to set the pvops function

[PATCH 5/6] x86: switch config from UNINLINE_SPIN_UNLOCK to INLINE_SPIN_UNLOCK

2015-04-30 Thread Juergen Gross
There is no longer any need for special treatment of _raw_spin_unlock() regarding inlining compared to the other spinlock functions. Just treat it like all the other spinlock functions. Remove selecting UNINLINE_SPIN_UNLOCK in case of PARAVIRT_SPINLOCKS. Signed-off-by: Juergen Gross

[PATCH 4/6] x86: introduce new pvops function spin_unlock

2015-04-30 Thread Juergen Gross
To speed up paravirtualized spinlock handling when running on bare metal, introduce a new pvops function, spin_unlock. This is a simple add instruction (possibly with lock prefix) when the kernel is running on bare metal. As the patched instruction includes a lock prefix in some configurations
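
To see why a bare-metal ticket-lock unlock can be a single add, consider this simplified ticket lock in C11 atomics. It is a sketch under assumptions, not the kernel implementation: the struct layout, the function names and the increment of 1 are illustrative (the kernel uses its own increment constant and field types).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Simplified ticket lock: head is "now serving", tail is "next ticket". */
struct ticket_lock {
    _Atomic uint16_t head;
    _Atomic uint16_t tail;
};

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Take a ticket, then spin until it is being served. */
    uint16_t me = atomic_fetch_add(&l->tail, 1);
    while (atomic_load_explicit(&l->head, memory_order_acquire) != me)
        ;   /* busy-wait */
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Without paravirt bookkeeping, unlocking is one add on head. */
    atomic_fetch_add_explicit(&l->head, 1, memory_order_release);
}
```

On x86 a compiler will typically emit the release as a locked add; whether the lock prefix is needed at all depends on the configuration, as the patch description notes.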

[PATCH 0/6] x86: reduce paravirtualized spinlock overhead

2015-04-30 Thread Juergen Gross
Paravirtualized spinlocks produce some overhead even if the kernel is running on bare metal. The main reason is the more complex locking and unlocking functions. Unlocking in particular is no longer just one instruction but so complex that it is no longer inlined. This patch series addresses this

[PATCH 6/6] x86: remove no longer needed paravirt_ticketlocks_enabled

2015-04-30 Thread Juergen Gross
With the paravirtualized spinlock unlock function being a pvops function, paravirt_ticketlocks_enabled is no longer needed. Remove it. Signed-off-by: Juergen Gross jgr...@suse.com --- arch/x86/include/asm/spinlock.h | 3 --- arch/x86/kernel/kvm.c | 14 --