Re: [PATCH v8 01/10] qspinlock: A generic 4-byte queue spinlock implementation

2014-04-04 Thread Waiman Long
On 04/04/2014 12:57 PM, Konrad Rzeszutek Wilk wrote: On Fri, Apr 04, 2014 at 03:00:12PM +0200, Peter Zijlstra wrote: So I'm just not ever going to pick up this patch; I spend a week trying to reverse engineer this; I posted a 7 patch series creating the equivalent, but in a gradual and readable

Re: [PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support

2014-04-04 Thread Waiman Long
On 04/04/2014 12:55 PM, Konrad Rzeszutek Wilk wrote: On Thu, Apr 03, 2014 at 10:57:18PM -0400, Waiman Long wrote: On 04/03/2014 01:23 PM, Konrad Rzeszutek Wilk wrote: On Wed, Apr 02, 2014 at 10:10:17PM -0400, Waiman Long wrote: On 04/02/2014 04:35 PM, Waiman Long wrote: On 04/02/2014 10:32

Re: [PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support

2014-04-03 Thread Waiman Long
On 04/03/2014 01:23 PM, Konrad Rzeszutek Wilk wrote: On Wed, Apr 02, 2014 at 10:10:17PM -0400, Waiman Long wrote: On 04/02/2014 04:35 PM, Waiman Long wrote: On 04/02/2014 10:32 AM, Konrad Rzeszutek Wilk wrote: On Wed, Apr 02, 2014 at 09:27:29AM -0400, Waiman Long wrote: N.B. Sorry

[PATCH v8 02/10] qspinlock, x86: Enable x86-64 to use queue spinlock

2014-04-02 Thread Waiman Long
optimization which will make the queue spinlock code perform better than the generic implementation. Signed-off-by: Waiman Long waiman.l...@hp.com Acked-by: Rik van Riel r...@redhat.com --- arch/x86/Kconfig |1 + arch/x86/include/asm/qspinlock.h | 41

[PATCH v8 03/10] qspinlock: More optimized code for smaller NR_CPUS

2014-04-02 Thread Waiman Long
now be defined in an architecture specific qspinlock.h header file to indicate its support for smaller atomic operation data types. This macro triggers the replacement of some of the generic functions by more optimized versions. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include

[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support

2014-04-02 Thread Waiman Long
Westmere-EP (HT on) Waiman Long (10): qspinlock: A generic 4-byte queue spinlock implementation qspinlock, x86: Enable x86-64 to use queue spinlock qspinlock: More optimized code for smaller NR_CPUS qspinlock: Optimized code path for 2 contending tasks pvqspinlock, x86: Allow unfair spinlock

[PATCH v8 05/10] pvqspinlock, x86: Allow unfair spinlock in a PV guest

2014-04-02 Thread Waiman Long
benefit of an unfair lock. For large critical section, however, there may not be much benefit. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/Kconfig | 11 arch/x86/include/asm/qspinlock.h | 86 +- arch/x86/kernel/Makefile

[PATCH v8 01/10] qspinlock: A generic 4-byte queue spinlock implementation

2014-04-02 Thread Waiman Long
. In this case, the contended spinlock is the mb_cache_spinlock. Signed-off-by: Waiman Long waiman.l...@hp.com Acked-by: Rik van Riel r...@redhat.com --- include/asm-generic/qspinlock.h | 122 +++ include/asm-generic/qspinlock_types.h | 49 + kernel/Kconfig.locks

[PATCH v8 04/10] qspinlock: Optimized code path for 2 contending tasks

2014-04-02 Thread Waiman Long
7641 -15% 7 12907 8373 -35% 8 15094 10259 -32% There is some performance drop at the 3 contending tasks level. Other than that, queue spinlock is faster than ticket spinlock. Signed-off-by: Waiman Long waiman.l

[PATCH v8 06/10] pvqspinlock: Enable lock stealing in queue lock waiters

2014-04-02 Thread Waiman Long
2732 4653 Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock.c | 270 +++- 1 files changed, 265 insertions(+), 5 deletions(-) diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index cf16bba..527efc3

[PATCH v8 07/10] pvqspinlock, x86: Rename paravirt_ticketlocks_enabled

2014-04-02 Thread Waiman Long
This patch renames the paravirt_ticketlocks_enabled static key to a more generic paravirt_spinlocks_enabled name. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/spinlock.h |4 ++-- arch/x86/kernel/kvm.c|2 +- arch/x86/kernel/paravirt

[PATCH v8 09/10] pvqspinlock, x86: Enable qspinlock PV support for KVM

2014-04-02 Thread Waiman Long
-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 111 + kernel/Kconfig.locks |2 +- 2 files changed, 112 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 8e646a7..7d97e58 100644

[PATCH v8 08/10] pvqspinlock, x86: Add qspinlock para-virtualization support

2014-04-02 Thread Waiman Long
the unfair lock and PV spinlock features is turned on, lock stealing will still be allowed in the fastpath, but not in the slowpath. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/paravirt.h | 17 ++- arch/x86/include/asm/paravirt_types.h | 16 ++ arch/x86/include/asm

[PATCH v8 10/10] pvqspinlock, x86: Enable qspinlock PV support for XEN

2014-04-02 Thread Waiman Long
This patch adds the necessary KVM specific code to allow XEN to support the sleeping and CPU kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/xen/spinlock.c | 119 +-- kernel

Re: [PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support

2014-04-02 Thread Waiman Long
On 04/02/2014 10:32 AM, Konrad Rzeszutek Wilk wrote: On Wed, Apr 02, 2014 at 09:27:29AM -0400, Waiman Long wrote: N.B. Sorry for the duplicate. This patch series were resent as the original one was rejected by the vger.kernel.org list server due to long header. There is no change

Re: [PATCH v8 10/10] pvqspinlock, x86: Enable qspinlock PV support for XEN

2014-04-02 Thread Waiman Long
On 04/02/2014 10:39 AM, Konrad Rzeszutek Wilk wrote: diff --git a/kernel/Kconfig.locks b/kernel/Kconfig.locks index a70fdeb..451e392 100644 --- a/kernel/Kconfig.locks +++ b/kernel/Kconfig.locks @@ -229,4 +229,4 @@ config ARCH_USE_QUEUE_SPINLOCK config QUEUE_SPINLOCK def_bool y if

[PATCH v8 02/10] qspinlock, x86: Enable x86-64 to use queue spinlock

2014-04-01 Thread Waiman Long
optimization which will make the queue spinlock code perform better than the generic implementation. Signed-off-by: Waiman Long waiman.l...@hp.com Acked-by: Rik van Riel r...@redhat.com --- arch/x86/Kconfig |1 + arch/x86/include/asm/qspinlock.h | 41

[PATCH v8 00/10] qspinlock: a 4-byte queue spinlock with PV support

2014-04-01 Thread Waiman Long
2899 rec/s Westmere-EX (HT off) 2-socket 12-core 2130 rec/s 2176 rec/s Westmere-EP (HT on) Waiman Long (10): qspinlock: A generic 4-byte queue spinlock implementation qspinlock, x86: Enable x86-64 to use queue spinlock qspinlock: More optimized code

[PATCH v8 03/10] qspinlock: More optimized code for smaller NR_CPUS

2014-04-01 Thread Waiman Long
now be defined in an architecture specific qspinlock.h header file to indicate its support for smaller atomic operation data types. This macro triggers the replacement of some of the generic functions by more optimized versions. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include

[PATCH v8 05/10] pvqspinlock, x86: Allow unfair spinlock in a PV guest

2014-04-01 Thread Waiman Long
benefit of an unfair lock. For large critical section, however, there may not be much benefit. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/Kconfig | 11 arch/x86/include/asm/qspinlock.h | 86 +- arch/x86/kernel/Makefile

[PATCH v8 06/10] pvqspinlock: Enable lock stealing in queue lock waiters

2014-04-01 Thread Waiman Long
2732 4653 Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock.c | 270 +++- 1 files changed, 265 insertions(+), 5 deletions(-) diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index cf16bba..527efc3

[PATCH v8 04/10] qspinlock: Optimized code path for 2 contending tasks

2014-04-01 Thread Waiman Long
7641 -15% 7 12907 8373 -35% 8 15094 10259 -32% There is some performance drop at the 3 contending tasks level. Other than that, queue spinlock is faster than ticket spinlock. Signed-off-by: Waiman Long waiman.l

[PATCH v8 01/10] qspinlock: A generic 4-byte queue spinlock implementation

2014-04-01 Thread Waiman Long
. In this case, the contended spinlock is the mb_cache_spinlock. Signed-off-by: Waiman Long waiman.l...@hp.com Acked-by: Rik van Riel r...@redhat.com --- include/asm-generic/qspinlock.h | 122 +++ include/asm-generic/qspinlock_types.h | 49 + kernel/Kconfig.locks

[PATCH v8 09/10] pvqspinlock, x86: Enable qspinlock PV support for KVM

2014-04-01 Thread Waiman Long
-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 111 + kernel/Kconfig.locks |2 +- 2 files changed, 112 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 8e646a7..7d97e58 100644

[PATCH v8 08/10] pvqspinlock, x86: Add qspinlock para-virtualization support

2014-04-01 Thread Waiman Long
the unfair lock and PV spinlock features is turned on, lock stealing will still be allowed in the fastpath, but not in the slowpath. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/paravirt.h | 17 ++- arch/x86/include/asm/paravirt_types.h | 16 ++ arch/x86/include/asm

[PATCH v8 07/10] pvqspinlock, x86: Rename paravirt_ticketlocks_enabled

2014-04-01 Thread Waiman Long
This patch renames the paravirt_ticketlocks_enabled static key to a more generic paravirt_spinlocks_enabled name. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/spinlock.h |4 ++-- arch/x86/kernel/kvm.c|2 +- arch/x86/kernel/paravirt

Re: [PATCH v7 06/11] pvqspinlock, x86: Allow unfair queue spinlock in a KVM guest

2014-03-21 Thread Waiman Long
On 03/20/2014 06:01 PM, Paolo Bonzini wrote: Il 19/03/2014 21:14, Waiman Long ha scritto: This patch adds a KVM init function to activate the unfair queue spinlock in a KVM guest when the PARAVIRT_UNFAIR_LOCKS kernel config option is selected. Signed-off-by: Waiman Long waiman.l...@hp.com

Re: [PATCH v7 10/11] pvqspinlock, x86: Enable qspinlock PV support for KVM

2014-03-21 Thread Waiman Long
On 03/20/2014 06:07 PM, Paolo Bonzini wrote: Il 19/03/2014 21:14, Waiman Long ha scritto: This patch adds the necessary KVM specific code to allow KVM to support the sleeping and CPU kicking operations needed by the queue spinlock PV code. The remaining problem of this patch is that you

Re: [PATCH v7 02/11] qspinlock, x86: Enable x86-64 to use queue spinlock

2014-03-20 Thread Waiman Long
On 03/19/2014 04:24 PM, Konrad Rzeszutek Wilk wrote: On Wed, Mar 19, 2014 at 04:14:00PM -0400, Waiman Long wrote: This patch makes the necessary changes at the x86 architecture specific layer to enable the use of queue spinlock for x86-64. As x86-32 machines are typically not multi-socket

Re: [PATCH v7 07/11] pvqspinlock, x86: Allow unfair queue spinlock in a XEN guest

2014-03-20 Thread Waiman Long
On 03/19/2014 04:28 PM, Konrad Rzeszutek Wilk wrote: On Wed, Mar 19, 2014 at 04:14:05PM -0400, Waiman Long wrote: This patch adds a XEN init function to activate the unfair queue spinlock in a XEN guest when the PARAVIRT_UNFAIR_LOCKS kernel config option is selected. Signed-off-by: Waiman

Re: [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest

2014-03-19 Thread Waiman Long
On 03/19/2014 06:07 AM, Paolo Bonzini wrote: Il 19/03/2014 04:15, Waiman Long ha scritto: You should see the same values with the PV ticketlock. It is not clear to me if this testing did include that variant of locks? Yes, PV is fine. But up to this point of the series, we are concerned

[PATCH v7 01/11] qspinlock: A generic 4-byte queue spinlock implementation

2014-03-19 Thread Waiman Long
. In this case, the contended spinlock is the mb_cache_spinlock. Signed-off-by: Waiman Long waiman.l...@hp.com Acked-by: Rik van Riel r...@redhat.com --- include/asm-generic/qspinlock.h | 122 +++ include/asm-generic/qspinlock_types.h | 49 + kernel/Kconfig.locks

[PATCH v7 03/11] qspinlock: More optimized code for smaller NR_CPUS

2014-03-19 Thread Waiman Long
now be defined in an architecture specific qspinlock.h header file to indicate its support for smaller atomic operation data types. This macro triggers the replacement of some of the generic functions by more optimized versions. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include

[PATCH v7 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest

2014-03-19 Thread Waiman Long
individual task. For the 4 nodes case above, the standard deviation was 785ms. In general, the shorter the critical section, the better the performance benefit of an unfair lock. For large critical section, however, there may not be much benefit. Signed-off-by: Waiman Long waiman.l...@hp.com

[PATCH v7 08/11] pvqspinlock, x86: Rename paravirt_ticketlocks_enabled

2014-03-19 Thread Waiman Long
This patch renames the paravirt_ticketlocks_enabled static key to a more generic paravirt_spinlocks_enabled name. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/spinlock.h |4 ++-- arch/x86/kernel/kvm.c|2 +- arch/x86/kernel/paravirt

[PATCH v7 07/11] pvqspinlock, x86: Allow unfair queue spinlock in a XEN guest

2014-03-19 Thread Waiman Long
This patch adds a XEN init function to activate the unfair queue spinlock in a XEN guest when the PARAVIRT_UNFAIR_LOCKS kernel config option is selected. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/xen/setup.c | 19 +++ 1 files changed, 19 insertions(+), 0

[PATCH v7 09/11] pvqspinlock, x86: Add qspinlock para-virtualization support

2014-03-19 Thread Waiman Long
in performance. When coupled with unfair lock, the queue spinlock can be much faster than the PV ticket lock. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/paravirt.h | 12 ++- arch/x86/include/asm/paravirt_types.h |5 + arch/x86/include/asm/pvqspinlock.h

[PATCH v7 10/11] pvqspinlock, x86: Enable qspinlock PV support for KVM

2014-03-19 Thread Waiman Long
cases, especially with heavy spinlock contention. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 82 + kernel/Kconfig.locks |2 +- 2 files changed, 83 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/kvm.c

[PATCH RFC v7 11/11] pvqspinlock, x86: Enable qspinlock PV support for XEN

2014-03-19 Thread Waiman Long
This patch adds the necessary KVM specific code to allow XEN to support the sleeping and CPU kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/xen/spinlock.c | 90 --- kernel

[PATCH v7 06/11] pvqspinlock, x86: Allow unfair queue spinlock in a KVM guest

2014-03-19 Thread Waiman Long
This patch adds a KVM init function to activate the unfair queue spinlock in a KVM guest when the PARAVIRT_UNFAIR_LOCKS kernel config option is selected. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 17 + 1 files changed, 17 insertions(+), 0

[PATCH v7 00/11] qspinlock: a 4-byte queue spinlock with PV support

2014-03-19 Thread Waiman Long
. The main purpose is to make the lock contention problems more tolerable until someone can spend the time and effort to fix them. Waiman Long (11): qspinlock: A generic 4-byte queue spinlock implementation qspinlock, x86: Enable x86-64 to use queue spinlock qspinlock: More optimized code

[PATCH v7 02/11] qspinlock, x86: Enable x86-64 to use queue spinlock

2014-03-19 Thread Waiman Long
optimization which will make the queue spinlock code perform better than the generic implementation. Signed-off-by: Waiman Long waiman.l...@hp.com Acked-by: Rik van Riel r...@redhat.com --- arch/x86/Kconfig |1 + arch/x86/include/asm/qspinlock.h | 41

[PATCH v7 04/11] qspinlock: Optimized code path for 2 contending tasks

2014-03-19 Thread Waiman Long
10796 -48% Except some drop in performance at the 3 contending tasks level, the queue spinlock performs much better than the ticket spinlock at 2 and 4 contending tasks level. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/qspinlock.h |3 +- kernel

Re: [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest

2014-03-18 Thread Waiman Long
On 03/17/2014 02:54 PM, Peter Zijlstra wrote: On Mon, Mar 17, 2014 at 01:44:34PM -0400, Waiman Long wrote: The PV ticketlock code was designed to handle lock holder preemption by redirecting CPU resources in a preempted guest to another guest that can better use it and then return the preempted

Re: [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest

2014-03-18 Thread Waiman Long
On 03/17/2014 03:10 PM, Konrad Rzeszutek Wilk wrote: On Mon, Mar 17, 2014 at 01:44:34PM -0400, Waiman Long wrote: On 03/14/2014 04:30 AM, Peter Zijlstra wrote: On Thu, Mar 13, 2014 at 04:05:19PM -0400, Waiman Long wrote: On 03/13/2014 11:15 AM, Peter Zijlstra wrote: On Wed, Mar 12, 2014

Re: [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest

2014-03-18 Thread Waiman Long
On 03/18/2014 04:14 AM, Paolo Bonzini wrote: Il 17/03/2014 20:05, Konrad Rzeszutek Wilk ha scritto: Measurements were done by Gleb for two guests running 2.6.32 with 16 vcpus each, on a 16-core system. One guest ran with unfair locks, one guest ran with fair locks. Two kernel compilations

Re: [PATCH v6 04/11] qspinlock: Optimized code path for 2 contending tasks

2014-03-17 Thread Waiman Long
On 03/13/2014 09:57 AM, Peter Zijlstra wrote: On Wed, Mar 12, 2014 at 03:08:24PM -0400, Waiman Long wrote: On 03/12/2014 02:54 PM, Waiman Long wrote: + /* +* Set the lock bit & clear the waiting bit simultaneously +* It is assumed

Re: [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest

2014-03-17 Thread Waiman Long
On 03/14/2014 04:30 AM, Peter Zijlstra wrote: On Thu, Mar 13, 2014 at 04:05:19PM -0400, Waiman Long wrote: On 03/13/2014 11:15 AM, Peter Zijlstra wrote: On Wed, Mar 12, 2014 at 02:54:52PM -0400, Waiman Long wrote: +static inline void arch_spin_lock(struct qspinlock *lock

Re: [PATCH RFC v6 10/11] pvqspinlock, x86: Enable qspinlock PV support for KVM

2014-03-17 Thread Waiman Long
On 03/14/2014 04:42 AM, Paolo Bonzini wrote: Il 13/03/2014 20:13, Waiman Long ha scritto: This should also disable the unfair path. Paolo The unfair lock uses a different jump label and does not require any special PV ops. There is a separate init function for that. Yeah, what I mean

Re: [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest

2014-03-13 Thread Waiman Long
On 03/13/2014 06:54 AM, David Vrabel wrote: On 12/03/14 18:54, Waiman Long wrote: Locking is always an issue in a virtualized environment as the virtual CPU that is waiting on a lock may get scheduled out and hence block any progress in lock acquisition even when the lock has been freed. One

Re: [PATCH RFC v6 09/11] pvqspinlock, x86: Add qspinlock para-virtualization support

2014-03-13 Thread Waiman Long
On 03/13/2014 07:21 AM, David Vrabel wrote: On 12/03/14 18:54, Waiman Long wrote: This patch adds para-virtualization support to the queue spinlock in the same way as was done in the PV ticket lock code. In essence, the lock waiters will spin for a specified number of times (QSPIN_THRESHOLD = 2

Re: [PATCH RFC v6 09/11] pvqspinlock, x86: Add qspinlock para-virtualization support

2014-03-13 Thread Waiman Long
On 03/13/2014 09:57 AM, Paolo Bonzini wrote: Il 13/03/2014 12:21, David Vrabel ha scritto: On 12/03/14 18:54, Waiman Long wrote: This patch adds para-virtualization support to the queue spinlock in the same way as was done in the PV ticket lock code. In essence, the lock waiters will spin

Re: [PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest

2014-03-13 Thread Waiman Long
On 03/13/2014 11:15 AM, Peter Zijlstra wrote: On Wed, Mar 12, 2014 at 02:54:52PM -0400, Waiman Long wrote: +static inline void arch_spin_lock(struct qspinlock *lock) +{ + if (static_key_false(&paravirt_unfairlocks_enabled)) + queue_spin_lock_unfair(lock); + else

Re: [PATCH RFC v6 10/11] pvqspinlock, x86: Enable qspinlock PV support for KVM

2014-03-13 Thread Waiman Long
On 03/13/2014 11:25 AM, Peter Zijlstra wrote: On Wed, Mar 12, 2014 at 02:54:57PM -0400, Waiman Long wrote: A KVM guest of 20 CPU cores was created to run the disk workload of the AIM7 benchmark on both ext4 and xfs RAM disks at 3000 users on a 3.14-rc6 based kernel. The JPM (jobs/minute) data

[PATCH v6 00/11] qspinlock: a 4-byte queue spinlock with PV support

2014-03-12 Thread Waiman Long
spinlock contention problems. Those need to be solved by refactoring the code to make more efficient use of the lock or finer granularity ones. The main purpose is to make the lock contention problems more tolerable until someone can spend the time and effort to fix them. Waiman Long (11): qspinlock

[PATCH v6 01/11] qspinlock: A generic 4-byte queue spinlock implementation

2014-03-12 Thread Waiman Long
. In this case, the contended spinlock is the mb_cache_spinlock. Signed-off-by: Waiman Long waiman.l...@hp.com Acked-by: Rik van Riel r...@redhat.com --- include/asm-generic/qspinlock.h | 122 +++ include/asm-generic/qspinlock_types.h | 55 + kernel/Kconfig.locks

[PATCH v6 06/11] pvqspinlock, x86: Allow unfair queue spinlock in a KVM guest

2014-03-12 Thread Waiman Long
This patch adds a KVM init function to activate the unfair queue spinlock in a KVM guest when the PARAVIRT_UNFAIR_LOCKS kernel config option is selected. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 17 + 1 files changed, 17 insertions(+), 0

[PATCH v6 04/11] qspinlock: Optimized code path for 2 contending tasks

2014-03-12 Thread Waiman Long
10796 -48% Except some drop in performance at the 3 contending tasks level, the queue spinlock performs much better than the ticket spinlock at 2 and 4 contending tasks level. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/qspinlock.h |3 +- kernel

[PATCH v6 07/11] pvqspinlock, x86: Allow unfair queue spinlock in a XEN guest

2014-03-12 Thread Waiman Long
This patch adds a XEN init function to activate the unfair queue spinlock in a XEN guest when the PARAVIRT_UNFAIR_LOCKS kernel config option is selected. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/xen/setup.c | 19 +++ 1 files changed, 19 insertions(+), 0

[PATCH v6 05/11] pvqspinlock, x86: Allow unfair spinlock in a PV guest

2014-03-12 Thread Waiman Long
section, the better the performance benefit of an unfair lock. For large critical section, however, there may not be much benefit. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/Kconfig | 11 + arch/x86/include/asm/qspinlock.h | 72

[PATCH v6 03/11] qspinlock: More optimized code for smaller NR_CPUS

2014-03-12 Thread Waiman Long
now be defined in an architecture specific qspinlock.h header file to indicate its support for smaller atomic operation data types. This macro triggers the replacement of some of the generic functions by more optimized versions. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include

[PATCH v6 08/11] pvqspinlock, x86: Rename paravirt_ticketlocks_enabled

2014-03-12 Thread Waiman Long
This patch renames the paravirt_ticketlocks_enabled static key to a more generic paravirt_spinlocks_enabled name. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/spinlock.h |4 ++-- arch/x86/kernel/kvm.c|2 +- arch/x86/kernel/paravirt

[PATCH RFC v6 11/11] pvqspinlock, x86: Enable qspinlock PV support for XEN

2014-03-12 Thread Waiman Long
This patch adds the necessary KVM specific code to allow XEN to support the sleeping and CPU kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/xen/spinlock.c | 95 -- kernel

[PATCH RFC v6 09/11] pvqspinlock, x86: Add qspinlock para-virtualization support

2014-03-12 Thread Waiman Long
, on the other hand, only has a minor drop in performance for 3 or more contending tasks. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/paravirt.h | 12 ++- arch/x86/include/asm/paravirt_types.h | 12 ++ arch/x86/include/asm/pvqspinlock.h| 232

[PATCH RFC v6 10/11] pvqspinlock, x86: Enable qspinlock PV support for KVM

2014-03-12 Thread Waiman Long
. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 87 + kernel/Kconfig.locks |2 +- 2 files changed, 88 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index f318e78..aaf704e

Re: [PATCH v6 04/11] qspinlock: Optimized code path for 2 contending tasks

2014-03-12 Thread Waiman Long
On 03/12/2014 02:54 PM, Waiman Long wrote: + + /* +* Now wait until the lock bit is cleared +*/ + while (smp_load_acquire(&qlock->qlcode) & _QSPINLOCK_LOCKED) + arch_mutex_cpu_relax

Re: [PATCH v5 1/8] qspinlock: Introducing a 4-byte queue spinlock implementation

2014-03-04 Thread Waiman Long
On 03/02/2014 08:12 AM, Oleg Nesterov wrote: On 02/26, Waiman Long wrote: +void queue_spin_lock_slowpath(struct qspinlock *lock, int qsval) +{ + unsigned int cpu_nr, qn_idx; + struct qnode *node, *next; + u32 prev_qcode, my_qcode; + + /* +* Get the queue node

Re: [PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks

2014-03-04 Thread Waiman Long
On 03/02/2014 08:16 AM, Oleg Nesterov wrote: On 02/26, Waiman Long wrote: @@ -144,7 +317,7 @@ static __always_inline int queue_spin_setlock(struct qspinlock *lock) int qlcode = atomic_read(&lock->qlcode); if (!(qlcode & _QSPINLOCK_LOCKED) && (atomic_cmpxchg(&lock->qlcode

Re: [PATCH v5 1/8] qspinlock: Introducing a 4-byte queue spinlock implementation

2014-03-04 Thread Waiman Long
On 03/02/2014 08:31 AM, Oleg Nesterov wrote: Forgot to ask... On 02/26, Waiman Long wrote: +notify_next: + /* +* Wait, if needed, until the next one in queue set up the next field +*/ + while (!(next = ACCESS_ONCE(node->next))) + arch_mutex_cpu_relax

Re: [PATCH v5 2/8] qspinlock, x86: Enable x86-64 to use queue spinlock

2014-03-04 Thread Waiman Long
On 03/02/2014 09:10 AM, Oleg Nesterov wrote: On 02/26, Waiman Long wrote: +#define _ARCH_SUPPORTS_ATOMIC_8_16_BITS_OPS + +/* + * x86-64 specific queue spinlock union structure + */ +union arch_qspinlock { + struct qspinlock slock; + u8 lock; /* Lock bit

Re: [PATCH RFC v5 4/8] pvqspinlock, x86: Allow unfair spinlock in a real PV environment

2014-03-04 Thread Waiman Long
On 03/03/2014 05:55 AM, Paolo Bonzini wrote: Il 28/02/2014 18:06, Waiman Long ha scritto: On 02/26/2014 12:07 PM, Konrad Rzeszutek Wilk wrote: On Wed, Feb 26, 2014 at 10:14:24AM -0500, Waiman Long wrote: Locking is always an issue in a virtualized environment as the virtual CPU

Re: [PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks

2014-03-04 Thread Waiman Long
On 03/03/2014 12:43 PM, Peter Zijlstra wrote: Hi, Here are some numbers for my version -- also attached is the test code. I found that booting big machines is tediously slow so I lifted the whole lot to userspace. I measure the cycles spend in arch_spin_lock() + arch_spin_unlock(). The

Re: [PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks

2014-03-04 Thread Waiman Long
Peter, I was trying to implement the generic queue code exchange code using cmpxchg as suggested by you. However, when I gathered the performance data, the code performed worse than I expected at a higher contention level. Below were the execution time of the benchmark tool that I sent you:

Re: [PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks

2014-02-28 Thread Waiman Long
On 02/28/2014 04:29 AM, Peter Zijlstra wrote: On Thu, Feb 27, 2014 at 03:42:19PM -0500, Waiman Long wrote: + old = xchg(&qlock->lock_wait, _QSPINLOCK_WAITING|_QSPINLOCK_LOCKED); + + if (old == 0) { + /* +* Got the lock, can clear the waiting bit now

Re: [PATCH v5 0/8] qspinlock: a 4-byte queue spinlock with PV support

2014-02-28 Thread Waiman Long
On 02/26/2014 12:00 PM, Konrad Rzeszutek Wilk wrote: On Wed, Feb 26, 2014 at 10:14:20AM -0500, Waiman Long wrote: It should be fairly easy. You just need to implement the kick right? An IPI should be all that is needed - look in xen_unlock_kick. The rest of the spinlock code is all generic

Re: [PATCH RFC v5 4/8] pvqspinlock, x86: Allow unfair spinlock in a real PV environment

2014-02-28 Thread Waiman Long
On 02/26/2014 12:07 PM, Konrad Rzeszutek Wilk wrote: On Wed, Feb 26, 2014 at 10:14:24AM -0500, Waiman Long wrote: Locking is always an issue in a virtualized environment as the virtual CPU that is waiting on a lock may get scheduled out and hence block any progress in lock acquisition even when

Re: [PATCH RFC v5 5/8] pvqspinlock, x86: Enable unfair queue spinlock in a KVM guest

2014-02-28 Thread Waiman Long
On 02/26/2014 12:08 PM, Konrad Rzeszutek Wilk wrote: On Wed, Feb 26, 2014 at 10:14:25AM -0500, Waiman Long wrote: This patch adds a KVM init function to activate the unfair queue spinlock in a KVM guest when the PARAVIRT_UNFAIR_LOCKS kernel config option is selected. Signed-off-by: Waiman

Re: [PATCH v5 0/8] qspinlock: a 4-byte queue spinlock with PV support

2014-02-27 Thread Waiman Long
On 02/27/2014 03:37 AM, Peter Zijlstra wrote: Is this the same 8 patches you send yesterday? Sorry for the duplication. It was the same patch. It has some minor update in the cover-letter to include some KVM guest test results. I was having problem locating the patch from the LKML list and

Re: [PATCH RFC v5 8/8] pvqspinlock, x86: Enable KVM to use qspinlock's PV support

2014-02-27 Thread Waiman Long
On 02/27/2014 04:31 AM, Paolo Bonzini wrote: static __init int kvm_spinlock_init_jump(void) diff --git a/kernel/Kconfig.locks b/kernel/Kconfig.locks index f185584..a70fdeb 100644 --- a/kernel/Kconfig.locks +++ b/kernel/Kconfig.locks @@ -229,4 +229,4 @@ config ARCH_USE_QUEUE_SPINLOCK config

Re: [PATCH RFC v5 5/8] pvqspinlock, x86: Enable unfair queue spinlock in a KVM guest

2014-02-27 Thread Waiman Long
On 02/27/2014 04:41 AM, Paolo Bonzini wrote: Il 26/02/2014 16:14, Waiman Long ha scritto: This patch adds a KVM init function to activate the unfair queue spinlock in a KVM guest when the PARAVIRT_UNFAIR_LOCKS kernel config option is selected. Signed-off-by: Waiman Long waiman.l...@hp.com

Re: [PATCH RFC v5 5/8] pvqspinlock, x86: Enable unfair queue spinlock in a KVM guest

2014-02-27 Thread Waiman Long
On 02/27/2014 05:40 AM, Raghavendra K T wrote: On 02/26/2014 08:44 PM, Waiman Long wrote: This patch adds a KVM init function to activate the unfair queue spinlock in a KVM guest when the PARAVIRT_UNFAIR_LOCKS kernel config option is selected. Signed-off-by: Waiman Long waiman.l...@hp.com

Re: [PATCH RFC v5 4/8] pvqspinlock, x86: Allow unfair spinlock in a real PV environment

2014-02-27 Thread Waiman Long
On 02/27/2014 07:28 AM, David Vrabel wrote: On 26/02/14 15:14, Waiman Long wrote: Locking is always an issue in a virtualized environment as the virtual CPU that is waiting on a lock may get scheduled out and hence block any progress in lock acquisition even when the lock has been freed. One

Re: [PATCH RFC v5 7/8] pvqspinlock, x86: Add qspinlock para-virtualization support

2014-02-27 Thread Waiman Long
On 02/27/2014 09:45 AM, Paolo Bonzini wrote: Il 27/02/2014 15:18, David Vrabel ha scritto: On 27/02/14 13:11, Paolo Bonzini wrote: Il 27/02/2014 13:11, David Vrabel ha scritto: This patch adds para-virtualization support to the queue spinlock code by enabling the queue head to kick the lock

Re: [PATCH v5 1/8] qspinlock: Introducing a 4-byte queue spinlock implementation

2014-02-27 Thread Waiman Long
On 02/26/2014 11:22 AM, Peter Zijlstra wrote: On Wed, Feb 26, 2014 at 10:14:21AM -0500, Waiman Long wrote: +struct qnode { + u32 wait; /* Waiting flag */ + struct qnode*next; /* Next queue node addr */ +}; + +struct qnode_set

Re: [PATCH v5 1/8] qspinlock: Introducing a 4-byte queue spinlock implementation

2014-02-27 Thread Waiman Long
On 02/26/2014 11:24 AM, Peter Zijlstra wrote: On Wed, Feb 26, 2014 at 10:14:21AM -0500, Waiman Long wrote: +static void put_qnode(void) +{ + struct qnode_set *qset = this_cpu_ptr(&qnset); + + qset->node_idx--; +} That very much wants to be: this_cpu_dec(). Yes, I will change

Re: [PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks

2014-02-27 Thread Waiman Long
will add more comments to describe the 4 possible cases and how to handle them. On Wed, Feb 26, 2014 at 10:14:23AM -0500, Waiman Long wrote: +static inline int queue_spin_trylock_quick(struct qspinlock *lock, int qsval) +{ + union arch_qspinlock *qlock = (union arch_qspinlock *)lock

Re: [PATCH RFC v5 7/8] pvqspinlock, x86: Add qspinlock para-virtualization support

2014-02-27 Thread Waiman Long
On 02/27/2014 10:22 AM, Raghavendra K T wrote: On 02/27/2014 08:15 PM, Paolo Bonzini wrote: [...] But neither of the VCPUs being kicked here are halted -- they're either running or runnable (descheduled by the hypervisor). /me actually looks at Waiman's code... Right, this is really

[PATCH v5 2/8] qspinlock, x86: Enable x86-64 to use queue spinlock

2014-02-26 Thread Waiman Long
optimization which will make the queue spinlock code perform better than the generic implementation. Signed-off-by: Waiman Long waiman.l...@hp.com Acked-by: Rik van Riel r...@redhat.com --- arch/x86/Kconfig |1 + arch/x86/include/asm/qspinlock.h | 41

[PATCH v5 1/8] qspinlock: Introducing a 4-byte queue spinlock implementation

2014-02-26 Thread Waiman Long
, the disk workload improved from 416281 JPM to 899101 JPM (+116%) with the patch. In this case, the contended spinlock is the mb_cache_spinlock. Signed-off-by: Waiman Long waiman.l...@hp.com Acked-by: Rik van Riel r...@redhat.com --- include/asm-generic/qspinlock.h | 122 ++ include/asm

[PATCH v5 0/8] qspinlock: a 4-byte queue spinlock with PV support

2014-02-26 Thread Waiman Long
efficient use of the lock or finer granularity ones. The main purpose is to make the lock contention problems more tolerable until someone can spend the time and effort to fix them. Waiman Long (8): qspinlock: Introducing a 4-byte queue spinlock implementation qspinlock, x86: Enable x86-64

[PATCH RFC v5 5/8] pvqspinlock, x86: Enable unfair queue spinlock in a KVM guest

2014-02-26 Thread Waiman Long
This patch adds a KVM init function to activate the unfair queue spinlock in a KVM guest when the PARAVIRT_UNFAIR_LOCKS kernel config option is selected. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 17 + 1 files changed, 17 insertions(+), 0

[PATCH v5 3/8] qspinlock, x86: Add x86 specific optimization for 2 contending tasks

2014-02-26 Thread Waiman Long
-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/qspinlock.h | 20 - include/asm-generic/qspinlock_types.h |8 ++- kernel/locking/qspinlock.c| 192 - 3 files changed, 215 insertions(+), 5 deletions(-) diff --git a/arch/x86

[PATCH RFC v5 4/8] pvqspinlock, x86: Allow unfair spinlock in a real PV environment

2014-02-26 Thread Waiman Long
guest. Enabling this configuration feature decreases the performance of an uncontended lock-unlock operation by about 1-2%. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/Kconfig | 11 + arch/x86/include/asm/qspinlock.h | 74

[PATCH RFC v5 6/8] pvqspinlock, x86: Rename paravirt_ticketlocks_enabled

2014-02-26 Thread Waiman Long
This patch renames the paravirt_ticketlocks_enabled static key to a more generic paravirt_spinlocks_enabled name. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/spinlock.h |4 ++-- arch/x86/kernel/kvm.c|2 +- arch/x86/kernel/paravirt

[PATCH RFC v5 8/8] pvqspinlock, x86: Enable KVM to use qspinlock's PV support

2014-02-26 Thread Waiman Long
-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 54 + kernel/Kconfig.locks |2 +- 2 files changed, 55 insertions(+), 1 deletions(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index f318e78..3ddc436 100644 --- a/arch

[PATCH RFC v5 7/8] pvqspinlock, x86: Add qspinlock para-virtualization support

2014-02-26 Thread Waiman Long
to make sure that their CPUs will stay scheduled in. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/paravirt.h |9 ++- arch/x86/include/asm/paravirt_types.h | 12 +++ arch/x86/include/asm/pvqspinlock.h| 176 + arch/x86

[PATCH v5 0/8] qspinlock: a 4-byte queue spinlock with PV support

2014-02-26 Thread Waiman Long
for improvement compared with the current PV ticketlock implementation. Waiman Long (8): qspinlock: Introducing a 4-byte queue spinlock implementation qspinlock, x86: Enable x86-64 to use queue spinlock qspinlock, x86: Add x86 specific optimization for 2 contending tasks pvqspinlock, x86
