[PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015-04-24 Thread Waiman Long
without using atomic op. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock_paravirt.h | 28 +--- 1 files changed, 25 insertions(+), 3 deletions(-) diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h index

[PATCH v16 12/14] pvqspinlock: Only kick CPU at unlock time

2015-04-24 Thread Waiman Long
() so as to do the pv_kick() only if it is really necessary. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock.c | 10 ++-- kernel/locking/qspinlock_paravirt.h | 76 +- 2 files changed, 61 insertions(+), 25 deletions(-) diff --git

[PATCH v16 00/14] qspinlock: a 4-byte queue spinlock with PV support

2015-04-24 Thread Waiman Long
-and-set on hypervisors pvqspinlock, x86: Implement the paravirt qspinlock call patching Waiman Long (9): qspinlock: A simple generic 4-byte queue spinlock qspinlock, x86: Enable x86-64 to use queue spinlock qspinlock: Extract out code snippets for the next patch qspinlock: Use a simple write

[PATCH v16 02/14] qspinlock, x86: Enable x86-64 to use queue spinlock

2015-04-24 Thread Waiman Long
optimization which will make the queue spinlock code perform better than the generic implementation. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- arch/x86/Kconfig |1 + arch/x86/include/asm/qspinlock.h | 20

[PATCH v16 11/14] pvqspinlock, x86: Enable PV qspinlock for Xen

2015-04-24 Thread Waiman Long
From: David Vrabel david.vra...@citrix.com This patch adds the necessary Xen specific code to allow Xen to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: David Vrabel david.vra...@citrix.com Signed-off-by: Waiman Long waiman.l...@hp.com

[PATCH v16 09/14] pvqspinlock, x86: Implement the paravirt qspinlock call patching

2015-04-24 Thread Waiman Long
. This significantly lowers the overhead of having CONFIG_PARAVIRT_SPINLOCKS enabled, even for native code. Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/Kconfig |2 +- arch/x86/include/asm/paravirt.h

[PATCH v16 07/14] qspinlock: Revert to test-and-set on hypervisors

2015-04-24 Thread Waiman Long
From: Peter Zijlstra (Intel) pet...@infradead.org When we detect a hypervisor (!paravirt, see qspinlock paravirt support patches), revert to a simple test-and-set lock to avoid the horrors of queue preemption. Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long
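For readers unfamiliar with the fallback this patch describes, a simple test-and-set lock can be sketched in portable C11 roughly as follows. This is only an illustration, not the code from the patch: the names `tas_lock`, `tas_trylock`, and `tas_unlock` are hypothetical, and the kernel uses its own atomic primitives rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration: 0 = unlocked, 1 = locked. */
struct tas_lock {
    atomic_int locked;
};

/* Try once to take the lock; returns true on success. */
static inline bool tas_trylock(struct tas_lock *l)
{
    /* Read first so that waiters mostly spin on a shared cache line
     * instead of issuing an atomic write on every attempt. */
    return atomic_load_explicit(&l->locked, memory_order_relaxed) == 0 &&
           atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) == 0;
}

/* Spin until the lock is taken. There is no queue, so a waiter whose
 * vCPU is preempted never blocks the waiters behind it. */
static inline void tas_lock(struct tas_lock *l)
{
    while (!tas_trylock(l))
        ; /* busy-wait; real kernel code would relax the CPU here */
}

/* Release the lock with a plain release store. */
static inline void tas_unlock(struct tas_lock *l)
{
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The trade-off is fairness: any spinner may win the lock, so ordering is lost, but no waiter depends on another waiter being scheduled.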

[PATCH v16 01/14] qspinlock: A simple generic 4-byte queue spinlock

2015-04-24 Thread Waiman Long
the lock is acquired, the queue node can be released to be used later. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- include/asm-generic/qspinlock.h | 132 + include/asm-generic/qspinlock_types.h | 58

[PATCH v16 04/14] qspinlock: Extract out code snippets for the next patch

2015-04-24 Thread Waiman Long
the locked bit into a new clear_pending_set_locked() function. This patch also simplifies the trylock operation before queuing by calling queue_spin_trylock() directly. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- include/asm-generic

[PATCH v16 14/14] pvqspinlock: Collect slowpath lock statistics

2015-04-24 Thread Waiman Long
the pv-qspinlock directory. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock_paravirt.h | 100 ++- 1 files changed, 98 insertions(+), 2 deletions(-) diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h

[PATCH v16 08/14] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-24 Thread Waiman Long
linear feedback shift register. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock.c | 68 +++- kernel/locking/qspinlock_paravirt.h | 324 +++ 2 files changed, 391 insertions(+), 1 deletions(-) create mode 100644 kernel

[PATCH v16 05/14] qspinlock: Optimize for smaller NR_CPUS

2015-04-24 Thread Waiman Long
-by: Waiman Long waiman.l...@hp.com --- include/asm-generic/qspinlock_types.h | 13 ++ kernel/locking/qspinlock.c| 69 - 2 files changed, 81 insertions(+), 1 deletions(-) diff --git a/include/asm-generic/qspinlock_types.h b/include/asm-generic

[PATCH v16 06/14] qspinlock: Use a simple write to grab the lock

2015-04-24 Thread Waiman Long
-- - ticketlock 2075 10.00 216.35 3.49 qspinlock 3023 10.00 198.20 4.80 Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org

[PATCH v16 10/14] pvqspinlock, x86: Enable PV qspinlock for KVM

2015-04-24 Thread Waiman Long
This patch adds the necessary KVM specific code to allow KVM to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 43 +++ kernel/Kconfig.locks

[PATCH v16 03/14] qspinlock: Add pending bit

2015-04-24 Thread Waiman Long
-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com --- include/asm-generic/qspinlock_types.h | 12 +++- kernel/locking/qspinlock.c| 119 +++-- 2 files changed, 107 insertions(+), 24 deletions(-) diff --git a/include

Re: [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-13 Thread Waiman Long
On 04/13/2015 11:09 AM, Peter Zijlstra wrote: On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote: +__visible void __pv_queue_spin_unlock(struct qspinlock *lock) +{ + struct __qspinlock *l = (void *)lock; + struct pv_node *node; + + if (likely(cmpxchg(&l->locked

Re: [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-13 Thread Waiman Long
On 04/13/2015 10:47 AM, Peter Zijlstra wrote: On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote: +void __init __pv_init_lock_hash(void) +{ + int pv_hash_size = 4 * num_possible_cpus(); + + if (pv_hash_size < (1U << LFSR_MIN_BITS)) + pv_hash_size = (1U

Re: [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-13 Thread Waiman Long
On 04/13/2015 11:08 AM, Peter Zijlstra wrote: On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote: +static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node) +{ + struct __qspinlock *l = (void *)lock; + struct qspinlock **lp = NULL; + struct pv_node

Re: [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-09 Thread Waiman Long
On 04/09/2015 02:23 PM, Peter Zijlstra wrote: On Thu, Apr 09, 2015 at 08:13:27PM +0200, Peter Zijlstra wrote: On Mon, Apr 06, 2015 at 10:55:44PM -0400, Waiman Long wrote: +#define PV_HB_PER_LINE (SMP_CACHE_BYTES / sizeof(struct pv_hash_bucket)) +static struct qspinlock **pv_hash(struct

[PATCH v15 16/16] unfair qspinlock: a queue based unfair lock

2015-04-08 Thread Waiman Long
the performance benefit of qspinlock versus ticket spinlock which got reduced in VM3 due to the overhead of constant vCPU halting and kicking. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/qspinlock.h | 15 +-- kernel/locking/qspinlock.c| 94 +-- kernel

Re: [Xen-devel] [PATCH v15 12/15] pvqspinlock, x86: Enable PV qspinlock for Xen

2015-04-08 Thread Waiman Long
On 04/08/2015 08:01 AM, David Vrabel wrote: On 07/04/15 03:55, Waiman Long wrote: This patch adds the necessary Xen specific code to allow Xen to support the CPU halting and kicking operations needed by the queue spinlock PV code. This basically looks the same as the version I wrote, except I

[PATCH v15 14/15] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015-04-06 Thread Waiman Long
without using atomic op. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock_paravirt.h | 28 +--- 1 files changed, 25 insertions(+), 3 deletions(-) diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h index

[PATCH v15 11/15] pvqspinlock, x86: Enable PV qspinlock for KVM

2015-04-06 Thread Waiman Long
This patch adds the necessary KVM specific code to allow KVM to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 43 +++ kernel/Kconfig.locks

[PATCH v15 12/15] pvqspinlock, x86: Enable PV qspinlock for Xen

2015-04-06 Thread Waiman Long
This patch adds the necessary Xen specific code to allow Xen to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/xen/spinlock.c | 63 --- kernel

[PATCH v15 06/15] qspinlock: Use a simple write to grab the lock

2015-04-06 Thread Waiman Long
-- - ticketlock 2075 10.00 216.35 3.49 qspinlock 3023 10.00 198.20 4.80 Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org

[PATCH v15 00/15] qspinlock: a 4-byte queue spinlock with PV support

2015-04-06 Thread Waiman Long
qspinlock: Revert to test-and-set on hypervisors pvqspinlock: Implement the paravirt qspinlock for x86 Waiman Long (11): qspinlock: A simple generic 4-byte queue spinlock qspinlock, x86: Enable x86-64 to use queue spinlock qspinlock: Extract out code snippets for the next patch qspinlock: Use

[PATCH v15 01/15] qspinlock: A simple generic 4-byte queue spinlock

2015-04-06 Thread Waiman Long
the lock is acquired, the queue node can be released to be used later. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- include/asm-generic/qspinlock.h | 132 + include/asm-generic/qspinlock_types.h | 58

[PATCH v15 05/15] qspinlock: Optimize for smaller NR_CPUS

2015-04-06 Thread Waiman Long
-by: Waiman Long waiman.l...@hp.com --- include/asm-generic/qspinlock_types.h | 13 ++ kernel/locking/qspinlock.c| 69 - 2 files changed, 81 insertions(+), 1 deletions(-) diff --git a/include/asm-generic/qspinlock_types.h b/include/asm-generic

[PATCH v15 03/15] qspinlock: Add pending bit

2015-04-06 Thread Waiman Long
-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com --- include/asm-generic/qspinlock_types.h | 12 +++- kernel/locking/qspinlock.c| 119 +++-- 2 files changed, 107 insertions(+), 24 deletions(-) diff --git a/include

[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-06 Thread Waiman Long
linear feedback shift register. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock.c | 69 - kernel/locking/qspinlock_paravirt.h | 321 +++ 2 files changed, 389 insertions(+), 1 deletions(-) create mode 100644 kernel

[PATCH v15 08/15] lfsr: a simple binary Galois linear feedback shift register

2015-04-06 Thread Waiman Long
the value 0 in a somewhat random fashion depending on the LFSR taps that are being used. Callers can provide their own taps value or use the default. Signed-off-by: Waiman Long waiman.l...@hp.com --- include/linux/lfsr.h | 80 ++ 1 files changed, 80
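As a rough illustration of the technique named in this patch, one step of a binary Galois LFSR can be sketched in C as below. This is not the contents of `include/linux/lfsr.h`: the function names, the 8-bit width, and the taps value `0xB8` (polynomial x^8 + x^6 + x^5 + x^4 + 1) are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default taps for an 8-bit maximal-length Galois LFSR;
 * chosen for this sketch, not taken from the patch. */
#define LFSR8_TAPS 0xB8u

/* Advance the register by one step: shift right, and if the bit
 * shifted out was 1, XOR the taps mask into the state.
 * State 0 is excluded because it only ever maps to itself. */
static inline uint8_t lfsr8_next(uint8_t state, uint8_t taps)
{
    assert(state != 0);
    uint8_t lsb = state & 1u;
    state >>= 1;
    if (lsb)
        state ^= taps;
    return state;
}

/* Count the steps needed to return to the seed (the sequence period).
 * The loop is bounded and stops early if a poor taps value drives the
 * state to 0. */
static unsigned lfsr8_period(uint8_t seed, uint8_t taps)
{
    unsigned n = 0;
    uint8_t s = seed;

    do {
        s = lfsr8_next(s, taps);
        n++;
    } while (s != seed && s != 0 && n < 256);
    return n;
}
```

With a maximal-length taps value, the register visits every non-zero value exactly once before repeating, which is what makes it usable for stepping through table slots in a scattered order.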

[PATCH v15 04/15] qspinlock: Extract out code snippets for the next patch

2015-04-06 Thread Waiman Long
the locked bit into a new clear_pending_set_locked() function. This patch also simplifies the trylock operation before queuing by calling queue_spin_trylock() directly. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- include/asm-generic

[PATCH v15 15/15] pvqspinlock: Add debug code to check for PV lock hash sanity

2015-04-06 Thread Waiman Long
that which will only be enabled if CONFIG_DEBUG_SPINLOCK is defined because of the performance overhead it introduces. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock_paravirt.h | 58 +++ 1 files changed, 58 insertions(+), 0 deletions

[PATCH v15 10/15] pvqspinlock: Implement the paravirt qspinlock for x86

2015-04-06 Thread Waiman Long
. This significantly lowers the overhead of having CONFIG_PARAVIRT_SPINLOCKS enabled, even for native code. Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/Kconfig |2 +- arch/x86/include/asm/paravirt.h

[PATCH v15 13/15] pvqspinlock: Only kick CPU at unlock time

2015-04-06 Thread Waiman Long
vCPU state (vcpu_hashed) which enables the code to delay CPU kicking until unlock time. Once this state is set, the new lock holder will set _Q_SLOW_VAL and fill in the hash table on behalf of the halted queue head vCPU. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking

[PATCH v15 07/15] qspinlock: Revert to test-and-set on hypervisors

2015-04-06 Thread Waiman Long
From: Peter Zijlstra (Intel) pet...@infradead.org When we detect a hypervisor (!paravirt, see qspinlock paravirt support patches), revert to a simple test-and-set lock to avoid the horrors of queue preemption. Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long

[PATCH v15 02/15] qspinlock, x86: Enable x86-64 to use queue spinlock

2015-04-06 Thread Waiman Long
optimization which will make the queue spinlock code perform better than the generic implementation. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- arch/x86/Kconfig |1 + arch/x86/include/asm/qspinlock.h | 20

Re: [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-02 Thread Waiman Long
On 04/02/2015 03:48 PM, Peter Zijlstra wrote: On Thu, Apr 02, 2015 at 07:20:57PM +0200, Peter Zijlstra wrote: pv_wait_head(): pv_hash() /* MB as per cmpxchg */ cmpxchg(&l->locked, _Q_LOCKED_VAL, _Q_SLOW_VAL); VS __pv_queue_spin_unlock(): if (xchg(&l->locked, 0)

Re: [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-02 Thread Waiman Long
On 04/01/2015 05:03 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 03:58:58PM -0400, Waiman Long wrote: On 04/01/2015 02:48 PM, Peter Zijlstra wrote: I am sorry that I don't quite get what you mean here. My point is that in the hashing step, a cpu will need to scan an empty bucket to put

Re: [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Waiman Long
On 03/19/2015 08:25 AM, Peter Zijlstra wrote: On Thu, Mar 19, 2015 at 11:12:42AM +0100, Peter Zijlstra wrote: So I was now thinking of hashing the lock pointer; let me go and quickly put something together. A little something like so; ideally we'd allocate the hashtable since NR_CPUS is kinda

Re: [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Waiman Long
On 04/01/2015 02:17 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 07:42:39PM +0200, Peter Zijlstra wrote: Hohumm.. time to think more I think ;-) So bear with me, I've not really pondered this well so it could be full of holes (again). After the cmpxchg(&l->locked, _Q_LOCKED_VAL,

Re: [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Waiman Long
On 04/01/2015 02:48 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 02:54:45PM -0400, Waiman Long wrote: On 04/01/2015 02:17 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 07:42:39PM +0200, Peter Zijlstra wrote: Hohumm.. time to think more I think ;-) So bear with me, I've not really

Re: [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Waiman Long
On 04/01/2015 01:12 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 12:20:30PM -0400, Waiman Long wrote: After more careful reading, I think the assumption that the presence of an unused bucket means there is no match is not true. Consider the scenario: 1. cpu 0 puts lock1 into hb[0] 2. cpu

Re: [PATCH 0/9] qspinlock stuff -v15

2015-03-30 Thread Waiman Long
On 03/30/2015 12:29 PM, Peter Zijlstra wrote: On Mon, Mar 30, 2015 at 12:25:12PM -0400, Waiman Long wrote: I did it differently in my PV portion of the qspinlock patch. Instead of just waking up the CPU, the new lock holder will check if the new queue head has been halted. If so, it will set

Re: [PATCH 0/9] qspinlock stuff -v15

2015-03-30 Thread Waiman Long
On 03/27/2015 10:07 AM, Konrad Rzeszutek Wilk wrote: On Thu, Mar 26, 2015 at 09:21:53PM +0100, Peter Zijlstra wrote: On Wed, Mar 25, 2015 at 03:47:39PM -0400, Konrad Rzeszutek Wilk wrote: Ah nice. That could be spun out as a separate patch to optimize the existing ticket locks I presume. Yes

Re: [PATCH 0/9] qspinlock stuff -v15

2015-03-30 Thread Waiman Long
On 03/25/2015 03:47 PM, Konrad Rzeszutek Wilk wrote: On Mon, Mar 16, 2015 at 02:16:13PM +0100, Peter Zijlstra wrote: Hi Waiman, As promised; here is the paravirt stuff I did during the trip to BOS last week. All the !paravirt patches are more or less the same as before (the only real change

Re: [PATCH 8/9] qspinlock: Generic paravirt support

2015-03-19 Thread Waiman Long
On 03/19/2015 08:25 AM, Peter Zijlstra wrote: On Thu, Mar 19, 2015 at 11:12:42AM +0100, Peter Zijlstra wrote: So I was now thinking of hashing the lock pointer; let me go and quickly put something together. A little something like so; ideally we'd allocate the hashtable since NR_CPUS is kinda

Re: [PATCH 9/9] qspinlock,x86,kvm: Implement KVM support for paravirt qspinlock

2015-03-19 Thread Waiman Long
On 03/19/2015 06:01 AM, Peter Zijlstra wrote: On Wed, Mar 18, 2015 at 10:45:55PM -0400, Waiman Long wrote: On 03/16/2015 09:16 AM, Peter Zijlstra wrote: I do have some concern about this call site patching mechanism as the modification is not atomic. The spin_unlock() calls are in many places

Re: [PATCH 0/9] qspinlock stuff -v15

2015-03-18 Thread Waiman Long
On 03/16/2015 09:16 AM, Peter Zijlstra wrote: Hi Waiman, As promised; here is the paravirt stuff I did during the trip to BOS last week. All the !paravirt patches are more or less the same as before (the only real change is the copyright lines in the first patch). The paravirt stuff is

[PATCH v14 08/11] qspinlock, x86: Rename paravirt_ticketlocks_enabled

2015-01-20 Thread Waiman Long
This patch renames the paravirt_ticketlocks_enabled static key to a more generic paravirt_spinlocks_enabled name. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- arch/x86/include/asm/spinlock.h |4 ++-- arch/x86/kernel/kvm.c

[PATCH v14 07/11] qspinlock: Revert to test-and-set on hypervisors

2015-01-20 Thread Waiman Long
From: Peter Zijlstra pet...@infradead.org When we detect a hypervisor (!paravirt, see qspinlock paravirt support patches), revert to a simple test-and-set lock to avoid the horrors of queue preemption. Signed-off-by: Peter Zijlstra pet...@infradead.org Signed-off-by: Waiman Long waiman.l

[PATCH v14 09/11] pvqspinlock, x86: Add para-virtualization support

2015-01-20 Thread Waiman Long
its cpu number in whichever node is pointed to by the tail part of the lock word. Secondly, pv_link_and_wait_node() will propagate the existing head from the old to the new tail node. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/paravirt.h | 22 ++ arch/x86

[PATCH v14 11/11] pvqspinlock, x86: Enable PV qspinlock for XEN

2015-01-20 Thread Waiman Long
This patch adds the necessary XEN specific code to allow XEN to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/xen/spinlock.c | 149 +-- kernel

[PATCH v14 10/11] pvqspinlock, x86: Enable PV qspinlock for KVM

2015-01-20 Thread Waiman Long
that contended qspinlock produces much less cacheline contention than contended ticket spinlock and the test system is an 8-socket server. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 143 - kernel/Kconfig.locks |2 +- 2 files

[PATCH v14 06/11] qspinlock: Use a simple write to grab the lock

2015-01-20 Thread Waiman Long
-- - ticketlock 2075 10.00 216.35 3.49 qspinlock 3023 10.00 198.20 4.80 Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- kernel

[PATCH v14 05/11] qspinlock: Optimize for smaller NR_CPUS

2015-01-20 Thread Waiman Long
. This optimization is needed to make the qspinlock achieve performance parity with ticket spinlock at light load. All this is horribly broken on Alpha pre EV56 (and any other arch that cannot do single-copy atomic byte stores). Signed-off-by: Peter Zijlstra pet...@infradead.org Signed-off-by: Waiman Long

[PATCH v14 04/11] qspinlock: Extract out code snippets for the next patch

2015-01-20 Thread Waiman Long
the locked bit into a new clear_pending_set_locked() function. This patch also simplifies the trylock operation before queuing by calling queue_spin_trylock() directly. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- include/asm-generic

[PATCH v14 02/11] qspinlock, x86: Enable x86-64 to use queue spinlock

2015-01-20 Thread Waiman Long
optimization which will make the queue spinlock code perform better than the generic implementation. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- arch/x86/Kconfig |1 + arch/x86/include/asm/qspinlock.h | 25

[PATCH v14 03/11] qspinlock: Add pending bit

2015-01-20 Thread Waiman Long
Zijlstra pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com --- include/asm-generic/qspinlock_types.h | 12 +++- kernel/locking/qspinlock.c| 119 +++-- 2 files changed, 107 insertions(+), 24 deletions(-) diff --git a/include/asm-generic

[PATCH v14 01/11] qspinlock: A simple generic 4-byte queue spinlock

2015-01-20 Thread Waiman Long
the lock is acquired, the queue node can be released to be used later. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- include/asm-generic/qspinlock.h | 132 + include/asm-generic/qspinlock_types.h | 58

[PATCH v14 00/11] qspinlock: a 4-byte queue spinlock with PV support

2015-01-20 Thread Waiman Long
): qspinlock: Add pending bit qspinlock: Optimize for smaller NR_CPUS qspinlock: Revert to test-and-set on hypervisors Waiman Long (8): qspinlock: A simple generic 4-byte queue spinlock qspinlock, x86: Enable x86-64 to use queue spinlock qspinlock: Extract out code snippets for the next

Re: [PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support

2014-11-25 Thread Waiman Long
On 10/27/2014 02:02 PM, Konrad Rzeszutek Wilk wrote: On Mon, Oct 27, 2014 at 01:38:20PM -0400, Waiman Long wrote: My concern is that spin_unlock() can be called in many places, including loadable kernel modules. Is the paravirt_patch_ident_32() function able to patch all of them in reasonable

Re: [PATCH v13 09/11] pvqspinlock, x86: Add para-virtualization support

2014-11-03 Thread Waiman Long
On 11/03/2014 05:35 AM, Peter Zijlstra wrote: On Wed, Oct 29, 2014 at 04:19:09PM -0400, Waiman Long wrote: arch/x86/include/asm/pvqspinlock.h| 411 + I do wonder why all this needs to live in x86.. I haven't looked into the para-virtualization code

Re: [PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support

2014-10-29 Thread Waiman Long
On 10/27/2014 05:22 PM, Waiman Long wrote: On 10/27/2014 02:04 PM, Peter Zijlstra wrote: On Mon, Oct 27, 2014 at 01:38:20PM -0400, Waiman Long wrote: On 10/24/2014 04:54 AM, Peter Zijlstra wrote: On Thu, Oct 16, 2014 at 02:10:38PM -0400, Waiman Long wrote: Since enabling paravirt spinlock

[PATCH v13 00/11] qspinlock: a 4-byte queue spinlock with PV support

2014-10-29 Thread Waiman Long
is to make the lock contention problems more tolerable until someone can spend the time and effort to fix them. Peter Zijlstra (3): qspinlock: Add pending bit qspinlock: Optimize for smaller NR_CPUS qspinlock: Revert to test-and-set on hypervisors Waiman Long (8): qspinlock: A simple generic 4

[PATCH v13 10/11] pvqspinlock, x86: Enable PV qspinlock for KVM

2014-10-29 Thread Waiman Long
. In this case, the unfairlock performs worse than both the PV ticketlock and qspinlock. The performance of the 2 PV locks is comparable. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 138 - kernel/Kconfig.locks |2

[PATCH v13 11/11] pvqspinlock, x86: Enable PV qspinlock for XEN

2014-10-29 Thread Waiman Long
This patch adds the necessary XEN specific code to allow XEN to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/xen/spinlock.c | 149 +-- kernel

[PATCH v13 09/11] pvqspinlock, x86: Add para-virtualization support

2014-10-29 Thread Waiman Long
its cpu number in whichever node is pointed to by the tail part of the lock word. Secondly, pv_link_and_wait_node() will propagate the existing head from the old to the new tail node. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/paravirt.h | 19 ++ arch/x86

[PATCH v13 08/11] qspinlock, x86: Rename paravirt_ticketlocks_enabled

2014-10-29 Thread Waiman Long
This patch renames the paravirt_ticketlocks_enabled static key to a more generic paravirt_spinlocks_enabled name. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- arch/x86/include/asm/spinlock.h |4 ++-- arch/x86/kernel/kvm.c

[PATCH v13 04/11] qspinlock: Extract out code snippets for the next patch

2014-10-29 Thread Waiman Long
the locked bit into a new clear_pending_set_locked() function. This patch also simplifies the trylock operation before queuing by calling queue_spin_trylock() directly. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- include/asm-generic

[PATCH v13 07/11] qspinlock: Revert to test-and-set on hypervisors

2014-10-29 Thread Waiman Long
From: Peter Zijlstra pet...@infradead.org When we detect a hypervisor (!paravirt, see qspinlock paravirt support patches), revert to a simple test-and-set lock to avoid the horrors of queue preemption. Signed-off-by: Peter Zijlstra pet...@infradead.org Signed-off-by: Waiman Long waiman.l

[PATCH v13 05/11] qspinlock: Optimize for smaller NR_CPUS

2014-10-29 Thread Waiman Long
. This optimization is needed to make the qspinlock achieve performance parity with ticket spinlock at light load. All this is horribly broken on Alpha pre EV56 (and any other arch that cannot do single-copy atomic byte stores). Signed-off-by: Peter Zijlstra pet...@infradead.org Signed-off-by: Waiman Long

[PATCH v13 06/11] qspinlock: Use a simple write to grab the lock

2014-10-29 Thread Waiman Long
-- - ticketlock 2075 10.00 216.35 3.49 qspinlock 3023 10.00 198.20 4.80 Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- kernel

[PATCH v13 01/11] qspinlock: A simple generic 4-byte queue spinlock

2014-10-29 Thread Waiman Long
the lock is acquired, the queue node can be released to be used later. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- include/asm-generic/qspinlock.h | 118 +++ include/asm-generic/qspinlock_types.h | 58 + kernel

[PATCH v13 03/11] qspinlock: Add pending bit

2014-10-29 Thread Waiman Long
Zijlstra pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com --- include/asm-generic/qspinlock_types.h | 12 +++- kernel/locking/qspinlock.c| 119 +++-- 2 files changed, 107 insertions(+), 24 deletions(-) diff --git a/include/asm-generic

[PATCH v13 02/11] qspinlock, x86: Enable x86-64 to use queue spinlock

2014-10-29 Thread Waiman Long
optimization which will make the queue spinlock code perform better than the generic implementation. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- arch/x86/Kconfig |1 + arch/x86/include/asm/qspinlock.h | 25

Re: [PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support

2014-10-29 Thread Waiman Long
On 10/29/2014 03:05 PM, Waiman Long wrote: On 10/27/2014 05:22 PM, Waiman Long wrote: On 10/27/2014 02:04 PM, Peter Zijlstra wrote: On Mon, Oct 27, 2014 at 01:38:20PM -0400, Waiman Long wrote: On 10/24/2014 04:54 AM, Peter Zijlstra wrote: On Thu, Oct 16, 2014 at 02:10:38PM -0400, Waiman Long

Re: [PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support

2014-10-27 Thread Waiman Long
On 10/24/2014 06:04 PM, Peter Zijlstra wrote: On Fri, Oct 24, 2014 at 04:53:27PM -0400, Waiman Long wrote: The additional register pressure may just cause a few more register moves which should be negligible in the overall performance. The additional icache pressure, however, may have some

Re: [PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support

2014-10-27 Thread Waiman Long
On 10/24/2014 04:54 AM, Peter Zijlstra wrote: On Thu, Oct 16, 2014 at 02:10:38PM -0400, Waiman Long wrote: Since enabling paravirt spinlock will disable unlock function inlining, a jump label can be added to the unlock function without adding patch sites all over the kernel. But you don't

Re: [PATCH v12 00/11] qspinlock: a 4-byte queue spinlock with PV support

2014-10-27 Thread Waiman Long
On 10/24/2014 04:57 AM, Peter Zijlstra wrote: On Thu, Oct 16, 2014 at 02:10:29PM -0400, Waiman Long wrote: v11-v12: - Based on PeterZ's version of the qspinlock patch (https://lkml.org/lkml/2014/6/15/63). - Incorporated many of the review comments from Konrad Wilk and Paolo Bonzini

Re: [PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support

2014-10-27 Thread Waiman Long
On 10/27/2014 01:27 PM, Peter Zijlstra wrote: On Mon, Oct 27, 2014 at 01:15:53PM -0400, Waiman Long wrote: On 10/24/2014 06:04 PM, Peter Zijlstra wrote: On Fri, Oct 24, 2014 at 04:53:27PM -0400, Waiman Long wrote: The additional register pressure may just cause a few more register moves which

Re: [PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support

2014-10-27 Thread Waiman Long
On 10/27/2014 02:02 PM, Konrad Rzeszutek Wilk wrote: On Mon, Oct 27, 2014 at 01:38:20PM -0400, Waiman Long wrote: On 10/24/2014 04:54 AM, Peter Zijlstra wrote: On Thu, Oct 16, 2014 at 02:10:38PM -0400, Waiman Long wrote: Since enabling paravirt spinlock will disable unlock function inlining

Re: [PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support

2014-10-27 Thread Waiman Long
On 10/27/2014 02:04 PM, Peter Zijlstra wrote: On Mon, Oct 27, 2014 at 01:38:20PM -0400, Waiman Long wrote: On 10/24/2014 04:54 AM, Peter Zijlstra wrote: On Thu, Oct 16, 2014 at 02:10:38PM -0400, Waiman Long wrote: Since enabling paravirt spinlock will disable unlock function inlining, a jump

Re: [PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support

2014-10-24 Thread Waiman Long
On 10/24/2014 04:47 AM, Peter Zijlstra wrote: On Thu, Oct 16, 2014 at 02:10:38PM -0400, Waiman Long wrote: +static inline void pv_init_node(struct mcs_spinlock *node) +{ + struct pv_qnode *pn = (struct pv_qnode *)node; + + BUILD_BUG_ON(sizeof(struct pv_qnode) > 5*sizeof(struct

[PATCH v12 11/11] pvqspinlock, x86: Enable PV qspinlock for XEN

2014-10-16 Thread Waiman Long
This patch adds the necessary XEN specific code to allow XEN to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/xen/spinlock.c | 149 +-- kernel

[PATCH v12 10/11] pvqspinlock, x86: Enable PV qspinlock for KVM

2014-10-16 Thread Waiman Long
. In this case, the unfairlock performs worse than both the PV ticketlock and qspinlock. The performance of the 2 PV locks is comparable. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 138 - kernel/Kconfig.locks |2

[PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support

2014-10-16 Thread Waiman Long
-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/paravirt.h | 20 ++ arch/x86/include/asm/paravirt_types.h | 20 ++ arch/x86/include/asm/pvqspinlock.h| 403 + arch/x86/include/asm/qspinlock.h | 44 - arch/x86/kernel/paravirt

[PATCH v12 08/11] qspinlock, x86: Rename paravirt_ticketlocks_enabled

2014-10-16 Thread Waiman Long
This patch renames the paravirt_ticketlocks_enabled static key to a more generic paravirt_spinlocks_enabled name. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- arch/x86/include/asm/spinlock.h |4 ++-- arch/x86/kernel/kvm.c

[PATCH v12 07/11] qspinlock: Revert to test-and-set on hypervisors

2014-10-16 Thread Waiman Long
From: Peter Zijlstra pet...@infradead.org When we detect a hypervisor (!paravirt, see qspinlock paravirt support patches), revert to a simple test-and-set lock to avoid the horrors of queue preemption. Signed-off-by: Peter Zijlstra pet...@infradead.org Signed-off-by: Waiman Long waiman.l
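The test-and-set fallback mentioned in this patch description can be illustrated with a minimal user-space sketch. It uses C11 atomics rather than the kernel's primitives, and the names `tas_lock`, `tas_trylock`, `tas_lock_acquire`, and `tas_unlock` are hypothetical, chosen only for illustration; this is not the code from the patch. Because no waiter holds a queue position, a preempted vCPU cannot stall the waiters queued behind it, which is presumably the "queue preemption" problem the description refers to.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set lock: 0 = free, 1 = held. */
struct tas_lock {
    atomic_int locked;
};

/* One attempt: atomically write 1 and report whether the lock was free. */
static bool tas_trylock(struct tas_lock *l)
{
    return atomic_exchange_explicit(&l->locked, 1,
                                    memory_order_acquire) == 0;
}

/* Spin until the lock is taken. Waiters are unordered (no queue). */
static void tas_lock_acquire(struct tas_lock *l)
{
    while (!tas_trylock(l)) {
        /* Read-only wait so the cache line is not written while held. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
    }
}

/* Release with a plain store; no waiter bookkeeping is needed. */
static void tas_unlock(struct tas_lock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The trade-off is fairness: whichever waiter happens to run when the lock is released wins, so this sketch gives no ordering guarantee among waiters.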

[PATCH v12 06/11] qspinlock: Use a simple write to grab the lock

2014-10-16 Thread Waiman Long
-- - ticketlock 2075 10.00 216.35 3.49 qspinlock 3023 10.00 198.20 4.80 Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- kernel

[PATCH v12 05/11] qspinlock: Optimize for smaller NR_CPUS

2014-10-16 Thread Waiman Long
. This optimization is needed to make the qspinlock achieve performance parity with ticket spinlock at light load. All this is horribly broken on Alpha pre EV56 (and any other arch that cannot do single-copy atomic byte stores). Signed-off-by: Peter Zijlstra pet...@infradead.org Signed-off-by: Waiman Long

[PATCH v12 02/11] qspinlock, x86: Enable x86-64 to use queue spinlock

2014-10-16 Thread Waiman Long
optimization which will make the queue spinlock code perform better than the generic implementation. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- arch/x86/Kconfig |1 + arch/x86/include/asm/qspinlock.h | 25

[PATCH v12 01/11] qspinlock: A simple generic 4-byte queue spinlock

2014-10-16 Thread Waiman Long
the lock is acquired, the queue node can be released to be used later. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- include/asm-generic/qspinlock.h | 118 +++ include/asm-generic/qspinlock_types.h | 58 + kernel

[PATCH v12 04/11] qspinlock: Extract out code snippets for the next patch

2014-10-16 Thread Waiman Long
the locked bit into a new clear_pending_set_locked() function. This patch also simplifies the trylock operation before queuing by calling queue_spin_trylock() directly. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- include/asm-generic

[PATCH v12 03/11] qspinlock: Add pending bit

2014-10-16 Thread Waiman Long
Zijlstra pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com --- include/asm-generic/qspinlock_types.h | 12 +++- kernel/locking/qspinlock.c| 119 +++-- 2 files changed, 107 insertions(+), 24 deletions(-) diff --git a/include/asm-generic

[PATCH v12 00/11] qspinlock: a 4-byte queue spinlock with PV support

2014-10-16 Thread Waiman Long
qspinlock: Revert to test-and-set on hypervisors Waiman Long (8): qspinlock: A simple generic 4-byte queue spinlock qspinlock, x86: Enable x86-64 to use queue spinlock qspinlock: Extract out code snippets for the next patch qspinlock: Use a simple write to grab the lock qspinlock, x86: Rename

Re: [PATCH 10/11] qspinlock: Paravirt support

2014-06-18 Thread Waiman Long
On 06/18/2014 08:03 AM, Paolo Bonzini wrote: Il 17/06/2014 00:08, Waiman Long ha scritto: +void __pv_queue_unlock(struct qspinlock *lock) +{ +int val = atomic_read(&lock->val); + +native_queue_unlock(lock); + +if (val & _Q_LOCKED_SLOW) +___pv_kick_head(lock); +} + Again

Re: [PATCH 04/11] qspinlock: Extract out the exchange of tail code word

2014-06-18 Thread Waiman Long
On 06/18/2014 09:50 AM, Konrad Rzeszutek Wilk wrote: On Wed, Jun 18, 2014 at 01:37:45PM +0200, Paolo Bonzini wrote: Il 17/06/2014 22:55, Konrad Rzeszutek Wilk ha scritto: On Sun, Jun 15, 2014 at 02:47:01PM +0200, Peter Zijlstra wrote: From: Waiman Long waiman.l...@hp.com This patch extracts

Re: [PATCH 03/11] qspinlock: Add pending bit

2014-06-17 Thread Waiman Long
On 06/17/2014 04:36 PM, Konrad Rzeszutek Wilk wrote: On Sun, Jun 15, 2014 at 02:47:00PM +0200, Peter Zijlstra wrote: Because the qspinlock needs to touch a second cacheline; add a pending bit and allow a single in-word spinner before we punt to the second cacheline. Could you add this in the
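The pending-bit idea quoted above (one waiter spins on the lock word itself before anyone has to touch a second cacheline) can be sketched in user-space C11 as follows. This is a simplified two-bit model, not the kernel code: the bit layout and the names `Q_LOCKED`, `Q_PENDING`, `pending_claim`, `pending_take`, and `pending_unlock` are assumptions made for illustration, and the queueing path is left out entirely (callers that get `CLAIM_MUST_QUEUE` would fall back to it).

```c
#include <stdatomic.h>

#define Q_LOCKED  0x1u  /* assumed bit: lock is held          */
#define Q_PENDING 0x2u  /* assumed bit: one in-word spinner   */

enum claim { CLAIM_ACQUIRED, CLAIM_PENDING, CLAIM_MUST_QUEUE };

/* Try the fast path, then try to become the single pending waiter. */
static enum claim pending_claim(atomic_uint *lock)
{
    unsigned int val = 0;

    /* Uncontended: 0 -> LOCKED. */
    if (atomic_compare_exchange_strong(lock, &val, Q_LOCKED))
        return CLAIM_ACQUIRED;

    /* Anything other than "held, nobody pending" means go queue. */
    if (val != Q_LOCKED)
        return CLAIM_MUST_QUEUE;

    /* Held and no pending waiter yet: claim the pending bit. */
    if (atomic_compare_exchange_strong(lock, &val, Q_LOCKED | Q_PENDING))
        return CLAIM_PENDING;

    return CLAIM_MUST_QUEUE;
}

/* Pending waiter: spin in-word until the owner leaves, then take over. */
static void pending_take(atomic_uint *lock)
{
    while (atomic_load(lock) & Q_LOCKED)
        ;
    /* Only the pending waiter may write while PENDING is set. */
    atomic_store(lock, Q_LOCKED);
}

/* Owner: clear only the locked bit so a pending waiter is preserved. */
static void pending_unlock(atomic_uint *lock)
{
    atomic_fetch_and(lock, ~Q_LOCKED);
}
```

In this model a second contender never dereferences a per-CPU queue node; only a third and later contender would pay for the extra cacheline.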

Re: [PATCH 03/11] qspinlock: Add pending bit

2014-06-17 Thread Waiman Long
On 06/17/2014 05:10 PM, Konrad Rzeszutek Wilk wrote: On Tue, Jun 17, 2014 at 05:07:29PM -0400, Konrad Rzeszutek Wilk wrote: On Tue, Jun 17, 2014 at 04:51:57PM -0400, Waiman Long wrote: On 06/17/2014 04:36 PM, Konrad Rzeszutek Wilk wrote: On Sun, Jun 15, 2014 at 02:47:00PM +0200, Peter
