From: Peter Zijlstra pet...@infradead.org
When we detect a hypervisor (!paravirt, see qspinlock paravirt support
patches), revert to a simple test-and-set lock to avoid the horrors
of queue preemption.
Signed-off-by: Peter Zijlstra pet...@infradead.org
Signed-off-by: Waiman Long waiman.l
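The fallback itself is small. A minimal sketch of the idea, assuming x86's X86_FEATURE_HYPERVISOR check; the helper name and exact shape are illustrative, not the patch body:

    /*
     * Sketch: on a hypervisor, bypass the queueing slowpath and fall
     * back to a test-and-set lock, so a preempted queue head cannot
     * stall every later waiter behind it.
     */
    static inline bool virt_spin_lock(struct qspinlock *lock)
    {
    	if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
    		return false;	/* bare metal: use the real qspinlock */

    	/* Simple test-and-set: no queue, hence no queue preemption. */
    	while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0)
    		cpu_relax();

    	return true;
    }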
This patch renames the paravirt_ticketlocks_enabled static key to a
more generic paravirt_spinlocks_enabled name.
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
arch/x86/include/asm/spinlock.h |4 ++--
arch/x86/kernel/kvm.c
On 06/18/2014 08:03 AM, Paolo Bonzini wrote:
On 17/06/2014 00:08, Waiman Long wrote:
+void __pv_queue_unlock(struct qspinlock *lock)
+{
+	int val = atomic_read(&lock->val);
+
+	native_queue_unlock(lock);
+
+	if (val & _Q_LOCKED_SLOW)
+		___pv_kick_head(lock);
+}
+
Again
On 06/18/2014 09:50 AM, Konrad Rzeszutek Wilk wrote:
On Wed, Jun 18, 2014 at 01:37:45PM +0200, Paolo Bonzini wrote:
On 17/06/2014 22:55, Konrad Rzeszutek Wilk wrote:
On Sun, Jun 15, 2014 at 02:47:01PM +0200, Peter Zijlstra wrote:
From: Waiman Long waiman.l...@hp.com
This patch extracts
On 06/17/2014 04:36 PM, Konrad Rzeszutek Wilk wrote:
On Sun, Jun 15, 2014 at 02:47:00PM +0200, Peter Zijlstra wrote:
Because the qspinlock needs to touch a second cacheline; add a pending
bit and allow a single in-word spinner before we punt to the second
cacheline.
Could you add this in the
On 06/17/2014 05:10 PM, Konrad Rzeszutek Wilk wrote:
On Tue, Jun 17, 2014 at 05:07:29PM -0400, Konrad Rzeszutek Wilk wrote:
On Tue, Jun 17, 2014 at 04:51:57PM -0400, Waiman Long wrote:
On 06/17/2014 04:36 PM, Konrad Rzeszutek Wilk wrote:
On Sun, Jun 15, 2014 at 02:47:00PM +0200, Peter
On 06/15/2014 08:47 AM, Peter Zijlstra wrote:
When we detect a hypervisor (!paravirt, see later patches), revert to
a simple test-and-set lock to avoid the horrors of queue preemption.
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
arch/x86/include/asm/qspinlock.h | 14 ++
On 06/15/2014 08:47 AM, Peter Zijlstra wrote:
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+
+/*
+ * Write a comment about how all this works...
+ */
+
+#define _Q_LOCKED_SLOW (2U << _Q_LOCKED_OFFSET)
+
+struct pv_node {
+ struct mcs_spinlock mcs;
+ struct mcs_spinlock __offset[3];
+
I am resending it as my original reply has some HTML code hence
rejected by the mailing lists.
On 06/15/2014 08:47 AM, Peter Zijlstra wrote:
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+
+/*
+ * Write a comment about how all this works...
+ */
+
+#define _Q_LOCKED_SLOW (2U << _Q_LOCKED_OFFSET)
+
On 06/12/2014 04:17 AM, Peter Zijlstra wrote:
On Fri, May 30, 2014 at 11:44:00AM -0400, Waiman Long wrote:
@@ -19,13 +19,46 @@ extern struct static_key virt_unfairlocks_enabled;
* that clearing the lock bit is done ASAP without artificial delay
* due to compiler optimization
On 06/12/2014 02:00 AM, Peter Zijlstra wrote:
On Wed, Jun 11, 2014 at 05:22:28PM -0400, Long, Wai Man wrote:
@@ -233,11 +233,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock, u32
val)
*/
for (;;) {
/*
-* If we observe any contention;
On 06/12/2014 01:50 AM, Peter Zijlstra wrote:
On Wed, Jun 11, 2014 at 09:37:55PM -0400, Long, Wai Man wrote:
On 6/11/2014 6:54 AM, Peter Zijlstra wrote:
On Fri, May 30, 2014 at 11:43:55AM -0400, Waiman Long wrote:
Enabling this configuration feature causes a slight decrease in the
performance
the lock is acquired, the queue node can be released to
be used later.
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
include/asm-generic/qspinlock.h | 118
include/asm-generic/qspinlock_types.h | 61
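For orientation, the 32-bit lock word that these patches build on is laid out roughly as follows (a sketch inferred from the _Q_* constants quoted elsewhere in this thread; field sizes vary with NR_CPUS):

    /*
     * Sketch of the qspinlock value layout (NR_CPUS < 16K case):
     *
     *  bits  0- 7: locked byte
     *  bit      8: pending
     *  bits  9-15: unused
     *  bits 16-17: tail index (which of the 4 per-cpu MCS nodes)
     *  bits 18-31: tail cpu (+1, so 0 means no queue tail)
     */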
qspinlock: Add pending bit
qspinlock: Optimize for smaller NR_CPUS
Waiman Long (14):
qspinlock: A simple generic 4-byte queue spinlock
qspinlock, x86: Enable x86-64 to use queue spinlock
qspinlock: Extract out the exchange of tail code word
qspinlock: prolong the stay in the pending bit path
optimization which will make the queue spinlock code perform
better than the generic implementation.
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
arch/x86/Kconfig |1 +
arch/x86/include/asm/qspinlock.h | 29
From: Peter Zijlstra pet...@infradead.org
Because the qspinlock needs to touch a second cacheline; add a pending
bit and allow a single in-word spinner before we punt to the second
cacheline.
Signed-off-by: Peter Zijlstra pet...@infradead.org
Signed-off-by: Waiman Long waiman.l...@hp.com
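A simplified sketch of what the pending bit buys (constant and helper names follow the patches quoted in this thread; error handling and memory-ordering details omitted):

    static void pending_bit_sketch(struct qspinlock *lock)
    {
    	u32 val = atomic_read(&lock->val);

    	/* Lock held, nobody pending: become the one in-word spinner. */
    	if (val == _Q_LOCKED_VAL &&
    	    atomic_cmpxchg(&lock->val, val, val | _Q_PENDING_VAL) == val) {
    		/* Spin on the lock word only; the MCS node cacheline
    		 * is never touched. */
    		while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
    			cpu_relax();
    		/* Owner is gone: clear pending, set locked in one step. */
    		clear_pending_set_locked(lock, val);
    		return;
    	}
    	/* More than one contender: punt to the MCS queue (not shown). */
    }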
This patch extracts the logic for the exchange of new and previous tail
code words into a new xchg_tail() function which can be optimized in a
later patch.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
include/asm-generic/qspinlock_types.h |2 +
kernel/locking/qspinlock.c
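The extracted helper is essentially a cmpxchg loop that swaps only the tail field of the lock word; the point of pulling it out is that a later patch can replace the loop with a plain 16-bit xchg() once the tail occupies its own halfword. A hedged sketch:

    static u32 xchg_tail(struct qspinlock *lock, u32 tail)
    {
    	u32 old, new, val = atomic_read(&lock->val);

    	for (;;) {
    		/* Keep the locked/pending bits, publish our tail. */
    		new = (val & _Q_LOCKED_PENDING_MASK) | tail;
    		old = atomic_cmpxchg(&lock->val, val, new);
    		if (old == val)
    			break;
    		val = old;
    	}
    	return old;	/* caller extracts the previous tail from this */
    }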
%
It can be seen that the queue spinlock performance for 2 contending
tasks is now comparable to ticket spinlock on the same node, but much
faster when in different nodes. With 3 contending tasks, however,
the ticket spinlock is still quite a bit faster.
Signed-off-by: Waiman Long waiman.l
this is horribly broken on Alpha pre EV56 (and any other arch that
cannot do single-copy atomic byte stores).
Signed-off-by: Peter Zijlstra pet...@infradead.org
Signed-off-by: Waiman Long waiman.l...@hp.com
---
include/asm-generic/qspinlock_types.h | 13
kernel/locking/qspinlock.c
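The breakage being pointed out: with the NR_CPUS < 16K layout the locked flag occupies its own byte, so unlock can be a single byte store, roughly as below. A CPU without single-copy atomic byte stores (pre-EV56 Alpha) implements that store as a non-atomic read-modify-write of the containing word and can wipe out concurrent updates to the adjacent pending/tail bits:

    static inline void queue_spin_unlock(struct qspinlock *lock)
    {
    	barrier();
    	/* Clear only the locked byte; pending/tail stay intact
    	 * ONLY IF the CPU has single-copy atomic byte stores. */
    	ACCESS_ONCE(*(u8 *)lock) = 0;
    }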
If unfair lock is supported, the lock acquisition loop at the end of
the queue_spin_lock_slowpath() function may need to detect the fact that
the lock can be stolen. Code is added for the stolen lock detection.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 26
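What "stolen lock detection" amounts to at the end of the slowpath: with lock stealing enabled, the queue head cannot assume an unlocked word stays unlocked, so the final acquisition must be a cmpxchg loop rather than an unconditional store. A sketch (the function name here is made up for illustration):

    static void queue_head_acquire_sketch(struct qspinlock *lock)
    {
    	for (;;) {
    		u32 val = atomic_read(&lock->val);

    		if (!(val & _Q_LOCKED_MASK) &&
    		    atomic_cmpxchg(&lock->val, val,
    				   val | _Q_LOCKED_VAL) == val)
    			break;		/* acquired, not stolen this time */
    		cpu_relax();		/* held or just stolen: retry */
    	}
    }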
--      -----    -----    -----
1         135      135      137
2        4603     1034     1458
3       10940    12087     2562
4       21555    10507     4793
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/Kconfig | 11
. This avoids the slowdown of the pending bit and trylock
code path at the expense of a little bit of additional overhead to
the MCS queuing code path.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 162 ---
1 files changed
              JPM    Real Time   Sys Time   Usr Time
              ---    ---------   --------   --------
ticketlock   2075      10.00      216.35      3.49
qspinlock    3023      10.00      198.20      4.80
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 62
This patch renames the paravirt_ticketlocks_enabled static key to a
more generic paravirt_spinlocks_enabled name.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/include/asm/spinlock.h |4 ++--
arch/x86/kernel/kvm.c | 2 +-
arch/x86/kernel/paravirt
no difference in performance.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/include/asm/pvqspinlock.h | 359
arch/x86/include/asm/qspinlock.h | 33
kernel/locking/qspinlock.c | 72 +++-
3 files changed, 458 insertions(+), 6
in the pending
bit code path back to the regular queuing code path so that it can
be properly halted by the PV qspinlock code.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 47 ---
1 files changed, 43 insertions(+), 4 deletions
statistical data for debugfs
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/include/asm/paravirt.h | 18 +-
arch/x86/include/asm/paravirt_types.h | 17 +
arch/x86/kernel/paravirt-spinlocks.c |6 ++
3 files changed, 40 insertions(+), 1
PV qspinlock            402   10.00   91.55   0.00
unfair qspinlock        570   10.00   62.98   0.00
unfair + PV qspinlock   586   10.00   59.68   0.00
Signed-off-by: Waiman Long waiman.l...@hp.com
Tested-by: Raghavendra K T raghavendra...@linux.vnet.ibm.com
This patch adds the necessary XEN specific code to allow XEN to
support the CPU halting and kicking operations needed by the queue
spinlock PV code.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/xen/spinlock.c | 147 +--
kernel
On 05/14/2014 03:13 PM, Radim Krčmář wrote:
2014-05-14 19:00+0200, Peter Zijlstra:
On Wed, May 14, 2014 at 06:51:24PM +0200, Radim Krčmář wrote:
Ok.
I've seen merit in pvqspinlock even with slightly slower first-waiter,
so I would have happily sacrificed those horrible branches.
(I prefer
On 05/08/2014 03:12 PM, Peter Zijlstra wrote:
On Wed, May 07, 2014 at 11:01:38AM -0400, Waiman Long wrote:
No, we want the unfair thing for VIRT, not PARAVIRT.
Yes, you are right. I will change that to VIRT.
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index
On 05/12/2014 11:22 AM, Radim Krčmář wrote:
2014-05-07 11:01-0400, Waiman Long:
From: Peter Zijlstra pet...@infradead.org
Because the qspinlock needs to touch a second cacheline; add a pending
bit and allow a single in-word spinner before we punt to the second
cacheline.
I think
On 05/08/2014 02:57 PM, Peter Zijlstra wrote:
On Wed, May 07, 2014 at 11:01:31AM -0400, Waiman Long wrote:
+/**
+ * trylock_pending - try to acquire queue spinlock using the pending bit
+ * @lock : Pointer to queue spinlock structure
+ * @pval : Pointer to value of the queue spinlock 32-bit
On 05/08/2014 02:58 PM, Peter Zijlstra wrote:
On Wed, May 07, 2014 at 11:01:34AM -0400, Waiman Long wrote:
@@ -221,11 +222,37 @@ static inline int trylock_pending(struct qspinlock *lock,
u32 *pval)
*/
for (;;) {
/*
-* If we observe any
On 05/08/2014 03:00 PM, Peter Zijlstra wrote:
On Wed, May 07, 2014 at 11:01:35AM -0400, Waiman Long wrote:
@@ -94,23 +94,29 @@ static inline struct mcs_spinlock *decode_tail(u32 tail)
* can allow better optimization of the lock acquisition for the pending
* bit holder
On 05/08/2014 03:02 PM, Peter Zijlstra wrote:
On Wed, May 07, 2014 at 11:01:35AM -0400, Waiman Long wrote:
/**
+ * get_qlock - Set the lock bit and own the lock
+ * @lock: Pointer to queue spinlock structure
+ *
+ * This routine should only be called when the caller is the only one
On 05/08/2014 03:04 PM, Peter Zijlstra wrote:
On Wed, May 07, 2014 at 11:01:36AM -0400, Waiman Long wrote:
/*
+ * To have additional features for better virtualization support, it is
+ * necessary to store additional data in the queue node structure. So
+ * a new queue node structure
On 05/08/2014 03:06 PM, Peter Zijlstra wrote:
On Wed, May 07, 2014 at 11:01:37AM -0400, Waiman Long wrote:
If unfair lock is supported, the lock acquisition loop at the end of
the queue_spin_lock_slowpath() function may need to detect the fact that
the lock can be stolen. Code is added
On 05/07/2014 03:07 PM, Konrad Rzeszutek Wilk wrote:
Raghavendra KT had done some performance testing on this patch with
the following results:
Overall we are seeing good improvement for pv-unfair version.
System: 32 cpu sandybridge with HT on (4 node with 32 GB each)
Guest : 8GB with 16
On 05/07/2014 03:07 PM, Konrad Rzeszutek Wilk wrote:
On Wed, May 07, 2014 at 11:01:28AM -0400, Waiman Long wrote:
v9->v10:
- Make some minor changes to qspinlock.c to accommodate review feedback.
- Change author to PeterZ for 2 of the patches.
- Include Raghavendra KT's test results
On 04/27/2014 02:09 PM, Raghavendra K T wrote:
For kvm part feel free to add:
Tested-by: Raghavendra K T raghavendra...@linux.vnet.ibm.com
V9 testing has shown no hangs.
I was able to do some performance testing. here are the results:
Overall we are seeing good improvement for pv-unfair
of the lock or finer granularity ones. The
main purpose is to make the lock contention problems more tolerable
until someone can spend the time and effort to fix them.
Peter Zijlstra (2):
qspinlock: Add pending bit
qspinlock: Optimize for smaller NR_CPUS
Waiman Long (17):
qspinlock: A simple generic
the lock is acquired, the queue node can be released to
be used later.
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
include/asm-generic/qspinlock.h | 118
include/asm-generic/qspinlock_types.h | 61
optimization which will make the queue spinlock code perform
better than the generic implementation.
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
arch/x86/Kconfig |1 +
arch/x86/include/asm/qspinlock.h | 29
This patch extracts the logic for the exchange of new and previous tail
code words into a new xchg_tail() function which can be optimized in a
later patch.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
include/asm-generic/qspinlock_types.h |2 +
kernel/locking/qspinlock.c
. For large critical sections,
however, there may not be much benefit.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/Kconfig | 11 +
arch/x86/include/asm/qspinlock.h | 79 ++
arch/x86/kernel/Makefile |1 +
arch/x86
              JPM    Real Time   Sys Time   Usr Time
              ---    ---------   --------   --------
ticketlock   2075      10.00      216.35      3.49
qspinlock    3023      10.00      198.20      4.80
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 61
this is horribly broken on Alpha pre EV56 (and any other arch that
cannot do single-copy atomic byte stores).
Signed-off-by: Peter Zijlstra pet...@infradead.org
Signed-off-by: Waiman Long waiman.l...@hp.com
---
include/asm-generic/qspinlock_types.h | 13
kernel/locking/qspinlock.c
.
It is also necessary to expand arch_mcs_spin_lock_contended() to the
underlying while loop as additional code will need to be inserted
into the loop.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 36 +++-
1 files changed, 23
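For context, the macro being open-coded is, in kernel/locking/mcs_spinlock.h, essentially a spin on the node's locked flag; expanding it in place is what lets the PV code insert a halt inside the loop. A sketch of the before/after:

    /* Before: the opaque macro. */
    arch_mcs_spin_lock_contended(&node->locked);

    /* After (sketch): the same wait, open-coded so PV hooks can be
     * inserted in the loop body, e.g. halting the vCPU after a
     * bounded amount of spinning. */
    while (!smp_load_acquire(&node->locked)) {
    	/* PV halt hook would go here. */
    	arch_mutex_cpu_relax();
    }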
If unfair lock is supported, the lock acquisition loop at the end of
the queue_spin_lock_slowpath() function may need to detect the fact that
the lock can be stolen. Code is added for the stolen lock detection.
A new qhead macro is also defined as a shorthand for mcs.locked.
Signed-off-by: Waiman
into a slowerpath
function. This avoids the slowdown of the pending bit and trylock
code path at the expense of a little bit of additional overhead to
the MCS queuing code path.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 120 +--
1
the cacheline contention problem on the lock
word while trying to maintain as much of a FIFO order as possible.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 147 +++-
1 files changed, 146 insertions(+), 1 deletions
10507 1869 4307
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 160 ++--
1 files changed, 154 insertions(+), 6 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index
This patch renames the paravirt_ticketlocks_enabled static key to a
more generic paravirt_spinlocks_enabled name.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/include/asm/spinlock.h |4 ++--
arch/x86/kernel/kvm.c | 2 +-
arch/x86/kernel/paravirt
in the pending
bit code path back to the regular queuing code path so that it can
be properly halted by the PV qspinlock code.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 74 ++--
1 files changed, 64 insertions(+), 10 deletions
no difference in performance. When coupled
with unfair lock, the queue spinlock can be much faster than the PV
ticket lock.
When both the unfair lock and PV spinlock features are turned on,
lock stealing will still be allowed in the fastpath, but not in
the slowpath.
Signed-off-by: Waiman Long waiman.l
%)
1.5x 3991.9622 (4%)
2.0x 2527.0613 (2.5%)
Signed-off-by: Waiman Long waiman.l...@hp.com
Tested-by: Raghavendra K T raghavendra...@linux.vnet.ibm.com
---
arch/x86/kernel/kvm.c | 135 +
kernel/Kconfig.locks |2 +-
2 files changed, 136 insertions
This patch adds the necessary XEN specific code to allow XEN to
support the CPU halting and kicking operations needed by the queue
spinlock PV code.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/xen/spinlock.c | 147 +--
kernel
On 04/18/2014 05:40 PM, Waiman Long wrote:
On 04/18/2014 03:05 PM, Peter Zijlstra wrote:
On Fri, Apr 18, 2014 at 01:52:50PM -0400, Waiman Long wrote:
I am confused by your notation.
Nah, I think I was confused :-) Make the 1 _Q_LOCKED_VAL though, as
that's the proper constant to use
On 04/23/2014 10:56 AM, Konrad Rzeszutek Wilk wrote:
On Wed, Apr 23, 2014 at 10:23:43AM -0400, Waiman Long wrote:
On 04/18/2014 05:40 PM, Waiman Long wrote:
On 04/18/2014 03:05 PM, Peter Zijlstra wrote:
On Fri, Apr 18, 2014 at 01:52:50PM -0400, Waiman Long wrote:
I am confused by your
On 04/23/2014 01:55 PM, Konrad Rzeszutek Wilk wrote:
On Wed, Apr 23, 2014 at 01:43:58PM -0400, Waiman Long wrote:
On 04/23/2014 10:56 AM, Konrad Rzeszutek Wilk wrote:
On Wed, Apr 23, 2014 at 10:23:43AM -0400, Waiman Long wrote:
On 04/18/2014 05:40 PM, Waiman Long wrote:
On 04/18/2014 03:05
On 04/23/2014 06:24 PM, Waiman Long wrote:
On 04/23/2014 01:55 PM, Konrad Rzeszutek Wilk wrote:
On Wed, Apr 23, 2014 at 01:43:58PM -0400, Waiman Long wrote:
On 04/23/2014 10:56 AM, Konrad Rzeszutek Wilk wrote:
On Wed, Apr 23, 2014 at 10:23:43AM -0400, Waiman Long wrote:
On 04/18/2014 05:40
On 04/18/2014 03:42 AM, Ingo Molnar wrote:
* Waiman Long waiman.l...@hp.com wrote:
Because the qspinlock needs to touch a second cacheline; add a pending
bit and allow a single in-word spinner before we punt to the second
cacheline.
Signed-off-by: Peter Zijlstra pet...@infradead.org
On 04/18/2014 04:13 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 05:20:31PM -0400, Waiman Long wrote:
+ while ((val = atomic_read(&lock->val)) & _Q_LOCKED_MASK)
+ arch_mutex_cpu_relax();
That was a cpu_relax().
Yes, but arch_mutex_cpu_relax() is the same as cpu_relax
On 04/18/2014 04:15 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 05:28:17PM -0400, Waiman Long wrote:
On 04/17/2014 11:49 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:56AM -0400, Waiman Long wrote:
@@ -192,36 +220,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock, u32
On 04/18/2014 04:27 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 05:46:27PM -0400, Waiman Long wrote:
On 04/17/2014 11:56 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+struct __qspinlock {
+ union {
+ atomic_t val
On 04/18/2014 04:33 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 09:46:04PM -0400, Waiman Long wrote:
BTW, I didn't test out your atomic_test_and_set() change. Did it provide a
noticeable performance benefit when compared with cmpxchg()?
I've not tested that I think. I had a hard time
On 04/18/2014 12:35 PM, Konrad Rzeszutek Wilk wrote:
On Fri, Apr 18, 2014 at 12:23:29PM -0400, Waiman Long wrote:
On 04/18/2014 03:42 AM, Ingo Molnar wrote:
* Waiman Long waiman.l...@hp.com wrote:
Because the qspinlock needs to touch a second cacheline; add a pending
bit and allow a single
On 04/18/2014 01:53 PM, Peter Zijlstra wrote:
On Fri, Apr 18, 2014 at 01:32:47PM -0400, Waiman Long wrote:
On 04/18/2014 04:15 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 05:28:17PM -0400, Waiman Long wrote:
On 04/17/2014 11:49 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03
On 04/18/2014 03:05 PM, Peter Zijlstra wrote:
On Fri, Apr 18, 2014 at 01:52:50PM -0400, Waiman Long wrote:
I am confused by your notation.
Nah, I think I was confused :-) Make the 1 _Q_LOCKED_VAL though, as
that's the proper constant to use.
Everyone gets confused once in a while :-) I have
the lock is acquired, the queue node can be released to
be used later.
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
include/asm-generic/qspinlock.h | 118
include/asm-generic/qspinlock_types.h | 61
spinlock
contention problems. Those need to be solved by refactoring the code
to make more efficient use of the lock or finer granularity ones. The
main purpose is to make the lock contention problems more tolerable
until someone can spend the time and effort to fix them.
Waiman Long (19):
qspinlock
Because the qspinlock needs to touch a second cacheline; add a pending
bit and allow a single in-word spinner before we punt to the second
cacheline.
Signed-off-by: Peter Zijlstra pet...@infradead.org
Signed-off-by: Waiman Long waiman.l...@hp.com
---
include/asm-generic/qspinlock_types.h | 12
optimization which will make the queue spinlock code perform
better than the generic implementation.
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
arch/x86/Kconfig |1 +
arch/x86/include/asm/qspinlock.h | 29
This patch extracts the logic for the exchange of new and previous tail
code words into a new xchg_tail() function which can be optimized in a
later patch.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
include/asm-generic/qspinlock_types.h |2 +
kernel/locking/qspinlock.c
(and any other arch that
cannot do single-copy atomic byte stores).
Signed-off-by: Peter Zijlstra pet...@infradead.org
Signed-off-by: Waiman Long waiman.l...@hp.com
---
include/asm-generic/qspinlock_types.h | 13
kernel/locking/qspinlock.c | 111 ++---
2
comparable to ticket spinlock on the same node, but much
faster when in different nodes. With 3 contending tasks, however,
the ticket spinlock is still quite a bit faster.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 32 ++--
1 files
              JPM    Real Time   Sys Time   Usr Time
              ---    ---------   --------   --------
ticketlock   2075      10.00      216.35      3.49
qspinlock    3023      10.00      198.20      4.80
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 61
If unfair lock is supported, the lock acquisition loop at the end of
the queue_spin_lock_slowpath() function may need to detect the fact that
the lock can be stolen. Code is added for the stolen lock detection.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 26
. For large critical sections,
however, there may not be much benefit.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/Kconfig | 11 +
arch/x86/include/asm/qspinlock.h | 79 ++
arch/x86/kernel/Makefile |1 +
arch/x86
.
It is also necessary to expand arch_mcs_spin_lock_contended() to the
underlying while loop as additional code will need to be inserted
into the loop.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 36 +++-
1 files changed, 23
10507 1869 4307
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 160 ++--
1 files changed, 154 insertions(+), 6 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index
the cacheline contention problem on the lock
word while trying to maintain as much of a FIFO order as possible.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 147 +++-
1 files changed, 146 insertions(+), 1 deletions
into a slowerpath
function. This avoids the slowdown of the pending bit and trylock
code path at the expense of a little bit of additional overhead to
the MCS queuing code path.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 111
statistical data for debugfs
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/include/asm/paravirt.h | 18 +-
arch/x86/include/asm/paravirt_types.h | 17 +
arch/x86/kernel/paravirt-spinlocks.c |6 ++
3 files changed, 40 insertions(+), 1
This patch renames the paravirt_ticketlocks_enabled static key to a
more generic paravirt_spinlocks_enabled name.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/include/asm/spinlock.h |4 ++--
arch/x86/kernel/kvm.c | 2 +-
arch/x86/kernel/paravirt
in the pending
bit code path back to the regular queuing code path so that it can
be properly halted by the PV qspinlock code.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
kernel/locking/qspinlock.c | 74 ++--
1 files changed, 64 insertions(+), 10 deletions
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/kernel/kvm.c | 135 +
kernel/Kconfig.locks |2 +-
2 files changed, 136 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 7ab8ab3..eef427b 100644
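The two primitives this patch has to supply follow the existing PV ticketlock pattern: kick a target vCPU with the KVM_HC_KICK_CPU hypercall, and halt the waiter after re-checking its wakeup condition with interrupts off so a kick cannot be lost. A sketch, with kvm_halt_cpu()'s signature simplified for illustration:

    static void kvm_kick_cpu(int cpu)
    {
    	unsigned long flags = 0;
    	u32 apicid = per_cpu(x86_cpu_to_apicid, cpu);

    	kvm_hypercall2(KVM_HC_KICK_CPU, flags, apicid);
    }

    static void kvm_halt_cpu(u8 *byte, u8 expected)
    {
    	unsigned long flags;

    	local_irq_save(flags);
    	/* Re-check under closed interrupts to avoid a lost wakeup. */
    	if (ACCESS_ONCE(*byte) == expected)
    		safe_halt();	/* sleep until the kicking IPI arrives */
    	local_irq_restore(flags);
    }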
This patch adds the necessary XEN specific code to allow XEN to
support the CPU halting and kicking operations needed by the queue
spinlock PV code.
Signed-off-by: Waiman Long waiman.l...@hp.com
---
arch/x86/xen/spinlock.c | 146 +--
kernel
On 04/17/2014 11:42 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:55AM -0400, Waiman Long wrote:
+/**
+ * trylock_pending - try to acquire queue spinlock using the pending bit
+ * @lock : Pointer to queue spinlock structure
+ * @pval : Pointer to value of the queue spinlock 32-bit
On 04/17/2014 11:49 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:56AM -0400, Waiman Long wrote:
@@ -192,36 +220,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock, u32
val)
node->next = NULL;
/*
+* We touched a (possibly) cold cacheline; attempt
On 04/17/2014 11:56 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+struct __qspinlock {
+ union {
+ atomic_t val;
+ struct {
+#ifdef __LITTLE_ENDIAN
+ u16 locked_pending
On 04/17/2014 11:58 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+static __always_inline void
+clear_pending_set_locked(struct qspinlock *lock, u32 val)
+{
+ struct __qspinlock *l = (void *)lock;
+
+ ACCESS_ONCE(l->locked_pending) = 1
On 04/17/2014 12:36 PM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:58AM -0400, Waiman Long wrote:
There is a problem in the current trylock_pending() function. When the
lock is free, but the pending bit holder hasn't grabbed the lock &
cleared the pending bit yet, the trylock_pending
On 04/17/2014 01:23 PM, Konrad Rzeszutek Wilk wrote:
On Thu, Apr 17, 2014 at 11:03:52AM -0400, Waiman Long wrote:
v8->v9:
- Integrate PeterZ's version of the queue spinlock patch with some
modification:
http://lkml.kernel.org/r/20140310154236.038181...@infradead.org
- Break
On 04/17/2014 01:40 PM, Raghavendra K T wrote:
On 04/17/2014 10:53 PM, Konrad Rzeszutek Wilk wrote:
On Thu, Apr 17, 2014 at 11:03:52AM -0400, Waiman Long wrote:
v8->v9:
- Integrate PeterZ's version of the queue spinlock patch with some
modification:
http://lkml.kernel.org/r
On 04/07/2014 01:51 PM, Raghavendra K T wrote:
On 04/07/2014 10:08 PM, Waiman Long wrote:
On 04/07/2014 02:14 AM, Raghavendra K T wrote:
[...]
But I am seeing hangs in overcommit cases. Gdb showed that many vcpus
are halted and there was no progress. Suspecting the problem/race with
halting
On 04/07/2014 02:14 AM, Raghavendra K T wrote:
I tested the v7,v8 of qspinlock with unfair config on kvm guest.
I was curious about unfair locks performance in undercommit cases.
(overcommit case is expected to perform well)
But I am seeing hangs in overcommit cases. Gdb showed that many
On 04/07/2014 10:09 AM, Peter Zijlstra wrote:
On Fri, Apr 04, 2014 at 01:08:16PM -0400, Waiman Long wrote:
Peter's patch is a rewrite of my patches 1-4, there is no PV or unfair lock
support in there.
Yes, because your patches were unreadable and entirely non obvious.
And while I appreciate
On 04/04/2014 09:00 AM, Peter Zijlstra wrote:
So I'm just not ever going to pick up this patch; I spend a week trying
to reverse engineer this; I posted a 7 patch series creating the
equivalent, but in a gradual and readable fashion: