On Wed, 2008-01-23 at 21:53 +0900, Ryo Tsuruta wrote:
Hi everyone,
I'm happy to announce that I've implemented a Block I/O bandwidth controller.
The controller is designed to be of use in a cgroup or virtual machine
environment. The current approach is that the controller is implemented as
On Wed, 2008-07-16 at 15:51 +0900, Hidetoshi Seto wrote:
If stop_machine() is invoked while one of the onlined cpus is locked up
for some reason, stop_machine cannot finish its work because the
locked cpu cannot stop. This means all other healthy cpus
will be blocked indefinitely by one dead cpu.
On Fri, 2008-07-25 at 10:55 -0700, Jeremy Fitzhardinge wrote:
I'm thinking about ways to improve the Xen balloon driver. This is the
driver which allows the guest domain to expand or contract by either
asking for more memory from the hypervisor, or giving unneeded memory
back. From the
On Thu, 2008-11-06 at 11:01 -0500, Vivek Goyal wrote:
Does this still require I use dm, or does it also work on regular block
devices? Patch 4/4 isn't quite clear on this.
No. You don't have to use dm. It will simply work on regular devices. We
shall have to put a few lines of code for it
On Thu, 2008-11-06 at 10:30 -0500, [EMAIL PROTECTED] wrote:
Hi,
If you are not already tired of so many io controller implementations, here
is another one.
This is a very early, very crude implementation to get early feedback to see
if this approach makes any sense or not.
This
On Thu, 2008-11-06 at 11:39 -0500, Vivek Goyal wrote:
On Thu, Nov 06, 2008 at 05:16:13PM +0100, Peter Zijlstra wrote:
On Thu, 2008-11-06 at 11:01 -0500, Vivek Goyal wrote:
Does this still require I use dm, or does it also work on regular block
devices? Patch 4/4 isn't quite clear
On Thu, 2008-11-06 at 11:57 -0500, Rik van Riel wrote:
Peter Zijlstra wrote:
The only real issue I can see is with linear volumes, but those are
stupid anyway - none of the gains but all the risks.
Linear volumes may well be the most common ones.
People start out with the filesystems
On Fri, 2008-11-07 at 11:41 +1100, Dave Chinner wrote:
On Thu, Nov 06, 2008 at 06:11:27PM +0100, Peter Zijlstra wrote:
On Thu, 2008-11-06 at 11:57 -0500, Rik van Riel wrote:
Peter Zijlstra wrote:
The only real issue I can see is with linear volumes, but those are
stupid anyway
On Fri, 2008-11-14 at 13:58 +0900, Satoshi UCHIDA wrote:
I think Satoshi's cfq controller patches also do not seem to be considering
A, B, C, D and E to be at the same level; instead they treat cgroup /, D and E
at the same level and try to do proportional BW division among these.
Satoshi,
These patches never seem to have made it onto LKML?!
On Mon, 2007-08-20 at 15:13 +0200, Laurent Vivier wrote:
The aim of these four patches is to introduce Virtual Machine time accounting.
_Ingo_, as these patches modify files of the scheduler, could you have a look
at them, please?
On Tue, 2009-08-04 at 19:29 +0200, Martin Schwidefsky wrote:
So it's going to split user time into user and guest. Does that really
make sense? For the host kernel it really is just another user process,
no?
The code (at least in parts) is already upstream. Look at the
account_guest_time
On Mon, 2010-11-15 at 12:03 -0800, H. Peter Anvin wrote:
On 11/15/2010 12:00 PM, Jeremy Fitzhardinge wrote:
Another approach I discussed with PeterZ and Mathieu is to steal the LSB
of the ticket counters (halving the max CPU count) to use as a there is
someone in slowpath waiting on this
On Mon, 2010-11-15 at 13:02 -0800, Jeremy Fitzhardinge wrote:
As a heuristic, it shouldn't be too bad performance-wise, since
(handwaving) if ticketholder N has entered the slowpath, then it's likely
that N+1 will as well.
Yes, esp. if the whole slow unlock path takes more cycles than you
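The idea in the surrounding excerpts - stealing the LSB of the ticket counters as a "someone is waiting in the slowpath" flag - can be sketched in userspace C. This is a hypothetical, simplified model (names and layout are mine, not the kernel's): tickets advance in steps of 2, leaving bit 0 of the head counter free for the flag, which the unlocker checks to decide whether a blocked waiter needs kicking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* hypothetical sketch: tickets advance by 2, freeing bit 0 of the
 * head counter to flag "a waiter entered the slowpath" */
#define TICKET_SLOWPATH_FLAG 1u
#define TICKET_INC           2u

struct ticketlock { _Atomic uint32_t head, tail; };

static void ticket_lock(struct ticketlock *l)
{
        uint32_t my = atomic_fetch_add(&l->tail, TICKET_INC);

        while ((atomic_load(&l->head) & ~TICKET_SLOWPATH_FLAG) != my)
                ;       /* a PV guest would block via hypercall here,
                         * setting TICKET_SLOWPATH_FLAG first */
}

static void ticket_unlock(struct ticketlock *l)
{
        uint32_t h = atomic_load(&l->head);

        if (h & TICKET_SLOWPATH_FLAG)
                /* slowpath: clear the flag, advance the ticket, and
                 * (in a real implementation) kick the blocked waiter */
                atomic_store(&l->head,
                             (h & ~TICKET_SLOWPATH_FLAG) + TICKET_INC);
        else
                atomic_store(&l->head, h + TICKET_INC);
}
```

The cost of the scheme is halving the maximum CPU count, exactly as the excerpt notes, since each ticket now consumes two counter values.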
On Tue, 2010-11-16 at 13:08 -0800, Jeremy Fitzhardinge wrote:
Maintain a flag in both LSBs of the ticket lock which indicates whether
anyone is in the lock slowpath and may need kicking when the current
holder unlocks. The flags are set when the first locker enters
the slowpath, and cleared
when low on memory.
Signed-off-by: Peter Zijlstra a.p.zijls...@chello.nl
---
arch/alpha/kernel/smp.c |1 +
arch/arm/kernel/smp.c |1 +
arch/blackfin/mach-common/smp.c |3 ++-
arch/cris/arch-v32/kernel/smp.c | 13 -
arch/ia64/kernel/irq_ia64.c |2
On Mon, 2011-01-17 at 11:26 +, Russell King - ARM Linux wrote:
On Mon, Jan 17, 2011 at 12:07:13PM +0100, Peter Zijlstra wrote:
diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c
index 42aa078..c4a570b 100644
--- a/arch/alpha/kernel/smp.c
+++ b/arch/alpha/kernel/smp.c
On Mon, 2011-01-17 at 12:31 +0100, Peter Zijlstra wrote:
On Mon, 2011-01-17 at 11:26 +, Russell King - ARM Linux wrote:
Maybe remove the comment "everything is done on the interrupt return path"
as with this function call, that is no longer the case.
(Removed am33, m32r-ka, m32r, arm
On Mon, 2011-01-17 at 14:49 -0500, Mike Frysinger wrote:
On Mon, Jan 17, 2011 at 06:07, Peter Zijlstra wrote:
Also, while reading through all this, I noticed the blackfin SMP code
looks to be broken, it simply discards any IPI when low on memory.
not really. see changelog of commit
On Tue, 2011-01-18 at 07:31 +1100, Benjamin Herrenschmidt wrote:
Beware of false positive, I've used fake reschedule IPIs in the past
for other things (like kicking a CPU out of sleep state for unrelated
reasons). Nothing that I know that is upstream today but some of that
might come back.
On Wed, 2011-01-19 at 22:42 +0530, Srivatsa Vaddagiri wrote:
Add two hypercalls to KVM hypervisor to support pv-ticketlocks.
KVM_HC_WAIT_FOR_KICK blocks the calling vcpu until another vcpu kicks it or it
is woken up because of an event like interrupt.
KVM_HC_KICK_CPU allows the calling
On Wed, 2011-01-19 at 22:53 +0530, Srivatsa Vaddagiri wrote:
On Wed, Jan 19, 2011 at 10:42:39PM +0530, Srivatsa Vaddagiri wrote:
Add two hypercalls to KVM hypervisor to support pv-ticketlocks.
KVM_HC_WAIT_FOR_KICK blocks the calling vcpu until another vcpu kicks it or
it
is woken up
On Thu, 2011-01-20 at 17:29 +0530, Srivatsa Vaddagiri wrote:
If we had a yield-to [1] sort of interface _and_ information on which vcpu
owns a lock, then lock-spinners can yield-to the owning vcpu,
and then I'd nak it for being stupid ;-)
really, yield*() is retarded, never even consider
On Mon, 2011-02-07 at 10:26 +1100, Benjamin Herrenschmidt wrote:
You missed:
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 9813605..467d122 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -98,6 +98,7 @@ void smp_message_recv(int msg)
On Wed, 2011-02-09 at 17:14 +1100, Benjamin Herrenschmidt wrote:
On Mon, 2011-02-07 at 14:54 +0100, Peter Zijlstra wrote:
On Mon, 2011-02-07 at 10:26 +1100, Benjamin Herrenschmidt wrote:
You missed:
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 9813605
On Tue, 2012-08-21 at 16:52 +0300, Michael S. Tsirkin wrote:
+	rcu_read_lock();
+	mapping = rcu_dereference(page->mapping);
+	if (mapping_balloon(mapping))
+		ret = true;
+	rcu_read_unlock();
This looks suspicious: you
On Tue, 2012-08-21 at 09:47 -0300, Rafael Aquini wrote:
+	mapping = rcu_access_pointer(page->mapping);
+	if (mapping)
+		mapping = mapping->assoc_mapping;
The comment near rcu_access_pointer() explicitly says:
* Return the value of the specified RCU-protected pointer,
On Tue, 2012-08-21 at 17:40 +0300, Michael S. Tsirkin wrote:
+	spin_lock(&vb->pages_lock);
+	page = list_first_or_null_rcu(&vb->pages, struct page, lru);
Why is list_first_or_null_rcu called outside
RCU critical section here?
It looks like vb->pages_lock is the
On Mon, 2012-09-17 at 13:38 -0300, Rafael Aquini wrote:
+static inline void assign_balloon_mapping(struct page *page,
+					  struct address_space *mapping)
+{
+	page->mapping = mapping;
+	smp_wmb();
+}
+
+static inline void
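For context on the helper quoted above: the plain store followed by smp_wmb() orders the mapping assignment before whatever later store makes the page reachable to lockless readers. A loose userspace analogue of that publication ordering (hypothetical, minimal types - not the real virtio_balloon structures) can be written with a C11 release store on the publishing side:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* hypothetical, minimal types -- not the real virtio_balloon structures */
struct address_space { unsigned long flags; };
struct page          { struct address_space *mapping; };

static struct page balloon_page;
static _Atomic(struct page *) published; /* stand-in for a list insertion */

/* kernel version: page->mapping = mapping; smp_wmb();
 * the barrier orders the mapping store before the later store that makes
 * the page reachable; in C11 the publishing store carries release order */
static void assign_and_publish(struct page *page,
                               struct address_space *mapping)
{
        page->mapping = mapping;
        atomic_store_explicit(&published, page, memory_order_release);
}

static struct address_space *reader(void)
{
        struct page *p = atomic_load_explicit(&published,
                                              memory_order_acquire);
        return p ? p->mapping : NULL;
}
```

The acquire load on the reader side pairs with the release store, mirroring the smp_wmb()/smp_rmb() pairing the kernel code relies on.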
On Wed, Jul 10, 2013 at 01:33:25PM +0300, Gleb Natapov wrote:
Here's an idea, trim the damn email ;-) -- not only directed at gleb.
Ingo, Gleb,
From the results perspective, Andrew Theurer's and Vinod's test results
are pro-pvspinlock.
Could you please help me to know what will make it a
On Tue, Jul 16, 2013 at 09:02:15AM +0300, Gleb Natapov wrote:
BTW can NMI handler take spinlocks?
No -- that is, yes you can, using trylock, but you still shouldn't.
If it can what happens if NMI is
delivered in a section protected by local_irq_save()/local_irq_restore()?
You deadlock.
You don't happen to have a proper state diagram for this thing do you?
I suppose I'm going to have to make one; this is all getting a bit
unwieldy, and those xchg() + fixup things are hard to read.
On Wed, Feb 26, 2014 at 10:14:23AM -0500, Waiman Long wrote:
+static inline int
On Wed, Feb 26, 2014 at 10:14:21AM -0500, Waiman Long wrote:
+struct qnode {
+	u32		wait;	/* Waiting flag */
+	struct qnode	*next;	/* Next queue node addr */
+};
+
+struct qnode_set {
+	struct qnode	nodes[MAX_QNODES];
+	int
On Wed, Feb 26, 2014 at 10:14:21AM -0500, Waiman Long wrote:
+static void put_qnode(void)
+{
+	struct qnode_set *qset = this_cpu_ptr(&qnset);
+
+	qset->node_idx--;
+}
That very much wants to be: this_cpu_dec().
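To make the shape of this concrete, here is a hypothetical userspace rendering of the per-CPU node pool (thread-local storage standing in for per-CPU data; names mirror the patch): get_qnode() hands out nodes by bumping an index, and put_qnode() is exactly the decrement that Peter says wants to be this_cpu_dec().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QNODES 4 /* one per nesting level: task/softirq/hardirq/nmi */

struct qnode {
        int wait;            /* waiting flag */
        struct qnode *next;  /* next queue node */
};

struct qnode_set {
        struct qnode nodes[MAX_QNODES];
        int node_idx;        /* index of the next free node */
};

/* per-CPU in the kernel; per-thread in this sketch */
static _Thread_local struct qnode_set qnset;

static struct qnode *get_qnode(void)
{
        struct qnode_set *qset = &qnset;

        return &qset->nodes[qset->node_idx++];
}

static void put_qnode(void)
{
        /* in-kernel this whole body reduces to this_cpu_dec() */
        qnset.node_idx--;
}
```

Nesting works because an interrupt that takes a queued lock on the same CPU gets the next index up, and releases it before the interrupted context runs put_qnode().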
___
Virtualization mailing
Is this the same 8 patches you sent yesterday?
On Thu, Feb 27, 2014 at 03:42:19PM -0500, Waiman Long wrote:
+	old = xchg(&qlock->lock_wait, _QSPINLOCK_WAITING|_QSPINLOCK_LOCKED);
+
+	if (old == 0) {
+		/*
+		 * Got the lock, can clear the waiting bit now
+		 */
+		smp_u8_store_release(&qlock->wait,
On Fri, Feb 28, 2014 at 08:25:24AM -0800, Linus Torvalds wrote:
On Feb 28, 2014 1:30 AM, Peter Zijlstra pet...@infradead.org wrote:
At low contention the cmpxchg won't have to be retried (much) so using
it won't be a problem and you get to have arbitrary atomic ops.
Peter, the difference
After modifying it to do a deterministic cmpxchg, the test run time of 2
contending tasks jumps up from 600ms (best case) to about 1700ms which was
worse than the original qspinlock's 1300-1500ms. It is the opportunistic
nature of the xchg() code that can potentially combine multiple steps
Hi,
Here are some numbers for my version -- also attached is the test code.
I found that booting big machines is tediously slow so I lifted the
whole lot to userspace.
I measure the cycles spent in arch_spin_lock() + arch_spin_unlock().
The machines used are a 4 node (2 socket) AMD Interlagos,
Updated version, this includes numbers for my SNB desktop and Waiman's
variant.
Curiously Waiman's version seems consistently slower on 2 cross node
CPUs. Whereas my version seems to have a problem on SNB with 2 CPUs.
There's something weird with the ticket lock numbers; when I compile
the code
On Tue, Mar 04, 2014 at 05:58:00PM +0100, Peter Zijlstra wrote:
2: 17141.324050	2: 620.185930	2: 618.737681
So I forgot that AMD has compute units that share L2:
root@interlagos:~/spinlocks# export LOCK=./ticket ; ($LOCK 0 1 ; $LOCK 0 2) |
awk '/^total/ { print $2
On Tue, Mar 04, 2014 at 12:48:26PM -0500, Waiman Long wrote:
Peter,
I was trying to implement the generic queue code exchange code using
cmpxchg as suggested by you. However, when I gathered the performance
data, the code performed worse than I expected at a higher contention
level. Below
On Tue, Mar 04, 2014 at 11:40:43PM +0100, Peter Zijlstra wrote:
On Tue, Mar 04, 2014 at 12:48:26PM -0500, Waiman Long wrote:
Peter,
I was trying to implement the generic queue code exchange code using
cmpxchg as suggested by you. However, when I gathered the performance
data, the code
On Wed, Mar 12, 2014 at 03:08:24PM -0400, Waiman Long wrote:
On 03/12/2014 02:54 PM, Waiman Long wrote:
+/*
+ * Set the lock bit & clear the waiting bit simultaneously
+ * It is assumed that there is no lock stealing with this
+ * quick path
On Wed, Mar 12, 2014 at 02:54:52PM -0400, Waiman Long wrote:
+static inline void arch_spin_lock(struct qspinlock *lock)
+{
+	if (static_key_false(&paravirt_unfairlocks_enabled))
+		queue_spin_lock_unfair(lock);
+	else
+		queue_spin_lock(lock);
+}
So I would
On Wed, Mar 12, 2014 at 02:54:57PM -0400, Waiman Long wrote:
A KVM guest of 20 CPU cores was created to run the disk workload of
the AIM7 benchmark on both ext4 and xfs RAM disks at 3000 users on a
3.14-rc6 based kernel. The JPM (jobs/minute) data of the test run were:
You really should just
On Thu, Mar 13, 2014 at 04:05:19PM -0400, Waiman Long wrote:
On 03/13/2014 11:15 AM, Peter Zijlstra wrote:
On Wed, Mar 12, 2014 at 02:54:52PM -0400, Waiman Long wrote:
+static inline void arch_spin_lock(struct qspinlock *lock)
+{
+	if (static_key_false(&paravirt_unfairlocks_enabled
On Mon, Mar 17, 2014 at 01:44:34PM -0400, Waiman Long wrote:
The PV ticketlock code was designed to handle lock holder preemption by
redirecting CPU resources in a preempted guest to another guest that can
better use it and then return the preempted CPU back sooner.
But that's the PV code, not
So I'm just not ever going to pick up this patch; I spent a week trying
to reverse engineer this; I posted a 7 patch series creating the
equivalent, but in a gradual and readable fashion:
http://lkml.kernel.org/r/20140310154236.038181...@infradead.org
You keep on ignoring that; I'll keep on
On Fri, Apr 04, 2014 at 01:08:16PM -0400, Waiman Long wrote:
Peter's patch is a rewrite of my patches 1-4, there is no PV or unfair lock
support in there.
Yes, because your patches were unreadable and entirely non obvious.
And while I appreciate that it's not entirely your fault; the subject is
On Fri, Apr 04, 2014 at 10:59:09AM -0400, Waiman Long wrote:
I am really sorry if you have bad feeling about it. I do not mean to
discredit you on your effort to make the qspinlock patch better. I really
appreciate your input and would like to work with you on this patch as well
as other
On Thu, Apr 17, 2014 at 11:03:55AM -0400, Waiman Long wrote:
+/**
+ * trylock_pending - try to acquire queue spinlock using the pending bit
+ * @lock : Pointer to queue spinlock structure
+ * @pval : Pointer to value of the queue spinlock 32-bit word
+ * Return: 1 if lock acquired, 0
On Thu, Apr 17, 2014 at 11:03:56AM -0400, Waiman Long wrote:
@@ -192,36 +220,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock,
u32 val)
	node->next = NULL;
/*
+ * We touched a (possibly) cold cacheline; attempt the trylock once
+ * more in the hope someone
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+#if !defined(__LITTLE_ENDIAN) && !defined(__BIG_ENDIAN)
+#error Missing either LITTLE_ENDIAN or BIG_ENDIAN definition.
+#endif
This seems entirely superfluous, I don't think a kernel build will go
anywhere if either is missing.
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
@@ -48,6 +53,9 @@
* We can further change the first spinner to spin on a bit in the lock word
* instead of its node; whereby avoiding the need to carry a node from lock to
* unlock, and preserving API.
+ *
+ * N.B. The
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+struct __qspinlock {
+	union {
+		atomic_t val;
+		struct {
+#ifdef __LITTLE_ENDIAN
+			u16	locked_pending;
+			u16	tail;
+#else
+			u16
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+static __always_inline void
+clear_pending_set_locked(struct qspinlock *lock, u32 val)
+{
+	struct __qspinlock *l = (void *)lock;
+
+	ACCESS_ONCE(l->locked_pending) = 1;
+}
@@ -157,8 +251,13 @@ static inline int
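The trick the excerpt relies on is worth spelling out: overlaying the 32-bit lock word with two halfwords lets one plain 16-bit store clear the pending byte and set the locked byte at once, without a cmpxchg on the full word. A little-endian userspace sketch (bit layout assumed from the series: locked in byte 0, pending in byte 1, tail in the top 16 bits; the kernel version additionally takes the current lock value):

```c
#include <assert.h>
#include <stdint.h>

#define Q_LOCKED_VAL  1u          /* bit 0 */
#define Q_PENDING_VAL (1u << 8)   /* bit 8 */

struct qspinlock { uint32_t val; };

/* halfword overlay of the lock word; layout assumes little-endian */
struct __qspinlock {
        union {
                uint32_t val;
                struct {
                        uint16_t locked_pending; /* locked+pending bytes */
                        uint16_t tail;           /* queue tail code */
                };
        };
};

/* one plain halfword store clears pending and sets locked together,
 * leaving a concurrently-written tail untouched */
static void clear_pending_set_locked(struct qspinlock *lock)
{
        struct __qspinlock *l = (void *)lock;

        l->locked_pending = Q_LOCKED_VAL;
}
```

On big-endian the two halfwords swap places, which is exactly why the struct in the patch is guarded by the __LITTLE_ENDIAN/__BIG_ENDIAN ifdefs.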
On Thu, Apr 17, 2014 at 11:03:58AM -0400, Waiman Long wrote:
There is a problem in the current trylock_pending() function. When the
lock is free, but the pending bit holder hasn't grabbed the lock &
cleared the pending bit yet, the trylock_pending() function will fail.
I remember seeing some
On Thu, Apr 17, 2014 at 11:03:59AM -0400, Waiman Long wrote:
kernel/locking/qspinlock.c | 61 +++
1 files changed, 44 insertions(+), 17 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 497da24..80fe9ee 100644
On Thu, Apr 17, 2014 at 05:20:31PM -0400, Waiman Long wrote:
+	while ((val = atomic_read(&lock->val)) & _Q_LOCKED_MASK)
+		arch_mutex_cpu_relax();
That was a cpu_relax().
Yes, but arch_mutex_cpu_relax() is the same as cpu_relax() for x86.
Yeah, so why bother typing more?
Let the
On Thu, Apr 17, 2014 at 05:28:17PM -0400, Waiman Long wrote:
On 04/17/2014 11:49 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:56AM -0400, Waiman Long wrote:
@@ -192,36 +220,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock,
u32 val)
node->next = NULL
On Thu, Apr 17, 2014 at 05:46:27PM -0400, Waiman Long wrote:
On 04/17/2014 11:56 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:57AM -0400, Waiman Long wrote:
+struct __qspinlock {
+	union {
+		atomic_t val;
+		char	bytes[4];
+		struct {
+#ifdef
On Thu, Apr 17, 2014 at 09:46:04PM -0400, Waiman Long wrote:
BTW, I didn't test out your atomic_test_and_set() change. Did it provide a
noticeable performance benefit when compared with cmpxchg()?
I've not tested that I think. I had a hard time showing that cmpxchg
loops were slower, but once I
On Fri, Apr 18, 2014 at 01:32:47PM -0400, Waiman Long wrote:
On 04/18/2014 04:15 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 05:28:17PM -0400, Waiman Long wrote:
On 04/17/2014 11:49 AM, Peter Zijlstra wrote:
On Thu, Apr 17, 2014 at 11:03:56AM -0400, Waiman Long wrote:
@@ -192,36 +220,25
On Fri, Apr 18, 2014 at 01:52:50PM -0400, Waiman Long wrote:
I am confused by your notation.
Nah, I think I was confused :-) Make the 1 _Q_LOCKED_VAL though, as
that's the proper constant to use.
On Wed, May 07, 2014 at 11:01:31AM -0400, Waiman Long wrote:
+/**
+ * trylock_pending - try to acquire queue spinlock using the pending bit
+ * @lock : Pointer to queue spinlock structure
+ * @pval : Pointer to value of the queue spinlock 32-bit word
+ * Return: 1 if lock acquired, 0
On Wed, May 07, 2014 at 11:01:34AM -0400, Waiman Long wrote:
@@ -221,11 +222,37 @@ static inline int trylock_pending(struct qspinlock
*lock, u32 *pval)
*/
for (;;) {
/*
- * If we observe any contention; queue.
+ * If we observe that the
On Wed, May 07, 2014 at 11:01:35AM -0400, Waiman Long wrote:
@@ -94,23 +94,29 @@ static inline struct mcs_spinlock *decode_tail(u32 tail)
* can allow better optimization of the lock acquisition for the pending
* bit holder.
*/
-#if _Q_PENDING_BITS == 8
-
struct __qspinlock {
On Wed, May 07, 2014 at 11:01:35AM -0400, Waiman Long wrote:
/**
+ * get_qlock - Set the lock bit and own the lock
+ * @lock: Pointer to queue spinlock structure
+ *
+ * This routine should only be called when the caller is the only one
+ * entitled to acquire the lock.
+ */
+static
On Wed, May 07, 2014 at 11:01:36AM -0400, Waiman Long wrote:
/*
+ * To have additional features for better virtualization support, it is
+ * necessary to store additional data in the queue node structure. So
+ * a new queue node structure will have to be defined and used here.
+ */
+struct
On Wed, May 07, 2014 at 11:01:37AM -0400, Waiman Long wrote:
If unfair lock is supported, the lock acquisition loop at the end of
the queue_spin_lock_slowpath() function may need to detect the fact that
the lock can be stolen. Code is added for the stolen lock detection.
A new qhead macro is
On Wed, May 07, 2014 at 11:01:38AM -0400, Waiman Long wrote:
No, we want the unfair thing for VIRT, not PARAVIRT.
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 9e7659e..10e87e1 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -227,6
On Wed, May 07, 2014 at 11:01:40AM -0400, Waiman Long wrote:
+#define DEF_LOOP_CNT(c) int c = 0
+#define INC_LOOP_CNT(c) (c)++
+#define LOOP_CNT(c) c
+#define LSTEAL_MIN	(1 << 3)
+#define LSTEAL_MAX	(1 << 10)
+#define LSTEAL_MIN_MASK
On Fri, May 09, 2014 at 08:58:47PM -0400, Waiman Long wrote:
On 05/08/2014 02:58 PM, Peter Zijlstra wrote:
On Wed, May 07, 2014 at 11:01:34AM -0400, Waiman Long wrote:
@@ -221,11 +222,37 @@ static inline int trylock_pending(struct qspinlock
*lock, u32 *pval
On Fri, May 09, 2014 at 09:19:32PM -0400, Waiman Long wrote:
On 05/08/2014 03:06 PM, Peter Zijlstra wrote:
On Wed, May 07, 2014 at 11:01:37AM -0400, Waiman Long wrote:
If unfair lock is supported, the lock acquisition loop at the end of
the queue_spin_lock_slowpath() function may need
On Fri, May 09, 2014 at 09:08:56PM -0400, Waiman Long wrote:
On 05/08/2014 03:04 PM, Peter Zijlstra wrote:
On Wed, May 07, 2014 at 11:01:36AM -0400, Waiman Long wrote:
/*
+ * To have additional features for better virtualization support, it is
+ * necessary to store additional data
On Sat, May 10, 2014 at 04:14:17PM +0200, Peter Zijlstra wrote:
On Fri, May 09, 2014 at 09:08:56PM -0400, Waiman Long wrote:
On 05/08/2014 03:04 PM, Peter Zijlstra wrote:
On Wed, May 07, 2014 at 11:01:36AM -0400, Waiman Long wrote:
/*
+ * To have additional features for better
On Mon, May 12, 2014 at 05:22:08PM +0200, Radim Krčmář wrote:
2014-05-07 11:01-0400, Waiman Long:
From: Peter Zijlstra pet...@infradead.org
Because the qspinlock needs to touch a second cacheline; add a pending
bit and allow a single in-word spinner before we punt to the second
On Wed, May 14, 2014 at 06:51:24PM +0200, Radim Krčmář wrote:
Ok.
I've seen merit in pvqspinlock even with slightly slower first-waiter,
so I would have happily sacrificed those horrible branches.
(I prefer elegant to optimized code, but I can see why we want to be
strictly better than
On Wed, May 28, 2014 at 05:46:39PM +0530, Raghavendra K T wrote:
In virtualized environment there are mainly three problems
related to spinlocks that affect performance.
1. LHP (lock holder preemption)
2. Lock Waiter Preemption (LWP)
3. Starvation/fairness
Though ticketlocks solve the
On Fri, May 30, 2014 at 11:43:52AM -0400, Waiman Long wrote:
---
kernel/locking/qspinlock.c | 18 --
1 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index fc7fd8c..7f10758 100644
---
On Fri, May 30, 2014 at 11:43:55AM -0400, Waiman Long wrote:
Enabling this configuration feature causes a slight decrease in the
performance of an uncontended lock-unlock operation by about 1-2%,
mainly due to the use of a static key. However, uncontended lock-unlock
operations are really just a
On Wed, Jun 11, 2014 at 12:54:02PM +0200, Peter Zijlstra wrote:
@@ -252,6 +260,18 @@ void queue_spin_lock_slowpath(struct qspinlock *lock,
u32 val)
BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));
+#ifdef CONFIG_VIRT_UNFAIR_LOCKS
+ /*
+* A simple test and set
On Wed, Jun 11, 2014 at 09:37:55PM -0400, Long, Wai Man wrote:
On 6/11/2014 6:54 AM, Peter Zijlstra wrote:
On Fri, May 30, 2014 at 11:43:55AM -0400, Waiman Long wrote:
Enabling this configuration feature causes a slight decrease in the
performance of an uncontended lock-unlock operation
On Wed, Jun 11, 2014 at 05:22:28PM -0400, Long, Wai Man wrote:
@@ -233,11 +233,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock,
u32 val)
*/
for (;;) {
/*
-* If we observe any contention; queue.
+* If we observe that the queue is not
On Fri, May 30, 2014 at 11:44:00AM -0400, Waiman Long wrote:
@@ -19,13 +19,46 @@ extern struct static_key virt_unfairlocks_enabled;
* that the clearing of the lock bit is done ASAP without artificial delay
* due to compiler optimization.
*/
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+static
From: Waiman Long waiman.l...@hp.com
This patch extracts the logic for the exchange of new and previous tail
code words into a new xchg_tail() function which can be optimized in a
later patch.
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
Since Waiman seems incapable of doing simple things; here's my take on the
paravirt crap.
The first few patches are taken from Waiman's latest series, but the virt
support is completely new. Its primary aim is to not mess up the native code.
I've not stress tested it, but the virt and paravirt
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
kernel/locking/qspinlock.c | 59 -
1 file changed, 43 insertions(+), 16 deletions(-)
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -93,24 +93,33 @@ static inline struct mcs_spinlock
needed when waiting for the
lock. Once the lock is acquired, the queue node can be released to
be used later.
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
include/asm-generic/qspinlock.h | 118
include/asm-generic
From: Waiman Long waiman.l...@hp.com
This patch renames the paravirt_ticketlocks_enabled static key to a
more generic paravirt_spinlocks_enabled name.
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
arch/x86/include/asm/spinlock.h |4
When we detect a hypervisor (!paravirt, see later patches), revert to
a simple test-and-set lock to avoid the horrors of queue preemption.
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
arch/x86/include/asm/qspinlock.h | 14 ++
include/asm-generic/qspinlock.h |7
From: Peter Zijlstra pet...@infradead.org
When we allow for a max NR_CPUS < 2^14 we can optimize the pending
wait-acquire and the xchg_tail() operations.
By growing the pending bit to a byte, we reduce the tail to 16bit.
This means we can use xchg16 for the tail part and do away with all
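The arithmetic behind the NR_CPUS < 2^14 limit: the tail has to fit a CPU number plus a nesting index in 16 bits so it can be exchanged with a single xchg16. A sketch of the encoding (field sizes assumed from the description; the +1 on the CPU lets tail == 0 unambiguously mean "no queue"):

```c
#include <assert.h>
#include <stdint.h>

#define Q_TAIL_IDX_BITS 2    /* 4 nesting levels: task/softirq/hardirq/nmi */
#define Q_TAIL_IDX_MASK ((1u << Q_TAIL_IDX_BITS) - 1)
#define Q_TAIL_CPU_BITS 14   /* hence the NR_CPUS < 2^14 requirement */

/* cpu is stored +1 so that tail == 0 means "no queue", even when
 * CPU 0 is queued at nesting level 0 */
static uint16_t encode_tail(unsigned int cpu, unsigned int idx)
{
        return (uint16_t)(((cpu + 1) << Q_TAIL_IDX_BITS) | idx);
}

static void decode_tail(uint16_t tail, unsigned int *cpu, unsigned int *idx)
{
        *cpu = (tail >> Q_TAIL_IDX_BITS) - 1;
        *idx = tail & Q_TAIL_IDX_MASK;
}
```

With 2 index bits and 14 CPU bits the encoding exactly fills the 16-bit tail halfword, which is what makes the xchg16 on it possible.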
Because the qspinlock needs to touch a second cacheline; add a pending
bit and allow a single in-word spinner before we punt to the second
cacheline.
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
include/asm-generic/qspinlock_types.h | 12 ++-
kernel/locking/qspinlock.c
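A simplified model of the pending-bit fast path described above (userspace C, hypothetical constants; the real code manipulates only the locked_pending halfword so a concurrently queued tail survives the final store, which this single-threaded sketch glosses over):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define Q_LOCKED  1u         /* byte 0: lock holder */
#define Q_PENDING (1u << 8)  /* byte 1: one in-word spinner */
                             /* bits 16-31: tail of the MCS queue */

typedef _Atomic uint32_t qspinlock_t;

/* the first contender sets the pending bit and spins on the lock word
 * itself; only a third CPU has to touch the second cacheline (MCS node) */
static int pending_trylock(qspinlock_t *lock)
{
        uint32_t val = atomic_load(lock);

        if (val & ~Q_LOCKED)    /* pending or tail already set: go queue */
                return 0;
        if (atomic_fetch_or(lock, Q_PENDING) & Q_PENDING)
                return 0;       /* raced with another pending claimant */
        while (atomic_load(lock) & Q_LOCKED)
                ;               /* spin in-word for the holder to unlock */
        /* simplified: take the lock and drop pending in one store */
        atomic_store(lock, Q_LOCKED);
        return 1;
}
```

The point of the bit is exactly what the changelog says: a single contender never allocates or touches an MCS node, so the common lightly-contended case stays within the first cacheline.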
file includes some x86
specific optimization which will make the queue spinlock code perform
better than the generic implementation.
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
arch/x86/Kconfig |1 +
arch/x86/include
, in this case the pending bit is guaranteed to
be released 'soon', therefore wait for it and avoid queueing.
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
kernel/locking/qspinlock.c | 10 ++
1 file changed, 10 insertions(+)
Index: linux-2.6/kernel/locking/qspinlock.c
the head is done in two parts, firstly the pv_wait_head will
store its cpu number in whichever node is pointed to by the tail part
of the lock word. Secondly, pv_link_and_wait_node() will propagate the
existing head from the old to the new tail node.
Signed-off-by: Peter Zijlstra pet
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
arch/x86/kernel/kvm.c | 58 ++
kernel/Kconfig.locks |2 -
2 files changed, 59 insertions(+), 1 deletion(-)
Index: linux-2.6/arch/x86/kernel/kvm.c
On Thu, Jun 12, 2014 at 04:54:52PM -0400, Waiman Long wrote:
If two tasks see the pending bit go away and try to grab it with cmpxchg,
there is no way we can avoid the contention. However, if somehow the
pending bit holder gets the lock and another task sets the pending bit before
the current
On Thu, Jun 12, 2014 at 05:08:28PM -0400, Waiman Long wrote:
Native performance is king, try your very utmost bestest to preserve
that, paravirt is a distant second and nobody sane should care about the
virt case at all.
The patch won't affect native performance unless the kernel is built
On Thu, Jun 12, 2014 at 04:48:41PM -0400, Waiman Long wrote:
I don't have a good understanding of the kernel alternatives mechanism.
I didn't either; I do now; it cost me a whole day reading up on
alternative/paravirt code patching.
See the patches I just send out; I got the 'native' case with