Re: [RFC] Implement Batched (group) ticket lock

2014-05-30 Thread Raghavendra K T
On 05/30/2014 04:15 AM, Waiman Long wrote: On 05/28/2014 08:16 AM, Raghavendra K T wrote: - we need an intelligent way to nullify the effect of batching for baremetal (because extra cmpxchg is not required). To do this, you will need to have 2 slightly different algorithms depending on the p

Re: [RFC] Implement Batched (group) ticket lock

2014-05-29 Thread Waiman Long
On 05/28/2014 08:16 AM, Raghavendra K T wrote: TODO: - we need an intelligent way to nullify the effect of batching for baremetal (because extra cmpxchg is not required). To do this, you will need to have 2 slightly different algorithms depending on the paravirt_ticketlocks_enabled jump lab

Re: [RFC] Implement Batched (group) ticket lock

2014-05-29 Thread Raghavendra K T
On 05/29/2014 12:16 PM, Peter Zijlstra wrote: On Wed, May 28, 2014 at 05:46:39PM +0530, Raghavendra K T wrote: In virtualized environment there are mainly three problems related to spinlocks that affect performance. 1. LHP (lock holder preemption) 2. Lock Waiter Preemption (LWP) 3. Starvation/fa

Re: [RFC] Implement Batched (group) ticket lock

2014-05-29 Thread Raghavendra K T
On 05/29/2014 03:25 AM, Rik van Riel wrote: On 05/28/2014 08:16 AM, Raghavendra K T wrote: This patch looks very promising. Thank you Rik. [...] - My kernbench/ebizzy test on baremetal (32 cpu +ht sandybridge) did not seem to show the impact of extra cmpxchg. but there should be effect o

Re: [RFC] Implement Batched (group) ticket lock

2014-05-28 Thread Peter Zijlstra
On Wed, May 28, 2014 at 05:46:39PM +0530, Raghavendra K T wrote: > In virtualized environment there are mainly three problems > related to spinlocks that affect performance. > 1. LHP (lock holder preemption) > 2. Lock Waiter Preemption (LWP) > 3. Starvation/fairness > > Though ticketlocks solve t

Re: [RFC] Implement Batched (group) ticket lock

2014-05-28 Thread Rik van Riel
On 05/28/2014 06:19 PM, Linus Torvalds wrote: > If somebody has a P4 still, that's likely the worst case by far. I'm sure cmpxchg isn't the only thing making P4 the worst case :) -- All rights reversed

Re: [RFC] Implement Batched (group) ticket lock

2014-05-28 Thread Thomas Gleixner
On Wed, 28 May 2014, Linus Torvalds wrote: > > If somebody has a P4 still, that's likely the worst case by far. I do, but I'm only using it during winter and only if the ia64 machine does not provide sufficient heating. So you have to wait at least half a year until I'm able to test it.

Re: [RFC] Implement Batched (group) ticket lock

2014-05-28 Thread Linus Torvalds
On Wed, May 28, 2014 at 2:55 PM, Rik van Riel wrote: > > Or maybe cmpxchg is cheap once you already own the cache line > exclusively? A locked cmpxchg ends up being anything between ~15-50 cycles depending on microarchitecture if things are already exclusively in the cache (with the P4 being an o

Re: [RFC] Implement Batched (group) ticket lock

2014-05-28 Thread Rik van Riel
On 05/28/2014 08:16 AM, Raghavendra K T wrote: This patch looks very promising. > TODO: > - we need an intelligent way to nullify the effect of batching for baremetal > (because extra cmpxchg is not required). On (larger?) NUMA systems, the unfairness may be a nice performance benefit, reducing

[RFC] Implement Batched (group) ticket lock

2014-05-28 Thread Raghavendra K T
In a virtualized environment there are mainly three problems related to spinlocks that affect performance: 1. LHP (lock holder preemption) 2. Lock Waiter Preemption (LWP) 3. Starvation/fairness. Though ticketlocks solve the fairness problem, they worsen the LWP and LHP problems. pv-ticketlocks tried to addr
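For readers unfamiliar with the baseline being discussed, here is a minimal sketch of a plain (unbatched) ticket lock using C11 atomics. It is not the code from the RFC patch and not the kernel's arch_spinlock_t; the names ticket_lock, ticket_lock_init, ticket_lock_acquire and ticket_lock_release are hypothetical and chosen only for illustration. It shows why the lock is fair (strict FIFO by ticket number) and, for the same reason, why a preempted waiter can stall everyone queued behind it.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical illustration type; not the kernel's arch_spinlock_t. */
struct ticket_lock {
    atomic_uint head;   /* ticket number currently being served */
    atomic_uint tail;   /* next ticket number to hand out */
};

static void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->head, 0);
    atomic_init(&l->tail, 0);
}

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Take a ticket: one atomic fetch-and-add fixes our place in the queue. */
    unsigned int me = atomic_fetch_add_explicit(&l->tail, 1,
                                                memory_order_relaxed);

    /*
     * Spin until our ticket is served. Service is strictly FIFO, so if
     * the waiter holding the next ticket is not running (for example,
     * its vCPU was preempted), every later waiter keeps spinning here.
     */
    while (atomic_load_explicit(&l->head, memory_order_acquire) != me)
        ;
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Sanity check: head == tail would mean nobody holds the lock. */
    assert(atomic_load(&l->head) != atomic_load(&l->tail));

    /* Hand the lock to the next ticket in line. */
    atomic_fetch_add_explicit(&l->head, 1, memory_order_release);
}
```

Note that the uncontended path above needs only a fetch-and-add to acquire; the thread's concern about an "extra cmpxchg" on baremetal is about adding cost on top of this baseline.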