Re: [Xen-devel] [PATCH-tip v2 2/2] x86/xen: Deprecate xen_nopvspin

2017-11-02 Thread Waiman Long
On 11/01/2017 06:01 PM, Boris Ostrovsky wrote: > On 11/01/2017 04:58 PM, Waiman Long wrote: >> +/* TODO: To be removed in a future kernel version */ >> static __init int xen_parse_nopvspin(char *arg) >> { >> -xen_pvspin = false; >> +pr_warn("

[Xen-devel] [PATCH-tip v2 0/2] x86/paravirt: Enable users to choose PV lock type

2017-11-01 Thread Waiman Long
Patch 2 deprecates Xen's xen_nopvspin parameter as it is no longer needed. Waiman Long (2): x86/paravirt: Add kernel parameter to choose paravirt lock type x86/xen: Deprecate xen_nopvspin Documentation/admin-guide/kernel-parameters.txt | 11 --- arch/x86/include/asm/paravirt.h |

[Xen-devel] [PATCH-tip v2 1/2] x86/paravirt: Add kernel parameter to choose paravirt lock type

2017-11-01 Thread Waiman Long
this new parameter in determining if pvqspinlock should be used. The parameter, however, will override Xen's xen_nopvspin in terms of disabling the unfair lock. Signed-off-by: Waiman Long <long...@redhat.com> --- Documentation/admin-guide/kernel-parameters.txt | 7 + arch/x86/include/asm/para
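A rough illustration of what such a pvlock_type boot-parameter handler can look like; the enum, variable and function names below are invented for this example (only the "queued" value is confirmed by the thread), not lifted from the patch:

/* Sketch only: map the documented values onto an internal selector. */
enum pv_lock_type {
	PV_LOCK_DEFAULT,
	PV_LOCK_QUEUED,		/* native queued spinlock        */
	PV_LOCK_PVQUEUED,	/* paravirt queued spinlock      */
	PV_LOCK_UNFAIR,		/* unfair test-and-set fallback  */
};

static enum pv_lock_type pv_lock_choice __initdata = PV_LOCK_DEFAULT;

static __init int parse_pvlock_type(char *arg)
{
	if (!arg)
		return -EINVAL;
	if (!strcmp(arg, "queued"))
		pv_lock_choice = PV_LOCK_QUEUED;
	else if (!strcmp(arg, "pvqueued"))
		pv_lock_choice = PV_LOCK_PVQUEUED;
	else if (!strcmp(arg, "unfair"))
		pv_lock_choice = PV_LOCK_UNFAIR;
	return 0;
}
early_param("pvlock_type", parse_pvlock_type);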

[Xen-devel] [PATCH-tip v2 2/2] x86/xen: Deprecate xen_nopvspin

2017-11-01 Thread Waiman Long
With the new pvlock_type kernel parameter, xen_nopvspin is no longer needed. This patch deprecates the xen_nopvspin parameter by removing its documentation and treating it as an alias of "pvlock_type=queued". Signed-off-by: Waiman Long <long...@redhat.com> --- Documentation/ad
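A minimal sketch of what treating the old option as an alias can look like; pvlock_set_type() is a hypothetical helper standing in for whatever the series actually calls, and the warning text is illustrative:

/* Hedged sketch, not the literal patch. */
static __init int xen_parse_nopvspin(char *arg)
{
	pr_warn("xen_nopvspin is deprecated, use pvlock_type=queued instead\n");
	pvlock_set_type("queued");	/* hypothetical helper */
	return 0;
}
early_param("xen_nopvspin", xen_parse_nopvspin);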

Re: [Xen-devel] [PATCH] x86/paravirt: Add kernel parameter to choose paravirt lock type

2017-11-01 Thread Waiman Long
On 11/01/2017 03:01 PM, Boris Ostrovsky wrote: > On 11/01/2017 12:28 PM, Waiman Long wrote: >> On 11/01/2017 11:51 AM, Juergen Gross wrote: >>> On 01/11/17 16:32, Waiman Long wrote: >>>> Currently, there are 3 different lock types that can be chosen

Re: [Xen-devel] [PATCH] x86/paravirt: Add kernel parameter to choose paravirt lock type

2017-11-01 Thread Waiman Long
On 11/01/2017 11:51 AM, Juergen Gross wrote: > On 01/11/17 16:32, Waiman Long wrote: >> Currently, there are 3 different lock types that can be chosen for >> the x86 architecture: >> >> - qspinlock >> - pvqspinlock >> - unfair lock >> >> One

[Xen-devel] [PATCH] x86/paravirt: Add kernel parameter to choose paravirt lock type

2017-11-01 Thread Waiman Long
this new parameter in determining if pvqspinlock should be used. The parameter, however, will override Xen's xen_nopvspin in terms of disabling the unfair lock. Signed-off-by: Waiman Long <long...@redhat.com> --- Documentation/admin-guide/kernel-parameters.txt | 7 + arch/x86/include/asm/para

Re: [Xen-devel] [PATCH v3 0/2] guard virt_spin_lock() with a static key

2017-09-25 Thread Waiman Long
t can decide whether to use paravirtualized >> spinlocks, the current fallback to the unfair test-and-set scheme, or >> to mimic the bare metal behavior. >> >> V3: >> - remove test for hypervisor environment from virt_spin_lock() as >> suggested by Waiman Long

Re: [Xen-devel] [PATCH v2 1/2] paravirt/locks: use new static key for controlling call of virt_spin_lock()

2017-09-06 Thread Waiman Long
On 09/06/2017 12:04 PM, Peter Zijlstra wrote: > On Wed, Sep 06, 2017 at 11:49:49AM -0400, Waiman Long wrote: >>> #define virt_spin_lock virt_spin_lock >>> static inline bool virt_spin_lock(struct qspinlock *lock) >>> { >>> + if (!static_branch_likely(

Re: [Xen-devel] [PATCH v2 1/2] paravirt/locks: use new static key for controlling call of virt_spin_lock()

2017-09-06 Thread Waiman Long
> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c > index 294294c71ba4..838d235b87ef 100644 > --- a/kernel/locking/qspinlock.c > +++ b/kernel/locking/qspinlock.c > @@ -76,6 +76,10 @@ > #define MAX_NODES 4 > #endif > > +#ifdef CONFIG_PARAVIRT > +DEFI
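The guarded fallback under discussion has roughly this shape, as a sketch assuming the key is declared elsewhere as DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key): when the key is disabled the function returns false and the caller continues into the normal queued slowpath, otherwise it spins on an unfair test-and-set loop:

#define virt_spin_lock virt_spin_lock
static inline bool virt_spin_lock(struct qspinlock *lock)
{
	if (!static_branch_likely(&virt_spin_lock_key))
		return false;		/* take the regular queued slowpath */

	/* Unfair test-and-set loop used when running virtualized. */
	do {
		while (atomic_read(&lock->val) != 0)
			cpu_relax();
	} while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);

	return true;
}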

Re: [Xen-devel] [PATCH v2 2/2] paravirt, xen: correct xen_nopvspin case

2017-09-06 Thread Waiman Long
t xen_init_spinlocks(void) > > if (!xen_pvspin) { > printk(KERN_DEBUG "xen: PV spinlocks disabled\n"); > + static_branch_disable(&virt_spin_lock_key); > return; > } > printk(KERN_DEBUG "xen: PV spinlocks enable

Re: [Xen-devel] [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-06 Thread Waiman Long
On 09/06/2017 09:06 AM, Peter Zijlstra wrote: > On Wed, Sep 06, 2017 at 08:44:09AM -0400, Waiman Long wrote: >> On 09/06/2017 03:08 AM, Peter Zijlstra wrote: >>> Guys, please trim email. >>> >>> On Tue, Sep 05, 2017 at 10:31:46AM -0400, Waiman Long wrote: >

Re: [Xen-devel] [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-06 Thread Waiman Long
On 09/06/2017 03:08 AM, Peter Zijlstra wrote: > Guys, please trim email. > > On Tue, Sep 05, 2017 at 10:31:46AM -0400, Waiman Long wrote: >> For clarification, I was actually asking if you consider just adding one >> more jump label to skip it for Xen/KVM instead of making >

Re: [Xen-devel] [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-05 Thread Waiman Long
On 09/05/2017 10:24 AM, Waiman Long wrote: > On 09/05/2017 10:18 AM, Juergen Gross wrote: >> On 05/09/17 16:10, Waiman Long wrote: >>> On 09/05/2017 09:24 AM, Juergen Gross wrote: >>>> There are cases where a guest tries to switch spinlocks to bare metal

Re: [Xen-devel] [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-05 Thread Waiman Long
On 09/05/2017 10:18 AM, Juergen Gross wrote: > On 05/09/17 16:10, Waiman Long wrote: >> On 09/05/2017 09:24 AM, Juergen Gross wrote: >>> There are cases where a guest tries to switch spinlocks to bare metal >>> behavior (e.g. by setting "xen_nopvspin"

Re: [Xen-devel] [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-05 Thread Waiman Long
On 09/05/2017 10:08 AM, Peter Zijlstra wrote: > On Tue, Sep 05, 2017 at 10:02:57AM -0400, Waiman Long wrote: >> On 09/05/2017 09:24 AM, Juergen Gross wrote: >>> +static inline bool native_virt_spin_lock(struct qspinlock *lock) >>> +{ >>> + if (!s

Re: [Xen-devel] [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-05 Thread Waiman Long
On 09/05/2017 09:24 AM, Juergen Gross wrote: > There are cases where a guest tries to switch spinlocks to bare metal > behavior (e.g. by setting "xen_nopvspin" boot parameter). Today this > has the downside of falling back to unfair test and set scheme for > qspinlocks due to virt_spin_lock()

Re: [Xen-devel] [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-05 Thread Waiman Long
On 09/05/2017 09:24 AM, Juergen Gross wrote: > There are cases where a guest tries to switch spinlocks to bare metal > behavior (e.g. by setting "xen_nopvspin" boot parameter). Today this > has the downside of falling back to unfair test and set scheme for > qspinlocks due to virt_spin_lock()

[Xen-devel] [PATCH v5 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64

2017-02-20 Thread Waiman Long
__raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched. Suggested-by: Peter Zijlstra <pet...@infradead.org> Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/kernel/asm-offsets_64.c | 9 + arch/x86/kernel/kvm.c| 24 2 f
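Pieced together from the fragments quoted elsewhere in these threads, the hand-written helper is roughly the following: it indexes __per_cpu_offset with the cpu number passed in %rdi and tests the per-cpu steal_time.preempted byte, with the offset macro coming from asm-offsets_64.c. Details may differ from what finally landed:

/* Hedged reconstruction, not the verbatim patch. */
asm(
".pushsection .text;"
".global __raw_callee_save___kvm_vcpu_is_preempted;"
".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
"__raw_callee_save___kvm_vcpu_is_preempted:"
"movq	__per_cpu_offset(,%rdi,8), %rax;"
"cmpb	$0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
"setne	%al;"
"ret;"
".popsection");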

[Xen-devel] [PATCH v5 0/2] x86/kvm: Reduce vcpu_is_preempted() overhead

2017-02-20 Thread Waiman Long
This patch set intends to reduce this performance overhead by replacing the C __kvm_vcpu_is_preempted() function by an optimized version of __raw_callee_save___kvm_vcpu_is_preempted() written in assembly. Waiman Long (2): x86/paravirt: Change vcp_is_preempted() arg type to long x86/kvm: Provide optim

[Xen-devel] [PATCH v5 1/2] x86/paravirt: Change vcp_is_preempted() arg type to long

2017-02-20 Thread Waiman Long
number won't exceed 32 bits. Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/include/asm/paravirt.h | 2 +- arch/x86/include/asm/qspinlock.h | 2 +- arch/x86/kernel/kvm.c| 2 +- arch/x86/kernel/paravirt-spinlocks.c | 2 +- 4 files changed, 4 insertions
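The point of the int -> long switch is that on x86-64 the cpu number then arrives already full-width in %rdi, so an assembly helper such as the one sketched above does not need a movslq to sign-extend %edi before indexing __per_cpu_offset. Schematically (paravirt plumbing elided):

/* before: the asm stub must widen the 32-bit argument itself */
bool vcpu_is_preempted(int cpu);

/* after: the value can be used directly as a per-cpu array index */
bool vcpu_is_preempted(long cpu);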

Re: [Xen-devel] [PATCH v4 1/2] x86/paravirt: Change vcp_is_preempted() arg type to long

2017-02-16 Thread Waiman Long
On 02/16/2017 11:09 AM, Peter Zijlstra wrote: > On Wed, Feb 15, 2017 at 04:37:49PM -0500, Waiman Long wrote: >> The cpu argument in the function prototype of vcpu_is_preempted() >> is changed from int to long. That makes it easier to provide a better >> optimized assembly ver

Re: [Xen-devel] [PATCH v4 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64

2017-02-16 Thread Waiman Long
On 02/16/2017 11:48 AM, Peter Zijlstra wrote: > On Wed, Feb 15, 2017 at 04:37:50PM -0500, Waiman Long wrote: >> +/* >> + * Hand-optimize version for x86-64 to avoid 8 64-bit register saving and >> + * restoring to/from the stack. It is assumed that the preempted value >>

[Xen-devel] [PATCH v4 0/2] x86/kvm: Reduce vcpu_is_preempted() overhead

2017-02-15 Thread Waiman Long
Since vcpu_is_preempted() can have some impact on system performance on a VM guest, especially for x86-64 guests, this patch set intends to reduce this performance overhead by replacing the C __kvm_vcpu_is_preempted() function by an optimized version of __raw_callee_save___kvm_vcpu_is_preempted() written in assembly. Wa

[Xen-devel] [PATCH v4 1/2] x86/paravirt: Change vcp_is_preempted() arg type to long

2017-02-15 Thread Waiman Long
number won't exceed 32 bits. Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/include/asm/paravirt.h | 2 +- arch/x86/include/asm/qspinlock.h | 2 +- arch/x86/kernel/kvm.c| 2 +- arch/x86/kernel/paravirt-spinlocks.c | 2 +- 4 files changed, 4 insertions

[Xen-devel] [PATCH v4 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64

2017-02-15 Thread Waiman Long
__raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched. Suggested-by: Peter Zijlstra <pet...@infradead.org> Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/kernel/kvm.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/arch/x86/ke

[Xen-devel] [PATCH v3 0/2] x86/kvm: Reduce vcpu_is_preempted() overhead

2017-02-15 Thread Waiman Long
Since vcpu_is_preempted() can have some impact on system performance on a VM guest, especially for x86-64 guests, this patch set intends to reduce this performance overhead by replacing the C __kvm_vcpu_is_preempted() function by an optimized version of __raw_callee_save___kvm_vcpu_is_preempted() written in assembly. Waiman Long (2): x86/paravirt:

[Xen-devel] [PATCH v3 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64

2017-02-15 Thread Waiman Long
__raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched. Suggested-by: Peter Zijlstra <pet...@infradead.org> Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/kernel/kvm.c | 28 1 file changed, 28 insertions(+) diff --git a/arch/x86/ke

[Xen-devel] [PATCH v3 1/2] x86/paravirt: Change vcp_is_preempted() arg type to long

2017-02-15 Thread Waiman Long
number won't exceed 32 bits. Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/include/asm/paravirt.h | 2 +- arch/x86/include/asm/qspinlock.h | 2 +- arch/x86/kernel/kvm.c| 2 +- arch/x86/kernel/paravirt-spinlocks.c | 2 +- 4 files changed, 4 insertions

Re: [Xen-devel] [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-14 Thread Waiman Long
On 02/14/2017 04:39 AM, Peter Zijlstra wrote: > On Mon, Feb 13, 2017 at 05:34:01PM -0500, Waiman Long wrote: >> It is the address of &steal_time that will exceed the 32-bit limit. > That seems extremely unlikely. That would mean we have more than 4G > worth of per-cpu variables declare

Re: [Xen-devel] [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-13 Thread Waiman Long
On 02/13/2017 04:52 PM, Peter Zijlstra wrote: > On Mon, Feb 13, 2017 at 03:12:45PM -0500, Waiman Long wrote: >> On 02/13/2017 02:42 PM, Waiman Long wrote: >>> On 02/13/2017 05:53 AM, Peter Zijlstra wrote: >>>> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra

Re: [Xen-devel] [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-13 Thread Waiman Long
On 02/13/2017 03:06 PM, h...@zytor.com wrote: > On February 13, 2017 2:53:43 AM PST, Peter Zijlstra > wrote: >> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >>> That way we'd end up with something like: >>> >>> asm(" >>> push %rdi; >>> movslq %edi, %rdi;

Re: [Xen-devel] [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-13 Thread Waiman Long
On 02/13/2017 02:42 PM, Waiman Long wrote: > On 02/13/2017 05:53 AM, Peter Zijlstra wrote: >> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >>> That way we'd end up with something like: >>> >>> asm(" >>> push %rdi; >>>

Re: [Xen-devel] [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-13 Thread Waiman Long
On 02/13/2017 05:53 AM, Peter Zijlstra wrote: > On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >> That way we'd end up with something like: >> >> asm(" >> push %rdi; >> movslq %edi, %rdi; >> movq __per_cpu_offset(,%rdi,8), %rax; >> cmpb $0, %[offset](%rax); >> setne %al; >> pop

Re: [Xen-devel] [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-13 Thread Waiman Long
On 02/13/2017 05:47 AM, Peter Zijlstra wrote: > On Fri, Feb 10, 2017 at 12:00:43PM -0500, Waiman Long wrote: > >>>> +asm( >>>> +".pushsection .text;" >>>> +".global __raw_callee_save___kvm_vcpu_is_preempted;" >

Re: [Xen-devel] [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-10 Thread Waiman Long
On 02/10/2017 11:35 AM, Waiman Long wrote: > On 02/10/2017 11:19 AM, Peter Zijlstra wrote: >> On Fri, Feb 10, 2017 at 10:43:09AM -0500, Waiman Long wrote: >>> It was found when running fio sequential write test with a XFS ramdisk >>> on a VM running on a 2-socket x8

Re: [Xen-devel] [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-10 Thread Waiman Long
On 02/10/2017 11:19 AM, Peter Zijlstra wrote: > On Fri, Feb 10, 2017 at 10:43:09AM -0500, Waiman Long wrote: >> It was found when running fio sequential write test with a XFS ramdisk >> on a VM running on a 2-socket x86-64 system, the %CPU times as reported >> by perf were as

[Xen-devel] [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-10 Thread Waiman Long
[k] osq_lock 10.14% 10.14% fio [k] __kvm_vcpu_is_preempted On bare metal, the patch doesn't introduce any performance regression. On KVM guest, it produces noticeable performance improvement (up to 7%). Signed-off-by: Waiman Long <long...@redhat.com> --- v1->v2: - Rerun

Re: [Xen-devel] [PATCH 1/2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-08 Thread Waiman Long
On 02/08/2017 02:05 PM, Peter Zijlstra wrote: > On Wed, Feb 08, 2017 at 01:00:24PM -0500, Waiman Long wrote: >> It was found when running fio sequential write test with a XFS ramdisk >> on a 2-socket x86-64 system, the %CPU times as reported by perf were >> as follows: >

Re: [Xen-devel] [PATCH 2/2] locking/mutex, rwsem: Reduce vcpu_is_preempted() calling frequency

2017-02-08 Thread Waiman Long
On 02/08/2017 02:05 PM, Peter Zijlstra wrote: > On Wed, Feb 08, 2017 at 01:00:25PM -0500, Waiman Long wrote: >> As the vcpu_is_preempted() call is pretty costly compared with other >> checks within mutex_spin_on_owner() and rwsem_spin_on_owner(), they >> are done at a red

[Xen-devel] [PATCH 2/2] locking/mutex, rwsem: Reduce vcpu_is_preempted() calling frequency

2017-02-08 Thread Waiman Long
As the vcpu_is_preempted() call is pretty costly compared with other checks within mutex_spin_on_owner() and rwsem_spin_on_owner(), they are done at a reduced frequency of once every 256 iterations. Signed-off-by: Waiman Long <long...@redhat.com> --- kernel/locking/mutex.c | 5 -
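A sketch of the rate-limiting idea for the mutex side, assuming the surrounding loop mirrors the existing mutex_spin_on_owner(); the period macro name is invented for this example:

#define VCPU_PREEMPT_CHECK_PERIOD	256	/* illustrative name */

static noinline bool mutex_spin_on_owner(struct mutex *lock,
					 struct task_struct *owner)
{
	unsigned int loop = 0;
	bool ret = true;

	rcu_read_lock();
	while (__mutex_owner(lock) == owner) {
		barrier();
		/*
		 * Poll the comparatively expensive vcpu_is_preempted()
		 * only once every 256 iterations; the cheap checks still
		 * run every pass.
		 */
		if (!owner->on_cpu || need_resched() ||
		    (!(++loop & (VCPU_PREEMPT_CHECK_PERIOD - 1)) &&
		     vcpu_is_preempted(task_cpu(owner)))) {
			ret = false;
			break;
		}
		cpu_relax();
	}
	rcu_read_unlock();
	return ret;
}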

[Xen-devel] [PATCH 1/2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-08 Thread Waiman Long
and rwsem slowpaths, there isn't much to gain by making it callee-save. So it is now changed to a normal function call instead. With this patch applied, the aggregate bandwidth of the fio sequential write test increased slightly from 2563.3MB/s to 2588.1MB/s (about 1%). Signed-off-by: Waiman

[Xen-devel] [PATCH v2] x86, locking/spinlocks: Remove paravirt_ticketlocks_enabled

2017-01-12 Thread Waiman Long
locks_jump() are also removed. A simple build and boot test was done to verify it. Signed-off-by: Waiman Long <long...@redhat.com> --- v1->v2: - Remove init functions kvm_spinlock_init_jump() and xen_init_spinlocks_jump(). arch/x86/include/asm/spinlock.h | 3 --- ar

[Xen-devel] [PATCH] x86, locking/spinlocks: Remove paravirt_ticketlocks_enabled

2017-01-12 Thread Waiman Long
This is a follow-up of commit cfd8983f03c7b2 ("x86, locking/spinlocks: Remove ticket (spin)lock implementation"). The static_key structure paravirt_ticketlocks_enabled is now removed as it is no longer used. A simple build and boot test was done to verify it. Signed-off-by: Waiman

Re: [Xen-devel] [PATCH v16 08/14] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-05-04 Thread Waiman Long
On 05/04/2015 10:20 AM, Peter Zijlstra wrote: I changed it to the below; I've not gotten around to compiling or even running it yet :-( The biggest change is the pv_hash/pv_unhash functions, which I've rewritten to hopefully be clearer (and also hopefully not wrecked them). I took out the

Re: [Xen-devel] [PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015-05-04 Thread Waiman Long
On 05/04/2015 10:05 AM, Peter Zijlstra wrote: On Thu, Apr 30, 2015 at 02:49:26PM -0400, Waiman Long wrote: On 04/29/2015 02:11 PM, Peter Zijlstra wrote: On Fri, Apr 24, 2015 at 02:56:42PM -0400, Waiman Long wrote: In the pv_scan_next() function, the slow cmpxchg atomic operation is performed

Re: [Xen-devel] [PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015-04-30 Thread Waiman Long
On 04/29/2015 02:27 PM, Linus Torvalds wrote: On Wed, Apr 29, 2015 at 11:11 AM, Peter Zijlstra pet...@infradead.org wrote: On Fri, Apr 24, 2015 at 02:56:42PM -0400, Waiman Long wrote: In the pv_scan_next() function, the slow cmpxchg atomic operation is performed even if the other CPU

Re: [Xen-devel] [PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015-04-30 Thread Waiman Long
On 04/29/2015 02:11 PM, Peter Zijlstra wrote: On Fri, Apr 24, 2015 at 02:56:42PM -0400, Waiman Long wrote: In the pv_scan_next() function, the slow cmpxchg atomic operation is performed even if the other CPU is not even close to being halted. This extra cmpxchg can harm slowpath performance

[Xen-devel] [PATCH v16 04/14] qspinlock: Extract out code snippets for the next patch

2015-04-24 Thread Waiman Long
the locked bit into a new clear_pending_set_locked() function. This patch also simplifies the trylock operation before queuing by calling queue_spin_trylock() directly. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- include/asm-generic

[Xen-devel] [PATCH v16 01/14] qspinlock: A simple generic 4-byte queue spinlock

2015-04-24 Thread Waiman Long
the lock is acquired, the queue node can be released to be used later. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- include/asm-generic/qspinlock.h | 132 + include/asm-generic/qspinlock_types.h | 58
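The fast path of the 4-byte lock is a single cmpxchg from 0 to the locked value; everything else, including queuing on the per-cpu MCS nodes mentioned above, lives in the slowpath. A condensed sketch following the series' naming:

static __always_inline void queue_spin_lock(struct qspinlock *lock)
{
	u32 val;

	/* Uncontended case: 0 -> _Q_LOCKED_VAL in one atomic operation. */
	val = atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL);
	if (likely(val == 0))
		return;
	/* Contended: pending-bit and MCS queue handling. */
	queue_spin_lock_slowpath(lock, val);
}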

[Xen-devel] [PATCH v16 00/14] qspinlock: a 4-byte queue spinlock with PV support

2015-04-24 Thread Waiman Long
-and-set on hypervisors pvqspinlock, x86: Implement the paravirt qspinlock call patching Waiman Long (9): qspinlock: A simple generic 4-byte queue spinlock qspinlock, x86: Enable x86-64 to use queue spinlock qspinlock: Extract out code snippets for the next patch qspinlock: Use a simple write

[Xen-devel] [PATCH v16 09/14] pvqspinlock, x86: Implement the paravirt qspinlock call patching

2015-04-24 Thread Waiman Long
. This significantly lowers the overhead of having CONFIG_PARAVIRT_SPINLOCKS enabled, even for native code. Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/Kconfig |2 +- arch/x86/include/asm/paravirt.h

[Xen-devel] [PATCH v16 10/14] pvqspinlock, x86: Enable PV qspinlock for KVM

2015-04-24 Thread Waiman Long
This patch adds the necessary KVM specific code to allow KVM to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 43 +++ kernel/Kconfig.locks
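The enablement patches mostly consist of wiring up the PV hooks; a condensed sketch following the series' naming (exact field and function names may differ between versions):

void __init kvm_spinlock_init(void)
{
	/* Host must support the PV unhalt facility used for kicking. */
	if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
		return;

	__pv_init_lock_hash();
	pv_lock_ops.queue_spin_lock_slowpath = __pv_queue_spin_lock_slowpath;
	pv_lock_ops.queue_spin_unlock = PV_CALLEE_SAVE(__pv_queue_spin_unlock);
	pv_lock_ops.wait = kvm_wait;
	pv_lock_ops.kick = kvm_kick_cpu;
}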

[Xen-devel] [PATCH v16 06/14] qspinlock: Use a simple write to grab the lock

2015-04-24 Thread Waiman Long
-- - ticketlock 2075 10.00 216.35 3.49 qspinlock 3023 10.00 198.20 4.80 Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org

[Xen-devel] [PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015-04-24 Thread Waiman Long
without using atomic op. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock_paravirt.h | 28 +--- 1 files changed, 25 insertions(+), 3 deletions(-) diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h index

[Xen-devel] [PATCH v16 03/14] qspinlock: Add pending bit

2015-04-24 Thread Waiman Long
-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com --- include/asm-generic/qspinlock_types.h | 12 +++- kernel/locking/qspinlock.c| 119 +++-- 2 files changed, 107 insertions(+), 24 deletions(-) diff --git a/include

[Xen-devel] [PATCH v16 14/14] pvqspinlock: Collect slowpath lock statistics

2015-04-24 Thread Waiman Long
the pv-qspinlock directory. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock_paravirt.h | 100 ++- 1 files changed, 98 insertions(+), 2 deletions(-) diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h

[Xen-devel] [PATCH v16 12/14] pvqspinlock: Only kick CPU at unlock time

2015-04-24 Thread Waiman Long
() so as to do the pv_kick() only if it is really necessary. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock.c | 10 ++-- kernel/locking/qspinlock_paravirt.h | 76 +- 2 files changed, 61 insertions(+), 25 deletions(-) diff --git

[Xen-devel] [PATCH v16 08/14] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-24 Thread Waiman Long
linear feedback shift register. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock.c | 68 +++- kernel/locking/qspinlock_paravirt.h | 324 +++ 2 files changed, 391 insertions(+), 1 deletions(-) create mode 100644 kernel

[Xen-devel] [PATCH v16 05/14] qspinlock: Optimize for smaller NR_CPUS

2015-04-24 Thread Waiman Long
-by: Waiman Long waiman.l...@hp.com --- include/asm-generic/qspinlock_types.h | 13 ++ kernel/locking/qspinlock.c| 69 - 2 files changed, 81 insertions(+), 1 deletions(-) diff --git a/include/asm-generic/qspinlock_types.h b/include/asm-generic
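The optimization builds on how the 32-bit lock word is carved up. Per the layout described in the series' comments (quoted here for orientation; consult qspinlock_types.h for the authoritative definitions), a small NR_CPUS lets the pending bit occupy a whole byte so that locked+pending can be manipulated as a 16-bit quantity:

/*
 * NR_CPUS < 16K:   0-7 locked byte, 8 pending, 9-15 unused,
 *                  16-17 tail index, 18-31 tail cpu (+1)
 * NR_CPUS >= 16K:  0-7 locked byte, 8 pending,
 *                  9-10 tail index, 11-31 tail cpu (+1)
 */
#define _Q_LOCKED_OFFSET	0
#define _Q_LOCKED_BITS		8
#define _Q_PENDING_OFFSET	(_Q_LOCKED_OFFSET + _Q_LOCKED_BITS)
#if CONFIG_NR_CPUS < (1U << 14)
#define _Q_PENDING_BITS		8	/* pending gets its own byte */
#else
#define _Q_PENDING_BITS		1
#endif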

[Xen-devel] [PATCH v16 11/14] pvqspinlock, x86: Enable PV qspinlock for Xen

2015-04-24 Thread Waiman Long
From: David Vrabel david.vra...@citrix.com This patch adds the necessary Xen specific code to allow Xen to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: David Vrabel david.vra...@citrix.com Signed-off-by: Waiman Long waiman.l...@hp.com

[Xen-devel] [PATCH v16 07/14] qspinlock: Revert to test-and-set on hypervisors

2015-04-24 Thread Waiman Long
From: Peter Zijlstra (Intel) pet...@infradead.org When we detect a hypervisor (!paravirt, see qspinlock paravirt support patches), revert to a simple test-and-set lock to avoid the horrors of queue preemption. Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long

[Xen-devel] [PATCH v16 02/14] qspinlock, x86: Enable x86-64 to use queue spinlock

2015-04-24 Thread Waiman Long
optimization which will make the queue spinlock code perform better than the generic implementation. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- arch/x86/Kconfig |1 + arch/x86/include/asm/qspinlock.h | 20

Re: [Xen-devel] [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-13 Thread Waiman Long
On 04/13/2015 11:09 AM, Peter Zijlstra wrote: On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote: +__visible void __pv_queue_spin_unlock(struct qspinlock *lock) +{ + struct __qspinlock *l = (void *)lock; + struct pv_node *node; + + if (likely(cmpxchg(l->locked

Re: [Xen-devel] [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-13 Thread Waiman Long
On 04/13/2015 10:47 AM, Peter Zijlstra wrote: On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote: +void __init __pv_init_lock_hash(void) +{ + int pv_hash_size = 4 * num_possible_cpus(); + + if (pv_hash_size < (1U << LFSR_MIN_BITS)) + pv_hash_size = (1U

Re: [Xen-devel] [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-13 Thread Waiman Long
On 04/13/2015 11:08 AM, Peter Zijlstra wrote: On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote: +static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node) +{ + struct __qspinlock *l = (void *)lock; + struct qspinlock **lp = NULL; + struct pv_node

Re: [Xen-devel] [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-09 Thread Waiman Long
On 04/09/2015 02:23 PM, Peter Zijlstra wrote: On Thu, Apr 09, 2015 at 08:13:27PM +0200, Peter Zijlstra wrote: On Mon, Apr 06, 2015 at 10:55:44PM -0400, Waiman Long wrote: +#define PV_HB_PER_LINE (SMP_CACHE_BYTES / sizeof(struct pv_hash_bucket)) +static struct qspinlock **pv_hash(struct

[Xen-devel] [PATCH v15 16/16] unfair qspinlock: a queue based unfair lock

2015-04-08 Thread Waiman Long
the performance benefit of qspinlock versus ticket spinlock which got reduced in VM3 due to the overhead of constant vCPUs halting and kicking. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/include/asm/qspinlock.h | 15 +-- kernel/locking/qspinlock.c| 94 +-- kernel

Re: [Xen-devel] [PATCH v15 12/15] pvqspinlock, x86: Enable PV qspinlock for Xen

2015-04-08 Thread Waiman Long
On 04/08/2015 08:01 AM, David Vrabel wrote: On 07/04/15 03:55, Waiman Long wrote: This patch adds the necessary Xen specific code to allow Xen to support the CPU halting and kicking operations needed by the queue spinlock PV code. This basically looks the same as the version I wrote, except I

[Xen-devel] [PATCH v15 01/15] qspinlock: A simple generic 4-byte queue spinlock

2015-04-06 Thread Waiman Long
the lock is acquired, the queue node can be released to be used later. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- include/asm-generic/qspinlock.h | 132 + include/asm-generic/qspinlock_types.h | 58

[Xen-devel] [PATCH v15 11/15] pvqspinlock, x86: Enable PV qspinlock for KVM

2015-04-06 Thread Waiman Long
This patch adds the necessary KVM specific code to allow KVM to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/kernel/kvm.c | 43 +++ kernel/Kconfig.locks

[Xen-devel] [PATCH v15 04/15] qspinlock: Extract out code snippets for the next patch

2015-04-06 Thread Waiman Long
the locked bit into a new clear_pending_set_locked() function. This patch also simplifies the trylock operation before queuing by calling queue_spin_trylock() directly. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- include/asm-generic

[Xen-devel] [PATCH v15 07/15] qspinlock: Revert to test-and-set on hypervisors

2015-04-06 Thread Waiman Long
From: Peter Zijlstra (Intel) pet...@infradead.org When we detect a hypervisor (!paravirt, see qspinlock paravirt support patches), revert to a simple test-and-set lock to avoid the horrors of queue preemption. Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long

[Xen-devel] [PATCH v15 03/15] qspinlock: Add pending bit

2015-04-06 Thread Waiman Long
-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com --- include/asm-generic/qspinlock_types.h | 12 +++- kernel/locking/qspinlock.c| 119 +++-- 2 files changed, 107 insertions(+), 24 deletions(-) diff --git a/include

[Xen-devel] [PATCH v15 05/15] qspinlock: Optimize for smaller NR_CPUS

2015-04-06 Thread Waiman Long
-by: Waiman Long waiman.l...@hp.com --- include/asm-generic/qspinlock_types.h | 13 ++ kernel/locking/qspinlock.c| 69 - 2 files changed, 81 insertions(+), 1 deletions(-) diff --git a/include/asm-generic/qspinlock_types.h b/include/asm-generic

[Xen-devel] [PATCH v15 14/15] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015-04-06 Thread Waiman Long
without using atomic op. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock_paravirt.h | 28 +--- 1 files changed, 25 insertions(+), 3 deletions(-) diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h index

[Xen-devel] [PATCH v15 00/15] qspinlock: a 4-byte queue spinlock with PV support

2015-04-06 Thread Waiman Long
qspinlock: Revert to test-and-set on hypervisors pvqspinlock: Implement the paravirt qspinlock for x86 Waiman Long (11): qspinlock: A simple generic 4-byte queue spinlock qspinlock, x86: Enable x86-64 to use queue spinlock qspinlock: Extract out code snippets for the next patch qspinlock: Use

[Xen-devel] [PATCH v15 13/15] pvqspinlock: Only kick CPU at unlock time

2015-04-06 Thread Waiman Long
vCPU state (vcpu_hashed) which enables the code to delay CPU kicking until at unlock time. Once this state is set, the new lock holder will set _Q_SLOW_VAL and fill in the hash table on behalf of the halted queue head vCPU. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking

[Xen-devel] [PATCH v15 02/15] qspinlock, x86: Enable x86-64 to use queue spinlock

2015-04-06 Thread Waiman Long
optimization which will make the queue spinlock code perform better than the generic implementation. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- arch/x86/Kconfig |1 + arch/x86/include/asm/qspinlock.h | 20

[Xen-devel] [PATCH v15 06/15] qspinlock: Use a simple write to grab the lock

2015-04-06 Thread Waiman Long
-- - ticketlock 2075 10.00 216.35 3.49 qspinlock 3023 10.00 198.20 4.80 Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org

[Xen-devel] [PATCH v15 10/15] pvqspinlock: Implement the paravirt qspinlock for x86

2015-04-06 Thread Waiman Long
. This significantly lowers the overhead of having CONFIG_PARAVIRT_SPINLOCKS enabled, even for native code. Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/Kconfig |2 +- arch/x86/include/asm/paravirt.h

[Xen-devel] [PATCH v15 08/15] lfsr: a simple binary Galois linear feedback shift register

2015-04-06 Thread Waiman Long
the value 0 in a somewhat random fashion depending on the LFSR taps that is being used. Callers can provide their own taps value or use the default. Signed-off-by: Waiman Long waiman.l...@hp.com --- include/linux/lfsr.h | 80 ++ 1 files changed, 80
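For readers unfamiliar with the construction, a self-contained user-space illustration of a binary Galois LFSR: each step shifts right and XORs in the taps mask when the bit shifted out was set. The 4-bit taps value 0x9 is only an example (it corresponds to x^4 + x + 1 and is maximal-length), not the kernel default:

#include <stdint.h>
#include <stdio.h>

/* One Galois LFSR step: shift right, fold the taps back in. */
static uint32_t lfsr_step(uint32_t v, uint32_t taps)
{
	uint32_t lsb = v & 1;

	v >>= 1;
	if (lsb)
		v ^= taps;
	return v;
}

int main(void)
{
	uint32_t v = 1;

	/* A maximal-length 4-bit LFSR visits all 15 non-zero values. */
	for (int i = 0; i < 15; i++) {
		printf("%u ", v);
		v = lfsr_step(v, 0x9);
	}
	printf("\n");
	return 0;
}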

[Xen-devel] [PATCH v15 12/15] pvqspinlock, x86: Enable PV qspinlock for Xen

2015-04-06 Thread Waiman Long
This patch adds the necessary Xen specific code to allow Xen to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/xen/spinlock.c | 63 --- kernel

[Xen-devel] [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-06 Thread Waiman Long
linear feedback shift register. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock.c | 69 - kernel/locking/qspinlock_paravirt.h | 321 +++ 2 files changed, 389 insertions(+), 1 deletions(-) create mode 100644 kernel

[Xen-devel] [PATCH v15 15/15] pvqspinlock: Add debug code to check for PV lock hash sanity

2015-04-06 Thread Waiman Long
that which will only be enabled if CONFIG_DEBUG_SPINLOCK is defined because of the performance overhead it introduces. Signed-off-by: Waiman Long waiman.l...@hp.com --- kernel/locking/qspinlock_paravirt.h | 58 +++ 1 files changed, 58 insertions(+), 0 deletions

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-02 Thread Waiman Long
On 04/02/2015 03:48 PM, Peter Zijlstra wrote: On Thu, Apr 02, 2015 at 07:20:57PM +0200, Peter Zijlstra wrote: pv_wait_head(): pv_hash() /* MB as per cmpxchg */ cmpxchg(l->locked, _Q_LOCKED_VAL, _Q_SLOW_VAL); VS __pv_queue_spin_unlock(): if (xchg(l->locked, 0)

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-02 Thread Waiman Long
On 04/01/2015 05:03 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 03:58:58PM -0400, Waiman Long wrote: On 04/01/2015 02:48 PM, Peter Zijlstra wrote: I am sorry that I don't quite get what you mean here. My point is that in the hashing step, a cpu will need to scan an empty bucket to put

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Waiman Long
On 03/19/2015 08:25 AM, Peter Zijlstra wrote: On Thu, Mar 19, 2015 at 11:12:42AM +0100, Peter Zijlstra wrote: So I was now thinking of hashing the lock pointer; let me go and quickly put something together. A little something like so; ideally we'd allocate the hashtable since NR_CPUS is kinda

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Waiman Long
On 04/01/2015 02:17 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 07:42:39PM +0200, Peter Zijlstra wrote: Hohumm.. time to think more I think ;-) So bear with me, I've not really pondered this well so it could be full of holes (again). After the cmpxchg(l->locked, _Q_LOCKED_VAL,

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Waiman Long
On 04/01/2015 02:48 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 02:54:45PM -0400, Waiman Long wrote: On 04/01/2015 02:17 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 07:42:39PM +0200, Peter Zijlstra wrote: Hohumm.. time to think more I think ;-) So bear with me, I've not really

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Waiman Long
On 04/01/2015 01:12 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 12:20:30PM -0400, Waiman Long wrote: After more careful reading, I think the assumption that the presence of an unused bucket means there is no match is not true. Consider the scenario: 1. cpu 0 puts lock1 into hb[0] 2. cpu

Re: [Xen-devel] [PATCH 0/9] qspinlock stuff -v15

2015-03-30 Thread Waiman Long
On 03/25/2015 03:47 PM, Konrad Rzeszutek Wilk wrote: On Mon, Mar 16, 2015 at 02:16:13PM +0100, Peter Zijlstra wrote: Hi Waiman, As promised; here is the paravirt stuff I did during the trip to BOS last week. All the !paravirt patches are more or less the same as before (the only real change

Re: [Xen-devel] [PATCH 0/9] qspinlock stuff -v15

2015-03-30 Thread Waiman Long
On 03/30/2015 12:29 PM, Peter Zijlstra wrote: On Mon, Mar 30, 2015 at 12:25:12PM -0400, Waiman Long wrote: I did it differently in my PV portion of the qspinlock patch. Instead of just waking up the CPU, the new lock holder will check if the new queue head has been halted. If so, it will set

Re: [Xen-devel] [PATCH 0/9] qspinlock stuff -v15

2015-03-30 Thread Waiman Long
On 03/27/2015 10:07 AM, Konrad Rzeszutek Wilk wrote: On Thu, Mar 26, 2015 at 09:21:53PM +0100, Peter Zijlstra wrote: On Wed, Mar 25, 2015 at 03:47:39PM -0400, Konrad Rzeszutek Wilk wrote: Ah nice. That could be spun out as a seperate patch to optimize the existing ticket locks I presume. Yes

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-03-19 Thread Waiman Long
On 03/19/2015 08:25 AM, Peter Zijlstra wrote: On Thu, Mar 19, 2015 at 11:12:42AM +0100, Peter Zijlstra wrote: So I was now thinking of hashing the lock pointer; let me go and quickly put something together. A little something like so; ideally we'd allocate the hashtable since NR_CPUS is kinda

Re: [Xen-devel] [PATCH 9/9] qspinlock, x86, kvm: Implement KVM support for paravirt qspinlock

2015-03-19 Thread Waiman Long
On 03/19/2015 06:01 AM, Peter Zijlstra wrote: On Wed, Mar 18, 2015 at 10:45:55PM -0400, Waiman Long wrote: On 03/16/2015 09:16 AM, Peter Zijlstra wrote: I do have some concern about this call site patching mechanism as the modification is not atomic. The spin_unlock() calls are in many places

Re: [Xen-devel] [PATCH 0/9] qspinlock stuff -v15

2015-03-18 Thread Waiman Long
On 03/16/2015 09:16 AM, Peter Zijlstra wrote: Hi Waiman, As promised; here is the paravirt stuff I did during the trip to BOS last week. All the !paravirt patches are more or less the same as before (the only real change is the copyright lines in the first patch). The paravirt stuff is

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-03-18 Thread Waiman Long
On 03/16/2015 09:16 AM, Peter Zijlstra wrote: Implement simple paravirt support for the qspinlock. Provide a separate (second) version of the spin_lock_slowpath for paravirt along with a special unlock path. The second slowpath is generated by adding a few pv hooks to the normal slowpath, but

Re: [Xen-devel] [PATCH 9/9] qspinlock, x86, kvm: Implement KVM support for paravirt qspinlock

2015-03-18 Thread Waiman Long
On 03/16/2015 09:16 AM, Peter Zijlstra wrote: Implement the paravirt qspinlock for x86-kvm. We use the regular paravirt call patching to switch between: native_queue_spin_lock_slowpath() / __pv_queue_spin_lock_slowpath() and native_queue_spin_unlock() / __pv_queue_spin_unlock(). We

[Xen-devel] [PATCH v14 01/11] qspinlock: A simple generic 4-byte queue spinlock

2015-01-20 Thread Waiman Long
the lock is acquired, the queue node can be released to be used later. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- include/asm-generic/qspinlock.h | 132 + include/asm-generic/qspinlock_types.h | 58
