[PATCH v12 00/11] qspinlock: a 4-byte queue spinlock with PV support

2014-10-16 Thread Waiman Long
v11-v12: - Based on PeterZ's version of the qspinlock patch (https://lkml.org/lkml/2014/6/15/63). - Incorporated many of the review comments from Konrad Wilk and Paolo Bonzini. - The pvqspinlock code is largely from my previous version with PeterZ's way of going from queue tail to head

[PATCH v12 01/11] qspinlock: A simple generic 4-byte queue spinlock

2014-10-16 Thread Waiman Long
This patch introduces a new generic queue spinlock implementation that can serve as an alternative to the default ticket spinlock. Compared with the ticket spinlock, this queue spinlock should be almost as fair as the ticket spinlock. It has about the same speed in single-thread and it can be much

[PATCH v12 02/11] qspinlock, x86: Enable x86-64 to use queue spinlock

2014-10-16 Thread Waiman Long
This patch makes the necessary changes at the x86 architecture specific layer to enable the use of queue spinlock for x86-64. As x86-32 machines are typically not multi-socket, the benefit of queue spinlock may not be apparent, so queue spinlock is not enabled. Currently, there is some

[PATCH v12 03/11] qspinlock: Add pending bit

2014-10-16 Thread Waiman Long
From: Peter Zijlstra pet...@infradead.org Because the qspinlock needs to touch a second cacheline (the per-cpu mcs_nodes[]); add a pending bit and allow a single in-word spinner before we punt to the second cacheline. It is possible to observe the pending bit without the locked bit when the last

[PATCH v12 04/11] qspinlock: Extract out code snippets for the next patch

2014-10-16 Thread Waiman Long
This is a preparatory patch that extracts out the following 2 code snippets to prepare for the next performance optimization patch. 1) the logic for the exchange of new and previous tail code words into a new xchg_tail() function. 2) the logic for clearing the pending bit and setting the

[PATCH v12 05/11] qspinlock: Optimize for smaller NR_CPUS

2014-10-16 Thread Waiman Long
From: Peter Zijlstra pet...@infradead.org When we allow for a max NR_CPUS < 2^14 we can optimize the pending wait-acquire and the xchg_tail() operations. By growing the pending bit to a byte, we reduce the tail to 16bit. This means we can use xchg16 for the tail part and do away with all the
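
The byte-sized pending field and 16-bit tail described above can be pictured with a small user-space sketch. This is not the patch's code: the type qsl_word, the helpers qsl_encode_tail() and qsl_xchg_tail(), the 2-bit per-CPU node index, the "cpu + 1" encoding, and the little-endian field order are all assumptions made for illustration, and the GCC/Clang __atomic builtins stand in for the kernel's xchg16.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed CPU limit for this sketch: fewer than 2^14 CPUs. */
#define QSL_MAX_CPUS (1u << 14)

/*
 * Hypothetical 4-byte lock word. Field order assumes little-endian:
 * one byte for "locked", one byte for "pending", 16 bits for the tail.
 */
union qsl_word {
    uint32_t val;
    struct {
        uint8_t  locked;
        uint8_t  pending;
        uint16_t tail;
    };
};

/*
 * Pack a CPU number and a per-CPU node index into a 16-bit tail code.
 * Storing cpu + 1 keeps the value 0 free to mean "queue is empty".
 * The 2-bit index width is an assumption of this sketch.
 */
static uint16_t qsl_encode_tail(unsigned cpu, unsigned idx)
{
    assert(cpu + 1 < QSL_MAX_CPUS && idx < 4);
    return (uint16_t)(((cpu + 1) << 2) | idx);
}

/*
 * Publish a new tail and return the previous one with a single 16-bit
 * exchange. The locked and pending bytes are not touched, so no
 * compare-and-swap retry loop on the full word is needed.
 */
static uint16_t qsl_xchg_tail(union qsl_word *w, uint16_t new_tail)
{
    return __atomic_exchange_n(&w->tail, new_tail, __ATOMIC_ACQ_REL);
}
```

A caller would encode its own tail, exchange it in, and treat a zero return value as "no predecessor in the queue".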

[PATCH v12 10/11] pvqspinlock, x86: Enable PV qspinlock for KVM

2014-10-16 Thread Waiman Long
This patch adds the necessary KVM specific code to allow KVM to support the CPU halting and kicking operations needed by the queue spinlock PV code. Two KVM guests of 20 CPU cores (2 nodes) were created for performance testing in one of the following three configurations: 1) Only 1 VM is active

[PATCH v12 06/11] qspinlock: Use a simple write to grab the lock

2014-10-16 Thread Waiman Long
Currently, atomic_cmpxchg() is used to get the lock. However, this is not really necessary if there is more than one task in the queue and the queue head doesn't need to reset the tail code. For that case, a simple write to set the lock bit is enough as the queue head will be the only one eligible
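
The distinction drawn above can be sketched in user-space C. Everything here is hypothetical: the qsl_word layout (little-endian, byte-sized locked and pending fields, 16-bit tail), the helper name qsl_head_take_lock(), and the use of GCC/Clang __atomic builtins. The sketch also assumes the caller is the queue head and has already waited, with acquire ordering, until both the locked and pending bytes are clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word; field order assumes little-endian. */
union qsl_word {
    uint32_t val;
    struct {
        uint8_t  locked;
        uint8_t  pending;
        uint16_t tail;
    };
};

/*
 * Queue head takes the lock. Returns true if the queue became empty
 * (the tail was reset), false if other waiters remain queued.
 */
static bool qsl_head_take_lock(union qsl_word *w, uint16_t my_tail)
{
    union qsl_word old = { .val = 0 };
    union qsl_word locked_only = { .val = 0 };

    old.tail = my_tail;
    locked_only.locked = 1;

    if (__atomic_load_n(&w->tail, __ATOMIC_RELAXED) == my_tail) {
        /*
         * We look like the last waiter, so the tail must be cleared
         * together with setting the lock: this needs a compare-and-swap.
         */
        if (__atomic_compare_exchange_n(&w->val, &old.val, locked_only.val,
                                        false, __ATOMIC_ACQUIRE,
                                        __ATOMIC_RELAXED))
            return true;
        /* Someone queued behind us in the meantime; fall through. */
    }

    /*
     * Other waiters own the tail now and nobody else may set the locked
     * byte, so a plain byte store is enough. Acquire ordering is assumed
     * to come from the caller's earlier wait loop.
     */
    __atomic_store_n(&w->locked, 1, __ATOMIC_RELAXED);
    return false;
}
```

The store path never touches the tail, which is why it does not need to be atomic with respect to new waiters appending themselves.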

[PATCH v12 09/11] pvqspinlock, x86: Add para-virtualization support

2014-10-16 Thread Waiman Long
This patch adds para-virtualization support to the queue spinlock code base with minimal impact to the native case. There are some minor code changes in the generic qspinlock.c file which should be usable in other architectures. The other code changes are specific to x86 processors and so are all

[PATCH v12 11/11] pvqspinlock, x86: Enable PV qspinlock for XEN

2014-10-16 Thread Waiman Long
This patch adds the necessary XEN specific code to allow XEN to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: Waiman Long waiman.l...@hp.com --- arch/x86/xen/spinlock.c | 149 +--

[PATCH v12 07/11] qspinlock: Revert to test-and-set on hypervisors

2014-10-16 Thread Waiman Long
From: Peter Zijlstra pet...@infradead.org When we detect a hypervisor (!paravirt, see qspinlock paravirt support patches), revert to a simple test-and-set lock to avoid the horrors of queue preemption. Signed-off-by: Peter Zijlstra pet...@infradead.org Signed-off-by: Waiman Long
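
For reference, a simple test-and-set lock of the kind mentioned above can be sketched with C11 atomics. The names tas_lock, tas_trylock(), tas_acquire() and tas_release() are hypothetical and this is not the code from the patch; it only shows why such a lock has no queue that a preempted vCPU could stall.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set lock: 0 means free, 1 means held. */
struct tas_lock {
    atomic_int locked;
};

/* One attempt: succeed only if the lock word goes from 0 to 1. */
static bool tas_trylock(struct tas_lock *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/*
 * Spin until the lock is taken. Waiters are unordered: whoever wins the
 * compare-and-swap gets the lock, so no waiter depends on another
 * waiter being scheduled.
 */
static void tas_acquire(struct tas_lock *l)
{
    while (!tas_trylock(l)) {
        /* Read-only spin to avoid hammering the line with writes. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
    }
}

/* Release with a store so later acquirers see the critical section. */
static void tas_release(struct tas_lock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The trade-off is fairness: unlike a queued lock, nothing here prevents one waiter from being starved by the others.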

[PATCH v12 08/11] qspinlock, x86: Rename paravirt_ticketlocks_enabled

2014-10-16 Thread Waiman Long
This patch renames the paravirt_ticketlocks_enabled static key to a more generic paravirt_spinlocks_enabled name. Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra pet...@infradead.org --- arch/x86/include/asm/spinlock.h |4 ++-- arch/x86/kernel/kvm.c

[PULL] More virtio fun

2014-10-16 Thread Rusty Russell
The following changes since commit 7ec62d421bdf29cb31101ae2689f7f3a9906289a: Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs (2014-09-10 14:04:17 -0700) are available in the git repository at:

Re: [RFC PATCH net-next 1/6] virtio: make sure used event never go backwards

2014-10-16 Thread Jason Wang
On 10/15/2014 07:38 PM, Michael S. Tsirkin wrote: On Wed, Oct 15, 2014 at 06:44:41PM +0800, Jason Wang wrote: On 10/15/2014 06:32 PM, Michael S. Tsirkin wrote: On Wed, Oct 15, 2014 at 06:13:19PM +0800, Jason Wang wrote: On 10/15/2014 05:34 PM, Michael S. Tsirkin wrote: On Wed, Oct 15, 2014 at

Re: [PATCH RFC v2 1/3] virtio_net: enable tx interrupt

2014-10-16 Thread Jason Wang
On 10/15/2014 10:32 PM, Michael S. Tsirkin wrote: On newer hosts that support delayed tx interrupts, we probably don't have much to gain from orphaning packets early. Based on patch by Jason Wang. Note: this might degrade performance for hosts without event idx support. Should be

Re: [PATCH RFC v2 2/3] virtio_net: bql

2014-10-16 Thread Jason Wang
On 10/15/2014 10:32 PM, Michael S. Tsirkin wrote: Improve tx batching using byte queue limits. Should be especially effective for MQ. Signed-off-by: Michael S. Tsirkin m...@redhat.com --- drivers/net/virtio_net.c | 20 1 file changed, 16 insertions(+), 4 deletions(-)

Re: [PATCH net-next RFC 1/3] virtio: support for urgent descriptors

2014-10-16 Thread Jason Wang
On 10/15/2014 01:40 PM, Rusty Russell wrote: Jason Wang jasow...@redhat.com writes: Below should be useful for some experiments Jason is doing. I thought I'd send it out for early review/feedback. event idx feature allows us to defer interrupts until a specific # of descriptors were used.
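
As background for the event idx discussion, the deferral check can be sketched as follows. The helper name need_event() is illustrative; the comparison is modeled on the 16-bit wraparound test done by vring_need_event() in the Linux virtio ring header, and the parameter names are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a notification is due.
 *
 *   event_idx: index published by the other side ("tell me once you
 *              pass this entry")
 *   new_idx:   ring index after the current batch was added
 *   old_idx:   ring index before the current batch was added
 *
 * All arithmetic is modulo 2^16, so the check stays correct when the
 * indices wrap around.
 */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}
```

In words: notify only if event_idx falls within the range of entries covered by this batch, that is, old_idx <= event_idx < new_idx in wraparound arithmetic.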