Re: [Xen-devel] [Resend Patch v4 12/16] smp, x86: Kill SMP single function call interrupt

2015-01-23 Thread Peter Zijlstra
specific code to support generic SMP function call interfaces, so kill the redundant single function call interrupt. Cc: Peter Zijlstra a.p.zijls...@chello.nl Cc: Ingo Molnar mi...@elte.hu Cc: Steven Rostedt rost...@goodmis.org Signed-off-by: Jiang Liu jiang@linux.intel.com Acked-by: Peter

Re: [Xen-devel] [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-09 Thread Peter Zijlstra
On Mon, Feb 09, 2015 at 03:04:22PM +0530, Raghavendra K T wrote: So we have 3 choices, 1. xadd 2. continue with current approach. 3. a read before unlock and also after that. For the truly paranoid we have probe_kernel_address(), suppose the lock was in module space and the module just got

Re: [Xen-devel] [PATCH V3] x86 spinlock: Fix memory corruption on completing completions

2015-02-12 Thread Peter Zijlstra
On Thu, Feb 12, 2015 at 05:17:27PM +0530, Raghavendra K T wrote: Paravirt spinlock clears slowpath flag after doing unlock. As explained by Linus currently it does: prev = *lock; add_smp(&lock->tickets.head, TICKET_LOCK_INC); /* add_smp() is a

[Xen-devel] [PATCH 7/9] qspinlock: Revert to test-and-set on hypervisors

2015-03-16 Thread Peter Zijlstra
From: Peter Zijlstra pet...@infradead.org When we detect a hypervisor (!paravirt, see qspinlock paravirt support patches), revert to a simple test-and-set lock to avoid the horrors of queue preemption. Cc: Ingo Molnar mi...@redhat.com Cc: David Vrabel david.vra...@citrix.com Cc: Oleg Nesterov o
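
For readers of this listing, the test-and-set fallback described in the snippet above can be sketched roughly as follows. This is a minimal user-space illustration with C11 atomics, not the patch itself: `struct tas_lock`, `tas_trylock()`, `tas_spin_lock()` and `tas_unlock()` are hypothetical names, and the kernel code uses its own atomic primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock word; not the kernel's struct qspinlock. */
struct tas_lock {
	atomic_int val;		/* 0 = unlocked, 1 = locked */
};

/* One attempt: succeeds only if we are the one to flip 0 -> 1. */
static bool tas_trylock(struct tas_lock *lock)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&lock->val, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Spin until acquired. There is no queue, so no waiter depends on another
 * (possibly preempted) waiter making progress. */
static void tas_spin_lock(struct tas_lock *lock)
{
	while (!tas_trylock(lock)) {
		/* Wait with plain loads until the lock looks free again. */
		while (atomic_load_explicit(&lock->val, memory_order_relaxed))
			;
	}
}

static void tas_unlock(struct tas_lock *lock)
{
	atomic_store_explicit(&lock->val, 0, memory_order_release);
}
```

The trade-off is fairness: whichever CPU wins the exchange gets the lock, so there is no FIFO ordering, which is presumably acceptable when queue order cannot be honoured under vCPU preemption anyway.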

[Xen-devel] [PATCH 5/9] qspinlock: Optimize for smaller NR_CPUS

2015-03-16 Thread Peter Zijlstra
From: Peter Zijlstra pet...@infradead.org When we allow for a max NR_CPUS < 2^14 we can optimize the pending wait-acquire and the xchg_tail() operations. By growing the pending bit to a byte, we reduce the tail to 16bit. This means we can use xchg16 for the tail part and do away with all
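
The layout change described in the snippet above can be pictured with a small sketch. Everything here is illustrative: `struct qsl_sketch`, `encode_tail()` and the bit widths (a 2-bit per-CPU node index plus a 14-bit CPU number stored as cpu + 1) are assumptions chosen to match the "16 bit tail" wording, and the field order assumes a little-endian layout.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Illustrative 4-byte lock word (assumed layout, not copied from the patch):
 *   byte 0    : locked
 *   byte 1    : pending (a whole byte instead of a single bit)
 *   bytes 2-3 : tail
 */
struct qsl_sketch {
	_Atomic uint8_t  locked;
	_Atomic uint8_t  pending;
	_Atomic uint16_t tail;
};

#define TAIL_IDX_BITS	2	/* assumed: which per-CPU queue node */

/* Encode (cpu, idx); cpu is stored +1 so that tail == 0 means "no queue". */
static uint16_t encode_tail(unsigned int cpu, unsigned int idx)
{
	return (uint16_t)(((cpu + 1) << TAIL_IDX_BITS) | idx);
}

/* Publish a new tail and return the old one with a single 16-bit exchange.
 * Because locked and pending live in other bytes, no cmpxchg loop over the
 * full 32-bit word is needed to preserve them. */
static uint16_t xchg_tail(struct qsl_sketch *lock, uint16_t tail)
{
	return atomic_exchange_explicit(&lock->tail, tail, memory_order_acq_rel);
}
```

With a single pending bit the tail shares its 16-bit half with nothing else only once pending is widened to a full byte, which is why the exchange can then be done on the tail alone.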

[Xen-devel] [PATCH 1/9] qspinlock: A simple generic 4-byte queue spinlock

2015-03-16 Thread Peter Zijlstra
...@linux.vnet.ibm.com Cc: Thomas Gleixner t...@linutronix.de Cc: Ingo Molnar mi...@redhat.com Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org Link: http://lkml.kernel.org/r/1421784755-21945-2-git-send-email-waiman.l...@hp.com --- include/asm-generic

[Xen-devel] [PATCH 4/9] qspinlock: Extract out code snippets for the next patch

2015-03-16 Thread Peter Zijlstra
...@linux-foundation.org Cc: Thomas Gleixner t...@linutronix.de Cc: H. Peter Anvin h...@zytor.com Cc: Rik van Riel r...@redhat.com Cc: Raghavendra K T raghavendra...@linux.vnet.ibm.com Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org Link

[Xen-devel] [PATCH 3/9] qspinlock: Add pending bit

2015-03-16 Thread Peter Zijlstra
From: Peter Zijlstra pet...@infradead.org Because the qspinlock needs to touch a second cacheline (the per-cpu mcs_nodes[]); add a pending bit and allow a single in-word spinner before we punt to the second cacheline. It is possible to observe the pending bit without the locked bit when the last

[Xen-devel] [PATCH 6/9] qspinlock: Use a simple write to grab the lock

2015-03-16 Thread Peter Zijlstra
: Rik van Riel r...@redhat.com Cc: Linus Torvalds torva...@linux-foundation.org Cc: Raghavendra K T raghavendra...@linux.vnet.ibm.com Cc: Thomas Gleixner t...@linutronix.de Cc: Ingo Molnar mi...@redhat.com Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet

[Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-03-16 Thread Peter Zijlstra
to the current kvm code. We can do a single entry because any nesting will wake the vcpu and cause the lower loop to retry. Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- include/asm-generic/qspinlock.h |3 kernel/locking/qspinlock.c | 69 +- kernel/locking

[Xen-devel] [PATCH 9/9] qspinlock, x86, kvm: Implement KVM support for paravirt qspinlock

2015-03-16 Thread Peter Zijlstra
. This significantly lowers the overhead of having CONFIG_PARAVIRT_SPINLOCKS enabled, even for native code. Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org --- arch/x86/Kconfig |2 - arch/x86/include/asm/paravirt.h | 28 - arch/x86/include/asm

[Xen-devel] [PATCH 0/9] qspinlock stuff -v15

2015-03-16 Thread Peter Zijlstra
Hi Waiman, As promised; here is the paravirt stuff I did during the trip to BOS last week. All the !paravirt patches are more or less the same as before (the only real change is the copyright lines in the first patch). The paravirt stuff is 'simple' and KVM only -- the Xen code was a little

Re: [Xen-devel] [PATCH 9/9] qspinlock, x86, kvm: Implement KVM support for paravirt qspinlock

2015-03-19 Thread Peter Zijlstra
On Wed, Mar 18, 2015 at 10:45:55PM -0400, Waiman Long wrote: On 03/16/2015 09:16 AM, Peter Zijlstra wrote: I do have some concern about this call site patching mechanism as the modification is not atomic. The spin_unlock() calls are in many places in the kernel. There is a possibility

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-03-19 Thread Peter Zijlstra
On Wed, Mar 18, 2015 at 04:50:37PM -0400, Waiman Long wrote: +this_cpu_write(__pv_lock_wait, lock); We may run into the same problem of needing to have 4 queue nodes per CPU. If an interrupt happens just after the write and before the actual wait and it goes through the same

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-03-19 Thread Peter Zijlstra
On Thu, Mar 19, 2015 at 01:25:36PM +0100, Peter Zijlstra wrote: +static struct qspinlock **pv_hash(struct qspinlock *lock) +{ + u32 hash = hash_ptr(lock, PV_LOCK_HASH_BITS); + struct pv_hash_bucket *hb, *end; + + if (!hash) + hash = 1; + + hb = __pv_lock_hash

Re: [Xen-devel] [PATCH 0/9] qspinlock stuff -v15

2015-03-19 Thread Peter Zijlstra
On Thu, Mar 19, 2015 at 06:01:34PM +, David Vrabel wrote: This seems work for me, but I've not got time to give it a more thorough testing. You can fold this into your series. Thanks! There doesn't seem to be a way to disable QUEUE_SPINLOCKS when supported by the arch, is this

Re: [Xen-devel] [PATCH 0/9] qspinlock stuff -v15

2015-03-30 Thread Peter Zijlstra
On Mon, Mar 30, 2015 at 12:25:12PM -0400, Waiman Long wrote: I did it differently in my PV portion of the qspinlock patch. Instead of just waking up the CPU, the new lock holder will check if the new queue head has been halted. If so, it will set the slowpath flag for the halted queue head in

Re: [Xen-devel] [PATCH 0/9] qspinlock stuff -v15

2015-03-26 Thread Peter Zijlstra
On Wed, Mar 25, 2015 at 03:47:39PM -0400, Konrad Rzeszutek Wilk wrote: Ah nice. That could be spun out as a separate patch to optimize the existing ticket locks I presume. Yes I suppose we can do something similar for the ticket and patch in the right increment. We'd need to restructure the

Re: [Xen-devel] [PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015-04-29 Thread Peter Zijlstra
On Fri, Apr 24, 2015 at 02:56:42PM -0400, Waiman Long wrote: In the pv_scan_next() function, the slow cmpxchg atomic operation is performed even if the other CPU is not even close to being halted. This extra cmpxchg can harm slowpath performance. This patch introduces the new mayhalt flag to

Re: [Xen-devel] [PATCH v16 13/14] pvqspinlock: Improve slowpath performance by avoiding cmpxchg

2015-05-04 Thread Peter Zijlstra
On Thu, Apr 30, 2015 at 02:49:26PM -0400, Waiman Long wrote: On 04/29/2015 02:11 PM, Peter Zijlstra wrote: On Fri, Apr 24, 2015 at 02:56:42PM -0400, Waiman Long wrote: In the pv_scan_next() function, the slow cmpxchg atomic operation is performed even if the other CPU is not even close

Re: [Xen-devel] [PATCH v16 08/14] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-05-04 Thread Peter Zijlstra
...@redhat.com Cc: H. Peter Anvin h...@zytor.com Suggested-by: Peter Zijlstra (Intel) pet...@infradead.org Signed-off-by: Waiman Long waiman.l...@hp.com Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org Link: http://lkml.kernel.org/r/1429901803-29771-9-git-send-email-waiman.l...@hp.com --- kernel

Re: [Xen-devel] [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-13 Thread Peter Zijlstra
On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote: +void __init __pv_init_lock_hash(void) +{ + int pv_hash_size = 4 * num_possible_cpus(); + + if (pv_hash_size < (1U << LFSR_MIN_BITS)) + pv_hash_size = (1U << LFSR_MIN_BITS); + /* +* Allocate space from bootmem

Re: [Xen-devel] [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-13 Thread Peter Zijlstra
On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote: +static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node) +{ + struct __qspinlock *l = (void *)lock; + struct qspinlock **lp = NULL; + struct pv_node *pn = (struct pv_node *)node; + int slow_set = false;

Re: [Xen-devel] [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-13 Thread Peter Zijlstra
On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote: +__visible void __pv_queue_spin_unlock(struct qspinlock *lock) +{ + struct __qspinlock *l = (void *)lock; + struct pv_node *node; + + if (likely(cmpxchg(&l->locked, _Q_LOCKED_VAL, 0) == _Q_LOCKED_VAL)) + return; +

Re: [Xen-devel] [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-09 Thread Peter Zijlstra
On Mon, Apr 06, 2015 at 10:55:44PM -0400, Waiman Long wrote: +++ b/kernel/locking/qspinlock_paravirt.h @@ -0,0 +1,321 @@ +#ifndef _GEN_PV_LOCK_SLOWPATH +#error do not include this file +#endif + +/* + * Implement paravirt qspinlocks; the general idea is to halt the vcpus instead + * of

Re: [Xen-devel] [PATCH v15 13/15] pvqspinlock: Only kick CPU at unlock time

2015-04-09 Thread Peter Zijlstra
On Mon, Apr 06, 2015 at 10:55:48PM -0400, Waiman Long wrote: @@ -219,24 +236,30 @@ static void pv_wait_node(struct mcs_spinlock *node) } /* + * Called after setting next->locked = 1 & lock acquired. + * Check if the CPU has been halted. If so, set the _Q_SLOW_VAL flag + * and put an

Re: [Xen-devel] [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

2015-04-09 Thread Peter Zijlstra
On Thu, Apr 09, 2015 at 08:13:27PM +0200, Peter Zijlstra wrote: On Mon, Apr 06, 2015 at 10:55:44PM -0400, Waiman Long wrote: +#define PV_HB_PER_LINE (SMP_CACHE_BYTES / sizeof(struct pv_hash_bucket)) +static struct qspinlock **pv_hash(struct qspinlock *lock, struct pv_node *node

Re: [Xen-devel] [PATCH v15 16/16] unfair qspinlock: a queue based unfair lock

2015-04-09 Thread Peter Zijlstra
On Thu, Apr 09, 2015 at 09:16:24AM -0400, Rik van Riel wrote: On 04/09/2015 03:01 AM, Peter Zijlstra wrote: On Wed, Apr 08, 2015 at 02:32:19PM -0400, Waiman Long wrote: For a virtual guest with the qspinlock patch, a simple unfair byte lock will be used if PV spinlock is not configured

Re: [Xen-devel] [PATCH v15 13/15] pvqspinlock: Only kick CPU at unlock time

2015-04-09 Thread Peter Zijlstra
On Thu, Apr 09, 2015 at 09:57:21PM +0200, Peter Zijlstra wrote: On Mon, Apr 06, 2015 at 10:55:48PM -0400, Waiman Long wrote: @@ -219,24 +236,30 @@ static void pv_wait_node(struct mcs_spinlock *node) } /* + * Called after setting next->locked = 1 & lock acquired. + * Check

Re: [Xen-devel] [PATCH v15 16/16] unfair qspinlock: a queue based unfair lock

2015-04-09 Thread Peter Zijlstra
On Wed, Apr 08, 2015 at 02:32:19PM -0400, Waiman Long wrote: For a virtual guest with the qspinlock patch, a simple unfair byte lock will be used if PV spinlock is not configured in or the hypervisor isn't either KVM or Xen. The byte lock works fine with small guest of just a few vCPUs. On a

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Peter Zijlstra
On Wed, Apr 01, 2015 at 12:20:30PM -0400, Waiman Long wrote: After more careful reading, I think the assumption that the presence of an unused bucket means there is no match is not true. Consider the scenario: 1. cpu 0 puts lock1 into hb[0] 2. cpu 1 puts lock2 into hb[1] 3. cpu 2 clears
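
The concern quoted above is the classic hole problem of open-addressed hash tables. The following single-threaded sketch illustrates it under loudly stated assumptions: the snippet is truncated, so it is assumed here that lock1 and lock2 both hash to bucket 0 and that the cleared bucket is hb[0]; the table, `hb_insert()` and both lookup helpers are hypothetical and ignore the concurrency that the real thread is about.

```c
#include <stddef.h>

#define NBUCKETS 4

/* Hypothetical table: each bucket holds a lock address, or NULL when unused. */
static const void *hb[NBUCKETS];

/* Insert at the first unused bucket, probing linearly from the hashed slot. */
static int hb_insert(const void *lock, unsigned int hash)
{
	for (unsigned int i = 0; i < NBUCKETS; i++) {
		unsigned int slot = (hash + i) % NBUCKETS;

		if (!hb[slot]) {
			hb[slot] = lock;
			return (int)slot;
		}
	}
	return -1;	/* table full */
}

/* Lookup that treats the first unused bucket as "no match". */
static int hb_find_stop_at_empty(const void *lock, unsigned int hash)
{
	for (unsigned int i = 0; i < NBUCKETS; i++) {
		unsigned int slot = (hash + i) % NBUCKETS;

		if (!hb[slot])
			return -1;
		if (hb[slot] == lock)
			return (int)slot;
	}
	return -1;
}

/* Lookup that keeps probing past unused buckets. */
static int hb_find_full_scan(const void *lock, unsigned int hash)
{
	for (unsigned int i = 0; i < NBUCKETS; i++) {
		unsigned int slot = (hash + i) % NBUCKETS;

		if (hb[slot] == lock)
			return (int)slot;
	}
	return -1;
}
```

After inserting lock1 and lock2 with the same hash and then clearing hb[0], the early-exit lookup misses lock2 in hb[1] while the full scan still finds it, which is the failure mode the quoted scenario appears to describe.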

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Peter Zijlstra
On Wed, Apr 01, 2015 at 07:12:23PM +0200, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 12:20:30PM -0400, Waiman Long wrote: After more careful reading, I think the assumption that the presence of an unused bucket means there is no match is not true. Consider the scenario: 1. cpu 0 puts

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Peter Zijlstra
On Wed, Apr 01, 2015 at 07:42:39PM +0200, Peter Zijlstra wrote: Hohumm.. time to think more I think ;-) So bear with me, I've not really pondered this well so it could be full of holes (again). After the cmpxchg(&l->locked, _Q_LOCKED_VAL, _Q_SLOW_VAL) succeeds the spin_unlock() must do

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Peter Zijlstra
On Wed, Apr 01, 2015 at 02:54:45PM -0400, Waiman Long wrote: On 04/01/2015 02:17 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 07:42:39PM +0200, Peter Zijlstra wrote: Hohumm.. time to think more I think ;-) So bear with me, I've not really pondered this well so it could be full of holes

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-02 Thread Peter Zijlstra
On Thu, Apr 02, 2015 at 12:28:30PM -0400, Waiman Long wrote: On 04/01/2015 05:03 PM, Peter Zijlstra wrote: On Wed, Apr 01, 2015 at 03:58:58PM -0400, Waiman Long wrote: On 04/01/2015 02:48 PM, Peter Zijlstra wrote: I am sorry that I don't quite get what you mean here. My point

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-01 Thread Peter Zijlstra
On Wed, Apr 01, 2015 at 03:58:58PM -0400, Waiman Long wrote: On 04/01/2015 02:48 PM, Peter Zijlstra wrote: I am sorry that I don't quite get what you mean here. My point is that in the hashing step, a cpu will need to scan an empty bucket to put the lock in. In the interim, a previously used

Re: [Xen-devel] [PATCH 8/9] qspinlock: Generic paravirt support

2015-04-03 Thread Peter Zijlstra
On Thu, Apr 02, 2015 at 09:48:34PM +0200, Peter Zijlstra wrote: @@ -158,20 +257,20 @@ static void pv_wait_head(struct qspinloc void __pv_queue_spin_unlock(struct qspinlock *lock) { struct __qspinlock *l = (void *)lock; + struct pv_hash_bucket *hb; if (xchg(&l->locked, 0

Re: [Xen-devel] linux-next: manual merge of the xen-tip tree with the tip tree

2015-08-12 Thread Peter Zijlstra
On Wed, Aug 12, 2015 at 07:21:05PM +0200, Peter Zijlstra wrote: On Wed, Aug 12, 2015 at 09:27:38AM -0400, Boris Ostrovsky wrote: Incidentally, 11276d53 (locking/static_keys: Add a new static_key interface) breaks old-ish compilers (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC

Re: [Xen-devel] linux-next: manual merge of the xen-tip tree with the tip tree

2015-08-12 Thread Peter Zijlstra
On Wed, Aug 12, 2015 at 09:27:38AM -0400, Boris Ostrovsky wrote: Incidentally, 11276d53 (locking/static_keys: Add a new static_key interface) breaks old-ish compilers (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC)): CC arch/x86/kernel/nmi.o In file included from

Re: [Xen-devel] linux-next: manual merge of the xen-tip tree with the tip tree

2015-08-12 Thread Peter Zijlstra
On Wed, Aug 12, 2015 at 07:21:05PM +0200, Peter Zijlstra wrote: On Wed, Aug 12, 2015 at 09:27:38AM -0400, Boris Ostrovsky wrote: Incidentally, 11276d53 (locking/static_keys: Add a new static_key interface) breaks old-ish compilers (gcc version 4.4.4 20100503 (Red Hat 4.4.4-2) (GCC

Re: [Xen-devel] [PATCH v2 3/3] sched/preempt: fix cond_resched_lock() and cond_resched_softirq()

2015-07-15 Thread Peter Zijlstra
On Wed, Jul 15, 2015 at 03:52:34PM +0300, Konstantin Khlebnikov wrote: On 15.07.2015 15:16, Eric Dumazet wrote: On Wed, 2015-07-15 at 12:52 +0300, Konstantin Khlebnikov wrote: These functions check should_resched() before unlocking spinlock/bh-enable: preempt_count always non-zero =

Re: [Xen-devel] [PATCH v11 2/5] missing include asm/paravirt.h in cputime.c

2015-11-09 Thread Peter Zijlstra
On Thu, Nov 05, 2015 at 05:30:01PM +, Stefano Stabellini wrote: > On Thu, 5 Nov 2015, Peter Zijlstra wrote: > > How can this be missing? Things compile fine now, right? > > Fair enough. > > > > So please better explain why we do this change. > >

Re: [Xen-devel] [PATCH v2 1/7] timekeeping: introduce __current_kernel_time64

2015-11-10 Thread Peter Zijlstra
On Tue, Nov 10, 2015 at 11:57:49AM +, Stefano Stabellini wrote: > __current_kernel_time64 returns a struct timespec64, without taking the > xtime lock. Mirrors __current_kernel_time/current_kernel_time. It always helps if you include a reason why you want a patch.

Re: [Xen-devel] [PATCH v11 2/5] missing include asm/paravirt.h in cputime.c

2015-11-10 Thread Peter Zijlstra
On Tue, Nov 10, 2015 at 11:27:33AM +, Stefano Stabellini wrote: > On Mon, 9 Nov 2015, Peter Zijlstra wrote: > > On Thu, Nov 05, 2015 at 05:30:01PM +, Stefano Stabellini wrote: > > > On Thu, 5 Nov 2015, Peter Zijlstra wrote: > > > > How can this be missing?

Re: [Xen-devel] [PATCH v11 2/5] missing include asm/paravirt.h in cputime.c

2015-11-05 Thread Peter Zijlstra
How can this be missing? Things compile fine now, right? So please better explain why we do this change. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops

2015-09-17 Thread Peter Zijlstra
On Wed, Sep 16, 2015 at 04:33:11PM -0700, Andy Lutomirski wrote: > Setting CONFIG_PARAVIRT=y has an unintended side effect: it silently > turns all rdmsr and wrmsr operations into the safe variants without > any checks that the operations actually succeed. > > This is IMO awful: it papers over

Re: [Xen-devel] [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops

2015-09-17 Thread Peter Zijlstra
On Thu, Sep 17, 2015 at 01:40:30PM +0200, Paolo Bonzini wrote: > > > On 17/09/2015 10:58, Peter Zijlstra wrote: > > But the far greater problem I have with the whole virt thing is that > > you cannot use rdmsr_safe() to probe if an MSR exists at all because, as >

Re: [Xen-devel] [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops

2015-09-17 Thread Peter Zijlstra
On Thu, Sep 17, 2015 at 08:17:18AM -0700, Andy Lutomirski wrote: > > Ah, that would be good news. Andy earlier argued I could not rely on > > rdmsr_safe() faulting on unknown MSRs. If practically we can there's > > some code I can simplify :-) > > I was talking about QEMU TCG, not KVM. Just for

Re: [Xen-devel] [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-30 Thread Peter Zijlstra
On Mon, Sep 21, 2015 at 09:36:15AM -0700, Linus Torvalds wrote: > On Mon, Sep 21, 2015 at 1:46 AM, Ingo Molnar wrote: > > > > Linus, what's your preference? > > So quite frankly, is there any reason we don't just implement > native_read_msr() as just > >unsigned long long

Re: [Xen-devel] [PATCH v2 17/32] arm: define __smp_xxx

2016-01-04 Thread Peter Zijlstra
On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote: > On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux wrote: > > My only concern is that it gives people an additional handle onto a > > "new" set of barriers - just because they're prefixed with __* > >

Re: [Xen-devel] [PATCH v2 06/32] s390: reuse asm-generic/barrier.h

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:06:30PM +0200, Michael S. Tsirkin wrote: > On s390 read_barrier_depends, smp_read_barrier_depends > smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the > asm-generic variants exactly. Drop the local definitions and pull in > asm-generic/barrier.h

Re: [Xen-devel] [PATCH v2 22/32] s390: define __smp_xxx

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote: > This defines __smp_xxx barriers for s390, > for use by virtualization. > > Some smp_xxx barriers are removed as they are > defined correctly by asm-generic/barriers.h > > Note: smp_mb, smp_rmb and smp_wmb are defined as full

Re: [Xen-devel] [PATCH v2 33/34] xenbus: use virt_xxx barriers

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:10:01PM +0200, Michael S. Tsirkin wrote: > drivers/xen/xenbus/xenbus_comms.c uses > full memory barriers to communicate with the other side. > > For guests compiled with CONFIG_SMP, smp_wmb and smp_mb > would be sufficient, so mb() and wmb() here are only needed if > a

Re: [Xen-devel] [PATCH v2 20/32] metag: define __smp_xxx

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:08:22PM +0200, Michael S. Tsirkin wrote: > +#ifdef CONFIG_SMP > +#define fence() metag_fence() > +#else > +#define fence() do { } while (0) > #endif James, it strikes me as odd that fence() is a no-op instead of a barrier() for UP, can you verify/explain?

Re: [Xen-devel] [PATCH v2 31/32] sh: support a 2-byte smp_store_mb

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:09:47PM +0200, Michael S. Tsirkin wrote: > At the moment, xchg on sh only supports 4 and 1 byte values, so using it > from smp_store_mb means attempts to store a 2 byte value using this > macro fail. > > And happens to be exactly what virtio drivers want to do. > >

Re: [Xen-devel] [PATCH v2 11/32] mips: reuse asm-generic/barrier.h

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:07:10PM +0200, Michael S. Tsirkin wrote: > -#define smp_store_release(p, v) > \ > -do { \ > - compiletime_assert_atomic_type(*p);

Re: [Xen-devel] [PATCH v2 17/32] arm: define __smp_xxx

2016-01-04 Thread Peter Zijlstra
On Mon, Jan 04, 2016 at 02:36:58PM +0100, Peter Zijlstra wrote: > On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote: > > On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux wrote: > > > > My only concern is that it gives people an a

Re: [Xen-devel] [PATCH v2 20/32] metag: define __smp_xxx

2016-01-04 Thread Peter Zijlstra
On Mon, Jan 04, 2016 at 03:25:58PM +, James Hogan wrote: > It is used along with the metag specific __global_lock1() (global > voluntary lock between hw threads) whenever a write is performed, and by > smp_mb/smp_rmb to try to catch other cases, but I've never been > confident this fixes every

Re: [Xen-devel] new barrier type for paravirt (was Re: [PATCH] virtio_ring: use smp_store_mb)

2015-12-20 Thread Peter Zijlstra
On Sun, Dec 20, 2015 at 05:07:19PM +, Andrew Cooper wrote: > > Very much +1 for fixing this. > > Those names would be fine, but they do add yet another set of options in > an already-complicated area. > > An alternative might be to have the regular smp_{w,r,}mb() not revert > back to nops

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-12 Thread Peter Zijlstra
On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote: > This statement doesn't fit MIPS barriers variations. Moreover, there is a > reason to extend that even more specific, at least for smp_store_release and > smp_load_acquire, look into > >

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-12 Thread Peter Zijlstra
On Tue, Jan 12, 2016 at 10:43:36AM +0200, Michael S. Tsirkin wrote: > On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote: > > On 01/10/2016 06:18 AM, Michael S. Tsirkin wrote: > > >On mips dma_rmb, dma_wmb, smp_store_mb, read_barrier_depends, > > >smp_read_barrier_depends,

Re: [Xen-devel] [PATCH v3 00/41] arch: barrier cleanup + barriers for virt

2016-01-12 Thread Peter Zijlstra
> duplicate patch, and assume conflict will be resolved. > > I would really appreciate some feedback on arch bits (especially the x86 > bits), > and acks for merging this through the vhost tree. Thanks for doing this, looks good to me. Acked-by: Peter Zijlstra (I

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-12 Thread Peter Zijlstra
On Tue, Jan 12, 2016 at 11:25:55AM +0100, Peter Zijlstra wrote: > On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote: > > 2) the changelog _completely_ fails to explain the sync 0x11 and sync > > 0x12 semantics nor does it provide a publicly accessible link to >

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-12 Thread Peter Zijlstra
3. I bother MIPS Arch team long time until I completely understood that MIPS > SYNC_WMB, SYNC_MB, SYNC_RMB, SYNC_RELEASE and SYNC_ACQUIRE do an exactly > that is required in Documentation/memory-barriers.txt Ha! and you think that document covers all the really fun details? In particular we're ve

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-12 Thread Peter Zijlstra
On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote: > 2) the changelog _completely_ fails to explain the sync 0x11 and sync > 0x12 semantics nor does it provide a publicly accessible link to > documentation that does. Ralf pointed me at: https://imgtec.com/mips/architectur

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-13 Thread Peter Zijlstra
On Wed, Jan 13, 2016 at 11:02:35AM -0800, Leonid Yegoshin wrote: > I ask HW team about it but I have a question - has it any relationship with > replacing MIPS SYNC with lightweight SYNCs (SYNC_WMB etc)? Of course. If you cannot explain the semantics of the primitives you introduce, how can we

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-15 Thread Peter Zijlstra
of smp_store_release()/smp_load_acquire() chains is local. This > commit therefore introduces the notion of local transitivity and > gives an example. > > Reported-by: Peter Zijlstra <pet...@infradead.org> > Reported-by: Will Deacon <will.dea...@arm.com>

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-15 Thread Peter Zijlstra
On Fri, Jan 15, 2016 at 09:46:12AM -0800, Paul E. McKenney wrote: > On Fri, Jan 15, 2016 at 10:13:48AM +0100, Peter Zijlstra wrote: > > And the stuff we're confused about is how best to express the difference > > and guarantees of these two forms of transitivity and how exactly th

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-15 Thread Peter Zijlstra
On Thu, Jan 14, 2016 at 01:29:13PM -0800, Paul E. McKenney wrote: > So smp_mb() provides transitivity, as do pairs of smp_store_release() > and smp_read_acquire(), But they provide different grades of transitivity, which is where all the confusion lays. smp_mb() is strongly/globally transitive,

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-15 Thread Peter Zijlstra
On Fri, Jan 15, 2016 at 09:55:54AM +0100, Peter Zijlstra wrote: > On Thu, Jan 14, 2016 at 01:29:13PM -0800, Paul E. McKenney wrote: > > So smp_mb() provides transitivity, as do pairs of smp_store_release() > > and smp_read_acquire(), > > But they provide different grades o

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-14 Thread Peter Zijlstra
On Thu, Jan 14, 2016 at 11:42:02AM -0800, Leonid Yegoshin wrote: > And the only point - please use an appropriate SYNC_* barriers instead of > heavy bold hammer. That stuff was designed explicitly to support the > requirements of Documentation/memory-barriers.txt That's madness. That document

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-14 Thread Peter Zijlstra
On Thu, Jan 14, 2016 at 09:15:13PM +0100, Peter Zijlstra wrote: > On Thu, Jan 14, 2016 at 11:42:02AM -0800, Leonid Yegoshin wrote: > > And the only point - please use an appropriate SYNC_* barriers instead of > > heavy bold hammer. That stuff was designed explicitly to support the >

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Peter Zijlstra
On Mon, Jan 25, 2016 at 10:03:22PM -0800, Paul E. McKenney wrote: > On Mon, Jan 25, 2016 at 04:42:43PM +, Will Deacon wrote: > > On Fri, Jan 15, 2016 at 01:58:53PM -0800, Paul E. McKenney wrote: > > > On Fri, Jan 15, 2016 at 10:27:14PM +0100, Peter Zijlstra wrote: >

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Peter Zijlstra
On Thu, Jan 14, 2016 at 02:20:46PM -0800, Paul E. McKenney wrote: > On Thu, Jan 14, 2016 at 01:24:34PM -0800, Leonid Yegoshin wrote: > > On 01/14/2016 12:48 PM, Paul E. McKenney wrote: > > > > > >So SYNC_RMB is intended to implement smp_rmb(), correct? > > Yes. > > > > > >You could use

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Peter Zijlstra
On Tue, Jan 26, 2016 at 02:33:40PM -0800, Linus Torvalds wrote: > If it turns out that some architecture does actually need a barrier > between a read and a dependent write, then that will mean that > > (a) we'll have to make up a _new_ barrier, because > "smp_read_barrier_depends()" is not

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Peter Zijlstra
On Wed, Jan 27, 2016 at 12:52:07AM +0800, Boqun Feng wrote: > I recall that last time you and Linus came into a conclusion that even > on Alpha, a barrier for read->write with data dependency is unnecessary: > > http://article.gmane.org/gmane.linux.kernel/2077661 > > And in an earlier mail of

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Peter Zijlstra
On Mon, Jan 25, 2016 at 10:12:11PM -0800, Paul E. McKenney wrote: > On Mon, Jan 25, 2016 at 06:02:34PM +, Will Deacon wrote: > > Thanks for having a go at this. I tried defining something axiomatically, > > but got stuck pretty quickly. In my scheme, I used "data-directed > > transitivity"

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-27 Thread Peter Zijlstra
On Tue, Jan 26, 2016 at 12:13:39PM -0800, Paul E. McKenney wrote: > On Tue, Jan 26, 2016 at 11:19:27AM +0100, Peter Zijlstra wrote: > > So isn't smp_mb__after_unlock_lock() exactly such a scenario? And would > > not someone trying to implement RCsc locks using locally transit

[Xen-devel] [PATCH] documentation: Add disclaimer

2016-01-27 Thread Peter Zijlstra
insane to require it when building new hardware. Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org> --- Documentation/memory-barriers.txt | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-

Re: [Xen-devel] [PATCH v4 2/5] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2016-03-14 Thread Peter Zijlstra
On Mon, Mar 14, 2016 at 11:10:16AM -0700, Andy Lutomirski wrote: > A couple of the wrmsr users actually care about performance. These > are the ones involved in context switching and, to a lesser extent, in > switching in and out of guest mode. Right, this very much includes a number of perf

Re: [Xen-devel] [PATCH 0/6] Support calling functions on dedicated physical cpu

2016-03-11 Thread Peter Zijlstra
On Fri, Mar 11, 2016 at 12:59:28PM +0100, Juergen Gross wrote: > Some hardware (e.g. Dell Studio laptops) require special functions to > be called on physical cpu 0 in order to avoid occasional hangs. When > running as dom0 under Xen this could be achieved only via special boot > parameters (vcpu

Re: [Xen-devel] [PATCH 2/6] sched: add function to execute a function synchronously on a physical cpu

2016-03-11 Thread Peter Zijlstra
On Fri, Mar 11, 2016 at 01:43:53PM +0100, Juergen Gross wrote: > On 11/03/16 13:19, Peter Zijlstra wrote: > > On Fri, Mar 11, 2016 at 12:59:30PM +0100, Juergen Gross wrote: > >> +int call_sync_on_phys_cpu(unsigned cpu, int (*func)(void *), void *par) > >> +{ >

Re: [Xen-devel] [PATCH 2/6] sched: add function to execute a function synchronously on a physical cpu

2016-03-11 Thread Peter Zijlstra
On Fri, Mar 11, 2016 at 01:48:12PM +0100, Juergen Gross wrote: > On 11/03/16 13:42, Peter Zijlstra wrote: > > how about something like: > > > > struct xen_callback_struct { > > struct work_struct work; > > struct completion done; int

Re: [Xen-devel] [PATCH 2/6] sched: add function to execute a function synchronously on a physical cpu

2016-03-11 Thread Peter Zijlstra
On Fri, Mar 11, 2016 at 12:59:30PM +0100, Juergen Gross wrote: > +int call_sync_on_phys_cpu(unsigned cpu, int (*func)(void *), void *par) > +{ > + cpumask_var_t old_mask; > + int ret; > + > + if (cpu >= nr_cpu_ids) > + return -EINVAL; > + > + if

Re: [Xen-devel] [PATCH 2/6] sched: add function to execute a function synchronously on a physical cpu

2016-03-11 Thread Peter Zijlstra
On Fri, Mar 11, 2016 at 01:19:50PM +0100, Peter Zijlstra wrote: > On Fri, Mar 11, 2016 at 12:59:30PM +0100, Juergen Gross wrote: > > +int call_sync_on_phys_cpu(unsigned cpu, int (*func)(void *), void *par) > > +{ > > + cpumask_var_t old_mask; > > + int ret; > >

Re: [Xen-devel] [PATCH 0/6] Support calling functions on dedicated physical cpu

2016-03-11 Thread Peter Zijlstra
On Fri, Mar 11, 2016 at 01:15:04PM +, One Thousand Gnomes wrote: > On Fri, 11 Mar 2016 13:25:14 +0100 > Peter Zijlstra <pet...@infradead.org> wrote: > > > On Fri, Mar 11, 2016 at 12:59:28PM +0100, Juergen Gross wrote: > > > Some hardware (e.g. Dell Studio lapt

Re: [Xen-devel] [PATCH v5 3/6] smp: add function to execute a function synchronously on a cpu

2016-04-06 Thread Peter Zijlstra
e kernel add a service function for this purpose. This will enable > the possibility to take special measures in virtualized environments > like Xen, too. > > Signed-off-by: Juergen Gross <jgr...@suse.com> Thanks! Acked-by: Peter Zijlstra

Re: [Xen-devel] [PATCH v5 3/9] x86/head: Move early exception panic code into early_fixup_exception

2016-04-04 Thread Peter Zijlstra
On Mon, Apr 04, 2016 at 01:52:06PM +0200, Jan Kara wrote: > Sounds like a good idea to me. I've also consulted this with Petr Mladek > (added to CC) who is using printk_func per-cpu variable in his > printk-from-NMI patches and he also doesn't see a problem with this. There's a few printk()

Re: [Xen-devel] [PATCH v4 3/6] smp: add function to execute a function synchronously on a cpu

2016-04-05 Thread Peter Zijlstra
On Tue, Apr 05, 2016 at 07:10:04AM +0200, Juergen Gross wrote: > +int smp_call_on_cpu(unsigned int cpu, bool pin, int (*func)(void *), void > *par) Why .pin and not .phys? .pin does not (to me) reflect the hypervisor/physical-cpu thing. Also, as per smp_call_function_single() would it not be

Re: [Xen-devel] [PATCH v3 2/6] smp: add function to execute a function synchronously on a physical cpu

2016-04-01 Thread Peter Zijlstra
On Fri, Apr 01, 2016 at 09:14:30AM +0200, Juergen Gross wrote: > + if (cpu >= nr_cpu_ids) > + return -EINVAL; > + if (cpu != 0) > + return -EINVAL; The other functions return -ENXIO for this. ___ Xen-devel mailing list

Re: [Xen-devel] [PATCH v3 5/6] virt, sched: add cpu pinning to smp_call_sync_on_phys_cpu()

2016-04-01 Thread Peter Zijlstra
On Fri, Apr 01, 2016 at 09:14:33AM +0200, Juergen Gross wrote: > --- a/kernel/smp.c > +++ b/kernel/smp.c > @@ -14,6 +14,7 @@ > #include > #include > #include > +#include > > #include "smpboot.h" > > @@ -758,9 +759,14 @@ struct smp_sync_call_struct { > static void

Re: [Xen-devel] [PATCH v3 5/6] virt, sched: add cpu pinning to smp_call_sync_on_phys_cpu()

2016-04-01 Thread Peter Zijlstra
On Fri, Apr 01, 2016 at 11:03:21AM +0200, Juergen Gross wrote: > > Maybe just make the vpin thing an option like: > > > > smp_call_on_cpu(int (*func)(void *), int phys_cpu); > > Also; is something like the vpin thing possible on KVM? because if we're > > going to expose it to generic code

Re: [Xen-devel] [PATCH v3 5/6] virt, sched: add cpu pinning to smp_call_sync_on_phys_cpu()

2016-04-01 Thread Peter Zijlstra
On Fri, Apr 01, 2016 at 10:28:46AM +0200, Juergen Gross wrote: > On 01/04/16 09:43, Peter Zijlstra wrote: > > On Fri, Apr 01, 2016 at 09:14:33AM +0200, Juergen Gross wrote: > >> --- a/kernel/smp.c > >> +++ b/kernel/smp.c > >> @@ -14,6 +14,7 @@ > >>

Re: [Xen-devel] [PATCH v5 3/9] x86/head: Move early exception panic code into early_fixup_exception

2016-04-04 Thread Peter Zijlstra
On Mon, Apr 04, 2016 at 08:32:21AM -0700, Andy Lutomirski wrote: > Adding locking would be easy enough, wouldn't it? See patch in this thread.. > But do any platforms really boot a second CPU before switching to real > printk? I _only_ use early_printk() as printk() is a quagmire of fail :-)

Re: [Xen-devel] [PATCH] x86, locking: Remove ticket (spin)lock implementation

2016-05-18 Thread Peter Zijlstra
On Wed, May 18, 2016 at 03:13:44PM -0400, Konrad Rzeszutek Wilk wrote: > On Wed, May 18, 2016 at 08:43:02PM +0200, Peter Zijlstra wrote: > > > > We've unconditionally used the queued spinlock for many releases now. > > Like since 4.2? Yeah, that seems to be the right nu

[Xen-devel] [PATCH] x86, locking: Remove ticket (spin)lock implementation

2016-05-18 Thread Peter Zijlstra
We've unconditionally used the queued spinlock for many releases now. It's time to remove the old ticket lock code. Cc: Waiman Long <waiman.l...@hpe.com> Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org> --- arch/x86/Kconfig | 3 +- arch/x86

Re: [Xen-devel] [PATCH 2/2] locking/mutex, rwsem: Reduce vcpu_is_preempted() calling frequency

2017-02-08 Thread Peter Zijlstra
On Wed, Feb 08, 2017 at 01:00:25PM -0500, Waiman Long wrote: > As the vcpu_is_preempted() call is pretty costly compared with other > checks within mutex_spin_on_owner() and rwsem_spin_on_owner(), they > are done at a reduced frequency of once every 256 iterations. That's just disgusting.
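
For readers who have not seen the quoted patch, the throttling it describes can be sketched roughly as below. This is a userspace illustration only, not the kernel code: `spin_with_throttled_check()`, `costly_check_fn`, `CHECK_INTERVAL` and `count_calls()` are hypothetical names, and the callback merely stands in for an expensive predicate such as vcpu_is_preempted().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed interval from the quoted text; must be a power of two for the mask. */
#define CHECK_INTERVAL 256UL

/* Hypothetical stand-in for an expensive predicate. */
typedef bool (*costly_check_fn)(void *ctx);

/*
 * Spin for at most max_iters iterations, calling check() only on every
 * CHECK_INTERVAL-th iteration. Returns the iteration count at which
 * check() first returned true, or max_iters if it never did.
 */
unsigned long spin_with_throttled_check(unsigned long max_iters,
                                        costly_check_fn check, void *ctx)
{
    unsigned long i = 0;

    assert(check != NULL);
    while (i < max_iters) {
        i++;
        /* Cheap mask test first; the costly call runs 1 time in 256. */
        if ((i & (CHECK_INTERVAL - 1)) == 0 && check(ctx))
            return i;
    }
    return max_iters;
}

/* Example predicate: counts how often it is invoked, never stops the spin. */
bool count_calls(void *ctx)
{
    ++*(unsigned long *)ctx;
    return false;
}
```

With `count_calls()` as the predicate, 1024 iterations invoke it four times. The cost of this design is that a state change can go unnoticed for up to 255 iterations, which is the trade-off the reply above objects to.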

Re: [Xen-devel] [PATCH 1/2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-08 Thread Peter Zijlstra
On Wed, Feb 08, 2017 at 01:00:24PM -0500, Waiman Long wrote: > It was found when running fio sequential write test with a XFS ramdisk > on a 2-socket x86-64 system, the %CPU times as reported by perf were > as follows: > > 71.27% 0.28% fio [k] down_write > 70.99% 0.01% fio [k]

Re: [Xen-devel] [PATCH v5 0/2] x86/kvm: Reduce vcpu_is_preempted() overhead

2017-02-20 Thread Peter Zijlstra
s will go through the KVM tree, if people want me to take it through the tip tree, please let me know. Acked-by: Peter Zijlstra (Intel) <pet...@infradead.org> ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-13 Thread Peter Zijlstra
On Mon, Feb 13, 2017 at 12:06:44PM -0800, h...@zytor.com wrote: > >Maybe: > > > >movsql %edi, %rax; > >movq __per_cpu_offset(,%rax,8), %rax; > >cmpb $0, %[offset](%rax); > >setne %al; > > > >? > > We could kill the zero or sign extend by changing the calling > interface to pass an unsigned long
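
As a rough C rendering of what the quoted four-instruction sequence computes: look up the per-cpu base for the given cpu, read one byte at a fixed offset from it, and return whether that byte is non-zero. This is a userspace sketch under stated assumptions, not the kernel's per-cpu machinery. `percpu_area_sketch`, `per_cpu_offset_sketch`, `percpu_init_sketch()` and `cpu_flag_is_set()` are hypothetical names, and the table here holds absolute addresses rather than real per-cpu offsets. The cpu is taken as an unsigned long, as the quoted reply suggests, so no extension step is needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4 /* hypothetical CPU count for the illustration */

/* Hypothetical per-cpu area; only the flag byte matters here. */
struct percpu_area_sketch {
    uint8_t pad[16];
    uint8_t preempted;
};

struct percpu_area_sketch percpu_areas_sketch[NR_CPUS_SKETCH];

/* Stand-in for __per_cpu_offset[]: one base address per cpu. */
uintptr_t per_cpu_offset_sketch[NR_CPUS_SKETCH];

void percpu_init_sketch(void)
{
    for (size_t i = 0; i < NR_CPUS_SKETCH; i++)
        per_cpu_offset_sketch[i] = (uintptr_t)&percpu_areas_sketch[i];
}

bool cpu_flag_is_set(unsigned long cpu)
{
    /* Corresponds to: movq __per_cpu_offset(,%rax,8), %rax */
    uintptr_t base = per_cpu_offset_sketch[cpu];
    const uint8_t *flag = (const uint8_t *)
        (base + offsetof(struct percpu_area_sketch, preempted));

    /* Corresponds to: cmpb $0, %[offset](%rax); setne %al */
    return *flag != 0;
}
```

The caller is assumed to pass a cpu below `NR_CPUS_SKETCH`; the sketch does no bounds checking, just as the quoted assembly does none.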
