specific code to support generic SMP function
call interfaces, so kill the redundant single function call interrupt.
Cc: Peter Zijlstra a.p.zijls...@chello.nl
Cc: Ingo Molnar mi...@elte.hu
Cc: Steven Rostedt rost...@goodmis.org
Signed-off-by: Jiang Liu jiang@linux.intel.com
Acked-by: Peter
On Mon, Feb 09, 2015 at 03:04:22PM +0530, Raghavendra K T wrote:
So we have 3 choices,
1. xadd
2. continue with current approach.
3. a read before unlock and also after that.
For the truly paranoid we have probe_kernel_address(), suppose the lock
was in module space and the module just got
On Thu, Feb 12, 2015 at 05:17:27PM +0530, Raghavendra K T wrote:
Paravirt spinlock clears slowpath flag after doing unlock.
As explained by Linus, currently it does:
prev = *lock;
add_smp(&lock->tickets.head, TICKET_LOCK_INC);
/* add_smp() is a
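A minimal sketch of the sequence under discussion, assuming the x86 ticket layout of that era (a head/tail ticket pair with a TICKET_SLOWPATH_FLAG bit in the tail; add_smp() and __ticket_unlock_slowpath() as in arch/x86 of the time — illustrative, not the patch itself):

static __always_inline void ticket_unlock_sketch(arch_spinlock_t *lock)
{
	/* Snapshot the lock word BEFORE releasing it; afterwards the
	 * lock can be taken, released, even freed by someone else. */
	arch_spinlock_t prev = *lock;

	/* The actual unlock; add_smp() is a full memory barrier. */
	add_smp(&lock->tickets.head, TICKET_LOCK_INC);

	/* Only the pre-unlock snapshot may be inspected here. */
	if (unlikely(prev.tickets.tail & TICKET_SLOWPATH_FLAG))
		__ticket_unlock_slowpath(lock, prev);
}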
From: Peter Zijlstra pet...@infradead.org
When we detect a hypervisor (!paravirt, see qspinlock paravirt support
patches), revert to a simple test-and-set lock to avoid the horrors
of queue preemption.
Cc: Ingo Molnar mi...@redhat.com
Cc: David Vrabel david.vra...@citrix.com
Cc: Oleg Nesterov o
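The fallback is the classic test-and-set byte lock; roughly (a sketch of the idea, not the eventual virt_spin_lock() helper):

static inline void tas_spin_lock_sketch(struct qspinlock *lock)
{
	/* No queue and no fairness, but also no vcpu stuck spinning
	 * behind a preempted queue entry. */
	do {
		while (atomic_read(&lock->val))
			cpu_relax();	/* wait until it looks free */
	} while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);
}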
From: Peter Zijlstra pet...@infradead.org
When we allow for a max NR_CPUS < 2^14 we can optimize the pending
wait-acquire and the xchg_tail() operations.
By growing the pending bit to a byte, we reduce the tail to 16bit.
This means we can use xchg16 for the tail part and do away with all
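With pending grown to a byte, the tail becomes a naturally aligned u16 in the upper half of the word, so publishing a new tail is a single halfword exchange instead of a cmpxchg loop. A little-endian sketch (layout assumed from the description above):

static u32 xchg_tail_sketch(struct qspinlock *lock, u32 tail)
{
	u16 *tp = (u16 *)lock + 1;	/* tail halfword, little-endian */

	/* Atomically publish the new tail, return the old one. */
	return (u32)xchg(tp, (u16)(tail >> 16)) << 16;
}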
...@linux.vnet.ibm.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org
Link:
http://lkml.kernel.org/r/1421784755-21945-2-git-send-email-waiman.l...@hp.com
---
include/asm-generic
...@linux-foundation.org
Cc: Thomas Gleixner t...@linutronix.de
Cc: H. Peter Anvin h...@zytor.com
Cc: Rik van Riel r...@redhat.com
Cc: Raghavendra K T raghavendra...@linux.vnet.ibm.com
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org
Link
From: Peter Zijlstra pet...@infradead.org
Because the qspinlock needs to touch a second cacheline (the per-cpu
mcs_nodes[]); add a pending bit and allow a single in-word spinner
before we punt to the second cacheline.
It is possible to observe the pending bit without the locked bit when
the last
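In outline: a waiter that finds the word free of tail and pending claims the pending bit and spins in-word; only contended waiters touch mcs_nodes[]. A simplified sketch (queue path omitted):

	u32 val = atomic_read(&lock->val);

	if (!(val & ~_Q_LOCKED_MASK)) {		/* no tail, no pending */
		if (atomic_cmpxchg(&lock->val, val,
				   val | _Q_PENDING_VAL) == val) {
			/* We are the one in-word spinner: wait for unlock, */
			while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
				cpu_relax();
			/* then clear pending and set locked in one go. */
			atomic_add(_Q_LOCKED_VAL - _Q_PENDING_VAL, &lock->val);
			return;
		}
	}
	/* otherwise: fall back to the MCS queue in the second cacheline */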
: Rik van Riel r...@redhat.com
Cc: Linus Torvalds torva...@linux-foundation.org
Cc: Raghavendra K T raghavendra...@linux.vnet.ibm.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra (Intel) pet
to the current kvm code. We can do a single
entry because any nesting will wake the vcpu and cause the lower loop
to retry.
Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org
---
include/asm-generic/qspinlock.h |3
kernel/locking/qspinlock.c | 69 +-
kernel/locking
.
This significantly lowers the overhead of having
CONFIG_PARAVIRT_SPINLOCKS enabled, even for native code.
Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org
---
arch/x86/Kconfig |2 -
arch/x86/include/asm/paravirt.h | 28 -
arch/x86/include/asm
Hi Waiman,
As promised; here is the paravirt stuff I did during the trip to BOS last week.
All the !paravirt patches are more or less the same as before (the only real
change is the copyright lines in the first patch).
The paravirt stuff is 'simple' and KVM only -- the Xen code was a little
On Wed, Mar 18, 2015 at 10:45:55PM -0400, Waiman Long wrote:
On 03/16/2015 09:16 AM, Peter Zijlstra wrote:
I do have some concern about this call site patching mechanism as the
modification is not atomic. The spin_unlock() calls are in many places in
the kernel. There is a possibility
On Wed, Mar 18, 2015 at 04:50:37PM -0400, Waiman Long wrote:
+this_cpu_write(__pv_lock_wait, lock);
We may run into the same problem of needing to have 4 queue nodes per CPU.
If an interrupt happens just after the write and before the actual wait and
it goes through the same
On Thu, Mar 19, 2015 at 01:25:36PM +0100, Peter Zijlstra wrote:
+static struct qspinlock **pv_hash(struct qspinlock *lock)
+{
+ u32 hash = hash_ptr(lock, PV_LOCK_HASH_BITS);
+ struct pv_hash_bucket *hb, *end;
+
+ if (!hash)
+ hash = 1;
+
+ hb = __pv_lock_hash
On Thu, Mar 19, 2015 at 06:01:34PM +, David Vrabel wrote:
This seems work for me, but I've not got time to give it a more thorough
testing.
You can fold this into your series.
Thanks!
There doesn't seem to be a way to disable QUEUE_SPINLOCKS when supported by
the arch, is this
On Mon, Mar 30, 2015 at 12:25:12PM -0400, Waiman Long wrote:
I did it differently in my PV portion of the qspinlock patch. Instead of
just waking up the CPU, the new lock holder will check if the new queue head
has been halted. If so, it will set the slowpath flag for the halted queue
head in
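Schematically, the handoff check Waiman describes would look like this (pv_node field names are illustrative, not the posted code):

static void pv_handoff_sketch(struct qspinlock *lock, struct pv_node *head)
{
	if (READ_ONCE(head->state) != vcpu_halted)
		return;			/* head still spinning: nothing to do */

	/* Queue head is halted: set the slowpath flag so the next
	 * unlock takes the slow path, then wake the halted vcpu. */
	WRITE_ONCE(((struct __qspinlock *)lock)->locked, _Q_SLOW_VAL);
	pv_kick(head->cpu);
}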
On Wed, Mar 25, 2015 at 03:47:39PM -0400, Konrad Rzeszutek Wilk wrote:
Ah nice. That could be spun out as a separate patch to optimize the existing
ticket locks I presume.
Yes I suppose we can do something similar for the ticket and patch in
the right increment. We'd need to restructure the
On Fri, Apr 24, 2015 at 02:56:42PM -0400, Waiman Long wrote:
In the pv_scan_next() function, the slow cmpxchg atomic operation is
performed even if the other CPU is not even close to being halted. This
extra cmpxchg can harm slowpath performance.
This patch introduces the new mayhalt flag to
On Thu, Apr 30, 2015 at 02:49:26PM -0400, Waiman Long wrote:
On 04/29/2015 02:11 PM, Peter Zijlstra wrote:
On Fri, Apr 24, 2015 at 02:56:42PM -0400, Waiman Long wrote:
In the pv_scan_next() function, the slow cmpxchg atomic operation is
performed even if the other CPU is not even close
...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
Suggested-by: Peter Zijlstra (Intel) pet...@infradead.org
Signed-off-by: Waiman Long waiman.l...@hp.com
Signed-off-by: Peter Zijlstra (Intel) pet...@infradead.org
Link:
http://lkml.kernel.org/r/1429901803-29771-9-git-send-email-waiman.l...@hp.com
---
kernel
On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote:
+void __init __pv_init_lock_hash(void)
+{
+ int pv_hash_size = 4 * num_possible_cpus();
+
+ if (pv_hash_size < (1U << LFSR_MIN_BITS))
+ pv_hash_size = (1U << LFSR_MIN_BITS);
+ /*
+ * Allocate space from bootmem
On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote:
+static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
+{
+ struct __qspinlock *l = (void *)lock;
+ struct qspinlock **lp = NULL;
+ struct pv_node *pn = (struct pv_node *)node;
+ int slow_set = false;
On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote:
+__visible void __pv_queue_spin_unlock(struct qspinlock *lock)
+{
+ struct __qspinlock *l = (void *)lock;
+ struct pv_node *node;
+
+ if (likely(cmpxchg(l->locked, _Q_LOCKED_VAL, 0) == _Q_LOCKED_VAL))
+ return;
+
On Mon, Apr 06, 2015 at 10:55:44PM -0400, Waiman Long wrote:
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -0,0 +1,321 @@
+#ifndef _GEN_PV_LOCK_SLOWPATH
+#error do not include this file
+#endif
+
+/*
+ * Implement paravirt qspinlocks; the general idea is to halt the vcpus instead
+ * of
On Mon, Apr 06, 2015 at 10:55:48PM -0400, Waiman Long wrote:
@@ -219,24 +236,30 @@ static void pv_wait_node(struct mcs_spinlock *node)
}
/*
+ * Called after setting next->locked = 1 & lock acquired.
+ * Check if the CPU has been halted. If so, set the _Q_SLOW_VAL flag
+ * and put an
On Thu, Apr 09, 2015 at 08:13:27PM +0200, Peter Zijlstra wrote:
On Mon, Apr 06, 2015 at 10:55:44PM -0400, Waiman Long wrote:
+#define PV_HB_PER_LINE (SMP_CACHE_BYTES / sizeof(struct pv_hash_bucket))
+static struct qspinlock **pv_hash(struct qspinlock *lock, struct pv_node *node
On Thu, Apr 09, 2015 at 09:16:24AM -0400, Rik van Riel wrote:
On 04/09/2015 03:01 AM, Peter Zijlstra wrote:
On Wed, Apr 08, 2015 at 02:32:19PM -0400, Waiman Long wrote:
For a virtual guest with the qspinlock patch, a simple unfair byte lock
will be used if PV spinlock is not configured
On Thu, Apr 09, 2015 at 09:57:21PM +0200, Peter Zijlstra wrote:
On Mon, Apr 06, 2015 at 10:55:48PM -0400, Waiman Long wrote:
@@ -219,24 +236,30 @@ static void pv_wait_node(struct mcs_spinlock *node)
}
/*
+ * Called after setting next->locked = 1 & lock acquired.
+ * Check
On Wed, Apr 08, 2015 at 02:32:19PM -0400, Waiman Long wrote:
For a virtual guest with the qspinlock patch, a simple unfair byte lock
will be used if PV spinlock is not configured in or the hypervisor
isn't either KVM or Xen. The byte lock works fine with small guest
of just a few vCPUs. On a
On Wed, Apr 01, 2015 at 12:20:30PM -0400, Waiman Long wrote:
After more careful reading, I think the assumption that the presence of an
unused bucket means there is no match is not true. Consider the scenario:
1. cpu 0 puts lock1 into hb[0]
2. cpu 1 puts lock2 into hb[1]
3. cpu 2 clears
On Wed, Apr 01, 2015 at 07:12:23PM +0200, Peter Zijlstra wrote:
On Wed, Apr 01, 2015 at 12:20:30PM -0400, Waiman Long wrote:
After more careful reading, I think the assumption that the presence of an
unused bucket means there is no match is not true. Consider the scenario:
1. cpu 0 puts
On Wed, Apr 01, 2015 at 07:42:39PM +0200, Peter Zijlstra wrote:
Hohumm.. time to think more I think ;-)
So bear with me, I've not really pondered this well so it could be full
of holes (again).
After the cmpxchg(l->locked, _Q_LOCKED_VAL, _Q_SLOW_VAL) succeeds the
spin_unlock() must do
On Wed, Apr 01, 2015 at 02:54:45PM -0400, Waiman Long wrote:
On 04/01/2015 02:17 PM, Peter Zijlstra wrote:
On Wed, Apr 01, 2015 at 07:42:39PM +0200, Peter Zijlstra wrote:
Hohumm.. time to think more I think ;-)
So bear with me, I've not really pondered this well so it could be full
of holes
On Thu, Apr 02, 2015 at 12:28:30PM -0400, Waiman Long wrote:
On 04/01/2015 05:03 PM, Peter Zijlstra wrote:
On Wed, Apr 01, 2015 at 03:58:58PM -0400, Waiman Long wrote:
On 04/01/2015 02:48 PM, Peter Zijlstra wrote:
I am sorry that I don't quite get what you mean here. My point
On Wed, Apr 01, 2015 at 03:58:58PM -0400, Waiman Long wrote:
On 04/01/2015 02:48 PM, Peter Zijlstra wrote:
I am sorry that I don't quite get what you mean here. My point is that in
the hashing step, a cpu will need to scan an empty bucket to put the lock
in. In the interim, a previously used
On Thu, Apr 02, 2015 at 09:48:34PM +0200, Peter Zijlstra wrote:
@@ -158,20 +257,20 @@ static void pv_wait_head(struct qspinloc
void __pv_queue_spin_unlock(struct qspinlock *lock)
{
struct __qspinlock *l = (void *)lock;
+ struct pv_hash_bucket *hb;
if (xchg(l->locked, 0
On Wed, Aug 12, 2015 at 07:21:05PM +0200, Peter Zijlstra wrote:
On Wed, Aug 12, 2015 at 09:27:38AM -0400, Boris Ostrovsky wrote:
Incidentally, 11276d53 (locking/static_keys: Add a new static_key
interface) breaks old-ish compilers (gcc version 4.4.4 20100503 (Red Hat
4.4.4-2) (GCC
On Wed, Aug 12, 2015 at 09:27:38AM -0400, Boris Ostrovsky wrote:
Incidentally, 11276d53 (locking/static_keys: Add a new static_key
interface) breaks old-ish compilers (gcc version 4.4.4 20100503 (Red Hat
4.4.4-2) (GCC)):
CC arch/x86/kernel/nmi.o
In file included from
On Wed, Jul 15, 2015 at 03:52:34PM +0300, Konstantin Khlebnikov wrote:
On 15.07.2015 15:16, Eric Dumazet wrote:
On Wed, 2015-07-15 at 12:52 +0300, Konstantin Khlebnikov wrote:
These functions check should_resched() before unlocking spinlock/bh-enable:
preempt_count always non-zero =
On Thu, Nov 05, 2015 at 05:30:01PM +, Stefano Stabellini wrote:
> On Thu, 5 Nov 2015, Peter Zijlstra wrote:
> > How can this be missing? Things compile fine now, right?
>
> Fair enough.
>
>
> > So please better explain why we do this change.
>
>
On Tue, Nov 10, 2015 at 11:57:49AM +, Stefano Stabellini wrote:
> __current_kernel_time64 returns a struct timespec64, without taking the
> xtime lock. Mirrors __current_kernel_time/current_kernel_time.
It always helps if you include a reason why you want a patch.
On Tue, Nov 10, 2015 at 11:27:33AM +, Stefano Stabellini wrote:
> On Mon, 9 Nov 2015, Peter Zijlstra wrote:
> > On Thu, Nov 05, 2015 at 05:30:01PM +, Stefano Stabellini wrote:
> > > On Thu, 5 Nov 2015, Peter Zijlstra wrote:
> > > > How can this be missing?
How can this be missing? Things compile fine now, right? So please
better explain why we do this change.
On Wed, Sep 16, 2015 at 04:33:11PM -0700, Andy Lutomirski wrote:
> Setting CONFIG_PARAVIRT=y has an unintended side effect: it silently
> turns all rdmsr and wrmsr operations into the safe variants without
> any checks that the operations actually succeed.
>
> This is IMO awful: it papers over
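For contrast, explicit probing uses the _safe variants and checks the fault, keeping plain rdmsr/wrmsr fast everywhere else; e.g. (illustrative):

	u64 val;

	/* rdmsrl_safe() returns non-zero if the RDMSR faulted, so an
	 * absent MSR is detected instead of silently reading as 0. */
	if (rdmsrl_safe(MSR_PLATFORM_INFO, &val))
		pr_warn("MSR_PLATFORM_INFO not present\n");

As the rest of the thread notes, this only works if the hypervisor actually raises #GP for unknown MSRs.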
On Thu, Sep 17, 2015 at 01:40:30PM +0200, Paolo Bonzini wrote:
>
>
> On 17/09/2015 10:58, Peter Zijlstra wrote:
> > But the far greater problem I have with the whole virt thing is that
> > you cannot use rdmsr_safe() to probe if an MSR exists at all because, as
On Thu, Sep 17, 2015 at 08:17:18AM -0700, Andy Lutomirski wrote:
> > Ah, that would be good news. Andy earlier argued I could not rely on
> > rdmsr_safe() faulting on unknown MSRs. If practically we can there's
> > some code I can simplify :-)
>
> I was talking about QEMU TCG, not KVM.
Just for
On Mon, Sep 21, 2015 at 09:36:15AM -0700, Linus Torvalds wrote:
> On Mon, Sep 21, 2015 at 1:46 AM, Ingo Molnar wrote:
> >
> > Linus, what's your preference?
>
> So quite frankly, is there any reason we don't just implement
> native_read_msr() as just
>
>unsigned long long
On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote:
> On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux wrote:
> > My only concern is that it gives people an additional handle onto a
> > "new" set of barriers - just because they're prefixed with __*
> >
On Thu, Dec 31, 2015 at 09:06:30PM +0200, Michael S. Tsirkin wrote:
> On s390 read_barrier_depends, smp_read_barrier_depends
> smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the
> asm-generic variants exactly. Drop the local definitions and pull in
> asm-generic/barrier.h
On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote:
> This defines __smp_xxx barriers for s390,
> for use by virtualization.
>
> Some smp_xxx barriers are removed as they are
> defined correctly by asm-generic/barriers.h
>
> Note: smp_mb, smp_rmb and smp_wmb are defined as full
On Thu, Dec 31, 2015 at 09:10:01PM +0200, Michael S. Tsirkin wrote:
> drivers/xen/xenbus/xenbus_comms.c uses
> full memory barriers to communicate with the other side.
>
> For guests compiled with CONFIG_SMP, smp_wmb and smp_mb
> would be sufficient, so mb() and wmb() here are only needed if
> a
On Thu, Dec 31, 2015 at 09:08:22PM +0200, Michael S. Tsirkin wrote:
> +#ifdef CONFIG_SMP
> +#define fence() metag_fence()
> +#else
> +#define fence() do { } while (0)
> #endif
James, it strikes me as odd that fence() is a no-op instead of a
barrier() for UP, can you verify/explain?
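If it turns out UP does need compiler ordering there, the fix would presumably be a one-liner along these lines (sketch):

#ifdef CONFIG_SMP
#define fence()		metag_fence()
#else
#define fence()		barrier()	/* at least a compiler barrier on UP */
#endif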
On Thu, Dec 31, 2015 at 09:09:47PM +0200, Michael S. Tsirkin wrote:
> At the moment, xchg on sh only supports 4 and 1 byte values, so using it
> from smp_store_mb means attempts to store a 2 byte value using this
> macro fail.
>
> And happens to be exactly what virtio drivers want to do.
>
>
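Where the ISA only provides 32-bit atomics, one generic way to get a 2-byte xchg is a cmpxchg loop on the containing aligned word; a little-endian sketch:

static inline u16 xchg_u16_sketch(volatile u16 *p, u16 new)
{
	volatile u32 *wp = (volatile u32 *)((unsigned long)p & ~3UL);
	unsigned int shift = ((unsigned long)p & 2) * 8;  /* 0 or 16 */
	u32 mask = 0xffffU << shift;
	u32 old, repl;

	do {
		old = *wp;
		repl = (old & ~mask) | ((u32)new << shift);
	} while (cmpxchg(wp, old, repl) != old);

	return (u16)(old >> shift);
}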
On Thu, Dec 31, 2015 at 09:07:10PM +0200, Michael S. Tsirkin wrote:
> -#define smp_store_release(p, v) \
> -do { \
> - compiletime_assert_atomic_type(*p);
On Mon, Jan 04, 2016 at 02:36:58PM +0100, Peter Zijlstra wrote:
> On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote:
> > On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux wrote:
>
> > > My only concern is that it gives people an a
On Mon, Jan 04, 2016 at 03:25:58PM +, James Hogan wrote:
> It is used along with the metag specific __global_lock1() (global
> voluntary lock between hw threads) whenever a write is performed, and by
> smp_mb/smp_rmb to try to catch other cases, but I've never been
> confident this fixes every
On Sun, Dec 20, 2015 at 05:07:19PM +, Andrew Cooper wrote:
>
> Very much +1 for fixing this.
>
> Those names would be fine, but they do add yet another set of options in
> an already-complicated area.
>
> An alternative might be to have the regular smp_{w,r,}mb() not revert
> back to nops
On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote:
> This statement doesn't fit MIPS barriers variations. Moreover, there is a
> reason to extend that even more specific, at least for smp_store_release and
> smp_load_acquire, look into
>
>
On Tue, Jan 12, 2016 at 10:43:36AM +0200, Michael S. Tsirkin wrote:
> On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote:
> > On 01/10/2016 06:18 AM, Michael S. Tsirkin wrote:
> > >On mips dma_rmb, dma_wmb, smp_store_mb, read_barrier_depends,
> > >smp_read_barrier_depends,
> duplicate patch, and assume conflict will be resolved.
>
> I would really appreciate some feedback on arch bits (especially the x86
> bits),
> and acks for merging this through the vhost tree.
Thanks for doing this, looks good to me.
Acked-by: Peter Zijlstra (I
On Tue, Jan 12, 2016 at 11:25:55AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote:
> > 2) the changelog _completely_ fails to explain the sync 0x11 and sync
> > 0x12 semantics nor does it provide a publicly accessible link to
3. I bothered the MIPS Arch team a long time until I completely understood that MIPS
> SYNC_WMB, SYNC_MB, SYNC_RMB, SYNC_RELEASE and SYNC_ACQUIRE do exactly
> what is required in Documentation/memory-barriers.txt
Ha! and you think that document covers all the really fun details?
In particular we're ve
On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote:
> 2) the changelog _completely_ fails to explain the sync 0x11 and sync
> 0x12 semantics nor does it provide a publicly accessible link to
> documentation that does.
Ralf pointed me at: https://imgtec.com/mips/architectur
On Wed, Jan 13, 2016 at 11:02:35AM -0800, Leonid Yegoshin wrote:
> I asked the HW team about it but I have a question - does it have any relationship with
> replacing MIPS SYNC with lightweight SYNCs (SYNC_WMB etc)?
Of course. If you cannot explain the semantics of the primitives you
introduce, how can we
of smp_store_release()/smp_load_acquire() chains is local. This
> commit therefore introduces the notion of local transitivity and
> gives an example.
>
> Reported-by: Peter Zijlstra <pet...@infradead.org>
> Reported-by: Will Deacon <will.dea...@arm.com>
On Fri, Jan 15, 2016 at 09:46:12AM -0800, Paul E. McKenney wrote:
> On Fri, Jan 15, 2016 at 10:13:48AM +0100, Peter Zijlstra wrote:
> > And the stuff we're confused about is how best to express the difference
> > and guarantees of these two forms of transitivity and how exactly th
On Thu, Jan 14, 2016 at 01:29:13PM -0800, Paul E. McKenney wrote:
> So smp_mb() provides transitivity, as do pairs of smp_store_release()
> and smp_read_acquire(),
But they provide different grades of transitivity, which is where all
the confusion lies.
smp_mb() is strongly/globally transitive,
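The difference shows up with three CPUs; in memory-barriers.txt style, with x, y and z initially zero:

	CPU 0:	WRITE_ONCE(x, 1);
		smp_store_release(&y, 1);

	CPU 1:	r1 = smp_load_acquire(&y);
		smp_store_release(&z, 1);

	CPU 2:	r2 = smp_load_acquire(&z);
		r3 = READ_ONCE(x);

Local transitivity: because CPU 2 links into the release-acquire chain, r1 == 1 && r2 == 1 implies r3 == 1. Swap CPU 2's acquire for READ_ONCE() plus smp_rmb() and it falls outside the chain; only the strongly/globally transitive smp_mb() would still forbid r2 == 1 && r3 == 0 for such an outside observer.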
On Fri, Jan 15, 2016 at 09:55:54AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 14, 2016 at 01:29:13PM -0800, Paul E. McKenney wrote:
> > So smp_mb() provides transitivity, as do pairs of smp_store_release()
> > and smp_read_acquire(),
>
> But they provide different grades o
On Thu, Jan 14, 2016 at 11:42:02AM -0800, Leonid Yegoshin wrote:
> And the only point - please use the appropriate SYNC_* barriers instead of
> the heavy bold hammer. That stuff was designed explicitly to support the
> requirements of Documentation/memory-barriers.txt
That's madness. That document
On Thu, Jan 14, 2016 at 09:15:13PM +0100, Peter Zijlstra wrote:
> On Thu, Jan 14, 2016 at 11:42:02AM -0800, Leonid Yegoshin wrote:
> > And the only point - please use the appropriate SYNC_* barriers instead of
> > the heavy bold hammer. That stuff was designed explicitly to support the
On Mon, Jan 25, 2016 at 10:03:22PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 25, 2016 at 04:42:43PM +, Will Deacon wrote:
> > On Fri, Jan 15, 2016 at 01:58:53PM -0800, Paul E. McKenney wrote:
> > > On Fri, Jan 15, 2016 at 10:27:14PM +0100, Peter Zijlstra wrote:
>
On Thu, Jan 14, 2016 at 02:20:46PM -0800, Paul E. McKenney wrote:
> On Thu, Jan 14, 2016 at 01:24:34PM -0800, Leonid Yegoshin wrote:
> > On 01/14/2016 12:48 PM, Paul E. McKenney wrote:
> > >
> > >So SYNC_RMB is intended to implement smp_rmb(), correct?
> > Yes.
> > >
> > >You could use
On Tue, Jan 26, 2016 at 02:33:40PM -0800, Linus Torvalds wrote:
> If it turns out that some architecture does actually need a barrier
> between a read and a dependent write, then that will mean that
>
> (a) we'll have to make up a _new_ barrier, because
> "smp_read_barrier_depends()" is not
On Wed, Jan 27, 2016 at 12:52:07AM +0800, Boqun Feng wrote:
> I recall that last time you and Linus came into a conclusion that even
> on Alpha, a barrier for read->write with data dependency is unnecessary:
>
> http://article.gmane.org/gmane.linux.kernel/2077661
>
> And in an earlier mail of
On Mon, Jan 25, 2016 at 10:12:11PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 25, 2016 at 06:02:34PM +, Will Deacon wrote:
> > Thanks for having a go at this. I tried defining something axiomatically,
> > but got stuck pretty quickly. In my scheme, I used "data-directed
> > transitivity"
On Tue, Jan 26, 2016 at 12:13:39PM -0800, Paul E. McKenney wrote:
> On Tue, Jan 26, 2016 at 11:19:27AM +0100, Peter Zijlstra wrote:
> > So isn't smp_mb__after_unlock_lock() exactly such a scenario? And would
> > not someone trying to implement RCsc locks using locally transit
insane to require it
when building new hardware.
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
---
Documentation/memory-barriers.txt | 18 +-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-
On Mon, Mar 14, 2016 at 11:10:16AM -0700, Andy Lutomirski wrote:
> A couple of the wrmsr users actually care about performance. These
> are the ones involved in context switching and, to a lesser extent, in
> switching in and out of guest mode.
Right, this very much includes a number of perf
On Fri, Mar 11, 2016 at 12:59:28PM +0100, Juergen Gross wrote:
> Some hardware (e.g. Dell Studio laptops) require special functions to
> be called on physical cpu 0 in order to avoid occasional hangs. When
> running as dom0 under Xen this could be achieved only via special boot
> parameters (vcpu
On Fri, Mar 11, 2016 at 01:43:53PM +0100, Juergen Gross wrote:
> On 11/03/16 13:19, Peter Zijlstra wrote:
> > On Fri, Mar 11, 2016 at 12:59:30PM +0100, Juergen Gross wrote:
> >> +int call_sync_on_phys_cpu(unsigned cpu, int (*func)(void *), void *par)
> >> +{
>
On Fri, Mar 11, 2016 at 01:48:12PM +0100, Juergen Gross wrote:
> On 11/03/16 13:42, Peter Zijlstra wrote:
> > how about something like:
> >
> > struct xen_callback_struct {
> > struct work_struct work;
> > struct completion done;
int
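Filling out the shape being sketched, with hypothetical field and function names (the quoted mail is cut off at the third member):

struct xen_callback_struct {
	struct work_struct	work;
	struct completion	done;
	int			(*func)(void *);
	void			*data;
	int			ret;
};

static void xen_callback_work_fn(struct work_struct *work)
{
	struct xen_callback_struct *xcs =
		container_of(work, struct xen_callback_struct, work);

	xcs->ret = xcs->func(xcs->data);
	complete(&xcs->done);	/* submitter waits on done, collects ret */
}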
On Fri, Mar 11, 2016 at 12:59:30PM +0100, Juergen Gross wrote:
> +int call_sync_on_phys_cpu(unsigned cpu, int (*func)(void *), void *par)
> +{
> + cpumask_var_t old_mask;
> + int ret;
> +
> + if (cpu >= nr_cpu_ids)
> + return -EINVAL;
> +
> + if
On Fri, Mar 11, 2016 at 01:19:50PM +0100, Peter Zijlstra wrote:
> On Fri, Mar 11, 2016 at 12:59:30PM +0100, Juergen Gross wrote:
> > +int call_sync_on_phys_cpu(unsigned cpu, int (*func)(void *), void *par)
> > +{
> > + cpumask_var_t old_mask;
> > + int ret;
> >
On Fri, Mar 11, 2016 at 01:15:04PM +, One Thousand Gnomes wrote:
> On Fri, 11 Mar 2016 13:25:14 +0100
> Peter Zijlstra <pet...@infradead.org> wrote:
>
> > On Fri, Mar 11, 2016 at 12:59:28PM +0100, Juergen Gross wrote:
> > > Some hardware (e.g. Dell Studio lapt
e kernel add a service function for this purpose. This will enable
> the possibility to take special measures in virtualized environments
> like Xen, too.
>
> Signed-off-by: Juergen Gross <jgr...@suse.com>
Thanks!
Acked-by: Peter Zijlstra
On Mon, Apr 04, 2016 at 01:52:06PM +0200, Jan Kara wrote:
> Sounds like a good idea to me. I've also consulted this with Petr Mladek
> (added to CC) who is using printk_func per-cpu variable in his
> printk-from-NMI patches and he also doesn't see a problem with this.
There's a few printk()
On Tue, Apr 05, 2016 at 07:10:04AM +0200, Juergen Gross wrote:
> +int smp_call_on_cpu(unsigned int cpu, bool pin, int (*func)(void *), void
> *par)
Why .pin and not .phys? .pin does not (to me) reflect the
hypervisor/physical-cpu thing.
Also, as per smp_call_function_single() would it not be
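Taking the quoted prototype at face value, a caller would look something like this (dell_get_token() and dell_smi_query() are made up for illustration):

static int dell_get_token(void *arg)
{
	/* Runs with the calling task pinned to (physical) CPU 0. */
	return dell_smi_query(arg);
}

	/* pin == true: on Xen, also pin the vcpu to the physical cpu */
	ret = smp_call_on_cpu(0, true, dell_get_token, &token);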
On Fri, Apr 01, 2016 at 09:14:30AM +0200, Juergen Gross wrote:
> + if (cpu >= nr_cpu_ids)
> + return -EINVAL;
> + if (cpu != 0)
> + return -EINVAL;
The other functions return -ENXIO for this.
On Fri, Apr 01, 2016 at 09:14:33AM +0200, Juergen Gross wrote:
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -14,6 +14,7 @@
> #include
> #include
> #include
> +#include
>
> #include "smpboot.h"
>
> @@ -758,9 +759,14 @@ struct smp_sync_call_struct {
> static void
On Fri, Apr 01, 2016 at 11:03:21AM +0200, Juergen Gross wrote:
> > Maybe just make the vpin thing an option like:
> >
> > smp_call_on_cpu(int (*func)(void *), int phys_cpu);
> > Also; is something like the vpin thing possible on KVM? because if we're
> > going to expose it to generic code
On Fri, Apr 01, 2016 at 10:28:46AM +0200, Juergen Gross wrote:
> On 01/04/16 09:43, Peter Zijlstra wrote:
> > On Fri, Apr 01, 2016 at 09:14:33AM +0200, Juergen Gross wrote:
> >> --- a/kernel/smp.c
> >> +++ b/kernel/smp.c
> >> @@ -14,6 +14,7 @@
On Mon, Apr 04, 2016 at 08:32:21AM -0700, Andy Lutomirski wrote:
> Adding locking would be easy enough, wouldn't it?
See patch in this thread..
> But do any platforms really boot a second CPU before switching to real
> printk?
I _only_ use early_printk() as printk() is a quagmire of fail :-)
On Wed, May 18, 2016 at 03:13:44PM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, May 18, 2016 at 08:43:02PM +0200, Peter Zijlstra wrote:
> >
> > We've unconditionally used the queued spinlock for many releases now.
>
> Like since 4.2?
Yeah, that seems to be the right nu
We've unconditionally used the queued spinlock for many releases now.
Its time to remove the old ticket lock code.
Cc: Waiman Long <waiman.l...@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
---
arch/x86/Kconfig | 3 +-
arch/x86
On Wed, Feb 08, 2017 at 01:00:25PM -0500, Waiman Long wrote:
> As the vcpu_is_preempted() call is pretty costly compared with other
> checks within mutex_spin_on_owner() and rwsem_spin_on_owner(), they
> are done at a reduced frequency of once every 256 iterations.
That's just disgusting.
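The construct being objected to is, schematically (owner_running() stands in for the real loop condition; the 0xff mask gives the once-per-256 checks from the changelog):

	for (loop = 0; owner_running(lock, owner); loop++) {
		/* vcpu_is_preempted() is a costly call, so only consult
		 * it once every 256 iterations of the owner-poll loop. */
		if (!(loop & 0xff) && vcpu_is_preempted(task_cpu(owner)))
			break;		/* owner vcpu preempted: stop spinning */

		cpu_relax();
	}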
On Wed, Feb 08, 2017 at 01:00:24PM -0500, Waiman Long wrote:
> It was found when running a fio sequential write test with an XFS ramdisk
> on a 2-socket x86-64 system, the %CPU times as reported by perf were
> as follows:
>
> 71.27% 0.28% fio [k] down_write
> 70.99% 0.01% fio [k]
s will go through the KVM tree, if people want me to
take it through the tip tree, please let me know.
Acked-by: Peter Zijlstra (Intel) <pet...@infraded.org>
On Mon, Feb 13, 2017 at 12:06:44PM -0800, h...@zytor.com wrote:
> >Maybe:
> >
> >movslq %edi, %rax;
> >movq __per_cpu_offset(,%rax,8), %rax;
> >cmpb $0, %[offset](%rax);
> >setne %al;
> >
> >?
>
> We could kill the zero or sign extend by changing the calling
> interface to pass an unsigned long