Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c

2014-01-30 Thread Peter Zijlstra
On Thu, Jan 30, 2014 at 08:41:16AM -0800, Joe Perches wrote: > Perhaps you could use a newer version of patch > > GNU patch version 2.7 released Yeah, I know about that, I'll wait until it's common in all distros, updating all machines I use by hand is just painful.

Re: [PATCH v2 1/6] idle: move the cpuidle entry point to the generic idle loop

2014-01-30 Thread Peter Zijlstra
On Thu, Jan 30, 2014 at 06:28:52PM +0100, Daniel Lezcano wrote: > Ok, I think the mess is coming from 'default_idle' which does not re-enable > the local_irq but used from different places like amd_e400_idle and > apm_cpu_idle. > > void default_idle(void) > { > trace_cpu_idle_rcuidle(1, sm

Re: [PATCH] Convert powerpc simple spinlocks into ticket locks

2014-02-06 Thread Peter Zijlstra
On Thu, Feb 06, 2014 at 11:37:37AM +0100, Torsten Duwe wrote: > x86 has them, MIPS has them, ARM has them, even ia64 has them: > ticket locks. They reduce memory bus and cache pressure especially > for contended spinlocks, increasing performance. > > This patch is a port of the x86 spin locks, mos
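For readers unfamiliar with the ticket locks this thread refers to, the idea can be sketched in portable C11. This is only an illustration under assumptions: the names `ticket_lock`, `next` and `owner` are hypothetical, and the code is not the x86 implementation or the powerpc patch under discussion. Each locker takes a ticket with an atomic increment and then spins until the "now serving" counter reaches that ticket, which gives FIFO ordering.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical layout for illustration; not the one used by the patch. */
struct ticket_lock {
	atomic_uint next;   /* next ticket to hand out */
	atomic_uint owner;  /* ticket currently being served */
};

static void ticket_lock_init(struct ticket_lock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->owner, 0);
}

/* Returns the ticket number that was taken (useful only for illustration). */
static unsigned int ticket_lock_acquire(struct ticket_lock *l)
{
	/* Take a ticket; ordering is provided by the acquire load below. */
	unsigned int me = atomic_fetch_add_explicit(&l->next, 1u,
						    memory_order_relaxed);

	/* Spin until it is our turn; real code would relax the CPU here. */
	while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
		;
	return me;
}

static void ticket_lock_release(struct ticket_lock *l)
{
	/* Only the holder writes owner, so a plain read-then-store suffices. */
	unsigned int cur = atomic_load_explicit(&l->owner, memory_order_relaxed);

	atomic_store_explicit(&l->owner, cur + 1u, memory_order_release);
}

/* Single-threaded walk-through: tickets are served in the order taken. */
static void ticket_lock_demo(void)
{
	struct ticket_lock l;

	ticket_lock_init(&l);
	assert(ticket_lock_acquire(&l) == 0);
	ticket_lock_release(&l);
	assert(ticket_lock_acquire(&l) == 1);
	ticket_lock_release(&l);
}
```

Waiters spin on a read of `owner` rather than repeatedly attempting an atomic write, which is one reason ticket locks are described above as reducing bus and cache pressure under contention.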

Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c

2014-02-06 Thread Peter Zijlstra
On Thu, Feb 06, 2014 at 02:09:59PM +0000, Nicolas Pitre wrote: > Hi Peter, > > Did you merge those patches in your tree? tree, tree, what's in a word. It's in my patch stack yes. I should get some of that into tip I suppose, been side-tracked a bit this week. Sorry for the delay. > If so, is it

Re: [PATCH] Convert powerpc simple spinlocks into ticket locks

2014-02-06 Thread Peter Zijlstra
On Thu, Feb 06, 2014 at 06:37:27PM +0100, Torsten Duwe wrote: > On Thu, Feb 06, 2014 at 05:38:37PM +0100, Peter Zijlstra wrote: > > On Thu, Feb 06, 2014 at 11:37:37AM +0100, Torsten Duwe wrote: > > > x86 has them, MIPS has them, ARM has them, even ia64 has them: > > >

Re: [PATCH] Convert powerpc simple spinlocks into ticket locks

2014-02-07 Thread Peter Zijlstra
On Fri, Feb 07, 2014 at 10:02:48AM +0100, Torsten Duwe wrote: > On Thu, Feb 06, 2014 at 02:19:52PM -0600, Scott Wood wrote: > > On Thu, 2014-02-06 at 18:37 +0100, Torsten Duwe wrote: > > > On Thu, Feb 06, 2014 at 05:38:37PM +0100, Peter Zijlstra wrote: > > > > >

Re: [PATCH] Convert powerpc simple spinlocks into ticket locks

2014-02-07 Thread Peter Zijlstra
> So if you have ll/sc on the whole word concurrent with the half-word > store, you can lose the half-word store like: > > lwarx &tickets > ... sth &tail > stwcd &tickets > > > The stwcd will over-write the tail store. Oh wait, that's stupid, it will invalidate the lock a

Re: [PATCH] Convert powerpc simple spinlocks into ticket locks

2014-02-07 Thread Peter Zijlstra
On Fri, Feb 07, 2014 at 11:31:39AM +0100, Peter Zijlstra wrote: > Anyway, what might work is something like (please forgive my ppc asm, I > can barely read the thing, I've never before attempted writing it): > > lock: > 1:lharx %0, 0, &head > mov %1, %

Re: [PATCH] Convert powerpc simple spinlocks into ticket locks

2014-02-07 Thread Peter Zijlstra
On Fri, Feb 07, 2014 at 12:49:49PM +0100, Torsten Duwe wrote: > On Fri, Feb 07, 2014 at 11:45:30AM +0100, Peter Zijlstra wrote: > > > > That might need to be lhz too, I'm confused on all the load variants. > > ;-) > > > > unlock: > > > lhz

Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c

2014-02-07 Thread Peter Zijlstra
On Fri, Feb 07, 2014 at 11:09:23AM +0000, Nicolas Pitre wrote: > On Thu, 6 Feb 2014, Peter Zijlstra wrote: > > tree, tree, what's in a word. > > Something you may plant on a patch of grass? "Merging" becomes a > strange concept in that context though. :-)

Re: [PATCH 1/2] PPC: powernv: remove redundant cpuidle_idle_call()

2014-02-07 Thread Peter Zijlstra
On Fri, Feb 07, 2014 at 05:11:26PM +0530, Preeti U Murthy wrote: > But observe the idle state "snooze" on powerpc. The power that this idle > state saves is through the lowering of the thread priority of the CPU. > After it lowers the thread priority, it is done. It cannot > "wait_for_interrupts".

Re: [PATCH] Convert powerpc simple spinlocks into ticket locks

2014-02-07 Thread Peter Zijlstra
On Fri, Feb 07, 2014 at 01:28:37PM +0100, Peter Zijlstra wrote: > Anyway, you can do a version with lwarx/stwcx if you're looking to get rid > of lharx. The below seems to compile into relatively ok asm. It can be done better if you write the entire thing by hand though. --- typedef uns

Re: [PATCH] Convert powerpc simple spinlocks into ticket locks

2014-02-07 Thread Peter Zijlstra
On Fri, Feb 07, 2014 at 04:18:47PM +0100, Peter Zijlstra wrote: > void ticket_lock(tickets_t *lock) > { > tickets_t t; > > /* >* Because @head is MSB, the direct increment wrap doesn't disturb >* @tail. >*/ >
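The comment quoted above, that a wrapping increment of @head cannot disturb @tail when @head is the most significant half, can be checked with a little arithmetic. The sketch below assumes a 32-bit word with @head in the upper 16 bits and @tail in the lower 16; the helper names are hypothetical and only model the arithmetic, not the atomic operation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed packing for illustration: head = bits 31..16, tail = bits 15..0. */
#define HEAD_SHIFT 16
#define HEAD_INC   (UINT32_C(1) << HEAD_SHIFT)

static uint16_t pair_head(uint32_t pair)
{
	return (uint16_t)(pair >> HEAD_SHIFT);
}

static uint16_t pair_tail(uint32_t pair)
{
	return (uint16_t)(pair & UINT32_C(0xFFFF));
}

/*
 * Increment head by adding to the whole word. A carry out of head leaves
 * the 32-bit word entirely, so it has no neighbouring field to spill into.
 */
static uint32_t pair_inc_head(uint32_t pair)
{
	return pair + HEAD_INC;
}

static void pair_demo(void)
{
	uint32_t pair = UINT32_C(0xFFFF0005);  /* head = 0xFFFF, tail = 5 */

	pair = pair_inc_head(pair);
	assert(pair_head(pair) == 0);  /* head wrapped around */
	assert(pair_tail(pair) == 5);  /* tail is untouched */
}
```

Had the incremented field been the low half instead, the same wrap would carry into the field above it, which is why the layout matters for a single whole-word add.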

Re: [PATCH] Convert powerpc simple spinlocks into ticket locks

2014-02-07 Thread Peter Zijlstra
On Fri, Feb 07, 2014 at 09:51:16AM -0600, Kumar Gala wrote: > > On Feb 7, 2014, at 3:02 AM, Torsten Duwe wrote: > > > On Thu, Feb 06, 2014 at 02:19:52PM -0600, Scott Wood wrote: > >> On Thu, 2014-02-06 at 18:37 +0100, Torsten Duwe wrote: > >>> On Thu, Feb

Re: [PATCH] Convert powerpc simple spinlocks into ticket locks

2014-02-07 Thread Peter Zijlstra
On Fri, Feb 07, 2014 at 06:08:45PM +0100, Torsten Duwe wrote: > > static inline unsigned int xadd(unsigned int *v, unsigned int i) > > { > > int t, ret; > > > > __asm__ __volatile__ ( > > "1: lwarx %0, 0, %4\n" > > " mr %1, %0\n" > > " add %0, %3, %0\n" > > " stwcx. %
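The quoted xadd() is a load-reserve/store-conditional retry loop: load the old value, compute the sum, try to store it, and retry if another CPU got in between. A rough portable analogue, assuming C11 atomics rather than powerpc inline asm, uses a weak compare-exchange in the same role. The name `xadd_sketch` is hypothetical, and memory ordering is deliberately left relaxed here because the quoted fragment does not show its barriers.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Fetch-and-add built from a retry loop, returning the value seen before
 * the addition. compare_exchange_weak may fail spuriously, much as a
 * store-conditional can, so it must sit in a loop.
 */
static unsigned int xadd_sketch(atomic_uint *v, unsigned int i)
{
	unsigned int old = atomic_load_explicit(v, memory_order_relaxed);

	/* On failure, 'old' is refreshed with the current value of *v. */
	while (!atomic_compare_exchange_weak_explicit(v, &old, old + i,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;
	return old;
}

static void xadd_demo(void)
{
	atomic_uint v;

	atomic_init(&v, 5);
	assert(xadd_sketch(&v, 3) == 5);  /* returns the old value */
	assert(atomic_load(&v) == 8);     /* memory holds the sum */
}
```

In practice `atomic_fetch_add_explicit()` does the same job in one call; the explicit loop is shown only because it mirrors the shape of the asm being reviewed.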

Re: [PATCH v2] powerpc ticket locks

2014-02-07 Thread Peter Zijlstra
On Fri, Feb 07, 2014 at 05:58:01PM +0100, Torsten Duwe wrote: > +static __always_inline void arch_spin_lock(arch_spinlock_t *lock) > { > + register struct __raw_tickets old, tmp, > + inc = { .tail = TICKET_LOCK_INC }; > + > CLEAR_IO_SYNC; > + __asm__ __volatile__( > +"1:

Re: [PATCH v2] powerpc ticket locks

2014-02-10 Thread Peter Zijlstra
On Mon, Feb 10, 2014 at 04:52:17PM +0100, Torsten Duwe wrote: > Opinions, anyone? Since the holder thing is a performance thing, not a correctness thing; one thing you could do is something like: static const int OWNER_HASH_SIZE = CONFIG_NR_CPUS * 4; static const int OWNER_HASH_BITS = ilog2(OWNER

[PATCH] powerpc/spufs: Fix duplicate definition of MAX_USER_PRIO

2014-02-11 Thread Peter Zijlstra
) is ((p)-MAX_RT_PRIO) the above two definitions are the same and we can simply remove the spufs one. Fixes: 6b6350f155af ("sched: Expose some macros related to priority") Reported-by: Fengguang Wu Signed-off-by: Peter Zijlstra --- arch/powerpc/platforms/cell/spufs/sched.c | 1 - 1 fi

Re: [PATCH v2 02/11] perf core: export swevent hrtimer helpers

2014-02-25 Thread Peter Zijlstra
On Tue, Feb 25, 2014 at 02:33:26PM +1100, Michael Ellerman wrote: > On Fri, 2014-14-02 at 22:02:06 UTC, Cody P Schafer wrote: > > Export the swevent hrtimer helpers currently only used in events/core.c > > to allow the addition of architecture specific sw-like pmus. > > Peter, Ingo, can we get you

Re: [PATCH v2 02/11] perf core: export swevent hrtimer helpers

2014-02-26 Thread Peter Zijlstra
On Tue, Feb 25, 2014 at 01:38:31PM -0800, Cody P Schafer wrote: > On 02/25/2014 02:20 AM, Peter Zijlstra wrote: > >On Tue, Feb 25, 2014 at 02:33:26PM +1100, Michael Ellerman wrote: > >>On Fri, 2014-14-02 at 22:02:06 UTC, Cody P Schafer wrote: > >>>Export the swevent

Re: [PATCH V3] mm: numa: bugfix for LAST_CPUPID_NOT_IN_PAGE_FLAGS

2014-02-28 Thread Peter Zijlstra
> Signed-off-by: Liu Ping Fan > Signed-off-by: Aneesh Kumar K.V Acked-by: Peter Zijlstra > --- > > include/linux/mm.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index f28f46eade6a..8624583

Re: [PATCH 0/2] sched: Removed unused mc_capable() and smt_capable()

2014-03-05 Thread Peter Zijlstra
On Tue, Mar 04, 2014 at 02:07:31PM -0700, Bjorn Helgaas wrote: > This is just cleanup of a couple unused interfaces and (for sparc64) a > supporting variable. > Thanks! ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org

Re: Tasks stuck in futex code (in 3.14-rc6)

2014-03-19 Thread Peter Zijlstra
On Wed, Mar 19, 2014 at 08:56:19PM +0530, Srikar Dronamraju wrote: > There are 332 tasks all stuck in futex_wait_queue_me(). > I am able to reproduce this consistently. > > In fact I can reproduce this if the java_constraint is either node, socket, > system. > However I am not able to reproduce if

Re: Tasks stuck in futex code (in 3.14-rc6)

2014-03-19 Thread Peter Zijlstra
On Wed, Mar 19, 2014 at 04:47:05PM +0100, Peter Zijlstra wrote: > > I reverted b0c29f79ecea0b6fbcefc999e70f2843ae8306db on top of v3.14-rc6 and > > confirmed that > > reverting the commit solved the problem. > > Joy,.. let me look at that with ppc in mind. OK; so w

Re: Tasks stuck in futex code (in 3.14-rc6)

2014-03-20 Thread Peter Zijlstra
On Thu, Mar 20, 2014 at 11:03:50AM +0530, Srikar Dronamraju wrote: > > > Joy,.. let me look at that with ppc in mind. > > > > OK; so while pretty much all the comments from that patch are utter > > nonsense (what was I thinking), I cannot actually find a real bug. > > > > But could you try the be

Re: [PATCH V2 1/2] mm: move FAULT_AROUND_ORDER to arch/

2014-04-09 Thread Peter Zijlstra
On Wed, Apr 09, 2014 at 07:02:02AM +0530, Madhavan Srinivasan wrote: > On Friday 04 April 2014 09:48 PM, Dave Hansen wrote: > > On 04/03/2014 11:27 PM, Madhavan Srinivasan wrote: > >> This patch creates infrastructure to move the FAULT_AROUND_ORDER > >> to arch/ using Kconfig. This will enable arch

Re: [PATCH 1/3] perf/e6500: Make event translations available in sysfs

2015-02-09 Thread Peter Zijlstra
On Fri, Feb 06, 2015 at 04:43:54PM -0600, Tom Huynh wrote: > arch/powerpc/perf/e6500-events-list.h | 289 > ++ That's a lot of events to stuff in the kernel, would a userspace list not be more convenient? ISTR there being various discussions on providing support f

Re: [PATCH 1/4] perf: Add 'flags' parameter to pmu txn interfaces

2015-03-16 Thread Peter Zijlstra
ags indicate whether the transaction is to add events to > the PMU (PERF_PMU_TXN_ADD) or to read the events (PERF_PMU_TXN_READ). > > Based on input from Peter Zijlstra. > > Signed-off-by: Sukadev Bhattiprolu > --- > arch/powerpc/perf/core-book3s.c | 15 --- > a

Re: [PATCH 3/4] perf: Add 'update' parameter to perf_event_read_value()

2015-03-16 Thread Peter Zijlstra
On Wed, Mar 04, 2015 at 12:35:07AM -0800, Sukadev Bhattiprolu wrote: > extern u64 perf_event_read_value(struct perf_event *event, > - u64 *enabled, u64 *running); > + u64 *enabled, u64 *running, int update); > I think someone recently s

Re: [PATCH 4/4] perf/powerpc: Implement group_read() txn interface for 24x7 counters

2015-03-16 Thread Peter Zijlstra
On Wed, Mar 04, 2015 at 12:35:08AM -0800, Sukadev Bhattiprolu wrote: > +++ b/kernel/events/core.c > @@ -3677,11 +3677,34 @@ u64 perf_event_read_value(struct perf_event *event, > u64 *enabled, u64 *running, > } > EXPORT_SYMBOL_GPL(perf_event_read_value); > > +static int do_pmu_group_read(struct

Re: [PATCH] hw_breakpoint: Fix Oops at destroying hw_breakpoint event on powerpc

2016-03-02 Thread Peter Zijlstra
On Wed, Mar 02, 2016 at 03:25:17PM +0530, Ravi Bangoria wrote: > At a time of destroying hw_breakpoint event, kernel ends up with Oops. > Here is the sample output from 4.5.0-rc6 kernel. > Call chain: > > hw_breakpoint_event_init() > bp->destroy = bp_perf_event_destroy; > > do_exit() >

Re: hw_breakpoint: Fix Oops at destroying hw_breakpoint event on powerpc

2016-03-02 Thread Peter Zijlstra
On Wed, Mar 02, 2016 at 10:53:24PM +1100, Michael Ellerman wrote: > Peterz, acme, do you guys want to take this? Or should I? I'm not too happy it's touching event->ctx at all. It really should not be doing that.

Re: hw_breakpoint: Fix Oops at destroying hw_breakpoint event on powerpc

2016-03-03 Thread Peter Zijlstra
On Thu, Mar 03, 2016 at 08:23:38PM +1100, Michael Ellerman wrote: > On Wed, 2016-03-02 at 12:59 +0100, Peter Zijlstra wrote: > > > On Wed, Mar 02, 2016 at 10:53:24PM +1100, Michael Ellerman wrote: > > > > Peterz, acme, do you guys want to take this? Or should I? >

Re: [PATCH 3/4] exit_thread: accept a task parameter to be exited

2016-03-24 Thread Peter Zijlstra
On Thu, Mar 24, 2016 at 01:58:13PM +0100, Jiri Slaby wrote: > void > -exit_thread(void) > +exit_thread(struct task_struct *me) > { > } task_struct arguments are called: tsk, task, p. 'me' seems very wrong, as that could only mean 'current', and it's clearly not that.

Re: [PATCH] powerpc: introduce {cmp}xchg for u8 and u16

2016-04-08 Thread Peter Zijlstra
On Fri, Apr 08, 2016 at 02:41:46PM +0800, Pan Xinhui wrote: > From: pan xinhui > > Implement xchg{u8,u16}{local,relaxed}, and > cmpxchg{u8,u16}{,local,acquire,relaxed}. > > Atomic operation on 8-bit and 16-bit data type is supported from power7 And yes I see nothing P7 specific here, this imple

Re: [PATCH] powerpc: introduce {cmp}xchg for u8 and u16

2016-04-12 Thread Peter Zijlstra
On Sun, Apr 10, 2016 at 10:17:28PM +0800, Pan Xinhui wrote: > > On 2016年04月08日 15:47, Peter Zijlstra wrote: > > On Fri, Apr 08, 2016 at 02:41:46PM +0800, Pan Xinhui wrote: > >> From: pan xinhui > >> > >> Implement xchg{u8,u16}{local,relaxed}, and >

Re: [PATCH V3] powerpc: Implement {cmp}xchg for u8 and u16

2016-04-20 Thread Peter Zijlstra
On Wed, Apr 20, 2016 at 09:24:00PM +0800, Pan Xinhui wrote: > +#define __XCHG_GEN(cmp, type, sfx, skip, v) \ > +static __always_inline unsigned long \ > +__cmpxchg_u32##sfx(v unsigned int *p, unsigned long old, \ > +

Re: [PATCH V3] powerpc: Implement {cmp}xchg for u8 and u16

2016-04-21 Thread Peter Zijlstra
On Thu, Apr 21, 2016 at 11:35:07PM +0800, Pan Xinhui wrote: > yes, you are right. more load/store will be done in C code. > However such xchg_u8/u16 is just used by qspinlock now. and I did not see any > performance regression. > So just wrote in C, for simple. :) Which is fine; but worthy of a n

Re: [PATCH V3] powerpc: Implement {cmp}xchg for u8 and u16

2016-04-25 Thread Peter Zijlstra
On Mon, Apr 25, 2016 at 06:10:51PM +0800, Pan Xinhui wrote: > > So I'm not actually _that_ familiar with the PPC LL/SC implementation; > > but there are things a CPU can do to optimize these loops. > > > > For example, a CPU might choose to not release the exclusive hold of the > > line for a numb

Re: [PATCH V4] powerpc: Implement {cmp}xchg for u8 and u16

2016-04-28 Thread Peter Zijlstra
> > Suggested-by: Peter Zijlstra (Intel) > Signed-off-by: Pan Xinhui Generally has the right shape; and I trust others to double check the ppc-asm minutia. Acked-by: Peter Zijlstra (Intel)

Re: [PATCH 1/8] jump_label: no need to acquire the jump_label_mutex in jump_lable_init()

2015-08-20 Thread Peter Zijlstra
On Thu, Aug 20, 2015 at 08:14:29PM +0800, Kevin Hao wrote: > The jump_label_init() runs in a very early stage, even before the > sched_init(). So there is no chance for concurrent access of the > jump label table. It also doesn't hurt to have it. It's better to be consistent and conservative with lo

Re: [PATCH 3/8] jump_label: introduce DEFINE_STATIC_KEY_{TRUE,FALSE}_ARRAY macros

2015-08-20 Thread Peter Zijlstra
On Thu, Aug 20, 2015 at 08:14:31PM +0800, Kevin Hao wrote: > These are used to define a static_key_{true,false} array. Yes but why... there might have been some clue in the patches you didn't send me, but since you didn't send them, I'm left wondering.

Re: [RFC 3/5] powerpc: atomic: implement atomic{,64}_{add,sub}_return_* variants

2015-08-28 Thread Peter Zijlstra
On Fri, Aug 28, 2015 at 10:48:17AM +0800, Boqun Feng wrote: > +/* > + * Since {add,sub}_return_relaxed and xchg_relaxed are implemented with > + * a "bne-" instruction at the end, so an isync is enough as a acquire > barrier > + * on the platform without lwsync. > + */ > +#ifdef CONFIG_SMP > +#def

Re: [RFC 2/5] atomics: introduce arch_atomic_op_{acquire,release,fence} helpers

2015-08-28 Thread Peter Zijlstra
On Fri, Aug 28, 2015 at 10:48:16AM +0800, Boqun Feng wrote: > Some architectures may have their special barriers for acquire, release > and fence semantics, general memory barriers(smp_mb__*_atomic()) in > __atomic_op_*() may be too strong, so arch_atomic_op_*() helpers are > introduced for archite

Re: [RFC 3/5] powerpc: atomic: implement atomic{,64}_{add,sub}_return_* variants

2015-08-28 Thread Peter Zijlstra
On Fri, Aug 28, 2015 at 10:16:02PM +0800, Boqun Feng wrote: > On Fri, Aug 28, 2015 at 08:06:14PM +0800, Boqun Feng wrote: > > Hi Peter, > > > > On Fri, Aug 28, 2015 at 12:48:54PM +0200, Peter Zijlstra wrote: > > > On Fri, Aug 28, 2015 at 10:4

Re: [PATCH v5 1/8] perf: Add a flags parameter to pmu txn interfaces

2015-09-01 Thread Peter Zijlstra
On Thu, Aug 13, 2015 at 11:49:34PM -0700, Sukadev Bhattiprolu wrote: I'm ever so sorry I keep going on about this, but.. > diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c > index d90893b..b18efe4 100644 > --- a/arch/powerpc/perf/core-book3s.c > +++ b/arch/powerpc/pe

Re: [[PATCH v6 09/10] powerpc/perf/hv-24x7: Use PMU_TXN_READ interface

2015-09-08 Thread Peter Zijlstra
nt to be read > > > > pmu->commit_txn() // Read/update all queuedcounters > > > > The ->commit_txn() also updates the event counts in the respective > > perf_event objects. The perf subsystem can then directly get the > > event c

Re: [RFC 3/5] powerpc: atomic: implement atomic{,64}_{add,sub}_return_* variants

2015-09-14 Thread Peter Zijlstra
Sorry for being tardy, I had a wee spell of feeling horrible and then I procrastinated longer than I should have. On Fri, Sep 11, 2015 at 01:45:07PM +0100, Will Deacon wrote: > Peter, any thoughts? I'm not au fait with the x86 memory model, but what > Paul's saying is worrying. Right, so Paul i

Re: [RFC 3/5] powerpc: atomic: implement atomic{,64}_{add,sub}_return_* variants

2015-09-14 Thread Peter Zijlstra
On Mon, Sep 14, 2015 at 01:35:20PM +0200, Peter Zijlstra wrote: > > Sorry for being tardy, I had a wee spell of feeling horrible and then I > procrastinated longer than I should have. > > On Fri, Sep 11, 2015 at 01:45:07PM +0100, Will Deacon wrote: > > > Peter, any t

Re: [RFC 3/5] powerpc: atomic: implement atomic{,64}_{add,sub}_return_* variants

2015-09-14 Thread Peter Zijlstra
On Mon, Sep 14, 2015 at 02:01:53PM +0200, Peter Zijlstra wrote: > The scenario is: > > CPU0CPU1 > > unlock(x) > smp_store_release(&x->lock, 0); > > unlock(y) >

Re: [RFC v2 4/7] powerpc: atomic: Implement xchg_* and atomic{,64}_xchg_* variants

2015-10-01 Thread Peter Zijlstra
On Wed, Sep 16, 2015 at 11:49:32PM +0800, Boqun Feng wrote: > Implement xchg_relaxed and define atomic{,64}_xchg_* as xchg_relaxed, > based on these _relaxed variants, release/acquire variants can be built. > > Note that xchg_relaxed and atomic_{,64}_xchg_relaxed are not compiler > barriers. Hmm,

Re: [RFC v2 5/7] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants

2015-10-01 Thread Peter Zijlstra
On Wed, Sep 16, 2015 at 11:49:33PM +0800, Boqun Feng wrote: > Unlike other atomic operation variants, cmpxchg{,64}_acquire and > atomic{,64}_cmpxchg_acquire don't have acquire semantics if the cmp part > fails, so we need to implement these using assembly. I think that is actually expected and doc

Re: [RFC v2 6/7] powerpc: atomic: Make atomic{,64}_xchg and xchg a full barrier

2015-10-01 Thread Peter Zijlstra
On Wed, Sep 16, 2015 at 11:49:34PM +0800, Boqun Feng wrote: > According to memory-barriers.txt, xchg and its atomic{,64}_ versions > need to imply a full barrier, however they are now just RELEASE+ACQUIRE, > which is not a full barrier. > > So remove the definition of xchg(), and let __atomic_op_f

Re: [RFC v2 5/7] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants

2015-10-01 Thread Peter Zijlstra
On Thu, Oct 01, 2015 at 02:27:15PM +0200, Peter Zijlstra wrote: > On Wed, Sep 16, 2015 at 11:49:33PM +0800, Boqun Feng wrote: > > Unlike other atomic operation variants, cmpxchg{,64}_acquire and > > atomic{,64}_cmpxchg_acquire don't have acquire semantics if the cmp part >

Re: [RFC v2 5/7] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants

2015-10-01 Thread Peter Zijlstra
On Thu, Oct 01, 2015 at 08:12:19AM -0700, Paul E. McKenney wrote: > What C11 does is to allow the developer to specify different orderings > on success and failure. But it is no harder to supply a barrier (if > needed) on the failure path, right? Quite right.
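The C11 facility mentioned above, separate orderings for the success and failure cases of a compare-exchange, looks roughly like this. It is a userspace sketch under assumptions: `try_claim` and `try_claim_fenced` are hypothetical names, and nothing here is the kernel's cmpxchg_acquire() implementation. The second helper illustrates the point that a caller who does want ordering on the failure path can supply a barrier there itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Acquire ordering only if the swap happens; relaxed if the compare fails. */
static bool try_claim(atomic_int *state, int expected, int desired)
{
	return atomic_compare_exchange_strong_explicit(
		state, &expected, desired,
		memory_order_acquire,    /* ordering on success */
		memory_order_relaxed);   /* ordering on failure */
}

/* Same operation, but with an explicit barrier added on the failure path. */
static bool try_claim_fenced(atomic_int *state, int expected, int desired)
{
	if (try_claim(state, expected, desired))
		return true;

	atomic_thread_fence(memory_order_acquire);
	return false;
}

static void claim_demo(void)
{
	atomic_int state;

	atomic_init(&state, 0);
	assert(try_claim(&state, 0, 1));          /* 0 -> 1 succeeds */
	assert(!try_claim(&state, 0, 2));         /* compare fails, no swap */
	assert(atomic_load(&state) == 1);
	assert(!try_claim_fenced(&state, 0, 2));  /* fails, then fences */
}
```

C11 requires that the failure ordering be no stronger than the success ordering and that it not be a release ordering, which is why acquire/relaxed is a valid pairing here.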

Re: [RFC v2 4/7] powerpc: atomic: Implement xchg_* and atomic{,64}_xchg_* variants

2015-10-01 Thread Peter Zijlstra
On Thu, Oct 01, 2015 at 08:09:09AM -0700, Paul E. McKenney wrote: > On Thu, Oct 01, 2015 at 02:24:40PM +0200, Peter Zijlstra wrote: > > I must say I'm somewhat surprised by this level of relaxation, I had > > expected to only lose SMP barriers, not the program order ones.

Re: [RFC v2 4/7] powerpc: atomic: Implement xchg_* and atomic{,64}_xchg_* variants

2015-10-01 Thread Peter Zijlstra
On Thu, Oct 01, 2015 at 11:03:01AM -0700, Paul E. McKenney wrote: > On Thu, Oct 01, 2015 at 07:13:04PM +0200, Peter Zijlstra wrote: > > On Thu, Oct 01, 2015 at 08:09:09AM -0700, Paul E. McKenney wrote: > > > On Thu, Oct 01, 2015 at 02:24:40PM +0200, Peter Zijlstra wrote: >

Re: [RFC v2 6/7] powerpc: atomic: Make atomic{,64}_xchg and xchg a full barrier

2015-10-01 Thread Peter Zijlstra
On Fri, Oct 02, 2015 at 07:19:04AM +0800, Boqun Feng wrote: > Hi Peter, > > Please forgive me for the format of my reply. I'm travelling, > and replying from my phone. > > 2015年10月1日 下午7:28,"Peter Zijlstra" 写道: > > > > On Wed, Sep 16, 2015 at 11:49:

Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

2015-10-08 Thread Peter Zijlstra
On Thu, Oct 08, 2015 at 02:50:36PM +1100, Michael Ellerman wrote: > On Wed, 2015-10-07 at 08:25 -0700, Paul E. McKenney wrote: > > Currently, we do need smp_mb__after_unlock_lock() to be after the > > acquisition on PPC -- putting it between the unlock and the lock > > of course doesn't cut it for

Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

2015-10-09 Thread Peter Zijlstra
On Thu, Oct 08, 2015 at 02:44:39PM -0700, Paul E. McKenney wrote: > > > > I am with Peter -- we do need the benchmark results for PPC. > > > > > > Urgh, sorry guys. I have been slowly doing some benchmarks, but time is > > > not > > > plentiful at the moment. > > > > > > If we do a straight lws

Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

2015-10-09 Thread Peter Zijlstra
On Thu, Oct 08, 2015 at 02:44:39PM -0700, Paul E. McKenney wrote: > On Thu, Oct 08, 2015 at 01:16:38PM +0200, Peter Zijlstra wrote: > > On Thu, Oct 08, 2015 at 02:50:36PM +1100, Michael Ellerman wrote: > > > On Wed, 2015-10-07 at 08:25 -0700, Paul E. McKenney wrote: > >

Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

2015-10-09 Thread Peter Zijlstra
On Fri, Oct 09, 2015 at 10:40:39AM +0100, Will Deacon wrote: > Stepping back a second, I believe that there are three cases: > > > RELEASE X -> ACQUIRE Y (same CPU) >* Needs a barrier on TSO architectures for full ordering +PPC > UNLOCK X -> LOCK Y (same CPU) >* Needs a barrier

Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

2015-10-09 Thread Peter Zijlstra
On Fri, Oct 09, 2015 at 10:40:39AM +0100, Will Deacon wrote: > > Which leads me to think I would like to suggest alternative rules for > > RELEASE/ACQUIRE (to replace those Will suggested; as I think those are > > partly responsible for my confusion). > > Yeah, sorry. I originally used the phrase

Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

2015-10-09 Thread Peter Zijlstra
On Fri, Oct 09, 2015 at 10:40:39AM +0100, Will Deacon wrote: > > > - RELEASE -> ACQUIRE _chains_ (on shared variables) preserve causality, > >(because each link is fully ordered) but are not transitive. > > Yup, and that's the same for UNLOCK -> LOCK, too. Agreed, except RELEASE/ACQUIRE is

Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

2015-10-09 Thread Peter Zijlstra
On Fri, Oct 09, 2015 at 10:51:29AM +0100, Will Deacon wrote: > > The corresponding litmus tests are below. > > How do people feel about including these in memory-barriers.txt? I find > them considerably easier to read than our current kernel code + list of > possible orderings + wall of text, but

Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

2015-10-09 Thread Peter Zijlstra
On Fri, Oct 09, 2015 at 01:51:11PM +0100, Will Deacon wrote: > On Fri, Oct 09, 2015 at 01:12:02PM +0200, Peter Zijlstra wrote: > > On Fri, Oct 09, 2015 at 10:40:39AM +0100, Will Deacon wrote: > > > > Which leads me to think I would like to suggest alternative rules for >

Re: [RFC v2 5/7] powerpc: atomic: Implement cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants

2015-10-11 Thread Peter Zijlstra
On Sun, Oct 11, 2015 at 06:25:20PM +0800, Boqun Feng wrote: > On Sat, Oct 10, 2015 at 09:58:05AM +0800, Boqun Feng wrote: > > Hi Peter, > > > > Sorry for replying late. > > > > On Thu, Oct 01, 2015 at 02:27:16PM +0200, Peter Zijlstra wrote: > > > On We

Re: [PATCH v3 0/6] atomics: powerpc: Implement relaxed/acquire/release variants of some atomics

2015-10-13 Thread Peter Zijlstra
On Mon, Oct 12, 2015 at 10:14:00PM +0800, Boqun Feng wrote: > The patchset consists of 6 parts: > > 1.Make xchg, cmpxchg and their atomic_ versions a full barrier > > 2.Add trivial tests for the new variants in lib/atomic64_test.c > > 3.Allow architectures to define their own __ato

Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier

2015-10-14 Thread Peter Zijlstra
On Wed, Oct 14, 2015 at 08:51:34AM +0800, Boqun Feng wrote: > On Wed, Oct 14, 2015 at 11:10:00AM +1100, Michael Ellerman wrote: > > Thanks for fixing this. In future you should send a patch like this as a > > separate patch. I've not been paying attention to it because I assumed it > > was > > G

Re: [PATCH RESEND v3 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier

2015-10-14 Thread Peter Zijlstra
On Wed, Oct 14, 2015 at 05:26:53PM +0800, Boqun Feng wrote: > Michael and Peter, rest of this patchset depends on commits which are > currently in the locking/core branch of the tip, so I would like it as a > whole queued there. Besides, I will keep this patch Cc'ed to stable in > future versions,

Re: [PATCH tip/locking/core v4 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier

2015-10-14 Thread Peter Zijlstra
On Wed, Oct 14, 2015 at 01:19:17PM -0700, Paul E. McKenney wrote: > Suppose we have something like the following, where "a" and "x" are both > initially zero: > > CPU 0 CPU 1 > - - > > WRITE_ONCE(x, 1); WR

Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

2015-10-19 Thread Peter Zijlstra
On Mon, Oct 19, 2015 at 09:17:18AM +0800, Boqun Feng wrote: > This is confusing me right now. ;-) > > Let's use a simple example for only one primitive, as I understand it, > if we say a primitive A is "fully ordered", we actually mean: > > 1.The memory operations preceding(in program order)

Re: [PATCH tip/locking/core v4 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier

2015-10-20 Thread Peter Zijlstra
On Tue, Oct 20, 2015 at 03:15:32PM +0800, Boqun Feng wrote: > On Wed, Oct 14, 2015 at 01:19:17PM -0700, Paul E. McKenney wrote: > > > > Am I missing something here? If not, it seems to me that you need > > the leading lwsync to instead be a sync. > > > > Of course, if I am not missing something,

Re: [PATCH tip/locking/core v4 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier

2015-10-21 Thread Peter Zijlstra
On Tue, Oct 20, 2015 at 02:28:35PM -0700, Paul E. McKenney wrote: > I am not seeing a sync there, but I really have to defer to the > maintainers on this one. I could easily have missed one. So x86 implies a full barrier for everything that changes the CPL; and some form of implied ordering seems

Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

2015-10-21 Thread Peter Zijlstra
On Tue, Oct 20, 2015 at 04:34:51PM -0700, Paul E. McKenney wrote: > There is also the question of whether the barrier forces ordering > of unrelated stores, everything initially zero and all accesses > READ_ONCE() or WRITE_ONCE(): > > P0 P1 P2 P3 >

Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

2015-10-21 Thread Peter Zijlstra
On Wed, Oct 21, 2015 at 12:29:23PM -0700, Paul E. McKenney wrote: > On Wed, Oct 21, 2015 at 10:24:52AM +0200, Peter Zijlstra wrote: > > On Tue, Oct 20, 2015 at 04:34:51PM -0700, Paul E. McKenney wrote: > > > There is also the question of whether the barrier forces ordering

Re: [PATCH tip/locking/core v4 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier

2015-10-21 Thread Peter Zijlstra
On Wed, Oct 21, 2015 at 12:35:23PM -0700, Paul E. McKenney wrote: > > > > > I ask this because I recall Peter once brought up a discussion: > > > > > > > > > > https://lkml.org/lkml/2015/8/26/596 > > So a full barrier on one side of these operations is enough, I think. > > IOW, there is no need to

Re: [PATCH tip/locking/core v4 1/6] powerpc: atomic: Make *xchg and *cmpxchg a full barrier

2015-10-24 Thread Peter Zijlstra
On Thu, Oct 22, 2015 at 08:07:16PM +0800, Boqun Feng wrote: > On Wed, Oct 21, 2015 at 09:48:25PM +0200, Peter Zijlstra wrote: > > On Wed, Oct 21, 2015 at 12:35:23PM -0700, Paul E. McKenney wrote: > > > > > > > I ask this because I recall Pet

Re: [PATCH powerpc/next 1/2] powerpc: Make value-returning atomics fully ordered

2015-11-02 Thread Peter Zijlstra
ed, which can avoid possible > memory ordering problems if userspace code relies on futex system call > for fully ordered semantics. > > Cc: # 3.4+ > Signed-off-by: Boqun Feng Acked-by: Peter Zijlstra (Intel)

Re: [PATCH powerpc/next 2/2] powerpc: Make {cmp}xchg* and their atomic_ versions fully ordered

2015-11-02 Thread Peter Zijlstra
mic_xxx_return barrier semantics") > > This patch depends on patch "powerpc: Make value-returning atomics fully > ordered" for PPC_ATOMIC_ENTRY_BARRIER definition. > > Cc: # 3.4+ > Signed-off-by: Boqun Feng Acked-by: Peter Zijlstra (Intel)

Re: [RFC PATCH 0/3]perf/core: extend perf_reg and perf_sample_regs_intr

2015-11-05 Thread Peter Zijlstra
On Thu, Nov 05, 2015 at 02:16:15AM +0530, Madhavan Srinivasan wrote: > Second patch updates struct arch_misc_reg for arch/powerpc with pmu registers > and adds offsetof macro for the same. It extends perf_reg_value() > to use reg idx to decide on struct to return value from. Why; what's in those r

Re: [RFC PATCH 0/3]perf/core: extend perf_reg and perf_sample_regs_intr

2015-11-06 Thread Peter Zijlstra
On Fri, Nov 06, 2015 at 12:57:17PM +0530, Madhavan Srinivasan wrote: > > > On Thursday 05 November 2015 06:37 PM, Peter Zijlstra wrote: > > On Thu, Nov 05, 2015 at 02:16:15AM +0530, Madhavan Srinivasan wrote: > >> Second patch updates struct arch_misc_reg f

Re: [RFC PATCH 0/3]perf/core: extend perf_reg and perf_sample_regs_intr

2015-11-06 Thread Peter Zijlstra
On Fri, Nov 06, 2015 at 09:04:00PM +1100, Michael Ellerman wrote: > It's a perennial request from our hardware PMU folks to be able to see the raw > values of the PMU registers. > > I think partly it's so that they can verify that perf is doing what they want, > and some of it is that they're inte

Re: [PATCH v3 4/4] printk/nmi: Increase the size of NMI buffer and make it configurable

2015-12-18 Thread Peter Zijlstra
On Fri, Dec 18, 2015 at 10:18:08AM +0000, Daniel Thompson wrote: > I'm not entirely sure that this is an improvement. What I do these days is delete everything in vprintk_emit() and simply call early_printk(). Kill the useless kmsg buffer crap and locking, just pound bytes to the UART registers w

Re: [PATCH v3 4/4] printk/nmi: Increase the size of NMI buffer and make it configurable

2015-12-18 Thread Peter Zijlstra
On Fri, Dec 18, 2015 at 12:29:02PM +0100, Peter Zijlstra wrote: > On Fri, Dec 18, 2015 at 10:18:08AM +, Daniel Thompson wrote: > > I'm not entirely sure that this is an improvement. > > What I do these days is delete everything in vprintk_emit() and simply > call e

Re: [PATCH v2 06/32] s390: reuse asm-generic/barrier.h

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:06:30PM +0200, Michael S. Tsirkin wrote: > On s390 read_barrier_depends, smp_read_barrier_depends > smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the > asm-generic variants exactly. Drop the local definitions and pull in > asm-generic/barrier.h inst

Re: [PATCH v2 11/32] mips: reuse asm-generic/barrier.h

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:07:10PM +0200, Michael S. Tsirkin wrote: > -#define smp_store_release(p, v) > \ > -do { \ > - compiletime_assert_atomic_type(*p);

Re: [PATCH v2 17/32] arm: define __smp_xxx

2016-01-04 Thread Peter Zijlstra
On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote: > On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux wrote: > > My only concern is that it gives people an additional handle onto a > > "new" set of barriers - just because they're prefixed with __* > > unfortunate

Re: [PATCH v2 20/32] metag: define __smp_xxx

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:08:22PM +0200, Michael S. Tsirkin wrote: > +#ifdef CONFIG_SMP > +#define fence() metag_fence() > +#else > +#define fence() do { } while (0) > #endif James, it strikes me as odd that fence() is a no-op instead of a barrier() for UP, can you verify/explain? _

Re: [PATCH v2 22/32] s390: define __smp_xxx

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote: > This defines __smp_xxx barriers for s390, > for use by virtualization. > > Some smp_xxx barriers are removed as they are > defined correctly by asm-generic/barrier.h > > Note: smp_mb, smp_rmb and smp_wmb are defined as full ba

Re: [PATCH v2 17/32] arm: define __smp_xxx

2016-01-04 Thread Peter Zijlstra
On Mon, Jan 04, 2016 at 02:36:58PM +0100, Peter Zijlstra wrote: > On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote: > > On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux wrote: > > > > My only concern is that it gives people an add

Re: [PATCH v2 31/32] sh: support a 2-byte smp_store_mb

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:09:47PM +0200, Michael S. Tsirkin wrote: > At the moment, xchg on sh only supports 4 and 1 byte values, so using it > from smp_store_mb means attempts to store a 2 byte value using this > macro fail. > > And happens to be exactly what virtio drivers want to do. > > Chec

Re: [PATCH v2 33/34] xenbus: use virt_xxx barriers

2016-01-04 Thread Peter Zijlstra
On Thu, Dec 31, 2015 at 09:10:01PM +0200, Michael S. Tsirkin wrote: > drivers/xen/xenbus/xenbus_comms.c uses > full memory barriers to communicate with the other side. > > For guests compiled with CONFIG_SMP, smp_wmb and smp_mb > would be sufficient, so mb() and wmb() here are only needed if > a n

Re: [PATCH v2 20/32] metag: define __smp_xxx

2016-01-04 Thread Peter Zijlstra
On Mon, Jan 04, 2016 at 03:25:58PM +, James Hogan wrote: > It is used along with the metag specific __global_lock1() (global > voluntary lock between hw threads) whenever a write is performed, and by > smp_mb/smp_rmb to try to catch other cases, but I've never been > confident this fixes every

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-12 Thread Peter Zijlstra
On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote: > This statement doesn't fit MIPS barriers variations. Moreover, there is a > reason to extend that even more specific, at least for smp_store_release and > smp_load_acquire, look into > > http://patchwork.linux-mips.org/patch/1

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-12 Thread Peter Zijlstra
On Tue, Jan 12, 2016 at 10:43:36AM +0200, Michael S. Tsirkin wrote: > On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote: > > On 01/10/2016 06:18 AM, Michael S. Tsirkin wrote: > > >On mips dma_rmb, dma_wmb, smp_store_mb, read_barrier_depends, > > >smp_read_barrier_depends, smp_store_re

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-12 Thread Peter Zijlstra
On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote: > 2) the changelog _completely_ fails to explain the sync 0x11 and sync > 0x12 semantics nor does it provide a publicly accessible link to > documentation that does. Ralf pointed me at: https://imgtec.com/mips/architectur

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-12 Thread Peter Zijlstra
On Tue, Jan 12, 2016 at 11:25:55AM +0100, Peter Zijlstra wrote: > On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote: > > 2) the changelog _completely_ fails to explain the sync 0x11 and sync > > 0x12 semantics nor does it provide a publicly accessible link to > >

Re: [PATCH v3 00/41] arch: barrier cleanup + barriers for virt

2016-01-12 Thread Peter Zijlstra
> duplicate patch, and assume conflict will be resolved. > > I would really appreciate some feedback on arch bits (especially the x86 > bits), > and acks for merging this through the vhost tree. Thanks for doing this, looks good to me. Acke

Re: [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-12 Thread Peter Zijlstra
. > > 3. I bother MIPS Arch team long time until I completely understood that MIPS > SYNC_WMB, SYNC_MB, SYNC_RMB, SYNC_RELEASE and SYNC_ACQUIRE do an exactly > that is required in Documentation/memory-barriers.txt Ha! and you think that document covers all the really fun details? In part
