On Thu, Jan 30, 2014 at 08:41:16AM -0800, Joe Perches wrote:
> Perhaps you could use a newer version of patch
>
> GNU patch version 2.7 released
Yeah, I know about that; I'll wait until it's common in all distros.
Updating all machines I use by hand is just painful.
On Thu, Jan 30, 2014 at 06:28:52PM +0100, Daniel Lezcano wrote:
> Ok, I think the mess is coming from 'default_idle', which does not re-enable
> the local_irq but is used from different places like amd_e400_idle and
> apm_cpu_idle.
>
> void default_idle(void)
> {
> trace_cpu_idle_rcuidle(1, sm
On Thu, Feb 06, 2014 at 11:37:37AM +0100, Torsten Duwe wrote:
> x86 has them, MIPS has them, ARM has them, even ia64 has them:
> ticket locks. They reduce memory bus and cache pressure especially
> for contended spinlocks, increasing performance.
>
> This patch is a port of the x86 spin locks, mos
On Thu, Feb 06, 2014 at 02:09:59PM +, Nicolas Pitre wrote:
> Hi Peter,
>
> Did you merge those patches in your tree?
tree, tree, what's in a word. It's in my patch stack, yes. I should get
some of that into tip I suppose; been side-tracked a bit this week.
Sorry for the delay.
> If so, is it
On Thu, Feb 06, 2014 at 06:37:27PM +0100, Torsten Duwe wrote:
> On Thu, Feb 06, 2014 at 05:38:37PM +0100, Peter Zijlstra wrote:
> > On Thu, Feb 06, 2014 at 11:37:37AM +0100, Torsten Duwe wrote:
> > > x86 has them, MIPS has them, ARM has them, even ia64 has them:
> > >
On Fri, Feb 07, 2014 at 10:02:48AM +0100, Torsten Duwe wrote:
> On Thu, Feb 06, 2014 at 02:19:52PM -0600, Scott Wood wrote:
> > On Thu, 2014-02-06 at 18:37 +0100, Torsten Duwe wrote:
> > > On Thu, Feb 06, 2014 at 05:38:37PM +0100, Peter Zijlstra wrote:
> >
> > >
> So if you have ll/sc on the whole word concurrent with the half-word
> store, you can lose the half-word store like:
>
> lwarx &tickets
> ... sth &tail
> stwcx. &tickets
>
>
> The stwcx. will overwrite the tail store.
Oh wait, that's stupid, it will invalidate the lock a
On Fri, Feb 07, 2014 at 11:31:39AM +0100, Peter Zijlstra wrote:
> Anyway, what might work is something like (please forgive my ppc asm, I
> can barely read the thing, I've never before attempted writing it):
>
> lock:
> 1: lharx %0, 0, &head
> mov %1, %
On Fri, Feb 07, 2014 at 12:49:49PM +0100, Torsten Duwe wrote:
> On Fri, Feb 07, 2014 at 11:45:30AM +0100, Peter Zijlstra wrote:
> >
> > That might need to be lhz too, I'm confused on all the load variants.
>
> ;-)
>
> > > unlock:
> > > lhz
On Fri, Feb 07, 2014 at 11:09:23AM +, Nicolas Pitre wrote:
> On Thu, 6 Feb 2014, Peter Zijlstra wrote:
> > tree, tree, what's in a word.
>
> Something you may plant on a patch of grass? "Merging" becomes a
> strange concept in that context though. :-)
On Fri, Feb 07, 2014 at 05:11:26PM +0530, Preeti U Murthy wrote:
> But observe the idle state "snooze" on powerpc. The power that this idle
> state saves is through the lowering of the thread priority of the CPU.
> After it lowers the thread priority, it is done. It cannot
> "wait_for_interrupts".
On Fri, Feb 07, 2014 at 01:28:37PM +0100, Peter Zijlstra wrote:
> Anyway, you can do a version with lwarx/stwcx if you're looking get rid
> of lharx.
the below seems to compile into relatively ok asm. It can be done better
if you write the entire thing by hand though.
---
typedef uns
On Fri, Feb 07, 2014 at 04:18:47PM +0100, Peter Zijlstra wrote:
> void ticket_lock(tickets_t *lock)
> {
> tickets_t t;
>
> /*
>* Because @head is MSB, the direct increment wrap doesn't disturb
>* @tail.
>*/
>
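Peter's C sketch is cut off above, but the packed-word idea it describes can be rendered in portable C11. Everything below is an illustrative reconstruction under assumed names (the `tickets_t` layout, 16-bit halves, and a compare-exchange loop standing in for the ll/sc or halfword-store update debated earlier in the thread); it is not the actual ppc code:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical packing following the quoted comment: @head lives in the
 * most-significant half, @tail in the low half. */
typedef struct { _Atomic uint32_t tickets; } tickets_t;

static void ticket_lock(tickets_t *lock)
{
    uint32_t old, new;
    uint16_t mine;

    /* Take the next tail ticket with a CAS loop on the whole word,
     * standing in for the halfword ll/sc update under discussion. */
    old = atomic_load_explicit(&lock->tickets, memory_order_relaxed);
    do {
        mine = (uint16_t)old;                          /* tail we get */
        new = (old & 0xFFFF0000u) | (uint16_t)(mine + 1);
    } while (!atomic_compare_exchange_weak_explicit(&lock->tickets, &old, new,
                 memory_order_relaxed, memory_order_relaxed));

    /* Spin until head (the MSB half) reaches our ticket; the acquire
     * load orders the critical section. */
    while ((uint16_t)(atomic_load_explicit(&lock->tickets,
                          memory_order_acquire) >> 16) != mine)
        ;
}

static void ticket_unlock(tickets_t *lock)
{
    /* Head is the MSB half: a direct full-word add of 1<<16 wraps off
     * the top of the word and cannot disturb tail, per the comment. */
    atomic_fetch_add_explicit(&lock->tickets, 1u << 16, memory_order_release);
}
```

The CAS loop can fail spuriously when an unlocker bumps head concurrently; that is harmless, it just retries with the fresh value.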
On Fri, Feb 07, 2014 at 09:51:16AM -0600, Kumar Gala wrote:
>
> On Feb 7, 2014, at 3:02 AM, Torsten Duwe wrote:
>
> > On Thu, Feb 06, 2014 at 02:19:52PM -0600, Scott Wood wrote:
> >> On Thu, 2014-02-06 at 18:37 +0100, Torsten Duwe wrote:
> >>> On Thu, Feb
On Fri, Feb 07, 2014 at 06:08:45PM +0100, Torsten Duwe wrote:
> > static inline unsigned int xadd(unsigned int *v, unsigned int i)
> > {
> > int t, ret;
> >
> > __asm__ __volatile__ (
> > "1: lwarx %0, 0, %4\n"
> > " mr %1, %0\n"
> > " add %0, %3, %0\n"
> > " stwcx. %
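The asm above is cut off, but its shape is the classic lwarx/stwcx. fetch-and-add. A portable stand-in (not the ppc implementation; the `_Atomic` qualifier on the pointer is my addition) expressing the same operation:

```c
#include <stdatomic.h>

/* Portable sketch of the quoted xadd() helper: return the old value of
 * *v and store old + i, as the lwarx/stwcx. retry loop does on ppc. */
static unsigned int xadd(_Atomic unsigned int *v, unsigned int i)
{
    return atomic_fetch_add_explicit(v, i, memory_order_seq_cst);
}
```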
On Fri, Feb 07, 2014 at 05:58:01PM +0100, Torsten Duwe wrote:
> +static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
> {
> + register struct __raw_tickets old, tmp,
> + inc = { .tail = TICKET_LOCK_INC };
> +
> CLEAR_IO_SYNC;
> + __asm__ __volatile__(
> +"1:
On Mon, Feb 10, 2014 at 04:52:17PM +0100, Torsten Duwe wrote:
> Opinions, anyone?
Since the holder thing is a performance thing, not a correctness thing;
one thing you could do is something like:
static const int OWNER_HASH_SIZE = CONFIG_NR_CPUS * 4;
static const int OWNER_HASH_BITS = ilog2(OWNER
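Peter's suggestion is truncated above, but the gist is a small hash table of lock owners, which is tolerable precisely because (as he notes) the holder field is a performance hint, not a correctness requirement, so hash collisions only cost accuracy. A hypothetical C rendition, with `NR_CPUS` and the hash function standing in for the kernel's `CONFIG_NR_CPUS` and `ilog2()` machinery:

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS         64              /* stand-in for CONFIG_NR_CPUS */
#define OWNER_HASH_SIZE (NR_CPUS * 4)

static int owner_hash[OWNER_HASH_SIZE]; /* cpu believed to hold each lock */

/* Hash a lock address into the table so a holder can record itself
 * without widening the lock word. */
static size_t owner_idx(const void *lock)
{
    uintptr_t p = (uintptr_t)lock;
    return (p >> 4) % OWNER_HASH_SIZE;  /* drop alignment bits */
}

static void lock_set_owner(const void *lock, int cpu)
{
    owner_hash[owner_idx(lock)] = cpu;
}

static int lock_get_owner(const void *lock)
{
    return owner_hash[owner_idx(lock)];
}
```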
) is ((p)-MAX_RT_PRIO) the above two definitions are
the same and we can simply remove the spufs one.
Fixes: 6b6350f155af ("sched: Expose some macros related to priority")
Reported-by: Fengguang Wu
Signed-off-by: Peter Zijlstra
---
arch/powerpc/platforms/cell/spufs/sched.c | 1 -
1 fi
On Tue, Feb 25, 2014 at 02:33:26PM +1100, Michael Ellerman wrote:
> On Fri, 2014-02-14 at 22:02:06 UTC, Cody P Schafer wrote:
> > Export the swevent hrtimer helpers currently only used in events/core.c
> > to allow the addition of architecture specific sw-like pmus.
>
> Peter, Ingo, can we get you
On Tue, Feb 25, 2014 at 01:38:31PM -0800, Cody P Schafer wrote:
> On 02/25/2014 02:20 AM, Peter Zijlstra wrote:
> >On Tue, Feb 25, 2014 at 02:33:26PM +1100, Michael Ellerman wrote:
> >>On Fri, 2014-02-14 at 22:02:06 UTC, Cody P Schafer wrote:
> >>>Export the swevent
t; Signed-off-by: Liu Ping Fan
> Signed-off-by: Aneesh Kumar K.V
Acked-by: Peter Zijlstra
> ---
>
> include/linux/mm.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index f28f46eade6a..8624583
On Tue, Mar 04, 2014 at 02:07:31PM -0700, Bjorn Helgaas wrote:
> This is just cleanup of a couple unused interfaces and (for sparc64) a
> supporting variable.
>
Thanks!
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org
On Wed, Mar 19, 2014 at 08:56:19PM +0530, Srikar Dronamraju wrote:
> There are 332 tasks all stuck in futex_wait_queue_me().
> I am able to reproduce this consistently.
>
> In fact I can reproduce this if the java_constraint is either node, socket,
> or system.
> However I am not able to reproduce if
On Wed, Mar 19, 2014 at 04:47:05PM +0100, Peter Zijlstra wrote:
> > I reverted b0c29f79ecea0b6fbcefc999e70f2843ae8306db on top of v3.14-rc6 and
> > confirmed that
> > reverting the commit solved the problem.
>
> Joy,.. let me look at that with ppc in mind.
OK; so w
On Thu, Mar 20, 2014 at 11:03:50AM +0530, Srikar Dronamraju wrote:
> > > Joy,.. let me look at that with ppc in mind.
> >
> > OK; so while pretty much all the comments from that patch are utter
> > nonsense (what was I thinking), I cannot actually find a real bug.
> >
> > But could you try the be
On Wed, Apr 09, 2014 at 07:02:02AM +0530, Madhavan Srinivasan wrote:
> On Friday 04 April 2014 09:48 PM, Dave Hansen wrote:
> > On 04/03/2014 11:27 PM, Madhavan Srinivasan wrote:
> >> This patch creates infrastructure to move the FAULT_AROUND_ORDER
> >> to arch/ using Kconfig. This will enable arch
On Fri, Feb 06, 2015 at 04:43:54PM -0600, Tom Huynh wrote:
> arch/powerpc/perf/e6500-events-list.h | 289 ++
That's a lot of events to stuff in the kernel, would a userspace list
not be more convenient?
ISTR there being various discussions on providing support f
ags indicate whether the transaction is to add events to
> the PMU (PERF_PMU_TXN_ADD) or to read the events (PERF_PMU_TXN_READ).
>
> Based on input from Peter Zijlstra.
>
> Signed-off-by: Sukadev Bhattiprolu
> ---
> arch/powerpc/perf/core-book3s.c | 15 ---
> a
On Wed, Mar 04, 2015 at 12:35:07AM -0800, Sukadev Bhattiprolu wrote:
> extern u64 perf_event_read_value(struct perf_event *event,
> - u64 *enabled, u64 *running);
> + u64 *enabled, u64 *running, int update);
>
I think someone recently s
On Wed, Mar 04, 2015 at 12:35:08AM -0800, Sukadev Bhattiprolu wrote:
> +++ b/kernel/events/core.c
> @@ -3677,11 +3677,34 @@ u64 perf_event_read_value(struct perf_event *event,
> u64 *enabled, u64 *running,
> }
> EXPORT_SYMBOL_GPL(perf_event_read_value);
>
> +static int do_pmu_group_read(struct
On Wed, Mar 02, 2016 at 03:25:17PM +0530, Ravi Bangoria wrote:
> At a time of destroying hw_breakpoint event, kernel ends up with Oops.
> Here is the sample output from 4.5.0-rc6 kernel.
> Call chain:
>
> hw_breakpoint_event_init()
> bp->destroy = bp_perf_event_destroy;
>
> do_exit()
>
On Wed, Mar 02, 2016 at 10:53:24PM +1100, Michael Ellerman wrote:
> Peterz, acme, do you guys want to take this? Or should I?
I'm not too happy it's touching event->ctx at all. It really should not
be doing that.
On Thu, Mar 03, 2016 at 08:23:38PM +1100, Michael Ellerman wrote:
> On Wed, 2016-03-02 at 12:59 +0100, Peter Zijlstra wrote:
>
> > On Wed, Mar 02, 2016 at 10:53:24PM +1100, Michael Ellerman wrote:
>
> > > Peterz, acme, do you guys want to take this? Or should I?
>
On Thu, Mar 24, 2016 at 01:58:13PM +0100, Jiri Slaby wrote:
> void
> -exit_thread(void)
> +exit_thread(struct task_struct *me)
> {
> }
task_struct arguments are called: tsk, task, p.
'me' seems very wrong, as that could only mean 'current', and it's
clearly not that.
On Fri, Apr 08, 2016 at 02:41:46PM +0800, Pan Xinhui wrote:
> From: pan xinhui
>
> Implement xchg{u8,u16}{local,relaxed}, and
> cmpxchg{u8,u16}{,local,acquire,relaxed}.
>
> Atomic operation on 8-bit and 16-bit data type is supported from power7
And yes I see nothing P7 specific here, this imple
On Sun, Apr 10, 2016 at 10:17:28PM +0800, Pan Xinhui wrote:
>
> On 2016-04-08 15:47, Peter Zijlstra wrote:
> > On Fri, Apr 08, 2016 at 02:41:46PM +0800, Pan Xinhui wrote:
> >> From: pan xinhui
> >>
> >> Implement xchg{u8,u16}{local,relaxed}, and
>
On Wed, Apr 20, 2016 at 09:24:00PM +0800, Pan Xinhui wrote:
> +#define __XCHG_GEN(cmp, type, sfx, skip, v) \
> +static __always_inline unsigned long \
> +__cmpxchg_u32##sfx(v unsigned int *p, unsigned long old, \
> +
On Thu, Apr 21, 2016 at 11:35:07PM +0800, Pan Xinhui wrote:
> yes, you are right. more load/store will be done in C code.
> However such xchg_u8/u16 is just used by qspinlock now, and I did not see any
> performance regression.
> So I just wrote it in C, for simplicity. :)
Which is fine; but worthy of a n
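The C approach under discussion, building a byte-wide exchange from a full-word compare-exchange loop, can be sketched portably. The extra loads and stores are exactly the "more load/store will be done in C code" trade-off from the mail; the helper name and the byte-index convention are my assumptions:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Exchange byte number `byte` of the 32-bit *value* in `word`, using a
 * compare-exchange loop on the containing word. The shift indexes bytes
 * of the value, so how that maps onto memory depends on endianness. */
static uint8_t xchg_u8(_Atomic uint32_t *word, unsigned byte, uint8_t new)
{
    uint32_t shift = byte * 8;
    uint32_t mask = 0xFFu << shift;
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t val;

    do {
        /* Keep the other three bytes, splice in the new one. */
        val = (old & ~mask) | ((uint32_t)new << shift);
    } while (!atomic_compare_exchange_weak_explicit(word, &old, val,
                 memory_order_seq_cst, memory_order_relaxed));

    return (uint8_t)(old >> shift);   /* previous value of that byte */
}
```

A u16 variant is the same loop with a 0xFFFF mask; only the naturally aligned cases keep the word access contained.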
On Mon, Apr 25, 2016 at 06:10:51PM +0800, Pan Xinhui wrote:
> > So I'm not actually _that_ familiar with the PPC LL/SC implementation;
> > but there are things a CPU can do to optimize these loops.
> >
> > For example, a CPU might choose to not release the exclusive hold of the
> > line for a numb
>
> Suggested-by: Peter Zijlstra (Intel)
> Signed-off-by: Pan Xinhui
Generally has the right shape; and I trust others to double check the
ppc-asm minutia.
Acked-by: Peter Zijlstra (Intel)
On Thu, Aug 20, 2015 at 08:14:29PM +0800, Kevin Hao wrote:
> The jump_label_init() run in a very early stage, even before the
> sched_init(). So there is no chance for concurrent access of the
> jump label table.
It also doesn't hurt to have it. It's better to be consistent and
conservative with lo
On Thu, Aug 20, 2015 at 08:14:31PM +0800, Kevin Hao wrote:
> These are used to define a static_key_{true,false} array.
Yes but why...
there might have been some clue in the patches you didn't send me, but
since you didn't send them, I'm left wondering.
On Fri, Aug 28, 2015 at 10:48:17AM +0800, Boqun Feng wrote:
> +/*
> + * Since {add,sub}_return_relaxed and xchg_relaxed are implemented with
> + * a "bne-" instruction at the end, an isync is enough as an acquire barrier
> + * on the platform without lwsync.
> + */
> +#ifdef CONFIG_SMP
> +#def
On Fri, Aug 28, 2015 at 10:48:16AM +0800, Boqun Feng wrote:
> Some architectures may have their special barriers for acquire, release
> and fence semantics, general memory barriers(smp_mb__*_atomic()) in
> __atomic_op_*() may be too strong, so arch_atomic_op_*() helpers are
> introduced for archite
On Fri, Aug 28, 2015 at 10:16:02PM +0800, Boqun Feng wrote:
> On Fri, Aug 28, 2015 at 08:06:14PM +0800, Boqun Feng wrote:
> > Hi Peter,
> >
> > On Fri, Aug 28, 2015 at 12:48:54PM +0200, Peter Zijlstra wrote:
> > > On Fri, Aug 28, 2015 at 10:4
On Thu, Aug 13, 2015 at 11:49:34PM -0700, Sukadev Bhattiprolu wrote:
I'm ever so sorry I keep going on about this, but..
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index d90893b..b18efe4 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/pe
nt to be read
> >
> > pmu->commit_txn() // Read/update all queuedcounters
> >
> > The ->commit_txn() also updates the event counts in the respective
> > perf_event objects. The perf subsystem can then directly get the
> > event c
Sorry for being tardy, I had a wee spell of feeling horrible and then I
procrastinated longer than I should have.
On Fri, Sep 11, 2015 at 01:45:07PM +0100, Will Deacon wrote:
> Peter, any thoughts? I'm not au fait with the x86 memory model, but what
> Paul's saying is worrying.
Right, so Paul i
On Mon, Sep 14, 2015 at 01:35:20PM +0200, Peter Zijlstra wrote:
>
> Sorry for being tardy, I had a wee spell of feeling horrible and then I
> procrastinated longer than I should have.
>
> On Fri, Sep 11, 2015 at 01:45:07PM +0100, Will Deacon wrote:
>
> > Peter, any t
On Mon, Sep 14, 2015 at 02:01:53PM +0200, Peter Zijlstra wrote:
> The scenario is:
>
> CPU0                CPU1
>
> unlock(x)
> smp_store_release(&x->lock, 0);
>
> unlock(y)
>
On Wed, Sep 16, 2015 at 11:49:32PM +0800, Boqun Feng wrote:
> Implement xchg_relaxed and define atomic{,64}_xchg_* as xchg_relaxed,
> based on these _relaxed variants, release/acquire variants can be built.
>
> Note that xchg_relaxed and atomic_{,64}_xchg_relaxed are not compiler
> barriers.
Hmm,
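The property Boqun states, atomic for the location itself but no compiler or CPU barrier for surrounding accesses, is what C11's relaxed exchange gives. A minimal sketch (the function name mirrors the kernel's, the `_Atomic` typing is mine):

```c
#include <stdatomic.h>

/* xchg_relaxed as described in the quoted changelog: swaps *v atomically
 * but imposes no ordering on surrounding loads and stores. */
static int xchg_relaxed(_Atomic int *v, int new)
{
    return atomic_exchange_explicit(v, new, memory_order_relaxed);
}
```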
On Wed, Sep 16, 2015 at 11:49:33PM +0800, Boqun Feng wrote:
> Unlike other atomic operation variants, cmpxchg{,64}_acquire and
> atomic{,64}_cmpxchg_acquire don't have acquire semantics if the cmp part
> fails, so we need to implement these using assembly.
I think that is actually expected and doc
On Wed, Sep 16, 2015 at 11:49:34PM +0800, Boqun Feng wrote:
> According to memory-barriers.txt, xchg and its atomic{,64}_ versions
> need to imply a full barrier, however they are now just RELEASE+ACQUIRE,
> which is not a full barrier.
>
> So remove the definition of xchg(), and let __atomic_op_f
On Thu, Oct 01, 2015 at 02:27:15PM +0200, Peter Zijlstra wrote:
> On Wed, Sep 16, 2015 at 11:49:33PM +0800, Boqun Feng wrote:
> > Unlike other atomic operation variants, cmpxchg{,64}_acquire and
> > atomic{,64}_cmpxchg_acquire don't have acquire semantics if the cmp part
>
On Thu, Oct 01, 2015 at 08:12:19AM -0700, Paul E. McKenney wrote:
> What C11 does is to allow the developer to specify different orderings
> on success and failure. But it is no harder to supply a barrier (if
> needed) on the failure path, right?
Quite right.
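What Paul describes is C11's two-ordering compare-exchange: the success and failure paths take separate memory orders, and a caller that needs ordering on failure adds a fence there itself. A sketch of the acquire-on-success shape being discussed (the wrapper name is mine):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* cmpxchg with acquire semantics only when the compare succeeds; the
 * failure path is deliberately relaxed, matching the ppc discussion. */
static bool cmpxchg_acquire(_Atomic int *v, int *old, int new)
{
    return atomic_compare_exchange_strong_explicit(
        v, old, new,
        memory_order_acquire,    /* ordering if the cmp succeeds */
        memory_order_relaxed);   /* weaker ordering if it fails  */
}
```

A failing caller that does need acquire ordering would follow up with `atomic_thread_fence(memory_order_acquire)`, which is the "supply a barrier on the failure path" Paul refers to.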
On Thu, Oct 01, 2015 at 08:09:09AM -0700, Paul E. McKenney wrote:
> On Thu, Oct 01, 2015 at 02:24:40PM +0200, Peter Zijlstra wrote:
> > I must say I'm somewhat surprised by this level of relaxation, I had
> > expected to only lose SMP barriers, not the program order ones.
On Thu, Oct 01, 2015 at 11:03:01AM -0700, Paul E. McKenney wrote:
> On Thu, Oct 01, 2015 at 07:13:04PM +0200, Peter Zijlstra wrote:
> > On Thu, Oct 01, 2015 at 08:09:09AM -0700, Paul E. McKenney wrote:
> > > On Thu, Oct 01, 2015 at 02:24:40PM +0200, Peter Zijlstra wrote:
>
On Fri, Oct 02, 2015 at 07:19:04AM +0800, Boqun Feng wrote:
> Hi Peter,
>
> Please forgive me for the format of my reply. I'm travelling,
> and replying from my phone.
>
> On October 1, 2015, 7:28 PM, "Peter Zijlstra" wrote:
> >
> > On Wed, Sep 16, 2015 at 11:49:
On Thu, Oct 08, 2015 at 02:50:36PM +1100, Michael Ellerman wrote:
> On Wed, 2015-10-07 at 08:25 -0700, Paul E. McKenney wrote:
> > Currently, we do need smp_mb__after_unlock_lock() to be after the
> > acquisition on PPC -- putting it between the unlock and the lock
> > of course doesn't cut it for
On Thu, Oct 08, 2015 at 02:44:39PM -0700, Paul E. McKenney wrote:
> > > > I am with Peter -- we do need the benchmark results for PPC.
> > >
> > > Urgh, sorry guys. I have been slowly doing some benchmarks, but time is
> > > not
> > > plentiful at the moment.
> > >
> > > If we do a straight lws
On Thu, Oct 08, 2015 at 02:44:39PM -0700, Paul E. McKenney wrote:
> On Thu, Oct 08, 2015 at 01:16:38PM +0200, Peter Zijlstra wrote:
> > On Thu, Oct 08, 2015 at 02:50:36PM +1100, Michael Ellerman wrote:
> > > On Wed, 2015-10-07 at 08:25 -0700, Paul E. McKenney wrote:
> >
On Fri, Oct 09, 2015 at 10:40:39AM +0100, Will Deacon wrote:
> Stepping back a second, I believe that there are three cases:
>
>
> RELEASE X -> ACQUIRE Y (same CPU)
>* Needs a barrier on TSO architectures for full ordering
+PPC
> UNLOCK X -> LOCK Y (same CPU)
>* Needs a barrier
On Fri, Oct 09, 2015 at 10:40:39AM +0100, Will Deacon wrote:
> > Which leads me to think I would like to suggest alternative rules for
> > RELEASE/ACQUIRE (to replace those Will suggested; as I think those are
> > partly responsible for my confusion).
>
> Yeah, sorry. I originally used the phrase
On Fri, Oct 09, 2015 at 10:40:39AM +0100, Will Deacon wrote:
>
> > - RELEASE -> ACQUIRE _chains_ (on shared variables) preserve causality,
> >(because each link is fully ordered) but are not transitive.
>
> Yup, and that's the same for UNLOCK -> LOCK, too.
Agreed, except RELEASE/ACQUIRE is
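The RELEASE -> ACQUIRE link in each step of such a chain is the usual message-passing pattern; in C11 form it looks like the following (variable names are illustrative, and the spin is shown single-threaded here only so it terminates):

```c
#include <stdatomic.h>

static int data;             /* plain payload, no atomics needed */
static _Atomic int flag;     /* the handoff variable */

static void publish(int v)
{
    data = v;                                    /* ordinary store */
    atomic_store_explicit(&flag, 1,
                          memory_order_release); /* orders the store above */
}

static int consume(void)
{
    /* The acquire load pairs with the release store: once flag == 1 is
     * observed, the store to data is guaranteed visible. */
    while (!atomic_load_explicit(&flag, memory_order_acquire))
        ;
    return data;
}
```

Each such pairing fully orders one link, which is why the chain preserves causality without being transitive across unrelated CPUs.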
On Fri, Oct 09, 2015 at 10:51:29AM +0100, Will Deacon wrote:
> > The corresponding litmus tests are below.
>
> How do people feel about including these in memory-barriers.txt? I find
> them considerably easier to read than our current kernel code + list of
> possible orderings + wall of text, but
On Fri, Oct 09, 2015 at 01:51:11PM +0100, Will Deacon wrote:
> On Fri, Oct 09, 2015 at 01:12:02PM +0200, Peter Zijlstra wrote:
> > On Fri, Oct 09, 2015 at 10:40:39AM +0100, Will Deacon wrote:
> > > > Which leads me to think I would like to suggest alternative rules for
> &
On Sun, Oct 11, 2015 at 06:25:20PM +0800, Boqun Feng wrote:
> On Sat, Oct 10, 2015 at 09:58:05AM +0800, Boqun Feng wrote:
> > Hi Peter,
> >
> > Sorry for replying late.
> >
> > On Thu, Oct 01, 2015 at 02:27:16PM +0200, Peter Zijlstra wrote:
> > > On We
On Mon, Oct 12, 2015 at 10:14:00PM +0800, Boqun Feng wrote:
> The patchset consists of 6 parts:
>
> 1.Make xchg, cmpxchg and their atomic_ versions a full barrier
>
> 2.Add trivial tests for the new variants in lib/atomic64_test.c
>
> 3.Allow architectures to define their own __ato
On Wed, Oct 14, 2015 at 08:51:34AM +0800, Boqun Feng wrote:
> On Wed, Oct 14, 2015 at 11:10:00AM +1100, Michael Ellerman wrote:
> > Thanks for fixing this. In future you should send a patch like this as a
> > separate patch. I've not been paying attention to it because I assumed it
> > was
>
> G
On Wed, Oct 14, 2015 at 05:26:53PM +0800, Boqun Feng wrote:
> Michael and Peter, rest of this patchset depends on commits which are
> currently in the locking/core branch of the tip, so I would like it as a
> whole queued there. Besides, I will keep this patch Cc'ed to stable in
> future versions,
On Wed, Oct 14, 2015 at 01:19:17PM -0700, Paul E. McKenney wrote:
> Suppose we have something like the following, where "a" and "x" are both
> initially zero:
>
> CPU 0 CPU 1
> - -
>
> WRITE_ONCE(x, 1); WR
On Mon, Oct 19, 2015 at 09:17:18AM +0800, Boqun Feng wrote:
> This is confusing me right now. ;-)
>
> Let's use a simple example for only one primitive, as I understand it,
> if we say a primitive A is "fully ordered", we actually mean:
>
> 1.The memory operations preceding(in program order)
On Tue, Oct 20, 2015 at 03:15:32PM +0800, Boqun Feng wrote:
> On Wed, Oct 14, 2015 at 01:19:17PM -0700, Paul E. McKenney wrote:
> >
> > Am I missing something here? If not, it seems to me that you need
> > the leading lwsync to instead be a sync.
> >
> > Of course, if I am not missing something,
On Tue, Oct 20, 2015 at 02:28:35PM -0700, Paul E. McKenney wrote:
> I am not seeing a sync there, but I really have to defer to the
> maintainers on this one. I could easily have missed one.
So x86 implies a full barrier for everything that changes the CPL; and
some form of implied ordering seems
On Tue, Oct 20, 2015 at 04:34:51PM -0700, Paul E. McKenney wrote:
> There is also the question of whether the barrier forces ordering
> of unrelated stores, everything initially zero and all accesses
> READ_ONCE() or WRITE_ONCE():
>
> P0 P1 P2 P3
>
On Wed, Oct 21, 2015 at 12:29:23PM -0700, Paul E. McKenney wrote:
> On Wed, Oct 21, 2015 at 10:24:52AM +0200, Peter Zijlstra wrote:
> > On Tue, Oct 20, 2015 at 04:34:51PM -0700, Paul E. McKenney wrote:
> > > There is also the question of whether the barrier forces ordering
On Wed, Oct 21, 2015 at 12:35:23PM -0700, Paul E. McKenney wrote:
> > > > > I ask this because I recall Peter once bought up a discussion:
> > > > >
> > > > > https://lkml.org/lkml/2015/8/26/596
> > So a full barrier on one side of these operations is enough, I think.
> > IOW, there is no need to
On Thu, Oct 22, 2015 at 08:07:16PM +0800, Boqun Feng wrote:
> On Wed, Oct 21, 2015 at 09:48:25PM +0200, Peter Zijlstra wrote:
> > On Wed, Oct 21, 2015 at 12:35:23PM -0700, Paul E. McKenney wrote:
> > > > > > > I ask this because I recall Pet
ed, which can avoid possible
> memory ordering problems if userspace code relies on futex system call
> for fully ordered semantics.
>
> Cc: # 3.4+
> Signed-off-by: Boqun Feng
Acked-by: Peter Zijlstra (Intel)
mic_xxx_return barrier semantics")
>
> This patch depends on patch "powerpc: Make value-returning atomics fully
> ordered" for PPC_ATOMIC_ENTRY_BARRIER definition.
>
> Cc: # 3.4+
> Signed-off-by: Boqun Feng
Acked-by: Peter Zijlstra (Intel)
On Thu, Nov 05, 2015 at 02:16:15AM +0530, Madhavan Srinivasan wrote:
> Second patch updates struct arch_misc_reg for arch/powerpc with pmu registers
> and adds offsetof macro for the same. It extends perf_reg_value()
> to use reg idx to decide on struct to return value from.
Why; what's in those r
On Fri, Nov 06, 2015 at 12:57:17PM +0530, Madhavan Srinivasan wrote:
>
>
> On Thursday 05 November 2015 06:37 PM, Peter Zijlstra wrote:
> > On Thu, Nov 05, 2015 at 02:16:15AM +0530, Madhavan Srinivasan wrote:
> >> Second patch updates struct arch_misc_reg f
On Fri, Nov 06, 2015 at 09:04:00PM +1100, Michael Ellerman wrote:
> It's a perennial request from our hardware PMU folks to be able to see the raw
> values of the PMU registers.
>
> I think partly it's so that they can verify that perf is doing what they want,
> and some of it is that they're inte
On Fri, Dec 18, 2015 at 10:18:08AM +, Daniel Thompson wrote:
> I'm not entirely sure that this is an improvement.
What I do these days is delete everything in vprintk_emit() and simply
call early_printk().
Kill the useless kmsg buffer crap and locking, just pound bytes to the
UART registers w
On Fri, Dec 18, 2015 at 12:29:02PM +0100, Peter Zijlstra wrote:
> On Fri, Dec 18, 2015 at 10:18:08AM +, Daniel Thompson wrote:
> > I'm not entirely sure that this is an improvement.
>
> What I do these days is delete everything in vprintk_emit() and simply
> call e
On Thu, Dec 31, 2015 at 09:06:30PM +0200, Michael S. Tsirkin wrote:
> On s390 read_barrier_depends, smp_read_barrier_depends
> smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the
> asm-generic variants exactly. Drop the local definitions and pull in
> asm-generic/barrier.h inst
On Thu, Dec 31, 2015 at 09:07:10PM +0200, Michael S. Tsirkin wrote:
> -#define smp_store_release(p, v)
> \
> -do { \
> - compiletime_assert_atomic_type(*p);
On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote:
> On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux wrote:
> > My only concern is that it gives people an additional handle onto a
> > "new" set of barriers - just because they're prefixed with __*
> > unfortunate
On Thu, Dec 31, 2015 at 09:08:22PM +0200, Michael S. Tsirkin wrote:
> +#ifdef CONFIG_SMP
> +#define fence() metag_fence()
> +#else
> +#define fence() do { } while (0)
> #endif
James, it strikes me as odd that fence() is a no-op instead of a
barrier() for UP, can you verify/explain?
On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote:
> This defines __smp_xxx barriers for s390,
> for use by virtualization.
>
> Some smp_xxx barriers are removed as they are
> defined correctly by asm-generic/barriers.h
>
> Note: smp_mb, smp_rmb and smp_wmb are defined as full ba
On Mon, Jan 04, 2016 at 02:36:58PM +0100, Peter Zijlstra wrote:
> On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote:
> > On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux wrote:
>
> > > My only concern is that it gives people an add
On Thu, Dec 31, 2015 at 09:09:47PM +0200, Michael S. Tsirkin wrote:
> At the moment, xchg on sh only supports 4 and 1 byte values, so using it
> from smp_store_mb means attempts to store a 2 byte value using this
> macro fail.
>
> And happens to be exactly what virtio drivers want to do.
>
> Chec
On Thu, Dec 31, 2015 at 09:10:01PM +0200, Michael S. Tsirkin wrote:
> drivers/xen/xenbus/xenbus_comms.c uses
> full memory barriers to communicate with the other side.
>
> For guests compiled with CONFIG_SMP, smp_wmb and smp_mb
> would be sufficient, so mb() and wmb() here are only needed if
> a n
On Mon, Jan 04, 2016 at 03:25:58PM +, James Hogan wrote:
> It is used along with the metag specific __global_lock1() (global
> voluntary lock between hw threads) whenever a write is performed, and by
> smp_mb/smp_rmb to try to catch other cases, but I've never been
> confident this fixes every
On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote:
> This statement doesn't fit MIPS barriers variations. Moreover, there is a
> reason to extend that even more specific, at least for smp_store_release and
> smp_load_acquire, look into
>
> http://patchwork.linux-mips.org/patch/1
On Tue, Jan 12, 2016 at 10:43:36AM +0200, Michael S. Tsirkin wrote:
> On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote:
> > On 01/10/2016 06:18 AM, Michael S. Tsirkin wrote:
> > >On mips dma_rmb, dma_wmb, smp_store_mb, read_barrier_depends,
> > >smp_read_barrier_depends, smp_store_re
On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote:
> 2) the changelog _completely_ fails to explain the sync 0x11 and sync
> 0x12 semantics nor does it provide a publicly accessible link to
> documentation that does.
Ralf pointed me at: https://imgtec.com/mips/architectur
On Tue, Jan 12, 2016 at 11:25:55AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote:
> > 2) the changelog _completely_ fails to explain the sync 0x11 and sync
> > 0x12 semantics nor does it provide a publicly accessible link to
> &g
> duplicate patch, and assume conflict will be resolved.
>
> I would really appreciate some feedback on arch bits (especially the x86
> bits),
> and acks for merging this through the vhost tree.
Thanks for doing this, looks good to me.
Acke
.
>
> 3. I bothered the MIPS Arch team a long time until I completely understood
> that MIPS SYNC_WMB, SYNC_MB, SYNC_RMB, SYNC_RELEASE and SYNC_ACQUIRE do
> exactly what is required in Documentation/memory-barriers.txt
Ha! and you think that document covers all the really fun details?
In part