I'm also thinking that the combination of rcu_cmpxchg_pointer() and rcu_dereference() is problematic, because we use ll/sc for the cmpxchg without a matching lwz on the read side. We should probably also use a matching stw for rcu_assign_pointer() if we want to support this case.
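
To make the pairing concrete, here is a minimal sketch of the three operations involved. It is written against C11 <stdatomic.h>, not against the liburcu primitives, so struct node, shared_ptr, publish_store(), publish_cmpxchg() and read_ptr() are all hypothetical names chosen for illustration. On PowerPC a compiler would typically lower the compare-and-swap to an ll/sc (lwarx/stwcx.) loop and the plain load and store to lwz/stw (ld/std on 64-bit). The sketch only shows which writer path pairs with which reader path; it does not reproduce the hardware behaviour described in the forwarded message.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload type, for illustration only. */
struct node {
    int value;
};

/* Hypothetical shared pointer standing in for an RCU-protected pointer. */
static _Atomic(struct node *) shared_ptr;

/* Writer, plain store path (similar in spirit to rcu_assign_pointer()). */
static void publish_store(struct node *n)
{
    atomic_store_explicit(&shared_ptr, n, memory_order_release);
}

/*
 * Writer, compare-and-swap path (similar in spirit to
 * rcu_cmpxchg_pointer()). Returns the pointer that was in place before
 * the attempt; the swap succeeded only if that equals 'expected'.
 */
static struct node *publish_cmpxchg(struct node *expected, struct node *n)
{
    struct node *old = expected;

    /* On failure, 'old' is overwritten with the current value. */
    atomic_compare_exchange_strong_explicit(&shared_ptr, &old, n,
                                            memory_order_acq_rel,
                                            memory_order_acquire);
    return old;
}

/* Reader, plain load path (similar in spirit to rcu_dereference()). */
static struct node *read_ptr(void)
{
    return atomic_load_explicit(&shared_ptr, memory_order_acquire);
}
```

The point of the sketch is that read_ptr() can observe a pointer written by either publish_store() or publish_cmpxchg(), so whatever load instruction the reader uses has to work with both writer paths.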

Mathieu

* Mathieu Desnoyers ([email protected]) wrote:
> Hi Paul,
>
> Please see the message below. It looks like the liburcu
> uatomic_read()/uatomic_set() implementations would need to be moved to
> lwz/stw if what Steven says below is true. It seems to be in sync with
> what is done in the libatomic ops implementation.
>
> Thoughts ?
>
> Mathieu
>
> ----- Forwarded message from Steven Rostedt <[email protected]> -----
>
> Date: Mon, 14 Feb 2011 16:39:36 -0500
> To: Peter Zijlstra <[email protected]>
> Cc: Will Newton <[email protected]>, Jason Baron <[email protected]>,
>     Mathieu Desnoyers <[email protected]>, [email protected],
>     [email protected], [email protected], [email protected],
>     [email protected], [email protected], [email protected],
>     [email protected], [email protected], [email protected],
>     [email protected], [email protected], [email protected],
>     [email protected], Mike Frysinger <[email protected]>,
>     Chris Metcalf <[email protected]>, dhowells <[email protected]>,
>     Martin Schwidefsky <[email protected]>,
>     "heiko.carstens" <[email protected]>,
>     benh <[email protected]>
> X-Mailer: Evolution 2.30.3
> From: Steven Rostedt <[email protected]>
> Subject: Re: [PATCH 0/2] jump label: 2.6.38 updates
>
> On Mon, 2011-02-14 at 16:29 -0500, Steven Rostedt wrote:
>
> > > while (atomic_read(&foo) != n)
> > >         cpu_relax();
> > >
> > > and the problem is that cpu_relax() doesn't know which particular
> > > cacheline to flush in order to make things go faster, hm?
> >
> > But what about any global variable? Can't we also just have:
> >
> > while (global != n)
> >         cpu_relax();
> >
> > ?
>
> Matt Fleming answered this for me on IRC, and I'll share the answer here
> (for those that are dying to know ;)
>
> Seems that the atomic_inc() uses ll/sc operations that do not affect the
> cache. Thus the problem is only with atomic_read() as
>
> while (atomic_read(&foo) != n)
>         cpu_relax();
>
> Will just check the cache version of foo. But because ll/sc skips the
> cache, the foo will never update. That is, atomic_inc() and friends do
> not touch the cache, and the CPU spinning in this loop is only
> checking the cache, and will spin forever.
>
> Thus it is not about global, as global is updated by normal means and
> will update the caches. atomic_t is updated via the ll/sc that ignores
> the cache and causes all this to break down. IOW... broken hardware ;)
>
> Matt, feel free to correct this if it is wrong.
>
> -- Steve
>
> --
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant
> EfficiOS Inc.
> http://www.efficios.com
