On Wed, Jun 18, 2014 at 10:07:27PM -0400, Tejun Heo wrote:
> On Thu, Jun 19, 2014 at 09:58:16AM +0800, Lai Jiangshan wrote:
> > On 06/18/2014 11:32 PM, Tejun Heo wrote:
> > > On Wed, Jun 18, 2014 at 11:37:35AM +0800, Lai Jiangshan wrote:
> > >>> @@ -97,7 +98,10 @@ static inline void percpu_ref_kill(struct percpu_ref *ref)
> > >>>  static inline bool __pcpu_ref_alive(struct percpu_ref *ref,
> > >>>                                      unsigned __percpu **pcpu_countp)
> > >>>  {
> > >>> -       unsigned long pcpu_ptr = ACCESS_ONCE(ref->pcpu_count_ptr);
> > >>> +       unsigned long pcpu_ptr;
> > >>> +
> > >>> +       /* paired with smp_store_release() in percpu_ref_reinit() */
> > >>> +       pcpu_ptr = smp_load_acquire(&ref->pcpu_count_ptr);
> > >>
> > >>
> > >> Does "smp_load_acquire()" hurts the performance of percpu_ref_get/put()
> > >> in non-x86 system?
> > >
> > > It's equivalent to data dependency barrier.  The only arch which needs
> > > something more than barrier() is alpha.  It isn't an issue.
> >
> > But I searched from the source, smp_load_acquire() is just barrier() in
> > x86, arm64, ia64, s390, sparc, but it includes memory barrier
> > instruction in other archs.
>
> Hmmm, right, it's a stronger guarantee than the data dependency
> barrier.  This should probably use smp_wmb() and
> smp_read_barrier_depends().  That's all it needs anyway.

Yep, smp_load_acquire() orders its load against later loads and stores,
so it really does need a memory barrier on weakly ordered systems.

This is the "publish" operation for dynamically allocated per-CPU
references?  If so, agreed, you should be able to rely on dependency
ordering.  Make sure to comment the smp_read_barrier_depends().  ;-)

							Thanx, Paul

> Thanks.
>
> --
> tejun
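
For anyone who wants to experiment with the publish/dependency-ordering
pattern discussed above outside the kernel, here is a minimal user-space
sketch using C11 atomics.  It is not the percpu_ref code: struct
ref_sketch, ref_init(), ref_publish() and ref_read() are hypothetical
names made up for illustration.  The release store plays roughly the
role of smp_wmb() followed by the pointer store, and the
memory_order_consume load stands in for a plain load followed by
smp_read_barrier_depends().  Current GCC and Clang implement consume as
acquire, so this shows the intended pairing rather than the exact
instructions a kernel build would emit.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the reference; NOT the kernel's struct percpu_ref. */
struct ref_sketch {
	_Atomic(unsigned *) count_ptr;	/* NULL until a counter is published */
};

/* Start with nothing published. */
static void ref_init(struct ref_sketch *ref)
{
	atomic_init(&ref->count_ptr, NULL);
}

/*
 * Writer side: initialise the counter, then publish the pointer.
 * The release store keeps the initialisation ordered before the
 * pointer becomes visible (the smp_wmb() half of the pairing).
 */
static void ref_publish(struct ref_sketch *ref, unsigned *count)
{
	*count = 1;
	atomic_store_explicit(&ref->count_ptr, count, memory_order_release);
}

/*
 * Reader side: load the pointer, then dereference it.  The dereference
 * depends on the loaded value, which is the ordering that
 * smp_read_barrier_depends() is meant to guarantee.
 * Returns 1 and fills *out if a counter was published, 0 otherwise.
 */
static int ref_read(struct ref_sketch *ref, unsigned *out)
{
	unsigned *p = atomic_load_explicit(&ref->count_ptr,
					   memory_order_consume);

	if (!p)
		return 0;
	*out = *p;	/* dependent load through the published pointer */
	return 1;
}
```

A reader that only ever touches data reached through the loaded pointer
does not need the stronger acquire guarantee, which is the point made
above about smp_load_acquire() being more than this path requires.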