On Thu, Apr 21, 2022 at 11:10:40PM +0200, Mark Kettenis wrote:
> > Date: Thu, 21 Apr 2022 22:17:31 +0200
> > From: Alexander Bluhm <alexander.bl...@gmx.net>
> > 
> > On Mon, Apr 18, 2022 at 08:33:06AM +0000, Visa Hankala wrote:
> > > I think the sanest solution is to add the release and acquire barriers
> > > in refcnt_rele().
> > 
> > Getting memory barriers right is too complicated for developers
> > doing MP stuff.  The existing locking and refcount primitives have
> > to implement that functionality.  I am on visa@'s side and would
> > prefer a memory barrier in the refcount API instead of searching for
> > races in MP code.
> > 
> > Better to waste some CPU cycles in some cases than to have strange
> > behavior due to missing barriers in other cases.
> 
> I don't disagree with that.  The refcount API is at the same level as
> the mutex API and explicit memory barriers should not be needed to use
> it safely.  It's just that it isn't obvious what the right memory
> ordering semantics are for a refcount API.  For some reason this
> doesn't seem to be something that's widely discussed.
> 
> The Linux include/linux/refcount.h file has the following comment at
> the start:
> 
>  * Memory ordering
>  * ===============
>  *
>  * Memory ordering rules are slightly relaxed wrt regular atomic_t functions
>  * and provide only what is strictly required for refcounts.
>  *
>  * The increments are fully relaxed; these will not provide ordering. The
>  * rationale is that whatever is used to obtain the object we're increasing the
>  * reference count on will provide the ordering. For locked data structures,
>  * its the lock acquire, for RCU/lockless data structures its the dependent
>  * load.
>  *
>  * Do note that inc_not_zero() provides a control dependency which will order
>  * future stores against the inc, this ensures we'll never modify the object
>  * if we did not in fact acquire a reference.
>  * The decrements will provide release order, such that all the prior loads and
>  * stores will be issued before, it also provides a control dependency, which
>  * will order us against the subsequent free().
>  *
>  * The control dependency is against the load of the cmpxchg (ll/sc) that
>  * succeeded. This means the stores aren't fully ordered, but this is fine
>  * because the 1->0 transition indicates no concurrency.
>  *
>  * Note that the allocator is responsible for ordering things between free()
>  * and alloc().
>  *
>  * The decrements dec_and_test() and sub_and_test() also provide acquire
>  * ordering on success.
> 
> That is still a bit murky, but I think it matches what visa@'s diff
> does for refcnt_rele(9).  And doing what Linux does in its refcount
> API is probably a good thing.  But I don't think it covers the
> membar_sync() added to the end of refcnt_finalize(9), so I'm not
> confident about that bit, and would like to understand that better.
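The ordering rules quoted above can be sketched in userland C11 atomics. This is a minimal illustration, not the OpenBSD or Linux kernel code; the names `refcnt_take` and `refcnt_rele` merely mirror the API under discussion:

```c
/*
 * Userland C11 sketch of the refcount ordering rules: increments are
 * fully relaxed, decrements have release semantics, and the thread
 * that performs the final 1->0 transition issues an acquire fence
 * before it may run the destructor.
 */
#include <stdatomic.h>
#include <stdbool.h>

struct refcnt {
	atomic_uint r_refs;
};

static void
refcnt_take(struct refcnt *r)
{
	/* Relaxed: ordering comes from however the object was found. */
	atomic_fetch_add_explicit(&r->r_refs, 1, memory_order_relaxed);
}

static bool
refcnt_rele(struct refcnt *r)
{
	/* Release: prior loads/stores are visible before the count drops. */
	if (atomic_fetch_sub_explicit(&r->r_refs, 1,
	    memory_order_release) == 1) {
		/* Acquire on the final release, before any destructor. */
		atomic_thread_fence(memory_order_acquire);
		return true;
	}
	return false;
}
```

In the kernel patch below, `membar_exit_before_atomic()` plays the role of the release half and `membar_enter_after_atomic()` the acquire half on the 1->0 transition.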
The patch uses membar_sync(), and not membar_enter(), after the loop in
refcnt_finalize() because subsequent memory operations should hinge on
the load of r_refs.  membar_enter() is usable only when the reference
point is a store.

> The other issue I have with the diff is that it documents the
> memory ordering in terms of acquire and release which is not what we
> do in other places such as the membar_enter(9) man page.  Maybe this
> should explicitly call out the memory ordering like what the Linux
> comment does.

I have updated the documentation, though I am not sure if the outcome
is an improvement.

Index: share/man/man9/refcnt_init.9
===================================================================
RCS file: src/share/man/man9/refcnt_init.9,v
retrieving revision 1.2
diff -u -p -r1.2 refcnt_init.9
--- share/man/man9/refcnt_init.9	16 Mar 2022 14:13:01 -0000	1.2
+++ share/man/man9/refcnt_init.9	25 Apr 2022 14:34:05 -0000
@@ -74,6 +74,17 @@ There may only be one caller to
 per refcnt
 .Fa r .
 .Pp
+.Fn refcnt_rele ,
+.Fn refcnt_rele_wake
+and
+.Fn refcnt_finalize
+order prior memory loads and stores before the release of the reference.
+The functions enforce control dependency so that after the final reference
+has been released, subsequent loads and stores happen after the release.
+These ensure that concurrent accesses cease before the object's destructor
+runs and that the destructor sees all updates done during the lifetime
+of the object.
+.Pp
 .Fn refcnt_shared
 tests if the object has multiple references.
 .Pp
Index: sys/kern/kern_synch.c
===================================================================
RCS file: src/sys/kern/kern_synch.c,v
retrieving revision 1.185
diff -u -p -r1.185 kern_synch.c
--- sys/kern/kern_synch.c	18 Mar 2022 15:32:06 -0000	1.185
+++ sys/kern/kern_synch.c	25 Apr 2022 14:34:05 -0000
@@ -822,9 +822,14 @@ refcnt_rele(struct refcnt *r)
 {
 	u_int refs;
 
+	membar_exit_before_atomic();
 	refs = atomic_dec_int_nv(&r->r_refs);
 	KASSERT(refs != ~0);
-	return (refs == 0);
+	if (refs == 0) {
+		membar_enter_after_atomic();
+		return (1);
+	}
+	return (0);
 }
 
 void
@@ -840,6 +845,7 @@ refcnt_finalize(struct refcnt *r, const
 	struct sleep_state sls;
 	u_int refs;
 
+	membar_exit_before_atomic();
 	refs = atomic_dec_int_nv(&r->r_refs);
 	KASSERT(refs != ~0);
 	while (refs) {
@@ -847,6 +853,8 @@ refcnt_finalize(struct refcnt *r, const
 		refs = atomic_load_int(&r->r_refs);
 		sleep_finish(&sls, refs);
 	}
+	/* Order subsequent loads and stores after refs == 0 load. */
+	membar_sync();
 }
 
 int
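The refcnt_finalize() tail above, and why the reference point being a load calls for a full barrier rather than membar_enter(), can be sketched with userland C11 atomics. This is an illustrative analogue under that interpretation, not the kernel code; `refcnt_finalize_sketch` is a hypothetical name, and the spin loop stands in for the kernel's sleep:

```c
/*
 * Userland C11 sketch of the refcnt_finalize() shape: drop our own
 * reference with release semantics, wait for the count to reach zero,
 * then fence so that later accesses are ordered after the refs == 0
 * *load*.  The kernel uses membar_sync() here; membar_enter() would
 * only order later accesses against a prior *store*.
 */
#include <stdatomic.h>

struct refcnt {
	atomic_uint r_refs;
};

static void
refcnt_finalize_sketch(struct refcnt *r)
{
	/* Release: our prior accesses happen before the count drops. */
	atomic_fetch_sub_explicit(&r->r_refs, 1, memory_order_release);

	/* The kernel sleeps on the count; a spin loop stands in here. */
	while (atomic_load_explicit(&r->r_refs, 1 ? memory_order_relaxed :
	    memory_order_relaxed) != 0)
		;

	/*
	 * Full fence after the loop: destructor-side loads and stores
	 * must not be performed before the refs == 0 load above.
	 */
	atomic_thread_fence(memory_order_seq_cst);
}
```

In strict C11 terms an acquire fence after the final load would already give the needed load-to-load/store ordering; membar_sync() is the conservative full barrier in the membar API whose reference point may be a load.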