On Sun, Mar 13, 2022 at 06:26:19PM -0700, Philip Guenther wrote:
> On Sun, Mar 13, 2022 at 10:27 AM Visa Hankala <[email protected]> wrote:
>
> > On Sun, Mar 13, 2022 at 04:29:44PM +0100, Mark Kettenis wrote:
> >
> ...
>
> > > Under what circumstances does memory ordering matter for these
> > > interfaces?
> >
> > Consider the following scenario:
> >
> > struct widget {
> > struct refcnt w_refcnt;
> > /* more fields spanning many cache lines */
> > ...
> > int w_var;
> > };
> >
> > First, CPU 1 executes:
> >
> > w->w_var = 1;
> > refcnt_rele(&w->w_refcnt); /* remains above zero */
> >
>
> Having incremented the refcnt previous does not give this thread exclusive
> access to 'w', so if it's writing to w->w_var then it must either
> a) have some sort of write lock taken, which it will release after this and
> which will contain the necessary member, OR
> b) have the only access patch to this structure (i.e, it's not yet
> 'published' into structures which can be seen by other threads), in which
> case the operations which do that 'publishing' of the access to 'w' (adding
> it to a global list, etc) must include the necessary membar.
Lets change the sequence to this:
local_var = atomic_load_int(&w->w_var);
refcnt_rele(&w->w_refcnt);
Without the release barrier, is the load guaranteed to happen before
the reference count is decremented?
> Next, CPU 2 executes:
> >
> > if (refcnt_rele(&w->w_refcnt)) /* refcnt drops to zero */
> > free(w);
> >
>
> How did CPU 2 get what is now exclusive access to 'w' without any membars?
> If that's possible then it was just accessing 'w' and possibly not seeing
> the update to w->w_var even _before_ the refcnt_rele(), so putting a membar
> in refcnt_rele() hides the incorrect code by suppressing the later crash!
>
> If these membars appear to help then the code is and remains broken. This
> change should not be done.
It is not uncommon to see something like below:
Access object with read intent:
mtx_enter(&w_lock);
w = lookup_from_list();
if (w != NULL)
refcnt_take(&w->w_refcnt);
mtx_leave(&w_lock);
if (w == NULL)
return;
...
if (refcnt_rele(&w->w_refcnt))
free(w);
Delete object:
mtx_enter(&w_lock);
w = lookup_from_list();
if (w != NULL)
remove_from_list(w);
mtx_leave(&w_lock);
/* Release list's reference. */
if (w != NULL && refcnt_rele(&w->w_refcnt))
free(w);
Above, any refcnt_rele() can release the final reference.
If there is no acquire barrier after the refcnt 1->0 transition, what
is known about the CPU's local view of the object after refcnt_rele()?
The decrement operation in the <linux/refcount.h> API of Linux and
the <sys/refcount.h> API of FreeBSD provide release and acquire
barriers. Are they wrong?