On Sun, Mar 13, 2022 at 06:26:19PM -0700, Philip Guenther wrote:
> On Sun, Mar 13, 2022 at 10:27 AM Visa Hankala <[email protected]> wrote:
> 
> > On Sun, Mar 13, 2022 at 04:29:44PM +0100, Mark Kettenis wrote:
> >
> ...
> 
> > > Under what circumstances does memory ordering matter for these
> > > interfaces?
> >
> > Consider the following scenario:
> >
> > struct widget {
> >         struct refcnt   w_refcnt;
> >         /* more fields spanning many cache lines */
> >         ...
> >         int             w_var;
> > };
> >
> > First, CPU 1 executes:
> >
> >         w->w_var = 1;
> >         refcnt_rele(&w->w_refcnt);      /* remains above zero */
> >
> 
> Having incremented the refcnt previous does not give this thread exclusive
> access to 'w', so if it's writing to w->w_var then it must either
> a) have some sort of write lock taken, which it will release after this and
> which will contain the necessary member, OR
> b) have the only access patch to this structure (i.e, it's not yet
> 'published' into structures which can be seen by other threads), in which
> case the operations which do that 'publishing' of the access to 'w' (adding
> it to a global list, etc) must include the necessary membar.

Lets change the sequence to this:

        local_var = atomic_load_int(&w->w_var);
        refcnt_rele(&w->w_refcnt);

Without the release barrier, is the load guaranteed to happen before
the reference count is decremented?

> Next, CPU 2 executes:
> >
> >         if (refcnt_rele(&w->w_refcnt))  /* refcnt drops to zero */
> >                 free(w);
> >
> 
> How did CPU 2 get what is now exclusive access to 'w' without any membars?
> If that's possible then it was just accessing 'w' and possibly not seeing
> the update to w->w_var even _before_ the refcnt_rele(), so putting a membar
> in refcnt_rele() hides the incorrect code by suppressing the later crash!
> 
> If these membars appear to help then the code is and remains broken.  This
> change should not be done.

It is not uncommon to see something like below:

Access object with read intent:

        mtx_enter(&w_lock);
        w = lookup_from_list();
        if (w != NULL)
                refcnt_take(&w->w_refcnt);
        mtx_leave(&w_lock);
        if (w == NULL)
                return;
        ...
        if (refcnt_rele(&w->w_refcnt))
                free(w);

Delete object:

        mtx_enter(&w_lock);
        w = lookup_from_list();
        if (w != NULL)
                remove_from_list(w);
        mtx_leave(&w_lock);
        /* Release list's reference. */
        if (w != NULL && refcnt_rele(&w->w_refcnt))
                free(w);

Above, any refcnt_rele() can release the final reference.

If there is no acquire barrier after the refcnt 1->0 transition, what
is known about the CPU's local view of the object after refcnt_rele()?

The decrement operation in the <linux/refcount.h> API of Linux and
the <sys/refcount.h> API of FreeBSD provide release and acquire
barriers. Are they wrong?

Reply via email to