On Tue, Mar 19, 2019 at 08:06:48AM -0700, Bart Van Assche wrote:
> On Tue, 2019-03-19 at 16:38 +0800, Ming Lei wrote:
> > Inside sbitmap_queue_clear(), once the clear bit is set, it will be
> > visiable to allocation path immediately. Meantime READ/WRITE on old
> > associated instance(such as request in case of blk-mq) may be
> > out-of-order with the setting clear bit, so race with re-allocation
> > may be triggered.
> >
> > Adds one memory barrier for ordering READ/WRITE of the freed associated
> > instance with setting clear bit for avoiding race with re-allocation.
> >
> > The following kernel oops triggerd by block/006 on aarch64 may be fixed:
> ^^^
>
> Does that mean that it has not been verified whether this patch fixes the
> NULL pointer issue mentioned in the patch description?
V1 has been verified by Zhang Yi, but V2 isn't done yet.
>
> > diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> > index 5b382c1244ed..941a46495e12 100644
> > --- a/lib/sbitmap.c
> > +++ b/lib/sbitmap.c
> > @@ -591,6 +591,17 @@ EXPORT_SYMBOL_GPL(sbitmap_queue_wake_up);
> > void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
> > unsigned int cpu)
> > {
> > + /*
> > + * Once the clear bit is set, the bit may be allocated out.
> ^^^^^^^^^^^^^^^^^^^^^^^^^
>
> This comment is confusing. Did you perhaps mean "Once the bit is cleared"?
No, please take at look at sbitmap_deferred_clear_bit().
>
> > + *
> > + * Orders READ/WRITE on the asssociated instance(such as request
> > + * of blk_mq) by this bit for avoiding race with re-allocation,
> > + * and its pair is the memory barrier implied in __sbitmap_get_word.
> > + *
> > + * One invarient is that the clear bit has to be zero when the bit
> ^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> invariant? I cant' make sense of this.
It is really 'clear bit', again please look at
sbitmap_deferred_clear_bit().
Thanks,
Ming