On 11/01, Oleg Nesterov wrote:
>
> On 11/01, Paul E. McKenney wrote:
> >
> > OK, so it looks to me that this code relies on synchronize_sched()
> > forcing a memory barrier on each CPU executing in the kernel.
>
> No, the patch tries to avoid this assumption, but probably I missed
> something.
>
> > 1.  A task running on CPU 0 currently write-holds the lock.
> >
> > 2.  CPU 1 is running in the kernel, executing a longer-than-average
> >     loop of normal instructions (no atomic instructions or memory
> >     barriers).
> >
> > 3.  CPU 0 invokes percpu_up_write(), calling up_write(),
> >     synchronize_sched(), and finally mutex_unlock().
>
> And my expectation was, this should be enough because ...
>
> > 4.  CPU 1 executes percpu_down_read(), which calls update_fast_ctr(),
>
> since update_fast_ctr does preempt_disable/enable it should see all
> modifications done by CPU 0.
>
> IOW. Suppose that the writer (CPU 0) does
>
>       percpu_down_write();
>       STORE;
>       percpu_up_write();
>
> This means
>
>       STORE;
>       synchronize_sched();
>       mutex_unlock();
>
> Now. Do you mean that the next preempt_disable/enable can see the
> result of mutex_unlock() but not STORE?

So far I think this is not possible, so the code doesn't need the
additional wstate/barriers.
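
To make that concrete, the reader fast path in the posted patch is just
the following (reconstructed from the hunks quoted below; the tail with
preempt_enable() and the return is not quoted there, so that part is an
assumption):

	static bool update_fast_ctr(struct percpu_rw_semaphore *brw, int val)
	{
		bool success = false;

		preempt_disable();
		if (likely(!mutex_is_locked(&brw->writer_mutex))) {
			/* no writer: just bump the per-cpu counter */
			__this_cpu_add(*brw->fast_read_ctr, val);
			success = true;
		}
		/*
		 * The whole check + increment runs with preemption disabled,
		 * i.e. inside an rcu_sched read-side section which
		 * synchronize_sched() in percpu_up_write() has to wait for.
		 */
		preempt_enable();

		return success;
	}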

> > +static bool update_fast_ctr(struct percpu_rw_semaphore *brw, int val)
> > +{
> > +   bool success = false;
>
>       int state;
>
> > +
> > +   preempt_disable();
> > +   if (likely(!mutex_is_locked(&brw->writer_mutex))) {
>
>       state = ACCESS_ONCE(brw->wstate);
>       if (likely(!state)) {
>
> > +           __this_cpu_add(*brw->fast_read_ctr, val);
> > +           success = true;
>
>       } else if (state & WSTATE_NEED_MB) {
>               __this_cpu_add(*brw->fast_read_ctr, val);
>               smp_mb(); /* Order increment against critical section. */
>               success = true;
>       }

...
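
Pieced together, that suggestion would turn the fast path into something
like this (->wstate and WSTATE_NEED_MB are hypothetical, they do not
exist in the posted patch, and the unquoted tail of the function is
assumed):

	static bool update_fast_ctr(struct percpu_rw_semaphore *brw, int val)
	{
		bool success = false;
		int state;

		preempt_disable();
		state = ACCESS_ONCE(brw->wstate);	/* hypothetical field */
		if (likely(!state)) {
			/* no writer anywhere: the plain fast path */
			__this_cpu_add(*brw->fast_read_ctr, val);
			success = true;
		} else if (state & WSTATE_NEED_MB) {
			/* writer is in its unlock path: fast path + barrier */
			__this_cpu_add(*brw->fast_read_ctr, val);
			smp_mb(); /* Order increment against critical section. */
			success = true;
		}
		preempt_enable();

		return success;
	}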

> > +void percpu_up_write(struct percpu_rw_semaphore *brw)
> > +{
> > +   /* allow the new readers, but only the slow-path */
> > +   up_write(&brw->rw_sem);
>
>       ACCESS_ONCE(brw->wstate) = WSTATE_NEED_MB;
>
> > +
> > +   /* insert the barrier before the next fast-path in down_read */
> > +   synchronize_sched();

But update_fast_ctr() should see mutex_is_locked(); obviously down_write()
must ensure this.

So update_fast_ctr() can execute the WSTATE_NEED_MB code only if it
races with

>       ACCESS_ONCE(brw->wstate) = 0;
>
> > +   mutex_unlock(&brw->writer_mutex);

these 2 stores and sees them in reverse order.
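
For reference, with the suggested stores folded in, percpu_up_write()
would look roughly like this (again treating ->wstate as a hypothetical
field); the two stores in question are the last two lines:

	void percpu_up_write(struct percpu_rw_semaphore *brw)
	{
		/* allow the new readers, but only the slow-path */
		up_write(&brw->rw_sem);

		/* hypothetical: tell the fast path it needs smp_mb() */
		ACCESS_ONCE(brw->wstate) = WSTATE_NEED_MB;

		/* insert the barrier before the next fast-path in down_read */
		synchronize_sched();

		/* the 2 stores a racing reader would have to see reversed */
		ACCESS_ONCE(brw->wstate) = 0;
		mutex_unlock(&brw->writer_mutex);
	}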



I guess that mutex_is_locked() in update_fast_ctr() looks a bit confusing.
It means no fast path for the reader; we could use ->state instead.

And even ->writer_mutex should go away if we want to optimize the
write-contended case, but I think this needs another patch on top of
this initial implementation.

Oleg.
