On Tue, 2017-09-26 at 06:59 +0800, Ming Lei wrote:
> On Mon, Sep 25, 2017 at 01:29:24PM -0700, Bart Van Assche wrote:
> > +int blk_queue_enter(struct request_queue *q, bool nowait, bool preempt)
> > {
> > while (true) {
> > int ret;
> >
> > - if (percpu_ref_tryget_live(&q->q_usage_counter))
> > - return 0;
> > + if (percpu_ref_tryget_live(&q->q_usage_counter)) {
> > + /*
> > + * Since setting the PREEMPT_ONLY flag is followed
> > + * by a switch of q_usage_counter from per-cpu to
> > + * atomic mode and back to per-cpu and since the
> > + * switch to atomic mode uses call_rcu_sched(), it
> > + * is not necessary to call smp_rmb() here.
> > + */
>
> rcu_read_lock() is held only inside percpu_ref_tryget_live().
>
> Without an explicit barrier (smp_mb()) between getting the refcounter
> and reading the preempt-only flag, the two operations (writing to the
> refcounter and reading the flag) can be reordered, so a freeze/unfreeze
> cycle may complete before this I/O is completed.
Sorry, but I disagree. I'm using RCU to achieve the same effect as a memory
barrier while moving the cost of that barrier from the reader to the updater.
See also
Paul E. McKenney, Mathieu Desnoyers, Lai Jiangshan, and Josh Triplett,
The RCU-barrier menagerie, LWN.net, November 12, 2013
(https://lwn.net/Articles/573497/).
Bart.