On 06/30/20 19:07, Peter Zijlstra wrote:
> On Tue, Jun 30, 2020 at 12:21:23PM +0100, Qais Yousef wrote:
> > @@ -993,10 +1013,38 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
> >  
> >     lockdep_assert_held(&rq->lock);
> >  
> > +   /*
> > +    * If sched_uclamp_used was enabled after task @p was enqueued,
> > +    * we could end up with unbalanced call to uclamp_rq_dec_id().
> > +    *
> > +    * In this case the uc_se->active flag should be false since no uclamp
> > +    * accounting was performed at enqueue time and we can just return
> > +    * here.
> > +    *
> > +    * Need to be careful of the following enqueue/dequeue ordering
> > +    * problem too:
> > +    *
> > +    *      enqueue(taskA)
> > +    *      // sched_uclamp_used gets enabled
> > +    *      enqueue(taskB)
> > +    *      dequeue(taskA)
> > +    *      // Must not decrement bucket->tasks here
> > +    *      dequeue(taskB)
> > +    *
> > +    * where we could end up with stale data in uc_se and
> > +    * bucket[uc_se->bucket_id].
> > +    *
> > +    * The following check eliminates the possibility of such a race.
> > +    */
> > +   if (unlikely(!uc_se->active))
> > +           return;
> > +
> >     bucket = &uc_rq->bucket[uc_se->bucket_id];
> > +
> >     SCHED_WARN_ON(!bucket->tasks);
> >     if (likely(bucket->tasks))
> >             bucket->tasks--;
> > +
> >     uc_se->active = false;
> >  
> >     /*
> 
> > @@ -1221,6 +1289,8 @@ static void __setscheduler_uclamp(struct task_struct *p,
> >     if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
> >             return;
> >  
> > +   static_branch_enable(&sched_uclamp_used);
> > +
> >     if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> >             uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
> >                           attr->sched_util_min, true);
> > @@ -7387,6 +7457,8 @@ static ssize_t cpu_uclamp_write(struct kernfs_open_file *of, char *buf,
> >     if (req.ret)
> >             return req.ret;
> >  
> > +   static_branch_enable(&sched_uclamp_used);
> > +
> >     mutex_lock(&uclamp_mutex);
> >     rcu_read_lock();
> >  
> 
> There's a fun race described in 9107c89e269d ("perf: Fix race between
> event install and jump_labels"), are we sure this isn't also susceptible
> to something similar?
> 
> I suspect not, but I just wanted to make sure.

IIUC, the worry is that not all CPUs might have observed the change in the
static key state, and hence might not yet be running the patched
enqueue/dequeue_task(), so we could end up with some CPUs accounting for
uclamp in the enqueue/dequeue path while others don't?

I was hoping this synchronization would be guaranteed by the static_branch_*() call itself.
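
To illustrate why I think the window is benign even if a CPU lags behind,
this is roughly the shape I have in mind (just a sketch, not the exact hunks
from the patch; the other checks and per-clamp_id details are elided):

static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
{
	enum uclamp_id clamp_id;

	/* A CPU still running the unpatched (NOP) site bails out here... */
	if (!static_branch_unlikely(&sched_uclamp_used))
		return;

	/* ...so uc_se->active never gets set for the tasks it enqueues. */
	for_each_clamp_id(clamp_id)
		uclamp_rq_inc_id(rq, p, clamp_id);
}

static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p)
{
	enum uclamp_id clamp_id;

	if (!static_branch_unlikely(&sched_uclamp_used))
		return;

	/*
	 * A CPU that did observe the flip may dequeue a task that was
	 * enqueued before it; the !uc_se->active early return in
	 * uclamp_rq_dec_id() above keeps that from going unbalanced.
	 */
	for_each_clamp_id(clamp_id)
		uclamp_rq_dec_id(rq, p, clamp_id);
}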

aarch64_insn_patch_text_nosync() in arch/arm64/kernel/insn.c performs
__flush_icache_range() after writing the new instruction.

I need to dig into what __flush_icache_range() does, but I think it'll just
flush the icache of the calling CPU. I need to dig into the upper layers too.
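
For reference, the function is tiny; from memory it looks roughly like the
below, so the exact details might be off:

int aarch64_insn_patch_text_nosync(void *addr, u32 insn)
{
	int ret;

	/* Poke the new instruction into place... */
	ret = aarch64_insn_write(addr, insn);
	/* ...then flush the icache for the patched word. */
	if (ret == 0)
		__flush_icache_range((uintptr_t)addr,
				     (uintptr_t)addr + AARCH64_INSN_SIZE);

	return ret;
}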

It'd be good to know first whether I've understood you correctly.

Thanks

--
Qais Yousef
