On 22-May 16:04, Viresh Kumar wrote:
> Okay, me and Rafael were discussing this patch, locking and races around this.
> 
> On 18-05-18, 11:55, Joel Fernandes (Google.) wrote:
> > @@ -382,13 +386,27 @@ sugov_update_shared(struct update_util_data *hook, 
> > u64 time, unsigned int flags)
> >  static void sugov_work(struct kthread_work *work)
> >  {
> >     struct sugov_policy *sg_policy = container_of(work, struct 
> > sugov_policy, work);
> > +   unsigned int freq;
> > +   unsigned long flags;
> > +
> > +   /*
> > +    * Hold sg_policy->update_lock shortly to handle the case where:
> > +    * incase sg_policy->next_freq is read here, and then updated by
> > +    * sugov_update_shared just before work_in_progress is set to false
> > +    * here, we may miss queueing the new update.
> > +    *
> > +    * Note: If a work was queued after the update_lock is released,
> > +    * sugov_work will just be called again by kthread_work code; and the
> > +    * request will be proceed before the sugov thread sleeps.
> > +    */
> > +   raw_spin_lock_irqsave(&sg_policy->update_lock, flags);
> > +   freq = sg_policy->next_freq;
> > +   sg_policy->work_in_progress = false;
> > +   raw_spin_unlock_irqrestore(&sg_policy->update_lock, flags);
> >  
> >     mutex_lock(&sg_policy->work_lock);
> > -   __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
> > -                           CPUFREQ_RELATION_L);
> > +   __cpufreq_driver_target(sg_policy->policy, freq, CPUFREQ_RELATION_L);
> >     mutex_unlock(&sg_policy->work_lock);
> > -
> > -   sg_policy->work_in_progress = false;
> >  }
> 
> And I do see a race here for single policy systems doing slow switching.
> 
> Kthread                                                 Sched update
> 
> sugov_work()                                            sugov_update_single()
> 
>         lock();
>         // The CPU is free to rearrange below
>         // two in any order, so it may clear
>         // the flag first and then read next
>         // freq. Lets assume it does.
>         work_in_progress = false
> 
>                                                         if (work_in_progress)
>                                                                 return;
> 
>                                                         sg_policy->next_freq 
> = 0;
>         freq = sg_policy->next_freq;
>                                                         sg_policy->next_freq 
> = real-next-freq;
>         unlock();
> 
> 
> 
> Is the above theory right or am I day dreaming ? :)

It could happen, but using:

   raw_spin_lock_irqsave(&sg_policy->update_lock, flags);
   freq = READ_ONCE(sg_policy->next_freq)
   WRITE_ONCE(sg_policy->work_in_progress, false);
   raw_spin_unlock_irqrestore(&sg_policy->update_lock, flags);

                       if (!READ_ONCE(sg_policy->work_in_progress)) {
                           WRITE_ONCE(sg_policy->work_in_progress, true);
                           irq_work_queue(&sg_policy->irq_work);
                       }

should fix it by enforcing the ordering as well as documenting the
concurrent access.

However, in the "sched update" side, where do we have the sequence:

   sg_policy->next_freq = 0;
   sg_policy->next_freq = real-next-freq;

AFAICS we always use locals for next_freq and do one single assignment
in sugov_update_commit(), isn't it?

-- 
#include <best/regards.h>

Patrick Bellasi

Reply via email to