Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Viresh Kumar
On 08-12-15, 17:42, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki 
> Subject: [PATCH] cpufreq: governor: Use lockless timer function
> 
> It is possible to get rid of the timer_lock spinlock used by the
> governor timer function for synchronization, but a couple of races
> need to be avoided.
> 
> The first race is between multiple dbs_timer_handler() instances
> that may be running in parallel with each other on different
> CPUs.  Namely, one of them has to queue up the work item, but it
> cannot be queued up more than once.  To achieve that,
> atomic_inc_return() can be used on the skip_work field of
> struct cpu_common_dbs_info.
> 
> The second race is between an already running dbs_timer_handler()
> and gov_cancel_work().  In that case the dbs_timer_handler() might
> not notice the skip_work incrementation in gov_cancel_work() and
> it might queue up its work item after gov_cancel_work() had
> returned (and that work item would corrupt skip_work going
> forward).  To prevent that from happening, gov_cancel_work()
> can be made wait for the timer function to complete (on all CPUs)
> right after skip_work has been incremented.
> 
> Signed-off-by: Rafael J. Wysocki 
> ---
>  drivers/cpufreq/cpufreq_governor.c |   49 
> -
>  drivers/cpufreq/cpufreq_governor.h |9 +-
>  2 files changed, 24 insertions(+), 34 deletions(-)

Acked-by: Viresh Kumar 

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Rafael J. Wysocki
On Tuesday, December 08, 2015 08:26:58 PM Viresh Kumar wrote:
> On 08-12-15, 15:30, Rafael J. Wysocki wrote:
> > It doesn't look nice, but then having a lockless timer function is worth
> > it in my view.
> > 
> > The code in gov_cancel_work() runs relatively rarely, but the timer
> > function can run very often, so avoiding the lock in there is a priority
> > to me.
> > 
> > Plus we can avoid disabling interrupts in two places this way.
> 
> Okay, that's good enough then. I hope you will be sending these
> patches now, right? And ofcourse, we need documentation in this case
> as well.

Your series is in my linux-next branch now, so that's just one patch on top
of it.  The current version of it is appended.  Unfortunately, I can't test
it here, but I'll do that later today.

I have updated the comments too, so please let me know if they are clear enough.

Thanks,
Rafael


---
From: Rafael J. Wysocki 
Subject: [PATCH] cpufreq: governor: Use lockless timer function

It is possible to get rid of the timer_lock spinlock used by the
governor timer function for synchronization, but a couple of races
need to be avoided.

The first race is between multiple dbs_timer_handler() instances
that may be running in parallel with each other on different
CPUs.  Namely, one of them has to queue up the work item, but it
cannot be queued up more than once.  To achieve that,
atomic_inc_return() can be used on the skip_work field of
struct cpu_common_dbs_info.

The second race is between an already running dbs_timer_handler()
and gov_cancel_work().  In that case the dbs_timer_handler() might
not notice the skip_work incrementation in gov_cancel_work() and
it might queue up its work item after gov_cancel_work() had
returned (and that work item would corrupt skip_work going
forward).  To prevent that from happening, gov_cancel_work()
can be made wait for the timer function to complete (on all CPUs)
right after skip_work has been incremented.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/cpufreq/cpufreq_governor.c |   49 -
 drivers/cpufreq/cpufreq_governor.h |9 +-
 2 files changed, 24 insertions(+), 34 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -186,22 +186,24 @@ static inline void gov_cancel_timers(str
 
 void gov_cancel_work(struct cpu_common_dbs_info *shared)
 {
-   unsigned long flags;
-
+   /* Tell dbs_timer_handler() to skip queuing up work items. */
+   atomic_inc(>skip_work);
/*
-* No work will be queued from timer handlers after skip_work is
-* updated. And so we can safely cancel the work first and then the
-* timers.
+* If dbs_timer_handler() is already running, it may not notice the
+* incremented skip_work, so wait for it to complete to prevent its work
+* item from being queued up after the cancel_work_sync() below.
+*/
+   gov_cancel_timers(shared->policy);
+   /*
+* In case dbs_timer_handler() managed to run and spawn a work item
+* before the timers have been canceled, wait for that work item to
+* complete and then cancel all of the timers set up by it.  If
+* dbs_timer_handler() runs again at that point, it will see the
+* positive value of skip_work and won't spawn any more work items.
 */
-   spin_lock_irqsave(>timer_lock, flags);
-   shared->skip_work++;
-   spin_unlock_irqrestore(>timer_lock, flags);
-
cancel_work_sync(>work);
-
gov_cancel_timers(shared->policy);
-
-   shared->skip_work = 0;
+   atomic_set(>skip_work, 0);
 }
 
 /* Will return if we need to evaluate cpu load again or not */
@@ -229,7 +231,6 @@ static void dbs_work_handler(struct work
struct cpufreq_policy *policy;
struct dbs_data *dbs_data;
unsigned int sampling_rate, delay;
-   unsigned long flags;
bool eval_load;
 
policy = shared->policy;
@@ -258,9 +259,7 @@ static void dbs_work_handler(struct work
delay = dbs_data->cdata->gov_dbs_timer(policy, eval_load);
mutex_unlock(>timer_mutex);
 
-   spin_lock_irqsave(>timer_lock, flags);
-   shared->skip_work--;
-   spin_unlock_irqrestore(>timer_lock, flags);
+   atomic_dec(>skip_work);
 
gov_add_timers(policy, delay);
 }
@@ -269,22 +268,18 @@ static void dbs_timer_handler(unsigned l
 {
struct cpu_dbs_info *cdbs = (struct cpu_dbs_info *)data;
struct cpu_common_dbs_info *shared = cdbs->shared;
-   unsigned long flags;
-
-   spin_lock_irqsave(>timer_lock, flags);
 
/*
-* Timer handler isn't allowed to queue work at the moment, because:
+* Timer handler may not be allowed to queue the work at the moment,
+* because:
 * - Another timer handler has done that

Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Viresh Kumar
On 08-12-15, 15:30, Rafael J. Wysocki wrote:
> It doesn't look nice, but then having a lockless timer function is worth
> it in my view.
> 
> The code in gov_cancel_work() runs relatively rarely, but the timer
> function can run very often, so avoiding the lock in there is a priority
> to me.
> 
> Plus we can avoid disabling interrupts in two places this way.

Okay, that's good enough then. I hope you will be sending these
patches now, right? And ofcourse, we need documentation in this case
as well.

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Rafael J. Wysocki
On Tuesday, December 08, 2015 07:25:18 PM Viresh Kumar wrote:
> On 08-12-15, 15:19, Rafael J. Wysocki wrote:
> > We know what should be done.  We need to wait for the timer function to
> > complete, then cancel the work item spawned by it (if any) and then
> > cancel the timers set by that work item.
> 
> Yeah, there is no race, but it looks ugly to me. I have written it
> earlier, and then the spinlock thing looked better to me. :)

It doesn't look nice, but then having a lockless timer function is worth
it in my view.

The code in gov_cancel_work() runs relatively rarely, but the timer
function can run very often, so avoiding the lock in there is a priority
to me.

Plus we can avoid disabling interrupts in two places this way.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Viresh Kumar
On 08-12-15, 15:19, Rafael J. Wysocki wrote:
> We know what should be done.  We need to wait for the timer function to
> complete, then cancel the work item spawned by it (if any) and then
> cancel the timers set by that work item.

Yeah, there is no race, but it looks ugly to me. I have written it
earlier, and then the spinlock thing looked better to me. :)

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Rafael J. Wysocki
On Tuesday, December 08, 2015 07:06:33 PM Viresh Kumar wrote:
> On 08-12-15, 14:30, Rafael J. Wysocki wrote:
> > OK, but instead of relying on the spinlock to wait for the already running
> 
> That's the purpose of the spinlock, not a side-effect.
> 
> > dbs_timer_handler() in gov_cancel_work() (which is really easy to overlook
> > and should at least be mentioned in a comment) we can wait for it 
> > explicitly.
> 
> I agree, and I will add explicit comment about it.
> 
> > That is, if the relevant code in gov_cancel_work() is like this:
> > 
> > 
> > atomic_inc(>skip_work);
> > gov_cancel_timers(shared->policy);
> > cancel_work_sync(>work);
> > gov_cancel_timers(shared->policy);
> 
> Apart from it being *really* ugly (we should know exactly what should
> be done, it rather looks like hit and try),

We know what should be done.  We need to wait for the timer function to
complete, then cancel the work item spawned by it (if any) and then
cancel the timers set by that work item.

> it is still racy.
> 
> > atomic_set(>skip_work, 0);
> > 
> > then the work item should not be leaked behind the cancel_work_sync() any 
> > more
> > AFAICS.
> 
> Suppose queue_work() isn't done within the spin lock.
> 
> CPU0CPU1
> 
> cpufreq_governor_stop() dbs_timer_handler()
> -> gov_cancel_work()-> lock
> -> shared->skip_work++, as 
> skip_work was 0. //skip_work=1
> -> unlock
>-> lock
>-> shared->skip_work++; //skip_work=2

(*)

>-> unlock
> -> queue_work();
>-> gov_cancel_timers(shared->policy);
> dbs_work_handler();
> -> queue-timers again (as we 
> aren't checking skip_work here)

skip_work = 1 (because dbs_work_handler() decrements it).

>-> cancel_work_sync(>work);
> dbs_timer_handler()
> -> lock
> -> shared->skip_work++, as 
> skip_work was 0.

No, it wasn't 0, it was 1, because (*) incremented it
and it has only been decremented once by dbs_work_handler().

> //skip_work=1
> -> unlock

And the below won't happen.

> ->queue_work()
>
>-> gov_cancel_timers(shared->policy);
>-> shared->skip_work = 0;
> 
> 
> And we have the same situation again.

I don't really think so.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Viresh Kumar
On 08-12-15, 14:30, Rafael J. Wysocki wrote:
> OK, but instead of relying on the spinlock to wait for the already running

That's the purpose of the spinlock, not a side-effect.

> dbs_timer_handler() in gov_cancel_work() (which is really easy to overlook
> and should at least be mentioned in a comment) we can wait for it explicitly.

I agree, and I will add explicit comment about it.

> That is, if the relevant code in gov_cancel_work() is like this:
> 
> 
>   atomic_inc(>skip_work);
>   gov_cancel_timers(shared->policy);
>   cancel_work_sync(>work);
>   gov_cancel_timers(shared->policy);

Apart from it being *really* ugly (we should know exactly what should
be done, it rather looks like hit and try), it is still racy.

>   atomic_set(>skip_work, 0);
> 
> then the work item should not be leaked behind the cancel_work_sync() any more
> AFAICS.

Suppose queue_work() isn't done within the spin lock.

CPU0CPU1

cpufreq_governor_stop() dbs_timer_handler()
-> gov_cancel_work()-> lock
-> shared->skip_work++, as 
skip_work was 0. //skip_work=1
-> unlock
   -> lock
   -> shared->skip_work++; //skip_work=2
   -> unlock
-> queue_work();
   -> gov_cancel_timers(shared->policy);
dbs_work_handler();
-> queue-timers again (as we 
aren't checking skip_work here)
   -> cancel_work_sync(>work);
dbs_timer_handler()
-> lock
-> shared->skip_work++, as 
skip_work was 0. //skip_work=1
-> unlock
->queue_work()
   -> gov_cancel_timers(shared->policy);
   -> shared->skip_work = 0;


And we have the same situation again. I have thought of all this
before I wrote the initial patch, and really tried the ugly double
timer-cancel thing. But the current approach is really the right thing
to do.

I will send a patch adding the comment.

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Rafael J. Wysocki
On Tuesday, December 08, 2015 12:29:05 PM Viresh Kumar wrote:
> On 08-12-15, 01:39, Rafael J. Wysocki wrote:
> > @@ -269,9 +259,6 @@ static void dbs_timer_handler(unsigned l
> >  {
> > struct cpu_dbs_info *cdbs = (struct cpu_dbs_info *)data;
> > struct cpu_common_dbs_info *shared = cdbs->shared;
> > -   unsigned long flags;
> > -
> > -   spin_lock_irqsave(>timer_lock, flags);
> >  
> > /*
> >  * Timer handler isn't allowed to queue work at the moment, because:
> > @@ -279,12 +266,10 @@ static void dbs_timer_handler(unsigned l
> >  * - We are stopping the governor
> >  * - Or we are updating the sampling rate of ondemand governor
> >  */
> > -   if (!shared->skip_work) {
> > -   shared->skip_work++;
> > +   if (atomic_inc_return(>skip_work) > 1)
> > +   atomic_dec(>skip_work);
> > +   else
> > queue_work(system_wq, >work);
> > -   }
> 
> As explained in the other email, this is wrong..

OK, but instead of relying on the spinlock to wait for the already running
dbs_timer_handler() in gov_cancel_work() (which is really easy to overlook
and should at least be mentioned in a comment) we can wait for it explicitly.

That is, if the relevant code in gov_cancel_work() is like this:


atomic_inc(>skip_work);
gov_cancel_timers(shared->policy);
cancel_work_sync(>work);
gov_cancel_timers(shared->policy);
atomic_set(>skip_work, 0);

then the work item should not be leaked behind the cancel_work_sync() any more
AFAICS.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Rafael J. Wysocki
On Tuesday, December 08, 2015 12:29:05 PM Viresh Kumar wrote:
> On 08-12-15, 01:39, Rafael J. Wysocki wrote:
> > @@ -269,9 +259,6 @@ static void dbs_timer_handler(unsigned l
> >  {
> > struct cpu_dbs_info *cdbs = (struct cpu_dbs_info *)data;
> > struct cpu_common_dbs_info *shared = cdbs->shared;
> > -   unsigned long flags;
> > -
> > -   spin_lock_irqsave(>timer_lock, flags);
> >  
> > /*
> >  * Timer handler isn't allowed to queue work at the moment, because:
> > @@ -279,12 +266,10 @@ static void dbs_timer_handler(unsigned l
> >  * - We are stopping the governor
> >  * - Or we are updating the sampling rate of ondemand governor
> >  */
> > -   if (!shared->skip_work) {
> > -   shared->skip_work++;
> > +   if (atomic_inc_return(>skip_work) > 1)
> > +   atomic_dec(>skip_work);
> > +   else
> > queue_work(system_wq, >work);
> > -   }
> 
> As explained in the other email, this is wrong..

OK, but instead of relying on the spinlock to wait for the already running
dbs_timer_handler() in gov_cancel_work() (which is really easy to overlook
and should at least be mentioned in a comment) we can wait for it explicitly.

That is, if the relevant code in gov_cancel_work() is like this:


atomic_inc(>skip_work);
gov_cancel_timers(shared->policy);
cancel_work_sync(>work);
gov_cancel_timers(shared->policy);
atomic_set(>skip_work, 0);

then the work item should not be leaked behind the cancel_work_sync() any more
AFAICS.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Rafael J. Wysocki
On Tuesday, December 08, 2015 07:06:33 PM Viresh Kumar wrote:
> On 08-12-15, 14:30, Rafael J. Wysocki wrote:
> > OK, but instead of relying on the spinlock to wait for the already running
> 
> That's the purpose of the spinlock, not a side-effect.
> 
> > dbs_timer_handler() in gov_cancel_work() (which is really easy to overlook
> > and should at least be mentioned in a comment) we can wait for it 
> > explicitly.
> 
> I agree, and I will add explicit comment about it.
> 
> > That is, if the relevant code in gov_cancel_work() is like this:
> > 
> > 
> > atomic_inc(>skip_work);
> > gov_cancel_timers(shared->policy);
> > cancel_work_sync(>work);
> > gov_cancel_timers(shared->policy);
> 
> Apart from it being *really* ugly (we should know exactly what should
> be done, it rather looks like hit and try),

We know what should be done.  We need to wait for the timer function to
complete, then cancel the work item spawned by it (if any) and then
cancel the timers set by that work item.

> it is still racy.
> 
> > atomic_set(>skip_work, 0);
> > 
> > then the work item should not be leaked behind the cancel_work_sync() any 
> > more
> > AFAICS.
> 
> Suppose queue_work() isn't done within the spin lock.
> 
> CPU0CPU1
> 
> cpufreq_governor_stop() dbs_timer_handler()
> -> gov_cancel_work()-> lock
> -> shared->skip_work++, as 
> skip_work was 0. //skip_work=1
> -> unlock
>-> lock
>-> shared->skip_work++; //skip_work=2

(*)

>-> unlock
> -> queue_work();
>-> gov_cancel_timers(shared->policy);
> dbs_work_handler();
> -> queue-timers again (as we 
> aren't checking skip_work here)

skip_work = 1 (because dbs_work_handler() decrements it).

>-> cancel_work_sync(>work);
> dbs_timer_handler()
> -> lock
> -> shared->skip_work++, as 
> skip_work was 0.

No, it wasn't 0, it was 1, because (*) incremented it
and it has only been decremented once by dbs_work_handler().

> //skip_work=1
> -> unlock

And the below won't happen.

> ->queue_work()
>
>-> gov_cancel_timers(shared->policy);
>-> shared->skip_work = 0;
> 
> 
> And we have the same situation again.

I don't really think so.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Viresh Kumar
On 08-12-15, 14:30, Rafael J. Wysocki wrote:
> OK, but instead of relying on the spinlock to wait for the already running

That's the purpose of the spinlock, not a side-effect.

> dbs_timer_handler() in gov_cancel_work() (which is really easy to overlook
> and should at least be mentioned in a comment) we can wait for it explicitly.

I agree, and I will add explicit comment about it.

> That is, if the relevant code in gov_cancel_work() is like this:
> 
> 
>   atomic_inc(>skip_work);
>   gov_cancel_timers(shared->policy);
>   cancel_work_sync(>work);
>   gov_cancel_timers(shared->policy);

Apart from it being *really* ugly (we should know exactly what should
be done, it rather looks like hit and try), it is still racy.

>   atomic_set(>skip_work, 0);
> 
> then the work item should not be leaked behind the cancel_work_sync() any more
> AFAICS.

Suppose queue_work() isn't done within the spin lock.

CPU0CPU1

cpufreq_governor_stop() dbs_timer_handler()
-> gov_cancel_work()-> lock
-> shared->skip_work++, as 
skip_work was 0. //skip_work=1
-> unlock
   -> lock
   -> shared->skip_work++; //skip_work=2
   -> unlock
-> queue_work();
   -> gov_cancel_timers(shared->policy);
dbs_work_handler();
-> queue-timers again (as we 
aren't checking skip_work here)
   -> cancel_work_sync(>work);
dbs_timer_handler()
-> lock
-> shared->skip_work++, as 
skip_work was 0. //skip_work=1
-> unlock
->queue_work()
   -> gov_cancel_timers(shared->policy);
   -> shared->skip_work = 0;


And we have the same situation again. I have thought of all this
before I wrote the initial patch, and really tried the ugly double
timer-cancel thing. But the current approach is really the right thing
to do.

I will send a patch adding the comment.

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Rafael J. Wysocki
On Tuesday, December 08, 2015 07:25:18 PM Viresh Kumar wrote:
> On 08-12-15, 15:19, Rafael J. Wysocki wrote:
> > We know what should be done.  We need to wait for the timer function to
> > complete, then cancel the work item spawned by it (if any) and then
> > cancel the timers set by that work item.
> 
> Yeah, there is no race, but it looks ugly to me. I have written it
> earlier, and then the spinlock thing looked better to me. :)

It doesn't look nice, but then having a lockless timer function is worth
it in my view.

The code in gov_cancel_work() runs relatively rarely, but the timer
function can run very often, so avoiding the lock in there is a priority
to me.

Plus we can avoid disabling interrupts in two places this way.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Viresh Kumar
On 08-12-15, 15:19, Rafael J. Wysocki wrote:
> We know what should be done.  We need to wait for the timer function to
> complete, then cancel the work item spawned by it (if any) and then
> cancel the timers set by that work item.

Yeah, there is no race, but it looks ugly to me. I have written it
earlier, and then the spinlock thing looked better to me. :)

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Viresh Kumar
On 08-12-15, 15:30, Rafael J. Wysocki wrote:
> It doesn't look nice, but then having a lockless timer function is worth
> it in my view.
> 
> The code in gov_cancel_work() runs relatively rarely, but the timer
> function can run very often, so avoiding the lock in there is a priority
> to me.
> 
> Plus we can avoid disabling interrupts in two places this way.

Okay, that's good enough then. I hope you will be sending these
patches now, right? And ofcourse, we need documentation in this case
as well.

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Viresh Kumar
On 08-12-15, 17:42, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki 
> Subject: [PATCH] cpufreq: governor: Use lockless timer function
> 
> It is possible to get rid of the timer_lock spinlock used by the
> governor timer function for synchronization, but a couple of races
> need to be avoided.
> 
> The first race is between multiple dbs_timer_handler() instances
> that may be running in parallel with each other on different
> CPUs.  Namely, one of them has to queue up the work item, but it
> cannot be queued up more than once.  To achieve that,
> atomic_inc_return() can be used on the skip_work field of
> struct cpu_common_dbs_info.
> 
> The second race is between an already running dbs_timer_handler()
> and gov_cancel_work().  In that case the dbs_timer_handler() might
> not notice the skip_work incrementation in gov_cancel_work() and
> it might queue up its work item after gov_cancel_work() had
> returned (and that work item would corrupt skip_work going
> forward).  To prevent that from happening, gov_cancel_work()
> can be made wait for the timer function to complete (on all CPUs)
> right after skip_work has been incremented.
> 
> Signed-off-by: Rafael J. Wysocki 
> ---
>  drivers/cpufreq/cpufreq_governor.c |   49 
> -
>  drivers/cpufreq/cpufreq_governor.h |9 +-
>  2 files changed, 24 insertions(+), 34 deletions(-)

Acked-by: Viresh Kumar 

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-08 Thread Rafael J. Wysocki
On Tuesday, December 08, 2015 08:26:58 PM Viresh Kumar wrote:
> On 08-12-15, 15:30, Rafael J. Wysocki wrote:
> > It doesn't look nice, but then having a lockless timer function is worth
> > it in my view.
> > 
> > The code in gov_cancel_work() runs relatively rarely, but the timer
> > function can run very often, so avoiding the lock in there is a priority
> > to me.
> > 
> > Plus we can avoid disabling interrupts in two places this way.
> 
> Okay, that's good enough then. I hope you will be sending these
> patches now, right? And ofcourse, we need documentation in this case
> as well.

Your series is in my linux-next branch now, so that's just one patch on top
of it.  The current version of it is appended.  Unfortunately, I can't test
it here, but I'll do that later today.

I have updated the comments too, so please let me know if they are clear enough.

Thanks,
Rafael


---
From: Rafael J. Wysocki 
Subject: [PATCH] cpufreq: governor: Use lockless timer function

It is possible to get rid of the timer_lock spinlock used by the
governor timer function for synchronization, but a couple of races
need to be avoided.

The first race is between multiple dbs_timer_handler() instances
that may be running in parallel with each other on different
CPUs.  Namely, one of them has to queue up the work item, but it
cannot be queued up more than once.  To achieve that,
atomic_inc_return() can be used on the skip_work field of
struct cpu_common_dbs_info.

The second race is between an already running dbs_timer_handler()
and gov_cancel_work().  In that case the dbs_timer_handler() might
not notice the skip_work incrementation in gov_cancel_work() and
it might queue up its work item after gov_cancel_work() had
returned (and that work item would corrupt skip_work going
forward).  To prevent that from happening, gov_cancel_work()
can be made wait for the timer function to complete (on all CPUs)
right after skip_work has been incremented.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/cpufreq/cpufreq_governor.c |   49 -
 drivers/cpufreq/cpufreq_governor.h |9 +-
 2 files changed, 24 insertions(+), 34 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -186,22 +186,24 @@ static inline void gov_cancel_timers(str
 
 void gov_cancel_work(struct cpu_common_dbs_info *shared)
 {
-   unsigned long flags;
-
+   /* Tell dbs_timer_handler() to skip queuing up work items. */
+   atomic_inc(>skip_work);
/*
-* No work will be queued from timer handlers after skip_work is
-* updated. And so we can safely cancel the work first and then the
-* timers.
+* If dbs_timer_handler() is already running, it may not notice the
+* incremented skip_work, so wait for it to complete to prevent its work
+* item from being queued up after the cancel_work_sync() below.
+*/
+   gov_cancel_timers(shared->policy);
+   /*
+* In case dbs_timer_handler() managed to run and spawn a work item
+* before the timers have been canceled, wait for that work item to
+* complete and then cancel all of the timers set up by it.  If
+* dbs_timer_handler() runs again at that point, it will see the
+* positive value of skip_work and won't spawn any more work items.
 */
-   spin_lock_irqsave(>timer_lock, flags);
-   shared->skip_work++;
-   spin_unlock_irqrestore(>timer_lock, flags);
-
cancel_work_sync(>work);
-
gov_cancel_timers(shared->policy);
-
-   shared->skip_work = 0;
+   atomic_set(>skip_work, 0);
 }
 
 /* Will return if we need to evaluate cpu load again or not */
@@ -229,7 +231,6 @@ static void dbs_work_handler(struct work
struct cpufreq_policy *policy;
struct dbs_data *dbs_data;
unsigned int sampling_rate, delay;
-   unsigned long flags;
bool eval_load;
 
policy = shared->policy;
@@ -258,9 +259,7 @@ static void dbs_work_handler(struct work
delay = dbs_data->cdata->gov_dbs_timer(policy, eval_load);
mutex_unlock(>timer_mutex);
 
-   spin_lock_irqsave(>timer_lock, flags);
-   shared->skip_work--;
-   spin_unlock_irqrestore(>timer_lock, flags);
+   atomic_dec(>skip_work);
 
gov_add_timers(policy, delay);
 }
@@ -269,22 +268,18 @@ static void dbs_timer_handler(unsigned l
 {
struct cpu_dbs_info *cdbs = (struct cpu_dbs_info *)data;
struct cpu_common_dbs_info *shared = cdbs->shared;
-   unsigned long flags;
-
-   spin_lock_irqsave(>timer_lock, flags);
 
/*
-* Timer handler isn't allowed to queue work at the moment, because:
+* Timer handler may not be allowed to queue the work at the moment,
+* 

Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-07 Thread Viresh Kumar
On 08-12-15, 01:39, Rafael J. Wysocki wrote:
> @@ -269,9 +259,6 @@ static void dbs_timer_handler(unsigned l
>  {
>   struct cpu_dbs_info *cdbs = (struct cpu_dbs_info *)data;
>   struct cpu_common_dbs_info *shared = cdbs->shared;
> - unsigned long flags;
> -
> - spin_lock_irqsave(>timer_lock, flags);
>  
>   /*
>* Timer handler isn't allowed to queue work at the moment, because:
> @@ -279,12 +266,10 @@ static void dbs_timer_handler(unsigned l
>* - We are stopping the governor
>* - Or we are updating the sampling rate of ondemand governor
>*/
> - if (!shared->skip_work) {
> - shared->skip_work++;
> + if (atomic_inc_return(>skip_work) > 1)
> + atomic_dec(>skip_work);
> + else
>   queue_work(system_wq, >work);
> - }

As explained in the other email, this is wrong..

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-07 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

Use the observation that if skip_work in struct cpu_common_dbs_info
is an atomic_t variable, the code may be rearranged to avoid using
the timer_lock spinlock in which case that lock is not necessary any
more.

Make that change and drop timer_lock.

Signed-off-by: Rafael J. Wysocki 
---

This is on top of my current linux-next branch.  Completely experimental and
untested.

---
 drivers/cpufreq/cpufreq_governor.c |   29 +++--
 drivers/cpufreq/cpufreq_governor.h |9 ++---
 2 files changed, 9 insertions(+), 29 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -186,22 +186,15 @@ static inline void gov_cancel_timers(str
 
 void gov_cancel_work(struct cpu_common_dbs_info *shared)
 {
-   unsigned long flags;
-
/*
 * No work will be queued from timer handlers after skip_work is
 * updated. And so we can safely cancel the work first and then the
 * timers.
 */
-   spin_lock_irqsave(>timer_lock, flags);
-   shared->skip_work++;
-   spin_unlock_irqrestore(>timer_lock, flags);
-
+   atomic_inc(>skip_work);
cancel_work_sync(>work);
-
gov_cancel_timers(shared->policy);
-
-   shared->skip_work = 0;
+   atomic_set(>skip_work, 0);
 }
 
 /* Will return if we need to evaluate cpu load again or not */
@@ -229,7 +222,6 @@ static void dbs_work_handler(struct work
struct cpufreq_policy *policy;
struct dbs_data *dbs_data;
unsigned int sampling_rate, delay;
-   unsigned long flags;
bool eval_load;
 
policy = shared->policy;
@@ -258,9 +250,7 @@ static void dbs_work_handler(struct work
delay = dbs_data->cdata->gov_dbs_timer(policy, eval_load);
mutex_unlock(>timer_mutex);
 
-   spin_lock_irqsave(>timer_lock, flags);
-   shared->skip_work--;
-   spin_unlock_irqrestore(>timer_lock, flags);
+   atomic_dec(>skip_work);
 
gov_add_timers(policy, delay);
 }
@@ -269,9 +259,6 @@ static void dbs_timer_handler(unsigned l
 {
struct cpu_dbs_info *cdbs = (struct cpu_dbs_info *)data;
struct cpu_common_dbs_info *shared = cdbs->shared;
-   unsigned long flags;
-
-   spin_lock_irqsave(>timer_lock, flags);
 
/*
 * Timer handler isn't allowed to queue work at the moment, because:
@@ -279,12 +266,10 @@ static void dbs_timer_handler(unsigned l
 * - We are stopping the governor
 * - Or we are updating the sampling rate of ondemand governor
 */
-   if (!shared->skip_work) {
-   shared->skip_work++;
+   if (atomic_inc_return(>skip_work) > 1)
+   atomic_dec(>skip_work);
+   else
queue_work(system_wq, >work);
-   }
-
-   spin_unlock_irqrestore(>timer_lock, flags);
 }
 
 static void set_sampling_rate(struct dbs_data *dbs_data,
@@ -315,7 +300,7 @@ static int alloc_common_dbs_info(struct
cdata->get_cpu_cdbs(j)->shared = shared;
 
mutex_init(>timer_mutex);
-   spin_lock_init(>timer_lock);
+   atomic_set(>skip_work, 0);
INIT_WORK(>work, dbs_work_handler);
return 0;
 }
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -17,6 +17,7 @@
 #ifndef _CPUFREQ_GOVERNOR_H
 #define _CPUFREQ_GOVERNOR_H
 
+#include 
 #include 
 #include 
 #include 
@@ -137,14 +138,8 @@ struct cpu_common_dbs_info {
 */
struct mutex timer_mutex;
 
-   /*
-* Per policy lock that serializes access to queuing work from timer
-* handlers.
-*/
-   spinlock_t timer_lock;
-
ktime_t time_stamp;
-   unsigned int skip_work;
+   atomic_t skip_work;
struct work_struct work;
 };
 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-07 Thread Viresh Kumar
On 08-12-15, 01:39, Rafael J. Wysocki wrote:
> @@ -269,9 +259,6 @@ static void dbs_timer_handler(unsigned l
>  {
>   struct cpu_dbs_info *cdbs = (struct cpu_dbs_info *)data;
>   struct cpu_common_dbs_info *shared = cdbs->shared;
> - unsigned long flags;
> -
> - spin_lock_irqsave(>timer_lock, flags);
>  
>   /*
>* Timer handler isn't allowed to queue work at the moment, because:
> @@ -279,12 +266,10 @@ static void dbs_timer_handler(unsigned l
>* - We are stopping the governor
>* - Or we are updating the sampling rate of ondemand governor
>*/
> - if (!shared->skip_work) {
> - shared->skip_work++;
> + if (atomic_inc_return(>skip_work) > 1)
> + atomic_dec(>skip_work);
> + else
>   queue_work(system_wq, >work);
> - }

As explained in the other email, this is wrong..

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH][experimantal] cpufreq: governor: Use an atomic variable for synchronization

2015-12-07 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

Use the observation that if skip_work in struct cpu_common_dbs_info
is an atomic_t variable, the code may be rearranged to avoid using
the timer_lock spinlock in which case that lock is not necessary any
more.

Make that change and drop timer_lock.

Signed-off-by: Rafael J. Wysocki 
---

This is on top of my current linux-next branch.  Completely experimental and
untested.

---
 drivers/cpufreq/cpufreq_governor.c |   29 +++--
 drivers/cpufreq/cpufreq_governor.h |9 ++---
 2 files changed, 9 insertions(+), 29 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -186,22 +186,15 @@ static inline void gov_cancel_timers(str
 
 void gov_cancel_work(struct cpu_common_dbs_info *shared)
 {
-   unsigned long flags;
-
/*
 * No work will be queued from timer handlers after skip_work is
 * updated. And so we can safely cancel the work first and then the
 * timers.
 */
-   spin_lock_irqsave(>timer_lock, flags);
-   shared->skip_work++;
-   spin_unlock_irqrestore(>timer_lock, flags);
-
+   atomic_inc(>skip_work);
cancel_work_sync(>work);
-
gov_cancel_timers(shared->policy);
-
-   shared->skip_work = 0;
+   atomic_set(>skip_work, 0);
 }
 
 /* Will return if we need to evaluate cpu load again or not */
@@ -229,7 +222,6 @@ static void dbs_work_handler(struct work
struct cpufreq_policy *policy;
struct dbs_data *dbs_data;
unsigned int sampling_rate, delay;
-   unsigned long flags;
bool eval_load;
 
policy = shared->policy;
@@ -258,9 +250,7 @@ static void dbs_work_handler(struct work
delay = dbs_data->cdata->gov_dbs_timer(policy, eval_load);
mutex_unlock(>timer_mutex);
 
-   spin_lock_irqsave(>timer_lock, flags);
-   shared->skip_work--;
-   spin_unlock_irqrestore(>timer_lock, flags);
+   atomic_dec(>skip_work);
 
gov_add_timers(policy, delay);
 }
@@ -269,9 +259,6 @@ static void dbs_timer_handler(unsigned l
 {
struct cpu_dbs_info *cdbs = (struct cpu_dbs_info *)data;
struct cpu_common_dbs_info *shared = cdbs->shared;
-   unsigned long flags;
-
-   spin_lock_irqsave(>timer_lock, flags);
 
/*
 * Timer handler isn't allowed to queue work at the moment, because:
@@ -279,12 +266,10 @@ static void dbs_timer_handler(unsigned l
 * - We are stopping the governor
 * - Or we are updating the sampling rate of ondemand governor
 */
-   if (!shared->skip_work) {
-   shared->skip_work++;
+   if (atomic_inc_return(>skip_work) > 1)
+   atomic_dec(>skip_work);
+   else
queue_work(system_wq, >work);
-   }
-
-   spin_unlock_irqrestore(>timer_lock, flags);
 }
 
 static void set_sampling_rate(struct dbs_data *dbs_data,
@@ -315,7 +300,7 @@ static int alloc_common_dbs_info(struct
cdata->get_cpu_cdbs(j)->shared = shared;
 
mutex_init(>timer_mutex);
-   spin_lock_init(>timer_lock);
+   atomic_set(>skip_work, 0);
INIT_WORK(>work, dbs_work_handler);
return 0;
 }
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -17,6 +17,7 @@
 #ifndef _CPUFREQ_GOVERNOR_H
 #define _CPUFREQ_GOVERNOR_H
 
+#include 
 #include 
 #include 
 #include 
@@ -137,14 +138,8 @@ struct cpu_common_dbs_info {
 */
struct mutex timer_mutex;
 
-   /*
-* Per policy lock that serializes access to queuing work from timer
-* handlers.
-*/
-   spinlock_t timer_lock;
-
ktime_t time_stamp;
-   unsigned int skip_work;
+   atomic_t skip_work;
struct work_struct work;
 };
 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/