Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-13 Thread Li, Aubrey
On 2014/11/13 21:06, Thomas Gleixner wrote:
> On Thu, 13 Nov 2014, Li, Aubrey wrote:
> 
>> On 2014/11/13 17:10, Thomas Gleixner wrote:
>>> On Thu, 13 Nov 2014, Peter Zijlstra wrote:
>>>> On Wed, Nov 12, 2014 at 10:09:47PM +0100, Thomas Gleixner wrote:
>>>> But sure, we can add suspend notifiers to stuff to shut down timers; I
>>>> should have a patch for at least one of the offenders somewhere. But I
>>>> really think that we should not be looking at the individual timers for
>>>> this, none of the other suspend modes care about active timers.
>>>
>>> Fair enough.
>>>  
>>
>> If you are okay with the current method to suspend timekeeping entirely,
>> then we can go further to fix the remaining concerns.
> 
> I'm fine with that when it's done proper :)
> 

Sure, thanks for the suggestion, let me try my best to make you happy, ;)

> Thanks,
> 
>   tglx
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-13 Thread Thomas Gleixner
On Thu, 13 Nov 2014, Li, Aubrey wrote:

> On 2014/11/13 17:10, Thomas Gleixner wrote:
> > On Thu, 13 Nov 2014, Peter Zijlstra wrote:
> >> On Wed, Nov 12, 2014 at 10:09:47PM +0100, Thomas Gleixner wrote:
> >> But sure, we can add suspend notifiers to stuff to shut down timers; I
> >> should have a patch for at least one of the offenders somewhere. But I
> >> really think that we should not be looking at the individual timers for
> >> this, none of the other suspend modes care about active timers.
> > 
> > Fair enough.
> >  
> 
> If you are okay with the current method to suspend timekeeping entirely,
> then we can go further to fix the remaining concerns.

I'm fine with that when it's done proper :)

Thanks,

tglx


Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-13 Thread Li, Aubrey
On 2014/11/13 17:19, Thomas Gleixner wrote:
> On Thu, 13 Nov 2014, Li, Aubrey wrote:
>> On 2014/11/13 9:37, Peter Zijlstra wrote:
>>> On Wed, Nov 12, 2014 at 10:09:47PM +0100, Thomas Gleixner wrote:
>>>> On Thu, 30 Oct 2014, Li, Aubrey wrote:
>>>>
>>>>> Freeze is a general power-saving state in which processes are frozen,
>>>>> devices are suspended and CPUs are idle. However, when the system
>>>>> enters the freeze state, a few timers keep ticking and hence consume
>>>>> more power unnecessarily. The observed timer events in freeze state are:
>>>>> - tick_sched_timer
>>>>> - watchdog lockup detector
>>>>> - realtime scheduler period timer
>>>>>
>>>>> The system power consumption in freeze state will be reduced significantly
>>>>> if we quiesce these timers.
>>>>
>>>> So the obvious question is why don't we quiesce these timers by telling
>>>> the subsystems which manage these timers to shut them down?
>>>>
>>>> I really want a proper answer for this in the first place, but let me
>>>> look at the proposed "solution" as well.
>>>
>>> Two arguments here:
>>>
>>>  1) the current suspend modes don't care, so if this suspend mode starts
>>>  to care, it's likely to 'break' in the future simply because people
>>>  never cared about timers.
>>>
>>>  2) there could be userland timers, userland is frozen but they'll still
>>>  have their timers set and those can and will fire.
>>>
>>> But sure, we can add suspend notifiers to stuff to shut down timers; I
>>> should have a patch for at least one of the offenders somewhere. But I
>>> really think that we should not be looking at the individual timers for
>>> this, none of the other suspend modes care about active timers.
>>>
>>>> But before we do that we want a proper explanation why the interrupt
>>>> fires at all. The lack of explanation clearly documents that this is a
>>>> 'hacked it into submission' approach.
>>>
>>> From what I remember it's the waking interrupt that ends up in the
>>> timekeeping code, Li should have a backtrace somewhere.
>>
>> There are two race conditions:
>>
>> The first one occurs after interrupts are disabled and before we
>> suspend the lapic. If an apic timer interrupt occurs in this time slot,
>> it stays pending because interrupts are disabled. Then we suspend
>> timekeeping, enter idle, and exit idle with interrupts re-enabled, so
>> the timer interrupt is handled while timekeeping is suspended.
>>
>> The other occurs after timekeeping_suspended = 1 and before we suspend
>> lapic. In this time slot, if apic timer interrupt occurs, we invoke the
>> timer interrupt while timekeeping is suspended as above.
> 
> And that race exists for every implementation and is not at all apic
> timer specific. So we fix it at the core and not at some random place
> in the architecture code.
> 
You're right, will refine this in the next patch version.

Thanks,
-Aubrey


Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-13 Thread Li, Aubrey
On 2014/11/13 17:10, Thomas Gleixner wrote:
> On Thu, 13 Nov 2014, Peter Zijlstra wrote:
>> On Wed, Nov 12, 2014 at 10:09:47PM +0100, Thomas Gleixner wrote:
>> But sure, we can add suspend notifiers to stuff to shut down timers; I
>> should have a patch for at least one of the offenders somewhere. But I
>> really think that we should not be looking at the individual timers for
>> this, none of the other suspend modes care about active timers.
> 
> Fair enough.
>  

If you are okay with the current method to suspend timekeeping entirely,
then we can go further to fix the remaining concerns.

Thanks,
-Aubrey


Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-13 Thread Thomas Gleixner
On Thu, 13 Nov 2014, Li, Aubrey wrote:
> On 2014/11/13 9:37, Peter Zijlstra wrote:
> > On Wed, Nov 12, 2014 at 10:09:47PM +0100, Thomas Gleixner wrote:
> >> On Thu, 30 Oct 2014, Li, Aubrey wrote:
> >>
> >>> Freeze is a general power-saving state in which processes are frozen,
> >>> devices are suspended and CPUs are idle. However, when the system
> >>> enters the freeze state, a few timers keep ticking and hence consume
> >>> more power unnecessarily. The observed timer events in freeze state are:
> >>> - tick_sched_timer
> >>> - watchdog lockup detector
> >>> - realtime scheduler period timer
> >>>
> >>> The system power consumption in freeze state will be reduced significantly
> >>> if we quiesce these timers.
> >>
> >> So the obvious question is why don't we quiesce these timers by telling
> >> the subsystems which manage these timers to shut them down?
> >>
> >> I really want a proper answer for this in the first place, but let me
> >> look at the proposed "solution" as well.
> > 
> > Two arguments here:
> > 
> >  1) the current suspend modes don't care, so if this suspend mode starts
> >  to care, it's likely to 'break' in the future simply because people
> >  never cared about timers.
> > 
> >  2) there could be userland timers, userland is frozen but they'll still
> >  have their timers set and those can and will fire.
> > 
> > But sure, we can add suspend notifiers to stuff to shut down timers; I
> > should have a patch for at least one of the offenders somewhere. But I
> > really think that we should not be looking at the individual timers for
> > this, none of the other suspend modes care about active timers.
> > 
> >> But before we do that we want a proper explanation why the interrupt
> >> fires at all. The lack of explanation clearly documents that this is a
> >> 'hacked it into submission' approach.
> > 
> > From what I remember it's the waking interrupt that ends up in the
> > timekeeping code, Li should have a backtrace somewhere.
> 
> There are two race conditions:
> 
> The first one occurs after interrupts are disabled and before we
> suspend the lapic. If an apic timer interrupt occurs in this time slot,
> it stays pending because interrupts are disabled. Then we suspend
> timekeeping, enter idle, and exit idle with interrupts re-enabled, so
> the timer interrupt is handled while timekeeping is suspended.
> 
> The other occurs after timekeeping_suspended = 1 and before we suspend
> lapic. In this time slot, if apic timer interrupt occurs, we invoke the
> timer interrupt while timekeeping is suspended as above.

And that race exists for every implementation and is not at all apic
timer specific. So we fix it at the core and not at some random place
in the architecture code.

Thanks,

tglx


Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-13 Thread Thomas Gleixner
On Thu, 13 Nov 2014, Peter Zijlstra wrote:
> On Wed, Nov 12, 2014 at 10:09:47PM +0100, Thomas Gleixner wrote:
> But sure, we can add suspend notifiers to stuff to shut down timers; I
> should have a patch for at least one of the offenders somewhere. But I
> really think that we should not be looking at the individual timers for
> this, none of the other suspend modes care about active timers.

Fair enough.
 
> > But before we do that we want a proper explanation why the interrupt
> > fires at all. The lack of explanation clearly documents that this is a
> > 'hacked it into submission' approach.
> 
> From what I remember it's the waking interrupt that ends up in the
> timekeeping code, Li should have a backtrace somewhere.

I can imagine what happens :)

> > stomp_machine() is in 99% of all use cases a clear indicator for a
> > complete design failure.
> 
> >    So the generic idle task needs a check like this:
> > 
> >    if (idle_should_freeze())
> > 	frozen_idle();
> 
> So that is adding extra code to fairly common/hot paths just for this
> one extra special case. I tried to avoid doing that.

idle enter is not that much of a hot path, really.
 
Thanks,

tglx


Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-12 Thread Li, Aubrey
On 2014/11/13 9:37, Peter Zijlstra wrote:
> On Wed, Nov 12, 2014 at 10:09:47PM +0100, Thomas Gleixner wrote:
>> On Thu, 30 Oct 2014, Li, Aubrey wrote:
>>
>>> Freeze is a general power-saving state in which processes are frozen,
>>> devices are suspended and CPUs are idle. However, when the system
>>> enters the freeze state, a few timers keep ticking and hence consume
>>> more power unnecessarily. The observed timer events in freeze state are:
>>> - tick_sched_timer
>>> - watchdog lockup detector
>>> - realtime scheduler period timer
>>>
>>> The system power consumption in freeze state will be reduced significantly
>>> if we quiesce these timers.
>>
>> So the obvious question is why don't we quiesce these timers by telling
>> the subsystems which manage these timers to shut them down?
>>
>> I really want a proper answer for this in the first place, but let me
>> look at the proposed "solution" as well.
> 
> Two arguments here:
> 
>  1) the current suspend modes don't care, so if this suspend mode starts
>  to care, it's likely to 'break' in the future simply because people
>  never cared about timers.
> 
>  2) there could be userland timers, userland is frozen but they'll still
>  have their timers set and those can and will fire.
> 
> But sure, we can add suspend notifiers to stuff to shut down timers; I
> should have a patch for at least one of the offenders somewhere. But I
> really think that we should not be looking at the individual timers for
> this, none of the other suspend modes care about active timers.
> 
>> But before we do that we want a proper explanation why the interrupt
>> fires at all. The lack of explanation clearly documents that this is a
>> 'hacked it into submission' approach.
> 
> From what I remember it's the waking interrupt that ends up in the
> timekeeping code, Li should have a backtrace somewhere.

There are two race conditions:

The first one occurs after interrupts are disabled and before we
suspend the lapic. If an apic timer interrupt occurs in this time slot,
it stays pending because interrupts are disabled. Then we suspend
timekeeping, enter idle, and exit idle with interrupts re-enabled, so
the timer interrupt is handled while timekeeping is suspended.

The other occurs after timekeeping_suspended = 1 and before we suspend
lapic. In this time slot, if apic timer interrupt occurs, we invoke the
timer interrupt while timekeeping is suspended as above.
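
The first race can be replayed with a small user-space model. This is only a sketch under stated assumptions: struct demo_cpu and every demo_* function are hypothetical names invented for illustration, not kernel symbols, and the model covers just the ordering described for the first race (interrupts off, timer fires, timekeeping suspended, interrupts back on).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU; none of these names are kernel symbols. */
struct demo_cpu {
	bool irqs_enabled;
	bool timer_irq_pending;
	bool timekeeping_suspended;
	int handled_while_suspended;	/* the problematic case */
	int handled_normally;
};

/* The timer interrupt handler: record the timekeeping state it ran in. */
static void demo_handle_timer_irq(struct demo_cpu *cpu)
{
	if (cpu->timekeeping_suspended)
		cpu->handled_while_suspended++;
	else
		cpu->handled_normally++;
}

/* The timer hardware raises its interrupt line. */
static void demo_timer_fires(struct demo_cpu *cpu)
{
	if (cpu->irqs_enabled)
		demo_handle_timer_irq(cpu);
	else
		cpu->timer_irq_pending = true;	/* latched until irqs are on */
}

/* Re-enabling interrupts delivers whatever was latched in the meantime. */
static void demo_irq_enable(struct demo_cpu *cpu)
{
	cpu->irqs_enabled = true;
	if (cpu->timer_irq_pending) {
		cpu->timer_irq_pending = false;
		demo_handle_timer_irq(cpu);
	}
}

/* Replays the first race in the order the message describes. */
static void demo_first_race(struct demo_cpu *cpu)
{
	assert(!cpu->timekeeping_suspended);
	cpu->irqs_enabled = false;		/* interrupts disabled */
	demo_timer_fires(cpu);			/* fires inside the window */
	cpu->timekeeping_suspended = true;	/* timekeeping suspended */
	demo_irq_enable(cpu);			/* idle exit re-enables irqs */
}
```

Calling demo_first_race() on a CPU that starts with interrupts enabled leaves handled_while_suspended at 1: the latched interrupt is only serviced after timekeeping has already been suspended.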

Thanks,
-Aubrey
> 
>>> +#include "../time/tick-internal.h"
>>> +#include "../time/timekeeping_internal.h"
>>
>> Eew.
> 
> I knew you'd love that :-)
> 
>> So you export the world and some more from timekeeping and the tick
>> code and fiddle with it randomly just to do:
>>
>> 1) Suspend clock event devices
>> 2) Suspend timekeeping
>> 3) Resume timekeeping
>> 4) Resume clock event devices
> 
> Sure, we can add some exports and clean that up, but..
> 
>> stomp_machine() is in 99% of all use cases a clear indicator for a
>> complete design failure.
> 
>>    So the generic idle task needs a check like this:
>>
>>    if (idle_should_freeze())
>> 	frozen_idle();
> 
> So that is adding extra code to fairly common/hot paths just for this
> one extra special case. I tried to avoid doing that.
> 
> But I suppose we can try and merge that with the offline case and guard
> both special cases with a single variable or so.
> 
> 



Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-12 Thread Peter Zijlstra
On Wed, Nov 12, 2014 at 10:09:47PM +0100, Thomas Gleixner wrote:
> On Thu, 30 Oct 2014, Li, Aubrey wrote:
> 
> > Freeze is a general power-saving state in which processes are frozen,
> > devices are suspended and CPUs are idle. However, when the system
> > enters the freeze state, a few timers keep ticking and hence consume
> > more power unnecessarily. The observed timer events in freeze state are:
> > - tick_sched_timer
> > - watchdog lockup detector
> > - realtime scheduler period timer
> > 
> > The system power consumption in freeze state will be reduced significantly
> > if we quiesce these timers.
> 
> So the obvious question is why don't we quiesce these timers by telling
> the subsystems which manage these timers to shut them down?
> 
> I really want a proper answer for this in the first place, but let me
> look at the proposed "solution" as well.

Two arguments here:

 1) the current suspend modes don't care, so if this suspend mode starts
 to care, it's likely to 'break' in the future simply because people
 never cared about timers.

 2) there could be userland timers, userland is frozen but they'll still
 have their timers set and those can and will fire.

But sure, we can add suspend notifiers to stuff to shut down timers; I
should have a patch for at least one of the offenders somewhere. But I
really think that we should not be looking at the individual timers for
this, none of the other suspend modes care about active timers.

> But before we do that we want a proper explanation why the interrupt
> fires at all. The lack of explanation clearly documents that this is a
> 'hacked it into submission' approach.

From what I remember it's the waking interrupt that ends up in the
timekeeping code, Li should have a backtrace somewhere.

> > +#include "../time/tick-internal.h"
> > +#include "../time/timekeeping_internal.h"
> 
> Eew.

I knew you'd love that :-)

> So you export the world and some more from timekeeping and the tick
> code and fiddle with it randomly just to do:
> 
> 1) Suspend clock event devices
> 2) Suspend timekeeping
> 3) Resume timekeeping
> 4) Resume clock event devices

Sure, we can add some exports and clean that up, but..

> stomp_machine() is in 99% of all use cases a clear indicator for a
> complete design failure.

>    So the generic idle task needs a check like this:
> 
>    if (idle_should_freeze())
> 	frozen_idle();

So that is adding extra code to fairly common/hot paths just for this
one extra special case. I tried to avoid doing that.

But I suppose we can try and merge that with the offline case and guard
both special cases with a single variable or so.


Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-12 Thread Thomas Gleixner
On Thu, 30 Oct 2014, Li, Aubrey wrote:

> Freeze is a general power-saving state in which processes are frozen,
> devices are suspended and CPUs are idle. However, when the system
> enters the freeze state, a few timers keep ticking and hence consume
> more power unnecessarily. The observed timer events in freeze state are:
> - tick_sched_timer
> - watchdog lockup detector
> - realtime scheduler period timer
> 
> The system power consumption in freeze state will be reduced significantly
> if we quiesce these timers.

So the obvious question is why don't we quiesce these timers by telling
the subsystems which manage these timers to shut them down?

I really want a proper answer for this in the first place, but let me
look at the proposed "solution" as well.

> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 6776027..f2bb645 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -917,6 +917,14 @@ static void local_apic_timer_interrupt(void)
>  	 */
>  	inc_irq_stat(apic_timer_irqs);
>  
> +	/*
> +	 * if timekeeping is suspended, the clock event device will be
> +	 * suspended as well, so we are not supposed to invoke the event
> +	 * handler of clock event device.
> +	 */
> +	if (unlikely(timekeeping_suspended))
> +		return;

Why do you need that if you already suspended the clock event device?
The above comment does not explain that at all.

So if there is a proper reason to do so, we rather do the following in
tick_suspend():

td->evtdev.real_handler = td->evtdev.event_handler;
td->evtdev.event_handler = clockevents_handle_noop;

and restore that on resume instead of sprinkling if (tk_suspended)
checks all over the place. x86/apic is probably not the only one which
wants that treatment.
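
The handler swap suggested above can be sketched in user space. This is a minimal illustration under stated assumptions, not kernel code: struct demo_clock_event_device and the demo_* functions are hypothetical stand-ins, and the real_handler field is the one proposed in this message, not an existing member of the kernel structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a clock event device. */
struct demo_clock_event_device {
	void (*event_handler)(struct demo_clock_event_device *dev);
	void (*real_handler)(struct demo_clock_event_device *dev); /* saved across suspend */
	unsigned long events_handled;
};

/* Counts events; stands in for a real tick handler. */
static void demo_tick_handler(struct demo_clock_event_device *dev)
{
	dev->events_handled++;
}

/* Mirrors the idea of clockevents_handle_noop(): swallow the event. */
static void demo_handle_noop(struct demo_clock_event_device *dev)
{
	(void)dev;
}

/* On suspend, park the real handler and install the no-op. */
static void demo_tick_suspend(struct demo_clock_event_device *dev)
{
	assert(dev->real_handler == NULL);	/* no nested suspend in this sketch */
	dev->real_handler = dev->event_handler;
	dev->event_handler = demo_handle_noop;
}

/* On resume, put the real handler back. */
static void demo_tick_resume(struct demo_clock_event_device *dev)
{
	assert(dev->real_handler != NULL);
	dev->event_handler = dev->real_handler;
	dev->real_handler = NULL;
}

/* The interrupt entry point needs no "is it suspended" check at all. */
static void demo_timer_interrupt(struct demo_clock_event_device *dev)
{
	dev->event_handler(dev);
}
```

With this arrangement an interrupt that arrives between demo_tick_suspend() and demo_tick_resume() is silently dropped by the no-op handler, so the per-architecture interrupt code stays free of suspend checks.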

But before we do that we want a proper explanation why the interrupt
fires at all. The lack of explanation clearly documents that this is a
'hacked it into submission' approach.

> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index 4ca9a33..660fd15 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -28,16 +28,20 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
>  
>  #include "power.h"
> +#include "../time/tick-internal.h"
> +#include "../time/timekeeping_internal.h"

Eew.
  
> +static void freezer_pick_tk(int cpu)
> +{
> +	if (tick_do_timer_cpu == TICK_DO_TIMER_NONE) {
> +		static DEFINE_SPINLOCK(lock);
> +
> +		spin_lock(&lock);
> +		if (tick_do_timer_cpu == TICK_DO_TIMER_NONE)
> +			tick_do_timer_cpu = cpu;
> +		spin_unlock(&lock);
> +	}
> +}
> +static void freezer_suspend_clkevt(int cpu)
> +{
> +	if (tick_do_timer_cpu == cpu)
> +		return;
> +
> +	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
> +}
> +
> +static void freezer_suspend_tk(int cpu)
> +{
> +	if (tick_do_timer_cpu != cpu)
> +		return;
> +
> +	timekeeping_suspend();
> +
> +}

So you export the world and some more from timekeeping and the tick
code and fiddle with it randomly just to do:

1) Suspend clock event devices
2) Suspend timekeeping
3) Resume timekeeping
4) Resume clock event devices

And for that you kick the frozen cpus out of idle into the
stomp_machine task and let them enter deep idle from there.

stomp_machine() is in 99% of all use cases a clear indicator for a
complete design failure.

It's not that hard to solve that problem, w/o stomp_machine and w/o
all the tick_do_timer_cpu mess.

1) Run the freeze code until freeze_enter()

2) Prevent CPU hotplug and switch state.

   That tells the cpu idle code to enter the deepest idle state and
   also tells the clock events code about the desire to freeze
   everything.

   clock_events_set_freeze_state(true);

   And let that be:

   clock_events_set_freeze_state(bool on)
   {
	raw_spin_lock_irq(&clockevents_lock);
	if (on)
		tobefrozen_cpus = num_online_cpus();
	idle_freeze = on;
	raw_spin_unlock_irq(&clockevents_lock);
   }

   So the generic idle task needs a check like this:

   if (idle_should_freeze())
	frozen_idle();

   with the implementation:

   bool idle_should_freeze()
   {
	return clock_events_get_freeze_state();
   }

   which resolves to:

   bool clock_events_get_freeze_state()
   {
	/*
	 * Lockfree access because it does not matter.
	 *
	 * See below at CLOCK_EVT_NOTIFY_FREEZE
	 */
	return idle_freeze;
   }

4) Kick all cpus out of idle, so they enter the deep idle state via
   frozen_idle()

   frozen_idle()
   {
	if (clock_events_notify(CLOCK_EVT_NOTIFY_FREEZE))
		return;

	while (idle_should_freeze())
		magic_frozen_idle();

	clock_events_notify(CLOCK_EVT_NOTIFY_UNFREEZE);
   }
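
   The flag-driven idle loop outlined above can be modelled in plain C
   as a rough, single-threaded sketch. Locking and the
   CLOCK_EVT_NOTIFY_FREEZE/UNFREEZE notifications are left out, and
   deep_idle_once(), spurious_wakeups and idle_loop_iteration() are
   hypothetical stand-ins for magic_frozen_idle() and the wakeup path,
   not existing kernel interfaces:

```c
#include <assert.h>
#include <stdbool.h>

static bool idle_freeze;	/* set by the suspend path, read lock-free */
static int tobefrozen_cpus;	/* CPUs expected to enter frozen idle */
static int deep_idle_entries;	/* how often deep idle was entered */
static int spurious_wakeups;	/* wakeups that do not end the freeze */

static void set_freeze_state(bool on, int online_cpus)
{
	if (on)
		tobefrozen_cpus = online_cpus;
	idle_freeze = on;
}

static bool idle_should_freeze(void)
{
	return idle_freeze;
}

/* Hypothetical stand-in for magic_frozen_idle(): returns on any wakeup. */
static void deep_idle_once(void)
{
	deep_idle_entries++;
	if (spurious_wakeups > 0)
		spurious_wakeups--;		/* woke up, freeze still wanted */
	else
		set_freeze_state(false, 0);	/* real wakeup event */
}

/* Stay in deep idle until the freeze request is withdrawn. */
static int frozen_idle(void)
{
	int loops = 0;

	while (idle_should_freeze()) {
		deep_idle_once();
		loops++;
	}
	return loops;
}

/* The one extra check the generic idle loop would need. */
static int idle_loop_iteration(void)
{
	if (idle_should_freeze())
		return frozen_idle();
	return 0;	/* normal idle path, not modelled */
}
```

   The point of the loop is that a wakeup which does not clear the
   flag sends the CPU straight back into deep idle.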

   Let clock_events_notify() have these new cases:

   CLOCK_EVT_NOTIFY_FREEZE:
  

Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-12 Thread Peter Zijlstra
On Wed, Nov 12, 2014 at 10:09:47PM +0100, Thomas Gleixner wrote:
> On Thu, 30 Oct 2014, Li, Aubrey wrote:
>
> > Freeze is a general power saving state that processes are frozen, devices
> > are suspended and CPUs are in idle state. However, when the system enters
> > freeze state, there are a few timers keep ticking and hence consumes more
> > power unnecessarily. The observed timer events in freeze state are:
> > - tick_sched_timer
> > - watchdog lockup detector
> > - realtime scheduler period timer
> >
> > The system power consumption in freeze state will be reduced significantly
> > if we quiesce these timers.
>
> So the obvious question is why don't we quiesce these timers by telling
> the subsystems which manage these timers to shut them down?
>
> I really want a proper answer for this in the first place, but let me
> look at the proposed solution as well.

Two arguments here:

 1) the current suspend modes don't care, so if this suspend mode starts
 to care, it's likely to 'break' in the future simply because people
 never cared about timers.

 2) there could be userland timers, userland is frozen but they'll still
 have their timers set and those can and will fire.

But sure, we can add suspend notifiers to stuff to shut down timers; I
should have a patch for at least one of the offenders somewhere. But I
really think that we should not be looking at the individual timers for
this, none of the other suspend modes care about active timers.

> But before we do that we want a proper explanation why the interrupt
> fires at all. The lack of explanation clearly documents that this is a
> 'hacked it into submission' approach.

From what I remember it's the waking interrupt that ends up in the
timekeeping code, Li should have a backtrace somewhere.

> > +#include "../time/tick-internal.h"
> > +#include "../time/timekeeping_internal.h"
>
> Eew.

I knew you'd love that :-)

> So you export the world and some more from timekeeping and the tick
> code and fiddle with it randomly just to do:
>
> 1) Suspend clock event devices
> 2) Suspend timekeeping
> 3) Resume timekeeping
> 4) Resume clock event devices

Sure, we can add some exports and clean that up, but..

> stomp_machine() is in 99% of all use cases a clear indicator for a
> complete design failure.

>    So the generic idle task needs a check like this:
>
>    if (idle_should_freeze())
> 	frozen_idle();

So that is adding extra code to fairly common/hot paths just for this
one extra special case. I tried to avoid doing that.

But I suppose we can try and merge that with the offline case and guard
both special cases with a single variable or so.
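
One way to read "a single variable" is a bitmask that stays zero in the common case, so the hot path pays for one test only. The following is a rough userspace sketch under that assumption. idle_special, the IDLE_SPECIAL_* bits and idle_pick_action() are hypothetical names, and a global mask is a simplification (the real offline check is per CPU):

```c
#include <assert.h>

/* Hypothetical bits; zero means "nothing special, take the fast path". */
enum {
	IDLE_SPECIAL_OFFLINE = 1u << 0,	/* CPU is going offline */
	IDLE_SPECIAL_FREEZE  = 1u << 1,	/* suspend-to-freeze requested */
};

static unsigned int idle_special;

enum idle_action { IDLE_NORMAL, IDLE_PLAY_DEAD, IDLE_FROZEN };

/* Decide what one iteration of the idle loop should do. */
static enum idle_action idle_pick_action(void)
{
	unsigned int special = idle_special;

	/* Common case: a single test covers both special cases. */
	if (special == 0)
		return IDLE_NORMAL;

	/* Slow path: only now look at which special case applies. */
	if (special & IDLE_SPECIAL_OFFLINE)
		return IDLE_PLAY_DEAD;
	return IDLE_FROZEN;
}
```

Giving the offline case priority over freeze in the slow path is an arbitrary choice made for the sketch.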


Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-12 Thread Li, Aubrey
On 2014/11/13 9:37, Peter Zijlstra wrote:
> On Wed, Nov 12, 2014 at 10:09:47PM +0100, Thomas Gleixner wrote:
>> On Thu, 30 Oct 2014, Li, Aubrey wrote:
>>
>>> Freeze is a general power saving state that processes are frozen, devices
>>> are suspended and CPUs are in idle state. However, when the system enters
>>> freeze state, there are a few timers keep ticking and hence consumes more
>>> power unnecessarily. The observed timer events in freeze state are:
>>> - tick_sched_timer
>>> - watchdog lockup detector
>>> - realtime scheduler period timer
>>>
>>> The system power consumption in freeze state will be reduced significantly
>>> if we quiesce these timers.
>>
>> So the obvious question is why don't we quiesce these timers by telling
>> the subsystems which manage these timers to shut them down?
>>
>> I really want a proper answer for this in the first place, but let me
>> look at the proposed solution as well.
>
> Two arguments here:
>
>  1) the current suspend modes don't care, so if this suspend mode starts
>  to care, it's likely to 'break' in the future simply because people
>  never cared about timers.
>
>  2) there could be userland timers, userland is frozen but they'll still
>  have their timers set and those can and will fire.
>
> But sure, we can add suspend notifiers to stuff to shut down timers; I
> should have a patch for at least one of the offenders somewhere. But I
> really think that we should not be looking at the individual timers for
> this, none of the other suspend modes care about active timers.
>
>> But before we do that we want a proper explanation why the interrupt
>> fires at all. The lack of explanation clearly documents that this is a
>> 'hacked it into submission' approach.
>
> From what I remember it's the waking interrupt that ends up in the
> timekeeping code, Li should have a backtrace somewhere.

There are two race conditions:

The first one occurs after interrupts are disabled and before we suspend
the lapic. If an APIC timer interrupt occurs in this window, it stays
pending because interrupts are disabled. We then suspend timekeeping,
enter idle, and exit idle with interrupts re-enabled, so the pending
timer interrupt is handled while timekeeping is suspended.

The other occurs after timekeeping_suspended = 1 and before we suspend
the lapic. If an APIC timer interrupt occurs in this window, we invoke the
timer interrupt while timekeeping is suspended as above.
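
To make the two windows concrete, here is a toy single-CPU model of them. It is only a sketch: struct cpu_model and all helper names are hypothetical, and it assumes that suspending the lapic does not discard an interrupt that is already pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU; every field is a simplification. */
struct cpu_model {
	bool irqs_enabled;
	bool lapic_suspended;
	bool timer_irq_pending;
	bool timekeeping_suspended;
	bool guard_handler;		/* apply the check from the patch? */
	int events_while_suspended;	/* handler ran with timekeeping off */
	int events_skipped;		/* interrupts swallowed by the guard */
};

static void timer_interrupt(struct cpu_model *c)
{
	if (c->timekeeping_suspended) {
		if (c->guard_handler) {
			c->events_skipped++;
			return;
		}
		c->events_while_suspended++;
	}
	/* normal event handling is not modelled */
}

static void lapic_timer_fires(struct cpu_model *c)
{
	if (c->lapic_suspended)
		return;
	if (c->irqs_enabled)
		timer_interrupt(c);
	else
		c->timer_irq_pending = true;
}

static void irq_enable(struct cpu_model *c)
{
	c->irqs_enabled = true;
	if (c->timer_irq_pending) {
		c->timer_irq_pending = false;
		timer_interrupt(c);
	}
}

/* Race 1: the interrupt is latched while interrupts are disabled. */
static void race_pending_irq(struct cpu_model *c)
{
	c->irqs_enabled = false;
	lapic_timer_fires(c);		/* stays pending */
	c->timekeeping_suspended = true;
	c->lapic_suspended = true;
	irq_enable(c);			/* idle exit delivers it */
}

/* Race 2: the interrupt fires after the flag is set, before lapic suspend. */
static void race_after_flag(struct cpu_model *c)
{
	c->irqs_enabled = true;
	c->timekeeping_suspended = true;
	lapic_timer_fires(c);		/* delivered immediately */
	c->lapic_suspended = true;
}
```

In both scenarios the handler runs once with timekeeping suspended unless the guard is enabled, which is the behaviour the check in local_apic_timer_interrupt() papers over.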

Thanks,
-Aubrey
>
>>> +#include "../time/tick-internal.h"
>>> +#include "../time/timekeeping_internal.h"
>>
>> Eew.
>
> I knew you'd love that :-)
>
>> So you export the world and some more from timekeeping and the tick
>> code and fiddle with it randomly just to do:
>>
>> 1) Suspend clock event devices
>> 2) Suspend timekeeping
>> 3) Resume timekeeping
>> 4) Resume clock event devices
>
> Sure, we can add some exports and clean that up, but..
>
>> stomp_machine() is in 99% of all use cases a clear indicator for a
>> complete design failure.
>
>>    So the generic idle task needs a check like this:
>>
>>    if (idle_should_freeze())
>> 	frozen_idle();
>
> So that is adding extra code to fairly common/hot paths just for this
> one extra special case. I tried to avoid doing that.
>
> But I suppose we can try and merge that with the offline case and guard
> both special cases with a single variable or so.



Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-10 Thread Peter Zijlstra
On Sat, Nov 08, 2014 at 03:05:56AM +0100, Rafael J. Wysocki wrote:
> Peter, Thomas, any comments here?

I'm fine with this; but Thomas needs to ack, let's give him a few more
days to reply with this reminder.



Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-11-07 Thread Rafael J. Wysocki
On Thursday, October 30, 2014 10:58:23 AM Li, Aubrey wrote:
> The patch is based on v3.17, merged with Rafael's pm+acpi-3.18-rc1 tag from
> linux-pm.git tree.
> 
> The patch is based on the patch PeterZ initially wrote.
> ---
> Freeze is a general power saving state that processes are frozen, devices
> are suspended and CPUs are in idle state. However, when the system enters
> freeze state, there are a few timers keep ticking and hence consumes more
> power unnecessarily. The observed timer events in freeze state are:
> - tick_sched_timer
> - watchdog lockup detector
> - realtime scheduler period timer
> 
> The system power consumption in freeze state will be reduced significantly
> if we quiesce these timers.
> 
> On Baytrail-T(ASUS_T100) platform, when the system is frozen to low power
> idle state(S0ix), quiescing these timers saves 29.8% power(94.48mW -> 66.32mW).
> 
> The patch is also tested on:
> - Sandybridge-EP system, both RTC alarm and power button are able to wake
>   the system up from freeze state.
> - HP laptop EliteBook 8460p, both RTC alarm and power button are able to
>   wake the system up from freeze state.
> 
> Signed-off-by: Aubrey Li 
> Signed-off-by: Peter Zijlstra 
> Cc: Rafael J. Wysocki 
> Cc: Len Brown 
> Cc: Alan Cox 

Peter, Thomas, any comments here?

> ---
>  arch/x86/kernel/apic/apic.c        |   8 ++
>  drivers/cpuidle/cpuidle.c          |  12 +++
>  kernel/power/suspend.c             | 185 +++--
>  kernel/time/timekeeping.c          |   4 +-
>  kernel/time/timekeeping_internal.h |   3 +
>  5 files changed, 204 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 6776027..f2bb645 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -917,6 +917,14 @@ static void local_apic_timer_interrupt(void)
>  	 */
>  	inc_irq_stat(apic_timer_irqs);
>  
> +	/*
> +	 * if timekeeping is suspended, the clock event device will be
> +	 * suspended as well, so we are not supposed to invoke the event
> +	 * handler of clock event device.
> +	 */
> +	if (unlikely(timekeeping_suspended))
> +		return;
> +
>  	evt->event_handler(evt);
>  }
>  
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index ee9df5e..8f84f40 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -119,6 +119,18 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
>  	ktime_t time_start, time_end;
>  	s64 diff;
>  
> +	/*
> +	 * under the scenario of use deepest idle state, the timekeeping
> +	 * could be suspended as well as the clock source device, so we
> +	 * bypass the idle counter update for this case
> +	 */
> +	if (unlikely(use_deepest_state)) {
> +		entered_state = target_state->enter(dev, drv, index);
> +		if (!cpuidle_state_is_coupled(dev, drv, entered_state))
> +			local_irq_enable();
> +		return entered_state;
> +	}
> +
>  	trace_cpu_idle_rcuidle(index, dev->cpu);
>  	time_start = ktime_get();
>  
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index 4ca9a33..660fd15 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -28,16 +28,20 @@
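>  #include <linux/ftrace.h>
>  #include <trace/events/power.h>
>  #include <linux/compiler.h>
> +#include <linux/stop_machine.h>
> +#include <linux/clockchips.h>
> +#include <linux/hrtimer.h>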
>  
>  #include "power.h"
> +#include "../time/tick-internal.h"
> +#include "../time/timekeeping_internal.h"
>  
>  const char *pm_labels[] = { "mem", "standby", "freeze", NULL };
>  const char *pm_states[PM_SUSPEND_MAX];
>  
>  static const struct platform_suspend_ops *suspend_ops;
>  static const struct platform_freeze_ops *freeze_ops;
> -static DECLARE_WAIT_QUEUE_HEAD(suspend_freeze_wait_head);
> -static bool suspend_freeze_wake;
> +static int suspend_freeze_wake;
>  
>  void freeze_set_ops(const struct platform_freeze_ops *ops)
>  {
> @@ -48,22 +52,191 @@ void freeze_set_ops(const struct platform_freeze_ops *ops)
>  
>  static void freeze_begin(void)
>  {
> -	suspend_freeze_wake = false;
> +	suspend_freeze_wake = -1;
> +}
> +
> +enum freezer_state {
> +	FREEZER_NONE,
> +	FREEZER_PICK_TK,
> +	FREEZER_SUSPEND_CLKEVT,
> +	FREEZER_SUSPEND_TK,
> +	FREEZER_IDLE,
> +	FREEZER_RESUME_TK,
> +	FREEZER_RESUME_CLKEVT,
> +	FREEZER_EXIT,
> +};
> +
> +struct freezer_data {
> +	int			thread_num;
> +	atomic_t		thread_ack;
> +	enum freezer_state	state;
> +};
> +
> +static void set_state(struct freezer_data *fd, enum freezer_state state)
> +{
> +	/* set ack counter */
> +	atomic_set(&fd->thread_ack, fd->thread_num);
> +	/* guarantee the write ordering between ack counter and state */
> +	smp_wmb();
> +	fd->state = state;
> +}
> +
> +static void ack_state(struct freezer_data *fd)
> +{
> +	if (atomic_dec_and_test(&fd->thread_ack))
> 

[PATCH v2] PM / Sleep: Timer quiesce in freeze state

2014-10-29 Thread Li, Aubrey
The patch is based on v3.17, merged with Rafael's pm+acpi-3.18-rc1 tag from
linux-pm.git tree.

The patch is based on the patch PeterZ initially wrote.
---
Freeze is a general power saving state that processes are frozen, devices
are suspended and CPUs are in idle state. However, when the system enters
freeze state, there are a few timers keep ticking and hence consumes more
power unnecessarily. The observed timer events in freeze state are:
- tick_sched_timer
- watchdog lockup detector
- realtime scheduler period timer

The system power consumption in freeze state will be reduced significantly
if we quiesce these timers.

On Baytrail-T(ASUS_T100) platform, when the system is frozen to low power
idle state(S0ix), quiescing these timers saves 29.8% power(94.48mW -> 66.32mW).

The patch is also tested on:
- Sandybridge-EP system, both RTC alarm and power button are able to wake
  the system up from freeze state.
- HP laptop EliteBook 8460p, both RTC alarm and power button are able to
  wake the system up from freeze state.

Signed-off-by: Aubrey Li 
Signed-off-by: Peter Zijlstra 
Cc: Rafael J. Wysocki 
Cc: Len Brown 
Cc: Alan Cox 
---
 arch/x86/kernel/apic/apic.c        |   8 ++
 drivers/cpuidle/cpuidle.c          |  12 +++
 kernel/power/suspend.c             | 185 +++--
 kernel/time/timekeeping.c          |   4 +-
 kernel/time/timekeeping_internal.h |   3 +
 5 files changed, 204 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 6776027..f2bb645 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -917,6 +917,14 @@ static void local_apic_timer_interrupt(void)
 	 */
 	inc_irq_stat(apic_timer_irqs);
 
+	/*
+	 * if timekeeping is suspended, the clock event device will be
+	 * suspended as well, so we are not supposed to invoke the event
+	 * handler of clock event device.
+	 */
+	if (unlikely(timekeeping_suspended))
+		return;
+
 	evt->event_handler(evt);
 }
 
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index ee9df5e..8f84f40 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -119,6 +119,18 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 	ktime_t time_start, time_end;
 	s64 diff;
 
+	/*
+	 * under the scenario of use deepest idle state, the timekeeping
+	 * could be suspended as well as the clock source device, so we
+	 * bypass the idle counter update for this case
+	 */
+	if (unlikely(use_deepest_state)) {
+		entered_state = target_state->enter(dev, drv, index);
+		if (!cpuidle_state_is_coupled(dev, drv, entered_state))
+			local_irq_enable();
+		return entered_state;
+	}
+
 	trace_cpu_idle_rcuidle(index, dev->cpu);
 	time_start = ktime_get();
 
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 4ca9a33..660fd15 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -28,16 +28,20 @@
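 #include <linux/ftrace.h>
 #include <trace/events/power.h>
 #include <linux/compiler.h>
+#include <linux/stop_machine.h>
+#include <linux/clockchips.h>
+#include <linux/hrtimer.h>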
 
 #include "power.h"
+#include "../time/tick-internal.h"
+#include "../time/timekeeping_internal.h"
 
 const char *pm_labels[] = { "mem", "standby", "freeze", NULL };
 const char *pm_states[PM_SUSPEND_MAX];
 
 static const struct platform_suspend_ops *suspend_ops;
 static const struct platform_freeze_ops *freeze_ops;
-static DECLARE_WAIT_QUEUE_HEAD(suspend_freeze_wait_head);
-static bool suspend_freeze_wake;
+static int suspend_freeze_wake;
 
 void freeze_set_ops(const struct platform_freeze_ops *ops)
 {
@@ -48,22 +52,191 @@ void freeze_set_ops(const struct platform_freeze_ops *ops)
 
 static void freeze_begin(void)
 {
-	suspend_freeze_wake = false;
+	suspend_freeze_wake = -1;
+}
+
+enum freezer_state {
+	FREEZER_NONE,
+	FREEZER_PICK_TK,
+	FREEZER_SUSPEND_CLKEVT,
+	FREEZER_SUSPEND_TK,
+	FREEZER_IDLE,
+	FREEZER_RESUME_TK,
+	FREEZER_RESUME_CLKEVT,
+	FREEZER_EXIT,
+};
+
+struct freezer_data {
+	int			thread_num;
+	atomic_t		thread_ack;
+	enum freezer_state	state;
+};
+
+static void set_state(struct freezer_data *fd, enum freezer_state state)
+{
+	/* set ack counter */
+	atomic_set(&fd->thread_ack, fd->thread_num);
+	/* guarantee the write ordering between ack counter and state */
+	smp_wmb();
+	fd->state = state;
+}
+
+static void ack_state(struct freezer_data *fd)
+{
+	if (atomic_dec_and_test(&fd->thread_ack))
+		set_state(fd, fd->state + 1);
+}
+
+static void freezer_pick_tk(int cpu)
+{
+	if (tick_do_timer_cpu == TICK_DO_TIMER_NONE) {
+		static DEFINE_SPINLOCK(lock);
+
+		spin_lock(&lock);
+		if (tick_do_timer_cpu == TICK_DO_TIMER_NONE)
+   
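
The set_state()/ack_state() pair in the patch above is an acknowledge-counter barrier: the state advances only after every participating thread has acknowledged the current one. A rough userspace analogue using C11 atomics might look like the sketch below. It is not the kernel code: atomic_int replaces atomic_t, and a release store stands in (approximately) for the smp_wmb() followed by the plain store of the state.

```c
#include <assert.h>
#include <stdatomic.h>

enum freezer_state {
	FREEZER_NONE,
	FREEZER_PICK_TK,
	FREEZER_SUSPEND_CLKEVT,
	FREEZER_SUSPEND_TK,
	FREEZER_IDLE,
	FREEZER_RESUME_TK,
	FREEZER_RESUME_CLKEVT,
	FREEZER_EXIT,
};

struct freezer_data {
	int thread_num;		/* number of participating threads */
	atomic_int thread_ack;	/* acks still outstanding for this state */
	atomic_int state;	/* current enum freezer_state */
};

static void set_state(struct freezer_data *fd, int state)
{
	/* Re-arm the ack counter for the new state. */
	atomic_store_explicit(&fd->thread_ack, fd->thread_num,
			      memory_order_relaxed);
	/* Publish the state after the counter (release ordering). */
	atomic_store_explicit(&fd->state, state, memory_order_release);
}

static void ack_state(struct freezer_data *fd)
{
	/* fetch_sub returns the old value: 1 means we were the last acker. */
	if (atomic_fetch_sub(&fd->thread_ack, 1) == 1)
		set_state(fd, atomic_load(&fd->state) + 1);
}
```

With thread_num = 3, two acks leave the state unchanged and the third moves it from FREEZER_PICK_TK to FREEZER_SUSPEND_CLKEVT while re-arming the counter.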
