Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-25 Thread Lorenzo Pieralisi
On Fri, Feb 22, 2013 at 06:52:14PM +, Thomas Gleixner wrote:
 On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:
  On Fri, Feb 22, 2013 at 03:03:02PM +, Thomas Gleixner wrote:
   On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:
On Fri, Feb 22, 2013 at 12:07:30PM +, Thomas Gleixner wrote:
 Now we could make use of that and avoid going deep idle just to come
 back right away via the IPI. Unfortunately the notification thingy has
 no return value, but we can fix that.
 
 To confirm that theory, could you please try the hack below and add
 some instrumentation (trace_printk)?

Applied, and it looks like that's exactly why the warning triggers, at least
on the platform I am testing on, which is a dual-cluster ARM testchip.

There is still a time window though where the CPU (the IPI target) can get
back to idle (tick_broadcast_pending still not set) before the CPU target of
the broadcast has a chance to run tick_handle_oneshot_broadcast (and set
tick_broadcast_pending), or am I missing something ?
   
   Well, the tick_broadcast_pending bit is uninteresting if the
   force_broadcast bit is set. Because if that bit is set we know for
   sure, that we got woken with the cpu which gets the broadcast timer
   and raced back to idle before the broadcast handler managed to
   send the IPI.
  
  Gah, my bad sorry, I mixed things up. I thought
  
  tick_check_broadcast_pending()
  
  was checking against the tick_broadcast_pending mask not
  
  tick_force_broadcast_mask
 
 Yep, that's a misnomer. I just wanted to make sure that my theory is
 correct. I need to think about the real solution some more.
 
 We have two alternatives:
 
 1) Make the clockevents_notify function have a return value.
 
 2) Add something like the hack I gave you with a proper name.
 
 The latter has the beauty that we just need to modify the platform
 independent idle code instead of going down to every callsite of the
 clockevents_notify thing.

Thank you.

I am not sure (1) would buy us anything compared to (2), and as you said we
would end up patching all callsites, so (2) is preferred.

As I mentioned, we can even just apply your fixes and leave platform-specific
code to deal with this optimization; at the end of the day the idle driver
just has to check pending IRQs/wake-up sources (which would cover all IRQs,
not just the timer IPI) if and when it has to start time-consuming operations
like cache cleaning to enter deep idle. If it goes into a shallow C-state, so be it.

On x86 I think it is HW/FW that prevents C-state entry if IRQs are pending,
and the same is likely to happen on ARM, so I am just saying you should not
bother if you think the code becomes too hairy to justify this change.

Thank you very much for the fixes and your help,
Lorenzo

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-24 Thread Santosh Shilimkar

On Saturday 23 February 2013 12:22 AM, Thomas Gleixner wrote:

On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:

On Fri, Feb 22, 2013 at 03:03:02PM +, Thomas Gleixner wrote:

On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:

On Fri, Feb 22, 2013 at 12:07:30PM +, Thomas Gleixner wrote:

Now we could make use of that and avoid going deep idle just to come
back right away via the IPI. Unfortunately the notification thingy has
no return value, but we can fix that.

To confirm that theory, could you please try the hack below and add
some instrumentation (trace_printk)?


Applied, and it looks like that's exactly why the warning triggers, at least
on the platform I am testing on which is a dual-cluster ARM testchip.


I too confirm that the warning's cause is the same.


There is still a time window though where the CPU (the IPI target) can get
back to idle (tick_broadcast_pending still not set) before the CPU target of
the broadcast has a chance to run tick_handle_oneshot_broadcast (and set
tick_broadcast_pending), or am I missing something ?


Well, the tick_broadcast_pending bit is uninteresting if the
force_broadcast bit is set. Because if that bit is set we know for
sure, that we got woken with the cpu which gets the broadcast timer
and raced back to idle before the broadcast handler managed to
send the IPI.


Gah, my bad sorry, I mixed things up. I thought

tick_check_broadcast_pending()

was checking against the tick_broadcast_pending mask not

tick_force_broadcast_mask


Yep, that's a misnomer. I just wanted to make sure that my theory is
correct. I need to think about the real solution some more.

We have two alternatives:

1) Make the clockevents_notify function have a return value.

2) Add something like the hack I gave you with a proper name.

The latter has the beauty that we just need to modify the platform
independent idle code instead of going down to every callsite of the
clockevents_notify thing.


I agree that 2) is the better alternative to avoid multiple changes.
Whichever alternative you choose, I will be happy to test it :)

Regards,
Santosh


Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-24 Thread Jason Liu
Thomas,

2013/2/23 Thomas Gleixner t...@linutronix.de:
 On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:
 On Fri, Feb 22, 2013 at 03:03:02PM +, Thomas Gleixner wrote:
  On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:
   On Fri, Feb 22, 2013 at 12:07:30PM +, Thomas Gleixner wrote:
Now we could make use of that and avoid going deep idle just to come
back right away via the IPI. Unfortunately the notification thingy has
no return value, but we can fix that.
   
To confirm that theory, could you please try the hack below and add
some instrumentation (trace_printk)?
  
   Applied, and it looks like that's exactly why the warning triggers, at least
   on the platform I am testing on, which is a dual-cluster ARM testchip.
  
   There is still a time window though where the CPU (the IPI target) can get
   back to idle (tick_broadcast_pending still not set) before the CPU target of
   the broadcast has a chance to run tick_handle_oneshot_broadcast (and set
   tick_broadcast_pending), or am I missing something ?
 
  Well, the tick_broadcast_pending bit is uninteresting if the
  force_broadcast bit is set. Because if that bit is set we know for
  sure, that we got woken with the cpu which gets the broadcast timer
  and raced back to idle before the broadcast handler managed to
  send the IPI.

 Gah, my bad sorry, I mixed things up. I thought

 tick_check_broadcast_pending()

 was checking against the tick_broadcast_pending mask not

 tick_force_broadcast_mask

 Yep, that's a misnomer. I just wanted to make sure that my theory is
 correct. I need to think about the real solution some more.

I have applied your patch and tested it; there is NO warning at all now.
I think your theory is correct.


 We have two alternatives:

 1) Make the clockevents_notify function have a return value.

 2) Add something like the hack I gave you with a proper name.

 The latter has the beauty that we just need to modify the platform
 independent idle code instead of going down to every callsite of the
 clockevents_notify thing.

I prefer solution 2).


 Thanks,

 tglx


Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-22 Thread Santosh Shilimkar

Thomas,

On Friday 22 February 2013 03:49 AM, Thomas Gleixner wrote:

On Thu, 21 Feb 2013, Santosh Shilimkar wrote:

On Thursday 21 February 2013 07:18 PM, Thomas Gleixner wrote:

find below a completely untested patch, which should address that issue.


After looking at the thread, I tried to see the issue on OMAP and could
see the same issue as Jason.


That's interesting. We have the same issue on x86 since 2007 and
nobody noticed ever. It's basically the same problem there, but it
seems that on x86 getting out of those low power states is way slower
than the minimal reprogramming delta which is used to enforce the
local timer to fire after the wakeup.

I'm still amazed that as Jason stated a 1us reprogramming delta is
sufficient to get this ping-pong going. I somehow doubt that, but
maybe ARM is really that fast :)


Your patch fixes the retries on both CPUs on my dual core machine. So
you use my tested by if you need one.


They are always welcome.


BTW, Lorenzo off-list mentioned to me a warning at boot-up
which I missed while testing your patch. It will take a bit more
time for me to look into it, hence I thought of reporting it.

[2.186126] [ cut here ]
[2.190979] WARNING: at kernel/time/tick-broadcast.c:501 tick_broadcast_oneshot_control+0x1c0/0x21c()
[2.200622] Modules linked in:
[2.203826] [<c001bfe4>] (unwind_backtrace+0x0/0xf0) from [<c0047d6c>] (warn_slowpath_common+0x4c/0x64)
[2.213653] [<c0047d6c>] (warn_slowpath_common+0x4c/0x64) from [<c0047da0>] (warn_slowpath_null+0x1c/0x24)
[2.223754] [<c0047da0>] (warn_slowpath_null+0x1c/0x24) from [<c009336c>] (tick_broadcast_oneshot_control+0x1c0/0x21c)
[2.234924] [<c009336c>] (tick_broadcast_oneshot_control+0x1c0/0x21c) from [<c00928dc>] (tick_notify+0x23c/0x42c)
[2.245666] [<c00928dc>] (tick_notify+0x23c/0x42c) from [<c0539a3c>] (notifier_call_chain+0x44/0x84)
[2.255218] [<c0539a3c>] (notifier_call_chain+0x44/0x84) from [<c0071068>] (raw_notifier_call_chain+0x18/0x20)
[2.265686] [<c0071068>] (raw_notifier_call_chain+0x18/0x20) from [<c0091c70>] (clockevents_notify+0x2c/0x174)
[2.276123] [<c0091c70>] (clockevents_notify+0x2c/0x174) from [<c0035294>] (omap_enter_idle_smp+0x3c/0x120)
[2.286315] [<c0035294>] (omap_enter_idle_smp+0x3c/0x120) from [<c042e504>] (cpuidle_enter+0x14/0x18)
[2.295928] [<c042e504>] (cpuidle_enter+0x14/0x18) from [<c042ef14>] (cpuidle_wrap_enter+0x34/0xa0)
[2.305389] [<c042ef14>] (cpuidle_wrap_enter+0x34/0xa0) from [<c042eb20>] (cpuidle_idle_call+0xe0/0x328)
[2.315307] [<c042eb20>] (cpuidle_idle_call+0xe0/0x328) from [<c0015100>] (cpu_idle+0x8c/0x11c)
[2.324401] [<c0015100>] (cpu_idle+0x8c/0x11c) from [<c073d7ac>] (start_kernel+0x2b0/0x300)
[2.333129] ---[ end trace 6fe1f7b4606a9e20 ]---




Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-22 Thread Thomas Gleixner
On Fri, 22 Feb 2013, Santosh Shilimkar wrote:
 BTW, Lorenzo off-list mentioned to me a warning at boot-up
 which I missed while testing your patch. It will take a bit more
 time for me to look into it, hence I thought of reporting it.
 
 [2.186126] [ cut here ]
 [2.190979] WARNING: at kernel/time/tick-broadcast.c:501
 tick_broadcast_oneshot_control+0x1c0/0x21c()

Which one is that? tick_broadcast_pending or tick_force_broadcast_mask ?



Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-22 Thread Jason Liu
Thomas,

2013/2/21 Thomas Gleixner t...@linutronix.de:
 Jason,

 On Thu, 21 Feb 2013, Jason Liu wrote:
 2013/2/21 Thomas Gleixner t...@linutronix.de:
  Now your explanation makes sense.
 
  I have no fast solution for this, but I think that I have an idea how
  to fix it. Stay tuned.

 Thanks Thomas, wait for your fix. :)

 find below a completely untested patch, which should address that issue.


I have tested the below patch, but ran into the following warnings:

[  713.340593] [ cut here ]
[  713.345238] WARNING: at /linux-2.6-imx/kernel/time/tick-broadcast.c:497 tick_broadcast_oneshot_control+0x210/0x254()
[  713.359240] Modules linked in:
[  713.362332] [<8004b2cc>] (unwind_backtrace+0x0/0xf8) from [<80079764>] (warn_slowpath_common+0x54/0x64)
[  713.371740] [<80079764>] (warn_slowpath_common+0x54/0x64) from [<80079790>] (warn_slowpath_null+0x1c/0x24)
[  713.381408] [<80079790>] (warn_slowpath_null+0x1c/0x24) from [<800a5320>] (tick_broadcast_oneshot_control+0x210/0x254)
[  713.392120] [<800a5320>] (tick_broadcast_oneshot_control+0x210/0x254) from [<800a48b0>] (tick_notify+0xd4/0x1f8)
[  713.402317] [<800a48b0>] (tick_notify+0xd4/0x1f8) from [<8009b154>] (notifier_call_chain+0x4c/0x8c)
[  713.411377] [<8009b154>] (notifier_call_chain+0x4c/0x8c) from [<8009b24c>] (raw_notifier_call_chain+0x18/0x20)
[  713.421392] [<8009b24c>] (raw_notifier_call_chain+0x18/0x20) from [<800a3d9c>] (clockevents_notify+0x30/0x198)
[  713.431417] [<800a3d9c>] (clockevents_notify+0x30/0x198) from [<80052f58>] (arch_idle_multi_core+0x4c/0xc4)
[  713.441175] [<80052f58>] (arch_idle_multi_core+0x4c/0xc4) from [<80044f68>] (default_idle+0x20/0x28)
[  713.450322] [<80044f68>] (default_idle+0x20/0x28) from [<800455c0>] (cpu_idle+0xc8/0xfc)
[  713.458425] [<800455c0>] (cpu_idle+0xc8/0xfc) from [<80008984>] (start_kernel+0x260/0x294)
[  713.466701] [<80008984>] (start_kernel+0x260/0x294) from [<10008040>] (0x10008040)

So, this is the code which triggers the warning:

void tick_broadcast_oneshot_control(unsigned long reason)
{
	...
	WARN_ON_ONCE(test_bit(cpu, tick_force_broadcast_mask));
	...
}


 Thanks,

 tglx

 Index: linux-2.6/kernel/time/tick-broadcast.c
 ===
 --- linux-2.6.orig/kernel/time/tick-broadcast.c
 +++ linux-2.6/kernel/time/tick-broadcast.c
 @@ -397,6 +397,8 @@ int tick_resume_broadcast(void)

  /* FIXME: use cpumask_var_t. */
  static DECLARE_BITMAP(tick_broadcast_oneshot_mask, NR_CPUS);
 +static DECLARE_BITMAP(tick_broadcast_pending, NR_CPUS);
 +static DECLARE_BITMAP(tick_force_broadcast_mask, NR_CPUS);

  /*
   * Exposed for debugging: see timer_list.c
 @@ -453,12 +455,24 @@ again:
 /* Find all expired events */
 for_each_cpu(cpu, tick_get_broadcast_oneshot_mask()) {
 td = per_cpu(tick_cpu_device, cpu);
 -   if (td->evtdev->next_event.tv64 <= now.tv64)
 +   if (td->evtdev->next_event.tv64 <= now.tv64) {
 	cpumask_set_cpu(cpu, to_cpumask(tmpmask));
 -   else if (td->evtdev->next_event.tv64 < next_event.tv64)
 +   /*
 +    * Mark the remote cpu in the pending mask, so
 +    * it can avoid reprogramming the cpu local
 +    * timer in tick_broadcast_oneshot_control().
 +    */
 +   set_bit(cpu, tick_broadcast_pending);
 +   } else if (td->evtdev->next_event.tv64 < next_event.tv64)
 	next_event.tv64 = td->evtdev->next_event.tv64;
 }

 +   /* Take care of enforced broadcast requests */
 +   for_each_cpu(cpu, to_cpumask(tick_force_broadcast_mask)) {
 +   set_bit(cpu, tmpmask);
 +   clear_bit(cpu, tick_force_broadcast_mask);
 +   }
 +
 /*
  * Wakeup the cpus which have an expired event.
  */
 @@ -494,6 +508,7 @@ void tick_broadcast_oneshot_control(unsi
 struct clock_event_device *bc, *dev;
 struct tick_device *td;
 unsigned long flags;
 +   ktime_t now;
 int cpu;

 /*
 @@ -518,6 +533,8 @@ void tick_broadcast_oneshot_control(unsi

 raw_spin_lock_irqsave(tick_broadcast_lock, flags);
 if (reason == CLOCK_EVT_NOTIFY_BROADCAST_ENTER) {
 +   WARN_ON_ONCE(test_bit(cpu, tick_broadcast_pending));
 +   WARN_ON_ONCE(test_bit(cpu, tick_force_broadcast_mask));
 if (!cpumask_test_cpu(cpu, tick_get_broadcast_oneshot_mask())) {
 cpumask_set_cpu(cpu, tick_get_broadcast_oneshot_mask());
 clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
 @@ -529,10 +546,63 @@ void tick_broadcast_oneshot_control(unsi
 cpumask_clear_cpu(cpu,
   tick_get_broadcast_oneshot_mask());
 clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
 -   if (dev->next_event.tv64 != 

Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-22 Thread Lorenzo Pieralisi
On Thu, Feb 21, 2013 at 10:19:18PM +, Thomas Gleixner wrote:
 On Thu, 21 Feb 2013, Santosh Shilimkar wrote:
  On Thursday 21 February 2013 07:18 PM, Thomas Gleixner wrote:
   find below a completely untested patch, which should address that issue.
   
  After looking at the thread, I tried to see the issue on OMAP and could
  see the same issue as Jason.
 
 That's interesting. We have the same issue on x86 since 2007 and
 nobody noticed ever. It's basically the same problem there, but it
 seems that on x86 getting out of those low power states is way slower
 than the minimal reprogramming delta which is used to enforce the
 local timer to fire after the wakeup. 
 
 I'm still amazed that as Jason stated a 1us reprogramming delta is
 sufficient to get this ping-pong going. I somehow doubt that, but
 maybe ARM is really that fast :)

It also depends on when the idle driver exits broadcast mode.
Certainly if that's the last thing it does before enabling IRQs, that
might help trigger the issue.

I am still a bit sceptical myself, and I will take advantage of Thomas'
knowledge on the subject, which is way deeper than mine BTW, to ask a
question. The thread started with the subject "too many retries" and
here I have a doubt. If the fix is not applied, on the CPU affine to
the broadcast timer it is _normal_ to have local timer retries, since
that CPU is setting/forcing the local timer to fire after min_delta_ns every
time the expired event was local to the CPU affine to the broadcast timer.

The problem, supposedly, is that the timer does not have enough time (sorry for
the mouthful) to expire (fire) before IRQs are disabled and the idle thread goes
back to idle again. This means that we should notice a mismatch between
the number of broadcast timer IRQs and local timer IRQs on the CPU
affine to the broadcast timer IRQ (granted, we also have to take into
account broadcast timer IRQs meant to service (through IPI) a local timer
expired on a CPU which is not the one running the broadcast IRQ handler and
normal local timer IRQs as well).

I am not sure the sheer number of retries is a symptom of the problem
happening, but I might well be mistaken so I am asking.

For certain, with the fix applied, lots of duplicate IRQs on the CPU
affine to the broadcast timer are eliminated, since the local timer is
not reprogrammed anymore (before the fix, basically the broadcast timer
was firing an IRQ that did nothing since the CPU was already out of
broadcast mode by the time the broadcast handler was running, the real job
was carried out in the local timer handler).

 
  Your patch fixes the retries on both CPUs on my dual core machine. So
  you use my tested by if you need one.
 
 They are always welcome.
 
  Tested-by: Santosh Shilimkar santosh.shilim...@ti.com

You can add mine too, we should fix the WARN_ON_ONCE mentioned in Santosh's
reply.

Tested-by: Lorenzo Pieralisi lorenzo.pieral...@arm.com



Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-22 Thread Santosh Shilimkar

On Friday 22 February 2013 03:54 PM, Thomas Gleixner wrote:

On Fri, 22 Feb 2013, Santosh Shilimkar wrote:

BTW, Lorenzo off-list mentioned to me a warning at boot-up
which I missed while testing your patch. It will take a bit more
time for me to look into it, hence I thought of reporting it.

[2.186126] [ cut here ]
[2.190979] WARNING: at kernel/time/tick-broadcast.c:501
tick_broadcast_oneshot_control+0x1c0/0x21c()


Which one is that? tick_broadcast_pending or tick_force_broadcast_mask ?


The force broadcast mask one coming from below.
WARN_ON_ONCE(test_bit(cpu, tick_force_broadcast_mask));

Regards,
Santosh


Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-22 Thread Lorenzo Pieralisi
On Fri, Feb 22, 2013 at 10:24:00AM +, Thomas Gleixner wrote:
 On Fri, 22 Feb 2013, Santosh Shilimkar wrote:
  BTW, Lorenzo off-list mentioned to me a warning at boot-up
  which I missed while testing your patch. It will take a bit more
  time for me to look into it, hence I thought of reporting it.
  
  [2.186126] [ cut here ]
  [2.190979] WARNING: at kernel/time/tick-broadcast.c:501
  tick_broadcast_oneshot_control+0x1c0/0x21c()
 
 Which one is that? tick_broadcast_pending or tick_force_broadcast_mask ?

It is the tick_force_broadcast_mask and I think that's because on all
systems we are testing, the broadcast timer IRQ is a thundering herd,
all CPUs get out of idle at once and try to get out of broadcast mode
at more or less the same time.

Lorenzo



Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-22 Thread Santosh Shilimkar

On Friday 22 February 2013 04:01 PM, Lorenzo Pieralisi wrote:

On Fri, Feb 22, 2013 at 10:24:00AM +, Thomas Gleixner wrote:

On Fri, 22 Feb 2013, Santosh Shilimkar wrote:

BTW, Lorenzo off-list mentioned to me a warning at boot-up
which I missed while testing your patch. It will take a bit more
time for me to look into it, hence I thought of reporting it.

[2.186126] [ cut here ]
[2.190979] WARNING: at kernel/time/tick-broadcast.c:501
tick_broadcast_oneshot_control+0x1c0/0x21c()


Which one is that? tick_broadcast_pending or tick_force_broadcast_mask ?


It is the tick_force_broadcast_mask and I think that's because on all
systems we are testing, the broadcast timer IRQ is a thundering herd,
all CPUs get out of idle at once and try to get out of broadcast mode
at more or less the same time.


So the issue comes up only with idle states where the CPUs wake up
at more or less the same time, as Lorenzo mentioned. I have two platforms
where I could test the patch, and I see the issue only on one of them.

Yesterday I didn't notice the warning because it wasn't seen on that
platform :-) OMAP4 idle entry and exit in deep state is staggered
between CPUs and hence the warning isn't seen. On OMAP5 though,
there is an additional C-state where idle entry/exit for the CPUs
isn't staggered, and I see the issue in that case.

Actually the broadcast code doesn't expect such behavior
from CPUs, since only the broadcast-affine CPU should wake
up and the rest of the CPUs should be woken up by the broadcast
IPIs.

Regards,
Santosh


Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-22 Thread Thomas Gleixner
On Fri, 22 Feb 2013, Santosh Shilimkar wrote:

 On Friday 22 February 2013 04:01 PM, Lorenzo Pieralisi wrote:
  On Fri, Feb 22, 2013 at 10:24:00AM +, Thomas Gleixner wrote:
   On Fri, 22 Feb 2013, Santosh Shilimkar wrote:
BTW, Lorenzo off-list mentioned to me a warning at boot-up
which I missed while testing your patch. It will take a bit more
time for me to look into it, hence I thought of reporting it.

[2.186126] [ cut here ]
[2.190979] WARNING: at kernel/time/tick-broadcast.c:501
tick_broadcast_oneshot_control+0x1c0/0x21c()
   
   Which one is that? tick_broadcast_pending or tick_force_broadcast_mask ?
  
  It is the tick_force_broadcast_mask and I think that's because on all
  systems we are testing, the broadcast timer IRQ is a thundering herd,
  all CPUs get out of idle at once and try to get out of broadcast mode
  at more or less the same time.
  
 So the issue comes up only with idle states where the CPUs wake up
 at more or less the same time, as Lorenzo mentioned. I have two platforms
 where I could test the patch, and I see the issue only on one of them.
 
 Yesterday I didn't notice the warning because it wasn't seen on that
 platform :-) OMAP4 idle entry and exit in deep state is staggered
 between CPUs and hence the warning isn't seen. On OMAP5 though,
 there is an additional C-state where idle entry/exit for the CPUs
 isn't staggered, and I see the issue in that case.
 
 Actually the broadcast code doesn't expect such behavior
 from CPUs, since only the broadcast-affine CPU should wake
 up and the rest of the CPUs should be woken up by the broadcast
 IPIs.

That's what I feared. We might have the same issue on x86, depending
on the cpu model.

But thinking more about it, it's actually not a real problem, just
pointless burning of CPU cycles.

So on the CPU which gets woken along with the target CPU of the
broadcast the following happens:

  deep_idle()
-- spurious wakeup
  broadcast_exit()
set forced bit
  
  enable interrupts

-- Nothing happens

  disable interrupts

  broadcast_enter()
-- Here we observe the forced bit is set
  deep_idle()

Now after that the target CPU of the broadcast runs the broadcast
handler and finds the other CPU in both the broadcast and the forced
mask, sends the IPI and stuff gets back to normal.

So it's not actually harmful, just more evidence for the theory, that
hardware designers have access to very special drug supplies.

Now we could make use of that and avoid going deep idle just to come
back right away via the IPI. Unfortunately the notification thingy has
no return value, but we can fix that.

To confirm that theory, could you please try the hack below and add
some instrumentation (trace_printk)?

Thanks,

tglx

Index: linux-2.6/arch/arm/kernel/process.c
===
--- linux-2.6.orig/arch/arm/kernel/process.c
+++ linux-2.6/arch/arm/kernel/process.c
@@ -199,7 +199,7 @@ void cpu_idle(void)
 #ifdef CONFIG_PL310_ERRATA_769419
wmb();
 #endif
-   if (hlt_counter) {
+   if (hlt_counter || tick_check_broadcast_pending()) {
local_irq_enable();
cpu_relax();
} else if (!need_resched()) {
Index: linux-2.6/include/linux/clockchips.h
===
--- linux-2.6.orig/include/linux/clockchips.h
+++ linux-2.6/include/linux/clockchips.h
@@ -170,6 +170,12 @@ extern void tick_broadcast(const struct cpumask *mask);
 extern int tick_receive_broadcast(void);
 #endif
 
+#if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_TICK_ONESHOT)
+extern int tick_check_broadcast_pending(void);
+#else
+static inline int tick_check_broadcast_pending(void) { return 0; }
+#endif
+
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
 extern void clockevents_notify(unsigned long reason, void *arg);
 #else
Index: linux-2.6/kernel/time/tick-broadcast.c
===
--- linux-2.6.orig/kernel/time/tick-broadcast.c
+++ linux-2.6/kernel/time/tick-broadcast.c
@@ -418,6 +418,15 @@ static int tick_broadcast_set_event(ktim
return clockevents_program_event(bc, expires, force);
 }
 
+/*
+ * Called before going idle with interrupts disabled. Checks whether a
+ * broadcast event from the other core is about to happen.
+ */
+int tick_check_broadcast_pending(void)
+{
+   return test_bit(smp_processor_id(), tick_force_broadcast_mask);
+}
+
 int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
 {
clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);





Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-22 Thread Lorenzo Pieralisi
On Fri, Feb 22, 2013 at 12:07:30PM +, Thomas Gleixner wrote:
 On Fri, 22 Feb 2013, Santosh Shilimkar wrote:
 
  On Friday 22 February 2013 04:01 PM, Lorenzo Pieralisi wrote:
   On Fri, Feb 22, 2013 at 10:24:00AM +, Thomas Gleixner wrote:
On Fri, 22 Feb 2013, Santosh Shilimkar wrote:
 BTW, Lorenzo off-list mentioned to me a warning at boot-up
 which I missed while testing your patch. It will take a bit more
 time for me to look into it, hence I thought of reporting it.
 
 [2.186126] [ cut here ]
 [2.190979] WARNING: at kernel/time/tick-broadcast.c:501
 tick_broadcast_oneshot_control+0x1c0/0x21c()

Which one is that? tick_broadcast_pending or tick_force_broadcast_mask ?
   
   It is the tick_force_broadcast_mask and I think that's because on all
   systems we are testing, the broadcast timer IRQ is a thundering herd,
   all CPUs get out of idle at once and try to get out of broadcast mode
   at more or less the same time.
   
  So the issue comes up only with idle states where the CPUs wake up
  at more or less the same time, as Lorenzo mentioned. I have two platforms
  where I could test the patch, and I see the issue only on one of them.
  
  Yesterday I didn't notice the warning because it wasn't seen on that
  platform :-) OMAP4 idle entry and exit in deep state is staggered
  between CPUs and hence the warning isn't seen. On OMAP5 though,
  there is an additional C-state where idle entry/exit for the CPUs
  isn't staggered, and I see the issue in that case.
  
  Actually the broadcast code doesn't expect such behavior
  from CPUs, since only the broadcast-affine CPU should wake
  up and the rest of the CPUs should be woken up by the broadcast
  IPIs.
 
 That's what I feared. We might have the same issue on x86, depending
 on the cpu model.
 
 But thinking more about it, it's actually not a real problem, just
 pointless burning of CPU cycles.
 
 So on the CPU which gets woken along with the target CPU of the
 broadcast the following happens:
 
   deep_idle()
   -- spurious wakeup
   broadcast_exit()
 set forced bit
   
   enable interrupts
 
   -- Nothing happens
 
   disable interrupts
 
   broadcast_enter()
   -- Here we observe the forced bit is set
   deep_idle()
 
 Now after that the target CPU of the broadcast runs the broadcast
 handler and finds the other CPU in both the broadcast and the forced
 mask, sends the IPI and stuff gets back to normal.
 
 So it's not actually harmful, just more evidence for the theory, that
 hardware designers have access to very special drug supplies.
 
 Now we could make use of that and avoid going deep idle just to come
 back right away via the IPI. Unfortunately the notification thingy has
 no return value, but we can fix that.
 
 To confirm that theory, could you please try the hack below and add
 some instrumentation (trace_printk)?

Applied, and it looks like that's exactly why the warning triggers, at least
on the platform I am testing on which is a dual-cluster ARM testchip.

There is still a time window though where the CPU (the IPI target) can get
back to idle (tick_broadcast_pending still not set) before the CPU target of
the broadcast has a chance to run tick_handle_oneshot_broadcast (and set
tick_broadcast_pending), or am I missing something ?
It is a corner case, granted. Best thing would be to check pending IRQs in the
idle driver back-end (or have always-on local timers :-)).

Thanks,
Lorenzo



Re: too many timer retries happen when doing local timer switch with broadcast timer

2013-02-22 Thread Thomas Gleixner
On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:
 On Fri, Feb 22, 2013 at 12:07:30PM +, Thomas Gleixner wrote:
  Now we could make use of that and avoid going deep idle just to come
  back right away via the IPI. Unfortunately the notification thingy has
  no return value, but we can fix that.
  
  To confirm that theory, could you please try the hack below and add
  some instrumentation (trace_printk)?
 
 Applied, and it looks like that's exactly why the warning triggers, at least
 on the platform I am testing on which is a dual-cluster ARM testchip.
 
 There is still a time window, though, where the CPU (the IPI target) can get
 back to idle (tick_broadcast_pending still not set) before the CPU target of
 the broadcast has a chance to run tick_handle_oneshot_broadcast (and set
 tick_broadcast_pending), or am I missing something?

Well, the tick_broadcast_pending bit is uninteresting if the
force_broadcast bit is set, because if that bit is set we know for
sure that we got woken along with the cpu which gets the broadcast timer
and raced back to idle before the broadcast handler managed to
send the IPI.

If we did not get woken before the broadcast IPI then the pending bit
is set when we exit the broadcast mode.

 It is a corner case, granted. Best thing would be to check pending IRQs in the
 idle driver back-end (or have always-on local timers :-)).

The latter is definitely the only sane solution.

Thanks,

tglx


Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-22 Thread Lorenzo Pieralisi
On Fri, Feb 22, 2013 at 03:03:02PM +, Thomas Gleixner wrote:
 On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:
  On Fri, Feb 22, 2013 at 12:07:30PM +, Thomas Gleixner wrote:
   Now we could make use of that and avoid going deep idle just to come
   back right away via the IPI. Unfortunately the notification thingy has
   no return value, but we can fix that.
   
   To confirm that theory, could you please try the hack below and add
   some instrumentation (trace_printk)?
  
  Applied, and it looks like that's exactly why the warning triggers, at least
  on the platform I am testing on which is a dual-cluster ARM testchip.
  
  There is still a time window, though, where the CPU (the IPI target) can get
  back to idle (tick_broadcast_pending still not set) before the CPU target of
  the broadcast has a chance to run tick_handle_oneshot_broadcast (and set
  tick_broadcast_pending), or am I missing something?
 
 Well, the tick_broadcast_pending bit is uninteresting if the
 force_broadcast bit is set, because if that bit is set we know for
 sure that we got woken along with the cpu which gets the broadcast timer
 and raced back to idle before the broadcast handler managed to
 send the IPI.

Gah, my bad sorry, I mixed things up. I thought

tick_check_broadcast_pending()

was checking against the tick_broadcast_pending mask not

tick_force_broadcast_mask

as it correctly does.

All clear now.

Thanks a lot,
Lorenzo



Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-22 Thread Thomas Gleixner
On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:
 On Fri, Feb 22, 2013 at 03:03:02PM +, Thomas Gleixner wrote:
  On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:
   On Fri, Feb 22, 2013 at 12:07:30PM +, Thomas Gleixner wrote:
Now we could make use of that and avoid going deep idle just to come
back right away via the IPI. Unfortunately the notification thingy has
no return value, but we can fix that.

To confirm that theory, could you please try the hack below and add
some instrumentation (trace_printk)?
   
   Applied, and it looks like that's exactly why the warning triggers, at
   least on the platform I am testing on, which is a dual-cluster ARM testchip.

   There is still a time window, though, where the CPU (the IPI target) can
   get back to idle (tick_broadcast_pending still not set) before the CPU
   target of the broadcast has a chance to run tick_handle_oneshot_broadcast
   (and set tick_broadcast_pending), or am I missing something?
  
  Well, the tick_broadcast_pending bit is uninteresting if the
  force_broadcast bit is set, because if that bit is set we know for
  sure that we got woken along with the cpu which gets the broadcast timer
  and raced back to idle before the broadcast handler managed to
  send the IPI.
 
 Gah, my bad sorry, I mixed things up. I thought
 
 tick_check_broadcast_pending()
 
 was checking against the tick_broadcast_pending mask not
 
 tick_force_broadcast_mask

Yep, that's a misnomer. I just wanted to make sure that my theory is
correct. I need to think about the real solution some more.

We have two alternatives:

1) Make the clockevents_notify function have a return value.

2) Add something like the hack I gave you with a proper name.

The latter has the beauty that we just need to modify the
platform-independent idle code instead of going down to every call site
of the clockevents_notify thing.

Thanks,

tglx


Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-21 Thread Thomas Gleixner
On Thu, 21 Feb 2013, Jason Liu wrote:
 2013/2/20 Thomas Gleixner t...@linutronix.de:
  On Wed, 20 Feb 2013, Jason Liu wrote:
  void arch_idle(void)
  {
  
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, cpu);
 
  enter_the_wait_mode();
 
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
  }
 
  when the broadcast timer interrupt arrives (this interrupt just wakes up
  the ARM, and the ARM has no chance
  to handle it since the local irq is disabled; in fact it's disabled in
  cpu_idle() of arch/arm/kernel/process.c)
 
  the broadcast timer interrupt will wake up the CPU and run:
 
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu); ->
  tick_broadcast_oneshot_control(...);
  ->
  tick_program_event(dev->next_event, 1);
  ->
  tick_dev_program_event(dev, expires, force);
  ->
  for (i = 0;;) {
  int ret = clockevents_program_event(dev, expires, now);
  if (!ret || !force)
  return ret;
 
  dev->retries++;
 
  now = ktime_get();
  expires = ktime_add_ns(now, dev->min_delta_ns);
  }
  clockevents_program_event(dev, expires, now);
 
  delta = ktime_to_ns(ktime_sub(expires, now));
 
  if (delta <= 0)
  return -ETIME;
 
  when the bc timer interrupt arrives, that means the last local timer
  has expired too. So
  clockevents_program_event() will return -ETIME, which will cause
  dev->retries++
  when retrying to program the expired timer.
 
  Even in the worst case, after re-programming the expired timer, if the
  CPU enters idle again
  quickly before the re-programmed timer expires, it will make the system
  ping-pong forever,
 
  That's nonsense.
 
 I don't think so.
 
 
  The timer IPI brings the core out of the deep idle state.
 
  So after returning from enter_wait_mode() and after calling
  clockevents_notify() it returns from arch_idle() to cpu_idle().
 
  In cpu_idle() interrupts are reenabled, so the timer IPI handler is
  invoked. That calls the event_handler of the per cpu local clockevent
  device (the one which stops in C3). That ends up in the generic timer
  code which expires timers and reprograms the local clock event device
  with the next pending timer.
 
  So you cannot go idle again, before the expired timers of this event
  are handled and their callbacks invoked.
 
 That's true for the CPUs which do not respond to the global timer interrupt.
 Take our platform as an example: we have 4 CPUs (CPU0, CPU1, CPU2, CPU3).
 The global timer device will keep running even in the deep idle mode, so it
 can be used as the broadcast timer device, and the interrupt of this device
 is raised only on CPU0 when the timer expires; then CPU0 will broadcast the
 timer IPI to the other CPUs which are in deep idle mode.
 
 So for CPU1, CPU2, CPU3, you are right: the timer IPI will bring them out
 of idle state; after running clockevents_notify() it returns from
 arch_idle() to cpu_idle(), then local_irq_enable(), and the IPI handler
 will be invoked to handle the expired timers and re-program the next
 pending timer.
 
 But that's not true for CPU0. The flow for CPU0 is:
 the global timer interrupt wakes up CPU0, which then calls:
 clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
 
 which will cpumask_clear_cpu(cpu, tick_get_broadcast_oneshot_mask());
 in the function tick_broadcast_oneshot_control(),

Now your explanation makes sense. 

I have no fast solution for this, but I think that I have an idea how
to fix it. Stay tuned.

Thanks,

tglx


Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-21 Thread Lorenzo Pieralisi
Hi Jason,

On Thu, Feb 21, 2013 at 06:16:51AM +, Jason Liu wrote:
 2013/2/20 Thomas Gleixner t...@linutronix.de:
  On Wed, 20 Feb 2013, Jason Liu wrote:
  void arch_idle(void)
  {
  
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, cpu);
 
  enter_the_wait_mode();
 
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
  }
 
  when the broadcast timer interrupt arrives (this interrupt just wakes up
  the ARM, and the ARM has no chance
  to handle it since the local irq is disabled; in fact it's disabled in
  cpu_idle() of arch/arm/kernel/process.c)
 
  the broadcast timer interrupt will wake up the CPU and run:
 
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu); ->
  tick_broadcast_oneshot_control(...);
  ->
  tick_program_event(dev->next_event, 1);
  ->
  tick_dev_program_event(dev, expires, force);
  ->
  for (i = 0;;) {
  int ret = clockevents_program_event(dev, expires, now);
  if (!ret || !force)
  return ret;
 
  dev->retries++;
 
  now = ktime_get();
  expires = ktime_add_ns(now, dev->min_delta_ns);
  }
  clockevents_program_event(dev, expires, now);
 
  delta = ktime_to_ns(ktime_sub(expires, now));
 
  if (delta <= 0)
  return -ETIME;
 
  when the bc timer interrupt arrives, that means the last local timer
  has expired too. So
  clockevents_program_event() will return -ETIME, which will cause
  dev->retries++
  when retrying to program the expired timer.
 
  Even in the worst case, after re-programming the expired timer, if the
  CPU enters idle again
  quickly before the re-programmed timer expires, it will make the system
  ping-pong forever,
 
  That's nonsense.
 
 I don't think so.
 
 
  The timer IPI brings the core out of the deep idle state.
 
  So after returning from enter_wait_mode() and after calling
  clockevents_notify() it returns from arch_idle() to cpu_idle().
 
  In cpu_idle() interrupts are reenabled, so the timer IPI handler is
  invoked. That calls the event_handler of the per cpu local clockevent
  device (the one which stops in C3). That ends up in the generic timer
  code which expires timers and reprograms the local clock event device
  with the next pending timer.
 
  So you cannot go idle again, before the expired timers of this event
  are handled and their callbacks invoked.
 
 That's true for the CPUs which do not respond to the global timer interrupt.
 Take our platform as an example: we have 4 CPUs (CPU0, CPU1, CPU2, CPU3).
 The global timer device will keep running even in the deep idle mode, so it
 can be used as the broadcast timer device, and the interrupt of this device
 is raised only on CPU0 when the timer expires; then CPU0 will broadcast the
 timer IPI to the other CPUs which are in deep idle mode.
 
 So for CPU1, CPU2, CPU3, you are right: the timer IPI will bring them out
 of idle state; after running clockevents_notify() it returns from
 arch_idle() to cpu_idle(), then local_irq_enable(), and the IPI handler
 will be invoked to handle the expired timers and re-program the next
 pending timer.
 
 But that's not true for CPU0. The flow for CPU0 is:
 the global timer interrupt wakes up CPU0, which then calls:
 clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
 
 which will cpumask_clear_cpu(cpu, tick_get_broadcast_oneshot_mask());
 in the function tick_broadcast_oneshot_control(),

For my own understanding: at this point in time CPU0's local timer is
also reprogrammed, with min_delta (i.e. 1us), if I got your description
right.

 
 After return from clockevents_notify(), it will return to cpu_idle
 from arch_idle,
 then local_irq_enable(), the CPU0 will response to the global timer
 interrupt, and
 call the interrupt handler: tick_handle_oneshot_broadcast()
 
 static void tick_handle_oneshot_broadcast(struct clock_event_device *dev)
 {
 struct tick_device *td;
 ktime_t now, next_event;
 int cpu;
 
 raw_spin_lock(&tick_broadcast_lock);
 again:
 dev->next_event.tv64 = KTIME_MAX;
 next_event.tv64 = KTIME_MAX;
 cpumask_clear(to_cpumask(tmpmask));
 now = ktime_get();
 /* Find all expired events */
 for_each_cpu(cpu, tick_get_broadcast_oneshot_mask()) {
 td = &per_cpu(tick_cpu_device, cpu);
 if (td->evtdev->next_event.tv64 <= now.tv64)
 cpumask_set_cpu(cpu, to_cpumask(tmpmask));
 else if (td->evtdev->next_event.tv64 < next_event.tv64)
 next_event.tv64 = td->evtdev->next_event.tv64;
 }
 
 /*
  * Wakeup the cpus which have an expired event.
  */
 tick_do_broadcast(to_cpumask(tmpmask));
 ...
 }
 
 since CPU0 has been removed from tick_get_broadcast_oneshot_mask(), if
 all the other CPUs (1/2/3) stay in idle with no expired timers, then
 tmpmask will be 0
 when tick_do_broadcast() is called.
 
 static void 

Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-21 Thread Jason Liu
2013/2/21 Lorenzo Pieralisi lorenzo.pieral...@arm.com:
 Hi Jason,

 On Thu, Feb 21, 2013 at 06:16:51AM +, Jason Liu wrote:
 2013/2/20 Thomas Gleixner t...@linutronix.de:
  On Wed, 20 Feb 2013, Jason Liu wrote:
  void arch_idle(void)
  {
  
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, cpu);
 
  enter_the_wait_mode();
 
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
  }
 
  when the broadcast timer interrupt arrives (this interrupt just wakes up
  the ARM, and the ARM has no chance
  to handle it since the local irq is disabled; in fact it's disabled in
  cpu_idle() of arch/arm/kernel/process.c)
 
  the broadcast timer interrupt will wake up the CPU and run:
 
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu); ->
  tick_broadcast_oneshot_control(...);
  ->
  tick_program_event(dev->next_event, 1);
  ->
  tick_dev_program_event(dev, expires, force);
  ->
  for (i = 0;;) {
  int ret = clockevents_program_event(dev, expires, now);
  if (!ret || !force)
  return ret;
 
  dev->retries++;
 
  now = ktime_get();
  expires = ktime_add_ns(now, dev->min_delta_ns);
  }
  clockevents_program_event(dev, expires, now);
 
  delta = ktime_to_ns(ktime_sub(expires, now));
 
  if (delta <= 0)
  return -ETIME;
 
  when the bc timer interrupt arrives, that means the last local timer
  has expired too. So
  clockevents_program_event() will return -ETIME, which will cause
  dev->retries++
  when retrying to program the expired timer.
 
  Even in the worst case, after re-programming the expired timer, if the
  CPU enters idle again
  quickly before the re-programmed timer expires, it will make the system
  ping-pong forever,
 
  That's nonsense.

 I don't think so.

 
  The timer IPI brings the core out of the deep idle state.
 
  So after returning from enter_wait_mode() and after calling
  clockevents_notify() it returns from arch_idle() to cpu_idle().
 
  In cpu_idle() interrupts are reenabled, so the timer IPI handler is
  invoked. That calls the event_handler of the per cpu local clockevent
  device (the one which stops in C3). That ends up in the generic timer
  code which expires timers and reprograms the local clock event device
  with the next pending timer.
 
  So you cannot go idle again, before the expired timers of this event
  are handled and their callbacks invoked.

 That's true for the CPUs which do not respond to the global timer interrupt.
 Take our platform as an example: we have 4 CPUs (CPU0, CPU1, CPU2, CPU3).
 The global timer device will keep running even in the deep idle mode, so it
 can be used as the broadcast timer device, and the interrupt of this device
 is raised only on CPU0 when the timer expires; then CPU0 will broadcast the
 timer IPI to the other CPUs which are in deep idle mode.
 
 So for CPU1, CPU2, CPU3, you are right: the timer IPI will bring them out
 of idle state; after running clockevents_notify() it returns from
 arch_idle() to cpu_idle(), then local_irq_enable(), and the IPI handler
 will be invoked to handle the expired timers and re-program the next
 pending timer.
 
 But that's not true for CPU0. The flow for CPU0 is:
 the global timer interrupt wakes up CPU0, which then calls:
 clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
 
 which will cpumask_clear_cpu(cpu, tick_get_broadcast_oneshot_mask());
 in the function tick_broadcast_oneshot_control(),

 For my own understanding: at this point in time CPU0's local timer is
 also reprogrammed, with min_delta (i.e. 1us), if I got your description
 right.

yes, right.



 After return from clockevents_notify(), it will return to cpu_idle
 from arch_idle,
 then local_irq_enable(), the CPU0 will response to the global timer
 interrupt, and
 call the interrupt handler: tick_handle_oneshot_broadcast()

 static void tick_handle_oneshot_broadcast(struct clock_event_device *dev)
 {
 struct tick_device *td;
 ktime_t now, next_event;
 int cpu;

 raw_spin_lock(&tick_broadcast_lock);
 again:
 dev->next_event.tv64 = KTIME_MAX;
 next_event.tv64 = KTIME_MAX;
 cpumask_clear(to_cpumask(tmpmask));
 now = ktime_get();
 /* Find all expired events */
 for_each_cpu(cpu, tick_get_broadcast_oneshot_mask()) {
 td = &per_cpu(tick_cpu_device, cpu);
 if (td->evtdev->next_event.tv64 <= now.tv64)
 cpumask_set_cpu(cpu, to_cpumask(tmpmask));
 else if (td->evtdev->next_event.tv64 < next_event.tv64)
 next_event.tv64 = td->evtdev->next_event.tv64;
 }

 /*
  * Wakeup the cpus which have an expired event.
  */
 tick_do_broadcast(to_cpumask(tmpmask));
 ...
 }

 since CPU0 has been removed from tick_get_broadcast_oneshot_mask(), and
 if all the other CPUs (1/2/3) stay in idle with no expired timers, then
 the tmpmask 

Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-21 Thread Jason Liu
2013/2/21 Thomas Gleixner t...@linutronix.de:
 On Thu, 21 Feb 2013, Jason Liu wrote:
 2013/2/20 Thomas Gleixner t...@linutronix.de:
  On Wed, 20 Feb 2013, Jason Liu wrote:
  void arch_idle(void)
  {
  
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, cpu);
 
  enter_the_wait_mode();
 
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
  }
 
  when the broadcast timer interrupt arrives (this interrupt just wakes up
  the ARM, and the ARM has no chance
  to handle it since the local irq is disabled; in fact it's disabled in
  cpu_idle() of arch/arm/kernel/process.c)
 
  the broadcast timer interrupt will wake up the CPU and run:
 
  clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu); ->
  tick_broadcast_oneshot_control(...);
  ->
  tick_program_event(dev->next_event, 1);
  ->
  tick_dev_program_event(dev, expires, force);
  ->
  for (i = 0;;) {
  int ret = clockevents_program_event(dev, expires, now);
  if (!ret || !force)
  return ret;
 
  dev->retries++;
 
  now = ktime_get();
  expires = ktime_add_ns(now, dev->min_delta_ns);
  }
  clockevents_program_event(dev, expires, now);
 
  delta = ktime_to_ns(ktime_sub(expires, now));
 
  if (delta <= 0)
  return -ETIME;
 
  when the bc timer interrupt arrives, that means the last local timer
  has expired too. So
  clockevents_program_event() will return -ETIME, which will cause
  dev->retries++
  when retrying to program the expired timer.
 
  Even in the worst case, after re-programming the expired timer, if the
  CPU enters idle again
  quickly before the re-programmed timer expires, it will make the system
  ping-pong forever,
 
  That's nonsense.

 I don't think so.

 
  The timer IPI brings the core out of the deep idle state.
 
  So after returning from enter_wait_mode() and after calling
  clockevents_notify() it returns from arch_idle() to cpu_idle().
 
  In cpu_idle() interrupts are reenabled, so the timer IPI handler is
  invoked. That calls the event_handler of the per cpu local clockevent
  device (the one which stops in C3). That ends up in the generic timer
  code which expires timers and reprograms the local clock event device
  with the next pending timer.
 
  So you cannot go idle again, before the expired timers of this event
  are handled and their callbacks invoked.

 That's true for the CPUs which do not respond to the global timer interrupt.
 Take our platform as an example: we have 4 CPUs (CPU0, CPU1, CPU2, CPU3).
 The global timer device will keep running even in the deep idle mode, so it
 can be used as the broadcast timer device, and the interrupt of this device
 is raised only on CPU0 when the timer expires; then CPU0 will broadcast the
 timer IPI to the other CPUs which are in deep idle mode.
 
 So for CPU1, CPU2, CPU3, you are right: the timer IPI will bring them out
 of idle state; after running clockevents_notify() it returns from
 arch_idle() to cpu_idle(), then local_irq_enable(), and the IPI handler
 will be invoked to handle the expired timers and re-program the next
 pending timer.
 
 But that's not true for CPU0. The flow for CPU0 is:
 the global timer interrupt wakes up CPU0, which then calls:
 clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
 
 which will cpumask_clear_cpu(cpu, tick_get_broadcast_oneshot_mask());
 in the function tick_broadcast_oneshot_control(),

 Now your explanation makes sense.

 I have no fast solution for this, but I think that I have an idea how
 to fix it. Stay tuned.

Thanks Thomas, I'll wait for your fix. :)


 Thanks,

 tglx


Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-21 Thread Thomas Gleixner
Jason,

On Thu, 21 Feb 2013, Jason Liu wrote:
 2013/2/21 Thomas Gleixner t...@linutronix.de:
  Now your explanation makes sense.
 
  I have no fast solution for this, but I think that I have an idea how
  to fix it. Stay tuned.
 
 Thanks Thomas, wait for your fix. :)

Find below a completely untested patch which should address that issue.

Sigh. Stopping the cpu local timer in deep idle is known to be an
idiotic design decision for 10+ years. I'll never understand why ARM
vendors insist on implementing the same stupidity over and over.

Thanks,

tglx

Index: linux-2.6/kernel/time/tick-broadcast.c
===
--- linux-2.6.orig/kernel/time/tick-broadcast.c
+++ linux-2.6/kernel/time/tick-broadcast.c
@@ -397,6 +397,8 @@ int tick_resume_broadcast(void)
 
 /* FIXME: use cpumask_var_t. */
 static DECLARE_BITMAP(tick_broadcast_oneshot_mask, NR_CPUS);
+static DECLARE_BITMAP(tick_broadcast_pending, NR_CPUS);
+static DECLARE_BITMAP(tick_force_broadcast_mask, NR_CPUS);
 
 /*
  * Exposed for debugging: see timer_list.c
@@ -453,12 +455,24 @@ again:
/* Find all expired events */
for_each_cpu(cpu, tick_get_broadcast_oneshot_mask()) {
td = per_cpu(tick_cpu_device, cpu);
-   if (td->evtdev->next_event.tv64 <= now.tv64)
+   if (td->evtdev->next_event.tv64 <= now.tv64) {
cpumask_set_cpu(cpu, to_cpumask(tmpmask));
-   else if (td->evtdev->next_event.tv64 < next_event.tv64)
+   /*
+* Mark the remote cpu in the pending mask, so
+* it can avoid reprogramming the cpu local
+* timer in tick_broadcast_oneshot_control().
+*/
+   set_bit(cpu, tick_broadcast_pending);
+   } else if (td->evtdev->next_event.tv64 < next_event.tv64)
next_event.tv64 = td->evtdev->next_event.tv64;
}
 
+   /* Take care of enforced broadcast requests */
+   for_each_cpu(cpu, to_cpumask(tick_force_broadcast_mask)) {
+   set_bit(cpu, tmpmask);
+   clear_bit(cpu, tick_force_broadcast_mask);
+   }
+
/*
 * Wakeup the cpus which have an expired event.
 */
@@ -494,6 +508,7 @@ void tick_broadcast_oneshot_control(unsi
struct clock_event_device *bc, *dev;
struct tick_device *td;
unsigned long flags;
+   ktime_t now;
int cpu;
 
/*
@@ -518,6 +533,8 @@ void tick_broadcast_oneshot_control(unsi
 
raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
if (reason == CLOCK_EVT_NOTIFY_BROADCAST_ENTER) {
+   WARN_ON_ONCE(test_bit(cpu, tick_broadcast_pending));
+   WARN_ON_ONCE(test_bit(cpu, tick_force_broadcast_mask));
if (!cpumask_test_cpu(cpu, tick_get_broadcast_oneshot_mask())) {
cpumask_set_cpu(cpu, tick_get_broadcast_oneshot_mask());
clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
@@ -529,10 +546,63 @@ void tick_broadcast_oneshot_control(unsi
cpumask_clear_cpu(cpu,
  tick_get_broadcast_oneshot_mask());
clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
-   if (dev->next_event.tv64 != KTIME_MAX)
-   tick_program_event(dev->next_event, 1);
+   if (dev->next_event.tv64 == KTIME_MAX)
+   goto out;
+   /*
+* The cpu handling the broadcast timer marked
+* this cpu in the broadcast pending mask and
+* fired the broadcast IPI. So we are going to
+* handle the expired event anyway via the
+* broadcast IPI handler. No need to reprogram
+* the timer with an already expired event.
+*/
+   if (test_and_clear_bit(cpu, tick_broadcast_pending))
+   goto out;
+   /*
+* If the pending bit is not set, then we are
+* either the CPU handling the broadcast
+* interrupt or we got woken by something else.
+*
+* We are no longer in the broadcast mask, so
+* if the cpu local expiry time is already
+* reached, we would reprogram the cpu local
+* timer with an already expired event.
+*
+* This can lead to a ping-pong when we return
+* to idle and therefore rearm the broadcast
+* timer before the cpu local timer was able
+* to fire. This happens 

Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-21 Thread Santosh Shilimkar

Thomas,

On Thursday 21 February 2013 07:18 PM, Thomas Gleixner wrote:

Jason,

On Thu, 21 Feb 2013, Jason Liu wrote:

2013/2/21 Thomas Gleixner t...@linutronix.de:

Now your explanation makes sense.

I have no fast solution for this, but I think that I have an idea how
to fix it. Stay tuned.


Thanks Thomas, wait for your fix. :)


find below a completely untested patch, which should address that issue.


After looking at the thread, I tried to see the issue on OMAP and could
see the same issue as Jason.

Your patch fixes the retries on both CPUs on my dual-core machine, so
you can use my Tested-by if you need one.
Tested-by: Santosh Shilimkar santosh.shilim...@ti.com

Thanks for the patch.
And thanks to Jason for spotting the issue.

Regards,
Santosh



Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-21 Thread Thomas Gleixner
On Thu, 21 Feb 2013, Santosh Shilimkar wrote:
 On Thursday 21 February 2013 07:18 PM, Thomas Gleixner wrote:
  find below a completely untested patch, which should address that issue.
  
 After looking at the thread, I tried to see the issue on OMAP and could
 see the same issue as Jason.

That's interesting. We have the same issue on x86 since 2007 and
nobody noticed ever. It's basically the same problem there, but it
seems that on x86 getting out of those low power states is way slower
than the minimal reprogramming delta which is used to enforce the
local timer to fire after the wakeup. 

I'm still amazed that as Jason stated a 1us reprogramming delta is
sufficient to get this ping-pong going. I somehow doubt that, but
maybe ARM is really that fast :)

 Your patch fixes the retries on both CPUs on my dual core machine. So
 you use my tested by if you need one.

They are always welcome.

 Tested-by: Santosh Shilimkar santosh.shilim...@ti.com
 
 Thanks for the patch.
 And thanks to Jason for spotting the issue.

And for coping with my initial inability to parse his report!

Thanks,

tglx


Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-20 Thread Thomas Gleixner
On Wed, 20 Feb 2013, Jason Liu wrote:
 void arch_idle(void)
 {
 
 clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, cpu);
 
 enter_the_wait_mode();
 
 clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
 }
 
 when the broadcast timer interrupt arrives (this interrupt just wakes up
 the ARM, and the ARM has no chance
 to handle it since the local irq is disabled; in fact it's disabled in
 cpu_idle() of arch/arm/kernel/process.c)
 
 the broadcast timer interrupt will wake up the CPU and run:
 
 clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu); ->
 tick_broadcast_oneshot_control(...);
 ->
 tick_program_event(dev->next_event, 1);
 ->
 tick_dev_program_event(dev, expires, force);
 ->
 for (i = 0;;) {
 int ret = clockevents_program_event(dev, expires, now);
 if (!ret || !force)
 return ret;
 
 dev->retries++;
 
 now = ktime_get();
 expires = ktime_add_ns(now, dev->min_delta_ns);
 }
 clockevents_program_event(dev, expires, now);
 
 delta = ktime_to_ns(ktime_sub(expires, now));
 
 if (delta <= 0)
 return -ETIME;
 
 when the bc timer interrupt arrives, that means the last local timer
 has expired too. So
 clockevents_program_event() will return -ETIME, which will cause
 dev->retries++
 when retrying to program the expired timer.
 
 Even in the worst case, after re-programming the expired timer, if the
 CPU enters idle again
 quickly before the re-programmed timer expires, it will make the system
 ping-pong forever,

That's nonsense.

The timer IPI brings the core out of the deep idle state.

So after returning from enter_wait_mode() and after calling
clockevents_notify() it returns from arch_idle() to cpu_idle().

In cpu_idle() interrupts are reenabled, so the timer IPI handler is
invoked. That calls the event_handler of the per cpu local clockevent
device (the one which stops in C3). That ends up in the generic timer
code which expires timers and reprograms the local clock event device
with the next pending timer.

So you cannot go idle again, before the expired timers of this event
are handled and their callbacks invoked.


Now the reprogramming itself is a different issue.

It's necessary as we have no information whether the wakeup was caused
by the timer IPI or by something else.

We might try to be clever and store the pending IPI in
tick_handle_oneshot_broadcast() and avoid the reprogramming in that
case. Completely untested patch below.

Thanks,

tglx

Index: linux-2.6/kernel/time/tick-broadcast.c
===
--- linux-2.6.orig/kernel/time/tick-broadcast.c
+++ linux-2.6/kernel/time/tick-broadcast.c
@@ -29,6 +29,7 @@
 static struct tick_device tick_broadcast_device;
 /* FIXME: Use cpumask_var_t. */
 static DECLARE_BITMAP(tick_broadcast_mask, NR_CPUS);
+static DECLARE_BITMAP(tick_broadcast_pending, NR_CPUS);
 static DECLARE_BITMAP(tmpmask, NR_CPUS);
 static DEFINE_RAW_SPINLOCK(tick_broadcast_lock);
 static int tick_broadcast_force;
@@ -417,9 +418,10 @@ again:
/* Find all expired events */
for_each_cpu(cpu, tick_get_broadcast_oneshot_mask()) {
td = per_cpu(tick_cpu_device, cpu);
-   if (td->evtdev->next_event.tv64 <= now.tv64)
+   if (td->evtdev->next_event.tv64 <= now.tv64) {
cpumask_set_cpu(cpu, to_cpumask(tmpmask));
-   else if (td->evtdev->next_event.tv64 < next_event.tv64)
+   set_bit(cpu, tick_broadcast_pending);
+   } else if (td->evtdev->next_event.tv64 < next_event.tv64)
next_event.tv64 = td->evtdev->next_event.tv64;
}
 
@@ -493,7 +495,8 @@ void tick_broadcast_oneshot_control(unsi
cpumask_clear_cpu(cpu,
  tick_get_broadcast_oneshot_mask());
clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
-   if (dev->next_event.tv64 != KTIME_MAX)
+   if (dev->next_event.tv64 != KTIME_MAX &&
+   !test_and_clear_bit(cpu, tick_broadcast_pending))
tick_program_event(dev->next_event, 1);
}
}








Re: too many timer retries happen when do local timer switch with broadcast timer

2013-02-20 Thread Jason Liu
2013/2/20 Thomas Gleixner t...@linutronix.de:
 On Wed, 20 Feb 2013, Jason Liu wrote:
 void arch_idle(void)
 {
 
 clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, cpu);

 enter_the_wait_mode();

 clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
 }

 when the broadcast timer interrupt arrives(this interrupt just wakeup
 the ARM, and ARM has no chance
 to handle it since local irq is disabled. In fact it's disabled in
 cpu_idle() of arch/arm/kernel/process.c)

 the broadcast timer interrupt will wake up the CPU and run:

 clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu); ->
 tick_broadcast_oneshot_control(...);
 ->
 tick_program_event(dev->next_event, 1);
 ->
 tick_dev_program_event(dev, expires, force);
 ->
 for (i = 0;;) {
 int ret = clockevents_program_event(dev, expires, now);
 if (!ret || !force)
 return ret;

 dev->retries++;
 
 now = ktime_get();
 expires = ktime_add_ns(now, dev-min_delta_ns);
 }
 clockevents_program_event(dev, expires, now);

 delta = ktime_to_ns(ktime_sub(expires, now));

 if (delta <= 0)
 return -ETIME;

 when the bc timer interrupt arrives, it means the last local timer has
 expired too. So clockevents_program_event() will return -ETIME, which
 causes the dev->retries++ when retrying to program the already-expired
 timer.

 Even in the worst case, after re-programming the expired timer the CPU
 enters idle again quickly, before the re-programmed timer expires, which
 makes the system ping-pong forever,

 That's nonsense.

I don't think so.


 The timer IPI brings the core out of the deep idle state.

 So after returning from enter_wait_mode() and after calling
 clockevents_notify() it returns from arch_idle() to cpu_idle().

 In cpu_idle() interrupts are reenabled, so the timer IPI handler is
 invoked. That calls the event_handler of the per cpu local clockevent
 device (the one which stops in C3). That ends up in the generic timer
 code which expires timers and reprograms the local clock event device
 with the next pending timer.

 So you cannot go idle again, before the expired timers of this event
 are handled and their callbacks invoked.

That's true for the CPUs which do not respond to the global timer interrupt.
Take our platform as an example: we have 4 CPUs (CPU0, CPU1, CPU2, CPU3).
The global timer device keeps running even in deep idle mode, so it can be
used as the broadcast timer device. The interrupt of this device is raised
to CPU0 when the timer expires; CPU0 then broadcasts the timer IPI to the
other CPUs which are in deep idle mode.

So for CPU1, CPU2 and CPU3 you are right: the timer IPI brings them out of
the idle state. After running clockevents_notify() they return from
arch_idle() to cpu_idle(), then local_irq_enable() lets the IPI handler run,
which handles the expired timers and re-programs the next pending timer.

But that's not true for CPU0. The flow for CPU0 is:
the global timer interrupt wakes up CPU0, which then calls:
clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);

which does cpumask_clear_cpu(cpu, tick_get_broadcast_oneshot_mask())
in the function tick_broadcast_oneshot_control().

After returning from clockevents_notify(), CPU0 returns to cpu_idle()
from arch_idle(), then local_irq_enable() lets it respond to the global
timer interrupt and call the interrupt handler
tick_handle_oneshot_broadcast():

static void tick_handle_oneshot_broadcast(struct clock_event_device *dev)
{
struct tick_device *td;
ktime_t now, next_event;
int cpu;

raw_spin_lock(&tick_broadcast_lock);
again:
dev->next_event.tv64 = KTIME_MAX;
next_event.tv64 = KTIME_MAX;
cpumask_clear(to_cpumask(tmpmask));
now = ktime_get();
/* Find all expired events */
for_each_cpu(cpu, tick_get_broadcast_oneshot_mask()) {
td = per_cpu(tick_cpu_device, cpu);
if (td->evtdev->next_event.tv64 <= now.tv64)
cpumask_set_cpu(cpu, to_cpumask(tmpmask));
else if (td->evtdev->next_event.tv64 < next_event.tv64)
next_event.tv64 = td->evtdev->next_event.tv64;
}

/*
 * Wakeup the cpus which have an expired event.
 */
tick_do_broadcast(to_cpumask(tmpmask));
...
}

Since CPU0 has already been removed from tick_get_broadcast_oneshot_mask(),
and if CPU1/2/3 are all in idle with no expired timers, tmpmask will be
empty when tick_do_broadcast() is called.

static void tick_do_broadcast(struct cpumask *mask)
{
int cpu = smp_processor_id();
struct tick_device *td;

/*
 * Check, if the current cpu is in the mask
 */
if (cpumask_test_cpu(cpu, mask)) {
cpumask_clear_cpu(cpu, mask);
td = per_cpu(tick_cpu_device, cpu);