Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-28 Thread Frederic Weisbecker
On Fri, Aug 28, 2015 at 10:32:59AM +0200, Ingo Molnar wrote:
> 
> * Frederic Weisbecker  wrote:
> 
> > On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
> > > 
> > > * Frederic Weisbecker  wrote:
> > > 
> > > > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > > > > 2)
> > > > > 
> > > > > What happens if the boot CPU is offlined? (under 
> > > > > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > > > > 
> > > > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if 
> > > > > the boot CPU 
> > > > > is offlined.
> > > > 
> > > > We have tick_nohz_cpu_down_callback() which makes sure that the 
> > > > timekeeper, which
> > > > is the boot CPU in nohz full, never gets offlined.
> > > 
> > > That solution really sucks - it essentially regresses a feature the user 
> > > explicitly asked for! I also see no way for the user to migrate the 
> > > timekeeping 
> > > functionality over to another CPU without rebooting.
> > > 
> > > If this is the last timekeeping CPU then it should migrate the 
> > > timekeeping 
> > > functionality to another CPU, and perhaps printk a warning if all other 
> > > CPUs are 
> > > nohz-full and we have to mark one of them as the timekeeper.
> > > 
> > > Also, the nohz-full and timekeeper functionality should not be a boot 
> > > parameter 
> > > only thing, but should be runtime configurable.
> > 
> > When I tried to allow moving the timekeeping duty over all housekeeping 
> > CPUs, 
> > Thomas got angry because it broke the KISS current nohz full code. Indeed, 
> > there 
> > must be at least one running all the time on behalf of nohz full CPUs that 
> > can 
> > run anytime. Thus balancing the timekeeping duty over housekeepers is a bit 
> > more 
> > complicated than in normal configurations.
> > 
> > Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a 
> > housekeeper if 
> > any remains or to a nohz full one. Then we must make sure the new 
> > timekeeper 
> > never goes to idle.
> > 
> > But nohz_full is a corner usecase and I'm not sure it's worth the 
> > complexity. If 
> > a nohz full user came and complained about CPU0 hotplog not working, I 
> > would 
> > definetly retry it but I haven't heard about that yet. Besides, hotplug is 
> > very 
> > isolation-unfriendly in general due to stop machine.
> 
> Ok, I guess we can live with this.

Now this will likely evolve in the future, I can easily imagine that 
timekeeping becomes
balanced among housekeepers when we'll have one per node. It's not yet the 
priority but
we may come to that one day.

> 
> Mind sending an updated series with all patches?

Sure, I'm cooking that.

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-28 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
> > 
> > * Frederic Weisbecker  wrote:
> > 
> > > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > > > 2)
> > > > 
> > > > What happens if the boot CPU is offlined? (under 
> > > > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > > > 
> > > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if 
> > > > the boot CPU 
> > > > is offlined.
> > > 
> > > We have tick_nohz_cpu_down_callback() which makes sure that the 
> > > timekeeper, which
> > > is the boot CPU in nohz full, never gets offlined.
> > 
> > That solution really sucks - it essentially regresses a feature the user 
> > explicitly asked for! I also see no way for the user to migrate the 
> > timekeeping 
> > functionality over to another CPU without rebooting.
> > 
> > If this is the last timekeeping CPU then it should migrate the timekeeping 
> > functionality to another CPU, and perhaps printk a warning if all other 
> > CPUs are 
> > nohz-full and we have to mark one of them as the timekeeper.
> > 
> > Also, the nohz-full and timekeeper functionality should not be a boot 
> > parameter 
> > only thing, but should be runtime configurable.
> 
> When I tried to allow moving the timekeeping duty over all housekeeping CPUs, 
> Thomas got angry because it broke the KISS current nohz full code. Indeed, 
> there 
> must be at least one running all the time on behalf of nohz full CPUs that 
> can 
> run anytime. Thus balancing the timekeeping duty over housekeepers is a bit 
> more 
> complicated than in normal configurations.
> 
> Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a housekeeper 
> if 
> any remains or to a nohz full one. Then we must make sure the new timekeeper 
> never goes to idle.
> 
> But nohz_full is a corner usecase and I'm not sure it's worth the complexity. 
> If 
> a nohz full user came and complained about CPU0 hotplog not working, I would 
> definetly retry it but I haven't heard about that yet. Besides, hotplug is 
> very 
> isolation-unfriendly in general due to stop machine.

Ok, I guess we can live with this.

Mind sending an updated series with all patches?

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-28 Thread Ingo Molnar

* Frederic Weisbecker fweis...@gmail.com wrote:

 On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
  
  * Frederic Weisbecker fweis...@gmail.com wrote:
  
   On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
2)

What happens if the boot CPU is offlined? (under 
CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)

I don't see CPU hotplug callbacks fixing up the housekeeping_mask if 
the boot CPU 
is offlined.
   
   We have tick_nohz_cpu_down_callback() which makes sure that the 
   timekeeper, which
   is the boot CPU in nohz full, never gets offlined.
  
  That solution really sucks - it essentially regresses a feature the user 
  explicitly asked for! I also see no way for the user to migrate the 
  timekeeping 
  functionality over to another CPU without rebooting.
  
  If this is the last timekeeping CPU then it should migrate the timekeeping 
  functionality to another CPU, and perhaps printk a warning if all other 
  CPUs are 
  nohz-full and we have to mark one of them as the timekeeper.
  
  Also, the nohz-full and timekeeper functionality should not be a boot 
  parameter 
  only thing, but should be runtime configurable.
 
 When I tried to allow moving the timekeeping duty over all housekeeping CPUs, 
 Thomas got angry because it broke the KISS current nohz full code. Indeed, 
 there 
 must be at least one running all the time on behalf of nohz full CPUs that 
 can 
 run anytime. Thus balancing the timekeeping duty over housekeepers is a bit 
 more 
 complicated than in normal configurations.
 
 Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a housekeeper 
 if 
 any remains or to a nohz full one. Then we must make sure the new timekeeper 
 never goes to idle.
 
 But nohz_full is a corner usecase and I'm not sure it's worth the complexity. 
 If 
 a nohz full user came and complained about CPU0 hotplog not working, I would 
 definetly retry it but I haven't heard about that yet. Besides, hotplug is 
 very 
 isolation-unfriendly in general due to stop machine.

Ok, I guess we can live with this.

Mind sending an updated series with all patches?

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-28 Thread Frederic Weisbecker
On Fri, Aug 28, 2015 at 10:32:59AM +0200, Ingo Molnar wrote:
 
 * Frederic Weisbecker fweis...@gmail.com wrote:
 
  On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
   
   * Frederic Weisbecker fweis...@gmail.com wrote:
   
On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
 2)
 
 What happens if the boot CPU is offlined? (under 
 CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
 
 I don't see CPU hotplug callbacks fixing up the housekeeping_mask if 
 the boot CPU 
 is offlined.

We have tick_nohz_cpu_down_callback() which makes sure that the 
timekeeper, which
is the boot CPU in nohz full, never gets offlined.
   
   That solution really sucks - it essentially regresses a feature the user 
   explicitly asked for! I also see no way for the user to migrate the 
   timekeeping 
   functionality over to another CPU without rebooting.
   
   If this is the last timekeeping CPU then it should migrate the 
   timekeeping 
   functionality to another CPU, and perhaps printk a warning if all other 
   CPUs are 
   nohz-full and we have to mark one of them as the timekeeper.
   
   Also, the nohz-full and timekeeper functionality should not be a boot 
   parameter 
   only thing, but should be runtime configurable.
  
  When I tried to allow moving the timekeeping duty over all housekeeping 
  CPUs, 
  Thomas got angry because it broke the KISS current nohz full code. Indeed, 
  there 
  must be at least one running all the time on behalf of nohz full CPUs that 
  can 
  run anytime. Thus balancing the timekeeping duty over housekeepers is a bit 
  more 
  complicated than in normal configurations.
  
  Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a 
  housekeeper if 
  any remains or to a nohz full one. Then we must make sure the new 
  timekeeper 
  never goes to idle.
  
  But nohz_full is a corner usecase and I'm not sure it's worth the 
  complexity. If 
  a nohz full user came and complained about CPU0 hotplog not working, I 
  would 
  definetly retry it but I haven't heard about that yet. Besides, hotplug is 
  very 
  isolation-unfriendly in general due to stop machine.
 
 Ok, I guess we can live with this.

Now this will likely evolve in the future, I can easily imagine that 
timekeeping becomes
balanced among housekeepers when we'll have one per node. It's not yet the 
priority but
we may come to that one day.

 
 Mind sending an updated series with all patches?

Sure, I'm cooking that.

Thanks!
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-25 Thread Frederic Weisbecker
On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
> 
> * Frederic Weisbecker  wrote:
> 
> > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > > 2)
> > > 
> > > What happens if the boot CPU is offlined? (under 
> > > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > > 
> > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the 
> > > boot CPU 
> > > is offlined.
> > 
> > We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, 
> > which
> > is the boot CPU in nohz full, never gets offlined.
> 
> That solution really sucks - it essentially regresses a feature the user 
> explicitly asked for! I also see no way for the user to migrate the 
> timekeeping 
> functionality over to another CPU without rebooting.
> 
> If this is the last timekeeping CPU then it should migrate the timekeeping 
> functionality to another CPU, and perhaps printk a warning if all other CPUs 
> are 
> nohz-full and we have to mark one of them as the timekeeper.
> 
> Also, the nohz-full and timekeeper functionality should not be a boot 
> parameter 
> only thing, but should be runtime configurable.

When I tried to allow moving the timekeeping duty over all housekeeping CPUs, 
Thomas got
angry because it broke the KISS current nohz full code. Indeed, there must be 
at least
one running all the time on behalf of nohz full CPUs that can run anytime. Thus 
balancing
the timekeeping duty over housekeepers is a bit more complicated than in normal
configurations.

Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a housekeeper if
any remains or to a nohz full one. Then we must make sure the new timekeeper 
never
goes to idle.

But nohz_full is a corner usecase and I'm not sure it's worth the complexity. 
If a
nohz full user came and complained about CPU0 hotplog not working, I would 
definetly
retry it but I haven't heard about that yet. Besides, hotplug is very 
isolation-unfriendly
in general due to stop machine.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-25 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > 2)
> > 
> > What happens if the boot CPU is offlined? (under 
> > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > 
> > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the 
> > boot CPU 
> > is offlined.
> 
> We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, 
> which
> is the boot CPU in nohz full, never gets offlined.

That solution really sucks - it essentially regresses a feature the user 
explicitly asked for! I also see no way for the user to migrate the timekeeping 
functionality over to another CPU without rebooting.

If this is the last timekeeping CPU then it should migrate the timekeeping 
functionality to another CPU, and perhaps printk a warning if all other CPUs 
are 
nohz-full and we have to mark one of them as the timekeeper.

Also, the nohz-full and timekeeper functionality should not be a boot parameter 
only thing, but should be runtime configurable.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-25 Thread Ingo Molnar

* Frederic Weisbecker fweis...@gmail.com wrote:

 On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
  2)
  
  What happens if the boot CPU is offlined? (under 
  CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
  
  I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the 
  boot CPU 
  is offlined.
 
 We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, 
 which
 is the boot CPU in nohz full, never gets offlined.

That solution really sucks - it essentially regresses a feature the user 
explicitly asked for! I also see no way for the user to migrate the timekeeping 
functionality over to another CPU without rebooting.

If this is the last timekeeping CPU then it should migrate the timekeeping 
functionality to another CPU, and perhaps printk a warning if all other CPUs 
are 
nohz-full and we have to mark one of them as the timekeeper.

Also, the nohz-full and timekeeper functionality should not be a boot parameter 
only thing, but should be runtime configurable.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-25 Thread Frederic Weisbecker
On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
 
 * Frederic Weisbecker fweis...@gmail.com wrote:
 
  On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
   2)
   
   What happens if the boot CPU is offlined? (under 
   CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
   
   I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the 
   boot CPU 
   is offlined.
  
  We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, 
  which
  is the boot CPU in nohz full, never gets offlined.
 
 That solution really sucks - it essentially regresses a feature the user 
 explicitly asked for! I also see no way for the user to migrate the 
 timekeeping 
 functionality over to another CPU without rebooting.
 
 If this is the last timekeeping CPU then it should migrate the timekeeping 
 functionality to another CPU, and perhaps printk a warning if all other CPUs 
 are 
 nohz-full and we have to mark one of them as the timekeeper.
 
 Also, the nohz-full and timekeeper functionality should not be a boot 
 parameter 
 only thing, but should be runtime configurable.

When I tried to allow moving the timekeeping duty over all housekeeping CPUs, 
Thomas got
angry because it broke the KISS current nohz full code. Indeed, there must be 
at least
one running all the time on behalf of nohz full CPUs that can run anytime. Thus 
balancing
the timekeeping duty over housekeepers is a bit more complicated than in normal
configurations.

Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a housekeeper if
any remains or to a nohz full one. Then we must make sure the new timekeeper 
never
goes to idle.

But nohz_full is a corner usecase and I'm not sure it's worth the complexity. 
If a
nohz full user came and complained about CPU0 hotplog not working, I would 
definetly
retry it but I haven't heard about that yet. Besides, hotplug is very 
isolation-unfriendly
in general due to stop machine.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Paul E. McKenney
On Mon, Aug 24, 2015 at 04:04:37PM +0200, Frederic Weisbecker wrote:
> On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote:
> > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > > 
> > > * Paul E. McKenney  wrote:
> > > 
> > > > > here it's fully set - triggering the bug I'm worried about. So what 
> > > > > am I 
> > > > > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> > > > 
> > > > The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), 
> > > > which is 
> > > > called from tick_init() which is called from start_kernel() shortly 
> > > > after 
> > > > rcu_init():
> > > > 
> > > > cpu = smp_processor_id();
> > > > 
> > > > if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> > > > pr_warning("NO_HZ: Clearing %d from nohz_full range for 
> > > > timekeeping\n", cpu);
> > > > cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> > > > }
> > > > 
> > > > This happens after the call to tick_nohz_init_all() that does the 
> > > > cpumask_setall() that you called out above.
> > > 
> > > Ah, indeed - I somehow missed that.
> > > 
> > > This brings up two other questions:
> > > 
> > > 1)
> > > 
> > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a 
> > > full mask to 
> > > it (housekeeping_mask - a variable mask to begin with) and recover the 
> > > housekeeping CPU via:
> > > 
> > > +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > > 
> > > which can be pretty expensive, and which gets executed in two hotpaths:
> > > 
> > > kernel/time/hrtimer.c:  return _cpu(hrtimer_bases, 
> > > get_nohz_timer_target());
> > > kernel/time/timer.c:return per_cpu_ptr(_bases, 
> > > get_nohz_timer_target());
> > > 
> > > ... why not just use a single housekeeping_cpu which would be way faster 
> > > to pass 
> > > down to the timer code?
> > 
> > The housekeeping_cpu came later, but that does seem like a good 
> > optimization.
> 
> Well nohz full is likely to be used for HPC and that can involve big machines.
> Having the housekeeping duty spread per node is a likely future evolution 
> there,
> if it isn't already used that way.
> 
> So we need to keep it a cpumask.

Fair point!

Thanx, Paul

> > > 2)
> > > 
> > > What happens if the boot CPU is offlined? (under 
> > > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > > 
> > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the 
> > > boot CPU 
> > > is offlined.
> > 
> > The tick_nohz_cpu_down_callback() function does this, though in a less
> > than obvious way.  The tick_do_timer_cpu variable is the housekeeping
> > CPU that is currently handling timing, and it is not permitted to go
> > offline.
> 
> Indeed, more specifically tick-common.c makes sure to set the timekeeping
> duty to a housekeeper and that housekeeper is always the boot CPU due to
> early device initialization.
> 
> But I should find a way to simplify that code and make it obvious it's always
> set to the boot CPU.
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Frederic Weisbecker
On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote:
> On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > 
> > * Paul E. McKenney  wrote:
> > 
> > > > here it's fully set - triggering the bug I'm worried about. So what am 
> > > > I 
> > > > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> > > 
> > > The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), 
> > > which is 
> > > called from tick_init() which is called from start_kernel() shortly after 
> > > rcu_init():
> > > 
> > >   cpu = smp_processor_id();
> > > 
> > >   if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> > >   pr_warning("NO_HZ: Clearing %d from nohz_full range for 
> > > timekeeping\n", cpu);
> > >   cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> > >   }
> > > 
> > > This happens after the call to tick_nohz_init_all() that does the 
> > > cpumask_setall() that you called out above.
> > 
> > Ah, indeed - I somehow missed that.
> > 
> > This brings up two other questions:
> > 
> > 1)
> > 
> > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full 
> > mask to 
> > it (housekeeping_mask - a variable mask to begin with) and recover the 
> > housekeeping CPU via:
> > 
> > +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > 
> > which can be pretty expensive, and which gets executed in two hotpaths:
> > 
> > kernel/time/hrtimer.c:  return _cpu(hrtimer_bases, 
> > get_nohz_timer_target());
> > kernel/time/timer.c:return per_cpu_ptr(_bases, 
> > get_nohz_timer_target());
> > 
> > ... why not just use a single housekeeping_cpu which would be way faster to 
> > pass 
> > down to the timer code?
> 
> The housekeeping_cpu came later, but that does seem like a good optimization.

Well nohz full is likely to be used for HPC and that can involve big machines.
Having the housekeeping duty spread per node is a likely future evolution there,
if it isn't already used that way.

So we need to keep it a cpumask.

> 
> > 2)
> > 
> > What happens if the boot CPU is offlined? (under 
> > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > 
> > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the 
> > boot CPU 
> > is offlined.
> 
> The tick_nohz_cpu_down_callback() function does this, though in a less
> than obvious way.  The tick_do_timer_cpu variable is the housekeeping
> CPU that is currently handling timing, and it is not permitted to go
> offline.

Indeed, more specifically tick-common.c makes sure to set the timekeeping
duty to a housekeeper and that housekeeper is always the boot CPU due to
early device initialization.

But I should find a way to simplify that code and make it obvious it's always
set to the boot CPU.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Mike Galbraith
On Mon, 2015-08-24 at 15:36 +0200, Frederic Weisbecker wrote:
> On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > 2)
> > 
> > What happens if the boot CPU is offlined? (under 
> > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > 
> > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the 
> > boot CPU 
> > is offlined.
> 
> We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, 
> which
> is the boot CPU in nohz full, never gets offlined.

Aha.. and it works fine too.  (CPU0 told me to go away earlier;)

-Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Paul E. McKenney
On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney  wrote:
> 
> > > here it's fully set - triggering the bug I'm worried about. So what am I 
> > > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> > 
> > The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), 
> > which is 
> > called from tick_init() which is called from start_kernel() shortly after 
> > rcu_init():
> > 
> > cpu = smp_processor_id();
> > 
> > if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> > pr_warning("NO_HZ: Clearing %d from nohz_full range for 
> > timekeeping\n", cpu);
> > cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> > }
> > 
> > This happens after the call to tick_nohz_init_all() that does the 
> > cpumask_setall() that you called out above.
> 
> Ah, indeed - I somehow missed that.
> 
> This brings up two other questions:
> 
> 1)
> 
> the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full 
> mask to 
> it (housekeeping_mask - a variable mask to begin with) and recover the 
> housekeeping CPU via:
> 
> +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> 
> which can be pretty expensive, and which gets executed in two hotpaths:
> 
> kernel/time/hrtimer.c:  return _cpu(hrtimer_bases, 
> get_nohz_timer_target());
> kernel/time/timer.c:return per_cpu_ptr(_bases, 
> get_nohz_timer_target());
> 
> ... why not just use a single housekeeping_cpu which would be way faster to 
> pass 
> down to the timer code?

The housekeeping_cpu came later, but that does seem like a good optimization.

> 2)
> 
> What happens if the boot CPU is offlined? (under 
> CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> 
> I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot 
> CPU 
> is offlined.

The tick_nohz_cpu_down_callback() function does this, though in a less
than obvious way.  The tick_do_timer_cpu variable is the housekeeping
CPU that is currently handling timing, and it is not permitted to go
offline.

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Frederic Weisbecker
On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> 2)
> 
> What happens if the boot CPU is offlined? (under 
> CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> 
> I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot 
> CPU 
> is offlined.

We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, 
which
is the boot CPU in nohz full, never gets offlined.

> 
> Thanks,
> 
>   Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Ingo Molnar

* Mike Galbraith  wrote:

> On Mon, 2015-08-24 at 09:41 +0200, Ingo Molnar wrote:
> > * Mike Galbraith  wrote:
> > 
> > > On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
> > > 
> > > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a 
> > > > full mask to 
> > > > it (housekeeping_mask - a variable mask to begin with) and recover the 
> > > > housekeeping CPU via:
> > > > 
> > > > +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > > 
> > > There can be, and had better be if box is big, multiple housekeepers.
> > 
> > Yes - but that does not seem to be possible via the code right now AFAICS, 
> > so 
> > at minimum it's incomplete.
> 
> In master housekeepers are set up in tick_nohz_init().  Everybody who's not a 
> nohz_full CPU is a housekeeper.

Ah, indeed, I missed the nohz_full= boot option:

static int __init tick_nohz_full_setup(char *str)
{
alloc_bootmem_cpumask_var(_nohz_full_mask);
if (cpulist_parse(str, tick_nohz_full_mask) < 0) {


Ok, first question is resolved - but the second one, behavior on boot CPU 
unplug 
still holds.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Mike Galbraith
On Mon, 2015-08-24 at 09:41 +0200, Ingo Molnar wrote:
> * Mike Galbraith  wrote:
> 
> > On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
> > 
> > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a 
> > > full mask to 
> > > it (housekeeping_mask - a variable mask to begin with) and recover the 
> > > housekeeping CPU via:
> > > 
> > > +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > 
> > There can be, and had better be if box is big, multiple housekeepers.
> 
> Yes - but that does not seem to be possible via the code right now AFAICS, so 
> at 
> minimum it's incomplete.

In master housekeepers are set up in tick_nohz_init().  Everybody who's
not a nohz_full CPU is a housekeeper.

-Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Ingo Molnar

* Mike Galbraith  wrote:

> On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
> 
> > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full 
> > mask to 
> > it (housekeeping_mask - a variable mask to begin with) and recover the 
> > housekeeping CPU via:
> > 
> > +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> 
> There can be, and had better be if box is big, multiple housekeepers.

Yes - but that does not seem to be possible via the code right now AFAICS, so 
at 
minimum it's incomplete.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Mike Galbraith
On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:

> the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full 
> mask to 
> it (housekeeping_mask - a variable mask to begin with) and recover the 
> housekeeping CPU via:
> 
> +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);

There can be, and had better be if box is big, multiple housekeepers.

Imagine a NO_HZ_FULL_ALL kernel on an SGI beast from hell.  Offloading
8191 CPUs onto poor little CPU0 probably wouldn't work out well :)

-Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Ingo Molnar

* Paul E. McKenney  wrote:

> > here it's fully set - triggering the bug I'm worried about. So what am I 
> > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> 
> The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), which 
> is 
> called from tick_init() which is called from start_kernel() shortly after 
> rcu_init():
> 
>   cpu = smp_processor_id();
> 
>   if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
>   pr_warning("NO_HZ: Clearing %d from nohz_full range for 
> timekeeping\n", cpu);
>   cpumask_clear_cpu(cpu, tick_nohz_full_mask);
>   }
> 
> This happens after the call to tick_nohz_init_all() that does the 
> cpumask_setall() that you called out above.

Ah, indeed - I somehow missed that.

This brings up two other questions:

1)

the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask 
to 
it (housekeeping_mask - a variable mask to begin with) and recover the 
housekeeping CPU via:

+   return cpumask_any_and(housekeeping_mask, cpu_online_mask);

which can be pretty expensive, and which gets executed in two hotpaths:

kernel/time/hrtimer.c:  return _cpu(hrtimer_bases, get_nohz_timer_target());
kernel/time/timer.c:return per_cpu_ptr(_bases, 
get_nohz_timer_target());

... why not just use a single housekeeping_cpu which would be way faster to 
pass 
down to the timer code?

2)

What happens if the boot CPU is offlined? (under 
CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)

I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot 
CPU 
is offlined.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Frederic Weisbecker
On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
 2)
 
 What happens if the boot CPU is offlined? (under 
 CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
 
 I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot 
 CPU 
 is offlined.

We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, 
which
is the boot CPU in nohz full, never gets offlined.

 
 Thanks,
 
   Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Paul E. McKenney
On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
 
 * Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
 
   here it's fully set - triggering the bug I'm worried about. So what am I 
   missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
  
  The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), 
  which is 
  called from tick_init() which is called from start_kernel() shortly after 
  rcu_init():
  
  cpu = smp_processor_id();
  
  if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
  pr_warning(NO_HZ: Clearing %d from nohz_full range for 
  timekeeping\n, cpu);
  cpumask_clear_cpu(cpu, tick_nohz_full_mask);
  }
  
  This happens after the call to tick_nohz_init_all() that does the 
  cpumask_setall() that you called out above.
 
 Ah, indeed - I somehow missed that.
 
 This brings up two other questions:
 
 1)
 
 the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full 
 mask to 
 it (housekeeping_mask - a variable mask to begin with) and recover the 
 housekeeping CPU via:
 
 +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
 
 which can be pretty expensive, and which gets executed in two hotpaths:
 
 kernel/time/hrtimer.c:  return per_cpu(hrtimer_bases, 
 get_nohz_timer_target());
 kernel/time/timer.c:return per_cpu_ptr(tvec_bases, 
 get_nohz_timer_target());
 
 ... why not just use a single housekeeping_cpu which would be way faster to 
 pass 
 down to the timer code?

The housekeeping_cpu came later, but that does seem like a good optimization.

 2)
 
 What happens if the boot CPU is offlined? (under 
 CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
 
 I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot 
 CPU 
 is offlined.

The tick_nohz_cpu_down_callback() function does this, though in a less
than obvious way.  The tick_do_timer_cpu variable is the housekeeping
CPU that is currently handling timing, and it is not permitted to go
offline.

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Mike Galbraith
On Mon, 2015-08-24 at 15:36 +0200, Frederic Weisbecker wrote:
 On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
  2)
  
  What happens if the boot CPU is offlined? (under 
  CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
  
  I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the 
  boot CPU 
  is offlined.
 
 We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, 
 which
 is the boot CPU in nohz full, never gets offlined.

Aha.. and it works fine too.  (CPU0 told me to go away earlier;)

-Mike

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Frederic Weisbecker
On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote:
 On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
  
  * Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
  
here it's fully set - triggering the bug I'm worried about. So what am 
I 
missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
   
   The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), 
   which is 
   called from tick_init() which is called from start_kernel() shortly after 
   rcu_init():
   
 cpu = smp_processor_id();
   
 if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
 pr_warning(NO_HZ: Clearing %d from nohz_full range for 
   timekeeping\n, cpu);
 cpumask_clear_cpu(cpu, tick_nohz_full_mask);
 }
   
   This happens after the call to tick_nohz_init_all() that does the 
   cpumask_setall() that you called out above.
  
  Ah, indeed - I somehow missed that.
  
  This brings up two other questions:
  
  1)
  
  the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full 
  mask to 
  it (housekeeping_mask - a variable mask to begin with) and recover the 
  housekeeping CPU via:
  
  +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
  
  which can be pretty expensive, and which gets executed in two hotpaths:
  
  kernel/time/hrtimer.c:  return per_cpu(hrtimer_bases, 
  get_nohz_timer_target());
  kernel/time/timer.c:return per_cpu_ptr(tvec_bases, 
  get_nohz_timer_target());
  
  ... why not just use a single housekeeping_cpu which would be way faster to 
  pass 
  down to the timer code?
 
 The housekeeping_cpu came later, but that does seem like a good optimization.

Well nohz full is likely to be used for HPC and that can involve big machines.
Having the housekeeping duty spread per node is a likely future evolution there,
if it isn't already used that way.

So we need to keep it a cpumask.

 
  2)
  
  What happens if the boot CPU is offlined? (under 
  CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
  
  I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the 
  boot CPU 
  is offlined.
 
 The tick_nohz_cpu_down_callback() function does this, though in a less
 than obvious way.  The tick_do_timer_cpu variable is the housekeeping
 CPU that is currently handling timing, and it is not permitted to go
 offline.

Indeed, more specifically tick-common.c makes sure to set the timekeeping
duty to a housekeeper and that housekeeper is always the boot CPU due to
early device initialization.

But I should find a way to simplify that code and make it obvious it's always
set to the boot CPU.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Paul E. McKenney
On Mon, Aug 24, 2015 at 04:04:37PM +0200, Frederic Weisbecker wrote:
 On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote:
  On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
   
   * Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
   
 here it's fully set - triggering the bug I'm worried about. So what 
 am I 
 missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), 
which is 
called from tick_init() which is called from start_kernel() shortly 
after 
rcu_init():

cpu = smp_processor_id();

if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
pr_warning(NO_HZ: Clearing %d from nohz_full range for 
timekeeping\n, cpu);
cpumask_clear_cpu(cpu, tick_nohz_full_mask);
}

This happens after the call to tick_nohz_init_all() that does the 
cpumask_setall() that you called out above.
   
   Ah, indeed - I somehow missed that.
   
   This brings up two other questions:
   
   1)
   
   the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a 
   full mask to 
   it (housekeeping_mask - a variable mask to begin with) and recover the 
   housekeeping CPU via:
   
   +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
   
   which can be pretty expensive, and which gets executed in two hotpaths:
   
   kernel/time/hrtimer.c:  return per_cpu(hrtimer_bases, 
   get_nohz_timer_target());
   kernel/time/timer.c:return per_cpu_ptr(tvec_bases, 
   get_nohz_timer_target());
   
   ... why not just use a single housekeeping_cpu which would be way faster 
   to pass 
   down to the timer code?
  
  The housekeeping_cpu came later, but that does seem like a good 
  optimization.
 
 Well nohz full is likely to be used for HPC and that can involve big machines.
 Having the housekeeping duty spread per node is a likely future evolution 
 there,
 if it isn't already used that way.
 
 So we need to keep it a cpumask.

Fair point!

Thanx, Paul

   2)
   
   What happens if the boot CPU is offlined? (under 
   CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
   
   I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the 
   boot CPU 
   is offlined.
  
  The tick_nohz_cpu_down_callback() function does this, though in a less
  than obvious way.  The tick_do_timer_cpu variable is the housekeeping
  CPU that is currently handling timing, and it is not permitted to go
  offline.
 
 Indeed, more specifically tick-common.c makes sure to set the timekeeping
 duty to a housekeeper and that housekeeper is always the boot CPU due to
 early device initialization.
 
 But I should find a way to simplify that code and make it obvious it's always
 set to the boot CPU.
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Ingo Molnar

* Paul E. McKenney paul...@linux.vnet.ibm.com wrote:

  here it's fully set - triggering the bug I'm worried about. So what am I 
  missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
 
 The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), which 
 is 
 called from tick_init() which is called from start_kernel() shortly after 
 rcu_init():
 
   cpu = smp_processor_id();
 
   if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
   pr_warning(NO_HZ: Clearing %d from nohz_full range for 
 timekeeping\n, cpu);
   cpumask_clear_cpu(cpu, tick_nohz_full_mask);
   }
 
 This happens after the call to tick_nohz_init_all() that does the 
 cpumask_setall() that you called out above.

Ah, indeed - I somehow missed that.

This brings up two other questions:

1)

the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask 
to 
it (housekeeping_mask - a variable mask to begin with) and recover the 
housekeeping CPU via:

+   return cpumask_any_and(housekeeping_mask, cpu_online_mask);

which can be pretty expensive, and which gets executed in two hotpaths:

kernel/time/hrtimer.c:  return per_cpu(hrtimer_bases, get_nohz_timer_target());
kernel/time/timer.c:return per_cpu_ptr(tvec_bases, 
get_nohz_timer_target());

... why not just use a single housekeeping_cpu which would be way faster to 
pass 
down to the timer code?

2)

What happens if the boot CPU is offlined? (under 
CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)

I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot 
CPU 
is offlined.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Ingo Molnar

* Mike Galbraith umgwanakikb...@gmail.com wrote:

 On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
 
  the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full 
  mask to 
  it (housekeeping_mask - a variable mask to begin with) and recover the 
  housekeeping CPU via:
  
  +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
 
 There can be, and had better be if box is big, multiple housekeepers.

Yes - but that does not seem to be possible via the code right now AFAICS, so 
at 
minimum it's incomplete.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Mike Galbraith
On Mon, 2015-08-24 at 09:41 +0200, Ingo Molnar wrote:
 * Mike Galbraith umgwanakikb...@gmail.com wrote:
 
  On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
  
   the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a 
   full mask to 
   it (housekeeping_mask - a variable mask to begin with) and recover the 
   housekeeping CPU via:
   
   +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
  
  There can be, and had better be if box is big, multiple housekeepers.
 
 Yes - but that does not seem to be possible via the code right now AFAICS, so 
 at 
 minimum it's incomplete.

In master housekeepers are set up in tick_nohz_init().  Everybody who's
not a nohz_full CPU is a housekeeper.

-Mike

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Ingo Molnar

* Mike Galbraith umgwanakikb...@gmail.com wrote:

 On Mon, 2015-08-24 at 09:41 +0200, Ingo Molnar wrote:
  * Mike Galbraith umgwanakikb...@gmail.com wrote:
  
   On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
   
the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a 
full mask to 
it (housekeeping_mask - a variable mask to begin with) and recover the 
housekeeping CPU via:

+   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
   
   There can be, and had better be if box is big, multiple housekeepers.
  
  Yes - but that does not seem to be possible via the code right now AFAICS, 
  so 
  at minimum it's incomplete.
 
 In master housekeepers are set up in tick_nohz_init().  Everybody who's not a 
 nohz_full CPU is a housekeeper.

Ah, indeed, I missed the nohz_full= boot option:

static int __init tick_nohz_full_setup(char *str)
{
alloc_bootmem_cpumask_var(tick_nohz_full_mask);
if (cpulist_parse(str, tick_nohz_full_mask)  0) {


Ok, first question is resolved - but the second one, behavior on boot CPU 
unplug 
still holds.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-24 Thread Mike Galbraith
On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:

 the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full 
 mask to 
 it (housekeeping_mask - a variable mask to begin with) and recover the 
 housekeeping CPU via:
 
 +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);

There can be, and had better be if box is big, multiple housekeepers.

Imagine a NO_HZ_FULL_ALL kernel on an SGI beast from hell.  Offloading
8191 CPUs onto poor little CPU0 probably wouldn't work out well :)

-Mike

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-23 Thread Frederic Weisbecker
On Sun, Aug 23, 2015 at 07:40:32AM +0200, Ingo Molnar wrote:
> So I almost applied this yesterday, but had the following question: what 
> ensures 
> that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu() 
> returns cpumask_any_and() of an empty cpumask - which returns an out of range 
> index AFAICS - which will crash and burn in:
> 
> kernel/time/hrtimer.c:  return _cpu(hrtimer_bases, 
> get_nohz_timer_target());
> kernel/time/timer.c:return per_cpu_ptr(_bases, 
> get_nohz_timer_target());
> 
> housekeeping_mask itself is derived from tick_nohz_full_mask (it's the 
> inverse of 
> it in essence), and tick_nohz_full_mask is set via two methods, either via a 
> boot 
> parameter:
> 
> if (cpulist_parse(str, tick_nohz_full_mask) < 0) {
> 
> in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not 
> completely full - making housekeeping_mask empty?
> 
> The other method is via CONFIG_NO_HZ_FULL_ALL:
> 
> cpumask_setall(tick_nohz_full_mask);
> 
> here it's fully set - triggering the bug I'm worried about. So what am I 
> missing, 
> what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

Legitimate worry and I should have explained that in the changelog.

Like Paul replied, we make sure that at least the boot CPU is excluded
from tick_nohz_full_mask in tick_nohz_init(). Then housekeeping_mask,
by reverse effect, contains that boot CPU at least.

And we also make sure that the boot CPU can't get offline
(tick_nohz_cpu_down_callback()).

Now we should really document and check that assumption so here is a
second patch below. The sched patch depends on tip:sched/core (to avoid
conflicts with sched changes) and the following one is based on
tip:timer/nohz but should be applicable to sched/core without conflict.
Both are standalone anyway.

Thanks!

---
From: Frederic Weisbecker 
Date: Sun, 23 Aug 2015 19:34:31 +0200
Subject: [PATCH] nohz: Assert existing housekeepers when nohz full enabled

The code ensures that at least the boot CPU serves as a housekeeper.

Let's assert this assumption to make sure that we have CPUs to handle
unbound jobs like workqueues and timers while nohz full CPUs run
undisturbed.

Signed-off-by: Frederic Weisbecker 
---
 kernel/time/tick-sched.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 3319e16..cc9884f 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -370,6 +370,12 @@ void __init tick_nohz_init(void)
cpu_notifier(tick_nohz_cpu_down_callback, 0);
pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
cpumask_pr_args(tick_nohz_full_mask));
+
+   /*
+* We need at least one CPU to handle housekeeping work such
+* as timekeeping, unbound timers, workqueues, ...
+*/
+   WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
 }
 #endif
 
-- 
2.1.4


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-23 Thread Frederic Weisbecker
On Sun, Aug 23, 2015 at 09:01:01AM -0700, Paul E. McKenney wrote:
> The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(),
> which is called from tick_init() which is called from start_kernel()
> shortly after rcu_init():
> 
>   cpu = smp_processor_id();
> 
>   if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
>   pr_warning("NO_HZ: Clearing %d from nohz_full range for 
> timekeeping\n", cpu);
>   cpumask_clear_cpu(cpu, tick_nohz_full_mask);
>   }
> 
> This happens after the call to tick_nohz_init_all() that does the
> cpumask_setall() that you called out above.
> 
> Or is a recent patch that I missed changing this?

Exactly, this happens right after tick_nohz_full_mask is filled and
makes sure that at least the boot CPU runs as a housekeeper. We
also make sure that it can't become offline later.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-23 Thread Paul E. McKenney
On Sun, Aug 23, 2015 at 07:40:32AM +0200, Ingo Molnar wrote:
> 
> * Frederic Weisbecker  wrote:
> 
> > From: Vatika Harlalka 
> > 
> > The problem addressed in this patch is about affining unpinned timers.
> > Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
> > jitter due to firing of such timers on them.
> > 
> > This patch will affine timers to online CPUs which are not full dynticks
> > in NOHZ_FULL configured systems. It should not introduce overhead in
> > nohz full off case due to static keys.
> > 
> > Reviewed-by: Preeti U Murthy 
> > Signed-off by: Vatika Harlalka 
> > Cc: Ingo Molnar 
> > Cc: Peter Zijlstra 
> > Cc: Christoph Lameter 
> > Cc: Thomas Gleixner 
> > Cc: Paul E. McKenney 
> > Cc: Chris Metcalf 
> > Signed-off-by: Frederic Weisbecker 
> > ---
> >  include/linux/tick.h | 9 -
> >  kernel/sched/core.c  | 7 +--
> >  2 files changed, 13 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/linux/tick.h b/include/linux/tick.h
> > index 3741ba1..51e6493 100644
> > --- a/include/linux/tick.h
> > +++ b/include/linux/tick.h
> > @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct 
> > cpumask *mask)
> > if (tick_nohz_full_enabled())
> > cpumask_or(mask, mask, tick_nohz_full_mask);
> >  }
> > -
> > +static inline int housekeeping_any_cpu(void)
> > +{
> > +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > +}
> >  extern void __tick_nohz_full_check(void);
> >  extern void tick_nohz_full_kick(void);
> >  extern void tick_nohz_full_kick_cpu(int cpu);
> >  extern void tick_nohz_full_kick_all(void);
> >  extern void __tick_nohz_task_switch(struct task_struct *tsk);
> >  #else
> > +static inline int housekeeping_any_cpu(void)
> > +{
> > +   return smp_processor_id();
> > +}
> >  static inline bool tick_nohz_full_enabled(void) { return false; }
> >  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
> >  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 9917c96..4fd42e4 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
> > int i, cpu = smp_processor_id();
> > struct sched_domain *sd;
> >  
> > -   if (!idle_cpu(cpu))
> > +   if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
> > return cpu;
> >  
> > rcu_read_lock();
> > for_each_domain(cpu, sd) {
> > for_each_cpu(i, sched_domain_span(sd)) {
> > -   if (!idle_cpu(i)) {
> > +   if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
> > cpu = i;
> > goto unlock;
> > }
> > }
> > }
> > +
> > +   if (!is_housekeeping_cpu(cpu))
> > +   cpu = housekeeping_any_cpu();
> >  unlock:
> > rcu_read_unlock();
> > return cpu;
> 
> So I almost applied this yesterday, but had the following question: what 
> ensures 
> that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu() 
> returns cpumask_any_and() of an empty cpumask - which returns an out of range 
> index AFAICS - which will crash and burn in:
> 
> kernel/time/hrtimer.c:  return _cpu(hrtimer_bases, 
> get_nohz_timer_target());
> kernel/time/timer.c:return per_cpu_ptr(_bases, 
> get_nohz_timer_target());
> 
> housekeeping_mask itself is derived from tick_nohz_full_mask (it's the 
> inverse of 
> it in essence), and tick_nohz_full_mask is set via two methods, either via a 
> boot 
> parameter:
> 
> if (cpulist_parse(str, tick_nohz_full_mask) < 0) {
> 
> in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not 
> completely full - making housekeeping_mask empty?
> 
> The other method is via CONFIG_NO_HZ_FULL_ALL:
> 
> cpumask_setall(tick_nohz_full_mask);
> 
> here it's fully set - triggering the bug I'm worried about. So what am I 
> missing, 
> what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(),
which is called from tick_init() which is called from start_kernel()
shortly after rcu_init():

cpu = smp_processor_id();

if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
pr_warning("NO_HZ: Clearing %d from nohz_full range for 
timekeeping\n", cpu);
cpumask_clear_cpu(cpu, tick_nohz_full_mask);
}

This happens after the call to tick_nohz_init_all() that does the
cpumask_setall() that you called out above.

Or is a recent patch that I missed changing this?

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-23 Thread Paul E. McKenney
On Sun, Aug 23, 2015 at 07:40:32AM +0200, Ingo Molnar wrote:
 
 * Frederic Weisbecker fweis...@gmail.com wrote:
 
  From: Vatika Harlalka vatikaharla...@gmail.com
  
  The problem addressed in this patch is about affining unpinned timers.
  Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
  jitter due to firing of such timers on them.
  
  This patch will affine timers to online CPUs which are not full dynticks
  in NOHZ_FULL configured systems. It should not introduce overhead in
  nohz full off case due to static keys.
  
  Reviewed-by: Preeti U Murthy pre...@linux.vnet.ibm.com
  Signed-off by: Vatika Harlalka vatikaharla...@gmail.com
  Cc: Ingo Molnar mi...@kernel.org
  Cc: Peter Zijlstra pet...@infradead.org
  Cc: Christoph Lameter c...@linux.com
  Cc: Thomas Gleixner t...@linutronix.de
  Cc: Paul E. McKenney paul...@linux.vnet.ibm.com
  Cc: Chris Metcalf cmetc...@ezchip.com
  Signed-off-by: Frederic Weisbecker fweis...@gmail.com
  ---
   include/linux/tick.h | 9 -
   kernel/sched/core.c  | 7 +--
   2 files changed, 13 insertions(+), 3 deletions(-)
  
  diff --git a/include/linux/tick.h b/include/linux/tick.h
  index 3741ba1..51e6493 100644
  --- a/include/linux/tick.h
  +++ b/include/linux/tick.h
  @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct 
  cpumask *mask)
  if (tick_nohz_full_enabled())
  cpumask_or(mask, mask, tick_nohz_full_mask);
   }
  -
  +static inline int housekeeping_any_cpu(void)
  +{
  +   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
  +}
   extern void __tick_nohz_full_check(void);
   extern void tick_nohz_full_kick(void);
   extern void tick_nohz_full_kick_cpu(int cpu);
   extern void tick_nohz_full_kick_all(void);
   extern void __tick_nohz_task_switch(struct task_struct *tsk);
   #else
  +static inline int housekeeping_any_cpu(void)
  +{
  +   return smp_processor_id();
  +}
   static inline bool tick_nohz_full_enabled(void) { return false; }
   static inline bool tick_nohz_full_cpu(int cpu) { return false; }
   static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
  diff --git a/kernel/sched/core.c b/kernel/sched/core.c
  index 9917c96..4fd42e4 100644
  --- a/kernel/sched/core.c
  +++ b/kernel/sched/core.c
  @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
  int i, cpu = smp_processor_id();
  struct sched_domain *sd;
   
  -   if (!idle_cpu(cpu))
  +   if (!idle_cpu(cpu)  is_housekeeping_cpu(cpu))
  return cpu;
   
  rcu_read_lock();
  for_each_domain(cpu, sd) {
  for_each_cpu(i, sched_domain_span(sd)) {
  -   if (!idle_cpu(i)) {
  +   if (!idle_cpu(i)  is_housekeeping_cpu(cpu)) {
  cpu = i;
  goto unlock;
  }
  }
  }
  +
  +   if (!is_housekeeping_cpu(cpu))
  +   cpu = housekeeping_any_cpu();
   unlock:
  rcu_read_unlock();
  return cpu;
 
 So I almost applied this yesterday, but had the following question: what 
 ensures 
 that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu() 
 returns cpumask_any_and() of an empty cpumask - which returns an out of range 
 index AFAICS - which will crash and burn in:
 
 kernel/time/hrtimer.c:  return per_cpu(hrtimer_bases, 
 get_nohz_timer_target());
 kernel/time/timer.c:return per_cpu_ptr(tvec_bases, 
 get_nohz_timer_target());
 
 housekeeping_mask itself is derived from tick_nohz_full_mask (it's the 
 inverse of 
 it in essence), and tick_nohz_full_mask is set via two methods, either via a 
 boot 
 parameter:
 
 if (cpulist_parse(str, tick_nohz_full_mask)  0) {
 
 in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not 
 completely full - making housekeeping_mask empty?
 
 The other method is via CONFIG_NO_HZ_FULL_ALL:
 
 cpumask_setall(tick_nohz_full_mask);
 
 here it's fully set - triggering the bug I'm worried about. So what am I 
 missing, 
 what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(),
which is called from tick_init() which is called from start_kernel()
shortly after rcu_init():

cpu = smp_processor_id();

if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
pr_warning(NO_HZ: Clearing %d from nohz_full range for 
timekeeping\n, cpu);
cpumask_clear_cpu(cpu, tick_nohz_full_mask);
}

This happens after the call to tick_nohz_init_all() that does the
cpumask_setall() that you called out above.

Or is a recent patch that I missed changing this?

Thanx, Paul

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-23 Thread Frederic Weisbecker
On Sun, Aug 23, 2015 at 09:01:01AM -0700, Paul E. McKenney wrote:
 The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(),
 which is called from tick_init() which is called from start_kernel()
 shortly after rcu_init():
 
   cpu = smp_processor_id();
 
   if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
   pr_warning(NO_HZ: Clearing %d from nohz_full range for 
 timekeeping\n, cpu);
   cpumask_clear_cpu(cpu, tick_nohz_full_mask);
   }
 
 This happens after the call to tick_nohz_init_all() that does the
 cpumask_setall() that you called out above.
 
 Or is a recent patch that I missed changing this?

Exactly, this happens right after tick_nohz_full_mask is filled and
makes sure that at least the boot CPU runs as a housekeeper. We
also make sure that it can't become offline later.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-23 Thread Frederic Weisbecker
On Sun, Aug 23, 2015 at 07:40:32AM +0200, Ingo Molnar wrote:
 So I almost applied this yesterday, but had the following question: what 
 ensures 
 that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu() 
 returns cpumask_any_and() of an empty cpumask - which returns an out of range 
 index AFAICS - which will crash and burn in:
 
 kernel/time/hrtimer.c:  return per_cpu(hrtimer_bases, 
 get_nohz_timer_target());
 kernel/time/timer.c:return per_cpu_ptr(tvec_bases, 
 get_nohz_timer_target());
 
 housekeeping_mask itself is derived from tick_nohz_full_mask (it's the 
 inverse of 
 it in essence), and tick_nohz_full_mask is set via two methods, either via a 
 boot 
 parameter:
 
 if (cpulist_parse(str, tick_nohz_full_mask)  0) {
 
 in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not 
 completely full - making housekeeping_mask empty?
 
 The other method is via CONFIG_NO_HZ_FULL_ALL:
 
 cpumask_setall(tick_nohz_full_mask);
 
 here it's fully set - triggering the bug I'm worried about. So what am I 
 missing, 
 what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

Legitimate worry and I should have explained that in the changelog.

Like Paul replied, we make sure that at least the boot CPU is excluded
from tick_nohz_full_mask in tick_nohz_init(). Then housekeeping_mask,
by reverse effect, contains that boot CPU at least.

And we also make sure that the boot CPU can't get offline
(tick_nohz_cpu_down_callback()).

Now we should really document and check that assumption so here is a
second patch below. The sched patch depends on tip:sched/core (to avoid
conflicts with sched changes) and the following one is based on
tip:timer/nohz but should be applicable to sched/core without conflict.
Both are standalone anyway.

Thanks!

---
From: Frederic Weisbecker fweis...@gmail.com
Date: Sun, 23 Aug 2015 19:34:31 +0200
Subject: [PATCH] nohz: Assert existing housekeepers when nohz full enabled

The code ensures that at least the boot CPU serves as a housekeeper.

Let's assert this assumption to make sure that we have CPUs to handle
unbound jobs like workqueues and timers while nohz full CPUs run
undisturbed.

Signed-off-by: Frederic Weisbecker fweis...@gmail.com
---
 kernel/time/tick-sched.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 3319e16..cc9884f 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -370,6 +370,12 @@ void __init tick_nohz_init(void)
cpu_notifier(tick_nohz_cpu_down_callback, 0);
pr_info(NO_HZ: Full dynticks CPUs: %*pbl.\n,
cpumask_pr_args(tick_nohz_full_mask));
+
+   /*
+* We need at least one CPU to handle housekeeping work such
+* as timekeeping, unbound timers, workqueues, ...
+*/
+   WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
 }
 #endif
 
-- 
2.1.4


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-22 Thread Ingo Molnar

* Frederic Weisbecker  wrote:

> From: Vatika Harlalka 
> 
> The problem addressed in this patch is about affining unpinned timers.
> Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
> jitter due to firing of such timers on them.
> 
> This patch will affine timers to online CPUs which are not full dynticks
> in NOHZ_FULL configured systems. It should not introduce overhead in
> nohz full off case due to static keys.
> 
> Reviewed-by: Preeti U Murthy 
> Signed-off by: Vatika Harlalka 
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
> Cc: Christoph Lameter 
> Cc: Thomas Gleixner 
> Cc: Paul E. McKenney 
> Cc: Chris Metcalf 
> Signed-off-by: Frederic Weisbecker 
> ---
>  include/linux/tick.h | 9 -
>  kernel/sched/core.c  | 7 +--
>  2 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 3741ba1..51e6493 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct 
> cpumask *mask)
>   if (tick_nohz_full_enabled())
>   cpumask_or(mask, mask, tick_nohz_full_mask);
>  }
> -
> +static inline int housekeeping_any_cpu(void)
> +{
> + return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> +}
>  extern void __tick_nohz_full_check(void);
>  extern void tick_nohz_full_kick(void);
>  extern void tick_nohz_full_kick_cpu(int cpu);
>  extern void tick_nohz_full_kick_all(void);
>  extern void __tick_nohz_task_switch(struct task_struct *tsk);
>  #else
> +static inline int housekeeping_any_cpu(void)
> +{
> + return smp_processor_id();
> +}
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
>  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9917c96..4fd42e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
>   int i, cpu = smp_processor_id();
>   struct sched_domain *sd;
>  
> - if (!idle_cpu(cpu))
> + if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
>   return cpu;
>  
>   rcu_read_lock();
>   for_each_domain(cpu, sd) {
>   for_each_cpu(i, sched_domain_span(sd)) {
> - if (!idle_cpu(i)) {
> + if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
>   cpu = i;
>   goto unlock;
>   }
>   }
>   }
> +
> + if (!is_housekeeping_cpu(cpu))
> + cpu = housekeeping_any_cpu();
>  unlock:
>   rcu_read_unlock();
>   return cpu;

So I almost applied this yesterday, but had the following question: what 
ensures 
that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu() 
returns cpumask_any_and() of an empty cpumask - which returns an out of range 
index AFAICS - which will crash and burn in:

kernel/time/hrtimer.c:  return _cpu(hrtimer_bases, get_nohz_timer_target());
kernel/time/timer.c:return per_cpu_ptr(_bases, 
get_nohz_timer_target());

housekeeping_mask itself is derived from tick_nohz_full_mask (it's the inverse 
of 
it in essence), and tick_nohz_full_mask is set via two methods, either via a 
boot 
parameter:

if (cpulist_parse(str, tick_nohz_full_mask) < 0) {

in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not 
completely full - making housekeeping_mask empty?

The other method is via CONFIG_NO_HZ_FULL_ALL:

cpumask_setall(tick_nohz_full_mask);

here it's fully set - triggering the bug I'm worried about. So what am I 
missing, 
what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ping: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-22 Thread Christoph Lameter
On Sat, 22 Aug 2015, Frederic Weisbecker wrote:

> Ping!

Exellent!!!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Ping: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-22 Thread Frederic Weisbecker

Ping!

On Fri, Aug 14, 2015 at 03:46:14AM +0200, Frederic Weisbecker wrote:
> From: Vatika Harlalka 
> 
> The problem addressed in this patch is about affining unpinned timers.
> Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
> jitter due to firing of such timers on them.
> 
> This patch will affine timers to online CPUs which are not full dynticks
> in NOHZ_FULL configured systems. It should not introduce overhead in
> nohz full off case due to static keys.
> 
> Reviewed-by: Preeti U Murthy 
> Signed-off by: Vatika Harlalka 
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
> Cc: Christoph Lameter 
> Cc: Thomas Gleixner 
> Cc: Paul E. McKenney 
> Cc: Chris Metcalf 
> Signed-off-by: Frederic Weisbecker 
> ---
>  include/linux/tick.h | 9 -
>  kernel/sched/core.c  | 7 +--
>  2 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 3741ba1..51e6493 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct 
> cpumask *mask)
>   if (tick_nohz_full_enabled())
>   cpumask_or(mask, mask, tick_nohz_full_mask);
>  }
> -
> +static inline int housekeeping_any_cpu(void)
> +{
> + return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> +}
>  extern void __tick_nohz_full_check(void);
>  extern void tick_nohz_full_kick(void);
>  extern void tick_nohz_full_kick_cpu(int cpu);
>  extern void tick_nohz_full_kick_all(void);
>  extern void __tick_nohz_task_switch(struct task_struct *tsk);
>  #else
> +static inline int housekeeping_any_cpu(void)
> +{
> + return smp_processor_id();
> +}
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
>  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9917c96..4fd42e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
>   int i, cpu = smp_processor_id();
>   struct sched_domain *sd;
>  
> - if (!idle_cpu(cpu))
> + if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
>   return cpu;
>  
>   rcu_read_lock();
>   for_each_domain(cpu, sd) {
>   for_each_cpu(i, sched_domain_span(sd)) {
> - if (!idle_cpu(i)) {
> + if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
>   cpu = i;
>   goto unlock;
>   }
>   }
>   }
> +
> + if (!is_housekeeping_cpu(cpu))
> + cpu = housekeeping_any_cpu();
>  unlock:
>   rcu_read_unlock();
>   return cpu;
> -- 
> 2.1.4
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-22 Thread Ingo Molnar

* Frederic Weisbecker fweis...@gmail.com wrote:

 From: Vatika Harlalka vatikaharla...@gmail.com
 
 The problem addressed in this patch is about affining unpinned timers.
 Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
 jitter due to firing of such timers on them.
 
 This patch will affine timers to online CPUs which are not full dynticks
 in NOHZ_FULL configured systems. It should not introduce overhead in
 nohz full off case due to static keys.
 
 Reviewed-by: Preeti U Murthy pre...@linux.vnet.ibm.com
 Signed-off by: Vatika Harlalka vatikaharla...@gmail.com
 Cc: Ingo Molnar mi...@kernel.org
 Cc: Peter Zijlstra pet...@infradead.org
 Cc: Christoph Lameter c...@linux.com
 Cc: Thomas Gleixner t...@linutronix.de
 Cc: Paul E. McKenney paul...@linux.vnet.ibm.com
 Cc: Chris Metcalf cmetc...@ezchip.com
 Signed-off-by: Frederic Weisbecker fweis...@gmail.com
 ---
  include/linux/tick.h | 9 -
  kernel/sched/core.c  | 7 +--
  2 files changed, 13 insertions(+), 3 deletions(-)
 
 diff --git a/include/linux/tick.h b/include/linux/tick.h
 index 3741ba1..51e6493 100644
 --- a/include/linux/tick.h
 +++ b/include/linux/tick.h
 @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct 
 cpumask *mask)
   if (tick_nohz_full_enabled())
   cpumask_or(mask, mask, tick_nohz_full_mask);
  }
 -
 +static inline int housekeeping_any_cpu(void)
 +{
 + return cpumask_any_and(housekeeping_mask, cpu_online_mask);
 +}
  extern void __tick_nohz_full_check(void);
  extern void tick_nohz_full_kick(void);
  extern void tick_nohz_full_kick_cpu(int cpu);
  extern void tick_nohz_full_kick_all(void);
  extern void __tick_nohz_task_switch(struct task_struct *tsk);
  #else
 +static inline int housekeeping_any_cpu(void)
 +{
 + return smp_processor_id();
 +}
  static inline bool tick_nohz_full_enabled(void) { return false; }
  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
 diff --git a/kernel/sched/core.c b/kernel/sched/core.c
 index 9917c96..4fd42e4 100644
 --- a/kernel/sched/core.c
 +++ b/kernel/sched/core.c
 @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
   int i, cpu = smp_processor_id();
   struct sched_domain *sd;
  
 - if (!idle_cpu(cpu))
 + if (!idle_cpu(cpu)  is_housekeeping_cpu(cpu))
   return cpu;
  
   rcu_read_lock();
   for_each_domain(cpu, sd) {
   for_each_cpu(i, sched_domain_span(sd)) {
 - if (!idle_cpu(i)) {
 + if (!idle_cpu(i)  is_housekeeping_cpu(cpu)) {
   cpu = i;
   goto unlock;
   }
   }
   }
 +
 + if (!is_housekeeping_cpu(cpu))
 + cpu = housekeeping_any_cpu();
  unlock:
   rcu_read_unlock();
   return cpu;

So I almost applied this yesterday, but had the following question: what 
ensures 
that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu() 
returns cpumask_any_and() of an empty cpumask - which returns an out of range 
index AFAICS - which will crash and burn in:

kernel/time/hrtimer.c:  return per_cpu(hrtimer_bases, get_nohz_timer_target());
kernel/time/timer.c:return per_cpu_ptr(tvec_bases, 
get_nohz_timer_target());

housekeeping_mask itself is derived from tick_nohz_full_mask (it's the inverse 
of 
it in essence), and tick_nohz_full_mask is set via two methods, either via a 
boot 
parameter:

if (cpulist_parse(str, tick_nohz_full_mask)  0) {

in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not 
completely full - making housekeeping_mask empty?

The other method is via CONFIG_NO_HZ_FULL_ALL:

cpumask_setall(tick_nohz_full_mask);

here it's fully set - triggering the bug I'm worried about. So what am I 
missing, 
what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ping: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-22 Thread Christoph Lameter
On Sat, 22 Aug 2015, Frederic Weisbecker wrote:

 Ping!

Exellent!!!
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Ping: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-22 Thread Frederic Weisbecker

Ping!

On Fri, Aug 14, 2015 at 03:46:14AM +0200, Frederic Weisbecker wrote:
 From: Vatika Harlalka vatikaharla...@gmail.com
 
 The problem addressed in this patch is about affining unpinned timers.
 Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
 jitter due to firing of such timers on them.
 
 This patch will affine timers to online CPUs which are not full dynticks
 in NOHZ_FULL configured systems. It should not introduce overhead in
 nohz full off case due to static keys.
 
 Reviewed-by: Preeti U Murthy pre...@linux.vnet.ibm.com
 Signed-off by: Vatika Harlalka vatikaharla...@gmail.com
 Cc: Ingo Molnar mi...@kernel.org
 Cc: Peter Zijlstra pet...@infradead.org
 Cc: Christoph Lameter c...@linux.com
 Cc: Thomas Gleixner t...@linutronix.de
 Cc: Paul E. McKenney paul...@linux.vnet.ibm.com
 Cc: Chris Metcalf cmetc...@ezchip.com
 Signed-off-by: Frederic Weisbecker fweis...@gmail.com
 ---
  include/linux/tick.h | 9 -
  kernel/sched/core.c  | 7 +--
  2 files changed, 13 insertions(+), 3 deletions(-)
 
 diff --git a/include/linux/tick.h b/include/linux/tick.h
 index 3741ba1..51e6493 100644
 --- a/include/linux/tick.h
 +++ b/include/linux/tick.h
 @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct 
 cpumask *mask)
   if (tick_nohz_full_enabled())
   cpumask_or(mask, mask, tick_nohz_full_mask);
  }
 -
 +static inline int housekeeping_any_cpu(void)
 +{
 + return cpumask_any_and(housekeeping_mask, cpu_online_mask);
 +}
  extern void __tick_nohz_full_check(void);
  extern void tick_nohz_full_kick(void);
  extern void tick_nohz_full_kick_cpu(int cpu);
  extern void tick_nohz_full_kick_all(void);
  extern void __tick_nohz_task_switch(struct task_struct *tsk);
  #else
 +static inline int housekeeping_any_cpu(void)
 +{
 + return smp_processor_id();
 +}
  static inline bool tick_nohz_full_enabled(void) { return false; }
  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
 diff --git a/kernel/sched/core.c b/kernel/sched/core.c
 index 9917c96..4fd42e4 100644
 --- a/kernel/sched/core.c
 +++ b/kernel/sched/core.c
 @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
   int i, cpu = smp_processor_id();
   struct sched_domain *sd;
  
 - if (!idle_cpu(cpu))
 + if (!idle_cpu(cpu)  is_housekeeping_cpu(cpu))
   return cpu;
  
   rcu_read_lock();
   for_each_domain(cpu, sd) {
   for_each_cpu(i, sched_domain_span(sd)) {
 - if (!idle_cpu(i)) {
 + if (!idle_cpu(i)  is_housekeeping_cpu(cpu)) {
   cpu = i;
   goto unlock;
   }
   }
   }
 +
 + if (!is_housekeeping_cpu(cpu))
 + cpu = housekeeping_any_cpu();
  unlock:
   rcu_read_unlock();
   return cpu;
 -- 
 2.1.4
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-13 Thread Frederic Weisbecker
From: Vatika Harlalka 

The problem addressed in this patch is about affining unpinned timers.
Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
jitter due to firing of such timers on them.

This patch will affine timers to online CPUs which are not full dynticks
in NOHZ_FULL configured systems. It should not introduce overhead in
nohz full off case due to static keys.

Reviewed-by: Preeti U Murthy 
Signed-off by: Vatika Harlalka 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Christoph Lameter 
Cc: Thomas Gleixner 
Cc: Paul E. McKenney 
Cc: Chris Metcalf 
Signed-off-by: Frederic Weisbecker 
---
 include/linux/tick.h | 9 -
 kernel/sched/core.c  | 7 +--
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 3741ba1..51e6493 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct 
cpumask *mask)
if (tick_nohz_full_enabled())
cpumask_or(mask, mask, tick_nohz_full_mask);
 }
-
+static inline int housekeeping_any_cpu(void)
+{
+   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+}
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
 extern void tick_nohz_full_kick_all(void);
 extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
+static inline int housekeeping_any_cpu(void)
+{
+   return smp_processor_id();
+}
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
 static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9917c96..4fd42e4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
int i, cpu = smp_processor_id();
struct sched_domain *sd;
 
-   if (!idle_cpu(cpu))
+   if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
return cpu;
 
rcu_read_lock();
for_each_domain(cpu, sd) {
for_each_cpu(i, sched_domain_span(sd)) {
-   if (!idle_cpu(i)) {
+   if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
cpu = i;
goto unlock;
}
}
}
+
+   if (!is_housekeeping_cpu(cpu))
+   cpu = housekeeping_any_cpu();
 unlock:
rcu_read_unlock();
return cpu;
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers

2015-08-13 Thread Frederic Weisbecker
From: Vatika Harlalka vatikaharla...@gmail.com

The problem addressed in this patch is about affining unpinned timers.
Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
jitter due to firing of such timers on them.

This patch will affine timers to online CPUs which are not full dynticks
in NOHZ_FULL configured systems. It should not introduce overhead in
nohz full off case due to static keys.

Reviewed-by: Preeti U Murthy pre...@linux.vnet.ibm.com
Signed-off by: Vatika Harlalka vatikaharla...@gmail.com
Cc: Ingo Molnar mi...@kernel.org
Cc: Peter Zijlstra pet...@infradead.org
Cc: Christoph Lameter c...@linux.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Paul E. McKenney paul...@linux.vnet.ibm.com
Cc: Chris Metcalf cmetc...@ezchip.com
Signed-off-by: Frederic Weisbecker fweis...@gmail.com
---
 include/linux/tick.h | 9 -
 kernel/sched/core.c  | 7 +--
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 3741ba1..51e6493 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct 
cpumask *mask)
if (tick_nohz_full_enabled())
cpumask_or(mask, mask, tick_nohz_full_mask);
 }
-
+static inline int housekeeping_any_cpu(void)
+{
+   return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+}
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
 extern void tick_nohz_full_kick_all(void);
 extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
+static inline int housekeeping_any_cpu(void)
+{
+   return smp_processor_id();
+}
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
 static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9917c96..4fd42e4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
int i, cpu = smp_processor_id();
struct sched_domain *sd;
 
-   if (!idle_cpu(cpu))
+   if (!idle_cpu(cpu)  is_housekeeping_cpu(cpu))
return cpu;
 
rcu_read_lock();
for_each_domain(cpu, sd) {
for_each_cpu(i, sched_domain_span(sd)) {
-   if (!idle_cpu(i)) {
+   if (!idle_cpu(i)  is_housekeeping_cpu(cpu)) {
cpu = i;
goto unlock;
}
}
}
+
+   if (!is_housekeeping_cpu(cpu))
+   cpu = housekeeping_any_cpu();
 unlock:
rcu_read_unlock();
return cpu;
-- 
2.1.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/