Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Fri, Aug 28, 2015 at 10:32:59AM +0200, Ingo Molnar wrote:
>
> * Frederic Weisbecker wrote:
>
> > On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
> > >
> > > * Frederic Weisbecker wrote:
> > >
> > > > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > > > > 2)
> > > > >
> > > > > What happens if the boot CPU is offlined? (under
> > > > > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > > > >
> > > > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask
> > > > > if the boot CPU is offlined.
> > > >
> > > > We have tick_nohz_cpu_down_callback() which makes sure that the
> > > > timekeeper, which is the boot CPU in nohz full, never gets offlined.
> > >
> > > That solution really sucks - it essentially regresses a feature the
> > > user explicitly asked for! I also see no way for the user to migrate
> > > the timekeeping functionality over to another CPU without rebooting.
> > >
> > > If this is the last timekeeping CPU then it should migrate the
> > > timekeeping functionality to another CPU, and perhaps printk a warning
> > > if all other CPUs are nohz-full and we have to mark one of them as the
> > > timekeeper.
> > >
> > > Also, the nohz-full and timekeeper functionality should not be a boot
> > > parameter only thing, but should be runtime configurable.
> >
> > When I tried to allow moving the timekeeping duty over all housekeeping
> > CPUs, Thomas got angry because it broke the KISS current nohz full code.
> > Indeed, there must be at least one running all the time on behalf of
> > nohz full CPUs that can run anytime. Thus balancing the timekeeping duty
> > over housekeepers is a bit more complicated than in normal
> > configurations.
> >
> > Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a
> > housekeeper if any remains or to a nohz full one. Then we must make sure
> > the new timekeeper never goes to idle.
> >
> > But nohz_full is a corner usecase and I'm not sure it's worth the
> > complexity. If a nohz full user came and complained about CPU0 hotplug
> > not working, I would definitely retry it but I haven't heard about that
> > yet. Besides, hotplug is very isolation-unfriendly in general due to
> > stop machine.
>
> Ok, I guess we can live with this.

Now this will likely evolve in the future; I can easily imagine that
timekeeping becomes balanced among housekeepers when we have one per node.
It's not yet the priority but we may come to that one day.

> Mind sending an updated series with all patches?

Sure, I'm cooking that.

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
* Frederic Weisbecker wrote:

> On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
> >
> > * Frederic Weisbecker wrote:
> >
> > > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > > > 2)
> > > >
> > > > What happens if the boot CPU is offlined? (under
> > > > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > > >
> > > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if
> > > > the boot CPU is offlined.
> > >
> > > We have tick_nohz_cpu_down_callback() which makes sure that the
> > > timekeeper, which is the boot CPU in nohz full, never gets offlined.
> >
> > That solution really sucks - it essentially regresses a feature the user
> > explicitly asked for! I also see no way for the user to migrate the
> > timekeeping functionality over to another CPU without rebooting.
> >
> > If this is the last timekeeping CPU then it should migrate the
> > timekeeping functionality to another CPU, and perhaps printk a warning
> > if all other CPUs are nohz-full and we have to mark one of them as the
> > timekeeper.
> >
> > Also, the nohz-full and timekeeper functionality should not be a boot
> > parameter only thing, but should be runtime configurable.
>
> When I tried to allow moving the timekeeping duty over all housekeeping
> CPUs, Thomas got angry because it broke the KISS current nohz full code.
> Indeed, there must be at least one running all the time on behalf of nohz
> full CPUs that can run anytime. Thus balancing the timekeeping duty over
> housekeepers is a bit more complicated than in normal configurations.
>
> Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a
> housekeeper if any remains or to a nohz full one. Then we must make sure
> the new timekeeper never goes to idle.
>
> But nohz_full is a corner usecase and I'm not sure it's worth the
> complexity. If a nohz full user came and complained about CPU0 hotplug not
> working, I would definitely retry it but I haven't heard about that yet.
> Besides, hotplug is very isolation-unfriendly in general due to stop
> machine.

Ok, I guess we can live with this.

Mind sending an updated series with all patches?

Thanks,

        Ingo
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Tue, Aug 25, 2015 at 10:29:04AM +0200, Ingo Molnar wrote:
>
> * Frederic Weisbecker wrote:
>
> > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > > 2)
> > >
> > > What happens if the boot CPU is offlined? (under
> > > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > >
> > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if
> > > the boot CPU is offlined.
> >
> > We have tick_nohz_cpu_down_callback() which makes sure that the
> > timekeeper, which is the boot CPU in nohz full, never gets offlined.
>
> That solution really sucks - it essentially regresses a feature the user
> explicitly asked for! I also see no way for the user to migrate the
> timekeeping functionality over to another CPU without rebooting.
>
> If this is the last timekeeping CPU then it should migrate the timekeeping
> functionality to another CPU, and perhaps printk a warning if all other
> CPUs are nohz-full and we have to mark one of them as the timekeeper.
>
> Also, the nohz-full and timekeeper functionality should not be a boot
> parameter only thing, but should be runtime configurable.

When I tried to allow moving the timekeeping duty over all housekeeping
CPUs, Thomas got angry because it broke the KISS current nohz full code.
Indeed, there must be at least one running all the time on behalf of nohz
full CPUs that can run anytime. Thus balancing the timekeeping duty over
housekeepers is a bit more complicated than in normal configurations.

Now surely we can do that using an IPI from CPU_DOWN_PREPARE to a
housekeeper if any remains or to a nohz full one. Then we must make sure the
new timekeeper never goes to idle.

But nohz_full is a corner usecase and I'm not sure it's worth the
complexity. If a nohz full user came and complained about CPU0 hotplug not
working, I would definitely retry it but I haven't heard about that yet.
Besides, hotplug is very isolation-unfriendly in general due to stop machine.
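The handoff policy Frederic outlines (prefer a remaining housekeeper, fall back to a nohz full CPU) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code and not the patch under discussion: `mask_t`, `struct handoff`, `first_cpu()` and `pick_timekeeper()` are hypothetical names, CPU masks are modelled as 64-bit integers, and the IPI and the "never goes to idle" requirement are not modelled at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t mask_t;

/* Result of choosing a replacement timekeeper (hypothetical type). */
struct handoff {
    int  new_cpu;         /* chosen CPU, or -1 if no other CPU is online */
    bool took_nohz_full;  /* true if a nohz full CPU had to be drafted */
};

/* Lowest set bit of the mask as a CPU number, or -1 if the mask is empty. */
static int first_cpu(mask_t m)
{
    for (int cpu = 0; cpu < 64; cpu++)
        if (m & ((mask_t)1 << cpu))
            return cpu;
    return -1;
}

/*
 * Pick the CPU that should take over timekeeping when dying_cpu (0..63)
 * goes offline: a remaining online housekeeper if there is one, otherwise
 * any remaining online CPU, which is then necessarily a nohz full one.
 */
static struct handoff pick_timekeeper(mask_t online, mask_t housekeeping,
                                      int dying_cpu)
{
    mask_t candidates = online & ~((mask_t)1 << dying_cpu);
    struct handoff h = { -1, false };
    int cpu = first_cpu(candidates & housekeeping);

    if (cpu >= 0) {
        h.new_cpu = cpu;
        return h;
    }

    cpu = first_cpu(candidates);
    if (cpu >= 0) {
        h.new_cpu = cpu;
        h.took_nohz_full = true;  /* caller would warn, as Ingo suggests */
    }
    return h;
}
```

With CPUs 0-3 online and CPUs 0-1 as housekeepers, offlining CPU 0 selects CPU 1. With CPU 0 as the only housekeeper, the model falls back to CPU 1 and sets `took_nohz_full`, which is the case where a printk warning was proposed.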
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
* Frederic Weisbecker wrote:

> On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > 2)
> >
> > What happens if the boot CPU is offlined? (under
> > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> >
> > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the
> > boot CPU is offlined.
>
> We have tick_nohz_cpu_down_callback() which makes sure that the
> timekeeper, which is the boot CPU in nohz full, never gets offlined.

That solution really sucks - it essentially regresses a feature the user
explicitly asked for! I also see no way for the user to migrate the
timekeeping functionality over to another CPU without rebooting.

If this is the last timekeeping CPU then it should migrate the timekeeping
functionality to another CPU, and perhaps printk a warning if all other CPUs
are nohz-full and we have to mark one of them as the timekeeper.

Also, the nohz-full and timekeeper functionality should not be a boot
parameter only thing, but should be runtime configurable.

Thanks,

        Ingo
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, Aug 24, 2015 at 04:04:37PM +0200, Frederic Weisbecker wrote:
> On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote:
> > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > >
> > > * Paul E. McKenney wrote:
> > >
> > > > > here it's fully set - triggering the bug I'm worried about. So
> > > > > what am I missing, what prevents CONFIG_NO_HZ_FULL_ALL from
> > > > > crashing?
> > > >
> > > > The boot CPU is excluded from tick_nohz_full_mask in
> > > > tick_nohz_init(), which is called from tick_init() which is called
> > > > from start_kernel() shortly after rcu_init():
> > > >
> > > >     cpu = smp_processor_id();
> > > >
> > > >     if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> > > >             pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
> > > >             cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> > > >     }
> > > >
> > > > This happens after the call to tick_nohz_init_all() that does the
> > > > cpumask_setall() that you called out above.
> > >
> > > Ah, indeed - I somehow missed that.
> > >
> > > This brings up two other questions:
> > >
> > > 1)
> > >
> > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a
> > > full mask to it (housekeeping_mask - a variable mask to begin with)
> > > and recover the housekeeping CPU via:
> > >
> > > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > >
> > > which can be pretty expensive, and which gets executed in two
> > > hotpaths:
> > >
> > >   kernel/time/hrtimer.c: return &per_cpu(hrtimer_bases, get_nohz_timer_target());
> > >   kernel/time/timer.c:   return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
> > >
> > > ... why not just use a single housekeeping_cpu which would be way
> > > faster to pass down to the timer code?
> >
> > The housekeeping_cpu came later, but that does seem like a good
> > optimization.
>
> Well nohz full is likely to be used for HPC and that can involve big
> machines. Having the housekeeping duty spread per node is a likely future
> evolution there, if it isn't already used that way.
>
> So we need to keep it a cpumask.

Fair point!

                                                        Thanx, Paul

> > > 2)
> > >
> > > What happens if the boot CPU is offlined? (under
> > > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > >
> > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if
> > > the boot CPU is offlined.
> >
> > The tick_nohz_cpu_down_callback() function does this, though in a less
> > than obvious way. The tick_do_timer_cpu variable is the housekeeping
> > CPU that is currently handling timing, and it is not permitted to go
> > offline.
>
> Indeed, more specifically tick-common.c makes sure to set the timekeeping
> duty to a housekeeper and that housekeeper is always the boot CPU due to
> early device initialization.
>
> But I should find a way to simplify that code and make it obvious it's
> always set to the boot CPU.
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote:
> On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> >
> > * Paul E. McKenney wrote:
> >
> > > > here it's fully set - triggering the bug I'm worried about. So what
> > > > am I missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> > >
> > > The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(),
> > > which is called from tick_init() which is called from start_kernel()
> > > shortly after rcu_init():
> > >
> > >     cpu = smp_processor_id();
> > >
> > >     if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> > >             pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
> > >             cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> > >     }
> > >
> > > This happens after the call to tick_nohz_init_all() that does the
> > > cpumask_setall() that you called out above.
> >
> > Ah, indeed - I somehow missed that.
> >
> > This brings up two other questions:
> >
> > 1)
> >
> > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a
> > full mask to it (housekeeping_mask - a variable mask to begin with) and
> > recover the housekeeping CPU via:
> >
> > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> >
> > which can be pretty expensive, and which gets executed in two hotpaths:
> >
> >   kernel/time/hrtimer.c: return &per_cpu(hrtimer_bases, get_nohz_timer_target());
> >   kernel/time/timer.c:   return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
> >
> > ... why not just use a single housekeeping_cpu which would be way faster
> > to pass down to the timer code?
>
> The housekeeping_cpu came later, but that does seem like a good
> optimization.

Well nohz full is likely to be used for HPC and that can involve big
machines. Having the housekeeping duty spread per node is a likely future
evolution there, if it isn't already used that way.

So we need to keep it a cpumask.

> > 2)
> >
> > What happens if the boot CPU is offlined? (under
> > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> >
> > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the
> > boot CPU is offlined.
>
> The tick_nohz_cpu_down_callback() function does this, though in a less
> than obvious way. The tick_do_timer_cpu variable is the housekeeping
> CPU that is currently handling timing, and it is not permitted to go
> offline.

Indeed, more specifically tick-common.c makes sure to set the timekeeping
duty to a housekeeper and that housekeeper is always the boot CPU due to
early device initialization.

But I should find a way to simplify that code and make it obvious it's
always set to the boot CPU.
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, 2015-08-24 at 15:36 +0200, Frederic Weisbecker wrote:
> On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > 2)
> >
> > What happens if the boot CPU is offlined? (under
> > CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> >
> > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the
> > boot CPU is offlined.
>
> We have tick_nohz_cpu_down_callback() which makes sure that the
> timekeeper, which is the boot CPU in nohz full, never gets offlined.

Aha.. and it works fine too. (CPU0 told me to go away earlier;)

        -Mike
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney wrote:
>
> > > here it's fully set - triggering the bug I'm worried about. So what am
> > > I missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> >
> > The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(),
> > which is called from tick_init() which is called from start_kernel()
> > shortly after rcu_init():
> >
> >     cpu = smp_processor_id();
> >
> >     if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> >             pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
> >             cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> >     }
> >
> > This happens after the call to tick_nohz_init_all() that does the
> > cpumask_setall() that you called out above.
>
> Ah, indeed - I somehow missed that.
>
> This brings up two other questions:
>
> 1)
>
> the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full
> mask to it (housekeeping_mask - a variable mask to begin with) and recover
> the housekeeping CPU via:
>
> +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
>
> which can be pretty expensive, and which gets executed in two hotpaths:
>
>   kernel/time/hrtimer.c: return &per_cpu(hrtimer_bases, get_nohz_timer_target());
>   kernel/time/timer.c:   return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
>
> ... why not just use a single housekeeping_cpu which would be way faster
> to pass down to the timer code?

The housekeeping_cpu came later, but that does seem like a good optimization.

> 2)
>
> What happens if the boot CPU is offlined? (under
> CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
>
> I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the
> boot CPU is offlined.

The tick_nohz_cpu_down_callback() function does this, though in a less
than obvious way. The tick_do_timer_cpu variable is the housekeeping
CPU that is currently handling timing, and it is not permitted to go
offline.

                                                        Thanx, Paul
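The rule Paul describes, that the CPU currently handling timing may not go offline, amounts to a veto in the hotplug "down prepare" step. The following is a minimal user-space model of that check only. It is a sketch under assumptions, not the kernel's tick_nohz_cpu_down_callback(): `struct tick_state` and `cpu_down_allowed()` are hypothetical names, and the notifier plumbing is left out.

```c
#include <stdbool.h>

/*
 * Hypothetical model state. In the thread, the timekeeper role is tracked
 * by the tick_do_timer_cpu variable.
 */
struct tick_state {
    bool nohz_full_running;  /* is any CPU running in nohz full mode? */
    int  timekeeper_cpu;     /* CPU currently handling timekeeping */
};

/*
 * Return true if taking 'cpu' offline may proceed, false to veto it.
 * The timekeeper is only pinned while nohz full is in use, because only
 * then do other CPUs depend on it to keep time on their behalf.
 */
static bool cpu_down_allowed(const struct tick_state *ts, int cpu)
{
    if (ts->nohz_full_running && ts->timekeeper_cpu == cpu)
        return false;
    return true;
}
```

In this model a request to offline the timekeeper is simply refused, which matches the behaviour Ingo objects to later in the thread: the user's hotplug request fails rather than the duty being migrated.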
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> 2)
>
> What happens if the boot CPU is offlined? (under
> CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
>
> I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the
> boot CPU is offlined.

We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper,
which is the boot CPU in nohz full, never gets offlined.

>
> Thanks,
>
>       Ingo
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
* Mike Galbraith wrote:

> On Mon, 2015-08-24 at 09:41 +0200, Ingo Molnar wrote:
> > * Mike Galbraith wrote:
> >
> > > On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
> > >
> > > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate
> > > > a full mask to it (housekeeping_mask - a variable mask to begin
> > > > with) and recover the housekeeping CPU via:
> > > >
> > > > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > >
> > > There can be, and had better be if box is big, multiple housekeepers.
> >
> > Yes - but that does not seem to be possible via the code right now
> > AFAICS, so at minimum it's incomplete.
>
> In master housekeepers are set up in tick_nohz_init(). Everybody who's not
> a nohz_full CPU is a housekeeper.

Ah, indeed, I missed the nohz_full= boot option:

  static int __init tick_nohz_full_setup(char *str)
  {
          alloc_bootmem_cpumask_var(&tick_nohz_full_mask);
          if (cpulist_parse(str, tick_nohz_full_mask) < 0) {

Ok, first question is resolved - but the second one, behavior on boot CPU
unplug still holds.

Thanks,

        Ingo
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, 2015-08-24 at 09:41 +0200, Ingo Molnar wrote:
> * Mike Galbraith wrote:
>
> > On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
> >
> > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a
> > > full mask to it (housekeeping_mask - a variable mask to begin with)
> > > and recover the housekeeping CPU via:
> > >
> > > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> >
> > There can be, and had better be if box is big, multiple housekeepers.
>
> Yes - but that does not seem to be possible via the code right now AFAICS,
> so at minimum it's incomplete.

In master housekeepers are set up in tick_nohz_init(). Everybody who's not a
nohz_full CPU is a housekeeper.

        -Mike
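Mike's rule ("everybody who's not a nohz_full CPU is a housekeeper"), combined with the boot-CPU exclusion Paul quotes elsewhere in the thread, can be written down as two mask operations. The following is a user-space sketch under assumptions, not the tick_nohz_init() code: masks are plain 64-bit integers, and `mask_t`, `clear_boot_cpu()` and `housekeeping_from()` are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t mask_t;

/*
 * Drop the boot CPU (0..63) from the nohz full set so that it stays
 * available for timekeeping, as the quoted cpumask_clear_cpu() call does.
 */
static mask_t clear_boot_cpu(mask_t nohz_full, int boot_cpu)
{
    return nohz_full & ~((mask_t)1 << boot_cpu);
}

/* Every possible CPU that is not nohz full is a housekeeper. */
static mask_t housekeeping_from(mask_t possible, mask_t nohz_full)
{
    return possible & ~nohz_full;
}
```

On an 8-CPU model with every CPU requested as nohz full (the NO_HZ_FULL_ALL case), clearing boot CPU 0 leaves CPUs 1-7 as nohz full and CPU 0 as the sole housekeeper. With only CPUs 1-3 requested, CPUs 0 and 4-7 are all housekeepers, which is why a mask rather than a single CPU number is needed.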
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
* Mike Galbraith wrote:

> On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
>
> > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a
> > full mask to it (housekeeping_mask - a variable mask to begin with) and
> > recover the housekeeping CPU via:
> >
> > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
>
> There can be, and had better be if box is big, multiple housekeepers.

Yes - but that does not seem to be possible via the code right now AFAICS,
so at minimum it's incomplete.

Thanks,

        Ingo
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:

> the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full
> mask to it (housekeeping_mask - a variable mask to begin with) and recover
> the housekeeping CPU via:
>
> +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);

There can be, and had better be if box is big, multiple housekeepers.
Imagine a NO_HZ_FULL_ALL kernel on an SGI beast from hell. Offloading 8191
CPUs onto poor little CPU0 probably wouldn't work out well :)

        -Mike
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
* Paul E. McKenney wrote:

> > here it's fully set - triggering the bug I'm worried about. So what am I
> > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
>
> The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(),
> which is called from tick_init() which is called from start_kernel()
> shortly after rcu_init():
>
>     cpu = smp_processor_id();
>
>     if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
>             pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
>             cpumask_clear_cpu(cpu, tick_nohz_full_mask);
>     }
>
> This happens after the call to tick_nohz_init_all() that does the
> cpumask_setall() that you called out above.

Ah, indeed - I somehow missed that.

This brings up two other questions:

1)

the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full
mask to it (housekeeping_mask - a variable mask to begin with) and recover
the housekeeping CPU via:

+       return cpumask_any_and(housekeeping_mask, cpu_online_mask);

which can be pretty expensive, and which gets executed in two hotpaths:

  kernel/time/hrtimer.c: return &per_cpu(hrtimer_bases, get_nohz_timer_target());
  kernel/time/timer.c:   return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());

... why not just use a single housekeeping_cpu which would be way faster to
pass down to the timer code?

2)

What happens if the boot CPU is offlined? (under
CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)

I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the
boot CPU is offlined.

Thanks,

        Ingo
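The trade-off raised in question 1) can be seen in a small user-space model of the mask-based lookup. This is a sketch under assumptions rather than the kernel's cpumask_any_and(): masks are 64-bit integers, and `mask_t`, `mask_any_and()` and `timer_target_from_mask()` are hypothetical names. The loop is what makes the lookup more costly than reading one cached housekeeping_cpu value, while the mask is what lets the lookup skip offline CPUs and cope with several housekeepers.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t mask_t;

/*
 * Return the lowest CPU present in both masks, or -1 if they do not
 * intersect. Cost grows with the number of CPUs scanned.
 */
static int mask_any_and(mask_t a, mask_t b)
{
    mask_t both = a & b;

    for (int cpu = 0; cpu < 64; cpu++)
        if (both & ((mask_t)1 << cpu))
            return cpu;
    return -1;
}

/* Choose a timer target among the housekeepers that are currently online. */
static int timer_target_from_mask(mask_t housekeeping, mask_t online)
{
    return mask_any_and(housekeeping, online);
}
```

If CPUs 0 and 4 are housekeepers and CPU 0 is offline, the lookup returns CPU 4, a case a single fixed housekeeping_cpu could not handle without extra fixup code. If no housekeeper is online the model returns -1; what the kernel should do in that situation is exactly the open question 2).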
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> 2)
>
> What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
>
> I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU
> is offlined.

We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, which
is the boot CPU in nohz full, never gets offlined.

> Thanks,
>
>         Ingo
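The veto described in this reply can be sketched in user space. This is not the kernel's tick_nohz_cpu_down_callback(); every sketch_ name and both return codes are hypothetical stand-ins, and the only behavior taken from the thread is that the CPU owning timekeeping is refused when it asks to go offline.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for hotplug notifier verdicts (assumption). */
enum { SKETCH_NOTIFY_OK = 0, SKETCH_NOTIFY_BAD = 1 };

/* Stand-in for tick_do_timer_cpu: the CPU that currently keeps time. */
static int sketch_timekeeper_cpu = 0;

/* Stand-in for "nohz full is in use on this system". */
static bool sketch_nohz_full_running = true;

/*
 * Called before a CPU is taken down. While nohz full is running, the
 * timekeeper must stay online on behalf of the tickless CPUs, so the
 * request is vetoed for that one CPU and allowed for all others.
 */
static int sketch_cpu_down_prepare(int cpu)
{
    if (sketch_nohz_full_running && cpu == sketch_timekeeper_cpu)
        return SKETCH_NOTIFY_BAD;
    return SKETCH_NOTIFY_OK;
}
```

With the defaults above, asking to offline CPU 0 is refused while any other CPU may go down, which matches the "CPU0 told me to go away" experience reported later in the thread.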
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
>
> > > here it's fully set - triggering the bug I'm worried about. So what am I
> > > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> >
> > The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), which is
> > called from tick_init() which is called from start_kernel() shortly after
> > rcu_init():
> >
> >         cpu = smp_processor_id();
> >
> >         if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> >                 pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
> >                 cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> >         }
> >
> > This happens after the call to tick_nohz_init_all() that does the
> > cpumask_setall() that you called out above.
>
> Ah, indeed - I somehow missed that.
>
> This brings up two other questions:
>
> 1)
>
> the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to
> it (housekeeping_mask - a variable mask to begin with) and recover the
> housekeeping CPU via:
>
> +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
>
> which can be pretty expensive, and which gets executed in two hotpaths:
>
>  kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
>  kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
>
> ... why not just use a single housekeeping_cpu which would be way faster to pass
> down to the timer code?

The housekeeping_cpu came later, but that does seem like a good
optimization.

> 2)
>
> What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
>
> I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU
> is offlined.

The tick_nohz_cpu_down_callback() function does this, though in a less than
obvious way.  The tick_do_timer_cpu variable is the housekeeping CPU that
is currently handling timing, and it is not permitted to go offline.

                                                        Thanx, Paul
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, 2015-08-24 at 15:36 +0200, Frederic Weisbecker wrote:
> On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > 2)
> >
> > What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> >
> > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU
> > is offlined.
>
> We have tick_nohz_cpu_down_callback() which makes sure that the timekeeper, which
> is the boot CPU in nohz full, never gets offlined.

Aha.. and it works fine too.  (CPU0 told me to go away earlier;)

        -Mike
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote:
> On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> >
> > * Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
> >
> > > > here it's fully set - triggering the bug I'm worried about. So what am I
> > > > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> > >
> > > The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), which is
> > > called from tick_init() which is called from start_kernel() shortly after
> > > rcu_init():
> > >
> > >         cpu = smp_processor_id();
> > >
> > >         if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> > >                 pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
> > >                 cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> > >         }
> > >
> > > This happens after the call to tick_nohz_init_all() that does the
> > > cpumask_setall() that you called out above.
> >
> > Ah, indeed - I somehow missed that.
> >
> > This brings up two other questions:
> >
> > 1)
> >
> > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to
> > it (housekeeping_mask - a variable mask to begin with) and recover the
> > housekeeping CPU via:
> >
> > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> >
> > which can be pretty expensive, and which gets executed in two hotpaths:
> >
> >  kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
> >  kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
> >
> > ... why not just use a single housekeeping_cpu which would be way faster to pass
> > down to the timer code?
>
> The housekeeping_cpu came later, but that does seem like a good
> optimization.

Well nohz full is likely to be used for HPC and that can involve big machines.
Having the housekeeping duty spread per node is a likely future evolution there,
if it isn't already used that way. So we need to keep it a cpumask.

> > 2)
> >
> > What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> >
> > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU
> > is offlined.
>
> The tick_nohz_cpu_down_callback() function does this, though in a less than
> obvious way.  The tick_do_timer_cpu variable is the housekeeping CPU that
> is currently handling timing, and it is not permitted to go offline.

Indeed, more specifically tick-common.c makes sure to set the timekeeping duty
to a housekeeper and that housekeeper is always the boot CPU due to early device
initialization.

But I should find a way to simplify that code and make it obvious it's always set
to the boot CPU.
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, Aug 24, 2015 at 04:04:37PM +0200, Frederic Weisbecker wrote:
> On Mon, Aug 24, 2015 at 06:50:18AM -0700, Paul E. McKenney wrote:
> > On Mon, Aug 24, 2015 at 08:44:12AM +0200, Ingo Molnar wrote:
> > >
> > > * Paul E. McKenney paul...@linux.vnet.ibm.com wrote:
> > >
> > > > > here it's fully set - triggering the bug I'm worried about. So what am I
> > > > > missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?
> > > >
> > > > The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(), which is
> > > > called from tick_init() which is called from start_kernel() shortly after
> > > > rcu_init():
> > > >
> > > >         cpu = smp_processor_id();
> > > >
> > > >         if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
> > > >                 pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
> > > >                 cpumask_clear_cpu(cpu, tick_nohz_full_mask);
> > > >         }
> > > >
> > > > This happens after the call to tick_nohz_init_all() that does the
> > > > cpumask_setall() that you called out above.
> > >
> > > Ah, indeed - I somehow missed that.
> > >
> > > This brings up two other questions:
> > >
> > > 1)
> > >
> > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to
> > > it (housekeeping_mask - a variable mask to begin with) and recover the
> > > housekeeping CPU via:
> > >
> > > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > >
> > > which can be pretty expensive, and which gets executed in two hotpaths:
> > >
> > >  kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
> > >  kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
> > >
> > > ... why not just use a single housekeeping_cpu which would be way faster to pass
> > > down to the timer code?
> >
> > The housekeeping_cpu came later, but that does seem like a good
> > optimization.
>
> Well nohz full is likely to be used for HPC and that can involve big machines.
> Having the housekeeping duty spread per node is a likely future evolution there,
> if it isn't already used that way. So we need to keep it a cpumask.

Fair point!

                                                        Thanx, Paul

> > > 2)
> > >
> > > What happens if the boot CPU is offlined? (under CONFIG_BOOTPARAM_HOTPLUG_CPU0=y)
> > >
> > > I don't see CPU hotplug callbacks fixing up the housekeeping_mask if the boot CPU
> > > is offlined.
> >
> > The tick_nohz_cpu_down_callback() function does this, though in a less than
> > obvious way.  The tick_do_timer_cpu variable is the housekeeping CPU that
> > is currently handling timing, and it is not permitted to go offline.
>
> Indeed, more specifically tick-common.c makes sure to set the timekeeping duty
> to a housekeeper and that housekeeper is always the boot CPU due to early device
> initialization.
>
> But I should find a way to simplify that code and make it obvious it's always set
> to the boot CPU.
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
* Mike Galbraith umgwanakikb...@gmail.com wrote:

> On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
>
> > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to
> > it (housekeeping_mask - a variable mask to begin with) and recover the
> > housekeeping CPU via:
> >
> > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
>
> There can be, and had better be if box is big, multiple housekeepers.

Yes - but that does not seem to be possible via the code right now AFAICS, so at
minimum it's incomplete.

Thanks,

        Ingo
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Mon, 2015-08-24 at 09:41 +0200, Ingo Molnar wrote:
> * Mike Galbraith umgwanakikb...@gmail.com wrote:
>
> > On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
> >
> > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to
> > > it (housekeeping_mask - a variable mask to begin with) and recover the
> > > housekeeping CPU via:
> > >
> > > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> >
> > There can be, and had better be if box is big, multiple housekeepers.
>
> Yes - but that does not seem to be possible via the code right now AFAICS, so at
> minimum it's incomplete.

In master housekeepers are set up in tick_nohz_init().  Everybody who's
not a nohz_full CPU is a housekeeper.

        -Mike
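The rule "everybody who's not a nohz_full CPU is a housekeeper" amounts to a mask complement. The following user-space sketch models an 8-CPU machine with a plain 32-bit bitmask; the sketch_ names and the machine size are assumptions for illustration, not the kernel's cpumask API.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 8            /* assumed machine size for the sketch */

typedef uint32_t sketch_mask_t;     /* bit n set means CPU n is in the mask */

/* Mask with one bit per possible CPU. */
static sketch_mask_t sketch_all_cpus(void)
{
    return (1u << SKETCH_NR_CPUS) - 1u;
}

/* Housekeepers are exactly the possible CPUs that are not nohz_full. */
static sketch_mask_t sketch_housekeeping(sketch_mask_t nohz_full)
{
    return ~nohz_full & sketch_all_cpus();
}

/* Number of CPUs in a mask. */
static int sketch_weight(sketch_mask_t mask)
{
    int n = 0;

    while (mask) {
        n += (int)(mask & 1u);
        mask >>= 1;
    }
    return n;
}
```

Under these assumptions, a boot line equivalent to nohz_full=2-7 (mask 0xFC) leaves CPUs 0 and 1 as housekeepers, which is the multi-housekeeper configuration Mike is pointing at.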
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
* Mike Galbraith umgwanakikb...@gmail.com wrote:

> On Mon, 2015-08-24 at 09:41 +0200, Ingo Molnar wrote:
> > * Mike Galbraith umgwanakikb...@gmail.com wrote:
> >
> > > On Mon, 2015-08-24 at 08:44 +0200, Ingo Molnar wrote:
> > >
> > > > the 'housekeeping CPU' is essentially the boot CPU. Yet we dedicate a full mask to
> > > > it (housekeeping_mask - a variable mask to begin with) and recover the
> > > > housekeeping CPU via:
> > > >
> > > > +       return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > >
> > > There can be, and had better be if box is big, multiple housekeepers.
> >
> > Yes - but that does not seem to be possible via the code right now AFAICS, so at
> > minimum it's incomplete.
>
> In master housekeepers are set up in tick_nohz_init().  Everybody who's
> not a nohz_full CPU is a housekeeper.

Ah, indeed, I missed the nohz_full= boot option:

  static int __init tick_nohz_full_setup(char *str)
  {
          alloc_bootmem_cpumask_var(&tick_nohz_full_mask);
          if (cpulist_parse(str, tick_nohz_full_mask) < 0) {

Ok, first question is resolved - but the second one, behavior on boot CPU unplug
still holds.

Thanks,

        Ingo
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Sun, Aug 23, 2015 at 07:40:32AM +0200, Ingo Molnar wrote:
> So I almost applied this yesterday, but had the following question: what ensures
> that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu()
> returns cpumask_any_and() of an empty cpumask - which returns an out of range
> index AFAICS - which will crash and burn in:
>
>  kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
>  kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
>
> housekeeping_mask itself is derived from tick_nohz_full_mask (it's the inverse of
> it in essence), and tick_nohz_full_mask is set via two methods, either via a boot
> parameter:
>
>         if (cpulist_parse(str, tick_nohz_full_mask) < 0) {
>
> in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not
> completely full - making housekeeping_mask empty?
>
> The other method is via CONFIG_NO_HZ_FULL_ALL:
>
>         cpumask_setall(tick_nohz_full_mask);
>
> here it's fully set - triggering the bug I'm worried about. So what am I missing,
> what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

Legitimate worry and I should have explained that in the changelog.

Like Paul replied, we make sure that at least the boot CPU is excluded from
tick_nohz_full_mask in tick_nohz_init(). Then housekeeping_mask, by reverse
effect, contains that boot CPU at least.

And we also make sure that the boot CPU can't get offline
(tick_nohz_cpu_down_callback()).

Now we should really document and check that assumption so here is a second
patch below. The sched patch depends on tip:sched/core (to avoid conflicts with
sched changes) and the following one is based on tip:timer/nohz but should be
applicable to sched/core without conflict. Both are standalone anyway.

Thanks!

---
From: Frederic Weisbecker
Date: Sun, 23 Aug 2015 19:34:31 +0200
Subject: [PATCH] nohz: Assert existing housekeepers when nohz full enabled

The code ensures that at least the boot CPU serves as a housekeeper.
Let's assert this assumption to make sure that we have CPUs to handle
unbound jobs like workqueues and timers while nohz full CPUs run
undisturbed.

Signed-off-by: Frederic Weisbecker
---
 kernel/time/tick-sched.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 3319e16..cc9884f 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -370,6 +370,12 @@ void __init tick_nohz_init(void)
         cpu_notifier(tick_nohz_cpu_down_callback, 0);
         pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
                 cpumask_pr_args(tick_nohz_full_mask));
+
+        /*
+         * We need at least one CPU to handle housekeeping work such
+         * as timekeeping, unbound timers, workqueues, ...
+         */
+        WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
 }
 #endif
 
--
2.1.4
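The invariant this patch asserts can be walked through in user space. The sketch below is an assumption-laden model, not tick_nohz_init() itself: masks are plain 32-bit words, the machine has 8 CPUs, and the sketch_ names are hypothetical. It follows the order described in the thread: fill the nohz_full mask, clear the boot CPU from it, derive the housekeeping mask as the complement, then check that it is not empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 8    /* assumed machine size for the sketch */

struct sketch_nohz_state {
    uint32_t nohz_full;     /* CPUs that run tickless */
    uint32_t housekeeping;  /* CPUs that absorb unbound work */
};

/*
 * Model of the init sequence discussed above. Returns true when at least
 * one housekeeper exists, i.e. when the check modelled on
 * WARN_ON_ONCE(cpumask_empty(housekeeping_mask)) would stay silent.
 */
static bool sketch_nohz_init(struct sketch_nohz_state *st,
                             uint32_t requested, int boot_cpu)
{
    const uint32_t all = (1u << SKETCH_NR_CPUS) - 1u;

    /* Boot parameter or the NO_HZ_FULL_ALL equivalent (requested == all). */
    st->nohz_full = requested & all;

    /* The boot CPU is always removed so that it can keep time. */
    st->nohz_full &= ~(1u << boot_cpu);

    /* Housekeepers are the inverse of the nohz_full set. */
    st->housekeeping = ~st->nohz_full & all;

    return st->housekeeping != 0;
}
```

Even when every CPU is requested as nohz_full, the boot CPU is cleared first, so the derived housekeeping mask in this model always contains at least that one CPU.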
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Sun, Aug 23, 2015 at 09:01:01AM -0700, Paul E. McKenney wrote:
> The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(),
> which is called from tick_init() which is called from start_kernel()
> shortly after rcu_init():
>
>         cpu = smp_processor_id();
>
>         if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
>                 pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
>                 cpumask_clear_cpu(cpu, tick_nohz_full_mask);
>         }
>
> This happens after the call to tick_nohz_init_all() that does the
> cpumask_setall() that you called out above.
>
> Or is a recent patch that I missed changing this?

Exactly, this happens right after tick_nohz_full_mask is filled and makes sure
that at least the boot CPU runs as a housekeeper. We also make sure that it can't
become offline later.

Thanks.
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Sun, Aug 23, 2015 at 07:40:32AM +0200, Ingo Molnar wrote:
>
> * Frederic Weisbecker wrote:
>
> > From: Vatika Harlalka
> >
> > The problem addressed in this patch is about affining unpinned timers.
> > Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
> > jitter due to firing of such timers on them.
> >
> > This patch will affine timers to online CPUs which are not full dynticks
> > in NOHZ_FULL configured systems. It should not introduce overhead in
> > nohz full off case due to static keys.
> >
> > Reviewed-by: Preeti U Murthy
> > Signed-off by: Vatika Harlalka
> > Cc: Ingo Molnar
> > Cc: Peter Zijlstra
> > Cc: Christoph Lameter
> > Cc: Thomas Gleixner
> > Cc: Paul E. McKenney
> > Cc: Chris Metcalf
> > Signed-off-by: Frederic Weisbecker
> > ---
> >  include/linux/tick.h | 9 ++++++++-
> >  kernel/sched/core.c  | 7 +++++--
> >  2 files changed, 13 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/tick.h b/include/linux/tick.h
> > index 3741ba1..51e6493 100644
> > --- a/include/linux/tick.h
> > +++ b/include/linux/tick.h
> > @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
> >          if (tick_nohz_full_enabled())
> >                  cpumask_or(mask, mask, tick_nohz_full_mask);
> >  }
> > -
> > +static inline int housekeeping_any_cpu(void)
> > +{
> > +        return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> > +}
> >  extern void __tick_nohz_full_check(void);
> >  extern void tick_nohz_full_kick(void);
> >  extern void tick_nohz_full_kick_cpu(int cpu);
> >  extern void tick_nohz_full_kick_all(void);
> >  extern void __tick_nohz_task_switch(struct task_struct *tsk);
> >  #else
> > +static inline int housekeeping_any_cpu(void)
> > +{
> > +        return smp_processor_id();
> > +}
> >  static inline bool tick_nohz_full_enabled(void) { return false; }
> >  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
> >  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 9917c96..4fd42e4 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
> >          int i, cpu = smp_processor_id();
> >          struct sched_domain *sd;
> >
> > -        if (!idle_cpu(cpu))
> > +        if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
> >                  return cpu;
> >
> >          rcu_read_lock();
> >          for_each_domain(cpu, sd) {
> >                  for_each_cpu(i, sched_domain_span(sd)) {
> > -                        if (!idle_cpu(i)) {
> > +                        if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
> >                                  cpu = i;
> >                                  goto unlock;
> >                          }
> >                  }
> >          }
> > +
> > +        if (!is_housekeeping_cpu(cpu))
> > +                cpu = housekeeping_any_cpu();
> >  unlock:
> >          rcu_read_unlock();
> >          return cpu;
>
> So I almost applied this yesterday, but had the following question: what ensures
> that housekeeping_mask isn't empty? If it's empty then housekeeping_any_cpu()
> returns cpumask_any_and() of an empty cpumask - which returns an out of range
> index AFAICS - which will crash and burn in:
>
>  kernel/time/hrtimer.c:  return &per_cpu(hrtimer_bases, get_nohz_timer_target());
>  kernel/time/timer.c:    return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
>
> housekeeping_mask itself is derived from tick_nohz_full_mask (it's the inverse of
> it in essence), and tick_nohz_full_mask is set via two methods, either via a boot
> parameter:
>
>         if (cpulist_parse(str, tick_nohz_full_mask) < 0) {
>
> in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is not
> completely full - making housekeeping_mask empty?
>
> The other method is via CONFIG_NO_HZ_FULL_ALL:
>
>         cpumask_setall(tick_nohz_full_mask);
>
> here it's fully set - triggering the bug I'm worried about. So what am I missing,
> what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

The boot CPU is excluded from tick_nohz_full_mask in tick_nohz_init(),
which is called from tick_init() which is called from start_kernel()
shortly after rcu_init():

        cpu = smp_processor_id();

        if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
                pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
                cpumask_clear_cpu(cpu, tick_nohz_full_mask);
        }

This happens after the call to tick_nohz_init_all() that does the
cpumask_setall() that you called out above.

Or is a recent patch that I missed changing this?

                                                        Thanx, Paul
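The failure mode Ingo asks about, an "any CPU in both masks" lookup on an empty intersection, can be shown with a small user-space model. This is a sketch under stated assumptions: masks are 32-bit words, the machine has 8 CPUs, and sketch_any_and() is a hypothetical stand-in that, like the kernel helper it imitates, signals "no CPU found" by returning a value at or beyond the CPU count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 8    /* assumed machine size for the sketch */

/*
 * Return the first CPU present in both masks, or SKETCH_NR_CPUS when the
 * intersection is empty. The caller must treat that value as "none":
 * using it as an array or per-CPU index would read out of bounds.
 */
static int sketch_any_and(uint32_t a, uint32_t b)
{
    uint32_t both = a & b;

    for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        if (both & (1u << cpu))
            return cpu;
    }
    return SKETCH_NR_CPUS;
}

/* True when a lookup result is safe to use as a CPU index. */
static bool sketch_cpu_valid(int cpu)
{
    return cpu >= 0 && cpu < SKETCH_NR_CPUS;
}
```

With an empty housekeeping mask the lookup yields an out-of-range index, which is why the thread cares that the boot CPU is always kept out of the nohz_full set.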
Re: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
* Frederic Weisbecker wrote:

> From: Vatika Harlalka
>
> The problem addressed in this patch is about affining unpinned timers.
> Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
> jitter due to firing of such timers on them.
>
> This patch will affine timers to online CPUs which are not full dynticks
> in NOHZ_FULL configured systems. It should not introduce overhead in
> nohz full off case due to static keys.
>
> Reviewed-by: Preeti U Murthy
> Signed-off-by: Vatika Harlalka
> Cc: Ingo Molnar
> Cc: Peter Zijlstra
> Cc: Christoph Lameter
> Cc: Thomas Gleixner
> Cc: Paul E. McKenney
> Cc: Chris Metcalf
> Signed-off-by: Frederic Weisbecker
> ---
>  include/linux/tick.h | 9 ++++++++-
>  kernel/sched/core.c  | 7 +++++--
>  2 files changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 3741ba1..51e6493 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
>  	if (tick_nohz_full_enabled())
>  		cpumask_or(mask, mask, tick_nohz_full_mask);
>  }
> -
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> +}
>  extern void __tick_nohz_full_check(void);
>  extern void tick_nohz_full_kick(void);
>  extern void tick_nohz_full_kick_cpu(int cpu);
>  extern void tick_nohz_full_kick_all(void);
>  extern void __tick_nohz_task_switch(struct task_struct *tsk);
>  #else
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return smp_processor_id();
> +}
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
>  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9917c96..4fd42e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
>  	int i, cpu = smp_processor_id();
>  	struct sched_domain *sd;
>
> -	if (!idle_cpu(cpu))
> +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
>  		return cpu;
>
>  	rcu_read_lock();
>  	for_each_domain(cpu, sd) {
>  		for_each_cpu(i, sched_domain_span(sd)) {
> -			if (!idle_cpu(i)) {
> +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
>  				cpu = i;
>  				goto unlock;
>  			}
>  		}
>  	}
> +
> +	if (!is_housekeeping_cpu(cpu))
> +		cpu = housekeeping_any_cpu();
>  unlock:
>  	rcu_read_unlock();
>  	return cpu;

So I almost applied this yesterday, but had the following question: what
ensures that housekeeping_mask isn't empty?

If it's empty then housekeeping_any_cpu() returns cpumask_any_and() of an
empty cpumask - which returns an out of range index AFAICS - which will
crash and burn in:

  kernel/time/hrtimer.c: return &per_cpu(hrtimer_bases, get_nohz_timer_target());
  kernel/time/timer.c:   return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());

housekeeping_mask itself is derived from tick_nohz_full_mask (it's the
inverse of it in essence), and tick_nohz_full_mask is set via two methods,
either via a boot parameter:

        if (cpulist_parse(str, tick_nohz_full_mask) < 0) {

in tick_nohz_full_setup(). What ensures here that tick_nohz_full_mask is
not completely full - making housekeeping_mask empty?

The other method is via CONFIG_NO_HZ_FULL_ALL:

        cpumask_setall(tick_nohz_full_mask);

here it's fully set - triggering the bug I'm worried about.

So what am I missing, what prevents CONFIG_NO_HZ_FULL_ALL from crashing?

Thanks,

	Ingo
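The failure mode raised in this question can be reproduced in a small userspace model. This is a sketch under stated assumptions rather than kernel code: masks are 32-bit words, and NR_CPUS, any_and(), pick_housekeeper() and base_of() are hypothetical helpers written for the illustration. The one property borrowed from the kernel is the convention that a cpumask search returns a value at or beyond the CPU count when no bit matches.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8u		/* hypothetical CPU count for the model */

typedef uint32_t mask_t;	/* bit n set means CPU n is in the mask */

/*
 * Return the first CPU set in both masks, or NR_CPUS when the
 * intersection is empty (an index that is out of range by design).
 */
static unsigned any_and(mask_t a, mask_t b)
{
	mask_t both = a & b;

	for (unsigned cpu = 0; cpu < NR_CPUS; cpu++)
		if (both & (1u << cpu))
			return cpu;
	return NR_CPUS;
}

/* Guarded variant: report -1 instead of handing back a bogus index. */
static int pick_housekeeper(mask_t housekeeping, mask_t online)
{
	unsigned cpu = any_and(housekeeping, online);

	return cpu < NR_CPUS ? (int)cpu : -1;
}

/* Stand-in for a per-CPU lookup; an unchecked index would overrun bases[]. */
static int *base_of(int *bases, unsigned cpu)
{
	assert(cpu < NR_CPUS);
	return &bases[cpu];
}
```

An empty housekeeping mask makes any_and() return NR_CPUS, and passing that value to a per-CPU lookup without a check is the out-of-range access being asked about.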
Re: Ping: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
On Sat, 22 Aug 2015, Frederic Weisbecker wrote:
> Ping!

Excellent!!!
Ping: [PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
Ping!

On Fri, Aug 14, 2015 at 03:46:14AM +0200, Frederic Weisbecker wrote:
> From: Vatika Harlalka
>
> The problem addressed in this patch is about affining unpinned timers.
> Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
> jitter due to firing of such timers on them.
>
> This patch will affine timers to online CPUs which are not full dynticks
> in NOHZ_FULL configured systems. It should not introduce overhead in
> nohz full off case due to static keys.
>
> Reviewed-by: Preeti U Murthy
> Signed-off-by: Vatika Harlalka
> Cc: Ingo Molnar
> Cc: Peter Zijlstra
> Cc: Christoph Lameter
> Cc: Thomas Gleixner
> Cc: Paul E. McKenney
> Cc: Chris Metcalf
> Signed-off-by: Frederic Weisbecker
> ---
>  include/linux/tick.h | 9 ++++++++-
>  kernel/sched/core.c  | 7 +++++--
>  2 files changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 3741ba1..51e6493 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
>  	if (tick_nohz_full_enabled())
>  		cpumask_or(mask, mask, tick_nohz_full_mask);
>  }
> -
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> +}
>  extern void __tick_nohz_full_check(void);
>  extern void tick_nohz_full_kick(void);
>  extern void tick_nohz_full_kick_cpu(int cpu);
>  extern void tick_nohz_full_kick_all(void);
>  extern void __tick_nohz_task_switch(struct task_struct *tsk);
>  #else
> +static inline int housekeeping_any_cpu(void)
> +{
> +	return smp_processor_id();
> +}
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
>  static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9917c96..4fd42e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
>  	int i, cpu = smp_processor_id();
>  	struct sched_domain *sd;
>
> -	if (!idle_cpu(cpu))
> +	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
>  		return cpu;
>
>  	rcu_read_lock();
>  	for_each_domain(cpu, sd) {
>  		for_each_cpu(i, sched_domain_span(sd)) {
> -			if (!idle_cpu(i)) {
> +			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
>  				cpu = i;
>  				goto unlock;
>  			}
>  		}
>  	}
> +
> +	if (!is_housekeeping_cpu(cpu))
> +		cpu = housekeeping_any_cpu();
>  unlock:
>  	rcu_read_unlock();
>  	return cpu;
> --
> 2.1.4
>
[PATCH RESEND] sched/nohz: Affine unpinned timers to housekeepers
From: Vatika Harlalka

The problem addressed in this patch is about affining unpinned timers.
Adaptive or Full Dynticks CPUs are currently disturbed by unnecessary
jitter due to firing of such timers on them.

This patch will affine timers to online CPUs which are not full dynticks
in NOHZ_FULL configured systems. It should not introduce overhead in
nohz full off case due to static keys.

Reviewed-by: Preeti U Murthy
Signed-off-by: Vatika Harlalka
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Christoph Lameter
Cc: Thomas Gleixner
Cc: Paul E. McKenney
Cc: Chris Metcalf
Signed-off-by: Frederic Weisbecker
---
 include/linux/tick.h | 9 ++++++++-
 kernel/sched/core.c  | 7 +++++--
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 3741ba1..51e6493 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -143,13 +143,20 @@ static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
 	if (tick_nohz_full_enabled())
 		cpumask_or(mask, mask, tick_nohz_full_mask);
 }
-
+static inline int housekeeping_any_cpu(void)
+{
+	return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+}
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
 extern void tick_nohz_full_kick_all(void);
 extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
+static inline int housekeeping_any_cpu(void)
+{
+	return smp_processor_id();
+}
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
 static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9917c96..4fd42e4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -623,18 +623,21 @@ int get_nohz_timer_target(void)
 	int i, cpu = smp_processor_id();
 	struct sched_domain *sd;
 
-	if (!idle_cpu(cpu))
+	if (!idle_cpu(cpu) && is_housekeeping_cpu(cpu))
 		return cpu;
 
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
 		for_each_cpu(i, sched_domain_span(sd)) {
-			if (!idle_cpu(i)) {
+			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
 				cpu = i;
 				goto unlock;
 			}
 		}
 	}
+
+	if (!is_housekeeping_cpu(cpu))
+		cpu = housekeeping_any_cpu();
 unlock:
 	rcu_read_unlock();
 	return cpu;
-- 
2.1.4
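The selection policy this patch aims for can be sketched as a userspace model. Everything here is an assumption made for illustration: masks are 32-bit words, scheduling domains are passed in as an array of span masks (innermost first), and timer_target(), in_mask() and first_cpu() are hypothetical names, not kernel functions. One deliberate difference from the posted hunk: the inner loop of the sketch tests the candidate CPU i for housekeeping membership, whereas the hunk passes cpu to is_housekeeping_cpu() at that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 8u		/* hypothetical CPU count for the model */

typedef uint32_t mask_t;	/* bit n set means CPU n is in the mask */

static bool in_mask(mask_t m, unsigned cpu)
{
	return (m >> cpu) & 1u;
}

/* First CPU set in m, or NR_CPUS if m is empty. */
static unsigned first_cpu(mask_t m)
{
	for (unsigned cpu = 0; cpu < NR_CPUS; cpu++)
		if (in_mask(m, cpu))
			return cpu;
	return NR_CPUS;
}

/*
 * Pick a CPU for an unpinned timer armed on this_cpu.
 * busy:    CPUs that are not idle
 * hk:      housekeeping CPUs
 * online:  online CPUs
 * domains: spans of this_cpu's scheduling domains, innermost first
 */
static unsigned timer_target(unsigned this_cpu, mask_t busy, mask_t hk,
			     mask_t online, const mask_t *domains,
			     size_t nr_domains)
{
	assert(this_cpu < NR_CPUS);

	/* A busy housekeeper keeps its own timers. */
	if (in_mask(busy, this_cpu) && in_mask(hk, this_cpu))
		return this_cpu;

	/* Otherwise prefer the nearest busy housekeeper. */
	for (size_t d = 0; d < nr_domains; d++)
		for (unsigned i = 0; i < NR_CPUS; i++)
			if (in_mask(domains[d], i) && in_mask(busy, i) &&
			    in_mask(hk, i))
				return i;

	/* No busy housekeeper: a nohz full CPU falls back to any online one. */
	if (!in_mask(hk, this_cpu))
		return first_cpu(hk & online);
	return this_cpu;
}
```

In this model a timer armed on a busy nohz full CPU lands on the closest busy housekeeper. When every housekeeper is idle it still moves off the nohz full CPU, through the fallback that corresponds to housekeeping_any_cpu().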