Re: [PATCH] Add support for deferrable timers (respun-Mar28)
> Might also be useful to add an extra option to "top" to reduce the
> polling frequency if the system is otherwise idle. A fixed 30-sec
> timer and a deferrable 1-sec timer or somesuch?

Hmm, I think the current implementation is per CPU. top really would like to have one that applies to all CPUs though. Thinking about it, for sane user space semantics it would probably need a global implementation anyway. Perhaps it could use the same infrastructure as RCU does to handle this?

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Add support for deferrable timers (respun-Mar28)
On Mar 29, 2007, at 07:41:12, Andi Kleen wrote:
>> ondemand is the biggest offender and the patch below reduces the number of
>> interrupts by 50% or more (depending on HZ) on different test systems here.
>
> Cool!
>
>> Yes. There are quite a few other timers inside kernel that can be
>> migrated. I will use timer_stats and track others and send in the patches
>> soon.
>
> Longer term it might make sense to even expose this as an option to user
> space. Maybe as a new timer in setitimer()? This might save power with
> "wiggling desktop applets" too.

Might also be useful to add an extra option to "top" to reduce the polling frequency if the system is otherwise idle. A fixed 30-sec timer and a deferrable 1-sec timer or somesuch?

Cheers,
Kyle Moffett
Re: [PATCH] Add support for deferrable timers (respun-Mar28)
> ondemand is the biggest offender and the patch below reduces the number of
> interrupts by 50% or more (depending on HZ) on different test systems here.

Cool!

> Yes. There are quite a few other timers inside kernel that can be
> migrated. I will use timer_stats and track others and send in the patches
> soon.

Longer term it might make sense to even expose this as an option to user space. Maybe as a new timer in setitimer()? This might save power with "wiggling desktop applets" too.

-Andi
Re: [PATCH] Add support for deferrable timers (respun-Mar28)
On Wed, Mar 28, 2007 at 05:01:59PM -0700, Andrew Morton wrote:
> On Wed, 28 Mar 2007 16:00:21 -0700
> Venki Pallipadi <[EMAIL PROTECTED]> wrote:
>
> > Please drop the patch you included yesterday and two incremental patches and
> > use the patch below.
>
> As you saw, I went and turned it into an incremental patch again. It makes
> it easier to see what changed, but harder to see the whole thing.
>
> > Introduce a new flag for timers - deferrable:
>
> OK, but there's nothing in-kernel which actually uses this.
>
> It would be good to identify some timer users which can be switched over (as
> many as possible, really) so this thing actually gets some runtime testing.

ondemand is the biggest offender and the patch below reduces the number of interrupts by 50% or more (depending on HZ) on different test systems here.

Yes. There are quite a few other timers inside kernel that can be migrated. I will use timer_stats and track others and send in the patches soon.

Thanks,
Venki

--

Add a new deferrable delayed work init. This can be used to schedule work that is 'unimportant' when the CPU is idle and can be called later, when the CPU eventually comes out of idle.

Use this init in cpufreq ondemand governor.

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

Index: new/drivers/cpufreq/cpufreq_ondemand.c
===
--- new.orig/drivers/cpufreq/cpufreq_ondemand.c	2007-03-28 10:03:21.0 -0800
+++ new/drivers/cpufreq/cpufreq_ondemand.c	2007-03-28 10:05:44.0 -0800
@@ -470,7 +470,7 @@
 	dbs_info->enable = 1;
 	ondemand_powersave_bias_init();
 	dbs_info->sample_type = DBS_NORMAL_SAMPLE;
-	INIT_DELAYED_WORK(&dbs_info->work, do_dbs_timer);
+	INIT_DELAYED_WORK_DEFERRABLE(&dbs_info->work, do_dbs_timer);
 	queue_delayed_work_on(dbs_info->cpu, kondemand_wq, &dbs_info->work,
 	                      delay);
 }
Index: new/include/linux/workqueue.h
===
--- new.orig/include/linux/workqueue.h	2007-03-28 10:03:21.0 -0800
+++ new/include/linux/workqueue.h	2007-03-28 10:05:44.0 -0800
@@ -89,6 +89,12 @@
 		init_timer(&(_work)->timer);			\
 	} while (0)
 
+#define INIT_DELAYED_WORK_DEFERRABLE(_work, _func)		\
+	do {							\
+		INIT_WORK(&(_work)->work, (_func));		\
+		init_timer_deferrable(&(_work)->timer);		\
+	} while (0)
+
 /**
  * work_pending - Find out whether a work item is currently pending
  * @work: The work item in question
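To make the effect on an idle CPU concrete, here is a minimal user-space sketch of the wakeup decision described in this thread. It is not kernel code: `struct sim_timer`, `sim_next_wakeup()` and `sim_count_expired()` are hypothetical names made up for illustration, a flat array stands in for the kernel's timer wheel, and jiffies wraparound is ignored. It only shows the idea that deferrable timers are skipped when choosing the next wakeup, yet still run once the CPU is awake and their expiry has passed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified timer record; NOT the kernel's struct timer_list. */
struct sim_timer {
	unsigned long expires;	/* absolute expiry in ticks (no wraparound handling) */
	bool deferrable;	/* true: must not wake an idle CPU on its own */
};

/*
 * Earliest expiry an idle CPU has to wake up for. Deferrable timers are
 * skipped. Returns ULONG_MAX when no non-deferrable timer is pending.
 */
static inline unsigned long sim_next_wakeup(const struct sim_timer *timers,
					    size_t n)
{
	unsigned long next = ULONG_MAX;
	size_t i;

	assert(timers != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (timers[i].deferrable)
			continue;
		if (timers[i].expires < next)
			next = timers[i].expires;
	}
	return next;
}

/*
 * Number of timers that would run when the CPU is awake at tick 'now':
 * every timer whose expiry has passed, deferrable or not.
 */
static inline size_t sim_count_expired(const struct sim_timer *timers,
				       size_t n, unsigned long now)
{
	size_t count = 0;
	size_t i;

	assert(timers != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (timers[i].expires <= now)
			count++;
	return count;
}
```

With two deferrable timers due at ticks 10 and 20 and two ordinary timers due at 250 and 400, the sketch picks tick 250 as the wakeup, and the two overdue deferrable timers run in that same wakeup instead of costing two extra interrupts.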
Re: [PATCH] Add support for deferrable timers (respun-Mar28)
On Wed, 28 Mar 2007 16:00:21 -0700
Venki Pallipadi <[EMAIL PROTECTED]> wrote:

> Please drop the patch you included yesterday and two incremental patches and
> use the patch below.

As you saw, I went and turned it into an incremental patch again. It makes it easier to see what changed, but harder to see the whole thing.

> Introduce a new flag for timers - deferrable:

OK, but there's nothing in-kernel which actually uses this.

It would be good to identify some timer users which can be switched over (as many as possible, really) so this thing actually gets some runtime testing.
Re: [PATCH] Add support for deferrable timers (respun-Mar28)
Andrew,

Please drop the patch you included yesterday and two incremental patches and use the patch below.

This patch is - yesterday's patch + your tidy cleanup + minor changes based on comments from Oleg and Andi. This is a lot cleaner (and smaller) than earlier patches.

Thanks,
Venki

Introduce a new flag for timers - deferrable:
Timers that work normally when the system is busy, but will not cause a CPU to come out of idle (just to service this timer) when the CPU is idle. Instead, this timer will be serviced when the CPU eventually wakes up with a subsequent non-deferrable timer.

The main advantage of this is to avoid unnecessary timer interrupts when the CPU is idle. If the routine currently called by a timer can wait until the next event without any issues, this new timer can be used to set up the timer event for that routine. This, with dynticks, allows CPUs to be lazy, allowing them to stay in idle for extended periods of time by reducing unnecessary wakeups and thereby reducing power consumption.

This patch:
Builds this new timer on top of the existing timer infrastructure. It uses the last bit in the 'base' pointer of the timer_list structure to store this deferrable timer flag. The __next_timer_interrupt() function skips over these deferrable timers when the CPU looks for the next timer event for which it has to wake up.

This is exported by a new interface init_timer_deferrable() that can be called in place of regular init_timer().

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

Index: new/kernel/timer.c
===
--- new.orig/kernel/timer.c	2007-03-22 16:27:44.0 -0800
+++ new/kernel/timer.c	2007-03-28 10:05:38.0 -0800
@@ -74,7 +74,7 @@
 	tvec_t tv3;
 	tvec_t tv4;
 	tvec_t tv5;
-} ____cacheline_aligned_in_smp;
+} ____cacheline_aligned;
 
 typedef struct tvec_t_base_s tvec_base_t;
 
@@ -82,6 +82,37 @@
 EXPORT_SYMBOL(boot_tvec_bases);
 static DEFINE_PER_CPU(tvec_base_t *, tvec_bases) = &boot_tvec_bases;
 
+/*
+ * Note that all tvec_bases is 2 byte aligned and lower bit of
+ * base in timer_list is guaranteed to be zero. Use the LSB for
+ * the new flag to indicate whether the timer is deferrable
+ */
+#define TBASE_DEFERRABLE_FLAG		(0x1)
+
+/* Functions below help us manage 'deferrable' flag */
+static inline unsigned int tbase_get_deferrable(tvec_base_t *base)
+{
+	return ((unsigned int)(unsigned long)base & TBASE_DEFERRABLE_FLAG);
+}
+
+static inline tvec_base_t *tbase_get_base(tvec_base_t *base)
+{
+	return ((tvec_base_t *)((unsigned long)base & ~TBASE_DEFERRABLE_FLAG));
+}
+
+static inline void timer_set_deferrable(struct timer_list *timer)
+{
+	timer->base = ((tvec_base_t *)((unsigned long)(timer->base) |
+	                               TBASE_DEFERRABLE_FLAG));
+}
+
+static inline void
+timer_set_base(struct timer_list *timer, tvec_base_t *new_base)
+{
+	timer->base = (tvec_base_t *)((unsigned long)(new_base) |
+	                              tbase_get_deferrable(timer->base));
+}
+
 /**
  * __round_jiffies - function to round jiffies to a full second
  * @j: the time in (absolute) jiffies that should be rounded
@@ -295,6 +326,13 @@
 }
 EXPORT_SYMBOL(init_timer);
 
+void fastcall init_timer_deferrable(struct timer_list *timer)
+{
+	init_timer(timer);
+	timer_set_deferrable(timer);
+}
+EXPORT_SYMBOL(init_timer_deferrable);
+
 static inline void detach_timer(struct timer_list *timer,
 				int clear_pending)
 {
@@ -325,10 +363,11 @@
 	tvec_base_t *base;
 
 	for (;;) {
-		base = timer->base;
+		tvec_base_t *prelock_base = timer->base;
+		base = tbase_get_base(prelock_base);
 		if (likely(base != NULL)) {
 			spin_lock_irqsave(&base->lock, *flags);
-			if (likely(base == timer->base))
+			if (likely(prelock_base == timer->base))
 				return base;
 			/* The timer has migrated to another CPU */
 			spin_unlock_irqrestore(&base->lock, *flags);
@@ -365,11 +404,11 @@
 		 */
 		if (likely(base->running_timer != timer)) {
 			/* See the comment in lock_timer_base() */
-			timer->base = NULL;
+			timer_set_base(timer, NULL);
 			spin_unlock(&base->lock);
 			base = new_base;
 			spin_lock(&base->lock);
-			timer->base = base;
+			timer_set_base(timer, base);
 		}
 	}
 
@@ -397,7 +436,7 @@
 	timer_stats_timer_set_start_info(timer);
 	BUG_ON(timer_pending(timer) || !timer->function);
 	spin_lock_irqsave(&base->lock, flags);
-	timer->base = base;
+	timer_set_base(timer, base);
 	internal_add_timer(base, timer);
 	spin_unlock_irqrestore(&base->lock, flags);
 }
@@ -548,7
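For readers unfamiliar with the trick of storing a flag in the low bit of a pointer, the following stand-alone user-space sketch mirrors the helpers added above. The names (`struct demo_base`, `struct demo_timer`, the `demo_*` functions) are hypothetical and exist only for this illustration, and it uses `uintptr_t` where the patch uses `unsigned long`. It assumes, as the patch comment does for tvec_bases, that a real base pointer is at least 2-byte aligned so bit 0 is always free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's per-CPU timer base. */
struct demo_base {
	long timer_jiffies;
};

/* Hypothetical stand-in for struct timer_list. */
struct demo_timer {
	struct demo_base *base;	/* bit 0 carries the deferrable flag */
};

#define DEMO_DEFERRABLE_FLAG	((uintptr_t)0x1)

/* Return 1 if the tagged pointer has the deferrable bit set, else 0. */
static inline unsigned int demo_get_deferrable(const struct demo_base *tagged)
{
	return (unsigned int)((uintptr_t)tagged & DEMO_DEFERRABLE_FLAG);
}

/* Strip the flag bit and return the real base pointer. */
static inline struct demo_base *demo_get_base(struct demo_base *tagged)
{
	return (struct demo_base *)((uintptr_t)tagged & ~DEMO_DEFERRABLE_FLAG);
}

/* Mark the timer deferrable without disturbing the base it points to. */
static inline void demo_set_deferrable(struct demo_timer *t)
{
	t->base = (struct demo_base *)((uintptr_t)t->base |
				       DEMO_DEFERRABLE_FLAG);
}

/* Point the timer at a new base (or NULL) while preserving the flag. */
static inline void demo_set_base(struct demo_timer *t,
				 struct demo_base *new_base)
{
	/* The trick only works if real pointers never use bit 0. */
	assert(((uintptr_t)new_base & DEMO_DEFERRABLE_FLAG) == 0);
	t->base = (struct demo_base *)((uintptr_t)new_base |
				       demo_get_deferrable(t->base));
}
```

The point to notice is that every read of the base must go through the stripping helper and every write through the preserving helper. That is why the patch replaces each direct `timer->base = ...` assignment with `timer_set_base()`, and why `lock_timer_base()` compares the raw `prelock_base` value rather than the stripped pointer.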