Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Sat, Jul 23, 2005 at 04:40:46PM -0700, randy_dunlap wrote:
> On Sat, 16 Jul 2005 23:55:17 -0400 Lee Revell wrote:
>
> > On Sat, 2005-07-16 at 19:35 -0700, Nish Aravamudan wrote:
> > > As you've seen, I think it depends on the timesource: for the PIT, it
> > > would be arch/i386/kernel/timers/timer_pit.c::setup_pit_timer().
> >
> > That one looks pretty straightforward.
> > arch/i386/kernel/timers/timer_tsc.c really looks like fun. So many
> > corner cases...
> >
> > BTW shouldn't this code from mark_offset_tsc():
> >
> >     if (pit_latch_buggy) {
> >         /* get center value of last 3 time lutch */
> >         if ((count2 >= count && count >= count1)
> >             || (count1 >= count && count >= count2)) {
> >             count2 = count1; count1 = count;
> >         } else if ((count1 >= count2 && count2 >= count)
> >             || (count >= count2 && count2 >= count1)) {
> >             countmp = count; count = count2;
> >             count2 = count1; count1 = countmp;
> >         } else {
> >             count2 = count1; count1 = count; count = count1;
> >         }
> >     }
> >
> > use an ifdef? It only applies to cyrix_55x0, and mark_offset_tsc is a
> > pretty hot path.
>
> I see your point, but several distros build kernels that run on
> almost any x86-32 machine, so I think that it's there as is
> for universal-kernel support.

The same latch bug is in stone age Intel Pentium chipsets and some
medieval SiS chipsets. VIA chipsets from the middle age have another
interesting bug in the PIT.

--
Vojtech Pavlik
SuSE Labs, SuSE CR
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Jesper Juhl wrote:
> On 7/24/05, randy_dunlap <[EMAIL PROTECTED]> wrote:
> > On Fri, 15 Jul 2005 05:46:44 +0200 Jesper Juhl wrote:
> >
> > > +static int __init jiffies_increment_setup(char *str)
> > > +{
> > > +    printk(KERN_NOTICE "setting up jiffies_increment : ");
> > > +    if (str) {
> > > +        printk("kernel_hz = %s, ", str);
> > > +    } else {
> > > +        printk("kernel_hz is unset, ");
> > > +    }
> > > +    if (!strncmp("100", str, 3)) {
> >
> > BTW, if someone enters "kernel_hz=1000", this check (above) for "100"
> > matches (detects) 100, not 1000.
>
> ouch. You are right - thanks. I'll be sure to fix that.
> I haven't had time to look more at this little thing for the last few
> days, but I'll get back to it soon. Thank you for the feedback.

I have to admit that I like paranoid programming, and would rather see
this look for "kernel_hz=", then convert the digits after it to an
integer and validate that. It would catch invalid values far better,
allow other values to be implemented as well as possible if desired,
and NOT ignore invalid values just because they didn't match these
predefined strings.

--
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
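A minimal sketch of the stricter parsing suggested above, written as ordinary userspace C rather than kernel code. The helper name parse_kernel_hz() and the accepted set {100, 250, 1000} are assumptions made for illustration; it also assumes the handler is handed only the text that follows "kernel_hz=". Inside the kernel, a routine such as simple_strtoul() would take the place of strtol().

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert the value given for kernel_hz into a
 * jiffies increment. Returns 0 on success and -1 when the string is
 * missing, is not a plain decimal number, or names an unsupported rate.
 */
static int parse_kernel_hz(const char *str, int *increment)
{
    char *end;
    long hz;

    if (str == NULL || *str == '\0')
        return -1;

    errno = 0;
    hz = strtol(str, &end, 10);
    if (errno != 0 || end == str || *end != '\0')
        return -1;              /* trailing junk such as "100x" is rejected */

    switch (hz) {
    case 100:
        *increment = 10;
        return 0;
    case 250:
        *increment = 4;
        return 0;
    case 1000:
        *increment = 1;
        return 0;
    default:
        return -1;              /* numeric, but not a rate we support */
    }
}
```

Because the whole string is converted before it is compared, "1000" and "100" can no longer be confused, and an unsupported value is reported to the caller instead of silently falling through to a default.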
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On 7/24/05, randy_dunlap <[EMAIL PROTECTED]> wrote:
> On Fri, 15 Jul 2005 05:46:44 +0200 Jesper Juhl wrote:
>
> > +static int __init jiffies_increment_setup(char *str)
> > +{
> > +    printk(KERN_NOTICE "setting up jiffies_increment : ");
> > +    if (str) {
> > +        printk("kernel_hz = %s, ", str);
> > +    } else {
> > +        printk("kernel_hz is unset, ");
> > +    }
> > +    if (!strncmp("100", str, 3)) {
>
> BTW, if someone enters "kernel_hz=1000", this check (above) for "100"
> matches (detects) 100, not 1000.
>
ouch. You are right - thanks. I'll be sure to fix that.
I haven't had time to look more at this little thing for the last few
days, but I'll get back to it soon. Thank you for the feedback.

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 15 Jul 2005 05:46:44 +0200 Jesper Juhl wrote:

> +static int __init jiffies_increment_setup(char *str)
> +{
> +    printk(KERN_NOTICE "setting up jiffies_increment : ");
> +    if (str) {
> +        printk("kernel_hz = %s, ", str);
> +    } else {
> +        printk("kernel_hz is unset, ");
> +    }
> +    if (!strncmp("100", str, 3)) {

BTW, if someone enters "kernel_hz=1000", this check (above) for "100"
matches (detects) 100, not 1000.

> +        jiffies_increment = 10;
> +        printk("jiffies_increment set to 10, effective HZ will be 100\n");
> +    } else if (!strncmp("250", str, 3)) {
> +        jiffies_increment = 4;
> +        printk("jiffies_increment set to 4, effective HZ will be 250\n");
> +    } else {
> +        jiffies_increment = 1;
> +        printk("jiffies_increment set to 1, effective HZ will be 1000\n");
> +    }
> +
> +    return 1;
> +}

---
~Randy
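The prefix problem pointed out above can be reproduced outside the kernel. The sketch below uses only the standard strncmp() and strcmp(); the two wrapper names are made up for the illustration and are not part of the patch.

```c
#include <string.h>

/* Same test as in the patch: only the first 3 characters are compared. */
static int matches_prefix_100(const char *str)
{
    return strncmp("100", str, 3) == 0;
}

/* Whole-string comparison: "1000" is no longer mistaken for "100". */
static int matches_exact_100(const char *str)
{
    return strcmp("100", str) == 0;
}
```

With the three-character comparison both "100" and "1000" select the 100 Hz branch; the exact comparison accepts only "100". Neither helper guards against a NULL argument, which the quoted patch would also need since it prints "kernel_hz is unset" and then still passes str to strncmp().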
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Sat, 16 Jul 2005 23:55:17 -0400 Lee Revell wrote:

> On Sat, 2005-07-16 at 19:35 -0700, Nish Aravamudan wrote:
> > As you've seen, I think it depends on the timesource: for the PIT, it
> > would be arch/i386/kernel/timers/timer_pit.c::setup_pit_timer().
>
> That one looks pretty straightforward.
> arch/i386/kernel/timers/timer_tsc.c really looks like fun. So many
> corner cases...
>
> BTW shouldn't this code from mark_offset_tsc():
>
>     if (pit_latch_buggy) {
>         /* get center value of last 3 time lutch */
>         if ((count2 >= count && count >= count1)
>             || (count1 >= count && count >= count2)) {
>             count2 = count1; count1 = count;
>         } else if ((count1 >= count2 && count2 >= count)
>             || (count >= count2 && count2 >= count1)) {
>             countmp = count; count = count2;
>             count2 = count1; count1 = countmp;
>         } else {
>             count2 = count1; count1 = count; count = count1;
>         }
>     }
>
> use an ifdef? It only applies to cyrix_55x0, and mark_offset_tsc is a
> pretty hot path.

I see your point, but several distros build kernels that run on
almost any x86-32 machine, so I think that it's there as is
for universal-kernel support.

---
~Randy
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 15 Jul 2005, Venkatesh Pallipadi wrote:

> Well.. I tried a patch to do the broadcast thing couple of months ago and
> failed to convince everyone :(.

I must have missed the patch -- but was the change unconditional or
did it affect only broken systems? And how were such systems determined?

> Further, it doesn't work well if you want to exclude some CPUs from the
> list of receivers. Logical destination is simple only for less than 8
> CPUs. Beyond that with clustered or physical configuration it is
> difficult.

Well, I thought the number of bits for LDR had been re-expanded at
one point (it had been 32 originally with the 82489DX and then shrank to 8
with the Pentium integrated APIC) -- it must have been something else...

Anyway, for this you should just need a single bit as, quoting APIC
documentation:

"When the message addresses the destination using logical addressing
scheme each Local Unit in the ICC bus compares the logical address in the
interrupt message with its own Logical Destination Register. If there is
a bit match (i.e., if at least one of the corresponding pair of bits
match) this local unit is selected for delivery."

Thus you could make, say, bit 31 of LDR the "timer bit", set it in all
local APIC units and send timer interrupts in the Fixed delivery mode
using a logical destination address with only bit 31 set. You could then
exclude processors from delivery by clearing bit 31 of LDR as needed. It
could probably be applied to addresses within clusters, too, though space
there is a bit tight.

No such hassle should be needed in reality, though -- I presume the
only broken systems are uniprocessor ones with the HT feature, as real SMP
systems need to have their APICs enabled all the time to be able to accept
IPIs if nothing else. As UP systems with HT have only two processors
logically, there is no problem with using the logical destination mode as
currently implemented.

  Maciej
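A small model of the "timer bit" idea described above, as a sketch only. It treats the Logical Destination Register as a plain 32-bit mask, as in the 82489DX case mentioned in the message, and implements nothing more than the bit-match rule from the quoted documentation. TIMER_BIT and the helper names are hypothetical; real code would read and write the local APIC's LDR rather than a variable.

```c
#include <stdint.h>

/* Hypothetical choice from the message: bit 31 marks "receives timer ticks". */
#define TIMER_BIT (UINT32_C(1) << 31)

/* Bit-match rule: the unit is selected if any bit of LDR and destination match. */
static int ldr_accepts(uint32_t ldr, uint32_t dest)
{
    return (ldr & dest) != 0;
}

/* Opt a processor in to, or out of, timer delivery. */
static uint32_t ldr_join_timer(uint32_t ldr)
{
    return ldr | TIMER_BIT;
}

static uint32_t ldr_leave_timer(uint32_t ldr)
{
    return ldr & ~TIMER_BIT;
}
```

A timer interrupt sent to the logical destination TIMER_BIT then reaches every unit whose LDR still has that bit set, while each unit stays addressable through its own per-CPU bit.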
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 15 Jul 2005, Andi Kleen wrote:

> > That's like scratching your left ear with your right hand -- broadcasting
> > that external timer interrupt in the first place is more straightforward.
> > If you want to exclude CPUs from the list of receivers, just use the
> > logical destination mode appropriately.
>
> The problem with that is that it would need regular synchronizations
> of all CPUs to coordinate this. Not good for scalability and I
> believe the fundamentally wrong way to do this.

What do you mean by "regular synchronizations of all CPUs?" And how is
a broadcast external timer interrupt different from a unicast one
redistributed further via an all-but-self IPI, except for removing an
unnecessary burden from the CPU targeted by the unicast interrupt?

  Maciej
Re: High irq load (Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt)
On Thu, Jul 14, 2005 at 04:25:12PM +0200, Peter Osterlund wrote:
> Linus Torvalds <[EMAIL PROTECTED]> writes:
>
> > On Wed, 13 Jul 2005, Jan Engelhardt wrote:
> > >
> > > No, some kernel code causes a triple-fault-and-reboot when the HZ is >=
> > > 10KHz. Maybe the highest possible value is 8192 Hz, not sure.
> >
> > Can you post the triple-fault message? It really shouldn't triple-fault,
> > although it _will_ obviously spend all time just doing timer interrupts,
> > so it shouldn't get much (if any) real work done either.
> ...
> > There should be no conceptual "highest possible HZ", although there are
> > certainly obvious practical limits to it (both on the timer hw itself, and
> > just the fact that at some point we'll spend all time on the timer
> > interrupt and won't get anything done..)
>
> HZ=10000 appears to work fine here after some hacks to avoid
> over/underflows in integer arithmetics. gkrellm reports about 3-4% CPU
> usage when the system is idle, on a 3.07 GHz P4.

yep, we've gone up to 20kHz actually, but this requires
some changes to long lasting network timeouts :)

nevertheless 20Hz-20kHz works fine on 'most' archs ...

best,
Herbert

> ---
>
>  Makefile                                    |    2 +-
>  arch/i386/kernel/cpu/proc.c                 |    6 ++
>  fs/nfsd/nfssvc.c                            |    2 +-
>  include/linux/jiffies.h                     |    6 ++
>  include/linux/nfsd/stats.h                  |    4
>  include/linux/timex.h                       |    2 +-
>  include/net/tcp.h                           |   12 +---
>  init/calibrate.c                            |   21 +
>  kernel/Kconfig.hz                           |    6 ++
>  kernel/timer.c                              |    4 ++--
>  net/ipv4/netfilter/ip_conntrack_proto_tcp.c |    2 +-
>  11 files changed, 58 insertions(+), 9 deletions(-)
>
> diff --git a/Makefile b/Makefile
> --- a/Makefile
> +++ b/Makefile
> @@ -1,7 +1,7 @@
>  VERSION = 2
>  PATCHLEVEL = 6
>  SUBLEVEL = 13
> -EXTRAVERSION =-rc3
> +EXTRAVERSION =-rc3-test
>  NAME=Woozy Numbat
>
>  # *DOCUMENTATION*
> diff --git a/arch/i386/kernel/cpu/proc.c b/arch/i386/kernel/cpu/proc.c
> --- a/arch/i386/kernel/cpu/proc.c
> +++ b/arch/i386/kernel/cpu/proc.c
> @@ -128,9 +128,15 @@ static int show_cpuinfo(struct seq_file
>               x86_cap_flags[i] != NULL )
>              seq_printf(m, " %s", x86_cap_flags[i]);
>
> +#if HZ <= 5000
>      seq_printf(m, "\nbogomips\t: %lu.%02lu\n\n",
>               c->loops_per_jiffy/(500000/HZ),
>               (c->loops_per_jiffy/(5000/HZ)) % 100);
> +#else
> +    seq_printf(m, "\nbogomips\t: %lu.%02lu\n\n",
> +             c->loops_per_jiffy/(500000/HZ),
> +             (c->loops_per_jiffy*(HZ/5000)) % 100);
> +#endif
>
>      return 0;
>  }
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -160,7 +160,7 @@ update_thread_usage(int busy_threads)
>      decile = busy_threads*10/nfsdstats.th_cnt;
>      if (decile>0 && decile <= 10) {
>          diff = nfsd_last_call - prev_call;
> -        if ( (nfsdstats.th_usage[decile-1] += diff) >= NFSD_USAGE_WRAP)
> +        if ( (nfsdstats.th_usage[decile-1] += diff) >= NFSD_USAGE_WRAP)
>              nfsdstats.th_usage[decile-1] -= NFSD_USAGE_WRAP;
>          if (decile == 10)
>              nfsdstats.th_fullcnt++;
> diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
> --- a/include/linux/jiffies.h
> +++ b/include/linux/jiffies.h
> @@ -38,6 +38,12 @@
>  # define SHIFT_HZ    9
>  #elif HZ >= 768 && HZ < 1536
>  # define SHIFT_HZ    10
> +#elif HZ >= 1536 && HZ < 3072
> +# define SHIFT_HZ    11
> +#elif HZ >= 3072 && HZ < 6144
> +# define SHIFT_HZ    12
> +#elif HZ >= 6144 && HZ < 12288
> +# define SHIFT_HZ    13
>  #else
>  # error You lose.
>  #endif
> diff --git a/include/linux/nfsd/stats.h b/include/linux/nfsd/stats.h
> --- a/include/linux/nfsd/stats.h
> +++ b/include/linux/nfsd/stats.h
> @@ -30,7 +30,11 @@ struct nfsd_stats {
>  };
>
>  /* thread usage wraps very million seconds (approx one fortnight) */
> +#if HZ < 2048
>  #define NFSD_USAGE_WRAP (HZ*1000000)
> +#else
> +#define NFSD_USAGE_WRAP (2048*1000000)
> +#endif
>
>  #ifdef __KERNEL__
>
> diff --git a/include/linux/timex.h b/include/linux/timex.h
> --- a/include/linux/timex.h
> +++ b/include/linux/timex.h
> @@ -90,7 +90,7 @@
>   *
>   * FINENSEC is 1 ns in SHIFT_UPDATE units of the time_phase variable.
>   */
> -#define SHIFT_SCALE 22 /* phase scale (shift) */
> +#define SHIFT_SCALE 25 /* phase scale (shift) */
>  #define SHIFT_UPDATE (SHIFT_KG + MAXTC) /* time offset scale (shift) */
>  #define SHIFT_USEC 16 /* frequency offset scale (shift) */
>  #define FINENSEC (1L << (SHIFT_SCALE - 10)) /* ~1 ns in phase units */
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -486,8 +486,8 @@ static __inline__ int tcp_sk_listen_hash
>     so that we select tick to get range about 4
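The SHIFT_HZ ranges in the quoted jiffies.h hunk follow a visible pattern: each range runs from 0.75 * 2^n up to, but not including, 1.5 * 2^n, and SHIFT_HZ is n. The function below is a sketch of that pattern as a runtime computation. It is inferred from the ranges shown in the hunk, not taken from the kernel, which selects the value with preprocessor conditionals; the name shift_hz() is made up here.

```c
/*
 * Return n such that 0.75 * 2^n <= hz < 1.5 * 2^n, or -1 if hz is 0
 * or too large for the loop. Integer-only: the bounds are scaled by 4.
 */
static int shift_hz(unsigned long hz)
{
    int n;

    for (n = 0; n < 24; n++) {
        unsigned long three_pow = 3UL << n;     /* 3 * 2^n */

        if (4 * hz >= three_pow && 2 * hz < three_pow)
            return n;
    }
    return -1;
}
```

Under this rule HZ=1000 maps to 10, and a tick rate between 6144 and 12287 maps to 13, which matches the last range the patch adds.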
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Sat, 2005-07-16 at 19:35 -0700, Nish Aravamudan wrote:
> As you've seen, I think it depends on the timesource: for the PIT, it
> would be arch/i386/kernel/timers/timer_pit.c::setup_pit_timer().

That one looks pretty straightforward.
arch/i386/kernel/timers/timer_tsc.c really looks like fun. So many
corner cases...

BTW shouldn't this code from mark_offset_tsc():

    if (pit_latch_buggy) {
        /* get center value of last 3 time lutch */
        if ((count2 >= count && count >= count1)
            || (count1 >= count && count >= count2)) {
            count2 = count1; count1 = count;
        } else if ((count1 >= count2 && count2 >= count)
            || (count >= count2 && count2 >= count1)) {
            countmp = count; count = count2;
            count2 = count1; count1 = countmp;
        } else {
            count2 = count1; count1 = count; count = count1;
        }
    }

use an ifdef? It only applies to cyrix_55x0, and mark_offset_tsc is a
pretty hot path.

Lee
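The comment in the quoted code says it takes the center value of the last three latch reads. A stand-alone sketch of that idea is a plain median-of-three filter, shown below. It is an illustration of the technique, not a rewrite of mark_offset_tsc(), which also shuffles its count1/count2 history variables in place; the function name median3() is an assumption.

```c
/* Return the middle value of three samples. */
static unsigned int median3(unsigned int a, unsigned int b, unsigned int c)
{
    if ((a >= b && b >= c) || (c >= b && b >= a))
        return b;
    if ((b >= a && a >= c) || (c >= a && a >= b))
        return a;
    return c;
}
```

A single wild read, for example one sample of 65000 between 100 and 101, is discarded because it can never be the middle value.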
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On 7/16/05, Jesper Juhl <[EMAIL PROTECTED]> wrote:
> On 7/15/05, Jesper Juhl <[EMAIL PROTECTED]> wrote:
> > On 7/15/05, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > >
> > > On Fri, 15 Jul 2005, Jesper Juhl wrote:
> > > >
> > > > It's buggy, that I know. setting kernel_hz (the new boot parameter) to
> > > > 250 causes my system clock to run at something like 4-5 times normal
> > > > speed
> > >
> > > 4 times normal. You don't actually make the timer interrupt happen at
> > > 250Hz, so the timer will be programmed to run at the full 1kHz.
> > >
> > Right, that's the basic problem. I increase jiffies at a higher rate
> > but didn't slow the timer interrupt down at the same time.
> >
> > > You also need to actually change the LATCH define (in
> > > include/linux/jiffies.h) to take this into account (there might be
> > > something else too).
> > >
> > > [...]
> > > and you might be getting closer.
> > >
> > > Of course, you need to make sure that LATCH is used only after
> > > jiffies_increment is set up. See "setup_pit_timer(void)" in
> > > arch/i386/kernel/timers/timer_pit.c for more details.
> > >
> >
> > Thank you for all the pointers and hints. This is a new area of code
> > for me, so I'll need some time to poke around - the above helps a lot.
> > Unfortunately I won't have any time to work on this today, but I'll
> > see if I can get a working implementation together tomorrow.
> >
>
> Ok, I'm afraid I'm going to need another hint or two.
>
> I've been looking at the timer code and getting thoroughly confused.
> I've tried to find out where we actually program the interrupt
> controller to say "this is the frequency I want you to interrupt me
> at", but I can't seem to find it.
> I'm aware that there are multiple possible time sources, and I've been
> looking at the 8259 code, the ioapic code, the hpet code and various
> other bits in arch/i386/kernel/ and arch/i386/kernel/timers/ , but I
> seem to end up getting confused about all the different defines like
> CLOCK_TICK_RATE, ACTHZ, TICK_NSEC, TICK_USEC, etc...
>
> Where do we actually program the tick rate we want?

As you've seen, I think it depends on the timesource: for the PIT, it
would be arch/i386/kernel/timers/timer_pit.c::setup_pit_timer(). In
that function, you'll notice we outb() the LATCH value. I think there
are similar functions in the other timesources, e.g. the
arch/i386/kernel/apic.c::setup_APIC_timer().

Does that help some?

Thanks,
Nish
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Sun, 2005-07-17 at 04:13 +0200, Jesper Juhl wrote:
> Where do we actually program the tick rate we want?
>

In arch/i386/kernel/timers/timer_pit.c:

    void setup_pit_timer(void)
    {
            unsigned long flags;

            spin_lock_irqsave(&i8253_lock, flags);
            outb_p(0x34, PIT_MODE);         /* binary, mode 2, LSB/MSB, ch 0 */
            udelay(10);
            outb_p(LATCH & 0xff, PIT_CH0);  /* LSB */
            udelay(10);
            outb(LATCH >> 8, PIT_CH0);      /* MSB */
            spin_unlock_irqrestore(&i8253_lock, flags);
    }

Lee
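To make the two outb() writes above concrete, here is a minimal userspace sketch of how a reload value for a given interrupt rate could be derived and split into the low and high bytes. It assumes the usual 1193182 Hz PIT input clock and the round-to-nearest form of the LATCH define; the helper names pit_latch(), pit_latch_lsb() and pit_latch_msb() are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Input clock of the PC's i8253/i8254 PIT, in Hz (assumed value). */
#define PIT_TICK_RATE 1193182UL

/*
 * Reload value for the requested interrupt rate, rounded to nearest.
 * Only meaningful for hz >= 19: below that the result no longer fits
 * in the PIT's 16-bit counter.
 */
static inline uint16_t pit_latch(unsigned long hz)
{
        return (uint16_t)((PIT_TICK_RATE + hz / 2) / hz);
}

/* Low byte, the value that would be written to channel 0 first. */
static inline uint8_t pit_latch_lsb(unsigned long hz)
{
        return (uint8_t)(pit_latch(hz) & 0xff);
}

/* High byte, the value that would be written to channel 0 second. */
static inline uint8_t pit_latch_msb(unsigned long hz)
{
        return (uint8_t)(pit_latch(hz) >> 8);
}
```

With these assumptions, a 1 kHz tick gives a reload value of 1193 and a 250 Hz tick gives 4773, so lowering the tick rate at boot means computing this value at run time rather than from a compile-time HZ.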
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On 7/15/05, Jesper Juhl <[EMAIL PROTECTED]> wrote:
> On 7/15/05, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> >
> > On Fri, 15 Jul 2005, Jesper Juhl wrote:
> > >
> > > It's buggy, that I know. setting kernel_hz (the new boot parameter) to
> > > 250 causes my system clock to run at something like 4-5 times normal
> > > speed
> >
> > 4 times normal. You don't actually make the timer interrupt happen at
> > 250Hz, so the timer will be programmed to run at the full 1kHz.
> >
> Right, that's the basic problem. I increase jiffies at a higher rate
> but didn't slow the timer interrupt down at the same time.
>
> > You also need to actually change the LATCH define (in
> > include/linux/jiffies.h) to take this into account (there might be
> > something else too).
> >
> [...]
> > and you might be getting closer.
> >
> > Of course, you need to make sure that LATCH is used only after
> > jiffies_increment is set up. See "setup_pit_timer(void)" in
> > arch/i386/kernel/timers/timer_pit.c for more details.
> >
>
> Thank you for all the pointers and hints. This is a new area of code
> for me, so I'll need some time to poke around - the above helps a lot.
> Unfortunately I won't have any time to work on this today, but I'll
> see if I can get a working implementation together tomorrow.
>

Ok, I'm afraid I'm going to need another hint or two.

I've been looking at the timer code and getting thoroughly confused.
I've tried to find out where we actually program the interrupt
controller to say "this is the frequency I want you to interrupt me
at", but I can't seem to find it.
I'm aware that there are multiple possible time sources, and I've been
looking at the 8259 code, the ioapic code, the hpet code and various
other bits in arch/i386/kernel/ and arch/i386/kernel/timers/ , but I
seem to end up getting confused about all the different defines like
CLOCK_TICK_RATE, ACTHZ, TICK_NSEC, TICK_USEC, etc...

Where do we actually program the tick rate we want?

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html
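The factor of four mentioned in the quoted exchange can be checked with a few lines of arithmetic. This is a minimal userspace sketch, assuming jiffies is scaled at 1000 per second and that the per-interrupt increment is 1000 / kernel_hz; the function name is hypothetical and not taken from the patch.

```c
#include <assert.h>

/*
 * Simulate one real second of timer interrupts and report how many
 * milliseconds the kernel would believe have passed.
 *
 * irq_hz            - rate at which the timer interrupt really fires
 * jiffies_increment - amount added to jiffies on every interrupt
 * jiffies_hz        - how many jiffies the kernel treats as one second
 */
static unsigned long apparent_ms_per_real_second(unsigned long irq_hz,
                                                 unsigned long jiffies_increment,
                                                 unsigned long jiffies_hz)
{
        unsigned long jiffies = 0;
        unsigned long tick;

        for (tick = 0; tick < irq_hz; tick++)
                jiffies += jiffies_increment;

        return jiffies * 1000 / jiffies_hz;
}
```

With an increment of 4 and the interrupt still firing at 1000 Hz, the result is 4000 ms per real second, which matches the "4 times normal" diagnosis. Once the interrupt is actually slowed to 250 Hz, the same increment yields 1000 ms.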
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Hi!

> > The real answer here is for the tickless patches to cleaned up to
> > the point where they can be merged, and then we won't waste battery
> > power entering the timer interrupt in the first place. :-)
>
> Whilst conceptually this is a nice idea I've yet to see any viable
> code that overall has a lower cost. Tickless is a really nice idea
> for embedded devices and also paravirtualized hardware but I don't
> think anyone has it working well enough yet do they?

Actually for power management uses, "NO_IDLE_HZ" seems to be enough,
and that is both implemented and working.

								Pavel
-- 
teflon -- maybe it is a trademark, but it should not be.
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Hi!

> > Alan tested it and said that 250HZ does not save much power anyway.
>
> Len Brown, a year ago: "The bottom line number to laptop users is battery
> lifetime. Just today somebody complained to me that Windows gets twice the
> battery life that Linux does."
>
> And "Maybe I can get Andy Grover over in the mobile lab to get some time on
> that fancy power measurement setup they have...
>
> "My expectation is if we want to beat the competition, we'll want the
> ability to go *under* 100Hz."
>
> But then, power consumption of the display should preponderate, so it's not
> clear.
>
> Len, any updates on the relationship between HZ and power consumption?

Last time I checked, the difference between HZ=100 and HZ=1000 was
about 1W, about twice as much as the difference between a disk spinning
and a disk spun down.

								Pavel
-- 
teflon -- maybe it is a trademark, but it should not be.
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Sat, 2005-07-16 at 19:35 -0700, Nish Aravamudan wrote:
> As you've seen, I think it depends on the timesource: for the PIT, it
> would be arch/i386/kernel/timers/timer_pit.c::setup_pit_timer().

That one looks pretty straightforward.
arch/i386/kernel/timers/timer_tsc.c really looks like fun. So many
corner cases...

BTW shouldn't this code from mark_offset_tsc():

    if (pit_latch_buggy) {
            /* get center value of last 3 time lutch */
            if ((count2 >= count && count >= count1)
                || (count1 >= count && count >= count2)) {
                    count2 = count1; count1 = count;
            } else if ((count1 >= count2 && count2 >= count)
                       || (count >= count2 && count2 >= count1)) {
                    countmp = count; count = count2;
                    count2 = count1; count1 = countmp;
            } else {
                    count2 = count1; count1 = count; count = count1;
            }
    }

use an ifdef? It only applies to cyrix_55x0, and mark_offset_tsc is a
pretty hot path.

Lee
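The comment in the quoted code says it takes the center value of the last three latch readings. As an illustration of that idea only, here is a plain median-of-three in userspace C. It is a sketch of the filtering technique, not the kernel's implementation, and the name median3() is hypothetical.

```c
#include <assert.h>

/*
 * Return the middle value of three readings. A single wildly wrong
 * reading can never be the median, so it is filtered out.
 */
static unsigned int median3(unsigned int a, unsigned int b, unsigned int c)
{
        /* b lies between a and c */
        if ((a >= b && b >= c) || (c >= b && b >= a))
                return b;
        /* a lies between b and c */
        if ((b >= a && a >= c) || (c >= a && a >= b))
                return a;
        /* otherwise c is the middle value */
        return c;
}
```

For example, readings of 10, 500 and 12 yield 12: the outlier 500 is dropped without the caller needing to know which of the three readings was the bad one.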
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 2005-07-15 at 12:58 -0700, Stephen Pollei wrote:
> But If I understand Linus's points he wants jiffies to remain a memory
> fetch, and make sure it doesn't turn into a singing dancing christmas
> tree.

It seems relatively easy to support dynamic tick; the ARM architecture
has it. But with the numerous users of jiffies throughout the code, it
seems to me that it's hard to ensure that every one of them will
continue to work correctly if jiffies_increment is changed at runtime.

As Linus noted, the current tick code is flexible and powerful, but it
can be hard to get it right in all cases. WinCE developers have similar
problems/concerns:
http://blogs.msdn.com/ce_base/archive/2005/06/08/426762.aspx

The previous cleanups, like time_after()/time_before(), msleep() and
friends, unit conversion helpers, etc., are a step in the right
direction. I just wanted to point out that while it's good to preserve
the current efficient tick implementation, it may be worthwhile to add
a relative timeout API, like the one Alan Cox proposed a year ago, to
better hide the implementation details.

- Eric St-Laurent
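For readers unfamiliar with why helpers such as time_after() matter, the following is a minimal userspace sketch of a wraparound-safe tick comparison on a 32-bit counter. It follows the same signed-difference idea but is not the kernel's macro; the name tick_after() is hypothetical, and the result is only meaningful while the two values are less than 2^31 ticks apart.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return nonzero if tick count a is later than tick count b, even if
 * the 32-bit counter wrapped between the two samples. The unsigned
 * subtraction wraps modulo 2^32; reinterpreting the result as signed
 * gives the direction of the difference (relies on the usual
 * two's-complement conversion).
 */
static inline int tick_after(uint32_t a, uint32_t b)
{
        return (int32_t)(b - a) < 0;
}
```

A plain `a > b` test gets the wrapped case wrong: a counter value of 5 taken just after the wrap is later than 0xFFFFFFF0, yet compares as smaller. Code that only ever goes through a helper like this keeps working regardless of how many ticks are added per interrupt.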
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On 7/14/05, Eric St-Laurent <[EMAIL PROTECTED]> wrote:
> On Thu, 2005-07-14 at 17:24 -0700, Linus Torvalds wrote:
> > Trust me. When I say that the right thing to do is to just have a fixed
> > (but high) HZ value, and just changing the timer rate, I'm -right-.
> Of course you are, jiffies are simple and efficient.
> If i sum-up the discussion from my POV:
> - use a 32-bit tick counter on 32-bit platforms and use a 64-bit counter
> on 64-bit platforms

If the 64bit counter doesn't have any overhead then sure.

> - keep the constant HZ=1000 (mS resolution) on 32-bit platforms

Which HZ is that? CONFIG_JIFFIES_HZ or CONFIG_FIXED_PIT_HZ ?
I think you meant CONFIG_JIFFIES_HZ, which I think even for 32bit
counters could go up to 1e4 to 5e4, with some patching going on in some
places of course.

> - remove the assumption that timer interrupts and jiffies are 1:1 thing
> (jiffies may be incremented by >1 ticks at timer interrupt)

Yes, maybe nuke CONFIG_HZ and replace it with CONFIG_JIFFIES_HZ and
CONFIG_(FIXED|DEFAULT|DYNAMIC)_PIT_HZ . Starting with just
CONFIG_FIXED_PIT_HZ, add others as needed. The extreme might be to also
just nuke HZ and replace it with JHZ and PHZ, or whatever, so that
people are *crystal* clear about the difference.

> - determine jiffies_increment at boot

So CONFIG__PIT_HZ could be a per boot time thing maybe.
So you'd have CONFIG_DEFAULT_PIT_HZ if it was a per boot or runtime
thing, CONFIG_DYNAMIC_PIT_HZ if it was changeable as the system is
running -- like windows, and CONFIG_FIXED_PIT_HZ if it is a compile
time constant. Or something like that?

> - have a slow clock mode to help power management (adjust
> jiffies_increment by the slowdown factor)

CONFIG_DYNAMIC_PIT_HZ unless its overhead is so low that everyone just
wants it by default.

> - it may be useful to bump up HZ to 1e6 (uS res.) or 1e9 (nS res.) on
> 64-bit platforms, if there are benefits such as better accuracy during
> time units conversions or if a higher frequency timer hardware is
> available/viable.

Too high starts to cause other troubles. I think that the real time
people want 10uS scheduling, but even the ipipe and rt-preempt has
18us-70uS delays at times IIRC. So 5e4 to 1e5 is about the extreme end
of the road for CONFIG_JIFFIES_HZ . I think even long term that 1e5 to
1e6 would be extreme because of speed of light issues, etc. Hpet is
only 1.4e7 IIRC.

I think that you should start with:
1) CONFIG_FIXED_PIT_HZ=50 CONFIG_JIFFIES_HZ=2000
2) try it out and fix any bugs, send the fixes to Linus to see how
   much he bitches.
3) if you still need CONFIG_JIFFIES_HZ to be larger, double it and
   then goto 2.
4) enjoy your higher frequency jiffies

I bet that even going to somewhere between 2e3 and 1e5 will make you
want to change a few things for performance and sanity reasons. So I'd
focus on that before I even thought about 1e6 through 1e10 . Plus I
think the interest level really falls off when going that extreme.

Just making JIFFIES_HZ != PIT_HZ will require patches. Dynamic pit hz
or lazy update of jiffies based on tsc/hpet/other are other patches.

> - it may be also useful to bump HZ on -RT (Real-time) kernels,

Yes, they sound like they want JIFFIES_HZ to be 1e3 through 1e5
depending on task. They also want hpet (or other), vertical retrace
interrupts (so xsync works for video), perhaps a nist mini atomic
clock, and a few other goodies AFAIK.

> -HRT (High-resolution timers support).

Yes, tsc or hpet or whatever users might benefit in several ways.
1) both tsc and hpet might be able to bump up to a more accurate value
   on entry to idle and then test to see if anything got scheduled.
2) hpet can set one shot timers for the next upcoming event on idle if
   it's sooner than when the PIT interrupt is supposed to come in. Of
   course update the jiffies when that hpet interrupt comes.
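The suggested starting point (CONFIG_FIXED_PIT_HZ=50 with CONFIG_JIFFIES_HZ=2000) can be turned into numbers with a small userspace sketch. It assumes the jiffies rate is an exact multiple of the interrupt rate and that the counter is 32 bits wide; both helper names are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Jiffies to add on every timer interrupt when the two rates differ. */
static inline unsigned long jiffies_per_irq(unsigned long jiffies_hz,
                                            unsigned long pit_hz)
{
        return jiffies_hz / pit_hz;
}

/* Whole seconds until a 32-bit jiffies counter wraps at a given rate. */
static inline uint64_t wrap_seconds_32bit(unsigned long jiffies_hz)
{
        return (UINT64_C(1) << 32) / jiffies_hz;
}
```

With these assumptions, 2000 jiffies per second at 50 interrupts per second means adding 40 jiffies per interrupt. A 32-bit counter wraps after roughly 49.7 days at 1000 Hz but after slightly less than one day at 5e4 Hz, which is one reason higher jiffies rates would need patching in places that assume a long wrap period.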
> Users of those kernel are willing
> to pay the cost of the overhead to have better resolution

Yes, realtime users with something like hpet might not vary the pit
timer, but place hooks to update the jiffies between pit interrupts,
like idle, scheduler (task switch), etc. And use the hpet one shot
interrupts as well.

> - avoid direct usage of the jiffies variable, instead use jiffies()
> (inline or MACRO), IMO monotonic_clock() would be a better name

I don't know. I think it could remain a variable; you usually just want
it to be a light-weight memory read, not a call out to an hpet and then
a math conversion, or a call out to tsc that then has to know whether
the tsc represents work or time, and if the cpu has been slowed for
power save reasons etc etc etc.

I think you want a symbol exported gpl of something like
void force_update_jiffies(void);
that you can call in different hook locations to force the update of
jiffies from non-interrupt sources. Actually you might want more than
one version of that function or have it take an argument, because some
people might want to be super lazy and only update it when they enter
or leave idle, while
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, Jul 15, 2005 at 07:57:01PM +0200, Andi Kleen wrote:
> > I wouldn't say it is totally impossible. There are ways in which Linux can work
> > without a reliable Local APIC timer. One option being - make one CPU that gets
> > the external timer interrupt multicast an IPI to all the other CPUs that
> > want to get periodic timer interrupt.
>
> That doesn't mix very well with variable ticks. And I believe
> we really need them.

It should work with variable ticks as we can easily add/remove CPUs
from this multicast destination.

> For no tick in idle you need a timer for each CPU that
> can be programmed to a reasonably long interval to wake you
> up after longer idleness. And all that
> should work without bouncing cache lines around all the time
> because that doesn't work on larger systems.

And each CPU timer itself does very few writes in terms of caches.
The two things that happen here are scheduler_tick and
kstat_accounting. Or am I missing something? Did you have anything
specific in mind wrt bouncing cachelines?

Thanks,
Venki
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, Jul 15, 2005 at 06:54:30PM +0100, Maciej W. Rozycki wrote:
> On Fri, 15 Jul 2005, Venkatesh Pallipadi wrote:
>
> > I wouldn't say it is totally impossible. There are ways in which Linux can work
> > without a reliable Local APIC timer. One option being - make one CPU that gets
> > the external timer interrupt multicast an IPI to all the other CPUs that
> > want to get periodic timer interrupt.
>
> That's like scratching your left ear with your right hand -- broadcasting
> that external timer interrupt in the first place is more straightforward.
> If you want to exclude CPUs from the list of receivers, just use the
> logical destination mode appropriately.
>

Well.. I tried a patch to do the broadcast thing a couple of months ago
and failed to convince everyone :(.

Further, it doesn't work well if you want to exclude some CPUs from the
list of receivers. Logical destination is simple only for less than 8
CPUs. Beyond that, with clustered or physical configuration, it is
difficult.

Thanks,
Venki
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
> That's like scratching your left ear with your right hand -- broadcasting
> that external timer interrupt in the first place is more straightforward.
> If you want to exclude CPUs from the list of receivers, just use the
> logical destination mode appropriately.

The problem with that is that it would need regular synchronizations
of all CPUs to coordinate this. Not good for scalability and I believe
the fundamentally wrong way to do this.

-Andi
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
> I wouldn't say it is totally impossible. There are ways in which Linux can work
> without a reliable Local APIC timer. One option being - make one CPU that gets
> the external timer interrupt multicast an IPI to all the other CPUs that
> want to get periodic timer interrupt.

That doesn't mix very well with variable ticks. And I believe
we really need them.

For no tick in idle you need a timer for each CPU that
can be programmed to a reasonably long interval to wake you
up after longer idleness. And all that
should work without bouncing cache lines around all the time
because that doesn't work on larger systems.

-Andi
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 15 Jul 2005, Venkatesh Pallipadi wrote:

> I wouldn't say it is totally impossible. There are ways in which Linux can work
> without a reliable Local APIC timer. One option being - make one CPU that gets
> the external timer interrupt multicast an IPI to all the other CPUs that
> want to get periodic timer interrupt.

 That's like scratching your left ear with your right hand -- broadcasting
that external timer interrupt in the first place is more straightforward.
If you want to exclude CPUs from the list of receivers, just use the
logical destination mode appropriately.

  Maciej
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, Jul 15, 2005 at 07:02:24PM +0200, Andi Kleen wrote:
>
> At least on multi processor systems LAPIC has to work anyways (otherwise
> you cannot schedule other CPUs), so it is fine to use there.
>
> AFAIK there are no x86 CPUs right now that do both C3
> and SMP. If they ever do then they will need to keep the
> LAPIC ticking in C3.
>
> This has nothing even to do with advanced power saving,
> but is pretty much a hard requirement for Linux (and I would
> be surprised if it wasn't one for other OS too). Without it
> scheduling and local timers on APs will not work at all.
>
> In theory it could be replaced with HPET if HPET had enough banks (one
> for each CPU - most implementations today usually only have 2 or 4), but
> that would severely limit scalability for lazy tick schemes because
> they would depend on a common resource in the southbridge. Also the
> max number of banks needed on a big system would be huge
> (128? 256?) because you couldn't have more CPUs than that.
>
> With PIC only it's absolutely impossible.
>

I wouldn't say it is totally impossible. There are ways in which Linux
can work without a reliable Local APIC timer. One option being - make
one CPU that gets the external timer interrupt multicast an IPI to all
the other CPUs that want to get periodic timer interrupt.

Thanks,
Venki
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Quoting Lee Revell <[EMAIL PROTECTED]>:

> On Thu, 2005-07-14 at 22:54 -0600, Zwane Mwaikambo wrote:
> > On Fri, 15 Jul 2005, Lee Revell wrote:
> >
> > > On Fri, 2005-07-15 at 14:08 +1000, Con Kolivas wrote:
> > > > Audio did show slightly larger max latencies but nothing that would be of
> > > > significance.
> > > >
> > > > On video, maximum latencies are only slightly larger at HZ 250, all the
> > > > desired cpu was achieved, but the average latency and number of missed
> > > > deadlines was significantly higher.
> > >
> > > Because audio timing is driven by the soundcard interrupt while video,
> > > like MIDI, relies heavily on timers.
> >
> > In interbench it's not driven by a soundcard interrupt.
> >
>
> OK. Con, any idea why video is so much more affected than audio?

In the emulation, video vs audio is 40% cpu vs 5% cpu, 16.7ms frames
instead of 50ms frames. When your cpu requirements are higher and your
frames are shorter, the likelihood of dropping a frame, especially
under load, will skyrocket as your timing granularity gets coarser.
Clearly 250HZ is not as good as 1000HZ for this and, I assume, for your
MIDI example.

Cheers,
Con
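A rough way to see why shorter frames suffer more is to compare one tick period with the frame period. The sketch below treats a full tick as the worst-case timer rounding error, which is a simplifying assumption and not how interbench measures anything; the function name is hypothetical.

```c
#include <assert.h>

/*
 * One tick period expressed as a whole-number percentage of the frame
 * period (truncated). A larger value means timer rounding eats a
 * bigger share of each frame's time budget.
 */
static unsigned long tick_percent_of_frame(unsigned long hz,
                                           unsigned long frame_us)
{
        unsigned long tick_us = 1000000UL / hz;

        return tick_us * 100UL / frame_us;
}
```

Under that assumption, a 4 ms tick at HZ=250 is about 23% of a 16.7 ms video frame but only 8% of a 50 ms audio frame. At HZ=1000 the figures drop to about 5% and 2%.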
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
"Brown, Len" <[EMAIL PROTECTED]> writes:

> >That's an APIC bug.
> >When Intel originally released the APIC (some
> >thirteen years ago) they stated it should be used as a source of the timer
> >interrupt instead of the 8254. There is no excuse for changing the
> >behaviour after so many years. So if you are on a broken system, you may
> >want to work around the bug, but it shouldn't impact good systems.
>
>
> I'm perfectly happy having Linux optimize itself for the hardware
> that it is running on. However, the (harsh, I know) reality is that
> systems with a reliable LAPIC timer in the face of C3 do not exist
> today, and probably never will. (don't shoot me, it wasn't my design
> decision, I'm just the messenger:-) Further, I expect that power saving
> features, such as C3, will become more important and deployed more
> widely in the future, rather than less widely.

At least on multi processor systems LAPIC has to work anyways (otherwise
you cannot schedule other CPUs), so it is fine to use there.

AFAIK there are no x86 CPUs right now that do both C3
and SMP. If they ever do then they will need to keep the
LAPIC ticking in C3.

This has nothing even to do with advanced power saving,
but is pretty much a hard requirement for Linux (and I would
be surprised if it wasn't one for other OS too). Without it
scheduling and local timers on APs will not work at all.

In theory it could be replaced with HPET if HPET had enough banks (one
for each CPU - most implementations today usually only have 2 or 4), but
that would severely limit scalability for lazy tick schemes because
they would depend on a common resource in the southbridge. Also the
max number of banks needed on a big system would be huge
(128? 256?) because you couldn't have more CPUs than that.

With PIC only it's absolutely impossible.

>
> So, the 13-year-old design advice will continue to apply to
> 13-year-old systems, but newer systems with C3 and HPET
> should be using them.

Problem is that many systems, even though they have HPET in the
hardware, don't advertise it in ACPI because Windows doesn't use it
yet. And Linux can't use it then either because it doesn't know where
the registers are located (and guessing is too risky).

This means they are stuck with the old PIC, which makes
lazy ticks very unpleasant.

I think this is a big problem. We cannot wait until Redmond
brings their new release out with this.

-Andi
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 22:54 -0600, Zwane Mwaikambo wrote:
> On Fri, 15 Jul 2005, Lee Revell wrote:
>
> > On Fri, 2005-07-15 at 14:08 +1000, Con Kolivas wrote:
> > > Audio did show slightly larger max latencies but nothing that would be of
> > > significance.
> > >
> > > On video, maximum latencies are only slightly larger at HZ 250, all the
> > > desired cpu was achieved, but the average latency and number of missed
> > > deadlines was significantly higher.
> >
> > Because audio timing is driven by the soundcard interrupt while video,
> > like MIDI, relies heavily on timers.
>
> In interbench it's not driven by a soundcard interrupt.
>

OK. Con, any idea why video is so much more affected than audio?

Lee
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, Jul 15, 2005 at 12:33:15PM -0400, Brown, Len wrote:
> So, the 13-year-old design advice will continue to apply to
> 13-year-old systems, but newer systems with C3 and HPET
> should be using them.

Last I looked HPET isn't everywhere yet (absent from nforce4 mainboards for example, but that might be a Linux issue as I was told Windows can see one).
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 2005-07-15 at 05:46, Bill Davidsen wrote:
> Fernando Lopez-Lezcano wrote:
> >On Thu, 2005-07-14 at 16:49, Linus Torvalds wrote:
> >>On Thu, 14 Jul 2005, Lee Revell wrote:
> >>>And I'm incredibly frustrated by this insistence on hard data when it's
> >>>completely obvious to anyone who knows the first thing about MIDI that
> >>>HZ=250 will fail in situations where HZ=1000 succeeds.
> >>>
> >>Ok, guys. How many people have this MIDI thing? How many of you can't be
> >>bothered to set the default to suit your usage?
> >>
> >>>It's straight from the MIDI spec. Your argument is pretty close to "the
> >>>MIDI spec is wrong, no one can hear the difference between 1ms and 4ms".
> >>>
> >>No.
> >>
> >>YOUR argument is "nobody else matters, only I do".
> >>
> >>MY argument is that this is a case of give and take.
> >
> >Take from "few" multimedia users, give to "many" laptop users. Where
> >"few" and "many" are not very well defined quantities, but obviously
> >"many" > "few" :-)
> >
> Of course that assumes that these are not the same users, which clearly
> isn't true in all cases.

Yes, indeed, the two sets do intersect, I belong to both. I can't speak for everybody in the "multimedia" set but in my musical work (and I suspect many others share this view) I would rather not sacrifice timing precision for battery life. Of course there are scenarios where the opposite can be true, but they are fewer by far (IMHO).

-- Fernando
RE: [PATCH] i386: Selectable Frequency of the Timer Interrupt
>That's an APIC bug.
>When Intel originally released the APIC (some
>thirteen years ago) they stated it should be used as a source of the timer
>interrupt instead of the 8254. There is no excuse for changing the
>behaviour after so many years. So if you are on a broken system, you may
>want to work around the bug, but it shouldn't impact good systems.

I'm perfectly happy having Linux optimize itself for the hardware that it is running on. However, the (harsh, I know) reality is that systems with a reliable LAPIC timer in the face of C3 do not exist today, and probably never will. (don't shoot me, it wasn't my design decision, I'm just the messenger:-) Further, I expect that power saving features, such as C3, will become more important and deployed more widely in the future, rather than less widely.

So, the 13-year-old design advice will continue to apply to 13-year-old systems, but newer systems with C3 and HPET should be using them.

-Len
Re: [OT] high precision hardware (was Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt)
On Fri, 2005-07-15 at 08:57 -0700, Christoph Lameter wrote:
> Try HPET which is pretty standard these days.

Really? None of my machines have it. I suspect lots of "embeddable" systems don't either.

Lee
Re: [OT] high precision hardware (was Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt)
On Fri, 15 Jul 2005, Paul Jakma wrote:

> On Thu, 14 Jul 2005, Christoph Lameter wrote:
>
> > Linux can already provide a response time within < 3 usecs from user space
> > using f.e. the Altix RTC driver which can generate an interrupt that then
> > sends a signal to an application. The Altix RTC clock is supported via POSIX
> > timer syscalls and can be accessed using CLOCK_SGI_CYCLE. This has been
> > available in Linux since last fall and events can be specified with 50
> > nanoseconds accuracy.
>
> Out of curiosity, are there any cheap and 'embeddable' linux supported
> architectures which support such response times (User or kernel space)?

Well, just implement the proper hooks for the HPET so that you can use CLOCK_HPET from user space.

> Input comes in at anywhere from 6kHz to 100kHz (variable), (T0 say),
> requirement is to assert an output line Ta seconds after each T0, Ta needs to
> be accurate to about 6us in the extreme case (how long the output is held has
> similar accuracy requirement).

Well the interrupt latency depends on many things in the linux kernel. Worst case is much greater than 6us. You probably need the RT patches as well.

> What kind of hardware is capable of this? Even in microcontroller space it's
> difficult to do (eg looked at some ARM microcontrollers, which still have
> several usec of interrupt latency - even with no OS, still likely cant use
> timers and interrupts.).

Try HPET which is pretty standard these days.
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On 7/15/05, Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> On Fri, 15 Jul 2005, Jesper Juhl wrote:
> >
> > It's buggy, that I know. setting kernel_hz (the new boot parameter) to
> > 250 causes my system clock to run at something like 4-5 times normal
> > speed
>
> 4 times normal. You don't actually make the timer interrupt happen at
> 250Hz, so the timer will be programmed to run at the full 1kHz.
>
Right, that's the basic problem. I increased jiffies at a higher rate but didn't slow the timer interrupt down at the same time.

> You also need to actually change the LATCH define (in
> include/linux/jiffies.h) to take this into account (there might be
> something else too).
>
[...]
> and you might be getting closer.
>
> Of course, you need to make sure that LATCH is used only after
> jiffies_increment is set up. See "setup_pit_timer(void)" in
> arch/i386/kernel/timers/timer_pit.c for more details.
>
Thank you for all the pointers and hints. This is a new area of code for me, so I'll need some time to poke around - the above helps a lot.
Unfortunately I won't have any time to work on this today, but I'll see if I can get a working implementation together tomorrow.

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
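[Editorial sketch, not part of the original mail.] The arithmetic behind the "4 times normal" clock can be worked through in plain userspace C. This is only an illustration under stated assumptions: the helper names below are hypothetical and are not kernel functions, the jiffies rate is assumed fixed at 1000 as in the proposal discussed here, and 1193182 Hz is the usual 8254 PIT input clock.

```c
/* PIT input clock in Hz (the 8254's nominal 1.193182 MHz). */
#define PIT_TICK_RATE 1193182UL
/* Assumption from the thread: HZ stays constant at 1000. */
#define FIXED_HZ 1000UL

/* Hypothetical helper: how much to add to jiffies per timer interrupt
 * when the interrupt really fires irq_hz times per second. */
static unsigned long jiffies_increment_for(unsigned long irq_hz)
{
    return FIXED_HZ / irq_hz;
}

/* Hypothetical helper: PIT reload value for an interrupt rate of irq_hz,
 * rounded to the nearest integer (the same rounding style as LATCH). */
static unsigned long pit_latch_for(unsigned long irq_hz)
{
    return (PIT_TICK_RATE + irq_hz / 2) / irq_hz;
}

/* How fast the clock runs relative to real time if the PIT fires at
 * programmed_hz while each interrupt adds 'increment' to jiffies. */
static unsigned long clock_speedup(unsigned long programmed_hz,
                                   unsigned long increment)
{
    return programmed_hz * increment / FIXED_HZ;
}
```

With the increment set to 4 but the PIT still programmed for 1 kHz, clock_speedup(1000, 4) gives 4, which matches the symptom described above; reprogramming the PIT with pit_latch_for(250) brings the ratio back to 1.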
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Fernando Lopez-Lezcano wrote:
> On Thu, 2005-07-14 at 16:49, Linus Torvalds wrote:
> > On Thu, 14 Jul 2005, Lee Revell wrote:
> > > And I'm incredibly frustrated by this insistence on hard data when it's
> > > completely obvious to anyone who knows the first thing about MIDI that
> > > HZ=250 will fail in situations where HZ=1000 succeeds.
> >
> > Ok, guys. How many people have this MIDI thing? How many of you can't be
> > bothered to set the default to suit your usage?
> >
> > > It's straight from the MIDI spec. Your argument is pretty close to "the
> > > MIDI spec is wrong, no one can hear the difference between 1ms and 4ms".
> >
> > No.
> >
> > YOUR argument is "nobody else matters, only I do".
> >
> > MY argument is that this is a case of give and take.
>
> Take from "few" multimedia users, give to "many" laptop users. Where
> "few" and "many" are not very well defined quantities, but obviously
> "many" > "few" :-)

Of course that assumes that these are not the same users, which clearly isn't true in all cases.

--
bill davidsen <[EMAIL PROTECTED]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Gwe, 2005-07-15 at 00:19, Linus Torvalds wrote:
> That's not what "jiffies" are about. If you want accurate time, use
> something else, like gettimeofday. The timeouts are _only_ relevant on the
> scale of a timer interrupt, since by definition that's what we're waiting
> for.

Ok makes sense - that's the same reason I wanted jiffies() - the timer interrupt resolution might be useless at the given time (eg seconds).

It does mean people using while loops testing against jiffies are generally wrong still.

Alan
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Lee Revell wrote:
> On Thu, 2005-07-14 at 16:49 -0700, Linus Torvalds wrote:
> > YOUR argument is "nobody else matters, only I do".
> >
> > MY argument is that this is a case of give and take.
>
> I wouldn't say that. I do agree with you that HZ=1000 for everyone is
> problematic, I just feel that a reasonable compromise is CONFIG_HZ with
> the default left at 1000.

I would just say that changing something like this now is probably not a great idea, while allowing a config option for 100/250/1000 and maybe even 2000 won't make everyone happy, but seems to allow everyone to make themselves happy.

--
bill davidsen <[EMAIL PROTECTED]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
[OT] high precision hardware (was Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt)
On Thu, 14 Jul 2005, Christoph Lameter wrote:

> Linux can already provide a response time within < 3 usecs from user space
> using f.e. the Altix RTC driver which can generate an interrupt that then
> sends a signal to an application. The Altix RTC clock is supported via POSIX
> timer syscalls and can be accessed using CLOCK_SGI_CYCLE. This has been
> available in Linux since last fall and events can be specified with 50
> nanoseconds accuracy.

Out of curiosity, are there any cheap and 'embeddable' linux supported architectures which support such response times (User or kernel space)?

Would be very interested for a hobby project I'm planning (programmable digital ignition) which requires about 10usec resolution +/- 6us accuracy response times. At the moment it looks like this task will have to be done on a dedicated microcontroller.

Input comes in at anywhere from 6kHz to 100kHz (variable), (T0 say), requirement is to assert an output line Ta seconds after each T0, Ta needs to be accurate to about 6us in the extreme case (how long the output is held has similar accuracy requirement).

What kind of hardware is capable of this? Even in microcontroller space it's difficult to do (eg looked at some ARM microcontrollers, which still have several usec of interrupt latency - even with no OS, still likely can't use timers and interrupts.).

regards,
--
Paul Jakma [EMAIL PROTECTED] [EMAIL PROTECTED] Key ID: 64A2FF6A
Fortune:
The church saves sinners, but science seeks to stop their manufacture.
-- Elbert Hubbard
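[Editorial sketch, not part of the original mail.] To put the stated input rates next to a jiffies-based timer, the arithmetic can be written out as below. The helper names are hypothetical and exist only for this illustration; the 6 kHz and 100 kHz figures come from the message, and HZ=1000 is assumed as the tick rate.

```c
/* Period of a periodic input in nanoseconds (integer division truncates). */
static unsigned long period_ns(unsigned long freq_hz)
{
    return 1000000000UL / freq_hz;
}

/* Number of input events that arrive during a single timer tick
 * when the tick runs at tick_hz. */
static unsigned long events_per_tick(unsigned long freq_hz,
                                     unsigned long tick_hz)
{
    return freq_hz / tick_hz;
}
```

At 100 kHz the input period is 10000 ns, so a 1 ms tick spans 100 input events; the tick is two orders of magnitude too coarse for the roughly 6 us accuracy asked for, which is presumably why the replies point at a dedicated high-resolution timer rather than at HZ.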
RE: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 14 Jul 2005, Brown, Len wrote:

> >Of course using APIC internal timers is generally the best idea on SMP,
> >but they may have had reasons to avoid them (it's not an ISA interrupt, so
> >it could have been simply out of question in the initial design).
>
> Best? No.
>
> Local APIC timers are based on a clock which on many processors will
> STOP when the processor enters power saving idle states, such as C3.
> So the LAPIC timer will not accurately reflect how much time
> has passed across entry/exit from idle.

That's an APIC bug. When Intel originally released the APIC (some thirteen years ago) they stated it should be used as a source of the timer interrupt instead of the 8254. There is no excuse for changing the behaviour after so many years. So if you are on a broken system, you may want to work around the bug, but it shouldn't impact good systems.

Maciej
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Hi,

Bill Davidsen wrote:
> Do you actually have something against tickless, or just don't think it
> can be done in reasonable time?

You can do it in small steps. When you have that jiffies_increment variable, you can add code to dynamically adjust it at runtime -- just reprogram the system timer (which may not be cheap).

After you've done *that*, you can teach the add_timer code to optionally adjust jiffies_increment when demand changes; add an estimate on timer tick cost vs. reprogramming cost (which could return "always" when you're running UML); you might want to take user prefs into account ("always reprogram if the timeout would arrive more than 10 msec late, because otherwise my Doom3 game lags too much").

There you are. Tickless, and nobody even notices.

--
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | [EMAIL PROTECTED]
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
- -
Caesar had his Brutus -- Charles the First, his Cromwell -- and George the Third ("Treason!" cried the Speaker) -- may profit by their example. If this be treason, make the most of it.
-- Patrick Henry
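[Editorial sketch, not part of the original mail.] The decision described above, weighing tick cost against reprogramming cost with a user-set lateness limit, could look roughly like this. Everything here is hypothetical: the struct, its fields, and should_reprogram() are invented for illustration and do not correspond to any kernel interface, and the cost figures would have to come from measurement.

```c
#include <stdbool.h>

/* Hypothetical tunables for the policy sketched in the message. */
struct tick_policy {
    unsigned long max_late_ms;       /* user preference: tolerated lateness */
    unsigned long tick_cost_ns;      /* estimated cost of one timer interrupt */
    unsigned long reprogram_cost_ns; /* estimated cost of reprogramming the timer */
};

/* Decide whether to reprogram the timer for a pending timeout.
 * late_ms:     how late the timeout would fire with the current setting
 * ticks_saved: interrupts avoided if the timer is reprogrammed */
static bool should_reprogram(const struct tick_policy *p,
                             unsigned long late_ms,
                             unsigned long ticks_saved)
{
    /* The user preference wins: never be later than the stated limit. */
    if (late_ms > p->max_late_ms)
        return true;
    /* Otherwise reprogram only when the saved ticks outweigh the cost. */
    return ticks_saved * p->tick_cost_ns > p->reprogram_cost_ns;
}
```

The "always" case mentioned for UML would correspond to a reprogram_cost_ns of zero combined with at least one saved tick.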
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, Jul 14, 2005 at 05:24:39PM -0700, Linus Torvalds wrote:
> HOWEVER. I bet that somebody who really really cares (hint hint) could
> easily make HZ be 1000, and then dynamically tweak the divisor at bootup
> to be either 1000, 250, or 100, and then increment "jiffies" by 1, 4 or
> 10.

Wouldn't this be better suited to a VST like implementation, but instead of using VST to dynamically adjust the timer divisor, it operates in a "fixed" mode?

(I'm arguing this way because ARM has VST merged already, and there are no changes required to the core kernel code to achieve this.)

--
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, Jul 14, 2005 at 05:42:15PM -0700, Linus Torvalds wrote:
> So this is why I so strongly argue that we should have a constant HZ, but
> a dynamic _increment_ of "jiffies". Nobody (obviously) depends on jiffies
> being constant, so it's ok to increment jiffies by pretty much any value.

I agree. Isn't this exactly what HZ=1000 with VST achieves? We know this works already...

> But I really wouldn't be surprised if the bogomips calibration loop was
> really the only thing that needed some small tweaking for increments of
> other than one.

Having run VST on ARM, VST must be disabled until the bogomips calibrations have completed - I suspect VST requires some sort of enable/disable counted system like we do for interrupts and the hlt thing, so that the hotplug CPU code can do its bogomips calibration appropriately.

--
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Linus Torvalds <[EMAIL PROTECTED]> writes:

> Now, if somebody wants to make nicer helper functions so that you can say
>
> timeout = ms_from_now(500);

We already have something very similar:

timeout = jiffies + msecs_to_jiffies(500);

;)

Gerd

--
panic("it works"); /* avoid being flooded with debug messages */
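[Editorial sketch, not part of the original mail.] Why writing timeouts in milliseconds keeps them independent of the tick rate can be shown with a small userspace model. model_msecs_to_jiffies() is a hypothetical stand-in written for this note, not the kernel's msecs_to_jiffies(); it takes the tick rate as a parameter and rounds up so that a timeout is never shorter than requested.

```c
/* Userspace model of a milliseconds-to-ticks conversion.
 * hz is the tick rate; the result is rounded up to a whole tick. */
static unsigned long model_msecs_to_jiffies(unsigned long ms,
                                            unsigned long hz)
{
    return (ms * hz + 999) / 1000;
}

/* Real duration in milliseconds that 'ticks' ticks represent at rate hz. */
static unsigned long model_jiffies_to_msecs(unsigned long ticks,
                                            unsigned long hz)
{
    return ticks * 1000 / hz;
}
```

A 500 ms timeout becomes 500, 125, or 50 ticks at 1000, 250, or 100 Hz, yet converts back to 500 ms in every case, whereas a hard-coded "500 jiffies" would mean 0.5 s, 2 s, or 5 s depending on the configuration.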
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Linus Torvalds [EMAIL PROTECTED] writes: Now, if somebody wants to make nicer helper functions so that you can say timeout = ms_from_now(500); We already have something very simliar: timeout = jiffies + msecs_to_jiffies(500); ;) Gerd -- panic(it works); /* avoid being flooded with debug messages */ - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, Jul 14, 2005 at 05:42:15PM -0700, Linus Torvalds wrote: So this is why I so strongly argue that we should have a constant HZ, but a dynamic _increment_ of jiffies. Nobody (obviously) depends on jiffies being constant, so it's ok to increment jiffies by pretty much any value. I agree. Isn't this exactly what HZ=1000 with VST achieves? We know this works already... But I really wouldn't be surprised if the bogomips calibration loop was really the only thing that needed some small tweaking for increments of other than one. Having run VST on ARM, VST must be disabled while the bogomips calibrations have completed - I suspect VST requires some sort of enable/disable counted system like we do for interrupts and the hlt thing, so that the hotplug CPU code can do it's bogomips calibration appropriately. -- Russell King Linux kernel2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, Jul 14, 2005 at 05:24:39PM -0700, Linus Torvalds wrote: HOWEVER. I bet that somebody who really really cares (hint hint) could easily make HZ be 1000, and then dynamically tweak the divisor at bootup to be either 1000, 250, or 100, and then increment jiffies by 1, 4 or 10. Wouldn't this be better suited to a VST like implementation, but instead of using VST to dynamically adjust the timer divisor, it operates in a fixed mode? (I'm arguing this way because ARM has VST merged already, and all there are no changes required to the core kernel code to achieve this.) -- Russell King Linux kernel2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Hi, Bill Davidsen wrote: Do you actually have something against tickless, or just don't think it can be done in reasonable time? You can do it in small steps. When you have that jiffies_increment variable, you can add code to dynamically adjust it at runtime -- just reprogram the system timer (which may not be cheap). After you've done *that*, you can teach the add_timer code to optionally adjust jiffies_increment when demand changes; add an estimate on timer tick cost vs. reprogramming cost (which could return always when you're running UML); you might want to take user prefs into account (always reprogram if the timeout would arrive more than 10 msec late, because otherwise my Doom3 game lags too much). There you are. Tickless, and nobody even notices. -- Matthias Urlichs | {M:U} IT Design @ m-u-it.de | [EMAIL PROTECTED] Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de - - Caesar had his Brutus -- charles the First, his Cromwell -- and George the Third (Treason! cried the Speaker) -- may profit by their example. If this be treason, make the most of it. -- Patrick Henry - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 14 Jul 2005, Brown, Len wrote: Of course using APIC internal timers is generally the best idea on SMP, but they may have had reasons to avoid them (it's not an ISA interrupt, so it could have been simply out of question in the initial design). Best? No. Local APIC timers are based on a clock which on many processors will STOP when the processor enters power saving idle states, such as C3. So the LAPIC timer will not accurately reflect how much time has passed across entry/exit from idle. That's an APIC bug. When Intel originally released the APIC (some thirteen years ago) they stated it should be used as a source of the timer interrupt instead of the 8254. There is no excuse for changing the behaviour after so many years. So if you are on a broken system, you may want to work around the bug, but it shouldn't impact good systems. Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[OT] high precision hardware (was Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt)
On Thu, 14 Jul 2005, Christoph Lameter wrote: Linux can already provide a response time within 3 usecs from user space using f.e. the Altix RTC driver which can generate an interrupt that then sends a signal to an application. The Altix RTC clock is supported via POSIX timer syscalls and can be accessed using CLOCK_SGI_CYCLE. This has been available in Linux since last fall and events can be specified with 50 nanoseconds accurary. Out of curiosity, are there any cheap and 'embeddable' linux supported architectures which support such response times (User or kernel space)? Would be very interested for a hobby project I'm planning (programmable digital ignition) which requires about 10usec resolution +/- 6us accuracy response times. At moment looks like this task will have to done on a dedicated microcontroller. Input comes in at anywhere from 6kHz to 100kHz (variable), (T0 say), requirement is to assert an output line Ta seconds after each T0, Ta needs to be accurate to about 6us in the extreme case (how long the output is held has similar accuracy requirement). What kind of hardware is capable of this? Even in microcontroller space it's difficult to do (eg looked at some ARM microcontrollers, which still have several usec of interrupt latency - even with no OS, still likely cant use timers and interrupts.). regards, -- Paul Jakma [EMAIL PROTECTED] [EMAIL PROTECTED] Key ID: 64A2FF6A Fortune: The church saves sinners, but science seeks to stop their manufacture. -- Elbert Hubbard - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Lee Revell wrote: On Thu, 2005-07-14 at 16:49 -0700, Linus Torvalds wrote: YOUR argument is nobody else matters, only I do. MY argument is that this is a case of give and take. I wouldn't say that. I do agree with you that HZ=1000 for everyone is problematic, I just feel that a reasonable compromise is CONFIG_HZ with the default left at 1000. I would just say that changing something like this now is probably not a great idea, while allowing a config option for 100/250/1000 and maybe even 2000 won't make everyone happy, but seems to allow everyone to make themselves happy. -- bill davidsen [EMAIL PROTECTED] CTO TMR Associates, Inc Doing interesting things with small computers since 1979 - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Gwe, 2005-07-15 at 00:19, Linus Torvalds wrote: That's not what jiffies are about. If you want accurate time, use something else, like gettimeofday. The timeouts are _only_ relevant on the scale of a timer interrupt, since by definition that's what we're waiting for. Ok makes sense - thats the same reason I wanted jiffies() - the timer interrupt resolution might be useless at the given time (eg seconds). It does mean people using while loops testing against jiffies are generally wrong still. Alan - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Fernando Lopez-Lezcano wrote: On Thu, 2005-07-14 at 16:49, Linus Torvalds wrote: On Thu, 14 Jul 2005, Lee Revell wrote: And I'm incredibly frustrated by this insistence on hard data when it's completely obvious to anyone who knows the first thing about MIDI that HZ=250 will fail in situations where HZ=1000 succeeds. Ok, guys. How many people have this MIDI thing? How many of you can't be bothered to set the default to suit your usage? It's straight from the MIDI spec. Your argument is pretty close to the MIDI spec is wrong, no one can hear the difference between 1ms and 4ms. No. YOUR argument is nobody else matters, only I do. MY argument is that this is a case of give and take. Take from few multimedia users, give to many laptop users. Where few and many are not very well defined quantities, but obviously many few :-) Of course that assumes that these are not the same users, which clearly isn't true in all cases. -- bill davidsen [EMAIL PROTECTED] CTO TMR Associates, Inc Doing interesting things with small computers since 1979 - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On 7/15/05, Linus Torvalds [EMAIL PROTECTED] wrote: On Fri, 15 Jul 2005, Jesper Juhl wrote: It's buggy, that I know. setting kernel_hz (the new boot parameter) to 250 causes my system clock to run at something like 4-5 times normal speed 4 times normal. You don't actually make the timer interrupt happen at 250Hz, so the timer will be programmed to run at the full 1kHz. Right, that's the basic problem. I increase jiffies at a higher rate but didn't slow the timer interrupt down at the same time. You also need to actually change the LATCH define (in include/linux/jiffies.h) to take this into account (there might be something else too). [...] and you might be getting closer. Of course, you need to make sure that LATCH is used only after jiffies_increment is set up. See setup_pit_timer(void) in arch/i386/kernel/timers/timer_pit.c for more details. Thank you for all the pointers and hints. This is a new area of code for me, so I'll need some time to poke around - the above helps a lot. Unfortunately I won't have any time to work on this today, but I'll see if I can get a working implementation together tomorrow. -- Jesper Juhl [EMAIL PROTECTED] Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html Plain text mails only, please http://www.expita.com/nomime.html - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [OT] high precision hardware (was Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt)
On Fri, 15 Jul 2005, Paul Jakma wrote: On Thu, 14 Jul 2005, Christoph Lameter wrote: Linux can already provide a response time within 3 usecs from user space using f.e. the Altix RTC driver which can generate an interrupt that then sends a signal to an application. The Altix RTC clock is supported via POSIX timer syscalls and can be accessed using CLOCK_SGI_CYCLE. This has been available in Linux since last fall and events can be specified with 50 nanoseconds accurary. Out of curiosity, are there any cheap and 'embeddable' linux supported architectures which support such response times (User or kernel space)? Well, just implement the proper hooks for the HPET so that you can use CLOCK_HPET from user space. Input comes in at anywhere from 6kHz to 100kHz (variable), (T0 say), requirement is to assert an output line Ta seconds after each T0, Ta needs to be accurate to about 6us in the extreme case (how long the output is held has similar accuracy requirement). Well the interrupt latency depends on many things in the linux kernel. Worst case is much greater than 6us. You probably need the RT patches as well. What kind of hardware is capable of this? Even in microcontroller space it's difficult to do (eg looked at some ARM microcontrollers, which still have several usec of interrupt latency - even with no OS, still likely cant use timers and interrupts.). Try HPET which is pretty standard these days. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [OT] high precision hardware (was Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt)
On Fri, 2005-07-15 at 08:57 -0700, Christoph Lameter wrote: Try HPET which is pretty standard these days. Really? None of my machines have it. I suspect lots of embeddable systems don't either. Lee - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] i386: Selectable Frequency of the Timer Interrupt
> That's an APIC bug. When Intel originally released the APIC (some thirteen years ago) they stated it should be used as a source of the timer interrupt instead of the 8254. There is no excuse for changing the behaviour after so many years. So if you are on a broken system, you may want to work around the bug, but it shouldn't impact good systems.

I'm perfectly happy having Linux optimize itself for the hardware that it is running on. However, the (harsh, I know) reality is that systems with a reliable LAPIC timer in the face of C3 do not exist today, and probably never will. (don't shoot me, it wasn't my design decision, I'm just the messenger:-)

Further, I expect that power saving features, such as C3, will become more important and deployed more widely in the future, rather than less widely.

So, the 13-year-old design advice will continue to apply to 13-year-old systems, but newer systems with C3 and HPET should be using them.

-Len
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 2005-07-15 at 05:46, Bill Davidsen wrote:
> Fernando Lopez-Lezcano wrote:
> > On Thu, 2005-07-14 at 16:49, Linus Torvalds wrote:
> > > On Thu, 14 Jul 2005, Lee Revell wrote:
> > > > And I'm incredibly frustrated by this insistence on hard data when it's completely obvious to anyone who knows the first thing about MIDI that HZ=250 will fail in situations where HZ=1000 succeeds.
> > >
> > > Ok, guys. How many people have this MIDI thing? How many of you can't be bothered to set the default to suit your usage?
> > >
> > > > It's straight from the MIDI spec. Your argument is pretty close to "the MIDI spec is wrong, no one can hear the difference between 1ms and 4ms".
> > >
> > > No.
> > >
> > > YOUR argument is "nobody else matters, only I do".
> > >
> > > MY argument is that this is a case of give and take.
> >
> > Take from "few" multimedia users, give to "many" laptop users. Where "few" and "many" are not very well defined quantities, but obviously "many" > "few" :-)
>
> Of course that assumes that these are not the same users, which clearly isn't true in all cases.

Yes, indeed, the two sets do intersect, I belong to both. I can't speak for everybody in the multimedia set but in my musical work (and I suspect many others share this view) I would rather not sacrifice timing precision for battery life. Of course there are scenarios where the opposite can be true, but are fewer by far (IMHO).

-- Fernando
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, Jul 15, 2005 at 12:33:15PM -0400, Brown, Len wrote:
> So, the 13-year-old design advice will continue to apply to 13-year-old systems, but newer systems with C3 and HPET should be using them.

Last I looked HPET isn't everywhere yet (absent from nforce4 mainboards for example, but that might be a Linux issue as I was told Windows can see one).
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 22:54 -0600, Zwane Mwaikambo wrote:
> On Fri, 15 Jul 2005, Lee Revell wrote:
> > On Fri, 2005-07-15 at 14:08 +1000, Con Kolivas wrote:
> > > Audio did show slightly larger max latencies but nothing that would be of significance.
> > >
> > > On video, maximum latencies are only slightly larger at HZ 250, all the desired cpu was achieved, but the average latency and number of missed deadlines was significantly higher.
> >
> > Because audio timing is driven by the soundcard interrupt while video, like MIDI, relies heavily on timers.
>
> In interbench it's not driven by a soundcard interrupt.

OK. Con, any idea why video is so much more affected than audio?

Lee
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Brown, Len [EMAIL PROTECTED] writes:
> > That's an APIC bug. When Intel originally released the APIC (some thirteen years ago) they stated it should be used as a source of the timer interrupt instead of the 8254. There is no excuse for changing the behaviour after so many years. So if you are on a broken system, you may want to work around the bug, but it shouldn't impact good systems.
>
> I'm perfectly happy having Linux optimize itself for the hardware that it is running on. However, the (harsh, I know) reality is that systems with a reliable LAPIC timer in the face of C3 do not exist today, and probably never will. (don't shoot me, it wasn't my design decision, I'm just the messenger:-)
>
> Further, I expect that power saving features, such as C3, will become more important and deployed more widely in the future, rather than less widely.

At least on multi processor systems LAPIC has to work anyways (otherwise you cannot schedule other CPUs), so it is fine to use there. AFAIK there are no x86 CPUs right now that do both C3 and SMP. If they ever do then they will need to keep the LAPIC ticking in C3. This has nothing even to do with advanced power saving, but is pretty much a hard requirement for Linux (and I would be surprised if it wasn't one for other OS too). Without it scheduling and local timers on APs will not work at all.

In theory it could be replaced with HPET if HPET had enough banks (one for each CPU - most implementations today usually only have 2 or 4), but that would severely limit scalability for lazy tick schemes because they would depend on a common resource in the southbridge. Also the max number of banks needed on a big system would be huge (128? 256?) because you couldn't have more CPUs than that. With PIC only it's absolutely impossible.

> So, the 13-year-old design advice will continue to apply to 13-year-old systems, but newer systems with C3 and HPET should be using them.

Problem is that many systems, even though they have HPET in the hardware, don't advertise it in ACPI because Windows doesn't use it yet. And Linux can't use it then either because it doesn't know where the registers are located (and guessing is too risky). This means they are stuck with the old PIC, which makes lazy ticks very unpleasant. I think this is a big problem. We cannot wait until Redmond brings their new release out with this.

-Andi
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Quoting Lee Revell [EMAIL PROTECTED]:
> On Thu, 2005-07-14 at 22:54 -0600, Zwane Mwaikambo wrote:
> > On Fri, 15 Jul 2005, Lee Revell wrote:
> > > On Fri, 2005-07-15 at 14:08 +1000, Con Kolivas wrote:
> > > > Audio did show slightly larger max latencies but nothing that would be of significance.
> > > >
> > > > On video, maximum latencies are only slightly larger at HZ 250, all the desired cpu was achieved, but the average latency and number of missed deadlines was significantly higher.
> > >
> > > Because audio timing is driven by the soundcard interrupt while video, like MIDI, relies heavily on timers.
> >
> > In interbench it's not driven by a soundcard interrupt.
>
> OK. Con, any idea why video is so much more affected than audio?

In the emulation, video vs audio is 40% cpu vs 5% cpu, 16.7ms frames instead of 50ms frames. When your cpu requirements are higher and your frames are shorter the likelihood of dropping a frame, especially under load, will skyrocket as your timing granularity decreases. Clearly 250HZ is not as good as 1000HZ for this, and I assume your midi example.

Cheers,
Con
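The effect Con describes can be put in numbers with a deliberately simplified model: a tick-driven timer can only expire on a tick boundary, so a requested delay is rounded up to a whole number of ticks. The sketch below is not kernel code; the function names are invented for illustration, and the real kernel adds further slack on top of this rounding.

```c
#include <stdint.h>

/* Length of one timer tick in microseconds for a given HZ. */
static uint32_t tick_us(uint32_t hz)
{
    return 1000000u / hz;
}

/* Round a requested delay up to a whole number of ticks. */
static uint32_t rounded_delay_us(uint32_t delay_us, uint32_t hz)
{
    uint32_t t = tick_us(hz);
    uint32_t ticks = (delay_us + t - 1) / t;   /* ceiling division */
    return ticks * t;
}

/* Extra time lost to rounding for one timed wait. */
static uint32_t overshoot_us(uint32_t delay_us, uint32_t hz)
{
    return rounded_delay_us(delay_us, hz) - delay_us;
}
```

For a 16.7ms video frame this model loses 3300us per wait at HZ=250 but only 300us at HZ=1000. A 50ms audio frame loses 2000us at HZ=250 and nothing at HZ=1000, which is a much smaller share of its period.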
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, Jul 15, 2005 at 07:02:24PM +0200, Andi Kleen wrote:
> At least on multi processor systems LAPIC has to work anyways (otherwise you cannot schedule other CPUs), so it is fine to use there. AFAIK there are no x86 CPUs right now that do both C3 and SMP. If they ever do then they will need to keep the LAPIC ticking in C3. This has nothing even to do with advanced power saving, but is pretty much a hard requirement for Linux (and I would be surprised if it wasn't one for other OS too). Without it scheduling and local timers on APs will not work at all.
>
> In theory it could be replaced with HPET if HPET had enough banks (one for each CPU - most implementations today usually only have 2 or 4), but that would severely limit scalability for lazy tick schemes because they would depend on a common resource in the southbridge. Also the max number of banks needed on a big system would be huge (128? 256?) because you couldn't have more CPUs than that. With PIC only it's absolutely impossible.

I wouldn't say it is totally impossible. There are ways in which Linux can work without a reliable Local APIC timer. One option being - make one CPU that gets the external timer interrupt multicast an IPI to all the other CPUs that want to get a periodic timer interrupt.

Thanks,
Venki
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 15 Jul 2005, Venkatesh Pallipadi wrote:
> I wouldn't say it is totally impossible. There are ways in which Linux can work without a reliable Local APIC timer. One option being - make one CPU that gets the external timer interrupt multicast an IPI to all the other CPUs that wants to get periodic timer interrupt.

That's like scratching your left ear with your right hand -- broadcasting that external timer interrupt in the first place is more straightforward. If you want to exclude CPUs from the list of receivers, just use the logical destination mode appropriately.

Maciej
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
> I wouldn't say it is totally impossible. There are ways in which Linux can work without a reliable Local APIC timer. One option being - make one CPU that gets the external timer interrupt multicast an IPI to all the other CPUs that wants to get periodic timer interrupt.

That doesn't mix very well with variable ticks. And I believe we really need them.

For no tick in idle you need a timer for each CPU that can be programmed to a reasonably long interval to wake you up after longer idleness. And all that should work without bouncing cache lines around all the time because that doesn't work on larger systems.

-Andi
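To make the "no tick in idle" requirement concrete: before a CPU goes idle it has to work out how long it may sleep, then program its own timer for that interval. The following is only an illustrative sketch of that calculation; the function name, the flat array of expiry times and the `max_sleep` cap are assumptions for the example and do not reflect how the kernel's timer wheel is actually organised.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: given the expiry times (in jiffies) of the timers
 * pending on one CPU, return how many jiffies that CPU may stay idle.
 * The result is capped at max_sleep, the longest interval the per-CPU
 * timer hardware is assumed to support.
 */
static uint32_t idle_sleep_ticks(uint32_t now, const uint32_t *expiries,
                                 size_t n, uint32_t max_sleep)
{
    uint32_t best = max_sleep;
    size_t i;

    for (i = 0; i < n; i++) {
        /* Signed difference keeps the comparison correct across wrap. */
        int32_t delta = (int32_t)(expiries[i] - now);

        if (delta <= 0)
            return 0;               /* already due: do not sleep */
        if ((uint32_t)delta < best)
            best = (uint32_t)delta;
    }
    return best;
}
```

Each CPU only looks at its own timers here, which is the property Andi asks for: no shared state needs to be touched to decide when to wake up.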
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
> That's like scratching your left ear with your right hand -- broadcasting that external timer interrupt in the first place is more straightforward. If you want to exclude CPUs from the list of receivers, just use the logical destination mode appropriately.

The problem with that is that it would need regular synchronizations of all CPUs to coordinate this. Not good for scalability and I believe the fundamentally wrong way to do this.

-Andi
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, Jul 15, 2005 at 06:54:30PM +0100, Maciej W. Rozycki wrote:
> On Fri, 15 Jul 2005, Venkatesh Pallipadi wrote:
> > I wouldn't say it is totally impossible. There are ways in which Linux can work without a reliable Local APIC timer. One option being - make one CPU that gets the external timer interrupt multicast an IPI to all the other CPUs that wants to get periodic timer interrupt.
>
> That's like scratching your left ear with your right hand -- broadcasting that external timer interrupt in the first place is more straightforward. If you want to exclude CPUs from the list of receivers, just use the logical destination mode appropriately.

Well.. I tried a patch to do the broadcast thing a couple of months ago and failed to convince everyone :(. Further, it doesn't work well if you want to exclude some CPUs from the list of receivers. Logical destination is simple only for less than 8 CPUs. Beyond that with clustered or physical configuration it is difficult.

Thanks,
Venki
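The CPU-count limit Venki mentions comes from the flat logical destination model, where the destination is an 8-bit bitmap with one bit per CPU. The sketch below shows why a set of receivers is trivial to express for a handful of CPUs and impossible beyond that. It assumes, purely for illustration, that CPU n owns bit n of the bitmap; the function name is made up.

```c
#include <stddef.h>
#include <stdint.h>

#define FLAT_MODE_MAX_CPUS 8u   /* one bit per CPU in an 8-bit bitmap */

/*
 * Hypothetical helper: build a flat-mode logical destination bitmap
 * for the given CPUs. Returns 0 on success, or -1 if a CPU number
 * cannot be represented in the 8-bit bitmap.
 */
static int flat_logical_dest(const unsigned *cpus, size_t n, uint8_t *mask_out)
{
    uint8_t mask = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (cpus[i] >= FLAT_MODE_MAX_CPUS)
            return -1;          /* needs cluster or physical addressing */
        mask |= (uint8_t)(1u << cpus[i]);
    }
    *mask_out = mask;
    return 0;
}
```

Excluding a CPU from the receivers is just a matter of leaving its bit clear, which is the simple case Maciej refers to. Once a CPU number no longer fits, a single bitmap cannot describe an arbitrary subset any more.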
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, Jul 15, 2005 at 07:57:01PM +0200, Andi Kleen wrote:
> > I wouldn't say it is totally impossible. There are ways in which Linux can work without a reliable Local APIC timer. One option being - make one CPU that gets the external timer interrupt multicast an IPI to all the other CPUs that wants to get periodic timer interrupt.
>
> That doesn't mix very well with variable ticks. And I believe we really need them.

It should work with variable ticks as we can easily add/remove CPUs from this multicast destination.

> For no tick in idle you need a timer for each CPU that can be programmed to a reasonably long interval to wake you up after longer idleness. And all that should work without bouncing cache lines around all the time because that doesn't work on larger systems.

And each CPU timer itself does very few writes in terms of caches. The two things that happen here are scheduler_tick and kstat_accounting. Or am I missing something? Did you have anything specific in mind wrt bouncing cachelines?

Thanks,
Venki
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On 7/14/05, Eric St-Laurent [EMAIL PROTECTED] wrote:
> On Thu, 2005-07-14 at 17:24 -0700, Linus Torvalds wrote:
> > Trust me. When I say that the right thing to do is to just have a fixed (but high) HZ value, and just changing the timer rate, I'm -right-.
>
> Of course you are, jiffies are simple and efficient. If I sum up the discussion from my POV:
>
> - use a 32-bit tick counter on 32-bit platforms and use a 64-bit counter on 64-bit platforms

If the 64bit counter doesn't have any overhead then sure.

> - keep the constant HZ=1000 (ms resolution) on 32-bit platforms

Which HZ is that? CONFIG_JIFFIES_HZ or CONFIG_FIXED_PIT_HZ? I think you meant CONFIG_JIFFIES_HZ, which I think for even 32bit counters could go up to 1e4 to 5e4, with some patching going on in some places of course.

> - remove the assumption that timer interrupts and jiffies are 1:1 thing (jiffies may be incremented by >1 ticks at timer interrupt)

Yes, maybe nuke CONFIG_HZ and replace it with CONFIG_JIFFIES_HZ and CONFIG_(FIXED|DEFAULT|DYNAMIC)_PIT_HZ. Starting with just CONFIG_FIXED_PIT_HZ, add others as needed. Extreme might be to also just nuke HZ and replace it with JHZ and PHZ, or whatever, so that people are *crystal* clear about the difference.

> - determine jiffies_increment at boot

So CONFIG_foo_PIT_HZ could be a per boot time thing maybe. So you'd have:
CONFIG_DEFAULT_PIT_HZ if it was a per boot or runtime thing.
CONFIG_DYNAMIC_PIT_HZ if it was changeable as the system is running -- like windows.
CONFIG_FIXED_PIT_HZ if it is a compile time constant.
Or something like that?

> - have a slow clock mode to help power management (adjust jiffies_increment by the slowdown factor)

CONFIG_DYNAMIC_PIT_HZ, unless its overhead is so low that everyone just wants it by default.

> - it may be useful to bump up HZ to 1e6 (us res.) or 1e9 (ns res.) on 64-bit platforms, if there are benefits such as better accuracy during time units conversions or if a higher frequency timer hardware is available/viable.

Too high starts to cause other troubles. I think that the real time people want 10us scheduling, but even the ipipe and rt-preempt has 18us-70us delays at times IIRC. So 5e4 to 1e5 is about the extreme end of the road for CONFIG_JIFFIES_HZ. I think even long term that 1e5 to 1e6 would be extreme because of speed of light issues, etc. Hpet is only 1.4e7 IIRC.

I think that you should start with:
1) CONFIG_FIXED_PIT_HZ=50 CONFIG_JIFFIES_HZ=2000
2) try it out and fix any bugs, send the fixes to Linus to see how much he bitches.
3) if you still need CONFIG_JIFFIES_HZ to be larger, double it and then goto 2.
4) enjoy your higher frequency jiffies

I bet that even going to somewhere between 2e3 through 1e5 will make you want to change a few things for performance and sanity reasons. So I'd focus on that before I even thought about 1e6 through 1e10. Plus I think the interest level really falls off to go that extreme. Just making JIFFIES_HZ != PIT_HZ will require patches. Dynamic pit hz or lazy update of jiffies based on tsc/hpet/other are other patches.

> - it may be also useful to bump HZ on -RT (Real-time) kernels,

Yes, they sound like they want JIFFIES_HZ to be 1e3 through 1e5 depending on task. They also want hpet (or other), vertical retrace interrupts (so xsync works for video), perhaps a nist mini atomic clock, and a few other goodies AFAIK.

> -HRT (High-resolution timers support).

Yes, tsc or hpet or whatever users might benefit in several ways.
1) both tsc and hpet might be able to bump up to a more accurate value on entry to idle and then test to see if anything got scheduled.
2) hpet can set one shot timers for the next upcoming event on idle if it's sooner than when the PIT interrupt is supposed to come in. Of course update the jiffies when that hpet interrupt comes.

> Users of those kernel are willing to pay the cost of the overhead to have better resolution

Yes, realtime users with something like hpet might not vary the pit timer, but place hooks to update the jiffies between pit interrupts like idle, scheduler (task switch), etc. And use the hpet one shot interrupts as well.

> - avoid direct usage of the jiffies variable, instead use jiffies() (inline or MACRO), IMO monotonic_clock() would be a better name

I don't know, I think it could remain a variable. You usually just want it to be a light-weight memory read, not a call out to an hpet and then a math conversion, or a call out to tsc that then has to know about if the tsc represents work or time, and if the cpu has been slowed for power save reasons etc etc etc. I think you want a symbol exported gpl of something like void force_update_jiffies(void); that you can call in different hook locations to force the update of jiffies from non-interrupt sources. Actually you might want more than one version of that function or have it take an argument, because some people might want to be super lazy and only update it when they enter or leave idle, while others (real timers) might want
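The arithmetic behind the JIFFIES_HZ versus PIT_HZ split proposed above is small enough to write down. The sketch below is not taken from any patch; the function names are invented. It computes how many jiffies one timer interrupt would have to add, and how long a 32-bit jiffies counter lasts before it wraps at a given rate.

```c
#include <stdint.h>

/*
 * Jiffies to add per timer interrupt when the jiffies rate and the
 * interrupt rate are decoupled. Returns 0 if the interrupt rate is
 * zero or does not divide the jiffies rate evenly.
 */
static uint32_t jiffies_per_interrupt(uint32_t jiffies_hz, uint32_t pit_hz)
{
    if (pit_hz == 0 || jiffies_hz % pit_hz != 0)
        return 0;
    return jiffies_hz / pit_hz;
}

/* Whole seconds until a 32-bit jiffies counter wraps (0 if rate is 0). */
static uint32_t wrap_seconds_32(uint32_t jiffies_hz)
{
    if (jiffies_hz == 0)
        return 0;
    return (uint32_t)((UINT64_C(1) << 32) / jiffies_hz);
}
```

With the suggested starting point of CONFIG_FIXED_PIT_HZ=50 and CONFIG_JIFFIES_HZ=2000, each interrupt adds 40 jiffies. A 32-bit counter at 1000 jiffies per second wraps after 4294967 seconds (about 49.7 days); at 5e4 it wraps after 85899 seconds, a little under a day, which is one reason very high rates need care on 32-bit platforms.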
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 2005-07-15 at 12:58 -0700, Stephen Pollei wrote:
> But If I understand Linus's points he wants jiffies to remain a memory fetch, and make sure it doesn't turn into a singing dancing christmas tree.

It seems relatively easy to support dynamic tick, the ARM architecture has it. But with the numerous users of jiffies through the code, it seems to me that it's hard to ensure that every one of them will continue to work correctly if the jiffies_increment is changed during runtime.

As Linus noted, the current tick code is flexible and powerful, but it can be hard to get it right in all cases.

WinCE developers have similar problems/concerns:
http://blogs.msdn.com/ce_base/archive/2005/06/08/426762.aspx

With the previous cleanup like time_after()/time_before(), msleep() and friends, unit conversion helpers, etc. it's a step in the right direction. I just wanted to point out that while it's good to preserve the current efficient tick implementation, it may be worthwhile to add a relative timeout API like Alan Cox proposed a year ago to better hide the implementation details.

- Eric St-Laurent
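For readers unfamiliar with the time_after()/time_before() helpers mentioned above, the point of them is that a plain `>` comparison on a wrapping tick counter gives the wrong answer near the wrap. The following is a user-space model of the idea using fixed 32-bit types. The kernel's own macros work on unsigned long and add type checking, so the names here are deliberately different and should be read as illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if tick value a is later than b, even if the counter wrapped in
 * between. Valid while the two values are less than 2^31 ticks apart.
 * Relies on the usual two's-complement conversion to int32_t.
 */
static bool tick_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* True if tick value a is earlier than b. */
static bool tick_before(uint32_t a, uint32_t b)
{
    return tick_after(b, a);
}

/* Deadline `ticks` from now; unsigned wrap-around is intentional. */
static uint32_t deadline_in(uint32_t now, uint32_t ticks)
{
    return now + ticks;
}
```

A deadline computed as `deadline_in(0xFFFFFFF0, 0x20)` wraps to 0x10. A naive `now > deadline` test would report it as already expired, while `tick_after(now, deadline)` stays false until the counter really passes 0x10. Code that goes through helpers like these keeps working whether the counter advances by 1 or by some larger increment per interrupt.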
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 15 Jul 2005, Jesper Juhl wrote:
>
> It's buggy, that I know. setting kernel_hz (the new boot parameter) to
> 250 causes my system clock to run at something like 4-5 times normal
> speed

4 times normal. You don't actually make the timer interrupt happen at 250Hz, so the timer will be programmed to run at the full 1kHz.

You also need to actually change the LATCH define (in include/linux/jiffies.h) to take this into account (there might be something else too).

So

	/* LATCH is used in the interval timer and ftape setup. */
	#define LATCH ((CLOCK_TICK_RATE + HZ/2) / HZ) /* For divider */

should become something like

	/* LATCH is used in the interval timer and ftape setup. */
	#define LATCH ((CLOCK_TICK_RATE*jiffies_increment + HZ/2) / HZ) /* For divider */

and you might be getting closer. Of course, you need to make sure that LATCH is used only after jiffies_increment is set up. See "setup_pit_timer(void)" in arch/i386/kernel/timers/timer_pit.c for more details.

		Linus
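The diagnosis above can be checked with a few lines of arithmetic. This stand-alone sketch evaluates the modified LATCH formula and the resulting clock speed. It assumes a PIT input clock of 1193182 Hz for CLOCK_TICK_RATE and a fixed HZ of 1000 (named SKETCH_HZ here); the function names are made up for the example.

```c
#include <stdint.h>

#define CLOCK_TICK_RATE 1193182u  /* assumed PIT input clock, in Hz */
#define SKETCH_HZ       1000u     /* the fixed HZ discussed in the thread */

/* PIT divisor from the modified LATCH formula. */
static uint32_t latch_for(uint32_t jiffies_increment)
{
    return (CLOCK_TICK_RATE * jiffies_increment + SKETCH_HZ / 2) / SKETCH_HZ;
}

/* Interrupts per second that a given divisor produces (rounded). */
static uint32_t interrupt_rate_hz(uint32_t latch)
{
    return (CLOCK_TICK_RATE + latch / 2) / latch;
}

/* Jiffies accumulated per real second for a divisor and increment. */
static uint32_t jiffies_per_second(uint32_t latch, uint32_t jiffies_increment)
{
    return interrupt_rate_hz(latch) * jiffies_increment;
}
```

With the unmodified divisor (`latch_for(1)`, i.e. 1193) and an increment of 4, `jiffies_per_second` returns 4000, which is the "4 times normal" clock. With `latch_for(4)` (4773) the interrupt rate drops to 250 Hz and jiffies advance at the intended 1000 per second. All three divisors for increments 1, 4 and 10 fit in the PIT's 16-bit counter.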
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 15 Jul 2005, Lee Revell wrote:
> On Fri, 2005-07-15 at 14:08 +1000, Con Kolivas wrote:
> > Audio did show slightly larger max latencies but nothing that would be of
> > significance.
> >
> > On video, maximum latencies are only slightly larger at HZ 250, all the
> > desired cpu was achieved, but the average latency and number of missed
> > deadlines was significantly higher.
>
> Because audio timing is driven by the soundcard interrupt while video,
> like MIDI, relies heavily on timers.

In interbench it's not driven by a soundcard interrupt.
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 2005-07-15 at 14:08 +1000, Con Kolivas wrote:
> Audio did show slightly larger max latencies but nothing that would be of
> significance.
>
> On video, maximum latencies are only slightly larger at HZ 250, all the
> desired cpu was achieved, but the average latency and number of missed
> deadlines was significantly higher.

Because audio timing is driven by the soundcard interrupt while video, like MIDI, relies heavily on timers.

Lee
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 15 Jul 2005 09:25, Linus Torvalds wrote:
> On Thu, 14 Jul 2005, Lee Revell wrote:
> > On Thu, 2005-07-14 at 09:37 -0700, Linus Torvalds wrote:
> > > I have to say, this whole thread has been pretty damn worthless in
> > > general in my not-so-humble opinion.
> >
> > This thread has really gone OT, but to revisit the original issue for a
> > bit, are you still unwilling to consider leaving the default HZ at 1000
> > for 2.6.13?
>
> Yes. I see absolutely no point to it until I actually hear people who have
> actually tried some real load that doesn't work. Dammit, I want a real
> user who says that he can noticeable see his DVD stuttering, not some
> theory.

Disclaimer - This is not proof of a real world dvd stuttering, simply a benchmarked result. My code may be crap, but then the real apps out there may also be.

Results from interbench v0.21 (http://ck.kolivas.org/apps/interbench/interbench-0.21.tar.bz2)

2.6.13-rc1 on a pentium4 3.06

HZ=1000:

--- Benchmarking Audio in the presence of loads ---
Load      Latency +/- SD (ms)   Max Latency   % Desired CPU   % Deadlines Met
None      0.012 +/- 0.00196     0.021         100             100
Video     1.28 +/- 0.509        2.01          100             100
X         0.289 +/- 0.578       2             100             100
Burn      0.014 +/- 0.002       0.023         100             100
Write     0.025 +/- 0.0349      0.49          100             100
Read      0.02 +/- 0.00383      0.052         100             100
Compile   0.023 +/- 0.00752     0.054         100             100
Memload   0.222 +/- 0.892       9.04          100             100

--- Benchmarking Video in the presence of loads ---
Load      Latency +/- SD (ms)   Max Latency   % Desired CPU   % Deadlines Met
None      0.012 +/- 0.00169     0.023         100             100
X         2.55 +/- 2.37         18.7          100             75.8
Burn      1.08 +/- 1.06         16.7          100             88.2
Write     0.224 +/- 0.215       16.7          100             97.8
Read      0.019 +/- 0.00354     0.059         100             100
Compile   4.55 +/- 4.53         17.6          100             57.5
Memload   1.3 +/- 1.34          51.5          100             88

HZ=250:

--- Benchmarking Audio in the presence of loads ---
Load      Latency +/- SD (ms)   Max Latency   % Desired CPU   % Deadlines Met
None      0.011 +/- 0.00152     0.022         100             100
Video     0.157 +/- 0.398       3.62          100             100
X         1.3 +/- 1.82          4.01          100             100
Burn      0.014 +/- 0.00142     0.026         100             100
Write     0.022 +/- 0.0125      0.092         100             100
Read      0.021 +/- 0.00366     0.048         100             100
Compile   0.03 +/- 0.0469       0.559         100             100
Memload   0.144 +/- 0.681       8.05          100             100

--- Benchmarking Video in the presence of loads ---
Load      Latency +/- SD (ms)   Max Latency   % Desired CPU   % Deadlines Met
None      5 +/- 4.99            16.7          100             54
X         9.98 +/- 8.94         20.7          100             31.2
Burn      16.6 +/- 16.6         16.7          100             0.167
Write     4.11 +/- 4.08         16.7          100             60.8
Read      2.55 +/- 2.53         16.7          100             73.8
Compile   15.6 +/- 15.6         17.7          100             3.5
Memload   2.91 +/- 2.92         45.4          100             72.5

Audio did show slightly larger max latencies but nothing that would be of significance.

On video, maximum latencies are only slightly larger at HZ 250, all the desired cpu was achieved, but the average latency and number of missed deadlines was significantly higher.

Cheers,
Con
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Linus Torvalds wrote:
> On Thu, 14 Jul 2005, Lee Revell wrote:
> > I don't think this will fly because we take a big performance hit by calculating HZ at runtime.
>
> I think it might be an acceptable solution for a distribution that really needed it, since it should be fairly simple.
>
> However, it's definitely not the right solution.
>
> HOWEVER. I bet that somebody who really really cares (hint hint) could easily make HZ be 1000, and then dynamically tweak the divisor at bootup to be either 1000, 250, or 100, and then increment "jiffies" by 1, 4 or 10.
>
> My wild guess is that this is 20 lines of code, plus another 20 for "setup", so that you can choose between 100/250/1000 Hz with a kernel command line.
>
> It wouldn't be "dynamic" at first - you'd just set it up at bootup, and set a "jiffies_increment" variable, and change the
>
>	jiffies_64++;
>
> into
>
>	jiffies_64 += jiffies_increment;
>
> and you'd be done.
>
> Really. I dare you guys. First one to send me a tested patch gets a gold star.

I don't know if this will earn me that gold star or not, but it's what I have at this point.

It's buggy, that I know. Setting kernel_hz (the new boot parameter) to 250 causes my system clock to run at something like 4-5 times normal speed, and key repeat is all funny (super fast) and various other nasty things - however, it does build and it does boot (i386 only as that's all I have) and apart from the kernel's notion of time being way off it does seem to be a step in the right direction. Me never having looked at the kernel's timer code before is most likely the explanation for the bugs - that and the fact that I didn't try to tackle the bogomips calculation at all...

Ohh, and it'll probably be even stranger if you build the kernel with CONFIG_HZ set to anything other than 1000 - if this is what we want to do, then I guess CONFIG_HZ needs to go away completely and just always be 1000 and then people should use the boot option to modify if needed/wanted.

Anyway, is this somewhere along the lines of what you were thinking?

The patch is below, but I've also attached it since I suspect thunderbird will mangle it.

diff -upr linux-2.6.13-rc3-orig/arch/i386/kernel/time.c linux-2.6.13-rc3/arch/i386/kernel/time.c
--- linux-2.6.13-rc3-orig/arch/i386/kernel/time.c 2005-07-14 20:33:35.0 +0200
+++ linux-2.6.13-rc3/arch/i386/kernel/time.c 2005-07-15 04:02:50.0 +0200
@@ -75,6 +75,7 @@ int pit_latch_buggy; /* ext
 #include "do_timer.h"
 
 u64 jiffies_64 = INITIAL_JIFFIES;
+u16 jiffies_increment = 1;
 
 EXPORT_SYMBOL(jiffies_64);
 
@@ -481,3 +482,27 @@ void __init time_init(void)
 
 	time_init_hook();
 }
+
+static int __init jiffies_increment_setup(char *str)
+{
+	printk(KERN_NOTICE "setting up jiffies_increment : ");
+	if (str) {
+		printk("kernel_hz = %s, ", str);
+	} else {
+		printk("kernel_hz is unset, ");
+	}
+	if (!strncmp("100", str, 3)) {
+		jiffies_increment = 10;
+		printk("jiffies_increment set to 10, effective HZ will be 100\n");
+	} else if (!strncmp("250", str, 3)) {
+		jiffies_increment = 4;
+		printk("jiffies_increment set to 4, effective HZ will be 250\n");
+	} else {
+		jiffies_increment = 1;
+		printk("jiffies_increment set to 1, effective HZ will be 1000\n");
+	}
+
+	return 1;
+}
+
+__setup("kernel_hz=", jiffies_increment_setup);
diff -upr linux-2.6.13-rc3-orig/arch/i386/kernel/timers/timer_pm.c linux-2.6.13-rc3/arch/i386/kernel/timers/timer_pm.c
--- linux-2.6.13-rc3-orig/arch/i386/kernel/timers/timer_pm.c 2005-07-14 20:33:35.0 +0200
+++ linux-2.6.13-rc3/arch/i386/kernel/timers/timer_pm.c 2005-07-15 02:59:39.0 +0200
@@ -176,7 +176,7 @@ static void mark_offset_pmtmr(void)
 
 	/* compensate for lost ticks */
 	if (lost >= 2)
-		jiffies_64 += lost - 1;
+		jiffies_64 += (lost * jiffies_increment) - 1;
 
 	/* don't calculate delay for first run,
 	   or if we've got less then a tick */
diff -upr linux-2.6.13-rc3-orig/arch/i386/kernel/timers/timer_tsc.c linux-2.6.13-rc3/arch/i386/kernel/timers/timer_tsc.c
--- linux-2.6.13-rc3-orig/arch/i386/kernel/timers/timer_tsc.c 2005-07-14 20:33:35.0 +0200
+++ linux-2.6.13-rc3/arch/i386/kernel/timers/timer_tsc.c 2005-07-15 02:59:13.0 +0200
@@ -193,7 +193,7 @@ static void mark_offset_tsc_hpet(void)
 	offset = hpet_readl(HPET_T0_CMP) - hpet_tick;
 	if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))) {
 		int lost_ticks = (offset - hpet_last) / hpet_tick;
-		jiffies_64 += lost_ticks;
+		jiffies_64 += lost_ticks * jiffies_increment;
 	}
 	hpet_last = hpet_current;
 
@@ -415,7 +415,7 @@ static void mark_offset_tsc(void)
 	lost = delta/(100/HZ);
 	delay = delta%(100/HZ);
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 16:49, Linus Torvalds wrote:
> On Thu, 14 Jul 2005, Lee Revell wrote:
> >
> > And I'm incredibly frustrated by this insistence on hard data when it's
> > completely obvious to anyone who knows the first thing about MIDI that
> > HZ=250 will fail in situations where HZ=1000 succeeds.
>
> Ok, guys. How many people have this MIDI thing? How many of you can't be
> bothered to set the default to suit your usage?
>
> > It's straight from the MIDI spec. Your argument is pretty close to "the
> > MIDI spec is wrong, no one can hear the difference between 1ms and 4ms".
>
> No.
>
> YOUR argument is "nobody else matters, only I do".
>
> MY argument is that this is a case of give and take.

Take from "few" multimedia users, give to "many" laptop users. Where "few" and "many" are not very well defined quantities, but obviously "many" > "few" :-)

As to how few is few. I don't claim to know, but users that bother to subscribe to the Planet CCRMA[*] mailing list number 750+, so that's one datapoint. Total users of Planet CCRMA, I have no idea. Most of them will use MIDI, either externally through hardware interfaces or internally through the ALSA sequencer api. Planet CCRMA includes custom kernels with Ingo's patches for low latency, so I will have to configure them with HZ=1000 (or 500 or whatever) in 2.6.13+. Oh well.

HZ=250 is a setback anyway, as many advances had been made recently in the stock kernel that made it more and more suitable to multimedia work (_GREAT_ work BTW). That raised my hopes that, eventually, I would not have to build kernels, just apps, as stock kernels would be good enough. This will make the wait longer.

Sigh, I'll be patient and dream about high resolution timers or other technically elegant solutions that will not penalize multimedia apps or laptops...

-- Fernando

[*] http://ccrma.stanford.edu/planetccrma/software/
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
Linus Torvalds wrote:
> On Thu, 14 Jul 2005, Lee Revell wrote:
> >
> > I don't think this will fly because we take a big performance hit by
> > calculating HZ at runtime.
>
> I think it might be an acceptable solution for a distribution that really
> needed it, since it should be fairly simple. However, it's definitely not
> the right solution.
>
> HOWEVER. I bet that somebody who really really cares (hint hint) could
> easily make HZ be 1000, and then dynamically tweak the divisor at bootup
> to be either 1000, 250, or 100, and then increment "jiffies" by 1, 4 or
> 10.
>
> My wild guess is that this is 20 lines of code, plus another 20 for
> "setup", so that you can choose between 100/250/1000 Hz with a kernel
> command line. It wouldn't be "dynamic" at first - you'd just set it up at
> bootup, and set a "jiffies_increment" variable, and change the
>
>     jiffies_64++;
>
> into
>
>     jiffies_64 += jiffies_increment;
>
> and you'd be done.
>
> Really. I dare you guys. First one to send me a tested patch gets a gold
> star.
>
> Then, a year from now, people will realize how _easy_ it is to change the
> jiffies_increment on the fly, and add a /sys/kernel/timer_frequency file,
> and then you can switch it around at run-time.
>
> Trust me. When I say that the right thing to do is to just have a fixed
> (but high) HZ value, and just changing the timer rate, I'm -right-.
>
> I'm always right. This time I'm just even more right than usual.

And humble, too ;-)

Do you actually have something against tickless, or just don't think it can be done in reasonable time? I can see this needing very careful thought on SMP.

--
bill davidsen <[EMAIL PROTECTED]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 17:24 -0700, Linus Torvalds wrote:
>
> On Thu, 14 Jul 2005, Lee Revell wrote:
>
> Trust me. When I say that the right thing to do is to just have a fixed
> (but high) HZ value, and just changing the timer rate, I'm -right-.
>
> I'm always right. This time I'm just even more right than usual.

Of course you are, jiffies are simple and efficient.

But it may be worthwhile to provide a better/simpler API for relative timeouts and also to better hide the implementation details of the tick system.

If I sum up the discussion from my POV:

- use a 32-bit tick counter on 32-bit platforms and use a 64-bit counter on 64-bit platforms
- keep the constant HZ=1000 (ms resolution) on 32-bit platforms
- remove the assumption that timer interrupts and jiffies are a 1:1 thing (jiffies may be incremented by >1 ticks at timer interrupt)
- determine jiffies_increment at boot
- have a slow clock mode to help power management (adjust jiffies_increment by the slowdown factor)
- it may be useful to bump up HZ to 1e6 (us res.) or 1e9 (ns res.) on 64-bit platforms, if there are benefits such as better accuracy during time units conversions or if a higher frequency timer hardware is available/viable.
- it may be also useful to bump HZ on -RT (Real-time) kernels, or with -HRT (High-resolution timers support). Users of those kernels are willing to pay the cost of the overhead to have better resolution
- avoid direct usage of the jiffies variable, instead use jiffies() (inline or MACRO), IMO monotonic_clock() would be a better name
- provide a relative timeout API (see my previous post, or Alan's suggestions)
- remove most of the direct use of jiffies through the code and replace them with msleep(), relative timer, etc
- use human units for those APIs

- Eric St-Laurent
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On 7/15/05, Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Thu, 14 Jul 2005, Lee Revell wrote:
> >
> > I don't think this will fly because we take a big performance hit by
> > calculating HZ at runtime.
>
> I think it might be an acceptable solution for a distribution that really
> needed it, since it should be fairly simple. However, it's definitely not
> the right solution.
>
> HOWEVER. I bet that somebody who really really cares (hint hint) could
> easily make HZ be 1000, and then dynamically tweak the divisor at bootup
> to be either 1000, 250, or 100, and then increment "jiffies" by 1, 4 or
> 10.
>
[...]
>
> Really. I dare you guys. First one to send me a tested patch gets a gold
> star.
>

Testing a patch right now, I'll send it to you as soon as it doesn't blow up on boot (which it currently does).

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 23:37 +0100, Alan Cox wrote:
> In actual fact you also want to fix users of
>
>     while(time_before(foo, jiffies)) { whack(mole); }
>
> to become
>
>     init_timeout();
>     timeout.expires = jiffies + n
>     add_timeout();
>     while(!timeout_expired()) {}
>
> Which is a trivial wrapper around timers as we have them now

Or something like this:

    struct timeout_timer {
        unsigned long expires;
    };

    static inline void timeout_set(struct timeout_timer *timer, unsigned int msecs)
    {
        timer->expires = jiffies + msecs_to_jiffies(msecs);
    }

    static inline int timeout_expired(struct timeout_timer *timer)
    {
        return (time_after(jiffies, timer->expires));
    }

It provides a nice API for relative timeouts without adding overhead.

- Eric St-Laurent
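The timeout helpers above can be tried outside the kernel. The following is a minimal userspace sketch, assuming a plain global counter standing in for the kernel's `jiffies` and simplified stand-ins for `time_after()` and `msecs_to_jiffies()`; it is a model for illustration, not the kernel implementation. It shows that the comparison still behaves when the counter wraps around.

```c
#include <assert.h>
#include <limits.h>

#define HZ 1000                      /* assumed fixed tick rate */

/* Stand-in for the kernel's tick counter (hypothetical userspace model). */
static unsigned long jiffies;

/* True if a is later than b, even across counter wraparound. */
static inline int time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Simplified conversion: round up so a timeout never fires early. */
static inline unsigned long msecs_to_jiffies(unsigned int msecs)
{
    return ((unsigned long)msecs * HZ + 999) / 1000;
}

struct timeout_timer {
    unsigned long expires;
};

static inline void timeout_set(struct timeout_timer *timer, unsigned int msecs)
{
    timer->expires = jiffies + msecs_to_jiffies(msecs);
}

static inline int timeout_expired(const struct timeout_timer *timer)
{
    return time_after(jiffies, timer->expires);
}
```

A caller would set the timeout once and poll `timeout_expired()` in its loop, never touching `jiffies` or HZ directly, which is the point of expressing the timeout in milliseconds.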
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
David Lang wrote:
> On Wed, 13 Jul 2005, Bill Davidsen wrote:
>
> > How serious is the 1/HZ = sane problem, and more to the point how many
> > programs get the HZ value with a system call as opposed to including a
> > header or building it in? I know some of my older programs use header
> > files, that was part of the planning for the future even before 2.5
> > started. At the time I didn't expect to have to use the system call.
>
> in binary 1/100 or 1/1000 are not sane values to start with so I don't
> think that this is likely to be that critical (remembering that the
> kernel doesn't do floating point math)

The kernel isn't the issue, it's programs which do timing and get values in ticks which they convert to time by dividing by HZ. I at least get that from a header, proper way would be with the syscall.

--
bill davidsen <[EMAIL PROTECTED]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
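For userspace programs of the kind Bill describes, the tick rate can be queried at run time instead of being baked in from a header. The sketch below assumes the POSIX interface `sysconf(_SC_CLK_TCK)` is what is meant by "the syscall"; the helper names `ticks_to_seconds()` and `cpu_seconds_used()` are illustrative only.

```c
#include <assert.h>
#include <sys/times.h>
#include <unistd.h>

/* Convert a tick count from times() to seconds using the tick rate
 * reported at run time, rather than a compiled-in HZ constant. */
static double ticks_to_seconds(clock_t ticks)
{
    long ticks_per_sec = sysconf(_SC_CLK_TCK);

    assert(ticks_per_sec > 0);
    return (double)ticks / (double)ticks_per_sec;
}

/* CPU time (user + system) consumed by this process so far, in seconds.
 * Returns -1.0 if times() fails. */
static double cpu_seconds_used(void)
{
    struct tms t;

    if (times(&t) == (clock_t)-1)
        return -1.0;
    return ticks_to_seconds(t.tms_utime + t.tms_stime);
}
```

A binary built this way keeps reporting correct times when the value behind `_SC_CLK_TCK` differs from the one in the headers it was compiled against.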
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 15 Jul 2005, Jesper Juhl wrote:
>
> Even if we only have to do it once at boot? The thought was to detect
> what type of machine we are booting on, figure out what a good HZ
> would be for that type of box, then set that HZ value and treat it as
> a constant from that point forward.

No, it really should be a compile-time constant, or a lot of things get a lot more expensive. There's a HZ embedded in a lot of places, and some of them are divides, for example. Others do optimized special cases based on static knowledge of what HZ is.

So this is why I so strongly argue that we should have a constant HZ, but a dynamic _increment_ of "jiffies". Nobody (obviously) depends on jiffies being constant, so it's ok to increment jiffies by pretty much any value.

Yeah, yeah, there might be some _very_ few code-paths (bogomips, I think) that may look at when "jiffies" changes, and actually measure one tick that way. They would need to be taught that they don't measure "one" tick any more, they measure "jiffies_increment" ticks or something.

But I really wouldn't be surprised if the bogomips calibration loop was really the only thing that needed some small tweaking for increments of other than one.

(Oh, we'll find other things we want to fix up, and such a change would result in other changes down the line, no question about that. But I don't think it would be very much at all, and I don't think it would turn out at all traumatic).

Linus
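Both points in the message above can be illustrated with a short sketch. The function names are hypothetical and this is not the kernel's calibration code. The first helper shows a conversion that the compiler can fold because HZ is a compile-time constant; the second shows a calibration step that divides by the number of jiffies actually observed, so it still gives a per-jiffy figure when one interrupt advances the counter by more than one.

```c
#include <assert.h>

#define HZ 1000   /* compile-time constant, so 1000 / HZ folds to a literal */

/* With a constant HZ that divides 1000, this is a plain multiply
 * (here by 1) and needs no run-time division. */
static inline unsigned long jiffies_to_msecs_model(unsigned long j)
{
    return j * (1000 / HZ);
}

/* Calibration helper: 'loops' delay-loop iterations were counted between
 * two observed changes of the jiffies counter. Divide by the real jiffies
 * delta instead of assuming that exactly one tick passed. */
static unsigned long loops_per_jiffy_from_sample(unsigned long loops,
                                                 unsigned long jiffies_before,
                                                 unsigned long jiffies_after)
{
    unsigned long elapsed = jiffies_after - jiffies_before;

    assert(elapsed != 0);
    return loops / elapsed;
}
```

With a 250 Hz interrupt and an increment of 4, a sample of 4000 loops between jiffies 100 and 104 yields 1000 loops per jiffy, the same figure a 1000 Hz interrupt would have measured directly.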
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 14 Jul 2005, Lee Revell wrote:
>
> I don't think this will fly because we take a big performance hit by
> calculating HZ at runtime.

I think it might be an acceptable solution for a distribution that really needed it, since it should be fairly simple. However, it's definitely not the right solution.

HOWEVER. I bet that somebody who really really cares (hint hint) could easily make HZ be 1000, and then dynamically tweak the divisor at bootup to be either 1000, 250, or 100, and then increment "jiffies" by 1, 4 or 10.

My wild guess is that this is 20 lines of code, plus another 20 for "setup", so that you can choose between 100/250/1000 Hz with a kernel command line. It wouldn't be "dynamic" at first - you'd just set it up at bootup, and set a "jiffies_increment" variable, and change the

    jiffies_64++;

into

    jiffies_64 += jiffies_increment;

and you'd be done.

Really. I dare you guys. First one to send me a tested patch gets a gold star.

Then, a year from now, people will realize how _easy_ it is to change the jiffies_increment on the fly, and add a /sys/kernel/timer_frequency file, and then you can switch it around at run-time.

Trust me. When I say that the right thing to do is to just have a fixed (but high) HZ value, and just changing the timer rate, I'm -right-.

I'm always right. This time I'm just even more right than usual.

Linus
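The scheme described above can be modelled in a few lines. What follows is a minimal userspace sketch, not the patch from this thread: `setup_timer_rate()`, `timer_interrupt()` and `PIT_TICK_RATE` are illustrative names, and the sketch assumes the requested interrupt rate divides HZ evenly. The value 1193182 is the input clock of the i8254 PIT in Hz.

```c
#include <assert.h>
#include <stdint.h>

#define HZ            1000      /* fixed, compile-time tick unit */
#define PIT_TICK_RATE 1193182   /* i8254 PIT input clock, in Hz */

static uint64_t jiffies_64;
static unsigned int jiffies_increment = 1;

/* Choose the real interrupt rate once at boot (e.g. from a command-line
 * option). Returns the PIT reload value for that rate and records how many
 * jiffies each interrupt is worth. */
static unsigned int setup_timer_rate(unsigned int timer_hz)
{
    assert(timer_hz != 0 && HZ % timer_hz == 0);
    jiffies_increment = HZ / timer_hz;
    return (PIT_TICK_RATE + timer_hz / 2) / timer_hz;  /* rounded divisor */
}

/* Body of the timer interrupt: advance time by the configured step. */
static void timer_interrupt(void)
{
    jiffies_64 += jiffies_increment;
}
```

Booting with a 250 Hz timer gives `jiffies_increment == 4`, so one second of interrupts still advances `jiffies_64` by 1000 and code that converts jiffies with the constant HZ keeps working unchanged.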
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On 7/15/05, Lee Revell <[EMAIL PROTECTED]> wrote:
> On Fri, 2005-07-15 at 02:04 +0200, Jesper Juhl wrote:
> > While reading this thread it occurred to me that perhaps what we
> > really want (besides sub HZ timers) might be for the kernel to
> > auto-tune HZ?
> >
> > Would it make sense to introduce a new config option (say
> > CONFIG_HZ_AUTO) that when selected does something like this at boot:
> >
> > if (running_on_a_laptop()) {
> >     set_HZ_to(250);
> > }
>
> I don't think this will fly because we take a big performance hit by
> calculating HZ at runtime.
>

Even if we only have to do it once at boot? The thought was to detect what type of machine we are booting on, figure out what a good HZ would be for that type of box, then set that HZ value and treat it as a constant from that point forward.

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Fri, 2005-07-15 at 02:04 +0200, Jesper Juhl wrote:
> While reading this thread it occurred to me that perhaps what we
> really want (besides sub HZ timers) might be for the kernel to
> auto-tune HZ?
>
> Would it make sense to introduce a new config option (say
> CONFIG_HZ_AUTO) that when selected does something like this at boot:
>
> if (running_on_a_laptop()) {
>     set_HZ_to(250);
> }

I don't think this will fly because we take a big performance hit by calculating HZ at runtime.

Lee
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On 7/13/05, Chris Wedgwood <[EMAIL PROTECTED]> wrote:
> On Wed, Jul 13, 2005 at 01:48:57PM -0700, Andrew Morton wrote:
>
> > Len Brown, a year ago: "The bottom line number to laptop users is
> > battery lifetime. Just today somebody complained to me that Windows
> > gets twice the battery life that Linux does."
>
> It seems the motivation for lower HZ is really:
>
>    (1) ACPI/SMM suckage in laptops
>
>    (2) NUMA systems with *horrible* remote memory latencies
>
> Both can be detected from your .config and we could see HZ as needed
> there and everyone else could avoid this surely?
>

While reading this thread it occurred to me that perhaps what we really want (besides sub HZ timers) might be for the kernel to auto-tune HZ?

Would it make sense to introduce a new config option (say CONFIG_HZ_AUTO) that when selected does something like this at boot:

    if (running_on_a_laptop()) {
        set_HZ_to(250);
    } else if (running_on_large_NUMA_box()) {
        set_HZ_to(100);
    } else if (running_on_multimedia_box()) {
        set_HZ_to(1000);
    } else {
        set_HZ_to_some_other_sane_default();
    }

and if the user does not want to use the auto detection they can select a certain HZ in their .config instead of CONFIG_HZ_AUTO.

Just wanted to throw the idea up in the air in case it made sense. Feel free to pick it apart or simply ignore it. :-)

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 16:49 -0700, Linus Torvalds wrote:
>
> On Thu, 14 Jul 2005, Lee Revell wrote:
> >
> > And I'm incredibly frustrated by this insistence on hard data when it's
> > completely obvious to anyone who knows the first thing about MIDI that
> > HZ=250 will fail in situations where HZ=1000 succeeds.
>
> Ok, guys. How many people have this MIDI thing?
> How many of you can't be
> bothered to set the default to suit your usage?

Very few, and even fewer, respectively.

But, we'd still like to be able to use the same kernel image as everyone else if possible. I guess we'll have to deal with it until a variable tick solution is ready.

Lee
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 16:49 -0700, Linus Torvalds wrote:
> YOUR argument is "nobody else matters, only I do".
>
> MY argument is that this is a case of give and take.

I wouldn't say that. I do agree with you that HZ=1000 for everyone is problematic, I just feel that a reasonable compromise is CONFIG_HZ with the default left at 1000.

Lee
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 14 Jul 2005, Lee Revell wrote:
>
> And I'm incredibly frustrated by this insistence on hard data when it's
> completely obvious to anyone who knows the first thing about MIDI that
> HZ=250 will fail in situations where HZ=1000 succeeds.

Ok, guys. How many people have this MIDI thing? How many of you can't be bothered to set the default to suit your usage?

> It's straight from the MIDI spec. Your argument is pretty close to "the
> MIDI spec is wrong, no one can hear the difference between 1ms and 4ms".

No.

YOUR argument is "nobody else matters, only I do".

MY argument is that this is a case of give and take.

Linus
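For reference, the 1 ms and 4 ms figures in this exchange are simply the tick period 1/HZ. The sketch below works through that arithmetic and sets it next to the wire time of a MIDI message, assuming the MIDI 1.0 serial format of 31250 baud with 10 bits per byte (start bit, 8 data bits, stop bit). The function names are illustrative, and the sketch takes no position on whether the difference is audible.

```c
#include <assert.h>

#define MIDI_BAUD          31250u  /* MIDI 1.0 serial rate, bits per second */
#define MIDI_BITS_PER_BYTE 10u     /* start bit + 8 data bits + stop bit */

/* Length of one timer tick in microseconds for a given interrupt rate. */
static unsigned int tick_period_us(unsigned int hz)
{
    assert(hz != 0);
    return 1000000u / hz;
}

/* Time needed to send nbytes over a MIDI cable, in microseconds.
 * 1000000 / 31250 is exactly 32 us per bit. */
static unsigned int midi_wire_time_us(unsigned int nbytes)
{
    return nbytes * MIDI_BITS_PER_BYTE * (1000000u / MIDI_BAUD);
}
```

A three-byte note-on message takes 960 us on the wire, so a 1000 us tick is about one message long, while a 4000 us tick spans roughly four.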
Re: [PATCH] i386: Selectable Frequency of the Timer Interrupt
On Thu, 2005-07-14 at 16:25 -0700, Linus Torvalds wrote:
> On Thu, 14 Jul 2005, Lee Revell wrote:
> > This thread has really gone OT, but to revisit the original issue for a
> > bit, are you still unwilling to consider leaving the default HZ at 1000
> > for 2.6.13?
>
> Yes. I see absolutely no point to it until I actually hear people who have
> actually tried some real load that doesn't work. Dammit, I want a real
> user who says that he can noticeable see his DVD stuttering, not some
> theory.

Well, on the plus side, this will probably drive the development of a mergeable variable tick solution, as I can't imagine the distros will want to have to ship a separate HZ=1000 kernel for multimedia use.

Lee