http://kerneltrap.org/node/6750

June 23, 2006 - 11:11am
Submitted by Jeremy on June 23, 2006 - 11:11am.

Thomas Gleixner and Ingo Molnar posted an update of their high-res timers kernel patches for the 2.6.17 kernel, "upon which we based a tickless kernel (dyntick) implementation and a 'dynamic HZ' feature as well". The patch currently works on x86, with ports to x86_64, PPC and ARM in the works. Thomas explains, "the high-res timers feature (CONFIG_HIGH_RES_TIMERS) enables POSIX timers and nanosleep() to be as accurate as the hardware allows (around 1usec on typical hardware). This feature is transparent - if enabled it just makes these timers much more accurate than the current HZ resolution." He goes on to describe the tickless kernel:

"The tickless kernel feature (CONFIG_NO_HZ) enables 'on-demand' timer interrupts: if there is no timer to be expired for say 1.5 seconds when the system goes idle, then the system will stay totally idle for 1.5 seconds. This should bring cooler CPUs and power savings: on our (x86) testboxes we have measured the effective IRQ rate to go from HZ to 1-2 timer interrupts per second.

"This feature is implemented by driving 'low res timer wheel' processing via special per-CPU high-res timers, which timers are reprogrammed to the next-low-res-timer-expires interval. This tickless-kernel design is SMP-safe in a natural way and has been developed on SMP systems from the beginning."

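The accuracy claim for nanosleep() and POSIX timers can be checked from user space. The following is a minimal sketch, not part of the patch set: the helper name `measure_sleep_overshoot_ns()` is hypothetical, and it assumes a POSIX system that provides `clock_nanosleep()` and `CLOCK_MONOTONIC`. It sleeps until an absolute deadline and reports how late the wakeup was.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_nanosleep() under strict -std modes */
#endif
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Sleep for requested_ns on CLOCK_MONOTONIC and return how many
 * nanoseconds past the requested deadline the caller woke up.
 * Hypothetical helper for illustration only.
 */
int64_t measure_sleep_overshoot_ns(int64_t requested_ns)
{
    struct timespec start, deadline, end;
    int64_t deadline_ns;
    int rc;

    clock_gettime(CLOCK_MONOTONIC, &start);
    deadline_ns = ts_to_ns(&start) + requested_ns;
    deadline.tv_sec = (time_t)(deadline_ns / 1000000000LL);
    deadline.tv_nsec = (long)(deadline_ns % 1000000000LL);

    /* An absolute deadline lets us simply retry if a signal interrupts the sleep. */
    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
    } while (rc == EINTR);

    clock_gettime(CLOCK_MONOTONIC, &end);
    return ts_to_ns(&end) - deadline_ns;
}
```

Calling `measure_sleep_overshoot_ns(1000000)` requests a 1 msec sleep. On a kernel whose timers are limited to HZ resolution the overshoot can be on the order of a tick; with high-res timers enabled, the announcement suggests it should shrink toward what the hardware allows.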

From: Thomas Gleixner [email blocked]
To: LKML [email blocked]
Subject: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
Date:	Sun, 18 Jun 2006 17:10:26 +0200

We are pleased to announce the 2.6.17 based release of our high-res 
timers kernel feature, upon which we based a tickless kernel (dyntick) 
implementation and a 'dynamic HZ' feature as well:

http://www.tglx.de/projects/hrtimers/2.6.17/

The easiest way to try these features is to apply the combo patch to 
vanilla 2.6.17. The patching order is:

http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.17.tar.bz2
http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick1.patch


A broken out patch series is available too:

http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick1.patches.tar.bz2


The high-res timers feature (CONFIG_HIGH_RES_TIMERS) enables POSIX 
timers and nanosleep() to be as accurate as the hardware allows (around 
1usec on typical hardware). This feature is transparent - if enabled it 
just makes these timers much more accurate than the current HZ 
resolution. It is based on the Generic Time Of Day patchset from John 
Stultz and it in essence finishes what we started with the 
kernel/hrtimers.c code in 2.6.16.
 
The tickless kernel feature (CONFIG_NO_HZ) enables 'on-demand' timer 
interrupts: if there is no timer to be expired for say 1.5 seconds when 
the system goes idle, then the system will stay totally idle for 1.5 
seconds. This should bring cooler CPUs and power savings: on our (x86) 
testboxes we have measured the effective IRQ rate to go from HZ to 1-2 
timer interrupts per second.

This feature is implemented by driving 'low res timer wheel' processing 
via special per-CPU high-res timers, which timers are reprogrammed to 
the next-low-res-timer-expires interval. This tickless-kernel design is 
SMP-safe in a natural way and has been developed on SMP systems from the 
beginning.

Note: while our code should be similar in behavior to the existing 
dynticks kernel patch from Con, it is a fundamentally different design 
(being based on the high-res timers support and APIs) and is thus a 
different implementation. We reused one area of dynticks: we integrated 
and improved the 'timer top' profiling tool (CONFIG_TIMER_INFO).

When running the kernel then there's a 'timeout granularity' 
runtime tunable parameter as well, under:

   /proc/sys/kernel/timeout_granularity

it defaults to 1, meaning that CONFIG_HZ is the granularity of timers. 

For example, if CONFIG_HZ is 1000 and timeout_granularity is set to 10, 
then low-res timers will be expired every 10 jiffies (every 10 msecs), 
thus the effective granularity of low-res timers is 100 HZ. Thus this 
feature implements nonintrusive dynamic HZ in essence, without touching 
the HZ macro itself.

Supported platforms: high-res timers and tickless works on x86 (x86_64,
PPC and ARM port are in the works). Other platforms should still work
fine with the usual HZ frequency timer tick.

Naturally, we'd like these features to be integrated into the upstream 
kernel as well.

Bugreports and suggestions are welcome,
 
	Thomas, Ingo
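
The timeout_granularity arithmetic in the announcement above can be written down directly. This is a small sketch of the calculation only, assuming the relationship stated in the mail (low-res timers expire every `timeout_granularity` jiffies); the function names are hypothetical and do not appear in the patch.

```c
#include <assert.h>

/* Effective low-res timer frequency in Hz for a given CONFIG_HZ and
 * timeout_granularity, following the arithmetic in the announcement. */
unsigned int effective_timer_hz(unsigned int config_hz, unsigned int granularity)
{
    assert(config_hz > 0 && granularity > 0);
    return config_hz / granularity;
}

/* Interval between low-res timer expiry passes, in microseconds. */
unsigned long long expiry_interval_usec(unsigned int config_hz, unsigned int granularity)
{
    assert(config_hz > 0 && granularity > 0);
    return (1000000ULL * granularity) / config_hz;
}
```

With CONFIG_HZ at 1000 and a granularity of 10, `effective_timer_hz(1000, 10)` gives 100 and `expiry_interval_usec(1000, 10)` gives 10000, matching the "every 10 msecs" and "100 HZ" figures in the mail.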



From: Roman Zippel <[EMAIL PROTECTED]>
Subject: Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
Date:	Mon, 19 Jun 2006 01:47:22 +0200 (CEST)

Hi,

On Sun, 18 Jun 2006, Thomas Gleixner wrote:

> Bugreports and suggestions are welcome,

Could you please document the patches? I know it sucks compared to 
hacking, but it would make a review a lot simpler.

bye, Roman



From: Ingo Molnar [email blocked]
Subject: Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
Date:	Mon, 19 Jun 2006 14:50:18 +0200


* Roman Zippel <[EMAIL PROTECTED]> wrote:

> > Bugreports and suggestions are welcome,
> 
> Could you please document the patches? I know it sucks compared to 
> hacking, but it would make a review a lot simpler.

yeah, we'll add some description to the patches themselves, but 
otherwise i'm afraid it will be like with almost all patch submissions 
on lkml: 99% of the details are in the code and people have to ask 
specifically if one area or another is unclear :-|

Meanwhile the patch names should provide you with some initial info 
(also, we reuse GTOD which is documented in -mm) and the splitup is 
pretty clean too - but in any case please feel free to ask pointed 
questions! (we happily accept documentation patches as well.)

	Ingo



From: Roman Zippel <[EMAIL PROTECTED]>
Subject: Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
Date:	Mon, 19 Jun 2006 15:47:45 +0200 (CEST)

Hi,

On Mon, 19 Jun 2006, Ingo Molnar wrote:

> > > Bugreports and suggestions are welcome,
> > 
> > Could you please document the patches? I know it sucks compared to 
> > hacking, but it would make a review a lot simpler.
> 
> yeah, we'll add some description to the patches themselves, but 

The problem is this is not the first time I mentioned this and some 
patches still have no descriptions at all! :-(

> otherwise i'm afraid it will be like with almost all patch submissions 
> on lkml: 99% of the details are in the code and people have to ask 
> specifically if one area or another is unclear :-|

For a lot of things this is acceptable, but if patches (e.g. clockevents) add 
new generic infrastructure which affect all archs, they need 
documentation (unless you also provide all the arch specific changes).

> Meanwhile the patch names should provide you with some initial info 
> (also, we reuse GTOD which is documented in -mm) and the splitup is 
> pretty clean too - but in any case please feel free to ask pointed 
> questions! (we happily accept documentation patches as well.)

I can't do this without documentation. Without any information I'm only 
wondering why it has to be this complex.
For example clockevents, I think all the special event handlers are 
overkill, a simple list would do just fine. This way it may also be possible 
to treat a clock as virtual interrupt source and we could share code with 
interrupt code and a callback can simply be requested via request_irq().
More information about what this code actually intends to do and what it 
is required to do, would help a great deal to judge alternative solutions, 
but only the author of this code can really provide this information and 
IMO it's really sad that this information is still lacking after being 
requested multiple times.

bye, Roman


From: Con Kolivas [email blocked]
Subject: Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
Date:	Mon, 19 Jun 2006 15:21:05 +1000

On Monday 19 June 2006 01:10, Thomas Gleixner wrote:
> We are pleased to announce the 2.6.17 based release of our high-res
> timers kernel feature, upon which we based a tickless kernel (dyntick)
> implementation and a 'dynamic HZ' feature as well:
>
> http://www.tglx.de/projects/hrtimers/2.6.17/
>
> The easiest way to try these features is to apply the combo patch to
> vanilla 2.6.17. The patching order is:
>
> http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.17.tar.bz2
> http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick1.patch
>
>
> A broken out patch series is available too:
>
> http://www.tglx.de/projects/hrtimers/2.6.17/patch-2.6.17-hrt-dyntick1.patches.tar.bz2
>
>
> The high-res timers feature (CONFIG_HIGH_RES_TIMERS) enables POSIX
> timers and nanosleep() to be as accurate as the hardware allows (around
> 1usec on typical hardware). This feature is transparent - if enabled it
> just makes these timers much more accurate than the current HZ
> resolution. It is based on the Generic Time Of Day patchset from John
> Stultz and it in essence finishes what we started with the
> kernel/hrtimers.c code in 2.6.16.
>
> The tickless kernel feature (CONFIG_NO_HZ) enables 'on-demand' timer
> interrupts: if there is no timer to be expired for say 1.5 seconds when
> the system goes idle, then the system will stay totally idle for 1.5
> seconds. This should bring cooler CPUs and power savings: on our (x86)
> testboxes we have measured the effective IRQ rate to go from HZ to 1-2
> timer interrupts per second.
>
> This feature is implemented by driving 'low res timer wheel' processing
> via special per-CPU high-res timers, which timers are reprogrammed to
> the next-low-res-timer-expires interval. This tickless-kernel design is
> SMP-safe in a natural way and has been developed on SMP systems from the
> beginning.
>
> Note: while our code should be similar in behavior to the existing
> dynticks kernel patch from Con, it is a fundamentally different design
> (being based on the high-res timers support and APIs) and is thus a
> different implementation. We reused one area of dynticks: we integrated
> and improved the 'timer top' profiling tool (CONFIG_TIMER_INFO).
>
> When running the kernel then there's a 'timeout granularity'
> runtime tunable parameter as well, under:
>
>    /proc/sys/kernel/timeout_granularity
>
> it defaults to 1, meaning that CONFIG_HZ is the granularity of timers.
>
> For example, if CONFIG_HZ is 1000 and timeout_granularity is set to 10,
> then low-res timers will be expired every 10 jiffies (every 10 msecs),
> thus the effective granularity of low-res timers is 100 HZ. Thus this
> feature implements nonintrusive dynamic HZ in essence, without touching
> the HZ macro itself.
>
> Supported platforms: high-res timers and tickless works on x86 (x86_64,
> PPC and ARM port are in the works). Other platforms should still work
> fine with the usual HZ frequency timer tick.
>
> Naturally, we'd like these features to be integrated into the upstream
> kernel as well.
>
> Bugreports and suggestions are welcome,
>
> 	Thomas, Ingo

Nice work Thomas and Ingo.

The approach to previous dynticks that I was working on had some nasty
issues with scalability that were not addressable without a complete
rewrite which is why I abandoned the previous implementation. Your
approach for using the hires timer events is ultimately a better
solution and the code base is cleaner so I'm very pleased to see it.

A couple of comments.

One of the problems we encountered with dynticks was that using the
higher resolution timers such as TSC and HPET to adjust for timer
ticks over longer periods when skipping ticks made the overall clock
drift when run for many days and only the PM Timer was not prone to
this happening. ie the timers were very accurate for short periods but
over days it would drift. It could well have been a design flaw in the
dynticks I was maintaining rather than the timers themselves but have
you checked that this isn't a problem?

The other thing I note is that there is a reasonable amount of
indirection in fairly hot paths. It looks like there is scope for more
local variable storage of these indirect calls. Also if set_next_event
is separated from struct clock_event, the whole struct looks like a
suitable candidate for __read_mostly.

-- 
-ck



From: Ingo Molnar [email blocked]
Subject: Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
Date:	Mon, 19 Jun 2006 14:26:07 +0200


* Con Kolivas [email blocked] wrote:

> Nice work Thomas and Ingo.
> 
> The approach to previous dynticks that I was working on had some nasty 
> issues with scalability that were not addressable without a complete 
> rewrite which is why I abandoned the previous implementation. Your 
> approach for using the hires timer events is ultimately a better 
> solution and the code base is cleaner so I'm very pleased to see it.

thanks!

> A couple of comments.
> 
> One of the problems we encountered with dynticks was that using the 
> higher resolution timers such as TSC and HPET to adjust for timer 
> ticks over longer periods when skipping ticks made the overall clock 
> drift when run for many days and only the PM Timer was not prone to 
> this happening. ie the timers were very accurate for short periods but 
> over days it would drift. It could well have been a design flaw in the 
> dynticks I was maintaining rather than the timers themselves but have 
> you checked that this isn't a problem?

not yet. If it's a real problem we could introduce a 'make clock events 
more reliable' framework by doing something like always programming 
clock event sources into periodic mode and reading their current time 
offset [if possible] when the event is processed (thus compensating 
for most of the drift caused by irq processing latency). But if it's not 
needed it would be nice to avoid that complexity. I'm also wondering why 
the PM timer was the most accurate in that regard - it's almost as slow 
to program as the PIT, so i'd have expected it to show the biggest 
drift.

(another technique to reduce drift: we could increase the APIC-priority 
of the lapic timer, making it less suspect to drift when there are lots 
of other IRQs going on.)

can you think of any other similar 'weird cases' that you saw happen 
with dynticks? For example there's the 'APIC stops timer irqs when 
entering C3 mode' bug - any similar weirdness we should be careful 
about? [right now the patch doesnt handle the C3 mode bug, but it should 
be relatively straightforward to blacklist lapic events in that case]

i'm looking at dynticks-060227.patch right now, and there seem to be a 
fair amount of dyntick specific changes to ACPI's processor_idle.c code. 
Do you remember what those changes were about and should we pick them up 
in one way or another?

> The other thing I note is that there is a reasonable amount of 
> indirection in fairly hot paths. It looks like there is scope for more 
> local variable storage of these indirect calls. [...]

which function(s) were you looking at when coming to this conclusion? 
clockevents_init_next_event() perhaps? [we could certainly put 
'sources->nextevent' into a local variable there]

> [...] Also if set_next_event is separated from struct clock_event, the 
> whole struct looks like a suitable candidate for __read_mostly.

You mean ->event_handler()? We can make all clockevent instantiations 
__read_mostly right now - all of the fields of clock_event are static, 
even ->event_handler() will change at most once per bootup [when we 
switch from low-res into high-res mode].

	Ingo



From: Con Kolivas [email blocked]
Subject: Re: [PATCHSET] Announce: High-res timers, tickless/dyntick and dynamic HZ
Date:	Tue, 20 Jun 2006 00:03:25 +1000

On Monday 19 June 2006 22:26, Ingo Molnar wrote:
> * Con Kolivas [email blocked] wrote:
> > One of the problems we encountered with dynticks was that using the
> > higher resolution timers such as TSC and HPET to adjust for timer
> > ticks over longer periods when skipping ticks made the overall clock
> > drift when run for many days and only the PM Timer was not prone to
> > this happening. ie the timers were very accurate for short periods but
> > over days it would drift. It could well have been a design flaw in the
> > dynticks I was maintaining rather than the timers themselves but have
> > you checked that this isn't a problem?
>
> not yet. If it's a real problem we could introduce a 'make clock events
> more reliable' framework by doing something like always programming
> clock event sources into periodic mode and reading their current time
> offset [if possible] when the event is processed (thus compensating
> for most of the drift caused by irq processing latency). But if it's not
> needed it would be nice to avoid that complexity. I'm also wondering why
> the PM timer was the most accurate in that regard - it's almost as slow
> to program as the PIT, so i'd have expected it to show the biggest
> drift.
>
> (another technique to reduce drift: we could increase the APIC-priority
> of the lapic timer, making it less suspect to drift when there are lots
> of other IRQs going on.)

Better to wait and see if it was an artefact of my dodgy code for recover
walltime and if this code doesn't have that issue.

> can you think of any other similar 'weird cases' that you saw happen
> with dynticks? For example there's the 'APIC stops timer irqs when
> entering C3 mode' bug - any similar weirdness we should be careful
> about? [right now the patch doesnt handle the C3 mode bug, but it should
> be relatively straightforward to blacklist lapic events in that case]

The hardware that also did C4 was more troublesome but for the same
reasons since it's a subset of C3. See Dominik's patches mentioned below
which address these high state transitions. There isn't anything else
offhand I can think of that I actually managed to track down :|

> i'm looking at dynticks-060227.patch right now, and there seem to be a
> fair amount of dyntick specific changes to ACPI's processor_idle.c code.
> Do you remember what those changes were about and should we pick them up
> in one way or another?

Dominik donated a lot of code to use the dynticks infrastructure to
actually implement the power savings. Just skipping ticks seemed to make
very little power difference unless we also used the knowledge from next
timer interrupt to know how long we are going to be idle and choose C
state transitions accordingly. Each patch is documented at length in the
split out

C-States-1_bm_activity_improvements.patch
C-States-2_bm_activity_handling_improvement.patch
C-States-3_accounting_of_sleep_times.patch
C-States-4_dyn-ticks_tweaks.patch

http://ck.kolivas.org/patches/dyn-ticks/split-out/

> > The other thing I note is that there is a reasonable amount of
> > indirection in fairly hot paths. It looks like there is scope for more
> > local variable storage of these indirect calls. [...]
>
> which function(s) were you looking at when coming to this conclusion?
> clockevents_init_next_event() perhaps? [we could certainly put
> 'sources->nextevent' into a local variable there]

From what I could see hrtimer_restart_sched_tick() could use
struct hrtimer *sched_timer = &cpu_base->sched_timer;

clockevents_init_next_event() and clockevents_set_next_event() could use
struct clock_event *nextevt = sources->nextevt;

> > [...] Also if set_next_event is separated from struct clock_event, the
> > whole struct looks like a suitable candidate for __read_mostly.
>
> You mean ->event_handler()? We can make all clockevent instantiations
> __read_mostly right now - all of the fields of clock_event are static,
> even ->event_handler() will change at most once per bootup [when we
> switch from low-res into high-res mode].

Great, thanks!

-- 
-ck
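
The local-variable suggestion discussed in the thread above can be illustrated outside the kernel. This is a user-space sketch under loud assumptions: the struct layouts, the `programmed_ns` field, `struct local_events`, `fake_set_next_event()` and `program_next_event()` are hypothetical stand-ins and not the patch's definitions. Only the names `struct clock_event`, `set_next_event` and `sources->nextevt` are taken from the mails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the structure named in the
 * thread; the real definition in the patch set differs. */
struct clock_event {
    unsigned long long min_delta_ns;
    unsigned long long max_delta_ns;
    unsigned long long programmed_ns;   /* last value handed to the "device" */
    void (*set_next_event)(struct clock_event *evt, unsigned long long delta_ns);
};

/* Hypothetical per-CPU container holding the current event source. */
struct local_events {
    struct clock_event *nextevt;
};

/* Fake device callback: just records what it was asked to program. */
void fake_set_next_event(struct clock_event *evt, unsigned long long delta_ns)
{
    evt->programmed_ns = delta_ns;
}

/* Clamp delta_ns to the device limits and program the device.
 * sources->nextevt is loaded once into a local instead of being
 * dereferenced again for every field access and for the indirect call. */
unsigned long long program_next_event(struct local_events *sources,
                                      unsigned long long delta_ns)
{
    struct clock_event *nextevt = sources->nextevt;

    assert(nextevt != NULL);
    if (delta_ns < nextevt->min_delta_ns)
        delta_ns = nextevt->min_delta_ns;
    if (delta_ns > nextevt->max_delta_ns)
        delta_ns = nextevt->max_delta_ns;
    nextevt->set_next_event(nextevt, delta_ns);
    return delta_ns;
}
```

Whether the local copy makes a measurable difference depends on the compiler and on aliasing; the sketch only shows the shape of the change being proposed, not a benchmark result.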




"tickless" ... irq latency FX?

June 24, 2006 - 7:11pm
A Nony Mouse (not verified)

Does this address the issue whereby catching up to all the omitted ticks at once -- by calling timer_tick() in a loop -- adds lots of IRQ latency? Seems that catching them all up at once would be a lot more efficient than the current ARM approach.

On one system I've seen that cause lots of trouble with the serial console driver, which happens to use PIO not DMA. The kernel runs nicely with an actual tick rate of about 3 timer IRQs per second, but that means it usually needs to catch up to HZ/3 ticks before it calls the PIO handler ... losing badly. Can't use the uparrow key to scroll back through BASH history etc.

Did you point out this issue

June 25, 2006 - 11:17am
Anonymous (not verified)

Did you point out this issue on LKML ?

It does catch up in one go.

June 25, 2006 - 6:07pm
tglx (not verified)

It does catch up in one go. There is no looping involved.
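
Catching up "in one go" can be pictured as a single division instead of one handler call per missed tick. The sketch below is an illustration of that idea only and is not taken from the patch: `struct tick_state` and `catch_up_ticks()` are hypothetical names, and time is modelled as a plain nanosecond counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping: time of the last accounted tick and a
 * jiffies-like tick counter. */
struct tick_state {
    uint64_t last_tick_ns;
    uint64_t jiffies;
};

/* Account for every tick period that elapsed since last_tick_ns with
 * one division, rather than looping once per missed tick.
 * Returns the number of ticks that were added. */
uint64_t catch_up_ticks(struct tick_state *ts, uint64_t now_ns,
                        uint64_t tick_period_ns)
{
    uint64_t missed;

    assert(tick_period_ns > 0);
    if (now_ns < ts->last_tick_ns)
        return 0;                        /* clock has not advanced */

    missed = (now_ns - ts->last_tick_ns) / tick_period_ns;
    ts->jiffies += missed;
    ts->last_tick_ns += missed * tick_period_ns;  /* keep the sub-tick remainder */
    return missed;
}
```

For the 1.5 second idle period used as an example in the announcement, with a 1 msec tick period, a single call adds 1500 ticks at once.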

high-res timers in realtime-preempt?

July 17, 2006 - 11:56am
Alessio (not verified)

Does the 2.6.17-rt7 realtime-preempt patch at http://people.redhat.com/mingo/realtime-preempt/ have high-res timer support like this one?

High Res Timers

August 10, 2006 - 12:56pm

Hello,

I downloaded 2.6.17-7 and do not see CONFIG_HIGH_RES_TIMERS set. I tried looking for it using "make menuconfig".

Or is this available only through the dyntick patch? I am kind of a newbie, so detailed answers would help.

I would much appreciate your help.

My view

February 16, 2007 - 3:31pm

High-res timers are a great set of patches, I use them often. Thomas Gleixner and Ingo Molnar do a great job.

Yes, I agree. Don't use them

March 7, 2008 - 12:54pm

Yes, I agree. Don't use them all that much but they have proven to be really good.

tickless causes high interrupt rate + keyboard repeat problems

May 23, 2007 - 5:02am
Hmmm... (not verified)

Using tickless on 2.6.21.1 ...

Under high system load and typing, keys repeat when the key hasn't been pressed; I need to press another key to stop it. Turning off CONFIG_NO_HZ and returning to HZ_300 almost (?!) completely solves this issue.

On another computer with a parallel printer attached, printing a large file causes the interrupt rate to increase to 80,000 per second, using about 86% of the CPU until the large print job finishes.

Keyboard problems might not be tickless

May 23, 2007 - 7:44am
Hmmm... (not verified)

After recompiling without CONFIG_NO_HZ the problem still recurs under heavy CPU usage. So this might be a problem with 2.6.21.1 itself rather than the tickless changes.

Have you found out anything

November 24, 2007 - 6:40pm
dottem (not verified)

Have you found out anything interesting? I think I am experiencing a similar problem. I suspect CONFIG_HZ_432=y might be the reason for this behaviour, but I have not tested it yet. This 'keyboard weirdness' became obvious under quite heavy eth0 load (a large .tar file transfer).

Linux grasshopper 2.6.24-rc3-zen3 #2 SMP Fri Nov 23 22:34:11 CET 2007 x86_64 Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz GenuineIntel GNU/Linux

Tickless in PPC32

January 14, 2008 - 4:58am
Anonymous (not verified)

The tickless kernel runs superbly on x86, but I need to run it on 32-bit PPC. Any pointers to where I can look for the patch?

