Hi folks.

The forthcoming 2.6.13 kernel will be introducing a rather visible (and
potentially important) change. Firstly, it will provide an option to
select the default timer frequency. Secondly, the default frequency
will now be 250HZ. For the record, the value used to be 100HZ and was
rather drastically changed to 1000HZ for 2.6. It will now be possible
to choose a HZ value of either 100, 250 or 1000.

The most interesting aspect of the timer is that it affects scheduling
granularity, that is, the minimum possible time for which a given
process is allowed to run before interruption. This interval is commonly
known as a "jiffy" and its length in milliseconds can be calculated by
dividing 1000 by the HZ value. So, the length of a jiffy in 2.6 at
present is 1ms. In general, a
longer jiffy results in more throughput but with more latency whereas
a shorter jiffy results in less throughput (due to the overhead from
frequent interruption) but with a reduction in latency. In my view,
this is rather interesting in terms of the impact this can have on
server performance. The help text for the new kernel configuration
option explains further:

- "Allows the configuration of the timer frequency. It is customary to
have the timer interrupt run at 1000 HZ but 100 HZ may be more
beneficial for servers and NUMA systems that do not need to have a
fast response for user interaction and that may experience bus
contention and cacheline bounces as a result of timer interrupts. Note
that the timer interrupt occurs on each processor in an SMP
environment leading to NR_CPUS * HZ number of timer interrupts per
second."
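
To make the arithmetic above concrete, here is a small Python sketch
(not taken from the kernel; the function names are made up for
illustration) that works out the jiffy length in milliseconds and the
total number of timer interrupts per second for each of the three
selectable values. The CPU count of 4 is an arbitrary example.

```python
# Illustrative only: these helper names are hypothetical, not kernel APIs.

HZ_CHOICES = (100, 250, 1000)  # the three values selectable with the new option


def jiffy_ms(hz):
    """Length of one jiffy in milliseconds: 1000 / HZ."""
    return 1000 / hz


def timer_interrupts_per_second(hz, nr_cpus):
    """Total timer interrupts per second on an SMP system: NR_CPUS * HZ."""
    return nr_cpus * hz


if __name__ == "__main__":
    nr_cpus = 4  # arbitrary example value
    for hz in HZ_CHOICES:
        print(f"HZ={hz}: jiffy = {jiffy_ms(hz):g} ms, "
              f"{timer_interrupts_per_second(hz, nr_cpus)} interrupts/s "
              f"on {nr_cpus} CPUs")
```

With four CPUs this gives a 10 ms jiffy and 400 interrupts/s at 100, a
4 ms jiffy and 1000 interrupts/s at 250, and a 1 ms jiffy and 4000
interrupts/s at 1000.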

The specific help text for the 100HZ option is as follows:

- "100 HZ is a typical choice for servers, SMP and NUMA systems with
lots of processors that may show reduced performance if too many timer
interrupts are occurring."

For 250HZ:

- "250 HZ is a good compromise choice allowing server performance
while also showing good interactive responsiveness even on SMP and
NUMA systems."

For 1000HZ:

- "1000 HZ is the preferred choice for desktop systems and other
systems requiring fast interactive responses to events."

Another interesting point that is not mentioned above is that reducing
the timer frequency can significantly help to reduce time drift
(particularly on "real" SMP systems where this effect may be even more
pronounced).

I am posting for two reasons:

* To notify everyone that this change is coming
* To provide a means for anyone interested to test the ramifications
of this newly acquired flexibility using a 2.6.12 kernel prior to the
eventual release of 2.6.13.

To that end I have grabbed the specific patch that adds the
configuration option which can be found here:

  http://www.recruit2recruit.net/kerframil/2.6.12-i386-selectable-hz.patch

This patch is taken directly from upstream (the option itself can be
found at the bottom of the "Processor type and features" menu).
Interestingly, the change has drawn attention to a few areas in the
kernel where the methods used for timing (such as delay loops) are not
ideal. So I have gone through all of the patches committed to the
mainline tree since 2.6.12.3, collected the ones that are relevant and
aggregated them into a single rollup patch. Two versions are provided;
the second applies cleanly against a gentoo-sources tree:

  http://www.recruit2recruit.net/kerframil/2.6.12-timing-fixes-rollup.patch
  http://www.recruit2recruit.net/kerframil/2.6.12-gentoo-r7-timing-fixes-rollup.patch

The individual patches are all very small and non-intrusive by nature.
They are certainly not critical in order to be able to change the
timer frequency but I recommend that they be used. For the curious,
some notes on the contents of the patch can be found here:

  http://www.recruit2recruit.net/kerframil/NOTES-2.6.12-timing-fixes-rollup

What I would like is for anyone who is interested to put an alternate
value (100 or 250) to the test (if they are in a position to be able
to do so) and to determine whether there is any clear performance
improvement with their workload. I switched my system (Compaq ProLiant
ML370) to 100HZ two days ago and can at least confirm that it is stable
although I have not yet had the opportunity to perform any comparative
benchmarks. If in doubt, then 250 might be a good value to try as,
after all, that is going to be the default - at least for x86 ;)
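
For anyone testing, it can be handy to confirm which value a given
kernel was actually built with. The following is a rough Python sketch
that looks for the HZ setting in the text of a kernel .config. It
assumes that the option shows up as a derived "CONFIG_HZ=<n>" line
alongside CONFIG_HZ_100 / CONFIG_HZ_250 / CONFIG_HZ_1000, which is how
I understand the upstream Kconfig to work; the function name and the
sample text are made up for illustration, and the commented-out path is
only a guess at where your tree lives.

```python
import re

# Assumption: the build config contains a line such as "CONFIG_HZ=250".
_HZ_LINE = re.compile(r"^CONFIG_HZ=(\d+)\s*$", re.MULTILINE)


def configured_hz(config_text):
    """Return the HZ value found in kernel config text, or None if absent."""
    match = _HZ_LINE.search(config_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Made-up sample resembling a .config fragment.
    sample = (
        "# CONFIG_HZ_100 is not set\n"
        "CONFIG_HZ_250=y\n"
        "# CONFIG_HZ_1000 is not set\n"
        "CONFIG_HZ=250\n"
    )
    print(configured_hz(sample))

    # To check a real tree (the path is an assumption; adjust as needed):
    # with open("/usr/src/linux/.config") as f:
    #     print(configured_hz(f.read()))
```

If the function returns None, the tree most likely does not have the
selectable-HZ patch applied.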

Finally, this topic has generated a nice little flame war over on the
LKML (actually, it's quite an interesting thread if a little confusing
at points):

http://kerneltrap.org/node/5430
http://lkml.org/lkml/2005/7/8/259

Cheers,

--Kerin Millar
