On 8/6/05, William Kenworthy <[EMAIL PROTECTED]> wrote:
> Hi, is there any guide as to what scheduler is best for what purposes?
> I have used CFQ since the early days as I was rather unhappy with the
> performance of 2.6 as against 2.4, but I am finding the latest 2.6
> kernels do not seem to be running as well as the earlier ones with this
> scheduler - so I presume there have been changes/improvements etc
> that may mean I need to reassess what I am using.

Firstly, if you wish to compare 2.6 to 2.4 on a "like-for-like"
basis then configure the 2.6 kernel to run at 100Hz and boot it with
the "elevator=deadline" option. Secondly, it is pointless to base any
performance conclusions upon experiences with earlier versions of
2.6. I would definitely recommend that you reassess the situation if
need be.
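
As an aside, reasonably recent 2.6 kernels (from memory, somewhere
around 2.6.10 onwards) let you switch the elevator per-device at
runtime via /sys/block/<dev>/queue/scheduler, which makes this sort
of experimentation much less painful than rebooting. Here's a rough
sketch of the idea in Python; the device name is only an example and
you'll need root to actually write to the file:

# Rough sketch: inspect and (optionally) switch the active I/O
# scheduler for a block device through sysfs. Assumes a 2.6 kernel
# new enough to expose /sys/block/<dev>/queue/scheduler.
import sys

def sched_file(dev):
    return "/sys/block/%s/queue/scheduler" % dev

def show(dev):
    # Lists the available elevators with the active one in brackets,
    # e.g. "noop anticipatory deadline [cfq]"
    with open(sched_file(dev)) as f:
        print("%s: %s" % (dev, f.read().strip()))

def set_scheduler(dev, name):
    with open(sched_file(dev), "w") as f:
        f.write(name)

if __name__ == "__main__":
    dev = "sda"                       # example device; adjust to suit
    if len(sys.argv) > 1:
        dev = sys.argv[1]
    show(dev)
    # set_scheduler(dev, "deadline")  # uncomment to switch elevators
    # show(dev)

Only elevators that are compiled in (or loaded as modules) will be
accepted by the write, of course.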

The topic of elevators/schedulers has been discussed in great (and
revealing) detail in various posts over at kerneltrap.org in the past.
The most informative post that I can recall was posted by someone who
unfortunately elected to remain anonymous, so I cannot credit him by
name. I daresay he would not object to my quoting him in full so
that's what I'm going to do:

<quote="Anonymous on Monday, September 20, 2004 - 09:31">
This is just an executive summary. I don't know the details.
Normally a disk scheduler tries to balance between these goals:

- fairness (let every process have its share of the access to disk)
- performance (try to serve requests close to current disk head
position first, because seeking there is fastest)
- realtime (guarantee that a request is serviced in a given time)

A scheduler gets as input the list of sectors that processes have
recently requested and which haven't yet been served. It knows where
the disk head currently is, and whatever other data it "remembers"
from past requests.

A typical scheduler is the cyclic elevator, where the disk head moves
from the beginning of the disk to the end in one direction. Assume
that you get some requests in a sorted queue, and you're going to
satisfy them. So, you'll first filter the list of requests by deciding
that you will not satisfy requests for sectors below your current head
position. Then you'll simply go to the sector closest to your current
position in your sorted list. So, crunch crunch crunch, you move from
start of disk to the end of disk. When you reach the highest
unsatisfied sector number, your cycle is over and you seek to the
lowest unsatisfied sector. Rinse and repeat.

Linux 2.2 at least used the cyclic elevator if I remember right.

So, with these approximate goals in mind, we can look at the different
schedulers available.

- deadline scheduler: this is a cyclic elevator but with a twist:
requests are given a deadline by which they get served. When a request
starts to look like it's going to expire, the kernel will skip
intermediate sectors and move to that request, thus giving some
realtime behaviour.

- AS scheduler: a cyclic elevator with a waiting policy: after you
service a request that looks like it might have future requests coming
nearby, you pause even if there are more sectors in your work queue.
The anticipatory scheduler literally anticipates more requests to
follow on this track or very close by. How AS decides whether to
anticipate is basically just a lot of guesswork based on typical
access patterns.

- cfq scheduler: a different sort of stab at fairness. It differs
from the other two: it doesn't use a cyclic elevator, it has realtime
guarantees and it aggressively avoids starvation. It could be a good
scheduler for multiuser systems.

- noop scheduler: just service next request in the queue without any
algorithm to prefer this or that request.

You want to use the noop scheduler on devices where there is no
seeking penalty, such as flash drives. That's why a USB stick wants
noop. Unfortunately, hard disks are very mechanical beasts and their
performance is highly controlled by their seeking abilities. All the
schedulers above are really trying to figure out how to extract
maximum performance out of the hard disk without causing bad
behaviour in other cases.
</quote>
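
To make the cyclic elevator and the deadline "twist" described above
a little more concrete, here's a toy dispatch loop in Python. This is
emphatically not the kernel's code, just an illustration of the
ordering:

# Toy model of a cyclic (one-way) elevator with a deadline twist.
# Sweep upward from the current head position, but jump to any
# request that has been waiting longer than its deadline.

def next_request(pending, head, now, expire=500):
    """pending: list of (sector, submit_time) tuples."""
    # Deadline twist: an expired request is served immediately.
    expired = [r for r in pending if now - r[1] >= expire]
    if expired:
        return min(expired, key=lambda r: r[1])     # oldest first
    # Cyclic sweep: the nearest sector at or above the head...
    ahead = [r for r in pending if r[0] >= head]
    if ahead:
        return min(ahead, key=lambda r: r[0])
    # ...wrapping around to the lowest sector once the sweep is done.
    return min(pending, key=lambda r: r[0])

if __name__ == "__main__":
    # Sector 120 was submitted long ago, so it expires and is served
    # first; the rest are then served in a single upward sweep.
    queue = [(700, 0), (120, -600), (450, 0), (130, 0)]
    head, now = 400, 10
    while queue:
        req = next_request(queue, head, now)
        print("service sector %d" % req[0])
        queue.remove(req)
        head, now = req[0], now + 1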

Note that if you're using a sophisticated hardware RAID device then
"noop" may also be a good choice. This is because there may only be
one logical block device presented to Linux and it doesn't make sense
for a fancy elevator to spend its time reordering requests according
to a model of access that is irrelevant to the underlying media (which
is abstracted from the kernel).

The explanation for CFQ is a little scanty. Another anonymous poster
at the same site summarised it rather well:

<quote="Anonymous on Wednesday, February 26, 2003 - 15:38">
CFQ aims to provide fairness among competing tasks in that when more
than one task requests disk access, the I/O scheduler round-robins
among all requestors rather than letting one requestor dominate. It
can result in a seek storm, however, when the various tasks want to
access different areas of the disk. Thus, it can be bad for total
throughput. (This is unlike the network stack, where the cost of
switching between two unrelated streams is zero or nearly zero.)

AS seeks to improve throughput by reducing the seek storms that occur
when you switch among requestors. It observes that most small reads
are so-called dependent reads, and so even though a given requestor
might have only one request queued at the moment, it may have several
others for areas nearby on the disk. Thus, when it picks one of these
reads to schedule, it decides whether to hold off scheduling anything
else for that same disk for "a moment" to see if more read requests
arrive. The net effect is to reduce the total number of seeks that
result between competing streams by allowing greater batches of nearby
reads to execute together. Overall throughput should go up as a
result.

CFQ should be better for low-latency tasks *if* the resultant seek
storms don't kill throughput. If you have two streams duking it out
and no hysteresis to keep you from hopping queues continuously, then
the resulting seeks could cause CFQ to leave you crying. In the
dependent read case, since the requestor's queue will go empty (and
cause a queue switch) after each request, it seems as though these
seek storms are practically a given without aggressive readahead.
</quote>
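
A back-of-the-envelope way to see the "seek storm" point is to watch
what strict round-robin does when two processes read from opposite
ends of the disk. Again, a toy model in Python rather than the real
cfq code:

# Toy model of round-robin fairness (the basic cfq idea) and why it
# can degenerate into a seek storm: every queue switch here costs a
# long seek, whereas an AS-style pause would keep draining one
# process's nearby reads before switching.
from collections import deque

def round_robin(queues, head):
    """queues: dict of process name -> deque of sector numbers."""
    total_seek = 0
    order = deque(queues)                 # round-robin over processes
    while any(queues.values()):
        proc = order[0]
        order.rotate(-1)
        if not queues[proc]:
            continue
        sector = queues[proc].popleft()
        total_seek += abs(sector - head)  # crude seek-distance cost
        head = sector
        print("proc %s -> sector %d" % (proc, sector))
    print("total seek distance: %d" % total_seek)

if __name__ == "__main__":
    round_robin({"a": deque([10, 11, 12, 13]),
                 "b": deque([900, 901, 902, 903])}, head=0)

Serving all of "a" and then all of "b" would cost a single long seek
instead of one per request, which is the throughput argument for
anticipation; cfq's bet is that the fairness and latency win is worth
the extra seeking.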

Personally, I'm using the deadline elevator with an md (software)
RAID5 setup in conjunction with LVM2.
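
For what it's worth, when the deadline elevator is active on a device
it exposes a handful of tunables (read_expire, write_expire,
fifo_batch and friends) under /sys/block/<dev>/queue/iosched/. As far
as I know, with md the elevator applies to the component disks rather
than to /dev/mdX itself. A quick sketch to dump the values; the
device name is only an example:

# Quick sketch: dump the active elevator's tunables for a device.
# Assumes a kernel exposing /sys/block/<dev>/queue/iosched/.
import os, sys

def dump_iosched(dev):
    base = "/sys/block/%s/queue/iosched" % dev
    for name in sorted(os.listdir(base)):
        try:
            with open(os.path.join(base, name)) as f:
                print("%-15s %s" % (name, f.read().strip()))
        except IOError:
            pass                         # skip anything unreadable

if __name__ == "__main__":
    dev = "sda"                          # example: a RAID member disk
    if len(sys.argv) > 1:
        dev = sys.argv[1]
    dump_iosched(dev)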

Here are my sources for the above quotations:

http://kerneltrap.org/node/3851
http://kerneltrap.org/node/592/2516

Cheers,

--Kerin Millar
