On 8/6/05, William Kenworthy <[EMAIL PROTECTED]> wrote:
> Hi, is there any guide as to what scheduler is best for what purposes?
> I have used CFQ since the early days as I was rather unhappy with the
> performance of 2.6 as against 2.4, but I am finding the latest 2.6
> kernels do not seem to be running as well as the earlier ones with this
> scheduler - so I presume there have been changes/improvements etc that
> may mean I need to reassess what I am using.
Firstly, if you wish to compare 2.6 to 2.4 fairly on a "like-for-like" basis, then configure the 2.6 kernel to run at 100HZ and boot it with the "elevator=deadline" option. Secondly, it is superfluous to base any performance notions upon experiences with earlier versions of 2.6. I would definitely recommend that you reassess the situation if need be.

The topic of elevators/schedulers has been discussed in great (and revealing) detail in various posts over at kerneltrap.org in the past. The most informative post that I can recall was written by someone who unfortunately elected to remain anonymous, so I cannot credit him by name. I daresay he would not object to my quoting him in full, so that's what I'm going to do:

<quote="Anonymous on Monday, September 20, 2004 - 09:31">
This is just an executive summary. I don't know the details.

Normally a disk scheduler tries to balance between these goals:

- fairness (let every process have its share of the access to disk)
- performance (try to serve requests close to the current disk head position first, because seeking there is fastest)
- realtime (guarantee that a request is serviced in a given time)

A scheduler gets as input the list of sectors that processes have recently requested and which haven't yet been served. It knows where the disk head currently is, and whatever other data it "remembers" from past requests.

A typical scheduler is called a cyclic elevator, where the disk head moves from the beginning of the disk to the end in one direction. Assume that you get some requests in a sorted queue, and you're going to satisfy them. First, you filter the list of requests by deciding that you will not satisfy requests for sectors below your current head position. Then you simply go to the sector closest to your current position in your sorted list. So, crunch crunch crunch, you move from the start of the disk to the end. When you reach the highest unsatisfied sector number, your cycle is over and you seek to the lowest unsatisfied sector. Rinse and repeat. Linux 2.2 at least used the cyclic elevator, if I remember right.

So, with these approximate goals in mind, we can look at the different schedulers available.

- deadline scheduler: this is a cyclic elevator, but with a twist: requests are given a deadline by which they must be served. When a request starts to look like it's going to expire, the kernel will skip intermediate sectors and move to that request, thus giving some realtime behaviour.

- AS scheduler: a cyclic elevator with a waiting policy: after you service a request that looks like it might have future requests coming nearby, you pause even if there are more sectors in your work queue. The anticipatory scheduler literally anticipates more requests to follow on this track or very close by. How AS decides whether to anticipate is basically just a lot of guesswork based on typical access patterns.

- cfq scheduler: a different sort of stab at fairness. It's different from either of the above two: it doesn't use a cyclic elevator, has realtime guarantees, and aggressively avoids starvation. It could be a good scheduler for multiuser systems.

- noop scheduler: just service the next request in the queue without any algorithm to prefer this or that request. You want to use the noop scheduler on devices where there is no seeking penalty, such as flash drives. That's why a USB stick wants noop.

Unfortunately, harddisks are very mechanical beasts and their performance is highly controlled by their seeking abilities.

All these schedulers are really trying to figure out how to extract maximum performance out of the harddisk without causing bad behaviour in other cases.
</quote>
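To make the cyclic elevator and the deadline "twist" a little more concrete, here is a small toy model in Python. It is only a sketch of the idea described above - the sector numbers, deadlines and expiry margin are invented for illustration, and it bears no resemblance to the actual kernel code:

# Toy model of a cyclic elevator with a deadline twist.
# Purely illustrative; not the kernel's algorithm.

def pick_next(requests, head, now, expiry_margin=2):
    """requests is a list of (sector, deadline) pairs not yet served."""
    # Deadline twist: if any request is about to expire, serve it first,
    # regardless of where the head currently is.
    urgent = [r for r in requests if r[1] - now <= expiry_margin]
    if urgent:
        return min(urgent, key=lambda r: r[1])
    # Otherwise behave like a one-way (cyclic) elevator: take the nearest
    # sector at or beyond the head; if none remain, wrap to the lowest.
    ahead = [r for r in requests if r[0] >= head]
    if not ahead:
        ahead = requests
    return min(ahead, key=lambda r: r[0])

if __name__ == "__main__":
    pending = [(800, 20), (120, 4), (450, 30), (900, 25)]
    head, now = 400, 0
    while pending:
        req = pick_next(pending, head, now)
        print("t=%d head=%d -> serving sector %d (deadline %d)" %
              (now, head, req[0], req[1]))
        pending.remove(req)
        head, now = req[0], now + 1

Running it shows the elevator sweeping upwards from sector 400, then jumping back to sector 120 once its deadline draws near - the "skip intermediate sectors" behaviour mentioned above - before resuming the sweep.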
Note that if you're using a sophisticated hardware RAID device then "noop" may also be a good choice. This is because there may only be one logical block device presented to Linux, and it doesn't make sense for a fancy elevator to spend its time reordering requests according to a model of access that is irrelevant to the underlying media (which is abstracted from the kernel).

The explanation of CFQ above is a little scanty. Another anonymous poster at the same site summarised it rather well:

<quote="Anonymous on Wednesday, February 26, 2003 - 15:38">
CFQ aims to provide fairness among competing tasks in that, when more than one task requests disk access, the I/O scheduler round-robins among all requestors rather than letting one requestor dominate. It can result in a seek storm, however, when the various tasks want to access different areas of the disk. Thus, it can be bad for total throughput. (This is unlike the network stack, where the cost of switching between two unrelated streams is zero or nearly zero.)

AS seeks to improve throughput by reducing the seek storms that occur when you switch among requestors. It observes that most small reads are so-called dependent reads, and so even though a given requestor might have only one request queued at the moment, it may have several others for areas nearby on the disk. Thus, when it picks one of these reads to schedule, it decides whether to hold off scheduling anything else for that same disk for "a moment" to see if more read requests arrive. The net effect is to reduce the total number of seeks between competing streams by allowing greater batches of nearby reads to execute together. Overall throughput should go up as a result.

CFQ should be better for low-latency tasks *if* the resultant seek storms don't kill throughput. If you have two streams duking it out and no hysteresis to keep you from hopping queues continuously, then the resulting seeks could cause CFQ to leave you crying. In the dependent read case, since the requestor's queue will go empty (and cause a queue switch) after each request, it seems as though these seek storms are practically a given without aggressive readahead.
</quote>
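To illustrate the round-robin aspect (and why it can degenerate into a seek storm), here is another toy sketch in Python. Again, this is invented purely for illustration and is not a faithful model of CFQ; the two task names are hypothetical. Two tasks working in very different areas of the disk are served fairly, one request at a time, which forces a long seek on every switch:

# Toy illustration of round-robin fairness across per-task queues.
# Not the real CFQ algorithm; it only shows the fairness/seek trade-off.

from collections import deque
from itertools import cycle

def round_robin(queues):
    """queues maps a task name to a deque of pending sector numbers.
    Yields (task, sector) pairs in round-robin order until all drain."""
    order = cycle(list(queues))
    while any(queues.values()):
        task = next(order)
        if queues[task]:
            yield task, queues[task].popleft()

if __name__ == "__main__":
    tasks = {
        "tar":    deque([100, 101, 102, 103]),
        "mythtv": deque([9000, 9001, 9002, 9003]),
    }
    head = 0
    for task, sector in round_robin(tasks):
        print("serve %-6s sector %4d (seek distance %4d)" %
              (task, sector, abs(sector - head)))
        head = sector

Every switch between the two queues costs a seek of nearly 9000 sectors, whereas an elevator-style scheduler would batch each task's nearby requests together - hence the throughput concern raised in the quote above.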
Personally, I'm using the deadline elevator with an md (software) RAID5 setup in conjunction with LVM2.

Here are my sources for the above quotations:

http://kerneltrap.org/node/3851
http://kerneltrap.org/node/592/2516

Cheers,

--Kerin Millar

-- 
[email protected] mailing list