[PATCH RFC 00/22] Replace the CFQ I/O Scheduler with BFQ
Hi, this patchset replaces CFQ with the last version of BFQ (which is a proportional-share I/O scheduler). To make a smooth transition, this patchset first brings CFQ back to its state at the time when BFQ was forked from CFQ. Basically, this reduces CFQ to its engine, by removing every heuristic and improvement that has nothing to do with any heuristic or improvement in BFQ, and every heuristic and improvement whose goal is achieved in a different way in BFQ. Then, the second part of the patchset starts by replacing CFQ's engine with BFQ's engine, and goes on by adding current BFQ improvements and extra heuristics. For your convenience, here is the thread in which we agreed on both this first step, and the second and last step: [1]. Moreover, here is a direct link to the email describing both steps: [2]. Some patch generates WARNINGS with checkpatch.pl, but these WARNINGS seem to be either unavoidable for the involved pieces of code (which the patch just extends), or false positives. Turning back to BFQ, its first version was submitted a few years ago [3]. It is denoted as v0 in this patchset, to distinguish it from the version I am submitting now, v7r11. In particular, the first two patches concerned with BFQ introduce BFQ-v0, whereas the remaining patches turn progressively BFQ-v0 into BFQ-v7r11. Here are some nice features of BFQ-v7r11. Low latency for interactive applications According to our results, and regardless of the actual background workload, for interactive tasks the storage device is virtually as responsive as if it was idle. For example, even if one or more of the following background workloads are being executed: - one or more large files are being read or written, - a tree of source files is being compiled, - one or more virtual machines are performing I/O, - a software update is in progress, - indexing daemons are scanning filesystems and updating their databases, starting an application or loading a file from within an application takes about the same time as if the storage device was idle. As a comparison, with CFQ, NOOP or DEADLINE, and in the same conditions, applications experience high latencies, or even become unresponsive until the background workload terminates (also on SSDs). Low latency for soft real-time applications Also soft real-time applications, such as audio and video players/streamers, enjoy a low latency and a low drop rate, regardless of the background I/O workload. As a consequence, these applications do not suffer from almost any glitch due to the background workload. High throughput On hard disks, BFQ achieves up to 30% higher throughput than CFQ, and up to 150% higher throughput than DEADLINE and NOOP, with half of the parallel workloads considered in our tests. With the rest of the workloads, and with all the workloads on flash-based devices, BFQ achieves instead about the same throughput as the other schedulers. Strong fairness guarantees (already provided by BFQ-v0) As for long-term guarantees, BFQ distributes the device throughput (and not just the device time) as desired to I/O-bound applications, with any workload and regardless of the device parameters. BFQ achieves the above service properties thanks to the combination of its accurate scheduling engine (patches 9-10), and a set of simple heuristics and improvements (patches 11-22). Details on how BFQ and its components work are provided in the descriptions of the patches. In addition, an organic description of the main BFQ algorithm and of most of its features can be found in this paper [4]. What BFQ can do in practice is shown, e.g., in this 8-minute demo with an SSD: [5]. I made this demo with an older version of BFQ (v7r6) and under Linux 3.17.0, but, for the tests considered in the demo, performance has remained about the same with more recent BFQ and kernel versions. More details about this point can be found here [6], together with graphs showing the performance of BFQ, as compared with CFQ, DEADLINE and NOOP, and on: a fast and a slow hard disk, a RAID1, an SSD, a microSDHC Card and an eMMC. As an example, our results on the SSD are reported also in a table at the end of this email. Finally, as for testing in everyday use, BFQ is the default I/O scheduler in, e.g., Manjaro, Sabayon, OpenMandriva and Arch Linux ARM, plus several kernel forks for PCs and smartphones. In addition, BFQ is optionally available in, e.g., Arch, PCLinuxOS and Gentoo, and we record several downloads a day from people using other distributions. The feedback received so far basically confirms the expected latency drop and throughput boost. Paolo Results on a Plextor PX-256M5S SSD The first two rows of the next table report the aggregate throughput achieved by BFQ, CFQ, DEADLINE and NOOP, while ten parallel processes read, either sequentially or randomly, a separate portion of the memory blocks each. These processes read directly from the device, and no process performs writes, to avoid writing large files
[PATCH RFC 00/22] Replace the CFQ I/O Scheduler with BFQ
Hi, this patchset replaces CFQ with the last version of BFQ (which is a proportional-share I/O scheduler). To make a smooth transition, this patchset first brings CFQ back to its state at the time when BFQ was forked from CFQ. Basically, this reduces CFQ to its engine, by removing every heuristic and improvement that has nothing to do with any heuristic or improvement in BFQ, and every heuristic and improvement whose goal is achieved in a different way in BFQ. Then, the second part of the patchset starts by replacing CFQ's engine with BFQ's engine, and goes on by adding current BFQ improvements and extra heuristics. For your convenience, here is the thread in which we agreed on both this first step, and the second and last step: [1]. Moreover, here is a direct link to the email describing both steps: [2]. Some patch generates WARNINGS with checkpatch.pl, but these WARNINGS seem to be either unavoidable for the involved pieces of code (which the patch just extends), or false positives. Turning back to BFQ, its first version was submitted a few years ago [3]. It is denoted as v0 in this patchset, to distinguish it from the version I am submitting now, v7r11. In particular, the first two patches concerned with BFQ introduce BFQ-v0, whereas the remaining patches turn progressively BFQ-v0 into BFQ-v7r11. Here are some nice features of BFQ-v7r11. Low latency for interactive applications According to our results, and regardless of the actual background workload, for interactive tasks the storage device is virtually as responsive as if it was idle. For example, even if one or more of the following background workloads are being executed: - one or more large files are being read or written, - a tree of source files is being compiled, - one or more virtual machines are performing I/O, - a software update is in progress, - indexing daemons are scanning filesystems and updating their databases, starting an application or loading a file from within an application takes about the same time as if the storage device was idle. As a comparison, with CFQ, NOOP or DEADLINE, and in the same conditions, applications experience high latencies, or even become unresponsive until the background workload terminates (also on SSDs). Low latency for soft real-time applications Also soft real-time applications, such as audio and video players/streamers, enjoy a low latency and a low drop rate, regardless of the background I/O workload. As a consequence, these applications do not suffer from almost any glitch due to the background workload. High throughput On hard disks, BFQ achieves up to 30% higher throughput than CFQ, and up to 150% higher throughput than DEADLINE and NOOP, with half of the parallel workloads considered in our tests. With the rest of the workloads, and with all the workloads on flash-based devices, BFQ achieves instead about the same throughput as the other schedulers. Strong fairness guarantees (already provided by BFQ-v0) As for long-term guarantees, BFQ distributes the device throughput (and not just the device time) as desired to I/O-bound applications, with any workload and regardless of the device parameters. BFQ achieves the above service properties thanks to the combination of its accurate scheduling engine (patches 9-10), and a set of simple heuristics and improvements (patches 11-22). Details on how BFQ and its components work are provided in the descriptions of the patches. In addition, an organic description of the main BFQ algorithm and of most of its features can be found in this paper [4]. What BFQ can do in practice is shown, e.g., in this 8-minute demo with an SSD: [5]. I made this demo with an older version of BFQ (v7r6) and under Linux 3.17.0, but, for the tests considered in the demo, performance has remained about the same with more recent BFQ and kernel versions. More details about this point can be found here [6], together with graphs showing the performance of BFQ, as compared with CFQ, DEADLINE and NOOP, and on: a fast and a slow hard disk, a RAID1, an SSD, a microSDHC Card and an eMMC. As an example, our results on the SSD are reported also in a table at the end of this email. Finally, as for testing in everyday use, BFQ is the default I/O scheduler in, e.g., Manjaro, Sabayon, OpenMandriva and Arch Linux ARM, plus several kernel forks for PCs and smartphones. In addition, BFQ is optionally available in, e.g., Arch, PCLinuxOS and Gentoo, and we record several downloads a day from people using other distributions. The feedback received so far basically confirms the expected latency drop and throughput boost. Paolo Results on a Plextor PX-256M5S SSD The first two rows of the next table report the aggregate throughput achieved by BFQ, CFQ, DEADLINE and NOOP, while ten parallel processes read, either sequentially or randomly, a separate portion of the memory blocks each. These processes read directly from the device, and no process performs writes, to avoid writing large files