Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
> On 29 Oct 2016, at 16:12, Jens Axboe wrote:
>
> On 10/28/2016 11:38 PM, Paolo Valente wrote:
>>
>>> On 26 Oct 2016, at 18:12, Jens Axboe wrote:
>>>
>>> On 10/26/2016 10:04 AM, Paolo Valente wrote:
>>>>
>>>>> On 26 Oct 2016, at 17:32, Jens Axboe wrote:
>>>>>
>>>>> On 10/26/2016 09:29 AM, Christoph Hellwig wrote:
>>>>>> On Wed, Oct 26, 2016 at 05:13:07PM +0200, Arnd Bergmann wrote:
>>>>>>> The question to ask first is whether to actually have pluggable
>>>>>>> schedulers on blk-mq at all, or just have one that is meant to
>>>>>>> do the right thing in every case (and possibly can be bypassed
>>>>>>> completely).
>>>>>>
>>>>>> That would be my preference. Have a BFQ-variant for blk-mq as an
>>>>>> option (default to off unless opted in by the driver or user), and
>>>>>> no other scheduler for blk-mq. Don't bother with bfq for non
>>>>>> blk-mq. It's not like there is any advantage in the legacy-request
>>>>>> device even for slow devices, except for the option of having I/O
>>>>>> scheduling.
>>>>>
>>>>> It's the only right way forward. blk-mq might not offer any
>>>>> substantial advantages to rotating storage, but with scheduling, it
>>>>> won't offer a downside either. And it'll take us towards the real
>>>>> goal, which is to have just one IO path.
>>>>
>>>> ok
>>>>
>>>>> Adding a new scheduler for the legacy IO path makes no sense.
>>>>
>>>> I would fully agree if effective and stable I/O scheduling were
>>>> available in blk-mq in one or two months. But I guess that it will
>>>> take at least one year, optimistically, given the current status of
>>>> the needed infrastructure, and given the great difficulty of doing
>>>> effective scheduling at the high parallelism and extreme target
>>>> speeds of blk-mq. Of course, this holds true unless only minimal
>>>> scheduling is performed.
>>>>
>>>> So, what's the point in forcing a lot of users to wait another year
>>>> or more for a solution that has yet to be even defined, while they
>>>> could enjoy a much better system now, and then switch to an even
>>>> better system when scheduling is ready in blk-mq too?
>>>
>>> That same argument could have been made 2 years ago. Saying no to a
>>> new scheduler for the legacy framework goes back roughly that long.
>>> We could have had BFQ for mq NOW, if we didn't keep coming back to
>>> this very point.
>>>
>>> I'm hesitant to add a new scheduler because it's very easy to add,
>>> very difficult to get rid of. If we do add BFQ as a legacy scheduler
>>> now, it'll take us years and years to get rid of it again. We should
>>> be moving towards LESS moving parts in the legacy path, not more.
>>>
>>> We can keep having this discussion every few years, but I think we'd
>>> both prefer to make some actual progress here.
>>
>> ok Jens, I give up
>>
>>> It's perfectly fine to add an interface for a single queue interface
>>> for an IO scheduler for blk-mq, since we don't care too much about
>>> scalability there. And that won't take years, that should be a few
>>> weeks. Retrofitting BFQ on top of that should not be hard either.
>>> That can co-exist with a real multiqueue scheduler as well, something
>>> that's geared towards some fairness for faster devices.
>>
>> AFAICT this solution is good, for many practical reasons. I don't
>> have the expertise to build such an infrastructure well on my own. At
>> least not in an acceptable amount of time, because working on this
>> nice stuff is unfortunately not my job (although Linaro is now
>> supporting me for BFQ).
>>
>> Then, assuming that this solution may be of general interest, and that
>> BFQ's benefits convinced you a little bit too, may I get significant
>> collaboration/help on implementing this infrastructure?
>
> Of course, I already offered to help with this.

Yep, I just did not want to take this important point for granted.

>> If so, Jens and all possibly interested parties, could we have a sort
>> of short kick-off technical meeting during KS/LPC?
>
> I'm not a huge fan of setting up a BoF to discuss something technical,
> when there's no code to discuss yet. We need some actual meat on the
> bone in the shape of code, and that's much better dealt with in email.
> Timing is pretty advanced at this point, otherwise I'd offer to cook
> something up that we COULD discuss, but I will not have time to do that
> for KS.

Sorry, I was not thinking of any BoF or the like. I just meant, with a
stuffy phrase, "let's get it started concretely".

> If you are at LPC, why don't the two of us sit down and talk about it
> Wednesday or Thursday?

I'm also at KS. I'm available from Sunday evening to Wednesday evening.
I'm leaving on Thursday morning. If Wednesday is in any case your
preferred day, then let's do it on Wednesday. At what time? If I
understand correctly, Bart will join us too.

> I'd like to try and understand what parts of blk-mq you aren't up to
> speed on, and how we can best get a simple framework going that will
> allow us to entertain single queue scheduling within blk-mq.
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On 10/28/2016 11:38 PM, Paolo Valente wrote:
>
>> On 26 Oct 2016, at 18:12, Jens Axboe wrote:
>>
>> On 10/26/2016 10:04 AM, Paolo Valente wrote:
>>>
>>>> On 26 Oct 2016, at 17:32, Jens Axboe wrote:
>>>>
>>>> On 10/26/2016 09:29 AM, Christoph Hellwig wrote:
>>>>> On Wed, Oct 26, 2016 at 05:13:07PM +0200, Arnd Bergmann wrote:
>>>>>> The question to ask first is whether to actually have pluggable
>>>>>> schedulers on blk-mq at all, or just have one that is meant to
>>>>>> do the right thing in every case (and possibly can be bypassed
>>>>>> completely).
>>>>>
>>>>> That would be my preference. Have a BFQ-variant for blk-mq as an
>>>>> option (default to off unless opted in by the driver or user), and
>>>>> no other scheduler for blk-mq. Don't bother with bfq for non
>>>>> blk-mq. It's not like there is any advantage in the legacy-request
>>>>> device even for slow devices, except for the option of having I/O
>>>>> scheduling.
>>>>
>>>> It's the only right way forward. blk-mq might not offer any
>>>> substantial advantages to rotating storage, but with scheduling, it
>>>> won't offer a downside either. And it'll take us towards the real
>>>> goal, which is to have just one IO path.
>>>
>>> ok
>>>
>>>> Adding a new scheduler for the legacy IO path makes no sense.
>>>
>>> I would fully agree if effective and stable I/O scheduling were
>>> available in blk-mq in one or two months. But I guess that it will
>>> take at least one year, optimistically, given the current status of
>>> the needed infrastructure, and given the great difficulty of doing
>>> effective scheduling at the high parallelism and extreme target
>>> speeds of blk-mq. Of course, this holds true unless only minimal
>>> scheduling is performed.
>>>
>>> So, what's the point in forcing a lot of users to wait another year
>>> or more for a solution that has yet to be even defined, while they
>>> could enjoy a much better system now, and then switch to an even
>>> better system when scheduling is ready in blk-mq too?
>>
>> That same argument could have been made 2 years ago. Saying no to a
>> new scheduler for the legacy framework goes back roughly that long.
>> We could have had BFQ for mq NOW, if we didn't keep coming back to
>> this very point.
>>
>> I'm hesitant to add a new scheduler because it's very easy to add,
>> very difficult to get rid of. If we do add BFQ as a legacy scheduler
>> now, it'll take us years and years to get rid of it again. We should
>> be moving towards LESS moving parts in the legacy path, not more.
>>
>> We can keep having this discussion every few years, but I think we'd
>> both prefer to make some actual progress here.
>
> ok Jens, I give up
>
>> It's perfectly fine to add an interface for a single queue interface
>> for an IO scheduler for blk-mq, since we don't care too much about
>> scalability there. And that won't take years, that should be a few
>> weeks. Retrofitting BFQ on top of that should not be hard either.
>> That can co-exist with a real multiqueue scheduler as well, something
>> that's geared towards some fairness for faster devices.
>
> AFAICT this solution is good, for many practical reasons. I don't
> have the expertise to build such an infrastructure well on my own. At
> least not in an acceptable amount of time, because working on this
> nice stuff is unfortunately not my job (although Linaro is now
> supporting me for BFQ).
>
> Then, assuming that this solution may be of general interest, and that
> BFQ's benefits convinced you a little bit too, may I get significant
> collaboration/help on implementing this infrastructure?

Of course, I already offered to help with this.

> If so, Jens and all possibly interested parties, could we have a sort
> of short kick-off technical meeting during KS/LPC?

I'm not a huge fan of setting up a BoF to discuss something technical,
when there's no code to discuss yet. We need some actual meat on the
bone in the shape of code, and that's much better dealt with in email.
Timing is pretty advanced at this point, otherwise I'd offer to cook
something up that we COULD discuss, but I will not have time to do that
for KS.

If you are at LPC, why don't the two of us sit down and talk about it
Wednesday or Thursday? I'd like to try and understand what parts of
blk-mq you aren't up to speed on, and how we can best get a simple
framework going that will allow us to entertain single queue scheduling
within blk-mq.

--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On 10/28/16 22:38, Paolo Valente wrote:
> Then, assuming that this solution may be of general interest, and that
> BFQ's benefits convinced you a little bit too, may I get significant
> collaboration/help on implementing this infrastructure? If so, Jens
> and all possibly interested parties, could we have a sort of short
> kick-off technical meeting during KS/LPC?

Hello Paolo and Jens,

Please keep me in the loop for any communication about BFQ / blk-mq
scheduling. My employer was kind enough to allow me to spend some of my
time working on this. I plan to attend the KS.

Bart.
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
> On 26 Oct 2016, at 18:12, Jens Axboe wrote:
>
> On 10/26/2016 10:04 AM, Paolo Valente wrote:
>>
>>> On 26 Oct 2016, at 17:32, Jens Axboe wrote:
>>>
>>> On 10/26/2016 09:29 AM, Christoph Hellwig wrote:
>>>> On Wed, Oct 26, 2016 at 05:13:07PM +0200, Arnd Bergmann wrote:
>>>>> The question to ask first is whether to actually have pluggable
>>>>> schedulers on blk-mq at all, or just have one that is meant to
>>>>> do the right thing in every case (and possibly can be bypassed
>>>>> completely).
>>>>
>>>> That would be my preference. Have a BFQ-variant for blk-mq as an
>>>> option (default to off unless opted in by the driver or user), and
>>>> no other scheduler for blk-mq. Don't bother with bfq for non
>>>> blk-mq. It's not like there is any advantage in the legacy-request
>>>> device even for slow devices, except for the option of having I/O
>>>> scheduling.
>>>
>>> It's the only right way forward. blk-mq might not offer any
>>> substantial advantages to rotating storage, but with scheduling, it
>>> won't offer a downside either. And it'll take us towards the real
>>> goal, which is to have just one IO path.
>>
>> ok
>>
>>> Adding a new scheduler for the legacy IO path makes no sense.
>>
>> I would fully agree if effective and stable I/O scheduling were
>> available in blk-mq in one or two months. But I guess that it will
>> take at least one year, optimistically, given the current status of
>> the needed infrastructure, and given the great difficulty of doing
>> effective scheduling at the high parallelism and extreme target
>> speeds of blk-mq. Of course, this holds true unless only minimal
>> scheduling is performed.
>>
>> So, what's the point in forcing a lot of users to wait another year
>> or more for a solution that has yet to be even defined, while they
>> could enjoy a much better system now, and then switch to an even
>> better system when scheduling is ready in blk-mq too?
>
> That same argument could have been made 2 years ago. Saying no to a
> new scheduler for the legacy framework goes back roughly that long.
> We could have had BFQ for mq NOW, if we didn't keep coming back to
> this very point.
>
> I'm hesitant to add a new scheduler because it's very easy to add,
> very difficult to get rid of. If we do add BFQ as a legacy scheduler
> now, it'll take us years and years to get rid of it again. We should
> be moving towards LESS moving parts in the legacy path, not more.
>
> We can keep having this discussion every few years, but I think we'd
> both prefer to make some actual progress here.

ok Jens, I give up

> It's perfectly fine to add an interface for a single queue interface
> for an IO scheduler for blk-mq, since we don't care too much about
> scalability there. And that won't take years, that should be a few
> weeks. Retrofitting BFQ on top of that should not be hard either.
> That can co-exist with a real multiqueue scheduler as well, something
> that's geared towards some fairness for faster devices.

AFAICT this solution is good, for many practical reasons. I don't have
the expertise to build such an infrastructure well on my own. At least
not in an acceptable amount of time, because working on this nice stuff
is unfortunately not my job (although Linaro is now supporting me for
BFQ).

Then, assuming that this solution may be of general interest, and that
BFQ's benefits convinced you a little bit too, may I get significant
collaboration/help on implementing this infrastructure? If so, Jens and
all possibly interested parties, could we have a sort of short kick-off
technical meeting during KS/LPC?

Thanks,
Paolo

> --
> Jens Axboe
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Fri, Oct 28, 2016 at 5:29 PM, Christoph Hellwig wrote:
> On Fri, Oct 28, 2016 at 11:32:21AM +0200, Linus Walleij wrote:
>> So I'm not just complaining by the way, I'm trying to fix this. Also
>> Bartlomiej from Samsung has done some stabs at switching MMC/SD
>> to blk-mq. I just rebased my latest stab at a naïve switch to blk-mq
>> to v4.9-rc2 with these results.
>>
>> The patch to enable MQ looks like this:
>> https://git.kernel.org/cgit/linux/kernel/git/linusw/linux-stericsson.git/commit/?h=mmc-mq=8f79b527e2e854071d8da019451da68d4753f71d
>>
>> I run these tests directly after boot with cold caches. The results
>> are consistent: I ran the same commands 10 times in a row.
>
> A couple of comments from a quick look over the patch:
>
> In the changelog you complain:
>
> "Lack of front- and back-end merging in the MQ block layer creating
> several small requests instead of a few large ones."
>
> In blk-mq, merging is controlled by the BLK_MQ_F_SHOULD_MERGE and
> BLK_MQ_F_SG_MERGE flags. You set the former, but not the latter.
> BLK_MQ_F_SG_MERGE controls whether multiple physically contiguous
> pages get merged into a single segment. For a dd after a fresh boot
> that is probably very common. Except for the polarity of the merge
> flags, the basic merge functionality between the legacy and blk-mq
> paths should be the same, and if they aren't you've found a bug we
> need to address.

Aha, OK, I will make sure to set both flags next time. (I will also
stop guessing about that as a cause, since that part probably works.)

> You also say that you disable the pipelining. How much of a
> performance gain did this feature give when added? How much does just
> removing that on its own cost you?

Interestingly, the original commit doesn't say:
http://marc.info/?l=linaro-dev=137645684811479=2

It depends on the cache architecture of the machine how much is won,
however. The heavier the cache flushes, the more it gains. I guess I
need to make a patch removing that mechanism to bench it. It's pretty
hard to get rid of, because it goes really deep into the MMC subsystem.
It's massaged in like shampoo.

> While I think that feature is rather messy and should be avoided if
> possible, I don't see how it's impossible to implement in blk-mq.

It's probably possible. What I discussed with Arnd was to let the
blk-mq core call out to these pre-request and post-request hooks on new
requests in parallel with processing a request or a queue of requests.
I.e. add .prep_request() and .unprep_request() callbacks to struct
blk_mq_ops.

I tried to understand whether the existing .init_request and
.exit_request callbacks could be used. But as I understand it, they are
only used to allocate and prepare the extra per-request-associated
memory and state, and do not have access to the request per se, so
nothing is known about the actual request when .init_request() is
called.

So we're looking for something called whenever the contents of a
request are done, right before queueing it, and right after dequeueing
it after being served.

> If you just increase your queue depth and use the old scheme you
> should get it - if you currently can't handle the second command for
> some reason (i.e. the special request magic) you can just return
> BLK_MQ_RQ_QUEUE_BUSY from the queue_rq function.

Bartlomiej's patch set did that, but I haven't been able to reproduce
it. I will try to make a clean patch in the spirit of his.

Yours,
Linus Walleij
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Fri, Oct 28, 2016 at 4:22 PM, Jens Axboe wrote:
> On 10/28/2016 03:32 AM, Linus Walleij wrote:
>>
>> This is without using Bartlomiej's clever hack to pretend we have
>> 2 elements in the HW queue though. His early tests indicate that
>> it doesn't help much: the performance regression we see is due to
>> lack of block scheduling.
>
> A simple dd test, I don't see how that can be slower due to lack of
> scheduling. There's nothing to schedule there, just issue them in
> order?

Yeah, I guess you're right. I guess it could be in part due to not
having activated front- and back-end merges properly, as Christoph
pointed out. I'll look closer at this.

> So that would probably be where I would start looking. A blktrace of
> the in-kernel code and the blk-mq enabled code would perhaps be
> enlightening. I don't think it's worth looking at the more complex
> test cases until the dd test case is at least as fast as the non-mq
> version.

Yeah.

> Was that with CFQ, btw, or what scheduler did it run?

CFQ, just plain defconfig.

> It'd be nice to NOT have to rely on that fake QD=2 setup, since it
> will mess with the IO scheduling as well.

I agree.

>> I try to find a way forward with this, and also massage the MMC/SD
>> code to be more MQ friendly to begin with (like only picking requests
>> when we get a request notification and stopping pulling NULL requests
>> off the queue), but it's really a messy piece of code.
>
> Yeah, it does look pretty messy... I'd be happy to help out with that,
> and particularly in figuring out why the direct conversion is slower
> for a basic 'dd' test case.

I'm looking into it.

Yours,
Linus Walleij
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Fri, Oct 28, 2016 at 08:17:01AM -0600, Jens Axboe wrote:
> On 10/28/2016 12:36 AM, Ulf Hansson wrote:
>> You have been pushing Paolo in different directions throughout the
>> years with his work in BFQ, wasting lots of his time/effort.
>
> I have not. Various entities have advised Paolo to approach it in
> various ways.
>
> We've had blk-mq for 3 years now, my position should have been pretty
> clear on that.

Having come to this somewhat late, I have to say that that hasn't been
100% clear as a set opinion from everyone - in the time I've been
following things there's been engagement about the meat of the code,
which gave the impression the patches were being seriously considered.
But like I said in a previous mail, this is all in the past anyway; we
need to focus on the present situation.
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Friday, October 28, 2016 9:30:07 AM CEST Jens Axboe wrote:
> On 10/28/2016 03:32 AM, Linus Walleij wrote:
>> The patch to enable MQ looks like this:
>> https://git.kernel.org/cgit/linux/kernel/git/linusw/linux-stericsson.git/commit/?h=mmc-mq=8f79b527e2e854071d8da019451da68d4753f71d
>
> BTW, another viable "hack" for the depth issue would be to expose more
> than one hardware queue. It's meant to map to a distinct submission
> region in the hardware, but there's nothing stopping the driver from
> using it differently. Might not be cleaner than just increasing the
> queue depth on a single queue, though.
>
> That still won't solve the issue of lying about it and causing IO
> scheduler confusion, of course.
>
> Also, 4.8 and newer have support for BLK_MQ_F_BLOCKING, if you need to
> block in ->queue_rq(). That could eliminate the need to offload to a
> kthread manually.

I think the main reason for the kthread is that on ARM and other
architectures, the dma mapping operations are fairly slow (for cache
flushes or bounce buffering), and we want to minimize the time between
subsequent requests being handled by the hardware.

This is not unique to MMC in any way; MMC just happens to be common on
ARM, and it is limited by its lack of hardware command queuing. It
would be nice to do a similar trick for SCSI disks, especially USB mass
storage, and maybe also SATA, which are the next most common storage
devices on non-coherent ARM systems (SATA nowadays often comes with
NCQ, so it's less of an issue).

It may be reasonable to tie this in with the I/O scheduler: if you
don't have a scheduler, the access to the device is probably rather
direct and you want to avoid any complexity in the kernel, but if
preparing a request is expensive and the hardware has no queuing, you
probably also want to use a scheduler.

We should probably also try to understand how this could work out with
USB mass storage, if there is a solution at all, and then do it for MMC
in a way that would work on both. I don't think the USB core can
currently split the dma_map_sg() operation from the USB command
submission, so this may require some deeper surgery there.

Arnd
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
Hi,

On Friday, October 28, 2016 09:30:07 AM Jens Axboe wrote:
> On 10/28/2016 03:32 AM, Linus Walleij wrote:
>> The patch to enable MQ looks like this:
>> https://git.kernel.org/cgit/linux/kernel/git/linusw/linux-stericsson.git/commit/?h=mmc-mq=8f79b527e2e854071d8da019451da68d4753f71d
>
> BTW, another viable "hack" for the depth issue would be to expose more
> than one hardware queue. It's meant to map to a distinct submission
> region in the hardware, but there's nothing stopping the driver from
> using it differently. Might not be cleaner than just increasing the
> queue depth on a single queue, though.

Yes, I'm already considering this for the rewritten version of my patch
set, as it may also help with performance compared to the non blk-mq
case. A significant amount of time is spent on DMA map/unmap operations
on ARM MMC hosts, and I would like to do these DMA (un)mapping-s in
parallel for two (or more) requests to check whether it helps the
performance (hopefully the cache controller doesn't serialize these
operations).

BTW I'm following the discussion and still would like to help with
getting blk-mq to work for MMC. I'm just quite busy with other things
at the moment.

> That still won't solve the issue of lying about it and causing IO
> scheduler confusion, of course.
>
> Also, 4.8 and newer have support for BLK_MQ_F_BLOCKING, if you need to
> block in ->queue_rq(). That could eliminate the need to offload to a
> kthread manually.

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On 10/28/2016 03:32 AM, Linus Walleij wrote:
> The patch to enable MQ looks like this:
> https://git.kernel.org/cgit/linux/kernel/git/linusw/linux-stericsson.git/commit/?h=mmc-mq=8f79b527e2e854071d8da019451da68d4753f71d

BTW, another viable "hack" for the depth issue would be to expose more
than one hardware queue. It's meant to map to a distinct submission
region in the hardware, but there's nothing stopping the driver from
using it differently. Might not be cleaner than just increasing the
queue depth on a single queue, though.

That still won't solve the issue of lying about it and causing IO
scheduler confusion, of course.

Also, 4.8 and newer have support for BLK_MQ_F_BLOCKING, if you need to
block in ->queue_rq(). That could eliminate the need to offload to a
kthread manually.

--
Jens Axboe
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On 10/28/2016 12:36 AM, Ulf Hansson wrote:
> [...]
>>> Moreover, I am still trying to understand what's the big deal as to
>>> why you say no to BFQ as a legacy scheduler. Ideally it shouldn't
>>> cause you any maintenance burden, and it doesn't make the removal of
>>> the legacy blk layer any more difficult, right?
>>
>> Not sure I can state it much clearer. It's a new scheduler, and a
>> complicated one at that. It WILL carry a maintenance burden. And I'm
>
> Really? Either you maintain the code or not. And if Paolo would do it,
> then you are off the hook!

Are you trying to be deliberately obtuse? If so, good job. I'd advise
you to look into how code in the kernel is maintained in general. A
maintenance burden exists for code A, but it also carries over to the
subsystem it is under, and the kernel in general. Adding code is never
free.

>> really not that interested in adding such a burden for something that
>> will be defunct as soon as the single queue blk-mq version is done.
>> Additionally, if we put BFQ in right now, the motivation to do the
>> real work will be gone.
>
> You have been pushing Paolo in different directions throughout the
> years with his work in BFQ, wasting lots of his time/effort.

I have not. Various entities have advised Paolo to approach it in
various ways. We've had blk-mq for 3 years now, my position should have
been pretty clear on that.

> You have not given him any credibility for his work in BFQ and now you
> point him yet in another direction.

I don't even know what that means. But I'm not pointing him in a new
direction.

Ulf, I'm done discussing with you. I've made my position clear, yet you
continue to beat on a dead horse. As far as I'm concerned, there's
nothing further to discuss here. I'll be happy to discuss when there's
some meat on the bone (ie code). Until then, EOD.

--
Jens Axboe
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Fri, Oct 28, 2016 at 2:07 PM, Arnd Bergmann wrote:
>>> I don't think that's an accurate statement. In terms of coverage,
>>> most drivers do support blk-mq. Anything SCSI, nvme, virtio-blk,
>>> SATA runs on (or can run on) top of blk-mq.
>>
>> Well, I just used "git grep" and found that many drivers didn't use
>> blk-mq. Apologies if I gave the wrong impression.
>
> To clarify, this seems to be a complete list:
>
> $ git grep -wl '\(__\|\)blk_\(fetch\|end\|start\)_request' | xargs grep -L blk_mq
> Documentation/scsi/scsi_eh.txt
> arch/um/drivers/ubd_kern.c

AFAICT Daniel looked at the UML block driver and did an initial
conversion some time ago. Daniel? Anton is also working on a patch
series to speed up the driver. Maybe it is time to bite the bullet and
do the conversion.

--
Thanks,
//richard
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Thursday, October 27, 2016 8:13:08 PM CEST Ulf Hansson wrote:
> On 27 October 2016 at 19:43, Jens Axboe wrote:
>> On 10/27/2016 11:32 AM, Ulf Hansson wrote:
>>>
>>> [...]
>>>
>>>> I'm hesitant to add a new scheduler because it's very easy to add,
>>>> very difficult to get rid of. If we do add BFQ as a legacy
>>>> scheduler now, it'll take us years and years to get rid of it
>>>> again. We should be moving towards LESS moving parts in the legacy
>>>> path, not more.
>>>
>>> Jens, I think you are wrong here and let me try to elaborate on why.
>>>
>>> 1)
>>> We already have legacy schedulers like CFQ, DEADLINE, etc - and most
>>> block device drivers are still using the legacy blk interface.
>>
>> I don't think that's an accurate statement. In terms of coverage,
>> most drivers do support blk-mq. Anything SCSI, nvme, virtio-blk, SATA
>> runs on (or can run on) top of blk-mq.
>
> Well, I just used "git grep" and found that many drivers didn't use
> blk-mq. Apologies if I gave the wrong impression.

To clarify, this seems to be a complete list:

$ git grep -wl '\(__\|\)blk_\(fetch\|end\|start\)_request' | xargs grep -L blk_mq
Documentation/scsi/scsi_eh.txt
arch/um/drivers/ubd_kern.c
block/blk-tag.c
block/bsg-lib.c
drivers/block/DAC960.c
drivers/block/amiflop.c
drivers/block/aoe/aoeblk.c
drivers/block/aoe/aoecmd.c
drivers/block/aoe/aoedev.c
drivers/block/ataflop.c
drivers/block/cciss.c
drivers/block/floppy.c
drivers/block/hd.c
drivers/block/mg_disk.c
drivers/block/osdblk.c
drivers/block/paride/pcd.c
drivers/block/paride/pd.c
drivers/block/paride/pf.c
drivers/block/ps3disk.c
drivers/block/skd_main.c
drivers/block/sunvdc.c
drivers/block/swim.c
drivers/block/swim3.c
drivers/block/sx8.c
drivers/block/xsysace.c
drivers/block/z2ram.c
drivers/cdrom/gdrom.c
drivers/ide/ide-atapi.c
drivers/ide/ide-io.c
drivers/ide/ide-pm.c
drivers/memstick/core/ms_block.c
drivers/memstick/core/mspro_block.c
drivers/mmc/card/block.c
drivers/mmc/card/queue.c
drivers/mtd/mtd_blkdevs.c
drivers/s390/block/dasd.c
drivers/s390/block/scm_blk.c
drivers/sbus/char/jsflash.c
drivers/scsi/osd/osd_initiator.c
drivers/scsi/scsi_transport_fc.c
drivers/scsi/scsi_transport_sas.c
samples/bpf/tracex3_kern.c

From what I can tell, most of these are hopelessly obsolete, but there
are some notable exceptions: aoe, osdblk, skd, sunvdc, mtdblk, mmc,
dasd and scm. I've never used any of the first four, but the last four
of the list are certainly important (for very different reasons).

Arnd
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Fri, Oct 28, 2016 at 12:27 AM, Linus Walleij wrote:
> On Thu, Oct 27, 2016 at 11:08 PM, Jens Axboe wrote:
>
>> blk-mq has evolved to support a variety of devices, there's nothing
>> special about mmc that can't work well within that framework.
>
> There is. Read mmc_queue_thread() in drivers/mmc/card/queue.c

So I'm not just complaining by the way, I'm trying to fix this. Also Bartlomiej from Samsung has done some stabs at switching MMC/SD to blk-mq. I just rebased my latest stab at a naïve switch to blk-mq to v4.9-rc2, with these results.

The patch to enable MQ looks like this:
https://git.kernel.org/cgit/linux/kernel/git/linusw/linux-stericsson.git/commit/?h=mmc-mq&id=8f79b527e2e854071d8da019451da68d4753f71d

I ran these tests directly after boot with cold caches. The results are consistent: I ran the same commands 10 times in a row.

BEFORE switching to BLK-MQ (clean v4.9-rc2):

time dd if=/dev/mmcblk0 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.0GB) copied, 47.781464 seconds, 21.4MB/s
real    0m 47.79s
user    0m 0.02s
sys     0m 9.35s

mount /dev/mmcblk0p1 /mnt/
cd /mnt/
time find . > /dev/null
real    0m 3.60s
user    0m 0.25s
sys     0m 1.58s

mount /dev/mmcblk0p1 /mnt/
iozone -az -i0 -i1 -i2 -s 20m -I -f /mnt/foo.test
(kBytes/second)
                                                random  random
    kB  reclen   write rewrite    read  reread    read   write
 20480       4    2112    2157    6052    6060    6025      40
 20480       8    4820    5074    9163    9121    9125      81
 20480      16    5755    5242   12317   12320   12280     165
 20480      32    6176    6261   14981   14987   14962     336
 20480      64    6547    5875   16826   16828   16810     692
 20480     128    6762    6828   17899   17896   17896    1408
 20480     256    6802    6871   16960   17513   18373    3048
 20480     512    7220    7252   18675   18746   18741    7228
 20480    1024    7222    7304   18436   17858   18246    7322
 20480    2048    7316    7398   18744   18751   18526    7419
 20480    4096    7520    7636   20774   20995   20703    7609
 20480    8192    7519    7704   21850   21489   21467    7663
 20480   16384    7395    7782   22399   22210   22215    7781

AFTER switching to BLK-MQ:

time dd if=/dev/mmcblk0 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.0GB) copied, 60.551117 seconds, 16.9MB/s
real    1m 0.56s
user    0m 0.02s
sys     0m 9.81s

mount /dev/mmcblk0p1 /mnt/
cd /mnt/
time find . > /dev/null
real    0m 4.42s
user    0m 0.24s
sys     0m 1.81s

mount /dev/mmcblk0p1 /mnt/
iozone -az -i0 -i1 -i2 -s 20m -I -f /mnt/foo.test
(kBytes/second)
                                                random  random
    kB  reclen   write rewrite    read  reread    read   write
 20480       4    2086    2201    6024    6036    6006      40
 20480       8    4812    5036    8014    9121    9090      82
 20480      16    5432    5633   12267    9776   12212     168
 20480      32    6180    6233   14870   14891   14852     340
 20480      64    6382    5454   16744   16771   16746     702
 20480     128    6761    6776   17816   17846   17836    1394
 20480     256    6828    6842   17789   17895   17094    3084
 20480     512    7158    7222   17957   17681   17698    7232
 20480    1024    7215    7274   18642   17679   18031    7300
 20480    2048    7229    7269   17943   18642   17732    7358
 20480    4096    7212    7360   18272   18157   18889    7371
 20480    8192    7008    7271   18632   18707   18225    7282
 20480   16384    6889    7211   18243   18429   18018    7246

A simple dd read test of 1 GB is always consistently 10+ seconds slower with MQ. find in the rootfs is a second slower. iozone results are consistently lower throughput or the same.

This is without using Bartlomiej's clever hack to pretend we have 2 elements in the HW queue though.
His early tests indicate that it doesn't help much: the performance regression we see is due to the lack of block scheduling.

I'm trying to find a way forward with this, and also to massage the MMC/SD code to be more MQ-friendly to begin with (like only picking requests when we get a request notification, and stopping pulling NULL requests off the queue), but it's really a messy piece of code.

Yours,
Linus Walleij
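As a cross-check of the dd figures above: dd's "MB/s" here is binary mebibytes, so the regression can be recomputed from the byte count and elapsed times dd printed. A quick awk sketch; nothing below is new data, just the numbers from the two runs:

```shell
# Recompute throughput from the dd output above:
# 1073741824 bytes in 47.781464 s (legacy) vs 60.551117 s (blk-mq).
awk 'BEGIN {
    bytes  = 1073741824                   # 1 GiB, as printed by dd
    before = bytes / 47.781464 / 1048576  # clean v4.9-rc2
    after  = bytes / 60.551117 / 1048576  # naive blk-mq switch
    printf "before=%.1f MiB/s after=%.1f MiB/s drop=%.0f%%\n",
           before, after, (before - after) / before * 100
}'
# prints: before=21.4 MiB/s after=16.9 MiB/s drop=21%
```

So the "10+ seconds slower" dd run corresponds to roughly a 21% sequential-read throughput drop, which matches Linus's conclusion that the regression is large enough to need scheduling support rather than micro-tuning.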
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Thu 27-10-16 10:26:18, Jens Axboe wrote:
> On 10/27/2016 03:26 AM, Jan Kara wrote:
> >On Wed 26-10-16 10:12:38, Jens Axboe wrote:
> >>On 10/26/2016 10:04 AM, Paolo Valente wrote:
> >>>
> >>>>Il giorno 26 ott 2016, alle ore 17:32, Jens Axboe ha scritto:
> >>>>
> >>>>On 10/26/2016 09:29 AM, Christoph Hellwig wrote:
> >>>>>On Wed, Oct 26, 2016 at 05:13:07PM +0200, Arnd Bergmann wrote:
> >>>>>>The question to ask first is whether to actually have pluggable
> >>>>>>schedulers on blk-mq at all, or just have one that is meant to
> >>>>>>do the right thing in every case (and possibly can be bypassed
> >>>>>>completely).
> >>>>>
> >>>>>That would be my preference. Have a BFQ-variant for blk-mq as an
> >>>>>option (default to off unless opted in by the driver or user), and
> >>>>>no other scheduler for blk-mq. Don't bother with bfq for non
> >>>>>blk-mq. It's not like there is any advantage in the legacy-request
> >>>>>device even for slow devices, except for the option of having I/O
> >>>>>scheduling.
> >>>>
> >>>>It's the only right way forward. blk-mq might not offer any substantial
> >>>>advantages to rotating storage, but with scheduling, it won't offer a
> >>>>downside either. And it'll take us towards the real goal, which is to
> >>>>have just one IO path.
> >>>
> >>>ok
> >>>
> >>>>Adding a new scheduler for the legacy IO path
> >>>>makes no sense.
> >>>
> >>>I would fully agree if effective and stable I/O scheduling were
> >>>available in blk-mq in one or two months. But I guess that it will
> >>>take at least one year optimistically, given the current status of the
> >>>needed infrastructure, and given the great difficulties of doing
> >>>effective scheduling at the high parallelism and extreme target speeds
> >>>of blk-mq. Of course, this holds true unless little clever scheduling
> >>>is performed.
> >>>
> >>>So, what's the point in forcing a lot of users to wait another year or
> >>>more, for a solution that has yet to be even defined, while they could
> >>>enjoy a much better system, and then switch to an even better system when
> >>>scheduling is ready in blk-mq too?
> >>
> >>That same argument could have been made 2 years ago. Saying no to a new
> >>scheduler for the legacy framework goes back roughly that long. We could
> >>have had BFQ for mq NOW, if we didn't keep coming back to this very
> >>point.
> >>
> >>I'm hesitant to add a new scheduler because it's very easy to add, very
> >>difficult to get rid of. If we do add BFQ as a legacy scheduler now,
> >>it'll take us years and years to get rid of it again. We should be
> >>moving towards LESS moving parts in the legacy path, not more.
> >>
> >>We can keep having this discussion every few years, but I think we'd
> >>both prefer to make some actual progress here. It's perfectly fine to
> >>add an interface for a single queue interface for an IO scheduler for
> >>blk-mq, since we don't care too much about scalability there. And that
> >>won't take years, that should be a few weeks. Retrofitting BFQ on top of
> >>that should not be hard either. That can co-exist with a real multiqueue
> >>scheduler as well, something that's geared towards some fairness for
> >>faster devices.
> >
> >OK, so some solution like having a variant of blk_sq_make_request() that
> >will consume requests, do IO scheduling decisions on them, and feed them
> >into the HW queue as it sees fit would be acceptable? That will provide
> >the IO scheduler the global view it needs for complex scheduling
> >decisions, so it should indeed be relatively easy to port BFQ to work
> >like that.
>
> I'd probably start off Omar's base [1] that switches the software queues
> to store bios instead of requests, since that lifts the 1:1
> mapping between what we can queue up and what we can dispatch.
> Without
> that, the IO scheduler won't have too much to work with. And with that
> in place, it'll be a "bio in, request out" type of setup, which is
> similar to what we have in the legacy path.
>
> I'd keep the software queues, but as a starting point, mandate 1
> hardware queue to keep that as the per-device view of the state. The IO
> scheduler would be responsible for moving one or more bios from the
> software queues to the hardware queue, when they are ready to dispatch.
>
> [1] https://github.com/osandov/linux/commit/8ef3508628b6cf7c4712cd3d8084ee11ef5d2530

Yeah, but what would software queues actually be good for on a single queue device with device-global IO scheduling? The IO scheduler doing complex decisions will keep all the bios / requests in a single structure anyway, so there's no scalability to gain from per-cpu software queues... So you can directly consume bios in your ->make_request handler, place them in IO scheduler structures, and then push requests out to the HW queue in response to HW tags getting freed (i.e. IO completion). No need for intermediate software queues. But maybe I miss something.
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
[...]
>> Moreover, I am still trying to understand what's the big deal to why
>> you say no to BFQ as a legacy scheduler. Ideally it shouldn't cause
>> you any maintenance burden and it doesn't make the removal of the
>> legacy blk layer any more difficult, right?
>
> Not sure I can state it much clearer. It's a new scheduler, and a
> complicated one at that. It WILL carry a maintenance burden. And I'm

Really? Either you maintain the code or not. And if Paolo would do it, then you are off the hook!

> really not that interested in adding such a burden for something that
> will be defunct as soon as the single queue blk-mq version is done.
> Additionally, if we put BFQ in right now, the motivation to do the real
> work will be gone.

You have been pushing Paolo in different directions throughout the years with his work on BFQ, wasting lots of his time and effort. You have not given him any credit for his work on BFQ, and now you point him in yet another direction.

I understand Paolo is a very persistent, hard-working guy, most likely because he is really confident about his work on BFQ, and he should be! But, regarding motivation, if you continue to push him in different directions without giving him any credit, then at some point you probably know what will happen.

> The path forward is clear. It'd be a lot better to put some work behind
> that, rather than continue this email thread.

Yes, it seems so!

Kind regards
Ulf Hansson
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On 10/27/2016 01:34 PM, Ulf Hansson wrote:
[...]

Instead, what I can tell, as we have been looking into converting mmc (which I maintain), is that it is indeed a significant amount of work. We will need to rip out all of the mmc request management, and most likely we also need to extend the blkmq interface, to be able to re-implement all the current request optimizations. We are looking into this, but it just takes time.

It's usually as much work as you make it into; for most cases it's pretty straightforward and usually removes more code than it adds. Hence the end result is better for it as well - less code in a driver is better.

From a scalability and maintenance point of view, converting to blkmq makes perfect sense. Although, I personally don't want to sacrifice on performance (at least very little), just for the sake of gaining in scalability/maintainability.

Nobody has said anything about sacrificing performance. And whether you like it or not, maintainability is always the most important aspect. Even performance takes a backseat to maintainability.

I would rather strive to adapt the blkmq framework to also suit my needs. Then it simply takes more time. For example, in the mmc case we have implemented an asynchronous request path, which greatly improves performance on some systems.

blk-mq has evolved to support a variety of devices, there's nothing special about mmc that can't work well within that framework.

3) While we work on scheduling in blkmq (at least for single queue devices), it's of course important that we set high goals. Having BFQ (and the other schedulers) in the legacy blk provides a good reference for what we could aim for.

Sure, but you don't need BFQ to be included in the kernel for that.

Perhaps not. But does that mean you expect Paolo to maintain an up to date BFQ tree for you?

I don't expect anything. If Paolo or others want to compare with BFQ on the legacy IO path, then they can do that however way they want. If you (and others) want to have that reference point, it's up to you how to accomplish that.

Do I get this right? You personally don't care about using BFQ as a reference when evolving blkmq for single queue devices? Paolo and lots of other Linux users certainly do care about this.

I'm getting a little tired of this putting words in my mouth... That is not what I'm saying at all. What I'm saying is that the people working on BFQ can do what they need to do to have a reference implementation to compare against. You don't need BFQ in the kernel for that. I said it's up to YOU, with the you here meaning the people that want to work on it, how that goes down.

Moreover, I am still trying to understand what's the big deal to why you say no to BFQ as a legacy scheduler. Ideally it shouldn't cause you any maintenance burden and it doesn't make the removal of the legacy blk layer any more difficult, right?

Not sure I can state it much clearer. It's a new scheduler, and a complicated one at that. It WILL carry a maintenance burden. And I'm really not that interested in adding such a burden for something that will be defunct as soon as the single queue blk-mq version is done. Additionally, if we put BFQ in right now, the motivation to do the real work will be gone.

The path forward is clear. It'd be a lot better to put some work behind that, rather than continue this email thread.

--
Jens Axboe
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Thu, Oct 27, 2016 at 08:41:27PM +0100, Mark Brown wrote:
> Plus the benchmarking to verify that it works well of course, especially
> initially where it'll also be a new queue infrastructure as well as the
> blk-mq conversion itself. It does feel like something that's going to
> take at least a couple of kernel releases to get through.

Or to put it the other way around: it could have been long done if people had started it the first time it was suggested. Instead you guys keep arguing and nothing gets done. Get started now, waiting won't make anything go faster.

> I think there's also value in having improvements there for people who
> benefit from them while queue infrastructure for blk-mq is being worked
> on.

Well, apply it to your vendor tree then and maintain it yourself if you disagree with our direction.
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Thu, Oct 27, 2016 at 12:21:06PM -0600, Jens Axboe wrote:
> On 10/27/2016 12:13 PM, Ulf Hansson wrote:
> > I can imagine, that it's not always a straight forward "convert to blk
> > mq" patch for every block device driver.
> Well, I've actually done a few conversions, and it's not difficult at
> all. The grunt of the work is usually around converting to using some of
> the blk-mq features for parts of the driver that it had implemented
> privately, like timeout handling, etc.

Plus the benchmarking to verify that it works well of course, especially initially where it'll also be a new queue infrastructure as well as the blk-mq conversion itself. It does feel like something that's going to take at least a couple of kernel releases to get through.

> > > > 3) While we work on scheduling in blkmq (at least for single queue
> > > > devices), it's of course important that we set high goals. Having BFQ
> > > > (and the other schedulers) in the legacy blk, provides a good
> > > > reference for what we could aim for.
> > > Sure, but you don't need BFQ to be included in the kernel for that.
> > Perhaps not.
> > But does that mean, you expect Paolo to maintain an up to date BFQ
> > tree for you?
> I don't expect anything. If Paolo or others want to compare with BFQ on
> the legacy IO path, then they can do that however way they want. If you
> (and others) want to have that reference point, it's up to you how to
> accomplish that.

I think there's also value in having improvements there for people who benefit from them while queue infrastructure for blk-mq is being worked on.
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
[...]
>> Instead, what I can tell, as we have been looking into converting mmc
>> (which I maintain), is that it is indeed a significant amount of work.
>> We will need to rip out all of the mmc request management, and most
>> likely we also need to extend the blkmq interface, to be able to
>> re-implement all the current request optimizations. We are looking
>> into this, but it just takes time.
>
> It's usually as much work as you make it into; for most cases it's
> pretty straightforward and usually removes more code than it adds.
> Hence the end result is better for it as well - less code in a driver is
> better.

From a scalability and maintenance point of view, converting to blkmq makes perfect sense. Although, I personally don't want to sacrifice on performance (at least very little), just for the sake of gaining in scalability/maintainability. I would rather strive to adapt the blkmq framework to also suit my needs. Then it simply takes more time. For example, in the mmc case we have implemented an asynchronous request path, which greatly improves performance on some systems.

>> I can imagine, that it's not always a straight forward "convert to blk
>> mq" patch for every block device driver.
>
> Well, I've actually done a few conversions, and it's not difficult at
> all. The grunt of the work is usually around converting to using some of
> the blk-mq features for parts of the driver that it had implemented
> privately, like timeout handling, etc.
>
> I'm always happy to help people with converting drivers.

Great, we'll ping you if we need some help! Thanks!

>>>> 3) While we work on scheduling in blkmq (at least for single queue
>>>> devices), it's of course important that we set high goals. Having BFQ
>>>> (and the other schedulers) in the legacy blk, provides a good
>>>> reference for what we could aim for.
>>>
>>> Sure, but you don't need BFQ to be included in the kernel for that.
>>
>> Perhaps not.
>>
>> But does that mean, you expect Paolo to maintain an up to date BFQ
>> tree for you?
>
> I don't expect anything. If Paolo or others want to compare with BFQ on
> the legacy IO path, then they can do that however way they want. If you
> (and others) want to have that reference point, it's up to you how to
> accomplish that.

Do I get this right? You personally don't care about using BFQ as a reference when evolving blkmq for single queue devices? Paolo and lots of other Linux users certainly do care about this.

Moreover, I am still trying to understand what's the big deal to why you say no to BFQ as a legacy scheduler. Ideally it shouldn't cause you any maintenance burden and it doesn't make the removal of the legacy blk layer any more difficult, right?

Kind regards
Ulf Hansson
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On 10/27/2016 11:32 AM, Ulf Hansson wrote:
[...]

I'm hesitant to add a new scheduler because it's very easy to add, very difficult to get rid of. If we do add BFQ as a legacy scheduler now, it'll take us years and years to get rid of it again. We should be moving towards LESS moving parts in the legacy path, not more.

Jens, I think you are wrong here and let me try to elaborate on why.

1) We already have legacy schedulers like CFQ, DEADLINE, etc - and most block device drivers are still using the legacy blk interface.

I don't think that's an accurate statement. In terms of coverage, most drivers do support blk-mq. Anything SCSI, nvme, virtio-blk, SATA runs on (or can run on) top of blk-mq.

To be able to remove the legacy blk layer, all block device drivers must be converted to blkmq - of course.

That's a given.

So to reach that goal, we will not only need to evolve blkmq to allow scheduling (at least for single queue devices), but we also need to convert *all* block device drivers to blkmq. For sure this will take *years* and not months.

Correct.

More important, when the transition to blkmq has been completed, there is absolutely no difference (from an effort point of view) in removing the legacy blk layer - no matter if we have BFQ in there or not.

I do understand if you have concerns from a maintenance point of view, as I assume you would rather focus on evolving blkmq than care about legacy blk code. So, would it help if Paolo volunteers to maintain the BFQ code in the meantime?

We're obviously still maintaining the legacy IO path. But we don't want to actively develop it, and we haven't, for a long time. And Paolo maintaining it is a strict requirement for inclusion, legacy or blk-mq aside. That would go for both. I'd never accept a major feature from an individual or company if they weren't willing and capable of maintaining it. Throwing submissions over the wall is not viable.

2) While we work on evolving blkmq and convert block device drivers to it, BFQ could, as a separate legacy scheduler, help *lots* of Linux users to get a significantly improved experience. Should we really prevent them from that? I think you block maintainer guys really need to consider this fact.

You still seem to be basing that assumption on the notion that we have to convert tons of drivers for BFQ to make sense under the blk-mq umbrella. That's not the case.

3) While we work on scheduling in blkmq (at least for single queue devices), it's of course important that we set high goals. Having BFQ (and the other schedulers) in the legacy blk provides a good reference for what we could aim for.

Sure, but you don't need BFQ to be included in the kernel for that.

We can keep having this discussion every few years, but I think we'd both prefer to make some actual progress here. It's perfectly fine to add an interface for a single queue interface for an IO scheduler for blk-mq, since we don't care too much about scalability there. And that won't take years, that should be a few weeks. Retrofitting BFQ on top of that should not be hard either. That can co-exist with a real multiqueue scheduler as well, something that's geared towards some fairness for faster devices.

That's really great news! I hope we get a possibility to meet and discuss the plans for this at Kernel Summit/Linux Plumbers next week!

I'll be there.

--
Jens Axboe
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
[...]
>
> I'm hesitant to add a new scheduler because it's very easy to add, very
> difficult to get rid of. If we do add BFQ as a legacy scheduler now,
> it'll take us years and years to get rid of it again. We should be
> moving towards LESS moving parts in the legacy path, not more.

Jens, I think you are wrong here and let me try to elaborate on why.

1) We already have legacy schedulers like CFQ, DEADLINE, etc - and most block device drivers are still using the legacy blk interface.

To be able to remove the legacy blk layer, all block device drivers must be converted to blkmq - of course. So to reach that goal, we will not only need to evolve blkmq to allow scheduling (at least for single queue devices), but we also need to convert *all* block device drivers to blkmq. For sure this will take *years* and not months.

More important, when the transition to blkmq has been completed, there is absolutely no difference (from an effort point of view) in removing the legacy blk layer - no matter if we have BFQ in there or not.

I do understand if you have concerns from a maintenance point of view, as I assume you would rather focus on evolving blkmq than care about legacy blk code. So, would it help if Paolo volunteers to maintain the BFQ code in the meantime?

2) While we work on evolving blkmq and convert block device drivers to it, BFQ could, as a separate legacy scheduler, help *lots* of Linux users to get a significantly improved experience. Should we really prevent them from that? I think you block maintainer guys really need to consider this fact.

3) While we work on scheduling in blkmq (at least for single queue devices), it's of course important that we set high goals. Having BFQ (and the other schedulers) in the legacy blk provides a good reference for what we could aim for.

> We can keep having this discussion every few years, but I think we'd
> both prefer to make some actual progress here. It's perfectly fine to
> add an interface for a single queue interface for an IO scheduler for
> blk-mq, since we don't care too much about scalability there. And that
> won't take years, that should be a few weeks. Retrofitting BFQ on top of
> that should not be hard either. That can co-exist with a real multiqueue
> scheduler as well, something that's geared towards some fairness for
> faster devices.

That's really great news! I hope we get a possibility to meet and discuss the plans for this at Kernel Summit/Linux Plumbers next week!

> --
> Jens Axboe

Kind regards
Ulf Hansson
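For context on what users "enjoying BFQ" would look like in practice: on the legacy path, the elevator is selected per-device through the block layer's sysfs interface, /sys/block/<dev>/queue/scheduler, where the active scheduler is shown in brackets. A hedged sketch of that switch; since writing the real sysfs node needs root and a kernel with the scheduler built in, a temp file stands in for the sysfs node so the snippet is runnable anywhere:

```shell
# On a real system this would be:
#   cat /sys/block/sda/queue/scheduler     -> "noop deadline [cfq] bfq"
#   echo bfq > /sys/block/sda/queue/scheduler
# Stand-in file instead of the real sysfs node:
sched=$(mktemp)
echo "noop deadline [cfq] bfq" > "$sched"   # brackets mark the active elevator

# "Switching" = the kernel re-brackets the chosen name when you echo it in;
# we mimic that rewrite here:
sed -i 's/\[cfq\] bfq/cfq [bfq]/' "$sched"
cat "$sched"
# prints: noop deadline cfq [bfq]
rm "$sched"
```

The point of the sketch is that, with BFQ merged as a legacy scheduler, opting in is a one-line per-device choice for users; nothing forces it on anyone by default.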
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On 10/27/2016 03:26 AM, Jan Kara wrote:

On Wed 26-10-16 10:12:38, Jens Axboe wrote:

On 10/26/2016 10:04 AM, Paolo Valente wrote:

Il giorno 26 ott 2016, alle ore 17:32, Jens Axboe ha scritto:

On 10/26/2016 09:29 AM, Christoph Hellwig wrote:

On Wed, Oct 26, 2016 at 05:13:07PM +0200, Arnd Bergmann wrote:

The question to ask first is whether to actually have pluggable schedulers on blk-mq at all, or just have one that is meant to do the right thing in every case (and possibly can be bypassed completely).

That would be my preference. Have a BFQ-variant for blk-mq as an option (default to off unless opted in by the driver or user), and no other scheduler for blk-mq. Don't bother with bfq for non blk-mq. It's not like there is any advantage in the legacy-request device even for slow devices, except for the option of having I/O scheduling.

It's the only right way forward. blk-mq might not offer any substantial advantages to rotating storage, but with scheduling, it won't offer a downside either. And it'll take us towards the real goal, which is to have just one IO path.

ok

Adding a new scheduler for the legacy IO path makes no sense.

I would fully agree if effective and stable I/O scheduling were available in blk-mq in one or two months. But I guess that it will take at least one year optimistically, given the current status of the needed infrastructure, and given the great difficulties of doing effective scheduling at the high parallelism and extreme target speeds of blk-mq. Of course, this holds true unless little clever scheduling is performed.

So, what's the point in forcing a lot of users to wait another year or more, for a solution that has yet to be even defined, while they could enjoy a much better system, and then switch to an even better system when scheduling is ready in blk-mq too?

That same argument could have been made 2 years ago. Saying no to a new scheduler for the legacy framework goes back roughly that long. We could have had BFQ for mq NOW, if we didn't keep coming back to this very point.

I'm hesitant to add a new scheduler because it's very easy to add, very difficult to get rid of. If we do add BFQ as a legacy scheduler now, it'll take us years and years to get rid of it again. We should be moving towards LESS moving parts in the legacy path, not more.

We can keep having this discussion every few years, but I think we'd both prefer to make some actual progress here. It's perfectly fine to add an interface for a single queue interface for an IO scheduler for blk-mq, since we don't care too much about scalability there. And that won't take years, that should be a few weeks. Retrofitting BFQ on top of that should not be hard either. That can co-exist with a real multiqueue scheduler as well, something that's geared towards some fairness for faster devices.

OK, so some solution like having a variant of blk_sq_make_request() that will consume requests, do IO scheduling decisions on them, and feed them into the HW queue as it sees fit would be acceptable? That will provide the IO scheduler the global view it needs for complex scheduling decisions, so it should indeed be relatively easy to port BFQ to work like that.

I'd probably start off Omar's base [1] that switches the software queues to store bios instead of requests, since that lifts the 1:1 mapping between what we can queue up and what we can dispatch. Without that, the IO scheduler won't have too much to work with. And with that in place, it'll be a "bio in, request out" type of setup, which is similar to what we have in the legacy path.

I'd keep the software queues, but as a starting point, mandate 1 hardware queue to keep that as the per-device view of the state. The IO scheduler would be responsible for moving one or more bios from the software queues to the hardware queue, when they are ready to dispatch.

[1] https://github.com/osandov/linux/commit/8ef3508628b6cf7c4712cd3d8084ee11ef5d2530

--
Jens Axboe
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On 10/27/2016 08:34 AM, Grozdan wrote: On Thu, Oct 27, 2016 at 11:26 AM, Jan Karawrote: On Wed 26-10-16 10:12:38, Jens Axboe wrote: On 10/26/2016 10:04 AM, Paolo Valente wrote: Il giorno 26 ott 2016, alle ore 17:32, Jens Axboe ha scritto: On 10/26/2016 09:29 AM, Christoph Hellwig wrote: On Wed, Oct 26, 2016 at 05:13:07PM +0200, Arnd Bergmann wrote: The question to ask first is whether to actually have pluggable schedulers on blk-mq at all, or just have one that is meant to do the right thing in every case (and possibly can be bypassed completely). That would be my preference. Have a BFQ-variant for blk-mq as an option (default to off unless opted in by the driver or user), and not other scheduler for blk-mq. Don't bother with bfq for non blk-mq. It's not like there is any advantage in the legacy-request device even for slow devices, except for the option of having I/O scheduling. It's the only right way forward. blk-mq might not offer any substantial advantages to rotating storage, but with scheduling, it won't offer a downside either. And it'll take us towards the real goal, which is to have just one IO path. ok Adding a new scheduler for the legacy IO path makes no sense. I would fully agree if effective and stable I/O scheduling would be available in blk-mq in one or two months. But I guess that it will take at least one year optimistically, given the current status of the needed infrastructure, and given the great difficulties of doing effective scheduling at the high parallelism and extreme target speeds of blk-mq. Of course, this holds true unless little clever scheduling is performed. So, what's the point in forcing a lot of users wait another year or more, for a solution that has yet to be even defined, while they could enjoy a much better system, and then switch an even better system when scheduling is ready in blk-mq too? That same argument could have been made 2 years ago. Saying no to a new scheduler for the legacy framework goes back roughly that long. 
We could have had BFQ for mq NOW, if we didn't keep coming back to this very point. I'm hesistant to add a new scheduler because it's very easy to add, very difficult to get rid of. If we do add BFQ as a legacy scheduler now, it'll take us years and years to get rid of it again. We should be moving towards LESS moving parts in the legacy path, not more. We can keep having this discussion every few years, but I think we'd both prefer to make some actual progress here. It's perfectly fine to add an interface for a single queue interface for an IO scheduler for blk-mq, since we don't care too much about scalability there. And that won't take years, that should be a few weeks. Retrofitting BFQ on top of that should not be hard either. That can co-exist with a real multiqueue scheduler as well, something that's geared towards some fairness for faster devices. OK, so some solution like having a variant of blk_sq_make_request() that will consume requests, do IO scheduling decisions on them, and feed them into the HW queue is it sees fit would be acceptable? That will provide the IO scheduler a global view that it needs for complex scheduling decisions so it should indeed be relatively easy to port BFQ to work like that. Honza -- Jan Kara SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Hello, Let me first say that I'm in no way associated with Paolo Valente or any other BFQ developer. I'm a mere user who has had great experience using BFQ My workload is one that takes my disks to their limits. I often use large files like raw Blu-ray streams which then I remux to mkv's while at the same time streaming at least 2 movies to various devices in house and using my system as I do while the remuxing process is going on. 
At times, I'm also pushing video files to my NAS at close to Gbps speed while the stuff I mentioned is in progress. My experience with BFQ is that it has never resulted in the video streams being interrupted due to disk thrashing. I've extensively used all the other Linux disk schedulers in the past, and what I've observed is that whenever I start the remuxing (and copying) process, the streams will begin to hiccup, stutter, and often multi-second-long "waits" will occur. It gets even worse: when I do this kind of workload, the whole system will come to almost a halt and interactivity goes out the window. Impossible to start an app in a reasonable amount of time. Loading a visited website makes Chrome hang while trying to get the contents from its cache, etc. BFQ has greatly helped to have a responsive system during such operations and, as I said, I have never experienced any interruption of the video streams. Do I think BFQ is the best
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Thu, Oct 27, 2016 at 11:26 AM, Jan Kara wrote: > On Wed 26-10-16 10:12:38, Jens Axboe wrote: >> On 10/26/2016 10:04 AM, Paolo Valente wrote: >> > >> >>On 26 Oct 2016, at 17:32, Jens Axboe >> >>wrote: >> >> >> >>On 10/26/2016 09:29 AM, Christoph Hellwig wrote: >> >>>On Wed, Oct 26, 2016 at 05:13:07PM +0200, Arnd Bergmann wrote: >> The question to ask first is whether to actually have pluggable >> schedulers on blk-mq at all, or just have one that is meant to >> do the right thing in every case (and possibly can be bypassed >> completely). >> >>> >> >>>That would be my preference. Have a BFQ-variant for blk-mq as an >> >>>option (default to off unless opted in by the driver or user), and >> >>>no other scheduler for blk-mq. Don't bother with BFQ for non >> >>>blk-mq. It's not like there is any advantage in the legacy-request >> >>>device even for slow devices, except for the option of having I/O >> >>>scheduling. >> >> >> >>It's the only right way forward. blk-mq might not offer any substantial >> >>advantages to rotating storage, but with scheduling, it won't offer a >> >>downside either. And it'll take us towards the real goal, which is to >> >>have just one IO path. >> > >> >ok >> > >> >>Adding a new scheduler for the legacy IO path >> >>makes no sense. >> > >> >I would fully agree if effective and stable I/O scheduling were >> >available in blk-mq in one or two months. But I guess that it will >> >take at least one year, optimistically, given the current status of the >> >needed infrastructure, and given the great difficulties of doing >> >effective scheduling at the high parallelism and extreme target speeds >> >of blk-mq. Of course, this holds true unless only minimally clever scheduling >> >is performed. 
>> > >> >So, what's the point in forcing a lot of users to wait another year or >> >more, for a solution that has yet to be even defined, while they could >> >enjoy a much better system, and then switch to an even better system when >> >scheduling is ready in blk-mq too? >> >> That same argument could have been made 2 years ago. Saying no to a new >> scheduler for the legacy framework goes back roughly that long. We could >> have had BFQ for mq NOW, if we didn't keep coming back to this very >> point. >> >> I'm hesitant to add a new scheduler because it's very easy to add, very >> difficult to get rid of. If we do add BFQ as a legacy scheduler now, >> it'll take us years and years to get rid of it again. We should be >> moving towards LESS moving parts in the legacy path, not more. >> >> We can keep having this discussion every few years, but I think we'd >> both prefer to make some actual progress here. It's perfectly fine to >> add an interface for a single-queue IO scheduler for >> blk-mq, since we don't care too much about scalability there. And that >> won't take years, that should be a few weeks. Retrofitting BFQ on top of >> that should not be hard either. That can co-exist with a real multiqueue >> scheduler as well, something that's geared towards some fairness for >> faster devices. > > OK, so some solution like having a variant of blk_sq_make_request() that > will consume requests, do IO scheduling decisions on them, and feed them > into the HW queue as it sees fit would be acceptable? That will provide the > IO scheduler the global view that it needs for complex scheduling decisions, > so it should indeed be relatively easy to port BFQ to work like that. 
> > Honza > -- > Jan Kara > SUSE Labs, CR Hello, Let me first say that I'm in no way associated with Paolo Valente or any other BFQ developer. I'm a mere user who has had great experience using BFQ. My workload is one that takes my disks to their limits. I often use large files like raw Blu-ray streams, which I then remux to mkv's while at the same time streaming at least 2 movies to various devices in the house and using my system as I do while the remuxing process is going on. At times, I'm also pushing video files to my NAS at close to Gbps speed while the stuff I mentioned is in progress. My experience with BFQ is that it has never resulted in the video streams being interrupted due to disk thrashing. I've extensively used all the other Linux disk schedulers in the past, and what I've observed is that whenever I start the remuxing (and copying) process, the streams will begin to hiccup, stutter, and often multi-second-long "waits" will occur. It gets even worse: when I do this kind of workload, the whole system will come to almost a halt and interactivity goes out the window. Impossible to start an app in a reasonable amount of time. Loading a visited website makes Chrome hang
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Wed 26-10-16 10:12:38, Jens Axboe wrote: > On 10/26/2016 10:04 AM, Paolo Valente wrote: > > > >>On 26 Oct 2016, at 17:32, Jens Axboe > >>wrote: > >> > >>On 10/26/2016 09:29 AM, Christoph Hellwig wrote: > >>>On Wed, Oct 26, 2016 at 05:13:07PM +0200, Arnd Bergmann wrote: > The question to ask first is whether to actually have pluggable > schedulers on blk-mq at all, or just have one that is meant to > do the right thing in every case (and possibly can be bypassed > completely). > >>> > >>>That would be my preference. Have a BFQ-variant for blk-mq as an > >>>option (default to off unless opted in by the driver or user), and > >>>no other scheduler for blk-mq. Don't bother with BFQ for non > >>>blk-mq. It's not like there is any advantage in the legacy-request > >>>device even for slow devices, except for the option of having I/O > >>>scheduling. > >> > >>It's the only right way forward. blk-mq might not offer any substantial > >>advantages to rotating storage, but with scheduling, it won't offer a > >>downside either. And it'll take us towards the real goal, which is to > >>have just one IO path. > > > >ok > > > >>Adding a new scheduler for the legacy IO path > >>makes no sense. > > > >I would fully agree if effective and stable I/O scheduling were > >available in blk-mq in one or two months. But I guess that it will > >take at least one year, optimistically, given the current status of the > >needed infrastructure, and given the great difficulties of doing > >effective scheduling at the high parallelism and extreme target speeds > >of blk-mq. Of course, this holds true unless only minimally clever scheduling > >is performed. > > > >So, what's the point in forcing a lot of users to wait another year or > >more, for a solution that has yet to be even defined, while they could > >enjoy a much better system, and then switch to an even better system when > >scheduling is ready in blk-mq too? > > That same argument could have been made 2 years ago. 
Saying no to a new > scheduler for the legacy framework goes back roughly that long. We could > have had BFQ for mq NOW, if we didn't keep coming back to this very > point. > > I'm hesitant to add a new scheduler because it's very easy to add, very > difficult to get rid of. If we do add BFQ as a legacy scheduler now, > it'll take us years and years to get rid of it again. We should be > moving towards LESS moving parts in the legacy path, not more. > > We can keep having this discussion every few years, but I think we'd > both prefer to make some actual progress here. It's perfectly fine to > add an interface for a single-queue IO scheduler for > blk-mq, since we don't care too much about scalability there. And that > won't take years, that should be a few weeks. Retrofitting BFQ on top of > that should not be hard either. That can co-exist with a real multiqueue > scheduler as well, something that's geared towards some fairness for > faster devices. OK, so some solution like having a variant of blk_sq_make_request() that will consume requests, do IO scheduling decisions on them, and feed them into the HW queue as it sees fit would be acceptable? That will provide the IO scheduler the global view that it needs for complex scheduling decisions, so it should indeed be relatively easy to port BFQ to work like that. Honza -- Jan Kara SUSE Labs, CR
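Jan's proposal above, a blk_sq_make_request() variant that stages incoming requests where the scheduler has a global view and then feeds the HW queue as it sees fit, can be illustrated with a toy userspace model. This is only a sketch of the architecture under discussion, not kernel code; the class and method names are made up for illustration, and the sector-sorting policy is just a stand-in for a real scheduler's decisions.

```python
# Toy model of the proposed single-queue scheduling stage: submitted
# requests land in a software queue where the scheduler has a global
# view, the scheduler picks an order (here: sorted by sector, a simple
# elevator-style policy), and only then are requests fed to the "HW
# queue" that the device driver would consume.

class SingleQueueScheduler:
    def __init__(self):
        self.sw_queue = []   # staging area: the scheduler's global view
        self.hw_queue = []   # what the device driver would see

    def submit(self, sector):
        """Stand-in for the make_request() entry point."""
        self.sw_queue.append(sector)

    def dispatch(self):
        """Scheduler decides an order, then feeds the HW queue."""
        self.sw_queue.sort()
        self.hw_queue.extend(self.sw_queue)
        self.sw_queue.clear()

sched = SingleQueueScheduler()
for sector in (300, 100, 200):
    sched.submit(sector)
sched.dispatch()
print(sched.hw_queue)   # requests reach the device in sorted order: [100, 200, 300]
```

The point of the design is that the staging step gives a scheduler like BFQ the whole-queue visibility it needs, while the HW queue itself stays a simple FIFO.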
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On 10/26/2016 10:04 AM, Paolo Valente wrote: On 26 Oct 2016, at 17:32, Jens Axboe wrote: On 10/26/2016 09:29 AM, Christoph Hellwig wrote: On Wed, Oct 26, 2016 at 05:13:07PM +0200, Arnd Bergmann wrote: The question to ask first is whether to actually have pluggable schedulers on blk-mq at all, or just have one that is meant to do the right thing in every case (and possibly can be bypassed completely). That would be my preference. Have a BFQ-variant for blk-mq as an option (default to off unless opted in by the driver or user), and no other scheduler for blk-mq. Don't bother with BFQ for non-blk-mq. It's not like there is any advantage in the legacy-request device even for slow devices, except for the option of having I/O scheduling. It's the only right way forward. blk-mq might not offer any substantial advantages to rotating storage, but with scheduling, it won't offer a downside either. And it'll take us towards the real goal, which is to have just one IO path. ok Adding a new scheduler for the legacy IO path makes no sense. I would fully agree if effective and stable I/O scheduling were available in blk-mq in one or two months. But I guess that it will take at least one year, optimistically, given the current status of the needed infrastructure, and given the great difficulties of doing effective scheduling at the high parallelism and extreme target speeds of blk-mq. Of course, this holds true unless only minimally clever scheduling is performed. So, what's the point in forcing a lot of users to wait another year or more, for a solution that has yet to be even defined, while they could enjoy a much better system, and then switch to an even better system when scheduling is ready in blk-mq too? That same argument could have been made 2 years ago. Saying no to a new scheduler for the legacy framework goes back roughly that long. We could have had BFQ for mq NOW, if we didn't keep coming back to this very point. 
I'm hesitant to add a new scheduler because it's very easy to add, very difficult to get rid of. If we do add BFQ as a legacy scheduler now, it'll take us years and years to get rid of it again. We should be moving towards LESS moving parts in the legacy path, not more. We can keep having this discussion every few years, but I think we'd both prefer to make some actual progress here. It's perfectly fine to add an interface for a single-queue IO scheduler for blk-mq, since we don't care too much about scalability there. And that won't take years, that should be a few weeks. Retrofitting BFQ on top of that should not be hard either. That can co-exist with a real multiqueue scheduler as well, something that's geared towards some fairness for faster devices. -- Jens Axboe
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On 10/26/2016 09:29 AM, Christoph Hellwig wrote: On Wed, Oct 26, 2016 at 05:13:07PM +0200, Arnd Bergmann wrote: The question to ask first is whether to actually have pluggable schedulers on blk-mq at all, or just have one that is meant to do the right thing in every case (and possibly can be bypassed completely). That would be my preference. Have a BFQ-variant for blk-mq as an option (default to off unless opted in by the driver or user), and no other scheduler for blk-mq. Don't bother with BFQ for non-blk-mq. It's not like there is any advantage in the legacy-request device even for slow devices, except for the option of having I/O scheduling. It's the only right way forward. blk-mq might not offer any substantial advantages to rotating storage, but with scheduling, it won't offer a downside either. And it'll take us towards the real goal, which is to have just one IO path. Adding a new scheduler for the legacy IO path makes no sense. Adding one for blk-mq and phasing out the old path is what we need to do. -- Jens Axboe
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Wed, Oct 26, 2016 at 05:13:07PM +0200, Arnd Bergmann wrote: > The question to ask first is whether to actually have pluggable > schedulers on blk-mq at all, or just have one that is meant to > do the right thing in every case (and possibly can be bypassed > completely). That would be my preference. Have a BFQ-variant for blk-mq as an option (default to off unless opted in by the driver or user), and no other scheduler for blk-mq. Don't bother with BFQ for non-blk-mq. It's not like there is any advantage in the legacy-request device even for slow devices, except for the option of having I/O scheduling.
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
On Wednesday, October 26, 2016 8:05:11 AM CEST Bart Van Assche wrote: > On 10/26/2016 04:34 AM, Jan Kara wrote: > > On Wed 26-10-16 03:19:03, Christoph Hellwig wrote: > >> Just as last time: > >> > >> big NAK for introducing giant new infrastructure like a new I/O scheduler > >> for the legacy request structure. > >> > >> Please direct your energy towards blk-mq instead. > > > > Christoph, we will probably talk about this next week but IMO rotating > > disks and SATA based SSDs are going to stay with us for another 15 years, > > likely more. For them blk-mq is no win; relatively complex IO scheduling > > like CFQ or BFQ is a big win for them in some cases. So I think IO > > scheduling (and thus a place for something like BFQ) is going to stay with us > > for quite a long time still. So are we going to add hooks in blk-mq to > > support full-blown IO scheduling at least for single-queue devices? Or how > > else do we want to support that HW? > > Hello Jan, > > Having two versions (one for non-blk-mq, one for blk-mq) of every I/O > scheduler would be a maintenance nightmare. Has anyone already analyzed > whether it would be possible to come up with an API for I/O schedulers > that makes it possible to use the same I/O scheduler for both blk-mq and > the traditional block layer? The question to ask first is whether to actually have pluggable schedulers on blk-mq at all, or just have one that is meant to do the right thing in every case (and possibly can be bypassed completely). Arnd
Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
Just as last time: big NAK for introducing giant new infrastructure like a new I/O scheduler for the legacy request structure. Please direct your energy towards blk-mq instead.
[PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
Hi, this new patch series turns back to the initial approach, i.e., it adds BFQ as an extra scheduler, instead of replacing CFQ with BFQ. This patch series also contains all the improvements and bug fixes recommended by Tejun [5], plus the new features of BFQ-v8r5. Details about old and new features are in the patch descriptions. The first version of BFQ was submitted a few years ago [1]. It is denoted as v0 in this patchset, to distinguish it from the version I am submitting now, v8r5. In particular, the first two patches introduce BFQ-v0, whereas the remaining patches progressively turn BFQ-v0 into BFQ-v8r5. Some patches generate warnings with checkpatch.pl, but these warnings seem to be either unavoidable for the involved pieces of code (which the patches just extend) or false positives. For your convenience, a slightly updated and extended description of BFQ follows.

On average CPUs, the current version of BFQ can handle devices performing at most ~30K IOPS; at most ~50K IOPS on faster CPUs. These are about the same limits as CFQ's. There may be room for noticeable improvements regarding these limits, but, given the overall limitations of blk itself, I saw no reason to further delay this new submission. Here are some nice features of BFQ-v8r5.

Low latency for interactive applications
Regardless of the actual background workload, BFQ guarantees that, for interactive tasks, the storage device is virtually as responsive as if it were idle. For example, even if one or more of the following background workloads are being executed:
- one or more large files are being read, written or copied,
- a tree of source files is being compiled,
- one or more virtual machines are performing I/O,
- a software update is in progress,
- indexing daemons are scanning filesystems and updating their databases,
starting an application or loading a file from within an application takes about the same time as if the storage device were idle. 
As a comparison, with CFQ, NOOP or DEADLINE, and in the same conditions, applications experience high latencies, or even become unresponsive until the background workload terminates (also on SSDs).

Low latency for soft real-time applications
Soft real-time applications, such as audio and video players/streamers, also enjoy low latency and a low drop rate, regardless of the background I/O workload. As a consequence, these applications suffer from almost no glitches due to the background workload.

Higher speed for code-development tasks
If some additional workload happens to be executed in parallel, then BFQ executes the I/O-related components of typical code-development tasks (compilation, checkout, merge, ...) much more quickly than CFQ, NOOP or DEADLINE.

High throughput
On hard disks, BFQ achieves up to 30% higher throughput than CFQ, and up to 150% higher throughput than DEADLINE and NOOP, with all the sequential workloads considered in our tests. With random workloads, and with all the workloads on flash-based devices, BFQ instead achieves about the same throughput as the other schedulers.

Strong fairness, bandwidth and delay guarantees
BFQ distributes the device throughput, and not just the device time, among I/O-bound applications in proportion to their weights, with any workload and regardless of the device parameters. From these bandwidth guarantees, it is possible to compute tight per-I/O-request delay guarantees by a simple formula. If not configured for strict service guarantees, BFQ switches to time-based resource sharing (only) for applications that would otherwise cause a throughput loss.

BFQ achieves the above service properties thanks to the combination of its accurate scheduling engine (patches 1-2) and a set of simple heuristics and improvements (patches 3-14). Details on how BFQ and its components work are provided in the descriptions of the patches. 
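The proportional bandwidth guarantee just described can be sketched numerically: an application with weight w_i receives a share B_i = (w_i / sum of all weights) * B_total of the device throughput. The application names, weights, and the 140 MB/s total below are made-up illustrative numbers, not BFQ internals or real measurements:

```python
# Proportional-share bandwidth: app i with weight w_i gets
#   B_i = (w_i / sum(w)) * B_total
# of the device throughput, independent of device parameters.
# All figures here are illustrative, not measured BFQ behaviour.

def proportional_shares(weights, total_bw):
    """Split total_bw among apps in proportion to their weights."""
    total_w = sum(weights.values())
    return {app: total_bw * w / total_w for app, w in weights.items()}

weights = {"video-player": 500, "backup": 100, "indexer": 100}
shares = proportional_shares(weights, total_bw=140.0)  # MB/s
for app, bw in sorted(shares.items()):
    print(f"{app}: {bw:.1f} MB/s")
# backup: 20.0 MB/s, indexer: 20.0 MB/s, video-player: 100.0 MB/s
```

This is what distinguishes a bandwidth-domain guarantee from a time-domain one: a seeky application that gets 1/7 of the device *time* would deliver far less than 1/7 of the *throughput*, whereas the share above holds in bytes per second.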
In addition, an organic description of the main BFQ algorithm and of most of its features can be found in this paper [2]. What BFQ can do in practice is shown, e.g., in this 8-minute demo with an SSD: [3]. I made this demo with an older version of BFQ (v7r6) and under Linux 3.17.0, but, for the tests considered in the demo, performance has remained about the same with more recent BFQ and kernel versions. More details about this point can be found here [4], together with graphs showing the performance of BFQ, as compared with CFQ, DEADLINE and NOOP, and on: a fast and a slow hard disk, a RAID1, an SSD, a microSDHC Card and an eMMC. As an example, our results on the SSD are reported also in a table at the end of this email. Finally, as for testing in everyday use, BFQ is the default I/O scheduler in, e.g., Mageia, Manjaro, Sabayon, OpenMandriva and Arch Linux ARM, plus several kernel forks for PCs and smartphones. In addition, BFQ is optionally available in, e.g., Arch, PCLinuxOS and Gentoo, and we record several downloads a day from people using other distributions. The feedback received so far