Not to pile onto this again, but if BFQ isn't suitable to
replace CFQ for high-I/O workloads (I've yet to see 20k IOPS on any
reasonably sized SAN (SC4020, v5000, etc.)), can't we at least make
BFQ the default I/O scheduler for people who would otherwise
request CFQ? Paolo has had a team of students working on this for
years; even if the so-called "secret weapon" is mainlined, I highly
doubt his work will stop. We're pretty close to fixing hard I/O stalls
in Linux, with mainlining being the last major hurdle.
While I've contributed nothing to BFQ code-wise, please let any of
us know if there's anything outstanding to solve for the hard lockups,
and I believe any of us will try our best.
On Sun, Oct 16, 2016 at 12:02 PM, Paolo Valente wrote:
>> On 14 Oct 2016, at 20:35, Tejun Heo <t...@kernel.org> wrote:
>> Hello, Paolo.
>> On Fri, Oct 14, 2016 at 07:13:41PM +0200, Paolo Valente wrote:
>>> That said, your 'thus' seems a little too strong: "bfq does not yet
>>> handle fast SSDs, thus we need something else". What about the
>>> millions of devices (and people) still within 10-20 K IOPS, and
>>> experiencing awful latencies and lack of bandwidth guarantees?
>> I'm not objecting to any of that.
> Ok, sorry for the misunderstanding. I'm just more and more confused
> about why a readily available solution that has not been proven wrong
> has not yet been accepted, if everybody apparently acknowledges the
> problem.
>> My point just is that bfq, at least
>> as currently implemented, is unfit for certain classes of use cases.
> Absolutely correct.
>>>> FWIW, it looks like the only way we can implement proportional control
>>>> on highspeed ssds with acceptable overhead
>>> Maybe not: as I wrote to Vivek in a previous reply, containing
>>> pointers to documentation, we have already achieved twenty million
>>> decisions per second with a prototype driving existing
>>> proportional-share packet schedulers (essentially without
>> And that doesn't require idling and thus doesn't severely impact
> Nope. Packets are commonly assumed to be sent asynchronously.
> I guess that discussing the validity of this assumption is out of the
> scope of this thread.
>>>> is somehow finding a way to
>>>> calculate the cost of each IO and throttle IOs according to that while
>>>> controlling for latency as necessary. Slice scheduling with idling
>>>> seems too expensive with highspeed devices with high io depth.
>>> Yes, that's absolutely true. I'm already thinking about an idleless
>>> solution. As I already wrote, I'm willing to help with scheduling in
>>> blk-mq. I hope there will be an opportunity to find a way forward
>>> at KS.
>> It'd be great to have a proportional control mechanism whose overhead
>> is acceptable. Unfortunately, we don't have one now and nothing seems
>> right around the corner. (Mostly) work-conserving throttling would be
>> fiddlier to use but is something which is useful regardless of such
>> proportional control mechanism and can be obtained relatively easily.
>> I don't see why the two approaches would be mutually exclusive.
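
For readers following along, the "calculate the cost of each IO and
throttle IOs according to that" idea quoted above can be sketched
roughly as below. This is only an illustration, not code from bfq,
blk-throttle, or any posted patch: the cost model (`io_cost`), the
`Group` class, and every constant are assumptions made up for the
example. Note that this simple form is not work-conserving; a (mostly)
work-conserving variant would hand unused budget to groups that still
have requests pending.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Hypothetical I/O group (e.g. a cgroup) with a proportional weight."""
    name: str
    weight: int
    budget: float = 0.0  # cost units the group may still spend this period


def io_cost(size_bytes: int, is_write: bool) -> float:
    """Assumed cost model: fixed per-request cost plus a per-KiB cost.

    The constants are placeholders; a real model would be calibrated
    against the device.
    """
    base = 1.0
    per_kib = 0.01
    cost = base + per_kib * (size_bytes / 1024)
    # Assume writes are twice as expensive as reads.
    return cost * (2.0 if is_write else 1.0)


def refill(groups: list[Group], capacity: float) -> None:
    """Split `capacity` cost units among groups in proportion to weight."""
    total = sum(g.weight for g in groups)
    for g in groups:
        g.budget += capacity * g.weight / total


def try_dispatch(group: Group, size_bytes: int, is_write: bool) -> bool:
    """Charge the request's cost to the group; throttle if over budget."""
    cost = io_cost(size_bytes, is_write)
    if group.budget < cost:
        return False  # request must wait for the next refill
    group.budget -= cost
    return True
```

With weights 3 and 1 and identical 4 KiB reads, the first group gets to
dispatch three times as many requests per refill period as the second,
and no idling on the device is involved.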
> Paolo Valente
> Dipartimento di Scienze Fisiche, Informatiche e Matematiche
> Via Campi 213/B
> 41125 Modena - Italy