Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-02-16 Thread Jens Axboe
On 02/16/2017 03:46 AM, Paolo Valente wrote:
> 
>> On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
>>
>> This is basically identical to deadline-iosched, except it registers
>> as a MQ capable scheduler. This is still a single queue design.
>>
>> Signed-off-by: Jens Axboe 
> ...
>> +
>> +static void dd_merged_requests(struct request_queue *q, struct request *req,
>> +			       struct request *next)
>> +{
>> +	/*
>> +	 * if next expires before rq, assign its expire time to rq
>> +	 * and move into next position (next will be deleted) in fifo
>> +	 */
>> +	if (!list_empty(&req->queuelist) && !list_empty(&next->queuelist)) {
>> +		if (time_before((unsigned long)next->fifo_time,
>> +				(unsigned long)req->fifo_time)) {
>> +			list_move(&req->queuelist, &next->queuelist);
>> +			req->fifo_time = next->fifo_time;
>> +		}
>> +	}
>> +
> 
> Jens,
> while trying to imagine the possible causes of Bart's hang with
> bfq-mq, I've bumped into the following doubt: in the above function
> (in my case, in bfq-mq's equivalent of the above function), are
> we sure that neither req nor next could EVER be in dd->dispatch instead
> of dd->fifo_list?  I've tried to verify it, but, although I think it has never
> happened in my tests, I was not able to make sure that no unlucky
> combination may ever happen (considering also the use of
> blk_rq_is_passthrough to decide where to put a new request).
> 
> I'm making a blunder, right?

If a request goes into dd->dispatch, it's not going to be found for merging.
Hence we can never call the above on such a request.
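
For reference, a sketch of the insertion-side split under discussion,
reconstructed from this thread (simplified: locking and the at_head/merge
handling are elided, so this is not a verbatim copy of the patch):

static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
                              bool at_head)
{
        struct request_queue *q = hctx->queue;
        struct deadline_data *dd = q->elevator->elevator_data;
        const int data_dir = rq_data_dir(rq);

        /*
         * Passthrough requests bypass the scheduler structures entirely:
         * they go straight to dd->dispatch and are never merge candidates.
         */
        if (blk_rq_is_passthrough(rq)) {
                list_add(&rq->queuelist, &dd->dispatch);
                return;
        }

        /*
         * Only requests on the rbtree/FIFO can ever be seen by
         * dd_merged_requests().
         */
        deadline_add_rq_rb(dd, rq);
        rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
        list_add_tail(&rq->queuelist, &dd->fifo_list[data_dir]);
}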

-- 
Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-02-16 Thread Paolo Valente

> On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
> 
> This is basically identical to deadline-iosched, except it registers
> as a MQ capable scheduler. This is still a single queue design.
> 
> Signed-off-by: Jens Axboe 
...
> +
> +static void dd_merged_requests(struct request_queue *q, struct request *req,
> +struct request *next)
> +{
> + /*
> +  * if next expires before rq, assign its expire time to rq
> +  * and move into next position (next will be deleted) in fifo
> +  */
> + if (!list_empty(&req->queuelist) && !list_empty(&next->queuelist)) {
> + if (time_before((unsigned long)next->fifo_time,
> + (unsigned long)req->fifo_time)) {
> + list_move(&req->queuelist, &next->queuelist);
> + req->fifo_time = next->fifo_time;
> + }
> + }
> +

Jens,
while trying to imagine the possible causes of Bart's hang with
bfq-mq, I've bumped into the following doubt: in the above function
(in my case, in bfq-mq's equivalent of the above function), are
we sure that neither req nor next could EVER be in dd->dispatch instead
of dd->fifo_list?  I've tried to verify it, but, although I think it has never
happened in my tests, I was not able to make sure that no unlucky
combination may ever happen (considering also the use of
blk_rq_is_passthrough to decide where to put a new request).

I'm making a blunder, right?

Thanks,
Paolo



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-02-07 Thread Paolo Valente

> On 2 Feb 2017, at 22:32, Jens Axboe  wrote:
> 
> On 02/02/2017 02:15 PM, Paolo Valente wrote:
>> 
>>> On 2 Feb 2017, at 16:30, Jens Axboe  wrote:
>>> 
>>> On 02/02/2017 02:19 AM, Paolo Valente wrote:
>>>> The scheme is clear.  One comment, in case it could make sense and
>>>> avoid more complexity: since put_rq_priv is invoked in two different
>>>> contexts, process or interrupt, I didn't find it so confusing that,
>>>> when put_rq_priv is invoked in the context where the lock cannot be
>>>> held (unless one is willing to pay for irq disabling all the time),
>>>> the lock is not held, while, when it is invoked in the context where
>>>> the lock can be held, the lock is actually held, or must be taken.
>>> 
>>> If you grab the same lock from put_rq_priv, yes, you must make it IRQ
>>> disabling in all contexts, and use _irqsave() from put_rq_priv. If it's
>>> just freeing resources, you could potentially wait and do that when
>>> someone else needs them, since that part will come from process context.
>>> That would need two locks, though.
>>> 
>>> As I said above, I would not worry about the IRQ disabling lock.
>>> 
>> 
>> I'm sorry, I focused only on the IRQ-disabling consequence of grabbing
>> a scheduler lock also in IRQ context.  I thought it was a serious
>> enough issue to avoid this option.  Yet there is also a deadlock
>> problem related to this option.  In fact, the IRQ handler may preempt
>> process-context code that already holds some other locks; if one of
>> those locks is also needed by a process executing on another CPU while
>> that process holds the scheduler lock, or while it is in turn preempted
>> by an IRQ handler trying to grab the scheduler lock, then a deadlock
>> occurs.  This is not just speculation, but a problem that did occur
>> before I moved to a deferred-work solution, and that can be readily
>> reproduced.  Before moving to a deferred-work solution, I tried various
>> code manipulations to avoid these deadlocks without resorting to
>> deferred work, but to no avail.
> 
> There are two important rules here:
> 
> 1) If a lock is ever used in interrupt context, anyone acquiring it must
>   ensure that interrupts get disabled.
> 
> 2) If multiple locks are needed, they need to be acquired in the right
>   order.
> 
> Instead of talking in hypotheticals, be more specific. With the latest
> code, the scheduler lock should now be fine, there should be no cases
> where you are being invoked with it held. I'm assuming you are running
> with lockdep enabled on your kernel? Post the stack traces from your
> problem (and your code...), then we can take a look.
> 

Hi Jens,

your last change (freeing requests outside merges) did remove two out
of three deadlock scenarios for which I turned some handlers into
deferred work items in bfq-mq.  For the remaining one, I'm about to
send a separate email, with the description of the deadlock, together
with the patch that, applied on top of the bfq-mq branch, causes the
deadlock by moving the body of exit_icq back from a deferred work
item to the exit_icq hook itself.  And, yes, as I'll write below, I'm
finally about to share a branch containing bfq-mq.

> Don't punt to deferring work from your put_rq_private() function, that's
> a suboptimal work around. It needs to be fixed for real.
> 

Yeah, sub-optimal also in terms of developer time: I spent a lot of
time getting that deferred work to function, and hopefully to be a
little efficient!  The actual problem is that I preferred to try to
get to the bottom of those deadlocks on my own, rather than bother
you with that issue too.  Maybe next time I will ask you one more
question instead of one less :)

>> At any rate, bfq seems now to work, so I can finally move from just
>> asking questions endlessly, to proposing actual code to discuss on.
>> I'm about to: port this version of bfq to your improved/fixed
>> blk-mq-sched version in for-4.11 (port postponed, to avoid introducing
>> further changes in code that did not yet work), run more extensive
>> tests, polish commits a little bit, and finally share a branch.
> 
> Post the code sooner rather than later. There are bound to be things
> that need to be improved or fixed up, let's start this process now. The
> framework is pretty much buttoned up at this point, so there's time to
> shift the attention a bit to a consumer of it.
> 

Ok, to follow this suggestion of yours at 100%, I have postponed
several steps (removal of any invariant check or extra log message,
merging of the various files bfq is made of into just one file, code
polishing), and I'm about to share my current WIP branch in a
follow-up message.

Thanks,
Paolo

> -- 
> Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-02-02 Thread Jens Axboe
On 02/02/2017 02:15 PM, Paolo Valente wrote:
> 
>> On 2 Feb 2017, at 16:30, Jens Axboe  wrote:
>>
>> On 02/02/2017 02:19 AM, Paolo Valente wrote:
>>> The scheme is clear.  One comment, in case it could make sense and
>>> avoid more complexity: since put_rq_priv is invoked in two different
>>> contexts, process or interrupt, I didn't find it so confusing that,
>>> when put_rq_priv is invoked in the context where the lock cannot be
>>> held (unless one is willing to pay for irq disabling all the time),
>>> the lock is not held, while, when it is invoked in the context where
>>> the lock can be held, the lock is actually held, or must be taken.
>>
>> If you grab the same lock from put_rq_priv, yes, you must make it IRQ
>> disabling in all contexts, and use _irqsave() from put_rq_priv. If it's
>> just freeing resources, you could potentially wait and do that when
>> someone else needs them, since that part will come from process context.
>> That would need two locks, though.
>>
>> As I said above, I would not worry about the IRQ disabling lock.
>>
> 
> I'm sorry, I focused only on the IRQ-disabling consequence of grabbing
> a scheduler lock also in IRQ context.  I thought it was a serious
> enough issue to avoid this option.  Yet there is also a deadlock
> problem related to this option.  In fact, the IRQ handler may preempt
> process-context code that already holds some other locks; if one of
> those locks is also needed by a process executing on another CPU while
> that process holds the scheduler lock, or while it is in turn preempted
> by an IRQ handler trying to grab the scheduler lock, then a deadlock
> occurs.  This is not just speculation, but a problem that did occur
> before I moved to a deferred-work solution, and that can be readily
> reproduced.  Before moving to a deferred-work solution, I tried various
> code manipulations to avoid these deadlocks without resorting to
> deferred work, but to no avail.

There are two important rules here:

1) If a lock is ever used in interrupt context, anyone acquiring it must
   ensure that interrupts get disabled.

2) If multiple locks are needed, they need to be acquired in the right
   order.
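
As a minimal illustration of rule 1 (generic kernel locking idiom, not
code from the patch):

static DEFINE_SPINLOCK(sched_lock);     /* hypothetical scheduler lock */

static void process_side(void)
{
        unsigned long flags;

        /*
         * Process context must disable IRQs while holding the lock, or
         * the IRQ handler below can interrupt the holder on the same
         * CPU and spin forever.
         */
        spin_lock_irqsave(&sched_lock, flags);
        /* ... touch scheduler state ... */
        spin_unlock_irqrestore(&sched_lock, flags);
}

static void irq_side(void)              /* runs in hardirq context */
{
        spin_lock(&sched_lock);         /* IRQs are already disabled here */
        /* ... */
        spin_unlock(&sched_lock);
}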

Instead of talking in hypotheticals, be more specific. With the latest
code, the scheduler lock should now be fine, there should be no cases
where you are being invoked with it held. I'm assuming you are running
with lockdep enabled on your kernel? Post the stack traces from your
problem (and your code...), then we can take a look.

Don't punt to deferring work from your put_rq_private() function, that's
a suboptimal work around. It needs to be fixed for real.

> At any rate, bfq seems now to work, so I can finally move from just
> asking questions endlessly, to proposing actual code to discuss on.
> I'm about to: port this version of bfq to your improved/fixed
> blk-mq-sched version in for-4.11 (port postponed, to avoid introducing
> further changes in code that did not yet work), run more extensive
> tests, polish commits a little bit, and finally share a branch.

Post the code sooner rather than later. There are bound to be things
that need to be improved or fixed up, let's start this process now. The
framework is pretty much buttoned up at this point, so there's time to
shift the attention a bit to a consumer of it.

-- 
Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-02-02 Thread Paolo Valente

> On 2 Feb 2017, at 16:30, Jens Axboe  wrote:
> 
> On 02/02/2017 02:19 AM, Paolo Valente wrote:
>> The scheme is clear.  One comment, in case it could make sense and
>> avoid more complexity: since put_rq_priv is invoked in two different
>> contexts, process or interrupt, I didn't find it so confusing that,
>> when put_rq_priv is invoked in the context where the lock cannot be
>> held (unless one is willing to pay for irq disabling all the time),
>> the lock is not held, while, when it is invoked in the context where
>> the lock can be held, the lock is actually held, or must be taken.
> 
> If you grab the same lock from put_rq_priv, yes, you must make it IRQ
> disabling in all contexts, and use _irqsave() from put_rq_priv. If it's
> just freeing resources, you could potentially wait and do that when
> someone else needs them, since that part will come from process context.
> That would need two locks, though.
> 
> As I said above, I would not worry about the IRQ disabling lock.
> 

I'm sorry, I focused only on the IRQ-disabling consequence of grabbing
a scheduler lock also in IRQ context.  I thought it was a serious
enough issue to avoid this option.  Yet there is also a deadlock
problem related to this option.  In fact, the IRQ handler may preempt
process-context code that already holds some other locks; if one of
those locks is also needed by a process executing on another CPU while
that process holds the scheduler lock, or while it is in turn preempted
by an IRQ handler trying to grab the scheduler lock, then a deadlock
occurs.  This is not just speculation, but a problem that did occur
before I moved to a deferred-work solution, and that can be readily
reproduced.  Before moving to a deferred-work solution, I tried various
code manipulations to avoid these deadlocks without resorting to
deferred work, but to no avail.
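
Schematically, the scenario just described (lock names hypothetical):

  CPU0 (process context)             CPU1 (process context)
  ----------------------             ----------------------
  spin_lock(&other_lock);            spin_lock(&scheduler_lock);
    <IRQ arrives on CPU0>            spin_lock(&other_lock);
    IRQ handler:                       /* spins: CPU0 holds it */
    spin_lock(&scheduler_lock);
      /* spins: CPU1 holds it, and the process
         holding other_lock on CPU0 never
         resumes, so both CPUs spin forever */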

At any rate, bfq seems now to work, so I can finally move from just
asking questions endlessly, to proposing actual code to discuss on.
I'm about to: port this version of bfq to your improved/fixed
blk-mq-sched version in for-4.11 (port postponed, to avoid introducing
further changes in code that did not yet work), run more extensive
tests, polish commits a little bit, and finally share a branch.

Thanks,
Paolo

> -- 
> Jens Axboe
> 



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-02-02 Thread Jens Axboe
On 02/02/2017 02:19 AM, Paolo Valente wrote:
> The scheme is clear.  One comment, in case it could make sense and
> avoid more complexity: since put_rq_priv is invoked in two different
> contexts, process or interrupt, I didn't find it so confusing that,
> when put_rq_priv is invoked in the context where the lock cannot be
> held (unless one is willing to pay for irq disabling all the time),
> the lock is not held, while, when it is invoked in the context where
> the lock can be held, the lock is actually held, or must be taken.

If you grab the same lock from put_rq_priv, yes, you must make it IRQ
disabling in all contexts, and use _irqsave() from put_rq_priv. If it's
just freeing resources, you could potentially wait and do that when
someone else needs them, since that part will come from process context.
That would need two locks, though.

As I said above, I would not worry about the IRQ disabling lock.
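
A minimal sketch of that two-lock idea (all names hypothetical): the
completion side only parks the request on a side list guarded by its
own IRQ-safe lock, and a later process-context caller drains it.

static LIST_HEAD(deferred_free);
static DEFINE_SPINLOCK(free_lock);      /* second lock, not the scheduler lock */

static void put_rq_priv(struct request_queue *q, struct request *rq)
{
        unsigned long flags;

        /* May be called from IRQ context: just park the request. */
        spin_lock_irqsave(&free_lock, flags);
        list_add_tail(&rq->queuelist, &deferred_free);
        spin_unlock_irqrestore(&free_lock, flags);
}

static void drain_deferred(struct request_queue *q)
{
        LIST_HEAD(local);
        struct request *rq;

        /* Process context: splice under the lock, free outside it. */
        spin_lock_irq(&free_lock);
        list_splice_init(&deferred_free, &local);
        spin_unlock_irq(&free_lock);

        while (!list_empty(&local)) {
                rq = list_first_entry(&local, struct request, queuelist);
                list_del_init(&rq->queuelist);
                /* release the resources tied to rq here */
        }
}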

-- 
Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-02-02 Thread Paolo Valente

> On 2 Feb 2017, at 06:19, Jens Axboe  wrote:
> 
> On 02/01/2017 04:11 AM, Paolo Valente wrote:
>>> +static bool dd_bio_merge(struct blk_mq_hw_ctx *hctx, struct bio *bio)
>>> +{
>>> +   struct request_queue *q = hctx->queue;
>>> +   struct deadline_data *dd = q->elevator->elevator_data;
>>> +   int ret;
>>> +
>>> +   spin_lock(&dd->lock);
>>> +   ret = blk_mq_sched_try_merge(q, bio);
>>> +   spin_unlock(&dd->lock);
>>> +
>> 
>> Hi Jens,
>> first, good news, bfq is passing my first sanity checks.  Still, I
>> need a little more help for the following issue.  There is a case that
>> would be impossible to handle without modifying code outside bfq.  But
>> so far such a case never occurred, and I hope that it can never occur.
>> I'll try to briefly list all relevant details on this concern of mine,
>> so that you can quickly confirm my hope, or highlight where or what I
>> am missing.
> 
> Remember my earlier advice - it's not a problem to change anything in
> the core, in fact I would be surprised if you did not need to. My
> foresight isn't THAT good! It's much better to fix up an inconsistency
> there, rather than work around it in the consumer of that API.
> 
>> First, as done above for mq-deadline, invoking blk_mq_sched_try_merge
>> with the scheduler lock held is of course necessary (for example, to
>> protect q->last_merge).  This may lead to put_rq_private invoked
>> with the lock held, in case of successful merge.
> 
> Right, or some other lock with the same scope, as per my other email.
> 
>> As a consequence, put_rq_private may be invoked:
>> (1) in IRQ context, no scheduler lock held, because of a completion:
>> can be handled by deferring work and lock grabbing, because the
>> completed request is not queued in the scheduler any more;
>> (2) in process context, scheduler lock held, because of the above
>> successful merge: must be handled immediately, for consistency,
>> because the request is still queued in the scheduler;
>> (3) in process context, no scheduler lock held, for some other reason:
>> some path apparently may lead to this case, although I've never seen
>> it to happen.  Immediate handling, and hence locking, may be needed,
>> depending on whether the request is still queued in the scheduler.
>> 
>> So, my main question is: is case (3) actually impossible?  Should it
>> be possible, I guess we would have a problem, because of the
>> different lock state with respect to (2).
> 
> I agree, there's some inconsistency there, if you potentially need to
> grab the lock in your put_rq_private handler. The problem case is #2,
> when we have the merge. I would probably suggest that the best way to
> handle that is to pass back the dropped request so we can put it outside
> of holding the lock.
> 
> Let me see if I can come up with a good solution for this. We have to be
> consistent in how we invoke the scheduler functions, we can't have hooks
> that are called in unknown lock states. I also don't want you to have to
> add defer work handling in that kind of path, that will impact your
> performance and overhead.
> 

I'll try to learn from your solution, because, as of now, I don't see
how to avoid deferred work for the case where put_rq_private is
invoked in interrupt context.  In fact, for this case, we cannot grab
the lock, unless we turn all spin_lock into spin_lock_irq*.

>> Finally, I hope that it is certainly impossible to have a case (4): in
>> IRQ context, no lock held, but with the request in the scheduler.
> 
> That should not be possible.
> 
> Edit: since I'm on a flight and email won't send, I had a few minutes to
> hack this up. Totally untested, but something like the below should do
> it. Not super pretty... I'll play with this a bit more tomorrow.
> 
> 

The scheme is clear.  One comment, in case it could make sense and
avoid more complexity: since put_rq_priv is invoked in two different
contexts, process or interrupt, I didn't find it so confusing that,
when put_rq_priv is invoked in the context where the lock cannot be
held (unless one is willing to pay for irq disabling all the time),
the lock is not held, while, when it is invoked in the context where
the lock can be held, the lock is actually held, or must be taken.

Thanks,
Paolo

> diff --git a/block/blk-core.c b/block/blk-core.c
> index c142de090c41..530a9a3f60c9 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1609,7 +1609,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
> {
>   struct blk_plug *plug;
>   int el_ret, where = ELEVATOR_INSERT_SORT;
> - struct request *req;
> + struct request *req, *free;
>   unsigned int request_count = 0;
>   unsigned int wb_acct;
> 
> @@ -1650,15 +1650,21 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>   if (el_ret == ELEVATOR_BACK_MERGE) {
>   if (bio_attempt_back_merge(q, req, bio)) {
>   elv_bio_merged(q, req, bio);
> - 

Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-02-01 Thread Jens Axboe
On 02/01/2017 04:56 AM, Paolo Valente wrote:
>> +/*
>> + * add rq to rbtree and fifo
>> + */
>> +static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>> +			      bool at_head)
>> +{
>> +struct request_queue *q = hctx->queue;
>> +struct deadline_data *dd = q->elevator->elevator_data;
>> +const int data_dir = rq_data_dir(rq);
>> +
>> +if (blk_mq_sched_try_insert_merge(q, rq))
>> +return;
>> +
> 
> A related doubt: shouldn't blk_mq_sched_try_insert_merge be invoked
> with the scheduler lock held too, as blk_mq_sched_try_merge, to
> protect (at least) q->last_merge?
>
> In bfq this function is invoked with the lock held.

It doesn't matter which lock you use, as long as:

1) You use the same one consistently
2) It has the same scope as the queue lock (the one you call the
   scheduler lock)

mq-deadline sets up a per-queue structure, deadline_data, and it has a
lock embedded in that structure. This is what mq-deadline uses to
serialize access to its data structures, as well as those in the queue
(like last_merge).

-- 
Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-02-01 Thread Jens Axboe
On 02/01/2017 04:11 AM, Paolo Valente wrote:
>> +static bool dd_bio_merge(struct blk_mq_hw_ctx *hctx, struct bio *bio)
>> +{
>> +struct request_queue *q = hctx->queue;
>> +struct deadline_data *dd = q->elevator->elevator_data;
>> +int ret;
>> +
>> +spin_lock(&dd->lock);
>> +ret = blk_mq_sched_try_merge(q, bio);
>> +spin_unlock(&dd->lock);
>> +
> 
> Hi Jens,
> first, good news, bfq is passing my first sanity checks.  Still, I
> need a little more help for the following issue.  There is a case that
> would be impossible to handle without modifying code outside bfq.  But
> so far such a case never occurred, and I hope that it can never occur.
> I'll try to briefly list all relevant details on this concern of mine,
> so that you can quickly confirm my hope, or highlight where or what I
> am missing.

Remember my earlier advice - it's not a problem to change anything in
the core, in fact I would be surprised if you did not need to. My
foresight isn't THAT good! It's much better to fix up an inconsistency
there, rather than work around it in the consumer of that API.

> First, as done above for mq-deadline, invoking blk_mq_sched_try_merge
> with the scheduler lock held is of course necessary (for example, to
> protect q->last_merge).  This may lead to put_rq_private invoked
> with the lock held, in case of successful merge.

Right, or some other lock with the same scope, as per my other email.

> As a consequence, put_rq_private may be invoked:
> (1) in IRQ context, no scheduler lock held, because of a completion:
> can be handled by deferring work and lock grabbing, because the
> completed request is not queued in the scheduler any more;
> (2) in process context, scheduler lock held, because of the above
> successful merge: must be handled immediately, for consistency,
> because the request is still queued in the scheduler;
> (3) in process context, no scheduler lock held, for some other reason:
> some path apparently may lead to this case, although I've never seen
> it happen.  Immediate handling, and hence locking, may be needed,
> depending on whether the request is still queued in the scheduler.
> 
> So, my main question is: is case (3) actually impossible?  Should it
> be possible, I guess we would have a problem, because of the
> different lock state with respect to (2).

I agree, there's some inconsistency there, if you potentially need to
grab the lock in your put_rq_private handler. The problem case is #2,
when we have the merge. I would probably suggest that the best way to
handle that is to pass back the dropped request so we can put it outside
of holding the lock.

Let me see if I can come up with a good solution for this. We have to be
consistent in how we invoke the scheduler functions, we can't have hooks
that are called in unknown lock states. I also don't want you to have to
add defer work handling in that kind of path, that will impact your
performance and overhead.

> Finally, I hope that it is certainly impossible to have a case (4): in
> IRQ context, no lock held, but with the request in the scheduler.

That should not be possible.

Edit: since I'm on a flight and email won't send, I had a few minutes to
hack this up. Totally untested, but something like the below should do
it. Not super pretty... I'll play with this a bit more tomorrow.


diff --git a/block/blk-core.c b/block/blk-core.c
index c142de090c41..530a9a3f60c9 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1609,7 +1609,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
 {
struct blk_plug *plug;
int el_ret, where = ELEVATOR_INSERT_SORT;
-   struct request *req;
+   struct request *req, *free;
unsigned int request_count = 0;
unsigned int wb_acct;
 
@@ -1650,15 +1650,21 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
if (el_ret == ELEVATOR_BACK_MERGE) {
if (bio_attempt_back_merge(q, req, bio)) {
elv_bio_merged(q, req, bio);
-   if (!attempt_back_merge(q, req))
+   free = attempt_back_merge(q, req);
+   if (!free)
elv_merged_request(q, req, el_ret);
+   else
+   __blk_put_request(q, free);
goto out_unlock;
}
} else if (el_ret == ELEVATOR_FRONT_MERGE) {
if (bio_attempt_front_merge(q, req, bio)) {
elv_bio_merged(q, req, bio);
-   if (!attempt_front_merge(q, req))
+   free = attempt_front_merge(q, req);
+   if (!free)
elv_merged_request(q, req, el_ret);
+   else
+   __blk_put_request(q, free);
goto out_unlock;
}
}
diff --git a/block/blk-merge.c b/block/

Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-02-01 Thread Paolo Valente

> On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
> 
> This is basically identical to deadline-iosched, except it registers
> as a MQ capable scheduler. This is still a single queue design.
> 
> Signed-off-by: Jens Axboe 
> ---
> block/Kconfig.iosched |   6 +
> block/Makefile|   1 +
> block/mq-deadline.c   | 649 ++
> 3 files changed, 656 insertions(+)
> create mode 100644 block/mq-deadline.c
> 
> diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched
> index 421bef9c4c48..490ef2850fae 100644
> --- a/block/Kconfig.iosched
> +++ b/block/Kconfig.iosched
> @@ -32,6 +32,12 @@ config IOSCHED_CFQ
> 
> This is the default I/O scheduler.
> 
> +config MQ_IOSCHED_DEADLINE
> + tristate "MQ deadline I/O scheduler"
> + default y
> + ---help---
> +   MQ version of the deadline IO scheduler.
> +
> config CFQ_GROUP_IOSCHED
>   bool "CFQ Group Scheduling support"
>   depends on IOSCHED_CFQ && BLK_CGROUP
> diff --git a/block/Makefile b/block/Makefile
> index 2eee9e1bb6db..3ee0abd7205a 100644
> --- a/block/Makefile
> +++ b/block/Makefile
> @@ -18,6 +18,7 @@ obj-$(CONFIG_BLK_DEV_THROTTLING)+= blk-throttle.o
> obj-$(CONFIG_IOSCHED_NOOP)+= noop-iosched.o
> obj-$(CONFIG_IOSCHED_DEADLINE)+= deadline-iosched.o
> obj-$(CONFIG_IOSCHED_CFQ) += cfq-iosched.o
> +obj-$(CONFIG_MQ_IOSCHED_DEADLINE)+= mq-deadline.o
> 
> obj-$(CONFIG_BLOCK_COMPAT)+= compat_ioctl.o
> obj-$(CONFIG_BLK_CMDLINE_PARSER)  += cmdline-parser.o
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> new file mode 100644
> index ..3cb9de21ab21
> --- /dev/null
> +++ b/block/mq-deadline.c
> @@ -0,0 +1,649 @@
> +/*
> + *  MQ Deadline i/o scheduler - adaptation of the legacy deadline scheduler,
> + *  for the blk-mq scheduling framework
> + *
> + *  Copyright (C) 2016 Jens Axboe 
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "blk.h"
> +#include "blk-mq.h"
> +#include "blk-mq-tag.h"
> +#include "blk-mq-sched.h"
> +
> +static unsigned int queue_depth = 256;
> +module_param(queue_depth, uint, 0644);
> +MODULE_PARM_DESC(queue_depth, "Use this value as the scheduler queue depth");
> +
> +/*
> + * See Documentation/block/deadline-iosched.txt
> + */
> +static const int read_expire = HZ / 2;  /* max time before a read is submitted. */
> +static const int write_expire = 5 * HZ; /* ditto for writes, these limits are SOFT! */
> +static const int writes_starved = 2;    /* max times reads can starve a write */
> +static const int fifo_batch = 16;       /* # of sequential requests treated as one
> +					    by the above parameters. For throughput. */
> +
> +struct deadline_data {
> + /*
> +  * run time data
> +  */
> +
> + /*
> +  * requests (deadline_rq s) are present on both sort_list and fifo_list
> +  */
> + struct rb_root sort_list[2];
> + struct list_head fifo_list[2];
> +
> + /*
> +  * next in sort order. read, write or both are NULL
> +  */
> + struct request *next_rq[2];
> + unsigned int batching;  /* number of sequential requests made */
> + unsigned int starved;   /* times reads have starved writes */
> +
> + /*
> +  * settings that change how the i/o scheduler behaves
> +  */
> + int fifo_expire[2];
> + int fifo_batch;
> + int writes_starved;
> + int front_merges;
> +
> + spinlock_t lock;
> + struct list_head dispatch;
> + struct blk_mq_tags *tags;
> + atomic_t wait_index;
> +};
> +
> +static inline struct rb_root *
> +deadline_rb_root(struct deadline_data *dd, struct request *rq)
> +{
> + return &dd->sort_list[rq_data_dir(rq)];
> +}
> +
> +/*
> + * get the request after `rq' in sector-sorted order
> + */
> +static inline struct request *
> +deadline_latter_request(struct request *rq)
> +{
> + struct rb_node *node = rb_next(&rq->rb_node);
> +
> + if (node)
> + return rb_entry_rq(node);
> +
> + return NULL;
> +}
> +
> +static void
> +deadline_add_rq_rb(struct deadline_data *dd, struct request *rq)
> +{
> + struct rb_root *root = deadline_rb_root(dd, rq);
> +
> + elv_rb_add(root, rq);
> +}
> +
> +static inline void
> +deadline_del_rq_rb(struct deadline_data *dd, struct request *rq)
> +{
> + const int data_dir = rq_data_dir(rq);
> +
> + if (dd->next_rq[data_dir] == rq)
> + dd->next_rq[data_dir] = deadline_latter_request(rq);
> +
> + elv_rb_del(deadline_rb_root(dd, rq), rq);
> +}
> +
> +/*
> + * remove rq from rbtree and fifo.
> + */
> +static void deadline_remove_request(struct request_queue *q, struct request *rq)
> +{
> + struct deadline_data *dd = q->elevator->elevator_data;
> +
> + list_del_init(&rq->queuelist);
> +
> + /*
> +  * We might not be on the rbtree, if

Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-01-20 Thread Jens Axboe
On Fri, Jan 20 2017, Paolo Valente wrote:
> 
> > On 20 Jan 2017, at 14:14, Paolo Valente  wrote:
> > 
> >> 
> >> On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
> >> 
> >> This is basically identical to deadline-iosched, except it registers
> >> as a MQ capable scheduler. This is still a single queue design.
> >> 
> > 
> > Jens,
> > no spin_lock_irq* in the code.  So request dispatches, too, are
> > guaranteed never to be executed in IRQ context?
> 
> Or maybe the opposite?  That is, are all scheduler functions invoked in
> IRQ context?

Nope

-- 
Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-01-20 Thread Jens Axboe
On Fri, Jan 20 2017, Paolo Valente wrote:
> 
> > On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
> > 
> > This is basically identical to deadline-iosched, except it registers
> > as a MQ capable scheduler. This is still a single queue design.
> > 
> 
> Jens,
> no spin_lock_irq* in the code.  So request dispatches, too, are
> guaranteed never to be executed in IRQ context?  I'm asking this
> question to understand whether I'm missing something that, even in
> BFQ, would somehow allow me not to disable irqs in critical sections,
> even though there is the slice_idle-expiration handler.  Be patient
> with my ignorance.

Yes, dispatches will never happen from IRQ context. blk-mq was designed
so we didn't have to use irq disabling locks.

That said, certain parts of the API can be called from IRQ context.
put_request and the completion parts, for instance. But blk-mq doesn't
need to grab any locks there, and neither does mq-deadline. This might
be different from bfq. lockdep can be a big help there.
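
(For reference, the lockdep checking mentioned here comes with the
standard kernel debug options, e.g.:

    CONFIG_PROVE_LOCKING=y
    CONFIG_DEBUG_SPINLOCK=y

CONFIG_PROVE_LOCKING selects CONFIG_LOCKDEP and reports both IRQ-unsafe
lock usage and lock-ordering violations at runtime.)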

-- 
Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-01-20 Thread Jens Axboe
On Fri, Jan 20 2017, Paolo Valente wrote:
> 
> > On 17 Jan 2017, at 03:47, Jens Axboe  wrote:
> > 
> > On 12/22/2016 09:49 AM, Paolo Valente wrote:
> >> 
> >>> On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
> >>> 
> >>> This is basically identical to deadline-iosched, except it registers
> >>> as a MQ capable scheduler. This is still a single queue design.
> >>> 
> >> 
> >> One last question (for today ...): in mq-deadline there are no
> >> "schedule dispatch" or "unplug work" functions.  In blk, CFQ and BFQ
> >> do these schedules/unplugs in a lot of cases.  What's the right
> >> replacement?  Just doing nothing?
> > 
> > You just use blk_mq_run_hw_queue() or variants thereof to kick off queue
> > runs.
> > 
> 
> Hi Jens,
> I'm working on this right now.  I have a pair of quick questions about
> performance.
> 
> In the blk version of bfq, if the in-service bfq_queue happens to have
> no more budget when the bfq dispatch function is invoked, then bfq:
> returns no request (NULL), immediately expires the in-service
> bfq_queue, and schedules a new dispatch.  The third step is taken so
> that, if other bfq_queues have requests, then a new in-service
> bfq_queue will be selected on the upcoming new dispatch, and a new
> request will be provided right away.
> 
> My questions are: is this dispatch-schedule step still needed with
> blk-mq, to avoid a stall?  If it is not needed to avoid a stall, would
> it still be needed to boost throughput, because it would force an
> immediate, next dispatch?

Generally that step is only needed if you don't dispatch a request for
that invocation, yet you have requests to dispatch. For that case, you
must ensure that the queues are run at some point in the future. So I'm
inclined to answer yes to your question, though it depends on exactly
how it happens. If you have the queue run in the code, comment it like
that, and we can always revisit.
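
A sketch of what such a re-run could look like in a dispatch hook
(hypothetical scheduler; sched_data, pick_next_request() and
sched_has_queued_work() are invented names):

static struct request *sched_dispatch_request(struct blk_mq_hw_ctx *hctx)
{
        struct sched_data *sd = hctx->queue->elevator->elevator_data;
        struct request *rq;

        rq = pick_next_request(sd);
        if (!rq && sched_has_queued_work(sd)) {
                /*
                 * Nothing dispatched on this invocation, but requests
                 * remain (e.g. the in-service queue ran out of budget):
                 * make sure the hw queue is run again, asynchronously.
                 */
                blk_mq_run_hw_queue(hctx, true);
        }
        return rq;
}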

> BTW, bfq-mq survived its first request completion.  I will provide you
> with a link to a github branch as soon as bfq-mq seems able to stand
> up with a minimal workload.

Congratulations, that's a nice milestone!

-- 
Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-01-20 Thread Paolo Valente

> On 20 Jan 2017, at 14:14, Paolo Valente  wrote:
> 
>> 
>> On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
>> 
>> This is basically identical to deadline-iosched, except it registers
>> as a MQ capable scheduler. This is still a single queue design.
>> 
> 
> Jens,
> no spin_lock_irq* in the code.  So request dispatches, too, are
> guaranteed never to be executed in IRQ context?

Or maybe the opposite?  That is, are all scheduler functions invoked in IRQ context?

Thanks,
Paolo

>  I'm asking this
> question to understand whether I'm missing something that, even in
> BFQ, would somehow allow me not to disable irqs in critical sections,
> even though there is the slice_idle-expiration handler.  Be patient with
> my ignorance.
> 
> Thanks,
> Paolo
> 
>> Signed-off-by: Jens Axboe 
>> ---
>> block/Kconfig.iosched |   6 +
>> block/Makefile|   1 +
>> block/mq-deadline.c   | 649 ++
>> 3 files changed, 656 insertions(+)
>> create mode 100644 block/mq-deadline.c
>> 
>> diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched
>> index 421bef9c4c48..490ef2850fae 100644
>> --- a/block/Kconfig.iosched
>> +++ b/block/Kconfig.iosched
>> @@ -32,6 +32,12 @@ config IOSCHED_CFQ
>> 
>>This is the default I/O scheduler.
>> 
>> +config MQ_IOSCHED_DEADLINE
>> +tristate "MQ deadline I/O scheduler"
>> +default y
>> +---help---
>> +  MQ version of the deadline IO scheduler.
>> +
>> config CFQ_GROUP_IOSCHED
>>  bool "CFQ Group Scheduling support"
>>  depends on IOSCHED_CFQ && BLK_CGROUP
>> diff --git a/block/Makefile b/block/Makefile
>> index 2eee9e1bb6db..3ee0abd7205a 100644
>> --- a/block/Makefile
>> +++ b/block/Makefile
>> @@ -18,6 +18,7 @@ obj-$(CONFIG_BLK_DEV_THROTTLING)   += blk-throttle.o
>> obj-$(CONFIG_IOSCHED_NOOP)   += noop-iosched.o
>> obj-$(CONFIG_IOSCHED_DEADLINE)   += deadline-iosched.o
>> obj-$(CONFIG_IOSCHED_CFQ)+= cfq-iosched.o
>> +obj-$(CONFIG_MQ_IOSCHED_DEADLINE)   += mq-deadline.o
>> 
>> obj-$(CONFIG_BLOCK_COMPAT)   += compat_ioctl.o
>> obj-$(CONFIG_BLK_CMDLINE_PARSER) += cmdline-parser.o
>> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
>> new file mode 100644
>> index ..3cb9de21ab21
>> --- /dev/null
>> +++ b/block/mq-deadline.c
>> @@ -0,0 +1,649 @@
>> +/*
>> + *  MQ Deadline i/o scheduler - adaptation of the legacy deadline scheduler,
>> + *  for the blk-mq scheduling framework
>> + *
>> + *  Copyright (C) 2016 Jens Axboe 
>> + */
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "blk.h"
>> +#include "blk-mq.h"
>> +#include "blk-mq-tag.h"
>> +#include "blk-mq-sched.h"
>> +
>> +static unsigned int queue_depth = 256;
>> +module_param(queue_depth, uint, 0644);
>> +MODULE_PARM_DESC(queue_depth, "Use this value as the scheduler queue depth");
>> +
>> +/*
>> + * See Documentation/block/deadline-iosched.txt
>> + */
>> +static const int read_expire = HZ / 2;  /* max time before a read is submitted. */
>> +static const int write_expire = 5 * HZ; /* ditto for writes, these limits are SOFT! */
>> +static const int writes_starved = 2;    /* max times reads can starve a write */
>> +static const int fifo_batch = 16;       /* # of sequential requests treated as one
>> +					     by the above parameters. For throughput. */
>> +
>> +struct deadline_data {
>> +/*
>> + * run time data
>> + */
>> +
>> +/*
>> + * requests (deadline_rq s) are present on both sort_list and fifo_list
>> + */
>> +struct rb_root sort_list[2];
>> +struct list_head fifo_list[2];
>> +
>> +/*
>> + * next in sort order. read, write or both are NULL
>> + */
>> +struct request *next_rq[2];
>> +unsigned int batching;  /* number of sequential requests made */
>> +unsigned int starved;   /* times reads have starved writes */
>> +
>> +/*
>> + * settings that change how the i/o scheduler behaves
>> + */
>> +int fifo_expire[2];
>> +int fifo_batch;
>> +int writes_starved;
>> +int front_merges;
>> +
>> +spinlock_t lock;
>> +struct list_head dispatch;
>> +struct blk_mq_tags *tags;
>> +atomic_t wait_index;
>> +};
>> +
>> +static inline struct rb_root *
>> +deadline_rb_root(struct deadline_data *dd, struct request *rq)
>> +{
>> +return &dd->sort_list[rq_data_dir(rq)];
>> +}
>> +
>> +/*
>> + * get the request after `rq' in sector-sorted order
>> + */
>> +static inline struct request *
>> +deadline_latter_request(struct request *rq)
>> +{
>> +struct rb_node *node = rb_next(&rq->rb_node);
>> +
>> +if (node)
>> +return rb_entry_rq(node);
>> +
>> +return NULL;
>> +}
>> +
>> +static void
>> +deadline_add_rq_rb(struct deadline_data *dd, struct request *rq)
>> +{
>> +struct rb_root *root = deadline_rb_roo

Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-01-20 Thread Paolo Valente

> On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
> 
> This is basically identical to deadline-iosched, except it registers
> as a MQ capable scheduler. This is still a single queue design.
> 

Jens,
no spin_lock_irq* in the code.  So request dispatches, too, are
guaranteed never to be executed in IRQ context?  I'm asking this
question to understand whether I'm missing something that, even in
BFQ, would somehow allow me not to disable irqs in critical sections,
even though there is the slice_idle-expiration handler.  Be patient
with my ignorance.

Thanks,
Paolo

> Signed-off-by: Jens Axboe 
> ---
> block/Kconfig.iosched |   6 +
> block/Makefile|   1 +
> block/mq-deadline.c   | 649 ++
> 3 files changed, 656 insertions(+)
> create mode 100644 block/mq-deadline.c
> 
> diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched
> index 421bef9c4c48..490ef2850fae 100644
> --- a/block/Kconfig.iosched
> +++ b/block/Kconfig.iosched
> @@ -32,6 +32,12 @@ config IOSCHED_CFQ
> 
> This is the default I/O scheduler.
> 
> +config MQ_IOSCHED_DEADLINE
> + tristate "MQ deadline I/O scheduler"
> + default y
> + ---help---
> +   MQ version of the deadline IO scheduler.
> +
> config CFQ_GROUP_IOSCHED
>   bool "CFQ Group Scheduling support"
>   depends on IOSCHED_CFQ && BLK_CGROUP
> diff --git a/block/Makefile b/block/Makefile
> index 2eee9e1bb6db..3ee0abd7205a 100644
> --- a/block/Makefile
> +++ b/block/Makefile
> @@ -18,6 +18,7 @@ obj-$(CONFIG_BLK_DEV_THROTTLING)+= blk-throttle.o
> obj-$(CONFIG_IOSCHED_NOOP)+= noop-iosched.o
> obj-$(CONFIG_IOSCHED_DEADLINE)+= deadline-iosched.o
> obj-$(CONFIG_IOSCHED_CFQ) += cfq-iosched.o
> +obj-$(CONFIG_MQ_IOSCHED_DEADLINE)+= mq-deadline.o
> 
> obj-$(CONFIG_BLOCK_COMPAT)+= compat_ioctl.o
> obj-$(CONFIG_BLK_CMDLINE_PARSER)  += cmdline-parser.o
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> new file mode 100644
> index ..3cb9de21ab21
> --- /dev/null
> +++ b/block/mq-deadline.c
> @@ -0,0 +1,649 @@
> +/*
> + *  MQ Deadline i/o scheduler - adaptation of the legacy deadline scheduler,
> + *  for the blk-mq scheduling framework
> + *
> + *  Copyright (C) 2016 Jens Axboe 
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "blk.h"
> +#include "blk-mq.h"
> +#include "blk-mq-tag.h"
> +#include "blk-mq-sched.h"
> +
> +static unsigned int queue_depth = 256;
> +module_param(queue_depth, uint, 0644);
> +MODULE_PARM_DESC(queue_depth, "Use this value as the scheduler queue depth");
> +
> +/*
> + * See Documentation/block/deadline-iosched.txt
> + */
> +static const int read_expire = HZ / 2;  /* max time before a read is submitted. */
> +static const int write_expire = 5 * HZ; /* ditto for writes, these limits are SOFT! */
> +static const int writes_starved = 2;    /* max times reads can starve a write */
> +static const int fifo_batch = 16;       /* # of sequential requests treated as one
> +					    by the above parameters. For throughput. */
> +
> +struct deadline_data {
> + /*
> +  * run time data
> +  */
> +
> + /*
> +  * requests (deadline_rq s) are present on both sort_list and fifo_list
> +  */
> + struct rb_root sort_list[2];
> + struct list_head fifo_list[2];
> +
> + /*
> +  * next in sort order. read, write or both are NULL
> +  */
> + struct request *next_rq[2];
> + unsigned int batching;  /* number of sequential requests made */
> + unsigned int starved;   /* times reads have starved writes */
> +
> + /*
> +  * settings that change how the i/o scheduler behaves
> +  */
> + int fifo_expire[2];
> + int fifo_batch;
> + int writes_starved;
> + int front_merges;
> +
> + spinlock_t lock;
> + struct list_head dispatch;
> + struct blk_mq_tags *tags;
> + atomic_t wait_index;
> +};
> +
> +static inline struct rb_root *
> +deadline_rb_root(struct deadline_data *dd, struct request *rq)
> +{
> + return &dd->sort_list[rq_data_dir(rq)];
> +}
> +
> +/*
> + * get the request after `rq' in sector-sorted order
> + */
> +static inline struct request *
> +deadline_latter_request(struct request *rq)
> +{
> + struct rb_node *node = rb_next(&rq->rb_node);
> +
> + if (node)
> + return rb_entry_rq(node);
> +
> + return NULL;
> +}
> +
> +static void
> +deadline_add_rq_rb(struct deadline_data *dd, struct request *rq)
> +{
> + struct rb_root *root = deadline_rb_root(dd, rq);
> +
> + elv_rb_add(root, rq);
> +}
> +
> +static inline void
> +deadline_del_rq_rb(struct deadline_data *dd, struct request *rq)
> +{
> + const int data_dir = rq_data_dir(rq);
> +
> + if (dd->next_rq[data_dir] == rq)
> + dd->next_rq[data_dir] = deadline_latter_request(rq);

Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-01-20 Thread Paolo Valente

> On 17 Jan 2017, at 03:47, Jens Axboe  wrote:
> 
> On 12/22/2016 09:49 AM, Paolo Valente wrote:
>> 
>>> On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
>>> 
>>> This is basically identical to deadline-iosched, except it registers
>>> as a MQ capable scheduler. This is still a single queue design.
>>> 
>> 
>> One last question (for today ...): in mq-deadline there are no
>> "schedule dispatch" or "unplug work" functions.  In blk, CFQ and BFQ
>> do these schedules/unplugs in a lot of cases.  What's the right
>> replacement?  Just doing nothing?
> 
> You just use blk_mq_run_hw_queue() or variants thereof to kick off queue
> runs.
> 

Hi Jens,
I'm working on this right now.  I have a pair of quick questions about
performance.

In the blk version of bfq, if the in-service bfq_queue happens to have
no more budget when the bfq dispatch function is invoked, then bfq:
returns no request (NULL), immediately expires the in-service
bfq_queue, and schedules a new dispatch.  The third step is taken so
that, if other bfq_queues have requests, then a new in-service
bfq_queue will be selected on the upcoming new dispatch, and a new
request will be provided right away.

My questions are: is this dispatch-schedule step still needed with
blk-mq, to avoid a stall?  If it is not needed to avoid a stall, would
it still be needed to boost throughput, because it would force an
immediate, next dispatch?

BTW, bfq-mq survived its first request completion.  I will provide you
with a link to a github branch as soon as bfq-mq seems able to stand
up with a minimal workload.

Thanks,
Paolo

> -- 
> Jens Axboe
> 



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-01-16 Thread Jens Axboe
On 12/22/2016 09:07 AM, Paolo Valente wrote:
> 
>> On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
>>
>> This is basically identical to deadline-iosched, except it registers
>> as a MQ capable scheduler. This is still a single queue design.
>>
>> Signed-off-by: Jens Axboe 
>> ---
> 
> ...
> 
>> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
>> new file mode 100644
>> index ..3cb9de21ab21
>> --- /dev/null
>> +++ b/block/mq-deadline.c
>> ...
>> +/*
>> + * remove rq from rbtree and fifo.
>> + */
>> +static void deadline_remove_request(struct request_queue *q, struct request *rq)
>> +{
>> +struct deadline_data *dd = q->elevator->elevator_data;
>> +
>> +list_del_init(&rq->queuelist);
>> +
>> +/*
>> + * We might not be on the rbtree, if we are doing an insert merge
>> + */
>> +if (!RB_EMPTY_NODE(&rq->rb_node))
>> +deadline_del_rq_rb(dd, rq);
>> +
> 
> I've been scratching my head over the last three instructions, but to
> no avail.  If I understand correctly, the
> list_del_init(&rq->queuelist);
> removes rq from the fifo list.  But, if so, I don't understand how it
> could be possible that rq has not been added to the rb_tree too.
>
> Another interpretation that I tried is that the above three lines
> correctly handle the following case, where rq has not been inserted
> into the deadline fifo queue and rb tree at all: when dd_insert_request
> was executed for rq, blk_mq_sched_try_insert_merge succeeded.  Yet, in
> that case, the
> list_del_init(&rq->queuelist);
> does not seem to make sense.
> 
> Could you please shed some light on this for me?

I think you are correct, we don't need to touch ->queuelist for the case
where RB_EMPTY_NODE() is true. Minor detail, the list is already empty,
so it does no harm.
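
(Concretely, list_del_init() re-initializes the entry to point at
itself, and unlinking a self-pointing entry just rewrites the same two
pointers, so calling it a second time is a harmless no-op -- roughly:)

static inline void list_del_init(struct list_head *entry)
{
        __list_del(entry->prev, entry->next);  /* no-op when entry points at itself */
        INIT_LIST_HEAD(entry);                 /* entry->next = entry->prev = entry */
}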

-- 
Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2017-01-16 Thread Jens Axboe
On 12/22/2016 09:49 AM, Paolo Valente wrote:
> 
>> On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
>>
>> This is basically identical to deadline-iosched, except it registers
>> as a MQ capable scheduler. This is still a single queue design.
>>
> 
> One last question (for today ...): in mq-deadline there are no
> "schedule dispatch" or "unplug work" functions.  In blk, CFQ and BFQ
> do these schedules/unplugs in a lot of cases.  What's the right
> replacement?  Just doing nothing?

You just use blk_mq_run_hw_queue() or variants thereof to kick off queue
runs.

-- 
Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2016-12-22 Thread Paolo Valente

> On 17 Dec 2016, at 01:12, Jens Axboe  wrote:
> 
> This is basically identical to deadline-iosched, except it registers
> as a MQ capable scheduler. This is still a single queue design.
> 

One last question (for today ...): in mq-deadline there are no
"schedule dispatch" or "unplug work" functions.  In blk, CFQ and BFQ
do these schedules/unplugs in a lot of cases.  What's the right
replacement?  Just doing nothing?

Thanks,
Paolo

> Signed-off-by: Jens Axboe 
> ---
> block/Kconfig.iosched |   6 +
> block/Makefile|   1 +
> block/mq-deadline.c   | 649 ++
> 3 files changed, 656 insertions(+)
> create mode 100644 block/mq-deadline.c
> 
> diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched
> index 421bef9c4c48..490ef2850fae 100644
> --- a/block/Kconfig.iosched
> +++ b/block/Kconfig.iosched
> @@ -32,6 +32,12 @@ config IOSCHED_CFQ
> 
> This is the default I/O scheduler.
> 
> +config MQ_IOSCHED_DEADLINE
> + tristate "MQ deadline I/O scheduler"
> + default y
> + ---help---
> +   MQ version of the deadline IO scheduler.
> +
> config CFQ_GROUP_IOSCHED
>   bool "CFQ Group Scheduling support"
>   depends on IOSCHED_CFQ && BLK_CGROUP
> diff --git a/block/Makefile b/block/Makefile
> index 2eee9e1bb6db..3ee0abd7205a 100644
> --- a/block/Makefile
> +++ b/block/Makefile
> @@ -18,6 +18,7 @@ obj-$(CONFIG_BLK_DEV_THROTTLING)+= blk-throttle.o
> obj-$(CONFIG_IOSCHED_NOOP)+= noop-iosched.o
> obj-$(CONFIG_IOSCHED_DEADLINE)+= deadline-iosched.o
> obj-$(CONFIG_IOSCHED_CFQ) += cfq-iosched.o
> +obj-$(CONFIG_MQ_IOSCHED_DEADLINE)+= mq-deadline.o
> 
> obj-$(CONFIG_BLOCK_COMPAT)+= compat_ioctl.o
> obj-$(CONFIG_BLK_CMDLINE_PARSER)  += cmdline-parser.o
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> new file mode 100644
> index ..3cb9de21ab21
> --- /dev/null
> +++ b/block/mq-deadline.c
> @@ -0,0 +1,649 @@
> +/*
> + *  MQ Deadline i/o scheduler - adaptation of the legacy deadline scheduler,
> + *  for the blk-mq scheduling framework
> + *
> + *  Copyright (C) 2016 Jens Axboe 
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "blk.h"
> +#include "blk-mq.h"
> +#include "blk-mq-tag.h"
> +#include "blk-mq-sched.h"
> +
> +static unsigned int queue_depth = 256;
> +module_param(queue_depth, uint, 0644);
> +MODULE_PARM_DESC(queue_depth, "Use this value as the scheduler queue depth");
> +
> +/*
> + * See Documentation/block/deadline-iosched.txt
> + */
> +static const int read_expire = HZ / 2;  /* max time before a read is submitted. */
> +static const int write_expire = 5 * HZ; /* ditto for writes, these limits are SOFT! */
> +static const int writes_starved = 2;    /* max times reads can starve a write */
> +static const int fifo_batch = 16;       /* # of sequential requests treated as one
> +                                            by the above parameters. For throughput. */
> +
> +struct deadline_data {
> + /*
> +  * run time data
> +  */
> +
> + /*
> +  * requests (deadline_rq s) are present on both sort_list and fifo_list
> +  */
> + struct rb_root sort_list[2];
> + struct list_head fifo_list[2];
> +
> + /*
> +  * next in sort order. read, write or both are NULL
> +  */
> + struct request *next_rq[2];
> + unsigned int batching;  /* number of sequential requests made */
> + unsigned int starved;   /* times reads have starved writes */
> +
> + /*
> +  * settings that change how the i/o scheduler behaves
> +  */
> + int fifo_expire[2];
> + int fifo_batch;
> + int writes_starved;
> + int front_merges;
> +
> + spinlock_t lock;
> + struct list_head dispatch;
> + struct blk_mq_tags *tags;
> + atomic_t wait_index;
> +};
> +
> +static inline struct rb_root *
> +deadline_rb_root(struct deadline_data *dd, struct request *rq)
> +{
> + return &dd->sort_list[rq_data_dir(rq)];
> +}
> +
> +/*
> + * get the request after `rq' in sector-sorted order
> + */
> +static inline struct request *
> +deadline_latter_request(struct request *rq)
> +{
> + struct rb_node *node = rb_next(&rq->rb_node);
> +
> + if (node)
> + return rb_entry_rq(node);
> +
> + return NULL;
> +}
> +
> +static void
> +deadline_add_rq_rb(struct deadline_data *dd, struct request *rq)
> +{
> + struct rb_root *root = deadline_rb_root(dd, rq);
> +
> + elv_rb_add(root, rq);
> +}
> +
> +static inline void
> +deadline_del_rq_rb(struct deadline_data *dd, struct request *rq)
> +{
> + const int data_dir = rq_data_dir(rq);
> +
> + if (dd->next_rq[data_dir] == rq)
> + dd->next_rq[data_dir] = deadline_latter_request(rq);
> +
> + elv_rb_del(deadline_rb_root(dd, rq), rq);
> +}
> +
> +/*
> + * remove rq from rbtree and fifo.
> + */
> +static void deadline_remove_request(struct request_queue *q, struct request *rq)

Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2016-12-22 Thread Paolo Valente

> Il giorno 17 dic 2016, alle ore 01:12, Jens Axboe  ha scritto:
> 
> This is basically identical to deadline-iosched, except it registers
> as a MQ capable scheduler. This is still a single queue design.
> 
> Signed-off-by: Jens Axboe 
> ---

...

> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> new file mode 100644
> index ..3cb9de21ab21
> --- /dev/null
> +++ b/block/mq-deadline.c
> ...
> +/*
> + * remove rq from rbtree and fifo.
> + */
> +static void deadline_remove_request(struct request_queue *q, struct request *rq)
> +{
> + struct deadline_data *dd = q->elevator->elevator_data;
> +
> + list_del_init(&rq->queuelist);
> +
> + /*
> +  * We might not be on the rbtree, if we are doing an insert merge
> +  */
> + if (!RB_EMPTY_NODE(&rq->rb_node))
> + deadline_del_rq_rb(dd, rq);
> +

I've been scratching my head over the last three instructions, but to
no avail.  If I understand correctly, the
list_del_init(&rq->queuelist);
removes rq from the fifo list.  But, if so, I don't understand how it
could be possible that rq has not been added to the rb_tree too.

Another interpretation that I tried is that the above three lines
correctly handle the following case, in which rq has not been inserted
at all into the deadline fifo queue and rb tree: when dd_insert_request
was executed for rq, blk_mq_sched_try_insert_merge succeeded.  Yet, the
list_del_init(&rq->queuelist);
does not seem to make sense.

Could you please shed some light on this for me?

Thanks,
Paolo



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2016-12-21 Thread Jens Axboe
On 12/21/2016 04:59 AM, Bart Van Assche wrote:
> Since this patch is the first patch that introduces a call to
> blk_queue_exit() from a module other than the block layer core,
> shouldn't this patch export the blk_queue_exit() function? An attempt
> to build mq-deadline as a module resulted in the following:
> 
> ERROR: "blk_queue_exit" [block/mq-deadline.ko] undefined!
> make[1]: *** [scripts/Makefile.modpost:91: __modpost] Error 1
> make: *** [Makefile:1198: modules] Error 2
> Execution failed: make all

Yes, it should. I'll add the export for now; I want to move that check
and the free/drop into the generic code, so that the schedulers don't
have to worry about it.
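
For reference, a sketch of what such an export looks like, assuming the
then-current one-line definition of blk_queue_exit() in blk-core.c (the
GPL-only flavor of the export is an assumption here):

/* block/blk-core.c */
void blk_queue_exit(struct request_queue *q)
{
	percpu_ref_put(&q->q_usage_counter);
}
EXPORT_SYMBOL_GPL(blk_queue_exit);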

-- 
Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2016-12-21 Thread Bart Van Assche
On 12/17/2016 01:12 AM, Jens Axboe wrote:
> +static bool dd_put_request(struct request *rq)
> +{
> +	/*
> +	 * If it's a real request, we just have to free it. For a shadow
> +	 * request, we should only free it if we haven't started it. A
> +	 * started request is mapped to a real one, and the real one will
> +	 * free it. We can get here with request merges, since we then
> +	 * free the request before we start/issue it.
> +	 */
> +	if (!blk_mq_sched_rq_is_shadow(rq))
> +		return false;
> +
> +	if (!(rq->rq_flags & RQF_STARTED)) {
> +		struct request_queue *q = rq->q;
> +		struct deadline_data *dd = q->elevator->elevator_data;
> +
> +		/*
> +		 * IO completion would normally do this, but if we merge
> +		 * and free before we issue the request, drop both the
> +		 * tag and queue ref
> +		 */
> +		blk_mq_sched_free_shadow_request(dd->tags, rq);
> +		blk_queue_exit(q);
> +	}
> +
> +	return true;
> +}

Hello Jens,

Since this patch is the first patch that introduces a call to blk_queue_exit()
from a module other than the block layer core, shouldn't this patch export the
blk_queue_exit() function? An attempt to build mq-deadline as a module resulted
in the following:

ERROR: "blk_queue_exit" [block/mq-deadline.ko] undefined!
make[1]: *** [scripts/Makefile.modpost:91: __modpost] Error 1
make: *** [Makefile:1198: modules] Error 2
Execution failed: make all

Bart.

Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2016-12-20 Thread Jens Axboe
On 12/20/2016 02:34 AM, Paolo Valente wrote:
> 
>> Il giorno 17 dic 2016, alle ore 01:12, Jens Axboe  ha scritto:
>>
>> This is basically identical to deadline-iosched, except it registers
>> as a MQ capable scheduler. This is still a single queue design.
>>
>> Signed-off-by: Jens Axboe 
>> ...
>> +
>> +static bool dd_has_work(struct blk_mq_hw_ctx *hctx)
>> +{
>> +struct deadline_data *dd = hctx->queue->elevator->elevator_data;
>> +
>> +return !list_empty_careful(&dd->dispatch) ||
>> +!list_empty_careful(&dd->fifo_list[0]) ||
>> +!list_empty_careful(&dd->fifo_list[1]);
> 
> Just a request for clarification: if I'm not mistaken,
> list_empty_careful can be used safely only if the only possible other
> concurrent access is a delete.  Or am I missing something?

We can "solve" that with memory barriers. For now, it's safe to ignore
on your end.
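
For context, list_empty_careful() tests both link pointers, so it
tolerates a racing list_del_init() but not, by itself, a racing
list_add(). A hedged sketch of the barrier pairing alluded to above
(dd_insert_and_kick() and dd_has_work_mb() are illustrative names, not
actual mq-deadline functions):

/* Producer side (insert path): publish the request, then fence. */
static void dd_insert_and_kick(struct deadline_data *dd, struct request *rq,
			       struct blk_mq_hw_ctx *hctx)
{
	spin_lock(&dd->lock);
	list_add_tail(&rq->queuelist, &dd->fifo_list[rq_data_dir(rq)]);
	spin_unlock(&dd->lock);

	smp_mb();	/* order the list update before the queue re-run */
	blk_mq_run_hw_queue(hctx, true);
}

/* Consumer side (has-work check): fence, then test emptiness. */
static bool dd_has_work_mb(struct deadline_data *dd)
{
	smp_mb();	/* pairs with the producer barrier above */
	return !list_empty_careful(&dd->dispatch) ||
	       !list_empty_careful(&dd->fifo_list[0]) ||
	       !list_empty_careful(&dd->fifo_list[1]);
}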


-- 
Jens Axboe



Re: [PATCH 7/8] mq-deadline: add blk-mq adaptation of the deadline IO scheduler

2016-12-20 Thread Paolo Valente

> Il giorno 17 dic 2016, alle ore 01:12, Jens Axboe  ha scritto:
> 
> This is basically identical to deadline-iosched, except it registers
> as a MQ capable scheduler. This is still a single queue design.
> 
> Signed-off-by: Jens Axboe 
> ...
> +
> +static bool dd_has_work(struct blk_mq_hw_ctx *hctx)
> +{
> + struct deadline_data *dd = hctx->queue->elevator->elevator_data;
> +
> + return !list_empty_careful(&dd->dispatch) ||
> + !list_empty_careful(&dd->fifo_list[0]) ||
> + !list_empty_careful(&dd->fifo_list[1]);

Just a request for clarification: if I'm not mistaken,
list_empty_careful can be used safely only if the only possible other
concurrent access is a delete.  Or am I missing something?

If the above constraint does hold, then how are we guaranteed that it
is met?  My doubt arises from, e.g., the possible concurrent list_add
from dd_insert_request.

Thanks,
Paolo

> +}
> +
> +/*
> + * sysfs parts below
> + */
> +static ssize_t
> +deadline_var_show(int var, char *page)
> +{
> + return sprintf(page, "%d\n", var);
> +}
> +
> +static ssize_t
> +deadline_var_store(int *var, const char *page, size_t count)
> +{
> + char *p = (char *) page;
> +
> + *var = simple_strtol(p, &p, 10);
> + return count;
> +}
> +
> +#define SHOW_FUNCTION(__FUNC, __VAR, __CONV) \
> +static ssize_t __FUNC(struct elevator_queue *e, char *page)  \
> +{\
> + struct deadline_data *dd = e->elevator_data;\
> + int __data = __VAR; \
> + if (__CONV) \
> + __data = jiffies_to_msecs(__data);  \
> + return deadline_var_show(__data, (page));   \
> +}
> +SHOW_FUNCTION(deadline_read_expire_show, dd->fifo_expire[READ], 1);
> +SHOW_FUNCTION(deadline_write_expire_show, dd->fifo_expire[WRITE], 1);
> +SHOW_FUNCTION(deadline_writes_starved_show, dd->writes_starved, 0);
> +SHOW_FUNCTION(deadline_front_merges_show, dd->front_merges, 0);
> +SHOW_FUNCTION(deadline_fifo_batch_show, dd->fifo_batch, 0);
> +#undef SHOW_FUNCTION
> +
> +#define STORE_FUNCTION(__FUNC, __PTR, MIN, MAX, __CONV)			\
> +static ssize_t __FUNC(struct elevator_queue *e, const char *page, size_t count) \
> +{\
> + struct deadline_data *dd = e->elevator_data;\
> + int __data; \
> + int ret = deadline_var_store(&__data, (page), count);   \
> + if (__data < (MIN)) \
> + __data = (MIN); \
> + else if (__data > (MAX))\
> + __data = (MAX); \
> + if (__CONV) \
> + *(__PTR) = msecs_to_jiffies(__data);\
> + else\
> + *(__PTR) = __data;  \
> + return ret; \
> +}
> +STORE_FUNCTION(deadline_read_expire_store, &dd->fifo_expire[READ], 0, INT_MAX, 1);
> +STORE_FUNCTION(deadline_write_expire_store, &dd->fifo_expire[WRITE], 0, INT_MAX, 1);
> +STORE_FUNCTION(deadline_writes_starved_store, &dd->writes_starved, INT_MIN, INT_MAX, 0);
> +STORE_FUNCTION(deadline_front_merges_store, &dd->front_merges, 0, 1, 0);
> +STORE_FUNCTION(deadline_fifo_batch_store, &dd->fifo_batch, 0, INT_MAX, 0);
> +#undef STORE_FUNCTION
> +
> +#define DD_ATTR(name) \
> + __ATTR(name, S_IRUGO|S_IWUSR, deadline_##name##_show, \
> +   deadline_##name##_store)
> +
> +static struct elv_fs_entry deadline_attrs[] = {
> + DD_ATTR(read_expire),
> + DD_ATTR(write_expire),
> + DD_ATTR(writes_starved),
> + DD_ATTR(front_merges),
> + DD_ATTR(fifo_batch),
> + __ATTR_NULL
> +};
> +
> +static struct elevator_type mq_deadline = {
> + .ops.mq = {
> + .get_request= dd_get_request,
> + .put_request= dd_put_request,
> + .insert_requests= dd_insert_requests,
> + .dispatch_requests  = dd_dispatch_requests,
> + .completed_request  = dd_completed_request,
> + .next_request   = elv_rb_latter_request,
> + .former_request = elv_rb_former_request,
> + .bio_merge  = dd_bio_merge,
> + .request_merge  = dd_request_merge,
> + .requests_merged= dd_merged_requests,
> + .request_merged = dd_request_merged,
> +