Re: [PATCH] blk-mq-sched: fix possible crash if changing scheduler fails

2017-01-24 Thread Jens Axboe
On 01/23/2017 04:47 PM, Omar Sandoval wrote:
> From: Omar Sandoval 
> 
> In elevator_switch(), we free the old scheduler's tags and then
> initialize the new scheduler. If initializing the new scheduler fails,
> we fall back to the old scheduler, but our tags have already been freed.
> There's no reason to free the sched_tags only to reallocate them, so
> let's just reuse them, which makes it possible to safely fall back to
> the old scheduler.

Do we need a failure case on both ends? We never free the driver
tags, so it's always perfectly safe to fall back to not running
with a scheduler.

Since we were talking about making the scheduler depth configurable
or dependent on the scheduler, reusing tags seems like something that
would potentially be a short lived "feature".

So I think I'd prefer if we teardown/setup like we do now, and just
have the failure case be falling back to "none".


-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] blk-mq-sched: fix possible crash if changing scheduler fails

2017-01-23 Thread Omar Sandoval
From: Omar Sandoval 

In elevator_switch(), we free the old scheduler's tags and then
initialize the new scheduler. If initializing the new scheduler fails,
we fall back to the old scheduler, but our tags have already been freed.
There's no reason to free the sched_tags only to reallocate them, so
let's just reuse them, which makes it possible to safely fall back to
the old scheduler.

Signed-off-by: Omar Sandoval 
---
Applies to for-4.11/block. Reproduced the issue and verified the fix by
sticking an err = -ENOMEM in the right place. As far as I can tell, it's
safe to reuse the previously allocated sched_tags for the new scheduler.

 block/elevator.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index ef7f59469acc..28283aaab04c 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -963,9 +963,6 @@ static int elevator_switch(struct request_queue *q, struct 
elevator_type *new_e)
if (old) {
old_registered = old->registered;
 
-   if (old->uses_mq)
-   blk_mq_sched_teardown(q);
-
if (!q->mq_ops)
blk_queue_bypass_start(q);
 
@@ -981,7 +978,14 @@ static int elevator_switch(struct request_queue *q, struct 
elevator_type *new_e)
/* allocate, init and register new elevator */
if (new_e) {
if (new_e->uses_mq) {
-   err = blk_mq_sched_setup(q);
+   /*
+* If we were previously using an I/O scheduler, the
+* sched_tags are already allocated.
+*/
+   if (old)
+   err = 0;
+   else
+   err = blk_mq_sched_setup(q);
if (!err)
err = new_e->ops.mq.init_sched(q, new_e);
} else
@@ -997,6 +1001,12 @@ static int elevator_switch(struct request_queue *q, 
struct elevator_type *new_e)
 
/* done, kill the old one and finish */
if (old) {
+   /*
+* If we switched to another I/O scheduler, then we shouldn't
+* free sched_tags.
+*/
+   if (old->uses_mq && !new_e)
+   blk_mq_sched_teardown(q);
elevator_exit(old);
if (!q->mq_ops)
blk_queue_bypass_end(q);
@@ -1015,7 +1025,7 @@ static int elevator_switch(struct request_queue *q, 
struct elevator_type *new_e)
return 0;
 
 fail_register:
-   if (q->mq_ops)
+   if (q->mq_ops && !old)
blk_mq_sched_teardown(q);
elevator_exit(q->elevator);
 fail_init:
-- 
2.11.0

--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html