Re: bug in tag handling in blk-mq?
On Wed, 2018-05-09 at 13:50 -0600, Jens Axboe wrote:
> On 5/9/18 12:31 PM, Mike Galbraith wrote:
> > On Wed, 2018-05-09 at 11:01 -0600, Jens Axboe wrote:
> > > On 5/9/18 10:57 AM, Mike Galbraith wrote:
> > > > > > Confirmed.  Impressive high speed bug stomping.
> > > > >
> > > > > Well, that's good news. Can I get you to try this patch?
> > > >
> > > > Sure thing.  The original hang (minus provocation patch) being
> > > > annoyingly non-deterministic, this will (hopefully) take a while.
> > >
> > > You can verify with the provocation patch as well first, if you wish.
> >
> > Done, box still seems fine.
>
> Omar had some (valid) complaints, can you try this one as well? You
> can also find it as a series here:
>
> http://git.kernel.dk/cgit/linux-block/log/?h=bfq-cleanups
>
> I'll repost the series shortly, need to check if it actually builds and
> boots.

I applied the series (+ provocation), all is well.

	-Mike
Re: bug in tag handling in blk-mq?
On 5/9/18 12:31 PM, Mike Galbraith wrote:
> On Wed, 2018-05-09 at 11:01 -0600, Jens Axboe wrote:
>> On 5/9/18 10:57 AM, Mike Galbraith wrote:
>>>>> Confirmed.  Impressive high speed bug stomping.
>>>>
>>>> Well, that's good news. Can I get you to try this patch?
>>>
>>> Sure thing.  The original hang (minus provocation patch) being
>>> annoyingly non-deterministic, this will (hopefully) take a while.
>>
>> You can verify with the provocation patch as well first, if you wish.
>
> Done, box still seems fine.

Omar had some (valid) complaints, can you try this one as well? You
can also find it as a series here:

http://git.kernel.dk/cgit/linux-block/log/?h=bfq-cleanups

I'll repost the series shortly, need to check if it actually builds and
boots.

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index ebc264c87a09..cba6e82153a2 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -487,46 +487,6 @@ static struct request *bfq_choose_req(struct bfq_data *bfqd,
 }
 
 /*
- * See the comments on bfq_limit_depth for the purpose of
- * the depths set in the function.
- */
-static void bfq_update_depths(struct bfq_data *bfqd, struct sbitmap_queue *bt)
-{
-	bfqd->sb_shift = bt->sb.shift;
-
-	/*
-	 * In-word depths if no bfq_queue is being weight-raised:
-	 * leaving 25% of tags only for sync reads.
-	 *
-	 * In next formulas, right-shift the value
-	 * (1U<<bfqd->sb_shift), instead of computing directly
-	 * (1U<<(bfqd->sb_shift - something)), to be robust against
-	 * any possible value of bfqd->sb_shift, without having to
-	 * limit 'something'.
-	 */
-	/* no more than 50% of tags for async I/O */
-	bfqd->word_depths[0][0] = max((1U<<bfqd->sb_shift)>>1, 1U);
-	/*
-	 * no more than 75% of tags for sync writes (25% extra tags
-	 * w.r.t. async I/O, to prevent async I/O from starving sync
-	 * writes)
-	 */
-	bfqd->word_depths[0][1] = max(((1U<<bfqd->sb_shift) * 3)>>2, 1U);
-
-	/*
-	 * In-word depths in case some bfq_queue is being weight-
-	 * raised: leaving ~63% of tags for sync reads. This is the
-	 * highest percentage for which, in our tests, application
-	 * start-up times didn't suffer from any regression due to tag
-	 * shortage.
-	 */
-	/* no more than ~18% of tags for async I/O */
-	bfqd->word_depths[1][0] = max(((1U<<bfqd->sb_shift) * 3)>>4, 1U);
-	/* no more than ~37% of tags for sync writes (~20% extra tags) */
-	bfqd->word_depths[1][1] = max(((1U<<bfqd->sb_shift) * 6)>>4, 1U);
-}
-
-/*
  * Async I/O can easily starve sync I/O (both sync reads and sync
  * writes), by consuming all tags. Similarly, storms of sync writes,
  * such as those that sync(2) may trigger, can starve sync reads.
@@ -535,25 +495,11 @@ static void bfq_update_depths(struct bfq_data *bfqd, struct sbitmap_queue *bt)
  */
 static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
 {
-	struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
 	struct bfq_data *bfqd = data->q->elevator->elevator_data;
-	struct sbitmap_queue *bt;
 
 	if (op_is_sync(op) && !op_is_write(op))
 		return;
 
-	if (data->flags & BLK_MQ_REQ_RESERVED) {
-		if (unlikely(!tags->nr_reserved_tags)) {
-			WARN_ON_ONCE(1);
-			return;
-		}
-		bt = &tags->breserved_tags;
-	} else
-		bt = &tags->bitmap_tags;
-
-	if (unlikely(bfqd->sb_shift != bt->sb.shift))
-		bfq_update_depths(bfqd, bt);
-
 	data->shallow_depth =
 		bfqd->word_depths[!!bfqd->wr_busy_queues][op_is_sync(op)];
 
@@ -5105,6 +5051,66 @@ void bfq_put_async_queues(struct bfq_data *bfqd, struct bfq_group *bfqg)
 	__bfq_put_async_bfqq(bfqd, &bfqg->async_idle_bfqq);
 }
 
+/*
+ * See the comments on bfq_limit_depth for the purpose of
+ * the depths set in the function. Return minimum shallow depth we'll use.
+ */
+static unsigned int bfq_update_depths(struct bfq_data *bfqd,
+				      struct sbitmap_queue *bt)
+{
+	unsigned int i, j, min_shallow = UINT_MAX;
+
+	bfqd->sb_shift = bt->sb.shift;
+
+	/*
+	 * In-word depths if no bfq_queue is being weight-raised:
+	 * leaving 25% of tags only for sync reads.
+	 *
+	 * In next formulas, right-shift the value
+	 * (1U<<bfqd->sb_shift), instead of computing directly
+	 * (1U<<(bfqd->sb_shift - something)), to be robust against
+	 * any possible value of bfqd->sb_shift, without having to
+	 * limit 'something'.
+	 */
+	/* no more than 50% of tags for async I/O */
+	bfqd->word_depths[0][0] = max((1U<<bfqd->sb_shift)>>1, 1U);
+	/*
+	 * no more than 75% of tags for sync writes (25% extra tags
+	 * w.r.t. async I/O, to prevent async I/O from starving sync
+	 * writes)
+	 */
+	bfqd->word
Re: bug in tag handling in blk-mq?
On Wed, 2018-05-09 at 11:01 -0600, Jens Axboe wrote:
> On 5/9/18 10:57 AM, Mike Galbraith wrote:
> > > > Confirmed.  Impressive high speed bug stomping.
> > >
> > > Well, that's good news. Can I get you to try this patch?
> >
> > Sure thing.  The original hang (minus provocation patch) being
> > annoyingly non-deterministic, this will (hopefully) take a while.
>
> You can verify with the provocation patch as well first, if you wish.

Done, box still seems fine.

	-Mike
Re: bug in tag handling in blk-mq?
On 5/9/18 10:57 AM, Mike Galbraith wrote:
> On Wed, 2018-05-09 at 09:18 -0600, Jens Axboe wrote:
>> On 5/8/18 10:11 PM, Mike Galbraith wrote:
>>> On Tue, 2018-05-08 at 19:09 -0600, Jens Axboe wrote:
>>>>
>>>> Alright, I managed to reproduce it. What I think is happening is that
>>>> BFQ is limiting the inflight case to something less than the wake
>>>> batch for sbitmap, which can lead to stalls. I don't have time to test
>>>> this tonight, but perhaps you can give it a go when you are back at it.
>>>> If not, I'll try tomorrow morning.
>>>>
>>>> If this is the issue, I can turn it into a real patch. This is just to
>>>> confirm that the issue goes away with the below.
>>>
>>> Confirmed.  Impressive high speed bug stomping.
>>
>> Well, that's good news. Can I get you to try this patch?
>
> Sure thing.  The original hang (minus provocation patch) being
> annoyingly non-deterministic, this will (hopefully) take a while.

You can verify with the provocation patch as well first, if you wish.
Just need to hand-apply since it'll conflict with this patch in bfq.
But it's a trivial resolve.

-- 
Jens Axboe
Re: bug in tag handling in blk-mq?
On Wed, 2018-05-09 at 09:18 -0600, Jens Axboe wrote:
> On 5/8/18 10:11 PM, Mike Galbraith wrote:
> > On Tue, 2018-05-08 at 19:09 -0600, Jens Axboe wrote:
> > >
> > > Alright, I managed to reproduce it. What I think is happening is that
> > > BFQ is limiting the inflight case to something less than the wake
> > > batch for sbitmap, which can lead to stalls. I don't have time to test
> > > this tonight, but perhaps you can give it a go when you are back at it.
> > > If not, I'll try tomorrow morning.
> > >
> > > If this is the issue, I can turn it into a real patch. This is just to
> > > confirm that the issue goes away with the below.
> >
> > Confirmed.  Impressive high speed bug stomping.
>
> Well, that's good news. Can I get you to try this patch?

Sure thing.  The original hang (minus provocation patch) being
annoyingly non-deterministic, this will (hopefully) take a while.

	-Mike
Re: bug in tag handling in blk-mq?
On 5/8/18 10:11 PM, Mike Galbraith wrote:
> On Tue, 2018-05-08 at 19:09 -0600, Jens Axboe wrote:
>>
>> Alright, I managed to reproduce it. What I think is happening is that
>> BFQ is limiting the inflight case to something less than the wake
>> batch for sbitmap, which can lead to stalls. I don't have time to test
>> this tonight, but perhaps you can give it a go when you are back at it.
>> If not, I'll try tomorrow morning.
>>
>> If this is the issue, I can turn it into a real patch. This is just to
>> confirm that the issue goes away with the below.
>
> Confirmed.  Impressive high speed bug stomping.

Well, that's good news. Can I get you to try this patch? Needs to be
split, but it'll be good to know if this fixes it too (since it's an
ACTUAL attempt at a fix, not just a masking).

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index ebc264c87a09..b0dbfd297d20 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -533,19 +533,20 @@ static void bfq_update_depths(struct bfq_data *bfqd, struct sbitmap_queue *bt)
  * Limit depths of async I/O and sync writes so as to counter both
  * problems.
  */
-static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
+static int bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
 {
 	struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
 	struct bfq_data *bfqd = data->q->elevator->elevator_data;
 	struct sbitmap_queue *bt;
+	int old_depth;
 
 	if (op_is_sync(op) && !op_is_write(op))
-		return;
+		return 0;
 
 	if (data->flags & BLK_MQ_REQ_RESERVED) {
 		if (unlikely(!tags->nr_reserved_tags)) {
 			WARN_ON_ONCE(1);
-			return;
+			return 0;
 		}
 		bt = &tags->breserved_tags;
 	} else
@@ -554,12 +555,18 @@ static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
 	if (unlikely(bfqd->sb_shift != bt->sb.shift))
 		bfq_update_depths(bfqd, bt);
 
+	old_depth = data->shallow_depth;
 	data->shallow_depth =
 		bfqd->word_depths[!!bfqd->wr_busy_queues][op_is_sync(op)];
 
 	bfq_log(bfqd, "[%s] wr_busy %d sync %d depth %u",
 			__func__, bfqd->wr_busy_queues, op_is_sync(op),
 			data->shallow_depth);
+
+	if (old_depth != data->shallow_depth)
+		return data->shallow_depth;
+
+	return 0;
 }
 
 static struct bfq_queue *
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 25c14c58385c..0c53a254671f 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -16,6 +16,32 @@
 #include "blk-mq-tag.h"
 #include "blk-wbt.h"
 
+void blk_mq_sched_limit_depth(struct elevator_queue *e,
+			      struct blk_mq_alloc_data *data, unsigned int op)
+{
+	struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
+	struct sbitmap_queue *bt;
+	int ret;
+
+	/*
+	 * Flush requests are special and go directly to the
+	 * dispatch list.
+	 */
+	if (op_is_flush(op) || !e->type->ops.mq.limit_depth)
+		return;
+
+	ret = e->type->ops.mq.limit_depth(op, data);
+	if (!ret)
+		return;
+
+	if (data->flags & BLK_MQ_REQ_RESERVED)
+		bt = &tags->breserved_tags;
+	else
+		bt = &tags->bitmap_tags;
+
+	sbitmap_queue_shallow_depth(bt, ret);
+}
+
 void blk_mq_sched_free_hctx_data(struct request_queue *q,
 				 void (*exit)(struct blk_mq_hw_ctx *))
 {
diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
index 1e9c9018ace1..6abebc1b9ae0 100644
--- a/block/blk-mq-sched.h
+++ b/block/blk-mq-sched.h
@@ -5,6 +5,9 @@
 #include "blk-mq.h"
 #include "blk-mq-tag.h"
 
+void blk_mq_sched_limit_depth(struct elevator_queue *e,
+			      struct blk_mq_alloc_data *data, unsigned int op);
+
 void blk_mq_sched_free_hctx_data(struct request_queue *q,
 				 void (*exit)(struct blk_mq_hw_ctx *));
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4e9d83594cca..1bb7aa40c192 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -357,13 +357,7 @@ static struct request *blk_mq_get_request(struct request_queue *q,
 
 	if (e) {
 		data->flags |= BLK_MQ_REQ_INTERNAL;
-
-		/*
-		 * Flush requests are special and go directly to the
-		 * dispatch list.
-		 */
-		if (!op_is_flush(op) && e->type->ops.mq.limit_depth)
-			e->type->ops.mq.limit_depth(op, data);
+		blk_mq_sched_limit_depth(e, data, op);
 	}
 
 	tag = blk_mq_get_tag(data);
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index 564967fafe5f..d2622386c115 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -433,17 +433,23 @@ static void rq_clear_domain_t
Re: bug in tag handling in blk-mq?
On Tue, 2018-05-08 at 14:37 -0600, Jens Axboe wrote:
>
> - sdd has nothing pending, yet has 6 active waitqueues.

sdd is where ccache storage lives, which should have been the only
activity on that drive, as I built source in sdb, and was doing nothing
else that utilizes sdd.

	-Mike
Re: bug in tag handling in blk-mq?
> On 09 May 2018, at 06:11, Mike Galbraith wrote:
> 
> On Tue, 2018-05-08 at 19:09 -0600, Jens Axboe wrote:
>>
>> Alright, I managed to reproduce it. What I think is happening is that
>> BFQ is limiting the inflight case to something less than the wake
>> batch for sbitmap, which can lead to stalls. I don't have time to test
>> this tonight, but perhaps you can give it a go when you are back at it.
>> If not, I'll try tomorrow morning.
>>
>> If this is the issue, I can turn it into a real patch. This is just to
>> confirm that the issue goes away with the below.
>
> Confirmed.  Impressive high speed bug stomping.

Great! It's a real relief that this ghost is gone.

Thank you both,
Paolo

>> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
>> index e6a9c06ec70c..94ced15b6428 100644
>> --- a/lib/sbitmap.c
>> +++ b/lib/sbitmap.c
>> @@ -272,6 +272,7 @@ EXPORT_SYMBOL_GPL(sbitmap_bitmap_show);
>>
>>  static unsigned int sbq_calc_wake_batch(unsigned int depth)
>>  {
>> +#if 0
>>  	unsigned int wake_batch;
>>
>>  	/*
>> @@ -284,6 +285,9 @@ static unsigned int sbq_calc_wake_batch(unsigned int depth)
>>  	wake_batch = max(1U, depth / SBQ_WAIT_QUEUES);
>>
>>  	return wake_batch;
>> +#else
>> +	return 1;
>> +#endif
>>  }
>>
>>  int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth,
>>
Re: bug in tag handling in blk-mq?
On Tue, 2018-05-08 at 19:09 -0600, Jens Axboe wrote:
>
> Alright, I managed to reproduce it. What I think is happening is that
> BFQ is limiting the inflight case to something less than the wake
> batch for sbitmap, which can lead to stalls. I don't have time to test
> this tonight, but perhaps you can give it a go when you are back at it.
> If not, I'll try tomorrow morning.
>
> If this is the issue, I can turn it into a real patch. This is just to
> confirm that the issue goes away with the below.

Confirmed.  Impressive high speed bug stomping.

> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> index e6a9c06ec70c..94ced15b6428 100644
> --- a/lib/sbitmap.c
> +++ b/lib/sbitmap.c
> @@ -272,6 +272,7 @@ EXPORT_SYMBOL_GPL(sbitmap_bitmap_show);
> 
>  static unsigned int sbq_calc_wake_batch(unsigned int depth)
>  {
> +#if 0
>  	unsigned int wake_batch;
> 
>  	/*
> @@ -284,6 +285,9 @@ static unsigned int sbq_calc_wake_batch(unsigned int depth)
>  	wake_batch = max(1U, depth / SBQ_WAIT_QUEUES);
> 
>  	return wake_batch;
> +#else
> +	return 1;
> +#endif
>  }
> 
>  int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth,
>
Re: bug in tag handling in blk-mq?
On 5/8/18 3:19 PM, Jens Axboe wrote:
> On 5/8/18 2:37 PM, Jens Axboe wrote:
>> On 5/8/18 10:42 AM, Mike Galbraith wrote:
>>> On Tue, 2018-05-08 at 08:55 -0600, Jens Axboe wrote:
>>>>
>>>> All the block debug files are empty...
>>>
>>> Sigh.  Take 2, this time cat debug files, having turned block tracing
>>> off before doing anything else (so trace bits in dmesg.txt should end
>>> AT the stall).
>>
>> OK, that's better. What I see from the traces:
>>
>> - You have regular IO and some non-fs IO (from scsi_execute()). This mix
>>   may be key.
>>
>> - sdd has nothing pending, yet has 6 active waitqueues.
>>
>> I'm going to see if I can reproduce this. Paolo, what kind of attempts
>> to reproduce this have you done?
>
> No luck so far. Out of the patches you referenced, I can only find the
> shallow depth change, since that's in the parent of this email. Can
> you send those as well?
>
> Perhaps also expand a bit on exactly what you are running. File system,
> mount options, etc.

Alright, I managed to reproduce it. What I think is happening is that
BFQ is limiting the inflight case to something less than the wake
batch for sbitmap, which can lead to stalls. I don't have time to test
this tonight, but perhaps you can give it a go when you are back at it.
If not, I'll try tomorrow morning.

If this is the issue, I can turn it into a real patch. This is just to
confirm that the issue goes away with the below.

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index e6a9c06ec70c..94ced15b6428 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -272,6 +272,7 @@ EXPORT_SYMBOL_GPL(sbitmap_bitmap_show);
 
 static unsigned int sbq_calc_wake_batch(unsigned int depth)
 {
+#if 0
 	unsigned int wake_batch;
 
 	/*
@@ -284,6 +285,9 @@ static unsigned int sbq_calc_wake_batch(unsigned int depth)
 	wake_batch = max(1U, depth / SBQ_WAIT_QUEUES);
 
 	return wake_batch;
+#else
+	return 1;
+#endif
 }
 
 int sbitmap_queue_init_node(struct sbitmap_queue *sbq, unsigned int depth,

-- 
Jens Axboe
Re: bug in tag handling in blk-mq?
On 5/8/18 2:37 PM, Jens Axboe wrote:
> On 5/8/18 10:42 AM, Mike Galbraith wrote:
>> On Tue, 2018-05-08 at 08:55 -0600, Jens Axboe wrote:
>>>
>>> All the block debug files are empty...
>>
>> Sigh.  Take 2, this time cat debug files, having turned block tracing
>> off before doing anything else (so trace bits in dmesg.txt should end
>> AT the stall).
>
> OK, that's better. What I see from the traces:
>
> - You have regular IO and some non-fs IO (from scsi_execute()). This mix
>   may be key.
>
> - sdd has nothing pending, yet has 6 active waitqueues.
>
> I'm going to see if I can reproduce this. Paolo, what kind of attempts
> to reproduce this have you done?

No luck so far. Out of the patches you referenced, I can only find the
shallow depth change, since that's in the parent of this email. Can
you send those as well?

Perhaps also expand a bit on exactly what you are running. File system,
mount options, etc.

-- 
Jens Axboe
Re: bug in tag handling in blk-mq?
On 5/8/18 10:42 AM, Mike Galbraith wrote:
> On Tue, 2018-05-08 at 08:55 -0600, Jens Axboe wrote:
>>
>> All the block debug files are empty...
>
> Sigh.  Take 2, this time cat debug files, having turned block tracing
> off before doing anything else (so trace bits in dmesg.txt should end
> AT the stall).

OK, that's better. What I see from the traces:

- You have regular IO and some non-fs IO (from scsi_execute()). This mix
  may be key.

- sdd has nothing pending, yet has 6 active waitqueues.

I'm going to see if I can reproduce this. Paolo, what kind of attempts
to reproduce this have you done?

-- 
Jens Axboe
Re: bug in tag handling in blk-mq?
On Tue, 2018-05-08 at 08:55 -0600, Jens Axboe wrote:
>
> All the block debug files are empty...

Sigh.  Take 2, this time cat debug files, having turned block tracing
off before doing anything else (so trace bits in dmesg.txt should end
AT the stall).

	-Mike

[Attachments: dmesg.xz, dmesg.txt.xz, block_debug.xz]
Re: bug in tag handling in blk-mq?
On 5/8/18 2:37 AM, Mike Galbraith wrote:
> On Tue, 2018-05-08 at 06:51 +0200, Mike Galbraith wrote:
>>
>> I'm deadlined ATM, but will get to it.
>
> (Bah, even a zombie can type ccache -C; make -j8 and stare...)
>
> kbuild again hung on the first go (yay), and post hang data written to
> sdd1 survived (kernel source lives in sdb3).  Full ftrace buffer (echo
> 1 > events/block/enable) available off list if desired.  dmesg.txt.xz
> is dmesg from post hang crashdump, attached because it contains the
> tail of trace buffer, so _might_ be useful.
>
> homer:~ # df|grep sd
> /dev/sdb3      959074776 785342824 172741072  82% /
> /dev/sdc3      959074776 455464912 502618984  48% /backup
> /dev/sdb1         159564      7980    151584   6% /boot/efi
> /dev/sdd1      961301832 393334868 519112540  44% /abuild
>
> Kernel is virgin modulo these...
>
> patches/remove_irritating_plus.diff
> patches/add-scm-version-to-EXTRAVERSION.patch
> patches/block-bfq:-postpone-rq-preparation-to-insert-or-merge.patch
> patches/block-bfq:-test.patch (hang provocation hack from Paolo)

All the block debug files are empty...

-- 
Jens Axboe
Re: bug in tag handling in blk-mq?
On Tue, 2018-05-08 at 06:51 +0200, Mike Galbraith wrote:
>
> I'm deadlined ATM, but will get to it.

(Bah, even a zombie can type ccache -C; make -j8 and stare...)

kbuild again hung on the first go (yay), and post hang data written to
sdd1 survived (kernel source lives in sdb3).  Full ftrace buffer (echo
1 > events/block/enable) available off list if desired.  dmesg.txt.xz
is dmesg from post hang crashdump, attached because it contains the
tail of trace buffer, so _might_ be useful.

homer:~ # df|grep sd
/dev/sdb3      959074776 785342824 172741072  82% /
/dev/sdc3      959074776 455464912 502618984  48% /backup
/dev/sdb1         159564      7980    151584   6% /boot/efi
/dev/sdd1      961301832 393334868 519112540  44% /abuild

Kernel is virgin modulo these...

patches/remove_irritating_plus.diff
patches/add-scm-version-to-EXTRAVERSION.patch
patches/block-bfq:-postpone-rq-preparation-to-insert-or-merge.patch
patches/block-bfq:-test.patch (hang provocation hack from Paolo)

	-Mike

[Attachments: block_debug.tar.xz, dmesg.xz, dmesg.txt.xz]
Re: bug in tag handling in blk-mq?
On Mon, 2018-05-07 at 20:02 +0200, Paolo Valente wrote:
> > >
> > > Is there a reproducer?

Just building fat config kernels works for me.  It was highly
non-deterministic, but reproduced quickly twice in a row with Paolo's
hack.

> Ok Mike, I guess it's your turn now, for at least a stack trace.

Sure.  I'm deadlined ATM, but will get to it.

	-Mike
Re: bug in tag handling in blk-mq?
> On 07 May 2018, at 18:39, Jens Axboe wrote:
> 
> On 5/7/18 8:03 AM, Paolo Valente wrote:
>> Hi Jens, Christoph, all,
>> Mike Galbraith has been experiencing hangs, on blk_mq_get_tag, only
>> with bfq [1].  Symptoms seem to clearly point to a problem in I/O-tag
>> handling, triggered by bfq because it limits the number of tags for
>> async and sync write requests (in bfq_limit_depth).
>>
>> Fortunately, I just happened to find a way to apparently confirm it.
>> With the following one-liner for block/bfq-iosched.c:
>>
>> @@ -554,8 +554,7 @@ static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
>>  	if (unlikely(bfqd->sb_shift != bt->sb.shift))
>>  		bfq_update_depths(bfqd, bt);
>>
>> -	data->shallow_depth =
>> -		bfqd->word_depths[!!bfqd->wr_busy_queues][op_is_sync(op)];
>> +	data->shallow_depth = 1;
>>
>>  	bfq_log(bfqd, "[%s] wr_busy %d sync %d depth %u",
>>  		__func__, bfqd->wr_busy_queues, op_is_sync(op),
>>
>> Mike's machine now crashes soon and systematically, while nothing bad
>> happens on my machines, even with heavy workloads (apart from an
>> expected throughput drop).
>>
>> This change simply reduces to 1 the maximum possible value for the sum
>> of the number of async requests and of sync write requests.
>>
>> This email is basically a request for help to knowledgeable people.  To
>> start, here are my first doubts/questions:
>> 1) Just to be certain, I guess it is not normal that blk-mq hangs if
>> async requests and sync write requests can be at most one, right?
>> 2) Do you have any hint to where I could look for, to chase this bug?
>> Of course, the bug may be in bfq, i.e., it may be a somehow unrelated
>> bfq bug that causes this hang in blk-mq, indirectly.  But it is hard
>> for me to understand how.
> 
> CC Omar, since he implemented the shallow part. But we'll need some
> traces to show where we are hung, probably also the contents of the
> /sys/kernel/debug/block/<dev>/ directory. For the crash mentioned, a
> trace as well.  Otherwise we'll be wasting a lot of time on this.
> 
> Is there a reproducer?
> 

Ok Mike, I guess it's your turn now, for at least a stack trace.

Thanks,
Paolo

> -- 
> Jens Axboe
Re: bug in tag handling in blk-mq?
On 5/7/18 8:03 AM, Paolo Valente wrote:
> Hi Jens, Christoph, all,
> Mike Galbraith has been experiencing hangs, on blk_mq_get_tag, only
> with bfq [1].  Symptoms seem to clearly point to a problem in I/O-tag
> handling, triggered by bfq because it limits the number of tags for
> async and sync write requests (in bfq_limit_depth).
>
> Fortunately, I just happened to find a way to apparently confirm it.
> With the following one-liner for block/bfq-iosched.c:
>
> @@ -554,8 +554,7 @@ static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
>  	if (unlikely(bfqd->sb_shift != bt->sb.shift))
>  		bfq_update_depths(bfqd, bt);
>
> -	data->shallow_depth =
> -		bfqd->word_depths[!!bfqd->wr_busy_queues][op_is_sync(op)];
> +	data->shallow_depth = 1;
>
>  	bfq_log(bfqd, "[%s] wr_busy %d sync %d depth %u",
>  		__func__, bfqd->wr_busy_queues, op_is_sync(op),
>
> Mike's machine now crashes soon and systematically, while nothing bad
> happens on my machines, even with heavy workloads (apart from an
> expected throughput drop).
>
> This change simply reduces to 1 the maximum possible value for the sum
> of the number of async requests and of sync write requests.
>
> This email is basically a request for help to knowledgeable people.  To
> start, here are my first doubts/questions:
> 1) Just to be certain, I guess it is not normal that blk-mq hangs if
> async requests and sync write requests can be at most one, right?
> 2) Do you have any hint to where I could look for, to chase this bug?
> Of course, the bug may be in bfq, i.e., it may be a somehow unrelated
> bfq bug that causes this hang in blk-mq, indirectly.  But it is hard
> for me to understand how.

CC Omar, since he implemented the shallow part. But we'll need some
traces to show where we are hung, probably also the contents of the
/sys/kernel/debug/block/<dev>/ directory. For the crash mentioned, a
trace as well.  Otherwise we'll be wasting a lot of time on this.

Is there a reproducer?

-- 
Jens Axboe
bug in tag handling in blk-mq?
Hi Jens, Christoph, all,

Mike Galbraith has been experiencing hangs, on blk_mq_get_tag, only
with bfq [1].  Symptoms seem to clearly point to a problem in I/O-tag
handling, triggered by bfq because it limits the number of tags for
async and sync write requests (in bfq_limit_depth).

Fortunately, I just happened to find a way to apparently confirm it.
With the following one-liner for block/bfq-iosched.c:

@@ -554,8 +554,7 @@ static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
 	if (unlikely(bfqd->sb_shift != bt->sb.shift))
 		bfq_update_depths(bfqd, bt);
 
-	data->shallow_depth =
-		bfqd->word_depths[!!bfqd->wr_busy_queues][op_is_sync(op)];
+	data->shallow_depth = 1;
 
 	bfq_log(bfqd, "[%s] wr_busy %d sync %d depth %u",
 			__func__, bfqd->wr_busy_queues, op_is_sync(op),

Mike's machine now crashes soon and systematically, while nothing bad
happens on my machines, even with heavy workloads (apart from an
expected throughput drop).

This change simply reduces to 1 the maximum possible value for the sum
of the number of async requests and of sync write requests.

This email is basically a request for help to knowledgeable people.  To
start, here are my first doubts/questions:

1) Just to be certain, I guess it is not normal that blk-mq hangs if
async requests and sync write requests can be at most one, right?

2) Do you have any hint to where I could look for, to chase this bug?
Of course, the bug may be in bfq, i.e., it may be a somehow unrelated
bfq bug that causes this hang in blk-mq, indirectly.  But it is hard
for me to understand how.

Looking forward to some help.

Thanks,
Paolo

[1] https://www.spinics.net/lists/stable/msg215036.html