On 11/07/2017 09:29 AM, Jens Axboe wrote:
> On 11/07/2017 09:20 AM, Bart Van Assche wrote:
>> On Tue, 2017-11-07 at 10:11 +0800, Ming Lei wrote:
>>> If you can reproduce, please provide me at least the following log
>>> first:
>>>
>>> find /sys/kernel/debug/block -name tags | xargs cat | grep busy
>>>
>>> If any pending requests arn't completed, please provide the related
>>> info in dbgfs about where is the request.
>>
>> Every time I ran the above or a similar command its output was empty. I
>> assume that's because the hang usually occurs in a phase where these debugfs
>> attributes either have not yet been created or have already disappeared.
>
> Bart, do you still see a hang with the patch that fixes the tag leak when
> we fail to get a dispatch budget?
>
> https://marc.info/?l=linux-block&m=151004881411480&w=2
>
> I've run a lot of stability testing here, and I haven't run into any
> issues. This is with shared tags as well. So if you still see the failure
> case with the current tree AND the above patch, then I'll try and get
> a test case setup that hits it too so we can get to the bottom of this.
Ming, I managed to reproduce the hang using null_blk. Note this is
WITHOUT the patch mentioned above, running with that now.
# modprobe null_blk queue_mode=2 nr_devices=4 shared_tags=1 submit_queues=1
hw_queue_depth=1
and using this fio file:
[global]
bs=4k
rw=randread
norandommap
direct=1
ioengine=libaio
iodepth=4
[nullb0]
filename=/dev/nullb0
[nullb1]
filename=/dev/nullb1
[nullb2]
filename=/dev/nullb2
[nullb3]
filename=/dev/nullb3
it seemed to keep running, but it hung when exiting. The troublesome
device was nullb1:
[ 492.513374] INFO: task fio:3263 blocked for more than 120 seconds.
[ 492.520782] Not tainted 4.14.0-rc7+ #499
[ 492.526247] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[ 492.535904] fio D13208 3263 3211 0x00000000
[ 492.542535] Call Trace:
[ 492.545764] __schedule+0x279/0x720
[ 492.550151] schedule+0x33/0x90
[ 492.554145] io_schedule+0x16/0x40
[ 492.558435] blk_mq_get_tag+0x148/0x250
[ 492.563211] ? finish_wait+0x90/0x90
[ 492.567693] blk_mq_get_request+0xf0/0x3e0
[ 492.572760] blk_mq_make_request+0xe2/0x690
[ 492.577913] generic_make_request+0xfc/0x2f0
[ 492.583177] submit_bio+0x64/0x120
[ 492.587475] ? set_page_dirty_lock+0x4b/0x60
[ 492.592736] ? submit_bio+0x64/0x120
[ 492.597309] ? bio_set_pages_dirty+0x55/0x60
[ 492.602570] blkdev_direct_IO+0x388/0x3c0
[ 492.607546] ? free_ioctx_users+0xe0/0xe0
[ 492.612511] ? blk_mq_dispatch_rq_list+0x238/0x3a0
[ 492.618353] ? _raw_spin_unlock+0xe/0x20
[ 492.623227] generic_file_read_iter+0xb3/0xa00
[ 492.628682] ? generic_file_read_iter+0xb3/0xa00
[ 492.634334] ? security_file_permission+0x9b/0xc0
[ 492.640114] blkdev_read_iter+0x35/0x40
[ 492.644877] aio_read+0xc5/0x120
[ 492.648973] ? aio_read_events+0x24c/0x340
[ 492.654124] ? __might_sleep+0x4a/0x80
[ 492.658800] do_io_submit+0x47c/0x5e0
[ 492.663373] ? do_io_submit+0x47c/0x5e0
[ 492.668234] SyS_io_submit+0x10/0x20
[ 492.672715] ? SyS_io_submit+0x10/0x20
[ 492.677394] entry_SYSCALL_64_fastpath+0x13/0x94
[ 492.683039] RIP: 0033:0x7f83d1871717
[ 492.687521] RSP: 002b:00007ffd38fe5a88 EFLAGS: 00000202 ORIG_RAX:
00000000000000d1
[ 492.696969] RAX: ffffffffffffffda RBX: 00007f83b6972b50 RCX: 00007f83d1871717
[ 492.705423] RDX: 0000000001f41418 RSI: 0000000000000001 RDI: 00007f83e4d36000
[ 492.713889] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000001f3f2e0
[ 492.722352] R10: 0000000000001000 R11: 0000000000000202 R12: 00007ffd38fe5be0
[ 492.730827] R13: 00007f83b6972b01 R14: 00007f83b69824b8 R15: 00007f83b6982368
and if we look at the debug entries, it's waiting on a scheduler tag:
sched_tags:nr_tags=2
sched_tags:nr_reserved_tags=0
sched_tags:active_queues=0
sched_tags:bitmap_tags:
sched_tags:depth=2
sched_tags:busy=2
sched_tags:bits_per_word=64
sched_tags:map_nr=1
sched_tags:alloc_hint={0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0,
1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0,
0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0}
sched_tags:wake_batch=1
sched_tags:wake_index=4
sched_tags:ws={
sched_tags: {.wait_cnt=-102, .wait=inactive},
sched_tags: {.wait_cnt=-126, .wait=inactive},
sched_tags: {.wait_cnt=-72, .wait=inactive},
sched_tags: {.wait_cnt=-96, .wait=inactive},
sched_tags: {.wait_cnt=-134, .wait=inactive},
sched_tags: {.wait_cnt=-116, .wait=inactive},
sched_tags: {.wait_cnt=-90, .wait=inactive},
sched_tags: {.wait_cnt=-115, .wait=active},
sched_tags:}
sched_tags:round_robin=0
sched_tags_bitmap:00000000: 03
with SCHED_RESTART being set:
state:SCHED_RESTART
and with the driver tags being idle:
tags:nr_tags=1
tags:nr_reserved_tags=0
tags:active_queues=0
tags:bitmap_tags:
tags:depth=1
tags:busy=0
tags:bits_per_word=64
tags:map_nr=1
tags:alloc_hint={0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
tags:wake_batch=1
tags:wake_index=3
tags:ws={
tags: {.wait_cnt=-48, .wait=inactive},
tags: {.wait_cnt=-39, .wait=inactive},
tags: {.wait_cnt=-50, .wait=inactive},
tags: {.wait_cnt=-47, .wait=inactive},
tags: {.wait_cnt=-25, .wait=inactive},
tags: {.wait_cnt=-24, .wait=inactive},
tags: {.wait_cnt=-47, .wait=inactive},
tags: {.wait_cnt=-47, .wait=inactive},
tags:}
Jens Axboe