Re: [PATCH] blk-mq: don't complete un-started request in timeout handler

2017-03-23 Thread Ming Lei
On Wed, Mar 22, 2017 at 11:58:17AM -0400, Keith Busch wrote:
> On Tue, Mar 21, 2017 at 11:03:59PM -0400, Jens Axboe wrote:
> > On 03/21/2017 10:14 PM, Ming Lei wrote:
> > > When iterating busy requests in timeout handler,
> > > if the STARTED flag of one request isn't set, that means
> > > the request is being processed in block layer or driver, and
> > > isn't submitted to hardware yet.
> > > 
> > > In current implementation of blk_mq_check_expired(),
> > > if the request queue becomes dying, un-started requests are
> > > handled as being completed/freed immediately. This is
> > > wrong, and can cause rq corruption or double allocation[1][2]
> > > when doing I/O and removing an NVMe device at the same time.
> > 
> > I agree, completing it looks bogus. If the request is in a scheduler or
> > on a software queue, this won't end well at all. Looks like it was
> > introduced by this patch:
> > 
> > commit eb130dbfc40eabcd4e10797310bda6b9f6dd7e76
> > Author: Keith Busch 
> > Date:   Thu Jan 8 08:59:53 2015 -0700
> > 
> > blk-mq: End unstarted requests on a dying queue
> > 
> > Before that, we just ignored it. Keith?
> 
> The above was intended for a stopped hctx on a dying queue such that
> there's nothing in flight to the driver. Nvme had been relying on this
> to end unstarted requests so we may progress when a controller dies.

So the brokenness was there right from the beginning.

> 
> We've since obviated the need: we restart the hw queues to flush entered
> requests to failure, so we don't need that brokenness.

Looks like the following commit needs to be backported too if we backport this patch.

commit 69d9a99c258eb1d6478fd9608a2070890797eed7
Author: Keith Busch 
Date:   Wed Feb 24 09:15:56 2016 -0700

NVMe: Move error handling to failed reset handler
 

Thanks,
Ming


Re: [PATCH] blk-mq: don't complete un-started request in timeout handler

2017-03-22 Thread Jens Axboe
On 03/22/2017 11:58 AM, Keith Busch wrote:
> On Tue, Mar 21, 2017 at 11:03:59PM -0400, Jens Axboe wrote:
>> On 03/21/2017 10:14 PM, Ming Lei wrote:
>>> When iterating busy requests in timeout handler,
>>> if the STARTED flag of one request isn't set, that means
>>> the request is being processed in block layer or driver, and
>>> isn't submitted to hardware yet.
>>>
>>> In current implementation of blk_mq_check_expired(),
>>> if the request queue becomes dying, un-started requests are
>>> handled as being completed/freed immediately. This is
>>> wrong, and can cause rq corruption or double allocation[1][2]
>>> when doing I/O and removing an NVMe device at the same time.
>>
>> I agree, completing it looks bogus. If the request is in a scheduler or
>> on a software queue, this won't end well at all. Looks like it was
>> introduced by this patch:
>>
>> commit eb130dbfc40eabcd4e10797310bda6b9f6dd7e76
>> Author: Keith Busch 
>> Date:   Thu Jan 8 08:59:53 2015 -0700
>>
>> blk-mq: End unstarted requests on a dying queue
>>
>> Before that, we just ignored it. Keith?
> 
> The above was intended for a stopped hctx on a dying queue such that
> there's nothing in flight to the driver. Nvme had been relying on this
> to end unstarted requests so we may progress when a controller dies.
> 
> We've since obviated the need: we restart the hw queues to flush entered
> requests to failure, so we don't need that brokenness.

Good, thanks for confirming, Keith. I queued up the patch for 4.11 this
morning.

-- 
Jens Axboe



Re: [PATCH] blk-mq: don't complete un-started request in timeout handler

2017-03-22 Thread Keith Busch
On Tue, Mar 21, 2017 at 11:03:59PM -0400, Jens Axboe wrote:
> On 03/21/2017 10:14 PM, Ming Lei wrote:
> > When iterating busy requests in timeout handler,
> > if the STARTED flag of one request isn't set, that means
> > the request is being processed in block layer or driver, and
> > isn't submitted to hardware yet.
> > 
> > In current implementation of blk_mq_check_expired(),
> > if the request queue becomes dying, un-started requests are
> > handled as being completed/freed immediately. This is
> > wrong, and can cause rq corruption or double allocation[1][2]
> > when doing I/O and removing an NVMe device at the same time.
> 
> I agree, completing it looks bogus. If the request is in a scheduler or
> on a software queue, this won't end well at all. Looks like it was
> introduced by this patch:
> 
> commit eb130dbfc40eabcd4e10797310bda6b9f6dd7e76
> Author: Keith Busch 
> Date:   Thu Jan 8 08:59:53 2015 -0700
> 
> blk-mq: End unstarted requests on a dying queue
> 
> Before that, we just ignored it. Keith?

The above was intended for a stopped hctx on a dying queue such that
there's nothing in flight to the driver. Nvme had been relying on this
to end unstarted requests so we may progress when a controller dies.

We've since obviated the need: we restart the hw queues to flush entered
requests to failure, so we don't need that brokenness.
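
The idea of flushing entered requests to failure by restarting the hw queues can be modelled in a few lines of userspace C. This is only a toy sketch, not the NVMe driver's code: `toy_rq`, `toy_queue`, `toy_queue_rq()`, `toy_run_hw_queue()` and `toy_kill_queue()` are hypothetical names, and the fixed-size pending array stands in for a software queue. The point it tries to show is that each pending request is failed exactly once, by the dispatch path that owns it, rather than by the timeout handler.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified request: not the kernel's struct request. */
struct toy_rq {
	bool started;    /* handed to "hardware" */
	bool completed;  /* ended, successfully or not */
	int error;       /* 0 or a negative errno */
};

/* Hypothetical queue with a small list of not-yet-dispatched requests. */
struct toy_queue {
	bool dying;
	bool stopped;
	struct toy_rq *pending[8];
	size_t nr_pending;
};

/* Models a driver dispatch hook: fail fast once the queue is dying. */
static void toy_queue_rq(struct toy_queue *q, struct toy_rq *rq)
{
	assert(!rq->completed); /* a request must be ended only once */

	if (q->dying) {
		rq->completed = true;
		rq->error = -EIO;
		return;
	}
	rq->started = true; /* normal case: now owned by "hardware" */
}

/* Dispatch everything that is pending, unless the queue is stopped. */
static void toy_run_hw_queue(struct toy_queue *q)
{
	if (q->stopped)
		return;
	for (size_t i = 0; i < q->nr_pending; i++)
		toy_queue_rq(q, q->pending[i]);
	q->nr_pending = 0;
}

/* Controller died: mark the queue dying, then restart it to flush. */
static void toy_kill_queue(struct toy_queue *q)
{
	q->dying = true;
	q->stopped = false;
	toy_run_hw_queue(q);
}
```

With a stopped queue holding two pending requests, calling `toy_kill_queue()` ends both with `-EIO` and leaves nothing pending, so the timeout handler never has to touch an un-started request.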


Re: [PATCH] blk-mq: don't complete un-started request in timeout handler

2017-03-21 Thread Jens Axboe
On 03/21/2017 10:14 PM, Ming Lei wrote:
> When iterating busy requests in timeout handler,
> if the STARTED flag of one request isn't set, that means
> the request is being processed in block layer or driver, and
> isn't submitted to hardware yet.
> 
> In current implementation of blk_mq_check_expired(),
> if the request queue becomes dying, un-started requests are
> handled as being completed/freed immediately. This is
> wrong, and can cause rq corruption or double allocation[1][2]
> when doing I/O and removing an NVMe device at the same time.

I agree, completing it looks bogus. If the request is in a scheduler or
on a software queue, this won't end well at all. Looks like it was
introduced by this patch:

commit eb130dbfc40eabcd4e10797310bda6b9f6dd7e76
Author: Keith Busch 
Date:   Thu Jan 8 08:59:53 2015 -0700

blk-mq: End unstarted requests on a dying queue

Before that, we just ignored it. Keith?

-- 
Jens Axboe



[PATCH] blk-mq: don't complete un-started request in timeout handler

2017-03-21 Thread Ming Lei
When iterating busy requests in timeout handler,
if the STARTED flag of one request isn't set, that means
the request is being processed in block layer or driver, and
isn't submitted to hardware yet.

In current implementation of blk_mq_check_expired(),
if the request queue becomes dying, un-started requests are
handled as being completed/freed immediately. This is
wrong, and can cause rq corruption or double allocation[1][2]
when doing I/O and removing an NVMe device at the same time.

This patch fixes several issues reported by Yi Zhang.
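
The intended behaviour described above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the actual blk-mq code: `toy_request` and `toy_check_expired()` are hypothetical names, `started` stands in for the STARTED flag, and time is counted in arbitrary ticks. It shows the per-request check skipping an un-started request unconditionally, even when the queue is dying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a block request. */
struct toy_request {
	bool started;           /* models the STARTED flag */
	unsigned long deadline; /* expiry time in arbitrary ticks */
	bool timed_out;         /* set once the timeout path has acted on it */
};

/*
 * Models the per-request check run by the timeout handler.
 * An un-started request is still owned by the block layer or driver,
 * so it is left alone regardless of the queue state.
 * Returns true if the request was handled as timed out.
 */
static bool toy_check_expired(struct toy_request *rq, unsigned long now,
			      bool queue_dying)
{
	assert(rq != NULL);

	/* Deliberately unused: a dying queue no longer changes the outcome. */
	(void)queue_dying;

	if (!rq->started)
		return false;

	if (now >= rq->deadline) {
		rq->timed_out = true;
		return true;
	}
	return false;
}
```

In this model, an un-started request with a past deadline on a dying queue is not touched, while a started request past its deadline is marked as timed out.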

[1]. oops log 1
[  581.789754] ------------[ cut here ]------------
[  581.789758] kernel BUG at block/blk-mq.c:374!
[  581.789760] invalid opcode:  [#1] SMP
[  581.789761] Modules linked in: vfat fat ipmi_ssif intel_rapl sb_edac
edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm nvme
irqbypass crct10dif_pclmul nvme_core crc32_pclmul ghash_clmulni_intel
intel_cstate ipmi_si mei_me ipmi_devintf intel_uncore sg ipmi_msghandler
intel_rapl_perf iTCO_wdt mei iTCO_vendor_support mxm_wmi lpc_ich dcdbas shpchp
pcspkr acpi_power_meter wmi nfsd auth_rpcgss nfs_acl lockd dm_multipath grace
sunrpc ip_tables xfs libcrc32c sd_mod mgag200 i2c_algo_bit drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ahci libahci
crc32c_intel tg3 libata megaraid_sas i2c_core ptp fjes pps_core dm_mirror
dm_region_hash dm_log dm_mod
[  581.789796] CPU: 1 PID: 1617 Comm: kworker/1:1H Not tainted 
4.10.0.bz1420297+ #4
[  581.789797] Hardware name: Dell Inc. PowerEdge R730xd/072T6D, BIOS 2.2.5 
09/06/2016
[  581.789804] Workqueue: kblockd blk_mq_timeout_work
[  581.789806] task: 8804721c8000 task.stack: c90006ee4000
[  581.789809] RIP: 0010:blk_mq_end_request+0x58/0x70
[  581.789810] RSP: 0018:c90006ee7d50 EFLAGS: 00010202
[  581.789811] RAX: 0001 RBX: 8802e4195340 RCX: 88028e2f4b88
[  581.789812] RDX: 1000 RSI: 1000 RDI: 
[  581.789813] RBP: c90006ee7d60 R08: 0003 R09: 88028e2f4b00
[  581.789814] R10: 1000 R11: 0001 R12: fffb
[  581.789815] R13: 88042abe5780 R14: 002d R15: 88046fbdff80
[  581.789817] FS:  () GS:88047fc0() 
knlGS:
[  581.789818] CS:  0010 DS:  ES:  CR0: 80050033
[  581.789819] CR2: 7f64f403a008 CR3: 00014d078000 CR4: 001406e0
[  581.789820] Call Trace:
[  581.789825]  blk_mq_check_expired+0x76/0x80
[  581.789828]  bt_iter+0x45/0x50
[  581.789830]  blk_mq_queue_tag_busy_iter+0xdd/0x1f0
[  581.789832]  ? blk_mq_rq_timed_out+0x70/0x70
[  581.789833]  ? blk_mq_rq_timed_out+0x70/0x70
[  581.789840]  ? __switch_to+0x140/0x450
[  581.789841]  blk_mq_timeout_work+0x88/0x170
[  581.789845]  process_one_work+0x165/0x410
[  581.789847]  worker_thread+0x137/0x4c0
[  581.789851]  kthread+0x101/0x140
[  581.789853]  ? rescuer_thread+0x3b0/0x3b0
[  581.789855]  ? kthread_park+0x90/0x90
[  581.789860]  ret_from_fork+0x2c/0x40
[  581.789861] Code: 48 85 c0 74 0d 44 89 e6 48 89 df ff d0 5b 41 5c 5d c3 48
8b bb 70 01 00 00 48 85 ff 75 0f 48 89 df e8 7d f0 ff ff 5b 41 5c 5d c3 <0f>
0b e8 71 f0 ff ff 90 eb e9 0f 1f 40 00 66 2e 0f 1f 84 00 00
[  581.789882] RIP: blk_mq_end_request+0x58/0x70 RSP: c90006ee7d50
[  581.789889] ---[ end trace bcaf03d9a14a0a70 ]---

[2]. oops log2
[ 6984.857362] BUG: unable to handle kernel NULL pointer dereference at 
0010
[ 6984.857372] IP: nvme_queue_rq+0x6e6/0x8cd [nvme]
[ 6984.857373] PGD 0
[ 6984.857374]
[ 6984.857376] Oops:  [#1] SMP
[ 6984.857379] Modules linked in: ipmi_ssif vfat fat intel_rapl sb_edac
edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm
irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ipmi_si iTCO_wdt
iTCO_vendor_support mxm_wmi ipmi_devintf intel_cstate sg dcdbas intel_uncore
mei_me intel_rapl_perf mei pcspkr lpc_ich ipmi_msghandler shpchp
acpi_power_meter wmi nfsd auth_rpcgss dm_multipath nfs_acl lockd grace sunrpc
ip_tables xfs libcrc32c sd_mod mgag200 i2c_algo_bit drm_kms_helper syscopyarea
sysfillrect crc32c_intel sysimgblt fb_sys_fops ttm nvme drm nvme_core ahci
libahci i2c_core tg3 libata ptp megaraid_sas pps_core fjes dm_mirror
dm_region_hash dm_log dm_mod
[ 6984.857416] CPU: 7 PID: 1635 Comm: kworker/7:1H Not tainted
4.10.0-2.el7.bz1420297.x86_64 #1
[ 6984.857417] Hardware name: Dell Inc. PowerEdge R730xd/072T6D, BIOS 2.2.5 
09/06/2016
[ 6984.857427] Workqueue: kblockd blk_mq_run_work_fn
[ 6984.857429] task: 880476e3da00 task.stack: c90002e9
[ 6984.857432] RIP: 0010:nvme_queue_rq+0x6e6/0x8cd [nvme]
[ 6984.857433] RSP: 0018:c90002e93c50 EFLAGS: 00010246
[ 6984.857434] RAX:  RBX: 880275646600 RCX: 1000
[ 6984.857435] RDX: 0fff RSI: 0002fba2a000 RDI: 8804734e6950
[ 6984.857436] RBP: c90002e93d30 R08: 2000 R09: 1000
[ 6984.857437] R10: 
