OK, I think we can get it for fabrics too, so we need to figure out how
to handle it there as well.

Do you have a reproducer?

To repro, I have to run a buffered writer workload then put the system into S3.

This fio job seems to reproduce for me:

  fio --name=global --filename=/dev/nvme0n1 --bsrange=4k-128k --rw=randwrite \
      --ioengine=libaio --iodepth=8 --numjobs=8 --name=foobar

I use rtcwake to test suspend/resume:

  rtcwake -m mem -s 10

Without the patch we'll get stuck after "Disabling non-boot CPUs ..."
when blk-mq waits to freeze some entered queues after nvme was disabled.
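
For convenience, the two steps above could be combined into one script
along these lines. This is only a rough sketch, not something from the
thread: the DEV, DELAY and DRY_RUN knobs and the 5 second delay before
suspending are my assumptions, and it defaults to a dry run that only
prints the commands, because the fio job overwrites the target device.

```shell
#!/bin/sh
# Sketch of the reproducer: start the random-write fio job in the
# background, then suspend to RAM with rtcwake while I/O is in flight.
# DEV, DELAY and DRY_RUN are hypothetical knobs, not from the thread.
DEV="${DEV:-/dev/nvme0n1}"   # WARNING: fio overwrites this device
DELAY="${DELAY:-5}"          # assumed time to let writes build up
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands

# Print the fio command line used in the thread.
fio_cmd() {
    echo "fio --name=global --filename=$DEV --bsrange=4k-128k --rw=randwrite --ioengine=libaio --iodepth=8 --numjobs=8 --name=foobar"
}

# Print the suspend command used in the thread (wake up after 10 seconds).
suspend_cmd() {
    echo "rtcwake -m mem -s 10"
}

repro() {
    if [ "$DRY_RUN" = "1" ]; then
        fio_cmd
        suspend_cmd
        return 0
    fi
    $(fio_cmd) &                 # unquoted on purpose: split into words
    fio_pid=$!
    sleep "$DELAY"
    $(suspend_cmd)               # hangs here on an affected kernel
    kill "$fio_pid" 2>/dev/null
    wait "$fio_pid" 2>/dev/null
}

repro
```

Run it as root with DRY_RUN=0 to actually execute the job and suspend,
and only against a device whose contents can be destroyed.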

I'm observing the same thing when hibernating during mdraid resync on
nvme - it hangs in blk_mq_freeze_queue_wait() after "Disabling non-boot
CPUs ...". This patch did not help but when I put nvme_wait_freeze()
right after nvme_start_freeze() it appeared to be working.

Interesting. Did the nvme device succeed in shutting down at all?

Maybe the difference here is that requests are submitted from a
non-freezable kernel thread (md sync_thread)?

Don't think it's related...
