Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-10 Thread jianchao.wang


On 02/10/2018 10:32 AM, jianchao.wang wrote:
> Hi Keith
> 
> Thanks for your kindly response here.
> That's really appreciated.
> 
> On 02/10/2018 01:12 AM, Keith Busch wrote:
>> On Fri, Feb 09, 2018 at 09:50:58AM +0800, jianchao.wang wrote:
>>>
>>> if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING 
>>> case,
>>> nvme_reset_work will hang forever, because no one could complete the 
>>> entered requests.
>>
>> Except it's no longer in the "RESETTING" case since you added the
>> "CONNECTING" state, so that's already broken for other reasons...
>>
> 
> Yes, but as your patch, we have to fail the IOs and even kill the controller.
> In fact, up to nvme_wait_freeze in nvme_reset_work, the RECONNECTING state 
> has been completed.
> We even could say it is in LIVE state. Maybe we should recover the controller 
> again instead
> of fail the IOs and kill the controller.
> 
> On the other hand, can you share with me why we cannot use 
> blk_set_preempt_only to replace
> blk_freeze_queue ? we just want to gate the new bios out of 
> generic_make_request and we 
> needn't use the preempt requests.

Looks like blk_freeze_queue in blk_mq_update_nr_hw_queues cannot be worked 
around.
Still face IOs in nvme_reset_work. T_T

> 
> Looking forward your advice and directive.
> 
> Thanks
> Jianchao
> 
> 
>> ___
>> Linux-nvme mailing list
>> linux-n...@lists.infradead.org
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.infradead.org_mailman_listinfo_linux-2Dnvme=DwICAg=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ=UqKQMB3A2ppfm2sN7PyisX0xTtXKsHlTBwjsS18qVx8=A2VMSm9IjQQXxM7foB6VUiRHLs-nIREF2_kMstwxlgw=
>>
> 
> ___
> Linux-nvme mailing list
> linux-n...@lists.infradead.org
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.infradead.org_mailman_listinfo_linux-2Dnvme=DwICAg=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ=cstI6JeNHGVX4OV1UdHdoemUr75aCUAjUrPe23Dhv8U=MYTmBPYS5tW4vC23iMEKINprtCxnRHe5AXrbST91XpY=
> 


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-10 Thread jianchao.wang


On 02/10/2018 10:32 AM, jianchao.wang wrote:
> Hi Keith
> 
> Thanks for your kindly response here.
> That's really appreciated.
> 
> On 02/10/2018 01:12 AM, Keith Busch wrote:
>> On Fri, Feb 09, 2018 at 09:50:58AM +0800, jianchao.wang wrote:
>>>
>>> if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING 
>>> case,
>>> nvme_reset_work will hang forever, because no one could complete the 
>>> entered requests.
>>
>> Except it's no longer in the "RESETTING" case since you added the
>> "CONNECTING" state, so that's already broken for other reasons...
>>
> 
> Yes, but as your patch, we have to fail the IOs and even kill the controller.
> In fact, up to nvme_wait_freeze in nvme_reset_work, the RECONNECTING state 
> has been completed.
> We even could say it is in LIVE state. Maybe we should recover the controller 
> again instead
> of fail the IOs and kill the controller.
> 
> On the other hand, can you share with me why we cannot use 
> blk_set_preempt_only to replace
> blk_freeze_queue ? we just want to gate the new bios out of 
> generic_make_request and we 
> needn't use the preempt requests.

Looks like blk_freeze_queue in blk_mq_update_nr_hw_queues cannot be worked 
around.
Still face IOs in nvme_reset_work. T_T

> 
> Looking forward your advice and directive.
> 
> Thanks
> Jianchao
> 
> 
>> ___
>> Linux-nvme mailing list
>> linux-n...@lists.infradead.org
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.infradead.org_mailman_listinfo_linux-2Dnvme=DwICAg=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ=UqKQMB3A2ppfm2sN7PyisX0xTtXKsHlTBwjsS18qVx8=A2VMSm9IjQQXxM7foB6VUiRHLs-nIREF2_kMstwxlgw=
>>
> 
> ___
> Linux-nvme mailing list
> linux-n...@lists.infradead.org
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.infradead.org_mailman_listinfo_linux-2Dnvme=DwICAg=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ=cstI6JeNHGVX4OV1UdHdoemUr75aCUAjUrPe23Dhv8U=MYTmBPYS5tW4vC23iMEKINprtCxnRHe5AXrbST91XpY=
> 


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-09 Thread jianchao.wang
Hi Keith

On 02/10/2018 10:32 AM, jianchao.wang wrote:
> Hi Keith
> 
> Thanks for your kindly response here.
> That's really appreciated.
> 
> On 02/10/2018 01:12 AM, Keith Busch wrote:
>> On Fri, Feb 09, 2018 at 09:50:58AM +0800, jianchao.wang wrote:
>>>
>>> if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING 
>>> case,
>>> nvme_reset_work will hang forever, because no one could complete the 
>>> entered requests.
>>
>> Except it's no longer in the "RESETTING" case since you added the
>> "CONNECTING" state, so that's already broken for other reasons...
>>
> 
> Yes, but as your patch, we have to fail the IOs and even kill the controller.
> In fact, up to nvme_wait_freeze in nvme_reset_work, the RECONNECTING state 
> has been completed.
> We even could say it is in LIVE state. Maybe we should recover the controller 
> again instead
> of fail the IOs and kill the controller.
> 
> On the other hand, can you share with me why we cannot use 
> blk_set_preempt_only to replace
> blk_freeze_queue ? we just want to gate the new bios out of 
> generic_make_request and we 
> needn't use the preempt requests.
> 
> Looking forward your advice and directive.

Avoid wait_freeze in nvme_reset_work should be a better way to fix this defect.

> 
> Thanks
> Jianchao


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-09 Thread jianchao.wang
Hi Keith

On 02/10/2018 10:32 AM, jianchao.wang wrote:
> Hi Keith
> 
> Thanks for your kindly response here.
> That's really appreciated.
> 
> On 02/10/2018 01:12 AM, Keith Busch wrote:
>> On Fri, Feb 09, 2018 at 09:50:58AM +0800, jianchao.wang wrote:
>>>
>>> if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING 
>>> case,
>>> nvme_reset_work will hang forever, because no one could complete the 
>>> entered requests.
>>
>> Except it's no longer in the "RESETTING" case since you added the
>> "CONNECTING" state, so that's already broken for other reasons...
>>
> 
> Yes, but as your patch, we have to fail the IOs and even kill the controller.
> In fact, up to nvme_wait_freeze in nvme_reset_work, the RECONNECTING state 
> has been completed.
> We even could say it is in LIVE state. Maybe we should recover the controller 
> again instead
> of fail the IOs and kill the controller.
> 
> On the other hand, can you share with me why we cannot use 
> blk_set_preempt_only to replace
> blk_freeze_queue ? we just want to gate the new bios out of 
> generic_make_request and we 
> needn't use the preempt requests.
> 
> Looking forward your advice and directive.

Avoid wait_freeze in nvme_reset_work should be a better way to fix this defect.

> 
> Thanks
> Jianchao


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-09 Thread jianchao.wang
Hi Keith

Thanks for your kindly response here.
That's really appreciated.

On 02/10/2018 01:12 AM, Keith Busch wrote:
> On Fri, Feb 09, 2018 at 09:50:58AM +0800, jianchao.wang wrote:
>>
>> if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING case,
>> nvme_reset_work will hang forever, because no one could complete the entered 
>> requests.
> 
> Except it's no longer in the "RESETTING" case since you added the
> "CONNECTING" state, so that's already broken for other reasons...
> 

Yes, but as your patch, we have to fail the IOs and even kill the controller.
In fact, up to nvme_wait_freeze in nvme_reset_work, the RECONNECTING state has 
been completed.
We even could say it is in LIVE state. Maybe we should recover the controller 
again instead
of fail the IOs and kill the controller.

On the other hand, can you share with me why we cannot use blk_set_preempt_only 
to replace
blk_freeze_queue ? we just want to gate the new bios out of 
generic_make_request and we 
needn't use the preempt requests.

Looking forward your advice and directive.

Thanks
Jianchao


> ___
> Linux-nvme mailing list
> linux-n...@lists.infradead.org
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.infradead.org_mailman_listinfo_linux-2Dnvme=DwICAg=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ=UqKQMB3A2ppfm2sN7PyisX0xTtXKsHlTBwjsS18qVx8=A2VMSm9IjQQXxM7foB6VUiRHLs-nIREF2_kMstwxlgw=
> 


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-09 Thread jianchao.wang
Hi Keith

Thanks for your kindly response here.
That's really appreciated.

On 02/10/2018 01:12 AM, Keith Busch wrote:
> On Fri, Feb 09, 2018 at 09:50:58AM +0800, jianchao.wang wrote:
>>
>> if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING case,
>> nvme_reset_work will hang forever, because no one could complete the entered 
>> requests.
> 
> Except it's no longer in the "RESETTING" case since you added the
> "CONNECTING" state, so that's already broken for other reasons...
> 

Yes, but as your patch, we have to fail the IOs and even kill the controller.
In fact, up to nvme_wait_freeze in nvme_reset_work, the RECONNECTING state has 
been completed.
We even could say it is in LIVE state. Maybe we should recover the controller 
again instead
of fail the IOs and kill the controller.

On the other hand, can you share with me why we cannot use blk_set_preempt_only 
to replace
blk_freeze_queue ? we just want to gate the new bios out of 
generic_make_request and we 
needn't use the preempt requests.

Looking forward your advice and directive.

Thanks
Jianchao


> ___
> Linux-nvme mailing list
> linux-n...@lists.infradead.org
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.infradead.org_mailman_listinfo_linux-2Dnvme=DwICAg=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ=UqKQMB3A2ppfm2sN7PyisX0xTtXKsHlTBwjsS18qVx8=A2VMSm9IjQQXxM7foB6VUiRHLs-nIREF2_kMstwxlgw=
> 


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-09 Thread Keith Busch
On Fri, Feb 09, 2018 at 09:50:58AM +0800, jianchao.wang wrote:
> 
> if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING case,
> nvme_reset_work will hang forever, because no one could complete the entered 
> requests.

Except it's no longer in the "RESETTING" case since you added the
"CONNECTING" state, so that's already broken for other reasons...


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-09 Thread Keith Busch
On Fri, Feb 09, 2018 at 09:50:58AM +0800, jianchao.wang wrote:
> 
> if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING case,
> nvme_reset_work will hang forever, because no one could complete the entered 
> requests.

Except it's no longer in the "RESETTING" case since you added the
"CONNECTING" state, so that's already broken for other reasons...


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-08 Thread jianchao.wang
Hi Keith and Sagi

Many thanks for your kindly response.
That's really appreciated.

On 02/09/2018 01:56 AM, Keith Busch wrote:
> On Thu, Feb 08, 2018 at 05:56:49PM +0200, Sagi Grimberg wrote:
>> Given the discussion on this set, you plan to respin again
>> for 4.16?
> 
> With the exception of maybe patch 1, this needs more consideration than
> I'd feel okay with for the 4.16 release.
> 
Currently, one of the block is the nvme_wait_freeze in nvme_reset_work.
This cause some issues when I test this patchset yesterday.
As I posted on the V1 patchset mail thread:

if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING case,
nvme_reset_work will hang forever, because no one could complete the entered 
requests.

if we invoke nvme_reset_ctrl after modify the state machine to be able to 
change to RESETTING
to RECONNECTING and queue reset_work, we still cannot move things forward, 
because the reset_work
is being executed.

if we use nvme_wait_freeze_timeout in nvme_reset_work, unfreeze and return if 
expires. But the 
timeout value is tricky..


And actually, one of the possible solution to fix this cleanly is 
blk_set_preempt_only.
It is a lightweight way to gate the new bios out of generic_make_request.

Looking forward your advice on this.
And many thanks for your precious time on this.

Sincerely
Thanks
Jianchao


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-08 Thread jianchao.wang
Hi Keith and Sagi

Many thanks for your kindly response.
That's really appreciated.

On 02/09/2018 01:56 AM, Keith Busch wrote:
> On Thu, Feb 08, 2018 at 05:56:49PM +0200, Sagi Grimberg wrote:
>> Given the discussion on this set, you plan to respin again
>> for 4.16?
> 
> With the exception of maybe patch 1, this needs more consideration than
> I'd feel okay with for the 4.16 release.
> 
Currently, one of the block is the nvme_wait_freeze in nvme_reset_work.
This cause some issues when I test this patchset yesterday.
As I posted on the V1 patchset mail thread:

if we set NVME_REQ_CANCELLED and return BLK_EH_HANDLED as the RESETTING case,
nvme_reset_work will hang forever, because no one could complete the entered 
requests.

if we invoke nvme_reset_ctrl after modify the state machine to be able to 
change to RESETTING
to RECONNECTING and queue reset_work, we still cannot move things forward, 
because the reset_work
is being executed.

if we use nvme_wait_freeze_timeout in nvme_reset_work, unfreeze and return if 
expires. But the 
timeout value is tricky..


And actually, one of the possible solution to fix this cleanly is 
blk_set_preempt_only.
It is a lightweight way to gate the new bios out of generic_make_request.

Looking forward your advice on this.
And many thanks for your precious time on this.

Sincerely
Thanks
Jianchao


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-08 Thread Keith Busch
On Thu, Feb 08, 2018 at 05:56:49PM +0200, Sagi Grimberg wrote:
> Given the discussion on this set, you plan to respin again
> for 4.16?

With the exception of maybe patch 1, this needs more consideration than
I'd feel okay with for the 4.16 release.


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-08 Thread Keith Busch
On Thu, Feb 08, 2018 at 05:56:49PM +0200, Sagi Grimberg wrote:
> Given the discussion on this set, you plan to respin again
> for 4.16?

With the exception of maybe patch 1, this needs more consideration than
I'd feel okay with for the 4.16 release.


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-08 Thread Sagi Grimberg

Jianchao,

Given the discussion on this set, you plan to respin again
for 4.16?


Re: [PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-08 Thread Sagi Grimberg

Jianchao,

Given the discussion on this set, you plan to respin again
for 4.16?


[PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-05 Thread Jianchao Wang
Hi Christoph, Keith and Sagi

Please consider and comment on the following patchset.
That's really appreciated.

There is a complicated relationship between nvme_timeout and nvme_dev_disable.
 - nvme_timeout has to invoke nvme_dev_disable to stop the
   controller doing DMA access before free the request.
 - nvme_dev_disable has to depend on nvme_timeout to complete
   adminq requests to set HMB or delete sq/cq when the controller
   has no response.
 - nvme_dev_disable will race with nvme_timeout when cancels the
   outstanding requests.
We have found some issues introduced by them, please refer the following link

http://lists.infradead.org/pipermail/linux-nvme/2018-January/015053.html 
http://lists.infradead.org/pipermail/linux-nvme/2018-January/015276.html
http://lists.infradead.org/pipermail/linux-nvme/2018-January/015328.html
Even we cannot ensure there is no other issue.

The best way to fix them is to break up the relationship between them.
With this patch, we could avoid nvme_dev_disable to be invoked
by nvme_timeout and eliminate the race between nvme_timeout and
nvme_dev_disable on outstanding requests.

Changes V1->V2:
 - free and disable pci things in nvme_pci_disable_ctrl_directly
 - change comment and add reviewed-by in 1st patch
 - resort patches
 - other misc changes

There are 6 patches:

1st ~ 3th patches does some preparation for the 4th one.
4th fixes a bug found when test.
5th is to avoid nvme_dev_disable to be invoked by nvme_timeout, and implement
the synchronization between them. More details, please refer to the comment of
this patch.
6th fixes a bug after 4th patch is introduced. It let nvme_delete_io_queues can
only be wakeup by completion path.

This patchset was tested under debug patch for some days.
And some bugfix have been done.
The patches are available in following it branch:
https://github.com/jianchwa/linux-blcok.git nvme_fixes_V2

Jianchao Wang (6)
0001-nvme-pci-quiesce-IO-queues-prior-to-disabling-device.patch
0002-nvme-pci-fix-the-freeze-and-quiesce-for-shutdown-and.patch
0003-blk-mq-make-blk_mq_rq_update_aborted_gstate-a-extern.patch
0004-nvme-pci-suspend-queues-based-on-online_queues.patch
0005-nvme-pci-break-up-nvme_timeout-and-nvme_dev_disable.patch
0006-nvme-pci-discard-wait-timeout-when-delete-cq-sq.patch


diff stat:
block/blk-mq.c  |   3 +-
drivers/nvme/host/pci.c | 250 
++-
include/linux/blk-mq.h  |   1 +
3 files changed, 188 insertions(+), 66 deletions(-)

Thanks
Jianchao



[PATCH V2 0/6]nvme-pci: fixes on nvme_timeout and nvme_dev_disable

2018-02-05 Thread Jianchao Wang
Hi Christoph, Keith and Sagi

Please consider and comment on the following patchset.
That's really appreciated.

There is a complicated relationship between nvme_timeout and nvme_dev_disable.
 - nvme_timeout has to invoke nvme_dev_disable to stop the
   controller doing DMA access before free the request.
 - nvme_dev_disable has to depend on nvme_timeout to complete
   adminq requests to set HMB or delete sq/cq when the controller
   has no response.
 - nvme_dev_disable will race with nvme_timeout when cancels the
   outstanding requests.
We have found some issues introduced by them, please refer the following link

http://lists.infradead.org/pipermail/linux-nvme/2018-January/015053.html 
http://lists.infradead.org/pipermail/linux-nvme/2018-January/015276.html
http://lists.infradead.org/pipermail/linux-nvme/2018-January/015328.html
Even we cannot ensure there is no other issue.

The best way to fix them is to break up the relationship between them.
With this patch, we could avoid nvme_dev_disable to be invoked
by nvme_timeout and eliminate the race between nvme_timeout and
nvme_dev_disable on outstanding requests.

Changes V1->V2:
 - free and disable pci things in nvme_pci_disable_ctrl_directly
 - change comment and add reviewed-by in 1st patch
 - resort patches
 - other misc changes

There are 6 patches:

1st ~ 3th patches does some preparation for the 4th one.
4th fixes a bug found when test.
5th is to avoid nvme_dev_disable to be invoked by nvme_timeout, and implement
the synchronization between them. More details, please refer to the comment of
this patch.
6th fixes a bug after 4th patch is introduced. It let nvme_delete_io_queues can
only be wakeup by completion path.

This patchset was tested under debug patch for some days.
And some bugfix have been done.
The patches are available in following it branch:
https://github.com/jianchwa/linux-blcok.git nvme_fixes_V2

Jianchao Wang (6)
0001-nvme-pci-quiesce-IO-queues-prior-to-disabling-device.patch
0002-nvme-pci-fix-the-freeze-and-quiesce-for-shutdown-and.patch
0003-blk-mq-make-blk_mq_rq_update_aborted_gstate-a-extern.patch
0004-nvme-pci-suspend-queues-based-on-online_queues.patch
0005-nvme-pci-break-up-nvme_timeout-and-nvme_dev_disable.patch
0006-nvme-pci-discard-wait-timeout-when-delete-cq-sq.patch


diff stat:
block/blk-mq.c  |   3 +-
drivers/nvme/host/pci.c | 250 
++-
include/linux/blk-mq.h  |   1 +
3 files changed, 188 insertions(+), 66 deletions(-)

Thanks
Jianchao