Re: [PATCH v2 3/4] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device

2018-01-25 Thread Bart Van Assche
On Sat, 2018-01-13 at 23:54 +0800, Ming Lei wrote:
> Thanks for your mentioning, then I can find the following comment in
> srp_queuecommand():
> 
>   /*
>* The SCSI EH thread is the only context from which 
> srp_queuecommand()
>* can get invoked for blocked devices (SDEV_BLOCK /
>* SDEV_CREATED_BLOCK). Avoid racing with srp_reconnect_rport() 
> by
>* locking the rport mutex if invoked from inside the SCSI EH.
>*/
> 
> That means EH request is allowed to send to blocked device.

No way. That's a bug and a bug that needs to be fixed.

Bart.

Re: [PATCH v2 3/4] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device

2018-01-13 Thread Ming Lei
On Fri, Jan 12, 2018 at 10:45:57PM +, Bart Van Assche wrote:
> On Thu, 2018-01-11 at 10:23 +0800, Ming Lei wrote:
> > > not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd().
> > 
> > Given it is error handling, do we need to prevent the .queuecommand() call
> > in scsi_send_eh_cmnd()? Could you share us what the actual issue
> > observed is from user view?
> 
> Please have a look at the kernel bug report in the description of patch 4/4 of
> this series.

Thanks for your mentioning, then I can find the following comment in
srp_queuecommand():

/*
 * The SCSI EH thread is the only context from which 
srp_queuecommand()
 * can get invoked for blocked devices (SDEV_BLOCK /
 * SDEV_CREATED_BLOCK). Avoid racing with srp_reconnect_rport() 
by
 * locking the rport mutex if invoked from inside the SCSI EH.
 */

That means EH request is allowed to send to blocked device.

I also replied in patch 4/4, looks there is one simple one line change
which should address the issue of 'sleep in atomic context', please discuss that
in patch 4/4 thread.

-- 
Ming


Re: [PATCH v2 3/4] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device

2018-01-12 Thread Bart Van Assche
On Thu, 2018-01-11 at 10:23 +0800, Ming Lei wrote:
> > not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd().
> 
> Given it is error handling, do we need to prevent the .queuecommand() call
> in scsi_send_eh_cmnd()? Could you share us what the actual issue
> observed is from user view?

Please have a look at the kernel bug report in the description of patch 4/4 of
this series.

> > Hence surround the .queuecommand() call from the SCSI error handler with
> > blk_start_wait_if_quiesced() / blk_finish_wait_if_quiesced().
> > 
> > Note: converting the .queuecommand() call in scsi_send_eh_cmnd() into
> > code that calls blk_get_request(), e.g. scsi_execute_req(), is not an
> > option since scsi_send_eh_cmnd() can be called if all requests are
> > allocated and if no requests will make progress without aborting any
> > of these requests.
> 
> If we need to prevent the .queuecommand() in scsi_send_eh_cmnd(), what
> do you think of the approach by requeuing the EH command via
> scsi_mq_requeue_cmd()/scsi_requeue_cmd()? And the EH request will be
> dispatched finally when the queue becomes unquiesced or the STOPPED
> is cleared.

Calling scsi_mq_requeue_cmd() would trigger scsi_mq_uninit_cmd(),
blk_mq_put_driver_tag(), blk_mq_get_driver_tag() and scsi_mq_prep_fn().
That could cause scsi_eh_scmd_add() to be called multiple times for the
same SCSI command. However, that would break one of the assumptions
scsi_eh_scmd_add() is based on, namely that that function gets called
only once per SCSI command that is in progress.

Bart. 

Re: [PATCH v2 3/4] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device

2018-01-10 Thread Ming Lei
On Wed, Jan 10, 2018 at 10:18:16AM -0800, Bart Van Assche wrote:
> Several SCSI transport and LLD drivers surround code that does not
> tolerate concurrent calls of .queuecommand() with scsi_target_block() /
> scsi_target_unblock(). These last two functions use
> blk_mq_quiesce_queue() / blk_mq_unquiesce_queue() for scsi-mq request
> queues to prevent concurrent .queuecommand() calls. However, that is

Actually blk_mq_quiesce_queue() is supposed to disable and drain dispatch,
not for prevent concurrent .queuecommand() calls.

> not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd().

Given it is error handling, do we need to prevent the .queuecommand() call
in scsi_send_eh_cmnd()? Could you share us what the actual issue
observed is from user view?

> Hence surround the .queuecommand() call from the SCSI error handler with
> blk_start_wait_if_quiesced() / blk_finish_wait_if_quiesced().
> 
> Note: converting the .queuecommand() call in scsi_send_eh_cmnd() into
> code that calls blk_get_request(), e.g. scsi_execute_req(), is not an
> option since scsi_send_eh_cmnd() can be called if all requests are
> allocated and if no requests will make progress without aborting any
> of these requests.

If we need to prevent the .queuecommand() in scsi_send_eh_cmnd(), what
do you think of the approach by requeuing the EH command via
scsi_mq_requeue_cmd()/scsi_requeue_cmd()? And the EH request will be
dispatched finally when the queue becomes unquiesced or the STOPPED
is cleared.

-- 
Ming


[PATCH v2 3/4] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device

2018-01-10 Thread Bart Van Assche
Several SCSI transport and LLD drivers surround code that does not
tolerate concurrent calls of .queuecommand() with scsi_target_block() /
scsi_target_unblock(). These last two functions use
blk_mq_quiesce_queue() / blk_mq_unquiesce_queue() for scsi-mq request
queues to prevent concurrent .queuecommand() calls. However, that is
not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd().
Hence surround the .queuecommand() call from the SCSI error handler with
blk_start_wait_if_quiesced() / blk_finish_wait_if_quiesced().

Note: converting the .queuecommand() call in scsi_send_eh_cmnd() into
code that calls blk_get_request(), e.g. scsi_execute_req(), is not an
option since scsi_send_eh_cmnd() can be called if all requests are
allocated and if no requests will make progress without aborting any
of these requests.

Signed-off-by: Bart Van Assche 
Reviewed-by: Hannes Reinecke 
Cc: Martin K. Petersen 
Cc: Christoph Hellwig 
Cc: Johannes Thumshirn 
Cc: Ming Lei 
---
 drivers/scsi/scsi_error.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 62b56de38ae8..f7154ea86715 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1016,6 +1016,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, 
unsigned char *cmnd,
 {
struct scsi_device *sdev = scmd->device;
struct Scsi_Host *shost = sdev->host;
+   struct request_queue *q = sdev->request_queue;
DECLARE_COMPLETION_ONSTACK(done);
unsigned long timeleft = timeout;
struct scsi_eh_save ses;
@@ -1028,7 +1029,9 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, 
unsigned char *cmnd,
 
scsi_log_send(scmd);
scmd->scsi_done = scsi_eh_done;
+   blk_start_wait_if_quiesced(q);
rtn = shost->hostt->queuecommand(shost, scmd);
+   blk_finish_wait_if_quiesced(q);
if (rtn) {
if (timeleft > stall_for) {
scsi_eh_restore_cmnd(scmd, );
-- 
2.15.1