Re: [PATCH 20/25] qla2xxx: Remove redundant code

2017-05-31 Thread Tran, Quinn
Bart,

Will make available in the next upstream, once it has a chance to go through 
our internal test cycle. 

Regards,
Quinn Tran

-Original Message-
From: Bart Van Assche <bart.vanass...@sandisk.com>
Date: Wednesday, May 31, 2017 at 4:41 PM
To: "Tran, Quinn" <quinn.t...@cavium.com>, Nicholas Bellinger 
<n...@linux-iscsi.org>
Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>, 
"james.bottom...@hansenpartnership.com" 
<james.bottom...@hansenpartnership.com>, "Madhani, Himanshu" 
<himanshu.madh...@cavium.com>, "martin.peter...@oracle.com" 
<martin.peter...@oracle.com>
Subject: Re: [PATCH 20/25] qla2xxx: Remove redundant code

On Mon, 2017-05-22 at 22:14 +, Tran, Quinn wrote:
> 
> I think an initiator could get
> really confused if it receives two responses for the same exchange.
> 
> QT:  I do see the window youʼre describing in the 
qlt_try_to_dequeue_unknown_atios().
> Will have to follow up with another patch.   This patch is not meant for 
this window
> youʼve just identify.

Hello Quinn,

Has that patch been included in v2 of this series or will that patch 
perhaps be
made available later?

Thanks,

Bart.



Re: [PATCH 20/25] qla2xxx: Remove redundant code

2017-05-31 Thread Bart Van Assche
On Mon, 2017-05-22 at 22:14 +, Tran, Quinn wrote:
> 
> I think an initiator could get
> really confused if it receives two responses for the same exchange.
> 
> QT:  I do see the window you’re describing in the 
> qlt_try_to_dequeue_unknown_atios().
> Will have to follow up with another patch.   This patch is not meant for this 
> window
> you’ve just identify.

Hello Quinn,

Has that patch been included in v2 of this series or will that patch perhaps be
made available later?

Thanks,

Bart.

Re: [PATCH 20/25] qla2xxx: Remove redundant code

2017-05-22 Thread Tran, Quinn
-Original Message-
From: <linux-scsi-ow...@vger.kernel.org> on behalf of Bart Van Assche 
<bart.vanass...@sandisk.com>
Date: Monday, May 22, 2017 at 11:23 AM
To: Nicholas Bellinger <n...@linux-iscsi.org>
Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>, 
"james.bottom...@hansenpartnership.com" 
<james.bottom...@hansenpartnership.com>, "Madhani, Himanshu" 
<himanshu.madh...@cavium.com>, "martin.peter...@oracle.com" 
<martin.peter...@oracle.com>
Subject: Re: [PATCH 20/25] qla2xxx: Remove redundant code

On Sun, 2017-05-21 at 21:27 -0700, Nicholas A. Bellinger wrote:
> The three lists abort_cmd_for_tag() walks from __qlt_24xx_handle_abts()
> are used to track descriptors only before __qlt_do_work() is reached,
> and before a descriptor is submitted into tcm_qla2xxx code.
> 
> Or rather, the three lists in abort_cmd_for_tag() only contain
> qla_tgt_cmd or qla_tgt_sess_op descriptors that have not yet reached
> qla_tgt_func_tmpl->handle_cmd() code.
> 
> Both qlt_do_work() and qlt_create_sess_from_atio() drop their respective
> descriptors from ->cmd_list before dispatching into tcm_qla2xxx ->
> target-core, which means there is no way for a descriptor to be part of
> internal lists once __qlt_do_work() is called.
> 
> That said, the patch is correct and removes the redundant lookup.

I do not agree that this patch is makes ABTS handling fully robust. It seems
like you have not noticed that the following race can occur with or without
this patch applied: if abort_cmd_for_tag() and 
qlt_try_to_dequeue_unknown_atios()
are called concurrently, since the latter function calls list_del() after
qlt_send_term_exchange(), abort_cmd_for_tag() can return 1 and thereby 
trigger
a call to qlt_24xx_send_abts_resp() while qlt_try_to_dequeue_unknown_atios()
calls qlt_send_term_exchange() concurrently. 

QT:  “[PATCH 03/25] qla2xxx: Allow ABTS RX, RIDA on ATIOQ for ISP83XX/27XX” 
attempts to reduce the describe concurrency by moving the IOCB from the RSPQ 
thread to the ATIOQ thread.  This helps preserve the ordering between SCSI cmd 
& ABTS.

I think an initiator could get
really confused if it receives two responses for the same exchange.

QT:  I do see the window you’re describing in the 
qlt_try_to_dequeue_unknown_atios().  Will have to follow up with another patch. 
  This patch is not meant for this window you’ve just identify.


An existing issue that is not addressed by this patch is that if an ABTS is
received after qlt_do_work() has called list_del() and before 
__qlt_do_work()
triggers a call target_submit_cmd() that a command is not on any list and 
hence
that abort_cmd_for_tag() won't be able to find it anywhere.

QT:  Actually, we’re covered in this case.There 2 responds for 2 
IOCB/transactions.  1) The  abort_cmd_for_tag() finds the SCSI command on the 
qla_cmd_list and turns on the abort flag while cmd is in work queue & before 
__qlt_do_work is called.  When __qlt_do_work is call and sees the aborted flag, 
the thread will terminate/cleanup the SCSI command and will not submit to TCM.

As for the ABTS, the abort_cmd_for_tag() returns 1 after finding the command 
will trigger an ABTS respond.  

static void __qlt_do_work(struct qla_tgt_cmd *cmd)
{
   if (cmd->aborted) {
ql_dbg(ql_dbg_tgt_mgt, vha, 0xf082,
"cmd with tag %u is aborted\n",
cmd->atio.u.isp24.exchange_addr);
goto out_term;
}
}

Shouldn't these issues be addressed properly?

QT: Will follow up with another patch for the new problem. thanks for the extra 
eyes.

Bart.



Re: [PATCH 20/25] qla2xxx: Remove redundant code

2017-05-22 Thread Bart Van Assche
On Sun, 2017-05-21 at 21:27 -0700, Nicholas A. Bellinger wrote:
> The three lists abort_cmd_for_tag() walks from __qlt_24xx_handle_abts()
> are used to track descriptors only before __qlt_do_work() is reached,
> and before a descriptor is submitted into tcm_qla2xxx code.
> 
> Or rather, the three lists in abort_cmd_for_tag() only contain
> qla_tgt_cmd or qla_tgt_sess_op descriptors that have not yet reached
> qla_tgt_func_tmpl->handle_cmd() code.
> 
> Both qlt_do_work() and qlt_create_sess_from_atio() drop their respective
> descriptors from ->cmd_list before dispatching into tcm_qla2xxx ->
> target-core, which means there is no way for a descriptor to be part of
> internal lists once __qlt_do_work() is called.
> 
> That said, the patch is correct and removes the redundant lookup.

I do not agree that this patch is makes ABTS handling fully robust. It seems
like you have not noticed that the following race can occur with or without
this patch applied: if abort_cmd_for_tag() and 
qlt_try_to_dequeue_unknown_atios()
are called concurrently, since the latter function calls list_del() after
qlt_send_term_exchange(), abort_cmd_for_tag() can return 1 and thereby trigger
a call to qlt_24xx_send_abts_resp() while qlt_try_to_dequeue_unknown_atios()
calls qlt_send_term_exchange() concurrently. I think an initiator could get
really confused if it receives two responses for the same exchange.

An existing issue that is not addressed by this patch is that if an ABTS is
received after qlt_do_work() has called list_del() and before __qlt_do_work()
triggers a call target_submit_cmd() that a command is not on any list and hence
that abort_cmd_for_tag() won't be able to find it anywhere.

Shouldn't these issues be addressed properly?

Bart.

Re: [PATCH 20/25] qla2xxx: Remove redundant code

2017-05-21 Thread Nicholas A. Bellinger
On Fri, 2017-05-19 at 23:43 +, Bart Van Assche wrote:
> On Fri, 2017-05-19 at 14:53 -0700, Himanshu Madhani wrote:
> > From: Quinn Tran 
> > 
> > During ABTS or Abort task, qla2xxx does a pre-search for
> > the se_cmd, based on command's tag. The same search is
> > performed by TCM. Remove the extra search from qla2xxx.
> > 
> > Signed-off-by: Quinn Tran 
> > Signed-off-by: Himanshu Madhani 
> > ---
> >  drivers/scsi/qla2xxx/qla_target.c | 29 -
> >  1 file changed, 4 insertions(+), 25 deletions(-)
> > 
> > diff --git a/drivers/scsi/qla2xxx/qla_target.c 
> > b/drivers/scsi/qla2xxx/qla_target.c
> > index 21e8993baf4b..b8e609ae6cff 100644
> > --- a/drivers/scsi/qla2xxx/qla_target.c
> > +++ b/drivers/scsi/qla2xxx/qla_target.c
> > @@ -1836,34 +1836,13 @@ static int __qlt_24xx_handle_abts(struct 
> > scsi_qla_host *vha,
> > struct abts_recv_from_24xx *abts, struct fc_port *sess)
> >  {
> > struct qla_hw_data *ha = vha->hw;
> > -   struct se_session *se_sess = sess->se_sess;
> > struct qla_tgt_mgmt_cmd *mcmd;
> > -   struct se_cmd *se_cmd;
> > int rc;
> > -   bool found_lun = false;
> > -   unsigned long flags;
> > -
> > -   spin_lock_irqsave(_sess->sess_cmd_lock, flags);
> > -   list_for_each_entry(se_cmd, _sess->sess_cmd_list, se_cmd_list) {
> > -   if (se_cmd->tag == abts->exchange_addr_to_abort) {
> > -   found_lun = true;
> > -   break;
> > -   }
> > -   }
> > -   spin_unlock_irqrestore(_sess->sess_cmd_lock, flags);
> >  
> > -   /* cmd not in LIO lists, look in qla list */
> > -   if (!found_lun) {
> > -   if (abort_cmd_for_tag(vha, abts->exchange_addr_to_abort)) {
> > -   /* send TASK_ABORT response immediately */
> > -   qlt_24xx_send_abts_resp(vha, abts, FCP_TMF_CMPL, false);
> > -   return 0;
> > -   } else {
> > -   ql_dbg(ql_dbg_tgt_mgt, vha, 0xf081,
> > -   "unable to find cmd in driver or LIO for tag 
> > 0x%x\n",
> > -   abts->exchange_addr_to_abort);
> > -   return -ENOENT;
> > -   }
> > +   if (abort_cmd_for_tag(vha, abts->exchange_addr_to_abort)) {
> > +   /* send TASK_ABORT response immediately */
> > +   qlt_24xx_send_abts_resp(vha, abts, FCP_TMF_CMPL, false);
> > +   return 0;
> > }
> >  
> > ql_dbg(ql_dbg_tgt_mgt, vha, 0xf00f,
> 
> Hello Himanshu and Quinn,
> 
> Please drop this patch. If a command has already been submitted to the LIO
> core and an ABTS is received then the LIO core should be requested to perform
> the abort. This patch changes the behavior of the qla2xxx target driver such
> that the LIO core is not informed at all if abort_cmd_for_tag() finds the
> command that has to be aborted in one of the command lists maintained by the
> qla2xxx driver. That can lead to the presence of overlapping writes in the
> command set on the target system and hence to data corruption.

This analysis is wrong.

The three lists abort_cmd_for_tag() walks from __qlt_24xx_handle_abts()
are used to track descriptors only before __qlt_do_work() is reached,
and before a descriptor is submitted into tcm_qla2xxx code.

Or rather, the three lists in abort_cmd_for_tag() only contain
qla_tgt_cmd or qla_tgt_sess_op descriptors that have not yet reached
qla_tgt_func_tmpl->handle_cmd() code.

Both qlt_do_work() and qlt_create_sess_from_atio() drop their respective
descriptors from ->cmd_list before dispatching into tcm_qla2xxx ->
target-core, which means there is no way for a descriptor to be part of
internal lists once __qlt_do_work() is called.

That said, the patch is correct and removes the redundant lookup.

Acked-by: Nicholas Bellinger 



Re: [PATCH 20/25] qla2xxx: Remove redundant code

2017-05-19 Thread Bart Van Assche
On Fri, 2017-05-19 at 14:53 -0700, Himanshu Madhani wrote:
> From: Quinn Tran 
> 
> During ABTS or Abort task, qla2xxx does a pre-search for
> the se_cmd, based on command's tag. The same search is
> performed by TCM. Remove the extra search from qla2xxx.
> 
> Signed-off-by: Quinn Tran 
> Signed-off-by: Himanshu Madhani 
> ---
>  drivers/scsi/qla2xxx/qla_target.c | 29 -
>  1 file changed, 4 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_target.c 
> b/drivers/scsi/qla2xxx/qla_target.c
> index 21e8993baf4b..b8e609ae6cff 100644
> --- a/drivers/scsi/qla2xxx/qla_target.c
> +++ b/drivers/scsi/qla2xxx/qla_target.c
> @@ -1836,34 +1836,13 @@ static int __qlt_24xx_handle_abts(struct 
> scsi_qla_host *vha,
>   struct abts_recv_from_24xx *abts, struct fc_port *sess)
>  {
>   struct qla_hw_data *ha = vha->hw;
> - struct se_session *se_sess = sess->se_sess;
>   struct qla_tgt_mgmt_cmd *mcmd;
> - struct se_cmd *se_cmd;
>   int rc;
> - bool found_lun = false;
> - unsigned long flags;
> -
> - spin_lock_irqsave(_sess->sess_cmd_lock, flags);
> - list_for_each_entry(se_cmd, _sess->sess_cmd_list, se_cmd_list) {
> - if (se_cmd->tag == abts->exchange_addr_to_abort) {
> - found_lun = true;
> - break;
> - }
> - }
> - spin_unlock_irqrestore(_sess->sess_cmd_lock, flags);
>  
> - /* cmd not in LIO lists, look in qla list */
> - if (!found_lun) {
> - if (abort_cmd_for_tag(vha, abts->exchange_addr_to_abort)) {
> - /* send TASK_ABORT response immediately */
> - qlt_24xx_send_abts_resp(vha, abts, FCP_TMF_CMPL, false);
> - return 0;
> - } else {
> - ql_dbg(ql_dbg_tgt_mgt, vha, 0xf081,
> - "unable to find cmd in driver or LIO for tag 
> 0x%x\n",
> - abts->exchange_addr_to_abort);
> - return -ENOENT;
> - }
> + if (abort_cmd_for_tag(vha, abts->exchange_addr_to_abort)) {
> + /* send TASK_ABORT response immediately */
> + qlt_24xx_send_abts_resp(vha, abts, FCP_TMF_CMPL, false);
> + return 0;
>   }
>  
>   ql_dbg(ql_dbg_tgt_mgt, vha, 0xf00f,

Hello Himanshu and Quinn,

Please drop this patch. If a command has already been submitted to the LIO
core and an ABTS is received then the LIO core should be requested to perform
the abort. This patch changes the behavior of the qla2xxx target driver such
that the LIO core is not informed at all if abort_cmd_for_tag() finds the
command that has to be aborted in one of the command lists maintained by the
qla2xxx driver. That can lead to the presence of overlapping writes in the
command set on the target system and hence to data corruption. Please note
that I had proposed a better approach on the target-devel mailing list and
that I'm still waiting for someone from Cavium to review these patches:
* [PATCH v6 09/33] target: Make it possible to specify I_T nexus for SCSI
  abort (http://www.spinics.net/lists/target-devel/msg14534.html).
* [PATCH v6 10/33] tcm_qla2xxx: Let the target core look up the LUN of the
  aborted cmd (http://www.spinics.net/lists/target-devel/msg14563.html).

Bart.