Re: [PATCH 32/47] scsi: Use scsi_target as argument for eh_target_reset_handler()

2017-08-02 Thread Steffen Maier

just an intermediate update on storage processing of target reset...

On 07/25/2017 04:19 PM, Hannes Reinecke wrote:

On 07/24/2017 08:10 PM, Steffen Maier wrote:

On 06/28/2017 10:32 AM, Hannes Reinecke wrote:

The target reset function should only depend on the scsi target,
not the scsi command.

Signed-off-by: Hannes Reinecke 
---



   drivers/s390/scsi/zfcp_scsi.c   | 20 ++--



   33 files changed, 214 insertions(+), 174 deletions(-)



diff --git a/drivers/s390/scsi/zfcp_scsi.c
b/drivers/s390/scsi/zfcp_scsi.c
index dd7bea0..92a3902 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -309,9 +309,25 @@ static int
zfcp_scsi_eh_device_reset_handler(struct scsi_cmnd *scpnt)
   return zfcp_task_mgmt_function(scpnt->device, FCP_TMF_LUN_RESET);
   }

-static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt)
+/*
+ * Note: We need to select a LUN as the storage array doesn't
+ * necessarily supports LUN 0 and might refuse the target reset.
+ */


Do you have any real experience with targets regarding this?

Did you even try this and it failed?
If so, how did it fail?


Hehe.

Actually, it was _you_ (well, not you personally, but the zfcp
maintainer at that time) who insisted on _not_ having to rely on LUN 0,
as that LUN might not be available on non-NPIV setups.
In the same vein he argued that we should be using the WLUN here.


Thanks a lot for letting me know!


It seems other drivers hardcode LUN 0 for target reset [see below].

At least you made a similar loop to search for a suitable "victim"
scsi_device with some other driver changes below, so zfcp is not the
only one.

In fact, this is one of my open questions in my own patch set:
Is the TMF flag in the FCP_CMND IU sufficient or does the transmission
path require a valid FCP_LUN also in the same IU even for a target reset.


Technically, you need an IT nexus for the target reset.
As the SCSI target is somewhat under-represented in the linux SCSI stack
typically it's easier to use a scsi device for this, and derive the IT
nexus from there.
And target reset is a tad tricky anyway; it got deprecated with later
SCSI releases (SPC-3?), so chances is that it doesn't do anything.

(You could do yourself a favour and enquire with your friendly array
vendors if _they_ support target reset; I have a strong feeling that
they don't. In which case you might as well drop it completely, and
target reset doing an IT nexus reset.)


# lsscsi
[0:0:0:1073758277]diskIBM  2107900  .280  /dev/sdc
[0:0:0:1073823813]diskIBM  2107900  .280  /dev/sda
[0:0:0:1073889349]diskIBM  2107900  .280  /dev/sdb

With test code I made the following request run into a timeout:

# dd count=1 of=/dev/null if=/dev/sda iflag=direct


[  633.459218] sd 0:0:0:1073823813: [sda] tag#0 Done: TIMEOUT_ERROR Result: 
hostbyte=DID_OK driverbyte=DRIVER_OK
[  633.459267] sd 0:0:0:1073823813: [sda] tag#0 CDB: Read(10) 28 00 00 00 00 00 
00 00 01 00
[  633.459277] sd 0:0:0:1073823813: [sda] tag#0 abort scheduled
[  633.479364] sd 0:0:0:1073823813: [sda] tag#0 aborting command
[  633.479382] sd 0:0:0:1073823813: [sda] tag#0 cmd abort failed


More test code makes the abort fail (before even attempting it).


[  633.479456] scsi host0: scsi_eh_0: waking up 0/1/1
[  633.479483] scsi host0: scsi_eh_prt_fail_stats: cmds failed: 0, cancel: 1
[  633.479492] scsi host0: Total of 1 commands on 1 devices require eh work
[  633.479502] sd 0:0:0:1073823813: scsi_eh_0: Sending BDR
[  633.479512] sd 0:0:0:1073823813: scsi_eh_0: BDR failed


More test code makes the LUN reset fail (before even attempting it).


[  633.479519] scsi host0: scsi_eh_0: Sending target reset to target 0
[  633.483654] sd 0:0:0:1073823813: [sda] tag#0 scsi_eh_done result: 2
[  633.483729] sd 0:0:0:1073823813: [sda] tag#0 Done: SUCCESS Result: 
hostbyte=DID_OK driverbyte=DRIVER_OK
[  633.483736] sd 0:0:0:1073823813: [sda] tag#0 CDB: Test Unit Ready 00 00 00 
00 00 00
[  633.483741] sd 0:0:0:1073823813: [sda] tag#0 Sense Key : Unit Attention [current] 
[  633.483747] sd 0:0:0:1073823813: [sda] tag#0 Add. Sense: Power on, reset, or bus device reset occurred

[  633.483753] sd 0:0:0:1073823813: [sda] tag#0 scsi_send_eh_cmnd timeleft: 1000
[  633.483758] sd 0:0:0:1073823813: [sda] tag#0 scsi_send_eh_cmnd: 
scsi_eh_completed_normally 2001
[  633.483764] sd 0:0:0:1073823813: [sda] tag#0 scsi_eh_tur return: 2001
[  633.484074] sd 0:0:0:1073823813: [sda] tag#0 scsi_eh_done result: 0
[  633.484093] sd 0:0:0:1073823813: [sda] tag#0 scsi_send_eh_cmnd timeleft: 1000
[  633.484118] sd 0:0:0:1073823813: [sda] tag#0 scsi_send_eh_cmnd: 
scsi_eh_completed_normally 2002
[  633.484124] sd 0:0:0:1073823813: [sda] tag#0 scsi_eh_tur return: 2002
[  633.484130] sd 0:0:0:1073823813: [sda] tag#0 scsi_eh_0: flush retry cmd
[  633.484260] scsi host0: waking up host to restart
[  633.484299] scsi host0: scsi_eh_0: sleeping


The target reset 

Re: [PATCH 32/47] scsi: Use scsi_target as argument for eh_target_reset_handler()

2017-07-25 Thread Hannes Reinecke
On 07/24/2017 08:10 PM, Steffen Maier wrote:
> 
> On 06/28/2017 10:32 AM, Hannes Reinecke wrote:
>> The target reset function should only depend on the scsi target,
>> not the scsi command.
>>
>> Signed-off-by: Hannes Reinecke 
>> ---
> 
>>   drivers/s390/scsi/zfcp_scsi.c   | 20 ++--
> 
>>   drivers/scsi/scsi_debug.c   | 21 -
>>   drivers/scsi/scsi_error.c   |  5 +--
> 
>>   include/scsi/scsi_host.h|  2 +-
> 
>>   33 files changed, 214 insertions(+), 174 deletions(-)
> 
>> diff --git a/drivers/s390/scsi/zfcp_scsi.c
>> b/drivers/s390/scsi/zfcp_scsi.c
>> index dd7bea0..92a3902 100644
>> --- a/drivers/s390/scsi/zfcp_scsi.c
>> +++ b/drivers/s390/scsi/zfcp_scsi.c
>> @@ -309,9 +309,25 @@ static int
>> zfcp_scsi_eh_device_reset_handler(struct scsi_cmnd *scpnt)
>>   return zfcp_task_mgmt_function(scpnt->device, FCP_TMF_LUN_RESET);
>>   }
>>
>> -static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt)
>> +/*
>> + * Note: We need to select a LUN as the storage array doesn't
>> + * necessarily supports LUN 0 and might refuse the target reset.
>> + */
> 
> Do you have any real experience with targets regarding this?
> 
> Did you even try this and it failed?
> If so, how did it fail?
> 
Hehe.

Actually, it was _you_ (well, not you personally, but the zfcp
maintainer at that time) who insisted on _not_ having to rely on LUN 0,
as that LUN might not be available on non-NPIV setups.
In the same vein he argued that we should be using the WLUN here.

> It seems other drivers hardcode LUN 0 for target reset [see below].
> 
> At least you made a similar loop to search for a suitable "victim"
> scsi_device with some other driver changes below, so zfcp is not the
> only one.
> 
> In fact, this is one of my open questions in my own patch set:
> Is the TMF flag in the FCP_CMND IU sufficient or does the transmission
> path require a valid FCP_LUN also in the same IU even for a target reset.
> 
Technically, you need an IT nexus for the target reset.
As the SCSI target is somewhat under-represented in the linux SCSI stack
typically it's easier to use a scsi device for this, and derive the IT
nexus from there.
And target reset is a tad tricky anyway; it got deprecated with later
SCSI releases (SPC-3?), so chances is that it doesn't do anything.

(You could do yourself a favour and enquire with your friendly array
vendors if _they_ support target reset; I have a strong feeling that
they don't. In which case you might as well drop it completely, and
target reset doing an IT nexus reset.)

>> +static int zfcp_scsi_eh_target_reset_handler(struct scsi_target
>> *starget)
>>   {
>> -return zfcp_task_mgmt_function(scpnt->device, FCP_TMF_TGT_RESET);
>> +struct fc_rport *rport = starget_to_rport(starget);
>> +struct Scsi_Host *shost = rport_to_shost(rport);
>> +struct scsi_device *sdev = NULL, *tmp_sdev;
>> +
> 
> In "[PATCH 09/47] zfcp: open-code fc_block_scsi_eh() for host reset" you
> introduced a shost lock, but here you did not?
> 
> Does the midlayer already hold an shost lock when calling any of these
> eh callbacks?
> 
Yes.

> Not sure if that's the corresponding part of
> Documentation/scsi/scsi_eh.txt (but even if, I don't understand who's
> supposed to have the shost lock):
>> [2-1-2] Flow of scmds through EH
>>  2. EH starts
>> ACTION: move all scmds to EH's local eh_work_q.  shost->eh_cmd_q
>> is cleared.
>> LOCKING: shost->host_lock (not strictly necessary, just for
>>  consistency)
>> [2-2-3] Things to consider
>>  - For consistency, when accessing/modifying shost data structure,
>>grab shost->host_lock.
> 
> 
>> +shost_for_each_device(tmp_sdev, shost) {
>> +if (tmp_sdev->id == starget->id) {
>> +sdev = tmp_sdev;
>> +break;
>> +}
>> +}
>> +if (!sdev)
>> +return FAILED;
>> +return zfcp_task_mgmt_function(sdev, FCP_TMF_TGT_RESET);
>>   }
> 
> Ah, this "solves" the problem of needing a scsi_device even though we
> only get scsi_target as scope argument.
> 
>> diff --git a/drivers/scsi/lpfc/lpfc_scsi.c
>> b/drivers/scsi/lpfc/lpfc_scsi.c
>> index 107c0f6..573bd43 100644
>> --- a/drivers/scsi/lpfc/lpfc_scsi.c
>> +++ b/drivers/scsi/lpfc/lpfc_scsi.c
>> @@ -5226,16 +5226,16 @@ void lpfc_poll_timeout(unsigned long ptr)
>>*  0x2002 - Success
>>**/
>>   static int
>> -lpfc_target_reset_handler(struct scsi_cmnd *cmnd)
>> +lpfc_target_reset_handler(struct scsi_target *starget)
>>   {
>> -struct Scsi_Host  *shost = cmnd->device->host;
>> +struct fc_rport *rport = starget_to_rport(starget);
>> +struct Scsi_Host  *shost = rport_to_shost(rport);
>>   struct lpfc_vport *vport = (struct lpfc_vport *) shost->hostdata;
>>   struct lpfc_rport_data *rdata;
>>   struct lpfc_nodelist *pnode;
>> -unsigned tgt_id = cmnd->device->id;
>> -uint64_t lun_id = cmnd->device->lun;
>> +unsigned 

Re: [PATCH 32/47] scsi: Use scsi_target as argument for eh_target_reset_handler()

2017-07-24 Thread Steffen Maier


On 06/28/2017 10:32 AM, Hannes Reinecke wrote:

The target reset function should only depend on the scsi target,
not the scsi command.

Signed-off-by: Hannes Reinecke 
---



  drivers/s390/scsi/zfcp_scsi.c   | 20 ++--



  drivers/scsi/scsi_debug.c   | 21 -
  drivers/scsi/scsi_error.c   |  5 +--



  include/scsi/scsi_host.h|  2 +-



  33 files changed, 214 insertions(+), 174 deletions(-)



diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index dd7bea0..92a3902 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -309,9 +309,25 @@ static int zfcp_scsi_eh_device_reset_handler(struct 
scsi_cmnd *scpnt)
return zfcp_task_mgmt_function(scpnt->device, FCP_TMF_LUN_RESET);
  }

-static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt)
+/*
+ * Note: We need to select a LUN as the storage array doesn't
+ * necessarily supports LUN 0 and might refuse the target reset.
+ */


Do you have any real experience with targets regarding this?

Did you even try this and it failed?
If so, how did it fail?

It seems other drivers hardcode LUN 0 for target reset [see below].

At least you made a similar loop to search for a suitable "victim" 
scsi_device with some other driver changes below, so zfcp is not the 
only one.


In fact, this is one of my open questions in my own patch set:
Is the TMF flag in the FCP_CMND IU sufficient or does the transmission 
path require a valid FCP_LUN also in the same IU even for a target reset.



+static int zfcp_scsi_eh_target_reset_handler(struct scsi_target *starget)
  {
-   return zfcp_task_mgmt_function(scpnt->device, FCP_TMF_TGT_RESET);
+   struct fc_rport *rport = starget_to_rport(starget);
+   struct Scsi_Host *shost = rport_to_shost(rport);
+   struct scsi_device *sdev = NULL, *tmp_sdev;
+


In "[PATCH 09/47] zfcp: open-code fc_block_scsi_eh() for host reset" you 
introduced a shost lock, but here you did not?


Does the midlayer already hold an shost lock when calling any of these 
eh callbacks?


Not sure if that's the corresponding part of 
Documentation/scsi/scsi_eh.txt (but even if, I don't understand who's 
supposed to have the shost lock):

[2-1-2] Flow of scmds through EH
 2. EH starts
ACTION: move all scmds to EH's local eh_work_q.  shost->eh_cmd_q
is cleared.
LOCKING: shost->host_lock (not strictly necessary, just for
 consistency)
[2-2-3] Things to consider
 - For consistency, when accessing/modifying shost data structure,
   grab shost->host_lock.




+   shost_for_each_device(tmp_sdev, shost) {
+   if (tmp_sdev->id == starget->id) {
+   sdev = tmp_sdev;
+   break;
+   }
+   }
+   if (!sdev)
+   return FAILED;
+   return zfcp_task_mgmt_function(sdev, FCP_TMF_TGT_RESET);
  }


Ah, this "solves" the problem of needing a scsi_device even though we 
only get scsi_target as scope argument.



diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 107c0f6..573bd43 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -5226,16 +5226,16 @@ void lpfc_poll_timeout(unsigned long ptr)
   *  0x2002 - Success
   **/
  static int
-lpfc_target_reset_handler(struct scsi_cmnd *cmnd)
+lpfc_target_reset_handler(struct scsi_target *starget)
  {
-   struct Scsi_Host  *shost = cmnd->device->host;
+   struct fc_rport *rport = starget_to_rport(starget);
+   struct Scsi_Host  *shost = rport_to_shost(rport);
struct lpfc_vport *vport = (struct lpfc_vport *) shost->hostdata;
struct lpfc_rport_data *rdata;
struct lpfc_nodelist *pnode;
-   unsigned tgt_id = cmnd->device->id;
-   uint64_t lun_id = cmnd->device->lun;
+   unsigned tgt_id = starget->id;
+   uint64_t lun_id = 0;





diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c 
b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index f990ab4d..db40ddf 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c



@@ -4038,42 +4040,43 @@ int megasas_reset_target_fusion(struct scsi_cmnd *scmd)



+   shost_for_each_device(sdev, shost) {
+   if (!sdev->hostdata)
+   continue;
+   mr_device_priv_data = sdev->hostdata;
+   if (mr_device_priv_data->is_tm_capable) {
+   devhandle = megasas_get_tm_devhandle(sdev);
+   break;
+   }
+   }
+



-   devhandle = megasas_get_tm_devhandle(scmd->device);



ret = megasas_issue_tm(instance, devhandle,
-   scmd->device->channel, scmd->device->id, 0,
+   starget->channel, starget->id, 0,
MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET);


The called function seems to 

[PATCH 32/47] scsi: Use scsi_target as argument for eh_target_reset_handler()

2017-06-28 Thread Hannes Reinecke
The target reset function should only depend on the scsi target,
not the scsi command.

Signed-off-by: Hannes Reinecke 
---
 drivers/message/fusion/mptsas.c | 10 +-
 drivers/message/fusion/mptscsih.c   | 27 +++--
 drivers/message/fusion/mptscsih.h   |  2 +-
 drivers/message/fusion/mptspi.c |  8 -
 drivers/s390/scsi/zfcp_scsi.c   | 20 ++--
 drivers/scsi/aacraid/linit.c|  9 +++---
 drivers/scsi/be2iscsi/be_main.c |  4 +--
 drivers/scsi/bfa/bfad_im.c  |  3 +-
 drivers/scsi/bnx2fc/bnx2fc.h|  2 +-
 drivers/scsi/bnx2fc/bnx2fc_io.c |  4 +--
 drivers/scsi/esas2r/esas2r.h|  2 +-
 drivers/scsi/esas2r/esas2r_main.c   | 38 ---
 drivers/scsi/ibmvscsi/ibmvfc.c  |  3 +-
 drivers/scsi/libiscsi.c |  4 +--
 drivers/scsi/libsas/sas_scsi_host.c |  9 +++---
 drivers/scsi/lpfc/lpfc_scsi.c   | 10 +++---
 drivers/scsi/megaraid/megaraid_sas.h|  3 +-
 drivers/scsi/megaraid/megaraid_sas_base.c   | 10 +++---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 47 +++--
 drivers/scsi/mpt3sas/mpt3sas_scsih.c| 35 +++--
 drivers/scsi/pmcraid.c  | 24 +++
 drivers/scsi/qedf/qedf_main.c   |  3 +-
 drivers/scsi/qla2xxx/qla_os.c   | 18 +--
 drivers/scsi/qla4xxx/ql4_os.c   | 41 ++---
 drivers/scsi/scsi_debug.c   | 21 -
 drivers/scsi/scsi_error.c   |  5 +--
 drivers/scsi/scsi_transport_iscsi.c |  6 ++--
 drivers/scsi/sym53c8xx_2/sym_glue.c |  3 +-
 drivers/target/loopback/tcm_loop.c  |  9 +++---
 include/scsi/libiscsi.h |  2 +-
 include/scsi/libsas.h   |  2 +-
 include/scsi/scsi_host.h|  2 +-
 include/scsi/scsi_transport_iscsi.h |  2 +-
 33 files changed, 214 insertions(+), 174 deletions(-)

diff --git a/drivers/message/fusion/mptsas.c b/drivers/message/fusion/mptsas.c
index f6308ad..bdb420b 100644
--- a/drivers/message/fusion/mptsas.c
+++ b/drivers/message/fusion/mptsas.c
@@ -1970,6 +1970,14 @@ static enum blk_eh_timer_return 
mptsas_eh_timed_out(struct scsi_cmnd *sc)
 }
 
 
+static int mptsas_eh_target_reset(struct scsi_target *starget)
+{
+   struct sas_rphy *rphy = dev_to_rphy(starget->dev.parent);
+   MPT_ADAPTER *ioc = rphy_to_ioc(rphy);
+
+   return mptscsih_target_reset(ioc->sh, starget);
+}
+
 static struct scsi_host_template mptsas_driver_template = {
.module = THIS_MODULE,
.proc_name  = "mptsas",
@@ -1985,7 +1993,7 @@ static enum blk_eh_timer_return 
mptsas_eh_timed_out(struct scsi_cmnd *sc)
.change_queue_depth = mptscsih_change_queue_depth,
.eh_timed_out   = mptsas_eh_timed_out,
.eh_abort_handler   = mptscsih_abort,
-   .eh_device_reset_handler= mptscsih_dev_reset,
+   .eh_target_reset_handler= mptsas_eh_target_reset,
.eh_host_reset_handler  = mptscsih_host_reset,
.bios_param = mptscsih_bios_param,
.can_queue  = MPT_SAS_CAN_QUEUE,
diff --git a/drivers/message/fusion/mptscsih.c 
b/drivers/message/fusion/mptscsih.c
index ca79982..0f25a2a 100644
--- a/drivers/message/fusion/mptscsih.c
+++ b/drivers/message/fusion/mptscsih.c
@@ -1852,48 +1852,43 @@ int mptscsih_show_info(struct seq_file *m, struct 
Scsi_Host *host)
  * Returns SUCCESS or FAILED.
  **/
 int
-mptscsih_target_reset(struct scsi_cmnd * SCpnt)
+mptscsih_target_reset(struct Scsi_Host *shost, struct scsi_target * starget)
 {
MPT_SCSI_HOST   *hd;
int  retval;
-   VirtDevice   *vdevice;
+   VirtTarget   *vtarget;
MPT_ADAPTER *ioc;
 
/* If we can't locate our host adapter structure, return FAILED status.
 */
-   if ((hd = shost_priv(SCpnt->device->host)) == NULL){
-   printk(KERN_ERR MYNAM ": target reset: "
-  "Can't locate host! (sc=%p)\n", SCpnt);
+   if ((hd = shost_priv(shost)) == NULL){
+   printk(KERN_ERR MYNAM ": target reset: Can't locate host!\n");
return FAILED;
}
 
ioc = hd->ioc;
-   printk(MYIOC_s_INFO_FMT "attempting target reset! (sc=%p)\n",
-  ioc->name, SCpnt);
-   scsi_print_command(SCpnt);
-
-   vdevice = SCpnt->device->hostdata;
-   if (!vdevice || !vdevice->vtarget) {
+   printk(MYIOC_s_INFO_FMT "attempting target reset!\n", ioc->name);
+   vtarget = starget->hostdata;
+   if (!vtarget) {
retval = 0;
goto out;
}
 
/* Target reset to hidden raid component is