Re: [PATCH v2 00/13] mpt3sas driver NVMe support:

2017-07-24 Thread Suganath Prabu Subramani
Is there any update on these patches ?

Thanks,
Suganath Prabu S

On Fri, Jul 14, 2017 at 6:52 PM, Suganath Prabu S
 wrote:
> Ventura Series controller are Tri-mode. The controller and
> firmware are capable of supporting NVMe devices and
> PCIe switches to be connected with the controller. This
> patch set adds driver level support for NVMe devices and
> PCIe switches.
>
> Suganath Prabu S (13):
>   mpt3sas: Update MPI Header
>   mpt3sas: Add nvme device support in slave alloc, target alloc and
> probe
>   mpt3sas: SGL to PRP Translation for I/Os to NVMe  devices
>   mpt3sas: Added support for nvme encapsulated request message.
>   mpt3sas: API 's to support NVMe drive addition to SML
>   mpt3sas: API's to remove nvme drive from sml
>   mpt3sas: Handle NVMe PCIe device related events generated
>from firmware.
>   mpt3sas: Set NVMe device queue depth as 128
>   mpt3sas: scan and add nvme device after controller reset
>   mpt3as: Add-Task-management-debug-info-for-NVMe-drives.
>   mpt3sas: NVMe drive support for BTDHMAPPING ioctl command and log
> info
>   mpt3sas: Fix nvme drives checking for tlr.
>   mpt3sas: Update mpt3sas driver version.
>
>  drivers/scsi/mpt3sas/mpi/mpi2.h  |   43 +-
>  drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h |  647 ++-
>  drivers/scsi/mpt3sas/mpi/mpi2_init.h |   11 +-
>  drivers/scsi/mpt3sas/mpi/mpi2_ioc.h  |  331 ++-
>  drivers/scsi/mpt3sas/mpi/mpi2_pci.h  |  142 +++
>  drivers/scsi/mpt3sas/mpi/mpi2_tool.h |   14 +-
>  drivers/scsi/mpt3sas/mpt3sas_base.c  |  710 +++-
>  drivers/scsi/mpt3sas/mpt3sas_base.h  |  171 +++-
>  drivers/scsi/mpt3sas/mpt3sas_config.c|  100 ++
>  drivers/scsi/mpt3sas/mpt3sas_ctl.c   |  158 ++-
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 1874 
> --
>  drivers/scsi/mpt3sas/mpt3sas_warpdrive.c |2 +-
>  12 files changed, 4063 insertions(+), 140 deletions(-)
>  create mode 100644 drivers/scsi/mpt3sas/mpi/mpi2_pci.h
>
> Thanks,
> Suganath Prabu S


Re: [PATCH] scsi_transport_fc: return -EBUSY for deleted vport

2017-07-24 Thread Martin K. Petersen

Hannes,

> When trying to delete a vport via 'vport_delete' sysfs attribute
> we should be checking if the port is already in state VPORT_DELETING;
> if so there's no need to do anything.

Applied to 4.13/scsi-fixes. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: libcxgbi: add check for valid cxgbi_task_data

2017-07-24 Thread Martin K. Petersen

Varun,

> In error case it is possible that ->cleanup_task() gets called without
> calling ->alloc_pdu() in this case cxgbi_task_data is not valid, so
> add a check for for valid cxgbi_task_data in cxgbi_cleanup_task().

Applied to 4.13/scsi-fixes, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] aic7xxx: fix firmware build with O=path

2017-07-24 Thread Martin K. Petersen

Jakub,

> Building firmware with O=path was apparently broken in aic7 for ever.
> Message of the previous commit to the Makefile (from 2008) mentions
> this unfortunate state of affairs already.  Fix this, mostly to make
> randconfig builds more reliable.

Applied to 4.13/scsi-fixes. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: megaraid_sas: fix memleak in megasas_alloc_cmdlist_fusion

2017-07-24 Thread Martin K. Petersen

shuw...@redhat.com,

> Found this issue by kmemleak, a few kb mem was leaked in
> megasas_alloc_cmdlist_fusion when kzalloc failed for one
> megasas_cmd_fusion allocation.

Applied to 4.13/scsi-fixes. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] qedi: Add ISCSI_BOOT_SYSFS to Kconfig

2017-07-24 Thread Martin K. Petersen

Nilesh,

> qedi uses iscsi_boot_sysfs to export the targets used for boot to
> sysfs. Select the config option to make sure the module is built.

Applied to 4.13/scsi-fixes. Whitespace was bad. Fixed.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] SCSI: remove DRIVER_ATTR() usage

2017-07-24 Thread Martin K. Petersen

Greg,

> It's better to use the DRIVER_ATTR_RW() and DRIVER_ATTR_RO() macros to
> explicitly show that this is a read/write or read/only sysfs file.  So
> convert the remaining SCSI drivers that use the old style to use the
> newer macros.

Applied to 4.14/scsi-queue. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: Convert to using %pOF instead of full_name

2017-07-24 Thread Martin K. Petersen

Rob,

> Now that we have a custom printf format specifier, convert users of
> full_name to use %pOF instead. This is preparation to remove storing
> of the full path string for each node.

Applied to 4.14/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 00/22] gcc-7 -Wformat-* warnings

2017-07-24 Thread Martin K. Petersen

Arnd,

> This series addresses all warnings that gcc-7 introduces for
> -Wformat-overflow= and turns off the -Wformat-truncation by default
> (they remain enabled with "make W=1").

Applied the SCSI patches. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 1/1] scsi: ufs: changing maintainer

2017-07-24 Thread Martin K. Petersen

Prabu,

> As per internal decision, Joao Pinto will be maintainer for DWC UFS
> driver.

Applied to 4.14/scsi-queue. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] aic7xxx: fix firmware build with O=path

2017-07-24 Thread Jakub Kicinski
On Mon, 24 Jul 2017 09:00:41 +0200, Hannes Reinecke wrote:
> On 07/21/2017 08:51 PM, Jakub Kicinski wrote:
> > On Tue, 18 Jul 2017 18:58:34 -0700, Jakub Kicinski wrote:  
> >> Building firmware with O=path was apparently broken in aic7 for ever.
> >> Message of the previous commit to the Makefile (from 2008) mentions
> >> this unfortunate state of affairs already.  Fix this, mostly to make
> >> randconfig builds more reliable.
> >>
> >> Signed-off-by: Jakub Kicinski   
> > 
> > Gentle ping :)  Hannes, are you still maintaining aic7xxx?
> >   
> Yes, I do.
> 
> Patch looks reasonable; Kconfig is still black magic to me, though :-)

To me as well, but is seems to work fine in my tests :)

> Reviewed-by: Hannes Reinecke 

Thanks!


Re: Please help me with kernel hung

2017-07-24 Thread Alan Stern
On Tue, 25 Jul 2017, Михаил Гаврилов wrote:

> Sure.
> I've get it by command:
> journalctl -b -1 -k --no-host -oshort-monotonic
> it's enough? Or needed another logging level?
> Thanks for respond.

This is enough for now.  The problem is that a USB transfer is started 
but never stopped, which indicates a problem with the host controller.

In this case you're using an xHCI host controller.  Unfortunately the 
xHCI maintainer is on vacation, so he won't be able to help for a 
while.

In the meantime, you could try doing a git bisection to find which 
commit caused the problem to occur.

Alan Stern



Re: [PATCH 46/47] scsi: Move eh_device_reset_handler() to use scsi_device as argument

2017-07-24 Thread Steffen Maier

sparse review comments below...

On 06/28/2017 10:33 AM, Hannes Reinecke wrote:

When resetting a device we shouldn't depend on an existing SCSI
device, so use 'struct scsi_device' as argument for
eh_device_reset_handler().

Signed-off-by: Hannes Reinecke 
---
  Documentation/scsi/scsi_eh.txt  |  2 +-
  Documentation/scsi/scsi_mid_low_api.txt |  4 +--



  drivers/s390/scsi/zfcp_scsi.c   |  4 +--



  drivers/scsi/scsi_debug.c   | 18 ++---
  drivers/scsi/scsi_error.c   | 35 +



  include/scsi/scsi_host.h|  2 +-
  60 files changed, 287 insertions(+), 307 deletions(-)

diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt
index cb9f4bc22..f8a6566 100644
--- a/Documentation/scsi/scsi_eh.txt
+++ b/Documentation/scsi/scsi_eh.txt
@@ -206,7 +206,7 @@ hostt EH callbacks.  Callbacks may be omitted and omitted 
ones are
  considered to fail always.

  int (* eh_abort_handler)(struct scsi_cmnd *);
-int (* eh_device_reset_handler)(struct scsi_cmnd *);
+int (* eh_device_reset_handler)(struct scsi_device *);
  int (* eh_bus_reset_handler)(struct Scsi_Host *, int);
  int (* eh_host_reset_handler)(struct Scsi_Host *);

diff --git a/Documentation/scsi/scsi_mid_low_api.txt 
b/Documentation/scsi/scsi_mid_low_api.txt
index e2609a63..28dc029 100644
--- a/Documentation/scsi/scsi_mid_low_api.txt
+++ b/Documentation/scsi/scsi_mid_low_api.txt
@@ -874,7 +874,7 @@ Details:

  /**
   *  eh_device_reset_handler - issue SCSI device reset
- *  @scp: identifies SCSI device to be reset
+ *  @sdev: identifies SCSI device to be reset
   *
   *  Returns SUCCESS if command aborted else FAILED
   *
@@ -887,7 +887,7 @@ Details:
   *
   *  Optionally defined in: LLD
   **/
- int eh_device_reset_handler(struct scsi_cmnd * scp)
+ int eh_device_reset_handler(struct scsi_device * sdev)


  /**




diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 92a3902..23b5a7d 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -304,9 +304,9 @@ static int zfcp_task_mgmt_function(struct scsi_device 
*sdev, u8 tm_flags)
return retval;
  }

-static int zfcp_scsi_eh_device_reset_handler(struct scsi_cmnd *scpnt)
+static int zfcp_scsi_eh_device_reset_handler(struct scsi_device *sdev)
  {
-   return zfcp_task_mgmt_function(scpnt->device, FCP_TMF_LUN_RESET);
+   return zfcp_task_mgmt_function(sdev, FCP_TMF_LUN_RESET);
  }

  /*




diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 37e511f..9029500 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -3793,19 +3793,17 @@ static int scsi_debug_abort(struct scsi_cmnd *SCpnt)
return SUCCESS;
  }

-static int scsi_debug_device_reset(struct scsi_cmnd * SCpnt)
+static int scsi_debug_device_reset(struct scsi_device * sdp)
  {
+   struct sdebug_dev_info *devip =
+   (struct sdebug_dev_info *)sdp->hostdata;
+
++num_dev_resets;
-   if (SCpnt && SCpnt->device) {
-   struct scsi_device *sdp = SCpnt->device;
-   struct sdebug_dev_info *devip =
-   (struct sdebug_dev_info *)sdp->hostdata;

-   if (SDEBUG_OPT_ALL_NOISE & sdebug_opts)
-   sdev_printk(KERN_INFO, sdp, "%s\n", __func__);
-   if (devip)
-   set_bit(SDEBUG_UA_POR, devip->uas_bm);
-   }
+   if (SDEBUG_OPT_ALL_NOISE & sdebug_opts)
+   sdev_printk(KERN_INFO, sdp, "%s\n", __func__);
+   if (devip)
+   set_bit(SDEBUG_UA_POR, devip->uas_bm);
return SUCCESS;
  }

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 368a961..4a7fe97f 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -844,7 +844,7 @@ static int scsi_try_bus_device_reset(struct scsi_cmnd *scmd)
if (!hostt->eh_device_reset_handler)
return FAILED;

-   rtn = hostt->eh_device_reset_handler(scmd);
+   rtn = hostt->eh_device_reset_handler(scmd->device);
if (rtn == SUCCESS)
__scsi_report_device_reset(scmd->device, NULL);
return rtn;


up to here it looks good,
however ...


@@ -1106,6 +1106,7 @@ static int scsi_eh_action(struct scsi_cmnd *scmd, int rtn)
   * scsi_eh_finish_cmd - Handle a cmd that eh is finished with.


__scsi_eh_finish_cmd  ?


   * @scmd: Original SCSI cmd that eh has finished.
   * @done_q:   Queue for processed commands.
+ * @result:Final command status to be set
   *
   * Notes:
   *We don't want to use the normal command completion while we are are


Did this hunk slip in accidentally?
I don't see a new additional argument result being introduced with 
either the new __scsi_eh_finish_cmd() nor the old scsi_eh_finish_cmd().
However, the former __scsi_eh_finish_cmd() seems to 

Re: [PATCH 0/7] Enable iSCSI offload drivers to use information from iface.

2017-07-24 Thread Robert LeBlanc
On Wed, Jun 14, 2017 at 10:47 AM, Robert LeBlanc  wrote:
> On Wed, Jun 14, 2017 at 3:20 AM, Rangankar, Manish
>  wrote:
>>
>> On 13/06/17 10:19 PM, "Robert LeBlanc"  wrote:
>>
>>>On Wed, Jun 7, 2017 at 12:30 PM, Robert LeBlanc 
>>>wrote:
 On Wed, Jun 7, 2017 at 10:28 AM, Chris Leech  wrote:
> On Tue, Jun 06, 2017 at 12:07:10PM -0600, Robert LeBlanc wrote:
>> This patchset enables iSCSI offload drivers to have access to the
>>iface
>> information provided by iscsid. This allows users to have more control
>> of how the driver connects to the iSCSI target. iSER is updated to use
>> iface.ipaddress to set the source IP address if configured. This
>>allows
>> iSER to use multiple ports on the same network or in more complicated
>> routed configurations.
>>
>> Since there is already a change to the function parameters, dst_addr
>> is upgraded to sockaddr_storage so that it is more future proof and
>>makes
>> the size of the struct static and not dependent on checking the
>>SA_FAMILY.
>>
>> This is dependent on updates to Open-iSCSI.
>
> Hi Robert,
>
> I don't think that passing the iface_rec structure directly from the
> iscsid internals into a netlink message is a good way to go about this.
> It's really big, there's an embedded list_head with user address
> pointers that needs to be left out, and there are 32/64-bit layout
> differences.
>
> Let me take a look at how you're proposing using this info for iSER, if
> it makes sense I think we should come up with a better designed
> structure for passing the information.
>
> Thanks,
> Chris
>

 Chris,

 Thank you for your feedback. I agree that the entire iface is probably
 overkill, it was more of a proof of concept. We are only using the
 ipaddress in the iface for iSER (in my patch), but I could see other
 drivers benefiting from some of the other data in the iface (mac,
 interface_name, vlan, etc) so I didn't want to be too restrictive so
 that it wouldn't have to be extended later. I've not worked on
 userspace/kernel interaction before so I need some guidance to make
 the transition between userspace and kernel versions smoother.

 This patchset works for what we need and it is very important for us
 (and I'm sure others once the feature is available) and I'm happy to
 put in the time to get it accepted upstream, I'm just new to kernel
 development and need some guidance.
>>>
>>>Are there other comments/ideas/suggestions specifically from the
>>>iSCSI/iSER guys? I'd like to keep this patch moving.
>>
>> Considering partial iSCSI offload solution (like bnx2i and qedi) point of
>> view, we liked the idea from Hannes to create TAP interface to associate
>> with each iSCSI offload interface, which will allow us to use userspace
>> tools for configuration. I haven't dig into its details yet, but at higher
>> level it looks like this will help us to move away from our dependency
>> over iscsiuio.
>
> I'm having a hard time wrapping my head around this idea. How can
> configuring a TAP (separate Ethernet device) affect the offload NIC. I
> don't see how you can bind to the right interface with the IP address
> or how that is passed to rdma_connect() using a TAP. Is TAP used
> instead of netlink for communicating between userspace and the kernel?
> Obviously by my questions, at the moment I'm not sure how to approach
> this at all and would be the wrong person to make this happen.
>
> If someone can explain how this would work and point to some code that
> does something like this (examples are good for me), I can try to
> create a patch with TAPs. As I mentioned before, this is important for
> us and I'm willing to put in the time to learn and code, but I'm
> really lost at this point.
>
> Thank you,
>
> 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

Can we get some more discussion/enlightenment on this? It would be
really nice to have this in 4.14 and time is getting short.

Thanks.


Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


Re: [PATCH 32/47] scsi: Use scsi_target as argument for eh_target_reset_handler()

2017-07-24 Thread Steffen Maier


On 06/28/2017 10:32 AM, Hannes Reinecke wrote:

The target reset function should only depend on the scsi target,
not the scsi command.

Signed-off-by: Hannes Reinecke 
---



  drivers/s390/scsi/zfcp_scsi.c   | 20 ++--



  drivers/scsi/scsi_debug.c   | 21 -
  drivers/scsi/scsi_error.c   |  5 +--



  include/scsi/scsi_host.h|  2 +-



  33 files changed, 214 insertions(+), 174 deletions(-)



diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index dd7bea0..92a3902 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -309,9 +309,25 @@ static int zfcp_scsi_eh_device_reset_handler(struct 
scsi_cmnd *scpnt)
return zfcp_task_mgmt_function(scpnt->device, FCP_TMF_LUN_RESET);
  }

-static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt)
+/*
+ * Note: We need to select a LUN as the storage array doesn't
+ * necessarily supports LUN 0 and might refuse the target reset.
+ */


Do you have any real experience with targets regarding this?

Did you even try this and it failed?
If so, how did it fail?

It seems other drivers hardcode LUN 0 for target reset [see below].

At least you made a similar loop to search for a suitable "victim" 
scsi_device with some other driver changes below, so zfcp is not the 
only one.


In fact, this is one of my open questions in my own patch set:
Is the TMF flag in the FCP_CMND IU sufficient or does the transmission 
path require a valid FCP_LUN also in the same IU even for a target reset.



+static int zfcp_scsi_eh_target_reset_handler(struct scsi_target *starget)
  {
-   return zfcp_task_mgmt_function(scpnt->device, FCP_TMF_TGT_RESET);
+   struct fc_rport *rport = starget_to_rport(starget);
+   struct Scsi_Host *shost = rport_to_shost(rport);
+   struct scsi_device *sdev = NULL, *tmp_sdev;
+


In "[PATCH 09/47] zfcp: open-code fc_block_scsi_eh() for host reset" you 
introduced a shost lock, but here you did not?


Does the midlayer already hold an shost lock when calling any of these 
eh callbacks?


Not sure if that's the corresponding part of 
Documentation/scsi/scsi_eh.txt (but even if, I don't understand who's 
supposed to have the shost lock):

[2-1-2] Flow of scmds through EH
 2. EH starts
ACTION: move all scmds to EH's local eh_work_q.  shost->eh_cmd_q
is cleared.
LOCKING: shost->host_lock (not strictly necessary, just for
 consistency)
[2-2-3] Things to consider
 - For consistency, when accessing/modifying shost data structure,
   grab shost->host_lock.




+   shost_for_each_device(tmp_sdev, shost) {
+   if (tmp_sdev->id == starget->id) {
+   sdev = tmp_sdev;
+   break;
+   }
+   }
+   if (!sdev)
+   return FAILED;
+   return zfcp_task_mgmt_function(sdev, FCP_TMF_TGT_RESET);
  }


Ah, this "solves" the problem of needing a scsi_device even though we 
only get scsi_target as scope argument.



diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 107c0f6..573bd43 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -5226,16 +5226,16 @@ void lpfc_poll_timeout(unsigned long ptr)
   *  0x2002 - Success
   **/
  static int
-lpfc_target_reset_handler(struct scsi_cmnd *cmnd)
+lpfc_target_reset_handler(struct scsi_target *starget)
  {
-   struct Scsi_Host  *shost = cmnd->device->host;
+   struct fc_rport *rport = starget_to_rport(starget);
+   struct Scsi_Host  *shost = rport_to_shost(rport);
struct lpfc_vport *vport = (struct lpfc_vport *) shost->hostdata;
struct lpfc_rport_data *rdata;
struct lpfc_nodelist *pnode;
-   unsigned tgt_id = cmnd->device->id;
-   uint64_t lun_id = cmnd->device->lun;
+   unsigned tgt_id = starget->id;
+   uint64_t lun_id = 0;





diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c 
b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index f990ab4d..db40ddf 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c



@@ -4038,42 +4040,43 @@ int megasas_reset_target_fusion(struct scsi_cmnd *scmd)



+   shost_for_each_device(sdev, shost) {
+   if (!sdev->hostdata)
+   continue;
+   mr_device_priv_data = sdev->hostdata;
+   if (mr_device_priv_data->is_tm_capable) {
+   devhandle = megasas_get_tm_devhandle(sdev);
+   break;
+   }
+   }
+



-   devhandle = megasas_get_tm_devhandle(scmd->device);



ret = megasas_issue_tm(instance, devhandle,
-   scmd->device->channel, scmd->device->id, 0,
+   starget->channel, starget->id, 0,
MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET);


The called function seems to 

Re: [PATCH 09/47] zfcp: open-code fc_block_scsi_eh() for host reset

2017-07-24 Thread Hannes Reinecke
On 07/24/2017 06:18 PM, Steffen Maier wrote:
> Hi Hannes,
> 
> unfortunately I only realized now by accident that there's stuff to
> review. Would be nice to send it also explicitly to driver maintainers
> in addition to the list.
> 
Well, the entire patchset got send as an RFC anyway, and as such I
didn't include all the individual driver maintainers.
But will be doing so for the actual patchset.

> Since you've asked for this multiple times, I happened to just now code
> a patch series of 6 patches in order to decouple zfcp from scsi_cmnd for
> device, target, and host reset.
> 
Oh, cool. That's precisely what I need.

> While I provide some review comments below, I think it might be clearer
> and easier to review if you would rebase your series on top of my
> decoupling.
> 
Sure. Get me the patches and I'll be doing it :-)

> Let me know how urgent you'd like to see my code. I planned to send it
> as RFC soon anyway. However, it hasn't seen any function testing yet. If
> you don't care, let me know and I can send it.
> 
At this stage I don't really care; the idea is to get the preliminary
patches in (preferably before the next merge window), so that the actual
patch to modify SCSI EH syntax doesn't have to modify the drivers
themselves, only the calling sequence.

> On 06/28/2017 10:32 AM, Hannes Reinecke wrote:
>> When issuing a host reset we should be waiting for all
>> ports to become unblocked; just waiting for one might
>> be resulting in host reset to return too early.
>>
>> Signed-off-by: Hannes Reinecke 
>> ---
>>   drivers/s390/scsi/zfcp_scsi.c | 27 +++
>>   1 file changed, 23 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/s390/scsi/zfcp_scsi.c
>> b/drivers/s390/scsi/zfcp_scsi.c
>> index 0678cf7..3d18659 100644
>> --- a/drivers/s390/scsi/zfcp_scsi.c
>> +++ b/drivers/s390/scsi/zfcp_scsi.c
>> @@ -311,13 +311,32 @@ static int
>> zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt)
>>
>>   static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
>>   {
>> -struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device);
>> -struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
>> -int ret;
>> +struct Scsi_Host *host = scpnt->device->host;
>> +struct zfcp_adapter *adapter = (struct zfcp_adapter
>> *)host->hostdata[0];
>> +int ret = 0;
>> +unsigned long flags;
>> +struct zfcp_port *port;
>>
>>   zfcp_erp_adapter_reopen(adapter, 0, "schrh_1");
>>   zfcp_erp_wait(adapter);
>> -ret = fc_block_scsi_eh(scpnt);
>> +retry_rport_blocked:
>> +spin_lock_irqsave(host->host_lock, flags);
> 
> missing read_lock(>port_list_lock);
> 
> Hm, well, I have to think about lock ordering, because my patch has the
> port_list as outer loop and inside it calls fc_block_scsi_eh (also
> modified with fc_rport as argument).
> If there's any other location taking both locks we better get them in
> the same order.
> 
In general I'm not terribly happy with this; after all, there is no
guarantee that the rport list after host reset is identical to the list
prior to it; in extremis we could even end up with no rports whatsoever.
In which case we wouldn't have anything to synchronize upon, leaving
host reset in a somewhat dubious state.
I'd be far happier if we could have a synchronisation point independent
on the rport states; then this problem wouldn't occur.
(And rports would be handled separately via the dev_loss_tmo mechanism
anyway, so it wouldn't matter if the rports are still recovering after
host reset returned.)

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH 24/47] zfcp: use scsi device as argument for zfcp_task_mgmt_function()

2017-07-24 Thread Steffen Maier



On 06/28/2017 10:32 AM, Hannes Reinecke wrote:

zfcp_task_mgmt_function() is only used for lun and device reset,
so it should be using the scsi device as an argument, not the
scsi command.

Signed-off-by: Hannes Reinecke 
---
  drivers/s390/scsi/zfcp_ext.h  |  2 +-
  drivers/s390/scsi/zfcp_fc.h   |  9 +
  drivers/s390/scsi/zfcp_fsf.c  | 21 +++--
  drivers/s390/scsi/zfcp_scsi.c | 12 ++--
  4 files changed, 19 insertions(+), 25 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index 57f1d4a..64d4db7 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -117,7 +117,7 @@ extern int zfcp_fsf_send_els(struct zfcp_adapter *, u32,
 struct zfcp_fsf_ct_els *, unsigned int);
  extern int zfcp_fsf_fcp_cmnd(struct scsi_cmnd *);
  extern void zfcp_fsf_req_free(struct zfcp_fsf_req *);
-extern struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_cmnd *, u8);
+extern struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_device *, u8);
  extern struct zfcp_fsf_req *zfcp_fsf_abort_fcp_cmnd(struct scsi_cmnd *);
  extern void zfcp_fsf_reqid_check(struct zfcp_qdio *, int);

diff --git a/drivers/s390/scsi/zfcp_fc.h b/drivers/s390/scsi/zfcp_fc.h
index df2b541..f08eaf9 100644
--- a/drivers/s390/scsi/zfcp_fc.h
+++ b/drivers/s390/scsi/zfcp_fc.h
@@ -206,19 +206,12 @@ struct zfcp_fc_wka_ports {
   * zfcp_fc_scsi_to_fcp - setup FCP command with data from scsi_cmnd
   * @fcp: fcp_cmnd to setup
   * @scsi: scsi_cmnd where to get LUN, task attributes/flags and CDB
- * @tm: task management flags to setup task management command
   */
  static inline
-void zfcp_fc_scsi_to_fcp(struct fcp_cmnd *fcp, struct scsi_cmnd *scsi,
-u8 tm_flags)
+void zfcp_fc_scsi_to_fcp(struct fcp_cmnd *fcp, struct scsi_cmnd *scsi)
  {
int_to_scsilun(scsi->device->lun, (struct scsi_lun *) >fc_lun);

-   if (unlikely(tm_flags)) {
-   fcp->fc_tm_flags = tm_flags;
-   return;
-   }
-
fcp->fc_pri_ta = FCP_PTA_SIMPLE;

if (scsi->sc_data_direction == DMA_FROM_DEVICE)


When I wrote my zfcp decoupling patch series I did a lot of git history 
research in order to double check if my changes are OK.


Here, I figured that I'd like to separately revert commit 2443c8b23aea 
("[SCSI] zfcp: Merge FCP task management setup with regular FCP command 
setup"), because this introduced a dependency on the unsuitable SCSI 
command for scsi_eh / TMF. Even though it was one of the first commits I 
posted upstream as newbie zfcp maintainer... little did I know.



diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 27ff38f..c4bd3d4 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2036,10 +2036,9 @@ static void zfcp_fsf_req_trace(struct zfcp_fsf_req *req, 
struct scsi_cmnd *scsi)



-static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req)
+static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req,
+   struct scsi_device *sdev)




@@ -2120,7 +2119,7 @@ static void zfcp_fsf_fcp_cmnd_handler(struct zfcp_fsf_req 
*req)



-   zfcp_fsf_fcp_handler_common(req);
+   zfcp_fsf_fcp_handler_common(req, scpnt->device);



@@ -2296,8 +2295,9 @@ static void zfcp_fsf_fcp_task_mgmt_handler(struct 
zfcp_fsf_req *req)
  {



+   struct scsi_device *sdev = req->data;

-   zfcp_fsf_fcp_handler_common(req);
+   zfcp_fsf_fcp_handler_common(req, sdev);


Moving the resolving of req->data into the callers of 
zfcp_fsf_fcp_handler_common() I did the same way in my own patch series.



@@ -2313,12 +2313,12 @@ static void zfcp_fsf_fcp_task_mgmt_handler(struct 
zfcp_fsf_req *req)



-struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_cmnd *scmnd,
+struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_device *sdev,
u8 tm_flags)



@@ -2338,7 +2338,7 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct 
scsi_cmnd *scmnd,



-   req->data = scmnd;
+   req->data = sdev;



diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 5fada93..dd7bea0 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -259,17 +259,17 @@ static void zfcp_scsi_forget_cmnds(struct zfcp_scsi_dev 
*zsdev, u8 tm_flags)



-static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, u8 tm_flags)
+static int zfcp_task_mgmt_function(struct scsi_device *sdev, u8 tm_flags)
  {
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device);
+   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
-   struct fc_port *rport = zfcp_sdev->port->rport;


This smells very odd: fc_port instead of fc_*r*port.
Looks like this was introduced buggy in "[PATCH 15/47] 
scsi_transport_fc: Use fc_rport as argument for 

Re: [PATCH 14/47] scsi: Use Scsi_Host as argument for eh_host_reset_handler

2017-07-24 Thread Steffen Maier


On 06/28/2017 10:32 AM, Hannes Reinecke wrote:

Issuing a host reset should not rely on any commands.
So use Scsi_Host as argument for eh_host_reset_handler.

Signed-off-by: Hannes Reinecke 
---
  Documentation/scsi/scsi_eh.txt  |  2 +-
  Documentation/scsi/scsi_mid_low_api.txt |  4 +--



  drivers/s390/scsi/zfcp_scsi.c   |  3 +-



  drivers/scsi/scsi_debug.c   | 16 +-
  drivers/scsi/scsi_error.c   |  2 +-



  include/scsi/scsi_host.h|  2 +-
  81 files changed, 290 insertions(+), 380 deletions(-)

diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt
index 11e447b..78cd6b9 100644
--- a/Documentation/scsi/scsi_eh.txt
+++ b/Documentation/scsi/scsi_eh.txt
@@ -208,7 +208,7 @@ considered to fail always.
  int (* eh_abort_handler)(struct scsi_cmnd *);
  int (* eh_device_reset_handler)(struct scsi_cmnd *);
  int (* eh_bus_reset_handler)(struct scsi_cmnd *);
-int (* eh_host_reset_handler)(struct scsi_cmnd *);
+int (* eh_host_reset_handler)(struct Scsi_Host *);

   Higher-severity actions are taken only when lower-severity actions
  cannot recover some of failed scmds.  Also, note that failure of the
diff --git a/Documentation/scsi/scsi_mid_low_api.txt 
b/Documentation/scsi/scsi_mid_low_api.txt
index 6338400..ad517d5 100644
--- a/Documentation/scsi/scsi_mid_low_api.txt
+++ b/Documentation/scsi/scsi_mid_low_api.txt
@@ -891,7 +891,7 @@ Details:

  /**
   *  eh_host_reset_handler - reset host (host bus adapter)
- *  @scp: SCSI host that contains this device should be reset
+ *  @shost: SCSI host that should be reset
   *
   *  Returns SUCCESS if command aborted else FAILED
   *
@@ -908,7 +908,7 @@ Details:
   *
   *  Optionally defined in: LLD
   **/
- int eh_host_reset_handler(struct scsi_cmnd * scp)
+ int eh_host_reset_handler(struct Scsi_Host * shost)


  /**





diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 3d18659..d7d4a63 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -309,9 +309,8 @@ static int zfcp_scsi_eh_target_reset_handler(struct 
scsi_cmnd *scpnt)
return zfcp_task_mgmt_function(scpnt, FCP_TMF_TGT_RESET);
  }

-static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
+static int zfcp_scsi_eh_host_reset_handler(struct Scsi_Host *host)
  {
-   struct Scsi_Host *host = scpnt->device->host;
struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0];
int ret = 0;
unsigned long flags;





diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index dc095a2..ebf525b 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -3878,27 +3878,27 @@ static int scsi_debug_bus_reset(struct scsi_cmnd * 
SCpnt)
return SUCCESS;
  }

-static int scsi_debug_host_reset(struct scsi_cmnd * SCpnt)
+static int scsi_debug_host_reset(struct Scsi_Host * shost)
  {
struct sdebug_host_info * sdbg_host;
struct sdebug_dev_info *devip;
int k = 0;

++num_host_resets;
-   if ((SCpnt->device) && (SDEBUG_OPT_ALL_NOISE & sdebug_opts))
-   sdev_printk(KERN_INFO, SCpnt->device, "%s\n", __func__);
-spin_lock(_host_list_lock);
-list_for_each_entry(sdbg_host, _host_list, host_list) {
+   if (SDEBUG_OPT_ALL_NOISE & sdebug_opts)
+   shost_printk(KERN_INFO, shost, "%s\n", __func__);
+   spin_lock(_host_list_lock);
+   list_for_each_entry(sdbg_host, _host_list, host_list) {
list_for_each_entry(devip, _host->dev_info_list,
dev_list) {
set_bit(SDEBUG_UA_BUS_RESET, devip->uas_bm);
++k;
}
-}
-spin_unlock(_host_list_lock);
+   }
+   spin_unlock(_host_list_lock);
stop_all_queued();
if (SDEBUG_OPT_RESET_NOISE & sdebug_opts)
-   sdev_printk(KERN_INFO, SCpnt->device,
+   shost_printk(KERN_INFO, shost,
"%s: %d device(s) found\n", __func__, k);
return SUCCESS;
  }


Mechanically, the change looks good.

However, is it correct that it's been looping over all its shosts and 
doing something even for those that do not match SCpnt or now shost?
I thought it should only reset on the scope but it seems to call 
stop_all_queued(void) which might reset all shosts?
But then again the loop seems about unit attentions so the loop may be 
correct.

Anyway, this is not related to your change.

The other quoted parts incl. zfcp look good to me.


diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index ac31964..808167f 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -745,7 +745,7 @@ static int scsi_try_host_reset(struct scsi_cmnd *scmd)
if (!hostt->eh_host_reset_handler)
   

Re: [PATCH 15/47] scsi_transport_fc: Use fc_rport as argument for fc_block_scsi_eh

2017-07-24 Thread Steffen Maier


On 06/28/2017 10:32 AM, Hannes Reinecke wrote:

fc_block_scsi_eh() works on a remote port, so we should be using
that as an argument, and not the scsi command.



diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index d7d4a63..e3160fc 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -150,6 +150,7 @@ static int zfcp_scsi_eh_abort_handler(struct scsi_cmnd 
*scpnt)
struct zfcp_adapter *adapter =
(struct zfcp_adapter *) scsi_host->hostdata[0];
struct zfcp_fsf_req *old_req, *abrt_req;
+   struct fc_rport *rport = starget_to_rport(scsi_target(scpnt->device));
unsigned long flags;
unsigned long old_reqid = (unsigned long) scpnt->host_scribble;
int retval = SUCCESS, ret;
@@ -176,7 +177,7 @@ static int zfcp_scsi_eh_abort_handler(struct scsi_cmnd 
*scpnt)
break;

zfcp_erp_wait(adapter);
-   ret = fc_block_scsi_eh(scpnt);
+   ret = fc_block_scsi_eh(rport);


I think for abort handlers some original fc_block_scsi_eh with scsi_cmnd 
as argument would be perfectly valid as this really has the scope of a 
particular single command.


Do you really want to touch all abort handlers just because of this?

In my zfcp decoupling I had created a new fc_block_rport for 
TMFs/host_reset and kept using the old fc_block_scsi_eh (which does the 
argument lifting and simply delegates to the new fc_block_rport) for the 
abort handler.
Admittedly, of course, I also did so because I did not touch all other 
users/callers of the old function with the scsi_cmnd argument.



if (ret) {
zfcp_dbf_scsi_abort("abrt_bl", scpnt, NULL);
return ret;
@@ -262,6 +263,7 @@ static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, 
u8 tm_flags)
  {
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device);
struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
+   struct fc_port *rport = zfcp_sdev->port->rport;


port->rport could be NULL

[async zfcp_scsi_rport_block()]


struct zfcp_fsf_req *fsf_req = NULL;
int retval = SUCCESS, ret;
int retry = 3;
@@ -272,7 +274,7 @@ static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, 
u8 tm_flags)
break;

zfcp_erp_wait(adapter);
-   ret = fc_block_scsi_eh(scpnt);
+   ret = fc_block_scsi_eh(rport);
if (ret)
return ret;




diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 6dd0922..f83e512 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3273,7 +3273,7 @@ struct fc_rport *

  /**
   * fc_block_scsi_eh - Block SCSI eh thread for blocked fc_rport
- * @cmnd: SCSI command that scsi_eh is trying to recover
+ * @rport: remote port to be checked
   *
   * This routine can be called from a FC LLD scsi_eh callback. It
   * blocks the scsi_eh thread until the fc_rport leaves the
@@ -3285,10 +3285,9 @@ struct fc_rport *
   *FAST_IO_FAIL if the fast_io_fail_tmo fired, this should be
   *passed back to scsi_eh.
   */
-int fc_block_scsi_eh(struct scsi_cmnd *cmnd)
+int fc_block_scsi_eh(struct fc_rport *rport)
  {
-   struct Scsi_Host *shost = cmnd->device->host;
-   struct fc_rport *rport = starget_to_rport(scsi_target(cmnd->device));
+   struct Scsi_Host *shost = rport_to_shost(rport);
unsigned long flags;

spin_lock_irqsave(shost->host_lock, flags);
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index e308cd5..5b26943 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -804,7 +804,7 @@ void fc_host_post_vendor_event(struct Scsi_Host *shost, u32 
event_number,
  struct fc_vport *fc_vport_create(struct Scsi_Host *shost, int channel,
struct fc_vport_identifiers *);
  int fc_vport_terminate(struct fc_vport *vport);
-int fc_block_scsi_eh(struct scsi_cmnd *cmnd);
+int fc_block_scsi_eh(struct fc_rport *rport);
  enum blk_eh_timer_return fc_eh_timed_out(struct scsi_cmnd *scmd);

  static inline struct Scsi_Host *fc_bsg_to_shost(struct bsg_job *job)



Aside from above thoughts, the coding of the argument change seems good.


--
Mit freundlichen Grüßen / Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: [PATCH 09/47] zfcp: open-code fc_block_scsi_eh() for host reset

2017-07-24 Thread Steffen Maier



On 06/28/2017 10:32 AM, Hannes Reinecke wrote:

When issuing a host reset we should be waiting for all
ports to become unblocked; just waiting for one might
be resulting in host reset to return too early.

Signed-off-by: Hannes Reinecke 
---
  drivers/s390/scsi/zfcp_scsi.c | 27 +++
  1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 0678cf7..3d18659 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -311,13 +311,32 @@ static int zfcp_scsi_eh_target_reset_handler(struct 
scsi_cmnd *scpnt)

  static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
  {
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device);
-   struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
-   int ret;
+   struct Scsi_Host *host = scpnt->device->host;
+   struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0];


Oh, only realized when reading the later "[PATCH 14/47] scsi: Use 
Scsi_Host as argument for eh_host_reset_handler" that this part already 
anticipates parts of that later patch. Seems not fully topically 
separated patches to me, if this one is only about the fc_block_scsi_eh 
aspect in zfcp_scsi_eh_host_reset_handler().



--
Mit freundlichen Grüßen / Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: [PATCH 09/47] zfcp: open-code fc_block_scsi_eh() for host reset

2017-07-24 Thread Steffen Maier

Hi Hannes,

unfortunately I only realized now by accident that there's stuff to 
review. Would be nice to send it also explicitly to driver maintainers 
in addition to the list.


Since you've asked for this multiple times, I happened to just now code 
a patch series of 6 patches in order to decouple zfcp from scsi_cmnd for 
device, target, and host reset.


While I provide some review comments below, I think it might be clearer 
and easier to review if you would rebase your series on top of my 
decoupling.


Let me know how urgent you'd like to see my code. I planned to send it 
as RFC soon anyway. However, it hasn't seen any function testing yet. If 
you don't care, let me know and I can send it.


On 06/28/2017 10:32 AM, Hannes Reinecke wrote:

When issuing a host reset we should be waiting for all
ports to become unblocked; just waiting for one might
be resulting in host reset to return too early.

Signed-off-by: Hannes Reinecke 
---
  drivers/s390/scsi/zfcp_scsi.c | 27 +++
  1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 0678cf7..3d18659 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -311,13 +311,32 @@ static int zfcp_scsi_eh_target_reset_handler(struct 
scsi_cmnd *scpnt)

  static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
  {
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device);
-   struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
-   int ret;
+   struct Scsi_Host *host = scpnt->device->host;
+   struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0];
+   int ret = 0;
+   unsigned long flags;
+   struct zfcp_port *port;

zfcp_erp_adapter_reopen(adapter, 0, "schrh_1");
zfcp_erp_wait(adapter);
-   ret = fc_block_scsi_eh(scpnt);
+retry_rport_blocked:
+   spin_lock_irqsave(host->host_lock, flags);


missing read_lock(>port_list_lock);

Hm, well, I have to think about lock ordering, because my patch has the 
port_list as outer loop and inside it calls fc_block_scsi_eh (also 
modified with fc_rport as argument).
If there's any other location taking both locks we better get them in 
the same order.



+   list_for_each_entry(port, >port_list, list) {
+   struct fc_rport *rport = port->rport;


port->rport can be NULL, so need to check


+
+   if (rport->port_state == FC_PORTSTATE_BLOCKED) {
+   if (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT)
+   ret = FAST_IO_FAIL;


Hm, doesn't this get lost if a next iteration hits ret=NEEDS_RETRY?

I was pondering in my own patch version what to return of just a subset 
of ports ran into fast_io_fail. And for now I thought just fast_io_fail 
would be good to let I/O bubble up for path failover, even if this would 
include of rport which meanwhile unblocked properly and would not need 
bubbling up pending requests because they could service them with a 
simple retry.



+   else
+   ret = NEEDS_RETRY;
+   break;
+   }


Why do you open code fc_block_scsi_eh() instead of calling it with 
port->rport (if it's !=NULL)?


Typically all rports would be blocked after adapter recovery, until they 
become unblocked (via zfcp's async rport_work). So we can wait for each 
in turn which should still only wait in summary for the last one to 
unblock. E.g. if the first rport takes longest we wait for it, and the 
rest of the loop will be fast since the others happen to be unblocked 
(or fast_io_fail) already?



+   }


missing read_unlock(>port_list_lock);


+   spin_unlock_irqrestore(host->host_lock, flags);
+   if (ret == NEEDS_RETRY) {
+   msleep(1000);
+   goto retry_rport_blocked;
+   }
if (ret)
return ret;



--
Mit freundlichen Grüßen / Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: Please help me with kernel hung

2017-07-24 Thread Alan Stern
On Mon, 24 Jul 2017, Михаил Гаврилов wrote:

> Hi guys. I hope what I wrote to appropriate mailing list. Can anybody
> look in hung issue below:
> 
> [  246.751384] kernel: INFO: task scsi_eh_6:388 blocked for more than
> 120 seconds.
> [  246.752085] kernel:   Not tainted 4.13.0-0.rc1.git4.1.fc27.x86_64 #1
> [  246.752187] kernel: "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  246.752312] kernel: scsi_eh_6   D13432   388  2 0x0080
> [  246.752332] kernel: Call Trace:
> [  246.752357] kernel:  __schedule+0x2dc/0xbb0
> [  246.752385] kernel:  schedule+0x3d/0x90
> [  246.752396] kernel:  schedule_preempt_disabled+0x15/0x20
> [  246.752406] kernel:  __mutex_lock+0x2f4/0xa00
> [  246.752422] kernel:  ? device_reset+0x24/0x50 [usb_storage]
> [  246.752460] kernel:  ? scsi_error_handler+0x96/0x620
> [  246.752470] kernel:  mutex_lock_nested+0x1b/0x20
> [  246.752478] kernel:  ? mutex_lock_nested+0x1b/0x20
> [  246.752489] kernel:  device_reset+0x24/0x50 [usb_storage]
> [  246.752502] kernel:  scsi_eh_ready_devs+0x37a/0xe10
> [  246.752523] kernel:  ? trace_hardirqs_on_caller+0xf4/0x190
> [  246.752538] kernel:  ? scsi_error_handler+0x5/0x620
> [  246.752546] kernel:  ? scsi_error_handler+0x96/0x620
> [  246.752555] kernel:  scsi_error_handler+0x505/0x620
> [  246.752589] kernel:  kthread+0x133/0x150
> [  246.752598] kernel:  ? scsi_eh_get_sense+0x250/0x250
> [  246.752607] kernel:  ? kthread_create_on_node+0x70/0x70
> [  246.752621] kernel:  ? kthread_create_on_node+0x70/0x70
> [  246.752632] kernel:  ret_from_fork+0x2a/0x40
> [  246.752672] kernel: INFO: task usb-storage:390 blocked for more
> than 120 seconds.
> [  246.752782] kernel:   Not tainted 4.13.0-0.rc1.git4.1.fc27.x86_64 #1
> [  246.752879] kernel: "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  246.753056] kernel: usb-storage D13584   390  2 0x0080
> [  246.753082] kernel: Call Trace:
> [  246.753114] kernel:  __schedule+0x2dc/0xbb0
> [  246.753162] kernel:  ? wait_for_completion+0x117/0x1a0
> [  246.753177] kernel:  schedule+0x3d/0x90
> [  246.753197] kernel:  schedule_timeout+0x29d/0x510
> [  246.753241] kernel:  ? mark_held_locks+0x5f/0x90
> [  246.753259] kernel:  ? _raw_spin_unlock_irq+0x2c/0x40
> [  246.753277] kernel:  ? wait_for_completion+0x117/0x1a0
> [  246.753310] kernel:  ? wait_for_completion+0x117/0x1a0
> [  246.753326] kernel:  wait_for_completion+0x136/0x1a0
> [  246.753339] kernel:  ? wait_for_completion+0x136/0x1a0
> [  246.753358] kernel:  ? wake_up_q+0x80/0x80
> [  246.753401] kernel:  usb_sg_wait+0x119/0x180
> [  246.753441] kernel:  usb_stor_bulk_transfer_sglist.part.1+0x70/0xd0
> [usb_storage]
> [  246.753477] kernel:  usb_stor_bulk_srb+0x4e/0x80 [usb_storage]
> [  246.753506] kernel:  usb_stor_Bulk_transport+0x187/0x430 [usb_storage]
> [  246.753554] kernel:  usb_stor_invoke_transport+0x3e/0x540 [usb_storage]
> [  246.753596] kernel:  ? mark_held_locks+0x5f/0x90
> [  246.753613] kernel:  ? _raw_spin_unlock_irq+0x2c/0x40
> [  246.753664] kernel:  usb_stor_transparent_scsi_command+0xe/0x10 
> [usb_storage]
> [  246.753685] kernel:  usb_stor_control_thread+0x1fb/0x280 [usb_storage]
> [  246.753731] kernel:  kthread+0x133/0x150
> [  246.753751] kernel:  ? fill_inquiry_response+0x20/0x20 [usb_storage]
> [  246.753766] kernel:  ? kthread_create_on_node+0x70/0x70
> [  246.753796] kernel:  ret_from_fork+0x2a/0x40
> [  246.753878] kernel:
>Showing all locks held in the system:
> [  246.753903] kernel: 2 locks held by kworker/6:1/63:
> [  246.753926] kernel:  #0:
> ("events_freezable_power_efficient"){.+.+.+}, at: []
> process_one_work+0x1d0/0x6a0
> [  246.754031] kernel:  #1:  ((&(>dwork)->work)){+.+.+.}, at:
> [] process_one_work+0x1d0/0x6a0
> [  246.754085] kernel: 1 lock held by khungtaskd/65:
> [  246.754095] kernel:  #0:  (tasklist_lock){.+.+..}, at:
> [] debug_show_all_locks+0x3d/0x1a0
> [  246.754156] kernel: 1 lock held by scsi_eh_6/388:
> [  246.754166] kernel:  #0:  (_interface_key[i]){+.+...}, at:
> [] device_reset+0x24/0x50 [usb_storage]
> [  246.754222] kernel: 1 lock held by usb-storage/390:
> [  246.754232] kernel:  #0:  (_interface_key[i]){+.+...}, at:
> [] usb_stor_control_thread+0x7d/0x280 [usb_storage]
> [  246.754290] kernel: 1 lock held by lvm/519:
> [  246.754300] kernel:  #0:  (>block_mutex){+.+...}, at:
> [] disk_block_events+0x33/0x90
> [  246.754357] kernel: 1 lock held by systemd-udevd/543:
> [  246.754366] kernel:  #0:  (>bd_mutex){+.+.+.}, at:
> [] __blkdev_put+0x43/0x200
> [  246.754427] kernel:
> [  246.754436] kernel: =
> 
> 
> This occured after update kernel to 4.13.0-0.rc1.git4.1 then next
> reboot. After pressing reset button and follow reboot problem not
> reproduced. But sometime kernel may hangs silently.
> 
> If I'm wrong with the mailing list, tell me please appropriate mailing list.

These are the right mailing lists.

We need 

RE: [PATCH v2 11/15] megaraid_sas: Set device queue_depth same as HBA can_queue value in scsi-mq mode

2017-07-24 Thread Shivasharan Srikanteshwara
> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Thursday, July 20, 2017 1:18 PM
> To: Shivasharan Srikanteshwara
> Cc: Christoph Hellwig; linux-scsi@vger.kernel.org;
martin.peter...@oracle.com;
> the...@redhat.com; j...@linux.vnet.ibm.com; Sumit Saxena; h...@suse.com;
> Kashyap Desai
> Subject: Re: [PATCH v2 11/15] megaraid_sas: Set device queue_depth same
as
> HBA can_queue value in scsi-mq mode
>
> I still don't understand why you don't want to do the same for the
non-mq path.

Hi Christoph,

Sorry for delay in response.

MQ case -
If there is any block layer requeue happens, we see performance drop. So
we avoid re-queue increasing Device QD = HBA QD. Performance drop due to
block layer re-queue is more in case of HDD (sequential IO converted into
random IO).

Non-MQ case.
If we increase Device QD = HBA QD for no-mq case, we see performance drop
for certain profiles.
For example SATA SSD, previous driver in non-mq set Device QD=32. In this
case, if we have more outstanding IO per device (more than 32), block
layer attempts soft merge and eventually end user experience higher
performance due to block layer attempting soft merge.  Same is not correct
in MQ case, as IO scheduler in MQ adds overhead if at all there is any
throttling or staging due to device QD.

Below is example of single SATA SSD, Sequential Read, BS=4K, IO depth =
256

MQ enable, Device QD = 32 achieves 137K IOPS
MQ enable, Device QD = 916 (HBA QD) achieves 145K IOPS

MQ disable, Device QD = 32 achieves 237K IOPS
MQ disable, Device QD = 916 (HBA QD) achieves 145K IOPS

Ideally we want to keep same QD settings in non-MQ mode, but trying to
avoid as we may face some regression from end user as explained.

Thanks,
Shivasharan


[patch 4/5] scsi/bnx2fc: Simplify CPU hotplug code

2017-07-24 Thread Thomas Gleixner
The CPU hotplug related code of this driver can be simplified by:

1) Consolidating the callbacks into a single state. The CPU thread can be
   torn down on the CPU which goes offline. There is no point in delaying
   that to the CPU dead state

2) Let the core code invoke the online/offline callbacks and remove the
   extra for_each_online_cpu() loops.

Signed-off-by: Thomas Gleixner 
---
 drivers/scsi/bnx2fc/bnx2fc_fcoe.c |   69 --
 include/linux/cpuhotplug.h|1 
 2 files changed, 15 insertions(+), 55 deletions(-)

--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -2624,12 +2624,11 @@ static struct fcoe_transport bnx2fc_tran
 };
 
 /**
- * bnx2fc_percpu_thread_create - Create a receive thread for an
- *  online CPU
+ * bnx2fc_cpu_online - Create a receive thread for an  online CPU
  *
  * @cpu: cpu index for the online cpu
  */
-static void bnx2fc_percpu_thread_create(unsigned int cpu)
+static int bnx2fc_cpu_online(unsigned int cpu)
 {
struct bnx2fc_percpu_s *p;
struct task_struct *thread;
@@ -2639,15 +2638,17 @@ static void bnx2fc_percpu_thread_create(
thread = kthread_create_on_node(bnx2fc_percpu_io_thread,
(void *)p, cpu_to_node(cpu),
"bnx2fc_thread/%d", cpu);
+   if (IS_ERR(thread))
+   return PTR_ERR(thread);
+
/* bind thread to the cpu */
-   if (likely(!IS_ERR(thread))) {
-   kthread_bind(thread, cpu);
-   p->iothread = thread;
-   wake_up_process(thread);
-   }
+   kthread_bind(thread, cpu);
+   p->iothread = thread;
+   wake_up_process(thread);
+   return 0;
 }
 
-static void bnx2fc_percpu_thread_destroy(unsigned int cpu)
+static int bnx2fc_cpu_offline(unsigned int cpu)
 {
struct bnx2fc_percpu_s *p;
struct task_struct *thread;
@@ -2661,7 +2662,6 @@ static void bnx2fc_percpu_thread_destroy
thread = p->iothread;
p->iothread = NULL;
 
-
/* Free all work in the list */
list_for_each_entry_safe(work, tmp, >work_list, list) {
list_del_init(>list);
@@ -2673,20 +2673,6 @@ static void bnx2fc_percpu_thread_destroy
 
if (thread)
kthread_stop(thread);
-}
-
-
-static int bnx2fc_cpu_online(unsigned int cpu)
-{
-   printk(PFX "CPU %x online: Create Rx thread\n", cpu);
-   bnx2fc_percpu_thread_create(cpu);
-   return 0;
-}
-
-static int bnx2fc_cpu_dead(unsigned int cpu)
-{
-   printk(PFX "CPU %x offline: Remove Rx thread\n", cpu);
-   bnx2fc_percpu_thread_destroy(cpu);
return 0;
 }
 
@@ -2761,31 +2747,16 @@ static int __init bnx2fc_mod_init(void)
spin_lock_init(>fp_work_lock);
}
 
-   get_online_cpus();
-
-   for_each_online_cpu(cpu)
-   bnx2fc_percpu_thread_create(cpu);
-
-   rc = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ONLINE_DYN,
- "scsi/bnx2fc:online",
- bnx2fc_cpu_online, NULL);
+   rc = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "scsi/bnx2fc:online",
+  bnx2fc_cpu_online, bnx2fc_cpu_offline);
if (rc < 0)
-   goto stop_threads;
+   goto stop_thread;
bnx2fc_online_state = rc;
 
-   cpuhp_setup_state_nocalls_cpuslocked(CPUHP_SCSI_BNX2FC_DEAD,
-"scsi/bnx2fc:dead",
-NULL, bnx2fc_cpu_dead);
-   put_online_cpus();
-
cnic_register_driver(CNIC_ULP_FCOE, _cnic_cb);
-
return 0;
 
-stop_threads:
-   for_each_online_cpu(cpu)
-   bnx2fc_percpu_thread_destroy(cpu);
-   put_online_cpus();
+stop_thread:
kthread_stop(l2_thread);
 free_wq:
destroy_workqueue(bnx2fc_wq);
@@ -2804,7 +2775,6 @@ static void __exit bnx2fc_mod_exit(void)
struct fcoe_percpu_s *bg;
struct task_struct *l2_thread;
struct sk_buff *skb;
-   unsigned int cpu = 0;
 
/*
 * NOTE: Since cnic calls register_driver routine rtnl_lock,
@@ -2845,16 +2815,7 @@ static void __exit bnx2fc_mod_exit(void)
if (l2_thread)
kthread_stop(l2_thread);
 
-   get_online_cpus();
-   /* Destroy per cpu threads */
-   for_each_online_cpu(cpu) {
-   bnx2fc_percpu_thread_destroy(cpu);
-   }
-
-   cpuhp_remove_state_nocalls_cpuslocked(bnx2fc_online_state);
-   cpuhp_remove_state_nocalls_cpuslocked(CPUHP_SCSI_BNX2FC_DEAD);
-
-   put_online_cpus();
+   cpuhp_remove_state(bnx2fc_online_state);
 
destroy_workqueue(bnx2fc_wq);
/*
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -39,7 +39,6 @@ enum cpuhp_state {
CPUHP_PCI_XGENE_DEAD,

[patch 2/5] scsi/bnx2fc: Prevent recursive cpuhotplug locking

2017-07-24 Thread Thomas Gleixner
The BNX2FC module init/exit code installs/removes the hotplug callbacks with
the cpu hotplug lock held. This worked with the old CPU locking
implementation which allowed recursive locking, but with the new percpu
rwsem based mechanism this is not longer allowed.

Use the _cpuslocked() variants to fix this.

Reported-by: kernel test robot 
Signed-off-by: Thomas Gleixner 
---
 drivers/scsi/bnx2fc/bnx2fc_fcoe.c |   15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -2766,15 +2766,16 @@ static int __init bnx2fc_mod_init(void)
for_each_online_cpu(cpu)
bnx2fc_percpu_thread_create(cpu);
 
-   rc = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
-  "scsi/bnx2fc:online",
-  bnx2fc_cpu_online, NULL);
+   rc = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ONLINE_DYN,
+ "scsi/bnx2fc:online",
+ bnx2fc_cpu_online, NULL);
if (rc < 0)
goto stop_threads;
bnx2fc_online_state = rc;
 
-   cpuhp_setup_state_nocalls(CPUHP_SCSI_BNX2FC_DEAD, "scsi/bnx2fc:dead",
- NULL, bnx2fc_cpu_dead);
+   cpuhp_setup_state_nocalls_cpuslocked(CPUHP_SCSI_BNX2FC_DEAD,
+"scsi/bnx2fc:dead",
+NULL, bnx2fc_cpu_dead);
put_online_cpus();
 
cnic_register_driver(CNIC_ULP_FCOE, _cnic_cb);
@@ -2850,8 +2851,8 @@ static void __exit bnx2fc_mod_exit(void)
bnx2fc_percpu_thread_destroy(cpu);
}
 
-   cpuhp_remove_state_nocalls(bnx2fc_online_state);
-   cpuhp_remove_state_nocalls(CPUHP_SCSI_BNX2FC_DEAD);
+   cpuhp_remove_state_nocalls_cpuslocked(bnx2fc_online_state);
+   cpuhp_remove_state_nocalls_cpuslocked(CPUHP_SCSI_BNX2FC_DEAD);
 
put_online_cpus();
 




[patch 3/5] scsi/bnx2i: Prevent recursive cpuhotplug locking

2017-07-24 Thread Thomas Gleixner
The BNX2I module init/exit code installs/removes the hotplug callbacks with
the cpu hotplug lock held. This worked with the old CPU locking
implementation which allowed recursive locking, but with the new percpu
rwsem based mechanism this is not longer allowed.

Use the _cpuslocked() variants to fix this.

Reported-by: Steven Rostedt 
Signed-off-by: Thomas Gleixner 
---
 drivers/scsi/bnx2i/bnx2i_init.c |   15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

--- a/drivers/scsi/bnx2i/bnx2i_init.c
+++ b/drivers/scsi/bnx2i/bnx2i_init.c
@@ -516,15 +516,16 @@ static int __init bnx2i_mod_init(void)
for_each_online_cpu(cpu)
bnx2i_percpu_thread_create(cpu);
 
-   err = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
-  "scsi/bnx2i:online",
-  bnx2i_cpu_online, NULL);
+   err = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ONLINE_DYN,
+  "scsi/bnx2i:online",
+  bnx2i_cpu_online, NULL);
if (err < 0)
goto remove_threads;
bnx2i_online_state = err;
 
-   cpuhp_setup_state_nocalls(CPUHP_SCSI_BNX2I_DEAD, "scsi/bnx2i:dead",
- NULL, bnx2i_cpu_dead);
+   cpuhp_setup_state_nocalls_cpuslocked(CPUHP_SCSI_BNX2I_DEAD,
+"scsi/bnx2i:dead",
+NULL, bnx2i_cpu_dead);
put_online_cpus();
return 0;
 
@@ -574,8 +575,8 @@ static void __exit bnx2i_mod_exit(void)
for_each_online_cpu(cpu)
bnx2i_percpu_thread_destroy(cpu);
 
-   cpuhp_remove_state_nocalls(bnx2i_online_state);
-   cpuhp_remove_state_nocalls(CPUHP_SCSI_BNX2I_DEAD);
+   cpuhp_remove_state_nocalls_cpuslocked(bnx2i_online_state);
+   cpuhp_remove_state_nocalls_cpuslocked(CPUHP_SCSI_BNX2I_DEAD);
put_online_cpus();
 
iscsi_unregister_transport(_iscsi_transport);




[patch 0/5] scsi/bnx2*: Plug hotplug race, correct locking and simplify hotplug code

2017-07-24 Thread Thomas Gleixner
The conversion of the cpu hotplug locking to a percpu rwsem does not longer
allow recursive locking of the hotplug lock.

The BNX2I and BNX2FC drivers install/remove hotplug states with the hotplug
lock held. The install/removal code acquired the hotplug lock as well.

While looking into this, I noticed an interesting hotplug race in the
BNX2FC driver, which could result in dereferencing a NULL pointer or freed
and potentially reused memory.

The following series addresses these problems and as a final step on top it
simplifies the hotplug code in both drivers.

Thanks,

tglx


 drivers/scsi/bnx2fc/bnx2fc_fcoe.c |   68 --
 drivers/scsi/bnx2fc/bnx2fc_hwi.c  |   45 -
 drivers/scsi/bnx2i/bnx2i_init.c   |   64 ---
 include/linux/cpuhotplug.h|2 -
 4 files changed, 53 insertions(+), 126 deletions(-)





[patch 5/5] scsi/bnx2i: Simplify cpu hotplug code

2017-07-24 Thread Thomas Gleixner
The CPU hotplug related code of this driver can be simplified by:

1) Consolidating the callbacks into a single state. The CPU thread can be
   torn down on the CPU which goes offline. There is no point in delaying
   that to the CPU dead state

2) Let the core code invoke the online/offline callbacks and remove the
   extra for_each_online_cpu() loops.

Signed-off-by: Thomas Gleixner 
---
 drivers/scsi/bnx2i/bnx2i_init.c |   65 +---
 include/linux/cpuhotplug.h  |1 
 2 files changed, 15 insertions(+), 51 deletions(-)

--- a/drivers/scsi/bnx2i/bnx2i_init.c
+++ b/drivers/scsi/bnx2i/bnx2i_init.c
@@ -404,12 +404,11 @@ int bnx2i_get_stats(void *handle)
 
 
 /**
- * bnx2i_percpu_thread_create - Create a receive thread for an
- * online CPU
+ * bnx2i_cpu_online - Create a receive thread for an online CPU
  *
  * @cpu:   cpu index for the online cpu
  */
-static void bnx2i_percpu_thread_create(unsigned int cpu)
+static int bnx2i_cpu_online(unsigned int cpu)
 {
struct bnx2i_percpu_s *p;
struct task_struct *thread;
@@ -419,16 +418,17 @@ static void bnx2i_percpu_thread_create(u
thread = kthread_create_on_node(bnx2i_percpu_io_thread, (void *)p,
cpu_to_node(cpu),
"bnx2i_thread/%d", cpu);
+   if (IS_ERR(thread))
+   return PTR_ERR(thread);
+
/* bind thread to the cpu */
-   if (likely(!IS_ERR(thread))) {
-   kthread_bind(thread, cpu);
-   p->iothread = thread;
-   wake_up_process(thread);
-   }
+   kthread_bind(thread, cpu);
+   p->iothread = thread;
+   wake_up_process(thread);
+   return 0;
 }
 
-
-static void bnx2i_percpu_thread_destroy(unsigned int cpu)
+static int bnx2i_cpu_offline(unsigned int cpu)
 {
struct bnx2i_percpu_s *p;
struct task_struct *thread;
@@ -451,19 +451,6 @@ static void bnx2i_percpu_thread_destroy(
spin_unlock_bh(>p_work_lock);
if (thread)
kthread_stop(thread);
-}
-
-static int bnx2i_cpu_online(unsigned int cpu)
-{
-   pr_info("bnx2i: CPU %x online: Create Rx thread\n", cpu);
-   bnx2i_percpu_thread_create(cpu);
-   return 0;
-}
-
-static int bnx2i_cpu_dead(unsigned int cpu)
-{
-   pr_info("CPU %x offline: Remove Rx thread\n", cpu);
-   bnx2i_percpu_thread_destroy(cpu);
return 0;
 }
 
@@ -511,28 +498,14 @@ static int __init bnx2i_mod_init(void)
p->iothread = NULL;
}
 
-   get_online_cpus();
-
-   for_each_online_cpu(cpu)
-   bnx2i_percpu_thread_create(cpu);
-
-   err = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ONLINE_DYN,
-  "scsi/bnx2i:online",
-  bnx2i_cpu_online, NULL);
+   err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "scsi/bnx2i:online",
+   bnx2i_cpu_online, bnx2i_cpu_offline);
if (err < 0)
-   goto remove_threads;
+   goto unreg_driver;
bnx2i_online_state = err;
-
-   cpuhp_setup_state_nocalls_cpuslocked(CPUHP_SCSI_BNX2I_DEAD,
-"scsi/bnx2i:dead",
-NULL, bnx2i_cpu_dead);
-   put_online_cpus();
return 0;
 
-remove_threads:
-   for_each_online_cpu(cpu)
-   bnx2i_percpu_thread_destroy(cpu);
-   put_online_cpus();
+unreg_driver:
cnic_unregister_driver(CNIC_ULP_ISCSI);
 unreg_xport:
iscsi_unregister_transport(_iscsi_transport);
@@ -552,7 +525,6 @@ static int __init bnx2i_mod_init(void)
 static void __exit bnx2i_mod_exit(void)
 {
struct bnx2i_hba *hba;
-   unsigned cpu = 0;
 
mutex_lock(_dev_lock);
while (!list_empty(_list)) {
@@ -570,14 +542,7 @@ static void __exit bnx2i_mod_exit(void)
}
mutex_unlock(_dev_lock);
 
-   get_online_cpus();
-
-   for_each_online_cpu(cpu)
-   bnx2i_percpu_thread_destroy(cpu);
-
-   cpuhp_remove_state_nocalls_cpuslocked(bnx2i_online_state);
-   cpuhp_remove_state_nocalls_cpuslocked(CPUHP_SCSI_BNX2I_DEAD);
-   put_online_cpus();
+   cpuhp_remove_state(bnx2i_online_state);
 
iscsi_unregister_transport(_iscsi_transport);
cnic_unregister_driver(CNIC_ULP_ISCSI);
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -39,7 +39,6 @@ enum cpuhp_state {
CPUHP_PCI_XGENE_DEAD,
CPUHP_IOMMU_INTEL_DEAD,
CPUHP_LUSTRE_CFS_DEAD,
-   CPUHP_SCSI_BNX2I_DEAD,
CPUHP_WORKQUEUE_PREP,
CPUHP_POWER_NUMA_PREPARE,
CPUHP_HRTIMERS_PREPARE,




[patch 1/5] scsi/bnx2fc: Plug CPU hotplug race

2017-07-24 Thread Thomas Gleixner
bnx2fc_process_new_cqes() has protection against CPU hotplug, which relies
on the per cpu thread pointer. This protection is racy because it happens
only partially with the per cpu fp_work_lock held.

If the CPU is unplugged after the lock is dropped, the wakeup code can
dereference a NULL pointer or access freed and potentially reused memory.

Restructure the code so the thread check and wakeup happens with the
fp_work_lock held.

Signed-off-by: Thomas Gleixner 
---
 drivers/scsi/bnx2fc/bnx2fc_hwi.c |   45 +++
 1 file changed, 23 insertions(+), 22 deletions(-)

--- a/drivers/scsi/bnx2fc/bnx2fc_hwi.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_hwi.c
@@ -1008,6 +1008,28 @@ static struct bnx2fc_work *bnx2fc_alloc_
return work;
 }
 
+/* Pending work request completion */
+static void bnx2fc_pending_work(struct bnx2fc_rport *tgt, unsigned int wqe)
+{
+   unsigned int cpu = wqe % num_possible_cpus();
+   struct bnx2fc_percpu_s *fps;
+   struct bnx2fc_work *work;
+
+   fps = _cpu(bnx2fc_percpu, cpu);
+   spin_lock_bh(>fp_work_lock);
+   if (fps->iothread) {
+   work = bnx2fc_alloc_work(tgt, wqe);
+   if (work) {
+   list_add_tail(>list, >work_list);
+   wake_up_process(fps->iothread);
+   spin_unlock_bh(>fp_work_lock);
+   return;
+   }
+   }
+   spin_unlock_bh(>fp_work_lock);
+   bnx2fc_process_cq_compl(tgt, wqe);
+}
+
 int bnx2fc_process_new_cqes(struct bnx2fc_rport *tgt)
 {
struct fcoe_cqe *cq;
@@ -1042,28 +1064,7 @@ int bnx2fc_process_new_cqes(struct bnx2f
/* Unsolicited event notification */
bnx2fc_process_unsol_compl(tgt, wqe);
} else {
-   /* Pending work request completion */
-   struct bnx2fc_work *work = NULL;
-   struct bnx2fc_percpu_s *fps = NULL;
-   unsigned int cpu = wqe % num_possible_cpus();
-
-   fps = _cpu(bnx2fc_percpu, cpu);
-   spin_lock_bh(>fp_work_lock);
-   if (unlikely(!fps->iothread))
-   goto unlock;
-
-   work = bnx2fc_alloc_work(tgt, wqe);
-   if (work)
-   list_add_tail(>list,
- >work_list);
-unlock:
-   spin_unlock_bh(>fp_work_lock);
-
-   /* Pending work request completion */
-   if (fps->iothread && work)
-   wake_up_process(fps->iothread);
-   else
-   bnx2fc_process_cq_compl(tgt, wqe);
+   bnx2fc_pending_work(tgt, wqe);
num_free_sqes++;
}
cqe++;




[PATCH] scsi_transport_fc: return -EBUSY for deleted vport

2017-07-24 Thread Hannes Reinecke
When trying to delete a vport via 'vport_delete' sysfs attribute
we should be checking if the port is already in state VPORT_DELETING;
if so there's no need to do anything.

Signed-off-by: Hannes Reinecke 
---
 drivers/scsi/scsi_transport_fc.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 6dd0922..ec9837e 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -1286,7 +1286,7 @@ static FC_DEVICE_ATTR(vport, symbolic_name, S_IRUGO | 
S_IWUSR,
unsigned long flags;
 
spin_lock_irqsave(shost->host_lock, flags);
-   if (vport->flags & (FC_VPORT_DEL | FC_VPORT_CREATING)) {
+   if (vport->flags & (FC_VPORT_DEL | FC_VPORT_CREATING | 
FC_VPORT_DELETING)) {
spin_unlock_irqrestore(shost->host_lock, flags);
return -EBUSY;
}
@@ -2430,8 +2430,10 @@ void fc_release_transport(struct scsi_transport_template 
*t)
spin_lock_irqsave(shost->host_lock, flags);
 
/* Remove any vports */
-   list_for_each_entry_safe(vport, next_vport, _host->vports, peers)
+   list_for_each_entry_safe(vport, next_vport, _host->vports, peers) {
+   vport->flags |= FC_VPORT_DELETING;
fc_queue_work(shost, >vport_delete_work);
+   }
 
/* Remove any remote ports */
list_for_each_entry_safe(rport, next_rport,
-- 
1.8.5.6



Re: [PATCH 26/28] visorhba: sanitze private device data allocation

2017-07-24 Thread Hannes Reinecke
On 07/19/2017 07:45 PM, Kershner, David A wrote:
> 
> 
>> -Original Message-
>> From: Hannes Reinecke [mailto:h...@suse.de]
>> Sent: Wednesday, June 28, 2017 4:25 AM
>> To: Christoph Hellwig 
>> Subject: [PATCH 26/28] visorhba: sanitze private device data allocation
>>
>> There's no need to keep the private data for a device in a separate
>> list; better to store it in ->hostdata and do away with the additional
>> list.
>>
>> Signed-off-by: Hannes Reinecke 
>> Cc: David Kershner 
> 
> I'm not sure what the process for this patch was to get into the upstream, do
> you know the forecast for it? I.e. when it will be available in Linus's tree? 
> 
> We have several more changes that I would like to make to this driver and
> to simplify them, I would like to put them on top of this change, would you
> mind if I submitted it to GregKH for the staging branch if it is going to take
> some time?
> 
> Sorry for the newbie questions.
> 
typically the maintainer (that'll be you :-) does a review, and replies
with something like:

Patch is okay;

Reviewed-by: David Kershner 

Then the subsystem maintainer (ie Martin Petersen) will be picking it
up, and queueing it in his subsystem tree for eventual merge with linux
upstream.

The first step has been completed, so we're waiting for Martin to pick
it up. It might be some time, though; depending on the merge window and
his overall workload.

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH] aic7xxx: fix firmware build with O=path

2017-07-24 Thread Hannes Reinecke
On 07/21/2017 08:51 PM, Jakub Kicinski wrote:
> On Tue, 18 Jul 2017 18:58:34 -0700, Jakub Kicinski wrote:
>> Building firmware with O=path was apparently broken in aic7 for ever.
>> Message of the previous commit to the Makefile (from 2008) mentions
>> this unfortunate state of affairs already.  Fix this, mostly to make
>> randconfig builds more reliable.
>>
>> Signed-off-by: Jakub Kicinski 
> 
> Gentle ping :)  Hannes, are you still maintaining aic7xxx?
> 
Yes, I do.

Patch looks reasonable; Kconfig is still black magic to me, though :-)

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.com  +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH] scsi: qedi: select CONFIG_ISCSI_BOOT_SYSFS

2017-07-24 Thread Javali, Nilesh

On 21/07/17, 9:41 PM, "Arnd Bergmann"  wrote:

>Without the base library support, we get a link failure
>
>drivers/scsi/qedi/qedi_main.o: In function `__qedi_probe.constprop.0':
>qedi_main.c:(.text+0x2d8e): undefined reference to
>`iscsi_boot_create_target'
>qedi_main.c:(.text+0x2dee): undefined reference to
>`iscsi_boot_create_initiator'
>qedi_main.c:(.text+0x2e1c): undefined reference to
>`iscsi_boot_create_ethernet'
>
>This selects the Kconfig symbol like the other two users of that
>module do.
>
>Fixes: c57ec8fb7c02 ("scsi: qedi: Add support for Boot from SAN over
>iSCSI offload")
>Signed-off-by: Arnd Bergmann 
>---
> drivers/scsi/qedi/Kconfig | 1 +
> 1 file changed, 1 insertion(+)
>
>diff --git a/drivers/scsi/qedi/Kconfig b/drivers/scsi/qedi/Kconfig
>index 21331453db7b..8deb8723c4dd 100644
>--- a/drivers/scsi/qedi/Kconfig
>+++ b/drivers/scsi/qedi/Kconfig
>@@ -2,6 +2,7 @@ config QEDI
>   tristate "QLogic QEDI 25/40/100Gb iSCSI Initiator Driver Support"
>   depends on PCI && SCSI && UIO
>   depends on QED
>+  select ISCSI_BOOT_SYSFS
>   select SCSI_ISCSI_ATTRS
>   select QED_LL2
>   select QED_ISCSI
>-- 
>2.9.0
>

NACK. The fix already posted to address this issue,
http://marc.info/?l=linux-scsi=150045528332067=2

Thanks,
Nilesh