[PATCH 2/2] zfcp: improve kdoc for return of zfcp_status_read_refill()

2018-12-06 Thread Steffen Maier
Complements

v2.6.35 commit 64deb6efdc55
("[SCSI] zfcp: Use status_read_buf_num provided by FCP channel")
which replaced the hardcoded 16 with a variable value

Also complements already existing fixups for above commit

v2.6.35 commit 8d88cf3f3b9a
("[SCSI] zfcp: Update status read mempool")
v3.10   commit 9edf7d75ee5f
("[SCSI] zfcp: status read buffers on first adapter open with link down")

Signed-off-by: Steffen Maier 
Reviewed-by: Jens Remus 
---
 drivers/s390/scsi/zfcp_aux.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_aux.c b/drivers/s390/scsi/zfcp_aux.c
index 882789fff574..9cf30d124b9e 100644
--- a/drivers/s390/scsi/zfcp_aux.c
+++ b/drivers/s390/scsi/zfcp_aux.c
@@ -264,10 +264,10 @@ static void zfcp_free_low_mem_buffers(struct zfcp_adapter *adapter)
  * zfcp_status_read_refill - refill the long running status_read_requests
  * @adapter: ptr to struct zfcp_adapter for which the buffers should be refilled
  *
- * Returns: 0 on success, 1 otherwise
- *
- * if there are 16 or more status_read requests missing an adapter_reopen
- * is triggered
+ * Return:
+ * * 0 on success meaning at least one status read is pending
+ * * 1 if posting failed and not a single status read buffer is pending,
+ * also triggers adapter reopen recovery
  */
 int zfcp_status_read_refill(struct zfcp_adapter *adapter)
 {
-- 
2.16.4



[PATCH 1/2] zfcp: fix posting too many status read buffers leading to adapter shutdown

2018-12-06 Thread Steffen Maier
Suppose adapter (open) recovery has opened the QDIO queues but has not yet
finished the initial posting of status read buffers (SRBs). This time
window can be seconds long because FSF_PROT_HOST_CONNECTION_INITIALIZING
by design causes looping with exponentially increasing sleeps in the function
performing exchange config data during recovery
[zfcp_erp_adapter_strat_fsf_xconf()]. The recovery was triggered by local link up.

Suppose an event occurs for which the FCP channel would send an
unsolicited notification to zfcp by means of a previously posted SRB.
We saw it with local cable pull (link down) in multi-initiator zoning
with multiple NPIV-enabled subchannels of the same shared FCP channel.

As soon as zfcp_erp_adapter_strategy_open_fsf() starts posting the
initial status read buffers from within the adapter's ERP thread,
the channel does send an unsolicited notification.

Since v2.6.27 commit d26ab06ede83 ("[SCSI] zfcp: receiving an unsolicted
status can lead to I/O stall"), zfcp_fsf_status_read_handler() schedules
adapter->stat_work to re-fill the just consumed SRB from a work item.

Now the ERP thread and the work item post SRBs in parallel.
Both contexts call the helper function zfcp_status_read_refill().
The tracking of missing (to be posted / re-filled) SRBs is not thread-safe
due to separate atomic_read() and atomic_dec(), in order to depend on
posting success. Hence, both contexts can see
atomic_read(&adapter->stat_miss) == 1. One of the two contexts then posts
one SRB too many. Zfcp gets QDIO_ERROR_SLSB_STATE on the output queue
(trace tag "qdireq1") leading to zfcp_erp_adapter_shutdown() in
zfcp_qdio_handler_error().

An obvious and seemingly clean fix would be to schedule stat_work
from the ERP thread and wait for it to finish. This would serialize
all SRB re-fills. However, we already have another work item wait on the
ERP thread: adapter->scan_work runs zfcp_fc_scan_ports() which calls
zfcp_fc_eval_gpn_ft(). The latter calls zfcp_erp_wait() to wait for all
the open port recoveries during zfcp auto port scan, but in fact it waits
for any pending recovery including an adapter recovery. This approach
leads to a deadlock.
[see also v3.19 commit 18f87a67e6d6 ("zfcp: auto port scan resiliency");
v2.6.37 commit d3e1088d6873
("[SCSI] zfcp: No ERP escalation on gpn_ft eval");
v2.6.28 commit fca55b6fb587
("[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP")
fixing v2.6.27 commit c57a39a45a76
("[SCSI] zfcp: wait until adapter is finished with ERP during auto-port");
v2.6.27 commit cc8c282963bd
("[SCSI] zfcp: Automatically attach remote ports")]

Instead make the accounting of missing SRBs atomic for parallel
execution in both the ERP thread and adapter->stat_work.
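
The difference between the racy check-then-decrement and an atomic claim can be sketched in userspace C11. This is only an illustration under assumptions: claim_missing_srb(), refill() and the two post callbacks are hypothetical names, and the compare-exchange loop merely stands in for what atomic_add_unless(v, -1, 0) provides in the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for atomic_add_unless(v, -1, 0):
 * decrement *miss only if it is still above zero, as one atomic step.
 */
static bool claim_missing_srb(atomic_int *miss)
{
	int old = atomic_load(miss);

	while (old > 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(miss, &old, old - 1))
			return true;
	}
	return false;
}

/* Hypothetical post callbacks: 0 means the buffer was posted. */
static int post_ok(void)   { return 0; }
static int post_fail(void) { return 1; }

/* Same control flow as the patched helper, with illustrative parameters. */
static int refill(atomic_int *miss, int buf_num, int (*post)(void))
{
	while (claim_missing_srb(miss)) {
		if (post()) {
			atomic_fetch_add(miss, 1);	/* undo the claim */
			if (atomic_load(miss) >= buf_num)
				return 1;	/* nothing pending at all */
			break;
		}
	}
	return 0;
}
```

Because each context claims a missing SRB before posting it, two contexts that both observe a count of 1 cannot both post: only one compare-exchange succeeds and the other sees 0.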

Signed-off-by: Steffen Maier 
Fixes: d26ab06ede83 ("[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall")
Cc:  #2.6.27+
Reviewed-by: Jens Remus 
---
 drivers/s390/scsi/zfcp_aux.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_aux.c b/drivers/s390/scsi/zfcp_aux.c
index df10f4e07a4a..882789fff574 100644
--- a/drivers/s390/scsi/zfcp_aux.c
+++ b/drivers/s390/scsi/zfcp_aux.c
@@ -271,16 +271,16 @@ static void zfcp_free_low_mem_buffers(struct zfcp_adapter *adapter)
  */
 int zfcp_status_read_refill(struct zfcp_adapter *adapter)
 {
-	while (atomic_read(&adapter->stat_miss) > 0)
+	while (atomic_add_unless(&adapter->stat_miss, -1, 0))
 		if (zfcp_fsf_status_read(adapter->qdio)) {
+			atomic_inc(&adapter->stat_miss); /* undo add -1 */
 			if (atomic_read(&adapter->stat_miss) >=
 			    adapter->stat_read_buf_num) {
 				zfcp_erp_adapter_reopen(adapter, 0, "axsref1");
 				return 1;
 			}
 			break;
-		} else
-			atomic_dec(&adapter->stat_miss);
+		}
 	return 0;
 }
 
-- 
2.16.4



[PATCH 0/2] zfcp: small bugfix on top of previous v4.21 patches

2018-12-06 Thread Steffen Maier
James, Martin,

One new recovery fix, which is not urgent, for an old bug.
It's sufficient to apply it on top of the previously sent
23 zfcp updates for the v4.21 merge window
[https://www.spinics.net/lists/linux-scsi/msg125211.html].
The 2 new patches apply to Martin's 4.21/scsi-queue
and to James' misc branch.

Steffen Maier (2):
  zfcp: fix posting too many status read buffers leading to adapter
shutdown
  zfcp: improve kdoc for return of zfcp_status_read_refill()

 drivers/s390/scsi/zfcp_aux.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

-- 
2.16.4



[PATCH v2 01/23] zfcp: make DIX experimental, disabled, and independent of DIF

2018-11-29 Thread Steffen Maier
From: Fedor Loshakov 

Introduce separate zfcp module parameters to individually
select support for:
DIF, which should work (zfcp.dif, which used to mean DIF+DIX, disabled by default), or
DIX+DIF, which can cause trouble (zfcp.dix, new, disabled by default).

If DIX is enabled, we warn on zfcp driver initialization.
As before, this also reduces the maximum I/O request size to half,
to support the worst case of merged single sector requests with one
protection data scatter gather element per sector. This can impact the
maximum throughput.

In DIF-only mode (zfcp.dif=1 zfcp.dix=0), we can use the full maximum
I/O request size as there is no protection data for zfcp.
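
How the two parameters combine into a protection mask can be sketched as follows. This is a simplified illustration, not the driver code: prot_mask(), struct adapter_caps and the PROT_* bit values are hypothetical stand-ins for the adapter feature checks and the SHOST_* constants.

```c
#include <stdbool.h>

/* Illustrative bit values, not the kernel's SHOST_* constants. */
enum {
	PROT_DIF_TYPE1 = 1 << 0,
	PROT_DIX_TYPE1 = 1 << 1,
};

/* What the hardware and adapter state allow (hypothetical struct). */
struct adapter_caps {
	bool dif_type1;	/* adapter offers DIF type 1 */
	bool dix_ip;	/* adapter offers DIX with IP checksum guard */
	bool data_div;	/* data division is enabled */
};

static unsigned int prot_mask(bool dif, bool dix, struct adapter_caps caps)
{
	unsigned int mask = 0;

	/* zfcp.dix implies DIF, so either parameter enables DIF. */
	if ((dif || dix) && caps.dif_type1)
		mask |= PROT_DIF_TYPE1;

	/* DIX needs its own parameter plus data division. */
	if (dix && caps.data_div && caps.dix_ip)
		mask |= PROT_DIX_TYPE1;

	return mask;
}
```

With dif=1 and dix=0 only the DIF bit is set, which is the DIF-only mode described above.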

Signed-off-by: Steffen Maier 
Co-developed-by: Fedor Loshakov 
Signed-off-by: Fedor Loshakov 
Reviewed-by: Jens Remus 
---

Changes in description since v1:
Don't erroneously blame non-zfcp code for DIX issues.
Explain technical reasons why DIF-only mode is interesting for zfcp.

 drivers/s390/scsi/zfcp_aux.c  |  3 +++
 drivers/s390/scsi/zfcp_ext.h  |  1 +
 drivers/s390/scsi/zfcp_scsi.c | 10 +++---
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_aux.c b/drivers/s390/scsi/zfcp_aux.c
index 94f4d8fe85e0..08cdc00e8299 100644
--- a/drivers/s390/scsi/zfcp_aux.c
+++ b/drivers/s390/scsi/zfcp_aux.c
@@ -124,6 +124,9 @@ static int __init zfcp_module_init(void)
 {
int retval = -ENOMEM;
 
+	if (zfcp_experimental_dix)
+		pr_warn("DIX is enabled. It is experimental and might cause problems\n");
+
zfcp_fsf_qtcb_cache = zfcp_cache_hw_align("zfcp_fsf_qtcb",
  sizeof(struct fsf_qtcb));
if (!zfcp_fsf_qtcb_cache)
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index bd0c5a9f04cb..0940bef35020 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -144,6 +144,7 @@ extern void zfcp_qdio_close(struct zfcp_qdio *);
 extern void zfcp_qdio_siosl(struct zfcp_adapter *);
 
 /* zfcp_scsi.c */
+extern bool zfcp_experimental_dix;
 extern struct scsi_transport_template *zfcp_scsi_transport_template;
 extern int zfcp_scsi_adapter_register(struct zfcp_adapter *);
 extern void zfcp_scsi_adapter_unregister(struct zfcp_adapter *);
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index a8efcb330bc1..2b8c33627460 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -27,7 +27,11 @@ MODULE_PARM_DESC(queue_depth, "Default queue depth for new SCSI devices");
 
 static bool enable_dif;
 module_param_named(dif, enable_dif, bool, 0400);
-MODULE_PARM_DESC(dif, "Enable DIF/DIX data integrity support");
+MODULE_PARM_DESC(dif, "Enable DIF data integrity support (default off)");
+
+bool zfcp_experimental_dix;
+module_param_named(dix, zfcp_experimental_dix, bool, 0400);
+MODULE_PARM_DESC(dix, "Enable experimental DIX (data integrity extension) support which implies DIF support (default off)");
 
 static bool allow_lun_scan = true;
 module_param(allow_lun_scan, bool, 0600);
@@ -788,11 +792,11 @@ void zfcp_scsi_set_prot(struct zfcp_adapter *adapter)
 	data_div = atomic_read(&adapter->status) &
 		   ZFCP_STATUS_ADAPTER_DATA_DIV_ENABLED;
 
-   if (enable_dif &&
+   if ((enable_dif || zfcp_experimental_dix) &&
adapter->adapter_features & FSF_FEATURE_DIF_PROT_TYPE1)
mask |= SHOST_DIF_TYPE1_PROTECTION;
 
-   if (enable_dif && data_div &&
+   if (zfcp_experimental_dix && data_div &&
adapter->adapter_features & FSF_FEATURE_DIX_PROT_TCPIP) {
mask |= SHOST_DIX_TYPE1_PROTECTION;
scsi_host_set_guard(shost, SHOST_DIX_GUARD_IP);
-- 
2.16.4



Re: [PATCH 22/23] zfcp: drop default switch case which might paper over missing case

2018-11-22 Thread Steffen Maier

On 11/16/2018 12:22 PM, Hannes Reinecke wrote:

On 11/8/18 3:44 PM, Steffen Maier wrote:

would now
suppress helpful -Wswitch compiler warnings when building with W=1 



But then again, only with W=1 we would notice unhandled enum cases.


that's the only caveat


Without the default cases and a missed unhandled enum case, the code
might perform unforeseen things we might not want...


this would be a bug that needs fixing


As of today, we never run through the removed default case,
so removing it is no functional change.
In the future, we never should run through a default case but



introduce the necessary specific case(s) to handle new functionality.


that's the fix


diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index b2845c5b8106..9345fed3bb37 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -151,9 +151,6 @@ static enum zfcp_erp_act_type zfcp_erp_handle_failed(
  adapter, ZFCP_STATUS_COMMON_ERP_FAILED);
  }
  break;
-    default:
-    need = 0;
-    break;
  }
  return need;

If you 'should' not run through this code path, doesn't it warrant a 
WARN_ON() or something?


#include <stdio.h>
enum foo { A, B };
int main(int argc, char *argv[]) {
enum foo f = argc - 1;
switch (f) {
case A: printf("A\n"); break;
case B: printf("B\n"); break;
default: printf("default\n"); break;
}
return 0;
}

$ gcc -Wswitch -Wall -g -o Wswitch Wswitch.c

Removing case B (while keeping default:) does not warn on build.
Removing case B and default: nicely warns on build.

Hopefully I haven't missed anything. From above, I conclude:

A runtime check would require the introduction of a default case.
Adding a default case would trade a static build warning for a runtime 
WARN_ON(_ONCE), which only appears if one manages to get the code to run 
into the default case that should not happen.


I find the static build warning more helpful for future extensions 
adding more value(s) to the enum. Without a default case, we always 
get a build warning for missing switch cases.
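
The trade-off can be sketched with two variants of the same switch. All names here are hypothetical, and the counter only stands in for a kernel WARN_ON_ONCE().

```c
enum step { STEP_A, STEP_B };

/* Counts hits of the "should not happen" path; stand-in for WARN_ON_ONCE(). */
static int unexpected_hits;

/*
 * Variant 1: no default case. With -Wswitch the compiler reports any
 * enum value that is missing from the switch at build time.
 */
static int handle_static(enum step s)
{
	switch (s) {
	case STEP_A:
		return 1;
	case STEP_B:
		return 2;
	}
	return 0;
}

/*
 * Variant 2: default case. STEP_B is deliberately not handled here;
 * -Wswitch stays silent and the gap only shows up at run time.
 */
static int handle_runtime(enum step s)
{
	switch (s) {
	case STEP_A:
		return 1;
	default:
		unexpected_hits++;
		return 0;
	}
}
```

Dropping STEP_B from handle_static() would be flagged when building with -Wswitch, whereas handle_runtime() only notices once the unhandled value is actually passed in.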


--
Mit freundlichen Gruessen / Kind regards
Steffen Maier

Linux on IBM Z Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: [PATCH 01/23] zfcp: make DIX experimental, disabled, and independent of DIF

2018-11-22 Thread Steffen Maier

Hi Martin,

On 11/21/2018 07:13 PM, Martin K. Petersen wrote:

Sorry about the delay. Travel got in the way.


No problem.


BDI_CAP_STABLE_WRITES should take care of this. What's the configuration
that fails?


Apologies, if the commit description sounds unfair. I did not mean to
blame anyone. It's just the collection of issues we saw in distros over
the years. Some of the old issues might be fixed with above zfcp patch
or common code changes. Unfortunately, I could not handle the DIX things
we saw. I think, DIF by itself provides a lot of the protection benefit
and was not affected by the encountered issues. We would like to give
users an easy way to operate in such setup.


I don't have a problem with zfcp having a parameter that affects the
host protection mask, the other drivers do that too. However, these
knobs exist exclusively for debugging and testing purposes. They are not
something regular users should twiddle to switch features on or off.

So DIF and DIX should always be enabled in the driver. And there is no
point in ever operating without DIF enabled if the hardware is capable.


Our long term plan is to make the new zfcp.dif (for DIF only) default to 
enabled once we have gained enough experience with zfcp stability in this mode.



If there is a desire to disable DIX protection for whatever reason
(legacy code doing bad things), do so using the block layer sysfs
knobs. That's where the policy of whether to generate and verify
protection information resides, not in the HBA driver.


Yes, we came up with udev rules to set read_verify and write_generate to 
0 in order to get DIF without DIX. However, this seems complicated for 
users, especially since we always have at least dm-multipath and maybe 
other dm targets such as LVM on top. The setting that matters is on the 
top level block device of some dm (or maybe mdraid) virtual block device 
stack. Getting this right gets more complicated if there are also disks 
not attached through zfcp which may need different settings, so the 
udev rules would need somewhat involved matching. The new zfcp.dif 
parameter makes it simpler because the SCSI disk comes up with the 
desired limits and anything on top automatically inherits these block 
queue limits.


There's one more important thing that has performance impact: We need to 
pack payload and protection data into the same queue of limited length. 
So for the worst case with DIX, we have to use half the size for 
sg_tablesize to get the other half for sg_prot_tablesize. This limits 
the maximum I/O request size and thus throughput. Using read_verify and 
write_generate does not change the tablesizes, as zfcp would still 
announce support for DIF and DIX. With the new zfcp.dif=1 and 
zfcp.dix=0, we can use the full sg_tablesize for payload data and 
sg_prot_tablesize=0. (The DIF "overhead" on the fibre still exists of 
course.)
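
The effect on the table sizes can be sketched as follows. This is a simplified model under assumptions: split_queue() and max_entries are hypothetical, and the real driver derives its limits from the QDIO request layout.

```c
#include <stdbool.h>

struct sg_limits {
	unsigned int sg_tablesize;	/* entries for payload data */
	unsigned int sg_prot_tablesize;	/* entries for protection data */
};

/* max_entries: hypothetical number of queue entries usable per request. */
static struct sg_limits split_queue(unsigned int max_entries, bool dix)
{
	struct sg_limits l;

	if (dix) {
		/* Worst case: one protection entry per payload entry. */
		l.sg_tablesize = max_entries / 2;
		l.sg_prot_tablesize = max_entries / 2;
	} else {
		/* DIF only: no protection data passes through zfcp. */
		l.sg_tablesize = max_entries;
		l.sg_prot_tablesize = 0;
	}
	return l;
}
```

For an assumed 64 usable entries, DIX leaves 32 for payload, while DIF-only mode keeps all 64.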


Are there other ways for accomplishing this which I'm not aware of?


And if there are unaddressed issues in the I/O stack that prevents you
from having integrity enabled, I'd prefer to know about them so they can
be fixed rather than circumventing them through driver module parameter.


Sure.




Re: [PATCH 12/23] zfcp: update kernel message for invalid FCP_CMND length, it's not the CDB

2018-11-16 Thread Steffen Maier

On 11/16/2018 12:13 PM, Hannes Reinecke wrote:

On 11/8/18 3:44 PM, Steffen Maier wrote:

The CDB is just a part inside of FCP_CMND, see zfcp_fc_scsi_to_fcp().
While at it, fix the device driver reaction: adapter not LUN shutdown.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
  drivers/s390/scsi/zfcp_fsf.c | 7 ++-
  1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index c949c65ffc6a..0bdbc596da97 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2090,11 +2090,8 @@ static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req,

  break;
  case FSF_CMND_LENGTH_NOT_VALID:
  dev_err(&req->adapter->ccw_device->dev,
-    "Incorrect CDB length %d, LUN 0x%016Lx on "
-    "port 0x%016Lx closed\n",
-    req->qtcb->bottom.io.fcp_cmnd_length,
-    (unsigned long long)zfcp_scsi_dev_lun(sdev),
-    (unsigned long long)zfcp_sdev->port->wwpn);
+    "Incorrect FCP_CMND length %d, FCP device closed\n",
+    req->qtcb->bottom.io.fcp_cmnd_length);
  zfcp_erp_adapter_shutdown(req->adapter, 0, "fssfch4");
  req->status |= ZFCP_STATUS_FSFREQ_ERROR;
  break;


Really? You're only fixing the message, not the adapter behaviour.
Care to clarify the commit message?


This is one of the few cases in zfcp where we shut down the entire adapter.
If we ever got this, it would very likely be a zfcp bug in 
hardcoded parts affecting all I/O paths to all our LUNs. Also, retrying 
would likely repeat the error.


IIRC, it's always been an adapter shutdown in git history. It's not that 
the type of recovery has changed at some point. A previous version of 
the message even had the correct recovery description part

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/s390/scsi?id=553448f6c4838a1e4bed2bc9301c748278d7d9ce
v2.6.27 ("[SCSI] zfcp: Message cleanup")
but broke with
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/s390/scsi?id=ff3b24fa5370a7ca618f212284d9b36fcedb9c0e
v2.6.28 ("[SCSI] zfcp: Update message with input from review")

Is this what you would like to see as clarification?




Re: [PATCH 16/23] zfcp: use enum zfcp_erp_steps for struct zfcp_erp_action.step

2018-11-16 Thread Steffen Maier

On 11/16/2018 12:17 PM, Hannes Reinecke wrote:

On 11/8/18 3:44 PM, Steffen Maier wrote:

Use the already defined enum for this purpose to get at least some build
checking (even though an enum is type equivalent to an int in C).
v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c")
introduced the enum, which had been cpp defines previously.

Since struct zfcp_erp_action type is embedded into other structures
living in zfcp_def.h, we have to move enum zfcp_erp_steps from
its private definition in zfcp_erp.c to the zfcp-global zfcp_def.h.

Silence some false -Wswitch compiler warning cases with individual
NOP cases. When adding more enum values and building with W=1 we
would get compiler warnings about missed new cases.

Add missing break statements in some of the above switch cases.
No functional change, but making it future-proof.
I think all of these should have had a break statement ever since,
even if these switch cases happened to be the last ones in the switch
statement body.

"Fall through" in the context of switch case usually means not to have a
break and fall through to the subsequent switch case. However, I think
this old comment meant that here we do not have an _early return_ in the
switch case but the code path continues after the switch case body.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
  drivers/s390/scsi/zfcp_def.h |   16 +++-
  drivers/s390/scsi/zfcp_erp.c |   35 +--
  2 files changed, 40 insertions(+), 11 deletions(-)

--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -107,6 +107,20 @@ enum zfcp_erp_act_type {
  ZFCP_ERP_ACTION_REOPEN_ADAPTER   = 4,
  };
+/*
+ * Values must fit into u16 because of code dependencies:
+ * zfcp_dbf_rec_run_lvl(), zfcp_dbf_rec_run(), zfcp_dbf_rec_run_wka(),
+ * &zfcp_dbf_rec_running.rec_step.
+ */
+enum zfcp_erp_steps {
+    ZFCP_ERP_STEP_UNINITIALIZED    = 0x0000,
+    ZFCP_ERP_STEP_PHYS_PORT_CLOSING    = 0x0010,
+    ZFCP_ERP_STEP_PORT_CLOSING    = 0x0100,
+    ZFCP_ERP_STEP_PORT_OPENING    = 0x0800,
+    ZFCP_ERP_STEP_LUN_CLOSING    = 0x1000,
+    ZFCP_ERP_STEP_LUN_OPENING    = 0x2000,
+};
+
  struct zfcp_erp_action {
  struct list_head list;
  enum zfcp_erp_act_type type;  /* requested action code */
@@ -114,7 +128,7 @@ struct zfcp_erp_action {
  struct zfcp_port *port;
  struct scsi_device *sdev;
  u32    status;  /* recovery status */
-    u32 step;  /* active step of this erp action */
+    enum zfcp_erp_steps    step;    /* active step of this erp action */
  unsigned long    fsf_req_id;
  struct timer_list timer;
  };
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -24,15 +24,6 @@ enum zfcp_erp_act_flags {
  ZFCP_STATUS_ERP_NO_REF    = 0x0080,
  };
-enum zfcp_erp_steps {
-    ZFCP_ERP_STEP_UNINITIALIZED    = 0x0000,
-    ZFCP_ERP_STEP_PHYS_PORT_CLOSING    = 0x0010,
-    ZFCP_ERP_STEP_PORT_CLOSING    = 0x0100,
-    ZFCP_ERP_STEP_PORT_OPENING    = 0x0800,
-    ZFCP_ERP_STEP_LUN_CLOSING    = 0x1000,
-    ZFCP_ERP_STEP_LUN_OPENING    = 0x2000,
-};
-
  /*
   * Eyecatcher pseudo flag to bitwise or-combine with enum zfcp_erp_act_type.
   * Used to indicate that an ERP action could not be set up despite a detected

@@ -900,6 +891,13 @@ static int zfcp_erp_port_forced_strategy
  case ZFCP_ERP_STEP_PHYS_PORT_CLOSING:
  if (!(status & ZFCP_STATUS_PORT_PHYS_OPEN))
  return ZFCP_ERP_SUCCEEDED;
+    break;
+    case ZFCP_ERP_STEP_PORT_CLOSING:
+    case ZFCP_ERP_STEP_PORT_OPENING:
+    case ZFCP_ERP_STEP_LUN_CLOSING:
+    case ZFCP_ERP_STEP_LUN_OPENING:
+    /* NOP */
+    break;
  }
  return ZFCP_ERP_FAILED;
  }
@@ -974,7 +972,12 @@ static int zfcp_erp_port_strategy_open_c
  port->d_id = 0;
  return ZFCP_ERP_FAILED;
  }
-    /* fall through otherwise */
+    /* no early return otherwise, continue after switch case */
+    break;
+    case ZFCP_ERP_STEP_LUN_CLOSING:
+    case ZFCP_ERP_STEP_LUN_OPENING:
+    /* NOP */
+    break;
  }
  return ZFCP_ERP_FAILED;
  }
@@ -998,6 +1001,12 @@ static int zfcp_erp_port_strategy(struct
  if (p_status & ZFCP_STATUS_COMMON_OPEN)
  return ZFCP_ERP_FAILED;
  break;
+    case ZFCP_ERP_STEP_PHYS_PORT_CLOSING:
+    case ZFCP_ERP_STEP_PORT_OPENING:
+    case ZFCP_ERP_STEP_LUN_CLOSING:
+    case ZFCP_ERP_STEP_LUN_OPENING:
+    /* NOP */
+    break;
  }
  close_init_done:
@@ -1058,6 +1067,12 @@ static int zfcp_erp_lun_strategy(struct
  case ZFCP_ERP_STEP_LUN_OPENING:
  if (atomic_read(&zfcp_sdev->status) & ZFCP_STATUS_COMMON_OPEN)
  return ZFCP_ERP_SUCCEEDED;
+    break;
+    case ZFCP_ERP_STEP_PHYS_PORT_CLOSING:
+    case ZFCP_ERP_STEP_PORT_CLOSING:
+    case ZFCP_ERP_STEP_PORT_OPENING:
+    /* NOP */
+    break

Re: [PATCH 01/23] zfcp: make DIX experimental, disabled, and independent of DIF

2018-11-09 Thread Steffen Maier

Hi Martin,

On 11/09/2018 03:07 AM, Martin K. Petersen wrote:

There are too many unresolved issues with DIX outside of zfcp such as
wrong protection data on writesame/discard (over device-mapper)


We don't configure protected transfers for anything but read and write
commands. There is currently no protection information generated for
WRITE SAME.



So if you guys are seeing failures, it must be due to zfcp
not handling the scsi_cmnd prot_op/prot_flags or the command PROTECT bit
correctly.


I think we're good since
("scsi: zfcp: fix queuecommand for scsi_eh commands when DIX enabled")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/s390/scsi?id=71b8e45da51a7b64a23378221c0a5868bd79da4f.

Previously, at least regular (non-recovery) I/O should have been good by 
having checked at least scsi_prot_sg_count().



or due to unstable page writes.


BDI_CAP_STABLE_WRITES should take care of this. What's the configuration
that fails?


Apologies, if the commit description sounds unfair. I did not mean to 
blame anyone. It's just the collection of issues we saw in distros over 
the years. Some of the old issues might be fixed with above zfcp patch 
or common code changes. Unfortunately, I could not handle the DIX things 
we saw. I think, DIF by itself provides a lot of the protection benefit 
and was not affected by the encountered issues. We would like to give 
users an easy way to operate in such setup.






[PATCH 05/23] zfcp: move scsi_eh & non-ERP timeout defines owned by and local to zfcp_fsf.c

2018-11-08 Thread Steffen Maier
Also clarify namespace prefix for the timeout used for FSF requests
on behalf of SCSI error recovery: It is zfcp_fsf_ not zfcp_scsi_.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_def.h | 6 --
 drivers/s390/scsi/zfcp_fsf.c | 9 +++--
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_def.h b/drivers/s390/scsi/zfcp_def.h
index 1b6d64eb66b7..87a1fef5568e 100644
--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -41,17 +41,11 @@
 #include "zfcp_fc.h"
 #include "zfcp_qdio.h"
 
-/* SCSI SPECIFIC DEFINES */
-#define ZFCP_SCSI_ER_TIMEOUT (10*HZ)
-
 /* FSF SPECIFIC DEFINES */
 
 /* ATTENTION: value must not be used by hardware */
 #define FSF_QTCB_UNSOLICITED_STATUS 0x6305
 
-/* timeout value for "default timer" for fsf requests */
-#define ZFCP_FSF_REQUEST_TIMEOUT (60*HZ)
-
 /*** ADAPTER/PORT/UNIT AND FSF_REQ STATUS FLAGS **/
 
 /*
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 3c86e27f094d..095ab7fdcf4b 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -19,6 +19,11 @@
 #include "zfcp_qdio.h"
 #include "zfcp_reqlist.h"
 
+/* timeout for FSF requests sent during scsi_eh: abort or FCP TMF */
+#define ZFCP_FSF_SCSI_ER_TIMEOUT (10*HZ)
+/* timeout for: exchange config/port data outside ERP, or open/close WKA port */
+#define ZFCP_FSF_REQUEST_TIMEOUT (60*HZ)
+
 struct kmem_cache *zfcp_fsf_qtcb_cache;
 
 static void zfcp_fsf_request_timeout_handler(struct timer_list *t)
@@ -912,7 +917,7 @@ struct zfcp_fsf_req *zfcp_fsf_abort_fcp_cmnd(struct scsi_cmnd *scmnd)
req->qtcb->header.port_handle = zfcp_sdev->port->handle;
req->qtcb->bottom.support.req_handle = (u64) old_req_id;
 
-   zfcp_fsf_start_timer(req, ZFCP_SCSI_ER_TIMEOUT);
+   zfcp_fsf_start_timer(req, ZFCP_FSF_SCSI_ER_TIMEOUT);
if (!zfcp_fsf_req_send(req))
goto out;
 
@@ -2369,7 +2374,7 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_device *sdev,
 	fcp_cmnd = &req->qtcb->bottom.io.fcp_cmnd.iu;
zfcp_fc_fcp_tm(fcp_cmnd, sdev, tm_flags);
 
-   zfcp_fsf_start_timer(req, ZFCP_SCSI_ER_TIMEOUT);
+   zfcp_fsf_start_timer(req, ZFCP_FSF_SCSI_ER_TIMEOUT);
if (!zfcp_fsf_req_send(req))
goto out;
 
-- 
2.16.4



[PATCH 00/23] zfcp updates for v4.21

2018-11-08 Thread Steffen Maier
James, Martin,

this is the zfcp patch set for the v4.21 merge window.
The patches apply to Martin's 4.21/scsi-queue
and to James' misc branch.

Patch 1 is a small feature to select DIF only without DIX.

Patches 2-23 are cleanups including resolving new build warnings.

Fedor Loshakov (1):
  zfcp: make DIX experimental, disabled, and independent of DIF

Steffen Maier (21):
  zfcp: move SG table helper from aux to fc and make them static
  zfcp: drop unnecessary forward prototype for struct zfcp_reqlist
  zfcp: move scsi_eh & non-ERP timeout defines owned by and local to
zfcp_fsf.c
  zfcp: update width in comment for ZFCP_COMMON_FLAGS mask
  zfcp: namespace prefix for internal latency data structures
  zfcp: group sort internal structure definitions for proximity
  zfcp: drop unnecessary forward prototype for struct zfcp_fsf_req
  zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in
QTCB header
  zfcp: drop duplicate seq_no from zfcp_fsf_req which is also in QTCB
header
  zfcp: update kernel message for invalid FCP_CMND length, it's not the
CDB
  zfcp: ERP thread setup kdoc update
  zfcp: clarify function argument name for trace tag string
  zfcp: the action field of zfcp_erp_action is actually the type
  zfcp: use enum zfcp_erp_steps for struct zfcp_erp_action.step
  zfcp: use enum zfcp_erp_act_result for argument/return of affected
functions
  zfcp: properly format LUN (and WWPN) for LUN sharing violation kmsg
  zfcp: silence all W=1 build warnings for existing kdoc
  zfcp: silence remaining kdoc warnings in header files
  zfcp: silence -Wimplicit-fallthrough in zfcp_erp_lun_strategy()
  zfcp: drop default switch case which might paper over missing case
  zfcp: drop old default switch case which might paper over missing case

zhong jiang (1):
  zfcp: remove unnecessary null pointer check before mempool_destroy

-- 
2.16.4



[PATCH 06/23] zfcp: update width in comment for ZFCP_COMMON_FLAGS mask

2018-11-08 Thread Steffen Maier
v2.6.10 history commit 4062e12b2ba2 ("[PATCH] s390: zfcp act enhancements")
extended this mask by one nibble with the introduction of
ZFCP_STATUS_COMMON_ACCESS_DENIED == 0x00800000 for ACT
(access control table).
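
Using such a mask with bitwise 'and' can be sketched as follows, assuming a 32-bit status word. COMMON_FLAGS_MASK and common_flags() are illustrative names, not the driver's.

```c
#include <stdint.h>

/* Leftmost 12 bits (3 nibbles) of an assumed 32-bit status word. */
#define COMMON_FLAGS_MASK	(UINT32_C(0xfff) << 20)

/* Keep only the status bits shared by adapter, port and unit. */
static uint32_t common_flags(uint32_t status)
{
	return status & COMMON_FLAGS_MASK;
}
```

Any bit below the top three nibbles is cleared by the mask, so object-specific flags never leak into the common part.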

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_def.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_def.h b/drivers/s390/scsi/zfcp_def.h
index 87a1fef5568e..13bfc13eb42d 100644
--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -49,8 +49,8 @@
 /*** ADAPTER/PORT/UNIT AND FSF_REQ STATUS FLAGS **/
 
 /*
- * Note, the leftmost status byte is common among adapter, port
- * and unit
+ * Note, the leftmost 12 status bits (3 nibbles) are common among adapter, port
+ * and unit. This is a mask for bitwise 'and' with status values.
  */
 #define ZFCP_COMMON_FLAGS  0xfff00000
 
-- 
2.16.4



[PATCH 20/23] zfcp: silence remaining kdoc warnings in header files

2018-11-08 Thread Steffen Maier
Improve whatever the following simple invocation reported:
$ ./scripts/kernel-doc -none drivers/s390/scsi/*.h

While at it, improve some related kdoc,
including struct zfcp_fsf_ct_els in zfcp_fsf.h.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_dbf.h | 10 +-
 drivers/s390/scsi/zfcp_fc.h  | 21 ++---
 drivers/s390/scsi/zfcp_fsf.h |  4 ++--
 drivers/s390/scsi/zfcp_qdio.h|  9 +++--
 drivers/s390/scsi/zfcp_reqlist.h |  2 +-
 5 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.h b/drivers/s390/scsi/zfcp_dbf.h
index b4438713d1cc..900c779cc39b 100644
--- a/drivers/s390/scsi/zfcp_dbf.h
+++ b/drivers/s390/scsi/zfcp_dbf.h
@@ -42,7 +42,8 @@ struct zfcp_dbf_rec_trigger {
  * @fsf_req_id: request id for fsf requests
  * @rec_status: status of the fsf request
  * @rec_step: current step of the recovery action
- * rec_count: recovery counter
+ * @rec_action: ERP action type
+ * @rec_count: recoveries including retries for particular @rec_action
  */
 struct zfcp_dbf_rec_running {
u64 fsf_req_id;
@@ -72,6 +73,7 @@ enum zfcp_dbf_rec_id {
  * @adapter_status: current status of the adapter
  * @port_status: current status of the port
  * @lun_status: current status of the lun
+ * @u: record type specific data
  * @u.trig: structure zfcp_dbf_rec_trigger
  * @u.run: structure zfcp_dbf_rec_running
  */
@@ -126,6 +128,8 @@ struct zfcp_dbf_san {
  * @prot_status_qual: protocol status qualifier
  * @fsf_status: fsf status
  * @fsf_status_qual: fsf status qualifier
+ * @port_handle: handle for port
+ * @lun_handle: handle for LUN
  */
 struct zfcp_dbf_hba_res {
u64 req_issued;
@@ -158,6 +162,7 @@ struct zfcp_dbf_hba_uss {
  * @ZFCP_DBF_HBA_RES: response trace record
  * @ZFCP_DBF_HBA_USS: unsolicited status trace record
  * @ZFCP_DBF_HBA_BIT: bit error trace record
+ * @ZFCP_DBF_HBA_BASIC: basic adapter event, only trace tag, no other data
  */
 enum zfcp_dbf_hba_id {
ZFCP_DBF_HBA_RES= 1,
@@ -176,6 +181,9 @@ enum zfcp_dbf_hba_id {
  * @fsf_seq_no: fsf sequence number
  * @pl_len: length of payload stored as zfcp_dbf_pay
  * @u: record type specific data
+ * @u.res: data for fsf responses
+ * @u.uss: data for unsolicited status buffer
+ * @u.be:  data for bit error unsolicited status buffer
  */
 struct zfcp_dbf_hba {
u8 id;
diff --git a/drivers/s390/scsi/zfcp_fc.h b/drivers/s390/scsi/zfcp_fc.h
index 3cd74729cfb9..6902ae1f8e4f 100644
--- a/drivers/s390/scsi/zfcp_fc.h
+++ b/drivers/s390/scsi/zfcp_fc.h
@@ -121,9 +121,24 @@ struct zfcp_fc_rspn_req {
 /**
  * struct zfcp_fc_req - Container for FC ELS and CT requests sent from zfcp
  * @ct_els: data required for issuing fsf command
- * @sg_req: scatterlist entry for request data
- * @sg_rsp: scatterlist entry for response data
- * @u: request specific data
+ * @sg_req: scatterlist entry for request data, refers to embedded @u submember
+ * @sg_rsp: scatterlist entry for response data, refers to embedded @u submember
+ * @u: request and response specific data
+ * @u.adisc: ADISC specific data
+ * @u.adisc.req: ADISC request
+ * @u.adisc.rsp: ADISC response
+ * @u.gid_pn: GID_PN specific data
+ * @u.gid_pn.req: GID_PN request
+ * @u.gid_pn.rsp: GID_PN response
+ * @u.gpn_ft: GPN_FT specific data
+ * @u.gpn_ft.sg_rsp2: GPN_FT response, not embedded here, allocated elsewhere
+ * @u.gpn_ft.req: GPN_FT request
+ * @u.gspn: GSPN specific data
+ * @u.gspn.req: GSPN request
+ * @u.gspn.rsp: GSPN response
+ * @u.rspn: RSPN specific data
+ * @u.rspn.req: RSPN request
+ * @u.rspn.rsp: RSPN response
  */
 struct zfcp_fc_req {
struct zfcp_fsf_ct_els  ct_els;
diff --git a/drivers/s390/scsi/zfcp_fsf.h b/drivers/s390/scsi/zfcp_fsf.h
index 535628b92f0a..2c658b66318c 100644
--- a/drivers/s390/scsi/zfcp_fsf.h
+++ b/drivers/s390/scsi/zfcp_fsf.h
@@ -438,8 +438,8 @@ struct zfcp_blk_drv_data {
 
 /**
  * struct zfcp_fsf_ct_els - zfcp data for ct or els request
- * @req: scatter-gather list for request
- * @resp: scatter-gather list for response
+ * @req: scatter-gather list for request, points to &zfcp_fc_req.sg_req or BSG
+ * @resp: scatter-gather list for response, points to &zfcp_fc_req.sg_rsp or BSG
  * @handler: handler function (called for response to the request)
  * @handler_data: data passed to handler function
  * @port: Optional pointer to port for zfcp internal ELS (only test link ADISC)
diff --git a/drivers/s390/scsi/zfcp_qdio.h b/drivers/s390/scsi/zfcp_qdio.h
index 886c662cc154..2a816a37b3c0 100644
--- a/drivers/s390/scsi/zfcp_qdio.h
+++ b/drivers/s390/scsi/zfcp_qdio.h
@@ -30,6 +30,8 @@
  * @req_q_full: queue full incidents
  * @req_q_wq: used to wait for SBAL availability
  * @adapter: adapter used in conjunction with this qdio structure
+ * @max_sbale_per_sbal: qdio limit per sbal
+ * @max_sbale_per_req: qdio limit per request
  */
 struct zfcp_qdio {
struct qdio_buffer  *res_q

[PATCH 18/23] zfcp: properly format LUN (and WWPN) for LUN sharing violation kmsg

2018-11-08 Thread Steffen Maier
zfcp: : LUN 0x0 on port 0x5005076. ...
zfcp: : LUN 0x1 on port 0x5005076. ...

should be

zfcp: : LUN 0x on port 0x5005076. ...
zfcp: : LUN 0x0001 on port 0x5005076.
  is already in use by CSS., MIF Image ID .
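
A minimal user-space sketch of the formatting difference, assuming standard
C printf semantics rather than the kernel's dev_warn(). The helper names and
the sample value are made up for illustration and are not part of zfcp. A
field width of 16 with the 0 flag always yields 16 hex digits; without it,
leading zeros are dropped, which is what made the old message hard to read.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: "0x" followed by exactly 16 zero-padded hex digits. */
static int format_hex64_padded(char *buf, size_t len, uint64_t value)
{
	return snprintf(buf, len, "0x%016" PRIx64, value);
}

/* Hypothetical helper: same value without field width, leading zeros lost. */
static int format_hex64_unpadded(char *buf, size_t len, uint64_t value)
{
	return snprintf(buf, len, "0x%" PRIx64, value);
}
```

With the illustrative value 0x0001000000000000, the padded variant keeps
all 16 digits while the unpadded one prints only 13.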

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_fsf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 0bdbc596da97..b83d249d07dc 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -1811,7 +1811,7 @@ static void zfcp_fsf_open_lun_handler(struct zfcp_fsf_req *req)
case FSF_LUN_SHARING_VIOLATION:
if (qual->word[0])
dev_warn(&zfcp_sdev->port->adapter->ccw_device->dev,
-"LUN 0x%Lx on port 0x%Lx is already in "
+"LUN 0x%016Lx on port 0x%016Lx is already in "
 "use by CSS%d, MIF Image ID %x\n",
 zfcp_scsi_dev_lun(sdev),
 (unsigned long long)zfcp_sdev->port->wwpn,
-- 
2.16.4



[PATCH 22/23] zfcp: drop default switch case which might paper over missing case

2018-11-08 Thread Steffen Maier
This was introduced with v4.18 commit 8c3d20aada70 ("scsi: zfcp: fix
missing REC trigger trace for all objects in ERP_FAILED") but would now
suppress helpful -Wswitch compiler warnings when building with W=1 such as
the following forced example:

drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_handle_failed':
drivers/s390/scsi/zfcp_erp.c:126:2: warning: enumeration value 'ZFCP_ERP_ACTION_REOPEN_PORT_FORCED' not handled in switch [-Wswitch]
  switch (want) {
  ^~

Then again, we would only notice unhandled enum cases when building with
W=1. Without the default case, a missed enum case could make the code
do unforeseen things we might not want.

As of today, we never run through the removed default case,
so removing it is no functional change.
In the future, we should never run through a default case but
introduce the necessary specific case(s) to handle new functionality.
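
A minimal sketch of the effect, using a made-up enum and function (these are
not the zfcp ones). GCC documents that -Wswitch warns when a switch on an
enum lacks a case for one of its enumerators, and that the presence of a
default label prevents this warning. So the variant below, without default,
is the one that lets the compiler point at a newly added enumerator.

```c
/* Hypothetical enum, only loosely modelled on an action type. */
enum demo_action {
	DEMO_REOPEN_LUN = 1,
	DEMO_REOPEN_PORT = 2,
	DEMO_REOPEN_ADAPTER = 4,
};

/*
 * No default label: adding a fourth enumerator to enum demo_action
 * without a matching case here would trigger -Wswitch.
 */
static int demo_handle_failed(enum demo_action want)
{
	int need = want;

	switch (want) {
	case DEMO_REOPEN_LUN:
		need = 0;	/* pretend the LUN already failed: nothing to do */
		break;
	case DEMO_REOPEN_PORT:
	case DEMO_REOPEN_ADAPTER:
		/* keep the requested action */
		break;
	}

	return need;
}
```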

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_erp.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index b2845c5b8106..9345fed3bb37 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -151,9 +151,6 @@ static enum zfcp_erp_act_type zfcp_erp_handle_failed(
adapter, ZFCP_STATUS_COMMON_ERP_FAILED);
}
break;
-   default:
-   need = 0;
-   break;
}
 
return need;
-- 
2.16.4



[PATCH 23/23] zfcp: drop old default switch case which might paper over missing case

2018-11-08 Thread Steffen Maier
This was introduced with v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup
code in zfcp_erp.c") but would now suppress helpful -Wswitch compiler
warnings when building with W=1 such as the following forced example:

drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_setup_act':
drivers/s390/scsi/zfcp_erp.c:220:2: warning: enumeration value 'ZFCP_ERP_ACTION_REOPEN_PORT' not handled in switch [-Wswitch]
  switch (need) {
  ^~

Then again, we would only notice unhandled enum cases when building with
W=1. Without the default case, a missed enum case could make the code
do unforeseen things we might not want.

As of today, we never run through the removed default case,
so removing it is no functional change.
In the future, we should never run through a default case but
introduce the necessary specific case(s) to handle new functionality.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_erp.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 9345fed3bb37..744a64680d5b 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -257,9 +257,6 @@ static struct zfcp_erp_action *zfcp_erp_setup_act(enum zfcp_erp_act_type need,
  ZFCP_STATUS_COMMON_RUNNING))
act_status |= ZFCP_STATUS_ERP_CLOSE_ONLY;
break;
-
-   default:
-   return NULL;
}
 
WARN_ON_ONCE(erp_action->adapter != adapter);
-- 
2.16.4



[PATCH 19/23] zfcp: silence all W=1 build warnings for existing kdoc

2018-11-08 Thread Steffen Maier
While at it, also fix some copy & paste kdoc mistakes.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_dbf.c  | 13 -
 drivers/s390/scsi/zfcp_erp.c  |  6 +++---
 drivers/s390/scsi/zfcp_fc.c   |  2 +-
 drivers/s390/scsi/zfcp_fsf.c  | 14 +-
 drivers/s390/scsi/zfcp_qdio.c |  3 +--
 5 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index 06696b76c300..dccdb41bed8c 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -63,7 +63,8 @@ void zfcp_dbf_pl_write(struct zfcp_dbf *dbf, void *data, u16 length, char *area,
 
 /**
  * zfcp_dbf_hba_fsf_res - trace event for fsf responses
- * @tag: tag indicating which kind of unsolicited status has been received
+ * @tag: tag indicating which kind of FSF response has been received
+ * @level: trace level to be used for event
  * @req: request for which a response was received
  */
 void zfcp_dbf_hba_fsf_res(char *tag, int level, struct zfcp_fsf_req *req)
@@ -153,7 +154,7 @@ void zfcp_dbf_hba_fsf_uss(char *tag, struct zfcp_fsf_req *req)
 
 /**
  * zfcp_dbf_hba_bit_err - trace event for bit error conditions
- * @tag: tag indicating which kind of unsolicited status has been received
+ * @tag: tag indicating which kind of bit error unsolicited status was received
  * @req: request which caused the bit_error condition
  */
 void zfcp_dbf_hba_bit_err(char *tag, struct zfcp_fsf_req *req)
@@ -224,6 +225,7 @@ void zfcp_dbf_hba_def_err(struct zfcp_adapter *adapter, u64 req_id, u16 scount,
 
 /**
  * zfcp_dbf_hba_basic - trace event for basic adapter events
+ * @tag: identifier for event
  * @adapter: pointer to struct zfcp_adapter
  */
 void zfcp_dbf_hba_basic(char *tag, struct zfcp_adapter *adapter)
@@ -478,7 +480,8 @@ void zfcp_dbf_san(char *tag, struct zfcp_dbf *dbf,
 /**
  * zfcp_dbf_san_req - trace event for issued SAN request
  * @tag: identifier for event
- * @fsf_req: request containing issued CT data
+ * @fsf: request containing issued CT or ELS data
+ * @d_id: N_Port_ID where SAN request is sent to
  * d_id: destination ID
  */
 void zfcp_dbf_san_req(char *tag, struct zfcp_fsf_req *fsf, u32 d_id)
@@ -560,7 +563,7 @@ static u16 zfcp_dbf_san_res_cap_len_if_gpn_ft(char *tag,
 /**
  * zfcp_dbf_san_res - trace event for received SAN request
  * @tag: identifier for event
- * @fsf_req: request containing issued CT data
+ * @fsf: request containing received CT or ELS data
  */
 void zfcp_dbf_san_res(char *tag, struct zfcp_fsf_req *fsf)
 {
@@ -580,7 +583,7 @@ void zfcp_dbf_san_res(char *tag, struct zfcp_fsf_req *fsf)
 /**
  * zfcp_dbf_san_in_els - trace event for incoming ELS
  * @tag: identifier for event
- * @fsf_req: request containing issued CT data
+ * @fsf: request containing received ELS data
  */
 void zfcp_dbf_san_in_els(char *tag, struct zfcp_fsf_req *fsf)
 {
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 5c7fb64111fe..8e5f01f5be81 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -435,7 +435,7 @@ static void _zfcp_erp_port_reopen(struct zfcp_port *port, int clear,
 /**
  * zfcp_erp_port_reopen - trigger remote port recovery
  * @port: port to recover
- * @clear_mask: flags in port status to be cleared
+ * @clear: flags in port status to be cleared
  * @dbftag: Tag for debug trace event.
  */
 void zfcp_erp_port_reopen(struct zfcp_port *port, int clear, char *dbftag)
@@ -469,7 +469,7 @@ static void _zfcp_erp_lun_reopen(struct scsi_device *sdev, int clear,
 /**
  * zfcp_erp_lun_reopen - initiate reopen of a LUN
  * @sdev: SCSI device / LUN to be reopened
- * @clear_mask: specifies flags in LUN status to be cleared
+ * @clear: specifies flags in LUN status to be cleared
  * @dbftag: Tag for debug trace event.
  *
  * Return: 0 on success, < 0 on error
@@ -606,7 +606,7 @@ void zfcp_erp_notify(struct zfcp_erp_action *erp_action, unsigned long set_mask)
 
 /**
  * zfcp_erp_timeout_handler - Trigger ERP action from timed out ERP request
- * @data: ERP action (from timer data)
+ * @t: timer list entry embedded in zfcp FSF request
  */
 void zfcp_erp_timeout_handler(struct timer_list *t)
 {
diff --git a/drivers/s390/scsi/zfcp_fc.c b/drivers/s390/scsi/zfcp_fc.c
index 84a9c69cdd56..db00b5e3abbe 100644
--- a/drivers/s390/scsi/zfcp_fc.c
+++ b/drivers/s390/scsi/zfcp_fc.c
@@ -312,7 +312,7 @@ static void zfcp_fc_incoming_logo(struct zfcp_fsf_req *req)
 
 /**
  * zfcp_fc_incoming_els - handle incoming ELS
- * @fsf_req - request which contains incoming ELS
+ * @fsf_req: request which contains incoming ELS
  */
 void zfcp_fc_incoming_els(struct zfcp_fsf_req *fsf_req)
 {
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index b83d249d07dc..d94496ee6883 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -79,7 +79,7 @@ static void zfcp_fsf_class_not_supp(struct zfcp_fsf_req *req)
 
 /**
  * zfcp_fsf_req_free - f

[PATCH 16/23] zfcp: use enum zfcp_erp_steps for struct zfcp_erp_action.step

2018-11-08 Thread Steffen Maier
Use the already defined enum for this purpose to get at least some build
checking (even though an enum is type equivalent to an int in C).
v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c")
introduced the enum, which previously was a set of cpp defines.

Since struct zfcp_erp_action type is embedded into other structures
living in zfcp_def.h, we have to move enum zfcp_erp_steps from
its private definition in zfcp_erp.c to the zfcp-global zfcp_def.h.

Silence some false -Wswitch compiler warning cases with individual
NOP cases. When adding more enum values and building with W=1 we
would get compiler warnings about missed new cases.

Add missing break statements in some of the above switch cases.
This is no functional change, but makes the code future-proof.
I think all of these should have had a break statement from the start,
even though these switch cases happened to be the last ones in the switch
statement body.

"Fall through" in the context of switch case usually means not to have a
break and fall through to the subsequent switch case. However, I think
this old comment meant that here we do not have an _early return_ in the
switch case but the code path continues after the switch case body.
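
To illustrate the distinction, here is a small sketch with made-up names
(not the zfcp functions or enums). The first case either returns early or
uses break to continue after the switch body; the remaining enumerators get
explicit NOP cases so that every value is handled without a default label.

```c
/* Hypothetical step and result types, loosely following the commit text. */
enum demo_step {
	DEMO_STEP_UNINITIALIZED = 0x0000,
	DEMO_STEP_PORT_CLOSING  = 0x0100,
	DEMO_STEP_LUN_OPENING   = 0x2000,
};

enum demo_result {
	DEMO_FAILED,
	DEMO_SUCCEEDED,
};

static enum demo_result demo_strategy(enum demo_step step, int is_open)
{
	switch (step) {
	case DEMO_STEP_UNINITIALIZED:
		if (!is_open)
			return DEMO_SUCCEEDED;	/* early return */
		/* no early return otherwise, continue after switch */
		break;
	case DEMO_STEP_PORT_CLOSING:
	case DEMO_STEP_LUN_OPENING:
		/* NOP */
		break;
	}

	/* reached via break, never via falling into another case */
	return DEMO_FAILED;
}
```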

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_def.h |   16 +++-
 drivers/s390/scsi/zfcp_erp.c |   35 +--
 2 files changed, 40 insertions(+), 11 deletions(-)

--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -107,6 +107,20 @@ enum zfcp_erp_act_type {
ZFCP_ERP_ACTION_REOPEN_ADAPTER = 4,
 };
 
+/*
+ * Values must fit into u16 because of code dependencies:
+ * zfcp_dbf_rec_run_lvl(), zfcp_dbf_rec_run(), zfcp_dbf_rec_run_wka(),
+ * &zfcp_dbf_rec_running.rec_step.
+ */
+enum zfcp_erp_steps {
+   ZFCP_ERP_STEP_UNINITIALIZED = 0x0000,
+   ZFCP_ERP_STEP_PHYS_PORT_CLOSING = 0x0010,
+   ZFCP_ERP_STEP_PORT_CLOSING  = 0x0100,
+   ZFCP_ERP_STEP_PORT_OPENING  = 0x0800,
+   ZFCP_ERP_STEP_LUN_CLOSING   = 0x1000,
+   ZFCP_ERP_STEP_LUN_OPENING   = 0x2000,
+};
+
 struct zfcp_erp_action {
struct list_head list;
enum zfcp_erp_act_type type;  /* requested action code */
@@ -114,7 +128,7 @@ struct zfcp_erp_action {
struct zfcp_port *port;
struct scsi_device *sdev;
u32 status;   /* recovery status */
-   u32 step; /* active step of this erp action */
+   enum zfcp_erp_steps step;   /* active step of this erp action */
unsigned long   fsf_req_id;
struct timer_list timer;
 };
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -24,15 +24,6 @@ enum zfcp_erp_act_flags {
ZFCP_STATUS_ERP_NO_REF  = 0x0080,
 };
 
-enum zfcp_erp_steps {
-   ZFCP_ERP_STEP_UNINITIALIZED = 0x0000,
-   ZFCP_ERP_STEP_PHYS_PORT_CLOSING = 0x0010,
-   ZFCP_ERP_STEP_PORT_CLOSING  = 0x0100,
-   ZFCP_ERP_STEP_PORT_OPENING  = 0x0800,
-   ZFCP_ERP_STEP_LUN_CLOSING   = 0x1000,
-   ZFCP_ERP_STEP_LUN_OPENING   = 0x2000,
-};
-
 /*
  * Eyecatcher pseudo flag to bitwise or-combine with enum zfcp_erp_act_type.
  * Used to indicate that an ERP action could not be set up despite a detected
@@ -900,6 +891,13 @@ static int zfcp_erp_port_forced_strategy
case ZFCP_ERP_STEP_PHYS_PORT_CLOSING:
if (!(status & ZFCP_STATUS_PORT_PHYS_OPEN))
return ZFCP_ERP_SUCCEEDED;
+   break;
+   case ZFCP_ERP_STEP_PORT_CLOSING:
+   case ZFCP_ERP_STEP_PORT_OPENING:
+   case ZFCP_ERP_STEP_LUN_CLOSING:
+   case ZFCP_ERP_STEP_LUN_OPENING:
+   /* NOP */
+   break;
}
return ZFCP_ERP_FAILED;
 }
@@ -974,7 +972,12 @@ static int zfcp_erp_port_strategy_open_c
port->d_id = 0;
return ZFCP_ERP_FAILED;
}
-   /* fall through otherwise */
+   /* no early return otherwise, continue after switch case */
+   break;
+   case ZFCP_ERP_STEP_LUN_CLOSING:
+   case ZFCP_ERP_STEP_LUN_OPENING:
+   /* NOP */
+   break;
}
return ZFCP_ERP_FAILED;
 }
@@ -998,6 +1001,12 @@ static int zfcp_erp_port_strategy(struct
if (p_status & ZFCP_STATUS_COMMON_OPEN)
return ZFCP_ERP_FAILED;
break;
+   case ZFCP_ERP_STEP_PHYS_PORT_CLOSING:
+   case ZFCP_ERP_STEP_PORT_OPENING:
+   case ZFCP_ERP_STEP_LUN_CLOSING:
+   case ZFCP_ERP_STEP_LUN_OPENING:
+   /* NOP */
+   break;
}
 
 close_init_done:
@@ -1058,6 +1067,12 @@ static int zfcp_erp_lun_strategy(struct
case ZFCP_ERP_STEP_LUN_OPENING:
if (atomic_read(&zfcp_sdev->status) & ZFCP_STATUS_COMMON_OPEN)
return ZFCP_ERP_SUCCEEDE

[PATCH 17/23] zfcp: use enum zfcp_erp_act_result for argument/return of affected functions

2018-11-08 Thread Steffen Maier
Using the enum instead of just "int" makes it clear which functions return
this type and which ones also accept it as an argument, passing it through
in some cases and modifying it in others.
v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c")
introduced the enum, which previously was a set of cpp defines.

Silence some false -Wswitch compiler warning cases with individual
NOP cases. When adding more enum values and building with W=1 we
would get compiler warnings about missed new cases.

Consistently use the variable name "result", so change "retval" in
zfcp_erp_strategy() to "result". This avoids confusion with other
compile unit variables "retval" having different semantics and type.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_erp.c | 124 +--
 1 file changed, 84 insertions(+), 40 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 3da870e55ab5..5c7fb64111fe 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -713,7 +713,8 @@ static void zfcp_erp_enqueue_ptp_port(struct zfcp_adapter *adapter)
_zfcp_erp_port_reopen(port, 0, "ereptp1");
 }
 
-static int zfcp_erp_adapter_strat_fsf_xconf(struct zfcp_erp_action *erp_action)
+static enum zfcp_erp_act_result zfcp_erp_adapter_strat_fsf_xconf(
+   struct zfcp_erp_action *erp_action)
 {
int retries;
int sleep = 1;
@@ -758,7 +759,8 @@ static int zfcp_erp_adapter_strat_fsf_xconf(struct zfcp_erp_action *erp_action)
return ZFCP_ERP_SUCCEEDED;
 }
 
-static int zfcp_erp_adapter_strategy_open_fsf_xport(struct zfcp_erp_action *act)
+static enum zfcp_erp_act_result zfcp_erp_adapter_strategy_open_fsf_xport(
+   struct zfcp_erp_action *act)
 {
int ret;
struct zfcp_adapter *adapter = act->adapter;
@@ -783,7 +785,8 @@ static int zfcp_erp_adapter_strategy_open_fsf_xport(struct zfcp_erp_action *act)
return ZFCP_ERP_SUCCEEDED;
 }
 
-static int zfcp_erp_adapter_strategy_open_fsf(struct zfcp_erp_action *act)
+static enum zfcp_erp_act_result zfcp_erp_adapter_strategy_open_fsf(
+   struct zfcp_erp_action *act)
 {
if (zfcp_erp_adapter_strat_fsf_xconf(act) == ZFCP_ERP_FAILED)
return ZFCP_ERP_FAILED;
@@ -822,7 +825,8 @@ static void zfcp_erp_adapter_strategy_close(struct zfcp_erp_action *act)
  ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED, &adapter->status);
 }
 
-static int zfcp_erp_adapter_strategy_open(struct zfcp_erp_action *act)
+static enum zfcp_erp_act_result zfcp_erp_adapter_strategy_open(
+   struct zfcp_erp_action *act)
 {
struct zfcp_adapter *adapter = act->adapter;
 
@@ -843,7 +847,8 @@ static int zfcp_erp_adapter_strategy_open(struct zfcp_erp_action *act)
return ZFCP_ERP_SUCCEEDED;
 }
 
-static int zfcp_erp_adapter_strategy(struct zfcp_erp_action *act)
+static enum zfcp_erp_act_result zfcp_erp_adapter_strategy(
+   struct zfcp_erp_action *act)
 {
struct zfcp_adapter *adapter = act->adapter;
 
@@ -861,7 +866,8 @@ static int zfcp_erp_adapter_strategy(struct zfcp_erp_action *act)
return ZFCP_ERP_SUCCEEDED;
 }
 
-static int zfcp_erp_port_forced_strategy_close(struct zfcp_erp_action *act)
+static enum zfcp_erp_act_result zfcp_erp_port_forced_strategy_close(
+   struct zfcp_erp_action *act)
 {
int retval;
 
@@ -875,7 +881,8 @@ static int zfcp_erp_port_forced_strategy_close(struct zfcp_erp_action *act)
return ZFCP_ERP_CONTINUES;
 }
 
-static int zfcp_erp_port_forced_strategy(struct zfcp_erp_action *erp_action)
+static enum zfcp_erp_act_result zfcp_erp_port_forced_strategy(
+   struct zfcp_erp_action *erp_action)
 {
struct zfcp_port *port = erp_action->port;
int status = atomic_read(&port->status);
@@ -902,7 +909,8 @@ static int zfcp_erp_port_forced_strategy(struct zfcp_erp_action *erp_action)
return ZFCP_ERP_FAILED;
 }
 
-static int zfcp_erp_port_strategy_close(struct zfcp_erp_action *erp_action)
+static enum zfcp_erp_act_result zfcp_erp_port_strategy_close(
+   struct zfcp_erp_action *erp_action)
 {
int retval;
 
@@ -915,7 +923,8 @@ static int zfcp_erp_port_strategy_close(struct zfcp_erp_action *erp_action)
return ZFCP_ERP_CONTINUES;
 }
 
-static int zfcp_erp_port_strategy_open_port(struct zfcp_erp_action *erp_action)
+static enum zfcp_erp_act_result zfcp_erp_port_strategy_open_port(
+   struct zfcp_erp_action *erp_action)
 {
int retval;
 
@@ -941,7 +950,8 @@ static int zfcp_erp_open_ptp_port(struct zfcp_erp_action *act)
return zfcp_erp_port_strategy_open_port(act);
 }
 
-static int zfcp_erp_port_strategy_open_common(struct zfcp_erp_action *act)
+static enum zfcp_erp_act_result zfcp_erp_port_strategy_open_common(
+   struct zfcp_erp_action *act)
 {
struct zfcp_adapter *adapter = 

[PATCH 21/23] zfcp: silence -Wimplicit-fallthrough in zfcp_erp_lun_strategy()

2018-11-08 Thread Steffen Maier
For some reason the already existing substring "fall through" in the
comment is not sufficient for GCC to silence -Wimplicit-fallthrough.

  CC [M]  drivers/s390/scsi/zfcp_erp.o
drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_lun_strategy':
drivers/s390/scsi/zfcp_erp.c:1065:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
  if (atomic_read(&zfcp_sdev->status) & ZFCP_STATUS_COMMON_OPEN)
  ^
drivers/s390/scsi/zfcp_erp.c:1068:2: note: here
  case ZFCP_ERP_STEP_LUN_CLOSING:
  ^~~~
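
A possible explanation, as far as I understand the GCC documentation: at the
default level (-Wimplicit-fallthrough=3) the whole comment has to match one
of a few patterns, so a comment with extra leading words is not recognized,
while a comment consisting only of "fall through" is. The sketch below uses
made-up names (not the zfcp ones) and puts the marker into its own comment
directly before the next case label.

```c
/* Hypothetical steps; each earlier step continues into the later ones. */
enum demo_step {
	DEMO_UNINITIALIZED,
	DEMO_CLOSING,
	DEMO_OPENING,
};

/* Returns how many stages were run, or -1 on the early-return path. */
static int demo_stages_run(enum demo_step step, int is_open)
{
	int stages = 0;

	switch (step) {
	case DEMO_UNINITIALIZED:
		if (is_open)
			return -1;	/* would start a close first */
		stages++;
		/* already closed */
		/* fall through */
	case DEMO_CLOSING:
		stages++;
		/* fall through */
	case DEMO_OPENING:
		stages++;
		break;
	}

	return stages;
}
```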

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_erp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 8e5f01f5be81..b2845c5b8106 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -1070,7 +1070,8 @@ static enum zfcp_erp_act_result zfcp_erp_lun_strategy(
zfcp_erp_lun_strategy_clearstati(sdev);
if (atomic_read(&zfcp_sdev->status) & ZFCP_STATUS_COMMON_OPEN)
return zfcp_erp_lun_strategy_close(erp_action);
-   /* already closed, fall through */
+   /* already closed */
+   /* fall through */
case ZFCP_ERP_STEP_LUN_CLOSING:
if (atomic_read(&zfcp_sdev->status) & ZFCP_STATUS_COMMON_OPEN)
return ZFCP_ERP_FAILED;
-- 
2.16.4



[PATCH 13/23] zfcp: ERP thread setup kdoc update

2018-11-08 Thread Steffen Maier
zfcp_erp_thread_setup() update complements v2.6.32 commit 347c6a965dc1
("[SCSI] zfcp: Use kthread API for zfcp erp thread").

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_erp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index e7e6b63905e2..f6a2d66eef57 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -1489,7 +1489,7 @@ static int zfcp_erp_thread(void *data)
  * zfcp_erp_thread_setup - Start ERP thread for adapter
  * @adapter: Adapter to start the ERP thread for
  *
- * Returns 0 on success or error code from kernel_thread()
+ * Return: 0 on success, or error code from kthread_run().
  */
 int zfcp_erp_thread_setup(struct zfcp_adapter *adapter)
 {
-- 
2.16.4



[PATCH 14/23] zfcp: clarify function argument name for trace tag string

2018-11-08 Thread Steffen Maier
v2.6.30 commit 5ffd51a5e495 ("[SCSI] zfcp: replace current ERP logging
with a more convenient version") changed trace record distinguishing from a
numerical ID to a 7 character string called "trace tag". While starting to
use function arguments with different type and semantics, it did not change
the argument name accordingly.

v2.6.38 commit ae0904f60fab ("[SCSI] zfcp: Redesign of the debug tracing
for recovery actions.") renamed variable names "id" into "tag" but only
within zfcp_dbf.*, not within zfcp_erp.c.

This was a bit confusing since the remainder of zfcp does use the term
"trace tag". Also, "id" is quite generic and it is not obvious what it
identifies. Unify the naming consistently and use the "dbf" prefix to
relate the arguments to the code in zfcp_dbf.*.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_erp.c  | 92 ++-
 drivers/s390/scsi/zfcp_ext.h  |  8 ++--
 drivers/s390/scsi/zfcp_qdio.c |  8 ++--
 3 files changed, 57 insertions(+), 51 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index f6a2d66eef57..efb47cd6ab4a 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -297,7 +297,7 @@ static struct zfcp_erp_action *zfcp_erp_setup_act(int need, u32 act_status,
 static void zfcp_erp_action_enqueue(int want, struct zfcp_adapter *adapter,
struct zfcp_port *port,
struct scsi_device *sdev,
-   char *id, u32 act_status)
+   char *dbftag, u32 act_status)
 {
int need;
struct zfcp_erp_action *act;
@@ -327,10 +327,11 @@ static void zfcp_erp_action_enqueue(int want, struct zfcp_adapter *adapter,
list_add_tail(&act->list, &adapter->erp_ready_head);
wake_up(&adapter->erp_ready_wq);
  out:
-   zfcp_dbf_rec_trig(id, adapter, port, sdev, want, need);
+   zfcp_dbf_rec_trig(dbftag, adapter, port, sdev, want, need);
 }
 
-void zfcp_erp_port_forced_no_port_dbf(char *id, struct zfcp_adapter *adapter,
+void zfcp_erp_port_forced_no_port_dbf(char *dbftag,
+ struct zfcp_adapter *adapter,
  u64 port_name, u32 port_id)
 {
unsigned long flags;
@@ -344,29 +345,30 @@ void zfcp_erp_port_forced_no_port_dbf(char *id, struct zfcp_adapter *adapter,
atomic_set(&tmpport.status, -1); /* unknown */
tmpport.wwpn = port_name;
tmpport.d_id = port_id;
-   zfcp_dbf_rec_trig(id, adapter, &tmpport, NULL,
+   zfcp_dbf_rec_trig(dbftag, adapter, &tmpport, NULL,
  ZFCP_ERP_ACTION_REOPEN_PORT_FORCED,
  ZFCP_ERP_ACTION_NONE);
write_unlock_irqrestore(&adapter->erp_lock, flags);
 }
 
 static void _zfcp_erp_adapter_reopen(struct zfcp_adapter *adapter,
-   int clear_mask, char *id)
+   int clear_mask, char *dbftag)
 {
zfcp_erp_adapter_block(adapter, clear_mask);
zfcp_scsi_schedule_rports_block(adapter);
 
zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_ADAPTER,
-   adapter, NULL, NULL, id, 0);
+   adapter, NULL, NULL, dbftag, 0);
 }
 
 /**
  * zfcp_erp_adapter_reopen - Reopen adapter.
  * @adapter: Adapter to reopen.
  * @clear: Status flags to clear.
- * @id: Id for debug trace event.
+ * @dbftag: Tag for debug trace event.
  */
-void zfcp_erp_adapter_reopen(struct zfcp_adapter *adapter, int clear, char *id)
+void zfcp_erp_adapter_reopen(struct zfcp_adapter *adapter, int clear,
+char *dbftag)
 {
unsigned long flags;
 
@@ -375,7 +377,7 @@ void zfcp_erp_adapter_reopen(struct zfcp_adapter *adapter, int clear, char *id)
 
write_lock_irqsave(&adapter->erp_lock, flags);
zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_ADAPTER, adapter,
-   NULL, NULL, id, 0);
+   NULL, NULL, dbftag, 0);
write_unlock_irqrestore(&adapter->erp_lock, flags);
 }
 
@@ -383,25 +385,25 @@ void zfcp_erp_adapter_reopen(struct zfcp_adapter *adapter, int clear, char *id)
  * zfcp_erp_adapter_shutdown - Shutdown adapter.
  * @adapter: Adapter to shut down.
  * @clear: Status flags to clear.
- * @id: Id for debug trace event.
+ * @dbftag: Tag for debug trace event.
  */
 void zfcp_erp_adapter_shutdown(struct zfcp_adapter *adapter, int clear,
-  char *id)
+  char *dbftag)
 {
int flags = ZFCP_STATUS_COMMON_RUNNING | ZFCP_STATUS_COMMON_ERP_FAILED;
-   zfcp_erp_adapter_reopen(adapter, clear | flags, id);
+   zfcp_erp_adapter_reopen(adapter, clear | flags, dbftag);
 }
 
 /**
  * zfcp_erp_port_shutdown - Shutdown port
  * @port: Port to shut down.
  * @clear: Status fla

[PATCH 15/23] zfcp: the action field of zfcp_erp_action is actually the type

2018-11-08 Thread Steffen Maier
&zfcp_erp_action.action ==> &zfcp_erp_action.type

While at it, make use of the already defined enum for this purpose
to get at least some build checking (even though an enum is type equivalent
to an int in C). v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code
in zfcp_erp.c") introduced the enum which was cpp defines previously.

To prevent compiler warnings with the switch(act->type), we have to
separate the recently added eyecatchers from enum zfcp_erp_act_type.

Since struct zfcp_erp_action type is embedded into other structures
living in zfcp_def.h, we have to move enum zfcp_erp_act_type from
its private definition in zfcp_erp.c to the zfcp-global zfcp_def.h.

Silence one false -Wswitch compiler warning case: LUNs as the leaves in our
object tree do not have any follow-up success recovery.
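
A rough sketch of the separation, with made-up names (only the numeric
values 1 to 4, 0xc0 and 0xe0 are taken from the diff in this patch). Keeping
the eyecatchers as plain defines means a switch over the enum only has to
cover the four real action types, while a trace field can still carry an
or-combined value.

```c
#include <stdint.h>

/* Hypothetical stand-in for the action type enum: real types only. */
enum demo_act_type {
	DEMO_REOPEN_LUN         = 1,
	DEMO_REOPEN_PORT        = 2,
	DEMO_REOPEN_PORT_FORCED = 3,
	DEMO_REOPEN_ADAPTER     = 4,
};

/* Eyecatcher pseudo flags kept outside the enum. */
#define DEMO_ACTION_NONE	0xc0
#define DEMO_ACTION_FAILED	0xe0

/* Or-combine an action type with an eyecatcher into a u8-sized trace value. */
static uint8_t demo_trace_need(enum demo_act_type want, unsigned int eyecatcher)
{
	return (uint8_t)(want | eyecatcher);
}

/* A switch without default stays complete with just the four enumerators. */
static const char *demo_act_name(enum demo_act_type type)
{
	switch (type) {
	case DEMO_REOPEN_LUN:
		return "lun";
	case DEMO_REOPEN_PORT:
		return "port";
	case DEMO_REOPEN_PORT_FORCED:
		return "port forced";
	case DEMO_REOPEN_ADAPTER:
		return "adapter";
	}
	return "unknown";
}
```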

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_dbf.c |  2 +-
 drivers/s390/scsi/zfcp_def.h | 20 +++-
 drivers/s390/scsi/zfcp_erp.c | 77 +---
 3 files changed, 56 insertions(+), 43 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index 3503de873963..06696b76c300 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -357,7 +357,7 @@ void zfcp_dbf_rec_run_lvl(int level, char *tag, struct zfcp_erp_action *erp)
rec->u.run.fsf_req_id = erp->fsf_req_id;
rec->u.run.rec_status = erp->status;
rec->u.run.rec_step = erp->step;
-   rec->u.run.rec_action = erp->action;
+   rec->u.run.rec_action = erp->type;
 
if (erp->sdev)
rec->u.run.rec_count =
diff --git a/drivers/s390/scsi/zfcp_def.h b/drivers/s390/scsi/zfcp_def.h
index 84a742a67975..4c938eb604b6 100644
--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -89,9 +89,27 @@
 
 /* STRUCTURE DEFINITIONS */
 
+/**
+ * enum zfcp_erp_act_type - Type of ERP action object.
+ * @ZFCP_ERP_ACTION_REOPEN_LUN: LUN recovery.
+ * @ZFCP_ERP_ACTION_REOPEN_PORT: Port recovery.
+ * @ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: Forced port recovery.
+ * @ZFCP_ERP_ACTION_REOPEN_ADAPTER: Adapter recovery.
+ *
+ * Values must fit into u8 because of code dependencies:
+ * zfcp_dbf_rec_trig(), &zfcp_dbf_rec_trigger.want, &zfcp_dbf_rec_trigger.need;
+ * zfcp_dbf_rec_run_lvl(), zfcp_dbf_rec_run(), &zfcp_dbf_rec_running.rec_action.
+ */
+enum zfcp_erp_act_type {
+   ZFCP_ERP_ACTION_REOPEN_LUN = 1,
+   ZFCP_ERP_ACTION_REOPEN_PORT= 2,
+   ZFCP_ERP_ACTION_REOPEN_PORT_FORCED = 3,
+   ZFCP_ERP_ACTION_REOPEN_ADAPTER = 4,
+};
+
 struct zfcp_erp_action {
struct list_head list;
-   int action;   /* requested action code */
+   enum zfcp_erp_act_type type;  /* requested action code */
struct zfcp_adapter *adapter; /* device which should be recovered */
struct zfcp_port *port;
struct scsi_device *sdev;
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index efb47cd6ab4a..49d04e5af55f 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -4,7 +4,7 @@
  *
  * Error Recovery Procedures (ERP).
  *
- * Copyright IBM Corp. 2002, 2016
+ * Copyright IBM Corp. 2002, 2017
  */
 
 #define KMSG_COMPONENT "zfcp"
@@ -33,29 +33,18 @@ enum zfcp_erp_steps {
ZFCP_ERP_STEP_LUN_OPENING   = 0x2000,
 };
 
-/**
- * enum zfcp_erp_act_type - Type of ERP action object.
- * @ZFCP_ERP_ACTION_REOPEN_LUN: LUN recovery.
- * @ZFCP_ERP_ACTION_REOPEN_PORT: Port recovery.
- * @ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: Forced port recovery.
- * @ZFCP_ERP_ACTION_REOPEN_ADAPTER: Adapter recovery.
- * @ZFCP_ERP_ACTION_NONE: Eyecatcher pseudo flag to bitwise or-combine with
- *   either of the first four enum values.
- *   Used to indicate that an ERP action could not be
- *   set up despite a detected need for some recovery.
- * @ZFCP_ERP_ACTION_FAILED: Eyecatcher pseudo flag to bitwise or-combine with
- * either of the first four enum values.
- * Used to indicate that ERP not needed because
- * the object has ZFCP_STATUS_COMMON_ERP_FAILED.
+/*
+ * Eyecatcher pseudo flag to bitwise or-combine with enum zfcp_erp_act_type.
+ * Used to indicate that an ERP action could not be set up despite a detected
+ * need for some recovery.
  */
-enum zfcp_erp_act_type {
-   ZFCP_ERP_ACTION_REOPEN_LUN         = 1,
-   ZFCP_ERP_ACTION_REOPEN_PORT        = 2,
-   ZFCP_ERP_ACTION_REOPEN_PORT_FORCED = 3,
-   ZFCP_ERP_ACTION_REOPEN_ADAPTER     = 4,
-   ZFCP_ERP_ACTION_NONE               = 0xc0,
-   ZFCP_ERP_ACTION_FAILED             = 0xe0,
-};
+#define ZFCP_ERP_ACTION_NONE   0xc0
+/*
+ * Eyecatcher pseudo flag to bitwise or-combine with enum zfcp_
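
The u8 constraint and the or-combined eyecatcher flags can be illustrated with a small userspace sketch. The enum and flag values are copied from the diff above; the helper erp_trace_need() is hypothetical and only exists for this illustration, it is not a zfcp function.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch above. */
enum zfcp_erp_act_type {
	ZFCP_ERP_ACTION_REOPEN_LUN         = 1,
	ZFCP_ERP_ACTION_REOPEN_PORT        = 2,
	ZFCP_ERP_ACTION_REOPEN_PORT_FORCED = 3,
	ZFCP_ERP_ACTION_REOPEN_ADAPTER     = 4,
};

#define ZFCP_ERP_ACTION_NONE   0xc0
#define ZFCP_ERP_ACTION_FAILED 0xe0

/*
 * Hypothetical helper (assumption, not driver code): pack an eyecatcher
 * flag and an action type into the single byte a trace field can hold.
 */
static uint8_t erp_trace_need(unsigned int flag, enum zfcp_erp_act_type type)
{
	unsigned int combined = flag | (unsigned int)type;

	assert(combined <= UINT8_MAX); /* must fit into u8 */
	return (uint8_t)combined;
}
```

Because the action types occupy only the low bits and the eyecatchers only the high bits, both parts stay readable in a hex dump of the trace byte.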

[PATCH 09/23] zfcp: drop unnecessary forward prototype for struct zfcp_fsf_req

2018-11-08 Thread Steffen Maier
Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_def.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_def.h b/drivers/s390/scsi/zfcp_def.h
index 31b3e2bb3b42..572debf2f528 100644
--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -89,8 +89,6 @@
 
 /* STRUCTURE DEFINITIONS */
 
-struct zfcp_fsf_req;
-
 struct zfcp_erp_action {
struct list_head list;
int action;   /* requested action code */
-- 
2.16.4



[PATCH 12/23] zfcp: update kernel message for invalid FCP_CMND length, it's not the CDB

2018-11-08 Thread Steffen Maier
The CDB is just a part inside of FCP_CMND, see zfcp_fc_scsi_to_fcp().
While at it, fix the device driver reaction: adapter not LUN shutdown.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_fsf.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index c949c65ffc6a..0bdbc596da97 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2090,11 +2090,8 @@ static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req,
break;
case FSF_CMND_LENGTH_NOT_VALID:
dev_err(&req->adapter->ccw_device->dev,
-   "Incorrect CDB length %d, LUN 0x%016Lx on "
-   "port 0x%016Lx closed\n",
-   req->qtcb->bottom.io.fcp_cmnd_length,
-   (unsigned long long)zfcp_scsi_dev_lun(sdev),
-   (unsigned long long)zfcp_sdev->port->wwpn);
+   "Incorrect FCP_CMND length %d, FCP device closed\n",
+   req->qtcb->bottom.io.fcp_cmnd_length);
zfcp_erp_adapter_shutdown(req->adapter, 0, "fssfch4");
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
break;
-- 
2.16.4



[PATCH 11/23] zfcp: drop duplicate seq_no from zfcp_fsf_req which is also in QTCB header

2018-11-08 Thread Steffen Maier
There is no point in double bookkeeping, especially just for tracing.
The trace can take it from the QTCB which always exists for non-SRB
responses traced with zfcp_dbf_hba_fsf_res().

As a side effect, this removes an alignment hole and reduces the
size of struct zfcp_fsf_req, and thus of each pending request, by
8 bytes.
Before:
$ pahole -C zfcp_fsf_req drivers/s390/scsi/zfcp.ko
...
struct fsf_qtcb *  qtcb; /*   144 8 */
u32seq_no;   /*   152 4 */
/* XXX 4 bytes hole, try to pack */
void * data; /*   160 8 */
...
/* size: 296, cachelines: 2, members: 14 */
/* sum members: 288, holes: 2, sum holes: 8 */
/* last cacheline: 40 bytes */
After:
$ pahole -C zfcp_fsf_req drivers/s390/scsi/zfcp.ko
...
struct fsf_qtcb *  qtcb; /*   144 8 */
void * data; /*   152 8 */
...
/* size: 288, cachelines: 2, members: 13 */
/* sum members: 284, holes: 1, sum holes: 4 */

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_dbf.c | 2 +-
 drivers/s390/scsi/zfcp_def.h | 2 --
 drivers/s390/scsi/zfcp_fsf.c | 1 -
 3 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index d20977bb27a4..3503de873963 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -82,7 +82,7 @@ void zfcp_dbf_hba_fsf_res(char *tag, int level, struct zfcp_fsf_req *req)
rec->fsf_req_id = req->req_id;
rec->fsf_req_status = req->status;
rec->fsf_cmd = q_head->fsf_command;
-   rec->fsf_seq_no = req->seq_no;
+   rec->fsf_seq_no = q_pref->req_seq_no;
rec->u.res.req_issued = req->issued;
rec->u.res.prot_status = q_pref->prot_status;
rec->u.res.fsf_status = q_head->fsf_status;
diff --git a/drivers/s390/scsi/zfcp_def.h b/drivers/s390/scsi/zfcp_def.h
index d65adb0ae9f1..84a742a67975 100644
--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -278,7 +278,6 @@ static inline u64 zfcp_scsi_dev_lun(struct scsi_device *sdev)
  * @completion: used to signal the completion of the request
  * @status: status of the request
  * @qtcb: associated QTCB
- * @seq_no: sequence number of this request
  * @data: private data
  * @timer: timer data of this request
  * @erp_action: reference to erp action if request issued on behalf of ERP
@@ -294,7 +293,6 @@ struct zfcp_fsf_req {
struct completion   completion;
u32 status;
struct fsf_qtcb *qtcb;
-   u32 seq_no;
void*data;
struct timer_list   timer;
struct zfcp_erp_action  *erp_action;
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 07b86375b461..c949c65ffc6a 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -724,7 +724,6 @@ static struct zfcp_fsf_req *zfcp_fsf_req_create(struct zfcp_qdio *qdio,
return ERR_PTR(-ENOMEM);
}
 
-   req->seq_no = adapter->fsf_req_seq_no;
req->qtcb->prefix.req_seq_no = adapter->fsf_req_seq_no;
req->qtcb->prefix.req_id = req->req_id;
req->qtcb->prefix.ulp_info = 26;
-- 
2.16.4
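
The alignment hole reported by pahole above can be reproduced in userspace with offsetof() and sizeof(). This is only a sketch: the two structs are simplified stand-ins that keep just the three neighbouring members, not the real struct zfcp_fsf_req, and the exact numbers depend on the ABI (the 4-byte hole and the 8 bytes saved match a 64-bit target).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins (assumption), not the real struct zfcp_fsf_req. */
struct req_before {
	void     *qtcb;
	uint32_t  seq_no;
	void     *data;
};

struct req_after {
	void *qtcb;
	void *data;
};

/* Bytes of padding the compiler inserts between seq_no and data. */
static size_t hole_before(void)
{
	return offsetof(struct req_before, data) -
	       (offsetof(struct req_before, seq_no) + sizeof(uint32_t));
}

/* Bytes saved per request by dropping seq_no, hole included. */
static size_t bytes_saved(void)
{
	return sizeof(struct req_before) - sizeof(struct req_after);
}
```

The saving is the member itself plus the padding that only existed to align the following pointer.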



[PATCH 10/23] zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header

2018-11-08 Thread Steffen Maier
Status read buffers (SRBs, unsolicited notifications) never use a QTCB
[zfcp_fsf_req_create()]. zfcp_fsf_req_send() already uses this to
distinguish SRBs from other FSF request types. We can re-use this method
in zfcp_fsf_req_complete(). Introduce a helper function to make the check
for req->qtcb less magic.

SRBs always are FSF_QTCB_UNSOLICITED_STATUS, so we can hard-code this for
the two trace functions dealing with SRBs.

All other FSF request types have a QTCB and we can get the fsf_command
from there.

zfcp_dbf_hba_fsf_response() and thus zfcp_dbf_hba_fsf_res() are only called
for non-SRB requests so it's safe to dereference the QTCB
[zfcp_fsf_req_complete() returns early on SRB, else calls
 zfcp_fsf_protstatus_eval() which calls zfcp_dbf_hba_fsf_response()].
In zfcp_scsi_forget_cmnd() we guard the QTCB dereference with a preceding
NULL check and rely on boolean shortcut evaluation.

As a side effect, this causes an alignment hole which we can close in
a later patch after having cleaned up all fields of struct zfcp_fsf_req.
Before:
$ pahole -C zfcp_fsf_req drivers/s390/scsi/zfcp.ko
...
u32status;   /*   136 4 */
u32fsf_command;  /*   140 4 */
struct fsf_qtcb *  qtcb; /*   144 8 */
...
After:
$ pahole -C zfcp_fsf_req drivers/s390/scsi/zfcp.ko
...
u32status;   /*   136 4 */
/* XXX 4 bytes hole, try to pack */
struct fsf_qtcb *  qtcb; /*   144 8 */
...

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_dbf.c  |  8 
 drivers/s390/scsi/zfcp_dbf.h  |  4 ++--
 drivers/s390/scsi/zfcp_def.h  |  7 +--
 drivers/s390/scsi/zfcp_fsf.c  | 14 ++
 drivers/s390/scsi/zfcp_scsi.c |  4 +++-
 5 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index 3b368fcf13f4..d20977bb27a4 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -81,7 +81,7 @@ void zfcp_dbf_hba_fsf_res(char *tag, int level, struct zfcp_fsf_req *req)
rec->id = ZFCP_DBF_HBA_RES;
rec->fsf_req_id = req->req_id;
rec->fsf_req_status = req->status;
-   rec->fsf_cmd = req->fsf_command;
+   rec->fsf_cmd = q_head->fsf_command;
rec->fsf_seq_no = req->seq_no;
rec->u.res.req_issued = req->issued;
rec->u.res.prot_status = q_pref->prot_status;
@@ -94,7 +94,7 @@ void zfcp_dbf_hba_fsf_res(char *tag, int level, struct zfcp_fsf_req *req)
memcpy(rec->u.res.fsf_status_qual, &q_head->fsf_status_qual,
   FSF_STATUS_QUALIFIER_SIZE);
 
-   if (req->fsf_command != FSF_QTCB_FCP_CMND) {
+   if (q_head->fsf_command != FSF_QTCB_FCP_CMND) {
rec->pl_len = q_head->log_length;
zfcp_dbf_pl_write(dbf, (char *)q_pref + q_head->log_start,
  rec->pl_len, "fsf_res", req->req_id);
@@ -127,7 +127,7 @@ void zfcp_dbf_hba_fsf_uss(char *tag, struct zfcp_fsf_req *req)
rec->id = ZFCP_DBF_HBA_USS;
rec->fsf_req_id = req->req_id;
rec->fsf_req_status = req->status;
-   rec->fsf_cmd = req->fsf_command;
+   rec->fsf_cmd = FSF_QTCB_UNSOLICITED_STATUS;
 
if (!srb)
goto log;
@@ -174,7 +174,7 @@ void zfcp_dbf_hba_bit_err(char *tag, struct zfcp_fsf_req *req)
rec->id = ZFCP_DBF_HBA_BIT;
rec->fsf_req_id = req->req_id;
rec->fsf_req_status = req->status;
-   rec->fsf_cmd = req->fsf_command;
+   rec->fsf_cmd = FSF_QTCB_UNSOLICITED_STATUS;
memcpy(&rec->u.be, &sr_buf->payload.bit_error,
   sizeof(struct fsf_bit_error_payload));
 
diff --git a/drivers/s390/scsi/zfcp_dbf.h b/drivers/s390/scsi/zfcp_dbf.h
index d116c07ed77a..b4438713d1cc 100644
--- a/drivers/s390/scsi/zfcp_dbf.h
+++ b/drivers/s390/scsi/zfcp_dbf.h
@@ -339,8 +339,8 @@ void zfcp_dbf_hba_fsf_response(struct zfcp_fsf_req *req)
  zfcp_dbf_hba_fsf_resp_suppress(req)
  ? 5 : 1, req);
 
-   } else if ((req->fsf_command == FSF_QTCB_OPEN_PORT_WITH_DID) ||
-  (req->fsf_command == FSF_QTCB_OPEN_LUN)) {
+   } else if ((qtcb->header.fsf_command == FSF_QTCB_OPEN_PORT_WITH_DID) ||
+  (qtcb->header.fsf_command == FSF_QTCB_OPEN_LUN)) {
zfcp_dbf_hba_fsf_resp("fs_open", 4, req);
 
} else if (qtcb->header.log_length) {
diff --git a/drivers/s390/scsi/zfcp_def.h b/drivers/s390/scsi/zfcp_def.h
index 572debf2f528..d65adb0ae9f1 100644
--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -277,7 +277,6 @@ static inline u64 zfcp_scsi_dev_lun(
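
The QTCB-based distinction described in the commit message can be sketched as follows. The diff is cut off before the helper appears, so everything here is an assumption for illustration: the struct layouts are reduced to the fields needed, the helper names are hypothetical, and the FSF_QTCB_FCP_CMND value is only a placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FSF_QTCB_FCP_CMND 0x00000001u /* placeholder value (assumption) */

/* Reduced stand-ins for the driver structures (assumption). */
struct qtcb_header { uint32_t fsf_command; };
struct fsf_qtcb    { struct qtcb_header header; };
struct fsf_req     { struct fsf_qtcb *qtcb; }; /* NULL for SRBs */

/* Hypothetical helper: status read buffers never carry a QTCB. */
static bool req_is_status_read_buffer(const struct fsf_req *req)
{
	return req->qtcb == NULL;
}

/*
 * The NULL check comes first, so short-circuit evaluation of && guarantees
 * the QTCB is only dereferenced for non-SRB requests.
 */
static bool req_is_fcp_cmnd(const struct fsf_req *req)
{
	return !req_is_status_read_buffer(req) &&
	       req->qtcb->header.fsf_command == FSF_QTCB_FCP_CMND;
}
```

Wrapping the pointer test in a named helper documents the intent at each call site instead of leaving a bare "!req->qtcb".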

[PATCH 03/23] zfcp: move SG table helper from aux to fc and make them static

2018-11-08 Thread Steffen Maier
Since commit 663e0890e31c ("[SCSI] zfcp: remove access control tables
interface") these helper functions are only used for auto port scan in
zfcp_fc.c. Also change them to the corresponding namespace prefix.

This is a small cleanup for the miscellaneous catchall compile unit
zfcp_aux.c.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_aux.c | 44 +-
 drivers/s390/scsi/zfcp_fc.c  | 46 ++--
 2 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_aux.c b/drivers/s390/scsi/zfcp_aux.c
index 8818a3a290f6..df10f4e07a4a 100644
--- a/drivers/s390/scsi/zfcp_aux.c
+++ b/drivers/s390/scsi/zfcp_aux.c
@@ -4,7 +4,7 @@
  *
  * Module interface and handling of zfcp data structures.
  *
- * Copyright IBM Corp. 2002, 2013
+ * Copyright IBM Corp. 2002, 2017
  */
 
 /*
@@ -538,45 +538,3 @@ struct zfcp_port *zfcp_port_enqueue(struct zfcp_adapter *adapter, u64 wwpn,
zfcp_ccw_adapter_put(adapter);
return ERR_PTR(retval);
 }
-
-/**
- * zfcp_sg_free_table - free memory used by scatterlists
- * @sg: pointer to scatterlist
- * @count: number of scatterlist which are to be free'ed
- * the scatterlist are expected to reference pages always
- */
-void zfcp_sg_free_table(struct scatterlist *sg, int count)
-{
-   int i;
-
-   for (i = 0; i < count; i++, sg++)
-   if (sg)
-   free_page((unsigned long) sg_virt(sg));
-   else
-   break;
-}
-
-/**
- * zfcp_sg_setup_table - init scatterlist and allocate, assign buffers
- * @sg: pointer to struct scatterlist
- * @count: number of scatterlists which should be assigned with buffers
- * of size page
- *
- * Returns: 0 on success, -ENOMEM otherwise
- */
-int zfcp_sg_setup_table(struct scatterlist *sg, int count)
-{
-   void *addr;
-   int i;
-
-   sg_init_table(sg, count);
-   for (i = 0; i < count; i++, sg++) {
-   addr = (void *) get_zeroed_page(GFP_KERNEL);
-   if (!addr) {
-   zfcp_sg_free_table(sg, i);
-   return -ENOMEM;
-   }
-   sg_set_buf(sg, addr, PAGE_SIZE);
-   }
-   return 0;
-}
diff --git a/drivers/s390/scsi/zfcp_fc.c b/drivers/s390/scsi/zfcp_fc.c
index f6c415d6ef48..84a9c69cdd56 100644
--- a/drivers/s390/scsi/zfcp_fc.c
+++ b/drivers/s390/scsi/zfcp_fc.c
@@ -597,6 +597,48 @@ void zfcp_fc_test_link(struct zfcp_port *port)
put_device(&port->dev);
 }
 
+/**
+ * zfcp_fc_sg_free_table - free memory used by scatterlists
+ * @sg: pointer to scatterlist
+ * @count: number of scatterlist which are to be free'ed
+ * the scatterlist are expected to reference pages always
+ */
+static void zfcp_fc_sg_free_table(struct scatterlist *sg, int count)
+{
+   int i;
+
+   for (i = 0; i < count; i++, sg++)
+   if (sg)
+   free_page((unsigned long) sg_virt(sg));
+   else
+   break;
+}
+
+/**
+ * zfcp_fc_sg_setup_table - init scatterlist and allocate, assign buffers
+ * @sg: pointer to struct scatterlist
+ * @count: number of scatterlists which should be assigned with buffers
+ * of size page
+ *
+ * Returns: 0 on success, -ENOMEM otherwise
+ */
+static int zfcp_fc_sg_setup_table(struct scatterlist *sg, int count)
+{
+   void *addr;
+   int i;
+
+   sg_init_table(sg, count);
+   for (i = 0; i < count; i++, sg++) {
+   addr = (void *) get_zeroed_page(GFP_KERNEL);
+   if (!addr) {
+   zfcp_fc_sg_free_table(sg, i);
+   return -ENOMEM;
+   }
+   sg_set_buf(sg, addr, PAGE_SIZE);
+   }
+   return 0;
+}
+
 static struct zfcp_fc_req *zfcp_fc_alloc_sg_env(int buf_num)
 {
struct zfcp_fc_req *fc_req;
@@ -605,7 +647,7 @@ static struct zfcp_fc_req *zfcp_fc_alloc_sg_env(int buf_num)
if (!fc_req)
return NULL;
 
-   if (zfcp_sg_setup_table(&fc_req->sg_rsp, buf_num)) {
+   if (zfcp_fc_sg_setup_table(&fc_req->sg_rsp, buf_num)) {
kmem_cache_free(zfcp_fc_req_cache, fc_req);
return NULL;
}
@@ -763,7 +805,7 @@ void zfcp_fc_scan_ports(struct work_struct *work)
break;
}
}
-   zfcp_sg_free_table(&fc_req->sg_rsp, buf_num);
+   zfcp_fc_sg_free_table(&fc_req->sg_rsp, buf_num);
kmem_cache_free(zfcp_fc_req_cache, fc_req);
 out:
zfcp_fc_wka_port_put(&adapter->gs->ds);
-- 
2.16.4
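
The allocate-and-roll-back pattern of the two SG table helpers can be sketched in userspace. This is an analogy under stated assumptions, not the driver code: a plain pointer array replaces the scatterlist, malloc()/free() replace get_zeroed_page()/free_page(), and the injectable allocator plus all names are hypothetical. Unlike the helpers in the diff, the sketch indexes from the table base instead of advancing the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BUF_SIZE 4096 /* stand-in for PAGE_SIZE (assumption) */

typedef void *(*alloc_fn)(size_t);

/* Free the first count buffers of the table and clear the slots. */
static void table_free(void **table, int count)
{
	for (int i = 0; i < count; i++) {
		free(table[i]);
		table[i] = NULL;
	}
}

/*
 * Fill count slots with buffers; on failure release what was allocated
 * so far. Returns 0 on success, -1 otherwise.
 */
static int table_setup(void **table, int count, alloc_fn alloc)
{
	for (int i = 0; i < count; i++) {
		table[i] = alloc(BUF_SIZE);
		if (!table[i]) {
			table_free(table, i);
			return -1;
		}
	}
	return 0;
}

/* Allocator for exercising the error path: fails on its third call. */
static int alloc_calls;

static void *failing_alloc(size_t size)
{
	return ++alloc_calls == 3 ? NULL : malloc(size);
}
```

Passing the allocator as a parameter is only there so the failure path can be exercised deterministically.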



[PATCH 08/23] zfcp: group sort internal structure definitions for proximity

2018-11-08 Thread Steffen Maier
Have structures just before the structures that use them
(without disrupting sequences of using structures such as
 zfcp_unit and zfcp_scsi_dev):
- zfcp_adapter_mempool embedded in zfcp_adapter,
- zfcp_latenc... embedded in zfcp_scsi_dev.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_def.h | 58 ++--
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_def.h b/drivers/s390/scsi/zfcp_def.h
index e227b0770221..31b3e2bb3b42 100644
--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -91,18 +91,6 @@
 
 struct zfcp_fsf_req;
 
-/* holds various memory pools of an adapter */
-struct zfcp_adapter_mempool {
-   mempool_t *erp_req;
-   mempool_t *gid_pn_req;
-   mempool_t *scsi_req;
-   mempool_t *scsi_abort;
-   mempool_t *status_read_req;
-   mempool_t *sr_data;
-   mempool_t *gid_pn;
-   mempool_t *qtcb_pool;
-};
-
 struct zfcp_erp_action {
struct list_head list;
int action;   /* requested action code */
@@ -115,23 +103,16 @@ struct zfcp_erp_action {
struct timer_list timer;
 };
 
-struct zfcp_latency_record {
-   u32 min;
-   u32 max;
-   u64 sum;
-};
-
-struct zfcp_latency_cont {
-   struct zfcp_latency_record channel;
-   struct zfcp_latency_record fabric;
-   u64 counter;
-};
-
-struct zfcp_latencies {
-   struct zfcp_latency_cont read;
-   struct zfcp_latency_cont write;
-   struct zfcp_latency_cont cmd;
-   spinlock_t lock;
+/* holds various memory pools of an adapter */
+struct zfcp_adapter_mempool {
+   mempool_t *erp_req;
+   mempool_t *gid_pn_req;
+   mempool_t *scsi_req;
+   mempool_t *scsi_abort;
+   mempool_t *status_read_req;
+   mempool_t *sr_data;
+   mempool_t *gid_pn;
+   mempool_t *qtcb_pool;
 };
 
 struct zfcp_adapter {
@@ -212,6 +193,25 @@ struct zfcp_port {
unsigned intstarget_id;
 };
 
+struct zfcp_latency_record {
+   u32 min;
+   u32 max;
+   u64 sum;
+};
+
+struct zfcp_latency_cont {
+   struct zfcp_latency_record channel;
+   struct zfcp_latency_record fabric;
+   u64 counter;
+};
+
+struct zfcp_latencies {
+   struct zfcp_latency_cont read;
+   struct zfcp_latency_cont write;
+   struct zfcp_latency_cont cmd;
+   spinlock_t lock;
+};
+
 /**
  * struct zfcp_unit - LUN configured via zfcp sysfs
  * @dev: struct device for sysfs representation and reference counting
-- 
2.16.4



[PATCH 07/23] zfcp: namespace prefix for internal latency data structures

2018-11-08 Thread Steffen Maier
In contrast to struct fsf_qual_latency_info, the ones here are not FSF
but software defined zfcp-internal.

Signed-off-by: Steffen Maier 
Reviewed-by: Benjamin Block 
---
 drivers/s390/scsi/zfcp_def.h | 14 +++---
 drivers/s390/scsi/zfcp_fsf.c |  4 ++--
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_def.h b/drivers/s390/scsi/zfcp_def.h
index 13bfc13eb42d..e227b0770221 100644
--- a/drivers/s390/scsi/zfcp_def.h
+++ b/drivers/s390/scsi/zfcp_def.h
@@ -115,22 +115,22 @@ struct zfcp_erp_action {
struct timer_list timer;
 };
 
-struct fsf_latency_record {
+struct zfcp_latency_record {
u32 min;
u32 max;
u64 sum;
 };
 
-struct latency_cont {
-   struct fsf_latency_record channel;
-   struct fsf_latency_record fabric;
+struct zfcp_latency_cont {
+   struct zfcp_latency_record channel;
+   struct zfcp_latency_record fabric;
u64 counter;
 };
 
 struct zfcp_latencies {
-   struct latency_cont read;
-   struct latency_cont write;
-   struct latency_cont cmd;
+   struct zfcp_latency_cont read;
+   struct zfcp_latency_cont write;
+   struct zfcp_latency_cont cmd;
spinlock_t lock;
 };
 
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 095ab7fdcf4b..62311bd2df03 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -1991,7 +1991,7 @@ int zfcp_fsf_close_lun(struct zfcp_erp_action *erp_action)
return retval;
 }
 
-static void zfcp_fsf_update_lat(struct fsf_latency_record *lat_rec, u32 lat)
+static void zfcp_fsf_update_lat(struct zfcp_latency_record *lat_rec, u32 lat)
 {
lat_rec->sum += lat;
lat_rec->min = min(lat_rec->min, lat);
@@ -2001,7 +2001,7 @@ static void zfcp_fsf_update_lat(struct fsf_latency_record *lat_rec, u32 lat)
 static void zfcp_fsf_req_trace(struct zfcp_fsf_req *req, struct scsi_cmnd *scsi)
 {
struct fsf_qual_latency_info *lat_in;
-   struct latency_cont *lat = NULL;
+   struct zfcp_latency_cont *lat = NULL;
struct zfcp_scsi_dev *zfcp_sdev;
struct zfcp_blk_drv_data blktrc;
int ticks = req->adapter->timer_ticks;
-- 
2.16.4
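
How a latency record accumulates samples can be sketched in userspace. The field layout follows struct zfcp_latency_record from the diff; the names latency_init() and latency_update() and the initial values (minimum starts at the largest u32) are assumptions for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same fields as struct zfcp_latency_record in the diff above. */
struct latency_record {
	uint32_t min;
	uint32_t max;
	uint64_t sum;
};

/* Assumed initial state: min starts high so the first sample replaces it. */
static void latency_init(struct latency_record *rec)
{
	rec->min = UINT32_MAX;
	rec->max = 0;
	rec->sum = 0;
}

/* Fold one latency sample into the running sum, minimum and maximum. */
static void latency_update(struct latency_record *rec, uint32_t lat)
{
	rec->sum += lat;
	if (lat < rec->min)
		rec->min = lat;
	if (lat > rec->max)
		rec->max = lat;
}
```

The 64-bit sum leaves headroom for many 32-bit samples; an average needs a separate sample count, which is the role of the counter member in struct zfcp_latency_cont.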



[PATCH 02/22] zfcp: remove unnecessary null pointer check before mempool_destroy

2018-11-08 Thread Steffen Maier
From: zhong jiang 

mempool_destroy() already checks for a NULL pointer, so remove the
redundant checks.

Signed-off-by: zhong jiang 
Acked-by: Benjamin Block 
[ma...@linux.ibm.com: depends on v4.3 4e3ca3e033d1 ("mm/mempool: allow NULL `pool' pointer in mempool_destroy()")]
Signed-off-by: Steffen Maier 
---
 drivers/s390/scsi/zfcp_aux.c | 21 +++--
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_aux.c b/drivers/s390/scsi/zfcp_aux.c
index 08cdc00e8299..8818a3a290f6 100644
--- a/drivers/s390/scsi/zfcp_aux.c
+++ b/drivers/s390/scsi/zfcp_aux.c
@@ -251,20 +251,13 @@ static int zfcp_allocate_low_mem_buffers(struct zfcp_adapter *adapter)
 
 static void zfcp_free_low_mem_buffers(struct zfcp_adapter *adapter)
 {
-   if (adapter->pool.erp_req)
-   mempool_destroy(adapter->pool.erp_req);
-   if (adapter->pool.scsi_req)
-   mempool_destroy(adapter->pool.scsi_req);
-   if (adapter->pool.scsi_abort)
-   mempool_destroy(adapter->pool.scsi_abort);
-   if (adapter->pool.qtcb_pool)
-   mempool_destroy(adapter->pool.qtcb_pool);
-   if (adapter->pool.status_read_req)
-   mempool_destroy(adapter->pool.status_read_req);
-   if (adapter->pool.sr_data)
-   mempool_destroy(adapter->pool.sr_data);
-   if (adapter->pool.gid_pn)
-   mempool_destroy(adapter->pool.gid_pn);
+   mempool_destroy(adapter->pool.erp_req);
+   mempool_destroy(adapter->pool.scsi_req);
+   mempool_destroy(adapter->pool.scsi_abort);
+   mempool_destroy(adapter->pool.qtcb_pool);
+   mempool_destroy(adapter->pool.status_read_req);
+   mempool_destroy(adapter->pool.sr_data);
+   mempool_destroy(adapter->pool.gid_pn);
 }
 
 /**
-- 
2.16.4
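
Why the callers can drop their checks is easiest to see with a destructor that tolerates NULL itself. The following is a userspace analogy under stated assumptions: struct pool, pool_create() and pool_destroy() are hypothetical names, and the counter exists only to make the behaviour observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory pool (assumption). */
struct pool {
	void *elements;
};

static int pools_destroyed; /* counts real teardowns, for illustration */

static struct pool *pool_create(void)
{
	struct pool *p = calloc(1, sizeof(*p));

	if (p)
		p->elements = malloc(64);
	return p;
}

/* Accepts NULL, like free(), so callers need no check of their own. */
static void pool_destroy(struct pool *p)
{
	if (!p)
		return;
	free(p->elements);
	free(p);
	pools_destroyed++;
}
```

With the check inside the destructor, a teardown path that may run after a partially failed allocation reduces to a plain list of destroy calls.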



[PATCH 01/23] zfcp: make DIX experimental, disabled, and independent of DIF

2018-11-08 Thread Steffen Maier
From: Fedor Loshakov 

There are too many unresolved issues with DIX outside of zfcp
such as wrong protection data on writesame/discard (over device-mapper)
or due to unstable page writes.
This can cause I/O stalls or endless loops or even kernel panics,
or I/O errors due to erroneously failed logical block guard checks.

Therefore, introduce separate zfcp module parameters to individually
select support for:
DIF which should work (zfcp.dif, which used to be DIF+DIX, disabled) or
DIX+DIF which causes trouble (zfcp.dix, new, disabled).

If DIX is enabled, we warn on zfcp driver initialization.

Signed-off-by: Steffen Maier 
Co-developed-by: Fedor Loshakov 
Signed-off-by: Fedor Loshakov 
Reviewed-by: Jens Remus 
---
 drivers/s390/scsi/zfcp_aux.c  |  3 +++
 drivers/s390/scsi/zfcp_ext.h  |  1 +
 drivers/s390/scsi/zfcp_scsi.c | 10 +++---
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_aux.c b/drivers/s390/scsi/zfcp_aux.c
index 94f4d8fe85e0..08cdc00e8299 100644
--- a/drivers/s390/scsi/zfcp_aux.c
+++ b/drivers/s390/scsi/zfcp_aux.c
@@ -124,6 +124,9 @@ static int __init zfcp_module_init(void)
 {
int retval = -ENOMEM;
 
+   if (zfcp_experimental_dix)
+   pr_warn("DIX is enabled. It is experimental and might cause problems\n");
+
zfcp_fsf_qtcb_cache = zfcp_cache_hw_align("zfcp_fsf_qtcb",
  sizeof(struct fsf_qtcb));
if (!zfcp_fsf_qtcb_cache)
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index bd0c5a9f04cb..0940bef35020 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -144,6 +144,7 @@ extern void zfcp_qdio_close(struct zfcp_qdio *);
 extern void zfcp_qdio_siosl(struct zfcp_adapter *);
 
 /* zfcp_scsi.c */
+extern bool zfcp_experimental_dix;
 extern struct scsi_transport_template *zfcp_scsi_transport_template;
 extern int zfcp_scsi_adapter_register(struct zfcp_adapter *);
 extern void zfcp_scsi_adapter_unregister(struct zfcp_adapter *);
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index a8efcb330bc1..2b8c33627460 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -27,7 +27,11 @@ MODULE_PARM_DESC(queue_depth, "Default queue depth for new SCSI devices");
 
 static bool enable_dif;
 module_param_named(dif, enable_dif, bool, 0400);
-MODULE_PARM_DESC(dif, "Enable DIF/DIX data integrity support");
+MODULE_PARM_DESC(dif, "Enable DIF data integrity support (default off)");
+
+bool zfcp_experimental_dix;
+module_param_named(dix, zfcp_experimental_dix, bool, 0400);
+MODULE_PARM_DESC(dix, "Enable experimental DIX (data integrity extension) support which implies DIF support (default off)");
 
 static bool allow_lun_scan = true;
 module_param(allow_lun_scan, bool, 0600);
@@ -788,11 +792,11 @@ void zfcp_scsi_set_prot(struct zfcp_adapter *adapter)
data_div = atomic_read(&adapter->status) &
   ZFCP_STATUS_ADAPTER_DATA_DIV_ENABLED;
 
-   if (enable_dif &&
+   if ((enable_dif || zfcp_experimental_dix) &&
adapter->adapter_features & FSF_FEATURE_DIF_PROT_TYPE1)
mask |= SHOST_DIF_TYPE1_PROTECTION;
 
-   if (enable_dif && data_div &&
+   if (zfcp_experimental_dix && data_div &&
adapter->adapter_features & FSF_FEATURE_DIX_PROT_TCPIP) {
mask |= SHOST_DIX_TYPE1_PROTECTION;
scsi_host_set_guard(shost, SHOST_DIX_GUARD_IP);
-- 
2.16.4
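
The resulting parameter semantics (dif selects DIF only, dix selects DIX and implies DIF) can be sketched as a pure function. The bit values and the struct adapter_caps are assumptions for this illustration; they stand in for the SHOST_* protection bits, the FSF feature flags and the data division status, and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values (assumption), not the kernel's SHOST_* values. */
#define PROT_DIF_TYPE1 (1u << 0)
#define PROT_DIX_TYPE1 (1u << 1)

/* Hypothetical summary of what the adapter reports. */
struct adapter_caps {
	bool dif_type1; /* adapter offers DIF type 1 */
	bool dix_tcpip; /* adapter offers DIX with IP checksum guard */
	bool data_div;  /* data division is enabled */
};

/* Mirror of the two conditions in the patched zfcp_scsi_set_prot(). */
static unsigned int prot_mask(bool dif, bool dix,
			      const struct adapter_caps *caps)
{
	unsigned int mask = 0;

	if ((dif || dix) && caps->dif_type1)
		mask |= PROT_DIF_TYPE1;
	if (dix && caps->data_div && caps->dix_tcpip)
		mask |= PROT_DIX_TYPE1;
	return mask;
}
```

With both parameters off the mask stays empty, which matches the new default of having data integrity support disabled.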



[PATCH 08/25] zfcp: decouple SCSI traces for scsi_eh / TMF from scsi_cmnd

2018-05-17 Thread Steffen Maier
The SCSI command pointer passed to scsi_eh callbacks is just one arbitrary
command of potentially many that are in the eh queue to be processed.
The command is only used to indirectly pass the TMF scope in terms of
SCSI ID/target and SCSI LUN for LUN reset.

Hence, zfcp had filled in SCSI trace record fields which do not really
belong to the TMF. This was confusing.

Therefore, refactor the TMF tracing to work without SCSI command.
Since the FCP channel always requires a valid LUN handle,
we use SCSI device as common context for any TMF (even target reset).
To make it even clearer, we set all bits to 1 for the fields that do
not belong to the TMF, to indicate that these fields are invalid.

The old zfcp_dbf_scsi() became zfcp_dbf_scsi_common() to now handle both
SCSI commands and TMFs. The old argument scsi_cmnd is now optional and
can be NULL with TMFs. The new argument scsi_device is mandatory to carry
context, as well as SCSI ID/target and SCSI LUN in case of TMFs.

New example trace record formatted with zfcpdbf from s390-tools:

Timestamp  : ...
Area   : SCSI
Subarea: 00
Level  : 1
Exception  : -
CPU ID : ..
Caller : 0x...
Record ID  : 1
Tag: [lt]r_
Request ID : 0x  ID of FSF FCP request with TM flag
 For cases without FSF request: 0x0 for none (invalid)
SCSI ID: 0xSCSI ID/target denoting scope
SCSI LUN   : 0x   SCSI LUN denoting scope
SCSI LUN high  : 0x  SCSI LUN denoting scope
SCSI result: 0x none (invalid)
SCSI retries   : 0xff   none (invalid)
SCSI allowed   : 0xff   none (invalid)
SCSI scribble  : 0x none (invalid)
SCSI opcode:    none (invalid)
FCP rsp inf cod: 0x00   FCP_RSP info code of TMF
FCP rsp IU :   0100  ext FCP_RSP IU
  0008   ext FCP_RSP IU
FCP rsp IU len : 32  FCP_RSP IU length
Payload time   : ...
FCP rsp IU all :   0100  full FCP_RSP IU
  0008   full FCP_RSP IU

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---

Notes:
Changes since RFC:

For consistency, renamed from
"zfcp: drop unsuitable scsi_cmnd usage from SCSI traces for scsi_eh / TMF".
Since the FCP channel always requires a valid LUN handle,
we use SCSI device as common context for any TMF (even target reset)
instead of explicit SCSI ID and SCSI LUN values.
Added example trace record to commit description.

 drivers/s390/scsi/zfcp_dbf.c  | 48 ---
 drivers/s390/scsi/zfcp_dbf.h  | 21 ---
 drivers/s390/scsi/zfcp_ext.h  |  5 +++--
 drivers/s390/scsi/zfcp_scsi.c | 15 +++---
 4 files changed, 56 insertions(+), 33 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index 1e5ea5e4992b..bb3373260169 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -578,16 +578,18 @@ void zfcp_dbf_san_in_els(char *tag, struct zfcp_fsf_req *fsf)
 }
 
 /**
- * zfcp_dbf_scsi - trace event for scsi commands
- * @tag: identifier for event
- * @sc: pointer to struct scsi_cmnd
- * @fsf: pointer to struct zfcp_fsf_req
+ * zfcp_dbf_scsi_common() - Common trace event helper for scsi.
+ * @tag: Identifier for event.
+ * @level: trace level of event.
+ * @sdev: Pointer to SCSI device as context for this event.
+ * @sc: Pointer to SCSI command, or NULL with task management function (TMF).
+ * @fsf: Pointer to FSF request, or NULL.
  */
-void zfcp_dbf_scsi(char *tag, int level, struct scsi_cmnd *sc,
-  struct zfcp_fsf_req *fsf)
+void zfcp_dbf_scsi_common(char *tag, int level, struct scsi_device *sdev,
+ struct scsi_cmnd *sc, struct zfcp_fsf_req *fsf)
 {
struct zfcp_adapter *adapter =
-   (struct zfcp_adapter *) sc->device->host->hostdata[0];
+   (struct zfcp_adapter *) sdev->host->hostdata[0];
struct zfcp_dbf *dbf = adapter->dbf;
struct zfcp_dbf_scsi *rec = &dbf->scsi_buf;
struct fcp_resp_with_ext *fcp_rsp;
@@ -599,16 +601,28 @@ void zfcp_dbf_scsi(char *tag, int level, struct scsi_cmnd *sc,
 
memcpy(rec->tag, tag, ZFCP_DBF_TAG_LEN);
rec->id = ZFCP_DBF_SCSI_CMND;
-   rec->scsi_result = sc->result;
-   rec->scsi_retries = sc->retries;
-   rec->scsi_allowed = sc->allowed;
-   rec->scsi_id = sc->device->id;
-   rec->scsi_lun = (u32)sc->device->lun;
-   rec->scsi_lun_64_hi = (u32)(sc->device->lun >> 32);
- 
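
The idea of marking fields that do not apply to a TMF with all bits set can be sketched as follows. The diff is cut off before the new code, so this is an assumption-laden illustration: the structs are reduced stand-ins and trace_fill() is a hypothetical name, not the body of zfcp_dbf_scsi_common().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reduced stand-ins for the SCSI command and trace record (assumption). */
struct cmnd {
	uint32_t result;
	uint8_t  retries;
	uint8_t  allowed;
};

struct trace_rec {
	uint32_t scsi_id;
	uint32_t scsi_result;
	uint8_t  scsi_retries;
	uint8_t  scsi_allowed;
};

/*
 * The device context (here only the SCSI ID) is always valid. Command
 * fields are copied when a command exists and set to all ones otherwise,
 * so a reader can tell "invalid" apart from a genuine zero.
 */
static void trace_fill(struct trace_rec *rec, uint32_t scsi_id,
		       const struct cmnd *sc)
{
	memset(rec, 0, sizeof(*rec));
	rec->scsi_id = scsi_id;
	if (sc) {
		rec->scsi_result  = sc->result;
		rec->scsi_retries = sc->retries;
		rec->scsi_allowed = sc->allowed;
	} else {
		rec->scsi_result  = UINT32_MAX;
		rec->scsi_retries = UINT8_MAX;
		rec->scsi_allowed = UINT8_MAX;
	}
}
```

This matches the example record in the commit message, where retries and allowed show 0xff for "none (invalid)".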

[PATCH 25/25] zfcp: enhance comments on fc_link_speed and supported_speed

2018-05-17 Thread Steffen Maier
From: Jens Remus <jre...@linux.ibm.com>

The comment on fsf_qtcb_bottom_port.supported_speed did read as if the
field can only assume one of two possible values (i.e. 0x1 for 1 GBit/s or
0x2 for 2 GBit/s). This is not true for two reasons: first it is a flag
field and can thus assume any combination and second there are meanwhile
more speeds.

Clarify comment on fsf_qtcb_bottom_port.supported_speed and add a comment
to fsf_qtcb_bottom_config.fc_link_speed.

Signed-off-by: Jens Remus <jre...@linux.ibm.com>
Reviewed-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Fedor Loshakov <losha...@linux.ibm.com>
Acked-by: Benjamin Block <bbl...@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueck...@linux.ibm.com>
Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_fsf.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.h b/drivers/s390/scsi/zfcp_fsf.h
index 4baca67aba6d..535628b92f0a 100644
--- a/drivers/s390/scsi/zfcp_fsf.h
+++ b/drivers/s390/scsi/zfcp_fsf.h
@@ -4,7 +4,7 @@
  *
  * Interface to the FSF support functions.
  *
- * Copyright IBM Corp. 2002, 2017
+ * Copyright IBM Corp. 2002, 2018
  */
 
 #ifndef FSF_H
@@ -356,7 +356,7 @@ struct fsf_qtcb_bottom_config {
u32 adapter_features;
u32 connection_features;
u32 fc_topology;
-   u32 fc_link_speed;
+   u32 fc_link_speed;  /* one of ZFCP_FSF_PORTSPEED_* */
u32 adapter_type;
u8 res0;
u8 peer_d_id[3];
@@ -382,7 +382,7 @@ struct fsf_qtcb_bottom_port {
u32 class_of_service;   /* should be 0x0006 for class 2 and 3 */
u8 supported_fc4_types[32]; /* should be 0x0100 for scsi fcp */
u8 active_fc4_types[32];
-   u32 supported_speed;/* 0x0001 for 1 GBit/s or 0x0002 for 2 GBit/s */
+   u32 supported_speed;/* any combination of ZFCP_FSF_PORTSPEED_* */
u32 maximum_frame_size; /* fixed value of 2112 */
u64 seconds_since_last_reset;
u64 tx_frames;
-- 
2.16.3
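
The difference between the two fields (fc_link_speed holds exactly one speed, supported_speed is a combination of flags) can be sketched with a few bit tests. The SPEED_* values are illustrative assumptions standing in for ZFCP_FSF_PORTSPEED_*, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flags (assumption) standing in for ZFCP_FSF_PORTSPEED_*. */
#define SPEED_1GBIT 0x1u
#define SPEED_2GBIT 0x2u
#define SPEED_4GBIT 0x4u

/* supported_speed is a flag field: test membership with a bitwise AND. */
static bool speed_is_supported(uint32_t supported_speed, uint32_t speed)
{
	return (supported_speed & speed) != 0;
}

/* A current link speed must be exactly one flag, i.e. a single set bit. */
static bool is_single_speed(uint32_t value)
{
	return value != 0 && (value & (value - 1)) == 0;
}
```

Comparing supported_speed with == against a single constant would miss every adapter that reports more than one speed, which is the misreading the old comment invited.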



[PATCH 23/25] zfcp: assert that the ERP lock is held when tracing a recovery trigger

2018-05-17 Thread Steffen Maier
From: Jens Remus <jre...@linux.ibm.com>

Otherwise iterating with list_for_each() over the adapter->erp_ready_head
and adapter->erp_running_head lists can lead to an infinite loop. See
commit "zfcp: fix infinite iteration on erp_ready_head list".

The run-time check is only performed for debug kernels which have the
kernel lock validator enabled. Following is an example of the warning that
is reported, if the ERP lock is not held when calling zfcp_dbf_rec_trig():

WARNING: CPU: 0 PID: 604 at drivers/s390/scsi/zfcp_dbf.c:288 zfcp_dbf_rec_trig+0x172/0x188
Modules linked in: ...
CPU: 0 PID: 604 Comm: kworker/u128:3 Not tainted 4.16.0-... #1
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
Workqueue: zfcp_q_0.0.1906 zfcp_scsi_rport_work
Krnl PSW : 330fdbf9 367e9728 (zfcp_dbf_rec_trig+0x172/0x188)
   R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 RI:0 EA:3
Krnl GPRS: c57a5d99 32882000  6cc82740
   009d09d6  00ff 
    00e1b5fe 6de01d38 76130958
   6cc82548 6de01a98 009d09d6 6a6d3c80
Krnl Code: 009d0ad2: eb7ff0b80004lmg%r7,%r15,184(%r15)
   009d0ad8: c0f4000d7dd0brcl   15,b80678
  #009d0ade: a7f40001brc15,9d0ae0
  >009d0ae2: a7f4ff7dbrc15,9d09dc
   009d0ae6: e340f0f4lg %r4,240(%r15)
   009d0aec: eb7ff0b80004lmg%r7,%r15,184(%r15)
   009d0af2: 07f4bcr15,%r4
   009d0af4: 0707bcr0,%r7
Call Trace:
([<009d09d6>] zfcp_dbf_rec_trig+0x66/0x188)
 [<009dd740>] zfcp_scsi_rport_work+0x98/0x190
 [<00169b34>] process_one_work+0x3d4/0x6f8
 [<0016a08a>] worker_thread+0x232/0x418
 [<0017219e>] kthread+0x166/0x178
 [<00b815ea>] kernel_thread_starter+0x6/0xc
 [<00b815e4>] kernel_thread_starter+0x0/0xc
2 locks held by kworker/u128:3/604:
 #0:  ((wq_completion)name){+.+.}, at: [<82af1024>] process_one_work+0x1dc/0x6f8
 #1:  ((work_completion)(&port->rport_work)){+.+.}, at: [<82af1024>] process_one_work+0x1dc/0x6f8
Last Breaking-Event-Address:
 [<009d0ade>] zfcp_dbf_rec_trig+0x16e/0x188
---[ end trace b2f4020572e2c124 ]---

Suggested-by: Steffen Maier <ma...@linux.ibm.com>
Signed-off-by: Jens Remus <jre...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
Reviewed-by: Steffen Maier <ma...@linux.ibm.com>
Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_dbf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index bb3373260169..781141bf2c28 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -285,6 +285,8 @@ void zfcp_dbf_rec_trig(char *tag, struct zfcp_adapter *adapter,
struct list_head *entry;
unsigned long flags;
 
+   lockdep_assert_held(&adapter->erp_lock);
+
if (unlikely(!debug_level_enabled(dbf->rec, level)))
return;
 
-- 
2.16.3



[PATCH 22/25] zfcp: cleanup indentation for posting FC events

2018-05-17 Thread Steffen Maier
I just happened to see the function header indentation of
zfcp_fc_enqueue_event() and I picked some more from checkpatch:

$ checkpatch.pl --strict -f drivers/s390/scsi/zfcp_fc.c
...
CHECK: Alignment should match open parenthesis
 #113: FILE: drivers/s390/scsi/zfcp_fc.c:113:
+   fc_host_post_event(adapter->scsi_host, fc_get_event_number(),
+   event->code, event->data);

CHECK: Blank lines aren't necessary before a close brace '}'
 #118: FILE: drivers/s390/scsi/zfcp_fc.c:118:
+
+}
...

The change complements v2.6.36 commit 2d1e547f7523 ("[SCSI] zfcp: Post
events through FC transport class").

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_fc.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fc.c b/drivers/s390/scsi/zfcp_fc.c
index 54186943896b..f6c415d6ef48 100644
--- a/drivers/s390/scsi/zfcp_fc.c
+++ b/drivers/s390/scsi/zfcp_fc.c
@@ -111,11 +111,10 @@ void zfcp_fc_post_event(struct work_struct *work)
 
list_for_each_entry_safe(event, tmp, &tmp_lh, list) {
fc_host_post_event(adapter->scsi_host, fc_get_event_number(),
-   event->code, event->data);
+  event->code, event->data);
list_del(&event->list);
kfree(event);
}
-
 }
 
 /**
@@ -126,7 +125,7 @@ void zfcp_fc_post_event(struct work_struct *work)
  * @event_data: The event data (e.g. n_port page in case of els)
  */
 void zfcp_fc_enqueue_event(struct zfcp_adapter *adapter,
-   enum fc_host_event_code event_code, u32 event_data)
+  enum fc_host_event_code event_code, u32 event_data)
 {
struct zfcp_fc_event *event;
 
-- 
2.16.3



[PATCH 21/25] zfcp: support SCSI_ADAPTER_RESET via scsi_host sysfs attribute host_reset

2018-05-17 Thread Steffen Maier
Make use of feature introduced with v3.2 commit 294436914454
("[SCSI] scsi: Added support for adapter and firmware reset").
The common code interface was introduced for commit 95d31262b3c1
("[SCSI] qla4xxx: Added support for adapter and firmware reset").

$ echo adapter > /sys/class/scsi_host/host/host_reset

Example trace record formatted with zfcpdbf from s390-tools:

Timestamp  : ...
Area   : REC
Subarea: 00
Level  : 1
Exception  : -
CPU ID : ..
Caller : 0x...
Record ID  : 1  ZFCP_DBF_REC_TRIG
Tag: scshr_y  SCSI sysfs host_reset yes
LUN: 0x none (invalid)
WWPN   : 0x none (invalid)
D_ID   : 0x none (invalid)
Adapter status : 0x4500050b
Port status: 0x none (invalid)
LUN status : 0x none (invalid)
Ready count: 0x0001
Running count  : 0x
ERP want   : 0x04   ZFCP_ERP_ACTION_REOPEN_ADAPTER
ERP need   : 0x04   ZFCP_ERP_ACTION_REOPEN_ADAPTER

This is the common code equivalent to the zfcp-specific
&dev_attr_adapter_failed.attr in zfcp_sysfs_adapter_attrs.attrs[]:

$ echo 0 > /sys/bus/ccw/drivers/zfcp//failed

The unsupported case returns EOPNOTSUPP:

$ echo firmware > /sys/class/scsi_host/host/host_reset
-bash: echo: write error: Operation not supported

Example trace record formatted with zfcpdbf from s390-tools:

Timestamp  : ...
Area   : SCSI
Subarea: 00
Level  : 1
Exception  : -
CPU ID : ..
Caller : 0x...
Record ID  : 1
Tag: scshr_n  SCSI sysfs host_reset no
Request ID : 0x none (invalid)
SCSI ID: 0x none (invalid)
SCSI LUN   : 0x none (invalid)
SCSI LUN high  : 0x none (invalid)
SCSI result: 0xffa1 -EOPNOTSUPP==-95
SCSI retries   : 0xff   none (invalid)
SCSI allowed   : 0xff   none (invalid)
SCSI scribble  : 0x none (invalid)
SCSI opcode:    none (invalid)
FCP rsp inf cod: 0xff   none (invalid)
FCP rsp IU :    none (invalid)
  

For any other invalid value, common code returns EINVAL without invoking
our callback:

$ echo foo > /sys/class/scsi_host/host/host_reset
-bash: echo: write error: Invalid argument
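
The three outcomes described above (adapter reset performed, firmware reset
rejected with EOPNOTSUPP, anything else rejected with EINVAL before the driver
is called) can be sketched in user space as follows. This is only an
illustrative model, not the kernel implementation: all demo_* names and the
enum values are hypothetical, and the sketch ignores the trailing newline that
echo writes.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-ins for the reset types handed to the driver callback. */
enum demo_reset_type {
	DEMO_ADAPTER_RESET = 1,
	DEMO_FIRMWARE_RESET = 2,
};

/* Models the driver callback: only the adapter reset is supported. */
int demo_driver_host_reset(int reset_type)
{
	if (reset_type != DEMO_ADAPTER_RESET)
		return -EOPNOTSUPP;
	/* A real driver would trigger adapter recovery and wait for it here. */
	return 0;
}

/* Models common code: map the keyword, reject unknown keywords itself. */
int demo_store_host_reset(const char *buf)
{
	if (strcmp(buf, "adapter") == 0)
		return demo_driver_host_reset(DEMO_ADAPTER_RESET);
	if (strcmp(buf, "firmware") == 0)
		return demo_driver_host_reset(DEMO_FIRMWARE_RESET);
	return -EINVAL; /* the driver callback is never invoked */
}
```

In this model, only the first two keywords ever reach the driver callback,
which mirrors why the "scshr_n" trace record appears for "firmware" but not
for "foo".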

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c   | 11 +++
 drivers/s390/scsi/zfcp_ext.h   |  1 +
 drivers/s390/scsi/zfcp_scsi.c  | 26 ++
 drivers/s390/scsi/zfcp_sysfs.c |  5 +
 4 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 2968d2f57788..e7e6b63905e2 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -1691,3 +1691,14 @@ void zfcp_erp_clear_lun_status(struct scsi_device *sdev, u32 mask)
atomic_set(&zfcp_sdev->erp_counter, 0);
 }
 
+/**
+ * zfcp_erp_adapter_reset_sync() - Really reopen adapter and wait.
+ * @adapter: Pointer to zfcp_adapter to reopen.
+ * @id: Trace tag string of length %ZFCP_DBF_TAG_LEN.
+ */
+void zfcp_erp_adapter_reset_sync(struct zfcp_adapter *adapter, char *id)
+{
+   zfcp_erp_set_adapter_status(adapter, ZFCP_STATUS_COMMON_RUNNING);
+   zfcp_erp_adapter_reopen(adapter, ZFCP_STATUS_COMMON_ERP_FAILED, id);
+   zfcp_erp_wait(adapter);
+}
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index ad7c28ffd49f..f3b55cce748a 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -76,6 +76,7 @@ extern void zfcp_erp_thread_kill(struct zfcp_adapter *);
 extern void zfcp_erp_wait(struct zfcp_adapter *);
 extern void zfcp_erp_notify(struct zfcp_erp_action *, unsigned long);
 extern void zfcp_erp_timeout_handler(struct timer_list *t);
+extern void zfcp_erp_adapter_reset_sync(struct zfcp_adapter *adapter, char *id);
 
 /* zfcp_fc.c */
 extern struct kmem_cache *zfcp_fc_req_cache;
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index f69ef78ea930..9a01f583e562 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -372,6 +372,31 @@ static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
return ret;
 }
 
+/**
+ * zfcp_scsi_sysfs_host_reset() - Support scsi_host sysfs attribute host_reset.
+ *

[PATCH 24/25] zfcp: add port speed capabilities

2018-05-17 Thread Steffen Maier
From: Jens Remus <jre...@linux.ibm.com>

Add port speed capabilities as defined in FC-LS RPSC ELS that have a
counterpart FC_PORTSPEED_* defined in scsi/scsi_transport_fc.h.
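
As a rough illustration of the bit-by-bit translation this patch extends, here
is a self-contained user-space sketch. The DEMO_FSF_* bit positions follow the
defines in the diff below; the DEMO_FC_* values are placeholders chosen for
the example and are not the real FC_PORTSPEED_* values from
scsi/scsi_transport_fc.h.

```c
#include <stdint.h>

/* Channel-side capability bits, positions as in the patch below. */
#define DEMO_FSF_SPEED_16GBIT   (1u << 5)
#define DEMO_FSF_SPEED_32GBIT   (1u << 6)
#define DEMO_FSF_SPEED_64GBIT   (1u << 7)
#define DEMO_FSF_SPEED_128GBIT  (1u << 8)

/* Placeholder transport-side bits (assumed values, for illustration only). */
#define DEMO_FC_SPEED_16GBIT    0x01u
#define DEMO_FC_SPEED_32GBIT    0x02u
#define DEMO_FC_SPEED_64GBIT    0x04u
#define DEMO_FC_SPEED_128GBIT   0x08u

/* Translate each known source bit; unknown source bits are dropped. */
uint32_t demo_convert_portspeed(uint32_t fsf_speed)
{
	uint32_t fc_speed = 0;

	if (fsf_speed & DEMO_FSF_SPEED_16GBIT)
		fc_speed |= DEMO_FC_SPEED_16GBIT;
	if (fsf_speed & DEMO_FSF_SPEED_32GBIT)
		fc_speed |= DEMO_FC_SPEED_32GBIT;
	if (fsf_speed & DEMO_FSF_SPEED_64GBIT)
		fc_speed |= DEMO_FC_SPEED_64GBIT;
	if (fsf_speed & DEMO_FSF_SPEED_128GBIT)
		fc_speed |= DEMO_FC_SPEED_128GBIT;
	return fc_speed;
}
```

Because the two bit layouts need not match, each capability is tested and set
individually instead of copying the mask.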

Suggested-by: Steffen Maier <ma...@linux.ibm.com>
Signed-off-by: Jens Remus <jre...@linux.ibm.com>
Reviewed-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Fedor Loshakov <losha...@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueck...@linux.ibm.com>
Acked-by: Benjamin Block <bbl...@linux.ibm.com>
Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_fsf.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 049fdd968130..3c86e27f094d 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -4,7 +4,7 @@
  *
  * Implementation of FSF commands.
  *
- * Copyright IBM Corp. 2002, 2017
+ * Copyright IBM Corp. 2002, 2018
  */
 
 #define KMSG_COMPONENT "zfcp"
@@ -437,6 +437,9 @@ void zfcp_fsf_req_dismiss_all(struct zfcp_adapter *adapter)
 #define ZFCP_FSF_PORTSPEED_10GBIT  (1 <<  3)
 #define ZFCP_FSF_PORTSPEED_8GBIT   (1 <<  4)
 #define ZFCP_FSF_PORTSPEED_16GBIT  (1 <<  5)
+#define ZFCP_FSF_PORTSPEED_32GBIT  (1 <<  6)
+#define ZFCP_FSF_PORTSPEED_64GBIT  (1 <<  7)
+#define ZFCP_FSF_PORTSPEED_128GBIT (1 <<  8)
 #define ZFCP_FSF_PORTSPEED_NOT_NEGOTIATED (1 << 15)
 
 static u32 zfcp_fsf_convert_portspeed(u32 fsf_speed)
@@ -454,6 +457,12 @@ static u32 zfcp_fsf_convert_portspeed(u32 fsf_speed)
fdmi_speed |= FC_PORTSPEED_8GBIT;
if (fsf_speed & ZFCP_FSF_PORTSPEED_16GBIT)
fdmi_speed |= FC_PORTSPEED_16GBIT;
+   if (fsf_speed & ZFCP_FSF_PORTSPEED_32GBIT)
+   fdmi_speed |= FC_PORTSPEED_32GBIT;
+   if (fsf_speed & ZFCP_FSF_PORTSPEED_64GBIT)
+   fdmi_speed |= FC_PORTSPEED_64GBIT;
+   if (fsf_speed & ZFCP_FSF_PORTSPEED_128GBIT)
+   fdmi_speed |= FC_PORTSPEED_128GBIT;
if (fsf_speed & ZFCP_FSF_PORTSPEED_NOT_NEGOTIATED)
fdmi_speed |= FC_PORTSPEED_NOT_NEGOTIATED;
return fdmi_speed;
-- 
2.16.3



[PATCH 18/25] zfcp: zfcp_erp_action_exists() does only check for running

2018-05-17 Thread Steffen Maier
Simplify its signature to return boolean and rename it to
zfcp_erp_action_is_running() to indicate its actual unmodified semantics.
It has always been used like this since v2.6.0 history commit ea127f975424
("[PATCH] s390 (7/7): zfcp host adapter.").
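
A minimal user-space sketch of the resulting semantics, assuming a simple
singly linked list in place of the kernel's list_head (struct demo_action and
demo_action_is_running() are hypothetical names): the function is a pure
membership test on the running list and therefore naturally returns a boolean.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an ERP action linked into the running list. */
struct demo_action {
	struct demo_action *next;
};

/* Return true if and only if act is an element of the running list. */
bool demo_action_is_running(const struct demo_action *running_head,
			    const struct demo_action *act)
{
	const struct demo_action *curr;

	for (curr = running_head; curr != NULL; curr = curr->next)
		if (curr == act)
			return true;
	return false;
}
```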

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index b5ca484d5d5f..245621769c26 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -57,10 +57,6 @@ enum zfcp_erp_act_type {
ZFCP_ERP_ACTION_FAILED = 0xe0,
 };
 
-enum zfcp_erp_act_state {
-   ZFCP_ERP_ACTION_RUNNING = 1,
-};
-
 enum zfcp_erp_act_result {
ZFCP_ERP_SUCCEEDED = 0,
ZFCP_ERP_FAILED= 1,
@@ -76,14 +72,14 @@ static void zfcp_erp_adapter_block(struct zfcp_adapter *adapter, int mask)
   ZFCP_STATUS_COMMON_UNBLOCKED | mask);
 }
 
-static int zfcp_erp_action_exists(struct zfcp_erp_action *act)
+static bool zfcp_erp_action_is_running(struct zfcp_erp_action *act)
 {
struct zfcp_erp_action *curr_act;
 
list_for_each_entry(curr_act, &act->adapter->erp_running_head, list)
if (act == curr_act)
-   return ZFCP_ERP_ACTION_RUNNING;
-   return 0;
+   return true;
+   return false;
 }
 
 static void zfcp_erp_action_ready(struct zfcp_erp_action *act)
@@ -99,7 +95,7 @@ static void zfcp_erp_action_ready(struct zfcp_erp_action *act)
 static void zfcp_erp_action_dismiss(struct zfcp_erp_action *act)
 {
act->status |= ZFCP_STATUS_ERP_DISMISSED;
-   if (zfcp_erp_action_exists(act) == ZFCP_ERP_ACTION_RUNNING)
+   if (zfcp_erp_action_is_running(act))
zfcp_erp_action_ready(act);
 }
 
@@ -622,7 +618,7 @@ void zfcp_erp_notify(struct zfcp_erp_action *erp_action, unsigned long set_mask)
unsigned long flags;
 
write_lock_irqsave(&adapter->erp_lock, flags);
-   if (zfcp_erp_action_exists(erp_action) == ZFCP_ERP_ACTION_RUNNING) {
+   if (zfcp_erp_action_is_running(erp_action)) {
erp_action->status |= set_mask;
zfcp_erp_action_ready(erp_action);
}
-- 
2.16.3



[PATCH 09/25] zfcp: decouple TMF response handler from scsi_cmnd

2018-05-17 Thread Steffen Maier
Originally, I planned for TMF handling to have different context data in
fsf_req->data depending on the TMF scope in fcp_cmnd->fc_tm_flags:
* scsi_device if FCP_TMF_LUN_RESET,
* zfcp_port if FCP_TMF_TGT_RESET.
However, the FCP channel requires a valid LUN handle so we now use
scsi_device as context data with any TMF for the time being.

Regular SCSI I/O FCP requests continue using scsi_cmnd as req->data.

Hence, the callers of zfcp_fsf_fcp_handler_common() must resolve req->data
and pass scsi_device as common context.
While at it, remove the detour zfcp_sdev->port->adapter and use the more
direct req->adapter as elsewhere in this function already.

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---

Notes:
Changes since RFC:

Since the FCP channel always requires a valid LUN handle,
we now use scsi_device as context data with any TMF instead of either
scsi_device for FCP_TMF_LUN_RESET or zfcp_port for FCP_TMF_TGT_RESET.

This also fixes a kernel panic due to a wrongly assigned req->data
in the previous patch version.

Added missing description on replacing zfcp_sdev->port->adapter with
req->adapter which was already in the previous patch content.
This is now the only change left in zfcp_fsf_fcp_handler_common().

 drivers/s390/scsi/zfcp_fsf.c | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index b12cb81ad8a2..a95070c7cad8 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2036,10 +2036,14 @@ static void zfcp_fsf_req_trace(struct zfcp_fsf_req *req, struct scsi_cmnd *scsi)
sizeof(blktrc));
 }
 
-static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req)
+/**
+ * zfcp_fsf_fcp_handler_common() - FCP response handler common to I/O and TMF.
+ * @req: Pointer to FSF request.
+ * @sdev: Pointer to SCSI device as request context.
+ */
+static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req,
+   struct scsi_device *sdev)
 {
-   struct scsi_cmnd *scmnd = req->data;
-   struct scsi_device *sdev = scmnd->device;
struct zfcp_scsi_dev *zfcp_sdev;
struct fsf_qtcb_header *header = &req->qtcb->header;
 
@@ -2051,7 +2055,7 @@ static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req)
switch (header->fsf_status) {
case FSF_HANDLE_MISMATCH:
case FSF_PORT_HANDLE_NOT_VALID:
-   zfcp_erp_adapter_reopen(zfcp_sdev->port->adapter, 0, "fssfch1");
+   zfcp_erp_adapter_reopen(req->adapter, 0, "fssfch1");
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
break;
case FSF_FCPLUN_NOT_VALID:
@@ -2069,8 +2073,7 @@ static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req)
req->qtcb->bottom.io.data_direction,
(unsigned long long)zfcp_scsi_dev_lun(sdev),
(unsigned long long)zfcp_sdev->port->wwpn);
-   zfcp_erp_adapter_shutdown(zfcp_sdev->port->adapter, 0,
- "fssfch3");
+   zfcp_erp_adapter_shutdown(req->adapter, 0, "fssfch3");
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
break;
case FSF_CMND_LENGTH_NOT_VALID:
@@ -2080,8 +2083,7 @@ static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req)
req->qtcb->bottom.io.fcp_cmnd_length,
(unsigned long long)zfcp_scsi_dev_lun(sdev),
(unsigned long long)zfcp_sdev->port->wwpn);
-   zfcp_erp_adapter_shutdown(zfcp_sdev->port->adapter, 0,
- "fssfch4");
+   zfcp_erp_adapter_shutdown(req->adapter, 0, "fssfch4");
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
break;
case FSF_PORT_BOXED:
@@ -2120,7 +2122,7 @@ static void zfcp_fsf_fcp_cmnd_handler(struct zfcp_fsf_req *req)
return;
}
 
-   zfcp_fsf_fcp_handler_common(req);
+   zfcp_fsf_fcp_handler_common(req, scpnt->device);
 
if (unlikely(req->status & ZFCP_STATUS_FSFREQ_ERROR)) {
set_host_byte(scpnt, DID_TRANSPORT_DISRUPTED);
@@ -2297,10 +2299,11 @@ int zfcp_fsf_fcp_cmnd(struct scsi_cmnd *scsi_cmnd)
 
 static void zfcp_fsf_fcp_task_mgmt_handler(struct zfcp_fsf_req *req)
 {
+   struct scsi_device *sdev = req->data;
struct fcp_resp_with_ext *fcp_rsp;
struct fcp_resp_rsp_info *rsp_info;
 
-   zfcp_fsf_fcp_handler_common(req);
+   zfcp_fsf_fcp_handler_common(req, sdev);
 
fcp_rsp = &req->qtcb->bottom.io.fcp_rsp.iu;

[PATCH 16/25] zfcp: consistently use function name space prefix

2018-05-17 Thread Steffen Maier
I've been mixing up
zfcp_task_mgmt_function() [SCSI] and
zfcp_fsf_fcp_task_mgmt()  [FSF]
so often lately that I wanted to fix this.

SCSI changes complement v2.6.27 commit f76af7d7e363
("[SCSI] zfcp: Cleanup of code in zfcp_scsi.c").

While at it, also fixup the other inconsistencies elsewhere.

ERP changes complement v2.6.27 commit 287ac01acf22
("[SCSI] zfcp: Cleanup code in zfcp_erp.c") which introduced
status_change_set().
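
For readers unfamiliar with the helper being renamed: its expression
(status ^ mask) & mask yields exactly those bits of mask that are not yet set
in status, so a non-zero result means that setting mask would actually change
the status. A user-space sketch, using a plain unsigned long instead of
atomic_t (demo_status_change_set() is a hypothetical name):

```c
/*
 * Return the bits of mask that are currently clear in status.
 * Equivalent to mask & ~status; non-zero means "setting mask changes status".
 */
unsigned long demo_status_change_set(unsigned long mask, unsigned long status)
{
	return (status ^ mask) & mask;
}
```

This is why the unblock functions in the diff below only write a trace record
when the UNBLOCKED flag was not already set.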

FC changes complement v2.6.32 commit 6f53a2d2ecae
("[SCSI] zfcp: Apply common naming conventions to zfcp_fc")
by renaming a leftover introduced with v2.6.27 commit cc8c282963bd
("[SCSI] zfcp: Automatically attach remote ports").

FSF changes fix up v2.6.32 commit a4623c467ff7
("[SCSI] zfcp: Improve request allocation through mempools"),
which replaced zfcp_fsf_alloc_qtcb() introduced with v2.6.27
commit c41f8cbddd4e ("[SCSI] zfcp: zfcp_fsf cleanup.").

SCSI fc_host statistics were introduced with v2.6.16 commit f6cd94b126aa
("[SCSI] zfcp: transport class adaptations").

SCSI fc_host port_state was introduced with v2.6.27 commit 85a82392fe6f
("[SCSI] zfcp: Add port_state attribute to sysfs").

SCSI rport setter for dev_loss_tmo was introduced with v2.6.18
commit 338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when
target port is unavailable").

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c  | 11 +++
 drivers/s390/scsi/zfcp_fc.c   |  4 ++--
 drivers/s390/scsi/zfcp_fsf.c  |  7 ---
 drivers/s390/scsi/zfcp_scsi.c | 46 ++-
 4 files changed, 37 insertions(+), 31 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 69dfb328dba4..9be629607dc0 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -551,21 +551,23 @@ void zfcp_erp_lun_shutdown_wait(struct scsi_device *sdev, char *id)
zfcp_erp_wait(adapter);
 }
 
-static int status_change_set(unsigned long mask, atomic_t *status)
+static int zfcp_erp_status_change_set(unsigned long mask, atomic_t *status)
 {
return (atomic_read(status) ^ mask) & mask;
 }
 
 static void zfcp_erp_adapter_unblock(struct zfcp_adapter *adapter)
 {
-   if (status_change_set(ZFCP_STATUS_COMMON_UNBLOCKED, &adapter->status))
+   if (zfcp_erp_status_change_set(ZFCP_STATUS_COMMON_UNBLOCKED,
+  &adapter->status))
zfcp_dbf_rec_run("eraubl1", &adapter->erp_action);
atomic_or(ZFCP_STATUS_COMMON_UNBLOCKED, &adapter->status);
 }
 
 static void zfcp_erp_port_unblock(struct zfcp_port *port)
 {
-   if (status_change_set(ZFCP_STATUS_COMMON_UNBLOCKED, &port->status))
+   if (zfcp_erp_status_change_set(ZFCP_STATUS_COMMON_UNBLOCKED,
+  &port->status))
zfcp_dbf_rec_run("erpubl1", &port->erp_action);
atomic_or(ZFCP_STATUS_COMMON_UNBLOCKED, &port->status);
 }
@@ -574,7 +576,8 @@ static void zfcp_erp_lun_unblock(struct scsi_device *sdev)
 {
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
 
-   if (status_change_set(ZFCP_STATUS_COMMON_UNBLOCKED, &zfcp_sdev->status))
+   if (zfcp_erp_status_change_set(ZFCP_STATUS_COMMON_UNBLOCKED,
+  &zfcp_sdev->status))
zfcp_dbf_rec_run("erlubl1", &sdev_to_zfcp(sdev)->erp_action);
atomic_or(ZFCP_STATUS_COMMON_UNBLOCKED, &zfcp_sdev->status);
 }
diff --git a/drivers/s390/scsi/zfcp_fc.c b/drivers/s390/scsi/zfcp_fc.c
index 2ad80c43f674..54186943896b 100644
--- a/drivers/s390/scsi/zfcp_fc.c
+++ b/drivers/s390/scsi/zfcp_fc.c
@@ -598,7 +598,7 @@ void zfcp_fc_test_link(struct zfcp_port *port)
put_device(&port->dev);
 }
 
-static struct zfcp_fc_req *zfcp_alloc_sg_env(int buf_num)
+static struct zfcp_fc_req *zfcp_fc_alloc_sg_env(int buf_num)
 {
struct zfcp_fc_req *fc_req;
 
@@ -750,7 +750,7 @@ void zfcp_fc_scan_ports(struct work_struct *work)
if (zfcp_fc_wka_port_get(&adapter->gs->ds))
return;
 
-   fc_req = zfcp_alloc_sg_env(buf_num);
+   fc_req = zfcp_fc_alloc_sg_env(buf_num);
if (!fc_req)
goto out;
 
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index d86c3bf71664..049fdd968130 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -662,7 +662,7 @@ static struct zfcp_fsf_req *zfcp_fsf_alloc(mempool_t *pool)
return req;
 }
 
-static struct fsf_qtcb *zfcp_qtcb_alloc(mempool_t *pool)
+static struct fsf_qtcb *zfcp_fsf_qtcb_alloc(mempool_t *pool)
 {
struct fsf_qtcb *qtcb;
 
@@ -701,9 +701,10 @@ static struct zfcp_fsf_req *zfcp_fsf_req_create(struct zfcp_qdio *qdio,
 
if (likely(fsf_cmd != FSF_QTCB_UNSOLICITED_STATUS)) {
if (likely(pool))
-

[PATCH 05/25] zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED

2018-05-17 Thread Steffen Maier
For problem determination we always want to see when we were invoked
on the terminate_rport_io callback whether we perform something or not.

Temporal event sequence of interest with a long fast_io_fail_tmo of 27 sec:

lose remote port

t   workqueue
[s] zfcp_q_   IRQ zfcperp
=== == === 

  0recv RSCN
   q p.test_link_work
block rport
 start fast_io_fail_tmo
send ADISC ELS
  4recv ADISC fail
   block zfcp_port
   port forced reopen
   send open port
 12recv open port fail
   q p.gid_pn_work
   zfcp_erp_wakeup
   (zfcp_erp_wait would return)
GID_PN fail

Before this point, we got a SCSI trace with tag "sctrpi1" on fast_io_fail,
e.g. with the typical 5 sec setting.

port.status |= ERP_FAILED

If fast_io_fail_tmo triggers after this point, we missed a SCSI trace.

workqueue
fc_dl_
==
 27 fc_timeout_fail_rport_io
fc_terminate_rport_io
zfcp_scsi_terminate_rport_io
zfcp_erp_port_forced_reopen
_zfcp_erp_port_forced_reopen
 if (port.status & ERP_FAILED)
  return;

Therefore, write a trace before above early return.

Example trace record formatted with zfcpdbf from s390-tools:

Timestamp  : ...
Area   : REC
Subarea: 00
Level  : 1
Exception  : -
CPU ID : ..
Caller : 0x...
Record ID  : 1  ZFCP_DBF_REC_TRIG
Tag: sctrpi1  SCSI terminate rport I/O
LUN: 0x none (invalid)
WWPN   : 0x
D_ID   : 0x
Adapter status : 0x...
Port status: 0x...
LUN status : 0x none (invalid)
Ready count: 0x...
Running count  : 0x...
ERP want   : 0x03   ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
ERP need   : 0xe0   ZFCP_ERP_ACTION_FAILED

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Cc: <sta...@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 3489b1bc9121..5c368cdfc455 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -42,9 +42,13 @@ enum zfcp_erp_steps {
  * @ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: Forced port recovery.
  * @ZFCP_ERP_ACTION_REOPEN_ADAPTER: Adapter recovery.
  * @ZFCP_ERP_ACTION_NONE: Eyecatcher pseudo flag to bitwise or-combine with
- *   either of the other enum values.
+ *   either of the first four enum values.
  *   Used to indicate that an ERP action could not be
  *   set up despite a detected need for some recovery.
+ * @ZFCP_ERP_ACTION_FAILED: Eyecatcher pseudo flag to bitwise or-combine with
+ * either of the first four enum values.
+ * Used to indicate that ERP not needed because
+ * the object has ZFCP_STATUS_COMMON_ERP_FAILED.
  */
 enum zfcp_erp_act_type {
ZFCP_ERP_ACTION_REOPEN_LUN = 1,
@@ -52,6 +56,7 @@ enum zfcp_erp_act_type {
ZFCP_ERP_ACTION_REOPEN_PORT_FORCED = 3,
ZFCP_ERP_ACTION_REOPEN_ADAPTER = 4,
ZFCP_ERP_ACTION_NONE   = 0xc0,
+   ZFCP_ERP_ACTION_FAILED = 0xe0,
 };
 
 enum zfcp_erp_act_state {
@@ -379,8 +384,12 @@ static void _zfcp_erp_port_forced_reopen(struct zfcp_port *port, int clear,
zfcp_erp_port_block(port, clear);
zfcp_scsi_schedule_rport_block(port);
 
-   if (atomic_read(&port->status) & ZFCP_STATUS_COMMON_ERP_FAILED)
+   if (atomic_read(&port->status) & ZFCP_STATUS_COMMON_ERP_FAILED) {
+   zfcp_dbf_rec_trig(id, port->adapter, port, NULL,
+ ZFCP_ERP_ACTION_REOPEN_PORT_FORCED,
+ ZFCP_ERP_ACTION_FAILED);
return;
+   }
 
zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_PORT_FORCED,
port->adapter, port, NULL, id, 0);
-- 
2.16.3



[PATCH 14/25] zfcp: decouple our scsi_eh callbacks from scsi_cmnd

2018-05-17 Thread Steffen Maier
Note: zfcp_scsi_eh_host_reset_handler() will be converted in a later patch.

zfcp_scsi_eh_device_reset_handler() now only depends on scsi_device.
zfcp_scsi_eh_target_reset_handler() now only depends on scsi_target.
All derive other objects from these intended callback arguments.

zfcp_scsi_eh_target_reset_handler() is special: The FCP channel requires
a valid LUN handle so we try to find ourselves a stand-in scsi_device as
suggested by Hannes Reinecke. If it cannot find a stand-in scsi device,
trace a record like the following (formatted with zfcpdbf from s390-tools):

Timestamp  : ...
Area   : SCSI
Subarea: 00
Level  : 1
Exception  : -
CPU ID : ..
Caller : 0x...
Record ID  : 1
Tag: tr_nosd  target reset, no SCSI device
Request ID : 0x none (invalid)
SCSI ID: 0x SCSI ID/target denoting scope
SCSI LUN   : 0x none (invalid)
SCSI LUN high  : 0x none (invalid)
SCSI result: 0x2003 field re-used for midlayer value: FAILED
SCSI retries   : 0xff   none (invalid)
SCSI allowed   : 0xff   none (invalid)
SCSI scribble  : 0x none (invalid)
SCSI opcode:    none (invalid)
FCP rsp inf cod: 0xff   none (invalid)
FCP rsp IU :    none (invalid)
  

Actually change the signature of zfcp_task_mgmt_function() used by
zfcp_scsi_eh_device_reset_handler() & zfcp_scsi_eh_target_reset_handler().
Since it was prepared in a previous patch, we only need to delete
a local auto variable which is now the intended argument.

Suggested-by: Hannes Reinecke <h...@suse.com>
Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---

Notes:
Changes since RFC:

Since the FCP channel always requires a valid LUN handle,
we now use scsi_device as context data with any TMF instead of
zfcp_port for FCP_TMF_TGT_RESET and an optional scsi_device for
FCP_TMF_LUN_RESET.

zfcp_scsi_eh_target_reset_handler() became more involved as it needs
to find a stand-in scsi_device within the target scope as suggested by
Hannes.
Trace if we could not find a stand-in scsi_device along with
an example trace in the commit description.
Put the refcount from shost_for_each_device for the stand-in scsi_device.

NB:
zfcp_scsi_eh_host_reset_handler() will be converted in a later patch.
I need more time to resolve the proper sync with all fc_rports states
with support for FAST_IO_FAIL.

 drivers/s390/scsi/zfcp_scsi.c | 41 +
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index e0c5735cf3db..fcc832b73960 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -265,9 +265,14 @@ static void zfcp_scsi_forget_cmnds(struct zfcp_scsi_dev *zsdev, u8 tm_flags)
write_unlock_irqrestore(&adapter->abort_lock, flags);
 }
 
-static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, u8 tm_flags)
+/**
+ * zfcp_task_mgmt_function() - Synchronously send a task management function.
+ * @sdev: Pointer to SCSI device to send the task management command to.
+ * @tm_flags: Task management flags,
+ *   here we only handle %FCP_TMF_TGT_RESET or %FCP_TMF_LUN_RESET.
+ */
+static int zfcp_task_mgmt_function(struct scsi_device *sdev, u8 tm_flags)
 {
-   struct scsi_device *sdev = scpnt->device;
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
struct fc_rport *rport = starget_to_rport(scsi_target(sdev));
@@ -315,12 +320,40 @@ static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, u8 tm_flags)
 
 static int zfcp_scsi_eh_device_reset_handler(struct scsi_cmnd *scpnt)
 {
-   return zfcp_task_mgmt_function(scpnt, FCP_TMF_LUN_RESET);
+   struct scsi_device *sdev = scpnt->device;
+
+   return zfcp_task_mgmt_function(sdev, FCP_TMF_LUN_RESET);
 }
 
 static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt)
 {
-   return zfcp_task_mgmt_function(scpnt, FCP_TMF_TGT_RESET);
+   struct scsi_target *starget = scsi_target(scpnt->device);
+   struct fc_rport *rport = starget_to_rport(starget);
+   struct Scsi_Host *shost = rport_to_shost(rport);
+   struct scsi_device *sdev = NULL, *tmp_sdev;
+   struct zfcp_adapter *adapter =
+   (struct zfcp_adapter *)shost->hostdata[0];
+   int ret;
+
+   shost_for_each_device(tmp_sdev, shost) {
+   if (tmp_sdev->id == starget-

[PATCH 17/25] zfcp: remove unused ERP enum values

2018-05-17 Thread Steffen Maier
All constant defines were introduced with v2.6.0 history commit
ea127f975424 ("[PATCH] s390 (7/7): zfcp host adapter.") and refactored into
enums with commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c").

ZFCP_STATUS_ERP_DISMISSING and ZFCP_ERP_STEP_FSF_XCONFIG were never used.

v2.6.27 commit 287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c")
removed the use of ZFCP_ERP_ACTION_READY on refactoring
zfcp_erp_action_exists() to now only check adapter->erp_running_head
but no longer adapter->erp_ready_head. The same commit could have
changed the function return type from int to "enum zfcp_erp_act_state".
ZFCP_ERP_ACTION_READY was never used outside of zfcp_erp_action_exists().

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 9be629607dc0..b5ca484d5d5f 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -19,7 +19,6 @@
 enum zfcp_erp_act_flags {
ZFCP_STATUS_ERP_TIMEDOUT= 0x1000,
ZFCP_STATUS_ERP_CLOSE_ONLY  = 0x0100,
-   ZFCP_STATUS_ERP_DISMISSING  = 0x0010,
ZFCP_STATUS_ERP_DISMISSED   = 0x0020,
ZFCP_STATUS_ERP_LOWMEM  = 0x0040,
ZFCP_STATUS_ERP_NO_REF  = 0x0080,
@@ -27,7 +26,6 @@ enum zfcp_erp_act_flags {
 
 enum zfcp_erp_steps {
ZFCP_ERP_STEP_UNINITIALIZED = 0x,
-   ZFCP_ERP_STEP_FSF_XCONFIG   = 0x0001,
ZFCP_ERP_STEP_PHYS_PORT_CLOSING = 0x0010,
ZFCP_ERP_STEP_PORT_CLOSING  = 0x0100,
ZFCP_ERP_STEP_PORT_OPENING  = 0x0800,
@@ -61,7 +59,6 @@ enum zfcp_erp_act_type {
 
 enum zfcp_erp_act_state {
ZFCP_ERP_ACTION_RUNNING = 1,
-   ZFCP_ERP_ACTION_READY   = 2,
 };
 
 enum zfcp_erp_act_result {
-- 
2.16.3



[PATCH 19/25] zfcp: remove unused return values of ERP trigger functions

2018-05-17 Thread Steffen Maier
Since v2.6.27 commit 553448f6c483 ("[SCSI] zfcp: Message cleanup"),
none of the callers has been interested any more.
Values were not returned consistently in all ERP trigger functions.

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c | 34 +-
 drivers/s390/scsi/zfcp_ext.h |  2 +-
 2 files changed, 14 insertions(+), 22 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 245621769c26..2968d2f57788 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -294,12 +294,12 @@ static struct zfcp_erp_action *zfcp_erp_setup_act(int need, u32 act_status,
return erp_action;
 }
 
-static int zfcp_erp_action_enqueue(int want, struct zfcp_adapter *adapter,
-  struct zfcp_port *port,
-  struct scsi_device *sdev,
-  char *id, u32 act_status)
+static void zfcp_erp_action_enqueue(int want, struct zfcp_adapter *adapter,
+   struct zfcp_port *port,
+   struct scsi_device *sdev,
+   char *id, u32 act_status)
 {
-   int retval = 1, need;
+   int need;
struct zfcp_erp_action *act;
 
need = zfcp_erp_handle_failed(want, adapter, port, sdev);
@@ -310,7 +310,6 @@ static int zfcp_erp_action_enqueue(int want, struct zfcp_adapter *adapter,
 
if (!adapter->erp_thread) {
need = ZFCP_ERP_ACTION_NONE; /* marker for trace */
-   retval = -EIO;
goto out;
}
 
@@ -327,10 +326,8 @@ static int zfcp_erp_action_enqueue(int want, struct zfcp_adapter *adapter,
++adapter->erp_total_count;
list_add_tail(&act->list, &adapter->erp_ready_head);
wake_up(&adapter->erp_ready_wq);
-   retval = 0;
  out:
zfcp_dbf_rec_trig(id, adapter, port, sdev, want, need);
-   return retval;
 }
 
 void zfcp_erp_port_forced_no_port_dbf(char *id, struct zfcp_adapter *adapter,
@@ -353,14 +350,14 @@ void zfcp_erp_port_forced_no_port_dbf(char *id, struct zfcp_adapter *adapter,
write_unlock_irqrestore(&adapter->erp_lock, flags);
 }
 
-static int _zfcp_erp_adapter_reopen(struct zfcp_adapter *adapter,
+static void _zfcp_erp_adapter_reopen(struct zfcp_adapter *adapter,
int clear_mask, char *id)
 {
zfcp_erp_adapter_block(adapter, clear_mask);
zfcp_scsi_schedule_rports_block(adapter);
 
-   return zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_ADAPTER,
-  adapter, NULL, NULL, id, 0);
+   zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_ADAPTER,
+   adapter, NULL, NULL, id, 0);
 }
 
 /**
@@ -439,13 +436,13 @@ void zfcp_erp_port_forced_reopen(struct zfcp_port *port, int clear, char *id)
write_unlock_irqrestore(&adapter->erp_lock, flags);
 }
 
-static int _zfcp_erp_port_reopen(struct zfcp_port *port, int clear, char *id)
+static void _zfcp_erp_port_reopen(struct zfcp_port *port, int clear, char *id)
 {
zfcp_erp_port_block(port, clear);
zfcp_scsi_schedule_rport_block(port);
 
-   return zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_PORT,
-  port->adapter, port, NULL, id, 0);
+   zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_PORT,
+   port->adapter, port, NULL, id, 0);
 }
 
 /**
@@ -453,20 +450,15 @@ static int _zfcp_erp_port_reopen(struct zfcp_port *port, 
int clear, char *id)
  * @port: port to recover
  * @clear_mask: flags in port status to be cleared
  * @id: Id for debug trace event.
- *
- * Returns 0 if recovery has been triggered, < 0 if not.
  */
-int zfcp_erp_port_reopen(struct zfcp_port *port, int clear, char *id)
+void zfcp_erp_port_reopen(struct zfcp_port *port, int clear, char *id)
 {
-   int retval;
unsigned long flags;
struct zfcp_adapter *adapter = port->adapter;
 
write_lock_irqsave(&adapter->erp_lock, flags);
-   retval = _zfcp_erp_port_reopen(port, clear, id);
+   _zfcp_erp_port_reopen(port, clear, id);
write_unlock_irqrestore(&adapter->erp_lock, flags);
-
-   return retval;
 }
 
 static void zfcp_erp_lun_block(struct scsi_device *sdev, int clear_mask)
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index e317c4b513c9..ad7c28ffd49f 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -63,7 +63,7 @@ extern void zfcp_erp_adapter_reopen(struct zfcp_adapter *, 
int, char *);
 extern void zfcp_erp_adapter_shutdown(struct zfcp_adapter *, int, char *);
 extern void zfcp_erp_set_port_status(struct zfcp_port *, u32);
 extern void zfcp_erp_clear_port_status(struct zfcp_port *, u32);
-extern int  zfcp_erp_port_reopen(struct zfcp_port *, int, char *);

[PATCH 20/25] zfcp: explicitly support initiator in scsi_host_template

2018-05-17 Thread Steffen Maier
While the default did already correctly print "Initiator"
let's make it explicit and convert zfcp to the feature.

$ cat /sys/class/scsi_host/host0/supported_mode
Initiator

$ cat /sys/class/scsi_host/host0/active_mode
Initiator

The default worked, because not setting the field has it initialized
to zero == MODE_UNKNOWN. scsi_host_alloc() sets shost->active_mode =
MODE_INITIATOR in this case. The sysfs accessor function
show_shost_supported_mode() assumes MODE_INITIATOR in this case.
This default behavior was introduced with v2.6.24 commit 7a39ac3f25be
("[SCSI] make supported_mode default to initiator.").
The feature flag was introduced with v2.6.24 commit 5dc2b89e1242
("[SCSI] add supported_mode and active_mode attributes to the host").
So there was no release where zfcp would have shown "unknown".

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_scsi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index b4e1f1b82503..f69ef78ea930 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -401,6 +401,7 @@ static struct scsi_host_template zfcp_scsi_host_template = {
.shost_attrs = zfcp_sysfs_shost_attrs,
.sdev_attrs  = zfcp_sysfs_sdev_attrs,
.track_queue_depth   = 1,
+   .supported_mode  = MODE_INITIATOR,
 };
 
 /**
-- 
2.16.3
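
The default described in the commit message of patch 20/25 can be modelled in user space. The following is only a sketch under assumptions: the model_* names and the numeric flag values are hypothetical stand-ins rather than the kernel definitions, and the text used for target mode is assumed. It shows why an unset (zero) supported_mode field and an explicit initiator setting print the same line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the SCSI midlayer mode flags. */
enum model_mode {
	MODEL_MODE_UNKNOWN   = 0x00,	/* template field left unset */
	MODEL_MODE_INITIATOR = 0x01,
	MODEL_MODE_TARGET    = 0x02,
};

/* Format a mode bit mask as one text line (target text is an assumption). */
static int model_show_mode(unsigned int mode, char *buf, size_t len)
{
	int n = 0;

	assert(buf && len >= 32);	/* sketch: caller provides enough room */
	if (mode & MODEL_MODE_INITIATOR)
		n += snprintf(buf + n, len - (size_t)n, "Initiator");
	if (mode & MODEL_MODE_TARGET)
		n += snprintf(buf + n, len - (size_t)n, "%sTarget",
			      n ? ", " : "");
	n += snprintf(buf + n, len - (size_t)n, "\n");
	return n;
}

/* An unset field (zero) is treated as initiator before formatting. */
static int model_show_supported_mode(unsigned int supported_mode,
				     char *buf, size_t len)
{
	if (supported_mode == MODEL_MODE_UNKNOWN)
		supported_mode = MODEL_MODE_INITIATOR;
	return model_show_mode(supported_mode, buf, len);
}
```

Calling model_show_supported_mode() with MODEL_MODE_UNKNOWN or with MODEL_MODE_INITIATOR fills the buffer with the same "Initiator" line, which matches the observation that setting the field explicitly does not change the sysfs output.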



[PATCH 06/25] zfcp: fix missing REC trigger trace for all objects in ERP_FAILED

2018-05-17 Thread Steffen Maier
That other commit introduced an inconsistency because it would trace
on ERP_FAILED for all callers of port forced reopen triggers
(not just terminate_rport_io), but it would not trace on ERP_FAILED
for all callers of other ERP triggers such as adapter, port regular, LUN.

Therefore, generalize that other commit. zfcp_erp_action_enqueue()
already had two early outs which re-used the one zfcp_dbf_rec_trig() call.
All ERP trigger functions finally run through zfcp_erp_action_enqueue().
So move the special handling for ZFCP_STATUS_COMMON_ERP_FAILED into
zfcp_erp_action_enqueue() and add another early out with new trace marker
for pseudo ERP need in this case. This removes all early returns from
all ERP trigger functions so we always end up at zfcp_dbf_rec_trig().

Example trace record formatted with zfcpdbf from s390-tools:

Timestamp  : ...
Area   : REC
Subarea: 00
Level  : 1
Exception  : -
CPU ID : ..
Caller : 0x...
Record ID  : 1  ZFCP_DBF_REC_TRIG
Tag: ...
LUN: 0x...
WWPN   : 0x...
D_ID   : 0x...
Adapter status : 0x...
Port status: 0x...
LUN status : 0x...
Ready count: 0x...
Running count  : 0x...
ERP want   : 0x0.   ZFCP_ERP_ACTION_REOPEN_...
ERP need   : 0xe0   ZFCP_ERP_ACTION_FAILED

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Cc: <sta...@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c | 79 
 1 file changed, 51 insertions(+), 28 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 5c368cdfc455..20fe59300d0e 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -143,6 +143,49 @@ static void zfcp_erp_action_dismiss_adapter(struct 
zfcp_adapter *adapter)
}
 }
 
+static int zfcp_erp_handle_failed(int want, struct zfcp_adapter *adapter,
+ struct zfcp_port *port,
+ struct scsi_device *sdev)
+{
+   int need = want;
+   struct zfcp_scsi_dev *zsdev;
+
+   switch (want) {
+   case ZFCP_ERP_ACTION_REOPEN_LUN:
+   zsdev = sdev_to_zfcp(sdev);
+   if (atomic_read(&zsdev->status) & ZFCP_STATUS_COMMON_ERP_FAILED)
+   need = 0;
+   break;
+   case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED:
+   if (atomic_read(&port->status) & ZFCP_STATUS_COMMON_ERP_FAILED)
+   need = 0;
+   break;
+   case ZFCP_ERP_ACTION_REOPEN_PORT:
+   if (atomic_read(&port->status) &
+   ZFCP_STATUS_COMMON_ERP_FAILED) {
+   need = 0;
+   /* ensure propagation of failed status to new devices */
+   zfcp_erp_set_port_status(
+   port, ZFCP_STATUS_COMMON_ERP_FAILED);
+   }
+   break;
+   case ZFCP_ERP_ACTION_REOPEN_ADAPTER:
+   if (atomic_read(&adapter->status) &
+   ZFCP_STATUS_COMMON_ERP_FAILED) {
+   need = 0;
+   /* ensure propagation of failed status to new devices */
+   zfcp_erp_set_adapter_status(
+   adapter, ZFCP_STATUS_COMMON_ERP_FAILED);
+   }
+   break;
+   default:
+   need = 0;
+   break;
+   }
+
+   return need;
+}
+
 static int zfcp_erp_required_act(int want, struct zfcp_adapter *adapter,
 struct zfcp_port *port,
 struct scsi_device *sdev)
@@ -266,6 +309,12 @@ static int zfcp_erp_action_enqueue(int want, struct 
zfcp_adapter *adapter,
int retval = 1, need;
struct zfcp_erp_action *act;
 
+   need = zfcp_erp_handle_failed(want, adapter, port, sdev);
+   if (!need) {
+   need = ZFCP_ERP_ACTION_FAILED; /* marker for trace */
+   goto out;
+   }
+
if (!adapter->erp_thread)
return -EIO;
 
@@ -314,12 +363,6 @@ static int _zfcp_erp_adapter_reopen(struct zfcp_adapter 
*adapter,
zfcp_erp_adapter_block(adapter, clear_mask);
zfcp_scsi_schedule_rports_block(adapter);
 
-   /* ensure propagation of failed status to new devices */
-   if (atomic_read(&adapter->status) & ZFCP_STATUS_COMMON_ERP_FAILED) {
-   zfcp_erp_set_adapter_status(adapter,
-   ZFCP_STATUS_COMMON_ERP_FAILED);
-   return -EIO;
-   }
return zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_ADAPTER,
   adapter, NULL, NULL, id, 0);
 }
@@ -338,12 +381,8 @@ void zfcp_erp_adapter_reopen(struct zfcp_adapter *adapter, 
int clear, char *id)
zfcp_scsi_sche
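
The check that patch 06/25 centralizes can be sketched outside the kernel as follows. This is a simplified model, not the driver code: all model_* names and the value of the failed bit are hypothetical, the status propagation for ports and adapters is left out, and a non-failed object simply reports its "want" as "need", whereas the driver continues with further checks. Only the 0xe0 marker value is taken from the example trace record.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_ERP_FAILED	0x08000000u	/* placeholder bit, assumed value */
#define MODEL_ACTION_FAILED	0xe0		/* "ERP need" marker from the trace */

enum model_want {
	MODEL_REOPEN_LUN	 = 1,
	MODEL_REOPEN_PORT	 = 2,
	MODEL_REOPEN_PORT_FORCED = 3,
	MODEL_REOPEN_ADAPTER	 = 4,
};

/* Status words of the objects a trigger may refer to. */
struct model_status {
	uint32_t adapter, port, lun;
};

/* Return want if recovery may proceed, 0 if the object is marked failed. */
int model_handle_failed(int want, const struct model_status *st)
{
	uint32_t status;

	assert(st);
	switch (want) {
	case MODEL_REOPEN_LUN:
		status = st->lun;
		break;
	case MODEL_REOPEN_PORT:
	case MODEL_REOPEN_PORT_FORCED:
		status = st->port;
		break;
	case MODEL_REOPEN_ADAPTER:
		status = st->adapter;
		break;
	default:
		return 0;	/* unknown trigger type: nothing to do */
	}
	return (status & MODEL_ERP_FAILED) ? 0 : want;
}

/* Value that would end up in the "ERP need" field of the trace record. */
int model_trace_need(int want, const struct model_status *st)
{
	int need = model_handle_failed(want, st);

	return need ? need : MODEL_ACTION_FAILED;
}
```

With a port marked failed, a port reopen trigger yields 0xe0 in the model while a LUN reopen trigger on the same status set still yields its own type, so a trace reader can tell the two cases apart.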

[PATCH 03/25] zfcp: fix misleading REC trigger trace where erp_action setup failed

2018-05-17 Thread Steffen Maier
If a SCSI device is deleted during scsi_eh host reset, we cannot get a
reference to the SCSI device anymore since scsi_device_get returns !=0
by design. Assuming the recovery of adapter and port(s) was successful,
zfcp_erp_strategy_followup_success() attempts to trigger a LUN reset for
the half-gone SCSI device. Unfortunately, it causes the following confusing
trace record which states that zfcp will do a LUN recovery as "ERP need" is
ZFCP_ERP_ACTION_REOPEN_LUN == 1 and equals "ERP want".

Old example trace record formatted with zfcpdbf from s390-tools:

Tag:   : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded
LUN: 0x
WWPN   : 0x
D_ID   : 0x
Adapter status : 0x5400050b
Port status: 0x5401
LUN status : 0x4000 ZFCP_STATUS_COMMON_RUNNING
but not ZFCP_STATUS_COMMON_UNBLOCKED as it
was closed on close part of adapter reopen
ERP want   : 0x01
ERP need   : 0x01   misleading

However, zfcp_erp_setup_act() returns NULL as it cannot get the reference.
Hence, zfcp_erp_action_enqueue() takes an early goto out and _NO_ recovery
actually happens.

We always do want the recovery trigger trace record even if no erp_action
could be enqueued as in this case. For other cases where we did not enqueue
an erp_action, 'need' has always been zero to indicate this. In order to
indicate above goto out, introduce an eyecatcher "flag" to mark the
"ERP need" as 'not needed' but still keep the information which erp_action
type, that zfcp_erp_required_act() had decided upon, is needed.
0xc_ is chosen to be visibly different from 0x0_ in "ERP want".

New example trace record formatted with zfcpdbf from s390-tools:

Tag:   : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded
LUN: 0x
WWPN   : 0x
D_ID   : 0x
Adapter status : 0x5400050b
Port status: 0x5401
LUN status : 0x4000
ERP want   : 0x01
ERP need   : 0xc1   would need LUN ERP, but no action set up
   ^

Before v2.6.38 commit ae0904f60fab ("[SCSI] zfcp: Redesign of the debug
tracing for recovery actions.") we could detect this case because the
"erp_action" field in the trace was NULL. The rework removed erp_action
as argument and field from the trace.

This patch here is for tracing. A fix to allow LUN recovery in the case at
hand is a topic for a separate patch.

See also commit fdbd1c5e27da ("[SCSI] zfcp: Allow running unit/LUN shutdown
without acquiring reference") for a similar case and background info.

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Fixes: ae0904f60fab ("[SCSI] zfcp: Redesign of the debug tracing for recovery 
actions.")
Cc: <sta...@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 1d91a32db08e..d9cd25b56cfa 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -35,11 +35,23 @@ enum zfcp_erp_steps {
ZFCP_ERP_STEP_LUN_OPENING   = 0x2000,
 };
 
+/**
+ * enum zfcp_erp_act_type - Type of ERP action object.
+ * @ZFCP_ERP_ACTION_REOPEN_LUN: LUN recovery.
+ * @ZFCP_ERP_ACTION_REOPEN_PORT: Port recovery.
+ * @ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: Forced port recovery.
+ * @ZFCP_ERP_ACTION_REOPEN_ADAPTER: Adapter recovery.
+ * @ZFCP_ERP_ACTION_NONE: Eyecatcher pseudo flag to bitwise or-combine with
+ *   either of the other enum values.
+ *   Used to indicate that an ERP action could not be
+ *   set up despite a detected need for some recovery.
+ */
 enum zfcp_erp_act_type {
ZFCP_ERP_ACTION_REOPEN_LUN = 1,
ZFCP_ERP_ACTION_REOPEN_PORT= 2,
ZFCP_ERP_ACTION_REOPEN_PORT_FORCED = 3,
ZFCP_ERP_ACTION_REOPEN_ADAPTER = 4,
+   ZFCP_ERP_ACTION_NONE   = 0xc0,
 };
 
 enum zfcp_erp_act_state {
@@ -257,8 +269,10 @@ static int zfcp_erp_action_enqueue(int want, struct 
zfcp_adapter *adapter,
goto out;
 
act = zfcp_erp_setup_act(need, act_status, adapter, port, sdev);
-   if (!act)
+   if (!act) {
+   need |= ZFCP_ERP_ACTION_NONE; /* marker for trace */
goto out;
+   }
atomic_or(ZFCP_STATUS_ADAPTER_ERP_PENDING, &adapter->status);
++adapter->erp_total_count;
list_add_tail(&act->list, &adapter->erp_ready_head);
-- 
2.16.3
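
For reading the new "ERP need" value of patch 03/25, a small decoder may help. It is a sketch under one assumption that goes beyond the patch: the high nibble is treated as the marker and the low nibble as the action type, which fits the values 0x01, 0xc0 and 0xc1 shown above. The model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_ERP_ACTION_NONE	0xc0	/* eyecatcher flag from the patch */

/* Decoded view of one "ERP need" byte from a REC trigger trace. */
struct model_need {
	bool no_action_set_up;	/* marker 0xc_ present */
	int act_type;		/* 1=LUN, 2=port, 3=port forced, 4=adapter */
};

struct model_need model_decode_need(unsigned int need)
{
	struct model_need d;

	assert(need <= 0xff);	/* the trace field is a single byte */
	/* Assumption: high nibble carries the marker, low nibble the type. */
	d.no_action_set_up = (need & 0xf0) == MODEL_ERP_ACTION_NONE;
	d.act_type = (int)(need & 0x0f);
	return d;
}
```

Decoding 0xc1 gives "no action set up" together with type 1 (LUN recovery), while plain 0x01 decodes to type 1 without the marker, which is the distinction the old trace record could not express.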



[PATCH 07/25] zfcp: fix missing REC trigger trace on enqueue without ERP thread

2018-05-17 Thread Steffen Maier
Example trace record formatted with zfcpdbf from s390-tools:

Timestamp  : ...
Area   : REC
Subarea: 00
Level  : 1
Exception  : -
CPU ID : ..
Caller : 0x...
Record ID  : 1  ZFCP_DBF_REC_TRIG
Tag: ...
LUN: 0x...
WWPN   : 0x...
D_ID   : 0x...
Adapter status : 0x...
Port status: 0x...
LUN status : 0x...
Ready count: 0x...
Running count  : 0x...
ERP want   : 0x0.   ZFCP_ERP_ACTION_REOPEN_...
ERP need   : 0xc0   ZFCP_ERP_ACTION_NONE

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Cc: <sta...@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 20fe59300d0e..69dfb328dba4 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -315,8 +315,11 @@ static int zfcp_erp_action_enqueue(int want, struct 
zfcp_adapter *adapter,
goto out;
}
 
-   if (!adapter->erp_thread)
-   return -EIO;
+   if (!adapter->erp_thread) {
+   need = ZFCP_ERP_ACTION_NONE; /* marker for trace */
+   retval = -EIO;
+   goto out;
+   }
 
need = zfcp_erp_required_act(want, adapter, port, sdev);
if (!need)
-- 
2.16.3
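
The control flow change of patch 07/25 (replace the early return with a jump to the common exit so that the trace is always written) can be sketched in user space like this. The structure and the model_* names are hypothetical; only the marker value 0xc0 and the -EIO return code come from the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_ERP_ACTION_NONE	0xc0	/* marker for trace */

struct model_adapter {
	bool has_erp_thread;
	int traces;	/* number of trigger trace records written */
	int last_need;	/* "ERP need" of the most recent record */
	int queued;	/* number of enqueued actions */
};

/* Stand-in for the trigger trace call at the common exit. */
static void model_trace(struct model_adapter *a, int want, int need)
{
	(void)want;
	a->traces++;
	a->last_need = need;
}

int model_enqueue(struct model_adapter *a, int want)
{
	int retval = 1, need = want;

	assert(a);
	if (!a->has_erp_thread) {
		need = MODEL_ERP_ACTION_NONE;	/* marker for trace */
		retval = -EIO;
		goto out;	/* previously: return without a trace */
	}
	a->queued++;
	retval = 0;
out:
	model_trace(a, want, need);
	return retval;
}
```

In the model, a trigger without an ERP thread still returns -EIO but now leaves exactly one trace record with need 0xc0 and nothing enqueued.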



[PATCH 15/25] workqueue,zfcp: set description for port work items with their WWPN as context

2018-05-17 Thread Steffen Maier
As a prerequisite, complement commit 3d1cb2059d93 ("workqueue: include
workqueue info when printing debug dump of a worker task") to be usable
with kernel modules by exporting the symbol set_worker_desc().
Current built-in user was introduced with commit ef3b101925f2 ("writeback:
set worker desc to identify writeback workers in task dumps").

This can help to distinguish work items which do not have adapter scope.
Description is printed out with task dump for debugging on
WARN, BUG, panic, or magic-sysrq [show-task-states(t)].

Example:
$ echo 0 >| /sys/bus/ccw/drivers/zfcp/0.0.1880/0x50050763031bd327/failed &
$ echo 't' >| /proc/sysrq-trigger
$ dmesg
sysrq: SysRq : Show State
  task PC stack   pid father
...
zfcp_q_0.0.1880 S14640  2165  2 0x0200
Call Trace:
([<009df464>] __schedule+0xbf4/0xc78)
 [<009df57c>] schedule+0x94/0xc0
 [<00168654>] rescuer_thread+0x33c/0x3a0
 [<0016f8be>] kthread+0x166/0x178
 [<009e71f2>] kernel_thread_starter+0x6/0xc
 [<009e71ec>] kernel_thread_starter+0x0/0xc
no locks held by zfcp_q_0.0.1880/2165.
...
kworker/u512:2  D11280  2193  2 0x0200
Workqueue: zfcp_q_0.0.1880 zfcp_scsi_rport_work [zfcp] (zrpd-50050763031bd327)
^
Call Trace:
([<009df464>] __schedule+0xbf4/0xc78)
 [<009df57c>] schedule+0x94/0xc0
 [<009e50c0>] schedule_timeout+0x488/0x4d0
 [<001e425c>] msleep+0x5c/0x78  >>test code only<<
 [<03ff8008a21e>] zfcp_scsi_rport_work+0xbe/0x100 [zfcp]
 [<00167154>] process_one_work+0x3b4/0x718
 [<0016771c>] worker_thread+0x264/0x408
 [<0016f8be>] kthread+0x166/0x178
 [<009e71f2>] kernel_thread_starter+0x6/0xc
 [<009e71ec>] kernel_thread_starter+0x0/0xc
2 locks held by kworker/u512:2/2193:
 #0:  (name){.+}, at: [<00166f4e>] process_one_work+0x1ae/0x718
 #1:  ((&(>rport_work)->work)){+.+.+.}, at: [<00166f4e>] 
process_one_work+0x1ae/0x718
...
=
Showing busy workqueues and worker pools:
workqueue zfcp_q_0.0.1880: flags=0x2000a
  pwq 512: cpus=0-255 flags=0x4 nice=0 active=1/1
in-flight: 2193:zfcp_scsi_rport_work [zfcp]
pool 512: cpus=0-255 flags=0x4 nice=0 hung=0s workers=4 idle: 5 2354 2311

Work items with adapter scope are already identified by the workqueue name
"zfcp_q_" and the work item function name.

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Cc: Tejun Heo <t...@kernel.org>
Cc: Lai Jiangshan <jiangshan...@gmail.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_fc.c   | 2 ++
 drivers/s390/scsi/zfcp_scsi.c | 3 +++
 kernel/workqueue.c| 1 +
 3 files changed, 6 insertions(+)

diff --git a/drivers/s390/scsi/zfcp_fc.c b/drivers/s390/scsi/zfcp_fc.c
index 6162cf57a20a..2ad80c43f674 100644
--- a/drivers/s390/scsi/zfcp_fc.c
+++ b/drivers/s390/scsi/zfcp_fc.c
@@ -425,6 +425,7 @@ void zfcp_fc_port_did_lookup(struct work_struct *work)
struct zfcp_port *port = container_of(work, struct zfcp_port,
  gid_pn_work);
 
+   set_worker_desc("zgidpn%16llx", port->wwpn); /* < WORKER_DESC_LEN=24 */
ret = zfcp_fc_ns_gid_pn(port);
if (ret) {
/* could not issue gid_pn for some reason */
@@ -559,6 +560,7 @@ void zfcp_fc_link_test_work(struct work_struct *work)
container_of(work, struct zfcp_port, test_link_work);
int retval;
 
+   set_worker_desc("zadisc%16llx", port->wwpn); /* < WORKER_DESC_LEN=24 */
get_device(&port->dev);
port->rport_task = RPORT_DEL;
zfcp_scsi_rport_work(&port->rport_work);
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index fcc832b73960..0f7830ffd40a 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -730,6 +730,9 @@ void zfcp_scsi_rport_work(struct work_struct *work)
struct zfcp_port *port = container_of(work, struct zfcp_port,
  rport_work);
 
+   set_worker_desc("zrp%c-%16llx",
+   (port->rport_task == RPORT_ADD) ? 'a' : 'd',
+   port->wwpn); /* < WORKER_DESC_LEN=24 */
while (port->rport_task) {
if (port->rport_task == RPORT_ADD) {
port->rport_task = RPORT_NONE;
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ca7959be8aaa..4e6fa755ebdc 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4350,6 +4350,7 @@ void set_worker_desc(const char *fmt, ...)
worker->desc_valid = true;
}
 }
+EXPORT_SYMBOL_GPL(set_worker_desc);
 
 /**
  * print_worker_info - print out worker information and description
-- 
2.16.3
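
The length comments in patch 15/25 ("< WORKER_DESC_LEN=24") can be checked with ordinary snprintf(). The sketch below is a user-space illustration, not the workqueue code: model_rport_desc() is a hypothetical helper, and only the format string, the limit of 24 and the example WWPN are taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODEL_WORKER_DESC_LEN 24	/* limit quoted in the patch comments */

/*
 * Build the description used for rport work items:
 * "zrp", 'a' or 'd' for add/delete, '-', then the WWPN as 16 hex digits.
 * Returns the length of the text without the terminating NUL.
 */
int model_rport_desc(char *desc, size_t len, char task,
		     unsigned long long wwpn)
{
	assert(desc && len >= MODEL_WORKER_DESC_LEN);
	return snprintf(desc, len, "zrp%c-%16llx", task, wwpn);
}
```

For the WWPN 0x50050763031bd327 from the example output and task 'd', the helper produces "zrpd-50050763031bd327", which is 21 characters and therefore fits into 24 bytes including the terminator.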



[PATCH 13/25] zfcp: decouple TMFs from scsi_cmnd by using fc_block_rport

2018-05-17 Thread Steffen Maier
Intentionally retrieve the rport by walking SCSI common code objects
rather than zfcp_sdev->port->rport.

The latter is used for pairing the calls to fc_remote_port_add() and
fc_remote_port_delete(). [see v2.6.31 commit 379d6bf6573e ("[SCSI] zfcp:
Add port only once to FC transport class")]

zfcp_scsi_rport_register() sets zfcp_port.rport to what
fc_remote_port_add() returned.
zfcp_scsi_rport_block() sets zfcp_port.rport = NULL after having called
fc_remote_port_delete().

Hence, while an rport is blocked (or in any subsequent state due to
scsi_transport_fc timeouts such as fast_io_fail_tmo or dev_loss_tmo),
zfcp_port.rport is NULL and cannot serve as argument to fc_block_rport().

During zfcp recovery, a just recovered zfcp_port can have the UNBLOCKED
status flag, but an async rport unblocking has only started via
zfcp_scsi_schedule_rport_register() in zfcp_erp_try_rport_unblock()
[see v4.10 commit 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with
LUN recovery")] in zfcp_erp_action_cleanup(). Now zfcp_erp_wait() can
return. This would be sufficient to successfully send a TMF.
But the rport can still be blocked and zfcp_port.rport can still be NULL
until zfcp_port.rport_work was scheduled and has actually called
fc_remote_port_add() and assigned its return value to zfcp_port.rport.
We need an unblocked rport for a successful scsi_eh TUR.

Similarly, for a zfcp_port which has just lost its UNBLOCKED status flag,
the return of zfcp_erp_wait() can race with zfcp_port.rport_work queued
by zfcp_scsi_schedule_rport_block(). Therefore we cannot reliably access
zfcp_port.rport. However, we'd like to get fc_block_rport()'s opinion on
when fast_io_fail_tmo triggered. While we might use
flush_work(&port->rport_work) to sync with the work item, we can simply use
the other way to get an rport pointer.

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---

Notes:
Changes since RFC:

For consistency renamed from "zfcp: use fc_block_rport for TMFs and
host reset to decouple from scsi_cmnd".

zfcp_scsi_eh_host_reset_handler() will be converted in a later patch.
Therefore, this patch here does not touch the host reset case any more.

Since the previous "[RFC 6/9] scsi: fc: start decoupling fc_block_scsi_eh
from scsi_cmnd" was queued for 4.14 already, I dropped it from this new
patch set version and simply depend on it.

Intentionally retrieve the rport by walking SCSI common code objects
rather than zfcp_sdev->port->rport.

This also fixes the problem that we could not synchronize if port->rport
is NULL but still continued as if the TMF was successful as Hannes
correctly pointed out.

 drivers/s390/scsi/zfcp_scsi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index e77e43a0630a..e0c5735cf3db 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -270,6 +270,7 @@ static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, 
u8 tm_flags)
struct scsi_device *sdev = scpnt->device;
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
+   struct fc_rport *rport = starget_to_rport(scsi_target(sdev));
struct zfcp_fsf_req *fsf_req = NULL;
int retval = SUCCESS, ret;
int retry = 3;
@@ -281,7 +282,7 @@ static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, 
u8 tm_flags)
 
zfcp_dbf_scsi_devreset("wait", sdev, tm_flags, NULL);
zfcp_erp_wait(adapter);
-   ret = fc_block_scsi_eh(scpnt);
+   ret = fc_block_rport(rport);
if (ret) {
zfcp_dbf_scsi_devreset("fiof", sdev, tm_flags, NULL);
return ret;
-- 
2.16.3
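
The reasoning of patch 13/25 about the two ways to reach an rport can be illustrated with a reduced object model. Everything here is hypothetical (the model_* structures only mimic the relationships described in the commit message): the driver-private pointer is cleared while the rport is blocked, but the pointer reachable from the SCSI target stays valid.

```c
#include <assert.h>
#include <stddef.h>

struct model_rport {
	int blocked;
};

/* Driver-private port object: pointer is NULL while the rport is blocked. */
struct model_port {
	struct model_rport *rport;
};

/* SCSI target as seen by common code: keeps its parent rport. */
struct model_target {
	struct model_rport *rport;
};

struct model_sdev {
	struct model_target *target;
	struct model_port *port;
};

/* Path the commit message advises against for this purpose. */
struct model_rport *model_rport_via_port(const struct model_sdev *sdev)
{
	assert(sdev && sdev->port);
	return sdev->port->rport;
}

/* Path via the common code objects, usable even while blocked. */
struct model_rport *model_rport_via_target(const struct model_sdev *sdev)
{
	assert(sdev && sdev->target);
	return sdev->target->rport;
}
```

With a blocked rport, model_rport_via_port() returns NULL whereas model_rport_via_target() still returns the rport, so a blocking helper has something to wait on.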



[PATCH 12/25] zfcp: decouple SCSI setup of TMF from scsi_cmnd

2018-05-17 Thread Steffen Maier
Actually change the signature of zfcp_fsf_fcp_task_mgmt().
Since it was prepared in the previous patch, we only need to delete
a local auto variable which is now the intended argument.

Prepare zfcp_fsf_fcp_task_mgmt's caller zfcp_task_mgmt_function()
to have its function body only depend on a scsi_device and derived objects.

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---

Notes:
Changes since RFC:

Since the FCP channel always requires a valid LUN handle,
we now use scsi_device as context data with any TMF instead of
zfcp_port for FCP_TMF_TGT_RESET and an optional scsi_device for
FCP_TMF_LUN_RESET.

Thus, zfcp_scsi_forget_cmnds() does not need a change anymore and
zfcp_task_mgmt_function() needs fewer changes because the tracing
with a mandatory scsi_device is simpler.

 drivers/s390/scsi/zfcp_ext.h  |  3 ++-
 drivers/s390/scsi/zfcp_fsf.c  | 12 ++--
 drivers/s390/scsi/zfcp_scsi.c |  2 +-
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index 78bcd80d0509..e317c4b513c9 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -123,7 +123,8 @@ extern int zfcp_fsf_send_els(struct zfcp_adapter *, u32,
 struct zfcp_fsf_ct_els *, unsigned int);
 extern int zfcp_fsf_fcp_cmnd(struct scsi_cmnd *);
 extern void zfcp_fsf_req_free(struct zfcp_fsf_req *);
-extern struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_cmnd *, u8);
+extern struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_device *sdev,
+  u8 tm_flags);
 extern struct zfcp_fsf_req *zfcp_fsf_abort_fcp_cmnd(struct scsi_cmnd *);
 extern void zfcp_fsf_reqid_check(struct zfcp_qdio *, int);
 
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 5bc84eaa6948..d86c3bf71664 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2314,17 +2314,17 @@ static void zfcp_fsf_fcp_task_mgmt_handler(struct 
zfcp_fsf_req *req)
 }
 
 /**
- * zfcp_fsf_fcp_task_mgmt - send SCSI task management command
- * @scmnd: SCSI command to send the task management command for
- * @tm_flags: unsigned byte for task management flags
- * Returns: on success pointer to struct fsf_req, NULL otherwise
+ * zfcp_fsf_fcp_task_mgmt() - Send SCSI task management command (TMF).
+ * @sdev: Pointer to SCSI device to send the task management command to.
+ * @tm_flags: Unsigned byte for task management flags.
+ *
+ * Return: On success pointer to struct zfcp_fsf_req, %NULL otherwise.
  */
-struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_cmnd *scmnd,
+struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_device *sdev,
u8 tm_flags)
 {
struct zfcp_fsf_req *req = NULL;
struct fcp_cmnd *fcp_cmnd;
-   struct scsi_device *sdev = scmnd->device;
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
struct zfcp_qdio *qdio = zfcp_sdev->port->adapter->qdio;
 
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 0afc546b71df..e77e43a0630a 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -275,7 +275,7 @@ static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, 
u8 tm_flags)
int retry = 3;
 
while (retry--) {
-   fsf_req = zfcp_fsf_fcp_task_mgmt(scpnt, tm_flags);
+   fsf_req = zfcp_fsf_fcp_task_mgmt(sdev, tm_flags);
if (fsf_req)
break;
 
-- 
2.16.3



[PATCH 11/25] zfcp: decouple FSF request setup of TMF from scsi_cmnd

2018-05-17 Thread Steffen Maier
In zfcp_fsf_fcp_task_mgmt() resolve the still old argument scsi_cmnd
into scsi_device very early and only depend on scsi_device and derived
objects in the function body.

This prepares to later change the function signature replacing the
scsi_cmnd argument with scsi_device.

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---

Notes:
Changes since RFC:

Since the FCP channel always requires a valid LUN handle,
we now use scsi_device as context data with any TMF instead of either
scsi_device for FCP_TMF_LUN_RESET or zfcp_port for FCP_TMF_TGT_RESET.

Thus, zfcp_fc_fcp_tm() no longer needs a change.

This also fixes a kernel panic due to the unconditional dereference with
sdev_to_zfcp(sdev) where sdev could have been NULL later in the patch set.

 drivers/s390/scsi/zfcp_fsf.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 8bc768a01ef5..5bc84eaa6948 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2324,7 +2324,8 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct 
scsi_cmnd *scmnd,
 {
struct zfcp_fsf_req *req = NULL;
struct fcp_cmnd *fcp_cmnd;
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scmnd->device);
+   struct scsi_device *sdev = scmnd->device;
+   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
struct zfcp_qdio *qdio = zfcp_sdev->port->adapter->qdio;
 
if (unlikely(!(atomic_read(&zfcp_sdev->status) &
@@ -2344,7 +2345,8 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct 
scsi_cmnd *scmnd,
goto out;
}
 
-   req->data = scmnd->device;
+   req->data = sdev;
+
req->handler = zfcp_fsf_fcp_task_mgmt_handler;
req->qtcb->header.lun_handle = zfcp_sdev->lun_handle;
req->qtcb->header.port_handle = zfcp_sdev->port->handle;
@@ -2355,7 +2357,7 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct 
scsi_cmnd *scmnd,
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
 
fcp_cmnd = &req->qtcb->bottom.io.fcp_cmnd.iu;
-   zfcp_fc_fcp_tm(fcp_cmnd, scmnd->device, tm_flags);
+   zfcp_fc_fcp_tm(fcp_cmnd, sdev, tm_flags);
 
zfcp_fsf_start_timer(req, ZFCP_SCSI_ER_TIMEOUT);
if (!zfcp_fsf_req_send(req))
-- 
2.16.3



[PATCH 10/25] zfcp: split FCP_CMND IU setup between SCSI I/O and TMF again

2018-05-17 Thread Steffen Maier
This reverts commit 2443c8b23aea ("[SCSI] zfcp: Merge FCP task management
setup with regular FCP command setup"), because this introduced a
dependency on the unsuitable SCSI command for scsi_eh / TMF.

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Reviewed-by: Hannes Reinecke <h...@suse.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_fc.h  | 22 ++
 drivers/s390/scsi/zfcp_fsf.c |  4 ++--
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fc.h b/drivers/s390/scsi/zfcp_fc.h
index 6a397ddaadf0..3cd74729cfb9 100644
--- a/drivers/s390/scsi/zfcp_fc.h
+++ b/drivers/s390/scsi/zfcp_fc.h
@@ -207,21 +207,14 @@ struct zfcp_fc_wka_ports {
  * zfcp_fc_scsi_to_fcp - setup FCP command with data from scsi_cmnd
  * @fcp: fcp_cmnd to setup
  * @scsi: scsi_cmnd where to get LUN, task attributes/flags and CDB
- * @tm: task management flags to setup task management command
  */
 static inline
-void zfcp_fc_scsi_to_fcp(struct fcp_cmnd *fcp, struct scsi_cmnd *scsi,
-u8 tm_flags)
+void zfcp_fc_scsi_to_fcp(struct fcp_cmnd *fcp, struct scsi_cmnd *scsi)
 {
u32 datalen;
 
int_to_scsilun(scsi->device->lun, (struct scsi_lun *) &fcp->fc_lun);
 
-   if (unlikely(tm_flags)) {
-   fcp->fc_tm_flags = tm_flags;
-   return;
-   }
-
fcp->fc_pri_ta = FCP_PTA_SIMPLE;
 
if (scsi->sc_data_direction == DMA_FROM_DEVICE)
@@ -240,6 +233,19 @@ void zfcp_fc_scsi_to_fcp(struct fcp_cmnd *fcp, struct 
scsi_cmnd *scsi,
}
 }
 
+/**
+ * zfcp_fc_fcp_tm() - Setup FCP command as task management command.
+ * @fcp: Pointer to FCP_CMND IU to set up.
+ * @dev: Pointer to SCSI_device where to send the task management command.
+ * @tm_flags: Task management flags to setup tm command.
+ */
+static inline
+void zfcp_fc_fcp_tm(struct fcp_cmnd *fcp, struct scsi_device *dev, u8 tm_flags)
+{
+   int_to_scsilun(dev->lun, (struct scsi_lun *) &fcp->fc_lun);
+   fcp->fc_tm_flags = tm_flags;
+}
+
 /**
  * zfcp_fc_evap_fcp_rsp - evaluate FCP RSP IU and update scsi_cmnd accordingly
  * @fcp_rsp: FCP RSP IU to evaluate
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index a95070c7cad8..8bc768a01ef5 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2260,7 +2260,7 @@ int zfcp_fsf_fcp_cmnd(struct scsi_cmnd *scsi_cmnd)
 
BUILD_BUG_ON(sizeof(struct fcp_cmnd) > FSF_FCP_CMND_SIZE);
fcp_cmnd = &req->qtcb->bottom.io.fcp_cmnd.iu;
-   zfcp_fc_scsi_to_fcp(fcp_cmnd, scsi_cmnd, 0);
+   zfcp_fc_scsi_to_fcp(fcp_cmnd, scsi_cmnd);
 
if ((scsi_get_prot_op(scsi_cmnd) != SCSI_PROT_NORMAL) &&
scsi_prot_sg_count(scsi_cmnd)) {
@@ -2355,7 +2355,7 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct 
scsi_cmnd *scmnd,
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
 
fcp_cmnd = &req->qtcb->bottom.io.fcp_cmnd.iu;
-   zfcp_fc_scsi_to_fcp(fcp_cmnd, scmnd, tm_flags);
+   zfcp_fc_fcp_tm(fcp_cmnd, scmnd->device, tm_flags);
 
zfcp_fsf_start_timer(req, ZFCP_SCSI_ER_TIMEOUT);
if (!zfcp_fsf_req_send(req))
-- 
2.16.3
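
The dedicated TMF setup helper added in patch 10/25 only needs a LUN and the task management flags. A user-space sketch of that idea follows. It rests on assumptions: the model_* names and the reduced command structure are hypothetical, the LUN conversion mirrors how the common helper is understood to order the bytes (16-bit words from the least significant end, each stored high byte first), and the flag value 0x10 used in the usage note is assumed to denote a LUN reset.

```c
#include <assert.h>
#include <stdint.h>

/* Reduced stand-in for the FCP_CMND IU fields touched here. */
struct model_fcp_cmnd {
	uint8_t fc_lun[8];
	uint8_t fc_tm_flags;
	uint8_t fc_pri_ta;
};

/* Convert an integer LUN to the 8-byte wire format (assumed byte order). */
static void model_int_to_lun(uint64_t lun, uint8_t out[8])
{
	for (int i = 0; i < 8; i += 2) {
		out[i] = (uint8_t)((lun >> 8) & 0xff);
		out[i + 1] = (uint8_t)(lun & 0xff);
		lun >>= 16;
	}
}

/* TMF setup: LUN and flags only, no dependency on a SCSI command. */
void model_fcp_tm(struct model_fcp_cmnd *fcp, uint64_t lun, uint8_t tm_flags)
{
	assert(fcp);
	model_int_to_lun(lun, fcp->fc_lun);
	fcp->fc_tm_flags = tm_flags;
}
```

Calling model_fcp_tm() on a zeroed structure with LUN 0x4001 and flags 0x10 sets the first two LUN bytes to 0x40 and 0x01 and leaves the task attribute field untouched, which is the separation from regular I/O setup that the revert restores.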



[PATCH 02/25] zfcp: fix missing SCSI trace for retry of abort / scsi_eh TMF

2018-05-17 Thread Steffen Maier
We already have a SCSI trace for the end of abort and scsi_eh TMF. Due to
zfcp_erp_wait() and fc_block_scsi_eh() time can pass between the
start of our eh callback and an actual send/recv of an abort / TMF request.
In order to see the temporal sequence including any abort / TMF send
retries, add a trace before the above two blocking functions.
This supports problem determination with scsi_eh and parallel zfcp ERP.

No need to explicitly trace the beginning of our eh callback, since we
typically can send an abort / TMF and see its HBA response (in the worst
case, it's a pseudo response on dismiss all of adapter recovery, e.g. due
to an FSF request timeout [fsrth_1] of the abort / TMF). If we cannot send,
we now get a trace record for the first "abrt_wt" or "[lt]r_wait" which
denotes almost the beginning of the callback.

No need to explicitly trace the wakeup after the above two blocking
functions because the next retry loop causes another trace in any case
and that is sufficient.

Example trace records formatted with zfcpdbf from s390-tools:

Timestamp      : ...
Area           : SCSI
Subarea        : 00
Level          : 1
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : abrt_wt   abort, before zfcp_erp_wait()
Request ID     : 0x        none (invalid)
SCSI ID        : 0x
SCSI LUN       : 0x
SCSI LUN high  : 0x
SCSI result    : 0x
SCSI retries   : 0x
SCSI allowed   : 0x
SCSI scribble  : 0x
SCSI opcode    :
FCP rsp inf cod: 0x..      none (invalid)
FCP rsp IU     : ...       none (invalid)

Timestamp      : ...
Area           : SCSI
Subarea        : 00
Level          : 1
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : lr_wait   LUN reset, before zfcp_erp_wait()
Request ID     : 0x        none (invalid)
SCSI ID        : 0x
SCSI LUN       : 0x
SCSI LUN high  : 0x
SCSI result    : 0x...     unrelated
SCSI retries   : 0x..      unrelated
SCSI allowed   : 0x..      unrelated
SCSI scribble  : 0x...     unrelated
SCSI opcode    : ...       unrelated
FCP rsp inf cod: 0x..      none (invalid)
FCP rsp IU     : ...       none (invalid)

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Fixes: 63caf367e1c9 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in 
zfcp")
Fixes: af4de36d911a ("[SCSI] zfcp: Block scsi_eh thread for rport state 
BLOCKED")
Cc: <sta...@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_scsi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index a62357f5e8b4..4fdb1665b0e6 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -181,6 +181,7 @@ static int zfcp_scsi_eh_abort_handler(struct scsi_cmnd 
*scpnt)
if (abrt_req)
break;
 
+   zfcp_dbf_scsi_abort("abrt_wt", scpnt, NULL);
zfcp_erp_wait(adapter);
ret = fc_block_scsi_eh(scpnt);
if (ret) {
@@ -277,6 +278,7 @@ static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, 
u8 tm_flags)
if (fsf_req)
break;
 
+   zfcp_dbf_scsi_devreset("wait", scpnt, tm_flags, NULL);
zfcp_erp_wait(adapter);
ret = fc_block_scsi_eh(scpnt);
if (ret) {
-- 
2.16.3



[PATCH 04/25] zfcp: fix missing REC trigger trace on terminate_rport_io early return

2018-05-17 Thread Steffen Maier
get_device() and its internally used kobject_get() only return NULL
if they get passed NULL as argument. zfcp_get_port_by_wwpn() loops over
adapter->port_list so the iteration variable port is always non-NULL.
Struct device is embedded in struct zfcp_port so &port->dev is always
non-NULL. This is the argument to get_device().
However, if we get an fc_rport in terminate_rport_io() for which we cannot
find a match within zfcp_get_port_by_wwpn(), the latter can return NULL.
v2.6.30 commit 70932935b61e ("[SCSI] zfcp: Fix oops when port disappears")
introduced an early return without adding a trace record for this case.
Even if we don't need recovery in this case, for debugging we should still
see that our callback was invoked originally by scsi_transport_fc.
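
To illustrate the argument above, here is a small userspace sketch, not the
zfcp code: struct port and both helpers are hypothetical simplifications. The
list walk only ever visits non-NULL entries, yet the lookup as a whole still
returns NULL on a miss, and the caller leaves a trace tag in both cases.

```c
#include <stddef.h>
#include <stdint.h>

struct port {			/* hypothetical, far smaller than struct zfcp_port */
	uint64_t wwpn;
	struct port *next;
};

/* Walk the list; every visited entry is non-NULL, but a miss yields NULL. */
static struct port *find_port_by_wwpn(struct port *head, uint64_t wwpn)
{
	struct port *p;

	for (p = head; p != NULL; p = p->next)
		if (p->wwpn == wwpn)
			return p;
	return NULL;
}

/* Returns the tag that would be traced, so both paths leave a record. */
static const char *terminate_io_tag(struct port *head, uint64_t wwpn)
{
	if (find_port_by_wwpn(head, wwpn))
		return "sctrpi1";	/* port found: forced reopen is triggered */
	return "sctrpin";		/* no port: trace only, no recovery */
}
```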

Example trace record formatted with zfcpdbf from s390-tools:

Timestamp      : ...
Area           : REC
Subarea        : 00
Level          : 1
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : sctrpin  SCSI terminate rport I/O, no zfcp port
LUN            : 0x       none (invalid)
WWPN           : 0x       WWPN
D_ID           : 0x       N_Port-ID
Adapter status : 0x...
Port status    : 0x       unknown (-1)
LUN status     : 0x       none (invalid)
Ready count    : 0x...
Running count  : 0x...
ERP want       : 0x03     ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
ERP need       : 0xc0     ZFCP_ERP_ACTION_NONE

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Fixes: 70932935b61e ("[SCSI] zfcp: Fix oops when port disappears")
Cc: <sta...@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c  | 20 
 drivers/s390/scsi/zfcp_ext.h  |  3 +++
 drivers/s390/scsi/zfcp_scsi.c |  5 +
 3 files changed, 28 insertions(+)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index d9cd25b56cfa..3489b1bc9121 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -283,6 +283,26 @@ static int zfcp_erp_action_enqueue(int want, struct 
zfcp_adapter *adapter,
return retval;
 }
 
+void zfcp_erp_port_forced_no_port_dbf(char *id, struct zfcp_adapter *adapter,
+ u64 port_name, u32 port_id)
+{
+   unsigned long flags;
+   static /* don't waste stack */ struct zfcp_port tmpport;
+
+   write_lock_irqsave(&adapter->erp_lock, flags);
+   /* Stand-in zfcp port with fields just good enough for
+    * zfcp_dbf_rec_trig() and zfcp_dbf_set_common().
+    * Under lock because tmpport is static.
+    */
+   atomic_set(&tmpport.status, -1); /* unknown */
+   tmpport.wwpn = port_name;
+   tmpport.d_id = port_id;
+   zfcp_dbf_rec_trig(id, adapter, &tmpport, NULL,
+ ZFCP_ERP_ACTION_REOPEN_PORT_FORCED,
+ ZFCP_ERP_ACTION_NONE);
+   write_unlock_irqrestore(&adapter->erp_lock, flags);
+}
+
 static int _zfcp_erp_adapter_reopen(struct zfcp_adapter *adapter,
int clear_mask, char *id)
 {
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index e55f42ce1168..3299bd345076 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -55,6 +55,9 @@ extern void zfcp_dbf_scsi_eh(char *tag, struct zfcp_adapter 
*adapter,
 /* zfcp_erp.c */
 extern void zfcp_erp_set_adapter_status(struct zfcp_adapter *, u32);
 extern void zfcp_erp_clear_adapter_status(struct zfcp_adapter *, u32);
+extern void zfcp_erp_port_forced_no_port_dbf(char *id,
+struct zfcp_adapter *adapter,
+u64 port_name, u32 port_id);
 extern void zfcp_erp_adapter_reopen(struct zfcp_adapter *, int, char *);
 extern void zfcp_erp_adapter_shutdown(struct zfcp_adapter *, int, char *);
 extern void zfcp_erp_set_port_status(struct zfcp_port *, u32);
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 4fdb1665b0e6..478e7ef9ea2f 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -605,6 +605,11 @@ static void zfcp_scsi_terminate_rport_io(struct fc_rport 
*rport)
if (port) {
zfcp_erp_port_forced_reopen(port, 0, "sctrpi1");
put_device(&port->dev);
+   } else {
+   zfcp_erp_port_forced_no_port_dbf(
+   "sctrpin", adapter,
+   rport->port_name /* zfcp_scsi_rport_register */,
+   rport->port_id /* zfcp_scsi_rport_register */);
}
 }
 
-- 
2.16.3



[PATCH 01/25] zfcp: fix missing SCSI trace for result of eh_host_reset_handler

2018-05-17 Thread Steffen Maier
For problem determination we need to see whether and why we were
successful or not. This allows deduction of scsi_eh escalation.

Example trace record formatted with zfcpdbf from s390-tools:

Timestamp      : ...
Area           : SCSI
Subarea        : 00
Level          : 1
Exception      : -
CPU ID         : ..
Caller         : 0x...
Record ID      : 1
Tag            : schrh_r  SCSI host reset handler result
Request ID     : 0x       none (invalid)
SCSI ID        : 0x       none (invalid)
SCSI LUN       : 0x       none (invalid)
SCSI LUN high  : 0x       none (invalid)
SCSI result    : 0x2002   field re-used for midlayer value: SUCCESS
                          or in other cases: 0x2009 == FAST_IO_FAIL
SCSI retries   : 0xff     none (invalid)
SCSI allowed   : 0xff     none (invalid)
SCSI scribble  : 0x       none (invalid)
SCSI opcode    :          none (invalid)
FCP rsp inf cod: 0xff     none (invalid)
FCP rsp IU     :          none (invalid)

v2.6.35 commit a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from
fc_block_scsi_eh to scsi eh") introduced the first return with something
other than the previously hardcoded single SUCCESS return path.
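
As a rough userspace illustration of the record shown above (a sketch, not the
driver code): struct eh_rec is a hypothetical subset of the trace record, and
INVALID_LUN is assumed to have all 64 bits set like ZFCP_DBF_INVALID_LUN. It
shows how ~0 yields the 0xff "none (invalid)" markers, how the 64-bit LUN is
split into two 32-bit fields, and how the re-used result field decodes to the
midlayer values quoted in the trace.

```c
#include <stdint.h>

/* Values of the SCSI midlayer eh return codes as quoted in the trace above. */
#define EH_SUCCESS	0x2002
#define EH_FAST_IO_FAIL	0x2009

/* Assumed to mirror ZFCP_DBF_INVALID_LUN: all 64 bits set. */
#define INVALID_LUN	0xFFFFFFFFFFFFFFFFull

struct eh_rec {			/* hypothetical subset of the trace record */
	uint32_t scsi_result;
	uint8_t  scsi_retries;
	uint32_t scsi_lun;
	uint32_t scsi_lun_64_hi;
};

static struct eh_rec make_eh_rec(int ret)
{
	struct eh_rec rec;

	rec.scsi_result = (uint32_t)ret;	/* field re-used for ret */
	rec.scsi_retries = (uint8_t)~0;		/* 0xff: none (invalid) */
	rec.scsi_lun = (uint32_t)INVALID_LUN;	/* low 32 bits */
	rec.scsi_lun_64_hi = (uint32_t)(INVALID_LUN >> 32);
	return rec;
}

/* Map the re-used result field back to a name for problem determination. */
static const char *eh_result_name(uint32_t scsi_result)
{
	switch (scsi_result) {
	case EH_SUCCESS:
		return "SUCCESS";
	case EH_FAST_IO_FAIL:
		return "FAST_IO_FAIL";
	default:
		return "other";
	}
}
```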

Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
Fixes: a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to 
scsi eh")
Cc: <sta...@vger.kernel.org> #2.6.38+
Reviewed-by: Jens Remus <jre...@linux.ibm.com>
Reviewed-by: Benjamin Block <bbl...@linux.ibm.com>
---
 drivers/s390/scsi/zfcp_dbf.c  | 40 
 drivers/s390/scsi/zfcp_ext.h  |  2 ++
 drivers/s390/scsi/zfcp_scsi.c | 11 ++-
 3 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index a8b831000b2d..1e5ea5e4992b 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -643,6 +643,46 @@ void zfcp_dbf_scsi(char *tag, int level, struct scsi_cmnd 
*sc,
spin_unlock_irqrestore(&dbf->scsi_lock, flags);
 }
 
+/**
+ * zfcp_dbf_scsi_eh() - Trace event for special cases of scsi_eh callbacks.
+ * @tag: Identifier for event.
+ * @adapter: Pointer to zfcp adapter as context for this event.
+ * @scsi_id: SCSI ID/target to indicate scope of task management function 
(TMF).
+ * @ret: Return value of calling function.
+ *
+ * This SCSI trace variant does not depend on any of:
+ * scsi_cmnd, zfcp_fsf_req, scsi_device.
+ */
+void zfcp_dbf_scsi_eh(char *tag, struct zfcp_adapter *adapter,
+ unsigned int scsi_id, int ret)
+{
+   struct zfcp_dbf *dbf = adapter->dbf;
+   struct zfcp_dbf_scsi *rec = &dbf->scsi_buf;
+   unsigned long flags;
+   static int const level = 1;
+
+   if (unlikely(!debug_level_enabled(adapter->dbf->scsi, level)))
+   return;
+
+   spin_lock_irqsave(&dbf->scsi_lock, flags);
+   memset(rec, 0, sizeof(*rec));
+
+   memcpy(rec->tag, tag, ZFCP_DBF_TAG_LEN);
+   rec->id = ZFCP_DBF_SCSI_CMND;
+   rec->scsi_result = ret; /* re-use field, int is 4 bytes and fits */
+   rec->scsi_retries = ~0;
+   rec->scsi_allowed = ~0;
+   rec->fcp_rsp_info = ~0;
+   rec->scsi_id = scsi_id;
+   rec->scsi_lun = (u32)ZFCP_DBF_INVALID_LUN;
+   rec->scsi_lun_64_hi = (u32)(ZFCP_DBF_INVALID_LUN >> 32);
+   rec->host_scribble = ~0;
+   memset(rec->scsi_opcode, 0xff, ZFCP_DBF_SCSI_OPCODE);
+
+   debug_event(dbf->scsi, level, rec, sizeof(*rec));
+   spin_unlock_irqrestore(&dbf->scsi_lock, flags);
+}
+
 static debug_info_t *zfcp_dbf_reg(const char *name, int size, int rec_size)
 {
struct debug_info *d;
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index bf8ea4df2bb8..e55f42ce1168 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -49,6 +49,8 @@ extern void zfcp_dbf_san_res(char *, struct zfcp_fsf_req *);
 extern void zfcp_dbf_san_in_els(char *, struct zfcp_fsf_req *);
 extern void zfcp_dbf_scsi(char *, int, struct scsi_cmnd *,
  struct zfcp_fsf_req *);
+extern void zfcp_dbf_scsi_eh(char *tag, struct zfcp_adapter *adapter,
+unsigned int scsi_id, int ret);
 
 /* zfcp_erp.c */
 extern void zfcp_erp_set_adapter_status(struct zfcp_adapter *, u32);
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 4d2ba5682493..a62357f5e8b4 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -323,15 +323,16 @@ static int zfcp_scsi_eh_host_reset_handler(struct 
scsi_cmnd *scpnt)
 {
stru

[PATCH 00/25] zfcp: updates for v4.18

2018-05-17 Thread Steffen Maier
James, Martin,

this is the zfcp patch set for the v4.18 merge window.
The patches apply to Martin's 4.18/scsi-queue.
The patches eventually go on top of the bug fix commit
fa89adba1941e4f3b213399b81732a5c12fd9131
("scsi: zfcp: fix infinite iteration on ERP ready list")
in Martin's 4.17/scsi-fixes or James' scsi-fixes
[https://www.spinics.net/lists/linux-scsi/msg120124.html].
There should be no merge conflicts between the fix and this patch set.

Patches 1-7 are debugging/tracing fixes found during function test of 8-14.

Patches 8-14 are the result of an earlier RFC to prepare for changing
scsi_eh callback function arguments to decouple from scsi_cmnd.
[http://www.spinics.net/lists/linux-scsi/msg92.html /
https://marc.info/?l=linux-scsi&m=150099208822680&w=2]
The only difference is that I have to defer the conversion of host_reset().

Patches 15-25 are small cleanups / updates.

Jens Remus (3):
  zfcp: assert that the ERP lock is held when tracing a recovery trigger
  zfcp: add port speed capabilities
  zfcp: enhance comments on fc_link_speed and supported_speed

Steffen Maier (22):
  zfcp: fix missing SCSI trace for result of eh_host_reset_handler
  zfcp: fix missing SCSI trace for retry of abort / scsi_eh TMF
  zfcp: fix misleading REC trigger trace where erp_action setup failed
  zfcp: fix missing REC trigger trace on terminate_rport_io early return
  zfcp: fix missing REC trigger trace on terminate_rport_io for
ERP_FAILED
  zfcp: fix missing REC trigger trace for all objects in ERP_FAILED
  zfcp: fix missing REC trigger trace on enqueue without ERP thread
  zfcp: decouple SCSI traces for scsi_eh / TMF from scsi_cmnd
  zfcp: decouple TMF response handler from scsi_cmnd
  zfcp: split FCP_CMND IU setup between SCSI I/O and TMF again
  zfcp: decouple FSF request setup of TMF from scsi_cmnd
  zfcp: decouple SCSI setup of TMF from scsi_cmnd
  zfcp: decouple TMFs from scsi_cmnd by using fc_block_rport
  zfcp: decouple our scsi_eh callbacks from scsi_cmnd
  workqueue,zfcp: set description for port work items with their WWPN as
context
  zfcp: consistently use function name space prefix
  zfcp: remove unused ERP enum values
  zfcp: zfcp_erp_action_exists() does only check for running
  zfcp: remove unused return values of ERP trigger functions
  zfcp: explicitly support initiator in scsi_host_template
  zfcp: support SCSI_ADAPTER_RESET via scsi_host sysfs attribute
host_reset
  zfcp: cleanup indentation for posting FC events

 drivers/s390/scsi/zfcp_dbf.c   |  90 +++
 drivers/s390/scsi/zfcp_dbf.h   |  21 +++--
 drivers/s390/scsi/zfcp_erp.c   | 194 -
 drivers/s390/scsi/zfcp_ext.h   |  16 +++-
 drivers/s390/scsi/zfcp_fc.c|  11 +--
 drivers/s390/scsi/zfcp_fc.h|  22 +++--
 drivers/s390/scsi/zfcp_fsf.c   |  61 -
 drivers/s390/scsi/zfcp_fsf.h   |   6 +-
 drivers/s390/scsi/zfcp_scsi.c  | 141 +++---
 drivers/s390/scsi/zfcp_sysfs.c |   5 +-
 kernel/workqueue.c |   1 +
 11 files changed, 401 insertions(+), 167 deletions(-)

-- 
2.16.3



Re: [PATCH 0/3] scsi: arcmsr: Add driver parameter cmd_timeout for scsi command timeout setting

2018-05-09 Thread Steffen Maier


On 05/08/2018 08:43 AM, Ching Huang wrote:
> On Tue, 2018-05-08 at 14:32 +0800, Ching Huang wrote:
>> On Tue, 2018-05-08 at 01:41 -0400, Martin K. Petersen wrote:
>>> Hello Ching,
>>>
>>>> 1. Add driver parameter cmd_timeout, default value is ARCMSR_DEFAULT_TIMEOUT.
>>>> 2. Add slave_configure callback function to set device command timeout value.
>>>> 3. Update driver version to v1.40.00.06-20180504.
>>>
>>> I am not so keen on arcmsr overriding the timeout set by the admin or
>>> application.
>>>
>>> Also, instead of introducing this module parameter, why not simply ask
>>> the user to change rq_timeout?
>>
>> This timeout setting only after device has been inquiry successfully.
>> Of course, user can set timeout value to /sys/block/sdX/device/timeout.
>> But user does not like to set this value once command timeout occurred.
>> They rather like timeout never happen.
>
> This timeout setting apply to all devices, its better than user has to
> set one by one for each device.

Udev rules?

Something roughly like the bottom of:
https://www.ibm.com/support/knowledgecenter/ST3FR7_8.1.2/com.ibm.storwize.v7000.812.doc/svc_linux_settings.html
or, better, doing the assignment with udev builtins (fix the syntax error
with model):
https://www.ibm.com/support/knowledgecenter/ST3FR7_8.1.2/com.ibm.storwize.v7000.812.doc/svc_zs_statechange_3fgeri.html

--
Mit freundlichen Grüßen / Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Chairwoman of the Supervisory Board: Martina Koederitz
Management: Dirk Wittkopp
Registered office: Boeblingen
Court of registration: Amtsgericht Stuttgart, HRB 243294



[PATCH] zfcp: fix infinite iteration on ERP ready list

2018-05-03 Thread Steffen Maier
From: Jens Remus <jre...@linux.ibm.com>

zfcp_erp_adapter_reopen() schedules blocking of all of the adapter's
rports via zfcp_scsi_schedule_rports_block() and enqueues a reopen
adapter ERP action via zfcp_erp_action_enqueue(). Both are separately
processed asynchronously and concurrently.

Blocking of rports is done in a kworker by zfcp_scsi_rport_work(). It
calls zfcp_scsi_rport_block(), which then traces a DBF REC "scpdely" via
zfcp_dbf_rec_trig().
zfcp_dbf_rec_trig() acquires the DBF REC spin lock and then iterates with
list_for_each() over the adapter's ERP ready list without holding the ERP
lock. This opens a race window in which the current list entry can be
moved to another list, causing list_for_each() to iterate forever on the
wrong list, as the erp_ready_head is never encountered as terminal
condition.

Meanwhile the ERP action can be processed in the ERP thread by
zfcp_erp_thread(). It calls zfcp_erp_strategy(), which acquires the ERP
lock and then calls zfcp_erp_action_to_running() to move the ERP action
from the ready to the running list.
zfcp_erp_action_to_running() can move the ERP action using list_move()
just during the aforementioned race window. It then traces a REC RUN
"erator1" via zfcp_dbf_rec_run().
zfcp_dbf_rec_run() tries to acquire the DBF REC spin lock. If this is held
by the infinitely looping kworker, it effectively spins forever.

Example Sequence Diagram:

Process                ERP Thread             rport_work
-------------------    -------------------    -------------------
zfcp_erp_adapter_reopen()
zfcp_erp_adapter_block()
zfcp_scsi_schedule_rports_block()
lock ERP                                      zfcp_scsi_rport_work()
zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_ADAPTER)
list_add_tail() on ready                      !(rport_task==RPORT_ADD)
wake_up() ERP thread                          zfcp_scsi_rport_block()
zfcp_dbf_rec_trig()    zfcp_erp_strategy()    zfcp_dbf_rec_trig()
unlock ERP                                    lock DBF REC
zfcp_erp_wait()        lock ERP
|                      zfcp_erp_action_to_running()
|                                             list_for_each() ready
|                      list_move()              current entry
|                        ready to running
|                      zfcp_dbf_rec_run()       endless loop over running
|                      zfcp_dbf_rec_run_lvl()
|                      lock DBF REC spins forever

Any adapter recovery can trigger this, such as setting the device offline
or reboot.
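
The effect of the race can be reproduced deterministically in a small
userspace sketch. This is not the zfcp code: struct node and the list helpers
are simplified stand-ins for the kernel's struct list_head API, and the step
limit exists only so the demonstration terminates. The iterator starts on the
ready list; once its current entry has been moved to the running list, the
walk continues on that list and never meets the ready head again.

```c
#include <stddef.h>

struct node {
	struct node *next, *prev;
};

static void list_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert 'e' right after 'head'. */
static void list_add_entry(struct node *e, struct node *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

/* Same idea as the kernel's list_move(): unlink, then add to another list. */
static void list_move_entry(struct node *e, struct node *head)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	list_add_entry(e, head);
}

/*
 * Iterate over 'ready' like list_for_each() would, but give up after
 * 'max_steps'. If 'victim' is non-NULL, it is moved to 'running' while the
 * iterator is positioned on it, which models the unlocked race.
 * Returns 1 if the iteration met the 'ready' head again, 0 if it did not.
 */
static int iterate_ready(struct node *ready, struct node *running,
			 struct node *victim, int max_steps)
{
	struct node *pos;
	int steps = 0;

	for (pos = ready->next; pos != ready; pos = pos->next) {
		if (pos == victim) {
			list_move_entry(pos, running);
			victim = NULL;	/* move it only once */
		}
		if (++steps >= max_steps)
			return 0;	/* would have looped forever */
	}
	return 1;
}
```

Holding the lock that protects both lists for the whole iteration closes the
window, because the move can then no longer happen in the middle of the walk.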

v4.9 commit 4eeaa4f3f1d6 ("zfcp: close window with unblocked rport during
rport gone") introduced additional tracing of (un)blocking of rports. It
missed that the adapter->erp_lock must be held when calling
zfcp_dbf_rec_trig().

This fix uses the approach formerly introduced by commit aa0fec62391c
("[SCSI] zfcp: Fix sparse warning by providing new entry in dbf") that got
later removed by commit ae0904f60fab ("[SCSI] zfcp: Redesign of the debug
tracing for recovery actions.").

Introduce zfcp_dbf_rec_trig_lock(), a wrapper for zfcp_dbf_rec_trig() that
acquires and releases the adapter->erp_lock for read.

Reported-by: Sebastian Ott <seb...@linux.ibm.com>
Signed-off-by: Jens Remus <jre...@linux.ibm.com>
Fixes: 4eeaa4f3f1d6 ("zfcp: close window with unblocked rport during rport 
gone")
Cc: <sta...@vger.kernel.org> # 2.6.32+
Reviewed-by: Benjamin Block <bbl...@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
---

James, Martin,

this is an important zfcp regression fix.
It would be nice if it could make it into 4.17-rcX.
The patch applies to James' fixes branch or Martin's 4.17/scsi-fixes branch.

Regards,
Steffen

 drivers/s390/scsi/zfcp_dbf.c  | 23 ++-
 drivers/s390/scsi/zfcp_ext.h  |  5 -
 drivers/s390/scsi/zfcp_scsi.c | 14 +++---
 3 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index a8b831000b2d..18c4f933e8b9 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -4,7 +4,7 @@
  *
  * Debug traces for zfcp.
  *
- * Copyright IBM Corp. 2002, 2017
+ * Copyright IBM Corp. 2002, 2018
  */
 
 #define KMSG_COMPONENT "zfcp"
@@ -308,6 +308,27 @@ void zfcp_dbf_rec_trig(char *tag, struct zfcp_adapter 
*adapter,
spin_unlock_irqrestore(&dbf->rec_lock, flags);
 }
 
+/**
+ * zfcp_dbf_rec_trig_lock - trace event related to triggered recovery with lock
+ * @tag: identifier for event
+ * @adapter: adapter on which the erp_action should run
+ * @port: remote port involved in the erp_action
+ * @sdev: scsi device involved in the erp_action
+ * @want: wanted erp_action
+ * @need: required erp_action
+ *
+ * The adapter->erp_lock must not be held.
+ */
+void zfcp_dbf_rec_trig_lock(char *tag, struct zfcp_adapter *adapter,
+   struct z

Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)

2018-04-24 Thread Steffen Maier


On 11/04/2016 05:35 PM, Martin K. Petersen wrote:
> "Hannes" == Hannes Reinecke <h...@suse.de> writes:
>
> Hannes> Checking with SAT-3 (section 6.2.4: Commands the SATL queues
> Hannes> internally) the implemented behaviour is standards conformant,
> Hannes> although the standard also allows for returning 'TASK SET FULL'
> Hannes> or 'BUSY' in these cases.  Doing so would nicely solve this
> Hannes> issue.
>
> I agree with Hannes that it would be appropriate for the SATL to report
> busy when it makes a non-queued command queueable.

Wouldn't this potentially still cause problems if the secure erase takes
longer than max_retries * scmd_tmo, i.e. if the command times out, by
default after 180 seconds, as in
https://www.spinics.net/lists/linux-block/msg24837.html ?


The fix approach here seems to also handle this gracefully.
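
A tiny sketch of the arithmetic behind the max_retries * scmd_tmo concern
above. Both helpers are hypothetical. The combination of 6 attempts with a
30 second timeout used in the comment is only an assumption that happens to
give 180 seconds; the thread itself does not state how that figure is
composed.

```c
/*
 * Upper bound on how long one command may be outstanding before the
 * midlayer gives up: every attempt may run into the full timeout.
 * Example (assumed values): 6 attempts * 30 s = 180 s.
 */
static unsigned int cmd_time_budget_s(unsigned int attempts,
				      unsigned int timeout_s)
{
	return attempts * timeout_s;
}

/* Returns 1 if an operation of 'duration_s' cannot finish within the budget. */
static int exceeds_budget(unsigned int duration_s, unsigned int attempts,
			  unsigned int timeout_s)
{
	return duration_s > cmd_time_budget_s(attempts, timeout_s);
}
```

With such a budget, a secure erase running for an hour would still be
reported as failed even if the device keeps answering busy in the meantime.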

--
Mit freundlichen Grüßen / Kind regards
Steffen Maier




Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events

2018-04-23 Thread Steffen Maier


On 04/17/2018 12:00 PM, Bean Huo (beanhuo) wrote:


#Cat trace
iozone-4055  [000]    665.039276: block_unplug: [iozone] 1 Sync
iozone-4055  [000] ...1   665.039278: block_rq_insert: 8,48 WS 0 () 39604352 + 
128 tag=18 [iozone]
iozone-4055  [000] ...1   665.039280: block_rq_issue: 8,48 WS 0 () 39604352 + 
128 tag=18 [iozone]
iozone-4055  [000] ...1   665.039284: scsi_dispatch_cmd_start: host_no=0 
channel=0 id=0 lun=3 data_sgl=16 prot_sgl=0 prot_op=SCSI_PROT_NORMAL tag=18 
cmnd=(WRITE_10 lba=4950544 txlen=16 protect=0 raw=2a 00 00 4b 8a 10 00 00 10 00)
iozone-4056  [002]    665.039284: block_dirty_buffer: 8,62 sector=44375 
size=4096
-0 [000] d.h2   665.039319: scsi_dispatch_cmd_done: host_no=0 
channel=0 id=0 lun=3 data_sgl=16 prot_sgl=0 prot_op=SCSI_PROT_NORMAL tag=24 
cmnd=(WRITE_10 lba=4944016 txlen=16 protect=0 raw=2a 00 00 4b 70 90 00 00 10 00) 
result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)
-0 [000] d.h3   665.039321: block_rq_complete: 8,48 WS () 39552128 + 
128 tag=24 [0]



iozone-4058  [003]    665.039362: block_bio_remap: 8,48 WS 39568768 + 128 
<- (8,62) 337280
iozone-4058  [003]    665.039364: block_bio_queue: 8,48 WS 39568768 + 128 
[iozone]
iozone-4058  [003] ...1   665.039366: block_getrq: 8,48 WS 39568768 + 128 
[iozone]


I'm not familiar with block/scsi command tagging.

Some block events now would get a tag field.
Some block events would not get a tag field (maybe because for some the 
tag is not (yet) known).


So all block events that belong to the same request would still need to 
be correlated by something like (devt, RWBS, LBA, length) because not 
all have a tag field.
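
For illustration, such a correlation key could look roughly like the sketch
below. struct blk_key and blk_key_equal() are hypothetical and not part of any
tracing tool; the fields simply follow the (devt, RWBS, LBA, length) tuple
mentioned above, as printed in the block trace lines.

```c
#include <string.h>

struct blk_key {			/* hypothetical correlation key */
	unsigned int major, minor;	/* devt, e.g. 8,48 */
	char rwbs[8];			/* e.g. "WS" */
	unsigned long long sector;	/* start sector as printed in the trace */
	unsigned int nr_sectors;	/* length */
};

/* Two block events belong to the same request if all key fields match. */
static int blk_key_equal(const struct blk_key *a, const struct blk_key *b)
{
	return a->major == b->major && a->minor == b->minor &&
	       strcmp(a->rwbs, b->rwbs) == 0 &&
	       a->sector == b->sector && a->nr_sectors == b->nr_sectors;
}
```

Such a key is only unique as long as no second request for the same range is
in flight on the same device, which is one more reason a tag field could help.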



Especially, the ftrace log with tag information, I can easily figure out one 
I/O request between block layer and SCSI.


Provided this is done correctly, I would be in favor of a solution.
Since
v4.11 commit 48b77ad60844 ("block: cleanup tracing")
v4.11 commit 82ed4db499b8 ("block: split scsi_request out of struct 
request")
we don't have the SCSI CDB in block traces (nor in blktrace traditional 
relayfs trace format, nor in ftrace 'blk' tracer binary synthesized 
output [1]) any more (empty Packet Command payload).
Being able to correlate block events with scsi events would indeed be 
very helpful for some cases.


Is a correlation between block and scsi only necessary for these pairs?:

block_rq_issue causes scsi_dispatch_cmd_start, and
scsi_dispatch_cmd_done causes block_rq_complete.

If so, only those two block trace events would need to get a new field?


[1] v2.6.30 commit 08a06b83ff8b ("blkftrace: binary tracing, 
synthesizing old format")
v2.6.31 commit f3948f8857ef ("blktrace: fix context-info when 
mixed-using blk tracer and trace events")


--
Mit freundlichen Grüßen / Kind regards
Steffen Maier




Re: [RESEND PATCH v1 2/2] trace: events: block: Add tag in block trace events

2018-04-23 Thread Steffen Maier


On 04/16/2018 04:33 PM, Bean Huo (beanhuo) wrote:

Print the request tag along with other information in block trace events
when tracing request , and unplug type (Sync / Async).

Signed-off-by: Bean Huo <bean...@micron.com>
---
  include/trace/events/block.h | 36 +---
  1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 81b43f5..f8c0b9e 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h



@@ -478,15 +486,18 @@ DECLARE_EVENT_CLASS(block_unplug,

TP_STRUCT__entry(
__field( int,   nr_rq   )
+   __field( bool,  explicit)
__array( char,  comm,   TASK_COMM_LEN   )
),

TP_fast_assign(
__entry->nr_rq = depth;
+   __entry->explicit = explicit;
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),

-   TP_printk("[%s] %d", __entry->comm, __entry->nr_rq)
+   TP_printk("[%s] %d %s", __entry->comm, __entry->nr_rq,
+  __entry->explicit ? "Sync" : "Async")
  );

  /**


This entire hunk does not seem related to this patch description.
Also, I'm not sure trace-cmd and perf et al. could format it accordingly.
See also my patch for this same functionality:
https://www.spinics.net/lists/linux-block/msg24691.html
("[PATCH v2 1/2] tracing/events: block: track and print if unplug was 
explicit or schedule")




--
Mit freundlichen Grüßen / Kind regards
Steffen Maier




Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status

2018-04-23 Thread Steffen Maier

On 04/19/2018 10:18 PM, Omar Sandoval wrote:
> On Thu, Apr 19, 2018 at 01:44:41PM -0600, Jens Axboe wrote:
>> On 4/19/18 1:41 PM, Bart Van Assche wrote:
>>> On Thu, 2018-04-19 at 12:13 -0700, Omar Sandoval wrote:
>>>> On Thu, Apr 19, 2018 at 11:53:30AM -0700, Omar Sandoval wrote:
>>>>> Thanks for the test! Applied.
>>>>
>>>> Side note, it's unfortunate that this test takes 180 seconds to run only
>>>> because we have to wait for the command timeout. We should be able to
>>>> export request_queue->rq_timeout writeable in sysfs. Would you be
>>>> interested in doing that?
>>>
>>> Hello Omar,
>>>
>>> Is this perhaps what you are looking for?
>>> # ls -l /sys/class/scsi_device/*/*/timeout
>>> -rw-r--r-- 1 root root 4096 Apr 19 08:52 
>>> /sys/class/scsi_device/2:0:0:0/device/timeout
>>> -rw-r--r-- 1 root root 4096 Apr 19 12:39 
>>> /sys/class/scsi_device/8:0:0:1/device/timeout
>>
>> We should have it generically available though, not just for SCSI. In
>> retrospect, it should have been under queue/ from the start, now we'll
>> end up with duplicate entries for SCSI.
> 
> For the sake of this test, I just decreased the timeout through SCSI.

Great idea.

>   echo 5 > "/sys/block/${SCSI_DEBUG_DEVICES[0]}/device/timeout"

However, the timeout should be sufficiently larger than scsi_debug/delay,
in order not to run into the command timeout.
It may be unfortunate that scsi_debug/delay uses jiffies as unit and
can thus differ in a range of an order of magnitude for different kernel 
configs.

>   # delay to reduce response repetition: around 1..10sec depending on HZ
>   echo 1000 > /sys/bus/pseudo/drivers/scsi_debug/delay

On s390, we typically have HZ=100, so 1000 jiffies are 10 seconds.
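
A small sketch of that arithmetic, assuming nothing beyond the relation
seconds = jiffies / HZ. Both helper names are made up for this example. It
shows why a 5 second command timeout is too short for a delay of 1000 jiffies
at HZ=100, while it leaves enough room for a delay of 100 jiffies.

```c
/* Convert a delay given in jiffies to whole seconds for a given HZ. */
static unsigned int jiffies_to_s(unsigned int jiffies, unsigned int hz)
{
	return jiffies / hz;
}

/* Returns 1 if the command timeout leaves room for the injected delay. */
static int timeout_is_sufficient(unsigned int timeout_s,
				 unsigned int delay_jiffies, unsigned int hz)
{
	return timeout_s > jiffies_to_s(delay_jiffies, hz);
}
```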

We can increase the sdev cmd timeout or decrease the scsi_debug/delay.
100 instead of 1000 for scsi_debug/delay worked for me;
but for some reason the loop checking for busy did not work (any more?)
causing an unexpected test case error:

> # ./check scsi/004
> scsi/004 (ensure repeated TASK SET FULL results in EIO on timing out command) 
> [failed]
> runtime  31.892s  ...  31.720s
> --- tests/scsi/004.out 2018-04-16 11:47:19.105931872 +0200
> +++ results/nodev/scsi/004.out.bad 2018-04-23 14:07:33.615445253 +0200
> @@ -1,3 +1,3 @@
>  Running scsi/004
> -Input/output error
> +modprobe: FATAL: Module scsi_debug is in use.
>  Test complete

so I added another sleep hack:

 # dd closing SCSI disk causes implicit TUR also being delayed once
+# sleep over time window where READ was done and TUR not yet queued
+sleep 2
 while grep -q -F "in_use_bm BUSY:" 
"/proc/scsi/scsi_debug/${SCSI_DEBUG_HOSTS[0]}"; do

What do you think?

-- 
Mit freundlichen Grüßen / Kind regards
Steffen Maier




[PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status

2018-04-17 Thread Steffen Maier
Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
---
 tests/scsi/004 |   59 
 tests/scsi/004.out |3 ++
 2 files changed, 62 insertions(+), 0 deletions(-)
 create mode 100755 tests/scsi/004
 create mode 100644 tests/scsi/004.out

diff --git a/tests/scsi/004 b/tests/scsi/004
new file mode 100755
index 000..4852efc
--- /dev/null
+++ b/tests/scsi/004
@@ -0,0 +1,59 @@
+#!/bin/bash
+#
+# Ensure repeated SAM_STAT_TASK_SET_FULL results in EIO on timing out command.
+#
+# Regression test for commit cbe095e2b584 ("Revert "scsi: core: return
+# BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()"")
+#
+# Found independently of corresponding commit mail threads while
+# experimenting with storage mirroring. This test is a storage-independent
+# reproducer for the error case I ran into.
+#
+# Copyright IBM Corp. 2018
+# Author: Steffen Maier <ma...@linux.ibm.com>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. common/scsi_debug
+
+DESCRIPTION="ensure repeated TASK SET FULL results in EIO on timing out 
command"
+
+requires() {
+   _have_scsi_debug
+}
+
+test() {
+   echo "Running ${TEST_NAME}"
+
+   if ! _init_scsi_debug add_host=1 max_luns=1 statistics=1 every_nth=1; 
then
+   return 1
+   fi
+   # every_nth RW with full queue gets SAM_STAT_TASK_SET_FULL
+   echo 0x800 > /sys/bus/pseudo/drivers/scsi_debug/opts
+   # delay to reduce response repetition: around 1..10sec depending on HZ
+   echo 1000 > /sys/bus/pseudo/drivers/scsi_debug/delay
+   # a single command fills device queue to satisfy 0x800 opts condition
+   echo 1 > "/sys/block/${SCSI_DEBUG_DEVICES[0]}/device/queue_depth"
+   dd if="/dev/${SCSI_DEBUG_DEVICES[0]}" iflag=direct of=/dev/null bs=512 
count=1 |& grep -o "Input/output error"
+   # stop injection
+   echo 0 > /sys/bus/pseudo/drivers/scsi_debug/opts
+   # dd closing SCSI disk causes implicit TUR also being delayed once
+   while grep -q -F "in_use_bm BUSY:" 
"/proc/scsi/scsi_debug/${SCSI_DEBUG_HOSTS[0]}"; do
+   sleep 1
+   done
+   echo 1 > /sys/bus/pseudo/drivers/scsi_debug/delay
+   _exit_scsi_debug
+
+   echo "Test complete"
+}
diff --git a/tests/scsi/004.out b/tests/scsi/004.out
new file mode 100644
index 000..b1126fb
--- /dev/null
+++ b/tests/scsi/004.out
@@ -0,0 +1,3 @@
+Running scsi/004
+Input/output error
+Test complete
-- 
1.7.1



Re: dmesg flooded with "Very big device. Trying to use READ CAPACITY(16)"

2018-03-08 Thread Steffen Maier


On 03/08/2018 12:07 PM, Menion wrote:
> Unfortunately the Ubuntu kernel is not configured for ftrace or
> kprobe, and I am operating this server so I am not sure if I will
> eventually find the time and the risk to install a self-compiled
> kernel


systemtap?



Re: dmesg flooded with "Very big device. Trying to use READ CAPACITY(16)"

2018-03-08 Thread Steffen Maier


On 03/08/2018 11:34 AM, Menion wrote:
> I did some more test
> This log is specific from the function sd_read_capacity
> From what I can see, it seems that it is called only when probing
> newly attached devices
> A quick look in the code I see that it is called by sd_revalidate_disk
> This function is registered by fops for the scsi device or called
> directly by sd_probe (via sd_probe_async)
> So, assuming that there is no disconnection at USB level (and it is
> not since I don't get any log of it), the question is: how can trigger
> a probe or call the sd_revalidate_disk?
> Can it be the filesystem?


echo 1 > /sys/class/scsi_device/.../device/rescan
?

That's what I meant with "sdev _rescan_" in my previous mail.

Not sure what call paths lead to sd_revalidate_disk().


2018-03-08 11:10 GMT+01:00 Menion <men...@gmail.com>:

Anyhow, I checked something that I should have checked from the beginning.
I have stopped smartd and I still get this log, so it is something
else doing it. Does anyone have an idea how to find out which
subsystem keeps calling read_capacity_10 again and again?


ftrace: kernel function trace
[https://lwn.net/Articles/365835/, https://lwn.net/Articles/366796/]
or dynamically attach a kprobe
[https://www.kernel.org/doc/Documentation/trace/kprobetrace.txt]
to see which process calls this (indirectly)


2018-03-08 10:16 GMT+01:00 Menion <men...@gmail.com>:

I have tried it, but it does not work:

[   39.230095] sd 0:0:0:0: [sda] Very big device. Trying to use READ
CAPACITY(16).



[  348.134002] sd 0:0:0:0: [sda] Very big device. Trying to use READ
CAPACITY(16).



[  657.963478] sd 0:0:0:0: [sda] Very big device. Trying to use READ
CAPACITY(16).



2018-03-07 18:14 GMT+01:00 Douglas Gilbert <dgilb...@interlog.com>:

On 2018-03-07 09:02 AM, Menion wrote:

2018-03-07 14:51 GMT+01:00 Steffen Maier <ma...@linux.vnet.ibm.com>:

On 03/07/2018 09:24 AM, Menion wrote:



but from then on, you only get it roughly once every 300 seconds, i.e. 5
minutes

that's where I suspect user space as trigger, unless there is a kernel
feature I'm not aware of doing such sdev rescans

preventing this would be a workaround



Is it possible that it is smartd? It is the only daemon that could do
some low level access to the device (bypassing the filesystem)



   https://github.com/mirror/smartmontools

To check it is the revision (svn rev >= 4718) you need for this fix, look
at the top of the ChangeLog file and look for today's date (20180307).



Currently smartmontools only has a quirks database (and it is large)
for ATA devices, not real or pseudo SCSI device, nor NVMe devices (yet).
Hopefully this fix will be sufficient.

If it does not work, please send me the details.



  /*
   * Many devices do not respond properly to READ_CAPACITY_16.
   * Tell the SCSI layer to try READ_CAPACITY_10 first.
   * However some USB 3.0 drive enclosures return capacity
   * modulo 2TB. Those must use READ_CAPACITY_16
   */
  if (!(us->fflags & US_FL_NEEDS_CAP16))
          sdev->try_rc_10_first = 1;


if that's the cause, maybe an entry in drivers/usb/storage/unusual_devs.h
would help, but that's really just guessing as I'm not familiar with USB


It seems that the bridge does have an entry in unusual_devs.h:

/* Reported by Michael Büsch <m...@bues.ch> */
UNUSUAL_DEV( 0x152d, 0x0567, 0x0114, 0x0116,
"JMicron",
"USB to ATA/ATAPI Bridge",
USB_SC_DEVICE, USB_PR_DEVICE, NULL,
US_FL_BROKEN_FUA ),

VID:PID is 0x152d 0x0567, not sure what are the other two numbers, so
I went back and used another enclosure with same USB to SATA bridge.
The strange thing is that this other enclosure goes in UAS mode while
the one for which I am reporting the issue goes in usb-storage mode
because it gets somehow the quirks 0x5000
Unfortunately I cannot move these 5 HDDs in the other enclosure. So do
you think that it shall be reported to linux-usb maybe?


--
Mit freundlichen Grüßen / Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: dmesg flooded with "Very big device. Trying to use READ CAPACITY(16)"

2018-03-07 Thread Steffen Maier


On 03/07/2018 09:24 AM, Menion wrote:

By flooded I mean that it continuously fills the dmesg log with no
interruption; see the attached log that I have just taken from my
server.
Some more details on my setup. I have these 5 HDD, WD RED 8TB in an
Orico 5 bay enclosure, running JMS567 USBtoSATA bridge and an internal
SATA multiplexer
This is connected to the USB 3.0 host port of my server, it is an Intel Atom



2018-03-07 3:45 GMT+01:00 Martin K. Petersen <martin.peter...@oracle.com>:

Also, what kind of controller are these disks attached to? The reason
you see these messages is that to the kernel it looks like a legacy disk
device that predates capacities in the TB range. The warnings are logged
because we're surprised to be going down this path based on what the
device has previously told us.


Of course Martin's statement regarding the occurrence holds true.

It does not look like continuously flooding, but rather like a 
repetition at some not even high frequency. Do you have some user space 
periodically performing SCSI target or SCSI device rescans?

Each repetition is per drive, i.e. a chunk of 5 messages in your case.


[4.929517] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).


first occurrence after initial probing


[4.933893] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
[4.946474] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).


looks like we go through the code path more than once during initial probing


[   99.057592] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
[  409.335119] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
[  719.760106] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
[ 1018.089562] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
[ 1328.086120] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).


...

but from then on, you only get it roughly once every 300 seconds, i.e. 5 
minutes
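
The spacing can be checked directly from the five timestamps quoted in this mail. A minimal sketch in C, with the timestamps converted to microseconds; the array and the helper name are made up for illustration and are not kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* dmesg timestamps of the periodic sda messages, in microseconds
 * (copied from the log excerpt in this mail). */
static const uint64_t sketch_ts_us[] = {
	99057592, 409335119, 719760106, 1018089562, 1328086120,
};

/* Mean distance between consecutive timestamps, in microseconds.
 * Hypothetical helper for this sketch only. */
static uint64_t sketch_mean_interval_us(const uint64_t *ts, size_t n)
{
	assert(ts != NULL && n >= 2);
	/* The sum of all consecutive deltas telescopes to last - first. */
	return (ts[n - 1] - ts[0]) / (n - 1);
}
```

For the five timestamps this gives a mean of about 307 seconds, with individual gaps between roughly 298 and 310 seconds, which would fit a 5 minute user space timer plus some latency.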


that's where I suspect user space as trigger, unless there is a kernel 
feature I'm not aware of doing such sdev rescans


preventing this would be a workaround

assuming the Linux check is correct, the proper fix might be that the 
device should present itself according to standards such that Linux 
silently uses READ CAPACITY(16) in the first place



static int sd_try_rc16_first(struct scsi_device *sdp)
{
if (sdp->host->max_cmd_len < 16)
return 0;


option


if (sdp->try_rc_10_first)
return 0;


option


if (sdp->scsi_level > SCSI_SPC_2)
return 1;
if (scsi_device_protection(sdp))
return 1;
return 0;


option


}


just picking one arbitrary option and not being entirely sure that's the 
code path but you mentioned USB to SATA bridge, it might be related to:



*** drivers/usb/storage/scsiglue.c:
slave_configure[239]   sdev->try_rc_10_first = 1;



/*
 * Many devices do not respond properly to READ_CAPACITY_16.
 * Tell the SCSI layer to try READ_CAPACITY_10 first.
 * However some USB 3.0 drive enclosures return capacity
 * modulo 2TB. Those must use READ_CAPACITY_16
 */
if (!(us->fflags & US_FL_NEEDS_CAP16))
	sdev->try_rc_10_first = 1;


if that's the cause, maybe an entry in 
drivers/usb/storage/unusual_devs.h would help, but that's really just 
guessing as I'm not familiar with USB
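
To make the suspected interaction concrete, here is a small userspace model of the two quoted snippets. It is only a sketch under assumptions: the struct is a reduced stand-in for struct scsi_device, the enum values are placeholders where only the ordering matters, and all sketch_* names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder ordering of SCSI levels; the numeric values are made up
 * for this sketch, only their relative order is used. */
enum sketch_scsi_level {
	SKETCH_SCSI_2 = 1,
	SKETCH_SCSI_SPC_2 = 2,
	SKETCH_SCSI_SPC_3 = 3,
};

/* Reduced stand-in for the fields of struct scsi_device used here. */
struct sketch_sdev {
	unsigned int host_max_cmd_len;     /* models sdp->host->max_cmd_len */
	bool try_rc_10_first;              /* models sdp->try_rc_10_first */
	enum sketch_scsi_level scsi_level; /* models sdp->scsi_level */
	bool protection;                   /* models scsi_device_protection() */
};

/* Models the quoted usb-storage snippet: without US_FL_NEEDS_CAP16,
 * READ CAPACITY(10) is requested first. */
static void sketch_usb_slave_configure(struct sketch_sdev *sdev, bool needs_cap16)
{
	assert(sdev != NULL);
	if (!needs_cap16)
		sdev->try_rc_10_first = true;
}

/* Same decision order as the quoted sd_try_rc16_first(). */
static bool sketch_try_rc16_first(const struct sketch_sdev *sdp)
{
	assert(sdp != NULL);
	if (sdp->host_max_cmd_len < 16)
		return false;
	if (sdp->try_rc_10_first)
		return false;
	if (sdp->scsi_level > SKETCH_SCSI_SPC_2)
		return true;
	if (sdp->protection)
		return true;
	return false;
}
```

In this model, a device that would otherwise get READ CAPACITY(16) first falls back to READ CAPACITY(10) first as soon as the usb-storage path sets try_rc_10_first, unless the NEEDS_CAP16 quirk is present for it.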





Re: [PATCH 2/3] virtio-scsi: Add FC transport class

2018-02-02 Thread Steffen Maier


On 02/02/2018 05:00 PM, Hannes Reinecke wrote:

On 01/26/2018 05:54 PM, Steffen Maier wrote:

On 12/18/2017 09:31 AM, Hannes Reinecke wrote:

On 12/15/2017 07:08 PM, Steffen Maier wrote:

On 12/14/2017 11:11 AM, Hannes Reinecke wrote:



To me, this raises the question which properties of the host's FC
(driver core) objects should be mirrored to the guest. Ideally all (and
that's a lot).
This in turn makes me wonder if mirroring is really desirable (e.g.
considering the effort) or if only the guest should have its own FC
object hierarchy which does _not_ exist on the KVM host in case an
fc_host is passed through with virtio-(v)fc.



A few more thoughts on your presentation [1]:

"Devices on the vport will not be visible on the host"
I could not agree more to the design point that devices (or at least
their descendant object subtree) passed through to a guest should not
appear on the host!
With virtio-blk or virtio-scsi, we have SCSI devices and thus disks
visible in the host, which needlessly scans partitions, or even worse
automatically scans for LVM and maybe even activates PVs/VGs/LVs. It's
hard for a KVM host admin to suppress this (and not break the devices
the host needs itself).
If we mirror the host's scsi_transport_fc tree including fc_rports and
thus SCSI devices etc., we would have the same problems?
Even more so, dev_loss_tmo and fast_io_fail_tmo would run independently
on the host and in the guest on the same mirrored scsi_transport_fc
object tree. I can envision user confusion having configured timeouts on
the "wrong" side (host vs. guest). Also we would still need a mechanism
to mirror fc_rport (un)block from host to guest for proper transport
recovery. In zfcp we try to recover on transport rather than scsi_eh
whenever possible because it is so much smoother.


A similar thing can be achieved even today, by setting the
'no_uld_attach' parameter when scanning the scsi device
(that's what some RAID HBAs do).
However, there currently is no way of modifying it from user-space, and
certainly not to change the behaviour for existing devices.
It should be relatively simple to set this flag whenever the host is
exposed to a VM; we would still see the scsi devices, but the 'sd'
driver won't be attached so nothing will scan the device on the host.


Ah, nice, didn't know that. It would solve the undesired I/O problem in 
the host.
But it would not solve the so far somewhat unsynchronized state 
transitions of fc_rports on the host and their mirrors in the guest?


I would be very interested in how you intend to do transport recovery.


"Towards virtio-fc?"
Using the FCP_CMND_IU (instead of just a plain SCB as with virtio-scsi)
sounds promising to me as starting point.
A listener from the audience asked if you would also do ELS/CT in the
guest and you replied that this would not be good. Why is that?
Based on above starting point, doing ELS/CT (and basic aborts and maybe
a few other functions such as open/close ports or metadata transfer
commands) in the guest is exactly what I would have expected. An HBA
LLDD on the KVM host would implement such API and for all fc_hosts,
passed through this way, it would *not* establish any scsi_transport_fc
tree on the host. Instead the one virtio-vfc implementation in the guest
would do this independently of which HBA LLDD provides the passed through
fc_host in the KVM host.
ELS/CT pass through is maybe even for free via FC_BSG for those LLDDs
that already implement it.
Rport open/close is just the analogue of slave_alloc()/slave_destroy().


I'm not convinced that moving to full virtio-fc is something we want or
even can do.
Neither qla2xxx nor lpfc allow for direct FC frame access; so one would
need to reformat the FC frames into something the driver understands,
just so that the hardware can transform it back into FC frames.


I was thinking of a higher-level para-virtualized FCP HBA interface than
FC frames (which did exist in kernel v2.4 under drivers/fc4/ but, as it
seems, no longer do). That is, just like large parts of today's FCP LLDDs,
which handle scatter-gather lists while the framing is done by the hardware.



Another thing is xid management; some drivers have to do their own xid
management, based on hardware capabilities etc.
So the FC frames would need to re-write the xids, making it hard if not
impossible to match things up when the response comes in.


For such things, where the hardware exposes more details (than, say, 
zfcp sees) I thought the LLDD on the KVM host would handle such details 
internally and only expose the higher level interface to virtio-fc.


Maybe something roughly like the basic transport protocol part of 
ibmvfc/ibmvscsi (not the other end in the firmware and not the cross 
partition DMA part), if I understood its overall design correctly by 
quickly looking at the code.
I somewhat had the impression that zfcp isn't too far from the overall 
operations style. As seem qla2xxx or lpfc to me, they just see and need 
to

Re: [PATCH 06/13] lpfc: Add 64G link speed support

2018-01-29 Thread Steffen Maier

On 01/26/2018 08:31 PM, James Smart wrote:

The G7 adapter supports 64G link speeds. Add support to the driver.

In addition, a small cleanup to replace the odd bitmap logic with
a switch case.

Signed-off-by: Dick Kennedy <dick.kenn...@broadcom.com>
Signed-off-by: James Smart <james.sm...@broadcom.com>
---



  9 files changed, 92 insertions(+), 30 deletions(-)



diff --git a/drivers/scsi/lpfc/lpfc_hw.h b/drivers/scsi/lpfc/lpfc_hw.h
index d07d2fcbea34..b91b429fe2df 100644
--- a/drivers/scsi/lpfc/lpfc_hw.h
+++ b/drivers/scsi/lpfc/lpfc_hw.h



@@ -2966,6 +2975,9 @@ struct lpfc_mbx_read_top {
  #define LPFC_LINK_SPEED_10GHZ 0x40
  #define LPFC_LINK_SPEED_16GHZ 0x80
  #define LPFC_LINK_SPEED_32GHZ 0x90
+#define LPFC_LINK_SPEED_64GHZ  0xA0
+#define LPFC_LINK_SPEED_128GHZ 0xB0
+#define LPFC_LINK_SPEED_2568GHZ0xC0

  ^
typo? 2568 => 256

The new 128 and 256 definitions do not seem to be used in this patch set?




Re: [PATCH 3/3] virtio_scsi: Implement 'native LUN' feature

2018-01-26 Thread Steffen Maier

On 01/26/2018 03:15 PM, Steffen Maier wrote:

On 12/18/2017 08:48 AM, Hannes Reinecke wrote:

On 12/15/2017 07:17 PM, Steffen Maier wrote:

On 12/14/2017 11:11 AM, Hannes Reinecke wrote:



@@ -524,10 +532,16 @@ static void virtio_scsi_init_hdr(struct
virtio_device *vdev,
    int target_id,
    struct scsi_cmnd *sc)
   {
-    cmd->lun[0] = 1;
-    cmd->lun[1] = target_id;
-    cmd->lun[2] = (sc->device->lun >> 8) | 0x40;
-    cmd->lun[3] = sc->device->lun & 0xff;
+    if (virtio_has_feature(vdev, VIRTIO_SCSI_F_NATIVE_LUN)) {
+    u64 lun = sc->device->lun << 16;
+    lun |= ((u64)1 << 8) | (u64)target_id;
+    int_to_scsilun(lun, (struct scsi_lun *)&cmd->lun);
+    } else {
+    cmd->lun[0] = 1;
+    cmd->lun[1] = target_id;
+    cmd->lun[2] = (sc->device->lun >> 8) | 0x40;
+    cmd->lun[3] = sc->device->lun & 0xff;
+    }


Above 2 patterns seem to repeat. Have helper functions (similar to
int_to_scsilun()) now that it's more than just 4 lines of filling in the
virtio lun?


Yes, can do.


Meanwhile I think I realized why I had trouble understanding what the 
code does. I guess, I expected a conversion with int_to_scsilun() first, 
and then we would fill in the virtio-specific parts of magic-one and 
target-ID.

You do it just the other way round, which is OK.

Say we have a 4 level 64-bit LUN and represent it in hex using
placeholder hexdigits for the 4 levels like this:

0xL1L1L2L2L3L3L4L4

Its integer SCSI LUN representation (in hex) is:

0xL4L4L3L3L2L2L1L1

Then you shift left by 16 bits (2 bytes, 1 LUN level), basically
dropping the 4th level:

0xL3L3L2L2L1L10000

The steering header is 0x01TT where TT is the target ID.
You bitwise or the virtio-specific parts into the SCSI LUN representation:

0xL3L3L2L2L1L101TT

Finally you convert it into the 64-bit LUN representation:

0x01TTL1L1L2L2L3L3
  0123456789abcdef [char array indexes]

Correction: the char array indexes are one per byte, of course:

0x01TTL1L1L2L2L3L3
  0 1 2 3 4 5 6 7
So we nicely have the virtio-specific parts at those array indexes where 
the virtio-scsi protocol expects them.
The usage of the other bytes is now of course different from the 
original LUN encoding: It allows more than just peripheral and flat 
space addressing for the 1st level; and it now also uses levels 2 and 3 
which were previously always zero. The 3rd level really requires 64-bit 
support in the kvm host kernel.
This also means that a 4-level LUN is not supported unless we would 
create a new virtio-scsi protocol version that would transfer the target 
ID in a separate field not as part of the LUN field.


Did I get that right?

A similar explanation in a kernel doc comment for the helper conversion 
function(s) might be helpful.
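
As a cross-check of the walkthrough above, here is a minimal userspace sketch. It assumes the byte layout of the kernel's int_to_scsilun() (each 16-bit level stored big-endian, least significant level of the integer first) and re-implements it locally; the sketch_* names are invented for this example and are not the proposed helper functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Userspace re-implementation of the byte layout produced by the
 * kernel's int_to_scsilun(): each 16-bit LUN level is stored
 * big-endian, least significant level of the integer first. */
static void sketch_int_to_scsilun(uint64_t lun, uint8_t out[8])
{
	int i;

	assert(out != NULL);
	memset(out, 0, 8);
	for (i = 0; i < 8; i += 2) {
		out[i] = (lun >> 8) & 0xff;
		out[i + 1] = lun & 0xff;
		lun >>= 16;
	}
}

/* The 'native LUN' branch of the patch: shift by one LUN level
 * (which drops the 4th level), then OR in the steering header 0x01TT. */
static void sketch_native_lun(uint64_t sdev_lun, uint8_t target_id,
			      uint8_t out[8])
{
	uint64_t lun = sdev_lun << 16;

	lun |= ((uint64_t)1 << 8) | (uint64_t)target_id;
	sketch_int_to_scsilun(lun, out);
}
```

With an integer LUN of 0x4444333322221111 and target ID 5, the resulting bytes are 01 05 11 11 22 22 33 33: the steering header lands in bytes 0 and 1, levels 1 to 3 follow, and the 0x4444 level is lost in the shift.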





Re: [PATCH 2/3] virtio-scsi: Add FC transport class

2018-01-26 Thread Steffen Maier

On 12/18/2017 09:31 AM, Hannes Reinecke wrote:

On 12/15/2017 07:08 PM, Steffen Maier wrote:

On 12/14/2017 11:11 AM, Hannes Reinecke wrote:

When a device announces an 'FC' protocol we should be pulling
in the FC transport class to have the rports etc setup correctly.


It took some time for me to understand what this does.
It seems to mirror the topology of rports and sdevs that exist under the
fc_host on the kvm host side to the virtio-scsi-fc side in the guest.

I like the idea. This is also what I've been suggesting users to do if
they back virtio-scsi with zfcp on the kvm host side. Primarily to not
stall all virtio-scsi I/O on all paths if the guest ever gets into
scsi_eh. But also to make it look like an HBA pass through so one can
more easily migrate to this once we have FCP pass through.


On second thought, I like the idea for virtio-scsi.

For the future virtio-(v)fc case, see below.


@@ -755,19 +823,34 @@ static int virtscsi_abort(struct scsi_cmnd *sc)



+    if (vscsi->protocol == SCSI_PROTOCOL_FCP) {
+    struct fc_rport *rport =
+    starget_to_rport(scsi_target(sc->device));
+    if (rport && rport->dd_data ) {
+    tgt = rport->dd_data;
+    target_id = tgt->target_id;
+    } else
+    return FAST_IO_FAIL;
+    } else {
+    tgt = scsi_target(sc->device)->hostdata;
+    if (!tgt || tgt->removed)
+    return FAST_IO_FAIL;
+    }


ditto


@@ -857,27 +970,67 @@ static void virtscsi_rescan_work(struct
work_struct *work)

   wait_for_completion();


Waiting in work item .vs. having the response (IRQ) path trigger
subsequent processing async ?
Or do we need the call chain(s) getting here to be in our own process
context via the workqueue anyway?


Can't say I can parse this sentence, but I'll be looking at the code
trying to come up with a clever explanation :-)


Sorry, meanwhile I have a hard time understanding my own words, too.

I think I wondered if the effort of a work item is really necessary, 
especially considering that it does block on the completion and thus 
could delay other queued work items (even though Concurrency Managed 
Workqueues can often hide this delay).


Couldn't we just return asynchronously after having sent the request?
Later on, the response (IRQ) path could then trigger, in some
asynchronous fashion, whatever processing is necessary (i.e. what the
work item variant does after waking up from the wait_for_completion).
Of course, this could also be a work item which just does the necessary
remaining processing after we got a response.

Just a wild guess, without knowing the environmental requirements.


+    if (transport == SCSI_PROTOCOL_FCP) {
+    struct fc_rport_identifiers rport_ids;
+    struct fc_rport *rport;
+
+    rport_ids.node_name = wwn_to_u64(cmd->resp.rescan.node_wwn);
+    rport_ids.port_name = wwn_to_u64(cmd->resp.rescan.port_wwn);
+    rport_ids.port_id = (target_id >> 8);


Why do you shift target_id by one byte to the right?


Because with the original setup virtio_scsi guest would pass in the
target_id, and the host would be selecting the device based on that
information.
With virtio-vfc we pass in the wwpn, but still require the target ID to
be compliant with things like event notification etc.


Don't we need the true N_Port-ID, then? That's what an fc_rport.port_id 
usually contains. It's also a simple way to lookup resources on a SAN 
switch for problem determination. Or did I misunderstand the 
content/semantics of the variable target_id, assuming it's a SCSI target 
ID, i.e. the 3rd part of a HCTL 4-tuple?



So I've shifted the target id onto the port ID (which is 24 bit anyway).
I could've used a bitfield here, but then I wasn't quite sure about the
endianness of which.



+    rport = fc_remote_port_add(sh, 0, &rport_ids);
+    if (rport) {
+    tgt->rport = rport;
+    rport->dd_data = tgt;
+    fc_remote_port_rolechg(rport, FC_RPORT_ROLE_FCP_TARGET);


Is the rolechg to get some event? Otherwise we could have
rport_ids.roles = FC_RPORT_ROLE_FCP_TARGET before fc_remote_port_add().


That's how the 'normal' transport classes do it; but I'll check if this
can be rolled into the call to fc_remote_port_add().


My idea was just based on how zfcp does it. Do you think I need to check 
if zfcp should do it via rolechg (even though zfcp never changes an 
rport role since it can only open targets)?



@@ -932,14 +1089,31 @@ static void virtscsi_scan_host(struct
virtio_scsi *vscsi)
   static void virtscsi_scan_start(struct Scsi_Host *sh)
   {



+    if (vscsi->protocol == SCSI_PROTOCOL_FCP) {
+    fc_host_node_name(sh) = vscsi->wwnn;
+    fc_host_port_name(sh) = vscsi->wwpn;
+    fc_host_port_id(sh) = 0x00ff00;
+    fc_host_port_type(sh) = FC_PORTTYPE_NPIV;


Why is this hardcoded?

At least with zfcp, we can have kvm host *v*HBAs without NPIV.


For the simple fact 

Re: [PATCH 3/3] virtio_scsi: Implement 'native LUN' feature

2018-01-26 Thread Steffen Maier

On 12/18/2017 08:48 AM, Hannes Reinecke wrote:

On 12/15/2017 07:17 PM, Steffen Maier wrote:

On 12/14/2017 11:11 AM, Hannes Reinecke wrote:

The 'native LUN' feature allows virtio-scsi to pass in the LUN
numbers from the underlying storage directly, without having
to modify the LUN number itself.
It works by shifting the existing LUN number down by 8 bytes,
and add the virtio-specific 8-byte LUN steering header.
With that virtio doesn't have to mangle the LUN number, allowing
us to pass the 'real' LUN number to the guest.


I only see shifts by 16 bits in the code below which would be 2 bytes.
I had a quick look at the corresponding qemu code which looked the same
to me.
What's the relation to 8 byte shifting, which would be 64 bit shift and
thus odd for a 64 bit LUN, mentioned in the description here?

If the code keeps the LUN level 1 and 2 (dropping level 3 and 4) and I
just don't understand it, it would be fine, I guess.


Yeah, messed that one up. It should be 8 _bits_, obviously.


Isn't it 16 bits or 2 bytes corresponding to one LUN level?
See also below.


Of course, we do cut off the last 8 bytes of the 'real' LUN number,
but I'm not aware of any array utilizing that, so the impact should
be negligible.


Why did we do v3.17 commit 9cb78c16f5da ("scsi: use 64-bit LUNs")? ;-)


Because that patch just lifts the internal code to use 64-bit LUNs
without any changes to the behaviour.
This one uses the internal 64-bit LUNs and actually changes the behaviour.


Sure, I was just being ironic, because your description sounded a bit as 
if all the LUN range extension is not even required because no storage 
array uses it.



Signed-off-by: Hannes Reinecke <h...@suse.com>
---
   drivers/scsi/virtio_scsi.c   | 62
++--
   include/uapi/linux/virtio_scsi.h |  1 +
   2 files changed, 48 insertions(+), 15 deletions(-)



@@ -524,10 +532,16 @@ static void virtio_scsi_init_hdr(struct
virtio_device *vdev,
    int target_id,
    struct scsi_cmnd *sc)
   {
-    cmd->lun[0] = 1;
-    cmd->lun[1] = target_id;
-    cmd->lun[2] = (sc->device->lun >> 8) | 0x40;
-    cmd->lun[3] = sc->device->lun & 0xff;
+    if (virtio_has_feature(vdev, VIRTIO_SCSI_F_NATIVE_LUN)) {
+    u64 lun = sc->device->lun << 16;
+    lun |= ((u64)1 << 8) | (u64)target_id;
+    int_to_scsilun(lun, (struct scsi_lun *)&cmd->lun);
+    } else {
+    cmd->lun[0] = 1;
+    cmd->lun[1] = target_id;
+    cmd->lun[2] = (sc->device->lun >> 8) | 0x40;
+    cmd->lun[3] = sc->device->lun & 0xff;
+    }


Above 2 patterns seem to repeat. Have helper functions (similar to
int_to_scsilun()) now that it's more than just 4 lines of filling in the
virtio lun?


Yes, can do.


Meanwhile I think I realized why I had trouble understanding what the 
code does. I guess, I expected a conversion with int_to_scsilun() first, 
and then we would fill in the virtio-specific parts of magic-one and 
target-ID.

You do it just the other way round, which is OK.

Say we have a 4 level 64-bit LUN and represent it in hex using
placeholder hexdigits for the 4 levels like this:

0xL1L1L2L2L3L3L4L4

Its integer SCSI LUN representation (in hex) is:

0xL4L4L3L3L2L2L1L1

Then you shift left by 16 bits (2 bytes, 1 LUN level), basically
dropping the 4th level:

0xL3L3L2L2L1L10000

The steering header is 0x01TT where TT is the target ID.
You bitwise or the virtio-specific parts into the SCSI LUN representation:

0xL3L3L2L2L1L101TT

Finally you convert it into the 64-bit LUN representation:

0x01TTL1L1L2L2L3L3
  0123456789abcdef [char array indexes]

So we nicely have the virtio-specific parts at those array indexes where 
the virtio-scsi protocol expects them.
The usage of the other bytes is now of course different from the 
original LUN encoding: It allows more than just peripheral and flat 
space addressing for the 1st level; and it now also uses levels 2 and 3 
which were previously always zero. The 3rd level really requires 64-bit 
support in the kvm host kernel.
This also means that a 4-level LUN is not supported unless we would 
create a new virtio-scsi protocol version that would transfer the target 
ID in a separate field not as part of the LUN field.


Did I get that right?

A similar explanation in a kernel doc comment for the helper conversion 
function(s) might be helpful.





Re: [PATCH 3/3] virtio_scsi: Implement 'native LUN' feature

2017-12-15 Thread Steffen Maier

Just a few very early view-only review comments.
Haven't run the code.

On 12/14/2017 11:11 AM, Hannes Reinecke wrote:

The 'native LUN' feature allows virtio-scsi to pass in the LUN
numbers from the underlying storage directly, without having
to modify the LUN number itself.
It works by shifting the existing LUN number down by 8 bytes,
and add the virtio-specific 8-byte LUN steering header.
With that virtio doesn't have to mangle the LUN number, allowing
us to pass the 'real' LUN number to the guest.


I only see shifts by 16 bits in the code below which would be 2 bytes.
I had a quick look at the corresponding qemu code which looked the same 
to me.
What's the relation to 8 byte shifting, which would be 64 bit shift and 
thus odd for a 64 bit LUN, mentioned in the description here?


If the code keeps the LUN level 1 and 2 (dropping level 3 and 4) and I 
just don't understand it, it would be fine, I guess.



Of course, we do cut off the last 8 bytes of the 'real' LUN number,
but I'm not aware of any array utilizing that, so the impact should
be negligible.


Why did we do v3.17 commit 9cb78c16f5da ("scsi: use 64-bit LUNs")? ;-)


Signed-off-by: Hannes Reinecke <h...@suse.com>
---
  drivers/scsi/virtio_scsi.c   | 62 ++--
  include/uapi/linux/virtio_scsi.h |  1 +
  2 files changed, 48 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index f925fbd..63c2c85 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -356,8 +356,12 @@ static void virtscsi_handle_transport_reset(struct 
virtio_scsi *vscsi,
struct scsi_device *sdev;
struct Scsi_Host *shost = virtio_scsi_host(vscsi->vdev);
unsigned int target = event->lun[1];
-   unsigned int lun = (event->lun[2] << 8) | event->lun[3];
+   u64 lun;

+   if (virtio_has_feature(vscsi->vdev, VIRTIO_SCSI_F_NATIVE_LUN))
+   lun = scsilun_to_int((struct scsi_lun *)event->lun) >> 16;
+   else
+   lun = (event->lun[2] << 8) | event->lun[3];




@@ -524,10 +532,16 @@ static void virtio_scsi_init_hdr(struct virtio_device 
*vdev,
 int target_id,
 struct scsi_cmnd *sc)
  {
-   cmd->lun[0] = 1;
-   cmd->lun[1] = target_id;
-   cmd->lun[2] = (sc->device->lun >> 8) | 0x40;
-   cmd->lun[3] = sc->device->lun & 0xff;
+   if (virtio_has_feature(vdev, VIRTIO_SCSI_F_NATIVE_LUN)) {
+   u64 lun = sc->device->lun << 16;
+   lun |= ((u64)1 << 8) | (u64)target_id;
+   int_to_scsilun(lun, (struct scsi_lun *)&cmd->lun);
+   } else {
+   cmd->lun[0] = 1;
+   cmd->lun[1] = target_id;
+   cmd->lun[2] = (sc->device->lun >> 8) | 0x40;
+   cmd->lun[3] = sc->device->lun & 0xff;
+   }


Above 2 patterns seem to repeat. Have helper functions (similar to 
int_to_scsilun()) now that it's more than just 4 lines of filling in the 
virtio lun?



@@ -851,10 +871,18 @@ static int virtscsi_abort(struct scsi_cmnd *sc)
.subtype = VIRTIO_SCSI_T_TMF_ABORT_TASK,



.lun[0] = 1,
.lun[1] = target_id,


drop those 2 superfluous lines, too?


-   .lun[2] = (sc->device->lun >> 8) | 0x40,
-   .lun[3] = sc->device->lun & 0xff,
.tag = cpu_to_virtio64(vscsi->vdev, (unsigned long)sc),
};
+   if (virtio_has_feature(vscsi->vdev, VIRTIO_SCSI_F_NATIVE_LUN)) {
+   u64 lun = sc->device->lun << 16;
+   lun |= ((u64)1 << 8) | (u64)target_id;
+   int_to_scsilun(lun, (struct scsi_lun *)&cmd->req.tmf.lun);
+   } else {
+   cmd->req.tmf.lun[0] = 1;
+   cmd->req.tmf.lun[1] = target_id;
+   cmd->req.tmf.lun[2] = (sc->device->lun >> 8) | 0x40;
+   cmd->req.tmf.lun[3] = sc->device->lun & 0xff;
+   }



return virtscsi_tmf(vscsi, cmd);
  }

@@ -1429,7 +1457,10 @@ static int virtscsi_probe(struct virtio_device *vdev)
/* LUNs > 256 are reported with format 1, so they go in the range
 * 16640-32767.
 */


Above old comment now only seems to apply to the then case of the 
following if statement, not to the else case.
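
For reference, the numbers in that old comment follow from flat space addressing, where the two top bits of the first LUN level are 01b and the remaining 14 bits carry the LUN. A tiny sketch of that arithmetic; the helper name is made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* First-level LUN value as an integer when LUN n is expressed in
 * flat space addressing (address method 01b in the top two bits).
 * Hypothetical helper for this sketch only. */
static uint16_t sketch_flat_space_lun(uint16_t n)
{
	assert(n < 0x4000); /* flat space addressing has 14 bits for the LUN */
	return (uint16_t)(0x4000 | n);
}
```

LUN 256 maps to 0x4100 = 16640 and LUN 16383 maps to 0x7fff = 32767, which is the range named in the comment and the reason for the "+ 0x4000" in the non-native case.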



-   shost->max_lun = virtscsi_config_get(vdev, max_lun) + 1 + 0x4000;
+   if (!virtio_has_feature(vdev, VIRTIO_SCSI_F_NATIVE_LUN))
+   shost->max_lun = virtscsi_config_get(vdev, max_lun) + 1 + 
0x4000;
+   else
+   shost->max_lun = (u64)-1;
shost->max_id = num_targets;
shost->max_channel = 0;
shost->max_cmd_len = VIRTIO_SCSI_CDB_SIZE;



--
Mit freundlichen Grüßen 

Re: [PATCH 2/3] virtio-scsi: Add FC transport class

2017-12-15 Thread Steffen Maier
fc_remote_port_delete(tgt->rport);
+   tgt->rport = NULL;
+   }
+   }
+   vscsi->next_target_id = 0;


I see some code duplication with what's in virtscsi_scan_host().
Not sure if reduction is worth it.


+   spin_unlock_irq(&vscsi->rescan_lock);
+   queue_work(system_freezable_wq, &vscsi->rescan_work);
+
+   while (!virtscsi_scan_finished(shost, jiffies - start))
+   msleep(10);
+
+   return 0;
+}
+
  static struct scsi_host_template virtscsi_host_template_single = {
.module = THIS_MODULE,
.name = "Virtio SCSI HBA",
@@ -1066,6 +1270,20 @@ static ssize_t virtscsi_host_store_rescan(struct device 
*dev,
.track_queue_depth = 1,
  };

+static struct fc_function_template virtscsi_transport_functions = {
+   .dd_fcrport_size = sizeof(struct virtio_scsi_target_state *),
+   .show_host_node_name = 1,
+   .show_host_port_name = 1,
+   .show_host_port_id = 1,
+   .show_host_port_state = 1,
+   .show_host_port_type = 1,
+   .show_starget_node_name = 1,
+   .show_starget_port_name = 1,
+   .show_starget_port_id = 1,
+   .show_rport_dev_loss_tmo = 1,
+   .issue_fc_host_lip = virtscsi_issue_lip,
+};
+
  #define virtscsi_config_get(vdev, fld) \
({ \
typeof(((struct virtio_scsi_config *)0)->fld) __val; \
@@ -1193,7 +1411,9 @@ static int virtscsi_probe(struct virtio_device *vdev)
vscsi->num_queues = num_queues;
vdev->priv = shost;
vscsi->next_target_id = -1;
+   vscsi->protocol = SCSI_PROTOCOL_SAS;


Why is the old/legacy/non-fcp hardcoded SAS?
Doesn't the non-fcp virtio-scsi have any real transport at all, i.e. "none"?
Maybe I just don't understand semantics of vscsi->protocol well enough.


spin_lock_init(&vscsi->rescan_lock);
+   INIT_LIST_HEAD(&vscsi->target_list);
INIT_WORK(&vscsi->rescan_work, virtscsi_rescan_work);

err = virtscsi_init(vdev, vscsi);






Re: [PATCH 1/3] virtio-scsi: implement target rescan

2017-12-15 Thread Steffen Maier


On 12/14/2017 11:11 AM, Hannes Reinecke wrote:

Implement the 'rescan' virtio-scsi feature. Rescanning works by
sending a 'rescan' virtio-scsi command with the next requested
target id to the backend. The backend will respond with the next
used target id or '-1' if no more targets are found.
This avoids scanning all possible targets.
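
The request/response loop described above can be sketched in userspace C. This is only a rough model: `mock_backend_rescan()`, `mock_rescan()` and the `mock_used` table are hypothetical names, the real virtio-scsi command layout is not shown, and treating the requested id as inclusive is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend state: sorted list of target ids in use. */
static const int mock_used[] = { 2, 5, 9 };

/* Answer a rescan request: next used id >= requested, or -1 if none. */
static int mock_backend_rescan(int requested)
{
	size_t i;

	for (i = 0; i < sizeof(mock_used) / sizeof(mock_used[0]); i++)
		if (mock_used[i] >= requested)
			return mock_used[i];
	return -1;
}

/* Driver side: visit only used targets; returns how many were found. */
static size_t mock_rescan(int *found, size_t max)
{
	size_t n = 0;
	int id = 0;

	while (n < max && (id = mock_backend_rescan(id)) != -1) {
		found[n++] = id;
		id++;	/* next requested target id */
	}
	return n;
}
```

With the three used ids above, the loop issues four requests (for 0, 3, 6 and 10) instead of probing every possible target id.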

Signed-off-by: Hannes Reinecke <h...@suse.com>
---
  drivers/scsi/virtio_scsi.c   | 239 ++-
  include/uapi/linux/virtio_scsi.h |  15 +++
  2 files changed, 250 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 7c28e8d..a561e90 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c



+static void virtscsi_rescan_work(struct work_struct *work)
+{



+   if (target_id == -1) {
+   shost_printk(KERN_INFO, sh, "rescan: terminated\n");
+   spin_unlock_irq(&vscsi->rescan_lock);
+   return;
+   }
+   spin_unlock_irq(&vscsi->rescan_lock);
+
+   cmd = mempool_alloc(virtscsi_cmd_pool, GFP_NOIO);
+   if (!cmd) {
+   shost_printk(KERN_INFO, sh, "rescan: no memory\n");
+   goto scan_host;
+   }
+   shost_printk(KERN_INFO, sh, "rescan: next target %d\n", target_id);



+   shost_printk(KERN_INFO, sh,
+"rescan: no more targets\n");



+   shost_printk(KERN_INFO, sh, "rescan: scan host\n");
+   scsi_scan_host(sh);
+}
+
+static void virtscsi_scan_host(struct virtio_scsi *vscsi)
+{
+   struct Scsi_Host *sh = virtio_scsi_host(vscsi->vdev);
+   int ret;
+   struct virtio_scsi_cmd *cmd;
+   DECLARE_COMPLETION_ONSTACK(comp);
+
+   cmd = mempool_alloc(virtscsi_cmd_pool, GFP_NOIO);
+   if (!cmd) {
+   shost_printk(KERN_INFO, sh, "rescan: no memory\n");


If shost_printk does not add any info about calling function, this 
cannot be distinguished from a message with the same format string above 
in virtscsi_rescan_work()?
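
One way to tell the two messages apart, sketched in userspace C under the assumption that the logging helper itself adds no caller information: prefix the format string with `__func__`. `mock_log()` and both `mock_*` functions are hypothetical stand-ins, not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a message prefixed with the caller's name. */
#define mock_log(buf, len, fmt) \
	snprintf((buf), (len), "%s: " fmt, __func__)

/* Stand-in for the first call site using the shared format string. */
static void mock_rescan_work(char *buf, size_t len)
{
	mock_log(buf, len, "rescan: no memory");
}

/* Stand-in for the second call site using the same format string. */
static void mock_scan_host(char *buf, size_t len)
{
	mock_log(buf, len, "rescan: no memory");
}
```

Both call sites keep the same message text, yet the resulting strings differ by the function name in front.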



+   return;
+   }
+   shost_printk(KERN_INFO, sh, "rescan: scan host\n");


ditto



+static void virtscsi_scan_start(struct Scsi_Host *sh)
+{
+   struct virtio_scsi *vscsi = shost_priv(sh);
+
+   virtscsi_scan_host(vscsi);
+   spin_lock_irq(&vscsi->rescan_lock);
+   if (vscsi->next_target_id != -1) {
+   shost_printk(KERN_INFO, sh, "rescan: already running\n");
+   spin_unlock_irq(&vscsi->rescan_lock);
+   return;
+   }
+   vscsi->next_target_id = 0;
+   shost_printk(KERN_INFO, sh, "rescan: start\n");
+   spin_unlock_irq(&vscsi->rescan_lock);
+   queue_work(system_freezable_wq, &vscsi->rescan_work);
+}
+
+int virtscsi_scan_finished(struct Scsi_Host *sh, unsigned long time)



+   shost_printk(KERN_INFO, sh, "rescan: %s finished\n",
+ret ? "" : "not");
+   return ret;
+}






Re: [PATCH 3/3] zfcp: drop open coded assignments of timer_list.function

2017-11-16 Thread Steffen Maier
If this has not been picked/merged yet (it's not in Linus' tree yet),
could you please drop it because it's buggy?

This would buy me time to come up with a proper solution,
otherwise I would be forced to fix it within 4.15-rc and
am not sure I can make it.

On 11/08/2017 03:17 PM, Steffen Maier wrote:
> The majority of requests is regular SCSI I/O on the hot path.
> Since these use a timeout owned by the block layer, zfcp does not use
> zfcp_fsf_req.timer. Hence, the very early unconditional and even
> incomplete (handler function yet unknown) timer initialization in
> zfcp_fsf_req_create() is not necessary.
> 
> Instead defer the timer initialization to when we know zfcp needs to use
> its own request timeout in zfcp_fsf_start_timer() and
> zfcp_fsf_start_erp_timer().

This means, we no longer always initialize the timer for
any request type, but only for some request types.

However, we still do have 2 unconditional del_timer() calls
independent of the request type.
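
The mismatch can be modelled in userspace C. This is a simplified sketch, not zfcp code: `struct mock_timer`, `struct mock_req` and the counter are hypothetical, and the `initialized` flag merely imitates what the debugobjects check tracks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct timer_list plus debugobjects state. */
struct mock_timer {
	bool initialized;
	bool pending;
};

struct mock_req {
	struct mock_timer timer;
	bool uses_own_timeout;	/* false for regular SCSI I/O */
};

/* Counts deletions of a timer that was never initialized. */
static int mock_assert_init_warnings;

static void mock_timer_setup(struct mock_timer *t)
{
	t->initialized = true;
	t->pending = false;
}

static void mock_del_timer(struct mock_timer *t)
{
	if (!t->initialized)
		mock_assert_init_warnings++;	/* models the ODEBUG warning */
	t->pending = false;
}

/* Deferred setup: only requests with their own timeout touch the timer. */
static void mock_req_start(struct mock_req *req)
{
	if (req->uses_own_timeout) {
		mock_timer_setup(&req->timer);
		req->timer.pending = true;
	}
}

/* The completion path deletes the timer regardless of the request type. */
static void mock_req_complete(struct mock_req *req)
{
	mock_del_timer(&req->timer);
}
```

A request that starts its own timeout completes cleanly in this model, while a regular I/O request reaches `mock_del_timer()` with a timer that was never set up.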

I don't understand yet why I haven't seen the following on function testing,
but I see it now while working on something else:

[  325.908536] scsi host2: scsi_eh_2: sleeping
[  325.908707] scsi host2: zfcp
[  325.912974] qdio: 0.0.1900 ZFCP on SC 11 using AI:1 QEBSM:1 PRI:1 TDD:1 
SIGA: W A 
[  331.112469] scsi 2:0:0:0: scsi scan: INQUIRY pass 1 length 36
[  331.122253] ODEBUG: assert_init not available (active state 0) object type: 
timer_list hint:   (null)
[  331.122319] [ cut here ]
[  331.122332] WARNING: CPU: 0 PID: 2195 at lib/debugobjects.c:291 
debug_print_object+0xb4/0xd8
[  331.122339] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 
nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp 
bridge stp llc ip6table_filter ip6_tables iptable_filter sunrpc qeth_l2 
rng_core ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 zfcp 
sha256_s390 scsi_transport_fc sha1_s390 sha_common dm_multipath qeth qdio 
scsi_mod vmur ccwgroup dm_mod vhost_net tun vhost tap sch_fq_codel kvm 
ip_tables x_tables autofs4
[  331.122503] CPU: 0 PID: 2195 Comm: chccwdev Not tainted 4.14.0localversion+ 
#1
[  331.122510] Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
[  331.122518] task: 4c673200 task.stack: 5fdac000
[  331.122599] Krnl PSW : 0704d0018000 007828cc 
(debug_print_object+0xb4/0xd8)
[  331.122693]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 
RI:0 EA:3
[  331.122748] Krnl GPRS: 200088a4a0f0 8100 0061 
00c215b6
[  331.122753]007828c8  00c158da 
4cf57200
[  331.122766]5e462548 0201d608 00c65888 
00e8dc08
[  331.122830]6fbfb808 00aac910 007828c8 
6fbfb708
[  331.122843] Krnl Code: 007828bc: c02000271779larl
%r2,c657ae
  007828c2: c0e5ffd25793brasl   
%r14,1cd7e8
 #007828c8: a7f40001brc 
15,7828ca
 >007828cc: c41d003655b4lrl 
%r1,e4d434
  007828d2: e340f0e80004lg  
%r4,232(%r15)
  007828d8: a71a0001ahi %r1,1
  007828dc: eb6ff0a80004lmg 
%r6,%r15,168(%r15)
  007828e2: c41f003655a9strl
%r1,e4d434
[  331.123056] Call Trace:
[  331.123065] ([<007828c8>] debug_print_object+0xb0/0xd8)
[  331.123074]  [<00783900>] debug_object_assert_init+0x148/0x180 
[  331.123085]  [<001e8e2c>] del_timer+0x34/0x90 
[  331.123106]  [<03ff8032fad2>] zfcp_fsf_req_complete+0x2b2/0x7a8 [zfcp] 
[  331.123122]  [<03ff80331e2e>] zfcp_fsf_reqid_check+0xe6/0x150 [zfcp] 
[  331.123151]  [<03ff80332be0>] zfcp_qdio_int_resp+0x138/0x180 [zfcp] 
[  331.123167]  [<03ff801df19e>] qdio_kick_handler+0x1be/0x2c0 [qdio] 
[  331.123178]  [<03ff801e1ca6>] __tiqdio_inbound_processing+0x466/0xd00 
[qdio] 
[  331.123191]  [<0014f5e0>] tasklet_action+0x100/0x188 
[  331.123203]  [<00a56af2>] __do_softirq+0x2ca/0x5e0 
[  331.123215]  [<0014ec24>] irq_exit+0x74/0xd8 
[  331.123228]  [<0010c5c4>] do_IRQ+0xbc/0xf0 
[  331.123278]  [<00a55c2c>] io_int_handler+0x104/0x2d4 
[  331.123354]  [<00168ca6>] queue_work_on+0x8e/0xa8 
[  331.123393] ([<00168ca2>] queue_work_on+0x8a/0xa8)
[  331.123443]  [<0080a932>] pty_write+0x62/0x88 
[  331.123454]  [<00801c64>] n_tty_write+0x284/0x4b8 
[  331.123463]  [<007febb6>] tty_write+0x34e/0x378 
[  331.123473]  [<000

[PATCH 0/3] zfcp: timer_setup() refactoring feature for v4.15-rc1

2017-11-08 Thread Steffen Maier
Hi all,

here is a small series for the timer_setup() refactoring of zfcp.
We target it for the merge window to land in v4.15-rc1.

Unfortunately, they don't seem to apply to the current state of either
James' misc branch nor Martin's 4.15/scsi-queue branch,
because they depend on:
v4.14-rc3 686fef928bba ("timer: Prepare to change timer callback argument type")
and
v4.14-rc7 ab31fd0ce65e ("scsi: zfcp: fix erp_action use-before-initialize in 
REC action trace").

However, they do apply to Linus' tree for v4.14-rc7 or later and
thus they would also apply for the upcoming merge window.

In http://www.spinics.net/lists/linux-scsi/msg114581.html I saw a decision
to have such changes go in via the timer tree. I would be happy with that.

Kees Cook (1):
  zfcp: convert timers to use timer_setup()

Steffen Maier (2):
  zfcp: purely mechanical update using timer API, plus blank lines
  zfcp: drop open coded assignments of timer_list.function

 drivers/s390/scsi/zfcp_erp.c | 15 +--
 drivers/s390/scsi/zfcp_ext.h |  2 +-
 drivers/s390/scsi/zfcp_fsf.c | 13 ++---
 3 files changed, 16 insertions(+), 14 deletions(-)

-- 
2.13.5



[PATCH 1/3] zfcp: convert timers to use timer_setup()

2017-11-08 Thread Steffen Maier
From: Kees Cook <keesc...@chromium.org>

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Steffen Maier <ma...@linux.vnet.ibm.com>
Cc: Benjamin Block <bbl...@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidef...@de.ibm.com>
Cc: Heiko Carstens <heiko.carst...@de.ibm.com>
Cc: linux-s...@vger.kernel.org
Signed-off-by: Kees Cook <keesc...@chromium.org>
Signed-off-by: Martin Schwidefsky <schwidef...@de.ibm.com>
[ma...@linux.vnet.ibm.com:
 depends on v4.14-rc3 686fef928bba ("timer: Prepare to change timer callback 
argument type"),
 rebased onto v4.14-rc7 ab31fd0ce65e ("scsi: zfcp: fix erp_action 
use-before-initialize in REC action trace")]
Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
Reviewed-by: Jens Remus <jre...@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c | 16 ++--
 drivers/s390/scsi/zfcp_ext.h |  2 +-
 drivers/s390/scsi/zfcp_fsf.c | 13 ++---
 3 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index cbb8156bf5e0..822a852d578e 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -56,6 +56,8 @@ enum zfcp_erp_act_result {
ZFCP_ERP_NOMEM = 5,
 };
 
+static void zfcp_erp_memwait_handler(struct timer_list *t);
+
 static void zfcp_erp_adapter_block(struct zfcp_adapter *adapter, int mask)
 {
zfcp_erp_clear_adapter_status(adapter,
@@ -237,6 +239,7 @@ static struct zfcp_erp_action *zfcp_erp_setup_act(int need, 
u32 act_status,
erp_action->fsf_req_id = 0;
erp_action->action = need;
erp_action->status = act_status;
+   timer_setup(&erp_action->timer, zfcp_erp_memwait_handler, 0);
 
return erp_action;
 }
@@ -564,21 +567,22 @@ void zfcp_erp_notify(struct zfcp_erp_action *erp_action, 
unsigned long set_mask)
  * zfcp_erp_timeout_handler - Trigger ERP action from timed out ERP request
  * @data: ERP action (from timer data)
  */
-void zfcp_erp_timeout_handler(unsigned long data)
+void zfcp_erp_timeout_handler(struct timer_list *t)
 {
-   struct zfcp_erp_action *act = (struct zfcp_erp_action *) data;
+   struct zfcp_fsf_req *fsf_req = from_timer(fsf_req, t, timer);
+   struct zfcp_erp_action *act = fsf_req->erp_action;
zfcp_erp_notify(act, ZFCP_STATUS_ERP_TIMEDOUT);
 }
 
-static void zfcp_erp_memwait_handler(unsigned long data)
+static void zfcp_erp_memwait_handler(struct timer_list *t)
 {
-   zfcp_erp_notify((struct zfcp_erp_action *)data, 0);
+   struct zfcp_erp_action *act = from_timer(act, t, timer);
+
+   zfcp_erp_notify(act, 0);
 }
 
 static void zfcp_erp_strategy_memwait(struct zfcp_erp_action *erp_action)
 {
-   setup_timer(&erp_action->timer, zfcp_erp_memwait_handler,
-   (unsigned long) erp_action);
erp_action->timer.expires = jiffies + HZ;
add_timer(&erp_action->timer);
 }
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index 8ca2ab7deaa9..978a0d596f68 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -69,7 +69,7 @@ extern int  zfcp_erp_thread_setup(struct zfcp_adapter *);
 extern void zfcp_erp_thread_kill(struct zfcp_adapter *);
 extern void zfcp_erp_wait(struct zfcp_adapter *);
 extern void zfcp_erp_notify(struct zfcp_erp_action *, unsigned long);
-extern void zfcp_erp_timeout_handler(unsigned long);
+extern void zfcp_erp_timeout_handler(struct timer_list *);
 
 /* zfcp_fc.c */
 extern struct kmem_cache *zfcp_fc_req_cache;
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 00fb98f7b2cd..6f437df1995f 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -21,9 +21,10 @@
 
 struct kmem_cache *zfcp_fsf_qtcb_cache;
 
-static void zfcp_fsf_request_timeout_handler(unsigned long data)
+static void zfcp_fsf_request_timeout_handler(struct timer_list *t)
 {
-   struct zfcp_adapter *adapter = (struct zfcp_adapter *) data;
+   struct zfcp_fsf_req *fsf_req = from_timer(fsf_req, t, timer);
+   struct zfcp_adapter *adapter = fsf_req->adapter;
zfcp_qdio_siosl(adapter);
zfcp_erp_adapter_reopen(adapter, ZFCP_STATUS_COMMON_ERP_FAILED,
"fsrth_1");
@@ -32,8 +33,7 @@ static void zfcp_fsf_request_timeout_handler(unsigned long 
data)
 static void zfcp_fsf_start_timer(struct zfcp_fsf_req *fsf_req,
 unsigned long timeout)
 {
-   fsf_req->timer.function = zfcp_fsf_request_timeout_handler;
-   fsf_req->timer.data = (unsigned long) fsf_req->adapter;
+   fsf_req->timer.function = 
(TIMER_FUNC_TYPE)zfcp_fsf_request_timeout_handler;
fsf_req->timer.expires = jiffies + timeout;
add_timer(&fsf_req->

[PATCH 2/3] zfcp: purely mechanical update using timer API, plus blank lines

2017-11-08 Thread Steffen Maier
erp_memwait only occurs in rare memory pressure situations.
The typical case never uses the associated timer and thus also
does not need to initialize the timer.
Also, we don't want to re-initialize the timer each time we re-use an
erp_action in zfcp_erp_setup_act() [see also v4.14-rc7 commit ab31fd0ce65e
("scsi: zfcp: fix erp_action use-before-initialize in REC action trace")
for erp_action life cycle].
Hence, retain the lazy initialization of zfcp_erp_action.timer
in zfcp_erp_strategy_memwait().

Add an empty line after declarations in zfcp_erp_timeout_handler()
and zfcp_fsf_request_timeout_handler() even though it was also missing
before the timer conversion.

Fix checkpatch warning:
WARNING: function definition argument 'struct timer_list *' should also have an 
identifier name
+extern void zfcp_erp_timeout_handler(struct timer_list *);

Depends-on: v4.14-rc3 commit 686fef928bba ("timer: Prepare to change timer 
callback argument type")
Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
Reviewed-by: Jens Remus <jre...@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_erp.c | 5 ++---
 drivers/s390/scsi/zfcp_ext.h | 2 +-
 drivers/s390/scsi/zfcp_fsf.c | 1 +
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 822a852d578e..1d91a32db08e 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -56,8 +56,6 @@ enum zfcp_erp_act_result {
ZFCP_ERP_NOMEM = 5,
 };
 
-static void zfcp_erp_memwait_handler(struct timer_list *t);
-
 static void zfcp_erp_adapter_block(struct zfcp_adapter *adapter, int mask)
 {
zfcp_erp_clear_adapter_status(adapter,
@@ -239,7 +237,6 @@ static struct zfcp_erp_action *zfcp_erp_setup_act(int need, 
u32 act_status,
erp_action->fsf_req_id = 0;
erp_action->action = need;
erp_action->status = act_status;
-   timer_setup(&erp_action->timer, zfcp_erp_memwait_handler, 0);
 
return erp_action;
 }
@@ -571,6 +568,7 @@ void zfcp_erp_timeout_handler(struct timer_list *t)
 {
struct zfcp_fsf_req *fsf_req = from_timer(fsf_req, t, timer);
struct zfcp_erp_action *act = fsf_req->erp_action;
+
zfcp_erp_notify(act, ZFCP_STATUS_ERP_TIMEDOUT);
 }
 
@@ -583,6 +581,7 @@ static void zfcp_erp_memwait_handler(struct timer_list *t)
 
 static void zfcp_erp_strategy_memwait(struct zfcp_erp_action *erp_action)
 {
+   timer_setup(&erp_action->timer, zfcp_erp_memwait_handler, 0);
erp_action->timer.expires = jiffies + HZ;
add_timer(&erp_action->timer);
 }
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index 978a0d596f68..bf8ea4df2bb8 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -69,7 +69,7 @@ extern int  zfcp_erp_thread_setup(struct zfcp_adapter *);
 extern void zfcp_erp_thread_kill(struct zfcp_adapter *);
 extern void zfcp_erp_wait(struct zfcp_adapter *);
 extern void zfcp_erp_notify(struct zfcp_erp_action *, unsigned long);
-extern void zfcp_erp_timeout_handler(struct timer_list *);
+extern void zfcp_erp_timeout_handler(struct timer_list *t);
 
 /* zfcp_fc.c */
 extern struct kmem_cache *zfcp_fc_req_cache;
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 6f437df1995f..51b81c0a0652 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -25,6 +25,7 @@ static void zfcp_fsf_request_timeout_handler(struct 
timer_list *t)
 {
struct zfcp_fsf_req *fsf_req = from_timer(fsf_req, t, timer);
struct zfcp_adapter *adapter = fsf_req->adapter;
+
zfcp_qdio_siosl(adapter);
zfcp_erp_adapter_reopen(adapter, ZFCP_STATUS_COMMON_ERP_FAILED,
"fsrth_1");
-- 
2.13.5



[PATCH 3/3] zfcp: drop open coded assignments of timer_list.function

2017-11-08 Thread Steffen Maier
The majority of requests is regular SCSI I/O on the hot path.
Since these use a timeout owned by the block layer, zfcp does not use
zfcp_fsf_req.timer. Hence, the very early unconditional and even
incomplete (handler function yet unknown) timer initialization in
zfcp_fsf_req_create() is not necessary.

Instead defer the timer initialization to when we know zfcp needs to use
its own request timeout in zfcp_fsf_start_timer() and
zfcp_fsf_start_erp_timer(). At that point in time we also know the handler
function. So drop open coded assignments of timer_list.function and
instead use the new timer API wrapper function timer_setup().

This way, we don't have to touch zfcp again, when the cast macro
TIMER_FUNC_TYPE gets removed again after the global conversion to
timer_setup() is complete.

Depends-on: v4.14-rc3 commit 686fef928bba ("timer: Prepare to change timer 
callback argument type")
Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
Reviewed-by: Jens Remus <jre...@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_fsf.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 51b81c0a0652..c8e368f0f299 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -34,7 +34,7 @@ static void zfcp_fsf_request_timeout_handler(struct 
timer_list *t)
 static void zfcp_fsf_start_timer(struct zfcp_fsf_req *fsf_req,
 unsigned long timeout)
 {
-   fsf_req->timer.function = 
(TIMER_FUNC_TYPE)zfcp_fsf_request_timeout_handler;
+   timer_setup(&fsf_req->timer, zfcp_fsf_request_timeout_handler, 0);
fsf_req->timer.expires = jiffies + timeout;
add_timer(&fsf_req->timer);
 }
@@ -42,7 +42,7 @@ static void zfcp_fsf_start_timer(struct zfcp_fsf_req *fsf_req,
 static void zfcp_fsf_start_erp_timer(struct zfcp_fsf_req *fsf_req)
 {
BUG_ON(!fsf_req->erp_action);
-   fsf_req->timer.function = (TIMER_FUNC_TYPE)zfcp_erp_timeout_handler;
+   timer_setup(&fsf_req->timer, zfcp_erp_timeout_handler, 0);
fsf_req->timer.expires = jiffies + 30 * HZ;
add_timer(&fsf_req->timer);
 }
@@ -692,7 +692,6 @@ static struct zfcp_fsf_req *zfcp_fsf_req_create(struct 
zfcp_qdio *qdio,
adapter->req_no++;
 
INIT_LIST_HEAD(&req->list);
-   timer_setup(&req->timer, NULL, 0);
init_completion(&req->completion);
 
req->adapter = adapter;
-- 
2.13.5



[PATCH] zfcp: fix erp_action use-before-initialize in REC action trace

2017-10-13 Thread Steffen Maier
v4.10 commit 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN
recovery") extended accessing parent pointer fields of
struct zfcp_erp_action for tracing.
If an erp_action has never been enqueued before, these parent pointer
fields are uninitialized and NULL. Examples are zfcp objects freshly added
to the parent object's children list, before enqueueing their first
recovery subsequently. In zfcp_erp_try_rport_unblock(), we iterate such
list. Accessing erp_action fields can cause a NULL pointer dereference.
Since the kernel can read from lowcore on s390, it does not immediately
cause a kernel page fault. Instead it can cause hangs on trying to acquire
the wrong erp_action->adapter->dbf->rec_lock in zfcp_dbf_rec_action_lvl()
  ^bogus^
while holding already other locks with IRQs disabled.

Real life example from attaching lots of LUNs in parallel on many CPUs:

crash> bt 17723
PID: 17723  TASK: ...   CPU: 25  COMMAND: "zfcperp0.0.1800"
 LOWCORE INFO:
  -psw  : 0x040430018000 0x0038e424
  -function : _raw_spin_lock_wait_flags at 38e424
...
 #0 [fdde8fc90] zfcp_dbf_rec_action_lvl at 3e0004e9862 [zfcp]
 #1 [fdde8fce8] zfcp_erp_try_rport_unblock at 3e0004dfddc [zfcp]
 #2 [fdde8fd38] zfcp_erp_strategy at 3e0004e0234 [zfcp]
 #3 [fdde8fda8] zfcp_erp_thread at 3e0004e0a12 [zfcp]
 #4 [fdde8fe60] kthread at 173550
 #5 [fdde8feb8] kernel_thread_starter at 10add2

zfcp_adapter
 zfcp_port
  zfcp_unit , 0x404040d6
  scsi_device NULL, returning early!
zfcp_scsi_dev.status = 0x4000
0x4000 ZFCP_STATUS_COMMON_RUNNING

crash> zfcp_unit 
struct zfcp_unit {
  erp_action = {
adapter = 0x0,
port = 0x0,
unit = 0x0,
  },
}

zfcp_erp_action is always fully embedded into its container object. Such
container object is never moved in its object tree (only add or delete).
Hence, erp_action parent pointers can never change.

To fix the issue, initialize the erp_action parent pointers
before adding the erp_action container to any list and thus before it
becomes accessible from outside of its initializing function.

In order to also close the time window between zfcp_erp_setup_act()
memsetting the entire erp_action to zero and setting the parent pointers
again, drop the memset and instead explicitly initialize individually all
erp_action fields except for parent pointers. To be extra careful not to
introduce any other unintended side effect, even keep zeroing the
erp_action fields for list and timer. Also double-check with WARN_ON_ONCE
that erp_action parent pointers never change, so we get to know when
we would deviate from previous behavior.

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
Fixes: 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery")
Cc: <sta...@vger.kernel.org> #2.6.32+
Reviewed-by: Benjamin Block <bbl...@linux.vnet.ibm.com>
---

James, Martin,

it's an important bugfix cut against James' scsi.git fixes branch,
and would be nice if it could make it into 4.14 via rc.

 drivers/s390/scsi/zfcp_aux.c  |5 +
 drivers/s390/scsi/zfcp_erp.c  |   18 +++---
 drivers/s390/scsi/zfcp_scsi.c |5 +
 3 files changed, 21 insertions(+), 7 deletions(-)

--- a/drivers/s390/scsi/zfcp_aux.c
+++ b/drivers/s390/scsi/zfcp_aux.c
@@ -357,6 +357,8 @@ struct zfcp_adapter *zfcp_adapter_enqueu
 
adapter->next_port_scan = jiffies;
 
+   adapter->erp_action.adapter = adapter;
+
if (zfcp_qdio_setup(adapter))
goto failed;
 
@@ -513,6 +515,9 @@ struct zfcp_port *zfcp_port_enqueue(stru
port->dev.groups = zfcp_port_attr_groups;
port->dev.release = zfcp_port_release;
 
+   port->erp_action.adapter = adapter;
+   port->erp_action.port = port;
+
if (dev_set_name(&port->dev, "0x%016llx", (unsigned long long)wwpn)) {
kfree(port);
goto err_out;
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -193,9 +193,8 @@ static struct zfcp_erp_action *zfcp_erp_
atomic_or(ZFCP_STATUS_COMMON_ERP_INUSE,
&zfcp_sdev->status);
erp_action = &zfcp_sdev->erp_action;
-   memset(erp_action, 0, sizeof(struct zfcp_erp_action));
-   erp_action->port = port;
-   erp_action->sdev = sdev;
+   WARN_ON_ONCE(erp_action->port != port);
+   WARN_ON_ONCE(erp_action->sdev != sdev);
if (!(atomic_read(&zfcp_sdev->status) &
  ZFCP_STATUS_COMMON_RUNNING))
act_status |= ZFCP_STATUS_ERP_CLOSE_ONLY;
@@ -208,8 +207,8 @@ static struct zfcp_erp_action *zfcp_erp_
zfcp_erp_action_dismiss_port(port);
atomic_or(ZFCP_STATUS_COMMON_ERP_INUSE, &port->status);
erp_action = &port->erp_action;
-   memset(erp_action, 0, si

Re: [PATCH] scsi: logging_level: update bits description

2017-10-11 Thread Steffen Maier


On 10/10/2017 09:32 PM, Kyle Fortin wrote:

On Oct 10, 2017, at 3:05 PM, Randy Dunlap <rdun...@infradead.org> wrote:

From: Randy Dunlap <rdun...@infradead.org>

Update the description of 'scsi_logging_level' from 8 4-bit nibbles
to the (pre-git) reality of 10 3-bit 'nibbles'.

Signed-off-by: Randy Dunlap <rdun...@infradead.org>
---
drivers/scsi/scsi_logging.h |8 
1 file changed, 4 insertions(+), 4 deletions(-)

--- lnx-414-rc3.orig/drivers/scsi/scsi_logging.h
+++ lnx-414-rc3/drivers/scsi/scsi_logging.h
@@ -3,10 +3,10 @@


/*
- * This defines the scsi logging feature.  It is a means by which the user
- * can select how much information they get about various goings on, and it
- * can be really useful for fault tracing.  The logging word is divided into


nit pick: Why reflow and thus "change" these 3 lines even though the 
content is the same?



- * 8 nibbles, each of which describes a loglevel.  The division of things is
+ * This defines the scsi logging feature.  It is a means by which the user can
+ * select how much information they get about various goings on, and it can be
+ * really useful for fault tracing.  The logging word is divided into 10 3-bit
+ * 'nibbles', each of which describes a loglevel.  The division of things is


I think 'bitfields' is more appropriate than 'nibbles' (a 4-bit construct in 
computing).


+1


  * somewhat arbitrary, and the division of the word could be changed if it
  * were really needed for any reason.  The numbers below are the only place
  * where these are specified.  For a first go-around, 3 bits is more than


Reviewed-by: Kyle Fortin <kyle.for...@oracle.com>


Reviewed-by: Steffen Maier <ma...@linux.vnet.ibm.com>
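
For illustration, a 32-bit logging word split into 10 fields of 3 bits each can be read and written as in the userspace sketch below. The `mock_*` helpers, `MOCK_*` names and the 0-based field numbering are hypothetical; which field belongs to which logging facility is deliberately left out.

```c
#include <assert.h>

#define MOCK_LOG_BITS   3	/* width of one field */
#define MOCK_LOG_FIELDS 10	/* 10 * 3 = 30 bits used of a 32-bit word */

/* Extract field number idx (0-based) from the logging word. */
static unsigned int mock_log_level(unsigned int word, unsigned int idx)
{
	return (word >> (idx * MOCK_LOG_BITS)) & ((1u << MOCK_LOG_BITS) - 1);
}

/* Return word with field idx replaced by level (0..7). */
static unsigned int mock_log_set(unsigned int word, unsigned int idx,
				 unsigned int level)
{
	unsigned int mask = ((1u << MOCK_LOG_BITS) - 1) << (idx * MOCK_LOG_BITS);

	return (word & ~mask) | ((level << (idx * MOCK_LOG_BITS)) & mask);
}
```

Each field holds a level from 0 to 7, and the top two bits of the word stay unused.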




Re: [PATCH] scsi: use set_host_byte instead of open-coding it

2017-10-11 Thread Steffen Maier


On 10/10/2017 05:29 PM, Johannes Thumshirn wrote:

Call set_host_byte() instead of open-coding it.

Converted using this simple Coccinelle spatch


@@
local idexpression struct scsi_cmnd *c;
expression E1;
@@

- c->result = E1 << 16;
+ set_host_byte(c, E1);


Maybe I misunderstand, but doesn't set_host_byte only set the host byte 
but leave the other 3 parts untouched in c->result?


static inline void set_host_byte(struct scsi_cmnd *cmd, char status)
{
cmd->result = (cmd->result & 0xff00ffff) | (status << 16);
}

In contrast, assigning something to c->result resets all parts.
If so, the semantic patch would introduce a subtle semantic change.
Unless it's guaranteed that in all the touched cases, c->result always 
has 0 for status, message, and driver byte before calling set_host_byte().
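
The difference can be shown with a small userspace sketch. `struct mock_cmnd` is a hypothetical stand-in for `struct scsi_cmnd`, the mask 0xff00ffff is assumed so that only bits 16..23 are replaced, and 0x0d is an arbitrary host code picked for the example.

```c
#include <assert.h>

/* Stand-in for struct scsi_cmnd; only the result word matters here. */
struct mock_cmnd {
	int result;
};

/* Replace only bits 16..23 and keep the other three bytes. */
static void mock_set_host_byte(struct mock_cmnd *cmd, char status)
{
	cmd->result = (cmd->result & 0xff00ffff) | (status << 16);
}

/* Open-coded variant: overwrites all four bytes of result. */
static void mock_assign_host_byte(struct mock_cmnd *cmd, char status)
{
	cmd->result = status << 16;
}
```

Starting from a result of 0x02, the helper yields 0x000d0002 while the plain assignment yields 0x000d0000, so the two are only equivalent when result was 0 beforehand.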


Bart's suggestion also sounds nice.

FYI: Originally, I only thought about using set_host_byte in that one 
place fix of yours; I did not expect a full framework rework.





Re: [PATCH] scsi: libiscsi: fix shifting of DID_REQUEUE host byte

2017-10-09 Thread Steffen Maier
Use wrapper functions to advertize their use in an attempt to avoid 
wrong shifting in the future?


On 10/09/2017 01:33 PM, Johannes Thumshirn wrote:

The SCSI host byte should be shifted left by 16 in order to have
scsi_decide_disposition() do the right thing (i.e. requeue the command).

Signed-off-by: Johannes Thumshirn <jthumsh...@suse.de>
Fixes: 661134ad3765 ("[SCSI] libiscsi, bnx2i: make bound ep check common")
Cc: Lee Duncan <ldun...@suse.com>
Cc: Hannes Reinecke <h...@suse.de>
Cc: Bart Van Assche <bart.vanass...@sandisk.com>
Cc: Chris Leech <cle...@redhat.com>
---
  drivers/scsi/libiscsi.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index bd4605a34f54..9cba4913b43c 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1728,7 +1728,7 @@ int iscsi_queuecommand(struct Scsi_Host *host, struct 
scsi_cmnd *sc)

if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
reason = FAILURE_SESSION_IN_RECOVERY;
-   sc->result = DID_REQUEUE;
+   sc->result = DID_REQUEUE << 16;


Not sure if this really wants to reset the other parts of result, but if 
so (and they are not 0 already anyway), precede the set_host_byte() by:

sc->result = 0;

set_host_byte(sc, DID_REQUEUE);


goto fault;
    }






Re: [PATCH v2 1/1] scsi: fc: check for rport presence in fc_block_scsi_eh

2017-09-26 Thread Steffen Maier

On 09/26/2017 08:58 AM, Johannes Thumshirn wrote:

Coverity-scan recently found a possible NULL pointer dereference in
fc_block_scsi_eh() as starget_to_rport() either returns the rport for
the starget or NULL.

While it is rather unlikely to have fc_block_scsi_eh() called without
an rport associated it's a good idea to catch potential misuses of the
API gracefully.

Signed-off-by: Johannes Thumshirn <jthumsh...@suse.de>
Reviewed-by: Bart Van Assche <bart.vanass...@wdc.com>
---

Changes since v1:
- s/WARN_ON/WARN_ON_ONCE/ (Bart)

---
  drivers/scsi/scsi_transport_fc.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index ba9d70f8a6a1..38abff7b5dbc 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3328,6 +3328,9 @@ int fc_block_scsi_eh(struct scsi_cmnd *cmnd)
  {
struct fc_rport *rport = starget_to_rport(scsi_target(cmnd->device));

+   if (WARN_ON_ONCE(!rport))
+   return 0;


Good idea.

However, return 0 or FAST_IO_FAIL?
I mean the callchains to this function (and of fc_block_rport()) react 
differently depending on the return value.
Returning 0 means that the rport left the blocked state, i.e. is usable 
for traffic again.

If there is no rport at all, I suppose one cannot use it for traffic.
If there is any I/O pending on this scope and we return 0, scsi_eh 
escalates; and if this happens for a host_reset we end up with offlined 
scsi_devices.
I wonder if returning FAST_IO_FAIL would be more appropriate here in 
this case, in order to have scsi_eh let the pending I/O bubble up for a 
timely path failover?
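
The alternative raised here, sketched as a userspace model. Everything below is hypothetical: `MOCK_FAST_IO_FAIL` is a placeholder value (the real constant comes from the kernel headers), `struct mock_rport` is a stand-in, and the real function waits for the port to leave the blocked state rather than checking a flag once.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder value; not the kernel's definition of FAST_IO_FAIL. */
enum { MOCK_FAST_IO_FAIL = 1 };

struct mock_rport {
	int blocked;	/* non-zero while the remote port is blocked */
};

/*
 * Returns 0 if the port is usable for traffic again, and
 * MOCK_FAST_IO_FAIL if there is no port at all or it is still blocked.
 */
static int mock_block_scsi_eh(const struct mock_rport *rport)
{
	if (!rport)
		return MOCK_FAST_IO_FAIL;	/* nothing to send traffic to */
	return rport->blocked ? MOCK_FAST_IO_FAIL : 0;
}
```

In this model a missing rport is reported the same way as a port that stays blocked, so a caller would not treat it as ready for traffic.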





Re: [RFC 4/9] zfcp: decouple FSF request setup of TMF from scsi_cmnd

2017-08-04 Thread Steffen Maier

Just for the record: There's another bug below.

On 07/25/2017 04:14 PM, Steffen Maier wrote:

The scsi_device argument of zfcp_fc_fcp_tm() can now be NULL.

In zfcp_fsf_fcp_task_mgmt() resolve the still old argument scsi_cmnd
into scsi_device very early and only depend on scsi_device and derived
objects in the function body.

Scsi_device and derived zfcp_scsi_dev can later be NULL for the
target reset case, so do not depend on them unconditionally.
For the generic case, rather change to using zfcp_port directly.

This prepares to later change the function signature replacing the
scsi_cmnd argument with zfcp_port and an
optional scsi_device which can be NULL.

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
---
  drivers/s390/scsi/zfcp_fc.h  |  6 --
  drivers/s390/scsi/zfcp_fsf.c | 25 +
  2 files changed, 21 insertions(+), 10 deletions(-)



diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index f221a34c26df..2dc7d2a6f6ea 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2339,13 +2339,19 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct 
scsi_cmnd *scmnd,
  {
struct zfcp_fsf_req *req = NULL;
struct fcp_cmnd *fcp_cmnd;
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scmnd->device);
-   struct zfcp_qdio *qdio = zfcp_sdev->port->adapter->qdio;
+   struct scsi_device *sdev = scmnd->device;
+   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);


BUG: must not unconditionally dereference sdev which can be NULL later 
on in the patch set!


Fix: +  struct zfcp_scsi_dev *zfcp_sdev = sdev ? sdev_to_zfcp(sdev) : NULL;

The fix is no longer necessary in my reworked v2 (which always has a 
non-NULL sdev), to be sent once I have successfully completed the function test.
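
As an illustration of the guard pattern in the one-line fix above, here is a small user-space sketch. The types and helpers (mock_sdev, mock_zfcp_sdev, mock_sdev_to_zfcp() and so on) are hypothetical stand-ins, not the zfcp definitions; the only point is that the derived pointer is computed conditionally and every later access checks it.

```c
#include <stddef.h>

/* Hypothetical stand-ins for zfcp_scsi_dev and scsi_device. */
struct mock_zfcp_sdev {
	unsigned int lun_handle;
};

struct mock_sdev {
	struct mock_zfcp_sdev priv;
};

/* Models sdev_to_zfcp(): must only be called with a non-NULL sdev. */
static struct mock_zfcp_sdev *mock_sdev_to_zfcp(struct mock_sdev *sdev)
{
	return &sdev->priv;
}

/* Guarded derivation, same shape as the ternary in the fix. */
static struct mock_zfcp_sdev *mock_zfcp_sdev_or_null(struct mock_sdev *sdev)
{
	return sdev ? mock_sdev_to_zfcp(sdev) : NULL;
}

/* Later users must then guard the derived pointer as well. */
static unsigned int mock_lun_handle(struct mock_sdev *sdev,
				    unsigned int fallback)
{
	struct mock_zfcp_sdev *zsdev = mock_zfcp_sdev_or_null(sdev);

	return zsdev ? zsdev->lun_handle : fallback;
}
```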



+   struct zfcp_port *port = zfcp_sdev->port;


This line was removed in the subsequent patch 5/9, so here the 
unconditional deref is OK because here in this patch we still get a 
non-NULL sdev. (The line is just argument lifting preparing for the 
function argument replacement in 5/9.)


Other accesses to sdev or zfcp_sdev were properly guarded with this patch.

--
Mit freundlichen Grüßen / Kind regards
Steffen Maier




Re: [PATCH 32/47] scsi: Use scsi_target as argument for eh_target_reset_handler()

2017-08-02 Thread Steffen Maier

Just an intermediate update on storage processing of target reset...

On 07/25/2017 04:19 PM, Hannes Reinecke wrote:

On 07/24/2017 08:10 PM, Steffen Maier wrote:

On 06/28/2017 10:32 AM, Hannes Reinecke wrote:

The target reset function should only depend on the scsi target,
not the scsi command.

Signed-off-by: Hannes Reinecke <h...@suse.com>
---



   drivers/s390/scsi/zfcp_scsi.c   | 20 ++--



   33 files changed, 214 insertions(+), 174 deletions(-)



diff --git a/drivers/s390/scsi/zfcp_scsi.c
b/drivers/s390/scsi/zfcp_scsi.c
index dd7bea0..92a3902 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -309,9 +309,25 @@ static int
zfcp_scsi_eh_device_reset_handler(struct scsi_cmnd *scpnt)
   return zfcp_task_mgmt_function(scpnt->device, FCP_TMF_LUN_RESET);
   }

-static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt)
+/*
+ * Note: We need to select a LUN as the storage array doesn't
+ * necessarily supports LUN 0 and might refuse the target reset.
+ */


Do you have any real experience with targets regarding this?

Did you even try this and it failed?
If so, how did it fail?


Hehe.

Actually, it was _you_ (well, not you personally, but the zfcp
maintainer at that time) who insisted on _not_ having to rely on LUN 0,
as that LUN might not be available on non-NPIV setups.
In the same vein he argued that we should be using the WLUN here.


Thanks a lot for letting me know!


It seems other drivers hardcode LUN 0 for target reset [see below].

At least you made a similar loop to search for a suitable "victim"
scsi_device with some other driver changes below, so zfcp is not the
only one.

In fact, this is one of my open questions in my own patch set:
Is the TMF flag in the FCP_CMND IU sufficient or does the transmission
path require a valid FCP_LUN also in the same IU even for a target reset.


Technically, you need an IT nexus for the target reset.
As the SCSI target is somewhat under-represented in the linux SCSI stack
typically it's easier to use a scsi device for this, and derive the IT
nexus from there.
And target reset is a tad tricky anyway; it got deprecated with later
SCSI releases (SPC-3?), so chances are that it doesn't do anything.

(You could do yourself a favour and enquire with your friendly array
vendors if _they_ support target reset; I have a strong feeling that
they don't. In which case you might as well drop it completely, and
target reset doing an IT nexus reset.)


# lsscsi
[0:0:0:1073758277]  disk    IBM      2107900          .280  /dev/sdc
[0:0:0:1073823813]  disk    IBM      2107900          .280  /dev/sda
[0:0:0:1073889349]  disk    IBM      2107900          .280  /dev/sdb

With test code I made the following request run into a timeout:

# dd count=1 of=/dev/null if=/dev/sda iflag=direct


[  633.459218] sd 0:0:0:1073823813: [sda] tag#0 Done: TIMEOUT_ERROR Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[  633.459267] sd 0:0:0:1073823813: [sda] tag#0 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
[  633.459277] sd 0:0:0:1073823813: [sda] tag#0 abort scheduled
[  633.479364] sd 0:0:0:1073823813: [sda] tag#0 aborting command
[  633.479382] sd 0:0:0:1073823813: [sda] tag#0 cmd abort failed


More test code makes the abort fail (before even attempting it).


[  633.479456] scsi host0: scsi_eh_0: waking up 0/1/1
[  633.479483] scsi host0: scsi_eh_prt_fail_stats: cmds failed: 0, cancel: 1
[  633.479492] scsi host0: Total of 1 commands on 1 devices require eh work
[  633.479502] sd 0:0:0:1073823813: scsi_eh_0: Sending BDR
[  633.479512] sd 0:0:0:1073823813: scsi_eh_0: BDR failed


More test code makes the LUN reset fail (before even attempting it).


[  633.479519] scsi host0: scsi_eh_0: Sending target reset to target 0
[  633.483654] sd 0:0:0:1073823813: [sda] tag#0 scsi_eh_done result: 2
[  633.483729] sd 0:0:0:1073823813: [sda] tag#0 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[  633.483736] sd 0:0:0:1073823813: [sda] tag#0 CDB: Test Unit Ready 00 00 00 00 00 00
[  633.483741] sd 0:0:0:1073823813: [sda] tag#0 Sense Key : Unit Attention [current]
[  633.483747] sd 0:0:0:1073823813: [sda] tag#0 Add. Sense: Power on, reset, or bus device reset occurred
[  633.483753] sd 0:0:0:1073823813: [sda] tag#0 scsi_send_eh_cmnd timeleft: 1000
[  633.483758] sd 0:0:0:1073823813: [sda] tag#0 scsi_send_eh_cmnd: scsi_eh_completed_normally 2001
[  633.483764] sd 0:0:0:1073823813: [sda] tag#0 scsi_eh_tur return: 2001
[  633.484074] sd 0:0:0:1073823813: [sda] tag#0 scsi_eh_done result: 0
[  633.484093] sd 0:0:0:1073823813: [sda] tag#0 scsi_send_eh_cmnd timeleft: 1000
[  633.484118] sd 0:0:0:1073823813: [sda] tag#0 scsi_send_eh_cmnd: scsi_eh_completed_normally 2002
[  633.484124] sd 0:0:0:1073823813: [sda] tag#0 scsi_eh_tur return: 2002
[  633.484130] sd 0:0:0:1073823813: [sda] tag#0 scsi_eh_0: flush retry cmd
[  633.484260] scsi host0: waking up host to restart
[  633.484299] scsi host0: scsi_eh_0: sleeping

Re: [RFC 2/9] zfcp: decouple TMF response handler from scsi_cmnd

2017-08-01 Thread Steffen Maier

Just for the record, in case anyone wants to resurrect this later on:
This patch is buggy.

On 07/25/2017 04:14 PM, Steffen Maier wrote:

Do not get scsi_device via req->data any more, but pass an optional(!)
scsi_device to zfcp_fsf_fcp_handler_common(). The latter must now guard
any access to scsi_device as it can be NULL.

Since we always have at least a zfcp port as scope, pass this as mandatory
argument to zfcp_fsf_fcp_handler_common() because we cannot get it through
scsi_device => zfcp_scsi_dev => port any more.

Hence, the callers of zfcp_fsf_fcp_handler_common() must resolve req->data.

TMF handling now has different context data in fsf_req->data
depending on the TMF scope in fcp_cmnd->fc_tm_flags:
* scsi_device if FCP_TMF_LUN_RESET,
* zfcp_port if FCP_TMF_TGT_RESET.

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
---
  drivers/s390/scsi/zfcp_fsf.c | 72 
  1 file changed, 46 insertions(+), 26 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 69d1dc3ec79d..8b2b2ea552d6 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c



@@ -2296,10 +2304,21 @@ int zfcp_fsf_fcp_cmnd(struct scsi_cmnd *scsi_cmnd)

  static void zfcp_fsf_fcp_task_mgmt_handler(struct zfcp_fsf_req *req)
  {
+   struct fcp_cmnd *fcp_cmnd;
+   struct zfcp_port *port;
+   struct scsi_device *sdev;
struct fcp_resp_with_ext *fcp_rsp;
struct fcp_resp_rsp_info *rsp_info;

-   zfcp_fsf_fcp_handler_common(req);
+   fcp_cmnd = &req->qtcb->bottom.io.fcp_cmnd.iu;
+   if (fcp_cmnd->fc_tm_flags & FCP_TMF_LUN_RESET) {
+   sdev = req->data;
+   port = sdev_to_zfcp(sdev)->port;


The bug described below causes a wrong type interpretation in case of a 
LUN reset: we interpret req->data as scsi_device, although the request 
function had assigned a zfcp_port to it. Dereferencing the port field 
then leads to a kernel page fault in (Soft)IRQ context, ending up in a 
panic.



+   } else {
+   sdev = NULL;
+   port = req->data;
+   }
+   zfcp_fsf_fcp_handler_common(req, port, sdev);

fcp_rsp = &req->qtcb->bottom.io.fcp_rsp.iu;
rsp_info = (struct fcp_resp_rsp_info *) &fcp_rsp[1];
@@ -2340,7 +2359,9 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct 
scsi_cmnd *scmnd,
goto out;
}

-   req->data = scmnd;
+   fcp_cmnd = &req->qtcb->bottom.io.fcp_cmnd.iu;


While I moved the pointer assignment here,
the memory it points to is only filled in below with:
zfcp_fc_scsi_to_fcp(fcp_cmnd, scmnd, tm_flags).
The still freshly allocated QTCB is pre-initialized with zero.
Hence, the subsequent boolean expression always evaluates to false
since no flag is set yet.
Thus, a LUN reset erroneously has:
req->data = (void *)sdev_to_zfcp(scmnd->device)->port.

A fix would be to base the boolean expression on function argument 
tm_flags rather than the QTCB content:

(tm_flags & FCP_TMF_LUN_RESET).
To not confuse people, I would also undo the move of the fcp_cmnd 
pointer assignment.
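
The ordering problem described above can be reproduced in a few lines of user-space C. This is only a sketch: mock_fcp_cmnd, the MOCK_* flag values and the two helper functions are hypothetical stand-ins, with the flag values merely assumed to resemble FCP_TMF_LUN_RESET and FCP_TMF_TGT_RESET.

```c
#include <string.h>

/* Assumed to resemble FCP_TMF_LUN_RESET / FCP_TMF_TGT_RESET. */
#define MOCK_TMF_LUN_RESET 0x10
#define MOCK_TMF_TGT_RESET 0x20

/* Which kind of object would end up in req->data. */
enum mock_ctx { MOCK_CTX_SDEV, MOCK_CTX_PORT };

/* Hypothetical, reduced model of the FCP_CMND IU inside the QTCB. */
struct mock_fcp_cmnd {
	unsigned char fc_tm_flags;
};

/* Buggy ordering: test the IU field before it has been filled in. */
static enum mock_ctx mock_pick_ctx_buggy(unsigned char tm_flags)
{
	struct mock_fcp_cmnd iu;
	enum mock_ctx ctx;

	memset(&iu, 0, sizeof(iu));	/* fresh QTCB is pre-initialized with zero */
	ctx = (iu.fc_tm_flags & MOCK_TMF_LUN_RESET) ?
		MOCK_CTX_SDEV : MOCK_CTX_PORT;
	iu.fc_tm_flags = tm_flags;	/* flags only arrive here, too late */
	return ctx;
}

/* Fixed variant: decide based on the function argument instead. */
static enum mock_ctx mock_pick_ctx_fixed(unsigned char tm_flags)
{
	return (tm_flags & MOCK_TMF_LUN_RESET) ?
		MOCK_CTX_SDEV : MOCK_CTX_PORT;
}
```

In the buggy variant a LUN reset is classified as port context, which is the mismatch the response handler later trips over.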


I won't send a new version with this fix,
because it turned out that the FCP channel always requires a valid LUN handle 
(even for a target reset), so I'm settling on scsi_device as the common 
context for any TMF, similar to what Hannes did.

Once I've successfully completed function test of v2 of my patch set,
I'm going to re-submit the full refactored set.


+   req->data = (fcp_cmnd->fc_tm_flags & FCP_TMF_LUN_RESET) ?
+   scmnd->device : (void *)sdev_to_zfcp(scmnd->device)->port;
req->handler = zfcp_fsf_fcp_task_mgmt_handler;
req->qtcb->header.lun_handle = zfcp_sdev->lun_handle;
req->qtcb->header.port_handle = zfcp_sdev->port->handle;
@@ -2350,7 +2371,6 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct 
scsi_cmnd *scmnd,

zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);

-   fcp_cmnd = &req->qtcb->bottom.io.fcp_cmnd.iu;
zfcp_fc_scsi_to_fcp(fcp_cmnd, scmnd, tm_flags);

zfcp_fsf_start_timer(req, ZFCP_SCSI_ER_TIMEOUT);



--
Mit freundlichen Grüßen / Kind regards
Steffen Maier




[RFC 7/9] zfcp: use fc_block_rport for TMFs and host reset to decouple from scsi_cmnd

2017-07-25 Thread Steffen Maier
Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_scsi.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 05c823ccb959..8e96196fa877 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -287,7 +287,7 @@ static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, 
u8 tm_flags)
break;
 
zfcp_erp_wait(adapter);
-   ret = fc_block_scsi_eh(scpnt);
+   ret = port->rport ? fc_block_rport(port->rport) : 0;
if (ret) {
zfcp_dbf_scsi_devreset("fiof", adapter, tm_flags, NULL,
   scsi_id, scsi_lun);
@@ -337,11 +337,13 @@ static int zfcp_scsi_eh_host_reset_handler(struct 
scsi_cmnd *scpnt)
 {
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device);
struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
+   struct zfcp_port *port;
int ret;
 
zfcp_erp_adapter_reopen(adapter, 0, "schrh_1");
zfcp_erp_wait(adapter);
-   ret = fc_block_scsi_eh(scpnt);
+   port = zfcp_sdev->port;
+   ret = port->rport ? fc_block_rport(port->rport) : 0;
if (ret)
return ret;
 
-- 
2.11.2



[RFC 9/9] zfcp: decouple our scsi_eh callbacks from scsi_cmnd

2017-07-25 Thread Steffen Maier
zfcp_scsi_eh_device_reset_handler() now only depends on scsi_device.
zfcp_scsi_eh_target_reset_handler() now only depends on scsi_target.
zfcp_scsi_eh_host_reset_handler() now only depends on Scsi_Host.
All derive other objects from these intended callback arguments.

Actually change the signature of zfcp_task_mgmt_function() used by
zfcp_scsi_eh_device_reset_handler() & zfcp_scsi_eh_target_reset_handler().
Since it was prepared in a previous patch, we only need to delete
some local auto variables which are now the intended arguments.

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_scsi.c | 40 
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 11cf33ea8c14..4cb38cfd46e3 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -266,11 +266,17 @@ static void zfcp_scsi_forget_cmnds(struct zfcp_port *port,
write_unlock_irqrestore(&adapter->abort_lock, flags);
 }
 
-static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, u8 tm_flags)
+/**
+ * zfcp_task_mgmt_function() - Synchronously send a task management function.
+ * @port: Pointer to zfcp port indicating scope.
+ * @sdev: Pointer to SCSI device as scope, or %NULL if scope is only port.
+ * @tm_flags: Task management flags,
+ *here we only handle %FCP_TMF_TGT_RESET or %FCP_TMF_LUN_RESET.
+ */
+static int zfcp_task_mgmt_function(struct zfcp_port *port,
+  struct scsi_device *sdev, u8 tm_flags)
 {
-   struct scsi_device *sdev = scpnt->device;
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
-   struct zfcp_port *port = zfcp_sdev->port;
+   struct zfcp_scsi_dev *zfcp_sdev = sdev ? sdev_to_zfcp(sdev) : NULL;
struct zfcp_adapter *adapter = port->adapter;
unsigned int scsi_id = port->starget_id;
u64 scsi_lun = ZFCP_DBF_INVALID_LUN;
@@ -325,18 +331,36 @@ static int zfcp_task_mgmt_function(struct scsi_cmnd 
*scpnt, u8 tm_flags)
 
 static int zfcp_scsi_eh_device_reset_handler(struct scsi_cmnd *scpnt)
 {
-   return zfcp_task_mgmt_function(scpnt, FCP_TMF_LUN_RESET);
+   struct scsi_device *sdev = scpnt->device;
+   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
+   struct zfcp_port *port = zfcp_sdev->port;
+
+   return zfcp_task_mgmt_function(port, sdev, FCP_TMF_LUN_RESET);
 }
 
 static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt)
 {
-   return zfcp_task_mgmt_function(scpnt, FCP_TMF_TGT_RESET);
+   struct scsi_target *starget = scsi_target(scpnt->device);
+   struct fc_rport *rport = starget_to_rport(starget);
+   struct zfcp_adapter *adapter =
+   (struct zfcp_adapter *)rport_to_shost(rport)->hostdata[0];
+   struct zfcp_port *port = zfcp_get_port_by_wwpn(adapter,
+  rport->port_name);
+   if (!port) {
+   zfcp_dbf_scsi_devreset("nopt", adapter, FCP_TMF_TGT_RESET,
+  NULL, port->starget_id,
+  ZFCP_DBF_INVALID_LUN);
+   return 0;
+   }
+
+   return zfcp_task_mgmt_function(port, NULL, FCP_TMF_TGT_RESET);
 }
 
 static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
 {
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device);
-   struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
+   struct Scsi_Host *shost = scpnt->device->host;
+   struct zfcp_adapter *adapter =
+   (struct zfcp_adapter *)shost->hostdata[0];
struct zfcp_port *port;
int ret = SUCCESS;
 
-- 
2.11.2



[RFC 8/9] zfcp: fix waiting for rport(s) unblock in eh_host_reset_handler

2017-07-25 Thread Steffen Maier
v2.6.30 commit 63caf367e1c9 ("[SCSI] zfcp: Improve reliability of SCSI eh
handlers in zfcp") added calls to zfcp_erp_wait() within
eh_abort_handler(), eh_device_reset_handler(), eh_target_reset_handler()
in order to synchronize with zfcp recovery completion before returning
from a scsi_eh callback (e.g. with SUCCESS) to prevent eh escalation.

v2.6.33 commit af4de36d911a ("[SCSI] zfcp: Block scsi_eh thread for rport
state BLOCKED") introduced the use of fc_block_scsi_eh() for
eh_abort_handler(), eh_device_reset_handler(), eh_target_reset_handler(),
and eh_host_reset_handler(), because zfcp_erp_wait() from above commit is
not sufficient.
The use in zfcp_task_mgmt_function() is correct even for a LUN reset,
as described in commit 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race
with LUN recovery").
However, the one call in zfcp_scsi_eh_host_reset_handler() waiting for
just one arbitrary port of the arbitrary scsi_cmnd seems insufficient
as the preceding adapter recovery could have recovered multiple ports
for which we all should wait to unblock (or have run into FAST_IO_FAIL).

Therefore, we now wait for all ports of the adapter with this fix.

NB: We cannot easily wait for an event, because there is a time window
between zfcp_erp_wait() returning and zfcp_erp_try_rport_unblock(), as part
of zfcp_erp_action_cleanup(), actually scheduling rport_work, which
unblocks an rport asynchronously in zfcp_scsi_rport_work(). Hence a
flush_work() could come too early, before queue_work() was even done.

v2.6.35 commit a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from
fc_block_scsi_eh to scsi eh") fixed v2.6.33 for the FAST_IO_FAIL case.

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
Fixes: af4de36d911a ("[SCSI] zfcp: Block scsi_eh thread for rport state 
BLOCKED")
Fixes: a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to 
scsi eh")
---
 drivers/s390/scsi/zfcp_scsi.c | 25 +++--
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 8e96196fa877..11cf33ea8c14 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -338,16 +338,29 @@ static int zfcp_scsi_eh_host_reset_handler(struct 
scsi_cmnd *scpnt)
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device);
struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
struct zfcp_port *port;
-   int ret;
+   int ret = SUCCESS;
 
zfcp_erp_adapter_reopen(adapter, 0, "schrh_1");
zfcp_erp_wait(adapter);
-   port = zfcp_sdev->port;
-   ret = port->rport ? fc_block_rport(port->rport) : 0;
-   if (ret)
-   return ret;
+   /* after internal recovery, wait for async unblock of rport(s) */
+   read_lock(&adapter->port_list_lock);
+   list_for_each_entry(port, &adapter->port_list, list) {
+   int fc_ret;
+
+   if (!port->rport)
+   continue;
+
+   fc_ret = fc_block_rport(port->rport);
+   /* Any rport ran into fast_io_fail_tmo: FAST_IO_FAIL.
+* To let pending requests bubble up, even if too many
+* because of other rports without this timeout.
+*/
+   if (fc_ret)
+   ret = fc_ret;
+   }
+   read_unlock(&adapter->port_list_lock);
 
-   return SUCCESS;
+   return ret;
 }
 
 struct scsi_transport_template *zfcp_scsi_transport_template;
-- 
2.11.2
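
The result aggregation in the loop of the patch above can be sketched in isolation as follows. This is a user-space model under stated assumptions: mock_port, mock_host_reset_result() and the MOCK_* values are hypothetical stand-ins, block_result represents what fc_block_rport() would have returned for that port, and locking as well as the actual blocking are left out.

```c
#include <stddef.h>

/* Stand-ins for the kernel's SUCCESS / FAST_IO_FAIL (assumed values). */
#define MOCK_SUCCESS      0x2002
#define MOCK_FAST_IO_FAIL 0x2008

/* Hypothetical, reduced model of a zfcp port. */
struct mock_port {
	int has_rport;		/* models port->rport != NULL */
	int block_result;	/* 0 or MOCK_FAST_IO_FAIL */
};

/*
 * Walk all ports and report a failure if any rport ran into
 * fast_io_fail_tmo; ports without an rport are skipped.
 */
static int mock_host_reset_result(const struct mock_port *ports, size_t n)
{
	int ret = MOCK_SUCCESS;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!ports[i].has_rport)
			continue;
		if (ports[i].block_result)
			ret = ports[i].block_result;
	}
	return ret;
}
```

A single failing rport is enough to turn the overall result into the failure value, matching the comment in the patch about letting pending requests bubble up.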



[RFC 3/9] zfcp: split FCP_CMND IU setup between SCSI I/O and TMF again

2017-07-25 Thread Steffen Maier
This reverts commit 2443c8b23aea ("[SCSI] zfcp: Merge FCP task management
setup with regular FCP command setup"), because this introduced a
dependency on the unsuitable SCSI command for scsi_eh / TMF.

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_fc.h  | 22 ++
 drivers/s390/scsi/zfcp_fsf.c |  4 ++--
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fc.h b/drivers/s390/scsi/zfcp_fc.h
index 41f22d3dc6d1..24949868d027 100644
--- a/drivers/s390/scsi/zfcp_fc.h
+++ b/drivers/s390/scsi/zfcp_fc.h
@@ -206,21 +206,14 @@ struct zfcp_fc_wka_ports {
  * zfcp_fc_scsi_to_fcp - setup FCP command with data from scsi_cmnd
  * @fcp: fcp_cmnd to setup
  * @scsi: scsi_cmnd where to get LUN, task attributes/flags and CDB
- * @tm: task management flags to setup task management command
  */
 static inline
-void zfcp_fc_scsi_to_fcp(struct fcp_cmnd *fcp, struct scsi_cmnd *scsi,
-u8 tm_flags)
+void zfcp_fc_scsi_to_fcp(struct fcp_cmnd *fcp, struct scsi_cmnd *scsi)
 {
u32 datalen;
 
int_to_scsilun(scsi->device->lun, (struct scsi_lun *) &fcp->fc_lun);
 
-   if (unlikely(tm_flags)) {
-   fcp->fc_tm_flags = tm_flags;
-   return;
-   }
-
fcp->fc_pri_ta = FCP_PTA_SIMPLE;
 
if (scsi->sc_data_direction == DMA_FROM_DEVICE)
@@ -240,6 +233,19 @@ void zfcp_fc_scsi_to_fcp(struct fcp_cmnd *fcp, struct 
scsi_cmnd *scsi,
 }
 
 /**
+ * zfcp_fc_fcp_tm() - Setup FCP command as task management command.
+ * @fcp: Pointer to FCP_CMND IU to set up.
+ * @dev: Pointer to SCSI_device where to send the task management command.
+ * @tm_flags: Task management flags to setup tm command.
+ */
+static inline
+void zfcp_fc_fcp_tm(struct fcp_cmnd *fcp, struct scsi_device *dev, u8 tm_flags)
+{
+   int_to_scsilun(dev->lun, (struct scsi_lun *) &fcp->fc_lun);
+   fcp->fc_tm_flags = tm_flags;
+}
+
+/**
  * zfcp_fc_evap_fcp_rsp - evaluate FCP RSP IU and update scsi_cmnd accordingly
  * @fcp_rsp: FCP RSP IU to evaluate
  * @scsi: SCSI command where to update status and sense buffer
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 8b2b2ea552d6..f221a34c26df 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2265,7 +2265,7 @@ int zfcp_fsf_fcp_cmnd(struct scsi_cmnd *scsi_cmnd)
 
BUILD_BUG_ON(sizeof(struct fcp_cmnd) > FSF_FCP_CMND_SIZE);
fcp_cmnd = &req->qtcb->bottom.io.fcp_cmnd.iu;
-   zfcp_fc_scsi_to_fcp(fcp_cmnd, scsi_cmnd, 0);
+   zfcp_fc_scsi_to_fcp(fcp_cmnd, scsi_cmnd);
 
if ((scsi_get_prot_op(scsi_cmnd) != SCSI_PROT_NORMAL) &&
scsi_prot_sg_count(scsi_cmnd)) {
@@ -2371,7 +2371,7 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct 
scsi_cmnd *scmnd,
 
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
 
-   zfcp_fc_scsi_to_fcp(fcp_cmnd, scmnd, tm_flags);
+   zfcp_fc_fcp_tm(fcp_cmnd, scmnd->device, tm_flags);
 
zfcp_fsf_start_timer(req, ZFCP_SCSI_ER_TIMEOUT);
if (!zfcp_fsf_req_send(req))
-- 
2.11.2



[RFC 6/9] scsi: fc: start decoupling fc_block_scsi_eh from scsi_cmnd

2017-07-25 Thread Steffen Maier
Scsi_cmnd is an unsuitable argument for eh_device_reset_handler(),
eh_target_reset_handler(), and eh_host_reset_handler()
which do not have the scope of one single SCSI command.
These callbacks tend to use fc_block_scsi_eh() requiring scsi_cmnd.
In order to start decoupling above eh callbacks from scsi_cmnd,
introduce a new variant of the function called fc_block_rport()
taking an fc_rport as argument.
Refactor the old fc_block_scsi_eh() to simply delegate to fc_block_rport().

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_transport_fc.c | 31 ++-
 include/scsi/scsi_transport_fc.h |  1 +
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index d4cf32d55546..3594043834c7 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3272,8 +3272,8 @@ fc_scsi_scan_rport(struct work_struct *work)
 }
 
 /**
- * fc_block_scsi_eh - Block SCSI eh thread for blocked fc_rport
- * @cmnd: SCSI command that scsi_eh is trying to recover
+ * fc_block_rport() - Block SCSI eh thread for blocked fc_rport.
+ * @rport: Remote port that scsi_eh is trying to recover.
  *
  * This routine can be called from a FC LLD scsi_eh callback. It
  * blocks the scsi_eh thread until the fc_rport leaves the
@@ -3285,10 +3285,9 @@ fc_scsi_scan_rport(struct work_struct *work)
  * FAST_IO_FAIL if the fast_io_fail_tmo fired, this should be
  * passed back to scsi_eh.
  */
-int fc_block_scsi_eh(struct scsi_cmnd *cmnd)
+int fc_block_rport(struct fc_rport *rport)
 {
-   struct Scsi_Host *shost = cmnd->device->host;
-   struct fc_rport *rport = starget_to_rport(scsi_target(cmnd->device));
+   struct Scsi_Host *shost = rport_to_shost(rport);
unsigned long flags;
 
spin_lock_irqsave(shost->host_lock, flags);
@@ -3305,6 +3304,28 @@ int fc_block_scsi_eh(struct scsi_cmnd *cmnd)
 
return 0;
 }
+EXPORT_SYMBOL(fc_block_rport);
+
+/**
+ * fc_block_scsi_eh - Block SCSI eh thread for blocked fc_rport
+ * @cmnd: SCSI command that scsi_eh is trying to recover
+ *
+ * This routine can be called from a FC LLD scsi_eh callback. It
+ * blocks the scsi_eh thread until the fc_rport leaves the
+ * FC_PORTSTATE_BLOCKED, or the fast_io_fail_tmo fires. This is
+ * necessary to avoid the scsi_eh failing recovery actions for blocked
+ * rports which would lead to offlined SCSI devices.
+ *
+ * Returns: 0 if the fc_rport left the state FC_PORTSTATE_BLOCKED.
+ * FAST_IO_FAIL if the fast_io_fail_tmo fired, this should be
+ * passed back to scsi_eh.
+ */
+int fc_block_scsi_eh(struct scsi_cmnd *cmnd)
+{
+   struct fc_rport *rport = starget_to_rport(scsi_target(cmnd->device));
+
+   return fc_block_rport(rport);
+}
 EXPORT_SYMBOL(fc_block_scsi_eh);
 
 /**
diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
index 6e208bb32c78..d8cae7bd8161 100644
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -808,6 +808,7 @@ void fc_host_post_vendor_event(struct Scsi_Host *shost, u32 
event_number,
 struct fc_vport *fc_vport_create(struct Scsi_Host *shost, int channel,
struct fc_vport_identifiers *);
 int fc_vport_terminate(struct fc_vport *vport);
+int fc_block_rport(struct fc_rport *rport);
 int fc_block_scsi_eh(struct scsi_cmnd *cmnd);
 enum blk_eh_timer_return fc_eh_timed_out(struct scsi_cmnd *scmd);
 
-- 
2.11.2



[RFC 4/9] zfcp: decouple FSF request setup of TMF from scsi_cmnd

2017-07-25 Thread Steffen Maier
The scsi_device argument of zfcp_fc_fcp_tm() can now be NULL.

In zfcp_fsf_fcp_task_mgmt() resolve the still old argument scsi_cmnd
into scsi_device very early and only depend on scsi_device and derived
objects in the function body.

Scsi_device and derived zfcp_scsi_dev can later be NULL for the
target reset case, so do not depend on them unconditionally.
For the generic case, rather change to using zfcp_port directly.

This prepares to later change the function signature replacing the
scsi_cmnd argument with zfcp_port and an
optional scsi_device which can be NULL.

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_fc.h  |  6 --
 drivers/s390/scsi/zfcp_fsf.c | 25 +
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fc.h b/drivers/s390/scsi/zfcp_fc.h
index 24949868d027..0e5b01c33873 100644
--- a/drivers/s390/scsi/zfcp_fc.h
+++ b/drivers/s390/scsi/zfcp_fc.h
@@ -235,13 +235,15 @@ void zfcp_fc_scsi_to_fcp(struct fcp_cmnd *fcp, struct 
scsi_cmnd *scsi)
 /**
  * zfcp_fc_fcp_tm() - Setup FCP command as task management command.
  * @fcp: Pointer to FCP_CMND IU to set up.
- * @dev: Pointer to SCSI_device where to send the task management command.
+ * @dev: Pointer to SCSI device if LUN Reset TMF, or %NULL.
  * @tm_flags: Task management flags to setup tm command.
  */
 static inline
 void zfcp_fc_fcp_tm(struct fcp_cmnd *fcp, struct scsi_device *dev, u8 tm_flags)
 {
-   int_to_scsilun(dev->lun, (struct scsi_lun *) &fcp->fc_lun);
+   if (dev)
+   int_to_scsilun(dev->lun, (struct scsi_lun *) &fcp->fc_lun);
+
fcp->fc_tm_flags = tm_flags;
 }
 
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index f221a34c26df..2dc7d2a6f6ea 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2339,13 +2339,19 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct 
scsi_cmnd *scmnd,
 {
struct zfcp_fsf_req *req = NULL;
struct fcp_cmnd *fcp_cmnd;
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scmnd->device);
-   struct zfcp_qdio *qdio = zfcp_sdev->port->adapter->qdio;
+   struct scsi_device *sdev = scmnd->device;
+   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
+   struct zfcp_port *port = zfcp_sdev->port;
+   struct zfcp_qdio *qdio = port->adapter->qdio;
 
-   if (unlikely(!(atomic_read(&zfcp_sdev->status) &
+   if (unlikely(!(atomic_read(&port->status) &
   ZFCP_STATUS_COMMON_UNBLOCKED)))
return NULL;
 
+   if (unlikely(zfcp_sdev && !(atomic_read(&zfcp_sdev->status) &
+   ZFCP_STATUS_COMMON_UNBLOCKED)))
+   return NULL;
+
spin_lock_irq(&qdio->req_q_lock);
if (zfcp_qdio_sbal_get(qdio))
goto out;
@@ -2360,18 +2366,21 @@ struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct 
scsi_cmnd *scmnd,
}
 
fcp_cmnd = &req->qtcb->bottom.io.fcp_cmnd.iu;
-   req->data = (fcp_cmnd->fc_tm_flags & FCP_TMF_LUN_RESET) ?
-   scmnd->device : (void *)sdev_to_zfcp(scmnd->device)->port;
+   if (fcp_cmnd->fc_tm_flags & FCP_TMF_LUN_RESET) {
+   req->data = sdev;
+   req->qtcb->header.lun_handle = zfcp_sdev->lun_handle;
+   } else
+   req->data = port;
+
req->handler = zfcp_fsf_fcp_task_mgmt_handler;
-   req->qtcb->header.lun_handle = zfcp_sdev->lun_handle;
-   req->qtcb->header.port_handle = zfcp_sdev->port->handle;
+   req->qtcb->header.port_handle = port->handle;
req->qtcb->bottom.io.data_direction = FSF_DATADIR_CMND;
req->qtcb->bottom.io.service_class = FSF_CLASS_3;
req->qtcb->bottom.io.fcp_cmnd_length = FCP_CMND_LEN;
 
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
 
-   zfcp_fc_fcp_tm(fcp_cmnd, scmnd->device, tm_flags);
+   zfcp_fc_fcp_tm(fcp_cmnd, sdev, tm_flags);
 
zfcp_fsf_start_timer(req, ZFCP_SCSI_ER_TIMEOUT);
if (!zfcp_fsf_req_send(req))
-- 
2.11.2



[RFC 2/9] zfcp: decouple TMF response handler from scsi_cmnd

2017-07-25 Thread Steffen Maier
Do not get scsi_device via req->data any more, but pass an optional(!)
scsi_device to zfcp_fsf_fcp_handler_common(). The latter must now guard
any access to scsi_device as it can be NULL.

Since we always have at least a zfcp port as scope, pass this as mandatory
argument to zfcp_fsf_fcp_handler_common() because we cannot get it through
scsi_device => zfcp_scsi_dev => port any more.

Hence, the callers of zfcp_fsf_fcp_handler_common() must resolve req->data.

TMF handling now has different context data in fsf_req->data
depending on the TMF scope in fcp_cmnd->fc_tm_flags:
* scsi_device if FCP_TMF_LUN_RESET,
* zfcp_port if FCP_TMF_TGT_RESET.

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_fsf.c | 72 
 1 file changed, 46 insertions(+), 26 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 69d1dc3ec79d..8b2b2ea552d6 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2035,27 +2035,30 @@ static void zfcp_fsf_req_trace(struct zfcp_fsf_req 
*req, struct scsi_cmnd *scsi)
sizeof(blktrc));
 }
 
-static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req)
+/**
+ * zfcp_fsf_fcp_handler_common() - FCP response handler common to I/O and TMF.
+ * @req: Pointer to FSF request.
+ * @port: Pointer to zfcp port.
+ * @sdev: Pointer to SCSI device, or %NULL with Target Reset TMF.
+ */
+static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req,
+   struct zfcp_port *port,
+   struct scsi_device *sdev)
 {
-   struct scsi_cmnd *scmnd = req->data;
-   struct scsi_device *sdev = scmnd->device;
-   struct zfcp_scsi_dev *zfcp_sdev;
struct fsf_qtcb_header *header = &req->qtcb->header;
 
if (unlikely(req->status & ZFCP_STATUS_FSFREQ_ERROR))
return;
 
-   zfcp_sdev = sdev_to_zfcp(sdev);
-
switch (header->fsf_status) {
case FSF_HANDLE_MISMATCH:
case FSF_PORT_HANDLE_NOT_VALID:
-   zfcp_erp_adapter_reopen(zfcp_sdev->port->adapter, 0, "fssfch1");
+   zfcp_erp_adapter_reopen(req->adapter, 0, "fssfch1");
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
break;
case FSF_FCPLUN_NOT_VALID:
case FSF_LUN_HANDLE_NOT_VALID:
-   zfcp_erp_port_reopen(zfcp_sdev->port, 0, "fssfch2");
+   zfcp_erp_port_reopen(port, 0, "fssfch2");
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
break;
case FSF_SERVICE_CLASS_NOT_SUPPORTED:
@@ -2066,10 +2069,10 @@ static void zfcp_fsf_fcp_handler_common(struct 
zfcp_fsf_req *req)
"Incorrect direction %d, LUN 0x%016Lx on port "
"0x%016Lx closed\n",
req->qtcb->bottom.io.data_direction,
-   (unsigned long long)zfcp_scsi_dev_lun(sdev),
-   (unsigned long long)zfcp_sdev->port->wwpn);
-   zfcp_erp_adapter_shutdown(zfcp_sdev->port->adapter, 0,
- "fssfch3");
+   sdev ? (unsigned long long)zfcp_scsi_dev_lun(sdev) :
+   ZFCP_DBF_INVALID_LUN,
+   (unsigned long long)port->wwpn);
+   zfcp_erp_adapter_shutdown(req->adapter, 0, "fssfch3");
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
break;
case FSF_CMND_LENGTH_NOT_VALID:
@@ -2077,29 +2080,32 @@ static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req)
"Incorrect CDB length %d, LUN 0x%016Lx on "
"port 0x%016Lx closed\n",
req->qtcb->bottom.io.fcp_cmnd_length,
-   (unsigned long long)zfcp_scsi_dev_lun(sdev),
-   (unsigned long long)zfcp_sdev->port->wwpn);
-   zfcp_erp_adapter_shutdown(zfcp_sdev->port->adapter, 0,
- "fssfch4");
+   sdev ? (unsigned long long)zfcp_scsi_dev_lun(sdev) :
+   ZFCP_DBF_INVALID_LUN,
+   (unsigned long long)port->wwpn);
+   zfcp_erp_adapter_shutdown(req->adapter, 0, "fssfch4");
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
break;
case FSF_PORT_BOXED:
-   zfcp_erp_set_port_status(zfcp_sdev->port,
+   zfcp_erp_set_port_status(port,
 ZFCP_STATUS_COMMON_ACCESS_BOXED);
-   zfcp_erp_port_reopen(zfcp_sdev->port,
+   zfcp_erp_port_r

[RFC 5/9] zfcp: decouple SCSI setup of TMF from scsi_cmnd

2017-07-25 Thread Steffen Maier
Actually change the signature of zfcp_fsf_fcp_task_mgmt().
Since it was prepared in the previous patch, we only need to delete
some local auto variables which are now the intended arguments.

Refactor zfcp_scsi_forget_cmnds() to now take a mandatory zfcp_port
and an optional zfcp_scsi_dev, which can be NULL for target reset,
instead of a mandatory zfcp_scsi_dev.

Prepare zfcp_fsf_fcp_task_mgmt's caller zfcp_task_mgmt_function()
to have its function body only depend on a mandatory zfcp_port
and an optional scsi_device, which can be NULL for target reset.
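
The "mandatory port, optional device" scoping described above can be sketched in isolation as follows. This is only an illustration under stated assumptions: every `sketch_*` name, type, and field is hypothetical and is not the zfcp code. It only mirrors the idea visible in the diff below, where the filter starts out with target-reset scope and the port handle, and the device is consulted only for a LUN reset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the TMF kinds handled here. */
enum sketch_tmf { SKETCH_TMF_LUN_RESET, SKETCH_TMF_TGT_RESET };

/* Hypothetical, heavily simplified port and device. */
struct sketch_port { uint32_t handle; };
struct sketch_dev  { struct sketch_port *port; uint32_t lun_handle; };

/* Hypothetical request filter describing the scope of a reset. */
struct sketch_filter {
	enum sketch_tmf scope;
	uint32_t port_handle;
	uint32_t lun_handle;	/* only meaningful for LUN reset */
};

/*
 * Build the filter from a mandatory port and an optional device.
 * dev may be NULL for a target reset; it is required for a LUN reset.
 * Returns false if the arguments are inconsistent.
 */
static bool sketch_build_filter(struct sketch_filter *f,
				const struct sketch_port *port,
				const struct sketch_dev *dev,
				enum sketch_tmf tmf)
{
	if (!port)
		return false;
	if (tmf == SKETCH_TMF_LUN_RESET && !dev)
		return false;

	/* Default scope is the whole target, identified by the port. */
	f->scope = SKETCH_TMF_TGT_RESET;
	f->port_handle = port->handle;
	f->lun_handle = 0;

	/* Narrow the scope to one LUN only if a LUN reset was requested. */
	if (tmf == SKETCH_TMF_LUN_RESET) {
		f->scope = SKETCH_TMF_LUN_RESET;
		f->lun_handle = dev->lun_handle;
	}
	return true;
}
```

The point of the sketch is that nothing dereferences the device pointer unless the scope is a LUN reset, so a target reset can pass NULL safely.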

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_ext.h  |  4 +++-
 drivers/s390/scsi/zfcp_fsf.c  | 15 ---
 drivers/s390/scsi/zfcp_scsi.c | 28 +++-
 3 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index f3101bc5d1bc..26772b0c1c39 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -118,7 +118,9 @@ extern int zfcp_fsf_send_els(struct zfcp_adapter *, u32,
 struct zfcp_fsf_ct_els *, unsigned int);
 extern int zfcp_fsf_fcp_cmnd(struct scsi_cmnd *);
 extern void zfcp_fsf_req_free(struct zfcp_fsf_req *);
-extern struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_cmnd *, u8);
+extern struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct zfcp_port *port,
+  struct scsi_device *sdev,
+  u8 tm_flags);
 extern struct zfcp_fsf_req *zfcp_fsf_abort_fcp_cmnd(struct scsi_cmnd *);
 extern void zfcp_fsf_reqid_check(struct zfcp_qdio *, int);
 
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 2dc7d2a6f6ea..7cc2d7ee1f56 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2329,19 +2329,20 @@ static void zfcp_fsf_fcp_task_mgmt_handler(struct zfcp_fsf_req *req)
 }
 
 /**
- * zfcp_fsf_fcp_task_mgmt - send SCSI task management command
- * @scmnd: SCSI command to send the task management command for
- * @tm_flags: unsigned byte for task management flags
- * Returns: on success pointer to struct fsf_req, NULL otherwise
+ * zfcp_fsf_fcp_task_mgmt() - Send SCSI task management command (TMF).
+ * @port: Pointer to zfcp port as scope for TMF.
+ * @sdev: Pointer to scsi device if LUN Reset TMF, or %NULL.
+ * @tm_flags: Unsigned byte for task management flags.
+ *
+ * Return: On success pointer to struct fsf_req, %NULL otherwise.
  */
-struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_cmnd *scmnd,
+struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct zfcp_port *port,
+   struct scsi_device *sdev,
u8 tm_flags)
 {
struct zfcp_fsf_req *req = NULL;
struct fcp_cmnd *fcp_cmnd;
-   struct scsi_device *sdev = scmnd->device;
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
-   struct zfcp_port *port = zfcp_sdev->port;
struct zfcp_qdio *qdio = port->adapter->qdio;
 
if (unlikely(!(atomic_read(&zfcp_sdev->status) &
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index cd0f811452b7..05c823ccb959 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -234,12 +234,20 @@ static void zfcp_scsi_forget_cmnd(struct zfcp_fsf_req *old_req, void *data)
old_req->data = NULL;
 }
 
-static void zfcp_scsi_forget_cmnds(struct zfcp_scsi_dev *zsdev, u8 tm_flags)
+/**
+ * zfcp_scsi_forget_cmnds() - Forget pending SCSI requests on given scope.
+ * @port: Pointer to zfcp port indicating scope.
+ * @zsdev: Pointer to zfcp scsi dev as scope, or %NULL if scope is only port.
+ * @tm_flags: Task management flags,
+ *            here we only handle %FCP_TMF_TGT_RESET or %FCP_TMF_LUN_RESET.
+ */
+static void zfcp_scsi_forget_cmnds(struct zfcp_port *port,
+  struct zfcp_scsi_dev *zsdev, u8 tm_flags)
 {
-   struct zfcp_adapter *adapter = zsdev->port->adapter;
+   struct zfcp_adapter *adapter = port->adapter;
struct zfcp_scsi_req_filter filter = {
.tmf_scope = FCP_TMF_TGT_RESET,
-   .port_handle = zsdev->port->handle,
+   .port_handle = port->handle,
};
unsigned long flags;
 
@@ -260,19 +268,21 @@ static void zfcp_scsi_forget_cmnds(struct zfcp_scsi_dev *zsdev, u8 tm_flags)
 
 static int zfcp_task_mgmt_function(struct scsi_cmnd *scpnt, u8 tm_flags)
 {
-   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device);
-   struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
-   unsigned int scsi_id = zfcp_sdev->port->starget_id;
+   struct scsi_device *sdev = scpnt->device;
+   struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
+   struct zfcp_port *port = zfcp_sdev->port;
+   struct 

[RFC 1/9] zfcp: drop unsuitable scsi_cmnd usage from SCSI traces for scsi_eh / TMF

2017-07-25 Thread Steffen Maier
The SCSI command pointer passed to scsi_eh callbacks is just one arbitrary
command of potentially many that are in the eh queue to be processed.
The command is only used to indirectly pass the TMF scope in terms of
SCSI ID/target and SCSI LUN for LUN reset.

Hence, zfcp had filled in SCSI trace record fields which do not really
belong to the TMF. This was confusing.

Therefore, refactor the TMF tracing to work without SCSI command
and instead pass explicit arguments for SCSI ID and SCSI LUN.
As context we now need a pointer to zfcp_adapter.
To make it even clearer, we set all bits to 1 for the fields, which do
not belong to the TMF, to indicate that these fields are invalid.

The old zfcp_dbf_scsi() became zfcp_dbf_scsi_common() to now handle both
SCSI commands and TMFs. The old argument scsi_cmnd is now optional and
can be NULL with TMFs. Two new arguments scsi_id and scsi_lun are
optional and only used if scsi_cmnd is NULL, i.e. with TMFs.
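
As a rough, standalone illustration of the record-filling rule described above, the following sketch uses simplified stand-in types. All `sketch_*` names and `SKETCH_OPCODE_LEN` are hypothetical and are not the zfcp structures or constants. With a command, the fields come from the command. Without one (the TMF case), only SCSI ID and LUN come from the explicit arguments and every other field is set to all ones to mark it invalid.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_OPCODE_LEN 16	/* hypothetical size for this sketch */

/* Hypothetical, simplified stand-in for a SCSI command. */
struct sketch_cmnd {
	int result;
	uint8_t retries;
	uint8_t allowed;
	uint32_t id;
	uint64_t lun;
	uint8_t cdb[SKETCH_OPCODE_LEN];
	uint8_t cdb_len;
};

/* Hypothetical, simplified stand-in for a trace record. */
struct sketch_rec {
	uint32_t scsi_result;
	uint8_t scsi_retries;
	uint8_t scsi_allowed;
	uint32_t scsi_id;
	uint32_t scsi_lun;		/* low 32 bits of the LUN */
	uint32_t scsi_lun_64_hi;	/* high 32 bits of the LUN */
	uint8_t scsi_opcode[SKETCH_OPCODE_LEN];
};

/* sc may be NULL; scsi_id and scsi_lun are used only in that case. */
static void sketch_fill_rec(struct sketch_rec *rec,
			    const struct sketch_cmnd *sc,
			    unsigned int scsi_id, uint64_t scsi_lun)
{
	memset(rec, 0, sizeof(*rec));
	if (sc) {
		size_t n = sc->cdb_len < SKETCH_OPCODE_LEN ?
			   sc->cdb_len : SKETCH_OPCODE_LEN;

		rec->scsi_result = (uint32_t)sc->result;
		rec->scsi_retries = sc->retries;
		rec->scsi_allowed = sc->allowed;
		rec->scsi_id = sc->id;
		rec->scsi_lun = (uint32_t)sc->lun;
		rec->scsi_lun_64_hi = (uint32_t)(sc->lun >> 32);
		memcpy(rec->scsi_opcode, sc->cdb, n);
	} else {
		/* Fields that do not belong to a TMF: all bits set. */
		rec->scsi_result = ~0u;
		rec->scsi_retries = 0xff;
		rec->scsi_allowed = 0xff;
		/* The TMF scope comes from the explicit arguments. */
		rec->scsi_id = scsi_id;
		rec->scsi_lun = (uint32_t)scsi_lun;
		rec->scsi_lun_64_hi = (uint32_t)(scsi_lun >> 32);
		memset(rec->scsi_opcode, 0xff, SKETCH_OPCODE_LEN);
	}
}
```

A reader of such a trace can then tell a TMF record from an I/O record by the all-ones pattern in the fields that only make sense for a real command.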

Signed-off-by: Steffen Maier <ma...@linux.vnet.ibm.com>
---
 drivers/s390/scsi/zfcp_dbf.c  | 51 ---
 drivers/s390/scsi/zfcp_dbf.h  | 26 +++---
 drivers/s390/scsi/zfcp_ext.h  |  8 ---
 drivers/s390/scsi/zfcp_scsi.c | 20 -
 4 files changed, 71 insertions(+), 34 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_dbf.c b/drivers/s390/scsi/zfcp_dbf.c
index 8227076c9cbb..dca624aaa7c0 100644
--- a/drivers/s390/scsi/zfcp_dbf.c
+++ b/drivers/s390/scsi/zfcp_dbf.c
@@ -577,16 +577,19 @@ void zfcp_dbf_san_in_els(char *tag, struct zfcp_fsf_req *fsf)
 }
 
 /**
- * zfcp_dbf_scsi - trace event for scsi commands
- * @tag: identifier for event
- * @sc: pointer to struct scsi_cmnd
- * @fsf: pointer to struct zfcp_fsf_req
+ * zfcp_dbf_scsi_common() - Common trace event helper for scsi.
+ * @tag: Identifier for event.
+ * @level: trace level of event.
+ * @adapter: Pointer to zfcp adapter as context for this event.
+ * @sc: Pointer to SCSI command, or NULL with task management function (TMF).
+ * @fsf: Pointer to FSF request, or NULL.
+ * @scsi_id: SCSI ID/target to indicate scope, only for TMF.
+ * @scsi_lun: SCSI LUN if TMF is Logical Unit Reset, else %ZFCP_DBF_INVALID_LUN.
  */
-void zfcp_dbf_scsi(char *tag, int level, struct scsi_cmnd *sc,
-  struct zfcp_fsf_req *fsf)
+void zfcp_dbf_scsi_common(char *tag, int level, struct zfcp_adapter *adapter,
+ struct scsi_cmnd *sc, struct zfcp_fsf_req *fsf,
+ unsigned int scsi_id, u64 scsi_lun)
 {
-   struct zfcp_adapter *adapter =
-   (struct zfcp_adapter *) sc->device->host->hostdata[0];
struct zfcp_dbf *dbf = adapter->dbf;
struct zfcp_dbf_scsi *rec = &dbf->scsi_buf;
struct fcp_resp_with_ext *fcp_rsp;
@@ -598,16 +601,28 @@ void zfcp_dbf_scsi(char *tag, int level, struct scsi_cmnd *sc,
 
memcpy(rec->tag, tag, ZFCP_DBF_TAG_LEN);
rec->id = ZFCP_DBF_SCSI_CMND;
-   rec->scsi_result = sc->result;
-   rec->scsi_retries = sc->retries;
-   rec->scsi_allowed = sc->allowed;
-   rec->scsi_id = sc->device->id;
-   rec->scsi_lun = (u32)sc->device->lun;
-   rec->scsi_lun_64_hi = (u32)(sc->device->lun >> 32);
-   rec->host_scribble = (unsigned long)sc->host_scribble;
-
-   memcpy(rec->scsi_opcode, sc->cmnd,
-  min((int)sc->cmd_len, ZFCP_DBF_SCSI_OPCODE));
+   if (sc) {
+   rec->scsi_result = sc->result;
+   rec->scsi_retries = sc->retries;
+   rec->scsi_allowed = sc->allowed;
+   rec->scsi_id = sc->device->id;
+   rec->scsi_lun = (u32)sc->device->lun;
+   rec->scsi_lun_64_hi = (u32)(sc->device->lun >> 32);
+   rec->host_scribble = (unsigned long)sc->host_scribble;
+
+   memcpy(rec->scsi_opcode, sc->cmnd,
+  min_t(int, sc->cmd_len, ZFCP_DBF_SCSI_OPCODE));
+   } else {
+   rec->scsi_result = ~0;
+   rec->scsi_retries = ~0;
+   rec->scsi_allowed = ~0;
+   rec->scsi_id = scsi_id;
+   rec->scsi_lun = (u32)scsi_lun;
+   rec->scsi_lun_64_hi = (u32)(scsi_lun >> 32);
+   rec->host_scribble = ~0;
+
+   memset(rec->scsi_opcode, 0xff, ZFCP_DBF_SCSI_OPCODE);
+   }
 
if (fsf) {
rec->fsf_req_id = fsf->req_id;
diff --git a/drivers/s390/scsi/zfcp_dbf.h b/drivers/s390/scsi/zfcp_dbf.h
index 3508c00458f4..6e29e7cccbc4 100644
--- a/drivers/s390/scsi/zfcp_dbf.h
+++ b/drivers/s390/scsi/zfcp_dbf.h
@@ -358,7 +358,8 @@ void _zfcp_dbf_scsi(char *tag, int level, struct scsi_cmnd *scmd,
scmd->device->host->hostdata[0];
 
if (debug_level_enabled(adapter-

[RFC 0/9] zfcp: decouple scsi_eh callbacks from scsi_cmnd

2017-07-25 Thread Steffen Maier
This is an early request for comments.
The patch set serves as a zfcp preparation step for Hannes' series
"[PATCH 00/47] SCSI EH argument reshuffle part II"
http://www.spinics.net/lists/linux-scsi/msg65.html
or
http://marc.info/?l=linux-scsi&m=150091945302995&w=2

The series is based on 18 preceding zfcp patches,
including some stable regression bugfixes for zfcp tracing.
Hence it might not apply cleanly.
However, we plan to post the 18 preceding patches soon for integration
and I would like to get those in first.

Please do not apply to any tree that will merge into upstream yet,
as it's not ready for prime time.
It only builds (after each patch; sparse and checkpatch clean)
but it has not seen any function testing yet.

There are still some open questions:
* Search victim scsi_device in target_reset_handler just to get a LUN?
  (Even if we do, zfcp_fsf_fcp_handler_common() should not print that LUN!)
  http://www.spinics.net/lists/linux-scsi/msg64.html
  http://www.spinics.net/lists/linux-scsi/msg66.html
* Exact rport blocking logic in host_reset_handler.
  http://www.spinics.net/lists/linux-scsi/msg65.html


Steffen Maier (9):
  zfcp: drop unsuitable scsi_cmnd usage from SCSI traces for scsi_eh /
TMF
  zfcp: decouple TMF response handler from scsi_cmnd
  zfcp: split FCP_CMND IU setup between SCSI I/O and TMF again
  zfcp: decouple FSF request setup of TMF from scsi_cmnd
  zfcp: decouple SCSI setup of TMF from scsi_cmnd
  scsi: fc: start decoupling fc_block_scsi_eh from scsi_cmnd
  zfcp: use fc_block_rport for TMFs and host reset to decouple from
scsi_cmnd
  zfcp: fix waiting for rport(s) unblock in eh_host_reset_handler
  zfcp: decouple our scsi_eh callbacks from scsi_cmnd

 drivers/s390/scsi/zfcp_dbf.c |  51 ---
 drivers/s390/scsi/zfcp_dbf.h |  26 +++---
 drivers/s390/scsi/zfcp_ext.h |  12 +++--
 drivers/s390/scsi/zfcp_fc.h  |  24 ++---
 drivers/s390/scsi/zfcp_fsf.c | 106 +--
 drivers/s390/scsi/zfcp_scsi.c| 105 +-
 drivers/scsi/scsi_transport_fc.c |  31 ++--
 include/scsi/scsi_transport_fc.h |   1 +
 8 files changed, 252 insertions(+), 104 deletions(-)

-- 
2.11.2



Re: [PATCH 46/47] scsi: Move eh_device_reset_handler() to use scsi_device as argument

2017-07-24 Thread Steffen Maier
} else {
SCSI_LOG_ERROR_RECOVERY(3,
shost_printk(KERN_INFO, shost,


How is this related to just changing the callback argument?

Haven't we previously set the host_byte to anything?

Is the above part rather an independent change or even fix which should 
live in a separate topical commit explaining why and what it does?


(I don't understand especially the DID_RESET cases.)



diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 8b93197..bc97e41 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -681,26 +681,26 @@ static int virtscsi_tmf(struct virtio_scsi *vscsi, struct virtio_scsi_cmd *cmd)
return ret;
  }

-static int virtscsi_device_reset(struct scsi_cmnd *sc)
+static int virtscsi_device_reset(struct scsi_device *sdev)
  {
-   struct virtio_scsi *vscsi = shost_priv(sc->device->host);
+   struct virtio_scsi *vscsi = shost_priv(sdev->host);
struct virtio_scsi_cmd *cmd;

-   sdev_printk(KERN_INFO, sc->device, "device reset\n");
+   sdev_printk(KERN_INFO, sdev, "device reset\n");
cmd = mempool_alloc(virtscsi_cmd_pool, GFP_NOIO);
if (!cmd)
return FAILED;

memset(cmd, 0, sizeof(*cmd));
-   cmd->sc = sc;
+   cmd->sc = NULL;


I hope we are protected from ever landing in virtscsi_complete_cmd() with
cmd->sc == NULL, not even if, e.g., the virtio queue gets hot unplugged the
hard way in parallel (forcing some completion to drain).

I'm thinking of commit 773c7220e22d193e5667c352fcbf8d47eefc817f
("scsi: virtio_scsi: Reject commands when virtqueue is broken").


cmd->req.tmf = (struct virtio_scsi_ctrl_tmf_req){
.type = VIRTIO_SCSI_T_TMF,
.subtype = cpu_to_virtio32(vscsi->vdev,
VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET),
.lun[0] = 1,
-   .lun[1] = sc->device->id,
-   .lun[2] = (sc->device->lun >> 8) | 0x40,
-   .lun[3] = sc->device->lun & 0xff,
+   .lun[1] = sdev->id,
+   .lun[2] = (sdev->lun >> 8) | 0x40,
+   .lun[3] = sdev->lun & 0xff,
};
return virtscsi_tmf(vscsi, cmd);
  }




diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 33bc523..7095076 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -145,7 +145,7 @@ struct scsi_host_template {
 * Status: REQUIRED (at least one of them)
 */
int (* eh_abort_handler)(struct scsi_cmnd *);
-   int (* eh_device_reset_handler)(struct scsi_cmnd *);
+   int (* eh_device_reset_handler)(struct scsi_device *);
int (* eh_target_reset_handler)(struct scsi_target *);
int (* eh_bus_reset_handler)(struct Scsi_Host *, int);
int (* eh_host_reset_handler)(struct Scsi_Host *);



Anything else that I quoted but did not comment on looks good.


--
Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Chair of the Supervisory Board: Martina Koederitz
Management: Dirk Wittkopp
Registered office: Boeblingen
Commercial register: Amtsgericht Stuttgart, HRB 243294


