date:20171208

RE: [PATCH] scsi: bfa: convert to strlcpy/strlcat

2017-12-08 Thread Kalluru, Sudarsana

Acked-by: Sudarsana Kalluru 

-----Original Message-----
From: Arnd Bergmann [mailto:a...@arndb.de] 
Sent: 04 December 2017 20:17
To: Gurumurthy, Anil ; Kalluru, Sudarsana 
; James E.J. Bottomley ; 
Martin K. Petersen 
Cc: Arnd Bergmann ; Hannes Reinecke ; Kees Cook 
; Benjamin Poirier ; Mody, Rasesh 
; Johannes Thumshirn ; 
linux-scsi@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: [PATCH] scsi: bfa: convert to strlcpy/strlcat

The bfa driver has a number of real issues with string termination that gcc-8 
now points out:

drivers/scsi/bfa/bfad_bsg.c: In function 'bfad_iocmd_port_get_attr':
drivers/scsi/bfa/bfad_bsg.c:320:9: error: argument to 'sizeof' in 'strncpy' 
call is the same expression as the source; did you mean to use the size of the 
destination? [-Werror=sizeof-pointer-memaccess]
drivers/scsi/bfa/bfa_fcs.c: In function 'bfa_fcs_fabric_psymb_init':
drivers/scsi/bfa/bfa_fcs.c:775:9: error: argument to 'sizeof' in 'strncat' call 
is the same expression as the source; did you mean to use the size of the 
destination? [-Werror=sizeof-pointer-memaccess]
drivers/scsi/bfa/bfa_fcs.c:781:9: error: argument to 'sizeof' in 'strncat' call 
is the same expression as the source; did you mean to use the size of the 
destination? [-Werror=sizeof-pointer-memaccess]
drivers/scsi/bfa/bfa_fcs.c:788:9: error: argument to 'sizeof' in 'strncat' call 
is the same expression as the source; did you mean to use the size of the 
destination? [-Werror=sizeof-pointer-memaccess]
drivers/scsi/bfa/bfa_fcs.c:801:10: error: argument to 'sizeof' in 'strncat' 
call is the same expression as the source; did you mean to use the size of the 
destination? [-Werror=sizeof-pointer-memaccess]
drivers/scsi/bfa/bfa_fcs.c:808:10: error: argument to 'sizeof' in 'strncat' 
call is the same expression as the source; did you mean to use the size of the 
destination? [-Werror=sizeof-pointer-memaccess]
drivers/scsi/bfa/bfa_fcs.c: In function 'bfa_fcs_fabric_nsymb_init':
drivers/scsi/bfa/bfa_fcs.c:837:10: error: argument to 'sizeof' in 'strncat' 
call is the same expression as the source; did you mean to use the size of the 
destination? [-Werror=sizeof-pointer-memaccess]
drivers/scsi/bfa/bfa_fcs.c:844:10: error: argument to 'sizeof' in 'strncat' 
call is the same expression as the source; did you mean to use the size of the 
destination? [-Werror=sizeof-pointer-memaccess]
drivers/scsi/bfa/bfa_fcs.c:852:10: error: argument to 'sizeof' in 'strncat' 
call is the same expression as the source; did you mean to use the size of the 
destination? [-Werror=sizeof-pointer-memaccess]
drivers/scsi/bfa/bfa_fcs.c: In function 'bfa_fcs_fabric_psymb_init':
drivers/scsi/bfa/bfa_fcs.c:778:2: error: 'strncat' output may be truncated 
copying 10 bytes from a string of length 63 [-Werror=stringop-truncation]
drivers/scsi/bfa/bfa_fcs.c:784:2: error: 'strncat' output may be truncated 
copying 30 bytes from a string of length 63 [-Werror=stringop-truncation]
drivers/scsi/bfa/bfa_fcs.c:803:3: error: 'strncat' output may be truncated 
copying 44 bytes from a string of length 63 [-Werror=stringop-truncation]
drivers/scsi/bfa/bfa_fcs.c:811:3: error: 'strncat' output may be truncated 
copying 16 bytes from a string of length 63 [-Werror=stringop-truncation]
drivers/scsi/bfa/bfa_fcs.c: In function 'bfa_fcs_fabric_nsymb_init':
drivers/scsi/bfa/bfa_fcs.c:840:2: error: 'strncat' output may be truncated 
copying 10 bytes from a string of length 63 [-Werror=stringop-truncation]
drivers/scsi/bfa/bfa_fcs.c:847:2: error: 'strncat' output may be truncated 
copying 30 bytes from a string of length 63 [-Werror=stringop-truncation]
drivers/scsi/bfa/bfa_fcs_lport.c: In function 'bfa_fcs_fdmi_get_hbaattr':
drivers/scsi/bfa/bfa_fcs_lport.c:2657:10: error: argument to 'sizeof' in 
'strncat' call is the same expression as the source; did you mean to use the 
size of the destination? [-Werror=sizeof-pointer-memaccess]
drivers/scsi/bfa/bfa_fcs_lport.c:2659:11: error: argument to 'sizeof' in 
'strncat' call is the same expression as the source; did you mean to use the 
size of the destination? [-Werror=sizeof-pointer-memaccess]
drivers/scsi/bfa/bfa_fcs_lport.c: In function 'bfa_fcs_lport_ms_gmal_response':
drivers/scsi/bfa/bfa_fcs_lport.c:3232:5: error: 'strncpy' output may be 
truncated copying 16 bytes from a string of length 247 
[-Werror=stringop-truncation]
drivers/scsi/bfa/bfa_fcs_lport.c: In function 'bfa_fcs_lport_ns_send_rspn_id':
drivers/scsi/bfa/bfa_fcs_lport.c:4670:3: error: 'strncpy' output truncated 
before terminating nul copying as many bytes from a string as its length 
[-Werror=stringop-truncation]
drivers/scsi/bfa/bfa_fcs_lport.c:4682:3: error: 'strncat' output truncated 
before terminating nul copying as many bytes
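
The warnings above share one root cause: the bound passed to strncpy() or
strncat() is derived from the source rather than the destination, so the
destination can be overrun or left without a terminating NUL. Below is a
minimal user-space sketch of the bounded, always-terminating semantics that
the patch title refers to. sketch_strlcpy() and sketch_strlcat() are
hypothetical stand-ins written for this note, not the kernel implementations.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strlcpy(): copy at most size - 1 bytes,
 * always NUL-terminate when size > 0, and return strlen(src) so the
 * caller can detect truncation (return value >= size).
 */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = (len >= size) ? size - 1 : len;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/*
 * Hypothetical stand-in for strlcat(): size is the full size of dst,
 * not the space remaining. Returns the length the result would have
 * had without truncation.
 */
size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
    size_t dlen = 0;
    size_t slen = strlen(src);

    while (dlen < size && dst[dlen] != '\0')
        dlen++;
    if (dlen == size)
        return size + slen;    /* dst was not terminated; leave it alone */

    if (slen < size - dlen) {
        memcpy(dst + dlen, src, slen + 1);
    } else {
        memcpy(dst + dlen, src, size - dlen - 1);
        dst[size - 1] = '\0';
    }
    return dlen + slen;
}
```

In both helpers the size argument is sizeof() of the destination buffer,
which is the fix gcc suggests in the sizeof-pointer-memaccess warnings.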

Re: [PATCH] scsi: libiscsi: Allow sd_shutdown on bad transport

2017-12-08 Thread Rafael David Tinoco

Hello Bart, 

I am returning BLK_EH_HANDLED in iscsi_eh_cmd_timed_out(). Do you mean
something different?

That paragraph means that I first tried returning BLK_EH_NOT_HANDLED, since
that is the other alternative to BLK_EH_RESET_TIMER (which is causing this
issue). With it, however, the EH logic would call scsi_abort_command() and,
successful or not, would try to get sense data before completion, causing
more traffic on a transport that is already in a bad state.

The best way to allow faster completion was, indeed, to return BLK_EH_HANDLED
while changing the result to DID_NO_CONNECT. That tells the block layer not
to retry, so the completion happens in the SOFTIRQ handler and the result is
reported to the upper layer.

For the queue, simply not allowing queueing in that condition (shutdown +
state != logged in) seemed correct.

Let me know if you want me to try something else. I would be happy to.

Best,
-Rafael

> On 08/12/2017, at 09:12 PM, Bart Van Assche  wrote:
> 
> On Thu, 2017-12-07 at 19:59 -0200, Rafael David Tinoco wrote:
>> This happens because iscsi_eh_cmd_timed_out(), the transport layer
>> timeout helper, would tell the queue timeout function (scsi_times_out)
>> to reset the request timer over and over, until the session state is
>> back to logged in state. Unfortunately, during server shutdown, this
>> might never happen again.
> 
> Hello Rafael,
> 
> Have you considered to make iscsi_eh_cmd_timed_out() return BLK_EH_HANDLED
> if system_state != SYSTEM_RUNNING? That could result in slower shutdown than
> with your patch but such a change would probably be really easy to review.
> 
> Thanks,
> 
> Bart.
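
The timeout and queueing decisions discussed in this exchange can be sketched
as below. This is a simplified model under stated assumptions, not the
libiscsi code: the SKETCH_EH_* values stand in for the BLK_EH_* return codes,
the no_connect flag stands in for a DID_NO_CONNECT result, and every struct
and function name is hypothetical. All cases where the session is logged in
are collapsed into a single branch for brevity.

```c
#include <stdbool.h>

/* Stand-ins for the BLK_EH_* return values named in the thread. */
enum sketch_eh_ret {
    SKETCH_EH_HANDLED,
    SKETCH_EH_RESET_TIMER,
    SKETCH_EH_NOT_HANDLED,
};

struct sketch_session {
    bool logged_in;      /* session state == logged in */
    bool shutting_down;  /* system shutdown in progress */
};

struct sketch_cmd {
    bool no_connect;     /* models a DID_NO_CONNECT result */
};

enum sketch_eh_ret sketch_cmd_timed_out(const struct sketch_session *s,
                                        struct sketch_cmd *cmd)
{
    if (!s->logged_in && s->shutting_down) {
        /* The session will not come back: complete now, do not retry. */
        cmd->no_connect = true;
        return SKETCH_EH_HANDLED;
    }
    if (!s->logged_in)
        return SKETCH_EH_RESET_TIMER;  /* wait for the session to recover */

    return SKETCH_EH_NOT_HANDLED;      /* assumption: let the EH logic run */
}

/* Queueing side: refuse new commands on shutdown + state != logged in. */
bool sketch_may_queue(const struct sketch_session *s)
{
    return !(s->shutting_down && !s->logged_in);
}
```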

[PATCH 7/9] lpfc: Fix infinite wait when driver unregisters a remote NVME port.

2017-12-08 Thread James Smart

When unregistering a remote port, the lpfc driver would eventually
wait for the remoteport_unreg done callback. But the driver never
completed the io aborts that would allow the connections to terminate,
so the unreg done callback was never issued. It turns out the coding
style of the driver allowed the wait to occur on the same cpu that
the deferred isr is called on. Blocking for the wait blocked the isr,
and as the isr didn't run, the io aborts couldn't finish.

Turns out there was never a good reason to block waiting for the
unreg done in the first place. The driver can continue execution and
the ref counting within the driver will do the right thing.

Resolve by removing the wait and patching up a few cases where the
ref counting didn't look right - mainly cases where the remote port
comes back before the aborts had completed and the unreg done had
been called. Additionally, a few places which used pointer values
to guide driver actions weren't protected by lock, so correct those.
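
The idea of letting reference counts do the cleanup, instead of blocking
until a done callback fires, can be sketched as follows. This is a minimal
user-space model with hypothetical names (sketch_node and its helpers); it
does not reproduce lpfc_nlp_get()/lpfc_nlp_put() and omits all locking.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_node {
    atomic_int refs;   /* the registration itself holds one reference */
    bool released;     /* set when the last reference is dropped */
};

void sketch_node_init(struct sketch_node *n)
{
    atomic_init(&n->refs, 1);
    n->released = false;
}

/* Taken by anything that still needs the node, e.g. an in-flight io. */
void sketch_node_get(struct sketch_node *n)
{
    atomic_fetch_add(&n->refs, 1);
}

/*
 * Drop one reference without ever sleeping. Returns true when this
 * call dropped the last reference and the node was released.
 */
bool sketch_node_put(struct sketch_node *n)
{
    if (atomic_fetch_sub(&n->refs, 1) == 1) {
        n->released = true;
        return true;
    }
    return false;
}
```

In this model the unregister path only calls sketch_node_put() and returns.
Whichever path finishes last, such as the abort completion running from the
isr, performs the release, so nothing needs to block the cpu the isr runs on.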

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
---
 drivers/scsi/lpfc/lpfc_nvme.c | 135 --
 1 file changed, 51 insertions(+), 84 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 1097ca5a7a8e..4b2a73ebd116 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -201,16 +201,19 @@ lpfc_nvme_remoteport_delete(struct nvme_fc_remote_port 
*remoteport)
 * calling state machine to remove the node.
 */
lpfc_printf_vlog(vport, KERN_INFO, LOG_NVME_DISC,
-   "6146 remoteport delete complete %p\n",
+   "6146 remoteport delete of remoteport %p\n",
remoteport);
+   spin_lock_irq(>phba->hbalock);
ndlp->nrport = NULL;
+   spin_unlock_irq(>phba->hbalock);
+
+   /* Remove original register reference. The host transport
+* won't reference this rport/remoteport any further.
+*/
lpfc_nlp_put(ndlp);
 
  rport_err:
-   /* This call has to execute as long as the rport is valid.
-* Release any threads waiting for the unreg to complete.
-*/
-   complete(>rport_unreg_done);
+   return;
 }
 
 static void
@@ -966,16 +969,10 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct 
lpfc_iocbq *pwqeIn,
/* NVME targets need completion held off until the abort exchange
 * completes unless the NVME Rport is getting unregistered.
 */
-   if (!(lpfc_ncmd->flags & LPFC_SBUF_XBUSY) ||
-   ndlp->upcall_flags & NLP_WAIT_FOR_UNREG) {
-   /* Clear the XBUSY flag to prevent double completions.
-* The nvme rport is getting unregistered and there is
-* no need to defer the IO.
-*/
-   if (lpfc_ncmd->flags & LPFC_SBUF_XBUSY)
-   lpfc_ncmd->flags &= ~LPFC_SBUF_XBUSY;
 
+   if (!(lpfc_ncmd->flags & LPFC_SBUF_XBUSY)) {
nCmd->done(nCmd);
+   lpfc_ncmd->nvmeCmd = NULL;
}
 
spin_lock_irqsave(>hbalock, flags);
@@ -2494,6 +2491,9 @@ lpfc_nvme_register_port(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp)
 
rpinfo.port_name = wwn_to_u64(ndlp->nlp_portname.u.wwn);
rpinfo.node_name = wwn_to_u64(ndlp->nlp_nodename.u.wwn);
+   if (!ndlp->nrport)
+   lpfc_nlp_get(ndlp);
+
ret = nvme_fc_register_remoteport(localport, , _port);
if (!ret) {
/* If the ndlp already has an nrport, this is just
@@ -2502,23 +2502,33 @@ lpfc_nvme_register_port(struct lpfc_vport *vport, 
struct lpfc_nodelist *ndlp)
 */
rport = remote_port->private;
if (ndlp->nrport) {
-   lpfc_printf_vlog(ndlp->vport, KERN_INFO,
-LOG_NVME_DISC,
-"6014 Rebinding lport to "
-"rport wwpn 0x%llx, "
-"Data: x%x x%x x%x x%06x\n",
-remote_port->port_name,
-remote_port->port_id,
-remote_port->port_role,
-ndlp->nlp_type,
-ndlp->nlp_DID);
+   if (ndlp->nrport == remote_port->private) {
+   /* Same remoteport.  Just reuse. */
+   lpfc_printf_vlog(ndlp->vport, KERN_INFO,
+LOG_NVME_DISC,
+"6014 Rebinding lport to "
+"remoteport %p wwpn 0x%llx, "
+"Data: x%x x%x %p x%x x%06x\n",

[PATCH 4/9] lpfc: Increase SCSI CQ and WQ sizes.

2017-12-08 Thread James Smart

Increase the sizes of the SCSI WQs and CQs so that SCSI operation is
similar to that used by NVME. The size increase is restricted to
those newer adapters that can support the larger WQE size, and thus
the bigger queue sizes.
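
The selection this patch makes for the FCP work queue can be sketched as a
small pure function. The SKETCH_* constants carry placeholder values chosen
for illustration and are not taken from the driver headers; the struct and
function names are hypothetical as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only; not the driver's values. */
#define SKETCH_DEFAULT_PAGE_SIZE    4096u
#define SKETCH_EXPANDED_PAGE_SIZE  16384u
#define SKETCH_WQE128_SIZE           128u
#define SKETCH_WQE_EXP_COUNT        1024u

struct sketch_qparams {
    uint32_t page_size;
    uint32_t entry_size;
    uint32_t entry_count;
};

/*
 * Pick WQ parameters: adapters that embed the cdb in the WQE get the
 * larger entry size and the expanded queue; all others keep the
 * defaults reported by the adapter (wq_esize, wq_ecount).
 */
struct sketch_qparams sketch_fcp_wq_params(bool embed_io,
                                           uint32_t wq_esize,
                                           uint32_t wq_ecount)
{
    struct sketch_qparams p;

    if (embed_io) {
        p.page_size = SKETCH_EXPANDED_PAGE_SIZE;
        p.entry_size = SKETCH_WQE128_SIZE;
        p.entry_count = SKETCH_WQE_EXP_COUNT;
    } else {
        p.page_size = SKETCH_DEFAULT_PAGE_SIZE;
        p.entry_size = wq_esize;
        p.entry_count = wq_ecount;
    }
    return p;
}
```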

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
---
 drivers/scsi/lpfc/lpfc_init.c | 65 ---
 drivers/scsi/lpfc/lpfc_nvme.h |  2 --
 drivers/scsi/lpfc/lpfc_sli4.h |  6 ++--
 3 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 44a98bc913f5..f539c554588c 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -7983,9 +7983,9 @@ lpfc_alloc_nvme_wq_cq(struct lpfc_hba *phba, int wqidx)
 {
struct lpfc_queue *qdesc;
 
-   qdesc = lpfc_sli4_queue_alloc(phba, LPFC_NVME_PAGE_SIZE,
+   qdesc = lpfc_sli4_queue_alloc(phba, LPFC_EXPANDED_PAGE_SIZE,
  phba->sli4_hba.cq_esize,
- LPFC_NVME_CQSIZE);
+ LPFC_CQE_EXP_COUNT);
if (!qdesc) {
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"0508 Failed allocate fast-path NVME CQ (%d)\n",
@@ -7994,8 +7994,8 @@ lpfc_alloc_nvme_wq_cq(struct lpfc_hba *phba, int wqidx)
}
phba->sli4_hba.nvme_cq[wqidx] = qdesc;
 
-   qdesc = lpfc_sli4_queue_alloc(phba, LPFC_NVME_PAGE_SIZE,
- LPFC_WQE128_SIZE, LPFC_NVME_WQSIZE);
+   qdesc = lpfc_sli4_queue_alloc(phba, LPFC_EXPANDED_PAGE_SIZE,
+ LPFC_WQE128_SIZE, LPFC_WQE_EXP_COUNT);
if (!qdesc) {
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"0509 Failed allocate fast-path NVME WQ (%d)\n",
@@ -8011,12 +8011,18 @@ static int
 lpfc_alloc_fcp_wq_cq(struct lpfc_hba *phba, int wqidx)
 {
struct lpfc_queue *qdesc;
-   uint32_t wqesize;
 
/* Create Fast Path FCP CQs */
-   qdesc = lpfc_sli4_queue_alloc(phba, LPFC_DEFAULT_PAGE_SIZE,
- phba->sli4_hba.cq_esize,
- phba->sli4_hba.cq_ecount);
+   if (phba->fcp_embed_io)
+   /* Increase the CQ size when WQEs contain an embedded cdb */
+   qdesc = lpfc_sli4_queue_alloc(phba, LPFC_EXPANDED_PAGE_SIZE,
+ phba->sli4_hba.cq_esize,
+ LPFC_CQE_EXP_COUNT);
+
+   else
+   qdesc = lpfc_sli4_queue_alloc(phba, LPFC_DEFAULT_PAGE_SIZE,
+ phba->sli4_hba.cq_esize,
+ phba->sli4_hba.cq_ecount);
if (!qdesc) {
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"0499 Failed allocate fast-path FCP CQ (%d)\n", wqidx);
@@ -8025,10 +8031,15 @@ lpfc_alloc_fcp_wq_cq(struct lpfc_hba *phba, int wqidx)
phba->sli4_hba.fcp_cq[wqidx] = qdesc;
 
/* Create Fast Path FCP WQs */
-   wqesize = (phba->fcp_embed_io) ?
-   LPFC_WQE128_SIZE : phba->sli4_hba.wq_esize;
-   qdesc = lpfc_sli4_queue_alloc(phba, LPFC_DEFAULT_PAGE_SIZE,
- wqesize, phba->sli4_hba.wq_ecount);
+   if (phba->fcp_embed_io)
+   /* Increase the WQ size when WQEs contain an embedded cdb */
+   qdesc = lpfc_sli4_queue_alloc(phba, LPFC_EXPANDED_PAGE_SIZE,
+ LPFC_WQE128_SIZE,
+ LPFC_WQE_EXP_COUNT);
+   else
+   qdesc = lpfc_sli4_queue_alloc(phba, LPFC_DEFAULT_PAGE_SIZE,
+ phba->sli4_hba.wq_esize,
+ phba->sli4_hba.wq_ecount);
if (!qdesc) {
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
"0503 Failed allocate fast-path FCP WQ (%d)\n",
@@ -12216,7 +12227,6 @@ int
 lpfc_fof_queue_create(struct lpfc_hba *phba)
 {
struct lpfc_queue *qdesc;
-   uint32_t wqesize;
 
/* Create FOF EQ */
qdesc = lpfc_sli4_queue_alloc(phba, LPFC_DEFAULT_PAGE_SIZE,
@@ -12230,21 +12240,32 @@ lpfc_fof_queue_create(struct lpfc_hba *phba)
if (phba->cfg_fof) {
 
/* Create OAS CQ */
-   qdesc = lpfc_sli4_queue_alloc(phba, LPFC_DEFAULT_PAGE_SIZE,
- phba->sli4_hba.cq_esize,
- phba->sli4_hba.cq_ecount);
+   if (phba->fcp_embed_io)
+   qdesc = lpfc_sli4_queue_alloc(phba,
+ LPFC_EXPANDED_PAGE_SIZE,
+ phba->sli4_hba.cq_esize,

[PATCH 8/9] lpfc: Beef up stat counters for debug

2017-12-08 Thread James Smart

If log verbose is not turned on, it's hard to tell when
certain error paths get hit. Add stats counters and
corresponding logic to debugfs/sysfs to aid in understanding
which paths were traversed.
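
A minimal sketch of the pattern, assuming C11 atomics in user space in place
of the kernel's atomic_t: each path of interest bumps a counter, and a show
routine formats the counters on request. The struct, the field names and the
functions are hypothetical; only the output format string is modelled on the
ones in the diff below.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

struct sketch_ls_stats {
    atomic_uint xmt;    /* LS requests transmitted */
    atomic_uint cmpl;   /* LS completions seen */
    atomic_uint abort;  /* LS aborts issued */
};

void sketch_stats_init(struct sketch_ls_stats *st)
{
    atomic_init(&st->xmt, 0);
    atomic_init(&st->cmpl, 0);
    atomic_init(&st->abort, 0);
}

/* Called on the path being counted; cheap enough to stay enabled. */
void sketch_count(atomic_uint *ctr)
{
    atomic_fetch_add(ctr, 1);
}

/* Format the counters; returns what snprintf() returns. */
int sketch_stats_show(struct sketch_ls_stats *st, char *buf, size_t size)
{
    return snprintf(buf, size, "LS: Xmt %08x Cmpl %08x Abort %08x\n",
                    atomic_load(&st->xmt),
                    atomic_load(&st->cmpl),
                    atomic_load(&st->abort));
}
```

Unlike verbose logging, the counters cost one atomic increment per event and
can be read after the fact, which is the property the commit message is after.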

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
---
 drivers/scsi/lpfc/lpfc_attr.c| 46 -
 drivers/scsi/lpfc/lpfc_debugfs.c | 49 +---
 drivers/scsi/lpfc/lpfc_nvme.c| 43 +++
 drivers/scsi/lpfc/lpfc_nvme.h| 13 ++-
 drivers/scsi/lpfc/lpfc_nvmet.c   | 34 
 drivers/scsi/lpfc/lpfc_nvmet.h   |  5 
 6 files changed, 171 insertions(+), 19 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 0eef5aa52fc0..797bb42a6306 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -148,6 +148,7 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
struct lpfc_hba   *phba = vport->phba;
struct lpfc_nvmet_tgtport *tgtp;
struct nvme_fc_local_port *localport;
+   struct lpfc_nvme_lport *lport;
struct lpfc_nodelist *ndlp;
struct nvme_fc_remote_port *nrport;
uint64_t data1, data2, data3, tot;
@@ -198,10 +199,15 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
}
 
len += snprintf(buf+len, PAGE_SIZE-len,
-   "LS: Xmt %08x Drop %08x Cmpl %08x Err %08x\n",
+   "LS: Xmt %08x Drop %08x Cmpl %08x\n",
atomic_read(>xmt_ls_rsp),
atomic_read(>xmt_ls_drop),
-   atomic_read(>xmt_ls_rsp_cmpl),
+   atomic_read(>xmt_ls_rsp_cmpl));
+
+   len += snprintf(buf + len, PAGE_SIZE - len,
+   "LS: RSP Abort %08x xb %08x Err %08x\n",
+   atomic_read(>xmt_ls_rsp_aborted),
+   atomic_read(>xmt_ls_rsp_xb_set),
atomic_read(>xmt_ls_rsp_error));
 
len += snprintf(buf+len, PAGE_SIZE-len,
@@ -236,6 +242,12 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
atomic_read(>xmt_fcp_rsp_drop));
 
len += snprintf(buf+len, PAGE_SIZE-len,
+   "FCP Rsp Abort: %08x xb %08x xricqe  %08x\n",
+   atomic_read(>xmt_fcp_rsp_aborted),
+   atomic_read(>xmt_fcp_rsp_xb_set),
+   atomic_read(>xmt_fcp_xri_abort_cqe));
+
+   len += snprintf(buf + len, PAGE_SIZE - len,
"ABORT: Xmt %08x Cmpl %08x\n",
atomic_read(>xmt_fcp_abort),
atomic_read(>xmt_fcp_abort_cmpl));
@@ -265,6 +277,7 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
}
 
localport = vport->localport;
+   lport = (struct lpfc_nvme_lport *)localport->private;
if (!localport) {
len = snprintf(buf, PAGE_SIZE,
"NVME Initiator x%llx is not allocated\n",
@@ -347,9 +360,16 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
 
len += snprintf(buf + len, PAGE_SIZE - len, "\nNVME Statistics\n");
len += snprintf(buf+len, PAGE_SIZE-len,
-   "LS: Xmt %016x Cmpl %016x\n",
+   "LS: Xmt %010x Cmpl %010x Abort %08x\n",
atomic_read(>fc4NvmeLsRequests),
-   atomic_read(>fc4NvmeLsCmpls));
+   atomic_read(>fc4NvmeLsCmpls),
+   atomic_read(>xmt_ls_abort));
+
+   len += snprintf(buf + len, PAGE_SIZE - len,
+   "LS XMIT: Err %08x  CMPL: xb %08x Err %08x\n",
+   atomic_read(>xmt_ls_err),
+   atomic_read(>cmpl_ls_xb),
+   atomic_read(>cmpl_ls_err));
 
tot = atomic_read(>fc4NvmeIoCmpls);
data1 = atomic_read(>fc4NvmeInputRequests);
@@ -360,8 +380,22 @@ lpfc_nvme_info_show(struct device *dev, struct 
device_attribute *attr,
data1, data2, data3);
 
len += snprintf(buf+len, PAGE_SIZE-len,
-   "Cmpl %016llx Outstanding %016llx\n",
-   tot, (data1 + data2 + data3) - tot);
+   "noxri %08x nondlp %08x qdepth %08x "
+   "wqerr %08x\n",
+   atomic_read(>xmt_fcp_noxri),
+   atomic_read(>xmt_fcp_bad_ndlp),
+   atomic_read(>xmt_fcp_qdepth),
+   atomic_read(>xmt_fcp_wqerr));
+
+   len += snprintf(buf + len,

[PATCH 6/9] lpfc: Fix issues connecting with nvme initiator

2017-12-08 Thread James Smart

In the lpfc discovery engine, when operating as an nvme target, the
driver could be performing mailbox io with the adapter for port login
when an NVME PRLI is received from the host. Rather than queue the
PRLI and eventually send a response after the mailbox traffic, the
driver rejected the io with an error response.

Turns out this particular initiator didn't like the rejection values
(unable to process command/command in progress) so it never attempted
a retry of the PRLI. Thus the host never established nvme connectivity
with the lpfc target.

By changing the rejection values (to Logical Busy/nothing more), the
initiator accepted the response and would retry the PRLI, resulting in
nvme connectivity.
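
The interaction can be modelled with a small sketch. The enums below are
symbolic stand-ins for the LS_RJT reason and explanation pairs named above
and carry no wire values. The retry rule is an assumption that models only
the initiator behaviour reported in this commit message; it is not a general
statement about how initiators treat LS_RJT.

```c
#include <stdbool.h>

/* Symbolic stand-ins; no wire encoding is implied. */
enum sketch_rjt_reason {
    SKETCH_RJT_UNABLE_TPC,    /* unable to process command */
    SKETCH_RJT_LOGICAL_BSY,   /* logical busy */
};

enum sketch_rjt_expl {
    SKETCH_EXP_CMD_IN_PROGRESS,
    SKETCH_EXP_NOTHING_MORE,
};

struct sketch_ls_rjt {
    enum sketch_rjt_reason reason;
    enum sketch_rjt_expl expl;
};

/* Target side: answer for a PRLI that arrives while login io is pending. */
struct sketch_ls_rjt sketch_prli_busy_reject(void)
{
    struct sketch_ls_rjt r = {
        SKETCH_RJT_LOGICAL_BSY,
        SKETCH_EXP_NOTHING_MORE,
    };

    return r;
}

/* Initiator side, modelled on the behaviour described above. */
bool sketch_initiator_retries(struct sketch_ls_rjt r)
{
    return r.reason == SKETCH_RJT_LOGICAL_BSY;
}
```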

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
---
 drivers/scsi/lpfc/lpfc_nportdisc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c 
b/drivers/scsi/lpfc/lpfc_nportdisc.c
index 283382ac0456..d841aa42f607 100644
--- a/drivers/scsi/lpfc/lpfc_nportdisc.c
+++ b/drivers/scsi/lpfc/lpfc_nportdisc.c
@@ -1603,8 +1603,8 @@ lpfc_rcv_prli_reglogin_issue(struct lpfc_vport *vport,
 * rpi registration does complete.
 */
memset(, 0, sizeof(struct ls_rjt));
-   stat.un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC;
-   stat.un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS;
+   stat.un.b.lsRjtRsnCode = LSRJT_LOGICAL_BSY;
+   stat.un.b.lsRjtRsnCodeExp = LSEXP_NOTHING_MORE;
lpfc_els_rsp_reject(vport, stat.un.lsRjtError, cmdiocb,
ndlp, NULL);
return ndlp->nlp_state;
-- 
2.13.1

[PATCH 9/9] lpfc: update driver version to 11.4.0.6

2017-12-08 Thread James Smart

Update the driver version to 11.4.0.6

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
---
 drivers/scsi/lpfc/lpfc_version.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_version.h b/drivers/scsi/lpfc/lpfc_version.h
index cc2f5cec98c5..c232bf0e8998 100644
--- a/drivers/scsi/lpfc/lpfc_version.h
+++ b/drivers/scsi/lpfc/lpfc_version.h
@@ -20,7 +20,7 @@
  * included with this package. *
  ***/
 
-#define LPFC_DRIVER_VERSION "11.4.0.5"
+#define LPFC_DRIVER_VERSION "11.4.0.6"
 #define LPFC_DRIVER_NAME   "lpfc"
 
 /* Used for SLI 2/3 */
-- 
2.13.1

[PATCH 5/9] lpfc: Fix SCSI LUN discovery when SCSI and NVME enabled

2017-12-08 Thread James Smart

When enabled for both SCSI and NVME support, and connected pt2pt to a
SCSI only target, the driver nodelist entry for the remote port is
left in PRLI_ISSUE state and no SCSI LUNs are discovered. Discovery
works fine if the driver is only configured for SCSI support.

Error was due to some of the prli points still reflecting the need
to send only 1 PRLI. On a lot of fabric configs, targets were NVME
only, which meant the fabric-reported protocol attributes were only
telling the driver one protocol or the other. Thus things worked
fine. With pt2pt, the driver must send a PRLI for both protocols as
there are no hints on what the target supports. Thus pt2pt targets
were hitting the multiple PRLI issues.

Complete the dual PRLI support. Track explicitly whether scsi (fcp)
or nvme prli's have been sent. Accurately track protocol support
detected on each node as reported by the fabric or probed by PRLI
traffic.
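
A minimal sketch of the bookkeeping described above, assuming one outstanding
PRLI counter per node and one per vport. The bit values, structs and function
names are hypothetical, and the locking that the driver performs around the
counters is omitted.

```c
#include <stdint.h>

#define SKETCH_FC4_FCP   0x1u   /* hypothetical bit values */
#define SKETCH_FC4_NVME  0x2u

struct sketch_ndlp {
    uint32_t fc4_type;       /* protocols to probe on this node */
    uint32_t fc4_prli_sent;  /* PRLIs still outstanding for this node */
};

struct sketch_vport {
    uint32_t fc_prli_sent;   /* PRLIs outstanding across the vport */
};

/* Issue one PRLI per requested protocol; returns how many were sent. */
int sketch_issue_prli(struct sketch_vport *vp, struct sketch_ndlp *nd)
{
    int sent = 0;
    uint32_t bit;

    for (bit = SKETCH_FC4_FCP; bit <= SKETCH_FC4_NVME; bit <<= 1) {
        if (!(nd->fc4_type & bit))
            continue;
        /* Count before the request can possibly complete. */
        vp->fc_prli_sent++;
        nd->fc4_prli_sent++;
        sent++;
    }
    return sent;
}

/* Completion: returns 1 once no PRLI is outstanding for the node. */
int sketch_cmpl_prli(struct sketch_vport *vp, struct sketch_ndlp *nd)
{
    vp->fc_prli_sent--;
    nd->fc4_prli_sent--;
    return nd->fc4_prli_sent == 0;
}
```

In the pt2pt case both bits are set because there is no fabric hint, so two
PRLIs go out and the node is only finished when both have completed.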

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
---
 drivers/scsi/lpfc/lpfc_ct.c|  1 +
 drivers/scsi/lpfc/lpfc_els.c   | 30 --
 drivers/scsi/lpfc/lpfc_nportdisc.c | 30 +-
 3 files changed, 34 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c
index 2c1fe5ab3128..9d20d2c208c7 100644
--- a/drivers/scsi/lpfc/lpfc_ct.c
+++ b/drivers/scsi/lpfc/lpfc_ct.c
@@ -471,6 +471,7 @@ lpfc_prep_node_fc4type(struct lpfc_vport *vport, uint32_t 
Did, uint8_t fc4_type)
"Parse GID_FTrsp: did:x%x flg:x%x x%x",
Did, ndlp->nlp_flag, vport->fc_flag);
 
+   ndlp->nlp_fc4_type &= ~(NLP_FC4_FCP | NLP_FC4_NVME);
/* By default, the driver expects to support FCP FC4 */
if (fc4_type == FC_TYPE_FCP)
ndlp->nlp_fc4_type |= NLP_FC4_FCP;
diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 6ffd65a935c4..dfb21d9efb0d 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -2094,6 +2094,10 @@ lpfc_cmpl_els_prli(struct lpfc_hba *phba, struct 
lpfc_iocbq *cmdiocb,
ndlp = (struct lpfc_nodelist *) cmdiocb->context1;
spin_lock_irq(shost->host_lock);
ndlp->nlp_flag &= ~NLP_PRLI_SND;
+
+   /* Driver supports multiple FC4 types.  Counters matter. */
+   vport->fc_prli_sent--;
+   ndlp->fc4_prli_sent--;
spin_unlock_irq(shost->host_lock);
 
lpfc_debugfs_disc_trc(vport, LPFC_DISC_TRC_ELS_CMD,
@@ -2101,9 +2105,6 @@ lpfc_cmpl_els_prli(struct lpfc_hba *phba, struct 
lpfc_iocbq *cmdiocb,
irsp->ulpStatus, irsp->un.ulpWord[4],
ndlp->nlp_DID);
 
-   /* Ddriver supports multiple FC4 types.  Counters matter. */
-   vport->fc_prli_sent--;
-
/* PRLI completes to NPort  */
lpfc_printf_vlog(vport, KERN_INFO, LOG_ELS,
 "0103 PRLI completes to NPort x%06x "
@@ -2117,7 +2118,6 @@ lpfc_cmpl_els_prli(struct lpfc_hba *phba, struct 
lpfc_iocbq *cmdiocb,
 
if (irsp->ulpStatus) {
/* Check for retry */
-   ndlp->fc4_prli_sent--;
if (lpfc_els_retry(phba, cmdiocb, rspiocb)) {
/* ELS command is being retried */
goto out;
@@ -2196,6 +2196,15 @@ lpfc_issue_els_prli(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp,
ndlp->nlp_fc4_type |= NLP_FC4_NVME;
local_nlp_type = ndlp->nlp_fc4_type;
 
+   /* This routine will issue 1 or 2 PRLIs, so zero all the ndlp
+* fields here before any of them can complete.
+*/
+   ndlp->nlp_type &= ~(NLP_FCP_TARGET | NLP_FCP_INITIATOR);
+   ndlp->nlp_type &= ~(NLP_NVME_TARGET | NLP_NVME_INITIATOR);
+   ndlp->nlp_fcp_info &= ~NLP_FCP_2_DEVICE;
+   ndlp->nlp_flag &= ~NLP_FIRSTBURST;
+   ndlp->nvme_fb_size = 0;
+
  send_next_prli:
if (local_nlp_type & NLP_FC4_FCP) {
/* Payload is 4 + 16 = 20 x14 bytes. */
@@ -2304,6 +2313,13 @@ lpfc_issue_els_prli(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp,
elsiocb->iocb_cmpl = lpfc_cmpl_els_prli;
spin_lock_irq(shost->host_lock);
ndlp->nlp_flag |= NLP_PRLI_SND;
+
+   /* The vport counters are used for lpfc_scan_finished, but
+* the ndlp is used to track outstanding PRLIs for different
+* FC4 types.
+*/
+   vport->fc_prli_sent++;
+   ndlp->fc4_prli_sent++;
spin_unlock_irq(shost->host_lock);
if (lpfc_sli_issue_iocb(phba, LPFC_ELS_RING, elsiocb, 0) ==
IOCB_ERROR) {
@@ -2314,12 +2330,6 @@ lpfc_issue_els_prli(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp,
return 1;
}
 
-   /* The vport counters are used for lpfc_scan_finished, but
-* the ndlp is used to track outstanding PRLIs for different
-* FC4

[PATCH 1/9] lpfc: Fix random heartbeat timeouts during heavy IO

2017-12-08 Thread James Smart

NVME targets appear to randomly disconnect from the initiator
when running heavy IO.

The error occurred because the host's aggregate io load (across all
controllers) was beyond the maximum exchange count for nvme on the
adapter. The driver was properly returning a resource busy status,
but the io load was so great that heartbeat commands would be bounced
and not have a successful retry within the fuzz amount for the
nvme heartbeat (yes, a very high io load!). Thus the target was
terminating the controller due to a keep alive failure.

Resolve by reserving a few exchanges (by counters) which can be
used when the adapter is out of normal exchanges and the command
is an NVME heartbeat command. Because counters are used, as soon as
any other exchange completes while the reserved command is
outstanding, the counters are adjusted and the reserved count is
replenished. The heartbeat completes execution in a normal fashion.
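
The counter-based reservation can be sketched as below. This is a model under
stated assumptions, not the driver code: the pool struct, its field names and
both functions are hypothetical, reserved is assumed to be no larger than
total, and locking is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_xchg_pool {
    uint32_t total;     /* exchanges available for nvme on the adapter */
    uint32_t reserved;  /* held back for heartbeat commands */
    uint32_t in_use;    /* exchanges currently outstanding */
};

/*
 * Normal commands may use total - reserved exchanges; a heartbeat may
 * dip into the reserve. Returns false when the caller should report
 * resource busy.
 */
bool sketch_xchg_get(struct sketch_xchg_pool *p, bool is_heartbeat)
{
    uint32_t limit = is_heartbeat ? p->total : p->total - p->reserved;

    if (p->in_use >= limit)
        return false;
    p->in_use++;
    return true;
}

/* Any completion lowers in_use, which implicitly refills the reserve. */
void sketch_xchg_put(struct sketch_xchg_pool *p)
{
    if (p->in_use > 0)
        p->in_use--;
}
```

Because the reserve is only a threshold on a counter, no specific exchange is
tied to heartbeats. The first completion of any command restores the headroom.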

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
---
 drivers/scsi/lpfc/lpfc.h  |  2 ++
 drivers/scsi/lpfc/lpfc_init.c | 16 ++-
 drivers/scsi/lpfc/lpfc_nvme.c | 66 +--
 drivers/scsi/lpfc/lpfc_nvme.h |  1 +
 4 files changed, 63 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index dd2191c83052..61fb46da05d4 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -945,6 +945,8 @@ struct lpfc_hba {
struct list_head lpfc_nvme_buf_list_get;
struct list_head lpfc_nvme_buf_list_put;
uint32_t total_nvme_bufs;
+   uint32_t get_nvme_bufs;
+   uint32_t put_nvme_bufs;
struct list_head lpfc_iocb_list;
uint32_t total_iocbq_bufs;
struct list_head active_rrq_list;
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index fa211550a32a..44a98bc913f5 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -1034,6 +1034,7 @@ lpfc_hba_down_post_s4(struct lpfc_hba *phba)
LIST_HEAD(nvmet_aborts);
unsigned long iflag = 0;
struct lpfc_sglq *sglq_entry = NULL;
+   int cnt;
 
 
lpfc_sli_hbqbuf_free_all(phba);
@@ -1090,11 +1091,14 @@ lpfc_hba_down_post_s4(struct lpfc_hba *phba)
spin_unlock_irqrestore(>scsi_buf_list_put_lock, iflag);
 
if (phba->cfg_enable_fc4_type & LPFC_ENABLE_NVME) {
+   cnt = 0;
list_for_each_entry_safe(psb, psb_next, _aborts, list) {
psb->pCmd = NULL;
psb->status = IOSTAT_SUCCESS;
+   cnt++;
}
spin_lock_irqsave(>nvme_buf_list_put_lock, iflag);
+   phba->put_nvme_bufs += cnt;
list_splice(_aborts, >lpfc_nvme_buf_list_put);
spin_unlock_irqrestore(>nvme_buf_list_put_lock, iflag);
 
@@ -3339,6 +3343,7 @@ lpfc_nvme_free(struct lpfc_hba *phba)
list_for_each_entry_safe(lpfc_ncmd, lpfc_ncmd_next,
 >lpfc_nvme_buf_list_put, list) {
list_del(_ncmd->list);
+   phba->put_nvme_bufs--;
dma_pool_free(phba->lpfc_sg_dma_buf_pool, lpfc_ncmd->data,
  lpfc_ncmd->dma_handle);
kfree(lpfc_ncmd);
@@ -3350,6 +3355,7 @@ lpfc_nvme_free(struct lpfc_hba *phba)
list_for_each_entry_safe(lpfc_ncmd, lpfc_ncmd_next,
 >lpfc_nvme_buf_list_get, list) {
list_del(_ncmd->list);
+   phba->get_nvme_bufs--;
dma_pool_free(phba->lpfc_sg_dma_buf_pool, lpfc_ncmd->data,
  lpfc_ncmd->dma_handle);
kfree(lpfc_ncmd);
@@ -3754,9 +3760,11 @@ lpfc_sli4_nvme_sgl_update(struct lpfc_hba *phba)
uint16_t i, lxri, els_xri_cnt;
uint16_t nvme_xri_cnt, nvme_xri_max;
LIST_HEAD(nvme_sgl_list);
-   int rc;
+   int rc, cnt;
 
phba->total_nvme_bufs = 0;
+   phba->get_nvme_bufs = 0;
+   phba->put_nvme_bufs = 0;
 
if (!(phba->cfg_enable_fc4_type & LPFC_ENABLE_NVME))
return 0;
@@ -3780,6 +3788,9 @@ lpfc_sli4_nvme_sgl_update(struct lpfc_hba *phba)
spin_lock(>nvme_buf_list_put_lock);
list_splice_init(>lpfc_nvme_buf_list_get, _sgl_list);
list_splice(>lpfc_nvme_buf_list_put, _sgl_list);
+   cnt = phba->get_nvme_bufs + phba->put_nvme_bufs;
+   phba->get_nvme_bufs = 0;
+   phba->put_nvme_bufs = 0;
spin_unlock(>nvme_buf_list_put_lock);
spin_unlock_irq(>nvme_buf_list_get_lock);
 
@@ -3824,6 +3835,7 @@ lpfc_sli4_nvme_sgl_update(struct lpfc_hba *phba)
spin_lock_irq(>nvme_buf_list_get_lock);
spin_lock(>nvme_buf_list_put_lock);
list_splice_init(_sgl_list, >lpfc_nvme_buf_list_get);
+   phba->get_nvme_bufs = cnt;
INIT_LIST_HEAD(>lpfc_nvme_buf_list_put);
spin_unlock(>nvme_buf_list_put_lock);

[PATCH 3/9] lpfc: Fix receive PRLI handling

2017-12-08 Thread James Smart

Handling a rcv'ed PRLI incorrectly can cause the ndlp to end up
in the wrong state or the driver to ACC a PRLI when it should
send LS_RJT.

The cause was the driver not properly looking at the PRLI type and
not taking the multiple protocol support into consideration.

Resolved by adding checks in the various PRLI receive points to
validate PRLI type and reject if not valid for the enabled protocols
and mode (host vs target).
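
The check can be sketched as a pure predicate. The field names nvmet_support
and nvmei_support follow the diff below; the enum, the struct and the
function name are hypothetical, and the sketch only distinguishes the two
PRLI types discussed here.

```c
#include <stdbool.h>

enum sketch_prli_cmd {
    SKETCH_PRLI_FCP,   /* plain (FCP) PRLI */
    SKETCH_PRLI_NVME,  /* NVME PRLI */
};

struct sketch_port {
    bool nvmet_support;  /* port runs as an NVME target */
    bool nvmei_support;  /* NVME initiator support is enabled */
};

/* Returns true to accept (ACC) the PRLI, false to reject (LS_RJT). */
bool sketch_prli_supported(const struct sketch_port *p,
                           enum sketch_prli_cmd cmd)
{
    if (p->nvmet_support)
        return cmd == SKETCH_PRLI_NVME;  /* target: NVME PRLI only */

    /* Initiator mode: an NVME PRLI needs NVME initiator support. */
    if (!p->nvmei_support && cmd == SKETCH_PRLI_NVME)
        return false;

    return true;
}
```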

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
---
 drivers/scsi/lpfc/lpfc_els.c   |  7 -
 drivers/scsi/lpfc/lpfc_nportdisc.c | 54 +-
 2 files changed, 47 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 4a14f3c82a07..6ffd65a935c4 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -8063,13 +8063,6 @@ lpfc_els_unsol_buffer(struct lpfc_hba *phba, struct 
lpfc_sli_ring *pring,
rjt_exp = LSEXP_NOTHING_MORE;
break;
}
-
-   /* NVMET accepts NVME PRLI only.  Reject FCP PRLI */
-   if (cmd == ELS_CMD_PRLI && phba->nvmet_support) {
-   rjt_err = LSRJT_CMD_UNSUPPORTED;
-   rjt_exp = LSEXP_REQ_UNSUPPORTED;
-   break;
-   }
lpfc_disc_state_machine(vport, ndlp, elsiocb, NLP_EVT_RCV_PRLI);
break;
case ELS_CMD_LIRR:
diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c 
b/drivers/scsi/lpfc/lpfc_nportdisc.c
index b6957d944b9a..df050b211e0b 100644
--- a/drivers/scsi/lpfc/lpfc_nportdisc.c
+++ b/drivers/scsi/lpfc/lpfc_nportdisc.c
@@ -727,6 +727,41 @@ lpfc_rcv_logo(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp,
return 0;
 }
 
+static uint32_t
+lpfc_rcv_prli_support_check(struct lpfc_vport *vport,
+   struct lpfc_nodelist *ndlp,
+   struct lpfc_iocbq *cmdiocb)
+{
+   struct ls_rjt stat;
+   uint32_t *payload;
+   uint32_t cmd;
+
+   payload = ((struct lpfc_dmabuf *)cmdiocb->context2)->virt;
+   cmd = *payload;
+   if (vport->phba->nvmet_support) {
+   /* Must be a NVME PRLI */
+   if (cmd ==  ELS_CMD_PRLI)
+   goto out;
+   } else {
+   /* Initiator mode. */
+   if (!vport->nvmei_support && (cmd == ELS_CMD_NVMEPRLI))
+   goto out;
+   }
+   return 1;
+out:
+   lpfc_printf_vlog(vport, KERN_WARNING, LOG_NVME_DISC,
+"6115 Rcv PRLI (%x) check failed: ndlp rpi %d "
+"state x%x flags x%x\n",
+cmd, ndlp->nlp_rpi, ndlp->nlp_state,
+ndlp->nlp_flag);
+   memset(, 0, sizeof(struct ls_rjt));
+   stat.un.b.lsRjtRsnCode = LSRJT_CMD_UNSUPPORTED;
+   stat.un.b.lsRjtRsnCodeExp = LSEXP_REQ_UNSUPPORTED;
+   lpfc_els_rsp_reject(vport, stat.un.lsRjtError, cmdiocb,
+   ndlp, NULL);
+   return 0;
+}
+
 static void
 lpfc_rcv_prli(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
  struct lpfc_iocbq *cmdiocb)
@@ -1373,7 +1408,8 @@ lpfc_rcv_prli_adisc_issue(struct lpfc_vport *vport, 
struct lpfc_nodelist *ndlp,
 {
struct lpfc_iocbq *cmdiocb = (struct lpfc_iocbq *) arg;
 
-   lpfc_els_rsp_prli_acc(vport, cmdiocb, ndlp);
+   if (lpfc_rcv_prli_support_check(vport, ndlp, cmdiocb))
+   lpfc_els_rsp_prli_acc(vport, cmdiocb, ndlp);
return ndlp->nlp_state;
 }
 
@@ -1544,6 +1580,9 @@ lpfc_rcv_prli_reglogin_issue(struct lpfc_vport *vport,
struct lpfc_iocbq *cmdiocb = (struct lpfc_iocbq *) arg;
struct ls_rjt stat;
 
+   if (!lpfc_rcv_prli_support_check(vport, ndlp, cmdiocb)) {
+   return ndlp->nlp_state;
+   }
if (vport->phba->nvmet_support) {
/* NVME Target mode.  Handle and respond to the PRLI and
 * transition to UNMAPPED provided the RPI has completed
@@ -1558,11 +1597,6 @@ lpfc_rcv_prli_reglogin_issue(struct lpfc_vport *vport,
 * to prevent an illegal state transition when the
 * rpi registration does complete.
 */
-   lpfc_printf_vlog(vport, KERN_WARNING, LOG_NVME_DISC,
-"6115 NVMET ndlp rpi %d state "
-"unknown, state x%x flags x%08x\n",
-ndlp->nlp_rpi, ndlp->nlp_state,
-ndlp->nlp_flag);
memset(&stat, 0, sizeof(struct ls_rjt));
stat.un.b.lsRjtRsnCode = LSRJT_UNABLE_TPC;
stat.un.b.lsRjtRsnCodeExp = LSEXP_CMD_IN_PROGRESS;
@@ -1573,7 +1607,6 @@ lpfc_rcv_prli_reglogin_issue(struct

[PATCH 2/9] lpfc: Fix -EOVERFLOW behavior for NVMET and defer_rcv

2017-12-08 Thread James Smart

The driver is all set to handle the defer_rcv api for the
nvmet_fc transport, yet didn't properly recognize the return
status when the defer_rcv occurred. The driver treated it simply
as an error and aborted the io. Several residual issues occurred
at that point.

Finish the defer_rcv support: recognize the return status when
the io request is being handled in a deferred style, which stops
the rogue aborts, and replenish the async cmd rcv buffer in the
deferred receive if needed.
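
The status handling described above can be sketched in user-space C
roughly as follows. This is only an illustration, not the driver code:
the names rcv_action and classify_rcv_status are hypothetical, and only
the special meaning of a zero and an -EOVERFLOW return is taken from
the patch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome of handing a received FCP command to the transport. */
enum rcv_action {
	RCV_REPOST_NOW,	/* command accepted: repost the receive buffer now */
	RCV_DEFERRED,	/* buffer stays in use until the defer_rcv callback */
	RCV_DROP,	/* genuine error: drop the IO */
};

/* Map the transport's return status onto the action taken (assumed model). */
static enum rcv_action classify_rcv_status(int rc)
{
	if (rc == 0)
		return RCV_REPOST_NOW;
	if (rc == -EOVERFLOW)
		return RCV_DEFERRED;
	return RCV_DROP;
}
```

The point of the sketch is that -EOVERFLOW is a third outcome of its
own rather than one more error code falling into the drop path.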

Signed-off-by: Dick Kennedy 
Signed-off-by: James Smart 
---
 drivers/scsi/lpfc/lpfc_nvmet.c | 24 ++--
 drivers/scsi/lpfc/lpfc_nvmet.h |  1 +
 drivers/scsi/lpfc/lpfc_sli.c   | 24 +---
 3 files changed, 36 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c
index d80cd1def3b9..02a1cfa10f72 100644
--- a/drivers/scsi/lpfc/lpfc_nvmet.c
+++ b/drivers/scsi/lpfc/lpfc_nvmet.c
@@ -38,6 +38,7 @@
 
 #include <../drivers/nvme/host/nvme.h>
 #include 
+#include 
 
 #include "lpfc_version.h"
 #include "lpfc_hw4.h"
@@ -218,6 +219,7 @@ lpfc_nvmet_ctxbuf_post(struct lpfc_hba *phba, struct 
lpfc_nvmet_ctxbuf *ctx_buf)
ctxp->entry_cnt = 1;
ctxp->flag = 0;
ctxp->ctxbuf = ctx_buf;
+   ctxp->rqb_buffer = (void *)nvmebuf;
spin_lock_init(&ctxp->ctxlock);
 
 #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
@@ -253,6 +255,17 @@ lpfc_nvmet_ctxbuf_post(struct lpfc_hba *phba, struct 
lpfc_nvmet_ctxbuf *ctx_buf)
return;
}
 
+   /* Processing of FCP command is deferred */
+   if (rc == -EOVERFLOW) {
+   lpfc_nvmeio_data(phba,
+"NVMET RCV BUSY: xri x%x sz %d "
+"from %06x\n",
+oxid, size, sid);
+   /* defer repost rcv buffer till .defer_rcv callback */
+   ctxp->flag &= ~LPFC_NVMET_DEFER_RCV_REPOST;
+   atomic_inc(&tgtp->rcv_fcp_cmd_out);
+   return;
+   }
atomic_inc(&tgtp->rcv_fcp_cmd_drop);
lpfc_printf_log(phba, KERN_ERR, LOG_NVME_IOERR,
"2582 FCP Drop IO x%x: err x%x: x%x x%x x%x\n",
@@ -921,7 +934,11 @@ lpfc_nvmet_defer_rcv(struct nvmet_fc_target_port *tgtport,
 
tgtp = phba->targetport->private;
atomic_inc(&tgtp->rcv_fcp_cmd_defer);
-   lpfc_rq_buf_free(phba, &nvmebuf->hbuf); /* repost */
+   if (ctxp->flag & LPFC_NVMET_DEFER_RCV_REPOST)
+   lpfc_rq_buf_free(phba, &nvmebuf->hbuf); /* repost */
+   else
+   nvmebuf->hrq->rqbp->rqb_free_buffer(phba, nvmebuf);
+   ctxp->flag &= ~LPFC_NVMET_DEFER_RCV_REPOST;
 }
 
 static struct nvmet_fc_target_template lpfc_tgttemplate = {
@@ -1693,6 +1710,7 @@ lpfc_nvmet_unsol_fcp_buffer(struct lpfc_hba *phba,
ctxp->entry_cnt = 1;
ctxp->flag = 0;
ctxp->ctxbuf = ctx_buf;
+   ctxp->rqb_buffer = (void *)nvmebuf;
spin_lock_init(&ctxp->ctxlock);
 
 #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
@@ -1726,6 +1744,7 @@ lpfc_nvmet_unsol_fcp_buffer(struct lpfc_hba *phba,
 
/* Process FCP command */
if (rc == 0) {
+   ctxp->rqb_buffer = NULL;
atomic_inc(&tgtp->rcv_fcp_cmd_out);
lpfc_rq_buf_free(phba, &nvmebuf->hbuf); /* repost */
return;
@@ -1737,10 +1756,11 @@ lpfc_nvmet_unsol_fcp_buffer(struct lpfc_hba *phba,
 "NVMET RCV BUSY: xri x%x sz %d from %06x\n",
 oxid, size, sid);
/* defer reposting rcv buffer till .defer_rcv callback */
-   ctxp->rqb_buffer = nvmebuf;
+   ctxp->flag |= LPFC_NVMET_DEFER_RCV_REPOST;
atomic_inc(&tgtp->rcv_fcp_cmd_out);
return;
}
+   ctxp->rqb_buffer = nvmebuf;
 
atomic_inc(&tgtp->rcv_fcp_cmd_drop);
lpfc_printf_log(phba, KERN_ERR, LOG_NVME_IOERR,
diff --git a/drivers/scsi/lpfc/lpfc_nvmet.h b/drivers/scsi/lpfc/lpfc_nvmet.h
index 6723e7b81946..03096024e073 100644
--- a/drivers/scsi/lpfc/lpfc_nvmet.h
+++ b/drivers/scsi/lpfc/lpfc_nvmet.h
@@ -126,6 +126,7 @@ struct lpfc_nvmet_rcv_ctx {
 #define LPFC_NVMET_XBUSY   0x4  /* XB bit set on IO cmpl */
 #define LPFC_NVMET_CTX_RLS 0x8  /* ctx free requested */
 #define LPFC_NVMET_ABTS_RCV 0x10  /* ABTS received on exchange */
+#define LPFC_NVMET_DEFER_RCV_REPOST 0x20  /* repost to RQ on defer rcv */
struct rqb_dmabuf *rqb_buffer;
struct lpfc_nvmet_ctxbuf *ctxbuf;
 
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 1d489b89954e..5f5528a12308 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -475,28 +475,30 @@ lpfc_sli4_rq_put(struct lpfc_queue *hq, struct lpfc_queue 
*dq,

[PATCH 0/9] lpfc updates for 11.4.0.6

2017-12-08 Thread James Smart

This patch set provides a number of bug fixes and one addition to
the driver.

The patches were cut against Martin's 4.16/scsi-queue tree.
There are no outside dependencies, and the patches are expected to
be pulled via Martin's tree.

James Smart (9):
  lpfc: Fix random heartbeat timeouts during heavy IO
  lpfc: Fix -EOVERFLOW behavior for NVMET and defer_rcv
  lpfc: Fix receive PRLI handling
  lpfc: Increase SCSI CQ and WQ sizes.
  lpfc: Fix SCSI LUN discovery when SCSI and NVME enabled
  lpfc: Fix issues connecting with nvme initiator
  lpfc: Fix infinite wait when driver unregisters a remote NVME port.
  lpfc: Beef up stat counters for debug
  lpfc: update driver version to 11.4.0.6

 drivers/scsi/lpfc/lpfc.h   |   2 +
 drivers/scsi/lpfc/lpfc_attr.c  |  46 ++-
 drivers/scsi/lpfc/lpfc_ct.c|   1 +
 drivers/scsi/lpfc/lpfc_debugfs.c   |  49 +++-
 drivers/scsi/lpfc/lpfc_els.c   |  37 +++---
 drivers/scsi/lpfc/lpfc_init.c  |  81 
 drivers/scsi/lpfc/lpfc_nportdisc.c |  88 +
 drivers/scsi/lpfc/lpfc_nvme.c  | 244 -
 drivers/scsi/lpfc/lpfc_nvme.h  |  16 ++-
 drivers/scsi/lpfc/lpfc_nvmet.c |  58 +++--
 drivers/scsi/lpfc/lpfc_nvmet.h |   6 +
 drivers/scsi/lpfc/lpfc_sli.c   |  24 ++--
 drivers/scsi/lpfc/lpfc_sli4.h  |   6 +-
 drivers/scsi/lpfc/lpfc_version.h   |   2 +-
 14 files changed, 451 insertions(+), 209 deletions(-)

-- 
2.13.1

[PATCH][next] scsi: arcmsr: remove redundant check for secs < 0

2017-12-08 Thread Colin King

From: Colin Ian King 

The check for secs being less than zero is redundant for two reasons.
Firstly, secs is unsigned so the check is always going to be false.
Secondly, even if secs were signed, the preceding calculation of secs
is never going to be negative.  Hence we can remove this redundant check
and the days and secs re-adjustment.

Detected by static analysis with smatch:
arcmsr_set_iop_datetime() warn: unsigned 'secs' is never less than zero.
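
As a small user-space illustration of why the branch is dead code, the
sketch below repeats the same arithmetic on an unsigned 32-bit value.
The helper name split_days is hypothetical and only mirrors the three
lines of calculation visible in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Split an unsigned count of seconds into whole days and leftover seconds. */
static void split_days(uint32_t total, uint32_t *days, uint32_t *secs)
{
	*days = total / 86400;
	/* Remainder of an unsigned division: always within [0, 86399]. */
	*secs = total - 86400 * *days;
}
```

Because the remainder can never leave that range, a "secs < 0" test on
it has nothing to catch, whatever the signedness of the variable.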

Signed-off-by: Colin Ian King 
---
 drivers/scsi/arcmsr/arcmsr_hba.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index 0707a60bf5c0..e4258b69f4be 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -3679,10 +3679,6 @@ static void arcmsr_set_iop_datetime(struct timer_list *t)
secs = (u32)(tv.tv_sec - (sys_tz.tz_minuteswest * 60));
days = secs / 86400;
secs = secs - 86400 * days;
-   if (secs < 0) {
-   days = days - 1;
-   secs = secs + 86400;
-   }
j = days / 146097;
i = days - 146097 * j;
a = i + 719468;
-- 
2.14.1

Re: [PATCH] scsi: libiscsi: Allow sd_shutdown on bad transport

2017-12-08 Thread Bart Van Assche

On Thu, 2017-12-07 at 19:59 -0200, Rafael David Tinoco wrote:
> This happens because iscsi_eh_cmd_timed_out(), the transport layer
> timeout helper, would tell the queue timeout function (scsi_times_out)
> to reset the request timer over and over, until the session state is
> back to logged in state. Unfortunately, during server shutdown, this
> might never happen again.

Hello Rafael,

Have you considered making iscsi_eh_cmd_timed_out() return BLK_EH_HANDLED
if system_state != SYSTEM_RUNNING? That could result in a slower shutdown than
with your patch, but such a change would probably be really easy to review.
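
A rough user-space model of that suggestion is sketched below. The
enums and the function toy_cmd_timed_out are hypothetical stand-ins
for the kernel's timeout return values and system_state, and the real
iscsi_eh_cmd_timed_out() has more cases than are modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel's timeout return values (names assumed). */
enum toy_eh_ret { TOY_EH_NOT_HANDLED, TOY_EH_HANDLED, TOY_EH_RESET_TIMER };
/* Stand-in for the kernel's system_state. */
enum toy_sys_state { TOY_SYSTEM_RUNNING, TOY_SYSTEM_SHUTDOWN };

static enum toy_eh_ret toy_cmd_timed_out(enum toy_sys_state state,
					 bool session_logged_in)
{
	/* Suggested early exit: never re-arm the timer once shutting down. */
	if (state != TOY_SYSTEM_RUNNING)
		return TOY_EH_HANDLED;
	/* Simplified existing behaviour: wait for the session to return. */
	if (!session_logged_in)
		return TOY_EH_RESET_TIMER;
	return TOY_EH_NOT_HANDLED;
}
```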

Thanks,

Bart.

[PATCH] sd: Increase SCSI disk probing concurrency

2017-12-08 Thread Bart Van Assche

The scsi_sd_probe_domain makes it possible to wait until all
disk-probing activity has finished system-wide. This slows down SCSI
host removal that occurs concurrently with SCSI disk probing because
sd_remove() waits on scsi_sd_probe_domain. Since each function that
waits on scsi_sd_probe_domain already specifies the disk for which
probing must finish, replace waiting on scsi_sd_probe_domain with
waiting until probing of that specific disk has finished. Introduce a
.sync() function pointer in struct scsi_driver to make it possible
for the SCSI power management code to wait until probing of a
specific disk has finished.
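
The optional-callback pattern described above can be sketched in
user-space C as follows. All names here (toy_device, toy_driver,
toy_bus_prepare, toy_disk_sync) are hypothetical; the sketch only shows
a caller invoking a per-driver sync hook when one is provided.

```c
#include <assert.h>
#include <stddef.h>

struct toy_device {
	int probe_synced;	/* set once probing has been waited for */
};

struct toy_driver {
	/* Optional: wait until probing of this one device has finished. */
	void (*sync)(struct toy_device *dev);
};

/* Driver-side hook; a real driver would flush its probe work here. */
static void toy_disk_sync(struct toy_device *dev)
{
	dev->probe_synced = 1;
}

/* Caller side: only drivers that implement .sync are waited on. */
static void toy_bus_prepare(const struct toy_driver *drv,
			    struct toy_device *dev)
{
	if (drv->sync)
		drv->sync(dev);
}
```

Drivers that leave the pointer NULL are simply skipped, so the caller
no longer needs to know about any driver-specific async domain.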

Signed-off-by: Bart Van Assche 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Johannes Thumshirn 
---
 drivers/scsi/scsi.c|  5 -
 drivers/scsi/scsi_pm.c |  6 --
 drivers/scsi/scsi_priv.h   |  1 -
 drivers/scsi/sd.c  | 26 +-
 drivers/scsi/sd.h  |  1 +
 include/scsi/scsi_driver.h |  1 +
 6 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index a7e4fba724b7..e6d69e647f6a 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -85,10 +85,6 @@ unsigned int scsi_logging_level;
 EXPORT_SYMBOL(scsi_logging_level);
 #endif
 
-/* sd, scsi core and power management need to coordinate flushing async 
actions */
-ASYNC_DOMAIN(scsi_sd_probe_domain);
-EXPORT_SYMBOL(scsi_sd_probe_domain);
-
 /*
  * Separate domain (from scsi_sd_probe_domain) to maximize the benefit of
  * asynchronous system resume operations.  It is marked 'exclusive' to avoid
@@ -839,7 +835,6 @@ static void __exit exit_scsi(void)
scsi_exit_devinfo();
scsi_exit_procfs();
scsi_exit_queue();
-   async_unregister_domain(&scsi_sd_probe_domain);
 }
 
 subsys_initcall(init_scsi);
diff --git a/drivers/scsi/scsi_pm.c b/drivers/scsi/scsi_pm.c
index b44c1bb687a2..d8e43c2f4d40 100644
--- a/drivers/scsi/scsi_pm.c
+++ b/drivers/scsi/scsi_pm.c
@@ -171,9 +171,11 @@ static int scsi_bus_resume_common(struct device *dev,
 static int scsi_bus_prepare(struct device *dev)
 {
if (scsi_is_sdev_device(dev)) {
-   /* sd probing uses async_schedule.  Wait until it finishes. */
-   async_synchronize_full_domain(&scsi_sd_probe_domain);
+   struct scsi_driver *drv = to_scsi_driver(dev->driver);
 
+   /* sd probing happens asynchronously. Wait until it finishes. */
+   if (drv->sync)
+   drv->sync(dev);
} else if (scsi_is_host_device(dev)) {
/* Wait until async scanning is finished */
scsi_complete_async_scans();
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 99f1db5e467e..0d88f6b85f5b 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -176,7 +176,6 @@ static inline void scsi_autopm_put_host(struct Scsi_Host 
*h) {}
 #endif /* CONFIG_PM */
 
 extern struct async_domain scsi_sd_pm_domain;
-extern struct async_domain scsi_sd_probe_domain;
 
 /* scsi_dh.c */
 #ifdef CONFIG_SCSI_DH
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index ab75ebd518a7..53ec383e10d1 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -112,6 +112,7 @@ static void sd_shutdown(struct device *);
 static int sd_suspend_system(struct device *);
 static int sd_suspend_runtime(struct device *);
 static int sd_resume(struct device *);
+static void sd_sync_probe_domain(struct device *dev);
 static void sd_rescan(struct device *);
 static int sd_init_command(struct scsi_cmnd *SCpnt);
 static void sd_uninit_command(struct scsi_cmnd *SCpnt);
@@ -564,6 +565,7 @@ static struct scsi_driver sd_template = {
.shutdown   = sd_shutdown,
.pm = &sd_pm_ops,
},
+   .sync   = sd_sync_probe_domain,
.rescan = sd_rescan,
.init_command   = sd_init_command,
.uninit_command = sd_uninit_command,
@@ -3254,9 +3256,9 @@ static int sd_format_disk_name(char *prefix, int index, 
char *buf, int buflen)
 /*
  * The asynchronous part of sd_probe
  */
-static void sd_probe_async(void *data, async_cookie_t cookie)
+static void sd_probe_async(struct work_struct *work)
 {
-   struct scsi_disk *sdkp = data;
+   struct scsi_disk *sdkp = container_of(work, typeof(*sdkp), probe_work);
struct scsi_device *sdp;
struct gendisk *gd;
u32 index;
@@ -3359,6 +3361,8 @@ static int sd_probe(struct device *dev)
if (!sdkp)
goto out;
 
+   INIT_WORK(&sdkp->probe_work, sd_probe_async);
+
gd = alloc_disk(SD_MINORS);
if (!gd)
goto out_free;
@@ -3410,8 +3414,8 @@ static int sd_probe(struct device *dev)
get_device(dev);
dev_set_drvdata(dev, sdkp);
 
-   get_device(&sdkp->dev); /* prevent release before async_schedule */
-

[PATCH 01/19] scsi: hisi_sas: initialize dq spinlock before use

2017-12-08 Thread John Garry

From: Xiang Chen 

It is required to initialize the dq spinlock before use, which
was not being done, so fix it. This issue can be detected when
CONFIG_DEBUG_SPINLOCK is enabled.

Signed-off-by: Xiang Chen 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 5f503cb..359ec52 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1657,6 +1657,7 @@ int hisi_sas_alloc(struct hisi_hba *hisi_hba, struct 
Scsi_Host *shost)
cq->hisi_hba = hisi_hba;
 
/* Delivery queue structure */
+   spin_lock_init(&dq->lock);
dq->id = i;
dq->hisi_hba = hisi_hba;
 
-- 
1.9.1

[PATCH 08/19] scsi: hisi_sas: change ncq process for v3 hw

2017-12-08 Thread John Garry

From: Xiang Chen 

For v3 hw, each NCQ completion returns its own CQ entry, so there is
no need to acquire the IPTT from the ITCT; just read it from the IPTT
field of the CQ entry.
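
To make the difference concrete, here is a small user-space sketch of
the two lookups. The 12-bit, five-tags-per-64-bit-word layout is taken
from the code this patch removes; the mask value TOY_IPTT_MSK is an
assumption for illustration, and endianness conversion is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_IPTT_MSK 0xffffu	/* assumed width of the IPTT field in dw1 */

/* New scheme: the IPTT is read straight from the completion entry. */
static uint32_t iptt_from_cq(uint32_t dw1)
{
	return dw1 & TOY_IPTT_MSK;
}

/* Old scheme: 12-bit tags packed five per 64-bit word in the ITCT. */
static uint32_t iptt_from_itct(const uint64_t *ncq_tag, int n)
{
	return (uint32_t)((ncq_tag[n / 5] >> ((n % 5) * 12)) & 0xfff);
}
```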

Signed-off-by: Xiang Chen 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 40 +-
 1 file changed, 6 insertions(+), 34 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 44f07bc..69aa7bc 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -1653,9 +1653,8 @@ static void cq_tasklet_v3_hw(unsigned long val)
struct hisi_sas_cq *cq = (struct hisi_sas_cq *)val;
struct hisi_hba *hisi_hba = cq->hisi_hba;
struct hisi_sas_slot *slot;
-   struct hisi_sas_itct *itct;
struct hisi_sas_complete_v3_hdr *complete_queue;
-   u32 rd_point = cq->rd_point, wr_point, dev_id;
+   u32 rd_point = cq->rd_point, wr_point;
int queue = cq->id;
struct hisi_sas_dq *dq = &hisi_hba->dq[queue];
 
@@ -1671,38 +1670,11 @@ static void cq_tasklet_v3_hw(unsigned long val)
 
complete_hdr = &complete_queue[rd_point];
 
-   /* Check for NCQ completion */
-   if (complete_hdr->act) {
-   u32 act_tmp = complete_hdr->act;
-   int ncq_tag_count = ffs(act_tmp);
-
-   dev_id = (complete_hdr->dw1 & CMPLT_HDR_DEV_ID_MSK) >>
-CMPLT_HDR_DEV_ID_OFF;
-   itct = &hisi_hba->itct[dev_id];
-
-   /* The NCQ tags are held in the itct header */
-   while (ncq_tag_count) {
-   __le64 *ncq_tag = &itct->qw4_15[0];
-
-   ncq_tag_count -= 1;
-   iptt = (ncq_tag[ncq_tag_count / 5]
-   >> (ncq_tag_count % 5) * 12) & 0xfff;
-
-   slot = &hisi_hba->slot_info[iptt];
-   slot->cmplt_queue_slot = rd_point;
-   slot->cmplt_queue = queue;
-   slot_complete_v3_hw(hisi_hba, slot);
-
-   act_tmp &= ~(1 << ncq_tag_count);
-   ncq_tag_count = ffs(act_tmp);
-   }
-   } else {
-   iptt = (complete_hdr->dw1) & CMPLT_HDR_IPTT_MSK;
-   slot = &hisi_hba->slot_info[iptt];
-   slot->cmplt_queue_slot = rd_point;
-   slot->cmplt_queue = queue;
-   slot_complete_v3_hw(hisi_hba, slot);
-   }
+   iptt = (complete_hdr->dw1) & CMPLT_HDR_IPTT_MSK;
+   slot = &hisi_hba->slot_info[iptt];
+   slot->cmplt_queue_slot = rd_point;
+   slot->cmplt_queue = queue;
+   slot_complete_v3_hw(hisi_hba, slot);
 
if (++rd_point >= HISI_SAS_QUEUE_SLOTS)
rd_point = 0;
-- 
1.9.1

[PATCH 00/19] hisi_sas: PM, RAS, and other misc changes

2017-12-08 Thread John Garry

This patchset contains support for some new
features, and also some modifications and other
fixes.

Headline changes include:
- v3 hw Suspend and Resume support
- v3 hw RAS (PCI AER) support
- v2 hw HW port error handling support
- other misc fixes and tidy-up

Xiang Chen (8):
  scsi: hisi_sas: initialize dq spinlock before use
  scsi: hisi_sas: fix dma_unmap_sg() parameter
  scsi: hisi_sas: modify hisi_sas_dev_gone() for reset
  scsi: hisi_sas: change ncq process for v3 hw
  scsi: hisi_sas: add some print to enhance debugging
  scsi: hisi_sas: fix SAS_QUEUE_FULL problem while running IO
  scsi: hisi_sas: re-add the lldd_port_deformed()
  scsi: hisi_sas: add v3 hw suspend and resume

Xiaofei Tan (11):
  scsi: hisi_sas: relocate clearing ITCT and freeing device
  scsi: hisi_sas: optimise port id refresh function
  scsi: hisi_sas: some optimizations of host controller reset
  scsi: hisi_sas: add an mechanism to do reset work synchronously
  scsi: hisi_sas: add RAS feature for v3 hw
  scsi: hisi_sas: improve int_chnl_int_v2_hw() consistenty with v3 hw
  scsi: hisi_sas: add v2 hw port AXI error handling support
  scsi: hisi_sas: use an general way to delay PHY work
  scsi: hisi_sas: do link reset for some CHL_INT2 ints
  scsi: hisi_sas: judge result of internal abort
  scsi: hisi_sas: add internal abort dev in some places

 drivers/scsi/hisi_sas/hisi_sas.h   |  40 +++-
 drivers/scsi/hisi_sas/hisi_sas_main.c  | 222 +-
 drivers/scsi/hisi_sas/hisi_sas_v1_hw.c |   6 +-
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 153 ++-
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 330 -
 5 files changed, 610 insertions(+), 141 deletions(-)

-- 
1.9.1

[PATCH 06/19] scsi: hisi_sas: modify hisi_sas_dev_gone() for reset

2017-12-08 Thread John Garry

From: Xiang Chen 

Make a couple of changes for when HISI_SAS_RESET_BIT is
set for the HBA:
- Clearing ITCT is not necessary
- Remove internal abort as it will fail during reset

Flag sas_dev->dev_type is kept as SAS_PHY_UNUSED.

Signed-off-by: Xiang Chen 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 64d51a8..e4b3092 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -738,17 +738,19 @@ static void hisi_sas_dev_gone(struct domain_device 
*device)
dev_info(dev, "found dev[%d:%x] is gone\n",
 sas_dev->device_id, sas_dev->dev_type);
 
-   hisi_sas_internal_task_abort(hisi_hba, device,
+   if (!test_bit(HISI_SAS_RESET_BIT, &hisi_hba->flags)) {
+   hisi_sas_internal_task_abort(hisi_hba, device,
 HISI_SAS_INT_ABT_DEV, 0);
 
-   hisi_sas_dereg_device(hisi_hba, device);
+   hisi_sas_dereg_device(hisi_hba, device);
+
+   hisi_hba->hw->clear_itct(hisi_hba, sas_dev);
+   device->lldd_dev = NULL;
+   memset(sas_dev, 0, sizeof(*sas_dev));
+   }
 
-   hisi_hba->hw->clear_itct(hisi_hba, sas_dev);
if (hisi_hba->hw->free_device)
hisi_hba->hw->free_device(sas_dev);
-
-   device->lldd_dev = NULL;
-   memset(sas_dev, 0, sizeof(*sas_dev));
sas_dev->dev_type = SAS_PHY_UNUSED;
 }
 
-- 
1.9.1

[PATCH 07/19] scsi: hisi_sas: add an mechanism to do reset work synchronously

2017-12-08 Thread John Garry

From: Xiaofei Tan 

Sometimes it is required to know when the controller reset
has completed and also whether it has completed successfully.
Previously, such places called hisi_sas_controller_reset()
directly, which may lead to multiple calls to this function.

This patch creates a per-reset structure which contains
a completion structure and a status flag, to know when the reset
completes and also its status. The reset work is also queued on
hisi_hba.wq.

As all host reset works are done in hisi_hba.wq, we need not worry
about multiple calls to hisi_sas_controller_reset().
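
The per-reset structure described above can be modelled in user-space C
roughly like this. It is a single-threaded sketch under stated
assumptions: toy_rst and toy_rst_work are hypothetical names, a plain
bool stands in for the kernel completion, and the reset itself is
passed in as a function pointer instead of running on a workqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_rst {
	bool completed;		/* stands in for the completion */
	bool done;		/* true only if the reset succeeded */
	int (*reset)(void);	/* stands in for the controller reset */
};

/* Work handler: run the reset, record the result, then signal completion. */
static void toy_rst_work(struct toy_rst *rst)
{
	if (rst->reset() == 0)
		rst->done = true;
	rst->completed = true;
}

/* Two fake resets used to exercise both outcomes. */
static int toy_reset_ok(void)   { return 0; }
static int toy_reset_fail(void) { return -1; }
```

A caller waits for the completion and then inspects done, which keeps
"the reset has finished" separate from "the reset succeeded".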

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas.h  | 26 ++
 drivers/scsi/hisi_sas/hisi_sas_main.c | 19 ++-
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index b2534ca..71bc8ea 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -99,6 +99,31 @@ struct hisi_sas_hw_error {
const struct hisi_sas_hw_error *sub;
 };
 
+struct hisi_sas_rst {
+   struct hisi_hba *hisi_hba;
+   struct completion *completion;
+   struct work_struct work;
+   bool done;
+};
+
+#define HISI_SAS_RST_WORK_INIT(r, c) \
+   {   .hisi_hba = hisi_hba, \
+   .completion = &c, \
+   .work = __WORK_INITIALIZER(r.work, \
+   hisi_sas_sync_rst_work_handler), \
+   .done = false, \
+   }
+
+#define HISI_SAS_DECLARE_RST_WORK_ON_STACK(r) \
+   DECLARE_COMPLETION_ONSTACK(c); \
+   DECLARE_WORK(w, hisi_sas_sync_rst_work_handler); \
+   struct hisi_sas_rst r = HISI_SAS_RST_WORK_INIT(r, c)
+
+enum hisi_sas_bit_err_type {
+   HISI_SAS_ERR_SINGLE_BIT_ECC = 0x0,
+   HISI_SAS_ERR_MULTI_BIT_ECC = 0x1,
+};
+
 struct hisi_sas_phy {
struct hisi_hba *hisi_hba;
struct hisi_sas_port *port;
@@ -426,5 +451,6 @@ extern void hisi_sas_slot_task_free(struct hisi_hba 
*hisi_hba,
struct hisi_sas_slot *slot);
 extern void hisi_sas_init_mem(struct hisi_hba *hisi_hba);
 extern void hisi_sas_rst_work_handler(struct work_struct *work);
+extern void hisi_sas_sync_rst_work_handler(struct work_struct *work);
 extern void hisi_sas_kill_tasklets(struct hisi_hba *hisi_hba);
 #endif
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index e4b3092..fb162c0 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1299,8 +1299,14 @@ static int hisi_sas_lu_reset(struct domain_device 
*device, u8 *lun)
 static int hisi_sas_clear_nexus_ha(struct sas_ha_struct *sas_ha)
 {
struct hisi_hba *hisi_hba = sas_ha->lldd_ha;
+   HISI_SAS_DECLARE_RST_WORK_ON_STACK(r);
 
-   return hisi_sas_controller_reset(hisi_hba);
+   queue_work(hisi_hba->wq, &r.work);
+   wait_for_completion(r.completion);
+   if (r.done)
+   return TMF_RESP_FUNC_COMPLETE;
+
+   return TMF_RESP_FUNC_FAILED;
 }
 
 static int hisi_sas_query_task(struct sas_task *task)
@@ -1820,6 +1826,17 @@ void hisi_sas_rst_work_handler(struct work_struct *work)
 }
 EXPORT_SYMBOL_GPL(hisi_sas_rst_work_handler);
 
+void hisi_sas_sync_rst_work_handler(struct work_struct *work)
+{
+   struct hisi_sas_rst *rst =
+   container_of(work, struct hisi_sas_rst, work);
+
+   if (!hisi_sas_controller_reset(rst->hisi_hba))
+   rst->done = true;
+   complete(rst->completion);
+}
+EXPORT_SYMBOL_GPL(hisi_sas_sync_rst_work_handler);
+
 int hisi_sas_get_fw_info(struct hisi_hba *hisi_hba)
 {
struct device *dev = hisi_hba->dev;
-- 
1.9.1

[PATCH 04/19] scsi: hisi_sas: optimise port id refresh function

2017-12-08 Thread John Garry

From: Xiaofei Tan 

Currently refreshing the PHY port id after reset is
done in the rescan topology function, which is quite
late in the reset process. It could be moved earlier in
the process, as the port id can be refreshed once the
PHYs become ready.

In addition to this, we should set the hisi_sas_dev port
id to 0xff (invalid port id) if all PHYs of this port remain
down for the same device.
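
The selection rule can be sketched in user-space C as below. The types
and names (toy_port_phy, toy_refresh_port_id) are hypothetical; the
sketch only shows picking the port id from the first PHY that is up
and falling back to 0xff when none is.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INVALID_PORT_ID 0xff

struct toy_port_phy {
	int id;			/* bit position in the PHY state mask */
	uint8_t port_id;	/* port id reported by this PHY */
};

/* Port id of the first PHY whose bit is set in state, else 0xff. */
static uint8_t toy_refresh_port_id(uint32_t state,
				   const struct toy_port_phy *phys, int n)
{
	for (int i = 0; i < n; i++)
		if (state & (1u << phys[i].id))
			return phys[i].port_id;
	return TOY_INVALID_PORT_ID;
}
```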

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 48 ++-
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 6446ce2..326ecb2 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -990,27 +990,42 @@ static int hisi_sas_debug_issue_ssp_tmf(struct 
domain_device *device,
sizeof(ssp_task), tmf);
 }
 
-static void hisi_sas_refresh_port_id(struct hisi_hba *hisi_hba,
-   struct asd_sas_port *sas_port, enum sas_linkrate linkrate)
+static void hisi_sas_refresh_port_id(struct hisi_hba *hisi_hba)
 {
-   struct hisi_sas_device  *sas_dev;
-   struct domain_device *device;
+   u32 state = hisi_hba->hw->get_phys_state(hisi_hba);
int i;
 
for (i = 0; i < HISI_SAS_MAX_DEVICES; i++) {
-   sas_dev = &hisi_hba->devices[i];
-   device = sas_dev->sas_device;
+   struct hisi_sas_device *sas_dev = &hisi_hba->devices[i];
+   struct domain_device *device = sas_dev->sas_device;
+   struct asd_sas_port *sas_port;
+   struct hisi_sas_port *port;
+   struct hisi_sas_phy *phy = NULL;
+   struct asd_sas_phy *sas_phy;
+
if ((sas_dev->dev_type == SAS_PHY_UNUSED)
-   || !device || (device->port != sas_port))
+   || !device || !device->port)
continue;
 
-   hisi_hba->hw->clear_itct(hisi_hba, sas_dev);
+   sas_port = device->port;
+   port = to_hisi_sas_port(sas_port);
+
+   list_for_each_entry(sas_phy, &sas_port->phy_list, port_phy_el)
+   if (state & BIT(sas_phy->id)) {
+   phy = sas_phy->lldd_phy;
+   break;
+   }
+
+   if (phy) {
+   port->id = phy->port_id;
 
-   /* Update linkrate of directly attached device. */
-   if (!device->parent)
-   device->linkrate = linkrate;
+   /* Update linkrate of directly attached device. */
+   if (!device->parent)
+   device->linkrate = phy->sas_phy.linkrate;
 
-   hisi_hba->hw->setup_itct(hisi_hba, sas_dev);
+   hisi_hba->hw->setup_itct(hisi_hba, sas_dev);
+   } else
+   port->id = 0xff;
}
 }
 
@@ -1025,21 +1040,17 @@ static void hisi_sas_rescan_topology(struct hisi_hba 
*hisi_hba, u32 old_state,
struct hisi_sas_phy *phy = &hisi_hba->phy[phy_no];
struct asd_sas_phy *sas_phy = &phy->sas_phy;
struct asd_sas_port *sas_port = sas_phy->port;
-   struct hisi_sas_port *port = to_hisi_sas_port(sas_port);
bool do_port_check = !!(_sas_port != sas_port);
 
if (!sas_phy->phy->enabled)
continue;
 
/* Report PHY state change to libsas */
-   if (state & (1 << phy_no)) {
-   if (do_port_check && sas_port) {
+   if (state & BIT(phy_no)) {
+   if (do_port_check && sas_port && sas_port->port_dev) {
struct domain_device *dev = sas_port->port_dev;
 
_sas_port = sas_port;
-   port->id = phy->port_id;
-   hisi_sas_refresh_port_id(hisi_hba,
-   sas_port, sas_phy->linkrate);
 
if (DEV_IS_EXPANDER(dev->dev_type))
sas_ha->notify_port_event(sas_phy,
@@ -1088,6 +1099,7 @@ static int hisi_sas_controller_reset(struct hisi_hba 
*hisi_hba)
/* Init and wait for PHYs to come up and all libsas event finished. */
hisi_hba->hw->phys_init(hisi_hba);
msleep(1000);
+   hisi_sas_refresh_port_id(hisi_hba);
drain_workqueue(hisi_hba->wq);
drain_workqueue(shost->work_q);
 
-- 
1.9.1

[PATCH 10/19] scsi: hisi_sas: add some print to enhance debugging

2017-12-08 Thread John Garry

From: Xiang Chen 

Add prints in some places, such as the error info and CQ
of exception IO and device found, and also adjust some
log levels.

All this is to assist debugging.

Signed-off-by: Xiang Chen 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_main.c  | 15 ++-
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 24 +++-
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 22 +-
 3 files changed, 46 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index fb162c0..1f6f063 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -580,6 +580,9 @@ static int hisi_sas_dev_found(struct domain_device *device)
}
}
 
+   dev_info(dev, "dev[%d:%x] found\n",
+   sas_dev->device_id, sas_dev->dev_type);
+
return 0;
 }
 
@@ -735,7 +738,7 @@ static void hisi_sas_dev_gone(struct domain_device *device)
struct hisi_hba *hisi_hba = dev_to_hisi_hba(device);
struct device *dev = hisi_hba->dev;
 
-   dev_info(dev, "found dev[%d:%x] is gone\n",
+   dev_info(dev, "dev[%d:%x] is gone\n",
 sas_dev->device_id, sas_dev->dev_type);
 
if (!test_bit(HISI_SAS_RESET_BIT, &hisi_hba->flags)) {
@@ -866,12 +869,13 @@ static int hisi_sas_exec_internal_tmf_task(struct 
domain_device *device,
if (!(task->task_state_flags & SAS_TASK_STATE_DONE)) {
struct hisi_sas_slot *slot = task->lldd_task;
 
-   dev_err(dev, "abort tmf: TMF task timeout\n");
+   dev_err(dev, "abort tmf: TMF task timeout and not done\n");
if (slot)
slot->task = NULL;
 
goto ex_err;
-   }
+   } else
+   dev_err(dev, "abort tmf: TMF task timeout\n");
}
 
if (task->task_status.resp == SAS_TASK_COMPLETE &&
@@ -1495,9 +1499,10 @@ static int hisi_sas_query_task(struct sas_task *task)
 
if (slot)
slot->task = NULL;
-   dev_err(dev, "internal task abort: timeout.\n");
+   dev_err(dev, "internal task abort: timeout and not done.\n");
goto exit;
-   }
+   } else
+   dev_err(dev, "internal task abort: timeout.\n");
}
 
if (task->task_status.resp == SAS_TASK_COMPLETE &&
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index cd9cd84..8d6886a 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -2361,6 +2361,7 @@ static void slot_err_v2_hw(struct hisi_hba *hisi_hba,
ts->resp = SAS_TASK_COMPLETE;
 
if (unlikely(aborted)) {
+   dev_dbg(dev, "slot_complete: task(%p) aborted\n", task);
ts->stat = SAS_ABORTED_TASK;
spin_lock_irqsave(&hisi_hba->lock, flags);
hisi_sas_slot_task_free(hisi_hba, task, slot);
@@ -2405,6 +2406,7 @@ static void slot_err_v2_hw(struct hisi_hba *hisi_hba,
(!(complete_hdr->dw0 & CMPLT_HDR_RSPNS_XFRD_MSK))) {
u32 err_phase = (complete_hdr->dw0 & CMPLT_HDR_ERR_PHASE_MSK)
>> CMPLT_HDR_ERR_PHASE_OFF;
+   u32 *error_info = hisi_sas_status_buf_addr_mem(slot);
 
/* Analyse error happens on which phase TX or RX */
if (ERR_ON_TX_PHASE(err_phase))
@@ -2412,6 +2414,16 @@ static void slot_err_v2_hw(struct hisi_hba *hisi_hba,
else if (ERR_ON_RX_PHASE(err_phase))
slot_err_v2_hw(hisi_hba, task, slot, 2);
 
+   if (ts->stat != SAS_DATA_UNDERRUN)
+   dev_info(dev, "erroneous completion iptt=%d task=%p "
+   "CQ hdr: 0x%x 0x%x 0x%x 0x%x "
+   "Error info: 0x%x 0x%x 0x%x 0x%x\n",
+   slot->idx, task,
+   complete_hdr->dw0, complete_hdr->dw1,
+   complete_hdr->act, complete_hdr->dw3,
+   error_info[0], error_info[1],
+   error_info[2], error_info[3]);
+
if (unlikely(slot->abort))
return ts->stat;
goto out;
@@ -2461,7 +2473,7 @@ static void slot_err_v2_hw(struct hisi_hba *hisi_hba,
}
 
if (!slot->port->port_attached) {
-   dev_err(dev, "slot complete: port %d has removed\n",
+   dev_warn(dev, "slot complete: port %d has removed\n",

[PATCH 14/19] scsi: hisi_sas: do link reset for some CHL_INT2 ints

2017-12-08 Thread John Garry

From: Xiaofei Tan 

We should do a link reset of the PHY on identify timeout or
STP link timeout. These are internal events of the SoC and are
notified to the driver through CHL_INT2 interrupts.

Besides, we should add a delayed work item to do the link reset, as
it needs to sleep. So this patch adds a new PHY event,
HISI_PHYE_LINK_RESET, for this.

Note: v2 HW doesn't report the STP link timeout event, so we only
need to handle the identify timeout event for v2 HW.
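
The event-to-handler table that this patch extends can be sketched in
user-space C as follows. All toy_ names are hypothetical, and the
handlers run synchronously here, whereas the driver defers them to
work items because a link reset needs to sleep.

```c
#include <assert.h>
#include <stddef.h>

enum toy_phy_event {
	TOY_PHYE_PHY_UP,
	TOY_PHYE_LINK_RESET,	/* the newly added event */
	TOY_PHYES_NUM,
};

struct toy_evt_phy {
	int ups;
	int link_resets;
};

typedef void (*toy_work_fn)(struct toy_evt_phy *phy);

static void toy_phyup_work(struct toy_evt_phy *phy)     { phy->ups++; }
static void toy_linkreset_work(struct toy_evt_phy *phy) { phy->link_resets++; }

/* One handler per event, indexed by the event enum. */
static const toy_work_fn toy_phye_fns[TOY_PHYES_NUM] = {
	[TOY_PHYE_PHY_UP]     = toy_phyup_work,
	[TOY_PHYE_LINK_RESET] = toy_linkreset_work,
};

/* Returns 1 if the event was dispatched, 0 if it is out of range. */
static int toy_notify_phy_event(struct toy_evt_phy *phy, enum toy_phy_event ev)
{
	if ((unsigned int)ev >= TOY_PHYES_NUM)
		return 0;
	toy_phye_fns[ev](phy);
	return 1;
}
```

With this layout, adding an event only takes a new enum value placed
before the count and one more table entry.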

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas.h   |  1 +
 drivers/scsi/hisi_sas/hisi_sas_main.c  | 12 
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 18 ++
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 29 +++--
 4 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index aa14638..4343c4c 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -126,6 +126,7 @@ enum hisi_sas_bit_err_type {
 
 enum hisi_sas_phy_event {
HISI_PHYE_PHY_UP   = 0U,
+   HISI_PHYE_LINK_RESET,
HISI_PHYES_NUM,
 };
 
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 326dc81..7446a39 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -22,6 +22,8 @@ static int hisi_sas_debug_issue_ssp_tmf(struct domain_device 
*device,
 struct domain_device *device,
 int abort_flag, int tag);
 static int hisi_sas_softreset_ata_disk(struct domain_device *device);
+static int hisi_sas_control_phy(struct asd_sas_phy *sas_phy, enum phy_func 
func,
+   void *funcdata);
 
 u8 hisi_sas_get_ata_protocol(u8 cmd, int direction)
 {
@@ -631,8 +633,18 @@ static void hisi_sas_phyup_work(struct work_struct *work)
hisi_sas_bytes_dmaed(hisi_hba, phy_no);
 }
 
+static void hisi_sas_linkreset_work(struct work_struct *work)
+{
+   struct hisi_sas_phy *phy =
+   container_of(work, typeof(*phy), works[HISI_PHYE_LINK_RESET]);
+   struct asd_sas_phy *sas_phy = &phy->sas_phy;
+
+   hisi_sas_control_phy(sas_phy, PHY_FUNC_LINK_RESET, NULL);
+}
+
 static const work_func_t hisi_sas_phye_fns[HISI_PHYES_NUM] = {
[HISI_PHYE_PHY_UP] = hisi_sas_phyup_work,
+   [HISI_PHYE_LINK_RESET] = hisi_sas_linkreset_work,
 };
 
 bool hisi_sas_notify_phy_event(struct hisi_sas_phy *phy,
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index e521c42..b8fe08d 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -245,6 +245,7 @@
 #define CHL_INT1_DMAC_RX_AXI_WR_ERR_OFF21
 #define CHL_INT1_DMAC_RX_AXI_RD_ERR_OFF22
 #define CHL_INT2   (PORT_BASE + 0x1bc)
+#define CHL_INT2_SL_IDAF_TOUT_CONF_OFF 0
 #define CHL_INT0_MSK   (PORT_BASE + 0x1c0)
 #define CHL_INT1_MSK   (PORT_BASE + 0x1c4)
 #define CHL_INT2_MSK   (PORT_BASE + 0x1c8)
@@ -1187,7 +1188,7 @@ static void init_reg_v2_hw(struct hisi_hba *hisi_hba)
hisi_sas_phy_write32(hisi_hba, i, CHL_INT2, 0xfff87fff);
hisi_sas_phy_write32(hisi_hba, i, RXOP_CHECK_CFG_H, 0x1000);
hisi_sas_phy_write32(hisi_hba, i, CHL_INT1_MSK, 0xff857fff);
-   hisi_sas_phy_write32(hisi_hba, i, CHL_INT2_MSK, 0x8bff);
+   hisi_sas_phy_write32(hisi_hba, i, CHL_INT2_MSK, 0x8bfe);
hisi_sas_phy_write32(hisi_hba, i, SL_CFG, 0x13f801fc);
hisi_sas_phy_write32(hisi_hba, i, PHY_CTRL_RDY_MSK, 0x0);
hisi_sas_phy_write32(hisi_hba, i, PHYCTRL_NOT_RDY_MSK, 0x0);
@@ -2905,10 +2906,19 @@ static irqreturn_t int_chnl_int_v2_hw(int irq_no, void 
*p)
 CHL_INT1, irq_value1);
}
 
-   if ((irq_msk & (1 << phy_no)) && irq_value2)
-   hisi_sas_phy_write32(hisi_hba, phy_no,
-CHL_INT2, irq_value2);
+   if ((irq_msk & (1 << phy_no)) && irq_value2) {
+   struct hisi_sas_phy *phy = &hisi_hba->phy[phy_no];
+
+   if (irq_value2 & BIT(CHL_INT2_SL_IDAF_TOUT_CONF_OFF)) {
+   dev_warn(dev, "phy%d identify timeout\n",
+   phy_no);
+   hisi_sas_notify_phy_event(phy,
+   HISI_PHYE_LINK_RESET);
+   }
 
+   hisi_sas_phy_write32(hisi_hba, phy_no,
+CHL_INT2, irq_value2);
+   }
 
if ((irq_msk & (1 << phy_no)) && irq_value0) {
if
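
The CHL_INT2_MSK change in init_reg_v2_hw() (0x8bff to 0x8bfe) clears bit 0, the same bit the new handler tests via CHL_INT2_SL_IDAF_TOUT_CONF_OFF. A minimal userspace sketch of that bit arithmetic follows. The helper names are hypothetical, and the reading that a cleared mask bit enables the interrupt is an assumption drawn from the diff, not from hardware documentation.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1U << (n))
#define CHL_INT2_SL_IDAF_TOUT_CONF_OFF 0 /* bit offset taken from the patch */

/* Hypothetical helper: clear the identify-timeout bit in a mask value. */
static uint32_t unmask_identify_timeout(uint32_t msk)
{
	return msk & ~BIT(CHL_INT2_SL_IDAF_TOUT_CONF_OFF);
}

/* Hypothetical helper: test the same bit in a CHL_INT2 status value. */
static bool identify_timeout_pending(uint32_t irq_value2)
{
	return (irq_value2 & BIT(CHL_INT2_SL_IDAF_TOUT_CONF_OFF)) != 0;
}
```

Applied to the old mask value 0x8bff, unmask_identify_timeout() yields the new value 0x8bfe used in the patch.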

[PATCH 09/19] scsi: hisi_sas: add RAS feature for v3 hw

2017-12-08 Thread John Garry

From: Xiaofei Tan 

We use PCIe AER to support the RAS feature for v3 hw.
The driver has to do the following two things to support this:
1. Enable RAS interrupts, so that errors can be reported to the
RAS module.
2. Implement err_handler for sas_v3_pci_driver. Then, if a
non-fatal error is detected, print the error source and try to
recover the SAS controller.

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 139 +
 1 file changed, 139 insertions(+)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 69aa7bc..d356e12 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -204,6 +204,13 @@
 #define AM_ROB_ECC_MULBIT_ERR_ADDR_OFF 8
 #define AM_ROB_ECC_MULBIT_ERR_ADDR_MSK (0xff << AM_ROB_ECC_MULBIT_ERR_ADDR_OFF)
 
+/* RAS registers need init */
+#define RAS_BASE   (0x6000)
+#define SAS_RAS_INTR0  (RAS_BASE)
+#define SAS_RAS_INTR1  (RAS_BASE + 0x04)
+#define SAS_RAS_INTR0_MASK (RAS_BASE + 0x08)
+#define SAS_RAS_INTR1_MASK (RAS_BASE + 0x0c)
+
 /* HW dma structures */
 /* Delivery queue header */
 /* dw0 */
@@ -496,6 +503,10 @@ static void init_reg_v3_hw(struct hisi_hba *hisi_hba)
 
hisi_sas_write32(hisi_hba, SATA_INITI_D2H_STORE_ADDR_HI,
 upper_32_bits(hisi_hba->initial_fis_dma));
+
+   /* RAS registers init */
+   hisi_sas_write32(hisi_hba, SAS_RAS_INTR0_MASK, 0x0);
+   hisi_sas_write32(hisi_hba, SAS_RAS_INTR1_MASK, 0x0);
 }
 
 static void config_phy_opt_mode_v3_hw(struct hisi_hba *hisi_hba, int phy_no)
@@ -2129,6 +2140,127 @@ static void hisi_sas_v3_remove(struct pci_dev *pdev)
scsi_host_put(shost);
 }
 
+static const struct hisi_sas_hw_error sas_ras_intr0_nfe[] = {
+   { .irq_msk = BIT(19), .msg = "HILINK_INT" },
+   { .irq_msk = BIT(20), .msg = "HILINK_PLL0_OUT_OF_LOCK" },
+   { .irq_msk = BIT(21), .msg = "HILINK_PLL1_OUT_OF_LOCK" },
+   { .irq_msk = BIT(22), .msg = "HILINK_LOSS_OF_REFCLK0" },
+   { .irq_msk = BIT(23), .msg = "HILINK_LOSS_OF_REFCLK1" },
+   { .irq_msk = BIT(24), .msg = "DMAC0_TX_POISON" },
+   { .irq_msk = BIT(25), .msg = "DMAC1_TX_POISON" },
+   { .irq_msk = BIT(26), .msg = "DMAC2_TX_POISON" },
+   { .irq_msk = BIT(27), .msg = "DMAC3_TX_POISON" },
+   { .irq_msk = BIT(28), .msg = "DMAC4_TX_POISON" },
+   { .irq_msk = BIT(29), .msg = "DMAC5_TX_POISON" },
+   { .irq_msk = BIT(30), .msg = "DMAC6_TX_POISON" },
+   { .irq_msk = BIT(31), .msg = "DMAC7_TX_POISON" },
+};
+
+static const struct hisi_sas_hw_error sas_ras_intr1_nfe[] = {
+   { .irq_msk = BIT(0), .msg = "RXM_CFG_MEM3_ECC2B_INTR" },
+   { .irq_msk = BIT(1), .msg = "RXM_CFG_MEM2_ECC2B_INTR" },
+   { .irq_msk = BIT(2), .msg = "RXM_CFG_MEM1_ECC2B_INTR" },
+   { .irq_msk = BIT(3), .msg = "RXM_CFG_MEM0_ECC2B_INTR" },
+   { .irq_msk = BIT(4), .msg = "HGC_CQE_ECC2B_INTR" },
+   { .irq_msk = BIT(5), .msg = "LM_CFG_IOSTL_ECC2B_INTR" },
+   { .irq_msk = BIT(6), .msg = "LM_CFG_ITCTL_ECC2B_INTR" },
+   { .irq_msk = BIT(7), .msg = "HGC_ITCT_ECC2B_INTR" },
+   { .irq_msk = BIT(8), .msg = "HGC_IOST_ECC2B_INTR" },
+   { .irq_msk = BIT(9), .msg = "HGC_DQE_ECC2B_INTR" },
+   { .irq_msk = BIT(10), .msg = "DMAC0_RAM_ECC2B_INTR" },
+   { .irq_msk = BIT(11), .msg = "DMAC1_RAM_ECC2B_INTR" },
+   { .irq_msk = BIT(12), .msg = "DMAC2_RAM_ECC2B_INTR" },
+   { .irq_msk = BIT(13), .msg = "DMAC3_RAM_ECC2B_INTR" },
+   { .irq_msk = BIT(14), .msg = "DMAC4_RAM_ECC2B_INTR" },
+   { .irq_msk = BIT(15), .msg = "DMAC5_RAM_ECC2B_INTR" },
+   { .irq_msk = BIT(16), .msg = "DMAC6_RAM_ECC2B_INTR" },
+   { .irq_msk = BIT(17), .msg = "DMAC7_RAM_ECC2B_INTR" },
+   { .irq_msk = BIT(18), .msg = "OOO_RAM_ECC2B_INTR" },
+   { .irq_msk = BIT(20), .msg = "HGC_DQE_POISON_INTR" },
+   { .irq_msk = BIT(21), .msg = "HGC_IOST_POISON_INTR" },
+   { .irq_msk = BIT(22), .msg = "HGC_ITCT_POISON_INTR" },
+   { .irq_msk = BIT(23), .msg = "HGC_ITCT_NCQ_POISON_INTR" },
+   { .irq_msk = BIT(24), .msg = "DMAC0_RX_POISON" },
+   { .irq_msk = BIT(25), .msg = "DMAC1_RX_POISON" },
+   { .irq_msk = BIT(26), .msg = "DMAC2_RX_POISON" },
+   { .irq_msk = BIT(27), .msg = "DMAC3_RX_POISON" },
+   { .irq_msk = BIT(28), .msg = "DMAC4_RX_POISON" },
+   { .irq_msk = BIT(29), .msg = "DMAC5_RX_POISON" },
+   { .irq_msk = BIT(30), .msg = "DMAC6_RX_POISON" },
+   { .irq_msk = BIT(31), .msg = "DMAC7_RX_POISON" },
+};
+
+static bool process_non_fatal_error_v3_hw(struct hisi_hba *hisi_hba)
+{
+   struct device *dev = hisi_hba->dev;
+   const struct hisi_sas_hw_error *ras_error;
+   bool need_reset = false;
+   u32 irq_value;
+   int i;
+
+   irq_value = hisi_sas_read32(hisi_hba,

[PATCH 05/19] scsi: hisi_sas: some optimizations of host controller reset

2017-12-08 Thread John Garry

From: Xiaofei Tan 

This patch makes the following optimizations to host controller reset:
1. Unblock SCSI requests before rescanning the topology, as SCSI
commands need to be issued if a new device is found during the
rescan.

2. Remove drain_workqueue(hisi_hba->wq) and
drain_workqueue(shost->work_q), as there is no need to ensure
that all PHY events are done before exiting host reset.

3. Raise the print level of host reset messages. Host reset is an
important and rare event, so we should know its progress
even when not debugging.

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 326ecb2..64d51a8 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1061,8 +1061,6 @@ static void hisi_sas_rescan_topology(struct hisi_hba 
*hisi_hba, u32 old_state,
hisi_sas_phy_down(hisi_hba, phy_no, 0);
 
}
-
-   drain_workqueue(hisi_hba->shost->work_q);
 }
 
 static int hisi_sas_controller_reset(struct hisi_hba *hisi_hba)
@@ -1079,7 +1077,7 @@ static int hisi_sas_controller_reset(struct hisi_hba 
*hisi_hba)
if (test_and_set_bit(HISI_SAS_RESET_BIT, &hisi_hba->flags))
return -1;
 
-   dev_dbg(dev, "controller resetting...\n");
+   dev_info(dev, "controller resetting...\n");
old_state = hisi_hba->hw->get_phys_state(hisi_hba);
 
scsi_block_requests(shost);
@@ -1088,6 +1086,7 @@ static int hisi_sas_controller_reset(struct hisi_hba 
*hisi_hba)
if (rc) {
dev_warn(dev, "controller reset failed (%d)\n", rc);
clear_bit(HISI_SAS_REJECT_CMD_BIT, &hisi_hba->flags);
+   scsi_unblock_requests(shost);
goto out;
}
spin_lock_irqsave(&hisi_hba->lock, flags);
@@ -1100,15 +1099,13 @@ static int hisi_sas_controller_reset(struct hisi_hba 
*hisi_hba)
hisi_hba->hw->phys_init(hisi_hba);
msleep(1000);
hisi_sas_refresh_port_id(hisi_hba);
-   drain_workqueue(hisi_hba->wq);
-   drain_workqueue(shost->work_q);
+   scsi_unblock_requests(shost);
 
state = hisi_hba->hw->get_phys_state(hisi_hba);
hisi_sas_rescan_topology(hisi_hba, old_state, state);
-   dev_dbg(dev, "controller reset complete\n");
+   dev_info(dev, "controller reset complete\n");
 
 out:
-   scsi_unblock_requests(shost);
clear_bit(HISI_SAS_RESET_BIT, &hisi_hba->flags);
 
return rc;
-- 
1.9.1
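
Because the patch moves scsi_unblock_requests() out of the common "out:" label, each exit path now has to unblock on its own. The ordering can be modelled with a small userspace sketch. All names are hypothetical, and the boolean fields only stand in for the real SCSI midlayer state.

```c
#include <stdbool.h>

/* Hypothetical model of the host state touched during a reset. */
struct host_sketch {
	bool blocked;              /* true while SCSI requests are blocked */
	bool rescan_ran_unblocked; /* set if the rescan saw an unblocked host */
};

static int controller_reset_sketch(struct host_sketch *h, bool hw_reset_fails)
{
	h->blocked = true;          /* stands in for scsi_block_requests() */

	if (hw_reset_fails) {
		h->blocked = false; /* failure path unblocks before bailing out */
		return -1;
	}

	h->blocked = false;         /* unblock first ... */
	h->rescan_ran_unblocked = !h->blocked; /* ... then rescan the topology */
	return 0;
}
```

In both outcomes the host ends up unblocked, and on success the rescan runs only after unblocking.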

[PATCH 18/19] scsi: hisi_sas: re-add the lldd_port_deformed()

2017-12-08 Thread John Garry

From: Xiang Chen 

Function sas_suspend_devices() requires the
lldd_port_deformed callback to be implemented if
lldd_port_formed is implemented.

So add a stub for lldd_port_deformed.

Callback lldd_port_deformed was not required previously, as the
port deformation is done elsewhere in the LLDD.

Signed-off-by: Xiang Chen 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 9bd98e5..ad12237 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1613,6 +1613,10 @@ static void hisi_sas_port_formed(struct asd_sas_phy 
*sas_phy)
hisi_sas_port_notify_formed(sas_phy);
 }
 
+static void hisi_sas_port_deformed(struct asd_sas_phy *sas_phy)
+{
+}
+
 static void hisi_sas_phy_disconnected(struct hisi_sas_phy *phy)
 {
phy->phy_attached = 0;
@@ -1703,6 +1707,7 @@ void hisi_sas_kill_tasklets(struct hisi_hba *hisi_hba)
.lldd_query_task= hisi_sas_query_task,
.lldd_clear_nexus_ha = hisi_sas_clear_nexus_ha,
.lldd_port_formed   = hisi_sas_port_formed,
+   .lldd_port_deformed = hisi_sas_port_deformed,
 };
 
 void hisi_sas_init_mem(struct hisi_hba *hisi_hba)
-- 
1.9.1

[PATCH 17/19] scsi: hisi_sas: fix SAS_QUEUE_FULL problem while running IO

2017-12-08 Thread John Garry

From: Xiang Chen 

This patch fixes a SAS_QUEUE_FULL problem. The test scenario is
closing a port while running IO.

In sas_eh_handle_sas_errors(), SCSI EH will free the sas_task of
the device if lldd_I_T_nexus_reset() returns
TMF_RESP_FUNC_COMPLETE or -ENODEV.
But in our SAS driver, we only free the slots of the device when
the return value is TMF_RESP_FUNC_COMPLETE. So if the return
value is -ENODEV, the slot resources are never freed.

As a solution, we should also free the slots of the device in
lldd_I_T_nexus_reset() if the return value is -ENODEV.

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 302da84..9bd98e5 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1308,7 +1308,7 @@ static int hisi_sas_I_T_nexus_reset(struct domain_device 
*device)
 
rc = hisi_sas_debug_I_T_nexus_reset(device);
 
-   if (rc == TMF_RESP_FUNC_COMPLETE) {
+   if ((rc == TMF_RESP_FUNC_COMPLETE) || (rc == -ENODEV)) {
spin_lock_irqsave(&hisi_hba->lock, flags);
hisi_sas_release_task(hisi_hba, device);
spin_unlock_irqrestore(&hisi_hba->lock, flags);
-- 
1.9.1
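
The changed condition can be read as a small predicate: release the slots whenever SCSI EH is going to free the sas_task. A minimal userspace sketch follows. The function name is hypothetical, and the numeric value of TMF_RESP_FUNC_COMPLETE is an assumption that mirrors the libsas definition.

```c
#include <errno.h>
#include <stdbool.h>

#define TMF_RESP_FUNC_COMPLETE 0x00 /* assumed value, mirroring libsas */

/* Hypothetical predicate matching the new test in hisi_sas_I_T_nexus_reset(). */
static bool should_release_slots(int rc)
{
	return rc == TMF_RESP_FUNC_COMPLETE || rc == -ENODEV;
}
```

Any other result, including other negative error codes, leaves the slots alone, as before.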

[PATCH 03/19] scsi: hisi_sas: relocate clearing ITCT and freeing device

2017-12-08 Thread John Garry

From: Xiaofei Tan 

In certain scenarios we may just want to clear the ITCT for
a device, and not free other resources like the SATA bitmap
used in v2 hw.

To facilitate this, this patch relocates the code for clearing
the ITCT from free_device() to a new hw interface, clear_itct().
Then, for some hw, we need not implement free_device() if there's
nothing left for it to do.

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas.h   |  3 ++-
 drivers/scsi/hisi_sas/hisi_sas_main.c  |  7 +--
 drivers/scsi/hisi_sas/hisi_sas_v1_hw.c |  4 ++--
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 16 +++-
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c |  4 ++--
 5 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index 83357b03..b2534ca 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -205,8 +205,9 @@ struct hisi_sas_hw {
void (*phy_set_linkrate)(struct hisi_hba *hisi_hba, int phy_no,
struct sas_phy_linkrates *linkrates);
enum sas_linkrate (*phy_get_max_linkrate)(void);
-   void (*free_device)(struct hisi_hba *hisi_hba,
+   void (*clear_itct)(struct hisi_hba *hisi_hba,
struct hisi_sas_device *dev);
+   void (*free_device)(struct hisi_sas_device *sas_dev);
int (*get_wideport_bitmap)(struct hisi_hba *hisi_hba, int port_id);
void (*dereg_device)(struct hisi_hba *hisi_hba,
struct domain_device *device);
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index d842530..6446ce2 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -743,7 +743,10 @@ static void hisi_sas_dev_gone(struct domain_device *device)
 
hisi_sas_dereg_device(hisi_hba, device);
 
-   hisi_hba->hw->free_device(hisi_hba, sas_dev);
+   hisi_hba->hw->clear_itct(hisi_hba, sas_dev);
+   if (hisi_hba->hw->free_device)
+   hisi_hba->hw->free_device(sas_dev);
+
device->lldd_dev = NULL;
memset(sas_dev, 0, sizeof(*sas_dev));
sas_dev->dev_type = SAS_PHY_UNUSED;
@@ -1001,7 +1004,7 @@ static void hisi_sas_refresh_port_id(struct hisi_hba 
*hisi_hba,
|| !device || (device->port != sas_port))
continue;
 
-   hisi_hba->hw->free_device(hisi_hba, sas_dev);
+   hisi_hba->hw->clear_itct(hisi_hba, sas_dev);
 
/* Update linkrate of directly attached device. */
if (!device->parent)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
index dc6eca8..8cb9061 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
@@ -544,7 +544,7 @@ static void setup_itct_v1_hw(struct hisi_hba *hisi_hba,
(0xff00ULL << ITCT_HDR_REJ_OPEN_TL_OFF));
 }
 
-static void free_device_v1_hw(struct hisi_hba *hisi_hba,
+static void clear_itct_v1_hw(struct hisi_hba *hisi_hba,
  struct hisi_sas_device *sas_dev)
 {
u64 dev_id = sas_dev->device_id;
@@ -1850,7 +1850,7 @@ static int hisi_sas_v1_init(struct hisi_hba *hisi_hba)
.hw_init = hisi_sas_v1_init,
.setup_itct = setup_itct_v1_hw,
.sl_notify = sl_notify_v1_hw,
-   .free_device = free_device_v1_hw,
+   .clear_itct = clear_itct_v1_hw,
.prep_smp = prep_smp_v1_hw,
.prep_ssp = prep_ssp_v1_hw,
.get_free_slot = get_free_slot_v1_hw,
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 5d3467f..cd9cd84 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -952,7 +952,7 @@ static void setup_itct_v2_hw(struct hisi_hba *hisi_hba,
(0x1ULL << ITCT_HDR_RTOLT_OFF));
 }
 
-static void free_device_v2_hw(struct hisi_hba *hisi_hba,
+static void clear_itct_v2_hw(struct hisi_hba *hisi_hba,
  struct hisi_sas_device *sas_dev)
 {
DECLARE_COMPLETION_ONSTACK(completion);
@@ -963,10 +963,6 @@ static void free_device_v2_hw(struct hisi_hba *hisi_hba,
 
sas_dev->completion = &completion;
 
-   /* SoC bug workaround */
-   if (dev_is_sata(sas_dev->sas_device))
-   clear_bit(sas_dev->sata_idx, hisi_hba->sata_dev_bitmap);
-
/* clear the itct interrupt state */
if (ENT_INT_SRC3_ITC_INT_MSK & reg_val)
hisi_sas_write32(hisi_hba, ENT_INT_SRC3,
@@ -981,6 +977,15 @@ static void free_device_v2_hw(struct hisi_hba *hisi_hba,
}
 }
 
+static void free_device_v2_hw(struct hisi_sas_device *sas_dev)
+{
+   struct hisi_hba *hisi_hba = sas_dev->hisi_hba;
+
+   /* SoC bug

[PATCH 16/19] scsi: hisi_sas: add internal abort dev in some places

2017-12-08 Thread John Garry

From: Xiaofei Tan 

We should do an internal abort of the device before
TMF_ABORT_TASK_SET and TMF_LU_RESET, because we may
only have done an internal abort for a single IO in the
earlier part of the SCSI EH process. Even for that single
IO, we don't know whether the internal abort was
successful.

Besides, we should release the slots of the device in
hisi_sas_abort_task_set() if the abort is successful.

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 1b9c48c..302da84 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1238,12 +1238,29 @@ static int hisi_sas_abort_task(struct sas_task *task)
 
 static int hisi_sas_abort_task_set(struct domain_device *device, u8 *lun)
 {
+   struct hisi_hba *hisi_hba = dev_to_hisi_hba(device);
+   struct device *dev = hisi_hba->dev;
struct hisi_sas_tmf_task tmf_task;
int rc = TMF_RESP_FUNC_FAILED;
+   unsigned long flags;
+
+   rc = hisi_sas_internal_task_abort(hisi_hba, device,
+   HISI_SAS_INT_ABT_DEV, 0);
+   if (rc < 0) {
+   dev_err(dev, "abort task set: internal abort rc=%d\n", rc);
+   return TMF_RESP_FUNC_FAILED;
+   }
+   hisi_sas_dereg_device(hisi_hba, device);
 
tmf_task.tmf = TMF_ABORT_TASK_SET;
rc = hisi_sas_debug_issue_ssp_tmf(device, lun, &tmf_task);
 
+   if (rc == TMF_RESP_FUNC_COMPLETE) {
+   spin_lock_irqsave(&hisi_hba->lock, flags);
+   hisi_sas_release_task(hisi_hba, device);
+   spin_unlock_irqrestore(&hisi_hba->lock, flags);
+   }
+
return rc;
 }
 
@@ -1333,6 +1350,14 @@ static int hisi_sas_lu_reset(struct domain_device 
*device, u8 *lun)
} else {
struct hisi_sas_tmf_task tmf_task = { .tmf =  TMF_LU_RESET };
 
+   rc = hisi_sas_internal_task_abort(hisi_hba, device,
+   HISI_SAS_INT_ABT_DEV, 0);
+   if (rc < 0) {
+   dev_err(dev, "lu_reset: internal abort failed\n");
+   goto out;
+   }
+   hisi_sas_dereg_device(hisi_hba, device);
+
rc = hisi_sas_debug_issue_ssp_tmf(device, lun, &tmf_task);
if (rc == TMF_RESP_FUNC_COMPLETE) {
spin_lock_irqsave(&hisi_hba->lock, flags);
-- 
1.9.1
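
The control flow added to hisi_sas_abort_task_set() has three steps: a device-wide internal abort, then the TMF, then slot release on success. It can be sketched in userspace as follows. The names are hypothetical, the TMF result is passed in as a parameter instead of being produced by real hardware, and the TMF_RESP_* values are assumptions that mirror libsas.

```c
#define TMF_RESP_FUNC_COMPLETE 0x00 /* assumed value, mirroring libsas */
#define TMF_RESP_FUNC_FAILED   0x05 /* assumed value, mirroring libsas */

/* Hypothetical trace of which steps ran. */
struct eh_trace {
	int tmf_issued;
	int slots_released;
};

static int abort_task_set_sketch(int abort_rc, int tmf_rc, struct eh_trace *t)
{
	/* A negative internal-abort result stops the sequence early. */
	if (abort_rc < 0)
		return TMF_RESP_FUNC_FAILED;

	t->tmf_issued = 1; /* stands in for hisi_sas_debug_issue_ssp_tmf() */

	/* Slots are released only when the TMF itself completed. */
	if (tmf_rc == TMF_RESP_FUNC_COMPLETE)
		t->slots_released = 1;

	return tmf_rc;
}
```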

[PATCH 13/19] scsi: hisi_sas: use an general way to delay PHY work

2017-12-08 Thread John Garry

From: Xiaofei Tan 

Use a general way to do delayed work for a PHY. Then it will
be easier to add new delayed work for a PHY in the future.

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas.h   |  9 -
 drivers/scsi/hisi_sas/hisi_sas_main.c  | 22 --
 drivers/scsi/hisi_sas/hisi_sas_v1_hw.c |  2 +-
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c |  4 ++--
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c |  2 +-
 5 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index 71bc8ea..aa14638 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -124,12 +124,17 @@ enum hisi_sas_bit_err_type {
HISI_SAS_ERR_MULTI_BIT_ECC = 0x1,
 };
 
+enum hisi_sas_phy_event {
+   HISI_PHYE_PHY_UP   = 0U,
+   HISI_PHYES_NUM,
+};
+
 struct hisi_sas_phy {
+   struct work_struct  works[HISI_PHYES_NUM];
struct hisi_hba *hisi_hba;
struct hisi_sas_port*port;
struct asd_sas_phy  sas_phy;
struct sas_identify identify;
-   struct work_struct  phyup_ws;
u64 port_id; /* from hw */
u64 dev_sas_addr;
u64 frame_rcvd_size;
@@ -453,4 +458,6 @@ extern void hisi_sas_slot_task_free(struct hisi_hba 
*hisi_hba,
 extern void hisi_sas_rst_work_handler(struct work_struct *work);
 extern void hisi_sas_sync_rst_work_handler(struct work_struct *work);
 extern void hisi_sas_kill_tasklets(struct hisi_hba *hisi_hba);
+extern bool hisi_sas_notify_phy_event(struct hisi_sas_phy *phy,
+   enum hisi_sas_phy_event event);
 #endif
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 1f6f063..326dc81 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -622,7 +622,7 @@ static int hisi_sas_scan_finished(struct Scsi_Host *shost, 
unsigned long time)
 static void hisi_sas_phyup_work(struct work_struct *work)
 {
struct hisi_sas_phy *phy =
-   container_of(work, struct hisi_sas_phy, phyup_ws);
+   container_of(work, typeof(*phy), works[HISI_PHYE_PHY_UP]);
struct hisi_hba *hisi_hba = phy->hisi_hba;
struct asd_sas_phy *sas_phy = &phy->sas_phy;
int phy_no = sas_phy->id;
@@ -631,10 +631,27 @@ static void hisi_sas_phyup_work(struct work_struct *work)
hisi_sas_bytes_dmaed(hisi_hba, phy_no);
 }
 
+static const work_func_t hisi_sas_phye_fns[HISI_PHYES_NUM] = {
+   [HISI_PHYE_PHY_UP] = hisi_sas_phyup_work,
+};
+
+bool hisi_sas_notify_phy_event(struct hisi_sas_phy *phy,
+   enum hisi_sas_phy_event event)
+{
+   struct hisi_hba *hisi_hba = phy->hisi_hba;
+
+   if (WARN_ON(event >= HISI_PHYES_NUM))
+   return false;
+
+   return queue_work(hisi_hba->wq, &phy->works[event]);
+}
+EXPORT_SYMBOL_GPL(hisi_sas_notify_phy_event);
+
 static void hisi_sas_phy_init(struct hisi_hba *hisi_hba, int phy_no)
 {
struct hisi_sas_phy *phy = &hisi_hba->phy[phy_no];
struct asd_sas_phy *sas_phy = &phy->sas_phy;
+   int i;
 
phy->hisi_hba = hisi_hba;
phy->port = NULL;
@@ -652,7 +669,8 @@ static void hisi_sas_phy_init(struct hisi_hba *hisi_hba, 
int phy_no)
sas_phy->ha = (struct sas_ha_struct *)hisi_hba->shost->hostdata;
sas_phy->lldd_phy = phy;
 
-   INIT_WORK(&phy->phyup_ws, hisi_sas_phyup_work);
+   for (i = 0; i < HISI_PHYES_NUM; i++)
+   INIT_WORK(&phy->works[i], hisi_sas_phye_fns[i]);
 }
 
 static void hisi_sas_port_notify_formed(struct asd_sas_phy *sas_phy)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
index 8cb9061..679e76f 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
@@ -1482,7 +1482,7 @@ static irqreturn_t int_phyup_v1_hw(int irq_no, void *p)
else if (phy->identify.device_type != SAS_PHY_UNUSED)
phy->identify.target_port_protocols =
SAS_PROTOCOL_SMP;
-   queue_work(hisi_hba->wq, &phy->phyup_ws);
+   hisi_sas_notify_phy_event(phy, HISI_PHYE_PHY_UP);
 
 end:
hisi_sas_phy_write32(hisi_hba, phy_no, CHL_INT2,
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 7257311..e521c42 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -2708,7 +2708,7 @@ static int phy_up_v2_hw(int phy_no, struct hisi_hba 
*hisi_hba)
if (!timer_pending(&hisi_hba->timer))
set_link_timer_quirk(hisi_hba);
}
-   queue_work(hisi_hba->wq, &phy->phyup_ws);
+   hisi_sas_notify_phy_event(phy, HISI_PHYE_PHY_UP);
 
 end:
hisi_sas_phy_write32(hisi_hba, phy_no, CHL_INT0,
@@ -3262,7 +3262,7 @@ static
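
The generic mechanism is an enum of events indexing a table of handlers, plus one bounds-checked notify function. A userspace sketch of that shape follows. All names are hypothetical, and the handler is called directly here, whereas the driver queues a work item on hisi_hba->wq. The second event is included only to show how the table grows.

```c
#include <stdbool.h>

/* Hypothetical event enum; the driver's is enum hisi_sas_phy_event. */
enum phy_event_sketch {
	PHYE_PHY_UP = 0,
	PHYE_LINK_RESET,
	PHYES_NUM,
};

struct phy_sketch {
	int handled[PHYES_NUM]; /* per-event call counters */
};

typedef void (*phy_work_fn)(struct phy_sketch *phy);

static void phyup_work_sketch(struct phy_sketch *phy)
{
	phy->handled[PHYE_PHY_UP]++;
}

static void linkreset_work_sketch(struct phy_sketch *phy)
{
	phy->handled[PHYE_LINK_RESET]++;
}

/* One handler per event, indexed by the enum value. */
static const phy_work_fn phye_fns_sketch[PHYES_NUM] = {
	[PHYE_PHY_UP] = phyup_work_sketch,
	[PHYE_LINK_RESET] = linkreset_work_sketch,
};

/* Reject out-of-range events, otherwise dispatch through the table. */
static bool notify_phy_event_sketch(struct phy_sketch *phy,
				    enum phy_event_sketch event)
{
	if ((unsigned int)event >= PHYES_NUM)
		return false;

	phye_fns_sketch[event](phy);
	return true;
}
```

With this layout a new event takes one enum entry and one table row, which is the ease of extension the commit message describes.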

[PATCH 11/19] scsi: hisi_sas: improve int_chnl_int_v2_hw() consistency with v3 hw

2017-12-08 Thread John Garry

From: Xiaofei Tan 

Change the code format of int_chnl_int_v2_hw() to be consistent
with v3 hw, which removes one level of tab indentation.

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 58 --
 1 file changed, 28 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 8d6886a..4c4a000 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -2848,40 +2848,38 @@ static irqreturn_t int_chnl_int_v2_hw(int irq_no, void 
*p)
HGC_INVLD_DQE_INFO_FB_CH3_OFF) & 0x1ff;
 
while (irq_msk) {
-   if (irq_msk & (1 << phy_no)) {
-   u32 irq_value0 = hisi_sas_phy_read32(hisi_hba, phy_no,
-CHL_INT0);
-   u32 irq_value1 = hisi_sas_phy_read32(hisi_hba, phy_no,
-CHL_INT1);
-   u32 irq_value2 = hisi_sas_phy_read32(hisi_hba, phy_no,
-CHL_INT2);
-
-   if (irq_value1) {
-   if (irq_value1 & (CHL_INT1_DMAC_RX_ECC_ERR_MSK |
- CHL_INT1_DMAC_TX_ECC_ERR_MSK))
-   panic("%s: DMAC RX/TX ecc bad error!\
-  (0x%x)",
- dev_name(dev), irq_value1);
-
-   hisi_sas_phy_write32(hisi_hba, phy_no,
-CHL_INT1, irq_value1);
-   }
+   u32 irq_value0 = hisi_sas_phy_read32(hisi_hba, phy_no,
+CHL_INT0);
+   u32 irq_value1 = hisi_sas_phy_read32(hisi_hba, phy_no,
+CHL_INT1);
+   u32 irq_value2 = hisi_sas_phy_read32(hisi_hba, phy_no,
+CHL_INT2);
+
+   if ((irq_msk & (1 << phy_no)) && irq_value1) {
+   if (irq_value1 & (CHL_INT1_DMAC_RX_ECC_ERR_MSK |
+ CHL_INT1_DMAC_TX_ECC_ERR_MSK))
+   panic("%s: DMAC RX/TX ecc bad error!\
+  (0x%x)",
+ dev_name(dev), irq_value1);
 
-   if (irq_value2)
-   hisi_sas_phy_write32(hisi_hba, phy_no,
-CHL_INT2, irq_value2);
+   hisi_sas_phy_write32(hisi_hba, phy_no,
+CHL_INT1, irq_value1);
+   }
 
+   if ((irq_msk & (1 << phy_no)) && irq_value2)
+   hisi_sas_phy_write32(hisi_hba, phy_no,
+CHL_INT2, irq_value2);
 
-   if (irq_value0) {
-   if (irq_value0 & CHL_INT0_SL_RX_BCST_ACK_MSK)
-   phy_bcast_v2_hw(phy_no, hisi_hba);
 
-   hisi_sas_phy_write32(hisi_hba, phy_no,
-   CHL_INT0, irq_value0
-   & (~CHL_INT0_HOTPLUG_TOUT_MSK)
-   & (~CHL_INT0_SL_PHY_ENABLE_MSK)
-   & (~CHL_INT0_NOT_RDY_MSK));
-   }
+   if ((irq_msk & (1 << phy_no)) && irq_value0) {
+   if (irq_value0 & CHL_INT0_SL_RX_BCST_ACK_MSK)
+   phy_bcast_v2_hw(phy_no, hisi_hba);
+
+   hisi_sas_phy_write32(hisi_hba, phy_no,
+   CHL_INT0, irq_value0
+   & (~CHL_INT0_HOTPLUG_TOUT_MSK)
+   & (~CHL_INT0_SL_PHY_ENABLE_MSK)
+   & (~CHL_INT0_NOT_RDY_MSK));
}
irq_msk &= ~(1 << phy_no);
phy_no++;
-- 
1.9.1

[PATCH 19/19] scsi: hisi_sas: add v3 hw suspend and resume

2017-12-08 Thread John Garry

From: Xiang Chen 

v3 hw SAS supports changing the power state from D0 to D3
to enter low-power status, and from D3 to D0 to leave
low-power status.

When the power state goes from D0 to D3, HW will send an FLR
to clear the registers of the ECAM and BAR spaces, and when
the power state goes from D3 to D0, it will clear the
registers of the ECAM space only.

So on suspend we need to do the same as for a controller
reset (including disabling interrupts/DQ/PHY/BUS), and
also release slots after the FLR. On resume,
re-configure the registers of the BAR space.

Signed-off-by: Xiang Chen 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas.h   |  1 +
 drivers/scsi/hisi_sas/hisi_sas_main.c  |  3 +-
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 94 ++
 3 files changed, 97 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index 4343c4c..cc05029 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -461,4 +461,5 @@ extern void hisi_sas_slot_task_free(struct hisi_hba 
*hisi_hba,
 extern void hisi_sas_kill_tasklets(struct hisi_hba *hisi_hba);
 extern bool hisi_sas_notify_phy_event(struct hisi_sas_phy *phy,
enum hisi_sas_phy_event event);
+extern void hisi_sas_release_tasks(struct hisi_hba *hisi_hba);
 #endif
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index ad12237..04e1172b 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -737,7 +737,7 @@ static void hisi_sas_release_task(struct hisi_hba *hisi_hba,
hisi_sas_do_release_task(hisi_hba, slot->task, slot);
 }
 
-static void hisi_sas_release_tasks(struct hisi_hba *hisi_hba)
+void hisi_sas_release_tasks(struct hisi_hba *hisi_hba)
 {
struct hisi_sas_device *sas_dev;
struct domain_device *device;
@@ -754,6 +754,7 @@ static void hisi_sas_release_tasks(struct hisi_hba 
*hisi_hba)
hisi_sas_release_task(hisi_hba, device);
}
 }
+EXPORT_SYMBOL_GPL(hisi_sas_release_tasks);
 
 static void hisi_sas_dereg_device(struct hisi_hba *hisi_hba,
struct domain_device *device)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 9e32105..6a408d2 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -2303,6 +2303,98 @@ enum {
hip08,
 };
 
+static int hisi_sas_v3_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+   struct sas_ha_struct *sha = pci_get_drvdata(pdev);
+   struct hisi_hba *hisi_hba = sha->lldd_ha;
+   struct device *dev = hisi_hba->dev;
+   struct Scsi_Host *shost = hisi_hba->shost;
+   u32 device_state, status;
+   int rc;
+   u32 reg_val;
+   unsigned long flags;
+
+   if (!pdev->pm_cap) {
+   dev_err(dev, "PCI PM not supported\n");
+   return -ENODEV;
+   }
+
+   set_bit(HISI_SAS_RESET_BIT, &hisi_hba->flags);
+   scsi_block_requests(shost);
+   set_bit(HISI_SAS_REJECT_CMD_BIT, &hisi_hba->flags);
+   flush_workqueue(hisi_hba->wq);
+   /* disable DQ/PHY/bus */
+   interrupt_disable_v3_hw(hisi_hba);
+   hisi_sas_write32(hisi_hba, DLVRY_QUEUE_ENABLE, 0x0);
+   hisi_sas_kill_tasklets(hisi_hba);
+
+   hisi_sas_stop_phys(hisi_hba);
+
+   reg_val = hisi_sas_read32(hisi_hba, AXI_MASTER_CFG_BASE +
+   AM_CTRL_GLOBAL);
+   reg_val |= 0x1;
+   hisi_sas_write32(hisi_hba, AXI_MASTER_CFG_BASE +
+   AM_CTRL_GLOBAL, reg_val);
+
+   /* wait until bus idle */
+   rc = readl_poll_timeout(hisi_hba->regs + AXI_MASTER_CFG_BASE +
+   AM_CURR_TRANS_RETURN, status, status == 0x3, 10, 100);
+   if (rc) {
+   dev_err(dev, "axi bus is not idle, rc = %d\n", rc);
+   clear_bit(HISI_SAS_REJECT_CMD_BIT, &hisi_hba->flags);
+   clear_bit(HISI_SAS_RESET_BIT, &hisi_hba->flags);
+   scsi_unblock_requests(shost);
+   return rc;
+   }
+
+   hisi_sas_init_mem(hisi_hba);
+
+   device_state = pci_choose_state(pdev, state);
+   dev_warn(dev, "entering operating state [D%d]\n",
+   device_state);
+   pci_save_state(pdev);
+   pci_disable_device(pdev);
+   pci_set_power_state(pdev, device_state);
+
+   spin_lock_irqsave(&hisi_hba->lock, flags);
+   hisi_sas_release_tasks(hisi_hba);
+   spin_unlock_irqrestore(&hisi_hba->lock, flags);
+
+   sas_suspend_ha(sha);
+   return 0;
+}
+
+static int hisi_sas_v3_resume(struct pci_dev *pdev)
+{
+   struct sas_ha_struct *sha = pci_get_drvdata(pdev);
+   struct hisi_hba *hisi_hba = sha->lldd_ha;
+   struct Scsi_Host *shost = hisi_hba->shost;
+   struct device *dev = hisi_hba->dev;
+   unsigned int rc;
+   u32
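
The "wait until bus idle" step in the suspend path polls a status register until it reads 0x3 or a timeout expires. The same shape can be sketched in userspace with a bounded loop. The names and the fake register model are hypothetical, and a poll count stands in for the real microsecond timeout.

```c
#include <errno.h>
#include <stdint.h>

typedef uint32_t (*read_status_fn)(void *ctx);

/* Poll until the status equals idle_value or max_polls reads have been made. */
static int poll_until_idle(read_status_fn read_status, void *ctx,
			   uint32_t idle_value, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (read_status(ctx) == idle_value)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Hypothetical register model: reads 0x3 from the idle_after-th read on. */
struct fake_bus {
	unsigned int reads;
	unsigned int idle_after;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_bus *bus = ctx;

	bus->reads++;
	return bus->reads >= bus->idle_after ? 0x3 : 0x0;
}
```

On timeout the suspend code shown above undoes its earlier steps (clears both flag bits and unblocks requests) before returning the error, so a caller of such a helper needs a matching rollback path.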

[PATCH 15/19] scsi: hisi_sas: judge result of internal abort

2017-12-08 Thread John Garry

From: Xiaofei Tan 

Normally, hardware should ensure that an internal abort
timeout never happens. If it does happen, it is an SoC
failure. What's more, HW will not process any other
commands if an internal abort hasn't returned a CQ, and they
will time out as well.

So, we should check the result of the internal abort in SCSI
EH. If it failed, we should give up on TMF/softreset
and return failure to the upper layer directly.

This patch does the following things to achieve this.
1. When an internal abort timeout happens, we set the return
value to -EIO in hisi_sas_internal_task_abort().

2. If prep_abort() is not supported, let
hisi_sas_internal_task_abort() return
TMF_RESP_FUNC_FAILED.

3. If hisi_sas_internal_task_abort() returns
a negative number, it can be taken to mean that it did not
execute properly or that the internal abort timed out. Then we
skip the subsequent TMF or softreset, and return failure
directly.

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 38 ---
 1 file changed, 31 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 7446a39..1b9c48c 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1184,6 +1184,11 @@ static int hisi_sas_abort_task(struct sas_task *task)
 
rc2 = hisi_sas_internal_task_abort(hisi_hba, device,
   HISI_SAS_INT_ABT_CMD, tag);
+   if (rc2 < 0) {
+   dev_err(dev, "abort task: internal abort (%d)\n", rc2);
+   return TMF_RESP_FUNC_FAILED;
+   }
+
/*
 * If the TMF finds that the IO is not in the device and also
 * the internal abort does not succeed, then it is safe to
@@ -1201,8 +1206,12 @@ static int hisi_sas_abort_task(struct sas_task *task)
} else if (task->task_proto & SAS_PROTOCOL_SATA ||
task->task_proto & SAS_PROTOCOL_STP) {
if (task->dev->dev_type == SAS_SATA_DEV) {
-   hisi_sas_internal_task_abort(hisi_hba, device,
-HISI_SAS_INT_ABT_DEV, 0);
+   rc = hisi_sas_internal_task_abort(hisi_hba, device,
+   HISI_SAS_INT_ABT_DEV, 0);
+   if (rc < 0) {
+   dev_err(dev, "abort task: internal abort failed\n");
+   goto out;
+   }
hisi_sas_dereg_device(hisi_hba, device);
rc = hisi_sas_softreset_ata_disk(device);
}
@@ -1213,7 +1222,8 @@ static int hisi_sas_abort_task(struct sas_task *task)
 
rc = hisi_sas_internal_task_abort(hisi_hba, device,
 HISI_SAS_INT_ABT_CMD, tag);
-   if (rc == TMF_RESP_FUNC_FAILED && task->lldd_task) {
+   if (((rc < 0) || (rc == TMF_RESP_FUNC_FAILED)) &&
+   task->lldd_task) {
spin_lock_irqsave(&hisi_hba->lock, flags);
hisi_sas_do_release_task(hisi_hba, task, slot);
spin_unlock_irqrestore(&hisi_hba->lock, flags);
@@ -1263,15 +1273,20 @@ static int hisi_sas_I_T_nexus_reset(struct 
domain_device *device)
 {
struct hisi_sas_device *sas_dev = device->lldd_dev;
struct hisi_hba *hisi_hba = dev_to_hisi_hba(device);
-   unsigned long flags;
+   struct device *dev = hisi_hba->dev;
int rc = TMF_RESP_FUNC_FAILED;
+   unsigned long flags;
 
if (sas_dev->dev_status != HISI_SAS_DEV_EH)
return TMF_RESP_FUNC_FAILED;
sas_dev->dev_status = HISI_SAS_DEV_NORMAL;
 
-   hisi_sas_internal_task_abort(hisi_hba, device,
+   rc = hisi_sas_internal_task_abort(hisi_hba, device,
HISI_SAS_INT_ABT_DEV, 0);
+   if (rc < 0) {
+   dev_err(dev, "I_T nexus reset: internal abort (%d)\n", rc);
+   return TMF_RESP_FUNC_FAILED;
+   }
hisi_sas_dereg_device(hisi_hba, device);
 
rc = hisi_sas_debug_I_T_nexus_reset(device);
@@ -1299,8 +1314,10 @@ static int hisi_sas_lu_reset(struct domain_device 
*device, u8 *lun)
/* Clear internal IO and then hardreset */
rc = hisi_sas_internal_task_abort(hisi_hba, device,
  HISI_SAS_INT_ABT_DEV, 0);
-   if (rc == TMF_RESP_FUNC_FAILED)
+   if (rc < 0) {
+   dev_err(dev, "lu_reset: internal abort failed\n");
goto out;
+   }
hisi_sas_dereg_device(hisi_hba, device);
 
phy = sas_get_local_phy(device);
@@ -1497,8

[PATCH 12/19] scsi: hisi_sas: add v2 hw port AXI error handling support

2017-12-08 Thread John Garry

From: Xiaofei Tan 

Add port AXI error handling for v2 hw. We reset the host
controller for such errors.

In addition, change the port multi-bit ECC error handling, since
we should also reset the host for such errors. This patch therefore
puts them in the same struct as the port AXI errors.

Signed-off-by: Xiaofei Tan 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 51 ++
 1 file changed, 45 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 4c4a000..7257311 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -240,6 +240,10 @@
 #define CHL_INT1_DMAC_TX_ECC_ERR_MSK   (0x1 << CHL_INT1_DMAC_TX_ECC_ERR_OFF)
 #define CHL_INT1_DMAC_RX_ECC_ERR_OFF   17
 #define CHL_INT1_DMAC_RX_ECC_ERR_MSK   (0x1 << CHL_INT1_DMAC_RX_ECC_ERR_OFF)
+#define CHL_INT1_DMAC_TX_AXI_WR_ERR_OFF 19
+#define CHL_INT1_DMAC_TX_AXI_RD_ERR_OFF 20
+#define CHL_INT1_DMAC_RX_AXI_WR_ERR_OFF 21
+#define CHL_INT1_DMAC_RX_AXI_RD_ERR_OFF 22
 #define CHL_INT2   (PORT_BASE + 0x1bc)
 #define CHL_INT0_MSK   (PORT_BASE + 0x1c0)
 #define CHL_INT1_MSK   (PORT_BASE + 0x1c4)
@@ -1182,7 +1186,7 @@ static void init_reg_v2_hw(struct hisi_hba *hisi_hba)
hisi_sas_phy_write32(hisi_hba, i, CHL_INT1, 0x);
hisi_sas_phy_write32(hisi_hba, i, CHL_INT2, 0xfff87fff);
hisi_sas_phy_write32(hisi_hba, i, RXOP_CHECK_CFG_H, 0x1000);
-   hisi_sas_phy_write32(hisi_hba, i, CHL_INT1_MSK, 0x);
+   hisi_sas_phy_write32(hisi_hba, i, CHL_INT1_MSK, 0xff857fff);
hisi_sas_phy_write32(hisi_hba, i, CHL_INT2_MSK, 0x8bff);
hisi_sas_phy_write32(hisi_hba, i, SL_CFG, 0x13f801fc);
hisi_sas_phy_write32(hisi_hba, i, PHY_CTRL_RDY_MSK, 0x0);
@@ -2832,6 +2836,33 @@ static void phy_bcast_v2_hw(int phy_no, struct hisi_hba 
*hisi_hba)
hisi_sas_phy_write32(hisi_hba, phy_no, SL_RX_BCAST_CHK_MSK, 0);
 }
 
+static const struct hisi_sas_hw_error port_ecc_axi_error[] = {
+   {
+   .irq_msk = BIT(CHL_INT1_DMAC_TX_ECC_ERR_OFF),
+   .msg = "dmac_tx_ecc_bad_err",
+   },
+   {
+   .irq_msk = BIT(CHL_INT1_DMAC_RX_ECC_ERR_OFF),
+   .msg = "dmac_rx_ecc_bad_err",
+   },
+   {
+   .irq_msk = BIT(CHL_INT1_DMAC_TX_AXI_WR_ERR_OFF),
+   .msg = "dma_tx_axi_wr_err",
+   },
+   {
+   .irq_msk = BIT(CHL_INT1_DMAC_TX_AXI_RD_ERR_OFF),
+   .msg = "dma_tx_axi_rd_err",
+   },
+   {
+   .irq_msk = BIT(CHL_INT1_DMAC_RX_AXI_WR_ERR_OFF),
+   .msg = "dma_rx_axi_wr_err",
+   },
+   {
+   .irq_msk = BIT(CHL_INT1_DMAC_RX_AXI_RD_ERR_OFF),
+   .msg = "dma_rx_axi_rd_err",
+   },
+};
+
 static irqreturn_t int_chnl_int_v2_hw(int irq_no, void *p)
 {
struct hisi_hba *hisi_hba = p;
@@ -2856,11 +2887,19 @@ static irqreturn_t int_chnl_int_v2_hw(int irq_no, void 
*p)
 CHL_INT2);
 
if ((irq_msk & (1 << phy_no)) && irq_value1) {
-   if (irq_value1 & (CHL_INT1_DMAC_RX_ECC_ERR_MSK |
- CHL_INT1_DMAC_TX_ECC_ERR_MSK))
-   panic("%s: DMAC RX/TX ecc bad error!\
-  (0x%x)",
- dev_name(dev), irq_value1);
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(port_ecc_axi_error); i++) {
+   const struct hisi_sas_hw_error *error =
+   &port_ecc_axi_error[i];
+
+   if (!(irq_value1 & error->irq_msk))
+   continue;
+
+   dev_warn(dev, "%s error (phy%d 0x%x) found!\n",
+   error->msg, phy_no, irq_value1);
+   queue_work(hisi_hba->wq, &hisi_hba->rst_work);
+   }
 
hisi_sas_phy_write32(hisi_hba, phy_no,
 CHL_INT1, irq_value1);
-- 
1.9.1

[PATCH 02/19] scsi: hisi_sas: fix dma_unmap_sg() parameter

2017-12-08 Thread John Garry

From: Xiang Chen 

For dma_unmap_sg(), the nents parameter should be the
number of elements in the scatterlist prior to the
mapping, not the number after the mapping.

Fix this usage.
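
To illustrate why the two counts can differ, here is a small userspace
model. It is only a sketch under stated assumptions: struct model_sg and
the model_* helpers are hypothetical stand-ins, not the kernel DMA API.
The model merges address-contiguous entries when mapping, so the mapped
count can be smaller than the original number of entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified scatterlist entry for this sketch. */
struct model_sg {
    unsigned long addr;
    unsigned int len;
    int mapped;
};

/*
 * Model of a mapping call: marks all orig_nents entries as mapped and
 * returns the number of segments after merging entries that are
 * contiguous with their predecessor. The result may be < orig_nents.
 */
static int model_map_sg(struct model_sg *sg, int orig_nents)
{
    int i, segs = 0;

    assert(sg != NULL);
    for (i = 0; i < orig_nents; i++) {
        sg[i].mapped = 1;
        if (i == 0 || sg[i - 1].addr + sg[i - 1].len != sg[i].addr)
            segs++;
    }
    return segs;
}

/* Model of the unmap call: releases exactly the first nents entries. */
static void model_unmap_sg(struct model_sg *sg, int nents)
{
    int i;

    for (i = 0; i < nents; i++)
        sg[i].mapped = 0;
}

/* Counts entries that are still mapped, i.e. leaked by a short unmap. */
static int model_count_mapped(const struct model_sg *sg, int orig_nents)
{
    int i, n = 0;

    for (i = 0; i < orig_nents; i++)
        n += sg[i].mapped;
    return n;
}
```

In this model, unmapping with the value returned by the map call leaves
the tail of the list mapped, while unmapping with the original entry
count releases everything.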

Signed-off-by: Xiang Chen 
Signed-off-by: John Garry 
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c 
b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 359ec52..d842530 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -192,7 +192,8 @@ void hisi_sas_slot_task_free(struct hisi_hba *hisi_hba, 
struct sas_task *task,
 
if (!sas_protocol_ata(task->task_proto))
if (slot->n_elem)
-   dma_unmap_sg(dev, task->scatter, slot->n_elem,
+   dma_unmap_sg(dev, task->scatter,
+task->num_scatter,
 task->data_dir);
 
if (sas_dev)
@@ -431,7 +432,8 @@ static int hisi_sas_task_prep(struct sas_task *task, struct 
hisi_sas_dq
dev_err(dev, "task prep: failed[%d]!\n", rc);
if (!sas_protocol_ata(task->task_proto))
if (n_elem)
-   dma_unmap_sg(dev, task->scatter, n_elem,
+   dma_unmap_sg(dev, task->scatter,
+task->num_scatter,
 task->data_dir);
 prep_out:
return rc;
-- 
1.9.1

Re: [PATCH] scsi: bfa: convert to strlcpy/strlcat

2017-12-08 Thread Johannes Thumshirn

Looks good,
Reviewed-by: Johannes Thumshirn 
-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de                             +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Re: [PATCH] scsi: libiscsi: Allow sd_shutdown on bad transport

2017-12-08 Thread Rafael David Tinoco

Lee, Chris,

Some test results.

- Single unmounted disk, with transport connection wiped before final logout:

http://pastebin.ubuntu.com/26139576/

- Multiple mounted disks, multipath dev-mapper, all transport connections were 
wiped before the final logout, with heavy write workload:

http://pastebin.ubuntu.com/26139620/

Considering sd_shutdown logic - sd_shutdown, sd_sync_cache for each scsi_disk, 
3 attempts of scsi_execute with SYNCHRONIZE_CACHE cmd each -  you can see that, 
because transport was down, first SYNC_CACHE cmd waits for the request timeout 
and for the abort_timeout. All other cmds fail in the enqueuing phase, because 
of the transport failure + previous timeout + server shutdown happening 
simultaneously, so you don't have to wait for timeout on each command again.

This change also covers any pending requests, not only those coming from 
sd_shutdown, and it allows the OS to reboot or shut down regardless of how 
badly userland was configured.

Thank you in advance for considering it.

-Rafael

> On 07/12/2017, at 07:59 PM, Rafael David Tinoco  
> wrote:
> 
> If, for any reason, userland shuts down iscsi transport interfaces
> before proper logouts - like when logging in to LUNs manually,
> without logging out on server shutdown, or when automated scripts
> can't umount/logout from logged LUNs - kernel will hang forever on
> its sd_sync_cache() logic, after issuing the SYNCHRONIZE_CACHE cmd
> to all still existent paths.
> 
> PID: 1 TASK: 8801a69b8000 CPU: 1 COMMAND: "systemd-shutdow"
> #0 [8801a69c3a30] __schedule at 8183e9ee
> #1 [8801a69c3a80] schedule at 8183f0d5
> #2 [8801a69c3a98] schedule_timeout at 81842199
> #3 [8801a69c3b40] io_schedule_timeout at 8183e604
> #4 [8801a69c3b70] wait_for_completion_io_timeout at 8183fc6c
> #5 [8801a69c3bd0] blk_execute_rq at 813cfe10
> #6 [8801a69c3c88] scsi_execute at 815c3fc7
> #7 [8801a69c3cc8] scsi_execute_req_flags at 815c60fe
> #8 [8801a69c3d30] sd_sync_cache at 815d37d7
> #9 [8801a69c3da8] sd_shutdown at 815d3c3c
> 
> This happens because iscsi_eh_cmd_timed_out(), the transport layer
> timeout helper, would tell the queue timeout function (scsi_times_out)
> to reset the request timer over and over, until the session state is
> back to logged in state. Unfortunately, during server shutdown, this
> might never happen again.
> 
> Other option would be "not to handle" the issue in the transport
> layer. That would trigger the error handler logic, which would also
> need the session state to be logged in again.
> 
> Best option, for such case, is to tell upper layers that the command
> was handled during the transport layer error handler helper, marking
> it as DID_NO_CONNECT, which will allow completion and inform about
> the problem.
> 
> After the session was marked as ISCSI_STATE_FAILED, due to the first
> timeout during the server shutdown phase, all subsequent cmds will
> fail to be queued, allowing upper logic to fail faster.
> 
> Signed-off-by: Rafael David Tinoco

Re: [PATCH v5 3/7] scsi: libsas: make the event threshold configurable

2017-12-08 Thread John Garry


On 08/12/2017 09:42, Jason Yan wrote:

Add a sysfs attribute that an LLDD can configure for each host. We added
an example in hisi_sas. Other LLDDs using libsas can implement it if
they want.

Suggested-by: Hannes Reinecke 
Signed-off-by: Jason Yan 
CC: John Garry 
CC: Johannes Thumshirn 
CC: Ewan Milne 
CC: Christoph Hellwig 
CC: Tomas Henzl 
CC: Dan Williams 


Acked-by: John Garry  #for hisi_sas part


---
 drivers/scsi/hisi_sas/hisi_sas_main.c |  6 ++

Re: [PATCH v2 1/3] scsi: Fix a scsi_show_rq() NULL pointer dereference

2017-12-08 Thread Ming Lei

Hi Martin,

On Fri, Dec 08, 2017 at 04:44:55PM +0800, Ming Lei wrote:
> Hi Martin,
> 
> On Thu, Dec 07, 2017 at 09:46:21PM -0500, Martin K. Petersen wrote:
> > 
> > Ming,
> > 
> > > As I explained in [1], the use-after-free is inevitable no matter if
> > > clearing 'SCpnt->cmnd' before mempool_free() in sd_uninit_command() or
> > > not, so we need to comment the fact that cdb may point to garbage
> > > data, and this function(especially __scsi_format_command() has to
> > > survive that, so that people won't be surprised when kasan complains
> > > use-after-free, and guys will be careful when they try to change the
> > > code in future.
> > 
> > Longer term we really need to get rid of the separate CDB allocation. It
> > was a necessary evil when I did it. And not much of a concern since I
> > did not expect anybody sane to use Type 2 (it's designed for use inside
> > disk arrays).
> > 
> > However, I keep hearing about people using Type 2 drives. Some vendors
> > source drives formatted that way and use the same SKU for arrays and
> > standalone servers.
> > 
> > So we should really look into making it possible for a queue to have a
> > bigger than 16-byte built-in CDB. For Type 2 devices, 32-byte reads and
> > writes are a prerequisite. So it would be nice to be able to switch a
> > queue to a larger allocation post creation (we won't know the type until
> > after READ CAPACITY(16) has been sent).
> 
> I am wondering why you don't make __cmd[] in scsi_request defined as 32bytes?
> Even for some hosts with thousands of tag, the memory waste is still not
> too much.
> 
> Or if you prefer to do post creation, we have sbitmap_queue now, which can
> help to build a pre-allocated memory pool easily, and its allocation/free is
> pretty efficient.

Or something like the following patch? I ran several IO tests over 
scsi_debug (dif=2, dix=1), and it appears to work without any problem.


>From 7731af623af164c6be451d9c543ce6b70e7e66b8 Mon Sep 17 00:00:00 2001
From: Ming Lei 
Date: Fri, 8 Dec 2017 17:35:18 +0800
Subject: [PATCH] SCSI: pre-allocate command buffer for T10_PI_TYPE2_PROTECTION

This patch allocates one array for T10_PI_TYPE2_PROTECTION commands.
The size of each element is SD_EXT_CDB_SIZE and the number of elements
is host->can_queue, so we can retrieve a command buffer at runtime
via rq->tag.

This avoids allocating the command buffer at runtime, and also fixes
the recent use-after-free report[1] in scsi_show_rq().

[1] https://marc.info/?l=linux-block&m=151030452715642&w=2
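
The indexing scheme described above can be sketched in userspace as
follows. This is an illustration only: struct model_host and the
model_* helpers are hypothetical, and the SD_EXT_CDB_SIZE value of 32 is
an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SD_EXT_CDB_SIZE 32   /* assumed value for this sketch */

/* Hypothetical stand-in for the host fields used by the scheme. */
struct model_host {
    int can_queue;               /* number of tags the host can issue */
    unsigned char *cmd_ext_buf;  /* can_queue * SD_EXT_CDB_SIZE bytes */
};

/* Allocate one zeroed slot per tag, once, at host setup time. */
static int model_host_alloc_ext_buf(struct model_host *h, int can_queue)
{
    assert(h != NULL && can_queue > 0);
    h->can_queue = can_queue;
    h->cmd_ext_buf = calloc((size_t)can_queue, SD_EXT_CDB_SIZE);
    return h->cmd_ext_buf ? 0 : -1;
}

/* Look up the slot owned by a tag; no allocation on the I/O path. */
static unsigned char *model_get_ext_buf(struct model_host *h, int tag)
{
    assert(tag >= 0 && tag < h->can_queue);
    return &h->cmd_ext_buf[(size_t)tag * SD_EXT_CDB_SIZE];
}

/* Release the whole array when the host goes away. */
static void model_host_free_ext_buf(struct model_host *h)
{
    free(h->cmd_ext_buf);
    h->cmd_ext_buf = NULL;
}
```

Because each tag maps to a fixed slot, the buffer stays valid for the
lifetime of the host, which is the property the description relies on.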

Signed-off-by: Ming Lei 
---
 drivers/scsi/hosts.c |  1 +
 drivers/scsi/sd.c| 55 
 drivers/scsi/sd.h|  4 ++--
 drivers/scsi/sd_dif.c| 32 ++--
 include/scsi/scsi_host.h |  2 ++
 5 files changed, 49 insertions(+), 45 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index fe3a0da3ec97..74f55b8f16fe 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -350,6 +350,7 @@ static void scsi_host_dev_release(struct device *dev)
 
if (parent)
put_device(parent);
+   kfree(shost->cmd_ext_buf);
kfree(shost);
 }
 
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 24fe68522716..853eb57ad4ad 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -131,9 +131,6 @@ static DEFINE_IDA(sd_index_ida);
  * object after last put) */
 static DEFINE_MUTEX(sd_ref_mutex);
 
-static struct kmem_cache *sd_cdb_cache;
-static mempool_t *sd_cdb_pool;
-
 static const char *sd_cache_types[] = {
"write through", "none", "write back",
"write back, no read (daft)"
@@ -1026,6 +1023,13 @@ static int sd_setup_flush_cmnd(struct scsi_cmnd *cmd)
return BLKPREP_OK;
 }
 
+static char *sd_get_ext_buf(struct scsi_device *sdp, struct scsi_cmnd *SCpnt)
+{
+   struct request *rq = SCpnt->request;
+
+   return &sdp->host->cmd_ext_buf[rq->tag * SD_EXT_CDB_SIZE];
+}
+
 static int sd_setup_read_write_cmnd(struct scsi_cmnd *SCpnt)
 {
struct request *rq = SCpnt->request;
@@ -1168,12 +1172,7 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd 
*SCpnt)
protect = 0;
 
if (protect && sdkp->protection_type == T10_PI_TYPE2_PROTECTION) {
-   SCpnt->cmnd = mempool_alloc(sd_cdb_pool, GFP_ATOMIC);
-
-   if (unlikely(SCpnt->cmnd == NULL)) {
-   ret = BLKPREP_DEFER;
-   goto out;
-   }
+   SCpnt->cmnd = sd_get_ext_buf(sdp, SCpnt);
 
SCpnt->cmd_len = SD_EXT_CDB_SIZE;
memset(SCpnt->cmnd, 0, SCpnt->cmd_len);
@@ -1318,12 +1317,6 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
 
if (rq->rq_flags & RQF_SPECIAL_PAYLOAD)
__free_page(rq->special_vec.bv_page);
-
-   if (SCpnt->cmnd != scsi_req(rq)->cmd) {
-   mempool_free(SCpnt->cmnd, sd_cdb_pool);
-   SCpnt->cmnd = NULL;

[PATCH] scsi_dh_alua: skip RTPG for devices only supporting active/optimized

2017-12-08 Thread Hannes Reinecke

From: Hannes Reinecke 

For hardware only supporting active/optimized there's no point in
ever re-issuing RTPG, as the only new state we can possibly read is
active/optimized.
This avoids spurious errors during path failover on such arrays.
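
The "only supports active/optimized" test can be sketched as a plain
bitmask check. The 0x10, 0x40, 0x80 and 0xdf values appear in this
patch; the remaining TPGS_SUPPORT_* values below are assumed for the
sketch, and only_active_optimized() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>

/* Supported-state bits; values not shown in the patch are assumptions. */
#define TPGS_SUPPORT_OPTIMIZED      0x01  /* assumed */
#define TPGS_SUPPORT_NONOPTIMIZED   0x02  /* assumed */
#define TPGS_SUPPORT_STANDBY        0x04  /* assumed */
#define TPGS_SUPPORT_UNAVAILABLE    0x08  /* assumed */
#define TPGS_SUPPORT_LBA_DEPENDENT  0x10
#define TPGS_SUPPORT_OFFLINE        0x40
#define TPGS_SUPPORT_TRANSITION     0x80
#define TPGS_SUPPORT_ALL            0xdf

/* Compile-time check that the bits above add up to the ALL mask. */
static_assert((TPGS_SUPPORT_OPTIMIZED | TPGS_SUPPORT_NONOPTIMIZED |
               TPGS_SUPPORT_STANDBY | TPGS_SUPPORT_UNAVAILABLE |
               TPGS_SUPPORT_LBA_DEPENDENT | TPGS_SUPPORT_OFFLINE |
               TPGS_SUPPORT_TRANSITION) == TPGS_SUPPORT_ALL,
              "state bits must match TPGS_SUPPORT_ALL");

/* True when no state other than active/optimized is advertised. */
static bool only_active_optimized(int valid_states)
{
    return (valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0;
}
```

A port group initialised to TPGS_SUPPORT_ALL therefore never takes the
shortcut until a successful RTPG has narrowed its valid states. Note
that a mask of 0 also passes this check.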

Signed-off-by: Hannes Reinecke 
---
 drivers/scsi/device_handler/scsi_dh_alua.c | 35 ++
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c 
b/drivers/scsi/device_handler/scsi_dh_alua.c
index fd22dc6..b09c12b 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -40,6 +40,7 @@
 #define TPGS_SUPPORT_LBA_DEPENDENT 0x10
 #define TPGS_SUPPORT_OFFLINE   0x40
 #define TPGS_SUPPORT_TRANSITION    0x80
+#define TPGS_SUPPORT_ALL   0xdf
 
 #define RTPG_FMT_MASK  0x70
 #define RTPG_FMT_EXT_HDR   0x10
@@ -81,6 +82,7 @@ struct alua_port_group {
int tpgs;
int state;
int pref;
+   int valid_states;
unsigned        flags; /* used for optimizing STPG */
unsigned char   transition_tmo;
unsigned long   expiry;
@@ -243,6 +245,7 @@ static struct alua_port_group *alua_alloc_pg(struct 
scsi_device *sdev,
pg->group_id = group_id;
pg->tpgs = tpgs;
pg->state = SCSI_ACCESS_STATE_OPTIMAL;
+   pg->valid_states = TPGS_SUPPORT_ALL;
if (optimize_stpg)
pg->flags |= ALUA_OPTIMIZE_STPG;
kref_init(&pg->kref);
@@ -516,7 +519,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct 
alua_port_group *pg)
 {
struct scsi_sense_hdr sense_hdr;
struct alua_port_group *tmp_pg;
-   int len, k, off, valid_states = 0, bufflen = ALUA_RTPG_SIZE;
+   int len, k, off, bufflen = ALUA_RTPG_SIZE;
unsigned char *desc, *buff;
unsigned err, retval;
unsigned int tpg_desc_tbl_off;
@@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, struct 
alua_port_group *pg)
retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags);
 
if (retval) {
+   /*
+* If the target only supports active/optimized there's
+* not much we can do; it's not that we can switch paths
+* or somesuch.
+* So ignore any errors to avoid spurious failures during
+* path failover.
+*/
+   if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) {
+   sdev_printk(KERN_INFO, sdev,
+   "%s: ignoring rtpg result %d\n",
+   ALUA_DH_NAME, retval);
+   kfree(buff);
+   return SCSI_DH_OK;
+   }
if (!scsi_sense_valid(&sense_hdr)) {
sdev_printk(KERN_INFO, sdev,
"%s: rtpg failed, result %d\n",
@@ -652,7 +669,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct 
alua_port_group *pg)
rcu_read_unlock();
}
if (tmp_pg == pg)
-   valid_states = desc[1];
+   tmp_pg->valid_states = desc[1];
spin_unlock_irqrestore(&tmp_pg->lock, flags);
}
kref_put(&tmp_pg->kref, release_port_group);
@@ -665,13 +682,13 @@ static int alua_rtpg(struct scsi_device *sdev, struct 
alua_port_group *pg)
"%s: port group %02x state %c %s supports %c%c%c%c%c%c%c\n",
ALUA_DH_NAME, pg->group_id, print_alua_state(pg->state),
pg->pref ? "preferred" : "non-preferred",
-   valid_states&TPGS_SUPPORT_TRANSITION?'T':'t',
-   valid_states&TPGS_SUPPORT_OFFLINE?'O':'o',
-   valid_states&TPGS_SUPPORT_LBA_DEPENDENT?'L':'l',
-   valid_states&TPGS_SUPPORT_UNAVAILABLE?'U':'u',
-   valid_states&TPGS_SUPPORT_STANDBY?'S':'s',
-   valid_states&TPGS_SUPPORT_NONOPTIMIZED?'N':'n',
-   valid_states&TPGS_SUPPORT_OPTIMIZED?'A':'a');
+   pg->valid_states&TPGS_SUPPORT_TRANSITION?'T':'t',
+   pg->valid_states&TPGS_SUPPORT_OFFLINE?'O':'o',
+   pg->valid_states&TPGS_SUPPORT_LBA_DEPENDENT?'L':'l',
+   pg->valid_states&TPGS_SUPPORT_UNAVAILABLE?'U':'u',
+   pg->valid_states&TPGS_SUPPORT_STANDBY?'S':'s',
+   pg->valid_states&TPGS_SUPPORT_NONOPTIMIZED?'N':'n',
+   pg->valid_states&TPGS_SUPPORT_OPTIMIZED?'A':'a');
 
switch (pg->state) {
case SCSI_ACCESS_STATE_TRANSITIONING:
-- 
1.8.5.6

[PATCH v5 5/7] scsi: libsas: use flush_workqueue to process disco events synchronously

2017-12-08 Thread Jason Yan

We now process sas events and discover events in different workqueues,
so it is safe to wait for the discover event to finish inside the sas
event work. Use flush_workqueue() to ensure the disco and revalidate
events are processed synchronously, so that the whole discover and
revalidate process will not be interrupted by other events.

Signed-off-by: Jason Yan 
CC: John Garry 
CC: Johannes Thumshirn 
CC: Ewan Milne 
CC: Christoph Hellwig 
CC: Tomas Henzl 
CC: Dan Williams 
---
 drivers/scsi/libsas/sas_port.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c
index 9326628..64722f4 100644
--- a/drivers/scsi/libsas/sas_port.c
+++ b/drivers/scsi/libsas/sas_port.c
@@ -192,6 +192,7 @@ static void sas_form_port(struct asd_sas_phy *phy)
si->dft->lldd_port_formed(phy);
 
sas_discover_event(phy->port, DISCE_DISCOVER_DOMAIN);
+   flush_workqueue(sas_ha->disco_q);
 }
 
 /**
@@ -277,6 +278,9 @@ void sas_porte_broadcast_rcvd(struct work_struct *work)
 
SAS_DPRINTK("broadcast received: %d\n", prim);
sas_discover_event(phy->port, DISCE_REVALIDATE_DOMAIN);
+
+   if (phy->port)
+   flush_workqueue(phy->port->ha->disco_q);
 }
 
 void sas_porte_link_reset_err(struct work_struct *work)
-- 
2.9.5

[PATCH v5 7/7] scsi: libsas: notify event PORTE_BROADCAST_RCVD in sas_enable_revalidation()

2017-12-08 Thread Jason Yan

There are two places that queue the disco event DISCE_REVALIDATE_DOMAIN.
One is in sas_porte_broadcast_rcvd() and uses sas_chain_event() to queue
the event. The other is in sas_enable_revalidation() and uses
sas_queue_event() to queue the event. We have different work queues for
events and discovery now, so the DISCE_REVALIDATE_DOMAIN event may be
processed in both the event queue and the discovery queue.

Since we now do synchronous event handling, we cannot do it in the
discovery queue, so we have to trigger a fake broadcast event to
re-trigger the revalidation from the event queue.

Signed-off-by: Jason Yan 
CC: John Garry 
CC: Johannes Thumshirn 
CC: Ewan Milne 
CC: Christoph Hellwig 
CC: Tomas Henzl 
CC: Dan Williams 
---
 drivers/scsi/libsas/sas_event.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/libsas/sas_event.c b/drivers/scsi/libsas/sas_event.c
index 8c82c00..ae923eb 100644
--- a/drivers/scsi/libsas/sas_event.c
+++ b/drivers/scsi/libsas/sas_event.c
@@ -116,11 +116,17 @@ void sas_enable_revalidation(struct sas_ha_struct *ha)
struct asd_sas_port *port = ha->sas_port[i];
const int ev = DISCE_REVALIDATE_DOMAIN;
struct sas_discovery *d = &port->disc;
+   struct asd_sas_phy *sas_phy;
 
if (!test_and_clear_bit(ev, &d->pending))
continue;
 
-   sas_queue_event(ev, &d->disc_work[ev].work, ha);
+   if (list_empty(&port->phy_list))
+   continue;
+
+   sas_phy = container_of(port->phy_list.next, struct asd_sas_phy,
+   port_phy_el);
+   ha->notify_port_event(sas_phy, PORTE_BROADCAST_RCVD);
}
mutex_unlock(&ha->disco_mutex);
 }
-- 
2.9.5

[PATCH v5 6/7] scsi: libsas: direct call probe and destruct

2017-12-08 Thread Jason Yan

Commit 87c8331fcf72 ("[SCSI] libsas: prevent domain rediscovery
competing with ata error handling") introduced the disco mutex to prevent
rediscovery from competing with ata error handling, and put the whole
revalidation inside the mutex. But the rphy add/remove needs to wait for
the error handling, which also grabs the disco mutex. This may lead to a
deadlock. So the probe and destruct events were introduced to do the rphy
add/remove asynchronously and outside the lock.

The asynchronously processed workers make the whole discovery process
non-atomic, so other events may interrupt it. For example, if a loss
of signal event is inserted before the probe event,
sas_deform_port() is called and the port is deleted.

Also, sas_port_delete() may run before the destruct event, but
port-x:x is the top parent of the end device or expander. This leads to
a kernel WARNING such as:

[   82.042979] sysfs group 'power' not found for kobject 'phy-1:0:22'
[   82.042983] [ cut here ]
[   82.042986] WARNING: CPU: 54 PID: 1714 at fs/sysfs/group.c:237
sysfs_remove_group+0x94/0xa0
[   82.043059] Call trace:
[   82.043082] [] sysfs_remove_group+0x94/0xa0
[   82.043085] [] dpm_sysfs_remove+0x60/0x70
[   82.043086] [] device_del+0x138/0x308
[   82.043089] [] sas_phy_delete+0x38/0x60
[   82.043091] [] do_sas_phy_delete+0x6c/0x80
[   82.043093] [] device_for_each_child+0x58/0xa0
[   82.043095] [] sas_remove_children+0x40/0x50
[   82.043100] [] sas_destruct_devices+0x64/0xa0
[   82.043102] [] process_one_work+0x1fc/0x4b0
[   82.043104] [] worker_thread+0x50/0x490
[   82.043105] [] kthread+0xfc/0x128
[   82.043107] [] ret_from_fork+0x10/0x50

Make probe and destruct direct calls in the disco and revalidate
functions, but put them outside the lock. The whole discovery or
revalidation won't be interrupted by other events. The DISCE_PROBE and
DISCE_DESTRUCT events are deleted as a result of the direct calls.

Introduce a new list to destruct the sas_port and put the port delete after
the destruct. This ensures the right order of destroying the sysfs
kobjects and fixes the warning above.

sas_ex_revalidate_domain() has a loop to find all broadcasting
devices, and sometimes we may find the same expander twice.
Because the sas_port is deleted at the end of the whole revalidation
process, a sas_port with the same name cannot be added before this.
Otherwise sysfs will complain about creating a duplicate filename. Since
the LLDD will send a broadcast for every device change, we can
process only one expander's revalidation.

Signed-off-by: Jason Yan 
CC: John Garry 
CC: Johannes Thumshirn 
CC: Ewan Milne 
CC: Christoph Hellwig 
CC: Tomas Henzl 
CC: Dan Williams 
---
 drivers/scsi/libsas/sas_ata.c  |  1 -
 drivers/scsi/libsas/sas_discover.c | 32 ++--
 drivers/scsi/libsas/sas_expander.c |  8 +++-
 drivers/scsi/libsas/sas_internal.h |  1 +
 drivers/scsi/libsas/sas_port.c |  3 +++
 include/scsi/libsas.h  |  3 +--
 include/scsi/scsi_transport_sas.h  |  1 +
 7 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 70be442..2b3637b 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -730,7 +730,6 @@ int sas_discover_sata(struct domain_device *dev)
if (res)
return res;
 
-   sas_discover_event(dev->port, DISCE_PROBE);
return 0;
 }
 
diff --git a/drivers/scsi/libsas/sas_discover.c 
b/drivers/scsi/libsas/sas_discover.c
index 14f714d..190108e 100644
--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -212,13 +212,9 @@ void sas_notify_lldd_dev_gone(struct domain_device *dev)
}
 }
 
-static void sas_probe_devices(struct work_struct *work)
+static void sas_probe_devices(struct asd_sas_port *port)
 {
struct domain_device *dev, *n;
-   struct sas_discovery_event *ev = to_sas_discovery_event(work);
-   struct asd_sas_port *port = ev->port;
-
-   clear_bit(DISCE_PROBE, &port->disc.pending);
 
/* devices must be domain members before link recovery and probe */
list_for_each_entry(dev, &port->disco_list, disco_list_node) {
@@ -294,7 +290,6 @@ int sas_discover_end_dev(struct domain_device *dev)
res = sas_notify_lldd_dev_found(dev);
if (res)
return res;
-   sas_discover_event(dev->port, DISCE_PROBE);
 
return 0;
 }
@@ -353,13 +348,9 @@ static void sas_unregister_common_dev(struct asd_sas_port 
*port, struct domain_d
sas_put_device(dev);
 }
 
-static void sas_destruct_devices(struct work_struct *work)
+void sas_destruct_devices(struct asd_sas_port *port)
 {
struct domain_device *dev, *n;
-   struct sas_discovery_event *ev = to_sas_discovery_event(work);
-   struct

[PATCH v5 4/7] scsi: libsas: Use new workqueue to run sas event and disco event

2017-12-08 Thread Jason Yan

Currently all libsas works are queued to the scsi host workqueue,
including the sas event works posted by the LLDD and the sas
discovery works. A sas hotplug flow may be divided into several
works. For example, when libsas receives a PORTE_BYTES_DMAED event,
we currently process it in the following steps:
sas_form_port  --- run in work in shost workq
sas_discover_domain  --- run in another work in shost workq
...
sas_probe_devices  --- run in new work in shost workq
So when hot-adding a device, libsas may need to run several
works in the same workqueue to add the device to the system. The
process is not atomic and may be interrupted by other sas event
works, such as PHYE_LOSS_OF_SIGNAL.

This patch prepares for executing libsas sas events synchronously. We
need to use different workqueues for sas events and disco events.
Otherwise a work would block waiting for another chained work in the
same workqueue.

Signed-off-by: Yijing Wang 
CC: John Garry 
CC: Johannes Thumshirn 
CC: Ewan Milne 
CC: Christoph Hellwig 
CC: Tomas Henzl 
CC: Dan Williams 
Signed-off-by: Jason Yan 
---
 drivers/scsi/libsas/sas_discover.c |  2 +-
 drivers/scsi/libsas/sas_event.c|  6 +++---
 drivers/scsi/libsas/sas_init.c | 18 ++
 include/scsi/libsas.h  |  3 +++
 4 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/libsas/sas_discover.c 
b/drivers/scsi/libsas/sas_discover.c
index 60de662..14f714d 100644
--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -534,7 +534,7 @@ static void sas_chain_work(struct sas_ha_struct *ha, struct 
sas_work *sw)
 * workqueue, or known to be submitted from a context that is
 * not racing against draining
 */
-   scsi_queue_work(ha->core.shost, &sw->work);
+   queue_work(ha->disco_q, &sw->work);
 }
 
 static void sas_chain_event(int event, unsigned long *pending,
diff --git a/drivers/scsi/libsas/sas_event.c b/drivers/scsi/libsas/sas_event.c
index 5d7254a..8c82c00 100644
--- a/drivers/scsi/libsas/sas_event.c
+++ b/drivers/scsi/libsas/sas_event.c
@@ -40,7 +40,7 @@ int sas_queue_work(struct sas_ha_struct *ha, struct sas_work 
*sw)
if (list_empty(&sw->drain_node))
list_add_tail(&sw->drain_node, &ha->defer_q);
} else
-   rc = scsi_queue_work(ha->core.shost, &sw->work);
+   rc = queue_work(ha->event_q, &sw->work);
 
return rc;
 }
@@ -61,7 +61,6 @@ static int sas_queue_event(int event, struct sas_work *work,
 
 void __sas_drain_work(struct sas_ha_struct *ha)
 {
-   struct workqueue_struct *wq = ha->core.shost->work_q;
struct sas_work *sw, *_sw;
int ret;
 
@@ -70,7 +69,8 @@ void __sas_drain_work(struct sas_ha_struct *ha)
spin_lock_irq(&ha->lock);
spin_unlock_irq(&ha->lock);
 
-   drain_workqueue(wq);
+   drain_workqueue(ha->event_q);
+   drain_workqueue(ha->disco_q);
 
spin_lock_irq(&ha->lock);
clear_bit(SAS_HA_DRAINING, &ha->state);
diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c
index afd928b..c81a63b 100644
--- a/drivers/scsi/libsas/sas_init.c
+++ b/drivers/scsi/libsas/sas_init.c
@@ -110,6 +110,7 @@ void sas_hash_addr(u8 *hashed, const u8 *sas_addr)
 
 int sas_register_ha(struct sas_ha_struct *sas_ha)
 {
+   char name[64];
int error = 0;
 
mutex_init(&sas_ha->disco_mutex);
@@ -143,10 +144,24 @@ int sas_register_ha(struct sas_ha_struct *sas_ha)
goto Undo_ports;
}
 
+   error = -ENOMEM;
+   snprintf(name, sizeof(name), "%s_event_q", dev_name(sas_ha->dev));
+   sas_ha->event_q = create_singlethread_workqueue(name);
+   if (!sas_ha->event_q)
+   goto Undo_ports;
+
+   snprintf(name, sizeof(name), "%s_disco_q", dev_name(sas_ha->dev));
+   sas_ha->disco_q = create_singlethread_workqueue(name);
+   if (!sas_ha->disco_q)
+   goto Undo_event_q;
+
INIT_LIST_HEAD(&sas_ha->eh_done_q);
INIT_LIST_HEAD(&sas_ha->eh_ata_q);
 
return 0;
+
+Undo_event_q:
+   destroy_workqueue(sas_ha->event_q);
 Undo_ports:
sas_unregister_ports(sas_ha);
 Undo_phys:
@@ -177,6 +192,9 @@ int sas_unregister_ha(struct sas_ha_struct *sas_ha)
__sas_drain_work(sas_ha);
mutex_unlock(&sas_ha->drain_mutex);
 
+   destroy_workqueue(sas_ha->disco_q);
+   destroy_workqueue(sas_ha->event_q);
+
return 0;
 }
 
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index 701c67f..fa93c41 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -389,6 +389,9 @@ struct sas_ha_struct {
struct device *dev;   /* should be set */
struct module *lldd_module; /* should be set */
 
+   struct workqueue_struct *event_q;
+   struct workqueue_struct *disco_q;
+
u8 *sas_addr; /*

[PATCH v5 2/7] scsi: libsas: shut down the PHY if events reached the threshold

2017-12-08 Thread Jason Yan

If a PHY bursts too many events, we allocate a lot of events for the
worker, which may lead to memory exhaustion.

Dan Williams suggested shutting down the PHY once its events reach the
threshold, because in this case the PHY may have gone into some
erroneous state. Users can re-enable the PHY through sysfs if they want.

We cannot use a fixed memory pool because if we run out of events, the
shutdown event and the loss-of-signal event would be lost too. These
events still need to be allocated and processed in this case.
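
As a rough illustration of the accounting described above, here is a
userspace sketch using C11 atomics. It only models the per-PHY event
counter and the one-shot shutdown request; the names phy_model,
phy_account_event and phy_release_event are hypothetical, and the real
patch uses the kernel's atomic_t and cmpxchg() instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the per-PHY bookkeeping. */
struct phy_model {
    atomic_int event_nr;     /* events currently allocated for this PHY */
    atomic_int in_shutdown;  /* 1 once a shutdown has been requested */
    int event_thres;         /* threshold above which we shut down */
    int shutdown_requests;   /* how many shutdown events were posted */
};

/*
 * Account for one newly allocated event. Returns true if this call is
 * the one that requested the shutdown. The compare-and-exchange makes
 * sure the request is posted only once per burst.
 */
static bool phy_account_event(struct phy_model *phy)
{
    int nr = atomic_fetch_add(&phy->event_nr, 1) + 1;

    if (nr > phy->event_thres) {
        int expected = 0;

        if (atomic_compare_exchange_strong(&phy->in_shutdown,
                                           &expected, 1)) {
            phy->shutdown_requests++;
            return true;
        }
    }
    return false;
}

/* Account for one event being freed after the worker has processed it. */
static void phy_release_event(struct phy_model *phy)
{
    int prev = atomic_fetch_sub(&phy->event_nr, 1);

    assert(prev > 0);
    (void)prev;
}
```

With event_thres set to 2, the third call to phy_account_event() requests
the shutdown and later calls in the same burst do not request it again.
Note that the events above the threshold are still allocated in this
model, matching the point that the shutdown event itself must not be
dropped.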

Suggested-by: Dan Williams 
Signed-off-by: Jason Yan 
CC: John Garry 
CC: Johannes Thumshirn 
CC: Ewan Milne 
CC: Christoph Hellwig 
CC: Tomas Henzl 
---
 drivers/scsi/libsas/sas_init.c | 33 -
 drivers/scsi/libsas/sas_phy.c  | 27 ++-
 include/scsi/libsas.h  |  6 ++
 3 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c
index e04f6d6..22bfc02 100644
--- a/drivers/scsi/libsas/sas_init.c
+++ b/drivers/scsi/libsas/sas_init.c
@@ -123,6 +123,8 @@ int sas_register_ha(struct sas_ha_struct *sas_ha)
INIT_LIST_HEAD(&sas_ha->defer_q);
INIT_LIST_HEAD(&sas_ha->eh_dev_q);
 
+   sas_ha->event_thres = SAS_PHY_SHUTDOWN_THRES;
+
error = sas_register_phys(sas_ha);
if (error) {
printk(KERN_NOTICE "couldn't register sas phys:%d\n", error);
@@ -557,14 +559,43 @@ EXPORT_SYMBOL_GPL(sas_domain_attach_transport);
 
 struct asd_sas_event *sas_alloc_event(struct asd_sas_phy *phy)
 {
+   struct asd_sas_event *event;
gfp_t flags = in_interrupt() ? GFP_ATOMIC : GFP_KERNEL;
+   struct sas_ha_struct *sas_ha = phy->ha;
+   struct sas_internal *i =
+   to_sas_internal(sas_ha->core.shost->transportt);
+
+   event = kmem_cache_zalloc(sas_event_cache, flags);
+   if (!event)
+   return NULL;
 
-   return kmem_cache_zalloc(sas_event_cache, flags);
+   atomic_inc(&phy->event_nr);
+
+   if (atomic_read(&phy->event_nr) > phy->ha->event_thres) {
+   if (i->dft->lldd_control_phy) {
+   if (cmpxchg(&phy->in_shutdown, 0, 1) == 0) {
+   sas_printk("The phy%02d bursting events, shut it down.\n",
+   phy->id);
+   sas_notify_phy_event(phy, PHYE_SHUTDOWN);
+   }
+   } else {
+   /* Do not support PHY control, stop allocating events */
+   WARN_ONCE(1, "PHY control not supported.\n");
+   kmem_cache_free(sas_event_cache, event);
+   atomic_dec(&phy->event_nr);
+   event = NULL;
+   }
+   }
+
+   return event;
 }
 
 void sas_free_event(struct asd_sas_event *event)
 {
+   struct asd_sas_phy *phy = event->phy;
+
kmem_cache_free(sas_event_cache, event);
+   atomic_dec(&phy->event_nr);
 }
 
 /* -- SAS Class register/unregister -- */
diff --git a/drivers/scsi/libsas/sas_phy.c b/drivers/scsi/libsas/sas_phy.c
index 59f8292..bf3e1b9 100644
--- a/drivers/scsi/libsas/sas_phy.c
+++ b/drivers/scsi/libsas/sas_phy.c
@@ -35,6 +35,7 @@ static void sas_phye_loss_of_signal(struct work_struct *work)
struct asd_sas_event *ev = to_asd_sas_event(work);
struct asd_sas_phy *phy = ev->phy;
 
+   phy->in_shutdown = 0;
phy->error = 0;
sas_deform_port(phy, 1);
 }
@@ -44,6 +45,7 @@ static void sas_phye_oob_done(struct work_struct *work)
struct asd_sas_event *ev = to_asd_sas_event(work);
struct asd_sas_phy *phy = ev->phy;
 
+   phy->in_shutdown = 0;
phy->error = 0;
 }
 
@@ -105,6 +107,28 @@ static void sas_phye_resume_timeout(struct work_struct *work)
 }
 
 
+static void sas_phye_shutdown(struct work_struct *work)
+{
+   struct asd_sas_event *ev = to_asd_sas_event(work);
+   struct asd_sas_phy *phy = ev->phy;
+   struct sas_ha_struct *sas_ha = phy->ha;
+   struct sas_internal *i =
+   to_sas_internal(sas_ha->core.shost->transportt);
+
+   if (phy->enabled) {
+   int ret;
+
+   phy->error = 0;
+   phy->enabled = 0;
+   ret = i->dft->lldd_control_phy(phy, PHY_FUNC_DISABLE, NULL);
+   if (ret)
+   sas_printk("lldd disable phy%02d returned %d\n",
+   phy->id, ret);
+   } else
+   sas_printk("phy%02d is not enabled, cannot shutdown\n",
+   phy->id);
+}
+
 /* -- Phy class registration -- */
 
 int sas_register_phys(struct sas_ha_struct *sas_ha)
@@ -116,6 +140,7 @@ int sas_register_phys(struct sas_ha_struct *sas_ha)
struct asd_sas_phy *phy =

[PATCH v5 0/7] Enhance libsas hotplug feature

2017-12-08 Thread Jason Yan

The libsas hotplug code currently has some issues. Dan Williams reported
a similar bug earlier:
https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg39187.html

The issues we have found:
1. If the LLDD reports a burst of phy-up/phy-down sas events, some events
   may be lost because the same sas event is already pending, so the
   libsas topology may end up differing from the hardware.
2. On receiving a phy-down sas event, libsas calls sas_deform_port() to
   remove devices. It first deletes the sas port, then puts a destruction
   discovery event in a new work and queues it at the tail of the
   workqueue. Once the sas port is deleted, its child devices are deleted
   too, so when the destruction work starts, it finds that the target
   device has already been removed and reports a sysfs warning.
3. Since a hotplug process is divided into several works, a phy-up sas
   event can be inserted between the phy-down works, like
   destruction work ---> PORTE_BYTES_DMAED (sas_form_port) ---> PHYE_LOSS_OF_SIGNAL
   The hot-remove flow is then broken by the PORTE_BYTES_DMAED event, which
   is not what we expect, and issues occur.

v4->v5: - process only one expander's revalidation in sas_ex_revalidate_domain()
        - notify event PORTE_BROADCAST_RCVD in sas_enable_revalidation()
v3->v4: - use dynamically allocated work and support shutting down the phy
          if active events reach the threshold
        - use flush_workqueue instead of wait-completion to process
          discover events synchronously
        - call the probe and destruct functions directly
v2->v3: some code improvements suggested by Johannes and John;
        split v2 patch 2 into several small patches.
v1->v2: some code improvements suggested by John Garry

Jason Yan (7):
  scsi: libsas: Use dynamic alloced work to avoid sas event lost
  scsi: libsas: shut down the PHY if events reached the threshold
  scsi: libsas: make the event threshold configurable
  scsi: libsas: Use new workqueue to run sas event and disco event
  scsi: libsas: use flush_workqueue to process disco events
synchronously
  scsi: libsas: direct call probe and destruct
  scsi: libsas: notify event PORTE_BROADCAST_RCVD in
sas_enable_revalidation()

 drivers/scsi/hisi_sas/hisi_sas_main.c |   6 ++
 drivers/scsi/libsas/sas_ata.c |   1 -
 drivers/scsi/libsas/sas_discover.c|  34 ++-
 drivers/scsi/libsas/sas_event.c   |  86 ---
 drivers/scsi/libsas/sas_expander.c|   8 +--
 drivers/scsi/libsas/sas_init.c| 107 +-
 drivers/scsi/libsas/sas_internal.h|   7 +++
 drivers/scsi/libsas/sas_phy.c |  69 +++---
 drivers/scsi/libsas/sas_port.c|  25 
 include/scsi/libsas.h |  30 +++---
 include/scsi/scsi_transport_sas.h |   1 +
 11 files changed, 277 insertions(+), 97 deletions(-)

-- 
2.9.5

Re: [PATCH v2] scsi: libsas: fix length error in sas_smp_handler()

2017-12-08 Thread John Garry


On 07/12/2017 10:57, Jason Yan wrote:

The bsg_job_done() requires the length of payload received, but we give
it the untransferred residual.



As I understand it, this patch fixes (SES) enclosure management for
libsas, so it's quite an important patch.


Thanks,
John


Fixes: 651a01364994 ("scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough")
Reported-and-tested-by: chenqilin 
Signed-off-by: Jason Yan 
CC: Christoph Hellwig 
---
 drivers/scsi/libsas/sas_expander.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index 50cb0f3..6c40ecc 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -2143,7 +2143,7 @@ void sas_smp_handler(struct bsg_job *job, struct Scsi_Host *shost,
struct sas_rphy *rphy)
 {
struct domain_device *dev;
-   unsigned int reslen = 0;
+   unsigned int rcvlen = 0;
int ret = -EINVAL;

/* no rphy means no smp target support (ie aic94xx host) */
@@ -2177,12 +2177,12 @@ void sas_smp_handler(struct bsg_job *job, struct Scsi_Host *shost,

ret = smp_execute_task_sg(dev, job->request_payload.sg_list,
job->reply_payload.sg_list);
-   if (ret > 0) {
-   /* positive number is the untransferred residual */
-   reslen = ret;
+   if (ret >= 0) {
+   /* bsg_job_done() requires the length received  */
+   rcvlen = job->reply_payload.payload_len - ret;
ret = 0;
}

 out:
-   bsg_job_done(job, ret, reslen);
+   bsg_job_done(job, ret, rcvlen);
 }
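
The length fix above can be illustrated with a small userspace sketch.
It assumes, as the diff does, that a non-negative return value is the
untransferred residual and that the received length is the reply payload
length minus that residual. The names smp_result and smp_result_from_ret
are hypothetical and not part of libsas.

```c
#include <assert.h>

/* Hypothetical pair of values handed to the completion call. */
struct smp_result {
    int result;           /* 0 on success, negative errno on failure */
    unsigned int rcvlen;  /* bytes actually received in the reply */
};

/*
 * Convert the return value of the execute step (negative errno, or the
 * untransferred residual) into a result code plus the received length.
 */
static struct smp_result smp_result_from_ret(int ret, unsigned int payload_len)
{
    struct smp_result r = { .result = ret, .rcvlen = 0 };

    if (ret >= 0) {
        /* A residual larger than the payload would be a driver bug. */
        assert((unsigned int)ret <= payload_len);
        r.rcvlen = payload_len - (unsigned int)ret;
        r.result = 0;
    }
    return r;
}
```

For example, a 64-byte reply buffer with a residual of 16 yields a
received length of 48, and a residual of 0 yields the full 64 bytes,
which is the case the old "ret > 0" check reported as length 0.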

[PATCH v5 1/7] scsi: libsas: Use dynamic alloced work to avoid sas event lost

2017-12-08 Thread Jason Yan

Currently libsas hotplug work is static: every sas event type has its own
static work, and the LLDD driver queues the hotplug work into
shost->work_q. If the LLDD driver posts a burst of hotplug events to
libsas, the hotplug events may be pending in the workqueue like

shost->work_q
new work[PORTE_BYTES_DMAED] --> |[PHYE_LOSS_OF_SIGNAL][PORTE_BYTES_DMAED] -> processing
                                |<---wait worker to process>|

In this case, when a new PORTE_BYTES_DMAED event arrives, libsas tries to
queue it to shost->work_q, but this work is already pending, so the event
is lost. In the end, libsas deletes the related sas port and sas devices,
while the LLDD driver expects libsas to add the sas port and devices (the
last sas event).

This patch uses dynamically allocated work to avoid this issue.
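
The difference between the two schemes can be sketched in userspace C.
This is only a model: static_queue stands in for the per-type pending
bit guarding a static work item, and dyn_queue stands in for one
allocated work item per notification. All names here are hypothetical
and the fixed-size array replaces real allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { EV_LOSS_OF_SIGNAL, EV_BYTES_DMAED, EV_NUM };

/* Static scheme: one work item per event type, guarded by a pending bit. */
struct static_queue {
    unsigned long pending;  /* bit n set means event n is already queued */
    int queued;             /* number of notifications accepted */
};

static bool static_post(struct static_queue *q, int ev)
{
    unsigned long bit;

    assert(ev >= 0 && ev < EV_NUM);
    bit = 1UL << ev;
    if (q->pending & bit)
        return false;       /* same type still pending: notification lost */
    q->pending |= bit;
    q->queued++;
    return true;
}

/* Dynamic scheme: every notification gets its own slot in the queue. */
#define DYN_CAP 16

struct dyn_queue {
    int events[DYN_CAP];
    size_t len;
};

static bool dyn_post(struct dyn_queue *q, int ev)
{
    assert(ev >= 0 && ev < EV_NUM);
    if (q->len == DYN_CAP)
        return false;       /* models an allocation failure */
    q->events[q->len++] = ev;
    return true;
}
```

Posting EV_BYTES_DMAED, EV_LOSS_OF_SIGNAL and then EV_BYTES_DMAED again
before the worker runs drops the third notification in the static model,
so the last thing processed is the loss of signal. The dynamic model
keeps all three in order.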

Signed-off-by: Yijing Wang 
CC: John Garry 
CC: Johannes Thumshirn 
CC: Ewan Milne 
CC: Christoph Hellwig 
CC: Tomas Henzl 
CC: Dan Williams 
Signed-off-by: Jason Yan 
---
 drivers/scsi/libsas/sas_event.c| 74 +-
 drivers/scsi/libsas/sas_init.c | 27 --
 drivers/scsi/libsas/sas_internal.h |  6 
 drivers/scsi/libsas/sas_phy.c  | 44 +--
 drivers/scsi/libsas/sas_port.c | 18 +-
 include/scsi/libsas.h  | 17 +
 6 files changed, 115 insertions(+), 71 deletions(-)

diff --git a/drivers/scsi/libsas/sas_event.c b/drivers/scsi/libsas/sas_event.c
index 0bb9eef..5d7254a 100644
--- a/drivers/scsi/libsas/sas_event.c
+++ b/drivers/scsi/libsas/sas_event.c
@@ -29,7 +29,8 @@
 
 int sas_queue_work(struct sas_ha_struct *ha, struct sas_work *sw)
 {
-   int rc = 0;
+   /* it's added to the defer_q when draining so return succeed */
+   int rc = 1;
 
if (!test_bit(SAS_HA_REGISTERED, &ha->state))
return 0;
@@ -44,19 +45,15 @@ int sas_queue_work(struct sas_ha_struct *ha, struct sas_work *sw)
return rc;
 }
 
-static int sas_queue_event(int event, unsigned long *pending,
-   struct sas_work *work,
+static int sas_queue_event(int event, struct sas_work *work,
struct sas_ha_struct *ha)
 {
-   int rc = 0;
+   unsigned long flags;
+   int rc;
 
-   if (!test_and_set_bit(event, pending)) {
-   unsigned long flags;
-
-   spin_lock_irqsave(&ha->lock, flags);
-   rc = sas_queue_work(ha, work);
-   spin_unlock_irqrestore(&ha->lock, flags);
-   }
+   spin_lock_irqsave(&ha->lock, flags);
+   rc = sas_queue_work(ha, work);
+   spin_unlock_irqrestore(&ha->lock, flags);
 
return rc;
 }
@@ -66,6 +63,7 @@ void __sas_drain_work(struct sas_ha_struct *ha)
 {
struct workqueue_struct *wq = ha->core.shost->work_q;
struct sas_work *sw, *_sw;
+   int ret;
 
set_bit(SAS_HA_DRAINING, &ha->state);
/* flush submitters */
@@ -78,7 +76,10 @@ void __sas_drain_work(struct sas_ha_struct *ha)
clear_bit(SAS_HA_DRAINING, &ha->state);
list_for_each_entry_safe(sw, _sw, &ha->defer_q, drain_node) {
list_del_init(&sw->drain_node);
-   sas_queue_work(ha, sw);
+   ret = sas_queue_work(ha, sw);
+   if (ret != 1)
+   sas_free_event(to_asd_sas_event(&sw->work));
+
}
spin_unlock_irq(&ha->lock);
 }
@@ -119,29 +120,68 @@ void sas_enable_revalidation(struct sas_ha_struct *ha)
if (!test_and_clear_bit(ev, &d->pending))
continue;
 
-   sas_queue_event(ev, &d->pending, &d->disc_work[ev].work, ha);
+   sas_queue_event(ev, &d->disc_work[ev].work, ha);
}
mutex_unlock(&ha->disco_mutex);
 }
 
+
+static void sas_port_event_worker(struct work_struct *work)
+{
+   struct asd_sas_event *ev = to_asd_sas_event(work);
+
+   sas_port_event_fns[ev->event](work);
+   sas_free_event(ev);
+}
+
+static void sas_phy_event_worker(struct work_struct *work)
+{
+   struct asd_sas_event *ev = to_asd_sas_event(work);
+
+   sas_phy_event_fns[ev->event](work);
+   sas_free_event(ev);
+}
+
 static int sas_notify_port_event(struct asd_sas_phy *phy, enum port_event event)
 {
+   struct asd_sas_event *ev;
struct sas_ha_struct *ha = phy->ha;
+   int ret;
 
BUG_ON(event >= PORT_NUM_EVENTS);
 
-   return sas_queue_event(event, &phy->port_events_pending,
-  &phy->port_events[event].work, ha);
+   ev = sas_alloc_event(phy);
+   if (!ev)
+   return -ENOMEM;
+
+   INIT_SAS_EVENT(ev, sas_port_event_worker, phy, event);
+
+   ret = sas_queue_event(event, &ev->work, ha);
+   if (ret != 1)
+   sas_free_event(ev);
+
+   return ret;
 }
 
 int sas_notify_phy_event(struct asd_sas_phy *phy, enum phy_event event)
 {
+   struct asd_sas_event *ev;

[PATCH v5 3/7] scsi: libsas: make the event threshold configurable

2017-12-08 Thread Jason Yan

Add a sysfs attribute that an LLDD can expose to configure the event
threshold for every host. We add an example in hisi_sas. Other LLDDs
using libsas can implement it if they want.
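
The store side of such an attribute boils down to parsing a decimal
number and enforcing the minimum of 32 that the patch uses. The sketch
below is a userspace approximation with stricter input checking than the
simple_strtol() call in the diff; parse_event_threshold and
EVENT_THRES_MIN are hypothetical names, not part of libsas.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Lower bound taken from the patch; the macro name is hypothetical. */
#define EVENT_THRES_MIN 32U

/*
 * Parse a decimal threshold as written to a sysfs file (optionally
 * newline-terminated), reject malformed input, and clamp small values
 * up to EVENT_THRES_MIN. Returns 0 on success or -EINVAL on bad input.
 */
static int parse_event_threshold(const char *buf, unsigned int *out)
{
    char *end;
    unsigned long val;

    assert(buf && out);

    /* strtoul() would accept a sign or leading spaces; require a digit. */
    if (!isdigit((unsigned char)buf[0]))
        return -EINVAL;

    errno = 0;
    val = strtoul(buf, &end, 10);
    if (errno == ERANGE || val > UINT_MAX)
        return -EINVAL;

    /* Allow exactly one trailing newline, as "echo" would produce. */
    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -EINVAL;

    *out = val < EVENT_THRES_MIN ? EVENT_THRES_MIN : (unsigned int)val;
    return 0;
}
```

Writing "1024\n" stores 1024, writing "8" is clamped to 32, and input
such as "abc" or "-5" is rejected rather than silently becoming a small
or negative threshold. In kernel code, kstrtouint() would be the usual
way to get comparable validation.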

Suggested-by: Hannes Reinecke 
Signed-off-by: Jason Yan 
CC: John Garry 
CC: Johannes Thumshirn 
CC: Ewan Milne 
CC: Christoph Hellwig 
CC: Tomas Henzl 
CC: Dan Williams 
---
 drivers/scsi/hisi_sas/hisi_sas_main.c |  6 ++
 drivers/scsi/libsas/sas_init.c| 31 +++
 include/scsi/libsas.h |  1 +
 3 files changed, 38 insertions(+)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 5f503cb..b5ce64a 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1561,6 +1561,11 @@ EXPORT_SYMBOL_GPL(hisi_sas_kill_tasklets);
 struct scsi_transport_template *hisi_sas_stt;
 EXPORT_SYMBOL_GPL(hisi_sas_stt);
 
+struct device_attribute *host_attrs[] = {
+   &dev_attr_phy_event_threshold,
+   NULL,
+};
+
 static struct scsi_host_template _hisi_sas_sht = {
.module = THIS_MODULE,
.name   = DRV_NAME,
@@ -1580,6 +1585,7 @@ static struct scsi_host_template _hisi_sas_sht = {
.eh_target_reset_handler = sas_eh_target_reset_handler,
.target_destroy = sas_target_destroy,
.ioctl  = sas_ioctl,
+   .shost_attrs= host_attrs,
 };
 struct scsi_host_template *hisi_sas_sht = &_hisi_sas_sht;
 EXPORT_SYMBOL_GPL(hisi_sas_sht);
diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c
index 22bfc02..afd928b 100644
--- a/drivers/scsi/libsas/sas_init.c
+++ b/drivers/scsi/libsas/sas_init.c
@@ -538,6 +538,37 @@ static struct sas_function_template sft = {
.smp_handler = sas_smp_handler,
 };
 
+static inline ssize_t phy_event_threshold_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct Scsi_Host *shost = class_to_shost(dev);
+   struct sas_ha_struct *sha = SHOST_TO_SAS_HA(shost);
+
+   return scnprintf(buf, PAGE_SIZE, "%u\n", sha->event_thres);
+}
+
+static inline ssize_t phy_event_threshold_store(struct device *dev,
+   struct device_attribute *attr,
+   const char *buf, size_t count)
+{
+   struct Scsi_Host *shost = class_to_shost(dev);
+   struct sas_ha_struct *sha = SHOST_TO_SAS_HA(shost);
+
+   sha->event_thres = simple_strtol(buf, NULL, 10);
+
+   /* threshold cannot be set too small */
+   if (sha->event_thres < 32)
+   sha->event_thres = 32;
+
+   return count;
+}
+
+DEVICE_ATTR(phy_event_threshold,
+   S_IRUGO|S_IWUSR,
+   phy_event_threshold_show,
+   phy_event_threshold_store);
+EXPORT_SYMBOL_GPL(dev_attr_phy_event_threshold);
+
 struct scsi_transport_template *
 sas_domain_attach_transport(struct sas_domain_function_template *dft)
 {
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index 26683e2..701c67f 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -681,6 +681,7 @@ extern int sas_bios_param(struct scsi_device *,
  sector_t capacity, int *hsc);
 extern struct scsi_transport_template *
 sas_domain_attach_transport(struct sas_domain_function_template *);
+extern struct device_attribute dev_attr_phy_event_threshold;
 
 int  sas_discover_root_expander(struct domain_device *);
 
-- 
2.9.5

Re: [PATCH v2 1/3] scsi: Fix a scsi_show_rq() NULL pointer dereference

2017-12-08 Thread Ming Lei

Hi Martin,

On Thu, Dec 07, 2017 at 09:46:21PM -0500, Martin K. Petersen wrote:
> 
> Ming,
> 
> > As I explained in [1], the use-after-free is inevitable no matter if
> > clearing 'SCpnt->cmnd' before mempool_free() in sd_uninit_command() or
> > not, so we need to comment the fact that cdb may point to garbage
> > data, and this function(especially __scsi_format_command() has to
> > survive that, so that people won't be surprised when kasan complains
> > use-after-free, and guys will be careful when they try to change the
> > code in future.
> 
> Longer term we really need to get rid of the separate CDB allocation. It
> was a necessary evil when I did it. And not much of a concern since I
> did not expect anybody sane to use Type 2 (it's designed for use inside
> disk arrays).
> 
> However, I keep hearing about people using Type 2 drives. Some vendors
> source drives formatted that way and use the same SKU for arrays and
> standalone servers.
> 
> So we should really look into making it possible for a queue to have a
> bigger than 16-byte built-in CDB. For Type 2 devices, 32-byte reads and
> writes are a prerequisite. So it would be nice to be able to switch a
> queue to a larger allocation post creation (we won't know the type until
> after READ CAPACITY(16) has been sent).

I am wondering why you don't define __cmd[] in scsi_request as 32 bytes?
Even for hosts with thousands of tags, the memory waste is still not
too much.

Or, if you prefer to do it post creation, we have sbitmap_queue now, which
can help to build a pre-allocated memory pool easily, and its
allocation/free is pretty efficient.

Thanks,
Ming
