Re: [PATCH v1] scsi: lpfc: Add auto select on IRQ_POLL

2021-01-26 Thread James Smart

On 1/25/2021 4:05 PM, Tong Zhang wrote:

lpfc depends on the irq_poll library, but it is not selected automatically.
When IRQ_POLL is not selected, compiling lpfc can run into the following errors

ERROR: modpost: "irq_poll_init" [drivers/scsi/lpfc/lpfc.ko] undefined!
ERROR: modpost: "irq_poll_sched" [drivers/scsi/lpfc/lpfc.ko] undefined!
ERROR: modpost: "irq_poll_complete" [drivers/scsi/lpfc/lpfc.ko] undefined!

Signed-off-by: Tong Zhang 
---
  drivers/scsi/Kconfig | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 701b61ec76ee..c79ac0731b13 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -1159,6 +1159,7 @@ config SCSI_LPFC
depends on NVME_TARGET_FC || NVME_TARGET_FC=n
depends on NVME_FC || NVME_FC=n
select CRC_T10DIF
+   select IRQ_POLL
help
This lpfc driver supports the Emulex LightPulse
Family of Fibre Channel PCI host adapters.
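
For context, the three undefined symbols above are the irq_poll API the
driver calls into. A generic usage sketch of that API (hypothetical callback
and queue structure, not the lpfc code):

#include <linux/irq_poll.h>

struct my_queue {
	struct irq_poll iop;
};

/* Poll callback: returns the number of completions processed. */
static int my_poll(struct irq_poll *iop, int budget)
{
	struct my_queue *q = container_of(iop, struct my_queue, iop);
	int done = 0;

	/* ... process up to 'budget' completions on q, counting in done ... */

	if (done < budget)
		irq_poll_complete(iop);	/* done for now; leave polling mode */
	return done;
}

/* At setup time:                                      */
/*	irq_poll_init(&q->iop, 64, my_poll);           */
/* From the hard-IRQ handler, to defer to softirq:     */
/*	irq_poll_sched(&q->iop);                       */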


Reviewed-by: James Smart 

-- james



Re: [PATCH] scsi: lpfc: style: Simplify bool comparison

2021-01-12 Thread James Smart



On 1/12/2021 12:24 AM, YANG LI wrote:

Fix the following coccicheck warning:
./drivers/scsi/lpfc/lpfc_bsg.c:5392:5-29: WARNING: Comparison to bool

Reported-by: Abaci Robot 
Signed-off-by: YANG LI 
---
  drivers/scsi/lpfc/lpfc_bsg.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)




Reviewed-by:  James Smart 

-- james





Re: [PATCH v2] nvmet-fc: associations list protected by rcu, instead of spinlock_irq where possible.

2021-01-11 Thread James Smart



On 1/3/2021 10:12 AM, leonid.rav...@dell.com wrote:

From: Leonid Ravich 

Searching assoc_list is protected by rcu_read_lock, since the list is not
changed inline, in accordance with the RCU list rules.

The queue array embedded in nvmet_fc_tgt_assoc is protected by
rcu_read_lock according to the RCU dereference/assign rules.

Queue and assoc objects are freed after a grace period via call_rcu.

The tgtport lock is taken for changing assoc_list.

Reviewed-by: Eldad Zinger 
Reviewed-by: Elad Grupi 
Signed-off-by: Leonid Ravich 
---
1) fixed style issues
2) queues array protected by rcu as well

  drivers/nvme/target/fc.c | 81 +++-
  1 file changed, 38 insertions(+), 43 deletions(-)
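
For reference, a minimal sketch of the RCU list pattern the commit message
describes (hypothetical names, not the actual fc.c code):

struct assoc {
	struct list_head a_list;
	u64 association_id;
	struct rcu_head rcu;
};

static LIST_HEAD(assoc_list);
static DEFINE_SPINLOCK(assoc_lock);

/* Readers: lockless traversal under rcu_read_lock(). */
static struct assoc *find_assoc(u64 id)
{
	struct assoc *a, *found = NULL;

	rcu_read_lock();
	list_for_each_entry_rcu(a, &assoc_list, a_list) {
		if (a->association_id == id) {
			found = a;	/* caller must also hold a reference */
			break;
		}
	}
	rcu_read_unlock();
	return found;
}

/* Writers: list changes under the lock, frees after a grace period
 * (the patch uses call_rcu; kfree_rcu is the shorthand form).
 */
static void remove_assoc(struct assoc *a)
{
	spin_lock_irq(&assoc_lock);
	list_del_rcu(&a->a_list);
	spin_unlock_irq(&assoc_lock);
	kfree_rcu(a, rcu);
}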




Reviewed-by: James Smart 

-- james





Re: [PATCH] nvmet-fc: associations list replaced with hlist rcu,

2021-01-11 Thread James Smart

On 12/24/2020 3:05 AM, leonid.rav...@dell.com wrote:

From: Leonid Ravich 

to remove locking from nvmet_fc_find_target_queue,
which is called per IO.

Signed-off-by: Leonid Ravich 
---
  drivers/nvme/target/fc.c | 54 
  1 file changed, 32 insertions(+), 22 deletions(-)


was this replaced by the v2 patch?

-- james







Re: [PATCH][next] scsi: lpfc: Fix memory leak on lcb_context

2020-11-19 Thread James Smart



On 11/18/2020 6:13 AM, Colin King wrote:

From: Colin Ian King 

Currently there is an error return path that neglects to free the
allocation for lcb_context.  Fix this by adding a new error exit path
that kfrees lcb_context before returning.  Use this new exit path in
another error path that also kfrees the same object, allowing a line
of code to be removed.

Addresses-Coverity: ("Resource leak")
Fixes: 4430f7fd09ec ("scsi: lpfc: Rework locations of ndlp reference taking")
Signed-off-by: Colin Ian King 
---
  drivers/scsi/lpfc/lpfc_els.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
index 03f47d1b21fe..0d3271b4c130 100644
--- a/drivers/scsi/lpfc/lpfc_els.c
+++ b/drivers/scsi/lpfc/lpfc_els.c
@@ -6515,18 +6515,20 @@ lpfc_els_rcv_lcb(struct lpfc_vport *vport, struct 
lpfc_iocbq *cmdiocb,
lcb_context->ndlp = lpfc_nlp_get(ndlp);
if (!lcb_context->ndlp) {
rjt_err = LSRJT_UNABLE_TPC;
-   goto rjt;
+   goto rjt_free;
}
  
  	if (lpfc_sli4_set_beacon(vport, lcb_context, state)) {

lpfc_printf_vlog(ndlp->vport, KERN_ERR, LOG_TRACE_EVENT,
 "0193 failed to send mail box");
-   kfree(lcb_context);
lpfc_nlp_put(ndlp);
rjt_err = LSRJT_UNABLE_TPC;
-   goto rjt;
+   goto rjt_free;
}
return 0;
+
+rjt_free:
+   kfree(lcb_context);
  rjt:
memset(&stat, 0, sizeof(stat));
stat.un.b.lsRjtRsnCode = rjt_err;

Looks good.

Reviewed-by: James Smart 

-- james





Re: [PATCH][next] scsi: lpfc: remove dead code on second !ndlp check

2020-11-19 Thread James Smart



On 11/18/2020 5:37 AM, Colin King wrote:

From: Colin Ian King 

Currently there is a null check on the pointer ndlp that exits via
error path issue_ct_rsp_exit, followed by another null check on the
same pointer that is almost identical to the previous null check
stanza and yet can never be reached because the previous check
exited via issue_ct_rsp_exit. This is dead code and can be removed.

Addresses-Coverity: ("Logically dead code")
Signed-off-by: Colin Ian King 
---
  drivers/scsi/lpfc/lpfc_bsg.c | 6 --
  1 file changed, 6 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_bsg.c b/drivers/scsi/lpfc/lpfc_bsg.c
index 35f4998504c1..41e3657c2d8d 100644
--- a/drivers/scsi/lpfc/lpfc_bsg.c
+++ b/drivers/scsi/lpfc/lpfc_bsg.c
@@ -1526,12 +1526,6 @@ lpfc_issue_ct_rsp(struct lpfc_hba *phba, struct bsg_job 
*job, uint32_t tag,
goto issue_ct_rsp_exit;
}
  
-		/* Check if the ndlp is active */

-   if (!ndlp) {
-   rc = IOCB_ERROR;
-   goto issue_ct_rsp_exit;
-   }
-
		/* get a reference count so the ndlp doesn't go away while
 * we respond
 */

Looks good.

Reviewed-by: James Smart 

-- james





Re: [PATCH][next] scsi: lpfc: fix pointer defereference before it is null checked issue

2020-11-19 Thread James Smart



On 11/18/2020 5:13 AM, Colin King wrote:

From: Colin Ian King 

There is a null check on pointer lpfc_cmd after the pointer has been
dereferenced when pointers rdata and ndlp are initialized at the start
of the function. Fix this by only assigning rdata and ndlp after the
pointer lpfc_cmd has been null checked.

Addresses-Coverity: ("Dereference before null check")
Fixes: 96e209be6ecb ("scsi: lpfc: Convert SCSI I/O completions to SLI-3 and SLI-4 
handlers")
Signed-off-by: Colin Ian King 
---
  drivers/scsi/lpfc/lpfc_scsi.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index f989490359a5..3b989f720937 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -4022,8 +4022,8 @@ lpfc_fcp_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct 
lpfc_iocbq *pwqeIn,
struct lpfc_io_buf *lpfc_cmd =
(struct lpfc_io_buf *)pwqeIn->context1;
struct lpfc_vport *vport = pwqeIn->vport;
-   struct lpfc_rport_data *rdata = lpfc_cmd->rdata;
-   struct lpfc_nodelist *ndlp = rdata->pnode;
+   struct lpfc_rport_data *rdata;
+   struct lpfc_nodelist *ndlp;
struct scsi_cmnd *cmd;
unsigned long flags;
struct lpfc_fast_path_event *fast_path_evt;
@@ -4040,6 +4040,9 @@ lpfc_fcp_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct 
lpfc_iocbq *pwqeIn,
return;
}
  
+	rdata = lpfc_cmd->rdata;

+   ndlp = rdata->pnode;
+
if (bf_get(lpfc_wcqe_c_xb, wcqe)) {
/* TOREMOVE - currently this flag is checked during
 * the release of lpfc_iocbq. Remove once we move


Looks good.

Reviewed-by: James Smart 

-- james





Re: [PATCH -next] scsi: lpfc: Mark lpfc_nvmet_prep_abort_wqe with static keyword

2020-11-19 Thread James Smart

On 11/17/2020 11:42 PM, Zou Wei wrote:

Fix the following sparse warning:

drivers/scsi/lpfc/lpfc_nvmet.c:3340:1: warning: symbol 
'lpfc_nvmet_prep_abort_wqe' was not declared. Should it be static?

Signed-off-by: Zou Wei 
---
  drivers/scsi/lpfc/lpfc_nvmet.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c
index c8b9434..a71df87 100644
--- a/drivers/scsi/lpfc/lpfc_nvmet.c
+++ b/drivers/scsi/lpfc/lpfc_nvmet.c
@@ -3336,7 +3336,7 @@ lpfc_nvmet_unsol_issue_abort(struct lpfc_hba *phba,
   *
   * This function is called with hbalock held.
   **/
-void
+static void
  lpfc_nvmet_prep_abort_wqe(struct lpfc_iocbq *pwqeq, u16 xritag, u8 opt)
  {
	union lpfc_wqe128 *wqe = &pwqeq->wqe;


Zou, thank you. I just submitted the same patch. Either one is fine, 
whichever Martin wants to take. :)


-- james

Reviewed-by: James Smart 





Re: [PATCH] scsi: lpfc: Add the missed misc_deregister() for lpfc_init()

2020-08-04 Thread James Smart




On 7/30/2020 11:56 PM, Jing Xiangfeng wrote:

lpfc_init() fails to call misc_deregister() in an error path. Add a
label 'unregister' to fix it.

Signed-off-by: Jing Xiangfeng 
---
  drivers/scsi/lpfc/lpfc_init.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)



Looks fine.

Reviewed-by: James Smart 


Thanks

-- james



Re: [PATCH -next] scsi: lpfc: Add dependency on CPU_FREQ

2020-07-22 Thread James Smart

On 7/21/2020 7:30 PM, Guenter Roeck wrote:

Since commit 317aeb83c92b ("scsi: lpfc: Add blk_io_poll support for
latency improvment"), the lpfc driver depends on CPUFREQ. Without it,
builds fail with

drivers/scsi/lpfc/lpfc_sli.c: In function 'lpfc_init_idle_stat_hb':
drivers/scsi/lpfc/lpfc_sli.c:7329:26: error:
implicit declaration of function 'get_cpu_idle_time'

Add the missing dependency.

Fixes: 317aeb83c92b ("scsi: lpfc: Add blk_io_poll support for latency 
improvment")
Cc: Dick Kennedy 
Cc: James Smart 
Signed-off-by: Guenter Roeck 
---
  drivers/scsi/Kconfig | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 571473a58f98..701b61ec76ee 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -1154,6 +1154,7 @@ source "drivers/scsi/qedf/Kconfig"
  config SCSI_LPFC
tristate "Emulex LightPulse Fibre Channel Support"
depends on PCI && SCSI
+   depends on CPU_FREQ
depends on SCSI_FC_ATTRS
depends on NVME_TARGET_FC || NVME_TARGET_FC=n
depends on NVME_FC || NVME_FC=n


Reviewed-by:  James Smart 

-- james



Re: linux-next: Tree for Jul 7 (scsi/lpfc/lpfc_init.c)

2020-07-07 Thread James Smart

On 7/7/2020 10:09 AM, Randy Dunlap wrote:

On 7/7/20 1:08 AM, Stephen Rothwell wrote:

Hi all,

Changes since 20200706:


on i386:

when CONFIG_ACPI is not set/enabled:


../drivers/scsi/lpfc/lpfc_init.c:1265:15: error: implicit declaration of 
function 'get_cpu_idle_time'; did you mean 'get_cpu_device'? 
[-Werror=implicit-function-declaration]


The cpufreq people want justification for using
get_cpu_idle_time().  Please see
https://lore.kernel.org/linux-scsi/20200707030943.xkocccy6qy2c3hrx@vireshk-i7/





The driver is using CPU utilization in order to choose between softirq 
or work queues when handling an interrupt. When the CPU is less utilized, 
softirq is used; when it is more utilized, a work queue is used. The 
utilization is checked periodically via a heartbeat.
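
A rough sketch of that heuristic (hypothetical threshold and bookkeeping;
the actual logic lives in lpfc's heartbeat and completion paths):

#include <linux/cpufreq.h>	/* get_cpu_idle_time() */

/* Sampled from the heartbeat: busy if under ~30% idle (made-up cutoff). */
static bool cpu_is_busy(unsigned int cpu, u64 *last_idle, u64 *last_wall)
{
	u64 wall, idle = get_cpu_idle_time(cpu, &wall, 1);
	u64 d_idle = idle - *last_idle, d_wall = wall - *last_wall;

	*last_idle = idle;
	*last_wall = wall;
	return d_wall && (d_idle * 100 < d_wall * 30);
}

/* In the completion path, conceptually:
 *	if (busy)
 *		queue_work_on(cpu, wq, &cq->irqwork);	// work queue
 *	else
 *		irq_poll_sched(&cq->iop);		// softirq
 */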


-- james



Re: [PATCH][next] scsi: lpfc: fix inconsistent indenting

2020-07-07 Thread James Smart

On 7/7/2020 8:00 AM, Colin King wrote:

From: Colin Ian King 

Fix smatch warning:
drivers/scsi/lpfc/lpfc_sli.c:15156 lpfc_cq_poll_hdler() warn: inconsistent
indenting

Signed-off-by: Colin Ian King 
---
  drivers/scsi/lpfc/lpfc_sli.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index e17675381cb8..92fc6527e7ee 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -15153,7 +15153,7 @@ static int lpfc_cq_poll_hdler(struct irq_poll *iop, int 
budget)
  {
struct lpfc_queue *cq = container_of(iop, struct lpfc_queue, iop);
  
-	 __lpfc_sli4_hba_process_cq(cq, LPFC_IRQ_POLL);

+   __lpfc_sli4_hba_process_cq(cq, LPFC_IRQ_POLL);
  
  	return 1;

  }


Reviewed-by: James Smart 

-- james



Re: [PATCH][next] scsi: lpfc: fix less than zero comparison on unsigned int computation

2020-07-06 Thread James Smart



On 7/6/2020 6:08 AM, Colin King wrote:

From: Colin Ian King 

The expression start_idx - dbg_cnt is evaluated using unsigned int
arithmetic (since these variables are unsigned ints) and hence can
never be less than zero, so the less-than comparison is never true.
Rewrite the expression to check for start_idx being less than dbg_cnt.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 372c187b8a70 ("scsi: lpfc: Add an internal trace log buffer")
Signed-off-by: Colin Ian King 
---
  drivers/scsi/lpfc/lpfc_init.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 7285b0114837..ce5afe7b11d0 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -14152,7 +14152,7 @@ void lpfc_dmp_dbg(struct lpfc_hba *phba)
if ((start_idx + dbg_cnt) > (DBG_LOG_SZ - 1)) {
temp_idx = (start_idx + dbg_cnt) % DBG_LOG_SZ;
} else {
-   if ((start_idx - dbg_cnt) < 0) {
+   if (start_idx < dbg_cnt) {
start_idx = DBG_LOG_SZ - (dbg_cnt - start_idx);
temp_idx = 0;
} else {
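
The pitfall being fixed, in standalone form (illustrative values only):

#include <stdio.h>

int main(void)
{
	unsigned int start_idx = 1, dbg_cnt = 2;

	/* Unsigned subtraction wraps instead of going negative, so a
	 * "(start_idx - dbg_cnt) < 0" test can never be true.
	 */
	printf("%u\n", start_idx - dbg_cnt);	/* 4294967295 */
	printf("%d\n", start_idx < dbg_cnt);	/* 1: the correct test */
	return 0;
}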


Thanks Colin - I was about to send a patch for this. It has this fix and 
one a couple of lines further down. I will post it shortly.


-- james



Re: [PATCH] scsi: lpfc: Avoid another null dereference in lpfc_sli4_hba_unset()

2020-06-23 Thread James Smart




On 6/23/2020 1:41 AM, SeongJae Park wrote:

From: SeongJae Park 

Commit cdb42becdd40 ("scsi: lpfc: Replace io_channels for nvme and fcp
with general hdw_queues per cpu") has introduced static checker warnings
for potential null dereferences in 'lpfc_sli4_hba_unset()' and
commit 1ffdd2c0440d ("scsi: lpfc: resolve static checker warning in
lpfc_sli4_hba_unset") has tried to fix it.  However, yet another
potential null dereference is remaining.  This commit fixes it.

This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.

Fixes: 1ffdd2c0440d ("scsi: lpfc: resolve static checker warning in 
lpfc_sli4_hba_unset")
Fixes: cdb42becdd40 ("scsi: lpfc: Replace io_channels for nvme and fcp with general 
hdw_queues per cpu")
Signed-off-by: SeongJae Park 
---
  drivers/scsi/lpfc/lpfc_init.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 69a5249e007a..6637f84a3d1b 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -11878,7 +11878,8 @@ lpfc_sli4_hba_unset(struct lpfc_hba *phba)
lpfc_sli4_xri_exchange_busy_wait(phba);
  
  	/* per-phba callback de-registration for hotplug event */

-   lpfc_cpuhp_remove(phba);
+   if (phba->pport)
+   lpfc_cpuhp_remove(phba);
  
  	/* Disable PCI subsystem interrupt */

lpfc_sli4_disable_intr(phba);


Reviewed-by: James Smart 

-- james



Re: [PATCH 1/1] nvme-fcloop: verify wwnn and wwpn format

2020-06-04 Thread James Smart

On 5/25/2020 9:21 PM, Dongli Zhang wrote:

The nvme host and target verify the wwnn and wwpn format via
nvme_fc_parse_traddr(). For instance, it is required that the length of
wwnn be either 21 ("nn-0x") or 19 (nn-).

Add this verification to nvme-fcloop so that the input is always in
hex and the length of the input is always 18.

Otherwise, the user may use e.g. 0x2 to create an fcloop local port, while
0x0002 is required for the nvme host and target. This makes the
format requirement confusing.

Signed-off-by: Dongli Zhang 
---
  drivers/nvme/target/fcloop.c | 29 +++--
  1 file changed, 23 insertions(+), 6 deletions(-)
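
A standalone sketch of the rule the patch enforces, as described above
(illustrative only, not the fcloop code):

#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Accept only "0x" followed by exactly 16 hex digits (strlen == 18). */
static bool wwn_format_ok(const char *s)
{
	if (strlen(s) != 18 || s[0] != '0' || s[1] != 'x')
		return false;
	for (int i = 2; i < 18; i++)
		if (!isxdigit((unsigned char)s[i]))
			return false;
	return true;
}

/* wwn_format_ok("0x2000000000000002") -> true   (illustrative value)
 * wwn_format_ok("0x2")                -> false  (previously accepted)
 */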




Reviewed-by: James Smart 

Looks good. Sorry for the delay.

-- james




Re: [PATCH] nvme-fc: Only call nvme_cleanup_cmd() for normal operations

2020-06-02 Thread James Smart




On 5/29/2020 4:37 AM, Daniel Wagner wrote:

Asynchronous event notifications do not have a request
associated. When fcp_io() fails we unconditionally call
nvme_cleanup_cmd(), which leads to a crash.

Fixes: 16686f3a6c3c ("nvme: move common call to nvme_cleanup_cmd to core layer")
Cc: Max Gurtovoy 
Signed-off-by: Daniel Wagner 
---
  drivers/nvme/host/fc.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 7dfc4a2ecf1e..287a3e8ea317 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2300,10 +2300,11 @@ nvme_fc_start_fcp_op(struct nvme_fc_ctrl *ctrl, struct 
nvme_fc_queue *queue,
opstate = atomic_xchg(&op->state, FCPOP_STATE_COMPLETE);
__nvme_fc_fcpop_chk_teardowns(ctrl, op, opstate);
  
-		if (!(op->flags & FCOP_FLAGS_AEN))

+   if (!(op->flags & FCOP_FLAGS_AEN)) {
nvme_fc_unmap_data(ctrl, op->rq, op);
+   nvme_cleanup_cmd(op->rq);
+   }
  
-		nvme_cleanup_cmd(op->rq);

nvme_fc_ctrl_put(ctrl);
  
  		if (ctrl->rport->remoteport.port_state == FC_OBJSTATE_ONLINE &&


Reviewed-by:  James Smart  

-- james



Re: [PATCH] scsi: lpfc: Fix lpfc_nodelist leak when processing unsolicited event

2020-05-26 Thread James Smart




On 5/25/2020 8:12 AM, Daniel Wagner wrote:

Hi,

On Mon, May 25, 2020 at 10:16:24PM +0800, Xiyu Yang wrote:

In order to create or activate a new node, lpfc_els_unsol_buffer()
invokes lpfc_nlp_init() or lpfc_enable_node() or lpfc_nlp_get(), all of
which will return a reference to the specified lpfc_nodelist object in
"ndlp" with increased refcnt.

lpfc_enable_node() is not changing the refcnt.


When lpfc_els_unsol_buffer() returns, the local variable "ndlp" becomes
invalid, so the refcount should be decreased to keep the refcount balanced.

The reference counting issue happens in one exception handling path of
lpfc_els_unsol_buffer(). When "ndlp" is in DEV_LOSS, the function forgets
to decrease the refcnt increased by lpfc_nlp_init() or
lpfc_enable_node() or lpfc_nlp_get(), causing a refcnt leak.

Fix this issue by calling lpfc_nlp_put() when "ndlp" is in DEV_LOSS.

This sounds reasonable. At least the lpfc_nlp_init() and lpfc_nlp_get()
cases need this. And I suppose this is also ok for lpfc_enable_node().

Reviewed-by: Daniel Wagner 

Thanks,
Daniel


Looked at it here and it looks good.

Reviewed-by: James Smart 

-- james




Re: [PATCH] scsi: lpfc: Honor module parameter lpfc_use_adisc

2019-10-21 Thread James Smart

On 10/21/2019 3:05 AM, Daniel Wagner wrote:

The initial lpfc_desc_set_adisc implementation dea3101e0a5c ("lpfc:
add Emulex FC driver version 8.0.28") enabled ADISC if

cfg_use_adisc && RSCN_MODE && FCP_2_DEVICE

In commit 92d7f7b0cde3 ("[SCSI] lpfc: NPIV: add NPIV support on top of
SLI-3") this changed to

(cfg_use_adisc && RSCN_MODE) || FCP_2_DEVICE

and later in ffc954936b13 ("[SCSI] lpfc 8.3.13: FC Discovery Fixes and
enhancements.") to

(cfg_use_adisc && RSCN_MODE) || (FCP_2_DEVICE && FCP_TARGET)

A customer reports that after a devloss, an ADISC failure is logged. It
turns out the ADISC flag is set even though the user explicitly set
lpfc_use_adisc = 0.

[Sat Dec 22 22:55:58 2018] lpfc 0000:82:00.0: 2:(0):0203 Devloss timeout on 
WWPN 50:01:43:80:12:8e:40:20 NPort x05df00 Data: x8200 x8 xa
[Sat Dec 22 23:08:20 2018] lpfc 0000:82:00.0: 2:(0):2755 ADISC failure 
DID:05DF00 Status:x9/x7

Fixes: 92d7f7b0cde3 ("[SCSI] lpfc: NPIV: add NPIV support on top of SLI-3")
Cc: James Smart 
Cc: Alex Iannicelli 
Signed-off-by: Daniel Wagner 
---
Hi,

Unfortunatly, I don't really know all the procotols involved. So this
is just a rough guess what is wrong.

Thanks,
Daniel

  drivers/scsi/lpfc/lpfc_nportdisc.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c 
b/drivers/scsi/lpfc/lpfc_nportdisc.c
index cc6b1b0bae83..d27ae84326df 100644
--- a/drivers/scsi/lpfc/lpfc_nportdisc.c
+++ b/drivers/scsi/lpfc/lpfc_nportdisc.c
@@ -940,9 +940,9 @@ lpfc_disc_set_adisc(struct lpfc_vport *vport, struct 
lpfc_nodelist *ndlp)
  
  	if (!(vport->fc_flag & FC_PT2PT)) {

/* Check config parameter use-adisc or FCP-2 */
-   if ((vport->cfg_use_adisc && (vport->fc_flag & FC_RSCN_MODE)) ||
+   if (vport->cfg_use_adisc && ((vport->fc_flag & FC_RSCN_MODE) ||
((ndlp->nlp_fcp_info & NLP_FCP_2_DEVICE) &&
-(ndlp->nlp_type & NLP_FCP_TARGET))) {
+(ndlp->nlp_type & NLP_FCP_TARGET)))) {
spin_lock_irq(shost->host_lock);
ndlp->nlp_flag |= NLP_NPR_ADISC;
spin_unlock_irq(shost->host_lock);
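
The condition's evolution, restated with illustrative booleans (a sketch
of the logic the commit message walks through, not the driver's actual
flag tests):

static bool want_adisc(bool use_adisc, bool rscn_mode,
		       bool fcp2_device, bool fcp_target)
{
	/* 8.0.28:  use_adisc && rscn_mode && fcp2_device            */
	/* NPIV:    (use_adisc && rscn_mode) || fcp2_device          */
	/* 8.3.13:  (use_adisc && rscn_mode) ||
	 *          (fcp2_device && fcp_target)                      */
	/* this patch: use_adisc gates every case, so setting
	 * lpfc_use_adisc=0 really disables ADISC:
	 */
	return use_adisc &&
	       (rscn_mode || (fcp2_device && fcp_target));
}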


Looks good.

Reviewed-by: James Smart 

Thank You

-- james



Re: [PATCH] scsi: lpfc: Check queue pointer before use

2019-10-18 Thread James Smart

On 10/18/2019 9:21 AM, Daniel Wagner wrote:

The queue pointer might not be valid. The rest of the code checks the
pointer before accessing it. lpfc_sli4_process_missed_mbox_completions
is the only place where the check is missing.

Fixes: 657add4e5e15 ("scsi: lpfc: Fix poor use of hardware queues if fewer irq 
vectors")
Cc: James Smart 
Signed-off-by: Daniel Wagner 
---
Hi,

Not entirely sure if this is correct. I tried to understand the logic of
the mentioned patch but failed to grasp all the details. Anyway, we
observe a crash in lpfc_sli4_process_missed_mbox_completions() while
iterating the array. All but the last entry have a valid pointer.

Thanks,
Daniel

  drivers/scsi/lpfc/lpfc_sli.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 379c37451645..149966ba8a17 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -7906,7 +7906,7 @@ lpfc_sli4_process_missed_mbox_completions(struct lpfc_hba 
*phba)
if (sli4_hba->hdwq) {
for (eqidx = 0; eqidx < phba->cfg_irq_chann; eqidx++) {
eq = phba->sli4_hba.hba_eq_hdl[eqidx].eq;
-   if (eq->queue_id == sli4_hba->mbx_cq->assoc_qid) {
+   if (eq && eq->queue_id == sli4_hba->mbx_cq->assoc_qid) {
fpeq = eq;
break;
    }


looks fine. Thanks!

Reviewed-by: James Smart 

-- james



Re: [GIT PULL] SCSI fixes for 5.3-rc7

2019-09-06 Thread James Smart

On 9/6/2019 4:18 PM, Linus Torvalds wrote:

On Fri, Sep 6, 2019 at 1:39 PM James Bottomley
 wrote:


diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 8d8c495b5b60..d65558619ab0 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -5715,7 +5715,7 @@ LPFC_ATTR_RW(nvme_embed_cmd, 1, 0, 2,
   *  0= Set nr_hw_queues by the number of CPUs or HW queues.
   *  1,128 = Manually specify the maximum nr_hw_queue value to be set,
   *
- * Value range is [0,128]. Default value is 8.
+ * Value range is [0,256]. Default value is 8.
   */

Shouldn't that "1,128 = Manually specify.." line also have been updated?

Not that I really care, and I'll pull this, but..

   Linus


Yes - thanks

-- james



Re: [PATCH 1/1] scsi: lpfc: Convert existing %pf users to %ps

2019-09-04 Thread James Smart

On 9/4/2019 9:04 AM, Sakari Ailus wrote:

Convert the remaining %pf users to %ps to prepare for the removal of the
old %pf conversion specifier support.

Fixes: 323506644972 ("scsi: lpfc: Migrate to %px and %pf in kernel print calls")
Signed-off-by: Sakari Ailus 
---
  drivers/scsi/lpfc/lpfc_hbadisc.c | 4 ++--
  drivers/scsi/lpfc/lpfc_sli.c | 2 +-
  2 files changed, 3 insertions(+), 3 deletions(-)




Reviewed-by: James Smart 

-- james




Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace

2019-08-30 Thread James Smart

On 8/30/2019 2:07 PM, Sagi Grimberg wrote:



Yes we do, userspace should use it to order events.  Does udev not
handle that properly today?


The problem is not ordering of events, its really about the fact that
the chardev can be removed and reallocated for a different controller
(could be a completely different discovery controller) by the time
that userspace handles the event.


The same is generally true for a lot of kernel devices.  We could reduce
the chance by using the idr cyclic allocator.


Well, it was raised by Hannes and James, so I'll ask them respond here
because I don't mind having it this way. I personally think that this
is a better approach than having a cyclic idr allocator. In general, I
don't necessarily think that this is a good idea to have cyclic
controller enumerations if we don't absolutely have to...


We hit it right and left without the cyclic allocator, but that won't 
necessarily remove it.


Perhaps we should have had a unique token assigned to the controller, 
and have the event pass the name and the token.  The cli would then, 
if the token is present, validate it via an ioctl before proceeding 
with other ioctls.


Where all the connection arguments were added was due to the reuse 
issue, and then solving the question of how to verify and/or look up 
the desired controller by using the shotgun approach rather than 
being very pointed, which is what the name/token would do.


This unique token is: trtype:traddr:trsvcid:host-traddr ...


well yes :)  though rather verbose.   There is still a minute window as 
we're comparing values in sysfs, prior to opening the device, so 
technically something could change in that window between when we 
checked sysfs and when we opened.   We can certainly check after we open 
the device to solve that issue.


There is some elegance to a 32-bit token for the controller (can be an 
incrementing value) passed in the event and used when servicing the 
event that avoids a bunch of work.


Doing either of these would eliminate what Hannes liked - looking for 
the discovery controller with those attributes. Although, I don't know 
that looking for it is all that meaningful.




Re: [PATCH v2 3/3] nvme: fire discovery log page change events to userspace

2019-08-30 Thread James Smart

On 8/30/2019 11:08 AM, Sagi Grimberg wrote:



Yes we do, userspace should use it to order events.  Does udev not
handle that properly today?


The problem is not ordering of events, its really about the fact that
the chardev can be removed and reallocated for a different controller
(could be a completely different discovery controller) by the time
that userspace handles the event.


The same is generally true for a lot of kernel devices.  We could reduce
the chance by using the idr cyclic allocator.


Well, it was raised by Hannes and James, so I'll ask them respond here
because I don't mind having it this way. I personally think that this
is a better approach than having a cyclic idr allocator. In general, I
don't necessarily think that this is a good idea to have cyclic
controller enumerations if we don't absolutely have to...


We hit it right and left without the cyclic allocator, but that won't 
necessarily remove it.


Perhaps we should have had a unique token assigned to the controller, 
and have the event pass the name and the token.  The cli would then, if 
the token is present, validate it via an ioctl before proceeding with 
other ioctls.


Where all the connection arguments were added was due to the reuse issue, 
and then solving the question of how to verify and/or look up the desired 
controller by using the shotgun approach rather than being very 
pointed, which is what the name/token would do.


-- james



Re: [linux-next][BUG][driver/scsi/lpfc][10541f] Kernel panics when booting next kernel on my Power 9 box

2019-08-28 Thread James Smart

On 8/27/2019 10:02 PM, Abdul Haleem wrote:

Greetings,

linux-next kernel 5.3.0-rc1 failed to boot with kernel Oops on Power 9
box

I see a recent changes to lpfc code was from commit
10541f03 scsi: lpfc: Update lpfc version to 12.4.0.0

Recent boot logs:

[..snip..]


see  https://www.spinics.net/lists/linux-scsi/msg133343.html

It hasn't been tested yet, but appears to be the issue.

-- james


Re: [PATCH v2] nvme: allow 64-bit results in passthru commands

2019-08-19 Thread James Smart




On 8/19/2019 7:49 AM, Keith Busch wrote:

On Mon, Aug 19, 2019 at 12:06:23AM -0700, Marta Rybczynska wrote:

- On 16 Aug, 2019, at 15:16, Christoph Hellwig h...@lst.de wrote:

Sorry for not replying to the earlier version, and thanks for doing
this work.

I wonder if instead of using our own structure we'd just use
a full nvme SQE for the input and CQE for that output.  Even if we
reserve a few fields that means we are ready for any newly used
field (at least until the SQE/CQE sizes are expanded..).

We could do that, nvme_command and nvme_completion are already UAPI.
On the other hand that would mean not filling out certain fields like
command_id. Can do an approach like this.

Well, we need to pass user space addresses and lengths, which isn't
captured in struct nvme_command.

This is going to be fun.  It's going to have to be a cooperative effort 
between app and transport. There will always need to be some parts of 
the SQE filled out by the transport like SGL, command type/subtype bits, 
as well as dealing with buffers as Keith states. Command ID is another 
of those fields.


-- james




Re: [PATCH] scsi: lpfc: remove redundant code

2019-08-16 Thread James Smart




On 8/7/2019 6:35 PM, Fuqian Huang wrote:

Remove the redundant initialization code.

Signed-off-by: Fuqian Huang 
---



Looks good!

Reviewed-by: James Smart 

-- james



Re: [PATCH] scsi: lpfc: fix "NULL check before some freeing functions is not needed"

2019-08-16 Thread James Smart




On 7/21/2019 4:42 AM, Hariprasad Kelam wrote:

As the dma_pool_destroy and mempool_destroy functions have NULL checks,
we do not need a NULL check before calling them.

Fix below warnings reported by coccicheck
./drivers/scsi/lpfc/lpfc_mem.c:252:2-18: WARNING: NULL check before some
freeing functions is not needed.
./drivers/scsi/lpfc/lpfc_mem.c:255:2-18: WARNING: NULL check before some
freeing functions is not needed.
./drivers/scsi/lpfc/lpfc_mem.c:258:2-18: WARNING: NULL check before some
freeing functions is not needed.
./drivers/scsi/lpfc/lpfc_mem.c:261:2-18: WARNING: NULL check before some
freeing functions is not needed.
./drivers/scsi/lpfc/lpfc_mem.c:265:2-18: WARNING: NULL check before some
freeing functions is not needed.
./drivers/scsi/lpfc/lpfc_mem.c:269:2-17: WARNING: NULL check before some
freeing functions is not needed.

Signed-off-by: Hariprasad Kelam 
---
  drivers/scsi/lpfc/lpfc_mem.c | 21 +
  1 file changed, 9 insertions(+), 12 deletions(-)
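
The pattern being removed, in general form (dma_pool_destroy and
mempool_destroy, like kfree, return early on NULL; the field name below
is illustrative):

/* Before: redundant guard */
if (phba->lpfc_mbuf_pool)
	dma_pool_destroy(phba->lpfc_mbuf_pool);

/* After: the destroy function is already a NULL-safe no-op */
dma_pool_destroy(phba->lpfc_mbuf_pool);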



Thanks

Reviewed-by: James Smart 

-- james



Re: [PATCH] scsi: lpfc: use spin_lock_irqsave instead of spin_lock_irq in IRQ context

2019-08-16 Thread James Smart

On 8/12/2019 1:31 AM, Fuqian Huang wrote:

spin_unlock_irq will unconditionally enable interrupts.
Function lpfc_findnode_rpi is called from
 lpfc_sli_abts_err_handler (./drivers/scsi/lpfc/lpfc_sli.c)
  <- lpfc_sli_async_event_handler
  <- lpfc_sli_process_unsol_iocb
  <- lpfc_sli_handle_fast_ring_event
  <- lpfc_sli_fp_intr_handler
  <- lpfc_sli_intr_handler
  and lpfc_sli_intr_handler is an interrupt handler.
Interrupts would thus be re-enabled inside an interrupt handler.
Use spin_lock_irqsave/spin_unlock_irqrestore instead of spin_(un)lock_irq
in IRQ context to avoid this.

Signed-off-by: Fuqian Huang 
---
  drivers/scsi/lpfc/lpfc_hbadisc.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
index 28ecaa7fc715..cf02c352b324 100644
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -6065,10 +6065,11 @@ lpfc_findnode_rpi(struct lpfc_vport *vport, uint16_t 
rpi)
  {
struct Scsi_Host *shost = lpfc_shost_from_vport(vport);
struct lpfc_nodelist *ndlp;
+   unsigned long flags;
  
-	spin_lock_irq(shost->host_lock);

+   spin_lock_irqsave(shost->host_lock, flags);
ndlp = __lpfc_findnode_rpi(vport, rpi);
-   spin_unlock_irq(shost->host_lock);
+   spin_unlock_irqrestore(shost->host_lock, flags);
return ndlp;
  }
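
The general rule the patch applies (generic sketch, not lpfc-specific):

static DEFINE_SPINLOCK(lock);
unsigned long flags;

/* spin_lock_irq()/spin_unlock_irq() unconditionally re-enable
 * interrupts on unlock, which is wrong if the caller may already be
 * in hard-IRQ context. The *_irqsave variant restores the previous
 * interrupt state instead.
 */
spin_lock_irqsave(&lock, flags);
/* ... critical section, safe in any context ... */
spin_unlock_irqrestore(&lock, flags);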
  


Thank you.

Reviewed-by: James Smart 

-- james


Re: [PATCH -next] scsi: lpfc: Remove unnecessary null check before kfree

2019-07-15 Thread James Smart




On 7/11/2019 7:10 AM, YueHaibing wrote:

A null check before a kfree is redundant, so remove it.
This is detected by coccinelle.

Signed-off-by: YueHaibing 
---
  drivers/scsi/lpfc/lpfc_bsg.c | 4 +---



Reviewed-by: James Smart 

-- james


Re: [PATCH 2/4] lpfc: reduce stack size with CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE

2019-06-28 Thread James Smart




On 6/28/2019 5:37 AM, Arnd Bergmann wrote:

The lpfc_debug_dump_all_queues() function repeatedly calls into
lpfc_debug_dump_qe(), which has a temporary 128 byte buffer.
This was fine before the introduction of CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE
because each instance could occupy the same stack slot. However,
now they each get their own copy, which leads to a huge increase
in stack usage as seen from the compiler warning:

drivers/scsi/lpfc/lpfc_debugfs.c: In function 'lpfc_debug_dump_all_queues':
drivers/scsi/lpfc/lpfc_debugfs.c:6474:1: error: the frame size of 1712 bytes is 
larger than 100 bytes [-Werror=frame-larger-than=]

Avoid this by not marking lpfc_debug_dump_qe() as inline so the
compiler can choose to emit a static version of this function
when it's needed or otherwise silently drop it. As an added benefit,
not inlining multiple copies of this function means we save several
kilobytes of .text section, reducing the file size from 47kB to 43kB.

It is somewhat unusual to have a function that is static but not
inline in a header file, but this does not cause problems here
because it is only used by other inline functions. It would
however seem reasonable to move all the lpfc_debug_dump_* functions
into lpfc_debugfs.c and not mark them inline as a later cleanup.
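
A compact illustration of the effect described (hypothetical code, not
the actual debugfs helpers):

/* With STRUCTLEAK forcing initialization of each buffer, the compiler
 * can no longer overlap the stack slots of the inlined calls; each
 * call site keeps its own 128-byte buffer live in the caller's frame.
 */
static inline void dump_qe(struct lpfc_queue *q, u32 idx)
{
	char line[128];		/* one copy per inlined call site */
	/* ... format the queue entry into line and print it ... */
}

void dump_all(struct lpfc_queue *q0, struct lpfc_queue *q1 /* , ... */)
{
	dump_qe(q0, 0);		/* +128 bytes of frame */
	dump_qe(q1, 0);		/* +128 more */
	/* a dozen-plus call sites add up to the 1712-byte frame warning */
}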


I agree with this cleanup.



Fixes: 81a56f6dcd20 ("gcc-plugins: structleak: Generalize to all variable 
types")
Signed-off-by: Arnd Bergmann 
---
  drivers/scsi/lpfc/lpfc_debugfs.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)




Reviewed-by: James Smart 

-- james



Re: [PATCH] scsi: lpfc: Fix backport of faf5a744f4f8 ("scsi: lpfc: avoid uninitialized variable warning")

2019-06-06 Thread James Smart

On 6/6/2019 10:41 AM, Nathan Chancellor wrote:

Prior to commit 4c47efc140fa ("scsi: lpfc: Move SCSI and NVME Stats to
hardware queue structures") upstream, we allocated a cstat structure in
lpfc_nvme_create_localport. When commit faf5a744f4f8 ("scsi: lpfc: avoid
uninitialized variable warning") was backported, it was placed after the
allocation so we leaked memory whenever this function was called and
that conditional was true (so whenever CONFIG_NVME_FC is disabled).

Move the IS_ENABLED if statement above the allocation since it is not
needed when the condition is true.

Reported-by: Pavel Machek 
Signed-off-by: Nathan Chancellor 
---
  drivers/scsi/lpfc/lpfc_nvme.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 099f70798fdd..645ffb5332b4 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -2477,14 +2477,14 @@ lpfc_nvme_create_localport(struct lpfc_vport *vport)
lpfc_nvme_template.max_sgl_segments = phba->cfg_nvme_seg_cnt + 1;
lpfc_nvme_template.max_hw_queues = phba->cfg_nvme_io_channel;
  
+	if (!IS_ENABLED(CONFIG_NVME_FC))

+   return ret;
+
cstat = kmalloc((sizeof(struct lpfc_nvme_ctrl_stat) *
phba->cfg_nvme_io_channel), GFP_KERNEL);
if (!cstat)
return -ENOMEM;
  
-	if (!IS_ENABLED(CONFIG_NVME_FC))

-   return ret;
-
/* localport is allocated from the stack, but the registration
 * call allocates heap memory as well as the private area.
     */


Reviewed-by: James Smart 




Re: [PATCH] scsi: lpfc: Avoid unused function warnings

2019-06-06 Thread James Smart




On 6/5/2019 10:24 PM, Nathan Chancellor wrote:

When building powerpc pseries_defconfig or powernv_defconfig:

drivers/scsi/lpfc/lpfc_nvmet.c:224:1: error: unused function
'lpfc_nvmet_get_ctx_for_xri' [-Werror,-Wunused-function]
drivers/scsi/lpfc/lpfc_nvmet.c:246:1: error: unused function
'lpfc_nvmet_get_ctx_for_oxid' [-Werror,-Wunused-function]

These functions are only compiled when CONFIG_NVME_TARGET_FC is enabled.
Use that same condition so there is no more warning. While the fixes
commit did not introduce these functions, it caused these warnings.

Fixes: 4064b27417a7 ("scsi: lpfc: Make some symbols static")
Signed-off-by: Nathan Chancellor 
---
  drivers/scsi/lpfc/lpfc_nvmet.c | 2 ++
  1 file changed, 2 insertions(+)




Looks fine.

Reviewed-by:  James Smart  




Re: [PATCH -next] scsi: lpfc: Make some symbols static

2019-06-04 Thread James Smart




On 5/31/2019 8:28 AM, YueHaibing wrote:

Fix sparse warnings:

drivers/scsi/lpfc/lpfc_sli.c:115:1: warning: symbol 'lpfc_sli4_pcimem_bcopy' 
was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_sli.c:7854:1: warning: symbol 
'lpfc_sli4_process_missed_mbox_completions' was not declared. Should it be 
static?
drivers/scsi/lpfc/lpfc_nvmet.c:223:27: warning: symbol 
'lpfc_nvmet_get_ctx_for_xri' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:245:27: warning: symbol 
'lpfc_nvmet_get_ctx_for_oxid' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_init.c:75:10: warning: symbol 'lpfc_present_cpu' was not 
declared. Should it be static?

Reported-by: Hulk Robot 
Signed-off-by: YueHaibing 
---
  drivers/scsi/lpfc/lpfc_init.c  | 2 +-
  drivers/scsi/lpfc/lpfc_nvmet.c | 4 ++--
  drivers/scsi/lpfc/lpfc_sli.c   | 4 ++--
  3 files changed, 5 insertions(+), 5 deletions(-)




Looks good - thank You

Reviewed-by: James Smart  




Re: [PATCH -next] scsi: lpfc: Remove set but not used variables 'qp'

2019-06-04 Thread James Smart




On 5/31/2019 8:27 AM, YueHaibing wrote:

Fixes gcc '-Wunused-but-set-variable' warnings:

drivers/scsi/lpfc/lpfc_init.c: In function lpfc_setup_cq_lookup:
drivers/scsi/lpfc/lpfc_init.c:9359:30: warning: variable qp set but not used 
[-Wunused-but-set-variable]

It's not used since commit e70596a60f88 ("scsi: lpfc: Fix
poor use of hardware queues if fewer irq vectors")

Signed-off-by: YueHaibing 



Looks good - thank You

Reviewed-by: James Smart  



Re: [PATCH] scsi: lpfc: Use *_pool_zalloc rather than *_pool_alloc

2019-05-29 Thread James Smart




On 5/29/2019 1:21 PM, Thomas Meyer wrote:

Use *_pool_zalloc rather than *_pool_alloc followed by memset with 0.

Signed-off-by: Thomas Meyer 
---




looks good

Reviewed-by: James Smart 

-- james



Re: [PATCH 5/8] scsi: lpfc: change snprintf to scnprintf for possible overflow

2019-03-20 Thread James Smart

On 3/20/2019 10:39 AM, Greg KH wrote:

On Tue, Jan 15, 2019 at 02:41:17PM -0800, James Smart wrote:

On 1/14/2019 5:15 PM, Kees Cook wrote:

On Sat, Jan 12, 2019 at 7:29 AM Willy Tarreau  wrote:

From: Silvio Cesare

Change snprintf to scnprintf. There are generally two cases where using
snprintf causes problems.

1) Uses of size += snprintf(buf, SIZE - size, fmt, ...)
In this case, if snprintf would have written more characters than what the
buffer size (SIZE) is, then size will end up larger than SIZE. In later
uses of snprintf, SIZE - size will result in a negative number, leading
to problems. Note that size might already be too large by using
size = snprintf before the code reaches a case of size += snprintf.

2) If size is ultimately used as a length parameter for a copy back to user
space, then it will potentially allow for a buffer overflow and information
disclosure when size is greater than SIZE. When the size is used to index
the buffer directly, we can have memory corruption. This also means when
size = snprintf... is used, it may also cause problems since size may become
large.  Copying to userspace is mitigated by the HARDENED_USERCOPY kernel
configuration.

The solution to these issues is to use scnprintf which returns the number of
characters actually written to the buffer, so the size variable will never
exceed SIZE.

Signed-off-by: Silvio Cesare
Cc: James Smart
Cc: Dick Kennedy
Cc: Dan Carpenter
Cc: Kees Cook
Cc: Will Deacon
Cc: Greg KH
Signed-off-by: Willy Tarreau
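
In standalone form, the failure mode and the fix (illustrative buffer
and inputs; part1/part2 are hypothetical strings):

char buf[64];
int size = 0;

/* BAD: snprintf() returns the length that *would* have been written.
 * On truncation, size can exceed sizeof(buf), and sizeof(buf) - size
 * underflows to a huge size_t on the next call.
 */
size += snprintf(buf + size, sizeof(buf) - size, "%s", part1);
size += snprintf(buf + size, sizeof(buf) - size, "%s", part2);

/* GOOD: scnprintf() returns what was actually written, so size can
 * never exceed sizeof(buf).
 */
size = 0;
size += scnprintf(buf + size, sizeof(buf) - size, "%s", part1);
size += scnprintf(buf + size, sizeof(buf) - size, "%s", part2);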

I think this needs Cc: stable.

Reviewed-by: Kees Cook

-Kees



Reviewed-by:  James Smart 

What ever happened to this patch?  Did it get dropped somehow?

thanks,

greg k-h


I talked with them and will make sure it's pulled in shortly.

-- james




Re: [PATCH 5/8] scsi: lpfc: change snprintf to scnprintf for possible overflow

2019-03-20 Thread James Smart

On 3/20/2019 10:39 AM, Greg KH wrote:

On Tue, Jan 15, 2019 at 02:41:17PM -0800, James Smart wrote:

On 1/14/2019 5:15 PM, Kees Cook wrote:

On Sat, Jan 12, 2019 at 7:29 AM Willy Tarreau  wrote:

From: Silvio Cesare

Change snprintf to scnprintf. There are generally two cases where using
snprintf causes problems.

1) Uses of size += snprintf(buf, SIZE - size, fmt, ...)
In this case, if snprintf would have written more characters than what the
buffer size (SIZE) is, then size will end up larger than SIZE. In later
uses of snprintf, SIZE - size will result in a negative number, leading
to problems. Note that size might already be too large by using
size = snprintf before the code reaches a case of size += snprintf.

2) If size is ultimately used as a length parameter for a copy back to user
space, then it will potentially allow for a buffer overflow and information
disclosure when size is greater than SIZE. When the size is used to index
the buffer directly, we can have memory corruption. This also means when
size = snprintf... is used, it may also cause problems since size may become
large.  Copying to userspace is mitigated by the HARDENED_USERCOPY kernel
configuration.

The solution to these issues is to use scnprintf which returns the number of
characters actually written to the buffer, so the size variable will never
exceed SIZE.

Signed-off-by: Silvio Cesare
Cc: James Smart
Cc: Dick Kennedy
Cc: Dan Carpenter
Cc: Kees Cook
Cc: Will Deacon
Cc: Greg KH
Signed-off-by: Willy Tarreau

I think this needs Cc: stable.

Reviewed-by: Kees Cook

-Kees



Reviewed-by:  James Smart 

What ever happened to this patch?  Did it get dropped somehow?

thanks,

greg k-h


I assume it wasn't pulled in by the scsi maintainers. I'll go ping them.

-- james



Re: general protection fault in skb_put

2019-03-11 Thread James Smart

On 3/11/2019 9:40 AM, Dmitry Vyukov wrote:

On Mon, Mar 11, 2019 at 5:20 PM 'James Smart' via syzkaller-bugs
 wrote:


On 3/11/2019 6:20 AM, syzbot wrote:

syzbot has bisected this bug to:

commit 97faec531460c949d7120672b8c77e2f41f8d6d7
Author: James Smart 
Date:   Thu Sep 13 23:17:38 2018 +

 nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device

bisection log:
https://syzkaller.appspot.com/x/bisect.txt?x=121f55db20
start commit:   97faec53 nvme_fc: add 'nvme_discovery' sysfs attribute
to ..
git tree:   linux-next
final crash: https://syzkaller.appspot.com/x/report.txt?x=111f55db20
console output: https://syzkaller.appspot.com/x/log.txt?x=161f55db20
kernel config: https://syzkaller.appspot.com/x/.config?x=59aefae07c771af6
dashboard link:
https://syzkaller.appspot.com/bug?extid=65788f9af9d54844389e
userspace arch: amd64
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=178e0798c0
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11b4f0b0c0

Reported-by: syzbot+65788f9af9d548443...@syzkaller.appspotmail.com
Fixes: 97faec53 ("nvme_fc: add 'nvme_discovery' sysfs attribute to fc
transport device")


Can someone contact me as to what this thing is doing and how to
interpret all the logs? nvme_fc isn't remotely in any of the logs and
doesn't use skb's unless the underlying udev uevents are using them.


Hi James,

What exactly is unclear/needs interpretation? syzbot did what is
commonly known as the kernel/git bisection process. This is a new feature
so there can be some rough edges. Hopefully we can improve the
representation together.

Thanks

Everything is unclear. You're telling me that an error occurred and that 
you reduced it to the git commit where the error starts appearing.


Usually there would be something in the base crash, which I'm looking at 
in https://syzkaller.appspot.com/x/report.txt?x=111f55db20 which 
would point back at something in the patch or related to it. There are 
no relationships.  I can't quite figure out what the base test actually 
did that generated the failure to see if there's any possible relationship.


Everything in the base crash stacktrace points to an issue in the 
bluetooth uart driver doing all the logging - not the patch called out. 
So this looks like a failure of your infrastructure.


-- james



Re: general protection fault in skb_put

2019-03-11 Thread James Smart

On 3/11/2019 6:20 AM, syzbot wrote:

syzbot has bisected this bug to:

commit 97faec531460c949d7120672b8c77e2f41f8d6d7
Author: James Smart 
Date:   Thu Sep 13 23:17:38 2018 +

    nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device

bisection log: 
https://syzkaller.appspot.com/x/bisect.txt?x=121f55db20
start commit:   97faec53 nvme_fc: add 'nvme_discovery' sysfs attribute 
to ..

git tree:   linux-next
final crash: https://syzkaller.appspot.com/x/report.txt?x=111f55db20
console output: https://syzkaller.appspot.com/x/log.txt?x=161f55db20
kernel config: https://syzkaller.appspot.com/x/.config?x=59aefae07c771af6
dashboard link: 
https://syzkaller.appspot.com/bug?extid=65788f9af9d54844389e

userspace arch: amd64
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=178e0798c0
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11b4f0b0c0

Reported-by: syzbot+65788f9af9d548443...@syzkaller.appspotmail.com
Fixes: 97faec53 ("nvme_fc: add 'nvme_discovery' sysfs attribute to fc 
transport device")


Can someone contact me as to what this thing is doing and how to 
interpret all the logs? nvme_fc isn't remotely in any of the logs and 
doesn't use skb's unless the underlying udev uevents are using them.


-- james


Re: [PATCH] nvmet-fc: use zero-sized array and struct_size() in kzalloc()

2019-03-08 Thread James Smart

On 2/23/2019 10:51 AM, Gustavo A. R. Silva wrote:

Update the code to use a zero-sized array instead of a pointer in
structure nvmet_fc_tgt_queue and use struct_size() in kzalloc().

Notice that one of the more common cases of allocation size calculations
is finding the size of a structure that has a zero-sized array at the end,
along with memory for some number of elements for that array. For example:

struct foo {
int stuff;
struct boo entry[];
};

instance = kzalloc(sizeof(struct foo) + sizeof(struct boo) * count, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can now
use the new struct_size() helper:

instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva 
---
  drivers/nvme/target/fc.c | 7 ++-
  1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 1e9654f04c60..23baec38f97e 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -128,12 +128,12 @@ struct nvmet_fc_tgt_queue {
struct nvmet_cq nvme_cq;
struct nvmet_sq nvme_sq;
struct nvmet_fc_tgt_assoc   *assoc;
-   struct nvmet_fc_fcp_iod *fod;   /* array of fcp_iods */
struct list_headfod_list;
struct list_headpending_cmd_list;
struct list_headavail_defer_list;
struct workqueue_struct *work_q;
struct kref ref;
+   struct nvmet_fc_fcp_iod fod[];  /* array of fcp_iods */
  } __aligned(sizeof(unsigned long long));
  
  struct nvmet_fc_tgt_assoc {

@@ -588,9 +588,7 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc 
*assoc,
if (qid > NVMET_NR_QUEUES)
return NULL;
  
-	queue = kzalloc((sizeof(*queue) +

-   (sizeof(struct nvmet_fc_fcp_iod) * sqsize)),
-   GFP_KERNEL);
+   queue = kzalloc(struct_size(queue, fod, sqsize), GFP_KERNEL);
if (!queue)
return NULL;
  
@@ -603,7 +601,6 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc,

if (!queue->work_q)
goto out_a_put;
  
queue->fod = (struct nvmet_fc_fcp_iod *)&queue[1];

queue->qid = qid;
queue->sqsize = sqsize;
queue->assoc = assoc;


Reviewed-by:  James Smart 

I guess this is a better style.

-- james



Re: [LKP] [workqueue] 4d43d395fe: WARNING:at_kernel/workqueue.c:#__flush_work

2019-03-04 Thread James Smart

On 3/4/2019 10:21 AM, Sagi Grimberg wrote:



Forwarding to NMVE people:

kernel test robot found that
flush_work(&ctrl->async_event_work) is called from nvmet_ctrl_free()
without INIT_WORK(&ctrl->async_event_work, nvmet_async_event_work)
after ctrl was allocated (probably initialized with 0).
Will you make sure that INIT_WORK() is always called?


I cannot reproduce this issue. When following the code I don't
immediately see how this can happen.. Was there something special
in this specific test run? Is it 100% reproduce-able?


I agree. INIT_WORK is set up as almost one of the first items for a new 
controller.  Smells more like a double-free or a corrupt ctrl struct 
from the transport.
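
For reference, the pairing under discussion (generic pattern, not the
nvmet code):

struct my_ctrl {
	struct work_struct async_event_work;
};

static void async_event_fn(struct work_struct *work)
{
	/* ... handle the async event ... */
}

static void ctrl_alloc(struct my_ctrl *ctrl)
{
	/* must run before any flush/queue of the work item */
	INIT_WORK(&ctrl->async_event_work, async_event_fn);
}

static void ctrl_free(struct my_ctrl *ctrl)
{
	/* flushing a never-initialized (zeroed) work_struct trips the
	 * WARN in __flush_work() shown in this thread
	 */
	flush_work(&ctrl->async_event_work);
}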


-- james



Re: [PATCH] scsi: lpfc: fix a handful of indentation issues

2019-02-13 Thread James Smart

On 2/12/2019 7:29 AM, Colin King wrote:

From: Colin Ian King 

There are a handful of statements that are indented incorrectly. Fix these.

Signed-off-by: Colin Ian King 
---
  drivers/scsi/lpfc/lpfc_bsg.c | 4 ++--
  drivers/scsi/lpfc/lpfc_debugfs.c | 4 ++--
  drivers/scsi/lpfc/lpfc_init.c| 2 +-
  drivers/scsi/lpfc/lpfc_mbox.c| 4 ++--
  drivers/scsi/lpfc/lpfc_sli.c | 2 +-
  5 files changed, 8 insertions(+), 8 deletions(-)




Looks good to me.

Signed-off-by:  James Smart   

-- james



Re: linux-next: manual merge of the scsi-mkp tree with Linus' tree

2019-02-07 Thread James Smart

On 2/6/2019 8:44 PM, Stephen Rothwell wrote:

Hi all,

Today's linux-next merge of the scsi-mkp tree got a conflict in:

   drivers/scsi/lpfc/lpfc_nvme.c

between commit:

   7961cba6f7d8 ("scsi: lpfc: nvme: avoid hang / use-after-free when destroying 
localport")

from Linus' tree and commit:

   4c47efc140fa ("scsi: lpfc: Move SCSI and NVME Stats to hardware queue 
structures")

from the scsi-mkp tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.




Thank You. The fixup patch looks correct.

Martin, let me know if there's anything else you'd like me to do.

-- james

Signed-off-by:  James Smart 


Re: [PATCH 5/8] scsi: lpfc: change snprintf to scnprintf for possible overflow

2019-01-15 Thread James Smart



On 1/14/2019 5:15 PM, Kees Cook wrote:

On Sat, Jan 12, 2019 at 7:29 AM Willy Tarreau  wrote:

From: Silvio Cesare

Change snprintf to scnprintf. There are generally two cases where using
snprintf causes problems.

1) Uses of size += snprintf(buf, SIZE - size, fmt, ...)
In this case, if snprintf would have written more characters than what the
buffer size (SIZE) is, then size will end up larger than SIZE. In later
uses of snprintf, SIZE - size will result in a negative number, leading
to problems. Note that size might already be too large by using
size = snprintf before the code reaches a case of size += snprintf.

2) If size is ultimately used as a length parameter for a copy back to user
space, then it will potentially allow for a buffer overflow and information
disclosure when size is greater than SIZE. When the size is used to index
the buffer directly, we can have memory corruption. This also means when
size = snprintf... is used, it may also cause problems since size may become
large.  Copying to userspace is mitigated by the HARDENED_USERCOPY kernel
configuration.

The solution to these issues is to use scnprintf which returns the number of
characters actually written to the buffer, so the size variable will never
exceed SIZE.

Signed-off-by: Silvio Cesare
Cc: James Smart
Cc: Dick Kennedy
Cc: Dan Carpenter
Cc: Kees Cook
Cc: Will Deacon
Cc: Greg KH
Signed-off-by: Willy Tarreau

I think this needs Cc: stable.

Reviewed-by: Kees Cook

-Kees




Reviewed-by:  James Smart 

-- james



Re: [PATCH 0/4] Rework NVMe abort handling

2018-07-19 Thread James Smart

On 7/19/2018 7:54 AM, Johannes Thumshirn wrote:

On Thu, Jul 19, 2018 at 04:50:05PM +0200, Christoph Hellwig wrote:

The upper layer is only going to retry after tearing down the transport
connection.  And a tear down of the connection MUST clear all pending
commands on the way.  If it doesn't we are in deep, deep trouble.

A NVMe abort has no chance of clearing things at the transport layer.

OK so all I can do in my case (if I want soft error recovery) is send
out a transport abort in the timeout handler and then rearm the timer
and requeue the command.

Which leaves us with the FC patch only, modified of course.



Which I'm going to say no on. I originally did the abort before the 
reset and it brought out some confusion in the reset code. Thus the 
existing flow just resets the controller, which has to be done 
after the abort anyway.


-- james



Re: [PATCH 0/4] Rework NVMe abort handling

2018-07-19 Thread James Smart



On 7/19/2018 7:10 AM, Johannes Thumshirn wrote:

On Thu, Jul 19, 2018 at 03:42:03PM +0200, Christoph Hellwig wrote:

Without even looking at the code yet:  why?  The nvme abort isn't
very useful, and due to the lack of ordering between different
queues almost harmful on fabrics.  What problem do you try to
solve?

The problem I'm trying to solve here is really just single commands
timing out because of, e.g., a bad switch in between which causes frame
loss somewhere.

I know RDMA and FC are defined to be lossless but reality sometimes
has a different view on this (can't talk too much for RDMA but I've
had some nice bugs in SCSI due to faulty switches dropping odd
frames).

Of course we can still do the big hammer if one command times out due
to a misbehaving switch, but we can also at least try to abort it. I
know aborts are defined as best effort, but as we're in the error path
anyway it doesn't hurt to at least try.

This would give us a chance to recover from such situations, of course,
given the target actually does something when receiving an abort.

In the FC case we can even send an ABTS and try to abort the command
on the FC side first, before doing it on NVMe. I'm not sure if we can
do it on RDMA or PCIe as well.

So the issue I'm trying to solve is easy, if one command times out for
whatever reason, there's no need to go the big transport reset route
before not even trying to recover from it. Possibly we should also try
doing a queue reset if aborting failed before doing the transport
reset.

Byte,
Johannes


I'm with Christoph.

It doesn't work that way... command delivery is very much tied to any 
command ordering delivery requirements as well as sqhd increment on the 
target, and response delivery is similarly tied to sqhd delivery to 
the host as well as ordering requirements on responses. With aborts as 
you're implementing, you drop those things.  Granted, Linux's lack of 
paying attention to SQHD (a problem waiting to happen in my mind) as 
well as not using fused commands (and no other commands yet requiring 
order) make it believe it can get away without it.


You're going to confuse transports as there's no understanding in the 
transport protocol on what it means to abort/cancel a single io.   The 
specs are rather clear, and for a good reason, that non-delivery (the 
abort or cancellation) mandates connection teardown which in turn 
mandates association teardown. You will be creating non-standard 
implementations that will fail interoperability and compliance.


If you really want single io abort - implement it in the NVMe standard 
way with Aborts to the admin queue, subject to the ACL limit.  Then push 
on the targets to support deep ACL counts and honestly responding to 
ABORT, and there will still be race conditions between the ABORT and its 
command that will make an interesting retry policy. Or, wait for Fred 
Knight's new proposal on ABORTS.
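
For reference, the standard path is small - a hedged sketch using struct 
nvme_abort_cmd from include/linux/nvme.h (io_sqid/victim_cid are 
placeholders for the victim's queue and command id):

	struct nvme_command c = { };

	c.abort.opcode = nvme_admin_abort_cmd;	/* admin opcode 0x08 */
	c.abort.sqid = cpu_to_le16(io_sqid);	/* SQ the victim was issued on */
	c.abort.cid = victim_cid;		/* command id to abort */

	/* submit on the admin queue; outstanding aborts are capped by ACL
	 * and completion is still only best effort per the spec */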


-- james




Re: [PATCH v5 1/2] nvme: cache struct nvme_ctrl reference to struct nvme_request

2018-06-27 Thread James Smart




On 6/27/2018 5:53 AM, Johannes Thumshirn wrote:

From: Sagi Grimberg 

We will need to reference the controller in the setup and completion
time for tracing and future traffic based keep alive support.

Signed-off-by: Sagi Grimberg 

---
Changes to v4:
- Move caching from nvme_setup_cmd to .init_request (Keith)
---
  drivers/nvme/host/core.c   | 1 +
  drivers/nvme/host/fc.c | 1 +
  drivers/nvme/host/nvme.h   | 1 +
  drivers/nvme/host/pci.c| 2 ++
  drivers/nvme/host/rdma.c   | 1 +
  drivers/nvme/target/loop.c | 1 +
  6 files changed, 7 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index e541fe268bcf..99dd62c1076c 100644



good for the fc side...
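
For context, the fc hunk amounts to caching the controller pointer at 
request-init time - a simplified sketch (the real init also sets up the 
fcp op before this):

	static int
	nvme_fc_init_request(struct blk_mq_tag_set *set, struct request *rq,
			unsigned int hctx_idx, unsigned int numa_node)
	{
		struct nvme_fc_ctrl *ctrl = set->driver_data;

		/* cache ctrl so setup/completion tracing can reach it */
		nvme_req(rq)->ctrl = &ctrl->ctrl;
		return 0;
	}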

Reviewed-by:  James Smart  




Re: [PATCH 5/5] nvmet: fcloop: use irqsave spinlocks

2018-06-12 Thread James Smart




On 5/15/2018 12:40 AM, Johannes Thumshirn wrote:

Lockdep complains about inconsistent hardirq states when using
nvme-fcloop. So use the irqsave variant of spin_locks for the
nvmefc_fcp_req's reqlock.

Here's the lockdep report:
[   11.138417] 
[   11.139085] WARNING: inconsistent lock state
[   11.139207] nvmet: creating controller 2 for subsystem blktests-subsystem-1 
for NQN nqn.2014-08.org.nvmexpress:uuid:046394e7-bff7-4403-9502-1816800efaa0.
[   11.139727] 4.17.0-rc5+ #883 Not tainted
[   11.139732] 
[   11.11] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[   11.145341] swapper/8/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[   11.146108] (ptrval) (&(&tfcp_req->reqlock)->rlock){?.+.}, at: 
fcloop_fcp_op+0x20/0x1d0 [nvme_fcloop]
[   11.148633] {HARDIRQ-ON-W} state was registered at:
[   11.149380]   _raw_spin_lock+0x34/0x70
[   11.149940]   fcloop_fcp_recv_work+0x17/0x90 [nvme_fcloop]
[   11.150746]   process_one_work+0x1d8/0x5c0
[   11.151346]   worker_thread+0x45/0x3e0
[   11.151886]   kthread+0x101/0x140
[   11.152370]   ret_from_fork+0x3a/0x50
[   11.152903] irq event stamp: 3
[   11.153402] hardirqs last  enabled at (36663): [] 
default_idle+0x13/0x180
[   11.154601] hardirqs last disabled at (36664): [] 
interrupt_entry+0xc4/0xe0
[   11.155817] softirqs last  enabled at (3): [] 
irq_enter+0x59/0x60
[   11.156952] softirqs last disabled at (36665): [] 
irq_enter+0x3e/0x60
[   11.158092]
[   11.158092] other info that might help us debug this:
[   11.159436]  Possible unsafe locking scenario:
[   11.159436]
[   11.160299]CPU0
[   11.160663]
[   11.161026]   lock(&(&tfcp_req->reqlock)->rlock);
[   11.161709]   <Interrupt>
[   11.162091] lock(&(&tfcp_req->reqlock)->rlock);
[   11.163148]
[   11.163148]  *** DEADLOCK ***
[   11.163148]
[   11.164007] no locks held by swapper/8/0.
[   11.164596]
[   11.164596] stack backtrace:
[   11.165238] CPU: 8 PID: 0 Comm: swapper/8 Not tainted 4.17.0-rc5+ #883
[   11.166180] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.0.0-prebuilt.qemu-project.org 04/01/2014
[   11.167673] Call Trace:
[   11.168037]  
[   11.168349]  dump_stack+0x78/0xb3
[   11.168864]  print_usage_bug+0x1ed/0x1fe
[   11.169440]  ? check_usage_backwards+0x110/0x110
[   11.170111]  mark_lock+0x263/0x2a0
[   11.170995]  __lock_acquire+0x675/0x1870
[   11.171865]  ? __lock_acquire+0x48d/0x1870
[   11.172460]  lock_acquire+0xd4/0x220
[   11.172981]  ? fcloop_fcp_op+0x20/0x1d0 [nvme_fcloop]
[   11.173709]  _raw_spin_lock+0x34/0x70
[   11.174243]  ? fcloop_fcp_op+0x20/0x1d0 [nvme_fcloop]
[   11.174978]  fcloop_fcp_op+0x20/0x1d0 [nvme_fcloop]
[   11.175673]  nvmet_fc_transfer_fcp_data+0x9b/0x130 [nvmet_fc]
[   11.176507]  nvmet_req_complete+0x10/0x110 [nvmet]
[   11.177210]  nvmet_bio_done+0x23/0x40 [nvmet]
[   11.177837]  blk_update_request+0xab/0x3b0
[   11.178434]  blk_mq_end_request+0x13/0x60
[   11.179033]  flush_smp_call_function_queue+0x58/0x150
[   11.179755]  smp_call_function_single_interrupt+0x49/0x260
[   11.180531]  call_function_single_interrupt+0xf/0x20
[   11.181236]  </IRQ>
[   11.181542] RIP: 0010:native_safe_halt+0x2/0x10
[   11.182186] RSP: 0018:a481006cbec0 EFLAGS: 0206 ORIG_RAX: 
ff04
[   11.183265] RAX: 9f54f8b86200 RBX: 9f54f8b86200 RCX: 
[   11.184264] RDX: 9f54f8b86200 RSI: 0001 RDI: 9f54f8b86200
[   11.185270] RBP: 0008 R08:  R09: 
[   11.186271] R10: 0001 R11:  R12: 
[   11.187280] R13:  R14: 9f54f8b86200 R15: 9f54f8b86200
[   11.188280]  default_idle+0x18/0x180
[   11.188806]  do_idle+0x176/0x260
[   11.189273]  cpu_startup_entry+0x5a/0x60
[   11.189832]  start_secondary+0x18f/0x1b0

Signed-off-by: Johannes Thumshirn 
Cc: James Smart 
---
  drivers/nvme/target/fcloop.c | 52 
  1 file changed, 29 insertions(+), 23 deletions(-)



Looks ok. I assume it is as the bio_done call can be in an interrupt 
handler.
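
(For the record, every reqlock site becomes the irqsave pattern - a 
minimal sketch:

	unsigned long flags;

	spin_lock_irqsave(&tfcp_req->reqlock, flags);
	/* ... inspect/update tfcp_req state ... */
	spin_unlock_irqrestore(&tfcp_req->reqlock, flags);

so a hard-irq completion path taking the same lock can't interrupt, and 
deadlock against, a task-context holder.)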


Reviewed-by:   James Smart  




Re: [PATCH 4/5] nvmet: use atomic allocations when allocating fc requests

2018-06-12 Thread James Smart




On 5/15/2018 12:40 AM, Johannes Thumshirn wrote:

fcloop_fcp_req() runs with the hctx_lock (a rcu_read_lock() locked
section) held, so memory allocations done in this context have to be
atomic.

...


Signed-off-by: Johannes Thumshirn 
---
  drivers/nvme/target/fcloop.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/fcloop.c b/drivers/nvme/target/fcloop.c
index 34712def81b1..d2209c60f95f 100644
--- a/drivers/nvme/target/fcloop.c
+++ b/drivers/nvme/target/fcloop.c
@@ -509,7 +509,7 @@ fcloop_fcp_req(struct nvme_fc_local_port *localport,
if (!rport->targetport)
return -ECONNREFUSED;
  
-	tfcp_req = kzalloc(sizeof(*tfcp_req), GFP_KERNEL);

+   tfcp_req = kzalloc(sizeof(*tfcp_req), GFP_ATOMIC);
if (!tfcp_req)
return -ENOMEM;
  


Reviewed-by:   James Smart  
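
The rule in miniature, a hedged sketch - .queue_rq is entered under 
hctx_lock(), an RCU read-side critical section, so nothing in it may 
block on reclaim:

	rcu_read_lock();		/* roughly what hctx_lock() enters */
	tfcp_req = kzalloc(sizeof(*tfcp_req), GFP_ATOMIC); /* must not sleep */
	rcu_read_unlock();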


Re: [PATCH 4/5] nvmet: use atomic allocations when allocating fc requests

2018-06-12 Thread James Smart




On 5/31/2018 2:31 AM, Sagi Grimberg wrote:


Question, why isn't tfcp_req embedded in fcpreq? don't they have
the same lifetime?



no they don't.  To properly simulate cable-pulls, etc - the host side 
and controller side effectively have their own "exchange" structure. 
tfcp_req corresponds to the controller side. The lifetimes of the two 
halves can differ.
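
A condensed sketch of the controller-side half (a field subset of the 
real struct fcloop_fcpreq in fcloop.c):

	struct fcloop_fcpreq {
		struct fcloop_tport	*tport;
		struct nvmefc_fcp_req	*fcpreq;   /* host half; detachable */
		spinlock_t		reqlock;
		bool			active;
		bool			aborted;
	};

On a simulated cable-pull the host half is detached (fcpreq cleared 
under reqlock) while the controller half completes on its own.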


-- james



Re: [PATCH] nvme: fc: provide a descriptive error

2018-04-19 Thread James Smart

On 4/19/2018 10:43 AM, Johannes Thumshirn wrote:

Provide a descriptive error in case an lport to rport association
isn't found when creating the FC-NVME controller.

Currently it's very hard to debug the reason for a failed connect
attempt without a look at the source.

Signed-off-by: Johannes Thumshirn <jthumsh...@suse.de>

---
This actually happened to Hannes and me because of a typo in a
customer demo today, so yes things like this happen until we have a
proper way to do auto-connect.
---
  drivers/nvme/host/fc.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 6cb26bcf6ec0..8b66879b4ebf 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3284,6 +3284,8 @@ nvme_fc_create_ctrl(struct device *dev, struct 
nvmf_ctrl_options *opts)
}
spin_unlock_irqrestore(&nvme_fc_lock, flags);
  
+	pr_warn("%s: %s - %s combination not found\n",

+   __func__, opts->traddr, opts->host_traddr);
return ERR_PTR(-ENOENT);
  }
  


Signed-off-by:  James Smart  <james.sm...@broadcom.com>


Re: [PATCH 4/4] nvmet-fc: Use new SGL alloc/free helper for requests

2018-03-29 Thread James Smart

On 3/29/2018 10:02 AM, Logan Gunthorpe wrote:

Per the bug in the previous patch, I don't think that was ever a valid
assumption. It doesn't have anything to do with the sgl_alloc change
either. The dma_map interface is allowed to merge SGLs and that's why it
can return fewer nents than it was passed. I'm not sure how many or
which DMA ops actually do this which is why it hasn't actually
manifested itself as a bug; but it is part of how the interface is
specified to work.


Argh.. yep. I'll have to correct that assumption. The bug would have 
only shown up on i/o sizes beyond a particular length.




I think we need to store the sg_map_cnt separately and use it instead of
the calculation based on the transfer length. But this is really a fix
that should be rolled in with the previous patch. If you can point me to
where this needs to change I can update my patch, or if you want to fix
it yourself go ahead.

I'll fix it as it's part of the assumption fix.

With the nulling/zeroing, I'm good with your patch.

-- james.


Re: [PATCH 4/4] nvmet-fc: Use new SGL alloc/free helper for requests

2018-03-29 Thread James Smart

overall - I'm a little concerned about the replacement.

I know the code depends on the null/zero checks in 
nvmet_fc_free_tgt_pgs() as there are paths that can call them twice.  
Your replacement in that routine is fine, but you've fully removed any 
initialization to zero or setting to zero on free. Given the structure 
containing the req structure is reused, I'd rather that that code was 
left in some way... or that nvmet_req_free_sgl() or 
nvmet_fc_free_tgt_pgs() was updated to set req->sg to null and (not sure 
necessary if req->sg is null) set req->sg_cnt to 0.
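
i.e. something along these lines - a sketch of the ask, built from the 
patch's own names:

	static void
	nvmet_fc_free_tgt_pgs(struct nvmet_fc_fcp_iod *fod)
	{
		if (!fod->req.sg)	/* already freed on an earlier path */
			return;

		fc_dma_unmap_sg(fod->tgtport->dev, fod->req.sg, fod->req.sg_cnt,
				((fod->io_dir == NVMET_FCP_WRITE) ?
					DMA_FROM_DEVICE : DMA_TO_DEVICE));
		nvmet_req_free_sgl(&fod->req);
		fod->req.sg = NULL;
		fod->req.sg_cnt = 0;
	}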


also - sg_cnt isn't referenced as there is an assumption that: transfer 
length meant there would be (transfer length / PAGESIZE + 1 if partial 
page) pages allocated and sg_cnt would always equal that page cnt  thus 
the fcpreq->sgXXX setting behavior in nvmet_fc_transfer_fcp_data().   Is 
that no longer valid ?   I may not have watched closely enough when the 
sgl_alloc() common routine was introduced as it may have invalidated 
these assumptions that were true before.


-- james


On 3/29/2018 9:07 AM, Logan Gunthorpe wrote:

Use the new helpers introduced earlier to allocate the SGLs for
the request.

To do this, we drop the appearantly redundant data_sg and data_sg_cnt
members as they are identical to the existing req.sg and req.sg_cnt.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: James Smart <james.sm...@broadcom.com>
Cc: Christoph Hellwig <h...@lst.de>
Cc: Sagi Grimberg <s...@grimberg.me>
---
  drivers/nvme/target/fc.c | 38 +++---
  1 file changed, 11 insertions(+), 27 deletions(-)

diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 9f2f8ab83158..00135ff7d1c2 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -74,8 +74,6 @@ struct nvmet_fc_fcp_iod {
struct nvme_fc_cmd_iu   cmdiubuf;
struct nvme_fc_ersp_iu  rspiubuf;
dma_addr_t  rspdma;
-   struct scatterlist  *data_sg;
-   int data_sg_cnt;
u32 offset;
enum nvmet_fcp_datadir  io_dir;
boolactive;
@@ -1696,43 +1694,34 @@ EXPORT_SYMBOL_GPL(nvmet_fc_rcv_ls_req);
  static int
  nvmet_fc_alloc_tgt_pgs(struct nvmet_fc_fcp_iod *fod)
  {
-   struct scatterlist *sg;
-   unsigned int nent;
int ret;
  
-	sg = sgl_alloc(fod->req.transfer_len, GFP_KERNEL, &nent);

-   if (!sg)
-   goto out;
+   ret = nvmet_req_alloc_sgl(&fod->req, &fod->queue->nvme_sq);
+   if (ret < 0)
+   return NVME_SC_INTERNAL;
  
-	fod->data_sg = sg;

-   fod->data_sg_cnt = nent;
-   ret = fc_dma_map_sg(fod->tgtport->dev, sg, nent,
+   ret = fc_dma_map_sg(fod->tgtport->dev, fod->req.sg, fod->req.sg_cnt,
((fod->io_dir == NVMET_FCP_WRITE) ?
DMA_FROM_DEVICE : DMA_TO_DEVICE));
/* note: write from initiator perspective */
if (!ret)
-   goto out;
+   return NVME_SC_INTERNAL;
  
  	return 0;

-
-out:
-   return NVME_SC_INTERNAL;
  }
  
  static void

  nvmet_fc_free_tgt_pgs(struct nvmet_fc_fcp_iod *fod)
  {
-   if (!fod->data_sg || !fod->data_sg_cnt)
+   if (!fod->req.sg || !fod->req.sg_cnt)
return;
  
-	fc_dma_unmap_sg(fod->tgtport->dev, fod->data_sg, fod->data_sg_cnt,

+   fc_dma_unmap_sg(fod->tgtport->dev, fod->req.sg, fod->req.sg_cnt,
((fod->io_dir == NVMET_FCP_WRITE) ?
DMA_FROM_DEVICE : DMA_TO_DEVICE));
-   sgl_free(fod->data_sg);
-   fod->data_sg = NULL;
-   fod->data_sg_cnt = 0;
-}
  
+	nvmet_req_free_sgl(&fod->req);

+}
  
  static bool

  queue_90percent_full(struct nvmet_fc_tgt_queue *q, u32 sqhd)
@@ -1871,7 +1860,7 @@ nvmet_fc_transfer_fcp_data(struct nvmet_fc_tgtport 
*tgtport,
fcpreq->fcp_error = 0;
fcpreq->rsplen = 0;
  
-	fcpreq->sg = &fod->data_sg[fod->offset / PAGE_SIZE];

+   fcpreq->sg = &fod->req.sg[fod->offset / PAGE_SIZE];
fcpreq->sg_cnt = DIV_ROUND_UP(tlen, PAGE_SIZE);
  
  	/*

@@ -2083,7 +2072,7 @@ __nvmet_fc_fcp_nvme_cmd_done(struct nvmet_fc_tgtport 
*tgtport,
 * There may be a status where data still was intended to
 * be moved
 */
-   if ((fod->io_dir == NVMET_FCP_READ) && (fod->data_sg_cnt)) {
+   if ((fod->io_dir == NVMET_FCP_READ) && (fod->req.sg_cnt)) {
/* push the data over before sending rsp */
nvmet_fc_transfer_fcp_data(tgtport, fod,
NVMET_FCOP_READDATA);
@@ -2153,9 +214

Re: [PATCH 3/4] nvmet-fc: Don't use the count returned by the dma_map_sg call

2018-03-29 Thread James Smart

On 3/29/2018 9:30 AM, Logan Gunthorpe wrote:

Can you elaborate? The 'data_sg_cnt' member was in 'struct
nvmet_fc_fcp_iod' which is declared in fc.c so it doesn't seem sane for
lower driver to access it... In fact the next patch in the series
removes it completely.

Logan


actually, I do think it's ok. about to post review on the next patch

-- james



Re: [PATCH 3/4] nvmet-fc: Don't use the count returned by the dma_map_sg call

2018-03-29 Thread James Smart

On 3/29/2018 9:07 AM, Logan Gunthorpe wrote:

When allocating an SGL, the fibre channel target uses the number
of entities mapped as the number of entities in a given scatter
gather list. This is incorrect.

The DMA-API-HOWTO document gives this note:

The 'nents' argument to the dma_unmap_sg call must be
the _same_ one you passed into the dma_map_sg call,
it should _NOT_ be the 'count' value _returned_ from the
dma_map_sg call.

The fc code only stores the count value returned form the dma_map_sg()
call and uses that value in the call to dma_unmap_sg().

The dma_map_sg() call will return a lower count than nents when multiple
SG entries were merged into one. This implies that there will be fewer
DMA address and length entries but the original number of page entries
in the SGL. So if this occurs, when the SGL reaches nvmet_execute_rw(),
a bio would be created with fewer than the total number of entries.

As odd as it sounds, and as far as I can tell, the number of SG entries
mapped does not appear to be used anywhere in the fc driver and therefore
there's no current need to store it.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: James Smart <james.sm...@broadcom.com>
Cc: Christoph Hellwig <h...@lst.de>
Cc: Sagi Grimberg <s...@grimberg.me>
Fixes: c53432030d8642 ("nvme-fabrics: Add target support for FC transport")
---


Signed-off-by: James Smart  <james.sm...@broadcom.com>

Patch looks fine.

As for "not used anywhere", be careful as the structure being prepped is 
passed from the nvme-fc transport to an underlying lldd. So the 
references would likely be in the lldd.
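
The DMA-API rule in miniature (sketch; 'nents' is whatever the table 
was built with):

	int mapped = fc_dma_map_sg(dev, sgl, nents, dir);

	if (!mapped)
		return -EIO;
	/* walk 'mapped' entries when consuming dma addr/len pairs */
	fc_dma_unmap_sg(dev, sgl, nents, dir);	/* original nents, NOT 'mapped' */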


-- james



Re: [PATCH][next] scsi: lpfc: make several unions static, fix non-ANSI prototype

2018-03-13 Thread James Smart

On 3/13/2018 5:08 AM, Colin King wrote:

From: Colin Ian King <colin.k...@canonical.com>

There are several unions that are local to the source and do not need
to be in global scope, so make them static. Also add in a missing void
parameter to functions lpfc_nvme_cmd_template and
lpfc_nvmet_cmd_template to clean up non-ANSI warning.

Cleans up sparse warnings:
drivers/scsi/lpfc/lpfc_nvme.c:68:19: warning: symbol
'lpfc_iread_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvme.c:69:19: warning: symbol
'lpfc_iwrite_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvme.c:70:19: warning: symbol
'lpfc_icmnd_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvme.c:74:24: warning: non-ANSI function
'lpfc_tsend_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:78:19: warning: symbol
'lpfc_treceive_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:79:19: warning: symbol
'lpfc_trsp_cmd_template' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:83:25: warning: non-ANSI function
declaration of function 'lpfc_nvmet_cmd_template'

Signed-off-by: Colin Ian King <colin.k...@canonical.com>
---
  drivers/scsi/lpfc/lpfc_nvme.c  | 8 
  drivers/scsi/lpfc/lpfc_nvmet.c | 8 
  2 files changed, 8 insertions(+), 8 deletions(-)
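
The whole change is linkage/prototype tightening, e.g. (sketch derived 
from the quoted warnings; the templates are union lpfc_wqe128 in the 
source):

-union lpfc_wqe128 lpfc_iread_cmd_template;
+static union lpfc_wqe128 lpfc_iread_cmd_template;

-void lpfc_nvme_cmd_template()
+void lpfc_nvme_cmd_template(void)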




Signed-off-by:   James Smart <james.sm...@broadcom.com>




Re: [PATCH] scsi: lpfc: use memcpy_toio instead of writeq

2018-02-24 Thread James Smart
About to post a patch to fix. Rather than fidgeting with the copy 
routine, I want to go back to what we originally proposed - writeq() on 
64bit, writel() on 32-bit.
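
Roughly this shape - a sketch of the direction only, assuming 
q->entry_size stays a multiple of 8 and tmp points at the WQE:

	#ifdef CONFIG_64BIT
		for (i = 0; i < q->entry_size; i += sizeof(uint64_t))
			writeq(*((uint64_t *)(tmp + i)), q->dpp_regaddr + i);
	#else
		for (i = 0; i < q->entry_size; i += sizeof(uint32_t))
			writel(*((uint32_t *)(tmp + i)), q->dpp_regaddr + i);
	#endif

writeq() simply isn't defined on most 32-bit configs, which is what 
broke the build in the first place.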


-- james


On 2/23/2018 1:02 PM, Arnd Bergmann wrote:

On Fri, Feb 23, 2018 at 4:36 PM, Arnd Bergmann  wrote:

@@ -138,12 +137,10 @@ lpfc_sli4_wq_put(struct lpfc_queue *q, union lpfc_wqe 
*wqe)
 if (q->phba->sli3_options & LPFC_SLI4_PHWQ_ENABLED)
bf_set(wqe_wqid, &wqe->generic.wqe_com, q->queue_id);
 lpfc_sli_pcimem_bcopy(wqe, temp_wqe, q->entry_size);
-   if (q->dpp_enable && q->phba->cfg_enable_dpp) {
+   if (q->dpp_enable && q->phba->cfg_enable_dpp)
 /* write to DPP aperture taking advatage of Combined Writes */
-   tmp = (uint8_t *)wqe;
-   for (i = 0; i < q->entry_size; i += sizeof(uint64_t))
-   writeq(*((uint64_t *)(tmp + i)), q->dpp_regaddr + i);
-   }
+   memcpy_toio(tmp, q->dpp_regaddr, q->entry_size);
+
 /* ensure WQE bcopy and DPP flushed before doorbell write */
 wmb();


Not sure where we are with the question of whether memcpy_toio
is a good replacement or not, but further build testing showed that
my patch was completely broken in more than one way:

I mixed up the source and destination arguments, and I used
the uninitialized 'tmp' instead of 'wqe'. Don't try this patch.

Arnd




Re: [PATCH] nvme-pci: quiesce IO queues prior to disabling device HMB accesses

2018-02-16 Thread James Smart

On 2/12/2018 10:40 AM, Sagi Grimberg wrote:

Thanks, I picked this up for 4.17 (unless someone thinks
this is 4.16-rc material?)

___
Linux-nvme mailing list
linux-n...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme


Note: I really hope that we can eventually commonize all the 
quiesce/unquiesce logic and points relative to controller state.  We 
have just enough differences to be interesting.


-- james




Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

2018-01-18 Thread James Smart

Jianchao,

This looks very coherent to me. Thank You.
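
With the boundary marked by state, nvme_timeout can key off it - 
roughly (a sketch, not the final hunk):

	if (dev->ctrl.state == NVME_CTRL_RESETTING) {
		/* pre-boundary expiry: io from the previous instance;
		 * finish the disable and let reset_work continue */
		nvme_dev_disable(dev, false);
		return BLK_EH_HANDLED;
	}
	/* post-boundary (RECONNECTING) expiries are init-time admin io */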

-- james



On 1/18/2018 2:10 AM, Jianchao Wang wrote:

Hello

Please consider the following scenario.
nvme_reset_ctrl
   -> set state to RESETTING
   -> queue reset_work
 (scheduling)
nvme_reset_work
   -> nvme_dev_disable
 -> quiesce queues
 -> nvme_cancel_request
on outstanding requests
---_boundary_
   -> nvme initializing (issue request on adminq)

Before the _boundary_, we not only quiesce the queues but also cancel
all the outstanding requests.

A request could expire when the ctrl state is RESETTING.
  - If the timeout occur before the _boundary_, the expired requests
are from the previous work.
  - Otherwise, the expired requests are from the controller initializing
procedure, such as sending cq/sq create commands to adminq to setup
io queues.
In current implementation, nvme_timeout cannot identify the _boundary_
so only handles second case above.

In fact, after Sagi's commit (nvme-rdma: fix concurrent reset and
reconnect), both nvme-fc/rdma have following pattern:
RESETTING- quiesce blk-mq queues, teardown and delete queues/
connections, clear out outstanding IO requests...
RECONNECTING - establish new queues/connections and some other
initializing things.
Introduce RECONNECTING to nvme-pci transport to do the same mark
Then we get a coherent state definition among nvme pci/rdma/fc
transports and nvme_timeout could identify the _boundary_.

V5:
  - discard RESET_PREPARE and introduce RESETTING into nvme-pci
  - change the 1st patch's name and comment
  - other misc changes

V4:
  - rebase patches on Jens' for-next
  - let RESETTING equal to RECONNECTING in terms of work procedure
  - change the 1st patch's name and comment
  - other misc changes

V3:
  - fix wrong reference in loop.c
  - other misc changes

V2:
  - split NVME_CTRL_RESETTING into NVME_CTRL_RESET_PREPARE and
NVME_CTRL_RESETTING. Introduce new patch based on this.
  - distinguish the requests based on the new state in nvme_timeout
  - change comments of patch

drivers/nvme/host/core.c |  2 +-
drivers/nvme/host/pci.c  | 43 ---
2 files changed, 33 insertions(+), 12 deletions(-)

Thanks
Jianchao




Re: [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure

2018-01-18 Thread James Smart

On 1/18/2018 2:10 AM, Jianchao Wang wrote:

After Sagi's commit (nvme-rdma: fix concurrent reset and reconnect),
both nvme-fc/rdma have following pattern:
RESETTING- quiesce blk-mq queues, teardown and delete queues/
connections, clear out outstanding IO requests...
RECONNECTING - establish new queues/connections and some other
initializing things.
Introduce RECONNECTING to nvme-pci transport to do the same mark.
Then we get a coherent state definition among nvme pci/rdma/fc
transports.

Suggested-by: James Smart <james.sm...@broadcom.com>
Signed-off-by: Jianchao Wang <jianchao.w.w...@oracle.com>
---
  drivers/nvme/host/core.c |  2 +-
  drivers/nvme/host/pci.c  | 19 +--
  2 files changed, 18 insertions(+), 3 deletions(-)



Reviewed-by: James Smart  <james.sm...@broadcom.com>


Re: [PATCH V4 1/2] nvme: add NVME_CTRL_RESET_PREPARE state

2018-01-17 Thread James Smart

Thanks jianchoa. This helped.

On 1/17/2018 7:13 PM, jianchao.wang wrote:

Actually, this patchset is to fix an issue in nvme_timeout.
Please consider the following scenario.

nvme_reset_ctrl
 -> set state to RESETTING
 -> queue reset_work
 (scheduling)
  nvme_reset_work  -> nvme_dev_disable
 -> quiesce queues
   -> nvme_cancel_request
  on outstanding requests
  _boundary_
   -> nvme initializing (may issue request on adminq)

Before the boundary, we not only quiesce the queues but also cancel all the 
outstanding requests.

A request could expire when the ctrl state is RESETTING.
  - If the timeout occur before the _boundary_, the expired requests are from 
the previous work.
  - Otherwise, the expired requests are from the controller initializing 
procedure, such as sending cq/sq
create commands to adminq to setup io queues.
In current implementation, nvme_timeout cannot identify the _boundary_ so only 
handles second case above.


So what you've described very well is the pci adapter and the fact that 
it doesn't use a RECONNECTING state when it starts to reinit the 
controller like rdma/fc does.  Note: we had left it that way as a 
"grandfathering" as the code already existed and we didn't see an issue 
just leaving the reinit in the resetting handler.






So in the patch, RESETTING in nvme-fc/rdma is changed to RESET_PREPARE. Then we 
get:
nvme-fc/rdma RESET_PREPARE -> RECONNECTING -> LIVE
nvme-pci RESET_PREPARE -> RESETTING-> LIVE


Right - so another way to look at this is you renamed RESETTING to 
RESET_PREPARE and added a new RESETTING state in the nvme-pci when in 
reinits.  Why not simply have the nvme-pci transport transition to 
RECONNECTING like the other transports. No new states, semantics are all 
the same.


Here's what the responsibility of the states are:
RESETTING:
-quiescing the blk-mq queues so errors don't bubble back to callers and 
new io is suspended
-terminating the link-side association: resets the controller via 
register access/SET_PROPERTY, deletes connections/queues, cleans out 
active io and lets them queue for resending, etc.
-end result is nvme block device is present, but idle, and no active 
controller on the link side  (link being a transport specific link such 
as pci, or ethernet/ib for rdma or fc link).


RECONNECTING:
- "establish an association at the transport" on PCI this is immediate - 
you can either talk to the pci function or you can't. With rdma/fc we 
send transport traffic to create an admin q connection.
- if association succeeded: the controller is init'd via register 
accesses/fabric GET/SET_PROPERTIES and admin-queue command, io 
connections/queues created, blk-mq queues unquiesced, and finally 
transition to NVME_CTRL_LIVE.
- if association failed: check controller timeout; if not exceeded, 
schedule timer and retry later.  With pci, this could cover the 
temporary loss of the pci function access (a hot plug event) while 
keeping the nvme blk device live in the system.


It matches what you are describing but using what we already have in 
place - with the only difference being the addition of your "boundary" 
by adding the RECONNECTING state to nvme-pci.
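
Compressed into call order (sketch; the teardown/reestablish/retry 
helpers are stand-ins for per-transport code):

	nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING);
	teardown_association(ctrl);		/* quiesce, cancel, kill queues */

	nvme_change_ctrl_state(ctrl, NVME_CTRL_RECONNECTING);
	if (reestablish_association(ctrl))	/* immediate for pci */
		schedule_reconnect_retry(ctrl);	/* bounded by ctlr loss tmo */
	else
		nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE);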






I don't believe RESETTING and RECONNECTING are that similar - unless - you are looking at 
that "reinit" part that we left grandfathered into PCI.

Yes. I have got the point. Thanks for your directive.

Both nvme-pc and nvme-fc/rdma have "shutdown" part to tear down 
queues/connects, quiesce queues, cancel outstanding requests...
We could call this part as "shutdowning" as well as the scheduling gap.
Because the difference of initializing between the nvme-pci and nvme-fc/rdma,  we could 
call "reiniting" for nvme-pci and
call "reconnecting" for nvme-fc/rdma


I don't think we need different states for the two. The actions taken 
are really very similar. How they do the actions varies slightly, but 
not what they are trying to do.   Thus, I'd prefer we keep the existing 
RECONNECTING state and use it in nvme-pci as well. I'm open to renaming 
it if there's something about the name that is not agreeable.



-- james



Re: [PATCH V4 1/2] nvme: add NVME_CTRL_RESET_PREPARE state

2018-01-17 Thread James Smart
I'm having a hard time following why this patch is being requested. Help 
me catch on.


On 1/16/2018 8:54 PM, Jianchao Wang wrote:

Currently, the ctrl->state will be changed to NVME_CTRL_RESETTING
before queue the reset work. This is not so strict. There could be
a big gap before the reset_work callback is invoked.
ok so what you're saying is you want something to know that you've 
transitioned to RESETTING but not yet performed an action for the 
reset.   What is that something and what is it to do ?


guessing - I assume it's in the transport  xxx_is_ready() and/or 
nvmf_check_init_req(), which is called at the start of queue_rq(), that 
wants to do something ?




  In addition,
there is some disable work in the reset_work callback, strictly
speaking, not part of reset work, and could lead to some confusion.


Can you explain this ?  what's the confusion ?

I assume by "disable" you mean it is quiescing queues ?




In addition, after set state to RESETTING and disable procedure,
nvme-rdma/fc use NVME_CTRL_RECONNECTING to mark the setup and
reconnect procedure. The RESETTING state has been narrowed.


I still don't follow. Yes RECONNECTING is where we repetitively: try to 
create a link-side association again: if it fails, delay and try again; 
or if success, reinit the controller, and unquiesce all queues - 
allowing full operation again, at which time we transition to LIVE.


by "narrowed" what do you mean ?    what "narrowed" ?

In FC, as we have a lot of work that must occur to terminate io as part 
of the reset, it can be a fairly long window.  I don't know that any 
functionally in this path, regardless of time window, has narrowed.





This patch add NVME_CTRL_RESET_PREPARE state to mark the reset_work
or error recovery work, scheduling gap and disable procedure.
After that,
  - For nvme-pci, nvmet-loop, set state to RESETTING, start
initialization.
  - For nvme-rdma, nvme-fc, set state to RECONNECTING, start
initialization or reconnect.


So I'm lost - so you've effectively renamed RESETTING to RESET_PREPARE 
for fc/rdma.  What do you define are the actions in RESETTING that went 
away and why is that different between pci and the other transports ?   
Why doesn't nvme-pci need to go through RESET_PREPARE ? doesn't it have 
the same scheduling window for a reset_work thread ?



On 1/17/2018 1:06 AM, Max Gurtovoy wrote:



+
+       case NVME_CTRL_RESETTING:
+               switch (old_state) {
+               case NVME_CTRL_RESET_PREPARE:
+                       changed = true;
+                       /* FALLTHRU */
+               default:
+                       break;
+               }
+               break;
        case NVME_CTRL_RECONNECTING:
                switch (old_state) {
                case NVME_CTRL_LIVE:
-               case NVME_CTRL_RESETTING:
+               case NVME_CTRL_RESET_PREPARE:


As I suggested in V3, please don't allow this transition.
We'll move to NVME_CTRL_RECONNECTING from NVME_CTRL_RESETTING.

I look on it like that:

NVME_CTRL_RESET_PREPARE - "suspend" state
NVME_CTRL_RESETTING - "resume" state

you don't reconnect from "suspend" state, you must "resume" before you 
reconnect.


This makes no sense to me.

I could use a definition of what "suspend" and "resume" mean to you

from what I've seen so far:
NVME_CTRL_RESET_PREPARE:   means I've decided to reset, changed state, 
but the actual work for reset hasn't started yet.   As we haven't 
commonized who does the quiescing of the queues, the queues are still 
live at this state, although some nvme check routine may bounce them. In 
truth, the queues should be quiesced here.


NVME_CTRL_RESETTING: I'm resetting the controller, tearing down 
queues/connections, the link side association.  AFAIK - pci and all the 
other transports have to do these things.   Now is when the blk-mq 
queues get stopped.   We have a variance on whether the queues are 
unquiesced or left quiesced (I think this is what you meant by "resume", 
where resume means unquiesce) at the end of this.   The admin_q is 
unquiesced, meaning new admin cmds should fail.  rdma also has io queues 
unquiesced meaning new ios fail, while fc leaves them quiesced while 
background timers run - meaning no new ios issued, nor any fail back to 
a multipather. With the agreement that we would patch all of the 
transports to leave them quiesced with fast-fail-timeouts occuring to 
unquiesce them and start failing ios.


NVME_RECONNECTING: transitioned to after the link-side association is 
terminated and the transport will now attempt to reconnect (perhaps 
several attempts) to create a new link-side association. Stays in this 
state until the controller is fully reconnected and it transitions to 
NVME_LIVE.   Until the link side association is active, queues do what 
they do (as left by RESETTING and/or updated per timeouts) except 
that after an active association, the queues will be unquiesced at the 
time of the LIVE transition.   Note: we grandfathered PCI into not 
needing this state:   As you (almost) can't fail the 


Re: [Suspected-Phishing]Re: [PATCH V3 1/2] nvme: split resetting state into reset_prepate and resetting

2018-01-17 Thread James Smart

On 1/17/2018 2:37 AM, Sagi Grimberg wrote:


After Sagi's "nvme-rdma: fix concurrent reset and reconnect", the rdma 
ctrl state is changed to the RECONNECTING state
after some clearing and shutdown work, then some initializing 
procedure, no matter whether it is the reset work path or the error recovery path.

The fc reset work also does the same thing.
So if we define the split such that RESET_PREPARE includes the scheduling 
gap and the disable and clear work, and RESETTING includes the
initializing procedure, then RECONNECTING is very similar to RESETTING.

Maybe we could do it like this:
In nvme fc/rdma
- set state to RESET_PREPARE, queue reset_work/err_work
- clear/shutdown works, set state to RECONNECTING


Should be fine.


In nvme pci
- set state to RESET_PREPARE, queue reset_work
- clear/shutdown works, set state to RESETTING
- initialization, set state to LIVE


Given that we split reset state and we have a clear symmetry between
the transports, do we want to maybe come up with a unique state that is
coherent across all transports?

Maybe we rename them to NVME_CTRL_SHUTTING_DOWN and
NVME_CTRL_ESTABLISHING? I'm open for better names..


I'm leaning toward this latter suggestion - we need to define the states 
and the actions they take. It seems to me that RESETTING became the 
"init controller" part in Jianchao's model. So maybe it's not the 
shutting down that needs a new state, but rather the REINIT part.


-- james
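
A self-contained model of the two flows quoted above, with every helper reduced to a print (all names hypothetical; only the points where the state changes mirror the proposal):

#include <stdio.h>

enum state { LIVE, RESET_PREPARE, RESETTING, RECONNECTING };

static enum state st;

static void set_state(enum state s, const char *why)
{
        st = s;
        printf("state -> %d (%s)\n", s, why);
}

/* nvme-fc/rdma: err_work/reset_work queued with state == RESET_PREPARE */
static void fabrics_reset_work(void)
{
        /* teardown: quiesce queues, cancel I/O, drop the association */
        set_state(RECONNECTING, "fabrics: teardown done, retrying connect");
        /* reconnect loop runs here; on success: */
        set_state(LIVE, "fabrics: association re-established");
}

/* nvme-pci: reset_work queued with state == RESET_PREPARE */
static void pci_reset_work(void)
{
        /* teardown: disable the pci function, reap outstanding requests */
        set_state(RESETTING, "pci: teardown done, reinitializing");
        /* controller re-init runs here; on success: */
        set_state(LIVE, "pci: controller reinitialized");
}

int main(void)
{
        set_state(RESET_PREPARE, "reset requested");
        fabrics_reset_work();
        set_state(RESET_PREPARE, "reset requested");
        pci_reset_work();
        return 0;
}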



Re: [PATCH v2] nvme-fc: don't require user to enter host_traddr

2017-12-01 Thread James Smart

On 12/1/2017 12:34 AM, Johannes Thumshirn wrote:

James Smart <james.sm...@broadcom.com> writes:


On 11/30/2017 7:12 AM, Johannes Thumshirn wrote:

One major usability difference between NVMf RDMA and FC is resolving
the default host transport address in RDMA. This is perfectly doable
in FC as well, as we already have all possible lport <-> rport
combinations pre-populated so we can pick the first lport that has a
connection to our desired rport per default or optionally use the user
supplied lport if we have one.

Signed-off-by: Johannes Thumshirn <jthumsh...@suse.de>
Cc: James Smart <james.sm...@broadcom.com>

This is unnecessary and can create weird configurations. It assumes
connections are manually created.

a) connections can (and will) be manually created and for this the users
have to know the topology or connection establishment will fail b) there
is no need that the connections are manually created. Sagi did post an
RFC systemd service which calls nvme connect-all and this is what should
be done regardless if we're running on FC-NVME or NVMe over RDMA any new
transport that may come in the future. What I want is a consistent user
experience within NVMe, as I am the one that has to answer a
documentation team's inquiries on how to configure NVMf, support QA in
testing and fix end user bugs. The last thing I want to do is tell them
"well if you use rdma you have to use the nvme-connect.service, if you
use FC you have to have some magic udev rules and auto-connect scripts,
if you use $POSSIBLE_NEW_TRANSPORT you have to yada, yada".

I don't really like that we have to manually connect either but this
behaviour was first in NVMe so we should either stick to that or convert
RDMA over to using some sort of udev magic as well (which wont work as
far as I know, and it definitively won't work with TCP if and when it's
there).



Sagi's RFC is certainly fine to use with FC-NVME as well. But the 
mechanism does require the admin to have a priori knowledge and populate 
the discovery.conf file with the discovery controller information. It 
also requires updates on any dynamic change. All of that is unnecessary 
with the FC-NVME auto-connect scripts. Note: there's nothing wrong with 
using both mechanisms simultaneously with FC-NVME.


I certainly understand the sentiment of not doing something N ways. But, 
in this case, the functionality is so useful and valuable, that 
constraining the solution to the least common denominator is a bad 
idea.  You are also ignoring the user-base coming from SCSI on FC (which 
is much larger than nvmf on RDMA) that have completely different 
expectations on how the system finds targets and connects to storage. 
They will be confused and have lots of questions about why SCSI is so 
automatic yet (even on the same machines/adapters/fabric) they now must 
have all kinds of upfront knowledge and make manual changes for NVME. I 
get bombarded on this and I know you will too.   Although it's not nice 
to state transport-specific methods, I believe the message is still 
rather short and simple.


We seem to have gotten away from the desired patch into a completely 
different area.   In truth, adding the patch isn't a big deal, but to me 
it doesn't provide any value over the existing scripts. You could say it 
adds yet another one-off in FC-specific behavior that could complicate 
the documentation.


-- james
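
For reference, a standalone model of the default-lport pick described in the patch text, with hypothetical data (the real patch would walk the transport's pre-populated lport <-> rport pairings; the names below are made up):

#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct pairing {
        const char *lport;      /* local port traddr  */
        const char *rport;      /* remote port traddr */
};

/* hypothetical pre-populated lport <-> rport combinations */
static const struct pairing table[] = {
        { "nn-0x10000090fa000001:pn-0x10000090fa000002",
          "nn-0x20000090fa0000aa:pn-0x20000090fa0000ab" },
        { "nn-0x10000090fa000003:pn-0x10000090fa000004",
          "nn-0x20000090fa0000aa:pn-0x20000090fa0000ab" },
};

static const char *default_lport(const char *traddr)
{
        size_t i;

        /* pick the first lport already paired with the requested rport */
        for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
                if (!strcmp(table[i].rport, traddr))
                        return table[i].lport;
        return NULL;    /* no connectivity: user must supply host_traddr */
}

int main(void)
{
        printf("%s\n",
               default_lport("nn-0x20000090fa0000aa:pn-0x20000090fa0000ab"));
        return 0;
}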




Re: [PATCH v2] nvme-fc: don't require user to enter host_traddr

2017-11-30 Thread James Smart

On 11/30/2017 7:12 AM, Johannes Thumshirn wrote:

One major usability difference between NVMf RDMA and FC is resolving
the default host transport address in RDMA. This is perfectly doable
in FC as well, as we already have all possible lport <-> rport
combinations pre-populated so we can pick the first lport that has a
connection to our desired rport per default or optionally use the user
supplied lport if we have one.

Signed-off-by: Johannes Thumshirn <jthumsh...@suse.de>
Cc: James Smart <james.sm...@broadcom.com>


This is unnecessary and can create weird configurations. It assumes 
connections are manually created. The weirdness is: a) an admin has to 
know there are multiple paths in order to connect them and be 
intelligent on how to get the complex name strings and try to know what 
connections are already in existence;  b) if a user has a connectivity 
loss beyond dev_loss_tmo or ctlr_loss_tmo such that the controller is 
terminated, they must manually issue the connect commands again; and c) 
those un-knowledgeable users will unknowingly find that their multiple 
paths aren't connected and the system will gang up on the host adapter 
detected on the system with connectivity. All things unexpected and not 
what occurs with FC and SCSI and which will result in system support calls.


If the system uses the FC auto-connect scripts things will be properly 
connected across all paths connected to the subsystem - automatically, 
including resume after an extended connectivity loss - and the system 
will behave just like FC does with SCSI.


I see no reason to add this patch.  Please move away from manual 
configuration.


-- james



Re: [PATCH] scsi: lpfc: fix kzalloc-simple.cocci warnings

2017-11-03 Thread James Smart

On 10/11/2017 12:42 PM, Vasyl Gomonovych wrote:

drivers/scsi/lpfc/lpfc_debugfs.c:5460:22-29: WARNING: kzalloc should be used for
phba->nvmeio_trc, instead of kmalloc/memset
drivers/scsi/lpfc/lpfc_debugfs.c:2230:20-27: WARNING: kzalloc should be used for
phba->nvmeio_trc, instead of kmalloc/memset

  Use kzalloc rather than kmalloc followed by memset with 0

Generated by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci

Signed-off-by: Vasyl Gomonovych <gomonov...@gmail.com>
---
  drivers/scsi/lpfc/lpfc_debugfs.c | 9 ++---
  1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c
index d50c481..2bf5ad3 100644
--- a/drivers/scsi/lpfc/lpfc_debugfs.c
+++ b/drivers/scsi/lpfc/lpfc_debugfs.c
@@ -2227,7 +2227,7 @@
kfree(phba->nvmeio_trc);
  
  	/* Allocate new trace buffer and initialize */

-   phba->nvmeio_trc = kmalloc((sizeof(struct lpfc_debugfs_nvmeio_trc) *
+   phba->nvmeio_trc = kzalloc((sizeof(struct lpfc_debugfs_nvmeio_trc) *
sz), GFP_KERNEL);
if (!phba->nvmeio_trc) {
lpfc_printf_log(phba, KERN_ERR, LOG_INIT,
@@ -2235,8 +2235,6 @@
"nvmeio_trc buffer\n");
return -ENOMEM;
}
-   memset(phba->nvmeio_trc, 0,
-  (sizeof(struct lpfc_debugfs_nvmeio_trc) * sz));
atomic_set(&phba->nvmeio_trc_cnt, 0);
phba->nvmeio_trc_on = 0;
phba->nvmeio_trc_output_idx = 0;
@@ -5457,7 +5455,7 @@ static int lpfc_idiag_cmd_get(const char __user *buf, size_t nbytes,
phba->nvmeio_trc_size = lpfc_debugfs_max_nvmeio_trc;
  
  			/* Allocate trace buffer and initialize */

-   phba->nvmeio_trc = kmalloc(
+   phba->nvmeio_trc = kzalloc(
(sizeof(struct lpfc_debugfs_nvmeio_trc) *
phba->nvmeio_trc_size), GFP_KERNEL);
  
@@ -5467,9 +5465,6 @@ static int lpfc_idiag_cmd_get(const char __user *buf, size_t nbytes,

"nvmeio_trc buffer\n");
goto nvmeio_off;
}
-   memset(phba->nvmeio_trc, 0,
-  (sizeof(struct lpfc_debugfs_nvmeio_trc) *
-  phba->nvmeio_trc_size));
phba->nvmeio_trc_on = 1;
phba->nvmeio_trc_output_idx = 0;
    phba->nvmeio_trc = NULL;


looks good.


Signed-off-by:  James Smart <james.sm...@broadcom.com>
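
For reference, the shape of the coccinelle rewrite on a hypothetical structure (not the lpfc code): kzalloc() folds the memset into the allocation, and for arrays kcalloc() would additionally check the size multiplication for overflow:

#include <linux/slab.h>
#include <linux/types.h>

struct trc_entry {
        u32 tag;
        u64 ts;
};

static struct trc_entry *alloc_trace(unsigned int sz)
{
        struct trc_entry *buf;

        /* before: kmalloc(sz * sizeof(*buf), ...) + memset(..., 0, ...) */
        buf = kzalloc(sz * sizeof(*buf), GFP_KERNEL);

        /* for an array, kcalloc(sz, sizeof(*buf), GFP_KERNEL) would also
         * guard the multiplication against overflow */
        return buf;     /* NULL on failure, zeroed memory otherwise */
}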


Re: [PATCH 3/6] scsi: lpfc: Cocci spatch "pool_zalloc-simple"

2017-09-22 Thread James Smart

On 9/20/2017 11:15 PM, Thomas Meyer wrote:

Use *_pool_zalloc rather than *_pool_alloc followed by memset with 0.
Found by coccinelle spatch "api/alloc/pool_zalloc-simple.cocci"

Signed-off-by: Thomas Meyer <tho...@m3y3r.de>
---




Looks good. Thanks.

Signed-off-by: James Smart <james.sm...@broadcom.com>


Re: [PATCH] scsi: lpfc: remove redundant null check on eqe

2017-09-14 Thread James Smart

On 9/8/2017 1:02 AM, Colin King wrote:

From: Colin Ian King <colin.k...@canonical.com>

The pointer eqe is always non-null inside the while loop, so the
check to see if eqe is NULL is redundant and hence can be removed.

Detected by CoverityScan CID#1248693 ("Logically Dead Code")

Signed-off-by: Colin Ian King <colin.k...@canonical.com>



Yep. thank you

Signed-off-by: James Smart <james.sm...@broadcom.com>
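
Reduced to a standalone example, the dead check looks like this (hypothetical fetch loop, not the lpfc source); the while condition has already rejected NULL by the time the body runs:

#include <stdio.h>

static int queue[3] = { 1, 2, 3 };
static int pos;

static int *fetch_next(void)
{
        return pos < 3 ? &queue[pos++] : NULL;
}

int main(void)
{
        int *eqe;

        while ((eqe = fetch_next()) != NULL) {
                if (!eqe)       /* logically dead: the loop condition */
                        continue; /* above already rejected NULL      */
                printf("%d\n", *eqe);
        }
        return 0;
}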



Re: [PATCH v2] scsi: lpfc: avoid false positive gcc-8 warning

2017-08-24 Thread James Smart

On 8/24/2017 4:01 AM, Arnd Bergmann wrote:


I have also come up with a different workaround of my own
(sorry for the broken formatting here) and tested it successfully
over night. I have definitely spent more time on it than it was
worth now. Let me know if you prefer that version over yours,
then I'll submit that as a proper patch with your Ack.

Signed-off-by: Arnd Bergmann <a...@arndb.de>

--- a/drivers/scsi/lpfc/lpfc_debugfs.h


I'm ok with either solution. I prefer less change but this is a trivial 
thing.


Signed-off-by: James Smart <james.sm...@broadcom.com>

-- james



[PATCH v2] scsi: lpfc: avoid false positive gcc-8 warning

2017-08-23 Thread James Smart
Arnd Bergmann, testing gcc-8, encountered the following:

> This is an interesting regression with gcc-8, showing a harmless
> warning for correct code:
>
>In file included from include/linux/kernel.h:13:0,
>...
>from drivers/scsi/lpfc/lpfc_debugfs.c:23:
>include/linux/printk.h:301:2: error: 'eq' may be used
>uninitialized in this function [-Werror=maybe-uninitialized]
>   printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
>   ^~
>In file included from drivers/scsi/lpfc/lpfc_debugfs.c:58:0:
>drivers/scsi/lpfc/lpfc_debugfs.h:451:31: note: 'eq' was
>declared here

The code is fine: a for loop which, if there's at least 1 iteration,
will assign eq a value, followed by an if test that checks for no
iterations and assigns eq a default value. But the checker doesn't
see the relationship between the two, so it assumes eq may not have a
value.

I believe simply initializing with NULL will solve the issue.

Signed-off-by: James Smart <james.sm...@broadcom.com>
---
 drivers/scsi/lpfc/lpfc_debugfs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_debugfs.h b/drivers/scsi/lpfc/lpfc_debugfs.h
index 7b7d314af0e0..32e86d7813a0 100644
--- a/drivers/scsi/lpfc/lpfc_debugfs.h
+++ b/drivers/scsi/lpfc/lpfc_debugfs.h
@@ -448,7 +448,7 @@ lpfc_debug_dump_wq(struct lpfc_hba *phba, int qtype, int wqidx)
 static inline void
 lpfc_debug_dump_cq(struct lpfc_hba *phba, int qtype, int wqidx)
 {
-   struct lpfc_queue *wq, *cq, *eq;
+   struct lpfc_queue *wq, *cq, *eq = NULL;
char *qtypestr;
int eqidx;
 
-- 
2.13.1
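
A standalone reduction of the pattern described above (hypothetical code, not the lpfc source, and per the discussion a given gcc-8 build may or may not reproduce the warning on a reduction this small): eq is always assigned before use, either in the loop or in the fallback branch, but the analysis cannot connect i == max with "no break happened".

/* compile with: gcc-8 -O2 -Wmaybe-uninitialized reduce.c */
#include <stdio.h>

int max;        /* runtime value: the compiler can't assume it is > 0 */

int main(void)
{
        int vals[4] = { 10, 20, 30, 40 };
        int *eq;        /* = NULL would silence the warning */
        int i;

        for (i = 0; i < max; i++) {
                eq = &vals[i];
                if (*eq == 30)
                        break;
        }
        if (i == max)           /* zero iterations, or no match: fallback */
                eq = &vals[0];

        printf("%d\n", *eq);    /* always initialized at this point */
        return 0;
}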



Re: [PATCH] scsi: lpfc: avoid false-positive gcc-8 warning

2017-08-23 Thread James Smart

On 8/23/2017 8:01 AM, Arnd Bergmann wrote:

This is an interesting regression with gcc-8, showing a harmless
warning for correct code:

In file included from include/linux/kernel.h:13:0,
  ...
  from drivers/scsi/lpfc/lpfc_debugfs.c:23:
include/linux/printk.h:301:2: error: 'eq' may be used uninitialized in this 
function [-Werror=maybe-uninitialized]
   printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
   ^~
In file included from drivers/scsi/lpfc/lpfc_debugfs.c:58:0:
drivers/scsi/lpfc/lpfc_debugfs.h:451:31: note: 'eq' was declared here

I tried to come up with a reduced test case for gcc here
a few times, but every time ended up with code that is actually
wrong with older gcc versions missing the bug and gcc-8 finding
it. As this is the only false-positive -Wmaybe-uninitialized
warning I got with gcc-8 randconfig builds, I'd suggest we
work around it.

Making the index variable 'unsigned' is enough to shut up
the warning, as gcc can then see that comparing eqidx to
phba->io_channel_irqs is fine here.

Signed-off-by: Arnd Bergmann <a...@arndb.de>
---
  drivers/scsi/lpfc/lpfc_debugfs.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)



looks good. Thanks

Signed-off-by: James Smart <james.sm...@broadcom.com>

-- james


Re: [PATCH] scsi: lpfc: remove useless code in lpfc_sli4_bsg_link_diag_test

2017-08-23 Thread James Smart

On 8/22/2017 1:53 PM, Gustavo A. R. Silva wrote:

Remove variable assignments. The value stored in local variable _rc_ is
overwritten at line 2448: rc = lpfc_sli4_bsg_set_link_diag_state(phba, 0);
before it can be used.

Addresses-Coverity-ID: 1226935
Signed-off-by: Gustavo A. R. Silva <gust...@embeddedor.com>
---
This issue was detected by Coverity and it was tested by compilation only.
Notice that this code has been there since 2011.

  drivers/scsi/lpfc/lpfc_bsg.c | 9 +++--
  1 file changed, 3 insertions(+), 6 deletions(-)




looks good. Thanks

Signed-off-by: James Smart <james.sm...@broadcom.com>

-- james
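
Reduced to a standalone sketch, the dead-store pattern looks like this (hypothetical helper, not the bsg code); only the stored value is dead, so a call kept for its side effects would stay and just the assignment would go:

#include <stdio.h>

static int set_diag_state(int on)
{
        return on ? 0 : -1;
}

int main(void)
{
        int rc;

        rc = set_diag_state(1); /* dead store: rc is never read ...  */
        rc = set_diag_state(0); /* ... before this overwrites it     */
        printf("rc=%d\n", rc);
        return 0;
}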



Re: [PATCH] scsi: lpfc: remove useless code in lpfc_sli4_bsg_link_diag_test

2017-08-23 Thread James Smart

On 8/22/2017 1:53 PM, Gustavo A. R. Silva wrote:

Remove variable assignments. The value stored in local variable _rc_ is
overwritten at line 2448:rc = lpfc_sli4_bsg_set_link_diag_state(phba, 0);
before it can be used.

Addresses-Coverity-ID: 1226935
Signed-off-by: Gustavo A. R. Silva 
---
This issue was detected by Coverity and it was tested by compilation only.
Notice that this code has been there since 2011.

  drivers/scsi/lpfc/lpfc_bsg.c | 9 +++--
  1 file changed, 3 insertions(+), 6 deletions(-)




looks good. Thanks

Signed-off-by: James Smart 

-- james



  1   2   3   4   >