Re: iser-target: Add iSCSI Extensions for RDMA (iSER) target driver
Hi Dan, (Adding Sagi CC') On Tue, 2016-01-05 at 00:07 +0300, Dan Carpenter wrote: > Hello Nicholas Bellinger, > > The patch b8d26b3be8b3: "iser-target: Add iSCSI Extensions for RDMA > (iSER) target driver" from Mar 7, 2013, leads to the following static > checker warning: > > drivers/infiniband/ulp/isert/ib_isert.c:423 isert_device_get() > error: passing non negative 1 to ERR_PTR > > drivers/infiniband/ulp/isert/ib_isert.c >417 >418 device->ib_device = cma_id->device; >419 ret = isert_create_device_ib_res(device); >420 if (ret) { >421 kfree(device); >422 mutex_unlock(_list_mutex); >423 return ERR_PTR(ret); > > The warning here is because isert_create_device_ib_res() returns either > a negative error code, zero or one. The documentation is not clear what > that means. AHAHAHAHAHAHAHAH. I joke. There is no documentation. > > Anyway, it's definitely a bug and it leads to a NULL dereference in the > caller. > >424 } >425 >426 device->refcount++; >427 list_add_tail(>dev_node, _list); >428 isert_info("Created a new iser device %p refcount %d\n", >429 device, device->refcount); >430 mutex_unlock(_list_mutex); >431 >432 return device; >433 } > Looking at the current code, AFAICT isert_alloc_comps() -> ib_req_notify_cq() is the one case where this can happen, right..? Here's a quick patch that I'm applying to target-pending/for-next, that always returns negative upon isert_create_device_ib_res() failure. Thanks Dan! diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert. index 91eb22c..8954e12 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -356,7 +356,7 @@ isert_create_device_ib_res(struct isert_device *device) dev_attr = >dev_attr; ret = isert_query_device(device->ib_device, dev_attr); if (ret) - return ret; + goto out; /* asign function handlers */ if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS && @@ -372,7 +372,7 @@ isert_create_device_ib_res(struct isert_device *device) ret = isert_alloc_comps(device, dev_attr); if (ret) - return ret; + goto out; device->pd = ib_alloc_pd(device->ib_device); if (IS_ERR(device->pd)) { @@ -390,6 +390,9 @@ isert_create_device_ib_res(struct isert_device *device) out_cq: isert_free_comps(device); +out: + if (ret > 0) + ret = -EINVAL; return ret; } -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 00/10] iSER support for remote invalidate
On Thu, 2015-12-24 at 12:22 +0200, Sagi Grimberg wrote: > >> Applied to target-pending/for-next as v4.5-rc1 material, along with > >> Reviewed-by tags from HCH. > > > > So this is both in your and Dougs now it seems. Given the non-trivial > > merge with the other RDMA updates I'd suggest to drop it from the > > target tree as Doug already sorted out the merge. > > Yea, this conflicts with the CQ API stuff. Doug and I sorting it out. > FYI, the only conflict as reported in linux-next last week is with nfsd tree and "IB: merge struct ib_device_attr into struct ib_device" here: http://marc.info/?l=linux-next=145155049101826=2 It looks like there is ongoing discussion this morning wrt rdma + nfsd trees.. Do you still want this series dropped from target-pending/for-next..? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] InfiniBand-iSER: Refactoring for two function implementations
On Sun, 2015-12-27 at 13:36 +0100, SF Markus Elfring wrote: > From: Markus Elfring> Date: Sun, 27 Dec 2015 13:12:10 +0100 > Subject: [PATCH 0/2] InfiniBand-iSER: Refactoring for two function > implementations > > I suggest to return directly instead of using the jump label "err" > in two functions (which are working without clean-up there). > > Markus Elfring (2): > One jump label less in iser_reg_sig_mr() > One jump label less in isert_reg_sig_mr() > > drivers/infiniband/ulp/iser/iser_memory.c | 5 ++--- > drivers/infiniband/ulp/isert/ib_isert.c | 7 +++ > 2 files changed, 5 insertions(+), 7 deletions(-) > Doug, are you going to pick these two minor patches up, or shall I..? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: iser-target: Add iSCSI Extensions for RDMA (iSER) target driver
That silences the warning, thanks! regards, dan carpenter -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 00/10] iSER support for remote invalidate
Hi Sagi, Jenny & Co, On Wed, 2015-12-09 at 14:11 +0200, Sagi Grimberg wrote: > Changes from v1: > - Fixed change-log typos > > Changes from v0: > - Rebased on top of 4.4-rc2 > - Removed iser_hello messages from the protocol header > - Avoided from further breaking the non-existent bidi support > - Fixed initiator remote invalidate support exposure only > for fastreg && iser_always_register > - Removed patch 2/10 as it's still under testing. > - Minor line-spacing nitpicks CR comments fixes. > - Added a FIXME comment to declare our debt to Or and Jason > on the awkward device->mr->rkey deref. > - Piggybacked another patch that reduces some iser code that > is now available by the ib_core (ib_sg_to_pages). > > Thanks a lot to all the reviewers! > > Code is available at: > g...@github.com:sagigrimberg/linux.git iser-remote-inv.2 > > Jenny Derzhavetz (5): > IB/iser: Don't register memory for all immediate data writes > IB/iser: set intuitive values for mr_valid > iser-target: Declare correct flags when accepting a connection > iser-target: Support the remote invalidation exception > IB/iser: Support the remote invalidation exception > > Roi Dayan (1): > IB/iser: Fix module init not cleaning up on error flow > > Sagi Grimberg (4): > IB/iser: Reuse ib_sg_to_pages > iser: Have initiator and target to share protocol structures and > definitions > iser-target: Remove unused file iser_proto.h > IB/iser: Increment the rkey when registering and not when invalidating > > drivers/infiniband/ulp/iser/iscsi_iser.c | 9 +- > drivers/infiniband/ulp/iser/iscsi_iser.h | 54 +++--- > drivers/infiniband/ulp/iser/iser_initiator.c | 70 +++-- > drivers/infiniband/ulp/iser/iser_memory.c| 150 > +-- > drivers/infiniband/ulp/iser/iser_verbs.c | 30 -- > drivers/infiniband/ulp/isert/ib_isert.c | 73 + > drivers/infiniband/ulp/isert/ib_isert.h | 40 ++- > drivers/infiniband/ulp/isert/isert_proto.h | 47 - > include/scsi/iser.h | 78 ++ > 9 files changed, 322 insertions(+), 229 deletions(-) > delete mode 100644 drivers/infiniband/ulp/isert/isert_proto.h > create mode 100644 include/scsi/iser.h > Applied to target-pending/for-next as v4.5-rc1 material, along with Reviewed-by tags from HCH. Thanks folks. --nab -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V5 0/5] iSER support for iWARP
Hi Steve Co, On Sun, 2015-07-05 at 12:44 -0500, Steve Wise wrote: The following series implements support for iWARP transports in the iSER initiator and target. This is based on Doug's k.o/for-4.2 branch. I've tested this on cxgb4 and mlx4 hardware. Changes since V4: iser: fixedcompiler warning isert: back to setting REMOTE_WRITE only for iWARP devices Changes since V3: Fixed commit messages based on feedback. iser: adjust max_sectors isert: split into 2 patches isert: always set REMOTE_WRITE for dma mrs Changes since V2: The transport independent work is removed from this series and will be submitted in a subsequent series. This V3 series now enables iWARP using existing core services. Changes since V1: Introduce and use transport-independent RDMA core services for allocating DMA MRs and computing fast register access flags. Correctly set the device max_sge_rd capability in several rdma device drivers. isert: use device capability max_sge_rd for the read sge depth. isert: change max_sge to max_write_sge in struct isert_conn. --- Sagi Grimberg (1): mlx4, mlx5, mthca: Expose max_sge_rd correctly Steve Wise (4): RDMA/isert: Limit read depth based on the device max_sge_rd capability RDMA/isert: Set REMOTE_WRITE on DMA MRs to support iWARP devices RDMA/iser: Limit sg tablesize and max_sectors to device fastreg max depth ipath,qib: Expose max_sge_rd correctly drivers/infiniband/hw/ipath/ipath_verbs.c|1 drivers/infiniband/hw/mlx4/main.c|1 drivers/infiniband/hw/mlx5/main.c|1 drivers/infiniband/hw/mthca/mthca_provider.c |1 drivers/infiniband/hw/qib/qib_verbs.c|1 drivers/infiniband/ulp/iser/iscsi_iser.c |9 drivers/infiniband/ulp/isert/ib_isert.c | 53 ++ drivers/infiniband/ulp/isert/ib_isert.h |3 + 8 files changed, 60 insertions(+), 10 deletions(-) Very excited to see iWARP support for iser-initiator + iser-target. ;) No objections to taking the iser-target changes through Doug's infiniband.git. For the target pieces: Acked-by: Nicholas Bellinger n...@linux-iscsi.org Also Doug, since this series is sufficiently small enough, and allows a class of hardware that people are expecting to function with iser to 'just work', I'd recommend to go ahead and push this series for v4.2-rc code. Thank you, --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] iser-target: Fix error path in isert_create_pi_ctx()
On Fri, 2015-05-29 at 23:12 -0700, Roland Dreier wrote: From: Roland Dreier rol...@purestorage.com We don't assign pi_ctx to desc-pi_ctx until we're certain to succeed in the function. That means the cleanup path should use the local pi_ctx variable, not desc-pi_ctx. This was detected by Coverity (CID 1260062). Signed-off-by: Roland Dreier rol...@purestorage.com --- drivers/infiniband/ulp/isert/ib_isert.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 327529ee85eb..3f40319a55da 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -547,11 +547,11 @@ isert_create_pi_ctx(struct fast_reg_descriptor *desc, return 0; err_prot_mr: - ib_dereg_mr(desc-pi_ctx-prot_mr); + ib_dereg_mr(pi_ctx-prot_mr); err_prot_frpl: - ib_free_fast_reg_page_list(desc-pi_ctx-prot_frpl); + ib_free_fast_reg_page_list(pi_ctx-prot_frpl); err_pi_ctx: - kfree(desc-pi_ctx); + kfree(pi_ctx); return ret; } Applied to target-pending/master. Thanks Roland! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] iser-target: Handle errors from isert_put_datain and isert_get_dataout
On Sat, 2015-03-07 at 04:16 +0200, Sagi Grimberg wrote: On 3/6/2015 7:56 PM, Chris Moore wrote: isert_put_datain() always returns 1 and isert_get_dataout() always returns 0, even if ib_post_send() fails. They should return an error in this case so the caller can handle it. Also, in the case of an ib_post_send() failure, user isert_err instead of isert_warn. With these changes, these two functions handle errors from ib_post_send() in the same way as other functions within ib_isert.c Hi Chris, This is indeed needed, but I'm afraid this is not complete given the rc is completely ignored by the callers (see lio_queue_data_in/lio_write_pending). So lio_write_pending() is propagating up the return back to transport_generic_new_cmd(). When the return is -EAGAIN or -ENOMEM, it triggers transport_handle_queue_full() to retry -write_pending() from se_device-qf_work_queue context. It's lio_queue_data_in() + lio_queue_status() that aren't propagating up failures to trigger queue_full in target_complete_ok_work(). Looking at this code again for traditional iscsi-target, I don't see a reason why iscsit_add_cmd_to_response_queue() failure should not be triggering queue_full logic to kick in.. On the iser-target side, is it OK for isert_put_datain() + isert_put_response() to be re-invoked from transport_complete_qf() context after ib_post_send() failure..? --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] iser-target: Remove duplicate function names
On Fri, 2015-02-06 at 01:09 +0100, Rasmus Villemoes wrote: The macro isert_dbg already ensures that __func__ is part of the output, so there's no reason to duplicate the function name in the format string itself. Signed-off-by: Rasmus Villemoes li...@rasmusvillemoes.dk --- drivers/infiniband/ulp/isert/ib_isert.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index dafb3c531f96..20097de9ac45 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -949,7 +949,7 @@ isert_post_recv(struct isert_conn *isert_conn, u32 count) isert_err(ib_post_recv() failed with ret: %d\n, ret); isert_conn-post_recv_buf_count -= count; } else { - isert_dbg(isert_post_recv(): Posted %d RX buffers\n, count); + isert_dbg(Posted %d RX buffers\n, count); isert_conn-conn_rx_desc_head = rx_head; } return ret; @@ -1709,8 +1709,7 @@ isert_put_cmd(struct isert_cmd *isert_cmd, bool comp_err) * associated cmd-se_cmd needs to be released. */ if (cmd-se_cmd.se_tfo != NULL) { - isert_dbg(Calling transport_generic_free_cmd from - isert_put_cmd for 0x%02x\n, + isert_dbg(Calling transport_generic_free_cmd for 0x%02x\n, cmd-iscsi_opcode); transport_generic_free_cmd(cmd-se_cmd, 0); break; @@ -3136,7 +3135,7 @@ accept_wait: spin_lock_bh(np-np_thread_lock); if (np-np_thread_state = ISCSI_NP_THREAD_RESET) { spin_unlock_bh(np-np_thread_lock); - isert_dbg(np_thread_state %d for isert_accept_np\n, + isert_dbg(np_thread_state %d\n, np-np_thread_state); /** * No point in stalling here when np_thread Applied to target-pending/for-next. Thanks Rasmus! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] ib_srpt: wait_for_completion_timeout does not return negativ status
Hi Nicholas, On Fri, 2015-01-16 at 12:20 +0100, Nicholas Mc Guire wrote: Signed-off-by: Nicholas Mc Guire der.h...@hofr.at --- Patch is against 3.19.0-rc3 -next-20150109 Patch was compiletested only with x86_64_defconfig + CONFIG_TARGET_CORE=m, CONFIG_INFINIBAND=m, CONFIG_INFINIBAND_SRPT=m drivers/infiniband/ulp/srpt/ib_srpt.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c index eb694dd..4e58c76 100644 --- a/drivers/infiniband/ulp/srpt/ib_srpt.c +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c @@ -3533,7 +3533,7 @@ static void srpt_close_session(struct se_session *se_sess) spin_unlock_irq(sdev-spinlock); res = wait_for_completion_timeout(release_done, 60 * HZ); - WARN_ON(res = 0); + WARN_ON(res == 0); } /** Notice that 'res' here is still incorrectly defined as an 'int', instead of 'unsigned long'. I've updated your patch to use the proper type for res, and added a short commit log to describe what it's doing. Merged into target-pending/for-next. Thanks! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: ib_isert RDMA_CM_EVENT_DEVICE_REMOVAL events
On Tue, 2014-10-28 at 18:34 +0200, Sagi Grimberg wrote: On 10/24/2014 9:02 AM, Nicholas A. Bellinger wrote: SNIP AFAICT, it looks like the assumption in isert_disconnected_handler() to dereference rdma_cm_id-context as isert_conn (in all cases) is wrong, and the above RDMA_CM_EVENT_DEVICE_REMOVAL has iscsi_np stored in -context from the original rdma_create_id() at isert_setup_np() time. So, is there a way to tell the difference how rdma_cm_id-context should be dereferenced when DEVICE_REMOVAL occurs..? Does DEVICE_REMOVAL occur on just the listener rdma_cm_id, or on all accepted children as well..? Anything else to consider wrt to other CMA events being kicked off into isert_disconnected_handler()..? Hey Nic, Terribly sorry for the late response, I'm juggling between 5 different projects... This is indeed a bug, and I indeed noticed it. This will happen if the network portal cm id listens on a specific address (e.g. not any - 0.0.0.0), in this case the cm id will acquire the relevant device (see rdma_bind_addr) - hence will sense DEVICE_REMOVAL events. And yes, all the accepted children will of course get the event as well. Notice that cma_remove_one sequence (that fires DEVICE_REMOVAL event to all the relevant cma ids) requires the cmd is owner to finish cleanup before the end of the callback because in the end of it, it will allow the device to remove. So I do plan to get disconnected_handler to the callback instead of the deferred work. I think this should make the bug you hit go away: iser-target: Handle DEVICE_REMOVAL event on network portal listener correctly In this case the cm_id-context is the isert_np, and the cm_id-qp is NULL, so use that to distinct the cases. Since we don't expect any other events on this cm_id we can just return -1 for explicit termination of the cm_id by the cma layer. Signed-off-by: Sagi Grimberg sa...@mellanox.com Just verified that this resolves the specific OOPesen. Queued up in target-pending/master, with a CC' to v3.10.y. Thanks Sagi! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
ib_isert RDMA_CM_EVENT_DEVICE_REMOVAL events
Hey Or Sagi, Quick CMA related question for you.. I've been hitting the following NULL pointer dereference during reboot using a v3.14.y based kernel with Sagi's latest ib_isert fixes in the stable-queue from v3.17. Note this system was not performing /etc/init.d/target stop during reboot to take down the configfs layout, and no actual iser logins or sessions had been previously established on iser enabled network portal in question: [info] Will now restart. [ 111.076328] kvm: exiting hardware virtualization [ 111.083670] sd 9:0:3:0: [sdi] Synchronizing SCSI cache [ 111.089825] sd 9:0:2:0: [sdh] Synchronizing SCSI cache [ 111.095924] sd 9:0:1:0: [sdg] Synchronizing SCSI cache [ 111.103375] sd 9:0:0:0: [sdf] Synchronizing SCSI cache [ 111.109707] sd 8:0:3:0: [sde] Synchronizing SCSI cache [ 111.116036] sd 8:0:2:0: [sdd] Synchronizing SCSI cache [ 111.122368] sd 8:0:1:0: [sdc] Synchronizing SCSI cache [ 111.128723] sd 8:0:0:0: [sdb] Synchronizing SCSI cache [ 111.134979] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 111.273440] isert_cma_handler: event 11 status 0 conn 880815896000 id 88101440d400 [ 111.282871] isert_disconnect_work(): [ 111.290808] BUG: unable to handle kernel NULL pointer dereference at (null) [ 111.299886] IP: [ (null)] (null) [ 111.305736] PGD 10186c6067 PUD 1016d84067 PMD 0 [ 111.311271] Oops: 0010 [#1] SMP [ 111.315169] Modules linked in: ib_isert ib_ipoib mlx4_ib rpcsec_gss_krb5 nfsv4 ip_tables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_nat nf_conntrack ip6table_filter ip6_tables ebtables x_tables iscsi_target_mod ib_srpt tcm_qla2xxx tcm_loop vhost_scsi vhost tcm_fc libfc target_core_file target_core_iblock target_core_pscsi target_core_mod 8021q garp stp mrp llc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd sunrpc loop x86_pkg_temp_thermal intel_powerclamp crct10dif_pclmul sb_edac crc32_pclmul ioatdma ghash_clmulni_intel lpc_ich edac_core mfd_core i2c_i801 ipmi_si processor thermal_sys button md_mod sg hid_generic isci usbhid mpt3sas ixgbe mlx4_core libsas raid_class hid igb scsi_transport_sas qla2xxx mdio i2c_algo_bit i2c_core scsi_transport_fc dca [ 111.398587] CPU: 6 PID: 138 Comm: kworker/6:1 Not tainted 3.14.13+ #6 [ 111.405902] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.01.0002.082220131453 08/22/2013 [ 111.417530] Workqueue: events isert_disconnect_work [ib_isert] [ 111.424254] task: 88101a9bcb60 ti: 8810152bc000 task.ti: 8810152bc000 [ 111.432762] RIP: 0010:[] [ (null)] (null) [ 111.441357] RSP: 0018:8810152bddb0 EFLAGS: 00010087 [ 111.447407] RAX: 8808158969e8 RBX: RCX: [ 111.455499] RDX: RSI: 0003 RDI: 8808158969e8 [ 111.463593] RBP: 880815896600 R08: R09: 074f [ 111.471685] R10: R11: R12: [ 111.479779] R13: R14: 0003 R15: 880815896be8 [ 111.487872] FS: () GS:88101f20() knlGS: [ 111.497061] CS: 0010 DS: ES: CR0: 80050033 [ 111.503598] CR2: CR3: 0010040ce000 CR4: 001407e0 [ 111.511691] Stack: [ 111.514046] 810f40ac 8810152bde08 0001152bddc8 88101f20f440 [ 111.522812] 8808158965f8 8808158965f0 0292 88101f216700 [ 111.531578] 0180 810f49ca 8808158965a8 [ 111.540344] Call Trace: [ 111.543195] [810f40ac] ? __wake_up_common+0x4c/0x80 [ 111.549836] [810f49ca] ? complete+0x3a/0x60 [ 111.555698] [810ccecf] ? process_one_work+0x16f/0x430 [ 111.562528] [810ce6d6] ? worker_thread+0x116/0x3d0 [ 111.569065] [810ce5c0] ? manage_workers.isra.21+0x2e0/0x2e0 [ 111.576482] [810d49bc] ? kthread+0xbc/0xe0 [ 111.582243] [810d4900] ? flush_kthread_worker+0x80/0x80 [ 111.589273] [8164d8cc] ? ret_from_fork+0x7c/0xb0 [ 111.595616] [810d4900] ? flush_kthread_worker+0x80/0x80 [ 111.602639] Code: Bad RIP value. [ 111.606631] RIP [ (null)] (null) [ 111.612576] RSP 8810152bddb0 [ 111.616583] CR2: [ 111.620400] ---[ end trace 8e386ea065bef2ce ]--- [ 111.634392] BUG: unable to handle kernel paging request at ffd8 [ 111.642470] IP: [810d4d67] kthread_data+0x7/0x10 [ 111.648806] PGD 1c0d067 PUD 1c0f067 PMD 0 [ 111.653761] Oops: [#2] SMP [ 111.657653] Modules linked in: ib_isert ib_ipoib mlx4_ib rpcsec_gss_krb5 nfsv4 ip_tables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_nat nf_conntrack ip6table_filter ip6_tables ebtables x_tables iscsi_target_mod ib_srpt tcm_qla2xxx tcm_loop vhost_scsi vhost tcm_fc libfc target_core_file target_core_iblock target_core_pscsi
Re: Can someone help me understand the reason for this code in ib_isert.c?
On Wed, 2014-10-22 at 12:02 +0300, Or Gerlitz wrote: On 10/22/2014 8:06 AM, Nicholas A. Bellinger wrote: On Tue, 2014-10-21 at 00:13 +0300, Or Gerlitz wrote: On Mon, Oct 20, 2014 at 6:29 PM, Chris Moore chris.mo...@emulex.com wrote: The following code is in isert_conn_setup_qp() in ib_isert.c: /* * FIXME: Use devattr.max_sge - 2 for max_send_sge as * work-around for RDMA_READ.. */ attr.cap.max_send_sge = device-dev_attr.max_sge - 2; It's not clear from the comment what this is a work-around for, and I wasn't able to figure it out from looking at logs. I believe this refers to some IBTA spec corner case which comes into play with the max_sges advertized by mlx4, Eli, can you shed some light (IBTA pointer) on that? is this the case (i.e dev_attr.max_sge isn't always achievable) with mlx5 too? It's for ConnectX-2 reporting max_sge=31 during development, while the largest max_send_sge for stable large block RDMA reads was (is..?) 29 SGEs. In trying to get isert working with ocrdma I ran into a problem where the code thought there was only 1 send SGE available when in fact the device has 3. Then I found the above work-around, which explained the problem. Nic, any reason for attr.cap.max_send_sge = 1 not to work OK? IIRC, pre fastreg code supported max_send_sge=1 by limiting the transfer size per RDMA read, and would issue subsequent RDMA read + offset from completion up to the total se_cmd-data_length. Not sure how this works with fastreg today. Sagi..? While on fastreg mode, for RDMA reads we use only one SGE, talking to Sagi he explained me that the problem with creating the QP with max_send_sge = 1 has to do with other flows where two or even three SGEs are needed, e.g when the iscsi/iser header (doesn't exist in RDMA ops) is taken from one spot of the memory and the data (sense) taken from another one? Nic, we need to see what is the minimum needed by the code (two?) and tweak that line to make sure it gets there as long as the device supports it, SB, makes sense? diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 0bea577..24abbb3 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -115,9 +115,10 @@ isert_conn_setup_qp(struct isert_conn *isert_conn, struct rdma_cm_id *cma_id, attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS; /* * FIXME: Use devattr.max_sge - 2 for max_send_sge as -* work-around for RDMA_READ.. +* work-around for RDMA_READ.. still need to make sure to +* have at least two SGEs for SCSI responses. */ - attr.cap.max_send_sge = device-dev_attr.max_sge - 2; + attr.cap.max_send_sge = max(2, device-dev_attr.max_sge - 2); isert_conn-max_sge = attr.cap.max_send_sge; attr.cap.max_recv_sge = 1; After a quick audit, the minimum max_send_sge=2 patch looks correct for the hardcoded tx_desc-num_sge cases in isert_put_login_tx(), isert_put_response(), isert_put_reject() and isert_put_text_rsp(). For the RDMA read path with non fast-reg, isert_build_rdma_wr() is still correctly limiting the SGEs per operation based upon the negotiated isert_conn-max_sge set above in isert_conn_setup_qp(). Chris, please confirm on ocrdma with Or's patch, and I'll include his patch into target-pending/queue for now, and move into master once you give a proper Tested-By. Thanks! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Can someone help me understand the reason for this code in ib_isert.c?
On Tue, 2014-10-21 at 00:13 +0300, Or Gerlitz wrote: On Mon, Oct 20, 2014 at 6:29 PM, Chris Moore chris.mo...@emulex.com wrote: The following code is in isert_conn_setup_qp() in ib_isert.c: /* * FIXME: Use devattr.max_sge - 2 for max_send_sge as * work-around for RDMA_READ.. */ attr.cap.max_send_sge = device-dev_attr.max_sge - 2; It's not clear from the comment what this is a work-around for, and I wasn't able to figure it out from looking at logs. I believe this refers to some IBTA spec corner case which comes into play with the max_sges advertized by mlx4, Eli, can you shed some light (IBTA pointer) on that? is this the case (i.e dev_attr.max_sge isn't always achievable) with mlx5 too? It's for ConnectX-2 reporting max_sge=31 during development, while the largest max_send_sge for stable large block RDMA reads was (is..?) 29 SGEs. In trying to get isert working with ocrdma I ran into a problem where the code thought there was only 1 send SGE available when in fact the device has 3. Then I found the above work-around, which explained the problem. Nic, any reason for attr.cap.max_send_sge = 1 not to work OK? IIRC, pre fastreg code supported max_send_sge=1 by limiting the transfer size per RDMA read, and would issue subsequent RDMA read + offset from completion up to the total se_cmd-data_length. Not sure how this works with fastreg today. Sagi..? -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/7] Signature feature update
Hi Sagi Co, On Wed, 2014-08-13 at 19:54 +0300, Sagi Grimberg wrote: Lately MKP posted some block data integrity updates over Linux-scsi (see https://lwn.net/Articles/606820/). These changes follow DIX draft 1.1 (see see https://oss.oracle.com/~mkp/docs/dix-1.1.pdf). The draft updates the execution parameters passed from SCSI midlayer to the LLDs with respect to data integrity feature. This set updates signature feature accordingly. Once Martin's patches will land mainline I'll also modify iSER to take relevant parameters from the scsi command structure. This set consists of: - Patches #1, #2, #3 are minor fixes - Patch #4 is a preparation patch - Patches #5, #6 are code centralization - Patch #7 is an API update (modifies ib_core, mlx5, iser, isert) Roland, there are some patches in iser-target code, but I'm pretty sure that they can go from your tree. Right Nic? Yep, no target dependencies here, so they should go through infiniband.git. Roland, are you planning to pick these up for v3.18..? --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 3/3] TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire
On Wed, 2014-06-11 at 10:24 +0300, Sagi Grimberg wrote: On 6/11/2014 12:17 AM, Quinn Tran wrote: SNIP QT Instead of using existing value within cmd-data_length, can we calculated data_length based on secstors blocksize. cmd-data_length = sectors * dev-dev_attrib.block_size; We can do that I suppose... Although it seems weird that the core discards the transport byte-count... Just wandering if we should recalc on the DIF case only or always. Yeah, same concern here wrt to discarding the transport length. This effectively makes the residual handling further down in sbc_parse_cdb() - target_cmd_size_check() always match size == cmd-data_length, because the latter as been recalculated by the former. Honestly, I don't know if this is a problem for normal READ + WRITE with prot=1 as I've never seen size != cmd-data_length for I/O related commands, but it does seem potentially dangerous. From the QLogic perfective, the cmd-data_length is pull directly from the wire (i.e. FCP header) when cmd is received. In essence, it's whatever the Initiator is set it to. We does not have any indicator to spot DIF vs none-DIF cmd when 1st received, unless the code do a peek. Same for all transports I assume... So just to confirm Quinn, the Qlogic the initiator includes the PI bytes into the FCP header data_length for the target-side *_PASS and DOUT_STRIP / DIN_INSERT, that is currently passed into qla_tgt_ops-handle_cmd(), right..? If that is the case, Sagi's v1 to cmd-data_length -= cmd-prot_length seems like it would still do right thing for *_PASS and DOUT_STRIP / DIN_INSERT, given that cmd-prot_length is calculated in sbc_check_prot() based upon dev-prot_length * sectors.. With that said, the cmd-data_length does not guarantee to contain both data length protection length when cmd is submit to TCM/target_submit_cmd(). In Dif-Insert mode, data_length will only contain the actual data(no Dif). No, in the DOUT_INSERT/DIN_STRIP case, protect bits are off and the core will take the data length as is. This case is covered. nod It's best that the SBC code re-calculate the actual data length and dif data length based on the number of sectors derived from the cmd. Nic, what's your take on this? Hard to say. Discarding the transport length in v2 doesn't seem like a good idea, but subtracting from cmd-prot_length in v1 is using the sector count from the CDB anyways, so it's essentially the same tradeoff of recalculating the transport's cmd-data_length from values within the CDB w/ prot=1. MKP, WDYT..? --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 0/3] Include protection information in transport header
On Wed, 2014-06-11 at 12:09 +0300, Sagi Grimberg wrote: At the SCSI transport level, there is no distinction between user data and protection information. Thus, iscsi header field expected data transfer length should include protection information. Patch #1 introduces scsi helper scsi_transfer_length which computes wire transfer byte count. Patch #2 modifies iscsi initiator to set correct wire transfer length in iscsi header data_length field (and modifies iser accordingly). Patch #3 modifies target core to re-calculate the pure data length in case of PI presence over the wire (and modifies loopback transport to align with other transports). Changes from v1: - scsi_cmnd: Rewrite scsi_transfer_length as MKP suggested - Target/sbc: re-calculate the data_length in case PI exists on the wire (instead od deacrease data -= prot) Changes from v0: - Introduce scsi helpers to compute correct transfer length in the presence of protection information. - Modify iscsi to set correct transfer length using scsi helpers - Modify loopback transport to set correct transfer length using scsi helpers Sagi Grimberg (3): scsi_cmnd: Introduce scsi_transfer_length helper libiscsi, iser: Adjust data_length to include protection information TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire drivers/infiniband/ulp/iser/iser_initiator.c | 34 +++-- drivers/scsi/libiscsi.c | 18 +++--- drivers/target/loopback/tcm_loop.c | 15 +-- drivers/target/target_core_sbc.c | 15 ++- include/scsi/scsi_cmnd.h | 17 + 5 files changed, 61 insertions(+), 38 deletions(-) Ok, I've applied these to for-next, but let's see what MKP recommends for patch #3 wrt to recalculating cmd-data_length. Thanks Sagi! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 3/3] TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire
On Wed, 2014-06-11 at 22:32 +, Quinn Tran wrote: On 6/11/14 2:30 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: On Wed, 2014-06-11 at 10:24 +0300, Sagi Grimberg wrote: On 6/11/2014 12:17 AM, Quinn Tran wrote: SNIP QT Instead of using existing value within cmd-data_length, can we calculated data_length based on secstors blocksize. cmd-data_length = sectors * dev-dev_attrib.block_size; We can do that I suppose... Although it seems weird that the core discards the transport byte-count... Just wandering if we should recalc on the DIF case only or always. Yeah, same concern here wrt to discarding the transport length. This effectively makes the residual handling further down in sbc_parse_cdb() - target_cmd_size_check() always match size == cmd-data_length, because the latter as been recalculated by the former. Honestly, I don't know if this is a problem for normal READ + WRITE with prot=1 as I've never seen size != cmd-data_length for I/O related commands, but it does seem potentially dangerous. From the QLogic perfective, the cmd-data_length is pull directly from the wire (i.e. FCP header) when cmd is received. In essence, it's whatever the Initiator is set it to. We does not have any indicator to spot DIF vs none-DIF cmd when 1st received, unless the code do a peek. Same for all transports I assume... So just to confirm Quinn, the Qlogic the initiator includes the PI bytes into the FCP header data_length for the target-side *_PASS and DOUT_STRIP / DIN_INSERT, that is currently passed into qla_tgt_ops-handle_cmd(), right..? QT Initiator: DOUT_STRIP/DIN_Insert: FCP_DL = data length only (no dif length) Dif PASS: FCP_DL = data length + Dif length. Target: DOUT_strip/DIN_Insert: qla_tgt_ops-handle_cmd(data length only) sbc_check_prot() If (protect) cmd-data_length -= cmd-prot_length; truncation nod --- DIF_PASS: qla_tgt_ops-handle_cmd(data length + Dif length) Ok, so Sagi's patches for legacy fabric - PI backend and PI fabric - PI backend cases look OK, but not for the PI fabric - legacy backend case as noted above.. Given DOUT_STRIP + DIN_INSERT have not been enabled in sbc_check_prot() just yet, I'll go ahead and merge his v2 series to address the two existing cases in -rc1 + v3.15.y stable code. From there, enabling PI fabric - legacy backend support in = rc2 with target-core will be easy, as it's just a matter of updating sbc_set_prot_op_checks() to assign prot_ops, avoid sbc_check_prot() cmd-data_length recalculation, and making sure PROTECTION control feature bits are still exposed for legacy - unprotected backends. Thanks! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 3/3] TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire
Hi Sagi Co, On Sun, 2014-06-08 at 13:27 +0300, Sagi Grimberg wrote: In various areas of the code, it is assumed that se_cmd-data_length describes pure data. In case that protection information exists over the wire (protect bits is are on) the target core decrease the protection length from the data length (instead of each transport peeking in the cdb). Modify loopback device to include protection information in the transferred data length (like other scsi transports). Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/target/loopback/tcm_loop.c | 15 --- drivers/target/target_core_sbc.c | 15 +-- 2 files changed, 25 insertions(+), 5 deletions(-) diff --git a/drivers/target/loopback/tcm_loop.c b/drivers/target/loopback/tcm_loop.c index c886ad1..1f4c015 100644 --- a/drivers/target/loopback/tcm_loop.c +++ b/drivers/target/loopback/tcm_loop.c @@ -179,7 +179,7 @@ static void tcm_loop_submission_work(struct work_struct *work) struct tcm_loop_hba *tl_hba; struct tcm_loop_tpg *tl_tpg; struct scatterlist *sgl_bidi = NULL; - u32 sgl_bidi_count = 0; + u32 sgl_bidi_count = 0, transfer_length; int rc; tl_hba = *(struct tcm_loop_hba **)shost_priv(sc-device-host); @@ -213,12 +213,21 @@ static void tcm_loop_submission_work(struct work_struct *work) } - if (!scsi_prot_sg_count(sc) scsi_get_prot_op(sc) != SCSI_PROT_NORMAL) + transfer_length = scsi_transfer_length(sc); + if (!scsi_prot_sg_count(sc) + scsi_get_prot_op(sc) != SCSI_PROT_NORMAL) { se_cmd-prot_pto = true; + /* + * loopback transport doesn't support + * WRITE_GENERATE, READ_STRIP protection + * information operations, go ahead unprotected. + */ + transfer_length = scsi_bufflen(sc); + } rc = target_submit_cmd_map_sgls(se_cmd, tl_nexus-se_sess, sc-cmnd, tl_cmd-tl_sense_buf[0], tl_cmd-sc-device-lun, - scsi_bufflen(sc), tcm_loop_sam_attr(sc), + transfer_length, tcm_loop_sam_attr(sc), sc-sc_data_direction, 0, scsi_sglist(sc), scsi_sg_count(sc), sgl_bidi, sgl_bidi_count, diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c index e022959..06f8ecd 100644 --- a/drivers/target/target_core_sbc.c +++ b/drivers/target/target_core_sbc.c @@ -665,8 +665,19 @@ sbc_check_prot(struct se_device *dev, struct se_cmd *cmd, unsigned char *cdb, cmd-prot_type = dev-dev_attrib.pi_prot_type; cmd-prot_length = dev-prot_length * sectors; - pr_debug(%s: prot_type=%d, prot_length=%d prot_op=%d prot_checks=%d\n, - __func__, cmd-prot_type, cmd-prot_length, + + /** + * In case protection information exists over the wire + * we modify command data length to describe pure data. + * The actual transfer length is data length + protection + * length + **/ + if (protect) + cmd-data_length -= cmd-prot_length; + This fabric wide assumption breaks the vhost-scsi + tcm_qla2xxx PI code already queued for v3.16-rc1.. A vhost-scsi side conversion to combine exp_data_len + prot_bytes is easy enough in target-pending/for-next (see below), given that vhost-scsi's exp_data_len is coming from virtio buffer iovecs, so changing virtio-scsi LLD to use scsi_transfer_length() doesn't really make any sense. (Adding CC's for MST + Paolo) The qla2xxx target T10 PI patch in scsi/for-next (adding Quinn CC) will also need a similar change to update qlt_do_work() to include PI bytes for data_length being passed into tgt_ops-handle_cmd(), following what qlt_build_ctio_crc2_pkt() is already doing for calculating transfer_length for HW offload. That said, there is one other small qla2xxx change to enable per-session PI that is currently missing in Quinn's patch in scsi/for-next code. Sooo, I'll go ahead and include Sagi's patches with the vhost-scsi change below if there are no objections from MKP and/or driver maintainers, and post an -rc2 series after the merge window closes to address the two remaining qla2xxx specific items. --nab diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index 667e72d..03e484f 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -1144,7 +1144,8 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq) } cmd = vhost_scsi_get_tag(vq, tpg, cdb, tag, lun, task_attr, -exp_data_len, data_direction); +exp_data_len + prot_bytes, +data_direction); if (IS_ERR(cmd)) { vq_err(vq, vhost_scsi_get_tag failed %ld\n,
Re: [PATCH v1 0/3] Include protection information in iscsi header
Hi MKP + Mike + Roland, On Sun, 2014-06-08 at 13:27 +0300, Sagi Grimberg wrote: At the SCSI transport level, there is no distinction between user data and protection information. Thus, iscsi header field expected data transfer length should include protection information. Patch #1 introduces scsi helpers scsi_transfer_length (compute wire transfer byte count) and scsi_prot_len (compute protection information byte count). Patch #2 modifies iscsi initiator to set correct wire transfer length in iscsi header data_length field (and modifies iser accordingly). Patch #3 modifies target core to expect protection information included in the wire transfer length (and modifies loopback transport to do so). I have 2 patches that convert lpfc/qla2xxx drivers to use scsi helpers but these are completely untested at the moment. Once we get this set to land upstream, I can queue them up as RFCs. Changes from v0: - Introduce scsi helpers to compute correct transfer length in the presence of protection information (instead of having each transport doing the same computation). - Modify iscsi to set correct transfer length using scsi helpers - Modify loopback transport to set correct transfer length using scsi helpers Sagi Grimberg (3): scsi_cmnd: Introduce scsi_transfer_length helper libiscsi, iser: Adjust data_length to include protection information TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire drivers/infiniband/ulp/iser/iser_initiator.c | 34 ++ drivers/scsi/libiscsi.c | 18 ++-- drivers/target/loopback/tcm_loop.c | 15 -- drivers/target/target_core_sbc.c | 15 - include/scsi/scsi_cmnd.h | 39 ++ 5 files changed, 83 insertions(+), 38 deletions(-) Please review + ACK this series. Also, given the nature of the changes, they will need to be included into v3.15.y code to avoid any potential wire protocol compatibility issues. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 12/12] IB/isert: Support T10-PI protected transactions
Hey Sagi, On Wed, 2014-02-19 at 17:50 +0200, Sagi Grimberg wrote: In case the Target core passed transport T10 protection operation: 1. Register data buffer (data memory region) 2. Register protection buffer if exsists (prot memory region) 3. Register signature region (signature memory region) - use work request IB_WR_REG_SIG_MR 4. Execute RDMA 5. Upon RDMA completion check the signature status - if succeeded send good SCSI response - if failed send SCSI bad response with appropriate sense buffer Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/ulp/isert/ib_isert.c | 321 --- drivers/infiniband/ulp/isert/ib_isert.h |1 + 2 files changed, 292 insertions(+), 30 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 3871ab2..04bdd79 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c SNIP @@ -2454,26 +2708,33 @@ isert_put_datain(struct iscsi_conn *conn, struct iscsi_cmd *cmd) return rc; } - /* - * Build isert_conn-tx_desc for iSCSI response PDU and attach - */ - isert_create_send_desc(isert_conn, isert_cmd, isert_cmd-tx_desc); - iscsit_build_rsp_pdu(cmd, conn, true, (struct iscsi_scsi_rsp *) - isert_cmd-tx_desc.iscsi_header); - isert_init_tx_hdrs(isert_conn, isert_cmd-tx_desc); - isert_init_send_wr(isert_conn, isert_cmd, -isert_cmd-tx_desc.send_wr, true); + if (se_cmd-prot_type == TARGET_PROT_NORMAL) { + /* + * Build isert_conn-tx_desc for iSCSI response PDU and attach + */ + isert_create_send_desc(isert_conn, isert_cmd, +isert_cmd-tx_desc); + iscsit_build_rsp_pdu(cmd, conn, false, (struct iscsi_scsi_rsp *) + isert_cmd-tx_desc.iscsi_header); + isert_init_tx_hdrs(isert_conn, isert_cmd-tx_desc); + isert_init_send_wr(isert_conn, isert_cmd, +isert_cmd-tx_desc.send_wr, true); + isert_cmd-rdma_wr.s_send_wr.next = isert_cmd-tx_desc.send_wr; + } So I'm fixing up some v3.14-rc6 context changes before applying this series to for-next, and noticed something above.. Namely, that iscsi_build_rsp_pdu() has been changed to pass 'false' as it's second argument vs. the original case passing 'true'. This boolean controls the incrementing of the StatSN in the traditional response header, which AFAICT should still be 'true' for this case. Is there a reason why this was changed..? --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 12/12] IB/isert: Support T10-PI protected transactions
On Thu, 2014-03-13 at 20:57 +0200, Sagi Grimberg wrote: On 3/13/2014 8:15 PM, Nicholas A. Bellinger wrote: Hey Sagi, On Wed, 2014-02-19 at 17:50 +0200, Sagi Grimberg wrote: In case the Target core passed transport T10 protection operation: 1. Register data buffer (data memory region) 2. Register protection buffer if exsists (prot memory region) 3. Register signature region (signature memory region) - use work request IB_WR_REG_SIG_MR 4. Execute RDMA 5. Upon RDMA completion check the signature status - if succeeded send good SCSI response - if failed send SCSI bad response with appropriate sense buffer Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/ulp/isert/ib_isert.c | 321 --- drivers/infiniband/ulp/isert/ib_isert.h |1 + 2 files changed, 292 insertions(+), 30 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 3871ab2..04bdd79 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c SNIP @@ -2454,26 +2708,33 @@ isert_put_datain(struct iscsi_conn *conn, struct iscsi_cmd *cmd) return rc; } - /* - * Build isert_conn-tx_desc for iSCSI response PDU and attach - */ - isert_create_send_desc(isert_conn, isert_cmd, isert_cmd-tx_desc); - iscsit_build_rsp_pdu(cmd, conn, true, (struct iscsi_scsi_rsp *) - isert_cmd-tx_desc.iscsi_header); - isert_init_tx_hdrs(isert_conn, isert_cmd-tx_desc); - isert_init_send_wr(isert_conn, isert_cmd, - isert_cmd-tx_desc.send_wr, true); + if (se_cmd-prot_type == TARGET_PROT_NORMAL) { + /* + * Build isert_conn-tx_desc for iSCSI response PDU and attach + */ + isert_create_send_desc(isert_conn, isert_cmd, + isert_cmd-tx_desc); + iscsit_build_rsp_pdu(cmd, conn, false, (struct iscsi_scsi_rsp *) + isert_cmd-tx_desc.iscsi_header); + isert_init_tx_hdrs(isert_conn, isert_cmd-tx_desc); + isert_init_send_wr(isert_conn, isert_cmd, + isert_cmd-tx_desc.send_wr, true); + isert_cmd-rdma_wr.s_send_wr.next = isert_cmd-tx_desc.send_wr; + } So I'm fixing up some v3.14-rc6 context changes before applying this series to for-next, and noticed something above.. Namely, that iscsi_build_rsp_pdu() has been changed to pass 'false' as it's second argument vs. the original case passing 'true'. This boolean controls the incrementing of the StatSN in the traditional response header, which AFAICT should still be 'true' for this case. Is there a reason why this was changed..? Well, it wrong... This probably was left around when I rebased to 3.14 (earlier versions incremented StatSN explicitly). I wonder how that works though... Want me to fix and resend? Nah, fixing this in for-next now. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 12/12] IB/isert: Support T10-PI protected transactions
On Thu, 2014-03-13 at 21:14 +0200, Sagi Grimberg wrote: On 3/13/2014 8:59 PM, Nicholas A. Bellinger wrote: On Thu, 2014-03-13 at 20:57 +0200, Sagi Grimberg wrote: On 3/13/2014 8:15 PM, Nicholas A. Bellinger wrote: Hey Sagi, On Wed, 2014-02-19 at 17:50 +0200, Sagi Grimberg wrote: In case the Target core passed transport T10 protection operation: 1. Register data buffer (data memory region) 2. Register protection buffer if exsists (prot memory region) 3. Register signature region (signature memory region) - use work request IB_WR_REG_SIG_MR 4. Execute RDMA 5. Upon RDMA completion check the signature status - if succeeded send good SCSI response - if failed send SCSI bad response with appropriate sense buffer Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/ulp/isert/ib_isert.c | 321 --- drivers/infiniband/ulp/isert/ib_isert.h |1 + 2 files changed, 292 insertions(+), 30 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 3871ab2..04bdd79 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c SNIP @@ -2454,26 +2708,33 @@ isert_put_datain(struct iscsi_conn *conn, struct iscsi_cmd *cmd) return rc; } -/* - * Build isert_conn-tx_desc for iSCSI response PDU and attach - */ -isert_create_send_desc(isert_conn, isert_cmd, isert_cmd-tx_desc); -iscsit_build_rsp_pdu(cmd, conn, true, (struct iscsi_scsi_rsp *) - isert_cmd-tx_desc.iscsi_header); -isert_init_tx_hdrs(isert_conn, isert_cmd-tx_desc); -isert_init_send_wr(isert_conn, isert_cmd, - isert_cmd-tx_desc.send_wr, true); +if (se_cmd-prot_type == TARGET_PROT_NORMAL) { +/* + * Build isert_conn-tx_desc for iSCSI response PDU and attach + */ +isert_create_send_desc(isert_conn, isert_cmd, + isert_cmd-tx_desc); +iscsit_build_rsp_pdu(cmd, conn, false, (struct iscsi_scsi_rsp *) + isert_cmd-tx_desc.iscsi_header); +isert_init_tx_hdrs(isert_conn, isert_cmd-tx_desc); +isert_init_send_wr(isert_conn, isert_cmd, + isert_cmd-tx_desc.send_wr, true); +isert_cmd-rdma_wr.s_send_wr.next = isert_cmd-tx_desc.send_wr; +} So I'm fixing up some v3.14-rc6 context changes before applying this series to for-next, and noticed something above.. Namely, that iscsi_build_rsp_pdu() has been changed to pass 'false' as it's second argument vs. the original case passing 'true'. This boolean controls the incrementing of the StatSN in the traditional response header, which AFAICT should still be 'true' for this case. Is there a reason why this was changed..? Well, it wrong... This probably was left around when I rebased to 3.14 (earlier versions incremented StatSN explicitly). I wonder how that works though... Want me to fix and resend? Nah, fixing this in for-next now. --nab Thanks, I just spotted something strange, I think it was already fixed sometime: - if (se_cmd-prot_type == TARGET_PROT_NORMAL) { + if (se_cmd-prot_op == TARGET_PROT_NORMAL) { This is strange that I find a mismatch in our trees. I'll check it out on Sunday. Mmmm, yeah. I thought that was fixed a while ago, but is still present in the -v2 patches from 02192014.. In any event, those are now merged in for-next, and should be included in tomorrow's next build. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6] iser-target: Fix active I/O shutdown related issues
On Thu, 2014-03-06 at 16:05 +0200, sagi grimberg wrote: On 3/6/2014 12:04 AM, Nicholas A. Bellinger wrote: On Wed, 2014-03-05 at 14:12 +0200, Sagi Grimberg wrote: On 3/5/2014 2:06 AM, Nicholas A. Bellinger wrote: On Tue, 2014-03-04 at 17:17 +0200, Sagi Grimberg wrote: On 3/4/2014 2:00 AM, Nicholas A. Bellinger wrote: SNIP nod, I noticed that as well during recent debugging. However, AFAICT the RDMA_CM_EVENT_TIMEWAIT_EVENT doesn't (always) occur on the target side after a RDMA_CM_EVENT_DISCONNECTED, and thus far I've not been able to ascertain what's different about the shutdown sequence that would make this happen, or not happen.. Any ideas..? That's probably because the cm_id is destroyed before you get the event. There is a specific timout computation to get this event (see IB spec). If you will attempt to disconnect while the link is down (initiator won't receive it and send you disconnect back), you should be able to see this event. As I understand, in order to comply the spec, the QP (and the cm_id afterwards) should be destroyed only when getting this event and not before. nod, thanks for the additional background. So currently rdma_destroy_qp() + rdma_destroy_id() is being done via isert_connect_release(), which occurs after the final isert_put_conn() happens from either the RDMA_CM_EVENT_DISCONNECTED handler, or within isert_free_conn() in one of the per connection kernel thread contexts via iscsit_close_connection(). If I understand the above correctly, the isert_put_conn() should move from the RDMA_CM_EVENT_DISCONNECTED handler into the TIMEWAIT_EVENT handler, yes..? Yes. And it's safe to assume that DISCONNECTED will always occur before TIMEWAIT_EVENT, right..? DISCONNECTED event may not even come at all (in case the initiator didn't call rdma_disconnect). no guarantees here.. But, if once we get the TIMEWAIT event, we destroy the qp and the *cm_id*, we won't get any CM events at all. As I understand, we don't even need to explicitly destroy the cm_id, we can just return a non-zero return from cma_handler for TIMEWAIT events which will cause rdma_cm to implicitly destroy the cm_id. Mmmm, if that's the case then I'm more confused about how reference counting for isert_conn should work wrt TIMEWAIT_EVENT than before.. ;) As mentioned earlier, the first isert_put_conn() occurs from the per connection process context after calling rdma_disconnect(), and the second from the disconnected event handler. Your comment above would seem to indicate that iser-target code should be waiting to receive TIMEWAIT_EVENT, instead of pro-actively calling rdma_disconnect() to trigger the disconnect. Is that correct..? --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux rdma 3.14 merge plans
On Thu, 2014-02-06 at 16:04 -0800, Roland Dreier wrote: On Thu, Feb 6, 2014 at 4:02 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: Can you give us an estimate of when you'll have some time to give feedback on the outstanding patches..? I hope to get to it in the next few weeks. - R. Hi Roland, We'd very much like to move forward with the verbs + mlx5 + LIO target PI related patches for v3.15 code. Given the verbs + mlx5 changes have undergone ~5 months of review at this point, I'd like to go ahead and get them in target-pending/for-next ASAP to continue making forward progress towards a mainline merge. I'm happy with the state of the iser-target related changes, and Sagi Co are making good progress on the iser-initiator related patches with Mike. That all said, do you have an objection wrt taking this bits through target-pending..? Given the dependencies involved, that would seem the most logical path to take. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux rdma 3.14 merge plans
On Wed, 2014-03-05 at 07:18 -0800, Roland Dreier wrote: On Wed, Mar 5, 2014 at 1:54 AM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: That all said, do you have an objection wrt taking this bits through target-pending..? Given the dependencies involved, that would seem the most logical path to take. Perhaps not surprisingly, I would prefer to get a chance to review a major change to the core RDMA midlayer rather than having you merge it through your tree. So yes I do object. Please give me a chance to review and merge it. I am working on that this week. Great. We'll be looking for a response by the end of the week. Otherwise if you end up not having time, we'd still like to move forward for v3.15 given the amount of review the series has already gotten on the list. Thank you, --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6] iser-target: Fix active I/O shutdown related issues
On Wed, 2014-03-05 at 14:12 +0200, Sagi Grimberg wrote: On 3/5/2014 2:06 AM, Nicholas A. Bellinger wrote: On Tue, 2014-03-04 at 17:17 +0200, Sagi Grimberg wrote: On 3/4/2014 2:00 AM, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org SNIP More on cleanup flow. isert_cma_handler does not handle RDMA_CM_EVENT_TIMEWAIT_EXIT. To be more specific, according to IB spec, when initiating disconnect (rdma_disconnect/ib_send_cm_dreq), one should not destroy a used qp until getting TIMEWAIT_EXIT CM event. We are working on this in iSER initiator. It might lead to stale connection CM rejects on future connections (SRP also does not do that). nod, I noticed that as well during recent debugging. However, AFAICT the RDMA_CM_EVENT_TIMEWAIT_EVENT doesn't (always) occur on the target side after a RDMA_CM_EVENT_DISCONNECTED, and thus far I've not been able to ascertain what's different about the shutdown sequence that would make this happen, or not happen.. Any ideas..? That's probably because the cm_id is destroyed before you get the event. There is a specific timout computation to get this event (see IB spec). If you will attempt to disconnect while the link is down (initiator won't receive it and send you disconnect back), you should be able to see this event. As I understand, in order to comply the spec, the QP (and the cm_id afterwards) should be destroyed only when getting this event and not before. nod, thanks for the additional background. So currently rdma_destroy_qp() + rdma_destroy_id() is being done via isert_connect_release(), which occurs after the final isert_put_conn() happens from either the RDMA_CM_EVENT_DISCONNECTED handler, or within isert_free_conn() in one of the per connection kernel thread contexts via iscsit_close_connection(). If I understand the above correctly, the isert_put_conn() should move from the RDMA_CM_EVENT_DISCONNECTED handler into the TIMEWAIT_EVENT handler, yes..? And it's safe to assume that DISCONNECTED will always occur before TIMEWAIT_EVENT, right..? --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6] iser-target: Ignore completions for FRWRs in isert_cq_tx_work
On Tue, 2014-03-04 at 16:51 +0200, Sagi Grimberg wrote: On 3/4/2014 2:01 AM, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org This patch changes IB_WR_FAST_REG_MR + IB_WR_LOCAL_INV related work requests to include a ISER_FRWR_LI_WRID value in order to signal isert_cq_tx_work() that these requests should be ignored. This is necessary because even though IB_SEND_SIGNALED is not set for either work request, during a QP failure event the work requests will be returned with exception status from the TX completion queue. Cc: Sagi Grimberg sa...@mellanox.com Cc: Or Gerlitz ogerl...@mellanox.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/infiniband/ulp/isert/ib_isert.c |8 ++-- drivers/infiniband/ulp/isert/ib_isert.h |1 + 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index c9d488f..003b5d0 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -1738,8 +1738,10 @@ isert_cq_tx_work(struct work_struct *work) pr_debug(TX wc.status: 0x%08x\n, wc.status); pr_debug(TX wc.vendor_err: 0x%08x\n, wc.vendor_err); - atomic_dec(isert_conn-post_send_buf_count); - isert_cq_tx_comp_err(tx_desc, isert_conn); + if (wc.wr_id != ISER_FRWR_LI_WRID) { Better to use ISER_FASTREG_LI_WRID - I changed it in the initiator. nod, updating that bit now.. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/6] iser-target: Fix active I/O shutdown related issues
On Tue, 2014-03-04 at 17:17 +0200, Sagi Grimberg wrote: On 3/4/2014 2:00 AM, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org Hi Or Sagi, This series addresses a number of active I/O shutdown related issues in iser-target code that have come up recently during stress testing. Note there is still a seperate iser-target network portal shutdown bug being tracked down, but this series addresses all existing issues related to active I/O session shutdown. The patch breakdown looks like: Patch #1 fixes a long-standing bug where TPGs in shutdown incorrectly could be referenced by new login attempts. Patch #2 converts list_del - list_del_init for iscsi_cmd-i_conn_node so that list_empty works correctly. Patch #3 addresses isert_conn-state related bugs resulting in hung shutdown, and splits isert_free_conn() into seperate code that is called earlier during shutdown to ensure that all outstanding I/O has completed. Patch #4 fixes incorrect accounting of -post_send_buf_count during active I/O shutdown with outstanding RDMA WRITE + RDMA READ work requests. Patch #5 addresses a bug related to active I/O shutdown with outstanding FRMR work requests. Note this patch is specific to v3.12+ code. Patch #6 addresses bugs related to active I/O shutdown with outstanding completion interrupt coalescing batches. Note this patch is specific to v3.13+ code. Please review. Hey Nic, So besides a minor comment, you have my Ack on this set. Thanks! More on cleanup flow. isert_cma_handler does not handle RDMA_CM_EVENT_TIMEWAIT_EXIT. To be more specific, according to IB spec, when initiating disconnect (rdma_disconnect/ib_send_cm_dreq), one should not destroy a used qp until getting TIMEWAIT_EXIT CM event. We are working on this in iSER initiator. It might lead to stale connection CM rejects on future connections (SRP also does not do that). nod, I noticed that as well during recent debugging. However, AFAICT the RDMA_CM_EVENT_TIMEWAIT_EVENT doesn't (always) occur on the target side after a RDMA_CM_EVENT_DISCONNECTED, and thus far I've not been able to ascertain what's different about the shutdown sequence that would make this happen, or not happen.. Any ideas..? --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/6] iser-target: Fix post_send_buf_count for RDMA READ/WRITE
From: Nicholas Bellinger n...@linux-iscsi.org This patch fixes the incorrect setting of -post_send_buf_count related to RDMA WRITEs + READs where isert_rdma_rw-send_wr_num was not being taken into account. This includes incrementing -post_send_buf_count within isert_put_datain() + isert_get_dataout(), decrementing within __isert_send_completion() + isert_response_completion(), and clearing wr-send_wr_num within isert_completion_rdma_read() This is necessary because even though IB_SEND_SIGNALED is not set for RDMA WRITEs + READs, during a QP failure event the work requests will be returned with exception status from the TX completion queue. Cc: Sagi Grimberg sa...@mellanox.com Cc: Or Gerlitz ogerl...@mellanox.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/infiniband/ulp/isert/ib_isert.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 8283f5a..c9d488f 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -1536,6 +1536,7 @@ isert_completion_rdma_read(struct iser_tx_desc *tx_desc, iscsit_stop_dataout_timer(cmd); device-unreg_rdma_mem(isert_cmd, isert_conn); cmd-write_data_done = wr-cur_rdma_length; + wr-send_wr_num = 0; pr_debug(Cmd: %p RDMA_READ comp calling execute_cmd\n, isert_cmd); spin_lock_bh(cmd-istate_lock); @@ -1600,6 +1601,7 @@ isert_response_completion(struct iser_tx_desc *tx_desc, struct ib_device *ib_dev) { struct iscsi_cmd *cmd = isert_cmd-iscsi_cmd; + struct isert_rdma_wr *wr = isert_cmd-rdma_wr; if (cmd-i_state == ISTATE_SEND_TASKMGTRSP || cmd-i_state == ISTATE_SEND_LOGOUTRSP || @@ -1611,7 +1613,7 @@ isert_response_completion(struct iser_tx_desc *tx_desc, queue_work(isert_comp_wq, isert_cmd-comp_work); return; } - atomic_dec(isert_conn-post_send_buf_count); + atomic_sub(wr-send_wr_num + 1, isert_conn-post_send_buf_count); cmd-i_state = ISTATE_SENT_STATUS; isert_completion_put(tx_desc, isert_cmd, ib_dev); @@ -1649,7 +1651,7 @@ __isert_send_completion(struct iser_tx_desc *tx_desc, case ISER_IB_RDMA_READ: pr_debug(isert_send_completion: Got ISER_IB_RDMA_READ:\n); - atomic_dec(isert_conn-post_send_buf_count); + atomic_sub(wr-send_wr_num, isert_conn-post_send_buf_count); isert_completion_rdma_read(tx_desc, isert_cmd); break; default: @@ -2375,12 +2377,12 @@ isert_put_datain(struct iscsi_conn *conn, struct iscsi_cmd *cmd) isert_init_send_wr(isert_conn, isert_cmd, isert_cmd-tx_desc.send_wr, true); - atomic_inc(isert_conn-post_send_buf_count); + atomic_add(wr-send_wr_num + 1, isert_conn-post_send_buf_count); rc = ib_post_send(isert_conn-conn_qp, wr-send_wr, wr_failed); if (rc) { pr_warn(ib_post_send() failed for IB_WR_RDMA_WRITE\n); - atomic_dec(isert_conn-post_send_buf_count); + atomic_sub(wr-send_wr_num + 1, isert_conn-post_send_buf_count); } pr_debug(Cmd: %p posted RDMA_WRITE + Response for iSER Data READ\n, isert_cmd); @@ -2408,12 +2410,12 @@ isert_get_dataout(struct iscsi_conn *conn, struct iscsi_cmd *cmd, bool recovery) return rc; } - atomic_inc(isert_conn-post_send_buf_count); + atomic_add(wr-send_wr_num, isert_conn-post_send_buf_count); rc = ib_post_send(isert_conn-conn_qp, wr-send_wr, wr_failed); if (rc) { pr_warn(ib_post_send() failed for IB_WR_RDMA_READ\n); - atomic_dec(isert_conn-post_send_buf_count); + atomic_sub(wr-send_wr_num, isert_conn-post_send_buf_count); } pr_debug(Cmd: %p posted RDMA_READ memory for ISER Data WRITE\n, isert_cmd); -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 6/6] iser-target: Fix command leak for tx_desc-comp_llnode_batch
From: Nicholas Bellinger n...@linux-iscsi.org This patch addresses a number of active I/O shutdown issues related to isert_cmd descriptors being leaked that are part of a completion interrupt coalescing batch. This includes adding logic in isert_cq_tx_comp_err() to drain any associated tx_desc-comp_llnode_batch, as well as isert_cq_drain_comp_llist() to drain any associated isert_conn-conn_comp_llist. Also, set tx_desc-llnode_active in isert_init_send_wr() in order to determine when work requests need to be skipped in isert_cq_tx_work() exception path code. Finally, update isert_init_send_wr() to only allow interrupt coalescing when ISER_CONN_UP. Cc: Sagi Grimberg sa...@mellanox.com Cc: Or Gerlitz ogerl...@mellanox.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/infiniband/ulp/isert/ib_isert.c | 50 +++ drivers/infiniband/ulp/isert/ib_isert.h |2 +- 2 files changed, 46 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 003b5d0..fdd3042 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -484,7 +484,6 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event) kref_init(isert_conn-conn_kref); kref_get(isert_conn-conn_kref); mutex_init(isert_conn-conn_mutex); - mutex_init(isert_conn-conn_comp_mutex); spin_lock_init(isert_conn-conn_lock); cma_id-context = isert_conn; @@ -875,16 +874,17 @@ isert_init_send_wr(struct isert_conn *isert_conn, struct isert_cmd *isert_cmd, * Coalesce send completion interrupts by only setting IB_SEND_SIGNALED * bit for every ISERT_COMP_BATCH_COUNT number of ib_post_send() calls. */ - mutex_lock(isert_conn-conn_comp_mutex); - if (coalesce + mutex_lock(isert_conn-conn_mutex); + if (coalesce isert_conn-state == ISER_CONN_UP ++isert_conn-conn_comp_batch ISERT_COMP_BATCH_COUNT) { + tx_desc-llnode_active = true; llist_add(tx_desc-comp_llnode, isert_conn-conn_comp_llist); - mutex_unlock(isert_conn-conn_comp_mutex); + mutex_unlock(isert_conn-conn_mutex); return; } isert_conn-conn_comp_batch = 0; tx_desc-comp_llnode_batch = llist_del_all(isert_conn-conn_comp_llist); - mutex_unlock(isert_conn-conn_comp_mutex); + mutex_unlock(isert_conn-conn_mutex); send_wr-send_flags = IB_SEND_SIGNALED; } @@ -1680,10 +1680,45 @@ isert_send_completion(struct iser_tx_desc *tx_desc, } static void +isert_cq_drain_comp_llist(struct isert_conn *isert_conn, struct ib_device *ib_dev) +{ + struct llist_node *llnode; + struct isert_rdma_wr *wr; + struct iser_tx_desc *t; + + mutex_lock(isert_conn-conn_mutex); + llnode = llist_del_all(isert_conn-conn_comp_llist); + isert_conn-conn_comp_batch = 0; + mutex_unlock(isert_conn-conn_mutex); + + while (llnode) { + t = llist_entry(llnode, struct iser_tx_desc, comp_llnode); + llnode = llist_next(llnode); + wr = t-isert_cmd-rdma_wr; + + atomic_sub(wr-send_wr_num + 1, isert_conn-post_send_buf_count); + isert_completion_put(t, t-isert_cmd, ib_dev); + } +} + +static void isert_cq_tx_comp_err(struct iser_tx_desc *tx_desc, struct isert_conn *isert_conn) { struct ib_device *ib_dev = isert_conn-conn_cm_id-device; struct isert_cmd *isert_cmd = tx_desc-isert_cmd; + struct llist_node *llnode = tx_desc-comp_llnode_batch; + struct isert_rdma_wr *wr; + struct iser_tx_desc *t; + + while (llnode) { + t = llist_entry(llnode, struct iser_tx_desc, comp_llnode); + llnode = llist_next(llnode); + wr = t-isert_cmd-rdma_wr; + + atomic_sub(wr-send_wr_num + 1, isert_conn-post_send_buf_count); + isert_completion_put(t, t-isert_cmd, ib_dev); + } + tx_desc-comp_llnode_batch = NULL; if (!isert_cmd) isert_unmap_tx_desc(tx_desc, ib_dev); @@ -1700,6 +1735,8 @@ isert_cq_rx_comp_err(struct isert_conn *isert_conn) if (isert_conn-post_recv_buf_count) return; + isert_cq_drain_comp_llist(isert_conn, ib_dev); + if (conn-sess) { target_sess_cmd_list_set_waiting(conn-sess-se_sess); target_wait_for_sess_cmds(conn-sess-se_sess); @@ -1739,6 +1776,9 @@ isert_cq_tx_work(struct work_struct *work) pr_debug(TX wc.vendor_err: 0x%08x\n, wc.vendor_err); if (wc.wr_id != ISER_FRWR_LI_WRID) { + if (tx_desc-llnode_active) + continue; + atomic_dec(isert_conn-post_send_buf_count
[PATCH 1/6] iscsi-target: Fix iscsit_get_tpg_from_np tpg_state bug
From: Nicholas Bellinger n...@linux-iscsi.org This patch fixes a bug in iscsit_get_tpg_from_np() where the tpg-tpg_state sanity check was looking for TPG_STATE_FREE, instead of != TPG_STATE_ACTIVE. The latter is expected during a normal TPG shutdown once the tpg_state goes into TPG_STATE_INACTIVE in order to reject any new incoming login attempts. Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target_tpg.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/target/iscsi/iscsi_target_tpg.c b/drivers/target/iscsi/iscsi_target_tpg.c index 3976183..44a5471 100644 --- a/drivers/target/iscsi/iscsi_target_tpg.c +++ b/drivers/target/iscsi/iscsi_target_tpg.c @@ -137,7 +137,7 @@ struct iscsi_portal_group *iscsit_get_tpg_from_np( list_for_each_entry(tpg, tiqn-tiqn_tpg_list, tpg_list) { spin_lock(tpg-tpg_state_lock); - if (tpg-tpg_state == TPG_STATE_FREE) { + if (tpg-tpg_state != TPG_STATE_ACTIVE) { spin_unlock(tpg-tpg_state_lock); continue; } -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/6] iser-target: Ignore completions for FRWRs in isert_cq_tx_work
From: Nicholas Bellinger n...@linux-iscsi.org This patch changes IB_WR_FAST_REG_MR + IB_WR_LOCAL_INV related work requests to include a ISER_FRWR_LI_WRID value in order to signal isert_cq_tx_work() that these requests should be ignored. This is necessary because even though IB_SEND_SIGNALED is not set for either work request, during a QP failure event the work requests will be returned with exception status from the TX completion queue. Cc: Sagi Grimberg sa...@mellanox.com Cc: Or Gerlitz ogerl...@mellanox.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/infiniband/ulp/isert/ib_isert.c |8 ++-- drivers/infiniband/ulp/isert/ib_isert.h |1 + 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index c9d488f..003b5d0 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -1738,8 +1738,10 @@ isert_cq_tx_work(struct work_struct *work) pr_debug(TX wc.status: 0x%08x\n, wc.status); pr_debug(TX wc.vendor_err: 0x%08x\n, wc.vendor_err); - atomic_dec(isert_conn-post_send_buf_count); - isert_cq_tx_comp_err(tx_desc, isert_conn); + if (wc.wr_id != ISER_FRWR_LI_WRID) { + atomic_dec(isert_conn-post_send_buf_count); + isert_cq_tx_comp_err(tx_desc, isert_conn); + } } } @@ -2202,6 +2204,7 @@ isert_fast_reg_mr(struct fast_reg_descriptor *fr_desc, if (!fr_desc-valid) { memset(inv_wr, 0, sizeof(inv_wr)); + inv_wr.wr_id = ISER_FRWR_LI_WRID; inv_wr.opcode = IB_WR_LOCAL_INV; inv_wr.ex.invalidate_rkey = fr_desc-data_mr-rkey; wr = inv_wr; @@ -2212,6 +2215,7 @@ isert_fast_reg_mr(struct fast_reg_descriptor *fr_desc, /* Prepare FASTREG WR */ memset(fr_wr, 0, sizeof(fr_wr)); + fr_wr.wr_id = ISER_FRWR_LI_WRID; fr_wr.opcode = IB_WR_FAST_REG_MR; fr_wr.wr.fast_reg.iova_start = fr_desc-data_frpl-page_list[0] + page_off; diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h index 41e799f..599b4e2 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.h +++ b/drivers/infiniband/ulp/isert/ib_isert.h @@ -6,6 +6,7 @@ #define ISERT_RDMA_LISTEN_BACKLOG 10 #define ISCSI_ISER_SG_TABLESIZE256 +#define ISER_FRWR_LI_WRID 0xULL enum isert_desc_type { ISCSI_TX_CONTROL, -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/6] iser-target: Fix active I/O shutdown related issues
From: Nicholas Bellinger n...@linux-iscsi.org Hi Or Sagi, This series addresses a number of active I/O shutdown related issues in iser-target code that have come up recently during stress testing. Note there is still a seperate iser-target network portal shutdown bug being tracked down, but this series addresses all existing issues related to active I/O session shutdown. The patch breakdown looks like: Patch #1 fixes a long-standing bug where TPGs in shutdown incorrectly could be referenced by new login attempts. Patch #2 converts list_del - list_del_init for iscsi_cmd-i_conn_node so that list_empty works correctly. Patch #3 addresses isert_conn-state related bugs resulting in hung shutdown, and splits isert_free_conn() into seperate code that is called earlier during shutdown to ensure that all outstanding I/O has completed. Patch #4 fixes incorrect accounting of -post_send_buf_count during active I/O shutdown with outstanding RDMA WRITE + RDMA READ work requests. Patch #5 addresses a bug related to active I/O shutdown with outstanding FRMR work requests. Note this patch is specific to v3.12+ code. Patch #6 addresses bugs related to active I/O shutdown with outstanding completion interrupt coalescing batches. Note this patch is specific to v3.13+ code. Please review. --nab Nicholas Bellinger (6): iscsi-target: Fix iscsit_get_tpg_from_np tpg_state bug iscsi/iser-target: Use list_del_init for -i_conn_node iscsi/iser-target: Fix isert_conn-state hung shutdown issues iser-target: Fix post_send_buf_count for RDMA READ/WRITE iser-target: Ignore completions for FRWRs in isert_cq_tx_work iser-target: Fix command leak for tx_desc-comp_llnode_batch drivers/infiniband/ulp/isert/ib_isert.c | 180 ++ drivers/infiniband/ulp/isert/ib_isert.h |7 +- drivers/target/iscsi/iscsi_target.c | 10 +- drivers/target/iscsi/iscsi_target_erl2.c | 16 +-- drivers/target/iscsi/iscsi_target_tpg.c |2 +- include/target/iscsi/iscsi_transport.h |1 + 6 files changed, 129 insertions(+), 87 deletions(-) -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/6] iscsi/iser-target: Use list_del_init for -i_conn_node
From: Nicholas Bellinger n...@linux-iscsi.org There are a handful of uses of list_empty() for cmd-i_conn_node within iser-target code that expect to return false once a cmd has been removed from the per connect list. This patch changes all uses of list_del - list_del_init in order to ensure that list_empty() returns false as expected. Cc: Sagi Grimberg sa...@mellanox.com Cc: Or Gerlitz ogerl...@mellanox.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/infiniband/ulp/isert/ib_isert.c |6 +++--- drivers/target/iscsi/iscsi_target.c |6 +++--- drivers/target/iscsi/iscsi_target_erl2.c | 16 3 files changed, 14 insertions(+), 14 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 63bcf69..05a575e 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -1451,7 +1451,7 @@ isert_put_cmd(struct isert_cmd *isert_cmd) case ISCSI_OP_SCSI_CMD: spin_lock_bh(conn-cmd_lock); if (!list_empty(cmd-i_conn_node)) - list_del(cmd-i_conn_node); + list_del_init(cmd-i_conn_node); spin_unlock_bh(conn-cmd_lock); if (cmd-data_direction == DMA_TO_DEVICE) @@ -1463,7 +1463,7 @@ isert_put_cmd(struct isert_cmd *isert_cmd) case ISCSI_OP_SCSI_TMFUNC: spin_lock_bh(conn-cmd_lock); if (!list_empty(cmd-i_conn_node)) - list_del(cmd-i_conn_node); + list_del_init(cmd-i_conn_node); spin_unlock_bh(conn-cmd_lock); transport_generic_free_cmd(cmd-se_cmd, 0); @@ -1473,7 +1473,7 @@ isert_put_cmd(struct isert_cmd *isert_cmd) case ISCSI_OP_TEXT: spin_lock_bh(conn-cmd_lock); if (!list_empty(cmd-i_conn_node)) - list_del(cmd-i_conn_node); + list_del_init(cmd-i_conn_node); spin_unlock_bh(conn-cmd_lock); /* diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index a99637e..72f1576 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -784,7 +784,7 @@ static void iscsit_ack_from_expstatsn(struct iscsi_conn *conn, u32 exp_statsn) spin_unlock_bh(conn-cmd_lock); list_for_each_entry_safe(cmd, cmd_p, ack_list, i_conn_node) { - list_del(cmd-i_conn_node); + list_del_init(cmd-i_conn_node); iscsit_free_cmd(cmd, false); } } @@ -3709,7 +3709,7 @@ iscsit_immediate_queue(struct iscsi_conn *conn, struct iscsi_cmd *cmd, int state break; case ISTATE_REMOVE: spin_lock_bh(conn-cmd_lock); - list_del(cmd-i_conn_node); + list_del_init(cmd-i_conn_node); spin_unlock_bh(conn-cmd_lock); iscsit_free_cmd(cmd, false); @@ -4152,7 +4152,7 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn) spin_lock_bh(conn-cmd_lock); list_for_each_entry_safe(cmd, cmd_tmp, conn-conn_cmd_list, i_conn_node) { - list_del(cmd-i_conn_node); + list_del_init(cmd-i_conn_node); spin_unlock_bh(conn-cmd_lock); iscsit_increment_maxcmdsn(cmd, sess); diff --git a/drivers/target/iscsi/iscsi_target_erl2.c b/drivers/target/iscsi/iscsi_target_erl2.c index 33be1fb..4ca8fd2 100644 --- a/drivers/target/iscsi/iscsi_target_erl2.c +++ b/drivers/target/iscsi/iscsi_target_erl2.c @@ -138,7 +138,7 @@ void iscsit_free_connection_recovery_entires(struct iscsi_session *sess) list_for_each_entry_safe(cmd, cmd_tmp, cr-conn_recovery_cmd_list, i_conn_node) { - list_del(cmd-i_conn_node); + list_del_init(cmd-i_conn_node); cmd-conn = NULL; spin_unlock(cr-conn_recovery_cmd_lock); iscsit_free_cmd(cmd, true); @@ -160,7 +160,7 @@ void iscsit_free_connection_recovery_entires(struct iscsi_session *sess) list_for_each_entry_safe(cmd, cmd_tmp, cr-conn_recovery_cmd_list, i_conn_node) { - list_del(cmd-i_conn_node); + list_del_init(cmd-i_conn_node); cmd-conn = NULL; spin_unlock(cr-conn_recovery_cmd_lock); iscsit_free_cmd(cmd, true); @@ -216,7 +216,7 @@ int iscsit_remove_cmd_from_connection_recovery( } cr = cmd-cr; - list_del(cmd-i_conn_node); + list_del_init(cmd-i_conn_node); return --cr-cmd_count; } @@ -297,7 +297,7 @@ int iscsit_discard_unacknowledged_ooo_cmdsns_for_conn(struct iscsi_conn *conn) if (!(cmd
[PATCH 3/6] iscsi/iser-target: Fix isert_conn-state hung shutdown issues
From: Nicholas Bellinger n...@linux-iscsi.org This patch addresses a couple of different hug shutdown issues related to wait_event() + isert_conn-state. First, it changes isert_conn-conn_wait + isert_conn-conn_wait_comp_err from waitqueues to completions, and sets ISER_CONN_TERMINATING from within isert_disconnect_work(). Second, it splits isert_free_conn() into isert_wait_conn() that is called earlier in iscsit_close_connection() to ensure that all outstanding commands have completed before continuing. Finally, it breaks isert_cq_comp_err() into seperate TX / RX related code, and adds logic in isert_cq_rx_comp_err() to wait for outstanding commands to complete before setting ISER_CONN_DOWN and calling complete(isert_conn-conn_wait_comp_err). Cc: Sagi Grimberg sa...@mellanox.com Cc: Or Gerlitz ogerl...@mellanox.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/infiniband/ulp/isert/ib_isert.c | 106 ++- drivers/infiniband/ulp/isert/ib_isert.h |4 +- drivers/target/iscsi/iscsi_target.c |4 ++ include/target/iscsi/iscsi_transport.h |1 + 4 files changed, 55 insertions(+), 60 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 05a575e..8283f5a 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -479,8 +479,8 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event) isert_conn-state = ISER_CONN_INIT; INIT_LIST_HEAD(isert_conn-conn_accept_node); init_completion(isert_conn-conn_login_comp); - init_waitqueue_head(isert_conn-conn_wait); - init_waitqueue_head(isert_conn-conn_wait_comp_err); + init_completion(isert_conn-conn_wait); + init_completion(isert_conn-conn_wait_comp_err); kref_init(isert_conn-conn_kref); kref_get(isert_conn-conn_kref); mutex_init(isert_conn-conn_mutex); @@ -675,11 +675,11 @@ isert_disconnect_work(struct work_struct *work) pr_debug(isert_disconnect_work(): \n); mutex_lock(isert_conn-conn_mutex); - isert_conn-state = ISER_CONN_DOWN; + if (isert_conn-state == ISER_CONN_UP) + isert_conn-state = ISER_CONN_TERMINATING; if (isert_conn-post_recv_buf_count == 0 atomic_read(isert_conn-post_send_buf_count) == 0) { - pr_debug(Calling wake_up(isert_conn-conn_wait);\n); mutex_unlock(isert_conn-conn_mutex); goto wake_up; } @@ -699,7 +699,7 @@ isert_disconnect_work(struct work_struct *work) mutex_unlock(isert_conn-conn_mutex); wake_up: - wake_up(isert_conn-conn_wait); + complete(isert_conn-conn_wait); isert_put_conn(isert_conn); } @@ -1576,7 +1576,7 @@ isert_do_control_comp(struct work_struct *work) pr_debug(Calling iscsit_logout_post_handler \n); /* * Call atomic_dec(isert_conn-post_send_buf_count) -* from isert_free_conn() +* from isert_wait_conn() */ isert_conn-logout_posted = true; iscsit_logout_post_handler(cmd, cmd-conn); @@ -1678,31 +1678,39 @@ isert_send_completion(struct iser_tx_desc *tx_desc, } static void -isert_cq_comp_err(struct iser_tx_desc *tx_desc, struct isert_conn *isert_conn) +isert_cq_tx_comp_err(struct iser_tx_desc *tx_desc, struct isert_conn *isert_conn) { struct ib_device *ib_dev = isert_conn-conn_cm_id-device; + struct isert_cmd *isert_cmd = tx_desc-isert_cmd; + + if (!isert_cmd) + isert_unmap_tx_desc(tx_desc, ib_dev); + else + isert_completion_put(tx_desc, isert_cmd, ib_dev); +} + +static void +isert_cq_rx_comp_err(struct isert_conn *isert_conn) +{ + struct ib_device *ib_dev = isert_conn-conn_cm_id-device; + struct iscsi_conn *conn = isert_conn-conn; - if (tx_desc) { - struct isert_cmd *isert_cmd = tx_desc-isert_cmd; + if (isert_conn-post_recv_buf_count) + return; - if (!isert_cmd) - isert_unmap_tx_desc(tx_desc, ib_dev); - else - isert_completion_put(tx_desc, isert_cmd, ib_dev); + if (conn-sess) { + target_sess_cmd_list_set_waiting(conn-sess-se_sess); + target_wait_for_sess_cmds(conn-sess-se_sess); } - if (isert_conn-post_recv_buf_count == 0 - atomic_read(isert_conn-post_send_buf_count) == 0) { - pr_debug(isert_cq_comp_err \n); - pr_debug(Calling wake_up from isert_cq_comp_err\n); + while (atomic_read(isert_conn-post_send_buf_count)) + msleep(3000); - mutex_lock(isert_conn-conn_mutex); - if (isert_conn-state != ISER_CONN_DOWN) - isert_conn-state = ISER_CONN_TERMINATING
Re: [PATCH 05/11] IB/iser: Replace fastreg descriptor valid bool with indicators container
On Mon, 2014-02-24 at 10:27 +0200, Sagi Grimberg wrote: On 2/24/2014 9:30 AM, Nicholas A. Bellinger wrote: On Sun, 2014-02-23 at 22:53 -0800, Nicholas A. Bellinger wrote: On Sun, 2014-02-23 at 14:19 +0200, Sagi Grimberg wrote: In T10-PI support we will have memory keys for protection buffers and signature transactions. We prefer to compact indicators rather than keeping multiple bools. This commit does not change any functionality. Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/ulp/iser/iscsi_iser.h |8 ++-- drivers/infiniband/ulp/iser/iser_memory.c |4 ++-- drivers/infiniband/ulp/iser/iser_verbs.c |2 +- 3 files changed, 9 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h index 5ffa92f..5f7dbfd 100644 --- a/drivers/infiniband/ulp/iser/iscsi_iser.h +++ b/drivers/infiniband/ulp/iser/iscsi_iser.h @@ -280,13 +280,17 @@ struct iser_device { enum iser_data_dir cmd_dir); }; +enum iser_reg_indicator { + ISER_DATA_KEY_VALID = 1 0, +}; + struct fast_reg_descriptor { struct list_head list; /* For fast registration - FRWR */ struct ib_mr *data_mr; struct ib_fast_reg_page_list *data_frpl; - /* Valid for fast registration flag */ - bool valid; + /* registration indicators container */ + u8reg_indicators; }; struct iser_conn { diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c index 0e029fe..3edab18 100644 --- a/drivers/infiniband/ulp/iser/iser_memory.c +++ b/drivers/infiniband/ulp/iser/iser_memory.c @@ -479,7 +479,7 @@ static int iser_fast_reg_mr(struct iscsi_iser_task *iser_task, return -EINVAL; } - if (!desc-valid) { + if (!(desc-reg_indicators ISER_DATA_KEY_VALID)) { memset(inv_wr, 0, sizeof(inv_wr)); inv_wr.wr_id = ISER_FRWR_LI_WRID; inv_wr.opcode = IB_WR_LOCAL_INV; This block doesn't apply either.. ISER_FRWR_LI_WRID does not appear to be defined anywhere in v3.14-rc2 code..? Is this patch missing a new definition..? Er, missed the bit wrt: IB/iser: Suppress completion for fast registration work requests Applied along with ib_iser + ib_isert patches to target-pending/rdma-dif --nab OK, I see you got the missing piece. This fixes all your conflicts right (I verified it before myself)? All good, thanks. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 04/10] IB/mlx5: Initialize mlx5_ib_qp signature related
On Sun, 2014-02-23 at 14:19 +0200, Sagi Grimberg wrote: If user requested signature enable we Initialize relevant mlx5_ib_qp members. we mark the qp as sig_enable and we increase the effective SQ size, but still limit the user max_send_wr to original size computed. We also allow the create_qp routine to accept sig_enable create flag. Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/hw/mlx5/mlx5_ib.h |3 +++ drivers/infiniband/hw/mlx5/qp.c | 12 +--- include/linux/mlx5/qp.h |1 + 3 files changed, 13 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 79c4f14..e438f08 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -189,6 +189,9 @@ struct mlx5_ib_qp { int create_type; u32 pa_lkey; + + /* Store signature errors */ + boolsignature_en; }; struct mlx5_ib_cq_buf { diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index 7dfe8a1..01999f3 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c SNIP @@ -665,7 +671,7 @@ static int create_kernel_qp(struct mlx5_ib_dev *dev, int err; uuari = dev-mdev.priv.uuari; - if (init_attr-create_flags) + if (init_attr-create_flags ~IB_QP_CREATE_SIGNATURE_EN) return -EINVAL; if (init_attr-qp_type == MLX5_IB_QPT_REG_UMR) FYI, this particular block doesn't apply against = v3.14-rc2 code. Dropping it for now, and applying the rest as #5. Please fix if necessary. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/11] IB/iser: Replace fastreg descriptor valid bool with indicators container
On Sun, 2014-02-23 at 14:19 +0200, Sagi Grimberg wrote: In T10-PI support we will have memory keys for protection buffers and signature transactions. We prefer to compact indicators rather than keeping multiple bools. This commit does not change any functionality. Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/ulp/iser/iscsi_iser.h |8 ++-- drivers/infiniband/ulp/iser/iser_memory.c |4 ++-- drivers/infiniband/ulp/iser/iser_verbs.c |2 +- 3 files changed, 9 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h index 5ffa92f..5f7dbfd 100644 --- a/drivers/infiniband/ulp/iser/iscsi_iser.h +++ b/drivers/infiniband/ulp/iser/iscsi_iser.h @@ -280,13 +280,17 @@ struct iser_device { enum iser_data_dir cmd_dir); }; +enum iser_reg_indicator { + ISER_DATA_KEY_VALID = 1 0, +}; + struct fast_reg_descriptor { struct list_head list; /* For fast registration - FRWR */ struct ib_mr *data_mr; struct ib_fast_reg_page_list *data_frpl; - /* Valid for fast registration flag */ - bool valid; + /* registration indicators container */ + u8reg_indicators; }; struct iser_conn { diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c index 0e029fe..3edab18 100644 --- a/drivers/infiniband/ulp/iser/iser_memory.c +++ b/drivers/infiniband/ulp/iser/iser_memory.c @@ -479,7 +479,7 @@ static int iser_fast_reg_mr(struct iscsi_iser_task *iser_task, return -EINVAL; } - if (!desc-valid) { + if (!(desc-reg_indicators ISER_DATA_KEY_VALID)) { memset(inv_wr, 0, sizeof(inv_wr)); inv_wr.wr_id = ISER_FRWR_LI_WRID; inv_wr.opcode = IB_WR_LOCAL_INV; This block doesn't apply either.. ISER_FRWR_LI_WRID does not appear to be defined anywhere in v3.14-rc2 code..? Is this patch missing a new definition..? --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 00/10] Introduce Signature feature
On Sun, 2014-02-23 at 14:19 +0200, Sagi Grimberg wrote: This patchset Introduces Verbs level support for signature handover feature. Siganture is intended to implement end-to-end data integrity on a transactional basis in a completely offloaded manner. There are several end-to-end data integrity methods used today in various applications and/or upper layer protocols such as T10-DIF defined by SCSI specifications (SBC), CRC32, XOR8 and more. This patchset adds verbs support only for T10-DIF. The proposed framework allows adding more signature methods in the future. In T10-DIF, when a series of 512-byte data blocks are transferred, each block is followed by an 8-byte guard (note that other protection intervals may be used other then 512-bytes). The guard consists of CRC that protects the integrity of the data in the block, and tag that protects against mis-directed IOs and a free tag for application use. Data can be protected when transferred over the wire, but can also be protected in the memory of the sender/receiver. This allows true end- to-end protection against bits flipping either over the wire, through gateways, in memory, over PCI, etc. While T10-DIF clearly defines that over the wire protection guards are interleaved into the data stream (each 512-Byte block followed by 8-byte guard), when in memory, the protection guards may reside in a buffer separated from the data. Depending on the application, it is usually easier to handle the data when it is contiguous. In this case the data buffer will be of size 512xN and the protection buffer will be of size 8xN (where N is the number of blocks in the transaction). There are 3 kinds of signature handover operation: 1. Take unprotected data (from wire or memory) and ADD protection guards. 2. Take protetected data (from wire or memory), validate the data integrity against the protection guards and STRIP the protection guards. 3. Take protected data (from wire or memory), validate the data integrity against the protection guards and PASS the data with the guards as-is. This translates to defining to the HCA how/if data protection exists in memory domain, and how/if data protection exists is wire domain. The way that data integrity is performed is by using a new kind of memory region: signature-enabled MR, and a new kind of work request: REG_SIG_MR. The REG_SIG_MR WR operates on the signature-enabled MR, and defines all the needed information for the signature handover (data buffer, protection buffer if needed and signature attributes). The result is an MR that can be used for data transfer as usual, that will also add/validate/strip/pass protection guards. When the data transfer is successfully completed, it does not mean that there are no integrity errors. The user must afterwards check the signature status of the handover operation using a new light-weight verb. This feature shall be used in storage upper layer protocols iSER/SRP implementing end-to-end data integrity T10-DIF. Following this patchset, ib_iser/ib_isert will use these verbs for T10-PI offload support. Patchset summary: - Intoduce verbs for create/destroy memory regions supporting signature. - Introduce IB core signature verbs API. - Implement mr create/destroy verbs in mlx5 driver. - Preperation patches for signature support in mlx5 driver. - Implement signature handover work request in mlx5 driver. - Implement signature error collection and handling in mlx5 driver. Changes from v4: - Update to for-next 3.14-rc2 Series applied to target-pending/rdma-dif, minus the missing upstream check in patch #5. Thanks Sagi! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/11] IB/iser: Replace fastreg descriptor valid bool with indicators container
On Sun, 2014-02-23 at 22:53 -0800, Nicholas A. Bellinger wrote: On Sun, 2014-02-23 at 14:19 +0200, Sagi Grimberg wrote: In T10-PI support we will have memory keys for protection buffers and signature transactions. We prefer to compact indicators rather than keeping multiple bools. This commit does not change any functionality. Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/ulp/iser/iscsi_iser.h |8 ++-- drivers/infiniband/ulp/iser/iser_memory.c |4 ++-- drivers/infiniband/ulp/iser/iser_verbs.c |2 +- 3 files changed, 9 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h index 5ffa92f..5f7dbfd 100644 --- a/drivers/infiniband/ulp/iser/iscsi_iser.h +++ b/drivers/infiniband/ulp/iser/iscsi_iser.h @@ -280,13 +280,17 @@ struct iser_device { enum iser_data_dir cmd_dir); }; +enum iser_reg_indicator { + ISER_DATA_KEY_VALID = 1 0, +}; + struct fast_reg_descriptor { struct list_head list; /* For fast registration - FRWR */ struct ib_mr *data_mr; struct ib_fast_reg_page_list *data_frpl; - /* Valid for fast registration flag */ - bool valid; + /* registration indicators container */ + u8reg_indicators; }; struct iser_conn { diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c index 0e029fe..3edab18 100644 --- a/drivers/infiniband/ulp/iser/iser_memory.c +++ b/drivers/infiniband/ulp/iser/iser_memory.c @@ -479,7 +479,7 @@ static int iser_fast_reg_mr(struct iscsi_iser_task *iser_task, return -EINVAL; } - if (!desc-valid) { + if (!(desc-reg_indicators ISER_DATA_KEY_VALID)) { memset(inv_wr, 0, sizeof(inv_wr)); inv_wr.wr_id = ISER_FRWR_LI_WRID; inv_wr.opcode = IB_WR_LOCAL_INV; This block doesn't apply either.. ISER_FRWR_LI_WRID does not appear to be defined anywhere in v3.14-rc2 code..? Is this patch missing a new definition..? Er, missed the bit wrt: IB/iser: Suppress completion for fast registration work requests Applied along with ib_iser + ib_isert patches to target-pending/rdma-dif --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/mlx5: Fix mkey flag wrong assignment
On Thu, 2014-02-20 at 19:43 +0200, Sagi Grimberg wrote: Signed-off-by: Sagi Grimberg sa...@mellanox.com --- Applied + squashed into target-pending/rdma-dif for commit: IB/mlx5: Collect signature error completion --nab drivers/infiniband/hw/mlx5/qp.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index 22db279..5036a2e 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -1756,7 +1756,6 @@ static __be64 frwr_mkey_mask(void) result = MLX5_MKEY_MASK_LEN | MLX5_MKEY_MASK_PAGE_SIZE| MLX5_MKEY_MASK_START_ADDR | - MLX5_MKEY_MASK_EN_SIGERR| MLX5_MKEY_MASK_EN_RINVAL| MLX5_MKEY_MASK_KEY | MLX5_MKEY_MASK_LR | @@ -1778,6 +1777,7 @@ static __be64 sig_mkey_mask(void) MLX5_MKEY_MASK_PAGE_SIZE| MLX5_MKEY_MASK_START_ADDR | MLX5_MKEY_MASK_EN_RINVAL| + MLX5_MKEY_MASK_EN_SIGERR| MLX5_MKEY_MASK_KEY | MLX5_MKEY_MASK_LR | MLX5_MKEY_MASK_LW | -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 00/12] T10-DIF Initial support for iSER target
On Wed, 2014-02-19 at 17:50 +0200, Sagi Grimberg wrote: Hey Nic, I addressed your comments in the this set. I'll try to send the initiator code for review in the next couple of days. This patchset introduces target side T10-PI offload support over RDMA. Currently the implementation is for iSER transport but can be easily extended to SRP (or FCoE in the future). Should mention that this patchset depends on RDMA signature verbs making it for inclusion which will hopefully happen in near future. This code was tested against iSER legacy initiator, and also initiator that uses T10-PI offload as well. I'll clean up the initiator code in the following days and submit that as well. This code works under medium loads using backstores: - FileIO with DIF emulation. - RD with DIF emulation. - iBLOCK (scsi_debug with DIF support). Chnages from v1: - Rebased rdma_dif (3.14-rc2) - Target core: - Pass zero_flag=true to alloc_sgl for protection buffers - Removed Unneeded inline function rwprotect - Pass is_write bool to sbc_set_prot_op_checks - Send failure response for absence of protection buffers when needed. - iSER: - Removed unneeded assignments in isert_set_sig_attrs Chnages from v0: - Rebased from for-next - Target core: - Don't minor fixes for check_prot - to support transports that doesn't use submit_map_sgls. - file format - use escape values. - Removed redundant prot_handover. - Added protection checks and operation set. - iSER: - Added preperation routines for mapping/unmapping buffers to ease the amount of code in isert_reg_rdma. - Fixed print of DIF error (sector instead of offset). - Fix RDMA length for protection on wire domain. - Refactored reg_sig_mr to use helper routines. Hey Sagi, FYI, this series did not compile: drivers/infiniband/ulp/isert/ib_isert.c: In function ‘isert_reg_sig_mr’: drivers/infiniband/ulp/isert/ib_isert.c:2531:2: error: expected ‘;’ before ‘ret’ drivers/infiniband/ulp/isert/ib_isert.c: In function ‘isert_reg_rdma’: drivers/infiniband/ulp/isert/ib_isert.c:2611:24: warning: comparison between ‘enum target_prot_type’ and ‘enum target_prot_op’ [-Wenum-compare] drivers/infiniband/ulp/isert/ib_isert.c:2625:24: warning: comparison between ‘enum target_prot_type’ and ‘enum target_prot_op’ [-Wenum-compare] drivers/infiniband/ulp/isert/ib_isert.c:2667:43: warning: comparison between ‘enum target_prot_type’ and ‘enum target_prot_op’ [-Wenum-compare] drivers/infiniband/ulp/isert/ib_isert.c: In function ‘isert_put_datain’: drivers/infiniband/ulp/isert/ib_isert.c:2711:24: warning: comparison between ‘enum target_prot_type’ and ‘enum target_prot_op’ [-Wenum-compare] drivers/infiniband/ulp/isert/ib_isert.c:2732:24: warning: comparison between ‘enum target_prot_type’ and ‘enum target_prot_op’ [-Wenum-compare] drivers/infiniband/ulp/isert/ib_isert.c: At top level: drivers/infiniband/ulp/isert/ib_isert.c:2468:1: warning: ‘isert_set_sig_attrs’ defined but not used [-Wunused-function] drivers/target/target_core_transport.c: In function ‘transport_generic_new_cmd’: drivers/target/target_core_transport.c:2221:22: warning: comparison between ‘enum target_prot_type’ and ‘enum target_prot_op’ [-Wenum-compare] Squashing the following patch(es) into your original series, and applied to target-pending/rdma-dif. Thank you, --nab diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert. index 04bdd79..b7c8cd7 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -2527,7 +2527,7 @@ isert_reg_sig_mr(struct isert_conn *isert_conn, struct se_cmd *se_cmd, int ret; u32 key; - memset(sig_attrs, 0, sizeof(sig_attrs)) + memset(sig_attrs, 0, sizeof(sig_attrs)); ret = isert_set_sig_attrs(se_cmd, sig_attrs); if (ret) goto err; @@ -2608,7 +2608,7 @@ isert_reg_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd, return ret; if (wr-data.dma_nents != 1 || - se_cmd-prot_type != TARGET_PROT_NORMAL) { + se_cmd-prot_op != TARGET_PROT_NORMAL) { spin_lock_irqsave(isert_conn-conn_lock, flags); fr_desc = list_first_entry(isert_conn-conn_fr_pool, struct fast_reg_descriptor, list); @@ -2622,7 +2622,7 @@ isert_reg_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd, if (ret) goto unmap_cmd; - if (se_cmd-prot_type != TARGET_PROT_NORMAL) { + if (se_cmd-prot_op != TARGET_PROT_NORMAL) { struct ib_sge prot_sge, sig_sge; if (se_cmd-t_prot_sg) { @@ -2664,7 +2664,7 @@ isert_reg_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd, send_wr-opcode = IB_WR_RDMA_WRITE; send_wr-wr.rdma.remote_addr = isert_cmd-read_va; send_wr-wr.rdma.rkey = isert_cmd-read_stag;
Re: [PATCH v1 01/12] Target/transport: Allocate protection sg if needed
On Sun, 2014-02-16 at 19:38 +0200, Sagi Grimberg wrote: In case protection information is involved, allocate protection SG-list for transport. Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/target/target_core_transport.c | 12 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c index 24b4f65..f029bb7 100644 --- a/drivers/target/target_core_transport.c +++ b/drivers/target/target_core_transport.c @@ -2036,6 +2036,10 @@ static inline void transport_free_pages(struct se_cmd *cmd) transport_free_sgl(cmd-t_bidi_data_sg, cmd-t_bidi_data_nents); cmd-t_bidi_data_sg = NULL; cmd-t_bidi_data_nents = 0; + + transport_free_sgl(cmd-t_prot_sg, cmd-t_prot_nents); + cmd-t_prot_sg = NULL; + cmd-t_prot_nents = 0; } /** @@ -2199,6 +2203,14 @@ transport_generic_new_cmd(struct se_cmd *cmd) return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE; } + if (cmd-prot_type != TARGET_PROT_NORMAL) { + ret = target_alloc_sgl(cmd-t_prot_sg, +cmd-t_prot_nents, +cmd-prot_length, zero_flag); + if (ret 0) + return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE; + } + ret = target_alloc_sgl(cmd-t_data_sg, cmd-t_data_nents, cmd-data_length, zero_flag); if (ret 0) So the 'zero_flag' for the t_prot_sg allocation should always be true to avoid garbage in the app-tag field.. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 02/12] Target/sbc: Set protection operation and relevant checks
On Sun, 2014-02-16 at 19:38 +0200, Sagi Grimberg wrote: SBC-3 mandates the protection checks that must be performed in the rdprotect/wrprotect field. Use them. According to backstore device pi_attributes and cdb rdprotect/wrprotect. Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/target/target_core_sbc.c | 88 ++--- include/target/target_core_base.h |7 +++ 2 files changed, 88 insertions(+), 7 deletions(-) diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c index a448944..dfeb1c2 100644 --- a/drivers/target/target_core_sbc.c +++ b/drivers/target/target_core_sbc.c @@ -240,6 +240,11 @@ static inline u32 transport_lba_32(unsigned char *cdb) return (cdb[2] 24) | (cdb[3] 16) | (cdb[4] 8) | cdb[5]; } +static inline u8 rwprotect(unsigned char *cdb) +{ + return cdb[1] 5; +} + No need for this one-liner. Go ahead and inline instead.. static inline unsigned long long transport_lba_64(unsigned char *cdb) { unsigned int __v1, __v2; @@ -569,30 +574,95 @@ sbc_compare_and_write(struct se_cmd *cmd) return TCM_NO_SENSE; } +static int +sbc_set_prot_op_checks(u8 protect, enum target_prot_type prot_type, +u8 op, struct se_cmd *cmd) +{ + switch (op) { + case READ_10: + case READ_12: + case READ_16: + cmd-prot_op = protect ? TARGET_PROT_DIN_PASS : + TARGET_PROT_DIN_STRIP; + switch (protect) { + case 0x0: + case 0x1: + case 0x5: + cmd-prot_checks = TARGET_DIF_CHECK_GUARD; + if (prot_type == TARGET_DIF_TYPE1_PROT) + cmd-prot_checks |= TARGET_DIF_CHECK_REFTAG; + break; + case 0x2: + if (prot_type == TARGET_DIF_TYPE1_PROT) + cmd-prot_checks = TARGET_DIF_CHECK_REFTAG; + break; + case 0x3: + cmd-prot_checks = 0; + break; + case 0x4: + cmd-prot_checks = TARGET_DIF_CHECK_GUARD; + break; + default: + pr_err(Unsupported protect field %d\n, protect); + return -EINVAL; + } + break; + case WRITE_10: + case WRITE_12: + case WRITE_16: + case WRITE_VERIFY: + cmd-prot_op = protect ? TARGET_PROT_DOUT_PASS : + TARGET_PROT_DOUT_INSERT; + switch (protect) { + case 0x0: + case 0x3: + cmd-prot_checks = 0; + case 0x1: + case 0x5: + cmd-prot_checks = TARGET_DIF_CHECK_GUARD; + if (prot_type == TARGET_DIF_TYPE1_PROT) + cmd-prot_checks |= TARGET_DIF_CHECK_REFTAG; + break; + case 0x2: + if (prot_type == TARGET_DIF_TYPE1_PROT) + cmd-prot_checks = TARGET_DIF_CHECK_REFTAG; + break; + case 0x4: + cmd-prot_checks = TARGET_DIF_CHECK_GUARD; + break; + default: + pr_err(Unsupported protect field %d\n, protect); + return -EINVAL; + } + break; + default: + pr_err(ERROR: bad opcode %d\n, op); + return TCM_INVALID_CDB_FIELD; + } + + return 0; +} + sbc_set_prot_op_checks() is probably better served by passing in a bool to determine read/write, than performing a switch on individual CDB opcode. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 04/12] Target/sbc: don't return from sbc_check for non prot_sg
On Sun, 2014-02-16 at 19:38 +0200, Sagi Grimberg wrote: For transports which use generic new command these buffers have yet to be allocated. Instead check afterwards if command required prot buffers but none are provided. Also this way, target may support protection information against legacy initiators (writes are inserted and reads are stripped). Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/target/target_core_sbc.c |3 --- drivers/target/target_core_transport.c | 21 + 2 files changed, 17 insertions(+), 7 deletions(-) diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c index dfeb1c2..3f12726 100644 --- a/drivers/target/target_core_sbc.c +++ b/drivers/target/target_core_sbc.c @@ -649,9 +649,6 @@ sbc_check_prot(struct se_device *dev, struct se_cmd *cmd, unsigned char *cdb, { u8 protect = rwprotect(cdb); - if (!cmd-t_prot_sg || !cmd-t_prot_nents) - return true; - switch (dev-dev_attrib.pi_prot_type) { case TARGET_DIF_TYPE3_PROT: cmd-reftag_seed = 0x; diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c index f029bb7..3a5550a 100644 --- a/drivers/target/target_core_transport.c +++ b/drivers/target/target_core_transport.c @@ -1365,6 +1365,13 @@ int target_submit_cmd_map_sgls(struct se_cmd *se_cmd, struct se_session *se_sess target_put_sess_cmd(se_sess, se_cmd); return 0; } + + rc = target_setup_cmd_from_cdb(se_cmd, cdb); + if (rc != 0) { + transport_generic_request_failure(se_cmd, rc); + return 0; + } + /* * Save pointers for SGLs containing protection information, * if present. @@ -1374,11 +1381,17 @@ int target_submit_cmd_map_sgls(struct se_cmd *se_cmd, struct se_session *se_sess se_cmd-t_prot_nents = sgl_prot_count; } - rc = target_setup_cmd_from_cdb(se_cmd, cdb); - if (rc != 0) { - transport_generic_request_failure(se_cmd, rc); - return 0; + /* + * Fail if protection operation requiers protection + * information buffers but None are provided! + */ + if ((!se_cmd-t_prot_sg || !se_cmd-t_prot_nents) + (se_cmd-prot_op != TARGET_PROT_NORMAL)) { + pr_err(ERROR: protection information was requested but +protection buffers weren't provided.\n); + return -EINVAL; } A failure here requires the call to transport_generic_request_failure() to complete the request with CHECK_CONDITION status, and should return 0 following the failure case for target_setup_cmd_from_cdb() above.. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 00/12] T10-DIF Initial support for iSER target
On Sun, 2014-02-16 at 19:38 +0200, Sagi Grimberg wrote: Hey Nic and folks, This patchset introduces target side T10-PI offload support over RDMA. Currently the implementation is for iSER transport but can be easily extended to SRP (or FCoE in the future). Should mention that this patchset depends on RDMA signature verbs making it for inclusion which will hopefully happen in near future. This code was tested against iSER legacy initiator, and also initiator that uses T10-PI offload as well. I'll clean up the initiator code in the following days and submit that as well. Looking forward to seeing the initiator code in action. This code works under medium loads using backstores: - FileIO with DIF emulation. - RD with DIF emulation. - iBLOCK (scsi_debug with DIF support). Nic, Can you take this code to rdma_dif branch until RDMA Signature verbs set will pass submission? (after review of course...). Ok, a few mostly minor comments on this series. Please address these and repost. Also, target-pending/rdma-dif has been updated to v3.14-rc2 from the last RDMA Signature verbs patch series. Please take a moment to verify that it's update to date. Thanks! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srpt: replace strict_strtoul() with kstrtoul()
On Wed, 2014-02-05 at 11:22 +0900, Jingoo Han wrote: The usage of strict_strtoul() is not preferred, because strict_strtoul() is obsolete. Thus, kstrtoul() should be used. Signed-off-by: Jingoo Han jg1@samsung.com --- Applied to target-pending/queue. Thanks! --nab drivers/infiniband/ulp/srpt/ib_srpt.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c index 520a7e5..0e537d8 100644 --- a/drivers/infiniband/ulp/srpt/ib_srpt.c +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c @@ -3666,9 +3666,9 @@ static ssize_t srpt_tpg_attrib_store_srp_max_rdma_size( unsigned long val; int ret; - ret = strict_strtoul(page, 0, val); + ret = kstrtoul(page, 0, val); if (ret 0) { - pr_err(strict_strtoul() failed with ret: %d\n, ret); + pr_err(kstrtoul() failed with ret: %d\n, ret); return -EINVAL; } if (val MAX_SRPT_RDMA_SIZE) { @@ -3706,9 +3706,9 @@ static ssize_t srpt_tpg_attrib_store_srp_max_rsp_size( unsigned long val; int ret; - ret = strict_strtoul(page, 0, val); + ret = kstrtoul(page, 0, val); if (ret 0) { - pr_err(strict_strtoul() failed with ret: %d\n, ret); + pr_err(kstrtoul() failed with ret: %d\n, ret); return -EINVAL; } if (val MAX_SRPT_RSP_SIZE) { @@ -3746,9 +3746,9 @@ static ssize_t srpt_tpg_attrib_store_srp_sq_size( unsigned long val; int ret; - ret = strict_strtoul(page, 0, val); + ret = kstrtoul(page, 0, val); if (ret 0) { - pr_err(strict_strtoul() failed with ret: %d\n, ret); + pr_err(kstrtoul() failed with ret: %d\n, ret); return -EINVAL; } if (val MAX_SRPT_SRQ_SIZE) { @@ -3793,7 +3793,7 @@ static ssize_t srpt_tpg_store_enable( unsigned long tmp; int ret; - ret = strict_strtoul(page, 0, tmp); + ret = kstrtoul(page, 0, tmp); if (ret 0) { printk(KERN_ERR Unable to extract srpt_tpg_store_enable\n); return -EINVAL; -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux rdma 3.14 merge plans
Hi Roland, On Tue, 2014-01-21 at 23:27 -0800, Nicholas A. Bellinger wrote: Roland Co, On Tue, 2014-01-21 at 16:43 -0800, Roland Dreier wrote: On Tue, Jan 21, 2014 at 2:00 PM, Or Gerlitz or.gerl...@gmail.com wrote: Roland, ping! the signature patches were posted three months ago. We deserve a response from the maintainer that goes beyond I need to think on that. Responsiveness was stated by Linus to be the #1 requirement from kernel maintainers. Or, I'm not sure what response you're after from me. Linus has also said that maintainers should say no a lot more (http://lwn.net/Articles/571995/) so maybe you want me to say, No, I won't merge this patch set, since it adds a bunch of complexity to support a feature no one really cares about. Is that it? The patch set proposed by Sagi + Or is modest in terms of LOC to core IB code, and includes mostly mlx5 specific driver changes that enables HW offloads. (And yes I am skeptical about this stuff — I work at an enterprise storage company and even here it's hard to find anyone who cares about DIF/DIX, especially offload features that stop it from being end-to-end) My understanding is most HBAs capable of T10 PI offload in DIX PASS + VERIFY mode are already implementing DIX INSERT + STRIP modes in various capacities to support legacy environments. Beyond the DIX INSERT + STRIP case for enterprise storage, the amount of FC + SAS HBAs that already support T10 PI metadata is substantial. I'm sure you're not expecting me to say, Sure, I'll merge it without understanding the problem it's solving or how it's doing that, especially given the your recent history of pushing me to merge stuff like the IP-RoCE patches back when they broke the userspace ABI. With the merge window now upon us, there is a understandable reluctance to merge new features. Given the amount of time the series has spent on the list, it is however a good candidate to consider for an exception. Short of that, are you planning to accept the series for the next round once the current merge window closes..? We'd really like to start enabling fabrics with these types of offloads for v3.15. Now with the initial DIF backend taraget support in place for v3.14-rc1 code, we'd like to move forward on iser-target related pieces for T10 PI. Can you give us an estimate of when you'll have some time to give feedback on the outstanding patches..? --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux rdma 3.14 merge plans
Roland Co, On Tue, 2014-01-21 at 16:43 -0800, Roland Dreier wrote: On Tue, Jan 21, 2014 at 2:00 PM, Or Gerlitz or.gerl...@gmail.com wrote: Roland, ping! the signature patches were posted three months ago. We deserve a response from the maintainer that goes beyond I need to think on that. Responsiveness was stated by Linus to be the #1 requirement from kernel maintainers. Or, I'm not sure what response you're after from me. Linus has also said that maintainers should say no a lot more (http://lwn.net/Articles/571995/) so maybe you want me to say, No, I won't merge this patch set, since it adds a bunch of complexity to support a feature no one really cares about. Is that it? The patch set proposed by Sagi + Or is modest in terms of LOC to core IB code, and includes mostly mlx5 specific driver changes that enables HW offloads. (And yes I am skeptical about this stuff — I work at an enterprise storage company and even here it's hard to find anyone who cares about DIF/DIX, especially offload features that stop it from being end-to-end) My understanding is most HBAs capable of T10 PI offload in DIX PASS + VERIFY mode are already implementing DIX INSERT + STRIP modes in various capacities to support legacy environments. Beyond the DIX INSERT + STRIP case for enterprise storage, the amount of FC + SAS HBAs that already support T10 PI metadata is substantial. I'm sure you're not expecting me to say, Sure, I'll merge it without understanding the problem it's solving or how it's doing that, especially given the your recent history of pushing me to merge stuff like the IP-RoCE patches back when they broke the userspace ABI. With the merge window now upon us, there is a understandable reluctance to merge new features. Given the amount of time the series has spent on the list, it is however a good candidate to consider for an exception. Short of that, are you planning to accept the series for the next round once the current merge window closes..? We'd really like to start enabling fabrics with these types of offloads for v3.15. I'd really rather spend my time on something actually useful like cleaning up softroce. +1 for softroce + T10 PI support! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux rdma 3.14 merge plans
On Sat, 2014-01-18 at 13:42 -0800, Roland Dreier wrote: On Thu, Jan 16, 2014 at 1:14 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: I've reviewed the API from the perspective of what's required for implementing protection support in iser, and currently don't have any recommendations or objections beyond what has been proposed by Sagi Co in PATCH-v4 code. I guess I'm a little confused about why we need verbs support for this to implement DIF/DIX in iser. Isn't the whole point of protection to have end-to-end checksums, rather than having checksums computed by the transport after there's a chance for corruption? So to my knowledge, there are three target side DIX HBA modes of operation: - TARGET PASS: Fabric + Backend support PI - TARGET INSERT: Fabric does not support PI, backend supports PI - TARGET STRIP: Fabric supports PI, backend does not support PI The scenario your thinking about above is the 'TARGET INSERT' case, where the initiator does not generate PI, but the backend device on the target side expects PI, so the target fabric ends up generating PI on incoming WRITEs, and verifying + striping PI on outgoing READs. The scenario for 'TARGET STRIP' is when the initiator generates PI but the backend device does not support/process PI, so the target verifies + strips PI on incoming WRITESs, and inserts PI on outgoing READs. Your correct that both of these modes don't provide true end-to-end protection, and my understanding is that they are provided as a way to accommodate existing fabrics + backend devices where PI is not supported all the way through the stack. The 'TARGET PASS' is the scenario that provides true end-to-end guarantees, where for WRITEs PI is generated by the Host OS, verified + passed on the initiator side HBA, verified + passed on the target HBA, and verified + stored on the device backend. For READs, PI is retrieved from the backend device, verified + passed on the target HBA, verified + passed on the initiator HBA, and finally verified on the Host OS. So in the proposed RDMA VERBs changes these three modes of target DIX operation are supported. Also it's my understanding (Sagi Co, please correct me), that the proposed changes are implemented to be independent of target/initiator mode DIX operation. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux rdma 3.14 merge plans
On Thu, 2014-01-16 at 12:13 -0800, Roland Dreier wrote: On Mon, 2014-01-13 at 12:32 -0800, Nicholas A. Bellinger wrote: Hi Roland Co, Just curious if your planning to take a look at this series soon for v3.14 code..? As you might imagine, I'd like to see it merged this round in order to move forward on iser-target protection for v3.15. Are there any specific issues that you'd like to see addressed for an initial merge..? I haven't really had a chance to look at the new API and decide if it makes sense as the way to expose these features. Have you looked at the new kernel API? Any thoughts? I've reviewed the API from the perspective of what's required for implementing protection support in iser, and currently don't have any recommendations or objections beyond what has been proposed by Sagi Co in PATCH-v4 code. How does it compare with how other subsystems have exposed protection info? So there is not a interface in SCSI land for interacting (directly) with hardware protection support, as it's primarily just telling SCSI what protection modes are supported while the rest is implemented in vendor specific firmware interfaces. (CC'ing MKP) Right now I'm dealing with fixing the fallout from picking up the IP addressing for IBoE and other patch sets that were supposedly ready to merge for a long time so I'm not sure I'll get to the protection changes in time for 3.14. Pretty please for v3.14..? Sagi Co are committed to supporting the changes, and are ready to do what's necessary for a merge. AFAICT the series has gone through enough list review over the last months without (serious) objections, so I'm not sure what else is necessary to move forward for a merge, aside from your input of course. :-) --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux rdma 3.14 merge plans
On Thu, 2014-01-16 at 23:42 +0200, Or Gerlitz wrote: On Thu, Jan 16, 2014 at 11:37 PM, Greg Kroah-Hartman gre...@linuxfoundation.org wrote: On Thu, Jan 16, 2014 at 01:14:12PM -0800, Nicholas A. Bellinger wrote: On Thu, 2014-01-16 at 12:13 -0800, Roland Dreier wrote: SNIP I haven't really had a chance to look at the new API and decide if it makes sense as the way to expose these features. Have you looked at the new kernel API? Any thoughts? I've reviewed the API from the perspective of what's required for implementing protection support in iser, and currently don't have any recommendations or objections beyond what has been proposed by Sagi Co in PATCH-v4 code. How does it compare with how other subsystems have exposed protection info? So there is not a interface in SCSI land for interacting (directly) with hardware protection support, as it's primarily just telling SCSI what protection modes are supported while the rest is implemented in vendor specific firmware interfaces. (CC'ing MKP) Right now I'm dealing with fixing the fallout from picking up the IP addressing for IBoE and other patch sets that were supposedly ready to merge for a long time so I'm not sure I'll get to the protection changes in time for 3.14. Pretty please for v3.14..? It's _really_ late in the development cycle for new stuff for 3.14. My trees have been closed for almost a week now, for major stuff, and for anything else for a few days now. It would be good if you could get your patchsets into the 0-day testing bot earlier to shake out any build issues that might happen with them, please work with that developer to help make the merging of the code easier. Greg, The T10 patches were posted whole three months ago (V0 Oct 15th). I don't see why another cycle should get lost just because there was no maintainer feedback on them throughout this whole period. There's enough time for us to fix things that will show up in the testing bot before Roland sends his pull request over the two weeks merge window. So as Greg noted, it's still useful to get this into the 0-day build testing now. That said, pushing the -v4 series into target-pending/rdma-dif, and Fengguang's scripts should be picking it up shortly. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux rdma 3.14 merge plans
On Mon, 2014-01-13 at 12:32 -0800, Nicholas A. Bellinger wrote: Hi Roland Co, Just curious if your planning to take a look at this series soon for v3.14 code..? As you might imagine, I'd like to see it merged this round in order to move forward on iser-target protection for v3.15. Are there any specific issues that you'd like to see addressed for an initial merge..? Hi again Roland, Ping^2..? We're quite eager to see this code merged this round. Is there a specific issue on these patches preventing a v3.14 merge..? Thank you, --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/11] IB/isert: seperate connection protection domains and dma MRs
Hi Sagi, On Thu, 2014-01-09 at 18:40 +0200, Sagi Grimberg wrote: It is more correct to seperate connections protection domains and dma_mr handles. protection information support requires to do so. Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/ulp/isert/ib_isert.c | 46 --- drivers/infiniband/ulp/isert/ib_isert.h |2 - 2 files changed, 24 insertions(+), 24 deletions(-) Applied to for-next, given this patch does not depend on the proposed VERBS protection changes. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 03/11] IB/isert: Avoid frwr notation, user fastreg
On Thu, 2014-01-09 at 18:40 +0200, Sagi Grimberg wrote: Use fast registration lingo. fast registration will also incorporate signature/DIF registration. Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/ulp/isert/ib_isert.c | 84 --- drivers/infiniband/ulp/isert/ib_isert.h |8 ++-- 2 files changed, 47 insertions(+), 45 deletions(-) Applied to for-next. Thanks Sagi! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 04/11] IB/isert: Move fastreg descriptor creation to a function
On Thu, 2014-01-09 at 18:40 +0200, Sagi Grimberg wrote: This routine may be called both by fast registration descriptors for data and for integrity buffers. This patch does not change any functionality. Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/ulp/isert/ib_isert.c | 52 +++ 1 files changed, 32 insertions(+), 20 deletions(-) Also applied to for-next. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 07/11] IB/isert: pass scatterlist instead of cmd to fast_reg_mr routine
On Thu, 2014-01-09 at 18:40 +0200, Sagi Grimberg wrote: This routine may help for protection registration as well. This patch does not change any functionality. Signed-off-by: Sagi Grimberg sa...@mellanox.com --- drivers/infiniband/ulp/isert/ib_isert.c | 28 1 files changed, 12 insertions(+), 16 deletions(-) Also applied to for-next. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux rdma 3.14 merge plans
Hi Roland Co, Just curious if your planning to take a look at this series soon for v3.14 code..? As you might imagine, I'd like to see it merged this round in order to move forward on iser-target protection for v3.15. Are there any specific issues that you'd like to see addressed for an initial merge..? Thank you, --nab On Wed, 2014-01-08 at 11:37 +0200, Or Gerlitz wrote: On 08/01/2014 02:51, Roland Dreier wrote: The data integrity stuff I'm not so sure about. Sean raised some I think legitimate questions about whether all this should be added to the verbs API and I want to see more discussion or at least have a deep think about this myself before comitting. Well, re the deep think thing, we have problem here -- the patches were posted three months ago, on Oct 15th 2013, and so far not a single bit of input from you. Can you take a look on that as soon as possible and let us know your thoughts? As for the questions posed by Sean -- Sagi, Tzahi and myself provided detailed answers. Taking into account the small scale of the rdma community, I don't see any other choice other than the guy wearing the maintainer hat to step in and provide his say or follow up questions that would cut things out or revive the discussion. Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] iser-target: fix error return code in isert_create_device_ib_res()
On Mon, 2013-12-02 at 15:00 +0100, Jiri Kosina wrote: On Tue, 29 Oct 2013, Wei Yongjun wrote: From: Wei Yongjun yongjun_...@trendmicro.com.cn Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun yongjun_...@trendmicro.com.cn This one has been sitting in my queue for quite some time, but I'd like the actual infiniband maintainers to be acted upon by. Roland, please? Hi Jiri, Wei Co, Apologies for missing this one earlier.. This patch has been queued with a CC' for v3.10+ stable to target-pending/master, and will be included in the next for-v3.13 fixes PULL request. Thank you, --nab --- drivers/infiniband/ulp/isert/ib_isert.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 6df2350..2ba71c0 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -263,21 +263,29 @@ isert_create_device_ib_res(struct isert_device *device) isert_cq_event_callback, (void *)cq_desc[i], ISER_MAX_RX_CQ_LEN, i); - if (IS_ERR(device-dev_rx_cq[i])) + if (IS_ERR(device-dev_rx_cq[i])) { + ret = PTR_ERR(device-dev_rx_cq[i]); + device-dev_rx_cq[i] = NULL; goto out_cq; + } device-dev_tx_cq[i] = ib_create_cq(device-ib_device, isert_cq_tx_callback, isert_cq_event_callback, (void *)cq_desc[i], ISER_MAX_TX_CQ_LEN, i); - if (IS_ERR(device-dev_tx_cq[i])) + if (IS_ERR(device-dev_tx_cq[i])) { + ret = PTR_ERR(device-dev_tx_cq[i]); + device-dev_tx_cq[i] = NULL; goto out_cq; + } - if (ib_req_notify_cq(device-dev_rx_cq[i], IB_CQ_NEXT_COMP)) + ret = ib_req_notify_cq(device-dev_rx_cq[i], IB_CQ_NEXT_COMP); + if (ret) goto out_cq; - if (ib_req_notify_cq(device-dev_tx_cq[i], IB_CQ_NEXT_COMP)) + ret = ib_req_notify_cq(device-dev_tx_cq[i], IB_CQ_NEXT_COMP); + if (ret) goto out_cq; } -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] iser-target: Move INIT_WORK setup into isert_create_device_ib_res
From: Nicholas Bellinger n...@linux-iscsi.org This patch moves INIT_WORK setup for cq_desc-cq_[rx,tx]_work into isert_create_device_ib_res(), instead of being done each callback invocation in isert_cq_[rx,tx]_callback(). This also fixes a 'INFO: trying to register non-static key' warning when cancel_work_sync() is called before INIT_WORK has setup the struct work_struct. Reported-by: Or Gerlitz ogerl...@mellanox.com Cc: Or Gerlitz ogerl...@mellanox.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/infiniband/ulp/isert/ib_isert.c |6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 78f6e92..9804fca 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -207,7 +207,9 @@ isert_free_rx_descriptors(struct isert_conn *isert_conn) isert_conn-conn_rx_descs = NULL; } +static void isert_cq_tx_work(struct work_struct *); static void isert_cq_tx_callback(struct ib_cq *, void *); +static void isert_cq_rx_work(struct work_struct *); static void isert_cq_rx_callback(struct ib_cq *, void *); static int @@ -259,6 +261,7 @@ isert_create_device_ib_res(struct isert_device *device) cq_desc[i].device = device; cq_desc[i].cq_index = i; + INIT_WORK(cq_desc[i].cq_rx_work, isert_cq_rx_work); device-dev_rx_cq[i] = ib_create_cq(device-ib_device, isert_cq_rx_callback, isert_cq_event_callback, @@ -270,6 +273,7 @@ isert_create_device_ib_res(struct isert_device *device) goto out_cq; } + INIT_WORK(cq_desc[i].cq_tx_work, isert_cq_tx_work); device-dev_tx_cq[i] = ib_create_cq(device-ib_device, isert_cq_tx_callback, isert_cq_event_callback, @@ -1732,7 +1736,6 @@ isert_cq_tx_callback(struct ib_cq *cq, void *context) { struct isert_cq_desc *cq_desc = (struct isert_cq_desc *)context; - INIT_WORK(cq_desc-cq_tx_work, isert_cq_tx_work); queue_work(isert_comp_wq, cq_desc-cq_tx_work); } @@ -1776,7 +1779,6 @@ isert_cq_rx_callback(struct ib_cq *cq, void *context) { struct isert_cq_desc *cq_desc = (struct isert_cq_desc *)context; - INIT_WORK(cq_desc-cq_rx_work, isert_cq_rx_work); queue_work(isert_rx_wq, cq_desc-cq_rx_work); } -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] ib_isert: Add support for completion interrupt coalescing
From: Nicholas Bellinger n...@linux-iscsi.org This patch adds support for completion interrupt coalescing that allows only every ISERT_COMP_BATCH_COUNT (8) to set IB_SEND_SIGNALED, thus avoiding completion interrupts for every posted iser_tx_desc. The batch processing is done using a per isert_conn llist that once IB_SEND_SIGNALED has been set is saved to tx_desc-comp_llnode_batch, and completion processing of previously posted iser_tx_descs is done in a single shot from within isert_send_completion() code. Note this is only done for response PDUs from ISCSI_OP_SCSI_CMD, and all other control type of PDU responses will force an implicit batch drain to occur. Cc: Or Gerlitz ogerl...@mellanox.com Cc: Sagi Grimberg sa...@mellanox.com Cc: Kent Overstreet k...@daterainc.com Cc: Roland Dreier rol...@purestorage.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/infiniband/ulp/isert/ib_isert.c | 63 +-- drivers/infiniband/ulp/isert/ib_isert.h |6 +++ 2 files changed, 57 insertions(+), 12 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 3591855..27708c3 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -22,6 +22,7 @@ #include linux/socket.h #include linux/in.h #include linux/in6.h +#include linux/llist.h #include rdma/ib_verbs.h #include rdma/rdma_cm.h #include target/target_core_base.h @@ -489,6 +490,7 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event) kref_init(isert_conn-conn_kref); kref_get(isert_conn-conn_kref); mutex_init(isert_conn-conn_mutex); + mutex_init(isert_conn-conn_comp_mutex); spin_lock_init(isert_conn-conn_lock); cma_id-context = isert_conn; @@ -843,14 +845,32 @@ isert_init_tx_hdrs(struct isert_conn *isert_conn, } static void -isert_init_send_wr(struct isert_cmd *isert_cmd, struct ib_send_wr *send_wr) +isert_init_send_wr(struct isert_conn *isert_conn, struct isert_cmd *isert_cmd, + struct ib_send_wr *send_wr, bool coalesce) { + struct iser_tx_desc *tx_desc = isert_cmd-tx_desc; + isert_cmd-rdma_wr.iser_ib_op = ISER_IB_SEND; send_wr-wr_id = (unsigned long)isert_cmd-tx_desc; send_wr-opcode = IB_WR_SEND; - send_wr-send_flags = IB_SEND_SIGNALED; - send_wr-sg_list = isert_cmd-tx_desc.tx_sg[0]; + send_wr-sg_list = tx_desc-tx_sg[0]; send_wr-num_sge = isert_cmd-tx_desc.num_sge; + /* +* Coalesce send completion interrupts by only setting IB_SEND_SIGNALED +* bit for every ISERT_COMP_BATCH_COUNT number of ib_post_send() calls. +*/ + mutex_lock(isert_conn-conn_comp_mutex); + if (coalesce + ++isert_conn-conn_comp_batch ISERT_COMP_BATCH_COUNT) { + llist_add(tx_desc-comp_llnode, isert_conn-conn_comp_llist); + mutex_unlock(isert_conn-conn_comp_mutex); + return; + } + isert_conn-conn_comp_batch = 0; + tx_desc-comp_llnode_batch = llist_del_all(isert_conn-conn_comp_llist); + mutex_unlock(isert_conn-conn_comp_mutex); + + send_wr-send_flags = IB_SEND_SIGNALED; } static int @@ -1582,8 +1602,8 @@ isert_response_completion(struct iser_tx_desc *tx_desc, } static void -isert_send_completion(struct iser_tx_desc *tx_desc, - struct isert_conn *isert_conn) +__isert_send_completion(struct iser_tx_desc *tx_desc, + struct isert_conn *isert_conn) { struct ib_device *ib_dev = isert_conn-conn_cm_id-device; struct isert_cmd *isert_cmd = tx_desc-isert_cmd; @@ -1624,6 +1644,24 @@ isert_send_completion(struct iser_tx_desc *tx_desc, } static void +isert_send_completion(struct iser_tx_desc *tx_desc, + struct isert_conn *isert_conn) +{ + struct llist_node *llnode = tx_desc-comp_llnode_batch; + struct iser_tx_desc *t; + /* +* Drain coalesced completion llist starting from comp_llnode_batch +* setup in isert_init_send_wr(), and then complete trailing tx_desc. +*/ + while (llnode) { + t = llist_entry(llnode, struct iser_tx_desc, comp_llnode); + llnode = llist_next(llnode); + __isert_send_completion(t, isert_conn); + } + __isert_send_completion(tx_desc, isert_conn); +} + +static void isert_cq_comp_err(struct iser_tx_desc *tx_desc, struct isert_conn *isert_conn) { struct ib_device *ib_dev = isert_conn-conn_cm_id-device; @@ -1793,7 +1831,7 @@ isert_put_response(struct iscsi_conn *conn, struct iscsi_cmd *cmd) isert_cmd-tx_desc.num_sge = 2; } - isert_init_send_wr(isert_cmd, send_wr); + isert_init_send_wr(isert_conn, isert_cmd, send_wr, true); pr_debug(Posting SCSI Response IB_WR_SEND \n); @@ -1813,7 +1851,7 @@ isert_put_nopin(struct
Re: [PATCH RFC v2 00/10] Introduce Signature feature
On Tue, 2013-11-05 at 11:13 +0200, Sagi Grimberg wrote: On 11/4/2013 8:41 PM, Nicholas A. Bellinger wrote: On Sat, 2013-11-02 at 14:57 -0700, Bart Van Assche wrote: On 1/11/2013 18:36, Nicholas A. Bellinger wrote: On Fri, 2013-11-01 at 08:03 -0700, Bart Van Assche wrote: On 31/10/2013 5:24, Sagi Grimberg wrote: In T10-DIF, when a series of 512-byte data blocks are transferred, each block is followed by an 8-byte guard. The guard consists of CRC that protects the integrity of the data in the block, and some other tags that protects against mis-directed IOs. Shouldn't that read logical block length divided by 2**(protection interval exponent) instead of 512 ? From the SPC-4 FORMAT UNIT section: Why should the protection interval in FORMAT_UNIT be mentioned when it's not supported by the hardware, nor by drivers/scsi/sd_dif.c itself..? Hello Nick, My understanding is that this patch series is not only intended for initiator drivers but also for target drivers like ib_srpt and ib_isert. As you know target drivers do not restrict the initiator operating system to Linux. Although I do not know whether there are already operating systems that support the protection interval exponent, It's my understanding that Linux is still the only stack that supports DIF, so AFAICT no one is actually supporting this. I think it is a good idea to stay as close as possible to the terminology of the SPC-4 standard. No, in this context it only adds pointless misdirection because 1) The hardware in question doesn't support it, and 2) Linux itself doesn't support it. I think that Bart is suggesting renaming block_size as pi_interval in ib_sig_domain. I tend to agree since even if support for that does not exist yet, it might be in the future. I think it is not a misdirection because it does represent the protection information interval. The point is that changing the description from what the patch actually does, to something it does not do in order to 'stay as close as possible to the terminology of the SPC-4 standard' is pointlessly confusing. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC v2 00/10] Introduce Signature feature
On Sat, 2013-11-02 at 14:57 -0700, Bart Van Assche wrote: On 1/11/2013 18:36, Nicholas A. Bellinger wrote: On Fri, 2013-11-01 at 08:03 -0700, Bart Van Assche wrote: On 31/10/2013 5:24, Sagi Grimberg wrote: In T10-DIF, when a series of 512-byte data blocks are transferred, each block is followed by an 8-byte guard. The guard consists of CRC that protects the integrity of the data in the block, and some other tags that protects against mis-directed IOs. Shouldn't that read logical block length divided by 2**(protection interval exponent) instead of 512 ? From the SPC-4 FORMAT UNIT section: Why should the protection interval in FORMAT_UNIT be mentioned when it's not supported by the hardware, nor by drivers/scsi/sd_dif.c itself..? Hello Nick, My understanding is that this patch series is not only intended for initiator drivers but also for target drivers like ib_srpt and ib_isert. As you know target drivers do not restrict the initiator operating system to Linux. Although I do not know whether there are already operating systems that support the protection interval exponent, It's my understanding that Linux is still the only stack that supports DIF, so AFAICT no one is actually supporting this. I think it is a good idea to stay as close as possible to the terminology of the SPC-4 standard. No, in this context it only adds pointless misdirection because 1) The hardware in question doesn't support it, and 2) Linux itself doesn't support it. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC v2 00/10] Introduce Signature feature
On Fri, 2013-11-01 at 08:03 -0700, Bart Van Assche wrote: On 31/10/2013 5:24, Sagi Grimberg wrote: In T10-DIF, when a series of 512-byte data blocks are transferred, each block is followed by an 8-byte guard. The guard consists of CRC that protects the integrity of the data in the block, and some other tags that protects against mis-directed IOs. Shouldn't that read logical block length divided by 2**(protection interval exponent) instead of 512 ? From the SPC-4 FORMAT UNIT section: Why should the protection interval in FORMAT_UNIT be mentioned when it's not supported by the hardware, nor by drivers/scsi/sd_dif.c itself..? --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH]SRP: fix task management handle in srp
On Fri, 2013-09-27 at 12:55 +0200, Jack Wang wrote: On 09/27/2013 12:30 PM, Bart Van Assche wrote: On 09/27/13 11:20, Jack Wang wrote: Hi all, Currently handle of srp_rsp for task management is broken. in 6.9 T10/1415-D revision 16a SRP_RSP responses that contain either RESPONSE DATA or SENSE DATA shall be sent as the minimum length message containing those fields. LENGTH field specify the number of bytes in the RESPONSE DATA field. RSPVALID set to one also indicates that the contents of the STATUS field shall be ignored by the SRP initiator port. If response data is provided, RSPVALID shall be set to one and the RESPONSE DATA LIST LENGTH field shall specify the number of bytes in the RESPONSE DATA field (see table 23). The RESPONSE DATA LIST LENGTH field shall contain the value four. Other lengths are reserved for future standardization. If no response data is provided, RSPVALID shall be set to zero. The SRP initiator port shall ignore the RESPONSE DATA LIST LENGTH field and shall assume a length of zero. Response data shall be provided in any SRP_RSP response that is sent in response to an SRP_TSK_MGMT request (see 6.7). The information in the RSP_CODE field (see table 24) shall indicate the completion status of the task management function. Response data shall not be provided in any SRP_RSP response that returns a non-zero status code in the STATUS field. The STATUS field contains the status of a task that completes. Patch made against v3.12-rc1 From 5f5af6de8dd72e37448841b7d7735d3eea4d3d83 Mon Sep 17 00:00:00 2001 From: Jack Wang jinpu.w...@profitbricks.com Date: Fri, 27 Sep 2013 11:10:05 +0200 Subject: [PATCH] IB/srp: fix task management handle in srp Currently the srp driver process task manamgement command in wrong way it just ignore the return srp_rsp for successful case eg rsp-status is success, fix this. Signed-off-by: Jack Wang jinpu.w...@profitbricks.com Reviewed-by: Dongsu Park dongsu.p...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index f93baf8..5e1f1bf 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -1145,9 +1145,11 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp) target-req_lim += be32_to_cpu(rsp-req_lim_delta); spin_unlock_irqrestore(target-lock, flags); -target-tsk_mgmt_status = -1; -if (be32_to_cpu(rsp-resp_data_len) = 4) -target-tsk_mgmt_status = rsp-data[3]; +target-tsk_mgmt_status = rsp-status; +if (rsp-flags SRP_RSP_FLAG_RSPVALID) { +if (be32_to_cpu(rsp-resp_data_len) = 4) +target-tsk_mgmt_status = rsp-data[3]; +} complete(target-tsk_mgmt_done); } else { req = target-req_ring[rsp-tag]; @@ -1739,7 +1741,7 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target, msecs_to_jiffies(SRP_ABORT_TIMEOUT_MS))) return -1; -return 0; +return target-tsk_mgmt_status; } static int srp_abort(struct scsi_cmnd *scmnd) @@ -1776,8 +1778,6 @@ static int srp_reset_device(struct scsi_cmnd *scmnd) if (srp_send_tsk_mgmt(target, SRP_TAG_NO_REQ, scmnd-device-lun, SRP_TSK_LUN_RESET)) return FAILED; -if (target-tsk_mgmt_status) -return FAILED; for (i = 0; i SRP_CMD_SQ_SIZE; ++i) { struct srp_request *req = target-req_ring[i]; Good catch, however: - I think the STATUS field only has a meaning in replies to regular SCSI commands but not in SRP_TSK_MGMT replies. - If the spec says that a target driver should always provide response data in response to a SRP_TSK_MGMT request, isn't it the target that should be modified instead of the initiator ? Bart. Hello Bart, It's a little vague in the srp spec about status definition(as it only a draft version several years old without any update), I think it is quite efficient to also use status for succesful task management commands too. SAS has similar usage for that as I know. But it will be great if you could fix srpt in SCST:) It would be even better if someone sent a patch for srpt in mainline as well, so I don't have to fish out bug fixes from an out of tree codebase months (or years) after the fact. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC-v4 7/9] iscsi-target: Refactor TX queue logic + export response PDU creation
On Fri, 2013-05-03 at 23:04 +0200, Geert Uytterhoeven wrote: Hi Nicholas, On Fri, Apr 12, 2013 at 10:52 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c static int iscsit_send_reject( struct iscsi_cmd *cmd, struct iscsi_conn *conn) @@ -3505,18 +3548,9 @@ static int iscsit_send_reject( struct iscsi_reject *hdr; struct kvec *iov; - hdr = (struct iscsi_reject *) cmd-pdu; Woops, and now hdr is no longer initialized: drivers/target/iscsi/iscsi_target.c: In function ‘iscsit_send_reject’: drivers/target/iscsi/iscsi_target.c:3577: warning: ‘hdr’ is used uninitialized in this function - hdr-opcode = ISCSI_OP_REJECT; - hdr-flags |= ISCSI_FLAG_CMD_FINAL; - hton24(hdr-dlength, ISCSI_HDR_LEN); - hdr- = cpu_to_be32(0x); - cmd-stat_sn= conn-stat_sn++; - hdr-statsn = cpu_to_be32(cmd-stat_sn); - hdr-exp_cmdsn = cpu_to_be32(conn-sess-exp_cmd_sn); - hdr-max_cmdsn = cpu_to_be32(conn-sess-max_cmd_sn); + iscsit_build_reject(cmd, conn, (struct iscsi_reject *)cmd-pdu[0]); Hence it will crash later: iscsit_do_crypto_hash_buf(conn-conn_tx_hash, (unsigned char *)hdr, ISCSI_HDR_LEN, 0, NULL, (u8 *)header_digest); and pr_debug(Built Reject PDU StatSN: 0x%08x, Reason: 0x%02x, CID: %hu\n, ntohl(hdr-statsn), hdr-reason, conn-cid); Whoops. Applying the following patch to fix this up now, and including in the next PULL request. Thanks alot for catching this! --nab diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index ffbc6a9..c230eac 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -3557,11 +3557,11 @@ static int iscsit_send_reject( struct iscsi_cmd *cmd, struct iscsi_conn *conn) { - u32 iov_count = 0, tx_size = 0; - struct iscsi_reject *hdr; + struct iscsi_reject *hdr = (struct iscsi_reject *)cmd-pdu[0]; struct kvec *iov; + u32 iov_count = 0, tx_size; - iscsit_build_reject(cmd, conn, (struct iscsi_reject *)cmd-pdu[0]); + iscsit_build_reject(cmd, conn, hdr); iov = cmd-iov_misc[0]; iov[iov_count].iov_base = cmd-pdu; -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: initial LIO iSER performance numbers [was: GIT PULL] target updates for v3.10-rc1)
On Thu, 2013-05-02 at 16:58 +0300, Or Gerlitz wrote: On 30/04/2013 05:59, Nicholas A. Bellinger wrote: Hello Linus! Here are the target pending changes for the v3.10-rc1 merge window. Please go ahead and pull from: git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git for-next-merge SNIP Hi Nic, everyone, So LIO iser target code is now merged into Linus tree, and will be in kernel 3.10, exciting! Here's some data on raw performance numbers we were able to get with the LIO iser code. For single initiator and single lun, block sizes varying over the range 1KB,2KB... 128KB doing random read: 1KB 227,870K 2KB 458,099K 4KB 909,761K 8KB 1,679,922K 16KB 3,233,753K 32KB 4,905,139K 64KB 5,294,873K 128KB 5,565,235K When enlarging the number of luns and still with single initiator, for 1KB randomreads we get: 1 LUN = 230k IOPS 2 LUNs = 420k IOPS 4 LUNs = 740k IOPS When enlarging the number of initiators, and each having four lunswe get for 1KB random reads: 1 initiator x 4 LUNs = 740k IOPS 2 initiators x 4 LUNs = 1480k IOPS 3 initiators x 4 LUNs = 1570k IOPS So all in all, things scale pretty nicely, and we observe a some bottleneck in the IOPS rate around 1.6 Million IOPS, so there's where to improve... Excellent. Thanks for the posting these initial performance results. Here's the fio command line used by the initiators $ fio --cpumask=0xfc --rw=randread --bs=1k --numjobs=2 --iodepth=128 --runtime=62 --time_based --size=1073741824k --loops=1 --ioengine=libaio --direct=1 --invalidate=1 --fsync_on_close=1 --randrepeat=1 --norandommap --group_reporting --exitall --name dev-sdb-randread-1k-2thr-libaio-128iodepth-62sec --filename=/dev/sdb And some details on the setup: The nodes are HP ProLiant DL380p Gen8 with the following CPU: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz two NUMA nodes with eight cores each, 32GB RAM, PCI express gen3 8x, the HCA being Mellanox ConnectX3 with firmware 2.11.500 The target node was running upstream kernel and the initiators RHEL 6.3 kernel, all X86_64 We used RAMDISK_MCP backend which was patched to act as NULL device, so we can test the raw iSER wire performance. Btw, I'll be including a similar patch to allow for RAMDISK_NULL to be configured as a NULL device mode. Thanks Or Co! --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/for-next 0/3] iscsi-target/iser-target: Fix breakage
From: Nicholas Bellinger n...@linux-iscsi.org Hi folks, This series addresses two bugs within v3.10 for-next code related to the upcoming iscsi-target series adding multi-transport support, along with a third iser-target bugfix related to early ISCSI_OP_SCSI_CMD exception handling. These three patches are being folded into for-next/for-next-merge code now.. Thanks! --nab Nicholas Bellinger (3): iscsi-target: Fix solicited NopIN handling with RDMAExtentions=No iscsi-target: Fix iscsit_handle_scsi_cmd() exception se_cmd leak iser-target: Add special case for early ISCSI_OP_SCSI_CMD exception handling drivers/infiniband/ulp/isert/ib_isert.c |7 ++ drivers/target/iscsi/iscsi_target.c | 32 ++ 2 files changed, 26 insertions(+), 13 deletions(-) -- 1.7.2.5 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/for-next 3/3] iser-target: Add special case for early ISCSI_OP_SCSI_CMD exception handling
From: Nicholas Bellinger n...@linux-iscsi.org During early ISCSI_OP_SCSI_CMD exception handling with non GOOD status, the TX thread context in isert_response_queue() needs to post IB_WR_SEND via isert_put_response() when processing ISTATE_SEND_STATUS. Cc: Or Gerlitz ogerl...@mellanox.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/infiniband/ulp/isert/ib_isert.c |7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 1a150ef..41712f0 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -1947,6 +1947,13 @@ isert_response_queue(struct iscsi_conn *conn, struct iscsi_cmd *cmd, int state) case ISTATE_SEND_REJECT: ret = isert_put_reject(cmd, conn); break; + case ISTATE_SEND_STATUS: + /* +* Special case for sending non GOOD SCSI status from TX thread +* context during pre se_cmd excecution failure. +*/ + ret = isert_put_response(conn, cmd); + break; default: pr_err(Unknown response state: 0x%02x\n, state); ret = -EINVAL; -- 1.7.2.5 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/for-next 2/3] iscsi-target: Fix iscsit_handle_scsi_cmd() exception se_cmd leak
From: Nicholas Bellinger n...@linux-iscsi.org This patch fixes a regression where failures before backend se_cmd execution in iscsit_handle_scsi_cmd() is leaking iscsi_cmd due to a missing target_put_sess_cmd() call to drop the extra kref. Introduced during the RX PDU refactoring in v3.10 for-next code. Cc: Or Gerlitz ogerl...@mellanox.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target.c | 26 +++--- 1 files changed, 19 insertions(+), 7 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 5eee9b7..ffbc6a9 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -1069,12 +1069,17 @@ int iscsit_process_scsi_cmd(struct iscsi_conn *conn, struct iscsi_cmd *cmd, */ if (!cmd-immediate_data) { cmdsn_ret = iscsit_sequence_cmd(conn, cmd, hdr-cmdsn); - if (cmdsn_ret == CMDSN_LOWER_THAN_EXP) + if (cmdsn_ret == CMDSN_LOWER_THAN_EXP) { + if (!cmd-sense_reason) + return 0; + + target_put_sess_cmd(conn-sess-se_sess, cmd-se_cmd); return 0; - else if (cmdsn_ret == CMDSN_ERROR_CANNOT_RECOVER) + } else if (cmdsn_ret == CMDSN_ERROR_CANNOT_RECOVER) { return iscsit_add_reject_from_cmd( ISCSI_REASON_PROTOCOL_ERROR, 1, 0, (unsigned char *)hdr, cmd); + } } iscsit_ack_from_expstatsn(conn, be32_to_cpu(hdr-exp_statsn)); @@ -1085,24 +1090,31 @@ int iscsit_process_scsi_cmd(struct iscsi_conn *conn, struct iscsi_cmd *cmd, if (!cmd-immediate_data) { if (!cmd-sense_reason cmd-unsolicited_data) iscsit_set_unsoliticed_dataout(cmd); + if (!cmd-sense_reason) + return 0; + target_put_sess_cmd(conn-sess-se_sess, cmd-se_cmd); return 0; } /* -* Early CHECK_CONDITIONs never make it to the transport processing -* thread. They are processed in CmdSN order by -* iscsit_check_received_cmdsn() below. +* Early CHECK_CONDITIONs with ImmediateData never make it to command +* execution. These exceptions are processed in CmdSN order using +* iscsit_check_received_cmdsn() in iscsit_get_immediate_data() below. */ - if (cmd-sense_reason) + if (cmd-sense_reason) { + target_put_sess_cmd(conn-sess-se_sess, cmd-se_cmd); return 1; + } /* * Call directly into transport_generic_new_cmd() to perform * the backend memory allocation. */ cmd-sense_reason = transport_generic_new_cmd(cmd-se_cmd); - if (cmd-sense_reason) + if (cmd-sense_reason) { + target_put_sess_cmd(conn-sess-se_sess, cmd-se_cmd); return 1; + } return 0; } -- 1.7.2.5 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/for-next 1/3] iscsi-target: Fix solicited NopIN handling with RDMAExtentions=No
From: Nicholas Bellinger n...@linux-iscsi.org This patches fixes a regression with solicited NopIN handling in traditional iSCSI code introduced during TX immediate queue refactoring for v3.10 for-next code. Cc: Or Gerlitz ogerl...@mellanox.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target.c |6 -- 1 files changed, 0 insertions(+), 6 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 0148cc3..5eee9b7 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -3762,12 +3762,6 @@ check_rsp_state: break; case ISTATE_SEND_NOPIN: ret = iscsit_send_nopin(cmd, conn); - if (!ret) { - spin_lock_bh(cmd-istate_lock); - cmd-i_state = ISTATE_SENT_STATUS; - spin_unlock_bh(cmd-istate_lock); - return 0; - } break; case ISTATE_SEND_REJECT: ret = iscsit_send_reject(cmd, conn); -- 1.7.2.5 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH -next] iser-target: fix error return code in isert_connect_request()
On Fri, 2013-04-19 at 13:13 +0800, Wei Yongjun wrote: From: Wei Yongjun yongjun_...@trendmicro.com.cn Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun yongjun_...@trendmicro.com.cn --- Merged into the initial iser-target commit in for-next-merge. Thanks Wei! --nab drivers/infiniband/ulp/isert/ib_isert.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index f6f4f58..803b949 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -329,6 +329,7 @@ static struct isert_device * isert_device_find_by_ib_dev(struct rdma_cm_id *cma_id) { struct isert_device *device; + int ret; mutex_lock(device_list_mutex); list_for_each_entry(device, device_list, dev_node) { @@ -342,16 +343,17 @@ isert_device_find_by_ib_dev(struct rdma_cm_id *cma_id) device = kzalloc(sizeof(struct isert_device), GFP_KERNEL); if (!device) { mutex_unlock(device_list_mutex); - return NULL; + return ERR_PTR(-ENOMEM); } INIT_LIST_HEAD(device-dev_node); device-ib_device = cma_id-device; - if (isert_create_device_ib_res(device)) { + ret = isert_create_device_ib_res(device); + if (ret) { kfree(device); mutex_unlock(device_list_mutex); - return NULL; + return ERR_PTR(ret); } device-refcount++; @@ -434,8 +436,10 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct rdma_cm_event *event) } device = isert_device_find_by_ib_dev(cma_id); - if (!device) + if (IS_ERR(device)) { + ret = PTR_ERR(device); goto out_rsp_dma_map; + } isert_conn-conn_device = device; isert_conn-conn_pd = device-dev_pd; -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: linux-next: Tree for Apr 17 (infiniband/rdma)
On Wed, 2013-04-17 at 21:15 +0300, Or Gerlitz wrote: On Wed, Apr 17, 2013 at 9:06 PM, Randy Dunlap rdun...@infradead.org wrote: On 04/17/13 00:04, Stephen Rothwell wrote: Changes since 20130416: on x86_64: drivers/built-in.o: In function `isert_free_np': ib_isert.c:(.text+0x6e8a77): undefined reference to `rdma_destroy_id' [...] Nic, Yep, you need to add depends on INET INFINIBAND_ADDR_TRANS to the isert Kconfig entry This will make sure to get the rdma_cm to be configured and builts whenever isert is Thanks Roland Or, fixing this up now. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC-v4 0/9] Add support for iSCSI Extensions for RDMA (ISER) target mode
From: Nicholas Bellinger n...@linux-iscsi.org Hi folks, This series is the forth RFC for iSCSI Extensions for RDMA (ISER) target mode support planned for an upcoming v3.10 merge. This series refactors existing traditional iscsi-target mode logic in order for external ib_isert.ko module code to function with areas common to traditional TCP socket based iSCSI and RDMA verbs based ISER operation. This includes a basic iscsit_transport API that allows different transports to reside under the existing iscsi-target configfs control plane, using an pre-defined network portal attribute to enable a rdma_cm listener on top of existing ipoib portals. At this point the code is functional and pushing sustained RDMA_WRITE + RDMA_READ traffic using open-iscsi on top of multiple iser network portals + multiple IB HCA ports + multiple LUNs. Thus far we're using Mellanox IB HCAs for initial development, and will be verfiying with RoCE capable NICs as well in the near future. This RFC-v4 code is available in git against v3.9-rc3 here: git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git iser_target-rfc-v4 Changes since RFC-v3 include: - Mark isert_cq_rx_work as static (Or) - Drop unnecessary ib_dma_sync_single_for_cpu + ib_dma_sync_single_for_device calls for isert_cmd-sense_buf_dma from isert_put_response (Or) - Use 12288 for ISER_RX_PAD_SIZE base to save extra page per struct iser_rx_desc (Or + nab) - Drop now unnecessary isert_rx_desc usage, and convert RX users to iser_rx_desc (Or + nab) - Move isert_[alloc,free]_rx_descriptors() ahead of isert_create_device_ib_res() usage (nab) - Mark isert_cq_[rx,tx]_callback() + prototypes as static - Fix 'warning: 'ret' may be used uninitialized' warning for isert_create_device_ib_res on powerpc allmodconfig (fengguang + nab) - Fix 'warning: 'ret' may be used uninitialized' warning for isert_connect_request on i386 allyesconfig (fengguang + nab) - Fix pr_debug conversion specification in isert_rx_completion() (fengguang + nab) - Drop unnecessary isert_conn-conn_cm_id != NULL check in isert_connect_release causing the build warning: variable dereferenced before check 'isert_conn-conn_cm_id' - Fix isert_lid + isert_np leak in isert_setup_np failure path - Add isert_conn-conn_wait_comp_err usage in isert_free_conn() for isert_cq_comp_err completion path - Add isert_conn-logout_posted bit to determine decrement of isert_conn-post_send_buf_count from logout response completion - Always set ISER_CONN_DOWN from isert_disconnect_work() callback - Add request_module for ib_isert to lio_target_np_store_iser() - Add missing iscsit_put_transport() call in iscsi_target_setup_login_socket() failure case RFC-v4 has been rebased this afternoon into target-pending for-next/for-next-merge to be picked up by monday's -next build. Please review. --nab Nicholas Bellinger (9): target: Add export of target_get_sess_cmd symbol iscsi-target: Add iscsit_transport API template iscsi-target: Initial traditional TCP conversion to iscsit_transport iscsi-target: Add iser-target parameter keys + setup during login iscsi-target: Add per transport iscsi_cmd alloc/free iscsi-target: Refactor RX PDU logic + export request PDU handling iscsi-target: Refactor TX queue logic + export response PDU creation iscsi-target: Add iser network portal attribute iser-target: Add iSCSI Extensions for RDMA (iSER) target driver drivers/infiniband/Kconfig |1 + drivers/infiniband/Makefile|1 + drivers/infiniband/ulp/isert/Kconfig |6 + drivers/infiniband/ulp/isert/Makefile |2 + drivers/infiniband/ulp/isert/ib_isert.c| 2264 drivers/infiniband/ulp/isert/ib_isert.h| 138 ++ drivers/infiniband/ulp/isert/isert_proto.h | 47 + drivers/target/iscsi/Makefile |3 +- drivers/target/iscsi/iscsi_target.c| 1169 - drivers/target/iscsi/iscsi_target.h|3 +- drivers/target/iscsi/iscsi_target_configfs.c | 98 +- drivers/target/iscsi/iscsi_target_core.h | 26 +- drivers/target/iscsi/iscsi_target_device.c |7 +- drivers/target/iscsi/iscsi_target_erl1.c | 13 +- drivers/target/iscsi/iscsi_target_login.c | 472 -- drivers/target/iscsi/iscsi_target_login.h |6 + drivers/target/iscsi/iscsi_target_nego.c | 182 +-- drivers/target/iscsi/iscsi_target_nego.h | 11 +- drivers/target/iscsi/iscsi_target_parameters.c | 87 +- drivers/target/iscsi/iscsi_target_parameters.h | 16 +- drivers/target/iscsi/iscsi_target_tmr.c|4 +- drivers/target/iscsi/iscsi_target_tpg.c|6 +- drivers/target/iscsi/iscsi_target_transport.c | 55 + drivers/target/iscsi/iscsi_target_util.c | 53 +- drivers/target/iscsi
[RFC-v4 8/9] iscsi-target: Add iser network portal attribute
From: Nicholas Bellinger n...@linux-iscsi.org This patch adds a new network portal attribute for iser, that lives under existing iscsi-target configfs layout at: /sys/kernel/config/target/iscsi/$TARGETNAME/$TPGT/np/$PORTAL/iser When lio_target_np_store_iser() is enabled, iscsit_tpg_add_network_portal() will attempt to start an rdma_cma network portal for iser-target, only if the external ib_isert module transport has been loaded. When disabled, iscsit_tpg_del_network_portal() will cease iser login service on the network portal, and release any external ib_isert module reference. v4 changes: - Add request_module for ib_isert to lio_target_np_store_iser() Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target_configfs.c | 79 ++ 1 files changed, 79 insertions(+), 0 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target_configfs.c b/drivers/target/iscsi/iscsi_target_configfs.c index 2704daf..13e9e71 100644 --- a/drivers/target/iscsi/iscsi_target_configfs.c +++ b/drivers/target/iscsi/iscsi_target_configfs.c @@ -125,8 +125,87 @@ out: TF_NP_BASE_ATTR(lio_target, sctp, S_IRUGO | S_IWUSR); +static ssize_t lio_target_np_show_iser( + struct se_tpg_np *se_tpg_np, + char *page) +{ + struct iscsi_tpg_np *tpg_np = container_of(se_tpg_np, + struct iscsi_tpg_np, se_tpg_np); + struct iscsi_tpg_np *tpg_np_iser; + ssize_t rb; + + tpg_np_iser = iscsit_tpg_locate_child_np(tpg_np, ISCSI_INFINIBAND); + if (tpg_np_iser) + rb = sprintf(page, 1\n); + else + rb = sprintf(page, 0\n); + + return rb; +} + +static ssize_t lio_target_np_store_iser( + struct se_tpg_np *se_tpg_np, + const char *page, + size_t count) +{ + struct iscsi_np *np; + struct iscsi_portal_group *tpg; + struct iscsi_tpg_np *tpg_np = container_of(se_tpg_np, + struct iscsi_tpg_np, se_tpg_np); + struct iscsi_tpg_np *tpg_np_iser = NULL; + char *endptr; + u32 op; + int rc; + + op = simple_strtoul(page, endptr, 0); + if ((op != 1) (op != 0)) { + pr_err(Illegal value for tpg_enable: %u\n, op); + return -EINVAL; + } + np = tpg_np-tpg_np; + if (!np) { + pr_err(Unable to locate struct iscsi_np from +struct iscsi_tpg_np\n); + return -EINVAL; + } + + tpg = tpg_np-tpg; + if (iscsit_get_tpg(tpg) 0) + return -EINVAL; + + if (op) { + int rc = request_module(ib_isert); + if (rc != 0) + pr_warn(Unable to request_module for ib_isert\n); + + tpg_np_iser = iscsit_tpg_add_network_portal(tpg, np-np_sockaddr, + np-np_ip, tpg_np, ISCSI_INFINIBAND); + if (!tpg_np_iser || IS_ERR(tpg_np_iser)) + goto out; + } else { + tpg_np_iser = iscsit_tpg_locate_child_np(tpg_np, ISCSI_INFINIBAND); + if (!tpg_np_iser) + goto out; + + rc = iscsit_tpg_del_network_portal(tpg, tpg_np_iser); + if (rc 0) + goto out; + } + + printk(lio_target_np_store_iser() done, op: %d\n, op); + + iscsit_put_tpg(tpg); + return count; +out: + iscsit_put_tpg(tpg); + return -EINVAL; +} + +TF_NP_BASE_ATTR(lio_target, iser, S_IRUGO | S_IWUSR); + static struct configfs_attribute *lio_target_portal_attrs[] = { lio_target_np_sctp.attr, + lio_target_np_iser.attr, NULL, }; -- 1.7.2.5 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC-v4 1/9] target: Add export of target_get_sess_cmd symbol
From: Nicholas Bellinger n...@linux-iscsi.org Export target_get_sess_cmd() symbol so that it can be used by iscsi-target. Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/target_core_transport.c |4 ++-- include/target/target_core_fabric.h|2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c index 493e9e5..f8388b4 100644 --- a/drivers/target/target_core_transport.c +++ b/drivers/target/target_core_transport.c @@ -65,7 +65,6 @@ static void transport_complete_task_attr(struct se_cmd *cmd); static void transport_handle_queue_full(struct se_cmd *cmd, struct se_device *dev); static int transport_generic_get_mem(struct se_cmd *cmd); -static int target_get_sess_cmd(struct se_session *, struct se_cmd *, bool); static void transport_put_cmd(struct se_cmd *cmd); static void target_complete_ok_work(struct work_struct *work); @@ -2179,7 +2178,7 @@ EXPORT_SYMBOL(transport_generic_free_cmd); * @se_cmd:command descriptor to add * @ack_kref: Signal that fabric will perform an ack target_put_sess_cmd() */ -static int target_get_sess_cmd(struct se_session *se_sess, struct se_cmd *se_cmd, +int target_get_sess_cmd(struct se_session *se_sess, struct se_cmd *se_cmd, bool ack_kref) { unsigned long flags; @@ -2208,6 +2207,7 @@ out: spin_unlock_irqrestore(se_sess-sess_cmd_lock, flags); return ret; } +EXPORT_SYMBOL(target_get_sess_cmd); static void target_release_cmd_kref(struct kref *kref) { diff --git a/include/target/target_core_fabric.h b/include/target/target_core_fabric.h index aaa1ee6..ba3471b 100644 --- a/include/target/target_core_fabric.h +++ b/include/target/target_core_fabric.h @@ -120,7 +120,7 @@ booltransport_wait_for_tasks(struct se_cmd *); inttransport_check_aborted_status(struct se_cmd *, int); inttransport_send_check_condition_and_sense(struct se_cmd *, sense_reason_t, int); - +inttarget_get_sess_cmd(struct se_session *, struct se_cmd *, bool); inttarget_put_sess_cmd(struct se_session *, struct se_cmd *); void target_sess_cmd_list_set_waiting(struct se_session *); void target_wait_for_sess_cmds(struct se_session *, int); -- 1.7.2.5 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC-v4 3/9] iscsi-target: Initial traditional TCP conversion to iscsit_transport
From: Nicholas Bellinger n...@linux-iscsi.org This patch performs the initial conversion of existing traditional iscsi to use iscsit_transport API callers. This includes: - iscsi-np cleanups for iscsit_transport_type - Add iscsi-np transport calls w/ -iscsit_setup_up() and -iscsit_free_np() - Convert login thread process context to use -iscsit_accept_np() for connections with pre-allocated struct iscsi_conn - Convert existing socket accept code to iscsit_accept_np() - Convert login RX/TX callers to use -iscsit_get_login_rx() and -iscsit_put_login_tx() to exchange request/response PDUs - Convert existing socket login RX/TX calls into iscsit_get_login_rx() and iscsit_put_login_tx() - Change iscsit_close_connection() to invoke -iscsit_free_conn() + iscsit_put_transport() calls. - Add iscsit_register_transport() + iscsit_unregister_transport() calls to module init/exit v4 changes: - Add missing iscsit_put_transport() call in iscsi_target_setup_login_socket() failure case v2 changes: - Update module init/exit to use register_transport() + unregister_transport() Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target.c| 35 ++- drivers/target/iscsi/iscsi_target_core.h | 15 +- drivers/target/iscsi/iscsi_target_login.c | 416 --- drivers/target/iscsi/iscsi_target_login.h |6 + drivers/target/iscsi/iscsi_target_nego.c | 167 +- drivers/target/iscsi/iscsi_target_nego.h | 11 +- drivers/target/iscsi/iscsi_target_parameters.c | 12 +- drivers/target/iscsi/iscsi_target_tpg.c|6 +- drivers/target/iscsi/iscsi_target_util.c | 27 +-- 9 files changed, 380 insertions(+), 315 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 7ea246a..8203bf3 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -49,6 +49,8 @@ #include iscsi_target_device.h #include iscsi_target_stat.h +#include target/iscsi/iscsi_transport.h + static LIST_HEAD(g_tiqn_list); static LIST_HEAD(g_np_list); static DEFINE_SPINLOCK(tiqn_lock); @@ -401,8 +403,7 @@ struct iscsi_np *iscsit_add_np( spin_unlock_bh(np_lock); pr_debug(CORE[0] - Added Network Portal: %s:%hu on %s\n, - np-np_ip, np-np_port, (np-np_network_transport == ISCSI_TCP) ? - TCP : SCTP); + np-np_ip, np-np_port, np-np_transport-name); return np; } @@ -441,11 +442,10 @@ int iscsit_reset_np_thread( return 0; } -static int iscsit_del_np_comm(struct iscsi_np *np) +static void iscsit_free_np(struct iscsi_np *np) { if (np-np_socket) sock_release(np-np_socket); - return 0; } int iscsit_del_np(struct iscsi_np *np) @@ -467,20 +467,32 @@ int iscsit_del_np(struct iscsi_np *np) send_sig(SIGINT, np-np_thread, 1); kthread_stop(np-np_thread); } - iscsit_del_np_comm(np); + + np-np_transport-iscsit_free_np(np); spin_lock_bh(np_lock); list_del(np-np_list); spin_unlock_bh(np_lock); pr_debug(CORE[0] - Removed Network Portal: %s:%hu on %s\n, - np-np_ip, np-np_port, (np-np_network_transport == ISCSI_TCP) ? - TCP : SCTP); + np-np_ip, np-np_port, np-np_transport-name); + iscsit_put_transport(np-np_transport); kfree(np); return 0; } +static struct iscsit_transport iscsi_target_transport = { + .name = iSCSI/TCP, + .transport_type = ISCSI_TCP, + .owner = NULL, + .iscsit_setup_np= iscsit_setup_np, + .iscsit_accept_np = iscsit_accept_np, + .iscsit_free_np = iscsit_free_np, + .iscsit_get_login_rx= iscsit_get_login_rx, + .iscsit_put_login_tx= iscsit_put_login_tx, +}; + static int __init iscsi_target_init_module(void) { int ret = 0; @@ -557,6 +569,8 @@ static int __init iscsi_target_init_module(void) goto ooo_out; } + iscsit_register_transport(iscsi_target_transport); + if (iscsit_load_discovery_tpg() 0) goto r2t_out; @@ -587,6 +601,7 @@ static void __exit iscsi_target_cleanup_module(void) iscsi_deallocate_thread_sets(); iscsi_thread_set_free(); iscsit_release_discovery_tpg(); + iscsit_unregister_transport(iscsi_target_transport); kmem_cache_destroy(lio_cmd_cache); kmem_cache_destroy(lio_qr_cache); kmem_cache_destroy(lio_dr_cache); @@ -4053,6 +4068,12 @@ int iscsit_close_connection( if (conn-sock) sock_release(conn-sock); + + if (conn-conn_transport-iscsit_free_conn) + conn-conn_transport-iscsit_free_conn(conn); + + iscsit_put_transport(conn-conn_transport); + conn-thread_set = NULL
[RFC-v4 5/9] iscsi-target: Add per transport iscsi_cmd alloc/free
From: Nicholas Bellinger n...@linux-iscsi.org This patch converts struct iscsi_cmd memory allocation + free to use -iscsit_alloc_cmd() iscsit_transport API caller, and export iscsit_allocate_cmd() symbols Also add iscsi_cmd-release_cmd() to be used seperately from iscsit_transport for connection/session shutdown. v2 changes: - Remove unnecessary checks in iscsit_alloc_cmd (asias) - Drop iscsit_transport-iscsit_free_cmd() usage - Drop iscsit_transport-iscsit_unmap_cmd() usage - Add iscsi_cmd-release_cmd() - Convert lio_release_cmd() to use iscsi_cmd-release_cmd() Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target.c |1 + drivers/target/iscsi/iscsi_target_configfs.c |3 ++- drivers/target/iscsi/iscsi_target_core.h |1 + drivers/target/iscsi/iscsi_target_util.c | 25 + drivers/target/iscsi/iscsi_target_util.h |1 + 5 files changed, 26 insertions(+), 5 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 8203bf3..b01a10e 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -489,6 +489,7 @@ static struct iscsit_transport iscsi_target_transport = { .iscsit_setup_np= iscsit_setup_np, .iscsit_accept_np = iscsit_accept_np, .iscsit_free_np = iscsit_free_np, + .iscsit_alloc_cmd = iscsit_alloc_cmd, .iscsit_get_login_rx= iscsit_get_login_rx, .iscsit_put_login_tx= iscsit_put_login_tx, }; diff --git a/drivers/target/iscsi/iscsi_target_configfs.c b/drivers/target/iscsi/iscsi_target_configfs.c index 78d75c8..c78b824 100644 --- a/drivers/target/iscsi/iscsi_target_configfs.c +++ b/drivers/target/iscsi/iscsi_target_configfs.c @@ -1700,7 +1700,8 @@ static void lio_release_cmd(struct se_cmd *se_cmd) { struct iscsi_cmd *cmd = container_of(se_cmd, struct iscsi_cmd, se_cmd); - iscsit_release_cmd(cmd); + pr_debug(Entering lio_release_cmd for se_cmd: %p\n, se_cmd); + cmd-release_cmd(cmd); } /* End functions for target_core_fabric_ops */ diff --git a/drivers/target/iscsi/iscsi_target_core.h b/drivers/target/iscsi/iscsi_target_core.h index 53400b0..60ec4b9 100644 --- a/drivers/target/iscsi/iscsi_target_core.h +++ b/drivers/target/iscsi/iscsi_target_core.h @@ -485,6 +485,7 @@ struct iscsi_cmd { u32 first_data_sg_off; u32 kmapped_nents; sense_reason_t sense_reason; + void (*release_cmd)(struct iscsi_cmd *); } cacheline_aligned; struct iscsi_tmr_req { diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c index 4cf1e7f..0b73f90 100644 --- a/drivers/target/iscsi/iscsi_target_util.c +++ b/drivers/target/iscsi/iscsi_target_util.c @@ -149,6 +149,18 @@ void iscsit_free_r2ts_from_list(struct iscsi_cmd *cmd) spin_unlock_bh(cmd-r2t_lock); } +struct iscsi_cmd *iscsit_alloc_cmd(struct iscsi_conn *conn, gfp_t gfp_mask) +{ + struct iscsi_cmd *cmd; + + cmd = kmem_cache_zalloc(lio_cmd_cache, gfp_mask); + if (!cmd) + return NULL; + + cmd-release_cmd = iscsit_release_cmd; + return cmd; +} + /* * May be called from software interrupt (timer) context for allocating * iSCSI NopINs. @@ -157,13 +169,12 @@ struct iscsi_cmd *iscsit_allocate_cmd(struct iscsi_conn *conn, gfp_t gfp_mask) { struct iscsi_cmd *cmd; - cmd = kmem_cache_zalloc(lio_cmd_cache, gfp_mask); + cmd = conn-conn_transport-iscsit_alloc_cmd(conn, gfp_mask); if (!cmd) { pr_err(Unable to allocate memory for struct iscsi_cmd.\n); return NULL; } - - cmd-conn = conn; + cmd-conn = conn; INIT_LIST_HEAD(cmd-i_conn_node); INIT_LIST_HEAD(cmd-datain_list); INIT_LIST_HEAD(cmd-cmd_r2t_list); @@ -176,6 +187,7 @@ struct iscsi_cmd *iscsit_allocate_cmd(struct iscsi_conn *conn, gfp_t gfp_mask) return cmd; } +EXPORT_SYMBOL(iscsit_allocate_cmd); struct iscsi_seq *iscsit_get_seq_holder_for_datain( struct iscsi_cmd *cmd, @@ -690,6 +702,11 @@ void iscsit_free_cmd(struct iscsi_cmd *cmd) */ switch (cmd-iscsi_opcode) { case ISCSI_OP_SCSI_CMD: + if (cmd-data_direction == DMA_TO_DEVICE) + iscsit_stop_dataout_timer(cmd); + /* +* Fallthrough +*/ case ISCSI_OP_SCSI_TMFUNC: transport_generic_free_cmd(cmd-se_cmd, 1); break; @@ -705,7 +722,7 @@ void iscsit_free_cmd(struct iscsi_cmd *cmd) } /* Fall-through */ default: - iscsit_release_cmd(cmd); + cmd-release_cmd(cmd); break; } } diff --git a/drivers/target/iscsi/iscsi_target_util.h b/drivers/target
[RFC-v4 7/9] iscsi-target: Refactor TX queue logic + export response PDU creation
From: Nicholas Bellinger n...@linux-iscsi.org This patch refactors TX immediate + response queue handling to use the new iscsit_transport API callers, and exports the necessary traditional iscsi PDU response creation functions for iser-target to utilize. This includes: - Add iscsit_build_datain_pdu() for DATAIN PDU init + convert iscsit_build_datain_pdu() - Add iscsit_build_logout_rsp() for LOGOUT_RSP PDU init + convert iscsit_send_logout() - Add iscsit_build_nopin_rsp() for NOPIN_RSP PDU init + convert iscsit_send_nopin() - Add iscsit_build_rsp_pdu() for SCSI_RSP PDU init + convert iscsit_send_response() - Add iscsit_build_task_mgt_rsp for TM_RSP PDU init + convert iscsit_send_task_mgt_rsp() - Refactor immediate queue state switch into iscsit_immediate_queue() - Convert handle_immediate_queue() to use iscsit_transport caller - Refactor response queue state switch into iscsit_response_queue() - Convert handle_response_queue to use iscsit_transport caller - Export iscsit_logout_post_handler(), iscsit_increment_maxcmdsn() and iscsit_tmr_post_handler() for external transport module usage v3 changes: - Add iscsit_build_reject for REJECT PDU init + convert iscsit_send_reject() v2 changes: - Add iscsit_queue_rsp() for iscsit_transport-iscsit_queue_data_in() and iscsit_transport-iscsit_queue_status() - Update lio_queue_data_in() to use -iscsit_queue_data_in() - Update lio_queue_status() to use -iscsit_queue_status() - Use mutex_trylock() in iscsit_increment_maxcmdsn() Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target.c | 661 ++ drivers/target/iscsi/iscsi_target_configfs.c |7 +- drivers/target/iscsi/iscsi_target_device.c |7 +- drivers/target/iscsi/iscsi_target_tmr.c |1 + 4 files changed, 374 insertions(+), 302 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 19d4e08..3948aa1 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -70,8 +70,7 @@ struct kmem_cache *lio_ooo_cache; struct kmem_cache *lio_r2t_cache; static int iscsit_handle_immediate_data(struct iscsi_cmd *, - unsigned char *buf, u32); -static int iscsit_logout_post_handler(struct iscsi_cmd *, struct iscsi_conn *); + struct iscsi_scsi_req *, u32); struct iscsi_tiqn *iscsit_get_tiqn_for_login(unsigned char *buf) { @@ -482,6 +481,15 @@ int iscsit_del_np(struct iscsi_np *np) return 0; } +static int iscsit_immediate_queue(struct iscsi_conn *, struct iscsi_cmd *, int); +static int iscsit_response_queue(struct iscsi_conn *, struct iscsi_cmd *, int); + +static int iscsit_queue_rsp(struct iscsi_conn *conn, struct iscsi_cmd *cmd) +{ + iscsit_add_cmd_to_response_queue(cmd, cmd-conn, cmd-i_state); + return 0; +} + static struct iscsit_transport iscsi_target_transport = { .name = iSCSI/TCP, .transport_type = ISCSI_TCP, @@ -493,6 +501,10 @@ static struct iscsit_transport iscsi_target_transport = { .iscsit_get_login_rx= iscsit_get_login_rx, .iscsit_put_login_tx= iscsit_put_login_tx, .iscsit_get_dataout = iscsit_build_r2ts_for_cmd, + .iscsit_immediate_queue = iscsit_immediate_queue, + .iscsit_response_queue = iscsit_response_queue, + .iscsit_queue_data_in = iscsit_queue_rsp, + .iscsit_queue_status= iscsit_queue_rsp, }; static int __init iscsi_target_init_module(void) @@ -651,14 +663,6 @@ static int iscsit_add_reject( iscsit_add_cmd_to_response_queue(cmd, conn, cmd-i_state); ret = wait_for_completion_interruptible(cmd-reject_comp); - /* -* Perform the kref_put now if se_cmd has been setup by -* iscsit_setup_scsi_cmd() -*/ - if (cmd-se_cmd.se_tfo != NULL) { - pr_debug(iscsi reject: calling target_put_sess_cmd \n); - target_put_sess_cmd(conn-sess-se_sess, cmd-se_cmd); - } if (ret != 0) return -1; @@ -2536,18 +2540,60 @@ static void iscsit_tx_thread_wait_for_tcp(struct iscsi_conn *conn) } } -static int iscsit_send_data_in( - struct iscsi_cmd *cmd, - struct iscsi_conn *conn) +static void +iscsit_build_datain_pdu(struct iscsi_cmd *cmd, struct iscsi_conn *conn, + struct iscsi_datain *datain, struct iscsi_data_rsp *hdr, + bool set_statsn) { - int iov_ret = 0, set_statsn = 0; - u32 iov_count = 0, tx_size = 0; + hdr-opcode = ISCSI_OP_SCSI_DATA_IN; + hdr-flags = datain-flags; + if (hdr-flags ISCSI_FLAG_DATA_STATUS) { + if (cmd-se_cmd.se_cmd_flags SCF_OVERFLOW_BIT) { + hdr-flags |= ISCSI_FLAG_DATA_OVERFLOW; + hdr-residual_count = cpu_to_be32(cmd-se_cmd.residual_count
[RFC-v4 6/9] iscsi-target: Refactor RX PDU logic + export request PDU handling
From: Nicholas Bellinger n...@linux-iscsi.org This patch refactors existing traditional iscsi RX side PDU handling to use iscsit_transport, and exports the necessary logic for external transport modules. This includes: - Refactor iscsit_handle_scsi_cmd() into PDU setup / processing - Add updated iscsit_handle_scsi_cmd() for tradtional iscsi code - Add iscsit_set_unsoliticed_dataout() wrapper - Refactor iscsit_handle_data_out() into PDU check / processing - Add updated iscsit_handle_data_out() for tradtional iscsi code - Add iscsit_handle_nop_out() + iscsit_handle_task_mgt_cmd() to accept pre-allocated struct iscsi_cmd - Add iscsit_build_r2ts_for_cmd() caller for iscsi_target_transport to handle ISTATE_SEND_R2T for TX immediate queue - Refactor main traditional iscsi iscsi_target_rx_thread() PDU switch into iscsi_target_rx_opcode() using iscsit_allocate_cmd() - Turn iscsi_target_rx_thread() process context into NOP for ib_isert side work-queue. v3 changes: - Add extra target_put_sess_cmd call in iscsit_add_reject_from_cmd after completion v2 changes: - Disable iscsit_ack_from_expstatsn() usage for RDMAExtentions=Yes - Disable iscsit_allocate_datain_req() usage for RDMAExtentions=Yes - Add target_get_sess_cmd() reference counting to iscsit_setup_scsi_cmd() - Add TFO-lio_check_stop_free() fabric API caller - Add export of iscsit_stop_dataout_timer() symbol - Add iscsit_build_r2ts_for_cmd() for iscsit_transport-iscsit_get_dataout() - Convert existing usage of iscsit_build_r2ts_for_cmd() to -iscsit_get_dataout() - Drop RDMAExtentions=Yes specific check in iscsit_build_r2ts_for_cmd() - Fix RDMAExtentions - RDMAExtensions typo (andy) - Pass correct dump_payload value into iscsit_get_immediate_data() for iscsit_handle_scsi_cmd() Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target.c | 486 -- drivers/target/iscsi/iscsi_target.h |3 +- drivers/target/iscsi/iscsi_target_configfs.c |9 +- drivers/target/iscsi/iscsi_target_erl1.c | 13 +- drivers/target/iscsi/iscsi_target_login.c|3 +- drivers/target/iscsi/iscsi_target_nego.c | 15 - drivers/target/iscsi/iscsi_target_tmr.c |3 +- drivers/target/iscsi/iscsi_target_util.c |1 + 8 files changed, 331 insertions(+), 202 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index b01a10e..19d4e08 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -492,6 +492,7 @@ static struct iscsit_transport iscsi_target_transport = { .iscsit_alloc_cmd = iscsit_alloc_cmd, .iscsit_get_login_rx= iscsit_get_login_rx, .iscsit_put_login_tx= iscsit_put_login_tx, + .iscsit_get_dataout = iscsit_build_r2ts_for_cmd, }; static int __init iscsi_target_init_module(void) @@ -650,6 +651,14 @@ static int iscsit_add_reject( iscsit_add_cmd_to_response_queue(cmd, conn, cmd-i_state); ret = wait_for_completion_interruptible(cmd-reject_comp); + /* +* Perform the kref_put now if se_cmd has been setup by +* iscsit_setup_scsi_cmd() +*/ + if (cmd-se_cmd.se_tfo != NULL) { + pr_debug(iscsi reject: calling target_put_sess_cmd \n); + target_put_sess_cmd(conn-sess-se_sess, cmd-se_cmd); + } if (ret != 0) return -1; @@ -698,11 +707,20 @@ int iscsit_add_reject_from_cmd( iscsit_add_cmd_to_response_queue(cmd, conn, cmd-i_state); ret = wait_for_completion_interruptible(cmd-reject_comp); + /* +* Perform the kref_put now if se_cmd has already been setup by +* scsit_setup_scsi_cmd() +*/ + if (cmd-se_cmd.se_tfo != NULL) { + pr_debug(iscsi reject: calling target_put_sess_cmd \n); + target_put_sess_cmd(conn-sess-se_sess, cmd-se_cmd); + } if (ret != 0) return -1; return (!fail_conn) ? 0 : -1; } +EXPORT_SYMBOL(iscsit_add_reject_from_cmd); /* * Map some portion of the allocated scatterlist to an iovec, suitable for @@ -761,6 +779,9 @@ static void iscsit_ack_from_expstatsn(struct iscsi_conn *conn, u32 exp_statsn) conn-exp_statsn = exp_statsn; + if (conn-sess-sess_ops-RDMAExtensions) + return; + spin_lock_bh(conn-cmd_lock); list_for_each_entry(cmd, conn-conn_cmd_list, i_conn_node) { spin_lock(cmd-istate_lock); @@ -793,12 +814,10 @@ static int iscsit_allocate_iovecs(struct iscsi_cmd *cmd) return 0; } -static int iscsit_handle_scsi_cmd( - struct iscsi_conn *conn, - unsigned char *buf) +int iscsit_setup_scsi_cmd(struct iscsi_conn *conn, struct iscsi_cmd *cmd, + unsigned char *buf) { - int data_direction, payload_length, cmdsn_ret = 0, immed_ret; - struct iscsi_cmd *cmd = NULL; + int
[RFC-v4 4/9] iscsi-target: Add iser-target parameter keys + setup during login
From: Nicholas Bellinger n...@linux-iscsi.org This patch adds RDMAExtensions, InitiatorRecvDataSegmentLength and TargetRecvDataSegmentLength parameters keys necessary for iser-target login to occur. This includes setting the necessary parameters during login path code within iscsi_login_zero_tsih_s2(), and currently PAGE_SIZE aligning the target's advertised MRDSL for immediate data and unsolicited data-out incoming payloads. v3 changes: - Add iscsi_post_login_start_timers FIXME for ISER v2 changes: - Fix RDMAExtentions - RDMAExtensions typo (andy) - Drop unnecessary '== true' conditional checks for type bool Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target_core.h | 10 +++ drivers/target/iscsi/iscsi_target_login.c | 77 drivers/target/iscsi/iscsi_target_parameters.c | 75 +-- drivers/target/iscsi/iscsi_target_parameters.h | 16 +- 4 files changed, 161 insertions(+), 17 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target_core.h b/drivers/target/iscsi/iscsi_target_core.h index 2587677..53400b0 100644 --- a/drivers/target/iscsi/iscsi_target_core.h +++ b/drivers/target/iscsi/iscsi_target_core.h @@ -244,6 +244,11 @@ struct iscsi_conn_ops { u8 IFMarker; /* [0,1] == [No,Yes] */ u32 OFMarkInt; /* [1..65535] */ u32 IFMarkInt; /* [1..65535] */ + /* +* iSER specific connection parameters +*/ + u32 InitiatorRecvDataSegmentLength; /* [512..2**24-1] */ + u32 TargetRecvDataSegmentLength;/* [512..2**24-1] */ }; struct iscsi_sess_ops { @@ -265,6 +270,10 @@ struct iscsi_sess_ops { u8 DataSequenceInOrder;/* [0,1] == [No,Yes] */ u8 ErrorRecoveryLevel; /* [0..2] */ u8 SessionType;/* [0,1] == [Normal,Discovery]*/ + /* +* iSER specific session parameters +*/ + u8 RDMAExtensions; /* [0,1] == [No,Yes] */ }; struct iscsi_queue_req { @@ -284,6 +293,7 @@ struct iscsi_data_count { }; struct iscsi_param_list { + booliser; struct list_headparam_list; struct list_headextra_response_list; }; diff --git a/drivers/target/iscsi/iscsi_target_login.c b/drivers/target/iscsi/iscsi_target_login.c index 0de5c47..e6a8269 100644 --- a/drivers/target/iscsi/iscsi_target_login.c +++ b/drivers/target/iscsi/iscsi_target_login.c @@ -340,6 +340,7 @@ static int iscsi_login_zero_tsih_s2( struct iscsi_node_attrib *na; struct iscsi_session *sess = conn-sess; unsigned char buf[32]; + bool iser = false; sess-tpg = conn-tpg; @@ -361,7 +362,10 @@ static int iscsi_login_zero_tsih_s2( return -1; } - iscsi_set_keys_to_negotiate(0, conn-param_list); + if (conn-conn_transport-transport_type == ISCSI_INFINIBAND) + iser = true; + + iscsi_set_keys_to_negotiate(conn-param_list, iser); if (sess-sess_ops-SessionType) return iscsi_set_keys_irrelevant_for_discovery( @@ -399,6 +403,56 @@ static int iscsi_login_zero_tsih_s2( if (iscsi_login_disable_FIM_keys(conn-param_list, conn) 0) return -1; + /* +* Set RDMAExtensions=Yes by default for iSER enabled network portals +*/ + if (iser) { + struct iscsi_param *param; + unsigned long mrdsl, off; + int rc; + + sprintf(buf, RDMAExtensions=Yes); + if (iscsi_change_param_value(buf, conn-param_list, 0) 0) { + iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_TARGET_ERR, + ISCSI_LOGIN_STATUS_NO_RESOURCES); + return -1; + } + /* +* Make MaxRecvDataSegmentLength PAGE_SIZE aligned for +* Immediate Data + Unsolicitied Data-OUT if necessary.. +*/ + param = iscsi_find_param_from_key(MaxRecvDataSegmentLength, + conn-param_list); + if (!param) { + iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_TARGET_ERR, + ISCSI_LOGIN_STATUS_NO_RESOURCES); + return -1; + } + rc = strict_strtoul(param-value, 0, mrdsl); + if (rc 0) { + iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_TARGET_ERR, + ISCSI_LOGIN_STATUS_NO_RESOURCES); + return -1; + } + off = mrdsl % PAGE_SIZE; + if (!off) + return 0; + + if (mrdsl PAGE_SIZE) + mrdsl = PAGE_SIZE
[RFC-v4 2/9] iscsi-target: Add iscsit_transport API template
From: Nicholas Bellinger n...@linux-iscsi.org Add basic struct iscsit_transport API template to allow iscsi-target for running with external transport modules using existing iscsi_target_core.h code. For all external modules, this calls try_module_get() and module_put() to obtain + release an external iscsit_transport module reference count. Also include the iscsi-target symbols necessary in iscsi_transport.h to allow external transport modules to function. v3 changes: - Add iscsit_build_reject export for ISTATE_SEND_REJECT usage v2 changes: - Drop unnecessary export of iscsit_get_transport + iscsit_put_transport (roland) - Add -iscsit_queue_data_in() to remove extra context switch on RDMA_WRITE - Add -iscsit_queue_status() to remove extra context switch on IB_SEND status - Add -iscsit_get_dataout() to remove extra context switch on RDMA_READ - Drop -iscsit_free_cmd() - Drop -iscsit_unmap_cmd() - Rename iscsit_create_transport() - iscsit_register_transport() (andy) - Rename iscsit_destroy_transport() - iscsit_unregister_transport() (andy) Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/Makefile |3 +- drivers/target/iscsi/iscsi_target_transport.c | 55 include/target/iscsi/iscsi_transport.h| 83 + 3 files changed, 140 insertions(+), 1 deletions(-) create mode 100644 drivers/target/iscsi/iscsi_target_transport.c create mode 100644 include/target/iscsi/iscsi_transport.h diff --git a/drivers/target/iscsi/Makefile b/drivers/target/iscsi/Makefile index 5b9a2cf..13a9240 100644 --- a/drivers/target/iscsi/Makefile +++ b/drivers/target/iscsi/Makefile @@ -15,6 +15,7 @@ iscsi_target_mod-y += iscsi_target_parameters.o \ iscsi_target_util.o \ iscsi_target.o \ iscsi_target_configfs.o \ - iscsi_target_stat.o + iscsi_target_stat.o \ + iscsi_target_transport.o obj-$(CONFIG_ISCSI_TARGET) += iscsi_target_mod.o diff --git a/drivers/target/iscsi/iscsi_target_transport.c b/drivers/target/iscsi/iscsi_target_transport.c new file mode 100644 index 000..882728f --- /dev/null +++ b/drivers/target/iscsi/iscsi_target_transport.c @@ -0,0 +1,55 @@ +#include linux/spinlock.h +#include linux/list.h +#include target/iscsi/iscsi_transport.h + +static LIST_HEAD(g_transport_list); +static DEFINE_MUTEX(transport_mutex); + +struct iscsit_transport *iscsit_get_transport(int type) +{ + struct iscsit_transport *t; + + mutex_lock(transport_mutex); + list_for_each_entry(t, g_transport_list, t_node) { + if (t-transport_type == type) { + if (t-owner !try_module_get(t-owner)) { + t = NULL; + } + mutex_unlock(transport_mutex); + return t; + } + } + mutex_unlock(transport_mutex); + + return NULL; +} + +void iscsit_put_transport(struct iscsit_transport *t) +{ + if (t-owner) + module_put(t-owner); +} + +int iscsit_register_transport(struct iscsit_transport *t) +{ + INIT_LIST_HEAD(t-t_node); + + mutex_lock(transport_mutex); + list_add_tail(t-t_node, g_transport_list); + mutex_unlock(transport_mutex); + + pr_debug(Registered iSCSI transport: %s\n, t-name); + + return 0; +} +EXPORT_SYMBOL(iscsit_register_transport); + +void iscsit_unregister_transport(struct iscsit_transport *t) +{ + mutex_lock(transport_mutex); + list_del(t-t_node); + mutex_unlock(transport_mutex); + + pr_debug(Unregistered iSCSI transport: %s\n, t-name); +} +EXPORT_SYMBOL(iscsit_unregister_transport); diff --git a/include/target/iscsi/iscsi_transport.h b/include/target/iscsi/iscsi_transport.h new file mode 100644 index 000..23a87d0 --- /dev/null +++ b/include/target/iscsi/iscsi_transport.h @@ -0,0 +1,83 @@ +#include linux/module.h +#include linux/list.h +#include ../../../drivers/target/iscsi/iscsi_target_core.h + +struct iscsit_transport { +#define ISCSIT_TRANSPORT_NAME 16 + char name[ISCSIT_TRANSPORT_NAME]; + int transport_type; + struct module *owner; + struct list_head t_node; + int (*iscsit_setup_np)(struct iscsi_np *, struct __kernel_sockaddr_storage *); + int (*iscsit_accept_np)(struct iscsi_np *, struct iscsi_conn *); + void (*iscsit_free_np)(struct iscsi_np *); + void (*iscsit_free_conn)(struct iscsi_conn *); + struct iscsi_cmd *(*iscsit_alloc_cmd)(struct iscsi_conn *, gfp_t); + int (*iscsit_get_login_rx)(struct iscsi_conn *, struct iscsi_login *); + int (*iscsit_put_login_tx)(struct iscsi_conn *, struct iscsi_login *, u32); + int (*iscsit_immediate_queue)(struct iscsi_conn *, struct iscsi_cmd *, int); + int
[RFC-v3 7/9] iscsi-target: Refactor TX queue logic + export response PDU creation
From: Nicholas Bellinger n...@linux-iscsi.org This patch refactors TX immediate + response queue handling to use the new iscsit_transport API callers, and exports the necessary traditional iscsi PDU response creation functions for iser-target to utilize. This includes: - Add iscsit_build_datain_pdu() for DATAIN PDU init + convert iscsit_build_datain_pdu() - Add iscsit_build_logout_rsp() for LOGOUT_RSP PDU init + convert iscsit_send_logout() - Add iscsit_build_nopin_rsp() for NOPIN_RSP PDU init + convert iscsit_send_nopin() - Add iscsit_build_rsp_pdu() for SCSI_RSP PDU init + convert iscsit_send_response() - Add iscsit_build_task_mgt_rsp for TM_RSP PDU init + convert iscsit_send_task_mgt_rsp() - Refactor immediate queue state switch into iscsit_immediate_queue() - Convert handle_immediate_queue() to use iscsit_transport caller - Refactor response queue state switch into iscsit_response_queue() - Convert handle_response_queue to use iscsit_transport caller - Export iscsit_logout_post_handler(), iscsit_increment_maxcmdsn() and iscsit_tmr_post_handler() for external transport module usage v3 changes: - Add iscsit_build_reject for REJECT PDU init + convert iscsit_send_reject() v2 changes: - Add iscsit_queue_rsp() for iscsit_transport-iscsit_queue_data_in() and iscsit_transport-iscsit_queue_status() - Update lio_queue_data_in() to use -iscsit_queue_data_in() - Update lio_queue_status() to use -iscsit_queue_status() - Use mutex_trylock() in iscsit_increment_maxcmdsn() Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target.c | 661 ++ drivers/target/iscsi/iscsi_target_configfs.c |7 +- drivers/target/iscsi/iscsi_target_device.c |7 +- drivers/target/iscsi/iscsi_target_tmr.c |1 + 4 files changed, 374 insertions(+), 302 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 19d4e08..3948aa1 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -70,8 +70,7 @@ struct kmem_cache *lio_ooo_cache; struct kmem_cache *lio_r2t_cache; static int iscsit_handle_immediate_data(struct iscsi_cmd *, - unsigned char *buf, u32); -static int iscsit_logout_post_handler(struct iscsi_cmd *, struct iscsi_conn *); + struct iscsi_scsi_req *, u32); struct iscsi_tiqn *iscsit_get_tiqn_for_login(unsigned char *buf) { @@ -482,6 +481,15 @@ int iscsit_del_np(struct iscsi_np *np) return 0; } +static int iscsit_immediate_queue(struct iscsi_conn *, struct iscsi_cmd *, int); +static int iscsit_response_queue(struct iscsi_conn *, struct iscsi_cmd *, int); + +static int iscsit_queue_rsp(struct iscsi_conn *conn, struct iscsi_cmd *cmd) +{ + iscsit_add_cmd_to_response_queue(cmd, cmd-conn, cmd-i_state); + return 0; +} + static struct iscsit_transport iscsi_target_transport = { .name = iSCSI/TCP, .transport_type = ISCSI_TCP, @@ -493,6 +501,10 @@ static struct iscsit_transport iscsi_target_transport = { .iscsit_get_login_rx= iscsit_get_login_rx, .iscsit_put_login_tx= iscsit_put_login_tx, .iscsit_get_dataout = iscsit_build_r2ts_for_cmd, + .iscsit_immediate_queue = iscsit_immediate_queue, + .iscsit_response_queue = iscsit_response_queue, + .iscsit_queue_data_in = iscsit_queue_rsp, + .iscsit_queue_status= iscsit_queue_rsp, }; static int __init iscsi_target_init_module(void) @@ -651,14 +663,6 @@ static int iscsit_add_reject( iscsit_add_cmd_to_response_queue(cmd, conn, cmd-i_state); ret = wait_for_completion_interruptible(cmd-reject_comp); - /* -* Perform the kref_put now if se_cmd has been setup by -* iscsit_setup_scsi_cmd() -*/ - if (cmd-se_cmd.se_tfo != NULL) { - pr_debug(iscsi reject: calling target_put_sess_cmd \n); - target_put_sess_cmd(conn-sess-se_sess, cmd-se_cmd); - } if (ret != 0) return -1; @@ -2536,18 +2540,60 @@ static void iscsit_tx_thread_wait_for_tcp(struct iscsi_conn *conn) } } -static int iscsit_send_data_in( - struct iscsi_cmd *cmd, - struct iscsi_conn *conn) +static void +iscsit_build_datain_pdu(struct iscsi_cmd *cmd, struct iscsi_conn *conn, + struct iscsi_datain *datain, struct iscsi_data_rsp *hdr, + bool set_statsn) { - int iov_ret = 0, set_statsn = 0; - u32 iov_count = 0, tx_size = 0; + hdr-opcode = ISCSI_OP_SCSI_DATA_IN; + hdr-flags = datain-flags; + if (hdr-flags ISCSI_FLAG_DATA_STATUS) { + if (cmd-se_cmd.se_cmd_flags SCF_OVERFLOW_BIT) { + hdr-flags |= ISCSI_FLAG_DATA_OVERFLOW; + hdr-residual_count = cpu_to_be32(cmd-se_cmd.residual_count
[RFC-v3 5/9] iscsi-target: Add per transport iscsi_cmd alloc/free
From: Nicholas Bellinger n...@linux-iscsi.org This patch converts struct iscsi_cmd memory allocation + free to use -iscsit_alloc_cmd() iscsit_transport API caller, and export iscsit_allocate_cmd() symbols Also add iscsi_cmd-release_cmd() to be used seperately from iscsit_transport for connection/session shutdown. v2 changes: - Remove unnecessary checks in iscsit_alloc_cmd (asias) - Drop iscsit_transport-iscsit_free_cmd() usage - Drop iscsit_transport-iscsit_unmap_cmd() usage - Add iscsi_cmd-release_cmd() - Convert lio_release_cmd() to use iscsi_cmd-release_cmd() Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target.c |1 + drivers/target/iscsi/iscsi_target_configfs.c |3 ++- drivers/target/iscsi/iscsi_target_core.h |1 + drivers/target/iscsi/iscsi_target_util.c | 25 + drivers/target/iscsi/iscsi_target_util.h |1 + 5 files changed, 26 insertions(+), 5 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 8203bf3..b01a10e 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -489,6 +489,7 @@ static struct iscsit_transport iscsi_target_transport = { .iscsit_setup_np= iscsit_setup_np, .iscsit_accept_np = iscsit_accept_np, .iscsit_free_np = iscsit_free_np, + .iscsit_alloc_cmd = iscsit_alloc_cmd, .iscsit_get_login_rx= iscsit_get_login_rx, .iscsit_put_login_tx= iscsit_put_login_tx, }; diff --git a/drivers/target/iscsi/iscsi_target_configfs.c b/drivers/target/iscsi/iscsi_target_configfs.c index 78d75c8..c78b824 100644 --- a/drivers/target/iscsi/iscsi_target_configfs.c +++ b/drivers/target/iscsi/iscsi_target_configfs.c @@ -1700,7 +1700,8 @@ static void lio_release_cmd(struct se_cmd *se_cmd) { struct iscsi_cmd *cmd = container_of(se_cmd, struct iscsi_cmd, se_cmd); - iscsit_release_cmd(cmd); + pr_debug(Entering lio_release_cmd for se_cmd: %p\n, se_cmd); + cmd-release_cmd(cmd); } /* End functions for target_core_fabric_ops */ diff --git a/drivers/target/iscsi/iscsi_target_core.h b/drivers/target/iscsi/iscsi_target_core.h index 53400b0..60ec4b9 100644 --- a/drivers/target/iscsi/iscsi_target_core.h +++ b/drivers/target/iscsi/iscsi_target_core.h @@ -485,6 +485,7 @@ struct iscsi_cmd { u32 first_data_sg_off; u32 kmapped_nents; sense_reason_t sense_reason; + void (*release_cmd)(struct iscsi_cmd *); } cacheline_aligned; struct iscsi_tmr_req { diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c index 4cf1e7f..0b73f90 100644 --- a/drivers/target/iscsi/iscsi_target_util.c +++ b/drivers/target/iscsi/iscsi_target_util.c @@ -149,6 +149,18 @@ void iscsit_free_r2ts_from_list(struct iscsi_cmd *cmd) spin_unlock_bh(cmd-r2t_lock); } +struct iscsi_cmd *iscsit_alloc_cmd(struct iscsi_conn *conn, gfp_t gfp_mask) +{ + struct iscsi_cmd *cmd; + + cmd = kmem_cache_zalloc(lio_cmd_cache, gfp_mask); + if (!cmd) + return NULL; + + cmd-release_cmd = iscsit_release_cmd; + return cmd; +} + /* * May be called from software interrupt (timer) context for allocating * iSCSI NopINs. @@ -157,13 +169,12 @@ struct iscsi_cmd *iscsit_allocate_cmd(struct iscsi_conn *conn, gfp_t gfp_mask) { struct iscsi_cmd *cmd; - cmd = kmem_cache_zalloc(lio_cmd_cache, gfp_mask); + cmd = conn-conn_transport-iscsit_alloc_cmd(conn, gfp_mask); if (!cmd) { pr_err(Unable to allocate memory for struct iscsi_cmd.\n); return NULL; } - - cmd-conn = conn; + cmd-conn = conn; INIT_LIST_HEAD(cmd-i_conn_node); INIT_LIST_HEAD(cmd-datain_list); INIT_LIST_HEAD(cmd-cmd_r2t_list); @@ -176,6 +187,7 @@ struct iscsi_cmd *iscsit_allocate_cmd(struct iscsi_conn *conn, gfp_t gfp_mask) return cmd; } +EXPORT_SYMBOL(iscsit_allocate_cmd); struct iscsi_seq *iscsit_get_seq_holder_for_datain( struct iscsi_cmd *cmd, @@ -690,6 +702,11 @@ void iscsit_free_cmd(struct iscsi_cmd *cmd) */ switch (cmd-iscsi_opcode) { case ISCSI_OP_SCSI_CMD: + if (cmd-data_direction == DMA_TO_DEVICE) + iscsit_stop_dataout_timer(cmd); + /* +* Fallthrough +*/ case ISCSI_OP_SCSI_TMFUNC: transport_generic_free_cmd(cmd-se_cmd, 1); break; @@ -705,7 +722,7 @@ void iscsit_free_cmd(struct iscsi_cmd *cmd) } /* Fall-through */ default: - iscsit_release_cmd(cmd); + cmd-release_cmd(cmd); break; } } diff --git a/drivers/target/iscsi/iscsi_target_util.h b/drivers/target
[RFC-v3 0/9] Add support for iSCSI Extensions for RDMA (ISER) target mode
From: Nicholas Bellinger n...@linux-iscsi.org This series is the third RFC for iSCSI Extensions for RDMA (ISER) target mode support planned for an future v3.10 merge. This series refactors existing traditional iscsi-target mode logic in order for external ib_isert.ko module code to function with areas common to traditional TCP socket based iSCSI and RDMA verbs based ISER operation. This includes a basic iscsit_transport API that allows different transports to reside under the existing iscsi-target configfs control plane, using an pre-defined network portal attribute to enable a rdma_cm listener on top of existing ipoib portals. At this point the code is functional and pushing sustained RDMA_WRITE + RDMA_READ traffic using open-iscsi on top of multiple iser network portals + multiple IB HCA ports + multiple LUNs. Thus far we're using Mellanox IB HCAs for initial development, and will be verfiying with RCoE capable NICs as well in the near future. This RFC-v3 code is available in git against v3.9-rc3 here: git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git iser_target-rfc-v3 The changes included since RFC-v2 include: - Convert to use per isert_cq_desc-cq_[rx,tx]_work + drop tasklets (Or + nab) - Move IB_EVENT_QP_LAST_WQE_REACHED warn into correct isert_qp_event_callback (Or) - Drop unnecessary IB_ACCESS_REMOTE_* access flag usage in isert_create_device_ib_res (Or) - Add common isert_init_send_wr(), and convert isert_put_* calls (Or) - Move to verbs+core logic to single ib_isert.[c,h] (Or + nab) - Add kmem_cache isert_cmd_cache usage for descriptor allocation (nab) - Move common ib_post_send() logic used by isert_put_*() to isert_post_response() (nab) - Add isert_put_reject call in isert_response_queue() for posting ISCSI_REJECT response. (nab) - Add ISTATE_SEND_REJECT checking in isert_do_control_comp. (nab) - Add extra target_put_sess_cmd call in iscsit_add_reject_from_cmd after completion - Add iscsit_build_reject for REJECT PDU init + convert iscsit_send_reject() - Add iscsi_post_login_start_timers FIXME for ISER Many thanks again to Or Gerlitz from Mellanox for v2 review feedback! With the per isert_cq_desc-cq_[rx,tx]_work cmwq dispatch now in place, process context switch overhead is reduced to each cq callback, instead of per each ib_poll_cq ib_wc descriptor as with tasklets in RFC-v2 code. This allows RFC-v3 to reach within ~%5 of single-lun NULLIO small block IOPs performance parity vs. existing STGT iser code, at almost 1/10 the number of total context switches. As before, review patches are broken down into: Patch #1 adds the export of target_get_sess_cmd to be used by iscsi-target Patch #2 - #4 include iscsi-target API template, conversion of iscsi/tcp login path to use API template, plus add iser RFC parameter keys. Patch #5 - #6 allow external iscsi_cmd descriptor allocation / free, and refactoring of RX side PDU request handling to allow incoming PDU logic to be called by external ib_isert workqueue process context. Patch #7 allows iscsi-target to use per transport API template immediate / response callbacks in the per-connection TX thread completion path, and refactoring of response PDU creation for export to external ib_isert code. Patch #8 adds the pre-defined iser network portal attribute under the existing iscsi-target configfs tree. Patch #9 - #12 is the external ib_isert.ko module code seperated into individual commits for review. Note that I'll be dropping #1 - #8 changes into target-pending/for-next shortly, and planning to do the same with ib_isert code soon unless there is an objection from Roland. Please review.. Thank you! --nab Nicholas Bellinger (9): target: Add export of target_get_sess_cmd symbol iscsi-target: Add iscsit_transport API template iscsi-target: Initial traditional TCP conversion to iscsit_transport iscsi-target: Add iser-target parameter keys + setup during login iscsi-target: Add per transport iscsi_cmd alloc/free iscsi-target: Refactor RX PDU logic + export request PDU handling iscsi-target: Refactor TX queue logic + export response PDU creation iscsi-target: Add iser network portal attribute iser-target: Add iSCSI Extensions for RDMA (iSER) target driver drivers/infiniband/Kconfig |1 + drivers/infiniband/Makefile|1 + drivers/infiniband/ulp/isert/Kconfig |6 + drivers/infiniband/ulp/isert/Makefile |2 + drivers/infiniband/ulp/isert/ib_isert.c| 2270 drivers/infiniband/ulp/isert/ib_isert.h| 142 ++ drivers/infiniband/ulp/isert/isert_proto.h | 47 + drivers/target/iscsi/Makefile |3 +- drivers/target/iscsi/iscsi_target.c| 1169 - drivers/target/iscsi/iscsi_target.h|3 +- drivers/target/iscsi/iscsi_target_configfs.c | 94 +- drivers/target/iscsi/iscsi_target_core.h
[RFC-v3 6/9] iscsi-target: Refactor RX PDU logic + export request PDU handling
From: Nicholas Bellinger n...@linux-iscsi.org This patch refactors existing traditional iscsi RX side PDU handling to use iscsit_transport, and exports the necessary logic for external transport modules. This includes: - Refactor iscsit_handle_scsi_cmd() into PDU setup / processing - Add updated iscsit_handle_scsi_cmd() for tradtional iscsi code - Add iscsit_set_unsoliticed_dataout() wrapper - Refactor iscsit_handle_data_out() into PDU check / processing - Add updated iscsit_handle_data_out() for tradtional iscsi code - Add iscsit_handle_nop_out() + iscsit_handle_task_mgt_cmd() to accept pre-allocated struct iscsi_cmd - Add iscsit_build_r2ts_for_cmd() caller for iscsi_target_transport to handle ISTATE_SEND_R2T for TX immediate queue - Refactor main traditional iscsi iscsi_target_rx_thread() PDU switch into iscsi_target_rx_opcode() using iscsit_allocate_cmd() - Turn iscsi_target_rx_thread() process context into NOP for ib_isert side work-queue. v3 changes: - Add extra target_put_sess_cmd call in iscsit_add_reject_from_cmd after completion v2 changes: - Disable iscsit_ack_from_expstatsn() usage for RDMAExtentions=Yes - Disable iscsit_allocate_datain_req() usage for RDMAExtentions=Yes - Add target_get_sess_cmd() reference counting to iscsit_setup_scsi_cmd() - Add TFO-lio_check_stop_free() fabric API caller - Add export of iscsit_stop_dataout_timer() symbol - Add iscsit_build_r2ts_for_cmd() for iscsit_transport-iscsit_get_dataout() - Convert existing usage of iscsit_build_r2ts_for_cmd() to -iscsit_get_dataout() - Drop RDMAExtentions=Yes specific check in iscsit_build_r2ts_for_cmd() - Fix RDMAExtentions - RDMAExtensions typo (andy) - Pass correct dump_payload value into iscsit_get_immediate_data() for iscsit_handle_scsi_cmd() Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target.c | 486 -- drivers/target/iscsi/iscsi_target.h |3 +- drivers/target/iscsi/iscsi_target_configfs.c |9 +- drivers/target/iscsi/iscsi_target_erl1.c | 13 +- drivers/target/iscsi/iscsi_target_login.c|3 +- drivers/target/iscsi/iscsi_target_nego.c | 15 - drivers/target/iscsi/iscsi_target_tmr.c |3 +- drivers/target/iscsi/iscsi_target_util.c |1 + 8 files changed, 331 insertions(+), 202 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index b01a10e..19d4e08 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -492,6 +492,7 @@ static struct iscsit_transport iscsi_target_transport = { .iscsit_alloc_cmd = iscsit_alloc_cmd, .iscsit_get_login_rx= iscsit_get_login_rx, .iscsit_put_login_tx= iscsit_put_login_tx, + .iscsit_get_dataout = iscsit_build_r2ts_for_cmd, }; static int __init iscsi_target_init_module(void) @@ -650,6 +651,14 @@ static int iscsit_add_reject( iscsit_add_cmd_to_response_queue(cmd, conn, cmd-i_state); ret = wait_for_completion_interruptible(cmd-reject_comp); + /* +* Perform the kref_put now if se_cmd has been setup by +* iscsit_setup_scsi_cmd() +*/ + if (cmd-se_cmd.se_tfo != NULL) { + pr_debug(iscsi reject: calling target_put_sess_cmd \n); + target_put_sess_cmd(conn-sess-se_sess, cmd-se_cmd); + } if (ret != 0) return -1; @@ -698,11 +707,20 @@ int iscsit_add_reject_from_cmd( iscsit_add_cmd_to_response_queue(cmd, conn, cmd-i_state); ret = wait_for_completion_interruptible(cmd-reject_comp); + /* +* Perform the kref_put now if se_cmd has already been setup by +* scsit_setup_scsi_cmd() +*/ + if (cmd-se_cmd.se_tfo != NULL) { + pr_debug(iscsi reject: calling target_put_sess_cmd \n); + target_put_sess_cmd(conn-sess-se_sess, cmd-se_cmd); + } if (ret != 0) return -1; return (!fail_conn) ? 0 : -1; } +EXPORT_SYMBOL(iscsit_add_reject_from_cmd); /* * Map some portion of the allocated scatterlist to an iovec, suitable for @@ -761,6 +779,9 @@ static void iscsit_ack_from_expstatsn(struct iscsi_conn *conn, u32 exp_statsn) conn-exp_statsn = exp_statsn; + if (conn-sess-sess_ops-RDMAExtensions) + return; + spin_lock_bh(conn-cmd_lock); list_for_each_entry(cmd, conn-conn_cmd_list, i_conn_node) { spin_lock(cmd-istate_lock); @@ -793,12 +814,10 @@ static int iscsit_allocate_iovecs(struct iscsi_cmd *cmd) return 0; } -static int iscsit_handle_scsi_cmd( - struct iscsi_conn *conn, - unsigned char *buf) +int iscsit_setup_scsi_cmd(struct iscsi_conn *conn, struct iscsi_cmd *cmd, + unsigned char *buf) { - int data_direction, payload_length, cmdsn_ret = 0, immed_ret; - struct iscsi_cmd *cmd = NULL; + int
[RFC-v3 4/9] iscsi-target: Add iser-target parameter keys + setup during login
From: Nicholas Bellinger n...@linux-iscsi.org This patch adds RDMAExtensions, InitiatorRecvDataSegmentLength and TargetRecvDataSegmentLength parameters keys necessary for iser-target login to occur. This includes setting the necessary parameters during login path code within iscsi_login_zero_tsih_s2(), and currently PAGE_SIZE aligning the target's advertised MRDSL for immediate data and unsolicited data-out incoming payloads. v3 changes: - Add iscsi_post_login_start_timers FIXME for ISER v2 changes: - Fix RDMAExtentions - RDMAExtensions typo (andy) - Drop unnecessary '== true' conditional checks for type bool Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target_core.h | 10 +++ drivers/target/iscsi/iscsi_target_login.c | 77 drivers/target/iscsi/iscsi_target_parameters.c | 75 +-- drivers/target/iscsi/iscsi_target_parameters.h | 16 +- 4 files changed, 161 insertions(+), 17 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target_core.h b/drivers/target/iscsi/iscsi_target_core.h index 2587677..53400b0 100644 --- a/drivers/target/iscsi/iscsi_target_core.h +++ b/drivers/target/iscsi/iscsi_target_core.h @@ -244,6 +244,11 @@ struct iscsi_conn_ops { u8 IFMarker; /* [0,1] == [No,Yes] */ u32 OFMarkInt; /* [1..65535] */ u32 IFMarkInt; /* [1..65535] */ + /* +* iSER specific connection parameters +*/ + u32 InitiatorRecvDataSegmentLength; /* [512..2**24-1] */ + u32 TargetRecvDataSegmentLength;/* [512..2**24-1] */ }; struct iscsi_sess_ops { @@ -265,6 +270,10 @@ struct iscsi_sess_ops { u8 DataSequenceInOrder;/* [0,1] == [No,Yes] */ u8 ErrorRecoveryLevel; /* [0..2] */ u8 SessionType;/* [0,1] == [Normal,Discovery]*/ + /* +* iSER specific session parameters +*/ + u8 RDMAExtensions; /* [0,1] == [No,Yes] */ }; struct iscsi_queue_req { @@ -284,6 +293,7 @@ struct iscsi_data_count { }; struct iscsi_param_list { + booliser; struct list_headparam_list; struct list_headextra_response_list; }; diff --git a/drivers/target/iscsi/iscsi_target_login.c b/drivers/target/iscsi/iscsi_target_login.c index 57f9dea..24a9e69 100644 --- a/drivers/target/iscsi/iscsi_target_login.c +++ b/drivers/target/iscsi/iscsi_target_login.c @@ -340,6 +340,7 @@ static int iscsi_login_zero_tsih_s2( struct iscsi_node_attrib *na; struct iscsi_session *sess = conn-sess; unsigned char buf[32]; + bool iser = false; sess-tpg = conn-tpg; @@ -361,7 +362,10 @@ static int iscsi_login_zero_tsih_s2( return -1; } - iscsi_set_keys_to_negotiate(0, conn-param_list); + if (conn-conn_transport-transport_type == ISCSI_INFINIBAND) + iser = true; + + iscsi_set_keys_to_negotiate(conn-param_list, iser); if (sess-sess_ops-SessionType) return iscsi_set_keys_irrelevant_for_discovery( @@ -399,6 +403,56 @@ static int iscsi_login_zero_tsih_s2( if (iscsi_login_disable_FIM_keys(conn-param_list, conn) 0) return -1; + /* +* Set RDMAExtensions=Yes by default for iSER enabled network portals +*/ + if (iser) { + struct iscsi_param *param; + unsigned long mrdsl, off; + int rc; + + sprintf(buf, RDMAExtensions=Yes); + if (iscsi_change_param_value(buf, conn-param_list, 0) 0) { + iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_TARGET_ERR, + ISCSI_LOGIN_STATUS_NO_RESOURCES); + return -1; + } + /* +* Make MaxRecvDataSegmentLength PAGE_SIZE aligned for +* Immediate Data + Unsolicitied Data-OUT if necessary.. +*/ + param = iscsi_find_param_from_key(MaxRecvDataSegmentLength, + conn-param_list); + if (!param) { + iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_TARGET_ERR, + ISCSI_LOGIN_STATUS_NO_RESOURCES); + return -1; + } + rc = strict_strtoul(param-value, 0, mrdsl); + if (rc 0) { + iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_TARGET_ERR, + ISCSI_LOGIN_STATUS_NO_RESOURCES); + return -1; + } + off = mrdsl % PAGE_SIZE; + if (!off) + return 0; + + if (mrdsl PAGE_SIZE) + mrdsl = PAGE_SIZE
[RFC-v3 3/9] iscsi-target: Initial traditional TCP conversion to iscsit_transport
From: Nicholas Bellinger n...@linux-iscsi.org This patch performs the initial conversion of existing traditional iscsi to use iscsit_transport API callers. This includes: - iscsi-np cleanups for iscsit_transport_type - Add iscsi-np transport calls w/ -iscsit_setup_up() and -iscsit_free_np() - Convert login thread process context to use -iscsit_accept_np() for connections with pre-allocated struct iscsi_conn - Convert existing socket accept code to iscsit_accept_np() - Convert login RX/TX callers to use -iscsit_get_login_rx() and -iscsit_put_login_tx() to exchange request/response PDUs - Convert existing socket login RX/TX calls into iscsit_get_login_rx() and iscsit_put_login_tx() - Change iscsit_close_connection() to invoke -iscsit_free_conn() + iscsit_put_transport() calls. - Add iscsit_register_transport() + iscsit_unregister_transport() calls to module init/exit v2 changes: - Update module init/exit to use register_transport() + unregister_transport() Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target.c| 35 ++- drivers/target/iscsi/iscsi_target_core.h | 15 +- drivers/target/iscsi/iscsi_target_login.c | 414 drivers/target/iscsi/iscsi_target_login.h |6 + drivers/target/iscsi/iscsi_target_nego.c | 167 ++- drivers/target/iscsi/iscsi_target_nego.h | 11 +- drivers/target/iscsi/iscsi_target_parameters.c | 12 +- drivers/target/iscsi/iscsi_target_tpg.c|6 +- drivers/target/iscsi/iscsi_target_util.c | 27 +-- 9 files changed, 378 insertions(+), 315 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 7ea246a..8203bf3 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c @@ -49,6 +49,8 @@ #include iscsi_target_device.h #include iscsi_target_stat.h +#include target/iscsi/iscsi_transport.h + static LIST_HEAD(g_tiqn_list); static LIST_HEAD(g_np_list); static DEFINE_SPINLOCK(tiqn_lock); @@ -401,8 +403,7 @@ struct iscsi_np *iscsit_add_np( spin_unlock_bh(np_lock); pr_debug(CORE[0] - Added Network Portal: %s:%hu on %s\n, - np-np_ip, np-np_port, (np-np_network_transport == ISCSI_TCP) ? - TCP : SCTP); + np-np_ip, np-np_port, np-np_transport-name); return np; } @@ -441,11 +442,10 @@ int iscsit_reset_np_thread( return 0; } -static int iscsit_del_np_comm(struct iscsi_np *np) +static void iscsit_free_np(struct iscsi_np *np) { if (np-np_socket) sock_release(np-np_socket); - return 0; } int iscsit_del_np(struct iscsi_np *np) @@ -467,20 +467,32 @@ int iscsit_del_np(struct iscsi_np *np) send_sig(SIGINT, np-np_thread, 1); kthread_stop(np-np_thread); } - iscsit_del_np_comm(np); + + np-np_transport-iscsit_free_np(np); spin_lock_bh(np_lock); list_del(np-np_list); spin_unlock_bh(np_lock); pr_debug(CORE[0] - Removed Network Portal: %s:%hu on %s\n, - np-np_ip, np-np_port, (np-np_network_transport == ISCSI_TCP) ? - TCP : SCTP); + np-np_ip, np-np_port, np-np_transport-name); + iscsit_put_transport(np-np_transport); kfree(np); return 0; } +static struct iscsit_transport iscsi_target_transport = { + .name = iSCSI/TCP, + .transport_type = ISCSI_TCP, + .owner = NULL, + .iscsit_setup_np= iscsit_setup_np, + .iscsit_accept_np = iscsit_accept_np, + .iscsit_free_np = iscsit_free_np, + .iscsit_get_login_rx= iscsit_get_login_rx, + .iscsit_put_login_tx= iscsit_put_login_tx, +}; + static int __init iscsi_target_init_module(void) { int ret = 0; @@ -557,6 +569,8 @@ static int __init iscsi_target_init_module(void) goto ooo_out; } + iscsit_register_transport(iscsi_target_transport); + if (iscsit_load_discovery_tpg() 0) goto r2t_out; @@ -587,6 +601,7 @@ static void __exit iscsi_target_cleanup_module(void) iscsi_deallocate_thread_sets(); iscsi_thread_set_free(); iscsit_release_discovery_tpg(); + iscsit_unregister_transport(iscsi_target_transport); kmem_cache_destroy(lio_cmd_cache); kmem_cache_destroy(lio_qr_cache); kmem_cache_destroy(lio_dr_cache); @@ -4053,6 +4068,12 @@ int iscsit_close_connection( if (conn-sock) sock_release(conn-sock); + + if (conn-conn_transport-iscsit_free_conn) + conn-conn_transport-iscsit_free_conn(conn); + + iscsit_put_transport(conn-conn_transport); + conn-thread_set = NULL; pr_debug(Moving to TARG_CONN_STATE_FREE.\n); diff --git a/drivers/target/iscsi/iscsi_target_core.h b/drivers
[RFC-v3 1/9] target: Add export of target_get_sess_cmd symbol
From: Nicholas Bellinger n...@linux-iscsi.org Export target_get_sess_cmd() symbol so that it can be used by iscsi-target. Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/target_core_transport.c |4 ++-- include/target/target_core_fabric.h|2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c index 493e9e5..f8388b4 100644 --- a/drivers/target/target_core_transport.c +++ b/drivers/target/target_core_transport.c @@ -65,7 +65,6 @@ static void transport_complete_task_attr(struct se_cmd *cmd); static void transport_handle_queue_full(struct se_cmd *cmd, struct se_device *dev); static int transport_generic_get_mem(struct se_cmd *cmd); -static int target_get_sess_cmd(struct se_session *, struct se_cmd *, bool); static void transport_put_cmd(struct se_cmd *cmd); static void target_complete_ok_work(struct work_struct *work); @@ -2179,7 +2178,7 @@ EXPORT_SYMBOL(transport_generic_free_cmd); * @se_cmd:command descriptor to add * @ack_kref: Signal that fabric will perform an ack target_put_sess_cmd() */ -static int target_get_sess_cmd(struct se_session *se_sess, struct se_cmd *se_cmd, +int target_get_sess_cmd(struct se_session *se_sess, struct se_cmd *se_cmd, bool ack_kref) { unsigned long flags; @@ -2208,6 +2207,7 @@ out: spin_unlock_irqrestore(se_sess-sess_cmd_lock, flags); return ret; } +EXPORT_SYMBOL(target_get_sess_cmd); static void target_release_cmd_kref(struct kref *kref) { diff --git a/include/target/target_core_fabric.h b/include/target/target_core_fabric.h index aaa1ee6..ba3471b 100644 --- a/include/target/target_core_fabric.h +++ b/include/target/target_core_fabric.h @@ -120,7 +120,7 @@ booltransport_wait_for_tasks(struct se_cmd *); inttransport_check_aborted_status(struct se_cmd *, int); inttransport_send_check_condition_and_sense(struct se_cmd *, sense_reason_t, int); - +inttarget_get_sess_cmd(struct se_session *, struct se_cmd *, bool); inttarget_put_sess_cmd(struct se_session *, struct se_cmd *); void target_sess_cmd_list_set_waiting(struct se_session *); void target_wait_for_sess_cmds(struct se_session *, int); -- 1.7.2.5 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC-v3 9/9] iser-target: Add iSCSI Extensions for RDMA (iSER) target driver
On Thu, 2013-04-04 at 12:20 +0300, Or Gerlitz wrote: On 04/04/2013 10:24, Nicholas A. Bellinger wrote: + +void isert_cq_tx_callback(struct ib_cq *, void *); +void isert_cq_rx_callback(struct ib_cq *, void *); +void isert_free_rx_descriptors(struct isert_conn *); any reason not to have these as static functions (same for isert_cq_rx_work) Nope. Looking at re-orig now to avoid the use of function prototypes here, and marking everything as static.. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC-v3 9/9] iser-target: Add iSCSI Extensions for RDMA (iSER) target driver
On Thu, 2013-04-04 at 12:45 +0300, Or Gerlitz wrote: On 04/04/2013 10:24, Nicholas A. Bellinger wrote: +#define ISER_RECV_DATA_SEG_LEN 8192 +#define ISER_RX_PAYLOAD_SIZE(ISER_HEADERS_LEN + ISER_RECV_DATA_SEG_LEN) [...] +#define ISER_RX_PAD_SIZE (16384 - (ISER_RX_PAYLOAD_SIZE + \ + sizeof(u64) + sizeof(struct ib_sge))) We're eating here too much ram for the pad, you need 8K + something, so the pad can count down from 12K and not 16K which means each such element will consume three pages and not four. Hmm, IIRC this larger pad was originally required after bumping the ISER_RECV_DATA_SEG_LEN value for handling incoming ImmediateData and Unsolicited Data-OUT.. Will try using 12k here and see what happens.. Also, ISER_RECV_DATA_SEG_LEN will need to be enforced as the largest MaxRecvDataSegmentLength negotiated during iser login to prevent the initiator from exceeding the hardcoded value.. +struct iser_rx_desc { + struct iser_hdr iser_header; + struct iscsi_hdr iscsi_header; + chardata[ISER_RECV_DATA_SEG_LEN]; + u64 dma_addr; + struct ib_sge rx_sg; + charpad[ISER_RX_PAD_SIZE]; +} __packed; + +struct isert_rx_desc { + struct isert_conn *desc_conn; + struct work_struct desc_work; + struct iser_rx_desc desc; +} __packed; You have way enough room in the pad field of struct iser_rx_desc to place there the two fields added by struct isert_rx_desc (and you only use struct iser_rx_desc from within isert_rx_desc) -- any reason not to unify them? This is left-over cruft from the per isert_rx_desc dispatch into process context.. Dropping this now. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC-v3 9/9] iser-target: Add iSCSI Extensions for RDMA (iSER) target driver
On Thu, 2013-04-04 at 12:51 +0300, Or Gerlitz wrote: On 04/04/2013 10:24, Nicholas A. Bellinger wrote: +static int +isert_put_response(struct iscsi_conn *conn, struct iscsi_cmd *cmd) +{ + struct isert_cmd *isert_cmd = container_of(cmd, + struct isert_cmd, iscsi_cmd); + struct isert_conn *isert_conn = (struct isert_conn *)conn-context; + struct ib_send_wr *send_wr = isert_cmd-tx_desc.send_wr; + struct iscsi_scsi_rsp *hdr = (struct iscsi_scsi_rsp *) + isert_cmd-tx_desc.iscsi_header; + + isert_create_send_desc(isert_conn, isert_cmd, isert_cmd-tx_desc); + iscsit_build_rsp_pdu(cmd, conn, true, hdr); + isert_init_tx_hdrs(isert_conn, isert_cmd-tx_desc); + /* +* Attach SENSE DATA payload to iSCSI Response PDU +*/ + if (cmd-se_cmd.sense_buffer + ((cmd-se_cmd.se_cmd_flags SCF_TRANSPORT_TASK_SENSE) || + (cmd-se_cmd.se_cmd_flags SCF_EMULATED_TASK_SENSE))) { + struct ib_device *ib_dev = isert_conn-conn_cm_id-device; + struct ib_sge *tx_dsg = isert_cmd-tx_desc.tx_sg[1]; + u32 padding, sense_len; + + put_unaligned_be16(cmd-se_cmd.scsi_sense_length, + cmd-sense_buffer); + cmd-se_cmd.scsi_sense_length += sizeof(__be16); + + padding = -(cmd-se_cmd.scsi_sense_length) 3; + hton24(hdr-dlength, (u32)cmd-se_cmd.scsi_sense_length); + sense_len = cmd-se_cmd.scsi_sense_length + padding; + + isert_cmd-sense_buf_dma = ib_dma_map_single(ib_dev, + (void *)cmd-sense_buffer, sense_len, + DMA_TO_DEVICE); + + isert_cmd-sense_buf_len = sense_len; + ib_dma_sync_single_for_cpu(ib_dev, isert_cmd-sense_buf_dma, + sense_len, DMA_TO_DEVICE); + ib_dma_sync_single_for_device(ib_dev, isert_cmd-sense_buf_dma, + sense_len, DMA_TO_DEVICE); + you just called dma_map_single, and not going to touch the buffer before posting it to the wire, there's no point to sync it for the cpu and for the device, remove these calls. Dropped. Thanks Or! -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC-v2 10/12] iser-target: Add logic for verbs
On Wed, 2013-04-03 at 10:04 +0300, Or Gerlitz wrote: On 02/04/2013 09:18, Or Gerlitz wrote: On 23/03/2013 01:55, Nicholas A. Bellinger wrote: +++ b/drivers/infiniband/ulp/isert/isert_verbs.h @@ -0,0 +1,5 @@ +extern void isert_connect_release(struct isert_conn *); +extern void isert_put_conn(struct isert_conn *); +extern int isert_cma_handler(struct rdma_cm_id *, struct rdma_cm_event *); +extern int isert_post_recv(struct isert_conn *, u32); +extern int isert_post_send(struct isert_conn *, struct iser_tx_desc *); why use extern here? maybe a left over from V1? Nic, are you picking this comment one and its sister comment asking to remove externs and use less header files? So in yesterday's target-pending/iser-target-wip push, source/headers have been merged into a single ib_isert.[c,h], with the exception of the existing isert_proto.h definitions. This will be included as a single commit for RFC-v3. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC-v2 11/12] iser-target: Add logic for core
On Tue, 2013-04-02 at 11:33 +0300, Or Gerlitz wrote: On 23/03/2013 01:55, Nicholas A. Bellinger wrote: +++ b/drivers/infiniband/ulp/isert/isert_core.h @@ -0,0 +1,11 @@ +#include linux/socket.h +#include linux/in.h +#include linux/in6.h +#include rdma/ib_verbs.h +#include rdma/rdma_cm.h + +extern void iser_cq_tx_tasklet(unsigned long); +extern void isert_cq_tx_callback(struct ib_cq *, void *); +extern void iser_cq_rx_tasklet(unsigned long); +extern void isert_cq_rx_callback(struct ib_cq *, void *); +extern void isert_free_rx_descriptors(struct isert_conn *); no need for externs here too, agree? also, any reason for these two header files not to be merged into one or into one of the other header files? nod, merging into a single source/include for RFC-v3 code. diff --git a/drivers/infiniband/ulp/isert/isert_verbs.h b/drivers/infiniband/ulp/isert/isert_verbs.h new file mode 100644 index 000..da7924d --- /dev/null +++ b/drivers/infiniband/ulp/isert/isert_verbs.h @@ -0,0 +1,5 @@ +extern void isert_connect_release(struct isert_conn *); +extern void isert_put_conn(struct isert_conn *); +extern int isert_cma_handler(struct rdma_cm_id *, struct rdma_cm_event *); +extern int isert_post_recv(struct isert_conn *, u32); +extern int isert_post_send(struct isert_conn *, struct iser_tx_desc *); -- To unsubscribe from this list: send the line unsubscribe target-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC-v2 10/12] iser-target: Add logic for verbs
On Tue, 2013-04-02 at 23:09 +0300, Or Gerlitz wrote: On Sat, Mar 23, 2013 at 1:55 AM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: [...] +static void +isert_qp_event_callback(struct ib_event *e, void *context) +{ + struct isert_conn *isert_conn = (struct isert_conn *)context; + + pr_err(isert_qp_event_callback event: %d\n, e-event); + switch (e-event) { + case IB_EVENT_COMM_EST: + rdma_notify(isert_conn-conn_cm_id, IB_EVENT_COMM_EST); + break; + default: + break; + } +} [...] +static void +isert_cq_event_callback(struct ib_event *e, void *context) +{ + pr_debug(isert_cq_event_callback event: %d\n, e-event); + + switch (e-event) { + case IB_EVENT_QP_LAST_WQE_REACHED: + pr_warn(Reached TX IB_EVENT_QP_LAST_WQE_REACHED:\n); + break; + default: + pr_warn(Unknown e-event; %d\n, e-event); + break; + } +} This is QP not CQ event, move the case for it to QP event hander isert_qp_event_callback Done. Thanks Or! -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC-v2 10/12] iser-target: Add logic for verbs
On Wed, 2013-04-03 at 00:13 +0300, Or Gerlitz wrote: On Sat, Mar 23, 2013 at 1:55 AM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: + device-dev_mr = ib_get_dma_mr(device-dev_pd, + IB_ACCESS_LOCAL_WRITE | + IB_ACCESS_REMOTE_WRITE | + IB_ACCESS_REMOTE_READ); remove IB_ACCESS_REMOTE_yyy access flags, you're not letting anyone do remote rdma to this memory region Droping IB_ACCESS_REMOTE_yyy access for RFC-v3. Thanks Or! -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC-v2 11/12] iser-target: Add logic for core
On Wed, 2013-04-03 at 00:24 +0300, Or Gerlitz wrote: On Sat, Mar 23, 2013 at 1:55 AM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: +static int +isert_put_response(struct iscsi_conn *conn, struct iscsi_cmd *cmd) +{ + struct isert_cmd *isert_cmd = container_of(cmd, + struct isert_cmd, iscsi_cmd); + struct isert_conn *isert_conn = (struct isert_conn *)conn-context; + struct ib_send_wr *send_wr = isert_cmd-tx_desc.send_wr, *wr_failed; + struct iscsi_scsi_rsp *hdr = (struct iscsi_scsi_rsp *) + isert_cmd-tx_desc.iscsi_header; + int ret; + + isert_create_send_desc(isert_conn, isert_cmd, isert_cmd-tx_desc); + iscsit_build_rsp_pdu(cmd, conn, true, hdr); + isert_init_tx_hdrs(isert_conn, isert_cmd-tx_desc); + /* +* Attach SENSE DATA payload to iSCSI Response PDU +*/ + if (cmd-se_cmd.sense_buffer + ((cmd-se_cmd.se_cmd_flags SCF_TRANSPORT_TASK_SENSE) || + (cmd-se_cmd.se_cmd_flags SCF_EMULATED_TASK_SENSE))) { + struct ib_device *ib_dev = isert_conn-conn_cm_id-device; + struct ib_sge *tx_dsg = isert_cmd-tx_desc.tx_sg[1]; + u32 padding, sense_len; + + put_unaligned_be16(cmd-se_cmd.scsi_sense_length, + cmd-sense_buffer); + cmd-se_cmd.scsi_sense_length += sizeof(__be16); + + padding = -(cmd-se_cmd.scsi_sense_length) 3; + hton24(hdr-dlength, (u32)cmd-se_cmd.scsi_sense_length); + sense_len = cmd-se_cmd.scsi_sense_length + padding; + + isert_cmd-sense_buf_dma = ib_dma_map_single(ib_dev, + (void *)cmd-sense_buffer, sense_len, + DMA_TO_DEVICE); + + isert_cmd-sense_buf_len = sense_len; + ib_dma_sync_single_for_cpu(ib_dev, isert_cmd-sense_buf_dma, + sense_len, DMA_TO_DEVICE); + ib_dma_sync_single_for_device(ib_dev, isert_cmd-sense_buf_dma, + sense_len, DMA_TO_DEVICE); + + tx_dsg-addr= isert_cmd-sense_buf_dma; + tx_dsg-length = sense_len; + tx_dsg-lkey= isert_conn-conn_mr-lkey; + isert_cmd-tx_desc.num_sge = 2; + } + + isert_cmd-rdma_wr.iser_ib_op = ISER_IB_SEND; [...] + send_wr-wr_id = (unsigned long)isert_cmd-tx_desc; + send_wr-opcode = IB_WR_SEND; + send_wr-send_flags = IB_SEND_SIGNALED; + send_wr-sg_list = isert_cmd-tx_desc.tx_sg[0]; + send_wr-num_sge = isert_cmd-tx_desc.num_sge; + send_wr-next = NULL; [...] These seven lines are repeated 3-5 times below, a quick question and suggestion: 1. can't we do it beforehand? 2. we can move to helper function and call it when needed. nod, adding common isert_init_send_wr() caller now. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 01/11] iscsi-target: Add iscsit_transport API template
On Fri, 2013-03-22 at 10:23 -0700, Andy Grover wrote: On 03/07/2013 05:45 PM, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org Add basic struct iscsit_transport API template to allow iscsi-target for running with external transport modules using existing iscsi_target_core.h code. For all external modules, this calls try_module_get() and module_put() to obtain + release an external iscsit_transport module reference count. Also include the iscsi-target symbols necessary in iscsi_transport.h to allow external transport modules to function. Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/Makefile |3 +- drivers/target/iscsi/iscsi_target_transport.c | 57 ++ include/target/iscsi/iscsi_transport.h| 77 + 3 files changed, 136 insertions(+), 1 deletions(-) create mode 100644 drivers/target/iscsi/iscsi_target_transport.c create mode 100644 include/target/iscsi/iscsi_transport.h diff --git a/drivers/target/iscsi/Makefile b/drivers/target/iscsi/Makefile index 5b9a2cf..13a9240 100644 --- a/drivers/target/iscsi/Makefile +++ b/drivers/target/iscsi/Makefile @@ -15,6 +15,7 @@ iscsi_target_mod-y += iscsi_target_parameters.o \ iscsi_target_util.o \ iscsi_target.o \ iscsi_target_configfs.o \ - iscsi_target_stat.o + iscsi_target_stat.o \ + iscsi_target_transport.o obj-$(CONFIG_ISCSI_TARGET)+= iscsi_target_mod.o diff --git a/drivers/target/iscsi/iscsi_target_transport.c b/drivers/target/iscsi/iscsi_target_transport.c new file mode 100644 index 000..4ffd965 --- /dev/null +++ b/drivers/target/iscsi/iscsi_target_transport.c @@ -0,0 +1,57 @@ +#include linux/spinlock.h +#include linux/list.h +#include target/iscsi/iscsi_transport.h + +static LIST_HEAD(g_transport_list); +static DEFINE_MUTEX(transport_mutex); + +struct iscsit_transport *iscsit_get_transport(int type) +{ + struct iscsit_transport *t; + + mutex_lock(transport_mutex); + list_for_each_entry(t, g_transport_list, t_node) { + if (t-transport_type == type) { + if (t-owner !try_module_get(t-owner)) { + t = NULL; + } + mutex_unlock(transport_mutex); + return t; + } + } + mutex_unlock(transport_mutex); + + return NULL; +} +EXPORT_SYMBOL(iscsit_get_transport); + +void iscsit_put_transport(struct iscsit_transport *t) +{ + if (t-owner) + module_put(t-owner); +} +EXPORT_SYMBOL(iscsit_put_transport); + +int iscsit_create_transport(struct iscsit_transport *t) +{ + INIT_LIST_HEAD(t-t_node); + + mutex_lock(transport_mutex); + list_add_tail(t-t_node, g_transport_list); + mutex_unlock(transport_mutex); + + printk(Created iSCSI transport: %s\n, t-name); + + return 0; +} +EXPORT_SYMBOL(iscsit_create_transport); + +void iscsit_destroy_transport(struct iscsit_transport *t) +{ + mutex_lock(transport_mutex); + list_del(t-t_node); + mutex_unlock(transport_mutex); + + printk(Destroyed iSCSI transport: %s\n, t-name); +} I don't think create/destroy are the right names to use here - I suggest register/unregister or something like that? Fair enough. Renamed to register/unregister in the forth-coming RFC-v2 Please also make sure to turn those printks in to pr_something() for the final version. Done. Thanks, --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 02/11] iscsi-target: Initial traditional TCP conversion to iscsit_transport
On Fri, 2013-03-22 at 10:23 -0700, Andy Grover wrote: On 03/07/2013 05:45 PM, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org This patch performs the initial conversion of existing traditional iscsi to use iscsit_transport API callers. This includes: - iscsi-np cleanups for iscsit_transport_type - Add iscsi-np transport calls w/ -iscsit_setup_up() and -iscsit_free_np() - Convert login thread process context to use -iscsit_accept_np() for connections with pre-allocated struct iscsi_conn - Convert existing socket accept code to iscsit_accept_np() - Convert login RX/TX callers to use -iscsit_get_login_rx() and -iscsit_put_login_tx() to exchange request/response PDUs - Convert existing socket login RX/TX calls into iscsit_get_login_rx() and iscsit_put_login_tx() - Change iscsit_close_connection() to invoke -iscsit_free_conn() + iscsit_put_transport() calls. - Add iscsit_create_transport() + iscsit_destroy_transport() calls to module init/exit Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/target/iscsi/iscsi_target.c| 35 ++- drivers/target/iscsi/iscsi_target_core.h | 15 +- drivers/target/iscsi/iscsi_target_login.c | 411 drivers/target/iscsi/iscsi_target_login.h |6 + drivers/target/iscsi/iscsi_target_nego.c | 185 ++-- drivers/target/iscsi/iscsi_target_nego.h | 11 +- drivers/target/iscsi/iscsi_target_parameters.c | 12 +- drivers/target/iscsi/iscsi_target_tpg.c|6 +- drivers/target/iscsi/iscsi_target_util.c | 27 +-- 9 files changed, 376 insertions(+), 332 deletions(-) diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c index 23a98e6..4dc1c9b 100644 --- a/drivers/target/iscsi/iscsi_target.c +++ b/drivers/target/iscsi/iscsi_target.c SNIP @@ -4045,6 +4060,12 @@ int iscsit_close_connection( if (conn-sock) sock_release(conn-sock); + + if (conn-conn_transport-iscsit_free_conn) + conn-conn_transport-iscsit_free_conn(conn); For all the function pointers in conn_transport, won't they always be there? If so, can we just call them w/o checking? For this case, no. -iscsit_free_conn() is currently present for only ib_isert.ko code. diff --git a/drivers/target/iscsi/iscsi_target_login.c b/drivers/target/iscsi/iscsi_target_login.c index fdb632f..9354a5f 100644 --- a/drivers/target/iscsi/iscsi_target_login.c +++ b/drivers/target/iscsi/iscsi_target_login.c SNIP @@ -635,7 +677,13 @@ static int iscsi_post_login_handler( spin_unlock_bh(sess-conn_lock); iscsi_post_login_start_timers(conn); - iscsi_activate_thread_set(conn, ts); + + if (conn-conn_transport == ISCSI_TCP) { + iscsi_activate_thread_set(conn, ts); I thought conn_transport was a pointer? It looks like it's being compared against an enum here. This has already been removed for RFC-v2 code. + } else { + printk(Not calling iscsi_activate_thread_set\n); + dump_stack(); Left-in debug code? Ditto here too.. +int iscsit_get_login_rx(struct iscsi_conn *conn, struct iscsi_login *login) +{ + struct iscsi_login_req *login_req; + u32 padding = 0, payload_length; + + if (iscsi_login_rx_data(conn, login-req, ISCSI_HDR_LEN) 0) + return -1; + + login_req = (struct iscsi_login_req *)login-req; + payload_length = ntoh24(login_req-dlength); + padding = ((-payload_length) 3); + + pr_debug(Got Login Command, Flags 0x%02x, ITT: 0x%08x, +CmdSN: 0x%08x, ExpStatSN: 0x%08x, CID: %hu, Length: %u\n, + login_req-flags, login_req-itt, login_req-cmdsn, + login_req-exp_statsn, login_req-cid, payload_length); + /* +* Setup the initial iscsi_login values from the leading +* login request PDU. +*/ + if (login-first_request) { + login_req = (struct iscsi_login_req *)login-req; + login-leading_connection = (!login_req-tsih) ? 1 : 0; + login-current_stage= + (login_req-flags ISCSI_FLAG_LOGIN_CURRENT_STAGE_MASK) 2; + login-version_min = login_req-min_version; + login-version_max = login_req-max_version; + memcpy(login-isid, login_req-isid, 6); + login-cmd_sn = be32_to_cpu(login_req-cmdsn); + login-init_task_tag= login_req-itt; + login-initial_exp_statsn = be32_to_cpu(login_req-exp_statsn); + login-cid = be16_to_cpu(login_req-cid); + login-tsih = be16_to_cpu(login_req-tsih); + } + + if (iscsi_target_check_login_request(conn, login) 0) + return -1; Better