Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1
On Mon, Oct 24, 2011 at 8:33 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: It would have been nice if you bothered to mention these in any of the pre merge window reviews, but given your delay responsed on these items they will have to be something we'll fix post merge. Sorry for that - it took some time though before I remembered myself about all this. If you are looking for a test case: pulling an IB cable during I/O a few times should allow to trigger this. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1
On Mon, Oct 24, 2011 at 11:01 PM, Roland Dreier rol...@purestorage.com wrote: On Mon, Oct 24, 2011 at 11:33 AM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: - Handle IB completion timeouts. Although the InfiniBand Architecture Manual specifies that a HCA must generate an error completion for each pending work item when a queue pair reset is issued, an important class of HCAs doesn't do this (that includes the HCA in your test setup). In the SCST version of this driver such timeouts are handled by the function srpt_pending_cmd_timeout(). Hold on... not sure what you mean by a queue pair reset but the IB spec does not say that a flush error should be generated when a QP is transitioned to the reset state. The flush error completions are required when a QP is transitioned to the error state, and in fact completion entries may be lost when a QP transitions to reset. I was referring to transitioning a QP to the error state -- see also srpt_ch_qp_err(). As far as I know every HCA supported by Linux does implement this correctly. Which class did you have in mind as not doing that? At least QDR ConnectX 2 HCAs with fairly recent firmware. This behavior can be reproduced easily with the SCST version of ib_srpt as follows: - Log in from an SRP initiator to ib_srpt. - Start a direct I/O read test, e.g. with fio. - Issue the command rmmod ib_srpt on the target during I/O. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] ib_srpt: Convert srp_max_rdma_size into per port configfs attribute
On Mon, 2011-10-24 at 13:29 -0700, Nicholas A. Bellinger wrote: On Mon, 2011-10-24 at 05:33 +, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org This patch converts the srp_max_rdma_size module parameter into a per endpoint configfs attribute. This includes adding the necessary bits for show + store attributes w/ min/max bounds checking, and updating srpt_get_ioc() to accept a struct srpt_port parameter. Reported-by: Roland Dreier rol...@purestorage.com Cc: Roland Dreier rol...@purestorage.com Cc: Bart Van Assche bvanass...@acm.org Signed-off-by: Nicholas A. Bellinger n...@linux-iscsi.org --- drivers/infiniband/ulp/srpt/ib_srpt.c | 60 - drivers/infiniband/ulp/srpt/ib_srpt.h | 10 + 2 files changed, 61 insertions(+), 9 deletions(-) SNIP diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.h b/drivers/infiniband/ulp/srpt/ib_srpt.h index 59ee2d7..c9caa90 100644 --- a/drivers/infiniband/ulp/srpt/ib_srpt.h +++ b/drivers/infiniband/ulp/srpt/ib_srpt.h @@ -114,6 +114,7 @@ enum { MIN_SRPT_SRQ_SIZE = 4, DEFAULT_SRPT_SRQ_SIZE = 4095, MAX_SRPT_SRQ_SIZE = 65535, + MAX_SRPT_RDMA_SIZE = 256U, Hey btw, what should the proper MAX_SRPT_RDMA_SIZE be here..? Ping on this Bart..? Thanks, --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] rdma/cma: fixed resource leak in case of an error in udaddy
Fixed a resource leak in case of an error level in the udaddy example. Signed-off-by: Dotan Barak dot...@dev.mellanox.co.il --- examples/udaddy.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/examples/udaddy.c b/examples/udaddy.c index 637306a..1534df5 100644 --- a/examples/udaddy.c +++ b/examples/udaddy.c @@ -547,7 +547,7 @@ static int run_server(void) ret = rdma_bind_addr(listen_id, test.src_addr); if (ret) { perror(udaddy: bind address failed); - return ret; + goto out; } ret = rdma_listen(listen_id, 0); -- 1.7.4.1 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] ib_srpt: Convert srp_max_rdma_size into per port configfs attribute
On Mon, Oct 24, 2011 at 10:29 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: @@ -114,6 +114,7 @@ enum { MIN_SRPT_SRQ_SIZE = 4, DEFAULT_SRPT_SRQ_SIZE = 4095, MAX_SRPT_SRQ_SIZE = 65535, + MAX_SRPT_RDMA_SIZE = 256U, Hey btw, what should the proper MAX_SRPT_RDMA_SIZE be here..? As you can see in the ib_srpt source code, the largest value filled in for this parameter in the IOC profile is 1U 24. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] ib_srpt: Convert srp_max_rdma_size into per port configfs attribute
On Tue, 2011-10-25 at 12:32 +0200, Bart Van Assche wrote: On Mon, Oct 24, 2011 at 10:29 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: @@ -114,6 +114,7 @@ enum { MIN_SRPT_SRQ_SIZE = 4, DEFAULT_SRPT_SRQ_SIZE = 4095, MAX_SRPT_SRQ_SIZE = 65535, + MAX_SRPT_RDMA_SIZE = 256U, Hey btw, what should the proper MAX_SRPT_RDMA_SIZE be here..? As you can see in the ib_srpt source code, the largest value filled in for this parameter in the IOC profile is 1U 24. Thanks Bart. Fixing this up now. --nab -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] RDMA/cxgb4: Mark the QP in error before disabling the Q in the firmware
From: Tom Tucker t...@ogc.us The QP needs to be moved to error before telling the firwmare to shutdown the queue. Otherwise, the application can submit WR that will never get fetched by the hardware and never flushed by the driver. Signed-off-by: Kumar Sanghvi kuma...@chelsio.com --- drivers/infiniband/hw/cxgb4/qp.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c index a391a4a..d6ccc7e 100644 --- a/drivers/infiniband/hw/cxgb4/qp.c +++ b/drivers/infiniband/hw/cxgb4/qp.c @@ -1226,6 +1226,8 @@ int c4iw_modify_qp(struct c4iw_dev *rhp, struct c4iw_qp *qhp, disconnect = 1; c4iw_get_ep(qhp-ep-com); } + if (qhp-ibqp.uobject) + t4_set_wq_in_error(qhp-wq); ret = rdma_fini(rhp, qhp, ep); if (ret) goto err; @@ -1244,6 +1246,8 @@ int c4iw_modify_qp(struct c4iw_dev *rhp, struct c4iw_qp *qhp, break; case C4IW_QP_STATE_ERROR: set_state(qhp, C4IW_QP_STATE_ERROR); + if (qhp-ibqp.uobject) + t4_set_wq_in_error(qhp-wq); if (!internal) { abort = 1; disconnect = 1; -- 1.7.1 -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)
On Mon, Oct 24, 2011 at 11:07 PM, Bart Van Assche bvanass...@acm.org wrote: As far as I know every HCA supported by Linux does implement this correctly. Which class did you have in mind as not doing that? At least QDR ConnectX 2 HCAs with fairly recent firmware. This behavior can be reproduced easily with the SCST version of ib_srpt as follows: - Log in from an SRP initiator to ib_srpt. - Start a direct I/O read test, e.g. with fio. - Issue the command rmmod ib_srpt on the target during I/O. OK, this is a pretty serious bug in mlx4_ib if true. Are you sure that you really are seeing some pending work requests not generating a flush error when the QP transitions to the error state? Keep in mind that transitioning the QP to the reset state can cause completion entries to disappear after they are generated. - R. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 07/10] RDMA/cxgb4: DB Drop Recovery for RDMA and LLD queues.
On Mon, Oct 24, 2011 at 2:12 PM, David Miller da...@davemloft.net wrote: 1. We would like to recommend that all the patches get included in Roland's infiniband tree since it has build dependencies. This is fine with me. It just means that Roland's tree has to have net-next included in it already, because otherwise the paths to the cxgb4 driver under drivers/net won't be right. OK, let's do that... - R. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] IB/qib: Fix kernel panic on QME7342 boards
From: Mitko Haralanov mi...@qlogic.com Fix a kernel panic on QME7342 (and any Mezz boards without QSFP connectors) caused by scheduling work on uninitialized workqueue introduced by 62066fc8df4632c772d723813ce7af456d62ddf7. Signed-off-by: Mitko Haralanov mi...@qlogic.com Signed-off-by: Mike Marciniszyn mike.marcinis...@qlogic.com --- drivers/infiniband/hw/qib/qib_iba7322.c |6 -- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/qib/qib_iba7322.c b/drivers/infiniband/hw/qib/qib_iba7322.c index 62005c7..3e40cff 100644 --- a/drivers/infiniband/hw/qib/qib_iba7322.c +++ b/drivers/infiniband/hw/qib/qib_iba7322.c @@ -5285,8 +5285,10 @@ static int qib_7322_ib_updown(struct qib_pportdata *ppd, int ibup, u64 ibcs) qib_7322_mini_pcs_reset(ppd); /* schedule the qsfp refresh which should turn the link off */ - qd-t_insert = get_jiffies_64(); - schedule_work(qd-work); + if (ppd-dd-flags QIB_HAS_QSFP) { + qd-t_insert = get_jiffies_64(); + schedule_work(qd-work); + } spin_lock_irqsave(ppd-sdma_lock, flags); if (__qib_sdma_running(ppd)) __qib_sdma_process_event(ppd, -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)
On Tue, Oct 25, 2011 at 1:17 PM, Roland Dreier rol...@purestorage.com wrote: On Mon, Oct 24, 2011 at 11:07 PM, Bart Van Assche bvanass...@acm.org wrote: As far as I know every HCA supported by Linux does implement this correctly. Which class did you have in mind as not doing that? At least QDR ConnectX 2 HCAs with fairly recent firmware. This behavior can be reproduced easily with the SCST version of ib_srpt as follows: 1. Log in from an SRP initiator to ib_srpt. 2. Start a direct I/O read test, e.g. with fio. 3. Issue the command rmmod ib_srpt on the target during I/O. OK, this is a pretty serious bug in mlx4_ib if true. Are you sure that you really are seeing some pending work requests not generating a flush error when the QP transitions to the error state? It's a little more complex than that. The original version of ib_srpt stops polling for completions as soon as the last WQE event has been received and after that the queue has been drained. So I don't know whether these flush errors were not delivered or whether these were delivered too late. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: mlx4 missing completions (was Re: [PATCH-v2] ib_srpt: Initial SRP Target merge for v3.2-rc1)
On Tue, Oct 25, 2011 at 11:09 AM, Bart Van Assche bvanass...@acm.org wrote: It's a little more complex than that. The original version of ib_srpt stops polling for completions as soon as the last WQE event has been received and after that the queue has been drained. So I don't know whether these flush errors were not delivered or whether these were delivered too late. Sorry, but now I confused about what the bug is. You have a QP associated with an SRQ, and you transition the QP to error. At some point you get a last WQE received event for that QP (which means all receive requests for that QP have completed), and you drain the CQ after that, which means you are guaranteed to have processed all the receive requests for that QP. (At least this is how the verbs are supposed to work). So what goes wrong with ConnectX? I don't think there is any way to guarantee that every request posted to the SRQ is taken by a QP. You just have to make sure that you tear down every QP as above, and then you know when there are no more QPs associated to the SRQ that any receive requests that haven't completed are still on the SRQ, and you can free their resources after you destroy the SRQ. - R. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html