Hi Roland,
This patch series fixes critical bugs in RDMA/cxgb4 in the following
areas:
- Aborts connection in error scenarios
- Logs only critical errors
- Holds the reference of the QP until TID is released
- Avoids race condition in endpoint timeout
- Fixes reconnect and MPA version handling
If a FINI operation fails, then we need to ABORT instead
of CLOSE. Also, if we ABORT due to unexpected STREAMING
data, then wake up anybody blocked in FINI...
Signed-off-by: Vipul Pandya vi...@chelsio.com
---
drivers/infiniband/hw/cxgb4/cm.c |1 +
drivers/infiniband/hw/cxgb4/qp.c |1 +
With later firmware, we are likely to get streaming mode data after
we exit RTS, so we don't need to warn about it. The only real
case where we don't expect it is when the QP is in RTS.
Move the QP to ERROR when streaming mode data is received.
Signed-off-by: Vipul Pandya vi...@chelsio.com
Log AEs even if the QP isn't in RTS. It is useful
information.
Signed-off-by: Vipul Pandya vi...@chelsio.com
---
drivers/infiniband/hw/cxgb4/cm.c |6 +++---
drivers/infiniband/hw/cxgb4/ev.c |8 +++++---
2 files changed, 8 insertions(+), 6 deletions(-)
diff --git
With newer firmware, we can get streaming data due to connection
errors before the driver moves the QP out of RTS.
Signed-off-by: Vipul Pandya vi...@chelsio.com
---
drivers/infiniband/hw/cxgb4/cm.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git
The endpoint timeout logic had a race that could cause an endpoint
object to be freed while it was still on the timedout list. This
can happen if the timer is stopped after it has fired, but before
the timedout thread has processed the endpoint timeout.
Signed-off-by: Vipul Pandya vi...@chelsio.com
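For readers following the endpoint timeout race described above, here is a minimal userspace sketch of one common way to close this kind of race: the timer callback and the stop path both try to claim a single "timeout" flag, and only the winner decides what happens to the timer's reference. This is an illustration under stated assumptions, not the driver's code. The names struct ep, claimed, timer_fired, stop_timer and process_timedout are hypothetical, and the model is single-threaded with a plain integer refcount.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical endpoint model; names do not match the cxgb4 driver. */
struct ep {
    int refs;            /* reference count (plain int, single-threaded model) */
    atomic_bool claimed; /* set once by whichever path claims the timeout */
    struct ep *next;     /* link on the timed-out list */
};

static struct ep *timedout_list;

static void ep_init(struct ep *ep)
{
    ep->refs = 1; /* owner's reference */
    atomic_init(&ep->claimed, false);
    ep->next = NULL;
}

static void ep_get(struct ep *ep) { ep->refs++; }

static void ep_put(struct ep *ep)
{
    assert(ep->refs > 0);
    ep->refs--;
}

/* Arming the timer takes a reference on behalf of the timer. */
static void start_timer(struct ep *ep) { ep_get(ep); }

/* Timer callback: queue the endpoint only if nobody claimed the timeout yet.
 * The timer's reference travels with the list entry. */
static void timer_fired(struct ep *ep)
{
    if (!atomic_exchange(&ep->claimed, true)) {
        ep->next = timedout_list;
        timedout_list = ep;
    }
}

/* Stop path: drop the timer's reference only if the timer never fired.
 * If it already fired, the reference is left for the worker to drop. */
static void stop_timer(struct ep *ep)
{
    if (!atomic_exchange(&ep->claimed, true))
        ep_put(ep);
}

/* Worker: drain the timed-out list, dropping the timer's reference for
 * each entry. Returns the number of endpoints processed. */
static int process_timedout(void)
{
    int n = 0;

    while (timedout_list) {
        struct ep *ep = timedout_list;

        timedout_list = ep->next;
        ep->next = NULL;
        ep_put(ep);
        n++;
    }
    return n;
}
```

In this model, stopping the timer after it has fired leaves the endpoint referenced until the worker has taken it off the list, so it cannot be freed while still queued.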
Only reconnect if the endpoint wasn't freed.
peer_abort() should only attempt to reconnect if the endpoint wasn't freed.
Also remove hwtid from the debugfs idr.
Add missing check for peer2peer in MPAv2 code.
Use correct MPA version on reject.
Signed-off-by: Vipul Pandya vi...@chelsio.com
---
Don't wake up threads blocked in rdma_init/rdma_fini if we are on
MPAv2 and want to retry the connection with MPAv1.
In process_mpa_request, stop the ep-timer on an MPA version mismatch
before calling abort_connection.
Take care to stop the ep-timer in the error paths of process_mpa_request.
CPL_ABORT_REQ_RSS can arrive before the TCP connection is established. In that
case peer_abort was trying to remove an hwtid that had not been inserted. To avoid
this, we insert the hwtid only once we are sure that we are going to send the
passive accept request.
Signed-off-by: Vipul Pandya vi...@chelsio.com
It fixes the following types of sparse warnings:
- cast to pointer from integer of different size
- cast from pointer to integer of different size
- incorrect type in assignment (different base types)
- incorrect type in argument 1 (different base types)
- cast from restricted __be64
- cast from
Reviewed-by: Steve Wise sw...@opengridcomputing.com
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
And:
Acked-by: Steve Wise sw...@opengridcomputing.com
Not sure which one I should be using :)
On 1/7/2013 9:44 AM, Steve Wise wrote:
Reviewed-by: Steve Wise sw...@opengridcomputing.com
On Mon, 2013-01-07 at 06:34 -0500, Bart Van Assche wrote:
Sorry, but this patch looks wrong to me, for the
following reasons:
- A root cause analysis is missing. It has been mentioned in the patch
description that device_del() did hang but an analysis of why that
hang
Hey Sean,
Is this a bug? I think it is...
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 2709ff5..fb24f05 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -806,8 +806,13 @@ static ssize_t ucma_accept(struct ucma_file *file,
On 1/7/2013 1:22 PM, Hefty, Sean wrote:
Is this a bug? I think it is...
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 2709ff5..fb24f05 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -806,8 +806,13 @@ static ssize_t
Signed-off-by: Jim Schutt jasc...@sandia.gov
Signed-off-by: Hal Rosenstock h...@mellanox.com
---
diff --git a/opensm/osm_torus.c b/opensm/osm_torus.c
index 1d847b3..ff83edb 100644
--- a/opensm/osm_torus.c
+++ b/opensm/osm_torus.c
@@ -853,14 +853,14 @@ out:
}
static
-bool parse_port(unsigned
Rather than hard-coding a constant of 32 for the maximum number of torus
changes to be reported, allow this to be configured with the max_changes
parameter in the torus conf file. The default for max_changes
is the same as the previous hard-coded constant (32).
Also, update the torus conf documentation for this new parameter.
This is a useful feature for torus debugging.
Also, in report_torus_changes, there is no need for a NULL pointer check on nt.
Reviewed-by: Jim Schutt jasc...@sandia.gov
Signed-off-by: Hal Rosenstock h...@mellanox.com
---
diff --git a/opensm/osm_torus.c b/opensm/osm_torus.c
index 1d847b3..4e5688f 100644
---
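As an illustration of the max_changes change described above, the following is a rough standalone sketch of how a configurable limit with a default of 32 could be read from a conf line. It is not the OpenSM parser: struct torus_conf, torus_conf_init and parse_max_changes are hypothetical names, and the "max_changes <n>" line syntax is an assumption made for the example. Only the parameter name and the default of 32 come from the patch description.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_MAX_CHANGES 32u /* default named in the patch description */

/* Hypothetical config holder; not the structure OpenSM uses. */
struct torus_conf {
    unsigned max_changes;
};

static void torus_conf_init(struct torus_conf *conf)
{
    conf->max_changes = DEFAULT_MAX_CHANGES;
}

/*
 * Parse one conf line of the assumed form "max_changes <n>".
 * Returns true and updates conf on success. Any other line, or an
 * invalid value, leaves conf untouched and returns false.
 */
static bool parse_max_changes(const char *line, struct torus_conf *conf)
{
    static const char key[] = "max_changes";
    const size_t klen = sizeof(key) - 1;
    const char *p;
    char *end;
    unsigned long val;

    if (strncmp(line, key, klen) != 0)
        return false;
    if (line[klen] != ' ' && line[klen] != '\t')
        return false; /* keyword must be followed by whitespace */

    p = line + klen;
    while (*p == ' ' || *p == '\t')
        p++;
    if (!isdigit((unsigned char)*p))
        return false; /* rejects empty values and signs */

    errno = 0;
    val = strtoul(p, &end, 10);
    if (errno == ERANGE || val > UINT_MAX)
        return false;

    /* Allow trailing whitespace only. */
    while (*end == ' ' || *end == '\t' || *end == '\n')
        end++;
    if (*end != '\0')
        return false;

    conf->max_changes = (unsigned)val;
    return true;
}
```

Initializing the structure first means a conf file that never mentions max_changes keeps the old limit of 32.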
Signed-off-by: Hal Rosenstock h...@mellanox.com
---
diff --git a/include/complib/cl_packon.h b/include/complib/cl_packon.h
index ffc8e11..e2e45b4 100644
--- a/include/complib/cl_packon.h
+++ b/include/complib/cl_packon.h
@@ -55,14 +55,14 @@
* not align properly for some platforms. Care must
On Mon, Jan 7, 2013 at 5:11 AM, Vipul Pandya vi...@chelsio.com wrote:
This patch series fixes critical bugs for RDMA/cxgb4. It fixes bugs in
following
areas:
- Aborts connection in error scenarios
- Logs only critical errors
- Holds the reference of the QP until TID is released
- Avoids
On 08-01-2013 06:03, Roland Dreier wrote:
On Mon, Jan 7, 2013 at 5:11 AM, Vipul Pandya vi...@chelsio.com wrote:
This patch series fixes critical bugs for RDMA/cxgb4. It fixes bugs in
following
areas:
- Aborts connection in error scenarios
- Logs only critical errors
- Holds the reference