Hello,
when planning a data transfer system using Infiniband's RDMA mechanisms, I
stumbled upon the following question: Is there a standard approach to inform
the sender after an RDMA_write operation that the receiving buffer has been
processed by the receiver and is now ready to receive new
Or Gerlitz wrote:
Rick Frank who brought this to my attention, also handed me this patch
which is claimed to work around this issue,
--- ofa_kernel-1.3.1.orig/drivers/infiniband/core/addr.c
+++ ofa_kernel-1.3.1/drivers/infiniband/core/addr.c
@@ -174,15 +174,29 @@ static int
On 09:43 Thu 05 Feb , Eli Dorfman (Voltaire) wrote:
ok. Please apply the fixed patch.
Did you test it?
Sasha
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe,
This email was generated automatically, please do not reply
git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git
git_branch: ofed_kernel
Common build parameters:
Passed:
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with
- I'm not quite following this yet. Are you wanting a list of IP
addresses that map to RDMA devices?
When looking at a case that the user defines a local interface ip addr
which it wants to work with. The application does not know if the ip
addr maps to an rdma-cm capable device (IB or iWARP) or
As we do a lot of routing tests with ibsim we had the need to be able to launch
multiple simulators on the same system.
With this patch, ibsim (and umad2sim) will try to read the socket basename using a
getenv(IBSIM_SOCKNAME) which makes it possible.
If IBSIM_SOCKNAME is not set, SIM_BASENAME is
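The lookup described in this message can be sketched roughly as below. This is only an illustration, not the code from the patch: the function name `sock_basename` is made up here, and the fallback is passed in by the caller (presumably SIM_BASENAME in ibsim) instead of being hard-coded.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not taken from ibsim: return the socket basename
 * from the IBSIM_SOCKNAME environment variable, or the caller-supplied
 * fallback when the variable is not set.
 */
const char *sock_basename(const char *fallback)
{
	const char *name;

	assert(fallback != NULL);

	name = getenv("IBSIM_SOCKNAME");
	if (name != NULL)
		return name;	/* per-instance name chosen by the user */
	return fallback;	/* variable not set: keep the default */
}
```

With something like this, two simulators could be started on one host by exporting a different IBSIM_SOCKNAME for each, provided the clients using umad2sim are started with the matching value.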
enable log_max_size opt update
Signed-off-by: Eli Dorfman e...@voltaire.com
---
opensm/opensm/osm_subnet.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index f589180..d6d39a6 100644
---
fix parse functions for big endian machines
Signed-off-by: Eli Dorfman e...@voltaire.com
---
opensm/opensm/osm_subnet.c | 10 +-
1 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index d6d39a6..7b33659 100644
---
As reported by Manish Katiyar mkati...@gmail.com, tmp_addr is
causing a compilation warning when INFINIBAND_NES_DEBUG is not defined.
tmp_addr is used in a NES_DEBUG and the print does not make sense.
Taking out tmp_addr and the NES_DEBUG.
Signed-off-by: Chien Tung chien.tin.t...@intel.com
---
Hello,
we are currently looking into the scalability of the T3 in terms of
connections. We are using a 1-to-n scenario where the one server
has a chunk of data and n clients that fetch this chunk over and over
again using RDMA reads (each 1MB in size).
The clients do that such that they get an
Is my understanding of the mechanisms correct? Since locking and unlocking of
data receiving buffers is a standard use case in most transport strategies, I
wanted to ask if there's a more elegant way to manage this using the Infiniband
architecture? Like for example delaying the sender side work
Rick Frank who brought this to my attention, also handed me this patch
which is claimed to work around this issue, it's badly formatted and I
couldn't really understand what it does. I hoped to be able to reproduce
this with rping or ucmatose, but neither allow me to specify a -I address
to the
On 15:34 Thu 05 Feb , Nicolas Morey Chaisemartin wrote:
As we do a lot of routing tests with ibsim we had the need to be able to
launch multiple simulators on the same system.
With this patch, ibsim (and umad2sim) will try to read the socket basename
using a getenv(IBSIM_SOCKNAME) which
On 17:00 Thu 05 Feb , Eli Dorfman (Voltaire) wrote:
enable log_max_size opt update
Signed-off-by: Eli Dorfman e...@voltaire.com
Applied. Thanks.
Sasha
On 17:19 Thu 05 Feb , Eli Dorfman (Voltaire) wrote:
fix parse functions for big endian machines
Signed-off-by: Eli Dorfman e...@voltaire.com
Applied. Thanks.
I'm fine with this patch - the code looks cleaner than it was before.
But could you please explain what was a problem with
On Thu, Feb 05, 2009 at 02:03:42PM +0200, Or Gerlitz wrote:
Or Gerlitz wrote:
Rick Frank who brought this to my attention, also handed me this patch
which is claimed to work around this issue,
+++ ofa_kernel-1.3.1/drivers/infiniband/core/addr.c
@@ -174,15 +174,29 @@ static int
Sasha,
On Wed, 4 Feb 2009 10:30:54 -0800
Ira Weiny wei...@llnl.gov wrote:
On Wed, 4 Feb 2009 20:27:25 +0200
Sasha Khapyorsky sas...@voltaire.com wrote:
On 11:20 Wed 04 Feb , Jason Gunthorpe wrote:
On Wed, Feb 04, 2009 at 08:14:21PM +0200, Sasha Khapyorsky wrote:
I don't
Use introduced 'enum MAD_DEST' as type of ibd_dest_type variable.
Signed-off-by: Sasha Khapyorsky sas...@voltaire.com
---
infiniband-diags/include/ibdiag_common.h |2 +-
infiniband-diags/src/ibdiag_common.c |2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git
On 10:03 Thu 05 Feb , Ira Weiny wrote:
Sasha,
On Wed, 4 Feb 2009 10:30:54 -0800
Ira Weiny wei...@llnl.gov wrote:
On Wed, 4 Feb 2009 20:27:25 +0200
Sasha Khapyorsky sas...@voltaire.com wrote:
On 11:20 Wed 04 Feb , Jason Gunthorpe wrote:
On Wed, Feb 04, 2009 at
Assuming this is an rdma-cm capable device in a 'bad' state, the user
space application can wait for async ibv events (PORT_ACTIVE) from the
device. Once the device is active again it can retry the rdma_create_qp
or rdma_join_mc.
Will this work? Even once the port goes active, what the
@@ -2167,6 +2170,12 @@ static int cma_sidr_rep_handler(struct i
event.status = ib_event->param.sidr_rep_rcvd.status;
break;
}
+ ret = cma_set_qkey(id_priv);
+ if (ret) {
+ event.event =
Sean Hefty wrote:
@@ -2167,6 +2170,12 @@ static int cma_sidr_rep_handler(struct i
event.status = ib_event->param.sidr_rep_rcvd.status;
break;
}
+ ret = cma_set_qkey(id_priv);
+ if (ret) {
+
It might be better to catch errors earlier, but there is the risk that the
flow might change somehow, and losing the (now obvious) logical connection
between retrieving the qkey and actually using it.
I can go with that. I don't have a strong preference. Have you tested the
patch and verified
Sean Hefty wrote:
It might be better to catch errors earlier, but there is the risk that the
flow might change somehow, and losing the (now obvious) logical connection
between retrieving the qkey and actually using it.
I can go with that. I don't have a strong preference. Have you tested the
From: Yossi Etigin yos...@voltaire.com
When doing rdma_resolve_addr() and relevant port is down, the function fails
and rdma_cm id is not bound to the device. Therefore, application does not have
device handle and cannot wait for the port to become active. The function
fails because ipoib is
I'm sure I know the answer to this, or will be floored if it's other
than I think, but just to do due diligence... are OFED 1.3.1 and 1.4
compatible? That is, nodes running one version will talk to nodes of
the other version without problem, yes?
Is it complete compatibility or are there any
Philip Frey1 wrote:
Hello,
we are currently looking into the scalability of the T3 in terms of
connections. We are using a 1-to-n scenario where the one server
has a chunk of data and n clients that fetch this chunk over and over
again using RDMA reads (each 1MB in size).
The clients do that
Hi,
Steve and I have been working to debug RDS's credit-based flow control,
and I happened to notice that IB already implements this (see ib spec
section 9.7.7.2).
So, why is it necessary for a ULP like RDS to implement its own flow
control? It looks like IB's flow control should result in
So, why is it necessary for a ULP like RDS to implement its own flow
control? It looks like IB's flow control should result in no RNR
retries, yet without protocol-level FC, we see RNR retries.
If you're using a shared receive queue, end to end flow control is disabled.
Also, see 9.7.7.2.5 C9-162
In infiniband-diags/perfquery, -e is used for extended counters and
covers up using the common errors option so I'd like to change this to
be -x for xtended. Any objections? Without this change when perfquery
fails you can't get the more detailed error information which is very
useful for
Sasha,
Trivial patch to eliminate the unused 'modified' variable.
-- Hal
0001-ibsim-Eliminate-unused-modified-variable.patch
Description: application/mbox
Sasha,
Patch to change lid print format to unsigned to be consistent elsewhere.
-- Hal
0003-ibsim-Change-lid-prints-to-unsigned.patch
Description: application/mbox
Sasha,
Trivial patch to fix some typos in this doc.
-- Hal
0001-opensm-doc-perf-manager-arch.txt-Fix-some-commentar.patch
Description: application/mbox
Sasha,
This just adds copyrights missed in previous patches.
-- Hal
0002-opensm-PerfMgr-Add-copyright.patch
Description: application/mbox
Sasha,
This changes libibmad lid print format to unsigned to be consistent with
OpenSM and diag tools.
-- Hal
0003-libibmad-lid-printing-changed-to-unsigned-as-was-d.patch
Description: application/mbox
Sasha,
This patch changes umad.c lid print format to unsigned.
-- Hal
0007-libibumad-umad.c-Change-lid-prints-to-unsigned.patch
Description: application/mbox
Sasha,
This patch sets the attribute ID based on what is in the response.
-- Hal
0009-libibmad-rpc.c-In-mad_rpc-and-mad_rpc_rmpp-set-rpc.patch
Description: application/mbox
Sasha,
This patch factors out some common code in gs.c. common_query_setup is
used by both pma_query_via and performance_reset_via.
-- Hal
0010-libibmad-gs.c-Factor-out-common-code.patch
Description: application/mbox
Sasha,
Per the RFC, this patch changes the option name for extended counters so
that it does not cover up the common errors option. This changes it from
-e/--extended to -x/--xtended so -e/--errors can be used to get error
information as is common with the IB diags.
-- Hal
Sean Hefty wrote:
So, why is it necessary for a ULP like RDS to implement its own flow
control? It looks like IB's flow control should result in no RNR
retries, yet without protocol-level FC, we see RNR retries.
If you're using a shared receive queue, end to end flow control is disabled.
Also,
I'm reading C9-162 and still not seeing why (according to the spec
anyways) there should ever be RNR retries on a connection. I would think
the receiving HCA would not credit its last WQE to the sender, and thus
retries should never happen?
The whole point of this feature is to eliminate RNR
Sean Hefty wrote:
My assumption is that if no credits are available when the SEND request arrives,
then the receiver generates a RNR message, but I didn't read through the entire
section to verify this.
This is totally a guess, but there needs to be some sort of recovery mechanism
in place to
Andy Grover wrote:
Sean Hefty wrote:
My assumption is that if no credits are available when the SEND
request arrives,
then the receiver generates a RNR message, but I didn't read through
the entire
section to verify this.
This is totally a guess, but there needs to be some sort of recovery
Andy Grover wrote:
How would I verify that? I'm using current HCAs (mlx4), so I'm assuming
if the spec says an HCA must support something, it is supported?
We definitely still need ulp-level flow control for iwarp so it's not
wasted work. But if IB doesn't, then it would be great to not incur
How would I verify that? I'm using current HCAs (mlx4), so I'm
assuming if the spec says an HCA must support something, it is
supported?
We definitely still need ulp-level flow control for iwarp so it's not
wasted work. But if IB doesn't, then it would be great to not incur
the