[ofa-general] RDMA transfers: Buffer status communications?

2009-02-05 Thread Jan Ruffing
Hello, when planning a data transfer system using Infiniband's RDMA mechanisms, I stumbled upon the following question: Is there a standard approach to inform the sender after an RDMA_write operation that the receiving buffer has been processed by the receiver and is now ready to receive new

Re: [ofa-general] Re: pick the outgoing HCA based on the IP used for bind

2009-02-05 Thread Or Gerlitz
Or Gerlitz wrote: Rick Frank, who brought this to my attention, also handed me this patch, which is claimed to work around this issue: --- ofa_kernel-1.3.1.orig/drivers/infiniband/core/addr.c +++ ofa_kernel-1.3.1/drivers/infiniband/core/addr.c @@ -174,15 +174,29 @@ static int

Re: [ofa-general] [PATCH 2/4 v2] opensm/osm_state_mgr.c rescan subnet configuration after SIGHUP

2009-02-05 Thread Sasha Khapyorsky
On 09:43 Thu 05 Feb , Eli Dorfman (Voltaire) wrote: ok. Please apply the fixed patch. Did you test it? Sasha ___ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe,

[ofa-general] ofa_1_4_kernel 20090205-0200 daily build status

2009-02-05 Thread Vladimir Sokolovsky (Mellanox)
This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.18 Passed on i686 with linux-2.6.19 Passed on i686 with

RE: [ofa-general] RE: impossibility to bind a device/port with the rdma-cm when the port is down

2009-02-05 Thread Alex Rosenbaum
- I'm not quite following this yet. Are you wanting a list of IP addresses that map to RDMA devices? Consider the case where the user defines a local interface IP address that it wants to work with. The application does not know if the IP address maps to an rdma-cm capable device (IB or iWARP) or

[ofa-general] [ibsim][PATCH] Socket name can be forced by exporting IBSIM_SOCKNAME before starting ibsim and/or preloading umad2sim so multiple simulator can run on the same system at the same time

2009-02-05 Thread Nicolas Morey Chaisemartin
As we do a lot of routing tests with ibsim, we needed to be able to launch multiple simulators on the same system. With this patch, ibsim (and umad2sim) will try to read the socket basename using getenv("IBSIM_SOCKNAME"), which makes this possible. If IBSIM_SOCKNAME is not set, SIM_BASENAME is
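The lookup described in this post can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the value given to SIM_BASENAME, the helper names sim_socket_basename and sim_socket_name, and the "%s:ctl%d" name layout are all assumptions made for the example, and treating an empty variable as unset is a design choice of this sketch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical default: the real SIM_BASENAME value is not shown in the post. */
#define SIM_BASENAME "sim"

/* Return the socket basename: $IBSIM_SOCKNAME if set and non-empty,
 * otherwise the compiled-in default. */
const char *sim_socket_basename(void)
{
    const char *name = getenv("IBSIM_SOCKNAME");
    return (name && *name) ? name : SIM_BASENAME;
}

/* Build a per-client socket name into buf. The "<basename>:ctl<id>" layout
 * is illustrative only. Returns 0 on success, -1 if buf is too small. */
int sim_socket_name(char *buf, size_t len, int id)
{
    int n = snprintf(buf, len, "%s:ctl%d", sim_socket_basename(), id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With a scheme like this, two simulators could be kept apart by exporting a different IBSIM_SOCKNAME in each shell before starting ibsim and the clients that preload umad2sim.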

[ofa-general] [PATCH] opensm/osm_subnet.c enable log_max_size opt update

2009-02-05 Thread Eli Dorfman (Voltaire)
enable log_max_size opt update Signed-off-by: Eli Dorfman e...@voltaire.com --- opensm/opensm/osm_subnet.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index f589180..d6d39a6 100644 ---

[ofa-general] [PATCH] opensm/osm_subnet.c fix parse functions for big endian machines

2009-02-05 Thread Eli Dorfman (Voltaire)
fix parse functions for big endian machines Signed-off-by: Eli Dorfman e...@voltaire.com --- opensm/opensm/osm_subnet.c | 10 +- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c index d6d39a6..7b33659 100644 ---

[ofa-general] [PATCH] RDMA/nes: tmp_addr compilation warning

2009-02-05 Thread Chien Tung
As reported by Manish Katiyar mkati...@gmail.com, tmp_addr is causing a compilation warning when INFINIBAND_NES_DEBUG is not defined. tmp_addr is used in a NES_DEBUG and the print does not make sense. Taking out tmp_addr and the NES_DEBUG. Signed-off-by: Chien Tung chien.tin.t...@intel.com ---

[ofa-general] Chelsio T3: Aggregate Throughput

2009-02-05 Thread Philip Frey1
Hello, we are currently looking into the scalability of the T3 in terms of connections. We are using a 1-to-n scenario where the one server has a chunk of data and n clients fetch this chunk over and over again using RDMA reads (each 1MB in size). The clients do that such that they get an

RE: [ofa-general] RDMA transfers: Buffer status communications?

2009-02-05 Thread Sean Hefty
Is my understanding of the mechanisms correct? Since locking and unlocking of data receiving buffers is a standard use case in most transport strategies, I wanted to ask if there's a more elegant way to manage this using the Infiniband architecture? Like for example delaying the sender side work
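The buffer locking and unlocking that the question describes can be pictured with a small sender-side bookkeeping sketch. It is only one possible pattern and not necessarily what the thread goes on to recommend: the sender marks a remote buffer busy when it targets it with an RDMA write, and marks it free again when an explicit "released" message arrives from the receiver. The names slot_tracker, tracker_acquire and tracker_release, and the slot count, are hypothetical, and no verbs calls are shown.

```c
#include <stdbool.h>

#define NUM_SLOTS 4   /* hypothetical number of remote receive buffers */

/* Sender-side view of the receiver's buffers. */
struct slot_tracker {
    bool busy[NUM_SLOTS];  /* true: written to, not yet released by receiver */
};

/* Pick a free remote slot for the next RDMA write and mark it busy.
 * Returns the slot index, or -1 if every slot is still in use.
 * A real sender would post the RDMA write to this slot afterwards. */
int tracker_acquire(struct slot_tracker *t)
{
    for (int i = 0; i < NUM_SLOTS; i++) {
        if (!t->busy[i]) {
            t->busy[i] = true;
            return i;
        }
    }
    return -1;
}

/* Handle the receiver's "buffer released" message for one slot.
 * Returns false for an out-of-range slot or one that was not busy. */
bool tracker_release(struct slot_tracker *t, int slot)
{
    if (slot < 0 || slot >= NUM_SLOTS || !t->busy[slot])
        return false;
    t->busy[slot] = false;
    return true;
}
```

When tracker_acquire returns -1 the sender has to hold the transfer back until a release arrives, which is the waiting behaviour the question asks about.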

[ofa-general] RE: pick the outgoing HCA based on the IP used for bind

2009-02-05 Thread Sean Hefty
Rick Frank, who brought this to my attention, also handed me this patch, which is claimed to work around this issue. It's badly formatted and I couldn't really understand what it does. I hoped to be able to reproduce this with rping or ucmatose, but neither allow me to specify a -I address to the

[ofa-general] Re: [ibsim][PATCH] Socket name can be forced by exporting IBSIM_SOCKNAME before starting ibsim and/or preloading umad2sim so multiple simulator can run on the same system at the same tim

2009-02-05 Thread Sasha Khapyorsky
On 15:34 Thu 05 Feb , Nicolas Morey Chaisemartin wrote: As we do a lot of routing tests with ibsim, we needed to be able to launch multiple simulators on the same system. With this patch, ibsim (and umad2sim) will try to read the socket basename using getenv("IBSIM_SOCKNAME"), which

[ofa-general] Re: [PATCH] opensm/osm_subnet.c enable log_max_size opt update

2009-02-05 Thread Sasha Khapyorsky
On 17:00 Thu 05 Feb , Eli Dorfman (Voltaire) wrote: enable log_max_size opt update Signed-off-by: Eli Dorfman e...@voltaire.com Applied. Thanks. Sasha

[ofa-general] Re: [PATCH] opensm/osm_subnet.c fix parse functions for big endian machines

2009-02-05 Thread Sasha Khapyorsky
On 17:19 Thu 05 Feb , Eli Dorfman (Voltaire) wrote: fix parse functions for big endian machines Signed-off-by: Eli Dorfman e...@voltaire.com Applied. Thanks. I'm fine with this patch - the code looks cleaner than it was before. But could you please explain what was a problem with

Re: [ofa-general] Re: pick the outgoing HCA based on the IP used for bind

2009-02-05 Thread Jason Gunthorpe
On Thu, Feb 05, 2009 at 02:03:42PM +0200, Or Gerlitz wrote: Or Gerlitz wrote: Rick Frank, who brought this to my attention, also handed me this patch, which is claimed to work around this issue: +++ ofa_kernel-1.3.1/drivers/infiniband/core/addr.c @@ -174,15 +174,29 @@ static int

[ofa-general] [PATCH] libibmad: Use enum types for function parameters (WAS) Declare some enums as typedefs for cleaner function interfaces

2009-02-05 Thread Ira Weiny
Sasha, On Wed, 4 Feb 2009 10:30:54 -0800 Ira Weiny wei...@llnl.gov wrote: On Wed, 4 Feb 2009 20:27:25 +0200 Sasha Khapyorsky sas...@voltaire.com wrote: On 11:20 Wed 04 Feb , Jason Gunthorpe wrote: On Wed, Feb 04, 2009 at 08:14:21PM +0200, Sasha Khapyorsky wrote: I don't

[ofa-general] [PATCH] infiniband-diags/common: use enum MAD_DEST as ibd_dest_type type

2009-02-05 Thread Sasha Khapyorsky
Use introduced 'enum MAD_DEST' as type of ibd_dest_type variable. Signed-off-by: Sasha Khapyorsky sas...@voltaire.com --- infiniband-diags/include/ibdiag_common.h |2 +- infiniband-diags/src/ibdiag_common.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git

[ofa-general] Re: [PATCH] libibmad: Use enum types for function parameters (WAS) Declare some enums as typedefs for cleaner function interfaces

2009-02-05 Thread Sasha Khapyorsky
On 10:03 Thu 05 Feb , Ira Weiny wrote: Sasha, On Wed, 4 Feb 2009 10:30:54 -0800 Ira Weiny wei...@llnl.gov wrote: On Wed, 4 Feb 2009 20:27:25 +0200 Sasha Khapyorsky sas...@voltaire.com wrote: On 11:20 Wed 04 Feb , Jason Gunthorpe wrote: On Wed, Feb 04, 2009 at

RE: [ofa-general] RE: impossibility to bind a device/port with the rdma-cm when the port is down

2009-02-05 Thread Sean Hefty
Assuming this is an rdma-cm capable device in a 'bad' state, the user space application can wait for async ibv events (PORT_ACTIVE) from the device. Once the device is active again it can retry the rdma_create_qp or rdma_join_mc. Will this work? Even once the port goes active, what the

[ofa-general] RE: impossibility to bind a device/port with the rdma-cm when the port is down

2009-02-05 Thread Sean Hefty
@@ -2167,6 +2170,12 @@ static int cma_sidr_rep_handler(struct i event.status = ib_event->param.sidr_rep_rcvd.status; break; } + ret = cma_set_qkey(id_priv); + if (ret) { + event.event =

[ofa-general] Re: impossibility to bind a device/port with the rdma-cm when the port is down

2009-02-05 Thread Yossi Etigin
Sean Hefty wrote: @@ -2167,6 +2170,12 @@ static int cma_sidr_rep_handler(struct i event.status = ib_event->param.sidr_rep_rcvd.status; break; } + ret = cma_set_qkey(id_priv); + if (ret) { +

[ofa-general] RE: impossibility to bind a device/port with the rdma-cm when the port is down

2009-02-05 Thread Sean Hefty
It might be better to catch errors earlier, but there is the risk that the flow might change somehow, and losing the (now obvious) logical connection between retrieving the qkey and actually using it. I can go with that. I don't have a strong preference. Have you tested the patch and verified

[ofa-general] Re: impossibility to bind a device/port with the rdma-cm when the port is down

2009-02-05 Thread Yossi Etigin
Sean Hefty wrote: It might be better to catch errors earlier, but there is the risk that the flow might change somehow, and losing the (now obvious) logical connection between retrieving the qkey and actually using it. I can go with that. I don't have a strong preference. Have you tested the

[ofa-general] RE: impossibility to bind a device/port with the rdma-cm when the port is down

2009-02-05 Thread Sean Hefty
From: Yossi Etigin yos...@voltaire.com When doing rdma_resolve_addr() and relevant port is down, the function fails and rdma_cm id is not bound to the device. Therefore, application does not have device handle and cannot wait for the port to become active. The function fails because ipoib is

[ofa-general] 1.3.1 and 1.4 compatibility

2009-02-05 Thread Brian J. Murrell
I'm sure I know the answer to this, or will be floored if it's other than I think, but just to do due diligence... are OFED 1.3.1 and 1.4 compatible? That is, nodes running one version will talk to nodes of the other version without problem, yes? Is it complete compatibility or are there any

Re: [ofa-general] Chelsio T3: Aggregate Throughput

2009-02-05 Thread Steve Wise
Philip Frey1 wrote: Hello, we are currently looking into the scalability of the T3 in terms of connections. We are using a 1-to-n scenario where the one server has a chunk of data and n clients fetch this chunk over and over again using RDMA reads (each 1MB in size). The clients do that

[ofa-general] IB credit-based flow control

2009-02-05 Thread Andy Grover
Hi, Steve and I have been working to debug RDS's credit-based flow control, and I happened to notice that IB already implements this (see ib spec section 9.7.7.2). So, why is it necessary for a ULP like RDS to implement its own flow control? It looks like IB's flow control should result in

RE: [ofa-general] IB credit-based flow control

2009-02-05 Thread Sean Hefty
So, why is it necessary for a ULP like RDS to implement its own flow control? It looks like IB's flow control should result in no RNR retries, yet without protocol-level FC, we see RNR retries. If you're using a shared receive queue, end to end flow control is disabled. Also, see 9.7.7.2.5 C9-162

[ofa-general] [RFC] infiniband-diags/perfquery.c: Any objections to changing an option name ?

2009-02-05 Thread Hal Rosenstock
In infiniband-diags/perfquery, -e is used for extended counters and covers up using the common errors option so I'd like to change this to be -x for xtended. Any objections ? Without this change when perfquery fails you can't get the more detailed error information which is very useful for

[ofa-general] [PATCH] ibsim: Eliminate unused modified variable

2009-02-05 Thread Hal Rosenstock
Sasha, Trivial patch to eliminate the unused 'modified' variable. -- Hal 0001-ibsim-Eliminate-unused-modified-variable.patch Description: application/mbox

[ofa-general] [PATCH] ibsim: Change lid print format to unsigned

2009-02-05 Thread Hal Rosenstock
Sasha, Patch to change lid print format to unsigned to be consistent elsewhere. -- Hal 0003-ibsim-Change-lid-prints-to-unsigned.patch Description: application/mbox

[ofa-general] [PATCH] opensm/doc/perf-manager-arch.txt: Fix some commentary typos

2009-02-05 Thread Hal Rosenstock
Sasha, Trivial patch to fix some typos in this doc. -- Hal 0001-opensm-doc-perf-manager-arch.txt-Fix-some-commentar.patch Description: application/mbox

[ofa-general] [PATCH] opensm/PerfMgr: Add copyrights

2009-02-05 Thread Hal Rosenstock
Sasha, This just adds copyrights missed in previous patches. -- Hal 0002-opensm-PerfMgr-Add-copyright.patch Description: application/mbox

[ofa-general] [PATCH] libibmad: lid print format changed to unsigned

2009-02-05 Thread Hal Rosenstock
Sasha, This changes libibmad lid print format to unsigned to be consistent with OpenSM and diag tools. -- Hal 0003-libibmad-lid-printing-changed-to-unsigned-as-was-d.patch Description: application/mbox

[ofa-general] libibumad/umad.c: Change lid print format to unsigned

2009-02-05 Thread Hal Rosenstock
Sasha, This patch changes umad.c lid print format to unsigned. -- Hal 0007-libibumad-umad.c-Change-lid-prints-to-unsigned.patch Description: application/mbox

[ofa-general] [PATCH] libibmad/rpc.c: In mad_rpc/mad_rpc_rmpp, set rpc attribute ID from response

2009-02-05 Thread Hal Rosenstock
Sasha, This patch sets the attribute ID based on what is in the response. -- Hal 0009-libibmad-rpc.c-In-mad_rpc-and-mad_rpc_rmpp-set-rpc.patch Description: application/mbox

[ofa-general] [PATCH] libibmad/gs.c: Factor out common code

2009-02-05 Thread Hal Rosenstock
Sasha, This patch factors out some common code in gs.c. common_query_setup is used by both pma_query_via and performance_reset_via. -- Hal 0010-libibmad-gs.c-Factor-out-common-code.patch Description: application/mbox

[ofa-general] [PATCH] infiniband-diags/perfquery: Change option name for extended counters

2009-02-05 Thread Hal Rosenstock
Sasha, Per the RFC, this patch changes the option name for extended counters so that it does not cover up the common errors option. This changes it from -e/--extended to -x/--xtended so -e/--errors can be used to get error information as is common with the IB diags. -- Hal

Re: [ofa-general] IB credit-based flow control

2009-02-05 Thread Andy Grover
Sean Hefty wrote: So, why is it necessary for a ULP like RDS to implement its own flow control? It looks like IB's flow control should result in no RNR retries, yet without protocol-level FC, we see RNR retries. If you're using a shared receive queue, end to end flow control is disabled. Also,

RE: [ofa-general] IB credit-based flow control

2009-02-05 Thread Sean Hefty
I'm reading C9-162 and still not seeing why (according to the spec anyways) there should ever be RNR retries on a connection. I would think the receiving HCA would not credit its last WQE to the sender, and thus retries should never happen? The whole point of this feature is to eliminate RNR

Re: [ofa-general] IB credit-based flow control

2009-02-05 Thread Andy Grover
Sean Hefty wrote: My assumption is that if no credits are available when the SEND request arrives, then the receiver generates a RNR message, but I didn't read through the entire section to verify this. This is totally a guess, but there needs to be some sort of recovery mechanism in place to
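The assumption quoted above (a SEND that arrives when nothing is available at the receiver draws an RNR message) can be illustrated with a toy model. This is a deliberately simplified sketch, not the behaviour defined in the IB spec: it treats one credit as one posted receive, updates the sender's credit count instantly, and ignores shared receive queues and retry timers. All names (conn, recv_post, send_with_credits, send_ignoring_credits) are made up for the example.

```c
#include <stdbool.h>

/* Toy model of one connection. */
struct conn {
    int posted_recvs;    /* receive WQEs currently posted at the receiver */
    int sender_credits;  /* credits the sender believes it holds */
    int rnr_naks;        /* RNR NAKs generated so far */
};

/* Receiver posts n receives and advertises them to the sender as credits. */
void recv_post(struct conn *c, int n)
{
    c->posted_recvs += n;
    c->sender_credits += n;
}

/* Sender that honours credits: with no credit the message is held back
 * (returns false) and nothing reaches the receiver, so no RNR NAK occurs. */
bool send_with_credits(struct conn *c)
{
    if (c->sender_credits == 0)
        return false;
    c->sender_credits--;
    c->posted_recvs--;
    return true;
}

/* Sender that ignores credits: the message always goes out, and the
 * receiver answers with an RNR NAK when no receive is posted. */
bool send_ignoring_credits(struct conn *c)
{
    if (c->posted_recvs == 0) {
        c->rnr_naks++;
        return false;
    }
    c->posted_recvs--;
    if (c->sender_credits > 0)
        c->sender_credits--;
    return true;
}
```

In this model the credit-aware sender never triggers an RNR NAK, which matches the expectation raised in the thread. It does not explain why RNR retries are still seen in practice, which is the open question here.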

Re: [ofa-general] IB credit-based flow control

2009-02-05 Thread Steve Wise
Andy Grover wrote: Sean Hefty wrote: My assumption is that if no credits are available when the SEND request arrives, then the receiver generates a RNR message, but I didn't read through the entire section to verify this. This is totally a guess, but there needs to be some sort of recovery

Re: [ofa-general] IB credit-based flow control

2009-02-05 Thread Andy Grover
Andy Grover wrote: How would I verify that? I'm using current HCAs (mlx4), so I'm assuming if the spec says an HCA must support something, it is supported? We definitely still need ulp-level flow control for iwarp so it's not wasted work. But if IB doesn't, then it would be great to not incur

Re: [ofa-general] IB credit-based flow control

2009-02-05 Thread Roland Dreier
How would I verify that? I'm using current HCAs (mlx4), so I'm assuming if the spec says an HCA must support something, it is supported? We definitely still need ulp-level flow control for iwarp so it's not wasted work. But if IB doesn't, then it would be great to not incur the