[ofa-general] running ib diagnostics blocks

2009-05-12 Thread Eli Dorfman (Voltaire)
Hi, What could be the reason that open(/dev/infiniband/umad0, O_RDWR|O_NONBLOCK) blocks and does not return. I did not find any errors in dmesg. Thanks, Eli ___ general mailing list general@lists.openfabrics.org
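One way to confirm the symptom without hanging the shell that runs the diagnostic is to attempt the open() in a child process and give up after a timeout. The following is only a sketch: the helper name probe_open and the 5 second timeout are made up for illustration, and it says nothing about why the umad device blocks.

```python
import subprocess
import sys
import time

# Code run in the child: the same open() flags as in the question.
PROBE = (
    "import os, sys\n"
    "fd = os.open(sys.argv[1], os.O_RDWR | os.O_NONBLOCK)\n"
    "os.close(fd)\n"
)


def probe_open(path, timeout_s=5.0):
    """Return 'opened', 'error' or 'blocked' for an open() attempt on path.

    probe_open is a hypothetical helper, not part of any OFED tool.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", PROBE, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        rc = child.poll()
        if rc is not None:
            return "opened" if rc == 0 else "error"
        time.sleep(0.05)
    # open() has not returned. Ask for the child to be killed but do not
    # wait for it: a task stuck inside the kernel may not exit.
    child.kill()
    return "blocked"
```

Calling probe_open("/dev/infiniband/umad0") on the affected machine would be expected to report "blocked" if the open really never returns; the child can then be inspected separately.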

Re: [ofa-general] running ib diagnostics blocks

2009-05-12 Thread Or Gerlitz
Eli Dorfman wrote: What could be the reason that open(/dev/infiniband/umad0, O_RDWR|O_NONBLOCK) blocks and does not return. I did not find any errors in dmesg. Eli, You can examine the kernel stack of all processes, including yours... using sysrq ($ echo 1 > /proc/sysrq-trigger and then $

[ofa-general] OFED 1.4.1-rc5 recall

2009-05-12 Thread Vladimir Sokolovsky
Hi, OFED-1.4.1-rc5 was removed from OFA downloads. OFED-1.4.1-rc6 will be released as soon as the dependency issue between nfs and ib_core is resolved and tested. Regards, Vladimir

Re: [ofa-general] How to establish IB communcation more effectively?

2009-05-12 Thread Moni Shoua
Dotan Barak wrote: You can't find such samples in the verbs library; It can be found in the rdma cma library, you should search for rping or ucmatose. Dotan 2009/5/12 zhouyong...@ict.ac.cn: Hi all, I'm using libibverbs to build a cluster memory pool, and using TCP/IP handshake to

[ofa-general] ofa_1_4_kernel 20090512-0200 daily build status

2009-05-12 Thread Vladimir Sokolovsky (Mellanox)
This email was generated automatically, please do not reply git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git git_branch: ofed_kernel Common build parameters: Passed: Passed on i686 with linux-2.6.16 Passed on i686 with linux-2.6.19 Passed on i686 with linux-2.6.17 Passed on i686 with

[ofa-general] [PATCH] perftest: send_lat/bw: Attach to multicast group when QP is in INIT

2009-05-12 Thread Hoang-Nam Nguyen
Subject: [PATCH] perftest: send_lat/bw: Attach to multicast group when QP is in INIT If multicast is enabled, the current code of send_lat/bw attaches the QP to a multicast group while it's still in RESET state. Since the IB spec does not strictly specify the QP state for this operation and

Re: [ofa-general] OFED 1.4.1-rc5 symbol disagreements on SLES 11 SP0

2009-05-12 Thread Tziporet Koren
Brian M. Rzycki wrote: Greetings, I have the following SLES 11 SP0 machine: It looks like the OFED installer isn't building ib_iser.ko even when I choose 2,3. This is the same bug reported on rc5. We removed rc5 and will publish RC6 soon Tziporet

[ofa-general] Re: [PATCH ofed-1.4.1 cxgb3 relnotes] Update cxgb3 release notes for 1.4.1

2009-05-12 Thread Tziporet Koren
Steve Wise wrote: Signed-off-by: Steve Wise sw...@opengridcomputing.com --- Applied Tziporet

[ofa-general] Re: [PATCH] ehca: remove driver_data direct access of struct device

2009-05-12 Thread Hoang-Nam Nguyen
Hi, Thanks for this patch. But I have to NACK because 1) Greg KH has already done a similar patch in his tree. See http://lists.openfabrics.org/pipermail/general/2009-May/059442.html 2) Your patch is incomplete. Regards Nam Roel Kluin roel.kl...@gmail.com wrote on 11.05.2009 22:25:07: From:

Re: [ofa-general] Re: [PATCH 2.6.30] xprtrdma: The frmr iova_start values are truncated by the nfs rdma client.

2009-05-12 Thread Steve Wise
Steve Wise wrote: Trond Myklebust wrote (earlier in this thread): All I should need to know is that I can advertise either dma handles or kernel VAs, and know that I can choose between two functions, say, ib_send_wr_fastreg_dma_init() and ib_send_wr_fastreg_kva_init() to initialise the

[ofa-general] Re: [PATCH] Fix 2 formatting diff's from old ibqueryerrors.

2009-05-12 Thread Sasha Khapyorsky
On 09:51 Wed 06 May , Ira Weiny wrote: 2 changes I noted in the output from ibqueryerrors. Link Info: was not being printed when -r was used. The header: Errors for 0xguid node name Should only be printed when errors are found. The following patch cleans those up. Ira

[ofa-general] Re: [PATCH] Clean up printing of switch heading when printing down links only.

2009-05-12 Thread Sasha Khapyorsky
On 09:53 Wed 06 May , Ira Weiny wrote: Another corner case: If there are no down links on a switch and -d is selected then the header for that switch should not be printed. Ira From: Ira Weiny wei...@llnl.gov Date: Thu, 30 Apr 2009 13:41:38 -0700 Subject: [PATCH] Clean up

Re: [ofa-general] [PATCH] opensm/osm_port.c: Remove error number from debug level log message

2009-05-12 Thread Sasha Khapyorsky
On 06:46 Sun 10 May , Hal Rosenstock wrote: Sasha has been adamant that any device supplied data errors use something other than ERROR log level. But I think that VERBOSE is more appropriate for such cases than just DEBUG. Another way is to add another level for subnet warnings.

[ofa-general] Re: [PATCH] opensm/osm_port.c: Remove error number from debug level log message

2009-05-12 Thread Sasha Khapyorsky
On 10:33 Thu 07 May , Hal Rosenstock wrote: Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com Applied. Thanks. Sasha

Re: [ofa-general] [PATCH] saquery: fix -c arguement

2009-05-12 Thread Sasha Khapyorsky
On 14:30 Sun 10 May , Doron Shoham wrote: set SAQUERY_CMD_CLASS_PORT_INFO instead of CLASS_PORT_INFO Signed-off-by: Doron Shoham dor...@voltaire.com Applied. Thanks. Sasha

Re: [ofa-general] [PATCH] opensm/osm_port.c: Remove error number from debug level log message

2009-05-12 Thread Hal Rosenstock
On Tue, May 12, 2009 at 1:55 PM, Sasha Khapyorsky sas...@voltaire.com wrote: On 06:46 Sun 10 May , Hal Rosenstock wrote: Sasha has been adamant that any device supplied data errors use something other than ERROR log level. But I think that VERBOSE is more appropriate for such cases

[ofa-general] Re: [PATCH] opensm/osm_lid_mgr.c bug in opensm LID assignment

2009-05-12 Thread Sasha Khapyorsky
On 14:29 Sat 09 May , Eli Dorfman (Voltaire) wrote: lid persistent range wrong check: used lids were not properly checked, which caused duplicate lid assignment in some cases. Signed-off-by: Eli Dorfman e...@voltaire.com Applied. Thanks. Sasha
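The class of bug described in this patch note can be illustrated with a toy allocator. This is not the OpenSM code: the function names, the set-based bookkeeping and the LID range are all assumptions made for the sketch. The point is only that a persistent (previously stored) LID must be checked against both the configured range and the set of LIDs already handed out before it is reused.

```python
def first_free_lid(used_lids, range_start, range_end):
    """Return the lowest unused LID in [range_start, range_end], or None."""
    for lid in range(range_start, range_end + 1):
        if lid not in used_lids:
            return lid
    return None


def assign_lid(used_lids, persistent_lid, range_start, range_end):
    """Assign a LID, preferring the persistent one when it is safe to reuse.

    Hypothetical helper: the persistent LID is honoured only if it lies in
    the allowed range AND nobody else holds it. Skipping the second check
    is what would produce duplicate LIDs.
    """
    in_range = range_start <= persistent_lid <= range_end
    if in_range and persistent_lid not in used_lids:
        lid = persistent_lid
    else:
        lid = first_free_lid(used_lids, range_start, range_end)
    if lid is not None:
        used_lids.add(lid)
    return lid
```

With an empty set and a range of 1 to 10, two ports that both remember LID 5 end up with 5 and 1 respectively, never the same value.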

[ofa-general] Re: [PATCH] opensm/PerfMgr: Reduce host name length

2009-05-12 Thread Sasha Khapyorsky
On 07:21 Tue 12 May , Hal Rosenstock wrote: to what's needed (based on NodeDescription length) Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com Applied. Thanks. Sasha

Re: [ofa-general] [PATCH] opensm/osm_port.c: Remove error number from debug level log message

2009-05-12 Thread Sasha Khapyorsky
On 14:06 Tue 12 May , Hal Rosenstock wrote: Yes, VERBOSE level is more consistent than DEBUG level with what is done elsewhere in OpenSM. Ok, I'm changing to VERBOSE. Sasha
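The practical difference between the two levels is which log mask makes the message visible. The sketch below models the levels as bit flags; the numeric values are what I believe OpenSM uses for its log flags, but treat them (and the helper name is_logged) as assumptions for illustration rather than a copy of the OpenSM headers.

```python
# Assumed flag values, modelled on OpenSM-style bitmask log levels.
LOG_ERROR = 0x01
LOG_INFO = 0x02
LOG_VERBOSE = 0x04
LOG_DEBUG = 0x08


def is_logged(mask, level):
    """Return True if a message at 'level' is emitted under log mask 'mask'."""
    return bool(mask & level)


# An operator who enables errors, info and verbose output (mask 0x07)
# sees a VERBOSE message about device-supplied data but not a DEBUG one.
operator_mask = LOG_ERROR | LOG_INFO | LOG_VERBOSE
```

Under this model, moving the message from DEBUG to VERBOSE makes it appear at mask 0x07 without requiring the much noisier debug flag.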

[ofa-general] [PATCH] opensm/osm_port.c: Change log level of Invalid OP_VLS 0 message to VERBOSE

2009-05-12 Thread Hal Rosenstock
Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com --- diff --git a/opensm/opensm/osm_port.c b/opensm/opensm/osm_port.c index 17bac73..cb8b153 100644 --- a/opensm/opensm/osm_port.c +++ b/opensm/opensm/osm_port.c @@ -381,7 +381,7 @@ uint8_t osm_physp_calc_link_op_vls(IN osm_log_t * p_log,

[ofa-general] Re: [PATCH] opensm/osm_port.c: Change log level of Invalid OP_VLS 0 message to VERBOSE

2009-05-12 Thread Sasha Khapyorsky
On 14:32 Tue 12 May , Hal Rosenstock wrote: Signed-off-by: Hal Rosenstock hal.rosenst...@gmail.com Oops, I committed this already in the local branch :) Will use your version instead. Applied. Thanks. Sasha

[ofa-general] Re: [PATCH 0/2] osm_port.c: do not enforce PortInfo update if max_op_vls = 0

2009-05-12 Thread Sasha Khapyorsky
On 18:55 Thu 07 May , Doron Shoham wrote: do not enforce PortInfo update if max_op_vls = 0 Signed-off-by: Doron Shoham dor...@voltaire.com --- opensm/opensm/osm_port.c |2 +- opensm/opensm/osm_subnet.c |8 2 files changed, 9 insertions(+), 1 deletions(-) diff

Re: [ofa-general] [PATCH 1/2] osm_port.c: check if op_vls = 0 before max_op_vls comparison

2009-05-12 Thread Sasha Khapyorsky
On 09:49 Sun 10 May , Eli Dorfman (Voltaire) wrote: Doron Shoham wrote: check if op_vls = 0 before max_op_vls comparison Signed-off-by: Doron Shoham dor...@voltaire.com --- opensm/opensm/osm_port.c |9 + 1 files changed, 5 insertions(+), 4 deletions(-) diff

[ofa-general] Re: [PATCH 1/2] osm_port.c: check if op_vls = 0 before max_op_vls comparison

2009-05-12 Thread Sasha Khapyorsky
On 11:17 Sun 10 May , Doron Shoham wrote: check if op_vls = 0 before max_op_vls comparison Signed-off-by: Doron Shoham dor...@voltaire.com Applied. Thanks. See comments below. --- opensm/opensm/osm_port.c | 11 ++- 1 files changed, 6 insertions(+), 5 deletions(-) diff

[ofa-general] Re: [RFC][PATCH] ibnetdiscover: remove report of max hops discovered.

2009-05-12 Thread Sasha Khapyorsky
On 18:01 Wed 06 May , Ira Weiny wrote: The number reported as max hops from ibnetdiscover can change depending on the algorithm used to discover the fabric. As Hal says in the message below using this number is therefore dangerous. If no one is currently using this number I propose the

RE: [ofa-general] How to establish IB communcation more effectively?

2009-05-12 Thread Davis, Arlin R
Hi all, I'm using libibverbs to build a cluster memory pool, and using TCP/IP handshake to exchange memory information and establish the connection before the IB communication. However, I found this process cost a lot of time, 100ms in 1GEth LAN, so I want to use the rdma_cm or ib_ucm to

Re: [ofa-general] How to establish IB communcation more effectively?

2009-05-12 Thread Or Gerlitz
Davis, Arlin R arlin.r.da...@intel.com wrote: For a connection (socket connect, exchanging QP info, private data, qp modify) using uDAPL socket cm versus rdma_cm I get: socket_cm on 1Ge == ~900us socket_cm on IPoIB (mlx4 ddr) == ~400us rdma_cm on IB (mlx4 ddr) == ~2200us As you can see, the
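To put the quoted figures side by side, the ratios can be worked out directly. The numbers are the ones reported above; the dictionary keys and the helper name slowdown are just labels chosen for this sketch.

```python
# Connection setup times in microseconds, as quoted in the message above.
TIMINGS_US = {
    "socket_cm on 1GbE": 900,
    "socket_cm on IPoIB": 400,
    "rdma_cm on IB": 2200,
}


def slowdown(slow, fast):
    """Return how many times longer 'slow' takes than 'fast'."""
    return TIMINGS_US[slow] / TIMINGS_US[fast]


vs_ipoib = slowdown("rdma_cm on IB", "socket_cm on IPoIB")
vs_gige = slowdown("rdma_cm on IB", "socket_cm on 1GbE")
```

By these measurements rdma_cm setup takes 5.5 times as long as the socket CM over IPoIB, and roughly 2.4 times as long as the socket CM over 1 Gb Ethernet.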

RE: [ofa-general] How to establish IB communcation more effectively?

2009-05-12 Thread Sean Hefty
Just to make sure we're on the same page: both IPoIB and the RDMA-CM use SA path queries But ipoib caches its path records... - Sean

RE: [ofa-general] How to establish IB communcation more effectively?

2009-05-12 Thread Davis, Arlin R
Davis, Arlin R arlin.r.da...@intel.com wrote: For a connection (socket connect, exchanging QP info, private data, qp modify) using uDAPL socket cm versus rdma_cm I get: socket_cm on 1Ge == ~900us socket_cm on IPoIB (mlx4 ddr) == ~400us rdma_cm on IB (mlx4 ddr) == ~2200us As you can see,

Re: [ofa-general] How to establish IB communcation more effectively?

2009-05-12 Thread Or Gerlitz
Just to make sure we're on the same page: both IPoIB and the RDMA-CM use SA path queries But ipoib caches its path records... Yes, of course. But, to start with, let's analyze the case of each node running --one-- rank and then take it from there to the case where each node runs C ranks. Or.

Re: [ofa-general] How to establish IB communcation more effectively?

2009-05-12 Thread Or Gerlitz
Davis, Arlin R arlin.r.da...@intel.com wrote: Just to make sure we're on the same page: both IPoIB and the RDMA-CM use SA path queries (ipoib for the unicast arp reply, and rdma-cm for rdma_resolve_route), going into details, things look like: I am running IPoIB connected so I assume there is no

RE: [ofa-general] How to establish IB communcation more effectively?

2009-05-12 Thread Sean Hefty
Yes, of course. But, to start with, let's analyze the case of each node running --one-- rank and then take it from there to the case where each node runs C ranks. The caching is independent of running MPI though. To get a fair comparison, you'd probably have to reboot the entire cluster before
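Why caching skews such a comparison can be shown with a toy model. This is not IPoIB or RDMA-CM code: the class PathCache, its fields and the fake query function are all hypothetical. It only illustrates that a cached path record is paid for once, so a measurement taken with a warm cache hides the SA query cost that an uncached resolver pays on every connection.

```python
class PathCache:
    """Toy path-record cache keyed by destination GID (hypothetical)."""

    def __init__(self, sa_query):
        self._sa_query = sa_query   # callable that performs the real lookup
        self._records = {}
        self.sa_queries = 0         # how many times the SA was actually asked

    def resolve(self, dgid):
        """Return the path record for dgid, querying the SA only on a miss."""
        if dgid not in self._records:
            self.sa_queries += 1
            self._records[dgid] = self._sa_query(dgid)
        return self._records[dgid]


def fake_sa_query(dgid):
    # Stand-in for an SA path query; returns a dummy record.
    return {"dgid": dgid, "dlid": 1}


cache = PathCache(fake_sa_query)
for _ in range(3):
    cache.resolve("fe80::1")
```

After three connections to the same destination the model has issued a single SA query, which is the effect that would need to be reset before timing both methods from a cold start.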

[ofa-general] qperf: destroy QPs before destroying any other objects

2009-05-12 Thread Ralph Campbell
The QP contains references to the protection domain (PD), memory regions (MR), completion queues (CQ), address handles (AH), etc. The QP should be destroyed before any other objects are destroyed so that the referenced object is not busy. Signed-off-by: Ralph Campbell
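The ordering rule can be modelled with simple reference counts. The sketch below does not use libibverbs; the Resource class and the Busy exception are invented for illustration, and which objects reference which follows the description above.

```python
class Busy(Exception):
    """Raised when an object is destroyed while others still reference it."""


class Resource:
    """Toy verbs-like object that counts the objects referencing it."""

    def __init__(self, name, *uses):
        self.name = name
        self.users = 0
        self.uses = list(uses)
        for res in self.uses:
            res.users += 1

    def destroy(self):
        if self.users:
            raise Busy(f"{self.name} still has {self.users} user(s)")
        # Drop our references so the objects we used can be destroyed next.
        for res in self.uses:
            res.users -= 1
        self.uses = []


pd = Resource("PD")
cq = Resource("CQ")
mr = Resource("MR", pd)
qp = Resource("QP", pd, cq, mr)

# Teardown in dependency order: the QP first, then what it referenced.
teardown_order = [qp, mr, cq, pd]
```

Destroying pd before qp raises Busy in this model, while walking teardown_order and calling destroy() on each object succeeds, which mirrors the reasoning given for the patch.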