I'm trying to track down a problem by using systemtap - but it needs the
debuginfo for the affected modules, and the OFED installer does not create a
debuginfo for the kernel modules.
Is there a way to turn the creation of debuginfo files on?
...@lists.openfabrics.org [mailto:ewg-
boun...@lists.openfabrics.org] On Behalf Of Mike Heinz
Sent: Monday, February 21, 2011 11:55 AM
To: klit...@dev.mellanox.co.il
Cc: Linux RDMA; ewg@lists.openfabrics.org
Subject: Re: [ewg] Patch breaks OFED 1.5.3: [PATCH] ibdiagpath:
Properly index VlArbTable during QoS test
, I'll see if it makes a difference.
-Original Message-
From: Yevgeny Kliteynik [mailto:klit...@dev.mellanox.co.il]
Sent: Sunday, February 20, 2011 9:05 AM
To: Mike Heinz; John Jolly
Cc: ewg@lists.openfabrics.org; Linux RDMA; Todd Rimmer; Eli Dorfman (Voltaire)
Subject: Re: Patch breaks OFED
-
From: Mike Heinz
Sent: Monday, February 21, 2011 10:40 AM
To: 'klit...@dev.mellanox.co.il'; John Jolly
Cc: ewg@lists.openfabrics.org; Linux RDMA; Todd Rimmer; Eli Dorfman (Voltaire)
Subject: RE: Patch breaks OFED 1.5.3: [ewg] [PATCH] ibdiagpath: Properly index
VlArbTable during QoS test
Yevgeny
The man page for umad_send() does not match the source code.
Signed-off-by: Michael Heinz michael.he...@qlogic.com
---
diff --git a/libibumad/man/umad_send.3 b/libibumad/man/umad_send.3
index 2d84f57..c4a617a 100644
--- a/libibumad/man/umad_send.3
+++ b/libibumad/man/umad_send.3
@@ -7,11 +7,13 @@
The version of ibdiagpath included with OFED 1.5.3-rc3 contains syntax errors
which prevent it from executing on the systems I've tested (using TCL 8.4).
Attempts to use ibdiagpath fail with an error message:
-I---
-I- QoS on Path Check
Wouldn't the BUSY patch I proposed last year deal with this situation?
-Original Message-
From: ewg-boun...@lists.openfabrics.org
[mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Moni Shoua
Sent: Wednesday, February 02, 2011 10:10 AM
To: Vlad
Cc: n...@voltaire.com; ewg
Subject:
.html
Basically, the spec permits an SM to reply busy instead of simply tossing
packets on the floor, but OFED does not handle this case right now.
-Original Message-
From: Moni Shoua [mailto:mo...@voltaire.com]
Sent: Wednesday, February 02, 2011 10:42 AM
To: Mike Heinz
Cc: Vlad; n
When you say connected mode, are you referring to ipoib or your MPI configuration?
You really don't want to use ipoib for HPC applications. What MPI are you
using?
For MPI - my personal experience is that OpenMPI is sometimes more reliable but
Mvapich-1 offers the best performance.
-Original
Heh. I forgot Intel sells an mpi, I thought you were saying you had recompiled
one of the OFED mpis with icc.
1) For your small cluster, there's no reason not to use connected mode. The
only reason for providing a datagram mode with MPI is to support very large
clusters where there simply
.
So the MTU value which I'm seeing on the ib0 interface (65520) is not
connected to the real InfiniBand MTU value?
Le 07/12/2010 16:52, Mike Heinz a écrit :
Heh. I forgot Intel sells an mpi, I thought you were saying you had
recompiled one of the OFED mpis with icc.
1) For your small cluster
...@intel.com]
Sent: Wednesday, October 13, 2010 11:59 AM
To: Mike Heinz; linux-r...@vger.kernel.org; e...@openfabrics.org
Cc: v...@mellanox.co.il; Roland Dreier
Subject: RE: user SA notifications, redux
As I mentioned earlier, the reason ib_sa acts as a single access point for
SA/SM traps and notices
to adding the
user-space capability to libibverbs.
Now that 1.5.2 is out the door, can we revisit this and try to get this and the
matching kernel changes into the next release?
===
API for Proposal for adding ib_usa to the Linux Infiniband Subsystem
Mike Heinz
Sean, Jason,
I backed off on this because the migration to OFED 1.5.2 and other issues were
consuming all of my time; I've had this patch for quite a while but I finally
had time recently to rework and test it for 1.5.2.
The intent of this patch is to try to address the feedback you gave me
Resending this because I never saw it show up in the list:
Looking at the SRPMS, I noticed that libsdp doesn't seem to have been made from
clean source. It contains the result of a configure and make operation:
Only in libsdp-1.1.103: config.h
Only in libsdp-1.1.103: config.log
Only in
Looking at the SRPMS, I noticed that libsdp doesn't seem to have been made from
clean source. It contains the result of a configure and make operation:
Only in libsdp-1.1.103: config.h
Only in libsdp-1.1.103: config.log
Only in libsdp-1.1.103: config.status
Only in libsdp-1.1.103: libtool
Only
Hello all,
I'm trying to build mvapich2-1.5.1 on an RHEL 5 update 3 system. It builds from
the SRPM just fine, but when I try to compile test programs, they don't link.
It appears that a set of routines, hwloc_* are missing from the shared library.
[r...@homer bandwidth]#
BTW - in case it wasn't clear, this is the mvapich2-1.5.1 rpm that comes with
OFED 1.5.2-rc6.
-Original Message-
From: mpich2-dev-boun...@mcs.anl.gov [mailto:mpich2-dev-boun...@mcs.anl.gov] On
Behalf Of Mike Heinz
Sent: Friday, September 10, 2010 2:43 PM
To: mpich2-...@mcs.anl.gov; e
Hey, all - I'm trying to install the 1.5.2-rc2 tarball with the following
command:
# ./install.pl --all --prefix /usr/ofed-1.5.2-rc2
but it fails when it gets to libsdp:
configure: error: OPENIB: --with-openib must be provided - fail to find
standard OpenIB kernel installation
error: Bad exit
'_usr /usr/ofed-1.5.2'
/home/mheinz/work/OFED-1.5.2-rc2/SRPMS/libsdp-1.1.101-0.3.gc767eee.src.rpm
-Original Message-
From: ewg-boun...@openfabrics.org [mailto:ewg-boun...@openfabrics.org] On
Behalf Of Mike Heinz
Sent: Thursday, July 22, 2010 3:05 PM
To: e...@openfabrics.org
Subject: [ewg
I never got a response to this patch, so I'm sending it again.
-
IPoIB is coded to use the 1st PKey in the PKey table as its ib0 interface.
Additional ib0.pkey interfaces may be created using the /sys/class/...
add_child interface.
However, there is a race. During normal
This patch builds upon my previously submitted patch for improving the default
handling of the node_desc.
With this patch, the openibd script will set the description of each HCA in the
system to the value @: HCA-## where ## is replaced with a unique id number
for that HCA and the @ symbol is
This is the OFED 1.5.2 version of a patch I submitted earlier today to
linux-rdma. There are only very small differences between OFED 1.5.2 and
matching areas of the IB drivers in Linux 2.6.35, but they were enough to break
the patch, making this version necessary.
If this patch is accepted
Thanks!
From: Vladimir Sokolovsky [v...@dev.mellanox.co.il]
Sent: Sunday, June 13, 2010 5:01 AM
To: Mike Heinz
Cc: e...@openfabrics.org
Subject: Re: [ewg] [PATCH] ofa_kernel madeye.c
Mike Heinz wrote:
This is a simple fix. Several of the snoop filters
and should be included in OFED 1.5.2.
-Original Message-
From: ewg-boun...@openfabrics.org [mailto:ewg-boun...@openfabrics.org] On
Behalf Of Mike Heinz
Sent: Tuesday, June 01, 2010 9:58 AM
To: e...@openfabrics.org
Subject: [ewg] [PATCH] ofa_kernel madeye.c
I'm resending this, because it seems
add code to force a trailing zero.
-Original Message-
From: Jack Morgenstein [mailto:ja...@dev.mellanox.co.il]
Sent: Thursday, June 03, 2010 6:14 AM
To: e...@openfabrics.org
Cc: Mike Heinz; e...@openfabrics.org
Subject: Re: [ewg] [PATCH] node description patch
On Tuesday 01 June 2010 17
It's workable, although I really wish there was a way to handle stupid apps
that aren't written to handle a busy response.
-Original Message-
From: Hefty, Sean [mailto:sean.he...@intel.com]
Sent: Tuesday, June 08, 2010 12:44 PM
To: Jason Gunthorpe
Cc: Mike Heinz; linux-r
Roland Dreier said:
I don't have a strong opinion on this but it seems a bit odd. If we're just
going to drop the response anyway, why did the SA send it in the first place?
On the other hand, if the SA told us it's busy, it does seem we could do
something more sensible than retrying
Sean said:
I don't object to the concept of treating a busy response as a timeout, but
how does this help prevent overwhelming the SA? It continues to retry the
queries, even if the SA says that it's too busy to respond without adjusting
the timeout specified by the user. I would think
, Sean
Cc: Mike Heinz; linux-r...@vger.kernel.org; e...@openfabrics.org
Subject: Re: [PATCH] Handling busy responses from the SA
On Fri, Jun 04, 2010 at 02:05:10PM -0700, Hefty, Sean wrote:
Maybe we should re-think that guideline and allow users to simply
indicate that the MAD layer should use
The purpose of this patch is to cause the ib_mad driver to discard busy
responses from the SA, effectively causing busy responses to become timeouts.
This ensures that naïve IB applications cannot overwhelm the SA with queries,
which could happen when a cluster is being rebooted, or when a
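The idea of turning a busy response into a timeout can be sketched in user-space C. This is only an illustration under stated assumptions, not the ib_mad patch itself: the names example_classify_response, EXAMPLE_DELIVER, EXAMPLE_DISCARD and EXAMPLE_MAD_STATUS_BUSY are hypothetical, and the busy flag is assumed to be the low bit of the 16-bit MAD status field, which arrives in network byte order.

```c
#include <arpa/inet.h>  /* ntohs */
#include <stdint.h>

/* Assumption for this sketch: the busy flag is the low bit of the status. */
#define EXAMPLE_MAD_STATUS_BUSY 0x0001

enum example_action {
    EXAMPLE_DELIVER,  /* hand the response to the waiting client */
    EXAMPLE_DISCARD   /* drop it, so the request's own timeout fires */
};

/*
 * Decide what to do with a response, given its status field exactly as
 * it appears on the wire (big-endian). A discarded response looks to the
 * sender like no response at all, so the normal timeout and retry logic
 * paces the next attempt.
 */
enum example_action example_classify_response(uint16_t status_be)
{
    if (ntohs(status_be) & EXAMPLE_MAD_STATUS_BUSY)
        return EXAMPLE_DISCARD;
    return EXAMPLE_DELIVER;
}
```

Dropping the response rather than reporting it means an application that has no busy handling still backs off for at least its configured timeout before retrying.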
Message-
From: ewg-boun...@openfabrics.org [mailto:ewg-boun...@openfabrics.org] On
Behalf Of Mike Heinz
Sent: Wednesday, May 26, 2010 4:01 PM
To: e...@openfabrics.org
Subject: [ewg] [PATCH] ofa_kernel madeye.c
This is a simple fix. Several of the snoop filters in
./drivers/infiniband/util
This patch fixes a problem with the openibd initialization script. On machines
using slower DHCP servers, openibd frequently sets the HCA's node description
to HCA-1. This patch modifies openibd to add a @ instead of the hostname and
adds a small hook in the core drivers to replace the @ sign
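A placeholder substitution of this kind might look roughly like the following user-space sketch. It is not the hook from the patch: example_expand_node_desc is a hypothetical name, the replacement string is whatever the caller supplies (the hostname is an assumption here), and the sketch replaces every @ it finds. It always writes a terminating zero, even when the output is truncated.

```c
#include <stddef.h>

/*
 * Copy tmpl into out, replacing each '@' with name.
 * At most outlen - 1 characters are written, followed by a NUL,
 * so the result is always a valid C string when outlen > 0.
 * Returns the number of characters written, not counting the NUL.
 */
size_t example_expand_node_desc(const char *tmpl, const char *name,
                                char *out, size_t outlen)
{
    size_t n = 0;

    if (outlen == 0)
        return 0;

    for (; *tmpl != '\0' && n + 1 < outlen; tmpl++) {
        if (*tmpl == '@') {
            /* Splice in the replacement, stopping if the buffer fills. */
            for (const char *p = name; *p != '\0' && n + 1 < outlen; p++)
                out[n++] = *p;
        } else {
            out[n++] = *tmpl;
        }
    }
    out[n] = '\0';  /* force a trailing zero */
    return n;
}
```

Because the substitution happens when the description is read rather than when the script runs, a hostname that arrives late from DHCP can still end up in the final string.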
To: Mike Heinz; openfabrics-...@openib.org
Subject: RE: [ewg] Question: When should patches be submitted to EWG and when
should they be submitted to linux-rdma?
In general, we would like kernel code to be reviewed and accepted (or at least
queued for
acceptance) upstream first and then submitted
This is a simple fix. Several of the snoop filters in
./drivers/infiniband/util/madeye.c don't switch the attribute id to host byte
order before checking it.
Signed-off-by: Michael Heinz michael.he...@qlogic.com
diff --git a/drivers/infiniband/util/madeye.c b/drivers/infiniband/util/madeye.c
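The byte-order problem described in this message can be illustrated with a small user-space sketch. It is not the madeye.c code: struct example_mad_hdr and both function names are hypothetical, and ntohs() stands in for the kernel's be16_to_cpu().

```c
#include <arpa/inet.h>  /* ntohs */
#include <stdint.h>

/* Hypothetical stand-in for the part of a MAD header a filter inspects. */
struct example_mad_hdr {
    uint16_t attr_id;  /* stored in network (big-endian) byte order */
};

/*
 * Mistaken form: compares the wire value directly with a host-order
 * constant. Only correct on big-endian hosts, where wire order and
 * host order coincide.
 */
int example_attr_matches_raw(const struct example_mad_hdr *hdr,
                             uint16_t wanted)
{
    return hdr->attr_id == wanted;
}

/* Corrected form: convert to host byte order before comparing. */
int example_attr_matches(const struct example_mad_hdr *hdr,
                         uint16_t wanted)
{
    return ntohs(hdr->attr_id) == wanted;
}
```

On a little-endian machine the raw comparison silently fails to match most attribute IDs, which is why a filter written that way can appear to work on one architecture and not another.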
The subject says it all. If I have a patch that can be applied against either
the current OFED git repository or against the upstream kernel - where do I
post it?
, or do I need to
submit the patch to both groups?
-Original Message-
From: Roland Dreier [mailto:rdre...@cisco.com]
Sent: Wednesday, May 26, 2010 4:50 PM
To: Mike Heinz
Cc: openfabrics-...@openib.org
Subject: Re: [ewg] Question: When should patches be submitted to EWG and when
should
Ira,
I'm pretty sure I already fixed this problem. I submitted a patch to Sasha back
in April.
-Original Message-
From: linux-rdma-ow...@vger.kernel.org
[mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Ira Weiny
Sent: Wednesday, May 05, 2010 9:10 PM
To: Woodruff, Robert J;
Yup - I've also sent a note to Sasha asking what happened to the patch.
-Original Message-
From: Ira Weiny [mailto:wei...@llnl.gov]
Sent: Thursday, May 06, 2010 11:35 AM
To: Mike Heinz; Sasha Khapyorsky
Cc: Woodruff, Robert J; linux-r...@vger.kernel.org; EWG; tzipo...@mellanox.co.il
Subject: Re
Sasha asked that I re-submit the patches for perfquery in a slightly different
format. This is the first of 3 patches.
This patch adds a function to libibmad that allows the caller to dump a
configurable range of MAD attributes. Basically, this provides an external
interface to the internal
Khapyorsky [mailto:sashakv...@gmail.com] On Behalf Of Sasha
Khapyorsky
Sent: Thursday, May 06, 2010 5:03 PM
To: Mike Heinz
Cc: linux-r...@vger.kernel.org; e...@openfabrics.org
Subject: Re: [PATCH] management: adding mad_dump_fields to libibmad
On 13:27 Thu 06 May , Mike Heinz wrote:
Sasha asked
Hi - the problem is that not all switches support the same features, and
ibcheckerrors is treating this as an error. I believe this will be fixed in
OFED 1.5.2.
-Original Message-
From: ewg-boun...@openfabrics.org [mailto:ewg-boun...@openfabrics.org] On
Behalf Of Woodruff, Robert J
the problem.
From: Tziporet Koren [mailto:tzipo...@dev.mellanox.co.il]
Sent: Sunday, May 02, 2010 4:05 PM
To: Mike Heinz
Cc: e...@openfabrics.org
Subject: Re: [ewg] Hang in ib_mad when unergistering.
On 4/30/2010 4:04 PM, Mike Heinz wrote:
Using OFED 1.5.0 and 1.5.1 we've been seeing nodes
Using OFED 1.5.0 and 1.5.1 we've been seeing nodes occasionally hang when a
process tries to disconnect from the umad interface. Can anyone suggest what
might be causing this?
Here's a typical example:
Apr 29 10:01:37 st2139 kernel: qlgc_dsc D 80148c54 0 5478
1 5497
We had a customer report that perfquery was crashing on their nodes when trying
to query ports on a switch. When I examined the core dump, it was clear that
libibmad was dereferencing a null pointer from one of the mad_set_ functions:
#0 0x in ?? ()
#1 0x2ae4e13e7536 in
These patches are a modification to a patch I submitted earlier, based on
Sasha's feedback. Rather than duplicating functionality between perfquery.c and
libibmad/dump.c, this patch exposes the internal function _dump_fields() as a new
API call, mad_dump_fields(). This permits perfquery to change