[ofa-general] multicast group join limits -- test code

2007-08-14 Thread Andrew Friedley
I've attached a simple test program that should demonstrate the limitations I'm seeing when joining multiple multicast groups; the idea being to allow others to see the weirdness I'm seeing and make some progress. An MPI is needed to compile/run the test. No arguments are needed; the test

Re: [ofa-general] Limited number of multicasts groups that can be joined?

2007-07-19 Thread Andrew Friedley
Finally was able to have the SM switched over from Cisco on the switch to OpenSM on a node. Responses inline below. Sean Hefty wrote: Now the more interesting part. I'm now able to run on a 128 node machine using OpenSM running on a node (before, I was running on an 8 node machine which

Re: [ofa-general] Limited number of multicasts groups that can be joined?

2007-07-19 Thread Andrew Friedley
Hal Rosenstock wrote: I'm not quite parsing what is the same with what is different in the results (and I presume the only variable is SM). Yes; this is confusing, I'll try to summarize the various behaviors I'm getting. First, there are two machines. One has 8 nodes and runs a Topspin

Re: [ofa-general] Limited number of multicasts groups that can be joined?

2007-07-19 Thread Andrew Friedley
Andrew Friedley wrote: Hal Rosenstock wrote: I'm not quite parsing what is the same with what is different in the results (and I presume the only variable is SM). Yes; this is confusing, I'll try to summarize the various behaviors I'm getting. First, there are two machines. One has 8

Re: [ofa-general] Limited number of multicasts groups that can be joined?

2007-06-28 Thread Andrew Friedley
where to start looking on my own, so even some hints on where the problem(s) might lie would be useful. Andrew Andrew Friedley wrote: I've run into a problem where it appears that I cannot join more than 14 multicast groups from a single HCA. I'm using the RDMA CM UD/multicast interface from

[ofa-general] Limited number of multicasts groups that can be joined?

2007-06-08 Thread Andrew Friedley
I've run into a problem where it appears that I cannot join more than 14 multicast groups from a single HCA. I'm using the RDMA CM UD/multicast interface from an OFED v1.2 nightly build, and using a '0' address when joining to have the SM allocate an unused address. The first 14

Re: [ofa-general] how to write a IB user level multicast application

2007-05-24 Thread Andrew Friedley
Dotan Barak wrote: In the following URL you can find a very simple example on how to use multicast: https://svn.openfabrics.org/svn/openib/trunk/contrib/mellanox/ibtp/gen2/userspace/useraccess/multicast_test/multicast_test.c I seem to be missing v1.h on my OFED v1.2 nightly install, where can

Re: [OMPI users] [ofa-general] Re: openMPI over uDAPL doesn't work

2007-05-09 Thread Andrew Friedley
You say that fixes the problem, does it work even when running more than one MPI process per node? (that is the case the hack fixes) Simply doing an mpirun with a -np parameter higher than the number of nodes you have set up should trigger this case, and making sure to use '-mca btl

Re: [OMPI devel] [ofa-general] Re: OMPI over ofed udapl - bugs opened

2007-05-09 Thread Andrew Friedley
Therefore, the only truly safe thing for an iWARP btl to do (or a udapl btl since that is also an iWARP btl) is to have the active layer send an MPI Layer nop of some kind immediately after establishing the connection if there is nothing else to send. This is fine for an

[ofa-general] Re: [OMPI devel] OMPI over ofed udapl - bugs opened

2007-05-09 Thread Andrew Friedley
Steve Wise wrote: On Wed, 2007-05-09 at 16:15 -0700, Andrew Friedley wrote: Steve Wise wrote: There have been a series of discussions on the ofa general list about this issue, and the conclusion to date is that it cannot be resolved in the rdma-cm or iwarp-cm code of the linux rdma stack

Re: [OMPI devel] OMPI over OFA udapl (was Re: [ofa-general] OpenMPI and RDMA-CM)

2007-05-08 Thread Andrew Friedley
Steve Wise wrote: Well I've tried OMPI on ofed-1.2 udapl today and it doesn't work. I'm debugging now. Here's part of the problem (from ompi/btl/udapl/btl_udapl.c): /* TODO - big bad evil hack! */ /* uDAPL doesn't ever seem to keep track of ports with addresses. This becomes

Re: [ofa-general] build failure on nightly tarball -- bonding

2007-03-06 Thread Andrew Friedley
Moni Shoua wrote: Andrew Friedley wrote: Moni Shoua wrote: Andrew Friedley wrote: The chelsio build errors from yesterday appear to be gone, though now I'm seeing errors building the IB bonding code with the 3/2 alpha tarball -- error below. I'm wondering, is there a way to selectively avoid

Re: [ofa-general] [PATCH] Chelsio RHEL4 U2/U3 Support.

2007-03-01 Thread Andrew Friedley
Steve Wise wrote: Vlad, This patch fixes the compile problems with Chelsio cxgb3 on RHEL4 U2/U3. Looks like I have the problem for this fix on U4 as well, could this be applied there too? Andrew