On Aug 9, 2005, at 7:24 AM, Sridhar Chirravuri wrote:

I have fixed the timing issue between the server and client, and now I was able to build Open MPI successfully.

Good.

Here is the output of ompi_info....

[root@micrompi-2 ompi]# ompi_info

                Open MPI: 1.0a1r6760M

Note that as of this morning (US Eastern time), the current head is r6774. Also be wary of any local mods you have put in the tree (as noted by the "M"). Check "svn status" to see which files you have modified, and "svn diff" to see the exact changes.
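As a rough sketch of that check: the snippet below takes the version string from the ompi_info output above, looks for the trailing "M", and works out how far the tree is behind the head. The head revision is hard-coded from the number quoted above and is only an illustration. The svn commands run only inside a Subversion working copy.

```shell
#!/bin/sh
# Version string as reported by ompi_info above.
version="1.0a1r6760M"
# Head revision quoted above; hard-coded here, it moves with every commit.
head_rev=6774

# A trailing "M" means the working copy has local modifications.
case "$version" in
    *M) modified=yes ;;
    *)  modified=no ;;
esac

# Strip everything up to the last "r", then the trailing "M", to get the revision.
rev=${version##*r}
rev=${rev%M}
behind=$((head_rev - rev))

echo "revision $rev, locally modified: $modified, $behind revisions behind r$head_rev"

# Inside a Subversion working copy, list the modified files and show the changes.
if command -v svn >/dev/null 2>&1 && [ -d .svn ]; then
    svn status | grep '^M'
    svn diff
fi
```

Run it from the top of the source tree so that "svn status" covers the whole checkout.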

This time, I can see that the btl mvapi component is built.

But I am still seeing the same problem when running the Pallas Benchmark, i.e., the data is still passing over TCP/GigE and NOT over InfiniBand.

Please note that the 2nd generation point-to-point implementation is still the default (where we have no IB support) -- all the IB support, both mVAPI and Open IB, is in the 3rd generation point-to-point implementation. You must explicitly request the 3rd generation point-to-point implementation at run time to get IB support. Check out slide 48, "Example: Forcing ob1/BTL" in the slides that we discussed on the teleconference (were you on the teleconference? I attached copies if you were not). The short version is that you need to tell Open MPI to use the "ob1" pml component (3rd gen), not the default "teg" pml component (2nd gen).
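For reference, a minimal sketch of what that request can look like on the command line. The process count and the benchmark path (./PMB-MPI1) are placeholders, not taken from your setup. The snippet only prints the command unless mpirun and the benchmark binary are actually present.

```shell
#!/bin/sh
# Placeholders: adjust the process count and benchmark path for your setup.
np=2
benchmark=./PMB-MPI1

# Request the 3rd generation point-to-point implementation (ob1) instead of teg.
pml=ob1
cmd="mpirun --mca pml $pml -np $np $benchmark"
echo "$cmd"

# Equivalent environment-variable form of the same MCA parameter.
OMPI_MCA_pml=$pml
export OMPI_MCA_pml

# Only launch if mpirun and the benchmark are actually present.
if command -v mpirun >/dev/null 2>&1 && [ -x "$benchmark" ]; then
    $cmd
fi
```

Adding something like "--mca btl mvapi,self" should additionally restrict which BTL components ob1 uses, but treat that component list as an assumption and check it against what ompi_info reports on your build.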

We'll eventually make the 3rd gen stuff be the default, and likely remove all the 2nd gen stuff (i.e., definitely before release) -- we just haven't done it yet because Tim and Galen are still polishing up the 3rd gen stuff.

I have disabled the OpenIB build by touching .ompi_ignore. This should not be a problem for MVAPI.

If the Open IB headers / libraries are not located in compiler-known locations, then you shouldn't need to .ompi_ignore the tree (i.e., configure won't find the Open IB headers / libraries, and will therefore automatically skip those components).

Again, it is our intention that users will neither know about nor have to touch files in the distribution -- they only need to use the appropriate options to "configure" and then "make".

I'm not sure if we have explicit options to disable a component in configure -- Brian, can you comment here?

I have run autogen.sh, configure and make all. The output of the autogen.sh, configure and make all commands is gzip'ed in the ompi_out.tar.gz file, which is attached to this mail. This gzip file also contains the Pallas Benchmark results. At the end of the Pallas Benchmark output, you can find these errors:

Request for 0 bytes (coll_basic_reduce.c, 193)
Request for 0 bytes (coll_basic_reduce.c, 193)
Request for 0 bytes (coll_basic_reduce.c, 193)
Request for 0 bytes (coll_basic_reduce.c, 193)
Request for 0 bytes (coll_basic_reduce.c, 193)
Request for 0 bytes (coll_basic_reduce_scatter.c, 79)
Request for 0 bytes (coll_basic_reduce.c, 193)
Request for 0 bytes (coll_basic_reduce_scatter.c, 79)
Request for 0 bytes (coll_basic_reduce.c, 193)

..and Pallas just hung.

I have no clue about the above errors, which are coming from the Open MPI source code.

The 2nd generation component has fallen into some disrepair -- I'd try re-running with ob1 and see what happens. I have not seen such errors when running PMB before, but I can try running it again to see if we've broken something recently.

Is there anything that I am missing while building btl mvapi? Also, has anyone built for mvapi and tested this OMPI stack? Please let me know.

Galen Shipman and Tim Woodall are doing all the IB work.

--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/

