On Wed, 16 Jul 2008, Adam Jundt wrote:

I have been working on getting a nightly tarball of Open MPI to build on
a Cray XT4 system running CNL. I found the following post on the forum:
http://www.open-mpi.org/community/lists/users/2007/09/4059.php. I had to
modify the configure options a little (added another include directory
to CFLAGS, and inserted the '--disable-mpi-f77' flag) to get it to build
for me. Here is what I used:

./configure CC=/opt/xt-pe/default/bin/snos64/linux-pgcc
CXX=/opt/xt-pe/default/bin/snos64/linux-pgCC
F77=/opt/xt-pe/default/bin/snos64/linux-pgftn
FC=/opt/xt-pe/default/bin/snos64/linux-pgf90
CFLAGS="-I/opt/xt-pe/default/include/
-I/opt/xt-catamount/default/catamount/linux/include/"
CPPFLAGS=-I/opt/xt-pe/default/include/
FCFLAGS=-I/opt/xt-pe/default/include/
FFLAGS=-I/opt/xt-pe/default/include/
LDFLAGS=-L/opt/xt-mpt/default/lib/snos64/ LIBS="-lpct -lalpslli
-lalpsutil" --build=x86_64-unknown-linux-gnu
--host=x86_64-cray-linux-gnu
--with-platform=/lus/nid00008/jundt/openmpi-1.3a1r18788/contrib/platform/cray_xt3_romio
--with-io-romio-flags=--disable-aio build_alias=x86_64-unknown-linux-gnu
host_alias=x86_64-cray-linux-gnu --enable-ltdl-convenience
--no-recursion --disable-mpi-f77 --prefix=~/OpenMPI

I don't think it's a huge deal, but I think things will be a bit more sane if you change the --with-platform argument to cray_xt_cnl_romio instead of cray_xt3_romio (which is really targeting Catamount instead of CNL). One of the ORNL guys can probably be more helpful than I can here, as I'm only familiar with building on Red Storm / Catamount.
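
As a rough sketch of that change: the snippet below assumes the cray_xt_cnl_romio file sits next to cray_xt3_romio under contrib/platform in the source tree, which has not been verified against this particular nightly tarball. pick_platform is a hypothetical helper written for illustration, not something shipped with Open MPI.

```shell
# pick_platform is a hypothetical helper, not part of Open MPI.
# Given the top of an Open MPI source tree, it prints the path of the
# CNL platform file if that file exists, and fails otherwise.
pick_platform() {
    platform_file="$1/contrib/platform/cray_xt_cnl_romio"
    if [ -f "$platform_file" ]; then
        echo "$platform_file"
    else
        echo "no CNL platform file at $platform_file" >&2
        return 1
    fi
}

# Usage, run from the top of the source tree, with the remaining
# configure arguments left exactly as in the original command:
#   ./configure --with-platform="$(pick_platform "$PWD")" ...
```

Failing loudly when the file is missing seems safer than silently falling back to the Catamount platform file, since that fallback is what is suspected of causing trouble here.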

~/OpenMPI/lib/libopen-pal.a(timer_catamount_component.o): In function
`opal_timer_catamount_open':
timer_catamount_component.c:(.text+0x6): undefined reference to `__cpu_mhz'

Looking into timer_catamount_component.c, __cpu_mhz is defined within
the <catamount/dclock.h> file (which it should have already pulled in).
I realize that this is a very specific question, but I was curious if
anyone else had successfully gotten Open MPI to work on a similar
system, and if so, what configure options were used? If not, is anyone
aware of how to circumvent the problem?

By the way, I did try modifying the file timer_catamount_component.c to
not reference __cpu_mhz to see the result, and the program is able to
successfully compile, but hangs upon execution, i.e.:

That's a weird result. The configure test for the timer catamount component checks to see if __cpu_mhz is defined when linking. Can you send me (off list is probably best) the config.log generated by configure? That component was added just to the trunk/v1.3 branch in the last month, which is probably why no one on CNL noticed yet (obviously it works great on Catamount). I'm not really familiar with CNL -- does catamount/dclock.h exist on a standard CNL setup?
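
One possible way to gather that information before sending the log is sketched below. Both function names are hypothetical, and the exact wording of the configure check inside config.log is not known here, so the search simply looks for the symbol name. The include directory in the usage comment is the one from the CFLAGS in the configure line above.

```shell
# Hypothetical helpers; neither is part of Open MPI.

# Print every line of a config.log that mentions __cpu_mhz,
# prefixed with its line number.
show_cpu_mhz_checks() {
    grep -n '__cpu_mhz' "$1"
}

# Report whether catamount/dclock.h exists under an include directory.
has_dclock_header() {
    if [ -f "$1/catamount/dclock.h" ]; then
        echo "found: $1/catamount/dclock.h"
    else
        echo "missing: $1/catamount/dclock.h"
        return 1
    fi
}

# Usage:
#   show_cpu_mhz_checks config.log
#   has_dclock_header /opt/xt-catamount/default/catamount/linux/include
```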

~/OpenMPI/bin/mpicc test.c
~/OpenMPI/lib/libopen-rte.a(session_dir.o): In function
`orte_session_dir_get_name':
session_dir.c:(.text+0x7e): warning: Using 'getpwuid' in statically
linked applications requires at runtime the shared libraries from the
glibc version used for linking
~/OpenMPI/lib/libmpi.a(btl_tcp_component.o): In function
`mca_btl_tcp_component_create_listen':
btl_tcp_component.c:(.text+0x11c0): warning: Using 'getaddrinfo' in
statically linked applications requires at runtime the shared libraries
from the glibc version used for linking
aprun -n 2 ./a.out
... program hangs...

I'm afraid I can't help a whole lot here. However, there are some differences in how Open MPI initializes Portals on CNL versus Catamount. Since you configured for Catamount, it's possible that's the cause of the hang. Again, the ORNL people would probably know better than I would.


Brian
