Hi,

I'm really grateful for the detailed responses. I'll try the different approaches Larry suggested. Right now MPICH is meeting my needs, so I have less time to devote to getting OpenMPI working, but I am still interested in having it work as an alternative to MPICH.

Thanks!

On Tue, Aug 16, 2011 at 10:35 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Just an FYI. Disabling ORTE support is intended solely for systems that
> require no RTE assistance - e.g., Crays. Configuring without RTE support
> will generate something that cannot run on a Mac, which is why the build
> fails in that environment - it is looking for external RTE support that does
> not exist on the Mac. That configure option works fine on the intended
> targets.
>
> The declspec macro does indeed have visibility attributes - in fact, that
> is its sole purpose. You are welcome to try disabling visibility to see if
> that helps.
>
> The module definitions are actually identical, minus the visibility flags.
>
>
> On Aug 16, 2011, at 8:08 PM, Larry Baker wrote:
>
> Matthew,
>
> The best I can come up with is that somehow the declaration of
> external orte_odls in orte/mca/odls/odls.h
>
> ORTE_DECLSPEC extern orte_odls_base_module_t orte_odls; /* holds selected module's function pointers */
>
> does not exactly match the definition of orte_odls in
> orte/mca/odls/base/odls_base_open.c
>
> orte_odls_base_module_t orte_odls;
>
> ORTE_DECLSPEC might include some decorations having to do with the
> visibility attribute. Try adding --disable-visibility to your configure.
>
> Otherwise, I see in orte/mca/odls/base/odls_base_open.c that orte_odls is
> not defined if ORTE_DISABLE_FULL_SUPPORT == 1. I tried to compile
> with --without-rte-support to force #define ORTE_DISABLE_FULL_SUPPORT 1, but
> the make failed before it reached the link that failed for you. When
> --without-rte-support is requested in 1.4.3, there are declarations that
> depend on typedefs that are skipped, causing the make to fail. You may be
> encountering something subtle like that when configure deduces some behavior
> for pgcc and the code doesn't quite have the conditional compilation tests
> in the right place.
>
> You might try a newer version of OpenMPI, which might have fixed problems
> like --without-rte-support failing.
>
> Larry Baker
> US Geological Survey
> 650-329-5608
> ba...@usgs.gov
>
> On 16 Aug 2011, at 11:53 AM, Matthew Russell wrote:
>
> Hi Larry,
>
> Thank you for your interest.
>
> I believe your solution is the right one; however, I think there are some
> other issues causing problems too.
>
> When I add the -search_paths_first flag to my configure, the command that
> breaks in the Makefile is:
>
> libtool: link: /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2 -Msignextend -V -search_paths_first -o orte-clean orte-clean.o ../../../orte/.libs/libopen-rte.a /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
> pgcc-Error-Unknown switch: -search_paths_first
>
> pgcc 10.9-0 64-bit target on Apple OS/X -tp nehalem-64
> Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
> Copyright 2000-2010, STMicroelectronics, Inc. All Rights Reserved.
> make: *** [orte-clean] Error 1
>
> The problem there is that libtool isn't passing the "-Wl," prefix along
> with the -search_paths_first option, so it isn't getting to the linker. If
> I try to build it manually, I still have missing symbols:
>
> matt@pontus:orte-clean$ pgcc -DNDEBUG -O2 -Msignextend -V -Wl,-search_paths_first -o orte-clean orte-clean.o ../../../orte/.libs/libopen-rte.a /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
>
> pgcc 10.9-0 64-bit target on Apple OS/X -tp nehalem-64
> Copyright 1989-2000, The Portland Group, Inc. All Rights Reserved.
> Copyright 2000-2010, STMicroelectronics, Inc. All Rights Reserved.
> Undefined symbols for architecture x86_64:
> "_orte_odls", referenced from:
> _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
> ld: symbol(s) not found for architecture x86_64
>
>
> On Tue, Aug 16, 2011 at 2:46 PM, Larry Baker <ba...@usgs.gov> wrote:
>
>> Matthew,
>>
>> What configure options did you use?
>>
>> I can try to replicate your findings, as best I can, using the Intel
>> compiler on my desktop Mac (Leopard). One thing I want to investigate is
>> which libutil is supposed to be linked. There is no -L in the failing link
>> step. Is that possibly the error?
>>
>> I have PGI and about five other compilers on our cluster. I'll get to
>> OpenMPI 1.4.3 with all those as soon as I fetch the latest versions and
>> reinstall my cluster software (Rocks just came out with 5.4.3).
>>
>> Larry Baker
>> US Geological Survey
>> 650-329-5608
>> ba...@usgs.gov
>>
>> On 16 Aug 2011, at 9:44 AM, Matthew Russell wrote:
>>
>> Hmm, I tried the recommendation above, adding -Wl,-search_paths_first, and
>> I still ran into the same issue. I suspect it is an issue with PGI.
>>
>> Meanwhile, I've been able to get my applications (CMAQ) working with
>> MPICH2, so for now at least I am going to continue with that.
>>
>> Thanks for the responses!
>>
>> On Mon, Aug 15, 2011 at 8:43 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>> FWIW: I build OMPI on Mac OS-X (Snow Leopard) every day, without adding
>>> any extra flags, without problem. The citation below relates to something
>>> from a long time ago, I believe - haven't seen that problem in quite some
>>> time.
>>>
>>> I do not, however, use PGI. We regularly have problems with PGI on a
>>> variety of systems, and I suspect you are hitting one here - but can't
>>> confirm it as we don't have PGI licenses to use for testing.
>>>
>>> The Xgrid support is broken, but has nothing to do with the problem you
>>> describe. Just means you can't launch via Xgrid.
>>>
>>>
>>> On Aug 15, 2011, at 2:53 PM, Larry Baker wrote:
>>>
>>> Matthew,
>>>
>>> I have the same type of error on a completely different software package
>>> on Mac OS X. The error occurs because of the way that Mac OS X searches for
>>> -lutil. If the libutil.a ORTE needs is theirs, i.e., not the system
>>> libutil.dylib, then you have exactly the same problem I did.
>>>
>>> Here are my notes for the fix using gcc. You will have to find out the
>>> equivalent method to pass the -search_paths_first linker option using pgcc.
>>>
>>> # Mac OS X searches for shared libraries before static libraries. Thus, -L<ours> -lutil finds the system libutil.dylib
>>> # before our libutil.a, which causes undefined references in the link step because it is using the wrong library. The
>>> # ld -search_paths_first option forces ld to search each directory first for a matching library, instead of all directories
>>> # first for a shared library.
>>> # Note: this is the form to pass -search_paths_first to ld when $(CC) is the linker command in makefile.ux
>>> export LDFLAGS=-Wl,-search_paths_first
>>>
>>>
>>> Larry Baker
>>> US Geological Survey
>>> 650-329-5608
>>> ba...@usgs.gov
>>>
>>> On 15 Aug 2011, at 1:01 PM, Matthew Russell wrote:
>>>
>>> I hope this problem merits being posted here.
>>>
>>> On OS X (Snow Leopard, and Lion), I cannot seem to build Open MPI.
>>>
>>> After a lot of building, I get the error:
>>>
>>> /bin/sh ../../../libtool --tag=CC --mode=link /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2 -Msignextend -V -export-dynamic -o orte-clean orte-clean.o ../../../orte/libopen-rte.la -lutil
>>> libtool: link: /opt/pgi/osx86-64/10.9/bin/pgcc -DNDEBUG -O2 -Msignextend -V -o orte-clean orte-clean.o ../../../orte/.libs/libopen-rte.a /Users/matt/software/openmpi/openmpi-1.4.3/opal/.libs/libopen-pal.a -lutil
>>> Undefined symbols for architecture x86_64:
>>> "_orte_odls", referenced from:
>>> _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o)
>>> ld: symbol(s) not found for architecture x86_64
>>>
>>> This is with the PGI 10.9 compiler and OpenMPI 1.4.3; the platform is x86_64.
>>>
>>> The README does not list PGI as a compiler that OpenMPI was tested with,
>>> and there are notes about its support for XGrid being broken (I'm not sure
>>> if this is related).
>>>
>>> I seem to get the error regardless of which configure flags I use. For
>>> completeness, though, here are the flags I am using:
>>>
>>> ./configure --prefix=/usr/local/openmpi_pg --enable-mpi-f77 --enable-mpi-f90 --with-memory-manager=none
>>>
>>> Has anyone else hit or fixed this error?
>>>
>>> I looked at other postings on this list, such as
>>> http://www.open-mpi.org/community/lists/devel/2007/05/1590.php , but
>>> they didn't help much.
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
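
For anyone retracing this thread, the two checks discussed above (forwarding -search_paths_first through the compiler driver with "-Wl,", and seeing whether libopen-rte.a really defines _orte_odls) could be sketched roughly as follows. This is only an illustration: wrap_ld_flag and defines_symbol are hypothetical helper names, not part of Open MPI, libtool, or PGI, and the archive path in the final comment is simply the one from the failing link line.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical, made up for this example.

# Prefix a raw ld option with "-Wl," so the compiler driver hands it to the
# linker instead of rejecting it as an unknown compiler switch.
wrap_ld_flag() {
    printf '%s' "-Wl,$1"
}

# Read `nm` output on stdin and succeed only if the named symbol is defined,
# i.e. listed with a type letter other than "U" (undefined reference).
defines_symbol() {
    awk -v sym="$1" '
        $NF == sym && $(NF - 1) != "U" { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Same value as the LDFLAGS line in Larry's notes, built via the helper.
LDFLAGS="$(wrap_ld_flag -search_paths_first)"
export LDFLAGS
echo "LDFLAGS=$LDFLAGS"

# Against a real build tree (path from the thread; not executed here):
#   nm ../../../orte/.libs/libopen-rte.a | defines_symbol _orte_odls \
#       && echo "orte_odls is defined in the archive"
```

If the nm check reports no definition, the archive itself lacks the symbol and library search order is probably not the cause; if it reports one, the declaration/visibility mismatch Larry describes would be the next thing to look at.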