Adding --without-lustre to my configure args allowed me to compile and link ring_c. I am in the queue now and will report later on run results.
-Paul

On Fri, Jan 25, 2013 at 2:13 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Still having problems on the Cray XC30, but now they are when linking an
> MPI app:
>
> $ ./INSTALL/bin/mpicc -o ring_c examples/ring_c.c
>> fs_lustre_file_open.c:(.text+0x130): undefined reference to
>> `llapi_file_create'
>> fs_lustre_file_open.c:(.text+0x17e): undefined reference to
>> `llapi_file_get_stripe'
>> /usr/bin/ld: link errors found, deleting executable `ring_c'
>> collect2: error: ld returned 1 exit status
>
>
> It appears that lustre support was found at configure time using a test
> that used "-llustre -llustreapi":
>
>> configure:157666: checking if possible to link LUSTRE
>> configure:157680: cc -std=gnu99 -o conftest -O3 -DNDEBUG
>> -finline-functions -fno-strict-aliasing -fexceptions -D_REENTRANT
>> -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/opal/mca/hwloc/hwloc151/hwloc/include
>> -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/BUILD-edison/opal/mca/hwloc/hwloc151/hwloc/include
>> -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/opal/mca/event/libevent2019/libevent
>> -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/opal/mca/event/libevent2019/libevent/include
>> -I/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/BUILD-edison/opal/mca/event/libevent2019/libevent/include
>> -I/opt/cray/pmi/default/include -I/opt/cray/pmi/default/include
>> -I/opt/cray/pmi/default/include -I/opt/cray/pmi/default/include
>> -I/usr//include/lustre/ -fexceptions -L/usr//lib64 conftest.c -lnsl
>> -lutil -lnsl -lutil -llustre -llustreapi
>
>
> However, those two libs are NOT included when linking an MPI application:
>
>> $ ./INSTALL/bin/mpicc -o ring_c examples/ring_c.c -v 2>&1 | grep collect
>> /opt/gcc/4.7.2/snos/libexec/gcc/x86_64-suse-linux/4.7.2/collect2
>> --sysroot= -m elf_x86_64 -static -o ring_c -u pthread_mutex_trylock -u
>> pthread_mutex_destroy -u pthread_create /usr/lib/../lib64/crt1.o
>> /usr/lib/../lib64/crti.o
>> /opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/crtbeginT.o
>> -L/opt/cray/pmi/default/lib64 -L/opt/cray/alps/default/lib64
>> -L/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.9a1r27905/INSTALL/lib
>> -L/opt/cray/udreg/2.3.2-1.0500.5931.3.1.ari/lib64
>> -L/opt/cray/ugni/4.0-1.0500.5836.7.58.ari/lib64
>> -L/opt/cray/pmi/4.0.0-1.0000.9282.69.4.ari/lib64
>> -L/opt/cray/dmapp/4.0.1-1.0500.5932.6.5.ari/lib64
>> -L/opt/cray/xpmem/0.1-2.0500.36799.3.6.ari/lib64
>> -L/opt/cray/alps/5.0.1-2.0500.7663.1.1.ari/lib64
>> -L/opt/cray/rca/1.0.0-2.0500.37705.3.12.ari/lib64
>> -L/opt/cray/mpt/5.6.0/gni/mpich2-gnu/47/lib
>> -L/opt/cray/mpt/5.6.0/gni/sma/lib64
>> -L/opt/cray/libsci/12.0.00/gnu/47/sandybridge/lib
>> -L/opt/cray/alps/5.0.1-2.0500.7663.1.1.ari/lib64
>> -L/opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2
>> -L/opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/../../../../lib64
>> -L/lib/../lib64 -L/usr/lib/../lib64
>> -L/opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/../../..
>> /scratch1/scratchdirs/hargrove/cceRJNtp.o -lmpi -lpmi -lalpslli -lalpsutil
>> -lnsl -lutil -lnsl -lutil -lopen-rte -lpmi -lalpslli -lalpsutil -lnsl
>> -lutil -lnsl -lutil -lopen-pal -lpmi -lalpslli -lalpsutil -lnsl -lutil
>> -lnsl -lutil -lrca -L/opt/cray/atp/1.6.0/lib/ --undefined=_ATP_Data_Globals
>> --undefined=__atpHandlerInstall -lAtpSigHCommData -lAtpSigHandler
>> --start-group -lgfortran -lscicpp_gnu -lsci_gnu_mp -lstdc++ -lgfortran
>> -lmpich_gnu_47 -lmpl -lrt -lsma -lxpmem -ldmapp -lugni -lpmi -lalpslli
>> -lalpsutil -lalps -ludreg -lpthread -lm --end-group -lgomp -lpthread
>> --start-group -lgcc -lgcc_eh -lc --end-group
>> /opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/crtend.o
>> /usr/lib/../lib64/crtn.o
>> collect2: error: ld returned 1 exit status
>
>
> Of course the obvious work-around to try is adding "-llustre -llustreapi"
> to my command line. However, that doesn't work because mpicc places my
> "-l" args BEFORE its own "-lmpi". Since "-static" is also among the
> arguments, no symbols are picked up from the lustre libs when they appear
> on the command line before "-lmpi", from which lustre symbols are
> referenced.
>
> Best guess(es):
> EITHER config/ompi_check_lustre.m4 is failing to add "-llustre
> -llustreapi" to some variable
> OR the variable set by config/ompi_check_lustre.m4 isn't making its way
> into the application link command for some reason
>
> Note that this is a --disable-shared/--enable-static build which may
> differ from other systems where LUSTRE support gets used/tested.
>
> -Paul
>
>
> On Fri, Jan 25, 2013 at 12:01 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Thanks Paul
>>
>> I'm currently tracking down a problem on the Cray XE6 - it appears that
>> a recent OS release changed the way alps stores allocation info :-(
>>
>> Will hopefully have it running soon.
>>
>> On Jan 25, 2013, at 10:50 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> I was able to compile with openmpi-1.9a1r27905.tar.bz
>>
>> I'll report again when I've had an opportunity to run something like
>> ring_c.
>>
>> Thanks,
>> -Paul
>>
>>
>> On Tue, Jan 22, 2013 at 6:08 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>> I went ahead and removed the duplicate code, so this should work now.
>>> The problem is that we re-factored the ompi_info/orte-info code, but didn't
>>> complete the job - specifically, the orte-info tool didn't get updated.
>>> It's about to get revamped yet again when the ompi-rte branch gets
>>> committed to the trunk, so I'd rather not do any more with it now.
>>>
>>> Hopefully, this will be the minimum required.
>>>
>>>
>>> On Jan 22, 2013, at 4:20 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>>
>>> I am using the openmpi-1.9a1r27886 tarball and I still see an error for
>>> one of the two duplicate symbols:
>>>
>>> CCLD orte-info
>>> ../../../orte/.libs/libopen-rte.a(orte_info_support.o): In function
>>> `orte_info_show_orte_version':
>>> ../../orte/runtime/orte_info_support.c:(.text+0xe10): multiple
>>> definition of `orte_info_show_orte_version'
>>> version.o:../../../../orte/tools/orte-info/version.c:(.text+0x2370):
>>> first defined here
>>>
>>> -Paul
>>>
>>>
>>> On Fri, Jan 18, 2013 at 3:52 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>
>>>> Luckily for us all the definitions contain the same constant (orte).
>>>> r27864 should fix this.
>>>>
>>>> George.
>>>>
>>>>
>>>> On Jan 18, 2013, at 06:21 , Paul Hargrove <phhargr...@lbl.gov> wrote:
>>>>
>>>> My employer has a nice new Cray XC30 (aka Cascade), and I thought I'd
>>>> give Open MPI a quick test.
>>>>
>>>> Given that it is INTENDED to be API-compatible with the XE series, I
>>>> began configuring with
>>>> CC=cc CXX=CC FC=ftn
>>>> --with-platform=lanl/cray_xe6/optimized-nopanasas
>>>> However, since this is Intel h/w, I commented-out the following 2 lines
>>>> in the platform file:
>>>> with_wrapper_cflags="-march=amdfam10"
>>>> CFLAGS=-march=amdfam10
>>>>
>>>> I am using PrgEnv-gnu/5.0.15, though PrgEnv-intel is the default on our
>>>> system
>>>>
>>>> As far as I know, use of 1.6.x is out - no ugni at all, right?
>>>> So, I didn't even try.
>>>>
>>>> I gave openmpi-1.7rc6 a try, but the ALPS headers and libs have moved
>>>> (as mentioned in ompi-trunk/config/orte_check_alps.m4).
>>>> Perhaps one should CMR the updated-for-CLE-5 configure logic to the 1.7
>>>> branch?
>>>>
>>>> Next, I tried a trunk nightly tarball: openmpi-1.9a1r27862.tar.bz2
>>>> As I mentioned above, the trunk has the right logic for locating ALPS.
>>>> However, it looks like there is some untested code, protected by "#if
>>>> WANT_CRAY_PMI2_EXT", that needs work:
>>>>
>>>> make[2]: Entering directory
>>>> `/global/scratch/sd/hargrove/OMPI/openmpi-1.9a1r27862/BUILD/orte/mca/db/pmi'
>>>> CC db_pmi_component.lo
>>>> CC db_pmi.lo
>>>> ../../../../../orte/mca/db/pmi/db_pmi.c: In function 'store':
>>>> ../../../../../orte/mca/db/pmi/db_pmi.c:202: error: 'ptr' undeclared
>>>> (first use in this function)
>>>> ../../../../../orte/mca/db/pmi/db_pmi.c:202: error: (Each undeclared
>>>> identifier is reported only once
>>>> ../../../../../orte/mca/db/pmi/db_pmi.c:202: error: for each function
>>>> it appears in.)
>>>> make[2]: *** [db_pmi.lo] Error 1
>>>> make[2]: Leaving directory
>>>> `/global/scratch/sd/hargrove/OMPI/openmpi-1.9a1r27862/BUILD/orte/mca/db/pmi'
>>>> make[1]: *** [all-recursive] Error 1
>>>> make[1]: Leaving directory
>>>> `/global/scratch/sd/hargrove/OMPI/openmpi-1.9a1r27862/BUILD/orte'
>>>> make: *** [all-recursive] Error 1
>>>>
>>>> I added the missing "char *ptr" declaration a few lines before its
>>>> first use, and resumed the build.
>>>> This time the build terminated at
>>>>
>>>> make[2]: Entering directory
>>>> `/global/scratch/sd/hargrove/OMPI/openmpi-1.9a1r27862/BUILD/opal/tools/wrappers'
>>>> CC opal_wrapper.o
>>>> CCLD opal_wrapper
>>>> /usr/bin/ld: attempted static link of dynamic object
>>>> `../../../opal/.libs/libopen-pal.so'
>>>> collect2: error: ld returned 1 exit status
>>>>
>>>> So I went back to the platform file and changed
>>>> enable_shared=yes
>>>> to
>>>> enable_shared=no
>>>> No big deal there - I had to make the same change for our XE6.
>>>>
>>>> And so I started back at configure (after a "make distclean", to be
>>>> safe), and here is the next error:
>>>>
>>>> Making all in tools/orte-info
>>>> make[2]: Entering directory
>>>> `/global/scratch/sd/hargrove/OMPI/openmpi-1.9a1r27862/BUILD/orte/tools/orte-info'
>>>> CCLD orte-info
>>>> ../../../orte/.libs/libopen-rte.a(orte_info_support.o): In function
>>>> `orte_info_show_orte_version':
>>>> orte_info_support.c:(.text+0xd70): multiple definition of
>>>> `orte_info_show_orte_version'
>>>> version.o:version.c:(.text+0x4b0): first defined here
>>>> ../../../orte/.libs/libopen-rte.a(orte_info_support.o):(.data+0x0):
>>>> multiple definition of `orte_info_type_orte'
>>>> orte-info.o:(.data+0x10): first defined here
>>>> /usr/bin/ld: link errors found, deleting executable `orte-info'
>>>> collect2: error: ld returned 1 exit status
>>>> make[2]: *** [orte-info] Error 1
>>>>
>>>> I am not sure how to fix this, but I would guess this is probably a
>>>> simple fix for somebody who knows OMPI's build infrastructure better than
>>>> I.
>>>>
>>>> -Paul
>>>>
>>>> --
>>>> Paul H. Hargrove                          phhargr...@lbl.gov
>>>> Future Technologies Group
>>>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
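
Editor's note: the link-order problem described in the thread (user "-l" args placed before "-lmpi" in a "-static" link) can be illustrated with a toy model. The sketch below is not a real linker and is not taken from Open MPI. It assumes each static archive is treated as one unit that is pulled in only if it defines a symbol that is still undefined when the linker reaches it. The function name `unresolved_symbols` and the symbol tables are hypothetical, loosely based on the error messages quoted above.

```python
# Toy model of static-archive resolution, NOT a real linker.
# Assumption: each archive is a single unit that is pulled in only if it
# defines a symbol that is still undefined when it is reached.

def unresolved_symbols(initial_undefined, archives):
    """Scan archives left to right; return the symbols left undefined.

    archives: list of (name, defines, needs) tuples in command-line order.
    """
    undefined = set(initial_undefined)
    defined = set()
    for _name, defines, needs in archives:
        if not undefined & set(defines):
            continue  # nothing here is needed yet, so the archive is skipped
        defined |= set(defines)
        undefined -= defined
        undefined |= set(needs) - defined
    return undefined


# Hypothetical symbol tables, loosely based on the errors quoted in the thread.
LIBMPI = ("mpi", {"MPI_Init"}, {"llapi_file_create", "llapi_file_get_stripe"})
LUSTRE = ("lustre libs", {"llapi_file_create", "llapi_file_get_stripe"}, set())

# The application object itself only needs MPI symbols.
app_needs = {"MPI_Init"}

# User-supplied lustre libs placed BEFORE -lmpi (what mpicc did here).
lustre_first = unresolved_symbols(app_needs, [LUSTRE, LIBMPI])
# Lustre libs placed AFTER -lmpi.
lustre_last = unresolved_symbols(app_needs, [LIBMPI, LUSTRE])

print(sorted(lustre_first))
print(sorted(lustre_last))
```

In this model the first ordering leaves both `llapi_*` symbols unresolved, because the lustre archive is skipped before anything refers to it, while the second ordering resolves everything. That matches the behaviour reported in the thread, under the simplifying assumptions stated above.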