Re: [OMPI devel] Build fails for Git versions (master and v4.0.x)

2019-07-31 Thread Jan Bierbaum via devel
On 31.07.19 23:54, Jeff Squyres (jsquyres) wrote:
> We don't really have any test suites that just test, for example, the
> BTLs.  We usually rely on the usual MPI benchmarks and test suites
> (e.g., the Intel MPI benchmarks have a correctness-checking mode).
I guess I'll also move in this direction. Thanks again for your help!


Regards, Jan
___
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel


Re: [OMPI devel] Build fails for Git versions (master and v4.0.x)

2019-07-31 Thread Jeff Squyres (jsquyres) via devel
On Jul 31, 2019, at 5:44 PM, Jan Bierbaum wrote:
> 
> Thanks a lot. This completely fixed those build problems. I used 'git
> clean -df' (without x) before and could have sworn I also tried a fresh
> clone … well, obviously I hadn't.

Ah ha!  Ok, good.  Because I was seriously stumped there.  Not all components 
(i.e., repo dirs) are present in all branches.  So if you had built one branch, 
and then "git clean -df" and then built another branch, it is quite possible 
that chaos/hilarity ensued...
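
For anyone reading this in the archive later, here is a small sketch of the difference between the two commands. It uses a throwaway repository; the file names and the `*.o` ignore rule are made up for illustration and are not taken from the Open MPI tree. With a reasonably recent Git, the ignored file should survive `git clean -df` and only disappear with `-dfx`.

```shell
#!/usr/bin/env bash
# Sketch: show which files survive 'git clean -df' versus 'git clean -dfx'.
# The repository, file names, and ignore pattern are hypothetical.
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q .

# Pretend build outputs are ignored, as is typical for object files.
printf '*.o\n' > .gitignore
git add .gitignore
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m 'add ignore rules'

mkdir -p stale_component
touch stale_component/stale.o   # ignored build artifact from an earlier build
touch notes.txt                 # plain untracked file

# Without -x: untracked files go away, ignored files are kept.
git clean -df -q
after_df=$(find . -path ./.git -prune -o -type f -print | sort | tr '\n' ' ')
echo "after 'git clean -df':  $after_df"

# With -x: ignored files are removed as well.
git clean -dfx -q
after_dfx=$(find . -path ./.git -prune -o -type f -print | sort | tr '\n' ' ')
echo "after 'git clean -dfx': $after_dfx"

# Remove the throwaway repository again.
cd /
rm -rf "$demo"
```

The leftover `stale_component/` directory after the first clean is the kind of stale state that can confuse a later build of a different branch.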

> Any suggestions for my question about a test suite for (Open)MPI that
> also covers correct communication? It would be great to have some way to
> check my setup “layer by layer”.

Ah, sorry, I missed that question at the bottom of your prior email.

We don't really have any test suites that just test, for example, the BTLs.  We 
usually rely on the usual MPI benchmarks and test suites (e.g., the Intel MPI 
benchmarks have a correctness-checking mode).  We have a pile of internal 
tests/suites that we use, but they're not publicly available because many of 
them are modified versions of public test suites, and we never bothered to 
check into redistribution rights.  :-(
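
One way to approximate a "layer by layer" check with the existing tools is to run the same test program once per transport, restricting Open MPI to a single BTL each time via the `btl` MCA parameter. The following is only a sketch under assumptions: `./my_mpi_check` is a placeholder for whatever correctness test you choose, and the BTL names (`vader`, `tcp`) are simply the ones that show up earlier in this thread. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: run one MPI test program once per BTL, so that a failure can be
# attributed to a specific transport. TEST_PROG is a placeholder, not a
# real test suite.
set -eu

TEST_PROG=${TEST_PROG:-./my_mpi_check}   # hypothetical correctness test
DRY_RUN=${DRY_RUN:-1}                    # 1 = only print the commands

# Build the mpirun command line for one BTL. "self" is always included so
# that a rank can send to itself; "--mca pml ob1" selects the BTL-based
# point-to-point layer.
btl_cmd() {
    printf 'mpirun --mca pml ob1 --mca btl self,%s -np 2 %s\n' \
        "$1" "$TEST_PROG"
}

for btl in vader tcp; do
    cmd=$(btl_cmd "$btl")
    echo "+ $cmd"
    if [ "$DRY_RUN" != 1 ]; then
        # Intentionally unquoted so the string splits into words.
        $cmd || echo "FAILED with btl=$btl"
    fi
done
```

Run it with `DRY_RUN=0 TEST_PROG=./your_test` to actually launch the jobs.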

-- 
Jeff Squyres
jsquy...@cisco.com


Re: [OMPI devel] Build fails for Git versions (master and v4.0.x)

2019-07-31 Thread Jan Bierbaum via devel
On 31.07.19 22:12, Jeff Squyres (jsquyres) wrote:
> Just to make sure you're not dealing with anything left over from an old / 
> stale build:
> 
> cd top-of-source-tree
> git clean -dfx
> ./autogen.pl |& tee auto.out
> ./configure ... |& tee config.out
> make V=1 -j 8 |& tee make.out
Thanks a lot. This completely fixed those build problems. I used 'git
clean -df' (without x) before and could have sworn I also tried a fresh
clone … well, obviously I hadn't.

Any suggestions for my question about a test suite for (Open)MPI that
also covers correct communication? It would be great to have some way to
check my setup “layer by layer”.


Regards, Jan

Re: [OMPI devel] Build fails for Git versions (master and v4.0.x)

2019-07-31 Thread Jeff Squyres (jsquyres) via devel
This is very odd.

I do not see any obvious bad output in any of your logs.

Also, the missing symbols below are only *some* of the OPAL components.  Why 
only those?

Also, why is libopen-pal.so looking for those symbols in the first place?  All 
those symbols are in plugins / components -- they're outside of libopen-pal.so 
and are dynamically opened at run time (via dlopen()).

Just to make sure you're not dealing with anything left over from an old / 
stale build:

cd top-of-source-tree
git clean -dfx
./autogen.pl |& tee auto.out
./configure ... |& tee config.out
make V=1 -j 8 |& tee make.out

The V=1 in there should emit a lot more output; there might be a clue in that 
output...?
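
To answer the "why only those?" question it may help to reduce the log to the distinct component names. This is a minimal sketch, assuming the linker errors look like the ones quoted below and that the log is the `make.out` produced above. `missing_components` is a helper name invented for this example.

```shell
#!/usr/bin/env bash
# Sketch: list the distinct MCA components named in "undefined reference"
# linker errors. Reads a build log on stdin.
set -eu

missing_components() {
    # Keep only the linker error lines, then extract the symbol names
    # and drop duplicates.
    grep 'undefined reference to' \
        | grep -o 'mca_[a-z0-9_]*_component' \
        | sort -u
}

# Sample input: three of the error lines quoted in this thread.
missing_components <<'EOF'
../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_tcp_component'
../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_uct_component'
../../../opal/.libs/libopen-pal.so: undefined reference to `mca_shmem_mmap_component'
EOF
```

On a real log the call would be `missing_components < make.out`, and the resulting list can then be compared against the component directories that actually exist in the checked-out branch.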



> On Jul 31, 2019, at 2:31 PM, Jan Bierbaum via devel wrote:
> 
> Hello!
> 
> After I ran into problems with a self-compiled OpenMPI 4.0.1 and CP2K
> ('make test' fails for the latter and also a couple of input files are
> dysfunctional with the MPI version), I thought it might help to give the
> Git version of OpenMPI a try. However, I can build neither 'v4.0.x'
> (673ddae) nor 'master' (7b7ad5e). Both fail during the linking of
> 'libopen-pal.so'. Is this expected?
> 
> The error for 'master' is ('v4.0.x' shows a different line number in the
> Makefile):
> 
>> make[2]: Entering directory '/dev/shm/Setup/build/openmpi-git/opal/tools/wrappers'
>>  CC   opal_wrapper.o
>>  CCLD opal_wrapper
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_crs_none_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_reachable_netlink_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_pstat_linux_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_shmem_posix_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_tcp_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_patcher_overwrite_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_uct_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_allocator_bucket_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_shmem_sysv_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_pmix_isolated_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_vader_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_shmem_mmap_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_pmix_pmix4x_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_self_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_allocator_basic_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_rcache_grdma_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_mpool_hugepage_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_sm_component'
>> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_reachable_weighted_component'
>> collect2: error: ld returned 1 exit status
>> Makefile:1836: recipe for target 'opal_wrapper' failed
> 
> 
> Software used:
> 
> - automake (GNU automake) 1.15
> - m4 (GNU M4) 1.4.18
> - autoconf (GNU Autoconf) 2.69
> - libtoolize (GNU libtool) 2.4.6
> - flex 2.6.1
> - gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
> - UCT version=1.5.1 revision 7e67a4b
> 
> 
> Build process:
> 
>> $ git clone … ompi
>> $ cd ompi; git checkout $BRANCH
>> $ ./autogen.pl &> auto.log
>> $ ./configure --prefix=$DIR --disable-timing --disable-mpi-cxx 
>> --enable-shared --enable-weak-symbols --enable-binaries --enable-mpi 
>> --enable-mpi-interface-warning --enable-mpi-fortran --enable-c11-atomics 
>> --enable-builtin-atomics --enable-fast-install --enable-mpi1-compatibility 
>> --without-cuda --without-verbs --with-ucx=${PATH_TO_UCX} --disable-debug 
>> --disable-mem-debug &> configure.log
>> $ make -j 8 &> make.log
> 
> 
> I also tried a serial build to avoid potential races in the build
> process but to no avail. The respective log files are attached in
> compressed form and, for your convenience, also available online
> 
> auto.log -> https://pastebin.com/2w5RDNdc
> configure.log -> https://pastebin.com/chWtk4pw
> make.log -> https://pastebin.com/kYWscGYD
> 
> 
> As a side question: Are there any functionality tests for OpenMPI in the
> sense that they check whether communication works properly, i.e. no lost
> messages, message contents unchanged, …?
> 
> 
> Regards, Jan


-- 
Jeff Squyres
jsquy...@cisco.com
