Dear Open MPI Gurus,

This is a...confusing one. For some reason, I cannot build a working Open
MPI with NAG 7.0.7062 and clang on my MacBook running macOS 11.6.1. The
thing is, I could do this back in July with NAG 7.0.7048. So my fear is
that something changed with macOS, or clang/Xcode, or something in between.

So here are the symptoms. I usually build with a few extra flags that I've
always carried around, but for now I'm going to go basic. First, I try to
build Open MPI in a basic way:

../configure FCFLAGS="-mismatch_all -fpp" CC=clang CXX=clang++ FC=nagfor \
  --prefix=$HOME/installed/Compiler/nag-7.0_7062/openmpi/4.1.1-basic \
  |& tee configure.log

Note that the FCFLAGS are needed for NAG: it doesn't preprocess .F90 files
by default (hence -fpp), and it can be *very* strict with interfaces,
treating any slight interface difference as an error (hence -mismatch_all).

Now with this configure line, I then build and:

Making all in mpi/fortran/use-mpi-tkr
make[2]: Entering directory '/Users/mathomp4/src/MPI/openmpi-4.1.1/build-basic/ompi/mpi/fortran/use-mpi-tkr'
  FCLD     libmpi_usempi.la
NAG Fortran Compiler Release 7.0(Yurakucho) Build 7062
Option error: Unrecognised option -dynamiclib
make[2]: *** [Makefile:1966: libmpi_usempi.la] Error 2
make[2]: Leaving directory '/Users/mathomp4/src/MPI/openmpi-4.1.1/build-basic/ompi/mpi/fortran/use-mpi-tkr'
make[1]: *** [Makefile:3555: all-recursive] Error 1
make[1]: Leaving directory '/Users/mathomp4/src/MPI/openmpi-4.1.1/build-basic/ompi'
make: *** [Makefile:1901: all-recursive] Error 1

For some reason, the make system is trying to pass a clang option,
-dynamiclib, to nagfor and it fails. With verbose on:

libtool: link: nagfor -dynamiclib -Wl,-Wl,,-undefined
-Wl,-Wl,,dynamic_lookup -o .libs/libmpi_usempi.40.dylib  .libs/mpi.o
.libs/mpi_aint_add_f90.o .libs/mpi_aint_diff_f90.o
.libs/mpi_comm_spawn_multiple_f90.o .libs/mpi_testall_f90.o
.libs/mpi_testsome_f90.o .libs/mpi_waitall_f90.o .libs/mpi_waitsome_f90.o
.libs/mpi_wtick_f90.o .libs/mpi_wtime_f90.o .libs/mpi-tkr-sizeof.o...
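
One way to see where that -dynamiclib comes from might be to look at what
configure recorded for the Fortran (FC) tag in the generated libtool script
in the build directory. This is only a minimal sketch: show_fc_tag is a
made-up helper name, and it assumes the script carries the usual per-tag
"BEGIN/END LIBTOOL TAG CONFIG" sections.

```shell
# show_fc_tag is a hypothetical helper, not part of Open MPI or libtool.
# It prints a few link-related settings from the FC tag section of a
# generated libtool script (default: ./libtool in the current directory).
show_fc_tag() {
    script="${1:-./libtool}"
    # Keep only the FC tag section, then pick out the settings of interest.
    sed -n '/BEGIN LIBTOOL TAG CONFIG: FC/,/END LIBTOOL TAG CONFIG: FC/p' "$script" |
        grep -E '^(wl|pic_flag|archive_cmds|allow_undefined_flag)='
}

# Usage (from the build directory):
#   show_fc_tag ./libtool
```

If archive_cmds for the FC tag contains -dynamiclib, then libtool is treating
nagfor like a generic Darwin compiler rather than using NAG-specific flags.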

As a test, I tried the same thing with NAG 7.0.7048 (which worked in July)
and I get the same issue:

Option error: Unrecognised option -dynamiclib

Note that Intel Fortran and gfortran *do* support this flag, whereas NAG
only has something like:

       -Bbinding  Specify static or dynamic binding. This only has effect
                  if specified during the link phase. The default is
                  dynamic binding.

but maybe the Open MPI build system doesn't know about NAG?

So I say to myself, okay, -dynamiclib sounds like a shared-library thing, so
let's try a static library build! So, following the documentation, I try:

../configure --enable-static --disable-shared FCFLAGS="-mismatch_all -fpp" \
  CC=gcc CXX=g++ FC=nagfor \
  --prefix=$HOME/installed/Compiler/nag-7.0_7062/openmpi/4.1.1-static \
  |& tee configure.log

and it builds! Yay! And then I try to build helloworld.c and it fails! To
wit:
❯ cat helloworld.c
/*The Parallel Hello World Program*/
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
   int node;

   MPI_Init(&argc,&argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &node);

   printf("Hello World from Node %d\n",node);

   MPI_Finalize();
}
❯ /Users/mathomp4/installed/Compiler/nag-7.0_7062/openmpi/4.1.1-static/bin/mpicc helloworld.c
Undefined symbols for architecture x86_64:
  "_MPIR_Breakpoint", referenced from:
      _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
  "_MPIR_attach_fifo", referenced from:
      _orte_submit_finalize in libopen-rte.a(orted_submit.o)
      _orte_submit_job in libopen-rte.a(orted_submit.o)
      _open_fifo in libopen-rte.a(orted_submit.o)
  "_MPIR_being_debugged", referenced from:
      _ompi_rte_wait_for_debugger in libmpi.a(rte_orte_module.o)
      _orte_submit_job in libopen-rte.a(orted_submit.o)
      _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
      _attach_debugger in libopen-rte.a(orted_submit.o)
  "_MPIR_debug_state", referenced from:
      _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
  "_MPIR_executable_path", referenced from:
      _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
      _setup_debugger_job in libopen-rte.a(orted_submit.o)
      _run_debugger in libopen-rte.a(orted_submit.o)
      _attach_debugger in libopen-rte.a(orted_submit.o)
  "_MPIR_forward_output", referenced from:
      _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
      _setup_debugger_job in libopen-rte.a(orted_submit.o)
  "_MPIR_i_am_starter", referenced from:
      _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
  "_MPIR_partial_attach_ok", referenced from:
      _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
  "_MPIR_proctable", referenced from:
      _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
  "_MPIR_proctable_size", referenced from:
      _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
  "_MPIR_server_arguments", referenced from:
      _orte_debugger_init_after_spawn in libopen-rte.a(orted_submit.o)
      _setup_debugger_job in libopen-rte.a(orted_submit.o)
      _run_debugger in libopen-rte.a(orted_submit.o)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

So...yeah. ¯\_(ツ)_/¯ Maybe this needs -Bstatic??
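
A quick check that might narrow this down is whether those MPIR symbols are
defined anywhere in the installed static archives at all. A minimal sketch,
assuming nm is available: defined_symbols is a made-up filter name, and it
just reads "nm -g" output on stdin and keeps matching symbols whose type is
not U (undefined).

```shell
# defined_symbols is a hypothetical filter, not an existing tool.
# It reads `nm -g` output on stdin and prints "TYPE NAME" for every symbol
# whose name matches the given regex and whose type letter is not U.
defined_symbols() {
    awk -v pat="$1" 'NF >= 2 && $NF ~ pat && $(NF-1) != "U" { print $(NF-1), $NF }'
}

# Usage (the lib directory is the one under the --prefix used above):
#   nm -g "$prefix/lib/libopen-rte.a" | defined_symbols '^_MPIR_'
#   nm -g "$prefix/lib/libmpi.a"      | defined_symbols '^_MPIR_'
```

No output for every archive would mean nothing in the static build defines
those symbols. Any lines that do appear show which type (T, D, C, ...) nm
reports for them.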

But again, all this worked with shared a few months ago (I've never tried
static until now) and NAG has *never* supported -dynamiclib as far as I
know.

I do see references to -Bstatic and -Bdynamic in the source code, but
apparently I'm not triggering the configure step to use them?
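
To see which parts of the generated configure script mention NAG or those
binding flags, a plain grep is probably enough. This is a sketch only:
nag_hints is a made-up helper name, and what it prints depends entirely on
the configure script in the source tree.

```shell
# nag_hints is a hypothetical helper, not part of Open MPI.
# It lists (with line numbers) the lines of a file that mention nagfor,
# -Bstatic, or -Bdynamic.
nag_hints() {
    grep -n -E -e 'nagfor|-Bstatic|-Bdynamic' "$1"
}

# Usage (from the build directory):
#   nag_hints ../configure
```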

Anyone else out there encounter this?

NOTE: I did try doing an Intel Fortran + Clang shared build today and that
seemed to work. I think that's because Intel Fortran recognizes -dynamiclib
so it can get past that FCLD step.
-- 
Matt Thompson
   “The fact is, this is about us identifying what we do best and
   finding more ways of doing less of it better” -- Director of Better Anna
Rampton
