The error has now changed and occurs earlier, at 66% instead of 70%:

make LDFLAGS="-Wl,--copy-dt-needed-entries"
Consolidate compiler generated dependencies of target fmt
[ 12%] Built target fmt
Consolidate compiler generated dependencies of target richdem
[ 37%] Built target richdem
Consolidate compiler generated dependencies of target wtm
[ 62%] Built target wtm
Consolidate compiler generated dependencies of target wtm.x
[ 66%] Linking CXX executable wtm.x
/usr/bin/ld: libwtm.a(transient_groundwater.cpp.o): undefined reference to symbol 'MPI_Abort'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
make[1]: *** [CMakeFiles/Makefile2:225: CMakeFiles/wtm.x.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

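One possible reading of the 'DSO missing from command line' message above: an object inside libwtm.a (transient_groundwater.cpp.o) calls MPI_Abort, but libmpi is only reachable indirectly and is not named on the wtm.x link line itself. A minimal CMake sketch of naming it explicitly follows. It assumes the executable target is called wtm.x (taken from the build log) and that the project's CMakeLists.txt does not already link MPI directly; neither has been checked against the actual file.

```cmake
# Sketch only. Assumes the executable target is named wtm.x (from the build
# log) and that MPI is not yet linked explicitly in CMakeLists.txt.
find_package(MPI REQUIRED COMPONENTS CXX)

# MPI::MPI_CXX is the imported target provided by CMake's FindMPI module;
# linking it places libmpi directly on the wtm.x link line.
target_link_libraries(wtm.x PRIVATE MPI::MPI_CXX)
```

If the existing CMakeLists.txt already calls target_link_libraries(wtm.x ...) without a PRIVATE/PUBLIC keyword, the keyword would have to be dropped here too, since CMake rejects mixing the plain and keyword signatures on one target.
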
So perhaps PETSc is now being found. Any other suggestions?

On Fri, Oct 7, 2022 at 11:18 PM Rob Kudyba <rk3...@columbia.edu> wrote:

>
> Thanks for the quick reply. I added these options to make and make check
>>> still produce the warnings so I used the command like this:
>>> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'" check
>>> Running check examples to verify correct installation
>>> Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
>>> Completed test examples
>>>
>>> Could be useful for the FAQ.
>>>
>> You mentioned you had "OpenMPI 4.1.1 with CUDA aware", so I think a
>> workable mpicc should automatically find cuda libraries. Maybe you
>> unloaded cuda libraries?
>>
> Oh let me clarify, OpenMPI is CUDA aware however this code and the node
> where PETSc is compiling does not have a GPU, hence not needed and using
> the MPIEXEC option worked during the 'check' to suppress the warning.
>
> I'm not trying to use PETSc to compile and linking appears to go awry:
>>> [ 58%] Building CXX object CMakeFiles/wtm.dir/src/update_effective_storativity.cpp.o
>>> [ 62%] Linking CXX static library libwtm.a
>>> [ 62%] Built target wtm
>>> [ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
>>> [ 70%] Linking CXX executable wtm.x
>>> /usr/bin/ld: cannot find -lpetsc
>>> collect2: error: ld returned 1 exit status
>>> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
>>> make[1]: *** [CMakeFiles/Makefile2:269: CMakeFiles/wtm.x.dir/all] Error 2
>>> make: *** [Makefile:136: all] Error 2
>>>
>> It seems cmake could not find petsc. Look
>> at $PETSC_DIR/share/petsc/CMakeLists.txt and try to modify your
>> CMakeLists.txt.
>>
>
> There is an explicit reference to the path in CMakeLists.txt:
> # NOTE: You may need to update this path to identify PETSc's location
> set(ENV{PKG_CONFIG_PATH} "$ENV{PKG_CONFIG_PATH}:/path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/")
> pkg_check_modules(PETSC PETSc>=3.17.1 IMPORTED_TARGET REQUIRED)
> message(STATUS "Found PETSc ${PETSC_VERSION}")
> add_subdirectory(common/richdem EXCLUDE_FROM_ALL)
> add_subdirectory(common/fmt EXCLUDE_FROM_ALL)
>
> And that exists:
> ls /path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/
> petsc.pc PETSc.pc
>
> Is there an environment variable I'm missing? I've seen the suggestion
>> <https://www.mail-archive.com/search?l=petsc-users@mcs.anl.gov&q=subject:%22%5C%5Bpetsc%5C-users%5C%5D+CMake+error+in+PETSc%22&o=newest&f=1>
>> to add it to LD_LIBRARY_PATH which I did with export
>> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib and that
>> points to:
>>
>>> ls -l /path/to/petsc/arch-linux-c-debug/lib
>>> total 83732
>>> lrwxrwxrwx 1 rk3199 user 18 Oct 7 13:56 libpetsc.so -> libpetsc.so.3.18.0
>>> lrwxrwxrwx 1 rk3199 user 18 Oct 7 13:56 libpetsc.so.3.18 -> libpetsc.so.3.18.0
>>> -rwxr-xr-x 1 rk3199 user 85719200 Oct 7 13:56 libpetsc.so.3.18.0
>>> drwxr-xr-x 3 rk3199 user 4096 Oct 6 10:22 petsc
>>> drwxr-xr-x 2 rk3199 user 4096 Oct 6 10:23 pkgconfig
>>>
>>> Anything else to check?
>>>
>> If modifying CMakeLists.txt does not work, you can try export
>> LIBRARY_PATH=$LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib
>> LD_LIBRARY_PATH is for run time, but the error happened at link time,
>>
>
> Yes that's what I already had. Any other debug that I can provide?
>
>> On Fri, Oct 7, 2022 at 1:53 PM Satish Balay <ba...@mcs.anl.gov> wrote:
>>>
>>>> you can try
>>>>
>>>> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'"
>>>>
>>>> Wrt configure - it can be set with --with-mpiexec option - it's saved in
>>>> PETSC_ARCH/lib/petsc/conf/petscvariables
>>>>
>>>> Satish
>>>>
>>>> On Fri, 7 Oct 2022, Rob Kudyba wrote:
>>>>
>>>> > We are on RHEL 8, using modules that we can load/unload various version of
>>>> > packages/libraries, and I have OpenMPI 4.1.1 with CUDA aware loaded along
>>>> > with GDAL 3.3.0, GCC 10.2.0, and cmake 3.22.1
>>>> >
>>>> > make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
>>>> > fails with the below errors,
>>>> > Running check examples to verify correct installation
>>>> >
>>>> > Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
>>>> > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
>>>> > See https://petsc.org/release/faq/
>>>> > --------------------------------------------------------------------------
>>>> > The library attempted to open the following supporting CUDA libraries,
>>>> > but each of them failed. CUDA-aware support is disabled.
>>>> > libcuda.so.1: cannot open shared object file: No such file or directory
>>>> > libcuda.dylib: cannot open shared object file: No such file or directory
>>>> > /usr/lib64/libcuda.so.1: cannot open shared object file: No such file or directory
>>>> > /usr/lib64/libcuda.dylib: cannot open shared object file: No such file or directory
>>>> > If you are not interested in CUDA-aware support, then run with
>>>> > --mca opal_warn_on_missing_libcuda 0 to suppress this message. If you are
>>>> > interested in CUDA-aware support, then try setting LD_LIBRARY_PATH to the
>>>> > location of libcuda.so.1 to get passed this issue.
>>>> > --------------------------------------------------------------------------
>>>> > --------------------------------------------------------------------------
>>>> > WARNING: There was an error initializing an OpenFabrics device.
>>>> >
>>>> > Local host: g117
>>>> > Local device: mlx5_0
>>>> > --------------------------------------------------------------------------
>>>> > lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
>>>> > Number of SNES iterations = 2
>>>> > Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
>>>> > See https://petsc.org/release/faq/
>>>> >
>>>> > The library attempted to open the following supporting CUDA libraries,
>>>> > but each of them failed. CUDA-aware support is disabled.
>>>> > libcuda.so.1: cannot open shared object file: No such file or directory
>>>> > libcuda.dylib: cannot open shared object file: No such file or directory
>>>> > /usr/lib64/libcuda.so.1: cannot open shared object file: No such file or directory
>>>> > /usr/lib64/libcuda.dylib: cannot open shared object file: No such file or directory
>>>> > If you are not interested in CUDA-aware support, then run with
>>>> > --mca opal_warn_on_missing_libcuda 0 to suppress this message. If you are
>>>> > interested in CUDA-aware support, then try setting LD_LIBRARY_PATH to the
>>>> > location of libcuda.so.1 to get passed this issue.
>>>> >
>>>> > WARNING: There was an error initializing an OpenFabrics device.
>>>> >
>>>> > Local host: xxx
>>>> > Local device: mlx5_0
>>>> >
>>>> > lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
>>>> > Number of SNES iterations = 2
>>>> > [g117:4162783] 1 more process has sent help message help-mpi-common-cuda.txt / dlopen failed
>>>> > [g117:4162783] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>>> > [g117:4162783] 1 more process has sent help message help-mpi-btl-openib.txt / error in device init
>>>> > Completed test examples
>>>> > Error while running make check
>>>> > gmake[1]: *** [makefile:149: check] Error 1
>>>> > make: *** [GNUmakefile:17: check] Error 2
>>>> >
>>>> > Where is $MPI_RUN set? I'd like to be able to pass options such as --mca
>>>> > orte_base_help_aggregate 0 --mca opal_warn_on_missing_libcuda 0 -mca pml
>>>> > ucx --mca btl '^openib' which will help me troubleshoot and hide unneeded
>>>> > warnings.
>>>> >
>>>> > Thanks,
>>>> > Rob