Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-10 Thread Junchao Zhang
On Mon, Oct 10, 2022 at 8:13 AM Rob Kudyba wrote:

> OK, let's walk back and not use -DCMAKE_C_COMPILER=/path/to/mpicc
>>
> Will do
>
>
>> libompitrace.so.40.30.0 is not the OpenMP library; it is the tracing
>> library for OpenMPI, https://github.com/open-mpi/ompi/issues/10036
>>
> Does that mean I should remove this option in the cmake command?
>
>
>> In your previous email, there was
>>
>> /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_link_script
>> CMakeFiles/wtm.x.dir/link.txt --verbose=1
>> /cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra
>> -pedantic -Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x
>>  -Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/
>> gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_
>> support/lib:/path/to/petsc/arch-linux-cxx-debug/lib libwtm.a
>> common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so
>> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0
>> common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so
>> /usr/bin/ld: CMakeFiles/wtm.x.dir/src/WTM.cpp.o: undefined reference to
>> symbol 'ompi_mpi_comm_self'
>> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40: error
>> adding symbols: DSO missing from command line
>>
>>
>> Let's try to add -lmpi (or /path/to/openmpi-4.1.1_ucx_
>> cuda_11.0.3_support/lib/libmpi.so) manually to see if it links
>>
>> /cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra
>> -pedantic -Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x
>>  -Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/
>> gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_
>> support/lib:/path/to/petsc/arch-linux-cxx-debug/lib libwtm.a
>> common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so
>> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0
>> common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so
>> -lmpi
>>
>
> So just adding that to the make command? Still seeing linking errors:
>
>  make VERBOSE=1 LDFLAGS="-Wl,--copy-dt-needed-entries" -lmpi
>
 make VERBOSE=1 LDFLAGS="-Wl,--copy-dt-needed-entries -lmpi"
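
The two commands differ only in where the closing quote sits. As a minimal sketch (count_args is a hypothetical helper, not part of make or CMake), the shell's word splitting can be checked directly. With -lmpi outside the quotes it becomes a separate argument to make, which GNU make parses as its own -l (load average) option, not as part of LDFLAGS.

```shell
# Hypothetical helper: print how many arguments the shell passes along.
count_args() { echo "$#"; }

# -lmpi outside the quotes: three separate arguments reach make.
count_args VERBOSE=1 LDFLAGS="-Wl,--copy-dt-needed-entries" -lmpi

# -lmpi inside the quotes: two arguments, and LDFLAGS carries both flags.
count_args VERBOSE=1 LDFLAGS="-Wl,--copy-dt-needed-entries -lmpi"
```

Even with correct quoting, a CMake-generated Makefile may ignore LDFLAGS given on the make command line, because CMake normally fixes the link flags at configure time (for example through CMAKE_EXE_LINKER_FLAGS).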

> /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -S/path/to/WTM
> -B/path/to/WTM/build --check-build-system CMakeFiles/Makefile.cmake 0
> /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_progress_start
> /path/to/WTM/build/CMakeFiles /path/to/WTM/build//CMakeFiles/progress.marks
> make  -f CMakeFiles/Makefile2 all
> make[1]: Entering directory '/path/to/WTM/build'
> make  -f common/fmt/CMakeFiles/fmt.dir/build.make
> common/fmt/CMakeFiles/fmt.dir/depend
> make[2]: Entering directory '/path/to/WTM/build'
> cd /path/to/WTM/build &&
> /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_depends "Unix
> Makefiles" /path/to/WTM /path/to/WTM/common/fmt /path/to/WTM/build
> /path/to/WTM/build/common/fmt
> /path/to/WTM/build/common/fmt/CMakeFiles/fmt.dir/DependInfo.cmake --color=
> make[2]: Leaving directory '/path/to/WTM/build'
> make  -f common/fmt/CMakeFiles/fmt.dir/build.make
> common/fmt/CMakeFiles/fmt.dir/build
> make[2]: Entering directory '/path/to/WTM/build'
> [  4%] Building CXX object common/fmt/CMakeFiles/fmt.dir/src/format.cc.o
> cd /path/to/WTM/build/common/fmt && /cm/local/apps/gcc/10.2.0/bin/c++
>  -I/path/to/WTM/common/fmt/include -isystem -std=gnu++11 -MD -MT
> common/fmt/CMakeFiles/fmt.dir/src/format.cc.o -MF
> CMakeFiles/fmt.dir/src/format.cc.o.d -o CMakeFiles/fmt.dir/src/format.cc.o
> -c /path/to/WTM/common/fmt/src/format.cc
> [  8%] Building CXX object common/fmt/CMakeFiles/fmt.dir/src/os.cc.o
> cd /path/to/WTM/build/common/fmt && /cm/local/apps/gcc/10.2.0/bin/c++
>  -I/path/to/WTM/common/fmt/include -isystem -std=gnu++11 -MD -MT
> common/fmt/CMakeFiles/fmt.dir/src/os.cc.o -MF
> CMakeFiles/fmt.dir/src/os.cc.o.d -o CMakeFiles/fmt.dir/src/os.cc.o -c
> /path/to/WTM/common/fmt/src/os.cc
> [ 12%] Linking CXX static library libfmt.a
> cd /path/to/WTM/build/common/fmt &&
> /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -P
> CMakeFiles/fmt.dir/cmake_clean_target.cmake
> cd /path/to/WTM/build/common/fmt &&
> /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_link_script
> CMakeFiles/fmt.dir/link.txt --verbose=1
> /usr/bin/ar qc libfmt.a CMakeFiles/fmt.dir/src/format.cc.o
> CMakeFiles/fmt.dir/src/os.cc.o
> /usr/bin/ranlib libfmt.a
> make[2]: Leaving directory '/path/to/WTM/build'
> [ 12%] Built target fmt
> make  -f common/richdem/CMakeFiles/richdem.dir/build.make
> common/richdem/CMakeFiles/richdem.dir/depend
> make[2]: Entering directory '/path/to/WTM/build'
> cd /path/to/WTM/build &&
> /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_depends "Unix
> Makefiles" /path/to/WTM /path/to/WTM/common/richdem /path/to/WTM/build
> /path/to/WTM/build/common/richdem
> /path/to/WTM/build/common/richdem/CMakeFiles/richdem.dir/DependInfo.cmake
> --color=
> make[2]: Leaving directory '/path/to/WTM/build'
> make  -f 

Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-10 Thread Rob Kudyba
OK, so I missed the OpenMP vs. OpenMPI distinction and incorrectly set
-DOpenMP_libomp_LIBRARY="/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib//libompitrace.so.40.30.0"
So I changed it to point to /cm/local/apps/gcc/10.2.0/lib/libgomp.so.1.0.0

-- Found PETSc 3.18.0
CMake Error at
/path/to/cmake/cmake-3.22.1-linux-x86_64/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230
(message):
  Could NOT find OpenMP_CXX (missing: OpenMP_libomp_LIBRARY
  OpenMP_libomp_LIBRARY) (found version "4.5")
Call Stack (most recent call first):

/path/to/cmake/cmake-3.22.1-linux-x86_64/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:594
(_FPHSA_FAILURE_MESSAGE)

/path/to/cmake/cmake-3.22.1-linux-x86_64/share/cmake-3.22/Modules/FindOpenMP.cmake:544
(find_package_handle_standard_args)
  common/richdem/CMakeLists.txt:12 (find_package)
Perhaps I need to reach out to the richdem maintainer?

On Mon, Oct 10, 2022 at 9:12 AM Rob Kudyba wrote:

> [...]

Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-10 Thread Rob Kudyba
>
> OK, let's walk back and not use -DCMAKE_C_COMPILER=/path/to/mpicc
>
Will do


> libompitrace.so.40.30.0 is not the OpenMP library; it is the tracing
> library for OpenMPI, https://github.com/open-mpi/ompi/issues/10036
>
Does that mean I should remove this option in the cmake command?


> In your previous email, there was
>
> /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_link_script
> CMakeFiles/wtm.x.dir/link.txt --verbose=1
> /cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra -pedantic
> -Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x
>  -Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/
> gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_
> support/lib:/path/to/petsc/arch-linux-cxx-debug/lib libwtm.a
> common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0
> common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so
> /usr/bin/ld: CMakeFiles/wtm.x.dir/src/WTM.cpp.o: undefined reference to
> symbol 'ompi_mpi_comm_self'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40: error
> adding symbols: DSO missing from command line
>
>
> Let's try to add -lmpi (or /path/to/openmpi-4.1.1_ucx_
> cuda_11.0.3_support/lib/libmpi.so) manually to see if it links
>
> /cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra -pedantic
> -Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x
>  -Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/
> gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_
> support/lib:/path/to/petsc/arch-linux-cxx-debug/lib libwtm.a
> common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0
> common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so
> -lmpi
>

So just adding that to the make command? Still seeing linking errors:

 make VERBOSE=1 LDFLAGS="-Wl,--copy-dt-needed-entries" -lmpi
/path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -S/path/to/WTM
-B/path/to/WTM/build --check-build-system CMakeFiles/Makefile.cmake 0
/path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_progress_start
/path/to/WTM/build/CMakeFiles /path/to/WTM/build//CMakeFiles/progress.marks
make  -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/path/to/WTM/build'
make  -f common/fmt/CMakeFiles/fmt.dir/build.make
common/fmt/CMakeFiles/fmt.dir/depend
make[2]: Entering directory '/path/to/WTM/build'
cd /path/to/WTM/build && /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake
-E cmake_depends "Unix Makefiles" /path/to/WTM /path/to/WTM/common/fmt
/path/to/WTM/build /path/to/WTM/build/common/fmt
/path/to/WTM/build/common/fmt/CMakeFiles/fmt.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/path/to/WTM/build'
make  -f common/fmt/CMakeFiles/fmt.dir/build.make
common/fmt/CMakeFiles/fmt.dir/build
make[2]: Entering directory '/path/to/WTM/build'
[  4%] Building CXX object common/fmt/CMakeFiles/fmt.dir/src/format.cc.o
cd /path/to/WTM/build/common/fmt && /cm/local/apps/gcc/10.2.0/bin/c++
 -I/path/to/WTM/common/fmt/include -isystem -std=gnu++11 -MD -MT
common/fmt/CMakeFiles/fmt.dir/src/format.cc.o -MF
CMakeFiles/fmt.dir/src/format.cc.o.d -o CMakeFiles/fmt.dir/src/format.cc.o
-c /path/to/WTM/common/fmt/src/format.cc
[  8%] Building CXX object common/fmt/CMakeFiles/fmt.dir/src/os.cc.o
cd /path/to/WTM/build/common/fmt && /cm/local/apps/gcc/10.2.0/bin/c++
 -I/path/to/WTM/common/fmt/include -isystem -std=gnu++11 -MD -MT
common/fmt/CMakeFiles/fmt.dir/src/os.cc.o -MF
CMakeFiles/fmt.dir/src/os.cc.o.d -o CMakeFiles/fmt.dir/src/os.cc.o -c
/path/to/WTM/common/fmt/src/os.cc
[ 12%] Linking CXX static library libfmt.a
cd /path/to/WTM/build/common/fmt &&
/path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -P
CMakeFiles/fmt.dir/cmake_clean_target.cmake
cd /path/to/WTM/build/common/fmt &&
/path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_link_script
CMakeFiles/fmt.dir/link.txt --verbose=1
/usr/bin/ar qc libfmt.a CMakeFiles/fmt.dir/src/format.cc.o
CMakeFiles/fmt.dir/src/os.cc.o
/usr/bin/ranlib libfmt.a
make[2]: Leaving directory '/path/to/WTM/build'
[ 12%] Built target fmt
make  -f common/richdem/CMakeFiles/richdem.dir/build.make
common/richdem/CMakeFiles/richdem.dir/depend
make[2]: Entering directory '/path/to/WTM/build'
cd /path/to/WTM/build && /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake
-E cmake_depends "Unix Makefiles" /path/to/WTM /path/to/WTM/common/richdem
/path/to/WTM/build /path/to/WTM/build/common/richdem
/path/to/WTM/build/common/richdem/CMakeFiles/richdem.dir/DependInfo.cmake
--color=
make[2]: Leaving directory '/path/to/WTM/build'
make  -f common/richdem/CMakeFiles/richdem.dir/build.make
common/richdem/CMakeFiles/richdem.dir/build
make[2]: Entering directory '/path/to/WTM/build'
[ 16%] Building CXX object
common/richdem/CMakeFiles/richdem.dir/src/richdem.cpp.o
cd /path/to/WTM/build/common/richdem && /cm/local/apps/gcc/10.2.0/bin/c++

Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-09 Thread Junchao Zhang
OK, let's walk back and not use -DCMAKE_C_COMPILER=/path/to/mpicc

libompitrace.so.40.30.0 is not the OpenMP library; it is the tracing
library for OpenMPI, https://github.com/open-mpi/ompi/issues/10036
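
One way to check which library actually provides the OpenMP runtime symbols is to list a shared library's exported symbols with nm. The following is a minimal sketch, assuming binutils nm is installed. has_symbol and check_openmp_lib are hypothetical helpers, and the library paths are the ones quoted in this thread.

```shell
# Hypothetical helper: succeed if stdin (nm output) lists the given symbol.
# Versioned names such as "sym@@VERSION" are matched as well.
has_symbol() {
    awk -v sym="$1" '
        $NF == sym || index($NF, sym "@") == 1 { found = 1 }
        END { exit !found }
    '
}

# Hypothetical helper: report whether a shared library defines omp_get_thread_num.
check_openmp_lib() {
    if nm -D --defined-only "$1" 2>/dev/null | has_symbol omp_get_thread_num; then
        echo "$1: provides omp_get_thread_num"
    else
        echo "$1: does not provide omp_get_thread_num (or cannot be read)"
    fi
}

# Paths as quoted in this thread; adjust to the local installation.
check_openmp_lib /cm/local/apps/gcc/10.2.0/lib/libgomp.so.1.0.0
check_openmp_lib /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0
```

A library that does not define omp_get_thread_num or the GOMP_* symbols cannot resolve the undefined references reported for random.cpp.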

In your previous email, there was

/path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake -E cmake_link_script
CMakeFiles/wtm.x.dir/link.txt --verbose=1
/cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra -pedantic
-Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x
 -Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/
gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_
support/lib:/path/to/petsc/arch-linux-cxx-debug/lib libwtm.a
common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0
common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so
/usr/bin/ld: CMakeFiles/wtm.x.dir/src/WTM.cpp.o: undefined reference to
symbol 'ompi_mpi_comm_self'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40: error
adding symbols: DSO missing from command line


Let's try to add -lmpi (or /path/to/openmpi-4.1.1_ucx_
cuda_11.0.3_support/lib/libmpi.so) manually to see if it links

/cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra -pedantic
-Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x
 -Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/
gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_
support/lib:/path/to/petsc/arch-linux-cxx-debug/lib libwtm.a
common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0
common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so
-lmpi


On Sun, Oct 9, 2022 at 9:28 PM Rob Kudyba wrote:

> [...]

Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-09 Thread Rob Kudyba
I did have -DMPI_CXX_COMPILER set, so I added -DCMAKE_C_COMPILER and now
get these errors:

[ 25%] Linking CXX shared library librichdem.so
/lib/../lib64/crt1.o: In function `_start':
(.text+0x24): undefined reference to `main'
CMakeFiles/richdem.dir/src/random.cpp.o: In function
`richdem::rand_engine()':
random.cpp:(.text+0x45): undefined reference to `omp_get_thread_num'
CMakeFiles/richdem.dir/src/random.cpp.o: In function
`richdem::seed_rand(unsigned long)':
random.cpp:(.text+0xb6): undefined reference to `GOMP_parallel'
CMakeFiles/richdem.dir/src/random.cpp.o: In function
`richdem::uniform_rand_int(int, int)':
random.cpp:(.text+0x10c): undefined reference to `omp_get_thread_num'
CMakeFiles/richdem.dir/src/random.cpp.o: In function
`richdem::uniform_rand_real(double, double)':
random.cpp:(.text+0x1cb): undefined reference to `omp_get_thread_num'
CMakeFiles/richdem.dir/src/random.cpp.o: In function
`richdem::normal_rand(double, double)':
random.cpp:(.text+0x29e): undefined reference to `omp_get_thread_num'
CMakeFiles/richdem.dir/src/random.cpp.o: In function
`richdem::seed_rand(unsigned long) [clone ._omp_fn.0]':
random.cpp:(.text+0x4a3): undefined reference to `GOMP_critical_start'
random.cpp:(.text+0x4b1): undefined reference to `GOMP_critical_end'
random.cpp:(.text+0x4c3): undefined reference to `omp_get_thread_num'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Comm_rank'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Get_address'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Comm_get_name'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Add_error_string'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Type_get_name'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Abort'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Alloc_mem'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Isend'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Barrier'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Allgather'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Reduce'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Send'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Init'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Type_size'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Accumulate'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Add_error_class'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Finalize'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Allgatherv'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Bcast'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Recv'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Request_free'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Allreduce'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `ompi_mpi_comm_world'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Sendrecv'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Add_error_code'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0:
undefined reference to `PMPI_Win_get_name'
collect2: error: ld returned 1 exit status
make[2]: *** [common/richdem/CMakeFiles/richdem.dir/build.make:163:
common/richdem/librichdem.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:306:
common/richdem/CMakeFiles/richdem.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

I took a guess at using
-DOpenMP_libomp_LIBRARY="/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0"
as otherwise I'd get:
CMake Error at
/path/to/cmake/cmake-3.22.1-linux-x86_64/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230
(message):
  Could NOT find OpenMP_CXX (missing: OpenMP_libomp_LIBRARY
  

Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-09 Thread Junchao Zhang
In the last link step, which generates the executable:
/cm/local/apps/gcc/10.2.0/bin/c++ -isystem -O3 -g -Wall -Wextra -pedantic
-Wshadow CMakeFiles/wtm.x.dir/src/WTM.cpp.o -o wtm.x
 -Wl,-rpath,/path/to/WTM/build/common/richdem:/path/to/
gdal-3.3.0/lib:/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_
support/lib:/path/to/petsc/arch-linux-cxx-debug/lib libwtm.a
common/richdem/librichdem.so /path/to/gdal-3.3.0/lib/libgdal.so
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libompitrace.so.40.30.0
common/fmt/libfmt.a /path/to/petsc/arch-linux-cxx-debug/lib/libpetsc.so

I did not find -lmpi, which would link in the MPI library. You can try using
cmake -DCMAKE_C_COMPILER=/path/to/mpicc -DCMAKE_CXX_COMPILER=/path/to/mpicxx
to build your code.
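
To see what the MPI wrappers would add to that link line, Open MPI's wrapper compilers accept --showme:link. The following is a minimal sketch, assuming the wrapper is on PATH. show_link_flags is a hypothetical helper.

```shell
# Hypothetical helper: print the link flags an Open MPI wrapper compiler adds.
show_link_flags() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "wrapper not found: $1" >&2
        return 1
    fi
    # --showme:link is an Open MPI wrapper option; other MPI
    # implementations use different flags.
    "$1" --showme:link
}

# With a working Open MPI wrapper, the output should include -lmpi.
show_link_flags mpicxx || echo "load the Open MPI module, then retry"
```

If the output lists -lmpi while the final link command does not, the build is probably invoking the plain compiler and not the wrapper.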

On Sat, Oct 8, 2022 at 9:32 PM Rob Kudyba wrote:

> [...]
>
> On Sat, Oct 8, 2022 at 7:56 PM Barry Smith wrote:
>
>>
>>   True, but when users send reports back to us they will never have used
>> the VERBOSE=1 option, so it requires one more round trip of email to get
>> this additional information.
>>
>> > On Oct 8, 2022, at 6:48 PM, Jed Brown  wrote:
>> >
>> > Barry 

Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-08 Thread Rob Kudyba
>
> Perhaps we can back up one step:
> Use your mpicc to build a "hello world" mpi test, then run it on a compute
> node (with GPU) to see if it works.
> If no, then your MPI environment has problems;
> If yes, then use it to build petsc (turn on petsc's gpu support,
>  --with-cuda  --with-cudac=nvcc), and then your code.
> --Junchao Zhang
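
The "hello world" check suggested above could look like the following minimal sketch. The file name hello_mpi.c and the scratch directory are hypothetical, and the build step only runs when mpicc and mpiexec are found on PATH.

```shell
# Write a minimal MPI program to a scratch directory (hypothetical file name).
workdir=$(mktemp -d)
cat > "$workdir/hello_mpi.c" <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Build and run only if the MPI wrapper and launcher are on PATH.
if command -v mpicc >/dev/null 2>&1 && command -v mpiexec >/dev/null 2>&1; then
    mpicc "$workdir/hello_mpi.c" -o "$workdir/hello_mpi" &&
        mpiexec -n 2 "$workdir/hello_mpi" ||
        echo "build or run failed; check the MPI environment" >&2
else
    echo "mpicc or mpiexec not found; load the MPI module first" >&2
fi
```

On a cluster, the mpiexec line would normally be run on a compute node with a GPU, for example inside a batch job, and not on the login node.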

OK, I tried this just to rule out the CUDA-capable OpenMPI as a factor:
./configure --with-debugging=0 --with-cmake=true   --with-mpi=true
 --with-mpi-dir=/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support --with-fc=0
  --with-cuda=1
[..]
cuda:
  Version:11.7
  Includes:   -I/path/to/cuda11.7/toolkit/11.7.1/include
  Libraries:  -Wl,-rpath,/path/to/cuda11.7/toolkit/11.7.1/lib64
-L/cm/shared/apps/cuda11.7/toolkit/11.7.1/lib64
-L/path/to/cuda11.7/toolkit/11.7.1/lib64/stubs -lcudart -lnvToolsExt
-lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda
  CUDA SM 75
  CUDA underlying compiler:
CUDA_CXX="/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/bin"/mpicxx
  CUDA underlying compiler flags: CUDA_CXXFLAGS=
  CUDA underlying linker libraries: CUDA_CXXLIBS=
[...]
 Configure stage complete. Now build PETSc libraries with:
   make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-opt all

C++ compiler version: g++ (GCC) 10.2.0
Using C++ compiler to compile PETSc
-
Using C/C++ linker:
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/bin/mpicxx
Using C/C++ flags: -Wall -Wwrite-strings -Wno-strict-aliasing
-Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector
-fvisibility=hidden -g -O0
-
Using system modules:
shared:slurm/20.02.6:DefaultModules:openmpi/gcc/64/4.1.1_cuda_11.0.3_aware:gdal/3.3.0:cmake/3.22.1:cuda11.7/toolkit/11.7.1:openblas/dynamic/0.3.7:gcc/10.2.0
Using mpi.h: # 1
"/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/include/mpi.h" 1
-
Using libraries: -Wl,-rpath,/path/to/petsc/arch-linux-cxx-debug/lib
-L/path/to/petsc/arch-linux-cxx-debug/lib -lpetsc -lopenblas -lm -lX11
-lquadmath -lstdc++ -ldl
--
Using mpiexec: mpiexec -mca orte_base_help_aggregate 0  -mca pml ucx --mca
btl '^openib'
--
Using MAKE: /path/to/petsc/arch-linux-cxx-debug/bin/make
Using MAKEFLAGS: -j24 -l48.0  --no-print-directory -- MPIEXEC=mpiexec\
-mca\ orte_base_help_aggregate\ 0\ \ -mca\ pml\ ucx\ --mca\ btl\ '^openib'
PETSC_ARCH=arch-linux-cxx-debug PETSC_DIR=/path/to/petsc
==
make[3]: Nothing to be done for 'libs'.
=
Now to check if the libraries are working do:
make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-cxx-debug check
=
[me@xxx petsc]$ make PETSC_DIR=/path/to/petsc
PETSC_ARCH=arch-linux-cxx-debug MPIEXEC="mpiexec -mca
orte_base_help_aggregate 0  -mca pml ucx --mca btl '^openib'" check
Running check examples to verify correct installation
Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-cxx-debug
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes

./bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: Quadro RTX 8000
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes) Bandwidth(GB/s)
   3200 12.3

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes) Bandwidth(GB/s)
   3200 13.2

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes) Bandwidth(GB/s)
   3200 466.2

Result = PASS

On Sat, Oct 8, 2022 at 7:56 PM Barry Smith  wrote:

>
>   True, but when users send reports back to us they will never have used
> the VERBOSE=1 option, so it requires one more round trip of email to get
> this additional information.
>
> > On Oct 8, 2022, at 6:48 PM, Jed Brown  wrote:
> >
> > Barry Smith  writes:
> >
> >>   I hate these kinds of make rules that hide what the compiler is doing
> (in the name of having less output, I guess) it makes it difficult to
> figure out what is going wrong.
> >
> > You can make VERBOSE=1 with CMake-generated makefiles.
>


> Anyways, either some of the MPI libraries are missing from the link line
> or they are in the wrong order and thus it is not able to search them
> properly. Here is a bunch of discussions on why that error message can
> appear
> https://stackoverflow.com/questions/19901934/libpthread-so-0-error-adding-symbols-dso-missing-from-command-line
>


Still the same error, just with more output. I have been using the suggested
LDFLAGS="-Wl,--copy-dt-needed-entries" along with make:
make[2]: Entering directory '/path/to/WTM/build'
cd /path/to/WTM/build && /path/to/cmake/cmake-3.22.1-linux-x86_64/bin/cmake
-E cmake_depends "Unix Makefiles" /path/to/WTM /path/to/WTM
/path/to/WTM/build 

Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-08 Thread Barry Smith


  True, but when users send reports back to us they will never have used the 
VERBOSE=1 option, so it requires one more round trip of email to get this 
additional information. 

> On Oct 8, 2022, at 6:48 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
> 
>>   I hate these kinds of make rules that hide what the compiler is doing (in 
>> the name of having less output, I guess) it makes it difficult to figure out 
>> what is going wrong.
> 
> You can make VERBOSE=1 with CMake-generated makefiles.



Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-08 Thread Jed Brown
Barry Smith  writes:

>I hate these kinds of make rules that hide what the compiler is doing (in 
> the name of having less output, I guess) it makes it difficult to figure out 
> what is going wrong.

You can make VERBOSE=1 with CMake-generated makefiles.
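
A minimal sketch of how the full compile and link command lines can be shown for a CMake "Unix Makefiles" build. The helper name `verbose_build` and the default `build` directory are illustrative assumptions, not something taken from this thread.

```shell
# Hypothetical helper: rebuild a CMake-generated Makefile tree with the
# full compile/link command lines echoed.
verbose_build() {
    build_dir="${1:-build}"   # assumed build directory name
    if [ ! -f "$build_dir/Makefile" ]; then
        echo "no Makefile in $build_dir (run cmake first)" >&2
        return 1
    fi
    # VERBOSE=1 makes the generated Makefile print each command it runs.
    make -C "$build_dir" VERBOSE=1
}

# Generator-independent alternative (CMake 3.14 or newer):
#   cmake --build build --verbose
# Usage:
#   verbose_build /path/to/WTM/build 2>&1 | tee build.log
```

Capturing the output in a log file makes it easier to attach the exact failing link line to a report.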


Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-08 Thread Junchao Zhang
Perhaps we can step back:
Use your mpicc to build a "hello world" mpi test, then run it on a compute
node (with GPU) to see if it works.
If no, then your MPI environment has problems;
If yes, then use it to build petsc (turn on petsc's gpu support,
--with-cuda  --with-cudac=nvcc), and then your code.
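
The "hello world" check suggested above could look roughly like the sketch below. The file name, the helper names and the default of 2 ranks are assumptions for illustration; the program only calls standard MPI routines.

```shell
# Hypothetical helper: write a tiny MPI test program to the given file.
write_mpi_hello() {
    cat > "$1" <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
}

# Hypothetical helper: build the test with mpicc and run it with mpiexec.
run_mpi_hello() {
    src="${1:-mpi_hello.c}"
    exe="${src%.c}"
    # Make sure a relative executable name is not looked up in PATH.
    case "$exe" in
        /*) ;;
        *) exe="./$exe" ;;
    esac
    write_mpi_hello "$src" || return 1
    if ! command -v mpicc >/dev/null 2>&1; then
        echo "mpicc not found; load the MPI module first" >&2
        return 1
    fi
    # The wrapper compiler adds the MPI include and link flags itself.
    mpicc -o "$exe" "$src" || return 1
    # Run this on a compute node, e.g. inside a batch allocation.
    mpiexec -n "${2:-2}" "$exe"
}

# Usage: run_mpi_hello mpi_hello.c 2
```

If this builds and every rank prints its line, the MPI environment itself is probably fine and the problem lies in how the application is linked.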

--Junchao Zhang


On Fri, Oct 7, 2022 at 10:45 PM Rob Kudyba  wrote:

> The error changes now and at an earlier place, 66% vs 70%:
> make LDFLAGS="-Wl,--copy-dt-needed-entries"
> Consolidate compiler generated dependencies of target fmt
> [ 12%] Built target fmt
> Consolidate compiler generated dependencies of target richdem
> [ 37%] Built target richdem
> Consolidate compiler generated dependencies of target wtm
> [ 62%] Built target wtm
> Consolidate compiler generated dependencies of target wtm.x
> [ 66%] Linking CXX executable wtm.x
> /usr/bin/ld: libwtm.a(transient_groundwater.cpp.o): undefined reference to
> symbol 'MPI_Abort'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40: error
> adding symbols: DSO missing from command line
> collect2: error: ld returned 1 exit status
> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
> make[1]: *** [CMakeFiles/Makefile2:225: CMakeFiles/wtm.x.dir/all] Error 2
> make: *** [Makefile:136: all] Error 2
>
> So perhaps PETSc is now being found. Any other suggestions?
>
> On Fri, Oct 7, 2022 at 11:18 PM Rob Kudyba  wrote:
>
>>
>> Thanks for the quick reply. I added these options to make and make check
>>>> still produce the warnings so I used the command like this:
>>>> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
>>>>  MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
>>>> opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'" check
>>>> Running check examples to verify correct installation
>>>> Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
>>>> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI
>>>> process
>>>> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI
>>>> processes
>>>> Completed test examples
>>>>
>>>> Could be useful for the FAQ.
>>>>
>>> You mentioned you had "OpenMPI 4.1.1 with CUDA aware",  so I think a
>>> workable mpicc should automatically find cuda libraries.  Maybe you
>>> unloaded cuda libraries?
>>>
>> Oh let me clarify, OpenMPI is CUDA aware however this code and the node
>> where PETSc is compiling does not have a GPU, hence not needed and using
>> the MPIEXEC option worked during the 'check' to suppress the warning.
>>
>> I'm now trying to use PETSc to compile and linking appears to go awry:
>>>> [ 58%] Building CXX object
>>>> CMakeFiles/wtm.dir/src/update_effective_storativity.cpp.o
>>>> [ 62%] Linking CXX static library libwtm.a
>>>> [ 62%] Built target wtm
>>>> [ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
>>>> [ 70%] Linking CXX executable wtm.x
>>>> /usr/bin/ld: cannot find -lpetsc
>>>> collect2: error: ld returned 1 exit status
>>>> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
>>>> make[1]: *** [CMakeFiles/Makefile2:269: CMakeFiles/wtm.x.dir/all] Error 2
>>>> make: *** [Makefile:136: all] Error 2
>>>>
>>> It seems cmake could not find petsc.   Look
>>> at $PETSC_DIR/share/petsc/CMakeLists.txt and try to modify your
>>> CMakeLists.txt.
>>>
>>
>> There is an explicit reference to the path in CMakeLists.txt:
>> # NOTE: You may need to update this path to identify PETSc's location
>> set(ENV{PKG_CONFIG_PATH}
>> "$ENV{PKG_CONFIG_PATH}:/path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/")
>> pkg_check_modules(PETSC PETSc>=3.17.1 IMPORTED_TARGET REQUIRED)
>> message(STATUS "Found PETSc ${PETSC_VERSION}")
>> add_subdirectory(common/richdem EXCLUDE_FROM_ALL)
>> add_subdirectory(common/fmt EXCLUDE_FROM_ALL)
>>
>> And that exists:
>> ls /path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/
>> petsc.pc  PETSc.pc
>>
>>  Is there an environment variable I'm missing? I've seen the suggestion
>>> 
>>> to add it to LD_LIBRARY_PATH which I did with export
>>> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib and that
>>> points to:
>>>
>>>> ls -l /path/to/petsc/arch-linux-c-debug/lib
>>>> total 83732
>>>> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so ->
>>>> libpetsc.so.3.18.0
>>>> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so.3.18 ->
>>>> libpetsc.so.3.18.0
>>>> -rwxr-xr-x 1 rk3199 user 85719200 Oct  7 13:56 libpetsc.so.3.18.0
>>>> drwxr-xr-x 3 rk3199 user 4096 Oct  6 10:22 petsc
>>>> drwxr-xr-x 2 rk3199 user 4096 Oct  6 10:23 pkgconfig
>>>>
>>>> Anything else to check?
>>>>
>>> If modifying  CMakeLists.txt does not work, you can try export
>>> LIBRARY_PATH=$LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib
>>> LD_LIBRARY_PATH is for run time, but the error happened at link time,
>>>
>>
>> Yes that's what I already had. Any other 

Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-08 Thread Barry Smith

   I hate these kinds of make rules that hide what the compiler is doing (in
the name of having less output, I guess); they make it difficult to figure out
what is going wrong.

   Anyway, either some of the MPI libraries are missing from the link line or
they are in the wrong order, so the linker cannot resolve symbols from them.
Here are several discussions of why that error message can appear:
https://stackoverflow.com/questions/19901934/libpthread-so-0-error-adding-symbols-dso-missing-from-command-line
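
One way to see which MPI libraries should be on the link line is to ask the wrapper compiler what it would add, then compare that with the failing link command. This is a sketch that assumes an Open MPI wrapper (`--showme:link` is an Open MPI option; other MPI implementations use different flags). The helper name is hypothetical.

```shell
# Hypothetical helper: print the link flags an Open MPI wrapper compiler adds.
show_mpi_link_flags() {
    wrapper="${1:-mpicxx}"
    if ! command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper not found in PATH" >&2
        return 1
    fi
    # Prints the -L/-l (and rpath) flags without compiling anything.
    "$wrapper" --showme:link
}

# Usage: show_mpi_link_flags mpicxx
```

If the output lists a library such as -lmpi that does not appear in the failing link command, that library is a likely candidate for the "DSO missing from command line" error.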


  Barry


> On Oct 7, 2022, at 11:45 PM, Rob Kudyba  wrote:
> 
> The error changes now and at an earlier place, 66% vs 70%:
> make LDFLAGS="-Wl,--copy-dt-needed-entries"
> Consolidate compiler generated dependencies of target fmt
> [ 12%] Built target fmt
> Consolidate compiler generated dependencies of target richdem
> [ 37%] Built target richdem
> Consolidate compiler generated dependencies of target wtm
> [ 62%] Built target wtm
> Consolidate compiler generated dependencies of target wtm.x
> [ 66%] Linking CXX executable wtm.x
> /usr/bin/ld: libwtm.a(transient_groundwater.cpp.o): undefined reference to 
> symbol 'MPI_Abort'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40: error adding 
> symbols: DSO missing from command line
> collect2: error: ld returned 1 exit status
> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
> make[1]: *** [CMakeFiles/Makefile2:225: CMakeFiles/wtm.x.dir/all] Error 2
> make: *** [Makefile:136: all] Error 2
> 
> So perhaps PETSc is now being found. Any other suggestions?
> 
> On Fri, Oct 7, 2022 at 11:18 PM Rob Kudyba  > wrote:
> 
> Thanks for the quick reply. I added these options to make and make check 
> still produce the warnings so I used the command like this:
> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug  MPIEXEC="mpiexec 
> -mca orte_base_help_aggregate 0 --mca opal_warn_on_missing_libcuda 0 -mca pml 
> ucx --mca btl '^openib'" check
> Running check examples to verify correct installation
> Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
> Completed test examples
> 
> Could be useful for the FAQ.
> You mentioned you had "OpenMPI 4.1.1 with CUDA aware",  so I think a workable 
> mpicc should automatically find cuda libraries.  Maybe you unloaded cuda 
> libraries?
> Oh let me clarify, OpenMPI is CUDA aware however this code and the node where 
> PETSc is compiling does not have a GPU, hence not needed and using the
> MPIEXEC option worked during the 'check' to suppress the warning. 
> 
> I'm now trying to use PETSc to compile and linking appears to go awry:
> [ 58%] Building CXX object 
> CMakeFiles/wtm.dir/src/update_effective_storativity.cpp.o
> [ 62%] Linking CXX static library libwtm.a
> [ 62%] Built target wtm
> [ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
> [ 70%] Linking CXX executable wtm.x
> /usr/bin/ld: cannot find -lpetsc
> collect2: error: ld returned 1 exit status
> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
> make[1]: *** [CMakeFiles/Makefile2:269: CMakeFiles/wtm.x.dir/all] Error 2
> make: *** [Makefile:136: all] Error 2
> It seems cmake could not find petsc.   Look at 
> $PETSC_DIR/share/petsc/CMakeLists.txt and try to modify your CMakeLists.txt.
> 
> There is an explicit reference to the path in CMakeLists.txt:
> # NOTE: You may need to update this path to identify PETSc's location
> set(ENV{PKG_CONFIG_PATH} 
> "$ENV{PKG_CONFIG_PATH}:/path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/")
> pkg_check_modules(PETSC PETSc>=3.17.1 IMPORTED_TARGET REQUIRED)
> message(STATUS "Found PETSc ${PETSC_VERSION}")
> add_subdirectory(common/richdem EXCLUDE_FROM_ALL)
> add_subdirectory(common/fmt EXCLUDE_FROM_ALL)
>  
> And that exists:
> ls /path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/
> petsc.pc  PETSc.pc
> 
>  Is there an environment variable I'm missing? I've seen the suggestion 
> 
>  to add it to LD_LIBRARY_PATH which I did with export 
> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib and that points 
> to:
> ls -l /path/to/petsc/arch-linux-c-debug/lib
> total 83732
> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so -> 
> libpetsc.so.3.18.0
> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so.3.18 -> 
> libpetsc.so.3.18.0
> -rwxr-xr-x 1 rk3199 user 85719200 Oct  7 13:56 libpetsc.so.3.18.0
> drwxr-xr-x 3 rk3199 user 4096 Oct  6 10:22 petsc
> drwxr-xr-x 2 rk3199 user 4096 Oct  6 10:23 pkgconfig
> 
> Anything else to check?
> If modifying  CMakeLists.txt does not work, you can try export 
> LIBRARY_PATH=$LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib
> LD_LIBRARY_PATH is for run time, but the error

Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-07 Thread Rob Kudyba
The error has now changed and occurs earlier, at 66% instead of 70%:
make LDFLAGS="-Wl,--copy-dt-needed-entries"
Consolidate compiler generated dependencies of target fmt
[ 12%] Built target fmt
Consolidate compiler generated dependencies of target richdem
[ 37%] Built target richdem
Consolidate compiler generated dependencies of target wtm
[ 62%] Built target wtm
Consolidate compiler generated dependencies of target wtm.x
[ 66%] Linking CXX executable wtm.x
/usr/bin/ld: libwtm.a(transient_groundwater.cpp.o): undefined reference to
symbol 'MPI_Abort'
/path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40: error
adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
make[1]: *** [CMakeFiles/Makefile2:225: CMakeFiles/wtm.x.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

So perhaps PETSc is now being found. Any other suggestions?

On Fri, Oct 7, 2022 at 11:18 PM Rob Kudyba  wrote:

>
> Thanks for the quick reply. I added these options to make and make check
>>> still produce the warnings so I used the command like this:
>>> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
>>>  MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
>>> opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'" check
>>> Running check examples to verify correct installation
>>> Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI
>>> processes
>>> Completed test examples
>>>
>>> Could be useful for the FAQ.
>>>
>> You mentioned you had "OpenMPI 4.1.1 with CUDA aware",  so I think a
>> workable mpicc should automatically find cuda libraries.  Maybe you
>> unloaded cuda libraries?
>>
> Oh let me clarify, OpenMPI is CUDA aware however this code and the node
> where PETSc is compiling does not have a GPU, hence not needed and using
> the MPIEXEC option worked during the 'check' to suppress the warning.
>
> I'm now trying to use PETSc to compile and linking appears to go awry:
>>> [ 58%] Building CXX object
>>> CMakeFiles/wtm.dir/src/update_effective_storativity.cpp.o
>>> [ 62%] Linking CXX static library libwtm.a
>>> [ 62%] Built target wtm
>>> [ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
>>> [ 70%] Linking CXX executable wtm.x
>>> /usr/bin/ld: cannot find -lpetsc
>>> collect2: error: ld returned 1 exit status
>>> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
>>> make[1]: *** [CMakeFiles/Makefile2:269: CMakeFiles/wtm.x.dir/all] Error 2
>>> make: *** [Makefile:136: all] Error 2
>>>
>> It seems cmake could not find petsc.   Look
>> at $PETSC_DIR/share/petsc/CMakeLists.txt and try to modify your
>> CMakeLists.txt.
>>
>
> There is an explicit reference to the path in CMakeLists.txt:
> # NOTE: You may need to update this path to identify PETSc's location
> set(ENV{PKG_CONFIG_PATH}
> "$ENV{PKG_CONFIG_PATH}:/path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/")
> pkg_check_modules(PETSC PETSc>=3.17.1 IMPORTED_TARGET REQUIRED)
> message(STATUS "Found PETSc ${PETSC_VERSION}")
> add_subdirectory(common/richdem EXCLUDE_FROM_ALL)
> add_subdirectory(common/fmt EXCLUDE_FROM_ALL)
>
> And that exists:
> ls /path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/
> petsc.pc  PETSc.pc
>
>  Is there an environment variable I'm missing? I've seen the suggestion
>> 
>> to add it to LD_LIBRARY_PATH which I did with export
>> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib and that
>> points to:
>>
>>> ls -l /path/to/petsc/arch-linux-c-debug/lib
>>> total 83732
>>> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so ->
>>> libpetsc.so.3.18.0
>>> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so.3.18 ->
>>> libpetsc.so.3.18.0
>>> -rwxr-xr-x 1 rk3199 user 85719200 Oct  7 13:56 libpetsc.so.3.18.0
>>> drwxr-xr-x 3 rk3199 user 4096 Oct  6 10:22 petsc
>>> drwxr-xr-x 2 rk3199 user 4096 Oct  6 10:23 pkgconfig
>>>
>>> Anything else to check?
>>>
>> If modifying  CMakeLists.txt does not work, you can try export
>> LIBRARY_PATH=$LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib
>> LD_LIBRARY_PATH is for run time, but the error happened at link time,
>>
>
> Yes that's what I already had. Any other debug that I can provide?
>
>
>
>> On Fri, Oct 7, 2022 at 1:53 PM Satish Balay  wrote:
>>>
>>>> you can try
>>>>
>>>> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
>>>> MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
>>>> opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'"
>>>>
>>>> Wrt configure - it can be set with --with-mpiexec option - it's saved in
>>>> PETSC_ARCH/lib/petsc/conf/petscvariables
>>>>
>>>> Satish
>>>>
>>>> On Fri, 7 Oct 2022, Rob Kudyba wrote:


Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-07 Thread Rob Kudyba
> Thanks for the quick reply. I added these options to make and make check
>> still produce the warnings so I used the command like this:
>> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
>>  MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
>> opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'" check
>> Running check examples to verify correct installation
>> Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
>> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
>> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI
>> processes
>> Completed test examples
>>
>> Could be useful for the FAQ.
>>
> You mentioned you had "OpenMPI 4.1.1 with CUDA aware",  so I think a
> workable mpicc should automatically find cuda libraries.  Maybe you
> unloaded cuda libraries?
>
Oh, let me clarify: OpenMPI is CUDA-aware, but this code does not use a GPU and
the node where PETSc is being compiled does not have one, so CUDA support is not
needed. Using the MPIEXEC option during the 'check' suppressed the warning.

I'm now trying to use PETSc to compile and linking appears to go awry:
>> [ 58%] Building CXX object
>> CMakeFiles/wtm.dir/src/update_effective_storativity.cpp.o
>> [ 62%] Linking CXX static library libwtm.a
>> [ 62%] Built target wtm
>> [ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
>> [ 70%] Linking CXX executable wtm.x
>> /usr/bin/ld: cannot find -lpetsc
>> collect2: error: ld returned 1 exit status
>> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
>> make[1]: *** [CMakeFiles/Makefile2:269: CMakeFiles/wtm.x.dir/all] Error 2
>> make: *** [Makefile:136: all] Error 2
>>
> It seems cmake could not find petsc.   Look
> at $PETSC_DIR/share/petsc/CMakeLists.txt and try to modify your
> CMakeLists.txt.
>

There is an explicit reference to the path in CMakeLists.txt:
# NOTE: You may need to update this path to identify PETSc's location
set(ENV{PKG_CONFIG_PATH}
"$ENV{PKG_CONFIG_PATH}:/path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/")
pkg_check_modules(PETSC PETSc>=3.17.1 IMPORTED_TARGET REQUIRED)
message(STATUS "Found PETSc ${PETSC_VERSION}")
add_subdirectory(common/richdem EXCLUDE_FROM_ALL)
add_subdirectory(common/fmt EXCLUDE_FROM_ALL)

And that exists:
ls /path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/
petsc.pc  PETSc.pc
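
A quick way to see what pkg-config would hand to CMake from that directory is sketched below. The helper name is hypothetical; `PKG_CONFIG_PATH` and `pkg-config --libs` are standard.

```shell
# Hypothetical helper: print the link flags pkg-config reports for a package
# found via one specific pkgconfig directory.
pc_link_flags() {
    pc_dir="$1"            # e.g. $PETSC_DIR/$PETSC_ARCH/lib/pkgconfig
    pkg="${2:-PETSc}"      # package name as used in pkg_check_modules
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config not found" >&2
        return 1
    fi
    # Put the given directory first so it wins over any other copy.
    PKG_CONFIG_PATH="$pc_dir${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}" \
        pkg-config --libs "$pkg"
}

# Usage: pc_link_flags /path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig
```

The -L directory in the output should be the one that actually contains libpetsc.so. Since both arch-linux-cxx-debug and arch-linux-c-debug appear in this thread, it may be worth confirming that the two paths refer to the same PETSC_ARCH.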

 Is there an environment variable I'm missing? I've seen the suggestion
> 
> to add it to LD_LIBRARY_PATH which I did with export
> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib and that
> points to:
>
>> ls -l /path/to/petsc/arch-linux-c-debug/lib
>> total 83732
>> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so ->
>> libpetsc.so.3.18.0
>> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so.3.18 ->
>> libpetsc.so.3.18.0
>> -rwxr-xr-x 1 rk3199 user 85719200 Oct  7 13:56 libpetsc.so.3.18.0
>> drwxr-xr-x 3 rk3199 user 4096 Oct  6 10:22 petsc
>> drwxr-xr-x 2 rk3199 user 4096 Oct  6 10:23 pkgconfig
>>
>> Anything else to check?
>>
> If modifying  CMakeLists.txt does not work, you can try export
> LIBRARY_PATH=$LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib
> LD_LIBRARY_PATH is for run time, but the error happened at link time,
>

Yes, that's what I already had. Is there any other debug output I can provide?



> On Fri, Oct 7, 2022 at 1:53 PM Satish Balay  wrote:
>>
>>> you can try
>>>
>>> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
>>> MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
>>> opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'"
>>>
>>> Wrt configure - it can be set with --with-mpiexec option - it's saved in
>>> PETSC_ARCH/lib/petsc/conf/petscvariables
>>>
>>> Satish
>>>
>>> On Fri, 7 Oct 2022, Rob Kudyba wrote:
>>>
>>> > We are on RHEL 8, using modules that we can load/unload various
>>> version of
>>> > packages/libraries, and I have OpenMPI 4.1.1 with CUDA aware loaded
>>> along
>>> > with GDAL 3.3.0, GCC 10.2.0, and cmake 3.22.1
>>> >
>>> > make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
>>> > fails with the below errors,
>>> > Running check examples to verify correct installation
>>> >
>>> > Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
>>> > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
>>> > See https://petsc.org/release/faq/
>>> >
>>> --
>>> > The library attempted to open the following supporting CUDA libraries,
>>> > but each of them failed.  CUDA-aware support is disabled.
>>> > libcuda.so.1: cannot open shared object file: No such file or directory
>>> > libcuda.dylib: cannot open shared object file: No such file or
>>> directory
>>> > /usr/lib64/libcuda.so.1: cannot open shared object file: No such file
>>> or
>>> > directory
>>> > /usr/lib64/libcuda.dylib: cannot 

Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-07 Thread Junchao Zhang
On Fri, Oct 7, 2022 at 1:08 PM Rob Kudyba  wrote:

> Thanks for the quick reply. I added these options to make and make check
> still produce the warnings so I used the command like this:
> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
>  MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
> opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'" check
> Running check examples to verify correct installation
> Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
> Completed test examples
>
> Could be useful for the FAQ.
>
You mentioned you had "OpenMPI 4.1.1 with CUDA aware",  so I think a
workable mpicc should automatically find cuda libraries.  Maybe you
unloaded cuda libraries?


> I'm now trying to use PETSc to compile and linking appears to go awry:
> [ 58%] Building CXX object
> CMakeFiles/wtm.dir/src/update_effective_storativity.cpp.o
> [ 62%] Linking CXX static library libwtm.a
> [ 62%] Built target wtm
> [ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
> [ 70%] Linking CXX executable wtm.x
> /usr/bin/ld: cannot find -lpetsc
> collect2: error: ld returned 1 exit status
> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
> make[1]: *** [CMakeFiles/Makefile2:269: CMakeFiles/wtm.x.dir/all] Error 2
> make: *** [Makefile:136: all] Error 2
>
It seems cmake could not find petsc.   Look
at $PETSC_DIR/share/petsc/CMakeLists.txt and try to modify your
CMakeLists.txt.


>
>
> Is there an environment variable I'm missing? I've seen the suggestion
> 
> to add it to LD_LIBRARY_PATH which I did with export
> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib and that
> points to:
> ls -l /path/to/petsc/arch-linux-c-debug/lib
> total 83732
> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so ->
> libpetsc.so.3.18.0
> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so.3.18 ->
> libpetsc.so.3.18.0
> -rwxr-xr-x 1 rk3199 user 85719200 Oct  7 13:56 libpetsc.so.3.18.0
> drwxr-xr-x 3 rk3199 user 4096 Oct  6 10:22 petsc
> drwxr-xr-x 2 rk3199 user 4096 Oct  6 10:23 pkgconfig
>
> Anything else to check?
>
If modifying CMakeLists.txt does not work, you can try
export LIBRARY_PATH=$LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib
LD_LIBRARY_PATH is for run time, but the error happened at link time.
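
A minimal sketch of setting both variables, assuming a GCC toolchain (GCC consults LIBRARY_PATH when resolving -l options at link time). The `append_path` helper and the fallback values for PETSC_DIR and PETSC_ARCH are illustrative placeholders.

```shell
# Hypothetical helper: append a directory to a colon-separated search path
# without leaving an empty leading entry when the variable was unset.
append_path() {
    # $1 = current value (may be empty), $2 = directory to add
    printf '%s\n' "${1:+$1:}$2"
}

# Placeholder values; substitute your own PETSc location and arch.
PETSC_DIR="${PETSC_DIR:-/path/to/petsc}"
PETSC_ARCH="${PETSC_ARCH:-arch-linux-c-debug}"

# Link time: searched by GCC when resolving -lpetsc.
LIBRARY_PATH=$(append_path "${LIBRARY_PATH:-}" "$PETSC_DIR/$PETSC_ARCH/lib")
# Run time: searched by the dynamic loader for libpetsc.so.
LD_LIBRARY_PATH=$(append_path "${LD_LIBRARY_PATH:-}" "$PETSC_DIR/$PETSC_ARCH/lib")
export LIBRARY_PATH LD_LIBRARY_PATH
```

The `${1:+$1:}` form only inserts the separator when the variable already has a value, so an unset variable does not produce a path that starts with a colon.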


>
> On Fri, Oct 7, 2022 at 1:53 PM Satish Balay  wrote:
>
>> you can try
>>
>> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
>> MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
>> opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'"
>>
>> Wrt configure - it can be set with --with-mpiexec option - it's saved in
>> PETSC_ARCH/lib/petsc/conf/petscvariables
>>
>> Satish
>>
>> On Fri, 7 Oct 2022, Rob Kudyba wrote:
>>
>> > We are on RHEL 8, using modules that we can load/unload various version
>> of
>> > packages/libraries, and I have OpenMPI 4.1.1 with CUDA aware loaded
>> along
>> > with GDAL 3.3.0, GCC 10.2.0, and cmake 3.22.1
>> >
>> > make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
>> > fails with the below errors,
>> > Running check examples to verify correct installation
>> >
>> > Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
>> > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
>> > See https://petsc.org/release/faq/
>> >
>> --
>> > The library attempted to open the following supporting CUDA libraries,
>> > but each of them failed.  CUDA-aware support is disabled.
>> > libcuda.so.1: cannot open shared object file: No such file or directory
>> > libcuda.dylib: cannot open shared object file: No such file or directory
>> > /usr/lib64/libcuda.so.1: cannot open shared object file: No such file or
>> > directory
>> > /usr/lib64/libcuda.dylib: cannot open shared object file: No such file
>> or
>> > directory
>> > If you are not interested in CUDA-aware support, then run with
>> > --mca opal_warn_on_missing_libcuda 0 to suppress this message.  If you
>> are
>> > interested
>> > in CUDA-aware support, then try setting LD_LIBRARY_PATH to the location
>> > of libcuda.so.1 to get passed this issue.
>> >
>> --
>> >
>> --
>> > WARNING: There was an error initializing an OpenFabrics device.
>> >
>> >   Local host:   g117
>> >   Local device: mlx5_0
>> >
>> --
>> > lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
>> > Number of SNES iterations = 2
>> > Possible error running C/C++ 

Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-07 Thread Rob Kudyba
Thanks for the quick reply. I added these options to make, but make check
still produced the warnings, so I ran the command like this:
make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
 MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'" check
Running check examples to verify correct installation
Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
Completed test examples

Could be useful for the FAQ.
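
An alternative sketch that avoids editing the mpiexec command line: Open MPI also reads MCA parameters from environment variables named OMPI_MCA_<parameter>. The values below simply mirror the options used above; whether this is preferable to MPIEXEC=... on a given cluster is an assumption to verify locally.

```shell
# Same MCA settings as the mpiexec options above, set through the environment.
export OMPI_MCA_orte_base_help_aggregate=0
export OMPI_MCA_opal_warn_on_missing_libcuda=0
export OMPI_MCA_pml=ucx
export OMPI_MCA_btl='^openib'

# The plain check command can then be run in the same shell:
#   make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
```

These exports only affect processes started from the shell in which they are set, so they would need to go into a job script or module file to persist.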

I'm now trying to use PETSc to compile and linking appears to go awry:
[ 58%] Building CXX object
CMakeFiles/wtm.dir/src/update_effective_storativity.cpp.o
[ 62%] Linking CXX static library libwtm.a
[ 62%] Built target wtm
[ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
[ 70%] Linking CXX executable wtm.x
/usr/bin/ld: cannot find -lpetsc
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
make[1]: *** [CMakeFiles/Makefile2:269: CMakeFiles/wtm.x.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

Is there an environment variable I'm missing? I've seen the suggestion to
add it to LD_LIBRARY_PATH, which I did with export
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib, and that points
to:
ls -l /path/to/petsc/arch-linux-c-debug/lib
total 83732
lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so ->
libpetsc.so.3.18.0
lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so.3.18 ->
libpetsc.so.3.18.0
-rwxr-xr-x 1 rk3199 user 85719200 Oct  7 13:56 libpetsc.so.3.18.0
drwxr-xr-x 3 rk3199 user 4096 Oct  6 10:22 petsc
drwxr-xr-x 2 rk3199 user 4096 Oct  6 10:23 pkgconfig

Anything else to check?

On Fri, Oct 7, 2022 at 1:53 PM Satish Balay  wrote:

> you can try
>
> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug
> MPIEXEC="mpiexec -mca orte_base_help_aggregate 0 --mca
> opal_warn_on_missing_libcuda 0 -mca pml ucx --mca btl '^openib'"
>
> Wrt configure - it can be set with --with-mpiexec option - it's saved in
> PETSC_ARCH/lib/petsc/conf/petscvariables
>
> Satish
>
> On Fri, 7 Oct 2022, Rob Kudyba wrote:
>
> > We are on RHEL 8, using modules that we can load/unload various version
> of
> > packages/libraries, and I have OpenMPI 4.1.1 with CUDA aware loaded along
> > with GDAL 3.3.0, GCC 10.2.0, and cmake 3.22.1
> >
> > make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
> > fails with the below errors,
> > Running check examples to verify correct installation
> >
> > Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
> > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
> > See https://petsc.org/release/faq/
> >
> --
> > The library attempted to open the following supporting CUDA libraries,
> > but each of them failed.  CUDA-aware support is disabled.
> > libcuda.so.1: cannot open shared object file: No such file or directory
> > libcuda.dylib: cannot open shared object file: No such file or directory
> > /usr/lib64/libcuda.so.1: cannot open shared object file: No such file or
> > directory
> > /usr/lib64/libcuda.dylib: cannot open shared object file: No such file or
> > directory
> > If you are not interested in CUDA-aware support, then run with
> > --mca opal_warn_on_missing_libcuda 0 to suppress this message.  If you are
> > interested in CUDA-aware support, then try setting LD_LIBRARY_PATH to the
> > location of libcuda.so.1 to get passed this issue.
> >
> --
> >
> --
> > WARNING: There was an error initializing an OpenFabrics device.
> >
> >   Local host:   g117
> >   Local device: mlx5_0
> >
> --
> > lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> > Number of SNES iterations = 2
> > Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
> > See https://petsc.org/release/faq/
> >
> > The library attempted to open the following supporting CUDA libraries,
> > but each of them failed.  CUDA-aware support is disabled.
> > libcuda.so.1: cannot open shared object file: No such file or directory
> > libcuda.dylib: cannot open shared object file: No such file or directory
> > /usr/lib64/libcuda.so.1: cannot open shared object file: No such file or
> > directory
> > /usr/lib64/libcuda.dylib: cannot open shared object file: No such file or
> > directory
> > If you are not interested in CUDA-aware support, then run with
> > --mca opal_warn_on_missing_libcuda 0 to suppress this 

[petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-07 Thread Rob Kudyba
We are on RHEL 8, using modules to load/unload various versions of
packages/libraries. I have a CUDA-aware OpenMPI 4.1.1 loaded, along
with GDAL 3.3.0, GCC 10.2.0, and cmake 3.22.1.

make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
fails with the errors below:
Running check examples to verify correct installation

Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
See https://petsc.org/release/faq/
--
The library attempted to open the following supporting CUDA libraries,
but each of them failed.  CUDA-aware support is disabled.
libcuda.so.1: cannot open shared object file: No such file or directory
libcuda.dylib: cannot open shared object file: No such file or directory
/usr/lib64/libcuda.so.1: cannot open shared object file: No such file or
directory
/usr/lib64/libcuda.dylib: cannot open shared object file: No such file or
directory
If you are not interested in CUDA-aware support, then run with
--mca opal_warn_on_missing_libcuda 0 to suppress this message.  If you are
interested in CUDA-aware support, then try setting LD_LIBRARY_PATH to the
location of libcuda.so.1 to get passed this issue.
--
--
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   g117
  Local device: mlx5_0
--
lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
Number of SNES iterations = 2
Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
See https://petsc.org/release/faq/

The library attempted to open the following supporting CUDA libraries,
but each of them failed.  CUDA-aware support is disabled.
libcuda.so.1: cannot open shared object file: No such file or directory
libcuda.dylib: cannot open shared object file: No such file or directory
/usr/lib64/libcuda.so.1: cannot open shared object file: No such file or
directory
/usr/lib64/libcuda.dylib: cannot open shared object file: No such file or
directory
If you are not interested in CUDA-aware support, then run with
--mca opal_warn_on_missing_libcuda 0 to suppress this message.  If you are
interested in CUDA-aware support, then try setting LD_LIBRARY_PATH to the
location of libcuda.so.1 to get passed this issue.

WARNING: There was an error initializing an OpenFabrics device.

  Local host:   xxx
  Local device: mlx5_0

lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
Number of SNES iterations = 2
[g117:4162783] 1 more process has sent help message help-mpi-common-cuda.txt / dlopen failed
[g117:4162783] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[g117:4162783] 1 more process has sent help message help-mpi-btl-openib.txt / error in device init
Completed test examples
Error while running make check
gmake[1]: *** [makefile:149: check] Error 1
make: *** [GNUmakefile:17: check] Error 2

Where is $MPI_RUN set? I'd like to be able to pass options such as
--mca orte_base_help_aggregate 0 --mca opal_warn_on_missing_libcuda 0
-mca pml ucx --mca btl '^openib', which would help me troubleshoot and
hide unneeded warnings.

Thanks,
Rob
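
On passing MCA options without editing the mpiexec command line: Open MPI also reads MCA parameters from environment variables named OMPI_MCA_<parameter>, so the same settings can be exported before running make check. A minimal sketch, assuming the Open MPI build in use recognizes the four parameters named in the message above. The make line is left as a comment so the snippet runs anywhere:

```shell
# Each export mirrors one "--mca <name> <value>" pair from the question.
export OMPI_MCA_orte_base_help_aggregate=0      # show every help/error message
export OMPI_MCA_opal_warn_on_missing_libcuda=0  # silence the libcuda warning
export OMPI_MCA_pml=ucx                         # select the UCX PML
export OMPI_MCA_btl='^openib'                   # exclude the openib BTL

# List the MCA settings now visible to any mpiexec started from this shell.
env | grep '^OMPI_MCA_' | sort

# Then run the check as before:
# make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug check
```

Flags given directly on the mpiexec command line take precedence over these environment settings, so this suits defaults for a whole session rather than one-off overrides.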