Thank you Barry and Stefano,

Below is the output from the example, which I ran with the added option
-use_gpu_aware_mpi 0 since my MPI is not GPU-aware. I believe this may be
responsible for the error. The reason I chose to compile with the option

>>> --download-hypre-configure-arguments=--enable-unified-memory \

is that it appears in config/examples/arch-ci-linux-cuda-pkgs.py. There are
several other examples, and I had no particular reason for choosing this
one other than that it uses hypre. I didn't think too much about it. After
recompiling without this option, the example ran successfully. I will look
into building Open MPI with CUDA support.
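
One way to check whether an Open MPI installation was built with CUDA
support is to query ompi_info. The sketch below assumes that the ompi_info
belonging to the MPI in use is on the PATH; the function name
check_ompi_cuda is only illustrative and is not part of PETSc or Open MPI.

```shell
# Sketch: report whether the Open MPI found on PATH was built with CUDA support.
# Assumes Open MPI's ompi_info tool; other MPI implementations need another check.
check_ompi_cuda() {
  if ! command -v ompi_info >/dev/null 2>&1; then
    echo "ompi_info not found"
    return 0
  fi
  # ompi_info prints a line ending in ":value:true" for CUDA-enabled builds.
  if ompi_info --parsable --all 2>/dev/null \
      | grep -q 'mpi_built_with_cuda_support:value:true'; then
    echo "CUDA-aware: yes"
  else
    echo "CUDA-aware: no"
  fi
}
check_ompi_cuda
```

If this reports "no", running with -use_gpu_aware_mpi 0 as in the commands
below remains necessary.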

Thanks!

For the sake of reference:

With the --download-hypre-configure-arguments=--enable-unified-memory option

$ mpiexec -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -da_refine 3 
-snes_monitor_short -ksp_norm_type unpreconditioned -pc_type hypre 
-use_gpu_aware_mpi 0 -info > log.ex19 2>&1

[0] PetscDetermineInitialFPTrap(): Floating point trapping is off by default 0
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType cuda 
supported, initializing
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType hip not 
supported
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType sycl not 
supported
[0] PetscInitialize_Common(): PETSc successfully started: number of processors 
= 1
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] PetscInitialize_Common(): Running on machine: node021
[0] PetscCommDuplicate(): Duplicating a communicator 140679929097504 30289408 
max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929097504 
30289408
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929097504 
30289408
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929097504 
30289408
[0] PetscCommDuplicate(): Duplicating a communicator 140679929096992 30157712 
max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 
30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 
30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 
30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 
30157712
[0] DMGetDMSNES(): Creating new DMSNES
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] configure(): Configured device 0
lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 2500 X 2500; storage space: 0 
unneeded,48400 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 20
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 
2500) < 0.6. Do not use CompressedRow routines.
[0] DMGetDMKSP(): Creating new DMKSP
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 
30157712
[0] PetscDeviceContextSetupGlobalContext_Private(): Initializing global 
PetscDeviceContext
0 SNES Function norm 0.0406612
[0] ISColoringCreate(): Number of colors 20
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 
30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 
30157712
[0] MatFDColoringSetUp_SeqXAIJ(): ncolors 20, brows 66 and bcols 15 are used.
[0] SNESComputeJacobian(): Rebuilding preconditioner
[0] PCSetUp(): Setting up PC for first time
[0] MatConvert(): Check superclass seqhypre seqaijcusparse -> 0
[0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0
[0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_seqhypre_C 
(seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_mpihypre_C 
(seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_hypre_C 
(seqaijcusparse) -> 1
CUDA ERROR (code = 101, invalid device ordinal) at memory.c:139
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus 
causing
the job to be terminated. The first process to do so was:

Process name: [[51372,1],0]
Exit code: 1
--------------------------------------------------------------------------

Without the --download-hypre-configure-arguments=--enable-unified-memory option

$ mpiexec -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -da_refine 3 
-snes_monitor_short -ksp_norm_type unpreconditioned -pc_type hypre 
-use_gpu_aware_mpi 0 -info > log.ex19 2>&1

[0] PetscDetermineInitialFPTrap(): Floating point trapping is off by default 0
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType cuda 
supported, initializing
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType hip not 
supported
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType sycl not 
supported
[0] PetscInitialize_Common(): PETSc successfully started: number of processors 
= 1
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] PetscInitialize_Common(): Running on machine: node021
[0] PetscCommDuplicate(): Duplicating a communicator 140322706697504 29662720 
max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706697504 
29662720
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706697504 
29662720
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706697504 
29662720
[0] PetscCommDuplicate(): Duplicating a communicator 140322706696992 29531024 
max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 
29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 
29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 
29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 
29531024
[0] DMGetDMSNES(): Creating new DMSNES
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] configure(): Configured device 0
lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 2500 X 2500; storage space: 0 
unneeded,48400 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 20
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 
2500) < 0.6. Do not use CompressedRow routines.
[0] DMGetDMKSP(): Creating new DMKSP
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 
29531024
[0] PetscDeviceContextSetupGlobalContext_Private(): Initializing global 
PetscDeviceContext
0 SNES Function norm 0.0406612
[0] ISColoringCreate(): Number of colors 20
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 
29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 
29531024
[0] MatFDColoringSetUp_SeqXAIJ(): ncolors 20, brows 66 and bcols 15 are used.
[0] SNESComputeJacobian(): Rebuilding preconditioner
[0] PCSetUp(): Setting up PC for first time
[0] MatConvert(): Check superclass seqhypre seqaijcusparse -> 0
[0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_seqhypre_C 
(seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_mpihypre_C 
(seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_hypre_C 
(seqaijcusparse) -> 1
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] KSPConvergedDefault(): Linear solver has converged. Residual norm 
3.001654795047e-07 is less than relative tolerance 1.000000000000e-05 times 
initial right hand side norm 4.066115181565e-02 at iteration 33
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] SNESSolve_NEWTONLS(): iter=0, linear solve iterations=33
[0] SNESNEWTONLSCheckResidual_Private(): ||J^T(F-Ax)||/||F-AX|| 
2.890238131751e+01 near zero implies inconsistent rhs
[0] PetscSplitReductionGet(): Putting reduction data in an MPI_Comm 29662720
[0] SNESLineSearchApply_BT(): Initial fnorm 4.066115181565e-02 gnorm 
3.338338626166e-06
[0] SNESSolve_NEWTONLS(): fnorm=4.0661151815649638e-02, 
gnorm=3.3383386261659113e-06, ynorm=5.4373378910396353e-01, lssucceed=0
1 SNES Function norm 3.33834e-06
[0] SNESComputeJacobian(): Rebuilding preconditioner
[0] PCSetUp(): Setting up PC with same nonzero pattern
[0] MatConvert(): Check superclass seqhypre seqaijcusparse -> 0
[0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_seqhypre_C 
(seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_mpihypre_C 
(seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_hypre_C 
(seqaijcusparse) -> 1
[0] PetscCommGetComm(): Reusing a communicator 29662720 68829840
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] KSPConvergedDefault(): Linear solver has converged. Residual norm 
2.753325754967e-11 is less than relative tolerance 1.000000000000e-05 times 
initial right hand side norm 3.338338626166e-06 at iteration 29
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] SNESSolve_NEWTONLS(): iter=1, linear solve iterations=29
[0] SNESNEWTONLSCheckResidual_Private(): ||J^T(F-Ax)||/||F-AX|| 
3.172675080131e+01 near zero implies inconsistent rhs
[0] SNESLineSearchApply_BT(): Initial fnorm 3.338338626166e-06 gnorm 
2.754150439906e-11
[0] SNESSolve_NEWTONLS(): fnorm=3.3383386261659113e-06, 
gnorm=2.7541504399056686e-11, ynorm=1.6805315020558734e-05, lssucceed=0
2 SNES Function norm 2.754e-11
[0] SNESConvergedDefault(): Converged due to function norm 2.754150439906e-11 < 
4.066115181565e-10 (relative tolerance)
Number of SNES iterations = 2
[0] Petsc_OuterComm_Attr_Delete_Fn(): Removing reference to PETSc communicator 
embedded in a user MPI_Comm 29531024
[0] Petsc_InnerComm_Attr_Delete_Fn(): User MPI_Comm 140322706696992 is being 
unlinked from inner PETSc comm 29531024
[0] PetscCommDestroy(): Deleting PETSc MPI_Comm 29531024
[0] Petsc_Counter_Attr_Delete_Fn(): Deleting counter data in an MPI_Comm 
29531024
[0] PetscFinalize(): PetscFinalize() called
[0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm 29662720
[0] Petsc_OuterComm_Attr_Delete_Fn(): Removing reference to PETSc communicator 
embedded in a user MPI_Comm 29662720
[0] Petsc_InnerComm_Attr_Delete_Fn(): User MPI_Comm 140322706697504 is being 
unlinked from inner PETSc comm 29662720
[0] PetscCommDestroy(): Deleting PETSc MPI_Comm 29662720
[0] Petsc_DelReduction(): Deleting reduction data in an MPI_Comm 29662720
[0] Petsc_Counter_Attr_Delete_Fn(): Deleting counter data in an MPI_Comm 
29662720
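
The repeated PCSetUp notices make logs like the two above hard to scan. A
minimal sketch for condensing them, assuming the output was saved to
log.ex19 as in the commands above (the function name summarize_info_log is
only illustrative):

```shell
# Sketch: condense a PETSc -info log by dropping the repeated PCSetUp notices,
# then list any CUDA error lines together with their line numbers.
# The default file name log.ex19 is taken from the commands above.
summarize_info_log() {
  log="${1:-log.ex19}"
  if [ ! -f "$log" ]; then
    echo "no such log: $log"
    return 0
  fi
  # Print everything except the "identical preconditioner" notices.
  grep -v 'Leaving PC with identical preconditioner' "$log" || true
  # Show where (if anywhere) a CUDA error was reported.
  grep -n 'CUDA ERROR' "$log" || echo "no CUDA ERROR lines found"
}
summarize_info_log
```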

> On Jul 14, 2022, at 1:56 PM, Stefano Zampini <[email protected]> 
> wrote:
>
> You don't need unified memory for boomeramg to work.
>
> On Thu, Jul 14, 2022, 18:55 Barry Smith <[email protected]> wrote:
>
>> So the PETSc tests all run, including the test that uses a GPU.
>>
>> The hypre test is failing. It is impossible to tell from the output why.
>>
>> You can run it manually, cd src/snes/tutorials
>>
>> make ex19
>> mpiexec -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -da_refine 3 
>> -snes_monitor_short -ksp_norm_type unpreconditioned -pc_type hypre -info > 
>> somefile
>>
>> then take a look at the output in somefile and send it to us.
>>
>> Barry
>>
>>> On Jul 14, 2022, at 12:32 PM, Juan Pablo de Lima Costa Salazar via 
>>> petsc-users <[email protected]> wrote:
>>>
>>> Hello,
>>>
>>> I was hoping to get help regarding a runtime error I am encountering on a 
>>> cluster node with 4 Tesla K40m GPUs after configuring PETSc with the 
>>> following command:
>>>
>>> $./configure --force \
>>> --with-precision=double \
>>> --with-debugging=0 \
>>> --COPTFLAGS=-O3 \
>>> --CXXOPTFLAGS=-O3 \
>>> --FOPTFLAGS=-O3 \
>>> PETSC_ARCH=linux64GccDPInt32-spack \
>>> --download-fblaslapack \
>>> --download-openblas \
>>> --download-hypre \
>>> --download-hypre-configure-arguments=--enable-unified-memory \
>>> --with-mpi-dir=/opt/ohpc/pub/mpi/openmpi4-gnu9/4.0.4 \
>>> --with-cuda=1 \
>>> --download-suitesparse \
>>> --download-dir=downloads \
>>> --with-cudac=/opt/ohpc/admin/spack/0.15.0/opt/spack/linux-centos8-ivybridge/gcc-9.3.0/cuda-11.7.0-hel25vgwc7fixnvfl5ipvnh34fnskw3m/bin/nvcc \
>>> --with-packages-download-dir=downloads \
>>> --download-sowing=downloads/v1.1.26-p4.tar.gz \
>>> --with-cuda-arch=35
>>>
>>> When I run
>>>
>>> $ make PETSC_DIR=/home/juan/OpenFOAM/juan-v2206/petsc-cuda 
>>> PETSC_ARCH=linux64GccDPInt32-spack check
>>> Running check examples to verify correct installation
>>> Using PETSC_DIR=/home/juan/OpenFOAM/juan-v2206/petsc-cuda and 
>>> PETSC_ARCH=linux64GccDPInt32-spack
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
>>> 3,5c3,15
>>> < 1 SNES Function norm 4.12227e-06
>>> < 2 SNES Function norm 6.098e-11
>>> < Number of SNES iterations = 2
>>> ---
>>>> CUDA ERROR (code = 101, invalid device ordinal) at memory.c:139
>>>> CUDA ERROR (code = 101, invalid device ordinal) at memory.c:139
>>>> --------------------------------------------------------------------------
>>>> Primary job terminated normally, but 1 process returned
>>>> a non-zero exit code. Per user-direction, the job has been aborted.
>>>> --------------------------------------------------------------------------
>>>> --------------------------------------------------------------------------
>>>> mpiexec detected that one or more processes exited with non-zero status, 
>>>> thus causing
>>>> the job to be terminated. The first process to do so was:
>>>>
>>>> Process name: [[52712,1],0]
>>>> Exit code: 1
>>>> --------------------------------------------------------------------------
>>> /home/juan/OpenFOAM/juan-v2206/petsc-cuda/src/snes/tutorials
>>> Possible problem with ex19 running with hypre, diffs above
>>> =========================================
>>> C/C++ example src/snes/tutorials/ex19 run successfully with cuda
>>> C/C++ example src/snes/tutorials/ex19 run successfully with suitesparse
>>> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process
>>> Completed test examples
>>>
>>> I have compiled the code on the head node (without GPUs) and on the compute 
>>> node where there are 4 GPUs.
>>>
>>> $nvidia-debugdump -l
>>> Found 4 NVIDIA devices
>>> Device ID: 0
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717032250
>>>
>>> Device ID: 1
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717031968
>>>
>>> Device ID: 2
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717032246
>>>
>>> Device ID: 3
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717032235
>>>
>>> Attached are the log files from configure and make.
>>>
>>> Any pointers are highly appreciated. My intention is to use PETSc as a 
>>> linear solver for OpenFOAM, leveraging the availability of GPUs at the same 
>>> time. Currently I can run PETSc without GPU support.
>>>
>>> Cheers,
>>> Juan S.
>>>
>>> <configure.log.tar.gz><make.log.tar.gz>
