Thank you Barry and Stefano,

Below is the output from the example, which I ran with an added option (-use_gpu_aware_mpi 0) since my MPI is not GPU-aware. I believe this may be responsible for the error. The reason I chose to compile with the option

>>> --download-hypre-configure-arguments=--enable-unified-memory \

is that it was in config/examples/arch-ci-linux-cuda-pkgs.py. There are several other examples, and I had no particular reason for choosing this one other than that it uses hypre. I didn't think too much about it.

After recompiling without this option, the example ran successfully. I will look into using OpenMPI with CUDA support.

Thanks!

For the sake of reference:

With the --download-hypre-configure-arguments=--enable-unified-memory option

$ mpiexec -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -da_refine 3 -snes_monitor_short -ksp_norm_type unpreconditioned -pc_type hypre -use_gpu_aware_mpi 0 -info > log.ex19 2>&1

[0] PetscDetermineInitialFPTrap(): Floating point trapping is off by default 0
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType cuda supported, initializing
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType hip not supported
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType sycl not supported
[0] PetscInitialize_Common(): PETSc successfully started: number of processors = 1
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] PetscInitialize_Common(): Running on machine: node021
[0] PetscCommDuplicate(): Duplicating a communicator 140679929097504 30289408 max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929097504 30289408
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929097504 30289408
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929097504 30289408
[0] PetscCommDuplicate(): Duplicating a communicator 140679929096992 30157712 max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] DMGetDMSNES(): Creating new DMSNES
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] configure(): Configured device 0
lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 2500 X 2500; storage space: 0 unneeded,48400 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 20
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 2500) < 0.6. Do not use CompressedRow routines.
[0] DMGetDMKSP(): Creating new DMKSP
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] PetscDeviceContextSetupGlobalContext_Private(): Initializing global PetscDeviceContext
0 SNES Function norm 0.0406612
[0] ISColoringCreate(): Number of colors 20
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] PetscCommDuplicate(): Using internal PETSc communicator 140679929096992 30157712
[0] MatFDColoringSetUp_SeqXAIJ(): ncolors 20, brows 66 and bcols 15 are used.
[0] SNESComputeJacobian(): Rebuilding preconditioner
[0] PCSetUp(): Setting up PC for first time
[0] MatConvert(): Check superclass seqhypre seqaijcusparse -> 0
[0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0
[0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_seqhypre_C (seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_mpihypre_C (seqaijcusparse) -> 0
[0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_hypre_C (seqaijcusparse) -> 1
CUDA ERROR (code = 101, invalid device ordinal) at memory.c:139
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[51372,1],0]
Exit code: 1
--------------------------------------------------------------------------

Without the --download-hypre-configure-arguments=--enable-unified-memory option

$ mpiexec -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -da_refine 3 -snes_monitor_short -ksp_norm_type unpreconditioned -pc_type hypre -use_gpu_aware_mpi 0 -info > log.ex19 2>&1

[0] PetscDetermineInitialFPTrap(): Floating point trapping is off by default 0
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType cuda supported, initializing
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType hip not supported
[0] PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType sycl not supported
[0] PetscInitialize_Common(): PETSc successfully started: number of processors = 1
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] PetscInitialize_Common(): Running on machine: node021
[0] PetscCommDuplicate(): Duplicating a communicator 140322706697504 29662720 max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706697504 29662720
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706697504 29662720
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706697504 29662720
[0] PetscCommDuplicate(): Duplicating a communicator 140322706696992 29531024 max tags = 8388607
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] DMGetDMSNES(): Creating new DMSNES
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node021.(none)
[0] configure(): Configured device 0
lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 2500 X 2500; storage space: 0 unneeded,48400 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 20
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 2500) < 0.6. Do not use CompressedRow routines.
[0] DMGetDMKSP(): Creating new DMKSP
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] PetscDeviceContextSetupGlobalContext_Private(): Initializing global PetscDeviceContext
0 SNES Function norm 0.0406612
[0] ISColoringCreate(): Number of colors 20
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] PetscCommDuplicate(): Using internal PETSc communicator 140322706696992 29531024
[0] MatFDColoringSetUp_SeqXAIJ(): ncolors 20, brows 66 and bcols 15 are used.
[0] SNESComputeJacobian(): Rebuilding preconditioner [0] PCSetUp(): Setting up PC for first time [0] MatConvert(): Check superclass seqhypre seqaijcusparse -> 0 [0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0 [0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_seqhypre_C (seqaijcusparse) -> 0 [0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_mpihypre_C (seqaijcusparse) -> 0 [0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_hypre_C (seqaijcusparse) -> 1 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical 
preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving 
PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] 
PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] KSPConvergedDefault(): Linear solver has converged. Residual norm 3.001654795047e-07 is less than relative tolerance 1.000000000000e-05 times initial right hand side norm 4.066115181565e-02 at iteration 33 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] SNESSolve_NEWTONLS(): iter=0, linear solve iterations=33 [0] SNESNEWTONLSCheckResidual_Private(): ||J^T(F-Ax)||/||F-AX|| 2.890238131751e+01 near zero implies inconsistent rhs [0] PetscSplitReductionGet(): Putting reduction data in an MPI_Comm 29662720 [0] SNESLineSearchApply_BT(): Initial fnorm 4.066115181565e-02 gnorm 3.338338626166e-06 [0] SNESSolve_NEWTONLS(): fnorm=4.0661151815649638e-02, gnorm=3.3383386261659113e-06, ynorm=5.4373378910396353e-01, lssucceed=0 1 SNES Function norm 3.33834e-06 [0] SNESComputeJacobian(): Rebuilding preconditioner [0] PCSetUp(): Setting up PC with same nonzero pattern [0] MatConvert(): Check superclass seqhypre seqaijcusparse -> 0 [0] MatConvert(): Check superclass mpihypre seqaijcusparse -> 0 [0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_seqhypre_C (seqaijcusparse) -> 0 [0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_mpihypre_C (seqaijcusparse) -> 0 [0] MatConvert(): Check specialized (1) MatConvert_seqaijcusparse_hypre_C (seqaijcusparse) -> 1 [0] PetscCommGetComm(): Reusing a communicator 29662720 68829840 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): 
Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner 
since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] KSPConvergedDefault(): Linear solver has converged. Residual norm 2.753325754967e-11 is less than relative tolerance 1.000000000000e-05 times initial right hand side norm 3.338338626166e-06 at iteration 29 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] SNESSolve_NEWTONLS(): iter=1, linear solve iterations=29 [0] SNESNEWTONLSCheckResidual_Private(): ||J^T(F-Ax)||/||F-AX|| 3.172675080131e+01 near zero implies inconsistent rhs [0] SNESLineSearchApply_BT(): Initial fnorm 3.338338626166e-06 gnorm 2.754150439906e-11 [0] SNESSolve_NEWTONLS(): fnorm=3.3383386261659113e-06, gnorm=2.7541504399056686e-11, ynorm=1.6805315020558734e-05, lssucceed=0 2 SNES Function norm 2.754e-11 [0] SNESConvergedDefault(): Converged due to function norm 2.754150439906e-11 < 4.066115181565e-10 (relative tolerance) Number of SNES iterations = 2 [0] Petsc_OuterComm_Attr_Delete_Fn(): Removing reference to PETSc communicator embedded in a user MPI_Comm 29531024 [0] Petsc_InnerComm_Attr_Delete_Fn(): User MPI_Comm 140322706696992 is being unlinked from inner PETSc comm 29531024 [0] PetscCommDestroy(): Deleting PETSc MPI_Comm 29531024 [0] Petsc_Counter_Attr_Delete_Fn(): Deleting counter data in an MPI_Comm 29531024 [0] PetscFinalize(): PetscFinalize() called [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm 29662720 [0] Petsc_OuterComm_Attr_Delete_Fn(): Removing reference to PETSc communicator embedded in a user MPI_Comm 29662720 [0] Petsc_InnerComm_Attr_Delete_Fn(): User MPI_Comm 140322706697504 is 
being unlinked from inner PETSc comm 29662720
[0] PetscCommDestroy(): Deleting PETSc MPI_Comm 29662720
[0] Petsc_DelReduction(): Deleting reduction data in an MPI_Comm 29662720
[0] Petsc_Counter_Attr_Delete_Fn(): Deleting counter data in an MPI_Comm 29662720

> On Jul 14, 2022, at 1:56 PM, Stefano Zampini <[email protected]>
> wrote:
>
> You don't need unified memory for boomeramg to work.
>
> On Thu, Jul 14, 2022, 18:55 Barry Smith <[email protected]> wrote:
>
>> So the PETSc tests all run, including the test that uses a GPU.
>>
>> The hypre test is failing. It is impossible to tell from the output why.
>>
>> You can run it manually, cd src/snes/tutorials
>>
>> make ex19
>> mpiexec -n 1 ./ex19 -dm_vec_type cuda -dm_mat_type aijcusparse -da_refine 3
>> -snes_monitor_short -ksp_norm_type unpreconditioned -pc_type hypre -info >
>> somefile
>>
>> then take a look at the output in somefile and send it to us.
>>
>> Barry
>>
>>> On Jul 14, 2022, at 12:32 PM, Juan Pablo de Lima Costa Salazar via
>>> petsc-users <[email protected]> wrote:
>>>
>>> Hello,
>>>
>>> I was hoping to get help regarding a runtime error I am encountering on a
>>> cluster node with 4 Tesla K40m GPUs after configuring PETSc with the
>>> following command:
>>>
>>> $./configure --force \
>>> --with-precision=double \
>>> --with-debugging=0 \
>>> --COPTFLAGS=-O3 \
>>> --CXXOPTFLAGS=-O3 \
>>> --FOPTFLAGS=-O3 \
>>> PETSC_ARCH=linux64GccDPInt32-spack \
>>> --download-fblaslapack \
>>> --download-openblas \
>>> --download-hypre \
>>> --download-hypre-configure-arguments=--enable-unified-memory \
>>> --with-mpi-dir=/opt/ohpc/pub/mpi/openmpi4-gnu9/4.0.4 \
>>> --with-cuda=1 \
>>> --download-suitesparse \
>>> --download-dir=downloads \
>>> --with-cudac=/opt/ohpc/admin/spack/0.15.0/opt/spack/linux-centos8-ivybridge/gcc-9.3.0/cuda-11.7.0-hel25vgwc7fixnvfl5ipvnh34fnskw3m/bin/nvcc
>>> \
>>> --with-packages-download-dir=downloads \
>>> --download-sowing=downloads/v1.1.26-p4.tar.gz \
>>> --with-cuda-arch=35
>>>
>>> When I run
>>>
>>> $ make PETSC_DIR=/home/juan/OpenFOAM/juan-v2206/petsc-cuda
>>> PETSC_ARCH=linux64GccDPInt32-spack check
>>> Running check examples to verify correct installation
>>> Using PETSC_DIR=/home/juan/OpenFOAM/juan-v2206/petsc-cuda and
>>> PETSC_ARCH=linux64GccDPInt32-spack
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
>>> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
>>> 3,5c3,15
>>> < 1 SNES Function norm 4.12227e-06
>>> < 2 SNES Function norm 6.098e-11
>>> < Number of SNES iterations = 2
>>> ---
>>>> CUDA ERROR (code = 101, invalid device ordinal) at memory.c:139
>>>> CUDA ERROR (code = 101, invalid device ordinal) at memory.c:139
>>>> --------------------------------------------------------------------------
>>>> Primary job terminated normally, but 1 process returned
>>>> a non-zero exit code. Per user-direction, the job has been aborted.
>>>> --------------------------------------------------------------------------
>>>> --------------------------------------------------------------------------
>>>> mpiexec detected that one or more processes exited with non-zero status,
>>>> thus causing
>>>> the job to be terminated. The first process to do so was:
>>>>
>>>> Process name: [[52712,1],0]
>>>> Exit code: 1
>>>> --------------------------------------------------------------------------
>>> /home/juan/OpenFOAM/juan-v2206/petsc-cuda/src/snes/tutorials
>>> Possible problem with ex19 running with hypre, diffs above
>>> =========================================
>>> C/C++ example src/snes/tutorials/ex19 run successfully with cuda
>>> C/C++ example src/snes/tutorials/ex19 run successfully with suitesparse
>>> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process
>>> Completed test examples
>>>
>>> I have compiled the code on the head node (without GPUs) and on the compute
>>> node where there are 4 GPUs.
>>>
>>> $nvidia-debugdump -l
>>> Found 4 NVIDIA devices
>>> Device ID: 0
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717032250
>>>
>>> Device ID: 1
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717031968
>>>
>>> Device ID: 2
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717032246
>>>
>>> Device ID: 3
>>> Device name: Tesla K40m
>>> GPU internal ID: 0320717032235
>>>
>>> Attached are the log files from configure and make.
>>>
>>> Any pointers are highly appreciated. My intention is to use PETSc as a
>>> linear solver for OpenFOAM, leveraging the availability of GPUs at the same
>>> time. Currently I can run PETSc without GPU support.
>>>
>>> Cheers,
>>> Juan S.
>>>
>>> <configure.log.tar.gz><make.log.tar.gz>
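
P.S. In case it helps anyone comparing -info logs like the two in this message: most of the file is "[rank] Function(): ..." messages, and the failure is a single line among them. Below is a minimal sketch of a filter that hides the info messages so only the program output and any error text remain. It assumes the log has one message per line, as PETSc writes it, and that every info message starts with "[rank] " as in the logs here. The function name non_info_lines is made up for illustration and is not part of PETSc.

```shell
# Print the lines of a PETSc -info log that are NOT "[rank] ..." info
# messages, each prefixed with its line number in the original file.
# What remains is the program's own output plus any error text.
# Assumption: info messages always begin with "[<digits>] ".
non_info_lines() {
    grep -n -v -E '^\[[0-9]+\] ' "$1"
}

# Example usage, with log.ex19 being the file written by the mpiexec
# command in this message:
#   non_info_lines log.ex19
```

The line numbers make it easy to open the full log at the point of failure and read the info messages just before it, which in the failing run are the MatConvert() checks.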
