Also, the configure.log has

  #define PETSC_HAVE_MPI_GPU_AWARE 1

which says that PETSc's configure detected GPU-aware MPI support.
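
You can also verify the flag landed in the build tree (a quick check,
assuming the usual $PETSC_DIR/$PETSC_ARCH layout):

  grep MPI_GPU_AWARE $PETSC_DIR/$PETSC_ARCH/include/petscconf.h

which should print the same #define if configure detected GPU-aware MPI.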

  Thanks,

     Matt

On Tue, Sep 23, 2025 at 1:20 AM Satish Balay <[email protected]> wrote:

> orte-info output does suggest OpenMPI is built with cuda enabled.
>
> Are you able to run PETSc examples? What do you get for:
>
> >>>>
> balay@petsc-gpu-01:/scratch/balay/petsc/src/snes/tutorials$ make ex19
> /scratch/balay/petsc/arch-linux-c-debug/bin/mpicc -fPIC -Wall
> -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch
> -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -g3 -O0
> -I/scratch/balay/petsc/include
> -I/scratch/balay/petsc/arch-linux-c-debug/include
>
> -I/nfs/gce/projects/petsc/soft/u22.04/spack-2024-11-27-cuda/opt/spack/linux-ubuntu22.04-x86_64/gcc-11.4.0/cuda-12.0.1-gy7foq57oi6wzltombtsdy5eqz5gkjgc/include
>     -Wl,-export-dynamic ex19.c
> -Wl,-rpath,/scratch/balay/petsc/arch-linux-c-debug/lib
> -L/scratch/balay/petsc/arch-linux-c-debug/lib
>
> -Wl,-rpath,/nfs/gce/projects/petsc/soft/u22.04/spack-2024-11-27-cuda/opt/spack/linux-ubuntu22.04-x86_64/gcc-11.4.0/cuda-12.0.1-gy7foq57oi6wzltombtsdy5eqz5gkjgc/lib64
>
> -L/nfs/gce/projects/petsc/soft/u22.04/spack-2024-11-27-cuda/opt/spack/linux-ubuntu22.04-x86_64/gcc-11.4.0/cuda-12.0.1-gy7foq57oi6wzltombtsdy5eqz5gkjgc/lib64
>
> -L/nfs/gce/projects/petsc/soft/u22.04/spack-2024-11-27-cuda/opt/spack/linux-ubuntu22.04-x86_64/gcc-11.4.0/cuda-12.0.1-gy7foq57oi6wzltombtsdy5eqz5gkjgc/lib64/stubs
> -Wl,-rpath,/scratch/balay/petsc/arch-linux-c-debug/lib
> -L/scratch/balay/petsc/arch-linux-c-debug/lib
> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/11
> -L/usr/lib/gcc/x86_64-linux-gnu/11 -lpetsc -llapack -lblas -lm -lcudart
> -lnvToolsExt -lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda
> -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
> -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -o ex19
> balay@petsc-gpu-01:/scratch/balay/petsc/src/snes/tutorials$ ./ex19
> -snes_monitor -dm_mat_type seqaijcusparse -dm_vec_type seqcuda -pc_type
> gamg -pc_gamg_esteig_ksp_max_it 10 -ksp_monitor -mg_levels_ksp_max_it 3
> lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
>   0 SNES Function norm 2.391552133017e-01
>     0 KSP Residual norm 2.013462697105e-01
>     1 KSP Residual norm 5.027022294231e-02
>     2 KSP Residual norm 7.248258907839e-03
>     3 KSP Residual norm 8.590847505363e-04
>     4 KSP Residual norm 1.511762118013e-05
>     5 KSP Residual norm 1.410585959219e-06
>   1 SNES Function norm 6.812362089434e-05
>     0 KSP Residual norm 2.315252918142e-05
>     1 KSP Residual norm 2.351994603807e-06
>     2 KSP Residual norm 3.882072626158e-07
>     3 KSP Residual norm 2.227447016095e-08
>     4 KSP Residual norm 2.200353394658e-09
>     5 KSP Residual norm 1.147903850265e-10
>   2 SNES Function norm 3.411489611752e-10
> Number of SNES iterations = 2
> balay@petsc-gpu-01:/scratch/balay/petsc/src/snes/tutorials$
> <<<<
>
> So what issue are you seeing with your code? And does it go away with the
> option "-use_gpu_aware_mpi 0"? For example:
>
> >>>>
> balay@petsc-gpu-01:/scratch/balay/petsc/src/snes/tutorials$ ./ex19
> -snes_monitor -dm_mat_type seqaijcusparse -dm_vec_type seqcuda -pc_type
> gamg -pc_gamg_esteig_ksp_max_it 10 -ksp_monitor -mg_levels_ksp_max_it 3
> -use_gpu_aware_mpi 0
> lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
>   0 SNES Function norm 2.391552133017e-01
>     0 KSP Residual norm 2.013462697105e-01
>     1 KSP Residual norm 5.027022294231e-02
>     2 KSP Residual norm 7.248258907839e-03
>     3 KSP Residual norm 8.590847505363e-04
>     4 KSP Residual norm 1.511762118013e-05
>     5 KSP Residual norm 1.410585959219e-06
>   1 SNES Function norm 6.812362089434e-05
>     0 KSP Residual norm 2.315252918142e-05
>     1 KSP Residual norm 2.351994603807e-06
>     2 KSP Residual norm 3.882072626158e-07
>     3 KSP Residual norm 2.227447016095e-08
>     4 KSP Residual norm 2.200353394658e-09
>     5 KSP Residual norm 1.147903850265e-10
>   2 SNES Function norm 3.411489611752e-10
> Number of SNES iterations = 2
> balay@petsc-gpu-01:/scratch/balay/petsc/src/snes/tutorials$
> <<<<
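>
> [A note in case it helps: if changing the command line of your own code
> is awkward, PETSc also reads options from the PETSC_OPTIONS environment
> variable, so e.g. PETSC_OPTIONS="-use_gpu_aware_mpi 0" ./your_app
> (your_app being a placeholder for your executable) should behave the
> same.]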
>
> Satish
>
> On Tue, 23 Sep 2025, 岳新海 wrote:
>
> > I get:
> > [mae_yuexh@login01 ~]$ orte-info |grep 'MCA btl'
> >                  MCA btl: smcuda (MCA v2.1, API v3.1, Component v4.1.5)
> >                  MCA btl: tcp (MCA v2.1, API v3.1, Component v4.1.5)
> >                  MCA btl: self (MCA v2.1, API v3.1, Component v4.1.5)
> >                  MCA btl: vader (MCA v2.1, API v3.1, Component v4.1.5)
> >
> > Xinhai
> >
> > 岳新海
> > Southern University of Science and Technology / Graduate Student, Class of 2023
> > No. 1088 Xueyuan Avenue, Nanshan District, Shenzhen, Guangdong
> >
> > ------------------ Original ------------------
> > From: "Satish Balay" <[email protected]>
> > Date: Tue, Sep 23, 2025 03:25 AM
> > To: "岳新海" <[email protected]>
> > Cc: "petsc-dev" <[email protected]>
> > Subject: Re: [petsc-dev] Question on PETSc + CUDA configuration with MPI on cluster
> >
> > What do you get for (with your openmpi install): orte-info |grep 'MCA btl'
> >
> > With cuda built openmpi - I get:
> > balay@petsc-gpu-01:/scratch/balay/petsc$ ./arch-linux-c-debug/bin/orte-info |grep 'MCA btl'
> >                  MCA btl: smcuda (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: openib (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: self (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: tcp (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: vader (MCA v2.1, API v3.1, Component v4.1.6)
> >
> > And without cuda:
> > balay@petsc-gpu-01:/scratch/balay/petsc.x$ ./arch-test/bin/orte-info | grep 'MCA btl'
> >                  MCA btl: openib (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: self (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: tcp (MCA v2.1, API v3.1, Component v4.1.6)
> >                  MCA btl: vader (MCA v2.1, API v3.1, Component v4.1.6)
> >
> > i.e. "smcuda" should be listed for a cuda-enabled openmpi.
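> >
> > Another way to check, if ompi_info from the same openmpi install is on
> > your PATH, is to ask openmpi directly whether it was built with cuda
> > support:
> >
> > ompi_info --parsable --all | grep mpi_built_with_cuda_support:value
> >
> > which should end in ":value:true" for a cuda-enabled build.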
> >
> > It's not clear if GPU-aware MPI makes a difference for all MPI impls (or versions) - so it's good to verify. [It's a performance issue anyway - so primarily useful when performing timing measurements.]
> >
> > Satish
> >
> > On Mon, 22 Sep 2025, 岳新海 wrote:
> >
> > > Dear PETSc Team,
> > >
> > > I am encountering an issue when running PETSc with CUDA support on a cluster. When I set the vector type to VECCUDA, PETSc reports that my MPI is not GPU-aware. However, the MPI library (OpenMPI 4.1.5) I used to configure PETSc was built with the --with-cuda option enabled.
> > >
> > > Here are some details:
> > > PETSc version: 3.20.6
> > > MPI: OpenMPI 4.1.5, configured with --with-cuda
> > > GPU: RTX3090
> > > CUDA version: 12.1
> > >
> > > I have attached both my PETSc configure command and OpenMPI configure command for reference.
> > >
> > > My questions are:
> > >
> > > Even though I enabled --with-cuda in OpenMPI, why does PETSc still report that MPI is not GPU-aware?
> > >
> > > Are there additional steps or specific configuration flags required (either in OpenMPI or PETSc) to ensure GPU-aware MPI is correctly detected?
> > >
> > > Any guidance or suggestions would be greatly appreciated.
> > >
> > > Best regards,
> > >
> > > Xinhai Yue
> > >
> > > 岳新海
> > > Southern University of Science and Technology / Graduate Student, Class of 2023
> > > No. 1088 Xueyuan Avenue, Nanshan District, Shenzhen, Guangdong



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
