On Wed, Aug 6, 2025 at 11:53 AM howen via petsc-users <petsc-users@mcs.anl.gov> wrote:

> Dear Sir,
>
> I am introducing petsc into our fortran + openacc code,
> https://gitlab.com/bsc_sod2d/sod2d_gitlab
>
> My final objective is to run AMG (Boomer from hypre) on the GPU.
>
> For the moment I am performing tests on the CPU only.
>
> I run on Marenostrum-V.
> https://www.bsc.es/supportkc/docs/MareNostrum5/intro/
>
> I am compiling my code with NVHPC and the support team from BSC has
> compiled petsc + hypre for me.
>
> In the configuration they used -with-cuda.
>
> I have run petsc and it works correctly both in serial and parallel on the
> CPU. In my code I use call MatSetType(amat,MATAIJ,ierr).
> I understand this is the expected behaviour for Petsc.
> That is, that even if one compiles with petsc cuda support one can run
> only on the GPU depending on what one sets in MatSetType.
> Could you confirm that this is the expected behaviour? Instead it seems
> that when petsc+hypre is used one needs to
> compile specific versions for CPU and GPU.
>
> When trying to use hypre through petsc if one has compiled petsc using
> -with-cuda the run fails.
> From what I have understood this is not the expected behavior. Could you
> confirm this?
>
> Depending on what options I give, the error is different
>
> If I use
>
> -pc_type hypre
> -pc_hypre_type boomeramg
> -pc_hypre_boomeramg_coarsen_type hmis
> -pc_hypre_boomeramg_interp_type ext+i
> -pc_hypre_boomeramg_relax_type_all SOR/Jacobi
> -pc_hypre_boomeramg_relax_type_coarse SOR/Jacobi
> -pc_hypre_boomeramg_grid_sweeps_coarse 1
> -pc_hypre_boomeramg_strong_threshold 0.25
>
> I get
>
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/
> [0]PETSC ERROR: or try https://docs.nvidia.com/cuda/cuda-memcheck/index.html on NVIDIA CUDA systems to find memory corruption errors
> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------
> [0]PETSC ERROR: The line numbers in the error traceback are not always exact.
> [0]PETSC ERROR: #1 jac->setup()
> [0]PETSC ERROR: #2 PCSetUp_HYPRE() at /gpfs/apps/MN5/ACC/PETSC/SRC/petsc-v3.21.0_hypre-debug/src/ksp/pc/impls/hypre/hypre.c:422
> [0]PETSC ERROR: #3 PCSetUp() at /gpfs/apps/MN5/ACC/PETSC/SRC/petsc-v3.21.0_hypre-debug/src/ksp/pc/interface/precon.c:1079
> [0]PETSC ERROR: #4 KSPSetUp() at /gpfs/apps/MN5/ACC/PETSC/SRC/petsc-v3.21.0_hypre-debug/src/ksp/ksp/interface/itfunc.c:415
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI COMMUNICATOR 3 SPLIT FROM 0
> with errorcode 59.
>
> Which does not help much.
>
> Instead if I use only
>
> -pc_type hypre
>
> I get
>
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Invalid argument
> [0]PETSC ERROR: HYPRE_MEMORY_DEVICE expects a device vector. You need to enable PETSc device support, for example, in some cases, -vec_type cuda
>
> This helped me realise that hypre was trying to use the GPU despite in my
> code
>

We try to have a good error message here (if you could give us the code with the SEGV, we will fix it to give an informative error). This is a limitation of Hypre, namely that it runs _either_ on the CPU or the GPU, not both at the same time. Hopefully this will not be the case with a future release.

  Thanks,

     Matt

> Both runs are successful when compiling without -with-cuda
>
> Best,
> Herbert Owen
> Senior Researcher, Dpt. Computer Applications in Science and Engineering
> Barcelona Supercomputing Center (BSC-CNS)
> Tel: +34 93 413 4038
> Skype: herbert.owen
>
> https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en
>

-- 
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
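
A rough sketch of what the second error message points to, for a build where hypre was compiled with CUDA: select device matrix and vector types at run time. The only part taken from the thread is -vec_type cuda; the pairing with -mat_type aijcusparse is an assumption. It also assumes the application calls MatSetFromOptions and VecSetFromOptions instead of hard-coding MatSetType(amat,MATAIJ,ierr), which has not been checked against this code.

```
# Hypothetical PETSc options file for a CUDA-enabled PETSc + hypre build.
# Device types, so that hypre receives device data:
-mat_type aijcusparse
-vec_type cuda
# Preconditioner selection, as in the original report:
-pc_type hypre
-pc_hypre_type boomeramg
```

With a CPU-only hypre build the two device-type lines would be dropped, which matches the report that both runs succeed when PETSc is compiled without CUDA.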