Could your problem itself be load imbalanced? You might try '-pc_type asm' (and/or '-pc_type jacobi') to see your baseline load imbalance. GAMG can add some load imbalance, but start by getting a baseline.
Mark

On Wed, Jan 18, 2023 at 2:54 PM Mark Lohry <[email protected]> wrote:

> Q0) does -memory_view trace GPU memory as well, or is there another method
> to query the peak device memory allocation?
>
> Q1) I'm loading a aijcusparse matrix with MatLoad, and running with
> -ksp_type fgmres -pc_type gamg -mg_levels_pc_type asm with mat info
> 27,142,948 rows and cols, bs=4, total nonzeros 759,709,392. Using 8 ranks
> on 8x80GB GPUs, and during the setup phase before crashing with
> CUSPARSE_STATUS_INSUFFICIENT_RESOURCES nvidia-smi shows the below pasted
> content.
>
> GPU memory usage spanning from 36GB-50GB but with one rank at 77GB. Is
> this expected? Do I need to manually repartition this somehow?
>
> Thanks,
> Mark
>
> +-----------------------------------------------------------------------------+
> | Processes:                                                                  |
> |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
> |        ID   ID                                                   Usage      |
> |=============================================================================|
> |    0   N/A  N/A   1630309      C   nvidia-cuda-mps-server             27MiB |
> |    0   N/A  N/A   1696543      C   ./petsc_solver_test             38407MiB |
> |    0   N/A  N/A   1696544      C   ./petsc_solver_test               467MiB |
> |    0   N/A  N/A   1696545      C   ./petsc_solver_test               467MiB |
> |    0   N/A  N/A   1696546      C   ./petsc_solver_test               467MiB |
> |    0   N/A  N/A   1696548      C   ./petsc_solver_test               467MiB |
> |    0   N/A  N/A   1696550      C   ./petsc_solver_test               471MiB |
> |    0   N/A  N/A   1696551      C   ./petsc_solver_test               467MiB |
> |    0   N/A  N/A   1696552      C   ./petsc_solver_test               467MiB |
> |    1   N/A  N/A   1630309      C   nvidia-cuda-mps-server             27MiB |
> |    1   N/A  N/A   1696544      C   ./petsc_solver_test             35849MiB |
> |    2   N/A  N/A   1630309      C   nvidia-cuda-mps-server             27MiB |
> |    2   N/A  N/A   1696545      C   ./petsc_solver_test             36719MiB |
> |    3   N/A  N/A   1630309      C   nvidia-cuda-mps-server             27MiB |
> |    3   N/A  N/A   1696546      C   ./petsc_solver_test             37343MiB |
> |    4   N/A  N/A   1630309      C   nvidia-cuda-mps-server             27MiB |
> |    4   N/A  N/A   1696548      C   ./petsc_solver_test             36935MiB |
> |    5   N/A  N/A   1630309      C   nvidia-cuda-mps-server             27MiB |
> |    5   N/A  N/A   1696550      C   ./petsc_solver_test             49953MiB |
> |    6   N/A  N/A   1630309      C   nvidia-cuda-mps-server             27MiB |
> |    6   N/A  N/A   1696551      C   ./petsc_solver_test             47693MiB |
> |    7   N/A  N/A   1630309      C   nvidia-cuda-mps-server             27MiB |
> |    7   N/A  N/A   1696552      C   ./petsc_solver_test             77331MiB |
> +-----------------------------------------------------------------------------+
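
For comparing a baseline run (for example with '-pc_type asm' or '-pc_type jacobi') against the GAMG run, it may help to reduce the nvidia-smi figures to a single number. The sketch below is only an illustration, not part of PETSc: the helper name `imbalance_ratio` is made up here, and the input list is the per-GPU memory of the main `./petsc_solver_test` process on each device as reported in the listing above. Memory footprint is at best a rough proxy for load imbalance, so treat the result as a hint rather than a measurement.

```python
# Per-GPU memory (MiB) of the main ./petsc_solver_test process on GPUs 0..7,
# copied from the nvidia-smi listing in the quoted message. The small
# ~467 MiB entries that the other ranks hold on GPU 0 are ignored here.
per_gpu_mib = [38407, 35849, 36719, 37343, 36935, 49953, 47693, 77331]


def imbalance_ratio(values):
    """Return max/mean of the values; 1.0 means perfectly balanced.

    Hypothetical helper for illustration only.
    """
    mean = sum(values) / len(values)
    return max(values) / mean


mean_mib = sum(per_gpu_mib) / len(per_gpu_mib)
worst_gpu = per_gpu_mib.index(max(per_gpu_mib))
ratio = imbalance_ratio(per_gpu_mib)

print(f"mean     = {mean_mib:.0f} MiB")
print(f"max      = {max(per_gpu_mib)} MiB on GPU {worst_gpu}")
print(f"max/mean = {ratio:.2f}")
```

Repeating the same calculation on the numbers from a baseline run would show how much of the skew is already present before GAMG adds its own.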
