With asm I see a range of 8GB-13GB, a slightly smaller ratio, so that probably explains it. (Does this still seem like a lot of memory to you for the problem size?)
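
For reference, my rough back-of-the-envelope on the operator alone (my own estimate, assuming AIJ storage with double-precision values and 32-bit column indices):

  759,709,392 nonzeros x (8 + 4) bytes ~ 9.1 GB for the global matrix,
  or roughly 1.1 GB per rank split over 8 ranks,

so 8GB-13GB per rank is already several times the local matrix; I'd guess the rest is the ASM subdomain factors, host/device copies, Krylov vectors, and cuSPARSE workspace, but I may be missing something.
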
In general I don't have the same number of blocks per row, so I suppose it makes sense there's some memory imbalance.

On Wed, Jan 18, 2023 at 3:35 PM Mark Adams <[email protected]> wrote:

> Can your problem have load imbalance?
>
> You might try '-pc_type asm' (and/or jacobi) to see your baseline load
> imbalance. GAMG can add some load imbalance, but start by getting a
> baseline.
>
> Mark
>
> On Wed, Jan 18, 2023 at 2:54 PM Mark Lohry <[email protected]> wrote:
>
>> Q0) Does -memory_view trace GPU memory as well, or is there another
>> method to query the peak device memory allocation?
>>
>> Q1) I'm loading an aijcusparse matrix with MatLoad and running with
>> -ksp_type fgmres -pc_type gamg -mg_levels_pc_type asm. The matrix has
>> 27,142,948 rows and cols, bs=4, and 759,709,392 total nonzeros. Using
>> 8 ranks on 8x 80GB GPUs, during the setup phase, before crashing with
>> CUSPARSE_STATUS_INSUFFICIENT_RESOURCES, nvidia-smi shows the output
>> pasted below.
>>
>> GPU memory usage spans 36GB-50GB, but one rank is at 77GB. Is this
>> expected? Do I need to manually repartition this somehow?
>>
>> Thanks,
>> Mark
>>
>> +-----------------------------------------------------------------------------+
>> | Processes:                                                                  |
>> |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
>> |        ID   ID                                                   Usage      |
>> |=============================================================================|
>> |    0   N/A  N/A    1630309     C   nvidia-cuda-mps-server             27MiB |
>> |    0   N/A  N/A    1696543     C   ./petsc_solver_test             38407MiB |
>> |    0   N/A  N/A    1696544     C   ./petsc_solver_test               467MiB |
>> |    0   N/A  N/A    1696545     C   ./petsc_solver_test               467MiB |
>> |    0   N/A  N/A    1696546     C   ./petsc_solver_test               467MiB |
>> |    0   N/A  N/A    1696548     C   ./petsc_solver_test               467MiB |
>> |    0   N/A  N/A    1696550     C   ./petsc_solver_test               471MiB |
>> |    0   N/A  N/A    1696551     C   ./petsc_solver_test               467MiB |
>> |    0   N/A  N/A    1696552     C   ./petsc_solver_test               467MiB |
>> |    1   N/A  N/A    1630309     C   nvidia-cuda-mps-server             27MiB |
>> |    1   N/A  N/A    1696544     C   ./petsc_solver_test             35849MiB |
>> |    2   N/A  N/A    1630309     C   nvidia-cuda-mps-server             27MiB |
>> |    2   N/A  N/A    1696545     C   ./petsc_solver_test             36719MiB |
>> |    3   N/A  N/A    1630309     C   nvidia-cuda-mps-server             27MiB |
>> |    3   N/A  N/A    1696546     C   ./petsc_solver_test             37343MiB |
>> |    4   N/A  N/A    1630309     C   nvidia-cuda-mps-server             27MiB |
>> |    4   N/A  N/A    1696548     C   ./petsc_solver_test             36935MiB |
>> |    5   N/A  N/A    1630309     C   nvidia-cuda-mps-server             27MiB |
>> |    5   N/A  N/A    1696550     C   ./petsc_solver_test             49953MiB |
>> |    6   N/A  N/A    1630309     C   nvidia-cuda-mps-server             27MiB |
>> |    6   N/A  N/A    1696551     C   ./petsc_solver_test             47693MiB |
>> |    7   N/A  N/A    1630309     C   nvidia-cuda-mps-server             27MiB |
>> |    7   N/A  N/A    1696552     C   ./petsc_solver_test             77331MiB |
>> +-----------------------------------------------------------------------------+
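
On Q0, as a cross-check I may just print the device memory on each rank around the setup, with something like the sketch below (untested, assuming a CUDA build; the function name is mine):

#include <petscsys.h>
#include <cuda_runtime.h>

/* Sketch: print used/total device memory on every rank, e.g. before and
   after PCSetUp(), to see where the peak lands. Assumes one GPU per rank
   as in the MPS setup above. */
static PetscErrorCode ReportDeviceMemory(MPI_Comm comm, const char *label)
{
  size_t      free_b = 0, total_b = 0;
  PetscMPIInt rank;

  PetscFunctionBeginUser;
  PetscCallMPI(MPI_Comm_rank(comm, &rank));
  if (cudaMemGetInfo(&free_b, &total_b) != cudaSuccess)
    SETERRQ(comm, PETSC_ERR_LIB, "cudaMemGetInfo() failed");
  PetscCall(PetscSynchronizedPrintf(comm, "[%d] %s: %.1f of %.1f GiB in use\n",
                                    rank, label,
                                    (double)(total_b - free_b) / 1073741824.0,
                                    (double)total_b / 1073741824.0));
  PetscCall(PetscSynchronizedFlush(comm, PETSC_STDOUT));
  PetscFunctionReturn(0);
}

That only reports instantaneous, device-wide usage (so under MPS it includes the other processes' contexts) rather than the true peak allocation, but bracketing PCSetUp() with it should show roughly where the imbalance appears.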

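And on the repartitioning question, if it comes to that: my understanding (worth double-checking) is that MatLoad() respects a layout set beforehand with MatSetSizes(), so I could compute local row counts that balance nonzeros instead of rows and load as in the sketch below (untested; the helper name and the my_local_rows argument are mine, and my_local_rows should stay a multiple of the block size 4):

#include <petscmat.h>

/* Sketch: load the binary matrix with an explicit row distribution instead
   of the default near-equal split of rows. my_local_rows would come from a
   prior pass over the per-row nonzero counts so that nonzeros are balanced. */
static PetscErrorCode LoadWithLayout(MPI_Comm comm, const char *file, PetscInt my_local_rows, Mat *A)
{
  PetscViewer viewer;

  PetscFunctionBeginUser;
  PetscCall(MatCreate(comm, A));
  PetscCall(MatSetType(*A, MATAIJCUSPARSE));
  /* Local sizes set before MatLoad() define the parallel layout;
     global sizes are left for PETSc to determine. */
  PetscCall(MatSetSizes(*A, my_local_rows, my_local_rows, PETSC_DETERMINE, PETSC_DETERMINE));
  PetscCall(PetscViewerBinaryOpen(comm, file, FILE_MODE_READ, &viewer));
  PetscCall(MatLoad(*A, viewer));
  PetscCall(PetscViewerDestroy(&viewer));
  PetscFunctionReturn(0);
}

The alternative would be a proper MatPartitioning pass after loading, but adjusting the load-time row split seemed like the smaller first step.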