Hello, I would like to understand why -pc_type gamg consumes more memory than -pc_type mg for the same problem size.
ksp/ksp/tutorial: ./ex45 -da_grid_x 368 -da_grid_y 368 -da_grid_z 368 -ksp_type cg

-pc_type mg
Maximum (over computational time) process memory: total 1.9399e+10 max 9.7000e+09 min 9.6992e+09

-pc_type gamg
Maximum (over computational time) process memory: total 4.9671e+10 max 2.4836e+10 min 2.4835e+10

Am I right in understanding that the memory-limiting factor is 'max 2.4836e+10', as that is the maximum memory used at any given time? I have attached the -log_view output for both preconditioners.

Best regards,
Karthik.
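P.S. For reference, the full invocations behind the numbers above (a sketch: the launcher is assumed to be mpirun with 2 ranks; all options are taken from the option tables in the attached -log_view output):

    # run 1: geometric multigrid (-pc_type mg)
    mpirun -n 2 ./ex45 -da_grid_x 368 -da_grid_y 368 -da_grid_z 368 \
        -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda \
        -ksp_type cg -pc_type mg -ksp_monitor \
        -memory_view -malloc_log -log_view

    # run 2: algebraic multigrid (-pc_type gamg)
    mpirun -n 2 ./ex45 -da_grid_x 368 -da_grid_y 368 -da_grid_z 368 \
        -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda \
        -ksp_type cg -pc_type gamg \
        -memory_view -malloc_log -log_view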
Residual norm 7.68921e-06 Summary of Memory Usage in PETSc Maximum (over computational time) process memory: total 4.9671e+10 max 2.4836e+10 min 2.4835e+10 Current process memory: total 4.5088e+09 max 2.3270e+09 min 2.1818e+09 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex45 on a named glados.dl.ac.uk with 2 processors, by kchockalingam Mon Nov 22 08:19:18 2021 Using Petsc Release Version 3.15.3, Aug 06, 2021 Max Max/Min Avg Total Time (sec): 1.740e+02 1.000 1.740e+02 Objects: 7.220e+02 1.000 7.220e+02 Flop: 1.119e+11 1.001 1.119e+11 2.238e+11 Flop/sec: 6.435e+08 1.001 6.432e+08 1.286e+09 MPI Messages: 6.105e+02 1.005 6.090e+02 1.218e+03 MPI Message Lengths: 2.849e+08 1.000 4.678e+05 5.698e+08 MPI Reductions: 6.940e+02 1.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 1.7396e+02 100.0% 2.2377e+11 100.0% 1.218e+03 100.0% 4.678e+05 100.0% 6.760e+02 97.4% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) CpuToGpu Count: total number of CPU to GPU copies per processor CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) GpuToCpu Count: total number of GPU to CPU copies per processor GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) GPU %F: percent flops on GPU in this event ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F --------------------------------------------------------------------------------------------------------------------------------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 76 1.0 7.5865e-0134.8 0.00e+00 0.0 9.0e+01 4.0e+00 7.6e+01 0 0 7 0 11 0 0 7 0 11 0 0 0 0.00e+00 0 0.00e+00 0 BuildTwoSidedF 29 1.0 1.6669e-0115.4 0.00e+00 0.0 1.6e+01 2.0e+06 2.9e+01 0 0 1 6 4 0 0 1 6 4 0 0 0 0.00e+00 0 0.00e+00 0 MatMult 310 1.0 1.2298e+00 1.2 3.61e+10 1.0 6.4e+02 3.6e+05 5.0e+00 1 32 53 40 1 1 32 53 40 1 58589 121425 2 2.19e+03 0 0.00e+00 100 MatMultAdd 50 1.0 8.1880e-02 1.6 2.23e+09 1.0 9.0e+01 7.0e+04 0.0e+00 0 2 7 1 0 0 2 7 1 0 54400 92984 0 0.00e+00 0 0.00e+00 100 MatMultTranspose 50 1.0 5.8773e-02 1.0 1.95e+09 1.0 1.1e+02 6.1e+04 5.0e+00 0 2 9 1 1 0 2 9 1 1 66343 79796 1 8.43e-03 0 0.00e+00 100 MatSolve 10 0.0 5.5659e-04 0.0 9.10e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2 2 0 0.00e+00 0 0.00e+00 100 MatSOR 255 1.0 4.7986e+01 1.0 2.92e+10 1.0 0.0e+00 0.0e+00 0.0e+00 27 26 0 0 0 27 26 0 0 0 1217 0 0 0.00e+00 444 1.94e+04 0 MatLUFactorSym 1 1.0 1.0550e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatLUFactorNum 1 1.0 5.7214e-04232.7 2.10e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 2 3.96e-04 0 0.00e+00 0 MatConvert 15 1.0 1.3423e+00 1.0 0.00e+00 0.0 2.0e+01 8.5e+04 5.0e+00 1 0 2 0 1 1 0 2 0 1 0 0 0 0.00e+00 8 9.05e+02 0 MatScale 15 1.0 1.3180e+00 1.0 7.97e+08 1.0 1.0e+01 3.4e+05 0.0e+00 1 1 1 1 0 1 1 1 1 0 1209 70111 10 8.91e+02 15 8.93e+02 14 MatResidual 50 1.0 1.8463e-01 1.7 6.02e+09 1.0 1.0e+02 3.4e+05 0.0e+00 0 5 8 6 0 0 5 8 6 0 65185 111619 50 4.00e-04 0 0.00e+00 100 MatAssemblyBegin 35 1.0 1.3208e-01 2.6 0.00e+00 0.0 1.6e+01 2.0e+06 1.5e+01 0 0 1 6 2 0 0 1 6 2 0 0 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 35 1.0 6.3233e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 5.6e+01 4 0 0 0 8 4 0 0 0 8 0 0 0 0.00e+00 0 0.00e+00 0 MatGetRowIJ 1 0.0 1.5944e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatCreateSubMat 2 1.0 4.7421e-04 1.0 0.00e+00 0.0 5.0e+00 7.9e+01 2.8e+01 0 0 0 0 4 0 0 0 0 4 0 0 0 0.00e+00 2 1.96e-04 0 MatGetOrdering 1 0.0 2.5809e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatCoarsen 5 1.0 3.7309e+00 1.0 0.00e+00 0.0 6.0e+01 2.3e+05 1.5e+01 2 0 5 2 2 2 0 5 2 2 0 0 0 0.00e+00 0 0.00e+00 0 MatAXPY 5 1.0 3.1612e+00 1.0 2.76e+07 1.0 0.0e+00 0.0e+00 5.0e+00 2 0 0 0 1 2 0 0 0 1 17 0 0 0.00e+00 10 
8.91e+02 0 MatMatMultSym 5 1.0 3.8528e+00 1.0 5.75e+08 1.0 3.0e+01 2.3e+05 3.0e+01 2 1 2 1 4 2 1 2 1 4 298 1762 43 3.78e+03 30 1.34e+03 100 MatMatMultNum 5 1.0 1.3230e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatPtAPSymbolic 5 1.0 7.6662e+00 1.0 7.03e+09 1.0 1.6e+02 1.2e+06 4.0e+01 4 6 13 34 6 4 6 13 34 6 1832 4422 47 6.68e+03 40 2.86e+03 100 MatPtAPNumeric 5 1.0 2.0198e+00 1.0 7.00e+09 1.0 1.0e+01 3.2e+06 0.0e+00 1 6 1 6 0 1 6 1 6 0 6924 7248 20 4.64e+02 0 0.00e+00 100 MatTrnMatMultSym 1 1.0 3.4088e+01 1.0 0.00e+00 0.0 1.0e+01 4.0e+06 1.2e+01 20 0 1 7 2 20 0 1 7 2 0 0 0 0.00e+00 0 0.00e+00 0 MatGetLocalMat 6 1.0 2.3597e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 8 8.90e+02 5 5.56e+02 0 MatGetBrAoCol 10 1.0 9.3712e-02 1.0 0.00e+00 0.0 6.0e+01 6.9e+05 0.0e+00 0 0 5 7 0 0 0 5 7 0 0 0 0 0.00e+00 0 0.00e+00 0 MatCUSPARSCopyTo 68 1.1 1.4593e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 66 9.68e+03 0 0.00e+00 0 MatCUSPARSCopyFr 30 1.0 4.0342e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 30 2.69e+03 0 MatCUSPARSSolAnl 2 0.0 2.5864e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatCUSPARSGenT 11 1.0 2.7540e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatSetPreallCOO 10 1.0 2.4171e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+01 1 0 0 0 4 1 0 0 0 4 0 0 50 6.46e+03 30 1.15e+03 0 MatSetValuesCOO 10 1.0 2.2698e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 KSPSetUp 13 1.0 1.3010e+01 1.0 1.94e+10 1.0 1.0e+02 3.4e+05 1.6e+02 7 17 8 6 22 7 17 8 6 23 2977 116420 170 2.43e+03 155 2.21e+03 67 KSPSolve 1 1.0 3.9555e+01 1.0 6.34e+10 1.0 6.2e+02 2.7e+05 3.3e+01 23 57 51 30 5 23 57 51 30 5 3202 107147 830 9.02e+03 422 1.72e+04 64 KSPGMRESOrthog 100 1.0 5.5859e-01 1.1 1.21e+10 1.0 0.0e+00 0.0e+00 1.0e+02 0 11 0 0 14 0 11 0 0 15 43431 127991 150 2.21e+03 100 5.32e-01 100 DMCreateMat 1 1.0 1.7780e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+00 10 0 0 0 1 10 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 SFSetGraph 56 1.0 1.3704e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFSetUp 47 1.0 7.6317e-01 4.1 0.00e+00 0.0 1.6e+02 3.6e+05 4.7e+01 0 0 13 10 7 0 0 13 10 7 0 0 0 0.00e+00 0 0.00e+00 0 SFBcastBegin 25 1.0 4.3326e-03 1.2 0.00e+00 0.0 5.0e+01 5.4e+05 0.0e+00 0 0 4 5 0 0 0 4 5 0 0 0 0 0.00e+00 0 0.00e+00 0 SFBcastEnd 25 1.0 5.9710e-02 4.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFReduceBegin 25 1.0 3.0703e-03 1.0 0.00e+00 0.0 5.0e+01 1.9e+06 0.0e+00 0 0 4 17 0 0 0 4 17 0 0 0 0 0.00e+00 0 0.00e+00 0 SFReduceEnd 25 1.0 2.1760e-01 4.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFFetchOpBegin 5 1.0 1.1191e-03 1.5 0.00e+00 0.0 1.0e+01 1.6e+06 0.0e+00 0 0 1 3 0 0 0 1 3 0 0 0 0 0.00e+00 0 0.00e+00 0 SFFetchOpEnd 5 1.0 9.7333e-03 1.0 0.00e+00 0.0 1.0e+01 1.6e+06 0.0e+00 0 0 1 3 0 0 0 1 3 0 0 0 0 0.00e+00 0 0.00e+00 0 SFPack 486 1.0 9.0454e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFUnpack 491 1.0 2.3874e-02 1.2 3.99e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33 0 0 0.00e+00 0 0.00e+00 100 VecMDot 100 1.0 4.3873e-01 1.2 6.07e+09 1.0 0.0e+00 0.0e+00 1.0e+02 0 5 0 0 14 0 5 0 0 15 27648 172456 50 2.21e+03 100 5.32e-01 100 VecTDot 18 1.0 1.1606e-02 1.0 8.97e+08 1.0 0.0e+00 0.0e+00 
1.8e+01 0 1 0 0 3 0 1 0 0 3 154578 156117 0 0.00e+00 18 1.44e-04 100 VecNorm 121 1.0 1.0401e-01 2.4 1.76e+09 1.0 0.0e+00 0.0e+00 1.2e+02 0 2 0 0 17 0 2 0 0 18 33867 196387 5 2.21e+02 121 9.68e-04 100 VecScale 110 1.0 1.7113e-02 1.0 6.07e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 70881 71050 110 8.80e-04 0 0.00e+00 100 VecCopy 162 1.0 9.8063e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecSet 357 1.0 4.8593e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecAXPY 29 1.0 5.3274e-02 1.0 1.06e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 39687 101587 30 1.99e+02 0 0.00e+00 100 VecAYPX 308 1.0 7.4778e-01 1.0 3.71e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 9914 58799 409 4.61e+03 0 0.00e+00 100 VecAXPBYCZ 100 1.0 6.5374e-01 1.0 5.51e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 16868 179853 400 4.41e+03 0 0.00e+00 100 VecMAXPY 110 1.0 1.4140e-01 1.0 7.17e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 101383 101406 110 5.20e-03 0 0.00e+00 100 VecAssemblyBegin 15 1.0 8.2847e-0219.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 2 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 VecAssemblyEnd 15 1.0 2.3514e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecPointwiseMult 55 1.0 4.6009e-02 1.0 3.03e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 13182 48660 5 2.21e+02 0 0.00e+00 100 VecScatterBegin 431 1.0 9.3988e-02 1.0 0.00e+00 0.0 9.0e+02 3.1e+05 1.7e+01 0 0 74 49 2 0 0 74 49 3 0 0 0 0.00e+00 0 0.00e+00 0 VecScatterEnd 431 1.0 3.1769e-01 7.3 3.99e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2 0 0 0.00e+00 0 0.00e+00 100 VecSetRandom 5 1.0 3.0144e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecNormalize 110 1.0 7.0528e-02 1.3 1.82e+09 1.0 0.0e+00 0.0e+00 1.1e+02 0 2 0 0 16 0 2 0 0 16 51596 118147 115 2.21e+02 110 8.80e-04 100 VecCUDACopyTo 262 1.0 1.5966e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 262 1.19e+04 0 0.00e+00 0 VecCUDACopyFrom 449 1.0 2.9575e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 0.00e+00 449 1.94e+04 0 PCGAMGGraph_AGG 5 1.0 2.1182e+01 1.0 5.75e+08 1.0 3.0e+01 1.7e+05 4.5e+01 12 1 2 1 6 12 1 2 1 7 54 0 0 0.00e+00 13 9.06e+02 0 PCGAMGCoarse_AGG 5 1.0 4.8827e+01 1.0 0.00e+00 0.0 8.8e+01 8.7e+05 3.8e+01 28 0 7 13 5 28 0 7 13 6 0 0 0 0.00e+00 0 0.00e+00 0 PCGAMGProl_AGG 5 1.0 7.0233e+00 1.0 0.00e+00 0.0 4.8e+01 2.7e+05 7.9e+01 4 0 4 2 11 4 0 4 2 12 0 0 0 0.00e+00 0 0.00e+00 0 PCGAMGPOpt_AGG 5 1.0 1.1097e+01 1.0 1.42e+10 1.0 1.5e+02 2.8e+05 1.8e+02 6 13 12 7 27 6 13 12 7 27 2555 31195 175 7.08e+03 155 3.12e+03 99 GAMG: createProl 5 1.0 8.8829e+01 1.0 1.48e+10 1.0 3.2e+02 4.3e+05 3.5e+02 51 13 26 24 50 51 13 26 24 51 332 30620 175 7.08e+03 168 4.02e+03 95 Graph 10 1.0 2.1133e+01 1.0 5.75e+08 1.0 3.0e+01 1.7e+05 4.5e+01 12 1 2 1 6 12 1 2 1 7 54 0 0 0.00e+00 13 9.06e+02 0 MIS/Agg 5 1.0 3.7314e+00 1.0 0.00e+00 0.0 6.0e+01 2.3e+05 1.5e+01 2 0 5 2 2 2 0 5 2 2 0 0 0 0.00e+00 0 0.00e+00 0 SA: col data 5 1.0 1.1503e+00 1.0 0.00e+00 0.0 3.6e+01 3.3e+05 3.4e+01 1 0 3 2 5 1 0 3 2 5 0 0 0 0.00e+00 0 0.00e+00 0 SA: frmProl0 5 1.0 5.6186e+00 1.0 0.00e+00 0.0 1.2e+01 1.2e+05 2.5e+01 3 0 1 0 4 3 0 1 0 4 0 0 0 0.00e+00 0 0.00e+00 0 SA: smooth 5 1.0 8.1271e+00 1.0 8.25e+08 1.0 3.0e+01 2.3e+05 4.5e+01 5 1 2 1 6 5 1 2 1 7 203 2055 53 4.67e+03 50 3.12e+03 83 GAMG: partLevel 5 1.0 9.6874e+00 1.0 1.40e+10 1.0 1.8e+02 1.2e+06 9.3e+01 6 13 15 40 
13 6 13 15 40 14 2894 5490 67 7.14e+03 42 2.86e+03 100 repartition 1 1.0 1.4764e-03 1.0 0.00e+00 0.0 1.4e+01 3.7e+01 5.3e+01 0 0 1 0 8 0 0 1 0 8 0 0 0 0.00e+00 2 1.96e-04 0 Invert-Sort 1 1.0 6.1192e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 Move A 1 1.0 7.0602e-04 1.0 0.00e+00 0.0 5.0e+00 7.9e+01 1.5e+01 0 0 0 0 2 0 0 0 0 2 0 0 0 0.00e+00 2 1.96e-04 0 Move P 1 1.0 2.9498e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+01 0 0 0 0 2 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 PCGAMG Squ l00 1 1.0 3.4088e+01 1.0 0.00e+00 0.0 1.0e+01 4.0e+06 1.2e+01 20 0 1 7 2 20 0 1 7 2 0 0 0 0.00e+00 0 0.00e+00 0 PCGAMG Gal l00 1 1.0 5.5473e+00 1.0 5.28e+09 1.0 3.4e+01 2.4e+06 8.0e+00 3 5 3 14 1 3 5 3 14 1 1903 4210 14 5.01e+03 8 2.05e+03 100 PCGAMG Opt l00 1 1.0 2.8926e+00 1.0 3.48e+08 1.0 6.0e+00 7.2e+05 6.0e+00 2 0 0 1 1 2 0 0 1 1 241 1610 9 3.02e+03 6 1.09e+03 100 PCGAMG Gal l01 1 1.0 3.0978e+00 1.0 6.19e+09 1.0 3.4e+01 3.4e+06 8.0e+00 2 6 3 21 1 2 6 3 21 1 3993 7014 14 1.94e+03 8 7.30e+02 100 PCGAMG Opt l01 1 1.0 7.3372e-01 1.0 1.59e+08 1.0 6.0e+00 3.0e+05 6.0e+00 0 0 0 0 1 0 0 0 0 1 433 2001 8 6.69e+02 6 2.15e+02 100 PCGAMG Gal l02 1 1.0 9.7982e-01 1.0 2.45e+09 1.0 3.4e+01 8.3e+05 8.0e+00 1 2 3 5 1 1 2 3 5 1 5001 6196 14 1.86e+02 8 8.35e+01 100 PCGAMG Opt l02 1 1.0 2.2427e-01 1.0 6.46e+07 1.0 6.0e+00 1.0e+05 6.0e+00 0 0 0 0 1 0 0 0 0 1 573 1958 8 8.82e+01 6 2.79e+01 100 PCGAMG Gal l03 1 1.0 5.5081e-02 1.0 1.07e+08 1.1 3.4e+01 7.0e+04 8.0e+00 0 0 3 0 1 0 0 3 0 1 3780 4920 14 4.86e+00 8 2.43e+00 100 PCGAMG Opt l03 1 1.0 1.4115e-02 1.0 2.89e+06 1.0 6.0e+00 1.1e+04 6.0e+00 0 0 0 0 1 0 0 0 0 1 404 1032 8 2.64e+00 6 8.67e-01 100 PCGAMG Gal l04 1 1.0 6.0144e-03 1.0 3.85e+05 1.1 3.4e+01 6.3e+02 8.0e+00 0 0 3 0 1 0 0 3 0 1 120 256 12 2.26e-02 8 1.21e-02 100 PCGAMG Opt l04 1 1.0 2.6873e-03 1.0 3.82e+04 1.0 6.0e+00 6.8e+02 6.0e+00 0 0 0 0 1 0 0 0 0 1 28 66 8 3.02e-02 6 1.13e-02 100 PCSetUp 2 1.0 1.1120e+02 1.0 4.81e+10 1.0 6.0e+02 6.6e+05 6.1e+02 64 43 49 70 88 64 43 49 70 90 866 13162 414 1.67e+04 365 9.09e+03 85 PCSetUpOnBlocks 10 1.0 6.6756e-04 7.1 2.10e+02 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 2 3.96e-04 0 0.00e+00 0 PCApply 10 1.0 3.9430e+01 1.0 5.78e+10 1.0 6.0e+02 2.5e+05 5.0e+00 23 52 49 26 1 23 52 49 26 1 2928 106327 804 9.02e+03 394 1.72e+04 60 --------------------------------------------------------------------------------------------------------------------------------------------------------------- Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 18 18 320312 0. DMKSP interface 1 1 664 0. Matrix 150 150 27331245384 0. Matrix Coarsen 5 5 3160 0. Distributed Mesh 15 15 76232 0. Index Set 75 75 114024764 0. IS L to G Mapping 21 21 102164020 0. Star Forest Graph 81 81 95112 0. Discrete System 15 15 13560 0. Weak Form 15 15 12360 0. Vector 296 296 10935097304 0. Preconditioner 18 18 17700 0. PetscRandom 10 10 6740 0. Viewer 2 1 848 0. 
======================================================================================================================== Average time to get PetscTime(): 2.5332e-08 Average time for MPI_Barrier(): 7.34627e-07 Average time for zero size MPI_Send(): 3.86685e-06 #PETSc Option Table entries: -da_grid_x 368 -da_grid_y 368 -da_grid_z 368 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -log_view -malloc_log -memory_view -pc_type gamg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --package-prefix-hash=/home/kchockalingam/petsc-hash-pkgs --with-make-test-np=2 COPTFLAGS="-g -O3 -fno-omit-frame-pointer" FOPTFLAGS="-g -O3 -fno-omit-frame-pointer" CXXOPTFLAGS="-g -O3 -fno-omit-frame-pointer" --with-cuda=1 --with-cuda-arch=70 --with-blaslapack=1 --with-cuda-dir=/apps/packages/cuda/10.1/ --with-mpi-dir=/apps/packages/gcc/7.3.0/openmpi/3.1.2 --download-hypre=1 --download-hypre-configure-arguments=--enable-gpu-profiling=yes,--enable-cusparse=yes,--enable-cublas=yes,--enable-curand=yes,HYPRE_CUDA_SM=70 --with-debugging=no PETSC_ARCH=arch-ci-linux-cuda11-hypre-double ----------------------------------------- Libraries compiled on 2021-11-18 14:19:41 on glados.dl.ac.uk Machine characteristics: Linux-4.18.0-193.6.3.el8_2.x86_64-x86_64-with-centos-8.2.2004-Core Using PETSc directory: /home/kchockalingam/tools/petsc-3.15.3 Using PETSc arch: ----------------------------------------- Using C compiler: /apps/packages/gcc/7.3.0/openmpi/3.1.2/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g -O3 -fno-omit-frame-pointer Using Fortran compiler: /apps/packages/gcc/7.3.0/openmpi/3.1.2/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O3 -fno-omit-frame-pointer ----------------------------------------- Using include paths: -I/home/kchockalingam/tools/petsc-3.15.3/include -I/home/kchockalingam/tools/petsc-3.15.3/arch-ci-linux-cuda11-hypre-double/include -I/home/kchockalingam/petsc-hash-pkgs/194329/include -I/apps/packages/gcc/7.3.0/openmpi/3.1.2/include -I/apps/packages/cuda/10.1/include ----------------------------------------- Using C linker: /apps/packages/gcc/7.3.0/openmpi/3.1.2/bin/mpicc Using Fortran linker: /apps/packages/gcc/7.3.0/openmpi/3.1.2/bin/mpif90 Using libraries: -Wl,-rpath,/home/kchockalingam/tools/petsc-3.15.3/lib -L/home/kchockalingam/tools/petsc-3.15.3/lib -lpetsc -Wl,-rpath,/home/kchockalingam/petsc-hash-pkgs/194329/lib -L/home/kchockalingam/petsc-hash-pkgs/194329/lib -Wl,-rpath,/apps/packages/cuda/10.1/lib64 -L/apps/packages/cuda/10.1/lib64 -Wl,-rpath,/apps/packages/gcc/7.3.0/openmpi/3.1.2/lib -L/apps/packages/gcc/7.3.0/openmpi/3.1.2/lib -Wl,-rpath,/apps/packages/compilers/gcc/7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0 -L/apps/packages/compilers/gcc/7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0 -Wl,-rpath,/apps/packages/compilers/gcc/7.3.0/lib64 -L/apps/packages/compilers/gcc/7.3.0/lib64 -Wl,-rpath,/apps/packages/compilers/gcc/7.3.0/lib -L/apps/packages/compilers/gcc/7.3.0/lib -lHYPRE -llapack -lblas -lcufft -lcublas -lcudart -lcusparse -lcusolver -lcurand -lX11 -lstdc++ -ldl -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lutil -lrt -lz -lgfortran -lm -lgfortran -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl -----------------------------------------
0 KSP Residual norm 1.495651392161e+03 1 KSP Residual norm 3.649891107619e+02 2 KSP Residual norm 2.117146192828e+02 3 KSP Residual norm 1.445856885170e+02 4 KSP Residual norm 1.077424410387e+02 5 KSP Residual norm 9.198111477688e+01 6 KSP Residual norm 7.533215136725e+01 7 KSP Residual norm 6.451868462772e+01 8 KSP Residual norm 5.872899850046e+01 9 KSP Residual norm 5.061580432653e+01 10 KSP Residual norm 4.708451245418e+01 11 KSP Residual norm 4.237917691640e+01 12 KSP Residual norm 3.901940233527e+01 13 KSP Residual norm 3.646993301702e+01 14 KSP Residual norm 3.332041450528e+01 15 KSP Residual norm 3.194496692785e+01 16 KSP Residual norm 2.928883693435e+01 17 KSP Residual norm 2.800449787474e+01 18 KSP Residual norm 2.659073917787e+01 19 KSP Residual norm 2.468772425132e+01 20 KSP Residual norm 2.408386623759e+01 21 KSP Residual norm 2.259733320445e+01 22 KSP Residual norm 2.165089065314e+01 23 KSP Residual norm 2.087552323842e+01 24 KSP Residual norm 1.983208395628e+01 25 KSP Residual norm 1.921481237253e+01 26 KSP Residual norm 1.836751251265e+01 27 KSP Residual norm 1.772999436226e+01 28 KSP Residual norm 1.717025579150e+01 29 KSP Residual norm 1.648753232895e+01 30 KSP Residual norm 1.595404607384e+01 31 KSP Residual norm 1.553529314488e+01 32 KSP Residual norm 1.488033633220e+01 33 KSP Residual norm 1.450317697522e+01 34 KSP Residual norm 1.410137231648e+01 35 KSP Residual norm 1.350442954302e+01 36 KSP Residual norm 1.332756202308e+01 37 KSP Residual norm 1.275864495790e+01 38 KSP Residual norm 1.242357295332e+01 39 KSP Residual norm 1.222135293906e+01 40 KSP Residual norm 1.162533517126e+01 41 KSP Residual norm 1.158693300904e+01 42 KSP Residual norm 1.113234572782e+01 43 KSP Residual norm 1.082139597868e+01 44 KSP Residual norm 1.078038754391e+01 45 KSP Residual norm 1.020082046247e+01 46 KSP Residual norm 1.021689722133e+01 47 KSP Residual norm 9.907150556837e+00 48 KSP Residual norm 9.547388096064e+00 49 KSP Residual norm 9.600154708139e+00 50 KSP Residual norm 9.110924039435e+00 51 KSP Residual norm 9.065506894706e+00 52 KSP Residual norm 8.878368576554e+00 53 KSP Residual norm 8.505425004955e+00 54 KSP Residual norm 8.581392859885e+00 55 KSP Residual norm 8.148007421767e+00 56 KSP Residual norm 8.065927739730e+00 57 KSP Residual norm 7.990533170413e+00 58 KSP Residual norm 7.556927487891e+00 59 KSP Residual norm 7.675404973915e+00 60 KSP Residual norm 7.333803913416e+00 61 KSP Residual norm 7.166317051638e+00 62 KSP Residual norm 7.198844469611e+00 63 KSP Residual norm 6.774954193917e+00 64 KSP Residual norm 6.872066969908e+00 65 KSP Residual norm 6.617813188020e+00 66 KSP Residual norm 6.412411326410e+00 67 KSP Residual norm 6.491178230546e+00 68 KSP Residual norm 6.101212156660e+00 69 KSP Residual norm 6.182217362982e+00 70 KSP Residual norm 6.103620464104e+00 71 KSP Residual norm 6.095418939334e+00 72 KSP Residual norm 6.775974052758e+00 73 KSP Residual norm 7.527726872709e+00 74 KSP Residual norm 9.072043988077e+00 75 KSP Residual norm 9.636613703974e+00 76 KSP Residual norm 7.657551468431e+00 77 KSP Residual norm 5.481479203936e+00 78 KSP Residual norm 4.259497268317e+00 79 KSP Residual norm 4.436470054159e+00 80 KSP Residual norm 4.564357893234e+00 81 KSP Residual norm 3.720705638968e+00 82 KSP Residual norm 3.456690901373e+00 83 KSP Residual norm 3.453431604562e+00 84 KSP Residual norm 3.151142098664e+00 85 KSP Residual norm 2.882452789367e+00 86 KSP Residual norm 2.798244762166e+00 87 KSP Residual norm 2.572182250139e+00 88 KSP Residual norm 2.394429603093e+00 89 
KSP Residual norm 2.315722094218e+00 90 KSP Residual norm 2.101641503104e+00 91 KSP Residual norm 2.011937794513e+00 92 KSP Residual norm 1.868463333889e+00 93 KSP Residual norm 1.741962828026e+00 94 KSP Residual norm 1.650905494042e+00 95 KSP Residual norm 1.515305169977e+00 96 KSP Residual norm 1.421981579767e+00 97 KSP Residual norm 1.317045136017e+00 98 KSP Residual norm 1.215586504381e+00 99 KSP Residual norm 1.126380158677e+00 100 KSP Residual norm 1.045448228934e+00 101 KSP Residual norm 9.390091098393e-01 102 KSP Residual norm 8.829404315108e-01 103 KSP Residual norm 7.760121248092e-01 104 KSP Residual norm 7.253597449597e-01 105 KSP Residual norm 6.586582757537e-01 106 KSP Residual norm 6.051736039622e-01 107 KSP Residual norm 5.790585710076e-01 108 KSP Residual norm 5.259594077655e-01 109 KSP Residual norm 4.982302792693e-01 110 KSP Residual norm 4.593048900932e-01 111 KSP Residual norm 4.168158507746e-01 112 KSP Residual norm 3.969629457262e-01 113 KSP Residual norm 3.546379850023e-01 114 KSP Residual norm 3.332453713647e-01 115 KSP Residual norm 3.068925104294e-01 116 KSP Residual norm 2.756944445656e-01 117 KSP Residual norm 2.635375966688e-01 118 KSP Residual norm 2.325001353311e-01 119 KSP Residual norm 2.199234046339e-01 120 KSP Residual norm 1.994580647155e-01 121 KSP Residual norm 1.812120424979e-01 122 KSP Residual norm 1.683880795172e-01 123 KSP Residual norm 1.507657264996e-01 124 KSP Residual norm 1.376966981436e-01 125 KSP Residual norm 1.258652583185e-01 126 KSP Residual norm 1.113645108302e-01 127 KSP Residual norm 1.026205995037e-01 128 KSP Residual norm 9.068139854964e-02 129 KSP Residual norm 8.119198385262e-02 130 KSP Residual norm 7.350479129364e-02 131 KSP Residual norm 6.334173405612e-02 132 KSP Residual norm 5.811559484006e-02 133 KSP Residual norm 4.952193458274e-02 134 KSP Residual norm 4.427509896691e-02 135 KSP Residual norm 3.842091471301e-02 136 KSP Residual norm 3.277284939040e-02 137 KSP Residual norm 2.889849060988e-02 138 KSP Residual norm 2.391014409595e-02 139 KSP Residual norm 2.080839323584e-02 140 KSP Residual norm 1.726070845998e-02 141 KSP Residual norm 1.450952641061e-02 Residual norm 3.26933e-05 Summary of Memory Usage in PETSc Maximum (over computational time) process memory: total 1.9399e+10 max 9.7000e+09 min 9.6992e+09 Current process memory: total 1.8596e+09 max 9.3022e+08 min 9.2937e+08 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex45 on a named glados.dl.ac.uk with 2 processors, by kchockalingam Wed Nov 24 09:19:33 2021 Using Petsc Release Version 3.15.3, Aug 06, 2021 Max Max/Min Avg Total Time (sec): 2.222e+02 1.000 2.222e+02 Objects: 5.800e+01 1.000 5.800e+01 Flop: 2.963e+11 1.000 2.963e+11 5.925e+11 Flop/sec: 1.333e+09 1.000 1.333e+09 2.666e+09 MPI Messages: 2.960e+02 1.000 2.960e+02 5.920e+02 MPI Message Lengths: 3.191e+08 1.000 1.078e+06 6.381e+08 MPI Reductions: 5.360e+02 1.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 2.2224e+02 100.0% 5.9253e+11 100.0% 5.920e+02 100.0% 1.078e+06 100.0% 5.180e+02 96.6% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) CpuToGpu Count: total number of CPU to GPU copies per processor CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) GpuToCpu Count: total number of GPU to CPU copies per processor GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) GPU %F: percent flops on GPU in this event ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F --------------------------------------------------------------------------------------------------------------------------------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 3 1.0 8.7330e-02 4.3 0.00e+00 0.0 2.0e+00 4.0e+00 3.0e+00 0 0 0 0 1 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 BuildTwoSidedF 2 1.0 8.7315e-02 4.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatMult 294 1.0 2.8526e+00 1.3 9.50e+10 1.0 5.9e+02 1.1e+06 1.0e+00 1 32100100 0 1 32100100 0 66605 109985 2 2.19e+03 0 0.00e+00 100 MatSOR 295 1.0 1.8572e+02 1.0 1.03e+11 1.0 0.0e+00 0.0e+00 0.0e+00 83 35 0 0 0 83 35 0 0 0 1105 0 0 0.00e+00 576 1.15e+05 0 MatAssemblyBegin 2 1.0 8.7371e-02 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 2 1.0 6.8931e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 MatCUSPARSCopyTo 2 1.0 3.6613e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 2 2.19e+03 0 0.00e+00 0 KSPSetUp 2 1.0 9.2705e+00 1.0 1.39e+10 1.0 2.4e+01 9.5e+05 6.6e+01 4 5 4 4 12 4 5 4 4 13 3006 116114 36 4.38e+03 31 1.99e+03 72 KSPSolve 1 1.0 1.9028e+02 1.0 2.82e+11 1.0 5.7e+02 1.1e+06 4.2e+02 86 95 96 96 79 86 95 96 96 82 2963 103602 1417 5.68e+04 990 1.13e+05 65 KSPGMRESOrthog 10 1.0 4.0958e-01 1.1 5.48e+09 1.0 0.0e+00 0.0e+00 1.0e+01 0 2 0 0 2 0 2 0 0 2 26768 129969 20 1.99e+03 10 5.32e-02 100 DMCreateMat 1 1.0 1.7783e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+00 8 0 0 0 1 8 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 SFSetGraph 2 1.0 1.2732e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFSetUp 1 1.0 2.4075e-03 1.0 0.00e+00 0.0 4.0e+00 2.7e+05 1.0e+00 0 0 1 0 0 0 0 1 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFPack 294 1.0 9.5077e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFUnpack 294 1.0 6.5334e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecMDot 10 1.0 3.5582e-01 1.2 2.74e+09 1.0 0.0e+00 0.0e+00 1.0e+01 0 1 0 0 2 0 1 0 0 2 15407 177096 10 1.99e+03 10 5.32e-02 100 VecTDot 282 1.0 1.8488e-01 1.0 1.41e+10 1.0 0.0e+00 0.0e+00 2.8e+02 0 5 0 0 53 0 5 0 0 54 152030 156002 0 0.00e+00 282 2.26e-03 100 VecNorm 154 1.0 7.9338e-01 9.0 7.67e+09 1.0 0.0e+00 0.0e+00 1.5e+02 0 3 0 0 29 0 3 0 0 30 19347 260968 1 1.99e+02 154 1.23e-03 100 VecScale 11 1.0 7.4799e-03 1.0 2.74e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 
73289 73338 11 8.80e-05 0 0.00e+00 100 VecCopy 287 1.0 2.6882e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecSet 313 1.0 1.0696e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecAXPY 284 1.0 3.1031e-01 1.0 1.42e+10 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 91223 101939 285 1.99e+02 0 0.00e+00 100 VecAYPX 424 1.0 4.6610e+00 1.0 2.11e+10 1.0 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 9067 59976 567 2.85e+04 0 0.00e+00 100 VecAXPBYCZ 142 1.0 4.2748e+00 1.0 3.54e+10 1.0 0.0e+00 0.0e+00 0.0e+00 2 12 0 0 0 2 12 0 0 0 16554 187273 568 2.83e+04 0 0.00e+00 100 VecMAXPY 11 1.0 6.3508e-02 1.0 3.24e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 102014 102026 11 5.20e-04 0 0.00e+00 100 VecScatterBegin 294 1.0 6.4442e-02 1.0 0.00e+00 0.0 5.9e+02 1.1e+06 1.0e+00 0 0100100 0 0 0100100 0 0 0 0 0.00e+00 0 0.00e+00 0 VecScatterEnd 294 1.0 7.4767e-0138.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 VecNormalize 11 1.0 4.3984e-02 1.2 8.22e+08 1.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 2 0 0 0 0 2 37391 141046 12 1.99e+02 11 8.80e-05 100 VecCUDACopyTo 297 1.0 8.1843e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 0 297 5.92e+04 0 0.00e+00 0 VecCUDACopyFrom 576 1.0 1.7463e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 0 0 0 0.00e+00 576 1.15e+05 0 PCSetUp 1 1.0 9.0794e+00 1.0 1.39e+10 1.0 2.4e+01 9.5e+05 6.0e+01 4 5 4 4 11 4 5 4 4 12 3069 116796 36 4.38e+03 31 1.99e+03 72 PCApply 142 1.0 1.8867e+02 1.0 1.94e+11 1.0 2.8e+02 1.1e+06 0.0e+00 85 66 48 48 0 85 66 48 48 0 2058 97012 995 5.68e+04 566 1.13e+05 49 --------------------------------------------------------------------------------------------------------------------------------------------------------------- Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 3 3 34072 0. DMKSP interface 1 1 664 0. Matrix 3 3 3683542300 0. Distributed Mesh 2 2 10608 0. Index Set 4 4 100759064 0. IS L to G Mapping 1 1 100214440 0. Star Forest Graph 6 6 7056 0. Discrete System 2 2 1808 0. Weak Form 2 2 1648 0. Vector 29 29 4984738072 0. Preconditioner 3 3 3040 0. Viewer 2 1 848 0. 
======================================================================================================================== Average time to get PetscTime(): 3.12924e-08 Average time for MPI_Barrier(): 8.24034e-07 Average time for zero size MPI_Send(): 4.88758e-06 #PETSc Option Table entries: -da_grid_x 368 -da_grid_y 368 -da_grid_z 368 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_monitor -ksp_type cg -log_view -malloc_log -memory_view -pc_type mg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --package-prefix-hash=/home/kchockalingam/petsc-hash-pkgs --with-make-test-np=2 COPTFLAGS="-g -O3 -fno-omit-frame-pointer" FOPTFLAGS="-g -O3 -fno-omit-frame-pointer" CXXOPTFLAGS="-g -O3 -fno-omit-frame-pointer" --with-cuda=1 --with-cuda-arch=70 --with-blaslapack=1 --with-cuda-dir=/apps/packages/cuda/10.1/ --with-mpi-dir=/apps/packages/gcc/7.3.0/openmpi/3.1.2 --download-hypre=1 --download-hypre-configure-arguments=--enable-gpu-profiling=yes,--enable-cusparse=yes,--enable-cublas=yes,--enable-curand=yes,HYPRE_CUDA_SM=70 --with-debugging=no PETSC_ARCH=arch-ci-linux-cuda11-hypre-double ----------------------------------------- Libraries compiled on 2021-11-18 14:19:41 on glados.dl.ac.uk Machine characteristics: Linux-4.18.0-193.6.3.el8_2.x86_64-x86_64-with-centos-8.2.2004-Core Using PETSc directory: /home/kchockalingam/tools/petsc-3.15.3 Using PETSc arch: ----------------------------------------- Using C compiler: /apps/packages/gcc/7.3.0/openmpi/3.1.2/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g -O3 -fno-omit-frame-pointer Using Fortran compiler: /apps/packages/gcc/7.3.0/openmpi/3.1.2/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O3 -fno-omit-frame-pointer ----------------------------------------- Using include paths: -I/home/kchockalingam/tools/petsc-3.15.3/include -I/home/kchockalingam/tools/petsc-3.15.3/arch-ci-linux-cuda11-hypre-double/include -I/home/kchockalingam/petsc-hash-pkgs/194329/include -I/apps/packages/gcc/7.3.0/openmpi/3.1.2/include -I/apps/packages/cuda/10.1/include ----------------------------------------- Using C linker: /apps/packages/gcc/7.3.0/openmpi/3.1.2/bin/mpicc Using Fortran linker: /apps/packages/gcc/7.3.0/openmpi/3.1.2/bin/mpif90 Using libraries: -Wl,-rpath,/home/kchockalingam/tools/petsc-3.15.3/lib -L/home/kchockalingam/tools/petsc-3.15.3/lib -lpetsc -Wl,-rpath,/home/kchockalingam/petsc-hash-pkgs/194329/lib -L/home/kchockalingam/petsc-hash-pkgs/194329/lib -Wl,-rpath,/apps/packages/cuda/10.1/lib64 -L/apps/packages/cuda/10.1/lib64 -Wl,-rpath,/apps/packages/gcc/7.3.0/openmpi/3.1.2/lib -L/apps/packages/gcc/7.3.0/openmpi/3.1.2/lib -Wl,-rpath,/apps/packages/compilers/gcc/7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0 -L/apps/packages/compilers/gcc/7.3.0/lib/gcc/x86_64-pc-linux-gnu/7.3.0 -Wl,-rpath,/apps/packages/compilers/gcc/7.3.0/lib64 -L/apps/packages/compilers/gcc/7.3.0/lib64 -Wl,-rpath,/apps/packages/compilers/gcc/7.3.0/lib -L/apps/packages/compilers/gcc/7.3.0/lib -lHYPRE -llapack -lblas -lcufft -lcublas -lcudart -lcusparse -lcusolver -lcurand -lX11 -lstdc++ -ldl -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lutil -lrt -lz -lgfortran -lm -lgfortran -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl -----------------------------------------