On Fri, Oct 5, 2012 at 11:51 AM, Zhenglun (Alan) Wei <zhenglun.wei at gmail.com > wrote:
> Dear folks, > I hope you're having a nice day. > I've been reading the thread of "Enquiry regarding log summary > results" and tried the similar things with -log_summary with my code. > However, my output seems totally different from Dr. TAY Wee-beng's. I > attached it here. My code adopts the /src/ksp/ksp/example/tutorial/**ex45.c > from Dr. Yang to solve the Poisson equation (twice). The "Event Stage 3" > and "Event Stage 5" are just for "KSPSolve", for example: > > ierr = PetscLogStagePush(stages[2]); CHKERRQ(ierr); // Start Pressure > Poisson Equation Solve > ierr = KSPSolve(ksp,PETSC_NULL,PETSC_**NULL);CHKERRQ(ierr); > ierr = PetscLogStagePop(); CHKERRQ(ierr); // Finish Pressure Poisson > Equation Solve > > Therefore, solving the equation takes most of my times. In the > procedure of solving the equation, VecMDot, MatMult, MatMultAdd, > MatMultTranspose takes lots of time. > > ------------------------------**------------------------------** > ------------------------------**------------------------------ > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s > ------------------------------**------------------------------** > ------------------------------**------------------------------ > > --- Event Stage 3: Pressure Solve > > KSPGMRESOrthog 8643 1.0 2.0476e+0313.1 5.84e+09 1.0 0.0e+00 0.0e+00 > 8.6e+03 17 13 0 0 17 39 29 0 0 36 270 > KSPSetUp 5 1.0 1.0729e-05 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 5 1.0 4.0481e+03 1.0 2.05e+10 1.0 4.0e+07 1.2e+03 > 2.4e+04 44 46 47 44 46 100100100100100 477 > VecMDot 8643 1.0 2.0424e+0313.5 2.92e+09 1.0 0.0e+00 0.0e+00 > 8.6e+03 17 7 0 0 17 39 14 0 0 36 135 > VecNorm 14503 1.0 8.7184e+02 6.4 8.47e+08 1.0 0.0e+00 0.0e+00 > 1.5e+04 7 2 0 0 28 15 4 0 0 61 92 > VecScale 14503 1.0 3.7836e+00 4.7 4.23e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 2 0 0 0 10598 > VecCopy 5767 1.0 2.0257e+00 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 34670 1.0 5.5035e+00 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 8741 1.0 2.9888e+00 4.5 5.10e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 3 0 0 0 16172 > VecAYPX 2881 1.0 1.7167e+00 7.2 8.41e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 4640 > VecWAXPY 93 1.0 5.3297e-02 4.8 2.71e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 4824 > VecMAXPY 14503 1.0 1.5191e+01 3.1 3.42e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 8 0 0 0 0 17 0 0 0 21342 > VecAssemblyBegin 5 1.0 3.7097e+0012.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.5e+01 0 0 0 0 0 0 0 0 0 0 0 > VecAssemblyEnd 5 1.0 2.9087e-05 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 26022 1.0 1.1726e+01 4.2 0.00e+00 0.0 4.0e+07 1.2e+03 > 0.0e+00 0 0 47 44 0 0 0100100 0 0 > VecScatterEnd 26022 1.0 3.5268e+03 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 19 0 0 0 0 43 0 0 0 0 0 > VecNormalize 11524 1.0 8.4214e+02 7.2 1.01e+09 1.0 0.0e+00 0.0e+00 > 1.2e+04 6 2 0 0 22 15 5 0 0 49 114 > MatMult 14498 1.0 3.0545e+0382.7 5.50e+09 1.0 6.4e+06 7.6e+03 > 0.0e+00 9 12 8 44 0 21 27 16 98 0 169 > This load imbalance is totally unreasonable. Either you have an atrocious partition or there is contention on the machine (over-subscribed or other things running). The load balance for MatMult() should be close to 1.0 and the MFlop/s (last column) should be at least 10000 or so. > MatMultAdd 2881 1.0 1.8177e+0218.6 1.09e+09 1.1 3.6e+06 1.5e+01 > 0.0e+00 1 2 4 0 0 2 5 9 0 0 560 > MatMultTranspose 2881 1.0 7.1296e+0274.9 1.09e+09 1.1 3.6e+06 1.5e+01 > 0.0e+00 5 2 4 0 0 11 5 9 0 0 143 > MatSolve 14405 1.0 3.9663e+01 3.3 4.56e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 10 0 0 0 1 22 0 0 0 10890 > MatLUFactorNum 5 1.0 3.9389e-02 1.6 3.05e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 7322 > MatILUFactorSym 5 1.0 4.2221e-02 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 5.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 5 1.0 9.7752e-06 5.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 5 1.0 2.9373e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > PCSetUp 5 1.0 8.3110e-02 2.0 3.05e+06 1.0 0.0e+00 0.0e+00 > 1.5e+01 0 0 0 0 0 0 0 0 0 0 3470 > PCSetUpOnBlocks 5762 1.0 9.9790e-02 2.0 3.05e+06 1.0 0.0e+00 0.0e+00 > 1.5e+01 0 0 0 0 0 0 0 0 0 0 2890 > PCApply 2881 1.0 3.3151e+03 1.0 1.37e+10 1.0 3.8e+07 1.0e+03 > 1.7e+04 36 31 46 36 33 82 67 97 80 73 390 > > BTW, how to really calculate the time that a function costs? For > example, it shows that "VecMDot, 2.0424e+0313.5". Would that be > 2.0424*10^313.5 sec? > No, the imbalance is just spilling over. The amount of time is 2042 seconds. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20121005/e53a2389/attachment.html>
