On Sun, Nov 16, 2014 at 10:11 AM, Evan Um <[email protected]> wrote:

> Dear Matthew,
>
> Thanks for your comments. I will prepare the log summary. To activate a
> separate log stage for each iteration (i.e., each RHS vector), I tried the
> code below but got an error. Could you give me a comment? Does
> PetscLogView() allow users to use different output file names, so that
> each iteration generates its own log file?
>

You use
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Profiling/PetscLogStageRegister.html
for each stage, and then
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Profiling/PetscLogStagePush.html
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Profiling/PetscLogStagePop.html#PetscLogStagePop

   Matt
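For reference, a minimal sketch of the stage-per-solve pattern described above, with a single
PetscLogView() call at the end. The helper function, the stage names, and the output file name
are illustrative, not part of the original code. The "Corrupt argument" error quoted below is
consistent with passing a PetscViewer that was never created; opening one explicitly with
PetscViewerASCIIOpen() avoids that.

#include <petscsys.h>
#include <petscviewer.h>

/* Illustrative helper: one log stage per solve, one summary file at the end. */
static PetscErrorCode log_each_solve(MPI_Comm comm, PetscInt nsteps)
{
  PetscErrorCode ierr;
  PetscViewer    viewer;
  PetscInt       i;
  char           sname[64];

  PetscFunctionBeginUser;
  ierr = PetscLogBegin();CHKERRQ(ierr);                 /* start logging once */
  for (i = 0; i < nsteps; i++) {
    PetscLogStage stage;

    ierr = PetscSNPrintf(sname, sizeof(sname), "solve_%D", i);CHKERRQ(ierr);
    ierr = PetscLogStageRegister(sname, &stage);CHKERRQ(ierr);
    ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
    /* ... the work for this RHS vector, e.g. KSPSolve() ... */
    ierr = PetscLogStagePop();CHKERRQ(ierr);
  }
  /* one report at the end; each stage gets its own section in the output */
  ierr = PetscViewerASCIIOpen(comm, "log_view.txt", &viewer);CHKERRQ(ierr);
  ierr = PetscLogView(viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

With the stages pushed and popped this way, the -log_summary output requested below is already
broken down per stage, so a single report at the end is usually enough; registering stages only
for a few representative early and late iterations keeps that report readable.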
> Evan
>
> /* work */
> PetscViewer viewer;
> for (int i=0; i<1000; i++) {
>   PetscLogBegin();
>   /* work to do */
>   PetscLogView(viewer);
> }
> /* work */
>
> Errors:
> Number of PCG iterations:1
> [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> [0]PETSC ERROR: Corrupt argument:
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [0]PETSC ERROR: Invalid type of object: Parameter # 1
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.5.0, Jun, 30, 2014
> [0]PETSC ERROR: fetdem3dp on a arch-linux2-c-debug named n0024.voltaire0
> by esum Sun Nov 16 08:08:53 2014
> [0]PETSC ERROR: Configure options
> --prefix=/clusterfs/voltaire/home/software/modules/petsc/3.5.0
> --download-fblaslapack=1 --download-mumps=1 --download-parmetis=1
> --with-ptscotch=1
> --with-ptscotch-dir=/clusterfs/voltaire/home/software/modules/scotch/5.1.12-gcc/
> --download-scalapack --download-metis=1 --download-superlu=1
> --download-superlu_dist=1 --download-hypre=1
> --with-mpi-dir=/global/software/sl-6.x86_64/modules/gcc/4.4.7/openmpi/1.6.5-gcc/
> [0]PETSC ERROR: #1 PetscObjectTypeCompare() line 145 in
> /clusterfs/voltaire/home/software/source/petsc-3.5.0/src/sys/objects/destroy.c
> [0]PETSC ERROR: #2 PetscLogView() line 1807 in
> /clusterfs/voltaire/home/software/source/petsc-3.5.0/src/sys/logging/plog.c
>
>
> On Sun, Nov 16, 2014 at 3:57 AM, Matthew Knepley <[email protected]>
> wrote:
>
>> On Sat, Nov 15, 2014 at 9:24 PM, Evan Um <[email protected]> wrote:
>>
>>> Dear PETSc users,
>>>
>>> I would like to show you a performance issue that appears when a Cholesky
>>> factor is re-used as a direct solver or preconditioner with many
>>> right-hand-side (RHS) vectors. Can anyone suggest a solution? In advance,
>>> thanks for your help.
>>>
>>> Regards,
>>> Evan
>>>
>>> Example 1: I used MUMPS as a direct solver and measured the backward/forward
>>> substitution time for selected RHS vectors. A simplified test code is shown
>>> in Code 1 and the time measurements in Result 1. As shown below, the
>>> backward/forward solution time is nearly constant (about 2.2 seconds) for
>>> early time steps, but grows to about 12.8 seconds at later steps. It is hard
>>> to understand why the solution time gets longer even though the same
>>> numerical operation is carried out for each RHS vector (the loop over i).
>>> With many RHS vectors, is there any parameter I need to reset so that the
>>> cost of the solves stays stable?
>>>
>>> Example 2: In this case, I use the Cholesky factor as a preconditioner for
>>> a CG solver, and one iteration is performed per solve. The sample code and
>>> time measurements are shown in Code 2 and Result 2. Again, KSPSolve() gets
>>> slower as the preconditioner is re-used with many RHS vectors: early solves
>>> took about 4.6 seconds, while later ones took about 16 seconds. Has anyone
>>> observed such a performance issue? Do you know of a solution?
>>> In the two experiments, I expected Example 2 to show a shorter solution
>>> time per RHS vector than Example 1, because Example 2 relies on scalable
>>> matrix-vector products, whereas Example 1 relies on the backward/forward
>>> substitution of MUMPS, which is inherently less scalable. However, my
>>> experiments consistently showed that the backward/forward substitution of
>>> MUMPS is faster than a single PCG iteration with the Cholesky
>>> preconditioner. Can anyone explain this? FYI, in both Example 1 and
>>> Example 2, I used a very large test problem (a sparse SPD matrix of about
>>> 4,000,000-by-4,000,000).
>>>
>>> Code 1:
>>> KSPCreate(PETSC_COMM_WORLD, &ksp_fetd_dt);
>>> KSPSetOperators(ksp_fetd_dt, A_dt, A_dt);
>>> KSPSetType(ksp_fetd_dt, KSPPREONLY);
>>> KSPGetPC(ksp_fetd_dt, &pc_fetd_dt);
>>> MatSetOption(A_dt, MAT_SPD, PETSC_TRUE);
>>> PCSetType(pc_fetd_dt, PCCHOLESKY);
>>> PCFactorSetMatSolverPackage(pc_fetd_dt, MATSOLVERMUMPS);
>>> PCFactorSetUpMatSolverPackage(pc_fetd_dt);
>>> PCFactorGetMatrix(pc_fetd_dt, &F_dt);
>>> for (int i=0; i<1000; i++) {
>>>   // Create a new RHS vector B_dt
>>
>> Are you calling VecCreate() each time here? If so, you could be filling
>> up the memory of your computer as you run, causing it to slow down due to
>> swapping.
>>
>> For any performance question, ALWAYS send the output of -log_summary. For
>> your question, also divide this loop into log stages for each iteration, so
>> we can see what specific operations slow down.
>>
>>    Matt
>>
>>>   time1=my_timer_function();
>>>   KSPSolve(ksp_fetd_dt, B_dt, solution);
>>>   time2=my_timer_function();
>>>   // Output solution time=time2-time1;
>>> }
>>>
>>> Result 1:
>>> nTS:12 Time(s):2.400e-04 Sol. time(s):2.246e+00 PCG niter:1
>>> nTS:13 Time(s):2.600e-04 Sol. time(s):2.329e+00 PCG niter:1
>>> nTS:14 Time(s):2.800e-04 Sol. time(s):2.289e+00 PCG niter:1
>>> nTS:15 Time(s):3.000e-04 Sol. time(s):2.239e+00 PCG niter:1
>>> .
>>> .
>>> nTS:267 Time(s):5.340e-03 Sol. time(s):2.152e+00 PCG niter:1
>>> nTS:280 Time(s):5.600e-03 Sol. time(s):2.255e+00 PCG niter:1
>>> nTS:293 Time(s):5.860e-03 Sol. time(s):1.087e+01 PCG niter:1
>>> nTS:307 Time(s):6.140e-03 Sol. time(s):1.225e+01 PCG niter:1
>>> nTS:321 Time(s):6.420e-03 Sol. time(s):1.280e+01 PCG niter:1
>>> nTS:337 Time(s):6.740e-03 Sol. time(s):1.243e+01 PCG niter:1
>>> nTS:353 Time(s):7.060e-03 Sol. time(s):6.953e+00 PCG niter:1
>>> nTS:369 Time(s):7.380e-03 Sol. time(s):8.712e+00 PCG niter:1
>>> nTS:387 Time(s):7.740e-03 Sol. time(s):9.048e+00 PCG niter:1
>>> nTS:405 Time(s):8.100e-03 Sol. time(s):1.099e+01 PCG niter:1
>>> nTS:424 Time(s):8.480e-03 Sol. time(s):1.171e+01 PCG niter:1
>>> nTS:445 Time(s):8.900e-03 Sol. time(s):1.294e+01 PCG niter:1
>>> nTS:466 Time(s):9.320e-03 Sol. time(s):1.227e+01 PCG niter:1
>>> nTS:488 Time(s):9.760e-03 Sol. time(s):8.581e+00 PCG niter:1
>>> nTS:511 Time(s):1.022e-02 Sol. time(s):1.059e+01 PCG niter:1
>>> nTS:535 Time(s):1.070e-02 Sol. time(s):8.452e+00 PCG niter:1
>>>
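Following up on the VecCreate() question above, a minimal sketch of reusing a single RHS vector
across all solves instead of creating a fresh one every iteration. The helper function, its
arguments, and the use of the solution vector as a layout template are illustrative, not part of
the original code.

#include <petscksp.h>

/* Solve with the same factored KSP for many right-hand sides, reusing one
   work vector so the memory footprint stays flat over the loop. */
static PetscErrorCode solve_many_rhs(KSP ksp, Vec solution, PetscInt nrhs)
{
  PetscErrorCode ierr;
  Vec            B_dt;
  PetscInt       i;

  PetscFunctionBeginUser;
  ierr = VecDuplicate(solution, &B_dt);CHKERRQ(ierr);   /* allocate once */
  for (i = 0; i < nrhs; i++) {
    /* Overwrite the entries of B_dt for this step (VecSetValues() followed by
       VecAssemblyBegin()/VecAssemblyEnd(), or VecCopy() from a precomputed
       vector) instead of calling VecCreate() each iteration. */
    ierr = KSPSolve(ksp, B_dt, solution);CHKERRQ(ierr);
  }
  ierr = VecDestroy(&B_dt);CHKERRQ(ierr);               /* free once */
  PetscFunctionReturn(0);
}

If a fresh vector really is needed each step, pairing every VecCreate() with a VecDestroy()
inside the loop likewise keeps memory from growing, which is the swapping scenario described above.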
>>> Code 2:
>>> KSPCreate(PETSC_COMM_WORLD, &ksp_fetd_dt);
>>> KSPSetOperators(ksp_fetd_dt, A_dt, A_dt);
>>> KSPSetType(ksp_fetd_dt, KSPPREONLY);
>>> KSPGetPC(ksp_fetd_dt, &pc_fetd_dt);
>>> MatSetOption(A_dt, MAT_SPD, PETSC_TRUE);
>>> PCSetType(pc_fetd_dt, PCCHOLESKY);
>>> PCFactorSetMatSolverPackage(pc_fetd_dt, MATSOLVERMUMPS);
>>> PCFactorSetUpMatSolverPackage(pc_fetd_dt);
>>> PCFactorGetMatrix(pc_fetd_dt, &F_dt);
>>> KSPSetType(ksp_fetd_dt, KSPCG);
>>> KSPSetTolerances(ksp_fetd_dt, 1e-9, 1.0e-50, 1.0e10, ksp_iter);
>>> for (int i=0; i<1000; i++) {
>>>   // Create a new RHS vector B_dt
>>>   time1=my_timer_function();
>>>   KSPSolve(ksp_fetd_dt, B_dt, solution);
>>>   time2=my_timer_function();
>>>   // Output solution time=time2-time1;
>>> }
>>>
>>> Result 2:
>>> nTS:10 Time(s):2.000e-04 Sol. time(s):4.644e+00 PCG niter:1 Norm:2.937e-10
>>> nTS:11 Time(s):2.200e-04 Sol. time(s):4.585e+00 PCG niter:1 Norm:3.151e-10
>>> nTS:12 Time(s):2.400e-04 Sol. time(s):4.737e+00 PCG niter:1 Norm:2.719e-10
>>> nTS:13 Time(s):2.600e-04 Sol. time(s):4.537e+00 PCG niter:1 Norm:3.982e-10
>>> nTS:14 Time(s):2.800e-04 Sol. time(s):4.578e+00 PCG niter:1 Norm:3.221e-10
>>> nTS:15 Time(s):3.000e-04 Sol. time(s):4.678e+00 PCG niter:1 Norm:3.008e-10
>>> nTS:16 Time(s):3.200e-04 Sol. time(s):4.619e+00 PCG niter:1 Norm:3.005e-10
>>> nTS:46 Time(s):9.200e-04 Sol. time(s):6.862e+00 PCG niter:1 Norm:2.759e-10
>>> nTS:48 Time(s):9.600e-04 Sol. time(s):4.475e+00 PCG niter:1 Norm:3.071e-10
>>> nTS:50 Time(s):1.000e-03 Sol. time(s):6.068e+00 PCG niter:1 Norm:3.848e-10
>>> nTS:52 Time(s):1.040e-03 Sol. time(s):5.024e+00 PCG niter:1 Norm:3.639e-10
>>> nTS:55 Time(s):1.100e-03 Sol. time(s):1.333e+01 PCG niter:1 Norm:3.049e-10
>>> nTS:58 Time(s):1.160e-03 Sol. time(s):1.440e+01 PCG niter:1 Norm:3.467e-10
>>> nTS:60 Time(s):1.200e-03 Sol. time(s):1.475e+01 PCG niter:1 Norm:2.951e-10
>>> nTS:63 Time(s):1.260e-03 Sol. time(s):9.899e+00 PCG niter:1 Norm:3.018e-10
>>> nTS:66 Time(s):1.320e-03 Sol. time(s):1.097e+01 PCG niter:1 Norm:3.320e-10
>>> nTS:69 Time(s):1.380e-03 Sol. time(s):1.485e+01 PCG niter:1 Norm:2.811e-10
>>> nTS:73 Time(s):1.460e-03 Sol. time(s):1.199e+01 PCG niter:1 Norm:2.999e-10
>>> nTS:76 Time(s):1.520e-03 Sol. time(s):1.109e+01 PCG niter:1 Norm:3.359e-10
>>> nTS:80 Time(s):1.600e-03 Sol. time(s):1.473e+01 PCG niter:1 Norm:3.336e-10
>>> nTS:84 Time(s):1.680e-03 Sol. time(s):1.444e+01 PCG niter:1 Norm:3.181e-10
>>> nTS:88 Time(s):1.760e-03 Sol. time(s):1.639e+01 PCG niter:1 Norm:2.956e-10
>>> nTS:92 Time(s):1.840e-03 Sol. time(s):1.713e+01 PCG niter:1 Norm:3.552e-10
>>> nTS:96 Time(s):1.920e-03 Sol. time(s):1.562e+01 PCG niter:1 Norm:2.843e-10
>>> nTS:101 Time(s):2.020e-03 Sol. time(s):1.631e+01 PCG niter:1 Norm:3.687e-10
>>> nTS:105 Time(s):2.100e-03 Sol. time(s):1.496e+01 PCG niter:1 Norm:3.567e-10
>>> nTS:111 Time(s):2.220e-03 Sol. time(s):1.607e+01 PCG niter:1 Norm:3.392e-10
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>

--
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
