1) Your timings are meaningless! You cannot compare timings from a build with
all debugging turned on, PERIOD!
##########################################################
#                                                        #
#                       WARNING!!!                       #
#                                                        #
#   This code was compiled with a debugging option,      #
#   To get timing results run ./configure                #
#   using --with-debugging=no, the performance will      #
#   be generally two or three times faster.              #
#                                                        #
##########################################################
2) Please run with -snes_view .
3) Note that with 7 levels:
SNESJacobianEval 21 1.0 2.4364e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 54 0 0 0 0 54 0 0 0 0 0
and with 2 levels:
SNESJacobianEval 6 1.0 2.2441e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 34 0 0 0 0 34 0 0 0 0 0
The Jacobian evaluation is dominating the time (54% of the total with 7 levels,
34% with 2)! Once you rebuild without debugging this will likely be less the
case.
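
To compare the two runs per call rather than by total time, the two log lines
above can be parsed with a few lines of Python. This is only a sketch: the
helper name parse_event is made up for illustration, and the column order
(count, count ratio, max time, ..., with %T as the first percentage column) is
assumed from the usual -log_summary header.

```python
# Sketch: pull call count, max time, and %T out of a -log_summary event line.
# Assumed column order (taken from the usual -log_summary header):
#   name, count, count ratio, time, time ratio, flops, flops ratio,
#   messages, avg length, reductions, then %T %F %M %L %R (global), ...

def parse_event(line):
    """Hypothetical helper: return (name, calls, seconds, percent_of_total)."""
    fields = line.split()
    name = fields[0]
    calls = int(fields[1])
    seconds = float(fields[3])
    percent_total = int(fields[10])  # first global percentage column (%T)
    return name, calls, seconds, percent_total

# The two event lines quoted in this message, keyed by number of MG levels.
runs = {
    7: "SNESJacobianEval 21 1.0 2.4364e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 "
       "0.0e+00 54 0 0 0 0 54 0 0 0 0 0",
    2: "SNESJacobianEval 6 1.0 2.2441e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 "
       "0.0e+00 34 0 0 0 0 34 0 0 0 0 0",
}

for levels, line in sorted(runs.items()):
    name, calls, seconds, pct = parse_event(line)
    print(f"{levels} levels: {name} {calls} calls, {seconds:.2f} s total, "
          f"{seconds / calls:.2f} s per call, {pct}% of run time")
```

The same helper should work on any other event line with the same layout, so
it can be pointed at the full log files once they come from an optimized build.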
Barry
> On Oct 13, 2015, at 9:23 PM, Timothée Nicolas <[email protected]>
> wrote:
>
> Dear all,
>
> I have been playing around with multigrid recently, namely with
> /ksp/ksp/examples/tutorials/ex42.c, with /snes/examples/tutorial/ex5.c and
> with my own implementation of a Laplacian-type problem. In all cases, I have
> noted no improvement whatsoever in the performance, whether in CPU time or
> KSP iterations, by varying the number of levels of the multigrid solver. As an
> example, I have attached the log_summary for ex5.c with nlevels = 2 to 7,
> launched by
>
> mpiexec -n 1 ./ex5 -da_grid_x 21 -da_grid_y 21 -ksp_rtol 1.0e-9 -da_refine 6
> -pc_type mg -pc_mg_levels # -snes_monitor -ksp_monitor -log_summary
>
> where -pc_mg_levels is set to a number between 2 and 7.
>
> So there is a noticeable CPU time improvement from 2 levels to 3 levels
> (30%), and then no improvement whatsoever. I am surprised because with 6
> levels of refinement of the DMDA the fine grid has more than 1200 points so
> with 3 levels the coarse grid still has more than 300 points which is still
> pretty large (I assume the ratio between grids is 2). I am wondering how the
> coarse solver efficiently solves the problem on the coarse grid with such a
> large number of points? Given the principle of multigrid which is to erase
> the smooth part of the error with relaxation methods, which are usually
> efficient only for high frequency, I would expect optimal performance when
> the coarse grid is basically just a few points in each direction. Does anyone
> know why the performance saturates at low number of levels? Basically what
> happens internally seems to be quite different from what I would expect...
>
> Best
>
> Timothee
> <ex5_2_levels_of_multigrid.log><ex5_3_levels_of_multigrid.log><ex5_4_levels_of_multigrid.log><ex5_5_levels_of_multigrid.log><ex5_6_levels_of_multigrid.log><ex5_7_levels_of_multigrid.log>
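
The grid sizes mentioned in the quoted message can be checked with a short
calculation. This is a sketch under stated assumptions: "points" is taken to
mean points per direction, the grid is non-periodic, and every refinement or
coarsening step uses ratio 2 (as the message itself assumes). The function
names are invented for this illustration.

```python
# Sketch: points per direction on each multigrid level, assuming ratio 2 on a
# non-periodic grid: n -> 2*(n - 1) + 1 when refining,
#                    n -> (n - 1)//2 + 1 when coarsening.

def refine(n, times):
    """Apply `times` uniform refinements to a grid with n points."""
    for _ in range(times):
        n = 2 * (n - 1) + 1
    return n

def level_sizes(fine, levels):
    """Sizes from finest to coarsest for a hierarchy with `levels` levels."""
    sizes = [fine]
    for _ in range(levels - 1):
        sizes.append((sizes[-1] - 1) // 2 + 1)
    return sizes

fine = refine(21, 6)  # -da_grid_x 21 combined with -da_refine 6
for levels in range(2, 8):
    coarse = level_sizes(fine, levels)[-1]
    print(f"{levels} levels: coarse grid has {coarse} points per direction")
```

Under these assumptions the coarse grid only comes back down to the original
21 points per direction when all 7 levels are used.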