Timothee,

     Thank you for reporting this issue. It is indeed disturbing and could be 
due to a performance regression we may have introduced by being too clever for 
our own good. Could you please rerun with the additional option 
-mg_levels_ksp_type richardson and send the same output?

   Thanks

  Barry

> On Oct 14, 2015, at 9:32 PM, Timothée Nicolas <[email protected]> 
> wrote:
> 
> Thank you, Barry, for pointing this out. Indeed, with a build without 
> debugging the Jacobian evaluations no longer dominate the time (less than 
> 10%). Otherwise the results are similar, except that the improvement from 
> 2 to 3 levels is much better. It still saturates beyond 3 levels. I 
> understand this in terms of CPU time thanks to Matthew's explanations. What 
> surprises me more is that the KSP iterations do not become more efficient. 
> Even if more levels take more time because of memory issues, I would at 
> least expect the KSP iterations to converge more rapidly with more levels, 
> but this is not the case, as you can see. There is probably a rationale 
> behind this too, but I cannot easily see it.
> 
> I am sending the new outputs.
> 
> Best
> 
> Timothee
> 
> 2015-10-15 3:02 GMT+09:00 Barry Smith <[email protected]>:
> 1) Your timings are meaningless! You cannot compare timings when built with 
> all debugging on, PERIOD!
> 
>       ##########################################################
>       #                                                        #
>       #                          WARNING!!!                    #
>       #                                                        #
>       #   This code was compiled with a debugging option,      #
>       #   To get timing results run ./configure                #
>       #   using --with-debugging=no, the performance will      #
>       #   be generally two or three times faster.              #
>       #                                                        #
>       ##########################################################
> 
> 2) Please run with -snes_view.
> 
> 3) Note that with 7 levels
> 
> SNESJacobianEval      21 1.0 2.4364e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 54  0  0  0  0  54  0  0  0  0     0
> 
> with 2 levels
> 
> SNESJacobianEval       6 1.0 2.2441e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 34  0  0  0  0  34  0  0  0  0     0
> 
> 
> The Jacobian evaluation is dominating the time! If you rebuild without 
> debugging, this will likely be less the case.
> 
>   Barry
> 
> > On Oct 13, 2015, at 9:23 PM, Timothée Nicolas <[email protected]> 
> > wrote:
> >
> > Dear all,
> >
> > I have been playing around with multigrid recently, namely with 
> > /ksp/ksp/examples/tutorials/ex42.c, with /snes/examples/tutorials/ex5.c, 
> > and with my own implementation of a Laplacian-type problem. In all cases, 
> > I have seen no improvement whatsoever in performance, whether in CPU 
> > time or KSP iterations, from varying the number of levels of the multigrid 
> > solver. As an example, I have attached the -log_summary output for ex5.c 
> > with 2 to 7 levels, launched by
> >
> > mpiexec -n 1 ./ex5 -da_grid_x 21 -da_grid_y 21 -ksp_rtol 1.0e-9 -da_refine 6 -pc_type mg -pc_mg_levels # -snes_monitor -ksp_monitor -log_summary
> >
> > where the # after -pc_mg_levels is replaced by a number between 2 and 7.
> >
> > There is a noticeable CPU time improvement from 2 levels to 3 levels 
> > (30%), and then no improvement whatsoever. I am surprised because, with 6 
> > levels of refinement of the DMDA, the fine grid has more than 1200 points 
> > in each direction, so with 3 levels the coarse grid still has more than 
> > 300 points, which is still pretty large (I assume the ratio between grids 
> > is 2). How does the coarse solver efficiently solve the problem on a 
> > coarse grid with such a large number of points? Since the principle of 
> > multigrid is to remove the smooth part of the error on coarser grids, 
> > because relaxation methods are usually efficient only for high 
> > frequencies, I would expect optimal performance when the coarse grid has 
> > just a few points in each direction. Does anyone know why the performance 
> > saturates at a low number of levels? What happens internally seems to be 
> > quite different from what I would expect.
> >
> > Best
> >
> > Timothee
> > <ex5_2_levels_of_multigrid.log><ex5_3_levels_of_multigrid.log><ex5_4_levels_of_multigrid.log><ex5_5_levels_of_multigrid.log><ex5_6_levels_of_multigrid.log><ex5_7_levels_of_multigrid.log>
> 
> 
> <ex5_2_multigrid_levels.log><ex5_3_multigrid_levels.log><ex5_4_multigrid_levels.log><ex5_5_multigrid_levels.log><ex5_6_multigrid_levels.log><ex5_7_multigrid_levels.log>
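
For reference, the grid sizes discussed in the original message (a fine grid of "more than 1200 points" and a 3-level coarse grid of "more than 300 points") can be reproduced with a short calculation. This is only a sketch under stated assumptions: a non-periodic DMDA, a refinement ratio of 2 so that N points per direction become 2*(N-1)+1, and a multigrid hierarchy obtained by coarsening the finest grid (levels - 1) times. The function names are hypothetical helpers for illustration, not PETSc API.

```python
def dmda_level_sizes(base_points, refinements, ratio=2):
    """Points per direction from the base grid up to the finest grid.

    Assumes a non-periodic grid, where one refinement maps N points
    to ratio*(N - 1) + 1 points.
    """
    sizes = [base_points]
    for _ in range(refinements):
        sizes.append(ratio * (sizes[-1] - 1) + 1)
    return sizes


def mg_coarse_size(sizes, levels):
    """Points per direction on the coarsest multigrid level.

    Assumes the hierarchy is built by coarsening the finest grid
    (levels - 1) times, so it uses the `levels` finest grids in `sizes`.
    """
    if not 1 <= levels <= len(sizes):
        raise ValueError("levels must be between 1 and len(sizes)")
    return sizes[-levels]


if __name__ == "__main__":
    # -da_grid_x 21 -da_grid_y 21 -da_refine 6, as in the command above
    sizes = dmda_level_sizes(21, 6)
    print("finest grid:", sizes[-1], "points per direction")
    for levels in range(2, 8):
        n = mg_coarse_size(sizes, levels)
        print(f"-pc_mg_levels {levels}: coarse grid {n} x {n}")
```

Under these assumptions, only the 7-level run reaches the original 21 x 21 grid; every run with fewer levels leaves a considerably larger coarse problem.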
