Dear both,

I have recompiled SLEPc and PETSc without debugging, as well as with the recommended --with-fortran-kernels=1. In the attachment I show the scaling for a typical "large" simulation with about 120,000 unknowns, using Krylov-Schur.
There are two sets of data points, as I do two EPS solves in one simulation. The second solve is faster because it results from a grid refinement of the first solve and takes the solution of the first solve as a good initial guess. Note that there are two pages in the PDF; on the second page I show time · n_procs.

As you can see, the scaling is better than before, especially up to 8 processes (which means about 15,000 unknowns per process, which is, as I recall, cited as a good minimum on the website).

I am currently trying to run make streams NPMAX=8, but the cluster is extraordinarily crowded today and it does not like my interactive jobs. I will try to run them asap.

The main issue now, however, is again the first issue: the Generalized Davidson method does not converge to the physically correct negative eigenvalue (it should be about -0.05, as Krylov-Schur gives me). Instead it stays stuck at some small positive eigenvalue of about +0.0002. It looks as if the solver really does not like crossing the eigenvalue = 0 barrier, a behavior I also see in smaller simulations, where convergence slows down greatly when crossing it.

However, this time, for this big simulation, just increasing NCV does *not* do the trick, at least not up to NCV=2048. I also tried using target magnitude, without success.

I started implementing the capability to start with Krylov-Schur and then switch to GD with EPSSetInitialSpace once a certain precision has been reached, but then realized it might be a bit of overkill, as the SLEPc solution phase in my code generally takes no more than 15% of the time. There are probably other places where I can gain more than a few percent.

However, if there is another trick that can make GD work, it would certainly be appreciated, as in my experience it is really about 5 times faster than Krylov-Schur!

Thanks!
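For reference, here is a minimal sketch of how the time · n_procs figure and the unknowns-per-process count can be computed. The function names are made up for illustration, and the sample timings are the rough debug-build wall times from my earlier message quoted below (7, 22, 29 and 37 seconds on 1 to 4 processes), not the new optimized-build results in the PDF.

```python
def scaling_table(times_by_procs):
    """Return rows of (n_procs, time, cost, speedup, efficiency).

    cost       = time * n_procs (flat under ideal scaling)
    speedup    = time on 1 process / time on n_procs
    efficiency = speedup / n_procs (1.0 under ideal scaling)
    """
    t1 = times_by_procs[1]  # serial reference time
    rows = []
    for p in sorted(times_by_procs):
        t = times_by_procs[p]
        cost = t * p
        speedup = t1 / t
        rows.append((p, t, cost, speedup, speedup / p))
    return rows


def unknowns_per_process(n_unknowns, n_procs):
    """Average number of unknowns owned by each MPI process."""
    return n_unknowns / n_procs


if __name__ == "__main__":
    # Rough debug-build wall times in seconds (illustrative only).
    debug_times = {1: 7.0, 2: 22.0, 3: 29.0, 4: 37.0}
    for p, t, cost, speedup, eff in scaling_table(debug_times):
        print(f"{p} procs: time={t:.1f}s  time*n_procs={cost:.1f}  "
              f"speedup={speedup:.2f}  efficiency={eff:.2f}")
    print("unknowns per process at 8 procs:",
          unknowns_per_process(120_000, 8))
```

If time · n_procs grows with the number of processes, as it does for those debug-build numbers, the extra processes are costing more than they save.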
Toon

On Thu, Mar 30, 2017 at 2:47 PM Matthew Knepley <[email protected]> wrote:

> On Thu, Mar 30, 2017 at 3:05 AM, Jose E. Roman <[email protected]> wrote:
>
> > > On 30 Mar 2017, at 9:27, Toon Weyens <[email protected]> wrote:
> > >
> > > Hi, thanks for the answer.
> > >
> > > I use MUMPS as a PC. The options -ksp_converged_reason,
> > > -ksp_monitor_true_residual and -ksp_view are not used.
> > >
> > > The difference between the log_view outputs of running a simple
> > > solution with 1, 2, 3 or 4 MPI procs is attached (debug version).
> > >
> > > I can see that with 2 procs it takes about 22 seconds, versus
> > > 7 seconds for 1 proc. For 3 and 4 the situation is worse: 29 and
> > > 37 seconds.
> > >
> > > Looks like the difference is mainly in the BVMult and especially in
> > > the BVOrthogonalize routines:
> > >
> > > BVMult takes 1, 6.5, 10 or even a whopping 17 seconds for the
> > > different numbers of processes
> > > BVOrthogonalize takes 1, 4, 6, 10.
> > >
> > > Calculating the preconditioner does not take more time for different
> > > numbers of processes, and applying it only slightly increases. So it
> > > cannot be MUMPS' fault...
> > >
> > > Does this make sense? Is there any way to improve this?
> > >
> > > Thanks!
> >
> > Cannot trust performance data in a debug build:
>
> Yes, you should definitely make another build configured using
> --with-debugging=no.
>
> What do you get for STREAMS on this machine
>
>   make streams NP=4
>
> From this data, it looks like you have already saturated the bandwidth at
> 2 procs.
>
>   Thanks,
>
>     Matt
>
> > ##########################################################
> > #                                                        #
> > #                       WARNING!!!                       #
> > #                                                        #
> > #   This code was compiled with a debugging option,      #
> > #   To get timing results run ./configure                #
> > #   using --with-debugging=no, the performance will      #
> > #   be generally two or three times faster.              #
> > #                                                        #
> > ##########################################################
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
time_SLEPC.pdf
Description: Adobe PDF document
