Ah, OK. When I find the time I will look into mapping processes to cores. I guess that is possible with the Torque scheduler.
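As a rough illustration of what mapping processes to cores could mean here, the sketch below deals MPI ranks out round-robin over NUMA domains, so that the first few ranks do not all sit behind the same memory controller. It assumes, hypothetically, a 12-core node with two NUMA domains of six consecutively numbered cores each; the thread does not state the actual topology (numactl --hardware or lscpu would show it). The function name scatter_ranks is made up for this example.

```python
def scatter_ranks(n_ranks, n_domains=2, cores_per_domain=6):
    """Return a list where entry r is the core id assigned to rank r.

    Ranks are dealt out round-robin over NUMA domains. Cores are assumed
    (hypothetically) to be numbered consecutively within each domain:
    domain 0 holds cores 0..5, domain 1 holds cores 6..11.
    """
    if n_ranks > n_domains * cores_per_domain:
        raise ValueError("more ranks than cores")
    mapping = []
    for rank in range(n_ranks):
        domain = rank % n_domains   # alternate between NUMA domains
        slot = rank // n_domains    # next unused core inside that domain
        mapping.append(domain * cores_per_domain + slot)
    return mapping


if __name__ == "__main__":
    for n in (1, 2, 4, 12):
        print(n, scatter_ranks(n))
```

Each rank could then be started under something like numactl --physcpubind=<core>, or the MPI launcher's own binding options could be used instead. Which mechanism applies depends on the MPI implementation and on how Torque hands out cores, so this is only a starting point.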

Thank you!

On Tue, Apr 4, 2017 at 2:00 PM Matthew Knepley <knep...@gmail.com> wrote:

> On Tue, Apr 4, 2017 at 6:58 AM, Toon Weyens <toon.wey...@gmail.com> wrote:
>
> Dear Matthew,
>
> Thanks for your answer, but this is something I do not really know much about... The node I used has 12 cores and about 24 GB of RAM.
>
> But for these test cases, isn't the distribution of memory over cores handled automatically by SLEPc?
>
>
> No. It's handled by MPI, which just passes that job off to the OS, which does a crap job.
>
> Matt
>
>
> Regards
>
> On Tue, Apr 4, 2017 at 1:40 PM Matthew Knepley <knep...@gmail.com> wrote:
>
> On Tue, Apr 4, 2017 at 2:20 AM, Toon Weyens <toon.wey...@gmail.com> wrote:
>
> Dear Jose and Matthew,
>
> Thank you so much for the effort!
>
> I still don't manage to converge using the range interval technique to filter out the positive eigenvalues, but using shift-invert combined with a target eigenvalue does true miracles. I get extremely fast convergence.
>
> The truth of the matter is that we are mainly interested in negative eigenvalues (unstable modes), and from physical considerations they are more or less situated in -0.2<lambda<0 in the normalized quantities that we use. So we will just use guesses.
>
> Thank you so much again!
>
> Also, I have finally managed to run streams (the cluster is quite full atm). These are the outputs:
>
>
> 1) This shows you have a bad process mapping. You could get much more speedup for 1-4 procs by properly mapping processes to cores, perhaps with numactl.
>
> 2) Essentially 3 processes can saturate your memory bandwidth, so I would not expect much gain from using more than 4.
>
> Thanks,
>
> Matt
>
>
> 1 processes
> Number of MPI processes 1 Processor names c04b27
> Triad: 12352.0825 Rate (MB/s)
> 2 processes
> Number of MPI processes 2 Processor names c04b27 c04b27
> Triad: 18968.0226 Rate (MB/s)
> 3 processes
> Number of MPI processes 3 Processor names c04b27 c04b27 c04b27
> Triad: 21106.8580 Rate (MB/s)
> 4 processes
> Number of MPI processes 4 Processor names c04b27 c04b27 c04b27 c04b27
> Triad: 21655.5885 Rate (MB/s)
> 5 processes
> Number of MPI processes 5 Processor names c04b27 c04b27 c04b27 c04b27 c04b27
> Triad: 21627.5559 Rate (MB/s)
> 6 processes
> Number of MPI processes 6 Processor names c04b27 c04b27 c04b27 c04b27 c04b27 c04b27
> Triad: 21394.9620 Rate (MB/s)
> 7 processes
> Number of MPI processes 7 Processor names c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27
> Triad: 24952.7076 Rate (MB/s)
> 8 processes
> Number of MPI processes 8 Processor names c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27
> Triad: 28357.1062 Rate (MB/s)
> 9 processes
> Number of MPI processes 9 Processor names c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27
> Triad: 31720.4545 Rate (MB/s)
> 10 processes
> Number of MPI processes 10 Processor names c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27
> Triad: 35198.7412 Rate (MB/s)
> 11 processes
> Number of MPI processes 11 Processor names c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27
> Triad: 38616.0615 Rate (MB/s)
> 12 processes
> Number of MPI processes 12 Processor names c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27 c04b27
> Triad: 41939.3994 Rate (MB/s)
>
> I attach a figure.
>
> Thanks again!
>
> On Mon, Apr 3, 2017 at 8:29 PM Jose E. Roman <jro...@dsic.upv.es> wrote:
>
> > On 1 Apr 2017, at 0:01, Toon Weyens <toon.wey...@gmail.com> wrote:
> >
> > Dear Jose,
> >
> > I have saved the matrices in MATLAB format and am sending them to you using pCloud.
> > If you want another format, please tell me. Please also note that they are about 1.4 GB each.
> >
> > I also attach a typical output of eps_view and log_view in output.txt, for 8 processes.
> >
> > Thanks so much for helping me out! I think PETSc and SLEPc are amazing inventions that really have saved me many months of work!
> >
> > Regards
>
> I played a little bit with your matrices.
>
> With Krylov-Schur I can solve the problem quite easily. Note that in generalized eigenvalue problems it is always better to use STSINVERT because you have to invert a matrix anyway. So instead of setting which=smallest_real, use shift-and-invert with a target that is close to the wanted eigenvalue. For instance, with target=-0.005 I get convergence with just one iteration:
>
> $ ./ex7 -f1 A.bin -f2 B.bin -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps -eps_tol 1e-5 -st_type sinvert -eps_target -0.005
>
> Generalized eigenproblem stored in file.
>
> Reading COMPLEX matrices from binary files...
> Number of iterations of the method: 1
> Number of linear iterations of the method: 16
> Solution method: krylovschur
>
> Number of requested eigenvalues: 1
> Stopping condition: tol=1e-05, maxit=7500
> Linear eigensolve converged (1 eigenpair) due to CONVERGED_TOL; iterations 1
> ---------------------- --------------------
>            k             ||Ax-kBx||/||kx||
> ---------------------- --------------------
>   -0.004809-0.000000i       8.82085e-05
> ---------------------- --------------------
>
> Of course, you don't know a priori where your eigenvalue is. Alternatively, you can set the target at 0 and get rid of positive eigenvalues with a region filtering. For instance:
>
> $ ./ex7 -f1 A.bin -f2 B.bin -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps -eps_tol 1e-5 -st_type sinvert -eps_target 0 -rg_type interval -rg_interval_endpoints -1,0,-.05,.05 -eps_nev 2
>
> Generalized eigenproblem stored in file.
>
> Reading COMPLEX matrices from binary files...
> Number of iterations of the method: 8
> Number of linear iterations of the method: 74
> Solution method: krylovschur
>
> Number of requested eigenvalues: 2
> Stopping condition: tol=1e-05, maxit=7058
> Linear eigensolve converged (2 eigenpairs) due to CONVERGED_TOL; iterations 8
> ---------------------- --------------------
>            k             ||Ax-kBx||/||kx||
> ---------------------- --------------------
>   -0.000392-0.000000i         2636.4
>   -0.004809+0.000000i         318441
> ---------------------- --------------------
>
> In this case, the residuals seem very bad. But this is due to the fact that your matrices have huge norms. Adding the option -eps_error_backward ::ascii_info_detail will show residuals relative to the matrix norms:
> ---------------------- --------------------
>            k                 eta(x,k)
> ---------------------- --------------------
>   -0.000392-0.000000i       3.78647e-11
>   -0.004809+0.000000i       5.61419e-08
> ---------------------- --------------------
>
> Regarding the GD solver, I am also getting the correct solution. I don't know why you are not getting convergence to the wanted eigenvalue:
>
> $ ./ex7 -f1 A.bin -f2 B.bin -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps -eps_tol 1e-5 -eps_smallest_real -eps_ncv 32 -eps_type gd
>
> Generalized eigenproblem stored in file.
>
> Reading COMPLEX matrices from binary files...
> Number of iterations of the method: 132
> Number of linear iterations of the method: 0
> Solution method: gd
>
> Number of requested eigenvalues: 1
> Stopping condition: tol=1e-05, maxit=120000
> Linear eigensolve converged (1 eigenpair) due to CONVERGED_TOL; iterations 132
> ---------------------- --------------------
>            k             ||Ax-kBx||/||kx||
> ---------------------- --------------------
>   -0.004809+0.000000i       2.16223e-05
> ---------------------- --------------------
>
> Again, it is much better to use a target instead of smallest_real:
>
> $ ./ex7 -f1 A.bin -f2 B.bin -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps -eps_tol 1e-5 -eps_type gd -eps_target -0.005
>
> Generalized eigenproblem stored in file.
>
> Reading COMPLEX matrices from binary files...
> Number of iterations of the method: 23
> Number of linear iterations of the method: 0
> Solution method: gd
>
> Number of requested eigenvalues: 1
> Stopping condition: tol=1e-05, maxit=120000
> Linear eigensolve converged (1 eigenpair) due to CONVERGED_TOL; iterations 23
> ---------------------- --------------------
>            k             ||Ax-kBx||/||kx||
> ---------------------- --------------------
>   -0.004809-0.000000i       2.06572e-05
> ---------------------- --------------------
>
> Jose
>
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
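
To put the streams numbers quoted in this thread in perspective, the following sketch computes, for each process count, the bandwidth speedup relative to one process and the corresponding parallel efficiency. The rates are copied from the Triad lines above; the names TRIAD_MB_S and bandwidth_speedup are only illustrative.

```python
# Triad rates (MB/s) copied from the streams output quoted in this thread,
# keyed by the number of MPI processes.
TRIAD_MB_S = {
    1: 12352.0825, 2: 18968.0226, 3: 21106.8580, 4: 21655.5885,
    5: 21627.5559, 6: 21394.9620, 7: 24952.7076, 8: 28357.1062,
    9: 31720.4545, 10: 35198.7412, 11: 38616.0615, 12: 41939.3994,
}


def bandwidth_speedup(rates):
    """Map process count -> (speedup vs. 1 process, parallel efficiency)."""
    base = rates[1]
    return {n: (r / base, r / base / n) for n, r in sorted(rates.items())}


if __name__ == "__main__":
    print(f"{'procs':>5} {'speedup':>8} {'efficiency':>10}")
    for n, (speedup, efficiency) in bandwidth_speedup(TRIAD_MB_S).items():
        print(f"{n:>5} {speedup:>8.2f} {efficiency:>10.2f}")
```

For memory-bound kernels such as sparse matrix-vector products, this bandwidth speedup is often taken as a rough upper bound on what adding processes on a single node can deliver, so it is a useful number to compare solver timings against.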