Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-25 Thread Ale Foggia
No, the eigenvalue is around -15. I've tried KS and the number of
iterations differs by one when I change the number of MPI processes, which
seems fine to me. So, I'll see whether this method is suitable for my specific
goal or not, and I'll try to use it. Thanks for the help.

On Wed, Oct 24, 2018 at 15:48, Jose E. Roman () wrote:

> Everything seems correct. I don't know, maybe your problem is very
> sensitive? Is the eigenvalue tiny?
> I would still try with Krylov-Schur.
> Jose
>
>
> > On 24 Oct 2018, at 14:59, Ale Foggia  wrote:
> >
> > The functions called to set the solver are (in this order): EPSCreate();
> EPSSetOperators(); EPSSetProblemType(EPS_HEP); EPSSetType(EPSLANCZOS);
> EPSSetWhichEigenpairs(EPS_SMALLEST_REAL); EPSSetFromOptions();
> >
> > The output of -eps_view for each run is:
> > =
> > EPS Object: 960 MPI processes
> >   type: lanczos
> > LOCAL reorthogonalization
> >   problem type: symmetric eigenvalue problem
> >   selected portion of the spectrum: smallest real parts
> >   number of eigenvalues (nev): 1
> >   number of column vectors (ncv): 16
> >   maximum dimension of projected problem (mpd): 16
> >   maximum number of iterations: 291700777
> >   tolerance: 1e-08
> >   convergence test: relative to the eigenvalue
> > BV Object: 960 MPI processes
> >   type: svec
> >   17 columns of global length 2333606220
> >   vector orthogonalization method: modified Gram-Schmidt
> >   orthogonalization refinement: if needed (eta: 0.7071)
> >   block orthogonalization method: GS
> >   doing matmult as a single matrix-matrix product
> >   generating random vectors independent of the number of processes
> > DS Object: 960 MPI processes
> >   type: hep
> >   parallel operation mode: REDUNDANT
> >   solving the problem with: Implicit QR method (_steqr)
> > ST Object: 960 MPI processes
> >   type: shift
> >   shift: 0.
> >   number of matrices: 1
> > =
> > EPS Object: 1024 MPI processes
> >   type: lanczos
> > LOCAL reorthogonalization
> >   problem type: symmetric eigenvalue problem
> >   selected portion of the spectrum: smallest real parts
> >   number of eigenvalues (nev): 1
> >   number of column vectors (ncv): 16
> >   maximum dimension of projected problem (mpd): 16
> >   maximum number of iterations: 291700777
> >   tolerance: 1e-08
> >   convergence test: relative to the eigenvalue
> > BV Object: 1024 MPI processes
> >   type: svec
> >   17 columns of global length 2333606220
> >   vector orthogonalization method: modified Gram-Schmidt
> >   orthogonalization refinement: if needed (eta: 0.7071)
> >   block orthogonalization method: GS
> >   doing matmult as a single matrix-matrix product
> >   generating random vectors independent of the number of processes
> > DS Object: 1024 MPI processes
> >   type: hep
> >   parallel operation mode: REDUNDANT
> >   solving the problem with: Implicit QR method (_steqr)
> > ST Object: 1024 MPI processes
> >   type: shift
> >   shift: 0.
> >   number of matrices: 1
> > =
> >
> > I ran the same configurations again and I got the same result in terms of
> the number of iterations.
> >
> > I also tried the full reorthogonalization (always with the
> -bv_reproducible_random option) but I still get a different number of
> iterations: for 960 procs I get 172 iters, and for 1024 I get 362 iters.
> The -eps_view output for this case (only for 960 procs; the other one has
> the same information, except the number of processes) is:
> > =
> > EPS Object: 960 MPI processes
> >   type: lanczos
> > FULL reorthogonalization
> >   problem type: symmetric eigenvalue problem
> >   selected portion of the spectrum: smallest real parts
> >   number of eigenvalues (nev): 1
> >   number of column vectors (ncv): 16
> >   maximum dimension of projected problem (mpd): 16
> >   maximum number of iterations: 291700777
> >   tolerance: 1e-08
> >   convergence test: relative to the eigenvalue
> > BV Object: 960 MPI processes
> >   type: svec
> >   17 columns of global length 2333606220
> >   vector orthogonalization method: classical Gram-Schmidt
> >   orthogonalization refinement: if needed (eta: 0.7071)
> >   block orthogonalization method: GS
> >   doing matmult as a single matrix-matrix product
> >   generating random vectors independent of the number of processes
> > DS Object: 960 MPI processes
> >   type: hep
> >   parallel operation mode: REDUNDANT
> >   solving the problem with: Implicit QR method (_steqr)
> > ST Object: 960 MPI processes
> >   type: shift
> >   shift: 0.
> >   number of matrices: 1
> > =
> >
> > On Wed, Oct 24, 2018 at 10:52, Jose E. Roman () wrote:
> > This is very strange. Make sure you call EPSSetFromOptions in the 

Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-25 Thread Ale Foggia
On Tue, Oct 23, 2018 at 13:53, Matthew Knepley () wrote:

> On Tue, Oct 23, 2018 at 6:24 AM Ale Foggia  wrote:
>
>> Hello,
>>
>> I'm currently using Lanczos solver (EPSLANCZOS) to get the smallest real
>> eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those are
>> the only options I set for the solver. My aim is to be able to
>> predict/estimate the time-to-solution. To do so, I was doing a scaling of
>> the code for different sizes of matrices and for different number of MPI
>> processes. As I was not observing a good scaling I checked the number of
>> iterations of the solver (given by EPSGetIterationNumber). I've encountered
>> that, for the **same size** of matrix (that is, the same problem), when
>> I change the number of MPI processes, the number of iterations changes, and
>> the behaviour is not monotonic. These are the numbers I've got:
>>
>
> I am sure you know this, but this test is strong scaling and will top out
> when the individual problem sizes become too small (we see this at several
> thousand unknowns).
>

Thanks for pointing this out; we are aware of that and I've been "playing"
around to try to see this behaviour for myself. Now, I think I'll go with
the Krylov-Schur method because it is the only solution to the problem of the
number of iterations. With this I think I'll be able to see the effect of the
individual problem size in the scaling.


>   Thanks,
>
> Matt
>
>
>>
>> # procs   # iters
>> 960       157
>> 992       189
>> 1024      338
>> 1056      190
>> 1120      174
>> 2048      136
>>
>> I've checked the mailing list for a similar situation and I've found
>> another person with the same problem but in another solver ("[SLEPc] GD is
>> not deterministic when using different number of cores", Nov 19 2015), but
>> I think the solution this person finds does not apply to my problem
>> (removing "-eps_harmonic" option).
>>
>> Can you give me any hint on what is the reason for this behaviour? Is
>> there a way to prevent this? It's not possible to estimate/predict any time
>> consumption for bigger problems if the number of iterations varies this
>> much.
>>
>> Ale
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-24 Thread Jose E. Roman
Everything seems correct. I don't know, maybe your problem is very sensitive? 
Is the eigenvalue tiny?
I would still try with Krylov-Schur.
Jose


> On 24 Oct 2018, at 14:59, Ale Foggia  wrote:
> 
> The functions called to set the solver are (in this order): EPSCreate(); 
> EPSSetOperators(); EPSSetProblemType(EPS_HEP); EPSSetType(EPSLANCZOS); 
> EPSSetWhichEigenpairs(EPS_SMALLEST_REAL); EPSSetFromOptions();
> 
> The output of -eps_view for each run is:
> =
> EPS Object: 960 MPI processes
>   type: lanczos
> LOCAL reorthogonalization
>   problem type: symmetric eigenvalue problem
>   selected portion of the spectrum: smallest real parts
>   number of eigenvalues (nev): 1
>   number of column vectors (ncv): 16
>   maximum dimension of projected problem (mpd): 16
>   maximum number of iterations: 291700777
>   tolerance: 1e-08
>   convergence test: relative to the eigenvalue
> BV Object: 960 MPI processes
>   type: svec
>   17 columns of global length 2333606220
>   vector orthogonalization method: modified Gram-Schmidt
>   orthogonalization refinement: if needed (eta: 0.7071)
>   block orthogonalization method: GS
>   doing matmult as a single matrix-matrix product
>   generating random vectors independent of the number of processes
> DS Object: 960 MPI processes
>   type: hep
>   parallel operation mode: REDUNDANT
>   solving the problem with: Implicit QR method (_steqr)
> ST Object: 960 MPI processes
>   type: shift
>   shift: 0.
>   number of matrices: 1
> =
> EPS Object: 1024 MPI processes
>   type: lanczos
> LOCAL reorthogonalization
>   problem type: symmetric eigenvalue problem
>   selected portion of the spectrum: smallest real parts
>   number of eigenvalues (nev): 1
>   number of column vectors (ncv): 16
>   maximum dimension of projected problem (mpd): 16
>   maximum number of iterations: 291700777
>   tolerance: 1e-08
>   convergence test: relative to the eigenvalue
> BV Object: 1024 MPI processes
>   type: svec
>   17 columns of global length 2333606220
>   vector orthogonalization method: modified Gram-Schmidt
>   orthogonalization refinement: if needed (eta: 0.7071)
>   block orthogonalization method: GS
>   doing matmult as a single matrix-matrix product
>   generating random vectors independent of the number of processes
> DS Object: 1024 MPI processes
>   type: hep
>   parallel operation mode: REDUNDANT
>   solving the problem with: Implicit QR method (_steqr)
> ST Object: 1024 MPI processes
>   type: shift
>   shift: 0.
>   number of matrices: 1
> =
> 
> I ran the same configurations again and I got the same result in terms of the
> number of iterations.
> 
> I also tried the full reorthogonalization (always with the
> -bv_reproducible_random option) but I still get a different number of
> iterations: for 960 procs I get 172 iters, and for 1024 I get 362 iters. The
> -eps_view output for this case (only for 960 procs; the other one has the
> same information, except the number of processes) is:
> =
> EPS Object: 960 MPI processes
>   type: lanczos
> FULL reorthogonalization
>   problem type: symmetric eigenvalue problem
>   selected portion of the spectrum: smallest real parts
>   number of eigenvalues (nev): 1
>   number of column vectors (ncv): 16
>   maximum dimension of projected problem (mpd): 16
>   maximum number of iterations: 291700777
>   tolerance: 1e-08
>   convergence test: relative to the eigenvalue
> BV Object: 960 MPI processes
>   type: svec
>   17 columns of global length 2333606220
>   vector orthogonalization method: classical Gram-Schmidt
>   orthogonalization refinement: if needed (eta: 0.7071)
>   block orthogonalization method: GS
>   doing matmult as a single matrix-matrix product
>   generating random vectors independent of the number of processes
> DS Object: 960 MPI processes
>   type: hep
>   parallel operation mode: REDUNDANT
>   solving the problem with: Implicit QR method (_steqr)
> ST Object: 960 MPI processes
>   type: shift
>   shift: 0.
>   number of matrices: 1
> =
> 
> On Wed, Oct 24, 2018 at 10:52, Jose E. Roman () wrote:
> This is very strange. Make sure you call EPSSetFromOptions in the code. Do 
> iteration counts change also for two different runs with the same number of 
> processes?
> Maybe Lanczos with default options is too sensitive (by default it does not 
> reorthogonalize). Suggest using Krylov-Schur or Lanczos with full 
> reorthogonalization (EPSLanczosSetReorthog).
> Also, send the output of -eps_view to see if there is anything abnormal.
> 
> Jose
> 
> 
> > On 24 Oct 2018, at 9:09, Ale Foggia  wrote:
> > 
> > I've tried the option that you gave me but I still get a different number of 

Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-24 Thread Ale Foggia
The functions called to set the solver are (in this order): EPSCreate();
EPSSetOperators(); EPSSetProblemType(EPS_HEP); EPSSetType(EPSLANCZOS);
EPSSetWhichEigenpairs(EPS_SMALLEST_REAL); EPSSetFromOptions();
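
For reference, a minimal sketch of that call sequence in C (illustrative only,
not the actual application code; the assembled Mat A and the helper-function
name are assumptions):

#include <slepceps.h>

/* Sketch: solve for the smallest real eigenvalue with the Lanczos solver and
   report the iteration count. The Mat A is assumed to be assembled elsewhere. */
static PetscErrorCode solve_smallest_real(Mat A, PetscReal *eig, PetscInt *its)
{
  EPS            eps;
  PetscScalar    kr, ki;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = EPSCreate(PetscObjectComm((PetscObject)A), &eps);CHKERRQ(ierr);
  ierr = EPSSetOperators(eps, A, NULL);CHKERRQ(ierr);            /* standard (non-generalized) problem */
  ierr = EPSSetProblemType(eps, EPS_HEP);CHKERRQ(ierr);          /* Hermitian eigenproblem */
  ierr = EPSSetType(eps, EPSLANCZOS);CHKERRQ(ierr);
  ierr = EPSSetWhichEigenpairs(eps, EPS_SMALLEST_REAL);CHKERRQ(ierr);
  ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);                   /* picks up -eps_view, -bv_reproducible_random, ... */
  ierr = EPSSolve(eps);CHKERRQ(ierr);
  ierr = EPSGetIterationNumber(eps, its);CHKERRQ(ierr);          /* the count that varies across runs here */
  ierr = EPSGetEigenvalue(eps, 0, &kr, &ki);CHKERRQ(ierr);       /* assumes at least one converged pair */
  *eig = PetscRealPart(kr);
  ierr = EPSDestroy(&eps);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}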

The output of -eps_view for each run is:
=
EPS Object: 960 MPI processes
  type: lanczos
LOCAL reorthogonalization
  problem type: symmetric eigenvalue problem
  selected portion of the spectrum: smallest real parts
  number of eigenvalues (nev): 1
  number of column vectors (ncv): 16
  maximum dimension of projected problem (mpd): 16
  maximum number of iterations: 291700777
  tolerance: 1e-08
  convergence test: relative to the eigenvalue
BV Object: 960 MPI processes
  type: svec
  17 columns of global length 2333606220
  vector orthogonalization method: modified Gram-Schmidt
  orthogonalization refinement: if needed (eta: 0.7071)
  block orthogonalization method: GS
  doing matmult as a single matrix-matrix product
  generating random vectors independent of the number of processes
DS Object: 960 MPI processes
  type: hep
  parallel operation mode: REDUNDANT
  solving the problem with: Implicit QR method (_steqr)
ST Object: 960 MPI processes
  type: shift
  shift: 0.
  number of matrices: 1
=
EPS Object: 1024 MPI processes
  type: lanczos
LOCAL reorthogonalization
  problem type: symmetric eigenvalue problem
  selected portion of the spectrum: smallest real parts
  number of eigenvalues (nev): 1
  number of column vectors (ncv): 16
  maximum dimension of projected problem (mpd): 16
  maximum number of iterations: 291700777
  tolerance: 1e-08
  convergence test: relative to the eigenvalue
BV Object: 1024 MPI processes
  type: svec
  17 columns of global length 2333606220
  vector orthogonalization method: modified Gram-Schmidt
  orthogonalization refinement: if needed (eta: 0.7071)
  block orthogonalization method: GS
  doing matmult as a single matrix-matrix product
  generating random vectors independent of the number of processes
DS Object: 1024 MPI processes
  type: hep
  parallel operation mode: REDUNDANT
  solving the problem with: Implicit QR method (_steqr)
ST Object: 1024 MPI processes
  type: shift
  shift: 0.
  number of matrices: 1
=

I ran the same configurations again and I got the same result in terms of
the number of iterations.

I also tried the full reorthogonalization (always with the
-bv_reproducible_random option) but I still get a different number of
iterations: for 960 procs I get 172 iters, and for 1024 I get 362 iters.
The -eps_view output for this case (only for 960 procs; the other one has
the same information, except the number of processes) is:
=
EPS Object: 960 MPI processes
  type: lanczos
FULL reorthogonalization
  problem type: symmetric eigenvalue problem
  selected portion of the spectrum: smallest real parts
  number of eigenvalues (nev): 1
  number of column vectors (ncv): 16
  maximum dimension of projected problem (mpd): 16
  maximum number of iterations: 291700777
  tolerance: 1e-08
  convergence test: relative to the eigenvalue
BV Object: 960 MPI processes
  type: svec
  17 columns of global length 2333606220
  vector orthogonalization method: classical Gram-Schmidt
  orthogonalization refinement: if needed (eta: 0.7071)
  block orthogonalization method: GS
  doing matmult as a single matrix-matrix product
  generating random vectors independent of the number of processes
DS Object: 960 MPI processes
  type: hep
  parallel operation mode: REDUNDANT
  solving the problem with: Implicit QR method (_steqr)
ST Object: 960 MPI processes
  type: shift
  shift: 0.
  number of matrices: 1
=

On Wed, Oct 24, 2018 at 10:52, Jose E. Roman () wrote:

> This is very strange. Make sure you call EPSSetFromOptions in the code. Do
> iteration counts change also for two different runs with the same number of
> processes?
> Maybe Lanczos with default options is too sensitive (by default it does
> not reorthogonalize). Suggest using Krylov-Schur or Lanczos with full
> reorthogonalization (EPSLanczosSetReorthog).
> Also, send the output of -eps_view to see if there is anything abnormal.
>
> Jose
>
>
> > On 24 Oct 2018, at 9:09, Ale Foggia  wrote:
> >
> > I've tried the option that you gave me but I still get a different number
> of iterations when changing the number of MPI processes: I ran with 960 procs
> and 1024 procs and I got 435 and 176 iterations, respectively.
> >
> > On Tue, Oct 23, 2018 at 16:48, Jose E. Roman () wrote:
> >
> >
> > > On 23 Oct 2018, at 15:46, Ale Foggia  wrote:
> > >
> > >
> > >
> > > On Tue, Oct 23, 2018 at 15:33, Jose E. Roman () wrote:
> > >
> > >
> > > > On 23 Oct 2018, at 15:17, Ale Foggia 
> 

Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-24 Thread Jose E. Roman
This is very strange. Make sure you call EPSSetFromOptions in the code. Do 
iteration counts change also for two different runs with the same number of 
processes?
Maybe Lanczos with default options is too sensitive (by default it does not 
reorthogonalize). Suggest using Krylov-Schur or Lanczos with full 
reorthogonalization (EPSLanczosSetReorthog).
Also, send the output of -eps_view to see if there is anything abnormal.
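
For illustration (a sketch under the assumption of an already-created EPS
object; not code from this thread), the two alternatives look like:

#include <slepceps.h>

/* Sketch only: switch an existing EPS either to the default Krylov-Schur
   solver or to Lanczos with full reorthogonalization. */
static PetscErrorCode make_solver_more_robust(EPS eps, PetscBool keep_lanczos)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  if (keep_lanczos) {
    ierr = EPSSetType(eps, EPSLANCZOS);CHKERRQ(ierr);
    ierr = EPSLanczosSetReorthog(eps, EPS_LANCZOS_REORTHOG_FULL);CHKERRQ(ierr);
    /* run-time equivalent: -eps_type lanczos -eps_lanczos_reorthog full */
  } else {
    ierr = EPSSetType(eps, EPSKRYLOVSCHUR);CHKERRQ(ierr);  /* the default, implicitly restarted solver */
  }
  PetscFunctionReturn(0);
}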

Jose


> On 24 Oct 2018, at 9:09, Ale Foggia  wrote:
> 
> I've tried the option that you gave me but I still get a different number of
> iterations when changing the number of MPI processes: I ran with 960 procs and
> 1024 procs and I got 435 and 176 iterations, respectively.
> 
> On Tue, Oct 23, 2018 at 16:48, Jose E. Roman () wrote:
> 
> 
> > On 23 Oct 2018, at 15:46, Ale Foggia  wrote:
> > 
> > 
> > 
> > On Tue, Oct 23, 2018 at 15:33, Jose E. Roman () wrote:
> > 
> > 
> > > On 23 Oct 2018, at 15:17, Ale Foggia  wrote:
> > > 
> > > Hello Jose, thanks for your answer.
> > > 
> > > On Tue, Oct 23, 2018 at 12:59, Jose E. Roman () wrote:
> > > There is an undocumented option:
> > > 
> > >   -bv_reproducible_random
> > > 
> > > It will force the initial vector of the Krylov subspace to be the same 
> > > irrespective of the number of MPI processes. This should be used for 
> > > scaling analyses as the one you are trying to do.
> > > 
> > > What about when I'm not doing the scaling? Now I would like to request
> > > computing time for bigger problems; should I also use this option in
> > > that case? Because, what happens if I have a "bad" configuration?
> > > Meaning, I request some amount of time, enough if I take into account the "correct"
> > > scaling, but when I run, it takes double the time/iterations, as
> > > happened before when changing from 960 to 1024 processes?
> > 
> > When you increase the matrix size the spectrum of the matrix changes and 
> > probably also the convergence, so the computation time is not easy to 
> > predict in advance.
> > 
> > Okay, I'll keep that in mind. I thought that, even if the spectrum changes,
> > if I had a behaviour/tendency for 6 or 7 smaller cases I could predict more
> > or less the time. It was working this way until I found this "iterations
> > problem" which doubled the execution time for the same size problem. To
> > be completely sure, do you suggest that I use this run-time option or not
> > when going into production? Can you elaborate a bit on the effect of this
> > option? Is the (huge) difference I got in the number of iterations
> > something expected?
> 
> Ideally, if you have a rough approximation of the eigenvector, you set it as
> the initial vector with EPSSetInitialSpace(). Otherwise, SLEPc generates a
> random initial vector, that is, it starts the search blindly. The difference
> between using one random vector or another may be large, depending on the
> problem. Krylov-Schur is usually less sensitive to the initial vector.
> 
> Jose
> 
> > 
> > 
> > > 
> > > An additional comment is that we strongly recommend to use the default 
> > > solver (Krylov-Schur), which will do Lanczos with implicit restart. It is 
> > > generally faster and more stable.
> > > 
> > > I will be doing Dynamical Lanczos, that means that I'll need the "matrix 
> > > whose rows are the eigenvectors of the tridiagonal matrix" (so, according 
> > > to the Lanczos Technical Report notation, I need the "matrix whose rows 
> > > are the eigenvectors of T_m", which should be the same as the vectors 
> > > y_i). I checked the Technical Report for Krylov-Schur also and I think I 
> > > can get the same information also from that solver, but I'm not sure. Can 
> > > you confirm this please? 
> > > Also, as the vectors I want are given by V_m^(-1)*x_i=y_i (following the 
> > > notation on the Report), my idea to get them was to retrieve the 
> > > invariant subspace V_m (with EPSGetInvariantSubspace), invert it, and 
> > > then multiply it with the eigenvectors that I get with EPSGetEigenvector. 
> > > Is there another easier (or with less computations) way to get this?
> > 
> > In Krylov-Schur the tridiagonal matrix T_m becomes 
> > arrowhead-plus-tridiagonal. Apart from this, it should be equivalent. The 
> > relevant information can be obtained with EPSGetBV() and EPSGetDS(). But 
> > this is a "developer level" interface. We could help you get this running. 
> > Send a small problem matrix to slepc-maint together with a more detailed 
> > description of what you need to compute.
> > 
> > Thanks! When I get to that part I'll write to slepc-maint for help.
> > 
> > 
> > Jose
> > 
> > > 
> > > 
> > > Jose
> > > 
> > > 
> > > 
> > > > On 23 Oct 2018, at 12:13, Ale Foggia  wrote:
> > > > 
> > > > Hello, 
> > > > 
> > > > I'm currently using Lanczos solver (EPSLANCZOS) to get the smallest 
> > > > real eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). 
> > > > Those are the only options 

Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-24 Thread Ale Foggia
I've tried the option that you gave me but I still get a different number of
iterations when changing the number of MPI processes: I ran with 960 procs and
1024 procs and I got 435 and 176 iterations, respectively.

On Tue, Oct 23, 2018 at 16:48, Jose E. Roman () wrote:

>
>
> > On 23 Oct 2018, at 15:46, Ale Foggia  wrote:
> >
> >
> >
> > On Tue, Oct 23, 2018 at 15:33, Jose E. Roman () wrote:
> >
> >
> > > On 23 Oct 2018, at 15:17, Ale Foggia  wrote:
> > >
> > > Hello Jose, thanks for your answer.
> > >
> > > On Tue, Oct 23, 2018 at 12:59, Jose E. Roman () wrote:
> > > There is an undocumented option:
> > >
> > >   -bv_reproducible_random
> > >
> > > It will force the initial vector of the Krylov subspace to be the same
> irrespective of the number of MPI processes. This should be used for
> scaling analyses as the one you are trying to do.
> > >
> > > What about when I'm not doing the scaling? Now I would like to request
> computing time for bigger problems; should I also use this option in
> that case? Because, what happens if I have a "bad" configuration? Meaning,
> I request some amount of time, enough if I take into account the "correct" scaling,
> but when I run, it takes double the time/iterations, as happened before
> when changing from 960 to 1024 processes?
> >
> > When you increase the matrix size the spectrum of the matrix changes and
> probably also the convergence, so the computation time is not easy to
> predict in advance.
> >
> > Okay, I'll keep that in mind. I thought that, even if the spectrum
> changes, if I had a behaviour/tendency for 6 or 7 smaller cases I could
> predict more or less the time. It was working this way until I found this
> "iterations problem" which doubled the execution time for the same size
> problem. To be completely sure, do you suggest that I use this run-time
> option or not when going into production? Can you elaborate a bit on the
> effect of this option? Is the (huge) difference I got in the number of
> iterations something expected?
>
> Ideally, if you have a rough approximation of the eigenvector, you set it
> as the initial vector with EPSSetInitialSpace(). Otherwise, SLEPc generates
> a random initial vector, that is, it starts the search blindly. The difference
> between using one random vector or another may be large, depending on the
> problem. Krylov-Schur is usually less sensitive to the initial vector.
>
> Jose
>
> >
> >
> > >
> > > An additional comment is that we strongly recommend to use the default
> solver (Krylov-Schur), which will do Lanczos with implicit restart. It is
> generally faster and more stable.
> > >
> > > I will be doing Dynamical Lanczos, that means that I'll need the
> "matrix whose rows are the eigenvectors of the tridiagonal matrix" (so,
> according to the Lanczos Technical Report notation, I need the "matrix
> whose rows are the eigenvectors of T_m", which should be the same as the
> vectors y_i). I checked the Technical Report for Krylov-Schur also and I
> think I can get the same information also from that solver, but I'm not
> sure. Can you confirm this please?
> > > Also, as the vectors I want are given by V_m^(-1)*x_i=y_i (following
> the notation on the Report), my idea to get them was to retrieve the
> invariant subspace V_m (with EPSGetInvariantSubspace), invert it, and then
> multiply it with the eigenvectors that I get with EPSGetEigenvector. Is
> there another easier (or with less computations) way to get this?
> >
> > In Krylov-Schur the tridiagonal matrix T_m becomes
> arrowhead-plus-tridiagonal. Apart from this, it should be equivalent. The
> relevant information can be obtained with EPSGetBV() and EPSGetDS(). But
> this is a "developer level" interface. We could help you get this running.
> Send a small problem matrix to slepc-maint together with a more detailed
> description of what you need to compute.
> >
> > Thanks! When I get to that part I'll write to slepc-maint for help.
> >
> >
> > Jose
> >
> > >
> > >
> > > Jose
> > >
> > >
> > >
> > > > On 23 Oct 2018, at 12:13, Ale Foggia  wrote:
> > > >
> > > > Hello,
> > > >
> > > > I'm currently using Lanczos solver (EPSLANCZOS) to get the smallest
> real eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those
> are the only options I set for the solver. My aim is to be able to
> predict/estimate the time-to-solution. To do so, I was doing a scaling of
> the code for different sizes of matrices and for different number of MPI
> processes. As I was not observing a good scaling I checked the number of
> iterations of the solver (given by EPSGetIterationNumber). I've encountered
> that, for the **same size** of matrix (that is, the same problem), when
> I change the number of MPI processes, the number of iterations changes, and
> the behaviour is not monotonic. These are the numbers I've got:
> > > >
> > > > # procs   # iters
> > > > 960       157
> > > > 992       189
> > > > 1024      338
> > > > 1056

Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-23 Thread Jose E. Roman



> On 23 Oct 2018, at 15:46, Ale Foggia  wrote:
> 
> 
> 
> On Tue, Oct 23, 2018 at 15:33, Jose E. Roman () wrote:
> 
> 
> > On 23 Oct 2018, at 15:17, Ale Foggia  wrote:
> > 
> > Hello Jose, thanks for your answer.
> > 
> > On Tue, Oct 23, 2018 at 12:59, Jose E. Roman () wrote:
> > There is an undocumented option:
> > 
> >   -bv_reproducible_random
> > 
> > It will force the initial vector of the Krylov subspace to be the same 
> > irrespective of the number of MPI processes. This should be used for 
> > scaling analyses as the one you are trying to do.
> > 
> > What about when I'm not doing the scaling? Now I would like to request
> > computing time for bigger problems; should I also use this option in
> > that case? Because, what happens if I have a "bad" configuration? Meaning,
> > I request some amount of time, enough if I take into account the "correct" scaling,
> > but when I run, it takes double the time/iterations, as happened before
> > when changing from 960 to 1024 processes?
> 
> When you increase the matrix size the spectrum of the matrix changes and 
> probably also the convergence, so the computation time is not easy to predict 
> in advance.
> 
> Okay, I'll keep that in mind. I thought that, even if the spectrum changes,
> if I had a behaviour/tendency for 6 or 7 smaller cases I could predict more
> or less the time. It was working this way until I found this "iterations
> problem" which doubled the execution time for the same size problem. To be
> completely sure, do you suggest that I use this run-time option or not when
> going into production? Can you elaborate a bit on the effect of this option? Is
> the (huge) difference I got in the number of iterations something expected?

Ideally, if you have a rough approximation of the eigenvector, you set it as the
initial vector with EPSSetInitialSpace(). Otherwise, SLEPc generates a random
initial vector, that is, it starts the search blindly. The difference between using
one random vector or another may be large, depending on the problem.
Krylov-Schur is usually less sensitive to the initial vector.
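
A minimal sketch of supplying such an approximation (illustrative only; the
Vec v0 holding the guess is an assumption):

#include <slepceps.h>

/* Sketch: pass a rough approximation v0 of the eigenvector as the initial
   space, so the search does not start from a random vector. Must be called
   before EPSSolve(). */
static PetscErrorCode set_initial_guess(EPS eps, Vec v0)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = EPSSetInitialSpace(eps, 1, &v0);CHKERRQ(ierr);  /* one-column initial space */
  PetscFunctionReturn(0);
}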

Jose

> 
> 
> > 
> > An additional comment is that we strongly recommend to use the default 
> > solver (Krylov-Schur), which will do Lanczos with implicit restart. It is 
> > generally faster and more stable.
> > 
> > I will be doing Dynamical Lanczos, that means that I'll need the "matrix 
> > whose rows are the eigenvectors of the tridiagonal matrix" (so, according 
> > to the Lanczos Technical Report notation, I need the "matrix whose rows are 
> > the eigenvectors of T_m", which should be the same as the vectors y_i). I 
> > checked the Technical Report for Krylov-Schur also and I think I can get 
> > the same information also from that solver, but I'm not sure. Can you 
> > confirm this please? 
> > Also, as the vectors I want are given by V_m^(-1)*x_i=y_i (following the 
> > notation on the Report), my idea to get them was to retrieve the invariant 
> > subspace V_m (with EPSGetInvariantSubspace), invert it, and then multiply 
> > it with the eigenvectors that I get with EPSGetEigenvector. Is there 
> > another easier (or with less computations) way to get this?
> 
> In Krylov-Schur the tridiagonal matrix T_m becomes 
> arrowhead-plus-tridiagonal. Apart from this, it should be equivalent. The 
> relevant information can be obtained with EPSGetBV() and EPSGetDS(). But this 
> is a "developer level" interface. We could help you get this running. Send a 
> small problem matrix to slepc-maint together with a more detailed description 
> of what you need to compute.
> 
> Thanks! When I get to that part I'll write to slepc-maint for help.
> 
> 
> Jose
> 
> > 
> > 
> > Jose
> > 
> > 
> > 
> > > On 23 Oct 2018, at 12:13, Ale Foggia  wrote:
> > > 
> > > Hello, 
> > > 
> > > I'm currently using Lanczos solver (EPSLANCZOS) to get the smallest real 
> > > eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those 
> > > are the only options I set for the solver. My aim is to be able to 
> > > predict/estimate the time-to-solution. To do so, I was doing a scaling of 
> > > the code for different sizes of matrices and for different number of MPI 
> > > processes. As I was not observing a good scaling I checked the number of 
> > > iterations of the solver (given by EPSGetIterationNumber). I've encountered
> > > that, for the **same size** of matrix (that is, the same problem),
> > > when I change the number of MPI processes, the number of iterations
> > > changes, and the behaviour is not monotonic. These are the numbers I've
> > > got:
> > > 
> > > # procs   # iters
> > > 960       157
> > > 992       189
> > > 1024      338
> > > 1056      190
> > > 1120      174
> > > 2048      136
> > > 
> > > I've checked the mailing list for a similar situation and I've found 
> > > another person with the same problem but in another solver ("[SLEPc] GD 
> > > is not deterministic when 

Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-23 Thread Ale Foggia
On Tue, Oct 23, 2018 at 15:33, Jose E. Roman () wrote:

>
>
> > On 23 Oct 2018, at 15:17, Ale Foggia  wrote:
> >
> > Hello Jose, thanks for your answer.
> >
> > On Tue, Oct 23, 2018 at 12:59, Jose E. Roman () wrote:
> > There is an undocumented option:
> >
> >   -bv_reproducible_random
> >
> > It will force the initial vector of the Krylov subspace to be the same
> irrespective of the number of MPI processes. This should be used for
> scaling analyses as the one you are trying to do.
> >
> > What about when I'm not doing the scaling? Now I would like to request
> computing time for bigger problems; should I also use this option in
> that case? Because, what happens if I have a "bad" configuration? Meaning,
> I request some amount of time, enough if I take into account the "correct" scaling,
> but when I run, it takes double the time/iterations, as happened before
> when changing from 960 to 1024 processes?
>
> When you increase the matrix size the spectrum of the matrix changes and
> probably also the convergence, so the computation time is not easy to
> predict in advance.
>

Okay, I'll keep that in mind. I thought that, even if the spectrum changes,
if I had a behaviour/tendency for 6 or 7 smaller cases I could predict more
or less the time. It was working this way until I found this "iterations
problem" which doubled the execution time for the same size problem. To
be completely sure, do you suggest that I use this run-time option or not
when going into production? Can you elaborate a bit on the effect of this
option? Is the (huge) difference I got in the number of iterations
something expected?


> >
> > An additional comment is that we strongly recommend to use the default
> solver (Krylov-Schur), which will do Lanczos with implicit restart. It is
> generally faster and more stable.
> >
> > I will be doing Dynamical Lanczos, that means that I'll need the "matrix
> whose rows are the eigenvectors of the tridiagonal matrix" (so, according
> to the Lanczos Technical Report notation, I need the "matrix whose rows are
> the eigenvectors of T_m", which should be the same as the vectors y_i). I
> checked the Technical Report for Krylov-Schur also and I think I can get
> the same information also from that solver, but I'm not sure. Can you
> confirm this please?
> > Also, as the vectors I want are given by V_m^(-1)*x_i=y_i (following the
> notation on the Report), my idea to get them was to retrieve the invariant
> subspace V_m (with EPSGetInvariantSubspace), invert it, and then multiply
> it with the eigenvectors that I get with EPSGetEigenvector. Is there
> another easier (or with less computations) way to get this?
>
> In Krylov-Schur the tridiagonal matrix T_m becomes
> arrowhead-plus-tridiagonal. Apart from this, it should be equivalent. The
> relevant information can be obtained with EPSGetBV() and EPSGetDS(). But
> this is a "developer level" interface. We could help you get this running.
> Send a small problem matrix to slepc-maint together with a more detailed
> description of what you need to compute.
>

Thanks! When I get to that part I'll write to slepc-maint for help.


> Jose
>
> >
> >
> > Jose
> >
> >
> >
> > > On 23 Oct 2018, at 12:13, Ale Foggia  wrote:
> > >
> > > Hello,
> > >
> > > I'm currently using Lanczos solver (EPSLANCZOS) to get the smallest
> real eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those
> are the only options I set for the solver. My aim is to be able to
> predict/estimate the time-to-solution. To do so, I was doing a scaling of
> the code for different sizes of matrices and for different number of MPI
> processes. As I was not observing a good scaling I checked the number of
> iterations of the solver (given by EPSGetIterationNumber). I've encountered
> that, for the **same size** of matrix (that is, the same problem), when
> I change the number of MPI processes, the number of iterations changes, and
> the behaviour is not monotonic. These are the numbers I've got:
> > >
> > > # procs   # iters
> > > 960       157
> > > 992       189
> > > 1024      338
> > > 1056      190
> > > 1120      174
> > > 2048      136
> > >
> > > I've checked the mailing list for a similar situation and I've found
> another person with the same problem but in another solver ("[SLEPc] GD is
> not deterministic when using different number of cores", Nov 19 2015), but
> I think the solution this person finds does not apply to my problem
> (removing "-eps_harmonic" option).
> > >
> > > Can you give me any hint on what is the reason for this behaviour? Is
> there a way to prevent this? It's not possible to estimate/predict any time
> consumption for bigger problems if the number of iterations varies this
> much.
> > >
> > > Ale
> >
>
>


Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-23 Thread Jose E. Roman



> On 23 Oct 2018, at 15:17, Ale Foggia  wrote:
> 
> Hello Jose, thanks for your answer.
> 
> On Tue, Oct 23, 2018 at 12:59, Jose E. Roman () wrote:
> There is an undocumented option:
> 
>   -bv_reproducible_random
> 
> It will force the initial vector of the Krylov subspace to be the same 
> irrespective of the number of MPI processes. This should be used for scaling 
> analyses as the one you are trying to do.
> 
> What about when I'm not doing the scaling? Now I would like to request
> computing time for bigger problems; should I also use this option in
> that case? Because, what happens if I have a "bad" configuration? Meaning, I
> request some amount of time, enough if I take into account the "correct" scaling, but
> when I run, it takes double the time/iterations, as happened before when
> changing from 960 to 1024 processes?

When you increase the matrix size the spectrum of the matrix changes and 
probably also the convergence, so the computation time is not easy to predict 
in advance.

> 
> An additional comment is that we strongly recommend to use the default solver 
> (Krylov-Schur), which will do Lanczos with implicit restart. It is generally 
> faster and more stable.
> 
> I will be doing Dynamical Lanczos, that means that I'll need the "matrix 
> whose rows are the eigenvectors of the tridiagonal matrix" (so, according to 
> the Lanczos Technical Report notation, I need the "matrix whose rows are the 
> eigenvectors of T_m", which should be the same as the vectors y_i). I checked 
> the Technical Report for Krylov-Schur also and I think I can get the same 
> information also from that solver, but I'm not sure. Can you confirm this 
> please? 
> Also, as the vectors I want are given by V_m^(-1)*x_i=y_i (following the 
> notation on the Report), my idea to get them was to retrieve the invariant 
> subspace V_m (with EPSGetInvariantSubspace), invert it, and then multiply it 
> with the eigenvectors that I get with EPSGetEigenvector. Is there another 
> easier (or with less computations) way to get this?

In Krylov-Schur the tridiagonal matrix T_m becomes arrowhead-plus-tridiagonal. 
Apart from this, it should be equivalent. The relevant information can be 
obtained with EPSGetBV() and EPSGetDS(). But this is a "developer level" 
interface. We could help you get this running. Send a small problem matrix to 
slepc-maint together with a more detailed description of what you need to 
compute.
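
As a very rough sketch of that developer-level access (it only retrieves and
views the two objects; extracting the projected eigenvectors is omitted here
and is the part slepc-maint would help with):

#include <slepceps.h>

/* Sketch only: after EPSSolve(), obtain the basis vectors (BV) and the
   projected dense problem (DS) used internally by the solver and print them. */
static PetscErrorCode inspect_internals(EPS eps)
{
  BV             bv;
  DS             ds;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = EPSGetBV(eps, &bv);CHKERRQ(ierr);   /* the V_m basis discussed above */
  ierr = EPSGetDS(eps, &ds);CHKERRQ(ierr);   /* tridiagonal / arrowhead projected problem */
  ierr = BVView(bv, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = DSView(ds, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}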

Jose

> 
> 
> Jose
> 
> 
> 
> > On 23 Oct 2018, at 12:13, Ale Foggia  wrote:
> > 
> > Hello, 
> > 
> > I'm currently using Lanczos solver (EPSLANCZOS) to get the smallest real 
> > eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those are 
> > the only options I set for the solver. My aim is to be able to 
> > predict/estimate the time-to-solution. To do so, I was doing a scaling of 
> > the code for different sizes of matrices and for different number of MPI 
> > processes. As I was not observing a good scaling I checked the number of 
> > iterations of the solver (given by EPSGetIterationNumber). I've encountered
> > that, for the **same size** of matrix (that is, the same problem), when
> > I change the number of MPI processes, the number of iterations changes, and
> > the behaviour is not monotonic. These are the numbers I've got:
> > 
> > # procs   # iters
> > 960       157
> > 992       189
> > 1024      338
> > 1056      190
> > 1120      174
> > 2048      136
> > 
> > I've checked the mailing list for a similar situation and I've found 
> > another person with the same problem but in another solver ("[SLEPc] GD is 
> > not deterministic when using different number of cores", Nov 19 2015), but 
> > I think the solution this person finds does not apply to my problem 
> > (removing "-eps_harmonic" option).
> > 
> > Can you give me any hint on what is the reason for this behaviour? Is there 
> > a way to prevent this? It's not possible to estimate/predict any time 
> > consumption for bigger problems if the number of iterations varies this 
> > much.
> > 
> > Ale
> 



Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-23 Thread Ale Foggia
Hello Jose, thanks for your answer.

On Tue, Oct 23, 2018 at 12:59, Jose E. Roman () wrote:

> There is an undocumented option:
>
>   -bv_reproducible_random
>
> It will force the initial vector of the Krylov subspace to be the same
> irrespective of the number of MPI processes. This should be used for
> scaling analyses as the one you are trying to do.
>

What about when I'm not doing the scaling? Now I would like to request
computing time for bigger problems; should I also use this option in
that case? Because, what happens if I have a "bad" configuration? Meaning,
I request some amount of time, enough if I take into account the "correct" scaling,
but when I run, it takes double the time/iterations, as happened before
when changing from 960 to 1024 processes?

>
> An additional comment is that we strongly recommend to use the default
> solver (Krylov-Schur), which will do Lanczos with implicit restart. It is
> generally faster and more stable.
>

I will be doing Dynamical Lanczos, which means that I'll need the "matrix
whose rows are the eigenvectors of the tridiagonal matrix" (so, according
to the Lanczos Technical Report notation, I need the "matrix whose rows are
the eigenvectors of T_m", which should be the same as the vectors y_i). I
checked the Technical Report for Krylov-Schur also and I think I can get
the same information also from that solver, but I'm not sure. Can you
confirm this please?
Also, as the vectors I want are given by V_m^(-1)*x_i=y_i (following the
notation on the Report), my idea to get them was to retrieve the invariant
subspace V_m (with EPSGetInvariantSubspace), invert it, and then multiply
it with the eigenvectors that I get with EPSGetEigenvector. Is there
another, easier (or less computationally expensive) way to get this?


> Jose
>
>
>
> > On 23 Oct 2018, at 12:13, Ale Foggia  wrote:
> >
> > Hello,
> >
> > I'm currently using Lanczos solver (EPSLANCZOS) to get the smallest real
> eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those are
> the only options I set for the solver. My aim is to be able to
> predict/estimate the time-to-solution. To do so, I was doing a scaling of
> the code for different sizes of matrices and for different number of MPI
> processes. As I was not observing a good scaling I checked the number of
> iterations of the solver (given by EPSGetIterationNumber). I've encountered
> that, for the **same size** of matrix (that is, the same problem), when
> I change the number of MPI processes, the number of iterations changes, and
> the behaviour is not monotonic. These are the numbers I've got:
> >
> > # procs   # iters
> > 960       157
> > 992       189
> > 1024      338
> > 1056      190
> > 1120      174
> > 2048      136
> >
> > I've checked the mailing list for a similar situation and I've found
> another person with the same problem but in another solver ("[SLEPc] GD is
> not deterministic when using different number of cores", Nov 19 2015), but
> I think the solution this person finds does not apply to my problem
> (removing "-eps_harmonic" option).
> >
> > Can you give me any hint on what is the reason for this behaviour? Is
> there a way to prevent this? It's not possible to estimate/predict any time
> consumption for bigger problems if the number of iterations varies this
> much.
> >
> > Ale
>
>


Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-23 Thread Matthew Knepley
On Tue, Oct 23, 2018 at 6:24 AM Ale Foggia  wrote:

> Hello,
>
> I'm currently using Lanczos solver (EPSLANCZOS) to get the smallest real
> eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those are
> the only options I set for the solver. My aim is to be able to
> predict/estimate the time-to-solution. To do so, I was doing a scaling of
> the code for different sizes of matrices and for different number of MPI
> processes. As I was not observing a good scaling I checked the number of
> iterations of the solver (given by EPSGetIterationNumber). I've encountered
> that, for the **same size** of matrix (that is, the same problem), when
> I change the number of MPI processes, the number of iterations changes, and
> the behaviour is not monotonic. These are the numbers I've got:
>

I am sure you know this, but this test is strong scaling and will top out
when the individual problem sizes become too small (we see this at several
thousand unknowns).

  Thanks,

Matt


>
> # procs   # iters
> 960       157
> 992       189
> 1024      338
> 1056      190
> 1120      174
> 2048      136
>
> I've checked the mailing list for a similar situation and I've found
> another person with the same problem but in another solver ("[SLEPc] GD is
> not deterministic when using different number of cores", Nov 19 2015), but
> I think the solution this person finds does not apply to my problem
> (removing "-eps_harmonic" option).
>
> Can you give me any hint on what is the reason for this behaviour? Is
> there a way to prevent this? It's not possible to estimate/predict any time
> consumption for bigger problems if the number of iterations varies this
> much.
>
> Ale
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-23 Thread Jose E. Roman
There is an undocumented option:

  -bv_reproducible_random

It will force the initial vector of the Krylov subspace to be the same 
irrespective of the number of MPI processes. This should be used for scaling 
analyses as the one you are trying to do.
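
For example (a sketch; the option can simply be given on the command line, or
forced programmatically before the solver options are processed):

#include <slepceps.h>

/* Sketch: set the flag from code instead of the command line. Must be done
   before EPSSetFromOptions() is called on the solver. */
static PetscErrorCode enable_reproducible_random(void)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscOptionsSetValue(NULL, "-bv_reproducible_random", NULL);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}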

An additional comment is that we strongly recommend to use the default solver 
(Krylov-Schur), which will do Lanczos with implicit restart. It is generally 
faster and more stable.

Jose



> On 23 Oct 2018, at 12:13, Ale Foggia  wrote:
> 
> Hello, 
> 
> I'm currently using Lanczos solver (EPSLANCZOS) to get the smallest real 
> eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those are 
> the only options I set for the solver. My aim is to be able to 
> predict/estimate the time-to-solution. To do so, I was doing a scaling of the 
> code for different sizes of matrices and for different number of MPI 
> processes. As I was not observing a good scaling I checked the number of 
> iterations of the solver (given by EPSGetIterationNumber). I've encountered
> that, for the **same size** of matrix (that is, the same problem), when I
> change the number of MPI processes, the number of iterations changes, and the
> behaviour is not monotonic. These are the numbers I've got:
> 
> # procs   # iters
> 960       157
> 992       189
> 1024      338
> 1056      190
> 1120      174
> 2048      136
> 
> I've checked the mailing list for a similar situation and I've found another 
> person with the same problem but in another solver ("[SLEPc] GD is not 
> deterministic when using different number of cores", Nov 19 2015), but I 
> think the solution this person finds does not apply to my problem (removing 
> "-eps_harmonic" option).
> 
> Can you give me any hint on what is the reason for this behaviour? Is there a 
> way to prevent this? It's not possible to estimate/predict any time 
> consumption for bigger problems if the number of iterations varies this much.
> 
> Ale



[petsc-users] [SLEPc] Number of iterations changes with MPI processes in Lanczos

2018-10-23 Thread Ale Foggia
Hello,

I'm currently using Lanczos solver (EPSLANCZOS) to get the smallest real
eigenvalue (EPS_SMALLEST_REAL) of a Hermitian problem (EPS_HEP). Those are
the only options I set for the solver. My aim is to be able to
predict/estimate the time-to-solution. To do so, I was doing a scaling of
the code for different sizes of matrices and for different number of MPI
processes. As I was not observing a good scaling I checked the number of
iterations of the solver (given by EPSGetIterationNumber). I've encountered
that, for the **same size** of matrix (that is, the same problem), when
I change the number of MPI processes, the number of iterations changes, and
the behaviour is not monotonic. These are the numbers I've got:

# procs   # iters
960       157
992       189
1024      338
1056      190
1120      174
2048      136

I've checked the mailing list for a similar situation and I've found
another person with the same problem but in another solver ("[SLEPc] GD is
not deterministic when using different number of cores", Nov 19 2015), but
I think the solution this person finds does not apply to my problem
(removing "-eps_harmonic" option).

Can you give me any hint on what is the reason for this behaviour? Is there
a way to prevent this? It's not possible to estimate/predict any time
consumption for bigger problems if the number of iterations varies this
much.

Ale