Re: [petsc-users] [SLEPc] Performance of Krylov-Schur with MUMPS-based shift-and-invert

2018-02-28 Thread Jose E. Roman
Balancing may reduce the norm of the matrix or the condition number of some 
eigenvalues. It applies to the ST operator, (A-sigma*B)^{-1}*B in case of 
shift-and-invert. The case that sigma is close to an eigenvalue is usually not 
a problem, provided that you use a robust direct solver (MUMPS). In cases where 
both A and B are singular, balancing may improve residual errors.

I think it is not necessary in your case.
Jose


> El 27 feb 2018, a las 19:03, Thibaut Appel  
> escribió:
> 
> Good afternoon Mr Roman,
> 
> Thank you very much for your detailed and quick answer. I'll make the use of 
> eps_view and log_view and see if I can optimize the preallocation of my 
> matrix.
> 
> Good to know that it is possible to play with the "mpd" parameter, I thought 
> it was only for when large values of nev were requested - as suggested by the 
> user guide.
> 
> My residual norms are decent (10E-6 to 10E-8) so I do not think my 
> application code requires significant tuning.
> 
> When you say "EPSSetBalance() sometimes helps, even in the case of using 
> spectral transformation", did you mean "particularly" instead of "even"? If 
> yes, why? If I misunderstood, what did you want to explain?
> 
> Regards,
> 
> 
> Thibaut
> 
> On 19/02/18 18:36, Jose E. Roman wrote:
>> 
>>> El 19 feb 2018, a las 19:15, Thibaut Appel  
>>> escribió:
>>> 
>>> Good afternoon,
>>> 
>>> I am solving generalized eigenvalue problems {Ax = omegaBx} in complex 
>>> arithmetic, where A is non-hermitian and B is singular. I think the only 
>>> way to get round the singularity is to employ a shift-and-invert method, 
>>> where I am using MUMPS to invert the shifted matrix.
>>> 
>>> I am using the Fortran interface of PETSc 3.8.3 and SLEPc 3.8.2 where my 
>>> ./configure line was
>>> ./configure --with-fortran-kernels=1 --with-scalar-type=complex 
>>> --with-blaslapack-dir=/home/linuxbrew/.linuxbrew/opt/openblas 
>>> --PETSC_ARCH=cplx_dble_optim 
>>> --with-cmake-dir=/home/linuxbrew/.linuxbrew/opt/cmake 
>>> --with-mpi-dir=/home/linuxbrew/.linuxbrew/opt/openmpi --with-debugging=0 
>>> --download-scalapack --download-mumps --COPTFLAGS="-O3 -march=native" 
>>> --CXXOPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 -march=native"
>>> 
>>> My matrices A and B are assembled correctly in parallel and my 
>>> preallocation is quasi-optimal in the sense that I don't have any called to 
>>> mallocs but I may overestimate the required memory for some rows of the 
>>> matrices. Here is how I setup the EPS problem and solve:
>>> 
>>> CALL EPSSetProblemType(eps,EPS_GNHEP,ierr)
>>> CALL EPSSetOperators(eps,MatA,MatB,ierr)
>>> CALL EPSSetType(eps,EPSKRYLOVSCHUR,ierr)
>>> CALL EPSSetDimensions(eps,nev,ncv,PETSC_DECIDE,ierr)
>>> CALL EPSSetTolerances(eps,tol_ev,PETSC_DECIDE,ierr)
>>> 
>>> CALL EPSSetFromOptions(eps,ierr)
>>> CALL EPSSetTarget(eps,shift,ierr)
>>> CALL EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE,ierr)
>>> 
>>> CALL EPSGetST(eps,st,ierr)
>>> CALL STGetKSP(st,ksp,ierr)
>>> CALL KSPGetPC(ksp,pc,ierr)
>>> 
>>> CALL STSetType(st,STSINVERT,ierr)
>>> CALL KSPSetType(ksp,KSPPREONLY,ierr)
>>> CALL PCSetType(pc,PCLU,ierr)
>>> 
>>> CALL PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS,ierr)
>>> CALL PCSetFromOptions(pc,ierr)
>>> 
>>> CALL EPSSolve(eps,ierr)
>>> CALL EPSGetIterationNumber(eps,iter,ierr)
>>> CALL EPSGetConverged(eps,nev_conv,ierr)
>> The settings seem ok. You can use -eps_view to make sure that everything is 
>> set as you want.
>> 
>>> - Using one MPI process, it takes 1 hour and 22 minutes to retrieve 250 
>>> eigenvalues with a Krylov subspace of size 500, a tolerance of 10^-12 when 
>>> the leading dimension of the matrices is 405000. My matrix A has 98,415,000 
>>> non-zero elements and B has 1,215,000 non zero elements. Would you be 
>>> shocked by that computation time? I would have expected something much 
>>> lower given the values of nev and ncv I have but could be completely wrong 
>>> in my understanding of the Krylov-Schur method.
>> If you run with -log_view you will see the breakup in the different steps. 
>> Most probably a large percentage of the time is in the factorization of the 
>> matrix (MatLUFactorSym and MatLUFactorNum).
>> 
>> The matrix is quite dense (about 250 nonzero elements per row), so 
>> factorizing it is costly. You may want to try inexact shift-and-invert with 
>> an iterative method, but you will need a good preconditioner.
>> 
>> The time needed for the other steps may be reduced a little bit by setting a 
>> smaller subspace size, for instance with -eps_mpd 200
>> 
>>> - My goal is speed and reliability. Is there anything you notice in my 
>>> EPS solver that could be improved or corrected? I remember an exchange with 
>>> Jose E. Roman where he said that the parameters of MUMPS are not worth 
>>> being changed, however I notice some people play with the -mat_mumps_cntl_1 
>>> and  -mat_mumps_cntl_3 which control the relat

Re: [petsc-users] [SLEPc] Performance of Krylov-Schur with MUMPS-based shift-and-invert

2018-02-27 Thread Thibaut Appel

Good afternoon Mr Roman,

Thank you very much for your detailed and quick answer. I'll make the 
use of eps_view and log_view and see if I can optimize the preallocation 
of my matrix.


Good to know that it is possible to play with the "mpd" parameter, I 
thought it was only for when large values of nev were requested - as 
suggested by the user guide.


My residual norms are decent (10E-6 to 10E-8) so I do not think my 
application code requires significant tuning.


When you say "EPSSetBalance() sometimes helps, even in the case of using 
spectral transformation", did you mean "particularly" instead of "even"? 
If yes, why? If I misunderstood, what did you want to explain?


Regards,


Thibaut

On 19/02/18 18:36, Jose E. Roman wrote:



El 19 feb 2018, a las 19:15, Thibaut Appel  escribió:

Good afternoon,

I am solving generalized eigenvalue problems {Ax = omegaBx} in complex 
arithmetic, where A is non-hermitian and B is singular. I think the only way to 
get round the singularity is to employ a shift-and-invert method, where I am 
using MUMPS to invert the shifted matrix.

I am using the Fortran interface of PETSc 3.8.3 and SLEPc 3.8.2 where my 
./configure line was
./configure --with-fortran-kernels=1 --with-scalar-type=complex 
--with-blaslapack-dir=/home/linuxbrew/.linuxbrew/opt/openblas --PETSC_ARCH=cplx_dble_optim 
--with-cmake-dir=/home/linuxbrew/.linuxbrew/opt/cmake --with-mpi-dir=/home/linuxbrew/.linuxbrew/opt/openmpi 
--with-debugging=0 --download-scalapack --download-mumps --COPTFLAGS="-O3 -march=native" 
--CXXOPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 -march=native"

My matrices A and B are assembled correctly in parallel and my preallocation is 
quasi-optimal in the sense that I don't have any called to mallocs but I may 
overestimate the required memory for some rows of the matrices. Here is how I 
setup the EPS problem and solve:

 CALL EPSSetProblemType(eps,EPS_GNHEP,ierr)
 CALL EPSSetOperators(eps,MatA,MatB,ierr)
 CALL EPSSetType(eps,EPSKRYLOVSCHUR,ierr)
 CALL EPSSetDimensions(eps,nev,ncv,PETSC_DECIDE,ierr)
 CALL EPSSetTolerances(eps,tol_ev,PETSC_DECIDE,ierr)

 CALL EPSSetFromOptions(eps,ierr)
 CALL EPSSetTarget(eps,shift,ierr)
 CALL EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE,ierr)

 CALL EPSGetST(eps,st,ierr)
 CALL STGetKSP(st,ksp,ierr)
 CALL KSPGetPC(ksp,pc,ierr)

 CALL STSetType(st,STSINVERT,ierr)
 CALL KSPSetType(ksp,KSPPREONLY,ierr)
 CALL PCSetType(pc,PCLU,ierr)

 CALL PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS,ierr)
 CALL PCSetFromOptions(pc,ierr)

 CALL EPSSolve(eps,ierr)
 CALL EPSGetIterationNumber(eps,iter,ierr)
 CALL EPSGetConverged(eps,nev_conv,ierr)

The settings seem ok. You can use -eps_view to make sure that everything is set 
as you want.


 - Using one MPI process, it takes 1 hour and 22 minutes to retrieve 250 
eigenvalues with a Krylov subspace of size 500, a tolerance of 10^-12 when the 
leading dimension of the matrices is 405000. My matrix A has 98,415,000 
non-zero elements and B has 1,215,000 non zero elements. Would you be shocked 
by that computation time? I would have expected something much lower given the 
values of nev and ncv I have but could be completely wrong in my understanding 
of the Krylov-Schur method.

If you run with -log_view you will see the breakup in the different steps. Most 
probably a large percentage of the time is in the factorization of the matrix 
(MatLUFactorSym and MatLUFactorNum).

The matrix is quite dense (about 250 nonzero elements per row), so factorizing 
it is costly. You may want to try inexact shift-and-invert with an iterative 
method, but you will need a good preconditioner.

The time needed for the other steps may be reduced a little bit by setting a 
smaller subspace size, for instance with -eps_mpd 200


 - My goal is speed and reliability. Is there anything you notice in my EPS 
solver that could be improved or corrected? I remember an exchange with Jose E. 
Roman where he said that the parameters of MUMPS are not worth being changed, 
however I notice some people play with the -mat_mumps_cntl_1 and  
-mat_mumps_cntl_3 which control the relative/absolute pivoting threshold?

Yes, you can try tuning MUMPS options. Maybe they are relevant in your 
application.


 - Would you advise the use of EPSSetTrueResidual and EPSSetBalance since I 
am using a spectral transformation?

Are you getting large residual norms?
I would not suggest using EPSSetTrueResidual() because it may prevent 
convergence, especially if the target is not very close to the wanted 
eigenvalues.
EPSSetBalance() sometimes helps, even in the case of using spectral 
transformation. It is intended for ill-conditioned problems where the obtained 
residuals are not so good.


 - Would you see anything that would prevent me from getting speedup in 
parallel executions?

I guess it will depend on how MUMPS scales for your problem.

Jose


Thank you very much

Re: [petsc-users] [SLEPc] Performance of Krylov-Schur with MUMPS-based shift-and-invert

2018-02-19 Thread Jose E. Roman


> El 19 feb 2018, a las 19:15, Thibaut Appel  
> escribió:
> 
> Good afternoon,
> 
> I am solving generalized eigenvalue problems {Ax = omegaBx} in complex 
> arithmetic, where A is non-hermitian and B is singular. I think the only way 
> to get round the singularity is to employ a shift-and-invert method, where I 
> am using MUMPS to invert the shifted matrix.
> 
> I am using the Fortran interface of PETSc 3.8.3 and SLEPc 3.8.2 where my 
> ./configure line was
> ./configure --with-fortran-kernels=1 --with-scalar-type=complex 
> --with-blaslapack-dir=/home/linuxbrew/.linuxbrew/opt/openblas 
> --PETSC_ARCH=cplx_dble_optim 
> --with-cmake-dir=/home/linuxbrew/.linuxbrew/opt/cmake 
> --with-mpi-dir=/home/linuxbrew/.linuxbrew/opt/openmpi --with-debugging=0 
> --download-scalapack --download-mumps --COPTFLAGS="-O3 -march=native" 
> --CXXOPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 -march=native"
> 
> My matrices A and B are assembled correctly in parallel and my preallocation 
> is quasi-optimal in the sense that I don't have any called to mallocs but I 
> may overestimate the required memory for some rows of the matrices. Here is 
> how I setup the EPS problem and solve:
> 
> CALL EPSSetProblemType(eps,EPS_GNHEP,ierr)
> CALL EPSSetOperators(eps,MatA,MatB,ierr)
> CALL EPSSetType(eps,EPSKRYLOVSCHUR,ierr)
> CALL EPSSetDimensions(eps,nev,ncv,PETSC_DECIDE,ierr)
> CALL EPSSetTolerances(eps,tol_ev,PETSC_DECIDE,ierr)
> 
> CALL EPSSetFromOptions(eps,ierr)
> CALL EPSSetTarget(eps,shift,ierr)
> CALL EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE,ierr)
> 
> CALL EPSGetST(eps,st,ierr)
> CALL STGetKSP(st,ksp,ierr)
> CALL KSPGetPC(ksp,pc,ierr)
> 
> CALL STSetType(st,STSINVERT,ierr)
> CALL KSPSetType(ksp,KSPPREONLY,ierr)
> CALL PCSetType(pc,PCLU,ierr)
> 
> CALL PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS,ierr)
> CALL PCSetFromOptions(pc,ierr)
> 
> CALL EPSSolve(eps,ierr)
> CALL EPSGetIterationNumber(eps,iter,ierr)
> CALL EPSGetConverged(eps,nev_conv,ierr)

The settings seem ok. You can use -eps_view to make sure that everything is set 
as you want.

> 
> - Using one MPI process, it takes 1 hour and 22 minutes to retrieve 250 
> eigenvalues with a Krylov subspace of size 500, a tolerance of 10^-12 when 
> the leading dimension of the matrices is 405000. My matrix A has 98,415,000 
> non-zero elements and B has 1,215,000 non zero elements. Would you be shocked 
> by that computation time? I would have expected something much lower given 
> the values of nev and ncv I have but could be completely wrong in my 
> understanding of the Krylov-Schur method.

If you run with -log_view you will see the breakup in the different steps. Most 
probably a large percentage of the time is in the factorization of the matrix 
(MatLUFactorSym and MatLUFactorNum).

The matrix is quite dense (about 250 nonzero elements per row), so factorizing 
it is costly. You may want to try inexact shift-and-invert with an iterative 
method, but you will need a good preconditioner.

The time needed for the other steps may be reduced a little bit by setting a 
smaller subspace size, for instance with -eps_mpd 200

> 
> - My goal is speed and reliability. Is there anything you notice in my 
> EPS solver that could be improved or corrected? I remember an exchange with 
> Jose E. Roman where he said that the parameters of MUMPS are not worth being 
> changed, however I notice some people play with the -mat_mumps_cntl_1 and  
> -mat_mumps_cntl_3 which control the relative/absolute pivoting threshold?

Yes, you can try tuning MUMPS options. Maybe they are relevant in your 
application.

> 
> - Would you advise the use of EPSSetTrueResidual and EPSSetBalance since 
> I am using a spectral transformation?

Are you getting large residual norms?
I would not suggest using EPSSetTrueResidual() because it may prevent 
convergence, especially if the target is not very close to the wanted 
eigenvalues.
EPSSetBalance() sometimes helps, even in the case of using spectral 
transformation. It is intended for ill-conditioned problems where the obtained 
residuals are not so good.

> 
> - Would you see anything that would prevent me from getting speedup in 
> parallel executions?

I guess it will depend on how MUMPS scales for your problem.

Jose

> 
> Thank you very much in advance and I look forward to exchanging with you 
> about these different points,
> 
> Thibaut
> 



[petsc-users] [SLEPc] Performance of Krylov-Schur with MUMPS-based shift-and-invert

2018-02-19 Thread Thibaut Appel

Good afternoon,

I am solving generalized eigenvalue problems {Ax = omegaBx} in complex 
arithmetic, where A is non-hermitian and B is singular. I think the only 
way to get round the singularity is to employ a shift-and-invert method, 
where I am using MUMPS to invert the shifted matrix.


I am using the Fortran interface of PETSc 3.8.3 and SLEPc 3.8.2 where my 
./configure line was
./configure --with-fortran-kernels=1 --with-scalar-type=complex 
--with-blaslapack-dir=/home/linuxbrew/.linuxbrew/opt/openblas 
--PETSC_ARCH=cplx_dble_optim 
--with-cmake-dir=/home/linuxbrew/.linuxbrew/opt/cmake 
--with-mpi-dir=/home/linuxbrew/.linuxbrew/opt/openmpi --with-debugging=0 
--download-scalapack --download-mumps --COPTFLAGS="-O3 -march=native" 
--CXXOPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 -march=native"


My matrices A and B are assembled correctly in parallel and my 
preallocation is quasi-optimal in the sense that I don't have any called 
to mallocs but I may overestimate the required memory for some rows of 
the matrices. Here is how I setup the EPS problem and solve:


    CALL EPSSetProblemType(eps,EPS_GNHEP,ierr)
    CALL EPSSetOperators(eps,MatA,MatB,ierr)
    CALL EPSSetType(eps,EPSKRYLOVSCHUR,ierr)
    CALL EPSSetDimensions(eps,nev,ncv,PETSC_DECIDE,ierr)
    CALL EPSSetTolerances(eps,tol_ev,PETSC_DECIDE,ierr)

    CALL EPSSetFromOptions(eps,ierr)
    CALL EPSSetTarget(eps,shift,ierr)
    CALL EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE,ierr)

    CALL EPSGetST(eps,st,ierr)
    CALL STGetKSP(st,ksp,ierr)
    CALL KSPGetPC(ksp,pc,ierr)

    CALL STSetType(st,STSINVERT,ierr)
    CALL KSPSetType(ksp,KSPPREONLY,ierr)
    CALL PCSetType(pc,PCLU,ierr)

    CALL PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS,ierr)
    CALL PCSetFromOptions(pc,ierr)

    CALL EPSSolve(eps,ierr)
    CALL EPSGetIterationNumber(eps,iter,ierr)
    CALL EPSGetConverged(eps,nev_conv,ierr)

    - Using one MPI process, it takes 1 hour and 22 minutes to retrieve 
250 eigenvalues with a Krylov subspace of size 500, a tolerance of 
10^-12 when the leading dimension of the matrices is 405000. My matrix A 
has 98,415,000 non-zero elements and B has 1,215,000 non zero elements. 
Would you be shocked by that computation time? I would have expected 
something much lower given the values of nev and ncv I have but could be 
completely wrong in my understanding of the Krylov-Schur method.


    - My goal is speed and reliability. Is there anything you notice in 
my EPS solver that could be improved or corrected? I remember an 
exchange with Jose E. Roman where he said that the parameters of MUMPS 
are not worth being changed, however I notice some people play with the 
-mat_mumps_cntl_1 and  -mat_mumps_cntl_3 which control the 
relative/absolute pivoting threshold?


    - Would you advise the use of EPSSetTrueResidual and EPSSetBalance 
since I am using a spectral transformation?


    - Would you see anything that would prevent me from getting speedup 
in parallel executions?


Thank you very much in advance and I look forward to exchanging with you 
about these different points,


Thibaut