Re: [petsc-users] Speed up KSPSolve of dense matrix

2014-11-07 Thread Florian Lindner
Hey,

I've modified my code so that it counts the total number of non-zeros (trivial to
put in the preallocation code) and decides based on that whether to create a
MATSBAIJ or MATDENSE matrix.
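
A minimal sketch of that kind of switch (variable names and the 0.5 fill
threshold are placeholders, not the actual code):

  #include <petscmat.h>
  /* Sketch: pick MATDENSE vs. MATSBAIJ from the nonzero count gathered
     during the preallocation pass. Names and the 0.5 threshold are
     placeholders. */
  int main(int argc, char **argv)
  {
    Mat       C;
    PetscInt  n = 603, nnz = 182103;   /* counted while preallocating */
    PetscReal fill;

    PetscInitialize(&argc, &argv, NULL, NULL);
    fill = (PetscReal)nnz / ((PetscReal)n * (PetscReal)n);
    MatCreate(PETSC_COMM_SELF, &C);
    MatSetSizes(C, PETSC_DECIDE, PETSC_DECIDE, n, n);
    MatSetType(C, fill > 0.5 ? MATSEQDENSE : MATSEQSBAIJ);
    MatSetUp(C);
    /* ... preallocate, assemble, solve ... */
    MatDestroy(&C);
    PetscFinalize();
    return 0;
  }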

Using sparse matrices (for one that is really dense) I've had good experience
with -pc_factor_shift_type NONZERO -pc_type cholesky -ksp_type preonly. But
when I use that on a MATDENSE I again get a zero pivot in LU factorization,
despite using -pc_factor_shift_type.

I suspect this is related to the MATDENSE matrix being intended to be symmetric
but actually not being so. I have MAT_SYMMETRY_ETERNAL set, but when I view the
matrix only the upper right triangle is set, as I did it in my matrix assembly
algorithm. Does the algorithm treat the matrix as symmetric when
MAT_SYMMETRY_ETERNAL is set, or is that just a hint? Is there any way to
actually mark a MATDENSE matrix as symmetric, or do I need to copy the entries
to the lower triangle?
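
(For reference, the flag in question is set with MatSetOption; a sketch, with C
standing for the matrix in question:

  MatSetOption(C, MAT_SYMMETRIC, PETSC_TRUE);
  MatSetOption(C, MAT_SYMMETRY_ETERNAL, PETSC_TRUE);  /* informational flags; they do not change how values are stored */

)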

Thanks,

Florian

On Wednesday, 5 November 2014, 08:36:26, Barry Smith wrote:
 
   So you have a dense, very ill-conditioned matrix. Seems to me you should 
 just be using the dense LU/Cholesky factorization unless you know how to 
 precondition these types of matrices.  
 http://scicomp.stackexchange.com/questions/7561/do-rbf-kernel-matrices-tend-to-be-ill-conditioned
   
 
 You can’t take monstrously ill-conditioned matrices and expect to use 
 iterative methods unless you really, really, really understand the source 
 and control of the ill-conditioning.
 
  What is your end goal in trying to use iterative methods?
 
   Barry
 
 
  On Nov 5, 2014, at 3:41 AM, Florian Lindner mailingli...@xgm.de wrote:
  
  On Tuesday, 4 November 2014, 10:08:27, Barry Smith wrote:
  
There are a lot of questions here. 
  
  Yes, thanks for replying!
  
  On Nov 4, 2014, at 3:28 AM, Florian Lindner mailingli...@xgm.de wrote:
  
  Hello,
  
  I have a full matrix of size e.g. 603x603 whose runtime I'm very disappointed 
  with, compared to a naive LU / forward / backward solution. 
  My petsc solution takes about 14s, while the old one takes just 0.5s. 
  (When you're looking at sparse matrices the figures are almost reversed.) 
  I'm trying to investigate what the problem is.
  
For small problems like this direct dense LU could easily be faster than 
  iterative schemes, especially if you have many right hand sides with the 
  same matrix, since the factorization only needs to be done once.
  
  
  Most of the time is spent in the KSPSolve.
  
  Event                Count      Time (sec)     Flops                          --- Global ---  --- Stage ---   Total
                         Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
  
  
  --- Event Stage 0: Main Stage
  
  MatMult10334 1.0 6.3801e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
  0.0e+00 45 49  0  0  0  45 49  0  0  0  1177
  MatSolve   10334 1.0 6.9187e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
  0.0e+00 49 49  0  0  0  49 49  0  0  0  1085
  
If this is for a single solve for a 603 by 603 matrix then this is 
  absurd. 10,000 plus iterations.
  
  But that's the output I got. When I run with -info I get a lot of messages:
  
  [0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
  unchanged
  
  though I do just one KSPSolve / KSPSetOperators. Is that ok?
  
  MatCholFctrNum 1 1.0 1.8968e-01 1.0 6.03e+02 1.0 0.0e+00 0.0e+00 
  0.0e+00  1  0  0  0  0   1  0  0  0  0 0
  KSPGMRESOrthog 1 1.0 2.1417e-01 1.0 3.73e+08 1.0 0.0e+00 0.0e+00 
  0.0e+00  2  2  0  0  0   2  2  0  0  0  1744
  KSPSolve   1 1.0 1.3763e+01 1.0 1.54e+10 1.0 0.0e+00 0.0e+00 
  0.0e+00 97100  0  0  0  97100  0  0  0  1120
  Prealloc Matrix C   1 1.0 5.7458e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
  0.0e+00  0  0  0  0  0   0  0  0  0  0 0
  Filling Matrix C   1 1.0 6.7984e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
  0.0e+00  0  0  0  0  0   0  0  0  0  0 0
  VecMDot1 1.0 8.8615e-02 1.0 1.87e+08 1.0 0.0e+00 0.0e+00 
  0.0e+00  1  1  0  0  0   1  1  0  0  0  2106
  VecMAXPY   10334 1.0 1.2585e-01 1.0 1.99e+08 1.0 0.0e+00 0.0e+00 
  0.0e+00  1  1  0  0  0   1  1  0  0  0  1580
  Prealloc Matrix A   1 1.0 1.1071e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
  0.0e+00  1  0  0  0  0   1  0  0  0  0 0
  Filling Matrix A   1 1.0 1.2574e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
  0.0e+00  1  0  0  0  0   1  0  0  0  0 0
  PCSetUp1 1.0 1.9073e-01 1.0 6.03e+02 1.0 0.0e+00 0.0e+00 
  0.0e+00  1  0  0  0  0   1  0  0  0  0 0
  PCApply10334 1.0 6.9233e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
  0.0e+00 49 49  0  0  0  49 49  0  0  0  1085
  
  
  

Re: [petsc-users] Speed up KSPSolve of dense matrix

2014-11-07 Thread Barry Smith


 On Nov 7, 2014, at 7:03 AM, Florian Lindner mailingli...@xgm.de wrote:
 
 Hey,
 
 I've modified my code so that it counts the total number of non-zeros (trivial 
 to put in the preallocation code) and decides based on that whether to create a 
 MATSBAIJ or MATDENSE matrix.
 
 Using sparse matrices (for one that is really dense) I've had good 
 experience with -pc_factor_shift_type NONZERO -pc_type cholesky -ksp_type 
 preonly. But when I use that on a MATDENSE I again get a zero pivot in LU 
 factorization, despite using -pc_factor_shift_type.

  If you use Cholesky for sbaij why are you using LU for dense? Why not 
Cholesky?

  The  pc_factor_shift_type  stuff is not supported for dense. 

  Is your matrix singular? 
 
 I suspect this is related to the MATDENSE matrix being intended to be symmetric 
 but actually not being so. I have MAT_SYMMETRY_ETERNAL set, but when I view the 
 matrix only the upper right triangle is set, as I did it in my matrix 
 assembly algorithm. Does the algorithm treat the matrix as symmetric when 
 MAT_SYMMETRY_ETERNAL is set, or is that just a hint? Is there any way to 
 actually mark a MATDENSE matrix as symmetric, or do I need to copy the entries 
 to the lower triangle?

   If you want to use LU or MatMult with the dense matrix then you need to set 
all the values in the matrix; you cannot set just half of them, even if the 
matrix is symmetric. 
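
   A sketch of what that means for the assembly loop (kernel_eval is a
placeholder for whatever produces the entry, e.g. the RBF evaluation):

   /* Sketch: with MATDENSE every entry must be stored, so mirror each
      upper-triangular value into the lower triangle as well.
      kernel_eval(i,j) is a placeholder. */
   for (PetscInt i = 0; i < n; i++) {
     for (PetscInt j = i; j < n; j++) {
       PetscScalar v = kernel_eval(i, j);
       MatSetValue(A, i, j, v, INSERT_VALUES);
       if (j != i) MatSetValue(A, j, i, v, INSERT_VALUES);
     }
   }
   MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
   MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);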

   Note you can set all the values in the SBAIJ matrix and just tell it to 
ignore the lower triangular entries you pass in. To do this, after you create 
the matrix and preallocate, call 
MatSetOption(mat,MAT_IGNORE_LOWER_TRIANGULAR,PETSC_TRUE);
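
   In context, that looks roughly like the following (a sketch; nnz_per_row
stands for the preallocation array):

   /* Sketch: SBAIJ stores only the upper triangle; with this option any
      lower-triangular entries passed to MatSetValues are simply ignored,
      so the same full assembly loop works for both matrix types. */
   MatCreateSeqSBAIJ(PETSC_COMM_SELF, 1, n, n, 0, nnz_per_row, &C);
   MatSetOption(C, MAT_IGNORE_LOWER_TRIANGULAR, PETSC_TRUE);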

 
 Thanks,
 
 Florian
 
 On Wednesday, 5 November 2014, 08:36:26, Barry Smith wrote:
 
  So you have a dense, very ill-conditioned matrix. Seems to me you should 
 just be using the dense LU/Cholesky factorization unless you know how to 
 precondition these types of matrices.  
 http://scicomp.stackexchange.com/questions/7561/do-rbf-kernel-matrices-tend-to-be-ill-conditioned
   
 
 You can’t take monstrously ill-conditioned matrices and expect to use 
 iterative methods unless you really, really, really understand the source 
 and control of the ill-conditioning.
 
 What is your end goal in trying to use iterative methods?
 
  Barry
 
 
 On Nov 5, 2014, at 3:41 AM, Florian Lindner mailingli...@xgm.de wrote:
 
 On Tuesday, 4 November 2014, 10:08:27, Barry Smith wrote:
 
  There are a lot of questions here. 
 
 Yes, thanks for replying!
 
 On Nov 4, 2014, at 3:28 AM, Florian Lindner mailingli...@xgm.de wrote:
 
 Hello,
 
 I have a full matrix of size e.g. 603x603 whose runtime I'm very disappointed 
 with, compared to a naive LU / forward / backward solution. 
 My petsc solution takes about 14s, while the old one takes just 0.5s. 
 (When you're looking at sparse matrices the figures are almost reversed.) 
 I'm trying to investigate what the problem is.
 
  For small problems like this direct dense LU could easily be faster than 
 iterative schemes, especially if you have many right hand sides with the 
 same matrix, since the factorization only needs to be done once.
 
 
 Most of the time is spent in the KSPSolve.
 
 Event                Count      Time (sec)     Flops                          --- Global ---  --- Stage ---   Total
                        Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
 
 
 --- Event Stage 0: Main Stage
 
 MatMult10334 1.0 6.3801e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
 0.0e+00 45 49  0  0  0  45 49  0  0  0  1177
 MatSolve   10334 1.0 6.9187e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
 0.0e+00 49 49  0  0  0  49 49  0  0  0  1085
 
   If this is for a single solve for a 603 by 603 matrix then this is 
 absurd. 10,000 plus iterations.
 
 But that's the output I got. When I run with -info I get a lot of messages:
 
 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
 unchanged
 
 though I do just one KSPSolve / KSPSetOperators. Is that ok?
 
 MatCholFctrNum 1 1.0 1.8968e-01 1.0 6.03e+02 1.0 0.0e+00 0.0e+00 
 0.0e+00  1  0  0  0  0   1  0  0  0  0 0
 KSPGMRESOrthog 1 1.0 2.1417e-01 1.0 3.73e+08 1.0 0.0e+00 0.0e+00 
 0.0e+00  2  2  0  0  0   2  2  0  0  0  1744
 KSPSolve   1 1.0 1.3763e+01 1.0 1.54e+10 1.0 0.0e+00 0.0e+00 
 0.0e+00 97100  0  0  0  97100  0  0  0  1120
 Prealloc Matrix C   1 1.0 5.7458e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
 Filling Matrix C   1 1.0 6.7984e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
 VecMDot1 1.0 8.8615e-02 1.0 1.87e+08 1.0 0.0e+00 0.0e+00 
 0.0e+00  1  1  0  0  0   1  1  0  0  0  2106
 VecMAXPY   10334 1.0 1.2585e-01 1.0 1.99e+08 1.0 0.0e+00 0.0e+00 
 0.0e+00  1  1  0  0  0   

Re: [petsc-users] Speed up KSPSolve of dense matrix

2014-11-06 Thread Jed Brown
Florian Lindner mailingli...@xgm.de writes:
 I'm more or less aware of that, but I'm surprised that the performance is 
 so poor.

If you've ever written a naive triply-nested loop dgemm, you'd know that
it gives up orders of magnitude.
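
For reference, the kind of loop meant here (a plain C sketch, to contrast with
a tuned BLAS dgemm):

  /* Naive triply-nested matrix-matrix product C = A*B for n-by-n row-major
     matrices. Correct, but loses orders of magnitude against a blocked,
     vectorized BLAS dgemm because of poor cache reuse. */
  void naive_dgemm(int n, const double *A, const double *B, double *C)
  {
    for (int i = 0; i < n; i++)
      for (int j = 0; j < n; j++) {
        double sum = 0.0;
        for (int k = 0; k < n; k++)
          sum += A[i*n + k] * B[k*n + j];
        C[i*n + j] = sum;
      }
  }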

 I changed the format of matrixC to dense now, but the runtime is still poor. 
 Overall time is now around 9s, with 95% spent in KSPSolve and exactly the same 
 number of MatMult executions, taking 65% of the time.

Your solver is broken.  It takes 10,000 iterations and presumably returns
KSP_DIVERGED_ITS (use -ksp_converged_reason to check).  This could be
due to GMRES restarts.  You might have better results with -ksp_type cg
(if your matrix is symmetric) or -ksp_type bcgs for nonsymmetric.  But
the bottom line is that you need a good preconditioner (or a
well-conditioned system) for iterative methods to make sense for dense
matrices.
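
For example, with the petBench driver from earlier in the thread (the options
are standard PETSc runtime options; the driver arguments are just the ones used
before):

  ./petBench 600 1 -ksp_type cg -ksp_converged_reason -ksp_monitor_true_residual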




Re: [petsc-users] Speed up KSPSolve of dense matrix

2014-11-05 Thread Florian Lindner
On Tuesday, 4 November 2014, 19:14:58, Jed Brown wrote:
 Barry Smith bsm...@mcs.anl.gov writes:
  I've tried to use a direct solver as suggested on p. 72, but:
  
  ./petBench 600 1 -ksp_type preonly -pc_type lu
 
You cannot use LU with SBAIJ format. Only Cholesky. So use -pc_type 
  cholesky
 
 And using a sparse matrix format for a dense matrix is a recipe for poor 
 performance.

I'm more or less aware of that, but I'm surprised that the performance is so 
poor.

I changed the format of matrixC to dense now, but the runtime is still poor. 
Overall time is now around 9s, with 95% spent in KSPSolve and exactly the same 
number of MatMult executions, taking 65% of the time.

Thx,
Florian


Re: [petsc-users] Speed up KSPSolve of dense matrix

2014-11-05 Thread Florian Lindner
On Tuesday, 4 November 2014, 10:08:27, Barry Smith wrote:
 
There are a lot of questions here. 

Yes, thanks for replying!

  On Nov 4, 2014, at 3:28 AM, Florian Lindner mailingli...@xgm.de wrote:
  
  Hello,
  
  I have a full matrix of size e.g. 603x603 whose runtime I'm very disappointed 
  with, compared to a naive LU / forward / backward solution. My 
  petsc solution takes about 14s, while the old one takes just 0.5s. (When 
  you're looking at sparse matrices the figures are almost reversed.) I'm trying 
  to investigate what the problem is.
 
For small problems like this direct dense LU could easily be faster than 
 iterative schemes, especially if you have many right hand sides with the same 
 matrix, since the factorization only needs to be done once.
 
  
  Most of the time is spent in the KSPSolve.
  
  Event                Count      Time (sec)     Flops                          --- Global ---  --- Stage ---   Total
                         Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
  
  
  --- Event Stage 0: Main Stage
  
  MatMult10334 1.0 6.3801e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
  0.0e+00 45 49  0  0  0  45 49  0  0  0  1177
  MatSolve   10334 1.0 6.9187e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
  0.0e+00 49 49  0  0  0  49 49  0  0  0  1085
 
If this is for a single solve for a 603 by 603 matrix then this is 
 absurd. 10,000 plus iterations.

But that's the output I got. When I run with -info I get a lot of messages:

[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged

though I do just one KSPSolve / KSPSetOperators. Is that ok?

  MatCholFctrNum 1 1.0 1.8968e-01 1.0 6.03e+02 1.0 0.0e+00 0.0e+00 
  0.0e+00  1  0  0  0  0   1  0  0  0  0 0
  KSPGMRESOrthog 1 1.0 2.1417e-01 1.0 3.73e+08 1.0 0.0e+00 0.0e+00 
  0.0e+00  2  2  0  0  0   2  2  0  0  0  1744
  KSPSolve   1 1.0 1.3763e+01 1.0 1.54e+10 1.0 0.0e+00 0.0e+00 
  0.0e+00 97100  0  0  0  97100  0  0  0  1120
  Prealloc Matrix C   1 1.0 5.7458e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
  0.0e+00  0  0  0  0  0   0  0  0  0  0 0
  Filling Matrix C   1 1.0 6.7984e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
  0.0e+00  0  0  0  0  0   0  0  0  0  0 0
  VecMDot1 1.0 8.8615e-02 1.0 1.87e+08 1.0 0.0e+00 0.0e+00 
  0.0e+00  1  1  0  0  0   1  1  0  0  0  2106
  VecMAXPY   10334 1.0 1.2585e-01 1.0 1.99e+08 1.0 0.0e+00 0.0e+00 
  0.0e+00  1  1  0  0  0   1  1  0  0  0  1580
  Prealloc Matrix A   1 1.0 1.1071e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
  0.0e+00  1  0  0  0  0   1  0  0  0  0 0
  Filling Matrix A   1 1.0 1.2574e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
  0.0e+00  1  0  0  0  0   1  0  0  0  0 0
  PCSetUp1 1.0 1.9073e-01 1.0 6.03e+02 1.0 0.0e+00 0.0e+00 
  0.0e+00  1  0  0  0  0   1  0  0  0  0 0
  PCApply10334 1.0 6.9233e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
  0.0e+00 49 49  0  0  0  49 49  0  0  0  1085
  
  
  Removed all events with global %T == 0, except my own ones.
  
  Matrix type is sequential MATSBAIJ on PETSC_COMM_SELF.
  
  KSP is initialized once:
  
   KSPSetOperators(_solver, _matrixC.matrix, _matrixC.matrix);
   KSPSetTolerances(_solver, _solverRtol, PETSC_DEFAULT, PETSC_DEFAULT, 
  PETSC_DEFAULT);
   KSPSetFromOptions(_solver);
  
  KSPSolve: 
  
   ierr = KSPSolve(_solver, in.vector, p.vector); CHKERRV(ierr)
  
  KSPSolve may be executed multiple times with the same SetOperators according 
  to the dimensionality of the data. Here it's just one.
  
  -ksp_view says:
  
  KSP Object: 1 MPI processes
   type: gmres
 GMRES: restart=30, using Classical (unmodified) Gram-Schmidt 
  Orthogonalization with no iterative refinement
 GMRES: happy breakdown tolerance 1e-30
   maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-09, absolute=1e-50, divergence=10000
   left preconditioning
   using PRECONDITIONED norm type for convergence test
  PC Object: 1 MPI processes
   type: icc
 0 levels of fill
 tolerance for zero pivot 2.22045e-14
 using Manteuffel shift [POSITIVE_DEFINITE]
 matrix ordering: natural
 factor fill ratio given 1, needed 1
   Factored matrix follows:
 Mat Object: 1 MPI processes
   type: seqsbaij
   rows=603, cols=603
   package used to perform factorization: petsc
   total: nonzeros=182103, allocated nonzeros=182103
   total number of mallocs used during MatSetValues calls =0
   block size is 1
   linear system matrix = precond matrix:
   Mat Object:  C   1 MPI processes
 type: seqsbaij
 rows=603, cols=603
 total: 

Re: [petsc-users] Speed up KSPSolve of dense matrix

2014-11-05 Thread Barry Smith

  So you have a dense, very ill-conditioned matrix. Seems to me you should just 
be using the dense LU/Cholesky factorization unless you know how to 
precondition these types of matrices.  
http://scicomp.stackexchange.com/questions/7561/do-rbf-kernel-matrices-tend-to-be-ill-conditioned
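
In PETSc terms that is roughly the following (a sketch, assuming the operator C
is stored as MATDENSE and is symmetric positive definite; equivalently, run
with -ksp_type preonly -pc_type cholesky):

  /* Sketch: one dense Cholesky factorization, then only triangular solves
     per right-hand side. */
  PC pc;
  KSPSetOperators(ksp, C, C);
  KSPSetType(ksp, KSPPREONLY);   /* no Krylov iterations, just apply the PC (the factorization) */
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCCHOLESKY);     /* Cholesky factorization of the dense matrix */
  KSPSetFromOptions(ksp);
  KSPSolve(ksp, b, x);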
  

You can’t take monstrously ill-conditioned matrices and expect to use 
iterative methods unless you really, really, really understand the source and 
control of the ill-conditioning.

 What is your end goal in trying to use iterative methods?

  Barry


 On Nov 5, 2014, at 3:41 AM, Florian Lindner mailingli...@xgm.de wrote:
 
 On Tuesday, 4 November 2014, 10:08:27, Barry Smith wrote:
 
   There are a lot of questions here. 
 
 Yes, thanks for replying!
 
 On Nov 4, 2014, at 3:28 AM, Florian Lindner mailingli...@xgm.de wrote:
 
 Hello,
 
 I have a full matrix of size e.g. 603x603 whose runtime I'm very disappointed 
 with, compared to a naive LU / forward / backward solution. My 
 petsc solution takes about 14s, while the old one takes just 0.5s. (When 
 you're looking at sparse matrices the figures are almost reversed.) I'm trying 
 to investigate what the problem is.
 
   For small problems like this direct dense LU could easily be faster than 
 iterative schemes, especially if you have many right hand sides with the 
 same matrix, since the factorization only needs to be done once.
 
 
 Most of the time is spent in the KSPSolve.
 
 Event                Count      Time (sec)     Flops                          --- Global ---  --- Stage ---   Total
                        Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
 
 
 --- Event Stage 0: Main Stage
 
 MatMult10334 1.0 6.3801e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
 0.0e+00 45 49  0  0  0  45 49  0  0  0  1177
 MatSolve   10334 1.0 6.9187e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
 0.0e+00 49 49  0  0  0  49 49  0  0  0  1085
 
   If this is for a single solve for a 603 by 603 matrix then this is 
 absurd. 10,000 plus iterations.
 
 But that's the output I got. When I run with -info I get a lot of messages:
 
 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
 unchanged
 
 though I do just one KSPSolve / KSPSetOperators. Is that ok?
 
 MatCholFctrNum 1 1.0 1.8968e-01 1.0 6.03e+02 1.0 0.0e+00 0.0e+00 
 0.0e+00  1  0  0  0  0   1  0  0  0  0 0
 KSPGMRESOrthog 1 1.0 2.1417e-01 1.0 3.73e+08 1.0 0.0e+00 0.0e+00 
 0.0e+00  2  2  0  0  0   2  2  0  0  0  1744
 KSPSolve   1 1.0 1.3763e+01 1.0 1.54e+10 1.0 0.0e+00 0.0e+00 
 0.0e+00 97100  0  0  0  97100  0  0  0  1120
 Prealloc Matrix C   1 1.0 5.7458e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
 Filling Matrix C   1 1.0 6.7984e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
 VecMDot1 1.0 8.8615e-02 1.0 1.87e+08 1.0 0.0e+00 0.0e+00 
 0.0e+00  1  1  0  0  0   1  1  0  0  0  2106
 VecMAXPY   10334 1.0 1.2585e-01 1.0 1.99e+08 1.0 0.0e+00 0.0e+00 
 0.0e+00  1  1  0  0  0   1  1  0  0  0  1580
 Prealloc Matrix A   1 1.0 1.1071e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
 0.0e+00  1  0  0  0  0   1  0  0  0  0 0
 Filling Matrix A   1 1.0 1.2574e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
 0.0e+00  1  0  0  0  0   1  0  0  0  0 0
 PCSetUp1 1.0 1.9073e-01 1.0 6.03e+02 1.0 0.0e+00 0.0e+00 
 0.0e+00  1  0  0  0  0   1  0  0  0  0 0
 PCApply10334 1.0 6.9233e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
 0.0e+00 49 49  0  0  0  49 49  0  0  0  1085
 
 
 Removed all events with global %T == 0, except my own ones.
 
 Matrix type is sequential MATSBAIJ on PETSC_COMM_SELF.
 
 KSP is initialized once:
 
 KSPSetOperators(_solver, _matrixC.matrix, _matrixC.matrix);
 KSPSetTolerances(_solver, _solverRtol, PETSC_DEFAULT, PETSC_DEFAULT, 
 PETSC_DEFAULT);
 KSPSetFromOptions(_solver);
 
 KSPSolve: 
 
 ierr = KSPSolve(_solver, in.vector, p.vector); CHKERRV(ierr)
 
 KSPSolve may be executed multiple times with the same SetOperators according 
 to the dimensionality of the data. Here it's just one.
 
 -ksp_view says:
 
 KSP Object: 1 MPI processes
 type: gmres
   GMRES: restart=30, using Classical (unmodified) Gram-Schmidt 
 Orthogonalization with no iterative refinement
   GMRES: happy breakdown tolerance 1e-30
 maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-09, absolute=1e-50, divergence=10000
 left preconditioning
 using PRECONDITIONED norm type for convergence test
 PC Object: 1 MPI processes
 type: icc
   0 levels of fill
   tolerance for zero pivot 2.22045e-14
   using Manteuffel shift [POSITIVE_DEFINITE]
   matrix 

Re: [petsc-users] Speed up KSPSolve of dense matrix

2014-11-04 Thread Barry Smith

   There are a lot of questions here. 

 On Nov 4, 2014, at 3:28 AM, Florian Lindner mailingli...@xgm.de wrote:
 
 Hello,
 
 I have a full matrix of size e.g. 603x603 whose runtime I'm very disappointed 
 with, compared to a naive LU / forward / backward solution. My 
 petsc solution takes about 14s, while the old one takes just 0.5s. (When 
 you're looking at sparse matrices the figures are almost reversed.) I'm trying to 
 investigate what the problem is.

   For small problems like this direct dense LU could easily be faster than 
iterative schemes, especially if you have many right hand sides with the same 
matrix, since the factorization only needs to be done once.
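
   A sketch of that pattern (the factorization is built once in the first
KSPSolve; b[k] and x[k] stand for the successive right-hand sides and
solutions):

   /* Sketch: with -ksp_type preonly -pc_type lu (or cholesky) the matrix is
      factored once in PCSetUp; each further KSPSolve with the unchanged
      operator only does a forward/backward substitution. */
   KSPSetOperators(ksp, A, A);          /* set once */
   for (PetscInt k = 0; k < nrhs; k++) {
     KSPSolve(ksp, b[k], x[k]);         /* no refactorization */
   }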

 
 Most of the time is spent in the KSPSolve.
 
 Event                Count      Time (sec)     Flops                          --- Global ---  --- Stage ---   Total
                        Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
 
 
 --- Event Stage 0: Main Stage
 
 MatMult10334 1.0 6.3801e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
 0.0e+00 45 49  0  0  0  45 49  0  0  0  1177
 MatSolve   10334 1.0 6.9187e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
 0.0e+00 49 49  0  0  0  49 49  0  0  0  1085

   If this is for a single solve for a 603 by 603 matrix then this is absurd.  
10,000 plus iterations.

 MatCholFctrNum 1 1.0 1.8968e-01 1.0 6.03e+02 1.0 0.0e+00 0.0e+00 
 0.0e+00  1  0  0  0  0   1  0  0  0  0 0
 KSPGMRESOrthog 1 1.0 2.1417e-01 1.0 3.73e+08 1.0 0.0e+00 0.0e+00 
 0.0e+00  2  2  0  0  0   2  2  0  0  0  1744
 KSPSolve   1 1.0 1.3763e+01 1.0 1.54e+10 1.0 0.0e+00 0.0e+00 
 0.0e+00 97100  0  0  0  97100  0  0  0  1120
 Prealloc Matrix C   1 1.0 5.7458e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
 Filling Matrix C   1 1.0 6.7984e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
 VecMDot1 1.0 8.8615e-02 1.0 1.87e+08 1.0 0.0e+00 0.0e+00 
 0.0e+00  1  1  0  0  0   1  1  0  0  0  2106
 VecMAXPY   10334 1.0 1.2585e-01 1.0 1.99e+08 1.0 0.0e+00 0.0e+00 
 0.0e+00  1  1  0  0  0   1  1  0  0  0  1580
 Prealloc Matrix A   1 1.0 1.1071e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
 0.0e+00  1  0  0  0  0   1  0  0  0  0 0
 Filling Matrix A   1 1.0 1.2574e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
 0.0e+00  1  0  0  0  0   1  0  0  0  0 0
 PCSetUp1 1.0 1.9073e-01 1.0 6.03e+02 1.0 0.0e+00 0.0e+00 
 0.0e+00  1  0  0  0  0   1  0  0  0  0 0
 PCApply10334 1.0 6.9233e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 
 0.0e+00 49 49  0  0  0  49 49  0  0  0  1085
 
 
 Removed all events with global %T == 0, except my own ones.
 
 Matrix type is sequential MATSBAIJ on PETSC_COMM_SELF.
 
 KSP is initialized once:
 
  KSPSetOperators(_solver, _matrixC.matrix, _matrixC.matrix);
  KSPSetTolerances(_solver, _solverRtol, PETSC_DEFAULT, PETSC_DEFAULT, 
 PETSC_DEFAULT);
  KSPSetFromOptions(_solver);
 
 KSPSolve: 
 
  ierr = KSPSolve(_solver, in.vector, p.vector); CHKERRV(ierr)
 
 KSPSolve may be executed multiple times with the same SetOperators according 
 to the dimensionality of the data. Here it's just one.
 
 -ksp_view says:
 
 KSP Object: 1 MPI processes
  type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt 
 Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-09, absolute=1e-50, divergence=10000
  left preconditioning
  using PRECONDITIONED norm type for convergence test
 PC Object: 1 MPI processes
  type: icc
0 levels of fill
tolerance for zero pivot 2.22045e-14
using Manteuffel shift [POSITIVE_DEFINITE]
matrix ordering: natural
factor fill ratio given 1, needed 1
  Factored matrix follows:
Mat Object: 1 MPI processes
  type: seqsbaij
  rows=603, cols=603
  package used to perform factorization: petsc
  total: nonzeros=182103, allocated nonzeros=182103
  total number of mallocs used during MatSetValues calls =0
  block size is 1
  linear system matrix = precond matrix:
  Mat Object:  C   1 MPI processes
type: seqsbaij
rows=603, cols=603
total: nonzeros=182103, allocated nonzeros=182103
total number of mallocs used during MatSetValues calls =0
block size is 1
 
 
 I've tried to use a direct solver as suggested on p. 72, but:
 
 ./petBench 600 1 -ksp_type preonly -pc_type lu

  You cannot use LU with SBAIJ format. Only Cholesky. So use -pc_type cholesky
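
  For example:

  ./petBench 600 1 -ksp_type preonly -pc_type cholesky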


 Mesh of size: 600
 Do mappings: 1
 [0]PETSC ERROR: - Error Message 
 

Re: [petsc-users] Speed up KSPSolve of dense matrix

2014-11-04 Thread Jed Brown
Barry Smith bsm...@mcs.anl.gov writes:
 I've tried to use a direct solver as suggested on p. 72, but:
 
 ./petBench 600 1 -ksp_type preonly -pc_type lu

   You cannot use LU with SBAIJ format. Only Cholesky. So use -pc_type cholesky

And using a sparse matrix format for a dense matrix is a recipe for poor 
performance.

