Hello, I have a fulll matrix of size e.g. 603x603 of which I'm very disappointed with the runtime, compared to a naive LU / forward / backward solution. My petsc solution takes about 14s, while the old one takes just 0.5s. (when you're looking at sparse matrices the figures are almost inversed). I try to investigate what's the problem.
Most of the time is spent in the KSPSolve. Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 10334 1.0 6.3801e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 0.0e+00 45 49 0 0 0 45 49 0 0 0 1177 MatSolve 10334 1.0 6.9187e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 0.0e+00 49 49 0 0 0 49 49 0 0 0 1085 MatCholFctrNum 1 1.0 1.8968e-01 1.0 6.03e+02 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 KSPGMRESOrthog 10000 1.0 2.1417e-01 1.0 3.73e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 2 0 0 0 2 2 0 0 0 1744 KSPSolve 1 1.0 1.3763e+01 1.0 1.54e+10 1.0 0.0e+00 0.0e+00 0.0e+00 97100 0 0 0 97100 0 0 0 1120 Prealloc Matrix C 1 1.0 5.7458e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Filling Matrix C 1 1.0 6.7984e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 10000 1.0 8.8615e-02 1.0 1.87e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 2106 VecMAXPY 10334 1.0 1.2585e-01 1.0 1.99e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 1580 Prealloc Matrix A 1 1.0 1.1071e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 Filling Matrix A 1 1.0 1.2574e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 PCSetUp 1 1.0 1.9073e-01 1.0 6.03e+02 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 PCApply 10334 1.0 6.9233e+00 1.0 7.51e+09 1.0 0.0e+00 0.0e+00 0.0e+00 49 49 0 0 0 49 49 0 0 0 1085 ------------------------------------------------------------------------------------------------------------------------ Removed all events with global %T == 0, but my own ones. Matrix type is sequential MATSBAIJ on PETSC_COMM_SELF. KSP is initalized once: KSPSetOperators(_solver, _matrixC.matrix, _matrixC.matrix); KSPSetTolerances(_solver, _solverRtol, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT); KSPSetFromOptions(_solver); KSPSolve: ierr = KSPSolve(_solver, in.vector, p.vector); CHKERRV(ierr) KSPSolve may be executed multiple times with the same SetOperators acording to dimensionality of the data. Here it's just one. -ksp_view says: KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: icc 0 levels of fill tolerance for zero pivot 2.22045e-14 using Manteuffel shift [POSITIVE_DEFINITE] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqsbaij rows=603, cols=603 package used to perform factorization: petsc total: nonzeros=182103, allocated nonzeros=182103 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: C 1 MPI processes type: seqsbaij rows=603, cols=603 total: nonzeros=182103, allocated nonzeros=182103 total number of mallocs used during MatSetValues calls =0 block size is 1 I've tried to use a direct solver like suggested on pp 72, but: ./petBench 600 1 -ksp_type preonly -pc_type lu Mesh of size: 600 Do mappings: 1 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Factor type not supported [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.5.2, unknown [0]PETSC ERROR: ./petBench on a arch-linux2-c-opt named helium by lindnefn Tue Nov 4 10:19:28 2014 [0]PETSC ERROR: Configure options --with-debugging=0 [0]PETSC ERROR: #1 MatGetFactor_seqsbaij_petsc() line 1833 in /data2/scratch/lindner/petsc/src/mat/impls/sbaij/seq/sbaij.c [0]PETSC ERROR: #2 MatGetFactor() line 3963 in /data2/scratch/lindner/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: #3 PCSetUp_LU() line 125 in /data2/scratch/lindner/petsc/src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: #4 PCSetUp() line 902 in /data2/scratch/lindner/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #5 KSPSetUp() line 305 in /data2/scratch/lindner/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #6 KSPSolve() line 417 in /data2/scratch/lindner/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #7 map() line 391 in /data2/scratch/lindner/precice/src/mapping/PetRadialBasisFctMapping.hpp Any other idea about how to speed up that? Thanks for any suggestions! Florian