`Thank you for your answers. Barry's answer allowed me to get up to 50x`

`faster code, so it was a huge help. I decided not to use ksp context`

`altogether for this small operation.`

Thank you again,

Ali On 11-03-2018 18:53, Matthew Knepley wrote:

On Sun, Mar 11, 2018 at 1:14 AM, Smith, Barry F. <bsm...@mcs.anl.gov<mailto:bsm...@mcs.anl.gov>> wrote:1) Run the problem with -ksp_view_mat and -ksp_view_rhs and mail petsc-ma...@mcs.anl.gov <mailto:petsc-ma...@mcs.anl.gov> the resulting file produced called binaryoutput 2) By default PCLU does a reordering to reduce fill that could introduce a zero pivoit, PCILU does not do a reordering by default. You can use -pc_factor_mat_ordering_type none to force no reordering (PCLU does not do numerical pivoting for stability so can fail with zero pivots). 3) If you need to solve these tiny 7 by 7 systems many times (presumably you are solving these to set up a large algebraic system solved afterwards) then you probably don't want to use KSP to solve them. You can use the low level kernel PetscKernel_A_gets_inverse_A_7() that does do pivoting followed by a multiply like s1 = v[0]*x1 + v[6]*x2 + v[12]*x3 + v[18]*x4 + v[24]*x5 + v[30]*x6; s2 = v[1]*x1 + v[7]*x2 + v[13]*x3 + v[19]*x4 + v[25]*x5 + v[31]*x6; s3 = v[2]*x1 + v[8]*x2 + v[14]*x3 + v[20]*x4 + v[26]*x5 + v[32]*x6; s4 = v[3]*x1 + v[9]*x2 + v[15]*x3 + v[21]*x4 + v[27]*x5 + v[33]*x6; s5 = v[4]*x1 + v[10]*x2 + v[16]*x3 + v[22]*x4 + v[28]*x5 + v[34]*x6; s6 = v[5]*x1 + v[11]*x2 + v[17]*x3 + v[23]*x4 + v[29]*x5 + v[35]*x6; where v is the dense 7 by 7 matrix (stored column oriented like Fortran) an the x are the seven values of the right hand side. Note that PCPBJACOBI will do this automatically. Matt Barry > On Mar 10, 2018, at 5:22 AM, Ali Berk Kahraman <aliberkkahra...@yahoo.com <mailto:aliberkkahra...@yahoo.com>> wrote: > > Hello All, > > I am trying to get the finite difference coefficients for a given irregular grid. For this, I follow the following webpage, which tells me to solve a linear system. > > http://web.media.mit.edu/~crtaylor/calculator.html <http://web.media.mit.edu/%7Ecrtaylor/calculator.html> > > I solve a 7 unknown linear system with a 7x7 dense matrix to get the finite difference coefficients. Since I will call this code many many many times in my overall project, I need it to be as fast, yet as exact as possible. So I use PCLU. I make sure that there are no zero diagonals on the matrix, I swap required rows for it. However, PCLU still diverges with the output at the end of this e-mail. It indicates "FACTOR_NUMERIC_ZEROPIVOT" , but as I have written above I make sure there are no zero main diagonal entries on the matrix. When I use PCILU instead, it converges pretty well. > > So my question is, is PCILU the same thing mathematically as PCLU when applied on a small dense matrix? I need to know if I get the exact solution with PCILU, because my whole project will depend on the accuracy of the finite differences. > > Best Regards, > > Ali Berk Kahraman > M.Sc. Student, Mechanical Engineering Dept. > Boğaziçi Uni., Istanbul, Turkey > > Linear solve did not converge due to DIVERGED_PCSETUP_FAILED iterations 0 > PCSETUP_FAILED due to FACTOR_NUMERIC_ZEROPIVOT > KSP Object: 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=7, cols=7 > package used to perform factorization: petsc > total: nonzeros=49, allocated nonzeros=49 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=7, cols=7 > total: nonzeros=49, allocated nonzeros=49 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2 nodes, limit used is 5 > > --What most experimenters take for granted before they begin theirexperiments is infinitely more interesting than any results to whichtheir experiments lead.-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ <http://www.caam.rice.edu/%7Emk51/>