Dear PETSc users,

I am solving large systems of nonlinear PDEs. The most expensive operation is solving the linear systems Ax=b that arise, where A is block tridiagonal, and I am using PETSc to do so.
A is created using MatCreateMPIAIJ and x and b using VecCreateMPI, and for now I do not modify the default parameters of the KSP context (i.e., the Krylov method is GMRES, and the preconditioner is ILU(0) if I use 1 processor - sequential matrix - and block Jacobi with ILU(0) on each sub-block if I use more than 1 processor). A minimal sketch of this setup is included below my signature.

For any number n of processors used, I do get the correct result. However, it seems that the more processors I have, the more iterations are done on each linear solve (n = 1 gives about 1-2 iterations per solve, n = 2 gives 8-12, and n = 4 gives 15-20). I can understand the difference between n = 1 and n = 2, since the preconditioner changes from ILU(0) to block Jacobi, but I don't understand why this happens from n = 2 to n = 4, for example, since it seems to me that the method used to solve Ax=b is the same (although the partitioning is different), so the operations should be the same, even though there is more communication. My first question is then: is this normal behavior, or am I probably doing something wrong somewhere?

Also, since the increase in the number of iterations more than offsets the decrease in time spent solving the system as n increases, my program runs slower with an increasing number of processors, which is the opposite of what is desired. Would you have suggestions as to what I could change to correct this?

I would be happy to provide more details about the problem/data structures used if needed. Thank you for your help,

Tibo
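
P.S. For concreteness, here is a rough sketch of the setup described above. The global size N, the preallocation figures, and the assembly loops are placeholders for my actual problem, and the exact call signatures depend on the PETSc version (this follows the older API implied by MatCreateMPIAIJ):

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PetscInt       N = 1000;   /* placeholder global problem size */
  PetscInt       its;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); CHKERRQ(ierr);

  /* Block-tridiagonal matrix stored as a parallel AIJ matrix;
     the d_nz/o_nz preallocation numbers here are placeholders */
  ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE,
                         N, N, 9, NULL, 3, NULL, &A); CHKERRQ(ierr);
  /* ... fill A with MatSetValues(), then assemble ... */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

  ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, N, &x); CHKERRQ(ierr);
  ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, N, &b); CHKERRQ(ierr);
  /* ... fill b with VecSetValues() and assemble it ... */

  /* Default KSP: GMRES, with ILU(0) on one process and block Jacobi
     (one ILU(0) block per process) when run in parallel */
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN); CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);

  ierr = KSPSolve(ksp, b, x); CHKERRQ(ierr);
  ierr = KSPGetIterationNumber(ksp, &its); CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "Iterations: %d\n", its); CHKERRQ(ierr);

  /* ... object cleanup omitted ... */
  ierr = PetscFinalize();
  return 0;
}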
