Thanks for the answer. It makes sense. 

However, in my case, matrix A is huge and rather sparse, with a strong block-diagonal structure, although some off-diagonal elements are nonzero. I am looking for a more efficient way to solve the system, ideally in parallel. 

Attached is an example of A's structure. Each pink block is a 10x10 matrix. In my case the number of rows (and columns) can be in the millions. 
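
To make it concrete, here is a rough sketch of the kind of setup I am considering, assuming the 10x10 blocks map onto PETSc's block AIJ format (MATMPIBAIJ) and that FGMRES with a block Jacobi preconditioner is a sensible starting point. The local size nlocal and the preallocation counts are only placeholders, not my real numbers, and the calls are written against a recent PETSc release:

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PC             pc;
  PetscInt       bs     = 10;       /* the 10x10 block size from the figure */
  PetscInt       nlocal = 100000;   /* placeholder local row count, a multiple of bs */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

  /* Store A in block AIJ format so the 10x10 block structure is exploited */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, nlocal, nlocal, PETSC_DETERMINE, PETSC_DETERMINE);CHKERRQ(ierr);
  ierr = MatSetType(A, MATMPIBAIJ);CHKERRQ(ierr);
  ierr = MatMPIBAIJSetPreallocation(A, bs, 3, NULL, 2, NULL);CHKERRQ(ierr); /* rough block-row estimates */
  /* ... insert values with MatSetValuesBlocked() here ... */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = VecCreateMPI(PETSC_COMM_WORLD, nlocal, PETSC_DETERMINE, &b);CHKERRQ(ierr);
  ierr = VecDuplicate(b, &x);CHKERRQ(ierr);
  /* ... fill b ... */

  /* FGMRES with block Jacobi (one local solve per process) inside the global iteration */
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPFGMRES);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCBJACOBI);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* allow -ksp_type/-pc_type overrides at run time */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Would something along these lines be reasonable for a matrix of this size?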

Thanks again. 
Paul



--
Huaibao (Paul) Zhang
Gas Surface Interactions Lab
Department of Mechanical Engineering
University of Kentucky,
Lexington, KY, 40506-0503
Office: 216 Ralph G. Anderson Building
Web: gsil.engineering.uky.edu

On Oct 21, 2013, at 12:53 PM, Matthew Knepley <[email protected]> wrote:

On Mon, Oct 21, 2013 at 11:23 AM, paul zhang <[email protected]> wrote:
Hi Jed,

Thanks a lot for your answer. It really helps. I build parts of the matrix on each processor, then assemble them into a global matrix according to their global positions. Actually, I used two functions instead of the one in the example, giving both the local and the global sizes:
VecCreateMPI and MatCreateMPIAIJ. That does not really matter, right?
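
For concreteness, this is roughly what I call (the sizes and nonzero estimates below are placeholders, not my real numbers; newer PETSc releases name the second function MatCreateAIJ):

/* inside a program where PetscInitialize() has already been called */
PetscErrorCode ierr;
PetscInt       nlocal = 100;   /* placeholder local size on this process */
Vec            x;
Mat            A;

ierr = VecCreateMPI(PETSC_COMM_WORLD, nlocal, PETSC_DETERMINE, &x);CHKERRQ(ierr);
/* 10 and 5 are rough per-row nonzero estimates for the diagonal and off-diagonal parts */
ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, nlocal, nlocal, PETSC_DETERMINE, PETSC_DETERMINE,
                       10, NULL, 5, NULL, &A);CHKERRQ(ierr);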

My follow-up question: since the iteration for the system is global, would it be more efficient to solve locally instead, i.e., solve the parts on each processor rather than solving globally?

No, because this ignores the coupling between domains.

  Matt
 
Thanks again,

Paul

On Mon, Oct 21, 2013 at 11:42 AM, Jed Brown <[email protected]> wrote:
paul zhang <[email protected]> writes:

> I am using KSP, more specifically FGMRES method, with MPI to solve Ax=b
> system. Here is what I am doing. I cut my computation domain into many
> pieces, in each of them I compute independently by solving fluid equations.
> This has nothing to do with PETSc. Finally, I collect all of the
> information and load it to a whole A matrix.

I hope you build parts of this matrix on each processor, as is done in
the examples.  Note the range Istart to Iend here:

http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex2.c.html
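
The pattern in that example looks roughly like this (simplified sketch; the values and column indices depend on your discretization, and A is assumed to be a created, sized matrix):

PetscErrorCode ierr;
PetscInt       Istart, Iend, i;

/* Each process owns a contiguous range of rows; only set values in that range */
ierr = MatGetOwnershipRange(A, &Istart, &Iend);CHKERRQ(ierr);
for (i = Istart; i < Iend; i++) {
  PetscScalar v = 1.0;   /* placeholder entry */
  ierr = MatSetValues(A, 1, &i, 1, &i, &v, INSERT_VALUES);CHKERRQ(ierr);
}
ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);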

> My question is how PETSc functions work in parallel in my case. There are
> two guesses to me. First, PETSc solves its own matrix for each domain using
> local processor, although A is global. For values like the number of
> iterations and the solution vector, I would expect as many values as the
> number of processors I used, but I get only one value for each of them.
> The reason is that the processors must talk with each other once all of
> their work is done, that is why I received the "all reduced" value. This is
> more logical than my second guess.

It does not work that way, because the solution operators are global: to solve
the problem, the iteration must be global.
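
That is also why queries such as the iteration count are collective and return a single value on every process, e.g. (sketch, assuming a configured KSP):

PetscErrorCode ierr;
PetscInt       its;

ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
/* Collective on the KSP's communicator: every process gets the same global count */
ierr = KSPGetIterationNumber(ksp, &its);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_WORLD, "Converged in %d iterations\n", (int)its);CHKERRQ(ierr);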

> In the second one, the system is solved in parallel too. But PETSc function
> redistributes the global sparse matrix A to each of the processors after
> its load is complete. That is to say, now each processor may not solve
> its own partition of the matrix.

Hopefully you build the matrix already-distributed.  The default
_preconditioner_ is local, but the iteration is global.  PETSc does not
"redistribute" the matrix automatically, though if you call
MatSetSizes() and pass PETSC_DECIDE for the local sizes, PETSc will
choose them.
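
For example (sketch; the global size N is a placeholder):

PetscErrorCode ierr;
Mat            A;
PetscInt       N = 1000000;   /* placeholder global size */

ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
/* PETSC_DECIDE for the local sizes: PETSc picks an (almost) even split across processes */
ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
ierr = MatSetFromOptions(A);CHKERRQ(ierr);
ierr = MatSetUp(A);CHKERRQ(ierr);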



--
Huaibao (Paul) Zhang
Gas Surface Interactions Lab
Department of Mechanical Engineering
University of Kentucky,
Lexington, KY, 40506-0503
Office: 216 Ralph G. Anderson Building
Web: gsil.engineering.uky.edu



--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
