On Tue, Oct 22, 2013 at 3:12 PM, paul zhang <[email protected]> wrote:
> One more question, can I solve the system in parallel?
>

Yes, or you would be using ETSC :)

   Matt

>
> On Tue, Oct 22, 2013 at 4:08 PM, Matthew Knepley <[email protected]> wrote:
>
>> On Tue, Oct 22, 2013 at 3:04 PM, huaibao zhang <[email protected]> wrote:
>>
>>> Thanks for the answer. It makes sense.
>>>
>>> However, in my case, matrix A is huge and rather sparse, and it also has
>>> a pretty good diagonal structure, although some other elements are
>>> nonzero. I have to look for a better way to solve the system more
>>> efficiently; in parallel would be even better.
>>>
>>> Attached is an example of A's structure. Each pink block is a 10x10
>>> matrix. The number of rows or columns in my case can be in the millions.
>>
>> The analytic character of the operator is usually more important than the
>> sparsity structure for scalable solvers.
>> The pattern matters a lot for direct solvers, and you should definitely
>> try them (SuperLU_dist or MUMPS in PETSc).
>> If they use too much memory or are too slow, then you need to investigate
>> good preconditioners for iterative methods.
>>
>>    Matt
>>
>>> Thanks again.
>>> Paul
>>>
>>> --
>>> Huaibao (Paul) Zhang
>>> *Gas Surface Interactions Lab*
>>> Department of Mechanical Engineering
>>> University of Kentucky,
>>> Lexington, KY, 40506-0503
>>> *Office*: 216 Ralph G. Anderson Building
>>> *Web*: gsil.engineering.uky.edu
>>>
>>> On Oct 21, 2013, at 12:53 PM, Matthew Knepley <[email protected]> wrote:
>>>
>>> On Mon, Oct 21, 2013 at 11:23 AM, paul zhang <[email protected]> wrote:
>>>
>>>> Hi Jed,
>>>>
>>>> Thanks a lot for your answer. It really helps. I built parts of the
>>>> matrix on each processor, then collected them into a global one
>>>> according to their global positions. Actually, I used two functions,
>>>> VecCreateMPI and MatCreateMPIAIJ, instead of the ones in the example;
>>>> with these, the local size as well as the global size is given. That
>>>> does not really matter, right?
>>>>
>>>> My follow-up question: since the iteration for the system is global,
>>>> would it be more efficient to solve locally instead, i.e. solve a part
>>>> on each processor rather than solving globally?
>>>
>>> No, because this ignores the coupling between domains.
>>>
>>>    Matt
>>>
>>>> Thanks again,
>>>>
>>>> Paul
>>>>
>>>> On Mon, Oct 21, 2013 at 11:42 AM, Jed Brown <[email protected]> wrote:
>>>>
>>>>> paul zhang <[email protected]> writes:
>>>>>
>>>>> > I am using KSP, more specifically the FGMRES method, with MPI to
>>>>> > solve an Ax=b system. Here is what I am doing. I cut my
>>>>> > computational domain into many pieces, and in each of them I
>>>>> > compute independently by solving the fluid equations. This has
>>>>> > nothing to do with PETSc. Finally, I collect all of the information
>>>>> > and load it into a single global A matrix.
>>>>>
>>>>> I hope you build parts of this matrix on each processor, as is done
>>>>> in the examples. Note the range Istart to Iend here:
>>>>>
>>>>> http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex2.c.html
>>>>>
>>>>> > My question is how the PETSc functions work in parallel in my case.
>>>>> > I have two guesses. First, PETSc solves its own matrix for each
>>>>> > domain on the local processor, although A is global. In that case,
>>>>> > for values like the number of iterations and the solution vector,
>>>>> > I should get as many of them as the number of processors I used,
>>>>> > but I get only one of each. The reason would be that the processors
>>>>> > must talk to each other once all of their work is done, which is
>>>>> > why I receive the "all-reduced" value. This seems more logical to
>>>>> > me than my second guess.
>>>>>
>>>>> It does not work because the solution operators are global, so to
>>>>> solve the problem, the iteration must be global.
>>>>>
>>>>> > In the second guess, the system is solved in parallel too, but the
>>>>> > PETSc functions redistribute the global sparse matrix A across the
>>>>> > processors after its loading is complete. That is to say, each
>>>>> > processor may then not be solving its own partition of the matrix.
>>>>>
>>>>> Hopefully you build the matrix already-distributed. The default
>>>>> _preconditioner_ is local, but the iteration is global. PETSc does
>>>>> not "redistribute" the matrix automatically, though if you call
>>>>> MatSetSizes() and pass PETSC_DECIDE for the local sizes, PETSc will
>>>>> choose them.
>>>>
>>>> --
>>>> Huaibao (Paul) Zhang
>>>> *Gas Surface Interactions Lab*
>>>> Department of Mechanical Engineering
>>>> University of Kentucky,
>>>> Lexington, KY, 40506-0503
>>>> *Office*: 216 Ralph G. Anderson Building
>>>> *Web*: gsil.engineering.uky.edu
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which
>>> their experiments lead.
>>> -- Norbert Wiener
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which
>> their experiments lead.
>> -- Norbert Wiener
>
> --
> Huaibao (Paul) Zhang
> *Gas Surface Interactions Lab*
> Department of Mechanical Engineering
> University of Kentucky,
> Lexington, KY, 40506-0503
> *Office*: 216 Ralph G. Anderson Building
> *Web*: gsil.engineering.uky.edu

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
<<PastedGraphic-1.png>>
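A minimal sketch of the owned-rows assembly pattern Jed points to in ex2.c. It
is a sketch only: the global size N, the preallocation counts, and the dummy
diagonal fill below are placeholders for the actual per-domain fluid coupling,
and error checking (CHKERRQ) is omitted for brevity.

    /* A sketch only: build A already-distributed, each rank filling just the
       rows it owns (the Istart/Iend pattern from ex2.c). N, the preallocation
       counts, and the diagonal fill are placeholders. */
    #include <petscksp.h>

    int main(int argc, char **argv)
    {
      Mat         A;
      PetscInt    N = 1000000, Istart, Iend, i;
      PetscScalar v = 1.0;

      PetscInitialize(&argc, &argv, NULL, NULL);

      MatCreate(PETSC_COMM_WORLD, &A);
      MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);  /* PETSc chooses the local sizes */
      MatSetFromOptions(A);
      MatMPIAIJSetPreallocation(A, 10, NULL, 2, NULL);   /* guess: ~10 entries per row in the
                                                            diagonal block, a couple outside */
      MatSeqAIJSetPreallocation(A, 12, NULL);
      MatSetUp(A);

      MatGetOwnershipRange(A, &Istart, &Iend);           /* this rank owns rows Istart..Iend-1 */
      for (i = Istart; i < Iend; i++) {
        MatSetValues(A, 1, &i, 1, &i, &v, INSERT_VALUES); /* replace with the real couplings */
      }
      MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
      MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

      /* ... hand A to a KSP here (see the solve sketch below) ... */

      MatDestroy(&A);
      PetscFinalize();
      return 0;
    }

Building the matrix this way is equivalent in spirit to the VecCreateMPI and
MatCreateMPIAIJ calls Paul describes: passing PETSC_DECIDE for the local sizes
lets PETSc choose the ownership ranges, as Jed notes.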

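A companion sketch of the solve itself, again hedged: solve_system is a
hypothetical helper that assumes the matrix A and global size N from the
assembly sketch above and a placeholder right-hand side. It illustrates why
only one iteration count and one solution vector appear regardless of the
number of ranks: the KSP lives on PETSC_COMM_WORLD, so the FGMRES iteration is
global, while the default parallel preconditioner (block Jacobi with ILU(0) on
each block) is the local part. The options shown for switching to the direct
solvers Matt mentions follow the releases of that era; newer releases spell
the factor option differently, as noted in the comment.

    /* A sketch only: assumes a matrix A assembled as above; error checking omitted. */
    static PetscErrorCode solve_system(Mat A, PetscInt N)
    {
      Vec      x, b;
      KSP      ksp;
      PetscInt its;

      VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, N, &x); /* same row layout as the matrix */
      VecDuplicate(x, &b);
      VecSet(b, 1.0);                      /* placeholder right-hand side */

      KSPCreate(PETSC_COMM_WORLD, &ksp);
      KSPSetOperators(ksp, A, A);          /* 2013-era releases take a fourth MatStructure argument */
      KSPSetType(ksp, KSPFGMRES);          /* global Krylov iteration; the default parallel
                                              preconditioner (block Jacobi + ILU(0)) is the local part */
      KSPSetFromOptions(ksp);              /* allows run-time switches, e.g. to a parallel direct solver:
                                              -ksp_type preonly -pc_type lu
                                              -pc_factor_mat_solver_package mumps   (or superlu_dist;
                                              newer releases: -pc_factor_mat_solver_type) */
      KSPSolve(ksp, b, x);

      KSPGetIterationNumber(ksp, &its);    /* one global count, identical on every rank */
      PetscPrintf(PETSC_COMM_WORLD, "iterations %D\n", its);

      KSPDestroy(&ksp);
      VecDestroy(&x);
      VecDestroy(&b);
      return 0;
    }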