---------- Forwarded message ----------
From: paul zhang <[email protected]>
Date: Tue, Oct 22, 2013 at 4:12 PM
Subject: Re: [petsc-users] Fwd: How PETSc solves Ax=b in parallel
To: Matthew Knepley <[email protected]>
One more question: can I solve the system in parallel?

On Tue, Oct 22, 2013 at 4:08 PM, Matthew Knepley <[email protected]> wrote:

> On Tue, Oct 22, 2013 at 3:04 PM, huaibao zhang <[email protected]> wrote:
>
>> Thanks for the answer. It makes sense.
>>
>> However, in my case the matrix A is huge and rather sparse, and it also has a
>> fairly strong diagonal structure, although some other elements are nonzero. I
>> have to look for a better way to solve the system more efficiently, and in
>> parallel if possible.
>>
>> Attached is an example of A's structure. The pink block is a 10x10 matrix. The
>> number of rows or columns in my case can be in the millions.
>
> The analytic character of the operator is usually more important than the
> sparsity structure for scalable solvers.
> The pattern matters a lot for direct solvers, and you should definitely try
> them (SuperLU_dist or MUMPS in PETSc).
> If they use too much memory or are too slow, then you need to investigate good
> preconditioners for iterative methods.
>
>   Matt
>
>> Thanks again.
>> Paul
>>
>> --
>> Huaibao (Paul) Zhang
>> Gas Surface Interactions Lab
>> Department of Mechanical Engineering
>> University of Kentucky, Lexington, KY, 40506-0503
>> Office: 216 Ralph G. Anderson Building
>> Web: gsil.engineering.uky.edu
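For concreteness, a minimal sketch of what trying Matt's suggestion can look like,
assuming a petsc-3.4-era build configured with SuperLU_dist and/or MUMPS; the
function name SolveSystem is illustrative and error checking is omitted for
brevity:

    /* Sketch only: hand the assembled system to KSP and leave the actual
       method to run-time options. */
    #include <petscksp.h>

    PetscErrorCode SolveSystem(Mat A, Vec b, Vec x)
    {
      KSP ksp;

      KSPCreate(PETSC_COMM_WORLD, &ksp);
      KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN); /* petsc-3.5+ drops the last argument */
      KSPSetType(ksp, KSPFGMRES);  /* the flexible GMRES already in use; overridable below */
      KSPSetFromOptions(ksp);      /* honors -ksp_type, -pc_type, ... on the command line */
      KSPSolve(ksp, b, x);
      KSPDestroy(&ksp);
      return 0;
    }

    /* Parallel direct solves, if PETSc was configured with the packages
       (e.g. --download-superlu_dist or --download-mumps):
         -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist
         -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps
       (later releases rename the last option to -pc_factor_mat_solver_type) */

If the direct solvers run out of memory or are too slow, the same code falls back
to the iterative path (FGMRES with the default local preconditioner), which is
where preconditioner tuning would start.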
>> On Oct 21, 2013, at 12:53 PM, Matthew Knepley <[email protected]> wrote:
>>
>> On Mon, Oct 21, 2013 at 11:23 AM, paul zhang <[email protected]> wrote:
>>
>>> Hi Jed,
>>>
>>> Thanks a lot for your answer. It really helps. I built parts of the matrix on
>>> each processor, then collected them into a global one according to their
>>> global positions. Actually, I used two MPI functions instead of the one in
>>> the example where the local size as well as the global size is given:
>>> VecCreateMPI and MatCreateMPIAIJ. It does not really matter, right?
>>>
>>> My continuing question: since the iteration for the system is global, would
>>> it be more efficient if I solved locally instead, i.e. solved a part on each
>>> of the processors rather than solving globally?
>>
>> No, because this ignores the coupling between domains.
>>
>>   Matt
>>
>>> Thanks again,
>>> Paul
>>>
>>> On Mon, Oct 21, 2013 at 11:42 AM, Jed Brown <[email protected]> wrote:
>>>
>>>> paul zhang <[email protected]> writes:
>>>>
>>>> > I am using KSP, more specifically the FGMRES method, with MPI to solve an
>>>> > Ax=b system. Here is what I am doing. I cut my computational domain into
>>>> > many pieces, and in each of them I compute independently by solving the
>>>> > fluid equations. This has nothing to do with PETSc. Finally, I collect all
>>>> > of the information and load it into a whole A matrix.
>>>>
>>>> I hope you build parts of this matrix on each processor, as is done in the
>>>> examples. Note the range Istart to Iend here:
>>>>
>>>> http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex2.c.html
>>>>
>>>> > My question is how the PETSc functions work in parallel in my case. I have
>>>> > two guesses. First, PETSc solves its own matrix for each domain on the
>>>> > local processor, although A is global. For values like the number of
>>>> > iterations or the solution vector, there should then be as many of them as
>>>> > the number of processors I used, but I get only one of each. The reason
>>>> > would be that the processors talk with each other once all of their work
>>>> > is done, which is why I receive the "all-reduced" value. This seems more
>>>> > logical to me than my second guess.
>>>>
>>>> It does not work because the solution operators are global, so to solve the
>>>> problem, the iteration must be global.
>>>>
>>>> > In the second guess, the system is solved in parallel too, but the PETSc
>>>> > functions redistribute the global sparse matrix A to the processors after
>>>> > its loading is complete. That is to say, each processor may not be solving
>>>> > its own partition of the matrix.
>>>>
>>>> Hopefully you build the matrix already-distributed. The default
>>>> _preconditioner_ is local, but the iteration is global. PETSc does not
>>>> "redistribute" the matrix automatically, though if you call MatSetSizes()
>>>> and pass PETSC_DECIDE for the local sizes, PETSc will choose them.
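To make the Istart/Iend point concrete, a minimal sketch of assembling A
already-distributed in the style of the linked ex2.c; the size N, the tridiagonal
values, and the preallocation numbers are placeholders, and error checking is
omitted:

    /* Sketch only: each rank fills only the rows it owns. */
    #include <petscksp.h>

    int main(int argc, char **argv)
    {
      Mat         A;
      PetscInt    N = 100, Istart, Iend, i, j;  /* N would be in the millions in Paul's case */
      PetscScalar v;

      PetscInitialize(&argc, &argv, NULL, NULL);

      MatCreate(PETSC_COMM_WORLD, &A);
      MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);  /* PETSc picks the local row split */
      MatSetFromOptions(A);
      MatMPIAIJSetPreallocation(A, 3, NULL, 2, NULL);    /* rough guesses; tune to the real stencil */
      MatSeqAIJSetPreallocation(A, 3, NULL);

      /* This rank owns (and should fill) global rows Istart..Iend-1 */
      MatGetOwnershipRange(A, &Istart, &Iend);
      for (i = Istart; i < Iend; i++) {
        v = 2.0; MatSetValues(A, 1, &i, 1, &i, &v, INSERT_VALUES);
        if (i > 0)     { j = i - 1; v = -1.0; MatSetValues(A, 1, &i, 1, &j, &v, INSERT_VALUES); }
        if (i < N - 1) { j = i + 1; v = -1.0; MatSetValues(A, 1, &i, 1, &j, &v, INSERT_VALUES); }
      }
      MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
      MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

      /* ... create b and x, then solve as in the earlier sketch ... */

      MatDestroy(&A);
      PetscFinalize();
      return 0;
    }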
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener

--
Huaibao (Paul) Zhang
Gas Surface Interactions Lab
Department of Mechanical Engineering
University of Kentucky, Lexington, KY, 40506-0503
Office: 216 Ralph G. Anderson Building
Web: gsil.engineering.uky.edu

<<PastedGraphic-1.png>>
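On Paul's earlier question about VecCreateMPI and MatCreateMPIAIJ: a sketch of the
explicit-local-size variant, with nlocal as a placeholder for the rows a rank
owns; MatCreateMPIAIJ is the older name for what current releases call
MatCreateAIJ, and either way the rows stay where you put them, matching Jed's
note that PETSc does not redistribute the matrix:

    /* Sketch only: give the local sizes explicitly instead of PETSC_DECIDE. */
    #include <petscksp.h>

    PetscErrorCode CreateDistributed(PetscInt nlocal, Mat *A, Vec *x)
    {
      /* Vector: nlocal entries on this rank, global size summed over ranks */
      VecCreateMPI(PETSC_COMM_WORLD, nlocal, PETSC_DETERMINE, x);

      /* Square matrix with an nlocal-row block on this rank. MatCreateAIJ is the
         current name for MatCreateMPIAIJ; the 10/NULL/5/NULL preallocation
         numbers are rough placeholders. */
      MatCreateAIJ(PETSC_COMM_WORLD, nlocal, nlocal, PETSC_DETERMINE, PETSC_DETERMINE,
                   10, NULL, 5, NULL, A);
      return 0;
    }

Either style produces the same kind of layout: each global row lives on exactly
one rank, and MatGetOwnershipRange() reports which ones, so the difference is only
whether the user or PETSc picks the split.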
