On Oct 31, 2013, at 4:44 PM, Qin Lu <[email protected]> wrote:

> Jed,
> 
> The program is sequential so the CSR matrix is only constructed in the master 
> processor. You said: "You should really partition the *mesh*, then assemble 
> the local parts in parallel", do you mean I should parallelize the whole 
> program (using domain decomposition, for instance)? That will be my next 
> step. But for now I need to make the parallel solver work in a sequential 
> program.

   This is really not something that the MPI parallel programming model is in 
any way suited to. Taking your sequential code and its sequential matrix and 
making the matrix parallel is completely wasted effort.

   Instead, if you want to “try out” parallel solvers to see how they scale on 
your matrices, then you should 

1)   Run your sequential program and have it call MatView() and VecView() on the 
matrix (or matrices) and right-hand-side vector(s), using the binary viewer to 
write them to a file.

2)   Use src/ksp/ksp/examples/tutorials/ex10.c to load the matrix and vector 
from the file and solve the system in parallel on any number of processes.

   
   If you are happy with how the parallel linear solvers work, then rewrite 
the code to handle the mesh in parallel and generate the matrix in parallel 
using MatSetValues() etc.

   Barry

> 
> In the context of partitioning/distributing the matrix, the manual says the 
> first n0 rows should be on the first process, the next n1 rows should be on 
> the second process, and so on. But when calling MatSetValues, each 
> coefficient should use its GLOBAL (instead of local) row/column index (as 
> shown in the sample code ex2.c), right?
> 
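
To make the row layout described above concrete, here is a minimal sketch in 
plain C (no PETSc calls) of a contiguous block-row distribution: rank 0 owns 
the first n0 global rows, rank 1 the next n1, and so on. The function names 
local_rows, first_row and owner_of are hypothetical helpers invented for this 
illustration, and the "as even as possible" split is an assumption about how 
the local sizes are chosen; you may pick n0, n1, ... differently. MatSetValues 
itself takes global row and column indices.

```c
#include <assert.h>

/* Rows owned by `rank` when N rows are split as evenly as possible over
   `size` processes: the first N % size ranks get one extra row.
   (Assumed layout for this sketch; local sizes may also be set by hand.) */
static int local_rows(int N, int size, int rank)
{
    assert(size > 0 && rank >= 0 && rank < size);
    return N / size + (rank < N % size ? 1 : 0);
}

/* First global row owned by `rank`; ranges are contiguous, ordered by rank. */
static int first_row(int N, int size, int rank)
{
    int start = 0;
    for (int r = 0; r < rank; r++)
        start += local_rows(N, size, r);
    return start;
}

/* Rank that owns global row `row`, or -1 if the row is out of range. */
static int owner_of(int N, int size, int row)
{
    if (row < 0 || row >= N)
        return -1;
    for (int r = 0; r < size; r++) {
        int lo = first_row(N, size, r);
        if (row < lo + local_rows(N, size, r))
            return r;
    }
    return -1;
}
```

With N = 10 rows on 3 processes this gives the ranges [0,4), [4,7) and [7,10). 
A process that owns rows [4,7) still refers to them as rows 4, 5 and 6 when 
inserting values, not as 0, 1 and 2.
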
> Thanks,
> Qin
> 
> 
> 
> On Thursday, October 31, 2013 4:07 PM, Jed Brown <[email protected]> wrote:
> 
> Qin Lu <[email protected]> writes:
> 
>> Jed,
>>  
>> What about MatCreateSeqAIJWithArrays? Is it also implemented by looping over
>> the rows calling MatSetValues?
> 
> No, it uses the arrays directly because that format makes sense in
> serial.
> 
>> My CRS matrix is constructed in the master processor when using a
>> parallel solver. 
> 
> This will absolutely cripple your scalability.  If you're just trying to
> use a few cores, then fine, but if you care about scalability, you need
> to rethink your design.
> 
> 
>> Do I have to manually partition it (using metis, for instance) and
>> distribute it to all processors using MPI, or does PETSc have any
>> subroutines to do this job?
> 
> You should really partition the *mesh*, then assemble the local parts in
> parallel.  The alternative you can use is to partition the matrix
> entries (renumbering from your "native" ordering) and then call
> MatSetValues (mapping the indices) from rank 0.  This part is not
> scalable and may only make sense if you have a difficult problem or many
> systems to solve.  Better to do it right and assemble in parallel.  It's
> not difficult.     
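
The renumbering step in the alternative above can be sketched in plain C as 
follows. This is only an illustration under stated assumptions: part[i] is 
taken to be the rank that a partitioner (metis, for instance) assigned to 
native row i, and renumber_by_partition is a hypothetical helper, not a PETSc 
routine. It produces a new numbering in which each rank's rows are 
contiguous, which is what a block-row distribution needs.

```c
#include <assert.h>
#include <stdlib.h>

/* part[i]: rank assigned to native row i (assumed to come from a partitioner).
   On return, new_index[i] is the renumbered index of native row i, chosen so
   that all rows of rank 0 come first, then rank 1, and so on, and
   rows_per_rank[r] is the number of rows assigned to rank r.
   Returns 0 on success, -1 if the scratch allocation fails. */
static int renumber_by_partition(int n, int nranks, const int *part,
                                 int *new_index, int *rows_per_rank)
{
    assert(n >= 0 && nranks > 0);
    int *next = malloc((size_t)nranks * sizeof *next);
    if (!next)
        return -1;

    /* Count the rows assigned to each rank. */
    for (int r = 0; r < nranks; r++)
        rows_per_rank[r] = 0;
    for (int i = 0; i < n; i++) {
        assert(part[i] >= 0 && part[i] < nranks);
        rows_per_rank[part[i]]++;
    }

    /* Prefix sum: next[r] is the first new index of rank r's block. */
    int offset = 0;
    for (int r = 0; r < nranks; r++) {
        next[r] = offset;
        offset += rows_per_rank[r];
    }

    /* Hand out new indices in native order within each rank's block. */
    for (int i = 0; i < n; i++)
        new_index[i] = next[part[i]]++;

    free(next);
    return 0;
}
```

For a square matrix, both the row and the column index of every entry would 
be passed through new_index before insertion, and rows_per_rank would give 
the local row counts n0, n1, ... of the distributed matrix.
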
