Niclas Jansson wrote:
> On Wed, 5 Dec 2007 08:09:07 -0600
> "Matthew Knepley" <[EMAIL PROTECTED]> wrote:
>
>> Just some comments on the strategy.
>>
>> On Dec 5, 2007 7:50 AM, Anders Logg <[EMAIL PROTECTED]> wrote:
>>> On Mon, Dec 03, 2007 at 11:44:44AM +0100, Niclas Jansson wrote:
>>>
>>>> It's a different strategy that uses point-to-point instead of
>>>> collective communication. However, the plan for parallel assembly
>>>> should be more or less the same.
>>>>
>>>> I attached the more detailed TODO list, it should explain the
>>>> necessary changes to the Mesh classes.
>>>>
>>>> Niclas
>>>> Modify mesh representation to store both local and global indices
>>>> for each cell/vertex. Implement mesh functions to map between
>>>> local and global indices. The local indices correspond to the
>>>> current cell and vertex indices; only the mapping functions need
>>>> to be added to the Mesh class.
>>> I don't think we should store both local and global indices for mesh
>>> entities. All we need is to store the mapping from local to global
>>> indices. We can use MeshFunctions for this but it's not necessary.
>>>
>>> My suggestion would be to add a new class MeshNumbering (maybe
>>> someone can suggest a better name) which would store the numbering
>>> scheme in a set of (dim + 1) arrays:
>>>
>>> class MeshNumbering
>>> {
>>> public:
>>>
>>> ...
>>>
>>> private:
>>>
>>> uint** numbering;
>>>
>>> };
>>>
>>> One array for each dimension, so numbering[0][i] would return the
>>> global number for the vertex with index i.
>>>
>>> We can also add an easy-access function for global numbers to
>>> MeshEntity so that (for example) e.number() would return the global
>>> number of an entity e.
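To make Anders' suggestion concrete, a rough sketch of what such a class
could look like (purely illustrative; the names and the accessor are not
existing DOLFIN code):

  typedef unsigned int uint;

  // Sketch: stores, for each topological dimension d, the map from
  // local entity index to global index. The per-dimension arrays are
  // allocated and filled when the mesh is read or partitioned.
  class MeshNumbering
  {
  public:

    MeshNumbering(uint top_dim)
      : top_dim(top_dim), numbering(new uint*[top_dim + 1])
    {
      for (uint d = 0; d <= top_dim; ++d)
        numbering[d] = 0;
    }

    ~MeshNumbering()
    {
      for (uint d = 0; d <= top_dim; ++d)
        delete [] numbering[d];
      delete [] numbering;
    }

    // Global number of entity i of topological dimension d
    uint global(uint d, uint i) const { return numbering[d][i]; }

  private:

    uint top_dim;
    uint** numbering;
  };

The easy-access e.number() on MeshEntity could then just forward to
something like mesh.numbering().global(e.dim(), e.index()).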
>> I will just point out that I think this is very limiting. You can
>> argue that it covers what you want to do, but it is quite inflexible
>> compared with having names. It is an incredible pain in the ass to
>> rebalance (or do anything else complicated, like AMR) if you
>> rely on offsets (numberings) rather than names. I recommend (as we
>> do) using names until you have exactly the mesh you want, and then
>> reducing to offsets. This is implemented manually in Sieve right now
>> (you call a method), but I am trying to automate it with code
>> generation.
>>
>
> Ok, since the second part of the project covers AMR, maybe a different
> approach is needed.
>
>
>>>> Adapt mesh reading for the new representation, storing mesh data
>>>> based on the number of local cells/vertices instead of the parsed
>>>> numbers. This modification allows processors to read different
>>>> parts of the mesh in parallel, making an initial distribution step
>>>> unnecessary.
>>>>
>>>> Loading meshes in parallel should increase efficiency, reduce
>>>> costly communication and save memory for large scale problems,
>>>> given that the parallel environment has a shared file system that
>>>> can handle the load. However, the serial distribution should still
>>>> be implemented to support environments without shared file systems.
>>>>
>>>> Modifications for the new representation should be implemented in
>>>> class XMLMesh. Functions for initial mesh distribution should be
>>>> implemented in a new class.
>>> For this, we should add optional data to the mesh format, such that
>>> the current file format still works. If additional data is present,
>>> then that is read into MeshNumbering, otherwise it is empty.
>>>
>>> (When I think of it, MeshNumbering may not be a good choice of name
>>> for the new class, since it may be confused with MeshOrdering which
>>> does something different but related.)
>>>
>
>
> It would clean up the implementation a lot. My idea was to use a simple
> linear distribution just to get everything off disk. But maybe this
> approach (the whole idea) won't scale beyond 4-8 processors without a
> fancy file system (GPFS) or an MPI-IO implementation.
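For what it is worth, the linear (block) distribution itself is simple;
a sketch where each process computes the contiguous range of cells it
is responsible for reading (names are made up for illustration):

  #include <mpi.h>
  #include <algorithm>

  // Sketch: block distribution of num_global cells over the processes
  // in comm; returns the half-open range [offset, offset + count)
  // owned by this process.
  void local_range(unsigned int num_global, MPI_Comm comm,
                   unsigned int& offset, unsigned int& count)
  {
    int rank = 0, size = 1;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    const unsigned int n = num_global / size;  // base block size
    const unsigned int r = num_global % size;  // first r ranks get one extra

    count  = n + (static_cast<unsigned int>(rank) < r ? 1 : 0);
    offset = rank*n + std::min(static_cast<unsigned int>(rank), r);
  }

The XML reader could then simply skip cells and vertices outside its own
range while parsing.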
>
>
>>>> Change mesh partitioning library to ParMETIS. Modify the
>>>> partitioning class to work on distributed data, add the necessary
>>>> calls to METIS and redistribute the local vertices/cells according
>>>> to the result. Since METIS can partition a mesh directly, using
>>>> an internal mesh-to-graph translation, it is possible to have the
>>>> partitioning directly in the MeshPartition class. However, both
>>>> methods could easily be implemented and compared against each
>>>> other.
>>> We don't want to change from SCOTCH to ParMETIS, but we could add
>>> support for using METIS/ParMETIS as an option.
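For reference, a sketch of the direct mesh partitioning call against the
ParMETIS 3.x interface (argument setup is abbreviated and the exact types
should be checked against the ParMETIS manual):

  #include <mpi.h>
  #include <parmetis.h>
  #include <vector>

  // Sketch: partition a distributed mesh directly with ParMETIS.
  // elmdist describes how cells are split over processes, eptr/eind is
  // the local cell-vertex connectivity in CSR form, and part receives
  // the destination process of each local cell.
  void partition_cells(MPI_Comm comm, idxtype* elmdist, idxtype* eptr,
                       idxtype* eind, int num_parts, idxtype* part)
  {
    int wgtflag = 0;             // no cell weights
    int numflag = 0;             // C-style (0-based) numbering
    int ncon = 1;                // one balance constraint
    int ncommonnodes = 3;        // vertices shared by neighbouring tets
    int options[3] = {0, 0, 0};  // default options
    int edgecut = 0;

    std::vector<float> tpwgts(ncon*num_parts, 1.0f/num_parts);
    float ubvec[1] = {1.05f};    // allowed load imbalance

    ParMETIS_V3_PartMeshKway(elmdist, eptr, eind, 0, &wgtflag, &numflag,
                             &ncon, &ncommonnodes, &num_parts, &tpwgts[0],
                             ubvec, options, &edgecut, part, &comm);
  }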
>> Have you thought about generalizing the partitioning to hypergraphs? I
>> just did this so I can partition faces (for FVM) and it was not that
>> bad. I use Zoltan
>> from Sandia.
>>
>
> No, but Zoltan looked really interesting for the AMR/load balancing parts.
>
>
>>>> Finish implementation of mesh communication class
>>>> MPIMeshCommunicator. Add functionality for single vertex and cell
>>>> communication needed for mesh partitioning.
>>> What do you mean by single vertex and cell communication? Also note
>>> that it is not enough to communicate indices for vertices and
>>> cells. Sometimes we also need to communicate edges and faces.
>> That is why you should never explicitly refer to vertices and cells,
>> but rather communicate the entire closure and star of each element
>> that you send. That is the point of the mesh structure, to avoid this
>> kind of special-purpose coding.
>>
>
> Of course, what I meant was that functionality for a point-to-point
> pattern had to be implemented.
>
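A minimal sketch of such a non-blocking point-to-point exchange with the
neighbouring processes (buffer layout and names are purely illustrative):

  #include <mpi.h>
  #include <vector>

  // Sketch: post all receives and sends to the neighbouring processes,
  // then wait for completion, instead of using collective communication.
  void exchange(MPI_Comm comm, const std::vector<int>& neighbours,
                std::vector<std::vector<double> >& send_buffers,
                std::vector<std::vector<double> >& recv_buffers)
  {
    std::vector<MPI_Request> requests(2*neighbours.size());

    for (unsigned int i = 0; i < neighbours.size(); ++i)
    {
      MPI_Irecv(&recv_buffers[i][0],
                static_cast<int>(recv_buffers[i].size()), MPI_DOUBLE,
                neighbours[i], 0, comm, &requests[2*i]);
      MPI_Isend(&send_buffers[i][0],
                static_cast<int>(send_buffers[i].size()), MPI_DOUBLE,
                neighbours[i], 0, comm, &requests[2*i + 1]);
    }
    MPI_Waitall(static_cast<int>(requests.size()), &requests[0],
                MPI_STATUSES_IGNORE);
  }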
>>>> Adapt boundary calculation to work on distributed meshes. Use
>>>> knowledge about which vertices are shared among processors to
>>>> decide if an edge is global or local. Implement the logic directly
>>>> in BoundaryComputation class using information from the mesh
>>>> partitioning.
>>> I'm not sure I understand this point.
>>>
>
> Since the mesh is distributed, a boundary could be local (shared among
> processors) or global, where the BC should be applied. The list of
> shared vertices could be used to sort out the local boundaries.
>
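For reference, the shared-vertex test being described amounts to
something like the following heuristic sketch (a facet whose vertices
all happen to be shared could in principle still lie on the physical
boundary):

  #include <set>
  #include <vector>

  // Sketch: classify a facet that has only one local cell neighbour.
  // If at least one of its vertices is not shared with another process,
  // it must lie on the global (physical) boundary where BCs apply;
  // if all vertices are shared, it is treated as an inter-process
  // (partition) boundary facet.
  bool on_global_boundary(const std::vector<unsigned int>& facet_vertices,
                          const std::set<unsigned int>& shared_vertices)
  {
    for (unsigned int i = 0; i < facet_vertices.size(); ++i)
      if (shared_vertices.count(facet_vertices[i]) == 0)
        return true;
    return false;
  }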
But why do you need this? All you need is the dof map, and PETSc will
take care of assembling entries on the boundaries of partitions.
>
>>>> Modify the assembly process with a mapping function which maps
>>>> dof map indices from local to global prior to updating the global
>>>> tensor. Implement the call in class Assembler using functions from
>>>> the Mesh class.
>>> It might be enough to modify UFCCell::update().
>>>
>
> Ok, I was thinking about something similar to the previously discussed
> pdofmap approach (src/sandbox/passembly).
>
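A minimal sketch of the mapping step being discussed, applied to the
tabulated element dofs just before the global tensor update
(local_to_global and the call site are illustrative assumptions, not the
existing dof map interface):

  #include <vector>

  // Sketch: translate element dof indices from the process-local
  // numbering to the global numbering before adding the element tensor
  // into the distributed matrix/vector.
  void map_dofs(const std::vector<unsigned int>& local_to_global,
                unsigned int* dofs, unsigned int num_dofs)
  {
    for (unsigned int i = 0; i < num_dofs; ++i)
      dofs[i] = local_to_global[dofs[i]];
  }

Whether this lives in UFCCell::update(), in the dof map classes or in
the Assembler is the open question above.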
This is the point which is most pressing. Magnus has mesh partitioning
and distribution working (which can be refined later to be fully
distributed), so to really get moving with parallel assembly we need to
sort out the dof mapping. The second priority is then making sure that
the Function class works properly in parallel.
Garth
> Niclas
>
>>>> Change PETSc data types to MPI (PETScMatrix, PETScVector).
>>>> Change PETSc solver environment to use the correct MPI
>>>> communicator
>>>> (All PETSc solver classes).
>>> We need to determine whether to use MPI or Seq PETSc types depending
>>> on whether we are running in parallel.
>> We have types for this like AIJ and the default Vec.
>>
>> Matt
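That is, with the generic types PETSc picks the Seq or MPI implementation
from the size of the communicator, so one code path can cover both the
serial and the parallel case; a small sketch (N is the global size):

  #include <petscmat.h>
  #include <petscvec.h>

  /* Sketch: "aij" and the standard Vec type resolve to the sequential
     implementations on one process and to the MPI implementations on
     several, so the same creation code works in serial and parallel. */
  static PetscErrorCode create_objects(PetscInt N, Mat* A, Vec* x)
  {
    PetscErrorCode ierr;
    PetscFunctionBegin;

    ierr = MatCreate(PETSC_COMM_WORLD, A); CHKERRQ(ierr);
    ierr = MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, N, N); CHKERRQ(ierr);
    ierr = MatSetType(*A, MATAIJ); CHKERRQ(ierr);      /* seqaij or mpiaij */

    ierr = VecCreate(PETSC_COMM_WORLD, x); CHKERRQ(ierr);
    ierr = VecSetSizes(*x, PETSC_DECIDE, N); CHKERRQ(ierr);
    ierr = VecSetType(*x, VECSTANDARD); CHKERRQ(ierr); /* seq or mpi */

    PetscFunctionReturn(0);
  }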
>>
>>> --
>>> Anders
>>
>
>
>
_______________________________________________
DOLFIN-dev mailing list
[email protected]
http://www.fenics.org/mailman/listinfo/dolfin-dev