Anders Logg wrote:
> On Fri, Dec 07, 2007 at 03:04:12PM +0100, Anders Logg wrote:
>> On Wed, Dec 05, 2007 at 06:18:53PM +0100, Johan Hoffman wrote:
>>>> Just some comments on the strategy.
>>>>
>>>> On Dec 5, 2007 7:50 AM, Anders Logg <[EMAIL PROTECTED]> wrote:
>>>>> On Mon, Dec 03, 2007 at 11:44:44AM +0100, Niclas Jansson wrote:
>>>>>
>>>>>> It's a different strategy that uses point-to-point instead of
>>>>>> collective communication. However, the plan for parallel assembly
>>>>>> should be more or less the same.
>>>>>>
>>>>>> I attached the more detailed TODO list; it should explain the
>>>>>> necessary changes to the Mesh classes.
>>>>>>
>>>>>> Niclas
>>>>>> Modify the mesh representation to store both local and global indices
>>>>>> for each cell/vertex. Implement mesh functions to map between local
>>>>>> and global indices. The local indices correspond to the current
>>>>>> cell and vertex indices; only the mapping functions must be added to
>>>>>> the Mesh class.
>>>>> I don't think we should store both local and global indices for mesh
>>>>> entities. All we need is to store the mapping from local to global
>>>>> indices. We can use MeshFunctions for this but it's not necessary.
>>>>>
>>>>> My suggestion would be to add a new class MeshNumbering (maybe someone
>>>>> can suggest a better name) which would store the numbering scheme in a
>>>>> set of (dim + 1) arrays:
>>>>>
>>>>>   class MeshNumbering
>>>>>   {
>>>>>   public:
>>>>>
>>>>>     ...
>>>>>
>>>>>   private:
>>>>>
>>>>>     uint** numbering;
>>>>>
>>>>>   };
>>>>>
>>>>> One array for each dimension, so numbering[0][i] would return the
>>>>> global number for the vertex with local index i.
>>>>>
>>>>> We can also add an easy-access function for global numbers to
>>>>> MeshEntity so that (for example) e.number() would return the global
>>>>> number of an entity e.
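
A minimal sketch of how such a class could look, assuming DOLFIN's uint
typedef (the constructor, destructor and operator() accessor here are
illustrative, not existing API):

    // Sketch: stores, for each topological dimension d = 0..dim,
    // the mapping from local entity index to global index.
    class MeshNumbering
    {
    public:

      MeshNumbering(uint dim) : dim(dim), numbering(new uint*[dim + 1])
      {
        for (uint d = 0; d <= dim; d++)
          numbering[d] = 0; // filled in during partitioning/reading
      }

      ~MeshNumbering()
      {
        for (uint d = 0; d <= dim; d++)
          delete [] numbering[d];
        delete [] numbering;
      }

      // Global number of entity i of topological dimension d
      uint operator() (uint d, uint i) const
      { return numbering[d][i]; }

    private:

      uint dim;
      uint** numbering;
    };
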
>>> I think the motivation for locally storing the global index is to keep the
>>> algorithms fully distributed.
>>> As far as I understand, with a
>>> MeshNumbering solution as you outline it, you would store this data on one
>>> processor? Or do I misunderstand this? For very large problems we want all
>>> algorithms to be fully distributed.
>> I mean storing the indices locally on each processor (for its part of
>> the mesh) but storing them in global arrays on those processors. There
>> is no way to store a value locally at each vertex etc. since those
>> objects (vertices, cells and in general MeshEntity) don't exist. Only
>> the arrays exist, and the objects are views that get created by the
>> iterators.
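
For illustration, with the proposed easy-access function the global
numbers would then be reachable through the usual iterators, e.g. (a
fragment only; number() is the proposed method, not existing API):

    // Fragment (sketch): print the local-to-global vertex mapping
    // on this processor
    for (VertexIterator v(mesh); !v.end(); ++v)
      std::cout << "local " << v->index()
                << " -> global " << v->number() << std::endl;
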
>>
>>>> I will just point out that I think this is very limiting. You can
>>>> argue that it covers what you want to do, but it is quite inflexible
>>>> compared with having names. It is an incredible pain in the ass to
>>>> rebalance (or do anything else complicated, like AMR) if you rely on
>>>> offsets (numberings) rather than names. I recommend (as we do) using
>>>> names until you have exactly the mesh you want, and then reducing to
>>>> offsets. This is implemented manually in Sieve right now (you call a
>>>> method), but I am trying to automate it with code generation.
>>> I am not sure what you mean by "names". But I guess you mean that
>>> you want to work with different closures of a set of mesh entities (like
>>> cells, for example)? If that's what you mean, then I would agree.
>>>
>>>>>> Adapt mesh reading for the new representation: store mesh data based
>>>>>> on the number of local cells/vertices instead of parsed numbers. This
>>>>>> modification allows processors to read different parts of the mesh
>>>>>> in parallel, making an initial distribution step unnecessary.
>>>>>>
>>>>>> Loading meshes in parallel should increase efficiency, reduce
>>>>>> costly communication and save memory for large-scale problems, given
>>>>>> that the parallel environment has a shared file system that can
>>>>>> handle the load. However, the serial distribution should still be
>>>>>> implemented to support environments without shared file systems.
>>>>>>
>>>>>>  Modifications for the new representation should be implemented in
>>>>>>  the XMLMesh class. Functions for initial mesh distribution should be
>>>>>>  implemented in a new class.
>>>>> For this, we should add optional data to the mesh format, such that
>>>>> the current file format still works. If additional data is present,
>>>>> then that is read into MeshNumbering, otherwise it is empty.
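
Purely as an illustration, the optional data could sit in the existing
XML mesh format along these lines (the <numbering> tag and its attribute
names are made up for the example):

    <mesh celltype="tetrahedron" dim="3">
      <vertices size="..."> ... </vertices>
      <cells size="..."> ... </cells>
      <!-- optional: local-to-global numbering, read into MeshNumbering;
           omitted in serial files, which then keep working unchanged -->
      <numbering>
        <vertex local="0" global="1024"/>
        ...
      </numbering>
    </mesh>
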
>>> Yes, that is natural (that the serial/regular mesh format should still
>>> work without modifications). Although I am not sure that the
>>> MeshNumbering approach is a good solution.
>>>
>>>>> (When I think of it, MeshNumbering may not be a good choice of name
>>>>> for the new class, since it may be confused with MeshOrdering which
>>>>> does something different but related.)
>>>>>
>>>>>> Change the mesh partitioning library to ParMETIS. Modify the
>>>>>> partitioning class to work on distributed data, add the necessary
>>>>>> calls to ParMETIS and redistribute the local vertices/cells according
>>>>>> to the result. Since ParMETIS can partition a mesh directly using an
>>>>>> internal mesh-to-graph translation, it is possible to have the
>>>>>> partitioning directly in the MeshPartition class. However, both
>>>>>> methods could easily be implemented and compared against each other.
>>>>> We don't want to change from SCOTCH to ParMETIS, but we could add
>>>>> support for using METIS/ParMETIS as an option.
>>> Yes, that is right. The implementation of the basic algorithms should
>>> look more or less the same for SCOTCH and ParMETIS, and then we could add
>>> extra optional functionality based on ParMETIS until we can find
>>> corresponding functionality in SCOTCH.
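
For reference, the direct route through ParMETIS's mesh interface could
look roughly as follows (a sketch against the ParMETIS 3.x C API for
tetrahedra; the helper name and the array set-up are assumptions, and
error handling is omitted):

    #include <vector>
    #include <mpi.h>
    #include <parmetis.h>

    // Sketch: partition the local cells of a distributed tetrahedral
    // mesh using ParMETIS's built-in mesh-to-graph translation.
    // elmdist: global cell distribution over processes (size nprocs + 1)
    // eptr/eind: CSR-style local cell-to-vertex connectivity
    // part: output, one partition number per local cell
    void partition_cells(idxtype* elmdist, idxtype* eptr, idxtype* eind,
                         int num_parts, idxtype* part, MPI_Comm comm)
    {
      int wgtflag = 0;       // no cell weights
      int numflag = 0;       // C-style numbering from 0
      int ncon = 1;          // one balance constraint
      int ncommonnodes = 3;  // neighbors share a face (3 vertices for tets)
      int options[3] = {0, 0, 0};
      int edgecut = 0;

      std::vector<float> tpwgts(ncon*num_parts, 1.0f/num_parts);
      std::vector<float> ubvec(ncon, 1.05f);

      ParMETIS_V3_PartMeshKway(elmdist, eptr, eind, NULL,
                               &wgtflag, &numflag, &ncon, &ncommonnodes,
                               &num_parts, &tpwgts[0], &ubvec[0],
                               options, &edgecut, part, &comm);
    }
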
>>>
>>>> Have you thought about generalizing the partitioning to hypergraphs?
>>>> I just did this so I can partition faces (for FVM) and it was not
>>>> that bad. I use Zoltan from Sandia.
>>> Sounds interesting. I do not know about this, but it may be worth
>>> looking at.
>>>
>>>>>> Finish the implementation of the mesh communication class
>>>>>> MPIMeshCommunicator.
>>>>>> Add functionality for single vertex and cell communication needed
>>>>>> for mesh partitioning.
>>>>> What do you mean by single vertex and cell communication? Also note
>>>>> that it is not enough to communicate indices for vertices and
>>>>> cells. Sometimes we also need to communicate edges and faces.
>>>> That is why you should never explicitly refer to vertices and cells,
>>>> but rather communicate the entire closure and star of each element
>>>> you send. That is the point of the mesh structure: to avoid this kind
>>>> of special-purpose coding.
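
As a rough point-to-point illustration (a fragment only; number() is the
proposed global-number accessor, the packing scheme, dest and tag are
made up, and edges/faces are left out):

    // Sketch: pack a cell as [global cell number, global vertex numbers]
    // and ship it to a neighboring process.
    std::vector<unsigned int> buffer;
    buffer.push_back(cell.number());
    for (VertexIterator v(cell); !v.end(); ++v)
      buffer.push_back(v->number());
    MPI_Send(&buffer[0], (int) buffer.size(), MPI_UNSIGNED,
             dest, tag, MPI_COMM_WORLD);
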
>>>>
>>>>>> Adapt boundary calculation to work on distributed meshes. Use
>>>>>> knowledge about which vertices are shared among processors to decide
>>>>>> if an edge is global or local. Implement the logic directly in the
>>>>>> BoundaryComputation class using information from the mesh
>>>>>> partitioning.
>>>>> I'm not sure I understand this point.
>>> If you only have a part of the mesh, as is the case for a fully
>>> distributed mesh, then you need to know if your Boundary (of the local
>>> Mesh) is an internal boundary, across which you communicate with the
>>> neighboring partitions, or an external boundary, on which you apply
>>> boundary conditions.
>> ok, good point.
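
The test could then be as simple as the following sketch (shared_vertex
is a hypothetical predicate filled in during partitioning; in reality
edges/faces would have to be handled consistently as well):

    // Sketch: a boundary facet lies on the partition (internal) boundary
    // iff all of its vertices are shared with some other process.
    bool on_partition_boundary = true;
    for (VertexIterator v(facet); !v.end(); ++v)
      if (!shared_vertex(v->index()))
      {
        on_partition_boundary = false;
        break;
      }
    // true:  communicate with the neighboring partition
    // false: physical boundary, apply boundary conditions
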
> 
> We also need to be able to assemble over interior facets (in DG
> methods). There we iterate over all interior facets and for each facet
> pick its two neighboring cells, so those two cells need to be
> available. So if we want to partition along a given facet, do we
> include the two cells on either side in both subdomains so that both
> cells are available from both subdomains?
> 

I would say yes. This is how FD methods typically work in parallel, with
a layer of ghost cells along each partition boundary.
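
A sketch of the facet assembly loop such a ghost layer enables, using
the MeshEntity connectivity interface (the interior/exterior test via
the number of incident cells is the usual convention, assumed here):

    // Sketch: with both neighbors of every local interior facet present,
    // DG facet assembly needs no extra communication.
    const uint D = mesh.topology().dim();
    for (FacetIterator facet(mesh); !facet.end(); ++facet)
    {
      if (facet->numEntities(D) != 2)
        continue;  // exterior facet: only one incident cell

      Cell cell0(mesh, facet->entities(D)[0]);
      Cell cell1(mesh, facet->entities(D)[1]);
      // ... assemble the facet contribution from cell0 and cell1 ...
    }
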

Garth