Anders Logg wrote:
> On Fri, Aug 22, 2008 at 09:11:37AM +0200, Niclas Jansson wrote:
>> Anders Logg wrote:
>>> On Thu, Aug 21, 2008 at 11:14:09AM +0200, Niclas Jansson wrote:
>>>> Anders Logg wrote:
>>>>> On Thu, Aug 21, 2008 at 09:10:03AM +0200, Niclas Jansson wrote:
>>>>>> Anders Logg wrote:
>>>>>>> On Wed, Aug 20, 2008 at 06:17:30PM +0200, Niclas Jansson wrote:
>>>>>>>
>>>>>>>>>> Stage 2 seems to involve a lot of communication, with small messages.
>>>>>>>>>> I think it would be more efficient if the stage were reorganized such
>>>>>>>>>> that all messages could be exchanged "at once", in a couple of larger
>>>>>>>>>> messages.
>>>>>>>>> That would be nice. I'm very open to suggestions.
>>>>>>>> If I understand the {T, S, F} overlap correctly, a facet could be
>>>>>>>> globally identified by the value of F(facet).
>>>>>>> No, F(facet) would be the local number of the facet in subdomain
>>>>>>> S(facet).
>>>>>>>
>>>>>>>> If so, one suggestion is to buffer N_i and F(facet) in 0...p-1 buffers
>>>>>>>> (one for each processor) and exchange these during stage 2.
>>>>>>>>
>>>>>>>> -- stage 1
>>>>>>>>
>>>>>>>> for each facet f \in T
>>>>>>>>   j = S_i(f)
>>>>>>>>   if j > i
>>>>>>>>
>>>>>>>>     -- calculate dof N_i
>>>>>>>>
>>>>>>>>     buffer[S_i(f)].add(N_i)
>>>>>>>>     buffer[S_i(f)].add(F_i(f))
>>>>>>>>   end
>>>>>>>> end
>>>>>>>>
>>>>>>>>
>>>>>>>> -- stage 2
>>>>>>>>
>>>>>>>> -- Exchange shared dofs with a fancy MPI_Allgatherv or a lookalike
>>>>>>>> -- MPI_Sendrecv loop.
>>>>>>>>
>>>>>>>> for j = 1 to (num processors - 1)
>>>>>>>>   src = (rank - j + num processors) % num processors
>>>>>>>>   dest = (rank + j) % num processors
>>>>>>>>
>>>>>>>>   MPI_Sendrecv(dest, buffer[dest], src, recv_buffer)
>>>>>>>>
>>>>>>>>   for i = 0 to size of recv_buffer, i += 2
>>>>>>>>     -- update facet recv_buffer(i+1) with dof value in recv_buffer(i)
>>>>>>>>   end
>>>>>>>> end
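
For reference, the ring exchange in stage 2 above could look roughly like
this in C++/MPI (just a sketch, not DOLFIN code; the function and buffer
names are made up, and message sizes are exchanged first so the receive
buffers can be allocated):

#include <mpi.h>
#include <vector>

// Sketch: exchange the per-destination buffers in a ring, so each process
// performs p - 1 paired send/receives instead of many small messages.
// buffers[q] holds the values destined for process q.
void exchange_buffers(std::vector<std::vector<int> >& buffers,
                      std::vector<std::vector<int> >& recv_buffers,
                      MPI_Comm comm)
{
  int rank, num_processes;
  MPI_Comm_rank(comm, &rank);
  MPI_Comm_size(comm, &num_processes);
  recv_buffers.resize(num_processes);

  for (int j = 1; j < num_processes; ++j)
  {
    const int dest = (rank + j) % num_processes;
    const int src  = (rank - j + num_processes) % num_processes;

    // Exchange message sizes first so the receiving side can allocate.
    int send_size = static_cast<int>(buffers[dest].size());
    int recv_size = 0;
    MPI_Sendrecv(&send_size, 1, MPI_INT, dest, 0,
                 &recv_size, 1, MPI_INT, src, 0,
                 comm, MPI_STATUS_IGNORE);

    // Exchange the actual data ({dof, local facet number} pairs).
    recv_buffers[src].resize(recv_size);
    MPI_Sendrecv(send_size > 0 ? &buffers[dest][0] : 0, send_size, MPI_INT, dest, 1,
                 recv_size > 0 ? &recv_buffers[src][0] : 0, recv_size, MPI_INT, src, 1,
                 comm, MPI_STATUS_IGNORE);
  }
}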
>>>>>>> I didn't look at this in detail (yet). Is it still valid with the
>>>>>>> above interpretation of F(facet)?
>>>>>>>
>>>>>> Yes, I think so.
>>>>> I think I understand your point, but I don't understand the details
>>>>> of your code.
>>>> If j > i, the processor is responsible for creating M_i for the shared
>>>> facet. The newly created M_i is placed in the send buffer for the
>>>> subdomain S_i(f), together with the local facet number in that subdomain.
>>>>
>>>> So the send buffers contain tuples {M_i, F_i(f)}; since there is one
>>>> buffer for each subdomain, one can be sure that F_i(f) is valid on the
>>>> receiving processor.
>>>>
>>>> Instead of iterating over all processors and facets in stage 2, each
>>>> processor receives a set of tuples (for all shared facets) from each of
>>>> the other processors. These can then be used to identify the local facet
>>>> (since F_i(f) is the local facet number) and assign the dofs, which, if I
>>>> understand everything correctly, are obtained from M_i.
>>>>
>>>> One modification to the above algorithm: I think it's easier if the
>>>> tuples are stored as {F_i(f), M_i}, since M_i could be a long list of
>>>> dofs. The update loop would then be something similar to
>>>>
>>>> for i = 0 to size of recv_buff, i += (number of dofs on each facet + 1)
>>>>   local facet f = recv_buff(i)
>>>>   for each dof on f, loop counter j
>>>>     assign recv_buff((i+1) + j) to facet dof j
>>>>   end
>>>> end
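
The receive-side update could then look something like this in C++ (again
only a sketch; facet_dofs is a stand-in for whatever structure records the
dofs of a local facet and is assumed to be pre-sized to dofs_per_facet
entries per facet):

#include <cstddef>
#include <vector>

// The receive buffer is a flat sequence of records
// {F_i(f), dof_0, ..., dof_{n-1}}, where n = dofs_per_facet.
void update_shared_dofs(const std::vector<int>& recv_buffer,
                        int dofs_per_facet,
                        std::vector<std::vector<int> >& facet_dofs)
{
  const std::size_t record_size = dofs_per_facet + 1;
  for (std::size_t i = 0; i + record_size <= recv_buffer.size(); i += record_size)
  {
    const int local_facet = recv_buffer[i];   // F_i(f) on the receiving process
    for (int j = 0; j < dofs_per_facet; ++j)
      facet_dofs[local_facet][j] = recv_buffer[i + 1 + j];   // dofs from M_i
  }
}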
>>>>
>>>>> The mapping N_i is an auxiliary global-to-global mapping, which maps
>>>>> the global dofs on a local mesh to global dofs on the global mesh. It
>>>>> has a meaning only on each local mesh. What we want to communicate is
>>>>> the stuff in M_i.
>>>> I see, then it should be M_i in the outlined code.
>>>>
>>>> Niclas
>>> Sounds very good.
>>>
>>> Where do we start?
>>>
>>> I guess one good place to start would be to get input/partitioning
>>> working and you seem to have that working already. We should be able
>>> to read in a mesh, partition it (with ParMetis for now) and construct
>>> the MeshFunctions S_i and F_i.
>>>
>>> Once that is in place, we can start hacking on DofMapBuilder::build().
>>>
>>> Could you outline what you have in terms of input/partitioning and
>>> then we can start applying those patches.
>>>
>> Parallel mesh parsing: the entire mesh is never represented on a single
>> processor. It's a two-stage process: first the coordinates are loaded
>> and partitioned with a geometric partitioner. In the second stage, each
>> processor loads the cells corresponding to the assigned coordinates, and
>> finally the mesh is partitioned with a graph partitioner.
>>
>> Partitioning is done with the distributed geometric and mesh-to-dual
>> graph partitioners in ParMetis.
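
For concreteness, the two ParMetis stages could be driven roughly as
follows (a sketch only: the wrapper function and argument set-up are
simplified and assumed, and the idx_t/real_t types assume a recent
ParMETIS header rather than the older idxtype API):

#include <cstddef>
#include <vector>
#include <mpi.h>
#include <parmetis.h>

// Sketch of the two partitioning stages (array construction omitted).
void partition_sketch(idx_t* vtxdist, real_t* xyz, idx_t ndims,
                      idx_t* elmdist, idx_t* eptr, idx_t* eind,
                      idx_t ncommonnodes, idx_t nparts,
                      idx_t* vertex_part, idx_t* cell_part, MPI_Comm comm)
{
  // Stage 1: geometric partition of the locally read coordinates
  // (fast, but gives a poor edge-cut).
  ParMETIS_V3_PartGeom(vtxdist, &ndims, xyz, vertex_part, &comm);

  // ... redistribute coordinates, then read the cells referencing them ...

  // Stage 2: mesh-to-dual graph partition of the cells, minimizing edge-cut.
  idx_t wgtflag = 0, numflag = 0, ncon = 1, edgecut = 0;
  idx_t options[3] = {0, 0, 0};
  std::vector<real_t> tpwgts(ncon*nparts, real_t(1.0)/nparts);
  real_t ubvec[1] = {static_cast<real_t>(1.05)};
  ParMETIS_V3_PartMeshKway(elmdist, eptr, eind, NULL, &wgtflag, &numflag,
                           &ncon, &ncommonnodes, &nparts, &tpwgts[0], ubvec,
                           options, &edgecut, cell_part, &comm);
}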
>
> How does this work? It seems that if you have already partitioned the
> domain geometrically, there's no need to make a partition based on
> topology. Or is the geometric partitioning only a first approximation,
> after which vertices are exchanged between processors?
>
> --
> Anders
>
The coordinates in the XML file are initially distributed across the
processors with a simple linear distribution, just to get them off the
hard drive.

The geometric partitioner is used to make sure that all coordinates on a
processor lie close to each other. This simplifies and minimizes
communication when parsing cell data.

Since the geometric partitioner creates rather bad partitions (with
respect to edge-cut), the graph partitioner is then used to create a good
partitioning with low edge-cut, thus minimizing
data dependencies/communication.
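
The initial linear distribution is just the obvious arithmetic; something
like this (a sketch, not the actual parser code):

#include <algorithm>
#include <cstddef>
#include <utility>
#include <mpi.h>

// Contiguous "linear" distribution of num_vertices vertices over the
// processes in comm; returns the half-open local range [begin, end).
std::pair<std::size_t, std::size_t>
local_range(std::size_t num_vertices, MPI_Comm comm)
{
  int rank, size;
  MPI_Comm_rank(comm, &rank);
  MPI_Comm_size(comm, &size);

  const std::size_t n = num_vertices / size;  // base chunk size
  const std::size_t r = num_vertices % size;  // first r processes get one extra
  const std::size_t begin = n*rank + std::min<std::size_t>(rank, r);
  const std::size_t end = begin + n + (static_cast<std::size_t>(rank) < r ? 1 : 0);
  return std::make_pair(begin, end);
}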
Niclas
>
>> Mesh distribution, or more correctly redistribution (since the mesh is
>> always distributed), moves vertices between processors and constructs a
>> new mesh with the MeshEditor. Since processors share vertices in the
>> overlap, I use the concept of ghosted vertices to decide which
>> processor should be responsible for redistributing a vertex.
>>
>>
>> Niclas
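
As a rough illustration of how ghosting can decide responsibility (the
exact rule is only an assumption here, not necessarily the one in the
patches): let the lowest-ranked process sharing a vertex redistribute it.

#include <algorithm>
#include <vector>

// Assumed rule: a shared (ghosted) vertex is redistributed by the
// lowest-ranked process among those holding a copy of it.
bool responsible_for_vertex(int my_rank, const std::vector<int>& sharing_ranks)
{
  // sharing_ranks lists every process (including my_rank) that has the vertex.
  return my_rank == *std::min_element(sharing_ranks.begin(), sharing_ranks.end());
}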
_______________________________________________
DOLFIN-dev mailing list
[email protected]
http://www.fenics.org/mailman/listinfo/dolfin-dev