Anders Logg wrote:
> On Thu, Aug 21, 2008 at 09:10:03AM +0200, Niclas Jansson wrote:
>> Anders Logg wrote:
>>> On Wed, Aug 20, 2008 at 06:17:30PM +0200, Niclas Jansson wrote:
>>>
>>>>>> Stage 2 seems to involve a lot of communication, with small messages.
>>>>>> I think it would be more efficient if the stage were reorganized such
>>>>>> that all messages could be exchanged "at once", in a couple of larger
>>>>>> messages.
>>>>> That would be nice. I'm very open to suggestions.
>>>> If I understand the {T, S, F} overlap correctly, a facet could be globally
>>>> identified by the value of F(facet).
>>> No, F(facet) would be the local number of the facet in subdomain S(facet).
>>>
>>>> If so, one suggestion is to buffer N_i and F(facet) in 0...p-1 buffers
>>>> (one for each processor) and exchange these during stage 2.
>>>>
>>>> -- stage 1
>>>>
>>>>   for each facet  f \in T
>>>>     j = S_i(f)
>>>>     if j > i
>>>>
>>>>         -- calculate dof N_i
>>>>
>>>>         buffer[S_i(f)].add(N_i)
>>>>         buffer[S_i(f)].add(F_i(f))
>>>>     end
>>>>   end
>>>>
>>>>
>>>> -- stage 2
>>>>
>>>> -- Exchange shared dofs with fancy MPI_Allgatherv or a lookalike
>>>> -- MPI_SendRecv loop.
>>>>
>>>>    for j = 1 to j = (num processors - 1)
>>>>       src = (rank - j + num processors) % num processors
>>>>       dest = (rank + j) % num processors
>>>>
>>>>       MPI_SendRecv(dest, buffer[dest], src, recv_buffer)
>>>>
>>>>       for i = 0 to sizeof(recv_buffer), i += 2
>>>>          --update facet recv_buff(i+1) with dof value in  recv_buff(i)
>>>>       end
>>>>
>>>>    end
>>> I didn't look at this in detail (yet). Is it still valid with the
>>> above interpretation of F(facet)?
>>>
>> Yes, I think so.
> 
> I think I understand your point, but I don't understand the details
> of your code.

If j > i, the processor is responsible for creating M_i for the shared 
facet. The newly created M_i is placed in the send buffer for subdomain 
S_i(f), together with the local facet number in that subdomain.

So the send buffers contain tuples {M_i, F_i(f)}. Since there is one 
buffer for each subdomain, one can be sure that F_i(f) is valid on the 
receiving processor.
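
For concreteness, a rough C++ sketch of how I picture the stage 1 packing 
(the vectors S, F and M and the function name below are just stand-ins for 
the overlap data, not the actual DOLFIN structures; with the modification 
suggested further down, F_i(f) would simply be pushed before the dofs):

  #include <vector>

  // Pack the tuple {M_i, F_i(f)} for every shared facet this process is
  // responsible for (S_i(f) > rank) into one send buffer per destination.
  std::vector<std::vector<int> > pack_shared_facets(
      const std::vector<int>& shared_facets,   // local indices of shared facets
      const std::vector<int>& S,               // S_i(f): owning subdomain
      const std::vector<int>& F,               // F_i(f): facet number in S_i(f)
      const std::vector<std::vector<int> >& M, // M_i: dofs computed for facet f
      int rank, int num_processes)
  {
    std::vector<std::vector<int> > send_buffers(num_processes);
    for (unsigned int k = 0; k < shared_facets.size(); ++k)
    {
      const int f = shared_facets[k];
      const int j = S[f];
      if (j > rank)  // this process creates M_i for the shared facet
      {
        // Dofs first, then the facet number that is valid on process j
        send_buffers[j].insert(send_buffers[j].end(), M[f].begin(), M[f].end());
        send_buffers[j].push_back(F[f]);
      }
    }
    return send_buffers;
  }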

Instead of iterating over all processors and facets in stage 2, each 
processor receives a set of tuples (for all shared facets) from each of 
the other processors. These can then be used to identify the local facet 
(since F_i(f) is the local facet number) and assign the dofs, which, if I 
understand everything correctly, are obtained from M_i.
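
The exchange itself could then be done with an MPI_Sendrecv loop, something 
like the sketch below (again just an illustration, not DOLFIN code; since 
the receiver does not know the incoming sizes, the sizes are exchanged first):

  #include <mpi.h>
  #include <vector>

  // Ring exchange of the per-process buffers: send buffer[dest] to dest and
  // receive the corresponding buffer from src, sizes first.
  void exchange_buffers(std::vector<std::vector<int> >& send_buffers,
                        std::vector<std::vector<int> >& recv_buffers)
  {
    int rank = 0, num_processes = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &num_processes);
    recv_buffers.assign(num_processes, std::vector<int>());

    for (int j = 1; j < num_processes; ++j)
    {
      const int src  = (rank - j + num_processes) % num_processes;
      const int dest = (rank + j) % num_processes;

      // Exchange buffer sizes so the receive buffer can be allocated
      int send_size = static_cast<int>(send_buffers[dest].size());
      int recv_size = 0;
      MPI_Sendrecv(&send_size, 1, MPI_INT, dest, 0,
                   &recv_size, 1, MPI_INT, src, 0,
                   MPI_COMM_WORLD, MPI_STATUS_IGNORE);

      // Exchange the actual tuples
      recv_buffers[src].resize(recv_size);
      int* sendbuf = send_size > 0 ? &send_buffers[dest][0] : 0;
      int* recvbuf = recv_size > 0 ? &recv_buffers[src][0] : 0;
      MPI_Sendrecv(sendbuf, send_size, MPI_INT, dest, 1,
                   recvbuf, recv_size, MPI_INT, src, 1,
                   MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
  }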

One modification to the above algorithm: I think it's easier if the 
tuples are stored as {F_i(f), M_i}, since M_i could be a long list of 
dofs. The update loop would then be something like

  for i = 0 to size of recv_buff, i += (number of dofs on each facet + 1)
     local facet f = recv_buff(i)
     for each dof on f, loop counter j
        assign recv_buff((i + 1) + j) to facet dof j
     end
  end
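
In C++ that update could look roughly like this (facet_dofs is just a 
stand-in for wherever the dofs of a local facet end up, not an actual 
DOLFIN structure):

  #include <vector>

  // Unpack {F_i(f), M_i} tuples: each tuple starts with the local facet
  // number, followed by the dofs for that facet.
  void update_facet_dofs(const std::vector<int>& recv_buffer,
                         unsigned int dofs_per_facet,
                         std::vector<std::vector<int> >& facet_dofs)
  {
    const unsigned int stride = dofs_per_facet + 1;
    for (unsigned int i = 0; i + stride <= recv_buffer.size(); i += stride)
    {
      const int f = recv_buffer[i];                  // F_i(f): local facet here
      for (unsigned int j = 0; j < dofs_per_facet; ++j)
        facet_dofs[f][j] = recv_buffer[i + 1 + j];   // dof j from M_i
    }
  }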

> 
> The mapping N_i is an auxiliary global-to-global mapping, which maps
> the global dofs on a local mesh to global dofs on the global mesh. It
> has a meaning only on each local mesh. What we want to communicate is
> the stuff in M_i.

I see. Then it should be M_i in the outlined code.

Niclas

> 
> Should we try to implement this? It will essentially be Algorithm 5++
> (Algorithm 5 with your improvements). So we don't store a global
> numbering of mesh entities but instead compute a global dof map in
> parallel. And we store the overlap as MeshData in some way (a set of
> MeshFunctions attached to each local mesh). I'm very open to which set
> of MeshFunctions we will need, either just S, F or additional data we
> might need.
> 
> Other opinions? Garth? Ola?
> 
> --
> Anders
> 
