On Nov 25, 2011, at 4:22 PM, Jed Brown wrote:

> On Fri, Nov 25, 2011 at 13:07, Mark F. Adams <mark.adams at columbia.edu> 
> wrote:
> OK, that's a good start.  You do need two communication steps: 1) get offsets, 
> 2) get data (right?).
> 
> Yes, to send variable-sized data, you need two steps. For constant-sized 
> data, it is just one.
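
A minimal sketch of that two-step exchange with MPI one-sided calls (the two
windows, the (offset, count) metadata layout, and the function name here are
hypothetical, not PETSc code):

#include <mpi.h>
#include <stdlib.h>

/* Step 1: pull the (offset, count) metadata for one remote slot.
 * Step 2: pull the variable-length payload once its size is known.
 * count_win exposes an int pair per slot; data_win exposes the packed data. */
void fetch_variable_sized(MPI_Win count_win, MPI_Win data_win,
                          int owner, MPI_Aint remote_slot,
                          double **buf_out, int *len_out)
{
  int meta[2];                            /* meta[0] = offset, meta[1] = count */
  MPI_Win_lock(MPI_LOCK_SHARED, owner, 0, count_win);
  MPI_Get(meta, 2, MPI_INT, owner, 2 * remote_slot, 2, MPI_INT, count_win);
  MPI_Win_unlock(owner, count_win);       /* completes the first get */

  double *buf = (double *) malloc(meta[1] * sizeof(double));
  MPI_Win_lock(MPI_LOCK_SHARED, owner, 0, data_win);
  MPI_Get(buf, meta[1], MPI_DOUBLE, owner, meta[0], meta[1], MPI_DOUBLE, data_win);
  MPI_Win_unlock(owner, data_win);        /* completes the second get */

  *buf_out = buf;
  *len_out = meta[1];
}

The owner only has to expose its offset/count table and its packed data; the
ghosting side drives both steps.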
>  
>  You avoid packing the send data, but you need to unpack the receive data, 
> converting the raw graph data into the reduced ghost data.  Not simple, but 
> writing from one side only, with only one-to-one communication, is 
> attractive.
> 
> That depends a bit on what representation you want for the ghost data. You 
> never "unpack" in a by-rank way, since the data gets put into the correct 
> part of the domain according to where an MPI_Datatype says to put it. (This 
> was part of the local setup when defining the MPI_Get() calls, but those are 
> internal to the library.)
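
A small sketch of that datatype idea, assuming the ghost slots are scattered
through a local array while the owner's data is contiguous; all names here are
made up for illustration:

#include <mpi.h>

/* An indexed datatype on the origin side lets MPI_Get scatter the fetched
 * values straight into the ghost slots of the local array, so there is no
 * by-rank unpacking loop afterwards. */
void get_into_ghosts(double *local, const int ghost_idx[], int nghost,
                     int owner, MPI_Aint remote_off, MPI_Win win)
{
  MPI_Datatype ghost_type;
  MPI_Type_create_indexed_block(nghost, 1, ghost_idx, MPI_DOUBLE, &ghost_type);
  MPI_Type_commit(&ghost_type);

  MPI_Win_lock(MPI_LOCK_SHARED, owner, 0, win);
  MPI_Get(local, 1, ghost_type,                   /* origin: scattered ghost slots */
          owner, remote_off, nghost, MPI_DOUBLE,  /* target: contiguous run */
          win);
  MPI_Win_unlock(owner, win);

  MPI_Type_free(&ghost_type);
}

The placement is encoded in the datatype rather than in a hand-written unpack
loop, which is what makes the receive side disappear.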
> 
> If you are increasing the level of ghosting, you probably want to number the 
> new ghost nodes that you acquired and make the local data structure point 
> directly at them instead of at the (rank, offset) at which they are owned. 
> This is a global-to-local translation of the connectivity, so you can do it 
> with a hash table or sorting. I think you can also skip sorting by pushing 
> the new indices back through the global space.
> 
> That is, I effectively have two local-to-global maps, both pointing at the 
> same global space: L1toG and L2toG. I want to translate some local 
> information defined on L1 into L2. I can do this by fetching global indices 
> and inverting the local part of the map, or (with the assistance of some 
> memory provided by the global owner) I can do a gather L2 -> G* (G* has a 
> slot for each process sharing via this map) and then get those indices back 
> in L1 using scatter G* -> L1. I think the hash is usually better, but with a 
> fast network and/or highly vectorized hardware, there might be an advantage 
> to pushing the indices L2 -> G* -> L1 without the need for hashing or sorting.
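
A sketch of the sort-based variant of that translation (a hash table would do
the same job); the GlobalRef struct and the array names are invented for
illustration:

#include <stdlib.h>

typedef struct { int rank, offset, local; } GlobalRef;

static int cmp_ref(const void *a, const void *b)
{
  const GlobalRef *x = a, *y = b;
  if (x->rank != y->rank) return x->rank - y->rank;
  return x->offset - y->offset;
}

/* map[i] = (rank, offset) of new ghost i; conn[] holds (rank, offset) pairs
 * whose .local field gets filled in with the ghost's new local number. */
void global_to_local(GlobalRef *map, int nghost, GlobalRef *conn, int nconn)
{
  for (int i = 0; i < nghost; i++) map[i].local = i;   /* number the new ghosts */
  qsort(map, nghost, sizeof(GlobalRef), cmp_ref);      /* sort by (rank, offset) */
  for (int j = 0; j < nconn; j++) {
    GlobalRef *hit = bsearch(&conn[j], map, nghost, sizeof(GlobalRef), cmp_ref);
    if (hit) conn[j].local = hit->local;               /* point directly at the ghost */
  }
}

Here bsearch over the sorted map plays the role of the hash; a real hash table
would make each lookup O(1) instead of O(log nghost).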
> 
> 
> So keep going.  AMR is good, but I'm not sure I'm up to specifying a complete 
> unstructured AMR algorithm at the moment.  My other example, AMG coarse grid 
> process aggregation, is another option....
> 
> Or perhaps a parallel maximal independent set algorithm, preferably my 
> algorithm, which is implemented in GAMG and documented in an old paper.
> 
> MIS is straightforward. To define the parallel graph, let's say that ghosting 
> processes have the (owner rank, index) of their neighbors. An owned point is 
> only a candidate for selection if all of its neighbors have the same owner 
> rank or a smaller one (according to Rule 2 of your paper).
> 
> while (candidates remain):
>     forall v in owned_candidates with neighbor ranks smaller than p:

This is not quite right, and I want to see how complex or cumbersome this 
one-sided model gets once all the details are in.

You could write the loop as:

for all v in owned_candidates
  skip = false
  for all neighbors q of v
    if q is selected
      delete v
      skip = true; break
    else if q.proc > myproc and q is not deleted
      skip = true; break
    endif
  end for
  if not skip then
    select v
    delete all neighbors q of v
    broadcast v's status ????
  endif
end for
reduce all statuses to owners ???

I thought that in your one-sided model you could not broadcast or push data.  
You can only pull data, and only want to pull data.  But you have a broadcast???

Mark

>         mark v as selected, mark neighbors(v) as deleted (in local space)
>     reduce status to owner
>     broadcast status to ghosters
> 
> In any round, multiple procs might have marked the vertex for deletion, but 
> only if the vertex has a neighbor with larger owner rank, in which case it 
> could not have been selected.
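
For what it's worth, a rough C rendering of one local round of the loop above;
the Vertex struct, the state values, and the elided reduce/broadcast steps are
placeholders, not GAMG's actual data structures:

enum { CANDIDATE, SELECTED, DELETED };

typedef struct {
  int state;
  int owner_rank;   /* rank that owns this point */
  int nneigh;
  int *neigh;       /* local indices of neighbors (owned or ghost) */
} Vertex;

/* One local sweep over owned candidates; returns the number selected. */
int mis_local_round(Vertex *v, int nowned, int myrank)
{
  int nselected = 0;
  for (int i = 0; i < nowned; i++) {
    if (v[i].state != CANDIDATE) continue;
    int skip = 0;
    for (int k = 0; k < v[i].nneigh; k++) {
      Vertex *q = &v[v[i].neigh[k]];
      if (q->state == SELECTED) {                       /* a neighbor won */
        v[i].state = DELETED; skip = 1; break;
      } else if (q->owner_rank > myrank && q->state != DELETED) {
        skip = 1; break;                                /* defer to the larger rank */
      }
    }
    if (skip) continue;
    v[i].state = SELECTED;
    nselected++;
    for (int k = 0; k < v[i].nneigh; k++)
      v[v[i].neigh[k]].state = DELETED;                 /* delete neighbors locally */
  }
  /* ... reduce ghost statuses to their owners, broadcast owner statuses back
   * to ghosting processes, and repeat while candidates remain ... */
  return nselected;
}

The two trailing comments are exactly the reduce/broadcast (or pull-only
equivalent) steps being debated above; the local sweep itself is the easy part.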
