On Nov 25, 2011, at 4:22 PM, Jed Brown wrote:

> On Fri, Nov 25, 2011 at 13:07, Mark F. Adams <mark.adams at columbia.edu> wrote:
>
>> OK, that's a good start. You do need two communication steps: 1) get
>> offsets, 2) get data (right?).
>
> Yes, to send variable-sized data, you need two steps. For constant-sized
> data, it is just one.
>
>> You avoid packing the send data, but you need to unpack the receive data,
>> converting the raw graph data into the reduced ghost data. Not simple, but
>> writing from one side only and with only one-to-one communication is
>> attractive.
>
> That depends a bit on what representation you want for the ghost data. You
> never "unpack" in a by-rank way, since the data gets put into the correct
> part of the domain according to where an MPI_Datatype says to put it. (This
> was part of the local setup when defining the MPI_Get() calls, but those
> are internal to the library.)
>
> If you are increasing the level of ghosting, you probably want to number
> the new ghost nodes that you acquired and make the local data structure
> point directly at them instead of at the (rank, offset) at which they are
> owned. This is a global-to-local translation of the connectivity, so you
> can do it with a hash table or by sorting. I think you can also skip
> sorting by pushing the new indices back through the global space.
>
> That is, I effectively have two local-to-global maps, both pointing at the
> same global space: L1toG and L2toG. I want to translate some local
> information defined on L1 into L2. I can do this by fetching global indices
> and inverting the local part of the map, or (with the assistance of some
> memory provided by the global owner) I can do a gather L2 -> G* (G* has a
> slot for each process sharing via this map) and then get those indices back
> in L1 using a scatter G* -> L1. I think the hash is usually better, but
> with a fast network and/or highly vectorized hardware, there might be an
> advantage to pushing the indices L2 -> G* -> L1 without the need for
> hashing or sorting.
>
>> So keep going. AMR is good, but I'm not sure I'm up to specifying a
>> complete unstructured AMR algorithm at the moment. My other example of AMG
>> coarse grid process aggregation is another....
>>
>> Or perhaps a parallel maximal independent set algorithm, preferably my
>> algorithm, which is implemented in GAMG and documented in an old paper.
>
> MIS is straightforward. To define the parallel graph, let's say that
> ghosting processes have the (owner rank, index) of their neighbors. An
> owned point is a candidate for selection only if all of its neighbors have
> the same rank or smaller (according to Rule 2 of your paper).
>
> while (candidates remain):
>   forall v in owned_candidates with neighbor ranks smaller than p:
This is not quite right, and I want to see how complex or cumbersome this
one-sided model will get once all the details are in. You could say:

loop:
  for all owned_candidates v
    for all neighbors q of v
      if q is selected
        delete v
        skip v
      else if q.proc > myproc and q is not deleted
        skip v
      endif
    end for
    if not skip v then
      select v
      delete all q
      broadcast v status ????
    endif
  end for
  reduce all status to owners ???

I thought that in your one-sided model you could not broadcast or push data;
you can only pull data, and only want to pull data. But you have a
broadcast???

Mark

>     mark v as selected, mark neighbors(v) as deleted (in local space)
>   reduce status to owner
>   broadcast status to ghosters
>
> In any round, multiple procs might have marked the vertex for deletion, but
> only if the vertex has a neighbor with a larger owner rank, in which case
> it could not have been selected.
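For concreteness, here is a minimal sketch of the two-step fetch from the top
of the thread (get offsets, then get data), written with plain MPI-2 one-sided
calls and fence synchronization. The CSR-style window layout and all names
(win_off, fetch_ghost_rows, ...) are illustrative assumptions, not the
interface being designed:

/* Sketch: two-step fetch of variable-sized rows with MPI-2 one-sided calls.
 * Owners expose two windows over their part of the graph, both assumed to
 * have been created with disp_unit = sizeof(int):
 *   win_off : rowstart[0..nowned], CSR-style offsets into adj[]
 *   win_adj : adj[], the concatenated neighbor lists                       */
#include <mpi.h>
#include <stdlib.h>

void fetch_ghost_rows(MPI_Win win_off, MPI_Win win_adj,
                      int nghost, const int *owner, const int *remote_index,
                      int **ghost_adj, int **ghost_len)
{
  int i, total = 0;
  int (*range)[2] = malloc(nghost * sizeof *range); /* [start,end) per row */

  /* Step 1: pull the two offsets that bracket each remote row. */
  MPI_Win_fence(0, win_off);
  for (i = 0; i < nghost; i++)
    MPI_Get(range[i], 2, MPI_INT, owner[i], remote_index[i], 2, MPI_INT,
            win_off);
  MPI_Win_fence(0, win_off);

  /* Only now are the sizes known, so allocate the receive space. */
  *ghost_len = malloc(nghost * sizeof **ghost_len);
  for (i = 0; i < nghost; i++) {
    (*ghost_len)[i] = range[i][1] - range[i][0];
    total += (*ghost_len)[i];
  }
  *ghost_adj = malloc(total * sizeof **ghost_adj);

  /* Step 2: pull the variable-sized rows themselves. */
  MPI_Win_fence(0, win_adj);
  for (i = 0, total = 0; i < nghost; i++) {
    MPI_Get(*ghost_adj + total, (*ghost_len)[i], MPI_INT,
            owner[i], range[i][0], (*ghost_len)[i], MPI_INT, win_adj);
    total += (*ghost_len)[i];
  }
  MPI_Win_fence(0, win_adj);
  free(range);
}

Because MPI_Win_fence is collective over the window, every rank participates
in both epochs even when it has nothing to fetch; the two fence-bounded
epochs are exactly the two communication steps counted above.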
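On the broadcast question, one way to read "reduce status to owner" and
"broadcast status to ghosters" is as two pull-only epochs: owners get the
proposed statuses that ghosters expose, and ghosters then get the owner's
combined result, so nothing is ever pushed. A rough sketch under that
reading; every array and name here is an assumed product of the setup phase,
not anyone's proposed API:

/* Sketch: the round-end "reduce status to owner / broadcast status to
 * ghosters" step written with pulls only.  Both windows are assumed to have
 * been created with disp_unit = sizeof(int).                               */
#include <mpi.h>
#include <stdlib.h>

enum { CANDIDATE = 0, SELECTED = 1, DELETED = 2 };

void exchange_status(MPI_Win win_ghost,  /* exposes each rank's ghost_status[] */
                     MPI_Win win_owned,  /* exposes each rank's owned_status[] */
                     int ncopy, const int *copy_rank, const int *copy_slot,
                     const int *copy_vertex, int *owned_status,
                     int nghost, const int *owner_rank, const int *owner_index,
                     int *ghost_status)
{
  int i, *prop = malloc(ncopy * sizeof *prop);

  /* "Reduce to owner" as pulls: fetch each ghost copy's proposed status,
   * then combine locally once the epoch has closed.                       */
  MPI_Win_fence(0, win_ghost);
  for (i = 0; i < ncopy; i++)
    MPI_Get(&prop[i], 1, MPI_INT, copy_rank[i], copy_slot[i], 1, MPI_INT,
            win_ghost);
  MPI_Win_fence(0, win_ghost);
  for (i = 0; i < ncopy; i++)
    if (prop[i] > owned_status[copy_vertex[i]])  /* max(): DELETED wins */
      owned_status[copy_vertex[i]] = prop[i];

  /* "Broadcast to ghosters" as pulls: fetch the owner's authoritative
   * status; again one-to-one gets, no pushes anywhere.                    */
  MPI_Win_fence(0, win_owned);
  for (i = 0; i < nghost; i++)
    MPI_Get(&ghost_status[i], 1, MPI_INT, owner_rank[i], owner_index[i],
            1, MPI_INT, win_owned);
  MPI_Win_fence(0, win_owned);
  free(prop);
}

The max() combine is only safe because of the invariant in the quoted
paragraph: several ranks may propose DELETED for the same vertex, but such a
vertex has a larger-rank neighbor and so could never have been SELECTED, so
SELECTED and DELETED never compete.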