On Jun 10, 2012, at 5:54 PM, Mark F. Adams wrote:

> Barry,
> 
> Your algorithm is the classic CS algorithm.  My algorithms exploits the (low 
> dimensional) geometric nature of our graphs.  Assuming you have a decent 
> partitioning, as Jed says, you have, for instance, "interior" vertices that 
> have no external dependence so coloring them is silly.

    Sure absolutely in the MPI world. If you are going for vectorization it may 
not be silly.

>  The algorithm enforces only the (external) data dependencies that are 
> required for correctness, which for normal PDE problems results in a far 
> shorter dependency depth than generic coloring and leaves you some freedom to 
> reorder for performance.
> 
> From a theoretically point of view there is actually a coloring in the 
> algorithm but it is not obvious.  For instance, if there is only one vertex 
> per process then it basically reverts to a coloring algorithm (although it is 
> still asynchronous) where the processor IDs (in the way that I implement it 
> for simplicity) are the "random" numbers that one can use for an MIS or 
> coloring. 

   I think it would be useful to understand the algorithm exactly in terms of 
"block colorings?", "dependency graphs", ..... etc?


    Barry

> 
> Mark
> 
> On Jun 10, 2012, at 1:36 PM, Jed Brown wrote:
> 
>> On Sun, Jun 10, 2012 at 12:30 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>   For clarification it sounds like you color your matrix (with independent 
>> sets) and then march through the colors, each color gets smoothed (in 
>> parallel since they are independent) then on to the next color?
>> 
>> Not independent sets, just by data dependency. For example, everyone relaxes 
>> a block along the right boundary (implying an ordering that visits those 
>> blocks in rank order, except that those strips never touch each other so 
>> they can be done in parallel), then send the updated state to the proc on 
>> the right. While waiting for the message, he might smooth another set that 
>> does not have a data dependency. When a process receives the first message 
>> (from the proc to the left), it can relax that part. There is a picture in 
>> Mark's paper that should make the algorithm clear. This approach doesn't 
>> suffer from the atrocious memory access pattern of classical coloring; it's 
>> also asynchronous and completes with fewer messages in total.
> 


Reply via email to