On 9 Sep 2010, at 17:00, Gus Correa wrote:

> Hello All
> 
> Gabrielle's question, Ashley's recipe, and Dick Treumann's cautionary words, 
> may be part of a larger context of load balance, or not?
> 
> Would Ashley's recipe of sporadic barriers be a silver bullet to
> improve load imbalance problems, regardless of which collectives or
> even point-to-point calls are in use?

No, it only helps where there is no data dependency between some of the ranks. 
In particular, if there are any non-rooted collectives in an iteration of your 
code then it cannot make any difference at all.  Likewise, if you have, for 
example, a reduce followed by a broadcast using the same root, then you already 
have global synchronisation each iteration and it won't help.  My feeling is 
that it applies to a significant minority of problems.  Certainly the phrase 
"adding barriers can make codes faster" should be textbook stuff if it isn't 
already.

> Would sporadic barriers in the flux coupler "shake up" these delays?

I don't fully understand your description, but it sounds like it might set the 
program back to a clean slate, which would give you per-iteration delays only 
rather than cumulative or worse delays.

> Ashley:  How did you get to the magic number of 25 iterations for the
> sporadic barriers?

Experience and finger in the air.  The major factors in picking this number are 
the likelihood of a positive feedback cycle of delays developing, the extra 
delay such a cycle adds, and the cost of a barrier itself.  Too low a value 
(barriers too often) will slightly reduce performance; too high a value 
(barriers too rarely) can drastically reduce it.

As a further item (because I like them), the asynchronous barrier is better 
still if used properly.  In the good case it never causes any process to block, 
so the only cost is the CPU cycles the call itself takes.  In the bad case, 
where it has to delay a rank, the delay tends to have a positive impact on 
performance.
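
One possible reading of the asynchronous-barrier idea is sketched below.  It 
assumes a non-blocking barrier such as MPI_Ibarrier, which was standardised in 
MPI-3 after this message was written, so it may not match the implementation 
meant here.  Each rank starts a non-blocking barrier every barrier_interval 
iterations and waits on the previous one only when it is about to start the 
next.  FakeComm and DoneRequest are hypothetical stand-ins that record calls; 
with mpi4py, comm.Ibarrier() returns a request object with a Wait() method.

```python
class DoneRequest:
    """Hypothetical stand-in for a request that has already completed."""

    def __init__(self, log):
        self.log = log

    def Wait(self):
        self.log.append("wait")


class FakeComm:
    """Hypothetical stand-in for a communicator; it records the calls made."""

    def __init__(self):
        self.log = []

    def Ibarrier(self):
        self.log.append("ibarrier")
        return DoneRequest(self.log)


def run_with_async_barrier(comm, n_iterations, step, barrier_interval=25):
    pending = None
    for i in range(1, n_iterations + 1):
        step(i)
        if i % barrier_interval == 0:
            if pending is not None:
                # Good case: every rank reached the previous barrier long
                # ago, so this returns at once.  Bad case: this rank has run
                # ahead and is held until the slowest rank catches up.
                pending.Wait()
            pending = comm.Ibarrier()  # start the next barrier, do not block
    if pending is not None:
        pending.Wait()  # complete the last barrier before returning


if __name__ == "__main__":
    comm = FakeComm()
    run_with_async_barrier(comm, 75, step=lambda i: None)
    print(comm.log)
```

Under this scheme a rank can run at most about one interval ahead of the 
slowest rank before it is held back.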

> Would it be application/communicator pattern dependent?

Absolutely.

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk

