Jeff Squyres wrote:
On Dec 12, 2008, at 11:46 AM, Eugene Loh wrote:
FWIW, I've run into the need for this a few times when running HPCC tests
on large (>100 MPI procs) nodes or multicore systems. HPCC (among
other things) looks at the performance of a single process while all
other np-1 processes spinwait -- or of a single pingpong pair while
all other np-2 processes wait. I'm not 100% sure what's going on,
but I'm guessing that the hard spinning of waiting processes hits
the memory system or some other resource, degrading the performance
of working processes. This is on nodes that are not oversubscribed.
I guess I could <waving hands> see how shmem kinds of communication
could lead to this kind of bottleneck, and that increasing core
counts would magnify the effect. It would be good to understand if
shmem activity is the cause of the slowdown to know if this is a
good data point in the rationale for whether we should do blocking progress
(or, more specifically, whether we need to increase the priority of
implementing blocking progress).
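
To make sure we're talking about the same scenario, here is roughly the
pattern (a minimal sketch, not the actual HPCC code; the ranks and the
"done" message are made up for illustration):

    #include <mpi.h>
    #include <stdio.h>

    /* np-1 ranks wait for a "done" message while rank 0 does timed work.
     * With the default aggressive/polling progress, the waiting ranks
     * spin on their shared-memory queues the whole time rank 0 works. */
    int main(int argc, char **argv)
    {
        int rank, np, done = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &np);

        if (0 == rank) {
            double t0 = MPI_Wtime();
            /* ... timed kernel (or a pingpong with rank 1) ... */
            printf("work took %f s with %d ranks spin-waiting\n",
                   MPI_Wtime() - t0, np - 1);
            for (int i = 1; i < np; ++i) {
                MPI_Send(&done, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
            }
        } else {
            /* "Idle", but MPI_Recv polls, so these ranks stay busy. */
            MPI_Recv(&done, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }
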
I don't understand all of what's going on here, but I/we've seen this
sort of "catastrophic degradation" on two large (>100 processes) nodes
of rather different architectures. Prototypes indicate that either
blocking *or* directed polling addresses the problem, but those
are preliminary findings that are not backed up by sound understanding
of what's going on under the hood. Yes, still handwaving.
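
FWIW, here is roughly what the two workarounds amount to, as a sketch from
the user level rather than the actual prototype code. "Blocking" means
backing off (yield or sleep) between progress attempts instead of spinning
flat out; "directed polling" means the receiver checks only the sender it
actually expects instead of sweeping all np-1 incoming queues on every
pass -- that part lives inside the shared-memory BTL, so the specified
source below is only the API-level analogue.

    #include <mpi.h>
    #include <time.h>

    /* Sketch: wait for a message without spinning flat out.  MPI_Test
     * drives progress; nanosleep backs off between attempts so an
     * "idle" rank stops hammering the memory system.  Real blocking
     * progress would live inside the library; the effect is similar. */
    static void backoff_recv(void *buf, int count, MPI_Datatype type,
                             int src, int tag, MPI_Comm comm)
    {
        MPI_Request req;
        int flag = 0;
        struct timespec ts = { 0, 100000 };     /* 100 us back-off */

        MPI_Irecv(buf, count, type, src, tag, comm, &req);
        MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
        while (!flag) {
            nanosleep(&ts, NULL);               /* instead of spinning */
            MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
        }
    }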