On Sep 30, 2008, at 2:51 PM, Rayson Ho wrote:

Restarting this discussion. A new update release of Grid Engine 6.2
will come out early next year [1], and I really hope that we can get
at least the interface defined.

Great!

At a minimum, is it enough for the batch system to tell OpenMPI, via
an env variable, which core (or virtual core, in the SMT case) to bind
the first MPI task to? I guess an added bonus would be information
about the number of processors to skip (the stride) between sibling
tasks. A stride of one is the usual case, but something larger than
one would allow the batch system to control the level of cache and
memory bandwidth sharing between the MPI tasks...
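
To make the start/stride proposal above concrete, here is a minimal sketch of how a launcher might turn those two values into one core per local task. The variable names BATCH_FIRST_CORE and BATCH_CORE_STRIDE are hypothetical; nothing in this thread defines them, and neither Grid Engine nor Open MPI is known to use them.

```python
def cores_for_tasks(first_core, stride, num_tasks):
    """Return one core index per local MPI task under a start/stride scheme."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    # Task 0 gets first_core, task 1 gets first_core + stride, and so on.
    return [first_core + rank * stride for rank in range(num_tasks)]


def cores_from_env(env, num_tasks):
    """Read the (hypothetical) variables from an environment-like mapping."""
    first_core = int(env["BATCH_FIRST_CORE"])
    # The stride defaults to 1, the usual case mentioned above.
    stride = int(env.get("BATCH_CORE_STRIDE", "1"))
    return cores_for_tasks(first_core, stride, num_tasks)
```

For example, cores_for_tasks(2, 2, 4) places four tasks on cores 2, 4, 6 and 8. Note that this scheme can only describe evenly spaced cores.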

Wouldn't it be better to give us a specific list of cores to bind to? As core counts go up in servers, I think we may see a re-emergence of having multiple MPI jobs on a single server. And as core counts go even *higher*, then fragmentation of available cores over time is possible/likely.

Would you be giving us a list of *relative* cores to bind to (i.e., "bind to the Nth online core on the machine" -- which may be different than the OS's ID for that processor) or will you be giving us the actual OS virtual processor ID(s) to bind to?
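
To illustrate the difference between the two options, here is a rough sketch that parses an explicit core list and, if the entries are relative, maps "the Nth online core" to an OS processor ID. The comma-separated format and both function names are assumptions made for illustration only; the actual interface is exactly what is still being discussed here.

```python
def parse_core_list(spec):
    """Parse a comma-separated core list such as "0,2,5" into integers."""
    return [int(item) for item in spec.split(",") if item.strip()]


def relative_to_os_ids(relative_cores, online_os_ids):
    """Map relative indices (Nth online core) to OS processor IDs."""
    # Sort so that index N always means the Nth-lowest online OS ID.
    ordered = sorted(online_os_ids)
    return [ordered[n] for n in relative_cores]
```

With OS IDs 0, 1, 4, 5, 8 and 9 online, the relative list "0,2,5" resolves to OS IDs 0, 4 and 9, so the two interpretations only agree when the online IDs are contiguous and start at zero. An explicit list like this can also describe fragmented core sets that a start/stride pair cannot.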

--
Jeff Squyres
Cisco Systems
