carto is intended primarily to be a discoverer and provider of topology information. How various parts of the OMPI code base use that information is a separate issue.

With regards to processor affinity, there are two general ways of doing it:

1. The resource manager tells us which processors have been allocated to us. E.g., it provides some environment variables saying which processors/cores/whatever have been allocated to us on a per-host basis (i.e., in the environment of the launched applications, and therefore possibly different on every host). Open MPI then decides how to split up the allocated host processors amongst all the Open MPI processes on that host.

It would be great if SGE could provide some environment variables to us.

2. The resource manager does all the processor affinity itself. SLURM, for example, has a nice command line syntax for all kinds of processor affinity stuff in their "srun" command. A traditional roadblock to this has been that OMPI currently uses the resource manager to launch a single "orted" process on each node, and then that orted, in turn, launches all the MPI processes locally. However, there is work progressing to remove this roadblock. If I try to describe it, I'm sure I'll get it wrong :-) -- Ralph / IU?

-----

Open MPI will need to be able to tell the difference between #1 and #2. So it might be good if the RM always provides the environment variables, but has those variables tell us whether the RM did the affinity pinning or not. I.e., in #1, you'll get information about all the processors that are available, and all the processes on a single host will get the same information. In #2, each process will get individualized information about where it has been pinned.

Make sense?



On Jan 11, 2008, at 6:22 AM, Pak Lui wrote:

Hi Rayson,

I guess this is an issue only for SGE. I believe a framework called 'carto' is being developed to represent the node-socket relationship in order to address the multicore issue. There are other folks on the team actively working on it, so they can probably address it better than I can. Here are some descriptions of it on the wiki:

https://svn.open-mpi.org/trac/ompi/wiki/OnHostTopologyDescription

Rayson Ho wrote:
Hello,

I'm from the Sun Grid Engine (SGE) project (
http://gridengine.sunsource.net ). I am working on processor affinity
support for SGE.

In 2005, we had some discussions on the SGE mailing list with Jeff on
this topic. Now that quad-core processors are available from AMD and Intel,
and higher core counts per socket are coming soon, I would like to see
what we can do to come up with a simple interface for the SGE 6.2
release, which will be available in Q2 this year (or at least in an
"update" release of SGE 6.2 if we can't get the changes in on time).

The discussions we had before:
http://gridengine.sunsource.net/servlets/BrowseList?list=dev&by=thread&from=7081
http://gridengine.sunsource.net/servlets/BrowseList?list=dev&by=thread&from=4803

I looked at the SGE code; the simplest thing we can do is set an
environment variable before we start each task group, telling the task
group the processor mask of the node. Is that good enough for Open MPI?

After reading the Open MPI code, I believe what we need to do is add an
else case in ompi/runtime/ompi_mpi_init.c:

if (ompi_mpi_paffinity_alone) {
    ...
} else {
    /* get processor affinity information from the batch system
       via the environment variable */
    ...
}

Thanks,
Rayson
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


--


- Pak Lui
pak....@sun.com


--
Jeff Squyres
Cisco Systems
