Re: [OMPI devel] Is it possible to get BTL transport work directly with MPI level

Jeff Squyres Wed, 4 Apr 2007 07:56:52 -0400

On Apr 3, 2007, at 3:07 PM, Li-Ta Lo wrote:

Well, that's a good question. At the moment, the only environments where we encounter multiple cores treat each core as a separate "slot" when they assign resources. We don't currently provide an option that says "map by two", so the only way to do what you describe would be to manually specify the mapping, slot by slot.
I also don't understand how paffinity works in this case. When orted
launches N processes on a node, does it have control over how those
processes are started and mapped to the cores/processors? Or is it
the case that the O.S. puts each process on whatever core it picks and
the paffinity module then tries to "pin" the process on that core (the
one picked by the O.S.)?


Check out these 3 FAQ entries:

http://www.open-mpi.org/faq/?category=tuning#paffinity-defs
http://www.open-mpi.org/faq/?category=tuning#maffinity-defs
http://www.open-mpi.org/faq/?category=tuning#using-paffinity

We *only* have 1 lame way of doing paffinity right now -- we start pinning processes to processors starting with processor ID 0.
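
For illustration only, the "start at processor ID 0" scheme can be sketched in Python as follows. This is not Open MPI code: the helper names `processor_for_local_rank` and `pin_self` are hypothetical, and the sketch assumes processor IDs are contiguous and begin at 0. `os.sched_setaffinity` is the Linux-only call from the Python standard library.

```python
import os

def processor_for_local_rank(local_rank, first_id=0):
    """Return the processor ID for the local_rank-th process on a node.

    Assumption: processor IDs are contiguous and start at first_id.
    """
    if local_rank < 0:
        raise ValueError("local_rank must be non-negative")
    return first_id + local_rank

def pin_self(local_rank):
    """Pin the calling process to its processor (Linux only)."""
    target = processor_for_local_rank(local_rank)
    # A pid of 0 means "the calling process".
    os.sched_setaffinity(0, {target})
    return target
```

With four processes on a node, local ranks 0 through 3 would land on processor IDs 0 through 3 in that order, regardless of how those IDs relate to sockets.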

If someone cares to suggest some alternative notation/option for requesting that kind of mapping flexibility, I'm certainly willing to implement it (it would be rather trivial to do "map by N", but might be more complicated if you want other things).
What is the current syntax of the config file/command line? Can we do
something like array indexing in scripting languages, e.g. [0:N:2]?

There is no syntax for the command line -- this is a discussion that we developers have gotten into deadlock over several times. It's a problem that we'd like to solve, but every time we talk about it, we deadlock and then move on to other higher-priority items. :-\

I take it to mean that "[0:N:2]" (ditching the [] would probably be good, because those would need to be escaped on the command line -- probably "--paffinity 0:N:2" or something would be sufficient) would be "start with core 0, end with core N, and step by 2 cores". Right?
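
As a rough sketch of that reading (start, inclusive end, step), the expansion could look like the Python below. The function name `parse_core_range` is hypothetical, no such option exists in Open MPI at this point, and N has to be a concrete integer for the sketch to work.

```python
def parse_core_range(spec):
    """Expand 'start:end[:step]' into a list of core IDs (end inclusive).

    Hypothetical syntax, following the "0:N:2" reading discussed above.
    """
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected start:end[:step], got {spec!r}")
    start, end = int(parts[0]), int(parts[1])
    # The step defaults to 1 when it is omitted.
    step = int(parts[2]) if len(parts) == 3 else 1
    if step <= 0 or start < 0 or end < start:
        raise ValueError(f"invalid range {spec!r}")
    return list(range(start, end + 1, step))
```

For example, `parse_core_range("0:6:2")` yields cores 0, 2, 4 and 6. Note that the result is still a list of bare core IDs, so it inherits whatever numbering the operating system uses.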

This is fine, and similar things have been suggested before. The problem with it is when you want to specify by socket, and not by core. Additionally, there can be an ambiguity in Linux -- core 0 is always the first core on the first socket. But where is core 1? It could be the 2nd core on the 1st socket, or it could be the 1st core on the 2nd socket -- it depends on BIOS settings (IIRC). Additionally, Solaris processor ID numbering does not necessarily start with 0, nor is it necessarily contiguous.
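
To make the "where is core 1?" ambiguity concrete, here is a small Python sketch of the two enumeration orders mentioned above. The function `enumerate_cores` and its `socket_major` flag are illustrative assumptions, not anything a BIOS or Open MPI exposes; real systems may use other orders entirely.

```python
def enumerate_cores(num_sockets, cores_per_socket, socket_major=True):
    """Return {os_core_id: (socket, core_on_socket)} for one of two orders.

    Both orders are illustrative; the actual numbering depends on the
    platform.
    """
    mapping = {}
    for os_id in range(num_sockets * cores_per_socket):
        if socket_major:
            # Fill socket 0 completely, then socket 1, and so on.
            socket, core = divmod(os_id, cores_per_socket)
        else:
            # Round-robin across sockets: ID 0 on socket 0, ID 1 on socket 1, ...
            core, socket = divmod(os_id, num_sockets)
        mapping[os_id] = (socket, core)
    return mapping
```

On a 2-socket, 2-cores-per-socket node, ID 0 is (socket 0, core 0) under both orders, but ID 1 is (socket 0, core 1) in the first and (socket 1, core 0) in the second, so a spec such as "0:N:2" selects different physical cores depending on the order in effect.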

So we probably need an OMPI-specific syntax that specifically calls out cores and sockets and doesn't rely on making assumptions about the underlying numbering/labeling (analogous to LAM's C/N notation).

But then the problem gets even harder, because we need to also mix this in with slots and nodes. I.e., what does --byslot and --bynode mean in conjunction with this syntax? Should they be illegal?

How can you specify a sequence of specific cores where you want processes to go if they're in an irregular pattern?


What does it mean to oversubscribe in these scenarios?

...these are some of the questions that we would debate about. We haven't really found a good syntax that answers all of them. Galen Shipman had a promising syntax at one point, but I've lost the specs of it... If you wander down to his office, he might be able to dig it up for you...?


--
Jeff Squyres
Cisco Systems
