To rhc,
Thanks for those suggestions.  Here are the results:
(1) Add "--oversubscribe" to mpirun cmd (I also added
    "--output-filename junk" -- see other output below).
Terminal output had this fairly usual error message (shortened):

Child job 2 terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.

mpirun detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:
  Process name: [[37749,2],0]
  Exit code:    1

    A file junk.2.000 (presumably stderr) was also written.  Its edited
contents follow (duplicate output from multiple nodes deleted):

[] PSM EP connect error (Endpoint could not
be reached):
[]  Node0
[]  Node0
[]  Node0
----A bunch of identical lines deleted----
[]  n0003
[]  n0003
[]  n0003
----A bunch of identical lines deleted----
[]  n0004
[]  n0004
[]  n0004
----A bunch of identical lines deleted----
[] [[37749,2],0] ORTE_ERROR_LOG: Error in
file dpm_orte.c at line 523
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[] Local abort before MPI_INIT completed
successfully; not able to aggregate error messages, and not able to
guarantee that all other processes were killed!

I note that these errors apparently occurred in MPI_Init, before
my attempt to spawn additional processes.

(2) Modify your MPI_INFO to be “host”, “node0:22” so it thinks there
    are more slots available
When I did this, since I actually try to spawn two processes,
I put "Node0:22" for the first one and "Node0:23" for the second
one.  The terminal shows only the following message, and no "junk"
files are written:
All nodes which are allocated for this job are already filled.

This is the same whether I have "slots=22 max-slots=22" or
"slots=21 max-slots=24" in the hostfile.
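
As a rough check of the slot arithmetic behind a "filled" message like this, here is a minimal sketch in C. It assumes a simplified by-slot model in which hosts are filled in hostfile order up to their "slots=" value; that is an assumption about the mapper, not a description of what Open MPI 1.10 actually does internally. The names `struct host`, `place_by_slot` and `free_slots_on` are hypothetical helpers made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* One hostfile line, e.g. "node0 slots=22 max-slots=22".
 * Hypothetical type for this sketch only. */
struct host {
    const char *name;
    int slots;  /* "slots=" value from the hostfile */
    int used;   /* processes placed so far by this model */
};

/* Simplified by-slot placement (an assumed model): fill each host up
 * to its slot count, in hostfile order, before moving to the next.
 * Returns the number of processes that could not be placed. */
int place_by_slot(struct host *hosts, int nhosts, int nprocs)
{
    for (int i = 0; i < nhosts && nprocs > 0; i++) {
        int room = hosts[i].slots - hosts[i].used;
        int take = nprocs < room ? nprocs : room;
        hosts[i].used += take;
        nprocs -= take;
    }
    return nprocs;
}

/* Free slots left on the named host, or -1 if it is not listed. */
int free_slots_on(const struct host *hosts, int nhosts, const char *name)
{
    for (int i = 0; i < nhosts; i++) {
        if (strcmp(hosts[i].name, name) == 0)
            return hosts[i].slots - hosts[i].used;
    }
    return -1;
}

/* Print one line per host showing used and total slots. */
void print_hosts(const struct host *hosts, int nhosts)
{
    for (int i = 0; i < nhosts; i++)
        printf("%s: %d of %d slots used\n",
               hosts[i].name, hosts[i].used, hosts[i].slots);
}
```

Under this model, loading the three "nodes120" lines (22, 40 and 56 slots) and calling `place_by_slot(hosts, 3, 1 + 116)` places every process but leaves `free_slots_on(hosts, 3, "node0")` at zero, with the only spare slot on n0004. That would be consistent with a spawn restricted to node0 being refused, although whether spawned jobs are counted against the hostfile this way is not confirmed here.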

(3) Using the MPI_INFO as in (2), I also tried adding "--bind-to core"
   to the mpirun line.  This may be the most interesting output:

WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node:  Node0

This usually is due to not having the required NUMA support installed
on the node. In some Linux distributions, the required support is
contained in the libnumactl and libnumactl-devel packages.
This is a warning only; your job will continue, though performance may
be degraded.
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to:     CORE
   Node:        Node0
   #processes:  2
   #cpus:       1

You can override this protection by adding the "overload-allowed"
option to your binding directive.

Indeed the packages mentioned are not installed.  I found some
discussion of this at
which claims this message should really be about "hwloc" which is
another thing I know nothing about.
Does any of this help or suggest something else to try?
George Reeke

On Fri, 2017-10-06 at 13:55 -0700, wrote:
> Couple of things you can try:
> * add --oversubscribe to your mpirun cmd line so it doesn’t care how many 
> slots there are
> * modify your MPI_INFO to be “host”, “node0:22” so it thinks there are more 
> slots available
> It’s possible that the “host” info processing has a bug in it, but this will 
> tell us a little more and hopefully get you running. If you want to bind 
> your processes to cores, then add “--bind-to core” to the cmd line
> > On Oct 6, 2017, at 1:35 PM, George Reeke wrote:
> > 
> > Dear colleagues,
> > I need some help controlling where a process spawned with
> > MPI_Comm_spawn goes.  I am in openmpi-1.10 under Centos 6.7.
> > My application is written in C, and I am running it on a RedBarn
> > system with a master node (hardware box) that connects to the
> > outside world and two other nodes connected to it via ethernet and
> > Infiniband.  There are two executable files, one (I'll call it
> > "Rank0Pgm") that expects to be rank 0 and does all the I/O and
> > the other ("RanknPgm") that only communicates via MPI messages.
> > There are two MPI_Comm_spawns that run just after MPI_Init and
> > an initial broadcast that shares some setup info, like this:
> > MPI_Comm_spawn("andmsg", argv, 1, MPI_INFO_NULL,
> >   hostid, commc, &commd, &sperr);
> > where "andmsg" is a program that needs to communicate with the
> > internet and with all the other processes via a new communicator
> > that will be called commd (and another name for the other one).
> >   When I run this program with no hostfile and an mpirun line
> > something like this on a node with 32 cores:
> > /usr/lib64/openmpi-1.10/bin/mpirun -n 1 Rank0Pgm : -n 28 RanknPgm \
> >   < InputFile
> > everything works fine.  I assume the spawns use 2 of the 3 available
> > cores that I did not ask the program to use.
> > 
> > Now I want to run on the full network, so I make a hostfile like this
> > (call it "nodes120"):
> > node0 slots=22 max-slots=22
> > n0003 slots=40 max-slots=40
> > n0004 slots=56 max-slots=56
> > where node0 has 24 cores and I am trying to leave room for my two
> > spawned processes.  The spawned processes have to be able to contact
> > the internet, so I make an MPI_INFO with MPI_Info_create and
> > MPI_Info_set(mpinfo, "host", "node0")
> > and change the MPI_INFO_NULL in the spawn calls to point to this
> > new MPI_Info.  (If I leave the MPI_INFO_NULL I get a different
> > error that is probably not of interest here.)
> > 
> > Now I run the mpirun like above except now with
> > "--hostfile nodes120" and "-n 116" after the colon.  Now I get this
> > error:
> > 
> > "There are not enough slots available in the system to satisfy the 1
> > slots that were requested by the application:
> >  andmsg
> > Either request fewer slots for your application, or make more slots
> > available for use."
> > 
> > I get the same error with "max-slots=24" on the first line of the
> > hosts file.
> > 
> > Sorry for the length of all that.  Request for help:  How do I set
> > things up to run my rank 0 program and enough copies of RanknPgm to fill
> > all but some number of cores on the master hardware node, and all the
> > other rank n programs on the other hardware "nodes" (boxes of CPUs)?
> > [My application will do best with the default "by slot" scheduling.]
> > 
> > Suggestions much appreciated.  I am quite convinced my code is OK
> > in that it runs OK as shown above on one hardware box.  Also runs
> > on my laptop with 4 cores and "-n 3 RanknPgm" so I guess I don't
> > even really need to reserve cores for the two spawned processes.
> > I thought of using old-fashioned 'fork' but I really want the
> > extra communicators to keep asynchronous messages separated.
> > The documentation says overloading is OK by default, so maybe
> > something else is wrong here.
> > 
> > George Reeke
> > 
