I see.  Okay.  So I just tried removing the check for universe size
and setting the universe size to 2.  Here's my output:

LD_LIBRARY_PATH=/home/budgeb/p4/pseb/external/lib.dev:/usr/local/lib
OMPI_MCA_orte_default_hostfile=`pwd`/hostsfile ./master_exe
[budgeb-interlagos:29965] [[4156,0],0] ORTE_ERROR_LOG: Fatal in file
base/plm_base_receive.c at line 253
[budgeb-interlagos:29963] [[4156,1],0] ORTE_ERROR_LOG: The specified
application failed to start in file dpm_orte.c at line 785

The corresponding run with mpirun still works.

Thanks,
  Brian

On Tue, Aug 28, 2012 at 2:46 PM, Ralph Castain <r...@open-mpi.org> wrote:
> I see the issue - it's here:
>
>>   MPI_Attr_get(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE, &puniverseSize, &flag);
>>
>>   if(!flag) {
>>       std::cerr << "no universe size" << std::endl;
>>       return -1;
>>   }
>>   universeSize = *puniverseSize;
>>   if(universeSize == 1) {
>>       std::cerr << "cannot start slaves... not enough nodes" << std::endl;
>>   }
>
> The universe size is set to 1 on a singleton because the attribute gets set 
> at the beginning of time - we haven't any way to go back and change it. The 
> sequence of events explains why. The singleton starts up and sets its 
> attributes, including universe_size. It also spins off an orte daemon to act 
> as its own private "mpirun" in case you call comm_spawn. At this point, 
> however, no hostfile has been read - the singleton is just an MPI proc doing 
> its own thing, and the orte daemon is just sitting there on "stand-by".
>
> When your app calls comm_spawn, then the orte daemon gets called to launch 
> the new procs. At that time, it (not the original singleton!) reads the 
> hostfile to find out how many nodes are around, and then does the launch.
>
> You are trying to check the number of nodes from within the singleton, which 
> won't work - it has no way of discovering that info.
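>
> If you just want the master to keep working when launched as a singleton, one
> workaround (a sketch only - the command-line handling and the "./slave_exe"
> path below are just illustrative placeholders) is to stop gating on
> MPI_UNIVERSE_SIZE and tell the master explicitly how many slaves to start.
> The orte daemon still reads the hostfile at spawn time and places the
> children across the listed nodes:
>
>   // master sketch: spawn a caller-specified number of slaves instead of
>   // deriving the count from MPI_UNIVERSE_SIZE (which is 1 for a singleton)
>   #include <mpi.h>
>   #include <cstdlib>
>   #include <iostream>
>   #include <vector>
>
>   int main(int argc, char **argv) {
>       MPI_Init(&argc, &argv);
>
>       // desired slave count comes from the command line, e.g. "./master_exe 2"
>       int nSlaves = (argc > 1) ? std::atoi(argv[1]) : 1;
>       if (nSlaves < 1) nSlaves = 1;
>
>       char slaveCmd[] = "./slave_exe";          // placeholder path
>       std::vector<int> errcodes(nSlaves, MPI_SUCCESS);
>       MPI_Comm everyone;
>       MPI_Comm_spawn(slaveCmd, MPI_ARGV_NULL, nSlaves, MPI_INFO_NULL,
>                      0, MPI_COMM_SELF, &everyone, &errcodes[0]);
>
>       // report any children that failed to launch
>       for (int i = 0; i < nSlaves; ++i)
>           if (errcodes[i] != MPI_SUCCESS)
>               std::cerr << "slave " << i << " failed to start" << std::endl;
>
>       MPI_Finalize();
>       return 0;
>   }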
>
>
>
>
> On Aug 28, 2012, at 2:38 PM, Brian Budge <brian.bu...@gmail.com> wrote:
>
>>> cat hostsfile
>> localhost
>> budgeb-sandybridge
>>
>> Thanks,
>>  Brian
>>
>> On Tue, Aug 28, 2012 at 2:36 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> Hmmm...what is in your "hostsfile"?
>>>
>>> On Aug 28, 2012, at 2:33 PM, Brian Budge <brian.bu...@gmail.com> wrote:
>>>
>>>> Hi Ralph -
>>>>
>>>> Thanks for confirming this is possible.  I'm trying this and currently
>>>> failing.  Perhaps there's something I'm missing in the code to make
>>>> this work.  Here are the two invocations and their outputs:
>>>>
>>>>> LD_LIBRARY_PATH=/home/budgeb/p4/pseb/external/lib.dev:/usr/local/lib 
>>>>> OMPI_MCA_orte_default_hostfile=`pwd`/hostsfile ./master_exe
>>>> cannot start slaves... not enough nodes
>>>>
>>>>> LD_LIBRARY_PATH=/home/budgeb/p4/pseb/external/lib.dev:/usr/local/lib 
>>>>> OMPI_MCA_orte_default_hostfile=`pwd`/hostsfile mpirun -n 1 ./master_exe
>>>> master spawned 1 slaves...
>>>> slave responding...
>>>>
>>>>
>>>> The code:
>>>>
>>>> //master.cpp
>>>> #include <mpi.h>
>>>> #include <boost/filesystem.hpp>
>>>> #include <iostream>
>>>> #include <cstring>   // memcpy
>>>> #include <alloca.h>  // alloca
>>>>
>>>> int main(int argc, char **args) {
>>>>   int worldSize, universeSize, *puniverseSize, flag;
>>>>
>>>>   MPI_Comm everyone; //intercomm
>>>>   boost::filesystem::path curPath =
>>>>       boost::filesystem::absolute(boost::filesystem::current_path());
>>>>
>>>>   std::string toRun = (curPath / "slave_exe").string();
>>>>
>>>>   int ret = MPI_Init(&argc, &args);
>>>>
>>>>   if(ret != MPI_SUCCESS) {
>>>>       std::cerr << "failed init" << std::endl;
>>>>       return -1;
>>>>   }
>>>>
>>>>   MPI_Comm_size(MPI_COMM_WORLD, &worldSize);
>>>>
>>>>   if(worldSize != 1) {
>>>>       std::cerr << "too many masters" << std::endl;
>>>>   }
>>>>
>>>>   MPI_Attr_get(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE, &puniverseSize, &flag);
>>>>
>>>>   if(!flag) {
>>>>       std::cerr << "no universe size" << std::endl;
>>>>       return -1;
>>>>   }
>>>>   universeSize = *puniverseSize;
>>>>   if(universeSize == 1) {
>>>>       std::cerr << "cannot start slaves... not enough nodes" << std::endl;
>>>>   }
>>>>
>>>>
>>>>   char *buf = (char*)alloca(toRun.size() + 1);
>>>>   memcpy(buf, toRun.c_str(), toRun.size());
>>>>   buf[toRun.size()] = '\0';
>>>>
>>>>   MPI_Comm_spawn(buf, MPI_ARGV_NULL, universeSize-1, MPI_INFO_NULL,
>>>>                  0, MPI_COMM_SELF, &everyone, MPI_ERRCODES_IGNORE);
>>>>
>>>>   std::cerr << "master spawned " << universeSize-1 << " slaves..."
>>>> << std::endl;
>>>>
>>>>   MPI_Finalize();
>>>>
>>>>   return 0;
>>>> }
>>>>
>>>>
>>>> //slave.cpp
>>>> #include <mpi.h>
>>>> #include <iostream>  // std::cerr
>>>>
>>>> int main(int argc, char **args) {
>>>>   int size;
>>>>   MPI_Comm parent;
>>>>   MPI_Init(&argc, &args);
>>>>
>>>>   MPI_Comm_get_parent(&parent);
>>>>
>>>>   if(parent == MPI_COMM_NULL) {
>>>>       std::cerr << "slave has no parent" << std::endl;
>>>>   }
>>>>   MPI_Comm_remote_size(parent, &size);
>>>>   if(size != 1) {
>>>>       std::cerr << "parent size is " << size << std::endl;
>>>>   }
>>>>
>>>>   std::cerr << "slave responding..." << std::endl;
>>>>
>>>>   MPI_Finalize();
>>>>
>>>>   return 0;
>>>> }
>>>>
>>>>
>>>> Any ideas?  Thanks for any help.
>>>>
>>>> Brian
>>>>
>>>> On Wed, Aug 22, 2012 at 9:03 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>> It really is just that simple :-)
>>>>>
>>>>> On Aug 22, 2012, at 8:56 AM, Brian Budge <brian.bu...@gmail.com> wrote:
>>>>>
>>>>>> Okay.  Is there a tutorial or FAQ for setting everything up?  Or is it
>>>>>> really just that simple?  I don't need to run a copy of the orte
>>>>>> server somewhere?
>>>>>>
>>>>>> If my current IP is 192.168.0.1:
>>>>>>
>>>>>> 0 > echo 192.168.0.11 > /tmp/hostfile
>>>>>> 1 > echo 192.168.0.12 >> /tmp/hostfile
>>>>>> 2 > export OMPI_MCA_orte_default_hostfile=/tmp/hostfile
>>>>>> 3 > ./mySpawningExe
>>>>>>
>>>>>> At this point, mySpawningExe will be the master, running on
>>>>>> 192.168.0.1, and I can spawn, for example, childExe on
>>>>>> 192.168.0.11 and 192.168.0.12?  Or childExe1 on 192.168.0.11 and
>>>>>> childExe2 on 192.168.0.12?
>>>>>>
>>>>>> Thanks for the help.
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>> On Wed, Aug 22, 2012 at 7:15 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>> Sure, that's still true on all 1.3 or above releases. All you need to 
>>>>>>> do is set the hostfile envar so we pick it up:
>>>>>>>
>>>>>>> OMPI_MCA_orte_default_hostfile=<foo>
>>>>>>>
>>>>>>>
>>>>>>> On Aug 21, 2012, at 7:23 PM, Brian Budge <brian.bu...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi.  I know this is an old thread, but I'm curious whether there are any
>>>>>>>> tutorials describing how to set this up.  Is this still available on
>>>>>>>> newer Open MPI versions?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Brian
>>>>>>>>
>>>>>>>> On Fri, Jan 4, 2008 at 7:57 AM, Ralph Castain <r...@lanl.gov> wrote:
>>>>>>>>> Hi Elena
>>>>>>>>>
>>>>>>>>> I'm copying this to the user list just to correct a mis-statement on 
>>>>>>>>> my part
>>>>>>>>> in an earlier message that went there. I had stated that a singleton 
>>>>>>>>> could
>>>>>>>>> comm_spawn onto other nodes listed in a hostfile by setting an 
>>>>>>>>> environmental
>>>>>>>>> variable that pointed us to the hostfile.
>>>>>>>>>
>>>>>>>>> This is incorrect in the 1.2 code series. That series does not allow
>>>>>>>>> singletons to read a hostfile at all. Hence, any comm_spawn done by a
>>>>>>>>> singleton can only launch child processes on the singleton's local 
>>>>>>>>> host.
>>>>>>>>>
>>>>>>>>> This situation has been corrected for the upcoming 1.3 code series. 
>>>>>>>>> For the
>>>>>>>>> 1.2 series, though, you will have to do it via an mpirun command line.
>>>>>>>>>
>>>>>>>>> Sorry for the confusion - I sometimes have too many code families to 
>>>>>>>>> keep
>>>>>>>>> straight in this old mind!
>>>>>>>>>
>>>>>>>>> Ralph
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 1/4/08 5:10 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hello Ralph,
>>>>>>>>>>
>>>>>>>>>> Thank you very much for the explanations.
>>>>>>>>>> But I still do not get it running...
>>>>>>>>>>
>>>>>>>>>> For the case
>>>>>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe
>>>>>>>>>> everything works.
>>>>>>>>>>
>>>>>>>>>> For the case
>>>>>>>>>> ./my_master.exe
>>>>>>>>>> it does not.
>>>>>>>>>>
>>>>>>>>>> I did:
>>>>>>>>>> - created my_hostfile and put it in $HOME/.openmpi/components/
>>>>>>>>>> my_hostfile:
>>>>>>>>>> bollenstreek slots=2 max_slots=3
>>>>>>>>>> octocore01 slots=8  max_slots=8
>>>>>>>>>> octocore02 slots=8  max_slots=8
>>>>>>>>>> clstr000 slots=2 max_slots=3
>>>>>>>>>> clstr001 slots=2 max_slots=3
>>>>>>>>>> clstr002 slots=2 max_slots=3
>>>>>>>>>> clstr003 slots=2 max_slots=3
>>>>>>>>>> clstr004 slots=2 max_slots=3
>>>>>>>>>> clstr005 slots=2 max_slots=3
>>>>>>>>>> clstr006 slots=2 max_slots=3
>>>>>>>>>> clstr007 slots=2 max_slots=3
>>>>>>>>>> - setenv OMPI_MCA_rds_hostfile_path my_hostfile (I put it in .tcshrc
>>>>>>>>>> and then sourced .tcshrc)
>>>>>>>>>> - in my_master.cpp I did
>>>>>>>>>> MPI_Info info1;
>>>>>>>>>> MPI_Info_create(&info1);
>>>>>>>>>> char* hostname =
>>>>>>>>>>     "clstr002,clstr003,clstr005,clstr006,clstr007,octocore01,octocore02";
>>>>>>>>>> MPI_Info_set(info1, "host", hostname);
>>>>>>>>>>
>>>>>>>>>> _intercomm = intracomm.Spawn("./childexe", argv1, _nProc, info1, 0,
>>>>>>>>>>                              MPI_ERRCODES_IGNORE);
>>>>>>>>>>
>>>>>>>>>> - After I call the executable, I get this error message:
>>>>>>>>>>
>>>>>>>>>> bollenstreek: > ./my_master
>>>>>>>>>> number of processes to run: 1
>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>> Some of the requested hosts are not included in the current 
>>>>>>>>>> allocation for
>>>>>>>>>> the application:
>>>>>>>>>> ./childexe
>>>>>>>>>> The requested hosts were:
>>>>>>>>>> clstr002,clstr003,clstr005,clstr006,clstr007,octocore01,octocore02
>>>>>>>>>>
>>>>>>>>>> Verify that you have mapped the allocated resources properly using 
>>>>>>>>>> the
>>>>>>>>>> --host specification.
>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>>>> base/rmaps_base_support_fns.c at line 225
>>>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>>>> rmaps_rr.c at line 478
>>>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>>>> base/rmaps_base_map_job.c at line 210
>>>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>>>> rmgr_urm.c at line 372
>>>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>>>> communicator/comm_dyn.c at line 608
>>>>>>>>>>
>>>>>>>>>> Did I miss something?
>>>>>>>>>> Thanks for the help!
>>>>>>>>>>
>>>>>>>>>> Elena
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Ralph H Castain [mailto:r...@lanl.gov]
>>>>>>>>>> Sent: Tuesday, December 18, 2007 3:50 PM
>>>>>>>>>> To: Elena Zhebel; Open MPI Users <us...@open-mpi.org>
>>>>>>>>>> Cc: Ralph H Castain
>>>>>>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster 
>>>>>>>>>> configuration
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 12/18/07 7:35 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks a lot! Now it works!
>>>>>>>>>>> The solution is to use mpirun -n 1 -hostfile my.hosts *.exe and pass
>>>>>>>>>>> the MPI_Info key to the Spawn function!
>>>>>>>>>>>
>>>>>>>>>>> One more question: is it necessary to start my "master" program with
>>>>>>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host 
>>>>>>>>>>> my_master.exe ?
>>>>>>>>>>
>>>>>>>>>> No, it isn't necessary - assuming that my_master_host is the first 
>>>>>>>>>> host
>>>>>>>>>> listed in your hostfile! If you are only executing one my_master.exe 
>>>>>>>>>> (i.e.,
>>>>>>>>>> you gave -n 1 to mpirun), then we will automatically map that 
>>>>>>>>>> process onto
>>>>>>>>>> the first host in your hostfile.
>>>>>>>>>>
>>>>>>>>>> If you want my_master.exe to go on someone other than the first host 
>>>>>>>>>> in the
>>>>>>>>>> file, then you have to give us the -host option.
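>>>>>>>>>>
>>>>>>>>>> For example (my_other_host here is just a placeholder for any host that
>>>>>>>>>> appears in my_hostfile):
>>>>>>>>>>
>>>>>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_other_host my_master.exe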
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Are there other possibilities for an easy start?
>>>>>>>>>>> I would like to just run ./my_master.exe, but then the master process
>>>>>>>>>>> doesn't know about the hosts available on the network.
>>>>>>>>>>
>>>>>>>>>> You can set the hostfile parameter in your environment instead of on 
>>>>>>>>>> the
>>>>>>>>>> command line. Just set OMPI_MCA_rds_hostfile_path = my.hosts.
>>>>>>>>>>
>>>>>>>>>> You can then just run ./my_master.exe on the host where you want the 
>>>>>>>>>> master
>>>>>>>>>> to reside - everything should work the same.
>>>>>>>>>>
>>>>>>>>>> Just as an FYI: the name of that environmental variable is going to 
>>>>>>>>>> change
>>>>>>>>>> in the 1.3 release, but everything will still work the same.
>>>>>>>>>>
>>>>>>>>>> Hope that helps
>>>>>>>>>> Ralph
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>> Elena
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Ralph H Castain [mailto:r...@lanl.gov]
>>>>>>>>>>> Sent: Monday, December 17, 2007 5:49 PM
>>>>>>>>>>> To: Open MPI Users <us...@open-mpi.org>; Elena Zhebel
>>>>>>>>>>> Cc: Ralph H Castain
>>>>>>>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster 
>>>>>>>>>>> configuration
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12/17/07 8:19 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello Ralph,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your answer.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm using OpenMPI 1.2.3, compiler glibc232, Linux SuSE 10.0.
>>>>>>>>>>>> My "master" executable runs only on the one local host, then it 
>>>>>>>>>>>> spawns
>>>>>>>>>>>> "slaves" (with MPI::Intracomm::Spawn).
>>>>>>>>>>>> My question was: how to determine the hosts where these "slaves" 
>>>>>>>>>>>> will be
>>>>>>>>>>>> spawned?
>>>>>>>>>>>> You said: "You have to specify all of the hosts that can be used by
>>>>>>>>>>>> your job
>>>>>>>>>>>> in the original hostfile". How can I specify the hostfile? I cannot
>>>>>>>>>>>> find it in the documentation.
>>>>>>>>>>>
>>>>>>>>>>> Hmmm...sorry about the lack of documentation. I always assumed that 
>>>>>>>>>>> the MPI
>>>>>>>>>>> folks in the project would document such things since it has little 
>>>>>>>>>>> to do
>>>>>>>>>>> with the underlying run-time, but I guess that fell through the 
>>>>>>>>>>> cracks.
>>>>>>>>>>>
>>>>>>>>>>> There are two parts to your question:
>>>>>>>>>>>
>>>>>>>>>>> 1. how to specify the hosts to be used for the entire job. I believe
>>>>>>>>>>> that is somewhat covered here:
>>>>>>>>>>> http://www.open-mpi.org/faq/?category=running#simple-spmd-run
>>>>>>>>>>>
>>>>>>>>>>> That FAQ tells you what a hostfile should look like, though you may 
>>>>>>>>>>> already
>>>>>>>>>>> know that. Basically, we require that you list -all- of the nodes 
>>>>>>>>>>> that both
>>>>>>>>>>> your master and slave programs will use.
>>>>>>>>>>>
>>>>>>>>>>> 2. how to specify which nodes are available for the master, and 
>>>>>>>>>>> which for
>>>>>>>>>>> the slave.
>>>>>>>>>>>
>>>>>>>>>>> You would specify the host for your master on the mpirun command 
>>>>>>>>>>> line with
>>>>>>>>>>> something like:
>>>>>>>>>>>
>>>>>>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe
>>>>>>>>>>>
>>>>>>>>>>> This directs Open MPI to map that specified executable on the
>>>>>>>>>>> specified host - note that my_master_host must have been in
>>>>>>>>>>> my_hostfile.
>>>>>>>>>>>
>>>>>>>>>>> Inside your master, you would create an MPI_Info key "host" that has a
>>>>>>>>>>> value consisting of a string "host1,host2,host3" identifying the hosts
>>>>>>>>>>> you want your slave to execute upon. Those hosts must have been
>>>>>>>>>>> included in my_hostfile. Include that key in the MPI_Info array passed
>>>>>>>>>>> to your Spawn.
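>>>>>>>>>>>
>>>>>>>>>>> In code that might look something like this (the host names, the
>>>>>>>>>>> nSlaves count, and "./my_slave.exe" are placeholders - this is just a
>>>>>>>>>>> sketch of the idea):
>>>>>>>>>>>
>>>>>>>>>>> MPI_Info info;
>>>>>>>>>>> MPI_Info_create(&info);
>>>>>>>>>>> /* every host named here must also appear in my_hostfile */
>>>>>>>>>>> MPI_Info_set(info, "host", "host1,host2,host3");
>>>>>>>>>>>
>>>>>>>>>>> MPI_Comm children;
>>>>>>>>>>> MPI_Comm_spawn("./my_slave.exe", MPI_ARGV_NULL, nSlaves, info,
>>>>>>>>>>>                0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);
>>>>>>>>>>> MPI_Info_free(&info);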
>>>>>>>>>>>
>>>>>>>>>>> We don't currently support providing a hostfile for the slaves (as 
>>>>>>>>>>> opposed
>>>>>>>>>>> to the host-at-a-time string above). This may become available in a 
>>>>>>>>>>> future
>>>>>>>>>>> release - TBD.
>>>>>>>>>>>
>>>>>>>>>>> Hope that helps
>>>>>>>>>>> Ralph
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>> Elena
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: users-boun...@open-mpi.org 
>>>>>>>>>>>> [mailto:users-boun...@open-mpi.org] On
>>>>>>>>>>>> Behalf Of Ralph H Castain
>>>>>>>>>>>> Sent: Monday, December 17, 2007 3:31 PM
>>>>>>>>>>>> To: Open MPI Users <us...@open-mpi.org>
>>>>>>>>>>>> Cc: Ralph H Castain
>>>>>>>>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster
>>>>>>>>>>>> configuration
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/12/07 5:46 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm working on a MPI application where I'm using OpenMPI instead 
>>>>>>>>>>>>> of
>>>>>>>>>>>>> MPICH.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In my "master" program I call the function MPI::Intracomm::Spawn
>>>>>>>>>>>>> which spawns "slave" processes. It is not clear to me how to spawn
>>>>>>>>>>>>> the "slave" processes over the network. Currently "master" creates
>>>>>>>>>>>>> "slaves" on the same host.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If I use 'mpirun --hostfile openmpi.hosts' then processes are spawned
>>>>>>>>>>>>> over the network as expected. But now I need to spawn processes over
>>>>>>>>>>>>> the network from my own executable using MPI::Intracomm::Spawn; how
>>>>>>>>>>>>> can I achieve it?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not sure from your description exactly what you are trying to 
>>>>>>>>>>>> do,
>>>>>>>>>>>> nor in
>>>>>>>>>>>> what environment this is all operating within or what version of 
>>>>>>>>>>>> Open
>>>>>>>>>>>> MPI
>>>>>>>>>>>> you are using. Setting aside the environment and version issue, I'm
>>>>>>>>>>>> guessing
>>>>>>>>>>>> that you are running your executable over some specified set of 
>>>>>>>>>>>> hosts,
>>>>>>>>>>>> but
>>>>>>>>>>>> want to provide a different hostfile that specifies the hosts to be
>>>>>>>>>>>> used for
>>>>>>>>>>>> the "slave" processes. Correct?
>>>>>>>>>>>>
>>>>>>>>>>>> If that is correct, then I'm afraid you can't do that in any 
>>>>>>>>>>>> version
>>>>>>>>>>>> of Open
>>>>>>>>>>>> MPI today. You have to specify all of the hosts that can be used by
>>>>>>>>>>>> your job
>>>>>>>>>>>> in the original hostfile. You can then specify a subset of those 
>>>>>>>>>>>> hosts
>>>>>>>>>>>> to be
>>>>>>>>>>>> used by your original "master" program, and then specify a 
>>>>>>>>>>>> different
>>>>>>>>>>>> subset
>>>>>>>>>>>> to be used by the "slaves" when calling Spawn.
>>>>>>>>>>>>
>>>>>>>>>>>> But the system requires that you tell it -all- of the hosts that 
>>>>>>>>>>>> are
>>>>>>>>>>>> going
>>>>>>>>>>>> to be used at the beginning of the job.
>>>>>>>>>>>>
>>>>>>>>>>>> At the moment, there is no plan to remove that requirement, though
>>>>>>>>>>>> there has
>>>>>>>>>>>> been occasional discussion about doing so at some point in the 
>>>>>>>>>>>> future.
>>>>>>>>>>>> No
>>>>>>>>>>>> promises that it will happen, though - managed environments, in
>>>>>>>>>>>> particular,
>>>>>>>>>>>> currently object to the idea of changing the allocation 
>>>>>>>>>>>> on-the-fly. We
>>>>>>>>>>>> may,
>>>>>>>>>>>> though, make a provision for purely hostfile-based environments 
>>>>>>>>>>>> (i.e.,
>>>>>>>>>>>> unmanaged) at some time in the future.
>>>>>>>>>>>>
>>>>>>>>>>>> Ralph
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks in advance for any help.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Elena
>>>>>>>>>>>>>
>>>>>>>>>>>>>