I'm afraid I don't know much about OpenFOAM.  You'll likely need to ask for
assistance from the OpenFOAM community.


> On Jun 1, 2016, at 6:13 PM, Megdich Islem <megdich_is...@yahoo.fr> wrote:
> 
> Hi!
> 
> Thank you, Jeff. I was able to run an OF case by setting the absolute path 
> name for mpiexec. But when I wanted to run a coupled case in which OF is 
> coupled with dummyCSM through EMPIRE, using these three command lines:
> 
> mpiexec -np 1 Emperor emperorInput.xml
> mpiexec -np 1 dummyCSM dummyCSMInput
> mpiexec -np 1 pimpleDyMFoam -case OF,
> 
> I still found that OF was not able to connect. In the user guide of EMPIRE, it 
> is said that Emperor (the server) has to recognize the clients, which are 
> dummyCSM and OpenFOAM. For some reason Emperor is not able to recognize 
> OpenFOAM, but it recognizes dummyCSM.
> 
> What could cause a server to fail to recognize a client?
> 
> Regards,
> Islem
> 
> 
> On Wednesday, June 1, 2016 at 5:00 PM, "users-requ...@open-mpi.org" 
> <users-requ...@open-mpi.org> wrote:
> 
> 
> Send users mailing list submissions to
>     us...@open-mpi.org
> 
> To subscribe or unsubscribe via the World Wide Web, visit
>     https://www.open-mpi.org/mailman/listinfo.cgi/users
> or, via email, send a message with subject or body 'help' to
>     users-requ...@open-mpi.org
> 
> You can reach the person managing the list at
>     users-ow...@open-mpi.org
> 
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of users digest..."
> 
> 
> Today's Topics:
> 
>   1. Re: Firewall settings for MPI communication
>       (Jeff Squyres (jsquyres))
>   2. Re: users Digest, Vol 3514, Issue 1 (Jeff Squyres (jsquyres))
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Wed, 1 Jun 2016 13:02:22 +0000
> From: "Jeff Squyres (jsquyres)" <jsquy...@cisco.com>
> To: "Open MPI User's List" <us...@open-mpi.org>
> Subject: Re: [OMPI users] Firewall settings for MPI communication
> Message-ID: <ae97b273-c16b-4bc3-bc30-924d25697...@cisco.com>
> Content-Type: text/plain; charset="utf-8"
> 
> In addition, you might want to consider upgrading to Open MPI v1.10.x (v1.6.x 
> is fairly ancient).
> 
> > On Jun 1, 2016, at 7:46 AM, Gilles Gouaillardet 
> > <gilles.gouaillar...@gmail.com> wrote:
> > 
> > which network are your VMs using for communications ?
> > if this is tcp, then you also have to specify a restricted set of allowed 
> > ports for the tcp btl
> > 
> > that would be something like
> > mpirun --mca btl_tcp_dynamic_ports 49990-50010 ...
> > 
> > please double check the Open MPI 1.6.5 parameter and syntax with
> > ompi_info --all
> > (or check the archives, I think I posted the correct command line a few 
> > weeks ago)
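> > 
> > As a rough sketch only (the parameter names here may not match Open MPI
> > 1.6.5 exactly, so please verify them with ompi_info --all before relying
> > on this), restricting both the oob and btl TCP ports to the opened range
> > would look something like:
> > 
> > mpirun --mca oob_tcp_dynamic_ports 49990-50010 \
> >        --mca btl_tcp_dynamic_ports 49990-50010 \
> >        -np 4 --hostfile machines simpleFoam -parallel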
> > 
> > Cheers,
> > 
> > Gilles
> > 
> > On Wednesday, June 1, 2016, Ping Wang <ping.w...@asc-s.de> wrote:
> > I'm using Open MPI 1.6.5 to run OpenFOAM in parallel on several VMs on a 
> > cloud. mpirun hangs without any error messages. I think this is a firewall 
> > issue, because when I open all the TCP ports (1-65535) in the security group 
> > of the VMs, mpirun works well. However, I was advised to open as few ports as 
> > possible, so I have to limit MPI to a range of ports. I opened the 
> > port range 49990-50010 for MPI communication and used the command
> > 
> >  
> > 
> > mpirun --mca oob_tcp_dynamic_ports 49990-50010 -np 4 --hostfile machines 
> > simpleFoam -parallel
> > 
> >  
> > 
> > But it still hangs. How can I specify a port range that Open MPI will use? I 
> > appreciate any help you can provide.
> > 
> >  
> > 
> > Best,
> > 
> > Ping Wang
> > 
> >  
> > 
> > <image001.png>
> > 
> > ------------------------------------------------------
> > 
> > Ping Wang
> > 
> > Automotive Simulation Center Stuttgart e.V.
> > 
> > Nobelstraße 15
> > 
> > D-70569 Stuttgart
> > 
> > Telefon: +49 711 699659-14
> > 
> > Fax: +49 711 699659-29
> > 
> > E-Mail: ping.w...@asc-s.de
> > 
> > Web: http://www.asc-s.de
> > 
> > Social Media: <image002.gif>/asc.stuttgart
> > 
> > ------------------------------------------------------
> > 
> >  
> > 
> >  
> > 
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > Link to this post: 
> > http://www.open-mpi.org/community/lists/users/2016/06/29340.php
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Wed, 1 Jun 2016 13:51:55 +0000
> From: "Jeff Squyres (jsquyres)" <jsquy...@cisco.com>
> To: Megdich Islem <megdich_is...@yahoo.fr>, "Open MPI User's List"
>     <us...@open-mpi.org>
> Subject: Re: [OMPI users] users Digest, Vol 3514, Issue 1
> Message-ID: <c44110cb-7576-4272-bc53-9c274ef1e...@cisco.com>
> Content-Type: text/plain; charset="utf-8"
> 
> The example you list below has all MPICH paths -- I don't see any Open MPI 
> setups in there.
> 
> What I was suggesting was that if you absolutely need to have both Open MPI 
> and MPICH installed and in your PATH / LD_LIBRARY_PATH / MANPATH, then you 
> can use the full, absolute path name to each of the Open MPI executables -- 
> e.g., /path/to/openmpi/install/bin/mpicc, etc.  That way, you can use Open 
> MPI's mpicc without having it in your path.
> 
> Additionally, per 
> https://www.open-mpi.org/faq/?category=running#mpirun-prefix, if you specify 
> the absolute path name to mpirun (or mpiexec -- they're identical in Open 
> MPI) and you're using the rsh/ssh launcher in Open MPI, then Open MPI will 
> set the right PATH / LD_LIBRARY_PATH on remote servers for you.  See the FAQ 
> link for more detail.
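> 
> For example, a minimal sketch (the install prefix below is only a
> placeholder for wherever Open MPI actually lives on your system):
> 
>     /path/to/openmpi/install/bin/mpirun -np 1 pimpleDyMFoam -case OF
> 
> Because mpirun is invoked via its absolute path, Open MPI can work out its
> own installation prefix and set the matching PATH / LD_LIBRARY_PATH on the
> remote nodes it launches over rsh/ssh.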
> 
> 
> 
> > On Jun 1, 2016, at 8:41 AM, Megdich Islem <megdich_is...@yahoo.fr> wrote:
> > 
> > Hi!
> > 
> > Thank you, Jeff, for your suggestion. But I am still not able to understand 
> > what you mean by using absolute path names for 
> > mpicc/mpifort/mpirun/mpiexec?
> > 
> > This is what my .bashrc looks like:
> > 
> > source /opt/openfoam30/etc/bashrc
> > 
> > export PATH=/home/Desktop/mpich/bin:$PATH
> > export LD_LIBRARY_PATH="/home/islem/Desktop/mpich/lib/:$LD_LIBRARY_PATH"
> > export MPICH_F90=gfortran
> > export MPICH_CC=/opt/intel/bin/icc
> > export MPICH_CXX=/opt/intel/bin/icpc
> > export MPICH_LINK_CXX="-L/home/Desktop/mpich/lib/ -Wl,-rpath 
> > -Wl,/home/islem/Desktop/mpich/lib -lmpichcxx -lmpich -lopa -lmpl -lrt 
> > -lpthread"
> > 
> > export PATH=$PATH:/opt/intel/bin/
> > LD_LIBRARY_PATH="/opt/intel/lib/intel64:$LD_LIBRARY_PATH"
> > export LD_LIBRARY_PATH
> > source 
> > /opt/intel/compilers_and_libraries_2016.3.210/linux/mpi/intel64/mpivars.sh 
> > intel64
> > 
> > alias startEMPIRE=". /home/islem/software/empire/EMPIRE-Core/etc/bashrc.sh 
> > ICC"
> > 
> > mpirun --version gives mpich 3.0.4
> > 
> > This is how I run one example that couples two clients through the server 
> > EMPIRE.
> > I use three terminals; in each one I run one of these command lines:
> > 
> > mpiexec -np 1 Emperor emperorInput.xml  (I got a message in the terminal 
> > saying that Empire started)
> > 
> > mpiexec -np 1 dummyCSM dummyCSMInput (I get a message that Emperor 
> > acknowledged connection)
> > mpiexec -np 1 pimpleDyMFoam -case OF (I got no message in the terminal 
> > which means no connection)
> > 
> > How can I use mpirun, and where do I write any modifications?
> > 
> > Regards,
> > Islem
> > 
> > 
> > On Friday, May 27, 2016 at 5:00 PM, "users-requ...@open-mpi.org" 
> > <users-requ...@open-mpi.org> wrote:
> > 
> > 
> > Send users mailing list submissions to
> >    us...@open-mpi.org
> > 
> > To subscribe or unsubscribe via the World Wide Web, visit
> >    https://www.open-mpi.org/mailman/listinfo.cgi/users
> > or, via email, send a message with subject or body 'help' to
> >    users-requ...@open-mpi.org
> > 
> > You can reach the person managing the list at
> >    users-ow...@open-mpi.org
> > 
> > When replying, please edit your Subject line so it is more specific
> > than "Re: Contents of users digest..."
> > 
> > 
> > Today's Topics:
> > 
> >  1. Re: users Digest, Vol 3510, Issue 2 (Jeff Squyres (jsquyres))
> >  2. Re: segmentation fault for slot-list and openmpi-1.10.3rc2
> >      (Siegmar Gross)
> >  3. OpenMPI virtualization aware (Marco D'Amico)
> >  4. Re: OpenMPI virtualization aware (Ralph Castain)
> > 
> > 
> > ----------------------------------------------------------------------
> > 
> > Message: 1
> > Date: Thu, 26 May 2016 23:28:17 +0000
> > From: "Jeff Squyres (jsquyres)" <jsquy...@cisco.com>
> > To: Megdich Islem <megdich_is...@yahoo.fr>, "Open MPI User's List"
> >    <us...@open-mpi.org>
> > Cc: Dave Love <d.l...@liverpool.ac.uk>
> > Subject: Re: [OMPI users] users Digest, Vol 3510, Issue 2
> > Message-ID: <441f803d-fdbb-443d-82aa-74ff3845a...@cisco.com>
> > Content-Type: text/plain; charset="utf-8"
> > 
> > You're still intermingling your Open MPI and MPICH installations.
> > 
> > You need to ensure to use the wrapper compilers and mpirun/mpiexec from the 
> > same MPI implementation.
> > 
> > For example, if you use mpicc/mpifort from Open MPI to build your program, 
> > then you must use Open MPI's mpirun/mpiexec.
> > 
> > If you absolutely need to have both MPI implementations in your PATH / 
> > LD_LIBRARY_PATH, you might want to use absolute path names for 
> > mpicc/mpifort/mpirun/mpiexec.
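> > 
> > A minimal sketch of what that looks like (the install prefix and program 
> > name are placeholders; substitute the directory where your Open MPI is 
> > actually installed):
> > 
> >   /path/to/openmpi/install/bin/mpicc -o my_prog my_prog.c
> >   /path/to/openmpi/install/bin/mpirun -np 4 ./my_prog
> > 
> > Taking both the wrapper compiler and the launcher from the same prefix is 
> > what keeps the two MPI installations from getting mixed.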
> > 
> > 
> > 
> > > On May 26, 2016, at 3:46 PM, Megdich Islem <megdich_is...@yahoo.fr> wrote:
> > > 
> > > Thank you all for your suggestions !!
> > > 
> > > I found an answer to a similar case in Open MPI FAQ (Question 15)
> > > FAQ: Running MPI jobs
> > > which suggests using mpirun's --prefix command line option or using the 
> > > mpirun wrapper.
> > > 
> > > I modified my command  to the following
> > >  mpirun --prefix 
> > > /opt/openfoam30/platforms/linux64GccDPInt32Opt/lib/Openmpi-system -np 1 
> > > pimpleDyMFoam -case OF
> > > 
> > > But I got an error (see attached picture). Is the syntax correct? How 
> > > can I solve the problem? The first method seems to be easier than using 
> > > the mpirun wrapper.
> > > 
> > > Otherwise, how can I use the mpirun wrapper?
> > > 
> > > Regards,
> > > islem
> > > 
> > > 
> > > On Wednesday, May 25, 2016 at 4:40 PM, Dave Love <d.l...@liverpool.ac.uk> wrote:
> > > 
> > > 
> > > I wrote:
> > > 
> > > 
> > > > You could wrap one (set of) program(s) in a script to set the
> > > > appropriate environment before invoking the real program. 
> > > 
> > > 
> > > I realize I should have said something like "program invocations",
> > > i.e. if you have no control over something invoking mpirun for programs
> > > using different MPIs, then an mpirun wrapper needs to check what it's
> > > being asked to run.
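> > > 
> > > A rough sketch of such a wrapper, purely illustrative (the install paths
> > > and the pattern matched on the program name are assumptions you would
> > > adapt to your own setup):
> > > 
> > >   #!/bin/sh
> > >   # mpirun wrapper: choose the MPI implementation that matches the
> > >   # program being launched, then hand off to the real mpirun.
> > >   case "$*" in
> > >     *pimpleDyMFoam*|*simpleFoam*)
> > >       exec /path/to/openmpi/install/bin/mpirun "$@" ;;  # Open MPI build
> > >     *)
> > >       exec /path/to/mpich/install/bin/mpirun "$@" ;;    # MPICH build
> > >   esac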
> > > 
> > > 
> > > 
> > > <mpirun-error.png><path-to-open-mpi.png>_______________________________________________
> > > users mailing list
> > > us...@open-mpi.org
> > > Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > > Link to this post: 
> > > http://www.open-mpi.org/community/lists/users/2016/05/29317.php
> > 
> > 
> > -- 
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to: 
> > http://www.cisco.com/web/about/doing_business/legal/cri/
> > 
> > 
> > ------------------------------
> > 
> > Message: 2
> > Date: Fri, 27 May 2016 08:16:41 +0200
> > From: Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de>
> > To: Open MPI Users <us...@open-mpi.org>
> > Subject: Re: [OMPI users] segmentation fault for slot-list and
> >    openmpi-1.10.3rc2
> > Message-ID:
> >    <f5653a5c-174f-4569-c730-082a9db82...@informatik.hs-fulda.de>
> > Content-Type: text/plain; charset=windows-1252; format=flowed
> > 
> > Hi Ralph,
> > 
> > 
> > Am 26.05.2016 um 17:38 schrieb Ralph Castain:
> > > I'm afraid I honestly can't make any sense of it. It seems
> > > you at least have a simple workaround (use a hostfile instead
> > > of -host), yes?
> > 
> > Only the combination of "--host" and "--slot-list" breaks.
> > Everything else works as expected. One more remark: as you
> > can see below, this combination worked using gdb and "next"
> > after the breakpoint. The process blocks if I keep the
> > Enter key pressed down, and I have to kill simple_spawn in
> > another window to get control back in gdb (<Ctrl-c> or
> > anything else didn't work). I got this error yesterday
> > evening.
> > 
> > ...
> > (gdb)
> > ompi_mpi_init (argc=0, argv=0x0, requested=0, provided=0x7fffffffbc0c)
> >    at ../../openmpi-1.10.3rc3/ompi/runtime/ompi_mpi_init.c:738
> > 738        if (OMPI_SUCCESS != (ret = ompi_file_init())) {
> > (gdb)
> > 744        if (OMPI_SUCCESS != (ret = ompi_win_init())) {
> > (gdb)
> > 750        if (OMPI_SUCCESS != (ret = ompi_attr_init())) {
> > (gdb)
> > 758        if (OMPI_SUCCESS != (ret = ompi_proc_complete_init())) {
> > (gdb)
> > 764        ret = MCA_PML_CALL(enable(true));
> > (gdb)
> > 765        if( OMPI_SUCCESS != ret ) {
> > (gdb)
> > 771        if (NULL == (procs = ompi_proc_world(&nprocs))) {
> > (gdb)
> > 775        ret = MCA_PML_CALL(add_procs(procs, nprocs));
> > (gdb)
> > 776        free(procs);
> > (gdb)
> > 780        if (OMPI_ERR_UNREACH == ret) {
> > (gdb)
> > 785        } else if (OMPI_SUCCESS != ret) {
> > (gdb)
> > 790        MCA_PML_CALL(add_comm(&ompi_mpi_comm_world.comm));
> > (gdb)
> > 791        MCA_PML_CALL(add_comm(&ompi_mpi_comm_self.comm));
> > (gdb)
> > 796        if (ompi_mpi_show_mca_params) {
> > (gdb)
> > 803        ompi_rte_wait_for_debugger();
> > (gdb)
> > 807        if (ompi_enable_timing && 0 == OMPI_PROC_MY_NAME->vpid) {
> > (gdb)
> > 817        coll = OBJ_NEW(ompi_rte_collective_t);
> > (gdb)
> > 818        coll->id = ompi_process_info.peer_init_barrier;
> > (gdb)
> > 819        coll->active = true;
> > (gdb)
> > 820        if (OMPI_SUCCESS != (ret = ompi_rte_barrier(coll))) {
> > (gdb)
> > 825        OMPI_WAIT_FOR_COMPLETION(coll->active);
> > (gdb)
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > Program received signal SIGTERM, Terminated.
> > 0x00007ffff7a7acd0 in opal_progress@plt ()
> >    from /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12
> > (gdb)
> > Single stepping until exit from function opal_progress@plt,
> > which has no line number information.
> > [Thread 0x7ffff491b700 (LWP 19602) exited]
> > 
> > Program terminated with signal SIGTERM, Terminated.
> > The program no longer exists.
> > (gdb)
> > The program is not being run.
> > (gdb)
> > ...
> > 
> > 
> > 
> > Kind regards
> > 
> > Siegmar
> > 
> > 
> > >> On May 26, 2016, at 5:48 AM, Siegmar Gross 
> > >> <siegmar.gr...@informatik.hs-fulda.de> wrote:
> > >>
> > >> Hi Ralph and Gilles,
> > >>
> > >> it's strange that the program works with "--host" and "--slot-list"
> > >> in your environment and not in mine. I get the following output, if
> > >> I run the program in gdb without a breakpoint.
> > >>
> > >>
> > >> loki spawn 142 gdb /usr/local/openmpi-1.10.3_64_gcc/bin/mpiexec
> > >> GNU gdb (GDB; SUSE Linux Enterprise 12) 7.9.1
> > >> ...
> > >> (gdb) set args -np 1 --host loki --slot-list 0:0-1,1:0-1 simple_spawn
> > >> (gdb) run
> > >> Starting program: /usr/local/openmpi-1.10.3_64_gcc/bin/mpiexec -np 1 
> > >> --host loki --slot-list 0:0-1,1:0-1 simple_spawn
> > >> [Thread debugging using libthread_db enabled]
> > >> Using host libthread_db library "/lib64/libthread_db.so.1".
> > >> Detaching after fork from child process 18031.
> > >> [pid 18031] starting up!
> > >> 0 completed MPI_Init
> > >> Parent [pid 18031] about to spawn!
> > >> Detaching after fork from child process 18033.
> > >> Detaching after fork from child process 18034.
> > >> [pid 18033] starting up!
> > >> [pid 18034] starting up!
> > >> [loki:18034] *** Process received signal ***
> > >> [loki:18034] Signal: Segmentation fault (11)
> > >> ...
> > >>
> > >>
> > >>
> > >> I get a different output, if I run the program in gdb with
> > >> a breakpoint.
> > >>
> > >> gdb /usr/local/openmpi-1.10.3_64_gcc/bin/mpiexec
> > >> (gdb) set args -np 1 --host loki --slot-list 0:0-1,1:0-1 simple_spawn
> > >> (gdb) set follow-fork-mode child
> > >> (gdb) break ompi_proc_self
> > >> (gdb) run
> > >> (gdb) next
> > >>
> > >> Repeating "next" very often results in the following output.
> > >>
> > >> ...
> > >> Starting program: 
> > >> /home/fd1026/work/skripte/master/parallel/prog/mpi/spawn/simple_spawn
> > >> [Thread debugging using libthread_db enabled]
> > >> Using host libthread_db library "/lib64/libthread_db.so.1".
> > >> [pid 13277] starting up!
> > >> [New Thread 0x7ffff42ef700 (LWP 13289)]
> > >>
> > >> Breakpoint 1, ompi_proc_self (size=0x7fffffffc060)
> > >>    at ../../openmpi-1.10.3rc3/ompi/proc/proc.c:413
> > >> 413        ompi_proc_t **procs = (ompi_proc_t**) 
> > >> malloc(sizeof(ompi_proc_t*));
> > >> (gdb) n
> > >> 414        if (NULL == procs) {
> > >> (gdb)
> > >> 423        OBJ_RETAIN(ompi_proc_local_proc);
> > >> (gdb)
> > >> 424        *procs = ompi_proc_local_proc;
> > >> (gdb)
> > >> 425        *size = 1;
> > >> (gdb)
> > >> 426        return procs;
> > >> (gdb)
> > >> 427    }
> > >> (gdb)
> > >> ompi_comm_init () at 
> > >> ../../openmpi-1.10.3rc3/ompi/communicator/comm_init.c:138
> > >> 138        group->grp_my_rank      = 0;
> > >> (gdb)
> > >> 139        group->grp_proc_count    = (int)size;
> > >> ...
> > >> 193        ompi_comm_reg_init();
> > >> (gdb)
> > >> 196        ompi_comm_request_init ();
> > >> (gdb)
> > >> 198        return OMPI_SUCCESS;
> > >> (gdb)
> > >> 199    }
> > >> (gdb)
> > >> ompi_mpi_init (argc=0, argv=0x0, requested=0, provided=0x7fffffffc21c)
> > >>    at ../../openmpi-1.10.3rc3/ompi/runtime/ompi_mpi_init.c:738
> > >> 738        if (OMPI_SUCCESS != (ret = ompi_file_init())) {
> > >> (gdb)
> > >> 744        if (OMPI_SUCCESS != (ret = ompi_win_init())) {
> > >> (gdb)
> > >> 750        if (OMPI_SUCCESS != (ret = ompi_attr_init())) {
> > >> ...
> > >> 988        ompi_mpi_initialized = true;
> > >> (gdb)
> > >> 991        if (ompi_enable_timing && 0 == OMPI_PROC_MY_NAME->vpid) {
> > >> (gdb)
> > >> 999        return MPI_SUCCESS;
> > >> (gdb)
> > >> 1000    }
> > >> (gdb)
> > >> PMPI_Init (argc=0x0, argv=0x0) at pinit.c:94
> > >> 94          if (MPI_SUCCESS != err) {
> > >> (gdb)
> > >> 104        return MPI_SUCCESS;
> > >> (gdb)
> > >> 105    }
> > >> (gdb)
> > >> 0x0000000000400d0c in main ()
> > >> (gdb)
> > >> Single stepping until exit from function main,
> > >> which has no line number information.
> > >> 0 completed MPI_Init
> > >> Parent [pid 13277] about to spawn!
> > >> [New process 13472]
> > >> [Thread debugging using libthread_db enabled]
> > >> Using host libthread_db library "/lib64/libthread_db.so.1".
> > >> process 13472 is executing new program: 
> > >> /usr/local/openmpi-1.10.3_64_gcc/bin/orted
> > >> [Thread debugging using libthread_db enabled]
> > >> Using host libthread_db library "/lib64/libthread_db.so.1".
> > >> [New process 13474]
> > >> [Thread debugging using libthread_db enabled]
> > >> Using host libthread_db library "/lib64/libthread_db.so.1".
> > >> process 13474 is executing new program: 
> > >> /home/fd1026/work/skripte/master/parallel/prog/mpi/spawn/simple_spawn
> > >> [pid 13475] starting up!
> > >> [pid 13476] starting up!
> > >> [Thread debugging using libthread_db enabled]
> > >> Using host libthread_db library "/lib64/libthread_db.so.1".
> > >> [pid 13474] starting up!
> > >> [New Thread 0x7ffff491b700 (LWP 13480)]
> > >> [Switching to Thread 0x7ffff7ff1740 (LWP 13474)]
> > >>
> > >> Breakpoint 1, ompi_proc_self (size=0x7fffffffba30)
> > >>    at ../../openmpi-1.10.3rc3/ompi/proc/proc.c:413
> > >> 413        ompi_proc_t **procs = (ompi_proc_t**) 
> > >> malloc(sizeof(ompi_proc_t*));
> > >> (gdb)
> > >> 414        if (NULL == procs) {
> > >> ...
> > >> 426        return procs;
> > >> (gdb)
> > >> 427    }
> > >> (gdb)
> > >> ompi_comm_init () at 
> > >> ../../openmpi-1.10.3rc3/ompi/communicator/comm_init.c:138
> > >> 138        group->grp_my_rank      = 0;
> > >> (gdb)
> > >> 139        group->grp_proc_count    = (int)size;
> > >> (gdb)
> > >> 140        OMPI_GROUP_SET_INTRINSIC (group);
> > >> ...
> > >> 193        ompi_comm_reg_init();
> > >> (gdb)
> > >> 196        ompi_comm_request_init ();
> > >> (gdb)
> > >> 198        return OMPI_SUCCESS;
> > >> (gdb)
> > >> 199    }
> > >> (gdb)
> > >> ompi_mpi_init (argc=0, argv=0x0, requested=0, provided=0x7fffffffbbec)
> > >>    at ../../openmpi-1.10.3rc3/ompi/runtime/ompi_mpi_init.c:738
> > >> 738        if (OMPI_SUCCESS != (ret = ompi_file_init())) {
> > >> (gdb)
> > >> 744        if (OMPI_SUCCESS != (ret = ompi_win_init())) {
> > >> (gdb)
> > >> 750        if (OMPI_SUCCESS != (ret = ompi_attr_init())) {
> > >> ...
> > >> 863        if (OMPI_SUCCESS != (ret = ompi_pubsub_base_select())) {
> > >> (gdb)
> > >> 869        if (OMPI_SUCCESS != (ret = 
> > >> mca_base_framework_open(&ompi_dpm_base_framework, 0))) {
> > >> (gdb)
> > >> 873        if (OMPI_SUCCESS != (ret = ompi_dpm_base_select())) {
> > >> (gdb)
> > >> 884        if ( OMPI_SUCCESS !=
> > >> (gdb)
> > >> 894        if (OMPI_SUCCESS !=
> > >> (gdb)
> > >> 900        if (OMPI_SUCCESS !=
> > >> (gdb)
> > >> 911        if (OMPI_SUCCESS != (ret = ompi_dpm.dyn_init())) {
> > >> (gdb)
> > >> Parent done with spawn
> > >> Parent sending message to child
> > >> 2 completed MPI_Init
> > >> Hello from the child 2 of 3 on host loki pid 13476
> > >> 1 completed MPI_Init
> > >> Hello from the child 1 of 3 on host loki pid 13475
> > >> 921        if (OMPI_SUCCESS != (ret = ompi_cr_init())) {
> > >> (gdb)
> > >> 931        opal_progress_event_users_decrement();
> > >> (gdb)
> > >> 934        opal_progress_set_yield_when_idle(ompi_mpi_yield_when_idle);
> > >> (gdb)
> > >> 937        if (ompi_mpi_event_tick_rate >= 0) {
> > >> (gdb)
> > >> 946        if (OMPI_SUCCESS != (ret = ompi_mpiext_init())) {
> > >> (gdb)
> > >> 953        if (ret != OMPI_SUCCESS) {
> > >> (gdb)
> > >> 972        OBJ_CONSTRUCT(&ompi_registered_datareps, opal_list_t);
> > >> (gdb)
> > >> 977        OBJ_CONSTRUCT( &ompi_mpi_f90_integer_hashtable, 
> > >> opal_hash_table_t);
> > >> (gdb)
> > >> 978        opal_hash_table_init(&ompi_mpi_f90_integer_hashtable, 16 /* 
> > >> why not? */);
> > >> (gdb)
> > >> 980        OBJ_CONSTRUCT( &ompi_mpi_f90_real_hashtable, 
> > >> opal_hash_table_t);
> > >> (gdb)
> > >> 981        opal_hash_table_init(&ompi_mpi_f90_real_hashtable, 
> > >> FLT_MAX_10_EXP);
> > >> (gdb)
> > >> 983        OBJ_CONSTRUCT( &ompi_mpi_f90_complex_hashtable, 
> > >> opal_hash_table_t);
> > >> (gdb)
> > >> 984        opal_hash_table_init(&ompi_mpi_f90_complex_hashtable, 
> > >> FLT_MAX_10_EXP);
> > >> (gdb)
> > >> 988        ompi_mpi_initialized = true;
> > >> (gdb)
> > >> 991        if (ompi_enable_timing && 0 == OMPI_PROC_MY_NAME->vpid) {
> > >> (gdb)
> > >> 999        return MPI_SUCCESS;
> > >> (gdb)
> > >> 1000    }
> > >> (gdb)
> > >> PMPI_Init (argc=0x0, argv=0x0) at pinit.c:94
> > >> 94          if (MPI_SUCCESS != err) {
> > >> (gdb)
> > >> 104        return MPI_SUCCESS;
> > >> (gdb)
> > >> 105    }
> > >> (gdb)
> > >> 0x0000000000400d0c in main ()
> > >> (gdb)
> > >> Single stepping until exit from function main,
> > >> which has no line number information.
> > >> 0 completed MPI_Init
> > >> Hello from the child 0 of 3 on host loki pid 13474
> > >>
> > >> Child 2 disconnected
> > >> Child 1 disconnected
> > >> Child 0 received msg: 38
> > >> Parent disconnected
> > >> 13277: exiting
> > >>
> > >> Program received signal SIGTERM, Terminated.
> > >> 0x0000000000400f0a in main ()
> > >> (gdb)
> > >> Single stepping until exit from function main,
> > >> which has no line number information.
> > >> [tcsetpgrp failed in terminal_inferior: No such process]
> > >> [Thread 0x7ffff491b700 (LWP 13480) exited]
> > >>
> > >> Program terminated with signal SIGTERM, Terminated.
> > >> The program no longer exists.
> > >> (gdb)
> > >> The program is not being run.
> > >> (gdb)
> > >> The program is not being run.
> > >> (gdb) info break
> > >> Num    Type          Disp Enb Address            What
> > >> 1      breakpoint    keep y  0x00007ffff7aa35c7 in ompi_proc_self
> > >>                                                  at 
> > >> ../../openmpi-1.10.3rc3/ompi/proc/proc.c:413 inf 8, 7, 6, 5, 4, 3, 2, 1
> > >>        breakpoint already hit 2 times
> > >> (gdb) delete 1
> > >> (gdb) r
> > >> Starting program: 
> > >> /home/fd1026/work/skripte/master/parallel/prog/mpi/spawn/simple_spawn
> > >> [Thread debugging using libthread_db enabled]
> > >> Using host libthread_db library "/lib64/libthread_db.so.1".
> > >> [pid 16708] starting up!
> > >> 0 completed MPI_Init
> > >> Parent [pid 16708] about to spawn!
> > >> [New process 16720]
> > >> [Thread debugging using libthread_db enabled]
> > >> Using host libthread_db library "/lib64/libthread_db.so.1".
> > >> process 16720 is executing new program: 
> > >> /usr/local/openmpi-1.10.3_64_gcc/bin/orted
> > >> [Thread debugging using libthread_db enabled]
> > >> Using host libthread_db library "/lib64/libthread_db.so.1".
> > >> [New process 16722]
> > >> [Thread debugging using libthread_db enabled]
> > >> Using host libthread_db library "/lib64/libthread_db.so.1".
> > >> process 16722 is executing new program: 
> > >> /home/fd1026/work/skripte/master/parallel/prog/mpi/spawn/simple_spawn
> > >> [pid 16723] starting up!
> > >> [pid 16724] starting up!
> > >> [Thread debugging using libthread_db enabled]
> > >> Using host libthread_db library "/lib64/libthread_db.so.1".
> > >> [pid 16722] starting up!
> > >> Parent done with spawn
> > >> Parent sending message to child
> > >> 1 completed MPI_Init
> > >> Hello from the child 1 of 3 on host loki pid 16723
> > >> 2 completed MPI_Init
> > >> Hello from the child 2 of 3 on host loki pid 16724
> > >> 0 completed MPI_Init
> > >> Hello from the child 0 of 3 on host loki pid 16722
> > >> Child 0 received msg: 38
> > >> Child 0 disconnected
> > >> Parent disconnected
> > >> Child 1 disconnected
> > >> Child 2 disconnected
> > >> 16708: exiting
> > >> 16724: exiting
> > >> 16723: exiting
> > >> [New Thread 0x7ffff491b700 (LWP 16729)]
> > >>
> > >> Program received signal SIGTERM, Terminated.
> > >> [Switching to Thread 0x7ffff7ff1740 (LWP 16722)]
> > >> __GI__dl_debug_state () at dl-debug.c:74
> > >> 74      dl-debug.c: No such file or directory.
> > >> (gdb) 
> > >> --------------------------------------------------------------------------
> > >> WARNING: A process refused to die despite all the efforts!
> > >> This process may still be running and/or consuming resources.
> > >>
> > >> Host: loki
> > >> PID:  16722
> > >>
> > >> --------------------------------------------------------------------------
> > >>
> > >>
> > >> The following simple_spawn processes exist now.
> > >>
> > >> loki spawn 171 ps -aef | grep simple_spawn
> > >> fd1026  11079 11053  0 14:00 pts/0    00:00:00 
> > >> /usr/local/openmpi-1.10.3_64_gcc/bin/mpiexec -np 1 --host loki 
> > >> --slot-list 0:0-1,1:0-1 simple_spawn
> > >> fd1026  11095 11079 29 14:01 pts/0    00:09:37 [simple_spawn] <defunct>
> > >> fd1026  16722    1  0 14:31 ?        00:00:00 [simple_spawn] <defunct>
> > >> fd1026  17271 29963  0 14:33 pts/2    00:00:00 grep simple_spawn
> > >> loki spawn 172
> > >>
> > >>
> > >> Is it possible that there is a race condition? How can I help
> > >> to get a solution for my problem?
> > >>
> > >>
> > >> Kind regards
> > >>
> > >> Siegmar
> > >>
> > >> Am 24.05.2016 um 16:54 schrieb Ralph Castain:
> > >>> Works perfectly for me, so I believe this must be an environment issue 
> > >>> - I am using gcc 6.0.0 on CentOS7 with x86:
> > >>>
> > >>> $ mpirun -n 1 -host bend001 --slot-list 0:0-1,1:0-1 --report-bindings 
> > >>> ./simple_spawn
> > >>> [bend001:17599] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>> 0[core 1[hwt 0-1]], socket 1[core 6[hwt 0-1]], socket 1[core 7[hwt 
> > >>> 0-1]]: [BB/BB/../../../..][BB/BB/../../../..]
> > >>> [pid 17601] starting up!
> > >>> 0 completed MPI_Init
> > >>> Parent [pid 17601] about to spawn!
> > >>> [pid 17603] starting up!
> > >>> [bend001:17599] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>> 0[core 1[hwt 0-1]], socket 1[core 6[hwt 0-1]], socket 1[core 7[hwt 
> > >>> 0-1]]: [BB/BB/../../../..][BB/BB/../../../..]
> > >>> [bend001:17599] MCW rank 1 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>> 0[core 1[hwt 0-1]], socket 1[core 6[hwt 0-1]], socket 1[core 7[hwt 
> > >>> 0-1]]: [BB/BB/../../../..][BB/BB/../../../..]
> > >>> [bend001:17599] MCW rank 2 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>> 0[core 1[hwt 0-1]], socket 1[core 6[hwt 0-1]], socket 1[core 7[hwt 
> > >>> 0-1]]: [BB/BB/../../../..][BB/BB/../../../..]
> > >>> [pid 17604] starting up!
> > >>> [pid 17605] starting up!
> > >>> Parent done with spawn
> > >>> Parent sending message to child
> > >>> 0 completed MPI_Init
> > >>> Hello from the child 0 of 3 on host bend001 pid 17603
> > >>> Child 0 received msg: 38
> > >>> 1 completed MPI_Init
> > >>> Hello from the child 1 of 3 on host bend001 pid 17604
> > >>> 2 completed MPI_Init
> > >>> Hello from the child 2 of 3 on host bend001 pid 17605
> > >>> Child 0 disconnected
> > >>> Child 2 disconnected
> > >>> Parent disconnected
> > >>> Child 1 disconnected
> > >>> 17603: exiting
> > >>> 17605: exiting
> > >>> 17601: exiting
> > >>> 17604: exiting
> > >>> $
> > >>>
> > >>>> On May 24, 2016, at 7:18 AM, Siegmar Gross 
> > >>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
> > >>>>
> > >>>> Hi Ralph and Gilles,
> > >>>>
> > >>>> the program breaks only if I combine "--host" and "--slot-list". 
> > >>>> Perhaps this
> > >>>> information is helpful. I am using a different machine now, so that you can 
> > >>>> see that
> > >>>> the problem is not restricted to "loki".
> > >>>>
> > >>>>
> > >>>> pc03 spawn 115 ompi_info | grep -e "OPAL repo revision:" -e "C 
> > >>>> compiler absolute:"
> > >>>>    OPAL repo revision: v1.10.2-201-gd23dda8
> > >>>>    C compiler absolute: /usr/local/gcc-6.1.0/bin/gcc
> > >>>>
> > >>>>
> > >>>> pc03 spawn 116 uname -a
> > >>>> Linux pc03 3.12.55-52.42-default #1 SMP Thu Mar 3 10:35:46 UTC 2016 
> > >>>> (4354e1d) x86_64 x86_64 x86_64 GNU/Linux
> > >>>>
> > >>>>
> > >>>> pc03 spawn 117 cat host_pc03.openmpi
> > >>>> pc03.informatik.hs-fulda.de slots=12 max_slots=12
> > >>>>
> > >>>>
> > >>>> pc03 spawn 118 mpicc simple_spawn.c
> > >>>>
> > >>>>
> > >>>> pc03 spawn 119 mpiexec -np 1 --report-bindings a.out
> > >>>> [pc03:03711] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: 
> > >>>> [BB/../../../../..][../../../../../..]
> > >>>> [pid 3713] starting up!
> > >>>> 0 completed MPI_Init
> > >>>> Parent [pid 3713] about to spawn!
> > >>>> [pc03:03711] MCW rank 0 bound to socket 1[core 6[hwt 0-1]], socket 
> > >>>> 1[core 7[hwt 0-1]], socket 1[core 8[hwt 0-1]], socket 1[core 9[hwt 
> > >>>> 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]]: 
> > >>>> [../../../../../..][BB/BB/BB/BB/BB/BB]
> > >>>> [pc03:03711] MCW rank 1 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>>> 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 
> > >>>> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]]: 
> > >>>> [BB/BB/BB/BB/BB/BB][../../../../../..]
> > >>>> [pc03:03711] MCW rank 2 bound to socket 1[core 6[hwt 0-1]], socket 
> > >>>> 1[core 7[hwt 0-1]], socket 1[core 8[hwt 0-1]], socket 1[core 9[hwt 
> > >>>> 0-1]], socket 1[core 10[hwt 0-1]], socket 1[core 11[hwt 0-1]]: 
> > >>>> [../../../../../..][BB/BB/BB/BB/BB/BB]
> > >>>> [pid 3715] starting up!
> > >>>> [pid 3716] starting up!
> > >>>> [pid 3717] starting up!
> > >>>> Parent done with spawn
> > >>>> Parent sending message to child
> > >>>> 0 completed MPI_Init
> > >>>> Hello from the child 0 of 3 on host pc03 pid 3715
> > >>>> 1 completed MPI_Init
> > >>>> Hello from the child 1 of 3 on host pc03 pid 3716
> > >>>> 2 completed MPI_Init
> > >>>> Hello from the child 2 of 3 on host pc03 pid 3717
> > >>>> Child 0 received msg: 38
> > >>>> Child 0 disconnected
> > >>>> Child 2 disconnected
> > >>>> Parent disconnected
> > >>>> Child 1 disconnected
> > >>>> 3713: exiting
> > >>>> 3715: exiting
> > >>>> 3716: exiting
> > >>>> 3717: exiting
> > >>>>
> > >>>>
> > >>>> pc03 spawn 120 mpiexec -np 1 --hostfile host_pc03.openmpi --slot-list 
> > >>>> 0:0-1,1:0-1 --report-bindings a.out
> > >>>> [pc03:03729] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>>> 0[core 1[hwt 0-1]], socket 1[core 6[hwt 0-1]], socket 1[core 7[hwt 
> > >>>> 0-1]]: [BB/BB/../../../..][BB/BB/../../../..]
> > >>>> [pid 3731] starting up!
> > >>>> 0 completed MPI_Init
> > >>>> Parent [pid 3731] about to spawn!
> > >>>> [pc03:03729] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>>> 0[core 1[hwt 0-1]], socket 1[core 6[hwt 0-1]], socket 1[core 7[hwt 
> > >>>> 0-1]]: [BB/BB/../../../..][BB/BB/../../../..]
> > >>>> [pc03:03729] MCW rank 1 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>>> 0[core 1[hwt 0-1]], socket 1[core 6[hwt 0-1]], socket 1[core 7[hwt 
> > >>>> 0-1]]: [BB/BB/../../../..][BB/BB/../../../..]
> > >>>> [pc03:03729] MCW rank 2 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>>> 0[core 1[hwt 0-1]], socket 1[core 6[hwt 0-1]], socket 1[core 7[hwt 
> > >>>> 0-1]]: [BB/BB/../../../..][BB/BB/../../../..]
> > >>>> [pid 3733] starting up!
> > >>>> [pid 3734] starting up!
> > >>>> [pid 3735] starting up!
> > >>>> Parent done with spawn
> > >>>> Parent sending message to child
> > >>>> 2 completed MPI_Init
> > >>>> Hello from the child 2 of 3 on host pc03 pid 3735
> > >>>> 1 completed MPI_Init
> > >>>> Hello from the child 1 of 3 on host pc03 pid 3734
> > >>>> 0 completed MPI_Init
> > >>>> Hello from the child 0 of 3 on host pc03 pid 3733
> > >>>> Child 0 received msg: 38
> > >>>> Child 0 disconnected
> > >>>> Child 2 disconnected
> > >>>> Child 1 disconnected
> > >>>> Parent disconnected
> > >>>> 3731: exiting
> > >>>> 3734: exiting
> > >>>> 3733: exiting
> > >>>> 3735: exiting
> > >>>>
> > >>>>
> > >>>> pc03 spawn 121 mpiexec -np 1 --host pc03 --slot-list 0:0-1,1:0-1 
> > >>>> --report-bindings a.out
> > >>>> [pc03:03744] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>>> 0[core 1[hwt 0-1]], socket 1[core 6[hwt 0-1]], socket 1[core 7[hwt 
> > >>>> 0-1]]: [BB/BB/../../../..][BB/BB/../../../..]
> > >>>> [pid 3746] starting up!
> > >>>> 0 completed MPI_Init
> > >>>> Parent [pid 3746] about to spawn!
> > >>>> [pc03:03744] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>>> 0[core 1[hwt 0-1]], socket 1[core 6[hwt 0-1]], socket 1[core 7[hwt 
> > >>>> 0-1]]: [BB/BB/../../../..][BB/BB/../../../..]
> > >>>> [pc03:03744] MCW rank 2 bound to socket 0[core 0[hwt 0-1]], socket 
> > >>>> 0[core 1[hwt 0-1]], socket 1[core 6[hwt 0-1]], socket 1[core 7[hwt 
> > >>>> 0-1]]: [BB/BB/../../../..][BB/BB/../../../..]
> > >>>> [pid 3748] starting up!
> > >>>> [pid 3749] starting up!
> > >>>> [pc03:03749] *** Process received signal ***
> > >>>> [pc03:03749] Signal: Segmentation fault (11)
> > >>>> [pc03:03749] Signal code: Address not mapped (1)
> > >>>> [pc03:03749] Failing at address: 0x8
> > >>>> [pc03:03749] [ 0] /lib64/libpthread.so.0(+0xf870)[0x7fe6f0d1f870]
> > >>>> [pc03:03749] [ 1] 
> > >>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_proc_self+0x35)[0x7fe6f0f825b0]
> > >>>> [pc03:03749] [ 2] 
> > >>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_comm_init+0x68b)[0x7fe6f0f61b08]
> > >>>> [pc03:03749] [ 3] 
> > >>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_mpi_init+0xa90)[0x7fe6f0f87e8a]
> > >>>> [pc03:03749] [ 4] 
> > >>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(MPI_Init+0x1a0)[0x7fe6f0fc42ae]
> > >>>> [pc03:03749] [ 5] a.out[0x400d0c]
> > >>>> [pc03:03749] [ 6] 
> > >>>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fe6f0989b05]
> > >>>> [pc03:03749] [ 7] a.out[0x400bf9]
> > >>>> [pc03:03749] *** End of error message ***
> > >>>> --------------------------------------------------------------------------
> > >>>> mpiexec noticed that process rank 2 with PID 3749 on node pc03 exited 
> > >>>> on signal 11 (Segmentation fault).
> > >>>> --------------------------------------------------------------------------
> > >>>> pc03 spawn 122
> > >>>>
> > >>>>
> > >>>>
> > >>>> Kind regards
> > >>>>
> > >>>> Siegmar
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 05/24/16 15:44, Ralph Castain wrote:
> > >>>>>
> > >>>>>> On May 24, 2016, at 6:21 AM, Siegmar Gross 
> > >>>>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
> > >>>>>>
> > >>>>>> Hi Ralph,
> > >>>>>>
> > >>>>>> I have copied the relevant lines here, so that it is easier to see 
> > >>>>>> what
> > >>>>>> happens. "a.out" is your program, which I compiled with mpicc.
> > >>>>>>
> > >>>>>>>> loki spawn 153 ompi_info | grep -e "OPAL repo revision:" -e "C 
> > >>>>>>>> compiler
> > >>>>>>>> absolute:"
> > >>>>>>>>    OPAL repo revision: v1.10.2-201-gd23dda8
> > >>>>>>>>  C compiler absolute: /usr/local/gcc-6.1.0/bin/gcc
> > >>>>>>>> loki spawn 154 mpicc simple_spawn.c
> > >>>>>>
> > >>>>>>>> loki spawn 155 mpiexec -np 1 a.out
> > >>>>>>>> [pid 24008] starting up!
> > >>>>>>>> 0 completed MPI_Init
> > >>>>>> ...
> > >>>>>>
> > >>>>>> "mpiexec -np 1 a.out" works.
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>> I don't know what "a.out" is, but it looks like there is some memory
> > >>>>>>> corruption there.
> > >>>>>>
> > >>>>>> "a.out" is still your program. I get the same error on different
> > >>>>>> machines, so it is not very likely that the (hardware) memory
> > >>>>>> is corrupted.
> > >>>>>>
> > >>>>>>
> > >>>>>>>> loki spawn 156 mpiexec -np 1 --host loki --slot-list 0-5 a.out
> > >>>>>>>> [pid 24102] starting up!
> > >>>>>>>> 0 completed MPI_Init
> > >>>>>>>> Parent [pid 24102] about to spawn!
> > >>>>>>>> [pid 24104] starting up!
> > >>>>>>>> [pid 24105] starting up!
> > >>>>>>>> [loki:24105] *** Process received signal ***
> > >>>>>>>> [loki:24105] Signal: Segmentation fault (11)
> > >>>>>>>> [loki:24105] Signal code: Address not mapped (1)
> > >>>>>> ...
> > >>>>>>
> > >>>>>> "mpiexec -np 1 --host loki --slot-list 0-5 a.out" breaks with a 
> > >>>>>> segmentation
> > >>>>>> fault. Can I do something so that you can find out what happens?
> > >>>>>
> > >>>>> I honestly have no idea - perhaps Gilles can help, as I have no 
> > >>>>> access to that kind of environment. We aren't seeing such problems 
> > >>>>> elsewhere, so it is likely something local.
> > >>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> Kind regards
> > >>>>>>
> > >>>>>> Siegmar
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On 05/24/16 15:07, Ralph Castain wrote:
> > >>>>>>>
> > >>>>>>>> On May 24, 2016, at 4:19 AM, Siegmar Gross
> > >>>>>>>> <siegmar.gr...@informatik.hs-fulda.de
> > >>>>>>>> <mailto:siegmar.gr...@informatik.hs-fulda.de>> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi Ralph,
> > >>>>>>>>
> > >>>>>>>> thank you very much for your answer and your example program.
> > >>>>>>>>
> > >>>>>>>> On 05/23/16 17:45, Ralph Castain wrote:
> > >>>>>>>>> I cannot replicate the problem - both scenarios work fine for me. 
> > >>>>>>>>> I'm not
> > >>>>>>>>> convinced your test code is correct, however, as you call 
> > >>>>>>>>> Comm_free on the
> > >>>>>>>>> inter-communicator but didn't call Comm_disconnect. Check out the 
> > >>>>>>>>> attached
> > >>>>>>>>> for a correct version and see if it works for you.
> > >>>>>>>>
> > >>>>>>>> I thought that I only need MPI_Comm_disconnect if I had 
> > >>>>>>>> established a
> > >>>>>>>> connection with MPI_Comm_connect before. The man page for 
> > >>>>>>>> MPI_Comm_free states
> > >>>>>>>>
> > >>>>>>>> "This  operation marks the communicator object for deallocation. 
> > >>>>>>>> The
> > >>>>>>>> handle is set to MPI_COMM_NULL. Any pending operations that use 
> > >>>>>>>> this
> > >>>>>>>> communicator will complete normally; the object is actually 
> > >>>>>>>> deallocated only
> > >>>>>>>> if there are no other active references to it.".
> > >>>>>>>>
> > >>>>>>>> The man page for MPI_Comm_disconnect states
> > >>>>>>>>
> > >>>>>>>> "MPI_Comm_disconnect waits for all pending communication on comm 
> > >>>>>>>> to complete
> > >>>>>>>> internally, deallocates the communicator object, and sets the 
> > >>>>>>>> handle to
> > >>>>>>>> MPI_COMM_NULL. It is  a  collective operation.".
> > >>>>>>>>
> > >>>>>>>> I don't see a difference for my spawned processes, because both 
> > >>>>>>>> functions will
> > >>>>>>>> "wait" until all pending operations have finished before the 
> > >>>>>>>> object is
> > >>>>>>>> destroyed. Nevertheless, perhaps my small example program has worked 
> > >>>>>>>> all these years
> > >>>>>>>> by chance.
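> > >>>>>>>> 
> > >>>>>>>> Just to illustrate, a minimal sketch of what I understand the
> > >>>>>>>> disconnect variant to look like on the parent side (illustrative
> > >>>>>>>> only, error handling omitted, program name as in the example):
> > >>>>>>>> 
> > >>>>>>>>   MPI_Comm children;
> > >>>>>>>>   MPI_Comm_spawn("spawn_slave", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
> > >>>>>>>>                  0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);
> > >>>>>>>>   /* ... exchange messages with the slaves over the
> > >>>>>>>>      inter-communicator ... */
> > >>>>>>>>   MPI_Comm_disconnect(&children);  /* collective; waits for pending
> > >>>>>>>>                                       communication, then frees it */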
> > >>>>>>>>
> > >>>>>>>> However, I don't understand why my program works with
> > >>>>>>>> "mpiexec -np 1 --host loki,loki,loki,loki,loki spawn_master" and 
> > >>>>>>>> breaks with
> > >>>>>>>> "mpiexec -np 1 --host loki --slot-list 0:0-5,1:0-5 spawn_master". 
> > >>>>>>>> You are right,
> > >>>>>>>> my slot-list is equivalent to "-bind-to none". I could also have 
> > >>>>>>>> used
> > >>>>>>>> "mpiexec -np 1 --host loki --oversubscribe spawn_master" which 
> > >>>>>>>> works as well.
> > >>>>>>>
> > >>>>>>> Well, you are only giving us one slot when you specify "-host 
> > >>>>>>> loki", and then
> > >>>>>>> you are trying to launch multiple processes into it. The 
> > >>>>>>> "slot-list" option only
> > >>>>>>> tells us what cpus to bind each process to - it doesn't allocate 
> > >>>>>>> process slots.
> > >>>>>>> So you have to tell us how many processes are allowed to run on 
> > >>>>>>> this node.
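> > >>>>>>> 
> > >>>>>>> For instance, either of these sketches would give loki the five
> > >>>>>>> slots that the master plus four spawned children need (the hostfile
> > >>>>>>> name is just a placeholder):
> > >>>>>>> 
> > >>>>>>>   mpiexec -np 1 --host loki,loki,loki,loki,loki spawn_master
> > >>>>>>> 
> > >>>>>>> or a hostfile containing a line such as "loki slots=5", used as
> > >>>>>>> 
> > >>>>>>>   mpiexec -np 1 --hostfile my_hostfile spawn_master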
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>> The program breaks with "There are not enough slots available in 
> > >>>>>>>> the system
> > >>>>>>>> to satisfy ..." if I only use "--host loki" or different host 
> > >>>>>>>> names
> > >>>>>>>> without mentioning five host names and without using "slot-list" or 
> > >>>>>>>> "oversubscribe".
> > >>>>>>>> Unfortunately, "--host <host name>:<number of slots>" isn't 
> > >>>>>>>> available in
> > >>>>>>>> openmpi-1.10.3rc2 to specify the number of available slots.
> > >>>>>>>
> > >>>>>>> Correct - we did not backport the new syntax
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>> Your program behaves the same way as mine, so that 
> > >>>>>>>> MPI_Comm_disconnect
> > >>>>>>>> will not solve my problem. I had to modify your program in a 
> > >>>>>>>> negligible way
> > >>>>>>>> to get it compiled.
> > >>>>>>>>
> > >>>>>>>> loki spawn 153 ompi_info | grep -e "OPAL repo revision:" -e "C 
> > >>>>>>>> compiler absolute:"
> > >>>>>>>>  OPAL repo revision: v1.10.2-201-gd23dda8
> > >>>>>>>>  C compiler absolute: /usr/local/gcc-6.1.0/bin/gcc
> > >>>>>>>> loki spawn 154 mpicc simple_spawn.c
> > >>>>>>>> loki spawn 155 mpiexec -np 1 a.out
> > >>>>>>>> [pid 24008] starting up!
> > >>>>>>>> 0 completed MPI_Init
> > >>>>>>>> Parent [pid 24008] about to spawn!
> > >>>>>>>> [pid 24010] starting up!
> > >>>>>>>> [pid 24011] starting up!
> > >>>>>>>> [pid 24012] starting up!
> > >>>>>>>> Parent done with spawn
> > >>>>>>>> Parent sending message to child
> > >>>>>>>> 0 completed MPI_Init
> > >>>>>>>> Hello from the child 0 of 3 on host loki pid 24010
> > >>>>>>>> 1 completed MPI_Init
> > >>>>>>>> Hello from the child 1 of 3 on host loki pid 24011
> > >>>>>>>> 2 completed MPI_Init
> > >>>>>>>> Hello from the child 2 of 3 on host loki pid 24012
> > >>>>>>>> Child 0 received msg: 38
> > >>>>>>>> Child 0 disconnected
> > >>>>>>>> Child 1 disconnected
> > >>>>>>>> Child 2 disconnected
> > >>>>>>>> Parent disconnected
> > >>>>>>>> 24012: exiting
> > >>>>>>>> 24010: exiting
> > >>>>>>>> 24008: exiting
> > >>>>>>>> 24011: exiting
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Is something wrong with my command line? I didn't use slot-list 
> > >>>>>>>> before, so
> > >>>>>>>> I'm not sure if I am using it in the intended way.
> > >>>>>>>
> > >>>>>>> I don't know what "a.out" is, but it looks like there is some 
> > >>>>>>> memory corruption
> > >>>>>>> there.
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>> loki spawn 156 mpiexec -np 1 --host loki --slot-list 0-5 a.out
> > >>>>>>>> [pid 24102] starting up!
> > >>>>>>>> 0 completed MPI_Init
> > >>>>>>>> Parent [pid 24102] about to spawn!
> > >>>>>>>> [pid 24104] starting up!
> > >>>>>>>> [pid 24105] starting up!
> > >>>>>>>> [loki:24105] *** Process received signal ***
> > >>>>>>>> [loki:24105] Signal: Segmentation fault (11)
> > >>>>>>>> [loki:24105] Signal code: Address not mapped (1)
> > >>>>>>>> [loki:24105] Failing at address: 0x8
> > >>>>>>>> [loki:24105] [ 0] /lib64/libpthread.so.0(+0xf870)[0x7f39aa76f870]
> > >>>>>>>> [loki:24105] [ 1]
> > >>>>>>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_proc_self+0x35)[0x7f39aa9d25b0]
> > >>>>>>>> [loki:24105] [ 2]
> > >>>>>>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_comm_init+0x68b)[0x7f39aa9b1b08]
> > >>>>>>>> [loki:24105] [ 3] *** An error occurred in MPI_Init
> > >>>>>>>> *** on a NULL communicator
> > >>>>>>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now 
> > >>>>>>>> abort,
> > >>>>>>>> ***    and potentially your MPI job)
> > >>>>>>>> [loki:24104] Local abort before MPI_INIT completed successfully; 
> > >>>>>>>> not able to
> > >>>>>>>> aggregate error messages, and not able to guarantee that all other 
> > >>>>>>>> processes
> > >>>>>>>> were killed!
> > >>>>>>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_mpi_init+0xa90)[0x7f39aa9d7e8a]
> > >>>>>>>> [loki:24105] [ 4]
> > >>>>>>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(MPI_Init+0x1a0)[0x7f39aaa142ae]
> > >>>>>>>> [loki:24105] [ 5] a.out[0x400d0c]
> > >>>>>>>> [loki:24105] [ 6] 
> > >>>>>>>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f39aa3d9b05]
> > >>>>>>>> [loki:24105] [ 7] a.out[0x400bf9]
> > >>>>>>>> [loki:24105] *** End of error message ***
> > >>>>>>>> -------------------------------------------------------
> > >>>>>>>> Child job 2 terminated normally, but 1 process returned
> > >>>>>>>> a non-zero exit code.. Per user-direction, the job has been 
> > >>>>>>>> aborted.
> > >>>>>>>> -------------------------------------------------------
> > >>>>>>>> --------------------------------------------------------------------------
> > >>>>>>>> mpiexec detected that one or more processes exited with non-zero 
> > >>>>>>>> status, thus
> > >>>>>>>> causing
> > >>>>>>>> the job to be terminated. The first process to do so was:
> > >>>>>>>>
> > >>>>>>>> Process name: [[49560,2],0]
> > >>>>>>>> Exit code:    1
> > >>>>>>>> --------------------------------------------------------------------------
> > >>>>>>>> loki spawn 157
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Hopefully, you will find out what happens. Please let me know, if 
> > >>>>>>>> I can
> > >>>>>>>> help you in any way.
> > >>>>>>>>
> > >>>>>>>> Kind regards
> > >>>>>>>>
> > >>>>>>>> Siegmar
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>> FWIW: I don't know how many cores you have on your sockets, but 
> > >>>>>>>>> if you
> > >>>>>>>>> have 6 cores/socket, then your slot-list is equivalent to 
> > >>>>>>>>> "--bind-to none"
> > >>>>>>>>> as the slot-list applies to every process being launched
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>> On May 23, 2016, at 6:26 AM, Siegmar Gross
> > >>>>>>>>>> <siegmar.gr...@informatik.hs-fulda.de
> > >>>>>>>>>> <mailto:siegmar.gr...@informatik.hs-fulda.de>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Hi,
> > >>>>>>>>>>
> > >>>>>>>>>> I installed openmpi-1.10.3rc2 on my "SUSE Linux Enterprise Server
> > >>>>>>>>>> 12 (x86_64)" with Sun C 5.13  and gcc-6.1.0. Unfortunately I get
> > >>>>>>>>>> a segmentation fault for "--slot-list" for one of my small 
> > >>>>>>>>>> programs.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> loki spawn 119 ompi_info | grep -e "OPAL repo revision:" -e "C 
> > >>>>>>>>>> compiler
> > >>>>>>>>>> absolute:"
> > >>>>>>>>>>  OPAL repo revision: v1.10.2-201-gd23dda8
> > >>>>>>>>>> C compiler absolute: /usr/local/gcc-6.1.0/bin/gcc
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> loki spawn 120 mpiexec -np 1 --host loki,loki,loki,loki,loki 
> > >>>>>>>>>> spawn_master
> > >>>>>>>>>>
> > >>>>>>>>>> Parent process 0 running on loki
> > >>>>>>>>>> I create 4 slave processes
> > >>>>>>>>>>
> > >>>>>>>>>> Parent process 0: tasks in MPI_COMM_WORLD:                    1
> > >>>>>>>>>>              tasks in COMM_CHILD_PROCESSES local group:  1
> > >>>>>>>>>>              tasks in COMM_CHILD_PROCESSES remote group: 4
> > >>>>>>>>>>
> > >>>>>>>>>> Slave process 0 of 4 running on loki
> > >>>>>>>>>> Slave process 1 of 4 running on loki
> > >>>>>>>>>> Slave process 2 of 4 running on loki
> > >>>>>>>>>> spawn_slave 2: argv[0]: spawn_slave
> > >>>>>>>>>> Slave process 3 of 4 running on loki
> > >>>>>>>>>> spawn_slave 0: argv[0]: spawn_slave
> > >>>>>>>>>> spawn_slave 1: argv[0]: spawn_slave
> > >>>>>>>>>> spawn_slave 3: argv[0]: spawn_slave
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> loki spawn 121 mpiexec -np 1 --host loki --slot-list 0:0-5,1:0-5 
> > >>>>>>>>>> spawn_master
> > >>>>>>>>>>
> > >>>>>>>>>> Parent process 0 running on loki
> > >>>>>>>>>> I create 4 slave processes
> > >>>>>>>>>>
> > >>>>>>>>>> [loki:17326] *** Process received signal ***
> > >>>>>>>>>> [loki:17326] Signal: Segmentation fault (11)
> > >>>>>>>>>> [loki:17326] Signal code: Address not mapped (1)
> > >>>>>>>>>> [loki:17326] Failing at address: 0x8
> > >>>>>>>>>> [loki:17326] [ 0] /lib64/libpthread.so.0(+0xf870)[0x7f4e469b3870]
> > >>>>>>>>>> [loki:17326] [ 1] *** An error occurred in MPI_Init
> > >>>>>>>>>> *** on a NULL communicator
> > >>>>>>>>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
> > >>>>>>>>>> now abort,
> > >>>>>>>>>> ***    and potentially your MPI job)
> > >>>>>>>>>> [loki:17324] Local abort before MPI_INIT completed successfully; 
> > >>>>>>>>>> not able to
> > >>>>>>>>>> aggregate error messages, and not able to guarantee that all 
> > >>>>>>>>>> other processes
> > >>>>>>>>>> were killed!
> > >>>>>>>>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_proc_self+0x35)[0x7f4e46c165b0]
> > >>>>>>>>>> [loki:17326] [ 2]
> > >>>>>>>>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_comm_init+0x68b)[0x7f4e46bf5b08]
> > >>>>>>>>>> [loki:17326] [ 3] *** An error occurred in MPI_Init
> > >>>>>>>>>> *** on a NULL communicator
> > >>>>>>>>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
> > >>>>>>>>>> now abort,
> > >>>>>>>>>> ***    and potentially your MPI job)
> > >>>>>>>>>> [loki:17325] Local abort before MPI_INIT completed successfully; 
> > >>>>>>>>>> not able to
> > >>>>>>>>>> aggregate error messages, and not able to guarantee that all 
> > >>>>>>>>>> other processes
> > >>>>>>>>>> were killed!
> > >>>>>>>>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(ompi_mpi_init+0xa90)[0x7f4e46c1be8a]
> > >>>>>>>>>> [loki:17326] [ 4]
> > >>>>>>>>>> /usr/local/openmpi-1.10.3_64_gcc/lib64/libmpi.so.12(MPI_Init+0x180)[0x7f4e46c5828e]
> > >>>>>>>>>> [loki:17326] [ 5] spawn_slave[0x40097e]
> > >>>>>>>>>> [loki:17326] [ 6] 
> > >>>>>>>>>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f4e4661db05]
> > >>>>>>>>>> [loki:17326] [ 7] spawn_slave[0x400a54]
> > >>>>>>>>>> [loki:17326] *** End of error message ***
> > >>>>>>>>>> -------------------------------------------------------
> > >>>>>>>>>> Child job 2 terminated normally, but 1 process returned
> > >>>>>>>>>> a non-zero exit code.. Per user-direction, the job has been 
> > >>>>>>>>>> aborted.
> > >>>>>>>>>> -------------------------------------------------------
> > >>>>>>>>>> --------------------------------------------------------------------------
> > >>>>>>>>>> mpiexec detected that one or more processes exited with non-zero 
> > >>>>>>>>>> status,
> > >>>>>>>>>> thus causing
> > >>>>>>>>>> the job to be terminated. The first process to do so was:
> > >>>>>>>>>>
> > >>>>>>>>>> Process name: [[56340,2],0]
> > >>>>>>>>>> Exit code:    1
> > >>>>>>>>>> --------------------------------------------------------------------------
> > >>>>>>>>>> loki spawn 122
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> I would be grateful, if somebody can fix the problem. Thank you
> > >>>>>>>>>> very much for any help in advance.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Kind regards
> > >>>>>>>>>>
> > >>>>>>>>>> Siegmar
> > >>>>>>>>>> _______________________________________________
> > >>>>>>>>>> users mailing list
> > >>>>>>>>>> us...@open-mpi.org <mailto:us...@open-mpi.org>
> > >>>>>>>>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > >>>>>>>>>> Link to this post:
> > >>>>>>>>>> http://www.open-mpi.org/community/lists/users/2016/05/29281.php
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> _______________________________________________
> > >>>>>>>>> users mailing list
> > >>>>>>>>> us...@open-mpi.org <mailto:us...@open-mpi.org>
> > >>>>>>>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > >>>>>>>>> Link to this
> > >>>>>>>>> post: 
> > >>>>>>>>> http://www.open-mpi.org/community/lists/users/2016/05/29284.php
> > >>>>>>>>>
> > >>>>>>>> <simple_spawn_modified.c>_______________________________________________
> > >>>>>>>> users mailing list
> > >>>>>>>> us...@open-mpi.org <mailto:us...@open-mpi.org>
> > >>>>>>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > >>>>>>>> Link to this post: 
> > >>>>>>>> http://www.open-mpi.org/community/lists/users/2016/05/29300.php
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> _______________________________________________
> > >>>>>>> users mailing list
> > >>>>>>> us...@open-mpi.org
> > >>>>>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > >>>>>>> Link to this post: 
> > >>>>>>> http://www.open-mpi.org/community/lists/users/2016/05/29301.php
> > >>>>>>>
> > >>>>>> _______________________________________________
> > >>>>>> users mailing list
> > >>>>>> us...@open-mpi.org
> > >>>>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > >>>>>> Link to this post: 
> > >>>>>> http://www.open-mpi.org/community/lists/users/2016/05/29304.php
> > >>>>>
> > >>>>> _______________________________________________
> > >>>>> users mailing list
> > >>>>> us...@open-mpi.org
> > >>>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > >>>>> Link to this post: 
> > >>>>> http://www.open-mpi.org/community/lists/users/2016/05/29307.php
> > >>>>>
> > >>>> _______________________________________________
> > >>>> users mailing list
> > >>>> us...@open-mpi.org
> > >>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > >>>> Link to this post: 
> > >>>> http://www.open-mpi.org/community/lists/users/2016/05/29308.php
> > >>>
> > >>> _______________________________________________
> > >>> users mailing list
> > >>> us...@open-mpi.org
> > >>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > >>> Link to this post: 
> > >>> http://www.open-mpi.org/community/lists/users/2016/05/29309.php
> > >>>
> > >> _______________________________________________
> > >> users mailing list
> > >> us...@open-mpi.org
> > >> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > >> Link to this post: 
> > >> http://www.open-mpi.org/community/lists/users/2016/05/29315.php
> > >
> > > _______________________________________________
> > > users mailing list
> > > us...@open-mpi.org
> > > Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > > Link to this post: 
> > > http://www.open-mpi.org/community/lists/users/2016/05/29316.php
> > >
> > 
> > 
> > 
> > ------------------------------
> > 
> > Message: 3
> > Date: Fri, 27 May 2016 09:14:42 +0000
> > From: "Marco D'Amico" <marco.damic...@gmail.com>
> > To: us...@open-mpi.org
> > Subject: [OMPI users] OpenMPI virtualization aware
> > Message-ID:
> >    <CABi-01XH+vdi2egBD=knen_cyxpecg0j-+3rtvnfnc6mtd+...@mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> > 
> > Hi, I have recently been investigating virtualization in the HPC field, and I
> > found out that MVAPICH has a "virtualization aware" version that makes it
> > possible to overcome the large latency problems of using a virtualization
> > environment for HPC.
> > 
> > My question is whether there is any similar effort in Open MPI, since I would
> > eventually like to contribute to it.
> > 
> > Best regards,
> > Marco D'Amico
> > -------------- next part --------------
> > HTML attachment scrubbed and removed
> > 
> > ------------------------------
> > 
> > Message: 4
> > Date: Fri, 27 May 2016 06:45:05 -0700
> > From: Ralph Castain <r...@open-mpi.org>
> > To: Open MPI Users <us...@open-mpi.org>
> > Subject: Re: [OMPI users] OpenMPI virtualization aware
> > Message-ID: <bbeb8e66-40b0-4688-8284-2113252e1...@open-mpi.org>
> > Content-Type: text/plain; charset="utf-8"
> > 
> > Hi Marco
> > 
> > OMPI has integrated support for the Singularity container:
> > 
> > http://singularity.lbl.gov/index.html 
> > <http://singularity.lbl.gov/index.html>
> > 
> > https://groups.google.com/a/lbl.gov/forum/#!forum/singularity 
> > <https://groups.google.com/a/lbl.gov/forum/#!forum/singularity>
> > 
> > It is in OMPI master now, and an early version is in 2.0 - the full 
> > integration will be in 2.1. Singularity is undergoing changes for its 2.0 
> > release (so we'll need to do some updating of the OMPI integration), and 
> > there is still plenty that can be done to further optimize its integration 
> > - so contributions would be welcome!
> > 
> > Ralph
> > 
> > 
> > 
> > > On May 27, 2016, at 2:14 AM, Marco D'Amico <marco.damic...@gmail.com> 
> > > wrote:
> > > 
> > > Hi, I have recently been investigating virtualization in the HPC field, and I 
> > > found out that MVAPICH has a "virtualization aware" version that makes it 
> > > possible to overcome the large latency problems of using a virtualization 
> > > environment for HPC.
> > > 
> > > My question is whether there is any similar effort in Open MPI, since I would 
> > > eventually like to contribute to it.
> > > 
> > > Best regards,
> > > Marco D'Amico
> > > _______________________________________________
> > > users mailing list
> > > us...@open-mpi.org
> > > Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > > Link to this post: 
> > > http://www.open-mpi.org/community/lists/users/2016/05/29320.php
> > 
> > -------------- next part --------------
> > HTML attachment scrubbed and removed
> > 
> > ------------------------------
> > 
> > Subject: Digest Footer
> > 
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > https://www.open-mpi.org/mailman/listinfo.cgi/users
> > 
> > ------------------------------
> > 
> > End of users Digest, Vol 3514, Issue 1
> > **************************************
> > 
> > 
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> > Link to this post: 
> > http://www.open-mpi.org/community/lists/users/2016/06/29341.php
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ------------------------------
> 
> Subject: Digest Footer
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> https://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> ------------------------------
> 
> End of users Digest, Vol 3518, Issue 2
> **************************************
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2016/06/29344.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
