On Nov 24, 2011, at 11:49 AM, Paul Kapinos wrote:

> Hello Ralph, Terry, all!
> 
> again, two pieces of news: the good one and the other one.
> 
> Ralph Castain wrote:
>> Yes, that would indeed break things. The 1.5 series isn't correctly checking 
>> connections across multiple interfaces until it finds one that works - it 
>> just uses the first one it sees. :-(
> 
> Yahhh!!
> This behaviour - grabbing a random interface and hanging forever if something 
> is wrong with it - is somewhat less than perfect.
> 
> From my perspective - the user's one - Open MPI should try to use either *all* 
> available networks (as 1.4 does...), starting with the high-performance 
> ones, or *only* those interfaces to which the hostnames from the hostfile 
> are bound.

It is indeed supposed to do the former - as I implied, this is a bug in the 1.5 
series.
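For what it's worth, a quick way to see which interfaces a node actually offers (and thus what Open MPI could be picking from) is to list them from sysfs. This is just an illustrative sketch for inspecting a node, not anything Open MPI itself runs:

```shell
# List the network interfaces known to the kernel on this node; on the
# cluster described in this thread the output would include eth0 and ib0.
for ifc in /sys/class/net/*; do
  echo "${ifc##*/}"
done
```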

> 
> Also, there should be timeouts (if you cannot connect to a node within a 
> minute, you will probably never be connected...)

We have debated this for some time - there is a timeout MCA param one can 
set, but we'll consider again making it the default.
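In the meantime, such a timeout can be approximated outside of Open MPI when diagnosing a suspect node. A hedged sketch (the address 10.255.255.1 below is a deliberately unreachable placeholder, not one of the cluster's hosts):

```shell
# Probe a TCP endpoint with a hard 2-second limit instead of hanging forever;
# bash's /dev/tcp pseudo-device opens the connection, timeout(1) bounds it.
host=10.255.255.1   # placeholder: a non-responding address for illustration
port=22
if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
  result=reachable
else
  result="unreachable within timeout"
fi
echo "$result"
```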

> 
> If some connection runs into a timeout, a warning would be great (and a hint 
> to exclude the interface via oob_tcp_if_exclude, btl_tcp_if_exclude).
> 
> Should it not?
> Maybe you can file it as a "call for enhancement"...

Probably the right approach at this time.

> 
> 
> 
>> The solution is to specify -mca oob_tcp_if_include ib0. This will direct the 
>> run-time wireup across the IP over IB interface.
>> You will also need the -mca btl_tcp_if_include ib0 as well so the MPI comm 
>> goes exclusively over that network. 
> 
> YES! This works. Adding
> -mca oob_tcp_if_include ib0 -mca btl_tcp_if_include ib0
> to the command line of mpiexec helps me to run the 1.5.x programs, so I 
> believe this is the workaround.
> 
> Many thanks for this hint, Ralph! My fault for not finding it in the FAQ (I was 
> so close :o) http://www.open-mpi.org/faq/?category=tcp#tcp-selection
> 
> But then I ran into yet another issue. In 
> http://www.open-mpi.org/faq/?category=tuning#setting-mca-params
> the way to define MCA parameters via environment variables is described.
> 
> I tried it:
> $ export OMPI_MCA_oob_tcp_if_include=ib0
> $ export OMPI_MCA_btl_tcp_if_include=ib0
> 
> 
> I checked it:
> $ ompi_info --param all all | grep oob_tcp_if_include
>                 MCA oob: parameter "oob_tcp_if_include" (current value: 
> <ib0>, data source: environment or cmdline)
> $ ompi_info --param all all | grep btl_tcp_if_include
>                 MCA btl: parameter "btl_tcp_if_include" (current value: 
> <ib0>, data source: environment or cmdline)
> 
> 
> But then I got the hang-up issue again!
> 
> ==> It seems mpiexec does not understand these environment variables and only 
> honors the command line options. Should that be so?

No, that isn't what is happening. The problem lies in the behavior of rsh/ssh. 
This environment does not forward environment variables. Because of limits on 
cmd line length, we don't automatically forward MCA params from the 
environment, but only from the cmd line. It is an annoying limitation, but one 
outside our control.

Put those envars in the default mca param file and the problem will be resolved.
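For reference, the per-user default param file is $HOME/.openmpi/mca-params.conf, with one `name = value` pair per line. A sketch with the two settings from this thread:

```shell
# Persist the two MCA parameters in the per-user default param file so that
# every mpiexec invocation picks them up without command-line flags.
mkdir -p "$HOME/.openmpi"
cat >> "$HOME/.openmpi/mca-params.conf" <<'EOF'
oob_tcp_if_include = ib0
btl_tcp_if_include = ib0
EOF
```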

> 
> (I also tried to forward the envvars with -x 
> OMPI_MCA_oob_tcp_if_include -x OMPI_MCA_btl_tcp_if_include - nothing changed.

I'm surprised by that - they should be picked up and forwarded. Could be a bug.

> Well, they are OMPI_ variables and should be provided in any case).

No, they aren't - they are not treated differently than any other envar.

> 
> 
> Best wishes and many thanks for all,
> 
> Paul Kapinos
> 
> 
> 
> 
>> Specifying both include and exclude should generate an error as those are 
>> mutually exclusive options - I think this was also missed in early 1.5 
>> releases and was recently patched.
>> HTH
>> Ralph
>> On Nov 23, 2011, at 12:14 PM, TERRY DONTJE wrote:
>>> On 11/23/2011 2:02 PM, Paul Kapinos wrote:
>>>> Hello Ralph, hello all,
>>>> 
>>>> Two pieces of news, as usual a good one and a bad one.
>>>> 
>>>> The good: we believe we found out *why* it hangs.
>>>> 
>>>> The bad: it seems to me this is a bug, or at least an undocumented feature, 
>>>> of Open MPI 1.5.x.
>>>> 
>>>> In detail:
>>>> As said, we see mysterious hang-ups when starting on some nodes using some 
>>>> permutations of hostnames. Usually removing "some bad" nodes helps; 
>>>> sometimes a permutation of node names in the hostfile is enough(!). The 
>>>> behaviour is reproducible.
>>>> 
>>>> The machines have at least 2 networks:
>>>> 
>>>> *eth0* is used for installation, monitoring, ... - this ethernet is very 
>>>> slim
>>>> 
>>>> *ib0* - is the "IP over IB" interface and is used for everything: the file 
>>>> systems, ssh and so on. The hostnames are bound to the ib0 network; our 
>>>> idea was not to use eth0 for MPI at all.
>>>> 
>>>> All machines are reachable from any other over ib0 (they are in one network).
>>>> 
>>>> But on eth0 there are at least two different networks; in particular, the 
>>>> computer linuxbsc025 is in a different network than the others and is not 
>>>> reachable from the other nodes over eth0! (but it is reachable over ib0; the 
>>>> name used in the hostfile is resolved to the IP of ib0).
>>>> 
>>>> So I believe that Open MPI 1.5.x tries to communicate over eth0, cannot 
>>>> do it, and hangs. 1.4.3 does not hang, so this issue is 
>>>> 1.5.x-specific (seen in 1.5.3 and 1.5.4). A bug?
>>>> 
>>>> I also tried to disable eth0 completely:
>>>> 
>>>> $ mpiexec -mca btl_tcp_if_exclude eth0,lo  -mca btl_tcp_if_include ib0 ...
>>>> 
>>> I believe if you give "-mca btl_tcp_if_include ib0" you do not need to 
>>> specify the exclude parameter.
>>>> ...but this does not help. All right, the above command should disable the 
>>>> use of eth0 for the MPI communication itself, but it hangs just before 
>>>> MPI is started, doesn't it? (because one process is missing, MPI_Init 
>>>> cannot be passed)
>>>> 
>>> By "just before the MPI is started" do you mean while orte is launching the 
>>> processes?
>>> I wonder if you need to specify "-mca oob_tcp_if_include ib0" also but I 
>>> think that may depend on which oob you are using.
>>>> Now a question: is there a way to forbid mpiexec from using some 
>>>> interfaces at all?
>>>> 
>>>> Best wishes,
>>>> 
>>>> Paul Kapinos
>>>> 
>>>> P.S. Of course we know about the good idea of bringing all nodes into the 
>>>> same net on eth0, but at this point it is impossible due to technical 
>>>> reasons...
>>>> 
>>>> P.S.2 I'm not sure that the issue is really rooted in the above-mentioned 
>>>> misconfiguration of eth0, but I have no better idea at this point...
>>>> 
>>>> 
>>>>>> The map seems to be correctly built; also the output of the daemons seems 
>>>>>> to be the same (see helloworld.txt)
>>>>> 
>>>>> Unfortunately, it appears that OMPI was not built with --enable-debug as 
>>>>> there is no debug info in the output. Without a debug installation of 
>>>>> OMPI, the ability to determine the problem is pretty limited.
>>>> 
>>>> well, this will be the next option we will activate. We also have another 
>>>> issue here, on (not) using uDAPL..
>>>> 
>>>> 
>>>>> 
>>>>> 
>>>>>>> You should also try putting that long list of nodes in a hostfile - see 
>>>>>>> if that makes a difference.
>>>>>>> It will process the nodes thru a different code path, so if there is 
>>>>>>> some problem in --host,
>>>>>>> this will tell us.
>>>>>> No, with the host file instead of the host list on the command line the 
>>>>>> behaviour is the same.
>>>>>> 
>>>>>> But, I just found out that the 1.4.3 does *not* hang on this 
>>>>>> constellation. The next thing I will try will be the installation of 
>>>>>> 1.5.4 :o)
>>>>>> 
>>>>>> Best,
>>>>>> 
>>>>>> Paul
>>>>>> 
>>>>>> P.S. started:
>>>>>> 
>>>>>> $ /opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec --hostfile 
>>>>>> hostfile-mini -mca odls_base_verbose 5 --leave-session-attached 
>>>>>> --display-map  helloworld 2>&1 | tee helloworld.txt
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Nov 21, 2011, at 9:33 AM, Paul Kapinos wrote:
>>>>>>>> Hello Open MPI volks,
>>>>>>>> 
>>>>>>>> We use Open MPI 1.5.3 on our pretty new 1800+ node InfiniBand cluster, 
>>>>>>>> and we see some strange hang-ups when starting Open MPI processes.
>>>>>>>> 
>>>>>>>> The nodes are named linuxbsc001,linuxbsc002,... (with some gaps due 
>>>>>>>> to offline nodes). Each node is accessible from each other over SSH 
>>>>>>>> (without password), and MPI programs between any two nodes have been 
>>>>>>>> checked to run.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> So far, I tried to start a bigger number of processes, one process 
>>>>>>>> per node:
>>>>>>>> $ mpiexec -np NN  --host linuxbsc001,linuxbsc002,... MPI_FastTest.exe
>>>>>>>> 
>>>>>>>> Now the problem: there are some constellations of names in the host 
>>>>>>>> list on which mpiexec reproducibly hangs forever; and more surprisingly: 
>>>>>>>> another *permutation* of the *same* node names may run without any 
>>>>>>>> errors!
>>>>>>>> 
>>>>>>>> Example: the command in laueft.txt runs OK, the command in haengt.txt 
>>>>>>>> hangs. Note: the only difference is that the node linuxbsc025 is put 
>>>>>>>> on the end of the host list. Amazed, too?
>>>>>>>> 
>>>>>>>> Looking at the particular nodes during the above mpiexec hang, we 
>>>>>>>> found the orted daemons started on *each* node and the binary on all 
>>>>>>>> but one node (orted.txt, MPI_FastTest.txt).
>>>>>>>> Again amazing that the node with no user process started (leading, I 
>>>>>>>> believe, to the hang-up in MPI_Init of all processes) was 
>>>>>>>> always the same, linuxbsc005, which is NOT the permuted item 
>>>>>>>> linuxbsc025...
>>>>>>>> 
>>>>>>>> This behaviour is reproducible. The hang-up only occurs if the started 
>>>>>>>> application is an MPI application ("hostname" does not hang).
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Any idea what is going on?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> 
>>>>>>>> Paul Kapinos
>>>>>>>> 
>>>>>>>> 
>>>>>>>> P.S: no alias names used, all names are real ones
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> -- 
>>>>>>>> Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
>>>>>>>> RWTH Aachen University, Center for Computing and Communication
>>>>>>>> Seffenter Weg 23,  D 52074  Aachen (Germany)
>>>>>>>> Tel: +49 241/80-24915
>>>>>>>> linuxbsc001: STDOUT: 24323 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc002: STDOUT:  2142 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc003: STDOUT: 69266 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc004: STDOUT: 58899 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc006: STDOUT: 68255 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc007: STDOUT: 62026 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc008: STDOUT: 54221 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc009: STDOUT: 55482 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc010: STDOUT: 59380 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc011: STDOUT: 58312 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc014: STDOUT: 56013 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc016: STDOUT: 58563 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc017: STDOUT: 54693 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc018: STDOUT: 54187 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc020: STDOUT: 55811 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc021: STDOUT: 54982 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc022: STDOUT: 50032 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc023: STDOUT: 54044 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc024: STDOUT: 51247 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc025: STDOUT: 18575 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc027: STDOUT: 48969 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc028: STDOUT: 52397 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc029: STDOUT: 52780 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc030: STDOUT: 47537 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc031: STDOUT: 54609 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> linuxbsc032: STDOUT: 52833 ?        SLl    0:00 MPI_FastTest.exe
>>>>>>>> $ timex /opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec -np 27  --host 
>>>>>>>> linuxbsc001,linuxbsc002,linuxbsc003,linuxbsc004,linuxbsc005,linuxbsc006,linuxbsc007,linuxbsc008,linuxbsc009,linuxbsc010,linuxbsc011,linuxbsc014,linuxbsc016,linuxbsc017,linuxbsc018,linuxbsc020,linuxbsc021,linuxbsc022,linuxbsc023,linuxbsc024,linuxbsc025,linuxbsc027,linuxbsc028,linuxbsc029,linuxbsc030,linuxbsc031,linuxbsc032
>>>>>>>>  MPI_FastTest.exe
>>>>>>>> $ timex /opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec -np 27  --host 
>>>>>>>> linuxbsc001,linuxbsc002,linuxbsc003,linuxbsc004,linuxbsc005,linuxbsc006,linuxbsc007,linuxbsc008,linuxbsc009,linuxbsc010,linuxbsc011,linuxbsc014,linuxbsc016,linuxbsc017,linuxbsc018,linuxbsc020,linuxbsc021,linuxbsc022,linuxbsc023,linuxbsc024,linuxbsc027,linuxbsc028,linuxbsc029,linuxbsc030,linuxbsc031,linuxbsc032,linuxbsc025
>>>>>>>>  MPI_FastTest.exe
>>>>>>>> linuxbsc001: STDOUT: 24322 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 1 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc002: STDOUT:  2141 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 2 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc003: STDOUT: 69265 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 3 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc004: STDOUT: 58898 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 4 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc005: STDOUT: 65642 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 5 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc006: STDOUT: 68254 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 6 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc007: STDOUT: 62025 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 7 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc008: STDOUT: 54220 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 8 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc009: STDOUT: 55481 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 9 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc010: STDOUT: 59379 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 10 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc011: STDOUT: 58311 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 11 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc014: STDOUT: 56012 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 12 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc016: STDOUT: 58562 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 13 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc017: STDOUT: 54692 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 14 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc018: STDOUT: 54186 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 15 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc020: STDOUT: 55810 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 16 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc021: STDOUT: 54981 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 17 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc022: STDOUT: 50031 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 18 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc023: STDOUT: 54043 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 19 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc024: STDOUT: 51246 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 20 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc025: STDOUT: 18574 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 21 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc027: STDOUT: 48968 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 22 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc028: STDOUT: 52396 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 23 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc029: STDOUT: 52779 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 24 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc030: STDOUT: 47536 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 25 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc031: STDOUT: 54608 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 26 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> linuxbsc032: STDOUT: 52832 ?        Ss     0:00 
>>>>>>>> /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env 
>>>>>>>> -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 27 -mca 
>>>>>>>> orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 
>>>>>>>> -mca plm rsh
>>>>>>>> _______________________________________________
>>>>>>>> users mailing list
>>>>>>>> us...@open-mpi.org
>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>> 
>>>>>> linuxbsc005 slots=1
>>>>>> linuxbsc006 slots=1
>>>>>> linuxbsc007 slots=1
>>>>>> linuxbsc008 slots=1
>>>>>> linuxbsc009 slots=1
>>>>>> linuxbsc010 slots=1
>>>>>> linuxbsc011 slots=1
>>>>>> linuxbsc014 slots=1
>>>>>> linuxbsc016 slots=1
>>>>>> linuxbsc017 slots=1
>>>>>> linuxbsc018 slots=1
>>>>>> linuxbsc020 slots=1
>>>>>> linuxbsc021 slots=1
>>>>>> linuxbsc022 slots=1
>>>>>> linuxbsc023 slots=1
>>>>>> linuxbsc024 slots=1
>>>>>> linuxbsc025 slots=1[linuxc2.rz.RWTH-Aachen.DE:22229] mca:base:select:( 
>>>>>> odls) Querying component [default]
>>>>>> [linuxc2.rz.RWTH-Aachen.DE:22229] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxc2.rz.RWTH-Aachen.DE:22229] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> 
>>>>>> ========================   JOB MAP   ========================
>>>>>> 
>>>>>> Data for node: linuxbsc005    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 0
>>>>>> 
>>>>>> Data for node: linuxbsc006    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 1
>>>>>> 
>>>>>> Data for node: linuxbsc007    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 2
>>>>>> 
>>>>>> Data for node: linuxbsc008    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 3
>>>>>> 
>>>>>> Data for node: linuxbsc009    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 4
>>>>>> 
>>>>>> Data for node: linuxbsc010    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 5
>>>>>> 
>>>>>> Data for node: linuxbsc011    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 6
>>>>>> 
>>>>>> Data for node: linuxbsc014    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 7
>>>>>> 
>>>>>> Data for node: linuxbsc016    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 8
>>>>>> 
>>>>>> Data for node: linuxbsc017    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 9
>>>>>> 
>>>>>> Data for node: linuxbsc018    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 10
>>>>>> 
>>>>>> Data for node: linuxbsc020    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 11
>>>>>> 
>>>>>> Data for node: linuxbsc021    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 12
>>>>>> 
>>>>>> Data for node: linuxbsc022    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 13
>>>>>> 
>>>>>> Data for node: linuxbsc023    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 14
>>>>>> 
>>>>>> Data for node: linuxbsc024    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 15
>>>>>> 
>>>>>> Data for node: linuxbsc025    Num procs: 1
>>>>>>    Process OMPI jobid: [87,1] Process rank: 16
>>>>>> 
>>>>>> =============================================================
>>>>>> [linuxbsc007.rz.RWTH-Aachen.DE:07574] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc007.rz.RWTH-Aachen.DE:07574] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc007.rz.RWTH-Aachen.DE:07574] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc016.rz.RWTH-Aachen.DE:03146] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc016.rz.RWTH-Aachen.DE:03146] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc016.rz.RWTH-Aachen.DE:03146] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc005.rz.RWTH-Aachen.DE:22051] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc005.rz.RWTH-Aachen.DE:22051] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc005.rz.RWTH-Aachen.DE:22051] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc011.rz.RWTH-Aachen.DE:07131] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc011.rz.RWTH-Aachen.DE:07131] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc011.rz.RWTH-Aachen.DE:07131] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc025.rz.RWTH-Aachen.DE:43153] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc025.rz.RWTH-Aachen.DE:43153] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc025.rz.RWTH-Aachen.DE:43153] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc017.rz.RWTH-Aachen.DE:05044] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc017.rz.RWTH-Aachen.DE:05044] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc017.rz.RWTH-Aachen.DE:05044] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc018.rz.RWTH-Aachen.DE:01840] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc018.rz.RWTH-Aachen.DE:01840] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc018.rz.RWTH-Aachen.DE:01840] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc024.rz.RWTH-Aachen.DE:79549] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc024.rz.RWTH-Aachen.DE:79549] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc024.rz.RWTH-Aachen.DE:79549] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc022.rz.RWTH-Aachen.DE:73501] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc022.rz.RWTH-Aachen.DE:73501] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc022.rz.RWTH-Aachen.DE:73501] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc023.rz.RWTH-Aachen.DE:03364] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc023.rz.RWTH-Aachen.DE:03364] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc023.rz.RWTH-Aachen.DE:03364] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc006.rz.RWTH-Aachen.DE:16811] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc006.rz.RWTH-Aachen.DE:16811] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc006.rz.RWTH-Aachen.DE:16811] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc014.rz.RWTH-Aachen.DE:10206] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc014.rz.RWTH-Aachen.DE:10206] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc014.rz.RWTH-Aachen.DE:10206] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc008.rz.RWTH-Aachen.DE:00858] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc008.rz.RWTH-Aachen.DE:00858] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc008.rz.RWTH-Aachen.DE:00858] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc010.rz.RWTH-Aachen.DE:09727] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc010.rz.RWTH-Aachen.DE:09727] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc010.rz.RWTH-Aachen.DE:09727] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc020.rz.RWTH-Aachen.DE:06680] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc020.rz.RWTH-Aachen.DE:06680] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc020.rz.RWTH-Aachen.DE:06680] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc009.rz.RWTH-Aachen.DE:05145] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc009.rz.RWTH-Aachen.DE:05145] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc009.rz.RWTH-Aachen.DE:05145] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>>> [linuxbsc021.rz.RWTH-Aachen.DE:01405] mca:base:select:( odls) Querying 
>>>>>> component [default]
>>>>>> [linuxbsc021.rz.RWTH-Aachen.DE:01405] mca:base:select:( odls) Query of 
>>>>>> component [default] set priority to 1
>>>>>> [linuxbsc021.rz.RWTH-Aachen.DE:01405] mca:base:select:( odls) Selected 
>>>>>> component [default]
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> -- 
>>> Terry D. Dontje | Principal Software Engineer
>>> Developer Tools Engineering | +1.781.442.2631
>>> Oracle * - Performance Technologies*
>>> 95 Network Drive, Burlington, MA 01803
>>> Email terry.don...@oracle.com
>>> 
>>> 
>>> 
> 
> 

