Hi Gilles,

> there are several confusing things here...
>
> your first post clearly shown /usr/bin/ssh is used, but you expect using
> /usr/local/bin/ssh instead

Yes, it is very confusing, because I've never had this problem before (as far as I remember), and as far as I know we haven't changed anything in our environment regarding "ssh". In my last e-mail I reported strange components in my environment variables ("bin" and "lib64"), which I don't get with openmpi-1.9 or openmpi-1.8.4. Furthermore, the order of my hosts doesn't matter with these versions.

tyr hello_1 107 ompi_info | grep -e "MPI:" -e "Open MPI repo revision:"
Open MPI: 1.9.0a1
Open MPI repo revision: dev-1708-g8497a6a
tyr hello_1 108 mpiexec -np 5 --host sunpc1,linpc1,tyr,rs0 environ_mpi | grep -e " bin" -e " lib64"
tyr hello_1 109 mpiexec -np 5 --host linpc1,sunpc1,tyr,rs0 environ_mpi | grep -e " bin" -e " lib64"
tyr hello_1 110

tyr hello_1 107 ompi_info | grep -e "MPI:" -e "Open MPI repo revision:"
Open MPI: 1.8.4
Open MPI repo revision: v1.8.3-330-g0344f04
tyr hello_1 108 mpiexec -np 5 --host sunpc1,linpc1,tyr,rs0 environ_mpi | grep -e " bin" -e " lib64"
tyr hello_1 109 mpiexec -np 5 --host linpc1,sunpc1,tyr,rs0 environ_mpi | grep -e " bin" -e " lib64"
tyr hello_1 110

With openmpi-1.8.5 the order of the hosts is relevant, and I get the strange environment variables.

tyr hello_1 111 ompi_info | grep -e "MPI:" -e "Open MPI repo revision:"
Open MPI: 1.8.5
Open MPI repo revision: v1.8.4-333-g039fb11
tyr hello_1 112 mpiexec -np 5 --host sunpc1,linpc1,tyr,rs0 environ_mpi | grep -e " bin" -e " lib64"
ld.so.1: ssh: fatal: relocation error: file /usr/bin/ssh: symbol SUNWcry_installed: referenced symbol not found
* not finding the required libraries and/or binaries on
* compilation of the orted with dynamic libraries when static are required
^CKilled by signal 2.
mpiexec: abort is already in progress...hit ctrl-c again to forcibly terminate
tyr hello_1 113 mpiexec -np 5 --host linpc1,sunpc1,tyr,rs0 environ_mpi | grep -e " bin" -e " lib64"
bin
lib64
bin
lib64
bin
lib64
tyr hello_1 114

The above values are not part of the environment variables if I use your suggested command.

tyr hello_1 110 ssh linpc1 env | grep ^LD_LIBRARY_PATH=
LD_LIBRARY_PATH=/usr/local/openmpi-1.9.0_64_gcc/lib:...
tyr hello_1 111 ssh linpc1 env | grep ^PATH=
PATH=/usr/local/openmpi-1.9.0_64_gcc/bin:...

tyr hello_1 108 ssh linpc1 env | grep ^LD_LIBRARY_PATH=
LD_LIBRARY_PATH=/usr/local/openmpi-1.8.5_64_gcc/lib:...
tyr hello_1 109 ssh linpc1 env | grep ^LD_LIBRARY_PATH= | grep ":lib64"
tyr hello_1 110
tyr hello_1 110 ssh linpc1 env | grep ^PATH=
PATH=/usr/local/openmpi-1.8.5_64_gcc/bin:
tyr hello_1 111 ssh linpc1 env | grep ^PATH= | grep ":bin"
tyr hello_1 112

> tyr hello_1 292 ssh linpc1 echo $LD_LIBRARY_PATH
> will unlikely do what you expect, you'd rather run
> tyr hello_1 292 ssh linpc1 env | grep ^LD_LIBRARY_PATH=
>
> did you configure ompi with --enable-mpirun-prefix-by-default ?

No. I used the following command for gcc-4.9.2:

../openmpi-1.8.5/configure --prefix=/usr/local/openmpi-1.8.5_64_gcc \
  --libdir=/usr/local/openmpi-1.8.5_64_gcc/lib64 \
  --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
  --with-jdk-headers=/usr/local/jdk1.8.0/include \
  JAVA_HOME=/usr/local/jdk1.8.0 \
  LDFLAGS="-m64" CC="gcc" CXX="g++" FC="gfortran" \
  CFLAGS="-m64" CXXFLAGS="-m64" FCFLAGS="-m64" \
  CPP="cpp" CXXCPP="cpp" \
  CPPFLAGS="" CXXCPPFLAGS="" \
  --enable-mpi-cxx \
  --enable-cxx-exceptions \
  --enable-mpi-java \
  --enable-heterogeneous \
  --enable-mpi-thread-multiple \
  --with-threads=posix \
  --with-hwloc=internal \
  --without-verbs \
  --with-wrapper-cflags="-std=c11 -m64" \
  --with-wrapper-cxxflags="-m64" \
  --with-wrapper-fcflags="-m64" \
  --enable-debug \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc

> can you try to run
> mpiexec --mca plm_base_verbose 100
> --prefix /usr/local/openmpi-1.8.5_64_cc/bin ...
> and see if it helps ?

Today I used "gcc" instead of "cc" because I only have a gcc version of openmpi-1.9 ("Sun C 5.13" breaks on my old Linux version for openmpi-1.9 with an internal error, which I have reported before). I get the same behaviour with both compilers for openmpi-1.8.5.

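On the quoted remark about "ssh linpc1 echo $LD_LIBRARY_PATH": the variable is expanded by the local shell before ssh starts, so the command shows the local value rather than the remote one. The following is only a minimal sketch of that effect. It assumes a POSIX sh, and it uses a child shell with a different value of a made-up variable (DEMO_VAR) as a stand-in for the remote host, so the helper "remote" is hypothetical and no ssh connection is involved.

```shell
#!/bin/sh
# Hypothetical stand-in for "ssh host <command>": run the command string in
# a child shell whose DEMO_VAR differs from ours, the way a remote host's
# environment would differ from the local one.
remote() {
    DEMO_VAR=remote-value sh -c "$*"
}

DEMO_VAR=local-value
export DEMO_VAR

# Unquoted: the calling shell expands $DEMO_VAR before "remote" runs,
# so the child shell only ever sees the local value.
remote echo $DEMO_VAR

# Single-quoted: the text $DEMO_VAR reaches the child shell unexpanded,
# so the child's own value is printed.
remote 'echo $DEMO_VAR'

# The form suggested in the quote: list the child's environment and filter
# on the calling side, which needs no quoting at all.
remote env | grep '^DEMO_VAR='
```

The first call prints the local value, the other two the value seen by the child shell. The same reasoning is why "ssh linpc1 env | grep ^LD_LIBRARY_PATH=" is the more reliable check.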
tyr hello_1 115 mpiexec --mca plm_base_verbose 100 --prefix /usr/local/openmpi-1.8.5_64_gcc/bin -np 5 --host sunpc1,linpc1,tyr,rs0 environ_mpi
[tyr.informatik.hs-fulda.de:03810] mca: base: components_register: registering plm components
...
/usr/local/openmpi-1.8.5_64_gcc/bin/bin/orted: Befehl nicht gefunden.
tyr hello_1 116

That is the wrong directory for "orted": with "--prefix /usr/local/openmpi-1.8.5_64_gcc/bin" the launcher looks for /usr/local/openmpi-1.8.5_64_gcc/bin/bin/orted, and the shell reports "Befehl nicht gefunden" (German for "command not found"). I get better results when I use "--prefix" without "bin", and I even get the output from my program. Without "--prefix" the program hangs once more, and I get "bin" and "lib64" in my environment. I added this output further down (look for "###########").

tyr hello_1 129 mpiexec --mca plm_base_verbose 100 --prefix /usr/local/openmpi-1.8.5_64_gcc/ -np 5 --host sunpc1,linpc1,tyr,rs0 environ_mpi
[tyr.informatik.hs-fulda.de:03925] mca: base: components_register: registering plm components
[tyr.informatik.hs-fulda.de:03925] mca: base: components_register: found loaded component isolated
[tyr.informatik.hs-fulda.de:03925] mca: base: components_register: component isolated has no register or open function
[tyr.informatik.hs-fulda.de:03925] mca: base: components_register: found loaded component rsh
[tyr.informatik.hs-fulda.de:03925] mca: base: components_register: component rsh register function successful
[tyr.informatik.hs-fulda.de:03925] mca: base: components_open: opening plm components
[tyr.informatik.hs-fulda.de:03925] mca: base: components_open: found loaded component isolated
[tyr.informatik.hs-fulda.de:03925] mca: base: components_open: component isolated open function successful
[tyr.informatik.hs-fulda.de:03925] mca: base: components_open: found loaded component rsh
[tyr.informatik.hs-fulda.de:03925] mca: base: components_open: component rsh open function successful
[tyr.informatik.hs-fulda.de:03925] mca:base:select: Auto-selecting plm components
[tyr.informatik.hs-fulda.de:03925] mca:base:select:( plm) Querying component [isolated]
[tyr.informatik.hs-fulda.de:03925] mca:base:select:( plm) Query of component
[isolated] set priority to 0 [tyr.informatik.hs-fulda.de:03925] mca:base:select:( plm) Querying component [rsh] [tyr.informatik.hs-fulda.de:03925] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh path NULL [tyr.informatik.hs-fulda.de:03925] mca:base:select:( plm) Query of component [rsh] set priority to 10 [tyr.informatik.hs-fulda.de:03925] mca:base:select:( plm) Selected component [rsh] [tyr.informatik.hs-fulda.de:03925] mca: base: close: component isolated closed [tyr.informatik.hs-fulda.de:03925] mca: base: close: unloading component isolated [tyr.informatik.hs-fulda.de:03925] plm:base:set_hnp_name: initial bias 3925 nodename hash 339128848 [tyr.informatik.hs-fulda.de:03925] plm:base:set_hnp_name: final jobfam 43379 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh_setup on agent ssh : rsh path NULL [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive start comm [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:setup_job [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:setup_vm [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:setup_vm creating map [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] setup:vm: working unmanaged allocation [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] using dash_host [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] checking node sunpc1 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] checking node linpc1 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] checking node tyr [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] ignoring myself [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] checking node rs0 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:setup_vm add new daemon [[43379,0],1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:setup_vm assigning new daemon [[43379,0],1] to node sunpc1 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:setup_vm add new daemon [[43379,0],2] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:setup_vm assigning new 
daemon [[43379,0],2] to node linpc1 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:setup_vm add new daemon [[43379,0],3] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:setup_vm assigning new daemon [[43379,0],3] to node rs0 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: launching vm [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: local shell: 2 (tcsh) [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: assuming same remote shell as local shell [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: remote shell: 2 (tcsh) [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: final template argv: /usr/local/bin/ssh <template> set path = ( /usr/local/openmpi-1.8.5_64_gcc/bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64 ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64:$LD_LIBRARY_PATH ; if ( $?DYLD_LIBRARY_PATH == 1 ) set OMPI_have_dllp ; if ( $?DYLD_LIBRARY_PATH == 0 ) setenv DYLD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64 ; if ( $?OMPI_have_dllp == 1 ) setenv DYLD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64:$DYLD_LIBRARY_PATH ; /usr/local/openmpi-1.8.5_64_gcc/bin/orted --hnp-topo-sig 2N:2S:0L3:0L2:0L1:2C:2H:sun4u -mca ess "env" -mca orte_ess_jobid "2842886144" -mca orte_ess_vpid "<template>" -mca orte_ess_num_procs "4" -mca orte_hnp_uri "2842886144.0;tcp://193.174.24.39:34966" --tree-spawn --mca plm_base_verbose "100" -mca plm "rsh" [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh:launch daemon 0 not a child of mine [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: adding node sunpc1 to launch list [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: adding node linpc1 to launch list [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh:launch daemon 3 not a child of mine [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: 
activating launch event [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: executing: (/usr/local/bin/ssh) [/usr/local/bin/ssh sunpc1 set path = ( /usr/local/openmpi-1.8.5_64_gcc/bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64 ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64:$LD_LIBRARY_PATH ; if ( $?DYLD_LIBRARY_PATH == 1 ) set OMPI_have_dllp ; if ( $?DYLD_LIBRARY_PATH == 0 ) setenv DYLD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64 ; if ( $?OMPI_have_dllp == 1 ) setenv DYLD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64:$DYLD_LIBRARY_PATH ; /usr/local/openmpi-1.8.5_64_gcc/bin/orted --hnp-topo-sig 2N:2S:0L3:0L2:0L1:2C:2H:sun4u -mca ess "env" -mca orte_ess_jobid "2842886144" -mca orte_ess_vpid 1 -mca orte_ess_num_procs "4" -mca orte_hnp_uri "2842886144.0;tcp://193.174.24.39:34966" --tree-spawn --mca plm_base_verbose "100" -mca plm "rsh"] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: recording launch of daemon [[43379,0],1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: executing: (/usr/local/bin/ssh) [/usr/local/bin/ssh linpc1 set path = ( /usr/local/openmpi-1.8.5_64_gcc/bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64 ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64:$LD_LIBRARY_PATH ; if ( $?DYLD_LIBRARY_PATH == 1 ) set OMPI_have_dllp ; if ( $?DYLD_LIBRARY_PATH == 0 ) setenv DYLD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64 ; if ( $?OMPI_have_dllp == 1 ) setenv DYLD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64:$DYLD_LIBRARY_PATH ; /usr/local/openmpi-1.8.5_64_gcc/bin/orted --hnp-topo-sig 2N:2S:0L3:0L2:0L1:2C:2H:sun4u -mca ess "env" -mca orte_ess_jobid "2842886144" -mca orte_ess_vpid 2 -mca orte_ess_num_procs "4" -mca 
orte_hnp_uri "2842886144.0;tcp://193.174.24.39:34966" --tree-spawn --mca plm_base_verbose "100" -mca plm "rsh"] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:rsh: recording launch of daemon [[43379,0],2] Warning: untrusted X11 forwarding setup failed: xauth key data not generated Warning: No xauth data; using fake authentication data for X11 forwarding. X11 forwarding request failed on channel 0 [sunpc1:12951] mca: base: components_register: registering plm components [sunpc1:12951] mca: base: components_register: found loaded component rsh Warning: untrusted X11 forwarding setup failed: xauth key data not generated Warning: No xauth data; using fake authentication data for X11 forwarding. [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:orted_report_launch from daemon [[43379,0],1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:orted_report_launch from daemon [[43379,0],1] on node sunpc1 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] RECEIVED TOPOLOGY FROM NODE sunpc1 SIG 2N:2S:0L3:0L2:0L1:4C:4H:i86pc [tyr.informatik.hs-fulda.de:03925] Type: Machine Number of child objects: 2 Name=NULL total=8387112KB Backend=Solaris OSName=SunOS OSRelease=5.10 OSVersion=Generic_147441-21 Architecture=i86pc Cpuset: 0x0000000f Online: 0x0000000f Allowed: 0x0000000f Bind CPU proc: TRUE Bind CPU thread: TRUE Bind MEM proc: TRUE Bind MEM thread: TRUE Type: NUMANode Number of child objects: 1 Name=NULL local=4192808KB total=4192808KB Cpuset: 0x00000003 Online: 0x00000003 Allowed: 0x00000003 Type: Socket Number of child objects: 2 Name=NULL CPUType= CPUModel=i86pc CPUVendor=AuthenticAMD CPUModelNumber=33 CPUFamilyNumber=15 Cpuset: 0x00000003 Online: 0x00000003 Allowed: 0x00000003 Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000001 Online: 0x00000001 Allowed: 0x00000001 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000001 Online: 0x00000001 Allowed: 0x00000001 Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000002 
Online: 0x00000002 Allowed: 0x00000002 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000002 Online: 0x00000002 Allowed: 0x00000002 Type: NUMANode Number of child objects: 1 Name=NULL local=4194304KB total=4194304KB Cpuset: 0x0000000c Online: 0x0000000c Allowed: 0x0000000c Type: Socket Number of child objects: 2 Name=NULL CPUType= CPUModel=i86pc CPUVendor=AuthenticAMD CPUModelNumber=33 CPUFamilyNumber=15 Cpuset: 0x0000000c Online: 0x0000000c Allowed: 0x0000000c Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000004 Online: 0x00000004 Allowed: 0x00000004 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000004 Online: 0x00000004 Allowed: 0x00000004 Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000008 Online: 0x00000008 Allowed: 0x00000008 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000008 Online: 0x00000008 Allowed: 0x00000008 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] ADDING TOPOLOGY PER USER REQUEST [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:orted_report_launch completed for daemon [[43379,0],1] at contact 2842886144.1;tcp://193.174.26.210:50382 [sunpc1:12951] mca: base: components_register: component rsh register function successful [sunpc1:12951] mca: base: components_open: opening plm components [sunpc1:12951] mca: base: components_open: found loaded component rsh [sunpc1:12951] mca: base: components_open: component rsh open function successful [sunpc1:12951] mca:base:select: Auto-selecting plm components [sunpc1:12951] mca:base:select:( plm) Querying component [rsh] [sunpc1:12951] [[43379,0],1] plm:rsh_lookup on agent ssh : rsh path NULL [sunpc1:12951] mca:base:select:( plm) Query of component [rsh] set priority to 10 [sunpc1:12951] mca:base:select:( plm) Selected component [rsh] [sunpc1:12951] [[43379,0],1] plm:rsh_setup on agent ssh : rsh path NULL [sunpc1:12951] [[43379,0],1] plm:base:receive start comm [sunpc1:12951] [[43379,0],1] plm:rsh: remote spawn called [sunpc1:12951] 
[[43379,0],1] plm:rsh: local shell: 2 (tcsh) [sunpc1:12951] [[43379,0],1] plm:rsh: assuming same remote shell as local shell [sunpc1:12951] [[43379,0],1] plm:rsh: remote shell: 2 (tcsh) [sunpc1:12951] [[43379,0],1] plm:rsh: final template argv: /usr/local/bin/ssh <template> set path = ( /usr/local/openmpi-1.8.5_64_gcc/bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64 ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64:$LD_LIBRARY_PATH ; if ( $?DYLD_LIBRARY_PATH == 1 ) set OMPI_have_dllp ; if ( $?DYLD_LIBRARY_PATH == 0 ) setenv DYLD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64 ; if ( $?OMPI_have_dllp == 1 ) setenv DYLD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64:$DYLD_LIBRARY_PATH ; /usr/local/openmpi-1.8.5_64_gcc/bin/orted --hnp-topo-sig 2N:2S:0L3:0L2:0L1:4C:4H:i86pc -mca ess "env" -mca orte_ess_jobid "2842886144" -mca orte_ess_vpid "<template>" -mca orte_ess_num_procs "4" -mca orte_parent_uri "2842886144.1;tcp://193.174.26.210:50382" -mca orte_hnp_uri "2842886144.0;tcp://193.174.24.39:34966" --mca plm_base_verbose "100" -mca plm "rsh" [sunpc1:12951] [[43379,0],1] plm:rsh: activating launch event [sunpc1:12951] [[43379,0],1] plm:rsh: recording launch of daemon [[43379,0],3] [sunpc1:12951] [[43379,0],1] plm:rsh: executing: (/usr/local/bin/ssh) [/usr/local/bin/ssh rs0 set path = ( /usr/local/openmpi-1.8.5_64_gcc/bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64 ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64:$LD_LIBRARY_PATH ; if ( $?DYLD_LIBRARY_PATH == 1 ) set OMPI_have_dllp ; if ( $?DYLD_LIBRARY_PATH == 0 ) setenv DYLD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib64 ; if ( $?OMPI_have_dllp == 1 ) setenv DYLD_LIBRARY_PATH 
/usr/local/openmpi-1.8.5_64_gcc/lib64:$DYLD_LIBRARY_PATH ; /usr/local/openmpi-1.8.5_64_gcc/bin/orted --hnp-topo-sig 2N:2S:0L3:0L2:0L1:4C:4H:i86pc -mca ess "env" -mca orte_ess_jobid "2842886144" -mca orte_ess_vpid 3 -mca orte_ess_num_procs "4" -mca orte_parent_uri "2842886144.1;tcp://193.174.26.210:50382" -mca orte_hnp_uri "2842886144.0;tcp://193.174.24.39:34966" --mca plm_base_verbose "100" -mca plm "rsh" --tree-spawn] [rs0.informatik.hs-fulda.de:15815] mca: base: components_register: registering plm components [rs0.informatik.hs-fulda.de:15815] mca: base: components_register: found loaded component rsh [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:orted_report_launch from daemon [[43379,0],3] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:orted_report_launch from daemon [[43379,0],3] on node rs0 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] RECEIVED TOPOLOGY FROM NODE rs0 SIG 1N:2S:0L3:0L2:0L1:8C:16H:sun4u [tyr.informatik.hs-fulda.de:03925] Type: Machine Number of child objects: 1 Name=NULL total=33554432KB Backend=Solaris OSName=SunOS OSRelease=5.10 OSVersion=Generic_150400-10 Architecture=sun4u Cpuset: 0x0000ffff Online: 0x0000ffff Allowed: 0x0000ffff Bind CPU proc: TRUE Bind CPU thread: TRUE Bind MEM proc: TRUE Bind MEM thread: TRUE Type: NUMANode Number of child objects: 2 Name=NULL local=33554432KB total=33554432KB Cpuset: 0x0000ffff Online: 0x0000ffff Allowed: 0x0000ffff Type: Socket Number of child objects: 4 Name=NULL CPUType=sparcv9 CPUModel=SPARC64_VII Cpuset: 0x000000ff Online: 0x000000ff Allowed: 0x000000ff Type: Core Number of child objects: 2 Name=NULL Cpuset: 0x00000003 Online: 0x00000003 Allowed: 0x00000003 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000001 Online: 0x00000001 Allowed: 0x00000001 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000002 Online: 0x00000002 Allowed: 0x00000002 Type: Core Number of child objects: 2 Name=NULL Cpuset: 0x0000000c Online: 0x0000000c Allowed: 0x0000000c Type: 
PU Number of child objects: 0 Name=NULL Cpuset: 0x00000004 Online: 0x00000004 Allowed: 0x00000004 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000008 Online: 0x00000008 Allowed: 0x00000008 Type: Core Number of child objects: 2 Name=NULL Cpuset: 0x00000030 Online: 0x00000030 Allowed: 0x00000030 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000010 Online: 0x00000010 Allowed: 0x00000010 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000020 Online: 0x00000020 Allowed: 0x00000020 Type: Core Number of child objects: 2 Name=NULL Cpuset: 0x000000c0 Online: 0x000000c0 Allowed: 0x000000c0 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000040 Online: 0x00000040 Allowed: 0x00000040 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000080 Online: 0x00000080 Allowed: 0x00000080 Type: Socket Number of child objects: 4 Name=NULL CPUType=sparcv9 CPUModel=SPARC64_VII Cpuset: 0x0000ff00 Online: 0x0000ff00 Allowed: 0x0000ff00 Type: Core Number of child objects: 2 Name=NULL Cpuset: 0x00000300 Online: 0x00000300 Allowed: 0x00000300 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000100 Online: 0x00000100 Allowed: 0x00000100 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000200 Online: 0x00000200 Allowed: 0x00000200 Type: Core Number of child objects: 2 Name=NULL Cpuset: 0x00000c00 Online: 0x00000c00 Allowed: 0x00000c00 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000400 Online: 0x00000400 Allowed: 0x00000400 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000800 Online: 0x00000800 Allowed: 0x00000800 Type: Core Number of child objects: 2 Name=NULL Cpuset: 0x00003000 Online: 0x00003000 Allowed: 0x00003000 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00001000 Online: 0x00001000 Allowed: 0x00001000 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00002000 Online: 0x00002000 Allowed: 0x00002000 Type: Core Number of child objects: 2 Name=NULL Cpuset: 0x0000c000 Online: 
0x0000c000 Allowed: 0x0000c000 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00004000 Online: 0x00004000 Allowed: 0x00004000 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00008000 Online: 0x00008000 Allowed: 0x00008000 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] NEW TOPOLOGY - ADDING [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:orted_report_launch completed for daemon [[43379,0],3] at contact 2842886144.3;tcp://193.174.26.198,192.168.128.1,10.1.1.2:59024 [rs0.informatik.hs-fulda.de:15815] mca: base: components_register: component rsh register function successful [rs0.informatik.hs-fulda.de:15815] mca: base: components_open: opening plm components [rs0.informatik.hs-fulda.de:15815] mca: base: components_open: found loaded component rsh [rs0.informatik.hs-fulda.de:15815] mca: base: components_open: component rsh open function successful [rs0.informatik.hs-fulda.de:15815] mca:base:select: Auto-selecting plm components [rs0.informatik.hs-fulda.de:15815] mca:base:select:( plm) Querying component [rsh] [rs0.informatik.hs-fulda.de:15815] [[43379,0],3] plm:rsh_lookup on agent ssh : rsh path NULL [rs0.informatik.hs-fulda.de:15815] mca:base:select:( plm) Query of component [rsh] set priority to 10 [rs0.informatik.hs-fulda.de:15815] mca:base:select:( plm) Selected component [rsh] [rs0.informatik.hs-fulda.de:15815] [[43379,0],3] plm:rsh_setup on agent ssh : rsh path NULL [rs0.informatik.hs-fulda.de:15815] [[43379,0],3] plm:base:receive start comm [rs0.informatik.hs-fulda.de:15815] [[43379,0],3] plm:rsh: remote spawn called [rs0.informatik.hs-fulda.de:15815] [[43379,0],3] plm:rsh: remote spawn - have no children! 
[linpc1:24344] mca: base: components_register: registering plm components [linpc1:24344] mca: base: components_register: found loaded component rsh [linpc1:24344] mca: base: components_register: component rsh register function successful [linpc1:24344] mca: base: components_open: opening plm components [linpc1:24344] mca: base: components_open: found loaded component rsh [linpc1:24344] mca: base: components_open: component rsh open function successful [linpc1:24344] mca:base:select: Auto-selecting plm components [linpc1:24344] mca:base:select:( plm) Querying component [rsh] [linpc1:24344] [[43379,0],2] plm:rsh_lookup on agent ssh : rsh path NULL [linpc1:24344] mca:base:select:( plm) Query of component [rsh] set priority to 10 [linpc1:24344] mca:base:select:( plm) Selected component [rsh] [linpc1:24344] [[43379,0],2] plm:rsh_setup on agent ssh : rsh path NULL [linpc1:24344] [[43379,0],2] plm:base:receive start comm [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:orted_report_launch from daemon [[43379,0],2] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:orted_report_launch from daemon [[43379,0],2] on node linpc1 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] RECEIVED TOPOLOGY FROM NODE linpc1 SIG 2N:2S:0L3:4L2:4L1:4C:4H:x86_64 [tyr.informatik.hs-fulda.de:03925] Type: Machine Number of child objects: 2 Name=NULL total=8387048KB DMIProductName="Sun Ultra 40 Workstation" DMIProductVersion=11 DMIBoardVendor="Sun Microsystems" DMIBoardName="Sun Ultra 40 Workstation" DMIBoardVersion=01 DMIBoardAssetTag= DMIChassisVendor="Sun Microsystems" DMIChassisType=17 DMIChassisVersion=01 DMIChassisAssetTag= DMIBIOSVendor="Phoenix Technologies Ltd." 
DMIBIOSVersion="1.70 " DMIBIOSDate=02/15/2008 DMISysVendor="Sun Microsystems" Backend=Linux OSName=Linux OSRelease=3.1.10-1.29-desktop OSVersion="#1 SMP PREEMPT Fri May 31 20:10:04 UTC 2013 (2529847)" Architecture=x86_64 Cpuset: 0x0000000f Online: 0x0000000f Allowed: 0x0000000f Bind CPU proc: TRUE Bind CPU thread: TRUE Bind MEM proc: FALSE Bind MEM thread: TRUE Type: NUMANode Number of child objects: 2 Name=NULL local=4192744KB total=4192744KB Cpuset: 0x00000003 Online: 0x00000003 Allowed: 0x00000003 Type: Socket Number of child objects: 2 Name=NULL CPUVendor=AuthenticAMD CPUFamilyNumber=15 CPUModelNumber=33 CPUModel="Dual Core AMD Opteron(tm) Processor 280" Cpuset: 0x00000003 Online: 0x00000003 Allowed: 0x00000003 Type: L2Cache Number of child objects: 1 Name=NULL size=1024KB linesize=64 ways=16 Cpuset: 0x00000001 Online: 0x00000001 Allowed: 0x00000001 Type: L1dCache Number of child objects: 1 Name=NULL size=64KB linesize=64 ways=2 Cpuset: 0x00000001 Online: 0x00000001 Allowed: 0x00000001 Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000001 Online: 0x00000001 Allowed: 0x00000001 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000001 Online: 0x00000001 Allowed: 0x00000001 Type: L2Cache Number of child objects: 1 Name=NULL size=1024KB linesize=64 ways=16 Cpuset: 0x00000002 Online: 0x00000002 Allowed: 0x00000002 Type: L1dCache Number of child objects: 1 Name=NULL size=64KB linesize=64 ways=2 Cpuset: 0x00000002 Online: 0x00000002 Allowed: 0x00000002 Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000002 Online: 0x00000002 Allowed: 0x00000002 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000002 Online: 0x00000002 Allowed: 0x00000002 Type: Bridge Host->PCI Number of child objects: 4 Name=NULL buses=0000:[00-03] Type: PCI 10de:0053 Number of child objects: 1 Name=nVidia Corporation CK804 IDE busid=0000:00:06.0 class=0101(IDE) PCIVendor="nVidia Corporation" PCIDevice="CK804 IDE" Type: Block Number of child objects: 0 
Name=sr0 Type: PCI 10de:0055 Number of child objects: 1 Name=nVidia Corporation CK804 Serial ATA Controller busid=0000:00:07.0 class=0101(IDE) PCIVendor="nVidia Corporation" PCIDevice="CK804 Serial ATA Controller" Type: Block Number of child objects: 0 Name=sda Type: PCI 10de:0054 Number of child objects: 0 Name=nVidia Corporation CK804 Serial ATA Controller busid=0000:00:08.0 class=0101(IDE) PCIVendor="nVidia Corporation" PCIDevice="CK804 Serial ATA Controller" Type: PCI 10de:029d Number of child objects: 2 Name=nVidia Corporation G71GL [Quadro FX 3500] busid=0000:03:00.0 class=0300(VGA) PCIVendor="nVidia Corporation" PCIDevice="G71GL [Quadro FX 3500]" Type: GPU Number of child objects: 0 Name=controlD64 Type: GPU Number of child objects: 0 Name=card0 Type: NUMANode Number of child objects: 2 Name=NULL local=4194304KB total=4194304KB Cpuset: 0x0000000c Online: 0x0000000c Allowed: 0x0000000c Type: Socket Number of child objects: 2 Name=NULL CPUVendor=AuthenticAMD CPUFamilyNumber=15 CPUModelNumber=33 CPUModel="Dual Core AMD Opteron(tm) Processor 280" Cpuset: 0x0000000c Online: 0x0000000c Allowed: 0x0000000c Type: L2Cache Number of child objects: 1 Name=NULL size=1024KB linesize=64 ways=16 Cpuset: 0x00000004 Online: 0x00000004 Allowed: 0x00000004 Type: L1dCache Number of child objects: 1 Name=NULL size=64KB linesize=64 ways=2 Cpuset: 0x00000004 Online: 0x00000004 Allowed: 0x00000004 Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000004 Online: 0x00000004 Allowed: 0x00000004 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000004 Online: 0x00000004 Allowed: 0x00000004 Type: L2Cache Number of child objects: 1 Name=NULL size=1024KB linesize=64 ways=16 Cpuset: 0x00000008 Online: 0x00000008 Allowed: 0x00000008 Type: L1dCache Number of child objects: 1 Name=NULL size=64KB linesize=64 ways=2 Cpuset: 0x00000008 Online: 0x00000008 Allowed: 0x00000008 Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000008 Online: 0x00000008 Allowed: 0x00000008 
Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000008 Online: 0x00000008 Allowed: 0x00000008 Type: Bridge Host->PCI Number of child objects: 2 Name=NULL buses=0000:[80-82] Type: PCI 10de:0054 Number of child objects: 0 Name=nVidia Corporation CK804 Serial ATA Controller busid=0000:80:07.0 class=0101(IDE) PCIVendor="nVidia Corporation" PCIDevice="CK804 Serial ATA Controller" Type: PCI 10de:0055 Number of child objects: 0 Name=nVidia Corporation CK804 Serial ATA Controller busid=0000:80:08.0 class=0101(IDE) PCIVendor="nVidia Corporation" PCIDevice="CK804 Serial ATA Controller" [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] NEW TOPOLOGY - ADDING [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:orted_report_launch completed for daemon [[43379,0],2] at contact 2842886144.2;tcp://193.174.26.208:58980 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:setting topo to that from node sunpc1 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:launch_apps for job [43379,1] [linpc1:24344] [[43379,0],2] plm:rsh: remote spawn called [linpc1:24344] [[43379,0],2] plm:rsh: remote spawn - have no children! 
[tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive processing msg [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive update proc state command from [[43379,0],3] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for job [43379,1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for vpid 4 state RUNNING exit_code 0 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive done processing commands [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive processing msg [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive update proc state command from [[43379,0],2] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for job [43379,1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for vpid 2 state RUNNING exit_code 0 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive done processing commands [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive processing msg [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive update proc state command from [[43379,0],1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for job [43379,1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for vpid 0 state RUNNING exit_code 0 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for vpid 1 state RUNNING exit_code 0 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive done processing commands [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive processing msg [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive done processing commands [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:launch wiring up iof for job [43379,1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] 
plm:base:receive processing msg [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive done processing commands [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive processing msg [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive done processing commands [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:launch [43379,1] registered [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:launch job [43379,1] is not a dynamic spawn Now 4 slave tasks are sending their environment. Environment from task 1: message type: 3 msg length: 3812 characters message: hostname: sunpc1 operating system: SunOS release: 5.10 processor: i86pc PATH /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/NetBeans-4.0/bin /usr/local/jdk1.8.0/bin /usr/local/apache-ant-1.6.2/bin /usr/local/db-derby-10.11.1.1-bin/bin /usr/local/gcc-4.9.2/bin /opt/solstudio12.4/bin /usr/local/bin /usr/local/ssl/bin /usr/local/pgsql/bin /usr/bin /usr/openwin/bin /usr/dt/bin /usr/ccs/bin /usr/sfw/bin /opt/sfw/bin /usr/ucb /usr/lib/lp/postscript /usr/local/teTeX-1.0.7/bin/i386-pc-solaris2.10 /usr/local/bluej-2.1.2 /usr/local/hwloc-1.10.0/bin /home/fd1026/SunOS/x86_64/bin . 
/usr/sbin LD_LIBRARY_PATH_64 /usr/local/openmpi-1.8.5_64_gcc/lib64 /usr/local/jdk1.8.0/jre/lib/amd64 /usr/local/gcc-4.9.2/lib/amd64 /usr/local/gcc-4.9.2/lib/gcc/i386-pc-solaris2.10/4.9.2/amd64 /usr/local/lib/amd64 /usr/local/ssl/lib/amd64 /usr/local/lib64 /usr/lib/amd64 /usr/openwin/lib/amd64 /usr/openwin/server/lib/amd64 /usr/dt/lib/amd64 /usr/X11R6/lib/amd64 /usr/ccs/lib/amd64 /usr/sfw/lib/amd64 /opt/sfw/lib/amd64 /usr/ucblib/amd64 /usr/local/hwloc-1.10.0/lib64 /home/fd1026/SunOS/x86_64/lib64 LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib /usr/local/openmpi-1.8.5_64_gcc/lib64 /usr/local/openmpi-1.8.5_64_gcc/lib /usr/local/jdk1.8.0/jre/lib/i386 /usr/local/gcc-4.9.2/lib /usr/local/gcc-4.9.2/lib/gcc/i386-pc-solaris2.10/4.9.2 /usr/local/lib /usr/local/ssl/lib /usr/local/oracle /usr/local/pgsql/lib /usr/lib /usr/openwin/lib /usr/openwin/server/lib /usr/dt/lib /usr/X11R6/lib /usr/ccs/lib /usr/sfw/lib /opt/sfw/lib /usr/ucblib /usr/local/hwloc-1.10.0/lib /usr/lib/gnome-private/lib /home/fd1026/SunOS/x86_64/lib CLASSPATH /usr/local/db-derby-10.11.1.1-bin/lib/derby.jar /usr/local/db-derby-10.11.1.1-bin/lib/derbytools.jar /usr/local/db-derby-10.11.1.1-bin/lib/derbyrun.jar /usr/local/jdk1.8.0/hibernate-jpa-2.0-api-1.0.0.Final.jar /usr/local/junit4.10 /usr/local/junit4.10/junit-4.10.jar /usr/local/javacc-5.0/javacc.jar . 
/home/fd1026/SunOS/x86_64/mpi_classfiles Environment from task 2: message type: 3 msg length: 6946 characters message: hostname: linpc1 operating system: Linux release: 3.1.10-1.29-desktop processor: x86_64 PATH /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/NetBeans-4.0/bin /usr/local/jdk1.8.0/bin /usr/local/apache-ant-1.6.2/bin /usr/local/db-derby-10.11.1.1-bin/bin /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/bin/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mpirt/bin/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gdb/intel64_mic/py27/bin /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gdb/intel64/py27/bin /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/bin/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/bin/intel64_mic /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gui/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/bin/ia32 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mpirt/bin/ia32 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gdb/intel64_mic/py27/bin /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gdb/intel64/py27/bin /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/bin/ia32 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gui/ia32 /usr/local/gcc-4.9.2/bin /opt/solstudio12.4/bin /usr/local/bin /usr/local/ssl/bin /usr/local/pgsql/bin /bin /usr/bin /usr/X11R6/bin /usr/local/teTeX-1.0.7/bin/i586-pc-linux-gnu /usr/local/bluej-2.1.2 /usr/local/hwloc-1.10.0/bin /home/fd1026/Linux/x86_64/bin . 
/usr/sbin LD_LIBRARY_PATH_64 /usr/local/openmpi-1.8.5_64_gcc/lib64 /usr/local/jdk1.8.0/jre/lib/amd64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mpirt/lib/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/../compiler/lib/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/lib/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mkl/lib/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/tbb/lib/intel64/gcc4.4 /usr/local/gcc-4.9.2/lib64 /usr/local/gcc-4.9.2/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2 /usr/local/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 /usr/local/lib64 /usr/local/ssl/lib64 /usr/lib64 /usr/X11R6/lib64 /usr/local/hwloc-1.10.0/lib64 /home/fd1026/Linux/x86_64/lib64 LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib /usr/local/openmpi-1.8.5_64_gcc/lib64 /usr/local/openmpi-1.8.5_64_gcc/lib /usr/local/jdk1.8.0/jre/lib/i386 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/ia32 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mpirt/lib/ia32 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/../compiler/lib/ia32 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/lib/ia32 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/ia32 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mkl/lib/ia32 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/tbb/lib/ia32/gcc4.4 /usr/local/gcc-4.9.2/lib /usr/local/gcc-4.9.2/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/32 /usr/local/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/32 /usr/local/lib /usr/local/ssl/lib /lib /usr/lib /usr/X11R6/lib /usr/local/hwloc-1.10.0/lib /usr/lib/gnome-private/lib /home/fd1026/Linux/x86_64/lib /usr/local/openmpi-1.8.5_64_gcc/lib64 /usr/local/jdk1.8.0/jre/lib/amd64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/intel64 
/usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mpirt/lib/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/../compiler/lib/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/lib/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mkl/lib/intel64 /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/tbb/lib/intel64/gcc4.4 /usr/local/gcc-4.9.2/lib64 /usr/local/gcc-4.9.2/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2 /usr/local/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 /usr/local/lib64 /usr/local/ssl/lib64 /usr/lib64 /usr/X11R6/lib64 /usr/local/hwloc-1.10.0/lib64 /home/fd1026/Linux/x86_64/lib64 CLASSPATH /usr/local/db-derby-10.11.1.1-bin/lib/derby.jar /usr/local/db-derby-10.11.1.1-bin/lib/derbytools.jar /usr/local/db-derby-10.11.1.1-bin/lib/derbyrun.jar /usr/local/jdk1.8.0/hibernate-jpa-2.0-api-1.0.0.Final.jar /usr/local/junit4.10 /usr/local/junit4.10/junit-4.10.jar /usr/local/javacc-5.0/javacc.jar . /home/fd1026/Linux/x86_64/mpi_classfiles Environment from task 3: message type: 3 msg length: 3917 characters message: hostname: tyr.informatik.hs-fulda.de operating system: SunOS release: 5.10 processor: sun4u PATH /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/NetBeans-4.0/bin /usr/local/jdk1.8.0/bin /usr/local/apache-ant-1.6.2/bin /usr/local/db-derby-10.11.1.1-bin/bin /usr/local/gcc-4.9.2/bin /opt/solstudio12.4/bin /usr/local/bin /usr/local/ssl/bin /usr/local/pgsql/bin /usr/bin /usr/openwin/bin /usr/dt/bin /usr/ccs/bin /usr/sfw/bin /opt/sfw/bin /usr/ucb /usr/xpg4/bin /usr/local/teTeX-1.0.7/bin/sparc-sun-solaris2.10 /usr/local/bluej-2.1.2 /usr/local/hwloc-1.10.0/bin /home/fd1026/SunOS/sparc/bin . 
/usr/sbin LD_LIBRARY_PATH_64 /usr/local/openmpi-1.8.5_64_gcc/lib64 /usr/local/jdk1.8.0/jre/lib/sparcv9 /usr/local/gcc-4.9.2/lib/sparcv9 /usr/local/gcc-4.9.2/lib/gcc/sparc-sun-solaris2.10/4.9.2/sparcv9 /usr/local/lib/sparcv9 /usr/local/ssl/lib/sparcv9 /usr/local/lib64 /usr/local/oracle/sparcv9 /usr/local/pgsql/lib/sparcv9 /lib/sparcv9 /usr/lib/sparcv9 /usr/openwin/lib/sparcv9 /usr/dt/lib/sparcv9 /usr/X11R6/lib/sparcv9 /usr/ccs/lib/sparcv9 /usr/sfw/lib/sparcv9 /opt/sfw/lib/sparcv9 /usr/ucblib/sparcv9 /usr/local/hwloc-1.10.0/lib64 /home/fd1026/SunOS/sparc/lib64 LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib /usr/local/openmpi-1.8.5_64_gcc/lib64 /usr/local/openmpi-1.8.5_64_gcc/lib /usr/local/jdk1.8.0/jre/lib/sparc /usr/local/gcc-4.9.2/lib /usr/local/gcc-4.9.2/lib/gcc/sparc-sun-solaris2.10/4.9.2 /usr/local/lib /usr/local/ssl/lib /usr/local/oracle /usr/local/pgsql/lib /lib /usr/lib /usr/openwin/lib /usr/dt/lib /usr/X11R6/lib /usr/ccs/lib /usr/sfw/lib /opt/sfw/lib /usr/ucblib /usr/local/hwloc-1.10.0/lib /usr/lib/gnome-private/lib /home/fd1026/SunOS/sparc/lib CLASSPATH /usr/local/db-derby-10.11.1.1-bin/lib/derby.jar /usr/local/db-derby-10.11.1.1-bin/lib/derbytools.jar /usr/local/db-derby-10.11.1.1-bin/lib/derbyrun.jar /usr/local/jdk1.8.0/hibernate-jpa-2.0-api-1.0.0.Final.jar /usr/local/junit4.10 /usr/local/junit4.10/junit-4.10.jar /usr/local/javacc-5.0/javacc.jar . 
/home/fd1026/SunOS/sparc/mpi_classfiles Environment from task 4: message type: 3 msg length: 3917 characters message: hostname: rs0.informatik.hs-fulda.de operating system: SunOS release: 5.10 processor: sun4u PATH /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/openmpi-1.8.5_64_gcc/bin /usr/local/NetBeans-4.0/bin /usr/local/jdk1.8.0/bin /usr/local/apache-ant-1.6.2/bin /usr/local/db-derby-10.11.1.1-bin/bin /usr/local/gcc-4.9.2/bin /opt/solstudio12.4/bin /usr/local/bin /usr/local/ssl/bin /usr/local/pgsql/bin /usr/bin /usr/openwin/bin /usr/dt/bin /usr/ccs/bin /usr/sfw/bin /opt/sfw/bin /usr/ucb /usr/xpg4/bin /usr/local/teTeX-1.0.7/bin/sparc-sun-solaris2.10 /usr/local/bluej-2.1.2 /usr/local/hwloc-1.10.0/bin /home/fd1026/SunOS/sparc/bin . /usr/sbin LD_LIBRARY_PATH_64 /usr/local/openmpi-1.8.5_64_gcc/lib64 /usr/local/jdk1.8.0/jre/lib/sparcv9 /usr/local/gcc-4.9.2/lib/sparcv9 /usr/local/gcc-4.9.2/lib/gcc/sparc-sun-solaris2.10/4.9.2/sparcv9 /usr/local/lib/sparcv9 /usr/local/ssl/lib/sparcv9 /usr/local/lib64 /usr/local/oracle/sparcv9 /usr/local/pgsql/lib/sparcv9 /lib/sparcv9 /usr/lib/sparcv9 /usr/openwin/lib/sparcv9 /usr/dt/lib/sparcv9 /usr/X11R6/lib/sparcv9 /usr/ccs/lib/sparcv9 /usr/sfw/lib/sparcv9 /opt/sfw/lib/sparcv9 /usr/ucblib/sparcv9 /usr/local/hwloc-1.10.0/lib64 /home/fd1026/SunOS/sparc/lib64 LD_LIBRARY_PATH /usr/local/openmpi-1.8.5_64_gcc/lib /usr/local/openmpi-1.8.5_64_gcc/lib64 /usr/local/openmpi-1.8.5_64_gcc/lib /usr/local/jdk1.8.0/jre/lib/sparc /usr/local/gcc-4.9.2/lib /usr/local/gcc-4.9.2/lib/gcc/sparc-sun-solaris2.10/4.9.2 /usr/local/lib /usr/local/ssl/lib /usr/local/oracle /usr/local/pgsql/lib /lib /usr/lib /usr/openwin/lib /usr/dt/lib /usr/X11R6/lib /usr/ccs/lib /usr/sfw/lib /opt/sfw/lib /usr/ucblib /usr/local/hwloc-1.10.0/lib /usr/lib/gnome-private/lib /home/fd1026/SunOS/sparc/lib CLASSPATH /usr/local/db-derby-10.11.1.1-bin/lib/derby.jar /usr/local/db-derby-10.11.1.1-bin/lib/derbytools.jar 
/usr/local/db-derby-10.11.1.1-bin/lib/derbyrun.jar /usr/local/jdk1.8.0/hibernate-jpa-2.0-api-1.0.0.Final.jar /usr/local/junit4.10 /usr/local/junit4.10/junit-4.10.jar /usr/local/javacc-5.0/javacc.jar . /home/fd1026/SunOS/sparc/mpi_classfiles [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive processing msg [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive update proc state command from [[43379,0],2] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for job [43379,1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for vpid 2 state NORMALLY TERMINATED exit_code 0 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive done processing commands [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive processing msg [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive update proc state command from [[43379,0],1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for job [43379,1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for vpid 0 state NORMALLY TERMINATED exit_code 0 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for vpid 1 state NORMALLY TERMINATED exit_code 0 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive done processing commands [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive processing msg [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive update proc state command from [[43379,0],3] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for job [43379,1] [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive got update_proc_state for vpid 4 state NORMALLY TERMINATED exit_code 0 [tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive done processing commands [tyr.informatik.hs-fulda.de:03925] 
[[43379,0],0] plm:base:orted_cmd sending orted_exit commands
[linpc1:24344] [[43379,0],2] plm:base:receive stop comm
[linpc1:24344] mca: base: close: component rsh closed
[linpc1:24344] mca: base: close: unloading component rsh
[rs0.informatik.hs-fulda.de:15815] [[43379,0],3] plm:base:receive stop comm
[rs0.informatik.hs-fulda.de:15815] mca: base: close: component rsh closed
[rs0.informatik.hs-fulda.de:15815] mca: base: close: unloading component rsh
[tyr.informatik.hs-fulda.de:03925] [[43379,0],0] plm:base:receive stop comm
[tyr.informatik.hs-fulda.de:03925] mca: base: close: component rsh closed
[tyr.informatik.hs-fulda.de:03925] mca: base: close: unloading component rsh
tyr hello_1 130 [sunpc1:12951] [[43379,0],1] plm:base:receive stop comm
[sunpc1:12951] mca: base: close: component rsh closed
[sunpc1:12951] mca: base: close: unloading component rsh
tyr hello_1 130

###############################################################################
###############################################################################
###############################################################################

Now the output without "--prefix". This time I get "bin" and "lib64" in
my environment. Do you know why this happens?

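To make the stray "bin" and "lib64" entries easier to spot, here is a minimal sketch in Python. It is not part of Open MPI or of environ_mpi; the function name relative_entries is hypothetical, and the two sample values are shortened from the sunpc1 listing quoted later in this mail. It simply reports every entry of a colon-separated variable that is not an absolute path (ignoring "." and empty entries).

```python
def relative_entries(value, sep=":"):
    """Return entries of a path-style variable that are not absolute paths.

    Empty entries and "." are skipped, because "." is a deliberate
    entry in the environment shown in this thread.
    """
    return [
        entry
        for entry in value.split(sep)
        if entry and entry != "." and not entry.startswith("/")
    ]


# Sample values, shortened from the sunpc1 environment in this thread
# (assumption: the variables are joined with ":" as usual on Unix).
path = "bin:/usr/local/openmpi-1.8.5_64_cc/bin:/usr/local/bin:/usr/bin:."
ld_library_path = "lib64:/usr/local/openmpi-1.8.5_64_cc/lib:/usr/lib"

print(relative_entries(path))             # ['bin']
print(relative_entries(ld_library_path))  # ['lib64']
```

Run against the real PATH and LD_LIBRARY_PATH on each host, this should return an empty list when "--prefix" is given and the two relative entries when it is not, assuming the behaviour described above.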
tyr hello_1 130 mpiexec --mca plm_base_verbose 100 -np 5 --host sunpc1,linpc1,tyr,rs0 environ_mpi [tyr.informatik.hs-fulda.de:03938] mca: base: components_register: registering plm components [tyr.informatik.hs-fulda.de:03938] mca: base: components_register: found loaded component isolated [tyr.informatik.hs-fulda.de:03938] mca: base: components_register: component isolated has no register or open function [tyr.informatik.hs-fulda.de:03938] mca: base: components_register: found loaded component rsh [tyr.informatik.hs-fulda.de:03938] mca: base: components_register: component rsh register function successful [tyr.informatik.hs-fulda.de:03938] mca: base: components_open: opening plm components [tyr.informatik.hs-fulda.de:03938] mca: base: components_open: found loaded component isolated [tyr.informatik.hs-fulda.de:03938] mca: base: components_open: component isolated open function successful [tyr.informatik.hs-fulda.de:03938] mca: base: components_open: found loaded component rsh [tyr.informatik.hs-fulda.de:03938] mca: base: components_open: component rsh open function successful [tyr.informatik.hs-fulda.de:03938] mca:base:select: Auto-selecting plm components [tyr.informatik.hs-fulda.de:03938] mca:base:select:( plm) Querying component [isolated] [tyr.informatik.hs-fulda.de:03938] mca:base:select:( plm) Query of component [isolated] set priority to 0 [tyr.informatik.hs-fulda.de:03938] mca:base:select:( plm) Querying component [rsh] [tyr.informatik.hs-fulda.de:03938] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh path NULL [tyr.informatik.hs-fulda.de:03938] mca:base:select:( plm) Query of component [rsh] set priority to 10 [tyr.informatik.hs-fulda.de:03938] mca:base:select:( plm) Selected component [rsh] [tyr.informatik.hs-fulda.de:03938] mca: base: close: component isolated closed [tyr.informatik.hs-fulda.de:03938] mca: base: close: unloading component isolated [tyr.informatik.hs-fulda.de:03938] plm:base:set_hnp_name: initial bias 3938 nodename hash 339128848 
[tyr.informatik.hs-fulda.de:03938] plm:base:set_hnp_name: final jobfam 43332 [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh_setup on agent ssh : rsh path NULL [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:receive start comm [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:setup_job [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:setup_vm [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:setup_vm creating map [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] setup:vm: working unmanaged allocation [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] using dash_host [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] checking node sunpc1 [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] checking node linpc1 [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] checking node tyr [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] ignoring myself [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] checking node rs0 [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:setup_vm add new daemon [[43332,0],1] [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:setup_vm assigning new daemon [[43332,0],1] to node sunpc1 [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:setup_vm add new daemon [[43332,0],2] [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:setup_vm assigning new daemon [[43332,0],2] to node linpc1 [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:setup_vm add new daemon [[43332,0],3] [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:setup_vm assigning new daemon [[43332,0],3] to node rs0 [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: launching vm [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: local shell: 2 (tcsh) [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: assuming same remote shell as local shell [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: remote shell: 2 (tcsh) [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: final template argv: 
/usr/local/bin/ssh <template> set path = ( bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH lib64 ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH lib64:$LD_LIBRARY_PATH ; if ( $?DYLD_LIBRARY_PATH == 1 ) set OMPI_have_dllp ; if ( $?DYLD_LIBRARY_PATH == 0 ) setenv DYLD_LIBRARY_PATH lib64 ; if ( $?OMPI_have_dllp == 1 ) setenv DYLD_LIBRARY_PATH lib64:$DYLD_LIBRARY_PATH ; orted --hnp-topo-sig 2N:2S:0L3:0L2:0L1:2C:2H:sun4u -mca ess "env" -mca orte_ess_jobid "2839805952" -mca orte_ess_vpid "<template>" -mca orte_ess_num_procs "4" -mca orte_hnp_uri "2839805952.0;tcp://193.174.24.39:34971" --tree-spawn --mca plm_base_verbose "100" -mca plm "rsh" [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh:launch daemon 0 not a child of mine [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: adding node sunpc1 to launch list [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: adding node linpc1 to launch list [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh:launch daemon 3 not a child of mine [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: activating launch event [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: recording launch of daemon [[43332,0],1] [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: executing: (/usr/local/bin/ssh) [/usr/local/bin/ssh sunpc1 set path = ( bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH lib64 ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH lib64:$LD_LIBRARY_PATH ; if ( $?DYLD_LIBRARY_PATH == 1 ) set OMPI_have_dllp ; if ( $?DYLD_LIBRARY_PATH == 0 ) setenv DYLD_LIBRARY_PATH lib64 ; if ( $?OMPI_have_dllp == 1 ) setenv DYLD_LIBRARY_PATH lib64:$DYLD_LIBRARY_PATH ; orted --hnp-topo-sig 2N:2S:0L3:0L2:0L1:2C:2H:sun4u -mca ess "env" -mca orte_ess_jobid "2839805952" -mca orte_ess_vpid 1 -mca orte_ess_num_procs "4" -mca orte_hnp_uri 
"2839805952.0;tcp://193.174.24.39:34971" --tree-spawn --mca plm_base_verbose "100" -mca plm "rsh"] [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: executing: (/usr/local/bin/ssh) [/usr/local/bin/ssh linpc1 set path = ( bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH lib64 ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH lib64:$LD_LIBRARY_PATH ; if ( $?DYLD_LIBRARY_PATH == 1 ) set OMPI_have_dllp ; if ( $?DYLD_LIBRARY_PATH == 0 ) setenv DYLD_LIBRARY_PATH lib64 ; if ( $?OMPI_have_dllp == 1 ) setenv DYLD_LIBRARY_PATH lib64:$DYLD_LIBRARY_PATH ; orted --hnp-topo-sig 2N:2S:0L3:0L2:0L1:2C:2H:sun4u -mca ess "env" -mca orte_ess_jobid "2839805952" -mca orte_ess_vpid 2 -mca orte_ess_num_procs "4" -mca orte_hnp_uri "2839805952.0;tcp://193.174.24.39:34971" --tree-spawn --mca plm_base_verbose "100" -mca plm "rsh"] [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:rsh: recording launch of daemon [[43332,0],2] Warning: untrusted X11 forwarding setup failed: xauth key data not generated Warning: No xauth data; using fake authentication data for X11 forwarding. X11 forwarding request failed on channel 0 Warning: untrusted X11 forwarding setup failed: xauth key data not generated Warning: No xauth data; using fake authentication data for X11 forwarding. 
[sunpc1:12987] mca: base: components_register: registering plm components [sunpc1:12987] mca: base: components_register: found loaded component rsh [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:orted_report_launch from daemon [[43332,0],1] [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:orted_report_launch from daemon [[43332,0],1] on node sunpc1 [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] RECEIVED TOPOLOGY FROM NODE sunpc1 SIG 2N:2S:0L3:0L2:0L1:4C:4H:i86pc [tyr.informatik.hs-fulda.de:03938] Type: Machine Number of child objects: 2 Name=NULL total=8387112KB Backend=Solaris OSName=SunOS OSRelease=5.10 OSVersion=Generic_147441-21 Architecture=i86pc Cpuset: 0x0000000f Online: 0x0000000f Allowed: 0x0000000f Bind CPU proc: TRUE Bind CPU thread: TRUE Bind MEM proc: TRUE Bind MEM thread: TRUE Type: NUMANode Number of child objects: 1 Name=NULL local=4192808KB total=4192808KB Cpuset: 0x00000003 Online: 0x00000003 Allowed: 0x00000003 Type: Socket Number of child objects: 2 Name=NULL CPUType= CPUModel=i86pc CPUVendor=AuthenticAMD CPUModelNumber=33 CPUFamilyNumber=15 Cpuset: 0x00000003 Online: 0x00000003 Allowed: 0x00000003 Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000001 Online: 0x00000001 Allowed: 0x00000001 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000001 Online: 0x00000001 Allowed: 0x00000001 Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000002 Online: 0x00000002 Allowed: 0x00000002 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000002 Online: 0x00000002 Allowed: 0x00000002 Type: NUMANode Number of child objects: 1 Name=NULL local=4194304KB total=4194304KB Cpuset: 0x0000000c Online: 0x0000000c Allowed: 0x0000000c Type: Socket Number of child objects: 2 Name=NULL CPUType= CPUModel=i86pc CPUVendor=AuthenticAMD CPUModelNumber=33 CPUFamilyNumber=15 Cpuset: 0x0000000c Online: 0x0000000c Allowed: 0x0000000c Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000004 Online: 0x00000004 
Allowed: 0x00000004 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000004 Online: 0x00000004 Allowed: 0x00000004 Type: Core Number of child objects: 1 Name=NULL Cpuset: 0x00000008 Online: 0x00000008 Allowed: 0x00000008 Type: PU Number of child objects: 0 Name=NULL Cpuset: 0x00000008 Online: 0x00000008 Allowed: 0x00000008 [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] ADDING TOPOLOGY PER USER REQUEST [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:orted_report_launch completed for daemon [[43332,0],1] at contact 2839805952.1;tcp://193.174.26.210:50393 [sunpc1:12987] mca: base: components_register: component rsh register function successful [sunpc1:12987] mca: base: components_open: opening plm components [sunpc1:12987] mca: base: components_open: found loaded component rsh [sunpc1:12987] mca: base: components_open: component rsh open function successful [sunpc1:12987] mca:base:select: Auto-selecting plm components [sunpc1:12987] mca:base:select:( plm) Querying component [rsh] [sunpc1:12987] [[43332,0],1] plm:rsh_lookup on agent ssh : rsh path NULL [sunpc1:12987] mca:base:select:( plm) Query of component [rsh] set priority to 10 [sunpc1:12987] mca:base:select:( plm) Selected component [rsh] [sunpc1:12987] [[43332,0],1] plm:rsh_setup on agent ssh : rsh path NULL [sunpc1:12987] [[43332,0],1] plm:base:receive start comm [sunpc1:12987] [[43332,0],1] plm:rsh: remote spawn called [sunpc1:12987] [[43332,0],1] plm:rsh: local shell: 2 (tcsh) [sunpc1:12987] [[43332,0],1] plm:rsh: assuming same remote shell as local shell [sunpc1:12987] [[43332,0],1] plm:rsh: remote shell: 2 (tcsh) [sunpc1:12987] [[43332,0],1] plm:rsh: final template argv: /bin/ssh <template> set path = ( bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH lib64 ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH lib64:$LD_LIBRARY_PATH ; if ( $?DYLD_LIBRARY_PATH == 1 ) set OMPI_have_dllp ; if ( $?DYLD_LIBRARY_PATH == 0 ) 
setenv DYLD_LIBRARY_PATH lib64 ; if ( $?OMPI_have_dllp == 1 ) setenv DYLD_LIBRARY_PATH lib64:$DYLD_LIBRARY_PATH ; orted --hnp-topo-sig 2N:2S:0L3:0L2:0L1:4C:4H:i86pc -mca ess "env" -mca orte_ess_jobid "2839805952" -mca orte_ess_vpid "<template>" -mca orte_ess_num_procs "4" -mca orte_parent_uri "2839805952.1;tcp://193.174.26.210:50393" -mca orte_hnp_uri "2839805952.0;tcp://193.174.24.39:34971" --mca plm_base_verbose "100" -mca plm "rsh" [sunpc1:12987] [[43332,0],1] plm:rsh: activating launch event [sunpc1:12987] [[43332,0],1] plm:rsh: recording launch of daemon [[43332,0],3] [sunpc1:12987] [[43332,0],1] plm:rsh: executing: (/bin/ssh) [/bin/ssh rs0 set path = ( bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH lib64 ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH lib64:$LD_LIBRARY_PATH ; if ( $?DYLD_LIBRARY_PATH == 1 ) set OMPI_have_dllp ; if ( $?DYLD_LIBRARY_PATH == 0 ) setenv DYLD_LIBRARY_PATH lib64 ; if ( $?OMPI_have_dllp == 1 ) setenv DYLD_LIBRARY_PATH lib64:$DYLD_LIBRARY_PATH ; orted --hnp-topo-sig 2N:2S:0L3:0L2:0L1:4C:4H:i86pc -mca ess "env" -mca orte_ess_jobid "2839805952" -mca orte_ess_vpid 3 -mca orte_ess_num_procs "4" -mca orte_parent_uri "2839805952.1;tcp://193.174.26.210:50393" -mca orte_hnp_uri "2839805952.0;tcp://193.174.24.39:34971" --mca plm_base_verbose "100" -mca plm "rsh" --tree-spawn] ld.so.1: ssh: fatal: relocation error: file /usr/bin/ssh: symbol SUNWcry_installed: referenced symbol not found -------------------------------------------------------------------------- ORTE was unable to reliably start one or more daemons. This usually is caused by: * not finding the required libraries and/or binaries on one or more nodes. Please check your PATH and LD_LIBRARY_PATH settings, or configure OMPI with --enable-orterun-prefix-by-default * lack of authority to execute on one or more specified nodes. Please verify your allocation and authorities. 
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base). Please check with your sys admin to determine the correct location to use. * compilation of the orted with dynamic libraries when static are required (e.g., on Cray). Please check your configure cmd line and consider using one of the contrib/platform definitions for your system type. * an inability to create a connection back to mpirun due to a lack of common network interfaces and/or no route found between them. Please check network connectivity (including firewalls and network routing requirements). -------------------------------------------------------------------------- [tyr.informatik.hs-fulda.de:03938] [[43332,0],0] plm:base:orted_cmd sending orted_exit commands [sunpc1:12987] [[43332,0],1] daemon 3 failed with status 0 [sunpc1:12987] [[43332,0],1] plm:base:receive stop comm [sunpc1:12987] mca: base: close: component rsh closed [sunpc1:12987] mca: base: close: unloading component rsh [linpc1:24427] mca: base: components_register: registering plm components [linpc1:24427] mca: base: components_register: found loaded component rsh [linpc1:24427] mca: base: components_register: component rsh register function successful [linpc1:24427] mca: base: components_open: opening plm components [linpc1:24427] mca: base: components_open: found loaded component rsh [linpc1:24427] mca: base: components_open: component rsh open function successful [linpc1:24427] mca:base:select: Auto-selecting plm components [linpc1:24427] mca:base:select:( plm) Querying component [rsh] [linpc1:24427] [[43332,0],2] plm:rsh_lookup on agent ssh : rsh path NULL [linpc1:24427] mca:base:select:( plm) Query of component [rsh] set priority to 10 [linpc1:24427] mca:base:select:( plm) Selected component [rsh] [linpc1:24427] [[43332,0],2] plm:rsh_setup on agent ssh : rsh path NULL [linpc1:24427] [[43332,0],2] plm:base:receive start comm ^Cmpiexec: abort is already in progress...hit ctrl-c again to forcibly terminate Killed 
by signal 2. tyr hello_1 131 Hopefully, you can see what happens and fix the problem. Thank you very much for your help. Kind regards Siegmar > Ralph, > with such a small number of nodes, how are orted spawn ? > mpiexec ssh orted to all the nodes ? > orted are spawn in a tree way regardless the number of nodes ? > > Cheers, > > Gilles > > On Saturday, May 16, 2015, Siegmar Gross < > siegmar.gr...@informatik.hs-fulda.de> wrote: > > > Hi Gilles, > > > > > can you run > > > LD_LIBRARY_PATH= LD_LIBRARY_PATH64= /usr/bin/ssh > > > on all your boxes ? > > > > > > the root cause could be you try to run ssh on box A with the env of box B > > > > No, should be the same on all boxes (it worked before and it still works > > with openmpi-1.9 in the same environment). We don't use /usr/bin/ssh > > because it doesn't work . > > > > tyr hello_1 279 /usr/bin/ssh sunpc1 date > > ld.so.1: ssh: fatal: relocation error: file /usr/bin/ssh: symbol > > SUNWcry_installed: referenced symbol not found > > Killed > > tyr hello_1 280 /usr/bin/ssh linpc1 date > > ld.so.1: ssh: fatal: relocation error: file /usr/bin/ssh: symbol > > SUNWcry_installed: referenced symbol not found > > Killed > > tyr hello_1 281 /usr/bin/ssh rs0 date > > ld.so.1: ssh: fatal: relocation error: file /usr/bin/ssh: symbol > > SUNWcry_installed: referenced symbol not found > > Killed > > tyr hello_1 282 /usr/bin/ssh tyr date > > ld.so.1: ssh: fatal: relocation error: file /usr/bin/ssh: symbol > > SUNWcry_installed: referenced symbol not found > > Killed > > tyr hello_1 283 > > > > > > We use /usr/local/bin/ssh. 
> > > > tyr hello_1 284 ssh tyr where ssh > > ssh is aliased to /usr/local/bin/ssh -q -F /usr/local/etc/ssh/ssh_config > > /usr/local/bin/ssh > > /usr/bin/ssh > > tyr hello_1 285 ssh sunpc1 where ssh > > ssh is aliased to /usr/local/bin/ssh -q -F /usr/local/etc/ssh/ssh_config > > /usr/local/bin/ssh > > /usr/bin/ssh > > tyr hello_1 286 ssh linpc1 where ssh > > ssh ist ein Alias f\303\274r /usr/local/bin/ssh -q -F > > /usr/local/etc/ssh/ssh_config > > /usr/local/bin/ssh > > /usr/bin/ssh > > tyr hello_1 287 ssh rs0 where ssh > > ssh is aliased to /usr/local/bin/ssh -q -F /usr/local/etc/ssh/ssh_config > > /usr/local/bin/ssh > > /usr/bin/ssh > > tyr hello_1 288 > > > > > > > can you also run with the -output-tag (or -tag-output) so we can figure > > out > > > on which box ssh is failing > > > > tyr hello_1 114 mpiexec -np 5 --host sunpc1,linpc1,tyr,rs0 -output-tag > > hello_1_mpi > > mpiexec: Error: unknown option "-output-tag" > > Type 'mpiexec --help' for usage. > > tyr hello_1 115 mpiexec -np 5 --host sunpc1,linpc1,tyr,rs0 -tag-output > > hello_1_mpi > > ld.so.1: ssh: fatal: relocation error: file /usr/bin/ssh: symbol > > SUNWcry_installed: referenced symbol not found > > -------------------------------------------------------------------------- > > ORTE was unable to reliably start one or more daemons. > > This usually is caused by: > > ... > > > > > > The output is still the same as before and the process blocks as before > > so that the new flag didn't change anything. Sorry, for the bad news. > > > > > > I have another small program that prints the environment. Sometimes > > it breaks with the same error as above and somtimes it works as > > expected. > > > > > > The following commands break. 
> >
> > tyr hello_1 259 mpiexec -np 7 --host sunpc0,sunpc1,linpc0,linpc1,tyr,rs0 environ_mpi
> > tyr hello_1 264 mpiexec -np 5 --host sunpc1,linpc1,tyr,rs0 environ_mpi
> > tyr hello_1 269 mpiexec -np 5 --host tyr,rs0,sunpc1,linpc1 environ_mpi | more
> > tyr hello_1 263 mpiexec -np 4 --host sunpc1,linpc1,rs0 environ_mpi
> >
> >
> > The following commands work fine.
> >
> > tyr hello_1 249 mpiexec -np 7 --host linpc0,linpc1,sunpc0,sunpc1,tyr,rs0 environ_mpi
> > tyr hello_1 261 mpiexec -np 4 --host sunpc1,linpc1,tyr environ_mpi
> > tyr hello_1 266 mpiexec -np 4 --host sunpc1,tyr,rs0 environ_mpi
> >
> > Some variations of the last breaking command (see above)
> > also work fine.
> >
> > tyr hello_1 272 mpiexec -np 3 --host sunpc1,rs0 environ_mpi
> > tyr hello_1 273 mpiexec -np 3 --host linpc1,rs0 environ_mpi
> > tyr hello_1 277 mpiexec -np 3 --host sunpc1,linpc1 environ_mpi
> >
> >
> > Open MPI sees the following environment on Solaris 10 x86_64,
> > Linux x86_64, and Solaris 10 Sparc.
> >
> >
> > tyr hello_1 122 mpiexec -np 4 --host sunpc1,linpc1,tyr environ_mpi
> >
> > Now 3 slave tasks are sending their environment.
> >
> > Environment from task 1:
> > message type: 3
> > msg length: 3627 characters
> > message:
> > hostname: sunpc1
> > operating system: SunOS
> > release: 5.10
> > processor: i86pc
> > PATH
> > bin
> > /usr/local/openmpi-1.8.5_64_cc/bin
> > /usr/local/NetBeans-4.0/bin
> > /usr/local/jdk1.8.0/bin
> > /usr/local/apache-ant-1.6.2/bin
> > /usr/local/db-derby-10.11.1.1-bin/bin
> > /usr/local/gcc-4.9.2/bin
> > /opt/solstudio12.4/bin
> > /usr/local/bin
> > /usr/local/ssl/bin
> > /usr/local/pgsql/bin
> > /usr/bin
> > /usr/openwin/bin
> > /usr/dt/bin
> > /usr/ccs/bin
> > /usr/sfw/bin
> > /opt/sfw/bin
> > /usr/ucb
> > /usr/lib/lp/postscript
> > /usr/local/teTeX-1.0.7/bin/i386-pc-solaris2.10
> > /usr/local/bluej-2.1.2
> > /usr/local/hwloc-1.10.0/bin
> > /home/fd1026/SunOS/x86_64/bin
> > .
> > /usr/sbin > > LD_LIBRARY_PATH_64 > > /usr/local/openmpi-1.8.5_64_cc/lib64 > > /usr/local/jdk1.8.0/jre/lib/amd64 > > /usr/local/gcc-4.9.2/lib/amd64 > > > > /usr/local/gcc-4.9.2/lib/gcc/i386-pc-solaris2.10/4.9.2/amd64 > > /usr/local/lib/amd64 > > /usr/local/ssl/lib/amd64 > > /usr/local/lib64 > > /usr/lib/amd64 > > /usr/openwin/lib/amd64 > > /usr/openwin/server/lib/amd64 > > /usr/dt/lib/amd64 > > /usr/X11R6/lib/amd64 > > /usr/ccs/lib/amd64 > > /usr/sfw/lib/amd64 > > /opt/sfw/lib/amd64 > > /usr/ucblib/amd64 > > /usr/local/hwloc-1.10.0/lib64 > > /home/fd1026/SunOS/x86_64/lib64 > > LD_LIBRARY_PATH > > lib64 > > /usr/local/openmpi-1.8.5_64_cc/lib > > /usr/local/jdk1.8.0/jre/lib/i386 > > /usr/local/gcc-4.9.2/lib > > > > /usr/local/gcc-4.9.2/lib/gcc/i386-pc-solaris2.10/4.9.2 > > /usr/local/lib > > /usr/local/ssl/lib > > /usr/local/oracle > > /usr/local/pgsql/lib > > /usr/lib > > /usr/openwin/lib > > /usr/openwin/server/lib > > /usr/dt/lib > > /usr/X11R6/lib > > /usr/ccs/lib > > /usr/sfw/lib > > /opt/sfw/lib > > /usr/ucblib > > /usr/local/hwloc-1.10.0/lib > > /usr/lib/gnome-private/lib > > /home/fd1026/SunOS/x86_64/lib > > CLASSPATH > > /usr/local/db-derby-10.11.1.1-bin/lib/derby.jar > > /usr/local/db-derby-10.11.1.1-bin/lib/derbytools.jar > > /usr/local/db-derby-10.11.1.1-bin/lib/derbyrun.jar > > > > /usr/local/jdk1.8.0/hibernate-jpa-2.0-api-1.0.0.Final.jar > > /usr/local/junit4.10 > > /usr/local/junit4.10/junit-4.10.jar > > /usr/local/javacc-5.0/javacc.jar > > . 
> > /home/fd1026/SunOS/x86_64/mpi_classfiles > > > > Environment from task 2: > > message type: 3 > > msg length: 6760 characters > > message: > > hostname: linpc1 > > operating system: Linux > > release: 3.1.10-1.29-desktop > > processor: x86_64 > > PATH > > bin > > /usr/local/openmpi-1.8.5_64_cc/bin > > /usr/local/NetBeans-4.0/bin > > /usr/local/jdk1.8.0/bin > > /usr/local/apache-ant-1.6.2/bin > > /usr/local/db-derby-10.11.1.1-bin/bin > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/bin/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mpirt/bin/intel64 > > > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gdb/intel64_mic/py27/bin > > > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gdb/intel64/py27/bin > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/bin/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/bin/intel64_mic > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gui/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/bin/ia32 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mpirt/bin/ia32 > > > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gdb/intel64_mic/py27/bin > > > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gdb/intel64/py27/bin > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/bin/ia32 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/debugger/gui/ia32 > > /usr/local/gcc-4.9.2/bin > > /opt/solstudio12.4/bin > > /usr/local/bin > > /usr/local/ssl/bin > > /usr/local/pgsql/bin > > /bin > > /usr/bin > > /usr/X11R6/bin > > /usr/local/teTeX-1.0.7/bin/i586-pc-linux-gnu > > /usr/local/bluej-2.1.2 > > /usr/local/hwloc-1.10.0/bin > > /home/fd1026/Linux/x86_64/bin > > . 
> > /usr/sbin > > LD_LIBRARY_PATH_64 > > /usr/local/openmpi-1.8.5_64_cc/lib64 > > /usr/local/jdk1.8.0/jre/lib/amd64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mpirt/lib/intel64 > > > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/../compiler/lib/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/lib/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mkl/lib/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/tbb/lib/intel64/gcc4.4 > > /usr/local/gcc-4.9.2/lib64 > > > > /usr/local/gcc-4.9.2/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2 > > > > /usr/local/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 > > /usr/local/lib64 > > /usr/local/ssl/lib64 > > /usr/lib64 > > /usr/X11R6/lib64 > > /usr/local/hwloc-1.10.0/lib64 > > /home/fd1026/Linux/x86_64/lib64 > > LD_LIBRARY_PATH > > lib64 > > > > /usr/local/openmpi-1.8.5_64_cc/lib > > /usr/local/jdk1.8.0/jre/lib/i386 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/ia32 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mpirt/lib/ia32 > > > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/../compiler/lib/ia32 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/lib/ia32 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/ia32 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mkl/lib/ia32 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/tbb/lib/ia32/gcc4.4 > > /usr/local/gcc-4.9.2/lib > > > > /usr/local/gcc-4.9.2/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/32 > > > > /usr/local/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/32 > > /usr/local/lib > > /usr/local/ssl/lib > > /lib > > /usr/lib > > /usr/X11R6/lib > > /usr/local/hwloc-1.10.0/lib > > /usr/lib/gnome-private/lib > > 
/home/fd1026/Linux/x86_64/lib > > /usr/local/openmpi-1.8.5_64_cc/lib64 > > /usr/local/jdk1.8.0/jre/lib/amd64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mpirt/lib/intel64 > > > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/../compiler/lib/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/ipp/lib/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/compiler/lib/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/mkl/lib/intel64 > > > > /usr/local/intel_xe_2013/composer_xe_2013_sp1.1.106/tbb/lib/intel64/gcc4.4 > > /usr/local/gcc-4.9.2/lib64 > > > > /usr/local/gcc-4.9.2/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2 > > > > /usr/local/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 > > /usr/local/lib64 > > /usr/local/ssl/lib64 > > /usr/lib64 > > /usr/X11R6/lib64 > > /usr/local/hwloc-1.10.0/lib64 > > /home/fd1026/Linux/x86_64/lib64 > > CLASSPATH > > /usr/local/db-derby-10.11.1.1-bin/lib/derby.jar > > /usr/local/db-derby-10.11.1.1-bin/lib/derbytools.jar > > /usr/local/db-derby-10.11.1.1-bin/lib/derbyrun.jar > > > > /usr/local/jdk1.8.0/hibernate-jpa-2.0-api-1.0.0.Final.jar > > /usr/local/junit4.10 > > /usr/local/junit4.10/junit-4.10.jar > > /usr/local/javacc-5.0/javacc.jar > > . 
> > /home/fd1026/Linux/x86_64/mpi_classfiles > > > > Environment from task 3: > > message type: 3 > > msg length: 3676 characters > > message: > > hostname: tyr.informatik.hs-fulda.de > > operating system: SunOS > > release: 5.10 > > processor: sun4u > > PATH > > /usr/local/openmpi-1.8.5_64_cc/bin > > /usr/local/NetBeans-4.0/bin > > /usr/local/jdk1.8.0/bin > > /usr/local/apache-ant-1.6.2/bin > > /usr/local/db-derby-10.11.1.1-bin/bin > > /usr/local/gcc-4.9.2/bin > > /opt/solstudio12.4/bin > > /usr/local/bin > > /usr/local/ssl/bin > > /usr/local/pgsql/bin > > /usr/bin > > /usr/openwin/bin > > /usr/dt/bin > > /usr/ccs/bin > > /usr/sfw/bin > > /opt/sfw/bin > > /usr/ucb > > /usr/xpg4/bin > > /usr/local/teTeX-1.0.7/bin/sparc-sun-solaris2.10 > > /usr/local/bluej-2.1.2 > > /usr/local/hwloc-1.10.0/bin > > /home/fd1026/SunOS/sparc/bin > > . > > /usr/sbin > > LD_LIBRARY_PATH_64 > > /usr/local/openmpi-1.8.5_64_cc/lib64 > > /usr/local/jdk1.8.0/jre/lib/sparcv9 > > /usr/local/gcc-4.9.2/lib/sparcv9 > > > > /usr/local/gcc-4.9.2/lib/gcc/sparc-sun-solaris2.10/4.9.2/sparcv9 > > /usr/local/lib/sparcv9 > > /usr/local/ssl/lib/sparcv9 > > /usr/local/lib64 > > /usr/local/oracle/sparcv9 > > /usr/local/pgsql/lib/sparcv9 > > /lib/sparcv9 > > /usr/lib/sparcv9 > > /usr/openwin/lib/sparcv9 > > /usr/dt/lib/sparcv9 > > /usr/X11R6/lib/sparcv9 > > /usr/ccs/lib/sparcv9 > > /usr/sfw/lib/sparcv9 > > /opt/sfw/lib/sparcv9 > > /usr/ucblib/sparcv9 > > /usr/local/hwloc-1.10.0/lib64 > > /home/fd1026/SunOS/sparc/lib64 > > LD_LIBRARY_PATH > > /usr/local/openmpi-1.8.5_64_cc/lib > > /usr/local/jdk1.8.0/jre/lib/sparc > > /usr/local/gcc-4.9.2/lib > > > > /usr/local/gcc-4.9.2/lib/gcc/sparc-sun-solaris2.10/4.9.2 > > /usr/local/lib > > /usr/local/ssl/lib > > /usr/local/oracle > > /usr/local/pgsql/lib > > /lib > > /usr/lib > > /usr/openwin/lib > > /usr/dt/lib > > /usr/X11R6/lib > > /usr/ccs/lib > > /usr/sfw/lib > > /opt/sfw/lib > > /usr/ucblib > > /usr/local/hwloc-1.10.0/lib > > /usr/lib/gnome-private/lib > > 
/home/fd1026/SunOS/sparc/lib
> > CLASSPATH
> > /usr/local/db-derby-10.11.1.1-bin/lib/derby.jar
> > /usr/local/db-derby-10.11.1.1-bin/lib/derbytools.jar
> > /usr/local/db-derby-10.11.1.1-bin/lib/derbyrun.jar
> > /usr/local/jdk1.8.0/hibernate-jpa-2.0-api-1.0.0.Final.jar
> > /usr/local/junit4.10
> > /usr/local/junit4.10/junit-4.10.jar
> > /usr/local/javacc-5.0/javacc.jar
> > .
> > /home/fd1026/SunOS/sparc/mpi_classfiles
> >
> > tyr hello_1 239
> >
> >
> > Do you see anything strange? I don't know why I have "bin" at the
> > beginning of PATH and "lib64" at the beginning of LD_LIBRARY_PATH
> > for "sunpc1" and "linpc1" because they are not available in the
> > environment variables. I have to investigate it.
> >
> > tyr hello_1 291 ssh linpc1 echo $PATH
> > /usr/local/openmpi-1.8.5_64_cc/bin:/usr/local/NetBeans-4.0/bin:/usr/local/jdk1.8.0/bin:/usr/local/apache-ant-1.6.2/bin:/usr/local/db-derby-10.11.1.1-bin/bin:/usr/local/gcc-4.9.2/bin:/opt/solstudio12.4/bin:/usr/local/bin:/usr/local/ssl/bin:/usr/local/pgsql/bin:/usr/bin:/usr/openwin/bin:/usr/dt/bin:/usr/ccs/bin:/usr/sfw/bin:/opt/sfw/bin:/usr/ucb:/usr/xpg4/bin:/usr/local/teTeX-1.0.7/bin/sparc-sun-solaris2.10:/usr/local/bluej-2.1.2:/usr/local/hwloc-1.10.0/bin:/home/fd1026/SunOS/sparc/bin:.:/usr/sbin
> >
> > tyr hello_1 292 ssh linpc1 echo $LD_LIBRARY_PATH
> > /usr/local/openmpi-1.8.5_64_cc/lib:/usr/local/jdk1.8.0/jre/lib/sparc:/usr/local/gcc-4.9.2/lib:/usr/local/gcc-4.9.2/lib/gcc/sparc-sun-solaris2.10/4.9.2:/usr/local/lib:/usr/local/ssl/lib:/usr/local/oracle:/usr/local/pgsql/lib:/lib:/usr/lib:/usr/openwin/lib:/usr/dt/lib:/usr/X11R6/lib:/usr/ccs/lib:/usr/sfw/lib:/opt/sfw/lib:/usr/ucblib:/usr/local/hwloc-1.10.0/lib:/usr/lib/gnome-private/lib:/home/fd1026/SunOS/sparc/lib
> > tyr hello_1 293
> >
> >
> > Kind regards and thank you very much for your help.
> >
> > Siegmar
> >
> >
> > > On Friday, May 15, 2015, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I successfully installed openmpi-1.8.5 on my machines (Solaris 10
> > > > Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with
> > > > gcc-4.9.2 and Sun C 5.13. I get the same error for both compilers,
> > > > if I use the following command and no errors if I change the order
> > > > of the first two machines. I also get no errors if I use
> > > > openmpi-dev-1708-g8497a6a for an arbitrary order of the machines.
> > > >
> > > >
> > > > tyr hello_1 109 which mpicc
> > > > /usr/local/openmpi-1.8.5_64_cc/bin/mpicc
> > > > tyr hello_1 110 mpiexec -np 5 -host sunpc1,linpc1,tyr,rs0 hello_1_mpi
> > > > ld.so.1: ssh: fatal: relocation error: file /usr/bin/ssh: symbol
> > > > SUNWcry_installed: referenced symbol not found
> > > > --------------------------------------------------------------------------
> > > > ORTE was unable to reliably start one or more daemons.
> > > > This usually is caused by:
> > > >
> > > > * not finding the required libraries and/or binaries on
> > > >   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
> > > >   settings, or configure OMPI with --enable-orterun-prefix-by-default
> > > >
> > > > * lack of authority to execute on one or more specified nodes.
> > > >   Please verify your allocation and authorities.
> > > >
> > > > * the inability to write startup files into /tmp
> > > >   (--tmpdir/orte_tmpdir_base).
> > > >   Please check with your sys admin to determine the correct location to
> > > >   use.
> > > >
> > > > * compilation of the orted with dynamic libraries when static are required
> > > >   (e.g., on Cray). Please check your configure cmd line and consider using
> > > >   one of the contrib/platform definitions for your system type.
> > > >
> > > > * an inability to create a connection back to mpirun due to a
> > > >   lack of common network interfaces and/or no route found between
> > > >   them. Please check network connectivity (including firewalls
> > > >   and network routing requirements).
> > > > --------------------------------------------------------------------------
> > > >
> > > >
> > > > Now the program hangs and "top" shows that "orterun" is very busy.
> > > >
> > > >   PID USERNAME THR PR NCE  SIZE   RES STATE  TIME FLTS    CPU COMMAND
> > > > 29550 fd1026     2  0   0 14.5M 8576K cpu01  1:06    0 47.72% orterun
> > > >
> > > >
> > > > tyr hello_1 116 mpiexec -np 5 -host linpc1,sunpc1,tyr,rs0 hello_1_mpi
> > > > Process 2 of 5 running on sunpc1
> > > > Process 4 of 5 running on rs0.informatik.hs-fulda.de
> > > > Process 3 of 5 running on tyr.informatik.hs-fulda.de
> > > > Process 1 of 5 running on linpc1
> > > > Process 0 of 5 running on linpc1
> > > > ...
> > > >
> > > >
> > > > Everything works fine with openmpi-dev-1708-g8497a6a.
> > > >
> > > > tyr hello_1 120 which mpicc
> > > > /usr/local/openmpi-1.9.0_64_gcc/bin/mpicc
> > > > tyr hello_1 121 mpiexec -np 5 -host sunpc1,linpc1,tyr,rs0 hello_1_mpi
> > > > Process 2 of 5 running on linpc1
> > > > Process 0 of 5 running on sunpc1
> > > > Process 1 of 5 running on sunpc1
> > > > Process 4 of 5 running on rs0.informatik.hs-fulda.de
> > > > Process 3 of 5 running on tyr.informatik.hs-fulda.de
> > > > ...
> > > >
> > > >
> > > > Any ideas what's going wrong? I would be grateful if somebody can
> > > > fix the problem. Thank you very much for any help in advance.
> > > >
> > > >
> > > > Kind regards
> > > >
> > > > Siegmar
> > > >
> > > > _______________________________________________
> > > > users mailing list
> > > > us...@open-mpi.org
> > > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > > > Link to this post:
> > > > http://www.open-mpi.org/community/lists/users/2015/05/26871.php
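
A side note on the "ssh linpc1 echo $PATH" check quoted above: the value it
prints ends in /home/fd1026/SunOS/sparc/bin although linpc1 is a Linux box.
That suggests $PATH was expanded by the local shell on tyr before ssh ran, so
the output would be tyr's PATH rather than linpc1's. Quoting the command defers
the expansion to the remote side. The sketch below only illustrates that
difference and needs no remote host: fake_ssh and the /local/bin and
/remote/bin paths are hypothetical stand-ins, and it is written for plain sh,
not for the csh-style shell that the "where" builtin in the transcripts hints
at.

```shell
#!/bin/sh
# fake_ssh is a hypothetical stand-in for "ssh HOST CMD...": like ssh, it joins
# its arguments with spaces and hands the string to a fresh shell that has its
# own PATH. No host is contacted in this sketch.
fake_ssh() {
    shift    # drop the host name
    env -i PATH=/remote/bin:/usr/bin:/bin /bin/sh -c "$*"
}

# Assumed local PATH, standing in for the PATH on tyr.
PATH=/local/bin:/usr/bin:/bin
export PATH

# Unquoted: the local shell expands $PATH before fake_ssh even starts.
unquoted=$(fake_ssh linpc1 echo $PATH)

# Single-quoted: the literal text $PATH reaches the "remote" shell.
quoted=$(fake_ssh linpc1 'echo $PATH')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

On the real machines the corresponding check would be ssh linpc1 'echo $PATH'
with single quotes, which csh-style shells also leave unexpanded. Comparing
that output with the environ_mpi listing should show whether "bin" and "lib64"
really are absent from the remote login environment.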