Gilles,

Ahh, I didn't know the current status. Thank you for the notice!

Thanks,
Takahiro Kawashima

> Kawashima-san,
>
> I'd rather consider this as a bug in the README (!)
>
> heterogeneous support has been broken for some time, but it was
> eventually fixed.
>
> truth is there are *very* limited resources (both human and hardware)
> maintaining heterogeneous support, but that does not mean heterogeneous
> support should not be used, nor that bug reports will be ignored.
>
> Cheers,
>
> Gilles
>
> On 2014/12/24 9:26, Kawashima, Takahiro wrote:
> > Hi Siegmar,
> >
> > Heterogeneous environment is not supported officially.
> >
> > README of Open MPI master says:
> >
> > --enable-heterogeneous
> >   Enable support for running on heterogeneous clusters (e.g., machines
> >   with different endian representations). Heterogeneous support is
> >   disabled by default because it imposes a minor performance penalty.
> >
> >   *** THIS FUNCTIONALITY IS CURRENTLY BROKEN - DO NOT USE ***
> >
> >> Hi,
> >>
> >> today I installed openmpi-dev-602-g82c02b4 on my machines (Solaris 10
> >> Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with gcc-4.9.2
> >> and the new Solaris Studio 12.4 compilers. All build processes finished
> >> without errors, but I have a problem running a very small program. It
> >> works for three processes but hangs for six processes. I have the same
> >> behaviour for both compilers.
> >>
> >> tyr small_prog 139 time; mpiexec -np 3 --host sunpc1,linpc1,tyr init_finalize; time
> >> 827.161u 210.126s 30:51.08 56.0% 0+0k 4151+20io 2898pf+0w
> >> Hello!
> >> Hello!
> >> Hello!
> >> 827.886u 210.335s 30:54.68 55.9% 0+0k 4151+20io 2898pf+0w
> >> tyr small_prog 140 time; mpiexec -np 6 --host sunpc1,linpc1,tyr init_finalize; time
> >> 827.946u 210.370s 31:15.02 55.3% 0+0k 4151+20io 2898pf+0w
> >> ^CKilled by signal 2.
> >> Killed by signal 2.
> >> 869.242u 221.644s 33:40.54 53.9% 0+0k 4151+20io 2898pf+0w
> >> tyr small_prog 141
> >>
> >> tyr small_prog 145 ompi_info | grep -e "Open MPI repo revision:" -e "C compiler:"
> >> Open MPI repo revision: dev-602-g82c02b4
> >> C compiler: cc
> >> tyr small_prog 146
> >>
> >>
> >> tyr small_prog 146 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
> >> GNU gdb (GDB) 7.6.1
> >> ...
> >> (gdb) run -np 3 --host sunpc1,linpc1,tyr init_finalize
> >> Starting program: /usr/local/openmpi-1.9.0_64_cc/bin/mpiexec -np 3 --host sunpc1,linpc1,tyr init_finalize
> >> [Thread debugging using libthread_db enabled]
> >> [New Thread 1 (LWP 1)]
> >> [New LWP 2 ]
> >> Hello!
> >> Hello!
> >> Hello!
> >> [LWP 2 exited]
> >> [New Thread 2 ]
> >> [Switching to Thread 1 (LWP 1)]
> >> sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to satisfy query
> >> (gdb) run -np 6 --host sunpc1,linpc1,tyr init_finalize
> >> The program being debugged has been started already.
> >> Start it from the beginning? (y or n) y
> >>
> >> Starting program: /usr/local/openmpi-1.9.0_64_cc/bin/mpiexec -np 6 --host sunpc1,linpc1,tyr init_finalize
> >> [Thread debugging using libthread_db enabled]
> >> [New Thread 1 (LWP 1)]
> >> [New LWP 2 ]
> >> ^CKilled by signal 2.
> >> Killed by signal 2.
> >>
> >> Program received signal SIGINT, Interrupt.
> >> [Switching to Thread 1 (LWP 1)]
> >> 0xffffffff7d1dc6b0 in __pollsys () from /lib/sparcv9/libc.so.1
> >> (gdb) bt
> >> #0 0xffffffff7d1dc6b0 in __pollsys () from /lib/sparcv9/libc.so.1
> >> #1 0xffffffff7d1cb468 in _pollsys () from /lib/sparcv9/libc.so.1
> >> #2 0xffffffff7d170ed8 in poll () from /lib/sparcv9/libc.so.1
> >> #3 0xffffffff7e69a630 in poll_dispatch ()
> >>    from /usr/local/openmpi-1.9.0_64_cc/lib64/libopen-pal.so.0
> >> #4 0xffffffff7e6894ec in opal_libevent2021_event_base_loop ()
> >>    from /usr/local/openmpi-1.9.0_64_cc/lib64/libopen-pal.so.0
> >> #5 0x000000010000eb14 in orterun (argc=1757447168, argv=0xffffff7ed8550cff)
> >>    at ../../../../openmpi-dev-602-g82c02b4/orte/tools/orterun/orterun.c:1090
> >> #6 0x0000000100004e2c in main (argc=256, argv=0xffffff7ed8af5c00)
> >>    at ../../../../openmpi-dev-602-g82c02b4/orte/tools/orterun/main.c:13
> >> (gdb)
> >>
> >> Any ideas? Unfortunately I'm leaving for vacation so that I cannot test
> >> any patches until the end of the year. Nevertheless I wanted to report the
> >> problem. At the moment I cannot test if I have the same behaviour in a
> >> homogeneous environment with three machines because the new version isn't
> >> available before tomorrow on the other machines. I used the following
> >> configure command.
> >>
> >> ../openmpi-dev-602-g82c02b4/configure --prefix=/usr/local/openmpi-1.9.0_64_cc \
> >>   --libdir=/usr/local/openmpi-1.9.0_64_cc/lib64 \
> >>   --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
> >>   --with-jdk-headers=/usr/local/jdk1.8.0/include \
> >>   JAVA_HOME=/usr/local/jdk1.8.0 \
> >>   LDFLAGS="-m64 -mt" \
> >>   CC="cc" CXX="CC" FC="f95" \
> >>   CFLAGS="-m64 -mt" CXXFLAGS="-m64 -library=stlport4" FCFLAGS="-m64" \
> >>   CPP="cpp" CXXCPP="cpp" \
> >>   CPPFLAGS="" CXXCPPFLAGS="" \
> >>   --enable-mpi-cxx \
> >>   --enable-cxx-exceptions \
> >>   --enable-mpi-java \
> >>   --enable-heterogeneous \
> >>   --enable-mpi-thread-multiple \
> >>   --with-threads=posix \
> >>   --with-hwloc=internal \
> >>   --without-verbs \
> >>   --with-wrapper-cflags="-m64 -mt" \
> >>   --with-wrapper-cxxflags="-m64 -library=stlport4" \
> >>   --with-wrapper-ldflags="-mt" \
> >>   --enable-debug \
> >>   |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc
> >>
> >> Furthermore I used the following test program.
> >>
> >> #include <stdio.h>
> >> #include <stdlib.h>
> >> #include "mpi.h"
> >>
> >> int main (int argc, char *argv[])
> >> {
> >>   MPI_Init (&argc, &argv);
> >>   printf ("Hello!\n");
> >>   MPI_Finalize ();
> >>   return EXIT_SUCCESS;
> >> }