Hi,

today I installed openmpi-1.7.4a1r29784 on "Solaris 10, Sparc" with "Sun C 5.12", using the following configure command.

../openmpi-1.7.4a1r29784/configure \
  --prefix=/usr/local/openmpi-1.7.4_64_cc \
  --libdir=/usr/local/openmpi-1.7.4_64_cc/lib64 \
  --with-jdk-bindir=/usr/local/jdk1.7.0_07/bin/sparcv9 \
  --with-jdk-headers=/usr/local/jdk1.7.0_07/include \
  JAVA_HOME=/usr/local/jdk1.7.0_07 \
  LDFLAGS="-m64" \
  CC="cc" CXX="CC" FC="f95" \
  CFLAGS="-m64" CXXFLAGS="-m64 -library=stlport4" FCFLAGS="-m64" \
  CPP="cpp" CXXCPP="cpp" \
  CPPFLAGS="" CXXCPPFLAGS="" \
  --enable-cxx-exceptions \
  --enable-mpi-java \
  --enable-heterogeneous \
  --enable-opal-multi-threads \
  --enable-mpi-thread-multiple \
  --with-threads=posix \
  --with-hwloc=internal \
  --without-verbs \
  --with-wrapper-cflags=-m64 \
  --enable-debug \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc


1) Bus error with "ompi_info -a"

tyr fd1026 108 ompi_info | grep MPI:
                Open MPI: 1.7.4a1r29784

I get a bus error if I use the option "-a".

tyr fd1026 109 ompi_info -a | grep MPI:
[tyr:17668] *** Process received signal ***
[tyr:17668] Signal: Bus Error (10)
[tyr:17668] Signal code: Invalid address alignment (1)
[tyr:17668] Failing at address: ffffffff7d3ca461
/export2/prog/SunOS_sparc/openmpi-1.7.4_64_cc/lib64/libopen-pal.so.6.0.0: opal_backtrace_print+0x14
/export2/prog/SunOS_sparc/openmpi-1.7.4_64_cc/lib64/libopen-pal.so.6.0.0: 0x1843d8
/lib/sparcv9/libc.so.1:0xd8c28
/lib/sparcv9/libc.so.1:0xcc79c
/lib/sparcv9/libc.so.1:0xcc9a8
/export2/prog/SunOS_sparc/openmpi-1.7.4_64_cc/lib64/libopen-pal.so.6.0.0: 0x13a3dc [ Signal 2099942168 (?)]
/export2/prog/SunOS_sparc/openmpi-1.7.4_64_cc/lib64/libopen-pal.so.6.0.0: mca_base_var_dump+0x190
/export2/prog/SunOS_sparc/openmpi-1.7.4_64_cc/lib64/libopen-pal.so.6.0.0: 0x899a8
/export2/prog/SunOS_sparc/openmpi-1.7.4_64_cc/lib64/libopen-pal.so.6.0.0: opal_info_show_mca_params+0xb4
/export2/prog/SunOS_sparc/openmpi-1.7.4_64_cc/lib64/libopen-pal.so.6.0.0: opal_info_do_params+0x364
/export2/prog/SunOS_sparc/openmpi-1.7.4_64_cc/bin/ompi_info:main+0x6e4
/export2/prog/SunOS_sparc/openmpi-1.7.4_64_cc/bin/ompi_info:_start+0x12c
[tyr:17668] *** End of error message ***
Bus error
tyr fd1026 110


tyr fd1026 112 cd /usr/local/openmpi-1.7.4_64_cc/bin/
tyr bin 113 /opt/solstudio12.3/bin/sparcv9/dbx ompi_info
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.9' in your .dbxrc
Reading ompi_info
Reading ld.so.1
Reading libmpi.so.1.2.0
Reading libopen-rte.so.6.0.0
Reading libopen-pal.so.6.0.0
Reading libsendfile.so.1
Reading libpicl.so.1
Reading libkstat.so.1
Reading liblgrp.so.1
Reading libsocket.so.1
Reading libnsl.so.1
Reading librt.so.1
Reading libm.so.2
Reading libthread.so.1
Reading libc.so.1
Reading libdoor.so.1
Reading libaio.so.1
Reading libmd.so.1
(dbx) run -a
Running: ompi_info -a
(process id 17678)
Reading libc_psr.so.1
...
Reading mca_topo_basic.so
Reading mca_vprotocol_pessimist.so
                  Prefix: /usr/local/openmpi-1.7.4_64_cc
             Exec_prefix: /usr/local/openmpi-1.7.4_64_cc
                  Bindir: /usr/local/openmpi-1.7.4_64_cc/bin
...
       MPI_MAX_PORT_NAME: 1024
  MPI_MAX_DATAREP_STRING: 128
                 MCA mca: parameter "mca_param_files" (current value:
                          "/home/fd1026/.openmpi/mca-params.conf:
                          /usr/local/openmpi-1.7.4_64_cc/etc/openmpi-mca-params.conf",
                          data source: default, level: 2 user/detail, type:
                          string, deprecated, synonym of:
                          mca_base_param_files)
                          Path for MCA configuration files containing
                          variable values
                 MCA mca: parameter "mca_component_path" (current value:
                          "/usr/local/openmpi-1.7.4_64_cc/lib64/openmpi:
                          /home/fd1026/.openmpi/components", data source:
                          default, level: 9 dev/all, type: string,
                          deprecated, synonym of: mca_base_component_path)
                          Path where to look for Open MPI and ORTE components
                 MCA mca: parameter "mca_component_show_load_errors"
                          (current value: "true", data source: default,
                          level: 9 dev/all, type: bool, deprecated, synonym
                          of: mca_base_component_show_load_errors)
                          Whether to show errors for components that failed
                          to load or not
                          Valid values: 0: f|false|disabled, 1: t|true|enabled
t@1 (l@1) signal BUS (invalid address alignment) in
var_value_string at line 1685 in file "mca_base_var.c"
 1685   ret = var->mbv_enumerator->string_from_value(var->mbv_enumerator, value->intval, &tmp);
(dbx)
(dbx)
(dbx)
(dbx) check -all
dbx: warning: check -all will be turned on in the next run of the process
access checking - OFF
memuse checking - OFF
(dbx) run -a
Running: ompi_info -a
(process id 17680)
Reading rtcapihook.so
Reading libdl.so.1
Reading rtcaudit.so
Reading libmapmalloc.so.1
Reading rtcboot.so
Reading librtc.so
Reading libmd_psr.so.1
RTC: Enabling Error Checking...
RTC: Using UltraSparc trap mechanism
RTC: See `help rtc showmap' and `help rtc limitations' for details.
RTC: Running program...
Read from uninitialized (rui) on thread 1:
Attempting to read 4 bytes at address 0xffffffff7fffd548
    which is 184 bytes above the current stack pointer
Variable is 'index'
t@1 (l@1) stopped in var_find at line 802 in file "mca_base_var.c"
  802   return (OPAL_SUCCESS != ret) ? ret : index;
(dbx)


2) Bus error with "make check"

tail -15 log.make-check.SunOS.sparc.64_cc
>>--------------------------------------------<<
PASS: ddt_test
/bin/bash: line 5: 4466 Bus Error ${dir}$tst
FAIL: ddt_raw
========================================================
1 of 6 tests failed
Please report to http://www.open-mpi.org/community/help/
========================================================
make[3]: *** [check-TESTS] Error 1
make[3]: Leaving directory `.../test/datatype'
make[2]: *** [check-am] Error 2
make[2]: Leaving directory `.../test/datatype'
make[1]: *** [check-recursive] Error 1
make[1]: Leaving directory `.../test'
make: *** [check-recursive] Error 1


tyr openmpi-1.7.4a1r29784-SunOS.sparc.64_cc 116 cd test/datatype/.libs/
tyr .libs 117 /opt/solstudio12.3/bin/sparcv9/dbx ddt_raw
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.9' in your .dbxrc
Reading ddt_raw
Reading ld.so.1
Reading libmpi.so.1.2.0
Reading libopen-rte.so.6.0.0
Reading libopen-pal.so.6.0.0
Reading libsendfile.so.1
Reading libpicl.so.1
Reading libkstat.so.1
Reading liblgrp.so.1
Reading libsocket.so.1
Reading libnsl.so.1
Reading librt.so.1
Reading libm.so.2
Reading libthread.so.1
Reading libc.so.1
Reading libdoor.so.1
Reading libaio.so.1
Reading libmd.so.1
(dbx) run
Running: ddt_raw
(process id 17689)
Reading libc_psr.so.1

 #
 * TEST INVERSED VECTOR
 #

t@1 (l@1) signal BUS (invalid address alignment) in opal_convertor_raw at line 64 in file "opal_convertor_raw.c"
   64   DO_DEBUG( opal_output( 0, "opal_convertor_raw( %p, {%p, %u}, %lu )\n", (void*)pConvertor,
(dbx)
(dbx)
(dbx)
(dbx) check -all
dbx: warning: check -all will be turned on in the next run of the process
access checking - OFF
memuse checking - OFF
(dbx) run
Running: ddt_raw
(process id 17691)
Reading rtcapihook.so
Reading libdl.so.1
Reading rtcaudit.so
Reading libmapmalloc.so.1
Reading libgen.so.1
Reading rtcboot.so
Reading librtc.so
Reading libmd_psr.so.1
RTC: Enabling Error Checking...
RTC: Using UltraSparc trap mechanism
RTC: See `help rtc showmap' and `help rtc limitations' for details.
RTC: Running program...

 #
 * TEST INVERSED VECTOR
 #

Misaligned read (mar) on thread 1:
Attempting to read 4 bytes at address 0xffffffff60cca179
t@1 (l@1) stopped in opal_convertor_raw at line 64 in file "opal_convertor_raw.c"
   64   DO_DEBUG( opal_output( 0, "opal_convertor_raw( %p, {%p, %u}, %lu )\n", (void*)pConvertor,
(dbx)


3) Bus error with my programs

tyr small_prog 122 mpicc init_finalize.c
tyr small_prog 123 /opt/solstudio12.3/bin/sparcv9/dbx /usr/local/openmpi-1.7.4_64_cc/bin/mpiexec
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.9' in your .dbxrc
Reading mpiexec
Reading ld.so.1
Reading libopen-rte.so.6.0.0
Reading libopen-pal.so.6.0.0
Reading libsendfile.so.1
Reading libpicl.so.1
Reading libkstat.so.1
Reading liblgrp.so.1
Reading libsocket.so.1
Reading libnsl.so.1
Reading librt.so.1
Reading libm.so.2
Reading libthread.so.1
Reading libc.so.1
Reading libdoor.so.1
Reading libaio.so.1
Reading libmd.so.1
(dbx) run -np 1 a.out
Running: mpiexec -np 1 a.out
(process id 17791)
Reading libc_psr.so.1
Reading mca_shmem_mmap.so
Reading libmp.so.2
...
Reading mca_dfs_orted.so
Reading mca_dfs_test.so
t@1 (l@1) signal BUS (invalid address alignment) in opal_net_samenetwork at line 272 in file "net.c"
  272   (inaddr2->sin_addr.s_addr & netmask)) {
(dbx)
(dbx)
(dbx)
(dbx) check -all
dbx: warning: check -all will be turned on in the next run of the process
access checking - OFF
memuse checking - OFF
(dbx) run -np 1 a.out
Running: mpiexec -np 1 a.out
(process id 17794)
Reading rtcapihook.so
Reading libdl.so.1
Reading rtcaudit.so
Reading libmapmalloc.so.1
Reading rtcboot.so
Reading librtc.so
Reading libmd_psr.so.1
RTC: Enabling Error Checking...
RTC: Using UltraSparc trap mechanism
RTC: See `help rtc showmap' and `help rtc limitations' for details.
RTC: Running program...
Read from uninitialized (rui) on thread 1:
Attempting to read 4 bytes at address 0xffffffff7fffd368
    which is 184 bytes above the current stack pointer
Variable is 'index'
t@1 (l@1) stopped in var_find at line 802 in file "mca_base_var.c"
  802   return (OPAL_SUCCESS != ret) ? ret : index;
(dbx)


I have the same problems with openmpi-1.9a1r29790 (same files).

tyr fd1026 107 ompi_info | grep MPI:
                Open MPI: 1.9a1r29790
tyr fd1026 108 ompi_info -a | grep MPI:
[tyr:17867] *** Process received signal ***
[tyr:17867] Signal: Bus Error (10)
[tyr:17867] Signal code: Invalid address alignment (1)
[tyr:17867] Failing at address: ffffffff7d3c5ac1
/export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0: opal_backtrace_print+0x14
/export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0: 0x17f268
/lib/sparcv9/libc.so.1:0xd8c28
/lib/sparcv9/libc.so.1:0xcc79c
/lib/sparcv9/libc.so.1:0xcc9a8
/export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0: 0x134b9c [ Signal 2099923552 (?)]
/export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0: mca_base_var_dump+0x1b0
/export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0: 0x89828
/export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0: opal_info_show_mca_params+0xb4
/export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0: opal_info_do_params+0x364
/export2/prog/SunOS_sparc/openmpi-1.9_64_cc/bin/ompi_info:main+0x628
/export2/prog/SunOS_sparc/openmpi-1.9_64_cc/bin/ompi_info:_start+0x12c
[tyr:17867] *** End of error message ***
Bus error
tyr fd1026 109


I would be grateful if somebody could solve these problems. Do you need
any further information?


Kind regards

Siegmar
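
P.S. For readers unfamiliar with this failure mode: the failing addresses reported above (for example ffffffff7d3ca461 and 0xffffffff60cca179) are odd, and SPARC requires a 4-byte load to use a 4-byte-aligned address, which is why the signal code reads "Invalid address alignment". The following is only a minimal sketch of that idea and not code from Open MPI. The helper names is_aligned4 and load_u32 are hypothetical, and I do not claim that this is how the affected functions should be fixed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of Open MPI): returns 1 if p is a
 * multiple of 4, i.e. suitable for a direct 4-byte load on SPARC. */
static int is_aligned4(const void *p)
{
    return ((uintptr_t)p % 4u) == 0;
}

/* Hypothetical helper (not part of Open MPI): reads a 32-bit value
 * from an address of any alignment. memcpy copies byte by byte as far
 * as the language is concerned, so no misaligned 4-byte load is
 * required, unlike *(const uint32_t *)p. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

With such helpers, a value stored at an odd offset inside a buffer can be read back through load_u32, whereas dereferencing the same address as a uint32_t pointer is the kind of access that dbx reports as a misaligned read.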