I've dug a little deeper and think the problem has something to do with the 10MB /tmp filesystem.
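A quick way to test whether a candidate temporary directory is large enough is sketched below. The function names `avail_kb` and `has_room` and the 64 MiB threshold are illustrative assumptions, not anything Open MPI defines; the space a job actually needs depends on the process count and shared-memory settings.

```shell
# Print the space available (in KiB) on the filesystem holding DIR.
avail_kb() {
    # -P forces the POSIX one-line-per-filesystem format; -k reports 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the filesystem holding DIR has at least MIN_KB KiB free.
has_room() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}

# Hypothetical threshold of 64 MiB; adjust to the job at hand.
dir="${TMPDIR:-/tmp}"
if has_room "$dir" 65536; then
    echo "$dir: enough room"
else
    echo "$dir: only $(avail_kb "$dir") KiB free"
fi
```

If the directory turns out to be too small, it may be worth pointing the orte_tmpdir_base MCA parameter at a larger filesystem (for example with `mpirun --mca orte_tmpdir_base /dev/shm`) and seeing whether the bus error goes away. That is a guess to try, not a confirmed fix.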

[bloscel@k1n11 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
compute_x86_64   32G  1.1G   31G   4% /
tmpfs            32G     0   32G   0% /dev/shm
tmpfs            10M   80K   10M   1% /tmp
tmpfs            10M     0   10M   0% /var/tmp
/dev/lb          53T  109G   53T   1% /gpfs/lb
/dev/sb         3.3T   38G  3.3T   2% /gpfs/sb
[bloscel@k1n11 ~]$ mktemp
/tmp/tmp.L8owhNH1AN
[bloscel@k1n11 ~]$ ompi_info -a | grep /dev/shm
MCA shmem: parameter "shmem_mmap_backing_file_base_dir" (current value: </dev/shm>, data source: default value)
[bloscel@k1n11 ~]$ ompi_info -a | grep orte_tmpdir_base
MCA orte: parameter "orte_tmpdir_base" (current value: <none>, data source: default value)
[bloscel@k1n11 ~]$

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Blosch, Edwin L
Sent: Wednesday, June 05, 2013 11:14 AM
To: Open MPI Users (us...@open-mpi.org)
Subject: EXTERNAL: [OMPI users] How to diagnose bus error with 1.6.4

I am running into a bus error that does not happen with MVAPICH, and I am guessing it has something to do with shared-memory communication. Has anyone had a similar experience or have any insights on what this could be?

Thanks

[k1n08:12688] mca: base: components_open: Looking for shmem components
[k1n08:12688] mca: base: components_open: opening shmem components
[k1n08:12688] mca: base: components_open: found loaded component mmap
[k1n08:12688] mca: base: components_open: component mmap register function successful
[k1n08:12688] mca: base: components_open: component mmap open function successful
[k1n08:12688] mca: base: components_open: found loaded component posix
[k1n08:12688] mca: base: components_open: component posix has no register function
[k1n08:12688] mca: base: components_open: component posix open function successful
[k1n08:12688] mca: base: components_open: found loaded component sysv
[k1n08:12688] mca: base: components_open: component sysv has no register function
[k1n08:12688] mca: base: components_open: component sysv open function successful
[k1n08:12688] shmem: base: runtime_query: Auto-selecting shmem components
[k1n08:12688] shmem: base: runtime_query: (shmem) Querying component (run-time) [mmap]
[k1n08:12688] shmem: base: runtime_query: (shmem) Query of component [mmap] set priority to 50
[k1n08:12688] shmem: base: runtime_query: (shmem) Querying component (run-time) [posix]
[k1n08:12688] shmem: base: runtime_query: (shmem) Skipping component [posix]. Run-time Query failed to return a module
[k1n08:12688] shmem: base: runtime_query: (shmem) Querying component (run-time) [sysv]
[k1n08:12688] shmem: base: runtime_query: (shmem) Skipping component [sysv]. Run-time Query failed to return a module
[k1n08:12688] shmem: base: runtime_query: (shmem) Selected component [mmap]
[k1n08:12688] mca: base: close: unloading component posix
[k1n08:12688] mca: base: close: unloading component sysv
[k1n08:12688] *** Process received signal ***
[k1n08:12688] Signal: Bus error (7)
[k1n08:12688] Signal code: Non-existant physical address (2)
[k1n08:12688] Failing at address: 0x2ac1e088e030
[k1n08:12688] [ 0] /lib64/libpthread.so.0(+0xf500) [0x2ac1de7c0500]
[k1n08:12688] [ 1] /applocal/cfd/test/bin/test_openmpi(__intel_ssse3_rep_memcpy+0xcdb) [0x1495cab]
[k1n08:12688] [ 2] /applocal/cfd/test/bin/test_openmpi(opal_convertor_pack+0x101) [0x125c111]
[k1n08:12688] [ 3] /applocal/cfd/test/bin/test_openmpi(mca_btl_sm_prepare_src+0xc5) [0x13aab25]
[k1n08:12688] [ 4] /applocal/cfd/test/bin/test_openmpi(mca_pml_ob1_send_request_start_rndv+0x67) [0x12fa9a7]
[k1n08:12688] [ 5] /applocal/cfd/test/bin/test_openmpi(mca_pml_ob1_isend+0x3ab) [0x12ef02b]
[k1n08:12688] [ 6] /applocal/cfd/test/bin/test_openmpi(ompi_coll_tuned_sendrecv_actual+0x94) [0x12d67f4]
[k1n08:12688] [ 7] /applocal/cfd/test/bin/test_openmpi(ompi_coll_tuned_bcast_intra_split_bintree+0x94d) [0x12d45fd]
[k1n08:12688] [ 8] /applocal/cfd/test/bin/test_openmpi(ompi_coll_tuned_bcast_intra_dec_fixed+0x143) [0x12d5dd3]
[k1n08:12688] [ 9] /applocal/cfd/test/bin/test_openmpi(mca_coll_sync_bcast+0x66) [0x12d6aa6]
[k1n08:12688] [10] /applocal/cfd/test/bin/test_openmpi(MPI_Bcast+0x5a) [0x11f95da]
[k1n08:12688] [11] /applocal/cfd/test/bin/test_openmpi(mpi_bcast_f+0x6e) [0x11dca5e]
[k1n08:12688] [12] /applocal/cfd/test/bin/test_openmpi(wpf_calc_mod_mp_wpf_calc_+0x10f0) [0x541be0]
[k1n08:12688] [13] /applocal/cfd/test/bin/test_openmpi(special_init_mod_mp_special_init_geom_+0x3f4) [0x683254]
[k1n08:12688] [14] /applocal/cfd/test/bin/test_openmpi(setup_mod_mp_setup_domains_+0x56b) [0x53effb]
[k1n08:12688] [15] /applocal/cfd/test/bin/test_openmpi(MAIN__+0x1ab7) [0x5e8be7]
[k1n08:12688] [16] /applocal/cfd/test/bin/test_openmpi(main+0x3c) [0x4ff82c]
[k1n08:12688] [17] /lib64/libc.so.6(__libc_start_main+0xfd) [0x2ac1de9eccdd]
[k1n08:12688] [18] /applocal/cfd/test/bin/test_openmpi() [0x4ff729]
[k1n08:12688] *** End of error message ***