Andrew@Intel is looking into it; he has some PSM patches coming that may already resolve this.
> On Oct 27, 2014, at 9:10 AM, Adrian Reber <adr...@lisas.de> wrote:
>
> This is a simpler test setup:
>
> On 8 core machines this works:
>
> $ mpirun -np 8 mpi_test_suite -t "environment"
> [...]
> Number of failed tests:0
>
> Using 9 or more cores it fails:
>
> $ mpirun -np 9 mpi_test_suite -t "environment"
>
> mpi_test_suite:20293 terminated with signal 11 at PC=2b6d107fa9a4 SP=7fff06431a70. Backtrace:
> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b6d107fa9a4]
> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b6d107eb172]
> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b6d0fa6e384]
> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b6d0f93376a]
> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b6d0f963d42]
> mpi_test_suite[0x46cd00]
> mpi_test_suite[0x44434c]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b6d10047d5d]
> mpi_test_suite[0x4058e9]
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
>
> mpi_test_suite:11212 terminated with signal 11 at PC=2b2c27d0d9a4 SP=7ffff5020430. Backtrace:
> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b2c27d0d9a4]
> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b2c27cfe172]
> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b2c26f81384]
> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b2c26e4676a]
> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b2c26e76d42]
> mpi_test_suite[0x46cd00]
> mpi_test_suite[0x44434c]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b2c2755ad5d]
> mpi_test_suite[0x4058e9]
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status, thus causing
> the job to be terminated. The first process to do so was:
>
> Process name: [[47415,1],0]
> Exit code: 1
> --------------------------------------------------------------------------
>
> On Mon, Oct 27, 2014 at 08:27:17AM -0700, Ralph Castain wrote:
>> I’m afraid I can’t quite decipher from all this what actually fails. Of
>> course, PSM doesn’t support dynamic operations like comm_spawn or
>> connect_accept, so if you are running those tests that just won’t work. Is
>> that the heart of the problem here?
>>
>>
>>> On Oct 27, 2014, at 1:40 AM, Adrian Reber <adr...@lisas.de> wrote:
>>>
>>> Running Open MPI 1.8.3 with PSM does not seem to work right now at all.
>>> I am getting the same errors also on trunk from my newly set up MTT.
>>> Before trying to debug this I just wanted to make sure this is not a
>>> configuration error.
I have following PSM packages installed: >>> >>> infinipath-devel-3.1.1-363.1140_rhel6_qlc.noarch >>> infinipath-libs-3.1.1-363.1140_rhel6_qlc.x86_64 >>> infinipath-3.1.1-363.1140_rhel6_qlc.x86_64 >>> >>> with 1.6.5 I do not see PSM errors and the test suite fails much later: >>> >>> P2P tests Many-to-one with MPI_Iprobe (MPI_ANY_SOURCE) (21/48), comm >>> Intracomm merged of the Halved Intercomm (13/13), type MPI_TYPE_MIX_ARRAY >>> (28/29) >>> P2P tests Many-to-one with MPI_Iprobe (MPI_ANY_SOURCE) (21/48), comm >>> Intracomm merged of the Halved Intercomm (13/13), type MPI_TYPE_MIX_LB_UB >>> (29/29) >>> n050304:5.0.Cannot cancel send requests (req=0x2ad8ba881f80) >>> P2P tests Many-to-one with Isend and Cancellation (22/48), comm >>> MPI_COMM_WORLD (1/13), type MPI_CHAR (1/29) >>> n050304:2.0.Cannot cancel send requests (req=0x2b25143fbd88) >>> n050302:7.0.Cannot cancel send requests (req=0x2b4d95eb0f80) >>> n050301:4.0.Cannot cancel send requests (req=0x2adf03e14f80) >>> n050304:4.0.Cannot cancel send requests (req=0x2ad877257ed8) >>> n050301:6.0.Cannot cancel send requests (req=0x2ba47634af80) >>> n050304:8.0.Cannot cancel send requests (req=0x2ae8ac16cf80) >>> n050302:3.0.Cannot cancel send requests (req=0x2ab81dcb4d88) >>> n050303:4.0.Cannot cancel send requests (req=0x2b9ef4ef8f80) >>> n050303:2.0.Cannot cancel send requests (req=0x2ab0f03f9f80) >>> n050302:9.0.Cannot cancel send requests (req=0x2b214f9ebed8) >>> n050301:2.0.Cannot cancel send requests (req=0x2b31302d4f80) >>> n050302:4.0.Cannot cancel send requests (req=0x2b0581bd3f80) >>> n050301:8.0.Cannot cancel send requests (req=0x2ae53776bf80) >>> n050303:6.0.Cannot cancel send requests (req=0x2b13eeb78f80) >>> n050304:7.0.Cannot cancel send requests (req=0x2b4e99715f80) >>> n050304:9.0.Cannot cancel send requests (req=0x2b10429c2f80) >>> n050304:3.0.Cannot cancel send requests (req=0x2b9196f5fe30) >>> n050304:6.0.Cannot cancel send requests (req=0x2b30d6c69ed8) >>> n050301:9.0.Cannot cancel send 
requests (req=0x2b93c9e04f80) >>> n050303:9.0.Cannot cancel send requests (req=0x2ab4d6ce0f80) >>> n050301:5.0.Cannot cancel send requests (req=0x2b6ad851ef80) >>> n050303:3.0.Cannot cancel send requests (req=0x2b8ef52a0f80) >>> n050301:3.0.Cannot cancel send requests (req=0x2b277a4aff80) >>> n050303:7.0.Cannot cancel send requests (req=0x2ba570fa9f80) >>> n050301:7.0.Cannot cancel send requests (req=0x2ba707dfbf80) >>> n050302:2.0.Cannot cancel send requests (req=0x2b90f2e51e30) >>> n050303:5.0.Cannot cancel send requests (req=0x2b1250ba8f80) >>> n050302:8.0.Cannot cancel send requests (req=0x2b22e0129ed8) >>> n050303:8.0.Cannot cancel send requests (req=0x2b6609792f80) >>> n050302:6.0.Cannot cancel send requests (req=0x2b2b6081af80) >>> n050302:5.0.Cannot cancel send requests (req=0x2ab24f6f1f80) >>> -------------------------------------------------------------------------- >>> mpirun has exited due to process rank 14 with PID 4496 on >>> node n050303 exiting improperly. There are two reasons this could occur: >>> >>> 1. this process did not call "init" before exiting, but others in >>> the job did. This can cause a job to hang indefinitely while it waits >>> for all processes to call "init". By rule, if one process calls "init", >>> then ALL processes must call "init" prior to termination. >>> >>> 2. this process called "init", but exited without calling "finalize". >>> By rule, all processes that call "init" MUST call "finalize" prior to >>> exiting or it will be considered an "abnormal termination" >>> >>> This may have caused other processes in the application to be >>> terminated by signals sent by mpirun (as reported here). 
>>> -------------------------------------------------------------------------- >>> [adrian@n050304 mpi_test_suite]$ >>> >>> and this are my PSM errors with 1.8.3: >>> >>> [adrian@n050304 mpi_test_suite]$ mpirun -np 32 mpi_test_suite -t >>> "All,^io,^one-sided" >>> >>> mpi_test_suite:8904 terminated with signal 11 at PC=2b08466239a4 >>> SP=7ffff03c6e30. Backtrace: >>> >>> mpi_test_suite:16905 terminated with signal 11 at PC=2ae4cad209a4 >>> SP=7fffceefa730. Backtrace: >>> >>> mpi_test_suite:3171 terminated with signal 11 at PC=2b57daafe9a4 >>> SP=7fff5c4b3af0. Backtrace: >>> >>> mpi_test_suite:16906 terminated with signal 11 at PC=2b4c9fa019a4 >>> SP=7fffe916c330. Backtrace: >>> >>> mpi_test_suite:3172 terminated with signal 11 at PC=2b6dde92e9a4 >>> SP=7fff04cf1730. Backtrace: >>> >>> mpi_test_suite:16907 terminated with signal 11 at PC=2ad6eb8589a4 >>> SP=7fffc30d02f0. Backtrace: >>> >>> mpi_test_suite:3173 terminated with signal 11 at PC=2b2e4aec89a4 >>> SP=7fffa054e230. Backtrace: >>> >>> mpi_test_suite:16908 terminated with signal 11 at PC=2b4e6e5589a4 >>> SP=7fff68c7a1f0. Backtrace: >>> >>> mpi_test_suite:3174 terminated with signal 11 at PC=2b7049b279a4 >>> SP=7fff99a49f70. Backtrace: >>> >>> mpi_test_suite:16909 terminated with signal 11 at PC=2b252219d9a4 >>> SP=7fff72a0c6b0. Backtrace: >>> >>> mpi_test_suite:3175 terminated with signal 11 at PC=2ac8d5caf9a4 >>> SP=7fff6d7a63f0. Backtrace: >>> >>> mpi_test_suite:16910 terminated with signal 11 at PC=2b7f83fc49a4 >>> SP=7fffb95b89b0. Backtrace: >>> >>> mpi_test_suite:3176 terminated with signal 11 at PC=2b11438da9a4 >>> SP=7fffe626f270. Backtrace: >>> >>> mpi_test_suite:16903 terminated with signal 11 at PC=2ac5249249a4 >>> SP=7fff8874af30. Backtrace: >>> >>> mpi_test_suite:3177 terminated with signal 11 at PC=2ab6154549a4 >>> SP=7fffbf6ff430. Backtrace: >>> >>> mpi_test_suite:16904 terminated with signal 11 at PC=2ad0265099a4 >>> SP=7fff89fea470. 
Backtrace: >>> >>> mpi_test_suite:3178 terminated with signal 11 at PC=2b606b1a79a4 >>> SP=7fff20240db0. Backtrace: >>> >>> mpi_test_suite:4458 terminated with signal 11 at PC=2b593ef029a4 >>> SP=7fff4f48b470. Backtrace: >>> >>> mpi_test_suite:4459 terminated with signal 11 at PC=2b06dde559a4 >>> SP=7fffd771a4f0. Backtrace: >>> >>> mpi_test_suite:4460 terminated with signal 11 at PC=2ba7904cb9a4 >>> SP=7fff9694c8b0. Backtrace: >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2ab6154549a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2ab615445172] >>> >>> mpi_test_suite:4461 terminated with signal 11 at PC=2b26799fd9a4 >>> SP=7fff70f69eb0. Backtrace: >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b11438da9a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b11438cb172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b1142b4e384] >>> >>> mpi_test_suite:4462 terminated with signal 11 at PC=2b15418e19a4 >>> SP=7fff858425b0. Backtrace: >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2ab6146c8384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2ab61458d76a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2ab6145bdd42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ab614ca1d5d] >>> mpi_test_suite[0x4058e9] >>> >>> mpi_test_suite:4463 terminated with signal 11 at PC=2b43082919a4 >>> SP=7fff2ea8a530. Backtrace: >>> >>> mpi_test_suite:4464 terminated with signal 11 at PC=2adc01fe89a4 >>> SP=7fff0de9d4b0. 
Backtrace: >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b1142a1376a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b1142a43d42] >>> mpi_test_suite[0x46cd00] >>> >>> mpi_test_suite:4465 terminated with signal 11 at PC=2b477a1819a4 >>> SP=7fffd33831b0. Backtrace: >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b1143127d5d] >>> mpi_test_suite[0x4058e9] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b43082919a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b4308282172] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b477a1819a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b477a172172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b47793f5384] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b26799fd9a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b26799ee172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b2678c71384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b47792ba76a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b47792ead42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b4307505384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b43073ca76a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b43073fad42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b2678b3676a] >>> 
/opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b2678b66d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b4307aded5d] >>> mpi_test_suite[0x4058e9] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b47799ced5d] >>> mpi_test_suite[0x4058e9] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b267924ad5d] >>> mpi_test_suite[0x4058e9] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b2e4aec89a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b2e4aeb9172] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2ac8d5caf9a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2ac8d5ca0172] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b57daafe9a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b57daaef172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b57d9d72384] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b08466239a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b0846614172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b0845897384] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b593ef029a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b593eef3172] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b7049b279a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b7049b18172] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b4e6e5589a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b4e6e549172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b084575c76a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b084578cd42] >>> mpi_test_suite[0x46cd00] >>> 
mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b0845e70d5d] >>> mpi_test_suite[0x4058e9] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b15418e19a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b15418d2172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b2e4a13c384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b2e4a00176a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b2e4a031d42] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b4e6d7cc384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b593e176384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b593e03b76a] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b606b1a79a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b606b198172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b606a41b384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b4e6d69176a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b1540b55384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b1540a1a76a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b57d9c3776a] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b06dde559a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b06dde46172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b06dd0c9384] >>> 
/opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b06dcf8e76a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b06dcfbed42] >>> mpi_test_suite[0x46cd00] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2ac8d4f23384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2ac8d4de876a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b4e6d6c1d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b4e6dda5d5d] >>> mpi_test_suite[0x4058e9] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b1540a4ad42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b154112ed5d] >>> mpi_test_suite[0x4058e9] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b6dde92e9a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b6dde91f172] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2ad6eb8589a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2ad6eb849172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b593e06bd42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b593e74fd5d] >>> mpi_test_suite[0x4058e9] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b606a2e076a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b606a310d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b606a9f4d5d] >>> mpi_test_suite[0x4058e9] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b06dd6a2d5d] >>> 
mpi_test_suite[0x4058e9] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b7048d9b384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b7048c6076a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b7048c90d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2ac5249249a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2ac524915172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2ac523b98384] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2adc01fe89a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2adc01fd9172] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b2e4a715d5d] >>> mpi_test_suite[0x4058e9] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2ad6eaacc384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2ad6ea99176a] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2ba7904cb9a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2ba7904bc172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b6dddba2384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b6ddda6776a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b6ddda97d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2ac523a5d76a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2ac523a8dd42] >>> mpi_test_suite[0x46cd00] >>> 
mpi_test_suite[0x44434c] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2adc0125c384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2adc0112176a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2adc01151d42] >>> mpi_test_suite[0x46cd00] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b57d9c67d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b57da34bd5d] >>> mpi_test_suite[0x4058e9] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ac524171d5d] >>> mpi_test_suite[0x4058e9] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2ba78f73f384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2ba78f60476a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2ac8d4e18d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ac8d54fcd5d] >>> mpi_test_suite[0x4058e9] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2ad6ea9c1d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ad6eb0a5d5d] >>> mpi_test_suite[0x4058e9] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2adc01835d5d] >>> mpi_test_suite[0x4058e9] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b6dde17bd5d] >>> mpi_test_suite[0x4058e9] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2ba78f634d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ba78fd18d5d] >>> mpi_test_suite[0x4058e9] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b7049374d5d] >>> 
mpi_test_suite[0x4058e9] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b252219d9a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b252218e172] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2ae4cad209a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2ae4cad11172] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2ad0265099a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2ad0264fa172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2ae4c9f94384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2ae4c9e5976a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b2521411384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b25212d676a] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b7f83fc49a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b7f83fb5172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2ae4c9e89d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ae4ca56dd5d] >>> mpi_test_suite[0x4058e9] >>> /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b4c9fa019a4] >>> /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b4c9f9f2172] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b4c9ec75384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b4c9eb3a76a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2ad02577d384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2ad02564276a] >>> 
/opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2ad025672d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ad025d56d5d] >>> mpi_test_suite[0x4058e9] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b2521306d42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b25219ead5d] >>> mpi_test_suite[0x4058e9] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b7f83238384] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b7f830fd76a] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b7f8312dd42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b4c9eb6ad42] >>> mpi_test_suite[0x46cd00] >>> mpi_test_suite[0x44434c] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b4c9f24ed5d] >>> mpi_test_suite[0x4058e9] >>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b7f83811d5d] >>> mpi_test_suite[0x4058e9] >>> ------------------------------------------------------- >>> Primary job terminated normally, but 1 process returned >>> a non-zero exit code.. Per user-direction, the job has been aborted. >>> ------------------------------------------------------- >>> -------------------------------------------------------------------------- >>> mpirun detected that one or more processes exited with non-zero status, >>> thus causing >>> the job to be terminated. 
The first process to do so was:
>>>
>>> Process name: [[9290,1],0]
>>> Exit code: 1
>>> --------------------------------------------------------------------------
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/10/16093.php
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/10/16099.php
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16100.php