Dear list, I get "Connection reset by peer" in Finalize (see log below), but *only* if I free my intercommunicators:

    ...
    for (std::vector<Connector*>::iterator connector = connectors.begin ();
         connector != connectors.end ();
         ++connector)
      (*connector)->freeIntercomm ();
    MPI::Finalize ();
    ...

where freeIntercomm is defined:

    void
    Connector::freeIntercomm ()
    {
      intercomm.Free ();
    }

What could be the reason for this? I'm using 1.2.7~rc2-1ubuntu2. (The problem does not occur on the other MPI implementations I've tested.)

[swish:10019] [ 0] /lib/libpthread.so.0 [0x7f0dc32610f0]
[swish:10019] [ 1] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so [0x7f0dbe1ed460]
[swish:10019] [ 2] /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0x670) [0x7f0dbd79ee60]
[swish:10019] [ 3] /usr/lib/openmpi/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x2b) [0x7f0dbdfe318b]
[swish:10019] [ 4] /usr/lib/libopen-pal.so.0(opal_progress+0x4a) [0x7f0dc4248f5a]
[swish:10019] [ 5] /usr/lib/openmpi/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x1d) [0x7f0dc189691d]
[swish:10019] [ 6] /usr/lib/openmpi/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x437) [0x7f0dc189a037]
[swish:10019] [ 7] /usr/lib/libopen-rte.so.0(mca_oob_recv_packed+0x33) [0x7f0dc44cbd43]
[swish:10019] [ 8] /usr/lib/openmpi/lib/openmpi/mca_gpr_proxy.so(orte_gpr_proxy_increment_value+0x1e2) [0x7f0dc14826a2]
[swish:10019] [ 9] /usr/lib/libopen-rte.so.0(orte_smr_base_set_proc_state+0x2ac) [0x7f0dc44e28fc]
[swish:10019] [10] /usr/lib/libmpi.so.0(ompi_mpi_finalize+0x111) [0x7f0dc4733521]
[swish:10019] [11] /home/mdj/music/trunk/src/.libs/libmusic.so.1(_ZN5MUSIC7Runtime8finalizeEv+0x7d) [0x7f0dc4bed7ed]
[swish:10019] [12] /home/mdj/music/trunk/test/.libs/lt-contdelay(main+0x347) [0x40a297]
[swish:10019] [13] /lib/libc.so.6(__libc_start_main+0xe6) [0x7f0dc2efe466]
[swish:10019] [14] /home/mdj/music/trunk/test/.libs/lt-contdelay [0x409539]
[swish:10019] *** End of error message ***
[swish:10015] [0,0,0]-[0,1,1] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
mpirun noticed that job rank 0 with PID 10018 on node swish exited on signal 15 (Terminated).
3 additional processes aborted (not shown)