I recently upgraded to gcc 4.3 and am having trouble running my MPI
programs. When I run an MPI program with mpiexec, it sometimes terminates
correctly and sometimes fails with various error messages, such as:
rank 2 in job 1 mahmoud-desktop_33023 caused collective abort of all ranks
exit status of rank 2: killed by signal 9
or
[cli_1]: aborting job:
Fatal error in MPI_Allreduce: Other MPI error, error stack:
MPI_Allreduce(696): MPI_Allreduce(sbuf=0x8103344, rbuf=0x8103348, count=1, MPI_UNSIGNED, MPI_SUM, MPI_COMM_WORLD) failed
MPIR_Allreduce(285)...:
MPIC_Sendrecv(161):
MPIC_Wait(321):
MPIDI_CH3_Progress_wait(199)..: an error occurred while handling an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(422):
MPIDU_Socki_handle_read(649)..: connection failure (set=0,sock=3,errno=104:(strerror() not found))
[cli_1]: aborting job:
Fatal error in MPI_Finalize: Other MPI error, error stack:
MPI_Finalize(220).: MPI_Finalize failed
MPI_Finalize(146).:
MPID_Finalize(206): an error occurred while the device was waiting for all open connections to close
MPIDI_CH3_Progress_wait(199)..: an error occurred while handling an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(422):
MPIDU_Socki_handle_read(649)..: connection failure (set=0,sock=4,errno=104:(strerror() not found))
rank 3 in job 4 mahmoud-desktop_33023 caused collective abort of all ranks
exit status of rank 3: killed by signal 9
rank 0 in job 4 mahmoud-desktop_33023 caused collective abort of all ranks
exit status of rank 0: killed by signal 11
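The MPI runtime does not decode the numeric codes in the messages above (note the "strerror() not found"). A minimal sketch for looking them up on the local machine follows. It assumes a POSIX system whose errno and signal numbering matches the machine that produced the log, and the helper names describe_errno and describe_signal are hypothetical, chosen only for illustration.

```python
import errno
import os
import signal


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    # errno.errorcode maps numeric values to names such as "ECONNRESET".
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)


def describe_signal(number):
    """Return the symbolic name of a signal number on this platform."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number is not a valid signal on this platform.
        return "unknown"


if __name__ == "__main__":
    # Codes copied from the error output above.
    print("errno 104:", describe_errno(104))
    for sig in (9, 11):
        print("signal", sig, ":", describe_signal(sig))
```

On a typical Linux system, errno 104 is ECONNRESET (connection reset by peer), signal 9 is SIGKILL and signal 11 is SIGSEGV. That would be consistent with one rank crashing with a segmentation fault and the remaining ranks losing their connections to it, though the log alone does not prove this.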
I cannot find the cause, since my programs work fine with gcc 4.2. If this
is a known bug that has already been fixed, please tell me how to fix it
on my own machine.
Best regards,
--
Summary: Compatibility with MPICH2
Product: gcc
Version: 4.3.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: mahmoud dot fatene at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38837