Hi,

I'm experiencing the following problem with MPICH-G2: when I run a multi-node job, the "Isend" operation doesn't run to completion and gives the following error message:

ERROR: MPID_Get_count: could not interpret status->private_count 145538180
0 - MPI_GET_COUNT : Internal MPI error!
Aborting with code 16
MPICH-G2: read failure - globus_xio: System error in read: Connection reset by peer, state=await_format
MPICH-G2: read failure - globus_xio: System error in read: Connection reset by peer, state=await_format

MPICH-G2: read failure - globus_xio: System error in read: Connection reset by peer, state=await_format
MPICH-G2: read failure - globus_xio: System error in read: Connection reset by peer, state=await_format
ERROR: MPID_Abort: failed remote globus_gram_client_job_cancel to job contact >https://cs2.cse.oar.net:51767/14800/1189625960/<


Caught broken pipe signal. Connection to server may be down
MPICH-G2: ERROR: prime_the_line: connect failed

MPICH-G2: read failure - globus_xio: System error in read: Connection reset by peer, state=await_format
ERROR: MPID_Abort: failed remote globus_gram_client_job_cancel to job contact >https://cs2.cse.oar.net:51767/14800/1189625960/<
ERROR: MPID_Abort: failed remote globus_gram_client_job_cancel to job contact >https://cs2.cse.oar.net:51767/14800/1189625960/<
ERROR: MPID_Abort: failed remote globus_gram_client_job_cancel to job contact >https://cs3.cse.oar.net:55840/14766/1189625960/<
ERROR: MPID_Abort: failed remote globus_gram_client_job_cancel to job contact >https://cs4.cse.oar.net:59909/3581/1189625960/<

I've searched the internet for a couple of days and found that other people have posted about this problem, but no one has posted a solution. I'm using Globus 4.0.4 and MPICH 1.2.7.

Does anyone have any idea what could be causing this?

Thanks in advance for your help,

~leo
