"It's not possible to connect!!!!"

Hi Devel list, crossposting as this is getting weird...

We wrote a client/server using MPI_Publish_name / MPI_Lookup_name.
It runs fine on both MPICH2 and LAM-MPI but fails on Open MPI. It's
not a simple failure (i.e. returning an error code): it aborts
execution and quits. The server continues to run after the
client's crash.


The server also uses 100% of the CPU while running, which doesn't
happen with LAM.


The code is here:
http://www.systemcall.com.br/rengolin/open-mpi/


Open MPI version: 1.1.1


Compiling: 
mpiCC -o server server.c 
mpiCC -o client client.c 
 - or -
mpiCC -o client client.c -DUSE_LOOKUP 


Running & Output:
-- Server --
sbornia$ mpiexec server foo
server Process Rank 0 ,TOT processes 1 on sbornia
Server foo available at 0.1.0:2000



-- Client without USE_LOOKUP --
sbornia$ mpiexec client foo
Rank Client Process 0 ,TOT processes 1 on sbornia
[sbornia:06246] [0,1,0] ORTE_ERROR_LOG: Pack data mismatch in file dss/dss_unpack.c at line 171
[sbornia:06246] [0,1,0] ORTE_ERROR_LOG: Pack data mismatch in file dss/dss_unpack.c at line 145
[sbornia:06246] *** An error occurred in MPI_Comm_connect
[sbornia:06246] *** on communicator MPI_COMM_WORLD
[sbornia:06246] *** MPI_ERR_UNKNOWN: unknown error
[sbornia:06246] *** MPI_ERRORS_ARE_FATAL (goodbye)
[sbornia:06243] [0,0,0]-[0,1,0] mca_oob_tcp_msg_recv: readv failed with errno=104



-- Client with USE_LOOKUP --
sbornia$ mpiexec client foo
Rank Client Process 0 ,TOT processes 1 on sbornia
[sbornia:06232] *** An error occurred in MPI_Lookup_name
[sbornia:06232] *** on communicator MPI_COMM_WORLD
[sbornia:06232] *** MPI_ERR_NAME: invalid name argument
[sbornia:06232] *** MPI_ERRORS_ARE_FATAL (goodbye)
[sbornia:06229] [0,0,0]-[0,1,0] mca_oob_tcp_msg_recv: readv failed with errno=104



OS error code 104: Connection reset by peer 


What are we doing wrong, or where's the bug?


Thanks in advance!

--alfonso & renato


