I am wondering whether this is really due to the usage of File_write_all. We had a bug in the 1.3 series (which will be fixed in 1.3.4) where we lost message segments and thus had a deadlock in Comm_dup if communication occurred *right after* the Comm_dup. File_open executes a comm_dup internally.
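
To illustrate, the problematic pattern boils down to something like this (a made-up sketch, not our actual test case):

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm dup;
    int token = 0;

    MPI_Init(&argc, &argv);

    /* MPI_File_open performs the equivalent of this dup internally. */
    MPI_Comm_dup(MPI_COMM_WORLD, &dup);

    /* Communication *right after* the dup -- the window in which the
     * affected 1.3.x releases could lose message segments and deadlock. */
    MPI_Allreduce(MPI_IN_PLACE, &token, 1, MPI_INT, MPI_SUM, dup);

    MPI_Comm_free(&dup);
    MPI_Finalize();
    return 0;
}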

If you replace write_all with write, you avoid that communication. If you replace ib with tcp, your entire timing is different and you might just happen not to see the deadlock...

Just my $0.02 ...

Thanks
Edgar

Dorian Krause wrote:
Dear list,

the attached program deadlocks in MPI_File_write_all when run with 16 processes on two 8-core nodes of an InfiniBand cluster. It runs fine when I

a) use tcp
or
b) replace MPI_File_write_all with MPI_File_write
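
A stripped-down sketch of the structure (not the attachment itself; the file name and buffer size here are placeholders):

#include <mpi.h>

#define N 1024

int main(int argc, char **argv)
{
    int i, rank;
    double buf[N];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < N; i++)
        buf[i] = (double)rank;

    /* MPI_File_open performs a comm_dup internally; "testfile" is a
     * placeholder name. */
    MPI_File_open(MPI_COMM_WORLD, "testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank writes its own contiguous block; the hang is in this
     * collective call when running over InfiniBand. */
    MPI_File_seek(fh, (MPI_Offset)(rank * N * sizeof(double)), MPI_SEEK_SET);
    MPI_File_write_all(fh, buf, N, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}

For case a) I force the tcp btl with the usual MCA switch, e.g. mpirun -np 16 --mca btl tcp,self,sm ./a.out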

I'm using openmpi v1.3.2 (but I checked that the problem also occurs with version 1.3.3). The OFED version is 1.4 (installed via Rocks). The operating system is CentOS 5.2.

I compile with gcc-4.1.2. The openmpi configure flags are

../../configure --prefix=/share/apps/openmpi/1.3.2/gcc-4.1.2/ --with-io-romio-flags=--with-file-system=nfs+ufs+pvfs2 --with-wrapper-ldflags=-L/share/apps/pvfs2/lib CPPFLAGS=-I/share/apps/pvfs2/include/ LDFLAGS=-L/share/apps/pvfs2/lib LIBS="-lpvfs2 -lpthread"

The user home directories are mounted via NFS.

Is it a problem with the user code, the system, or openmpi?

Thanks,
Dorian



--
Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab      http://pstl.cs.uh.edu
Department of Computer Science          University of Houston
Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335
