Thanks.

I had also posted the bug on the MPICH2 list, and received an
answer from the ROMIO maintainers: the issue seems to be related to
NFS file locking bugs. I had been testing on an NFS file system, and
when I re-tested on a local (ext3) file system, I could not reproduce
the bug.

I had been experimenting with MPI-IO using explicit offsets,
individual file pointers, and shared file pointers, and have
workarounds, so I'll just avoid shared pointers on NFS.
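
As an illustration of the explicit-offset workaround, here is a minimal
sketch in plain C of how each process's file offset could be derived from
the number of bytes every rank intends to write, so that no shared file
pointer is needed. The helper name explicit_offsets and its parameters are
hypothetical and not taken from the original test case. In a real MPI
program the same exclusive prefix sum would typically come from MPI_Exscan,
and each rank would pass its own offset to MPI_File_write_at.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the original test case).
 *
 * counts[r]  : number of bytes rank r will write
 * base       : file offset at which the first rank starts
 * offsets[r] : on return, the file offset at which rank r should write
 *
 * This is an exclusive prefix sum: rank r starts where ranks 0..r-1 end.
 */
void explicit_offsets(const long long *counts, size_t nranks,
                      long long base, long long *offsets)
{
    long long next = base;
    for (size_t r = 0; r < nranks; r++) {
        offsets[r] = next;   /* rank r starts at the running total */
        next += counts[r];   /* advance past rank r's data */
    }
}
```

Because every offset depends only on the byte counts, no process has to
query or update a shared pointer, which is the operation that relied on
NFS file locking here.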

Best regards,

        Yvan Fournier

        EDF R&D

On Sat, 2008-08-16 at 08:19 -0400, users-requ...@open-mpi.org wrote:

> Date: Sat, 16 Aug 2008 08:05:14 -0400
> From: Jeff Squyres <jsquy...@cisco.com>
> Subject: Re: [OMPI users] bug in MPI_File_get_position_shared ?
> To: Open MPI Users <us...@open-mpi.org>
> Cc: mpich2-ma...@mcs.anl.gov
> Message-ID: <023f1db0-8e8d-4c8c-8156-80ae52ff0...@cisco.com>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
> 
> On Aug 13, 2008, at 7:06 PM, Yvan Fournier wrote:
> 
> > I seem to have encountered a bug in MPI-IO, in which
> > MPI_File_get_position_shared hangs when called by multiple processes
> > in a communicator. It can be illustrated by the following simple test
> > case, in which a file is simply created with C IO, and opened with
> > MPI-IO (defining or undefining MY_MPI_IO_BUG on line 5 enables/disables
> > the bug). From the MPI2 documentation, it seems that all processes
> > should be able to call MPI_File_get_position_shared, but if more than
> > one process uses it, it fails. Setting the shared pointer helps, but
> > this should not be necessary, and the code still hangs (in more
> > complete code, after writing data).
> >
> > I encounter the same problem with Open MPI 1.2.6 and MPICH2 1.0.7, so
> > I may have misread the documentation, but I suspect a ROMIO bug.
> 
> Bummer.  :-(
> 
> It would be best to report this directly to the ROMIO maintainers via
> romio-ma...@mcs.anl.gov. They lurk on this list, but they may not be
> paying attention to every mail.
> 
> If you wouldn't mind, please CC me on the mail to romio-maint.  Thanks!
> 
> -- 
> Jeff Squyres
> Cisco Systems

