On Tue, Nov 08, 2011 at 03:51:01PM -0500, Zaak Beekman wrote:
> So,
> it turns out I had sent the wrong error log. The correct error I am
> seeing is attached. Again, it is triggered on calls to h5dwrite_f. The error
> mentions file locks. HDF5 is built against an MPICH variant and
> the Intel compiler. The data is being read from and written to a Lustre
> file system. I believe all the tests passed when I installed
> HDF5-1.8.5p1, and I have been successfully reading and writing data
> (collectively) in parallel until now.
> I spoke to our system administrator and he seems to think that if we remount
> lustrefs with different flags it will fix the issue, but both of us have
> limited experience.
> 
> Has anyone seen these errors before? Is my sysadmin correct in saying that
> a simple change to the lustrefs mount flags will fix this?

It's good you found the correct error file.  

Your MPI version is a little on the older side.  Newer versions print
a somewhat more helpful error message:

FPRINTF(stderr, "File locking failed in ADIOI_Set_lock(fd %X,cmd %s/%X,type %s/%X,whence %X) with return value %X and errno %X.\n"
                  "- If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running on all the machines, and mount the directory with the 'noac' option (no attribute caching).\n"
                  "- If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option.\n",

I don't have much personal experience with Lustre, so I don't know the
full ramifications of the 'flock' option, but that's what the
Lustre community tells me is needed.
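
If you want to check whether locking works in a given directory before
and after a remount, something like the following might help.  It is only
a minimal sketch, not anything taken from MPICH or HDF5: the function name
probe_fcntl_lock and the "lockprobe-" file prefix are made up for
illustration.  It assumes that the failing call in ADIOI_Set_lock is an
fcntl()-style lock request (which is what the message suggests), and it
makes the same kind of request on a scratch file through Python's
fcntl.lockf.

```python
import errno
import fcntl
import os
import sys
import tempfile


def probe_fcntl_lock(directory):
    """Try to take and release an exclusive fcntl() lock on a scratch file.

    Returns 0 if the lock was granted, otherwise the errno of the failure.
    (Hypothetical helper, for illustration only.)
    """
    fd, path = tempfile.mkstemp(dir=directory, prefix="lockprobe-")
    try:
        try:
            # lockf() issues an fcntl() lock request covering the whole file;
            # LOCK_NB makes it fail immediately instead of blocking.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            return exc.errno
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return 0
    finally:
        # Always clean up the scratch file, whether or not locking worked.
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    err = probe_fcntl_lock(target)
    if err == 0:
        print("fcntl lock OK in", target)
    else:
        name = errno.errorcode.get(err, str(err))
        print("fcntl lock FAILED in", target, "-", name, os.strerror(err))
        sys.exit(1)
```

Run it with the directory holding your HDF5 files as the only argument,
from a compute node that mounts the file system the same way your job
does.  A failure here points at the mount options rather than at HDF5.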

A more helpful error message is just one benefit of upgrading your MPI
implementation.  The Lustre driver has seen a host of improvements and
bug fixes.  If there's any way you can upgrade to MPICH2-1.4.1 or
MVAPICH2-1.7, you will be a much happier lustre user.

==rob

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
