Stephane,
I just tried on an XK6 system with the GNU compilers and I get the same
issue. I tried with 1.8.8 (compiled myself) and 1.8.7 (from Cray). It
seems to appear only when I write to the Lustre file system; it does
not seem to occur otherwise.
1002
1003
1004
1005
HDF5-DIAG: Error detected in HDF5 (1.8.8) MPI-process 0:
#000: /project/csvis/soumagne/apps/src/rosa/hdf5-vfd-1.8.8/source/src/H5F.c line 1522 in H5Fopen(): unable to open file
major: File accessability
minor: Unable to open file
#001: /project/csvis/soumagne/apps/src/rosa/hdf5-vfd-1.8.8/source/src/H5F.c line 1211 in H5F_open(): unable to open file: time = Tue Feb 7 10:42:17 2012
, name = 'test.h5', tent_flags = 1
major: File accessability
minor: Unable to open file
regards,
Jerome
On 02/07/2012 10:39 AM, Biddiscombe, John A. wrote:
Stephane,
Make sure you point out that this only occurs when using a Lustre
filesystem.
JB
*From:*[email protected]
[mailto:[email protected]] *On Behalf Of *Stéphane Backaert
*Sent:* 05 February 2012 23:21
*To:* [email protected]
*Subject:* [Hdf-forum] can not open/close a file more than 1010 times
Dear all,
I am using the parallel HDF5 API on a Cray XE6. It had been working
fine until a few weeks ago. The problem is probably related to this
machine (perhaps since a maintenance), so I have reported it to the
helpdesk for this Cray, but I am submitting it to you as well. I see
the problem only on this machine; another (non-Cray) machine I use is
not affected.
Problem:
With the parallel version of HDF5, we cannot open and close the same
HDF5 file more than a fixed number of times (1010). Moreover, if we
open and close, for example, two different HDF5 files, this maximum is
divided by two. *Is there a limit on the number of HDF5 file openings
allowed?*
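One way to read the halving, as a rough sketch only: if every open/close
cycle left one file descriptor behind, the total number of cycles would
be capped by the descriptors still available to the process, and that
budget would be shared between the files. The limit of 1024 and the 14
descriptors already in use are assumed values chosen to match the
observed 1010; neither number comes from the report, and the function
name is hypothetical.

```python
def cycles_before_failure(fd_limit, fds_in_use, n_files):
    """Open/close cycles per file before descriptors run out.

    Assumes (hypothetically) that each cycle leaks exactly one descriptor.
    """
    available = fd_limit - fds_in_use
    return available // n_files


# Assumed values: soft limit of 1024, 14 descriptors already in use.
print(cycles_before_failure(1024, 14, 1))  # 1010
print(cycles_before_failure(1024, 14, 2))  # 505
```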
Observations:
The problem is reproducible: the number of open/close cycles allowed
is always the same.
It does not depend on the number of MPI processes involved (tested up
to 512 cores, still 1010 openings before the crash), on the quantity
of data written, or on the operations performed on an open file (such
as creating datasets, attributes, or groups).
We checked the status of each HDF5 operation (hdferror argument): none
reports an error before the crash, so every file seems to be correctly
opened and closed. Setting the file access property to
H5F_CLOSE_STRONG_F makes no difference.
The HDF5 file created and used before the crash is still readable, and
all data written to it before the crash are intact.
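The pattern above (every call succeeds, then a sudden failure at a
fixed count) is what descriptor exhaustion looks like in general. The
sketch below is not the HDF5 reproducer; it only imitates that pattern
in Python on a POSIX system by lowering the per-process soft limit and
opening os.devnull without closing it. The limit of 64 and the function
name are illustrative assumptions.

```python
import errno
import os
import resource


def leak_until_failure(soft_limit=64):
    """Open os.devnull without closing; return (successful_opens, errno)."""
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Temporarily lower the soft limit so the failure happens quickly.
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old_hard))
    leaked = []
    failure = None
    try:
        while len(leaked) < 2 * soft_limit:  # safety bound on the loop
            try:
                leaked.append(os.open(os.devnull, os.O_RDONLY))
            except OSError as exc:
                failure = exc.errno
                break
    finally:
        # Release everything and restore the original limit.
        for fd in leaked:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))
    return len(leaked), failure


count, err = leak_until_failure(64)
print(count, errno.errorcode.get(err))
```

The number of successful opens depends on how many descriptors the
process already holds, which is why the observed threshold sits a
little below the limit rather than exactly on it.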
Setup:
This problem occurs with hdf5-parallel versions 1.8.5.0 and 1.8.8,
both compiled with the Intel compiler 12.0.3.174, only on a Cray XE6
(I could send the module configuration used).
The attached minimal program reproduces the structure of our HDF5
calls and produces the error described. It is compiled with the
command "h5pfc -FR main.F -o test".
Here is a part of the standard error, produced when the code reached
the 1010th opening:
HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 0:
#000: H5F.c line 1495 in H5Fopen(): unable to open file
major: File accessability
minor: Unable to open file
#001: H5F.c line 1195 in H5F_open(): unable to open file
major: File accessability
minor: Unable to open file
#002: H5FD.c line 1088 in H5FD_open(): open failed
major: Virtual File Layer
minor: Unable to initialize object
#003: H5FDmpio.c line 999 in H5FD_mpio_open(): MPI_File_open failed
major: Internal error (too specific to document in detail)
minor: Some MPI function failed
#004: H5FDmpio.c line 999 in H5FD_mpio_open(): Other I/O error , error stack:
ADIOI_UFS_OPEN(108): Other I/O error Too many open files
major: Internal error (too specific to document in detail)
minor: MPI Error String
HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 0:
#000: H5F.c line 1943 in H5Fclose(): invalid file identifier
major: Invalid arguments to routine
minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 1:
#000: H5F.c line 1495 in H5Fopen(): unable to open file
major: File accessability
minor: Unable to open file
#001: H5F.c line 1195 in H5F_open(): unable to open file
major: File accessability
minor: Unable to open file
#002: H5FD.c line 1088 in H5FD_open(): open failed
major: Virtual File Layer
minor: Unable to initialize object
HDF5-DIAG: Error detected in HDF5 (1.8.5) HDF5-DIAG: Error detected in
HDF5 (1.8.5) HDF5-DIAG: Error detected in HDF5 (1.8.5) #003:
H5FDmpio.c line 999 in H5FD_mpio_open(): MPI_File_open failed
MPI-process 3MPI-process 6 major: Internal error (too specific to
document in detail)
HDF5-DIAG: Error detected in HDF5 (1.8.5) :
HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 7HDF5-DIAG:
Error detected in HDF5 (1.8.5) MPI-process 2:
:
#000: H5F.c line 1495 in H5Fopen(): unable to open file
minor: Some MPI function failed
#000: H5F.c line 1495 in H5Fopen(): unable to open file
#000: H5F.c line 1495 in H5Fopen(): unable to open file
major: File accessability
major: File accessability
I also attach the whole standard error.
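"Too many open files" in the ADIOI_UFS_OPEN line is the operating
system's message for reaching the per-process file descriptor limit. A
minimal way to check whether an open/close cycle leaves descriptors
behind is to count them before and after. The sketch below is in Python
rather than Fortran and assumes a Linux /proc file system; the function
names are hypothetical, and `cycle` stands in for whatever open/close
pair is under suspicion.

```python
import os
import resource


def open_fd_count():
    """Descriptors currently open in this process (Linux /proc only)."""
    # listdir holds one descriptor on the directory while it runs.
    return len(os.listdir("/proc/self/fd")) - 1


def leaked_fds(cycle, repeats=10):
    """Run cycle() `repeats` times; return how many descriptors stayed open."""
    before = open_fd_count()
    for _ in range(repeats):
        cycle()
    return open_fd_count() - before


def clean_cycle():
    # A well-behaved cycle: the descriptor is released again.
    os.close(os.open(os.devnull, os.O_RDONLY))


soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "currently open:", open_fd_count())
print("leaked by clean cycle:", leaked_fds(clean_cycle))
```

A result that grows with `repeats` would point to a leak below the
application, since the application-level close calls report success.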
I would appreciate any help. Please do not hesitate to ask for
additional details if needed.
Thanks,
Best regards,
Stephane
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org