Which XE6 is this? (which OS rev?) -john
On Feb 7, 2012, at 1:39 AM, Biddiscombe, John A. wrote:

> Stéphane,
>
> Make sure you point out that this only occurs when using a Lustre filesystem.
>
> JB
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Stéphane Backaert
> Sent: 05 February 2012 23:21
> To: [email protected]
> Subject: [Hdf-forum] can not open/close a file more than 1010 times
>
> Dear all,
>
> I am using the HDF5-parallel API on a Cray XE6. This had been working fine
> until a few weeks ago. This is probably something related to this machine
> (since a maintenance?), so I sent my issue to the helpdesk of this Cray, but
> I submit my problem to you as well. I have this problem only with this
> machine; there is no problem with another (non-Cray) machine I use.
>
> Problem:
> With the parallel version of HDF5, we cannot open and close the same HDF5
> file more than a fixed number of times (1010). Moreover, if we try to open
> and close, for example, two different HDF5 files, this maximum number is
> divided by two: is there a limit on the number of HDF5 file openings allowed?
>
> Observations:
> This problem can be reproduced, always with the same number of open/close
> cycles allowed.
> This problem does not depend on the number of MPI procs involved (tested up
> to 512 cores, still 1010 openings before the crash), or on the quantity
> written, or on the actions performed on an opened file (like creating
> datasets, attributes, or groups).
> We checked the status of each HDF5 operation (hdferror argument): none
> complains before the crash. So every file seems to be correctly
> opened/closed. There is no different behavior if we force the file-opening
> property with H5F_CLOSE_STRONG_F.
> The HDF5 file created and used before the crash is still readable, and all
> data written to it beforehand are OK.
>
> Setup:
> This problem occurs with the hdf5-parallel versions 1.5.8.0 and 1.8.8, both
> compiled with the Intel compiler 12.0.3.174, only on a Cray XE6 (I could send
> the module config used).
>
> A basic code reproduces our HDF5 call structure. The attached code produced
> the described error. It is compiled with the command
> "h5pfc -FR main.F -o test".
>
> Here is a part of the standard error, produced when the code reached the
> 1010th opening:
>
> HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 0:
>   #000: H5F.c line 1495 in H5Fopen(): unable to open file
>     major: File accessability
>     minor: Unable to open file
>   #001: H5F.c line 1195 in H5F_open(): unable to open file
>     major: File accessability
>     minor: Unable to open file
>   #002: H5FD.c line 1088 in H5FD_open(): open failed
>     major: Virtual File Layer
>     minor: Unable to initialize object
>   #003: H5FDmpio.c line 999 in H5FD_mpio_open(): MPI_File_open failed
>     major: Internal error (too specific to document in detail)
>     minor: Some MPI function failed
>   #004: H5FDmpio.c line 999 in H5FD_mpio_open(): Other I/O error , error stack:
> ADIOI_UFS_OPEN(108): Other I/O error Too many open files
>     major: Internal error (too specific to document in detail)
>     minor: MPI Error String
> HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 0:
>   #000: H5F.c line 1943 in H5Fclose(): invalid file identifier
>     major: Invalid arguments to routine
>     minor: Inappropriate type
> HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 1:
>   #000: H5F.c line 1495 in H5Fopen(): unable to open file
>     major: File accessability
>     minor: Unable to open file
>   #001: H5F.c line 1195 in H5F_open(): unable to open file
>     major: File accessability
>     minor: Unable to open file
>   #002: H5FD.c line 1088 in H5FD_open(): open failed
>     major: Virtual File Layer
>     minor: Unable to initialize object
> HDF5-DIAG: Error detected in HDF5 (1.8.5) HDF5-DIAG: Error detected in HDF5 (1.8.5) HDF5-DIAG: Error detected in HDF5 (1.8.5) #003: H5FDmpio.c line 999 in H5FD_mpio_open(): MPI_File_open failed
> MPI-process 3MPI-process 6 major: Internal error (too specific to document in detail)
> HDF5-DIAG: Error detected in HDF5 (1.8.5) :
> HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 7HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 2:
> :
> #000: H5F.c line 1495 in H5Fopen(): unable to open file
> minor: Some MPI function failed
> #000: H5F.c line 1495 in H5Fopen(): unable to open file
> #000: H5F.c line 1495 in H5Fopen(): unable to open file
> major: File accessability
> major: File accessability
>
> I also attach the whole standard error.
>
> I will appreciate any help! Do not hesitate to ask me for additional details
> if needed!
>
> Thanks,
>
> Best regards,
>
> Stephane
>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
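
The "Too many open files" text in the quoted trace is the operating system's message for EMFILE, the error returned when a process has used up its per-process file descriptor limit. The sketch below is not the attached Fortran reproducer and does not use HDF5 or MPI at all; it only illustrates, in plain Python on Linux, how descriptors that are opened and never released run into that limit after a fixed number of opens. The function name opens_until_emfile and the limit of 64 are hypothetical, chosen for illustration. If the soft limit on the compute nodes were the common default of 1024, 1010 successful opens would leave about 14 descriptors for everything else; that value is an assumption, not something the thread confirms.

```python
import errno
import os
import resource


def opens_until_emfile(soft_limit):
    """Temporarily lower the soft descriptor limit, then open os.devnull
    repeatedly WITHOUT closing it, until the OS refuses.

    Returns (number_of_successful_opens, error_message).
    Hypothetical helper for illustration only; no HDF5 or MPI involved.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    leaked = []
    try:
        while True:
            # Each open consumes one descriptor; nothing here releases it.
            leaked.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        if exc.errno != errno.EMFILE:
            raise
        message = exc.strerror
    finally:
        # Release the leaked descriptors and restore the original limit.
        for fd in leaked:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return len(leaked), message


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)
    count, message = opens_until_emfile(64)
    print("opens before failure:", count)
    print("error message:", message)
```

The count comes out a little below the limit because stdin, stdout, stderr and anything else the process already holds also occupy descriptors. Under the same assumption, one descriptor left behind per open would also be consistent with the observation that alternating between two files halves the number of cycles per file. Printing the soft limit on a compute node (for example with "ulimit -n" in the job script) would be a cheap way to check whether the numbers line up.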
