Which XE6 is this?  (which OS rev?)

-john

On Feb 7, 2012, at 1:39 AM, Biddiscombe, John A. wrote:
> Stéphane,
>  
> Make sure you point out that this only occurs when using a Lustre filesystem.
>  
> JB
>  
> From: [email protected] [mailto:[email protected]] 
> On Behalf Of Stéphane Backaert
> Sent: 05 February 2012 23:21
> To: [email protected]
> Subject: [Hdf-forum] can not open/close a file more than 1010 times
>  
> Dear all,
>  
> I am using the parallel HDF5 API on a Cray XE6. It had been working fine 
> until a few weeks ago. The problem is probably related to this machine 
> (perhaps since a maintenance?), so I have sent the issue to the helpdesk for 
> this Cray, but I am submitting it to you as well. I see the problem only on 
> this machine, not on the other (non-Cray) machine I use.
>  
> Problem:
> With the parallel version of HDF5, we cannot open and close the same HDF5 
> file more than a fixed number of times (1010). Moreover, if we open and 
> close, for example, two different HDF5 files, this maximum is halved. Is 
> there a limit on the number of HDF5 file openings allowed?
>  
> Observations:
> The problem is reproducible, always with the same number of open/close 
> cycles allowed.
> It does not depend on the number of MPI processes involved (tested up to 
> 512 cores, still 1010 openings before the crash), on the quantity of data 
> written, or on the operations performed on an open file (such as creating 
> datasets, attributes, or groups).
> We checked the status of each HDF5 operation (hdferror argument): none 
> reports an error before the crash, so every file seems to be correctly 
> opened and closed. The behavior does not change if we set the file close 
> degree to H5F_CLOSE_STRONG_F in the file access property list.
> The HDF5 file created and used before the crash is still readable, and all 
> data written to it beforehand are intact.
>  
> Setup:
> This problem occurs with hdf5-parallel versions 1.8.5.0 and 1.8.8, both 
> compiled with the Intel compiler 12.0.3.174, only on a Cray XE6 (I could send 
> the module configuration used).
>  
> A minimal program reproduces the structure of our HDF5 calls. The attached 
> code produces the error described above. It is compiled with the command 
> "h5pfc -FR main.F -o test".
>  
> Here is part of the standard error output, produced when the code reached 
> the 1010th opening:
>  
> HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 0:
>   #000: H5F.c line 1495 in H5Fopen(): unable to open file
>     major: File accessability
>     minor: Unable to open file
>   #001: H5F.c line 1195 in H5F_open(): unable to open file
>     major: File accessability
>     minor: Unable to open file
>   #002: H5FD.c line 1088 in H5FD_open(): open failed
>     major: Virtual File Layer
>     minor: Unable to initialize object
>   #003: H5FDmpio.c line 999 in H5FD_mpio_open(): MPI_File_open failed
>     major: Internal error (too specific to document in detail)
>     minor: Some MPI function failed
>   #004: H5FDmpio.c line 999 in H5FD_mpio_open(): Other I/O error , error 
> stack:
> ADIOI_UFS_OPEN(108): Other I/O error Too many open files
>     major: Internal error (too specific to document in detail)
>     minor: MPI Error String
> HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 0:
>   #000: H5F.c line 1943 in H5Fclose(): invalid file identifier
>     major: Invalid arguments to routine
>     minor: Inappropriate type
> HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 1:
>   #000: H5F.c line 1495 in H5Fopen(): unable to open file
>     major: File accessability
>     minor: Unable to open file
>   #001: H5F.c line 1195 in H5F_open(): unable to open file
>     major: File accessability
>     minor: Unable to open file
>   #002: H5FD.c line 1088 in H5FD_open(): open failed
>     major: Virtual File Layer
>     minor: Unable to initialize object
> HDF5-DIAG: Error detected in HDF5 (1.8.5) HDF5-DIAG: Error detected in HDF5 
> (1.8.5) HDF5-DIAG: Error detected in HDF5 (1.8.5)   #003: H5FDmpio.c line 999 
> in H5FD_mpio_open(): MPI_File_open failed
> MPI-process 3MPI-process 6    major: Internal error (too specific to document 
> in detail)
> HDF5-DIAG: Error detected in HDF5 (1.8.5) :
> HDF5-DIAG: Error detected in HDF5 (1.8.5) MPI-process 7HDF5-DIAG: Error 
> detected in HDF5 (1.8.5) MPI-process 2:
> :
>   #000: H5F.c line 1495 in H5Fopen(): unable to open file
>     minor: Some MPI function failed
>   #000: H5F.c line 1495 in H5Fopen(): unable to open file
>   #000: H5F.c line 1495 in H5Fopen(): unable to open file
>     major: File accessability
>     major: File accessability
>  
> I have also attached the complete standard error output.
>  
> I would appreciate any help! Do not hesitate to ask me for additional 
> details if needed.
>  
> Thanks,
>  
> Best regards,
>  
> Stephane
>  
>                                                             
>  
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
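
The "Too many open files" text in the error stack above is the standard
message for errno EMFILE, which a process receives when it reaches its
open-descriptor limit (RLIMIT_NOFILE). A soft limit of 1024 is a common Linux
default; whether that value applies on this XE6 is an assumption, but 1010
openings plus a handful of descriptors already in use would be consistent
with it. The sketch below is not HDF5 or MPI code. It only models, in plain
Python, what a descriptor that is never released at close time would look
like: the count of successful open cycles is fixed, and it is roughly halved
when two files are opened per cycle. The function name opens_until_emfile
and the lowered limit of 64 are illustrative choices, not anything from the
original report.

```python
import errno
import os
import resource
import tempfile


def opens_until_emfile(paths, soft_limit=64):
    """Open every path once per cycle, deliberately never closing, and
    return how many complete cycles succeed before EMFILE is raised.

    soft_limit is assumed to be no larger than the current hard limit.
    """
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Lower only the soft limit so the demonstration finishes quickly.
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old_hard))
    leaked = []
    cycles = 0
    try:
        while True:
            for path in paths:
                # Simulated leak: the descriptor is kept, not closed.
                leaked.append(os.open(path, os.O_RDONLY))
            cycles += 1
    except OSError as exc:
        if exc.errno != errno.EMFILE:
            raise
        return cycles
    finally:
        # Clean up the leaked descriptors and restore the original limit.
        for fd in leaked:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        file_a = os.path.join(tmp, "a.dat")
        file_b = os.path.join(tmp, "b.dat")
        for name in (file_a, file_b):
            open(name, "w").close()
        print("one file per cycle :", opens_until_emfile([file_a]))
        print("two files per cycle:", opens_until_emfile([file_a, file_b]))
```

If the real limit on the compute nodes is in question, the current values
can be read with resource.getrlimit(resource.RLIMIT_NOFILE) in Python or
"ulimit -n" in the shell that launches the job.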
