Hi Mike,

Did you build the thread-safe version of the library?  (i.e., configure with 
--enable-threadsafe)

Dana

From: Hdf-forum [mailto:[email protected]] On Behalf Of 
Cook, Michael J (GE Healthcare)
Sent: Friday, January 03, 2014 5:03 PM
To: [email protected]
Subject: [Hdf-forum] H5SL_insert_common(): can't insert duplicate key

My HDF5 1.8.11 application runs on a multi-core Linux machine (dual Intel(R) 
Xeon(R) CPU E5-2620 0 @ 2.00GHz). It is a multi-threaded process in which two 
different threads can at times read the same HDF5 file for different reasons. 
Both threads read partial datasets. Each thread does its own H5Fopen on the 
file, read-only and with DIRECT_IO, and has its own returned file identifier. 
Both threads call H5Dread( dset, dtype, H5S_ALL, H5S_ALL, H5P_DEFAULT, 
ptrForData ) with a typical data request size of ~32MB.


Problem: Occasionally the two threads get in sync and perform H5Dread() on 
exactly the same partial dataset. When this happens, the application errors out 
with the following HDF5 error stack:

HDF5-DIAG: Error detected in HDF5 (1.8.11) thread 0:
  #000: H5Dio.c line 182 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: H5Dio.c line 539 in H5D__read(): can't initialize I/O info
    major: Dataset
    minor: Unable to initialize object
  #002: H5Dchunk.c line 827 in H5D__chunk_io_init(): unable to create file chunk selections
    major: Dataset
    minor: Unable to initialize object
  #003: H5Dchunk.c line 1301 in H5D__create_chunk_file_map_hyper(): can't insert chunk into skip list
    major: Dataspace
    minor: Unable to insert object
  #004: H5SL.c line 989 in H5SL_insert(): can't create new skip list node
    major: Skip Lists
    minor: Unable to insert object
  #005: H5SL.c line 669 in H5SL_insert_common(): can't insert duplicate key
    major: Skip Lists
    minor: Unable to insert object

Note that we have implemented a mutex around the HDF5 library calls to 
guarantee that no two threads are inside the library simultaneously. 
Furthermore, our HDF5 wrapper class performs each set of related library calls 
as an atomic sequence. For example, to read a partial dataset, the entire 
sequence of calls to H5Dopen2, H5Dget_type, H5Dget_space, 
H5Sget_simple_extent_ndims, H5Sget_simple_extent_dims, H5Sselect_hyperslab, 
H5Screate_simple, H5Dread, ... is performed under a single mutex acquisition.
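
For readers following the thread, the "one mutex acquisition per sequence" 
pattern described above might look roughly like the sketch below. It is only 
an illustration under stated assumptions: run_locked, sequence_fn, 
demo_sequence and demo_thread are hypothetical names, not part of HDF5 or of 
the poster's wrapper class, and the demo callback stands in for the real 
chain of H5 calls.

```c
#include <pthread.h>
#include <stddef.h>

/* One process-wide lock guarding every call into the library (assumption:
 * all library access in the application goes through run_locked). */
static pthread_mutex_t lib_mutex = PTHREAD_MUTEX_INITIALIZER;

/* A "sequence" is any function that performs several related library
 * calls; it returns 0 on success and a negative value on failure. */
typedef int (*sequence_fn)(void *ctx);

/* Run the whole sequence under a single mutex acquisition, so that no
 * other thread can interleave its own calls between the steps. */
int run_locked(sequence_fn seq, void *ctx)
{
    int rc;

    if (pthread_mutex_lock(&lib_mutex) != 0)
        return -1;
    rc = seq(ctx);
    pthread_mutex_unlock(&lib_mutex);
    return rc;
}

/* Stand-in for an open/select/read chain: a deliberately non-atomic
 * read-modify-write on a shared counter, repeated 1000 times. */
static int demo_sequence(void *ctx)
{
    long *counter = ctx;
    int i;

    for (i = 0; i < 1000; i++) {
        long tmp = *counter;
        *counter = tmp + 1;
    }
    return 0;
}

/* Thread entry point that runs the demo sequence under the lock. */
static void *demo_thread(void *ctx)
{
    run_locked(demo_sequence, ctx);
    return NULL;
}
```

Starting two threads on demo_thread with the same counter and joining both 
should leave the counter at 2000, because each thread's 1000 updates run as 
one uninterrupted unit. A lock like this only serializes the callers; it does 
not by itself make any state kept inside the library per-thread.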

Our investigation suggests the error only occurs if the two threads happen to 
issue exactly the same H5Dread request, and those requests are serialized by 
our library mutex. The second identical H5Dread errors out as shown in the 
error stack above. The failure is very difficult to reproduce, which suggests 
a small timing window.

Can anyone explain the error and/or suggest ways to prevent it?

What is an H5 'skip list'?

The H5SL.c H5SL_insert() header comment suggests:
"COMMENTS, BUGS, ASSUMPTIONS
    Inserting an item with the same key as an existing object fails."
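
As general background on the question above: a skip list is a sorted linked 
list with extra forward pointers that speed up searching. The rule quoted 
from the header comment (a second insert with an existing key fails) can be 
illustrated with the simplified sketch below. It keeps only the bottom, fully 
linked level of such a list, and keyed_list_t, keyed_list_insert and 
keyed_list_clear are hypothetical names. This is not HDF5's H5SL code and 
says nothing about why the key was already present in the failing read.

```c
#include <stdlib.h>

/* One entry in a sorted list, keyed by an integer (e.g. a chunk index).
 * A real skip list would carry several forward pointers per node. */
typedef struct node {
    unsigned long key;
    struct node *next;
} node_t;

typedef struct {
    node_t *head;
} keyed_list_t;

/* Insert key in ascending order.
 * Returns 0 on success, -1 if the key is already present or if
 * memory allocation fails. */
int keyed_list_insert(keyed_list_t *list, unsigned long key)
{
    node_t **link = &list->head;
    node_t *n;

    /* Walk to the first node whose key is not smaller than the new key. */
    while (*link != NULL && (*link)->key < key)
        link = &(*link)->next;

    /* Same key already stored: refuse the insert. */
    if (*link != NULL && (*link)->key == key)
        return -1;

    n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->key = key;
    n->next = *link;
    *link = n;
    return 0;
}

/* Free every node and leave the list empty. */
void keyed_list_clear(keyed_list_t *list)
{
    while (list->head != NULL) {
        node_t *n = list->head;
        list->head = n->next;
        free(n);
    }
}
```

With an empty list, inserting keys 7 and 3 succeeds and a second insert of 7 
is rejected. In a structure with this rule, a duplicate-key failure means the 
key was already in the list at the time of the insert.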

Is this a known bug?

Regards,
Mike C.





_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
