I have an application which has been processing hundreds of GB of raw data and 
generating HDF5 files daily for many years now. I recently attempted to upgrade 
it from hdf5-1.8.3 to hdf5-1.8.13, and have been encountering errors with 
a traceback like the following:

HDF5-DIAG: Error detected in HDF5 (1.8.13) thread 1126189376:
  #000: H5D.c line 369 in H5Dopen2(): can't open dataset
    major: Dataset
    minor: Unable to initialize object
  #001: H5Dint.c line 1147 in H5D_open(): not found
    major: Dataset
    minor: Object not found
  #002: H5Dint.c line 1247 in H5D__open_oid(): unable to register type
    major: Dataset
    minor: Unable to register new atom
  #003: H5I.c line 895 in H5I_register(): can't insert ID node into skip list
    major: Object atom
    minor: Unable to insert object
  #004: H5SL.c line 995 in H5SL_insert(): can't create new skip list node
    major: Skip Lists
    minor: Unable to insert object
  #005: H5SL.c line 687 in H5SL_insert_common(): can't insert duplicate key
    major: Skip Lists
    minor: Unable to insert object

Unfortunately, the error is somewhat non-deterministic: it happens in every run 
of the application, but not always on the same dataset of the same file. The 
one thing that is repeatable is that it only occurs several hours into the run 
(after some 50+ GB of data have been written into some thousands of HDF5 
files), making it rather difficult to isolate into a simple test case!

My application is multithreaded, and the library is built with threading 
enabled (see full libhdf5.settings below). The application itself is written in 
C++, but uses only the HDF5 C library.

The general application structure is that data is written into several thousand 
newly-created HDF5 files. Within each file there are hundreds to thousands of 
groups (one level deep), each of which has about 10 datasets. Every group and 
every dataset has anywhere from 1 to 10 attributes (almost all are scalar 
variable-length strings), and each dataset contains a one-dimensional array of 
integer data (chunked and compressed with SZIP). Each file is created in a 
single thread (synchronized at the application level), but multiple threads 
are then used to create each group, each dataset within each group, the 
attributes on the groups and datasets, and the data for each dataset, relying 
on the libhdf5 global mutex for synchronization. However, no more than one 
thread will ever attempt to access or modify any single object within the HDF5 
file, except for the file object itself.

The above skip list error has only ever occurred while opening a dataset prior 
to creating an attribute on that dataset. Thus, I am quite confident that only 
a single thread could ever be opening any particular dataset, and in any case, 
the global mutex in libhdf5 should make H5Dopen2() entirely thread-safe anyway.

Note that I never close an HDF5 file until it is complete, and once closed, the 
file is never re-opened for additional updates.

I have had to revert to hdf5-1.8.3 for the time being, but any guidance or 
assistance in resolving this issue with hdf5-1.8.13 would be appreciated. 
Searching the forum, I found only one other report of a problem which appears 
to be either identical or related: 
http://hdf-forum.184993.n3.nabble.com/H5SL-insert-common-can-t-insert-duplicate-key-td4026817.html
There does not appear to have been any resolution to that issue, which 
occurred with hdf5-1.8.11, so apparently it has been around for a while.

Sincerely,

Stephen Pope


            SUMMARY OF THE HDF5 CONFIGURATION
            =================================

General Information:
-------------------
                   HDF5 Version: 1.8.13
                  Configured on: Wed Jun 11 16:25:23 MDT 2014
                   Configured by: scp@nxscp at Prediction Company, Santa Fe, NM, USA
                 Configure mode: production
                    Host system: x86_64-unknown-linux-gnu
              Uname information: Linux nxscp 2.6.18-348.el5 #1 SMP Tue Jan 8 17:53:53 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
                       Byte sex: little-endian
                      Libraries: shared
             Installation point: /apps/prediction/thirdparty/hdf5-1.8.13/build.opt.x86_64.rhel5.gcc4

Compiling Options:
------------------
               Compilation Mode: production
                     C Compiler: /usr/local/pkg/gcc-4.8.1/bin/gcc ( gcc (GCC) 4.8.1)
                         CFLAGS: -march=core2 -mtune=corei7-avx -pthread
                      H5_CFLAGS: -std=c99 -pedantic -Wall -Wextra -Wundef -Wshadow -Wpointer-arith -Wbad-function-cast -Wcast-qual -Wcast-align -Wwrite-strings -Wconversion -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wredundant-decls -Wnested-externs -Winline -Wfloat-equal -Wmissing-format-attribute -Wmissing-noreturn -Wpacked -Wdisabled-optimization -Wformat=2 -Wunreachable-code -Wendif-labels -Wdeclaration-after-statement -Wold-style-definition -Winvalid-pch -Wvariadic-macros -Winit-self -Wmissing-include-dirs -Wswitch-default -Wswitch-enum -Wunused-macros -Wunsafe-loop-optimizations -Wc++-compat -Wstrict-overflow -Wlogical-op -Wlarger-than=2048 -Wvla -Wsync-nand -Wframe-larger-than=16384 -Wpacked-bitfield-compat -Wstrict-overflow=5 -Wjump-misses-init -Wunsuffixed-float-constants -Wdouble-promotion -Wsuggest-attribute=const -Wtrampolines -Wstack-usage=8192 -Wvector-operation-performance -Wsuggest-attribute=pure -Wsuggest-attribute=noreturn -Wsuggest-attribute=format -O3 -fomit-frame-pointer -finline-functions
                      AM_CFLAGS:
                       CPPFLAGS:
                    H5_CPPFLAGS: -D_POSIX_C_SOURCE=199506L   -DNDEBUG -UH5_DEBUG_API
                    AM_CPPFLAGS: -I/home/scp/svn/szip/build.opt.x86_64.rhel5.gcc4/include -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_BSD_SOURCE
               Shared C Library: yes
               Static C Library: no
  Statically Linked Executables: no
                        LDFLAGS:
                     H5_LDFLAGS:
                     AM_LDFLAGS:  -L/home/scp/svn/szip/build.opt.x86_64.rhel5.gcc4/lib
                Extra libraries:  -lpthread -lsz -lz -lrt -ldl -lm
                       Archiver: ar
                         Ranlib: ranlib
              Debugged Packages:
                    API Tracing: no

Languages:
----------
                        Fortran: no

                            C++: no

Features:
---------
                  Parallel HDF5: no
             High Level library: yes
                   Threadsafety: yes
            Default API Mapping: v18
 With Deprecated Public Symbols: yes
         I/O filters (external): deflate(zlib),szip(encoder)
         I/O filters (internal): shuffle,fletcher32,nbit,scaleoffset
                            MPE: no
                     Direct VFD: no
                        dmalloc: no
Clear file buffers before write: yes
           Using memory checker: no
         Function Stack Tracing: no
      Strict File Format Checks: no
   Optimization Instrumentation: no
       Large File Support (LFS): yes