Hi Tom,
On Apr 27, 2012, at 5:46 AM, tom fogal wrote:
> Hi Quincey,
>
> Thanks for your reply.
>
> This helped considerably. I can dump one of my files in 16.5 minutes now,
> down from the 4+ hours it took before. However, this is still the slowest
> part of my pipeline. Another order of magnitude improvement would be
> welcome, of course ;), but I'd be really happy if we could just halve my
> current runtime for it. Any other ideas?
Not sure offhand. Can you run it under gprof to see where the time goes?
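A rough sketch of what a gprof run could look like, assuming you build HDF5 from source. The function name, the output file names, the compiler flags, and the tools/h5dump path inside the source tree are assumptions for illustration, not something taken from your setup; the h5dump arguments are the ones from your original command.

```shell
# Sketch only: the flags and the tools/h5dump location are assumptions.
# Usage (hypothetical paths): profile_h5dump /path/to/hdf5-src /abs/path/infile.h5
profile_h5dump() {
    src_dir=$1   # unpacked HDF5 source tree
    infile=$2    # HDF5 file to dump; use an absolute path, since we cd below
    (
        cd "$src_dir" || exit 1
        # -pg instruments the code for gprof. A static build keeps the
        # library's own functions visible in the profile.
        ./configure --disable-shared CFLAGS="-pg -g -O2" LDFLAGS="-pg" &&
        make &&
        # Running the instrumented binary writes gmon.out to the current directory.
        ./tools/h5dump/h5dump -b LE -d /C00 -o outfile.raw "$infile" &&
        # Combine the binary and gmon.out into a readable report.
        gprof ./tools/h5dump/h5dump gmon.out > h5dump-profile.txt
    )
}
```

The flat profile at the top of h5dump-profile.txt should show which functions dominate, which is a more reliable picture than sampling stack traces by hand in gdb.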
> Secondly, my previous HDF5 was simply installed as part of Ubuntu. I imagine
> pre-installed HDF5 versions are common for many users. Could I request that
> "no thread safety" be made a runtime option, which the command line tools
> could set implicitly? As shown, disabling it provides a huge performance
> benefit, and a runtime switch would be significantly easier to use, because
> users wouldn't need to compile their own HDF5.
Hmm, that could be done, yes. I'll file an issue for it.
Quincey
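For anyone wondering whether a pre-installed HDF5 is a thread-safe build, here is a minimal sketch that looks for the H5TS_mutex_lock symbol from the stack traces below. It assumes that symbol is only present in thread-safe builds and that the library's dynamic symbols are readable with nm; the function names are made up for this example.

```shell
# Reads a symbol listing (e.g. from `nm -D`) on stdin and succeeds if it
# contains the H5TS_mutex_lock symbol seen in the stack traces.
has_threadsafe_symbols() {
    grep -q 'H5TS_mutex_lock'
}

# Reports on one shared library. If nm fails (missing file, stripped
# symbols), this falls through to the "not found" branch.
check_hdf5_lib() {
    lib=$1
    if nm -D "$lib" | has_threadsafe_symbols; then
        echo "thread-safe build"
    else
        echo "no thread-safety symbols found"
    fi
}
```

With the library path from the backtrace, that would be called as `check_hdf5_lib /usr/lib/libhdf5.so.6`.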
> Thanks,
>
> -tom
>
> On 04/25/2012 05:53 PM, Quincey Koziol wrote:
>> Hi Tom,
>> Looks like you are working with a thread-safe build of HDF5, which is
>> unnecessary for the command-line tools. You could rebuild the HDF5
>> distribution (I would suggest moving up to 1.8.8 or the 1.8.9 prerelease)
>> without the thread-safe configure flag, and that should get rid of the mutex
>> issues.
>>
>> Quincey
>>
>> On Apr 25, 2012, at 9:28 AM, tom fogal wrote:
>>
>>> I am getting really awful performance using 'h5dump' to dump a scalar field
>>> as a binary file. It takes literally hours, whereas an h5copy -f ref takes
>>> just under 2 minutes, and a simple 'cp' is a little bit quicker than that.
>>>
>>> My file header is reproduced below [1]. I am using
>>>
>>> h5dump -b LE -d /C00 -o outfile.raw infile.h5
>>>
>>> to convert. For comparison, 'h5copy' is run thusly:
>>>
>>> h5copy -s C00 -d C00 -i infile.h5 -o ./testing.h5 -v -f ref
>>>
>>> While running, h5dump pegs a core at 98+% CPU usage.
>>>
>>> I've tried attaching gdb to the process while it's running, so that I can
>>> obtain some poor-man's profiling. One popular stacktrace is appended below
>>> [2]. I guess it's locking and unlocking a mutex constantly? Other traces
>>> I have seen multiple times: H5I_object_verify called from H5Tequal;
>>> __pthread_setcancelstate from H5TS_cancel_count_inc from H5Tequal;
>>> __pthread_mutex_lock from H5TS_mutex_lock from H5open; H5T_cmp from
>>> H5Tequal (rarely).
>>>
>>> If locking is indeed the problem, can I disable it at runtime somehow?
>>> These files are only being accessed by one process at a time, h5dump isn't
>>> even multithreaded anyway, and furthermore the access is purely read-only.
>>>
>>> I am using HDF5 1.8.4. Please enlighten me as to how I can get reasonable
>>> performance out of these files.
>>>
>>> Thanks,
>>>
>>> -tom
>>>
>>> [1]
>>> $ h5dump -p -H TS_2011_12_26/TS_C00_0_16.h5
>>> HDF5 "TS_C00_0_16.h5" {
>>> GROUP "/" {
>>> DATASET "C00" {
>>> DATATYPE H5T_STD_U16LE
>>> DATASPACE SIMPLE { ( 301, 2550, 2550 ) / ( 301, 2550, 2550 ) }
>>> STORAGE_LAYOUT {
>>> CONTIGUOUS
>>> SIZE 3914505000
>>> OFFSET 1400
>>> }
>>> FILTERS {
>>> NONE
>>> }
>>> FILLVALUE {
>>> FILL_TIME H5D_FILL_TIME_IFSET
>>> VALUE 0
>>> }
>>> ALLOCATION_TIME {
>>> H5D_ALLOC_TIME_LATE
>>> }
>>> }
>>> }
>>> }
>>>
>>> [2]
>>> (gdb) bt
>>> #0 __pthread_mutex_lock (mutex=0x7ff2d1e7fac8) at pthread_mutex_lock.c:47
>>> #1 0x00007ff2d1bf67d6 in H5TS_mutex_unlock () from /usr/lib/libhdf5.so.6
>>> #2 0x00007ff2d19105b8 in H5open () from /usr/lib/libhdf5.so.6
>>> #3 0x0000000000420501 in ?? ()
>>> #4 0x000000000041fbf6 in ?? ()
>>> #5 0x0000000000416e7f in ?? ()
>>> #6 0x000000000041c856 in ?? ()
>>> #7 0x000000000041cea9 in ?? ()
>>> #8 0x000000000040abaf in ?? ()
>>> #9 0x000000000040a20d in ?? ()
>>> #10 0x000000000040d3c6 in ?? ()
>>> #11 0x000000000040f387 in ?? ()
>>> #12 0x00007ff2d156530d in __libc_start_main (main=0x40eae4, argc=8,
>>> ubp_av=0x7fffa529e648, init=<optimized out>, fini=<optimized out>,
>>> rtld_fini=<optimized out>, stack_end=0x7fffa529e638) at libc-start.c:226
>>> #13 0x0000000000405349 in ?? ()
>>>
>>> _______________________________________________
>>> Hdf-forum is for HDF software users discussion.
>>> [email protected]
>>> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org