Hi Andrew and Quincy,
Andrew, thanks for your input.
Where do you perform these operations (reading into a conversion buffer,
etc.)? Do you do them as a separate call, e.g.
a) H5Dread returns the buffer, followed by n calls to H5Tconvert,
or inside a custom conversion function, e.g.
b) H5Dread invokes the custom conversion function, which in turn calls H5Tconvert?
BTW, I've got another workaround in progress. I'll post details once
I've worked out all the kinks, but a hint is that I can convince
HDF5 to convert from the "file-type" vlen that we've encountered to a
"memory-type" vlen (i.e. hvl_t) by calling H5Tconvert inside my
custom conversion function.
Quincy,
Something to think about in planning for the next revision. As will
become apparent once I post my workaround, the reason it works is that
I'm able to adjust my vlen memory allocation routine after I know what
the destination type is, but before the vlen is read into memory.
Effectively, I'm adjusting the way my vlen memory allocator behaves
based on the datatype of the vlen element. This would be much simpler
if that information were simply passed to the allocation routine in the
first place, e.g. if src_id and dst_id were passed to the allocation
routine. (Or, alternatively, and perhaps more efficiently, if one could
register different allocators depending on the explicit conversion
taking place.)
Jason
On 9/27/2012 6:16 PM, Andrew Collette wrote:
Hi Jason & Quincey,
> The part that is currently problematic is that the buffer passed to the
> conversion routine does not have a vector of hvl_ts but rather a vector
> of some sort of internal hdf5 types.
I'm the original person who asked about this (I'm the main author of
h5py). We have to convert from HDF5 vlen strings to an opaque object
(a Python string). Since h5py has to deal with this behavior in
released versions of HDF5 we implemented a workaround, which I briefly
described in that thread:
1. Read from the dataset selection into a contiguous conversion buffer
with exactly the same type as the dataset
2. Call H5Tconvert to go from the dataset type to your destination
type. The correct data is supplied to the custom converter when you
do this (for some reason)
3. Scatter the converted points from the buffer to your memory destination.
This is kind of annoying because the gather/scatter process is
time-intensive, and getting everything correct w.r.t. backing buffers,
etc. is a real headache. You can see our implementation here (in
Cython):
https://code.google.com/p/h5py/source/browse/h5py/_proxy.pyx (starts
at line 102)
I agree it would be great if this were fixed (in HDF5 1.10?). Along
with identifier recycling, this is one of the biggest sources of pain
in the h5py codebase.
Parenthetically, are you aware of h5labview
(http://sourceforge.net/p/h5labview)? Maybe you and Martijn could
join forces.
Andrew
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org