Hi Andrew and Quincy,

Andrew, thanks for your input.

Where do you do these operations (read to a conversion buffer, etc.)? Do you do this as a separate call, e.g.

a) H5Dread fills the buffer, followed by n calls to H5Tconvert

or do you do it inside a custom conversion function

b) H5Dread calls custom conversion which calls H5Tconvert

BTW, I've got another workaround in progress. I'll post details once I've worked out all the kinks, but a hint is that I can convince HDF5 to convert from the "file-type" vlen we've encountered to a "memory-type" vlen (i.e. hvl_t) by calling H5Tconvert inside my custom conversion function.

Quincy,

Something to think about in planning the next revision. As will become apparent once I post my workaround, the reason it works is that I'm able to adjust my vlen memory allocation routine after I know what the destination type is, but before the vlen is read into memory. Effectively, I'm adjusting how my vlen memory allocator behaves based on the datatype of the vlen element. This would be much simpler if that information were simply passed to the allocation routine in the first place, e.g. if src_id and dst_id were passed to the allocation routine. (Alternatively, and perhaps more efficiently, one could register different allocators depending on the explicit conversion taking place.)
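The suggested change could be expressed as a callback signature like the one below. This is purely hypothetical: HDF5's real H5MM_allocate_t (as registered via H5Pset_vlen_mem_manager) receives only the size and the user's alloc_info pointer, and the extended type here does not exist in the library:

```c
#include <hdf5.h>

/* Today's vlen allocator callback (actual HDF5 signature):
 *   typedef void *(*H5MM_allocate_t)(size_t size, void *alloc_info);
 *
 * Hypothetical extension carrying the conversion endpoints, so the
 * allocator can see which source/destination types are in play: */
typedef void *(*H5MM_allocate_conv_t)(size_t size, hid_t src_id,
                                      hid_t dst_id, void *alloc_info);
```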

Jason


On 9/27/2012 6:16 PM, Andrew Collette wrote:
Hi Jason & Quincey,

> The part that is currently problematic is that the buffer passed to the
> conversion routine does not have a vector of hvl_ts but rather a vector of some
> sort of internal hdf5 types.
I'm the original person who asked about this (I'm the main author of
h5py).  We have to convert from HDF5 vlen strings to an opaque object
(a Python string).  Since h5py has to deal with this behavior in
released versions of HDF5 we implemented a workaround, which I briefly
described in that thread:

1. Read from the dataset selection into a contiguous conversion buffer
with exactly the same type as the dataset
2. Call H5Tconvert to go from the dataset type to your destination
type.  The correct data is supplied to the custom converter when you
do this (for some reason)
3. Scatter the converted points from the buffer to your memory destination.

This is kind of annoying because the gather/scatter process is
time-intensive, and getting everything correct w.r.t. backing buffers,
etc. is a real headache.  You can see our implementation here (in
Cython):

https://code.google.com/p/h5py/source/browse/h5py/_proxy.pyx  (starts
at line 102)

I agree it would be great if this were fixed (in HDF5 1.10?).  Along
with identifier recycling, this is one of the biggest sources of pain
in the h5py codebase.

Parenthetically, are you aware of h5labview
(http://sourceforge.net/p/h5labview)?  Maybe you and Martijn could
join forces.

Andrew

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
