Hi John,
On Mar 22, 2011, at 3:31 AM, Biddiscombe, John A. wrote:
> b) promote these blocks from datasets to chunks, so that the hdf library
> was responsible for the virtual addressing and did all the real work at
> retrieval time.
>
> It seems like hdf already does everything we want if we had b) in place. Once
> the chunks are on disk and indexed correctly, a user selecting a slab will
> trigger retrieval of the chunks and, as long as the decompression filter is
> available, hdf will handle the decompression too. There’d be no need for a
> virtual dataset to map access to the sub-datasets underneath.
>
> Hmm, so you'd have some new "bind" operation that took as input a bunch
> of datasets and bound them together as a new dataset?
>
> Essentially yes, I had something quite intrusive in mind. What I was thinking
> was that each process independently creates a dataset and compresses it (it
> could be just a memory buffer rather than an hdf5 dataset). Collectively, a
> new dataset is created which has the correct extents for the whole data.
> Chunks are ‘requested’ by each process and, instead of allowing hdf to manage
> the chunks and allocate them, we intercept the chunk generation/allocation
> (override it) and simply supply our own, using our compressed data buffer.
> hdf then does all the bookkeeping as usual and writes/flushes the data to
> disk. Provided the chunk extents are regular, the compressed data could vary
> in final size from chunk to chunk (some tidying up might be necessary in the
> chunk code).
>
> On load, the user can treat the data as a completely normal dataset, but
> compressed.
>
> I suspect this is what you meant too, but I thought I’d spell it out more
> clearly just in case. I will start poking around with the chunking code to
> see if I can intercept things at convenient places. Please stop me if you
> think I’m pursuing a bad idea.
I think it's an interesting idea - poke around and send me any
questions you have (off list, if you'd like).
Quincey
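
For anyone following the thread, the proposal quoted above can be sketched as a toy model in plain Python. This is only an illustration of the idea, not HDF5 code: `ChunkedStore`, `write_chunk`, `read`, and `compress_block` are hypothetical names, zlib stands in for whatever compression filter would be used, and a single byte buffer stands in for the file. Each "process" compresses its own block, the store only records where each pre-compressed chunk landed, and a slab read looks up the chunks it needs and decompresses them.

```python
import struct
import zlib

CHUNK = 4  # elements per chunk; the proposal assumes regular chunk extents


def compress_block(values):
    """What each process would do on its own: pack and compress its block."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return zlib.compress(raw)


class ChunkedStore:
    """Toy stand-in for a chunked dataset: a byte buffer plus a chunk index."""

    def __init__(self, n_elements):
        self.n_elements = n_elements
        self.data = bytearray()
        self.index = {}  # chunk number -> (byte offset, compressed size)

    def write_chunk(self, chunk_no, compressed):
        """Accept an already-compressed chunk and record where it landed."""
        self.index[chunk_no] = (len(self.data), len(compressed))
        self.data += compressed

    def read(self, start, stop):
        """Read elements [start, stop) as if the data were uncompressed."""
        out = []
        for chunk_no in range(start // CHUNK, (stop - 1) // CHUNK + 1):
            offset, size = self.index[chunk_no]
            raw = zlib.decompress(bytes(self.data[offset:offset + size]))
            values = struct.unpack(f"<{len(raw) // 8}d", raw)
            base = chunk_no * CHUNK
            lo = max(start, base) - base
            hi = min(stop, base + CHUNK) - base
            out.extend(values[lo:hi])
        return out


# Three "processes" each own one chunk of a 12-element dataset.
blocks = {0: [0.0, 1.0, 2.0, 3.0], 1: [4.0] * 4, 2: [8.0, 9.0, 10.0, 11.0]}
store = ChunkedStore(12)
for chunk_no in (2, 0, 1):  # arrival order does not matter
    store.write_chunk(chunk_no, compress_block(blocks[chunk_no]))

slab = store.read(2, 10)  # spans all three chunks
```

The index stores a size per chunk, so compressed chunks are free to differ in length while the logical chunk extent stays fixed. That is the bookkeeping the quoted message hopes the library could take over.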