Hi John,

On Mar 22, 2011, at 3:31 AM, Biddiscombe, John A. wrote:

> b) promote these blocks from datasets to chunks, so that the hdf library 
> was responsible for the virtual addressing and did all the real work at 
> retrieval time.
>  
> It seems like hdf already does everything we want if we had b) in place. Once 
> the chunks are on disk and indexed correctly, a user selecting a slab will 
> trigger retrieval of the chunks and, as long as the decompression filter is 
> available, hdf will handle the decompression too. There’d be no need for a 
> virtual dataset to map access to the sub-datasets underneath.
>  
>       Hmm, so you'd have some new "bind" operation that took as input a bunch 
> of datasets and bound them together as a new dataset?
>  
> Essentially yes, I had something quite intrusive in mind. What I was thinking 
> was that each process independently creates a dataset and compresses it (it 
> could be just a memory buffer rather than an hdf5 dataset). Collectively a 
> new dataset is created which has the correct extents for the whole data, 
> chunks are ‘requested’ by each process and instead of allowing hdf to manage 
> the chunks and allocate them, we intercept the chunk generation/allocation 
> (override it) and simply supply our own, using our compressed data buffer. 
> hdf then does all the bookkeeping as usual and writes/flushes the data to 
> disk. Provided the chunk extents are regular, the compressed data could vary 
> in final size from chunk to chunk (some tidying up might be necessary in the 
> chunk code).
>  
> On load, the user can treat the data as a completely normal dataset, but 
> compressed.
>  
> I suspect this is what you meant too, but I thought I’d spell it out more 
> clearly just in case. I will start poking around with the chunking code to 
> see if I can intercept things at convenient places. Please stop me if you 
> think I’m pursuing a bad idea.

        I think it's an interesting idea - poke around and send me any 
questions you have (off list, if you'd like).

        Quincey
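
For illustration only, the bookkeeping John describes in the quoted text can be sketched in a few lines of plain Python. This is not HDF5 code and does not use the HDF5 chunk internals: zlib stands in for the compression filter, an in-memory byte buffer stands in for the file, the data is 1-D, and the names CHUNK_LEN, compress_block, bind_chunks and read_slab are all hypothetical. It only shows the idea that regular chunk extents plus an (offset, size) index allow independently compressed chunks of varying size to be read back as one ordinary dataset.

```python
import zlib

CHUNK_LEN = 4  # elements per chunk; chunk extents are regular (assumption)


def compress_block(values):
    """What each process would do on its own: compress its block of bytes."""
    return zlib.compress(bytes(values))


def bind_chunks(compressed_blocks):
    """Append pre-compressed chunks to one buffer and record where each lives.

    Compressed sizes may differ from chunk to chunk; the index absorbs that.
    """
    storage = bytearray()
    index = []  # index[i] = (offset, size) of chunk i inside storage
    for blob in compressed_blocks:
        index.append((len(storage), len(blob)))
        storage += blob
    return bytes(storage), index


def read_slab(storage, index, start, stop):
    """Return elements [start, stop), decompressing only the chunks touched."""
    out = bytearray()
    first = start // CHUNK_LEN
    last = (stop - 1) // CHUNK_LEN
    for i in range(first, last + 1):
        offset, size = index[i]
        chunk = zlib.decompress(storage[offset:offset + size])
        base = i * CHUNK_LEN
        lo = max(start, base) - base          # slab start within this chunk
        hi = min(stop, base + CHUNK_LEN) - base  # slab end within this chunk
        out += chunk[lo:hi]
    return list(out)


# Three "processes" each compress one block; the blocks are then bound together.
data = list(range(12))
blocks = [compress_block(data[i:i + CHUNK_LEN]) for i in range(0, 12, CHUNK_LEN)]
storage, index = bind_chunks(blocks)
slab = read_slab(storage, index, 2, 10)  # spans all three chunks
```

The reader never sees the per-process blocks, only a slab selection on the whole extent, which is the behaviour John wants from the library on load.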
