Hi John,
On Mar 7, 2011, at 8:47 AM, Biddiscombe, John A. wrote:
> Quincey, Mark
>
>
>> Mark and I have kicked around the idea of creating a "virtual"
>> dataset, which is composed of other datasets in the file, stitched together
>> and presented as a single dataset to the application. That way,
>> applications could access the underlying piece (either directly, by reading
>> from the underlying dataset; or through a selection of the virtual dataset)
>> or, access the virtual dataset as if it was a single large dataset. This
>> would be a looser form of chunking, in an abstract sense.
>
> Suppose I modify my H5MB utility to create one dataset per process – and
> compress them individually, then write them out – but what I’d really like to
> do is
> a) ensure all blocks are the same size, do some padding if necessary
> b) promote these blocks from datasets to chunks, so that the hdf library
> was responsible for the virtual addressing and did all the real work at
> retrieval time.
>
> It seems like HDF5 already does everything we want if we had b) in place. Once
> the chunks are on disk and indexed correctly, a user selecting a slab will
> trigger retrieval of the chunks and, as long as the decompression filter is
> available, the library will handle decompression too. There’d be no need for
> a virtual dataset to map access to the sub-datasets underneath.
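
To make steps a) and b) above concrete, here is a minimal one-dimensional sketch in plain Python. It is not HDF5 code: `pad_block` and `chunks_for_slab` are hypothetical names introduced only for illustration, and the fill value of 0 is an assumption. It pads a block to a fixed chunk length and computes which chunk indices a slab selection would touch.

```python
def pad_block(block, chunk_len, fill=0):
    """Pad a 1-D block to exactly chunk_len elements (hypothetical helper)."""
    if len(block) > chunk_len:
        raise ValueError("block is larger than the chunk size")
    return list(block) + [fill] * (chunk_len - len(block))


def chunks_for_slab(start, count, chunk_len):
    """Return the indices of the chunks that the slab [start, start + count) touches."""
    if count <= 0:
        return []
    first = start // chunk_len                # chunk holding the first element
    last = (start + count - 1) // chunk_len   # chunk holding the last element
    return list(range(first, last + 1))


if __name__ == "__main__":
    print(pad_block([1, 2, 3], 4))
    print(chunks_for_slab(3, 6, 4))
```

Note that padding changes the logical addressing: once blocks are padded to a common size, the fill elements become part of the dataset's index space, so readers need some way to know which elements are padding.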
Hmm, so you'd have some new "bind" operation that took as input a bunch
of datasets and bound them together as a new dataset?
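
As a rough illustration of what such a "bind" might mean, here is a toy sketch in plain Python. It is not an HDF5 API: `bind` and `BoundDataset` are hypothetical names, and the pieces are ordinary 1-D lists standing in for datasets. The bound object records the starting offset of each piece and serves reads that span piece boundaries.

```python
import bisect


class BoundDataset:
    """Toy 1-D 'bound' dataset: several pieces presented as one (hypothetical)."""

    def __init__(self, pieces):
        self.pieces = [list(p) for p in pieces]
        self.offsets = []          # global start index of each piece
        total = 0
        for p in self.pieces:
            self.offsets.append(total)
            total += len(p)
        self.length = total

    def __len__(self):
        return self.length

    def read(self, start, count):
        """Read `count` elements beginning at global index `start`."""
        if start < 0 or count < 0 or start + count > self.length:
            raise IndexError("selection is outside the bound dataset")
        end = start + count
        out = []
        # Locate the last piece whose offset is <= start.
        i = bisect.bisect_right(self.offsets, start) - 1
        pos = start
        while pos < end:
            piece = self.pieces[i]
            lo = pos - self.offsets[i]
            hi = min(len(piece), end - self.offsets[i])
            out.extend(piece[lo:hi])
            pos = self.offsets[i] + hi
            i += 1
        return out


def bind(*pieces):
    """Hypothetical 'bind' operation: stitch the pieces into one dataset."""
    return BoundDataset(pieces)


if __name__ == "__main__":
    ds = bind([1, 2, 3], [4, 5], [6])
    print(len(ds), ds.read(2, 3))
```

The underlying pieces stay intact here, so an application could still read one of them directly or go through the bound view.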
> As you (Quincey) know, I already have some practice messing about with the
> HDF5 internals. If I wanted to do b), is it feasible that instead of each
> process writing a dataset, I could get hold of the metadata directly and
> manipulate it to write the pieces as chunks?
>
> I can spend some time on this if I can get decent compression working on
> parallel IO
I think it could be done, but it's going to be a fairly intensive bit
of coding...
Quincey
> Regards
>
> JB
> PS. Mark, I followed links to your silo/hdf5 wiki stuff. Interesting. It
> looks like we’re both looking at very similar problems. I will have to play
> with your PMPIO stuff too. Are you also looking at other libraries, such as
> ADIOS? (Off topic, so you can reply off list if you don’t feel it is
> appropriate for here.)
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org