Quincey, Mark
Mark and I have kicked around the idea of creating a "virtual"
dataset, which is composed of other datasets in the file, stitched together and
presented as a single dataset to the application. That way, applications could
access the underlying pieces (either directly, by reading from the underlying
dataset, or through a selection of the virtual dataset) or access the virtual
dataset as if it were a single large dataset. This would be a looser form of
chunking, in an abstract sense.
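
To make the "stitched together" idea concrete, here is a minimal sketch in
Python, reduced to one dimension. VirtualDataset1D and its methods are
hypothetical names used only for illustration; they are not part of the HDF5
API, and plain lists stand in for the underlying datasets.

```python
from bisect import bisect_right


class VirtualDataset1D:
    """Hypothetical 1-D view that presents several source arrays as one."""

    def __init__(self, sources):
        self.sources = list(sources)
        # offsets[i] is the virtual index at which source i begins;
        # offsets[-1] is the total virtual length.
        self.offsets = [0]
        for src in self.sources:
            self.offsets.append(self.offsets[-1] + len(src))

    def __len__(self):
        return self.offsets[-1]

    def read(self, start, stop):
        """Return elements [start, stop) of the virtual dataset as a list."""
        if not 0 <= start <= stop <= len(self):
            raise IndexError("selection outside the virtual dataset")
        out = []
        # Find the source that contains the first requested element.
        i = bisect_right(self.offsets, start) - 1
        while start < stop:
            local = start - self.offsets[i]
            take = min(stop, self.offsets[i + 1]) - start
            out.extend(self.sources[i][local:local + take])
            start += take
            i += 1
        return out


vds = VirtualDataset1D([[0, 1, 2], [3, 4], [5, 6, 7, 8]])
selection = vds.read(2, 6)  # spans all three underlying pieces
```

Each underlying piece stays readable on its own, while a selection on the
virtual view is translated into reads on whichever pieces it overlaps.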
Suppose I modify my H5MB utility so that each process creates its own dataset,
compresses it individually, and then writes it out. What I'd really like to
do is:
a) ensure all blocks are the same size, padding where necessary;
b) promote these blocks from datasets to chunks, so that the HDF5 library
is responsible for the virtual addressing and does all the real work at
retrieval time.
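
As a rough illustration of step a) and of why b) needs an index, here is a
small Python sketch. It is not HDF5 code: pad_block and pack_blocks are
made-up names, zlib stands in for whatever compression filter is used, and an
in-memory byte string stands in for the file. The point it shows is that the
padded blocks are the same size before compression but not after, so each one
needs an (offset, size) entry to be found again.

```python
import zlib


def pad_block(block, block_size, fill=b"\x00"):
    """Pad a block of bytes up to block_size with a fill byte."""
    if len(block) > block_size:
        raise ValueError("block is larger than block_size")
    return block + fill * (block_size - len(block))


def pack_blocks(blocks, block_size):
    """Pad and compress each block, then append it to one byte image.

    Returns (image, index), where index[i] = (offset, compressed_size)
    locates the compressed form of block i inside the image.
    """
    image = bytearray()
    index = []
    for block in blocks:
        compressed = zlib.compress(pad_block(block, block_size))
        index.append((len(image), len(compressed)))
        image += compressed
    return bytes(image), index


# Three "per-process" blocks of different lengths, padded to 10 bytes each.
blocks = [b"a" * 10, b"b" * 7, b"c" * 3]
image, index = pack_blocks(blocks, block_size=10)

# Recover the second block through the index.
offset, size = index[1]
second = zlib.decompress(image[offset:offset + size])
```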
It seems that HDF5 already does everything we want, provided b) is in place. Once
the chunks are on disk and indexed correctly, a user selecting a slab will
trigger retrieval of the relevant chunks and, as long as the decompression
filter is available, the library will handle the decompression too. There
would be no need for a virtual dataset to map access to the sub-datasets
underneath.
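
The retrieval side can be sketched the same way. This is only a 1-D toy in
Python, not what the library does internally: chunks_for_slab and read_slab
are hypothetical names, zlib again stands in for the filter, and a list of
compressed byte strings stands in for indexed chunks on disk.

```python
import zlib


def chunks_for_slab(start, stop, chunk_len):
    """Indices of the fixed-size chunks that overlap [start, stop)."""
    if start >= stop:
        return range(0)
    return range(start // chunk_len, (stop - 1) // chunk_len + 1)


def read_slab(chunks, chunk_len, start, stop):
    """Decompress only the chunks a slab touches and cut out the slab."""
    out = bytearray()
    for c in chunks_for_slab(start, stop, chunk_len):
        data = zlib.decompress(chunks[c])
        base = c * chunk_len
        lo = max(start, base) - base          # first wanted byte in chunk c
        hi = min(stop, base + chunk_len) - base  # one past the last wanted byte
        out += data[lo:hi]
    return bytes(out)


# Twelve bytes stored as three compressed chunks of four bytes each.
raw = bytes(range(12))
chunks = [zlib.compress(raw[i:i + 4]) for i in range(0, 12, 4)]
slab = read_slab(chunks, chunk_len=4, start=3, stop=9)
```

The caller only ever names a slab; which chunks get read and decompressed
falls out of the chunk size and the index.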
As you (Quincey) know, I already have some practice messing about with the
HDF5 internals. If I wanted to do b), would it be feasible, instead of having
each process write a dataset, to get hold of the metadata directly and
manipulate it so that the pieces are written as chunks?
I can spend some time on this if I can get decent compression working with
parallel IO.
Regards
JB
PS. Mark, I followed the links to your silo/hdf5 wiki stuff. Interesting. It looks
like we're both looking at very similar problems. I will have to play with your
PMPIO stuff too. Are you also looking at other libraries, ADIOS for
example? (Off topic; you can reply off list if you don't feel it's appropriate
for here.)