On Tue, 2011-02-22 at 14:06, Quincey Koziol wrote:

> 
>       Well, as I say above, with this approach, you push the space
> allocation problem to the dataset creation step (which has its own
> set of problems),

Yeah, but those 'problems' aren't new to parallel I/O. Anyone who is
currently doing concurrent parallel I/O with HDF5 has already had to
deal with this part of the problem -- space allocation at dataset
creation -- right? The point is that the caller of HDF5 then knows how
big the dataset will be after it's been compressed, and HDF5 doesn't
have to 'discover' that during H5Dwrite. Hmm, puzzling...

I am recalling my suggestion of a '2-pass-planning' VFD where the caller
executes a slew of HDF5 operations on a file TWICE. On the first pass,
HDF5 doesn't do any of the actual raw data I/O but just records all the
information about it for a 'repeat performance' second pass. On the
second pass, HDF5 knows everything about what is 'about to happen' and
can then plan accordingly.

What about maybe doing that on a dataset-at-a-time basis? I mean, what
if you set dxpl props to indicate either 'pass 1' or 'pass 2' of a
2-pass H5Dwrite operation? On pass 1, between H5Dopen and H5Dclose,
H5Dwrite calls don't do any of the raw data I/O but do apply filters and
compute the sizes of the things they will eventually write. On H5Dclose
of pass 1, all the chunk size information is recorded. The caller then
does everything again, a second time, but sets 'pass' to 'pass 2' in the
dxpl for the H5Dwrite calls, and everything 'works' because all
processors know everything they need to know.
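
To make the per-dataset idea concrete, here is a rough sketch in plain
Python, with zlib standing in for an HDF5 filter and a bytearray
standing in for the file. None of these names are HDF5 API:
pass1_sizes, plan_offsets, and pass2_write are made up for illustration,
and the 'ranks' are just lists rather than real MPI processes.

```python
import zlib


def pass1_sizes(chunks, level=6):
    """Pass 1: run the filter on each chunk but keep only the sizes."""
    return [len(zlib.compress(chunk, level)) for chunk in chunks]


def plan_offsets(sizes_per_rank, base=0):
    """Turn every rank's recorded sizes into non-overlapping file offsets.

    Returns (offsets_per_rank, total_bytes). In a real parallel run the
    sizes would first have to be exchanged among processes; here they
    are simply passed in as a list of lists (an assumption of the sketch).
    """
    offsets_per_rank = []
    pos = base
    for sizes in sizes_per_rank:
        rank_offsets = []
        for size in sizes:
            rank_offsets.append(pos)
            pos += size
        offsets_per_rank.append(rank_offsets)
    return offsets_per_rank, pos


def pass2_write(buf, chunks, offsets, level=6):
    """Pass 2: filter again and write each chunk at its planned offset."""
    for chunk, off in zip(chunks, offsets):
        data = zlib.compress(chunk, level)
        buf[off:off + len(data)] = data


# Two hypothetical ranks, each holding its own chunks.
rank_chunks = [[b"a" * 1000, b"b" * 10], [bytes(range(256)) * 4]]
sizes = [pass1_sizes(chunks) for chunks in rank_chunks]
offsets, total = plan_offsets(sizes)
buf = bytearray(total)  # space is allocated once, up front
for chunks, offs in zip(rank_chunks, offsets):
    pass2_write(buf, chunks, offs)
```

Note that in this sketch the filter runs twice on every chunk, once per
pass, unless pass 1 also keeps the filtered bytes around.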

>   Maybe HDF5 could expose an API routine that the application could
> call, to pre-compress the data by passing it through the I/O filters?

I think that could be useful in any case. Just as it's now possible to
apply type conversion to a buffer of bytes, it probably ought to be
possible to apply any 'filter' to a buffer of bytes. The second half of
this, though, would involve smartening HDF5 to 'pass through'
pre-filtered data so the result is 'as if' HDF5 had done the filtering
work itself during H5Dwrite. Not sure how easy that would be ;) But, you
asked for comments/input.
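
A toy version of the two halves, again in plain Python with zlib playing
the role of the filter. The prefilter function and the ChunkStore class
(with its prefiltered flag) are hypothetical names for this sketch, not
anything HDF5 provides.

```python
import zlib


def prefilter(raw, level=6):
    """First half: apply the 'filter' to a plain buffer of bytes."""
    return zlib.compress(raw, level)


class ChunkStore:
    """Stand-in for a chunked dataset that stores filtered chunks."""

    def __init__(self, level=6):
        self.level = level
        self.chunks = {}

    def write(self, index, data, prefiltered=False):
        # Second half: pre-filtered data is stored untouched, so the
        # result is the same as if the store had filtered it itself.
        if prefiltered:
            self.chunks[index] = bytes(data)
        else:
            self.chunks[index] = zlib.compress(data, self.level)

    def read(self, index):
        return zlib.decompress(self.chunks[index])


raw = b"temperature " * 200
store = ChunkStore()
store.write(0, raw)                               # store does the filtering
store.write(1, prefilter(raw), prefiltered=True)  # caller did it already
```

Because the caller holds the filtered bytes before the write, it also
knows their size before the write, which is the piece of information
the space allocation step needs.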

> 
>       Quincey
> 
> 
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
-- 
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
[email protected]      urgent: [email protected]
T:8-6 (925)-423-5901    M/W/Th:7-12,2-7 (530)-753-8511

