On Tue, 2011-02-22 at 14:06, Quincey Koziol wrote:
>
> Well, as I say above, with this approach, you push the space
> allocation problem to the dataset creation step (which has its own
> set of problems),

Yeah, but those 'problems' aren't new to parallel I/O issues. Anyone who is currently doing concurrent parallel I/O with HDF5 has already had to deal with this part of the problem -- space allocation at dataset creation -- right? The point is that the caller of HDF5 then knows how big the dataset will be after it's been compressed, and HDF5 doesn't have to 'discover' that during H5Dwrite.

Hmm, puzzling... I am recalling my suggestion of a '2-pass-planning' VFD where the caller executes a slew of HDF5 operations on a file TWICE. In the first pass, HDF5 doesn't do any of the actual raw data I/O but just records all the information about it for a 'repeat performance' second pass. In the second pass, HDF5 knows everything about what is 'about to happen' and can then plan accordingly.

What about maybe doing that on a dataset-at-a-time basis? I mean, what if you set dxpl props to indicate either 'pass 1' or 'pass 2' of a 2-pass H5Dwrite operation? On pass 1, between H5Dopen and H5Dclose, H5Dwrites don't do any of the raw data I/O but do apply filters and compute the sizes of the things they will eventually write. On H5Dclose of pass 1, all the chunk size information is recorded. The caller then does everything again, a second time, but sets 'pass' to 'pass 2' in the dxpl for the H5Dwrite calls, and everything 'works' because all processors know everything they need to know.

> Maybe HDF5 could expose an API routine that the application could
> call, to pre-compress the data by passing it through the I/O filters?

I think that could be useful in any case. Just as it's now possible to apply type conversion to a buffer of bytes, it probably ought to be possible to apply any 'filter' to a buffer of bytes. The second half of this, though, would involve smartening HDF5 to 'pass through' pre-filtered data so the result is 'as if' HDF5 had done the filtering work itself during H5Dwrite. Not sure how easy that would be ;) But, you asked for comments/input.
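
To make the 2-pass idea above a little more concrete, here is a rough sketch in plain Python. It is not HDF5 code and uses no HDF5 API: zlib stands in for the filter pipeline, a list of byte strings per rank stands in for each processor's chunks, and the function names (plan_pass, assign_offsets, write_pass, read_chunk) are made up for illustration. It also assumes the filter is deterministic, so that pass 2 produces the same sizes pass 1 recorded.

```python
import io
import zlib


def plan_pass(chunks_per_rank):
    """Pass 1: apply the filter and record compressed sizes; no raw data I/O."""
    return [[len(zlib.compress(chunk)) for chunk in chunks]
            for chunks in chunks_per_rank]


def assign_offsets(sizes_per_rank, base=0):
    """Turn the recorded sizes into non-overlapping file offsets (a prefix sum).

    Returns the per-rank offsets and the total space needed past `base`.
    """
    offsets_per_rank = []
    cursor = base
    for sizes in sizes_per_rank:
        rank_offsets = []
        for size in sizes:
            rank_offsets.append(cursor)
            cursor += size
        offsets_per_rank.append(rank_offsets)
    return offsets_per_rank, cursor


def write_pass(fileobj, chunks_per_rank, offsets_per_rank):
    """Pass 2: repeat the same writes, now at offsets known in advance."""
    for chunks, offsets in zip(chunks_per_rank, offsets_per_rank):
        for chunk, offset in zip(chunks, offsets):
            fileobj.seek(offset)
            fileobj.write(zlib.compress(chunk))


def read_chunk(fileobj, offset, size):
    """Read one filtered chunk back and undo the filter."""
    fileobj.seek(offset)
    return zlib.decompress(fileobj.read(size))


if __name__ == "__main__":
    # Two hypothetical ranks with chunks that compress to different sizes.
    ranks = [[b"a" * 1000, b"b" * 500], [bytes(range(256)) * 4]]
    sizes = plan_pass(ranks)
    offsets, total = assign_offsets(sizes)
    target = io.BytesIO()
    write_pass(target, ranks, offsets)
    print("planned bytes:", total, "written bytes:", len(target.getvalue()))
```

The price of this scheme is visible in the sketch: the filter runs twice on every chunk. In exchange, once the sizes from pass 1 are shared, every rank can compute every offset on its own and no space allocation has to be negotiated during the actual writes.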

> Quincey

--
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
[email protected]      urgent: [email protected]
T:8-6 (925)-423-5901    M/W/Th:7-12,2-7 (530)-753-8511

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
