On Tue, 2011-02-22 at 12:29, Quincey Koziol wrote:

>       The problem with the collective I/O [write] operations is that
> multiple processes may be writing into each chunk, which MPI-I/O can
> handle when the data is not compressed, but since compressed data is
> context-sensitive, straightforward collective I/O won't work for
> compressed chunks.  Perhaps a two-phase approach would work, where the
> data for each chunk is shipped to a single process that updates the
> data in the chunk and compresses it, followed by one or more passes of
> collective writes of the compressed chunks.
> 
>       The problem with the independent I/O [write] operations is that
> compressed chunks [almost always] change size when the data in the
> chunk is written (either initially, or when the data is overwritten),
> and since not all of the processes are available, communicating the
> space allocation is a problem.  Each process needs to allocate space
> in the file, but since the other processes aren't "listening", it
> can't let them know that some space in the file has been used.  A
> possible solution to this might involve just appending data to the
> end of the file, but that's prone to race conditions between
> processes (although maybe the "shared file pointer" I/O mode in
> MPI-I/O would help with this).  Also, if each process moves a chunk
> around in the file (because it resized it), how will other processes
> learn where that chunk is, if they need to read from it?
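
(As an aside, the "shared file pointer" idea mentioned above would look
roughly like the following in raw MPI-I/O, with no HDF5 involved. This is
only a sketch, assuming zlib and omitting all error handling, but it shows
both why the append itself is easy and why the bookkeeping isn't.)

/* Sketch only: the "append via shared file pointer" idea from above,
 * expressed in raw MPI-I/O + zlib (no HDF5).  Error handling omitted. */
#include <mpi.h>
#include <zlib.h>
#include <stdlib.h>

void append_compressed_chunk(MPI_File fh, const void *chunk, uLong nbytes)
{
    /* Each rank compresses its own chunk locally... */
    uLongf clen = compressBound(nbytes);
    Bytef  *cbuf = malloc(clen);
    compress(cbuf, &clen, (const Bytef *)chunk, nbytes);

    /* ...and appends it; the shared file pointer serializes the appends,
     * so ranks never have to negotiate offsets up front. */
    MPI_File_write_shared(fh, cbuf, (int)clen, MPI_BYTE, MPI_STATUS_IGNORE);

    /* The catch is exactly the one described above: no other rank (and no
     * later reader) knows where this chunk landed or how big it is without
     * some additional metadata exchange. */
    free(cbuf);
}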

Something that puzzles me here is that if my parallel app applies szip,
gzip, or whatever compression to my data on each processor BEFORE ever
passing it to HDF5, I can then successfully write it to HDF5, treating
the data as an opaque array of bytes of known size, using collective or
independent parallel I/O just as with any other 'ordinary' HDF5 dataset
(with either chunked or contiguous layouts).
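
Concretely, I'm picturing something like this (a sketch only; the file and
dataset names are made up, error checks are omitted, and the per-rank
offsets and total length would have to be gathered from the compressed
sizes beforehand, e.g. with MPI_Scan):

#include <hdf5.h>
#include <mpi.h>

/* 'cbytes'/'clen' are this rank's already-compressed bytes and length,
 * 'my_offset' is this rank's starting byte in the dataset, and
 * 'total_len' is the sum of 'clen' over all ranks. */
void write_precompressed(MPI_Comm comm, const void *cbytes, hsize_t clen,
                         hsize_t my_offset, hsize_t total_len)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);
    hid_t file = H5Fcreate("data.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* HDF5 just sees a 1-D array of bytes; it has no idea they're gzip'd. */
    hid_t fspace = H5Screate_simple(1, &total_len, NULL);
    hid_t dset   = H5Dcreate(file, "opaque_compressed", H5T_NATIVE_UCHAR,
                             fspace, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    hid_t mspace = H5Screate_simple(1, &clen, NULL);
    H5Sselect_hyperslab(fspace, H5S_SELECT_SET, &my_offset, NULL, &clen, NULL);

    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_UCHAR, mspace, fspace, dxpl, cbytes);

    H5Pclose(dxpl); H5Sclose(mspace); H5Sclose(fspace);
    H5Dclose(dset); H5Fclose(file); H5Pclose(fapl);
}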

The problem, of course, is that the HDF5 library would not be 'aware' of
the data's true nature (either its original, pre-compression type or the
fact that it had been compressed, and by which algorithm(s), etc.).
Subsequent readers would have to 'know' what to do with it.

So, why can't we fix the second half of this problem and invent a way to
hand HDF5 'pre-filtered' data, bypassing any subsequent attempt by HDF5
to filter it (or chunks thereof) on write? On the read end, enough
information would be available to the library to 'do the right thing'.
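
In other words, something along these lines (completely hypothetical; the
name and signature are made up just to show the shape of the call I'm
imagining):

/* Hypothetical, for illustration only: the application hands HDF5 a chunk
 * that has already been run through the filter pipeline, and the library
 * writes the bytes as-is, recording which filters were applied so that
 * readers can undo them later. */
herr_t H5Dwrite_prefiltered_chunk(hid_t          dset_id,
                                  hid_t          dxpl_id,
                                  const hsize_t *chunk_offset, /* in dataset coordinates */
                                  unsigned       filter_mask,  /* which pipeline filters were applied */
                                  size_t         nbytes,       /* size of the filtered buffer */
                                  const void    *filtered_buf);

/* Usage sketch: rank-local gzip, then hand HDF5 the opaque bytes. */
H5Dwrite_prefiltered_chunk(dset_id, dxpl_id, my_chunk_offset,
                           0 /* 0 == all registered filters applied */,
                           compressed_len, compressed_buf);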

I guess another way of saying this is that HDF5's chunking is specified
in terms of the dataset's 'native' shape. For compressed data, why not
'turn that around' and treat each chunk as a bucket holding a fixed
number of compressed bytes of the dataset (where that number is chosen
to match the byte size of a chunk in the dataset's 'native' layout),
which, when uncompressed, yields a variable-sized 'chunk' in the native
layout?
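
Readers (or the library) would then need a small per-dataset index to map
native-space coordinates back to buckets, since each fixed-size bucket
covers a variable amount of the native layout once decompressed. Roughly
(again, all names made up):

#include <hdf5.h>   /* just for hsize_t / haddr_t */

/* One entry per fixed-size compressed bucket. */
typedef struct {
    hsize_t native_start;  /* first native element covered by this bucket */
    hsize_t native_count;  /* how many native elements it decompresses to (varies) */
    haddr_t file_addr;     /* buckets are fixed-size, so this is base + i*BUCKET_BYTES */
} bucket_index_entry_t;

/* Find the bucket containing native element `elem`; entries are sorted by
 * native_start, so a binary search for the last start <= elem suffices. */
static int find_bucket(const bucket_index_entry_t *idx, int nbuckets, hsize_t elem)
{
    int lo = 0, hi = nbuckets - 1;
    while (lo < hi) {
        int mid = (lo + hi + 1) / 2;
        if (idx[mid].native_start <= elem)
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}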

Mark



-- 
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
[email protected]      urgent: [email protected]
T:8-6 (925)-423-5901    M/W/Th:7-12,2-7 (530)-753-8511

