Hi Leigh,

I am still interested to know whether an approach would work for you in
which you specify a minimum target compression ratio and HDF5 then
(possibly over-)allocates space for each block assuming that maximum
compressed size.
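To make the idea concrete, here is a bit of throwaway arithmetic. This is
not an existing HDF5 API; the sizes and names below (min_ratio,
raw_chunk_bytes, and so on) are invented purely for illustration. If the
application promises at least min_ratio:1 compression, every chunk can be
given a fixed worst-case reservation of raw_chunk_bytes / min_ratio, so
each chunk's file offset becomes a pure function of its index and no
communication is needed to allocate space:

    /* Illustrative arithmetic only; not an existing HDF5 API. Given a
     * promised minimum compression ratio, every chunk gets a fixed
     * worst-case reservation, so chunk offsets are computable up front
     * with no inter-process communication. */
    #include <stdio.h>

    int main(void)
    {
        const double min_ratio       = 2.0;                /* app promises >= 2:1 */
        const size_t raw_chunk_bytes = 64UL * 1024 * 1024; /* e.g. nx*ny*nzchunk floats */
        const size_t data_start      = 4096;               /* made-up header size */

        /* Worst-case compressed size the library would reserve per chunk. */
        const size_t max_compressed = (size_t)((double)raw_chunk_bytes / min_ratio);

        /* The offset of chunk i is then a pure function of i. */
        for (size_t i = 0; i < 3; i++)
            printf("chunk %zu: reserve %zu bytes at offset %zu\n",
                   i, max_compressed, data_start + i * max_compressed);

        /* A chunk that compresses worse than min_ratio does not fit its slot;
         * the library would have to fail or fall back to storing it raw, which
         * is the trade-off of over-allocating this way. */
        return 0;
    }

The open question is what the library should do with a block that misses
the target, which is why I ask whether a guaranteed minimum ratio is
something your data can promise.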
Mark

On Wed, 2010-12-15 at 10:59, Leigh Orf wrote:
> On Tue, Dec 14, 2010 at 5:42 PM, Quincey Koziol <[email protected]> wrote:
>
> > Hi Leigh,
> >
> > [snipped for brevity]
> >
> > > Quincey,
> > >
> > > Probably a combination of both, namely, an ideal situation would be
> > > a group of MPI ranks collectively writing one compressed HDF5 file.
> > > On Blue Waters a 100k-core run with 32 cores/MCM could therefore
> > > result in, say, around 3000 files, which is not unreasonable.
> > >
> > > Maybe I'm thinking about this too simply, but couldn't you compress
> > > the data on each MPI rank, save it in a buffer, calculate the space
> > > required, and then write it? I don't know enough about the internal
> > > workings of HDF5 to know whether that would fit in the HDF5 model.
> > > In our particular application on Blue Waters, memory is cheap, so
> > > there is lots of space in memory for buffering data.
> >
> > What you say above is basically what happens, except that space in
> > the file needs to be allocated for each block of compressed data.
> > Since each block is not the same size, the HDF5 library can't
> > pre-allocate the space or algorithmically determine how much to
> > reserve for each process. In the case of collective I/O, at least
> > it's theoretically possible for all the processes to communicate and
> > work it out, but I'm not certain it's going to be solvable for
> > independent I/O, unless we reserve some processes to either allocate
> > space (like a "free space server") or buffer the "I/O", etc.
>
> Could you make this work by forcing each core to have some specific
> chunking arrangement? For instance, you could have each core's
> dimensions simply be the same dimensions as each chunk, which actually
> works out pretty well in my application, at least in the horizontal. I
> typically have nxchunk=nx, nychunk=ny, and nzchunk set to something
> like 20 or so. But now that I think about it, even if that were the
> case, you don't know the size of the compressed chunks until you've
> compressed them, and you'd still need to communicate the size of the
> compressed chunks amongst the cores writing to an individual file.
>
> I don't know enough about HDF5 to understand how the preallocation
> process works. It sounds like you are allocating a bunch of zeroes (or
> something) on disk first, and then doing I/O straight to that space on
> disk? If this is the case, then I can see how this necessitates some
> kind of collective communication if you are splitting up compression
> amongst MPI ranks.
>
> Personally, I am perfectly happy with a bit of overhead that forces
> all cores to share amongst themselves what the compressed block size
> is before writing, if it means we can do compression. Right now I see
> my choices as being (1) compression, but one file per MPI rank and
> therefore lots of files, or (2) no compression and fewer files, but
> perhaps compressing later using h5repack, called in parallel with one
> h5repack per MPI rank as a post-processing step (yuck!).
>
> I'm glad you're working on this; personally, I think this is important
> stuff for really huge simulations. In talking to other folks who will
> be using Blue Waters, compression is not much of an issue for many of
> them because of the nature of their data. Cloud data especially tends
> to compress very well. It would be a shame to fill terabytes of disk
> space with zeroes! I am sure we can still carry out our research
> objectives without compression, but the sheer amount of data we will
> be producing is staggering even with compression.
>
> Leigh
>
> > Quincey
>
> --
> Leigh Orf
> Associate Professor of Atmospheric Science
> Department of Geology and Meteorology
> Central Michigan University
> Currently on sabbatical at the National Center for Atmospheric
> Research in Boulder, CO
> NCAR office phone: (303) 497-8200
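Coming back to the point above about having the cores share the
compressed block sizes before writing: here is a rough, untested sketch
of the kind of thing I imagine at the application level. It uses plain
MPI-IO and zlib rather than anything inside HDF5, and the buffer sizes,
file name, and so on are placeholders. Each rank compresses its block in
memory, a prefix sum over the compressed sizes (MPI_Exscan) gives each
rank its write offset, and the ranks then write independently at
non-overlapping offsets:

    /* Sketch only: compress per-rank data in memory, exchange compressed
     * sizes, compute each rank's file offset with a prefix sum, then write.
     * Uses zlib + MPI-IO directly; this is NOT how HDF5 does it internally. */
    #include <mpi.h>
    #include <zlib.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Placeholder "chunk": one rank's subdomain, already in a buffer. */
        const uLong raw_bytes = 1024 * 1024;
        unsigned char *raw = calloc(raw_bytes, 1);   /* zeroes compress well */

        /* Compress into a worst-case sized buffer. */
        uLongf comp_bytes = compressBound(raw_bytes);
        unsigned char *comp = malloc(comp_bytes);
        compress2(comp, &comp_bytes, raw, raw_bytes, 6);

        /* Prefix sum of compressed sizes gives this rank's write offset. */
        long long my_size = (long long)comp_bytes, my_offset = 0;
        MPI_Exscan(&my_size, &my_offset, 1, MPI_LONG_LONG, MPI_SUM, MPI_COMM_WORLD);
        if (rank == 0) my_offset = 0;  /* MPI_Exscan leaves rank 0 undefined */

        /* Independent writes at non-overlapping offsets. */
        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "chunks.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        MPI_File_write_at(fh, (MPI_Offset)my_offset, comp, (int)my_size,
                          MPI_BYTE, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);

        free(comp);
        free(raw);
        MPI_Finalize();
        return 0;
    }

A real HDF5-based version would also need the complete list of sizes
wherever the chunk index gets written, for example via an MPI_Allgather
of the sizes, which is exactly the extra collective step mentioned above.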
--
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
[email protected]            urgent: [email protected]
T:8-6 (925)-423-5901        M/W/Th:7-12,2-7 (530)-753-8511

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
