Matthieu,

> what would be the best way of writing one file from all the processes
> together (in terms of write latency), knowing the data layout (regular 2D
> arrays),

If all processes are writing one piece of a single dataset (assuming I understood your question correctly), then the usual approach applies: a collective create of the dataset, followed by a hyperslab selection on each process and a write of the individual pieces.

> write by hyperslabs/chunks/patterns

is, I think, what you want.

Writing one dataset per process was something I wanted. An example of why might be most illustrative...

Suppose I'm working in ParaView and have done some work in parallel on multi-block data; each process has a different block from the multi-block structure. The blocks might be geometrically diverse (e.g. tetrahedra on one process, prisms on another). I want to write out my current state, but don't want to do a collective write to one dataset. I really want to write each block out independently, but all to the same file. Because each process has no idea what the others have got, I needed a way to gather the info and create the 'structure', then write. In the general case it'll be slower (physically more writes to disk), but for the purposes of organisation, much tidier.

JB

From: [email protected] [mailto:[email protected]] On Behalf Of Matthieu Dorier
Sent: 25 February 2011 10:10
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] multi-pass IO (was chunking)

Hello John (and others, since maybe other people can answer the following questions),

Your library seems very interesting and I will probably use it in my project. Yet I have a question: what would be the best way of writing one file from all the processes together (in terms of write latency), knowing the data layout (regular 2D arrays):
- using the classic PHDF5 library and writing by hyperslabs/chunks/patterns,
- or using your library to split a dataset into "/procNNN/dataset"?

It seems to me that writing regular patterns can benefit from MPI-IO's particular optimizations, but maybe I misunderstood the goal of your library?

Thank you,
Matthieu

2011/2/25 Mark Miller <[email protected]>

John,

This is awesome! Thanks so much for putting it up.

I really wish the HDF5 Group had decided a long while ago to make this kind of thing available UNDER the HDF5 API via...

a) adding either an H5Xcreate_deferred for a part, X, of the API, or adding a property to X's create property list to indicate a desire for deferred creation. Any object so created cannot be acted upon until a subsequent H5Xsync_deferred()...

b) an H5Xsync_deferred() function to synchronize all deferred created objects.

But, in spite of numerous suggestions over many years that it'd be good for parallel applications to be able to do this, it still hasn't found its way into the HDF5 library proper ;)

It's so nice to see someone offer a suitable alternative ;)

Mark

On Thu, 2011-02-24 at 14:39, Biddiscombe, John A. wrote:
> The discussion about chunking and two pass VFDs reminded me that I intended
> to make a small library for doing independent dataset creates, on a per
> process basis, available. It was created some time ago and used extensively
> on one project, but currently not in use.
>
> I've tidied the code up a bit and uploaded it to the following page
> https://hpcforge.org/plugins/mediawiki/wiki/libh5mb/index.php/Main_Page
> the source code is available via the SCM link.
>
> Some brief notes on the library are shown on the wiki page, but the actual
> API is probably best described in the H5MButil.h file. I created the wiki
> page very quickly so apologies if the content is unclear, please let me know
> if it needs improvement.
>
> Hopefully someone will find the code useful.
>
> JB

--
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
[email protected]  urgent: [email protected]
T:8-6 (925)-423-5901  M/W/Th:7-12,2-7 (530)-753-8511

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

--
Matthieu Dorier
ENS Cachan, antenne de Bretagne
Département informatique et télécommunication
http://perso.eleves.bretagne.ens-cachan.fr/~mdori307/wiki/
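
For readers comparing the two layouts discussed in this thread, here is a minimal sketch in Python of the bookkeeping each one needs. It is not taken from libh5mb or from HDF5, and it performs no I/O. The function names are hypothetical, the row-block decomposition of the regular 2D array is an assumption, and so is the three-digit zero padding used for the "/procNNN/dataset" paths. In a real parallel HDF5 program the (start, count) pair would typically be handed to a hyperslab selection (H5Sselect_hyperslab in the C API) before the write, and the list of shapes would come from an MPI all-gather rather than a literal list.

```python
from typing import Dict, List, Tuple


def row_block_hyperslab(rank: int, nprocs: int, nrows: int, ncols: int
                        ) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Single shared dataset: return (start, count) of the rows owned by `rank`.

    Assumes the nrows x ncols array is split into contiguous row blocks,
    one block per process (an assumption, not something the thread specifies).
    """
    base, extra = divmod(nrows, nprocs)
    # The first `extra` ranks each take one additional row.
    my_rows = base + (1 if rank < extra else 0)
    first_row = rank * base + min(rank, extra)
    return (first_row, 0), (my_rows, ncols)


def per_process_layout(gathered_shapes: List[Tuple[int, ...]]
                       ) -> Dict[str, Tuple[int, ...]]:
    """One dataset per process: map a '/procNNN/dataset' path to each rank's shape.

    `gathered_shapes[i]` stands in for the shape reported by rank i after a
    gather step; every process can then create the same file structure
    before writing its own block independently.
    """
    return {f"/proc{rank:03d}/dataset": shape
            for rank, shape in enumerate(gathered_shapes)}


if __name__ == "__main__":
    # Four processes sharing a 10 x 6 array.
    for r in range(4):
        print(r, row_block_hyperslab(r, 4, 10, 6))
    # Two processes holding blocks of different sizes.
    print(per_process_layout([(100, 4), (250, 6)]))
```

The first function only works when every process can derive its piece from the global shape, which is the regular case Matthieu describes. The second reflects John's point that, with irregular blocks, no process knows the others' sizes until the information has been gathered.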
