Hi Nathanael, I'll try and spend some time looking at the patch. Thanks for sharing!
This sounds like you are optimizing your checkpointing phase. Is there any advantage from doing this rather than using PLFS?

Mohamad

-----Original Message-----
From: Hdf-forum [mailto:[email protected]] On Behalf Of huebbe
Sent: Friday, September 20, 2013 6:34 AM
To: [email protected]
Cc: Julian Kunkel
Subject: Re: [Hdf-forum] Very poor performance of pHDF5 when using single (shared) file

On 09/19/2013 10:43 AM, [email protected] wrote:
> What we are doing is working with The HDF Group to define a work package
> dubbed "Virtual Datasets" where you can have a virtual dataset in a master
> file which is composed of datasets in underlying files. It is a bit like
> extending the soft-link mechanism to allow unions. The method of mapping the
> underlying datasets onto the virtual dataset is very flexible and so we hope
> it can be used in a number of circumstances. The two main requirements are:
>
> - The use of the virtual dataset is transparent to any program reading the
>   data later.
> - The writing nodes can write their files independently, so don't need pHDF5.

As a matter of fact, this is pretty much what we did already for our own research: We, too, patched the HDF5 library to provide writing of multiple files and reading them back in a way entirely transparent to the application.

You can find our patch, along with a much more detailed description, on our website:
http://www.wr.informatik.uni-hamburg.de/research/projects/icomex/multifilehdf5

On our system, we could actually see an improvement in wall-clock time for the entire process of writing-reconstructing-reading as opposed to writing to a shared file and reading it single stream. This may be different on other systems, but at least we expect a huge benefit in CPU-time since the multifile approach allows the parallel part of the workflow to be fast.

Of course, we are very interested to hear about other people's experiences with transparent multifiles.

Cheers,
Nathanael Hübbe
