Hi Nathanael,

I'll try to spend some time looking at the patch. Thanks for sharing!

This sounds like you are optimizing your checkpointing phase.
Is there any advantage to doing this rather than using PLFS?

Mohamad

-----Original Message-----
From: Hdf-forum [mailto:[email protected]] On Behalf Of huebbe
Sent: Friday, September 20, 2013 6:34 AM
To: [email protected]
Cc: Julian Kunkel
Subject: Re: [Hdf-forum] Very poor performance of pHDF5 when using single (shared) file

On 09/19/2013 10:43 AM, [email protected] wrote:
> What we are doing is working with The HDF Group to define a work package 
> dubbed "Virtual Datasets" where you can have a virtual dataset in a master 
> file which is composed of datasets in underlying files. It is a bit like 
> extending the soft-link mechanism to allow unions. The method of mapping the 
> underlying datasets onto the virtual dataset is very flexible and so we hope 
> it can be used in a number of circumstances. The two main requirements are:
> 
>  - The use of the virtual dataset is transparent to any program reading the 
> data later.
>  - The writing nodes can write their files independently, so don't need pHDF5.

As a matter of fact, this is pretty much what we did already for our own
research: We, too, patched the HDF5 library to provide writing of multiple 
files and reading them back in a way entirely transparent to the application. 
You can find our patch, along with a much more detailed description, on our 
website:
http://www.wr.informatik.uni-hamburg.de/research/projects/icomex/multifilehdf5
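
The general idea (independent writers, one transparent view for readers) can be
sketched in a few lines of plain Python. This is only a toy illustration: it
does not use HDF5, it is neither the patch linked above nor the proposed
Virtual Datasets design, and the raw-binary part files, the JSON master file,
and the names write_part, write_master, VirtualDataset, and demo are all
hypothetical, made up for this example.

```python
import json
import os
import struct
import tempfile

ITEM = struct.Struct("<d")  # one little-endian float64 per element (assumed toy format)


def write_part(path, values):
    """Each writer dumps its own block to its own file; no coordination needed."""
    with open(path, "wb") as f:
        for v in values:
            f.write(ITEM.pack(v))


def write_master(path, parts):
    """Record which part file backs which index range of the combined dataset."""
    mapping, start = [], 0
    for part_path, length in parts:
        mapping.append({"file": part_path, "start": start, "length": length})
        start += length
    with open(path, "w") as f:
        json.dump({"length": start, "mapping": mapping}, f)


class VirtualDataset:
    """Reader-side view: behaves like one array, but reads from the part files."""

    def __init__(self, master_path):
        with open(master_path) as f:
            meta = json.load(f)
        self.length = meta["length"]
        self.mapping = meta["mapping"]

    def __len__(self):
        return self.length

    def __getitem__(self, key):
        if isinstance(key, slice):
            return [self[i] for i in range(*key.indices(self.length))]
        if key < 0:
            key += self.length
        # Find the part file whose index range contains the requested element.
        for m in self.mapping:
            if m["start"] <= key < m["start"] + m["length"]:
                with open(m["file"], "rb") as f:
                    f.seek((key - m["start"]) * ITEM.size)
                    return ITEM.unpack(f.read(ITEM.size))[0]
        raise IndexError(key)


def demo():
    """Three hypothetical writer ranks, four values each, one master file."""
    tmp = tempfile.mkdtemp()
    parts = []
    for rank in range(3):
        values = [float(rank * 4 + i) for i in range(4)]
        path = os.path.join(tmp, f"part{rank}.bin")
        write_part(path, values)
        parts.append((path, len(values)))
    master = os.path.join(tmp, "master.json")
    write_master(master, parts)
    return VirtualDataset(master)
```

A reader only opens the master file and indexes it, for example demo()[3:6],
without knowing that the elements come from two different part files. A real
implementation would have to do this mapping inside the library so that
existing applications keep working unchanged.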

On our system, we actually saw an improvement in wall-clock time for the 
entire process of writing, reconstructing, and reading, compared with writing 
to a shared file and reading it back in a single stream. This may be different 
on other systems, but we expect a large saving in CPU time in any case, since 
the multifile approach keeps the parallel part of the workflow fast.

Of course, we are very interested to hear about other people's experiences with 
transparent multifiles.

Cheers,
Nathanael Hübbe


_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
