Hello,
I am also interested in parallel I/O for my project. I know HDF5 supports
multiple opens of a single file by different processes right out of the box.
If my memory serves me correctly, HDF5 requires parallel file I/O (i.e. an
MPI-IO-enabled parallel build) in order to perform multiple writes to a single
file. Given a multi-core Linux machine with MPICH2 or OpenMPI installed, what
else do I need to install in order to perform parallel file writes? And is the
Windows platform out of luck, except for Windows HPC Server? Thanks a lot.
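
For context, what I have in mind is roughly the sketch below; it assumes an
MPI-enabled (parallel) build of HDF5, and the file name is just a placeholder:

#include <mpi.h>
#include <hdf5.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    /* Route HDF5 file access through the MPI-IO driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);

    /* Every rank creates/opens the same file collectively. */
    hid_t file = H5Fcreate("parallel.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* ... dataset creation and writes would go here ... */

    H5Fclose(file);
    H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}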

Best,
x
From: [email protected] [mailto:[email protected]] On 
Behalf Of Biddiscombe, John A.
Sent: Friday, February 25, 2011 5:11 AM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] multi-pass IO (was chunking)

Matthieu

> what would be the best way of writing one file from all the processes 
> together (in terms of write latency), knowing the data layout (regular 2D 
> arrays),

if all processes are writing one piece of a single dataset (assuming I
understood your question correctly), then the usual approach is a collective
create of the dataset, followed by a hyperslab selection on each process and a
write of the individual pieces.
> write by hyperslabs/chunks/patterns
is, I think, what you want.
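
A rough sketch of that pattern, assuming for illustration a 1-D dataset with N
doubles per rank (the dataset name and sizes are just placeholders):

#include <mpi.h>
#include <hdf5.h>

#define N 1024  /* elements written by each rank (illustrative) */

void write_hyperslab(hid_t file, int rank, int nprocs, const double *data)
{
    /* Collective: every rank takes part in creating the full-size dataset. */
    hsize_t dims[1] = { (hsize_t)nprocs * N };
    hid_t filespace = H5Screate_simple(1, dims, NULL);
    hid_t dset = H5Dcreate(file, "field", H5T_NATIVE_DOUBLE, filespace,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Each rank selects its own hyperslab of the file dataspace. */
    hsize_t start[1] = { (hsize_t)rank * N };
    hsize_t count[1] = { N };
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);

    hid_t memspace = H5Screate_simple(1, count, NULL);

    /* Collective data transfer (H5FD_MPIO_INDEPENDENT also works). */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, data);

    H5Pclose(dxpl);
    H5Sclose(memspace);
    H5Sclose(filespace);
    H5Dclose(dset);
}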

Writing one dataset per process was something I wanted; an example of why
might be most illustrative...

Suppose I'm working in ParaView and have done some work in parallel on
multi-block data, where each process has a different block from the multi-block
structure. The blocks might be geometrically diverse (e.g. tetrahedra on one
process, prisms on another). I want to write out my current state, but I don't
want to do a collective write to one dataset. I really want to write each block
out independently, but all to the same file.
Because each process has no idea what the others have got, I needed a way to
gather that information, create the 'structure', and then write.

In the general case it'll be slower (physically more writes to disk), but for 
the purposes of organisation, much tidier.
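
To sketch the idea (this is not the H5MB API, just an illustration with
made-up names): exchange each block's size with MPI_Allgather, have every rank
take part in creating all the per-process datasets (object creation has to be
collective in parallel HDF5), and then let each rank write only its own
dataset independently.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <hdf5.h>

void write_per_process_blocks(hid_t file, const double *block, hsize_t nlocal)
{
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* 1. Gather the 'structure': every rank learns every block size. */
    unsigned long long mine = (unsigned long long)nlocal;
    unsigned long long *sizes = malloc(nprocs * sizeof *sizes);
    MPI_Allgather(&mine, 1, MPI_UNSIGNED_LONG_LONG,
                  sizes, 1, MPI_UNSIGNED_LONG_LONG, MPI_COMM_WORLD);

    /* 2. Collectively create one dataset per process. */
    hid_t *dsets = malloc(nprocs * sizeof *dsets);
    for (int p = 0; p < nprocs; p++) {
        char name[64];
        snprintf(name, sizeof name, "/proc%03d/dataset", p);
        hsize_t dims[1] = { (hsize_t)sizes[p] };
        hid_t space = H5Screate_simple(1, dims, NULL);
        hid_t lcpl  = H5Pcreate(H5P_LINK_CREATE);
        H5Pset_create_intermediate_group(lcpl, 1);  /* make the /procNNN groups */
        dsets[p] = H5Dcreate(file, name, H5T_NATIVE_DOUBLE, space,
                             lcpl, H5P_DEFAULT, H5P_DEFAULT);
        H5Pclose(lcpl);
        H5Sclose(space);
    }

    /* 3. Each rank writes only its own dataset (independent I/O). */
    H5Dwrite(dsets[rank], H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
             H5P_DEFAULT, block);

    for (int p = 0; p < nprocs; p++) H5Dclose(dsets[p]);
    free(dsets);
    free(sizes);
}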

JB


From: [email protected] [mailto:[email protected]] On 
Behalf Of Matthieu Dorier
Sent: 25 February 2011 10:10
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] multi-pass IO (was chunking)

Hello John (and others, since maybe other people can answer the following 
questions)

Your library seems very interesting and I will probably use it in my project.
I do have a question, though: what would be the best way of writing one file
from all the processes together (in terms of write latency), knowing the data
layout (regular 2D arrays):
- using the classic PHDF5 library and writing by hyperslabs/chunks/patterns, or
- using your library to split a dataset into "/procNNN/dataset"?
It seems to me that writing regular patterns can benefit from MPI-IO's
particular optimizations, but maybe I have misunderstood the goal of your
library?

Thank you,

Matthieu
2011/2/25 Mark Miller <[email protected]>
John,

This is awesome! Thanks so much for putting it up.

I really wish the HDF5 Group had decided a long while ago to make this
kind of thing available UNDER the HDF5 API via...
   a) adding either an H5Xcreate_deferred for any part, X, of the API, or
      adding a property to X's create property list to indicate a
      desire for deferred creation.
      Any object so created cannot be acted upon until a subsequent
      H5Xsync_deferred()...
   b) an H5Xsync_deferred() function to synchronize all deferred-created
      objects.
But, in spite of numerous suggestions over many years that it'd be good
for parallel applications to be able to do this, it still hasn't found
its way into the HDF5 library proper ;)
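
Purely for illustration (none of these calls exist in HDF5; they just follow
the naming pattern proposed above with X = D), usage might have looked like:

/* Hypothetical: queue a creation on this rank only, no collective call yet. */
hid_t dset = H5Dcreate_deferred(file, "/proc003/block", H5T_NATIVE_DOUBLE,
                                space, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

/* Hypothetical: collective point where all deferred objects materialize. */
H5Dsync_deferred(file);

H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, buffer);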

It's so nice to see someone offer a suitable alternative ;)

Mark



On Thu, 2011-02-24 at 14:39, Biddiscombe, John A. wrote:
> The discussion about chunking and two-pass VFDs reminded me that I intended
> to make available a small library for doing independent dataset creates on a
> per-process basis. It was created some time ago and used extensively on one
> project, but it is not currently in use.
>
> I've tidied the code up a bit and uploaded it to the following page
> https://hpcforge.org/plugins/mediawiki/wiki/libh5mb/index.php/Main_Page
> the source code is available via the SCM link.
>
> Some brief notes on the library are shown on the wiki page, but the actual
> API is probably best described in the H5MButil.h file. I created the wiki
> page very quickly, so apologies if the content is unclear; please let me know
> if it needs improvement.
>
> Hopefully someone will find the code useful.
>
> JB
--
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
[email protected]      urgent: [email protected]
T:8-6 (925)-423-5901    M/W/Th:7-12,2-7 (530)-753-8511





--
Matthieu Dorier
ENS Cachan, antenne de Bretagne
Département informatique et télécommunication
http://perso.eleves.bretagne.ens-cachan.fr/~mdori307/wiki/
