Yes, of course Francesc, I was thinking of a float as half of 64 bit instead of 4 x 8 bit :)
I was thinking it might be beneficial to keep the buffer size a power of 2, which 
is why I chose 1024 rather than 1000. I keep it as a variable so I can change it 
easily.
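
To make it concrete, what I am doing now looks roughly like the sketch below 
(placeholder names such as buffer_rows, ncols and write_buffer, no error 
checking, and the tracking loop reduced to a stand-in; the Fortran interface 
lists dimensions in Fortran order, so a real code may need the ordering 
adjusted):

program buffered_write_sketch
  use hdf5
  implicit none
  ! Placeholder sizes: ncols values per record, flushed buffer_rows rows at a time.
  integer, parameter :: ncols = 6, buffer_rows = 1024
  integer(hid_t)     :: file_id, dset_id, filespace, memspace, dcpl
  integer(hsize_t)   :: dims(2), maxdims(2), chunk(2), offset(2), count(2)
  integer(hsize_t)   :: rows_written
  double precision   :: write_buffer(ncols, buffer_rows)
  integer            :: hdferr, istep

  call h5open_f(hdferr)
  call h5fcreate_f("trajectories.h5", H5F_ACC_TRUNC_F, file_id, hdferr)

  ! Extendable dataset: ncols columns, unlimited rows, chunked and deflated.
  dims    = (/ int(ncols, hsize_t), 0_hsize_t /)
  maxdims = (/ int(ncols, hsize_t), H5S_UNLIMITED_F /)
  chunk   = (/ int(ncols, hsize_t), int(buffer_rows, hsize_t) /)
  call h5screate_simple_f(2, dims, filespace, hdferr, maxdims)
  call h5pcreate_f(H5P_DATASET_CREATE_F, dcpl, hdferr)
  call h5pset_chunk_f(dcpl, 2, chunk, hdferr)
  call h5pset_deflate_f(dcpl, 6, hdferr)
  call h5dcreate_f(file_id, "trajectories", H5T_NATIVE_DOUBLE, filespace, &
                   dset_id, hdferr, dcpl)
  call h5sclose_f(filespace, hdferr)

  rows_written = 0
  do istep = 1, 10                      ! stand-in for the tracking loop
     write_buffer = dble(istep)         ! stand-in for filling the buffer
     ! Flush: extend the dataset by one buffer and write it as a hyperslab
     ! (h5dset_extent_f is the newer name for h5dextend_f).
     dims = (/ int(ncols, hsize_t), rows_written + buffer_rows /)
     call h5dset_extent_f(dset_id, dims, hdferr)
     call h5dget_space_f(dset_id, filespace, hdferr)
     offset = (/ 0_hsize_t, rows_written /)
     count  = (/ int(ncols, hsize_t), int(buffer_rows, hsize_t) /)
     call h5sselect_hyperslab_f(filespace, H5S_SELECT_SET_F, offset, count, hdferr)
     call h5screate_simple_f(2, count, memspace, hdferr)
     call h5dwrite_f(dset_id, H5T_NATIVE_DOUBLE, write_buffer, count, hdferr, &
                     memspace, filespace)
     call h5sclose_f(memspace, hdferr)
     call h5sclose_f(filespace, hdferr)
     rows_written = rows_written + buffer_rows
  end do

  call h5dclose_f(dset_id, hdferr)
  call h5fclose_f(file_id, hdferr)
  call h5close_f(hdferr)
end program buffered_write_sketch

The dataset is chunked in units of one buffer, which is also what allows the 
deflate compression to work on an extendable dataset.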

Werner, I was thinking that I should eventually move to a sequence of 1D 
arrays, but that requires slightly more rewriting. The number of rows I have to 
write depends on whether or not a particle is still alive. I start out with a 
fixed number of particles, but I have no way of knowing whether I will need to 
write the position of a given particle zero times or one million times. 
Typically I have something like one million timesteps, but I do not write out 
trajectories at every step (when I do depends on the Monte Carlo, so there is 
no way to know in advance).
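
If I understand the suggestion correctly, the per-step layout would look 
roughly like this (again just a sketch with placeholder names, writing a single 
quantity and skipping error checks):

subroutine write_step(file_id, istep, n_alive, x_alive)
  use hdf5
  implicit none
  ! One group per timestep, one 1D dataset per quantity, sized by the
  ! number of particles still alive at that step (placeholder names).
  integer(hid_t), intent(in)   :: file_id
  integer, intent(in)          :: istep, n_alive
  double precision, intent(in) :: x_alive(n_alive)
  integer(hid_t)    :: grp_id, space_id, dset_id
  integer(hsize_t)  :: dims(1)
  integer           :: hdferr
  character(len=32) :: grp_name

  write(grp_name, '("step_",i8.8)') istep
  dims(1) = int(n_alive, hsize_t)
  call h5gcreate_f(file_id, trim(grp_name), grp_id, hdferr)
  call h5screate_simple_f(1, dims, space_id, hdferr)
  call h5dcreate_f(grp_id, "x", H5T_NATIVE_DOUBLE, space_id, dset_id, hdferr)
  call h5dwrite_f(dset_id, H5T_NATIVE_DOUBLE, x_alive, dims, hdferr)
  call h5dclose_f(dset_id, hdferr)
  call h5sclose_f(space_id, hdferr)
  call h5gclose_f(grp_id, hdferr)
end subroutine write_step

Restructuring the output around the set of particles alive at each step is the 
extra rewriting I mentioned.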

Ideally I would have built all the analysis into the code itself so I did not 
have to write out the trajectories at all (I did not make that choice!), but 
that requires more work than I can handle at the moment. From my estimates, 
using HDF5 will reduce the storage space needed by about a factor of 6, improve 
precision, and significantly reduce the CPU hours needed as well. This is 
already a great improvement!

Cheers,
Yngve

On Wednesday 16 March 2011 02:09:36 PM Werner Benger wrote:
> Hi,
> 
>   what's the reason for using a 2D extendable dataset instead of a sequence
> of 1D arrays in a group, using one group per time step? How many particles
> and time steps do you have typically? I assume in your case the number of
> particles is constant over time?
> 
> Cheers,
>       Werner
> 
> 
> On Wed, 16 Mar 2011 03:52:10 -0500, Yngve Inntjore Levinsen  
> <[email protected]> wrote:
> 
> > Dear hierarchical people,
> >
> > I have recently converted a piece of code from using a simple ascii
> > format for output to using HDF5. At every iteration the code dumps some
> > information about particle energy/trajectory/position to the ascii file
> > (this is a particle tracking code).
> >
> > Initially I did the same with the HDF5 library: an unlimited row
> > dimension in a 2D array, using h5extend_f to extend by one element each
> > time and writing a hyperslab of one row to the file. As some (perhaps
> > most) of you might have guessed or known already, this was a rather bad
> > idea. The file (without compression) was about the same size as the
> > ascii file (though obviously with higher precision), and reading the
> > file in subsequent analysis was at least an order of magnitude slower.
> >
> > I then realized that I probably needed to write less frequently and
> > instead keep a semi-large hyperslab in memory. I chose a hyperslab of
> > 1000 rows, but otherwise used the same procedure. This seems to be fast,
> > and with compression it creates a considerably smaller file. I tried
> > even larger slabs, but did not see any speed improvement in my initial
> > testing.
> >
> > My question is really just whether there are recommended ways to do
> > this. I imagine I am not the first who wants to use HDF5 in this way,
> > dumping some data at every iteration of a given simulation, without
> > having to keep it all in memory until the end?
> >
> > Thanks for any explanations/suggestions/experiences related to this
> > problem that you can provide, so I can make the best design choices in
> > my program! :)
> >
> > Cheers,
> > Yngve
