Karl,

My apologies if this comes out wrong - it is a) my first time posting and b) I 
only get the digest and so had to copy and paste.

We too are developing software for the control of a scientific instrument, and 
frankly I am nervous that my selection of HDF5 might backfire, so I look for 
problems like this one as things to test for. We also log first to a binary 
file and then load the data into HDF5, and because we capture the time stamps 
at source this particular problem would be less of an issue for us.
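
For context, our logging stage is nothing fancier than appending fixed-size
records to a raw binary file, stamping each record at the moment of
acquisition. Something along these lines (a sketch only; the record layout and
field names here are made up for illustration, not our actual format):

    /* Illustrative sketch: append fixed-size records to a raw binary log,
     * taking the timestamp at the moment of acquisition. */
    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    struct record {
        uint64_t t_ns;        /* timestamp captured at source (CLOCK_MONOTONIC) */
        int32_t  sample[24];  /* e.g. one binned value per channel */
    };

    static void log_record(FILE *fp, const int32_t *samples)
    {
        struct record r;
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);   /* stamp first ... */
        r.t_ns = (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
        for (int i = 0; i < 24; i++)
            r.sample[i] = samples[i];

        fwrite(&r, sizeof r, 1, fp);           /* ... then append the record */
    }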

My suspicion is that the increasing delays are due to chunking and perhaps 
compression.

You have 18-bit samples at 250 kHz for 2.8 ms = 700 samples per acquisition.
You are binning this data down to 18 or 20 values.
You mention an overall rate of 6 Hz, so I am guessing you write at around 120 
values/sec/dataset.
Your 9 second period therefore corresponds to roughly every 1080 values.

Yours are presumably extendible datasets, so they must be using chunking (you 
don't mention chunk size or compression). Depending on whether chunk caching 
is enabled and whether you are using compression, that could be the source of 
your delays every 9 seconds: if the chunk size is 1000 (for example), then 
roughly every 9 seconds a chunk fills up, is compressed, and is written to 
disk. The effect will be magnified by the number of channels you are recording 
(24 or so), since with uniform sample rates all those datasets will fill a 
chunk at the same time. I am unclear why the write time would grow as the run 
goes on, but you might want to try varying the chunk size and turning 
compression off to see whether the behaviour changes.
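
In case it is useful, the knobs I have in mind live on the dataset creation 
and dataset access property lists. Here is a rough C sketch of the underlying 
calls (the Java wrapper exposes the same functions through JNI); the chunk 
size and cache numbers are just example values to experiment with, not 
recommendations, and I believe H5Pset_chunk_cache needs 1.8.3 or later:

    /* Sketch: create an extendible 1-D dataset with an explicit chunk size,
     * compression left off, and an enlarged per-dataset chunk cache. */
    #include "hdf5.h"

    hid_t make_channel_dataset(hid_t file, const char *name)
    {
        hsize_t dims[1]    = {0};
        hsize_t maxdims[1] = {H5S_UNLIMITED};
        hsize_t chunk[1]   = {128};            /* try varying this */

        hid_t space = H5Screate_simple(1, dims, maxdims);

        hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
        H5Pset_chunk(dcpl, 1, chunk);
        /* H5Pset_deflate(dcpl, 4); */         /* leave commented out to test without compression */

        hid_t dapl = H5Pcreate(H5P_DATASET_ACCESS);
        /* 521 hash slots, 1 MiB cache, evict fully written chunks first */
        H5Pset_chunk_cache(dapl, 521, 1024 * 1024, 1.0);

        hid_t dset = H5Dcreate2(file, name, H5T_NATIVE_INT, space,
                                H5P_DEFAULT, dcpl, dapl);

        H5Pclose(dapl);
        H5Pclose(dcpl);
        H5Sclose(space);
        return dset;
    }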

Can I ask why you are not capturing the time stamps at the point of measurement?

rgds
Ewan


On Mon, May 9, 2016 at 1:58 PM, Karl Hoover <[email protected]> wrote:

We're developing software for the control of a scientific instrument. At
an overall rate of about 6 Hz, we acquire 2.8 milliseconds' worth of 18-bit
samples at 250 kHz on up to 24 channels. These data are shipped back over
gigabit Ethernet to a Linux PC running a simple Java program. These data can
reliably be written as a byte stream to disk at full speed with extremely
regular timing. Thus we are certain that our data acquisition, Ethernet
transport, Linux PC software and file system are working fine.


However, the users want data in a more portable, summarized format, and we
selected HDF5. The 700 or so 18-bit samples of each channel are integrated
into 18 to 20 time bins. The resulting data sets are thus not very large at
all. I've attached a screenshot of a region of a typical file of typical
size, and an example data file (much smaller than a typical file).


The instrument operates in two distinct modes. In one mode the instrument
is stationary over the region of interest. This is working flawlessly. In
the other mode, the instrument is moved around and about in arbitrary
paths. In this mode the precise time of the data acquisition obviously is
critical. What we observe is that the performance of the system is very
stable at 6 Hz, except that every 9 seconds a delay occurs, starting at
about 10 ms and growing without bound to hundreds of milliseconds. There
is nothing in my software that knows anything about a 9-second interval.
And I've found that this delay only occurs when I *write* the HDF5 file.
All other processing, including creating the HDF5 file, can be performed
without any performance problem. It makes no difference whether I keep the
HDF5 file open or close it each time. I'm using HDF5.8.02 and the JNI /
Java library. Any suggestions about how to fix this problem would be
appreciated.


Best regards,
Karl Hoover

Senior Software Engineer

Geometrics