Hi Kirk,

On Apr 16, 2010, at 2:58 PM, [email protected] wrote:

> Quincey,
> 
> Thanks for the tip. A quick read of "HDF5 Dataset Region References" does
> look promising.
> 
> Would you say the main benefit of Region References is more direct (i.e.
> efficient) construction of the related Hyperslabs upon a data retrieval?
> Perhaps versus saving a start/stop element number within the index element
> and having to build a Hyperslab region from that information alone?

        Yes, definitely.

        Quincey
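
For comparison, the start/stop alternative mentioned above can be sketched in a few lines. This is only the bookkeeping, not HDF5 API code: the names IndexRecord, hyperslab_params, and find_blocks are hypothetical, and it assumes a 1-D, stride-1 dataset in which a block never wraps around the end of the buffer. The (start, count) pair is what would be handed to a hyperslab selection such as H5Sselect_hyperslab.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class IndexRecord:
    """One hypothetical index-table row describing a block of stream data."""
    start_time: float
    stop_time: float
    first_elem: int  # element number of the first record in the block
    last_elem: int   # element number of the last record (inclusive)

    @property
    def num_records(self) -> int:
        return self.last_elem - self.first_elem + 1


def hyperslab_params(rec: IndexRecord) -> Tuple[int, int]:
    """Return (start, count) for a 1-D, stride-1 selection covering the block."""
    if rec.last_elem < rec.first_elem:
        raise ValueError("block wraps or is empty; not handled in this sketch")
    return rec.first_elem, rec.num_records


def find_blocks(index: List[IndexRecord], t0: float, t1: float) -> List[IndexRecord]:
    """Select index rows whose time span overlaps the query window [t0, t1]."""
    return [r for r in index if r.start_time <= t1 and r.stop_time >= t0]


# Two blocks separated by a gap in the stream (made-up values).
index = [IndexRecord(0.0, 9.9, 0, 99), IndexRecord(15.0, 19.9, 100, 149)]
for rec in find_blocks(index, 8.0, 16.0):
    print(hyperslab_params(rec))
```

With a region reference stored in the index record instead, the selection is recovered from the reference itself rather than rebuilt from element numbers like this.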

> Kirk
> 
>> Hi Kirk,
>> 
>> On Apr 16, 2010, at 11:29 AM, Kirk Harrison wrote:
>> 
>>> To account for possible gaps of data within the stream I need to have a
>>> way of indexing blocks of data within the (single) dataset that I write
>>> the data to. (I elected to use a fixed contiguous dataset approach as
>>> opposed to a dynamically sized one using Chunks so that I can better
>>> manage the diskspace and circular buffer.)
>>> 
>>> I am in the process of setting up a (dynamic/chunked) indexing dataset
>>> to access the dataset used to capture the datastream. What I envision
>>> is each record in the index table containing elements such as:
>>> - Start_time
>>> - Stop_time
>>> - Num_Records
>>> - Reference (??? See question 3 below)
>>> Each index record would be used to describe a region in the continuous
>>> dataset used to capture the streamed data (which would further be used
>>> by a client to set up hyperslabs to request specific groups of data).
>>> 
>>> I am still in the process of learning about HDF5 Links. I was thinking I
>>> might be able to simply have the index table contain soft links to the
>>> stream dataset, possibly with properties (Start_time, Stop_time,
>>> Num_Records, etc.).
>>> 
>>> With all of this being said:
>>> 1) Is there a better way to do this within HDF5 (i.e., some built-in
>>> capability to index in this fashion which I have yet to discover)?
>>> 2) Can links even be placed in a table like this (to point to a specific
>>> record in a dataset)?
>>> 3) What is the recommended mechanism for "referencing" a particular
>>> record within a dataset?
>> 
>>      I think the answer to all three questions is: you should use a dataset
>> region reference for this purpose
>> (http://www.hdfgroup.org/HDF5/doc/RM/RM_H5R.html#Reference-Create).
>> 
>>      Quincey
>> 
>>> Kirk
>>> 
>>> -----Original Message-----
>>> From: Mark Miller [mailto:[email protected]]
>>> Sent: Friday, March 26, 2010 3:14 PM
>>> To: Kirk Harrison
>>> Subject: RE: [Hdf-forum] HDF5 Circular Database
>>> 
>>> If you encounter serious performance issues at the I/O level, I'd be
>>> interested to know and may have some suggestions for improvement if
>>> you do.
>>> 
>>> Mark
>>> 
>>> On Fri, 2010-03-26 at 11:02, Kirk Harrison wrote:
>>>> Mark and Quincey,
>>>> 
>>>> Thanks! I will look into Hyperslabs as well. I finally located a
>>>> reference under HDF5 Advanced Topics.
>>>> I have multiple streams of time series data that result from different
>>>> types of processing from the system. The data differs such that I will
>>>> probably try several approaches with each stream in an attempt to
>>>> optimize performance. In the past I have manually programmed this type
>>>> of binary file-based solution and am eager to see what capability and
>>>> performance I can get out of HDF5 for this type of domain. (I also have
>>>> an associate independently evaluating MySQL for comparison.)
>>>> 
>>>> Kirk
>>>> 
>>>> -----Original Message-----
>>>> From: Mark Miller [mailto:[email protected]]
>>>> Sent: Thursday, March 25, 2010 5:59 PM
>>>> To: [email protected]
>>>> Cc: HDF Users Discussion List
>>>> Subject: Re: [Hdf-forum] HDF5 Circular Database
>>>> 
>>>> Well, I had envisioned your 'buffer' as being a collection of datasets.
>>>> 
>>>> You could just have a single dataset that is the 'buffer' and then
>>>> you'd have to use hyperslabs or selections to write to just a portion
>>>> of that dataset (as Quincey already mentioned).
>>>> 
>>>> HTH
>>>> 
>>>> Mark
>>>> 
>>>> On Thu, 2010-03-25 at 14:03, [email protected] wrote:
>>>>> Mark,
>>>>> 
>>>>> I am new to HDF5 and still working my way through the Tutorials. It
>>>>> looks promising thus far, but I have been concerned about the Circular
>>>>> Database implementation.
>>>>> The dataset size will be static, based upon the time duration for
>>>>> which I want to provide data lookup and the data output rate of the
>>>>> sensors. I suppose what I need to figure out then, based on your
>>>>> approach, is how to "seek" to the appropriate location (record) within
>>>>> the dataset for continued writing of the data. This is probably where
>>>>> your suggestion of adding an attribute (time of acquisition) comes
>>>>> into play.
>>>>> 
>>>>> Thanks for the reassurance and the tips,
>>>>> Kirk
>>>>> 
>>>>>> You should be able to do that pretty easily with HDF5.
>>>>>> 
>>>>>> If you are absolutely certain your datasets will never, ever change
>>>>>> in size, you could create an 'empty' database by going through and
>>>>>> creating N datasets (H5Dcreate) of desired size (H5Screate_simple)
>>>>>> but not actually writing anything to any of the datasets.
>>>>>> 
>>>>>> Then, as time evolves, you pick a particular dataset to open
>>>>>> (H5Dopen), write to it (writing afresh if the dataset has yet to be
>>>>>> written to, or overwriting what's already there if it has already
>>>>>> been written to; it makes no difference to the application, which
>>>>>> just calls H5Dwrite), and close it (H5Dclose).
>>>>>> 
>>>>>> If you think you might want to be able to vary dataset size over
>>>>>> time, use 'chunked' datasets (H5Pset_chunk) instead of the default
>>>>>> (contiguous). If you need to maintain other tidbits of information
>>>>>> about the datasets such as time of acquisition, sensor # (whatever),
>>>>>> and that data is 'small' (<16kb), attach attributes (H5Acreate) to
>>>>>> your datasets and overwrite those attributes as you would datasets
>>>>>> (H5Aopen, H5Awrite, H5Aclose).
>>>>>> 
>>>>>> Mark
>>>>>> 
>>>>>> 
>>>>>> On Thu, 2010-03-25 at 13:11, [email protected] wrote:
>>>>>>> I am interested in using HDF5 to manage sensor data within a
>>>>>>> continuous Circular Database/File. I wish to define a database of a
>>>>>>> fixed size to manage a finite amount of historical data. When the
>>>>>>> database file is full (i.e., reaches the defined capacity), I would
>>>>>>> like to begin overwriting the oldest data within the file. This is
>>>>>>> for an application for a system where I only care about the most
>>>>>>> recent data over a specific duration, with obvious constraints on
>>>>>>> the amount of storage available.
>>>>>>> 
>>>>>>> Does HDF5 have such a capability, or does anyone have a recommended
>>>>>>> approach or suggestions?
>>>>>>> 
>>>>>>> Best Regards,
>>>>>>> Kirk Harrison
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> Hdf-forum is for HDF software users discussion.
>>>>>>> [email protected]
>>>>>>> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
>>>>>> --
>>>>>> Mark C. Miller, Lawrence Livermore National Laboratory
>>>>>> ================!!LLNL BUSINESS ONLY!!================
>>>>>> [email protected]      urgent: [email protected]
>>>>>> T:8-6 (925)-423-5901     M/W/Th:7-12,2-7 (530)-753-851
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>> 
>> 
>> 
>> 
> 
> 

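On the "seek" question raised earlier in the thread (where to continue writing in a fixed-size dataset, and how to overwrite the oldest data once it is full), here is a minimal sketch of the index arithmetic only. It is not HDF5 API code; circular_write_slabs and advance are hypothetical names, and it assumes a 1-D dataset of `capacity` records with the current write position kept somewhere persistent (for instance in an attribute, as suggested above). A write that would run past the end of the dataset is split into two (start, count) pieces, each of which could be used as a separate hyperslab selection.

```python
from typing import List, Tuple


def circular_write_slabs(write_pos: int, n_new: int, capacity: int) -> List[Tuple[int, int]]:
    """Split a write of n_new records starting at write_pos into 1-D
    (start, count) pieces that never run past the end of the dataset."""
    if not 0 <= write_pos < capacity:
        raise ValueError("write_pos must be inside the dataset")
    if not 0 < n_new <= capacity:
        raise ValueError("n_new must be between 1 and capacity")
    first = min(n_new, capacity - write_pos)  # records that fit before the end
    slabs = [(write_pos, first)]
    if n_new > first:
        # The remainder wraps to the beginning and overwrites the oldest data.
        slabs.append((0, n_new - first))
    return slabs


def advance(write_pos: int, n_new: int, capacity: int) -> int:
    """Return the write position to use for the next batch of records."""
    return (write_pos + n_new) % capacity


# Made-up example: 1000-record buffer, 25 new records arriving at position 990.
print(circular_write_slabs(990, 25, 1000))  # [(990, 10), (0, 15)]
print(advance(990, 25, 1000))               # 15
```

Any index records that describe the overwritten range would also need to be updated or dropped; that part is left out of this sketch.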

