Hi Val,
On Feb 11, 2011, at 10:27 AM, Val Schmidt wrote:
> Hm.
> This brings up a related question. I like the hierarchical structure of HDF
> files and the file-system-like organization it brings. But it raises the
> question - can you do very fast queries using object names? For example can
> you use wildcards in the same way you might within a file system to pull data
> - ( /root/group/packet-* )
We don't have wildcards per se, but you should be able to use link
iteration (H5Literate or H5Lvisit) to achieve the same result.
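
As a rough sketch of the idea (this is not the HDF5 C API itself): iterate
over the link names in a group and keep the ones that match a shell-style
pattern. The helper name `match_links` is made up for illustration, and it
assumes the group object yields its link names when iterated, which is how
h5py groups and plain dicts behave.

```python
from fnmatch import fnmatchcase


def match_links(group, pattern):
    """Return the link names in `group` that match a shell-style pattern.

    `group` is assumed to yield its member names when iterated
    (an h5py.Group does this; so does a plain dict).
    """
    return sorted(name for name in group if fnmatchcase(name, pattern))


# Stand-in for the links under /root/group; a dict is used so the
# sketch runs without an HDF5 file.
links = {"packet-0001": None, "packet-0002": None, "index": None}
matches = match_links(links, "packet-*")
```

In C the same filter would live in the callback you hand to H5Literate (one
group) or H5Lvisit (recursive). Either way every link in the group is still
visited, so the cost grows with the number of links.
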
Quincey
> -Val
>
>
> On Feb 11, 2011, at 11:54 AM, Mitchell, Scott - IS wrote:
>
>> The first search is pretty straightforward. My link is pretty simple,
>> there's a 1:1 correspondence between the line numbers in the time scale &
>> the dataset.
>>
>> The two other searches have to be brute forced from within the Packet Table
>> interface (H5PT) by iterating each line to just pull the individual
>> field(s). There may be a better way from the dataset (H5D). I've stuck with
>> the PT interface because I generally grab the whole dataset and it
>> simplifies the process of adding new data.
>>
>>
>>
>> Scott
>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]]
>>> On Behalf Of Val Schmidt
>>> Sent: Friday, February 11, 2011 11:28 AM
>>> To: HDF Users Discussion List
>>> Subject: Re: [Hdf-forum] hdf suitability for packetized data
>>>
>>> Your question is a good one.
>>>
>>> I would need to be able to pull a full record (or set of records) within a
>>> set of time bounds.
>>> I would need to be able to pull some field from all records for all times -
>>> as a time series.
>>> I might need to be able to pull all the fields within some field range for
>>> all times.
>>>
>>> I'm thinking of something similar to what you have done (I think) - that is,
>>> to self-index the file. The index would be in its own dataset with an array
>>> of time records and perhaps a few other fields and relative links (I forget
>>> what HDF5 calls them) to the actual data records.
>>>
>>> -Val
>>>
>>> On Feb 11, 2011, at 10:56 AM, Mitchell, Scott - IS wrote:
>>>
>>>> I'm doing something similar to what you are looking at. I have data coming
>>>> in from multiple instruments which go through processing and result in one
>>>> or several C# structures/arrays. In my example each instrument type has a
>>>> structure containing Packet Tables with associated time axes/scales. The
>>>> packet table structure mimics the instrument data structures.
>>>>
>>>> Metadata is held in Attributes and other Packet Tables. I've created a
>>>> standard across the program, with specifics defined for each instrument.
>>>>
>>>> I end up storing each individual instrument's data in its own file. In most
>>>> cases, a single thread processes and stores data, so I don't have to worry
>>>> about synchronization (as much).
>>>>
>>>>
>>>> I believe you'll want to store each data type in its own dataset or file,
>>>> both so you can search by data type and because of data-length issues. How
>>>> are you expecting to search?
>>>>
>>>> In my case, we allow users to 'play back' the data. I have the time scale
>>>> as a separate dataset so I can do random access lookups without having to
>>>> load large data records to find a specific time.
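
A minimal sketch of that kind of lookup, assuming the time scale is sorted
and that row i of the time scale lines up with row i of the data table. The
function name `rows_in_window` and the sample times are invented for
illustration; only the small time scale is searched, never the data records.

```python
from bisect import bisect_left, bisect_right


def rows_in_window(times, t_start, t_end):
    """Return (first, last_exclusive) row numbers with t_start <= t <= t_end.

    `times` must be sorted ascending; row i of the time scale is assumed
    to correspond to row i of the data table.
    """
    first = bisect_left(times, t_start)
    last = bisect_right(times, t_end)
    return first, last


# Hypothetical time scale in seconds, one entry per data record.
time_scale = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
first, last = rows_in_window(time_scale, 0.5, 1.5)
```

The resulting row range could then be read from the data table as one
contiguous slice rather than by scanning records. An empty window comes back
as first == last.
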
>>>>
>>>>
>>>> Scott
>>>>
>>>>> -----Original Message-----
>>>>> From: [email protected] [mailto:hdf-forum-
>>> [email protected]]
>>>>> On Behalf Of Val Schmidt
>>>>> Sent: Thursday, February 10, 2011 6:10 PM
>>>>> To: [email protected]
>>>>> Subject: [Hdf-forum] hdf suitability for packetized data
>>>>>
>>>>> Hello everyone,
>>>>>
>>>>> I am new to HDF and am trying to understand whether or not it might be a
>>>>> suitable file format for my application. The data I'm interested in storing
>>>>> is usually written by the collecting instrument to basic binary files of
>>>>> concatenated packets (think C structures), each of which contains a header
>>>>> with a time stamp, packet format, packet identifier, and packet size,
>>>>> followed by the data itself (arrays) and associated metadata. There are tens
>>>>> of types of packets that may come in any order and they are usually written
>>>>> to the file sequentially. Packets contain from 10-100 fields, some of which
>>>>> may be arrays of data of various sizes.
>>>>>
>>>>> This format allows one to relatively quickly index a file by passing
>>>>> through the file and parsing only these headers. Then one can use the index
>>>>> to pull subsets of the data in a non-linear fashion, sometimes
>>>>> simultaneously in multiple threads for quite fast reading. The problem is
>>>>> that every instrument manufacturer has their own method of encoding packets
>>>>> and a single format is needed for archival purposes.
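
For concreteness, the header-only indexing pass described above might look
roughly like this. The header layout (float64 timestamp, uint16 format,
uint16 packet id, uint32 payload size, little-endian) and the name
`build_index` are assumptions made up for the sketch, not any real
instrument's format.

```python
import io
import struct

# Hypothetical header layout (not any real instrument's format):
# float64 timestamp, uint16 format, uint16 packet id, uint32 payload size.
HEADER = struct.Struct("<dHHI")


def build_index(stream):
    """Scan a stream of concatenated packets, reading only the headers.

    Returns a list of (timestamp, packet_id, payload_offset, payload_size).
    """
    index = []
    while True:
        raw = stream.read(HEADER.size)
        if len(raw) < HEADER.size:
            break  # end of stream (or a truncated trailing header)
        timestamp, _fmt, packet_id, size = HEADER.unpack(raw)
        index.append((timestamp, packet_id, stream.tell(), size))
        stream.seek(size, io.SEEK_CUR)  # skip the payload without reading it
    return index


# Two fake packets with 3- and 5-byte payloads.
data = (HEADER.pack(1.0, 1, 7, 3) + b"abc" +
        HEADER.pack(2.0, 1, 9, 5) + b"hello")
index = build_index(io.BytesIO(data))
```

Each index entry carries the payload offset and size, so a reader can seek
straight to any packet, and separate threads can each work from their own
file handle.
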
>>>>>
>>>>> My question to you is how might a similar model be implemented in HDF5
>>>>> such that the same kind of indexing and parallel data retrieval is
>>>>> possible? What is to be avoided is the need to read through a file
>>>>> sequentially to get to the fields to extract.
>>>>>
>>>>> It seems like HDF5 should handle this kind of thing well, but because I am
>>>>> inexperienced and because most folks using it seem to be storing
>>>>> relatively small numbers of very large arrays (imagery in many cases),
>>>>> rather than relatively large numbers of records with fewer fields and
>>>>> smaller arrays, it is not clear to me how such an implementation might
>>>>> perform. So I guess I'm also asking, what is the relative penalty for
>>>>> writing lots of small sets of data?
>>>>>
>>>>> I hope this makes sense.
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> Val
>>>>> ------------------------------------------------------
>>>>> Val Schmidt
>>>>> CCOM/JHC
>>>>> University of New Hampshire
>>>>> Chase Ocean Engineering Lab
>>>>> 24 Colovos Road
>>>>> Durham, NH 03824
>>>>> e: vschmidt [AT] ccom.unh.edu
>>>>> m: 614.286.3726
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Hdf-forum is for HDF software users discussion.
>>>>> [email protected]
>>>>> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
>>>>
>>>> This e-mail and any files transmitted with it may be proprietary and are
>>>> intended solely for the use of the individual or entity to whom they are
>>>> addressed. If you have received this e-mail in error please notify the
>>>> sender.
>>>> Please note that any views or opinions presented in this e-mail are solely
>>>> those of the author and do not necessarily represent those of ITT
>>>> Corporation.
>>>> The recipient should check this e-mail and any attachments for the presence
>>>> of viruses. ITT accepts no liability for any damage caused by any virus
>>>> transmitted by this e-mail.
>>>>
>>>
>>
>