Hi Val,

On Feb 11, 2011, at 10:27 AM, Val Schmidt wrote:

> Hm.
> This brings up a related question. I like the hierarchical structure of HDF 
> files and the file-system-like organization it brings. But it raises the 
> question: can you do very fast queries using object names? For example, can 
> you use wildcards in the same way you might within a file system to pull data 
> ( /root/group/packet-* )?

        We don't have wildcards per se, but you should be able to use link 
iteration (H5Literate or H5Lvisit) to achieve the same result.
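
A minimal sketch of the idea, in Python rather than the C API: walk the link names in a group and keep the ones that match a shell-style pattern. The helper name `match_links` is hypothetical, and it accepts any iterable of names so it runs without HDF5 installed. If you use h5py (an assumption on my part, it has not come up in this thread), a group object iterates over its member names, so the group itself could be passed in place of the list. Note that this still visits every link in the group, so the cost grows with the number of links.

```python
from fnmatch import fnmatchcase


def match_links(names, pattern):
    """Return the link names matching a shell-style pattern, sorted."""
    # fnmatchcase is case-sensitive on every platform, which matches
    # how HDF5 treats link names.
    return sorted(name for name in names if fnmatchcase(name, pattern))


# Stand-in for the member names of /root/group (illustrative values only).
members = ["packet-0002", "index", "packet-0001", "timescale"]
print(match_links(members, "packet-*"))
```

With the C API the same filter would live inside the callback handed to H5Literate.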

        Quincey

> -Val 
> 
> 
> On Feb 11, 2011, at 11:54 AM, Mitchell, Scott - IS wrote:
> 
>> The first search is pretty straightforward. My link is pretty simple: 
>> there's a 1:1 correspondence between the line numbers in the time scale & 
>> the dataset.
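
For the time-bounds search, that 1:1 correspondence means a binary search over the time scale gives the row range directly. Here is a rough sketch, assuming the time scale is sorted in ascending order and has been read into memory as a plain sequence. The function name and the sample values are hypothetical.

```python
from bisect import bisect_left, bisect_right


def rows_in_time_bounds(times, t_start, t_end):
    """Return (first, stop) so that times[first:stop] lies in [t_start, t_end].

    Assumes `times` is sorted ascending. Because row i of the time scale
    corresponds to row i of the data, the same slice selects the records.
    """
    first = bisect_left(times, t_start)   # first row with time >= t_start
    stop = bisect_right(times, t_end)     # one past the last row with time <= t_end
    return first, stop


# Stand-in for the time-scale dataset (illustrative values only).
times = [10.0, 10.5, 11.0, 11.5, 12.0]
first, stop = rows_in_time_bounds(times, 10.5, 11.5)
print(first, stop, times[first:stop])
```

Only the rows in `first:stop` then need to be read from the data table, rather than the whole dataset.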
>> 
>> The two other searches have to be brute forced from within the Packet Table 
>> interface (H5PT) by iterating each line to just pull the individual 
>> field(s). There may be a better way from the dataset (H5D). I've stuck with 
>> the PT interface because I generally grab the whole dataset and it 
>> simplifies the process of adding new data.
>> 
>> 
>> 
>> Scott
>> 
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]]
>>> On Behalf Of Val Schmidt
>>> Sent: Friday, February 11, 2011 11:28 AM
>>> To: HDF Users Discussion List
>>> Subject: Re: [Hdf-forum] hdf suitability for packetized data
>>> 
>>> Your question is a good one.
>>> 
>>> I would need to be able to pull a full record (or set of records) within a
>>> set of time bounds.
>>> I would need to be able to pull some field from all records for all times,
>>> as a time series.
>>> I might need to be able to pull all the fields within some field range for
>>> all times.
>>> 
>>> I'm thinking of something similar to what you have done (I think) - that is,
>>> to self-index the file. The index would be in its own dataset with an array
>>> of time records and perhaps a few other fields and relative links (I forget
>>> what HDF5 calls them) to the actual data records.
>>> 
>>> -Val
>>> 
>>> On Feb 11, 2011, at 10:56 AM, Mitchell, Scott - IS wrote:
>>> 
>>>> I'm doing something similar to what you are looking at. I have data coming
>>>> in from multiple instruments which go through processing and result in one
>>>> or several C# structures/arrays. In my example each instrument type has a
>>>> structure containing Packet Tables with associated time axes/scales. The
>>>> packet table structure mimics the instrument data structures.
>>>> 
>>>> Metadata is held in Attributes and other Packet Tables. I've created a
>>>> standard across the program, with specifics defined for each instrument.
>>>> 
>>>> I end up storing each individual instrument's data in its own file. In most
>>>> cases, a single thread processes and stores data, so I don't have to worry
>>>> about synchronization (as much).
>>>> 
>>>> 
>>>> I believe you'll want to store each data type in its own dataset or file,
>>>> both to make searching by data type possible and to avoid data length
>>>> issues. How are you expecting to search?
>>>> 
>>>> In my case, we allow users to 'play back' the data. I have the time scale
>>>> as a separate dataset so I can do random access lookups without having to
>>>> load large data records to find a specific time.
>>>> 
>>>> 
>>>> Scott
>>>> 
>>>>> -----Original Message-----
>>>>> From: [email protected] [mailto:hdf-forum-
>>> [email protected]]
>>>>> On Behalf Of Val Schmidt
>>>>> Sent: Thursday, February 10, 2011 6:10 PM
>>>>> To: [email protected]
>>>>> Subject: [Hdf-forum] hdf suitability for packetized data
>>>>> 
>>>>> Hello everyone,
>>>>> 
>>>>> I am new to HDF and am trying to understand whether or not it might be a
>>>>> suitable file format for my application. The data I'm interested in storing
>>>>> is usually written by the collecting instrument to basic binary files of
>>>>> concatenated packets (think C structures), each of which contains a header
>>>>> with a time stamp, packet format, packet identifier, and packet size,
>>>>> followed by the data itself (arrays) and associated metadata. There are tens
>>>>> of types of packets that may come in any order, and they are usually written
>>>>> to the file sequentially. Packets contain from 10-100 fields, some of which
>>>>> may be arrays of data of various sizes.
>>>>> 
>>>>> This format allows one to index a file relatively quickly by passing
>>>>> through the file and parsing only these headers. Then one can use the index
>>>>> to pull subsets of the data in a non-linear fashion, sometimes
>>>>> simultaneously in multiple threads for quite fast reading. The problem is
>>>>> that every instrument manufacturer has their own method of encoding packets
>>>>> and a single format is needed for archival purposes.
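
To make the header-only indexing pass concrete, here is a small sketch in Python. The header layout (float64 time stamp, uint16 format, uint16 packet identifier, uint32 payload size, little-endian) is invented for illustration and does not correspond to any real instrument. `build_index` and `make_packet` are hypothetical names.

```python
import io
import struct

# Hypothetical header layout, for illustration only:
# float64 timestamp, uint16 format, uint16 packet id, uint32 payload size.
HEADER = struct.Struct("<dHHI")


def build_index(stream):
    """Scan concatenated packets, reading only the headers.

    Returns a list of (timestamp, packet_id, payload_offset, payload_size).
    """
    index = []
    while True:
        raw = stream.read(HEADER.size)
        if len(raw) < HEADER.size:
            break  # end of stream, or a truncated trailing header
        timestamp, _fmt, packet_id, size = HEADER.unpack(raw)
        index.append((timestamp, packet_id, stream.tell(), size))
        stream.seek(size, io.SEEK_CUR)  # skip the payload without reading it
    return index


def make_packet(timestamp, packet_id, payload):
    """Build one packet for the demonstration below."""
    return HEADER.pack(timestamp, 1, packet_id, len(payload)) + payload


stream = io.BytesIO(make_packet(1.0, 7, b"abc") + make_packet(2.0, 9, b"hello"))
index = build_index(stream)

# Random access: jump straight to the second packet's payload.
_, _, offset, size = index[1]
stream.seek(offset)
print(stream.read(size))
```

An HDF5 equivalent would keep the same kind of (time, type, location) table in its own dataset, so that readers can seek to records without scanning the data.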
>>>>> 
>>>>> My question to you is how might a similar model be implemented in HDF5
>>>>> such that the same kind of indexing and parallel data retrieval is
>>>>> possible? What is to be avoided is the need to read through a file
>>>>> sequentially to get to the fields to extract.
>>>>> 
>>>>> It seems like HDF5 should handle this kind of thing well, but because I am
>>>>> inexperienced and because most folks using it seem to be storing relatively
>>>>> small numbers of very large arrays (imagery in many cases), rather than
>>>>> relatively large numbers of small records with fewer fields and smaller
>>>>> arrays, it is not clear to me how such an implementation might perform. So
>>>>> I guess I'm also asking, what is the relative penalty for writing lots of
>>>>> small sets of data?
>>>>> 
>>>>> I hope this makes sense.
>>>>> 
>>>>> Thanks in advance,
>>>>> 
>>>>> Val
>>>>> ------------------------------------------------------
>>>>> Val Schmidt
>>>>> CCOM/JHC
>>>>> University of New Hampshire
>>>>> Chase Ocean Engineering Lab
>>>>> 24 Colovos Road
>>>>> Durham, NH 03824
>>>>> e: vschmidt [AT] ccom.unh.edu
>>>>> m: 614.286.3726
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> Hdf-forum is for HDF software users discussion.
>>>>> [email protected]
>>>>> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
>>>> 
>>>> This e-mail and any files transmitted with it may be proprietary and are
>>>> intended solely for the use of the individual or entity to whom they are
>>>> addressed. If you have received this e-mail in error please notify the
>>>> sender. Please note that any views or opinions presented in this e-mail are
>>>> solely those of the author and do not necessarily represent those of ITT
>>>> Corporation. The recipient should check this e-mail and any attachments for
>>>> the presence of viruses. ITT accepts no liability for any damage caused by
>>>> any virus transmitted by this e-mail.
>>>> 
>>> 
>> 
> 


