Hi Steve,

On Aug 16, 2010, at 5:52 AM, BISSELL, Stephen wrote:

> Thanks for the quick reply, Quincey. I won't be able to investigate this
> further for a while, as I'm away on holiday as of tomorrow :-) and have
> been tagged for an urgent job on my return :-(  - that should be of
> short duration, so I'll be able to look into this properly when that's
> done.

        OK, we'll wait to hear from you then.  Happy holiday! :-)

> My worst case fallback would be to store each bit as a single byte
> bitfield type of precision 1 (to preserve the information that the type
> is a single bit); I'll do some tests to see what the file size penalty
> is.

        Yes, that'll definitely work.
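
For illustration, the fallback could be sketched in plain C++ as below. This is only a sketch under assumptions: unpackBits and packBits are hypothetical helper names (not part of HDF5), and bit 0 is assumed to be the least significant bit, which the compiler's bitfield layout does not guarantee. Each resulting one-byte flag could then be stored as its own compound member.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an HDF5 call): expand one packed byte into
// eight one-byte flags. Assumes bit 0 is the least significant bit.
std::array<std::uint8_t, 8> unpackBits(std::uint8_t packed)
{
    std::array<std::uint8_t, 8> flags{};
    for (int bit = 0; bit < 8; ++bit) {
        flags[bit] = static_cast<std::uint8_t>((packed >> bit) & 1u);
    }
    return flags;
}

// Hypothetical inverse: rebuild the packed byte from the eight flags,
// treating any non-zero flag as a set bit.
std::uint8_t packBits(const std::array<std::uint8_t, 8>& flags)
{
    std::uint8_t packed = 0;
    for (int bit = 0; bit < 8; ++bit) {
        if (flags[bit] != 0) {
            packed = static_cast<std::uint8_t>(packed | (1u << bit));
        }
    }
    return packed;
}
```

The cost is eight stored bytes per original byte before any compression, which is the file size penalty Steve proposes to measure.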

> On a related note, I notice that there is an enumeration type, but it
> does not seem possible to define a string for "any OTHER value". This is
> necessary to be able to use your enumeration scheme to map LOGICAL
> values that need to be mapped to one of exactly two strings, e.g. 
> == 0 ==> VALVE_OPEN
> == ANY OTHER VALUE ==> VALVE_CLOSED.

        Interesting idea, I'll add it to our issue tracker.
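
Until something like that exists in the library, the "any OTHER value" mapping could live in a middleware layer. A minimal sketch, assuming a hypothetical LabelMap class (nothing here is HDF5 API): explicit values are looked up first and every other value falls back to a default string.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical middleware class: maps listed integer values to labels
// and returns a fallback label for any value that is not listed.
class LabelMap {
public:
    LabelMap(std::map<int, std::string> labels, std::string otherLabel)
        : labels_(std::move(labels)), otherLabel_(std::move(otherLabel)) {}

    const std::string& labelFor(int value) const
    {
        auto it = labels_.find(value);
        return it != labels_.end() ? it->second : otherLabel_;
    }

private:
    std::map<int, std::string> labels_;
    std::string otherLabel_;
};

// The mapping from the example above: 0 is open, anything else is closed.
LabelMap makeValveLabels()
{
    return LabelMap({{0, "VALVE_OPEN"}}, "VALVE_CLOSED");
}
```

With this sketch, makeValveLabels().labelFor(0) gives "VALVE_OPEN" and any other argument gives "VALVE_CLOSED".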

> Further, the enumeration scheme of HDF5 seems to be tightly coupled to
> the integer datatype, whereas on our system, "enumeration" is just a
> view/transform applicable to any datatype for which you can define a
> transform (e.g. we might choose to map float data to an enumeration set
> via a rounding transform).

        Yes, this is already in our issue tracker. :-)
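
As a rough illustration of an enumeration used as a view over float data, the rounding transform could be applied outside HDF5 before the label lookup. The names levelLabels and labelForReading, and the LOW/HIGH/UNKNOWN strings, are made up for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Hypothetical label set, keyed by the rounded value.
const std::map<long, std::string>& levelLabels()
{
    static const std::map<long, std::string> labels{{0, "LOW"}, {1, "HIGH"}};
    return labels;
}

// Round the reading to the nearest integer, then look up its label;
// readings that round to an unlisted value get the fallback label.
std::string labelForReading(double reading,
                            const std::map<long, std::string>& labels,
                            const std::string& otherLabel)
{
    const long key = std::lround(reading);
    auto it = labels.find(key);
    return it != labels.end() ? it->second : otherLabel;
}
```

Here the transform is std::lround, but any function from the stored datatype to an integer key would fit the same pattern.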

> This can all be handled at my middleware layer between HDF5 and my
> client interface via attributes and/or an XML in the user-block, I guess,
> but I will have a quick look at what might be involved in making bit
> types, at least, "native" to HDF5.

        OK, let me know how it goes.

> Finally, I notice from other posts that "HDF5 does not currently support
> attaching attributes to fields of compound types". But if the fields
> within a compound type are user-defined types, and those user-defined
> types themselves have attributes attached, doesn't that achieve the
> required end of defining meta-data for compound fields?

        Yes, that's a potential work-around, although it requires using 
committed datatypes in the file, which may not work for some use cases and 
might be pretty complicated for others.  I'd like to have a more obvious, 
self-describing implementation that would work in all situations.

        Quincey

> Thanks again,
> Steve
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Quincey Koziol
> Sent: 13 August 2010 21:42
> To: HDF Users Discussion List
> Subject: Re: [Hdf-forum] Possible to pack bit types into compound data
> types?
> 
> Hi Steve,
>       HDF5 does allow creating bitfield datatypes, but the underlying
> type must currently be an integral number of bytes in size.  It sounds
> like a reasonable extension to allow some way to pack bitfields into one
> underlying byte, but we haven't explored it seriously.  If you'd like to
> think about it and propose an interface that you think would work, that
> would kick off the discussion nicely. :-)
> 
>       Quincey
> 
> On Aug 13, 2010, at 7:04 AM, Steve Bissell wrote:
> 
>> 
>> I am working on an application to record data in HDF5 format, and I'm
>> completely new to it.
>> The data is in the form of packets, each of which has an associated
>> timestamp and class. 
>> Therefore, it would seem appropriate to use the FL_PacketTable class (99% of
>> the packets are fixed length, so this is my core use case).
>> The class of the packet indicates the packet contents, and each class
>> appears to map naturally to the HDF5 "Compound" data type, with a struct for
>> each class of packet.
>> Note also that data is retrieved from a legacy file format that uses
>> individual bits to represent certain data.
>> 
>> So far, so good. I can produce an hdf5 file with the following code
>> (C++/win32/VisStudio2005); assume that the file object and the group V3 are
>> defined.
>> 
>> //structured data - "compound" in the HDF5 terminology.
>> struct _my_type {
>>  double t;//e.g. time.
>>  int a;
>>  float b;
>> };
>> CompType mtype1( sizeof(_my_type) );
>> mtype1.insertMember( "time", HOFFSET(_my_type, t), PredType::NATIVE_DOUBLE);
>> mtype1.insertMember( "alt", HOFFSET(_my_type, a), PredType::NATIVE_INT);
>> mtype1.insertMember( "math", HOFFSET(_my_type, b), PredType::NATIVE_FLOAT);
>> 
>> FL_PacketTable pt(V3.getId(),"Packets",mtype1.getId(),500,6);
>> _my_type s1;
>> for (int i = 0; i< 400000; i++)
>> {
>>      s1.t = i/10.f;//monotonic time
>>      s1.a = i % 10;//sawtooth integer data
>>      s1.b = 100.f/(i+1);//math function 
>>      pt.AppendPacket(&s1);
>> }
>> 
>> The resulting file is maximally self describing, in that when opened with
>> hdfView, I see a packet table with columns headed time, alt, math, and my
>> "packets" in the records below.
>> 
>> Now what I would like to do is achieve the same maximally self describing
>> file for the amended compound type:
>> 
>> struct _my_type {
>>  double t;//e.g. time.
>>  int a;
>>  float b;//so far, so easy....
>>  //BUT, we would also like...
>>  union {
>>         struct {
>>                 //ideally, should be able to map each bit's value
>>                 //to one of a pair of strings, e.g. "VALVE_OPEN" / "VALVE_CLOSED"
>>                 //by using, perhaps, something like the ENUMERATION feature of HDF5.
>>                 unsigned char bit0 : 1;
>>                 unsigned char bit1 : 1;
>>                 unsigned char bit2 : 1;
>>                 unsigned char bit3 : 1;
>>                 unsigned char bit4 : 1;
>>                 unsigned char bit5 : 1;
>>                 unsigned char bit6 : 1;
>>                 unsigned char bit7 : 1;
>>         };
>>         //..and ideally would ALSO like to be able to retrieve the entire field,
>>         //as below....
>>         unsigned char wholebyte;
>>  };
>> };
>> 
>> 
>> If I now amend my code to do:
>> 
>> mtype1.insertMember( "wholebyte", HOFFSET(_my_type, wholebyte),
>> PredType::NATIVE_UCHAR);
>> s1.wholebyte = 0;
>> 
>> for (int i = 0; i< 400000; i++)
>> {
>>      s1.t = i/10.f;//monotonic time
>>      s1.a = i % 10;//sawtooth integer data
>>      s1.b = 100.f/(i+1);//math function 
>>      s1.bit1 = ( (0 == (i % 20)) ? 1 : 0);//bit1 goes true every 20th element
>>      s1.bit2 = ( (10 < (i % 20)) ? 1 : 0);//bit2 goes true about 1/2 the time
>>      s1.bit3 = ( (10 > (i % 30)) ? 1 : 0);//bit3 goes true about 1/3 the time
>>      pt.AppendPacket(&s1);
>> }
>> 
>> 
>> then I do indeed see "wholebyte" and its data as an extra column in hdfview.
>> But end-users will certainly want to see individual bit values, rather than
>> the entire byte.
>> 
>> So - and this is my problem - if I do this instead (i.e. I do not insert
>> wholebyte):
>> 
>> //Create single bit transient types, then commit them to the dataset.
>> //Q: are these types modifying the original types, or are they "copies" in
>> //the H5Tcopy sense?
>> //Not yet clear without examining c++ library behaviour further.....
>> IntType mySingleBit1Type(PredType::STD_B8LE);
>> mySingleBit1Type.setPrecision(1);
>> mySingleBit1Type.setOffset(1);
>> mySingleBit1Type.commit(V3,"Bit1Type");
>> 
>> mtype1.insertMember( "bit1", HOFFSET(_my_type,wholebyte), mySingleBit1Type);
>> 
>> Then I do NOT see "bit1" as a field in the packet table using hdfview - that
>> is, the "self describing" aspect fails.
>> 
>> Worse, if I attempt to define and insert another bit type, as below:
>> 
>> IntType mySingleBit2Type(PredType::STD_B8LE);
>> mySingleBit2Type.setPrecision(1);
>> mySingleBit2Type.setOffset(2);
>> mySingleBit2Type.commit(V3,"Bit2Type");
>> mtype1.insertMember( "bit2", HOFFSET(_my_type,wholebyte), mySingleBit2Type);
>> 
>> Then I get a "member overlaps with another member" exception from
>> H5Tcompound.c. This is not surprising, since the API only appears to allow
>> BYTE offsets.
>> 
>> Now some obvious, but ugly workarounds exist. I could, for example, store my
>> original bit data as bytes. But this would be very inefficient, in terms of
>> storage, unless the magic of compression would reduce the problem .....
>> 
>> I can't believe I'm the first person to encounter this issue, much more
>> likely is that I'm still too stupid to understand how best to define the bit
>> fields. Does anyone have any ideas? I'm aware that the above code may not be
>> completely platform portable in theory due to the C specification not
>> specifying exactly where bits might be put within the machine word, but this
>> isn't an issue in our case (at the moment!)
>> Thanks!
>> 
>> -- 
>> View this message in context:
>> http://hdf-forum.184993.n3.nabble.com/Possible-to-pack-bit-types-into-compound-data-types-tp1131024p1131024.html
>> Sent from the hdf-forum mailing list archive at Nabble.com.
>> 
>> _______________________________________________
>> Hdf-forum is for HDF software users discussion.
>> [email protected]
>> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
> 
> 

