Re: Can bits be cast to string, concatenated, and then cast to an unsigned int?

Steve Lawrence Tue, 12 Feb 2019 16:28:30 -0800

I would agree based on your description. In that case, I think you just
need to set the bitOrder property to "leastSignificantBitFirst" and the
fields will parse as you would expect. The Ordinal_Number element will
parse exactly the same as before since it starts and ends on a byte
boundary (bitOrder has no effect if everything is on a byte boundary).
The Zero element will then consume all of the third byte and the 7
least significant bits of the 4th byte. And the Ordinal_Flag element
will consume the most significant bit of the last byte.
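
For illustration, here is a rough sketch of what the ordinal branch might
look like with that property set. The element names and the
unsignedint16/unsignedint15/unsignedint1 types are the ones from your schema.
Putting the properties directly on each element is just an assumption on my
part (a dfdl:format default should work as well), and I'm also assuming
byteOrder is littleEndian, which as far as I know is required when bitOrder
is leastSignificantBitFirst:

```xml
<!-- Sketch only: where the bitOrder/byteOrder properties are placed is an
     assumption; they could equally be inherited from a dfdl:format. -->
<xs:sequence>
  <!-- First two bytes (starts and ends on a byte boundary) -->
  <xs:element name="Ordinal_Number" type="unsignedint16"
              dfdl:byteOrder="littleEndian"
              dfdl:bitOrder="leastSignificantBitFirst" />
  <!-- All of the third byte plus the 7 least significant bits of the 4th -->
  <xs:element name="Zero" type="unsignedint15" fixed="0"
              dfdl:byteOrder="littleEndian"
              dfdl:bitOrder="leastSignificantBitFirst" />
  <!-- Most significant bit of the last byte (bit 31) -->
  <xs:element name="Ordinal_Flag" type="unsignedint1"
              dfdl:byteOrder="littleEndian"
              dfdl:bitOrder="leastSignificantBitFirst">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:discriminator test="{ . eq 1 }" />
      </xs:appinfo>
    </xs:annotation>
  </xs:element>
</xs:sequence>
```

The other choice branch (Hint_Name_Table_RVA followed by Name_Flag) would
need the same two properties so that both branches read the 32 bits the same
way.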


- Steve


On 2/12/19 5:09 PM, Costello, Roger L. wrote:
> Thank you Steve! The fixed="0" works like a charm.
> 
> But I just realized a potentially fatal wrinkle in this problem.
> 
> EXE files are in Little Endian format .... oops!
> 
> The EXE specification says that the "most significant bit (bit 31)" is the
> thing that determines how to interpret the other 31 bits.
> 
> In a 4-byte Little Endian field, the most significant bit (bit 31) is the
> first bit of the last byte:
> 
> I have been thinking that the most significant bit is the last bit of the
> last byte, which is wrong.
> 
> So, this is not correct:
> 
> <xs:sequence>
>     <xs:element name="Ordinal_Number" type="unsignedint16"/>
>     <xs:element name="Zero" type="unsignedint15" fixed="0"/>
>     <xs:element name="Ordinal_Flag" type="unsignedint1">
>         <xs:annotation>
>             <xs:appinfo source="http://www.ogf.org/dfdl/">
>                 <dfdl:discriminator test="{ . eq 1 }"/>
>             </xs:appinfo>
>         </xs:annotation>
>     </xs:element>
> </xs:sequence>
> 
> Do you agree?
> 
> /Roger
> 
> -----Original Message-----
> From: Steve Lawrence <[email protected]>
> Sent: Tuesday, February 12, 2019 7:42 AM
> To: [email protected]; Costello, Roger L. <[email protected]>
> Subject: [EXT] Re: Can bits be cast to string, concatenated, and then cast to an unsigned int?
> 
> I think the reason is the discriminator on the Zero element.
> 
> When a discriminator test passes, that tells Daffodil that the surrounding
> point of uncertainty (PoU) has been resolved and that Daffodil has taken the
> correct branch. If something later causes that branch to fail, Daffodil will
> not try any other branches related to that PoU, but will instead continue to
> backtrack past the PoU to the next one.
> 
> So in this case we parse the Zero element and it just happens to be all
> zeros, even though we know we're in the wrong branch. The { . eq 0 } test on
> the Zero element passes, alerting Daffodil that it has chosen the correct
> choice branch. Then it parses the Ordinal_Flag element, which does not have
> a value of 1, so its { . eq 1 } discriminator fails, causing Daffodil to
> backtrack. But we've already resolved the choice PoU, so we don't try the
> other branch.
> 
> The reason it works when you swap the order of the choice branches is that
> it parses the correct choice branch first: the Zero discriminator was never
> set to tell it not to try this branch.
> 
> So what's the right fix?
> 
> First of all, the discriminator on the Zero element should definitely change.
> The Zero element having a value of 0 doesn't mean that we've taken the
> correct branch, which is what the discriminator will say.
> 
> One approach would be to change that dfdl:discriminator to a dfdl:assert,
> which will not resolve the PoU when the value of Zero is 0, but will cause
> the same backtracking behavior when it's not 0. This will allow Daffodil to
> attempt the second branch.
> 
> However, I would argue that that isn't necessarily the right behavior.
> 
> Regardless of the value of Zero, you only want to backtrack and try the
> second branch when Ordinal_Flag is not 1. That is the thing that determines
> which choice branch we should take, not whether Zero has the correct value.
> So I would argue that the Zero element being 0 should be a validation check,
> not a runtime parse check.
> 
> Syntactically, it's fine if that value is non-zero, but when someone wants to
> validate whether what was parsed actually looked like an EXE, that being
> non-zero would raise a flag that something wasn't right. So I would recommend
> doing something like this:
> 
>    <xs:element name="Zero" type="unsignedint15" fixed="0" />
> 
> Setting the "fixed" attribute will not affect the parse at all, but will
> cause a validation error if the value wasn't correct.
> 
> - Steve
> 
> On 2/12/19 4:21 AM, Costello, Roger L. wrote:
> 
>  > Thanks Steve - awesome!
> 
>  >
> 
>  > I tried the second approach that you provided. See below. When I run it,
>  > I get this error message:
> 
>  >
> 
>  > [error] Parse Error: Failed to populate Lookup_Table_Entry[1]. Cause:
> 
>  > Parse Error: All choice alternatives failed. Reason(s): List(Parse
> 
>  > Error: Alternative failed. Reason(s): List(Parse Error: Assertion
> 
>  > failed: { . eq 1 } failed
> 
>  >
> 
>  > On the other hand, if I switch the order of the sequences in
> 
>  > xs:choice, then I get no error. Why do I get an error with the
> 
>  > sequences in one order but no error in another order?  /Roger
> 
>  >
> 
>  > <xs:element name="Lookup_Table_Entry" dfdl:length="32" dfdl:lengthKind="explicit" dfdl:lengthUnits="bits">
>  >     <xs:annotation>
>  >         <xs:appinfo source="http://www.ogf.org/dfdl/">
>  >             <dfdl:discriminator test="{
>  >                 fn:not(
>  >                 (fn:exists(./Hint_Name_Table_RVA)) and
>  >                 (./Hint_Name_Table_RVA eq 0) and
>  >                 (./Name_Flag eq 0)
>  >                 )
>  >                 }" />
>  >         </xs:appinfo>
>  >     </xs:annotation>
>  >     <xs:complexType>
>  >         <xs:choice>
>  >             <xs:sequence>
>  >                 <xs:element name="Ordinal_Number" type="unsignedint16" />
>  >                 <xs:element name="Zero" type="unsignedint15">
>  >                     <xs:annotation>
>  >                         <xs:appinfo source="http://www.ogf.org/dfdl/">
>  >                             <dfdl:discriminator test="{ . eq 0 }" />
>  >                         </xs:appinfo>
>  >                     </xs:annotation>
>  >                 </xs:element>
>  >                 <xs:element name="Ordinal_Flag" type="unsignedint1">
>  >                     <xs:annotation>
>  >                         <xs:appinfo source="http://www.ogf.org/dfdl/">
>  >                             <dfdl:discriminator test="{ . eq 1 }" />
>  >                         </xs:appinfo>
>  >                     </xs:annotation>
>  >                 </xs:element>
>  >             </xs:sequence>
>  >             <xs:sequence>
>  >                 <xs:element name="Hint_Name_Table_RVA" type="unsignedint31" />
>  >                 <xs:element name="Name_Flag" type="unsignedint1">
>  >                     <xs:annotation>
>  >                         <xs:appinfo source="http://www.ogf.org/dfdl/">
>  >                             <dfdl:discriminator test="{ . eq 0 }" />
>  >                         </xs:appinfo>
>  >                     </xs:annotation>
>  >                 </xs:element>
>  >             </xs:sequence>
>  >         </xs:choice>
>  >     </xs:complexType>
>  > </xs:element>
> 
>  >
> 
>  >
> 
>  >
> 
>  > -----Original Message-----
> 
>  > From: Steve Lawrence <[email protected]>
>  > Sent: Monday, February 11, 2019 8:10 AM
>  > To: [email protected]; Costello, Roger L. <[email protected]>
>  > Subject: [EXT] Re: Can bits be cast to string, concatenated, and then cast to an unsigned int?
> 
>  >
> 
>  > Concatenating bit values will not work because there are no constructor
>  > functions that accept a bit string. The dfdl:setBits() function can be
>  > used, but only for bytes (i.e. 8 bits or less). You could always just use
>  > math to construct an int, e.g.:
> 
>  >
> 
>  >   ./bit1 * fn:pow(2,31) +
> 
>  >   ./bit2 * fn:pow(2,30) +
> 
>  >   ...
> 
>  >   ./bitN * fn:pow(2,0)
> 
>  >
> 
>  > It would probably be more efficient to expand the fn:pow's to their actual
>  > values, but this gives the idea.
> 
>  >
> 
>  > That said, I think an approach that makes for a more descriptive schema
>  > that is easier to understand is to use a choice where each branch of the
>  > choice parses the first 31 bits differently, and a discriminator is used to
>  > ensure the correct branch is taken based on the last bit. Something like:
> 
>  >
> 
>  >   <xs:choice>
>  >     <xs:sequence>
>  >       <xs:element name="OrdinalNumber" dfdl:length="16" ... />
>  >       <xs:element name="Zero" dfdl:length="15" ... />
>  >       <xs:element name="Flag" dfdl:length="1" ... >
>  >         <xs:annotation>
>  >           <xs:appinfo source="http://www.ogf.org/dfdl/">
>  >             <dfdl:discriminator test="{ . eq 1 }" />
>  >           </xs:appinfo>
>  >         </xs:annotation>
>  >       </xs:element>
>  >     </xs:sequence>
>  >     <xs:sequence>
>  >       <xs:element name="HintNameTableRVA" dfdl:length="31" ... />
>  >       <xs:element name="Flag" dfdl:length="1" ... >
>  >         <xs:annotation>
>  >           <xs:appinfo source="http://www.ogf.org/dfdl/">
>  >             <dfdl:discriminator test="{ . eq 0 }" />
>  >           </xs:appinfo>
>  >         </xs:annotation>
>  >       </xs:element>
>  >     </xs:sequence>
>  >   </xs:choice>
> 
>  >
> 
>  > This is a bit inefficient since it requires parsing the same 32 bits twice
>  > if the last bit is a 0, but it's much easier to understand what the data
>  > looks like.
> 
>  >
> 
>  > - Steve
> 
>  >
> 
>  >
> 
>  > On 2/11/19 6:43 AM, Costello, Roger L. wrote:
> 
>  >> Hello DFDL Community,
> 
>  >>
> 
>  >> My input file contains a 32-bit field (bit 0 to bit 31). Bit 31
>  >> determines how to interpret the other bits: if bit 31 = 1 then bits 0
>  >> to 15 are an unsigned int denoting an ordinal number and the other
>  >> bits must be zero. If bit 31 = 0 then bits 0 to 30 are an unsigned int
>  >> denoting a hint/name table RVA. The following graphic illustrates the
>  >> structure:
> 
>  >>
> 
>  >> How to express this in DFDL? I figured that I would create a hidden
> 
>  >> group that contains 32 1-bit elements. I would have a choice that
> 
>  >> expresses this: If hidden bit 31 = 1 then choose the ordinal_number
> 
>  >> element, otherwise (hidden bit 31 = 0) choose hint_name_table_RVA element.
> 
>  >>
> 
>  >> The value of ordinal_number is calculated by concatenating hidden bit
> 
>  >> 0 with hidden bit 1, ..., hidden bit 15. Then, cast that to unsigned int.
> 
>  >>
> 
>  >> The value of hint_name_table_RVA is calculated by concatenating
>  >> hidden bit 0 with hidden bit 1, ..., hidden bit 30. Then, cast that to
>  >> unsigned int.
> 
>  >>
> 
>  >> You can see my schema below. Unfortunately, it fails. What is the
> 
>  >> right way to approach this problem, please?  /Roger
> 
>  >>
> 
>  >> <xs:element name="Ordinal_Number" type="unsignedint16" dfdl:inputValueCalc="{
> 
>  >>      unsignedint16(
> 
>  >>      fn:concat(./Hidden_Lookup_Table_bit0,
> 
>  >>      ./Hidden_Lookup_Table_bit1,
> 
>  >>      ./Hidden_Lookup_Table_bit2,
> 
>  >>      ./Hidden_Lookup_Table_bit3,
> 
>  >>      ./Hidden_Lookup_Table_bit4,
> 
>  >>      ./Hidden_Lookup_Table_bit5,
> 
>  >>      ./Hidden_Lookup_Table_bit6,
> 
>  >>      ./Hidden_Lookup_Table_bit7,
> 
>  >>      ./Hidden_Lookup_Table_bit8,
> 
>  >>      ./Hidden_Lookup_Table_bit9,
> 
>  >>      ./Hidden_Lookup_Table_bit10,
> 
>  >>      ./Hidden_Lookup_Table_bit11,
> 
>  >>      ./Hidden_Lookup_Table_bit12,
> 
>  >>      ./Hidden_Lookup_Table_bit13,
> 
>  >>      ./Hidden_Lookup_Table_bit14,
> 
>  >>      ./Hidden_Lookup_Table_bit15))
> 
>  >>      }"/>
> 
>  >>
> 
>  >> <xs:element name="Hint_Name_Table_RVA" type="unsignedint31" dfdl:inputValueCalc="{
> 
>  >>      unsignedint31(
> 
>  >>      fn:concat(./Hidden_Lookup_Table_bit0,
> 
>  >>      ./Hidden_Lookup_Table_bit1,
> 
>  >>      ./Hidden_Lookup_Table_bit2,
> 
>  >>      ./Hidden_Lookup_Table_bit3,
> 
>  >>      ./Hidden_Lookup_Table_bit4,
> 
>  >>      ./Hidden_Lookup_Table_bit5,
> 
>  >>      ./Hidden_Lookup_Table_bit6,
> 
>  >>      ./Hidden_Lookup_Table_bit7,
> 
>  >>      ./Hidden_Lookup_Table_bit8,
> 
>  >>      ./Hidden_Lookup_Table_bit9,
> 
>  >>      ./Hidden_Lookup_Table_bit10,
> 
>  >>      ./Hidden_Lookup_Table_bit11,
> 
>  >>      ./Hidden_Lookup_Table_bit12,
> 
>  >>      ./Hidden_Lookup_Table_bit13,
> 
>  >>      ./Hidden_Lookup_Table_bit14,
> 
>  >>      ./Hidden_Lookup_Table_bit15,
> 
>  >>      ./Hidden_Lookup_Table_bit16,
> 
>  >>      ./Hidden_Lookup_Table_bit17,
> 
>  >>      ./Hidden_Lookup_Table_bit18,
> 
>  >>      ./Hidden_Lookup_Table_bit19,
> 
>  >>      ./Hidden_Lookup_Table_bit20,
> 
>  >>      ./Hidden_Lookup_Table_bit21,
> 
>  >>      ./Hidden_Lookup_Table_bit22,
> 
>  >>      ./Hidden_Lookup_Table_bit23,
> 
>  >>      ./Hidden_Lookup_Table_bit24,
> 
>  >>      ./Hidden_Lookup_Table_bit25,
> 
>  >>      ./Hidden_Lookup_Table_bit26,
> 
>  >>      ./Hidden_Lookup_Table_bit27,
> 
>  >>      ./Hidden_Lookup_Table_bit28,
> 
>  >>      ./Hidden_Lookup_Table_bit29,
> 
>  >>      ./Hidden_Lookup_Table_bit30
> 
>  >>      ))
> 
>  >>      }"/>
> 
>  >>
> 
>  >> <xs:group name="hidden_Lookup_Table_Group">
>  >>     <xs:sequence dfdl:alignmentUnits="bits">
>  >>         <xs:element name="Hidden_Lookup_Table_bit7" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit6" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit5" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit4" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit3" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit2" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit1" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit0" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit15" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit14" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit13" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit12" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit11" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit10" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit9" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit8" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit23" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit22" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit21" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit20" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit19" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit18" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit17" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit16" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit31" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit30" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit29" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit28" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit27" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit26" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit25" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>         <xs:element name="Hidden_Lookup_Table_bit24" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>  >>     </xs:sequence>
>  >> </xs:group>
> 
>  >>
> 
>  >
>
