Re: Can bits be cast to string, concatenated, and then cast to an unsigned int?

Costello, Roger L. Tue, 12 Feb 2019 14:10:47 -0800

Thank you Steve! The fixed="0" works like a charm.



But I just realized there is a potentially fatal wrinkle in this problem.

EXE files are in little-endian format ... oops!



The EXE specification says that the "most significant bit (bit 31)" is the 
thing that determines how to interpret the other 31 bits.



In a 4-byte Little Endian field, the most significant bit (bit 31) is the first 
bit of the last byte:



[inline image not preserved]



I have been thinking that the most significant bit is the last bit of the last 
byte, which is wrong.



So, this is not correct:


<xs:sequence>
    <xs:element name="Ordinal_Number" type="unsignedint16" />
    <xs:element name="Zero" type="unsignedint15" fixed="0" />
    <xs:element name="Ordinal_Flag" type="unsignedint1">
        <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
                <dfdl:discriminator test="{ . eq 1 }" />
            </xs:appinfo>
        </xs:annotation>
    </xs:element>
</xs:sequence>



Do you agree?



/Roger
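
One possible way to line the schema up with the little-endian layout, offered as an untested sketch rather than a confirmed fix: DFDL has dfdl:byteOrder and dfdl:bitOrder properties, and with byteOrder="littleEndian" combined with bitOrder="leastSignificantBitFirst" the bits of a field are consumed starting from the least significant bit of the first byte. Under that assumption, a 16-bit element followed by a 15-bit element and a 1-bit element would place the final 1-bit element on bit 31. Where these properties belong (a dfdl:format default, as shown here, or the unsignedint types themselves) is an assumption, and the other format properties a real schema needs are omitted.

  <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
          <!-- Assumed placement: a schema-level default format.
               Other required format properties are omitted. -->
          <dfdl:format byteOrder="littleEndian"
                       bitOrder="leastSignificantBitFirst" />
      </xs:appinfo>
  </xs:annotation>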







-----Original Message-----
From: Steve Lawrence <[email protected]>
Sent: Tuesday, February 12, 2019 7:42 AM
To: [email protected]; Costello, Roger L. <[email protected]>
Subject: [EXT] Re: Can bits be cast to string, concatenated, and then cast to 
an unsigned int?



I think the reason is the discriminator on the Zero element.



When a discriminator test passes, that tells Daffodil that the surrounding 
point of uncertainty (PoU) has been resolved and that Daffodil has taken the 
correct branch. If something later causes that branch to fail, Daffodil will 
not try any other branches related to that PoU, but will instead continue to 
backtrack past the PoU to the next one.



So in this case we parse the Zero element and it just happens to be all zeros, 
even though we know we're in the wrong branch. The { . eq 0 } test on the Zero 
element passes, telling Daffodil that it has chosen the correct choice branch. 
Then it parses the Ordinal_Flag element, whose discriminator expects a value 
of 1. That discriminator fails, causing Daffodil to backtrack. But we've 
already resolved the choice PoU, so we don't try the other branch.



The reason it works when you swap the order of the choice branches is that 
Daffodil parses the correct choice branch first, so the Zero discriminator 
never gets to tell it not to try this branch.



So what's the right fix?



First of all, the discriminator on the Zero element should definitely change. 
The Zero element having a value of 0 doesn't mean that we've taken the correct 
branch, which is what the discriminator will say.



One approach would be to change that dfdl:discriminator to a dfdl:assert, which 
will not resolve the PoU when the value of Zero is 0, but will cause the same 
backtracking behavior when it's not 0. This will allow Daffodil to attempt the 
second branch.
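
For reference, the dfdl:assert variant described above might look like the sketch below. The element name and the unsignedint15 type are taken from the schema quoted further down in this thread; the fragment has not been run against Daffodil.

  <!-- Sketch only: same test as before, but as an assert rather than a
       discriminator. A passing assert does not resolve the choice PoU,
       while a failing one still causes backtracking. -->
  <xs:element name="Zero" type="unsignedint15">
      <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ . eq 0 }" />
          </xs:appinfo>
      </xs:annotation>
  </xs:element>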



However, I would argue that that isn't necessarily the right behavior.

Regardless of the value of Zero, you only want to backtrack and try the second 
branch when Ordinal_Flag is not 1. That flag is what determines which choice 
branch we should take, not whether Zero has the correct value. So I would 
argue that the Zero element being 0 should be a validation check, not a 
runtime parse check.

Syntactically, it's fine if that value is non-zero, but when someone wants to 
validate whether what was parsed actually looks like an EXE, a non-zero value 
would raise a flag that something isn't right. So I would recommend doing 
something like this:



  <xs:element name="Zero" type="unsignedint15" fixed="0" />



Setting the "fixed" attribute will not affect the parse at all, but will cause 
a validation error if the value is not 0.



- Steve



On 2/12/19 4:21 AM, Costello, Roger L. wrote:

> Thanks Steve - awesome!
>
> I tried the second approach that you provided. See below. When I run it, I 
> get this error message:
>
> [error] Parse Error: Failed to populate Lookup_Table_Entry[1]. Cause:
> Parse Error: All choice alternatives failed. Reason(s): List(Parse
> Error: Alternative failed. Reason(s): List(Parse Error: Assertion
> failed: { . eq 1 } failed
>
> On the other hand, if I switch the order of the sequences in
> xs:choice, then I get no error. Why do I get an error with the
> sequences in one order but no error in another order?  /Roger

>

> <xs:element name="Lookup_Table_Entry" dfdl:length="32" 
> dfdl:lengthKind="explicit" dfdl:lengthUnits="bits">
>     <xs:annotation>
>         <xs:appinfo source="http://www.ogf.org/dfdl/">
>             <dfdl:discriminator test="{
>                 fn:not(
>                 (fn:exists(./Hint_Name_Table_RVA)) and
>                 (./Hint_Name_Table_RVA eq 0) and
>                 (./Name_Flag eq 0)
>                 )
>                 }" />
>         </xs:appinfo>
>     </xs:annotation>
>     <xs:complexType>
>         <xs:choice>
>             <xs:sequence>
>                 <xs:element name="Ordinal_Number" type="unsignedint16" />
>                 <xs:element name="Zero" type="unsignedint15">
>                     <xs:annotation>
>                         <xs:appinfo source="http://www.ogf.org/dfdl/">
>                             <dfdl:discriminator test="{ . eq 0 }" />
>                         </xs:appinfo>
>                     </xs:annotation>
>                 </xs:element>
>                 <xs:element name="Ordinal_Flag" type="unsignedint1">
>                     <xs:annotation>
>                         <xs:appinfo source="http://www.ogf.org/dfdl/">
>                             <dfdl:discriminator test="{ . eq 1 }" />
>                         </xs:appinfo>
>                     </xs:annotation>
>                 </xs:element>
>             </xs:sequence>
>             <xs:sequence>
>                 <xs:element name="Hint_Name_Table_RVA" type="unsignedint31" />
>                 <xs:element name="Name_Flag" type="unsignedint1">
>                     <xs:annotation>
>                         <xs:appinfo source="http://www.ogf.org/dfdl/">
>                             <dfdl:discriminator test="{ . eq 0 }" />
>                         </xs:appinfo>
>                     </xs:annotation>
>                 </xs:element>
>             </xs:sequence>
>         </xs:choice>
>     </xs:complexType>
> </xs:element>

>

>

>

> -----Original Message-----
> From: Steve Lawrence <[email protected]>
> Sent: Monday, February 11, 2019 8:10 AM
> To: [email protected]; Costello, Roger L. <[email protected]>
> Subject: [EXT] Re: Can bits be cast to string, concatenated, and then cast to 
> an unsigned int?

>

> Concatenating bit values will not work because there are no constructor 
> functions that accept a bit string. The dfdl:setBits() function can be used, 
> but only for bytes (i.e. 8 bits or less). You could always just use math to 
> construct an int, e.g.:
>
>   ./bit1 * fn:pow(2,31) +
>   ./bit2 * fn:pow(2,30) +
>   ...
>   ./bitN * fn:pow(2,0)
>
> It would probably be more efficient to expand the fn:pow's to their actual 
> values, but this gives the idea.
>
> That said, I think an approach that makes for a more descriptive schema that 
> is easier to understand is to use a choice where each branch of the choice 
> parses the other 31 bits differently, and a discriminator is used to ensure 
> the correct branch is taken based on the last bit. Something like:

>

>   <xs:choice>
>     <xs:sequence>
>       <xs:element name="OrdinalNumber" dfdl:length="16" ... />
>       <xs:element name="Zero" dfdl:length="15" ... />
>       <xs:element name="Flag" dfdl:length="1" ... >
>         <xs:annotation>
>           <xs:appinfo source="http://www.ogf.org/dfdl/">
>             <dfdl:discriminator test="{ . eq 1 }" />
>           </xs:appinfo>
>         </xs:annotation>
>       </xs:element>
>     </xs:sequence>
>     <xs:sequence>
>       <xs:element name="HintNameTableRVA" dfdl:length="31" ... />
>       <xs:element name="Flag" dfdl:length="1" ... >
>         <xs:annotation>
>           <xs:appinfo source="http://www.ogf.org/dfdl/">
>             <dfdl:discriminator test="{ . eq 0 }" />
>           </xs:appinfo>
>         </xs:annotation>
>       </xs:element>
>     </xs:sequence>
>   </xs:choice>

>

> This is a bit inefficient since it requires parsing the same 32 bits twice if 
> the last bit is a 0, but it's much easier to understand what the data looks like.

>

> - Steve

>

>

> On 2/11/19 6:43 AM, Costello, Roger L. wrote:

>> Hello DFDL Community,
>>
>> My input file contains a 32-bit field (bit 0 to bit 31). Bit 31
>> determines how to interpret the other bits: if bit 31 = 1 then bits 0
>> to 15 are an unsigned int denoting an ordinal number and the other
>> bits must be zero. If bit 31 = 0 then bits 0 to 30 are an unsigned int
>> denoting a hint/name table RVA. The following graphic illustrates the 
>> structure:
>>
>> [inline image not preserved]
>>
>> How can this be expressed in DFDL? I figured that I would create a hidden
>> group that contains 32 1-bit elements. I would have a choice that
>> expresses this: if hidden bit 31 = 1 then choose the ordinal_number
>> element, otherwise (hidden bit 31 = 0) choose the hint_name_table_RVA element.
>>
>> The value of ordinal_number is calculated by concatenating hidden bit
>> 0 with hidden bit 1, ..., hidden bit 15. Then, cast that to unsigned int.
>>
>> The value of hint_name_table_RVA is calculated by concatenating
>> hidden bit 0 with hidden bit 1, ..., hidden bit 30. Then, cast that to 
>> unsigned int.
>>
>> You can see my schema below. Unfortunately, it fails. What is the
>> right way to approach this problem, please?  /Roger

>>

>> <xs:element name="Ordinal_Number" type="unsignedint16" dfdl:inputValueCalc="{
>>      unsignedint16(
>>      fn:concat(./Hidden_Lookup_Table_bit0,
>>      ./Hidden_Lookup_Table_bit1,
>>      ./Hidden_Lookup_Table_bit2,
>>      ./Hidden_Lookup_Table_bit3,
>>      ./Hidden_Lookup_Table_bit4,
>>      ./Hidden_Lookup_Table_bit5,
>>      ./Hidden_Lookup_Table_bit6,
>>      ./Hidden_Lookup_Table_bit7,
>>      ./Hidden_Lookup_Table_bit8,
>>      ./Hidden_Lookup_Table_bit9,
>>      ./Hidden_Lookup_Table_bit10,
>>      ./Hidden_Lookup_Table_bit11,
>>      ./Hidden_Lookup_Table_bit12,
>>      ./Hidden_Lookup_Table_bit13,
>>      ./Hidden_Lookup_Table_bit14,
>>      ./Hidden_Lookup_Table_bit15))
>>      }"/>
>>
>> <xs:element name="Hint_Name_Table_RVA" type="unsignedint31" dfdl:inputValueCalc="{
>>      unsignedint31(
>>      fn:concat(./Hidden_Lookup_Table_bit0,
>>      ./Hidden_Lookup_Table_bit1,
>>      ./Hidden_Lookup_Table_bit2,
>>      ./Hidden_Lookup_Table_bit3,
>>      ./Hidden_Lookup_Table_bit4,
>>      ./Hidden_Lookup_Table_bit5,
>>      ./Hidden_Lookup_Table_bit6,
>>      ./Hidden_Lookup_Table_bit7,
>>      ./Hidden_Lookup_Table_bit8,
>>      ./Hidden_Lookup_Table_bit9,
>>      ./Hidden_Lookup_Table_bit10,
>>      ./Hidden_Lookup_Table_bit11,
>>      ./Hidden_Lookup_Table_bit12,
>>      ./Hidden_Lookup_Table_bit13,
>>      ./Hidden_Lookup_Table_bit14,
>>      ./Hidden_Lookup_Table_bit15,
>>      ./Hidden_Lookup_Table_bit16,
>>      ./Hidden_Lookup_Table_bit17,
>>      ./Hidden_Lookup_Table_bit18,
>>      ./Hidden_Lookup_Table_bit19,
>>      ./Hidden_Lookup_Table_bit20,
>>      ./Hidden_Lookup_Table_bit21,
>>      ./Hidden_Lookup_Table_bit22,
>>      ./Hidden_Lookup_Table_bit23,
>>      ./Hidden_Lookup_Table_bit24,
>>      ./Hidden_Lookup_Table_bit25,
>>      ./Hidden_Lookup_Table_bit26,
>>      ./Hidden_Lookup_Table_bit27,
>>      ./Hidden_Lookup_Table_bit28,
>>      ./Hidden_Lookup_Table_bit29,
>>      ./Hidden_Lookup_Table_bit30
>>      ))
>>      }"/>

>>

>> <xs:group name="hidden_Lookup_Table_Group">
>>     <xs:sequence dfdl:alignmentUnits="bits">
>>         <xs:element name="Hidden_Lookup_Table_bit7" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit6" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit5" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit4" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit3" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit2" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit1" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit0" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit15" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit14" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit13" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit12" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit11" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit10" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit9" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit8" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit23" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit22" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit21" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit20" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit19" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit18" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit17" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit16" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit31" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit30" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit29" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit28" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit27" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit26" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit25" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>         <xs:element name="Hidden_Lookup_Table_bit24" type="unsignedint1" dfdl:outputValueCalc='{ . }' />
>>     </xs:sequence>
>> </xs:group>

>>

>
