One subtlety here: if the unused bits of the input byte are not zero, the data 
will not unparse to exactly the same bytes, because on unparse the unused bits 
are replaced by zero bits from your fill byte. For example, an input byte of 2E 
parses to the same infoset as 0E, but unparses as 0E.

We call this the canonical form. Your data format tolerates various input 
forms, but unparses to a canonical form.

The TDML runner has features for testing schemas that change data into 
canonical form. For example, a roundTrip="twoPass" test parses, unparses, then 
reparses the unparsed output and compares to ensure the infoset is the same as 
the one from the original parse.
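
As a rough sketch, such a test might look like the following. The model file 
name input.dfdl.xsd, the suite and test names, and the 2E input byte are 
assumptions for illustration, not taken from your schema. I am also assuming 
the input element has no target namespace.

<tdml:testSuite suiteName="padToByteBoundary"
    xmlns:tdml="http://www.ibm.com/xmlns/dfdl/testData">

    <!-- Hypothetical test: 2E has a non-zero bit among the three unused bits,
         so the unparse is expected to produce 0E rather than 2E.
         twoPass then reparses that output and compares the infosets. -->
    <tdml:parserTestCase name="nonCanonicalPadding" root="input"
        model="input.dfdl.xsd" roundTrip="twoPass">
        <tdml:document>
            <tdml:documentPart type="byte">2E</tdml:documentPart>
        </tdml:document>
        <tdml:infoset>
            <tdml:dfdlInfoset>
                <input>
                    <two-bits>2</two-bits>
                    <three-bits>3</three-bits>
                </input>
            </tdml:dfdlInfoset>
        </tdml:infoset>
    </tdml:parserTestCase>

</tdml:testSuite>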

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Monday, July 29, 2019 8:32:46 AM
To: [email protected] <[email protected]>
Subject: Re: Unparsing a byte that is padded to byte boundary produces 
incorrect results

Thanks Mike! Now my schema parses and unparses perfectly:

<xs:element name="input">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="two-bits" type="unsignedint2" />
            <xs:element name="three-bits" type="unsignedint3" />
            <xs:sequence dfdl:hiddenGroupRef="padToByteBoundary" />
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:group name="padToByteBoundary">
    <xs:sequence dfdl:alignment="8"
                 dfdl:alignmentUnits="bits"
                 dfdl:fillByte="%#r00;"/>
</xs:group>

/Roger

From: Beckerle, Mike <[email protected]>
Sent: Monday, July 29, 2019 8:24 AM
To: [email protected]
Subject: [EXT] Re: Unparsing a byte that is padded to byte boundary produces 
incorrect results

Check that the fillByte is %#x00;
The unused bits of your byte are filled from the fill byte when unparsing.

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Monday, July 29, 2019 8:16:22 AM
To: [email protected] <[email protected]>
Subject: Unparsing a byte that is padded to byte boundary produces incorrect 
results

Hello DFDL community,

My input is binary. There is a 2-bit unsigned integer, followed by a 3-bit 
unsigned integer, and then it is padded to an 8-bit boundary. The bits are 
leastSignificantBitFirst. Here is my input (hex):

        0E

Here is my DFDL Schema:

<xs:element name="input">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="two-bits" type="unsignedint2" />
            <xs:element name="three-bits" type="unsignedint3" />
            <xs:sequence dfdl:hiddenGroupRef="padToByteBoundary" />
        </xs:sequence>
    </xs:complexType>
</xs:element>

<xs:group name="padToByteBoundary">
    <xs:sequence dfdl:alignment="8" dfdl:alignmentUnits="bits"/>
</xs:group>

Parsing produces this XML:

<input>
  <two-bits>2</two-bits>
  <three-bits>3</three-bits>
</input>

Perfect!

However, unparsing produces incorrect binary (hex):

        CE

Yikes! What am I doing wrong, please?

/Roger
