One subtlety here is that if the unused bits of the input byte were not zero, the data would not unparse "properly", because the unused bits are replaced by zero bits from your fill byte.
We call this the canonical form. Your data format tolerates various input forms, but unparses to a canonical form. The TDML runner has features for testing data and schemas that change data into canonical form. For example, you can write roundTrip="twoPass" tests, which parse, unparse, then reparse the output of the unparse and compare to ensure the infoset is the same as in the original parse.

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Monday, July 29, 2019 8:32:46 AM
To: [email protected] <[email protected]>
Subject: Re: Unparsing a byte that is padded to byte boundary produces incorrect results

Thanks Mike! Now my schema parses and unparses perfectly:

<xs:element name="input">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="two-bits" type="unsignedint2" />
      <xs:element name="three-bits" type="unsignedint3" />
      <xs:sequence dfdl:hiddenGroupRef="padToByteBoundary" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:group name="padToByteBoundary">
  <xs:sequence dfdl:alignment="8" dfdl:alignmentUnits="bits" dfdl:fillByte="%#r00;"/>
</xs:group>

/Roger

From: Beckerle, Mike <[email protected]>
Sent: Monday, July 29, 2019 8:24 AM
To: [email protected]
Subject: [EXT] Re: Unparsing a byte that is padded to byte boundary produces incorrect results

Check that the fillByte is %#r00;

The unused bits of your byte are filled from the fill byte when unparsing.

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Monday, July 29, 2019 8:16:22 AM
To: [email protected] <[email protected]>
Subject: Unparsing a byte that is padded to byte boundary produces incorrect results

Hello DFDL community,

My input is binary. There is a 2-bit unsigned integer, followed by a 3-bit unsigned integer, and then it is padded to an 8-bit boundary. The bits are leastSignificantBitFirst.
Here is my input (hex): 0E

Here is my DFDL Schema:

<xs:element name="input">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="two-bits" type="unsignedint2" />
      <xs:element name="three-bits" type="unsignedint3" />
      <xs:sequence dfdl:hiddenGroupRef="padToByteBoundary" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:group name="padToByteBoundary">
  <xs:sequence dfdl:alignment="8" dfdl:alignmentUnits="bits"/>
</xs:group>

Parsing produces this XML:

<input>
  <two-bits>2</two-bits>
  <three-bits>3</three-bits>
</input>

Perfect! However, unparsing produces incorrect binary (hex): CE

Yikes! What am I doing wrong, please?

/Roger
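
For readers following the bit arithmetic in this thread, here is a minimal Python sketch of the behavior being discussed. It is not Daffodil code. The function names (parse, unparse, two_pass) are made up for illustration, and it assumes a simplified model in which the three padding bits are copied from the same bit positions of the fill byte. The thread does not say which fill byte was in effect when CE was produced, so the 0xC0 value below is purely hypothetical, chosen because it reproduces that output under this model.

```python
def parse(byte: int) -> tuple[int, int]:
    """Read a 2-bit then a 3-bit unsigned field, least significant bit first."""
    two_bits = byte & 0b11            # bits 0-1
    three_bits = (byte >> 2) & 0b111  # bits 2-4
    return two_bits, three_bits       # bits 5-7 are padding and are discarded


def unparse(two_bits: int, three_bits: int, fill_byte: int = 0x00) -> int:
    """Write both fields, then take the 3 padding bits from fill_byte.

    Assumption: padding bits come from the matching positions of fill_byte.
    """
    data = (two_bits & 0b11) | ((three_bits & 0b111) << 2)
    return data | (fill_byte & 0b1110_0000)


def two_pass(byte: int, fill_byte: int = 0x00) -> tuple[bool, bool]:
    """Parse, unparse, reparse.

    Returns (output byte equals input byte, reparsed infoset equals first infoset).
    """
    infoset = parse(byte)
    out = unparse(*infoset, fill_byte)
    return out == byte, parse(out) == infoset


print(parse(0x0E))                         # (2, 3)
print(hex(unparse(2, 3)))                  # 0xe
print(hex(unparse(2, 3, fill_byte=0xC0)))  # 0xce (hypothetical fill byte)
print(two_pass(0xEE))                      # (False, True)
```

The last line shows the canonical-form point: an input of EE has nonzero padding bits, so the unparsed byte (0E) differs from the input, yet reparsing it yields the same infoset. That is the situation a two-pass round trip is meant to accept.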
