Hi Roger,

This writeup is great. I’m really enjoying these. I like writeups that take readers on a journey with you as you try and fail, then ultimately succeed, so we get to learn the lessons along with you. Love the real-world use case too. You have me craving more!

-Davin

From: Roger L Costello <coste...@mitre.org>
Reply-To: "users@daffodil.apache.org" <users@daffodil.apache.org>
Date: Thursday, September 8, 2022 at 9:47 AM
To: "users@daffodil.apache.org" <users@daffodil.apache.org>
Subject: Here is my writeup of category #2: Field with fixed length, nillable, not composite, choice

Hi Folks,

Below is my writeup of the second category (recall last week I identified 16 categories of fields):

2. Field with fixed length, nillable, not composite, choice

Please let me know of any typos and anything that is not clear.

/Roger

-------------------------------------------------------------

Some fields have a choice of values. For example, airports mark the direction of runways in one of two ways, like this:

    24L

or like this:

    24L-36R

Let’s create DFDL for a field containing runway direction. Let’s assume that if no data is available to populate the field then it is to contain a hyphen, i.e., the field is nillable. Let’s also assume the field has a fixed length (7).

Field Requirements:

>> Choice of values: one or two runway directions
>> Fixed length (7)
>> Nillable, hyphen is the nil value, the hyphen may be positioned anywhere within the 7-character field
>> Not composite, i.e., values are atomic
>> Values shorter than 7 characters must be left-justified
>> Values shorter than 7 characters must be padded with spaces

Here is a skeletal outline of the schema:

    <xs:element name="RunwayDirection" nillable="true">
      <xs:complexType>
        <xs:choice>
          <xs:element name="OneRunwayDirection"> ... </xs:element>
          <xs:element name="TwoRunwayDirections"> ... </xs:element>
        </xs:choice>
      </xs:complexType>
    </xs:element>

I named the field RunwayDirection. Its value is a choice of either OneRunwayDirection or TwoRunwayDirections. Since the field has a choice of content, RunwayDirection is declared to have a complexType. That’s a problem.
Recall that if there is no data to populate RunwayDirection, it is to be populated by a hyphen, but DFDL does not allow an element with a complexType to have hyphen as the nil value; for such elements DFDL only allows %ES; (the empty string) as the nil value.

The workaround is to embed the element (RunwayDirection) within a choice, where one branch of the choice is an element declaration for the runway field (omit nillable="true") and the other branch is an element declaration for the nil value:

    <xs:element name="RunwayDirectionWrapper">
      <xs:complexType>
        <xs:choice>
          <xs:element name="RunwayDirection_" nillable="true">...</xs:element>
          <xs:element name="RunwayDirection">
            <xs:complexType>
              <xs:choice>
                <xs:element name="OneRunwayDirection"> ... </xs:element>
                <xs:element name="TwoRunwayDirections"> ... </xs:element>
              </xs:choice>
            </xs:complexType>
          </xs:element>
        </xs:choice>
      </xs:complexType>
    </xs:element>

Note the choice within a choice. The outer choice has two branches, one for the nil value and the other for RunwayDirection. The first branch, for the nil value, I named RunwayDirection_ (note the underscore at the end of the name). The two branches are embedded within an element that I named RunwayDirectionWrapper. Of course, you can name the wrapper element and the element for the nil value whatever you want.

The order of the branches is important. You must list the element for the nil value first. The reason for this is somewhat involved.
The reader who is interested in getting a detailed explanation is invited to read the online discussion of this topic:

    https://lists.apache.org/list.html?users@daffodil.apache.org

To specify that the field is fixed length, add these two DFDL properties to RunwayDirectionWrapper:

    dfdl:lengthKind="explicit"
    dfdl:length="7"

Here is the element with the two DFDL properties added:

    <xs:element name="RunwayDirectionWrapper"
                dfdl:lengthKind="explicit"
                dfdl:length="7">
      <xs:complexType>
        <xs:choice>
          -- branch 1
          -- branch 2
        </xs:choice>
      </xs:complexType>
    </xs:element>

The first branch of the choice is the nillable element. To specify that the field is nillable with hyphen, add these two DFDL properties:

    dfdl:nilKind="literalValue"
    dfdl:nilValue="%WSP*;-%WSP*;"

Here is the first branch:

    <xs:element name="RunwayDirection_" type="xs:string"
                nillable="true"
                dfdl:nilKind="literalValue"
                dfdl:nilValue="%WSP*;-%WSP*;" />

Unfortunately, that is not quite right. Consider what happens with this input:

    …/24L-36R/…

Parsing produces this XML:

    <RunwayDirection_>24L-36R</RunwayDirection_>

The parser chose the first branch (the nil branch), which is wrong. The reason it did that is that the first branch is declared of type string and since "24L-36R" is a string, the parser chose that branch. We need to specify that the branch is to be chosen only when the field’s value is hyphen. To do this, add the following to the element declaration:

    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:assert>{ fn:nilled(.) }</dfdl:assert>
      </xs:appinfo>
    </xs:annotation>

The XPath nilled() function returns true only if the field contains the nil value (hyphen). If the assert fails on an input, then the parser will fail on the branch, backtrack, and try the next branch.

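To make the backtracking concrete, here is a small Python sketch of an ordered choice whose first branch may or may not carry the assert. It is not DFDL and says nothing about how Daffodil is implemented; the function names are hypothetical, the regular expression only approximates %WSP*;-%WSP*;, and the second branch is a placeholder that accepts whatever reaches it.

```python
import re

# Stand-in for dfdl:nilValue="%WSP*;-%WSP*;": optional whitespace, a hyphen,
# optional whitespace. Python's \s is only an approximation of DFDL's WSP class.
NIL_PATTERN = re.compile(r"\s*-\s*")


def nil_branch(field, with_assert):
    """Mimic the RunwayDirection_ branch; return None to signal backtracking."""
    is_nil = NIL_PATTERN.fullmatch(field) is not None
    if with_assert and not is_nil:
        # The assert { fn:nilled(.) } is false, so this branch fails.
        return None
    # Without the assert, an xs:string element accepts any string.
    return ("RunwayDirection_", None if is_nil else field)


def runway_branch(field):
    """Placeholder for the second branch (assumption: it accepts any value)."""
    return ("RunwayDirection", field.rstrip(" "))


def parse_choice(field, with_assert):
    """Try the branches in schema order and keep the first that succeeds."""
    result = nil_branch(field, with_assert)
    if result is None:
        result = runway_branch(field)
    return result
```

Calling parse_choice("24L-36R", with_assert=False) selects the RunwayDirection_ branch, which is the wrong outcome described above; with with_assert=True the same input falls through to the RunwayDirection branch, while a hyphen surrounded by spaces still selects the nil branch.
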
Here is the updated branch:

    <xs:element name="RunwayDirection_" type="xs:string"
                nillable="true"
                dfdl:nilKind="literalValue"
                dfdl:nilValue="%WSP*;-%WSP*;">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:assert>{ fn:nilled(.) }</dfdl:assert>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>

The second branch is fixed length, non-nillable, and has a choice of values:

    <xs:element name="RunwayDirection">
      <xs:complexType>
        <xs:choice dfdl:choiceLengthKind="implicit">
          <xs:element name="OneRunwayDirection"
                      dfdl:textTrimKind="padChar"
                      dfdl:textPadKind="padChar"
                      dfdl:textStringPadCharacter="%SP;"
                      dfdl:textStringJustification="left">
            <xs:simpleType>
              <xs:restriction base="xs:string">
                <xs:pattern value="[0-9]{2,2}(C|L|R){0,1}"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:element>
          <xs:element name="TwoRunwayDirections"
                      dfdl:textTrimKind="padChar"
                      dfdl:textPadKind="padChar"
                      dfdl:textStringPadCharacter="%SP;"
                      dfdl:textStringJustification="left">
            <xs:simpleType>
              <xs:restriction base="xs:string">
                <xs:pattern value="[0-9]{2,2}(C|L|R){0,1}-[0-9]{2,2}(C|L|R){0,1}"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:element>
        </xs:choice>
      </xs:complexType>
    </xs:element>

Unfortunately, that too is not quite right. Consider what happens with this input:

    …/24L-36R/…

Parsing it results in this error:

    OneRunwayDirection failed facet checks due to: facet pattern(s): [0-9]{2,2}(C|L|R){0,1}

Here is what’s happening: the parser checks the input against the simpleType facets in the OneRunwayDirection branch and, since the input does not conform, parsing fails. The full explanation of why this is so is a bit involved. The interested reader is invited to read the mailing list discussion:

    https://lists.apache.org/thread/rhvllp7gwh4yv8zklphqz7jpnhy6q88q

To force parsing to back up and try the next branch when facet validation fails, we add this to each branch:

    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:assert>{ dfdl:checkConstraints(.) }</dfdl:assert>
      </xs:appinfo>
    </xs:annotation>

Here is the updated RunwayDirection element:

    <xs:element name="RunwayDirection">
      <xs:complexType>
        <xs:choice dfdl:choiceLengthKind="implicit">
          <xs:element name="OneRunwayDirection"
                      dfdl:textTrimKind="padChar"
                      dfdl:textPadKind="padChar"
                      dfdl:textStringPadCharacter="%SP;"
                      dfdl:textStringJustification="left">
            <xs:annotation>
              <xs:appinfo source="http://www.ogf.org/dfdl/">
                <dfdl:assert>{ dfdl:checkConstraints(.) }</dfdl:assert>
              </xs:appinfo>
            </xs:annotation>
            <xs:simpleType>
              <xs:restriction base="xs:string">
                <xs:pattern value="[0-9]{2,2}(C|L|R){0,1}"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:element>
          <xs:element name="TwoRunwayDirections"
                      dfdl:textTrimKind="padChar"
                      dfdl:textPadKind="padChar"
                      dfdl:textStringPadCharacter="%SP;"
                      dfdl:textStringJustification="left">
            <xs:annotation>
              <xs:appinfo source="http://www.ogf.org/dfdl/">
                <dfdl:assert>{ dfdl:checkConstraints(.) }</dfdl:assert>
              </xs:appinfo>
            </xs:annotation>
            <xs:simpleType>
              <xs:restriction base="xs:string">
                <xs:pattern value="[0-9]{2,2}(C|L|R){0,1}-[0-9]{2,2}(C|L|R){0,1}"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:element>
        </xs:choice>
      </xs:complexType>
    </xs:element>

That’s it. Now let’s see how a DFDL processor parses the runway field with various inputs.

With the following input (note the spaces around the hyphen):

    …/ - /…

parsing produces this XML:

    <RunwayDirectionWrapper>
      <RunwayDirection_ xsi:nil="true"></RunwayDirection_>
    </RunwayDirectionWrapper>

and unparsing produces this:

    …/- /…

Notice that unparsing results in moving the hyphen to the left side of the field.

With this input:

    …/24L /…

parsing produces this XML:

    <RunwayDirectionWrapper>
      <RunwayDirection>
        <OneRunwayDirection>24L</OneRunwayDirection>
      </RunwayDirection>
    </RunwayDirectionWrapper>

and unparsing produces this:

    …/24L /…

With this input:

    …/24L-36R/…

parsing produces this XML:

    <RunwayDirectionWrapper>
      <RunwayDirection>
        <TwoRunwayDirections>24L-36R</TwoRunwayDirections>
      </RunwayDirection>
    </RunwayDirectionWrapper>

and unparsing produces this:

    …/24L-36R/…

Perfect.

One final comment: You might have noticed that the following DFDL properties were added twice, once for OneRunwayDirection and again for TwoRunwayDirections.

    dfdl:textTrimKind="padChar"
    dfdl:textPadKind="padChar"
    dfdl:textStringPadCharacter="%SP;"
    dfdl:textStringJustification="left"

Why didn’t we simply add those properties once, onto their parent element, RunwayDirection? The answer is this: the RunwayDirection element has a complex type, and none of those properties are applicable to complex types. Those properties apply only to simple types with text representation.
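The parse and unparse behavior of the finished schema can be mimicked with a short Python sketch. This is an illustration under stated assumptions, not a DFDL processor: the function names are hypothetical, the regular expressions restate the schema's nil value and pattern facets (with \s standing in for DFDL's WSP class), and the 7-character sample inputs in the usage note are invented for the example.

```python
import re

FIELD_LENGTH = 7  # dfdl:lengthKind="explicit", dfdl:length="7"

# Assumed Python equivalents of the nil value and the two pattern facets.
NIL = re.compile(r"\s*-\s*")
ONE = re.compile(r"[0-9]{2}(C|L|R)?")
TWO = re.compile(r"[0-9]{2}(C|L|R)?-[0-9]{2}(C|L|R)?")


def parse_field(field):
    """Return (element name, value); a value of None stands for xsi:nil="true"."""
    if len(field) != FIELD_LENGTH:
        raise ValueError("field must be exactly 7 characters")
    # Outer choice, branch 1: the nil element, guarded by the nilled() assert.
    if NIL.fullmatch(field):
        return ("RunwayDirection_", None)
    # Outer choice, branch 2: trim the trailing pad spaces (left-justified text),
    # then try each inner branch; a facet mismatch moves on to the next branch.
    value = field.rstrip(" ")
    for name, pattern in (("OneRunwayDirection", ONE), ("TwoRunwayDirections", TWO)):
        if pattern.fullmatch(value):
            return (name, value)
    raise ValueError("no branch of the choice matched")


def unparse_field(name, value):
    """Write the value (or a hyphen for nil) left-justified and space-padded."""
    text = "-" if value is None else value
    return text.ljust(FIELD_LENGTH, " ")
```

For example, parse_field applied to a hyphen with three spaces on each side returns the nil element, and unparse_field then writes the hyphen followed by six spaces, which matches the observation that unparsing moves the hyphen to the left side of the field. The values "24L" (padded to 7 characters) and "24L-36R" round-trip unchanged.
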