My input is a single line consisting of three fields separated by slashes. The first field (A) can contain any string. The second field (B) has a fixed length (3); if the data does not consume the allotted 3 spaces, then the data is left-aligned and padded with spaces on the right. The third field (C) can contain any string. Here is a sample input:

Hello/X  /Comment

Notice the two padding spaces following X. Here is another sample input:

Hello/XYZ/Comment

That is all very straightforward and easily described in DFDL.

Now for the complexity ...

The third field (C) is optional. If there is no data for the third field, then the data in the second field (B) does not need to be padded. So here is a valid input:

Hello/X

There is no padding following X. (Nor is there a slash separator.) So, the second field (B) has a fixed length only if there is a third field (C).

I created a DFDL schema which seems to correctly express this data format. See below. The approach I use is a choice for the second field:

choice
    <sequence>
        element declaration for fixed length B
        element declaration for C
    </sequence>
    element declaration for variable length B

Eek! I don't think that approach is scalable. Suppose instead of 3 fields, there are 4 fields: A, B, C, D. Suppose B, C, D are optional, and B, C are fixed length unless no fields follow them, in which case they are variable length. The choice approach quickly becomes untenable, as all permutations must be described.

Is there a better approach to this problem?

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:fn="http://www.w3.org/2005/xpath-functions">

    <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:format
                alignment="1"
                alignmentUnits="bytes"
                emptyValueDelimiterPolicy="none"
                encoding="ASCII"
                encodingErrorPolicy="replace"
                escapeSchemeRef=""
                fillByte="%SP;"
                floating="no"
                ignoreCase="yes"
                initiatedContent="no"
                initiator=""
                leadingSkip="0"
                lengthKind="delimited"
                lengthUnits="characters"
                nilValueDelimiterPolicy="none"
                occursCountKind="implicit"
                outputNewLine="%CR;%LF;"
                representation="text"
                separator=""
                separatorSuppressionPolicy="anyEmpty"
                sequenceKind="ordered"
                textBidi="no"
                textPadKind="none"
                textTrimKind="none"
                trailingSkip="0"
                truncateSpecifiedLengthString="no"
                terminator=""
                textNumberRep="standard"
                textStandardBase="10"
                textStandardZeroRep="0"
                textNumberRounding="pattern"
                textStandardExponentRep="E"
                textNumberCheckPolicy="strict"/>
        </xs:appinfo>
    </xs:annotation>

    <xs:element name="Test">
        <xs:complexType>
            <xs:sequence dfdl:separator="/" dfdl:separatorPosition="infix">
                <xs:element name="A" type="xs:string"/>
                <xs:choice dfdl:choiceLengthKind="implicit">
                    <xs:sequence dfdl:separator="/" dfdl:separatorPosition="infix">
                        <xs:element name="B-fixed-length"
                                    dfdl:lengthKind="explicit"
                                    dfdl:length="3"
                                    dfdl:textTrimKind="padChar"
                                    dfdl:textPadKind="padChar"
                                    dfdl:textStringPadCharacter="%SP;"
                                    dfdl:textStringJustification="left">
                            <xs:simpleType>
                                <xs:restriction base="validString">
                                    <xs:enumeration value="X"/>
                                    <xs:enumeration value="XY"/>
                                    <xs:enumeration value="XYZ"/>
                                </xs:restriction>
                            </xs:simpleType>
                        </xs:element>
                        <xs:element name="C" type="xs:string"/>
                    </xs:sequence>
                    <xs:element name="B-variable-length">
                        <xs:simpleType>
                            <xs:restriction base="validString">
                                <xs:enumeration value="X"/>
                                <xs:enumeration value="XY"/>
                                <xs:enumeration value="XYZ"/>
                            </xs:restriction>
                        </xs:simpleType>
                    </xs:element>
                </xs:choice>
            </xs:sequence>
        </xs:complexType>
    </xs:element>

    <xs:simpleType name="validString">
        <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
                <dfdl:assert>{ dfdl:checkConstraints(.) }</dfdl:assert>
            </xs:appinfo>
        </xs:annotation>
        <xs:restriction base="xs:string"/>
    </xs:simpleType>

</xs:schema>
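
To pin down the intended behavior independently of DFDL, here is a minimal Python sketch of the parsing rules described in this post. It is only an illustration, not a proposed solution to the DFDL question. The function name parse_record and the returned dict layout are hypothetical. The sketch assumes that field A contains no slash, that B is limited to the enumerated values X, XY, XYZ from the schema, and that an unpadded B is only accepted when no C follows.

```python
VALID_B = {"X", "XY", "XYZ"}  # enumeration values taken from the schema
B_FIXED_LENGTH = 3            # width of B when a third field follows


def parse_record(line: str) -> dict:
    """Parse 'A/B' or 'A/B/C' (hypothetical helper, for illustration only)."""
    # Assumption: A itself contains no slash, so it ends at the first one.
    a, sep, rest = line.partition("/")
    if not sep:
        raise ValueError("missing separator after field A")

    if "/" in rest:
        # A third field follows, so B must fill exactly 3 characters
        # (left-aligned, space-padded) and be followed by a slash.
        if len(rest) <= B_FIXED_LENGTH or rest[B_FIXED_LENGTH] != "/":
            raise ValueError("B must be padded to 3 characters when C follows")
        b = rest[:B_FIXED_LENGTH].rstrip(" ")
        c = rest[B_FIXED_LENGTH + 1:]
        result = {"A": a, "B": b, "C": c}
    else:
        # No third field: B is variable length and is taken as-is.
        b = rest
        result = {"A": a, "B": b}

    if b not in VALID_B:
        raise ValueError(f"invalid value for B: {b!r}")
    return result
```

With the three sample inputs from this post, the sketch yields B = "X" and C = "Comment" for the padded input, B = "XYZ" and C = "Comment" for the full-width input, and B = "X" with no C for the short input. An input such as Hello/X/Comment is rejected, because B is not padded even though C follows.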