The feature you want is lookahead. In DFDL this is done with a dfdl:assert that has testKind="pattern" and a regex.

So you can, for just one field, define it as either fixed or variable length depending on whether the data looks like 3 characters followed by another delimiter, or not. Each field can be defined this way, and each one is isolated from the next, so the whole thing doesn't become a big coupled mess where every field has to be combined with the next one.

It's not perfect, because you are expressing the separator in two places: as the sequence separator and in the lookahead regex. On the other hand, it expresses the problem exactly the way you described it: "it's fixed length if it's followed by a next field".

So something like this would go in a sequence separated by "/":

    <element name="b">
      <complexType>
        <choice>
          <sequence>
            <sequence>
              <annotation><appinfo ..>
                <!-- look ahead for 3 non-slash non-line-ending then a slash -->
                <dfdl:assert testKind="pattern" testPattern="[^/\r\n]{3}/" />
              </appinfo></annotation>
            </sequence>
            <!-- this len named element is here to obey XSD's UPA rules. -->
            <element name="len" type="xs:unsignedInt" dfdl:inputValueCalc="{ 3 }"/>
            <element name="str" type="xs:string" dfdl:lengthKind="explicit" dfdl:length="{ ../len }"/>
            <!-- you could add space pad/trim to the str if you want it left justified -->
          </sequence>
          <element name="str" type="xs:string" dfdl:lengthKind="delimited"/>
        </choice>
      </complexType>
    </element>

On Thu, Sep 28, 2023 at 9:09 AM Roger L Costello <coste...@mitre.org> wrote:

> My input is a single line consisting of three fields separated by slashes.
> The first field (A) can contain any string. The second field (B) has a
> fixed length (3); if the data does not consume the allotted 3 spaces, then
> the data is left-aligned and padded with spaces on the right. The third
> field (C) can contain any string. Here is a sample input:
>
> Hello/X  /Comment
>
> Notice the two padding spaces following X.
>
> Here is another sample input:
>
> Hello/XYZ/Comment
>
> That is all very straightforward and easily described in DFDL.
>
> Now for the complexity …
>
> The third field (C) is optional. If there is no data for the third field,
> then the data in the second field (B) does not need to be padded. So here
> is a valid input:
>
> Hello/X
>
> There is no padding following X. (Nor is there a slash separator)
>
> So, the second field (B) has a fixed length only if there is a third field
> (C).
>
> I created a DFDL schema which seems to correctly express this data format.
> See below. The approach I use is a choice for the second field:
>
> choice
>     <sequence>
>         element declaration for fixed length B
>         element declaration for C
>     </sequence>
>     element declaration for variable length B
>
> Eek! I don’t think that approach is scalable.
>
> Suppose instead of 3 fields, there are 4 fields, A, B, C, D. Suppose B, C,
> D are optional and B, C are fixed length unless there are no following
> fields then they are variable length. The choice approach quickly becomes
> untenable as all permutations must be described. Is there a better approach
> to this problem?
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xs:schema xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
>            xmlns:xs="http://www.w3.org/2001/XMLSchema"
>            xmlns:fn="http://www.w3.org/2005/xpath-functions">
>
>     <xs:annotation>
>         <xs:appinfo source="http://www.ogf.org/dfdl/">
>             <dfdl:format alignment="1"
>                 alignmentUnits="bytes"
>                 emptyValueDelimiterPolicy="none"
>                 encoding="ASCII"
>                 encodingErrorPolicy="replace"
>                 escapeSchemeRef=""
>                 fillByte="%SP;"
>                 floating="no"
>                 ignoreCase="yes"
>                 initiatedContent="no"
>                 initiator=""
>                 leadingSkip="0"
>                 lengthKind="delimited"
>                 lengthUnits="characters"
>                 nilValueDelimiterPolicy="none"
>                 occursCountKind="implicit"
>                 outputNewLine="%CR;%LF;"
>                 representation="text"
>                 separator=""
>                 separatorSuppressionPolicy="anyEmpty"
>                 sequenceKind="ordered"
>                 textBidi="no"
>                 textPadKind="none"
>                 textTrimKind="none"
>                 trailingSkip="0"
>                 truncateSpecifiedLengthString="no"
>                 terminator=""
>                 textNumberRep="standard"
>                 textStandardBase="10"
>                 textStandardZeroRep="0"
>                 textNumberRounding="pattern"
>                 textStandardExponentRep="E"
>                 textNumberCheckPolicy="strict"/>
>         </xs:appinfo>
>     </xs:annotation>
>
>     <xs:element name="Test">
>         <xs:complexType>
>             <xs:sequence dfdl:separator="/" dfdl:separatorPosition="infix">
>                 <xs:element name="A" type="xs:string" />
>                 <xs:choice dfdl:choiceLengthKind="implicit">
>                     <xs:sequence dfdl:separator="/" dfdl:separatorPosition="infix">
>                         <xs:element name="B-fixed-length"
>                             dfdl:lengthKind="explicit"
>                             dfdl:length="3"
>                             dfdl:textTrimKind="padChar"
>                             dfdl:textPadKind="padChar"
>                             dfdl:textStringPadCharacter="%SP;"
>                             dfdl:textStringJustification="left">
>                             <xs:simpleType>
>                                 <xs:restriction base="validString">
>                                     <xs:enumeration value="X"/>
>                                     <xs:enumeration value="XY"/>
>                                     <xs:enumeration value="XYZ"/>
>                                 </xs:restriction>
>                             </xs:simpleType>
>                         </xs:element>
>                         <xs:element name="C" type="xs:string"/>
>                     </xs:sequence>
>                     <xs:element name="B-variable-length">
>                         <xs:simpleType>
>                             <xs:restriction base="validString">
>                                 <xs:enumeration value="X"/>
>                                 <xs:enumeration value="XY"/>
>                                 <xs:enumeration value="XYZ"/>
>                             </xs:restriction>
>                         </xs:simpleType>
>                     </xs:element>
>                 </xs:choice>
>             </xs:sequence>
>         </xs:complexType>
>     </xs:element>
>
>     <xs:simpleType name="validString">
>         <xs:annotation>
>             <xs:appinfo source="http://www.ogf.org/dfdl/">
>                 <dfdl:assert>{ dfdl:checkConstraints(.) }</dfdl:assert>
>             </xs:appinfo>
>         </xs:annotation>
>         <xs:restriction base="xs:string"/>
>     </xs:simpleType>
>
> </xs:schema>
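
For the four-field case in the question (A, B, C, D), the per-field lookahead can probably be factored into one reusable type, so that B and C each decide for themselves whether they are fixed or delimited and no permutations need to be listed. What follows is an untested sketch, not a verified schema. The type name FixedIfFollowed and the use of minOccurs="0" for the optional fields are my assumptions, the fragment relies on the dfdl:format defaults from the schema quoted above, and the pattern spells out \r and \n explicitly because Java-style regex engines generally do not accept \R inside a character class.

```xml
<!-- Hypothetical reusable type (name is an assumption): the value is
     fixed length 3 when a "/" follows it, otherwise it is delimited. -->
<xs:complexType name="FixedIfFollowed">
  <xs:choice dfdl:choiceLengthKind="implicit">
    <xs:sequence>
      <xs:sequence>
        <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/">
            <!-- look ahead: 3 non-slash, non-line-ending characters, then a slash -->
            <dfdl:assert testKind="pattern" testPattern="[^/\r\n]{3}/"/>
          </xs:appinfo>
        </xs:annotation>
      </xs:sequence>
      <!-- computed element that keeps the two "str" declarations apart -->
      <xs:element name="len" type="xs:unsignedInt" dfdl:inputValueCalc="{ 3 }"/>
      <xs:element name="str" type="xs:string"
                  dfdl:lengthKind="explicit" dfdl:length="{ ../len }"/>
    </xs:sequence>
    <!-- no following field: fall back to a delimited string -->
    <xs:element name="str" type="xs:string" dfdl:lengthKind="delimited"/>
  </xs:choice>
</xs:complexType>

<xs:element name="Test">
  <xs:complexType>
    <xs:sequence dfdl:separator="/" dfdl:separatorPosition="infix">
      <xs:element name="A" type="xs:string"/>
      <!-- B and C reuse the same type; each lookahead is independent -->
      <xs:element name="B" type="FixedIfFollowed" minOccurs="0"/>
      <xs:element name="C" type="FixedIfFollowed" minOccurs="0"/>
      <xs:element name="D" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

With this shape, adding a fifth field should only mean adding one more element line, at the cost of still repeating the "/" inside the lookahead pattern. The enumeration checks and the pad/trim properties from the original schema are left out here and would need to be added back to the str elements.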