Yeah, I'm with Josh here. This is really quite hard to understand, and I have many reasons why I would suggest different techniques.
Use of a regex, especially one with a * at the end, is never recommended. (DoS attack vulnerability - I can feed your parser data that will be very very slow to fail, thereby denying service.)

Can you share with us why, why, oh why, you want to do this with pattern? Because to me, you said the field is 8 characters, .... but then you are saying lengthKind pattern, right there I have a problem with this. The two just don't go together.

________________________________
From: Adams, Joshua <jad...@owlcyberdefense.com>
Sent: Friday, July 30, 2021 2:41 PM
To: users@daffodil.apache.org <users@daffodil.apache.org>
Subject: Re: Bug in Daffodil?

Roger,

I know we discussed this in the previous email chain, but I would recommend not using a regex to absorb these whitespace characters and instead rely on the textPadKind="padChar" while also specifying xs:minLength and xs:maxLength.

Also, is there a reason your regex isn't simply "[ ]{1,4}[0-9]{0,3}" -> 1 to 4 spaces followed by 0 to 3 numbers? While yours should still work and I'm working on setting up a test case to verify, it does seem overly complicated.

Josh

________________________________
From: Roger L Costello <coste...@mitre.org>
Sent: Friday, July 30, 2021 1:35 PM
To: users@daffodil.apache.org <users@daffodil.apache.org>
Subject: Bug in Daffodil?

Hi Folks,

A runway's surface composition can be ASPHALT, BRICK, CLAY, etc. I want the RunwayComposition element to be 8 characters long, with space padding at the end, where needed. This works great:

<xs:element name="RunwayComposition" type="xs:string"
            dfdl:lengthKind="pattern"
            dfdl:lengthPattern="(ASPHALT|BRICK|CLAY|CONCR|EARTH|GRASS|GRAVEL)[ ]*">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:assert test="{ fn:string-length(.) eq 8 }"/>
    </xs:appinfo>
  </xs:annotation>
</xs:element>

Notice [ ]* at the end of the regex.
The assert requires the string length to equal 8, so Daffodil ensures that the input has the correct number of spaces at the end of a value.

Next, a runway's width can be 0-999 feet. I want the RunwayWidth element to be 4 characters long, with space padding before the number, where needed. This doesn't work:

<xs:element name="RunwayWidth" type="xs:string"
            dfdl:lengthKind="pattern"
            dfdl:lengthPattern="[ ]*([0-9]|[1-9][0-9]|[1-8][0-9][0-9]|9[0-8][0-9]|99[0-9])">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:assert test="{ fn:string-length(.) eq 4 }"/>
    </xs:appinfo>
  </xs:annotation>
</xs:element>

Notice [ ]* at the start of the regex. Daffodil bombs, giving this error message:

[error] Parse Error: Assertion failed: { fn:string-length(.) eq 4 } failed

I believe this is a bug in Daffodil. Do you agree?

Oddly enough, this does work:

<xs:element name="RunwayWidth" type="xs:string"
            dfdl:lengthKind="pattern"
            dfdl:lengthPattern="[ ]*(99[0-9]|9[0-8][0-9]|[1-8][0-9][0-9]|[1-9][0-9]|[0-9])">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:assert test="{ fn:string-length(.) eq 4 }"/>
    </xs:appinfo>
  </xs:annotation>
</xs:element>

I reversed the order of the clauses in the regex so that the longest one -- 99[0-9] -- is listed first and the shortest one -- [0-9] -- is listed last.

This also works:

<xs:element name="RunwayWidth" type="xs:string"
            dfdl:lengthKind="pattern"
            dfdl:lengthPattern="[ ]{3,3}[0-9]|[ ]{2,2}[1-9][0-9]|[ ]{1,1}[1-8][0-9][0-9]|[ ]{1,1}9[0-8][0-9]|[ ]{1,1}99[0-9]">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:assert test="{ fn:string-length(.) eq 4 }"/>
    </xs:appinfo>
  </xs:annotation>
</xs:element>

Notice that for each clause of the regex, I have added the appropriate space padding to ensure the length is 4.
/Roger

Here is my input document:

300/ASPHALT --

And here is my DFDL schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:fn="http://www.w3.org/2005/xpath-functions"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:format
          textBidi="no"
          floating="no"
          encodingErrorPolicy="replace"
          outputNewLine="%CR;%LF;"
          leadingSkip="0"
          trailingSkip="0"
          alignment="1"
          alignmentUnits="bytes"
          textPadKind="none"
          textTrimKind="none"
          truncateSpecifiedLengthString="no"
          escapeSchemeRef=""
          representation="text"
          encoding="ASCII"
          lengthKind="delimited"
          initiator=""
          terminator=""
          ignoreCase="yes"
          sequenceKind="ordered"
          separator=""
          initiatedContent="no"
          emptyValueDelimiterPolicy="none"
          fillByte="%SP;"
          textNumberRep="standard"
          textStandardBase="10"
          textNumberRounding="pattern"
          textStandardZeroRep="0"
          textStandardExponentRep="E"
          textNumberCheckPolicy="strict"
          lengthUnits="characters"
          separatorSuppressionPolicy="trailingEmptyStrict"
      />
    </xs:appinfo>
  </xs:annotation>

  <xs:element name="Runway" dfdl:terminator="--">
    <xs:complexType>
      <xs:sequence dfdl:separator="/" dfdl:separatorPosition="infix">

        <!-- This works -->
        <xs:element name="RunwayWidth" type="xs:string"
                    dfdl:lengthKind="pattern"
                    dfdl:lengthPattern="[ ]*(99[0-9]|9[0-8][0-9]|[1-8][0-9][0-9]|[1-9][0-9]|[0-9])">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ fn:string-length(.) eq 4 }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>

        <!-- Doesn't work -->
        <!--<xs:element name="RunwayWidth" type="xs:string"
                    dfdl:lengthKind="pattern"
                    dfdl:lengthPattern="[ ]*([0-9]|[1-9][0-9]|[1-8][0-9][0-9]|9[0-8][0-9]|99[0-9])">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ fn:string-length(.) eq 4 }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>-->

        <!-- This works -->
        <!--<xs:element name="RunwayWidth" type="xs:string"
                    dfdl:lengthKind="pattern"
                    dfdl:lengthPattern="[ ]{3,3}[0-9]|[ ]{2,2}[1-9][0-9]|[ ]{1,1}[1-8][0-9][0-9]|[ ]{1,1}9[0-8][0-9]|[ ]{1,1}99[0-9]">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ fn:string-length(.) eq 4 }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>-->

        <xs:element name="RunwayComposition" type="xs:string"
                    dfdl:lengthKind="pattern"
                    dfdl:lengthPattern="(ASPHALT|BRICK|CLAY|CONCR|EARTH|GRASS|GRAVEL)[ ]*">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:assert test="{ fn:string-length(.) eq 8 }"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>

      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
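
A minimal sketch of why the clause order in the RunwayWidth patterns matters. It uses Python's re module as a stand-in for the regex engine behind dfdl:lengthPattern; both are backtracking engines that try alternatives left to right and keep the first one that succeeds, but that Daffodil behaves identically here is an assumption, not something verified against Daffodil. The sketch also assumes the width field arrives as " 300" (one leading pad space, which the archived input line may have lost). The helper name matched_prefix is hypothetical.

```python
import re

# Assumed input: a 4-character width field (one pad space + "300"),
# followed by the rest of the record from the thread.
DATA = " 300/ASPHALT --"

# The three RunwayWidth lengthPattern values quoted in the thread,
# plus the simpler pattern Josh suggested.
SHORTEST_FIRST = r"[ ]*([0-9]|[1-9][0-9]|[1-8][0-9][0-9]|9[0-8][0-9]|99[0-9])"
LONGEST_FIRST = r"[ ]*(99[0-9]|9[0-8][0-9]|[1-8][0-9][0-9]|[1-9][0-9]|[0-9])"
PADDED = (r"[ ]{3,3}[0-9]|[ ]{2,2}[1-9][0-9]|[ ]{1,1}[1-8][0-9][0-9]"
          r"|[ ]{1,1}9[0-8][0-9]|[ ]{1,1}99[0-9]")
JOSH_SUGGESTION = r"[ ]{1,4}[0-9]{0,3}"


def matched_prefix(pattern: str, data: str) -> str:
    """Return the text the pattern matches at the start of data ("" if none)."""
    m = re.match(pattern, data)  # re.match anchors at the start only
    return m.group(0) if m else ""


for name, pattern in [
    ("shortest-first", SHORTEST_FIRST),
    ("longest-first", LONGEST_FIRST),
    ("padded clauses", PADDED),
    ("Josh's suggestion", JOSH_SUGGESTION),
]:
    prefix = matched_prefix(pattern, DATA)
    # The length printed here is what fn:string-length(.) would be compared to.
    print(f"{name:18} matched {prefix!r} (length {len(prefix)})")
```

With the shortest-first ordering, the single-digit clause [0-9] succeeds on "3" and the engine stops there, so the matched length is 2 and an assert of length 4 would fail. Alternation is first-match, not longest-match, which would explain the reported behavior without a bug in the parser.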