I think I know what is happening.

In the battle of delimiters vs. nested explicit length, explicit wins.

So suppose the data is abc/-/cef.

After parsing abc and then finding the separator /, the next field is
LatitudeDegrees, which has explicit length 2. Explicit length "wins", so
"-/" becomes the value of that string.

Validation then reports a problem with that value, because Daffodil's
"limited" validation is done as the elements are parsed and "-/" does not
match the [0-9]{2} pattern facet.

This does not cause backtracking. It is just a "warning" that the seemingly
well-formed data is invalid.

Then LatitudeMinutes is parsed. It uses the ever-problematic lengthKind
"pattern", which succeeds with a zero-length string, and that also causes
a validation error, again because a zero-length string doesn't look like
the digits you expect.

Then it parses the Hyphen element, which is just a string of length 1.

I'll stop here because things are clearly off the rails.

Here's my suggestion for how to fix this and get Daffodil to magically do
what you want, which is to pay attention to the facets.

<!-- vString = 'validated string'. Facets are checked while parsing. -->
<xs:simpleType name="vString">
   <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/">
       <dfdl:assert message="Invalid value">{ dfdl:checkConstraints(.) }</dfdl:assert>
   </xs:appinfo></xs:annotation>
   <xs:restriction base="xs:string"/>
</xs:simpleType>

Define all your strings with vString as your type, and it should behave
much more like you expect.
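
For instance, here is a sketch of how LatitudeDegrees from your schema
might look with vString as the base. It is untested. It assumes the assert
on vString is combined onto the element through the restriction, and that
checkConstraints(.) then sees the pattern facet of the derived type.

    <xs:element name="LatitudeDegrees"
                dfdl:lengthKind="explicit" dfdl:length="2">
        <xs:simpleType>
            <!-- base was xs:string; vString adds the checkConstraints assert -->
            <xs:restriction base="vString">
                <xs:pattern value="[0-9]{2}"/>
            </xs:restriction>
        </xs:simpleType>
    </xs:element>

With that change, "-/" should fail the assert while LatitudeDegrees is
being parsed. That is a parse error rather than a validation error, so
Daffodil should backtrack and try the Origin_ branch of the choice.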

Now normally I tell people not to call checkConstraints(.) on everything
because it fails to distinguish well-formed data from invalid data, and
often one wants the parse to succeed even if the data is invalid.

In your case things are different. You have not provided enough information
in the DFDL properties to parse this data. The facets are necessary
information to successfully parse it.

You will want to complement vString with the use of discriminators. For
example, I think your schema should have a discriminator after the
LatitudeDegrees element, because once that element has parsed successfully,
backtracking to the nilled case no longer makes sense.
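
A rough sketch of such a discriminator follows. It is untested, and the
placement is an assumption: as far as I recall, an assert and a
discriminator cannot both apply to the same element, and vString already
contributes an assert, so the discriminator sits on an empty sequence
right after LatitudeDegrees. It also assumes the fn prefix is bound to
http://www.w3.org/2005/xpath-functions on the xs:schema element.

    <!-- ... LatitudeDegrees element, unchanged, goes here ... -->
    <xs:sequence>
        <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
                <!-- Reaching this point means LatitudeDegrees parsed, so
                     commit to the Origin branch of the choice. -->
                <dfdl:discriminator>{ fn:true() }</dfdl:discriminator>
            </xs:appinfo>
        </xs:annotation>
    </xs:sequence>
    <!-- ... LatitudeMinutes and the remaining elements follow ... -->

Once the discriminator has evaluated to true, a later failure inside
Origin is reported as a failure of Origin itself, not retried as Origin_.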




On Thu, Aug 25, 2022 at 7:01 AM Roger L Costello <coste...@mitre.org> wrote:

> Hi Folks,
>
> Here are two sample inputs:
>
> John Doe/2006N-05912E/Sally Smith
> John Doe/-/Sally Smith
>
> It is the field in the middle that is of interest.
>
> The field is a composite field, i.e., it consists of a series of parts:
> lat degrees, lat minutes, lat hemisphere, hyphen, long degrees, long
> minutes, long hemisphere. No separator between the parts.
>
> The field is nillable and the hyphen is the nil value.
>
> The first input shown above succeeds, the second fails to parse.
>
> What we have here is a variable length, nillable element with a
> complexType and the nil value is not %ES;. As we have determined in
> previous posts, Daffodil does not support this. So, the workaround is to
> place the element in a choice, where the first branch of the choice is the
> element minus the nillable stuff and the second branch is a plain string
> element that is nillable. Well, I implemented that and Daffodil complains:
>
> [error] Parse Error: Failed to parse infix separator. Cause: Parse Error:
> Separator '/' not found
>
> When I use the -V limited parse option I get a completely different set of
> error messages, e.g.:
>
> [error] Validation Error: LatitudeMinutes failed facet checks due to:
> facet pattern(s):
> [0-9]{2}|[0-9]{2}\.[0-9]{1}|[0-9]{2}\.[0-9]{2}|[0-9]{2}\.[0-9]{3}|[0-9]{2}\.[0-9]{4}
>
> Am I doing something wrong in my DFDL schema (shown below) or is this a
> bug in Daffodil?  /Roger
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xs:schema xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
>                      xmlns:xs="http://www.w3.org/2001/XMLSchema">
>     <xs:annotation>
>         <xs:appinfo source="http://www.ogf.org/dfdl/">
>             <dfdl:format
>                 alignment="1"
>                 alignmentUnits="bytes"
>                 emptyValueDelimiterPolicy="none"
>                 encoding="ASCII"
>                 encodingErrorPolicy="replace"
>                 escapeSchemeRef=""
>                 fillByte="%SP;"
>                 floating="no"
>                 ignoreCase="yes"
>                 initiatedContent="no"
>                 initiator=""
>                 leadingSkip="0"
>                 lengthKind="delimited"
>                 lengthUnits="characters"
>                 nilValueDelimiterPolicy="none"
>                 occursCountKind="implicit"
>                 outputNewLine="%CR;%LF;"
>                 representation="text"
>                 separator=""
>                 separatorSuppressionPolicy="anyEmpty"
>                 sequenceKind="ordered"
>                 textBidi="no"
>                 textPadKind="none"
>                 textTrimKind="none"
>                 trailingSkip="0"
>                 truncateSpecifiedLengthString="no"
>                 terminator=""
>                 textNumberRep="standard"
>                 textStandardBase="10"
>                 textStandardZeroRep="0"
>                 textNumberRounding="pattern"
>                 textStandardExponentRep="E"
>                 textNumberCheckPolicy="strict"/>
>         </xs:appinfo>
>     </xs:annotation>
>     <xs:element name="Test">
>         <xs:complexType>
>             <xs:sequence dfdl:separator="/" dfdl:separatorPosition="infix">
>                 <xs:element name="A" type="xs:string"/>
>                 <xs:choice dfdl:choiceLengthKind="implicit">
>                     <xs:element name="Origin">
>                         <xs:complexType>
>                             <xs:sequence dfdl:separator="">
>                                 <xs:element name="LatitudeDegrees"
> dfdl:lengthKind="explicit" dfdl:length="2">
>                                     <xs:simpleType>
>                                         <xs:restriction base="xs:string">
>                                             <xs:pattern value="[0-9]{2}"/>
>                                         </xs:restriction>
>                                     </xs:simpleType>
>                                 </xs:element>
>                                 <xs:element name="LatitudeMinutes"
> dfdl:lengthKind="pattern" dfdl:lengthPattern=".*?(?=(N|S))">
>                                     <xs:simpleType>
>                                         <xs:restriction base="xs:string">
>                                             <xs:pattern value="[0-9]{2}"/>
>                                             <xs:pattern
> value="[0-9]{2}\.[0-9]{1}"/>
>                                             <xs:pattern
> value="[0-9]{2}\.[0-9]{2}"/>
>                                             <xs:pattern
> value="[0-9]{2}\.[0-9]{3}"/>
>                                             <xs:pattern
> value="[0-9]{2}\.[0-9]{4}"/>
>                                         </xs:restriction>
>                                     </xs:simpleType>
>                                 </xs:element>
>                                 <xs:element name="LatitudeHemisphere"
> dfdl:lengthKind="explicit" dfdl:length="1">
>                                     <xs:simpleType>
>                                         <xs:restriction base="xs:string">
>                                             <xs:enumeration value="N"/>
>                                             <xs:enumeration value="S"/>
>                                         </xs:restriction>
>                                     </xs:simpleType>
>                                 </xs:element>
>                                 <xs:element name="Hyphen"
> dfdl:lengthKind="explicit" dfdl:length="1">
>                                     <xs:simpleType>
>                                         <xs:restriction base="xs:string">
>                                             <xs:enumeration value="-"/>
>                                         </xs:restriction>
>                                     </xs:simpleType>
>                                 </xs:element>
>                                 <xs:element name="LongitudeDegrees"
> dfdl:lengthKind="explicit" dfdl:length="3">
>                                     <xs:simpleType>
>                                         <xs:restriction base="xs:string">
>                                             <xs:pattern value="[0-9]{3}"/>
>                                         </xs:restriction>
>                                     </xs:simpleType>
>                                 </xs:element>
>                                 <xs:element name="LongitudeMinutes"
> dfdl:lengthKind="pattern" dfdl:lengthPattern=".*?(?=(E|W))">
>                                     <xs:simpleType>
>                                         <xs:restriction base="xs:string">
>                                             <xs:pattern value="[0-9]{2}"/>
>                                             <xs:pattern
> value="[0-9]{2}\.[0-9]{1}"/>
>                                             <xs:pattern
> value="[0-9]{2}\.[0-9]{2}"/>
>                                             <xs:pattern
> value="[0-9]{2}\.[0-9]{3}"/>
>                                             <xs:pattern
> value="[0-9]{2}\.[0-9]{4}"/>
>                                         </xs:restriction>
>                                     </xs:simpleType>
>                                 </xs:element>
>                                 <xs:element name="LongitudeHemisphere">
>                                     <xs:simpleType>
>                                         <xs:restriction base="xs:string">
>                                             <xs:enumeration value="E"/>
>                                             <xs:enumeration value="W"/>
>                                         </xs:restriction>
>                                     </xs:simpleType>
>                                 </xs:element>
>                             </xs:sequence>
>                         </xs:complexType>
>                     </xs:element>
>                     <xs:element name="Origin_" type="xs:string"
> nillable="true" dfdl:nilKind="literalValue" dfdl:nilValue="-"/>
>                 </xs:choice>
>                 <xs:element name="B" type="xs:string"/>
>             </xs:sequence>
>         </xs:complexType>
>     </xs:element>
> </xs:schema>
>
>
