Hi Folks,

My input contains a social security number (SSN), e.g.,

123-45-6789

If I declare the SSN element like this:

<xs:element name="SSN"
            dfdl:lengthKind="explicit"
            dfdl:length="11">
    <xs:simpleType>
        <xs:restriction base="xs:string">
            <xs:pattern value="[0-9]{3}-[0-9]{2}-[0-9]{4}"/>
        </xs:restriction>
    </xs:simpleType>
</xs:element>

then the parser will accept well-formed but invalid data such as this:

xxx-45-6789

If I want to be notified that the data is not valid, I can run the parser 
with the -V limited option. The parser then both generates XML and reports 
that the input is not valid.

If I add checkConstraints:

<xs:element name="SSN"
            dfdl:lengthKind="explicit"
            dfdl:length="11">
    <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:assert>{ dfdl:checkConstraints(.) }</dfdl:assert>
        </xs:appinfo>
    </xs:annotation>
    <xs:simpleType>
        <xs:restriction base="xs:string">
            <xs:pattern value="[0-9]{3}-[0-9]{2}-[0-9]{4}"/>
        </xs:restriction>
    </xs:simpleType>
</xs:element>

then the parser no longer accepts well-formed but invalid data. No XML is 
generated.

Lesson Learned: Don't use checkConstraints if you want parsing to accept 
well-formed but invalid input.

But, but, but, ........

Things aren't that simple.

Suppose SSN is part of a choice. The choice has two branches. The first branch 
specifies RealID, a space, then SSN; the second branch specifies SSN, a space, 
then RealID.

Consider this valid input:

123-45-6789 A12345678

If the DFDL does not use checkConstraints, then this incorrect XML is generated:

  <PersonID>
    <RealID>123-45-6789</RealID>
    <Space> </Space>
    <SSN>A12345678</SSN>
  </PersonID>

Notice that the <RealID> value is the SSN and the <SSN> value is the RealID.

If we want to get correct XML, then we must use checkConstraints.

Lesson Learned: Use checkConstraints if you want parsing to generate correct 
XML.
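
To make the choice concrete, here is a minimal sketch of what such a schema might look like. The element names come from the XML above. Everything else is my assumption for illustration: the lengthPattern values, the RealID facet pattern, the one-character Space element, and a default dfdl:format (encoding, lengthUnits, and so on) defined elsewhere in the schema.

```xml
<!-- Sketch only. The lengthPattern values and the RealID pattern are assumed. -->
<xs:element name="SSN" dfdl:lengthKind="pattern" dfdl:lengthPattern="[^ ]+">
    <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
            <!-- A failed assert is a processing error, so the choice backtracks -->
            <dfdl:assert>{ dfdl:checkConstraints(.) }</dfdl:assert>
        </xs:appinfo>
    </xs:annotation>
    <xs:simpleType>
        <xs:restriction base="xs:string">
            <xs:pattern value="[0-9]{3}-[0-9]{2}-[0-9]{4}"/>
        </xs:restriction>
    </xs:simpleType>
</xs:element>

<xs:element name="RealID" dfdl:lengthKind="pattern" dfdl:lengthPattern="[^ ]+">
    <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:assert>{ dfdl:checkConstraints(.) }</dfdl:assert>
        </xs:appinfo>
    </xs:annotation>
    <xs:simpleType>
        <xs:restriction base="xs:string">
            <!-- Assumed format: one letter followed by eight digits -->
            <xs:pattern value="[A-Z][0-9]{8}"/>
        </xs:restriction>
    </xs:simpleType>
</xs:element>

<xs:element name="Space" type="xs:string"
            dfdl:lengthKind="explicit"
            dfdl:length="1"/>

<xs:element name="PersonID">
    <xs:complexType>
        <xs:choice>
            <!-- Branch 1: RealID space SSN -->
            <xs:sequence>
                <xs:element ref="RealID"/>
                <xs:element ref="Space"/>
                <xs:element ref="SSN"/>
            </xs:sequence>
            <!-- Branch 2: SSN space RealID -->
            <xs:sequence>
                <xs:element ref="SSN"/>
                <xs:element ref="Space"/>
                <xs:element ref="RealID"/>
            </xs:sequence>
        </xs:choice>
    </xs:complexType>
</xs:element>
```

With the asserts in place, the first branch fails on 123-45-6789 because it does not satisfy the RealID pattern, so the parser backtracks and the second branch succeeds. Without the asserts, the first branch succeeds on any two space-separated tokens, which gives the swapped XML shown above. With the asserts, an input such as xxx-45-6789 A12345678 fails both branches, so no XML is generated.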

Overall Lesson Learned: You can't have a DFDL schema that both accepts 
well-formed but invalid data and always produces correct XML.

Do you agree?

/Roger
