Hi Folks,
I am jumping around in my writeups.
As always, please let me know of anything that is unclear.  /Roger
--------------------------------------------------------------------------------------
11. Variable length, nillable, composite, no choice

A composite field is one that is composed of parts. There is no separator 
between the parts. The parts may be fixed length or variable length. The parts 
are non-nillable, although the composite field itself may be nillable.
This section deals with a nillable composite field in which some of the parts 
are variable length.
We will create a DFDL schema for a "Location" field that has a latitude and 
longitude, separated by a dash. Here is a sample value:
2006N-05912E
That is one value with 7 parts:
The first two digits (20) represent the latitude in degrees.
The next two digits (06) represent the latitude in minutes.
The N indicates the latitude's hemisphere.
The dash ( - ) separates the latitude values from the following longitude 
values.
The 059 represents the longitude in degrees.
The 12 represents the longitude in minutes.
The E represents the longitude hemisphere.
In other words, the location is latitude 20 degrees, 6 minutes North, longitude 
59 degrees, 12 minutes East.
Both the latitude minutes and the longitude minutes are variable length: each 
is expressed as a two-digit integer or as a decimal value. If a decimal, there 
may be 1-4 digits to the right of the decimal point. Here are Location values 
whose minute parts have decimal values:
4221.6N-71003.5W
4221.63N-71003.57W
4221.630N-71003.576W
4221.6300N-71003.5760W
Here is one more example of a valid Location value:
-
That value means: no data was available to populate the field.
To re-emphasize, Location is a variable length, nillable, composite field.
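To make the seven-part breakdown concrete, here is a minimal Python sketch (not DFDL) that splits a Location value with a single regular expression. The function name split_location and the group names are made up for illustration, and Python's re module is only a stand-in for whatever a DFDL processor does internally.

```python
import re

# Hypothetical helper, not part of any DFDL tooling. Each named group mirrors
# one of the seven parts described above.
LOCATION = re.compile(
    r"(?P<lat_deg>[0-9]{2})"
    r"(?P<lat_min>[0-9]{2}(?:\.[0-9]{1,4})?)"
    r"(?P<lat_hem>[NS])"
    r"(?P<hyphen>-)"
    r"(?P<lon_deg>[0-9]{3})"
    r"(?P<lon_min>[0-9]{2}(?:\.[0-9]{1,4})?)"
    r"(?P<lon_hem>[EW])"
)

def split_location(text):
    """Return the seven parts as a dict, or None for the nil value "-"."""
    if text == "-":
        return None  # no data was available to populate the field
    m = LOCATION.fullmatch(text)
    if m is None:
        raise ValueError(f"not a valid Location value: {text!r}")
    return m.groupdict()

print(split_location("2006N-05912E"))
print(split_location("-"))
```

The nil value is checked before the regex is tried because a lone hyphen does not fit the seven-part layout at all.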
Here is an XML Schema declaration of Location, sans any DFDL properties:
<xs:element name="Location" nillable="true">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="LatitudeDegrees">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:pattern value="[0-9]{2}" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="LatitudeMinutes">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:pattern value="[0-9]{2}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{1}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{2}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{3}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{4}" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="LatitudeHemisphere">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:enumeration value="N" />
                        <xs:enumeration value="S" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="Hyphen">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:enumeration value="-" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="LongitudeDegrees">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:pattern value="[0-9]{3}" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="LongitudeMinutes">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:pattern value="[0-9]{2}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{1}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{2}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{3}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{4}" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="LongitudeHemisphere">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:enumeration value="E" />
                        <xs:enumeration value="W" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>
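The five xs:pattern facets on LatitudeMinutes and LongitudeMinutes act as alternatives, since pattern facets in the same restriction are ORed together. As a rough check, the Python sketch below compares them against a compact regex, [0-9]{2}(\.[0-9]{1,4})?. The compact form and the function name valid_minutes are illustrative only and do not appear in the schema.

```python
import re

# The five xs:pattern facets from the declaration above. XML Schema patterns
# must match the whole value, so re.fullmatch is the closest Python analogue.
FACETS = [
    r"[0-9]{2}",
    r"[0-9]{2}\.[0-9]{1}",
    r"[0-9]{2}\.[0-9]{2}",
    r"[0-9]{2}\.[0-9]{3}",
    r"[0-9]{2}\.[0-9]{4}",
]

# Illustrative shorthand (an assumption, not taken from the schema).
COMPACT = re.compile(r"[0-9]{2}(\.[0-9]{1,4})?")

def valid_minutes(text):
    """True if text satisfies at least one of the five facets."""
    return any(re.fullmatch(p, text) for p in FACETS)

samples = ["06", "21.6", "21.63", "21.630", "21.6300",
           "21.63000", "21.", "6", "", "2a"]
for s in samples:
    # Both columns should agree for every sample.
    print(repr(s), valid_minutes(s), bool(COMPACT.fullmatch(s)))
```

Whether to keep five separate facets or fold them into one is a matter of taste; the separate facets spell out each allowed form.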
These parts have fixed length: LatitudeDegrees, LatitudeHemisphere, Hyphen, 
LongitudeDegrees, and LongitudeHemisphere.
These parts have variable length: LatitudeMinutes and LongitudeMinutes.
For the fixed length parts, add these two DFDL properties:
dfdl:lengthKind="explicit"
dfdl:length="__"
For example, LatitudeDegrees has a fixed length of 2. Here is its declaration, 
with the DFDL properties added:
<xs:element name="LatitudeDegrees"
                      dfdl:lengthKind="explicit"
                      dfdl:length="2">
    <xs:simpleType>
        <xs:restriction base="xs:string">
            <xs:pattern value="[0-9]{2}" />
        </xs:restriction>
    </xs:simpleType>
</xs:element>
Use the same strategy for the other fixed-length parts.
LatitudeMinutes is variable length. The part that follows it 
(LatitudeHemisphere) has a fixed length (its value is either N or S). To 
declare LatitudeMinutes, add these two DFDL properties:
dfdl:lengthKind="pattern"
dfdl:lengthPattern="regex"
In the regex, use a lookahead for the hemisphere letter that follows. Here is 
LatitudeMinutes, extended with the DFDL properties:
<xs:element name="LatitudeMinutes"
                       dfdl:lengthKind="pattern"
                       dfdl:lengthPattern=".*?(?=(N|S))">
    <xs:simpleType>
        <xs:restriction base="xs:string">
            <xs:pattern value="[0-9]{2}"/>
            <xs:pattern value="[0-9]{2}\.[0-9]{1}"/>
            <xs:pattern value="[0-9]{2}\.[0-9]{2}"/>
            <xs:pattern value="[0-9]{2}\.[0-9]{3}"/>
            <xs:pattern value="[0-9]{2}\.[0-9]{4}"/>
        </xs:restriction>
    </xs:simpleType>
</xs:element>
Read that as: the content of LatitudeMinutes is the text up to, but not 
including, N or S.
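The behavior of that lookahead can be tried out in Python, whose re module accepts the same pattern text. This is only a sketch: a DFDL processor uses its own regex engine, and the starting offset of 2 (just past LatitudeDegrees) is supplied by hand here.

```python
import re

# Same pattern text as dfdl:lengthPattern on LatitudeMinutes.
LENGTH_PATTERN = re.compile(r".*?(?=(N|S))")

def minutes_at(data, pos):
    """Return the text matched from pos up to (not including) N or S."""
    m = LENGTH_PATTERN.match(data, pos)
    if m is None:
        raise ValueError("no N or S found after the minutes")
    return m.group(0)

# Offset 2 skips the two LatitudeDegrees digits.
print(minutes_at("2006N-05912E", 2))
print(minutes_at("4221.63N-71003.57W", 2))
```

The lazy .*? stops at the first N or S, and the lookahead leaves that letter unconsumed for LatitudeHemisphere. In this Python stand-in the pattern also accepts an empty match (for example "20N-..."), so it is the xs:pattern facets, not the length pattern, that reject malformed minutes.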
Use the same regex lookahead strategy for LongitudeMinutes, looking ahead for 
E or W instead of N or S.
As I stated earlier, Location is nillable, with a hyphen as the nil value. 
Further, Location has a complexType. That is a problem. See section 2 for a 
complete discussion of the problem with nillable complexTypes and how to deal 
with it.
Here's the DFDL schema for the Location field. Note that nillable="true" no 
longer appears on Location here; see section 2 for how the nil case is dealt 
with.
<xs:element name="Location">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="LatitudeDegrees"
                                   dfdl:lengthKind="explicit"
                                   dfdl:length="2">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:pattern value="[0-9]{2}" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="LatitudeMinutes"
                                   dfdl:lengthKind="pattern"
                                   dfdl:lengthPattern=".*?(?=(N|S))">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:pattern value="[0-9]{2}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{1}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{2}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{3}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{4}" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="LatitudeHemisphere"
                                   dfdl:lengthKind="explicit"
                                   dfdl:length="1">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:enumeration value="N" />
                        <xs:enumeration value="S" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="Hyphen"
                                  dfdl:lengthKind="explicit"
                                  dfdl:length="1">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:enumeration value="-" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="LongitudeDegrees"
                                   dfdl:lengthKind="explicit"
                                   dfdl:length="3">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:pattern value="[0-9]{3}" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="LongitudeMinutes"
                                   dfdl:lengthKind="pattern"
                                   dfdl:lengthPattern=".*?(?=(E|W))">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:pattern value="[0-9]{2}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{1}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{2}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{3}" />
                        <xs:pattern value="[0-9]{2}\.[0-9]{4}" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
            <xs:element name="LongitudeHemisphere">
                <xs:simpleType>
                    <xs:restriction base="xs:string">
                        <xs:enumeration value="E" />
                        <xs:enumeration value="W" />
                    </xs:restriction>
                </xs:simpleType>
            </xs:element>
        </xs:sequence>
    </xs:complexType>
</xs:element>
Notice that the last part (LongitudeHemisphere) has no DFDL properties added. 
This is because I am assuming that it is followed by the delimiter for the 
Location field, so that delimiter marks where the part ends.
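As a rough illustration of how the length properties drive the parse, here is a Python sketch that walks a Location value part by part. It is not how any particular DFDL processor is implemented. The PARTS table and the function walk are hypothetical, and the sketch assumes the caller has already cut the Location text off at the field delimiter, so the last part is simply whatever remains.

```python
import re

# (part name, length kind, length or pattern), copied from the schema above.
# "rest" is an invented marker for the part that runs to the field delimiter.
PARTS = [
    ("LatitudeDegrees", "explicit", 2),
    ("LatitudeMinutes", "pattern", r".*?(?=(N|S))"),
    ("LatitudeHemisphere", "explicit", 1),
    ("Hyphen", "explicit", 1),
    ("LongitudeDegrees", "explicit", 3),
    ("LongitudeMinutes", "pattern", r".*?(?=(E|W))"),
    ("LongitudeHemisphere", "rest", None),
]

def walk(text):
    """Split text into parts by applying each length rule in order."""
    pos = 0
    out = {}
    for name, kind, spec in PARTS:
        if kind == "explicit":
            end = pos + spec
            if end > len(text):
                raise ValueError(f"{name}: ran out of data")
        elif kind == "pattern":
            m = re.compile(spec).match(text, pos)
            if m is None:
                raise ValueError(f"{name}: length pattern did not match")
            end = m.end()
        else:
            end = len(text)  # assumed: text already ends at the delimiter
        out[name] = text[pos:end]
        pos = end
    return out

print(walk("4221.63N-71003.57W"))
```

Feeding the nil value "-" to walk raises ValueError at LatitudeDegrees, which is one way to see why the nil case needs the separate treatment discussed in section 2.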
