Examples of this are common in formats that allow extra trailing empty 
separators.

EDIFACT depends on this.

https://github.com/DFDLSchemas/EDIFACT/tree/daffodil-dev

That's the daffodil-dev branch of the EDIFACT DFDL-schema repository.

This DFDL schema works both in Daffodil and in IBM DFDL.

'sbt test' will test it.

Now, to your specific question.

trailingEmpty and trailingEmptyStrict take no position on empty occurrences 
(adjacent separators) that appear before the trailing part of the data.

Those may or may not be errors. Consider this schema, which lets us construct 
examples where an earlier empty occurrence (adjacent separators) causes an error:

<sequence dfdl:separatorSuppressionPolicy="trailingEmptyStrict" 
dfdl:separator="/">
  <element name="f1" type="xs:int"/>
  <element name="a1" type="xs:string" maxOccurs="5"/>
</sequence>

Instance

1/a/b/c  -- is well-formed

1/a  -- is well-formed
1/a///  -- is malformed; it violates trailingEmptyStrict
/a  -- is malformed because f1, the first element, is empty and must be an int.

That last one is an earlier, non-trailing empty occurrence causing an error.
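
To make those four cases concrete, here is a minimal Python sketch. It is a toy model of the two rules at play (f1 must be an int, and no empty occurrence may trail the data), not Daffodil's parser. The function name classify is made up for illustration, and the sketch ignores maxOccurs and anything else the schema would enforce.

```python
def classify(data, sep="/"):
    """Toy model of the f1/a1 sequence above; NOT Daffodil's parser."""
    f1, *rest = data.split(sep)
    # f1 is xs:int, so an empty or non-numeric first field is an error.
    try:
        int(f1)
    except ValueError:
        return "malformed: f1 is not an int"
    # trailingEmptyStrict: an empty occurrence at the very end is an error.
    if rest and rest[-1] == "":
        return "malformed: trailing empty"
    return "well-formed"


for instance in ["1/a/b/c", "1/a", "1/a///", "/a"]:
    print(instance, "--", classify(instance))
```

The point of the sketch is only that the "/a" failure comes from the type of f1, not from the separator suppression policy.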

Here's a funny one:

1/a/////////////////b -- is well-formed

That's well-formed because the final 'b' means none of those empty occurrences 
are "trailing". They're also optional and empty, so they are not added to the 
infoset. So this produces the same infoset as, and unparses to:

1/a/b

Furthermore, note that maxOccurs is '5', but there are far more than 5 empty 
values in that funny data. These are tolerated. DFDL does not create element 
occurrences in the infoset for empty optional elements.
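
As a rough illustration of why the funny data and 1/a/b end up with the same infoset, here is a small Python sketch. It is a toy model under the assumption that empty occurrences are simply dropped. The names parse_toy and unparse_toy are hypothetical, the dict is only a stand-in for a real infoset, and none of this is Daffodil code.

```python
def parse_toy(data, sep="/"):
    """Toy parse: empty a1 occurrences create nothing in the infoset."""
    f1, *rest = data.split(sep)
    a1 = [value for value in rest if value != ""]
    return {"f1": int(f1), "a1": a1}


def unparse_toy(infoset, sep="/"):
    """Toy unparse: write f1, then each a1 occurrence, separated by sep."""
    return sep.join([str(infoset["f1"])] + infoset["a1"])


funny = parse_toy("1/a/////////////////b")
plain = parse_toy("1/a/b")
print(funny == plain)      # both infosets hold f1=1 and a1=['a', 'b']
print(unparse_toy(funny))  # the empties are gone on the way back out
```

Because the empties never reach the infoset, the maxOccurs of 5 is never exceeded in this model: only two a1 occurrences exist.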

I do think that DFDL is missing a way of expressing the more straightforward 
behavior where the funny one above would not be tolerated because it has more 
than 5 occurrences being separated, and where empty optional element values ARE 
created in the infoset for empty strings.

There is a way to do it using nillable elements. Nil representation takes 
priority over Empty representation in DFDL, so if you want optional "empty 
string" values to occupy array occurrences, the only way to do that in DFDL is 
a nillable element with dfdl:nilValue="%ES;". Then an empty string becomes a 
nilled element, which occupies an array occurrence.

<sequence dfdl:separatorSuppressionPolicy="trailingEmptyStrict" 
dfdl:separator="/">
  <element name="f1" type="xs:int"/>
  <element nillable='true' name="a1" type="xs:string" maxOccurs="5" 
dfdl:nilValue="%ES;" />
</sequence>

Given the above, consider this data:

1/a////b -- is well-formed. The a1 array has 5 elements. The middle 3 are 
nilled.

1/a//////////////b -- is sort of well-formed, sort of malformed. It's 
well-formed data followed by a bunch of left-over data. 
dfdl:occurCountKind='implicit' will stop parsing elements after 5 are counted, 
producing an infoset where the a1 array has 1 value element containing "a" and 
4 nilled elements. At that point it stops parsing and returns an infoset, 
leaving left-over data of "/////////b". Note that nothing went wrong during the 
parse here. It's just the left-over data that allows us to deem this malformed.
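
The nillable variant can be sketched the same way. This is again a toy Python model, not Daffodil: it assumes every separated position becomes an a1 occurrence (None standing in for a nilled element), that parsing stops once 5 occurrences are counted, and that whatever follows is left-over data. The name parse_nillable_toy and the shorter second input are made up for illustration.

```python
def parse_nillable_toy(data, sep="/", max_occurs=5):
    """Toy parse: an empty string is a nilled occurrence (None) and counts."""
    f1, *rest = data.split(sep)
    # Stop after max_occurs occurrences, as with occurCountKind='implicit'.
    counted = rest[:max_occurs]
    a1 = [None if value == "" else value for value in counted]
    # Anything past the counted occurrences is left-over data.
    leftover = "".join(sep + value for value in rest[max_occurs:])
    return {"f1": int(f1), "a1": a1}, leftover


print(parse_nillable_toy("1/a////b"))    # 5 occurrences, nothing left over
print(parse_nillable_toy("1/a//////b"))  # "a" plus 4 nils, then left-over data
```

In this model the parse itself never fails on the longer input. It is only the non-empty left-over string that marks the data as malformed.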



________________________________
From: Roger L Costello <coste...@mitre.org>
Sent: Tuesday, April 20, 2021 8:32 AM
To: users@daffodil.apache.org <users@daffodil.apache.org>
Subject: Seek an example of zero-length data items that are not at the end of a 
sequence

Hi Folks,

separatorSuppressionPolicy="trailingEmptyStrict" means that the separators for 
the zero-length data items that are at the end of a sequence must be omitted.

So, if the separator is the forward slash, then instead of this:

         a/b/c////

the instance document must be this:

        a/b/c

Right?

separatorSuppressionPolicy="trailingEmptyStrict" does not say anything about 
the separators of zero-length data items that are not at the end of a sequence, 
right? I seek an example to illustrate this. That is, I seek an example that 
yields an error if the instance document has omitted the separators of 
zero-length data items that are not at the end of a sequence. Would you provide 
an example of this, please?

/Roger
