Q1: False.

The decisions were driven by the collective memory of the DFDL Workgroup 
members, and by the experience of many others as captured in the 
implementations of many data integration tools. In some cases a property was 
added because a popular, widely used data integration tool had it, and the 
group's consensus was that the property was likely needed in real formats, 
since it would not otherwise exist in that tool.


Case in point: dfdl:separatorPosition, with values prefix, infix, and postfix. 
Do we really know whether this property is needed? Could all formats that use 
it be modeled using just "infix" behavior, plus some extra sequence groups and 
dfdl:terminator or dfdl:initiator? The answer is "maybe", because some data 
integration tools didn't have separator position properties. But figuring that 
out would be an academic exercise. This property was introduced (to my 
knowledge) in the Mercator EII tool and widely copied; e.g., it also appears 
in Microsoft BizTalk, and I believe also in an IBM message broker. So there 
was already plenty of precedent for the need for such a property. Furthermore, 
Mercator representatives on the DFDL workgroup contributed greatly, and there 
was simply no point in trying to negotiate away features they had found a 
need for.
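
To make the three separator positions concrete, here is a minimal Python 
sketch. It is not DFDL and not how any DFDL processor works; the function name 
split_separated and its error handling are made up for illustration. It treats 
prefix and postfix as "strip one leading or trailing separator, then split as 
infix", which is roughly the initiator/terminator-plus-infix modeling asked 
about above. It ignores empty sequences, escaping, and optional items, which 
is where the real question would get harder.

```python
def split_separated(data: str, sep: str, position: str = "infix") -> list[str]:
    """Split data into items for a given separator position (illustrative only)."""
    if not sep:
        raise ValueError("separator must be non-empty")
    if position == "prefix":
        # A separator precedes every item, e.g. ",a,b,c".
        # Strip the first one (as if it were an initiator), then split as infix.
        if not data.startswith(sep):
            raise ValueError("expected a leading separator")
        data = data[len(sep):]
    elif position == "postfix":
        # A separator follows every item, e.g. "a,b,c,".
        # Strip the last one (as if it were a terminator), then split as infix.
        if not data.endswith(sep):
            raise ValueError("expected a trailing separator")
        data = data[:-len(sep)]
    elif position != "infix":
        raise ValueError(f"unknown separator position: {position}")
    # Infix: separators appear only between items, e.g. "a,b,c".
    return data.split(sep)
```

With this sketch, "a,b,c" as infix, ",a,b,c" as prefix, and "a,b,c," as 
postfix all yield the same three items.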


Effectively, DFDL is the union of all properties from a wide array of such 
tools - with some renaming for uniformity.


Q2: False.


XSD provided the opportunity to allow nilled complex types. I do not recall 
whether prior systems had this or not; the ones I am most familiar with did 
not. Somehow, a decision was made to allow this feature in DFDL. At some point 
there was a (misguided, in hindsight) effort to provide a use in DFDL for most 
constructions available in XSD, so this may have been allowed by way of that 
rationale. Subsequently, the complexity of it became clear: a complex type 
would need a set of properties for its nilValue representation in addition to 
the properties for its non-nilled representation. That added too much 
complexity, and there was no precedent for such properties in existing data 
integration systems. So the restriction to ES (the empty string) was put in to 
eliminate this need. Alternatively, we could have dropped the entire feature, 
but that is not what the workgroup decided.


Q3: False.


Some properties were added because there are formats that need them, but 
whether anyone will ever parse those formats with DFDL is unclear. In many 
cases these properties or property values are not yet implemented by any DFDL 
implementation. Example: dfdl:lengthKind='endOfParent'. This property value 
handles the case where a structure of some specified length contains child 
elements such as strings, each of which has a way to determine its length 
except the last one, which is assumed to extend to the end of the enclosing 
parent. This concept definitely exists in data formats I have seen described, 
but nobody has needed it as yet. There are, quite possibly, other ways to 
work around the need for it.
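
The "last child extends to the end of the parent" idea can be sketched in a 
few lines of Python. The record layout here is entirely hypothetical (a 1-byte 
parent length, a 1-byte length-prefixed name, then a comment that runs to the 
end of the parent), as are the names parse_record, name, and comment. It only 
illustrates the concept, not how a DFDL processor would implement 
endOfParent.

```python
def parse_record(data: bytes) -> dict:
    """Parse one record from a hypothetical layout (illustrative only).

    Layout: [parent_len:1][name_len:1][name:name_len][comment: rest of parent]
    """
    if not data:
        raise ValueError("no data")
    parent_len = data[0]
    parent = data[1:1 + parent_len]
    if len(parent) != parent_len or parent_len < 1:
        raise ValueError("truncated or empty parent")
    # First child: its length is given explicitly by a prefix byte.
    name_len = parent[0]
    if 1 + name_len > parent_len:
        raise ValueError("name overruns parent")
    name = parent[1:1 + name_len]
    # Last child: no length of its own; it takes whatever is left of the parent.
    comment = parent[1 + name_len:]
    return {
        "name": name.decode("ascii"),
        "comment": comment.decode("ascii"),
        "rest": data[1 + parent_len:],  # bytes after the parent, left unparsed
    }
```

The only place the comment's length comes from is the parent's own length, 
which is the situation the property value is meant to describe.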


Q4: No, there is no such document. What does exist is a test suite for DFDL 
that exercises every implemented property. It is part of Daffodil.

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Thursday, July 11, 2019 7:26:20 AM
To: [email protected]
Subject: The properties included in DFDL are driven by real-world data formats 
... correct?


Hello DFDL community,



Question #1: Is the following statement true or false?



The decisions on what properties to include in DFDL were driven by real-world 
use cases - that is, by real-world data formats.



Question #2: Is the following statement true or false?



Upon examining the real-world data formats, the DFDL working group discovered 
that it is rare to have anything other than an empty string as the in-band nil 
value in a complex type; because of its rarity (yesterday Mike used the word 
“obscure”), the DFDL working group decided to restrict in-band nil values on 
complex types to the empty string.



Question #3: Is the following statement true or false?



For every DFDL property there are multiple real-world data formats which 
require that DFDL property.



Question #4: Is there a document which shows the mapping between DFDL property 
and real-world data formats requiring that DFDL property?



/Roger
