For the XML fans: as much as we want to avoid DFDL becoming a transformation language, XML fans want to create standard, well-designed XML data using a single schema that describes that XML data but also carries DFDL annotations corresponding to the non-XML form of that data. I.e., just one schema please, not a schema plus two XSLTs, each just as big as the schema, one for parsing and one for unparsing, all of which must be maintained together.

This requires some amount of transformation expressed in the schema, beyond the little bit that DFDL allows now. But the key point: it's not *general* transformation. Nobody is talking about having to break up hierarchical data into RDBMS tables here, or generating surrogate keys, etc. (At least I don't think so.)

I don't have a comprehensive list of what is needed, but I think it is necessary to populate attributes, move elements around, remove tiers of elements by hoisting up the children, replace 'value elements' with simple values or attributes, ... that sort of thing.

An important concept is expressing this in a way that is *invertible*, so that unparsing is automatic (versus expressing two separate transformations: one for parse, one for unparse).

I think this implies that these transformations are neither XSLT-like nor XQuery-like. Both of those are designed for one-way transformations, and they are expressed relative to instance documents, not on schemas of the data. Both are designed to work on XML even if there is no schema. We're after expressing the needed transformations ON the schema. So it's an entirely different concept.

The starting point for any of this is attributes. The stumbling block there is XSD's syntax, where attributes are declared AFTER all the child elements. So the lexical order can't be used to represent the physical order, as we do in the rest of DFDL. If the element has ONLY attributes, lexical order works, and if it has NO attributes, that's what we have now. It's the mixture of the two that creates the problem: even though you wrote the attribute last, it needs some way to specify where in the physical representation it is to be found. A concrete proposal for how to deal with this is needed.
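To make the attribute-ordering problem concrete, here is a minimal plain-XSD sketch. The element and attribute names (`reading`, `value`, `unit`, `sensorId`) are hypothetical and chosen only for illustration, and no DFDL annotations are shown because how attributes would be annotated is exactly the open question.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical element; all names here are illustrative only. -->
  <xs:element name="reading">
    <xs:complexType>
      <!-- Child elements: their lexical order can mirror the physical
           order of the data, which is how DFDL sequences work today. -->
      <xs:sequence>
        <xs:element name="value" type="xs:int"/>
        <xs:element name="unit" type="xs:string"/>
      </xs:sequence>
      <!-- XSD syntax requires attribute declarations to come after the
           content model, even if (by assumption in this example) the
           sensorId field is physically the first thing in the data. -->
      <xs:attribute name="sensorId" type="xs:string"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Here the schema text alone cannot say whether `sensorId` appears before `value`, between `value` and `unit`, or after `unit` in the physical representation, so some additional mechanism would be needed to state its position.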
> *Sent:* Thursday, November 2, 2023 8:27 AM
> *To:* users@daffodil.apache.org
> *Subject:* Re: Proposal: Extensible DFDL
>
> I think extensibility would be great for DFDL.
>
> The DFDL workgroup punted on this as there was no such thing as an
> extensible format description language to generalize into a standard.
>
> We realized that unparsing was already breaking a lot of new ground, but
> it was a must-have feature.
>
> So we had to draw a line somewhere on the number of untested new concepts
> in DFDL or it would never get done. It took 20 years as is to become
> standardized.
>
> Some format description languages may have been implemented this
> extensible way, but that was not a visible user feature in any one that I
> ever saw.
>
> As a research effort this is a good idea. Daffodil is available for use in
> prototyping if that's useful, and if it turns out to be valuable it could
> be proposed for inclusion in DFDL in the future.
>
> Some years ago I suggested this to someone as a thesis topic for a CS PhD
> project, but to my knowledge it didn't go anywhere.
>
> On Thu, Nov 2, 2023 at 10:20 AM Roger L Costello <coste...@mitre.org>
> wrote:
>
> Hi Folks,
>
> Consider this input containing a date time value:
>
> 20230926T124800Z
>
> We can design the DFDL, using the xs:datetime datatype and associated DFDL
> calendar properties, so that parsing produces this XML:
>
> <DateTimeIso>2023-09-26T12:48:00+00:00</DateTimeIso>
>
> That is beautiful XML - concise and precise.
>
> Next, consider input containing a lat/long value:
>
> 2006N-05912E
>
> It would be excellent if we could design the DFDL so that parsing produces
> this:
>
> <OriginOfBearing>20°06′N 059°12′E</OriginOfBearing>
>
> That is also beautiful XML.
>
> In fact, it is possible to achieve this! By hiding the input and then
> performing a bunch of transformations using dfdl:inputValueCalc.
>
> However, that's a terrible approach because, as Mike Beckerle often says,
> "DFDL is not a transformation language!"
>
> If only we had a latlong datatype and associated DFDL latlong properties
> .....
>
> If only we could extend DFDL .......
>
> How about making DFDL extensible? How about allowing users of DFDL to
> create their own datatypes (actually, XSD already allows this) and allow
> users to create their own DFDL properties for the user-defined datatype?
>
> That is, how about turning DFDL into extensible DFDL?
>
> Thoughts?
>
> /Roger