The "data stream" is typically what we call the output of the unparser, or the
input to the parser.
The "physical representation" is another term.
The expectation that this output is in some way related to the original input
is an artifact of one specific use case: ripping data apart, validating it, and
rebuilding it, to ensure the data is not going to crash the software that
consumes it.
In our Daffodil TDML test rig, we have found that tests that "round trip",
i.e., parse and unparse, require several different settings that control
when the infoset is expected to match and when the unparsed output data stream
is expected to match.
A parser test can be run with roundTrip="none" - meaning parse, and compare to
see if the infoset is what is expected.
But the test can also set roundTrip to "onePass", "twoPass", or "threePass",
which are described roughly here:
<!--
parse, compare infoset, unparse - compare data
-->
<enumeration value="onePass"/>
<!--
parse, unparse, reparse - compare infoset.
Note that this can mask certain errors. So must be used with caution.
-->
<enumeration value="twoPass"/>
<!--
parse, unparse, reparse, reunparse, compare data.
Note that this can mask many kinds of errors very easily, so must be
used rarely and with caution.
-->
<enumeration value="threePass"/>
There are several twoPass tests - these come up when unparsing the data
effectively converts it from a variety of acceptable input formats into a
single canonical format. The output doesn't match the original data but is
equivalent to it.
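
To make the twoPass case concrete, here is a minimal Python sketch. It is not
Daffodil's API: parse, unparse, one_pass, and two_pass are hypothetical
stand-ins, and the "format" is just comma-separated integers where the parser
tolerates whitespace but the unparser always writes the canonical form.

```python
# Toy stand-ins for a DFDL parser and unparser (hypothetical, not Daffodil).

def parse(data):
    """Accept comma-separated integers, tolerating whitespace around fields."""
    return [int(field) for field in data.split(",")]


def unparse(infoset):
    """Always emit the canonical form: no whitespace around the separators."""
    return ",".join(str(value) for value in infoset)


def one_pass(data, expected_infoset):
    """parse, compare infoset, unparse, compare data with the original."""
    infoset = parse(data)
    return infoset == expected_infoset and unparse(infoset) == data


def two_pass(data, expected_infoset):
    """parse, unparse, reparse, compare the reparsed infoset with the first."""
    infoset = parse(data)
    if infoset != expected_infoset:
        return False
    canonical = unparse(infoset)  # may legitimately differ from the original
    return parse(canonical) == infoset


original = "1, 2 ,3"  # acceptable input, but not the canonical form
```

With this toy format, one_pass(original, [1, 2, 3]) fails because the unparsed
data is "1,2,3", while two_pass(original, [1, 2, 3]) succeeds because the
reparsed infoset is identical.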
Tests using threePass come up in some obscure cases where the parse and unparse
truly are asymmetric - e.g., where the schema is in some sense upgrading or
converting the format and the corresponding infoset from an older
representation to a newer one. The newer representation is also accepted by the
same schema, but the infoset it produces on parse is different as well.
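
A rough Python sketch of that asymmetric case follows. Everything in it is
hypothetical (the "v1:"/"v2:" toy format, and the parse, unparse, and
three_pass helpers), and it assumes that the final "compare data" step checks
the second unparse output against the first.

```python
# Toy schema that accepts both an old ("v1:") and a new ("v2:") representation
# but always writes the new one. Hypothetical illustration, not Daffodil code.

def parse(data):
    """Parse 'v<N>:<value>' into a small dict standing in for the infoset."""
    version, value = data.split(":")
    return {"version": int(version[1:]), "value": int(value)}


def unparse(infoset):
    """Always write the newer representation, whatever version was parsed."""
    return "v2:{}".format(infoset["value"])


def three_pass(data):
    """parse, unparse, reparse, reunparse; report which comparisons hold."""
    infoset1 = parse(data)
    data1 = unparse(infoset1)
    infoset2 = parse(data1)
    data2 = unparse(infoset2)
    return {
        "data_matches_original": data1 == data,  # what onePass would require
        "infoset_stable": infoset2 == infoset1,  # what twoPass would require
        "data_stable": data2 == data1,           # assumed threePass check
    }


result = three_pass("v1:42")
```

Here only data_stable holds: the first unparse yields "v2:42" rather than
"v1:42", and the reparsed infoset carries version 2 rather than 1. An unparser
that always wrote the same bytes regardless of the infoset would also pass the
data_stable check, which is one way this mode can mask errors.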
________________________________
From: Costello, Roger L. <[email protected]>
Sent: Wednesday, January 16, 2019 10:59:33 AM
To: [email protected]
Subject: What word do you use for the document generated from unparsing?
Hello DFDL community,
So, we parse an input to generate XML.
Then, we unparse the XML to generate .... what?
What do you call the document that results from unparsing?
I call it the "reconstituted input document". Is "reconstituted" a good name?
What do you call it?
/Roger