The "data stream" is typically what we call the output of the unparser, or the 
input to the parser.


The "physical representation" is another term.


The idea that the unparsed output is in some way related to the input is an 
artifact of one specific use case: ripping data apart, validating it, and 
rebuilding it, to ensure the data is not going to crash the software that 
consumes it.


In our Daffodil TDML test rig, we have found that tests that "round trip", 
i.e., parse and unparse, require several different settings to say when the 
infoset is expected to match and when the unparsed output data stream is 
expected to match.


A parser test can be run with roundTrip="none" - meaning parse, and compare to 
see if the infoset is what is expected.


But the test can also specify roundTrip="onePass", "twoPass", or "threePass", 
which are described roughly here:


      <!--
        parse, compare infoset, unparse - compare data
        -->
      <enumeration value="onePass"/>
      <!--
        parse, unparse, reparse - compare infoset.
        Note that this can mask certain errors. So must be used with caution.
        -->
      <enumeration value="twoPass"/>
      <!--
        parse, unparse, reparse, reunparse, compare data.
        Note that this can mask many kinds of errors very easily, so must be
        used rarely and with caution.
        -->
      <enumeration value="threePass"/>

There are several twoPass tests. These come up when unparsing effectively 
converts the data from a variety of acceptable input formats into a single 
canonical format. The unparsed data doesn't match the original, but it is 
equivalent.
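
To make the difference between the settings concrete, here is a minimal Python sketch. It is not Daffodil or TDML code: the toy format (comma-separated integers with optional spaces and leading zeros) and the names parse, unparse, and run_test are all made up for illustration. It follows the comments in the enumeration literally, and it assumes that threePass compares the second unparsed output against the first one, which the comments above do not spell out.

```python
def parse(data: str) -> list[int]:
    """Toy parser: accepts optional spaces and leading zeros (hypothetical format)."""
    return [int(field.strip()) for field in data.split(",")]


def unparse(infoset: list[int]) -> str:
    """Toy unparser: always writes the canonical form, no spaces, no leading zeros."""
    return ",".join(str(n) for n in infoset)


def run_test(data: str, expected: list[int], round_trip: str = "none") -> bool:
    """Return True if the test passes under the given roundTrip setting."""
    infoset = parse(data)
    if round_trip == "none":
        # parse, compare infoset
        return infoset == expected
    data1 = unparse(infoset)
    if round_trip == "onePass":
        # parse, compare infoset, unparse, compare data
        return infoset == expected and data1 == data
    infoset2 = parse(data1)
    if round_trip == "twoPass":
        # parse, unparse, reparse, compare infoset
        return infoset2 == expected
    if round_trip == "threePass":
        # parse, unparse, reparse, reunparse, compare data
        # (assumption: the two unparsed outputs are compared with each other)
        return unparse(infoset2) == data1
    raise ValueError(f"unknown roundTrip setting: {round_trip}")


canonical = "7,42"
sloppy = " 007, 42"  # acceptable input, but not what the unparser writes

print(run_test(canonical, [7, 42], "onePass"))  # data survives unchanged
print(run_test(sloppy, [7, 42], "onePass"))     # fails: output is "7,42"
print(run_test(sloppy, [7, 42], "twoPass"))     # passes: infoset matches after reparse
```

The sloppy input is the twoPass situation: the unparsed data differs from the original, yet reparsing it yields the same infoset.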

Three pass tests come up in some obscure cases where the parse and unparse 
truly are asymmetric - e.g., where the schema is in some sense upgrading or 
converting the format and the corresponding infoset from an older 
representation to a newer representation that is also accepted by the same 
schema, but where the infoset that the new representation produces on parse is 
also different.
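
As a rough illustration of that asymmetric case, here is a small Python sketch. Again, this is not Daffodil code: the "OLD:"/"NEW:" tagged format, the version field in the infoset, and the function names are invented for the example. The point is only that neither the first unparsed output nor the second infoset matches what came before it, while the second unparse does reproduce the first.

```python
def parse(data: str) -> dict:
    """Toy parser: accepts both the old and the new representation (hypothetical)."""
    tag, _, value = data.partition(":")
    version = {"OLD": 1, "NEW": 2}[tag]
    return {"version": version, "value": int(value)}


def unparse(infoset: dict) -> str:
    """Toy unparser: always writes the new representation, whatever was parsed."""
    return f"NEW:{infoset['value']}"


original = "OLD:5"
infoset1 = parse(original)   # infoset records the old version
data1 = unparse(infoset1)    # upgraded on output, so it differs from the original
infoset2 = parse(data1)      # infoset now records the new version
data2 = unparse(infoset2)    # stable from here on

print(data1 != original)     # a onePass data comparison would fail
print(infoset2 != infoset1)  # comparing the two infosets would fail as well
print(data2 == data1)        # only the second unparse matches the first
```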




________________________________
From: Costello, Roger L. <[email protected]>
Sent: Wednesday, January 16, 2019 10:59:33 AM
To: [email protected]
Subject: What word do you use for the document generated from unparsing?

Hello DFDL community,

So, we parse an input to generate XML.

Then, we unparse the XML to generate .... what?

What do you call the document that results from unparsing?

I call it the "reconstituted input document". Is "reconstituted" a good name? 
What do you call it?

/Roger
