[GitHub] mbeckerle opened a new pull request #158: Daffodil 1080 sequences and separators - preliminary review

GitBox Fri, 14 Dec 2018 21:39:10 -0800

mbeckerle opened a new pull request #158: Daffodil 1080 sequences and 
separators - preliminary review
URL: https://github.com/apache/incubator-daffodil/pull/158
 
 
   This is a checkpoint in work to improve Sequences and Separators.
   I do not intend to merge this branch, but a review would be helpful.
   There are some difficult issues here. 
       
       Added tests that distinguish proper function in Daffodil, proper
       function in IBM DFDL, failures of each, and differences between them.
       
       The tests that fail (in either implementation) are in scala-debug.
       
       Parsing is working better than unparsing. Unparsing is still failing
       in a number of important scenarios.
       
       There is much duplication of test-case logic in order to provide
       distinct parse/unparse and daffodil/ibm variants of a test.
       
       In many cases there are 4 copies of a test. Ex: for a test named "foo":
          fooN_Np_daf
          fooN_Np_ibm
          fooN_Nu_daf
          fooN_Nu_ibm
       In these cases the test document data and the test infoset are
       IDENTICAL AND SHOULD BE KEPT THAT WAY. Eventually, as we improve
       things, we hope to collapse these back into a single, portable,
       round-trip test.
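       
       As a rough illustration of the naming scheme and the "keep them
       identical" rule, here is a minimal Python sketch. It assumes that
       "p"/"u" stand for parse/unparse and "daf"/"ibm" for the two
       implementations, which is my reading of the names above. The
       function names and the in-memory dict of (document, infoset) pairs
       are hypothetical; the real tests live in TDML files and nothing
       here is part of Daffodil or the TDML Runner.

```python
from itertools import product

DIRECTIONS = ("p", "u")      # assumed: p = parse, u = unparse
IMPLS = ("daf", "ibm")       # assumed: Daffodil and IBM DFDL


def variant_names(prefix):
    """Return the four variant names for a prefix such as 'fooN_N'."""
    return [f"{prefix}{d}_{impl}" for d, impl in product(DIRECTIONS, IMPLS)]


def shared_data_mismatches(tests, prefix):
    """Return the variants whose (document, infoset) pair differs from
    that of the first variant.

    `tests` is a hypothetical dict mapping test name to a
    (document, infoset) tuple of strings.
    """
    names = variant_names(prefix)
    reference = tests[names[0]]
    return [name for name in names[1:] if tests[name] != reference]
```

       An empty result from shared_data_mismatches means the four copies
       still share the same document and infoset; any name it returns
       has drifted from the others.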
       
       (Alternatively, TDML Runner features that let a single test draw its
       infoset and document from shared definitions would be helpful. That
       way one could have 4 tests that explicitly share the same data
       document definition and infoset definition.
       You can do this today, but only by putting the infoset and document
       parts into external files. I'd like to keep this self-contained.)
       
       For now this split allows us to have tests that work in only some of
       the 4 possible combinations, and to divide those between scala and
       scala-debug.
       
       The "_ibm" tests are there to understand and characterize the
       behavior of the implementation that we need to interoperate with,
       because in some cases this behavior does NOT conform to the current
       DFDL specification due to well-known bugs and limitations.
       
       == Effect of changes so far on general function/regression tests ==
       
       1 regression in daffodil-test-ibm1 - AX000 fails. Complex situation at
       first glance.
       
       1 regression in daffodil-tdml-processor tests - the 'threePass' test
       doesn't work because it depended on an empty element being created
       that is now suppressed. See the discussion below.
       
       36 regressions in daffodil-test.
       A substantial number of these are due to empty elements that we used
       to create but no longer do, and some are due to nilled elements that
       we used to create but no longer do.
       
       There are some other failure scenarios as well, however.
       
       Right now the code does too much suppression of elements having zero
       length. They are not supposed to be suppressed if ZL is the nil
       representation nor if ZL is the "empty string" or "empty hexBinary"
       representation.
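       
       The rule in the paragraph above can be restated as a small
       predicate. This is only a sketch of that one sentence, not of the
       DFDL specification's full logic and not of Daffodil's code. The
       class and function names are hypothetical, and the three fields
       are a simplification I am assuming for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZeroLengthCase:
    """Hypothetical description of an element whose representation
    turned out to be zero length (ZL)."""
    type_name: str          # e.g. "string", "hexBinary", "int"
    zl_is_nil_rep: bool     # ZL is the element's nil representation
    zl_is_empty_rep: bool   # ZL is the element's empty representation


def must_keep(case):
    """True if the ZL element must NOT be suppressed, per the two
    exceptions stated in the text."""
    if case.zl_is_nil_rep:
        return True
    # The empty-representation exception is stated only for
    # string and hexBinary elements.
    return case.type_name in ("string", "hexBinary") and case.zl_is_empty_rep
```

       For example, must_keep(ZeroLengthCase("string", False, True)) is
       True, while the same flags on an "int" element give False in this
       simplified model.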
       
       Daffodil should implement the correct DFDL-spec-compliant behavior,
       and did prior to these changes. To get interop with IBM DFDL on some
       of the new EDI-style tests, I changed the behavior for string and
       hexBinary elements so that they are more often suppressed as elements.
       
       However, IBM DFDL does not implement this functionality according to
       the DFDL spec, so interoperability with IBM DFDL will require a flag
       that changes this behavior. Many of these changes were added to the
       DFDL spec several years ago, before the 2014 revision of DFDL v1.0.
       But IBM's DFDL implementation was already in operation, so they have
       a legacy of behavior they'll eventually have to adjust. This big set
       of subtle revisions was associated with the DFDL Workgroup
       "Action 140" work/action item. Basically, IBM DFDL did not implement
       some, or perhaps all, of the Action 140 changes.
       
       So we either need a flag or, alternatively, 36 tests have to be made
       Daffodil-specific, which may be preferable. This depends on whether
       the behavior is required to make "real" schemas work portably, or
       just little test cases.
       
       DAFFODIL-1080


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
