[
https://issues.apache.org/jira/browse/DAFFODIL-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Lawrence resolved DAFFODIL-2708.
--------------------------------------
Fix Version/s: 3.4.0
Resolution: Fixed
Fixed in commit 3b213ce30b1974ecd9fc2260e4f081240da89874
> XML String feature in XML Text Infoset Inputter/Outputter
> ---------------------------------------------------------
>
> Key: DAFFODIL-2708
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2708
> Project: Daffodil
> Issue Type: New Feature
> Components: Back End
> Affects Versions: 3.3.0
> Reporter: Mike Beckerle
> Priority: Critical
> Fix For: 3.4.0
>
>
> Several users need the following feature.
> When parsing to XML output, a string element whose value is known to be a
> string of XML text should be embeddable directly in the output XML, without
> being escaped.
> Symmetrically, for unparsing, a string element identified as XML text should
> result in a series of XML "events" being absorbed and converted to a string
> which is the ultimate value of the string element.
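To make the requested behavior concrete, here is a minimal sketch in Python using only the standard library. It is not Daffodil code and not the proposed implementation; the function names and the "exactly one embedded root element" rule are assumptions made for illustration. It contrasts ordinary escaping with direct embedding, and shows the reverse direction where the embedded XML is absorbed back into a string.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def output_escaped(name: str, value: str) -> str:
    """Ordinary string element: markup characters in the value are escaped."""
    return f"<{name}>{escape(value)}</{name}>"


def output_embedded(name: str, value: str) -> str:
    """XML-string element (hypothetical): embed the value as child XML.

    ET.fromstring raises ParseError if the value is not well-formed XML.
    """
    wrapper = ET.Element(name)
    wrapper.append(ET.fromstring(value))
    return ET.tostring(wrapper, encoding="unicode")


def absorb_embedded(xml_text: str) -> str:
    """Unparse direction: serialize the element's single child back to a string."""
    wrapper = ET.fromstring(xml_text)
    children = list(wrapper)
    if len(children) != 1:
        # Assumption for this sketch: the embedded XML has one root element.
        raise ValueError("expected exactly one embedded root element")
    return ET.tostring(children[0], encoding="unicode")


payload = '<note id="1">a &amp; b</note>'
escaped = output_escaped("payload", payload)
embedded = output_embedded("payload", payload)
recovered = absorb_embedded(embedded)
```

With this particular payload the string recovered on the way back equals the original, but that is not guaranteed in general: the serializer normalizes details such as attribute quoting, which is the canonicalization concern raised in the notes.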
> Note that for any given popular data format (XML, JSON, etc.) where Daffodil
> supports output of infosets in that representation, the same issue can arise
> where data contains a string which is already in that representation and
> users desire for it to be directly embedded, not escaped as a string.
> For the purposes of this ticket, let's focus on XML only. Other
> representations could be added subsequently.
> Notes:
> 1) on canonicalization - I see no way to avoid strong canonicalization of
> this XML. If byte for byte preservation of characters such as character
> entities like &#x20; (a space) or CRLFs is needed, there's just no way to
> do that (at least that I know of).
> 2) XML initial slug line/processing instruction - a way to strip this if
> present in the XML string may be needed. An option to generate it as part of
> the string when unparsing may also be needed.
> 3) An ASCII-only or iso-8859-1 only option may be needed where any character
> outside of those and standard whitespaces is converted to a character entity.
> 4) This breaks the idea that the DFDL schema IS the XML Schema of the output
> Infoset XML from parsing. Rather, to create an XML schema for the resulting
> data, one would have to replace the DFDL element declaration for the string
> with an appropriate DFDL element reference to the schema of the XML being
> embedded at that place.
> It is highly recommended that such a DFDL schema contain comments describing
> the exact element reference (namespace + name) that the XML String
> corresponds to.
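Notes 2 and 3 can be sketched as two small string transformations. This is an illustrative Python sketch under stated assumptions, not Daffodil's behavior: the function names are hypothetical, the declaration is matched with a simple regular expression rather than a real XML parser, and the ASCII conversion assumes non-ASCII characters occur only in character data or attribute values (a character reference is not valid inside an element or attribute name).

```python
import re

# Matches an optional byte-order mark and leading whitespace, then an XML
# declaration such as <?xml version="1.0" encoding="UTF-8"?>.
_XML_DECL = re.compile(r"^\ufeff?\s*<\?xml\s[^?]*\?>\s*")


def strip_xml_declaration(text: str) -> str:
    """Remove a leading XML declaration, if present (note 2)."""
    return _XML_DECL.sub("", text, count=1)


def to_ascii_with_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference (note 3)."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


source = '<?xml version="1.0" encoding="UTF-8"?>\n<a>caf\u00e9</a>'
stripped = strip_xml_declaration(source)
ascii_only = to_ascii_with_char_refs(stripped)
```

The `xmlcharrefreplace` error handler emits decimal references, so the accented character above becomes `&#233;`. An iso-8859-1 variant would encode with `"latin-1"` instead of `"ascii"`.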
> w.r.t. implementation...
> There's some pseudocode for this in the "Example Implementation" section of
> the Runtime Properties proposal:
> [https://cwiki.apache.org/confluence/display/DAFFODIL/Proposal%3A+Runtime+Properties#Proposal%3ARuntimeProperties-ExampleImplementation]
> This pseudocode uses the ScalaXML InfosetInputter/Outputter as a base for
> simplicity, but we should base the actual one on the
> XMLTextInfosetInputter/Outputter, since that's what most people use.