Sure thing.
Say we have a really simple data format that is a length followed by a
string payload. Our DFDL schema might look like this:
<element name="format">
  <complexType>
    <sequence>
      <element name="length" type="xs:int" dfdl:length="4" ... />
      <element name="payload" type="xs:string"
        dfdl:length="{ ../length }" ... />
    </sequence>
  </complexType>
</element>
This would parse data like this:
0011Hello World
To this:
<format>
  <length>11</length>
  <payload>Hello World</payload>
</format>
But what if our payload contained XML data? Then our data would look
like this:
0020<hello>world</hello>
Which would parse to this:
<format>
  <length>20</length>
  <payload>&lt;hello&gt;world&lt;/hello&gt;</payload>
</format>
Note that the payload now contains escaped XML. This makes it difficult
to validate this "embedded XML" with tools like XSD or Schematron, since
to them it's just a string.
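To make the problem concrete, here is a rough Python sketch. It is not
Daffodil; the parse_length_prefixed helper is made up for illustration
and only mimics the "4-digit length, then payload" layout above. It shows
that once the payload is stored as a string, an XML serializer escapes it
and a path query for the hello element finds nothing:

```python
import xml.etree.ElementTree as ET


def parse_length_prefixed(data: str) -> ET.Element:
    """Hypothetical stand-in for the schema: 4-character length, then payload."""
    length = int(data[:4])
    payload = data[4:4 + length]
    root = ET.Element("format")
    ET.SubElement(root, "length").text = str(length)
    # The payload is stored as plain text, like an xs:string element.
    ET.SubElement(root, "payload").text = payload
    return root


infoset = parse_length_prefixed("0020<hello>world</hello>")
serialized = ET.tostring(infoset, encoding="unicode")
print(serialized)  # the angle brackets in the payload come out escaped

# Reading the XML back, the payload is still only a string.
reparsed = ET.fromstring(serialized)
print(reparsed.find("payload/hello"))  # no <hello> element exists to query
print(reparsed.find("payload").text)   # the original string, unescaped
```

Any XSD or Schematron rule written against a hello element has nothing
to match here, because the element never exists in the tree.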
So what you can do now is modify the payload definition to add the
dfdlx:runtimeProperties attribute like this:
<element name="payload" type="xs:string" dfdl:length="{ ../length }"
  dfdlx:runtimeProperties="stringAsXml=true" .../>
Then when we parse the same data as above, we would get an infoset that
looks like this:
<format>
  <length>20</length>
  <payload>
    <stringAsXml xmlns="">
      <hello>world</hello>
    </stringAsXml>
  </payload>
</format>
So what used to be a simple "payload" element in the infoset now becomes
a complex element with a "stringAsXml" child, and that child contains the
payload content as actual XML instead of an XML-escaped string. This
makes the XML payload much easier to query and validate.
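As a quick illustration of the "easier to query" part, here is a small
Python sketch that loads an infoset shaped like the one above (typed in
by hand here, not produced by running Daffodil) and reaches into the
payload with an ordinary path expression:

```python
import xml.etree.ElementTree as ET

# Infoset copied from the stringAsXml example above.
infoset = ET.fromstring("""<format>
  <length>20</length>
  <payload>
    <stringAsXml xmlns="">
      <hello>world</hello>
    </stringAsXml>
  </payload>
</format>""")

# The embedded content is now real elements, so a path query works.
hello = infoset.find("payload/stringAsXml/hello")
print(hello.tag, hello.text)
```

The same idea applies to XPath-based tools such as Schematron: the
payload's elements are now nodes that rules can address directly.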
Note that this currently only works with the default XML text outputter.
Also note that this new infoset doesn't validate against the DFDL schema,
so a separate schema is needed if you want to validate this content.
For some more complex examples of this, we have test files and schema
(including an example of a separate validation schema) in this directory:
https://github.com/apache/daffodil/tree/main/daffodil-test/src/test/resources/org/apache/daffodil/infoset/stringAsXml/namespaced
On 11/8/22 7:47 AM, Roger L Costello wrote:
Hi Steve,
I read the description of "embedded XML" and I don't get it. Would you provide
an example of what it's used for please?
/Roger
-----Original Message-----
From: Steve Lawrence <slawre...@apache.org>
Sent: Tuesday, November 8, 2022 7:34 AM
To: users@daffodil.apache.org
Subject: [EXT] [ANNOUNCE] Apache Daffodil 3.4.0 Released
The Apache Daffodil community is pleased to announce the
release of version 3.4.0.
Notable changes in this release include EXI binary XML support,
pluggable character sets, embedded XML, and C code generator updates.
Detailed release notes and downloads are available at:
https://daffodil.apache.org/releases/3.4.0/
Apache Daffodil is an open-source implementation of the DFDL
specification that uses DFDL data descriptions to parse fixed format
data into an infoset. This infoset is commonly converted into XML or
JSON to enable the use of well-established XML or JSON technologies
and libraries to consume, inspect, and manipulate fixed format data in
existing solutions. Daffodil is also capable of serializing or
"unparsing" data back to the original data format. The DFDL infoset
can also be converted directly to/from the data structures carried by
data processing frameworks so as to bypass any XML/JSON overheads.
For more information about Daffodil visit:
https://daffodil.apache.org/
Regards,
The Apache Daffodil Team