Mike Beckerle created DAFFODIL-2884:
---------------------------------------
Summary: String-As-XML cause SDE on malformed XML data. Needs to
be PE.
Key: DAFFODIL-2884
URL: https://issues.apache.org/jira/browse/DAFFODIL-2884
Project: Daffodil
Issue Type: Bug
Components: Back End
Affects Versions: 3.6.0
Reporter: Mike Beckerle

When using the string-as-XML feature, if the string that is supposed to be XML
is malformed, a WstxUnexpectedCharException (or a similar exception) is thrown
in the InfosetOutputter, which is the component that converts the string of XML
into actual XML. The InfosetOutputter is outside the scope of backtracking, so
the error cannot be converted into a Parse Error at that point. The
InfosetOutputter currently escalates it to a Schema Definition Error (SDE).

That's not correct for a data problem. The parser could be speculating down a
path where the string of data that is supposed to be XML is just gibberish. If
that string is malformed XML, a Parse Error needs to occur so we can backtrack.

Converting Infoset into XML is normally something done by the InfosetOutputter,
but in this case it cannot be. It needs to be done in the string parser, and
the Infoset needs to somehow cache the resulting XML so it can be handed off to
the InfosetOutputter.

I think this has to work analogously to text numbers. We parse the string
first, then convert to the data type, which for numbers is an
integer/float/decimal, etc. This conversion can fail, and that's a Parse Error.

String-as-XML needs to work the same way. The string is parsed via one of the
lengthKind techniques, then it is converted into XML. If the conversion to XML
fails, then it's a Parse Error.
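
The proposed ordering (parse the string, then convert, and treat a conversion
failure as a Parse Error) can be illustrated with a minimal Java sketch. This
is not Daffodil code: the class StringAsXmlSketch, the ParseError exception,
and the convertToXml method are hypothetical names, and the JDK's built-in
StAX reader stands in for whatever XML library the real implementation would
use. The only point is that the well-formedness check happens at parse time,
where the failure is still recoverable.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StringAsXmlSketch {

    // Hypothetical stand-in for a recoverable parse failure, i.e. one the
    // parser could backtrack from. Not a Daffodil class.
    static final class ParseError extends Exception {
        ParseError(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical conversion step run right after the string has been
    // parsed. It walks the whole string with a StAX reader so that any
    // well-formedness problem surfaces here as a ParseError.
    static String convertToXml(String parsedString) throws ParseError {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Disable DTD processing; the data is untrusted.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        try {
            XMLStreamReader reader =
                factory.createXMLStreamReader(new StringReader(parsedString));
            try {
                while (reader.hasNext()) {
                    reader.next();
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Malformed XML is a data problem, so report it as recoverable.
            throw new ParseError("String is not well-formed XML", e);
        }
        // A real implementation would cache the converted XML in the infoset
        // for the InfosetOutputter; this sketch just returns the checked string.
        return parsedString;
    }

    public static void main(String[] args) {
        String[] candidates = { "<a><b>1</b></a>", "<a><b>oops</a>" };
        for (String candidate : candidates) {
            try {
                convertToXml(candidate);
                System.out.println("well-formed: " + candidate);
            } catch (ParseError e) {
                System.out.println("parse error: " + candidate);
            }
        }
    }
}
```

With a check of this kind in the string parser, a speculative branch that
reads gibberish fails like any other failed text-to-type conversion, and the
InfosetOutputter only ever receives XML that has already been validated.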