[ https://issues.apache.org/jira/browse/DAFFODIL-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831573#comment-17831573 ]
Mike Beckerle commented on DAFFODIL-2883:
-----------------------------------------
See the FIXME comment about this issue at line 156 of file XMLUtils.
> Pre-existing PUA characters in data cause SDE
> ---------------------------------------------
>
> Key: DAFFODIL-2883
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2883
> Project: Daffodil
> Issue Type: Bug
> Components: Back End
> Affects Versions: 3.6.0
> Reporter: Mike Beckerle
> Priority: Major
>
> If data contains Unicode Private Use Area (PUA) characters, these cause the
> Infoset Outputter to convert the RemapPUACharDetected into a Schema
> Definition Error (SDE).
> We can't get away with this. PUA characters in data must either cause a
> ParseError or simply be tolerated. (Or there could be a switch to choose
> between those modes.)
> This was discovered by fuzz testing.
> If the existence of PUA characters means the data is gibberish, then perhaps
> the parser is speculating down a path that should be backtracked. We need a
> parse error in that case.
> If the existence of PUA characters is acceptable, then we need no error at
> all from them.
>
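The two modes described above could be sketched roughly as follows. This is only an illustration in Python, not Daffodil code: PuaPolicy, ParseError, is_pua, and check_pua are hypothetical names invented for the sketch, and the only real API used is the standard library's unicodedata.category, which reports "Co" for private-use code points.

```python
import unicodedata
from enum import Enum


class PuaPolicy(Enum):
    """Hypothetical switch between the two modes; not an actual Daffodil tunable."""
    TOLERATE = "tolerate"
    PARSE_ERROR = "parseError"


class ParseError(Exception):
    """Stand-in for a recoverable parse error that a parser could backtrack from."""


def is_pua(ch: str) -> bool:
    # General category "Co" covers U+E000..U+F8FF in the BMP and the
    # private-use code points in planes 15 and 16.
    return unicodedata.category(ch) == "Co"


def check_pua(text: str, policy: PuaPolicy) -> str:
    """Return text unchanged, or raise ParseError if the policy rejects PUA characters."""
    if policy is PuaPolicy.PARSE_ERROR:
        for offset, ch in enumerate(text):
            if is_pua(ch):
                raise ParseError(
                    f"PUA character U+{ord(ch):04X} at offset {offset}"
                )
    # TOLERATE mode (or no PUA characters found): pass the data through as-is.
    return text
```

In PARSE_ERROR mode a speculative parse of gibberish data would fail in a way that allows backtracking; in TOLERATE mode the characters pass through with no error at all. Neither mode escalates to an SDE.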