[
https://issues.apache.org/jira/browse/DAFFODIL-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250622#comment-17250622
]
Mike Beckerle commented on DAFFODIL-966:
----------------------------------------
This is an important robustness and performance issue.
Daffodil's performance has to be good on more than just correct and valid data.
Daffodil has to be robust (not leak memory) even in the face of an
onslaught of bad data that causes parsing to fail with parse errors, or bad
infosets that cause unparsing to fail. It also has to perform reasonably when
rejecting data. There can be some overhead for generating a diagnostic, so if a
parse fails at the very end, there may be more overhead than if that parse
didn't fail, but it needs to be a modest overhead only.
If performance on failures is massively slower than on successes, then almost
by definition a flood of bad data is a denial of service, so if it's bad enough
this becomes a security issue. Hence, we need to measure this.
That said, for some DFDL schemas which involve lots of backtracking, it's
possible for specific bad data to waste a lot of time backtracking through
many alternatives that ALL ultimately fail, so there are cases
where failure to parse will be massively slower than successful parsing.
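
As a rough illustration of the kind of measurement meant here, the Python
sketch below times a parser on valid input and on input that fails at the very
end, then reports the failure-to-success time ratio. Everything in it is an
assumption made for illustration: `toy_parse` is a hypothetical stand-in for a
real parser and is not Daffodil's API, and `time_parse` and `failure_overhead`
are made-up helper names.

```python
import time


def toy_parse(data: bytes) -> int:
    """Hypothetical stand-in parser: sums ASCII digits, rejects any other byte."""
    total = 0
    for b in data:
        if not 48 <= b <= 57:  # not an ASCII digit '0'..'9'
            raise ValueError(f"unexpected byte {b:#x}")
        total += b - 48
    return total


def time_parse(parse, data, repeats=200):
    """Return (mean seconds per call, whether any call raised ValueError)."""
    failed = False
    start = time.perf_counter()
    for _ in range(repeats):
        try:
            parse(data)
        except ValueError:
            failed = True
    elapsed = time.perf_counter() - start
    return elapsed / repeats, failed


def failure_overhead(parse, good, bad, repeats=200):
    """Ratio of mean time on rejected data to mean time on accepted data."""
    good_time, good_failed = time_parse(parse, good, repeats)
    bad_time, bad_failed = time_parse(parse, bad, repeats)
    if good_failed or not bad_failed:
        raise ValueError("test data does not behave as labelled")
    # Guard against a zero denominator on very coarse clocks.
    return bad_time / max(good_time, 1e-12)


if __name__ == "__main__":
    good = b"7" * 10_000
    bad = good[:-1] + b"x"  # invalid only at the very end
    ratio = failure_overhead(toy_parse, good, bad)
    print(f"failure/success time ratio: {ratio:.2f}")
```

A ratio near 1 would correspond to the "modest overhead" case; a schema with
heavy backtracking could be expected to show a much larger ratio on specific
bad inputs, which is what a negative performance test would need to surface.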
> Add negative tests to performance suite
> ---------------------------------------
>
> Key: DAFFODIL-966
> URL: https://issues.apache.org/jira/browse/DAFFODIL-966
> Project: Daffodil
> Issue Type: Improvement
> Components: Infrastructure, QA
> Reporter: Jessie Chab
> Priority: Minor
>
> Add tests to performance repo for each data type to demonstrate how long the
> parser takes to abort or recover from invalid data.