[
https://issues.apache.org/jira/browse/DAFFODIL-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mike Beckerle reopened DAFFODIL-1749:
-------------------------------------
Should not have been closed. Creating a single validator per thread is good,
but is not what this bug was about.
Pre-Compiled schemas (saved to binary file) still need a way to store the
schema text if that is what is needed for Full validation.
It would be best if the Xerces validator could consume the XML Schema, and
*then* the resulting data structure be serialized as part of the compiled
state, so that we can avoid the startup overhead of the Xerces validator. Keep
in mind there are schemas with over 200 files in them. Just opening and
ingesting them, and setting up the initialized data structures for a validator,
can take real time. A high-speed validator must essentially compile the DFDL/XML
schema to do fast validation.
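
As a rough illustration of the fallback mentioned above (keeping the schema
text with the saved state so a reloaded parser can still build a validator),
here is a minimal Java sketch. SavedValidationState is a hypothetical class,
not Daffodil code, and it assumes the whole schema fits in a single document
with no includes or imports. It does not avoid the startup cost: the schema is
still recompiled after reload, because the compiled javax.xml.validation.Schema
is held in a transient field and is not itself serialized.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical holder for the part of the saved state needed for full validation.
final class SavedValidationState implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String xsdText;      // serialized with the saved state
    private transient Schema compiled; // rebuilt lazily after reload, never serialized

    SavedValidationState(String xsdText) {
        this.xsdText = xsdText;
    }

    // Compile the schema on first use; later calls return the cached instance.
    synchronized Schema schema() throws SAXException {
        if (compiled == null) {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            compiled = factory.newSchema(new StreamSource(new StringReader(xsdText)));
        }
        return compiled;
    }

    // Serialize this object to bytes, standing in for "save to binary file".
    byte[] save() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
        }
        return bytes.toByteArray();
    }

    // Restore a previously saved object; the Schema is recompiled on demand.
    static SavedValidationState reload(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (SavedValidationState) in.readObject();
        }
    }
}
```

A schema spread over many files would also need its includes and imports
captured (for example via a resource resolver), which this sketch leaves out.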
> Store Xerces validator in DataProcessor so reloaded parsers can use full
> validation
> -----------------------------------------------------------------------------------
>
> Key: DAFFODIL-1749
> URL: https://issues.apache.org/jira/browse/DAFFODIL-1749
> Project: Daffodil
> Issue Type: Improvement
> Components: CLI
> Reporter: Steve Lawrence
> Priority: Major
>
> With validation mode set to full, after a parse we have Xerces recompile the
> schema and validate against the infoset. That is painful. Instead we should
> create a Xerces validator once (if full validation is enabled) and reuse the
> same one for each parse. That should give a noticeable gain in performance
> with full validation. Need to confirm that the Xerces validator is thread
> safe and treat it appropriately if not.
> Also, we *might* be able to serialize the Xerces validator as part of saving
> a parser. This would be nice so that we could enable full validation for
> saved parsers.