[ https://issues.apache.org/jira/browse/XERCESJ-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542182#comment-17542182 ]
Mike Beckerle commented on XERCESJ-1745:
----------------------------------------

Thank you for the link. I looked into the faq-grammars page and the related
example source code. Alas, none of these data structures are serializable,
so using grammar pools and preloaded grammars seems to achieve only
"compile once at start-up" behavior, which I believe we are already getting
via the factory patterns that support providing the schema once and then
creating parsers from that factory.

With respect to the serialization form, I wanted to clarify that our need is
simpler than what many people would think is needed for serializability. We
do not need any compatibility of the serializations across Xerces
versions/builds. For our needs, the saved serialized representation can be
completely tied to exactly the same version/build of Xerces that created it.
Reloading a serialization created by a different version/build of Xerces can
just be a fatal error. This removes a great deal of the maintenance
complexity associated with serializability.

> Save/Restore serialized "compiled" parser-validator
> ---------------------------------------------------
>
>                 Key: XERCESJ-1745
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1745
>             Project: Xerces2-J
>          Issue Type: New Feature
>          Components: Other, Serialization
>    Affects Versions: 2.12.2
>            Reporter: Mike Beckerle
>            Priority: Major
>
> Feature requested by Apache Daffodil project PMC.
>
> We use Xerces-J to validate XML files.
>
> The schemas of these files are huge. Think 300+ fairly large XSD files all
> included/imported together. Megabytes of XSD.
>
> In order to validate+parse faster, we know Xerces does something akin to
> "compiling" the XSD into lower-level data structures.
>
> The requested feature is to make this "compilation" step of the large XSD
> schema explicit, and then be able to serialize the resulting Java object to a
> file.
> Subsequently one can reload this pre-compiled object so as not to face
> this compiling overhead at startup time.
>
> An API call to explicitly force this compilation step, so that the time taken
> to do it can be measured, is an important part of this feature. This
> compilation can also occur automatically on first use, without requiring an
> explicit "compile it now" API call, and that would retain perfect
> compatibility with Xerces APIs today.
>
> But for very large XSD, it is of value to be able to time this compile
> activity, so a new API method to cause Xerces to do this compilation step
> explicitly (and which is separate from the serialization of the resulting
> object) is of value.
>
> In summary, I think numerous internal data structures within Xerces would have
> to be made Serializable, and a compileParser(),
> saveParser(java.io.OutputStream) and restoreParser(java.io.InputStream) or
> something along those lines are needed.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
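
For readers following the thread, the "compile once at start-up" factory
pattern mentioned in the comment can be sketched with the standard JAXP
validation API (javax.xml.validation), which Xerces-J implements. This is
only an illustration under stated assumptions: the class name
CompileOnceSketch, its helper methods, and the one-element schema are made
up for the example, and it assumes that most of the schema "compilation"
cost is paid inside SchemaFactory.newSchema(). It does not show the
requested save/restore feature, which does not exist today.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical illustration class; not part of Xerces or Daffodil.
final class CompileOnceSketch {

    // Tiny stand-in schema. The real use case is 300+ XSD files.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='n' type='xs:int'/>"
      + "</xs:schema>";

    // Build the Schema object once. Assumption: this call is where the
    // expensive "compilation" of the XSD happens.
    static Schema compile(String xsd) throws SAXException {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return factory.newSchema(new StreamSource(new StringReader(xsd)));
    }

    // Create a Validator from the shared Schema for each document.
    // Schema is thread-safe; Validator is not, so one is made per call.
    static boolean isValid(Schema schema, String xml) throws IOException {
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, validation errors are thrown.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Time the one-off schema construction separately from validation.
        long start = System.nanoTime();
        Schema schema = compile(XSD);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("schema construction took " + millis + " ms");

        System.out.println(isValid(schema, "<n>42</n>"));
        System.out.println(isValid(schema, "<n>forty-two</n>"));
    }
}
```

The Schema object here lives only in memory, so the construction cost is
paid again on every JVM start. That per-start cost is what the requested
save/restore API is meant to avoid.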