[ 
https://issues.apache.org/jira/browse/XERCESJ-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542006#comment-17542006
 ] 

Mukul Gandhi commented on XERCESJ-1745:
---------------------------------------

It seems that quite a bit of work has already been done within Xerces on 
grammar caching and preparsing. Please see 
https://xerces.apache.org/xerces2-j/faq-grammars.html.
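
As a rough illustration of the "load the schema once, then reuse it" idea, 
here is a minimal sketch. It uses the standard JAXP validation API 
(javax.xml.validation) rather than the Xerces-native XMLGrammarPreparser and 
grammar pool classes described in the FAQ. The class name SchemaReuseSketch 
and the tiny inline XSD are hypothetical stand-ins for a real schema set. The 
sketch also times the newSchema() call, although some content-model 
construction may still be deferred until the first validation.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class SchemaReuseSketch {
    // Hypothetical stand-in schema; a real application would load its large XSD set here.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note' type='xs:string'/>"
      + "</xs:schema>";

    // Loads the schema into a reusable Schema object (done once per process).
    static Schema compile() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema could not be loaded", e);
        }
    }

    // Validates one document against the already-loaded schema.
    static boolean isValid(Schema schema, String xml) {
        try {
            Validator validator = schema.newValidator(); // one Validator per use
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        Schema schema = compile();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("schema loaded in " + elapsedMs + " ms");

        // The same Schema instance is reused for every document.
        System.out.println(isValid(schema, "<note>hi</note>"));
        System.out.println(isValid(schema, "<other/>"));
    }
}
```

This only avoids repeated schema loading within one JVM. It does not persist 
anything to disk, so it does not address the startup cost that the feature 
request is about.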

I doubt that serializing Xerces's XSModel instance to a file and deserializing 
it would make XML schema validation much faster. I suspect the serialized file 
representation of an XSModel may be large (for XML schemas whose in-memory 
content model state machines are large), which would not serve the purpose 
stated in this feature request.

Moreover, changing many XML Schema-related Xerces data structures to implement 
java.io.Serializable may be a very complicated and extensive change, and may 
risk the stability of the Xerces implementation.

> Save/Restore serialized "compiled" parser-validator
> ---------------------------------------------------
>
>                 Key: XERCESJ-1745
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1745
>             Project: Xerces2-J
>          Issue Type: New Feature
>          Components: Other, Serialization
>    Affects Versions: 2.12.2
>            Reporter: Mike Beckerle
>            Priority: Major
>
> Feature requested by Apache Daffodil project PMC.
>  
> We use Xerces-J to validate XML files. 
>  
> The schemas of these files are huge. Think 300+ fairly large XSD files all 
> included/imported together. Megabytes of XSD. 
>  
> In order to validate+parse faster, we know Xerces does something akin to 
> "compiling" the XSD into lower-level data structures. 
>  
> The requested feature is to make this "compilation" step of the large XSD 
> schema explicit, and then be able to serialize the resulting java object to a 
> file. Subsequently one can reload this pre-compiled object so as not to face 
> this compiling overhead at startup time.
>  
> An API call to explicitly force this compilation step, so that the time taken 
> to do it can be measured, is an important part of this feature. This 
> compilation can also occur automatically on first use, without requiring an 
> explicit "compile it now" API call, and that would retain perfect 
> compatibility with Xerces APIs today. 
>  
> But for very large XSD, it is of value to be able to time this compile 
> activity, so a new API method to cause Xerces to do this compilation step 
> explicitly (and which is separate from the serialization of the resulting 
> object) is of value. 
>  
> In summary I think numerous internal data structures within Xerces would have 
> to be made Serializable, and a compileParser(), 
> saveParser(java.io.OutputStream) and restoreParser(java.io.InputStream) or 
> something along those lines are needed. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
