arne-bdt opened a new issue, #2740:
URL: https://github.com/apache/jena/issues/2740

   ### Version
   
   5.2.0-SNAPSHOT
   
   ### Feature
   
   Profiling shows that resolving IRIs takes a lot of time when parsing RDF/XML.
   (Parsers: RRX.RDFXML_SAX, RRX.RDFXML_StAX_ev, RRX.RDFXML_StAX_sr)
   
   There were two main things I could do about it:
   - In all current RRX parsers, many IRIs are parsed twice when `private Node iriResolve(String uriStr, ...` is called.
      --> I added `public Node createURI(IRIx iriX, ...);` to the ParserProfile, which simply uses the given IRI instead of resolving it again.
      --> In some cases this made the reader 25% faster.
   - Adding general IRIx caching (`org.apache.jena.atlas.lib.cache.CacheSimple`) in the parsers where the already cached `org.apache.jena.riot.system.ParserProfileStd#resolver` is not applicable.
      --> In some cases this gave me another 10%.
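
To illustrate the two ideas above, here is a minimal, self-contained sketch. It is not Jena code: `java.net.URI` stands in for `IRIx`, a plain `String` stands in for `Node`, and the class `IriResolveSketch` with its methods `resolveCached` and `nodeFor` is a hypothetical name chosen for this example. The cache is a small fixed-size slot cache written in the spirit of a simple hash-addressed cache, and it is only an assumption about how such caching could look.

```java
import java.net.URI;

/** Illustrative sketch only: none of these names are Jena API. */
final class IriResolveSketch {
    private final URI base;
    private final String[] cachedKeys;   // the reference string stored in each slot
    private final URI[] cachedIris;      // the resolved IRI stored in each slot
    int parses = 0;                      // counts how often a string is parsed

    IriResolveSketch(String baseIri, int cacheSlots) {
        this.base = URI.create(baseIri);
        this.cachedKeys = new String[cacheSlots];
        this.cachedIris = new URI[cacheSlots];
    }

    /** Parse the reference and resolve it against the base (no caching). */
    URI resolve(String ref) {
        parses++;
        return base.resolve(ref);
    }

    /** Idea 2: the hash of the key picks a slot; a clash simply overwrites it. */
    URI resolveCached(String ref) {
        int slot = Math.floorMod(ref.hashCode(), cachedKeys.length);
        if (ref.equals(cachedKeys[slot])) {
            return cachedIris[slot];     // hit: no parsing needed
        }
        URI iri = resolve(ref);          // miss: parse once and remember
        cachedKeys[slot] = ref;
        cachedIris[slot] = iri;
        return iri;
    }

    /** String-based factory: has to parse the IRI string a second time. */
    String nodeFor(String iriStr) {
        parses++;
        return URI.create(iriStr).toString();
    }

    /** Idea 1: accept the already resolved IRI and use it as is. */
    String nodeFor(URI resolved) {
        return resolved.toString();
    }

    public static void main(String[] args) {
        IriResolveSketch sketch = new IriResolveSketch("http://example.org/base/", 16);
        URI iri = sketch.resolveCached("item1");
        sketch.resolveCached("item1");            // served from the cache
        System.out.println(sketch.nodeFor(iri));  // no second parse
        System.out.println("parses: " + sketch.parses);
    }
}
```

The overload that takes the resolved IRI avoids the second parse per IRI, and the slot cache avoids re-resolving reference strings that repeat within a document. Actual gains depend on the data and the parser.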
      
      
   
   ### Are you interested in contributing a solution yourself?
   
   Yes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

