arne-bdt opened a new pull request, #2744:
URL: https://github.com/apache/jena/pull/2744

   Faster parsing of RDF/XML by avoiding duplicated resolving of IRIs and 
adding cache for IRIx in parsers
   (Parsers: RRX.RDFXML_SAX, RRX.RDFXML_StAX_ev, RRX.RDFXML_StAX_sr )
   
   GitHub issue resolved #2740
   
   Pull request description:
   - Added "public Node createURI(IRIx iriX, ...);" to the ParserProfile, which simply uses the given IRI instead of resolving it again.
   - Added general IRIx caching (org.apache.jena.atlas.lib.cache.CacheSimple) in the parsers where the already cached org.apache.jena.riot.system.ParserProfileStd#resolver is not applicable.
   - Removed httpClient from org.apache.jena.riot.RDFParserBuilder and org.apache.jena.riot.RDFParser, because it took quite some time during initialization.
   - Removed unused code and variables from ParserRRX_StAX_SR and ParserRRX_StAX_EV.
   - org.apache.jena.http.HttpEnv#getDftHttpClient is now called from org.apache.jena.riot.RDFParser#openTypedInputStream only if needed. HttpEnv also holds a static reference, so that should be fine, I hope.
   - Added jena-benchmarks-shadedJena510 to be able to perform benchmarks against Jena 5.1.0.
   - Added org.apache.jena.riot.lang.rdfxml.TestXMLParser in jena-benchmarks-jmh.
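
   The caching idea in the list above can be illustrated with a minimal, standalone sketch. It does not use Jena's classes: `java.net.URI` stands in for IRIx resolution, and the hypothetical `SlotCache` is a small fixed-size cache written for this example, only loosely in the spirit of CacheSimple. All class, field, and method names here are assumptions made for illustration, not the code in this pull request.

   ```java
   import java.net.URI;
   import java.util.function.Function;

   public final class IriCacheSketch {

       /** Hypothetical fixed-size cache: each key hashes to exactly one slot. */
       static final class SlotCache<K, V> {
           private final Object[] keys;
           private final Object[] values;

           SlotCache(int size) {
               keys = new Object[size];
               values = new Object[size];
           }

           @SuppressWarnings("unchecked")
           V get(K key, Function<K, V> loader) {
               int slot = Math.floorMod(key.hashCode(), keys.length);
               if (key.equals(keys[slot])) {
                   return (V) values[slot];      // hit: reuse the earlier result
               }
               V value = loader.apply(key);      // miss: compute and overwrite the slot
               keys[slot] = key;
               values[slot] = value;
               return value;
           }
       }

       private final URI base;
       private final SlotCache<String, URI> cache = new SlotCache<>(1024);
       public int resolveCalls = 0;              // counts real resolutions, for demonstration

       public IriCacheSketch(String baseIri) {
           this.base = URI.create(baseIri);
       }

       /** Resolves a reference against the base, consulting the cache first. */
       public URI resolve(String reference) {
           return cache.get(reference, ref -> {
               resolveCalls++;
               return base.resolve(ref);         // stand-in for the expensive IRI resolution
           });
       }

       public static void main(String[] args) {
           IriCacheSketch sketch = new IriCacheSketch("http://example.org/data/");
           for (int i = 0; i < 3; i++) {
               sketch.resolve("#item1");
           }
           System.out.println(sketch.resolve("#item1")
                   + " resolved " + sketch.resolveCalls + " time(s)");
       }
   }
   ```

   RDF/XML documents tend to repeat the same property and type IRIs many times, which is why even a small cache in front of the resolver can plausibly pay off.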
   
   Benchmark results:
   ```
   Benchmark                      (param0_GraphUri)                                   (param1_ParserLang)  Mode  Cnt   Score   Error  Units
   TestXMLParser.parseXML         ../testing/citations.rdf                            RRX.RDFXML_SAX       avgt    5  47,232 ± 0,778   s/op
   TestXMLParser.parseXML         ../testing/citations.rdf                            RRX.RDFXML_StAX_ev   avgt    5  76,502 ± 4,390   s/op
   TestXMLParser.parseXML         ../testing/citations.rdf                            RRX.RDFXML_StAX_sr   avgt    5  48,689 ± 2,224   s/op
   TestXMLParser.parseXML         ../testing/citations.rdf                            RRX.RDFXML_ARP1      avgt    5  86,298 ± 2,440   s/op
   TestXMLParser.parseXML         ../testing/BSBM/bsbm-5m.xml                         RRX.RDFXML_SAX       avgt    5   9,576 ± 0,402   s/op
   TestXMLParser.parseXML         ../testing/BSBM/bsbm-5m.xml                         RRX.RDFXML_StAX_ev   avgt    5  11,562 ± 0,535   s/op
   TestXMLParser.parseXML         ../testing/BSBM/bsbm-5m.xml                         RRX.RDFXML_StAX_sr   avgt    5   9,406 ± 0,465   s/op
   TestXMLParser.parseXML         ../testing/BSBM/bsbm-5m.xml                         RRX.RDFXML_ARP1      avgt    5  19,738 ± 1,526   s/op
   TestXMLParser.parseXML         CGMES_v2.4.15_RealGridTestConfiguration_EQ_V2.xml   RRX.RDFXML_SAX       avgt    5   0,998 ± 0,223   s/op
   TestXMLParser.parseXML         CGMES_v2.4.15_RealGridTestConfiguration_EQ_V2.xml   RRX.RDFXML_StAX_ev   avgt    5   1,325 ± 0,093   s/op
   TestXMLParser.parseXML         CGMES_v2.4.15_RealGridTestConfiguration_EQ_V2.xml   RRX.RDFXML_StAX_sr   avgt    5   0,985 ± 0,018   s/op
   TestXMLParser.parseXML         CGMES_v2.4.15_RealGridTestConfiguration_EQ_V2.xml   RRX.RDFXML_ARP1      avgt    5   2,357 ± 0,163   s/op
   TestXMLParser.parseXML         CGMES_v2.4.15_RealGridTestConfiguration_SSH_V2.xml  RRX.RDFXML_SAX       avgt    5   0,146 ± 0,029   s/op
   TestXMLParser.parseXML         CGMES_v2.4.15_RealGridTestConfiguration_SSH_V2.xml  RRX.RDFXML_StAX_ev   avgt    5   0,192 ± 0,007   s/op
   TestXMLParser.parseXML         CGMES_v2.4.15_RealGridTestConfiguration_SSH_V2.xml  RRX.RDFXML_StAX_sr   avgt    5   0,140 ± 0,016   s/op
   TestXMLParser.parseXML         CGMES_v2.4.15_RealGridTestConfiguration_SSH_V2.xml  RRX.RDFXML_ARP1      avgt    5   0,309 ± 0,098   s/op
   TestXMLParser.parseXMLJena510  ../testing/citations.rdf                            RRX.RDFXML_SAX       avgt    5  57,690 ± 0,932   s/op
   TestXMLParser.parseXMLJena510  ../testing/citations.rdf                            RRX.RDFXML_StAX_ev   avgt    5  84,579 ± 4,109   s/op
   TestXMLParser.parseXMLJena510  ../testing/citations.rdf                            RRX.RDFXML_StAX_sr   avgt    5  56,949 ± 0,815   s/op
   TestXMLParser.parseXMLJena510  ../testing/citations.rdf                            RRX.RDFXML_ARP1      avgt    5  82,940 ± 0,815   s/op
   TestXMLParser.parseXMLJena510  ../testing/BSBM/bsbm-5m.xml                         RRX.RDFXML_SAX       avgt    5  13,280 ± 0,458   s/op
   TestXMLParser.parseXMLJena510  ../testing/BSBM/bsbm-5m.xml                         RRX.RDFXML_StAX_ev   avgt    5  14,994 ± 0,803   s/op
   TestXMLParser.parseXMLJena510  ../testing/BSBM/bsbm-5m.xml                         RRX.RDFXML_StAX_sr   avgt    5  13,132 ± 0,166   s/op
   TestXMLParser.parseXMLJena510  ../testing/BSBM/bsbm-5m.xml                         RRX.RDFXML_ARP1      avgt    5  19,125 ± 1,044   s/op
   TestXMLParser.parseXMLJena510  CGMES_v2.4.15_RealGridTestConfiguration_EQ_V2.xml   RRX.RDFXML_SAX       avgt    5   1,311 ± 0,018   s/op
   TestXMLParser.parseXMLJena510  CGMES_v2.4.15_RealGridTestConfiguration_EQ_V2.xml   RRX.RDFXML_StAX_ev   avgt    5   1,693 ± 0,021   s/op
   TestXMLParser.parseXMLJena510  CGMES_v2.4.15_RealGridTestConfiguration_EQ_V2.xml   RRX.RDFXML_StAX_sr   avgt    5   1,332 ± 0,179   s/op
   TestXMLParser.parseXMLJena510  CGMES_v2.4.15_RealGridTestConfiguration_EQ_V2.xml   RRX.RDFXML_ARP1      avgt    5   2,305 ± 0,280   s/op
   TestXMLParser.parseXMLJena510  CGMES_v2.4.15_RealGridTestConfiguration_SSH_V2.xml  RRX.RDFXML_SAX       avgt    5   0,194 ± 0,028   s/op
   TestXMLParser.parseXMLJena510  CGMES_v2.4.15_RealGridTestConfiguration_SSH_V2.xml  RRX.RDFXML_StAX_ev   avgt    5   0,227 ± 0,016   s/op
   TestXMLParser.parseXMLJena510  CGMES_v2.4.15_RealGridTestConfiguration_SSH_V2.xml  RRX.RDFXML_StAX_sr   avgt    5   0,194 ± 0,025   s/op
   TestXMLParser.parseXMLJena510  CGMES_v2.4.15_RealGridTestConfiguration_SSH_V2.xml  RRX.RDFXML_ARP1      avgt    5   0,291 ± 0,039   s/op
   ```
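
   To read the table as relative change, the following small sketch computes the reduction in time per operation for the bsbm-5m.xml rows. The scores are copied from the table above with decimal commas converted to points; the class and method names are made up for this example, and the reported error margins are ignored, so small differences should not be over-interpreted.

   ```java
   import java.util.Locale;

   public final class SpeedupSketch {

       /** Relative reduction in time per operation, as a fraction of the baseline. */
       public static double reduction(double baseline, double current) {
           return 1.0 - current / baseline;
       }

       public static void main(String[] args) {
           // bsbm-5m.xml scores in s/op, taken from the benchmark table.
           String[] parsers = {"RRX.RDFXML_SAX", "RRX.RDFXML_StAX_ev", "RRX.RDFXML_StAX_sr", "RRX.RDFXML_ARP1"};
           double[] jena510 = {13.280, 14.994, 13.132, 19.125};   // parseXMLJena510
           double[] thisPr  = { 9.576, 11.562,  9.406, 19.738};   // parseXML

           for (int i = 0; i < parsers.length; i++) {
               double percent = 100.0 * reduction(jena510[i], thisPr[i]);
               // A negative value means the run in this PR was slower than the baseline.
               System.out.printf(Locale.ROOT, "%-20s %6.1f %%%n", parsers[i], percent);
           }
       }
   }
   ```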
   
   ----
   
    - [x] Tests are included.
    - No documentation changes or updates needed.
    - [x] Commits have been squashed to remove intermediate development commit messages.
    - [x] Key commit messages start with the issue number (GH-xxxx)
   
   By submitting this pull request, I acknowledge that I am making a 
contribution to the Apache Software Foundation under the terms and conditions 
of the [Contributor's 
Agreement](https://www.apache.org/licenses/contributor-agreements.html).
   
   ----
   
   See the [Apache Jena "Contributing" 
guide](https://github.com/apache/jena/blob/main/CONTRIBUTING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@jena.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

