[ https://issues.apache.org/jira/browse/TIKA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881340#comment-15881340 ]

Hudson commented on TIKA-2276:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1209 (See [https://builds.apache.org/job/Tika-trunk/1209/])
TIKA-2276 -- Try to reuse parsers from ParseContext for custom embedded 
(tallison: rev e3a50ba30ad5ccee721401ceb20464cb327de17c)
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/JackcessExtractor.java
* (edit) tika-core/src/main/java/org/apache/tika/extractor/EmbeddedDocumentUtil.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/OutlookExtractor.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/chm/ChmParser.java
TIKA-2276 -- Have AutoDetectParser pass itself to the ParseContext for 
(tallison: rev 579a92bebff5205c6f881766fb8cfddf3a8a520f)
* (edit) tika-core/src/main/java/org/apache/tika/extractor/EmbeddedDocumentUtil.java


> Try to be more parsimonious creating TikaConfigs and ParseContexts
> ------------------------------------------------------------------
>
>                 Key: TIKA-2276
>                 URL: https://issues.apache.org/jira/browse/TIKA-2276
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>             Fix For: 2.0, 1.15
>
>
> If we run the AutoDetectParser() against the files in our unit tests (around 
> 600 files*), there are 701 new instantiations of TikaConfig.  The time is 
> around 20 seconds.  If we modify AutoDetectParser to pass its TikaConfig via 
> the ParseContext if one isn't already specified, that drops to 234 
> instantiations, and parse time goes to ~17 seconds.
> Let's make this simple change and look for other areas to decrease the number 
> of times our parsers are creating a new TikaConfig.
> * Note: I did not include the testCHM2.chm monster in these runs.
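
The idea in the description above (share one configuration object through the parse context, but only when the caller has not already supplied one) can be sketched as follows. This is a minimal illustration, not Tika's code: Config, Context, configFor and instantiationsFor are hypothetical stand-ins invented for the example, and the counts it prints describe only this toy model, not the figures reported in the issue.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigReuseSketch {

    /** Hypothetical stand-in for a configuration object that is costly to build. */
    static final class Config {
        static int instantiations = 0;

        Config() {
            instantiations++;
        }
    }

    /** Hypothetical stand-in for a per-parse context keyed by class. */
    static final class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key) {
            // Class.cast returns null when the entry is absent.
            return key.cast(entries.get(key));
        }
    }

    /** Embedded-document handling: reuse the shared config, else build a new one. */
    static Config configFor(Context context) {
        Config shared = context.get(Config.class);
        return shared != null ? shared : new Config();
    }

    /** Counts Config instantiations for one top-level parse with N embedded documents. */
    static int instantiationsFor(int embeddedDocs, boolean shareViaContext) {
        int before = Config.instantiations;
        Config parserConfig = new Config(); // owned by the top-level parser
        Context context = new Context();

        // Only fill in the config if the caller has not specified one already.
        if (shareViaContext && context.get(Config.class) == null) {
            context.set(Config.class, parserConfig);
        }
        for (int i = 0; i < embeddedDocs; i++) {
            configFor(context);
        }
        return Config.instantiations - before;
    }

    public static void main(String[] args) {
        System.out.println("without sharing: " + instantiationsFor(5, false)); // prints 6
        System.out.println("with sharing:    " + instantiationsFor(5, true));  // prints 1
    }
}
```

In this model the saving grows with the number of embedded documents, since each one would otherwise build its own configuration.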



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
