[ https://issues.apache.org/jira/browse/TIKA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-2276.
-------------------------------
    Resolution: Fixed

> Try to be more parsimonious creating TikaConfigs and ParseContexts
> ------------------------------------------------------------------
>
>                 Key: TIKA-2276
>                 URL: https://issues.apache.org/jira/browse/TIKA-2276
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>             Fix For: 2.0, 1.15
>
>
> If we run the AutoDetectParser() against the files in our unit tests (around
> 600 files*), there are 701 new instantiations of TikaConfig. The time is
> around 20 seconds. If we modify AutoDetectParser to pass its TikaConfig via
> the ParseContext if one isn't already specified, that drops to 234
> instantiations, and parse time goes to ~17 seconds.
> Let's make this simple change and look for other areas to decrease the number
> of times our parsers are creating a new TikaConfig.
> *Note I did not include the testCHM2.chm monster in these runs.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
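
The change described in the issue (a top-level parser placing its own configuration into the per-parse context when the caller has not supplied one) can be illustrated with a small self-contained model. This is only a sketch of the pattern, not Tika's code: `Config`, `Context`, `TopLevelParser`, `parseEmbedded`, and `countInstantiations` are hypothetical stand-ins for TikaConfig, ParseContext, AutoDetectParser, and the downstream parsers, and the counts it produces are unrelated to the 701 and 234 figures reported above.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigReuseSketch {

    /** Hypothetical stand-in for an expensive config object; counts how often it is built. */
    static final class Config {
        static int instantiations = 0;
        Config() { instantiations++; }
    }

    /** Hypothetical stand-in for a per-parse context: a map from type to instance. */
    static final class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();
        <T> void set(Class<T> key, T value) { entries.put(key, value); }
        <T> T get(Class<T> key) { return key.cast(entries.get(key)); }
    }

    /** A downstream parser that needs a Config and builds a new one if none was passed. */
    static void parseEmbedded(Context context) {
        Config config = context.get(Config.class);
        if (config == null) {
            config = new Config(); // the costly fallback the issue wants to avoid
        }
        // ... a real parser would use config here ...
    }

    /** A top-level parser that owns exactly one Config. */
    static final class TopLevelParser {
        private final Config config = new Config();
        private final boolean shareConfig;

        TopLevelParser(boolean shareConfig) { this.shareConfig = shareConfig; }

        void parse(Context context) {
            // Pass our Config along only if the caller has not already specified one.
            if (shareConfig && context.get(Config.class) == null) {
                context.set(Config.class, config);
            }
            parseEmbedded(context);
        }
    }

    /** Parses the given number of documents and returns how many Configs were created. */
    static int countInstantiations(boolean shareConfig, int documents) {
        Config.instantiations = 0;
        TopLevelParser parser = new TopLevelParser(shareConfig);
        for (int i = 0; i < documents; i++) {
            parser.parse(new Context());
        }
        return Config.instantiations;
    }

    public static void main(String[] args) {
        System.out.println("without sharing: " + countInstantiations(false, 5));
        System.out.println("with sharing:    " + countInstantiations(true, 5));
    }
}
```

In this model the saving comes entirely from the null check in `TopLevelParser.parse`: a caller that already put a `Config` in the context keeps it, and everyone else inherits the parser's instance instead of triggering a fresh one downstream.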