[ https://issues.apache.org/jira/browse/OAK-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt Ryan resolved OAK-7996.
----------------------------
    Resolution: Won't Fix

So far not a strong enough argument to avoid just doing this via Tika config.

> Ability to disable automatic text extraction via configuration
> --------------------------------------------------------------
>
>                 Key: OAK-7996
>                 URL: https://issues.apache.org/jira/browse/OAK-7996
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>            Reporter: Matt Ryan
>            Assignee: Matt Ryan
>            Priority: Major
>
> This issue is to discuss allowing a user to disable automatic text extraction
> of binary data via a configuration file.
> Currently you can save a tika.config file inside an index definition, which
> overrides the default Tika configuration for that index. You can use this
> approach to disable automatic text extraction.
> I'd like to be able to do this at a global level - not per-index - via a
> configuration file instead. Then inside the document maker code somewhere,
> we would check to see whether the candidate for text extraction has been
> disabled by configuration.
> The value in this approach is that two instances can be identical in terms of
> index definitions, only differing in local configuration. Separate index
> definitions don't have to be maintained. And if you want to change which
> files you extract text from, you don't have to refresh an index to make it happen.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)