[ https://issues.apache.org/jira/browse/OAK-2463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298593#comment-14298593 ]
Chetan Mehrotra edited comment on OAK-2463 at 2/2/15 9:37 AM:
--------------------------------------------------------------
Applied the patch to trunk for now in http://svn.apache.org/r1655996. Once the
review is done, I will merge it to the branch.
A custom Tika config XML can now be provided as part of the index definition node
by creating an {{nt:file}} node named {{config.xml}} under a {{tika}} node in the index definition:
{noformat}
/oak:index/assetType
  - jcr:primaryType = "oak:QueryIndexDefinition"
  - compatVersion = 2
  - type = "lucene"
  - async = "async"
  + tika
    + config.xml (nt:file)
      + jcr:content
        - jcr:data = //config xml binary content
  + indexRules
{noformat}
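Since {{jcr:data}} holds the raw XML bytes, it may help to sanity-check the config before storing it in the repository. The following is only a minimal sketch using the JDK's XML parser: the class name {{TikaConfigCheck}}, its methods, and the sample config are hypothetical and not part of the patch, and the check only confirms that the bytes are well-formed XML with a {{properties}} root element (the root Tika config files use). It does not verify that Tika itself would accept the file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class TikaConfigCheck {

    // Hypothetical sample config, for illustration only.
    static final String SAMPLE_CONFIG =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<properties>\n"
        + "  <parsers>\n"
        + "    <parser class=\"org.apache.tika.parser.DefaultParser\"/>\n"
        + "  </parsers>\n"
        + "</properties>\n";

    // Parses the bytes and returns the root element name.
    // Throws if the content is not well-formed XML.
    static String rootElementName(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Reject DOCTYPE declarations so untrusted input cannot pull in external entities.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getNodeName();
    }

    // Returns true if the bytes are well-formed XML whose root element is <properties>.
    static boolean looksLikeTikaConfig(byte[] xml) {
        try {
            return "properties".equals(rootElementName(xml));
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] bytes = SAMPLE_CONFIG.getBytes(StandardCharsets.UTF_8);
        System.out.println("looks like a Tika config: " + looksLikeTikaConfig(bytes));
    }
}
```

Bytes that pass this check could then be written to the {{jcr:data}} property of the {{jcr:content}} node shown above.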
was (Author: chetanm):
Applied the patch to trunk for now in http://svn.apache.org/r1655996. Once the
review is done, I will merge it to the branch.
A custom Tika config XML can now be provided as part of the index definition node
by creating an {{nt:file}} node named {{config}} under a {{tika}} node in the index definition:
{noformat}
/oak:index/assetType
  - jcr:primaryType = "oak:QueryIndexDefinition"
  - compatVersion = 2
  - type = "lucene"
  - async = "async"
  + tika
    + config (nt:file)
      + jcr:content
        - jcr:data = //config xml binary content
  + indexRules
{noformat}
> Provide support for providing custom Tika config
> ------------------------------------------------
>
> Key: OAK-2463
> URL: https://issues.apache.org/jira/browse/OAK-2463
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: oak-lucene
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Fix For: 1.1.6, 1.0.12
>
> Attachments: OAK-2463.patch
>
>
> Currently oak-lucene uses the default Tika config when extracting text
> content from binary properties. To provide better control, the Tika config
> should be made configurable.