[ 
https://issues.apache.org/jira/browse/OAK-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352948#comment-15352948
 ] 

Vikas Saurabh commented on OAK-4516:
------------------------------------

One possible solution would be to use the 
[WordDelimiterFilter.html#PRESERVE_ORIGINAL|https://lucene.apache.org/core/4_4_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/WordDelimiterFilter.html#PRESERVE_ORIGINAL]
 flag in the default {{OakAnalyzer}}, so that the original token is emitted 
alongside its split parts.

The obvious downside is that it would increase index size, so we would 
probably want to keep this configurable.
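
To make the trade-off concrete, here is a rough sketch in plain Python. It is only a toy stand-in for delimiter splitting, not Lucene's actual {{WordDelimiterFilter}}; the function names and the {{preserve_original}} parameter are hypothetical and chosen for illustration. It shows which terms would be emitted with and without the original value, and how the term count grows.

```python
import re


def split_on_delimiters(value, preserve_original=False):
    """Toy stand-in for word-delimiter splitting (NOT Lucene's real filter)."""
    # Break the value on any run of non-alphanumeric characters, e.g. "_".
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", value) if p]
    # Also emit the unsplit value, but only when splitting actually changed it.
    if preserve_original and value and parts != [value]:
        return [value] + parts
    return parts


def has_term_with_prefix(tokens, prefix):
    """True if any emitted term starts with the given prefix."""
    return any(t.startswith(prefix) for t in tokens)


if __name__ == "__main__":
    values = ["abc_def", "abcdef"]
    for preserve in (False, True):
        index = {v: split_on_delimiters(v, preserve) for v in values}
        total_terms = sum(len(terms) for terms in index.values())
        print(f"preserve_original={preserve}: {index} ({total_terms} terms)")
```

In this sketch, a prefix that spans the delimiter (say {{abc_d}}) finds a matching term only when the original value is kept, at the cost of one extra term for every value that gets split.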

/cc [~chetanm]

> Configurable option to lucene index defs to index original (unanalyzed value 
> as well)
> -------------------------------------------------------------------------------------
>
>                 Key: OAK-4516
>                 URL: https://issues.apache.org/jira/browse/OAK-4516
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Vikas Saurabh
>            Assignee: Vikas Saurabh
>            Priority: Minor
>
> It's sometimes useful to have the original value being indexed stored as a 
> term as well. One use case:
> * Consider a couple of values to be indexed: {{abc_def}} and {{abcdef}}
> * At query time, it seems reasonable for {{abc*}} to return both values
> Currently, the values are indexed as:
> * {{abc_def}} -> {{\[abc], \[def]}}
> * {{abcdef}} -> {{\[abcdef]}}
> So, the query {{abc*}} would only fetch {{abcdef}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
