[jira] [Commented] (OAK-2177) Configurable Analyzer in Lucene index

Chetan Mehrotra (JIRA) Mon, 08 Dec 2014 02:45:13 -0800

    [ 
https://issues.apache.org/jira/browse/OAK-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237746#comment-14237746
 ]


Chetan Mehrotra commented on OAK-2177:
--------------------------------------

[~mmarth] I should have been clearer. The proposed approach is complementary 
to the OSGi one, i.e. Oak can support both approaches

# OSGi way - The user provides a bundle that registers an OSGi service, which 
Oak then picks up
# Declarative way - The user _composes_ the analyzer via content and does not 
need to implement any OSGi service

Most applications support #2 because it is convenient for the end user: in 
most cases you compose your analyzers from the filters, tokenizers etc. that 
Lucene provides out of the box. Only in the few cases where those do not meet 
your requirements do you implement your own
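
To illustrate what "composing" an analyzer from named building blocks could 
look like, here is a minimal, self-contained sketch. It does not use Lucene or 
Oak APIs; the class name, the AnalyzerConfig type and the component names 
("whitespace", "lowercase", "stop") are all hypothetical stand-ins for a 
description that would be read from content.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class DeclarativeAnalyzerSketch {

    /** Hypothetical analyzer description, standing in for what would be read from content. */
    static final class AnalyzerConfig {
        final String tokenizer;
        final List<String> filters;
        final Set<String> stopWords;

        AnalyzerConfig(String tokenizer, List<String> filters, Set<String> stopWords) {
            this.tokenizer = tokenizer;
            this.filters = filters;
            this.stopWords = stopWords;
        }
    }

    /** Looks up a built-in tokenizer by name; only "whitespace" exists in this toy model. */
    static List<String> tokenize(String name, String text) {
        if (!"whitespace".equals(name)) {
            throw new IllegalArgumentException("Unknown tokenizer: " + name);
        }
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Applies one built-in filter, selected by name, to every token. */
    static List<String> applyFilter(String name, List<String> tokens, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            switch (name) {
                case "lowercase":
                    out.add(t.toLowerCase(Locale.ROOT));
                    break;
                case "stop":
                    if (!stopWords.contains(t)) {
                        out.add(t);
                    }
                    break;
                default:
                    throw new IllegalArgumentException("Unknown filter: " + name);
            }
        }
        return out;
    }

    /** Runs the tokenizer, then each configured filter in the order given. */
    static List<String> analyze(AnalyzerConfig config, String text) {
        List<String> tokens = tokenize(config.tokenizer, text);
        for (String filter : config.filters) {
            tokens = applyFilter(filter, tokens, config.stopWords);
        }
        return tokens;
    }

    public static void main(String[] args) {
        AnalyzerConfig config = new AnalyzerConfig(
                "whitespace", List.of("lowercase", "stop"), Set.of("the", "a"));
        System.out.println(analyze(config, "The quick brown fox")); // prints [quick, brown, fox]
    }
}
```

The point of the sketch is that the user only supplies names and parameters; 
no code has to be written unless a component is missing from the built-in set.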

Regarding your queries

# The analyzer can be configured at any time, and stop words can be changed at 
runtime; the changes would be picked up without a restart. The only thing to 
take care of is that you need to reindex, as these settings affect indexed 
data. The same would hold true if we allowed changing the analyzer via an OSGi 
extension

Note that any change in analyzer behaviour would need a reindex at minimum!
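
A toy model of why the reindex is needed: terms already written to the index 
were produced by the old analyzer, while queries are processed by the new one. 
The sketch below is not Lucene or Oak code; the class name and the two 
"analyzers" (identity vs. lowercasing) are assumptions chosen only to show the 
mismatch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

public class ReindexSketch {

    /** Toy inverted index: normalized term -> ids of the documents containing it. */
    static Map<String, Set<Integer>> buildIndex(List<String> docs, UnaryOperator<String> analyzer) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String token : docs.get(id).trim().split("\\s+")) {
                index.computeIfAbsent(analyzer.apply(token), k -> new HashSet<>()).add(id);
            }
        }
        return index;
    }

    /** Normalizes the query term with the given analyzer and looks it up. */
    static Set<Integer> search(Map<String, Set<Integer>> index, String term,
                               UnaryOperator<String> analyzer) {
        return index.getOrDefault(analyzer.apply(term), Set.of());
    }

    public static void main(String[] args) {
        List<String> docs = List.of("Quick Fox", "lazy dog");
        UnaryOperator<String> oldAnalyzer = UnaryOperator.identity();
        UnaryOperator<String> newAnalyzer = s -> s.toLowerCase(Locale.ROOT);

        // Index written with the old analyzer, queried with the new one: no match.
        Map<String, Set<Integer>> staleIndex = buildIndex(docs, oldAnalyzer);
        System.out.println(search(staleIndex, "Fox", newAnalyzer)); // prints []

        // After reindexing with the new analyzer the document is found again.
        Map<String, Set<Integer>> freshIndex = buildIndex(docs, newAnalyzer);
        System.out.println(search(freshIndex, "Fox", newAnalyzer)); // prints [0]
    }
}
```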

So the above approach is another option and does not exclude the OSGi way. I 
see uses for both approaches. It is just that the declarative way is 
implemented first and the OSGi way would come later (as needed). Both Solr and 
ElasticSearch provide a declarative way to configure analyzers and have 
extensive documentation on it, see [1] and [2]

[1] 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis.html
[2] 
https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Specifying_an_Analyzer_in_the_schema

> Configurable Analyzer in Lucene index
> -------------------------------------
>
>                 Key: OAK-2177
>                 URL: https://issues.apache.org/jira/browse/OAK-2177
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: oak-lucene
>    Affects Versions: 1.1.0
>            Reporter: Tommaso Teofili
>            Assignee: Chetan Mehrotra
>         Attachments: OAK-2177.patch
>
>
> Currently the _OakAnalyzer_ is used by default for each Lucene field, 
> sometimes using a different analyzer is needed though.
> It should be possible to make that configurable to support things like: 
> multiple languages, stopword filtering, synonyms expansion, stemming, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
