Hi Chetan,

Overall I prefer B, but it would not be a trivial effort, and it looks
like reinventing what ES (and Solr) already have in that area. That said,
I think A and C would be very tricky to solve in OSGi. I dealt with this
topic myself before 1.0 and there was no clean solution (see also [1]),
partly because Lucene uses ServiceLoader for some of its configuration
points [2][3].
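
To make the OSGi problem with A concrete, here is a minimal sketch (not Oak
code). TextAnalyzer and LowerCaseTextAnalyzer are hypothetical stand-ins for
Lucene's Analyzer so that the example compiles without Lucene. The point is
the explicit ClassLoader argument: in OSGi each bundle has its own class
loader, so a class name taken from configuration only resolves if the loader
passed in can actually see that class.

```java
import java.util.Locale;

/** Hypothetical stand-in for Lucene's Analyzer, so the sketch needs no Lucene jar. */
interface TextAnalyzer {
    String normalize(String input);
}

/** Hypothetical analyzer a user might name in the index configuration. */
class LowerCaseTextAnalyzer implements TextAnalyzer {
    public LowerCaseTextAnalyzer() {
    }

    @Override
    public String normalize(String input) {
        return input.toLowerCase(Locale.ROOT);
    }
}

/** Option A: instantiate an analyzer from its class name via reflection. */
final class ReflectiveAnalyzerLoader {
    private ReflectiveAnalyzerLoader() {
    }

    static TextAnalyzer load(String className, ClassLoader loader) {
        try {
            // The class must be visible to 'loader'; in OSGi this is where
            // a class living in another bundle typically fails to resolve.
            Class<?> raw = Class.forName(className, true, loader);
            Class<? extends TextAnalyzer> type = raw.asSubclass(TextAnalyzer.class);
            return type.getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(
                    "Analyzer class not visible to the given class loader: " + className, e);
        } catch (ClassCastException e) {
            throw new IllegalArgumentException(
                    "Class is not a TextAnalyzer: " + className, e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "Could not instantiate analyzer: " + className, e);
        }
    }
}
```

With a flat classpath any loader will do, which is why this is simple outside
OSGi and awkward inside it.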

Regards,
Tommaso

[1] : https://issues.apache.org/jira/browse/LUCENE-3167
[2] : https://issues.apache.org/jira/browse/SMX4-1637
[3] : http://www.eclipse.org/forums/index.php/t/474446/


2014-11-05 14:38 GMT+01:00 Chetan Mehrotra <[email protected]>:

> Sent the wrong link for the Elasticsearch doc. Refer to
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis.html
> for how it enables constructing an analyzer.
> Chetan Mehrotra
>
>
> On Wed, Nov 5, 2014 at 6:34 PM, Chetan Mehrotra
> <[email protected]> wrote:
> > Hi Team,
> >
> > Currently Oak uses a hard-coded analyzer. As part of OAK-2177 this
> > needs to be opened up for extension. Possible approaches:
> >
> > A - Create analyzer using reflection
> > ------------------------------------------
> >
> > Jackrabbit [1] used to support changing the analyzer. We can do the
> > same by capturing the analyzer class name and instantiating it.
> >
> > Pros : Simple to implement
> > Cons : Would not work well in OSGi
> >
> > B - Make it configurable via content
> > ----------------------------------------------
> >
> > Elasticsearch provides a content-based DSL to create analyzers [1]. We
> > can possibly implement something similar.
> >
> > Pros : End user usability improves quite a bit. No need to code, just
> > configure!
> > Cons : Implementation complexity to support full configuration via
> > content
> >
> > C - Lookup Analyzer via OSGi
> > ----------------------------------------
> >
> > Make use of the OSGi Service Registry to look up the analyzer by name.
> > The user specifies the analyzer as part of the config:
> >
> > "title" : {
> >         "boost" : 1.5,
> >         "analyzer" : "AnalyzerA"
> > }
> >
> > Oak can ship some default analyzers (lowercase etc.) and others can be
> > looked up from the SR. For configuration we have two options:
> >
> > 1. Provide an AnalyzerFactory - The factory can be given the index
> > definition NodeState corresponding to the analyzer element. This can be
> > used to configure the analyzer, say by reading stop word data from
> > content.
> >
> > 2. No default support - Analyzer providers are expected to register the
> > analyzer fully configured. They can probably use the repository
> > API to look up config.
> >
> > Pros : Full extensibility. Makes use of OSGi.
> > Cons :
> > 1. Need to export Lucene classes
> > 2. Need to deal with OSGi's dynamic nature etc. (we can simply throw an
> > exception if the analyzer is not found)
> >
> > Chetan Mehrotra
> > [1] http://wiki.apache.org/jackrabbit/IndexingConfiguration#Index_Analyzers
>
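
Option B from the quoted message (an analyzer described purely by
configuration) can be sketched roughly as follows. This is an illustration
only: ConfiguredAnalyzer, the filter names "lowercase" and "stop", and the
whitespace tokenizer are assumptions made for the example, not Oak, Lucene
or Elasticsearch API. In Oak the filter list and stop words would come from
content rather than from method arguments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

/** Hypothetical analyzer assembled from configuration instead of code. */
final class ConfiguredAnalyzer {
    private final List<UnaryOperator<List<String>>> chain;

    private ConfiguredAnalyzer(List<UnaryOperator<List<String>>> chain) {
        this.chain = chain;
    }

    /** Builds the filter chain in the order the names appear in the config. */
    static ConfiguredAnalyzer fromConfig(List<String> filterNames, Set<String> stopWords) {
        List<UnaryOperator<List<String>>> chain = new ArrayList<>();
        for (String name : filterNames) {
            switch (name) {
                case "lowercase":
                    chain.add(tokens -> tokens.stream()
                            .map(t -> t.toLowerCase(Locale.ROOT))
                            .collect(Collectors.toList()));
                    break;
                case "stop":
                    chain.add(tokens -> tokens.stream()
                            .filter(t -> !stopWords.contains(t))
                            .collect(Collectors.toList()));
                    break;
                default:
                    // Unknown names are rejected up front rather than ignored.
                    throw new IllegalArgumentException("Unknown filter: " + name);
            }
        }
        return new ConfiguredAnalyzer(chain);
    }

    /** Splits on whitespace, then applies each configured filter in turn. */
    List<String> analyze(String text) {
        String trimmed = text.trim();
        List<String> tokens = trimmed.isEmpty()
                ? new ArrayList<>()
                : new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
        for (UnaryOperator<List<String>> filter : chain) {
            tokens = filter.apply(tokens);
        }
        return tokens;
    }
}
```

Even this toy version needs a registry of known filter names and per-filter
parameters, which hints at the implementation complexity listed under Cons.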

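Option C from the quoted message, including the "simply throw an exception if
the analyzer is not found" behaviour, could look roughly like this.
AnalyzerRegistry is a hypothetical in-memory stand-in for the OSGi Service
Registry, and UnaryOperator<String> stands in for a Lucene Analyzer; neither
is real Oak or OSGi API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for the OSGi Service Registry: analyzers keyed by name. */
final class AnalyzerRegistry {
    // UnaryOperator<String> plays the role of an analyzer in this sketch.
    private final Map<String, UnaryOperator<String>> analyzers = new ConcurrentHashMap<>();

    /** Called when a provider makes an analyzer available. */
    void register(String name, UnaryOperator<String> analyzer) {
        analyzers.put(name, analyzer);
    }

    /** Called when the provider goes away (services can disappear at any time). */
    void unregister(String name) {
        analyzers.remove(name);
    }

    /** Looks up the analyzer named in the index config; fails fast if it is absent. */
    UnaryOperator<String> require(String name) {
        UnaryOperator<String> analyzer = analyzers.get(name);
        if (analyzer == null) {
            throw new IllegalStateException("No analyzer registered under name: " + name);
        }
        return analyzer;
    }
}
```

The lookup has to happen each time the analyzer is needed (or be refreshed on
service events), because a provider may unregister between two index updates.
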