I sent the wrong link for the Elasticsearch doc. Refer to
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis.html
for how it enables constructing analyzers.

Chetan Mehrotra
On Wed, Nov 5, 2014 at 6:34 PM, Chetan Mehrotra <[email protected]> wrote:
> Hi Team,
>
> Currently Oak uses a hard coded analyzer. As part of OAK-2177 this
> needs to be opened up for extension. Possible approaches
>
> A - Create analyzer using reflection
> ------------------------------------------
>
> Jackrabbit [1] used to support changing the analyzer. We can do the
> same by capturing the analyzer class name and instantiating it
>
> Pros : Simple to implement
> Cons : Would not work well in OSGi
>
> B - Make it configurable via content
> ----------------------------------------------
>
> Elasticsearch provides a content based DSL to create analyzers [1]. We
> can possibly implement something similar
>
> Pros : End user usability improves quite a bit. No need to code, just
> configure!
> Cons : Implementation complexity to support full configuration via content
>
> C - Lookup Analyzer via OSGi
> ----------------------------------------
>
> Make use of the OSGi Service Registry to look up the analyzer by name.
> User provides the analyzer name as part of config
>
> "title" : {
>     "boost" : 1.5,
>     "analyzer" : "AnalyzerA"
> }
>
> Oak can ship some default analyzers (lowercase etc) and others can be
> looked up from the SR. For configuration we have two options
>
> 1. Provide an AnalyzerFactory - The factory can be provided with the
> index definition nodestate corresponding to the analyzer element. This
> can be used to configure the analyzer, say by reading stop word data
> from content
>
> 2. No default support - Analyzer providers are expected to register the
> analyzer fully configured. They can probably utilize the repository
> API to look up config
>
> Pros : Full extensibility. Makes use of OSGi
> Cons :
> 1. Need to export Lucene classes,
> 2. Deal with OSGi dynamic nature etc (we can simply throw an exception
> if the analyzer is not found)
>
> Chetan Mehrotra
> [1] http://wiki.apache.org/jackrabbit/IndexingConfiguration#Index_Analyzers
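
To make approach C (with the AnalyzerFactory option) a bit more concrete, here is a minimal sketch of the lookup-by-name idea. It is only an illustration under several assumptions: `TextAnalyzer` is a hypothetical stand-in for Lucene's Analyzer, the `AnalyzerFactory` signature is made up, a plain `Map<String, String>` stands in for the index definition nodestate, and an in-memory map stands in for the OSGi Service Registry. None of these are existing Oak or Lucene APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of approach C; not Oak or Lucene API.
class AnalyzerRegistrySketch {

    /** Stand-in for Lucene's Analyzer: turns text into tokens. */
    interface TextAnalyzer {
        List<String> analyze(String text);
    }

    /** Option 1: a factory that receives the analyzer's config (assumed signature). */
    interface AnalyzerFactory {
        TextAnalyzer create(Map<String, String> config);
    }

    // Stand-in for the OSGi Service Registry: factories keyed by analyzer name.
    private final Map<String, AnalyzerFactory> factories = new ConcurrentHashMap<>();

    AnalyzerRegistrySketch() {
        // A default analyzer shipped out of the box (lowercase, no stop words).
        register("lowercase", config -> text -> tokenize(text, Collections.emptySet()));
    }

    void register(String name, AnalyzerFactory factory) {
        factories.put(name, factory);
    }

    // Services can disappear at any time in OSGi; this models that.
    void unregister(String name) {
        factories.remove(name);
    }

    /** Looks up the analyzer by name and fails fast if it is not available. */
    TextAnalyzer lookup(String name, Map<String, String> config) {
        AnalyzerFactory factory = factories.get(name);
        if (factory == null) {
            throw new IllegalStateException("No analyzer registered under name: " + name);
        }
        return factory.create(config);
    }

    /** Example factory that reads comma-separated stop words from the config. */
    static AnalyzerFactory stopWordFactory() {
        return config -> {
            String raw = config.getOrDefault("stopwords", "");
            Set<String> stopWords =
                    new HashSet<>(Arrays.asList(raw.toLowerCase(Locale.ROOT).split(",")));
            return text -> tokenize(text, stopWords);
        };
    }

    // Lowercases, splits on whitespace and drops empty tokens and stop words.
    static List<String> tokenize(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }
}
```

With this shape, the "analyzer" : "AnalyzerA" entry in the index definition would map to `lookup("AnalyzerA", config)`, and an analyzer whose provider has gone away surfaces as an exception rather than a silent fallback. Whether the real thing should fail or fall back to a default is still an open design question.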
