Garrett is correct that you can create a custom type. However, you are also correct that you can specify the "analyzerClass" property, provided the analyzer class has one of two constructors: the default constructor (no args) or one that takes the Lucene Version enum. Otherwise it will throw an exception. This also assumes that you are running a fairly recent version of Blur; if it's 0.2.2 (which I think it is), then you should be able to use that option.
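To illustrate the constructor rule, here is a minimal sketch of a reflective lookup of that kind. It is not Blur's actual code: `AnalyzerLoaderSketch`, its nested `Version` and `Analyzer` stand-ins, and the `load` helper are all hypothetical names, used so the example runs without Lucene or Blur on the classpath. Trying the Version constructor before the no-arg one is also an assumption.

```java
// Hypothetical sketch: stand-in types replace Lucene's Version and Analyzer.
final class AnalyzerLoaderSketch {

    // Stand-in for org.apache.lucene.util.Version.
    enum Version { LUCENE_43 }

    // Stand-in for org.apache.lucene.analysis.Analyzer.
    static abstract class Analyzer { }

    // Accepted: public constructor taking the Version enum.
    static class VersionedAnalyzer extends Analyzer {
        final Version version;
        public VersionedAnalyzer(Version version) { this.version = version; }
    }

    // Accepted: public no-arg constructor.
    static class NoArgAnalyzer extends Analyzer {
        public NoArgAnalyzer() { }
    }

    // Rejected: neither of the two supported constructors exists.
    static class UnsupportedAnalyzer extends Analyzer {
        public UnsupportedAnalyzer(String name) { }
    }

    // Instantiates the named analyzer, or throws if no supported constructor is found.
    static Analyzer load(String className, Version version) {
        Class<? extends Analyzer> clazz;
        try {
            clazz = Class.forName(className).asSubclass(Analyzer.class);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Analyzer class not found: " + className, e);
        }
        try {
            // Assumed order: try the (Version) constructor first.
            return clazz.getConstructor(Version.class).newInstance(version);
        } catch (NoSuchMethodException e) {
            // No (Version) constructor; fall through to the no-arg one.
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Could not instantiate " + className, e);
        }
        try {
            return clazz.getConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No usable constructor on " + className, e);
        }
    }

    public static void main(String[] args) {
        Analyzer a = load(VersionedAnalyzer.class.getName(), Version.LUCENE_43);
        Analyzer b = load(NoArgAnalyzer.class.getName(), Version.LUCENE_43);
        System.out.println(a.getClass().getSimpleName());
        System.out.println(b.getClass().getSimpleName());
    }
}
```

Under this rule the EdgeNGramAnalyzer quoted below would qualify, since it declares no constructor and so gets the implicit public no-arg one, as long as the class itself is public.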
Here's the relevant Blur code:
https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=blob;f=blur-query/src/main/java/org/apache/blur/analysis/type/TextFieldTypeDefinition.java;h=049207bdb4f94cf03a4b0c74891eba129d13fbbb;hb=3967e154e7b064ad40b36d1d5832b7c7bcac44cd#l69

Perhaps the property is not taking effect because the field has already been defined for the given table? If neither of those is the cause, I'm not sure what the problem could be. Let us know how it goes.

Aaron

On Tue, May 13, 2014 at 12:12 PM, Garrett Barton <[email protected]> wrote:

> I think you have to create a custom TypeDefinition that calls your
> analyzer underneath the covers. You can extend the TextFieldTypeDefinition
> if I remember right and just override the analyzer it calls.
>
> ~Garrett
>
>
> On Tue, May 13, 2014 at 11:54 AM, Dibyendu Bhattacharya <
> [email protected]> wrote:
>
>> Hi,
>>
>> I was trying to configure a custom analyzer (EdgeNGram) for a text field.
>>
>> Below is the very simple Edge N Gram Analyzer code, which works fine.
>>
>> public class EdgeNGramAnalyzer extends Analyzer {
>>   @Override
>>   protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
>>     final StandardTokenizer src = new StandardTokenizer(Version.LUCENE_43, reader);
>>     TokenStream tok = new StandardFilter(Version.LUCENE_43, src);
>>     tok = new LowerCaseFilter(Version.LUCENE_43, tok);
>>     tok = new StopFilter(Version.LUCENE_43, tok, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
>>     tok = new EdgeNGramTokenFilter(tok, EdgeNGramTokenFilter.Side.FRONT, 3, 20);
>>     return new TokenStreamComponents(src, tok) {
>>       @Override
>>       protected void setReader(final Reader reader) throws IOException {
>>         super.setReader(reader);
>>       }
>>     };
>>   }
>> }
>>
>>
>> I configured this analyzer for a ColumnDefinition using the following steps
>> via the thrift client:
>>
>> ColumnDefinition customAnalyzerDefn = new ColumnDefinition();
>> customAnalyzerDefn.setFamily(FAMILY_NAME);
>> customAnalyzerDefn.setColumnName(COLUMN_NAME);
>> customAnalyzerDefn.setFieldType("text");
>>
>> Map<String,String> analyzer = new HashMap<String,String>();
>> analyzer.put("analyzerClass", "x.y.z.EdgeNGramAnalyzer");
>> customAnalyzerDefn.setProperties(analyzer);
>>
>> client.addColumnDefinition(TABLE_NAME, customAnalyzerDefn);
>>
>>
>> I copied the jar containing the analyzer class into the Blur lib folder.
>>
>> But I do not see this analyzer getting called. Blur always uses the
>> default StandardAnalyzer for the text field. Kindly let me know if I am
>> missing something, or whether there is an issue where the "analyzerClass"
>> property is not getting set. I found Blur using this key to set the
>> analyzer in TextFieldTypeDefinition.
>>
>> Regards,
>> Dibyendu
>>
