Uwe,

Perfect, exactly what I was looking for. No duplication and no ongoing maintenance (since it uses the defaults) :-)

    return CustomAnalyzer.builder()
            .withTokenizer(StandardTokenizerFactory.class)
            .addTokenFilter(StandardFilterFactory.class)
            .addTokenFilter(LowerCaseFilterFactory.class)
            .addTokenFilter(SuggestStopFilterFactory.class)
            .build();

Thanks Greg.

On 29 January 2017 at 12:17, Uwe Schindler <u...@thetaphi.de> wrote:

> Hi,
>
> CustomAnalyzer is a very generic thing. It has a builder that you can use
> to configure your analyzer. You can define which Tokenizer, which
> StopFilter (and pass stop words as you like), add stemming. No, it does not
> subclass StopwordAnalyzerBase, but that is also not needed, because it has
> a generic configuration interface.
>
> So I don't understand your problem. Lucene APIs take the abstract Analyzer
> class, and CustomAnalyzer provides it the same way StandardAnalyzer does.
> CustomAnalyzer is basically the same as Solr's schema.xml and
> Elasticsearch's analyzer index config.
>
> The first example in the Javadocs is more or less StandardAnalyzer, just
> adapt it and pass the factory:
> http://lucene.apache.org/core/6_4_0/analyzers-common/org/apache/lucene/analysis/custom/CustomAnalyzer.html
>
> Uwe
>
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -----Original Message-----
> > From: Greg Huber [mailto:gregh3...@gmail.com]
> > Sent: Sunday, January 29, 2017 12:48 PM
> > To: java-user@lucene.apache.org
> > Subject: Re: Strange results returned from suggester
> >
> > Uwe,
> >
> > >...or use CustomAnalyzer then you don't need to
> > > subclass. Just declare the components.
> >
> > If I need the StandardAnalyzer code (marked final) and this extends
> > StopwordAnalyzerBase, how would I do this?
> >
> > Cheers Greg
> >
> > On 29 January 2017 at 11:32, Uwe Schindler <u...@thetaphi.de> wrote:
> >
> > > ...or use CustomAnalyzer then you don't need to subclass. Just declare the
> > > components.
> > >
> > > Uwe
> > >
> > > -----
> > > Uwe Schindler
> > > Achterdiek 19, D-28357 Bremen
> > > http://www.thetaphi.de
> > > eMail: u...@thetaphi.de
> > >
> > > > -----Original Message-----
> > > > From: Michael McCandless [mailto:luc...@mikemccandless.com]
> > > > Sent: Sunday, January 29, 2017 12:28 PM
> > > > To: Greg Huber <gregh3...@gmail.com>; Lucene Users <java-u...@lucene.apache.org>
> > > > Subject: Re: Strange results returned from suggester
> > > >
> > > > That's right, just make your own analyzer, forked from
> > > > StandardAnalyzer, and change out the StopFilter. The analyzer is a
> > > > tiny class and this (creating your own components in an analyzer) is
> > > > normal practice...
> > > >
> > > > Mike McCandless
> > > >
> > > > http://blog.mikemccandless.com
> > > >
> > > >
> > > > On Sat, Jan 28, 2017 at 6:09 AM, Greg Huber <gregh3...@gmail.com> wrote:
> > > > > Michael,
> > > > >
> > > > > Thanks for the update, so I just duplicate StandardAnalyzer and replace:
> > > > >
> > > > > //tok = new StopFilter(tok, stopwords);
> > > > > tok = new SuggestStopFilter(tok, stopwords);
> > > > >
> > > > > in createComponents(..)
> > > > >
> > > > > Is there a way I can just override the method as in AnalyzingInfixSuggester
> > > > > rather than duplicating classes?
> > > > >
> > > > > Cheers Greg
> > > > >
> > > > > On 28 January 2017 at 10:31, Michael McCandless <luc...@mikemccandless.com> wrote:
> > > > >>
> > > > >> Hi Greg,
> > > > >>
> > > > >> OK, StandardAnalyzer does indeed use StopFilter, with English stop
> > > > >> words by default, which include "will", so this explains what you are
> > > > >> seeing.
> > > > >>
> > > > >> I suggest making your own analyzer just like StandardAnalyzer, except
> > > > >> instead of StopFilter use the SuggestStopFilter class.
> > > > >>
> > > > >> That class was created for exactly the situation you're in, so that
> > > > >> "will" would not be filtered out as a stop word, but "will " is
> > > > >> (because it ends with a token separator).
> > > > >>
> > > > >> Either that or pass an empty stop word set to StandardAnalyzer, but
> > > > >> then you have no stop word filtering.
> > > > >>
> > > > >> This short blog post explains SuggestStopFilter:
> > > > >>
> > > > >> http://blog.mikemccandless.com/2013/08/suggeststopfilter-carefully-removes.html
> > > > >>
> > > > >> Mike McCandless
> > > > >>
> > > > >> http://blog.mikemccandless.com
> > > > >>
> > > > >>
> > > > >> On Sat, Jan 28, 2017 at 3:39 AM, Greg Huber <gregh3...@gmail.com> wrote:
> > > > >> > Michael,
> > > > >> >
> > > > >> > I am using the standard analyzer with no stop words, and the suggester
> > > > >> > is built from an existing Lucene index.
> > > > >> >
> > > > >> > org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester
> > > > >> >
> > > > >> > I am overriding addContextToQuery to make it an AND rather than an OR:
> > > > >> >
> > > > >> > public void addContextToQuery(Builder query, BytesRef context, Occur clause)
> > > > >> > {
> > > > >> >     query.add(new TermQuery(new Term(CONTEXTS_FIELD_NAME, context)),
> > > > >> >             BooleanClause.Occur.MUST);
> > > > >> > }
> > > > >> >
> > > > >> > Cheers Greg
> > > > >> >
> > > > >> > On 27 January 2017 at 18:20, Michael McCandless <luc...@mikemccandless.com> wrote:
> > > > >> >>
> > > > >> >> Which suggester are you using?
> > > > >> >>
> > > > >> >> Maybe you are using a suggester with an analyzer, and your analysis
> > > > >> >> chain includes a StopFilter and "will" is a stop word?
> > > > >> >>
> > > > >> >> Mike McCandless
> > > > >> >>
> > > > >> >> http://blog.mikemccandless.com
> > > > >> >>
> > > > >> >>
> > > > >> >> On Fri, Jan 27, 2017 at 10:42 AM, Greg Huber <gregh3...@gmail.com> wrote:
> > > > >> >> > Hello,
> > > > >> >> >
> > > > >> >> > Is there any way to see why items are returned from the suggester?
> > > > >> >> > Similar to the search.
> > > > >> >> >
> > > > >> >> > I have a really strange case where if I enter 'will' (without the
> > > > >> >> > quotes) it seems to return all the search results.
> > > > >> >> >
> > > > >> >> > example:
> > > > >> >> >
> > > > >> >> > there should be two entries beginning with will*, i.e. william and
> > > > >> >> > Willoughby
> > > > >> >> >
> > > > >> >> > wil > two entries with correct highlight
> > > > >> >> > will > all entries with NO highlight
> > > > >> >> > willi > single entry
> > > > >> >> > willo > single entry
> > > > >> >> >
> > > > >> >> > I have checked and I do not have will on all the entries!
> > > > >> >> >
> > > > >> >> > Cheers Greg
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > > For additional commands, e-mail: java-user-h...@lucene.apache.org
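
For anyone reading this thread later: the rule Mike describes ("will" is kept while the user may still be typing it, but "will " followed by a separator is dropped as a stop word) can be illustrated without Lucene. The sketch below is only a simplified plain-Java model of that rule, not Lucene's SuggestStopFilter implementation. The class name SuggestStopSketch, the method analyze, the crude letter/digit tokenization, and the three-word stop set are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class SuggestStopSketch {

    // Hypothetical helper (not a Lucene API): lowercases the input, splits it on
    // non-letter/non-digit characters, and drops stop words, except for a final
    // token that is not yet followed by a separator.
    static List<String> analyze(String input, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        // Rough stand-in for a real tokenizer.
        String[] tokens = input.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+");
        // If the input ends on a token character, the last word may still be a prefix.
        boolean endsInsideToken = !input.isEmpty()
                && Character.isLetterOrDigit(input.charAt(input.length() - 1));
        int last = tokens.length - 1;
        for (int i = 0; i < tokens.length; i++) {
            String token = tokens[i];
            if (token.isEmpty()) {
                continue; // a leading separator yields an empty first element
            }
            boolean lastAndUnfinished = (i == last) && endsInsideToken;
            if (stopWords.contains(token) && !lastAndUnfinished) {
                continue; // ordinary stop word removal
            }
            kept.add(token);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Tiny illustrative stop set; Lucene's English default set is larger.
        Set<String> stops = Set.of("will", "the", "a");
        System.out.println(analyze("will", stops));     // [will]
        System.out.println(analyze("will ", stops));    // []
        System.out.println(analyze("the will", stops)); // [will]
    }
}
```

With a plain stop filter the first case would also come back empty, which is consistent with the symptom reported at the start of the thread: typing "will" left no query term, so every entry matched with no highlight.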