Hi,

CustomAnalyzer is very generic. It has a builder that you use to configure
your analyzer: you choose the Tokenizer, choose the StopFilter (and pass
whatever stop words you like), and add stemming. No, it does not subclass
StopwordAnalyzerBase, but that is not needed, because it has a generic
configuration interface.

So I don't understand your problem. Lucene APIs take the abstract Analyzer
class, and CustomAnalyzer provides it just as StandardAnalyzer does.
CustomAnalyzer is basically the same as Solr's schema.xml and Elasticsearch's
analyzer index config.

The first example in the Javadocs is more or less StandardAnalyzer; just adapt
it and pass the factory you need:
http://lucene.apache.org/core/6_4_0/analyzers-common/org/apache/lucene/analysis/custom/CustomAnalyzer.html
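
To make the stop filter choice concrete, here is a minimal, Lucene-free sketch
of the end-of-input rule that Mike describes further down for
SuggestStopFilter: a stop word is dropped, unless it is the last token and
nothing (no token separator) follows it. The class name SuggestStopSketch, the
analyze method and the three-word stop list are hypothetical and exist only for
illustration; real Lucene tokenization is more involved than splitting on
whitespace, so treat this as a model of the behaviour, not as the library code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/**
 * Simplified model (not Lucene code) of suggester-friendly stop word removal:
 * stop words are removed, except a final token that is not yet followed by
 * a separator, because the user may still be typing it.
 */
class SuggestStopSketch {

    // Hypothetical stop list for illustration; not Lucene's default English set.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "the", "will"));

    static List<String> analyze(String input) {
        List<String> kept = new ArrayList<>();
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return kept;
        }
        // Assumption: lowercase + whitespace split stands in for the tokenizer.
        String[] tokens = trimmed.toLowerCase(Locale.ROOT).split("\\s+");
        // The last token is "open" when the input does not end with a separator.
        boolean lastTokenOpen =
                !Character.isWhitespace(input.charAt(input.length() - 1));
        for (int i = 0; i < tokens.length; i++) {
            boolean isLast = (i == tokens.length - 1);
            boolean isStop = STOP_WORDS.contains(tokens[i]);
            // Keep non-stop words, and keep a stop word only if it is the
            // open final token.
            if (!isStop || (isLast && lastTokenOpen)) {
                kept.add(tokens[i]);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(analyze("will"));       // [will]
        System.out.println(analyze("will "));      // []
        System.out.println(analyze("the will"));   // [will]
        System.out.println(analyze("will smith")); // [smith]
    }
}
```

With a plain stop filter, "will" would be removed in all four cases, which
matches the empty-query symptom reported at the start of this thread.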

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Greg Huber [mailto:gregh3...@gmail.com]
> Sent: Sunday, January 29, 2017 12:48 PM
> To: java-user@lucene.apache.org
> Subject: Re: Strange results returned from suggester
> 
> Uwe,
> 
> >...or use CustomAnalyzer then you don't need to
> > subclass. Just declare the components.
> 
> If I need the StandardAnalyzer code (marked final) and this extends
> StopwordAnalyzerBase, how would I do this?
> 
> Cheers Greg
> 
> On 29 January 2017 at 11:32, Uwe Schindler <u...@thetaphi.de> wrote:
> 
> > ...or use CustomAnalyzer then you don't need to subclass. Just declare the
> > components.
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > Achterdiek 19, D-28357 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> > > -----Original Message-----
> > > From: Michael McCandless [mailto:luc...@mikemccandless.com]
> > > Sent: Sunday, January 29, 2017 12:28 PM
> > > To: Greg Huber <gregh3...@gmail.com>; Lucene Users <java-u...@lucene.apache.org>
> > > Subject: Re: Strange results returned from suggester
> > >
> > > That's right, just make your own analyzer, forked from
> > > StandardAnalyzer, and change out the StopFilter.  The analyzer is a
> > > tiny class and this (creating your own components in an analyzer) is
> > > normal practice...
> > >
> > > Mike McCandless
> > >
> > > http://blog.mikemccandless.com
> > >
> > >
> > > > On Sat, Jan 28, 2017 at 6:09 AM, Greg Huber <gregh3...@gmail.com> wrote:
> > > > Michael,
> > > >
> > > > > Thanks for the update, so I just duplicate StandardAnalyzer and replace:
> > > >
> > > >
> > > > //tok = new StopFilter(tok, stopwords);
> > > >   tok = new SuggestStopFilter(tok, stopwords);
> > > >
> > > > in createComponents(..)
> > > >
> > > > > Is there a way I can just override the method as in
> > > > > AnalyzingInfixSuggester rather than duplicating classes?
> > > >
> > > >
> > > > Cheers Greg
> > > >
> > > > > On 28 January 2017 at 10:31, Michael McCandless <luc...@mikemccandless.com> wrote:
> > > >>
> > > >> Hi Greg,
> > > >>
> > > >> OK StandardAnalyzer does indeed use StopFilter, with English stop
> > > >> words by default, which includes "will", so this explains what you are
> > > >> seeing.
> > > >>
> > > >> I suggest making your own analyzer just like StandardAnalyzer, except
> > > >> instead of StopFilter use the SuggestStopFilter class.
> > > >>
> > > >> That class was created for exactly the situation you're in, so that
> > > >> "will" would not be filtered out as a stop word, but "will " is
> > > >> (because it ends with a token separator).
> > > >>
> > > >> Either that or pass an empty stop word set to StandardAnalyzer, but
> > > >> then you have no stop word filtering.
> > > >>
> > > >> This short blog post explains SuggestStopFilter:
> > > >>
> > > > >> http://blog.mikemccandless.com/2013/08/suggeststopfilter-carefully-removes.html
> > > >>
> > > >> Mike McCandless
> > > >>
> > > >> http://blog.mikemccandless.com
> > > >>
> > > >>
> > > > >> On Sat, Jan 28, 2017 at 3:39 AM, Greg Huber <gregh3...@gmail.com> wrote:
> > > >> > Michael,
> > > >> >
> > > > >> > I am using the standard analyzer with no stop words, and it is built
> > > > >> > from an existing Lucene index.
> > > >> >
> > > >> > org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester
> > > >> >
> > > > >> > I am overriding the addContextToQuery to make it an AND rather than
> > > > >> > an OR
> > > >> >
> > > > >> > public void addContextToQuery(Builder query, BytesRef context, Occur clause)
> > > > >> > {
> > > > >> >     query.add(new TermQuery(new Term(CONTEXTS_FIELD_NAME, context)),
> > > > >> >             BooleanClause.Occur.MUST);
> > > > >> > }
> > > >> >
> > > >> > Cheers Greg
> > > >> >
> > > > >> > On 27 January 2017 at 18:20, Michael McCandless <luc...@mikemccandless.com> wrote:
> > > >> >>
> > > >> >> Which suggester are you using?
> > > >> >>
> > > > >> >> Maybe you are using a suggester with an analyzer, and your analysis
> > > > >> >> chain includes a StopFilter and "will" is a stop word?
> > > >> >>
> > > >> >> Mike McCandless
> > > >> >>
> > > >> >> http://blog.mikemccandless.com
> > > >> >>
> > > >> >>
> > > > >> >> On Fri, Jan 27, 2017 at 10:42 AM, Greg Huber <gregh3...@gmail.com> wrote:
> > > >> >> > Hello,
> > > >> >> >
> > > > >> >> > Is there any way to see why items are returned from the suggester,
> > > > >> >> > similar to the search?
> > > >> >> >
> > > > >> >> > I have a really strange case where if I enter 'will' (without the
> > > > >> >> > quotes) it seems to return all the search results.
> > > >> >> >
> > > >> >> > example:
> > > >> >> >
> > > > >> >> > There should be two entries beginning with will*, i.e. william and
> > > > >> >> > Willoughby
> > > >> >> > wil >  two entries with correct highlight
> > > >> >> > will > all entries with NO highlight
> > > >> >> > willi > single entry
> > > >> >> > willo > single entry
> > > >> >> >
> > > >> >> > I have checked and I do not have will on all the entries!
> > > >> >> >
> > > >> >> > Cheers Greg
> > > >> >
> > > >> >
> > > >
> > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >
> >
> >
> >


