Wonderful, thank you for bringing closure!  Stop words and analyzing
suggesters are a tricky combo ...

Mike McCandless

http://blog.mikemccandless.com
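
To illustrate why the combination is tricky, here is a minimal plain-Java
sketch of the behaviour described further down the thread. It is not
Lucene's SuggestStopFilter and does not use any Lucene API: the class
SuggestStopSketch, its methods, and the three-word stop set are all
hypothetical names made up for this illustration, and whitespace splitting
stands in for a real tokenizer. The only idea it demonstrates is that the
last token is kept when the input does not yet end with a separator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class SuggestStopSketch {

    // Hypothetical stop word set, for illustration only. "will" is included
    // because the thread reports it is in the default English stop words.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "the", "will"));

    /** Lower-cases the input and splits it on whitespace. */
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        for (String part : input.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    /** Plain stop filtering: every stop word is dropped. */
    public static List<String> plainStopFilter(String input) {
        List<String> kept = new ArrayList<>();
        for (String token : tokenize(input)) {
            if (!STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    /**
     * Suggest-style stop filtering: a stop word is kept if it is the last
     * token and the input does not end with a separator, because the user
     * may still be typing a longer word.
     */
    public static List<String> suggestStopFilter(String input) {
        List<String> tokens = tokenize(input);
        boolean endsWithSeparator = input.isEmpty()
                || Character.isWhitespace(input.charAt(input.length() - 1));
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String token = tokens.get(i);
            boolean lastAndUnfinished =
                    (i == tokens.size() - 1) && !endsWithSeparator;
            if (lastAndUnfinished || !STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(plainStopFilter("will"));    // prints []
        System.out.println(suggestStopFilter("will"));  // prints [will]
        System.out.println(suggestStopFilter("will ")); // prints []
    }
}
```

With plain stop filtering, typing "will" leaves no tokens at all, which
matches the diagnosis in the thread. With the suggest-style rule the token
survives until a separator is typed after it.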


On Sun, Jan 29, 2017 at 6:37 AM, Greg Huber <gregh3...@gmail.com> wrote:
> Mike,
>
> Many thanks, it works perfectly now.
>
> Cheers Greg
>
> On 29 January 2017 at 11:28, Michael McCandless <luc...@mikemccandless.com>
> wrote:
>>
>> That's right, just make your own analyzer, forked from
>> StandardAnalyzer, and change out the StopFilter.  The analyzer is a
>> tiny class and this (creating your own components in an analyzer) is
>> normal practice...
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Sat, Jan 28, 2017 at 6:09 AM, Greg Huber <gregh3...@gmail.com> wrote:
>> > Michael,
>> >
>> > Thanks for the update, so I just duplicate StandardAnalyzer and replace:
>> >
>> >
>> > //tok = new StopFilter(tok, stopwords);
>> >   tok = new SuggestStopFilter(tok, stopwords);
>> >
>> > in createComponents(..)
>> >
>> > Is there a way I can just override the method as in
>> > AnalyzingInfixSuggester rather than duplicating classes?
>> >
>> >
>> > Cheers Greg
>> >
>> > On 28 January 2017 at 10:31, Michael McCandless
>> > <luc...@mikemccandless.com>
>> > wrote:
>> >>
>> >> Hi Greg,
>> >>
>> >> OK, StandardAnalyzer does indeed use StopFilter, with English stop
>> >> words by default, which include "will", so this explains what you are
>> >> seeing.
>> >>
>> >> I suggest making your own analyzer just like StandardAnalyzer, except
>> >> instead of StopFilter use the SuggestStopFilter class.
>> >>
>> >> That class was created for exactly the situation you're in, so that
>> >> "will" is not filtered out as a stop word, but "will " is
>> >> (because it ends with a token separator).
>> >>
>> >> Either that or pass an empty stop word set to StandardAnalyzer, but
>> >> then you have no stop word filtering.
>> >>
>> >> This short blog post explains SuggestStopFilter:
>> >>
>> >>
>> >> http://blog.mikemccandless.com/2013/08/suggeststopfilter-carefully-removes.html
>> >>
>> >> Mike McCandless
>> >>
>> >> http://blog.mikemccandless.com
>> >>
>> >>
>> >> On Sat, Jan 28, 2017 at 3:39 AM, Greg Huber <gregh3...@gmail.com>
>> >> wrote:
>> >> > Michael,
>> >> >
>> >> > I am using the standard analyzer with no stop words, and the
>> >> > suggester is built from an existing Lucene index.
>> >> >
>> >> > org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester
>> >> >
>> >> > I am overriding the addContextToQuery to make it an AND rather than
>> >> > an OR
>> >> >
>> >> > public void addContextToQuery(Builder query, BytesRef context, Occur clause) {
>> >> >     // Always require the context term (MUST), ignoring the clause passed in
>> >> >     query.add(new TermQuery(new Term(CONTEXTS_FIELD_NAME, context)),
>> >> >             BooleanClause.Occur.MUST);
>> >> > }
>> >> >
>> >> > Cheers Greg
>> >> >
>> >> > On 27 January 2017 at 18:20, Michael McCandless
>> >> > <luc...@mikemccandless.com>
>> >> > wrote:
>> >> >>
>> >> >> Which suggester are you using?
>> >> >>
>> >> >> Maybe you are using a suggester with an analyzer, and your analysis
>> >> >> chain includes a StopFilter and "will" is a stop word?
>> >> >>
>> >> >> Mike McCandless
>> >> >>
>> >> >> http://blog.mikemccandless.com
>> >> >>
>> >> >>
>> >> >> On Fri, Jan 27, 2017 at 10:42 AM, Greg Huber <gregh3...@gmail.com>
>> >> >> wrote:
>> >> >> > Hello,
>> >> >> >
>> >> >> > Is there any way to see why items are returned from the suggester?
>> >> >> > Similar to the search.
>> >> >> >
>> >> >> > I have a really strange case where if I enter 'will' (without the
>> >> >> > quotes) it seems to return all the search results.
>> >> >> >
>> >> >> > example:
>> >> >> >
>> >> >> > There should be two entries beginning with will*, i.e. William and
>> >> >> > Willoughby.
>> >> >> >
>> >> >> > wil >  two entries with correct highlight
>> >> >> > will > all entries with NO highlight
>> >> >> > willi > single entry
>> >> >> > willo > single entry
>> >> >> >
>> >> >> > I have checked and I do not have will on all the entries!
>> >> >> >
>> >> >> > Cheers Greg

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
