I can argue both ways as usual. Stopwords may have started as a way to cope with limited space/memory, but are things really any different now? We keep shoving more and more data into the system, and we still have hardware constraints that can be eased by squeezing out stopwords.
OTOH, how much time and energy do we spend trying to support them? Hmmm, maybe the right thing to do is reconsider how they work. It seems like the pain of supporting them is a consequence of them being a filter; then we get into whether to preserve position info and the like. Would it be easier if we thought of them as pre-processing, before any analysis chain even saw them? It would certainly be easier to explain as "it's as if they never existed" than the present "it depends". This would certainly change behavior, though...

On Aug 29, 2016 18:36, "Walter Underwood" <[email protected]> wrote:

> I’ve never removed stopwords and I started working on search in 1996 at
> Infoseek.
>
> wunder
> Walter Underwood
> [email protected]
> http://observer.wunderwood.org/ (my blog)
>
> On Aug 29, 2016, at 6:32 PM, Alexandre Rafalovitch <[email protected]>
> wrote:
>
> On 30 August 2016 at 08:18, Walter Underwood <[email protected]>
> wrote (on Solr users list):
>
>> Stop word removal is a hack left over from when we were running search
>> engines in 64 kbytes of memory.
>
> If this is a leftover hack, should we start removing it from the
> official examples?
>
> Or do they still have value even with the latest ranking algorithms?
>
> Regards,
>    Alex.
> ----
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
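
P.S. To make the filter-vs-preprocessing distinction concrete, here is a minimal sketch in plain Python (not Lucene/Solr API code; the function names and stopword list are made up for illustration). The filter approach drops stopwords after tokenization but can keep each surviving token's original position, while the pre-processing approach removes them before analysis, so positions are renumbered as if the stopwords never existed:

```python
# Hypothetical sketch contrasting the two stopword strategies.
# Not Lucene code; names and the stopword set are invented for illustration.

STOPWORDS = {"the", "of", "a"}

def filter_style(text):
    """Filter approach: tokenize first, then drop stopwords but keep each
    surviving token's original position (analogous to a stop filter that
    preserves position increments)."""
    tokens = text.lower().split()
    return [(pos, tok) for pos, tok in enumerate(tokens) if tok not in STOPWORDS]

def preprocess_style(text):
    """Pre-processing approach: strip stopwords before the analysis chain
    ever sees them, so positions are renumbered "as if they never existed"."""
    kept = [tok for tok in text.lower().split() if tok not in STOPWORDS]
    return list(enumerate(kept))

text = "the quick fox of the forest"
print(filter_style(text))      # [(1, 'quick'), (2, 'fox'), (5, 'forest')]
print(preprocess_style(text))  # [(0, 'quick'), (1, 'fox'), (2, 'forest')]
```

Note the behavior change: a phrase query like "fox forest" would see the two terms as adjacent under the pre-processing scheme, but separated by a position gap under the filter scheme. That gap is exactly the kind of "it depends" we'd be trading away.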
