Martin Porter keeps a list of common English stop words that is frequently
used to improve search results:

http://snowball.tartarus.org/algorithms/english/stop.txt
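
For what it's worth, here is a rough sketch of how a stop word list could
stand in for the length check discussed below. The function name
index_terms and the tiny inline STOP_WORDS set are my own placeholders for
illustration, not the project's code, and a real list such as the one
linked above is much longer:

```python
import re

# Illustrative subset only (an assumption for this sketch); a real stop
# word list contains many more entries.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}


def index_terms(text, stop_words=STOP_WORDS):
    """Return lowercase word tokens from text that are not stop words."""
    # Split on anything that is not an ASCII letter or digit.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stop_words]


print(index_terms("The cat and a dog sat in the sun"))
```

With this approach, short but meaningful words such as "cat" and "dog" stay
in the index, while a minimum length of 4 characters would drop them along
with the noise words.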

On Tue, Dec 27, 2016 at 6:31 AM, Erik Gustafson <erik.d.gustaf...@gmail.com>
wrote:

> Hi Alex,
>
> As we see, it indexes only words which have a length of 4 characters or
> more.
>
> The reason is to decrease the total index size (which may in fact not be
> critical) and to avoid noise like "a", "the" and "and". This function
> could be made more intelligent.
>
>
> Ahh, that makes sense. I may or may not play around with some changes on a
> local copy. Might save a newcomer or two some time down the road; not
> exactly mission critical though.
>
>
