Hi, I'm working on a mail archiver with a built-in search engine
similar to mnoGoSearch. What I did is write a command that retrieves
the words above such a limit. You can optionally edit the list, then
pipe it into another script, which appends these words to the
stopwords list and deletes them from the index.
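
A minimal sketch of this workflow in Python, for illustration only. It is
not the actual command from the archiver: the function names, the tokenizer
and the in-memory lists are all assumptions. The thresholds follow the Swish
IgnoreLimit rule quoted below (a word must occur in over a given percentage
of the documents and in over a given number of documents).

```python
import re
from collections import Counter


def document_frequencies(docs):
    """Count, for each word, how many documents contain it at least once."""
    df = Counter()
    for text in docs:
        # Hypothetical tokenizer: lowercase runs of letters, digits, apostrophes.
        df.update(set(re.findall(r"[a-z0-9']+", text.lower())))
    return df


def stopword_candidates(docs, percent=80, min_docs=256):
    """Return words found in over `percent`% of docs AND in over `min_docs` docs."""
    total = len(docs)
    df = document_frequencies(docs)
    return sorted(
        word
        for word, count in df.items()
        if count * 100 > percent * total and count > min_docs
    )


def append_stopwords(existing, candidates):
    """Append candidates to the stopword list, skipping words already present."""
    seen = set(existing)
    merged = list(existing)
    for word in candidates:
        if word not in seen:
            seen.add(word)
            merged.append(word)
    return merged
```

In practice the candidate list would be reviewed by hand before being
appended, and the matching entries deleted from the index afterwards. That
last step depends on the index storage and is not shown here.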

Alain

----- Original Message -----
From: "Anonymous (marcio)" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, June 07, 2001 8:03 PM
Subject: Webboard: adaptive stopwords ?


> Author: marcio
> Email:
> Message:
>
> Swish has a smart way to deal with stopwords. Instead of having to list
> them explicitly, you can give a percentage. Words that appear in more than
> a given percentage of the documents seen are not indexed.
>
>
> IgnoreLimit 80 256
> # This automatically omits words that appear too often in the files
> # (these words are called stopwords). Specify a whole percentage
> # and a number, such as "80 256". This omits words that occur in
> # over 80% of the files and appear in over 256 files. Comment out
> # to turn off auto-stopwording.
>
>
> Does mnoGoSearch have this? Any plans to have it?
>
>
> Reply: <http://www.mnogosearch.org/board/message.php?id=2365>
>
> ___________________________________________
> If you want to unsubscribe send "unsubscribe general"
> to [EMAIL PROTECTED]

