This is on my list of things to work on.

An alternative is to run a separate word stemmer at indexing time, so
that words are stored in the index in stemmed form.

The Porter Stemming algorithm is good for this, and I have code to do it.
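
To illustrate the idea (this is only a rough sketch, not the code I
mentioned above and not anything from the ht://Dig sources), here is a
toy index that stores stemmed words. It applies only step 1a of the
Porter algorithm, the plural-suffix rules, and the names
porter_step_1a, build_index and search are made up for the example.
The same stemmer is applied to the query, so different surface forms
land on the same index key.

```python
from collections import defaultdict


def porter_step_1a(word):
    """Apply only step 1a of the Porter stemmer (plural suffixes).

    This is a small fragment of the full algorithm, used here just to
    show where a stemmer plugs into indexing and searching.
    """
    if word.endswith("sses"):
        return word[:-2]      # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]      # ponies -> poni
    if word.endswith("ss"):
        return word           # caress -> caress
    if word.endswith("s"):
        return word[:-1]      # realises -> realise
    return word


def build_index(docs):
    """Map each stemmed word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[porter_step_1a(token)].add(doc_id)
    return index


def search(index, query):
    """Stem the query word the same way before looking it up."""
    return index.get(porter_step_1a(query.lower()), set())


docs = {1: "he realises it", 2: "they realise nothing"}
index = build_index(docs)
matches = search(index, "realises")   # finds both documents
```

With only step 1a, "unrealised" and "realises" would still not be
conflated; that needs the later steps of the algorithm (or prefix
handling, which Porter does not do at all).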

Thanks.

On Fri, 29 Nov 2002, Lachlan Andrew wrote:

> On Fri, 29 Nov 2002 02:21, Gilles Detillieux wrote:
> > if someone wants a dictionary
> > for some other language and finds that such a dictionary
> > is better supported or more complete/correct in aspell
>
> Ahh...  That makes sense.  Thanks.  However I still don't
> understand why we wouldn't want the English dictionary to
> stem  unrealised  and  realises  together (which implicitly
> allows the non-word "unrealises").
>
> > A quick grep shows there are a lot of *hs words in there
> > that htfuzzy can't make use of.  The quick fix would be
> > to grab one of the available flags (BCEFKLOQW) and use
> > that for th->ths, but it might be more logical to keep
> > the S flag for th->ths pluralizations, and use something
> > like E for th->thes conjugations.
>
> Yes.  I've suggested a new rule to the ispell maintainer
> ([^cst]h -> s, [cst]h -> es) which fixes most problems
> (*gh,*ph), while maintaining compatibility with ispell as
> much as possible.  You're right that we can add lots more
> rules to improve stemming in lots of ways.
>
> Cheers,
> Lachlan
>
> --
> Lachlan Andrew  Phone: +613 8344-3816 Fax: +613 8344-6678
> Dept of Electrical and Electronic Engg                CRICOS Provider Code
> University of Melbourne, Victoria, 3010  AUSTRALIA    00116K
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: Get the new Palm Tungsten T
> handheld. Power & Color in a compact size!
> http://ads.sourceforge.net/cgi-bin/redirect.pl?palm0002en
> _______________________________________________
> htdig-dev mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/htdig-dev
>

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485
