According to Lachlan Andrew:
> Greetings,
> 
> On Wed, 27 Nov 2002 15:53, Gilles Detillieux wrote:
> > According to Geoff Hutchison:
> > > > Is the policy to have all possible stemmings, even if
> > > > they are "non-words", like "unrealises"?
> > > No, and I'd expect that ispell doesn't want them
> > > either. Of course many people have moved away from
> > > ispell too...
> > Does that mean we'll end up having to add support for
> > aspell dictionaries to htfuzzy endings?
> 
> Does it matter that the list originally came from  ispell?  
> Its role here is fundamentally different.  For a spell 
> checker, you only care what combinations of letters are 
> valid words.  For a stemmer, you only want to know which 
> "words" are derived from a common stem.  Unless the same 
> actual file is used for spell checking, it is not clear why 
> it matters what spell checker people use.  Am I missing 
> something?

It only matters in that it may affect which dictionaries are available.
We can tweak the English dictionary all we want, but if someone wants a
dictionary for some other language and finds that it is better supported
or more complete/correct in aspell or some other spell checker than it
is in ispell, then they may start asking for support in htfuzzy for
these other dictionary formats.

> > I've made the changes to english.0 and synonyms, with one
> > minor addition (adding D & S flags to birth).
> 
> Thanks for that.  You might want to reconsider the '/S' 
> flag; it produces 'birthes', not 'births' as you might 
> expect.  (The '*h -> es' rule suits words like 'wreath'.)  
> That rule and its use are among the things I hope to clean 
> up after 3.2.0b5 is out...

It's out of there.  Thanks for the heads-up.  I've reinserted "births"
instead, for the sake of completeness, even though htfuzzy won't use it.

A quick grep shows there are a lot of *hs words in there that htfuzzy
can't make use of.  The quick fix would be to grab one of the available
flags (BCEFKLOQW) and use that for th->ths, but it might be more logical
to keep the S flag for th->ths pluralizations, and use something like E
for th->thes conjugations.
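
To make the difference concrete, here is a rough Python sketch of how the
two options would expand a few words. It is only an illustration: the rule
tables are a simplified approximation of the ispell-style S flag, the E
flag assignment is just the hypothetical one suggested above, and the
names (CURRENT_S, PROPOSED_S, PROPOSED_E, expand) are made up for this
example, not anything in htfuzzy.

```python
import re

# Each rule is (condition regex, text to strip, text to append).
# The first rule whose condition matches the end of the word is applied.
# Assumption: a simplified model of the S flag as it behaves today.
CURRENT_S = [
    (r"[^aeiou]y$", "y", "ies"),   # city  -> cities
    (r"[aeiou]y$", "", "s"),       # day   -> days
    (r"[sxzh]$", "", "es"),        # birth -> birthes (the problem case)
    (r"[^sxzhy]$", "", "s"),       # word  -> words
]

# Hypothetical variant: S gains a th -> ths rule ahead of the *h -> es rule.
PROPOSED_S = [
    (r"[^aeiou]y$", "y", "ies"),
    (r"[aeiou]y$", "", "s"),
    (r"th$", "", "s"),             # birth  -> births
    (r"[sxzh]$", "", "es"),        # church -> churches
    (r"[^sxzhy]$", "", "s"),
]

# Hypothetical use of a spare flag (E) for th -> thes conjugations.
PROPOSED_E = [
    (r"th$", "", "es"),            # wreath -> wreathes
]


def expand(word, rules):
    """Return the derived form for the first matching rule, or None."""
    for condition, strip, append in rules:
        if re.search(condition, word):
            stem = word[: len(word) - len(strip)] if strip else word
            return stem + append
    return None


if __name__ == "__main__":
    for word in ("birth", "church", "city"):
        print(word, expand(word, CURRENT_S), expand(word, PROPOSED_S))
    print("wreath", expand("wreath", PROPOSED_E))
```

With this split, a noun like birth would carry S and a verb like wreathe's
stem wreath would carry E, so each word only generates the form that
actually exists.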

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)


_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/htdig-dev
