At 12:34 PM +0100 5/15/00, [EMAIL PROTECTED] wrote:
>Rather than a fuzzy accents search method, why not make the htdig database
>accent independent?  After all, it is case independent already!
>For example:
>
>Garçon  ->   Garçon   ->   garçon   ->   garcon

I would make the analogy to word suffixes rather than to case. Suffixes 
are handled by an endings fuzzy algorithm at search time rather than by 
a general stemming step during indexing. IMHO, this makes searches a 
bit more precise because the alternatives get less weight than what 
the user actually entered. (Remember the old maxim "the customer is 
always right"?)
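
As a rough sketch of that weighting idea (this is not ht://Dig code; the 
function names and the 0.5 weight are made up for illustration, and the 
index is assumed to hold lowercase words), an accents fuzzy could expand 
a query term like this:

```python
import unicodedata


def strip_accents(word):
    """Drop combining accent marks, e.g. 'garçon' -> 'garcon'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_query(term, index_words, fuzzy_weight=0.5):
    """Return {indexed_word: weight} for one query term.

    The word exactly as the user typed it (ignoring case) gets full
    weight; words that differ only in accents get the lower,
    hypothetical fuzzy_weight.
    """
    exact = unicodedata.normalize("NFC", term).lower()
    folded = strip_accents(exact)
    matches = {}
    for word in index_words:
        if word == exact:
            matches[word] = 1.0
        elif strip_accents(word) == folded:
            matches[word] = fuzzy_weight
    return matches
```

With an index holding both "garçon" and "garcon", a query for "Garçon" 
would rank the accented form first and still return the unaccented one.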

Besides, there are some situations where the unaccented word and the 
accented word do *not* mean the same thing.

(BTW, the 3.2 code isn't completely case independent. It stores a 
flag when the word is capitalized. My feeling is that user queries 
with capitals should return capitals preferentially.)

All that said, it would be possible to patch the code in WordList.cc 
and remove accents before storing the word.
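
To illustrate what such a patch would do, and what it would lose, here 
is a minimal sketch in Python rather than the C++ of WordList.cc. The 
names fold_word and build_index are hypothetical, and it assumes Unicode 
input, which the actual code may not use:

```python
import unicodedata
from collections import defaultdict


def fold_word(word):
    """Lowercase a word and drop its combining accent marks."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(words):
    """Map each folded key to the set of original spellings stored under it."""
    index = defaultdict(set)
    for word in words:
        index[fold_word(word)].add(word)
    return dict(index)
```

Indexing the French words "côte" (coast) and "côté" (side) this way puts 
both under the single key "cote", so a search can no longer tell them 
apart. Note also that this approach only handles letters that decompose 
into a base letter plus a mark; a letter such as "ø" passes through 
unchanged.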

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
