RE: [htdig] Accent problem.

NEPOTE Charles (Neuilly Gestion) Mon, 15 May 2000 06:52:31 -0700

I am sorry, but I think the accent patch won't solve my problem because it is an "after-merge solution".
Even without the accent patch, if I manually search for "tue or tué", it still finds only one document... The problem is in the database.

Extract from db.wordlist (read carefully -- I hope you can read accented chars):

trie            i:0
trie            i:1
trie            i:2
trié            i:3
trié            i:4
trié            i:5
tue             i:0
tué             i:0
tue             i:1
tué             i:1
tue             i:2
tué             i:2
...

search "trie" will find 0 1 2
search "trié" will find 3 4 5
search "trié or trie" will find 0 1 2 3 4 5
search "tue" will find 2
search "tué" will find 2
search "tue or tué" will find 2
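
For comparison, here is a minimal sketch of the results one would expect from the extract above. It is written in Python, it is not ht://Dig code, and it assumes that an "or" search is simply the union of the document numbers listed for each word. The names build_index and search_or are hypothetical.

```python
# Toy model of the db.wordlist extract: each entry is (word, document number).
# This is NOT ht://Dig code; it only models an "or" query as a set union.
from collections import defaultdict

WORDLIST = [
    ("trie", 0), ("trie", 1), ("trie", 2),
    ("trié", 3), ("trié", 4), ("trié", 5),
    ("tue", 0), ("tué", 0),
    ("tue", 1), ("tué", 1),
    ("tue", 2), ("tué", 2),
]


def build_index(entries):
    """Map each word to the set of document numbers that contain it."""
    index = defaultdict(set)
    for word, doc_id in entries:
        index[word].add(doc_id)
    return index


def search_or(index, *words):
    """Return the sorted union of the documents matching any of the words."""
    result = set()
    for word in words:
        result |= index.get(word, set())
    return sorted(result)


index = build_index(WORDLIST)
print(search_or(index, "trie", "trié"))  # [0, 1, 2, 3, 4, 5]
print(search_or(index, "tue"))           # [0, 1, 2]
print(search_or(index, "tue", "tué"))    # [0, 1, 2]
```

Under that assumption, "tue", "tué" and "tue or tué" should each return 0 1 2, not only 2 as reported above.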

=> There is a problem... and other people should be able to reproduce it (anyone out there?)...

Many thanks for your help.
Charles Népote.

> -----Original Message-----
> From: Geoff Hutchison [mailto:[EMAIL PROTECTED]]
> Sent: Monday, 15 May 2000 15:16
> To: NEPOTE Charles (Neuilly Gestion)
> Cc: '[EMAIL PROTECTED]'
> Subject: Re: [htdig] Accent problem.
>
>
> At 12:51 PM +0200 5/15/00, NEPOTE Charles (Neuilly Gestion) wrote:
> >(only the file which corresponds to "i:2" will be found).
> >
> >
> >Can this be solved?
> >(Note: I have in htdig.conf:
> >locale: fr_FR
>
> You probably want to try the accents fuzzy patch at
> <ftp://sol.ccsf.cc.ca.us//htdig-patches/3.1.5/accents.5>
>
> (Thanks to Joe Jah for archiving patches.)
>
> This works along the lines of the soundex or metaphone fuzzy
> algorithms. You run it after running htmerge and it will add
> alternative accented or unaccented words to the query (with lesser
> weight as determined by the search_algorithms attribute).
>
> See my other message just now about the +/- of this approach or
> simply stripping accented words. As you noted in your message, the
> two words do not mean the same thing!
>
> --
> -Geoff Hutchison
> Williams Students Online
> http://wso.williams.edu/
>
> ------------------------------------
> To unsubscribe from the htdig mailing list, send a message to
> [EMAIL PROTECTED]
> You will receive a message to confirm this.
>
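
The quoted reply describes the accents fuzzy approach as adding accented or unaccented alternatives of each word to the query. A rough Python sketch of that idea follows. It is not the patch's actual code: the helper names strip_accents and expand_query are hypothetical, the vocabulary is taken from the db.wordlist extract, and the lower weighting of the alternatives is left out.

```python
import unicodedata


def strip_accents(word):
    """Fold a word to its unaccented form, e.g. 'tué' -> 'tue'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_query(words, vocabulary):
    """For each query word, list the indexed words sharing its unaccented form."""
    by_folded = {}
    for indexed_word in vocabulary:
        by_folded.setdefault(strip_accents(indexed_word), set()).add(indexed_word)
    expanded = {}
    for word in words:
        variants = by_folded.get(strip_accents(word), set()) | {word}
        expanded[word] = sorted(variants)
    return expanded


vocabulary = {"trie", "trié", "tue", "tué"}
print(expand_query(["tue"], vocabulary))   # {'tue': ['tue', 'tué']}
print(expand_query(["trié"], vocabulary))  # {'trié': ['trie', 'trié']}
```

An expansion like this only changes which words are looked up at query time. It cannot help if the entries stored for "tue" and "tué" are themselves incomplete.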
