Back when we were still discussing the addition of an accents fuzzy
algorithm, I suggested extending the concept to handle one-to-many and
many-to-one mappings in a configurable fashion.  This would have made it
easier to handle non-Western-European languages, as well as the various
two-character conventions used as alternatives to accented characters or
umlauts.  It would still take some work to add this in, but now that the
basic framework is in place it would be easier to do.
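
To make the idea concrete, here is a rough sketch in Python of what such
a configurable mapping might look like.  It is not ht://Dig code, and the
names (CONVENTIONS, expand_conventions, strip_accents, fuzzy_key) are
made up for illustration.  The table uses the conventions described in
the quoted message below, and it assumes the accented forms are
available as ordinary Unicode characters, which is not the case with the
custom font being discussed.

```python
import unicodedata

# Hypothetical, configurable table: each multi-character convention maps
# to the single accented character it stands for (a many-to-one mapping).
CONVENTIONS = {
    "aa": "ā", "ee": "ē", "ii": "ī", "oo": "ō", "uu": "ū",  # macron
    ".t": "ṭ", ".d": "ḍ", ".m": "ṃ", ".n": "ṇ",             # dot under
    "~n": "ñ",                                              # tilde
}


def expand_conventions(word, conventions=CONVENTIONS):
    """Scan left to right, replacing each convention with its character."""
    keys = sorted(conventions, key=len, reverse=True)  # longest match first
    out = []
    i = 0
    while i < len(word):
        for key in keys:
            if word.startswith(key, i):
                out.append(conventions[key])
                i += len(key)
                break
        else:
            # No convention starts here, so copy the character unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)


def strip_accents(word):
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def fuzzy_key(word, conventions=CONVENTIONS):
    """Reduce accented, convention, and plain spellings to one key."""
    return strip_accents(expand_conventions(word.lower(), conventions))


if __name__ == "__main__":
    # Sample spellings chosen only for illustration.
    for spelling in ("pāli", "paali", "pali", "~naa.na"):
        print(spelling, "->", fuzzy_key(spelling))
```

The three spellings of a word (accented, convention, and unaccented)
all reduce to the same key, which is what a fuzzy match would compare.
The obvious weakness is that a genuine double vowel such as "ee" gets
collapsed too, so a real implementation would have to accept those
extra matches or be smarter about when a convention applies.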

According to Geoff Hutchison:
> For the sake of keeping this on-list, I include your prior e-mail. 
> You can certainly include these characters as word characters using 
> the extra_word_characters attribute:
> 
> <http://www.htdig.org/attrs.html#extra_word_characters>
> 
> However, it may be just as easy to redesign your font depending on 
> your motivation, etc.
> 
> But yes, I would guess that you'd want to work out a mapping using 
> the "accent" fuzzy match, including for your convention. (BTW, a flat 
> "bar" as an accent is called a macron IIRC.)
...
> At 6:51 PM -0800 2/19/01, Michael Olds wrote:
> >Again thank you for your reply and attention.
> >These are the ones replaced. I am afraid not "mostly" other characters with
> >accents. (The font was obviously not designed to be useful as a normal
> >font).
> >
> >¡  ¢  £  ?  ¥  ¦  §  ¨  ©  ª  «  ¬  ¿  À  ¯  °  ±  "  "  ´  µ  ¶  ·  ¸  '
> >º  »  ¼  ½  ¾  Á   Â
> >
> >At this point I start to think I am imposing, but with the idea that I am
> >not rushing this (since it looks to me like this is a major problem that
> >will have to be carefully thought through -- it might be easier to redesign
> >the font according to your suggestion and convert what has been done this
> >far -- the ability to do a word search in this field would be invaluable),
> >let me also describe the other twist, raised by your mention of the fuzzy
> >algorithm: not only are these words often spelled without their accents,
> >they are just as often spelled using a "convention": the "a" with the
> >"accent" (there is a word for this accent, I think "bar" but I am not sure)
> >is aa, and so on with ee, ii, oo, uu, and with the characters with dots
> >under, the convention is .t, .d, .m, .n; and the nya goes ~n.
> >
> >It looks to me like this is the only problem I have with going this way, and
> >on the other hand, I doubt there is another search engine out there that
> >will do this at all.


-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html