D. Starner writes:
>> The result is much better if you allow the ASCII conversion to be a string.
>> This allows you to, e.g., "©" = "(c)", "½" = "1/2", and so on. This is also
>> good for letters: "ß" = "ss", "å" = "aa", etc.
> 
> Et cetera? I think he needs more direction than that, especially since most
> naïve algorithms are going to produce "a" from "å". Digraphs can be treated
> as titlecase or capital or intelligently.

Hm.  Actually I'll want a mode which generates "a" rather than "aa" for
that one, to mimic local practice for how to generate e-mail addresses.
Though that can be tacked on with an extra hack afterwards.
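
A minimal sketch of that idea, assuming a hand-written table: each
character maps to an ASCII string, and a second table of overrides gives
the e-mail mode.  The names ASCII_MAP, EMAIL_OVERRIDES and to_ascii are
hypothetical, and the "?" fallback for unmapped characters is just a
placeholder choice.

```python
# Hypothetical table: each non-ASCII character maps to an ASCII *string*,
# not a single character.
ASCII_MAP = {
    "\u00a9": "(c)",  # COPYRIGHT SIGN
    "\u00bd": "1/2",  # VULGAR FRACTION ONE HALF
    "\u00df": "ss",   # LATIN SMALL LETTER SHARP S
    "\u00e5": "aa",   # LATIN SMALL LETTER A WITH RING ABOVE
}

# Overrides applied on top of ASCII_MAP to mimic local e-mail practice.
EMAIL_OVERRIDES = {"\u00e5": "a"}


def to_ascii(text, overrides=None):
    """Convert text to ASCII, replacing each mapped character by its string."""
    table = dict(ASCII_MAP)
    if overrides:
        table.update(overrides)
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)                  # already ASCII, keep as-is
        else:
            out.append(table.get(ch, "?"))  # assumption: "?" for unmapped
    return "".join(out)
```

With this, to_ascii("\u00e5s") gives "aas", while
to_ascii("\u00e5s", EMAIL_OVERRIDES) gives "as".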

One question, unless it has been answered already - I need to read up on
Unicode before I'll understand all the answers:

I'd like to translate 'ø' to 'o' or maybe 'oe'.  'o' at least when used
for matching, since it should match Swedish 'ö'.  However,
UnicodeData.txt has no decomposition property for that character:

00F8;LATIN SMALL LETTER O WITH STROKE;Ll;0;L;;;;;N;LATIN SMALL LETTER O SLASH;;00D8;;00D8

Is there some other property I can use?  Or is this a rare special case
to handle by hand?
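
For illustration, a sketch of the by-hand route, assuming Python's
standard unicodedata module: characters that have a canonical
decomposition are stripped of their combining marks, and U+00F8 (with
its uppercase U+00D8) goes into a small exception table.  SPECIAL_CASES
and base_letter are hypothetical names, and the table is not meant to be
complete.

```python
import unicodedata

# Hand-maintained exceptions: characters with no decomposition mapping
# in UnicodeData.txt that should still fold to a base letter.
SPECIAL_CASES = {
    "\u00f8": "o",  # LATIN SMALL LETTER O WITH STROKE
    "\u00d8": "O",  # LATIN CAPITAL LETTER O WITH STROKE
}


def base_letter(ch):
    """Fold one character to its base letter(s) for matching purposes."""
    if ch in SPECIAL_CASES:
        return SPECIAL_CASES[ch]
    # Canonical decomposition splits e.g. U+00F6 into "o" + U+0308.
    decomposed = unicodedata.normalize("NFD", ch)
    # Drop the combining marks, keeping only the base character(s).
    return "".join(c for c in decomposed if not unicodedata.combining(c))
```

Here unicodedata.decomposition("\u00f8") is the empty string, matching
the empty field in the entry above, so without the exception table
base_letter would return the character unchanged.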

-- 
Hallvard
