D. Starner writes:
>> The result is much better if you allow the ASCII conversion to be a string.
>> This allows you to, e.g., "©" = "(c)", "½" = "1/2", and so on. This is also
>> good for letters: "ß" = "ss", "å" = "aa", etc.
>
> Etcetera? I think he needs more direction than that, especially since most
> naïve algorithms are going to produce "a" from "å". Digraphs can be treated
> as titlecase or capital or intelligently.

Hm. Actually I'll want a mode which generates "a" rather than "aa" for that one, to mimic local practice for generating e-mail addresses. Though that can be tacked on with an extra hack afterwards.

One question, unless it has been answered already - I need to read up on Unicode before I'll understand all the answers: I'd like to translate 'ø' to 'o' or maybe 'oe'. 'o' at least when used for matching, since it should match Swedish 'ö'. However, UnicodeData.txt has no decomposition property for that character:

00F8;LATIN SMALL LETTER O WITH STROKE;Ll;0;L;;;;;N;LATIN SMALL LETTER O SLASH;;00D8;;00D8

Is there some other property I can use? Or is this a rare special case to handle by hand?

--
Hallvard
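
A minimal sketch of the approach discussed above, assuming Python and its standard unicodedata module: a hand-maintained override table supplies string replacements (including 'ø', which has no decomposition), and canonical decomposition is the fallback for everything else. The names OVERRIDES and to_ascii are hypothetical, and the table entries are only the examples from this thread, not a complete mapping.

```python
import unicodedata

# Hypothetical hand-maintained table: string replacements, plus characters
# such as U+00F8 that have no decomposition in UnicodeData.txt.
OVERRIDES = {
    "\u00a9": "(c)",  # COPYRIGHT SIGN
    "\u00bd": "1/2",  # VULGAR FRACTION ONE HALF
    "\u00df": "ss",   # LATIN SMALL LETTER SHARP S
    "\u00e5": "aa",   # LATIN SMALL LETTER A WITH RING ABOVE
    "\u00f8": "o",    # LATIN SMALL LETTER O WITH STROKE
    "\u00d8": "O",    # LATIN CAPITAL LETTER O WITH STROKE
}


def to_ascii(text, overrides=OVERRIDES):
    """Convert text to ASCII: overrides first, then canonical decomposition."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        elif ch in overrides:
            out.append(overrides[ch])
        else:
            # Decompose and keep only the ASCII part, dropping combining marks.
            decomposed = unicodedata.normalize("NFD", ch)
            base = "".join(c for c in decomposed if ord(c) < 128)
            out.append(base or "?")  # "?" marks characters with no mapping
    return "".join(out)


# A mode that gives "a" rather than "aa" only needs a different table.
EMAIL_OVERRIDES = {**OVERRIDES, "\u00e5": "a"}
```

With this table, to_ascii("Bj\u00f8rn") gives "Bjorn" through the override, while Swedish "\u00f6" reaches "o" through decomposition, so the two match. Passing EMAIL_OVERRIDES as the second argument switches "å" from "aa" to "a".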

