On 2003.08.06, 11:37, Philippe Verdy <[EMAIL PROTECTED]> wrote:

> The main UCD table already contains the needed NFD canonical
> decompositions, and removing accents is simply a matter of NFD
> decomposition plus removal of combining characters
<...>
> they are not really accents but are important to correctly identify
> vowels and consonants,

Note that even most Latin-script orthographies will suffer badly if
diacriticals are removed. I'm sure we can all come up with examples,
many of them quite embarrassing or even dangerous. (E.g., the Portuguese
for "Do you have a porpoise?" becomes quite nasty if you remove the one
acute from it...) Learning that diacriticals do, in most languages, a
lot more than just add snazziness to a word is probably lesson #1 in
i-n-t-e-r-n-a-t-i-o-n-a-l-i-z-a-t-i-o-n...
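
For the record, the recipe quoted above (NFD decomposition, then removal
of combining characters) can be sketched in a few lines of Python. This
is only an illustration, not anyone's reference implementation: the
function name strip_diacritics is made up here, and treating "combining
character" as "nonzero canonical combining class" is my assumption. The
sample words are a tamer Portuguese pair than the one alluded to above.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Naive accent removal: NFD-decompose, then drop combining marks.

    Assumption: a "combining character" is any code point whose canonical
    combining class is nonzero. The name of this helper is hypothetical.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for base (non-combining) characters.
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)


# "avó" is grandmother and "avô" is grandfather in Portuguese;
# after stripping, the two words can no longer be told apart.
print(strip_diacritics("avó"), strip_diacritics("avô"))  # prints "avo avo"
```

The code does exactly what was described, and the distinction between
the two words is gone, which is the whole problem.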

--                                                                   ____.
António MARTINS-Tuválkin                                            |  ()|
<[EMAIL PROTECTED]>                                           |####|
R. Laureano de Oliveira, 64 r/c esq.                                     |
PT-1885-050 MOSCAVIDE (LRS)              Não me invejo de quem tem       |
+351 934 821 700                         carros, parelhas e montes       |
http://www.tuvalkin.web.pt/bandeira/     só me invejo de quem bebe       |
http://pagina.de/bandeiras/              a água em todas as fontes       |

