Re: Cross-platform Unicode normalisation, NFD

Theodore H. Smith Fri, 16 Jun 2006 08:05:14 -0700

From: Marcel <[EMAIL PROTECTED]>
Date: Fri, 16 Jun 2006 16:53:47 +0200


Theodore!

What does normalisation in context to Unicode mean?


If you don't know, you probably don't need my code ;)

Basically, normalisation means that each character is put into "the proper format" for that character.

The problem with Unicode is that a character can have multiple representations. It can be composed or decomposed, and the accents can be in different orders.
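To make that concrete, here is a small sketch in Python (not my plugin's code, just the standard unicodedata module, and the sample strings are my own picks). It shows one letter stored two ways, and two accents on one letter stored in two orders:

```python
import unicodedata

# "é" stored two ways: one precomposed code point, or "e" plus a combining accent.
composed = "\u00e9"       # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT

# They look the same on screen, but they are different strings.
same_raw = composed == decomposed                                   # False
same_nfd = unicodedata.normalize("NFD", composed) == decomposed     # True

# Two accents on one base letter, stored in two different orders.
# U+0323 is COMBINING DOT BELOW, U+0301 is COMBINING ACUTE ACCENT.
a = "e\u0323\u0301"
b = "e\u0301\u0323"

order_raw = a == b                                                  # False
order_nfd = (unicodedata.normalize("NFD", a)
             == unicodedata.normalize("NFD", b))                    # True

print(same_raw, same_nfd, order_raw, order_nfd)
```

NFD splits every character into its parts and puts the accents into one fixed order, which is why both comparisons come out equal after normalising.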

So if I search for a string "é", maybe this string isn't found in a text that really *does* contain the "é", because the character is stored differently in the text from the search query.

Normalisation solves this, by making sure that every character has only 1 correct form, and it's always in that form.
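The search problem and its fix can be sketched like this, again in Python with the standard unicodedata module. The helper names nfd and contains are made up for this example, and I am assuming NFD as the common form; any one normalisation form works as long as both sides use the same one:

```python
import unicodedata

def nfd(s: str) -> str:
    """Return s in canonical decomposed form (NFD)."""
    return unicodedata.normalize("NFD", s)

def contains(text: str, query: str) -> bool:
    """Substring search that normalises both sides first (hypothetical helper)."""
    return nfd(query) in nfd(text)

text = "cafe\u0301"   # "café" with the accent stored as a separate combining mark
query = "\u00e9"      # "é" stored as one precomposed code point

naive_hit = query in text             # False: the stored forms differ
normalised_hit = contains(text, query)  # True: both sides are in NFD

print(naive_hit, normalised_hit)
```

The point is only that both the text and the query go through the same normalisation before they are compared.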

It's the sort of thing that can usually be left until you find a problem :)

I think one or two users needed this code before, when they were copying files with European letters in their names across platforms, because MS stores those letters in a different format than Apple does. When they had that problem, it was time to use my code ;P
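For the file-copying case, a rough sketch of the idea in Python: match up file names from two machines by comparing their normalised forms. The function name match_names and the sample names are invented for illustration, and I am assuming one side hands you decomposed names and the other precomposed ones:

```python
import unicodedata

def match_names(source_names, target_names):
    """Pair each source name with the canonically equivalent target name, if any."""
    # Index the target names by their NFD form so lookups ignore the stored form.
    by_key = {unicodedata.normalize("NFD", name): name for name in target_names}
    return {name: by_key.get(unicodedata.normalize("NFD", name))
            for name in source_names}

# Assumed sample data: the same file name, stored decomposed on one side
# and precomposed on the other.
decomposed_side = ["re\u0301sume\u0301.txt"]
precomposed_side = ["r\u00e9sum\u00e9.txt", "notes.txt"]

pairs = match_names(decomposed_side, precomposed_side)
print(pairs[decomposed_side[0]] == precomposed_side[0])
```

A plain string comparison of the two names would say the file is missing on the other side, even though it is there.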

Best,
Marcel

On 14.06.2006, at 16:13, Theodore H. Smith wrote:

If anyone needs Unicode normalisation


--
http://elfdata.com/plugin/



