Does this mean that there are several codes for the same character, like an "é", in Unicode? But not in the UTF-8 encoding, right?
I basically trust you, but I can hardly believe it ;-)

Best,
Marcel

On 16.06.2006, at 17:04, Theodore H. Smith wrote:

From: Marcel <[EMAIL PROTECTED]>
Date: Fri, 16 Jun 2006 16:53:47 +0200

Theodore!

What does normalisation mean in the context of Unicode?

If you don't know, you probably don't need my code ;)

Basically, normalisation means that each character is put into "the proper format" for that character.

The problem with Unicode is that a character can have multiple representations. It can be composed or decomposed, and the accents can be in different orders.
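To make the "multiple representations" point concrete, here is a minimal sketch using Python's standard `unicodedata` module (not the plugin mentioned in this thread; the variable names are just for illustration). It shows a composed and a decomposed "é", their different UTF-8 bytes, and two accent orders that are canonically equivalent.

```python
import unicodedata

# "é" as a single precomposed code point (U+00E9)
composed = "\u00e9"
# "é" as a base letter "e" followed by a combining acute accent (U+0301)
decomposed = "e\u0301"

print(composed == decomposed)        # False: different code point sequences
print(len(composed), len(decomposed))  # 1 2

# The difference survives UTF-8 encoding: the bytes differ too
print(composed.encode("utf-8"))      # b'\xc3\xa9'
print(decomposed.encode("utf-8"))    # b'e\xcc\x81'

# Accent order: "a" with dot below (U+0323) and dot above (U+0307),
# typed in two different orders
order_a = "a\u0323\u0307"
order_b = "a\u0307\u0323"
print(order_a == order_b)            # False

# Canonical decomposition (NFD) sorts the combining marks into one order
nfd_a = unicodedata.normalize("NFD", order_a)
nfd_b = unicodedata.normalize("NFD", order_b)
print(nfd_a == nfd_b)                # True
```

So UTF-8 does not remove the ambiguity: it only encodes whichever code point sequence it is given.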

So if I search for a string "é", maybe this string isn't found in a text that really *does* contain the "é", because the character is stored differently in the text than in the search query.

Normalisation solves this by making sure that every character has only one correct form, and that it's always in that form.
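The search problem and its fix can be sketched in a few lines. This again assumes Python's standard `unicodedata` module rather than the plugin discussed here, and `contains_normalised` is a hypothetical helper name made up for this example.

```python
import unicodedata

def contains_normalised(text, query, form="NFC"):
    """Substring search after bringing both strings into the same normal form.

    `form` is one of the standard forms accepted by unicodedata.normalize,
    for example "NFC" (composed) or "NFD" (decomposed).
    """
    return unicodedata.normalize(form, query) in unicodedata.normalize(form, text)

text = "cafe\u0301"   # "café" stored decomposed: e + combining acute accent
query = "\u00e9"      # "é" typed as one precomposed code point

print(query in text)                     # False: the plain search misses it
print(contains_normalised(text, query))  # True: both sides are in NFC
```

Which form is used matters less than applying the same one to both the text and the query.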

It's the sort of thing that can usually be left until you find a problem :)

I think one or two users needed this code before, when they were copying files with European letters in them across platforms, because MS stores their letters in a different format than Apple does. When they had that problem, it was time to use my code ;P

Best,
Marcel

On 14.06.2006, at 16:13, Theodore H. Smith wrote:

If anyone needs Unicode normalisation

--
http://elfdata.com/plugin/



_______________________________________________
Unsubscribe or switch delivery mode:
<http://www.realsoftware.com/support/listmanager/>

Search the archives of this list here:
<http://support.realsoftware.com/listarchives/lists.html>

