It's a shame UTF-8 wasn't made the standard in Delphi.  It's commonly used in 
audio file tags, for example, which I have to deal with.

My software needs to search for songs with specific artists or titles, and it 
sounds like I'm going to have problems where the information is visually the 
same but entered differently in different parts of the world, using all sorts 
of 3rd party software.
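
For what it's worth, here is a minimal sketch of the kind of matching I think is needed, assuming the tag text has already been decoded into Unicode strings. It is in Python rather than Delphi purely for brevity, and `search_key` and `matches` are made-up names, not part of any tag library:

```python
import unicodedata


def search_key(text: str) -> str:
    """Build a comparison key: canonical composition plus case folding.

    Assumption: NFC is good enough for tag matching. Compatibility forms
    (NFKC) would fold more variants but also change more text.
    """
    composed = unicodedata.normalize("NFC", text)
    folded = composed.casefold()
    # Case folding can occasionally produce unnormalised text, so compose again.
    return unicodedata.normalize("NFC", folded)


def matches(query: str, tag_value: str) -> bool:
    """True if the query occurs in the tag value once both are normalised."""
    return search_key(query) in search_key(tag_value)


precomposed = "Bj\u00f6rk"   # o-umlaut as the single codepoint U+00F6
combining = "Bjo\u0308rk"    # "o" followed by combining diaeresis U+0308

print(precomposed == combining)                          # raw comparison
print(search_key(precomposed) == search_key(combining))  # normalised comparison
```

The raw comparison fails even though both strings display identically; the normalised keys compare equal, so a search typed one way finds a tag written the other way.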

Ross.
 
-----Original Message-----
From: delphi-boun...@delphi.org.nz [mailto:delphi-boun...@delphi.org.nz] On 
Behalf Of Todd
Sent: Wednesday, 24 November 2010 11:27 AM
To: NZ Borland Developers Group - Delphi List
Subject: Re: [DUG] Upgrading to XE - Unicode strings questions

Hi John

You can find out whether a Unicode string lies entirely inside the BMP by 
converting it to UTF-32 and checking that the new string is twice the 
byte length of the original (UTF-16) string. Any surrogate pair in the 
original makes the UTF-32 version shorter than that.
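
As a rough sketch of that check (Python rather than Delphi, just to illustrate the idea; `is_all_bmp` is a made-up name), comparing the encoded byte lengths:

```python
def is_all_bmp(text: str) -> bool:
    """True if every codepoint in text lies in the Basic Multilingual Plane.

    A BMP codepoint takes 2 bytes in UTF-16 and 4 bytes in UTF-32.
    A codepoint outside the BMP takes 4 bytes in both (a surrogate pair
    in UTF-16), so the 2:1 ratio only holds when the text is all BMP.
    """
    utf16_len = len(text.encode("utf-16-le"))  # "-le" avoids adding a BOM
    utf32_len = len(text.encode("utf-32-le"))
    return utf32_len == 2 * utf16_len


print(is_all_bmp("Bj\u00f6rk"))         # only BMP codepoints
print(is_all_bmp("clef \U0001D11E"))    # U+1D11E is outside the BMP
```

The same answer can be had more directly by checking that no codepoint is above U+FFFF; the length comparison is only convenient when the conversion is already being done.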
> A user could specifically choose to enter that character in either form - 
> this is unlikely, yes.  Or, two users using the same codepage could choose to 
> enter the character differently.
>
> Or if your data is coming from two separate external sources.
>
> The *only* way to be sure is to normalise before processing.
>    
Agreed. That will eliminate any issues with composite codepoints.
>> You only ever get issues if you cross codepage boundaries
>> (like for example if you have users in different countries
>> storing data in a database - which is why international
>> databases often use UTF-8 to store data instead of their
>> native charactersets).
>>      
> This makes no sense at all to me.
>
> "ö" encoded as #$006F + #$0308 **OR** #$00f6 even in UTF-8.  Whether you 
> encode using UTF-8, UTF-16 or UTF-32, a single accented character codepoint 
> vs a character followed by a diacritic are still two distinct "character" 
> sequences.
>    
True. I think the point is that UTF-8 is usually the most compact lossless 
format, at least for mostly Latin text, regardless of whether the 
codepoints are composite or not.
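
To make that concrete, a small sketch (Python, for illustration only; `encoded_sizes` is a made-up helper) that encodes both forms of the character and compares byte counts:

```python
composed = "\u00f6"      # single codepoint: o with diaeresis
decomposed = "o\u0308"   # "o" followed by combining diaeresis


def encoded_sizes(text: str) -> dict:
    """Byte length of text in each encoding (little-endian, no BOM)."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(text.encode(enc)) for enc in encodings}


print(composed.encode("utf-8"))     # b'\xc3\xb6'
print(decomposed.encode("utf-8"))   # b'o\xcc\x88'
print(encoded_sizes(composed))      # {'utf-8': 2, 'utf-16-le': 2, 'utf-32-le': 4}
print(encoded_sizes(decomposed))    # {'utf-8': 3, 'utf-16-le': 4, 'utf-32-le': 8}
```

The two forms stay distinct byte sequences in every encoding, so the choice of encoding affects size only; it does not remove the need to normalise.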

Todd.

_______________________________________________
NZ Borland Developers Group - Delphi mailing list
Post: delphi@delphi.org.nz
Admin: http://delphi.org.nz/mailman/listinfo/delphi
Unsubscribe: send an email to delphi-requ...@delphi.org.nz with Subject: 
unsubscribe


