Re: Multiple encodings for 1 character

David Possin Mon, 08 Jul 2002 14:31:29 -0700

You will have to normalize the way the strings are processed, and you
need to make sure it is done the same way every time. Check out ICU for
this purpose.


http://oss.software.ibm.com/icu/
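
As a rough sketch of what "normalize the same way every time" means in
practice: the snippet below uses Python's standard unicodedata module
rather than ICU, purely as an illustration. The helper name same_text
and the choice of NFC as the default form are my own assumptions, not
anything ICU or the Unicode Consortium prescribes. It compares a
precomposed and a decomposed spelling of the same accented letter.

```python
import unicodedata


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same form.

    same_text is a hypothetical helper for illustration only.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


precomposed = "\u00e9"   # e-acute as a single code point
decomposed = "e\u0301"   # plain e followed by COMBINING ACUTE ACCENT

print(precomposed == decomposed)           # False: the code points differ
print(same_text(precomposed, decomposed))  # True: equal once normalized
```

Which form you pick matters less than applying the same one on both
sides of every comparison.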

Dave
--- "Theodore H. Smith" <[EMAIL PROTECTED]> wrote:
> What is going to be done about the confusion generated from 
> having multiple ways to encode the same character?
> 
> For example, for filenames, OSX will encode an accented Roman 
> letter one way, while for filenames Windows will encode it the 
> other way. This kind of confusion is entirely expected if 
> Unicode allows more than one way to encode the same 
> character.
> 
> This means that matching algorithms won't work, because the 
> characters are different!
> 
> Will there be some kind of recommendation of which to avoid? 
> Will the Unicode consortium make a standard to say that one of 
> these encodings is strongly not recommended, and in fact 
> deprecated?
> 
> And what about the OS that uses this encoding? How will the 
> Unicode consortium make the newly-offending OS change its ways?
> 
> And what about the hordes of apps that expect one format but 
> don't expect the other? And the hordes of OS-independent apps 
> (Java? Perl?) that might generate conflicting versions?
> 
> 


=====
Dave Possin
Globalization Consultant
www.Welocalize.com
http://groups.yahoo.com/group/locales/

