EncodingManager usage note

Andrew Dunbar Fri, 24 Jan 2003 18:56:01 -0800

I just wanted to warn those of us who may need to use
certain EncodingManager methods about a slight problem
I've noticed.


To get the names of certain encodings used by the
system, for use with iconv, a few methods are
provided:

getNativeEncodingName()
getNativeSystemEncodingName()
getNative8BitEncodingName()
getNativeUnicodeEncodingName()

I've noticed that some code treats the last two as
though they are mutually exclusive.  This is not the
case at all.  As noted in the comment/documentation
for getNative8BitEncodingName(), it is perfectly
okay for it to return UTF-8 or any multibyte CJK
encoding.  The only requirement is that the encoding
is a superset of ASCII.
Some code calls this function expecting to get an
ISO-8859-x encoding even when the native encoding is
UTF-8.  I believe some very subtle bugs may be due to
this.
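
To illustrate why this matters, here is a minimal sketch
(not AbiWord code) of what goes wrong when a caller
assumes an "8-bit" encoding means one byte per character.
The helper name utf8CodePointCount is hypothetical, and
it assumes its input is well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only.
// Counts characters (code points) in a UTF-8 byte string by
// skipping continuation bytes, which have the bit pattern 10xxxxxx.
// Assumes the input is well-formed UTF-8.
std::size_t utf8CodePointCount(const std::string& bytes) {
    std::size_t count = 0;
    for (unsigned char c : bytes) {
        if ((c & 0xC0) != 0x80) {
            ++count;  // lead byte or plain ASCII byte
        }
    }
    return count;
}
```

For the UTF-8 bytes "caf\xC3\xA9" the string holds 5 bytes
but only 4 characters, so any code that indexes it as if
it were ISO-8859-x will miscount.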

The correct fix is to add yet another method:
getNativeNonUnicodeEncodingName()

which will never return UTF-8 on *nix, BeOS, QNX, or
OS X and will never return UCS-2 or UTF-16 on Windows.
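
As a rough sketch of the behaviour described above, and
not the actual EncodingManager implementation, the new
method could fall back to a caller-supplied legacy
encoding whenever the native 8-bit name turns out to be
a Unicode one.  Both function names, the prefix test on
"UTF"/"UCS", and the idea of passing in a fallback are
assumptions made for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: true if the name looks like a Unicode
// encoding (UTF-8, UTF-16, UCS-2, ...). The prefix test is an
// assumption; real encoding names may need a fuller table.
bool isUnicodeEncodingName(std::string name) {
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::toupper(c));
                   });
    return name.rfind("UTF", 0) == 0 || name.rfind("UCS", 0) == 0;
}

// Hypothetical stand-in for the proposed
// getNativeNonUnicodeEncodingName(): keeps the native 8-bit name
// unless it is Unicode, in which case it returns the fallback.
std::string nonUnicodeEncodingName(const std::string& native8Bit,
                                   const std::string& fallback) {
    return isUnicodeEncodingName(native8Bit) ? fallback : native8Bit;
}
```

Note that multibyte CJK encodings such as EUC-JP pass
through unchanged here, since they are not Unicode.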

There is a slight and subtle semantic overlap and I
apologize for the confusing nature of this.  I'd like
to implement this myself right now but my internet
cafe bill today is already astronomical ):

So just be careful with encodings and keep up the
good work!

Andrew Dunbar.


=====
http://linguaphile.sourceforge.net/cgi-bin/translator.pl http://www.abisource.com
