Oracle handles character sets by converting between the client's
declared set and the set defined in the DB. For example, if you create
a database as ISO-8859-1 but your client is US-ASCII, then when the
client queries the DB, Oracle converts from the ISO set to ASCII.
Since ISO-8859-1 is a superset of ASCII, any ISO characters that have
no ASCII equivalent are turned into '?' characters.
If the client and the DB use the same set, no conversion is done.
So, for example, you can actually put 8-bit characters into a DB
defined as ASCII (7-bit) -- if you pull them out with a client
that also has its character set defined as ASCII you'll get out
exactly what you put in. Even though you can get data out of a
DB not properly defined to handle the characters being inserted,
string operations will not work reliably unless the defined DB
character set actually matches what you're putting in.
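A minimal sketch of both behaviors in Python, using its codec machinery as a stand-in for Oracle's NLS conversion layer (the byte values are real; the conversion logic is only an analogy, not Oracle's actual code path):

```python
# A byte string containing an ISO-8859-1 character, 'é' (0xE9).
iso_bytes = 'caf\xe9'.encode('iso-8859-1')   # b'caf\xe9'

# Case 1: client set differs from DB set -- a conversion happens, and
# characters with no ASCII equivalent degrade to '?'.
as_ascii = iso_bytes.decode('iso-8859-1').encode('ascii', errors='replace')
print(as_ascii)        # b'caf?'

# Case 2: client set matches DB set -- no conversion, so the 8-bit
# bytes pass through untouched even in a nominally 7-bit DB.
round_trip = iso_bytes  # stored and fetched verbatim
print(round_trip == iso_bytes)   # True
```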
If that's not clear, the example is this: you define your DB as
US-ASCII (7-bit) but end up putting ISO-8859-1 8-bit characters into it.
You can get those characters out exactly as they were put in by
setting the character set of the querying client to US-ASCII as well.
But if you try string operations that rely on collating order, case
mapping, etc., the 8-bit values may skew the results, since Oracle has
no idea what actual characters (glyphs) they represent.
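A quick Python illustration of the same skew (again only an analogy for the DB's behavior): byte-level operations that assume ASCII leave the 8-bit values alone, while interpreting the bytes with the correct set handles them properly:

```python
raw = b'40\xb0c \xe4'   # ISO-8859-1 bytes for '40°c ä' sitting in an "ASCII" DB

# Treated as ASCII bytes, case mapping ignores anything above 0x7F:
print(raw.upper())      # b'40\xb0C \xe4' -- 'ä' (0xE4) is untouched

# Interpreted with the correct character set, the same operation works:
print(raw.decode('iso-8859-1').upper())   # '40°C Ä'
```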
With Microsoft products, things get more complicated because
Microsoft has its own versions of standard character sets. For
example, Microsoft's version of ISO-8859-1 (Windows-1252) assigns
printable characters to values the true standard leaves unassigned
(ISO-8859-1 reserves the 0x80-0x9F columns for control codes).
These extended MS character sets have their own identities so they
don't conflict with the true standards, but you can have mapping
issues, since the true standards may have no equivalents for the
values MS has added. When your source is an MS character set, you
invariably have some characters that don't map to other sets, since
most non-MS platforms adhere to the true ISO standards.
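For example, in Python terms ('cp1252' is its name for Microsoft's extended set, 'iso-8859-1' the true standard):

```python
# Windows-1252 fills the 0x80-0x9F range that ISO-8859-1 reserves for
# control codes -- e.g. the typographic "curly" quotes.
ms_bytes = b'\x93quoted\x94'          # cp1252 curly double quotes
text = ms_bytes.decode('cp1252')
print(text)                            # "quoted" (typographic quotes)

# Mapping those characters to the true standard fails: ISO-8859-1 has
# no equivalents, so they degrade to '?' (or raise, if strict).
print(text.encode('iso-8859-1', errors='replace'))   # b'?quoted?'
```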
Also, don't get confused by what you're getting from the DB versus
how some editor or other tool is displaying it. They may have their
own methods of defining character sets, although NLS_LANG should be
fairly standard at this point.
In the past, I've used this setting for ASCII:

NLS_LANG=AMERICAN_AMERICA.US7ASCII

and this setting for MS character sets:

NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1
I'm not sure those are right for your case, though. Your Linux server
is probably using a standard set. The degree symbol, incidentally, is
part of standard ISO-8859-1 (0xB0), not one of the MS additions; the
MS-only characters occupy the 0x80-0x9F range.
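A quick Python check of that distinction -- the degree sign round-trips through standard ISO-8859-1, while a genuinely Windows-only character does not:

```python
# The degree sign is defined in standard ISO-8859-1 at 0xB0:
print('\u00b0'.encode('iso-8859-1'))   # b'\xb0'

# A genuinely Windows-only character, e.g. the em dash (0x97 in
# cp1252), has no slot of its own in ISO-8859-1:
em_dash = b'\x97'.decode('cp1252')     # U+2014
print(em_dash.encode('iso-8859-1', errors='replace'))   # b'?'
```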
I've followed links from here in the past to find out what characters
are defined by what character sets:
http://www.jimprice.com/jim-asc.htm
> -----Original Message-----
> From: Madhavi [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, July 19, 2001 1:19 AM
> To: [EMAIL PROTECTED]
> Subject: character sets in oracle
>
>
> I have a problem like this:
> We have data in an Excel sheet from a client. It contains 6 cells and
> nearly 5000 rows. We are importing that data into a table using VB.
> The data is in several languages: English, Dutch, French, German, etc.
>
> Except for English, all the other languages contain special characters
> like Ä, 40 °c, etc. When I import this into a table, it is stored in
> that form, i.e. 40°c, Ä, etc. If I query it on the NT server it
> displays 40°c correctly, but if I query it on the Linux server it
> shows something like 40Ä?c. If I query it through a browser it shows
> question marks wherever special characters appear. How can we overcome
> this problem? We set the NLS parameters while creating the database to
> UTF8 (for the character set) and UTF8 (for the NCHAR character set).
> All help is greatly appreciated.
>