So I set NLS_LANG to AL16UTF16? I'd like the note to be more explicit. I thought
that would be the value to which I would set the "character set" argument of the
create database command. The two are not the same. For instance, the "character
set" argument of "create database" might be WE8ISO8859P1, whereas the NLS_LANG
variable might be American_America.WE8ISO8859P1. So I'm guessing
American_America.AL16UTF16 for the NLS_LANG variable?
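
If I have the distinction right, it looks something like this (a sketch; the
database name and the exact clauses are made up for illustration):

   -- The "character set" arguments belong to CREATE DATABASE:
   CREATE DATABASE testdb
      CHARACTER SET WE8ISO8859P1          -- the database character set
      NATIONAL CHARACTER SET AL16UTF16;   -- the national (NCHAR) character set

   -- NLS_LANG, by contrast, is a client-side environment variable:
   -- $ export NLS_LANG=American_America.WE8ISO8859P1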

As I understand UTF-16, it takes 16 bits to encode each character. This should result
in more disk storage being required to store the same amount of information. I also
understand that varchar2 fields are character-based; there is no need to increase the
length of a varchar2 field when switching from 7/8-bit encoding to 16-bit; however,
char fields are byte-based and would need to have their lengths altered. One would
need a char(60) field to hold 30 characters encoded with UTF-16.
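
At least in 9i the semantics can be spelled out explicitly per column; a sketch
(the table and column names are made up):

   CREATE TABLE charset_test (
      byte_col  VARCHAR2(30 BYTE),  -- 30 bytes, however many characters fit
      char_col  VARCHAR2(30 CHAR),  -- 30 characters, however many bytes needed
      nchar_col NVARCHAR2(30)       -- 30 characters, national character set
   );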

I'm not sure what is meant by

Since 9i and upwards, we support UTF-16 encoding at column level as national
(alternative) database character set. In 9i, the UTF-16 encoding Oracle
character set AL16UTF16 has even become the default character set for SQL
NCHAR data types.

How can I use that to my advantage?
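
If I follow, the idea would be something like this (a sketch; the table and
column names are made up):

   -- NCHAR/NVARCHAR2 columns use the national character set
   -- (AL16UTF16 by default in 9i), independent of the database character set:
   CREATE TABLE abstracts (
      id    NUMBER PRIMARY KEY,
      title NVARCHAR2(200)   -- stored as AL16UTF16
   );

   -- The two character sets in force can be checked with:
   SELECT parameter, value
     FROM nls_database_parameters
    WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');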
----------------------------------------------------------------------------------------------------------------
My problem stems mainly from Intermedia. The INSO filtering converts the document to
the base character set of the database. This is not too bad when Cerenkov loses the
diacritical mark over the 'C' and is thus rendered as I have written it here. It's
not good at all when a ligature such as the 'fl' in reflection is lost, so that the
filter converts it to
re ection.
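
For reference, the sort of index I mean is built roughly like this (a sketch;
the table, column, and index names are made up):

   CREATE INDEX doc_text_idx ON documents (doc_blob)
      INDEXTYPE IS ctxsys.context
      PARAMETERS ('FILTER ctxsys.inso_filter');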

There is obviously a cost to going to a 16-bit encoding system: twice as many bits
need to be read to get the same information out. The information which contains the
special characters is stored in a BLOB. I wouldn't think that character sets mattered
to BLOBs at all. However, character set certainly does matter to the
DR$<INDEXNAME>$I tokens of an Intermedia index.
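
The damage is visible right in the token table; a sketch, reusing the made-up
index name from above:

   -- Intermedia stores its extracted tokens in the database character set,
   -- which is where the mangled 're ection' tokens turn up:
   SELECT token_text, token_type
     FROM dr$doc_text_idx$i
    WHERE token_text LIKE 'RE%';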

I would certainly count myself as a member of the ignorant masses when it comes to
character sets. If anything I have stated is untrue, or untrue under certain
conditions, I'd sure like to know.

Ian MacGregor





-----Original Message-----
Sent: Tuesday, August 20, 2002 2:35 PM
To: LazyDBA.com Discussion


From Note 77443.1 on Metalink:

UTF-16 SUPPORT 
--------------
Since 9i and upwards, we support UTF-16 encoding at column level as national
(alternative) database character set. In 9i, the UTF-16 encoding Oracle
character set AL16UTF16 has even become the default character set for SQL
NCHAR datatypes.

