Oracle ncahr versus nvarchar2 and unicode data

Martin J. Evans Mon, 18 Jul 2005 10:05:22 -0700

Hi,

Wondered if anyone can tell me if I've misunderstood something here. I am able
to insert unicode data into nvarchar2 and get back what was inserted but same
code with nchar column does not work.


OCI version is 10.1.0.2
Database is Oracle 10.2 on Linux
Database  CHAR set is WE8ISO8859P1 (Non-Unicode), NCHAR set is AL16UTF16
(Unicode)
This is perl, v5.8.7 built for i686-linux
DBD::Oracle 1.16.

#!/disk2/martin/perl/bin/perl
use charnames ':full';
use DBI;
use Encode;
use DBD::Oracle qw( :ora_types ORA_OCI SQLCS_NCHAR );
use Devel::Peek;

if ( $] < 5.008) {
    print "Need at least Perl 5.8\n";
    exit 1;
}
binmode(STDOUT, ":utf8");

print "<html><head>";
print q|<meta http-equiv="Content-Type" content="text/html;charset=utf-8">|;
print "<title>test</title></head><body>\n";

$ENV{'ORACLE_HOME'} = '/home/oracle/product/10.1.0/db_2';
$ENV{'NLS_NCHAR'} = 'AL32UTF8';
my $dbh = DBI->connect('dbi:Oracle:tentwo.testing', 'xxx', 'yyy');
$dbh->{ChopBlanks} = 1;
print "OCI version is ", ORA_OCI, "\n";
print "ora_can_unicode = ", $dbh->ora_can_unicode, "\n";
if (!($dbh->ora_can_unicode & 1)) {
    print "Database does not support UNICODE\n";
    exit 1;
}
my $paramsH = $dbh->ora_nls_parameters();
printf "Database $ora_server_version CHAR set is %s (%s), NCHAR set is %s
(%s)\n",
    $paramsH->{NLS_CHARACTERSET}, 
    $dbh->ora_can_unicode & 2 ? "Unicode" : "Non-Unicode",
    $paramsH->{NLS_NCHAR_CHARACTERSET},
    $dbh->ora_can_unicode & 1 ? "Unicode" : "Non-Unicode";
eval {
$dbh->do('drop table martin');
};
# NOTE nchar does not work!!!! (why?)
$dbh->do('create table martin (a nvarchar2(256))');
my $sql = "insert into martin values (?)";
my $x = "\N{EURO SIGN}\x{20ac}\N{LATIN CAPITAL LETTER A WITH ACUTE}";
$x .= chr(0x20ac);
print "is_utf8 returns ", utf8::is_utf8($x), "\n";
print "valid returns ", utf8::valid($x), "\n";
my $sth = $dbh->prepare($sql);
$sth->bind_param(1, $x, {ora_csform => SQLCS_NCHAR});
$sth->execute();
$sql = "select a from martin";
$sth = $dbh->prepare($sql);
$sth->execute;
while(my @row = $sth->fetchrow_array) {
    print "valid on returned string returns ", utf8::valid($row[0]), "\n";
    #$_ = decode("utf8", $_) foreach @row;
    print $row[0], "\n";
    print unpack("H*", $row[0]), "\n";
    print Dump(@row[0]);
}
$sth->finish;
$dbh->disconnect();

Output is:

<html><head><meta http-equiv="Content-Type"
content="text/html;charset=utf-8"><title>test</title></head><body>
OCI version is 10.1.0.2
ora_can_unicode = 1
Database  CHAR set is WE8ISO8859P1 (Non-Unicode), NCHAR set is AL16UTF16
(Unicode)
is_utf8 returns 1
valid returns 1
valid on returned string returns 1
<utf8 data here - snipped from email>
e282ace282acc381e282ac
SV = PV(0x82a3efc) at 0x82e5b48
  REFCNT = 1
  FLAGS = (POK,pPOK,UTF8)
  PV = 0x8361bb0 "\342\202\254\342\202\254\303\201\342\202\254"\0 [UTF8
"\x{20ac}\x{20ac}\x{c1}\x{20ac}"]
  CUR = 11
  LEN = 12
</body></html>

UTF-8 data output with Devel::Peek Dump looks fine.
  
A simple change to create the table with nchar instead of nvarchar2
outputs:

<html><head><meta http-equiv="Content-Type"
content="text/html;charset=utf-8"><title>test</title></head><body>
OCI version is 10.1.0.2
ora_can_unicode = 1
Database  CHAR set is WE8ISO8859P1 (Non-Unicode), NCHAR set is AL16UTF16
(Unicode)
is_utf8 returns 1
valid returns 1
valid on returned string returns 1
<utf8 data here snipped from email>
3f3f413f
SV = PV(0x82a3efc) at 0x82e5c3c
  REFCNT = 1
  FLAGS = (POK,pPOK,UTF8)
  PV = 0x8361cc0 "??A?"\0 [UTF8 "??A?"]
  CUR = 4
  LEN = 5
</body></html>

Devel::Peek Dump output looks nothing like what I expected.

Also, do I have to use {ora_csform => SQLCS_NCHAR}? Does not seem to work in
either case if I don't.

And one final thing. If I change $x containing the UTF-8 data to insert and run
$sth->execute again it inserts the original data again and not the changed $x.
I've not confirmed this yet with other drivers, but I am /fairly/ sure with
DBD::ODBC you don't have to call bind_param again if the bound data is changed.

Thanks.

Martin
--
Martin J. Evans
Easysoft Ltd, UK
Development

Oracle ncahr versus nvarchar2 and unicode data

Reply via email to