Re: DBD::Oracle - problem with UTF-8 charset and Oracle 8.1.7

2007-02-15 Thread Alexander Foken
http://search.cpan.org/~pythian/DBD-Oracle-1.19/Oracle.pm#Oracle_and_Unicode 
says roughly that Oracle 8 and Unicode don't mix well. If possible, try 
to upgrade the Oracle server to at least 9.2 and use AL32UTF32 as charset.


(I'm not an Oracle expert. I'm just using it occasionally.)

Alexander


On 15.02.2007 17:32, Peter J. Holzer wrote:

I think I've reported that problem before but we didn't get close
enough to fix the problem, so I'm taking another stab at it.

Environment:

Oracle server is 8.1.7. 
Oracle client is 9.2 (same problem with 10g)

DBD::Oracle version 1.18
LANG=en_US.UTF-8
NLS_LANG=.UTF8

When a parameter has the UTF-8 bit set, the query fails with ORA-12704:
character set mismatch. However, strings returned from queries do have
this bit set, so in a (rather stupid) sequence like:

my $text0 = test;
my ($id1, $text1) = $dbh-selectrow_array(select id, t from ora817utf8bug where 
t=?, {}, $text0);
print $id1, $text1\n;
my ($id2, $text2) = $dbh-selectrow_array(select id, t from ora817utf8bug where 
t=?, {}, $text1);
print $id2, $text2\n;

the first selectrow_array works, and the second fails.

In a level 9 trace, the relevant (I hope) bits look like this:

1   - prepare= DBI::st=HASH(0x9432f1c) at ./ora817utf8bug line 46 
   bind :p1 == 'test' (type 0)

   rebinding :p1 (not-utf8, ftype 1, csid 0, csform 0, inout 0)
   bind :p1 == 'test' (size 4/5/0, ptype 4, otype 1)
   bind :p1 == 'test' (size 4/4, otype 1, indp 0, at_exec 1)

OCIBindByName(94d3a20,95381fc,94cbd94,:p1,3,9537908,4,1,9538214,0,953820c,0,0,2)=SUCCESS
OCIBindDynamic(94d3830,94cbd94,95381d8,499ab0,95381d8,499da0)=SUCCESS
OCIAttrGet(94d3830,OCI_HTYPE_BIND,95381e8,0,31,94cbd94)=SUCCESS
   bind :p1 == 'test' (in, not-utf8, csid 871-0-871, ftype 1, csform 
0-0, maxlen 4, maxdata_size 0)
OCIAttrSet(94d3830,OCI_HTYPE_BIND,bfffb38a,0,31,94cbd94)=SUCCESS
dbd_st_execute SELECT (out0, lob0)...
   in  ':p1' [0,0]: len  4, ind 0
OCIStmtExecute(94cbd20,94d3a20,94cbd94,0,0,0,0,0)=SUCCESS
OCIAttrGet(94d3a20,OCI_HTYPE_STMT,bfffb4ba,0,10,94cbd94)=SUCCESS
dbd_st_execute SELECT returned (SUCCESS, rpc0, fn4, out0)
[...]
1   - prepare= DBI::st=HASH(0x92507a4) at ./ora817utf8bug line 48
   bind :p1 == test (type 0)
   rebinding :p1 (is-utf8, ftype 1, csid 0, csform 0, inout 0)
   bind :p1 == test (size 4/5/0, ptype 4, otype 1)
   bind :p1 == 'test' (size 4/4, otype 1, indp 0, at_exec 1)

OCIBindByName(94d3a20,95382f4,94cbd94,:p1,3,954d750,4,1,953830c,0,9538304,0,0,2)=SUCCESS
OCIBindDynamic(94d3830,94cbd94,95382d0,499ab0,95382d0,499da0)=SUCCESS
   rebinding :p1 with UTF8 value so setting csform=SQLCS_IMPLICIT   
OCIAttrSet(94d3830,OCI_HTYPE_BIND,bfffb383,0,32,94cbd94)=SUCCESS
OCIAttrGet(94d3830,OCI_HTYPE_BIND,95382e0,0,31,94cbd94)=SUCCESS
   bind :p1 == test (in, is-utf8, csid 871-0-871, ftype 1, csform 
0-2, maxlen 4, maxdata_size 0)
OCIAttrSet(94d3830,OCI_HTYPE_BIND,bfffb38a,0,31,94cbd94)=SUCCESS
dbd_st_execute SELECT (out0, lob0)...
   in  ':p1' [0,0]: len  4, ind 0
OCIStmtExecute(94cbd20,94d3a20,94cbd94,0,0,0,0,0)=ERROR
OCIAttrGet(94d3a20,OCI_HTYPE_STMT,bfffb3be,0,129,94cbd94)=SUCCESS
OCIErrorGet(94cbd94,1,NULL,bfffafec,ORA-12704: character set mismatch 
,1024,2)=SUCCESS

AFAICS, the difference is the csform in the final bind. In the
successful case it's csform 0-0, in the unsuccessful case it's
csform 0-2. I assume that 2 is SQLCS_IMPLICIT mentioned 2 lines
above.

The code works if run against Oracle 9i or 10g servers, so I think that
maybe Oracle 8.1.7 doesn't support csform=SQLCS_IMPLICIT?

Here is a full script to demonstrate the problem (it tries to read the
parameters for connect ($data_source, $username, $auth) from a file
specified with --cred or the env. variable DBI_CREDENTIAL_FILE - I
normally keep such files in ~/.dbi)

#!/usr/local/bin/perl -w

=head1 NAME

ora817utf8bug - demonstrate bug with Oracle 8.1.7 and UTF8

=head1 SYNOPSIS

export LANG=en_US.UTF-8
export NLS_LANG=.UTF8
export DBI_TRACE=9

./ora817utf8bug

=head1 DESCRIPTION

=cut

use strict;
use DBI;
use Getopt::Long;
use Pod::Usage;


my $help;
my $credfile = $ENV{DBI_CREDENTIAL_FILE} || default;

GetOptions(
'help|?'= \$help,
'credfile=s'= \$credfile,
) or pod2usage(2);
pod2usage(1) if $help;

my @cred = read_cred($credfile);
my $dbh = DBI-connect($cred[0], $cred[1], $cred[2], {RaiseError = 1, AutoCommit 
= 0});
$dbh-{FetchHashKeyName} = 'NAME_lc';

$dbh-{RaiseError} = 0;
$dbh-do(drop table ora817utf8bug);
$dbh-do(create table ora817utf8bug(id number, t varchar2(40)));
$dbh-{RaiseError} = 1;
$dbh-do(insert into ora817utf8bug(id, t) values(1, 'test'));


my $text0 = test;
my ($id1, $text1) = $dbh-selectrow_array(select id, t from ora817utf8bug where 
t=?, {}, $text0);
print $id1, $text1\n;
my ($id2, $text2) = $dbh-selectrow_array(select id, t from ora817utf8bug where 

Re: DBD::Oracle - problem with UTF-8 charset and Oracle 8.1.7

2007-02-15 Thread Peter J. Holzer
On 2007-02-15 18:45:31 +0100, Alexander Foken wrote:
 http://search.cpan.org/~pythian/DBD-Oracle-1.19/Oracle.pm#Oracle_and_Unicode 
 says roughly that Oracle 8 and Unicode don't mix well.

| Oracle 8 client libraries have a number of bugs related to character
| set handling, especially when connected to an Oracle 9+ server.

However, I have the opposite situation: I have 9.2 client libraries and
an 8.1 server.


 If possible, try to upgrade the Oracle server to at least 9.2 and use
 AL32UTF32 as charset.

Unfortunately, that's not an option right now. I do have a number of
workarounds: I can either use another charset than UTF-8 (the database
uses WE8MSWIN1252 anyway) or I can explicitely encode() all strings
passed to DBI. Neither is particularly appealing, and I'd rather fix
the problem than program workarounds.

hp

-- 
   _  | Peter J. Holzer| If I wanted to be academically correct,
|_|_) | Sysadmin WSR   | I'd be programming in Java.
| |   | [EMAIL PROTECTED]  | I don't, and I'm not.
__/   | http://www.hjp.at/ |   -- Jesse Erlbaum on dbi-users


pgp9eSEqHTOHc.pgp
Description: PGP signature