I have UTF8 text data supporting multiple languages in a text file that I 
need to insert into a CLOB field in an Oracle database (v8.1.7.1.0).  I'm 
using PERL v 5.005_03.  The contents get corrupted during the insert 
process.

Both the input data and SYS.V_$NLS_PARAMETERS.NLS_CHARACTERSET (for the clob 
field) are UTF8.

Basically, what goes in as:
<record name="שלוחולאומבעולם.dcr">
    <item delimiter=", " name="CREATOR">

Comes out as:
<record name="W)WWWWWWWWWW"WWW.dcr">
    <item delimiter=", " name="CREATOR">


I've tried this two ways:
1) using a direct embedded perl call using DBI:

my $stmt   = "UPDATE lob_table SET detail = :1 " .
             "WHERE  something = :2 ";
my $sth    = $dbh->prepare( $stmt );
$sth->bind_param(1, $buf, {ora_type => ORA_CLOB});
$sth->bind_param(2, 'aValue' );
eval { $sth->execute(); };

2) calling a stored procedure, binding parameters as above.  The
stored procedure looks a bit like the following:

DECLARE
  t_clob CLOB;
BEGIN
  SELECT detail INTO t_clob
  FROM   LOB_TABLE
  WHERE  id  = V_id
  FOR UPDATE;

  DBMS_LOB.WRITE( t_clob, V_lob_byte_size, 1, V_detail );
END;

where V_detail is a varchar2 buffer containing the data passed from perl, 
and V_lob_byte_size is the buffer size.

Both methods give the same output when dumped, so I  think that when DBI or 
DBD passes the data, it tries convert any UTF8 character represented by more 
than 2 bytes to a 2 byte unicode representation, and some data required to 
represent the character is lost.  Something along those lines anyway.

The gist is, I need to be able to tell DBI (or Oracle) that the data is 
multibyte UTF8 encoded.

Any help is appreciated.

-Rob


_________________________________________________________________
Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp

Reply via email to