I have UTF8 text data supporting multiple languages in a text file that I 
need to insert into a CLOB field in an Oracle database (v8.1.7.1.0).  I'm 
using PERL v 5.005_03.  The contents get corruted during the insert process.

Basically, what goes in as:
<record name="שלוחולאומבעולם.dcr">
    <item delimiter=", " name="CREATOR">

Comes out as:
<record name="W)WWWWWWWWWW"WWW.dcr">
    <item delimiter=", " name="CREATOR">

I've tried this two ways (these are only code excerpts):
1) using a direct embedded perl call using DBI, a la:

my $stmt   = "UPDATE lob_table SET detail = :1 " .
             "WHERE  something = :2 ";
my $sth    = $dbh->prepare( $stmt );
$sth->bind_param(1, $buf, {ora_type => ORA_CLOB});
$sth->bind_param(2, 'aValue' );
eval { $sth->execute(); };

2) calling a stored procedure, binding parameters as above.  The stored 
procedure looks a bit like the following:

UPDATE LOB_TABLE
VALUES( id, EMPTY_CLOB() );
DECLARE
  t_clob CLOB;
BEGIN
  SELECT detail INTO t_clob
  FROM   LOB_TABLE
  WHERE  id  = V_id
  FOR UPDATE;
  DBMS_LOB.WRITE( t_clob, V_lob_byte_size, 1, V_detail );
END;


Both methods give the same output when dumped, so I'm inclide to think that 
when DBI or DBD passes the data, it tries convert any UTF8 character 
represented by more than 2 bytes to a 2 byte unicode representation, and 
some data required to represent the character is lost.  Something along 
those lines anyway.
The gist is that we need to be able to tell DBI or Oracle that the data is 
multibyte UTF8 encoded.

Any help is appretiated.

-Rob

_________________________________________________________________
Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp

Reply via email to