I have UTF8 text data supporting multiple languages in a text file that I
need to insert into a CLOB field in an Oracle database (v8.1.7.1.0). I'm
using PERL v 5.005_03. The contents get corruted during the insert process.
Basically, what goes in as:
<record name="שלוחולאומבעולם.dcr">
<item delimiter=", " name="CREATOR">
Comes out as:
<record name="W)WWWWWWWWWW"WWW.dcr">
<item delimiter=", " name="CREATOR">
I've tried this two ways (these are only code excerpts):
1) using a direct embedded perl call using DBI, a la:
my $stmt = "UPDATE lob_table SET detail = :1 " .
"WHERE something = :2 ";
my $sth = $dbh->prepare( $stmt );
$sth->bind_param(1, $buf, {ora_type => ORA_CLOB});
$sth->bind_param(2, 'aValue' );
eval { $sth->execute(); };
2) calling a stored procedure, binding parameters as above. The stored
procedure looks a bit like the following:
UPDATE LOB_TABLE
VALUES( id, EMPTY_CLOB() );
DECLARE
t_clob CLOB;
BEGIN
SELECT detail INTO t_clob
FROM LOB_TABLE
WHERE id = V_id
FOR UPDATE;
DBMS_LOB.WRITE( t_clob, V_lob_byte_size, 1, V_detail );
END;
Both methods give the same output when dumped, so I'm inclide to think that
when DBI or DBD passes the data, it tries convert any UTF8 character
represented by more than 2 bytes to a 2 byte unicode representation, and
some data required to represent the character is lost. Something along
those lines anyway.
The gist is that we need to be able to tell DBI or Oracle that the data is
multibyte UTF8 encoded.
Any help is appretiated.
-Rob
_________________________________________________________________
Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp