On Fri, Aug 19, 2005 at 01:52:54PM +0100, Charles Jardine wrote:
> Tim Bunce wrote, on 17/08/2005 21:23:
>
> >If the $statement is UTF8 then simply tell Oracle that it's UTF8.
> >Simple.
>
> Unfortunately this isn't as easy as it is when binding. OCI
> statement handles do not support OCI_ATTR_CHARSET_ID. They
> inherit their encoding settings from their environment handle
> at create time, and do not expose them for inspection or change.

Ah. I'd forgotten that. Thanks. That puts a different light on it...

> It seems it will be necessary to create the statement
> handle using a different environment handle.

Let's go back to your original message:

On Fri, Aug 12, 2005 at 04:45:50PM +0100, Charles Jardine wrote:
: The method $dbh->prepare($stmt) of DBD::Oracle ignores the
: state of the utf8 flag in the SV for $stmt. For example,
: after
:
:   my $a = "select '\xe2' from dual";
:   my $b = decode_utf8("select '\xc3\xa2' from dual");
:
: $a and $b compare equal with the perl 'eq' operator. However,
: their internal representations differ. $a is represented in
: iso-8859-1 and its utf8 flag is off. $b is represented in
: utf-8 and its utf8 flag is on.
:
: The two equal statements give different results when run
: using DBD::Oracle. Which gives the 'correct' result depends
: on the client-side database character set (the NLS_LANG
: charset). $a gives the correct result for 8-bit charsets.
: $b gives the correct result if the charset is utf-8.
:
: This is clearly a bug.

Why? (I'm not sure it isn't, but I'd like a stronger explanation,
and one that's explained with reference to the whole of
http://search.cpan.org/~timb/DBD-Oracle/Oracle.pm#Unicode )

The approach DBD::Oracle takes is that the application should behave
in accordance with NLS_LANG. To use UTF8 in your SQL you should set
NLS_LANG to UTF8. If your NLS_LANG is not UTF8 then don't try to give
UTF8 strings to Oracle.

The DBD::Oracle docs explicitly say:

* Sending Data using SQL
*
* Oracle assumes the SQL statement is in the default client character set
* (as specified by NLS_LANG). So Unicode strings containing non-ASCII
* characters should not be used unless the default client character set
* is AL32UTF8.

Tim.

: It can affect any SQL statement which
: contains a non-ASCII character. It can strike whether or not
: Unicode is being used in the database. I would like to fix it.
: This requires that code be put somewhere which decides how to
: process the SV on the basis of its utf8 flag and of the
: NLS_LANG charset.

> As a side effect,
> this will change the default charsets for bind and define
> handles under this statement handle. This, in turn, may
> require change in the code for binding and defining to
> preserve the existing behaviour.
>
> This patch is going to take some time. Nevertheless, I
> shall attempt it. It would be good to have the behaviour of
> prepare() consistent with that of bind_param().
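
For anyone following the thread, the distinction Charles describes (two strings that are equal under 'eq' but stored differently) and the kind of decision his proposed fix would have to make can be sketched in plain Perl. This is only an illustration and not DBD::Oracle code: statement_octets() is a hypothetical helper, $latin and $wide stand in for $a and $b from the original message, and the charset is passed as an Encode name, so the mapping from an NLS_LANG charset such as AL32UTF8 to an Encode name is assumed rather than shown.

```perl
use strict;
use warnings;
use bytes ();                       # only for bytes::length(); no lexical effect
use Encode qw(decode_utf8 encode);

# The two statements from the original message ($a and $b there).
my $latin = "select '\xe2' from dual";                   # utf8 flag off
my $wide  = decode_utf8("select '\xc3\xa2' from dual");  # utf8 flag on

# Hypothetical helper, not part of DBD::Oracle: produce the octets for a
# statement in the client charset, based on the characters in the string
# rather than on how perl happens to store them internally.
sub statement_octets {
    my ($stmt, $encode_name) = @_;
    return encode($encode_name, $stmt);
}

print "equal under eq: ", ($latin eq $wide ? "yes" : "no"), "\n";
print "utf8 flags:     ", (utf8::is_utf8($latin) ? 1 : 0), " and ",
                          (utf8::is_utf8($wide)  ? 1 : 0), "\n";
print "internal bytes: ", bytes::length($latin), " and ",
                          bytes::length($wide), "\n";

# For each assumed client charset, both strings yield identical octets.
for my $name ('iso-8859-1', 'UTF-8') {
    my $from_latin = statement_octets($latin, $name);
    my $from_wide  = statement_octets($wide,  $name);
    printf "%-10s same octets: %s (%s)\n", $name,
        ($from_latin eq $from_wide ? "yes" : "no"),
        unpack('H*', $from_latin);
}
```

With a conversion of this kind, the octets handed to Oracle would depend only on the characters in the statement and on the client charset, which is the consistency with bind_param() that the original message asks for. Whether that belongs in prepare(), and how it interacts with the environment handle issue above, is a separate question.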