On Mon, Sep 03, 2007 at 07:20:42PM +0200, Peter J. Holzer wrote:
> On 2007-09-03 13:03:37 +0100, Tim Bunce wrote:
> > Oh the joys of Oracle character set support.
> >
> > See the dbd_rebind_ph() code in dbdimp.c. Especially from the line
> > "csform = phs->csform;" onwards.
> >
> > You can explicitly set the csform and csid using bind_param(..., { ... })
> >
> > Have fun!
>
> I had :-). You gave me a push into the right direction, and I think I
> found the problem:
>
> 1375 csform = phs->csform;
> 1376
> 1377 if (!csform && SvUTF8(phs->sv)) {
> 1378 /* try to default csform to avoid translation through
> non-unicode */
> 1379 if (CSFORM_IMPLIES_UTF8(SQLCS_NCHAR)) /* prefer
> NCHAR */
> 1380 csform = SQLCS_NCHAR;
> 1381 else if (CSFORM_IMPLIES_UTF8(SQLCS_IMPLICIT))
> 1382 csform = SQLCS_IMPLICIT;
> 1383 /* else leave csform == 0 */
> 1384 if (trace_level)
> 1385 PerlIO_printf(DBILOGFP, " rebinding %s with UTF8
> value %s", phs->name,
> 1386 (csform == SQLCS_NCHAR) ? "so setting
> csform=SQLCS_IMPLICIT" :
> 1387 (csform == SQLCS_IMPLICIT) ? "so setting
> csform=SQLCS_NCHAR" :
> 1388 "but neither CHAR nor NCHAR are unicode\n");
> 1389 }
>
> The first thing to notice is that the trace message is backwards: It
> prints setting csform=SQLCS_IMPLICIT if the csform is set to SQLCS_NCHAR
> and vice versa. So that explains why I didn't notice anything strange in
> the trace output. While it claimed to set csform=SQLCS_IMPLICIT, it was
> actually setting it to SQLCS_NCHAR, and I guess that was what prevented
> the use of the index on the varchar2 column. Sure enough, if I
> explicitely set ora_csform to 1, it's fast. It is also fast, if I set
> NLS_NCHAR to US7ASCII, so that it isn't used.
>
> Maybe the tests in line 1379 and 1381 should be reversed so that
> SQLCS_IMPLICIT is preferred over SQLCS_NCHAR?
Sounds very plausible.
> I guess there was some good reason to prefer SQLCS_NCHAR, but it seems
> to me that SQLCS_IMPLICIT is usually what you would want
It was probably just the way the code evolved.
> (but then I'm probably biased from my absolute failure to use nvarchar
> or nclob columns with Oracle 8.0 from Java or Perl many moons ago).
>
> > But do please write up what you find as a patch the the Oracle docs.
> > It's entirely possible that the code is doing the wrong thing.
>
> Like this?
That's great, but why stop there? Now you seem to have identified the
problem, how about patching the code instead ;-)
Tim.
> Index: Oracle.pm
> ===================================================================
> --- Oracle.pm (revision 9909)
> +++ Oracle.pm (working copy)
> @@ -1462,7 +1462,14 @@
>
> Specify the OCI_ATTR_CHARSET_FORM for the bind value. Valid values
> are SQLCS_IMPLICIT (1) and SQLCS_NCHAR (2). Both those constants can
> -be imported from the DBD::Oracle module. Rarely needed.
> +be imported from the DBD::Oracle module.
> +This should be rarely needed, but at least some versions of Oracle
> +(e.g., 10.2) don't use an index if this is set wrong, which can happen
> +if both the character set and the national character set are UTF-8 and
> +the placeholder value has the utf8 flag set. In this case you can either
> +choose ora_csform to match your column or (if you don't have any NCHAR,
> +NVARCHAR2 or NCLOB columns) you can set NLS_NCHAR to US7ASCII to prevent
> +its use.
>
> =item ora_csid
>
>
> hp
>
> --
> _ | Peter J. Holzer | If I wanted to be "academically correct",
> |_|_) | Sysadmin WSR | I'd be programming in Java.
> | | | [EMAIL PROTECTED] | I don't, and I'm not.
> __/ | http://www.hjp.at/ | -- Jesse Erlbaum on dbi-users