Peter J. Holzer wrote:
On 2007-09-03 13:03:37 +0100, Tim Bunce wrote:
Oh the joys of Oracle character set support.
See the dbd_rebind_ph() code in dbdimp.c. Especially from the line
"csform = phs->csform;" onwards.
You can explicitly set the csform and csid using bind_param(..., { ... })
Have fun!
I had :-). You gave me a push into the right direction, and I think I
found the problem:
1375 csform = phs->csform;
1376
1377 if (!csform && SvUTF8(phs->sv)) {
1378 /* try to default csform to avoid translation through
non-unicode */
1379 if (CSFORM_IMPLIES_UTF8(SQLCS_NCHAR)) /* prefer NCHAR
*/
1380 csform = SQLCS_NCHAR;
1381 else if (CSFORM_IMPLIES_UTF8(SQLCS_IMPLICIT))
1382 csform = SQLCS_IMPLICIT;
1383 /* else leave csform == 0 */
1384 if (trace_level)
1385 PerlIO_printf(DBILOGFP, " rebinding %s with UTF8 value
%s", phs->name,
1386 (csform == SQLCS_NCHAR) ? "so setting
csform=SQLCS_IMPLICIT" :
1387 (csform == SQLCS_IMPLICIT) ? "so setting
csform=SQLCS_NCHAR" :
1388 "but neither CHAR nor NCHAR are unicode\n");
1389 }
The first thing to notice is that the trace message is backwards: It
prints setting csform=SQLCS_IMPLICIT if the csform is set to SQLCS_NCHAR
and vice versa. So that explains why I didn't notice anything strange in
the trace output. While it claimed to set csform=SQLCS_IMPLICIT, it was
actually setting it to SQLCS_NCHAR, and I guess that was what prevented
the use of the index on the varchar2 column. Sure enough, if I
explicitely set ora_csform to 1, it's fast. It is also fast, if I set
NLS_NCHAR to US7ASCII, so that it isn't used.
Maybe the tests in line 1379 and 1381 should be reversed so that
SQLCS_IMPLICIT is preferred over SQLCS_NCHAR? I guess there was some
good reason to prefer SQLCS_NCHAR, but it seems to me that SQLCS_IMPLICIT is
usually what you would want (but then I'm probably biased from my
absolute failure to use nvarchar or nclob columns with Oracle 8.0 from
Java or Perl many moons ago).
stumbled on this thread while debugging a problem with full table scans
binding utf8 strings into oracle queries...
it seems the degraded performance is "expected"
http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96529/ch6.htm#1008756
when you used csform SQLCS_NCHAR and the database column is CHAR,
VARCHAR2 or CLOB. same is true of the reverse, using SQLCS_IMPLICIT
when the column is NCHAR, NVARCHAR2, NCLOB.
perhaps we need to rework CSFORM_IMPLIES_UTF8 so it doesn't return true
for csform SQLCS_NCHAR if the columns are CHAR, VARCHAR2 or CLOB?
- mark