Re: DBD::ODBC and Unicode

Martin J. Evans Wed, 20 Nov 2013 03:50:46 -0800

On 20/11/13 09:39, Michiel Beijen wrote:

Hi,


I ran into this issue with DBD::ODBC;

I read Martin asked about feedback on the 1.46_1 devel release. I
tested my code against both versions 1.43 and 1.46_1 an the results
are the same.


Thanks for this. It has raised questions that needed answering.

Basically, if I do
   SELECT example FROM foo WHERE example LIKE 'string%';

this is OK as long as long as 'string' is ascii. But whenever 'string'
is unicode, it does not return rows, when it should.
if I use bind variables, this problem goes away. DBD::mysql,
DBD::Oracle and DBD::Pg do not have this problem.

An alternative which I found worked is using N:
   SELECT example FROM foo WHERE example LIKE N'string%';

but as I said, using the statement WITHOUT the N works on the other
three DBD drivers I mentioned.

Is this a bug in DBD::ODBC or expected behaviour?

Test code is here: https://gist.github.com/mbeijen/7560268
Result is here: https://gist.github.com/mbeijen/5a497e05e0739ef93ce9

--
Mike


All DBD::ODBC does when it gets a piece of unicode SQL is encode it in UTF16 
and pass it to SQLPrepareW. I've confirmed that the data sent over the wire in 
the TDS protocol to MS SQL Server is encoded correctly and is the correct data. 
I've also confirmed the results of your test here with 4 different SQL Server 
drivers (2 windows, 2 unix). You can even do a select of your data from MS SQL 
Management studio and then paste it into a select with a where and it does not 
return any rows unless the N prefix is added.

Also, there are some gotchas using like - see 
http://technet.microsoft.com/en-us/library/ms179859.aspx

If you read http://support.microsoft.com/kb/239530 it explains why you need to 
prefix all unicode strings with N. In fact, I'm not sure you do so long as the 
characters in your literal are in your client codepage but that is difficult to 
prove conclusively.

A small change to your script shows you MS SQL Server will report the literal 
suffix required for nchar columns and your script is ignoring that:

my @types = $dbh->type_info(-9);
print Dumper(@types);

$VAR1 = {
          'UNSIGNED_ATTRIBUTE' => undef,
          'MAXIMUM_SCALE' => undef,
          'INTERVAL_PRECISION' => undef,
          'CREATE_PARAMS' => 'max length',
          'NUM_PREC_RADIX' => undef,
          'SEARCHABLE' => '3',
          'MONEY' => '0',
          'AUTO_INCREMENT' => undef,
          'USERTYPE' => '0',
          'LOCAL_TYPE_NAME' => 'nvarchar',
          'LITERAL_PREFIX' => 'N\'',
          'COLUMN_SIZE' => 4000,
          'MINIMUM_SCALE' => undef,
          'TYPE_NAME' => 'nvarchar',
          'NULLABLE' => '1',
          'SQL_DATA_TYPE' => '-9',
          'DATA_TYPE' => -9,
          'LITERAL_SUFFIX' => '\'',
          'CASE_SENSITIVE' => '0',
          'SQL_DATETIME_SUB' => undef
        };

So this is probably not what you wanted to hear but if you stick to bound 
parameters (which is good practise anyway) there should not be a problem.

I'll try and add some of this to my unicode problems document.

Martin

Re: DBD::ODBC and Unicode

Reply via email to