On 20/11/13 09:39, Michiel Beijen wrote:
Hi,
I ran into this issue with DBD::ODBC;
I read Martin asked about feedback on the 1.46_1 devel release. I
tested my code against both versions 1.43 and 1.46_1 an the results
are the same.
Thanks for this. It has raised questions that needed answering.
Basically, if I do
SELECT example FROM foo WHERE example LIKE 'string%';
this is OK as long as long as 'string' is ascii. But whenever 'string'
is unicode, it does not return rows, when it should.
if I use bind variables, this problem goes away. DBD::mysql,
DBD::Oracle and DBD::Pg do not have this problem.
An alternative which I found worked is using N:
SELECT example FROM foo WHERE example LIKE N'string%';
but as I said, using the statement WITHOUT the N works on the other
three DBD drivers I mentioned.
Is this a bug in DBD::ODBC or expected behaviour?
Test code is here: https://gist.github.com/mbeijen/7560268
Result is here: https://gist.github.com/mbeijen/5a497e05e0739ef93ce9
--
Mike
All DBD::ODBC does when it gets a piece of unicode SQL is encode it in UTF16
and pass it to SQLPrepareW. I've confirmed that the data sent over the wire in
the TDS protocol to MS SQL Server is encoded correctly and is the correct data.
I've also confirmed the results of your test here with 4 different SQL Server
drivers (2 windows, 2 unix). You can even do a select of your data from MS SQL
Management studio and then paste it into a select with a where and it does not
return any rows unless the N prefix is added.
Also, there are some gotchas using like - see
http://technet.microsoft.com/en-us/library/ms179859.aspx
If you read http://support.microsoft.com/kb/239530 it explains why you need to
prefix all unicode strings with N. In fact, I'm not sure you do so long as the
characters in your literal are in your client codepage but that is difficult to
prove conclusively.
A small change to your script shows you MS SQL Server will report the literal
suffix required for nchar columns and your script is ignoring that:
my @types = $dbh->type_info(-9);
print Dumper(@types);
$VAR1 = {
'UNSIGNED_ATTRIBUTE' => undef,
'MAXIMUM_SCALE' => undef,
'INTERVAL_PRECISION' => undef,
'CREATE_PARAMS' => 'max length',
'NUM_PREC_RADIX' => undef,
'SEARCHABLE' => '3',
'MONEY' => '0',
'AUTO_INCREMENT' => undef,
'USERTYPE' => '0',
'LOCAL_TYPE_NAME' => 'nvarchar',
'LITERAL_PREFIX' => 'N\'',
'COLUMN_SIZE' => 4000,
'MINIMUM_SCALE' => undef,
'TYPE_NAME' => 'nvarchar',
'NULLABLE' => '1',
'SQL_DATA_TYPE' => '-9',
'DATA_TYPE' => -9,
'LITERAL_SUFFIX' => '\'',
'CASE_SENSITIVE' => '0',
'SQL_DATETIME_SUB' => undef
};
So this is probably not what you wanted to hear but if you stick to bound
parameters (which is good practise anyway) there should not be a problem.
I'll try and add some of this to my unicode problems document.
Martin