Re: (CAY-1210) mysql does not use index for case insensitive searches

Aristedes Maniatis Tue, 11 Jan 2011 21:01:44 -0800

On 11/01/11 10:18 PM, Andrew Lindesay wrote:

I think the functional indexes do make sense and it seems reasonable that a DBA 
would apply them as part of tuning.  Maybe it would be best to have a page in 
the manual on this explaining that for MySQL/... CI search it is best to change 
the collation on the schema objects and for Oracle/PG/... CI search it is best 
to implement functional indexes?


It is all a bit ugly. Postgresql docs recommend using LOWER() to achieve ci 
search, whereas Cayenne spits out UPPER(). Hibernate also produces LOWER(). So 
a db used by different systems is going to need quite a few indexes.

Right now, I'm looking for a solution for Derby and mysql in particular, which 
both lack functional indexes. And a MS-SQL installation which is already in ci 
'mode' anyway and so adding functional indexes is just a workaround for Cayenne 
executing inappropriate SQL for this environment.

Another small gotcha for developers is that Expression.filterObjects may be 
inconsistent with the queries against the database, when the expectation is 
that it should be the same.

A ci database will always be ci, unless we add the BINARY keyword (I know that 
works for mysql, not sure about everything else). [1] However that also 
bypasses the use of the index, so we don't want to add it when it isn't needed. 
It is easy to be correct. But correct and fast is much harder.


On 11/01/11 11:04 PM, Andrus Adamchik wrote:

Maybe do a total hack to get us out of this limbo - how about a DI extension 
point for CI LIKE SQL generation strategy (or rather a System property 
activating this strategy - 'default collation case sensitivity')? This won't 
complicate the mapping, won't force us to generalize, and will allow the same 
mapping to be used with both kinds of schemas.


Well, a database-wide property solves my particular use-case. Seems clunky 
(since it is database-wide and not specific to columns), but if you are 
convinced that case-sensitivity is not an attribute of the model, then this is 
the only way. I still can't help but think this is a property of an attribute 
in the model though. Other than some JDBC specification, why is the choice 
between BLOB, CLOB and TEXT a modeller property, but the choice between ci-TEXT 
and cs-TEXT isn't?

"BLOB values are treated as binary strings (byte strings). They have no character 
set, and sorting and comparison are based on the numeric values of the bytes in column 
values. TEXT values are treated as nonbinary strings (character strings). They have a 
character set, and values are sorted and compared based on the collation of the character 
set." Effectively BLOBs are just TEXT with collation = none. I've not tried to see 
what happens if you try to perform a case insensitive search on BLOB from Cayenne. Will 
it try to output UPPER?



Out of interest, my googling discovered that Atlassian developers are running 
into this problem with Hibernate. But they didn't have a nice solution. 
http://jira.atlassian.com/browse/CONF-10030

Regards

Ari



[1] http://dev.mysql.com/doc/refman/5.0/en/charset-binary-op.html

--
-------------------------->
Aristedes Maniatis
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A

Re: (CAY-1210) mysql does not use index for case insensitive searches

Reply via email to