On 11/01/11 10:18 PM, Andrew Lindesay wrote:
I think the functional indexes do make sense and it seems reasonable that a DBA would apply them as part of tuning. Maybe it would be best to have a page in the manual on this explaining that for MySQL/... CI search it is best to change the collation on the schema objects and for Oracle/PG/... CI search it is best to implement functional indexes?
It is all a bit ugly. Postgresql docs recommend using LOWER() to achieve ci search, whereas Cayenne spits out UPPER(). Hibernate also produces LOWER(). So a db used by different systems is going to need quite a few indexes. Right now, I'm looking for a solution for Derby and mysql in particular, which both lack functional indexes. And a MS-SQL installation which is already in ci 'mode' anyway and so adding functional indexes is just a workaround for Cayenne executing inappropriate SQL for this environment. Another small gotcha for developers is that Expression.filterObjects may be inconsistent with the queries against the database, when the expectation is that it should be the same. A ci database will always be ci, unless we add the BINARY keyword (I know that works for mysql, not sure about everything else). [1] However that also bypasses the use of the index, so we don't want to add it when it isn't needed. It is easy to be correct. But correct and fast is much harder. On 11/01/11 11:04 PM, Andrus Adamchik wrote:
Maybe do a total hack to get us out of this limbo - how about a DI extension point for CI LIKE SQL generation strategy (or rather a System property activating this strategy - 'default collation case sensitivity')? This won't complicate the mapping, won't force us to generalize, and will allow the same mapping to be used with both kinds of schemas.
Well, a database-wide property solves my particular use-case. Seems clunky (since it is database-wide and not specific to columns), but if you are convinced that case-sensitivity is not an attribute of the model, then this is the only way. I still can't help but think this is a property of an attribute in the model though. Other than some JDBC specification, why is the choice between BLOB, CLOB and TEXT a modeller property, but the choice between ci-TEXT and cs-TEXT isn't? "BLOB values are treated as binary strings (byte strings). They have no character set, and sorting and comparison are based on the numeric values of the bytes in column values. TEXT values are treated as nonbinary strings (character strings). They have a character set, and values are sorted and compared based on the collation of the character set." Effectively BLOBs are just TEXT with collation = none. I've not tried to see what happens if you try to perform a case insensitive search on BLOB from Cayenne. Will it try to output UPPER? Out of interest, my googling discovered that Atlassian developers are running into this problem with Hibernate. But they didn't have a nice solution. http://jira.atlassian.com/browse/CONF-10030 Regards Ari [1] http://dev.mysql.com/doc/refman/5.0/en/charset-binary-op.html -- --------------------------> Aristedes Maniatis GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A
