Thanks, Mamta. I'm having a hard time seeing these issues addressed by
the descriptions of DERBY-1478 and DERBY-2793. Do you think it might
make sense to log a new issue and link it to these old ones?
Thanks,
-Rick
Mamta Satoor wrote:
Yes, that is correct.
For 1), the disabling of optimizaiton for LIKE for a territory based
database has jira entry DERBY-1478 for it.
For 2), we have jira entry DERBY-2793
Mamta
On 8/25/08, Rick Hillegas <[EMAIL PROTECTED]> wrote:
Thanks, Mamta. This is very helpful. Based on your responses, I think that
the following summarizes how LIKE behaves in databases with territory based
collations:
1) The LIKE optimizations are disabled. For example, given the following
query where A is an indexed string column
select * from T where T.A like 'foo%'
Derby cannot use the index on A and falls back on performing a full table
scan.
2) The LIKE operator has Derby-specific semantics. The Derby-specifics
semantics return a subset of the rows which qualify under the ANSI rules.
Does that sound correct to you?
Thanks,
-Rick
Mamta Satoor wrote:
Rick, the behavior we expect for LIKE and = in a territory based
database can be explained by the following comment that I picked up
from DERBY-3166
**********************************************************
The following applies to territory based database
For LIKE, when comparing a pattern against a value string, we do the
comparison of collation elements(s) for one character at a time for
non-metacharacters. This is different than what is done for =
operation. For =, we compare the collation elements for the entire
string on left hand with the collation elements of the entire string
on the right hand side.
For eg say we are working with a territory where character 'z' has
same collation elements as 'xy'. For such a territory consider 2
clauses in WHERE clause
1)'zcb' = 'xycb'
2)'zcb' LIKE 'xy_b'
For case 1), we will return TRUE because the collation elements for
the entire string 'zcb' will match the collation elements of the
entire string 'xycb'.
For case 2) though, we will return FALSE because collation element(s)
for character 'z' does not match the collation element(s) for
character 'x'. So, as can be seen, the LIKE operation is one character
at a time whereas = operation is the entire string at a time.
In addition, the metacharacter _ in pattern for LIKE will consume
*one* character in the string value. So for an eg clause 'xycb' LIKE
'_cb' will return FALSE because metacharacter _ will consume 'x' and
since 'c' does not match 'y', we will return FALSE.
**********************************************************
Is this the information you were looking for?
thanks,
Mamta
On 8/22/08, Rick Hillegas <[EMAIL PROTECTED]> wrote:
According to the Developer's Guide section titled "Character-based
collation
in Derby", the LIKE operator behaves differently than the = operator.
That
is, if you are using a territory-based collation, two strings might =
one
another but not be LIKE one another (and vice-versa).
In the meantime, according to the section in the Tuning Guide titled
"Character string beginning with constant", LIKE expressions which begin
with a constant (followed by a wildcard) can be transformed into
indexable
expressions involving the ">=" and "<" operators.
To my muddled understanding, it seems that one of the following occurs:
1) The LIKE optimizations are disabled if you are using territory-based
collation
or
2) a LIKE expression may return different results depending on your
indexes
and/or the presence of trailing wildcards in your LIKE expression.
(1) seems like a performance issue which needs to be documented. (2)
seems
like a correctness problem. What is the behavior that we expect?
Thanks,
-Rick