[ 
https://issues.apache.org/jira/browse/DERBY-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534914#comment-13534914
 ] 

Knut Anders Hatlen commented on DERBY-6017:
-------------------------------------------

I believe this happens because of optimizations that are performed if the IN 
list consists of constants only.

Such IN lists are sorted at compile time so that binary search can be used to 
find if there's a match at run time. That's all good. However, the sorting and 
the binary search use different ordering. The sorting (in 
ValueNodeList.sortInAscendingOrder()) uses the ordering of the type with the 
highest precedence of the target and all the operands. The binary search (in 
DataType.in()) uses the ordering of the type with the highest precedence of 
each pair of values that it compares.

In the query above, this means:

The sorting happens using the type with the highest precedence of all the 
values. That is, DOUBLE. All the four values in the IN list have the same 
DOUBLE value, so the list is already sorted, regardless of how we order the 
actual values. But when binary search is performed at run time, BIGINT 
semantics are used for some of the comparisons (those that involve BIGINTs 
only) and DOUBLE comparison for others (those that involve a DOUBLE value). So 
the binary search does not see the list as one that contain values that are all 
equal.

Additionally, during preprocessing, there is code to simplify the predicate if 
it's an IN list where all values are equal. This check also uses the dominant 
type, DOUBLE, and finds that the list indeed contains only one distinct value. 
It therefore eliminates the IN list and replaces it with a simple equality 
check using just one of the values in the IN list. That is, it rewrites the 
query from

    select * from t where x in (9223372036854775805, 9223372036854775806, 
9223372036854775807, 9.223372036854776E18)

to

    select * from t where x = 9223372036854775805

Those two queries are equivalent if the equality operator uses DOUBLE 
semantics. Unfortunately, the information about what's the dominant type is 
lost when the IN list is eliminated, and the equality check is performed using 
BIGINT semantics instead. The result is that only a single row matches.

So I think there are two things that need to be fixed:

1) The sorting and the binary search must be made consistent.

2) The duplicate elimination must preserve type information.
                
> IN lists with constants may return wrong results
> ------------------------------------------------
>
>                 Key: DERBY-6017
>                 URL: https://issues.apache.org/jira/browse/DERBY-6017
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.9.1.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>
> Given this table:
> ij> connect 'jdbc:derby:memory:db;create=true';
> ij> create table t(x bigint);
> 0 rows inserted/updated/deleted
> ij> insert into t values 9223372036854775805, 9223372036854775806, 
> 9223372036854775807;
> 3 rows inserted/updated/deleted
> A query that uses an IN list that contains all the three values actually 
> stored in the table, returns all three rows as expected:
> ij> select * from t where x in (9223372036854775805, 9223372036854775806, 
> 9223372036854775807);
> X                   
> --------------------
> 9223372036854775805 
> 9223372036854775806 
> 9223372036854775807 
> 3 rows selected
> However, if we add a value whose type precedence is higher, like a DOUBLE 
> value, and that value happens to be equal to the approximation of the other 
> values in the IN list when they are cast from BIGINT to DOUBLE, only one row 
> is returned:
> ij> select * from t where x in (9223372036854775805, 9223372036854775806, 
> 9223372036854775807, 9.223372036854776E18);
> X                   
> --------------------
> 9223372036854775805 
> 1 row selected
> I believe this query should return all three rows too.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to