[ 
https://issues.apache.org/jira/browse/DERBY-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030575#comment-14030575
 ] 

Knut Anders Hatlen commented on DERBY-6080:
-------------------------------------------

It looks like this is caused by an inconsistency between the equals(Object) and 
hashCode() methods in the number data types, as can be seen by this code 
example:

{code}
        SQLDouble d = new SQLDouble((double) Long.MAX_VALUE);
        SQLLongint b = new SQLLongint(Long.MAX_VALUE - 1);
        System.out.println("d equals b: " + d.equals(b));
        System.out.println("b equals d: " + b.equals(d));
        System.out.println("b hashCode: " + b.hashCode());
        System.out.println("d hashCode: " + d.hashCode());
{code}

The above code causes the following to be printed:

{noformat}
d equals b: true
b equals d: true
b hashCode: -2147483647
d hashCode: -2147483648
{noformat}

The DOUBLE and BIGINT values are considered equal by the equals(Object) method, 
but hashCode() returns different values. That clearly violates the contract of 
java.lang.Object.hashCode(), which states that "\[if\] two objects are equal 
according to the equals(Object) method, then calling the hashCode method on 
each of the two objects must produce the same integer result."

Because of this inconsistency, the nested loop join believes the two objects 
are equal, whereas the hash join puts the two values in different buckets 
because of their different hash codes, and considers them not equal.

The reason why the equals() method believes that they are equal, is that it 
performs the comparison using the type that has the highest type precedence. 
For comparisons between DOUBLE and BIGINT, DOUBLE has the highest precedence, 
so they are compared by using SQLDouble.typeCompare(). SQLDouble.typeCompare() 
calls getDouble() on both values, and loses precision so that both 
9223372036854775806 and 9223372036854775807 end up as 9.223372036854776E18.

I'm not quite sure if the join should have returned zero, one or two rows. 
SQL:2011, part 2, 8.2 <comparison predicate>, GR 2 says: "Numbers are compared 
with respect to their algebraic value."

Reasons for zero rows: Neither 9223372036854775806 nor 9223372036854775807 is 
algebraically equal to 9.223372036854776E18, so the join condition should 
evaluate to false for both rows.

Reasons for one row: When casting 9.223372036854776E18 to a long/bigint, we get 
the value 9223372036854775807, so the difference is just that the printed 
representation of the double is an approximation of the actual value.

Reasons for two rows: None, except if we could find evidence in the standard 
that it is correct to convert both operands of such a comparison to DOUBLE 
first.

> Cast to same type changes result with IN subquery
> -------------------------------------------------
>
>                 Key: DERBY-6080
>                 URL: https://issues.apache.org/jira/browse/DERBY-6080
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.0.2.1, 10.1.1.0, 10.1.2.1, 10.1.3.1, 10.2.1.6, 
> 10.2.2.0, 10.3.1.4, 10.3.2.1, 10.3.3.0, 10.4.1.3, 10.4.2.0, 10.5.1.1, 
> 10.5.2.0, 10.5.3.0, 10.6.1.0, 10.6.2.1, 10.7.1.1, 10.8.1.2, 10.8.2.2, 
> 10.8.3.0, 10.9.1.0, 10.10.1.1
>            Reporter: Knut Anders Hatlen
>            Priority: Minor
>              Labels: derby_triage10_11
>
> Given these tables
> ij> create table t4 (b bigint);
> 0 rows inserted/updated/deleted
> ij> insert into t4 values 9223372036854775806, 9223372036854775807;
> 2 rows inserted/updated/deleted
> ij> create table t5 (d double);
> 0 rows inserted/updated/deleted
> ij> insert into t5 values 9.223372036854776E18;
> 1 row inserted/updated/deleted
> the following two SELECT queries return different results:
> ij> select * from t4 where b in (select d from t5);
> B                   
> --------------------
> 9223372036854775807 
> 1 row selected
> ij> select * from t4 where b in (select cast(d as double) from t5);
> B                   
> --------------------
> 9223372036854775806 
> 9223372036854775807 
> 2 rows selected
> Since the only difference is D vs CAST(D AS DOUBLE), and the type of D is 
> already DOUBLE, I'd expect the two queries to be equivalent.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to