I had tried to find answers in the sql-92 manual, but have yet to find specific 
rules on how predicate "IN" should behave for implicit datatype conversions. 
From what I understood, implementations of implicit datatype conversion is 
implementation-defined(in our case decided by calcite). 

        Implementation-defined elements:
        ...
        The term implementation-defined is used to identify characteristics
                that may differ between implementations, but that shall be 
defined
                for each particular implementation.
        ...
        6) Subclause 4.6, "Type conversions and mixing of data types": When
                converting between numeric data types, if least significant dig-
                its are lost, then it is implementation-defined whether rounding
                or truncation occurs.

Thus, I had used postgres as reference to compare between different 
implementations. I do agree on sql: " select … where x in (v1, v2)" giving the 
same results as sql: " select … where x = v1 or x = v2" too.

I tried a few quick tests in reflective schema (as I could not find docs on 
Calcite implicit datatype conversion precedence list) and concluded (correct me 
if I'm wrong) that for the "=" operator in Calcite the behaviors are as follow: 
 
        1. the lower precision datatype will be converted to higher precision 
datatypes prior comparison
        2. for comparisons between smallint, int and bigint, conversion will be 
done on the RHS variable of the operator "=" prior comparison

So,  sql: " select … where x in (v1, v2)" does not give the same results as 
sql: " select … where x = v1 or x = v2" for the 1st behavior above, e.g (tests 
with reflective schema):

SQL: explain plan for select * from T2 where CBINT in (CDBL,CINT)
| PLAN(VARCHAR) |
|_|EnumerableCalc(expr#0..6=[{inputs}], expr#7=[CAST($t3):JavaType(class 
java.lang.Long)], expr#8=[=($t2, $t7)], expr#9=[CAST($t1):JavaType(class 
java.lang.Long)], expr#10=[=($t2, $t9)], expr#11=[OR($t8, $t10)], 
proj#0..6=[{exprs}], $condition=[$t11])
  EnumerableTableScan(table=[[TEST, T2]])
|_|
SQL: explain plan for select * from T2 where CBINT = CDBL or CBINT = CINT
| PLAN(VARCHAR) |
|_|EnumerableCalc(expr#0..6=[{inputs}], expr#7=[CAST($t2):JavaType(class 
java.lang.Double)], expr#8=[=($t7, $t3)], expr#9=[CAST($t1):JavaType(class 
java.lang.Long)], expr#10=[=($t2, $t9)], expr#11=[OR($t8, $t10)], 
proj#0..6=[{exprs}], $condition=[$t11])
  EnumerableTableScan(table=[[TEST, T2]])
|_|

SQL: select * from "T2"
| CSINT(SMALLINT) | CINT(INTEGER) | CBINT(BIGINT) | CDBL(DOUBLE) | V1(VARCHAR) 
| V2(VARCHAR) | CTS(TIMESTAMP) |
|_|0|_|0|_|0|_|0.0|_|zero|_|z|_|1970-01-01 00:00:00.0|_|
|_|0|_|null|_|0|_|0.5|_|half|_|h|_|1970-01-01 00:00:00.0|_|
|_|1|_|1|_|1|_|1.0|_|one|_|o|_|null|_|
|_|2|_|2|_|2|_|null|_|two|_|t|_|1970-01-01 00:00:00.002|_|
|_|3|_|3|_|3|_|3.0|_|three|_|null|_|1970-01-01 00:00:00.003|_|
|_|4|_|4|_|null|_|4.0|_|four|_|f|_|1970-01-01 00:00:00.004|_|
|_|5|_|5|_|5|_|5.0|_|null|_|n|_|1970-01-01 00:00:00.005|_|
SQL: select * from T2 where CBINT in (CDBL,CINT)
| CSINT(SMALLINT) | CINT(INTEGER) | CBINT(BIGINT) | CDBL(DOUBLE) | V1(VARCHAR) 
| V2(VARCHAR) | CTS(TIMESTAMP) |
|_|0|_|0|_|0|_|0.0|_|zero|_|z|_|1970-01-01 00:00:00.0|_|
|_|0|_|null|_|0|_|0.5|_|half|_|h|_|1970-01-01 00:00:00.0|_|
|_|1|_|1|_|1|_|1.0|_|one|_|o|_|null|_|
|_|2|_|2|_|2|_|null|_|two|_|t|_|1970-01-01 00:00:00.002|_|
|_|3|_|3|_|3|_|3.0|_|three|_|null|_|1970-01-01 00:00:00.003|_|
|_|5|_|5|_|5|_|5.0|_|null|_|n|_|1970-01-01 00:00:00.005|_|
SQL: select * from T2 where CBINT = CDBL or CBINT = CINT
| CSINT(SMALLINT) | CINT(INTEGER) | CBINT(BIGINT) | CDBL(DOUBLE) | V1(VARCHAR) 
| V2(VARCHAR) | CTS(TIMESTAMP) |
|_|0|_|0|_|0|_|0.0|_|zero|_|z|_|1970-01-01 00:00:00.0|_|
|_|1|_|1|_|1|_|1.0|_|one|_|o|_|null|_|
|_|2|_|2|_|2|_|null|_|two|_|t|_|1970-01-01 00:00:00.002|_|
|_|3|_|3|_|3|_|3.0|_|three|_|null|_|1970-01-01 00:00:00.003|_|
|_|5|_|5|_|5|_|5.0|_|null|_|n|_|1970-01-01 00:00:00.005|_|


Thanks & Regards,

Yuri Au Yong
Software Engineer - NPM | [email protected] | Desk: +60 3 7663 8372 
Persistent Systems Ltd. | Partners in Innovation | www.persistentsys.com


-----Original Message-----
From: Julian Hyde [mailto:[email protected]] 
Sent: Thursday, October 29, 2015 9:22 AM
To: [email protected]
Subject: Re: Implicit DataType Conversion for IN predicate

A rule of thumb I have used is that

   select … where x in (v1, v2)

should give the same result as

  select … where x = v1 or x = v2

even taking into account implicit conversions of x, v1, and v2. Maybe this rule 
of thumb is mistaken; I don’t know. It is interesting to see what Postgres 
does, but the definitive answer is to look into the SQL standard.

What does the standard say on this matter?

Julian


> On Oct 25, 2015, at 7:17 PM, Yuri Au Yong <[email protected]> wrote:
> 
> Hi,
>  
> When using JDBC adapter with postgres db, I observed that the implicit 
> DataType conversion for IN predicate is only done on the in value list and 
> not according to a datatype precedence list. Is this expected?
>  
> Consider T2:
>  
> SQL: select * from "T2"
> | CSINT(SMALLINT) | CINT(INTEGER) | CBINT(BIGINT) | CDBL(DOUBLE) | 
> |V1(VARCHAR) | V2(VARCHAR) | CTS(TIMESTAMP) |
> |_|0|_|0|_|0|_|0.0|_|zero|_|z|_|1970-01-01 00:00:00.0|_|
> |_|0|_|null|_|0|_|0.5|_|half|_|h|_|1970-01-01 00:00:00.0|_| 
> |_|1|_|1|_|1|_|1.0|_|one|_|o|_|null|_|
> |_|2|_|2|_|2|_|null|_|two|_|t|_|1970-01-01 00:00:00.002|_|
> |_|3|_|3|_|3|_|3.0|_|three|_|null|_|1970-01-01 00:00:00.003|_|
> |_|4|_|4|_|null|_|4.0|_|four|_|f|_|1970-01-01 00:00:00.004|_|
> |_|5|_|5|_|5|_|5.0|_|null|_|n|_|1970-01-01 00:00:00.005|_|
>  
> I am able to get the same result when comparing “CBINT in (CDBL)” and 
> “CDBL in (CBINT)” in postgres db
>  
> When I apply the same query using JDBC Adaptor with postgres db,  “CBINT in 
> (CDBL)” will give different results from  “CDBL in (CBINT)”
>  
> SQL: explain plan for select * from T2 where CBINT in (CDBL)
> | PLAN(VARCHAR) |
> |_|EnumerableCalc(expr#0..6=[{inputs}], 
> |expr#7=[CAST($t3):JavaType(class java.lang.Long)], expr#8=[=($t2, 
> |$t7)], proj#0..6=[{exprs}], $condition=[$t8])
>   EnumerableTableScan(table=[[TEST, T2]])
> |_|
> SQL: explain plan for select * from T2 where CDBL in (CBINT)
> | PLAN(VARCHAR) |
> |_|EnumerableCalc(expr#0..6=[{inputs}], 
> |expr#7=[CAST($t2):JavaType(class java.lang.Double)], expr#8=[=($t3, 
> |$t7)], proj#0..6=[{exprs}], $condition=[$t8])
>   EnumerableTableScan(table=[[TEST, T2]])
> |_|
> SQL: select * from T2 where CBINT in (CDBL)
> | CSINT(SMALLINT) | CINT(INTEGER) | CBINT(BIGINT) | CDBL(DOUBLE) | 
> |V1(VARCHAR) | V2(VARCHAR) | CTS(TIMESTAMP) |
> |_|0|_|0|_|0|_|0.0|_|zero|_|z|_|1970-01-01 00:00:00.0|_|
> |_|0|_|null|_|0|_|0.5|_|half|_|h|_|1970-01-01 00:00:00.0|_| 
> |_|1|_|1|_|1|_|1.0|_|one|_|o|_|null|_|
> |_|3|_|3|_|3|_|3.0|_|three|_|null|_|1970-01-01 00:00:00.003|_|
> |_|5|_|5|_|5|_|5.0|_|null|_|n|_|1970-01-01 00:00:00.005|_|
> selectSomething - SQL: select * from T2 where CDBL in (CBINT)
> | CSINT(SMALLINT) | CINT(INTEGER) | CBINT(BIGINT) | CDBL(DOUBLE) | 
> |V1(VARCHAR) | V2(VARCHAR) | CTS(TIMESTAMP) |
> |_|0|_|0|_|0|_|0.0|_|zero|_|z|_|1970-01-01 00:00:00.0|_| 
> |_|1|_|1|_|1|_|1.0|_|one|_|o|_|null|_|
> |_|3|_|3|_|3|_|3.0|_|three|_|null|_|1970-01-01 00:00:00.003|_|
> |_|5|_|5|_|5|_|5.0|_|null|_|n|_|1970-01-01 00:00:00.005|_|
>  
> Thanks & Regards,
>  
> Yuri Au Yong
> Software Engineer - NPM | [email protected] 
> <mailto:[email protected]> | Desk: +60 3 7663 8372 Persistent 
> Systems Ltd. | Partners in Innovation | www.persistentsys.com 
> <http://www.persistentsys.com/>  
> <https://tnpmsupport.persistentsys.com/training>
>  
> DISCLAIMER ========== This e-mail may contain privileged and confidential 
> information which is the property of Persistent Systems Ltd. It is intended 
> only for the use of the individual or entity to which it is addressed. If you 
> are not the intended recipient, you are not authorized to read, retain, copy, 
> print, distribute or use this message. If you have received this 
> communication in error, please notify the sender and delete all copies of 
> this message. Persistent Systems Ltd. does not accept any liability for virus 
> infected mails.
> 


DISCLAIMER
==========
This e-mail may contain privileged and confidential information which is the 
property of Persistent Systems Ltd. It is intended only for the use of the 
individual or entity to which it is addressed. If you are not the intended 
recipient, you are not authorized to read, retain, copy, print, distribute or 
use this message. If you have received this communication in error, please 
notify the sender and delete all copies of this message. Persistent Systems 
Ltd. does not accept any liability for virus infected mails.

Reply via email to