[ 
https://issues.apache.org/jira/browse/SPARK-29274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939875#comment-16939875
 ] 

Yuming Wang commented on SPARK-29274:
-------------------------------------

I don't konw how DB2 work. But ir returns correct values
{noformat}
create table t1 (incdata_id decimal(21,0), v VARCHAR(210));
create table t2 (incdata_id VARCHAR(210), v VARCHAR(210));
insert into t1 values(100000000001636981212, '1');
insert into t2 values(100000000001636981213, '2');

[db2inst1@2f3c821d36b7 ~]$ db2 "select * from t1 join t2 on (t1.incdata_id = 
t2.incdata_id)"

INCDATA_ID              V                                                       
                                                                                
                                                                           
INCDATA_ID                                                                      
                                                                                
                                                   V
----------------------- 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

  0 record(s) selected.


[db2inst1@2f3c821d36b7 ~]$ db2expln  -database sample -t -g -f "test.sql"

DB2 Universal Database Version 10.5, 5622-044 (c) Copyright IBM Corp. 1991, 2012
Licensed Material - Program Property of IBM
IBM DB2 Universal Database SQL and XQUERY Explain Tool

DB2 Universal Database Version 10.5, 5622-044 (c) Copyright IBM Corp. 1991, 2012
Licensed Material - Program Property of IBM
IBM DB2 Universal Database SQL and XQUERY Explain Tool

******************** DYNAMIC ***************************************

==================== STATEMENT ==========================================

        Isolation Level          = Cursor Stability
        Blocking                 = Block Unambiguous Cursors
        Query Optimization Class = 5

        Partition Parallel       = No
        Intra-Partition Parallel = No

        SQL Path                 = "SYSIBM", "SYSFUN", "SYSPROC", "SYSIBMADM",
                                   "DB2INST1"


Statement:

  select *
  from t1 join t2 on (t1.incdata_id =t2.incdata_id)


Section Code Page = 1208

Estimated Cost = 13.630928
Estimated Cardinality = 1.000000

Access Table Name = DB2INST1.T1  ID = 3,6
|  #Columns = 2
|  Skip Inserted Rows
|  Avoid Locking Committed Data
|  Currently Committed for Cursor Stability
|  May participate in Scan Sharing structures
|  Scan may start anywhere and wrap, for completion
|  Fast scan, for purposes of scan sharing management
|  Scan can be throttled in scan sharing management
|  Relation Scan
|  |  Prefetch: Eligible
|  Lock Intents
|  |  Table: Intent Share
|  |  Row  : Next Key Share
|  Sargable Predicate(s)
|  |  Process Build Table for Hash Join
Hash Join
|  Estimated Build Size: 4000
|  Estimated Probe Size: 4000
|  Access Table Name = DB2INST1.T2  ID = 3,7
|  |  #Columns = 2
|  |  Skip Inserted Rows
|  |  Avoid Locking Committed Data
|  |  Currently Committed for Cursor Stability
|  |  May participate in Scan Sharing structures
|  |  Scan may start anywhere and wrap, for completion
|  |  Fast scan, for purposes of scan sharing management
|  |  Scan can be throttled in scan sharing management
|  |  Relation Scan
|  |  |  Prefetch: Eligible
|  |  Lock Intents
|  |  |  Table: Intent Share
|  |  |  Row  : Next Key Share
|  |  Sargable Predicate(s)
|  |  |  Process Probe Table for Hash Join
Return Data to Application
|  #Columns = 4

End of section


Optimizer Plan:

         Rows
       Operator
         (ID)
         Cost

          1
       RETURN
        ( 1)
       13.6309
         |
          1
       HSJOIN
        ( 2)
       13.6309
      /       \
     1         1
  TBSCAN    TBSCAN
   ( 3)      ( 4)
  6.81521   6.81521
    |         |
    1         1
 Table:    Table:
 DB2INST1  DB2INST1
 T2        T1
{noformat}



> Can not coerce decimal type to double type when it's join key
> -------------------------------------------------------------
>
>                 Key: SPARK-29274
>                 URL: https://issues.apache.org/jira/browse/SPARK-29274
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.4, 2.4.4, 3.0.0
>            Reporter: Yuming Wang
>            Assignee: Pengfei Chang
>            Priority: Major
>         Attachments: image-2019-09-27-20-20-24-238.png
>
>
> How to reproduce this issue:
> {code:sql}
> create table t1 (incdata_id decimal(21,0), v string) using parquet;
> create table t2 (incdata_id string, v string) using parquet;
> explain select * from t1 join t2 on (t1.incdata_id = t2.incdata_id);
> == Physical Plan ==
> *(5) SortMergeJoin 
> [knownfloatingpointnormalized(normalizenanandzero(cast(incdata_id#31 as 
> double)))], 
> [knownfloatingpointnormalized(normalizenanandzero(cast(incdata_id#33 as 
> double)))], Inner
> :- *(2) Sort 
> [knownfloatingpointnormalized(normalizenanandzero(cast(incdata_id#31 as 
> double))) ASC NULLS FIRST], false, 0
> :  +- Exchange 
> hashpartitioning(knownfloatingpointnormalized(normalizenanandzero(cast(incdata_id#31
>  as double))), 200), true, [id=#104]
> :     +- *(1) Filter isnotnull(incdata_id#31)
> :        +- Scan hive default.t1 [incdata_id#31, v#32], HiveTableRelation 
> `default`.`t1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> [incdata_id#31, v#32], Statistics(sizeInBytes=8.0 EiB)
> +- *(4) Sort 
> [knownfloatingpointnormalized(normalizenanandzero(cast(incdata_id#33 as 
> double))) ASC NULLS FIRST], false, 0
>    +- Exchange 
> hashpartitioning(knownfloatingpointnormalized(normalizenanandzero(cast(incdata_id#33
>  as double))), 200), true, [id=#112]
>       +- *(3) Filter isnotnull(incdata_id#33)
>          +- Scan hive default.t2 [incdata_id#33, v#34], HiveTableRelation 
> `default`.`t2`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
> [incdata_id#33, v#34], Statistics(sizeInBytes=8.0 EiB)
> {code}
> {code:sql}
> select cast(v1 as double) as v3, cast(v2 as double) as v4,
>   cast(v1 as double) = cast(v2 as double), v1 = v2 
> from (select cast('100000000001636981212' as decimal(21, 0)) as v1,
>       cast('100000000001636981213' as decimal(21, 0)) as v2) t;
> 1.0000000000163697E20 1.0000000000163697E20   true    false
> {code}
>  
> It's a realy case in our production:
> !image-2019-09-27-20-20-24-238.png|width=100%!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to