[ 
https://issues.apache.org/jira/browse/HIVE-11477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659239#comment-14659239
 ] 

Sergey Shelukhin commented on HIVE-11477:
-----------------------------------------

I seem to recall something about these UDFs. They might be necessary because of 
type issues in Calcite, if I am remembering the same issue. It is very hard to 
strip them back off, because in the general case it is hard to tell the user's 
existing casts apart from the casts that appear after Calcite has changed the 
plan. IIRC the idea I had was that Calcite would need some form of separate 
cast that does the same thing but is distinguishable from a regular cast, or 
that the functions in the RelNode tree would need to be taggable, with the tags 
preserved during transformations. 
Although maybe it's a different, simpler issue; I'm not sure.
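
The tagging idea can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not Calcite or Hive code: `Col`, `Const`, `Cast`, `Cmp`, the `implicit` flag and `strip_implicit_casts` are all hypothetical names. It assumes the planner marks every cast it inserts itself, and that such a widening cast can be dropped when the constant fits in the column's own type.

```python
from dataclasses import dataclass
from typing import Union

# Value ranges of the integer types involved (tinyint, smallint, int).
RANGES = {
    "tinyint": (-2**7, 2**7 - 1),
    "smallint": (-2**15, 2**15 - 1),
    "int": (-2**31, 2**31 - 1),
}

@dataclass(frozen=True)
class Col:
    name: str
    type: str

@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Cast:
    arg: "Expr"
    to_type: str
    implicit: bool = False  # hypothetical tag: True only for planner-inserted casts

@dataclass(frozen=True)
class Cmp:
    op: str
    left: "Expr"
    right: "Expr"

Expr = Union[Col, Const, Cast, Cmp]

def strip_implicit_casts(e: Expr) -> Expr:
    """Drop a planner-inserted widening cast on a column; keep user casts."""
    if not isinstance(e, Cmp):
        return e
    left, right = e.left, e.right
    if (isinstance(left, Cast) and left.implicit
            and isinstance(left.arg, Col) and isinstance(right, Const)):
        lo, hi = RANGES[left.arg.type]
        # Only safe when the constant is representable in the column's type.
        if lo <= right.value <= hi:
            return Cmp(e.op, left.arg, right)
    return e

# Same predicate shape, once with a planner cast and once with a user cast.
planner_cast = Cmp("<", Cast(Col("t", "tinyint"), "int", implicit=True), Const(-127))
user_cast = Cmp("<", Cast(Col("t", "tinyint"), "int"), Const(-127))

print(strip_implicit_casts(planner_cast))
print(strip_implicit_casts(user_cast))
```

Without the tag the two inputs are structurally identical, which is the difficulty described above; the sketch only works because the flag is carried on the cast node itself.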

> CBO inserts a UDF cast for integer type promotion
> -------------------------------------------------
>
>                 Key: HIVE-11477
>                 URL: https://issues.apache.org/jira/browse/HIVE-11477
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Pengcheng Xiong
>
> When CBO is enabled, filters that compare tinyint or smallint columns with 
> integer constants get a UDFToInteger cast inserted on the columns. When 
> CBO is disabled, there is no such UDF. This behaviour breaks the ORC predicate 
> pushdown feature, as ORC ignores UDFs in the filters.
> In the following examples, column t is tinyint.
> {code:title=Explain for select count(*) from orc_ppd where t < -127; (CBO OFF)}
> Filter Operator [FIL_9]
>                            predicate:(t = 125) (type: boolean)
>                            Statistics:Num rows: 1050 Data size: 611757 Basic stats: COMPLETE Column stats: NONE
>                            TableScan [TS_0]
>                               alias:orc_ppd
>                               Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE
> {code}
> {code:title=Explain for select count(*) from orc_ppd where t < -127; (CBO ON)}
> Filter Operator [FIL_10]
>                            predicate:(UDFToInteger(t) < -127) (type: boolean)
>                            Statistics:Num rows: 700 Data size: 407838 Basic stats: COMPLETE Column stats: NONE
>                            TableScan [TS_0]
>                               alias:orc_ppd
>                               Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE
> {code}
> CBO does not insert such a cast for non-negative numbers:
> {code:title=Explain for select count(*) from orc_ppd where t < 127; (CBO ON)}
> Filter Operator [FIL_10]
>                            predicate:(t < 127) (type: boolean)
>                            Statistics:Num rows: 700 Data size: 407838 Basic stats: COMPLETE Column stats: NONE
>                            TableScan [TS_0]
>                               alias:orc_ppd
>                               Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE
> {code}
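
Why the extra cast defeats predicate pushdown can be illustrated with a toy sketch. This is not ORC's or Hive's actual code: `to_search_argument` and the tuple encoding of predicates are hypothetical, and the sketch assumes, per the description above, that the pushdown layer accepts only a bare column compared with a constant and ignores anything wrapped in a UDF.

```python
from typing import Optional, Tuple

# Hypothetical predicate encoding: (op, left, right), where an operand is
# ("col", name), ("const", value) or ("udf", udf_name, operand).
def to_search_argument(pred) -> Optional[Tuple[str, str, int]]:
    """Return (op, column, constant) if the predicate is pushable, else None."""
    op, left, right = pred
    if left[0] == "col" and right[0] == "const":
        return (op, left[1], right[1])
    # The column is hidden behind a UDF, so nothing is pushed to the reader.
    return None

# The filter without the cast, and the same filter with the inserted cast.
without_cast = ("<", ("col", "t"), ("const", -127))
with_cast = ("<", ("udf", "UDFToInteger", ("col", "t")), ("const", -127))

print(to_search_argument(without_cast))
print(to_search_argument(with_cast))
```

In this model the second predicate yields no search argument, so every row has to be read and filtered after the scan.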



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
