[ 
https://issues.apache.org/jira/browse/HIVE-27919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-27919:
---------------------------------------
    Description: 
Constant reduction (aka. constant folding) does not work in CBO for 
FROM_UNIXTIME, DATE_ADD, and DATE_SUB functions.

Below, we outline the result of EXPLAIN CBO and plain EXPLAIN using some 
trivial constant inputs. Notice, that constant reduction is not present in the 
CBO plan but it is applied later on by the physical optimizer.

{code:sql}
SELECT FROM_UNIXTIME(1701088643)
{code}
{noformat}
CBO PLAN:
HiveProject(_o__c0=[FROM_UNIXTIME(1701088643)])
  HiveTableScan(table=[[_dummy_database, _dummy_table]], 
table:alias=[_dummy_table])

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: _dummy_table
          Row Limit Per Split: 1
          Select Operator
            expressions: '2023-11-27 04:37:23' (type: string)
            outputColumnNames: _col0
            ListSink
{noformat}

{code:sql}
SELECT DATE_ADD('2023-01-01', 1)
{code}
{noformat}
CBO PLAN:
HiveProject(_o__c0=[DATE_ADD(_UTF-16LE'2023-01-01':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE", 1)])
  HiveTableScan(table=[[_dummy_database, _dummy_table]], 
table:alias=[_dummy_table])

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: _dummy_table
          Row Limit Per Split: 1
          Select Operator
            expressions: DATE'2023-01-02' (type: date)
            outputColumnNames: _col0
            ListSink
{noformat}

{code:sql}
SELECT DATE_SUB('2023-01-01', 1)
{code}
{noformat}
CBO PLAN:
HiveProject(_o__c0=[DATE_SUB(_UTF-16LE'2023-01-01':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE", 1)])
  HiveTableScan(table=[[_dummy_database, _dummy_table]], 
table:alias=[_dummy_table])

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: _dummy_table
          Row Limit Per Split: 1
          Select Operator
            expressions: DATE'2022-12-31' (type: date)
            outputColumnNames: _col0
            ListSink
{noformat}
The reason is that all the functions at the moment are (wrongly) declared as a 
dynamic functions and this by default prevents constant folding. The functions 
are not dynamic since they don't depend on context variables and require one or 
more parameters.

  was:
Constant reduction (aka. constant folding) does not work in CBO for 
FROM_UNIXTIME, DATE_ADD, and DATE_SUB functions.

Below, we outline the result of EXPLAIN CBO and plain EXPLAIN using some 
trivial constant inputs. Notice, that constant reduction is not present in the 
CBO plan but it is applied later on by the physical optimizer.

{code:sql}
SELECT FROM_UNIXTIME(1701088643)
{code}
{noformat}
CBO PLAN:
HiveProject(_o__c0=[FROM_UNIXTIME(1701088643)])
  HiveTableScan(table=[[_dummy_database, _dummy_table]], 
table:alias=[_dummy_table])

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: _dummy_table
          Row Limit Per Split: 1
          Select Operator
            expressions: '2023-11-27 04:37:23' (type: string)
            outputColumnNames: _col0
            ListSink
{noformat}

{code:sql}
SELECT DATE_ADD('2023-01-01', 1)
{code}
{noformat}
CBO PLAN:
HiveProject(_o__c0=[DATE_ADD(_UTF-16LE'2023-01-01':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE", 1)])
  HiveTableScan(table=[[_dummy_database, _dummy_table]], 
table:alias=[_dummy_table])

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: _dummy_table
          Row Limit Per Split: 1
          Select Operator
            expressions: DATE'2023-01-02' (type: date)
            outputColumnNames: _col0
            ListSink
{noformat}

{code:sql}
SELECT DATE_SUB('2023-01-01', 1)
{code}
{noformat}
CBO PLAN:
HiveProject(_o__c0=[DATE_SUB(_UTF-16LE'2023-01-01':VARCHAR(2147483647) 
CHARACTER SET "UTF-16LE", 1)])
  HiveTableScan(table=[[_dummy_database, _dummy_table]], 
table:alias=[_dummy_table])

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: _dummy_table
          Row Limit Per Split: 1
          Select Operator
            expressions: DATE'2022-12-31' (type: date)
            outputColumnNames: _col0
            ListSink
{noformat}
The reason is that all the functions at the moment extend 
{{SqlAbstractTimeFunction}} which is declared as a dynamic function and by 
default prevents constant folding.


> Constant reduction in CBO does not work for FROM_UNIXTIME, DATE_ADD, 
> DATE_SUB, TO_UNIX_TIMESTAMP
> ------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-27919
>                 URL: https://issues.apache.org/jira/browse/HIVE-27919
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>    Affects Versions: 4.0.0-beta-1
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>
> Constant reduction (aka. constant folding) does not work in CBO for 
> FROM_UNIXTIME, DATE_ADD, and DATE_SUB functions.
> Below, we outline the result of EXPLAIN CBO and plain EXPLAIN using some 
> trivial constant inputs. Notice, that constant reduction is not present in 
> the CBO plan but it is applied later on by the physical optimizer.
> {code:sql}
> SELECT FROM_UNIXTIME(1701088643)
> {code}
> {noformat}
> CBO PLAN:
> HiveProject(_o__c0=[FROM_UNIXTIME(1701088643)])
>   HiveTableScan(table=[[_dummy_database, _dummy_table]], 
> table:alias=[_dummy_table])
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: _dummy_table
>           Row Limit Per Split: 1
>           Select Operator
>             expressions: '2023-11-27 04:37:23' (type: string)
>             outputColumnNames: _col0
>             ListSink
> {noformat}
> {code:sql}
> SELECT DATE_ADD('2023-01-01', 1)
> {code}
> {noformat}
> CBO PLAN:
> HiveProject(_o__c0=[DATE_ADD(_UTF-16LE'2023-01-01':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE", 1)])
>   HiveTableScan(table=[[_dummy_database, _dummy_table]], 
> table:alias=[_dummy_table])
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: _dummy_table
>           Row Limit Per Split: 1
>           Select Operator
>             expressions: DATE'2023-01-02' (type: date)
>             outputColumnNames: _col0
>             ListSink
> {noformat}
> {code:sql}
> SELECT DATE_SUB('2023-01-01', 1)
> {code}
> {noformat}
> CBO PLAN:
> HiveProject(_o__c0=[DATE_SUB(_UTF-16LE'2023-01-01':VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE", 1)])
>   HiveTableScan(table=[[_dummy_database, _dummy_table]], 
> table:alias=[_dummy_table])
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: _dummy_table
>           Row Limit Per Split: 1
>           Select Operator
>             expressions: DATE'2022-12-31' (type: date)
>             outputColumnNames: _col0
>             ListSink
> {noformat}
> The reason is that all the functions at the moment are (wrongly) declared as 
> a dynamic functions and this by default prevents constant folding. The 
> functions are not dynamic since they don't depend on context variables and 
> require one or more parameters.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to