[
https://issues.apache.org/jira/browse/IMPALA-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013035#comment-17013035
]
ASF subversion and git services commented on IMPALA-9035:
---------------------------------------------------------
Commit 6e3d1216f1680e7fca4b5979f182a19673018a2f in impala's branch
refs/heads/master from xuzhou
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=6e3d121 ]
IMPALA-9035: Simplify casting string to timestamp.
This change will help with queries generated by some BI tools.
Case 1:
Simplify 'string -> bigint -> timestamp' TO 'string -> timestamp':
cast(unix_timestamp('timestr') as timestamp) ->
cast('timestr' as timestamp)
Case 2:
Simplify 'string[fmt] -> bigint -> timestamp' TO 'string -> timestamp':
cast(unix_timestamp('timestr', 'fmt') as timestamp) ->
to_timestamp('timestr', 'fmt')
Tests:
Add front-end tests in ExprRewriteRulesTest.
Change-Id: I4ed72d6e7886eaf50d2be60cf45170ffaef5e72d
Reviewed-on: http://gerrit.cloudera.org:8080/14896
Reviewed-by: Attila Jeges <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Simplify cast string to timestamp
> ---------------------------------
>
> Key: IMPALA-9035
> URL: https://issues.apache.org/jira/browse/IMPALA-9035
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: 徐洲
> Assignee: xuzhou
> Priority: Critical
>
> Some BI tools generate compatible sqls to cast string to timestamp:
> cast(unix_timestamp('timestr', 'fmt') as timestamp) or
> cast(unix_timestamp('timestr') as timestamp)
> The internal type change in impala:
> StringVal->TimestampValue->BigIntVal→TimestampValue→TimestampVal
> In earlier 2.x versions, casting TimestampValue to BigIntVal using libc
> functions which is more expensive than current design with CCTZ.
> However, the cast to BIgIntVal seems to be redundant. We can simplify
> cast(unix_timestamp('timestr', 'fmt') as timestamp) to
> to_timestamp('timestr', 'fmt'), simplify cast(unix_timestamp('timestr') as
> timestamp) to cast('timestr' as timestamp).
> I managed to modify the fe to support such expr rewriting.
> Benefiting from the rewriting, the query time cost is reduced from 2.52s to
> 2.02s.
> Here is the sql:
> SELECT `t2`.`phy_category1_name` `d0`, SUM(`t1`.`apply_count`) `m0` FROM
> `tr_orders` `t1` LEFT JOIN `dim_sku` `t2` ON (`t1`.`sku_id` = `t2`.`sku_id`)
> WHERE (((CAST(UNIX_TIMESTAMP(`t1`.`date_id`) AS TIMESTAMP)) >=
> cast('2018-08-26' as timestamp)) AND ((CAST(UNIX_TIMESTAMP(`t1`.`date_id`) AS
> TIMESTAMP)) < cast('2019-09-02' as timestamp))) GROUP BY
> `t2`.`phy_category1_name` order by d0;
> Query: SELECT `t2`.`phy_category1_name` `d0`, SUM(`t1`.`apply_count`) `m0`
> FROM `tr_orders` `t1` LEFT JOIN `dim_sku` `t2` ON (`t1`.`sku_id` =
> `t2`.`sku_id`) WHERE (((CAST(UNIX_TIMESTAMP(`t1`.`date_id`) AS TIMESTAMP)) >=
> cast('2018-08-26' as timestamp)) AND ((CAST(UNIX_TIMESTAMP(`t1`.`date_id`) AS
> TIMESTAMP)) < cast('2019-09-02' as timestamp))) GROUP BY
> `t2`.`phy_category1_name` order by d0
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]