[ https://issues.apache.org/jira/browse/HIVE-13250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204812#comment-15204812 ]
Siddharth Seth commented on HIVE-13250:
---------------------------------------
bq. I misunderstood this bug report. Without the patch, the filter expression
{{ts_field = "2016-01-23 00:00:00"}} gets executed as (UDFToString(ts_field) =
'2016-01-23 00:00:00'). In the patch I made changes such that the cast is on
the constant, (ts_field = UDFTOTimeStamp('2016-01-23 00:00:00')), which gets
folded at compile time to (ts_field = 2016-01-23 00:00:00.0).

I'd expect the cast to change the value to whatever can be compared directly
against storage. However, I think the type promotion system is far more
complicated, and this may not always be possible.
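
To make the difference between the two predicate shapes concrete, here is a minimal Python sketch. It is not Hive code: the function names, the format string, and the sample rows are all hypothetical, and it only illustrates casting the column on every row versus casting the constant once.

```python
from datetime import datetime

# Hypothetical literal format, chosen only for this illustration.
FMT = "%Y-%m-%d %H:%M:%S"


def filter_cast_column(rows, literal):
    """Shape without the patch: the column value is cast to a string per row."""
    casts = 0
    matched = []
    for ts in rows:
        casts += 1  # one cast for every row that is scanned
        if ts.strftime(FMT) == literal:
            matched.append(ts)
    return matched, casts


def filter_cast_constant(rows, literal):
    """Shape with the patch: the literal is cast once, then compared natively."""
    constant = datetime.strptime(literal, FMT)  # stands in for compile-time folding
    casts = 1
    matched = [ts for ts in rows if ts == constant]
    return matched, casts


rows = [datetime(2016, 1, 23), datetime(2016, 1, 24), datetime(2016, 1, 23)]
literal = "2016-01-23 00:00:00"

by_column, column_casts = filter_cast_column(rows, literal)
by_constant, constant_casts = filter_cast_constant(rows, literal)
```

Both functions select the same rows here, but the first performs one cast per row while the second performs a single cast regardless of row count. Whether a constant can always be cast to the storage type is exactly the open question about type promotion.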
> Compute predicate conversions on the client, instead of per row group
> ---------------------------------------------------------------------
>
> Key: HIVE-13250
> URL: https://issues.apache.org/jira/browse/HIVE-13250
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 2.1.0
> Reporter: Siddharth Seth
> Assignee: Ashutosh Chauhan
> Attachments: HIVE-13250.2.patch, HIVE-13250.patch
>
>
> When running a query of the form
> select count(*) from table where ts_field = "2016-01-23 00:00:00";
> or
> select count(*) from table where ts_field = 1453507200
> ts_field is of type TIMESTAMP.
> The predicate is converted to whatever format is appropriate for TIMESTAMP
> processing on each and every row group.
> It would be far more efficient to process this once on the client, or even
> once per task.
> The same applies to ORC split elimination as well: this is applied for each
> stripe.
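
The cost described in the issue can be sketched as follows. This is an illustrative Python model, not the ORC reader: the converter, the row-group representation as (min, max) epoch-second pairs, and the choice to interpret the literal as UTC are all assumptions made for the example.

```python
from datetime import datetime, timezone


def to_epoch_seconds(literal):
    """Hypothetical conversion: parse the literal and treat it as UTC."""
    parsed = datetime.strptime(literal, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


class CountingConverter:
    """Wraps the conversion so the number of invocations can be observed."""

    def __init__(self):
        self.calls = 0

    def __call__(self, literal):
        self.calls += 1
        return to_epoch_seconds(literal)


def select_per_group(row_groups, literal, convert):
    """Converts the predicate literal again for every row group."""
    kept = []
    for low, high in row_groups:
        value = convert(literal)
        if low <= value <= high:
            kept.append((low, high))
    return kept


def select_converted_once(row_groups, literal, convert):
    """Converts the predicate literal once, then reuses the value."""
    value = convert(literal)
    return [(low, high) for low, high in row_groups if low <= value <= high]


# Three made-up row groups, each covering one day of (min, max) statistics.
row_groups = [
    (1453420800, 1453507199),
    (1453507200, 1453593599),
    (1453593600, 1453679999),
]
literal = "2016-01-23 00:00:00"

per_group = CountingConverter()
once = CountingConverter()
kept_per_group = select_per_group(row_groups, literal, per_group)
kept_once = select_converted_once(row_groups, literal, once)
```

Both strategies keep the same row group, but the per-group version calls the converter once per row group, while the other calls it a single time. The same reasoning would apply per stripe for split elimination.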