[
https://issues.apache.org/jira/browse/HIVE-15850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jesus Camacho Rodriguez updated HIVE-15850:
-------------------------------------------
Description:
We need to make sure that filters on timestamps are passed to Druid with the
correct timezone.
After CALCITE-1617, Calcite will generate a Druid query with intervals that
carry no timezone specification. In Druid, these intervals will be assumed to
be in UTC (if Druid is running in UTC, which is currently the recommendation).
However, in Hive, those intervals should be interpreted in the user timezone.
Thus, we should respect Hive semantics and include the user timezone in the
intervals passed to Druid.
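To illustrate the intent, here is a minimal sketch of attaching the user timezone to an interval before it is sent to Druid. It is not the actual Hive or Calcite code: the class name IntervalWithZone and the method toDruidInterval are hypothetical, and it assumes the filter bounds are available as zone-less local timestamps and that the interval is written as an ISO-8601 start/end pair.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, for illustration only (not part of Hive or Calcite).
public class IntervalWithZone {

    // Interprets the zone-less bounds in the user's timezone and renders
    // them as an ISO-8601 interval "start/end" with explicit UTC offsets.
    static String toDruidInterval(LocalDateTime start, LocalDateTime end, ZoneId userZone) {
        DateTimeFormatter fmt = DateTimeFormatter.ISO_OFFSET_DATE_TIME;
        return start.atZone(userZone).format(fmt)
                + "/"
                + end.atZone(userZone).format(fmt);
    }
}
```

For example, with a user timezone of America/Los_Angeles, the bounds 2017-01-01 00:00 and 2017-02-01 00:00 would be rendered as 2017-01-01T00:00:00-08:00/2017-02-01T00:00:00-08:00, rather than as a zone-less interval that Druid would read as UTC.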
was:
We need to make sure that filters on timestamps are represented with a timezone
when we go into Calcite, and that they are converted back when we return from
Calcite to Hive. That would help us to 1) push the correct filters to Druid,
and 2) represent the filters correctly in Hive if they are not pushed at all
(they remain in the Calcite plan). I have checked and AFAIK this is currently
done correctly (ASTBuilder.java, ExprNodeConverter.java, and
RexNodeConverter.java).
Secondly, we need to make sure we read/write timestamp data correctly from/to
Druid.
- When we write timestamps to Druid, we should include the timezone, which
allows Druid to handle them properly. We do that already.
- When we read timestamps from Druid, we should convert them to the Hive
timezone. This will give us consistent behavior between Druid-on-Hive and
Hive-standalone, since timestamps in Hive are presented to the user in the
Hive client timezone. Currently we do not do that.
> Proper handling of timezone in Druid storage handler
> ----------------------------------------------------
>
> Key: HIVE-15850
> URL: https://issues.apache.org/jira/browse/HIVE-15850
> Project: Hive
> Issue Type: Bug
> Components: Druid integration
> Affects Versions: 2.2.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
>
> We need to make sure that filters on timestamps are passed to Druid with the
> correct timezone.
> After CALCITE-1617, Calcite will generate a Druid query with intervals that
> carry no timezone specification. In Druid, these intervals will be assumed
> to be in UTC (if Druid is running in UTC, which is currently the
> recommendation). However, in Hive, those intervals should be interpreted in
> the user timezone. Thus, we should respect Hive semantics and include the
> user timezone in the intervals passed to Druid.