[jira] [Comment Edited] (CALCITE-1738) Support CAST of literal values in filters pushed to Druid

Remus Rusanu (JIRA) Mon, 17 Apr 2017 10:58:47 -0700

    [ 
https://issues.apache.org/jira/browse/CALCITE-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971409#comment-15971409
 ]


Remus Rusanu edited comment on CALCITE-1738 at 4/17/17 5:57 PM:
----------------------------------------------------------------

In the real case the code runs after constant folding, so one would not expect 
the CAST to be there. However, what we found is that the whole system behaves 
somewhat unexpectedly. Say we start with a SQL text like {{ ... WHERE dt BETWEEN 
'2010-01-01' AND '2011-01-01' }}. Since HIVE-16027 we're adding a CAST, so 
Calcite will see {{ ... WHERE dt BETWEEN CAST('2010-01-01' AS TIMESTAMP) AND 
... }}. During simplification the Hive executor gets invoked and is asked to 
evaluate the expression {{ CAST('2010-01-01' AS TIMESTAMP) }} and we return a 
RexLiteral of the appropriate TIMESTAMP type with the value {{ 2010-01-01 }}. 
Now when this value gets put back into the tree it does not match the 
nullability of the expression being replaced (i.e. the CAST). As a result a new 
CAST is inserted on top of it; the relevant logic is 
[RexBuilder.java:553](https://github.com/apache/calcite/blob/branch-1.12/core/src/main/java/org/apache/calcite/rex/RexBuilder.java#L553).
 The new CAST will cast the {{2010-01-01}} timestamp literal to TIMESTAMP.
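
To make the mechanism concrete, here is a minimal, self-contained sketch of how a nullability mismatch alone can produce a CAST around a literal that already has the right SQL type. The classes {{Type}}, {{Literal}}, {{Cast}} and the method {{ensureType}} below are hypothetical stand-ins written for illustration; they are not Calcite's actual classes, and which side is nullable is an assumption.

```java
import java.util.Objects;

class NullabilityCastSketch {
    /** Hypothetical type descriptor: a SQL type name plus a nullability flag. */
    static final class Type {
        final String name;
        final boolean nullable;

        Type(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }

        @Override public boolean equals(Object o) {
            return o instanceof Type
                && ((Type) o).name.equals(name)
                && ((Type) o).nullable == nullable;
        }

        @Override public int hashCode() {
            return Objects.hash(name, nullable);
        }
    }

    /** Hypothetical expression node; every node reports its type. */
    interface Node {
        Type type();
    }

    /** Hypothetical literal node, e.g. the result of constant folding. */
    static final class Literal implements Node {
        final Type type;
        final String value;

        Literal(Type type, String value) {
            this.type = type;
            this.value = value;
        }

        @Override public Type type() {
            return type;
        }
    }

    /** Hypothetical CAST node wrapping a single operand. */
    static final class Cast implements Node {
        final Type type;
        final Node operand;

        Cast(Type type, Node operand) {
            this.type = type;
            this.operand = operand;
        }

        @Override public Type type() {
            return type;
        }
    }

    /**
     * Returns the replacement unchanged when its type (including
     * nullability) equals the expected type; otherwise wraps it in a CAST.
     */
    static Node ensureType(Type expected, Node replacement) {
        if (replacement.type().equals(expected)) {
            return replacement;
        }
        return new Cast(expected, replacement);
    }
}
```

In this sketch, replacing a nullable TIMESTAMP expression with a NOT NULL TIMESTAMP literal yields a {{Cast}} whose operand is the literal, i.e. a TIMESTAMP-to-TIMESTAMP cast of the kind described above.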

When looking at this fix, I pondered whether to fix the generic case and have 
the DruidDateTimeUtils handle all cases. But I think that what we're dealing 
with here is a very specific issue and other cases (e.g. CAST from a string 
literal) will be converted beforehand by constant folding into the actual 
literal. Making the code in DruidDateTimeUtils handle the generic case requires 
having the correct executor, much like the changes you did for CALCITE-1695, 
and in effect would run constant folding *again* in this place. So my instinct 
was to go with this simplified patch.
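
As a rough illustration of the simplified approach (not the actual patch), the narrow case can be handled by looking through a CAST only when its operand is already a literal of the target type, and declining everything else. The classes {{Literal}} and {{Call}} and this version of {{literalValue}} are hypothetical and only model the shape of the check; they are not the DruidDateTimeUtils code.

```java
import java.util.Optional;

class CastUnwrapSketch {
    /** Hypothetical expression node. */
    interface Node {}

    /** Hypothetical literal: a SQL type name and a value. */
    static final class Literal implements Node {
        final String typeName;
        final Object value;

        Literal(String typeName, Object value) {
            this.typeName = typeName;
            this.value = value;
        }
    }

    /** Hypothetical single-operand call such as CAST(operand AS targetType). */
    static final class Call implements Node {
        final String op;
        final String targetType;
        final Node operand;

        Call(String op, String targetType, Node operand) {
            this.op = op;
            this.targetType = targetType;
            this.operand = operand;
        }
    }

    /**
     * Extracts a literal value from a plain literal, or from a CAST whose
     * operand is a literal already of the CAST's target type. Any other
     * node yields an empty result; no expression is evaluated here.
     */
    static Optional<Object> literalValue(Node node) {
        if (node instanceof Literal) {
            return Optional.of(((Literal) node).value);
        }
        if (node instanceof Call) {
            Call call = (Call) node;
            if (call.op.equals("CAST") && call.operand instanceof Literal) {
                Literal inner = (Literal) call.operand;
                if (inner.typeName.equals(call.targetType)) {
                    // The CAST changes nothing but nullability: use the literal.
                    return Optional.of(inner.value);
                }
            }
        }
        return Optional.empty();
    }
}
```

A CAST from a string literal to TIMESTAMP is deliberately left unhandled in this sketch, on the assumption stated above that constant folding has already converted such cases into the actual literal.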




> Support CAST of literal values in filters pushed to Druid
> ---------------------------------------------------------
>
>                 Key: CALCITE-1738
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1738
>             Project: Calcite
>          Issue Type: Improvement
>          Components: druid
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>
> Because HIVE-16027 forced an implicit CAST on queries like {{WHERE 
> <timestampcolumn> IN ('<literal>', '<literal>')}}, IN, BETWEEN and other 
> filters are no longer pushed down to Druid. In this call stack:
> {noformat}
> org.apache.calcite.adapter.druid.DruidDateTimeUtils.literalValue(DruidDateTimeUtils.java:246)
>       at org.apache.calcite.adapter.druid.DruidDateTimeUtils.leafToRanges(DruidDateTimeUtils.java:227)
>       at org.apache.calcite.adapter.druid.DruidDateTimeUtils.extractRanges(DruidDateTimeUtils.java:120)
>       at org.apache.calcite.adapter.druid.DruidDateTimeUtils.createInterval(DruidDateTimeUtils.java:65)
>       at org.apache.calcite.adapter.druid.DruidRules$DruidFilterRule.onMatch(DruidRules.java:186)
>       at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317)
>       at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:506)
>       at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:385)
>       at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:251)
>       at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:125)
>       at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:210)
>       at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:197)
>       at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:1790)
>       at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1518)
>       at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1265)
>       at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113)
>       at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
>       at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:149)
>       at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:106)
>       at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1073)
>       at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1089)
>       at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:368)
>       at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11119)
>       at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:290)
>       at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
> {noformat}
> {{literalValue}} only knows how to handle the {{RexLiteral}} case. Because of 
> the CAST, the node is a {{RexCall}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
