[
https://issues.apache.org/jira/browse/HIVE-12923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135643#comment-15135643
]
Hari Sankar Sivarama Subramaniyan commented on HIVE-12923:
----------------------------------------------------------
[~jcamachorodriguez] Thanks for the review comments.
1. I totally agree that this can potentially cause regression in some cases on
join merging in the return path. As commented in the fix, this should be
removed once CALCITE-1069 is fixed, but this fix is better than throwing an
exception to the end user. One way to optimize this would be to check for the
only first AGGREGATE node below the current PROJECT instead of the entire
subtree. Does that sound fine ?
2. I dont totally get your second point. There can be a filter between the
Project and Aggregate as JOIN->PROJECT->FILTER->AGGREGATE, for e.g. NOT NULL
filter, but I believe the PROJECT is always present somewhere above a AGGREGATE
after CalcitePlanner.genLogicalPlan(QB qb, boolean outerMostQB) introduces a
SELECT to remove the indicator columns from the GBY rowschema. If you are
referring to FILTER above the PROJECT, then the HiveJoinProjectTranspose rule
shouldnt kick in, right?
Thanks
Hari
> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver
> groupby_grouping_sets4.q failure
> ------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-12923
> URL: https://issues.apache.org/jira/browse/HIVE-12923
> Project: Hive
> Issue Type: Sub-task
> Components: CBO
> Reporter: Hari Sankar Sivarama Subramaniyan
> Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12923.1.patch, HIVE-12923.2.patch
>
>
> {code}
> EXPLAIN
> SELECT * FROM
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq1
> join
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq2
> on subq1.a = subq2.a
> {code}
> Stack trace:
> {code}
> java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.pruneJoinOperator(ColumnPrunerProcFactory.java:1110)
> at
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.access$400(ColumnPrunerProcFactory.java:85)
> at
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory$ColumnPrunerJoinProc.process(ColumnPrunerProcFactory.java:941)
> at
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at
> org.apache.hadoop.hive.ql.optimizer.ColumnPruner$ColumnPrunerWalker.walk(ColumnPruner.java:172)
> at
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at
> org.apache.hadoop.hive.ql.optimizer.ColumnPruner.transform(ColumnPruner.java:135)
> at
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:237)
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10176)
> at
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:229)
> at
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
> at
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
> at
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:472)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:312)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1168)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1256)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1094)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
> at
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
> at
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
> at
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
> at
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1129)
> at
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1103)
> at
> org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:10444)
> at
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets4(TestCliDriver.java:3313)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)