Stamatis Zampetakis created HIVE-29249:
------------------------------------------
Summary: RuntimeException in PlanModifierForASTConv.introduceDerivedTable for queries with self joins
Key: HIVE-29249
URL: https://issues.apache.org/jira/browse/HIVE-29249
Project: Hive
Issue Type: Bug
Components: CBO
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis
Various queries that reference the same relation (table, materialized view,
CTE) more than once across their joins fail at compile time.
{code:sql}
create table t1 (key int, value int);
create table t2 (a string, b string);
explain cbo
with cte as
(select key, value, BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, ROW__ID,
ROW__IS__DELETED from t1)
select * from cte a join t2 b join cte c;
{code}
The query fails with the following stack trace:
{noformat}
java.lang.RuntimeException: Couldn't find child node in parent's inputs
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.PlanModifierForASTConv.introduceDerivedTable(PlanModifierForASTConv.java:341)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.PlanModifierForASTConv$SelfJoinHandler.visit(PlanModifierForASTConv.java:236)
    at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveJoin.accept(HiveJoin.java:229)
    at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelShuttleImpl.visitChild(HiveRelShuttleImpl.java:60)
    at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelShuttleImpl.visit(HiveRelShuttleImpl.java:114)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.PlanModifierForASTConv$SelfJoinHandler.visit(PlanModifierForASTConv.java:242)
    at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveProject.accept(HiveProject.java:134)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.PlanModifierForASTConv.convertOpTree(PlanModifierForASTConv.java:109)
    at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:136)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1405)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:609)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13218)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:482)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:359)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:187)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:359)
    at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
    at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:109)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:498)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:450)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:414)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:408)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:234)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:430)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:358)
    at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:760)
    at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:730)
    at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:115)
    at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:139)
    at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
{noformat}
The repro above requires all technical columns (e.g.,
{{BLOCK__OFFSET__INSIDE__FILE}}) to be present in the SELECT clause, which is
rare in real SQL queries. However, the problem can still appear in other
scenarios, such as when the {{hive.optimize.cte}} features and/or materialized
views are used and the same view or CTE is referenced multiple times in the
query.