Krisztian Kasa created HIVE-28773:
-------------------------------------
Summary: CBO fails when a materialized view is dropped but its
pre-compiled plan remains in the registry.
Key: HIVE-28773
URL: https://issues.apache.org/jira/browse/HIVE-28773
Project: Hive
Issue Type: Bug
Reporter: Krisztian Kasa
Attachments: TestQueryRewrite.java
Let's assume we have a cluster with two HS2 instances. Each instance has its
own Materialized View (MV) registry. The registries contain pre-compiled plans
of MVs enabled for query rewriting. (Without the registry, MVs would need to be
loaded and compiled during each query compilation, leading to slow query
performance.)
MVs are added to and removed from the registry when they are created or
dropped, but only in the HS2 instance that executes the create/drop statement.
The other instance is not immediately notified of the change. A background
process is scheduled to refresh the registry, but this process does not handle
the removal of dropped MVs.
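The missing removal step could be sketched roughly as follows. This is only an illustration, not Hive's registry code: `MvRegistrySketch`, `register`, `contains` and `evictDropped` are hypothetical names, and the pre-compiled plan is reduced to a `String` placeholder. It assumes the refresh task can obtain the set of MV names that currently exist in the metastore.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a per-HS2 registry of pre-compiled MV plans.
// None of these names come from Hive; they only illustrate the reconciliation step.
class MvRegistrySketch {
  // MV name -> pre-compiled plan (a String placeholder in this sketch).
  private final Map<String, String> plans = new ConcurrentHashMap<>();

  void register(String mvName, String plan) {
    plans.put(mvName, plan);
  }

  boolean contains(String mvName) {
    return plans.containsKey(mvName);
  }

  // Refresh step: evict every cached entry whose name is no longer
  // reported by the metastore, and return the evicted names.
  Set<String> evictDropped(Set<String> namesInMetastore) {
    Set<String> evicted = new HashSet<>(plans.keySet());
    evicted.removeAll(namesInMetastore);
    plans.keySet().removeAll(evicted);
    return evicted;
  }
}
```

With such a step in the periodic refresh, an MV dropped through another HS2 would stay in the local registry for at most one refresh interval.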
When an MV is dropped by HS2 #1, it remains in the registry of HS2 #2. If a
query is processed by HS2 #2, the rewrite algorithm still attempts to use the
dropped MV. If the MV is stored in Iceberg, the storage handler tries to
refresh the MV metadata from the metastore but throws an exception because the
MV no longer exists. This exception is not handled properly, leading to a CBO
failure.
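One possible way to handle the exception is to treat a missing table as "this MV is stale, skip it" rather than letting it abort the whole optimization. The sketch below is an assumption about how that could look, not the actual fix: `StaleMvFilterSketch`, `TableLoader`, `TableMissingException` and `usableViews` are all hypothetical, with `TableMissingException` standing in for the storage handler's "table does not exist" error.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: filter out MVs whose backing table can no longer be loaded.
class StaleMvFilterSketch {
  // Stand-in for a "table not found" error raised by the storage layer.
  static class TableMissingException extends RuntimeException {
    TableMissingException(String table) {
      super("Table does not exist: " + table);
    }
  }

  // Stand-in for the metadata lookup performed by a storage handler.
  interface TableLoader {
    void load(String tableName);
  }

  // Returns only the candidate MVs whose metadata can still be loaded.
  static List<String> usableViews(List<String> candidates, TableLoader loader) {
    List<String> usable = new ArrayList<>();
    for (String mv : candidates) {
      try {
        loader.load(mv);
        usable.add(mv);
      } catch (TableMissingException e) {
        // The MV was dropped elsewhere: skip it instead of failing the rewrite.
      }
    }
    return usable;
  }
}
```

The query would then be planned without the stale MV while still going through CBO.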
To simulate the issue I created the following test:
{code}
@Test
public void testQueryIsNotRewrittenWhenMVIsDropped() throws Exception {
  executeStatementOnDriver("create table " + TABLE1 + "(a int, b string, c float) " +
      "stored as orc TBLPROPERTIES ('transactional'='true')", driver);
  executeStatementOnDriver("insert into " + TABLE1 + "(a,b, c) values " +
      "(1, 'one', 1.1), (2, 'two', 2.2), (NULL, NULL, NULL)", driver);
  executeStatementOnDriver("create materialized view " + MV1 +
      " stored by iceberg tblproperties('format-version'='2') as " +
      "select a,b,c from " + TABLE1 + " where a > 0 or a is null", driver);

  // Simulate a multi-HS2 cluster.
  // Drop the MV using a direct API call to HMS. This is similar to what happens
  // when the drop MV statement is executed by another HS2.
  // In this case the MV is not removed from the HiveMaterializedViewsRegistry of
  // the HS2 that runs the explain query.
  msClient.dropTable("default", MV1);

  List<String> result = execSelectAndDumpData("explain cbo select a, b from " + TABLE1 +
      " where a > 0", driver, "");
}
{code}
Running the test logs the following CBO failure:
{code}
2025-02-21T04:51:02,117 ERROR [main] parse.CalcitePlanner: CBO failed, skipping CBO.
org.apache.iceberg.exceptions.NoSuchTableException: Table does not exist: default.mat1
at
org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:55)
~[hive-iceberg-handler-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:115)
~[hive-iceberg-handler-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:105)
~[hive-iceberg-handler-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.iceberg.mr.hive.IcebergTableUtil.lambda$getTable$1(IcebergTableUtil.java:147)
~[hive-iceberg-handler-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.iceberg.mr.hive.IcebergTableUtil.lambda$getTable$4(IcebergTableUtil.java:159)
~[hive-iceberg-handler-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at java.util.Optional.orElseGet(Optional.java:267) ~[?:1.8.0_301]
at
org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:156)
~[hive-iceberg-handler-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:118)
~[hive-iceberg-handler-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:122)
~[hive-iceberg-handler-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.iceberg.mr.hive.HiveIcebergStorageHandler.isPartitioned(HiveIcebergStorageHandler.java:2128)
~[hive-iceberg-handler-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.metadata.Table.isPartitioned(Table.java:824)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.computePartitionList(RelOptHiveTable.java:467)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getRowCount(RelOptHiveTable.java:438)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.rel.core.TableScan.computeSelfCost(TableScan.java:100)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.getNonCumulativeCost(RelMdPercentageOriginalRows.java:174)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost_$(Unknown
Source) ~[?:?]
at
GeneratedMetadataHandler_NonCumulativeCost.getNonCumulativeCost(Unknown Source)
~[?:?]
at
org.apache.calcite.rel.metadata.RelMetadataQuery.getNonCumulativeCost(RelMetadataQuery.java:288)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.optimizer.calcite.cost.HiveVolcanoPlanner.getCost(HiveVolcanoPlanner.java:113)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements0(RelSubset.java:415)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.RelSubset.propagateCostImprovements(RelSubset.java:398)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(VolcanoPlanner.java:1268)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1227)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.rel.rules.materialize.MaterializedViewRule.perform(MaterializedViewRule.java:474)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.rel.rules.materialize.MaterializedViewProjectFilterRule.onMatch(MaterializedViewProjectFilterRule.java:50)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.rewriteUsingViews(CalcitePlanner.java:2123)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2035)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1724)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1578)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1334)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:586)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13180)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:479)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:335)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:183)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:335)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:109)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:498)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:450)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:414)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:408)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:234)
~[hive-exec-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
at
org.apache.hadoop.hive.ql.txn.compactor.TestCompactorBase.executeStatementOnDriver(TestCompactorBase.java:171)
~[test-classes/:?]
at
org.apache.hadoop.hive.ql.txn.compactor.TestCompactorBase.execSelectAndDumpData(TestCompactorBase.java:139)
~[test-classes/:?]
at
org.apache.hadoop.hive.ql.txn.compactor.TestQueryRewrite.testQueryIsNotRewrittenWhenMVIsDropped(TestQueryRewrite.java:73)
~[test-classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
~[?:1.8.0_301]
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
~[?:1.8.0_301]
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:1.8.0_301]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_301]
at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
~[junit-4.13.2.jar:4.13.2]
at
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
~[junit-4.13.2.jar:4.13.2]
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
~[junit-4.13.2.jar:4.13.2]
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
~[junit-4.13.2.jar:4.13.2]
at
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
~[junit-rt.jar:?]
at
com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
~[junit-rt.jar:?]
at
com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11)
~[idea_rt.jar:?]
at
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
~[junit-rt.jar:?]
at
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:232)
~[junit-rt.jar:?]
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:55)
~[junit-rt.jar:?]
{code}