[ 
https://issues.apache.org/jira/browse/CALCITE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350711#comment-16350711
 ] 

Volodymyr Vysotskyi edited comment on CALCITE-2018 at 2/2/18 5:41 PM:
----------------------------------------------------------------------

[~julianhyde], I tried to add these checks for cached and uncached metadata 
values, and all tests in Calcite are passed. But when I ran Drill unit tests 
with this check, some of them failed.

This failure appeared because *the cumulative cost of best rel node increased*. 
For example when we have such piece of the plan:
{noformat}
SingleMergeExchangePrel(sort0=[1 DESC]): rowcount = 60.175, cumulative cost = 
{60.175 rows, 481.4 cpu, 0.0 io, 492953.6 network, 0.0 memory}, id = 2232
  SortPrel(subset=[rel#2228:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$1]]).[1 
DESC]], sort0=[$1], dir0=[DESC]): rowcount = 60.175, cumulative cost = {60.175 
rows, 1422.7999242130231 cpu, 0.0 io, 0.0 network, 962.8 memory}, id = 2896
    
HashToRandomExchangePrel(subset=[rel#2885:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$1]]).[]],
 dist0=[[$1]]): rowcount = 60.175, cumulative cost = {60.175 rows, 962.8 cpu, 
0.0 io, 492953.6 network, 0.0 memory}, id = 2918
      
StreamAggPrel(subset=[rel#2900:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$0]]).[]],
 group=[{0}], revenue=[SUM($1)]): rowcount = 60.175, cumulative cost = {601.75 
rows, 9628.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2215
        
HashToMergeExchangePrel(subset=[rel#2214:Subset#204.PHYSICAL.HASH_DISTRIBUTED([[$0]]).[0]]):
 rowcount = 601.75, cumulative cost = {601.75 rows, 9628.0 cpu, 0.0 io, 
4929536.0 network, 0.0 memory}, id = 2213
          
StreamAggPrel(subset=[rel#2205:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$0]]).[0]],
 group=[{0}], revenue=[SUM($1)]): rowcount = 601.75, cumulative cost = {6017.5 
rows, 96280.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2472
{noformat}
rel node {{SingleMergeExchangePrel}} has a smaller cumulative cost than the 
cumulative cost of the same rel node after applying some rules:
{noformat}
SingleMergeExchangePrel(sort0=[1 DESC]): rowcount = 601.75, cumulative cost = 
{601.75 rows, 4814.0 cpu, 0.0 io, 4929536.0 network, 0.0 memory}, id = 2232
  SortPrel(subset=[rel#2228:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$1]]).[1 
DESC]], sort0=[$1], dir0=[DESC]): rowcount = 601.75, cumulative cost = {601.75 
rows, 22223.88016652411 cpu, 0.0 io, 0.0 network, 9628.0 memory}, id = 2896
    
HashToRandomExchangePrel(subset=[rel#2885:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$1]]).[]],
 dist0=[[$1]]): rowcount = 601.75, cumulative cost = {601.75 rows, 9628.0 cpu, 
0.0 io, 4929536.0 network, 0.0 memory}, id = 2884
      
StreamAggPrel(subset=[rel#2205:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$0]]).[0]],
 group=[{0}], revenue=[SUM($1)]): rowcount = 601.75, cumulative cost = {6017.5 
rows, 96280.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2472
{noformat}
But its cost was changed because the cumulative cost of {{SortPrel}} from the 
second plan is less than the cost of the same rel node from the first plan.

Perhaps we should change a little bit cost estimation in Drill, but I am not 
sure that it will help to prevent other similar cases. 
Or we should try to modify Volcano planner to be able to revert the change of 
best rel nodes when the cumulative cost of neither of top rel nodes wasn't 
decreased. 
Or we may reflect the increase in the cumulative cost of best rel node in 
bestCost.

[~julianhyde], [~amansinha100], could you please advise me the better way to 
fix this issue?


was (Author: vvysotskyi):
[~julianhyde], I tried to add these checks for cached and uncached metadata 
values, and all tests in Calcite are passed. But when I ran Drill unit tests 
with this check, some of them failed.

This failure appeared because *the cost of best rel node increased*. For 
example when we have such piece of the plan:
{noformat}
SingleMergeExchangePrel(sort0=[1 DESC]): rowcount = 60.175, cumulative cost = 
{60.175 rows, 481.4 cpu, 0.0 io, 492953.6 network, 0.0 memory}, id = 2232
  SortPrel(subset=[rel#2228:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$1]]).[1 
DESC]], sort0=[$1], dir0=[DESC]): rowcount = 60.175, cumulative cost = {60.175 
rows, 1422.7999242130231 cpu, 0.0 io, 0.0 network, 962.8 memory}, id = 2896
    
HashToRandomExchangePrel(subset=[rel#2885:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$1]]).[]],
 dist0=[[$1]]): rowcount = 60.175, cumulative cost = {60.175 rows, 962.8 cpu, 
0.0 io, 492953.6 network, 0.0 memory}, id = 2918
      
StreamAggPrel(subset=[rel#2900:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$0]]).[]],
 group=[{0}], revenue=[SUM($1)]): rowcount = 60.175, cumulative cost = {601.75 
rows, 9628.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2215
        
HashToMergeExchangePrel(subset=[rel#2214:Subset#204.PHYSICAL.HASH_DISTRIBUTED([[$0]]).[0]]):
 rowcount = 601.75, cumulative cost = {601.75 rows, 9628.0 cpu, 0.0 io, 
4929536.0 network, 0.0 memory}, id = 2213
          
StreamAggPrel(subset=[rel#2205:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$0]]).[0]],
 group=[{0}], revenue=[SUM($1)]): rowcount = 601.75, cumulative cost = {6017.5 
rows, 96280.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2472
{noformat}
rel node {{SingleMergeExchangePrel}} has a smaller cumulative cost than the 
cumulative cost of the same rel node after applying some rules:
{noformat}
SingleMergeExchangePrel(sort0=[1 DESC]): rowcount = 601.75, cumulative cost = 
{601.75 rows, 4814.0 cpu, 0.0 io, 4929536.0 network, 0.0 memory}, id = 2232
  SortPrel(subset=[rel#2228:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$1]]).[1 
DESC]], sort0=[$1], dir0=[DESC]): rowcount = 601.75, cumulative cost = {601.75 
rows, 22223.88016652411 cpu, 0.0 io, 0.0 network, 9628.0 memory}, id = 2896
    
HashToRandomExchangePrel(subset=[rel#2885:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$1]]).[]],
 dist0=[[$1]]): rowcount = 601.75, cumulative cost = {601.75 rows, 9628.0 cpu, 
0.0 io, 4929536.0 network, 0.0 memory}, id = 2884
      
StreamAggPrel(subset=[rel#2205:Subset#201.PHYSICAL.HASH_DISTRIBUTED([[$0]]).[0]],
 group=[{0}], revenue=[SUM($1)]): rowcount = 601.75, cumulative cost = {6017.5 
rows, 96280.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2472
{noformat}
But its cost was changed because the cumulative cost of {{SortPrel}} from the 
second plan is less than the cost of the same rel node from the first plan.

Perhaps we should change a little bit cost estimation in Drill, but I am not 
sure that it will help to prevent other similar cases. 
Or we should try to modify Volcano planner to be able to revert the change of 
best rel nodes when the cumulative cost of neither of top rel nodes wasn't 
decreased. 
Or we may reflect the increase in the cumulative cost of best rel node in 
bestCost.

[~julianhyde], [~amansinha100], could you please advise me the better way to 
fix this issue?

> Queries failed with AssertionError: rel has lower cost than best cost of 
> subset
> -------------------------------------------------------------------------------
>
>                 Key: CALCITE-2018
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2018
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.13.0
>            Reporter: Volodymyr Vysotskyi
>            Assignee: Julian Hyde
>            Priority: Critical
>             Fix For: 1.16.0
>
>
> *Problem description*
> When rootLogger level is DEBUG, unit tests 
> * MaterializationTest.testMaterializationSubstitution2
> * MaterializationTest.testJoinMaterializationUKFK8
> * MaterializationTest.testJoinMaterializationUKFK6
> * JdbcTest.testWhereNot
> unit tests are failed with error AssertionError: rel has lower cost than best 
> cost of subset.
> Full stack trace for test 
> {{MaterializationTest.testMaterializationSubstitution2}}:
> {noformat}
> java.lang.AssertionError: rel 
> [rel#245:EnumerableUnion.ENUMERABLE.[](input#0=rel#246:Subset#5.ENUMERABLE.[],input#1=rel#239:Subset#6.ENUMERABLE.[0],all=true)]
>  has lower cost {14.0 rows, 19.0 cpu, 0.0 io} than best cost {15.0 rows, 20.0 
> cpu, 0.0 io} of subset [rel#243:Subset#7.ENUMERABLE.[]]
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.validate(VolcanoPlanner.java:906)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:866)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:883)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:101)
>       at 
> org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:336)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1495)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:863)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:883)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1766)
>       at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:135)
>       at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234)
>       at 
> org.apache.calcite.rel.rules.FilterProjectTransposeRule.onMatch(FilterProjectTransposeRule.java:143)
>       at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:212)
>       at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:650)
>       at org.apache.calcite.tools.Programs$5.run(Programs.java:326)
>       at 
> org.apache.calcite.tools.Programs$SequenceProgram.run(Programs.java:387)
>       at org.apache.calcite.prepare.Prepare.optimize(Prepare.java:187)
>       at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:318)
>       at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:229)
>       at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:786)
>       at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:640)
>       at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:610)
>       at org.apache.calcite.schema.Schemas.prepare(Schemas.java:346)
>       at 
> org.apache.calcite.materialize.MaterializationService$DefaultTableFactory.createTable(MaterializationService.java:374)
>       at 
> org.apache.calcite.materialize.MaterializationService.defineMaterialization(MaterializationService.java:137)
>       at 
> org.apache.calcite.materialize.MaterializationService.defineMaterialization(MaterializationService.java:99)
>       at 
> org.apache.calcite.schema.impl.MaterializedViewTable$MaterializedViewTableMacro.<init>(MaterializedViewTable.java:110)
>       at 
> org.apache.calcite.schema.impl.MaterializedViewTable$MaterializedViewTableMacro.<init>(MaterializedViewTable.java:100)
>       at 
> org.apache.calcite.schema.impl.MaterializedViewTable.create(MaterializedViewTable.java:81)
>       at org.apache.calcite.model.ModelHandler.visit(ModelHandler.java:364)
>       at 
> org.apache.calcite.model.JsonMaterialization.accept(JsonMaterialization.java:42)
>       at org.apache.calcite.model.JsonSchema.visitChildren(JsonSchema.java:98)
>       at 
> org.apache.calcite.model.JsonMapSchema.visitChildren(JsonMapSchema.java:48)
>       at 
> org.apache.calcite.model.ModelHandler.populateSchema(ModelHandler.java:257)
>       at org.apache.calcite.model.ModelHandler.visit(ModelHandler.java:273)
>       at 
> org.apache.calcite.model.JsonCustomSchema.accept(JsonCustomSchema.java:45)
>       at org.apache.calcite.model.ModelHandler.visit(ModelHandler.java:196)
>       at org.apache.calcite.model.ModelHandler.<init>(ModelHandler.java:88)
>       at org.apache.calcite.jdbc.Driver$1.onConnectionInit(Driver.java:104)
>       at 
> org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:139)
>       at java.sql.DriverManager.getConnection(DriverManager.java:571)
>       at java.sql.DriverManager.getConnection(DriverManager.java:187)
>       at 
> org.apache.calcite.test.CalciteAssert$MapConnectionFactory.createConnection(CalciteAssert.java:1227)
>       at 
> org.apache.calcite.test.CalciteAssert$AssertQuery.createConnection(CalciteAssert.java:1266)
>       at 
> org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1337)
>       at 
> org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1320)
>       at 
> org.apache.calcite.test.CalciteAssert$AssertQuery.sameResultWithMaterializationsDisabled(CalciteAssert.java:1548)
>       at 
> org.apache.calcite.test.MaterializationTest.testMaterializationSubstitution2(MaterializationTest.java:2062)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>       at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>       at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
>       at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>       at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> {noformat}
> *Possible root cause*
> JaninoRelMetadataProvider caches metadata queries of {{RelSubset}} even when 
> {{RelSubset.best}} value is not set. 
> Actually not all RelNodes in {{RelSubset.set.rels}} have the same row count 
> for some queries.
> When {{RelMetadataQuery.getRowCount(RelNode rel)}} method is called with 
> {{RelSubset}} instance without best value, row count of {{RelSubset.set.rel}} 
> is cached. 
> After assigning best relNode, the {{RelMetadataQuery.getRowCount(RelNode 
> rel)}} method returns old cached value if {{RelMetadataQuery}} instance still 
> the same. 
> This error appears when row count for best relNode is different from set.rel 
> relNode.
> In the test {{MaterializationTest.testMaterializationSubstitution2}} row 
> count of {{rel#237:LogicalFilter}} was cached and returned its value when 
> relSubset had best {{rel#347:EnumerableFilter}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to