[jira] [Commented] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite
[ https://issues.apache.org/jira/browse/HIVE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17219496#comment-17219496 ]

Nemon Lou commented on HIVE-24165:
----------------------------------

I am not able to reproduce this on the master branch.
After upgrading Calcite from 1.16.0 to 1.17.0, the bug is also gone on branch3 with the multiple distinct rewrite enabled.
It may have been fixed by CALCITE-2232.

> CBO: Query fails after multiple count distinct rewrite
> ------------------------------------------------------
>
>                 Key: HIVE-24165
>                 URL: https://issues.apache.org/jira/browse/HIVE-24165
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 4.0.0
>            Reporter: Nemon Lou
>            Assignee: Nemon Lou
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24165.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> One way to reproduce:
>
> {code:sql}
> CREATE TABLE test(
>   `device_id` string,
>   `level` string,
>   `site_id` string,
>   `user_id` string,
>   `first_date` string,
>   `last_date` string,
>   `dt` string) ;
> set hive.execution.engine=tez;
> set hive.optimize.distinct.rewrite=true;
> set hive.cli.print.header=true;
> select
>   dt,
>   site_id,
>   count(DISTINCT t1.device_id) as device_tol_cnt,
>   count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else null end) as device_add_cnt
> from test t1 where dt='2020-09-15'
> group by
>   dt,
>   site_id
> ;
> {code}
>
> Error log:
> {code:java}
> Exception in thread "main" java.lang.AssertionError: Cannot add expression of different type to set:
> set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL
> expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL
> set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group={2, 3},agg#0=count($0),agg#1=count($1))
> expression is HiveProject#95
>         at org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411)
>         at org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
>         at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234)
>         at org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186)
>         at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317)
>         at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
>         at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415)
>         at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
>         at org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
>         at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211)
>         at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198)
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273)
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002)
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709)
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609)
>         at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
>         at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
>         at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
>         at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414)
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430)
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164)
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659)
>         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>         at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)
>         at
[jira] [Commented] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite
[ https://issues.apache.org/jira/browse/HIVE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195976#comment-17195976 ]

Panagiotis Garefalakis commented on HIVE-24165:
-----------------------------------------------

Hey [~nemon], would you mind opening a PR for this?
[jira] [Commented] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite
[ https://issues.apache.org/jira/browse/HIVE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195807#comment-17195807 ]

Nemon Lou commented on HIVE-24165:
----------------------------------

In fact, I reproduced this issue by applying HIVE-22448 back to Hive branch 3.1.2.
The master branch should have the same issue.
AggregateProjectPullUpConstantsRule expects the groupSet in Aggregate to be ordered and to start with 0, like {0,1,2}, but after the multiple distinct rewrite the groupSet is {3,4,5}.
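The groupSet mismatch described in the comment above can be modelled without Calcite at all. The following is only a simplified sketch under stated assumptions, not the code of AggregateProjectPullUpConstantsRule: the class and method names are hypothetical, types are plain strings, and it assumes that the constant group key (dt) sits at input ordinal 2 of group={2, 3} as printed in the error log. It contrasts looking up a constant key by output position with looking it up by input ordinal when building the Project that goes on top of the reduced Aggregate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Simplified model of the indexing issue; hypothetical, not Calcite or Hive code. */
final class GroupSetOrdinalSketch {

  /**
   * Computes the field types of the Project placed on top of the reduced Aggregate.
   *
   * @param reducedTypes   output types of the Aggregate after constant keys were removed
   * @param groupSet       input ordinals of the original group keys, ascending
   * @param constants      input ordinal of each constant group key, mapped to its type
   * @param byInputOrdinal true: look constants up by groupSet.get(i);
   *                       false: look them up by output position i, which is only
   *                       right when groupSet is {0, 1, ..., n-1}
   */
  static List<String> projectTypes(List<String> reducedTypes, List<Integer> groupSet,
      Map<Integer, String> constants, boolean byInputOrdinal) {
    int groupCount = groupSet.size();
    int totalFields = reducedTypes.size() + constants.size();
    List<String> out = new ArrayList<>();
    int source = 0; // next unread group key column of the reduced Aggregate
    for (int i = 0; i < totalFields; i++) {
      if (i >= groupCount) {
        // Aggregate call result: shifted left by the number of removed keys.
        out.add(reducedTypes.get(i - constants.size()));
      } else {
        int key = byInputOrdinal ? groupSet.get(i) : i;
        if (constants.containsKey(key)) {
          out.add(constants.get(key)); // re-insert the constant
        } else {
          out.add(reducedTypes.get(source++)); // surviving group key
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Reduced Aggregate: one remaining key (site_id) plus two counts.
    List<String> reduced = Arrays.asList("VARCHAR", "BIGINT", "BIGINT");
    List<Integer> groupSet = Arrays.asList(2, 3);
    Map<Integer, String> constants = Collections.singletonMap(2, "VARCHAR");

    // prints [VARCHAR, BIGINT, BIGINT, BIGINT]
    System.out.println(projectTypes(reduced, groupSet, constants, false));
    // prints [VARCHAR, VARCHAR, BIGINT, BIGINT]
    System.out.println(projectTypes(reduced, groupSet, constants, true));
  }
}
```

In this model the by-position lookup yields the same type sequence as the "expression type" in the error log, and the by-ordinal lookup yields the "set type". With a groupSet of {0, 1} both lookups agree, which is consistent with the rule working when the group keys start at 0.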