[ https://issues.apache.org/jira/browse/HIVE-12923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15879135#comment-15879135 ]
Julian Hyde commented on HIVE-12923:
------------------------------------

I'm thinking of an alternative solution to CALCITE-1069. Currently, as you know, an Aggregate with more than one grouping set returns more columns than one with only one grouping set. We have been arguing about whether there should be 1 extra column (Hive's preference) or N extra columns (Calcite's preference).

My new proposal is that there should be no extra columns. We make GROUPING into an aggregate function, and if you want those extra columns you can add calls to GROUPING.

If the row type of Aggregate is the same regardless of the number of grouping sets, it will simplify a bunch of things. For example, it would be easier to write a rule that pushes down the Filter "group_id = 2", because we wouldn't have to worry about disappearing columns, and whether they are used.

[~hsubramaniyan], [~jcamachorodriguez], would the new proposal be acceptable to Hive?

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver groupby_grouping_sets4.q failure
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12923
>                 URL: https://issues.apache.org/jira/browse/HIVE-12923
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Hari Sankar Sivarama Subramaniyan
>            Assignee: Hari Sankar Sivarama Subramaniyan
>         Attachments: HIVE-12923.1.patch, HIVE-12923.2.patch
>
>
> {code}
> EXPLAIN
> SELECT * FROM
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq1
> join
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq2
> on subq1.a = subq2.a
> {code}
> Stack trace:
> {code}
> java.lang.NullPointerException
>     at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.pruneJoinOperator(ColumnPrunerProcFactory.java:1110)
>     at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.access$400(ColumnPrunerProcFactory.java:85)
>     at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory$ColumnPrunerJoinProc.process(ColumnPrunerProcFactory.java:941)
>     at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>     at org.apache.hadoop.hive.ql.optimizer.ColumnPruner$ColumnPrunerWalker.walk(ColumnPruner.java:172)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>     at org.apache.hadoop.hive.ql.optimizer.ColumnPruner.transform(ColumnPruner.java:135)
>     at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:237)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10176)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:229)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
>     at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:472)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:312)
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1168)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1256)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1094)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
>     at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1129)
>     at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1103)
>     at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:10444)
>     at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets4(TestCliDriver.java:3313)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
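
To make the three row-type options in the comment concrete, here is a rough Python sketch. It is only a toy model and not Calcite or Hive code: the function `aggregate_columns`, the scheme names, and the column names `ind_a`, `ind_b` and `grouping_id` are all made up for illustration. It lists the output columns of an Aggregate under "N extra columns", "1 extra column" and the proposed "no extra columns", where GROUPING is requested explicitly like any other aggregate call.

```python
from typing import List, Sequence, Tuple

def aggregate_columns(
    group_keys: Sequence[str],
    grouping_sets: Sequence[Tuple[str, ...]],
    agg_calls: Sequence[str],
    scheme: str,
) -> List[str]:
    """Output column names of a toy Aggregate (hypothetical model)."""
    if scheme not in ("n_extra", "one_extra", "none"):
        raise ValueError(f"unknown scheme: {scheme}")
    cols = list(group_keys)
    multiple_sets = len(grouping_sets) > 1
    if multiple_sets and scheme == "n_extra":
        # One implicit indicator column per group key.
        cols += [f"ind_{k}" for k in group_keys]
    elif multiple_sets and scheme == "one_extra":
        # A single implicit column identifying the grouping set.
        cols.append("grouping_id")
    # scheme == "none": nothing implicit is added, so the row type
    # does not depend on how many grouping sets there are.
    cols += list(agg_calls)
    return cols

KEYS = ["a", "b"]
SINGLE = [("a", "b")]                   # plain GROUP BY a, b
CUBE = [("a", "b"), ("a",), ("b",), ()]  # GROUP BY a, b WITH CUBE

for scheme in ("n_extra", "one_extra", "none"):
    print(scheme,
          aggregate_columns(KEYS, SINGLE, ["count"], scheme),
          aggregate_columns(KEYS, CUBE, ["count"], scheme))

# Under the proposal, a query that needs the grouping information asks for it:
print(aggregate_columns(KEYS, CUBE, ["count", "GROUPING(a, b)"], "none"))
```

Under the "none" scheme the single-set and cube cases produce the same column list, which is the property the comment says would simplify rules such as pushing down a Filter on the grouping id.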