Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
dtenedor commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2047820851

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ##
@@ -6532,12 +6547,12 @@ class AstBuilder extends DataTypeAstBuilder
         case n: NamedExpression =>
           newGroupingExpressions += n
           newAggregateExpressions += n
-        // If the grouping expression is an integer literal, create [[UnresolvedOrdinal]] and
-        // [[UnresolvedPipeAggregateOrdinal]] expressions to represent it in the final grouping
-        // and aggregate expressions, respectively. This will let the
+        // If the grouping expression is an [[UnresolvedOrdinal]], replace the ordinal value and
+        // create [[UnresolvedPipeAggregateOrdinal]] expressions to represent it in the final
+        // grouping and aggregate expressions, respectively. This will let the
         // [[ResolveOrdinalInOrderByAndGroupBy]] rule detect the ordinal in the aggregate list
         // and replace it with the corresponding attribute from the child operator.
-        case Literal(v: Int, IntegerType) if conf.groupByOrdinal =>
+        case UnresolvedOrdinal(v: Int) =>
           newGroupingExpressions += UnresolvedOrdinal(newAggregateExpressions.length + 1)

Review Comment:
   Note that for pipe SQL syntax, GROUP BY ordinals work differently. In this case, the ordinals refer to the one-based indexes of the attributes returned from the child operator, not to the grouping expressions.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
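The pipe-syntax note in the review comment above can be sketched with a toy lookup. This is not Spark code: `PipeOrdinalSketch` and `resolveAgainstChild` are hypothetical names, and plain column names stand in for the child operator's output attributes. It only shows the one-based indexing into the child's output that the comment describes.

```scala
object PipeOrdinalSketch {
  // Hypothetical helper: resolve a one-based GROUP BY ordinal against the
  // columns produced by the child operator (the pipe-syntax behavior).
  def resolveAgainstChild(childOutput: Seq[String], ordinal: Int): Either[String, String] =
    if (ordinal >= 1 && ordinal <= childOutput.length) {
      // One-based ordinal, zero-based sequence.
      Right(childOutput(ordinal - 1))
    } else {
      Left(s"ordinal $ordinal is out of range 1..${childOutput.length}")
    }
}
```

Under these assumptions, ordinal `2` over a child that outputs `a, b, c` picks `b`, regardless of what the grouping list itself contains.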
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
cloud-fan closed pull request #50606: [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal
URL: https://github.com/apache/spark/pull/50606
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
cloud-fan commented on PR #50606:
URL: https://github.com/apache/spark/pull/50606#issuecomment-2814257625

thanks, merging to master!
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048387700

## sql/core/src/main/scala/org/apache/spark/sql/classic/Dataset.scala: ##
@@ -929,7 +929,16 @@ class Dataset[T] private[sql](
   /** @inheritdoc */
   @scala.annotation.varargs
   def groupBy(cols: Column*): RelationalGroupedDataset = {
-    RelationalGroupedDataset(toDF(), cols.map(_.expr), RelationalGroupedDataset.GroupByType)
+    val groupingExpressionsWithReplacedOrdinals = cols.map { col => col.expr match {

Review Comment:
   Done!
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048385309

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ##
@@ -1825,24 +1825,32 @@ class AstBuilder extends DataTypeAstBuilder
         }
         visitNamedExpression(n)
       }.toSeq
+      val groupByExpressionsWithReplacedOrdinals =
+        replaceOrdinalsInGroupingExpressions(groupByExpressions)
       if (ctx.GROUPING != null) {
         // GROUP BY ... GROUPING SETS (...)
         // `groupByExpressions` can be non-empty for Hive compatibility. It may add extra grouping
         // expressions that do not exist in GROUPING SETS (...), and the value is always null.
         // For example, `SELECT a, b, c FROM ... GROUP BY a, b, c GROUPING SETS (a, b)`, the output
         // of column `c` is always null.
         val groupingSets =
-          ctx.groupingSet.asScala.map(_.expression.asScala.map(e => expression(e)).toSeq)
-        Aggregate(Seq(GroupingSets(groupingSets.toSeq, groupByExpressions)),
+          ctx.groupingSet.asScala.map(_.expression.asScala.map(e => {

Review Comment:
   Done! I moved to a separate method with a scaladoc instead
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048521406

## sql/core/src/test/scala/org/apache/spark/sql/analysis/resolver/AggregateResolverSuite.scala: ##
@@ -44,12 +44,6 @@ class AggregateResolverSuite extends QueryTest with SharedSparkSession {
     resolverRunner.resolve(query)
   }

-  test("Valid group by ordinal") {

Review Comment:
   Yep, that's better. Done!
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048513651

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala: ##
@@ -446,11 +447,16 @@ package object dsl {
       def sortBy(sortExprs: SortOrder*): LogicalPlan = Sort(sortExprs, false, logicalPlan)

       def groupBy(groupingExprs: Expression*)(aggregateExprs: Expression*): LogicalPlan = {
+        // Replace top-level integer literals with ordinals, if `groupByOrdinal` is enabled.
+        val groupingExprsWithReplacedOrdinals = groupingExprs.map {

Review Comment:
   Fixed!

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ##
@@ -1825,24 +1825,25 @@ class AstBuilder extends DataTypeAstBuilder
         }
         visitNamedExpression(n)
       }.toSeq
+      val groupByExpressionsWithReplacedOrdinals =
+        replaceOrdinalsInGroupingExpressions(groupByExpressions)

Review Comment:
   Sounds good!
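The code comment in the `dsl` hunk above ("Replace top-level integer literals with ordinals, if `groupByOrdinal` is enabled") can be illustrated with a small stand-alone model. Everything here is hypothetical: the `Toy*` types and `injectOrdinals` are illustration-only names, not Catalyst classes, and the boolean flag stands in for the real configuration lookup. The point is only that a literal at the top level of a grouping expression is rewritten, while a literal nested inside another expression is left alone.

```scala
// Toy expression model (assumption: not the Catalyst Expression hierarchy).
sealed trait ToyExpr
final case class ToyColumn(name: String) extends ToyExpr
final case class ToyIntLiteral(value: Int) extends ToyExpr
final case class ToyAdd(left: ToyExpr, right: ToyExpr) extends ToyExpr
final case class ToyOrdinal(position: Int) extends ToyExpr

object OrdinalInjectionSketch {
  // Rewrites only top-level integer literals, and only when the flag is on.
  def injectOrdinals(groupingExprs: Seq[ToyExpr], groupByOrdinal: Boolean): Seq[ToyExpr] =
    groupingExprs.map {
      case ToyIntLiteral(v) if groupByOrdinal => ToyOrdinal(v)
      case other => other // nested literals are not visited
    }
}
```

With the flag off the input comes back unchanged, which mirrors the idea of doing the configuration check where the plan is constructed rather than during analysis.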
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048515428

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ##
@@ -1979,19 +1978,7 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
               throw QueryCompilationErrors.groupByPositionRefersToAggregateFunctionError(
                 index, ordinalExpr)
             } else {

Review Comment:
   Nice!
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on PR #50606:
URL: https://github.com/apache/spark/pull/50606#issuecomment-2812189399

> Just one thing we need to check - if the view is persisted with `ORDER_BY_ORDINAL` conf ON, what happens if we read this view with `ORDER_BY_ORDINAL` conf `OFF`? This might be an issue, since we moved the conf check to the parser.
>
> The view must keep its confs.

Confirmed the correct behavior in the shell. With conf off, the query should error out with `MISSING_AGGREGATION`
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048372228

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala: ##
@@ -446,11 +447,15 @@ package object dsl {
       def sortBy(sortExprs: SortOrder*): LogicalPlan = Sort(sortExprs, false, logicalPlan)

       def groupBy(groupingExprs: Expression*)(aggregateExprs: Expression*): LogicalPlan = {
+        val groupingExprsWithReplacedOrdinals = groupingExprs.map {

Review Comment:
   Done!
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048461271

## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/GroupByOrdinalsRepeatedAnalysisSuite.scala: ##
@@ -17,63 +17,42 @@

 package org.apache.spark.sql.catalyst.analysis

-import org.apache.spark.sql.catalyst.analysis.TestRelations.{testRelation, testRelation2}
+import org.apache.spark.sql.catalyst.analysis.TestRelations.testRelation
 import org.apache.spark.sql.catalyst.dsl.expressions._
 import org.apache.spark.sql.catalyst.dsl.plans._
 import org.apache.spark.sql.catalyst.expressions.{GenericInternalRow, Literal}
 import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
-import org.apache.spark.sql.internal.SQLConf

-class SubstituteUnresolvedOrdinalsSuite extends AnalysisTest {
-  private lazy val a = testRelation2.output(0)
-  private lazy val b = testRelation2.output(1)
+class GroupByOrdinalsRepeatedAnalysisSuite extends AnalysisTest {

   test("unresolved ordinal should not be unresolved") {
     // Expression OrderByOrdinal is unresolved.
     assert(!UnresolvedOrdinal(0).resolved)
   }

-  test("order by ordinal") {
-    // Tests order by ordinal, apply single rule.
-    val plan = testRelation2.orderBy(Literal(1).asc, Literal(2).asc)
+  test("SPARK-45920: group by ordinal repeated analysis") {
+    val plan = testRelation.groupBy(Literal(1))(Literal(100).as("a")).analyze
     comparePlans(
-      SubstituteUnresolvedOrdinals.apply(plan),
-      testRelation2.orderBy(UnresolvedOrdinal(1).asc, UnresolvedOrdinal(2).asc))
-
-    // Tests order by ordinal, do full analysis
-    checkAnalysis(plan, testRelation2.orderBy(a.asc, b.asc))
+      plan,
+      testRelation.groupBy(Literal(1))(Literal(100).as("a")).analyze
+    )

-    // order by ordinal can be turned off by config
-    withSQLConf(SQLConf.ORDER_BY_ORDINAL.key -> "false") {

Review Comment:
   We are removing the `SubstituteUnresolvedOrdinals` object, and we also have golden file tests for these cases, so I think it would be redundant to rewrite them.
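The "group by ordinal repeated analysis" test in the hunk above concerns analyzing an already-analyzed plan. The sketch below shows one way such a second pass could go wrong if integer literals are interpreted as ordinals during analysis, and why a dedicated ordinal node built up front avoids it. This is an assumption about the failure mode, modeled with hypothetical types (`GroupingExpr`, `IntLit`, `OrdinalRef`, `analyze`), not the Analyzer's actual logic.

```scala
sealed trait GroupingExpr
final case class IntLit(value: Int) extends GroupingExpr
final case class OrdinalRef(position: Int) extends GroupingExpr

object RepeatedAnalysisSketch {
  // Look up a one-based position in the aggregate list.
  private def lookup(position: Int, aggregates: Seq[GroupingExpr]): Either[String, GroupingExpr] =
    if (position >= 1 && position <= aggregates.length) Right(aggregates(position - 1))
    else Left(s"GROUP BY position $position is out of range")

  private def resolveOne(
      expr: GroupingExpr,
      aggregates: Seq[GroupingExpr],
      literalsAreOrdinals: Boolean): Either[String, GroupingExpr] = expr match {
    case OrdinalRef(p) => lookup(p, aggregates)
    // "Old" shape: analysis itself treats a bare integer literal as an ordinal.
    case IntLit(v) if literalsAreOrdinals => lookup(v, aggregates)
    case other => Right(other)
  }

  // Resolve every grouping expression, stopping at the first error.
  def analyze(
      grouping: Seq[GroupingExpr],
      aggregates: Seq[GroupingExpr],
      literalsAreOrdinals: Boolean): Either[String, Vector[GroupingExpr]] =
    grouping.foldLeft[Either[String, Vector[GroupingExpr]]](Right(Vector.empty)) { (acc, expr) =>
      acc.flatMap(done => resolveOne(expr, aggregates, literalsAreOrdinals).map(done :+ _))
    }
}
```

In this model, grouping by `IntLit(1)` over the aggregate list `[IntLit(100)]` resolves to `IntLit(100)` on the first pass; a second pass that still treats literals as ordinals then looks for position 100 and fails. Starting from `OrdinalRef(1)` and leaving literals alone gives the same result on both passes.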
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
vladimirg-db commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048438196

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala: ##
@@ -446,11 +447,16 @@ package object dsl {
       def sortBy(sortExprs: SortOrder*): LogicalPlan = Sort(sortExprs, false, logicalPlan)

       def groupBy(groupingExprs: Expression*)(aggregateExprs: Expression*): LogicalPlan = {
+        // Replace top-level integer literals with ordinals, if `groupByOrdinal` is enabled.
+        val groupingExprsWithReplacedOrdinals = groupingExprs.map {

Review Comment:
   The ordinals are not replaced. They are "injected"
   ```suggestion
           val groupingExprsWithOrdinals = groupingExprs.map {
   ```

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ##
@@ -1979,19 +1978,7 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
               throw QueryCompilationErrors.groupByPositionRefersToAggregateFunctionError(
                 index, ordinalExpr)
             } else {

Review Comment:
   You can drop this `else`, since there's a `throw` above.

## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/GroupByOrdinalsRepeatedAnalysisSuite.scala: ##
@@ -17,63 +17,42 @@

 package org.apache.spark.sql.catalyst.analysis

-import org.apache.spark.sql.catalyst.analysis.TestRelations.{testRelation, testRelation2}
+import org.apache.spark.sql.catalyst.analysis.TestRelations.testRelation
 import org.apache.spark.sql.catalyst.dsl.expressions._
 import org.apache.spark.sql.catalyst.dsl.plans._
 import org.apache.spark.sql.catalyst.expressions.{GenericInternalRow, Literal}
 import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
-import org.apache.spark.sql.internal.SQLConf

-class SubstituteUnresolvedOrdinalsSuite extends AnalysisTest {
-  private lazy val a = testRelation2.output(0)
-  private lazy val b = testRelation2.output(1)
+class GroupByOrdinalsRepeatedAnalysisSuite extends AnalysisTest {

   test("unresolved ordinal should not be unresolved") {
     // Expression OrderByOrdinal is unresolved.
     assert(!UnresolvedOrdinal(0).resolved)
   }

-  test("order by ordinal") {
-    // Tests order by ordinal, apply single rule.
-    val plan = testRelation2.orderBy(Literal(1).asc, Literal(2).asc)
+  test("SPARK-45920: group by ordinal repeated analysis") {
+    val plan = testRelation.groupBy(Literal(1))(Literal(100).as("a")).analyze
     comparePlans(
-      SubstituteUnresolvedOrdinals.apply(plan),
-      testRelation2.orderBy(UnresolvedOrdinal(1).asc, UnresolvedOrdinal(2).asc))
-
-    // Tests order by ordinal, do full analysis
-    checkAnalysis(plan, testRelation2.orderBy(a.asc, b.asc))
+      plan,
+      testRelation.groupBy(Literal(1))(Literal(100).as("a")).analyze
+    )

-    // order by ordinal can be turned off by config
-    withSQLConf(SQLConf.ORDER_BY_ORDINAL.key -> "false") {

Review Comment:
   Why do we remove this piece of test?

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala: ##
@@ -1,64 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *    http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.expressions.{BaseGroupingSets, Expression, Literal, SortOrder}
-import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Sort}
-import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.catalyst.trees.CurrentOrigin.withOrigin
-import org.apache.spark.sql.catalyst.trees.TreePattern._
-import org.apache.spark.sql.types.IntegerType
-
-/**
- * Replaces ordinal in 'order by' or 'group by' with UnresolvedOrdinal expression.
- */
-object SubstituteUnresolvedOrdinals extends Rule[LogicalPlan] {

Review Comment:
   Nice!

## sql/core/src/test/scala/org/apache/spark/sql/analysis/resolver/AggregateResolverSuite.scala: ##
@@ -44,12 +44,6 @@ class AggregateResolverSuite extends QueryTest with SharedSparkSession {
     resolverRunner.resolve(query)
   }

-  test("Valid group by ordinal") {

Rev
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
vladimirg-db commented on PR #50606:
URL: https://github.com/apache/spark/pull/50606#issuecomment-2812125611

Just one thing we need to check - if the view is persisted with `ORDER_BY_ORDINAL` conf ON, what happens if we read this view with `ORDER_BY_ORDINAL` conf `OFF`? This might be an issue, since we moved the conf check to the parser.

The view must keep its confs.
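The requirement "the view must keep its confs" can be pictured with a toy model in which a stored view carries the configuration captured when it was created, and those captured values take precedence over the reading session's values when the view text is parsed. All names here (`StoredView`, `confsForParsing`, `ordinalEnabled`) are hypothetical, the configuration key string is used only for illustration, and the default of `"true"` is an assumption; this is a sketch of the expectation, not of how Spark stores or applies view configuration.

```scala
object ViewConfSketch {
  // Illustration-only key name (assumption).
  val OrderByOrdinalKey = "spark.sql.orderByOrdinal"

  // A view remembers the confs that were in effect when it was created.
  final case class StoredView(sqlText: String, capturedConfs: Map[String, String])

  // Captured confs win over the session's confs when the view is read.
  def confsForParsing(view: StoredView, sessionConfs: Map[String, String]): Map[String, String] =
    sessionConfs ++ view.capturedConfs

  def ordinalEnabled(confs: Map[String, String]): Boolean =
    confs.getOrElse(OrderByOrdinalKey, "true").toBoolean
}
```

In this model a view created with the flag on still parses its ordinals as ordinals when read from a session that has the flag off, which is the behavior the question asks to verify.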
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048385581

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ##
@@ -6558,6 +6573,31 @@ class AstBuilder extends DataTypeAstBuilder
     }
   }

+  private def visitSortItemAndReplaceOrdinals(sortItemContext: SortItemContext) = {

Review Comment:
   Done!
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on PR #50606:
URL: https://github.com/apache/spark/pull/50606#issuecomment-2812045197

> shall we also handle Spark Connect queries in `SparkConnectPlanner`?

Sure, done!
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048368533

## sql/core/src/test/scala/org/apache/spark/sql/analysis/resolver/AggregateResolverSuite.scala: ##
@@ -44,12 +44,6 @@ class AggregateResolverSuite extends QueryTest with SharedSparkSession {
     resolverRunner.resolve(query)

Review Comment:
   Done!
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
mihailotim-db commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048369191

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ##
@@ -6532,12 +6547,12 @@ class AstBuilder extends DataTypeAstBuilder
         case n: NamedExpression =>
           newGroupingExpressions += n
           newAggregateExpressions += n
-        // If the grouping expression is an integer literal, create [[UnresolvedOrdinal]] and
-        // [[UnresolvedPipeAggregateOrdinal]] expressions to represent it in the final grouping
-        // and aggregate expressions, respectively. This will let the
+        // If the grouping expression is an [[UnresolvedOrdinal]], replace the ordinal value and
+        // create [[UnresolvedPipeAggregateOrdinal]] expressions to represent it in the final
+        // grouping and aggregate expressions, respectively. This will let the
         // [[ResolveOrdinalInOrderByAndGroupBy]] rule detect the ordinal in the aggregate list
         // and replace it with the corresponding attribute from the child operator.
-        case Literal(v: Int, IntegerType) if conf.groupByOrdinal =>
+        case UnresolvedOrdinal(v: Int) =>
           newGroupingExpressions += UnresolvedOrdinal(newAggregateExpressions.length + 1)

Review Comment:
   Thanks for the clarification!
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
cloud-fan commented on PR #50606:
URL: https://github.com/apache/spark/pull/50606#issuecomment-2811844154

shall we also handle Spark Connect queries in `SparkConnectPlanner`?
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
cloud-fan commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2048273150

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala: ##
@@ -446,11 +447,15 @@ package object dsl {
       def sortBy(sortExprs: SortOrder*): LogicalPlan = Sort(sortExprs, false, logicalPlan)

       def groupBy(groupingExprs: Expression*)(aggregateExprs: Expression*): LogicalPlan = {
+        val groupingExprsWithReplacedOrdinals = groupingExprs.map {

Review Comment:
   +1
Re: [PR] [SPARK-51820][SQL] Move `UnresolvedOrdinal` construction before analysis to avoid issue with group by ordinal [spark]
dtenedor commented on code in PR #50606:
URL: https://github.com/apache/spark/pull/50606#discussion_r2047817277

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala: ##
@@ -446,11 +447,15 @@ package object dsl {
       def sortBy(sortExprs: SortOrder*): LogicalPlan = Sort(sortExprs, false, logicalPlan)

       def groupBy(groupingExprs: Expression*)(aggregateExprs: Expression*): LogicalPlan = {
+        val groupingExprsWithReplacedOrdinals = groupingExprs.map {

Review Comment:
   can you please add a comment here saying what this part is doing?

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ##
@@ -1825,24 +1825,32 @@ class AstBuilder extends DataTypeAstBuilder
         }
         visitNamedExpression(n)
       }.toSeq
+      val groupByExpressionsWithReplacedOrdinals =
+        replaceOrdinalsInGroupingExpressions(groupByExpressions)
       if (ctx.GROUPING != null) {
         // GROUP BY ... GROUPING SETS (...)
         // `groupByExpressions` can be non-empty for Hive compatibility. It may add extra grouping
         // expressions that do not exist in GROUPING SETS (...), and the value is always null.
         // For example, `SELECT a, b, c FROM ... GROUP BY a, b, c GROUPING SETS (a, b)`, the output
         // of column `c` is always null.
         val groupingSets =
-          ctx.groupingSet.asScala.map(_.expression.asScala.map(e => expression(e)).toSeq)
-        Aggregate(Seq(GroupingSets(groupingSets.toSeq, groupByExpressions)),
+          ctx.groupingSet.asScala.map(_.expression.asScala.map(e => {

Review Comment:
   can you please add a comment here saying what this part is doing?

## sql/core/src/test/scala/org/apache/spark/sql/analysis/resolver/AggregateResolverSuite.scala: ##
@@ -44,12 +44,6 @@ class AggregateResolverSuite extends QueryTest with SharedSparkSession {
     resolverRunner.resolve(query)

Review Comment:
   Can you copy these test contents to the Jira so we don't forget?

## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala: ##
@@ -6558,6 +6573,31 @@ class AstBuilder extends DataTypeAstBuilder
     }
   }

+  private def visitSortItemAndReplaceOrdinals(sortItemContext: SortItemContext) = {

Review Comment:
   can you please add a comment here saying what these new methods are doing?

## sql/core/src/main/scala/org/apache/spark/sql/classic/Dataset.scala: ##
@@ -929,7 +929,16 @@ class Dataset[T] private[sql](
   /** @inheritdoc */
   @scala.annotation.varargs
   def groupBy(cols: Column*): RelationalGroupedDataset = {
-    RelationalGroupedDataset(toDF(), cols.map(_.expr), RelationalGroupedDataset.GroupByType)
+    val groupingExpressionsWithReplacedOrdinals = cols.map { col => col.expr match {

Review Comment:
   can you please add a comment here saying what this part is doing?