zhenlineo commented on code in PR #40796:
URL: https://github.com/apache/spark/pull/40796#discussion_r1184408288
##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala:
##########
@@ -664,7 +665,53 @@ class SparkConnectPlanner(val session: SparkSession) {
input: proto.Relation,
groupingExprs: java.util.List[proto.Expression],
sortingExprs: java.util.List[proto.Expression]):
UntypedKeyValueGroupedDataset = {
- val logicalPlan = transformRelation(input)
+ apply(transformRelation(input), groupingExprs, sortingExprs)
+ }
+
+ private def apply(
+ logicalPlan: LogicalPlan,
+ groupingExprs: java.util.List[proto.Expression],
+ sortingExprs: java.util.List[proto.Expression]):
UntypedKeyValueGroupedDataset = {
+ if (groupingExprs.size() == 1) {
+ createFromGroupByKeyFunc(logicalPlan, groupingExprs, sortingExprs)
+ } else if (groupingExprs.size() > 1) {
Review Comment:
I do not see a common path here. The nasty part is that we hide logic inside
grouping_exprs based on the count of the expressions. The alternative I can
think of is an UnresolvedFunc or a new Expression that would allow us to add
more logic, e.g.
```
message KeyValueGroupedDataset { // New Expression or Unresolved Func
// (Required) Input user-defined function. Defines the grouping func
CommonInlineUserDefinedFunction grouping_func = 1;
// (Optional) Extra grouping expressions needed for RelationalGroupedDataset
  repeated Expression grouping_expressions = 2;
}
```
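To illustrate the design point (a minimal sketch with purely hypothetical names, not actual Spark Connect types): with a dedicated message, the planner can dispatch on an explicit variant instead of inferring intent from the number of grouping expressions.
```scala
// Hypothetical sketch: the grouping intent is part of the message shape,
// rather than being inferred from how many expressions were sent.
sealed trait GroupingSpec
case class GroupByKeyFunc(funcName: String) extends GroupingSpec
case class GroupByExprs(exprs: Seq[String]) extends GroupingSpec

def describeGrouping(spec: GroupingSpec): String = spec match {
  // Explicit variant: a user-defined grouping function was supplied.
  case GroupByKeyFunc(f) => s"grouping via user-defined key function: $f"
  // Explicit variant: plain relational grouping expressions.
  case GroupByExprs(es)  => s"grouping via ${es.size} relational expression(s)"
}
```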
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]