grundprinzip commented on code in PR #38627:
URL: https://github.com/apache/spark/pull/38627#discussion_r1021116483
##########
connector/connect/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##########
@@ -148,29 +147,25 @@ class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
}
test("Aggregate with more than 1 grouping expressions") {
- withSQLConf(SQLConf.DATAFRAME_RETAIN_GROUP_COLUMNS.key -> "false") {
- val connectPlan =
- connectTestRelation.groupBy("id".protoAttr, "name".protoAttr)()
- val sparkPlan =
- sparkTestRelation.groupBy(Column("id"), Column("name")).agg(Map.empty[String, String])
- comparePlans(connectPlan, sparkPlan)
- }
+ val connectPlan =
+ connectTestRelation.groupBy("id".protoAttr, "name".protoAttr)()
+ val sparkPlan =
+ sparkTestRelation.groupBy(Column("id"), Column("name")).agg(Map.empty[String, String])
+ comparePlans(connectPlan, sparkPlan)
}
test("Aggregate expressions") {
- withSQLConf(SQLConf.DATAFRAME_RETAIN_GROUP_COLUMNS.key -> "false") {
Review Comment:
There are a couple of things to point out:
1) The behavior can be controlled by applying the default values for the
execution on the server side, and I would like to hold off on adding knobs
until we have real evidence that we need them.
2) Regarding `__repr__`: the behavior of the REPL is defined by PySpark, and
we don't have to follow it to the letter. We can make independent decisions
about what the behavior of `__repr__` should be. For example, I don't think
we should evaluate eagerly.
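
The point about `__repr__` can be illustrated with a plain-Python sketch (the `LazyFrame` class and its methods are hypothetical, not Spark Connect APIs): a lazily evaluated object's `__repr__` can describe the pending plan instead of triggering execution, leaving evaluation as an explicit call.

```python
# Hypothetical sketch: a lazily evaluated frame whose __repr__ describes the
# pending plan rather than executing it, in contrast to an eager REPL repr.
class LazyFrame:
    def __init__(self, plan):
        self.plan = plan          # description of the pending computation
        self.executed = False     # tracks whether collect() has run

    def __repr__(self):
        # Describe the plan; deliberately do NOT evaluate it.
        return f"LazyFrame(plan={self.plan!r}, executed={self.executed})"

    def collect(self):
        # The only explicit evaluation point.
        self.executed = True
        return f"result of {self.plan}"


df = LazyFrame("groupBy(id, name)")
print(repr(df))       # inspecting the object does not execute the plan
print(df.collect())   # evaluation happens only when requested
```

The design choice here is that printing an object in the REPL stays cheap and side-effect free; any network round trip or job execution requires an explicit action.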
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]