grundprinzip commented on code in PR #38627:
URL: https://github.com/apache/spark/pull/38627#discussion_r1021116483
##########
connector/connect/src/test/scala/org/apache/spark/sql/connect/planner/SparkConnectProtoSuite.scala:
##########
@@ -148,29 +147,25 @@ class SparkConnectProtoSuite extends PlanTest with SparkConnectPlanTest {
}
test("Aggregate with more than 1 grouping expressions") {
- withSQLConf(SQLConf.DATAFRAME_RETAIN_GROUP_COLUMNS.key -> "false") {
- val connectPlan =
- connectTestRelation.groupBy("id".protoAttr, "name".protoAttr)()
- val sparkPlan =
- sparkTestRelation.groupBy(Column("id"), Column("name")).agg(Map.empty[String, String])
- comparePlans(connectPlan, sparkPlan)
- }
+ val connectPlan =
+ connectTestRelation.groupBy("id".protoAttr, "name".protoAttr)()
+ val sparkPlan =
+ sparkTestRelation.groupBy(Column("id"), Column("name")).agg(Map.empty[String, String])
+ comparePlans(connectPlan, sparkPlan)
}
test("Aggregate expressions") {
- withSQLConf(SQLConf.DATAFRAME_RETAIN_GROUP_COLUMNS.key -> "false") {
Review Comment:
There are a couple of things to point out:
1) The behavior can be controlled by applying the default values for the
execution on the server side, and I would like to hold off on adding knobs
until we have real evidence that we need them.
2) Regarding `__repr__`: the behavior of the REPL is defined by PySpark, and
we don't have to follow it to the letter. We can make independent decisions
about what the behavior of `__repr__` should be. For example, I don't think
we should evaluate eagerly.
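
The point about `__repr__` can be illustrated with a plain-Python sketch (the `LazyFrame` class and its methods are hypothetical, not Spark Connect APIs): a lazily evaluated object's `__repr__` can describe the pending plan instead of triggering execution, leaving evaluation as an explicit call.

```python
# Hypothetical sketch: a lazily evaluated frame whose __repr__ describes the
# pending plan rather than executing it, in contrast to an eager REPL repr.
class LazyFrame:
    def __init__(self, plan):
        self.plan = plan          # description of the pending computation
        self.executed = False     # tracks whether collect() has run

    def __repr__(self):
        # Describe the plan; deliberately do NOT evaluate it.
        return f"LazyFrame(plan={self.plan!r}, executed={self.executed})"

    def collect(self):
        # The only explicit evaluation point.
        self.executed = True
        return f"result of {self.plan}"


df = LazyFrame("groupBy(id, name)")
print(repr(df))       # inspecting the object does not execute the plan
print(df.collect())   # evaluation happens only when requested
```

The design choice here is that printing an object in the REPL stays cheap and side-effect free; any network round trip or job execution requires an explicit action.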
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]