cloud-fan commented on code in PR #56190:
URL: https://github.com/apache/spark/pull/56190#discussion_r3461768210
##########
sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala:
##########
@@ -1223,4 +1212,19 @@ class QueryTestSuite extends test.SharedSparkSession {
"from range(2)"),
Seq(Row(Row(null)), Row(Row("null"))))
}
+
+ test("checkAnswer demands correct result order for ordered queries") {
+ val e = intercept[org.scalatest.exceptions.TestFailedException] {
+ checkAnswer(
+ sql("SELECT col1 FROM VALUE 1, 2, 1, 3 ORDER BY col 1"),
Review Comment:
`VALUE` isn't valid inline-table syntax — it's `VALUES` — and `col 1` (with
a space) isn't a valid column ref. This throws `ParseException`, so the
`intercept[TestFailedException]` never matches and the test fails; it's the
root cause of the red CI (and `QueryTestWithConnectSuite` inherits and cascades
on it). The same `VALUE` typo is on line 1227.
```suggestion
sql("SELECT col1 FROM VALUES 1, 2, 1, 3 ORDER BY col1"),
```
##########
sql/connect/server/src/test/scala/org/apache/spark/sql/connect/SessionQueryTest.scala:
##########
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.connect
+
+import org.apache.spark.sql
+
+/**
+ * Overrides test utils to implement 'connect variants' of suites declared in
sql/core:
+ * {{{
+ * // in sql/core
+ * FooSuite extends SessionQueryTest { test("") { ... } }
+ *
+ * // in sql/connect
+ * FooConnectSuite extends FooSuite with connect.SessionQueryTest
+ * }}}
+ *
+ * This trait overrides [[spark]] to use a [[SparkSession
connect.SparkSession]], which executes
+ * via the gRPC API using an in-process connect server.
+ */
+trait SessionQueryTest extends sql.SessionQueryTest with SparkSessionBinder {
+
+ /**
+ * TODO add required test-only API to Spark Connect
+ * This method is used by [[checkAnswer]] internally but cannot yet be
implemented in connect.
+ * Thus we always return `false` for now.
+ */
+ override def isDfSorted(df: sql.DataFrame): Boolean = df match {
+ case df: DataFrame => df.explainString(true).contains("Sort")
+ case df => super.isDfSorted(df)
+ }
Review Comment:
**(blocking)** This `match` block is mis-indented (`case` at 6 spaces,
closing `}` at 4) and fails the scalafmt check, which is enforced on
`sql/connect/*` (`dev/lint-scala`, maxColumn 98) — that's why the "Scala
linter" job is red. Reformat:
```suggestion
override def isDfSorted(df: sql.DataFrame): Boolean = df match {
case df: DataFrame => df.explainString(true).contains("Sort")
case df => super.isDfSorted(df)
}
```
**(non-blocking, question)** `explainString(true)` is the EXTENDED plan, so
substring `"Sort"` also matches physical operators like
`SortMergeJoin`/`SortAggregate`. An *unordered* query that happens to plan a
sort-based operator would then be treated as order-sensitive and could fail
spuriously — classic detects a logical `Sort` node instead. Worth matching the
analyzed logical plan's `Sort`, or documenting the limitation. (This is the
good direction vs. the old always-`false`, just imprecise.)
**(nit)** The scaladoc just above (line 39) still says "Thus we always
return `false` for now", which no longer matches this implementation.
##########
sql/core/src/test/scala/org/apache/spark/sql/ExampleSessionAgnosticSuite.scala:
##########
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.SparkConf
+import org.apache.spark.sql.connector.catalog.InMemoryPartitionTableCatalog
+
+/**
+ * Example for
Review Comment:
Incomplete sentence.
```suggestion
* Example of a classic/connect-agnostic test suite.
```
##########
sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala:
##########
@@ -1223,4 +1212,19 @@ class QueryTestSuite extends test.SharedSparkSession {
"from range(2)"),
Seq(Row(Row(null)), Row(Row("null"))))
}
+
+ test("checkAnswer demands correct result order for ordered queries") {
+ val e = intercept[org.scalatest.exceptions.TestFailedException] {
+ checkAnswer(
+ sql("SELECT col1 FROM VALUE 1, 2, 1, 3 ORDER BY col 1"),
+ Seq(Row(3), Row(1), Row(1), Row(2)))
+ }
+ assert(e.getMessage().contains("Results do not match for query"))
+ }
+
+ test("checkAnswer ignores result order for unordered queries") {
+ checkAnswer(
+ sql("SELECT col1 FROM VALUE 1, 2, 1, 3"),
Review Comment:
Same `VALUE` -> `VALUES` fix here.
```suggestion
sql("SELECT col1 FROM VALUES 1, 2, 1, 3"),
```
##########
sql/connect/server/src/test/scala/org/apache/spark/sql/connect/QueryTestWithConnectSuite.scala:
##########
@@ -0,0 +1,31 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect
+
+import org.apache.spark.sql.QueryTestSuite
+
+/**
+ * Runs [[QueryTestSuite]] tests through a Connect session.
+ *
+ * This validates the `FooSuite with connect.SparkSessionBinder` pattern: the
existing
Review Comment:
The class mixes in `connect.SessionQueryTest` (the pattern the PR
recommends), not the bare `connect.SparkSessionBinder` — update the doc to
match what the code does:
```suggestion
* This validates the `FooSuite with connect.SessionQueryTest` pattern: the
existing
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]