szehon-ho commented on code in PR #54554:
URL: https://github.com/apache/spark/pull/54554#discussion_r2867128442


##########
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala:
##########
@@ -337,6 +362,127 @@ class SQLQueryTestSuite extends QueryTest with SharedSparkSession with SQLHelper
       }
     }
 
+    localSparkSession
+  }
+
+  /**
+   * Runs queries in debug mode. Only commands and --DEBUG-marked queries are executed.
+   * Failed debug queries print the full error stacktrace. Results are compared against
+   * the golden file. The test always fails at the end to prevent accidental commits.
+   *
+   * To inspect the DataFrame interactively, set a breakpoint at the line where
+   * `localSparkSession.sql(sql)` is called inside this method, then evaluate the DataFrame
+   * in the debugger (e.g., `df.queryExecution.analyzed`, `df.explain(true)`).
+   */
+  protected def runDebugQueries(
+      queriesWithDebug: Seq[(String, Boolean)],
+      testCase: TestCase,
+      sparkConfigSet: Seq[(String, String)]): Unit = {
+    val localSparkSession = setupLocalSession(testCase, sparkConfigSet)
+    val lowercaseTestCase = testCase.name.toLowerCase(Locale.ROOT)
+    val segmentsPerQuery = 3
+
+    val goldenFileExists = new File(testCase.resultFile).exists()
+    val segments = if (goldenFileExists) {
+      Files.readString(new File(testCase.resultFile).toPath).split("-- !query.*\n")
+    } else {
+      Array.empty[String]
+    }
+
+    queriesWithDebug.zipWithIndex.foreach { case ((sql, isDebug), i) =>
+      val isCommand = try {
+        localSparkSession.sessionState.sqlParser.parsePlan(sql).isInstanceOf[Command]
+      } catch {
+        case _: ParseException => false
+      }
+
+      if (isCommand) {
+        localSparkSession.sql(sql).collect()
+      } else if (isDebug) {
+        // Capture exception stacktrace if the query fails.
+        var exceptionTrace: Option[String] = None

Review Comment:
   Can we always print the stack trace, not just in debug mode, when we hit an unexpected exception?
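To make the reviewer's suggestion concrete, here is a minimal sketch (assumed helper name `stackTraceToString`, not the PR's actual code) of capturing a Throwable's full stack trace as a String with plain JDK classes, so it could be printed on any unexpected failure rather than only for --DEBUG-marked queries:

```scala
import java.io.{PrintWriter, StringWriter}

// Hypothetical helper: renders any Throwable's full stack trace as a
// String so callers can log it unconditionally.
def stackTraceToString(t: Throwable): String = {
  val sw = new StringWriter()
  t.printStackTrace(new PrintWriter(sw, true))
  sw.toString
}

// Simulate an unexpected query failure and capture its trace.
val trace =
  try {
    sys.error("query failed unexpectedly")
  } catch {
    case e: Exception => stackTraceToString(e)
  }
```

Spark's own `org.apache.spark.util.Utils.exceptionString` may serve the same purpose if the suite prefers an existing utility.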



##########
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala:
##########
@@ -84,6 +84,16 @@ import org.apache.spark.util.Utils
  *     times, each time picks one config set from each dimension, until all the combinations are
  *     tried. For example, if dimension 1 has 2 lines, dimension 2 has 3 lines, this testing file
  *     will be run 6 times (cartesian product).
+ *  6. A line with --DEBUG (on its own line before a query) marks that query for debug mode.
+ *     When any --DEBUG marker is present, the test enters debug mode:
+ *     - Commands (CREATE TABLE, INSERT, DROP, SET, etc.) are always executed automatically.
+ *     - Only --DEBUG-marked non-command queries are executed; all others are skipped.
+ *     - For failed queries, the full error stacktrace is printed to the console.
+ *     - Query results are still compared against the golden file.
+ *     - The test always fails at the end with a reminder to remove --DEBUG markers.
+ *     To inspect the DataFrame interactively, set a breakpoint in `runDebugQueries` at the
+ *     line where `localSparkSession.sql(sql)` is called, then evaluate the DataFrame in the

Review Comment:
   Is it possible to have this available in all cases, not just --DEBUG mode?
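To illustrate the --DEBUG marker semantics described in the doc comment, here is a hedged sketch (hypothetical helper `tagDebugQueries`, not part of the PR) of pairing each query with a Boolean saying whether the line immediately before it was a standalone --DEBUG marker:

```scala
// Hypothetical sketch: pair each query with a Boolean indicating whether
// the line immediately before it was a standalone --DEBUG marker.
def tagDebugQueries(lines: Seq[String]): Seq[(String, Boolean)] = {
  val tagged = scala.collection.mutable.ArrayBuffer.empty[(String, Boolean)]
  var pendingDebug = false
  lines.map(_.trim).filter(_.nonEmpty).foreach {
    case "--DEBUG" => pendingDebug = true   // mark the next query for debug
    case query =>
      tagged += ((query, pendingDebug))
      pendingDebug = false                  // the marker applies to one query
  }
  tagged.toSeq
}

val tagged = tagDebugQueries(Seq("SELECT 1;", "--DEBUG", "SELECT 2;"))
// tagged == Seq(("SELECT 1;", false), ("SELECT 2;", true))
```

This matches the documented behavior that a --DEBUG line applies only to the query that follows it.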



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

