(spark) branch master updated: [SPARK-55198][SQL] spark-sql should skip comment line with leading whitespaces

wenchen Wed, 28 Jan 2026 19:04:57 -0800

This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 1268c15a4339 [SPARK-55198][SQL] spark-sql should skip comment line with leading whitespaces
1268c15a4339 is described below

commit 1268c15a43394b4308fc06a02899988e3a011a8f
Author: Cheng Pan <[email protected]>
AuthorDate: Thu Jan 29 11:04:37 2026 +0800

    [SPARK-55198][SQL] spark-sql should skip comment line with leading whitespaces
    
    ### What changes were proposed in this pull request?
    
    The current `spark-sql` treats a line as a comment only when it starts with `--` with no leading whitespace, which leads to unintuitive behavior:
    ```
    spark-sql (default)> set x=
                       > -- comment
                       > 1;
    x       1
    Time taken: 0.614 seconds, Fetched 1 row(s)
    ```
    ```
    spark-sql (default)> set x=
                       >  -- comment
                       > 1;
    x       -- comment
    1
    Time taken: 0.044 seconds, Fetched 1 row(s)
    ```
    
    This PR follows HIVE-8396 (Hive 2.0.0) and calls `line.trim` before testing whether a line starts with `--`, so that the two queries above produce the same output.
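    
    As a rough illustration only (not code from this patch), the check can be sketched in Scala as follows. The object `CommentLineSketch` and the helpers `isCommentLine` and `stripCommentLines` are hypothetical names introduced for this example; the actual change lives in `SparkSQLCLIDriver`.
    
    ```scala
    // Minimal sketch, assuming input is split on '\n'; names here are hypothetical.
    object CommentLineSketch {
      // True when the line is a `--` comment, ignoring leading whitespace.
      def isCommentLine(line: String): Boolean = line.trim.startsWith("--")
    
      // Drops comment lines and joins the remaining lines with newlines.
      def stripCommentLines(input: String): String =
        input.split("\n").filterNot(isCommentLine).mkString("\n")
    
      def main(args: Array[String]): Unit = {
        // Same statement as in the examples above, with an indented comment line.
        println(stripCommentLines("set x=\n -- comment\n1;"))
      }
    }
    ```
    
    Under these assumptions, ` -- comment` is dropped the same way as `-- comment`, so both inputs reduce to `set x=` followed by `1;`.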
    
    ### Why are the changes needed?
    
    Make `spark-sql`'s handling of comment lines more intuitive and consistent with `beeline`.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes, `spark-sql` now ignores comment lines regardless of whether they have leading whitespace.
    
    ### How was this patch tested?
    
    Unit tests are added. The updated behavior was also manually verified to be consistent with `beeline`.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #53977 from pan3793/SPARK-55198.
    
    Authored-by: Cheng Pan <[email protected]>
    Signed-off-by: Wenchen Fan <[email protected]>
---
 .../sql/hive/thriftserver/SparkSQLCLIDriver.scala    | 20 +++++++++++++++++++-
 .../spark/sql/hive/thriftserver/CliSuite.scala       | 15 +++++++++++++++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
index 64279b03f76f..0a024fb10ee0 100644
--- a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
+++ b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
@@ -259,7 +259,9 @@ private[hive] object SparkSQLCLIDriver extends Logging {
     var line = reader.readLine(currentPrompt + "> ")
 
     while (line != null) {
-      if (!line.startsWith("--")) {
+      // SPARK-55198: call line.trim to also skip comment line with leading whitespaces,
+      // this keeps the behavior align with HIVE-8396
+      if (!line.trim.startsWith("--")) {
         if (prefix.nonEmpty) {
           prefix += '\n'
         }
@@ -529,6 +531,22 @@ private[hive] class SparkSQLCLIDriver extends CliDriver with Logging {
     }
   }
 
+  // Adapted processReader from Hive 2.3's CliDriver.processReader.
+  // SPARK-55198: call line.trim to also skip comment line with leading whitespaces,
+  // this keeps the spark-sql's behavior align with beeline.
+  override def processReader(r: BufferedReader): Int = {
+    val qsb = new StringBuilder
+    var line = r.readLine
+    while (line != null) {
+      // Skipping through comments
+      if (!line.trim.startsWith("--")) {
+        qsb.append(line + "\n")
+      }
+      line = r.readLine
+    }
+    processLine(qsb.toString)
+  }
+
   // Adapted processLine from Hive 2.3's CliDriver.processLine.
   override def processLine(line: String, allowInterrupting: Boolean): Int = {
     var oldSignal: SignalHandler = null
diff --git a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
index 8278ab14dd68..d12f9fdd1900 100644
--- a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
+++ b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
@@ -19,6 +19,7 @@ package org.apache.spark.sql.hive.thriftserver
 
 import java.io._
 import java.nio.charset.StandardCharsets
+import java.nio.file.Files
 import java.util.concurrent.CountDownLatch
 
 import scala.collection.mutable.ArrayBuffer
@@ -876,4 +877,18 @@ class CliSuite extends SparkFunSuite {
       "SELECT ?;" -> ""
     )
   }
+
+  test("SPARK-55198: spark-sql should skip comment line with leading whitespaces") {
+    val sql = """SET x=
+                | -- comment
+                |1;
+                |""".stripMargin
+    runCliWithin(2.minutes)(sql -> "x\t1")
+
+    withTempDir { tmpDir =>
+      val sqlFilePath = tmpDir.toPath.resolve("test.sql").toAbsolutePath
+      Files.writeString(sqlFilePath, sql)
+      runCliWithin(2.minutes, extraArgs = Seq("-f", sqlFilePath.toString))("" -> "x\t1")
+    }
+  }
 }
