Re: [PR] [KYUUBI #5377] Spark engine query result save to file [kyuubi]

via GitHub Tue, 19 Dec 2023 21:59:53 -0800


cxzl25 commented on code in PR #5591:
URL: https://github.com/apache/kyuubi/pull/5591#discussion_r1432281147



##########
externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecuteStatement.scala:
##########
@@ -158,6 +171,24 @@ class ExecuteStatement(
         override def iterator: Iterator[Any] = incrementalCollectResult(resultDF)
       })
     } else {
+      val sparkSave = getSessionConf(OPERATION_RESULT_SAVE_TO_FILE, spark)
+      lazy val threshold = getSessionConf(OPERATION_RESULT_SAVE_TO_FILE_THRESHOLD, spark)
+      if (hasResultSet && sparkSave && shouldSaveResultToHdfs(resultMaxRows, threshold, result)) {
+        val sessionId = session.handle.identifier.toString
+        val savePath = session.sessionManager.getConf.get(OPERATION_RESULT_SAVE_TO_FILE_PATH)
+        saveFileName = Some(s"$savePath/$engineId/$sessionId/$statementId")
+        val colName = range(0, result.schema.size).map(x => "col" + x)
+        if (resultMaxRows > 0) {
+          result.toDF(colName: _*).limit(resultMaxRows).write
+            .option("compression", "zstd").format("orc").save(saveFileName.get)

Review Comment:
   [SPARK-33978][SQL] Support ZSTD compression in ORC data source
   https://issues.apache.org/jira/browse/SPARK-33978
   Fix Version/s: 3.2.0
   
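   ZSTD for ORC is only available from Spark 3.2.0 onward, so hard-coding `zstd` breaks on older versions. Maybe we need a configuration item for the codec, or we could fall back to zlib when the Spark version is lower than 3.2.0.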
   
   For example, in `bin/spark-shell` on Spark 3.1.1:
   ```scala
   scala> spark.range(10).write.option("compression", "zstd").orc("/tmp/zstd")
   java.lang.IllegalArgumentException: Codec [zstd] is not available. Available codecs are uncompressed, lzo, snappy, zlib, none.
   ```
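
   A minimal sketch of the version-based fallback, assuming the codec is chosen from the runtime Spark version string. `OrcCodecChooser`, `majorMinor` and `orcCodec` are hypothetical names for illustration, not existing Kyuubi code:

   ```scala
   // Hypothetical helper: picks an ORC compression codec that the running Spark supports.
   object OrcCodecChooser {

     // Extract (major, minor) from a version string such as "3.1.1" or "3.2.0-SNAPSHOT".
     def majorMinor(version: String): (Int, Int) = {
       val parts = version.split("[.-]")
       (parts(0).toInt, parts(1).toInt)
     }

     // SPARK-33978 added ZSTD support to the ORC data source in 3.2.0;
     // older versions fall back to zlib, which appears in the 3.1.1 codec list above.
     def orcCodec(sparkVersion: String): String = {
       val (major, minor) = majorMinor(sparkVersion)
       if (major > 3 || (major == 3 && minor >= 2)) "zstd" else "zlib"
     }
   }
   ```

   The write could then use `.option("compression", OrcCodecChooser.orcCodec(spark.version))` instead of the hard-coded `"zstd"`. A configuration item would still be more flexible if users want to pick another codec.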




