This is an automated email from the ASF dual-hosted git repository.
feiwang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kyuubi.git
The following commit(s) were added to refs/heads/master by this push:
new 19701efc3 [KYUUBI #5377][FOLLOWUP] Spark engine query result save to file
19701efc3 is described below
commit 19701efc3c45a5e9c63ffcdc9afdfc4f5ed0181d
Author: senmiaoliu <[email protected]>
AuthorDate: Thu Dec 21 14:17:47 2023 -0800
[KYUUBI #5377][FOLLOWUP] Spark engine query result save to file
# :mag: Description
## Issue References 🔗
https://github.com/apache/kyuubi/pull/5591#discussion_r1432281147
## Describe Your Solution 🔧
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
## Types of changes :bookmark:
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request :coffin:
#### Behavior With This Pull Request :tada:
#### Related Unit Tests
---
# Checklists
## 📝 Author Self Checklist
- [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
- [x] I have performed a self-review
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
## 📝 Committer Pre-Merge Checklist
- [ ] Pull request title is okay.
- [ ] No license issues.
- [ ] Milestone correctly set?
- [ ] Test coverage is ok
- [ ] Assignees are selected.
- [ ] Minimum number of approvals
- [ ] No changes are requested
**Be nice. Be informative.**
Closes #5895 from lsm1/branch-kyuubi-5377-followup.
Closes #5377
4219d28ba [Fei Wang] nit
31d4fc15f [senmiaoliu] use zlib when SPARK version less than 3.2
Lead-authored-by: senmiaoliu <[email protected]>
Co-authored-by: Fei Wang <[email protected]>
Signed-off-by: Fei Wang <[email protected]>
---
.../org/apache/kyuubi/engine/spark/operation/ExecuteStatement.scala | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecuteStatement.scala b/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecuteStatement.scala
index d1a213067..8b47e2075 100644
--- a/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecuteStatement.scala
+++ b/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/operation/ExecuteStatement.scala
@@ -182,13 +182,15 @@ class ExecuteStatement(
saveFileName = Some(s"$savePath/$engineId/$sessionId/$statementId")
// Rename all col name to avoid duplicate columns
val colName = range(0, result.schema.size).map(x => "col" + x)
+
+ val codec = if (SPARK_ENGINE_RUNTIME_VERSION >= "3.2") "zstd" else "zlib"
// df.write will introduce an extra shuffle for the outermost limit, and hurt performance
if (resultMaxRows > 0) {
result.toDF(colName: _*).limit(resultMaxRows).write
- .option("compression", "zstd").format("orc").save(saveFileName.get)
+ .option("compression", codec).format("orc").save(saveFileName.get)
} else {
result.toDF(colName: _*).write
- .option("compression", "zstd").format("orc").save(saveFileName.get)
+ .option("compression", codec).format("orc").save(saveFileName.get)
}
info(s"Save result to $saveFileName")
fetchOrcStatement = Some(new FetchOrcStatement(spark))
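
For readers who want to see the pattern outside the diff, below is a minimal, self-contained Scala sketch of the idea this patch applies: pick the ORC compression codec from the Spark runtime version (zstd on Spark 3.2+, zlib otherwise) before writing the result set to files. The object `SaveResultSketch`, the wrapper `saveToOrc`, and the helper `sparkVersionAtLeast` are illustrative assumptions for this sketch only; the engine itself compares `SPARK_ENGINE_RUNTIME_VERSION` inside `ExecuteStatement` as shown in the diff above.

```scala
import org.apache.spark.sql.DataFrame

object SaveResultSketch {

  // Hypothetical helper for this sketch: numeric "major.minor" comparison.
  // The Kyuubi engine compares SPARK_ENGINE_RUNTIME_VERSION instead.
  private def sparkVersionAtLeast(version: String, target: String): Boolean = {
    def majorMinor(v: String): (Int, Int) = {
      val parts = v.split('.')
      val major = parts(0).takeWhile(_.isDigit).toInt
      val minor = parts.lift(1).map(_.takeWhile(_.isDigit)).filter(_.nonEmpty)
        .map(_.toInt).getOrElse(0)
      (major, minor)
    }
    val (maj, min) = majorMinor(version)
    val (tMaj, tMin) = majorMinor(target)
    maj > tMaj || (maj == tMaj && min >= tMin)
  }

  def saveToOrc(result: DataFrame, savePath: String, resultMaxRows: Int): Unit = {
    val sparkVersion = result.sparkSession.version
    // Per the patch: use zstd only on Spark 3.2+, fall back to zlib on older runtimes.
    val codec = if (sparkVersionAtLeast(sparkVersion, "3.2")) "zstd" else "zlib"
    // Rename columns to col0, col1, ... to avoid duplicate-column failures on write.
    val colNames = result.schema.indices.map(i => s"col$i")
    val renamed = result.toDF(colNames: _*)
    // Apply the row limit before the write, mirroring the resultMaxRows branch above.
    val limited = if (resultMaxRows > 0) renamed.limit(resultMaxRows) else renamed
    limited.write.option("compression", codec).format("orc").save(savePath)
  }
}
```

The design point of the change itself is small: instead of hard-coding `"zstd"` as the ORC compression option, the codec is chosen once based on the Spark runtime version, so the same save path works on engines running Spark versions older than 3.2.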