andygrove commented on code in PR #1185:
URL: https://github.com/apache/datafusion-comet/pull/1185#discussion_r1893224341

##########
docs/source/user-guide/tuning.md:
##########
@@ -103,6 +103,12 @@ native shuffle currently only supports `HashPartitioning` and `SinglePartitionin
 To enable native shuffle, set `spark.comet.exec.shuffle.mode` to `native`. If this mode is explicitly set, then any shuffle operations that cannot be supported in this mode will fall back to Spark.

+### Shuffle Compression
+
+By default, Spark compresses shuffle files using LZ4 compression. Comet overrides this behavior with ZSTD compression.
+Compression can be disabled by setting `spark.shuffle.compress=false`, which may result in faster shuffle times in
+certain environments, such as single-node setups with fast NVMe drives, at the expense of increased disk space usage.

Review Comment:
We don't support LZ4 natively yet (there is a separate PR where I am working on adding this)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: github-h...@datafusion.apache.org