siying commented on PR #43338: URL: https://github.com/apache/spark/pull/43338#issuecomment-1759990922
> > The RocksDB Team recommend LZ4 or ZSTD ...
>
> Why choose lz4 instead of zstd? I suppose zstd is a more future-proofing algorithm

ZSTD has a good compression ratio but is slower. LZ4 is the faster of the two, with a worse compression ratio (similar to Snappy's). For Spark Structured Streaming, CPU is more of a bottleneck than I/O or space, so LZ4 is the better choice.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org