GitHub user maropu commented on the issue:
https://github.com/apache/spark/pull/17303
@Cyan4973 I quickly checked again:
```
scaleFactor: 4
AWS instance: c4.4xlarge
// In this bench, I used `local-cluster` (the benchmark above used `local`)
./bin/spark-shell --master local-cluster[4,4,7500] \
--conf spark.driver.memory=1g \
--conf spark.executor.memory=7g \
--conf spark.io.compression.codec=xxx
--- zstd (level=3)
Running execution q4-v1.4 iteration: 1, StandardRun=true
Execution time: 36.517211838s
Running execution q4-v1.4 iteration: 2, StandardRun=true
Execution time: 25.026869575s
Running execution q4-v1.4 iteration: 3, StandardRun=true
Execution time: 24.370711575s
--- zstd (level=1)
Running execution q4-v1.4 iteration: 1, StandardRun=true
Execution time: 29.654705815s
Running execution q4-v1.4 iteration: 2, StandardRun=true
Execution time: 20.638918335s
Running execution q4-v1.4 iteration: 3, StandardRun=true
Execution time: 19.928730758999997s
--- lz4
Running execution q4-v1.4 iteration: 1, StandardRun=true
Execution time: 27.422360631s
Running execution q4-v1.4 iteration: 2, StandardRun=true
Execution time: 17.38519278s
Running execution q4-v1.4 iteration: 3, StandardRun=true
Execution time: 15.779084563s
--- snappy
Running execution q4-v1.4 iteration: 1, StandardRun=true
Execution time: 27.476569521000002s
Running execution q4-v1.4 iteration: 2, StandardRun=true
Execution time: 16.438640631s
Running execution q4-v1.4 iteration: 3, StandardRun=true
Execution time: 14.949329456s
--- lzf
Running execution q4-v1.4 iteration: 1, StandardRun=true
Execution time: 27.853010073s
Running execution q4-v1.4 iteration: 2, StandardRun=true
Execution time: 17.431232532000003s
Running execution q4-v1.4 iteration: 3, StandardRun=true
Execution time: 15.916569896999999s
```
`zstd` was still slower than the others in this run.
I'm not sure, though; there might be cases where `zstd` beats the
others on larger data sets.
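
To make the gap easier to read, here is a rough sketch that averages the timings from the log above. Treating iteration 1 as warm-up and averaging only iterations 2 and 3 is an assumption on my side, and the names `times` and `warm_mean` are just for illustration:

```python
from statistics import mean

# Execution times in seconds, copied from the log above (q4-v1.4, iterations 1-3).
times = {
    "zstd (level=3)": [36.517211838, 25.026869575, 24.370711575],
    "zstd (level=1)": [29.654705815, 20.638918335, 19.928730758999997],
    "lz4": [27.422360631, 17.38519278, 15.779084563],
    "snappy": [27.476569521000002, 16.438640631, 14.949329456],
    "lzf": [27.853010073, 17.431232532000003, 15.916569896999999],
}

# Assumption: iteration 1 includes warm-up cost, so only iterations 2 and 3 are averaged.
warm_mean = {codec: mean(runs[1:]) for codec, runs in times.items()}

# Codec with the lowest warm average; used as the baseline for the ratios below.
fastest = min(warm_mean, key=warm_mean.get)

# Print codecs from fastest to slowest with their slowdown relative to the baseline.
for codec, t in sorted(warm_mean.items(), key=lambda kv: kv[1]):
    print(f"{codec:15s} {t:7.3f}s  x{t / warm_mean[fastest]:.2f} vs {fastest}")
```

This only summarizes three iterations of a single query at scaleFactor 4, so it should not be read as a general ranking of the codecs.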