Aaditya Ramesh created SPARK-19525:
--------------------------------------
Summary: Enable Compression of Spark Streaming Checkpoints
Key: SPARK-19525
URL: https://issues.apache.org/jira/browse/SPARK-19525
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 2.1.0
Reporter: Aaditya Ramesh
In our testing, compressing partitions with Snappy while writing them to
checkpoints on HDFS significantly improved performance and also reduced the
variability of the checkpointing operation. Checkpointing time dropped by
3x and variability by 2x for data sets of approximately 1 GB compressed
size.
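
As a rough illustration of the idea only (this is not Spark code and not
the proposed patch), the sketch below wraps the checkpoint output stream in
a compressor before serializing a partition's records. It makes several
assumptions: gzip at its fastest level stands in for Snappy because it
ships with the Python standard library, a local file stands in for HDFS,
and the names write_partition and read_partition are hypothetical.

```python
import gzip
import os
import pickle
import tempfile


def write_partition(path, records, compress=True):
    """Serialize one partition's records to `path` (hypothetical helper).

    gzip level 1 is an assumed stand-in for a fast codec such as Snappy.
    """
    if compress:
        out = gzip.open(path, "wb", compresslevel=1)
    else:
        out = open(path, "wb")
    with out:
        for record in records:
            pickle.dump(record, out)


def read_partition(path, compress=True):
    """Read back every record written by write_partition."""
    src = gzip.open(path, "rb") if compress else open(path, "rb")
    records = []
    with src:
        while True:
            try:
                records.append(pickle.load(src))
            except EOFError:
                # End of stream: every record has been read.
                break
    return records


if __name__ == "__main__":
    # Made-up, repetitive records purely for demonstration.
    sample = [("key-%d" % (i % 10), i % 7) for i in range(5000)]
    workdir = tempfile.mkdtemp()
    compressed = os.path.join(workdir, "part-00000.gz")
    raw = os.path.join(workdir, "part-00000.raw")
    write_partition(compressed, sample, compress=True)
    write_partition(raw, sample, compress=False)
    print("compressed bytes:", os.path.getsize(compressed))
    print("raw bytes:", os.path.getsize(raw))
```

Fewer bytes reach the file system in the compressed case, which is the
effect we believe accounts for the shorter checkpointing time; the actual
gain depends on the data and on the codec used.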