Hi,

I have ~500 GB of compressed data. I am repartitioning it because the underlying data is heavily skewed and is causing a lot of issues for the downstream jobs. During repartitioning the *shuffle writes* are not getting compressed, and because of this I am running into disk-space issues. The screenshot below (the Input and Shuffle Write columns) clearly depicts the problem.

I have proactively set the parameters below to true, but the intermediate shuffle data still isn't compressed:
spark.shuffle.compress
spark.shuffle.spill.compress

[image: Inline image 1 — screenshot showing the Input and Shuffle Write columns]

I am using Spark 1.5 (for various unavoidable reasons!!).

Any suggestions would be greatly appreciated.

Thanks,
Baahu
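For reference, a minimal sketch of how these settings can be supplied via spark-defaults.conf (or equivalently with --conf on spark-submit). Note that in Spark 1.5 both flags already default to true, and the codec used for shuffle compression is controlled separately by spark.io.compression.codec (snappy by default); the codec line here is an illustration, not something from the original message:

```properties
# spark-defaults.conf — shuffle compression settings (Spark 1.5)

# Compress map-side shuffle output files (default: true)
spark.shuffle.compress          true

# Compress data spilled to disk during shuffles (default: true)
spark.shuffle.spill.compress    true

# Codec applied to the above; snappy is the 1.5 default (illustrative)
spark.io.compression.codec      snappy
```

The same values can be passed per job, e.g. `spark-submit --conf spark.shuffle.compress=true ...`, or set on the `SparkConf` before the `SparkContext` is created; settings changed after the context starts have no effect.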