Did the problem go away when you switched to lz4? The default compression codec changed from 1.0 to 1.1: we went from LZF to Snappy. I don't think there was any such change from 1.1 to 1.2, though.
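If it helps to rule the default out, here's a minimal sketch of pinning the codec explicitly rather than relying on the version-dependent default (the app name is made up; the config key is the standard one):

    import org.apache.spark.{SparkConf, SparkContext}

    // Pin the codec used to compress shuffle outputs, spills, and broadcasts,
    // instead of relying on the default (LZF in 1.0, Snappy from 1.1 on).
    val conf = new SparkConf()
      .setAppName("codec-check") // hypothetical app name
      .set("spark.io.compression.codec", "lz4")
    val sc = new SparkContext(conf)

The same key accepts "lzf" and "snappy", so you can A/B the three codecs on the same job.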
On Fri, Feb 6, 2015 at 12:17 AM, Praveen Garg <praveen.g...@guavus.com> wrote:

> We tried changing the compression codec from snappy to lz4. It did
> improve the performance but we are still wondering why default options
> didn’t work as claimed.
>
> From: Raghavendra Pandey <raghavendra.pan...@gmail.com>
> Date: Friday, 6 February 2015 1:23 pm
> To: Praveen Garg <praveen.g...@guavus.com>
> Cc: "user@spark.apache.org" <user@spark.apache.org>
> Subject: Re: Shuffle read/write issue in spark 1.2
>
> Even I observed the same issue.
>
> On Fri, Feb 6, 2015 at 12:19 AM, Praveen Garg <praveen.g...@guavus.com>
> wrote:
>
>> Hi,
>>
>> While moving from spark 1.1 to spark 1.2, we are facing an issue where
>> Shuffle read/write has been increased significantly. We also tried running
>> the job by rolling back to spark 1.1 configuration where we set
>> spark.shuffle.manager to hash and spark.shuffle.blockTransferService to
>> nio. It did improve the performance a bit but it was still much worse than
>> spark 1.1. The scenario seems similar to the bug raised sometime back
>> https://issues.apache.org/jira/browse/SPARK-5081.
>> Has anyone come across any similar issue? Please tell us if any
>> configuration change can help.
>>
>> Regards, Praveen
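For anyone else attempting the rollback Praveen describes, a rough sketch of the 1.1-style settings on a 1.2 build (values taken straight from the quoted message):

    val conf = new SparkConf()
      // Spark 1.2 changed the defaults to "sort" and "netty";
      // these two lines restore the 1.1-era behaviour.
      .set("spark.shuffle.manager", "hash")
      .set("spark.shuffle.blockTransferService", "nio")

Note this only approximates the 1.1 code path, which matches the partial improvement the quoted thread reports.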