To confirm, lihu, are you using Spark version 1.2.0?

On Tue, Jan 13, 2015 at 9:26 PM, lihu <lihu...@gmail.com> wrote:
> Hi,
>
> I just tested the groupByKey method on 100GB of data. The cluster has 20
> machines, each with 125GB of RAM.
>
> At first I set conf.set("spark.shuffle.use.netty", "false") and ran
> the experiment. Then I set conf.set("spark.shuffle.use.netty", "true")
> and re-ran the experiment, but in the latter case the GC time is much
> higher.
>
> I thought the latter should be better, but it is not. So when should
> we use netty for network shuffle fetching?