I would also like to know the maximum value for spark.akka.frameSize, and I am wondering whether it affects the performance of reduceByKey(). Thanks!

2013/12/8 Matei Zaharia <[email protected]>

> Hey Matt,
>
> This setting shouldn’t really affect groupBy operations, because they
> don’t go through Akka. The frame size setting is for messages from the
> master to workers (specifically, sending out tasks), and for results that
> go directly from workers to the application (e.g. collect()). So it
> shouldn’t be a problem unless these are large. In Spark 0.8.1, results back
> to the master will be sent in a different way if they’re large, so the
> setting will only cover task sizes.
>
> Matei
>
> On Dec 7, 2013, at 10:20 PM, Matt Cheah <[email protected]> wrote:
>
> Hi everyone,
>
> I'm noticing like others that group-By operations with large sized
> groups gives Spark some trouble. Increasing the spark.akka.frameSize
> property alleviates it up to a point.
>
> I was wondering what the maximum setting for this value is. I've seen
> previous e-mails talking about the ramifications of turning up this value,
> but I was wondering what the actual maximum number that could be set for it
> is. I'll benchmark the performance hit accordingly.
>
> Thanks!
>
> -Matt Cheah

--
Shangyu, Luo
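
Matei's point is that the frame size only matters for serialized tasks and for results sent straight back to the application (e.g. collect()). As a rough illustration, the sketch below estimates whether a value would fit under a given frame size. It does not use Spark at all: pickle stands in for Spark's own serializer, the helper names are hypothetical, and the 10 MB default is an assumption taken from the Spark 0.8 configuration docs, so treat the numbers as a ballpark only.

```python
import pickle

# Assumption: spark.akka.frameSize is expressed in MB and defaults to 10.
DEFAULT_FRAME_SIZE_MB = 10


def serialized_size_bytes(obj):
    """Return the pickled size of obj (a stand-in for Spark's serializer)."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def fits_in_frame(obj, frame_size_mb=DEFAULT_FRAME_SIZE_MB):
    """Return True if the pickled obj is no larger than the frame size."""
    return serialized_size_bytes(obj) <= frame_size_mb * 1024 * 1024


if __name__ == "__main__":
    small_result = list(range(1000))         # a few KB once pickled
    large_result = bytes(20 * 1024 * 1024)   # about 20 MB once pickled

    print(fits_in_frame(small_result))                    # small collect() result
    print(fits_in_frame(large_result))                    # exceeds the assumed default
    print(fits_in_frame(large_result, frame_size_mb=32))  # fits after raising the limit
```

The real message also carries serializer and transport overhead, so a value that only just fits in this estimate may still be too large in practice.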
