As I said, it should not affect performance of transformations on RDDs, only of 
sending tasks to the workers and getting results back. In general, you want the 
Akka frame size to be as small as possible while still holding your largest 
task or result; as long as your application isn’t throwing an error due to the 
frame size being too small, you’re fine. Having a bigger frame size will result 
in wasted space and unneeded memory allocation for buffers. It doesn’t make the 
communication more efficient.
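
For reference, here is a minimal sketch of raising the frame size in a Spark 0.8-era application, where settings are read from Java system properties when the SparkContext is created. The value is in MB (the default is 10). The value 32, the "local" master, and the application name are placeholder assumptions for illustration, not recommendations.

```scala
import org.apache.spark.SparkContext

object FrameSizeExample {
  def main(args: Array[String]): Unit = {
    // Must be set before the SparkContext is constructed.
    // Value is in MB; 32 is an arbitrary illustrative choice.
    System.setProperty("spark.akka.frameSize", "32")

    // "local" master and the app name are placeholders.
    val sc = new SparkContext("local", "FrameSizeExample")
    try {
      // collect() sends results directly back to the application,
      // which is one of the message types the frame size limits.
      val n = sc.parallelize(1 to 1000).collect().length
      println(n)
    } finally {
      sc.stop()
    }
  }
}
```

If this runs without a frame-size error, there is no benefit to raising the value further.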

Matei

On Dec 8, 2013, at 12:57 PM, Shangyu Luo <[email protected]> wrote:

> I would like to know the maximum value for spark.akka.frameSize too, and I am 
> wondering whether it will affect the performance of reduceByKey().
> Thanks!
> 
> 
> 2013/12/8 Matei Zaharia <[email protected]>
> Hey Matt,
> 
> This setting shouldn’t really affect groupBy operations, because they don’t 
> go through Akka. The frame size setting is for messages from the master to 
> workers (specifically, sending out tasks), and for results that go directly 
> from workers to the application (e.g. collect()). So it shouldn’t be a 
> problem unless these are large. In Spark 0.8.1, results back to the master 
> will be sent in a different way if they’re large, so the setting will only 
> cover task sizes.
> 
> Matei
> 
> On Dec 7, 2013, at 10:20 PM, Matt Cheah <[email protected]> wrote:
> 
>> Hi everyone,
>> 
>> Like others, I'm noticing that groupBy operations with large groups give 
>> Spark some trouble. Increasing the spark.akka.frameSize property alleviates 
>> it up to a point.
>> 
>> What is the maximum setting for this value? I've seen previous e-mails 
>> about the ramifications of turning it up, but none that state the actual 
>> maximum that can be set. I'll benchmark the performance hit accordingly.
>> 
>> Thanks!
>> 
>> -Matt Cheah
> 
> -- 
> 
> Shangyu, Luo
> 
