Shuai:
Please take a look at:

http://blog.takipi.com/garbage-collectors-serial-vs-parallel-vs-cms-vs-the-g1-and-whats-new-in-java-8/



> On Apr 23, 2015, at 10:18 AM, Dean Wampler <deanwamp...@gmail.com> wrote:
> 
> JVM's often have significant GC overhead with heaps bigger than 64GB. You 
> might try your experiments with configurations below this threshold.
> 
> dean
> 
> Dean Wampler, Ph.D.
> Author: Programming Scala, 2nd Edition (O'Reilly)
> Typesafe
> @deanwampler
> http://polyglotprogramming.com
> 
>> On Thu, Apr 23, 2015 at 12:14 PM, Shuai Zheng <szheng.c...@gmail.com> wrote:
>> Hi All,
>> 
>>  
>> 
>> I am running some benchmark on r3*8xlarge instance. I have a cluster with 
>> one master (no executor on it) and one slave (r3*8xlarge).
>> 
>>  
>> 
>> My job has 1000 tasks in stage 0.
>> 
>>  
>> 
>> R3*8xlarge has 244G memory and 32 cores.
>> 
>>  
>> 
>> If I create 4 executors, each has 8 core+50G memory, each task will take 
>> around 320s-380s. And if I only use one big executor with 32 cores and 200G 
>> memory, each task will take 760s-900s.
>> 
>>  
>> 
>> And I check the log, looks like the minor GC takes much longer when using 
>> 200G memory:
>> 
>>  
>> 
>> 285.242: [GC [PSYoungGen: 29027310K->8646087K(31119872K)] 
>> 38810417K->19703013K(135977472K), 11.2509770 secs] [Times: user=38.95 
>> sys=120.65, real=11.25 secs]
>> 
>>  
>> 
>> And when it uses 50G memory, the minor GC takes only less than 1s.
>> 
>>  
>> 
>> I try to see what is the best way to configure the Spark. For some special 
>> reason, I tempt to use a bigger memory on single executor if no significant 
>> penalty on performance. But now looks like it is?
>> 
>>  
>> 
>> Anyone has any idea?
>> 
>>  
>> 
>> Regards,
>> 
>>  
>> 
>> Shuai
>> 
> 

Reply via email to