JVMs often have significant GC overhead with heaps larger than 64 GB. You
might try your experiments with configurations below this threshold.
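
For example, instead of one 200 GB executor you could split the node into
several smaller ones. A rough sketch (the flag names assume spark-submit on
YARN, and the numbers are only illustrative):

    spark-submit \
      --num-executors 4 \
      --executor-cores 8 \
      --executor-memory 50g \
      ... your application jar and args ...

With the standalone manager, the usual way to get several executors on one
node is to run several workers instead, e.g. SPARK_WORKER_INSTANCES=4 and
SPARK_WORKER_MEMORY=50g in spark-env.sh, combined with --executor-memory 50g
at submit time.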

dean

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
<http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
Typesafe <http://typesafe.com>
@deanwampler <http://twitter.com/deanwampler>
http://polyglotprogramming.com

On Thu, Apr 23, 2015 at 12:14 PM, Shuai Zheng <szheng.c...@gmail.com> wrote:

> Hi All,
>
> I am running some benchmarks on an r3.8xlarge instance. I have a cluster
> with one master (no executor on it) and one slave (r3.8xlarge).
>
> My job has 1000 tasks in stage 0.
>
> The r3.8xlarge has 244 GB of memory and 32 cores.
>
> If I create 4 executors, each with 8 cores and 50 GB of memory, each task
> takes around 320-380 s. If I instead use one big executor with 32 cores
> and 200 GB of memory, each task takes 760-900 s.
>
> I checked the logs, and it looks like minor GC takes much longer when
> using 200 GB of memory:
>
> 285.242: [GC [PSYoungGen: 29027310K->8646087K(31119872K)]
> 38810417K->19703013K(135977472K), 11.2509770 secs] [Times: user=38.95
> sys=120.65, real=11.25 secs]
>
> When using 50 GB of memory, each minor GC takes less than 1 s.
>
> I am trying to figure out the best way to configure Spark. For particular
> reasons, I would prefer to use more memory on a single executor if there
> is no significant performance penalty. But now it looks like there is?
>
> Does anyone have any ideas?
>
> Regards,
>
> Shuai
>
