FWIW, I ran into a similar issue on r3.8xlarge nodes and opted for more, smaller executors. Another observation: a single large executor also resulted in lower overall read throughput from S3 (using Amazon's EMRFS implementation), in case that matters to your application.

-Sven
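P.S. In case it's useful, here's a rough sketch of the layout I mean, as a Scala driver (assuming YARN, where spark.executor.cores / spark.executor.memory control executor size; the app and object names are made up, and the GC-logging flags are optional):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch: request 8-core / 50 GB executors so that a 32-core, 244 GB
    // node runs four of them instead of one huge one.
    object SmallExecutorsSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("small-executors-sketch")  // hypothetical app name
          .set("spark.executor.cores", "8")      // cores per executor
          .set("spark.executor.memory", "50g")   // keeps each heap well below ~64 GB
          // Optional: GC logging on the executors, to compare pause times
          // across layouts (this produces the [PSYoungGen ...] lines below).
          .set("spark.executor.extraJavaOptions",
            "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
        val sc = new SparkContext(conf)
        // ... your job here ...
        sc.stop()
      }
    }

The same properties can of course go in spark-defaults.conf instead of the driver code.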
On Thu, Apr 23, 2015 at 10:18 AM, Dean Wampler <deanwamp...@gmail.com> wrote:
> JVMs often have significant GC overhead with heaps bigger than 64 GB. You
> might try your experiments with configurations below this threshold.
>
> dean
>
> Dean Wampler, Ph.D.
> Author: Programming Scala, 2nd Edition
> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
> Typesafe <http://typesafe.com>
> @deanwampler <http://twitter.com/deanwampler>
> http://polyglotprogramming.com
>
> On Thu, Apr 23, 2015 at 12:14 PM, Shuai Zheng <szheng.c...@gmail.com> wrote:
>> Hi All,
>>
>> I am running some benchmarks on an r3.8xlarge instance. I have a cluster
>> with one master (no executor on it) and one slave (r3.8xlarge).
>>
>> My job has 1000 tasks in stage 0.
>>
>> An r3.8xlarge has 244 GB of memory and 32 cores.
>>
>> If I create 4 executors, each with 8 cores + 50 GB of memory, each task
>> takes around 320-380 s. If I instead use one big executor with 32 cores
>> and 200 GB of memory, each task takes 760-900 s.
>>
>> Checking the logs, it looks like minor GC takes much longer when using
>> 200 GB of memory:
>>
>> 285.242: [GC [PSYoungGen: 29027310K->8646087K(31119872K)]
>> 38810417K->19703013K(135977472K), 11.2509770 secs] [Times: user=38.95
>> sys=120.65, real=11.25 secs]
>>
>> With 50 GB of memory, a minor GC takes less than 1 s.
>>
>> I am trying to work out the best way to configure Spark. For some
>> specific reasons, I am tempted to use more memory on a single executor
>> if there is no significant performance penalty. But now it looks like
>> there is one?
>>
>> Does anyone have any ideas?
>>
>> Regards,
>>
>> Shuai

--
www.skrasser.com <http://www.skrasser.com/?utm_source=sig>