Re: Larger heap leads to perf degradation due to GC

2014-10-16 Thread Akshat Aranya
I just want to pitch in and say that I ran into the same problem with
running with 64GB executors.  For example, some of the tasks take 5 minutes
to execute, out of which 4 minutes are spent in GC.  I'll try out smaller
executors.
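
For anyone who wants to see the same breakdown: per-task GC time shows up in
the Spark UI's task table, and the raw pauses can be logged by passing the
standard HotSpot GC-logging flags to the executors. A sketch of one way to do
that (assuming Spark 1.x and spark-defaults.conf; adjust to your setup):

  spark.executor.extraJavaOptions  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps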

On Mon, Oct 6, 2014 at 6:35 PM, Otis Gospodnetic otis.gospodne...@gmail.com
 wrote:

 Hi,

 The other option to consider is using G1 GC, which should behave better
 with large heaps.  But pointers are not compressed in heaps larger than 32
 GB in size, so you may be better off staying under 32 GB.
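
 For anyone who wants to try it, a minimal sketch of turning on G1 for the
 executors (assuming Spark 1.x and spark-defaults.conf) would be something
 like:

   spark.executor.extraJavaOptions  -XX:+UseG1GC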

 Otis
 --
 Monitoring * Alerting * Anomaly Detection * Centralized Log Management
 Solr & Elasticsearch Support * http://sematext.com/


 On Mon, Oct 6, 2014 at 8:08 PM, Mingyu Kim m...@palantir.com wrote:

 Ok, cool. This seems to be general issues in JVM with very large heaps. I
 agree that the best workaround would be to keep the heap size below 32GB.
 Thanks guys!

 Mingyu

 From: Arun Ahuja aahuj...@gmail.com
 Date: Monday, October 6, 2014 at 7:50 AM
 To: Andrew Ash and...@andrewash.com
 Cc: Mingyu Kim m...@palantir.com, user@spark.apache.org, Dennis Lawler
 dlaw...@palantir.com
 Subject: Re: Larger heap leads to perf degradation due to GC

 We have used the strategy that you suggested, Andrew - using many workers
 per machine and keeping the heaps small (< 20gb).

 Using a large heap resulted in workers hanging or not responding (leading
 to timeouts).  The same dataset/job for us will fail (most often due to
 Akka disassociation or fetch failure errors) with 10 cores / 100 executors /
 60 gb per executor, while it succeeds with 1 core / 1000 executors / 6gb per
 executor.

 When the job does succeed with more cores per executor and a larger heap,
 it is usually much slower than with the smaller executors (the same 8-10 min
 job taking 15-20 min to complete).

 The unfortunate downside of this has been that we have some large
 broadcast variables which may not fit into memory (and are unnecessarily
 duplicated across executors) when using the smaller executors.
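
 To put rough numbers on it: a 4gb broadcast (purely illustrative, not a
 measurement from our jobs) copied to 1000 small executors is ~4tb of
 aggregate memory and most of each 6gb heap, whereas on 100 large executors
 it is ~400gb in aggregate and only a small fraction of each 60gb heap.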

 Most of this is anecdotal, but overall we have had more success and
 consistency with more executors and smaller memory per executor.

 On Sun, Oct 5, 2014 at 7:20 PM, Andrew Ash and...@andrewash.com wrote:

 Hi Mingyu,

 Maybe we should be limiting our heaps to 32GB max and running multiple
 workers per machine to avoid large GC issues.

 For a machine with 128GB of memory and 32 cores, this could look like:

 SPARK_WORKER_INSTANCES=4
 SPARK_WORKER_MEMORY=32g
 SPARK_WORKER_CORES=8
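
 Each executor's heap would then also need to fit inside a single worker's
 allocation, e.g. something like this in spark-defaults.conf (a sketch; the
 exact headroom is up to you):

   spark.executor.memory  30g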

 Are people running with large (32GB+) executor heaps in production?  I'd
 be curious to hear if so.

 Cheers!
 Andrew

 On Thu, Oct 2, 2014 at 1:30 PM, Mingyu Kim m...@palantir.com wrote:

 This issue definitely needs more investigation, but I just wanted to
 quickly check if anyone has run into this problem or has general guidance
 around it. We’ve seen a performance degradation with a large heap on a
 simple map task (i.e. no shuffle). We’ve seen the slowness starting at
 around a 50GB heap (i.e. spark.executor.memory=50g), and when we checked
 the CPU usage, there were just a lot of GCs going on.

 Has anyone seen a similar problem?

 Thanks,
 Mingyu







Re: Larger heap leads to perf degradation due to GC

2014-10-06 Thread Arun Ahuja
We have used the strategy that you suggested, Andrew - using many workers
per machine and keeping the heaps small (< 20gb).

Using a large heap resulted in workers hanging or not responding (leading
to timeouts).  The same dataset/job for us will fail (most often due to
Akka disassociation or fetch failure errors) with 10 cores / 100 executors /
60 gb per executor, while it succeeds with 1 core / 1000 executors / 6gb per
executor.

When the job does succeed with more cores per executor and a larger heap, it
is usually much slower than with the smaller executors (the same 8-10 min job
taking 15-20 min to complete).

The unfortunate downside of this has been that we have some large broadcast
variables which may not fit into memory (and are unnecessarily duplicated
across executors) when using the smaller executors.

Most of this is anecdotal, but overall we have had more success and
consistency with more executors and smaller memory per executor.

On Sun, Oct 5, 2014 at 7:20 PM, Andrew Ash and...@andrewash.com wrote:

 Hi Mingyu,

 Maybe we should be limiting our heaps to 32GB max and running multiple
 workers per machine to avoid large GC issues.

 For a machine with 128GB of memory and 32 cores, this could look like:

 SPARK_WORKER_INSTANCES=4
 SPARK_WORKER_MEMORY=32g
 SPARK_WORKER_CORES=8

 Are people running with large (32GB+) executor heaps in production?  I'd
 be curious to hear if so.

 Cheers!
 Andrew

 On Thu, Oct 2, 2014 at 1:30 PM, Mingyu Kim m...@palantir.com wrote:

 This issue definitely needs more investigation, but I just wanted to
 quickly check if anyone has run into this problem or has general guidance
 around it. We’ve seen a performance degradation with a large heap on a
 simple map task (i.e. no shuffle). We’ve seen the slowness starting at
 around a 50GB heap (i.e. spark.executor.memory=50g), and when we checked
 the CPU usage, there were just a lot of GCs going on.

 Has anyone seen a similar problem?

 Thanks,
 Mingyu





Re: Larger heap leads to perf degradation due to GC

2014-10-05 Thread Andrew Ash
Hi Mingyu,

Maybe we should be limiting our heaps to 32GB max and running multiple
workers per machine to avoid large GC issues.

For a machine with 128GB of memory and 32 cores, this could look like:

SPARK_WORKER_INSTANCES=4
SPARK_WORKER_MEMORY=32g
SPARK_WORKER_CORES=8

Are people running with large (32GB+) executor heaps in production?  I'd be
curious to hear if so.

Cheers!
Andrew

On Thu, Oct 2, 2014 at 1:30 PM, Mingyu Kim m...@palantir.com wrote:

 This issue definitely needs more investigation, but I just wanted to
 quickly check if anyone has run into this problem or has general guidance
 around it. We’ve seen a performance degradation with a large heap on a
 simple map task (i.e. no shuffle). We’ve seen the slowness starting at
 around a 50GB heap (i.e. spark.executor.memory=50g), and when we checked
 the CPU usage, there were just a lot of GCs going on.

 Has anyone seen a similar problem?

 Thanks,
 Mingyu