On Wed, Apr 30, 2014 at 1:52 PM, wxhsdp <wxh...@gmail.com> wrote:

> Hi, guys
>
>   I want to do some optimization of my Spark code. I use VisualVM to
> monitor the executor while running the app.
>   Here's the snapshot:
> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n5107/executor.png>
>
> From the snapshot I can get the memory usage of the executor, but the
> executor runs many tasks. Is it possible to get the memory usage of a
> single task in the JVM, with GC running in the background?
>

I guess you could run 1-core slaves. That way each of them would only work
on one task at a time.
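
Here is a minimal sketch of that, assuming standalone mode (the app name is
just a placeholder). Capping the application at a single core via
spark.cores.max means at most one task runs at a time, so the executor heap
you see in VisualVM approximates one task's usage; setting
SPARK_WORKER_CORES=1 in spark-env.sh would do the same thing per slave.

import org.apache.spark.{SparkConf, SparkContext}

// At most one core => at most one task in flight, so the executor's
// heap roughly reflects a single task (plus GC and framework overhead).
val conf = new SparkConf()
  .setAppName("single-task-profiling") // placeholder name
  .set("spark.cores.max", "1")
val sc = new SparkContext(conf)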

> By the way, you can see that every time memory usage reaches about 90%,
> the JVM does a GC. I'm a little confused about that. I originally thought
> that 60% of the memory is reserved for Spark's memory cache (I did not
> cache any RDDs in my application), so only 40% was left for running the
> app.
>

The way I understand it, Spark does not keep tight control over memory.
Your code running on the executor can easily use more than 40% of the heap.
Spark only limits the memory used for RDD caching and shuffles. If the RDD
cache is full, taking up 60% of the heap, and your code takes more than the
remaining 40% (even after GC), the executor will die with an OOM.
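
For reference, the knobs involved (assuming the Spark 1.x property names)
are spark.storage.memoryFraction, which defaults to 0.6 and is where your
60% figure comes from, and spark.shuffle.memoryFraction, which defaults to
0.2. Since you don't cache any RDDs, you could shrink the storage fraction
to leave more heap for your own code, along these lines:

import org.apache.spark.SparkConf

// Shrinking the cache region frees heap for user code. Note these
// fractions only bound caching and shuffle buffers; Spark does not
// limit what your own functions allocate.
val conf = new SparkConf()
  .set("spark.storage.memoryFraction", "0.2") // default is 0.6
  .set("spark.shuffle.memoryFraction", "0.2") // the default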

I suppose there is not much Spark can do about this. On the JVM you cannot
control how much memory a function you call is allowed to use.
