Well, from an OS perspective, off-heap memory is still visible under the
JVM process (you see the memory consumption of the JVM process growing when
using off-heap memory). There is one exception: another process which has
not been started by the JVM and "lives" outside the JVM, but uses IPC to
communicate with it. I do not assume this is the case for Spark.
@xms/xmx you are right here, this is just about heap memory. You may be
able to limit the memory of the JVM process (and thus, under the assumption
described above, the off-heap memory too) by using cgroups; whether this
should be done needs some thought.
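As a sketch of the cgroups idea, assuming a systemd-based Linux host (the memory limit, heap size, and jar name below are placeholders, not values from this thread):

```shell
# Run the JVM inside a transient systemd scope with a hard memory cap.
# MemoryMax applies to the whole process, i.e. heap *and* off-heap
# allocations, unlike -Xmx which only bounds the heap.
# 4G, -Xmx2g, and my-app.jar are illustrative placeholders.
systemd-run --scope -p MemoryMax=4G \
  java -Xmx2g -jar my-app.jar

# Roughly equivalent with the raw cgroup v1 tooling (libcgroup):
# cgcreate -g memory:/jvmlimit
# echo $((4 * 1024 ** 3)) > /sys/fs/cgroup/memory/jvmlimit/memory.limit_in_bytes
# cgexec -g memory:/jvmlimit java -Xmx2g -jar my-app.jar
```

Note that when the cap is hit, the kernel's OOM killer terminates the process outright rather than raising a Java-level error.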
On Thu, Sep 22, 2016 at 5:09 AM, Sean Owen <so...@cloudera.com> wrote:
> No, Xmx only controls the maximum size of on-heap allocated memory.
> The JVM doesn't manage/limit off-heap (how could it? it doesn't know
> when it can be released).
> The answer is that YARN will kill the process because it's using more
> memory than it asked for. A JVM is always going to use a little
> off-heap memory by itself, so setting a max heap size of 2GB means the
> JVM process may use a bit more than 2GB of memory. With an off-heap
> intensive app like Spark it can be a lot more.
> There's a built-in 10% overhead, so that if you ask for a 3GB executor
> it will ask for 3.3GB from YARN. You can increase the overhead.
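For reference, the overhead Sean describes was configurable in Spark of that era via `spark.yarn.executor.memoryOverhead` (in MiB), with a default of max(384 MB, 10% of executor memory). A hedged sketch; the sizes and job name are illustrative only:

```shell
# Request a 3 GB executor heap plus an explicit 1 GiB of off-heap headroom
# from YARN, instead of the default 10% (here ~307 MB, floored at 384 MB).
# Property name per Spark 1.x/2.x-era docs; my_job.py is a placeholder.
spark-submit \
  --master yarn \
  --executor-memory 3g \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  my_job.py
```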
> On Wed, Sep 21, 2016 at 11:41 PM, Jörn Franke <jornfra...@gmail.com>
> wrote:
> > All off-heap memory is still managed by the JVM process. If you limit the
> > memory of this process then you limit the memory. I think the memory of
> > the JVM process could be limited via the xms/xmx parameters of the JVM.
> > This can be configured via Spark options for YARN (be aware that they are
> > different in cluster and client mode), but I recommend to use the Spark
> > options for the off-heap maximum.
> > https://spark.apache.org/docs/latest/running-on-yarn.html
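The "Spark options for the off-heap maximum" referred to above are, assuming Spark 1.6+, `spark.memory.offHeap.enabled` and `spark.memory.offHeap.size`. A sketch with placeholder sizes:

```shell
# Cap Tungsten's off-heap allocations explicitly (Spark 1.6+).
# spark.memory.offHeap.size is specified in bytes (1073741824 = 1 GiB);
# it is *in addition to* executor heap memory, so the YARN container
# request should leave room for it. Sizes and my_job.py are placeholders.
spark-submit \
  --master yarn \
  --executor-memory 2g \
  --conf spark.memory.offHeap.enabled=true \
  --conf spark.memory.offHeap.size=1073741824 \
  my_job.py
```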
> > On 21 Sep 2016, at 22:02, Michael Segel <msegel_had...@hotmail.com>
> > wrote:
> > I've asked this question a couple of times from a friend who didn't know
> > the answer… so I thought I would try here.
> > Suppose we launch a job on a cluster (YARN) and we have set up the
> > containers to be 3GB in size.
> > What does that 3GB represent?
> > I mean, what happens if we end up using 2-3GB of off-heap storage via
> > Tungsten?
> > What will Spark do?
> > Will it try to honor the container's limits and throw an exception, or
> > will it allow my job to grab that amount of memory and exceed YARN's
> > expectations since it's off heap?
> > Thx
> > -Mike