I’ve asked a friend this question a couple of times and he didn’t know the
answer… so I thought I would try here.
Suppose we launch a job on a YARN cluster and we have configured the
containers to be 3GB in size.
What does that 3GB represent?
I mean, what happens if we end up using 2–3GB of off-heap storage via Tungsten?
What will Spark do?
Will it try to honor the container’s limit and throw an exception, or will it
allow my job to grab that amount of memory and exceed YARN’s expectations since
it’s off-heap?