Hello,

I was looking for guidelines on what value to set executor memory to
(via spark.executor.memory for example).

This seems to be important to avoid OOM during tasks, especially in no swap
environments (like AWS EMR clusters).

This setting is really about the executor JVM heap. Hence, in order to come
up with the maximal amount of heap memory for the executor, we need to list:
1. the memory taken by other processes (Worker in standalone mode, ...)
2. all off-heap allocations in the executor

Fortunately, for #1, we can just look at memory consumption without any
application running.

For #2, it is trickier. What I suspect we should account for:
a. thread stack size
b. akka buffers (via akka framesize & number of akka threads)
c. kryo buffers
d. shuffle buffers
(e. tachyon)

Could anyone shed some light on this? Maybe a formula? Or maybe swap should
actually be turned on, as a safeguard against OOMs?

Thanks

Reply via email to