We are planning to use servers with varying specs (32 GB, 64 GB, 244 GB of RAM
or even higher, and varying core counts) for a standalone Spark deployment,
but we do not know a server's spec ahead of time. We need to script some
logic that runs on the server at boot and automatically sets the following
parameters based on what it reads from the OS about cores and memory:

SPARK_WORKER_CORES
SPARK_WORKER_MEMORY
SPARK_WORKER_INSTANCES

What could the script's logic be, based on the memory size and number of
cores it sees? In other words, what are the recommended rules of thumb for
dividing up a server (especially one with a large amount of RAM) without
knowing the Spark application or the data size ahead of time?
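For illustration, here is a rough sketch of the kind of boot script I have in mind. The specific thresholds it uses (reserve 1 core and 1 GB for the OS, cap each worker at roughly 64 GB of heap so large-RAM boxes are split into several workers) are placeholder assumptions of mine, not official Spark guidance:

```shell
#!/bin/sh
# Sketch: derive Spark standalone worker settings from the hardware found
# at boot. Thresholds (1 core / 1 GB reserved for the OS, ~64 GB cap per
# worker JVM) are placeholder rules of thumb, not Spark recommendations.

derive_spark_env() {
    total_cores=$1   # core count reported by the OS
    total_mem_gb=$2  # total RAM in GB

    usable_cores=$((total_cores - 1))
    [ "$usable_cores" -lt 1 ] && usable_cores=1
    usable_mem_gb=$((total_mem_gb - 1))
    [ "$usable_mem_gb" -lt 1 ] && usable_mem_gb=1

    # One worker per 64 GB of usable RAM (rounded up), at least one,
    # so no single worker JVM gets an enormous heap.
    instances=$(((usable_mem_gb + 63) / 64))
    [ "$instances" -lt 1 ] && instances=1

    cores_per_worker=$((usable_cores / instances))
    [ "$cores_per_worker" -lt 1 ] && cores_per_worker=1
    mem_per_worker_gb=$((usable_mem_gb / instances))

    echo "export SPARK_WORKER_CORES=$cores_per_worker"
    echo "export SPARK_WORKER_MEMORY=${mem_per_worker_gb}g"
    echo "export SPARK_WORKER_INSTANCES=$instances"
}

# On Linux, detect the hardware and print spark-env.sh settings:
if [ -r /proc/meminfo ] && command -v nproc >/dev/null 2>&1; then
    derive_spark_env "$(nproc)" \
        "$(($(awk '/MemTotal/ {print $2}' /proc/meminfo) / 1048576))"
fi
```

For a 32-core, 244 GB machine this would produce 4 workers with 7 cores and 60 GB each; whether that split is sensible is exactly what I am asking about.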

Thanks,
Mike
