Hi,

I have a three-node k8s cluster (GKE) in Google Cloud with E2
standard machines that have 4 GB of system memory per vCPU, giving 4 vCPUs
and 16,384 MB of RAM per node.

Optimal sizing of the number of executors and of their CPU and memory
allocation is important here. These are the assumptions:

   1. You want to fit exactly one Spark executor pod per Kubernetes node
   2. You should not starve the node OS, networking, etc. of CPU
   3. If you have 3 nodes, one node should be allocated to the driver and
   two nodes to the executors
   4. Regardless, you want to execute the code in k8s as fast as possible

I don't think that, with the current architecture, one can force the driver
node to accommodate both the driver and one executor at the same time. I did
some tests and looked at the available discussions here
<https://spark.apache.org/docs/latest/running-on-kubernetes.html> and here
<https://www.datamechanics.co/blog-post/setting-up-managing-monitoring-spark-on-kubernetes>.
One can fine-tune various parameters, but these seem to work fine:
          --conf spark.executor.instances=2 \
          --conf spark.driver.cores=3 \
          --conf spark.executor.cores=3 \
          --conf spark.driver.memory=8000m \
          --conf spark.executor.memory=8000m \
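
For what it's worth, the settings above can be derived from the node shape
with a few lines of Python. This is only a sketch of the rule of thumb used in
this email (one node for the driver, one executor per remaining node, one vCPU
left to the OS, about half of the node memory). The function name
size_spark_on_k8s and its parameters are hypothetical and not part of Spark.

```python
def size_spark_on_k8s(nodes, vcpus_per_node, mem_mb_per_node,
                      reserved_vcpus=1, mem_fraction=0.5):
    """Rule-of-thumb sizing: one driver node, one executor per other node."""
    if nodes < 2:
        raise ValueError("need one node for the driver and at least one for an executor")
    cores = vcpus_per_node - reserved_vcpus  # leave CPU for the node OS
    if cores < 1:
        raise ValueError("no vCPU left for Spark after the reservation")
    memory_mb = int(mem_mb_per_node * mem_fraction)  # take about half of the RAM
    return {
        "spark.executor.instances": str(nodes - 1),
        "spark.driver.cores": str(cores),
        "spark.executor.cores": str(cores),
        "spark.driver.memory": f"{memory_mb}m",
        "spark.executor.memory": f"{memory_mb}m",
    }


# Three E2 nodes with 4 vCPUs and 16,384 MB each, as in this cluster.
conf = size_spark_on_k8s(nodes=3, vcpus_per_node=4, mem_mb_per_node=16384)
for key, value in conf.items():
    print(f"--conf {key}={value} \\")
```

Note that exactly half of 16,384 MB is 8192m; the 8000m used in the settings
above is simply that figure rounded down.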

What I am suggesting here is to leave 1 vCPU out of the 4 vCPUs on each node
to the OS. It is also a safer bet to grab only half of the memory available
on each node for the driver and executors. Your mileage may vary, because if
you try to allocate more memory, it will take longer for the driver and
executor pods to spin up (they stay in ContainerCreating longer), meaning
that the overall execution time will be longer. This could be offset if you
are running a long job and you care more about having more memory available
than about the ContainerCreating time. It would be interesting to hear from
others who have tried a similar configuration and what their experience was.
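
One related point when picking the memory figure: the pod that Kubernetes
schedules is larger than spark.executor.memory, because Spark adds a memory
overhead on top of it. As far as I know, the default for JVM jobs is 10% of
the executor memory with a floor of 384 MiB, and a larger factor applies to
non-JVM (e.g. PySpark) jobs. The sketch below assumes those defaults; the
helper name pod_memory_mib is hypothetical, and the node's allocatable memory
on GKE will be somewhat below the full 16,384 MB.

```python
def pod_memory_mib(executor_memory_mib, overhead_factor=0.10, min_overhead_mib=384):
    """Approximate pod memory request: executor memory plus Spark's overhead."""
    # Assumed defaults: 10% overhead for JVM jobs, never less than 384 MiB.
    overhead = max(int(executor_memory_mib * overhead_factor), min_overhead_mib)
    return executor_memory_mib + overhead


node_mib = 16384                 # total RAM per node, before GKE reservations
request = pod_memory_mib(8000)   # matches spark.executor.memory=8000m
print(f"pod request: {request} MiB, left on node: {node_mib - request} MiB")
```

With 8000m per executor this still leaves a comfortable margin on each node,
which is in line with the "half of the memory" rule of thumb.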


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>



