Hi,

Right, you seem to have an on-premises Hadoop cluster of 9 physical boxes and
you want to deploy Spark on it.

What spec do you have for each physical host: memory, CPU and disk space?

You can take advantage of what is known as data affinity by putting your
compute layer (Spark) on the same Hadoop nodes that store the data in HDFS.
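
For instance (just a sketch, assuming the standard Hadoop CLI tools and
sufficient privileges on an edge node), you can confirm that the YARN
NodeManagers and the HDFS DataNodes run on the same hosts, which is what lets
Spark executors read their HDFS blocks locally:

   # hosts running YARN NodeManagers (where Spark executors will be placed)
   yarn node -list

   # hosts running HDFS DataNodes (where the HDFS blocks live)
   hdfs dfsadmin -report | grep '^Name:'

If the two lists cover the same hosts, Spark on YARN gets data locality for
free.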

Having Hadoop implies that you already have the YARN resource manager plus
HDFS. YARN is the most widely used resource manager on-premises for Spark.
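
As an illustration only (not a tested command; the resource numbers are
placeholders you would size to your hosts, and it assumes a standard Spark
distribution with HADOOP_CONF_DIR pointing at your cluster config), submitting
a job to YARN in cluster mode looks roughly like this:

   # example only - tune executors/memory/cores to your 9 hosts
   spark-submit \
     --master yarn \
     --deploy-mode cluster \
     --num-executors 8 \
     --executor-memory 8g \
     --executor-cores 4 \
     --class org.apache.spark.examples.SparkPi \
     $SPARK_HOME/examples/jars/spark-examples_*.jar 1000

YARN places those executors across the NodeManagers, i.e. next to the
DataNodes holding your data. If you want the number of workers to scale with
load, dynamic allocation (spark.dynamicAllocation.enabled together with the
YARN external shuffle service) gives you that without needing Docker.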

Provide some additional info and we can go from there.

HTH





view my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Sat, 24 Jul 2021 at 13:46, Dinakar Chennubotla <chennu.bigd...@gmail.com>
wrote:

> Hi All,
>
> I am Dinakar, a Hadoop admin.
> Could someone help me here?
>
> 1. I have a DEV-POC task to do:
> 2. need to install a distributed Apache Spark cluster in cluster mode
> on Docker containers,
> 3. with scalable spark-worker containers.
> 4. We have a 9-node cluster with some other services or tools.
>
> Thanks,
> Dinakar
>
