Hi all,

What is the best way to install a Spark cluster alongside a Hadoop
cluster? Any recommendation on the deployment topologies below would be a
great help.

*Also, is it necessary to put the Spark Workers on the DataNodes, so that
when a Worker reads a block from HDFS the block is local to that server?
Or can I put the Workers on other nodes, and if I do, will it affect the
performance of the Spark data processing?*

Hadoop Option 1

Server 1 - NameNode   & Spark Master
Server 2 - DataNode 1  & Spark Worker
Server 3 - DataNode 2  & Spark Worker
Server 4 - DataNode 3  & Spark Worker

Hadoop Option 2

Server 1 - NameNode
Server 2 - Spark Master & DataNode 1
Server 3 - DataNode 2
Server 4 - DataNode 3
Server 5 - Spark Worker 1
Server 6 - Spark Worker 2
Server 7 - Spark Worker 3
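For what it's worth, in Spark standalone mode the two options differ only in
which hostnames are listed as workers: the worker hosts go into conf/workers
(conf/slaves on older Spark releases) on the master node. A minimal sketch for
Option 2, using placeholder hostnames server1..server7 (not from any real
cluster), might look like:

```shell
# conf/spark-env.sh on every node (hostname is a placeholder)
SPARK_MASTER_HOST=server2      # Spark Master lives on Server 2 in Option 2

# conf/workers on the master node -- one worker hostname per line.
# For Option 1 you would list the DataNode hosts (server2..server4) here
# instead, co-locating Workers with DataNodes.
server5
server6
server7

# Then, from the master node, start the whole standalone cluster:
#   sbin/start-all.sh
# which launches the master locally and SSHes to each listed worker host.
```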

Thanks.
