Ashok,

The cluster nodes have enough memory, but the CPU core count is low relative to it: 512 GB / 16 cores = 32 GB per core. Either there should be more cores available so the memory can be used efficiently, or don't configure a very large executor memory, which will cause a lot of GC.
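For example, something like the following sketch (Scala, assuming Spark running on YARN; the application name, executor counts and memory figures are only illustrative) shows the kind of executor sizing I mean: several mid-sized executors per node rather than one huge heap.

    import org.apache.spark.{SparkConf, SparkContext}

    // Illustrative sizing for nodes with 16 cores / 512 GB RAM:
    // 4 cores per executor gives 4 executors per node, each with a modest heap,
    // leaving headroom for the OS, the HDFS daemons and YARN overhead.
    val conf = new SparkConf()
      .setAppName("warehouse-etl")                        // hypothetical application name
      .set("spark.executor.cores", "4")                   // cores per executor
      .set("spark.executor.instances", "16")              // e.g. 4 executors x 4 worker nodes
      .set("spark.executor.memory", "24g")                // heap per executor; keep well under 32g to limit GC
      .set("spark.yarn.executor.memoryOverhead", "4096")  // off-heap overhead per executor in MB (YARN mode)

    val sc = new SparkContext(conf)

Keeping each executor heap well below 32 GB also lets the JVM use compressed object pointers and keeps GC pauses shorter than one huge heap per node would.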
Thanks,
Prabhu Joseph

On Fri, Mar 11, 2016 at 3:45 AM, Ashok Kumar <ashok34...@yahoo.com.invalid> wrote:
>
> Hi,
>
> We intend to use 5 servers which will be utilized for building a Big Data
> Hadoop data warehouse system (not using any proprietary distribution like
> Hortonworks or Cloudera or others).
> All servers' configurations are 512 GB RAM, 30 TB storage and 16 cores,
> Ubuntu Linux servers. Hadoop will be installed on all the servers/nodes.
> Server 1 will be used for the NameNode plus a DataNode as well. Server 2
> will be used for the standby NameNode & a DataNode. The rest of the
> servers will be used as DataNodes.
> Now we would like to install Spark on each server to create a Spark
> cluster. Is that a good thing to do, or should we buy additional hardware
> for Spark (minding cost here), or do we simply require additional memory
> to accommodate Spark as well please? In that case, how much memory would
> you recommend for each Spark node?
>
>
> thanks all
>