Hi Mich,

We have HDP 2.3.2, where Spark runs on 21 nodes, each with 250 GB of memory. Jobs run in yarn-client and yarn-cluster mode.
We have other teams using the same cluster to build their applications.

Regards,
Pradeep

> On May 15, 2016, at 1:37 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> Hi Pradeep,
>
> In your case, what type of cluster are we talking about? A standalone cluster?
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
>
>> On 15 May 2016 at 13:19, Mail.com <pradeep.mi...@mail.com> wrote:
>> Hi,
>>
>> I have seen multiple videos on Spark tuning which show how to determine the number of
>> cores, the number of executors, and the memory size for a job.
>>
>> In everything I have seen, it seems each job is given the maximum resources
>> allowed in the cluster.
>>
>> How do we factor in input size as well? If I am processing a 1 GB compressed
>> file, then I can live with, say, 10 executors rather than 21.
>>
>> Also, do we consider other jobs that could be running in the cluster? I will
>> use only 20 GB out of the available 300 GB, etc.
>>
>> Thanks,
>> Pradeep
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>
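
As a rough sketch of the sizing Pradeep is asking about (the executor counts, memory figures, and input path below are illustrative assumptions, not values from this thread): for roughly 1 GB of compressed input on a shared YARN cluster, a job can request a small slice of the cluster instead of the maximum, along these lines:

    import org.apache.spark.{SparkConf, SparkContext}

    // Size the job to the input, not to the cluster: ~10 modest executors
    // (about 20 GB total) instead of one large executor per node.
    val conf = new SparkConf()
      .setAppName("small-input-job")
      .set("spark.executor.instances", "10")  // illustrative; plenty for ~1 GB compressed
      .set("spark.executor.cores", "2")
      .set("spark.executor.memory", "2g")     // keeps headroom for other teams' jobs

    val sc = new SparkContext(conf)

    // A gzip file is not splittable, so it arrives as a single partition;
    // repartition after reading if all executors should share the work.
    val lines = sc.textFile("hdfs:///data/input.gz")  // hypothetical path
    val spread = lines.repartition(20)                // ~2 partitions per executor
    println(spread.count())

On a busy shared cluster, another option is to let YARN scale the executor count with the workload by enabling dynamic allocation (spark.dynamicAllocation.enabled=true together with the external shuffle service, spark.shuffle.service.enabled=true), rather than pinning a fixed number of executors per job.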