Overall the defaults are sensible, but you definitely have to look at your application and optimise a few of them. I mostly refer to the following links when a job is slow or failing, or when we add hardware and see that we are not utilizing it.
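There is no single formula that fits every workload, but a widely quoted rule of thumb (not from the official docs, just a common community heuristic) is: leave a core and some memory per node for the OS and daemons, use ~5 cores per executor, and reserve one executor for the YARN ApplicationMaster. A minimal sketch, assuming a hypothetical 10-node cluster with 16 cores and 64 GB RAM per node:

```shell
# Rule-of-thumb sizing sketch -- adjust for your own cluster and workload.
NODES=10           # worker nodes (assumed)
CORES_PER_NODE=16  # cores per worker (assumed)
MEM_PER_NODE=64    # GB RAM per worker (assumed)

# Leave 1 core per node for the OS and Hadoop/YARN daemons.
USABLE_CORES=$(( (CORES_PER_NODE - 1) * NODES ))

# ~5 cores per executor is a commonly cited sweet spot for HDFS I/O.
EXECUTOR_CORES=5
TOTAL_EXECUTORS=$(( USABLE_CORES / EXECUTOR_CORES ))

# Reserve one executor slot for the YARN ApplicationMaster.
NUM_EXECUTORS=$(( TOTAL_EXECUTORS - 1 ))

# Split each node's memory (minus ~1 GB for the OS) across its executors,
# then knock off ~7% for YARN's memory overhead allocation.
EXECUTORS_PER_NODE=$(( TOTAL_EXECUTORS / NODES ))
EXECUTOR_MEM=$(( (MEM_PER_NODE - 1) / EXECUTORS_PER_NODE * 93 / 100 ))

echo "spark-submit --num-executors $NUM_EXECUTORS \
  --executor-cores $EXECUTOR_CORES \
  --executor-memory ${EXECUTOR_MEM}g ..."
```

For these assumed numbers this yields 29 executors with 5 cores and 19g each. Treat it only as a starting point; the tuning and hardware-provisioning guides below explain when to deviate (e.g. smaller executors for GC pressure, fewer for cache-heavy jobs).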
http://spark.apache.org/docs/latest/tuning.html
http://spark.apache.org/docs/latest/hardware-provisioning.html
http://spark.apache.org/docs/latest/configuration.html

Thanks,
Sonal
Nube Technologies <http://www.nubetech.co>
<http://in.linkedin.com/in/sonalgoyal>

On Tue, Sep 12, 2017 at 2:40 AM, Aakash Basu <aakash.spark....@gmail.com> wrote:

> Hi,
>
> Can someone please clarify a little on how we should effectively calculate
> the parameters to be passed via spark-submit?
>
> Parameters such as:
>
> Cores, NumExecutors, DriverMemory, etc.
>
> Is there any generic calculation that can be applied to most kinds of
> clusters of different sizes, from a small 3-node cluster to 100s of nodes?
>
> Thanks,
> Aakash.