Overall the defaults are sensible, but you should still look at your
application and optimise a few of them. I mostly refer to the following
links when a job is slow or failing, or when we add hardware that we are
not fully utilizing.

http://spark.apache.org/docs/latest/tuning.html
http://spark.apache.org/docs/latest/hardware-provisioning.html
http://spark.apache.org/docs/latest/configuration.html
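
As a rough starting point, here is a sketch of the kind of back-of-the-envelope
calculation people usually mean, assuming a hypothetical cluster of 10 nodes
with 16 cores and 64 GB RAM each (all numbers are illustrative, adjust to your
own hardware):

# Reserve 1 core and ~1 GB per node for the OS and Hadoop daemons,
# leaving 15 usable cores and ~63 GB per node.
# Rule of thumb: ~5 cores per executor keeps HDFS client throughput healthy,
# so 15 / 5 = 3 executors per node, 30 cluster-wide, minus 1 for the
# YARN ApplicationMaster/driver = 29.
# Memory per executor: 63 GB / 3 = 21 GB, minus ~7% for
# spark.yarn.executor.memoryOverhead, so roughly 19 GB.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 29 \
  --executor-cores 5 \
  --executor-memory 19G \
  --driver-memory 4G \
  your-app.jar

Treat this as a first guess only; the right values depend on your job's
shuffle and caching behaviour, which is what the tuning guide above covers.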


Thanks,
Sonal
Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>



On Tue, Sep 12, 2017 at 2:40 AM, Aakash Basu <aakash.spark....@gmail.com>
wrote:

> Hi,
>
> Can someone please clarify how we should effectively calculate the
> parameters to be passed via spark-submit?
>
> Parameters as in -
>
> Cores, NumExecutors, DriverMemory, etc.
>
> Is there any generic calculation that works for most kinds of clusters,
> across different sizes from a small 3-node cluster to hundreds of nodes?
>
> Thanks,
> Aakash.
>
