If you set your driver memory too low, you are likely to hit an OOM
error.

You have not mentioned which Spark mode you are using (Local, Standalone,
YARN etc.).
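
The mode matters here: in client mode the driver JVM has already started
by the time your application code runs, so spark.driver.memory set in
SparkConf takes no effect and --driver-memory (or spark-defaults.conf)
must be used instead. A minimal PySpark sketch, with a purely
illustrative 2g value:

    from pyspark import SparkConf, SparkContext

    # Effective when the cluster launches the driver (cluster mode);
    # in client mode the driver JVM is already running, so pass
    # --driver-memory to spark-submit instead.
    conf = (SparkConf()
            .setAppName("driver-memory-sketch")
            .set("spark.driver.memory", "2g"))
    sc = SparkContext(conf=conf)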

HTH



Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 19 September 2016 at 23:48, Anand Viswanathan <
anand_v...@ymail.com.invalid> wrote:

> Thank you so much, Kevin.
>
> My data size is around 4 GB.
> I am not using collect(), take() or takeSample().
> At the final job, the number of tasks grows to around 200,000.
>
> Still, the driver crashes with OOM with the default --driver-memory of
> 1g, but the job succeeds if I specify 2g.
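>
> Since each finished task leaves some metadata on the driver, I wonder
> whether coalescing the final stage to fewer partitions would reduce that
> bookkeeping. A rough sketch of what I mean (the RDD and partition counts
> here are just placeholders):
>
>     from pyspark import SparkContext
>
>     sc = SparkContext(appName="coalesce-sketch")
>     rdd = sc.parallelize(range(10**6), numSlices=200000)  # many tiny tasks
>
>     # coalesce merges partitions without a full shuffle, so the final
>     # stage runs far fewer tasks and the driver tracks less metadata.
>     total = rdd.coalesce(200).sum()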
>
> Thanks and regards,
> Anand Viswanathan
>
> On Sep 19, 2016, at 4:00 PM, Kevin Mellott <kevin.r.mell...@gmail.com>
> wrote:
>
> Hi Anand,
>
> Unfortunately, there is not really a "one size fits all" answer to this
> question; however, here are some things that you may want to consider when
> trying different sizes.
>
>    - What is the size of the data you are processing?
>    - Whenever you invoke an action that requires ALL of the data to be
>    sent to the driver (such as collect), you'll need to ensure that your
>    memory setting can handle it (see the sketch after this list).
>    - What level of parallelization does your code support? The more
>    processing you can do on the worker nodes, the less your driver will need
>    to do.
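>
> As a driver-safe illustration of the collect point above, aggregating or
> writing out on the executors avoids pulling the full dataset into driver
> memory. The paths below are hypothetical:
>
>     from pyspark import SparkContext
>
>     sc = SparkContext(appName="driver-safe-sketch")
>     rdd = sc.textFile("hdfs:///data/input")  # hypothetical input path
>
>     # rows = rdd.collect()  # ships every record to the driver -- risky
>
>     n = rdd.count()  # only a single number comes back to the driver
>     rdd.saveAsTextFile("hdfs:///data/output")  # written by the executors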
>
> Related to these comments, keep in mind that the --executor-memory,
> --num-executors, and --executor-cores configurations can be useful when
> tuning the worker nodes. There is some great information in the Spark
> Tuning Guide (linked below) that you may find useful as well.
>
> http://spark.apache.org/docs/latest/tuning.html
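>
> If it is easier to keep these in code, the same knobs can be set through
> SparkConf. The values below are purely illustrative, and
> spark.executor.instances applies when running on YARN:
>
>     from pyspark import SparkConf, SparkContext
>
>     # SparkConf equivalents of --executor-memory, --num-executors
>     # and --executor-cores; tune the numbers to your cluster.
>     conf = (SparkConf()
>             .setAppName("executor-tuning-sketch")
>             .set("spark.executor.memory", "4g")
>             .set("spark.executor.instances", "10")
>             .set("spark.executor.cores", "2"))
>     sc = SparkContext(conf=conf)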
>
> Hope that helps!
> Kevin
>
> On Mon, Sep 19, 2016 at 9:32 AM, Anand Viswanathan <
> anand_v...@ymail.com.invalid> wrote:
>
>> Hi,
>>
>> Spark version: spark-1.5.2-bin-hadoop2.6, using PySpark.
>>
>> I am running a machine learning program, which runs perfectly when I
>> specify 2G for --driver-memory.
>> However, the program cannot run with the default 1G; the driver crashes
>> with an OOM error.
>>
>> What is the recommended configuration for --driver-memory? Please suggest.
>>
>> Thanks and regards,
>> Anand.
>>
>>
>
>
