blondowski,

How big are your JSON files? Could you post the Spark params or
configuration you're submitting with? That might point to what's going on.
I've added a couple of notes inline below, with sketches.
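
If you're not sure what's actually being applied, you can dump the effective
configuration from inside the job. A minimal sketch, assuming a live
SparkSession named spark:

    # Print every property the running session resolved, including values
    # inherited from spark-defaults.conf and spark-submit flags.
    for key, value in sorted(spark.sparkContext.getConf().getAll()):
        print(key + " = " + value)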

Thanks

On Thu, Jan 19, 2017 at 4:21 PM, blondowski <dan.blondow...@dice.com> wrote:

> Please bear with me... I'm fairly new to Spark. We're running PySpark 2.0.1
> on AWS EMR (a 6-node cluster with 475 GB of RAM).
>
> We have a job that creates a DataFrame from JSON files, does some
> manipulation (adds columns), and then calls a UDF.
>
> The job fails on the UDF call with: "Container killed by YARN for
> exceeding memory limits. 6.7 GB of 6.6 GB physical memory used. Consider
> boosting spark.yarn.executor.memoryOverhead."
>
> I've tried adjusting executor-memory to 48 GB, but that also failed.
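
Inline note: raising executor-memory only grows the JVM heap. A PySpark UDF
runs in separate Python worker processes on each executor, and their memory
counts against the YARN overhead allowance, not the heap; that's why the
error points at spark.yarn.executor.memoryOverhead. A sketch of raising it
(the property takes MB in 2.0.x; the numbers are illustrative, not tuned for
your cluster):

    from pyspark.sql import SparkSession

    # Illustrative values: reserve 4 GB of off-heap headroom per executor
    # for the Python workers, on top of an 8 GB JVM heap. Set these before
    # the SparkContext starts, or pass them via spark-submit --conf.
    spark = (SparkSession.builder
             .config("spark.yarn.executor.memoryOverhead", "4096")
             .config("spark.executor.memory", "8g")
             .getOrCreate())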
>
> What I've noticed is that while reading the JSON and creating the
> DataFrame, it uses 100+ executors and all of the memory on the cluster.
>
> When it gets to the part where it calls the UDF, it only allocates 3
> executors, and they die one by one.
> Can somebody please explain to me how the executors get allocated?
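
On the allocation question: EMR typically enables Spark dynamic allocation
by default, so the executor count follows the number of pending tasks rather
than staying fixed. If the stage that runs the UDF has only a few partitions,
you'll only get a few executors, and each one has to hold a large slice of
the data. A sketch of pinning the counts instead (values are illustrative,
not tuned for your cluster):

    from pyspark.sql import SparkSession

    # Illustrative: turn off dynamic allocation and request a fixed number
    # of executors, so the UDF stage gets the same resources as the JSON
    # read. Must be set before the SparkContext starts.
    spark = (SparkSession.builder
             .config("spark.dynamicAllocation.enabled", "false")
             .config("spark.executor.instances", "24")
             .getOrCreate())

It's also worth repartitioning the DataFrame before applying the UDF (e.g.
df.repartition(200)) so no single task holds too much data.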