Re: java.lang.OutOfMemoryError Spark Worker

2020-05-12 Thread Hrishikesh Mishra
Configuration:
- Driver memory we tried: 2GB / 4GB / 5GB
- Executor memory we tried: 4GB / 5GB
- Even reduced spark.memory.fraction to 0.2 (we are not using cache)
- VM memory: 32 GB, 8 cores
- SPARK_WORKER_MEMORY we tried: 30GB / 24GB
- SPARK_WORKER_CORES: 32 (because jobs are not CPU bound)
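For reference, settings like these are typically applied along the following lines; the values here are illustrative, not a recommendation:

    # In conf/spark-env.sh on each worker machine:
    export SPARK_WORKER_MEMORY=30g   # memory the worker can hand out to executors
    export SPARK_WORKER_CORES=32     # cores the worker can hand out to executors

    # Per-job settings on spark-submit:
    spark-submit \
      --driver-memory 4g \
      --executor-memory 5g \
      --conf spark.memory.fraction=0.2 \
      ...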

Re: java.lang.OutOfMemoryError Spark Worker

2020-05-08 Thread Russell Spitzer
The error is in the Spark Standalone Worker. It's hitting an OOM while launching/running an executor process. Specifically, it's running out of memory when parsing the Hadoop configuration while trying to figure out the env/command line to run.
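Note: since the OOM is in the Worker daemon itself rather than in the driver or an executor, the relevant heap is the one set by SPARK_DAEMON_MEMORY (default 1g); SPARK_WORKER_MEMORY only caps what the worker hands out to executors. A sketch, value illustrative:

    # In conf/spark-env.sh on the worker machine:
    export SPARK_DAEMON_MEMORY=4g   # heap for the Worker daemon JVM itself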

Re: java.lang.OutOfMemoryError Spark Worker

2020-05-08 Thread Hrishikesh Mishra
We submit the Spark job through the spark-submit command, like the one below:

    sudo /var/lib/pf-spark/bin/spark-submit \
      --total-executor-cores 30 \
      --driver-cores 2 \
      --class com.hrishikesh.mishra.Main \
      --master spark://XX.XX.XXX.19:6066 \
      --deploy-mode cluster \
      --supervise

Re: java.lang.OutOfMemoryError Spark Worker

2020-05-08 Thread Jacek Laskowski
Hi,

It's been a while since I worked with Spark Standalone, but I'd check the logs of the workers. How do you spark-submit the app? Did you check the /grid/1/spark/work/driver-20200508153502-1291 directory?

Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online
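In a standalone cluster the driver's stdout/stderr are redirected into that work directory, so inspecting it (path taken from the thread) usually surfaces the actual failure:

    ls -l /grid/1/spark/work/driver-20200508153502-1291
    tail -n 100 /grid/1/spark/work/driver-20200508153502-1291/stderr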

Re: java.lang.OutOfMemoryError Spark Worker

2020-05-08 Thread Hrishikesh Mishra
Thanks Jacek for the quick response. Due to our system constraints, we can't move to Structured Streaming now. But YARN can definitely be tried out. My problem is that I'm not able to figure out where the issue is: driver, executor, or worker. Even the exceptions are clueless. Please see the below exception,
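One way to pinpoint which JVM is running out of memory is to have each role dump its heap on OOM and see which dump file actually appears. A sketch; the dump paths are illustrative:

    # Driver and executors, per job:
    spark-submit \
      --conf "spark.driver.extraJavaOptions=-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/driver" \
      --conf "spark.executor.extraJavaOptions=-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/executor" \
      ...

    # Worker daemon, in conf/spark-env.sh:
    export SPARK_DAEMON_JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/worker"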

Re: java.lang.OutOfMemoryError Spark Worker

2020-05-08 Thread Jacek Laskowski
Hi, Sorry for being perhaps too harsh, but when you asked "Am I missing something?" and I noticed "Kafka Direct Stream" and "Spark Standalone Cluster", I immediately thought "Yeah... please upgrade your Spark env to use Spark Structured Streaming at the very least and/or use YARN as the
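For the YARN part of that suggestion, the submit command mostly changes in the master/deploy flags; a sketch with illustrative, cluster-specific values:

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --driver-memory 4g \
      --executor-memory 5g \
      --num-executors 6 \
      --class com.hrishikesh.mishra.Main \
      ...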

Re: java.lang.OutOfMemoryError Spark Worker

2020-05-08 Thread Hrishikesh Mishra
These errors are completely clueless. No clue why this OOM exception is coming.

    20/05/08 15:36:55 INFO Worker: Asked to kill driver driver-20200508153502-1291
    20/05/08 15:36:55 INFO DriverRunner: Killing driver process!
    20/05/08 15:36:55 INFO CommandUtils: Redirection to

Re: java.lang.OutOfMemoryError Spark Worker

2020-05-07 Thread Hrishikesh Mishra
It's only happening for the Hadoop config. The exception traces are different each time it dies, and jobs run for a couple of hours before the worker dies. Another reason:

    20/05/02 02:26:34 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[ExecutorRunner for

Re: java.lang.OutOfMemoryError Spark Worker

2020-05-07 Thread Jeff Evans
You might want to double-check your Hadoop config files. From the stack trace it looks like this is happening when simply trying to load configuration (XML files). Make sure they're well-formed.

On Thu, May 7, 2020 at 6:12 AM Hrishikesh Mishra wrote:
> Hi
>
> I am getting out of memory error
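A quick way to verify that the XML parses cleanly (config path illustrative; adjust to wherever HADOOP_CONF_DIR points):

    for f in /etc/hadoop/conf/*-site.xml; do
      xmllint --noout "$f" && echo "$f OK"
    done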

java.lang.OutOfMemoryError Spark Worker

2020-05-07 Thread Hrishikesh Mishra
Hi,

I am getting an out of memory error in the worker log in streaming jobs every couple of hours. After this, the worker dies. There is no shuffle, no aggregation, no caching in the job; it's just a transformation. I'm not able to identify where the problem is, driver or executor. And why the worker is dying
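A starting point for narrowing this down is to grep each log location for the OOM; the paths below are illustrative for a standalone setup:

    # Worker daemon log (under $SPARK_HOME/logs by default):
    grep -n "OutOfMemoryError" "$SPARK_HOME"/logs/*Worker*.out
    # Driver and executor logs under the worker's work dir:
    grep -rn "OutOfMemoryError" /grid/1/spark/work/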