Re: Running Spark und YARN on AWS EMR

2017-07-17 Thread Takashi Sasaki
Hi Josh, As you say, I have seen this problem too. I believe I got a warning when specifying a huge data set. We also adjust the partition size, but we do it through command-line options or in code rather than in the default settings. Regards, Takashi 2017-07-18 6:48 GMT+09:00 Josh Holbrook : > I just ran into

Re: Running Spark und YARN on AWS EMR

2017-07-17 Thread Josh Holbrook
I just ran into this issue! Small world. As far as I can tell, by default Spark on EMR is completely untuned, but it comes with a flag that you can set to tell EMR to autotune Spark. In your configuration.json file, you can add something like: { "Classification": "spark", "Properties":

Re: Running Spark und YARN on AWS EMR

2017-07-17 Thread Pascal Stammer
Hi Takashi, thanks for your help. After further investigation, I figured out that the killed container was the driver process. After setting spark.yarn.driver.memoryOverhead instead of spark.yarn.executor.memoryOverhead, the error was gone and the application now runs without errors. Maybe it wi
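
Editor's sketch of the fix described above: raising the driver-side overhead rather than the executor-side one. It assumes the Spark 2.1 on YARN default of max(384 MB, 10% of the configured memory) for the overhead; the 4096 MB driver size and the 1024 MB override are hypothetical values, not figures from the thread.

```python
# Assumption: Spark 2.1 on YARN defaults the overhead to
# max(384 MB, 10% of the memory setting), truncated to whole MB.
MIN_OVERHEAD_MB = 384


def default_overhead_mb(memory_mb: int) -> int:
    """Default memoryOverhead (in MB) for a driver or executor of the given size."""
    return max(MIN_OVERHEAD_MB, memory_mb // 10)


def driver_overhead_conf(overhead_mb: int) -> list:
    """spark-submit arguments that override the driver overhead (value in MB)."""
    return ["--conf", f"spark.yarn.driver.memoryOverhead={overhead_mb}"]


driver_memory_mb = 4096   # hypothetical driver size
raised_overhead_mb = 1024  # hypothetical override

print("default overhead (MB):", default_overhead_mb(driver_memory_mb))
print("extra spark-submit args:", driver_overhead_conf(raised_overhead_mb))
```

YARN sizes the container as memory plus overhead, so a driver that uses a lot of off-heap memory can exceed its container even when the heap setting looks generous.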

Re: Running Spark und YARN on AWS EMR

2017-07-17 Thread Takashi Sasaki
Hi Pascal, This error also occurred frequently in our project. What worked for us was specifying the memory size directly in the spark-submit command, e.g. spark-submit --executor-memory 2g Regards, Takashi > 2017-07-18 5:18 GMT+09:00 Pascal Stammer : >> Hi, >> >> I am running a Spark 2.1
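
Editor's sketch of passing the memory size explicitly on the command line, as suggested above. The helper name build_spark_submit and the application file my_app.py are hypothetical; only the spark-submit options --master, --executor-memory and --driver-memory are real. The command is assembled and printed, not executed.

```python
import shlex


def build_spark_submit(app, executor_memory="2g", driver_memory=None):
    """Assemble a spark-submit argument list with explicit memory sizes."""
    args = ["spark-submit", "--master", "yarn",
            "--executor-memory", executor_memory]
    if driver_memory is not None:
        # Only added when the caller wants to size the driver explicitly.
        args += ["--driver-memory", driver_memory]
    args.append(app)
    return args


# "my_app.py" is a placeholder for the real application.
cmd = build_spark_submit("my_app.py", executor_memory="2g")
print(" ".join(shlex.quote(a) for a in cmd))
```

Building the arguments as a list keeps the memory settings in one place and avoids shell-quoting mistakes if the list is later handed to subprocess.run.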

Running Spark und YARN on AWS EMR

2017-07-17 Thread Pascal Stammer
Hi, I am running a Spark 2.1.x application on AWS EMR with YARN and get the following error, which kills my application: AM Container for appattempt_1500320286695_0001_01 exited with exitCode: -104 For more detailed output, check application tracking page:http://ip-172-31-35-192.eu-central-1.compu
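
Editor's sketch for reading the exit code in the message above. The two mappings follow the constants in Hadoop's ContainerExitStatus (-103 and -104); the helper name explain_exit_code is hypothetical, and the table is deliberately incomplete.

```python
import re

# Subset of Hadoop's ContainerExitStatus constants (not the full list).
YARN_EXIT_STATUS = {
    -103: "KILLED_EXCEEDED_VMEM: container exceeded its virtual memory limit",
    -104: "KILLED_EXCEEDED_PMEM: container exceeded its physical memory limit",
}


def explain_exit_code(message: str) -> str:
    """Extract 'exitCode: N' from a YARN diagnostic line and describe it."""
    match = re.search(r"exitCode:\s*(-?\d+)", message)
    if match is None:
        return "no exit code found"
    code = int(match.group(1))
    return YARN_EXIT_STATUS.get(code, f"exit code {code} (not in this table)")


line = "AM Container for appattempt_1500320286695_0001_01 exited with exitCode: -104"
print(explain_exit_code(line))
```

A physical-memory kill on the AM container is consistent with the later finding in this thread that the driver's memory overhead was too small.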