Hi Josh,
As you say, I've run into the same problem. I recall getting a warning when
processing a very large data set. We also adjust the partition size, but we
do it through command-line options or in code rather than through the
default settings.

Regards,
Takashi

2017-07-18 6:48 GMT+09:00 Josh Holbrook <josh.holbr...@fusion.net>:
> I just ran into this issue! Small world.
>
> As far as I can tell, by default Spark on EMR is completely untuned, but
> it comes with a flag you can set to tell EMR to autotune Spark. In your
> configuration.json file, you can add something like:
>
> {
>   "Classification": "spark",
>   "Properties": {
>     "maximizeResourceAllocation": "true"
>   }
> },
>
> but keep in mind that, again as far as I can tell, the default
> parallelism with this config is merely twice the number of executor
> cores, so for a 10-machine cluster with 3 active cores each, 60
> partitions. This is pretty low, so you'll likely want to adjust it. I'm
> currently using the following, because Spark chokes on datasets that are
> bigger than about 2 GB per partition:
>
> {
>   "Classification": "spark-defaults",
>   "Properties": {
>     "spark.default.parallelism": "1000"
>   }
> }
>
> Good luck, and I hope this is helpful!
>
> --Josh
>
>
> On Mon, Jul 17, 2017 at 4:59 PM, Takashi Sasaki <tsasaki...@gmail.com>
> wrote:
>>
>> Hi Pascal,
>>
>> The error also occurred frequently in our project.
>>
>> As a solution, it was effective to specify the memory size directly
>> with the spark-submit command, e.g.:
>> spark-submit --executor-memory 2g
>>
>> Regards,
>>
>> Takashi
>>
>> 2017-07-18 5:18 GMT+09:00 Pascal Stammer <stam...@deichbrise.de>:
>> >> Hi,
>> >>
>> >> I am running a Spark 2.1.x application on AWS EMR with YARN and get
>> >> the following error, which kills my application:
>> >>
>> >> AM Container for appattempt_1500320286695_0001_000001 exited with
>> >> exitCode: -104
>> >> For more detailed output, check the application tracking page:
>> >> http://ip-172-31-35-192.eu-central-1.compute.internal:8088/cluster/app/application_1500320286695_0001
>> >> Then, click on links to logs of each attempt.
>> >> Diagnostics: Container
>> >> [pid=9216,containerID=container_1500320286695_0001_01_000001] is
>> >> running beyond physical memory limits. Current usage: 1.4 GB of
>> >> 1.4 GB physical memory used; 3.3 GB of 6.9 GB virtual memory used.
>> >> Killing container.
>> >>
>> >> I already changed spark.yarn.executor.memoryOverhead, but the error
>> >> still occurs. Does anybody have a hint about which parameter or
>> >> configuration I have to adapt?
>> >>
>> >> Thank you very much.
>> >>
>> >> Regards,
>> >>
>> >> Pascal Stammer
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
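As a quick sanity check on the numbers in Pascal's diagnostics: in Spark 2.x on YARN, a container's physical-memory limit is the JVM heap plus spark.yarn.executor.memoryOverhead, and the overhead defaults to 10% of the heap with a 384 MB floor. The helper below is an illustrative sketch of that arithmetic (it is not Spark code, just the sizing rule written out):

```python
def yarn_container_limit_mb(heap_mb, overhead_mb=None):
    """Approximate physical-memory limit YARN enforces on a Spark 2.x
    container: JVM heap plus off-heap overhead. When not set explicitly,
    the overhead defaults to max(384 MB, 10% of the heap)."""
    if overhead_mb is None:
        overhead_mb = max(384, int(heap_mb * 0.10))
    return heap_mb + overhead_mb

# With a default 1 GB heap: 1024 + 384 = 1408 MB, i.e. roughly the
# "1.4 GB of 1.4 GB physical memory used" in the error above.
print(yarn_container_limit_mb(1024))
```

This suggests the killed AM container was still running with the default 1 GB heap, which is why raising only memoryOverhead did not help much; increasing the heap itself (e.g. --driver-memory or --executor-memory on spark-submit) moves the limit far more.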