Hi there,

Are you executing spark-submit with client deploy-mode or cluster deploy-mode (client is the default)? Is it the driver that is being killed? Does this long-running Spark application work without Oozie?
Assuming spark-submit is called with client deploy-mode, the driver is launched directly within the spark-submit process, i.e., inside the shell action's launcher container. So it is not really the shell itself that needs so much memory; it is the Spark driver. That is why you need to increase the memory settings for the shell action when running spark-submit in client mode.
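For example, you could raise the launcher container limits for the shell action along these lines (just a rough sketch: the 4096 MB / -Xmx values and the script name run-spark-job.sh are placeholders to adjust for your job):

    <action name="spark-job">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <!-- Memory of the launcher container that runs the shell
                     (and, with client deploy-mode, the Spark driver inside it) -->
                <property>
                    <name>oozie.launcher.mapreduce.map.memory.mb</name>
                    <value>4096</value>
                </property>
                <property>
                    <name>oozie.launcher.mapreduce.map.java.opts</name>
                    <value>-Xmx3584m</value>
                </property>
            </configuration>
            <exec>run-spark-job.sh</exec>
            <file>run-spark-job.sh#run-spark-job.sh</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>

Alternatively, submitting with --deploy-mode cluster moves the driver into its own YARN container, so the launcher container only needs enough memory to run spark-submit itself.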
On Fri, Oct 20, 2017 at 9:18 AM, Ilya Karpov <[email protected]> wrote:

> Hi, guys,
> I’m launching a Spark 2.2 job using Oozie 4.1.0 (CDH 5.12.0). The Oozie map task
> (a shell script that executes the spark-submit command) fails after ~4h of work
> with this message in YARN:
> Container [pid=49585,containerID=container_e112_1508189142310_0789_01_000002]
> is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB
> physical memory used; 20.3 GB of 2.1 GB virtual memory used. Killing
> container.
> I know that I can increase the memory settings for the Oozie shell action, but I
> can’t believe it needs so much memory to run a simple shell. Does anybody
> know where the root of the problem lies?

-- 
Attila Sasvari
Software Engineer
<http://www.cloudera.com/>