[ 
https://issues.apache.org/jira/browse/SAMZA-333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen updated SAMZA-333:
-------------------------

    Description: 
{code}
Application application_1404246879802_0019 failed 50 times due to AM Container 
for appattempt_1404246879802_0019_000050 exited with exitCode: 0 due to: 
Exception from container-launch: java.io.IOException: Cannot run program "nice" 
(in directory 
"/export/content/data/samsa-yarn/usercache/samza-job/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"):
 error=7, Argument list too long
java.io.IOException: Cannot run program "nice" (in directory 
"/export/content/data/samsa-yarn/usercache/samza/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"):
 error=7, Argument list too long
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1042)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:448)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error=7, Argument list too long
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1023)
... 10 more
.Failing this attempt.. Failing the application.
{code}


This happens because the launch_container.sh script generated by YARN contains 
all the exported variables (including the Samza configs) as well as the 
run_container scripts. Once a large config variable is exported, every command 
launched from that shell fails with "Argument list too long".

For example, the size of the "SAMZA_SYSTEM_STREAMS" variable in 
launch_container.sh is:
{code}
bash-4.1$ sed '12q;d' launch_container.sh | wc -c
167546
{code}
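
The measurement above inspects one known line. As a rough sketch for finding 
every oversized line in a generated script, something like the following could 
be used. The function name long_lines is hypothetical, and the count covers the 
whole line (including the "export " prefix and any quoting), so it only 
approximates the size of the resulting environment string.

```shell
# Print "<line-number> <length>" for every line longer than a threshold.
# $1 = file to scan, $2 = threshold (defaults to 131072).
long_lines() {
  # LC_ALL=C makes awk count bytes rather than multibyte characters.
  LC_ALL=C awk -v max="${2:-131072}" 'length($0) > max { print NR, length($0) }' "$1"
}
```

Running long_lines launch_container.sh would list the offending line numbers, 
which can then be inspected individually.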

As indicated at http://www.in-ulm.de/~mascheck/various/argmax/, the maximum 
size of a single argument or environment string is bounded by MAX_ARG_STRLEN 
(131072 bytes).
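
On Linux, MAX_ARG_STRLEN is defined as 32 pages, which gives 131072 bytes with 
4 KiB pages. The sketch below checks whether a NAME=VALUE pair would fit. The 
function names are hypothetical, and it assumes single-byte characters and that 
the limit applies to the whole "NAME=VALUE" string plus its terminating NUL.

```shell
# Linux defines MAX_ARG_STRLEN as 32 pages; with 4 KiB pages that is 131072 bytes.
max_arg_strlen() {
  echo $(( $(getconf PAGESIZE) * 32 ))
}

# Succeeds if "NAME=VALUE" plus its terminating NUL fits within the limit.
# $1 = name, $2 = value, $3 = limit in bytes (defaults to max_arg_strlen).
# ${#2} counts characters, so this assumes a single-byte encoding.
fits_in_env() {
  limit=${3:-$(max_arg_strlen)}
  [ $(( ${#1} + 1 + ${#2} + 1 )) -le "$limit" ]
}
```

Under these assumptions a 65536-character value fits, while a 131072-character 
value does not.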

This can be reproduced by exporting a large variable:
{code}
[nsomasun@eat1-app201 usercache]$ sudo -uapp bash
bash-4.1$ export b1=A
bash-4.1$ export b2=$b1$b1
bash-4.1$ export b4=$b2$b2
bash-4.1$ export b8=$b4$b4
bash-4.1$ export b16=$b8$b8
bash-4.1$ export b32=$b16$b16
bash-4.1$ export b64=$b32$b32
bash-4.1$ export b128=$b64$b64
bash-4.1$ export b256=$b128$b128
bash-4.1$ export b512=$b256$b256
bash-4.1$ export b1k=$b512$b512
bash-4.1$ export b2k=$b1k$b1k
bash-4.1$ export b4k=$b2k$b2k
bash-4.1$ export b8k=$b4k$b4k
bash-4.1$ export b16k=$b8k$b8k
bash-4.1$ export b32k=$b16k$b16k
bash-4.1$ export b64k=$b32k$b32k
bash-4.1$ export b128k=$b64k$b64k
bash-4.1$ ls
bash: /bin/ls: Argument list too long
{code}

We need an alternate mechanism to pass configurations to the Samza container, 
since we are bound by the size of the variable the shell can support.
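
One possible direction, shown only as a sketch and not as the approach Samza 
adopts, is to write the large config to a file and export just the file's 
path. The helper name write_config_file, the file name job-config.<pid>, and 
the SAMZA_CONFIG_FILE variable mentioned below are all hypothetical.

```shell
# Hypothetical helper: write a config payload to a file and print the file's path.
# $1 = config payload, $2 = directory to write into.
write_config_file() {
  file="$2/job-config.$$"
  # printf is a builtin in bash, so the payload is never passed through exec.
  printf '%s' "$1" > "$file" || return 1
  printf '%s\n' "$file"
}
```

The launch script could then export something like 
SAMZA_CONFIG_FILE=$(write_config_file "$config" "$PWD"), and the container would 
read the file instead of a large environment variable.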


  was:
{code}
Application application_1404246879802_0019 failed 50 times due to AM Container 
for appattempt_1404246879802_0019_000050 exited with exitCode: 0 due to: 
Exception from container-launch: java.io.IOException: Cannot run program "nice" 
(in directory 
"/export/content/data/samsa-yarn/usercache/samza-job/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"):
 error=7, Argument list too long
java.io.IOException: Cannot run program "nice" (in directory 
"/export/content/data/samsa-yarn/usercache/samza/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"):
 error=7, Argument list too long
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1042)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:448)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error=7, Argument list too long
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1023)
... 10 more
.Failing this attempt.. Failing the application.
{code}


> Large samza configurations result in yarn job failure
> -----------------------------------------------------
>
>                 Key: SAMZA-333
>                 URL: https://issues.apache.org/jira/browse/SAMZA-333
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>            Reporter: Naveen
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)
