[ https://issues.apache.org/jira/browse/SAMZA-333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Naveen updated SAMZA-333:
-------------------------
Description:
{code}
Application application_1404246879802_0019 failed 50 times due to AM Container
for appattempt_1404246879802_0019_000050 exited with exitCode: 0 due to:
Exception from container-launch: java.io.IOException: Cannot run program "nice"
(in directory
"/export/content/data/samsa-yarn/usercache/samza-job/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"):
error=7, Argument list too long
java.io.IOException: Cannot run program "nice" (in directory
"/export/content/data/samsa-yarn/usercache/samza/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"):
error=7, Argument list too long
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1042)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:448)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error=7, Argument list too long
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
    at java.lang.ProcessImpl.start(ProcessImpl.java:134)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1023)
    ... 10 more
.Failing this attempt.. Failing the application.
{code}
This happens because the launch_container.sh script generated by YARN contains
all the exported variables (including the Samza configs) as well as the
run_container commands. Once a very large config variable is exported, the
shell the script is running in can no longer launch commands: they fail with
"Argument list too long".
For example, the size of the line that exports the "SAMZA_SYSTEM_STREAMS"
variable (line 12 of launch_container.sh) is:
{code}
bash-4.1$ sed '12q;d' launch_container.sh | wc -c
167546
{code}
As indicated at http://www.in-ulm.de/~mascheck/various/argmax/, the maximum
size of a single argument or environment string on Linux is bounded by
MAX_ARG_STRLEN (131072 bytes), so this 167546-byte value exceeds the limit.
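To see which exported variables are over the limit, something like the following bash sketch could be sourced into the affected shell. The function name `oversized_exports` and the `MAX_ARG_STRLEN` shell variable are made up for this illustration, and the value 131072 assumes 4 KiB pages. It uses only shell builtins on purpose, since external tools such as `env` or `awk` would themselves fail to start once an oversized variable is exported.

```shell
# Assumed per-string limit on Linux with 4 KiB pages (see the argmax page above).
MAX_ARG_STRLEN=131072

# Print "NAME SIZE" for every exported variable whose "NAME=value" string,
# counted together with its terminating NUL byte, is larger than the limit.
# Usage: oversized_exports [limit]   (limit defaults to MAX_ARG_STRLEN)
oversized_exports() {
  local limit="${1:-$MAX_ARG_STRLEN}" name entry
  local LC_ALL=C   # make ${#entry} count bytes instead of characters
  for name in $(compgen -e); do      # compgen -e lists exported variable names
    entry="${name}=${!name}"         # the string handed to the new process
    if (( ${#entry} + 1 > limit )); then
      printf '%s %d\n' "$name" "$(( ${#entry} + 1 ))"
    fi
  done
}
```

Calling `oversized_exports` with no argument reports only variables that exceed the assumed kernel limit; passing a smaller number is handy for spotting variables that are getting close to it.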
The failure can be reproduced by exporting a large variable:
{code}
[nsomasun@eat1-app201 usercache]$ sudo -uapp bash
bash-4.1$ export b1=A
bash-4.1$ export b2=$b1$b1
bash-4.1$ export b4=$b2$b2
bash-4.1$ export b8=$b4$b4
bash-4.1$ export b16=$b8$b8
bash-4.1$ export b32=$b16$b16
bash-4.1$ export b64=$b32$b32
bash-4.1$ export b128=$b64$b64
bash-4.1$ export b256=$b128$b128
bash-4.1$ export b512=$b256$b256
bash-4.1$ export b1k=$b512$b512
bash-4.1$ export b2k=$b1k$b1k
bash-4.1$ export b4k=$b2k$b2k
bash-4.1$ export b8k=$b4k$b4k
bash-4.1$ export b16k=$b8k$b8k
bash-4.1$ export b32k=$b16k$b16k
bash-4.1$ export b64k=$b32k$b32k
bash-4.1$ export b128k=$b64k$b64k
bash-4.1$ ls
bash: /bin/ls: Argument list too long
{code}
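The same doubling can be written as a loop. This is only a compact bash sketch of the transcript above: the function name `exec_with_env_of_size` and the variable `GR_PAYLOAD` are invented here, and whether the larger case fails depends on the platform's per-string limit, so no output is claimed for it.

```shell
# Try to start an external command while a variable of 2^doublings bytes is
# exported. Returns 0 if the command could be started, non-zero otherwise.
exec_with_env_of_size() {
  local doublings="$1" value=A i
  for (( i = 0; i < doublings; i++ )); do
    value="$value$value"             # 1, 2, 4, ... bytes, as in b1, b2, b4, ...
  done
  # Export inside a subshell so the calling shell stays usable afterwards.
  ( export GR_PAYLOAD="$value"; /bin/sh -c : ) 2>/dev/null
}

exec_with_env_of_size 10 && echo "1 KiB variable: exec ok"
exec_with_env_of_size 17 || echo "128 KiB variable: exec failed"
```

Because the export happens in a subshell, the oversized variable never reaches the interactive shell, which avoids ending up in the state shown at the end of the transcript.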
We need an alternate mechanism to pass configurations to the Samza container,
since we are bound by the size of variable that the shell can support.
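One possible direction, sketched here only as an illustration and not as what Samza or YARN actually do, is to write the large value to a file and export just the file's path. The helper names `export_via_file` and `read_from_file` and the `_FILE` suffix convention are hypothetical.

```shell
# Hypothetical launcher-side helper: store a large value in a temporary file
# and export only the path as NAME_FILE, which stays far below the limit.
# Usage: export_via_file NAME VALUE [DIR]
export_via_file() {
  local name="$1" value="$2" dir="${3:-${TMPDIR:-/tmp}}" file
  file="$(mktemp "$dir/${name}.XXXXXX")" || return 1
  printf '%s' "$value" > "$file"     # printf is a builtin, so no exec limit applies
  export "${name}_FILE=$file"
}

# Hypothetical container-side helper: print the value stored for NAME.
read_from_file() {
  local path_var="${1}_FILE"
  cat -- "${!path_var}"
}
```

With this approach the environment only ever carries a short path, so the size of the configuration no longer matters to the process launch; the trade-off is that the file has to be readable from wherever the container runs and cleaned up afterwards.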
> Large samza configurations results in yarn job failure
> -------------------------------------------------------
>
> Key: SAMZA-333
> URL: https://issues.apache.org/jira/browse/SAMZA-333
> Project: Samza
> Issue Type: Bug
> Components: container
> Reporter: Naveen
>