[
https://issues.apache.org/jira/browse/SAMZA-333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062254#comment-14062254
]
Chris Riccomini commented on SAMZA-333:
---------------------------------------
Third thought: this problem goes away if we use the ConfigLog (see SAMZA-123
for some discussion). In that case, the JobRunner could write the configuration
directly to the ConfigLog stream at job-start time. This does require the
JobRunner to have access to whatever system hosts the ConfigLog (likely Kafka).
That might not be true in all cases, but the requirement is perhaps worth the
trade-off.
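
To make the idea concrete, here is a minimal sketch in Python. It is not the SAMZA-123 design and not Samza code: `InMemoryLog`, `publish_config`, `load_config`, and the JSON message layout are all hypothetical, and the in-memory list only stands in for whatever stream would host the ConfigLog. The point is that the launch environment would only need to carry a small pointer to the log rather than the full configuration.

```python
import json


class InMemoryLog:
    """Hypothetical stand-in for the ConfigLog stream (e.g. a Kafka topic)."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        self.messages.append(message)

    def read_all(self):
        return list(self.messages)


def publish_config(log, job_name, config):
    """JobRunner side: write one message per config key at job-start time."""
    for key, value in sorted(config.items()):
        log.append(json.dumps({"job": job_name, "key": key, "value": value}))


def load_config(log, job_name):
    """Container side: replay the log; the latest value for a key wins."""
    config = {}
    for raw in log.read_all():
        entry = json.loads(raw)
        if entry["job"] == job_name:
            config[entry["key"]] = entry["value"]
    return config
```

With this shape, a value far larger than any environment-variable limit is just another message in the log, and the container rebuilds its configuration by reading the log at startup.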
> Large samza configurations results in yarn job failure
> -------------------------------------------------------
>
> Key: SAMZA-333
> URL: https://issues.apache.org/jira/browse/SAMZA-333
> Project: Samza
> Issue Type: Bug
> Components: container
> Reporter: Naveen
>
> {code}
> Application application_1404246879802_0019 failed 50 times due to AM
> Container for appattempt_1404246879802_0019_000050 exited with exitCode: 0
> due to: Exception from container-launch: java.io.IOException: Cannot run
> program "nice" (in directory
> "/export/content/data/samsa-yarn/usercache/samza-job/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"):
> error=7, Argument list too long
> java.io.IOException: Cannot run program "nice" (in directory
> "/export/content/data/samsa-yarn/usercache/samza/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"):
> error=7, Argument list too long
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1042)
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:448)
> at org.apache.hadoop.util.Shell.run(Shell.java:418)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: error=7, Argument list too long
> at java.lang.UNIXProcess.forkAndExec(Native Method)
> at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
> at java.lang.ProcessImpl.start(ProcessImpl.java:134)
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1023)
> ... 10 more
> .Failing this attempt.. Failing the application.
> {code}
> This happens because the launch_container.sh script generated by YARN contains
> all the exported variables (including the Samza configs) as well as the
> run_container scripts. Once an oversized config variable is exported, every
> later attempt by that shell to run a program fails with "Argument list too
> long".
> For example, the size of the variable "SAMZA_SYSTEM_STREAMS" in
> launch_container.sh is:
> {code}
> bash-4.1$ sed '12q;d' launch_container.sh | wc -c
> 167546
> {code}
> The maximum size of a single argument or environment string is bounded by
> MAX_ARG_STRLEN (131072 bytes); see
> http://www.in-ulm.de/~mascheck/various/argmax/
> This can be reproduced by exporting a large variable:
> {code}
> [nsomasun@eat1-app201 usercache]$ sudo -uapp bash
> bash-4.1$ export b1=A
> bash-4.1$ export b2=$b1$b1
> bash-4.1$ export b4=$b2$b2
> bash-4.1$ export b8=$b4$b4
> bash-4.1$ export b16=$b8$b8
> bash-4.1$ export b32=$b16$b16
> bash-4.1$ export b64=$b32$b32
> bash-4.1$ export b128=$b64$b64
> bash-4.1$ export b256=$b128$b128
> bash-4.1$ export b512=$b256$b256
> bash-4.1$ export b1k=$b512$b512
> bash-4.1$ export b2k=$b1k$b1k
> bash-4.1$ export b4k=$b2k$b2k
> bash-4.1$ export b8k=$b4k$b4k
> bash-4.1$ export b16k=$b8k$b8k
> bash-4.1$ export b32k=$b16k$b16k
> bash-4.1$ export b64k=$b32k$b32k
> bash-4.1$ export b128k=$b64k$b64k
> bash-4.1$ ls
> bash: /bin/ls: Argument list too long
> {code}
> We need alternate mechanisms to pass configurations to the Samza container,
> since we are bound by the maximum size of a single environment variable.
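
As a rough way to spot the problem before a container launch, the per-string limit quoted above can be checked directly. The sketch below is only an illustration: `oversized_env_entries` is a hypothetical helper, not part of Samza or YARN, and it assumes the 131072-byte MAX_ARG_STRLEN value from the report (Linux with 4 KiB pages) and that the limit counts the trailing NUL byte.

```python
import os

# Per-string limit for argv/envp entries, as quoted in the report.
# Assumption: Linux with 4 KiB pages; the count includes the trailing NUL.
MAX_ARG_STRLEN = 131072


def oversized_env_entries(env, limit=MAX_ARG_STRLEN):
    """Return {name: size} for entries whose "NAME=value" string exceeds limit."""
    oversized = {}
    for name, value in env.items():
        # Each entry reaches the new process as one "NAME=value\0" string.
        size = len(os.fsencode(name)) + 1 + len(os.fsencode(value)) + 1
        if size > limit:
            oversized[name] = size
    return oversized
```

Calling `oversized_env_entries(dict(os.environ))` in the launching process, or on the variables about to be written into launch_container.sh, would flag a value such as the 167546-byte SAMZA_SYSTEM_STREAMS shown above.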