Wangda Tan, Thanks for your reply! We did actually figure out where the problem was coming from, but this is a very helpful technique to know.
John

From: Wangda Tan [mailto:wheele...@gmail.com]
Sent: Wednesday, March 26, 2014 6:35 PM
To: user@hadoop.apache.org
Subject: Re: Getting error message from AM container launch

Hi John,

Typically, this is caused by something in your program setting "nice" as the AM launch command. You can check the actual script that YARN used to launch the AM.

To do that, set "yarn.nodemanager.delete.debug-delay-sec" in yarn-site.xml on all NMs to a larger value (for example 600, i.e. 10 minutes), so that the NMs don't remove a container's temporary directory as soon as the container finishes. You need to restart the NMs after changing the setting.

After that, re-run your program. The script should be at:
<host-of-AM>:/ephemeral02/hadoop/yarn/local/usercache/SYSTEM/appcache/<app-id>/<container-id>/launch_container.sh
You can then verify whether the launch command in the script is correct.

--
Regards,
Wangda Tan

On Thu, Mar 27, 2014 at 7:12 AM, Azuryy <azury...@gmail.com> wrote:

You used 'nice' in your app?

Sent from my iPhone5s

On Mar 27, 2014, at 6:55, John Lilley <john.lil...@redpoint.net> wrote:

On further examination they appear to be 369 characters long. I’ve read about similar issues showing up when the environment exceeds 132KB, but we aren’t putting anything significant in the environment.
John

From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Wednesday, March 26, 2014 4:41 PM
To: user@hadoop.apache.org
Subject: RE: Getting error message from AM container launch

We do have a fairly long container command-line. Not huge, around 200 characters.
John

From: John Lilley [mailto:john.lil...@redpoint.net]
Sent: Wednesday, March 26, 2014 4:38 PM
To: user@hadoop.apache.org
Subject: Getting error message from AM container launch

Running a non-MapReduce YARN application, one of the containers launched by the AM is failing with an error message I’ve never seen. Any ideas?
I’m not sure who exactly is running “nice” or why its argument list would be too long.
Thanks
john

Container for appattempt_1395755163053_0030_000001 exited with exitCode: 0 due to: Exception from container-launch:
java.io.IOException: Cannot run program "nice" (in directory "/ephemeral02/hadoop/yarn/local/usercache/SYSTEM/appcache/application_1395755163053_0030/container_1395755163053_0030_01_000001"): java.io.IOException: error=7, Argument list too long
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:407)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: java.io.IOException: error=7, Argument list too long
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
    at java.lang.ProcessImpl.start(ProcessImpl.java:65)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
    ... 11 more

--
Regards,
Wangda
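
For anyone who hits the same "error=7, Argument list too long" and wants to rule the command line and environment in or out: the sketch below is a rough way to estimate how many bytes a launch would hand to the kernel. It is not taken from this thread. It assumes a Linux NodeManager host, where errno 7 (E2BIG) is returned when the combined argument and environment strings are too large or when any single string exceeds 131072 bytes (with 4 KiB pages). The helper names and the sample command are hypothetical, and the real kernel accounting also includes pointer overhead, so treat the numbers as approximate.

```python
import os

# Assumption: Linux with 4 KiB pages, where one argv/env string may not
# exceed 32 pages (131072 bytes). Other platforms use different limits.
LINUX_MAX_ARG_STRLEN = 131072


def exec_payload_bytes(argv, env):
    """Approximate bytes in all argv and env strings, counting trailing NULs."""
    arg_bytes = sum(len(os.fsencode(a)) + 1 for a in argv)
    # Each environment entry is passed as "KEY=VALUE\0".
    env_bytes = sum(
        len(os.fsencode(k)) + len(os.fsencode(v)) + 2 for k, v in env.items()
    )
    return arg_bytes + env_bytes


def longest_string_bytes(argv, env):
    """Size of the largest single argv or env string, including its NUL."""
    sizes = [len(os.fsencode(a)) + 1 for a in argv]
    sizes += [len(os.fsencode(k)) + len(os.fsencode(v)) + 2 for k, v in env.items()]
    return max(sizes, default=0)


if __name__ == "__main__":
    # Hypothetical command line; substitute the one found in launch_container.sh.
    argv = ["nice", "-n", "0", "bash", "launch_container.sh"]
    env = dict(os.environ)
    print("total bytes:", exec_payload_bytes(argv, env))
    print("longest single string:", longest_string_bytes(argv, env))
    print("per-string limit assumed:", LINUX_MAX_ARG_STRLEN)
    print("ARG_MAX reported by the OS:", os.sysconf("SC_ARG_MAX"))
```

Running it inside the same environment that the container launch inherits (for example, after sourcing the exports from the preserved launch_container.sh) gives a quick indication of whether one oversized variable or the total size is near the limit.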