I'll bet you are hitting MR-2998. From the changelog:
MAPREDUCE-2998. Fixed a bug in TaskAttemptImpl which caused it to fork bin/mapred too many times. Contributed by Vinod K V.

Arun

On Sep 21, 2011, at 9:52 AM, Chris Riccomini wrote:

> Hey Guys,
>
> My ApplicationMaster is being killed by the NodeManager because of memory
> consumption, and I don't understand why. I'm using -Xmx512M, and setting my
> resource request to 2048.
>
>
> .addCommand("java -Xmx512M -cp './package/*' kafka.yarn.ApplicationMaster
> " ...
>
> ...
>
> private var memory = 2048
>
> resource.setMemory(memory)
> containerCtx.setResource(resource)
> containerCtx.setCommands(cmds.toList)
> containerCtx.setLocalResources(Collections.singletonMap("package",
> packageResource))
> appCtx.setApplicationId(appId)
> appCtx.setUser(user.getShortUserName)
> appCtx.setAMContainerSpec(containerCtx)
> request.setApplicationSubmissionContext(appCtx)
> applicationsManager.submitApplication(request)
>
> When this runs, I see (in my NodeManager's logs):
>
>
> 2011-09-21 09:35:19,112 INFO monitor.ContainersMonitorImpl
> (ContainersMonitorImpl.java:run(402)) - Memory usage of ProcessTree 28134 for
> container-id container_1316559026783_0003_01_000001 : Virtual 2260938752
> bytes, limit : 2147483648 bytes; Physical 71540736 bytes, limit -1 bytes
> 2011-09-21 09:35:19,112 WARN monitor.ContainersMonitorImpl
> (ContainersMonitorImpl.java:isProcessTreeOverLimit(289)) - Process tree for
> container: container_1316559026783_0003_01_000001 has processes older than 1
> iteration running over the configured limit. Limit=2147483648, current usage
> = 2260938752
> 2011-09-21 09:35:19,113 WARN monitor.ContainersMonitorImpl
> (ContainersMonitorImpl.java:run(453)) - Container
> [pid=28134,containerID=container_1316559026783_0003_01_000001] is running
> beyond memory-limits. Current usage : 2260938752bytes. Limit :
> 2147483648bytes. Killing container.
> Dump of the process-tree for container_1316559026783_0003_01_000001 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 28134 25886 28134 28134 (bash) 0 0 108638208 303 /bin/bash -c java
> -Xmx512M -cp './package/*' kafka.yarn.ApplicationMaster 3 1 1316559026783
> com.linkedin.TODO 1
> 1>/tmp/logs/application_1316559026783_0003/container_1316559026783_0003_01_000001/stdout
>
> 2>/tmp/logs/application_1316559026783_0003/container_1316559026783_0003_01_000001/stderr
>
> |- 28137 28134 28134 28134 (java) 92 3 2152300544 17163 java -Xmx512M
> -cp ./package/* kafka.yarn.ApplicationMaster 3 1 1316559026783
> com.linkedin.TODO 1
>
> 2011-09-21 09:35:19,113 INFO monitor.ContainersMonitorImpl
> (ContainersMonitorImpl.java:run(463)) - Removed ProcessTree with root 28134
>
> It appears that YARN is honoring my 2048 command, yet my process is somehow
> taking 2260938752 bytes. I don't think that I'm using nearly that much in
> permgen, and my heap is limited to 512. I don't have any JNI stuff running
> (that I know of), so it's unclear to me what's going on here. The only thing
> that I can think of is that Java's Runtime exec is forking, and copying its
> entire JVM memory footprint for the fork.
>
> Has anyone seen this? Am I doing something dumb?
>
> Thanks!
> Chris
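
For reference, the numbers in the quoted log can be checked by hand. The sketch below is only arithmetic on values copied from the process-tree dump; the object and field names (VmemCheck, bashVmem, javaVmem) are made up for illustration and are not part of YARN. It shows that the "Virtual" figure is the sum of the VMEM_USAGE(BYTES) column for the bash and java processes, and that the limit is the 2048 request read as megabytes. Note that -Xmx512M caps only the Java heap, while the figure compared against the limit in the log is virtual memory.

```scala
// Arithmetic check on the figures from the NodeManager log quoted above.
// All names here are hypothetical; the numbers are copied from the log.
object VmemCheck {
  // VMEM_USAGE(BYTES) column of the process-tree dump
  val bashVmem: Long = 108638208L   // pid 28134 (bash)
  val javaVmem: Long = 2152300544L  // pid 28137 (java)

  // resource.setMemory(2048), read as megabytes
  val requestedMb: Long = 2048L
  val limitBytes: Long = requestedMb * 1024L * 1024L

  // Total virtual memory of the whole process tree
  val treeVmem: Long = bashVmem + javaVmem

  // Bytes by which the tree exceeds the limit
  val excessBytes: Long = treeVmem - limitBytes

  def overLimit: Boolean = treeVmem > limitBytes

  def main(args: Array[String]): Unit = {
    println(s"limit bytes     = $limitBytes")
    println(s"tree vmem bytes = $treeVmem")
    println(s"excess bytes    = $excessBytes")
    println(s"over limit      = $overLimit")
  }
}
```

The computed total matches the "current usage = 2260938752" in the WARN line, and the java process alone (2152300544 bytes) is already slightly above the 2147483648-byte limit.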