Hi, I am using Hadoop 2.1.0-beta and have seen container allocation fail randomly, even when running the same application in a loop. I know the cluster has enough resources, because it granted them to the same application on all the other iterations of the loop, and those runs succeeded.
Whenever such a failure happens, I see many messages like the following in the NodeManager's log. Any clues as to why this happens?

2013-09-12 08:54:36,204 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out status for container: container_id { app_attempt_id { application_id { id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state: C_RUNNING diagnostics: "" exit_status: -1000
2013-09-12 08:54:37,220 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out status for container: container_id { app_attempt_id { application_id { id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state: C_RUNNING diagnostics: "" exit_status: -1000
2013-09-12 08:54:38,231 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out status for container: container_id { app_attempt_id { application_id { id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state: C_RUNNING diagnostics: "" exit_status: -1000
2013-09-12 08:54:39,239 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out status for container: container_id { app_attempt_id { application_id { id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state: C_RUNNING diagnostics: "" exit_status: -1000
2013-09-12 08:54:40,267 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out status for container: container_id { app_attempt_id { application_id { id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state: C_RUNNING diagnostics: "" exit_status: -1000
2013-09-12 08:54:41,275 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out status for container: container_id { app_attempt_id { application_id { id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state: C_RUNNING diagnostics: "" exit_status: -1000
2013-09-12 08:54:42,283 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out status for container: container_id { app_attempt_id { application_id { id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state: C_RUNNING diagnostics: "" exit_status: -1000
2013-09-12 08:54:43,289 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out status for container: container_id { app_attempt_id { application_id { id: 2 cluster_timestamp: 1378990400253 } attemptId: 1 } id: 1 } state: C_RUNNING diagnostics: "" exit_status: -1000

Thanks,
Kishore