[ https://issues.apache.org/jira/browse/YARN-544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622969#comment-13622969 ]
Vinod Kumar Vavilapalli commented on YARN-544:
----------------------------------------------

When you come around to doing this, please write a test-case first to reproduce this.

Tx.

> Failed resource localization might introduce a race condition.
> --------------------------------------------------------------
>
>                 Key: YARN-544
>                 URL: https://issues.apache.org/jira/browse/YARN-544
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>
> When resource localization fails [Public localizer /
> LocalizerRunner(Private)], it sends a ContainerResourceFailedEvent to the
> containers, which then send a ResourceReleaseEvent to the failed resource.
> In the end, when the LocalizedResource's ref count drops to 0, its state is
> changed from DOWNLOADING to INIT.
> Now, if a resource gets a ResourceRequestEvent in between the
> ContainerResourceFailedEvent and the last ResourceReleaseEvent, then the ref
> count for that resource will not drop to 0, and the container which sent the
> ResourceRequestEvent will keep waiting.
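
As a possible starting point for such a test case, the interleaving described in the issue can be modelled in a few lines. This is only a toy sketch in Python, not the YARN code: the class `LocalizedResource` here is a simplified stand-in, and the methods `request`, `download_failed`, `release`, `is_stuck` and the helper `simulate` are hypothetical names that map loosely onto ResourceRequestEvent, ContainerResourceFailedEvent and ResourceReleaseEvent. It assumes that a failed download leaves the resource in DOWNLOADING until the ref count reaches 0, as the description states.

```python
from enum import Enum


class State(Enum):
    INIT = "INIT"
    DOWNLOADING = "DOWNLOADING"


class LocalizedResource:
    """Toy stand-in for the resource described in the issue (not YARN code)."""

    def __init__(self):
        self.state = State.INIT
        self.refs = set()               # containers holding a reference
        self.download_in_flight = False
        self.downloads_started = 0

    def request(self, container):
        # Models ResourceRequestEvent.
        self.refs.add(container)
        if self.state is State.INIT:
            self.state = State.DOWNLOADING
            self.download_in_flight = True
            self.downloads_started += 1
        # In DOWNLOADING the container just waits for the current download.

    def download_failed(self):
        # Models the localizer failing; containers would be notified via
        # ContainerResourceFailedEvent, but the state is left untouched.
        self.download_in_flight = False

    def release(self, container):
        # Models ResourceReleaseEvent.
        self.refs.discard(container)
        if not self.refs and self.state is State.DOWNLOADING:
            self.state = State.INIT

    def is_stuck(self):
        # Someone is waiting in DOWNLOADING, but nothing is being downloaded.
        return (self.state is State.DOWNLOADING
                and not self.download_in_flight
                and bool(self.refs))


def simulate(request_before_release):
    """Run the two orderings of container_2's request vs container_1's release."""
    res = LocalizedResource()
    res.request("container_1")
    res.download_failed()
    if request_before_release:
        res.request("container_2")      # arrives before the last release
        res.release("container_1")
    else:
        res.release("container_1")      # ref count hits 0, state resets
        res.request("container_2")
    return res
```

In this model, `simulate(True)` leaves container_2 waiting on a resource that is still DOWNLOADING with no download in progress, while `simulate(False)` resets the resource to INIT and starts a second download. A real test would need to drive the actual event dispatcher in the same order.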