Brook Zhou created YARN-7098:
--------------------------------

             Summary: LocalizerRunner should immediately send heartbeat response LocalizerStatus.DIE when the Container transitions from LOCALIZING to KILLING
                 Key: YARN-7098
                 URL: https://issues.apache.org/jira/browse/YARN-7098
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
            Reporter: Brook Zhou
            Assignee: Brook Zhou
            Priority: Minor

Currently, the following sequence can happen:

1. ContainerLocalizer heartbeats to ResourceLocalizationService.
2. LocalizerTracker.processHeartbeat verifies that there is a LocalizerRunner for the localizerId (containerId).
3. The Container receives a kill event and goes from LOCALIZING -> KILLING. The LocalizerRunner for the localizerId is removed from LocalizerTracker.
4. Since check (2) passed, LocalizerRunner sends a heartbeat response with LocalizerStatus.LIVE and the next file to download.

What should happen here is that (4) sends LocalizerStatus.DIE, since (3) happened before the heartbeat response in (4). This saves the container from potentially downloading an extra resource that will end up being deleted anyway.
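
The interleaving described above, and one possible way to close the window, can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `HeartbeatRaceSketch`, `Tracker`, `Runner`, `Status`, and all method names are hypothetical, simplified stand-ins and not the actual NodeManager classes or the eventual patch. The idea shown is that the runner re-checks a kill flag at the moment it builds the response, rather than relying on the earlier lookup in step (2).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified model of the race; not the real NodeManager code.
class HeartbeatRaceSketch {

  // Stand-in for the LIVE/DIE value carried in the heartbeat response.
  enum Status { LIVE, DIE }

  // Stand-in for a per-container LocalizerRunner.
  static final class Runner {
    // Set once the container has left LOCALIZING (for example, moved to KILLING).
    private volatile boolean killed = false;

    void kill() {
      killed = true;
    }

    // Step 4: decide the response when it is built, not when the lookup passed.
    Status respond() {
      return killed ? Status.DIE : Status.LIVE;
    }
  }

  // Stand-in for LocalizerTracker.
  static final class Tracker {
    private final ConcurrentMap<String, Runner> runners = new ConcurrentHashMap<>();

    void register(String localizerId) {
      runners.put(localizerId, new Runner());
    }

    // Step 2: check that a runner exists for this localizerId.
    Runner lookup(String localizerId) {
      return runners.get(localizerId);
    }

    // Step 3: the kill event removes the runner and marks it as killed.
    void onContainerKilled(String localizerId) {
      Runner removed = runners.remove(localizerId);
      if (removed != null) {
        removed.kill();
      }
    }

    // Steps 2 and 4 together, as a heartbeat handler would run them.
    Status processHeartbeat(String localizerId) {
      Runner runner = lookup(localizerId);
      if (runner == null) {
        return Status.DIE; // no runner known for this id
      }
      return runner.respond();
    }
  }

  public static void main(String[] args) {
    Tracker tracker = new Tracker();
    tracker.register("container_1");

    Runner runner = tracker.lookup("container_1"); // step 2 passes
    tracker.onContainerKilled("container_1");      // step 3 interleaves here
    System.out.println(runner.respond());          // step 4 now prints DIE
  }
}
```

Without the `killed` check in `respond()`, the runner reference obtained in step (2) would still answer LIVE after step (3), which is the behavior this issue reports. The flag is `volatile` so that a kill issued on one thread is visible to the thread building the heartbeat response.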