[ 
https://issues.apache.org/jira/browse/YARN-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144551#comment-16144551
 ] 

Arun Suresh commented on YARN-7098:
-----------------------------------

Thanks for the patch, [~brookz]. 
I think we also have to consider the localization scope, though. For private 
localizers, {{LocalizerResourceRequestEvent::getVisibility()}} can be PRIVATE 
or APPLICATION. If it is APPLICATION scope, the resource can technically be 
used by other containers of the same app, in which case we should probably 
not call endContainerLocalization.
[~jianhe], thoughts?
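
To make the concern concrete, here is a minimal sketch of the kind of guard I have in mind. The enum and the method name {{shouldEndLocalization}} are hypothetical stand-ins written for this comment, not existing NodeManager code, and the rule it encodes (only APPLICATION scope blocks the early stop) is just my reading of the point above.

```java
// Hypothetical sketch: decide whether an in-flight download may be cut short
// when its container is killed. Not taken from the NodeManager source.
class ScopeCheckSketch {

    // Local stand-in for the two visibility values discussed above.
    enum Visibility { PRIVATE, APPLICATION }

    // Assumption: an APPLICATION-scoped resource may still be needed by other
    // containers of the same app, so its localization should not be ended early.
    static boolean shouldEndLocalization(Visibility visibility) {
        return visibility != Visibility.APPLICATION;
    }

    public static void main(String[] args) {
        for (Visibility v : Visibility.values()) {
            System.out.println(v + " -> end localization early: "
                + shouldEndLocalization(v));
        }
    }
}
```

Whether PRIVATE resources are always safe to abandon is a separate question that this sketch does not settle.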

> LocalizerRunner should immediately send heartbeat response 
> LocalizerStatus.DIE when the Container transitions from LOCALIZING to KILLING
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-7098
>                 URL: https://issues.apache.org/jira/browse/YARN-7098
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Brook Zhou
>            Assignee: Brook Zhou
>            Priority: Minor
>         Attachments: YARN-7098.patch
>
>
> Currently, the following can happen:
> 1. ContainerLocalizer heartbeats to ResourceLocalizationService.
> 2. LocalizerTracker.processHeartbeat verifies that there is a LocalizerRunner 
> for the localizerId (containerId), then proceeds to {code:java}return 
> localizer.processHeartbeat(status.getResources());{code}
> 3. Container receives kill event, goes from LOCALIZING -> KILLING. The 
> LocalizerRunner is removed from LocalizerTracker, since the privLocalizers 
> lock is now free.
> 4. Since check (2) passed, LocalizerRunner sends heartbeat response with 
> LocalizerStatus.LIVE and the next file to download.
> What should happen here is that (4) sends a LocalizerStatus.DIE, since (3) 
> happened before the heartbeat response in (4). This saves the container from 
> potentially downloading an extra resource, triggered by the one extra LIVE 
> heartbeat, that will end up being deleted anyway.
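
The interleaving in steps (2) to (4) can be replayed deterministically with a small model. The sketch below is a simplified illustration, not the NodeManager code and not necessarily what the attached patch does: {{Runner}}, {{Tracker}}, and the {{stopped}} flag are hypothetical stand-ins for LocalizerRunner, LocalizerTracker, and whatever state the real fix would check. It shows one possible approach, in which the runner re-checks a stop flag when building the heartbeat response, so a kill that lands after the tracker lookup still yields DIE.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the heartbeat race described above.
class HeartbeatRaceSketch {

    enum Action { LIVE, DIE }

    // Stand-in for a per-container LocalizerRunner.
    static final class Runner {
        private boolean stopped; // guarded by "this"

        synchronized void stop() {
            stopped = true;
        }

        // Re-checks the stop flag at response time instead of trusting the
        // earlier lookup in the tracker.
        synchronized Action processHeartbeat() {
            return stopped ? Action.DIE : Action.LIVE;
        }
    }

    // Stand-in for LocalizerTracker and its privLocalizers map.
    static final class Tracker {
        private final Map<String, Runner> privLocalizers = new HashMap<>();

        void register(String id) {
            synchronized (privLocalizers) {
                privLocalizers.put(id, new Runner());
            }
        }

        // Step (2): the lookup holds the lock only briefly.
        Runner lookup(String id) {
            synchronized (privLocalizers) {
                return privLocalizers.get(id);
            }
        }

        // Step (3): the kill event removes the runner and marks it stopped.
        void kill(String id) {
            Runner removed;
            synchronized (privLocalizers) {
                removed = privLocalizers.remove(id);
            }
            if (removed != null) {
                removed.stop();
            }
        }
    }

    // Replays (2) -> (3) -> (4) on a single thread.
    static Action replayInterleaving() {
        Tracker tracker = new Tracker();
        tracker.register("container_1");
        Runner runner = tracker.lookup("container_1"); // (2) check passes
        tracker.kill("container_1");                   // (3) LOCALIZING -> KILLING
        return runner == null ? Action.DIE : runner.processHeartbeat(); // (4)
    }

    public static void main(String[] args) {
        System.out.println(replayInterleaving()); // prints DIE
    }
}
```

Without the flag check in {{processHeartbeat}}, the same replay would answer LIVE, which is the extra heartbeat the issue describes.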



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

