[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102590#comment-13102590
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-2949:
----------------------------------------------------

Thanks for the patch, Ravi! A couple of comments:

bq. I think LocalizerTracker should be under the ResourceLocalizationService as 
it's not generic enough to be made a separate service. I feel that moving the 
tasks to the service startup may be a good thing.
What I meant was that LocalizerTracker can be under ResourceLocalizationService 
but still extend {{AbstractService}} and thus implement the life-cycle 
properly; that's been the convention and discipline we've been adhering to in 
YARN :)
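
To illustrate the nesting, here is a minimal sketch. It assumes a hypothetical {{SimpleService}} base class as a stand-in for {{AbstractService}} (the real YARN class is not reproduced here), and the state names and the {{trackerState()}} helper are made up for the example. The point is only that the inner tracker has its own init/start/stop and the enclosing service drives them.

```java
public class NestedServiceSketch {

    /** Hypothetical life-cycle states; not the real YARN enum. */
    enum State { NOTINITED, INITED, STARTED, STOPPED }

    /** Hypothetical stand-in for a service base class with a life-cycle. */
    static abstract class SimpleService {
        private State state = State.NOTINITED;

        void init()  { state = State.INITED; }
        void start() { state = State.STARTED; }
        void stop()  { state = State.STOPPED; }

        State getState() { return state; }
    }

    /** The tracker stays an implementation detail, yet has a full life-cycle. */
    static class LocalizerTracker extends SimpleService {
    }

    /** The enclosing service owns the tracker and forwards each transition. */
    static class ResourceLocalizationService extends SimpleService {
        private final LocalizerTracker tracker = new LocalizerTracker();

        @Override void init()  { tracker.init();  super.init(); }
        @Override void start() { tracker.start(); super.start(); }
        @Override void stop()  { tracker.stop();  super.stop(); }

        State trackerState() { return tracker.getState(); }
    }

    public static void main(String[] args) {
        ResourceLocalizationService service = new ResourceLocalizationService();
        service.init();
        service.start();
        service.stop();
        System.out.println("tracker state after stop: " + service.trackerState());
    }
}
```

With this shape, stopping the outer service always stops the tracker too, so nothing depends on a shutdown hook to clean it up.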

Granted, the localizerTracker is going away because of NM's shutdown hook, but 
one thing that is not clear is how the {{cacheCleanup}} executor-service is 
going away during JVM shutdown in your manual verification after the patch. 
Maybe it is the timing, and NM will probably shut down cleanly if there are no 
active threads in the pool? I'd think that we should do an explicit shutdown on 
the executor-service.
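
A rough sketch of what an explicit shutdown could look like, using only {{java.util.concurrent}}. The class name, the no-op cleanup task, and the timeouts are assumptions for illustration, not what the patch does. Threads created by the default {{Executors}} thread factory are non-daemon, so a pool that is never shut down can keep the JVM alive.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CacheCleanupShutdownSketch {

    // Hypothetical stand-in for the cacheCleanup executor-service.
    private final ScheduledExecutorService cacheCleanup =
            Executors.newScheduledThreadPool(1);

    /** Schedules a periodic no-op task in place of the real cleanup work. */
    public void start() {
        cacheCleanup.scheduleWithFixedDelay(() -> { }, 10, 10, TimeUnit.MILLISECONDS);
    }

    /**
     * Shuts the pool down explicitly and waits briefly for it to terminate.
     * Returns true if the executor has terminated by the time we return.
     */
    public boolean stop() {
        cacheCleanup.shutdown(); // no new tasks; periodic tasks are cancelled by default
        try {
            if (!cacheCleanup.awaitTermination(1, TimeUnit.SECONDS)) {
                cacheCleanup.shutdownNow(); // interrupt anything still running
                cacheCleanup.awaitTermination(1, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            cacheCleanup.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return cacheCleanup.isTerminated();
    }

    public static void main(String[] args) {
        CacheCleanupShutdownSketch sketch = new CacheCleanupShutdownSketch();
        sketch.start();
        System.out.println("terminated: " + sketch.stop());
    }
}
```

Calling something like {{stop()}} from the owning service's stop path would make the clean exit independent of timing.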

> NodeManager in a inconsistent state if a service startup fails.
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-2949
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2949
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.24.0
>            Reporter: Ravi Teja Ch N V
>            Assignee: Ravi Teja Ch N V
>         Attachments: MAPREDUCE-2949.patch, Threaddump.txt
>
>
> When a service startup fails at the NodeManager, the NodeManager JVM does not 
> exit because the following threads are still running:
> Daemon Thread [Timer for 'NodeManager' metrics system] (Running)      
> Thread [pool-1-thread-1] (Running)    
> Thread [Thread-11] (Running)  
> Thread [DestroyJavaVM] (Running).
> As a result, the NodeManager keeps running even though no services are 
> started.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
