[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125635#comment-13125635
 ] 

Siddharth Seth commented on MAPREDUCE-3084:
-------------------------------------------

There are two races: one while ContainersLaunch is localizing files prior to 
launch, and the other between generation of the CONTAINER_LAUNCHED event and 
the actual launch. From an initial look at the patch, it seems both cases 
will be handled by the changes (I will continue looking).
One case that could be problematic is if there are multiple threads in the 
AsyncDispatcher (one per event type). That could still lead to a race between 
CE.cancelOrSendSignal and CE.launchContainer (similar issues likely exist 
elsewhere as well).
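
To illustrate the kind of guard that would close a cancelOrSendSignal / launchContainer race when the two calls run on different threads, here is a minimal sketch. It is not taken from the patch: LaunchGuard, Phase, tryLaunch and tryCancel are hypothetical names, and the sketch only covers the decision of which path wins, not the actual process start or signal delivery.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper for illustration; not a class from the Hadoop code base.
class LaunchGuard {
    enum Phase { PENDING, LAUNCHED, CANCELLED }

    private final AtomicReference<Phase> phase =
        new AtomicReference<>(Phase.PENDING);

    // Launch path: returns true only if no kill was recorded first.
    boolean tryLaunch() {
        return phase.compareAndSet(Phase.PENDING, Phase.LAUNCHED);
    }

    // Kill path: returns true if the launch was prevented; false means the
    // launch already won the race and the caller must signal the process.
    boolean tryCancel() {
        return phase.compareAndSet(Phase.PENDING, Phase.CANCELLED);
    }

    Phase phase() {
        return phase.get();
    }

    public static void main(String[] args) {
        LaunchGuard killedFirst = new LaunchGuard();
        System.out.println(killedFirst.tryCancel()); // kill wins
        System.out.println(killedFirst.tryLaunch()); // launch is refused
    }
}
```

Because both paths go through a single compare-and-set, exactly one of them can move the guard out of PENDING, regardless of how many dispatcher threads are involved. A kill that loses the race would still need to wait until the process is actually running before it can be signalled, which this sketch does not address.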
Vinod had an alternate suggestion: use additional states and move the 
START_CONTAINER localization to a separate state.
Would it be simpler to fix this by just using interrupts? (Shell does 
currently ignore interrupts in some cases, though.)
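
As a rough sketch of the interrupt idea, assuming the launcher thread blocks in a call that honours interruption (which, as noted, Shell does not always do): the kill path interrupts the launcher, and the launcher skips the launch when it sees the interrupt. InterruptibleLaunch and runWithKill are hypothetical names, and the latches are stand-ins for the real localization work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration; not code from the Hadoop code base.
class InterruptibleLaunch {

    // Returns true if the simulated launch ran, false if the kill
    // interrupted the launcher thread before it got that far.
    static boolean runWithKill(boolean killDuringLocalization) {
        AtomicBoolean launched = new AtomicBoolean(false);
        CountDownLatch localizing = new CountDownLatch(1);
        CountDownLatch localized = new CountDownLatch(1);

        Thread launcher = new Thread(() -> {
            try {
                localizing.countDown(); // localization has begun
                localized.await();      // stand-in for blocking localization
                launched.set(true);     // stand-in for starting the process
            } catch (InterruptedException e) {
                // Kill arrived first: skip the launch, keep the flag set.
                Thread.currentThread().interrupt();
            }
        });
        launcher.start();

        try {
            localizing.await();
            if (killDuringLocalization) {
                launcher.interrupt();  // kill path
            } else {
                localized.countDown(); // localization completes normally
            }
            launcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("caller interrupted", e);
        }
        return launched.get();
    }

    public static void main(String[] args) {
        System.out.println(runWithKill(true));
        System.out.println(runWithKill(false));
    }
}
```

This only helps while the launcher is in an interruptible wait. Once the process has been started, the kill still has to go through a signal to the container.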
                
> race when KILL_CONTAINER is received for a LOCALIZED container
> --------------------------------------------------------------
>
>                 Key: MAPREDUCE-3084
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3084
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Siddharth Seth
>            Assignee: Hitesh Shah
>            Priority: Blocker
>         Attachments: MR-3084.1.patch, MR-3084.wip.patch
>
>
> Depending on when ContainersLaunch starts a container, a {{KILL_CONTAINER}} 
> received while the container state is {{LOCALIZED}} ({{LAUNCH_CONTAINER}} 
> event already sent) can end up generating a {{CONTAINER_LAUNCHED}} event, 
> which isn't handled by ContainerState: {{KILLING}}. Also, the launched 
> container won't be killed, since {{CLEANUP_CONTAINER}} would have already 
> been processed.


        
