[
https://issues.apache.org/jira/browse/MAPREDUCE-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hitesh Shah updated MAPREDUCE-3084:
-----------------------------------
Attachment: MR-3084.wip.patch
Attaching a more-or-less working version of the patch that fixes the issue.
Handling the launched event in the KILLING state is effectively a no-op, as the
container cleanup event is always handled after the container launch event.
The patch ensures that the container either never comes up (if it has not
started yet) or is killed (if it has).
This requires changes in hadoop-common to get around the asynchronous nature
of the launches.
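
To illustrate the "either it never comes up, or it gets killed" guarantee, here
is a minimal sketch of one way a launch and a cleanup request could coordinate
through a single atomic flag. It is not taken from the attached patch: the
class LaunchKillSketch, the nested Launch type, and the methods tryLaunch() and
cleanup() are hypothetical names made up for this example, and actually
starting or signalling a process is reduced to flipping a boolean.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class LaunchKillSketch {

    /** Hypothetical stand-in for one container's launch task (not a real Hadoop class). */
    static final class Launch {
        // Whichever side flips this flag first (launcher or cleanup) decides the outcome.
        private final AtomicBoolean claimed = new AtomicBoolean(false);
        // Placeholder for "a container process exists"; a real launcher would track a pid.
        private volatile boolean running = false;

        /** Called by the launcher. Returns true only if the container was started. */
        boolean tryLaunch() {
            if (!claimed.compareAndSet(false, true)) {
                return false; // cleanup got here first, so never start the container
            }
            running = true;   // assumption: stands in for starting the real process
            return true;
        }

        /** Called when the kill/cleanup request is processed. */
        String cleanup() {
            if (claimed.compareAndSet(false, true)) {
                return "PREVENTED"; // launch had not begun and now never will
            }
            // The launcher claimed the flag first. A real implementation would also
            // have to wait for the process to actually exist before signalling it.
            running = false;  // assumption: stands in for killing the real process
            return "KILLED";
        }

        boolean isRunning() {
            return running;
        }
    }

    public static void main(String[] args) {
        // Kill processed before the launcher runs: the container never starts.
        Launch killFirst = new Launch();
        System.out.println(killFirst.cleanup() + " " + killFirst.tryLaunch());

        // Launcher runs first: the later cleanup kills the running container.
        Launch launchFirst = new Launch();
        System.out.println(launchFirst.tryLaunch() + " " + launchFirst.cleanup());
    }
}
```

The compare-and-set makes the two orderings mutually exclusive, so a container
cannot be left running after a cleanup that believed nothing had been launched.
The sketch deliberately ignores the window between claiming the flag and the
process actually being up, which is where the asynchronous launch makes the
real fix harder.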
Sid/Vinod, please take a look and let me know if you see anything wrong or
missing.
Given the slightly complex nature of this change, I decided not to incorporate
the other missing state transitions into this patch; I will open a separate
JIRA for those instead.
> race when KILL_CONTAINER is received for a LOCALIZED container
> --------------------------------------------------------------
>
> Key: MAPREDUCE-3084
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3084
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Siddharth Seth
> Assignee: Hitesh Shah
> Priority: Blocker
> Attachments: MR-3084.wip.patch
>
>
> Depending on when ContainersLaunch starts a container, {{KILL_CONTAINER}}
> when container state is {{LOCALIZED}} ({{LAUNCH_CONTAINER}} event already
> sent) can end up generating a {{CONTAINER_LAUNCHED}} event - which isn't
> handled by ContainerState: {{KILLING}}. Also, the launched container won't be
> killed since {{CLEANUP_CONTAINER}} would have already been processed.