[ https://issues.apache.org/jira/browse/MAPREDUCE-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hitesh Shah updated MAPREDUCE-3240:
-----------------------------------
Attachment: MR-3240.wip.patch
The patch does the following:
- Sends a SIGTERM followed by a SIGKILL when cleaning up a container.
- Introduces new config settings for the delay between the SIGTERM and the
SIGKILL.
- Introduces activeContainers within the ContainerExecutor. The launcher uses
it to set whether a container should be launched. If cleanup is called before
the process starts, this flag ensures that the process is never started. This
addresses the race-kill issue in MR-3084.
- Getting the pid after the shell executor has completed is unreliable, so
task.sh now writes the pid into a local file, which the containerlauncher can
read and use to kill the process.
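
The pid-file and SIGTERM/SIGKILL steps in the list above can be sketched roughly as follows. This is a Python illustration only, not code from the patch: the names read_pid_file, signal_group, cleanup_container and kill_delay_secs are hypothetical, the pid file is assumed to hold a single decimal pid, and the container is assumed to have called setsid so that its pid is also its process group id.

```python
import os
import signal
import time
from typing import Optional


def read_pid_file(path: str) -> Optional[int]:
    """Return the pid stored in `path`, or None if it is missing or malformed."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def signal_group(pgid: int, sig: int) -> bool:
    """Send `sig` to every process in group `pgid`; False if the group is gone."""
    try:
        os.killpg(pgid, sig)
        return True
    except ProcessLookupError:
        return False


def cleanup_container(pid_file: str, kill_delay_secs: float) -> bool:
    """SIGTERM the container's process group, wait, then SIGKILL what is left.

    Returns False when no pid could be read, in which case nothing is signalled.
    """
    pid = read_pid_file(pid_file)
    if pid is None:
        return False
    # Assumption: the container called setsid at startup, so its pid is
    # also the id of its process group.
    if signal_group(pid, signal.SIGTERM):
        time.sleep(kill_delay_secs)        # grace period for a clean shutdown
        signal_group(pid, signal.SIGKILL)  # harmless if everything already exited
    return True
```

A caller would invoke something like cleanup_container("/path/to/container.pid", 0.25), with the delay taken from configuration. Sending SIGKILL unconditionally after the delay keeps the sketch simple; checking for survivors first would be an equally reasonable design.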
> NM should send a SIGKILL for completed containers also
> ------------------------------------------------------
>
> Key: MAPREDUCE-3240
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3240
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2, nodemanager
> Affects Versions: 0.23.0
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Hitesh Shah
> Attachments: MR-3240.wip.patch
>
>
> This is to address containers that exit properly after spawning
> sub-processes themselves. We don't want to leave these sub-process trees
> behind, or else they can pillage the NM's resources.
> Today, we already have code to send SIGKILL to the whole process tree
> (possible because setsid gives it a single session ID) when the container is
> alive. We need to obtain the PID of each container when it starts and use
> that PID to send the signal in the completed-container case as well.
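
The completed-container case described above can be illustrated with a small Python sketch. It is not NodeManager code: launch_in_new_session and kill_leftover_group are hypothetical names, and the sketch assumes a POSIX system where the container is started as a session leader. The point it shows is that a pid recorded at launch still addresses the process group after the leader itself has exited, as long as sub-processes remain in that group.

```python
import os
import signal
import subprocess
import sys


def launch_in_new_session(argv):
    """Start `argv` as a session leader and return (Popen handle, pgid)."""
    proc = subprocess.Popen(argv, start_new_session=True)  # child calls setsid()
    # A new session leader is also a process group leader, so pgid == pid.
    return proc, proc.pid


def kill_leftover_group(pgid: int) -> bool:
    """SIGKILL whatever still runs in group `pgid`; False if nothing is left."""
    try:
        os.killpg(pgid, signal.SIGKILL)
        return True
    except ProcessLookupError:
        return False
```

For a container that spawns a long-running sub-process and then exits, kill_leftover_group(pgid) called after the container has completed would still reach the surviving sub-process, because it inherited the container's process group.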