[ https://issues.apache.org/jira/browse/MESOS-17?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Charles Reiss updated MESOS-17:
-------------------------------
Description: The Hadoop framework considers tasks finished when they are in
the COMMIT_PENDING state. When using the LXC isolation module, this can cause
the Hadoop executor's memory allocation to be reduced before the task actually
commits. When this happens, the Hadoop executor is sometimes killed for
exceeding its memory allocation, leaving the tasks stalled until the master
detects the lost task tracker by timeout. (was: When using the LXC isolation
module, the Hadoop framework considers tasks finished when they are in the
COMMIT_PENDING state. This can cause the Hadoop executor's memory allocation to
be reduced before the task actually commits. When this happens, the Hadoop
executor is sometimes killed for exceeding its memory allocation, leaving the
tasks stalled until the master detects the lost task tracker by timeout.)
> Hadoop executors killed while tasks in COMMIT_PENDING
> -----------------------------------------------------
>
> Key: MESOS-17
> URL: https://issues.apache.org/jira/browse/MESOS-17
> Project: Mesos
> Issue Type: Bug
> Components: isolation
> Environment: LXC isolation module, Hadoop framework
> Reporter: Charles Reiss
> Priority: Minor
> Labels: hadoop, lxc
>
> The Hadoop framework considers tasks finished when they are in the
> COMMIT_PENDING state. When using the LXC isolation module, this can cause the
> Hadoop executor's memory allocation to be reduced before the task actually
> commits. When this happens, the Hadoop executor is sometimes killed for
> exceeding its memory allocation, leaving the tasks stalled until the master
> detects the lost task tracker by timeout.
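
The accounting problem described above can be illustrated with a minimal Python sketch. It is not the Hadoop framework's actual code: the TaskState enum, the HOLDS_MEMORY set, the executor_memory_mb function, and the per-task and base memory figures are all hypothetical names and values chosen for illustration. The assumption is that the executor's memory request is derived from the number of tasks it still hosts, so a task in COMMIT_PENDING should keep counting until it actually commits.

```python
from enum import Enum


class TaskState(Enum):
    # Hypothetical subset of task states, for illustration only.
    RUNNING = "RUNNING"
    COMMIT_PENDING = "COMMIT_PENDING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    KILLED = "KILLED"


# States in which a task is assumed to still occupy executor memory.
# COMMIT_PENDING is included because the task has not committed yet.
HOLDS_MEMORY = frozenset({TaskState.RUNNING, TaskState.COMMIT_PENDING})


def executor_memory_mb(task_states, per_task_mb, base_mb):
    """Return the memory (in MB) to keep allocated to the executor.

    Counts every task whose state is in HOLDS_MEMORY, so the allocation
    is not reduced while a task is still waiting to commit.
    """
    active = sum(1 for state in task_states if state in HOLDS_MEMORY)
    return base_mb + active * per_task_mb
```

With one RUNNING, one COMMIT_PENDING, and one SUCCEEDED task, a per-task figure of 512 MB, and a base of 256 MB, this sketch keeps 1280 MB allocated. Treating COMMIT_PENDING as finished would instead give 768 MB, which is the kind of premature reduction the report describes.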
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira