[ https://issues.apache.org/jira/browse/MESOS-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363851#comment-16363851 ]

Greg Mann commented on MESOS-1720:
----------------------------------

Patches on master:
{code}
commit 3e3c582f10e8154e4a76c2b481cc33c8d4d0310c
Author: Meng Zhu <m...@mesosphere.io>
Date:   Tue Feb 13 22:45:23 2018 -0800

    Added tests to check that executors which fail to launch are removed.

    These tests ensure that the agent sends `ExitedExecutorMessage` when
    a task group fails to launch due to unschedule GC failure, or when a
    task fails to launch due to task authorization failure.

    Review: https://reviews.apache.org/r/65593/
{code}
{code}
commit a8e723b6ca5a268cc97e39919f7a6b4aedfc3222
Author: Meng Zhu <m...@mesosphere.io>
Date:   Tue Feb 13 22:45:21 2018 -0800

    Added a mock method for `__run()` to the mock slave.

    Review: https://reviews.apache.org/r/65626/
{code}
{code}
commit a6c065060d94dc04dcdc81021035d846ad7040a0
Author: Meng Zhu <m...@mesosphere.io>
Date:   Tue Feb 13 22:45:16 2018 -0800

    Added a test to ensure master removes executors that never launched.

    This test ensures that the agent sends `ExitedExecutorMessage` when
    the executor is never launched so that the master's executor
    bookkeeping entry is removed. See MESOS-1720.

    Review: https://reviews.apache.org/r/65448/
{code}
{code}
commit b5350fecc8604bdddb45303d9363aff4ca60cfcc
Author: Meng Zhu <m...@mesosphere.io>
Date:   Tue Feb 13 22:45:07 2018 -0800

    Fixed a bug where executor info lingers on master if failed to launch.

    Master relies on `ExitedExecutorMessage` from the agent to remove
    executor entries. However, this message won't be sent if an executor
    never actually launched (due to transient error), leaving executor
    info on the master and the executor's resources claimed.
    See MESOS-1720.

    This patch fixes this issue by sending the `ExitedExecutorMessage`
    from the agent if the executor is never launched.

    Review: https://reviews.apache.org/r/65449/
{code}
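
The fix described in commit b5350fe can be pictured with a toy model. The sketch below is not Mesos code: `ToyMaster`, `ToyAgent`, and all of their methods and fields are hypothetical names, and resources are reduced to a single CPU count. It only illustrates why the master's bookkeeping leaks when no `ExitedExecutorMessage` arrives, and how sending one on a failed launch releases the entry.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMaster:
    """Hypothetical stand-in for the master's executor bookkeeping."""
    total_cpus: float
    executors: dict = field(default_factory=dict)  # executor_id -> cpus

    def add_executor(self, executor_id: str, cpus: float) -> None:
        self.executors[executor_id] = cpus

    def exited_executor(self, executor_id: str) -> None:
        # Models handling of an `ExitedExecutorMessage`: drop the entry
        # so its resources are no longer counted as used.
        self.executors.pop(executor_id, None)

    def available_cpus(self) -> float:
        return self.total_cpus - sum(self.executors.values())


@dataclass
class ToyAgent:
    """Hypothetical agent; the flag switches between old and new behavior."""
    master: ToyMaster
    notify_on_launch_failure: bool  # False models the behavior before the fix

    def run_task(self, executor_id: str, launch_ok: bool) -> None:
        if launch_ok:
            return  # executor is running; nothing to clean up yet
        if self.notify_on_launch_failure:
            # The executor never launched, so tell the master right away.
            self.master.exited_executor(executor_id)


def cpus_after_failed_launch(notify: bool) -> float:
    """Return the CPUs the master considers free after one failed launch."""
    master = ToyMaster(total_cpus=4.0)
    agent = ToyAgent(master, notify_on_launch_failure=notify)
    master.add_executor("executor-1", cpus=1.0)
    agent.run_task("executor-1", launch_ok=False)
    return master.available_cpus()
```

In this model, calling `cpus_after_failed_launch(False)` leaves one CPU permanently accounted to an executor that does not exist, while `cpus_after_failed_launch(True)` returns the full capacity.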
{code}
commit 0321b85ce66f21e9cb6990a3032cb7f8f709c6e6
Author: Meng Zhu <m...@mesosphere.io>
Date:   Tue Feb 13 22:45:03 2018 -0800

    Added helper function for the agent to send `ExitedExecutorMessage`.

    Review: https://reviews.apache.org/r/65446/
{code}
{code}
commit ce7f1f6a0807b96b92cb4c755c52f36e1a8e2853
Author: Meng Zhu <m...@mesosphere.io>
Date:   Tue Feb 13 22:44:58 2018 -0800

    Made master set `launch_executor` in the RunTask(Group)Message.

    By setting a new field `launch_executor` in the RunTask(Group)Message,
    the master is able to control executor creation on the agent.

    Also refactored the `addTask()` logic. Added two new functions:
    `isTaskLaunchExecutor()` checks if a task needs to launch an executor;
    `addExecutor()` adds an executor to the framework and slave.

    Review: https://reviews.apache.org/r/65504/
{code}
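
As a rough illustration of the refactoring in commit ce7f1f6, the sketch below mimics the split into a check and an add step. It is written in Python with hypothetical types (`Task`, `RunTaskMessage`, `ToyMaster`), whereas the real implementation lives in the C++ master. The rule used here, that an executor must be launched exactly when the master has no entry for it yet, is an assumption made for the example, and the special handling of the command executor is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    executor_id: str


@dataclass
class RunTaskMessage:
    """Hypothetical message carrying the new `launch_executor` flag."""
    task: Task
    launch_executor: bool


@dataclass
class ToyMaster:
    # Executor IDs the master already tracks. A real master would also
    # key these by framework and agent; that is omitted here.
    executors: set = field(default_factory=set)

    def is_task_launch_executor(self, task: Task) -> bool:
        # Assumed rule: a new executor is needed only when the master
        # does not know about the task's executor yet.
        return task.executor_id not in self.executors

    def add_executor(self, executor_id: str) -> None:
        self.executors.add(executor_id)

    def add_task(self, task: Task) -> RunTaskMessage:
        # Decide first, then record the executor, then build the message
        # so the agent is told explicitly whether to create an executor.
        launch = self.is_task_launch_executor(task)
        if launch:
            self.add_executor(task.executor_id)
        return RunTaskMessage(task=task, launch_executor=launch)
```

With this model, the first task that names `executor-1` yields a message with `launch_executor` set, and any later task that names the same executor yields one with the flag cleared.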
{code}
commit 7c29031bf35232a9e8b0c8bbbb8c826d0185673a
Author: Meng Zhu <m...@mesosphere.io>
Date:   Tue Feb 13 22:44:48 2018 -0800

    Added new protobuf field `launch_executor` in RunTask(Group)Message.

    This boolean flag is used for the master to specify whether a
    new executor should be launched for the task or task group (with
    the exception of the command executor). This allows the master
    to control executor creation on the agent.

    Also updated the relevant message handlers and mock functions.

    Review: https://reviews.apache.org/r/65445/
{code}

> Slave should send exited executor message when the executor is never launched.
> ------------------------------------------------------------------------------
>
>                 Key: MESOS-1720
>                 URL: https://issues.apache.org/jira/browse/MESOS-1720
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent, master
>            Reporter: Benjamin Mahler
>            Assignee: Meng Zhu
>            Priority: Major
>              Labels: mesosphere
>
> When the slave sends TASK_LOST before launching an executor for a task, the 
> slave does not send an exited executor message to the master.
> Since the master receives no exited executor message, it still thinks the 
> executor's resources are consumed on the slave.
> One possible fix for this would be to send the exited executor message to the 
> master in these cases.



