[jira] [Assigned] (MESOS-8383) Add metrics for operations in Storage Local Resource Provider (SLRP).
[ https://issues.apache.org/jira/browse/MESOS-8383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chun-Hung Hsiao reassigned MESOS-8383:
--------------------------------------

    Assignee: Chun-Hung Hsiao

> Add metrics for operations in Storage Local Resource Provider (SLRP).
> ---------------------------------------------------------------------
>
>                 Key: MESOS-8383
>                 URL: https://issues.apache.org/jira/browse/MESOS-8383
>             Project: Mesos
>          Issue Type: Task
>            Reporter: Jie Yu
>            Assignee: Chun-Hung Hsiao
>            Priority: Major
>

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (MESOS-8584) Move volume file attach/detach from the agent to the containerizer.
Gilbert Song created MESOS-8584:
---------------------------------

             Summary: Move volume file attach/detach from the agent to the containerizer.
                 Key: MESOS-8584
                 URL: https://issues.apache.org/jira/browse/MESOS-8584
             Project: Mesos
          Issue Type: Improvement
          Components: containerization
            Reporter: Gilbert Song

A volume is a container-level concept and is supported via the isolators in the containerizer. We should consider moving the attach/detach calls for the files endpoint from the agent to the containerizer. This requires a refactoring.

/cc [~jieyu] [~qianzhang]
[jira] [Assigned] (MESOS-8573) Container stuck in PULLING when Docker daemon hangs
[ https://issues.apache.org/jira/browse/MESOS-8573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gilbert Song reassigned MESOS-8573:
-----------------------------------

    Assignee: Gilbert Song

> Container stuck in PULLING when Docker daemon hangs
> ---------------------------------------------------
>
>                 Key: MESOS-8573
>                 URL: https://issues.apache.org/jira/browse/MESOS-8573
>             Project: Mesos
>          Issue Type: Improvement
>    Affects Versions: 1.5.0
>            Reporter: Greg Mann
>            Assignee: Gilbert Song
>            Priority: Major
>              Labels: mesosphere
>
> When the {{force}} argument is not set to {{true}}, {{Docker::pull}} will
> always perform a {{docker inspect}} call before it does a {{docker pull}}. If
> either of these two Docker CLI calls hangs indefinitely, the Docker container
> will be stuck in the PULLING state. This means that we make no further
> progress in the {{launch()}} call path, so the executor binary is never
> executed, the {{Future}} associated with the {{launch()}} call is never
> failed or satisfied, and {{wait()}} is never called on the container. Thus,
> when the executor registration timeout elapses, the agent's call to
> {{containerizer->destroy()}} gets stuck waiting on the container status, and
> its continuation is never invoked.
> This leaves the task destined for that Docker executor stuck in TASK_STAGING
> from the framework's perspective, and attempts to kill the task will fail.
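The hang described in MESOS-8573 comes from waiting on a CLI call with no upper bound. The following is a minimal sketch of bounding such a call with a timeout so that the caller gets a failure it can act on. It is not the Mesos fix (the issue does not state one), and `run_with_timeout`, `HUNG` and `QUICK` are hypothetical names introduced only for illustration; a sleeping child process stands in for a hung `docker inspect` or `docker pull`.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_secs):
    """Run a command and return (ok, detail), failing instead of blocking forever."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_secs,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before re-raising,
        # so no process is left behind when we report the failure.
        return False, "timed out after %.1fs" % timeout_secs
    if result.returncode != 0:
        return False, "exited with status %d" % result.returncode
    return True, "ok"


# Hypothetical stand-ins: a child that sleeps plays the role of a hung
# Docker CLI call, and a child that exits at once plays a healthy one.
HUNG = [sys.executable, "-c", "import time; time.sleep(30)"]
QUICK = [sys.executable, "-c", "pass"]
```

With a bound in place, a caller in the position of `launch()` would see a failed result after the timeout instead of a future that is never satisfied, which in turn lets a later destroy path make progress.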
[jira] [Comment Edited] (MESOS-8565) Persistent volumes are not visible in Mesos UI when launching a pod using default executor.
[ https://issues.apache.org/jira/browse/MESOS-8565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363643#comment-16363643 ]

Gilbert Song edited comment on MESOS-8565 at 2/14/18 6:41 PM:
--------------------------------------------------------------

commit 9d4c6d9576741cc480c75f8e59cc8d1adc9849fc
Author: Qian Zhang
Date:   Wed Feb 14 00:17:37 2018 -0800

    Attached/detached volume directory for task which has volume specified.

    Review: [https://reviews.apache.org/r/65570/]


was (Author: gilbert):
commit a7714536fad1140fd0c07c47e32b40e9ed00a3c3
Author: Qian Zhang
Date:   Mon Feb 5 20:42:07 2018 +0800

    Reaped the container process directly in Docker executor.

    Due to a Docker issue (https://github.com/moby/moby/issues/33820),
    Docker daemon can fail to catch a container exit, i.e., the container
    process has already exited but the command `docker ps` shows the
    container still running, this will lead to the "docker run" command
    that we execute in Docker executor never returning, and it will also
    cause the `docker stop` command takes no effect, i.e., it will return
    without error but `docker ps` shows the container still running, so
    the task will stuck in `TASK_KILLING` state.

    To workaround this Docker issue, in this patch we made Docker executor
    reaps the container process directly so Docker executor will be notified
    once the container process exits.

    Review: https://reviews.apache.org/r/65518

> Persistent volumes are not visible in Mesos UI when launching a pod using
> default executor.
> -------------------------------------------------------------------------
>
>                 Key: MESOS-8565
>                 URL: https://issues.apache.org/jira/browse/MESOS-8565
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.2.2, 1.3.1, 1.4.1
>            Reporter: Qian Zhang
>            Assignee: Qian Zhang
>            Priority: Major
>             Fix For: 1.6.0, 1.5.1
>
> When user launches a pod to use a persistent volume in DC/OS, the nested
> containers in the pod can access the PV successfully and the PV directory of
> the executor shown in Mesos UI has all the contents written by the tasks, but
> the PV directory of the tasks shown in DC/OS UI and Mesos UI is empty.
[jira] [Created] (MESOS-8583) Autotools and Cmake do not give the same permissions on files in build/bin/.
Armand Grillet created MESOS-8583:
-----------------------------------

             Summary: Autotools and Cmake do not give the same permissions on files in build/bin/.
                 Key: MESOS-8583
                 URL: https://issues.apache.org/jira/browse/MESOS-8583
             Project: Mesos
          Issue Type: Bug
          Components: build
    Affects Versions: 1.6.0
            Reporter: Armand Grillet

With CMake, the files in {{build/bin}} have the following access rights:
{code:java}
ls -l
total 84
-rw--x. 1 agrillet agrillet 1583 Feb 14 10:13 gdb-mesos-agent.sh
-rw--x. 1 agrillet agrillet 1577 Feb 14 10:13 gdb-mesos-local.sh
-rw--x. 1 agrillet agrillet 1586 Feb 14 10:13 gdb-mesos-master.sh
-rw--x. 1 agrillet agrillet 1554 Feb 14 10:13 gdb-mesos-tests.sh
-rw--x. 1 agrillet agrillet 1543 Feb 14 10:13 lldb-mesos-agent.sh
-rw--x. 1 agrillet agrillet 1545 Feb 14 10:13 lldb-mesos-local.sh
-rw--x. 1 agrillet agrillet 1548 Feb 14 10:13 lldb-mesos-master.sh
-rw--x. 1 agrillet agrillet 1522 Feb 14 10:13 lldb-mesos-tests.sh
-rw--x. 1 agrillet agrillet 1840 Feb 14 10:13 mesos-agent-flags.sh
-rw--x. 1 agrillet agrillet 1047 Feb 14 10:13 mesos-agent.sh
-rw--x. 1 agrillet agrillet  929 Feb 14 10:13 mesos-local-flags.sh
-rw--x. 1 agrillet agrillet 1053 Feb 14 10:13 mesos-local.sh
-rw--x. 1 agrillet agrillet  892 Feb 14 10:13 mesos-master-flags.sh
-rw--x. 1 agrillet agrillet 1056 Feb 14 10:13 mesos-master.sh
-rw--x. 1 agrillet agrillet 1200 Feb 14 10:13 mesos.sh
-rw--x. 1 agrillet agrillet  901 Feb 14 10:13 mesos-tests-flags.sh
-rw--x. 1 agrillet agrillet 1056 Feb 14 10:13 mesos-tests.sh
-rw--x. 1 agrillet agrillet 1825 Feb 14 10:13 valgrind-mesos-agent.sh
-rw--x. 1 agrillet agrillet 1825 Feb 14 10:13 valgrind-mesos-local.sh
-rw--x. 1 agrillet agrillet 1828 Feb 14 10:13 valgrind-mesos-master.sh
-rw--x. 1 agrillet agrillet 1825 Feb 14 10:13 valgrind-mesos-tests.sh
{code}
With Autotools, the permissions are different:
{code}
ls -l
total 104
-rwxrwxr-x. 1 agrillet agrillet 1592 Feb 14 10:32 gdb-mesos-agent.sh
-rwxrwxr-x. 1 agrillet agrillet 1586 Feb 14 10:32 gdb-mesos-local.sh
-rwxrwxr-x. 1 agrillet agrillet 1595 Feb 14 10:32 gdb-mesos-master.sh
-rwxrwxr-x. 1 agrillet agrillet 1592 Feb 14 10:32 gdb-mesos-slave.sh
-rwxrwxr-x. 1 agrillet agrillet 1563 Feb 14 10:32 gdb-mesos-tests.sh
-rwxrwxr-x. 1 agrillet agrillet 1552 Feb 14 10:32 lldb-mesos-agent.sh
-rwxrwxr-x. 1 agrillet agrillet 1554 Feb 14 10:32 lldb-mesos-local.sh
-rwxrwxr-x. 1 agrillet agrillet 1557 Feb 14 10:32 lldb-mesos-master.sh
-rwxrwxr-x. 1 agrillet agrillet 1552 Feb 14 10:32 lldb-mesos-slave.sh
-rwxrwxr-x. 1 agrillet agrillet 1531 Feb 14 10:32 lldb-mesos-tests.sh
-rw-rw-r--. 1 agrillet agrillet 1840 Feb 14 10:32 mesos-agent-flags.sh
-rwxrwxr-x. 1 agrillet agrillet 1047 Feb 14 10:32 mesos-agent.sh
-rw-rw-r--. 1 agrillet agrillet  929 Feb 14 10:32 mesos-local-flags.sh
-rwxrwxr-x. 1 agrillet agrillet 1053 Feb 14 10:32 mesos-local.sh
-rw-rw-r--. 1 agrillet agrillet  901 Feb 14 10:32 mesos-master-flags.sh
-rwxrwxr-x. 1 agrillet agrillet 1056 Feb 14 10:32 mesos-master.sh
-rwxrwxr-x. 1 agrillet agrillet 1209 Feb 14 10:32 mesos.sh
-rw-rw-r--. 1 agrillet agrillet 1840 Feb 14 10:32 mesos-slave-flags.sh
-rwxrwxr-x. 1 agrillet agrillet 1047 Feb 14 10:32 mesos-slave.sh
-rw-rw-r--. 1 agrillet agrillet  901 Feb 14 10:32 mesos-tests-flags.sh
-rwxrwxr-x. 1 agrillet agrillet 1056 Feb 14 10:32 mesos-tests.sh
-rwxrwxr-x. 1 agrillet agrillet 1834 Feb 14 10:32 valgrind-mesos-agent.sh
-rwxrwxr-x. 1 agrillet agrillet 1834 Feb 14 10:32 valgrind-mesos-local.sh
-rwxrwxr-x. 1 agrillet agrillet 1837 Feb 14 10:32 valgrind-mesos-master.sh
-rwxrwxr-x. 1 agrillet agrillet 1834 Feb 14 10:32 valgrind-mesos-slave.sh
-rwxrwxr-x. 1 agrillet agrillet 1834 Feb 14 10:32 valgrind-mesos-tests.sh
{code}
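Comparing two long `ls -l` listings by eye is error-prone. As a rough sketch, the comparison in MESOS-8583 could be automated by reading the mode bits of files that exist in both build trees. The helper names `mode_string` and `permission_diff` are hypothetical and not part of the Mesos build; the output format follows `ls -l` but omits the trailing `.` that appears in the listings above.

```python
import os
import stat


def mode_string(path):
    """Return the `ls -l` style mode string for path, e.g. '-rwxr-xr-x'."""
    return stat.filemode(os.stat(path).st_mode)


def permission_diff(dir_a, dir_b):
    """Map each file name found in both directories to its two modes,
    keeping only the names whose modes differ."""
    common = sorted(set(os.listdir(dir_a)) & set(os.listdir(dir_b)))
    diff = {}
    for name in common:
        mode_a = mode_string(os.path.join(dir_a, name))
        mode_b = mode_string(os.path.join(dir_b, name))
        if mode_a != mode_b:
            diff[name] = (mode_a, mode_b)
    return diff
```

Calling `permission_diff` on the `build/bin` directory of a CMake build and that of an Autotools build (the directory locations are whatever the reader's two build trees happen to be) would list only the scripts whose permissions disagree. Files present in just one tree, such as the `*-slave*` scripts in the Autotools listing, are skipped by design.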
[jira] [Created] (MESOS-8582) Add a way to make sure an agent always knows the full framework information of all frameworks executing operations on its resources
Benjamin Bannier created MESOS-8582:
-------------------------------------

             Summary: Add a way to make sure an agent always knows the full framework information of all frameworks executing operations on its resources
                 Key: MESOS-8582
                 URL: https://issues.apache.org/jira/browse/MESOS-8582
             Project: Mesos
          Issue Type: Bug
          Components: agent, master, storage
    Affects Versions: 1.5.0
            Reporter: Benjamin Bannier

Currently an {{Operation}} contains only the {{FrameworkID}} of the originating framework, not the full {{FrameworkInfo}}. This is problematic in master failover scenarios where a master might learn about an operation triggered by a framework unknown to it. The way the master implementation is structured, we would like to create tracking structures for that framework (e.g., to sync with the allocator down the line), but cannot do so since we can only learn this information when either the framework reregisters, or an agent running tasks of that framework reconciles with the master. We also cannot use conjured-up dummy information until we learn the true {{FrameworkInfo}}, since some required fields in {{FrameworkInfo}} (namely {{FrameworkInfo.user}}) cannot be updated, see MESOS-703.

We should introduce a channel for agents to learn the full {{FrameworkInfo}} for all frameworks executing operations on their resources. For simplicity and symmetry with {{RunTaskMessage}}, it seems that adding an explicit {{FrameworkInfo}} field to {{Operation}} would do the job (e.g., allow atomic information transfer when operations are sent to the agent or on reconciliation with newly elected masters).
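To make the proposal in MESOS-8582 concrete, here is a toy model of a newly elected master reconciling operations reported by an agent. It is a sketch under stated assumptions, not the Mesos protobuf or master code: the classes `FrameworkInfo`, `Operation` and `FailedOverMaster`, their fields, and the `reconcile` method are simplified stand-ins invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FrameworkInfo:
    framework_id: str
    user: str  # Cannot be updated later (see MESOS-703), so a dummy value is unsafe.
    name: str


@dataclass(frozen=True)
class Operation:
    operation_id: str
    framework_id: str
    # The proposed addition: the full info travels with the operation.
    framework_info: Optional[FrameworkInfo] = None


@dataclass
class FailedOverMaster:
    """A freshly elected master that starts without any framework state."""
    frameworks: Dict[str, FrameworkInfo] = field(default_factory=dict)
    untracked: List[Operation] = field(default_factory=list)

    def reconcile(self, operation: Operation) -> bool:
        """Record an operation reported by an agent.

        Returns True if the operation's framework is tracked afterwards.
        """
        if operation.framework_id not in self.frameworks:
            if operation.framework_info is None:
                # Only the ID is known: tracking structures cannot be created.
                self.untracked.append(operation)
                return False
            self.frameworks[operation.framework_id] = operation.framework_info
        return True
```

In this model an operation that carries only a framework ID leaves the master unable to create tracking state, while one that carries the full info lets the master set up that state immediately, without waiting for the framework to reregister.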
[jira] [Commented] (MESOS-8468) `LAUNCH_GROUP` failure tears down the default executor.
[ https://issues.apache.org/jira/browse/MESOS-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363885#comment-16363885 ]

Qian Zhang commented on MESOS-8468:
------------------------------------

commit 632ff7f7f8e32d3f9507e9199c8a253ff755224e
Author: Gaston Kleiman
Date:   Wed Feb 14 14:35:34 2018 +0800

    Removed outdated executor-wide launched flag from the default executor.

    Review: https://reviews.apache.org/r/65616/

 src/launcher/default_executor.cpp | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

commit 54b6c5b9c7cb059ebd87ee0f9927cfa6ff73129d
Author: Gaston Kleiman
Date:   Wed Feb 14 14:35:22 2018 +0800

    Made the default executor treat agent disconnections more gracefully.

    This patch makes the default executor not shutdown if there are active
    child containers, and it fails to connect or is not subscribed to the
    agent when starting to launch a task group.

    Review: https://reviews.apache.org/r/65556/

 src/launcher/default_executor.cpp | 43 +++
 1 file changed, 35 insertions(+), 8 deletions(-)

commit 656196eeca4ab6449c4b9f329b5b9cac2f69a885
Author: Gaston Kleiman
Date:   Wed Feb 14 14:35:17 2018 +0800

    Added a regression test for MESOS-8468.

    Review: https://reviews.apache.org/r/65552/

 src/tests/default_executor_tests.cpp | 252 +
 1 file changed, 252 insertions(+)

commit c3f3542e7ecce82cad8b75fdc2db14fe8c43a5da
Author: Gaston Kleiman
Date:   Wed Feb 14 14:35:11 2018 +0800

    Stopped shutting down the whole default executor on task launch failure.

    The default executor would be completely shutdown on a
    `LAUNCH_NESTED_CONTAINER` failure.

    This patch makes it kill the affected task group instead of shutting
    down and killing all task groups.

    Review: https://reviews.apache.org/r/65551/

 src/launcher/default_executor.cpp | 165 ++--
 1 file changed, 103 insertions(+), 62 deletions(-)

commit 5c8852b244b09b4ae57e00abcd940482927d57e6
Author: Gaston Kleiman
Date:   Wed Feb 14 14:35:01 2018 +0800

    Made default executor not shutdown if unsubscribed during task launch.

    The default executor would unnecessarily shutdown if, while launching a
    task group, it gets unsubscribed after having successfully launched the
    task group's containers.

    Review: https://reviews.apache.org/r/65550/

 src/launcher/default_executor.cpp | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

commit 2e570b709dc7d15c73c8d728ef0b32e2416b0a08
Author: Gaston Kleiman
Date:   Wed Feb 14 14:34:56 2018 +0800

    Improved some default executor log messages.

    Review: https://reviews.apache.org/r/65549/

 src/launcher/default_executor.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

commit 29d1e4e1a1b894da78c2033f1932b282ee794f4b
Author: Gaston Kleiman
Date:   Wed Feb 14 14:34:50 2018 +0800

    Added `Event::Update` and `v1::scheduler::TaskStatus` ostream operators.

    This operators make gtest print a human-readable representation of the
    protos on test failures.

    Review: https://reviews.apache.org/r/65548/

 include/mesos/v1/mesos.hpp               |  3 +++
 include/mesos/v1/scheduler/scheduler.hpp | 10 ++
 src/v1/mesos.cpp                         | 37 +
 3 files changed, 50 insertions(+)

> `LAUNCH_GROUP` failure tears down the default executor.
> -------------------------------------------------------
>
>                 Key: MESOS-8468
>                 URL: https://issues.apache.org/jira/browse/MESOS-8468
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.2.0, 1.3.0, 1.4.0, 1.5.0
>            Reporter: Chun-Hung Hsiao
>            Assignee: Gastón Kleiman
>            Priority: Critical
>              Labels: default-executor, mesosphere
>             Fix For: 1.6.0, 1.5.1
>
> The following code in the default executor
> (https://github.com/apache/mesos/blob/12be4ba002f2f5ff314fbc16af51d095b0d90e56/src/launcher/default_executor.cpp#L525-L535)
> shows that if a `LAUNCH_NESTED_CONTAINER` call is failed (say, due to a
> fetcher failure), the whole executor will be shut down:
> {code:cpp}
> // Check if we received a 200 OK response for all the
> // `LAUNCH_NESTED_CONTAINER` calls. Shutdown the executor
> // if this is not the case.
> foreach (const Response& response, responses.get()) {
>   if (response.code != process::http::Status::OK) {
>     LOG(ERROR) << "Received '" << response.status << "' ("
>                << response.body << ") while launching child container";
>     _shutdown();
>     return;
>   }
> }
> {code}
> This is not expected by a
[jira] [Commented] (MESOS-8468) `LAUNCH_GROUP` failure tears down the default executor.
[ https://issues.apache.org/jira/browse/MESOS-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363883#comment-16363883 ]

Qian Zhang commented on MESOS-8468:
------------------------------------

https://reviews.apache.org/r/65616/

> `LAUNCH_GROUP` failure tears down the default executor.
> -------------------------------------------------------
>
>                 Key: MESOS-8468
>                 URL: https://issues.apache.org/jira/browse/MESOS-8468
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.2.0, 1.3.0, 1.4.0, 1.5.0
>            Reporter: Chun-Hung Hsiao
>            Assignee: Gastón Kleiman
>            Priority: Critical
>              Labels: default-executor, mesosphere
>
> The following code in the default executor
> (https://github.com/apache/mesos/blob/12be4ba002f2f5ff314fbc16af51d095b0d90e56/src/launcher/default_executor.cpp#L525-L535)
> shows that if a `LAUNCH_NESTED_CONTAINER` call is failed (say, due to a
> fetcher failure), the whole executor will be shut down:
> {code:cpp}
> // Check if we received a 200 OK response for all the
> // `LAUNCH_NESTED_CONTAINER` calls. Shutdown the executor
> // if this is not the case.
> foreach (const Response& response, responses.get()) {
>   if (response.code != process::http::Status::OK) {
>     LOG(ERROR) << "Received '" << response.status << "' ("
>                << response.body << ") while launching child container";
>     _shutdown();
>     return;
>   }
> }
> {code}
> This is not expected by a user. Instead, one would expect that a failed
> `LAUNCH_GROUP` won't affect other task groups launched by the same executor,
> similar to the case that a task failure only takes down its own task group.
> We should adjust the semantics to make a failed `LAUNCH_GROUP` not take down
> the executor and affect other task groups.
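The difference between the behavior reported in MESOS-8468 and the requested semantics can be shown with a small toy model. This is only an illustrative sketch: the `Executor` class, its methods, and the `isolate_failure` switch are hypothetical and do not mirror the structure of `default_executor.cpp`.

```python
from dataclasses import dataclass, field
from typing import Dict, List

HTTP_OK = 200


@dataclass
class Executor:
    # Maps a task group name to the IDs of its child containers.
    task_groups: Dict[str, List[str]] = field(default_factory=dict)
    killed: List[str] = field(default_factory=list)
    shut_down: bool = False

    def kill_task_group(self, group):
        """Kill the containers of one task group and forget the group."""
        self.killed.extend(self.task_groups.pop(group, []))

    def shutdown(self):
        """Reported behavior: tear down every task group, then the executor."""
        for group in list(self.task_groups):
            self.kill_task_group(group)
        self.shut_down = True

    def on_launch_responses(self, group, containers, codes, isolate_failure):
        """Handle the response codes of one group's container launches."""
        self.task_groups[group] = list(containers)
        if all(code == HTTP_OK for code in codes):
            return
        if isolate_failure:
            # Requested semantics: only the failed group is affected.
            self.kill_task_group(group)
        else:
            self.shutdown()
```

With `isolate_failure=False` a single non-200 response takes down every task group, which is what the issue describes; with `isolate_failure=True` task groups that are already running survive the failed launch.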
[jira] [Commented] (MESOS-1720) Slave should send exited executor message when the executor is never launched.
[ https://issues.apache.org/jira/browse/MESOS-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363865#comment-16363865 ]

Greg Mann commented on MESOS-1720:
-----------------------------------

Patches on 1.5.x:
{code}
commit 2bdf4935b7929d0dce614d76461cddb991df89da
Author: Meng Zhu
Date:   Tue Feb 13 22:45:07 2018 -0800

    Fixed a bug where executor info lingers on master if failed to launch.

    Master relies on `ExitedExecutorMessage` from the agent to
    remove executor entries. However, this message won't be sent
    if an executor never actually launched (due to transient error),
    leaving executor info on the master and the executor's resources
    claimed. See MESOS-1720.

    This patch fixes this issue by sending the `ExitedExecutorMessage`
    from the agent if the executor is never launched.

    Review: https://reviews.apache.org/r/65449/
{code}
{code}
commit fb0e2f1f81b2256a76cae83893e2a69fdd91fcd7
Author: Meng Zhu
Date:   Tue Feb 13 22:45:03 2018 -0800

    Added helper function for the agent to send `ExitedExecutorMessage`.

    Review: https://reviews.apache.org/r/65446/
{code}
{code}
commit 10aa875df8947f8cbfb318820101984d99259070
Author: Meng Zhu
Date:   Tue Feb 13 22:44:58 2018 -0800

    Made master set `launch_executor` in the RunTask(Group)Message.

    By setting a new field `launch_executor` in the RunTask(Group)Message,
    the master is able to control executor creation on the agent.

    Also refactored the `addTask()` logic. Added two new functions:
    `isTaskLaunchExecutor()` checks if a task needs to launch an executor;
    `addExecutor()` adds an executor to the framework and slave.

    Review: https://reviews.apache.org/r/65504/
{code}
{code}
commit 08e0ceb84e4bf353e1f938482bb6766bf73310c7
Author: Meng Zhu
Date:   Tue Feb 13 22:44:48 2018 -0800

    Added new protobuf field `launch_executor` in RunTask(Group)Message.

    This boolean flag is used for the master to specify whether a
    new executor should be launched for the task or task group (with
    the exception of the command executor). This allows the master
    to control executor creation on the agent.

    Also updated the relevant message handlers and mock functions.

    Review: https://reviews.apache.org/r/65445/
{code}

> Slave should send exited executor message when the executor is never launched.
> ------------------------------------------------------------------------------
>
>                 Key: MESOS-1720
>                 URL: https://issues.apache.org/jira/browse/MESOS-1720
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent, master
>            Reporter: Benjamin Mahler
>            Assignee: Meng Zhu
>            Priority: Major
>              Labels: mesosphere
>             Fix For: 1.6.0, 1.5.1
>
> When the slave sends TASK_LOST before launching an executor for a task, the
> slave does not send an exited executor message to the master.
> Since the master receives no exited executor message, it still thinks the
> executor's resources are consumed on the slave.
> One possible fix for this would be to send the exited executor message to the
> master in these cases.
[jira] [Commented] (MESOS-1720) Slave should send exited executor message when the executor is never launched.
[ https://issues.apache.org/jira/browse/MESOS-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363851#comment-16363851 ]

Greg Mann commented on MESOS-1720:
-----------------------------------

Patches on master:
{code}
commit 3e3c582f10e8154e4a76c2b481cc33c8d4d0310c
Author: Meng Zhu
Date:   Tue Feb 13 22:45:23 2018 -0800

    Added tests to check that executors which fail to launch are removed.

    Theses tests ensure that the agent sends `ExitedExecutorMessage` when
    a task group fails to launch due to unschedule GC failure, or when a
    task fails to launch due to task authorization failure.

    Review: https://reviews.apache.org/r/65593/
{code}
{code}
commit a8e723b6ca5a268cc97e39919f7a6b4aedfc3222
Author: Meng Zhu
Date:   Tue Feb 13 22:45:21 2018 -0800

    Added a mock method for `__run()` to the mock slave.

    Review: https://reviews.apache.org/r/65626/
{code}
{code}
commit a6c065060d94dc04dcdc81021035d846ad7040a0
Author: Meng Zhu
Date:   Tue Feb 13 22:45:16 2018 -0800

    Added a test to ensure master removes executors that never launched.

    This test ensures that the agent sends `ExitedExecutorMessage` when
    the executor is never launched so that the master's executor
    bookkeeping entry is removed. See MESOS-1720.

    Review: https://reviews.apache.org/r/65448/
{code}
{code}
commit b5350fecc8604bdddb45303d9363aff4ca60cfcc
Author: Meng Zhu
Date:   Tue Feb 13 22:45:07 2018 -0800

    Fixed a bug where executor info lingers on master if failed to launch.

    Master relies on `ExitedExecutorMessage` from the agent to
    remove executor entries. However, this message won't be sent
    if an executor never actually launched (due to transient error),
    leaving executor info on the master and the executor's resources
    claimed. See MESOS-1720.

    This patch fixes this issue by sending the `ExitedExecutorMessage`
    from the agent if the executor is never launched.

    Review: https://reviews.apache.org/r/65449/
{code}
{code}
commit 0321b85ce66f21e9cb6990a3032cb7f8f709c6e6
Author: Meng Zhu
Date:   Tue Feb 13 22:45:03 2018 -0800

    Added helper function for the agent to send `ExitedExecutorMessage`.

    Review: https://reviews.apache.org/r/65446/
{code}
{code}
commit ce7f1f6a0807b96b92cb4c755c52f36e1a8e2853
Author: Meng Zhu
Date:   Tue Feb 13 22:44:58 2018 -0800

    Made master set `launch_executor` in the RunTask(Group)Message.

    By setting a new field `launch_executor` in the RunTask(Group)Message,
    the master is able to control executor creation on the agent.

    Also refactored the `addTask()` logic. Added two new functions:
    `isTaskLaunchExecutor()` checks if a task needs to launch an executor;
    `addExecutor()` adds an executor to the framework and slave.

    Review: https://reviews.apache.org/r/65504/
{code}
{code}
commit 7c29031bf35232a9e8b0c88c826d0185673a
Author: Meng Zhu
Date:   Tue Feb 13 22:44:48 2018 -0800

    Added new protobuf field `launch_executor` in RunTask(Group)Message.

    This boolean flag is used for the master to specify whether a
    new executor should be launched for the task or task group (with
    the exception of the command executor). This allows the master
    to control executor creation on the agent.

    Also updated the relevant message handlers and mock functions.

    Review: https://reviews.apache.org/r/65445/
{code}

> Slave should send exited executor message when the executor is never launched.
> ------------------------------------------------------------------------------
>
>                 Key: MESOS-1720
>                 URL: https://issues.apache.org/jira/browse/MESOS-1720
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent, master
>            Reporter: Benjamin Mahler
>            Assignee: Meng Zhu
>            Priority: Major
>              Labels: mesosphere
>
> When the slave sends TASK_LOST before launching an executor for a task, the
> slave does not send an exited executor message to the master.
> Since the master receives no exited executor message, it still thinks the
> executor's resources are consumed on the slave.
> One possible fix for this would be to send the exited executor message to the
> master in these cases.
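The resource leak behind MESOS-1720 can be illustrated with a toy model of the master's executor bookkeeping. This is a deliberately simplified sketch, not the Mesos implementation: the `Master` class, the `run_task` function and the `send_exited_on_failure` switch are hypothetical names, and resources are reduced to a single CPU count.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Master:
    # Maps an executor ID to the CPUs the master believes it consumes.
    executors: Dict[str, float] = field(default_factory=dict)

    def add_executor(self, executor_id, cpus):
        self.executors[executor_id] = cpus

    def exited_executor(self, executor_id):
        """Handle an exited-executor message by dropping the entry."""
        self.executors.pop(executor_id, None)

    def used_cpus(self):
        return sum(self.executors.values())


def run_task(master, executor_id, cpus, launch_fails, send_exited_on_failure):
    """Model one task launch as seen from the master's bookkeeping."""
    # The master records the executor when it sends the task to the agent.
    master.add_executor(executor_id, cpus)
    if launch_fails and send_exited_on_failure:
        # Behavior described by the patches: the agent reports the executor
        # as exited even though it was never launched.
        master.exited_executor(executor_id)
```

When the launch fails and no message is sent, the entry stays and `used_cpus()` keeps counting resources that nothing is using; sending the message on failure brings the count back to zero.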
[jira] [Commented] (MESOS-8565) Persistent volumes are not visible in Mesos UI when launching a pod using default executor.
[ https://issues.apache.org/jira/browse/MESOS-8565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363651#comment-16363651 ]

Gilbert Song commented on MESOS-8565:
--------------------------------------

Please note that this fix addresses the case where a persistent volume is shared with nested containers by the framework defining a `SANDBOX_PATH` volume for each nested container, so that the persistent volume shows up in the Mesos UI.

However, there is a related *limitation*: when only a `SANDBOX_PATH` volume is specified for a nested container (no persistent volume or any other volume is specified at the executor container level), that pure `SANDBOX_PATH` volume is not yet reflected in the UI. We should create a separate Jira for this case.

/cc [~qianzhang]

> Persistent volumes are not visible in Mesos UI when launching a pod using
> default executor.
> -------------------------------------------------------------------------
>
>                 Key: MESOS-8565
>                 URL: https://issues.apache.org/jira/browse/MESOS-8565
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.2.2, 1.3.1, 1.4.1
>            Reporter: Qian Zhang
>            Assignee: Qian Zhang
>            Priority: Major
>             Fix For: 1.6.0, 1.5.1
>
> When user launches a pod to use a persistent volume in DC/OS, the nested
> containers in the pod can access the PV successfully and the PV directory of
> the executor shown in Mesos UI has all the contents written by the tasks, but
> the PV directory of the tasks shown in DC/OS UI and Mesos UI is empty.