[
https://issues.apache.org/jira/browse/MESOS-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140671#comment-14140671
]
Tom Arnfeld edited comment on MESOS-1812 at 9/19/14 2:47 PM:
-------------------------------------------------------------
I think there are use cases for it, for example the modifications I am making
to the Hadoop framework.
Ultimately I am trying to control, from the framework, how long an Executor
process lives, and to be able to trigger it to shut itself down.
Framework/Executor messages are currently not a reliable form of communication
over Mesos (as far as I know), and after my tasks are done I need the executor
to stay around for a specific amount of time.
Perhaps what I really need here is some kind of {{shutdownExecutor}} driver
call.
> Queued tasks are not actually launched in the order they were queued
> --------------------------------------------------------------------
>
> Key: MESOS-1812
> URL: https://issues.apache.org/jira/browse/MESOS-1812
> Project: Mesos
> Issue Type: Bug
> Components: slave
> Reporter: Tom Arnfeld
>
> Even though tasks are assigned and queued in the order in which they are
> launched (e.g. multiple tasks in reply to one offer), timing issues with the
> futures can sometimes break that ordering, so the tasks end up not being
> launched in order.
> Example trace from a slave... In this example the Task_Tracker_10 task should
> be launched before slots_Task_Tracker_10.
> {code}
> I0918 02:10:50.371445 17072 slave.cpp:933] Got assigned task Task_Tracker_10
> for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.372110 17072 slave.cpp:933] Got assigned task
> slots_Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.372172 17073 gc.cpp:84] Unscheduling
> '/mnt/mesos-slave/slaves/20140915-112519-3171422218-5050-5016-6/frameworks/20140916-233111-3171422218-5050-14295-0015'
> from gc
> I0918 02:10:50.375018 17072 slave.cpp:1043] Launching task
> slots_Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.386282 17072 slave.cpp:1153] Queuing task
> 'slots_Task_Tracker_10' for executor executor_Task_Tracker_10 of framework
> '20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.386312 17070 mesos_containerizer.cpp:537] Starting container
> '5f507f09-b48e-44ea-b74e-740b0e8bba4d' for executor
> 'executor_Task_Tracker_10' of framework
> '20140916-233111-3171422218-5050-14295-0015'
> I0918 02:10:50.388942 17072 slave.cpp:1043] Launching task Task_Tracker_10
> for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.406277 17070 launcher.cpp:117] Forked child with pid '817' for
> container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:10:50.406563 17072 slave.cpp:1153] Queuing task 'Task_Tracker_10'
> for executor executor_Task_Tracker_10 of framework
> '20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.408499 17069 mesos_containerizer.cpp:647] Fetching URIs for
> container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' using command
> '/usr/local/libexec/mesos/mesos-fetcher'
> I0918 02:11:11.650687 17071 slave.cpp:2873] Current usage 17.34%. Max allowed
> age: 5.086371210668750days
> I0918 02:11:16.590270 17075 slave.cpp:2355] Monitoring executor
> 'executor_Task_Tracker_10' of framework
> '20140916-233111-3171422218-5050-14295-0015' in container
> '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:11:17.701015 17070 slave.cpp:1664] Got registration for executor
> 'executor_Task_Tracker_10' of framework
> 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:17.701897 17070 slave.cpp:1783] Flushing queued task
> slots_Task_Tracker_10 for executor 'executor_Task_Tracker_10' of framework
> 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:17.702350 17070 slave.cpp:1783] Flushing queued task
> Task_Tracker_10 for executor 'executor_Task_Tracker_10' of framework
> 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:18.588388 17070 mesos_containerizer.cpp:1112] Executor for
> container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' has exited
> I0918 02:11:18.588665 17070 mesos_containerizer.cpp:996] Destroying container
> '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:11:18.599234 17072 slave.cpp:2413] Executor
> 'executor_Task_Tracker_10' of framework
> 20140916-233111-3171422218-5050-14295-0015 has exited with status 1
> {code}
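The reordering described above can be reproduced in miniature. The following is only a sketch in Python with asyncio, not the slave's actual C++ code: {{prepare}}, {{launch_unordered}}, {{launch_ordered}} and the delays are hypothetical stand-ins for whatever per-task future the slave waits on. It shows how queuing a task whenever its own future resolves can invert the assignment order, and how awaiting the futures in assignment order preserves it.

```python
import asyncio

# Hypothetical per-task asynchronous step; the delay stands in for
# whatever future must resolve before a task can be queued.
async def prepare(task_id: str, delay: float) -> str:
    await asyncio.sleep(delay)
    return task_id

async def launch_unordered(tasks):
    """Queue each task as soon as its own future resolves."""
    queued = []

    async def run(task_id, delay):
        queued.append(await prepare(task_id, delay))

    await asyncio.gather(*(run(t, d) for t, d in tasks))
    return queued

async def launch_ordered(tasks):
    """Start every future up front, but queue results in assignment order."""
    futures = [asyncio.create_task(prepare(t, d)) for t, d in tasks]
    return [await f for f in futures]

# Assignment order taken from the trace; the delays are made up so that
# the second task's future resolves first.
ASSIGNED = [("Task_Tracker_10", 0.05), ("slots_Task_Tracker_10", 0.01)]

unordered = asyncio.run(launch_unordered(ASSIGNED))
ordered = asyncio.run(launch_ordered(ASSIGNED))
print("unordered:", unordered)
print("ordered:  ", ordered)
```

Both variants run the preparation steps concurrently; the only difference is where the results are appended, which is the kind of change that would keep the queue in assignment order without serialising the work itself.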