[ 
https://issues.apache.org/jira/browse/AURORA-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897131#comment-13897131
 ] 

brian wickman commented on AURORA-201:
--------------------------------------

{noformat}
  I0209 08:10:03.100 THREAD28403 org.apache.aurora.scheduler.state.CronJobManager$4.get: Initiating delayed launch of cron JobKey(role:balexandrescu, environment:devel, name:skyfall)
  I0209 08:10:03.101 THREAD28403 com.twitter.common.util.StateMachine$Builder$1.execute: 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de state machine transition INIT -> PENDING
  I0209 08:10:03.101 THREAD28403 org.apache.aurora.scheduler.state.TaskStateMachine.addFollowup: Adding work command SAVE_STATE for 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de
  I0209 08:10:03.102 THREAD28403 org.apache.aurora.scheduler.async.TaskGroups$2.load: Evaluating group balexandrescu/devel/skyfall in 1000 ms
  I0209 08:10:10.107 THREAD23 com.twitter.common.util.StateMachine$Builder$1.execute: 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de state machine transition PENDING -> ASSIGNED
  I0209 08:10:10.107 THREAD23 org.apache.aurora.scheduler.state.TaskStateMachine.addFollowup: Adding work command SAVE_STATE for 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de
  ) is being assigned task for 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de.
  I0209 08:10:13.200 THREAD28584 org.apache.aurora.scheduler.MesosSchedulerImpl.statusUpdate: Received status update for task 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de in state TASK_STARTING with core message Initializing sandbox.
  I0209 08:10:13.201 THREAD28584 com.twitter.common.util.StateMachine$Builder$1.execute: 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de state machine transition ASSIGNED -> STARTING
  I0209 08:10:13.201 THREAD28584 org.apache.aurora.scheduler.state.TaskStateMachine.addFollowup: Adding work command SAVE_STATE for 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de
  I0209 08:11:22.576 THREAD12036 org.apache.aurora.scheduler.thrift.aop.LoggingInterceptor.invoke: forceTaskState(1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de, KILLING, SessionKey(user:wickman; elevated:ElevatedPrivilege(requested:true, justification:user balexandrescu no longer exists)))
  I0209 08:11:22.643 THREAD12036 org.apache.aurora.scheduler.thrift.aop.UserCapabilityInterceptor.invoke: Permitting SessionKey(user:wickman; elevated:ElevatedPrivilege(requested:true, justification:user balexandrescu no longer exists)) to act as ROOT and perform action forceTaskState
  I0209 08:11:22.701 THREAD12036 com.twitter.common.util.StateMachine$Builder$1.execute: 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de state machine transition STARTING -> KILLING
  I0209 08:11:22.701 THREAD12036 org.apache.aurora.scheduler.state.TaskStateMachine.addFollowup: Adding work command KILL for 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de
  I0209 08:11:22.701 THREAD12036 org.apache.aurora.scheduler.state.TaskStateMachine.addFollowup: Adding work command SAVE_STATE for 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de
  I0209 08:16:22.702 THREAD24 org.apache.aurora.scheduler.async.TaskTimeout$TimedOutTaskHandler.run: Timeout reached for task 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de:KILLING
  I0209 08:16:22.703 THREAD24 com.twitter.common.util.StateMachine$Builder$1.execute: 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de state machine transition KILLING -> LOST
  I0209 08:16:22.703 THREAD24 org.apache.aurora.scheduler.state.TaskStateMachine.addFollowup: Adding work command SAVE_STATE for 1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de
  I0209 08:27:56.453 THREAD29172 org.apache.aurora.scheduler.MesosSchedulerImpl.statusUpdate: Received status update for task 1391919003030-balexandrescu-devel-skyfall-0-ecc2930a-d279-4811-aca6-5e57d9cdb5d8 in state TASK_LOST with core message
  I0209 08:27:56.454 THREAD29172 com.twitter.common.util.StateMachine$Builder$1.execute: 1391919003030-balexandrescu-devel-skyfall-0-ecc2930a-d279-4811-aca6-5e57d9cdb5d8 state machine transition UNKNOWN -> LOST (not allowed)
  I0209 08:27:56.553 THREAD29173 org.apache.aurora.scheduler.MesosSchedulerImpl.statusUpdate: Received status update for task 1391898603031-balexandrescu-devel-skyfall-0-64f3cee5-49c0-40f4-a990-e88cce7f1435 in state TASK_LOST with core message
  I0209 08:27:56.554 THREAD29173 com.twitter.common.util.StateMachine$Builder$1.execute: 1391898603031-balexandrescu-devel-skyfall-0-64f3cee5-49c0-40f4-a990-e88cce7f1435 state machine transition UNKNOWN -> LOST (not allowed)
{noformat}
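
For anyone who wants to follow a single task through a scheduler log like the one above, here is a rough Python sketch that pulls out only the "state machine transition" entries. It is not part of Aurora; the names TRANSITION and extract_transitions are made up for illustration, and the pattern assumes the message layout seen in this excerpt (task id, then "state machine transition A -> B", optionally followed by "(not allowed)"). Any whitespace, including line breaks, is accepted between the fields.

```python
import re

# Hypothetical helper, not an Aurora API. Matches
#   "<task id> state machine transition A -> B [(not allowed)]"
# and accepts any whitespace (including newlines) between the fields.
TRANSITION = re.compile(
    r"(?P<task>\d+-\S+)\s+state machine transition\s+"
    r"(?P<src>[A-Z_]+)\s*->\s*(?P<dst>[A-Z_]+)"
    r"(?P<rejected>\s*\(not allowed\))?"
)


def extract_transitions(log_text):
    """Return (task_id, from_state, to_state, allowed) tuples in log order."""
    return [
        (m["task"], m["src"], m["dst"], m["rejected"] is None)
        for m in TRANSITION.finditer(log_text)
    ]


# Two entries copied from the log in this comment.
sample = """\
  I0209 08:16:22.703 THREAD24
com.twitter.common.util.StateMachine$Builder$1.execute:
1391933403101-balexandrescu-devel-skyfall-0-31e566b6-caff-49db-84f9-cdd85ae5a0de
 state machine transition KILLING -> LOST
  I0209 08:27:56.454 THREAD29172
com.twitter.common.util.StateMachine$Builder$1.execute:
1391919003030-balexandrescu-devel-skyfall-0-ecc2930a-d279-4811-aca6-5e57d9cdb5d8
 state machine transition UNKNOWN -> LOST (not allowed)
"""

for task, src, dst, allowed in extract_transitions(sample):
    marker = "" if allowed else " (rejected)"
    print(f"{task[:13]}: {src} -> {dst}{marker}")
```

Filtering the result on the task id that starts with 1391933403101 gives the path INIT, PENDING, ASSIGNED, STARTING, KILLING, LOST for the task that was force-killed, while the two older task ids only show the rejected UNKNOWN -> LOST transitions.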

> aurora needs a "really, really kill this task" command
> ------------------------------------------------------
>
>                 Key: AURORA-201
>                 URL: https://issues.apache.org/jira/browse/AURORA-201
>             Project: Aurora
>          Issue Type: Story
>          Components: Client, Scheduler
>            Reporter: brian wickman
>
> If the executor has a bug that causes it to die while the executor driver
> stays alive, the driver can swallow killTask messages forever.  The admin
> client will happily force the task into the KILLING state, but when KILLING
> times out, the task goes to LOST and is automatically restarted.  This means
> there is really no way to kill a task that has a buggy executor.  Ideally
> there would be a really, really terminal state, so that a task that times
> out in KILLING transitions to something like DEAD instead of LOST.
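
To make the request in the description concrete, here is a toy Python model of the behavior being asked for. It is only a sketch and not how the Aurora scheduler is implemented: the Task class, the force flag, and the DEAD state are hypothetical names, with DEAD taken from the wording of the description. The point it illustrates is that a timeout in KILLING currently ends in LOST, which triggers a replacement, whereas a forced kill would end in a terminal state that does not.

```python
from enum import Enum


class State(Enum):
    STARTING = "STARTING"
    KILLING = "KILLING"
    LOST = "LOST"
    DEAD = "DEAD"  # hypothetical terminal state named in the description


# In this toy model, only LOST causes a replacement task to be launched.
RESCHEDULE_ON = {State.LOST}


class Task:
    """Toy task with just enough state to show the two timeout outcomes."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.state = State.STARTING
        self.force = False

    def kill(self, force=False):
        # force=True stands in for a "really, really kill" request.
        self.state = State.KILLING
        self.force = force

    def on_kill_timeout(self):
        """Called when no status update arrives while the task is in KILLING."""
        if self.state is not State.KILLING:
            return
        self.state = State.DEAD if self.force else State.LOST

    def needs_replacement(self):
        return self.state in RESCHEDULE_ON


normal = Task("skyfall-0")
normal.kill()
normal.on_kill_timeout()
print(normal.state.value, normal.needs_replacement())

forced = Task("skyfall-0")
forced.kill(force=True)
forced.on_kill_timeout()
print(forced.state.value, forced.needs_replacement())
```

Carrying the force decision on the task, rather than adding a second kill path, keeps the timeout handler as the single place where the KILLING outcome is chosen. Whether that fits the real scheduler is an open question for this ticket.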



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
