[
https://issues.apache.org/jira/browse/AURORA-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897125#comment-13897125
]
brian wickman commented on AURORA-201:
--------------------------------------
Unfortunately the job has since been garbage collected. With the bug
introduced by AURORA-204 we were seeing KILLING -> LOST. Let me dig through
the logs.
> aurora needs a "really, really kill this task" command
> ------------------------------------------------------
>
> Key: AURORA-201
> URL: https://issues.apache.org/jira/browse/AURORA-201
> Project: Aurora
> Issue Type: Story
> Components: Client, Scheduler
> Reporter: brian wickman
>
> If the executor has a bug that causes it to die but the executor driver stays
> alive, it's possible for it to swallow killTask messages forever. The admin
> client will happily force the task into the KILLING state, but when that
> times out, the task goes to LOST and is automatically restarted. This means
> there's really no way to kill a task that has a buggy executor. Ideally there
> would be a really, really terminal state that says "when it times out in
> KILLING, instead of transitioning to LOST, transition to DEAD," or something
> along those lines.
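
The transition described in the issue can be illustrated with a small sketch. This is not Aurora's scheduler code: the function names, the `force` flag, and the DEAD state are hypothetical, taken only from the wording of the proposal above.

```python
from enum import Enum


class TaskState(Enum):
    RUNNING = "RUNNING"
    KILLING = "KILLING"
    LOST = "LOST"
    DEAD = "DEAD"  # hypothetical terminal state proposed in the issue


def on_kill_timeout(state, force=False):
    """Return the state a task moves to when its KILLING timeout expires.

    `force` is a hypothetical flag standing in for a
    "really, really kill this task" request.
    """
    if state is not TaskState.KILLING:
        # The timeout only matters for tasks that are being killed.
        return state
    # Behaviour described in the issue: KILLING times out into LOST.
    # Proposed behaviour: a forced kill ends in DEAD instead.
    return TaskState.DEAD if force else TaskState.LOST


def is_rescheduled(state):
    """LOST tasks are restarted automatically; DEAD (hypothetical) is final."""
    return state is TaskState.LOST


if __name__ == "__main__":
    for force in (False, True):
        end = on_kill_timeout(TaskState.KILLING, force=force)
        print(force, end.name, is_rescheduled(end))
```

Under these assumptions, a normal kill that times out ends in LOST and is rescheduled, while a forced kill ends in DEAD and stays down.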