[ 
https://issues.apache.org/jira/browse/MESOS-8405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-8405:
------------------------------
    Description: 
From [~vinodkone] in [r/64940|https://reviews.apache.org/r/64940/]:

{quote}
Ideally, we want terminal but unacknowledged tasks to still be marked 
unreachable in some way, either via task state being TASK_UNREACHABLE or task 
being present in unreachableTasks. This allows, for example, the WebUI to not 
show sandbox links for unreachable tasks irrespective of whether they were 
terminal or not before going unreachable. 

But doing this is tricky for various reasons:

--> updateTask() doesn't allow a terminal state to be transitioned to 
TASK_UNREACHABLE. Right now, when we call updateTask() for a terminal task, it 
adds a TASK_UNREACHABLE status to Task.statuses and also sends it to operator 
API stream subscribers, which looks incorrect. The fact that updateTask() 
internally deals with already-terminal tasks is a bad design decision in 
retrospect. Instead, I think callers shouldn't call it for terminal tasks.

--> It's not clear to our users what a completed task means. The intention was 
for this to hold a cache of terminal and acknowledged tasks for storing recent 
history. The users of the WebUI probably equate "Completed Tasks" to terminal 
tasks irrespective of their acknowledgement status, which is why it is 
confusing for them to see terminal but unacknowledged tasks in the "Active 
tasks" section in the WebUI.

--> When a framework reconciles the state of a task on an unreachable agent, 
the master replies with TASK_UNREACHABLE irrespective of whether the task was 
non-terminal, terminal but unacknowledged, or terminal and acknowledged when 
the agent went unreachable.

I think the direction we want to go in is:

--> Completed tasks should consist of terminal unacknowledged and terminal 
acknowledged tasks, likely in two different data structures.
--> Unreachable tasks should consist of all non-complete tasks on an 
unreachable agent.  All the tasks in this map should be in TASK_UNREACHABLE 
state.
{quote}
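
The proposed direction can be illustrated with a minimal sketch. This is not Mesos code: the `Task`, `FrameworkTasks`, and `mark_agent_unreachable` names, the `acknowledged` flag, and the reduced set of terminal states are all hypothetical, chosen only to show the bookkeeping the quote describes. Terminal tasks stay completed (split by acknowledgement status), and every other task on the unreachable agent moves to TASK_UNREACHABLE.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskState(Enum):
    TASK_RUNNING = "TASK_RUNNING"
    TASK_FINISHED = "TASK_FINISHED"
    TASK_FAILED = "TASK_FAILED"
    TASK_KILLED = "TASK_KILLED"
    TASK_UNREACHABLE = "TASK_UNREACHABLE"


# Simplified for illustration: only a subset of terminal states is listed.
TERMINAL_STATES = {
    TaskState.TASK_FINISHED,
    TaskState.TASK_FAILED,
    TaskState.TASK_KILLED,
}


@dataclass
class Task:
    task_id: str
    state: TaskState
    # Hypothetical flag: has the framework acknowledged the latest status?
    acknowledged: bool = False


@dataclass
class FrameworkTasks:
    # Two separate structures for completed tasks, as the quote suggests.
    completed_acknowledged: dict = field(default_factory=dict)
    completed_unacknowledged: dict = field(default_factory=dict)
    # Every task in this map is in TASK_UNREACHABLE.
    unreachable: dict = field(default_factory=dict)


def mark_agent_unreachable(tasks, book):
    """Classify the tasks of an agent that just became unreachable."""
    for task in tasks:
        if task.state in TERMINAL_STATES:
            # Terminal tasks keep their terminal state; no transition to
            # TASK_UNREACHABLE is attempted for them.
            if task.acknowledged:
                book.completed_acknowledged[task.task_id] = task
            else:
                book.completed_unacknowledged[task.task_id] = task
        else:
            task.state = TaskState.TASK_UNREACHABLE
            book.unreachable[task.task_id] = task
    return book
```

In this sketch the caller decides whether a task is terminal before any state change happens, which mirrors the suggestion that callers should not invoke updateTask() for terminal tasks.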

  was:
From [~agentvindo.dev] in [r/64940|https://reviews.apache.org/r/64940/]:

{quote}
Ideally, we want terminal but unacknowledged tasks to still be marked 
unreachable in some way, either via task state being TASK_UNREACHABLE or task 
being present in unreachableTasks. This allows, for example, the WebUI to not 
show sandbox links for unreachable tasks irrespective of whether they were 
terminal or not before going unreachable. 

But doing this is tricky for various reasons:

--> updateTask() doesn't allow a terminal state to be transitioned to 
TASK_UNREACHABLE. Right now, when we call updateTask() for a terminal task, it 
adds a TASK_UNREACHABLE status to Task.statuses and also sends it to operator 
API stream subscribers, which looks incorrect. The fact that updateTask() 
internally deals with already-terminal tasks is a bad design decision in 
retrospect. Instead, I think callers shouldn't call it for terminal tasks.

--> It's not clear to our users what a completed task means. The intention was 
for this to hold a cache of terminal and acknowledged tasks for storing recent 
history. The users of the WebUI probably equate "Completed Tasks" to terminal 
tasks irrespective of their acknowledgement status, which is why it is 
confusing for them to see terminal but unacknowledged tasks in the "Active 
tasks" section in the WebUI.

--> When a framework reconciles the state of a task on an unreachable agent, 
the master replies with TASK_UNREACHABLE irrespective of whether the task was 
non-terminal, terminal but unacknowledged, or terminal and acknowledged when 
the agent went unreachable.

I think the direction we want to go in is:

--> Completed tasks should consist of terminal unacknowledged and terminal 
acknowledged tasks, likely in two different data structures.
--> Unreachable tasks should consist of all non-complete tasks on an 
unreachable agent.  All the tasks in this map should be in TASK_UNREACHABLE 
state.
{quote}


> Update master task loss handling.
> ---------------------------------
>
>                 Key: MESOS-8405
>                 URL: https://issues.apache.org/jira/browse/MESOS-8405
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: James Peach



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
