[jira] [Commented] (AURORA-1837) Improve task history pruning

Mehrdad Nurolahzade (JIRA) Fri, 10 Feb 2017 15:08:59 -0800

    [ https://issues.apache.org/jira/browse/AURORA-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861981#comment-15861981 ]


Mehrdad Nurolahzade commented on AURORA-1837:
---------------------------------------------

As I mentioned above, we have noticed that most task history pruning happens
right after scheduler restarts, where it can severely hamper scheduler
performance (or cause consecutive fail-overs on test clusters when we load
test the scheduler).

The reason is that the scheduler loses its in-memory state of operations
scheduled with {{DelayExecutor}} upon restart/failure. {{TaskHistoryPruner}}
learns about dead tasks upon restart when it replays the log, and these tasks
are picked up by the second call to {{executor.execute()}}, which performs
job-level pruning immediately (i.e., without delay).
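
To make the effect concrete, here is a toy model of the behavior described above. It is not Aurora code: {{ReplaySketch}}, {{RecordingExecutor}}, and the 60-second retention value are all made up for illustration, and the executor only records what is submitted to it. The sketch assumes every dead task seen during replay is registered again, so each one queues a job-level prune with no delay.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, not Aurora code: counts work queued with and without a delay. */
class ReplaySketch {
    /** Hypothetical stand-in for a delay executor; it only records submissions. */
    static final class RecordingExecutor {
        final List<Runnable> immediate = new ArrayList<>();
        final List<Long> delaysMs = new ArrayList<>();

        void execute(Runnable work) { immediate.add(work); }
        void execute(Runnable work, long delayMs) { delaysMs.add(delayMs); }
    }

    final RecordingExecutor executor = new RecordingExecutor();
    int jobLevelPrunes = 0;

    /** Mirrors the two submissions described above for one terminal task. */
    void registerInactiveTask(String jobKey, String taskId, long timeRemainingMs) {
        // First call: per-task pruning, deferred until the retention time runs out.
        executor.execute(() -> { /* would delete taskId */ }, timeRemainingMs);
        // Second call: job-level pruning for jobKey, queued with no delay.
        executor.execute(() -> jobLevelPrunes++);
    }

    /** Simulates log replay after a restart: every dead task is registered again. */
    int replay(int deadTasks) {
        for (int i = 0; i < deadTasks; i++) {
            registerInactiveTask("job-" + (i % 10), "task-" + i, 60_000L);
        }
        // Nothing holds the immediate work back, so all of it runs right away.
        executor.immediate.forEach(Runnable::run);
        return jobLevelPrunes;
    }

    public static void main(String[] args) {
        System.out.println(new ReplaySketch().replay(1000));
    }
}
```

In this model the number of immediate job-level prunes grows linearly with the number of dead tasks replayed, which matches the load spike we see right after a restart.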

> Improve task history pruning
> ----------------------------
>
>                 Key: AURORA-1837
>                 URL: https://issues.apache.org/jira/browse/AURORA-1837
>             Project: Aurora
>          Issue Type: Task
>            Reporter: Reza Motamedi
>            Assignee: Mehrdad Nurolahzade
>            Priority: Minor
>              Labels: scheduler
>
> The current implementation of {{TaskHistoryPruner}} registers all inactive
> tasks for pruning upon terminal _state_ change.
> {{TaskHistoryPruner::registerInactiveTask()}} uses a delay executor to
> schedule the pruning of _tasks_. However, we have noticed that most
> pruning takes place after the scheduler recovers from a fail-over.
> Modify {{TaskHistoryPruner}} to follow a design similar to
> {{JobUpdateHistoryPruner}}:
> # Instead of registering delay executors upon terminal task state
> transitions, have it wake up at preconfigured intervals, find all terminal
> state tasks that meet the pruning criteria, and delete them.
> # Make the initial task history pruning delay configurable so that it does
> not hamper the scheduler upon start.
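
The interval-based design proposed in the description could look roughly like the sketch below. This is only an illustration under assumptions: {{PeriodicPrunerSketch}}, {{TerminalTask}}, {{selectPrunable}}, and {{start}} are hypothetical names, the only pruning criterion modeled is task age (a per-job history limit is left out), and the scheduling uses the standard {{ScheduledExecutorService}} rather than anything Aurora-specific.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;
import java.util.stream.Collectors;

/** Sketch of interval-based pruning; every name here is hypothetical. */
class PeriodicPrunerSketch {
    /** Minimal stand-in for a task that has reached a terminal state. */
    static final class TerminalTask {
        final String taskId;
        final long finishedAtMs;

        TerminalTask(String taskId, long finishedAtMs) {
            this.taskId = taskId;
            this.finishedAtMs = finishedAtMs;
        }
    }

    /** Returns the IDs of tasks whose retention period has elapsed at nowMs. */
    static List<String> selectPrunable(List<TerminalTask> tasks, long nowMs, long retentionMs) {
        return tasks.stream()
            .filter(t -> nowMs - t.finishedAtMs >= retentionMs)
            .map(t -> t.taskId)
            .collect(Collectors.toList());
    }

    /** Schedules a recurring sweep; the first run waits for initialDelayMs. */
    static ScheduledExecutorService start(
            Supplier<List<TerminalTask>> source,
            Consumer<List<String>> deleter,
            long initialDelayMs,
            long intervalMs,
            long retentionMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(
            () -> deleter.accept(
                selectPrunable(source.get(), System.currentTimeMillis(), retentionMs)),
            initialDelayMs, intervalMs, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        List<TerminalTask> tasks = Arrays.asList(
            new TerminalTask("old", 0L),
            new TerminalTask("new", 9_000L));
        // Evaluated at t = 10 s with a 5 s retention period.
        System.out.println(selectPrunable(tasks, 10_000L, 5_000L));
    }
}
```

Because no state is kept per task, a restart loses nothing: the next sweep simply queries terminal tasks again, and the configurable initial delay keeps that first sweep away from scheduler startup.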



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
