[ https://issues.apache.org/jira/browse/AURORA-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maxim Khutornenko updated AURORA-1459:
--------------------------------------
Sprint: Twitter Aurora Q3'15 Sprint 11
> DelayExecutor is flaky within scheduling loop
> ---------------------------------------------
>
> Key: AURORA-1459
> URL: https://issues.apache.org/jira/browse/AURORA-1459
> Project: Aurora
> Issue Type: Bug
> Components: Scheduler
> Reporter: Maxim Khutornenko
>
> TaskGroups now uses the DelayExecutor that was introduced to gate async
> operations. The problem is that the DelayExecutor queue is only flushed on
> DB transaction completion (1). This means scheduling cannot proceed unless
> there is _some_ storage mutation activity. If there are no storage writes,
> scheduling effectively halts.
> While it is unlikely to happen in production, it is consistently reproducible
> with e2e tests in Vagrant on any subsequent run.
> (1) -
> https://github.com/apache/aurora/blob/06ddaadbcba4c66b8019815de6ca27d50a9df77d/src/main/java/org/apache/aurora/scheduler/storage/db/DbStorage.java#L175-L178
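
The failure mode described above can be illustrated with a minimal sketch. This is not Aurora's DelayExecutor or DbStorage code: the class GatedExecutorSketch and its methods flush() and pendingCount() are hypothetical names chosen for illustration, and the sketch assumes only what the description states, namely that queued work runs when the queue is flushed and that the flush is triggered by a completed storage write.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;

// Hypothetical stand-in for a gated executor; NOT Aurora's actual DelayExecutor.
final class GatedExecutorSketch implements Executor {
  private final Queue<Runnable> pending = new ArrayDeque<>();

  // Work is never run here; it is only queued until the next flush.
  @Override
  public synchronized void execute(Runnable work) {
    pending.add(work);
  }

  // Assumption: in the scenario described, this is called only when a
  // storage write transaction completes.
  public void flush() {
    Runnable next;
    while ((next = poll()) != null) {
      next.run();
    }
  }

  private synchronized Runnable poll() {
    return pending.poll();
  }

  public synchronized int pendingCount() {
    return pending.size();
  }

  public static void main(String[] args) {
    GatedExecutorSketch executor = new GatedExecutorSketch();
    executor.execute(() -> System.out.println("scheduling attempt ran"));

    // With no storage write, nothing triggers a flush and the work stays queued.
    System.out.println("pending before flush: " + executor.pendingCount());

    // Simulate the completion of a write transaction.
    executor.flush();
    System.out.println("pending after flush: " + executor.pendingCount());
  }
}
```

In this sketch, a scheduling attempt submitted through execute() makes no progress until something calls flush(). If the only caller of flush() is the write-transaction path, an idle store leaves the queue full indefinitely, which matches the halt reported here.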