[ https://issues.apache.org/jira/browse/AURORA-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maxim Khutornenko updated AURORA-1582:
--------------------------------------
Story Points: 3
> Task History Pruning attempts can fail silently
> -----------------------------------------------
>
> Key: AURORA-1582
> URL: https://issues.apache.org/jira/browse/AURORA-1582
> Project: Aurora
> Issue Type: Bug
> Reporter: Zameer Manji
> Assignee: Zameer Manji
>
> As discovered in AURORA-1580, task history pruning attempts can fail, and when
> they do, they fail silently. The root cause seems to be that AsyncModule's
> {{AsyncProcessor}} threads only log an unhandled exception, if there is one,
> and take no further action:
> {noformat}
> private static void evaluateResult(Runnable runnable, Throwable throwable, Logger logger) {
>   // See java.util.concurrent.ThreadPoolExecutor#afterExecute(Runnable, Throwable)
>   // for more details and an implementation example.
>   if (throwable == null) {
>     if (runnable instanceof Future) {
>       try {
>         Future<?> future = (Future<?>) runnable;
>         if (future.isDone()) {
>           future.get();
>         }
>       } catch (InterruptedException ie) {
>         Thread.currentThread().interrupt();
>       } catch (ExecutionException ee) {
>         logger.error(ee.toString(), ee);
>       }
>     }
>   } else {
>     logger.error(throwable.toString(), throwable);
>   }
> }
> {noformat}
> I think that instead of failing silently when work on these threads fails, we
> should shut down the scheduler, much as we do when the preemptor or another
> Guava service fails. This way the scheduler does not enter an undefined state
> and operators are informed of the abnormal behaviour.
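
To make the proposal above concrete, here is a minimal sketch of a fail-fast variant of evaluateResult. It is not the actual Aurora change: the class name FailFastEvaluator and the onFailure callback are hypothetical, and the callback stands in for whatever mechanism the scheduler would use to shut itself down. Ignoring CancellationException is also an assumption, following the example in the ThreadPoolExecutor#afterExecute javadoc.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper, not part of Aurora: same shape as evaluateResult above,
// but failures are handed to a callback instead of only being logged.
final class FailFastEvaluator {
  private FailFastEvaluator() {
    // Utility class.
  }

  // onFailure is an assumed hook; in a scheduler it could trigger shutdown.
  static void evaluateResult(
      Runnable runnable,
      Throwable throwable,
      Consumer<Throwable> onFailure) {

    if (throwable != null) {
      // The task threw directly out of run().
      onFailure.accept(throwable);
      return;
    }

    if (runnable instanceof Future) {
      Future<?> future = (Future<?>) runnable;
      if (!future.isDone()) {
        return;
      }
      try {
        // For a completed Future, get() rethrows the stored failure.
        future.get();
      } catch (CancellationException ce) {
        // Assumption: a cancelled task is not treated as a failure.
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
      } catch (ExecutionException ee) {
        // Report the underlying cause rather than the wrapper.
        onFailure.accept(ee.getCause());
      }
    }
  }
}
```

An executor would call this from its afterExecute(Runnable, Throwable) override, passing a callback that logs the error and then initiates an orderly shutdown, so that a failed pruning attempt is surfaced to operators instead of being buried in the log.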