[
https://issues.apache.org/jira/browse/YARN-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15805840#comment-15805840
]
Yufei Gu commented on YARN-6061:
Yes. This sets the default handler, which is used if no specific one is set for
the thread. But we need a different handler here. When
{{YarnUncaughtExceptionHandler}} gets a raw RuntimeException, it just logs an
error and doesn't bring down the RM. This is fine for some threads, e.g. threads
in a thread pool. But for other threads, like the update thread and the
preemption thread in the fair scheduler, we should bring down the RM once an RTE
is caught, since it makes no sense for the RM to keep running after these
critical threads have died.
I realize that this should apply to all critical threads (critical means we
should bring down the RM if the thread crashes). Maybe we should widen the scope
to the whole RM instead of FS only.
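
To make the distinction concrete, here is a minimal standalone sketch, not the actual patch. It uses only the JDK's {{Thread.setDefaultUncaughtExceptionHandler}} and {{Thread.setUncaughtExceptionHandler}}. The names {{CriticalThreadDemo}}, {{CriticalThreadHandler}}, and the thread names are made up for illustration, and the "shutdown" is just a flag so the example can run without exiting the JVM; in the RM it would be whatever shutdown path we agree on.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CriticalThreadDemo {

    /** Hypothetical handler for critical threads: log, then run a shutdown action. */
    static final class CriticalThreadHandler implements Thread.UncaughtExceptionHandler {
        private final Runnable shutdownAction;

        CriticalThreadHandler(Runnable shutdownAction) {
            this.shutdownAction = shutdownAction;
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            System.err.println("Critical thread " + t.getName() + " died: " + e);
            shutdownAction.run();
        }
    }

    /**
     * Starts a thread that throws a RuntimeException and waits for it to die.
     * Returns true if the (stand-in) shutdown action was triggered.
     */
    public static boolean runFailingThread(boolean critical) {
        AtomicBoolean shutdownRequested = new AtomicBoolean(false);
        Thread t = new Thread(() -> {
            throw new RuntimeException("boom");
        }, critical ? "update-thread" : "pool-worker");

        if (critical) {
            // A per-thread handler takes precedence over the default handler.
            t.setUncaughtExceptionHandler(
                new CriticalThreadHandler(() -> shutdownRequested.set(true)));
        }

        t.start();
        try {
            // The handler runs on the dying thread, so it has finished once join() returns.
            t.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        return shutdownRequested.get();
    }

    public static void main(String[] args) {
        // Process-wide fallback, used only by threads without their own handler: log only.
        Thread.setDefaultUncaughtExceptionHandler((t, e) ->
            System.err.println("Thread " + t.getName() + " threw " + e + " (logged only)"));

        System.out.println("pool worker triggers shutdown: " + runFailingThread(false));
        System.out.println("critical thread triggers shutdown: " + runFailingThread(true));
    }
}
```

In this sketch the pool-style thread falls through to the default handler and is only logged, while the critical thread's own handler requests the shutdown. Passing the shutdown action in as a {{Runnable}} is just a choice that keeps the example testable.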
> Add a customized uncaughtexceptionhandler for fair scheduler
>
>
> Key: YARN-6061
> URL: https://issues.apache.org/jira/browse/YARN-6061
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler, yarn
> Reporter: Yufei Gu
> Assignee: Yufei Gu
> Labels: fairscheduler
>
> There are several threads in the fair scheduler. A thread will quit when a
> runtime exception is thrown inside it. We should bring down the RM when that
> happens. Otherwise, there may be some weird behavior in the RM.