[jira] [Commented] (YARN-6061) Add a customized uncaughtexceptionhandler for fair scheduler

2017-01-06 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15805840#comment-15805840
 ] 

Yufei Gu commented on YARN-6061:


Yes. This sets the default handler, which is used when no specific one is set 
for the thread. But we need a different handler here. When 
{{YarnUncaughtExceptionHandler}} gets a raw RuntimeException, it just logs an 
error and doesn't bring down the RM. This is fine for some threads, e.g. 
threads in a thread pool. But for other threads, like the update thread and the 
preemption thread in the fair scheduler, we should bring down the RM once an 
RTE is caught, since the RM cannot keep working properly after these critical 
threads have died.
I realize that this should apply to all critical threads (critical meaning we 
should bring down the RM if the thread crashes). Maybe we should enlarge the 
scope to the whole RM instead of FS only. 
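
To illustrate the distinction, here is a minimal, self-contained sketch in plain Java. It is not the actual YARN code or a proposed patch: {{CriticalThreadHandler}}, the thread names, and the {{shutdownAction}} callback are hypothetical, and the callback only flips a flag where a real handler would bring down the RM. It shows a log-only default handler being used for an ordinary thread, while a per-thread handler takes over for a critical thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CriticalThreadHandlerSketch {

  /** Hypothetical handler for critical threads: log, then run a shutdown action. */
  static final class CriticalThreadHandler implements Thread.UncaughtExceptionHandler {
    private final Runnable shutdownAction;

    CriticalThreadHandler(Runnable shutdownAction) {
      this.shutdownAction = shutdownAction;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
      System.err.println("Critical thread " + t.getName() + " crashed: " + e);
      shutdownAction.run();
    }
  }

  /** Starts a thread that throws immediately and waits for it to die. */
  static void crash(String name, Thread.UncaughtExceptionHandler handler) {
    Thread t = new Thread(() -> {
      throw new RuntimeException("simulated failure in " + name);
    }, name);
    if (handler != null) {
      // A per-thread handler takes precedence over the process-wide default.
      t.setUncaughtExceptionHandler(handler);
    }
    t.start();
    try {
      t.join();
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
  }

  /** Returns true if only the critical thread's failure requested a shutdown. */
  static boolean runDemo() {
    AtomicBoolean shutdownRequested = new AtomicBoolean(false);
    Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
    // Process-wide default that only logs, standing in for the existing behavior.
    Thread.setDefaultUncaughtExceptionHandler(
        (t, e) -> System.err.println("Thread " + t.getName() + " threw: " + e));
    try {
      crash("pool-worker", null); // no specific handler: falls back to the default
      boolean afterWorker = shutdownRequested.get();
      // Stand-in for the update or preemption thread (name is illustrative).
      crash("update-thread", new CriticalThreadHandler(() -> shutdownRequested.set(true)));
      return !afterWorker && shutdownRequested.get();
    } finally {
      Thread.setDefaultUncaughtExceptionHandler(previous);
    }
  }

  public static void main(String[] args) {
    System.out.println("shutdown requested only by critical thread: " + runDemo());
  }
}
```

Passing the shutdown action in as a {{Runnable}} keeps the sketch testable. How the RM should actually be stopped is left open here.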

> Add a customized uncaughtexceptionhandler for fair scheduler
> 
>
> Key: YARN-6061
> URL: https://issues.apache.org/jira/browse/YARN-6061
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler, yarn
> Reporter: Yufei Gu
> Assignee: Yufei Gu
> Labels: fairscheduler
>
> There are several threads in the fair scheduler. A thread quits when a 
> runtime exception is thrown inside it. We should bring down the RM when that 
> happens. Otherwise, the RM may behave unpredictably. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6061) Add a customized uncaughtexceptionhandler for fair scheduler

2017-01-06 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15805654#comment-15805654
 ] 

Devaraj K commented on YARN-6061:

Shouldn't this already handle it?
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java#L1378



