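[jira] [Updated] (YARN-10483) YARN hangs and freezes, jobs cannot be submitted, recovers only after switching the active RM or restarting

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10483: - Description: YARN hangs intermittently and new jobs cannot be submitted. Inspection of the jstack logs shows that a capacity scheduler thread is waiting indefinitely for a lock, while the RM's CPU, memory, network, and disk are all normal. The problem can essentially be attributed to the capacity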

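[jira] [Updated] (YARN-10483) YARN hangs and freezes, jobs cannot be submitted, recovers only after switching the active RM or restarting

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10483: - Description: YARN hangs intermittently and new jobs cannot be submitted. Inspection of the jstack logs shows that a capacity scheduler thread is waiting indefinitely for a lock, while the RM's CPU, memory, network, and disk are all normal. The problem can essentially be attributed to the capacity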

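[jira] [Updated] (YARN-10483) YARN hangs and freezes, jobs cannot be submitted, recovers only after switching the active RM or restarting

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10483: - Description: YARN hangs intermittently and new jobs cannot be submitted. Inspection of the jstack logs shows that a capacity scheduler thread is waiting indefinitely for a lock, while the RM's CPU, memory, network, and disk are all normal. The problem can essentially be attributed to the capacity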

[jira] [Updated] (YARN-10483) YARN hangs and freezes, jobs cannot be submitted, recovers only after switching the active RM or restarting

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10483: - Attachment: RM_unnormal_state.stack > YARN hangs and freezes, jobs cannot be submitted, recovers only after switching the active RM or restarting >

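[jira] [Created] (YARN-10483) YARN hangs and freezes, jobs cannot be submitted, recovers only after switching the active RM or restarting

2020-11-04 Thread jufeng li (Jira)
jufeng li created YARN-10483: Summary: YARN hangs and freezes, jobs cannot be submitted, recovers only after switching the active RM or restarting Key: YARN-10483 URL: https://issues.apache.org/jira/browse/YARN-10483 Project: Hadoop YARN Issue Type: Bug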

[jira] [Updated] (YARN-10483) YARN hangs and freezes, jobs cannot be submitted, recovers only after switching the active RM or restarting

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10483: - Attachment: RM_normal_state.stack > YARN hangs and freezes, jobs cannot be submitted, recovers only after switching the active RM or restarting >

[jira] [Commented] (YARN-10440) resource manager hangs,and i cannot submit any new jobs,but rm and nm processes are normal

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226463#comment-17226463 ] jufeng li commented on YARN-10440: -- [~joepep] I set it, and it had no effect; you can check my jstack log >

[jira] [Commented] (YARN-10479) RMProxy should retry on SocketTimeout Exceptions

2020-11-04 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226356#comment-17226356 ] Hadoop QA commented on YARN-10479: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-10479) RMProxy should retry on SocketTimeout Exceptions

2020-11-04 Thread Jim Brennan (Jira)
[ https://issues.apache.org/jira/browse/YARN-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226303#comment-17226303 ] Jim Brennan commented on YARN-10479: I believe most of the YARN failures are unrelated to this

[jira] [Commented] (YARN-10479) RMProxy should retry on SocketTimeout Exceptions

2020-11-04 Thread Jim Brennan (Jira)
[ https://issues.apache.org/jira/browse/YARN-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226271#comment-17226271 ] Jim Brennan commented on YARN-10479: patch 003 fixes the checkstyle issues. > RMProxy should retry

[jira] [Updated] (YARN-10479) RMProxy should retry on SocketTimeout Exceptions

2020-11-04 Thread Jim Brennan (Jira)
[ https://issues.apache.org/jira/browse/YARN-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan updated YARN-10479: --- Attachment: YARN-10479.003.patch > RMProxy should retry on SocketTimeout Exceptions >

[jira] [Commented] (YARN-10440) resource manager hangs,and i cannot submit any new jobs,but rm and nm processes are normal

2020-11-04 Thread chan (Jira)
[ https://issues.apache.org/jira/browse/YARN-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226008#comment-17226008 ] chan commented on YARN-10440: - [~Jufeng] whether you set config 

[jira] [Commented] (YARN-10440) resource manager hangs,and i cannot submit any new jobs,but rm and nm processes are normal

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226002#comment-17226002 ] jufeng li commented on YARN-10440: -- This issue happened hours ago. I uploaded the RM JVM stack > resource

[jira] [Updated] (YARN-10440) resource manager hangs,and i cannot submit any new jobs,but rm and nm processes are normal

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10440: - Attachment: (was: rm_2020-09-26-2.dump) > resource manager hangs,and i cannot submit any new

[jira] [Updated] (YARN-10440) resource manager hangs,and i cannot submit any new jobs,but rm and nm processes are normal

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10440: - Attachment: RM_unnormal_state.stack > resource manager hangs,and i cannot submit any new jobs,but rm and

[jira] [Updated] (YARN-10440) resource manager hangs,and i cannot submit any new jobs,but rm and nm processes are normal

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10440: - Attachment: RM_normal_state.stack > resource manager hangs,and i cannot submit any new jobs,but rm and

[jira] [Updated] (YARN-10482) Capacity Scheduler seems locked,RM cannot submit any new job,and change active RM manually return to normal

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10482: - Description: Capacity Scheduler seems locked,RM cannot submit any new job, and change active RM manually

[jira] [Updated] (YARN-10482) Capacity Scheduler seems locked,RM cannot submit any new job,and change active RM manually return to normal

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10482: - Attachment: (was: RM_normal_state.stack) > Capacity Scheduler seems locked,RM cannot submit any new

[jira] [Updated] (YARN-10482) Capacity Scheduler seems locked,RM cannot submit any new job,and change active RM manually return to normal

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10482: - Attachment: RM_unnormal_state.stack > Capacity Scheduler seems locked,RM cannot submit any new job,and

[jira] [Updated] (YARN-10482) Capacity Scheduler seems locked,RM cannot submit any new job,and change active RM manually return to normal

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10482: - Attachment: RM_normal_state.stack > Capacity Scheduler seems locked,RM cannot submit any new job,and

[jira] [Updated] (YARN-10482) Capacity Scheduler seems locked,RM cannot submit any new job,and change active RM manually return to normal

2020-11-04 Thread jufeng li (Jira)
[ https://issues.apache.org/jira/browse/YARN-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jufeng li updated YARN-10482: - Attachment: RM_normal_state.stack > Capacity Scheduler seems locked,RM cannot submit any new job,and

[jira] [Created] (YARN-10482) Capacity Scheduler seems locked,RM cannot submit any new job,and change active RM manually return to normal

2020-11-04 Thread jufeng li (Jira)
jufeng li created YARN-10482: Summary: Capacity Scheduler seems locked,RM cannot submit any new job,and change active RM manually return to normal Key: YARN-10482 URL: