[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Description: In our cluster, ResourceManager got stuck twice within twenty days. Yarn client can't

[jira] [Created] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
zhengchenyu created YARN-10642: -- Summary: ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop Key: YARN-10642 URL:

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Description: In our cluster, ResourceManager got stuck twice within twenty days. Yarn client can't

[jira] [Comment Edited] (YARN-10641) Fix maxApllications update error when add new queues.

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287592#comment-17287592 ] Qi Zhu edited comment on YARN-10641 at 2/20/21, 8:31 AM: - cc [~snemeth] 

[jira] [Updated] (YARN-10641) Fix maxApllications update error when add new queues.

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10641: -- Attachment: image-2021-02-20-16-31-13-714.png > Fix maxApllications update error when add new queues. >

[jira] [Commented] (YARN-10641) Fix maxApllications update error when add new queues.

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287592#comment-17287592 ] Qi Zhu commented on YARN-10641: --- cc [~snemeth] [~gandras] [~bteke] I have updated a patch, and confirmed

[jira] [Updated] (YARN-10641) Fix maxApllications update error when add new queues.

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10641: -- Priority: Critical (was: Major) > Fix maxApllications update error when add new queues. >

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: take.png > ResourceManager may keep stuck, because AsyncDispatcher's >

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: put.png > ResourceManager may keep stuck, because AsyncDispatcher's >

[jira] [Commented] (YARN-10639) Queueinfo related capacity, should ajusted to weight mode.

2021-02-20 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287593#comment-17287593 ] Hadoop QA commented on YARN-10639: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287685#comment-17287685 ] zhengchenyu commented on YARN-10642: If you feel the description is too long, you only need to remember

[jira] [Updated] (YARN-10641) Fix maxApllications update error when add new queues.

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10641: -- Attachment: image-2021-02-20-16-29-18-519.png > Fix maxApllications update error when add new queues. >

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: debugfornode.png > ResourceManager may keep stuck, because AsyncDispatcher's >

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: deadloop.png > ResourceManager may keep stuck, because AsyncDispatcher's >

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: MockForDeadLoop.java > ResourceManager may keep stuck, because AsyncDispatcher's >

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287713#comment-17287713 ] zhengchenyu commented on YARN-10642: YARN-10221 is the same problem, but it gives no real reason; my issue

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.002.patch > ResourceManager may keep stuck, because AsyncDispatcher's >

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287728#comment-17287728 ] zhengchenyu commented on YARN-10642: [~zhuqi] That's OK. I also found it, but sorry for forgetting it in

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10642: -- Target Version/s: 3.4.0 > ResourceManager may keep stuck, because AsyncDispatcher's > printEventQueueDetails

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287699#comment-17287699 ] Qi Zhu commented on YARN-10642: --- [~zhengchenyu] Thanks for finding this. I removed the fix version, we

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10642: -- Fix Version/s: (was: 3.2.3) (was: 3.3.1) > ResourceManager may keep stuck, because

[jira] [Comment Edited] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287699#comment-17287699 ] Qi Zhu edited comment on YARN-10642 at 2/20/21, 3:06 PM: - [~zhengchenyu]  Thanks

[jira] [Comment Edited] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287699#comment-17287699 ] Qi Zhu edited comment on YARN-10642 at 2/20/21, 3:06 PM: - [~zhengchenyu]  Thanks

[jira] [Commented] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287706#comment-17287706 ] Qi Zhu commented on YARN-10643: --- [~zhengchenyu] [~jonbender-stripe] I will fix YARN-10221 and YARN-10642 in

[jira] [Comment Edited] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287724#comment-17287724 ] Qi Zhu edited comment on YARN-10642 at 2/20/21, 3:48 PM: - Thanks for

[jira] [Created] (YARN-10644) ProcfsBasedProcessTree.ADDRESS_PATTERN is misleading and possibly incorrect

2021-02-20 Thread Marcono1234 (Jira)
Marcono1234 created YARN-10644: -- Summary: ProcfsBasedProcessTree.ADDRESS_PATTERN is misleading and possibly incorrect Key: YARN-10644 URL: https://issues.apache.org/jira/browse/YARN-10644 Project:

[jira] [Updated] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10643: -- Description: The race condition introduced by -YARN-8995.- The problem has been raised in YARN-10221 also in 

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.001.patch > ResourceManager may keep stuck, because AsyncDispatcher's >

[jira] [Comment Edited] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287724#comment-17287724 ] Qi Zhu edited comment on YARN-10642 at 2/20/21, 3:52 PM: - Thanks for

[jira] [Comment Edited] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287714#comment-17287714 ] zhengchenyu edited comment on YARN-10643 at 2/20/21, 3:27 PM: -- Just use

[jira] [Commented] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287714#comment-17287714 ] zhengchenyu commented on YARN-10643: Just using Iterator() could solve this problem. You could see

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287724#comment-17287724 ] Qi Zhu commented on YARN-10642: --- [~zhengchenyu] It makes sense to me; two minor things: 1. import

[jira] [Assigned] (YARN-10221) Nodemanager lockups on printEventQueueDetails

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu reassigned YARN-10221: - Assignee: Qi Zhu > Nodemanager lockups on printEventQueueDetails >

[jira] [Comment Edited] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287713#comment-17287713 ] zhengchenyu edited comment on YARN-10642 at 2/20/21, 3:28 PM: -- YARN-10221 is

[jira] [Assigned] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu reassigned YARN-10643: -- Assignee: zhengchenyu (was: Qi Zhu) > Fix the race condition introduced by YARN-8995. >

[jira] [Commented] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287715#comment-17287715 ] Qi Zhu commented on YARN-10643: --- [~zhengchenyu] I think the best way to fix this is to use a deep copy; the method

[jira] [Comment Edited] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287699#comment-17287699 ] Qi Zhu edited comment on YARN-10642 at 2/20/21, 2:30 PM: - [~zhengchenyu]  Thanks

[jira] [Updated] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10643: -- Description: The race condition introduced by -YARN-8995.- The problem has been raised in YARN-10221 also in 

[jira] [Updated] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10643: --- Attachment: YARN-10643.001.patch > Fix the race condition introduced by YARN-8995. >

[jira] [Comment Edited] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287715#comment-17287715 ] Qi Zhu edited comment on YARN-10643 at 2/20/21, 3:30 PM: - Thanks  [~zhengchenyu] 

[jira] [Commented] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287723#comment-17287723 ] zhengchenyu commented on YARN-10643: I think you need to know why it is stuck. Please discuss in

[jira] [Created] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread Qi Zhu (Jira)
Qi Zhu created YARN-10643: - Summary: Fix the race condition introduced by YARN-8995. Key: YARN-10643 URL: https://issues.apache.org/jira/browse/YARN-10643 Project: Hadoop YARN Issue Type: Bug

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287730#comment-17287730 ] Qi Zhu commented on YARN-10642: --- Thanks [~zhengchenyu] for the update. The patch LGTM +1. Waiting for

[jira] [Commented] (YARN-10183) Auto Created Leaf Queues does not start

2021-02-20 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287844#comment-17287844 ] Qi Zhu commented on YARN-10183: --- [~tanu.ajmera] [~prabhujoseph] I will help dig deeper into this issue. > Auto