[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu updated YARN-10642:
---
Description:
In our cluster, the ResourceManager got stuck twice within twenty days. The YARN client
can't
zhengchenyu created YARN-10642:
--
Summary: ResourceManager may keep stuck, because AsyncDispatcher's
printEventQueueDetails method stuck in an endless loop
Key: YARN-10642
URL:
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu updated YARN-10642:
---
Description:
In our cluster, the ResourceManager got stuck twice within twenty days. The YARN client
can't
[
https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287592#comment-17287592
]
Qi Zhu edited comment on YARN-10641 at 2/20/21, 8:31 AM:
-
cc [~snemeth]
[
https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Qi Zhu updated YARN-10641:
--
Attachment: image-2021-02-20-16-31-13-714.png
> Fix maxApplications update error when add new queues.
>
[
https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287592#comment-17287592
]
Qi Zhu commented on YARN-10641:
---
cc [~snemeth] [~gandras] [~bteke]
I have updated a patch, and confirmed
[
https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Qi Zhu updated YARN-10641:
--
Priority: Critical (was: Major)
> Fix maxApplications update error when add new queues.
>
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu updated YARN-10642:
---
Attachment: take.png
> ResourceManager may keep stuck, because AsyncDispatcher's
>
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu updated YARN-10642:
---
Attachment: put.png
> ResourceManager may keep stuck, because AsyncDispatcher's
>
[
https://issues.apache.org/jira/browse/YARN-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287593#comment-17287593
]
Hadoop QA commented on YARN-10639:
--
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287685#comment-17287685
]
zhengchenyu commented on YARN-10642:
If you feel the description is too long, you only need to remember
[
https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Qi Zhu updated YARN-10641:
--
Attachment: image-2021-02-20-16-29-18-519.png
> Fix maxApplications update error when add new queues.
>
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu updated YARN-10642:
---
Attachment: debugfornode.png
> ResourceManager may keep stuck, because AsyncDispatcher's
>
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu updated YARN-10642:
---
Attachment: deadloop.png
> ResourceManager may keep stuck, because AsyncDispatcher's
>
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu updated YARN-10642:
---
Attachment: MockForDeadLoop.java
> ResourceManager may keep stuck, because AsyncDispatcher's
>
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287713#comment-17287713
]
zhengchenyu commented on YARN-10642:
YARN-10221 is the same problem, but it gives no root cause; my issue
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu updated YARN-10642:
---
Attachment: YARN-10642.002.patch
> ResourceManager may keep stuck, because AsyncDispatcher's
>
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287728#comment-17287728
]
zhengchenyu commented on YARN-10642:
[~zhuqi] That's OK. I also found it, but sorry for forgetting it in
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Qi Zhu updated YARN-10642:
--
Target Version/s: 3.4.0
> ResourceManager may keep stuck, because AsyncDispatcher's
> printEventQueueDetails
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287699#comment-17287699
]
Qi Zhu commented on YARN-10642:
---
[~zhengchenyu]
Thanks for finding this. I removed the fix version; we
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Qi Zhu updated YARN-10642:
--
Fix Version/s: (was: 3.2.3)
(was: 3.3.1)
> ResourceManager may keep stuck, because
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287699#comment-17287699
]
Qi Zhu edited comment on YARN-10642 at 2/20/21, 3:06 PM:
-
[~zhengchenyu]
Thanks
[
https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287706#comment-17287706
]
Qi Zhu commented on YARN-10643:
---
[~zhengchenyu] [~jonbender-stripe]
I will fix YARN-10221 YARN-10642 in
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287724#comment-17287724
]
Qi Zhu edited comment on YARN-10642 at 2/20/21, 3:48 PM:
-
Thanks for
Marcono1234 created YARN-10644:
--
Summary: ProcfsBasedProcessTree.ADDRESS_PATTERN is misleading and
possibly incorrect
Key: YARN-10644
URL: https://issues.apache.org/jira/browse/YARN-10644
Project:
[
https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Qi Zhu updated YARN-10643:
--
Description:
The race condition introduced by -YARN-8995.-
The problem has been raised in YARN-10221
also in
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu updated YARN-10642:
---
Attachment: YARN-10642.001.patch
> ResourceManager may keep stuck, because AsyncDispatcher's
>
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287724#comment-17287724
]
Qi Zhu edited comment on YARN-10642 at 2/20/21, 3:52 PM:
-
Thanks for
[
https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287714#comment-17287714
]
zhengchenyu edited comment on YARN-10643 at 2/20/21, 3:27 PM:
--
Just use
[
https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287714#comment-17287714
]
zhengchenyu commented on YARN-10643:
Just using Iterator() could solve this problem. You can see
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287724#comment-17287724
]
Qi Zhu commented on YARN-10642:
---
[~zhengchenyu]
It makes sense to me; two minor things:
1. import
[
https://issues.apache.org/jira/browse/YARN-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Qi Zhu reassigned YARN-10221:
-
Assignee: Qi Zhu
> Nodemanager lockups on printEventQueueDetails
>
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287713#comment-17287713
]
zhengchenyu edited comment on YARN-10642 at 2/20/21, 3:28 PM:
--
YARN-10221 is
[
https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu reassigned YARN-10643:
--
Assignee: zhengchenyu (was: Qi Zhu)
> Fix the race condition introduced by YARN-8995.
>
[
https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287715#comment-17287715
]
Qi Zhu commented on YARN-10643:
---
[~zhengchenyu]
I think the best way to fix this is to use a deep copy; the method
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287699#comment-17287699
]
Qi Zhu edited comment on YARN-10642 at 2/20/21, 2:30 PM:
-
[~zhengchenyu]
Thanks
[
https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Qi Zhu updated YARN-10643:
--
Description:
The race condition introduced by -YARN-8995.-
The problem has been raised in YARN-10221
also in
[
https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhengchenyu updated YARN-10643:
---
Attachment: YARN-10643.001.patch
> Fix the race condition introduced by YARN-8995.
>
[
https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287715#comment-17287715
]
Qi Zhu edited comment on YARN-10643 at 2/20/21, 3:30 PM:
-
Thanks [~zhengchenyu]
[
https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287723#comment-17287723
]
zhengchenyu commented on YARN-10643:
I think you need to know why it gets stuck. Please discuss in
Qi Zhu created YARN-10643:
-
Summary: Fix the race condition introduced by YARN-8995.
Key: YARN-10643
URL: https://issues.apache.org/jira/browse/YARN-10643
Project: Hadoop YARN
Issue Type: Bug
[
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287730#comment-17287730
]
Qi Zhu commented on YARN-10642:
---
Thanks [~zhengchenyu] for the update. The patch LGTM, +1.
Waiting for
[
https://issues.apache.org/jira/browse/YARN-10183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287844#comment-17287844
]
Qi Zhu commented on YARN-10183:
---
[~tanu.ajmera] [~prabhujoseph]
I will help dig deeper into this issue.
> Auto