[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Affects Version/s: 3.2.1 > Application may be leaked in state store when resourcemanager failover. >

[jira] [Created] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
zhengchenyu created YARN-10557: -- Summary: Application may be leaked in state store when resourcemanager failover. Key: YARN-10557 URL: https://issues.apache.org/jira/browse/YARN-10557 Project: Hadoop YAR

[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Fix Version/s: 3.3.1 > Application may be leaked in state store when resourcemanager failover. >

[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Component/s: RM > Application may be leaked in state store when resourcemanager failover. > -

[jira] [Assigned] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu reassigned YARN-10557: -- Assignee: zhengchenyu > Application may be leaked in state store when resourcemanager failover

[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Labels: resourcemanager (was: ) > Application may be leaked in state store when resourcemanager fail

[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Component/s: (was: RM) resourcemanager > Application may be leaked in state stor

[jira] [Updated] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10557: --- Description: In resourceManager log, I found amount of log like below: {code} 2020-12-30 19:18:48,12

[jira] [Resolved] (YARN-10557) Application may be leaked in state store when resourcemanager failover.

2020-12-30 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu resolved YARN-10557. Release Note: YARN-9848 Resolution: Duplicate I think it is a duplicate of YARN-9848. > Applica

[jira] [Created] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
zhengchenyu created YARN-10642: -- Summary: ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop Key: YARN-10642 URL: https://issues.apache.org/jira/browse/YA

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Description: In our cluster, ResourceManager stuck twice within twenty days. Yarn client can't submit

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Description: In our cluster, ResourceManager stuck twice within twenty days. Yarn client can't submit

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: put.png > ResourceManager may keep stuck, because AsyncDispatcher's > printEventQueueDet

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: take.png > ResourceManager may keep stuck, because AsyncDispatcher's > printEventQueueDe

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: debugfornode.png > ResourceManager may keep stuck, because AsyncDispatcher's > printEven

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: deadloop.png > ResourceManager may keep stuck, because AsyncDispatcher's > printEventQue

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: MockForDeadLoop.java > ResourceManager may keep stuck, because AsyncDispatcher's > print

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287685#comment-17287685 ] zhengchenyu commented on YARN-10642: If you feel description is too long, you only ne

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.001.patch > ResourceManager may keep stuck, because AsyncDispatcher's > print

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287713#comment-17287713 ] zhengchenyu commented on YARN-10642: YARN-10221 is the same problem, but no real reas

[jira] [Updated] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10643: --- Attachment: YARN-10643.001.patch > Fix the race condition introduced by YARN-8995. >

[jira] [Comment Edited] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287714#comment-17287714 ] zhengchenyu edited comment on YARN-10643 at 2/20/21, 3:27 PM: -

[jira] [Commented] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287714#comment-17287714 ] zhengchenyu commented on YARN-10643: Just use Iterator() could solve this problem. Yo

[jira] [Comment Edited] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287713#comment-17287713 ] zhengchenyu edited comment on YARN-10642 at 2/20/21, 3:28 PM: -

[jira] [Assigned] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu reassigned YARN-10643: -- Assignee: zhengchenyu (was: Qi Zhu) > Fix the race condition introduced by YARN-8995. > -

[jira] [Commented] (YARN-10643) Fix the race condition introduced by YARN-8995.

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287723#comment-17287723 ] zhengchenyu commented on YARN-10643: I think you need to know why it is stuck. Please discu

[jira] [Commented] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287728#comment-17287728 ] zhengchenyu commented on YARN-10642: [~zhuqi] That's OK. I also found, but sorry for

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-20 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.002.patch > ResourceManager may keep stuck, because AsyncDispatcher's > print

[jira] [Updated] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Summary: AsyncDispatcher will stuck introduced by YARN-8995. (was: ResourceManager may keep stuck, b

[jira] [Updated] (YARN-10642) ResourceManager may keep stuck, because AsyncDispatcher's printEventQueueDetails method stuck in an endless loop

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.003.patch > ResourceManager may keep stuck, because AsyncDispatcher's > print

[jira] [Commented] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288223#comment-17288223 ] zhengchenyu commented on YARN-10642: I added a unit test in YARN-10642.003.patch which r

[jira] [Comment Edited] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288223#comment-17288223 ] zhengchenyu edited comment on YARN-10642 at 2/22/21, 7:38 AM: -

[jira] [Comment Edited] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288223#comment-17288223 ] zhengchenyu edited comment on YARN-10642 at 2/22/21, 7:40 AM: -

[jira] [Updated] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-02-21 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Description: In our cluster, ResourceManager stuck twice within twenty days. Yarn client can't submit

[jira] [Created] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11127: -- Summary: Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention. Key: YARN-11127 URL: https://issues.apache.org/jira/b

[jira] [Updated] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11127: --- Description: I found rm deadlock in our cluster. It's a low probability event. some critical jstack

[jira] [Updated] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11127: --- Description: I found rm deadlock in our cluster. It's a low probability event. some critical jstack

[jira] [Updated] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11127: --- Description: I found rm deadlock in our cluster. It's a low probability event. some critical jstack

[jira] [Commented] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532824#comment-17532824 ] zhengchenyu commented on YARN-11127: aggregateLogReport was introduced by YARN-1376, then t

[jira] [Comment Edited] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532824#comment-17532824 ] zhengchenyu edited comment on YARN-11127 at 5/6/22 12:07 PM: -

[jira] [Commented] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532828#comment-17532828 ] zhengchenyu commented on YARN-11127: [~vinodkv] [~bteke]  [~pbacsko]  [~bilwa_st] [~z

[jira] [Created] (YARN-11132) RM failover may fail when Dispatcher stuck.

2022-05-06 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11132: -- Summary: RM failover may fail when Dispatcher stuck. Key: YARN-11132 URL: https://issues.apache.org/jira/browse/YARN-11132 Project: Hadoop YARN Issue Type: Impro

[jira] [Commented] (YARN-11132) RM failover may fail when Dispatcher stuck.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533203#comment-17533203 ] zhengchenyu commented on YARN-11132: I think we could watch the head element of event

[jira] [Commented] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533204#comment-17533204 ] zhengchenyu commented on YARN-11127: Another problem is that When dispatcher thread s

[jira] [Updated] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-07 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11127: --- Description: I found rm deadlock in our cluster. It's a low probability event. some critical jstack

[jira] [Updated] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10775: --- Description: I setup a yarn federation cluster, I can't connect the running app web, but the complet

[jira] [Created] (YARN-11148) In federation and security mode, nm recover may fail.

2022-05-13 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11148: -- Summary: In federation and security mode, nm recover may fail. Key: YARN-11148 URL: https://issues.apache.org/jira/browse/YARN-11148 Project: Hadoop YARN Issue T

[jira] [Updated] (YARN-11148) In federation and security mode, nm recover may fail.

2022-05-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11148: --- Description: Exception stack {code:java} 2022-05-08 00:44:11,536 WARN org.apache.hadoop.ipc.Client: E

[jira] [Updated] (YARN-11148) In federation and security mode, nm recover may fail.

2022-05-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11148: --- Description: Exception stack {code:java} 2022-05-08 00:44:11,536 WARN org.apache.hadoop.ipc.Client: E

[jira] [Updated] (YARN-11148) In federation and security mode, nm recover may fail.

2022-05-13 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11148: --- Description: In a federation YARN cluster with security enabled and NM recovery enabled, NM restart may f

[jira] [Commented] (YARN-6539) Create SecureLogin inside Router

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537385#comment-17537385 ] zhengchenyu commented on YARN-6539: --- Any new progress about this? I have apply this patc

[jira] [Updated] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10775: --- Description: I setup a yarn federation cluster, I can't connect the running app web, but the complet

[jira] [Updated] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10775: --- Attachment: YARN-10775-design-doc.001.pdf > Federation: Yarn running app web can't be unable to conne

[jira] [Updated] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10775: --- Description: I setup a yarn federation cluster, I can't connect the running app web, but the complet

[jira] [Created] (YARN-11153) Make proxy server support yarn federation.

2022-05-16 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11153: -- Summary: Make proxy server support yarn federation. Key: YARN-11153 URL: https://issues.apache.org/jira/browse/YARN-11153 Project: Hadoop YARN Issue Type: Improv

[jira] [Created] (YARN-11154) Make router support proxy server.

2022-05-16 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11154: -- Summary: Make router support proxy server. Key: YARN-11154 URL: https://issues.apache.org/jira/browse/YARN-11154 Project: Hadoop YARN Issue Type: Improvement

[jira] [Commented] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537392#comment-17537392 ] zhengchenyu commented on YARN-10775: YARN-10786 describe same problem, but have two p

[jira] [Updated] (YARN-11153) Make proxy server support yarn federation.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11153: --- Description: Detail message see: https://issues.apache.org/jira/browse/YARN-10775 and  > Make proxy

[jira] [Updated] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10775: --- Description: I setup a yarn federation cluster, I can't connect the running app web, but the complet

[jira] [Updated] (YARN-11154) Make router support proxy server.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11154: --- Description: Detail message see: https://issues.apache.org/jira/browse/YARN-10775 and YARN-10775-des

[jira] [Updated] (YARN-11153) Make proxy server support yarn federation.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11153: --- Description: Detail message see: https://issues.apache.org/jira/browse/YARN-10775 and YARN-10775-des

[jira] [Comment Edited] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537403#comment-17537403 ] zhengchenyu edited comment on YARN-10775 at 5/16/22 8:44 AM: -

[jira] [Commented] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537403#comment-17537403 ] zhengchenyu commented on YARN-10775: [~inigoiri]  [~snemeth] [~ayushsaxena] [~bteke]

[jira] [Comment Edited] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532828#comment-17532828 ] zhengchenyu edited comment on YARN-11127 at 5/16/22 8:45 AM: -

[jira] [Comment Edited] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537403#comment-17537403 ] zhengchenyu edited comment on YARN-10775 at 5/16/22 8:49 AM: -

[jira] [Updated] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10775: --- Description: I setup a yarn federation cluster, I can't connect the running app web, but the complet

[jira] [Updated] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10775: --- Description: I setup a yarn federation cluster, I can't connect the running app web, but the complet

[jira] [Updated] (YARN-10775) Federation: Yarn running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-05-16 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10775: --- Description: I setup a yarn federation cluster, I can't connect the running app web, but the complet

[jira] [Commented] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-05-19 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539363#comment-17539363 ] zhengchenyu commented on YARN-11127: Thanks [~hexiaoqiao] . Maybe it is a low probabi

[jira] [Updated] (YARN-11153) Make proxy server support yarn federation.

2022-06-05 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11153: --- Parent: YARN-10775 Issue Type: Sub-task (was: Improvement) > Make proxy server support yarn

[jira] [Updated] (YARN-11154) Make router support proxy server.

2022-06-05 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11154: --- Parent: YARN-10775 Issue Type: Sub-task (was: Improvement) > Make router support proxy serve

[jira] [Commented] (YARN-10775) Federation: YARN running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-06-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550333#comment-17550333 ] zhengchenyu commented on YARN-10775: [~slfan1989]  Answer 1: This operation impleme

[jira] [Commented] (YARN-10775) Federation: YARN running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-06-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550343#comment-17550343 ] zhengchenyu commented on YARN-10775: I think you need read my answer and document aga

[jira] [Comment Edited] (YARN-10775) Federation: YARN running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-06-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550343#comment-17550343 ] zhengchenyu edited comment on YARN-10775 at 6/6/22 8:36 AM: [

[jira] [Commented] (YARN-11127) Potential deadlock in AsyncDispatcher caused by RMNodeImpl, SchedulerApplicationAttempt and RMAppImpl's lock contention.

2022-06-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550349#comment-17550349 ] zhengchenyu commented on YARN-11127: [~slfan1989] Thanks for review. In fact, fix an

[jira] [Commented] (YARN-10775) Federation: YARN running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-06-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550355#comment-17550355 ] zhengchenyu commented on YARN-10775: In our cluster, the final version, I don't regar

[jira] [Commented] (YARN-10775) Federation: YARN running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-06-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550363#comment-17550363 ] zhengchenyu commented on YARN-10775: Thanks for review and suggestion. Welcome contin

[jira] [Commented] (YARN-10775) Federation: YARN running app web can't be unable to connect, because AppMaster can't redirect to the right address.

2022-06-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550401#comment-17550401 ] zhengchenyu commented on YARN-10775: [~slfan1989] In fact, the picture of chapter 3 h

[jira] [Created] (YARN-11172) Fix testDelegationToken

2022-06-06 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11172: -- Summary: Fix testDelegationToken Key: YARN-11172 URL: https://issues.apache.org/jira/browse/YARN-11172 Project: Hadoop YARN Issue Type: Improvement R

[jira] [Assigned] (YARN-11172) Fix testDelegationToken

2022-06-06 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu reassigned YARN-11172: -- Assignee: zhengchenyu > Fix testDelegationToken > --- > >

[jira] [Updated] (YARN-11154) Make router support proxy server.

2022-06-14 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11154: --- Attachment: YARN-11154.draft.patch > Make router support proxy server. >

[jira] [Updated] (YARN-11154) Make router support proxy server.

2022-06-14 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11154: --- Attachment: (was: YARN-11154.draft.patch) > Make router support proxy server. > -

[jira] [Created] (YARN-11183) Federation: Remove outdated ApplicationHomeSubCluster in federation state store.

2022-06-14 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11183: -- Summary: Federation: Remove outdated ApplicationHomeSubCluster in federation state store. Key: YARN-11183 URL: https://issues.apache.org/jira/browse/YARN-11183 Project: H

[jira] [Commented] (YARN-11183) Federation: Remove outdated ApplicationHomeSubCluster in federation state store.

2022-06-14 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554030#comment-17554030 ] zhengchenyu commented on YARN-11183: In our first version, I remove ApplicationHomeSu

[jira] [Comment Edited] (YARN-11183) Federation: Remove outdated ApplicationHomeSubCluster in federation state store.

2022-06-14 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554030#comment-17554030 ] zhengchenyu edited comment on YARN-11183 at 6/14/22 10:49 AM: -

[jira] [Updated] (YARN-11154) Make router support proxy server.

2022-06-14 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-11154: --- Attachment: YARN-11154.draft.patch > Make router support proxy server. >

[jira] [Commented] (YARN-11154) Make router support proxy server.

2022-06-14 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554034#comment-17554034 ] zhengchenyu commented on YARN-11154: [~slfan1989] Hi, I submit a draft patch firstly.

[jira] [Comment Edited] (YARN-11183) Federation: Remove outdated ApplicationHomeSubCluster in federation state store.

2022-06-14 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554030#comment-17554030 ] zhengchenyu edited comment on YARN-11183 at 6/14/22 11:17 AM: -

[jira] [Comment Edited] (YARN-11183) Federation: Remove outdated ApplicationHomeSubCluster in federation state store.

2022-06-17 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554030#comment-17554030 ] zhengchenyu edited comment on YARN-11183 at 6/17/22 9:18 AM: -

[jira] [Commented] (YARN-5936) when cpu strict mode is closed, yarn couldn't assure scheduling fairness between containers

2022-09-02 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17599379#comment-17599379 ] zhengchenyu commented on YARN-5936: --- For work change, I miss long long time. In fact, w

[jira] [Resolved] (YARN-5936) when cpu strict mode is closed, yarn couldn't assure scheduling fairness between containers

2022-09-02 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu resolved YARN-5936. --- Resolution: Not A Problem > when cpu strict mode is closed, yarn couldn't assure scheduling fairness

[jira] [Updated] (YARN-5936) when cpu strict mode is closed, yarn couldn't assure scheduling fairness between containers

2022-09-02 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-5936: -- Target Version/s: (was: 2.7.1) > when cpu strict mode is closed, yarn couldn't assure scheduling fairn

[jira] [Commented] (YARN-11183) Federation: Remove outdated ApplicationHomeSubCluster in federation state store.

2022-11-08 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-11183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630300#comment-17630300 ] zhengchenyu commented on YARN-11183: [~goiri]  Hi, can you please review this PR? the

[jira] [Updated] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-03-03 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.004.patch > AsyncDispatcher will stuck introduced by YARN-8995. >

[jira] [Commented] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-03-03 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294982#comment-17294982 ] zhengchenyu commented on YARN-10642: Okay, I submit YARN-10642.004.patch which repair

[jira] [Comment Edited] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-03-03 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294982#comment-17294982 ] zhengchenyu edited comment on YARN-10642 at 3/4/21, 4:15 AM: -

[jira] [Commented] (YARN-10221) Nodemanager lockups on printEventQueueDetails

2021-03-03 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294988#comment-17294988 ] zhengchenyu commented on YARN-10221: Please follow YARN-10642 which explain the reaso

[jira] [Updated] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-03-04 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-10642: --- Attachment: YARN-10642.005.patch > AsyncDispatcher will stuck introduced by YARN-8995. >

[jira] [Commented] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.

2021-03-04 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295286#comment-17295286 ] zhengchenyu commented on YARN-10642: [~pbacsko] Okay, work done in YARN-10642.005.pat

[jira] [Commented] (YARN-10642) Race condition: AsyncDispatcher can get stuck by the changes introduced in YARN-8995

2021-03-05 Thread zhengchenyu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296075#comment-17296075 ] zhengchenyu commented on YARN-10642: [~pbacsko] Yes, I think it's Java's bug. Then I
