[jira] [Comment Edited] (YARN-10738) When multi thread scheduling with multi node, we should shuffle with a gap to prevent hot accessing nodes.
[ https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335104#comment-17335104 ]

Qi Zhu edited comment on YARN-10738 at 4/29/21, 2:48 AM:
-

Thanks [~Jim_Brennan] for the review and the very patient investigation.

The original ResourceUsageMultiNodeLookupPolicy sometimes causes hot nodes in the test cluster, and the gap shuffle reduced the hot-node cases by more than 50%. The gap of 10 is something we should discuss: the right value depends on the size of the cluster, and a well-chosen gap will give better results.

I agree that another option to consider would be a policy that uses node utilization, which should more accurately reflect how busy the node is.

We should also shuffle when sorting by node utilization, because multi-threaded scheduling, which runs without node-heartbeat scheduling, may commit everything to the same first node. That causes a hot node, and hot nodes are a big bottleneck for real-time clusters.

In fact, hot nodes mainly affect real-time clusters, because those clusters are stricter about job latency.

I agree that we should do more testing and practice to move this issue forward. I think it will be helpful for large-scale clusters, especially real-time clusters, and I will help a lot.

Thanks again [~Jim_Brennan].

was (Author: zhuqi):

Thanks [~Jim_Brennan] for the review and the very patient investigation.

The original ResourceUsageMultiNodeLookupPolicy sometimes causes hot nodes in the test cluster, and the gap shuffle reduced the hot-node cases by more than 50%. The gap of 10 is something we should discuss: the right value depends on the size of the cluster, and a well-chosen gap will give better results.

I agree that another option to consider would be a policy that uses node utilization, which should more accurately reflect how busy the node is.

We should also shuffle when sorting by node utilization, because multi-threaded scheduling, which runs without node-heartbeat scheduling, may commit everything to the same first node. That causes a hot node, and hot nodes are a big bottleneck for real-time clusters.

In fact, hot nodes mainly affect real-time clusters, because those clusters are stricter about job latency.

Thanks.

> When multi thread scheduling with multi node, we should shuffle with a gap to prevent hot accessing nodes.
> --
>
> Key: YARN-10738
> URL: https://issues.apache.org/jira/browse/YARN-10738
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Qi Zhu
> Assignee: Qi Zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Multi-threaded scheduling with multi-node placement currently does not behave reasonably.
> In large clusters it causes hot-accessed nodes, which leads to abnormally overloaded nodes.
> Solution:
> I think we should shuffle the sorted nodes (for example, under the available-resource sort policy) at an interval.
> This will solve the above problem and avoid hot-accessed nodes.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
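For illustration, the "shuffle with a gap" idea from this thread can be sketched as below. This is a minimal sketch under stated assumptions, not the YARN-10738 patch: the `Node` class, the `sortThenGapShuffle` method, and the `allocatedMb` field are hypothetical names, and sorting by allocated memory only stands in for whatever comparator the real lookup policy uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of a gap shuffle; not code from the YARN-10738 patch. */
final class GapShuffleSketch {

  /** Minimal stand-in for a scheduler node: an id and its allocated memory. */
  static final class Node {
    final String id;
    final long allocatedMb;

    Node(String id, long allocatedMb) {
      this.id = id;
      this.allocatedMb = allocatedMb;
    }
  }

  /**
   * Sorts nodes by allocated memory (least allocated first), then shuffles
   * each consecutive window of {@code gap} nodes. The coarse ordering is kept,
   * but concurrent scheduling threads no longer all see the same first node.
   */
  static List<Node> sortThenGapShuffle(List<Node> nodes, int gap, Random rnd) {
    if (gap <= 0) {
      throw new IllegalArgumentException("gap must be positive");
    }
    List<Node> sorted = new ArrayList<>(nodes);
    sorted.sort(Comparator.comparingLong((Node n) -> n.allocatedMb));
    for (int start = 0; start < sorted.size(); start += gap) {
      int end = Math.min(start + gap, sorted.size());
      // subList is a view, so shuffling it reorders the backing list in place.
      Collections.shuffle(sorted.subList(start, end), rnd);
    }
    return sorted;
  }
}
```

With a gap of 10, a node can move at most 9 positions from its sorted rank, so the result is still "roughly sorted". As the thread notes, a suitable gap likely depends on cluster size.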
[jira] [Comment Edited] (YARN-10738) When multi thread scheduling with multi node, we should shuffle with a gap to prevent hot accessing nodes.
[ https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335104#comment-17335104 ]

Qi Zhu edited comment on YARN-10738 at 4/29/21, 2:46 AM:
-

Thanks [~Jim_Brennan] for the review and the very patient investigation.

The original ResourceUsageMultiNodeLookupPolicy sometimes causes hot nodes in the test cluster, and the gap shuffle reduced the hot-node cases by more than 50%. The gap of 10 is something we should discuss: the right value depends on the size of the cluster, and a well-chosen gap will give better results.

I agree that another option to consider would be a policy that uses node utilization, which should more accurately reflect how busy the node is.

We should also shuffle when sorting by node utilization, because multi-threaded scheduling, which runs without node-heartbeat scheduling, may commit everything to the same first node. That causes a hot node, and hot nodes are a big bottleneck for real-time clusters.

In fact, hot nodes mainly affect real-time clusters, because those clusters are stricter about job latency.

Thanks.

was (Author: zhuqi):

Thanks [~Jim_Brennan] for the review and the very patient investigation.

The original ResourceUsageMultiNodeLookupPolicy sometimes causes hot nodes in the test cluster, and the gap shuffle reduced the hot-node cases by more than 50%. The gap of 10 is something we should discuss: the right value depends on the size of the cluster, and a well-chosen gap will give better results.

I agree that another option to consider would be a policy that uses node utilization, which should more accurately reflect how busy the node is.

We should also shuffle when sorting by node utilization, because multi-threaded scheduling, which runs without node-heartbeat scheduling, will commit to the same first node. That causes a hot node, and hot nodes are a big bottleneck for real-time clusters.

In fact, hot nodes mainly affect real-time clusters, because those clusters are stricter about job latency.

Thanks.
[jira] [Comment Edited] (YARN-10738) When multi thread scheduling with multi node, we should shuffle with a gap to prevent hot accessing nodes.
[ https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335104#comment-17335104 ]

Qi Zhu edited comment on YARN-10738 at 4/29/21, 2:45 AM:
-

Thanks [~Jim_Brennan] for the review and the very patient investigation.

The original ResourceUsageMultiNodeLookupPolicy sometimes causes hot nodes in the test cluster, and the gap shuffle reduced the hot-node cases by more than 50%. The gap of 10 is something we should discuss: the right value depends on the size of the cluster, and a well-chosen gap will give better results.

I agree that another option to consider would be a policy that uses node utilization, which should more accurately reflect how busy the node is.

We should also shuffle when sorting by node utilization, because multi-threaded scheduling, which runs without node-heartbeat scheduling, will commit to the same first node. That causes a hot node, and hot nodes are a big bottleneck for real-time clusters.

In fact, hot nodes mainly affect real-time clusters, because those clusters are stricter about job latency.

Thanks.

was (Author: zhuqi):

Thanks [~Jim_Brennan] for the review and the very patient investigation.

The original ResourceUsageMultiNodeLookupPolicy sometimes causes hot nodes in the test cluster, and the gap shuffle reduced the hot-node cases by more than 50%. The gap of 10 is something we should discuss: the right value depends on the size of the cluster, and a well-chosen gap will give better results.

I agree that another option to consider would be a policy that uses node utilization, which should more accurately reflect how busy the node is.

We should also shuffle when sorting by node utilization, because multi-threaded scheduling will commit to the same first node. That causes a hot node, and hot nodes are a big bottleneck for real-time clusters.

In fact, hot nodes mainly affect real-time clusters, because those clusters are stricter about job latency.

Thanks.
[jira] [Commented] (YARN-10738) When multi thread scheduling with multi node, we should shuffle with a gap to prevent hot accessing nodes.
[ https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335104#comment-17335104 ]

Qi Zhu commented on YARN-10738:
---

Thanks [~Jim_Brennan] for the review and the very patient investigation.

The original ResourceUsageMultiNodeLookupPolicy sometimes causes hot nodes in the test cluster, and the gap shuffle reduced the hot-node cases by more than 50%. The gap of 10 is something we should discuss: the right value depends on the size of the cluster, and a well-chosen gap will give better results.

I agree that another option to consider would be a policy that uses node utilization, which should more accurately reflect how busy the node is.

We should also shuffle when sorting by node utilization, because multi-threaded scheduling will commit to the same first node. That causes a hot node, and hot nodes are a big bottleneck for real-time clusters.

In fact, hot nodes mainly affect real-time clusters, because those clusters are stricter about job latency.

Thanks.
[jira] [Commented] (YARN-10738) When multi thread scheduling with multi node, we should shuffle with a gap to prevent hot accessing nodes.
[ https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17335017#comment-17335017 ]

Jim Brennan commented on YARN-10738:

[~zhuqi], I am not very familiar with the multi-threaded scheduling code - we have not started using it yet. So it would be very helpful if you could provide more details about what you are observing in your cluster, and how you think this will fix it. Is your cluster made up of many nodes that are the same size, or do you have a mix of different sizes? If you have any data that shows some nodes being more heavily utilized than others, that would be helpful.

Looking at {{ResourceUsageMultiNodeLookupPolicy}}, it seems to sort by the resources allocated to a node, so it seems to be trying to ensure we allocate more evenly across nodes. It doesn't consider the relative sizes of the nodes though, so in a heterogeneous cluster, I could see it leading to smaller nodes being busier than larger nodes. I wonder if a reverse sort by unallocated resources might be more fair, because it would favor nodes that have more room for new resource requests, rather than those that currently have fewer resources allocated.

Another option to consider would be to have a policy that uses node utilization, which should more accurately reflect how busy the node is.

With respect to the policy proposed in this ticket, I am not convinced it will help very much. It's doing the same sort by allocated resources, but just adding a shuffle of every 10 nodes. I'm not sure how much that will help in practice on a large cluster. A rack is usually more than 10 nodes, so it's possible the same set of racks will be over-utilized. Again, it would be helpful if you had some before/after data to show how it helps in a real cluster.
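The three orderings discussed in this comment (by allocated resources, by unallocated resources in reverse, and by utilization) can be compared with a small sketch. Everything here is an assumption made for illustration: the `Node` class, its memory-only fields, and the comparator names are hypothetical and do not mirror the YARN `SchedulerNode` API or the actual {{ResourceUsageMultiNodeLookupPolicy}} code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical comparison of node orderings; not YARN scheduler code. */
final class NodeOrderingSketch {

  /** Stand-in node with total, allocated, and actually used memory in MB. */
  static final class Node {
    final String id;
    final long totalMb;
    final long allocatedMb;
    final long usedMb;

    Node(String id, long totalMb, long allocatedMb, long usedMb) {
      this.id = id;
      this.totalMb = totalMb;
      this.allocatedMb = allocatedMb;
      this.usedMb = usedMb;
    }

    long unallocatedMb() {
      return totalMb - allocatedMb;
    }

    double utilization() {
      return (double) usedMb / totalMb;
    }
  }

  /** Least allocated first; ignores how large the node is. */
  static final Comparator<Node> BY_ALLOCATED =
      Comparator.comparingLong((Node n) -> n.allocatedMb);

  /** Most free room first; favors nodes that can still take requests. */
  static final Comparator<Node> BY_UNALLOCATED_DESC =
      Comparator.comparingLong((Node n) -> n.unallocatedMb()).reversed();

  /** Least busy first, measured by used/total rather than by allocation. */
  static final Comparator<Node> BY_UTILIZATION =
      Comparator.comparingDouble((Node n) -> n.utilization());

  /** Returns node ids in the order a scheduler would try them. */
  static List<String> order(List<Node> nodes, Comparator<Node> cmp) {
    List<Node> copy = new ArrayList<>(nodes);
    copy.sort(cmp);
    List<String> ids = new ArrayList<>();
    for (Node n : copy) {
      ids.add(n.id);
    }
    return ids;
  }
}
```

In a heterogeneous cluster with an 8 GB node that has 4 GB allocated and a 64 GB node that has 16 GB allocated, `BY_ALLOCATED` tries the small node first even though it has far less room, while `BY_UNALLOCATED_DESC` tries the large node first.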
[jira] [Commented] (YARN-10571) Refactor dynamic queue handling logic
[ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334763#comment-17334763 ]

Hadoop QA commented on YARN-10571:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 16s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 22m 34s | | trunk passed |
| +1 | compile | 1m 0s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | compile | 0m 51s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | checkstyle | 0m 45s | | trunk passed |
| +1 | mvnsite | 0m 54s | | trunk passed |
| +1 | shadedclient | 17m 2s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 40s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 37s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| 0 | spotbugs | 20m 7s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 1m 50s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 50s | | the patch passed |
| +1 | compile | 0m 55s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| -1 | javac | 0m 55s | https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/945/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 generated 5 new + 51 unchanged - 5 fixed = 56 total (was 56) |
| +1 | compile | 0m 45s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| -1 | javac | 0m 45s | https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/945/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 generated 5 new + 40 unchanged - 5 fixed = 45 total (was 45) |
| +1 | checkstyle | 0m 41s | |
[jira] [Commented] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334756#comment-17334756 ]

Qi Zhu commented on YARN-10707:
---

The test that failed with a timeout is not related to this patch; it passes locally.

> Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.
>
>
> Key: YARN-10707
> URL: https://issues.apache.org/jira/browse/YARN-10707
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: Qi Zhu
> Assignee: Qi Zhu
> Priority: Major
> Attachments: YARN-10707.001.patch, YARN-10707.002.patch, YARN-10707.003.patch, YARN-10707.004.patch, YARN-10707.005.patch, YARN-10707.006.patch, YARN-10707.007.patch, YARN-10707.008.patch, YARN-10707.009.patch, YARN-10707.010.patch, YARN-10707.011.patch
>
>
> Support GPU in ResourceUtilization, and update Node GPU Utilization to use it first.
> It will be very helpful for other use cases involving GPU utilization.
[jira] [Commented] (YARN-10759) Encapsulate queue config modes
[ https://issues.apache.org/jira/browse/YARN-10759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334727#comment-17334727 ]

Andras Gyori commented on YARN-10759:
-

Uploaded an initial revision of this refactor. For now I have skipped refactoring the ManagedParent and AutoCreatedLeafQueue related logic. I have also kept this improvement as simple as possible. It is not yet justified to complicate things by:
* Encapsulating child queue modes for a ParentQueue
* Creating a new WEIGHT mode, as it is more of a subtype of RELATIVE mode

This might change, however, according to the needs of the tasks defined under this Jira, e.g. YARN-9936.

> Encapsulate queue config modes
> --
>
> Key: YARN-10759
> URL: https://issues.apache.org/jira/browse/YARN-10759
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacity scheduler
> Reporter: Andras Gyori
> Assignee: Andras Gyori
> Priority: Major
> Attachments: YARN-10759.001.patch
>
>
> Capacity Scheduler queues have three modes:
> * relative/percentage
> * weight
> * absolute
> Most of them have their own:
> * validation logic
> * config setting logic
> * effective capacity calculation logic
> This logic can easily be extracted and encapsulated in separate config mode classes.
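The kind of encapsulation described in this issue can be sketched with one type per mode that owns its own validation and effective-capacity calculation. This is only an illustrative sketch: `CapacityModeSketch`, its method names, and the simplified formulas are assumptions and do not come from YARN-10759.001.patch or the Capacity Scheduler source.

```java
/** Hypothetical sketch of per-mode queue capacity logic; not Capacity Scheduler code. */
enum CapacityModeSketch {

  /** Relative mode: the configured value is a percentage of the parent. */
  PERCENTAGE {
    @Override
    void validate(double configured) {
      if (configured < 0 || configured > 100) {
        throw new IllegalArgumentException("percentage must be in [0, 100]: " + configured);
      }
    }

    @Override
    double effectiveCapacity(double configured, double parent, double siblingWeightSum) {
      return parent * configured / 100.0;
    }
  },

  /** Weight mode: the queue receives its share of the sibling weight sum. */
  WEIGHT {
    @Override
    void validate(double configured) {
      if (configured < 0) {
        throw new IllegalArgumentException("weight must be non-negative: " + configured);
      }
    }

    @Override
    double effectiveCapacity(double configured, double parent, double siblingWeightSum) {
      if (siblingWeightSum <= 0) {
        throw new IllegalArgumentException("sibling weight sum must be positive");
      }
      return parent * configured / siblingWeightSum;
    }
  },

  /** Absolute mode: the configured value is a resource amount, capped by the parent. */
  ABSOLUTE {
    @Override
    void validate(double configured) {
      if (configured < 0) {
        throw new IllegalArgumentException("absolute capacity must be non-negative: " + configured);
      }
    }

    @Override
    double effectiveCapacity(double configured, double parent, double siblingWeightSum) {
      return Math.min(configured, parent);
    }
  };

  abstract void validate(double configured);

  abstract double effectiveCapacity(double configured, double parent, double siblingWeightSum);

  /** Validates the configured value, then computes the effective capacity. */
  double resolve(double configured, double parent, double siblingWeightSum) {
    validate(configured);
    return effectiveCapacity(configured, parent, siblingWeightSum);
  }
}
```

Callers then hold a mode value and call `resolve` instead of branching on the mode at each call site. Whether WEIGHT deserves its own type or belongs under the relative mode is exactly the open question raised in the comment above.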
[jira] [Updated] (YARN-10759) Encapsulate queue config modes
[ https://issues.apache.org/jira/browse/YARN-10759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andras Gyori updated YARN-10759:

Attachment: YARN-10759.001.patch
[jira] [Commented] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334691#comment-17334691 ]

Hadoop QA commented on YARN-10707:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 22m 25s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| 0 | buf | 0m 0s | | buf was not available. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests || ||
| 0 | mvndep | 3m 21s | | Maven dependency ordering for branch |
| +1 | mvninstall | 22m 26s | | trunk passed |
| +1 | compile | 9m 59s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 8m 28s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 1m 38s | | trunk passed |
| +1 | mvnsite | 2m 27s | | trunk passed |
| +1 | shadedclient | 19m 33s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 5s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 2m 17s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 29m 18s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 5m 27s | | trunk passed |
|| || || || Patch Compile Tests || ||
| 0 | mvndep | 0m 19s | | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 52s | | the patch passed |
| +1 | compile | 9m 13s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | cc | 9m 13s | | the patch passed |
| +1 | javac | 9m 13s | | the patch passed |
| +1 | compile | 8m 13s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | cc | 8m 13s | | the patch passed |
| +1 | javac | 8m 13s | | the patch passed |
| +1 | checkstyle | 1m 36s | | the patch passed |
| +1 | mvnsite | 2m 20s |
[jira] [Commented] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334689#comment-17334689 ]

Hadoop QA commented on YARN-9927:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 0m 45s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests || ||
| 0 | mvndep | 3m 4s | | Maven dependency ordering for branch |
| +1 | mvninstall | 23m 11s | | trunk passed |
| +1 | compile | 10m 55s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 8m 39s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 1m 48s | | trunk passed |
| +1 | mvnsite | 3m 6s | | trunk passed |
| +1 | shadedclient | 19m 44s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 39s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 2m 42s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 31m 19s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 6m 18s | | trunk passed |
|| || || || Patch Compile Tests || ||
| 0 | mvndep | 0m 28s | | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 15s | | the patch passed |
| +1 | compile | 10m 17s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javac | 10m 17s | | the patch passed |
| +1 | compile | 8m 57s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | javac | 8m 57s | | the patch passed |
| +1 | checkstyle | 1m 42s | | the patch passed |
| +1 | mvnsite | 2m 47s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 14m 55s | | patch has no errors when building and testing our client artifacts. |
[jira] [Commented] (YARN-10571) Refactor dynamic queue handling logic
[ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334675#comment-17334675 ]

Andras Gyori commented on YARN-10571:
-------------------------------------

Thank you [~pbacsko]. I have no idea what is going on, but I hope the OOM error is not related.

> Refactor dynamic queue handling logic
> -------------------------------------
>
> Key: YARN-10571
> URL: https://issues.apache.org/jira/browse/YARN-10571
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Andras Gyori
> Assignee: Andras Gyori
> Priority: Minor
> Attachments: YARN-10571.001.patch, YARN-10571.002.patch, YARN-10571.003.patch
>
>
> As per YARN-10506, we have introduced another mode for auto queue creation
> and a new class that handles it. We should move the old, managed queue
> related logic to CSAutoQueueHandler as well, and do additional cleanup
> regarding queue management.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10571) Refactor dynamic queue handling logic
[ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334674#comment-17334674 ]

Peter Bacsko commented on YARN-10571:
-------------------------------------

Thanks [~gandras] for the patch. Do you know what's going on with the javac warnings? That code wasn't even touched. Maybe it has to do with the failing build ("Unable to create native thread"). I'll trigger a rebuild.
[jira] [Commented] (YARN-10571) Refactor dynamic queue handling logic
[ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334608#comment-17334608 ]

Hadoop QA commented on YARN-10571:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 16s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 24s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 4s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 20s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 27m 50s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 2m 58s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 28s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 28s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/942/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 generated 5 new + 51 unchanged - 5 fixed = 56 total (was 56) {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 22s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/942/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 generated 5 new + 40 unchanged - 5 fixed = 45 total (was 45) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 0s{color} | {color:green}{color} | {color:green
[jira] [Updated] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Qi Zhu updated YARN-10707:
--------------------------
    Attachment: YARN-10707.011.patch

> Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.
> -----------------------------------------------------------------------------------------
>
> Key: YARN-10707
> URL: https://issues.apache.org/jira/browse/YARN-10707
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: Qi Zhu
> Assignee: Qi Zhu
> Priority: Major
> Attachments: YARN-10707.001.patch, YARN-10707.002.patch,
> YARN-10707.003.patch, YARN-10707.004.patch, YARN-10707.005.patch,
> YARN-10707.006.patch, YARN-10707.007.patch, YARN-10707.008.patch,
> YARN-10707.009.patch, YARN-10707.010.patch, YARN-10707.011.patch
>
>
> Support GPU in ResourceUtilization, and update Node GPU Utilization to use it
> first.
> This will be very helpful for other use cases involving GPU utilization.
[jira] [Commented] (YARN-10707) Support custom resources in ResourceUtilization, and update Node GPU Utilization to use.
[ https://issues.apache.org/jira/browse/YARN-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334606#comment-17334606 ]

Qi Zhu commented on YARN-10707:
-------------------------------

Fixed the Javadoc in the latest patch.
[jira] [Comment Edited] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334565#comment-17334565 ]

Qi Zhu edited comment on YARN-9927 at 4/28/21, 8:29 AM:

[~ebadger] [~pbacsko] [~gandras] [~epayne]

I have uploaded a patch with two improvements:

1. Producing events to the event queue: each event type gets a single async dispatcher thread. I have counted the event types, and the number of threads is small. Each dispatcher has its own event queue, so events of different types no longer affect each other.
2. Consuming and processing events from the event queue: I added an example of multi-threaded processing in the handler, for RMNodeEvent only. Each RMNodeImpl object has its own write lock, so I think handling different RMNode objects on multiple threads can be faster.

What are your opinions on this?

Thanks.

was (Author: zhuqi):
[~ebadger] [~gandras] [~epayne]

I have uploaded a patch with two improvements:

1. Producing events to the event queue: each event type gets a single async dispatcher thread. I have counted the event types, and the number of threads is small. Each dispatcher has its own event queue, so events of different types no longer affect each other.
2. Consuming and processing events from the event queue: I added an example of multi-threaded processing in the handler, for RMNodeEvent only. Each RMNodeImpl object has its own write lock, so I think handling different RMNode objects on multiple threads can be faster.

What are your opinions on this?

Thanks.

> RM multi-thread event processing mechanism
> ------------------------------------------
>
> Key: YARN-9927
> URL: https://issues.apache.org/jira/browse/YARN-9927
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Affects Versions: 3.0.0, 2.9.2
> Reporter: hcarrot
> Assignee: Qi Zhu
> Priority: Major
> Attachments: RM multi-thread event processing mechanism.pdf,
> YARN-9927.001.patch, YARN-9927.002.patch, YARN-9927.003.patch,
> YARN-9927.004.patch, YARN-9927.005.patch
>
>
> Recently, we have observed serious event blocking in the RM event dispatcher
> queue. After analyzing RM event monitoring data and the RM event processing
> logic, we found that:
> 1) Environment: a cluster with thousands of nodes.
> 2) RMNodeStatusEvent accounts for 90% of the time consumed by the RM event scheduler.
> 3) Meanwhile, RM event processing is single-threaded, which leaves the RM
> event scheduler with little headroom and limits RM performance.
> So we propose an RM multi-thread event processing mechanism to improve RM
> performance.
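The two stages described in the YARN-9927 comment above (one dispatcher thread and queue per event type, then multi-threaded handling of node events) can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class `PerTypeDispatcherSketch`, its `Event` type, the `"NODE"` type string, and the choice to pin each node to a worker by hashing its ID are all assumptions made for the example. The hashing is one possible way to keep events for a single node in order while different nodes proceed in parallel.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: class, field, and method names are illustrative and
// are not taken from the YARN-9927 patch or from Hadoop's AsyncDispatcher.
final class PerTypeDispatcherSketch {

  /** Minimal stand-in for an RM event: type, affected node, sequence number. */
  static final class Event {
    final String type;
    final String nodeId;
    final int seq;

    Event(String type, String nodeId, int seq) {
      this.type = type;
      this.nodeId = nodeId;
      this.seq = seq;
    }
  }

  // Stage 1: one single-threaded executor per event type. Each executor owns
  // one thread and one FIFO queue, so a backlog of one type cannot delay another.
  private final Map<String, ExecutorService> dispatchers = new ConcurrentHashMap<>();

  // Stage 2 (node events only): a fixed set of single-threaded workers.
  private final ExecutorService[] nodeWorkers;
  private final Consumer<Event> handler;

  PerTypeDispatcherSketch(int nodeWorkerCount, Consumer<Event> handler) {
    this.nodeWorkers = new ExecutorService[nodeWorkerCount];
    for (int i = 0; i < nodeWorkerCount; i++) {
      nodeWorkers[i] = Executors.newSingleThreadExecutor();
    }
    this.handler = handler;
  }

  /** Enqueue an event on the dispatcher that belongs to its type. */
  void dispatch(Event e) {
    dispatchers
        .computeIfAbsent(e.type, t -> Executors.newSingleThreadExecutor())
        .execute(() -> route(e));
  }

  // Assumption: node events are pinned to a worker by node ID hash, so events
  // for one node stay ordered while different nodes are handled in parallel.
  private void route(Event e) {
    if ("NODE".equals(e.type)) {
      int idx = Math.floorMod(e.nodeId.hashCode(), nodeWorkers.length);
      nodeWorkers[idx].execute(() -> handler.accept(e));
    } else {
      handler.accept(e); // other types stay on their dispatcher thread
    }
  }

  /** Drain the dispatchers first, then the node workers they feed. */
  void shutdownAndWait() {
    try {
      for (ExecutorService d : dispatchers.values()) {
        d.shutdown();
        d.awaitTermination(10, TimeUnit.SECONDS);
      }
      for (ExecutorService w : nodeWorkers) {
        w.shutdown();
        w.awaitTermination(10, TimeUnit.SECONDS);
      }
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while draining", ie);
    }
  }
}
```

A caller would construct the sketch with a worker count and a handler, call `dispatch` for each event, and call `shutdownAndWait` when done. Whether per-node ordering must be preserved in the real ResourceManager, and how the worker count should be chosen, are questions for the actual patch review rather than something this sketch settles.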
[jira] [Commented] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334565#comment-17334565 ]

Qi Zhu commented on YARN-9927:
------------------------------

[~ebadger] [~gandras] [~epayne]

I have uploaded a patch with two improvements:

1. Producing events to the event queue: each event type gets a single async dispatcher thread. I have counted the event types, and the number of threads is small. Each dispatcher has its own event queue, so events of different types no longer affect each other.
2. Consuming and processing events from the event queue: I added an example of multi-threaded processing in the handler, for RMNodeEvent only. Each RMNodeImpl object has its own write lock, so I think handling different RMNode objects on multiple threads can be faster.

What are your opinions on this?

Thanks.
[jira] [Updated] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Qi Zhu updated YARN-9927:
-------------------------
    Attachment: YARN-9927.005.patch
[jira] [Updated] (YARN-10571) Refactor dynamic queue handling logic
[ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andras Gyori updated YARN-10571:
--------------------------------
    Attachment: YARN-10571.003.patch
[jira] [Updated] (YARN-10571) Refactor dynamic queue handling logic
[ https://issues.apache.org/jira/browse/YARN-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andras Gyori updated YARN-10571:
--------------------------------
    Attachment:     (was: YARN-10571.003.patch)