[jira] [Updated] (YARN-11018) RM rest api show error resources in capacity scheduler with nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caozhiqiang updated YARN-11018:
-------------------------------
    Attachment: YARN-11018.001.patch

> RM rest api show error resources in capacity scheduler with nodelabels
> ----------------------------------------------------------------------
>
>                 Key: YARN-11018
>                 URL: https://issues.apache.org/jira/browse/YARN-11018
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 3.4.0
>            Reporter: caozhiqiang
>            Assignee: caozhiqiang
>            Priority: Major
>         Attachments: YARN-11018.001.patch
>
> Because resource metrics are updated only for the "default" partition,
> allocatedMB, allocatedVCores, totalMB, and totalVirtualCores are wrong in
> the Capacity Scheduler when node labels are used.
> When we fetch cluster metrics with 'curl http://rm:8088/ws/v1/cluster/metrics',
> we get wrong totalMB and totalVirtualCores values.
> These metrics should instead be aggregated across all partitions.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
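The fix the reporter describes (aggregating totals across all node-label partitions instead of reading only the "default" partition) can be illustrated with a small sketch. The dictionary shape and the function below are illustrative stand-ins, not the real /ws/v1/cluster/metrics payload or any actual ResourceManager API:

```python
# Sketch: compute cluster-wide totals by summing over every partition,
# rather than reporting only the "default" partition's figures.
# The data shape here is a simplified stand-in for per-partition resources.

def cluster_totals(partitions):
    """Sum totalMB and totalVirtualCores over every partition."""
    total_mb = sum(p["totalMB"] for p in partitions.values())
    total_vcores = sum(p["totalVirtualCores"] for p in partitions.values())
    return {"totalMB": total_mb, "totalVirtualCores": total_vcores}

partitions = {
    "": {"totalMB": 8192, "totalVirtualCores": 8},     # default partition
    "gpu": {"totalMB": 4096, "totalVirtualCores": 4},  # labeled partition
}

# Reading only the default partition would report 8192 MB / 8 vcores;
# aggregating yields the true cluster-wide totals.
print(cluster_totals(partitions))  # {'totalMB': 12288, 'totalVirtualCores': 12}
```

This mirrors the symptom in the report: with a labeled partition present, the default-partition-only figure undercounts the cluster by exactly the labeled partition's resources.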
[jira] [Updated] (YARN-11018) RM rest api show error resources in capacity scheduler with nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caozhiqiang updated YARN-11018:
-------------------------------
    Attachment: (was: YARN-11018.001.patch)
[jira] [Created] (YARN-11026) Make default AppPlacementAllocator configurable
Minni Mittal created YARN-11026:
-----------------------------------

             Summary: Make default AppPlacementAllocator configurable
                 Key: YARN-11026
                 URL: https://issues.apache.org/jira/browse/YARN-11026
             Project: Hadoop YARN
          Issue Type: Task
            Reporter: Minni Mittal
            Assignee: Minni Mittal
[jira] [Commented] (YARN-11018) RM rest api show error resources in capacity scheduler with nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451547#comment-17451547 ]

Hadoop QA commented on YARN-11018:
----------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec       |  0m 37s | Docker mode activated. |
|| Prechecks ||
| +1 | dupname      |  0m  0s | No case conflicting files found. |
| +1 | @author      |  0m  0s | The patch does not contain any @author tags. |
| +1 | test4tests   |  0m  0s | The patch appears to include 1 new or modified test file. |
|| trunk Compile Tests ||
| +1 | mvninstall   | 21m 51s | trunk passed |
| +1 | compile      |  1m  8s | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile      |  0m 57s | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle   |  0m 46s | trunk passed |
| +1 | mvnsite      |  1m  0s | trunk passed |
| +1 | shadedclient | 16m 43s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc      |  0m 48s | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc      |  0m 46s | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
|  0 | spotbugs     | 20m 13s | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs     |  1m 55s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall   |  0m 55s | the patch passed |
| +1 | compile      |  1m  0s | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javac        |  1m  0s | the patch passed |
| +1 | compile      |  0m 49s | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | javac        |  0m 49s | the patch passed |
| -0 | checkstyle   |  0m 36s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2). Logfile: https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/1250/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| +1 | mvnsite      |  0m 55s | the patch passed |
| +1 | whitespace   |  0m  0s | The patch has no whitespace issues. |
| +1 | shadedclient | 14m 34s | patch has no errors when building and testing our client artifacts. |
[jira] [Updated] (YARN-10863) CGroupElasticMemoryController is not work
[ https://issues.apache.org/jira/browse/YARN-10863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

LuoGe updated YARN-10863:
-------------------------
    Attachment: YARN-10863.004.patch

> CGroupElasticMemoryController is not work
> -----------------------------------------
>
>                 Key: YARN-10863
>                 URL: https://issues.apache.org/jira/browse/YARN-10863
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.3.1
>            Reporter: LuoGe
>            Priority: Major
>         Attachments: YARN-10863.001-1.patch, YARN-10863.002.patch, YARN-10863.004.patch
>
> When following the [documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManagerCGroupsMemory.html]
> to configure elastic memory resource control
> (yarn.nodemanager.elastic-memory-control.enabled set to true,
> yarn.nodemanager.resource.memory.enforced set to false,
> yarn.nodemanager.pmem-check-enabled set to true, and
> yarn.nodemanager.resource.memory.enabled set to true so that cgroups
> control memory), elastic memory control does not work.
> In ContainersMonitorImpl.java, the skip logic in the checkLimit function
> has a problem: it returns early only when strictMemoryEnforcement is true
> and elasticMemoryEnforcement is false. So with the documented elastic
> memory settings, the check logic still runs, and a container whose memory
> usage exceeds its limit is killed by checkLimit.
> {code:java}
> if (strictMemoryEnforcement && !elasticMemoryEnforcement) {
>   // When cgroup-based strict memory enforcement is used alone without
>   // elastic memory control, the oom-kill would take care of it.
>   // However, when elastic memory control is also enabled, the oom killer
>   // would be disabled at the root yarn container cgroup level (all child
>   // cgroups would inherit that setting). Hence, we fall back to the
>   // polling-based mechanism.
>   return;
> }
> {code}
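The reporter's reading of the quoted condition can be checked with a tiny truth-table sketch. The function name and the boolean model are illustrative only, not the actual ContainersMonitorImpl API:

```python
# Model of the early-return guard quoted above from checkLimit().
# With the documented elastic-memory setup (strict enforcement off,
# elastic control on), the guard is false, so the polling-based check
# still runs and can kill an over-limit container. Names illustrative.

def skips_polling_check(strict_memory_enforcement, elastic_memory_enforcement):
    """True when checkLimit() returns early, leaving OOM handling to cgroups."""
    return strict_memory_enforcement and not elastic_memory_enforcement

# Documented elastic setup: memory.enforced=false, elastic-memory-control=true
assert skips_polling_check(False, True) is False  # polling check still runs

# Strict cgroup enforcement alone is the only case that skips polling
assert skips_polling_check(True, False) is True
```

This makes the reported mismatch concrete: the only configuration that bypasses the polling kill path is strict enforcement without elastic control, which is not the configuration the documentation tells users to set.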
[jira] [Updated] (YARN-10863) CGroupElasticMemoryController is not work
[ https://issues.apache.org/jira/browse/YARN-10863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

LuoGe updated YARN-10863:
-------------------------
    Attachment: (was: YARN-10863.003.patch)
[jira] [Updated] (YARN-11025) Implement distributed decommissioning
[ https://issues.apache.org/jira/browse/YARN-11025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Minni Mittal updated YARN-11025:
--------------------------------
    Summary: Implement distributed decommissioning  (was: Implement distributed maintenance)

> Implement distributed decommissioning
> -------------------------------------
>
>                 Key: YARN-11025
>                 URL: https://issues.apache.org/jira/browse/YARN-11025
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Minni Mittal
>            Assignee: Minni Mittal
>            Priority: Major
[jira] [Created] (YARN-11025) Implement distributed maintenance
Minni Mittal created YARN-11025:
-----------------------------------

             Summary: Implement distributed maintenance
                 Key: YARN-11025
                 URL: https://issues.apache.org/jira/browse/YARN-11025
             Project: Hadoop YARN
          Issue Type: New Feature
            Reporter: Minni Mittal
            Assignee: Minni Mittal
[jira] [Commented] (YARN-10863) CGroupElasticMemoryController is not work
[ https://issues.apache.org/jira/browse/YARN-10863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451491#comment-17451491 ]

Hadoop QA commented on YARN-10863:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec       |  1m 14s | Docker mode activated. |
|| Prechecks ||
| +1 | dupname      |  0m  1s | No case conflicting files found. |
| +1 | @author      |  0m  0s | The patch does not contain any @author tags. |
| +1 | test4tests   |  0m  0s | The patch appears to include 2 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall   | 22m 28s | trunk passed |
| +1 | compile      |  1m 33s | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile      |  1m 31s | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle   |  0m 33s | trunk passed |
| +1 | mvnsite      |  0m 46s | trunk passed |
| +1 | shadedclient | 15m 22s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc      |  0m 40s | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc      |  0m 37s | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
|  0 | spotbugs     | 18m  0s | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs     |  1m 22s | trunk passed |
|| Patch Compile Tests ||
| -1 | mvninstall   |  0m 35s | hadoop-yarn-server-nodemanager in the patch failed. Logfile: https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/1249/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| -1 | compile      |  1m 23s | hadoop-yarn-server-nodemanager in the patch failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04. Logfile: https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/1249/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt |
| -1 | javac        |  1m 23s | hadoop-yarn-server-nodemanager in the patch failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04. Logfile: https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/1249/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt |
| -1 | compile      |  1m 19s | hadoop-yarn-server-nodemanager in the patch failed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10. Logfile: https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/1249/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt |
[jira] [Updated] (YARN-11018) RM rest api show error resources in capacity scheduler with nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caozhiqiang updated YARN-11018:
-------------------------------
    Attachment: YARN-11018.001.patch
[jira] [Updated] (YARN-11018) RM rest api show error resources in capacity scheduler with nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caozhiqiang updated YARN-11018:
-------------------------------
    Attachment: (was: YARN-11018.001.patch)
[jira] [Updated] (YARN-10863) CGroupElasticMemoryController is not work
[ https://issues.apache.org/jira/browse/YARN-10863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

LuoGe updated YARN-10863:
-------------------------
    Attachment: YARN-10863.003.patch
[jira] [Commented] (YARN-11018) RM rest api show error resources in capacity scheduler with nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451294#comment-17451294 ]

Hadoop QA commented on YARN-11018:
----------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec       |  0m 52s | Docker mode activated. |
|| Prechecks ||
| +1 | dupname      |  0m  0s | No case conflicting files found. |
| +1 | @author      |  0m  0s | The patch does not contain any @author tags. |
| +1 | test4tests   |  0m  0s | The patch appears to include 1 new or modified test file. |
|| trunk Compile Tests ||
| +1 | mvninstall   | 23m 24s | trunk passed |
| +1 | compile      |  1m  5s | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile      |  0m 56s | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle   |  0m 45s | trunk passed |
| +1 | mvnsite      |  0m 59s | trunk passed |
| +1 | shadedclient | 16m 17s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc      |  0m 45s | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc      |  0m 42s | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
|  0 | spotbugs     | 19m 45s | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs     |  2m  1s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall   |  0m 59s | the patch passed |
| +1 | compile      |  1m  2s | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javac        |  1m  2s | the patch passed |
| +1 | compile      |  0m 52s | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | javac        |  0m 52s | the patch passed |
| -0 | checkstyle   |  0m 38s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 27 new + 2 unchanged - 0 fixed = 29 total (was 2). Logfile: https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/1248/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| +1 | mvnsite      |  0m 54s | the patch passed |
| +1 | whitespace   |  0m  0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 51s | patch has no errors when building and testing our client artifacts. |
[jira] [Updated] (YARN-11024) Create an AbstractLeafQueue to store the common LeafQueue + AutoCreatedLeafQueue functionality
[ https://issues.apache.org/jira/browse/YARN-11024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated YARN-11024:
----------------------------------
    Labels: pull-request-available  (was: )

> Create an AbstractLeafQueue to store the common LeafQueue + AutoCreatedLeafQueue functionality
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-11024
>                 URL: https://issues.apache.org/jira/browse/YARN-11024
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Benjamin Teke
>            Assignee: Benjamin Teke
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> AbstractAutoCreatedLeafQueue extends LeafQueue, which is an instantiable
> class, so every time an AutoCreatedLeafQueue is created a normal LeafQueue
> is configured as well. This setup results in some strange behaviour, such
> as having to pass the template configs of an auto-created queue to a leaf
> queue. To make the whole structure more flexible, an AbstractLeafQueue
> should be created to hold the common methods.
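The refactor described above (hoisting shared behaviour into a non-instantiable base class so the auto-created variant no longer inherits a concrete queue's configuration path) can be sketched in miniature. The class and method names below are illustrative, not the actual CapacityScheduler types:

```python
# Miniature sketch of the proposed hierarchy: behaviour common to plain
# and auto-created leaf queues lives in an abstract base, so
# AutoCreatedLeafQueue no longer drags in LeafQueue's concrete
# configuration path. All names here are illustrative only.
from abc import ABC, abstractmethod

class AbstractLeafQueue(ABC):
    """Holds behaviour shared by plain and auto-created leaf queues."""
    def __init__(self, name):
        self.name = name

    @abstractmethod
    def setup_config(self):
        """Each concrete queue type configures itself its own way."""

class LeafQueue(AbstractLeafQueue):
    def setup_config(self):
        return f"{self.name}: static queue config"

class AutoCreatedLeafQueue(AbstractLeafQueue):
    def setup_config(self):
        # Template-derived configs apply here without ever running
        # LeafQueue's static configuration path.
        return f"{self.name}: template-derived config"

print(LeafQueue("root.a").setup_config())             # root.a: static queue config
print(AutoCreatedLeafQueue("root.b").setup_config())  # root.b: template-derived config
```

The design point is that the base class cannot be instantiated and carries no configuration of its own, which is exactly the "strange behaviour" the issue wants to eliminate from the current LeafQueue-extends chain.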
[jira] [Created] (YARN-11024) Create an AbstractLeafQueue to store the common LeafQueue + AutoCreatedLeafQueue functionality
Benjamin Teke created YARN-11024:
------------------------------------

             Summary: Create an AbstractLeafQueue to store the common LeafQueue + AutoCreatedLeafQueue functionality
                 Key: YARN-11024
                 URL: https://issues.apache.org/jira/browse/YARN-11024
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Benjamin Teke
            Assignee: Benjamin Teke
[jira] [Updated] (YARN-11023) Extend the root QueueInfo with max-parallel-apps in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated YARN-11023:
----------------------------------
    Labels: pull-request-available  (was: )

> Extend the root QueueInfo with max-parallel-apps in CapacityScheduler
> ---------------------------------------------------------------------
>
>                 Key: YARN-11023
>                 URL: https://issues.apache.org/jira/browse/YARN-11023
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 3.4.0
>            Reporter: Tamas Domok
>            Assignee: Tamas Domok
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> YARN-10891 extended QueueInfo with the maxParallelApps property, but the
> property is missing for the root queue.
[jira] [Updated] (YARN-11018) RM rest api show error resources in capacity scheduler with nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caozhiqiang updated YARN-11018: --- Attachment: YARN-11018.001.patch > RM rest api show error resources in capacity scheduler with nodelabels > -- > > Key: YARN-11018 > URL: https://issues.apache.org/jira/browse/YARN-11018 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.4.0 >Reporter: caozhiqiang >Assignee: caozhiqiang >Priority: Major > Attachments: YARN-11018.001.patch > > > Because resource metrics updated only for "default" partition, allocatedMB, > allocatedVCores, totalMB, totalVirtualCores are error in capacity scheduler > with nodelabels. > When we get cluster metrics use 'curl > [http://rm:8088/ws/v1/cluster/metrics',] we get error totalMB and > totalVirtualCores. > It should use resources across partition to replace. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11018) RM rest api show error resources in capacity scheduler with nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caozhiqiang updated YARN-11018: --- Attachment: (was: YARN-11018.001.patch) > RM rest api show error resources in capacity scheduler with nodelabels > -- > > Key: YARN-11018 > URL: https://issues.apache.org/jira/browse/YARN-11018 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler > Affects Versions: 3.4.0 > Reporter: caozhiqiang > Assignee: caozhiqiang > Priority: Major > > Because resource metrics are updated only for the "default" partition, allocatedMB, > allocatedVCores, totalMB and totalVirtualCores are incorrect in the capacity scheduler > with node labels. > When we fetch the cluster metrics with 'curl > http://rm:8088/ws/v1/cluster/metrics', we get incorrect totalMB and > totalVirtualCores. > Resources aggregated across all partitions should be used instead.
[jira] [Updated] (YARN-11018) RM rest api show error resources in capacity scheduler with nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caozhiqiang updated YARN-11018: --- Description: Because resource metrics are updated only for the "default" partition, allocatedMB, allocatedVCores, totalMB and totalVirtualCores are incorrect in the capacity scheduler with node labels. When we fetch the cluster metrics with 'curl http://rm:8088/ws/v1/cluster/metrics', we get incorrect totalMB and totalVirtualCores. Resources aggregated across all partitions should be used instead. was: Because resource metrics are updated only for the "default" partition, allocatedMB, allocatedVCores, totalMB and other resources are incorrect in the capacity scheduler with node labels. And in RM UI 1, the cluster metrics are incorrect. We can see that the total memory and total vcores do not equal the resources across all partitions. Resources aggregated across all partitions should be used instead. > RM rest api show error resources in capacity scheduler with nodelabels > -- > > Key: YARN-11018 > URL: https://issues.apache.org/jira/browse/YARN-11018 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler > Affects Versions: 3.4.0 > Reporter: caozhiqiang > Assignee: caozhiqiang > Priority: Major > > Because resource metrics are updated only for the "default" partition, allocatedMB, > allocatedVCores, totalMB and totalVirtualCores are incorrect in the capacity scheduler > with node labels. > When we fetch the cluster metrics with 'curl > http://rm:8088/ws/v1/cluster/metrics', we get incorrect totalMB and > totalVirtualCores. > Resources aggregated across all partitions should be used instead.
[jira] [Updated] (YARN-11018) RM rest api show error resources in capacity scheduler with nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caozhiqiang updated YARN-11018: --- Summary: RM rest api show error resources in capacity scheduler with nodelabels (was: RM UI 1 show error resources in capacity scheduler with multi nodelabels) > RM rest api show error resources in capacity scheduler with nodelabels > -- > > Key: YARN-11018 > URL: https://issues.apache.org/jira/browse/YARN-11018 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler > Affects Versions: 3.4.0 > Reporter: caozhiqiang > Assignee: caozhiqiang > Priority: Major > Attachments: YARN-11018.001.patch > > > Because resource metrics are updated only for the "default" partition, allocatedMB, > allocatedVCores, totalMB and other resources are incorrect in the capacity scheduler > with node labels. And in RM UI 1, the cluster metrics are incorrect. > We can see that the total memory and total vcores do not equal the resources across > all partitions. > Resources aggregated across all partitions should be used instead.
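The fix direction described above (aggregating resources across all partitions instead of reading only the "default" partition's metrics) can be sketched as follows. This is a minimal illustration; the class and method names are hypothetical, not the actual YARN metrics code:

```java
import java.util.Map;

// Hypothetical sketch: compute totalMB by summing over every node-label
// partition rather than reading only the "default" partition's value.
public class PartitionMetrics {
    // memory in MB keyed by partition name, e.g. "" (default), "gpu"
    static long totalMB(Map<String, Long> memoryByPartition) {
        // Summing across partitions instead of memoryByPartition.get("")
        return memoryByPartition.values().stream()
                .mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        Map<String, Long> mem = Map.of("", 4096L, "gpu", 8192L);
        System.out.println(totalMB(mem)); // prints 12288
    }
}
```

With only the default partition counted, a cluster like the one above would report 4096 MB instead of the true 12288 MB, which matches the symptom in the report.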
[jira] [Updated] (YARN-11022) Fix the documentation for max-parallel-apps in CS
[ https://issues.apache.org/jira/browse/YARN-11022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-11022: -- Description: The documentation does not mention that the max-parallel-apps property is inherited. The property can be overridden on a per-queue basis, but the parent(s) can also restrict how many parallel apps can run.
{*}yarn.scheduler.capacity.max-parallel-apps / yarn.scheduler.capacity..max-parallel-apps{*}:
{quote} Maximum number of applications that can run at the same time. Unlike to maximum-applications, application submissions are not rejected when this limit is reached. Instead they stay in ACCEPTED state until they are eligible to run. This can be set for all queues with yarn.scheduler.capacity.max-parallel-apps and can also be overridden on a per queue basis by setting yarn.scheduler.capacity..max-parallel-apps. Integer value is expected. By default, there is no limit. {quote}
[https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSMaxRunningAppsEnforcer.java#L99]
{code}
private boolean exceedQueueMaxParallelApps(AbstractCSQueue queue) {
  // Check queue and all parent queues
  while (queue != null) {
    if (queue.getNumRunnableApps() >= queue.getMaxParallelApps()) {
      LOG.info("Maximum runnable apps exceeded for queue {}", queue.getQueuePath());
      return true;
    }
    queue = (AbstractCSQueue) queue.getParent();
  }
  return false;
}
{code}
Example: Let's say the user configured *yarn.scheduler.capacity.max-parallel-apps* to 250; that will be the default for queues that don't override the setting.
([https://github.com/apache/hadoop/blob/32ecaed9c3c06a48ef01d0437e62e8faccd3e9f3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1688])
Given this queue hierarchy:
||root||.a||.a1||.a2||.a3||.a4||
|500|default|50|10|default|15|
||root||.a||.b||
|500|default|50|
- A maximum of 250 apps can run in parallel under the *root.a* queue.
- A maximum of 50 apps can run in parallel under the *root.a.a1* queue.
- A maximum of 10 apps can run in parallel under the *root.a.a1.a2* queue.
- A maximum of *10* apps can run in parallel under the *root.a.a1.a2.a3* queue (even though max-parallel-apps is not set for .a3, so the default of 250 applies to that queue, its parent has a lower value and children cannot exceed it).
- A maximum of *10* apps can run in parallel under the *root.a.a1.a2.a3.a4* queue (even though it is configured for 15, the parents restrict this limit to 10).
- A maximum of 50 apps can run in parallel under the *root.a.b* queue.
was: The documentation does not mention that the max-parallel-apps property is inherited. The property can be overridden on a per-queue basis, but the parent(s) can also restrict how many parallel apps can run. {*}yarn.scheduler.capacity.max-parallel-apps / yarn.scheduler.capacity..max-parallel-apps{*}: Maximum number of applications that can run at the same time. Unlike to maximum-applications, application submissions are not rejected when this limit is reached. Instead they stay in ACCEPTED state until they are eligible to run. This can be set for all queues with yarn.scheduler.capacity.max-parallel-apps and can also be overridden on a per queue basis by setting yarn.scheduler.capacity..max-parallel-apps. Integer value is expected. By default, there is no limit.
[https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSMaxRunningAppsEnforcer.java#L99]
private boolean exceedQueueMaxParallelApps(AbstractCSQueue queue) {
  // Check queue and all parent queues
  while (queue != null) {
    if (queue.getNumRunnableApps() >= queue.getMaxParallelApps()) {
      LOG.info("Maximum runnable apps exceeded for queue {}", queue.getQueuePath());
      return true;
    }
    queue = (AbstractCSQueue) queue.getParent();
  }
  return false;
}
Example: Let's say the user configured *yarn.scheduler.capacity.max-parallel-apps* to 250; that will be the default for queues that don't override the setting. ([https://github.com/apache/hadoop/blob/32ecaed9c3c06a48ef01d0437e62e8faccd3e9f3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1688])
Given this queue hierarchy:
||root||.a||.a1||.a2||.a3||.a4||
[jira] [Updated] (YARN-11022) Fix the documentation for max-parallel-apps in CS
[ https://issues.apache.org/jira/browse/YARN-11022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-11022: -- Description: The documentation does not mention that the max-parallel-apps property is inherited. The property can be overridden on a per-queue basis, but the parent(s) can also restrict how many parallel apps can run.
{*}yarn.scheduler.capacity.max-parallel-apps / yarn.scheduler.capacity..max-parallel-apps{*}:
{quote} Maximum number of applications that can run at the same time. Unlike to maximum-applications, application submissions are not rejected when this limit is reached. Instead they stay in ACCEPTED state until they are eligible to run. This can be set for all queues with yarn.scheduler.capacity.max-parallel-apps and can also be overridden on a per queue basis by setting yarn.scheduler.capacity..max-parallel-apps. Integer value is expected. By default, there is no limit. {quote}
[https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSMaxRunningAppsEnforcer.java#L99]
{code}
private boolean exceedQueueMaxParallelApps(AbstractCSQueue queue) {
  // Check queue and all parent queues
  while (queue != null) {
    if (queue.getNumRunnableApps() >= queue.getMaxParallelApps()) {
      LOG.info("Maximum runnable apps exceeded for queue {}", queue.getQueuePath());
      return true;
    }
    queue = (AbstractCSQueue) queue.getParent();
  }
  return false;
}
{code}
Example: Let's say the user configured *yarn.scheduler.capacity.max-parallel-apps* to 250; that will be the default for queues that don't override the setting.
([https://github.com/apache/hadoop/blob/32ecaed9c3c06a48ef01d0437e62e8faccd3e9f3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1688])
Given this queue hierarchy:
||root||.a||.a1||.a2||.a3||.a4||
|500|default|50|10|default|15|
||root||.a||.b||
|500|default|50|
- A maximum of 250 apps can run in parallel under the *root.a* queue.
- A maximum of 50 apps can run in parallel under the *root.a.a1* queue.
- A maximum of 10 apps can run in parallel under the *root.a.a1.a2* queue.
- A maximum of *10* apps can run in parallel under the *root.a.a1.a2.a3* queue (even though max-parallel-apps is not set for .a3, so the default of 250 applies to that queue, its parent has a lower value and children cannot exceed it).
- A maximum of *10* apps can run in parallel under the *root.a.a1.a2.a3.a4* queue (even though it is configured for 15, the parents restrict this limit to 10).
- A maximum of 50 apps can run in parallel under the *root.a.b* queue.
was: The documentation does not mention that the max-parallel-apps property is inherited. The property can be overridden on a per-queue basis, but the parent(s) can also restrict how many parallel apps can run. {*}yarn.scheduler.capacity.max-parallel-apps / yarn.scheduler.capacity..max-parallel-apps{*}: {quote} Maximum number of applications that can run at the same time. Unlike to maximum-applications, application submissions are not rejected when this limit is reached. Instead they stay in ACCEPTED state until they are eligible to run. This can be set for all queues with yarn.scheduler.capacity.max-parallel-apps and can also be overridden on a per queue basis by setting yarn.scheduler.capacity..max-parallel-apps. Integer value is expected. By default, there is no limit.
{quote}
[https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSMaxRunningAppsEnforcer.java#L99]
{code}
private boolean exceedQueueMaxParallelApps(AbstractCSQueue queue) {
  // Check queue and all parent queues
  while (queue != null) {
    if (queue.getNumRunnableApps() >= queue.getMaxParallelApps()) {
      LOG.info("Maximum runnable apps exceeded for queue {}", queue.getQueuePath());
      return true;
    }
    queue = (AbstractCSQueue) queue.getParent();
  }
  return false;
}
{code}
Example: Let's say the user configured *yarn.scheduler.capacity.max-parallel-apps* to 250; that will be the default for queues that don't override the setting. ([https://github.com/apache/hadoop/blob/32ecaed9c3c06a48ef01d0437e62e8faccd3e9f3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1688])
Given this queue hierarchy:
[jira] [Created] (YARN-11023) Extend the root QueueInfo with max-parallel-apps in CapacityScheduler
Tamas Domok created YARN-11023: -- Summary: Extend the root QueueInfo with max-parallel-apps in CapacityScheduler Key: YARN-11023 URL: https://issues.apache.org/jira/browse/YARN-11023 Project: Hadoop YARN Issue Type: Bug Components: capacity scheduler Affects Versions: 3.4.0 Reporter: Tamas Domok Assignee: Tamas Domok YARN-10891 extended the QueueInfo with the maxParallelApps property, but for the root queue this property is missing. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-11022) Fix the documentation for max-parallel-apps in CS
Tamas Domok created YARN-11022: -- Summary: Fix the documentation for max-parallel-apps in CS Key: YARN-11022 URL: https://issues.apache.org/jira/browse/YARN-11022 Project: Hadoop YARN Issue Type: Bug Components: capacity scheduler Affects Versions: 3.4.0 Reporter: Tamas Domok Assignee: Tamas Domok
The documentation does not mention that the max-parallel-apps property is inherited. The property can be overridden on a per-queue basis, but the parent(s) can also restrict how many parallel apps can run.
{*}yarn.scheduler.capacity.max-parallel-apps / yarn.scheduler.capacity..max-parallel-apps{*}: Maximum number of applications that can run at the same time. Unlike to maximum-applications, application submissions are not rejected when this limit is reached. Instead they stay in ACCEPTED state until they are eligible to run. This can be set for all queues with yarn.scheduler.capacity.max-parallel-apps and can also be overridden on a per queue basis by setting yarn.scheduler.capacity..max-parallel-apps. Integer value is expected. By default, there is no limit.
[https://github.com/apache/hadoop/blob/03cfc852791c14fad39db4e5b14104a276c08e59/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSMaxRunningAppsEnforcer.java#L99]
private boolean exceedQueueMaxParallelApps(AbstractCSQueue queue) {
  // Check queue and all parent queues
  while (queue != null) {
    if (queue.getNumRunnableApps() >= queue.getMaxParallelApps()) {
      LOG.info("Maximum runnable apps exceeded for queue {}", queue.getQueuePath());
      return true;
    }
    queue = (AbstractCSQueue) queue.getParent();
  }
  return false;
}
Example: Let's say the user configured *yarn.scheduler.capacity.max-parallel-apps* to 250; that will be the default for queues that don't override the setting.
([https://github.com/apache/hadoop/blob/32ecaed9c3c06a48ef01d0437e62e8faccd3e9f3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1688])
Given this queue hierarchy:
||root||.a||.a1||.a2||.a3||.a4||
|500|default|50|10|default|15|
||root||.a||.b||
|500|default|50|
- A maximum of 250 apps can run in parallel under the *root.a* queue.
- A maximum of 50 apps can run in parallel under the *root.a.a1* queue.
- A maximum of 10 apps can run in parallel under the *root.a.a1.a2* queue.
- A maximum of *10* apps can run in parallel under the *root.a.a1.a2.a3* queue (even though max-parallel-apps is not set for .a3, so the default of 250 applies to that queue, its parent has a lower value and children cannot exceed it).
- A maximum of *10* apps can run in parallel under the *root.a.a1.a2.a3.a4* queue (even though it is configured for 15, the parents restrict this limit to 10).
- A maximum of 50 apps can run in parallel under the *root.a.b* queue.
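The inheritance rule illustrated by the bullet list above boils down to taking the minimum limit along the path from the root to the leaf: the enforcement loop checks every ancestor, so no queue can admit more parallel apps than its most restrictive ancestor allows. A minimal sketch with illustrative names (not the actual CapacityScheduler classes):

```java
import java.util.Arrays;

// Illustrative sketch of the effective max-parallel-apps ceiling for a
// queue: the minimum of the limits configured along its ancestor path.
public class MaxParallelApps {
    // limits from root down to the leaf; queues without an explicit value
    // use the configured default (250 in the example above)
    static int effectiveLimit(int[] limitsRootToLeaf) {
        // Every ancestor is checked, so the lowest limit on the path wins.
        return Arrays.stream(limitsRootToLeaf).min().orElse(Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        // root=500, .a=default(250), .a1=50, .a2=10, .a3=default(250), .a4=15
        System.out.println(effectiveLimit(new int[]{500, 250, 50, 10, 250, 15})); // prints 10
    }
}
```

This reproduces the surprising case in the example: .a4 is configured for 15 but is effectively capped at 10 by its grandparent .a2.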
[jira] [Commented] (YARN-11020) [UI2] No container is found for an application attempt with a single AM container
[ https://issues.apache.org/jira/browse/YARN-11020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451027#comment-17451027 ] Andras Gyori commented on YARN-11020: - Thanks [~adam.antal] for chiming in. I agree with you that it is a bug on the YARN RM services side, as making a distinction between multiple and single entries in responses is really bad practice. However, there is a slight but non-zero possibility that someone is already using this endpoint aside from UI2. > [UI2] No container is found for an application attempt with a single AM > container > - > > Key: YARN-11020 > URL: https://issues.apache.org/jira/browse/YARN-11020 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 > Reporter: Andras Gyori > Assignee: Andras Gyori > Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > In UI2, under an application's Logs tab, a "No container data available" > message is shown if the application attempt only submitted a single container > (which is the AM container).
> The culprit of the issue is that the response from YARN is not consistent,
> because for a single container it looks like:
> {noformat}
> {
>   "containerLogsInfo": {
>     "containerLogInfo": [
>       { "fileName": "prelaunch.out", "fileSize": "100", "lastModifiedTime": "Mon Nov 29 09:28:16 + 2021" },
>       { "fileName": "directory.info", "fileSize": "2296", "lastModifiedTime": "Mon Nov 29 09:28:16 + 2021" },
>       { "fileName": "stderr", "fileSize": "1722", "lastModifiedTime": "Mon Nov 29 09:28:28 + 2021" },
>       { "fileName": "prelaunch.err", "fileSize": "0", "lastModifiedTime": "Mon Nov 29 09:28:16 + 2021" },
>       { "fileName": "stdout", "fileSize": "0", "lastModifiedTime": "Mon Nov 29 09:28:16 + 2021" },
>       { "fileName": "syslog", "fileSize": "38551", "lastModifiedTime": "Mon Nov 29 09:28:28 + 2021" },
>       { "fileName": "launch_container.sh", "fileSize": "5013", "lastModifiedTime": "Mon Nov 29 09:28:16 + 2021" }
>     ],
>     "logAggregationType": "AGGREGATED",
>     "containerId": "container_1638174027957_0008_01_01",
>     "nodeId": "da175178c179:43977"
>   }
> }{noformat}
> As for applications with multiple containers it looks like:
> {noformat}
> {
>   "containerLogsInfo": [{
>   }, { }]
> }{noformat}
> We cannot change the response of the endpoint due to backward compatibility,
> therefore we need to make UI2 able to handle both scenarios.
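The "handle both scenarios" approach described above can be sketched as a client-side normalization step: treat `containerLogsInfo` as a list whether YARN returned a single JSON object or an array. This is a hypothetical sketch modeling the parsed JSON as plain Java collections, not the actual UI2 (Ember) code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical normalization for the inconsistent containerLogsInfo field:
// YARN returns a single object for one container and an array for several.
public class LogsInfoNormalizer {
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> asList(Object containerLogsInfo) {
        if (containerLogsInfo instanceof List) {
            // multiple-container case: already a list
            return (List<Map<String, Object>>) containerLogsInfo;
        }
        // single-container case: wrap the lone object in a list
        List<Map<String, Object>> one = new ArrayList<>();
        one.add((Map<String, Object>) containerLogsInfo);
        return one;
    }

    public static void main(String[] args) {
        Object single = Map.of("containerId", "container_1638174027957_0008_01_01");
        Object many = List.of(Map.of("containerId", "c1"), Map.of("containerId", "c2"));
        System.out.println(asList(single).size()); // prints 1
        System.out.println(asList(many).size());   // prints 2
    }
}
```

Downstream code then always iterates a list, so the single-AM-container case no longer falls through to "No container data available".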
[jira] [Commented] (YARN-10863) CGroupElasticMemoryController is not work
[ https://issues.apache.org/jira/browse/YARN-10863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451026#comment-17451026 ] Hadoop QA commented on YARN-10863: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 49s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 2 new or modified test files. 
{color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 27s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 29s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 24s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 37s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 18m 22s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 1m 30s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 22s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 28s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 28s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/1247/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 2 new + 1 unchanged - 0 fixed = 3 total (was 1) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} shellcheck {color} | {color:red} 0m 0s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/1247/artifact/out/diff-patch-shellcheck.txt{color} | {color:red} The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) 
{color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 18s{color} | {color:green}{color} | {color:green}
[jira] [Updated] (YARN-10863) CGroupElasticMemoryController is not work
[ https://issues.apache.org/jira/browse/YARN-10863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] LuoGe updated YARN-10863: - Attachment: YARN-10863.002.patch > CGroupElasticMemoryController is not work > - > > Key: YARN-10863 > URL: https://issues.apache.org/jira/browse/YARN-10863 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager > Affects Versions: 3.3.1 > Reporter: LuoGe > Priority: Major > Attachments: YARN-10863.001-1.patch, YARN-10863.002.patch > > > When following the [documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManagerCGroupsMemory.html] to configure elastic memory resource control, with yarn.nodemanager.elastic-memory-control.enabled set to true, yarn.nodemanager.resource.memory.enforced set to false, yarn.nodemanager.pmem-check-enabled set to true, and yarn.nodemanager.resource.memory.enabled set to true so that cgroups control memory, elastic memory control does not work.
> In ContainersMonitorImpl.java, the skip logic in the checkLimit function has a problem: the early return only happens when strictMemoryEnforcement is true and elasticMemoryEnforcement is false. So, with elastic memory control enabled as the documentation describes, the check logic continues, and a container whose memory usage exceeds the limit is killed by checkLimit.
> {code:java}
> if (strictMemoryEnforcement && !elasticMemoryEnforcement) {
>   // When cgroup-based strict memory enforcement is used alone without
>   // elastic memory control, the oom-kill would take care of it.
>   // However, when elastic memory control is also enabled, the oom killer
>   // would be disabled at the root yarn container cgroup level (all child
>   // cgroups would inherit that setting). Hence, we fall back to the
>   // polling-based mechanism.
> return;
> }
> {code}
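The guard quoted in the description reduces to a single boolean condition: the polling-based check is skipped only when strict cgroup enforcement is used alone, because then the kernel OOM killer handles limits; once elastic memory control is enabled, the OOM killer is disabled at the root YARN cgroup and polling must keep running. A minimal sketch for experimenting with that condition (illustrative class name, not NodeManager code):

```java
// Boolean sketch of the checkLimit guard quoted above.
public class MemoryEnforcementGuard {
    static boolean skipPollingCheck(boolean strictMemoryEnforcement,
                                    boolean elasticMemoryEnforcement) {
        // Skip only when strict cgroup enforcement runs without elastic
        // memory control; the cgroup OOM killer covers that case.
        return strictMemoryEnforcement && !elasticMemoryEnforcement;
    }

    public static void main(String[] args) {
        System.out.println(skipPollingCheck(true, false));  // prints true
        System.out.println(skipPollingCheck(false, true));  // prints false
    }
}
```

The report's configuration (strict enforcement off, elastic control on) makes the guard false, so the polling check continues and kills over-limit containers, which is the behavior being disputed.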
[jira] [Updated] (YARN-11018) RM UI 1 show error resources in capacity scheduler with multi nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caozhiqiang updated YARN-11018: --- Environment: (was: !resource1.png|width=774,height=185!) > RM UI 1 show error resources in capacity scheduler with multi nodelabels > > > Key: YARN-11018 > URL: https://issues.apache.org/jira/browse/YARN-11018 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler > Affects Versions: 3.4.0 > Reporter: caozhiqiang > Assignee: caozhiqiang > Priority: Major > Attachments: YARN-11018.001.patch > > > Because resource metrics are updated only for the "default" partition, allocatedMB, > allocatedVCores, totalMB and other resources are incorrect in the capacity scheduler > with node labels. And in RM UI 1, the cluster metrics are incorrect. > We can see that the total memory and total vcores do not equal the resources across > all partitions. > Resources aggregated across all partitions should be used instead.
[jira] (YARN-11018) RM UI 1 show error resources in capacity scheduler with multi nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018 ] caozhiqiang deleted comment on YARN-11018: was (Author: caozhiqiang): It is related to [YARN-10343|https://issues.apache.org/jira/browse/YARN-10343], so this patch is cancelled. > RM UI 1 show error resources in capacity scheduler with multi nodelabels > > > Key: YARN-11018 > URL: https://issues.apache.org/jira/browse/YARN-11018 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler > Affects Versions: 3.4.0 > Environment: !resource1.png|width=774,height=185! > Reporter: caozhiqiang > Assignee: caozhiqiang > Priority: Major > Attachments: YARN-11018.001.patch > > > Because resource metrics are updated only for the "default" partition, allocatedMB, > allocatedVCores, totalMB and other resources are incorrect in the capacity scheduler > with node labels. And in RM UI 1, the cluster metrics are incorrect. > We can see that the total memory and total vcores do not equal the resources across > all partitions. > Resources aggregated across all partitions should be used instead.
[jira] [Updated] (YARN-11018) RM UI 1 show error resources in capacity scheduler with multi nodelabels
[ https://issues.apache.org/jira/browse/YARN-11018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caozhiqiang updated YARN-11018: --- Attachment: (was: resource1.png) > RM UI 1 show error resources in capacity scheduler with multi nodelabels > > > Key: YARN-11018 > URL: https://issues.apache.org/jira/browse/YARN-11018 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler > Affects Versions: 3.4.0 > Environment: !resource1.png|width=774,height=185! > Reporter: caozhiqiang > Assignee: caozhiqiang > Priority: Major > Attachments: YARN-11018.001.patch > > > Because resource metrics are updated only for the "default" partition, allocatedMB, > allocatedVCores, totalMB and other resources are incorrect in the capacity scheduler > with node labels. And in RM UI 1, the cluster metrics are incorrect. > We can see that the total memory and total vcores do not equal the resources across > all partitions. > Resources aggregated across all partitions should be used instead.