[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out
[ https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673870#comment-16673870 ]

Hadoop QA commented on YARN-8672:
---------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 20m 55s | trunk passed |
| +1 | compile | 1m 14s | trunk passed |
| +1 | checkstyle | 0m 30s | trunk passed |
| +1 | mvnsite | 0m 41s | trunk passed |
| +1 | shadedclient | 13m 10s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 2s | trunk passed |
| +1 | javadoc | 0m 27s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 39s | the patch passed |
| +1 | compile | 0m 57s | the patch passed |
| +1 | javac | 0m 57s | the patch passed |
| +1 | checkstyle | 0m 25s | the patch passed |
| +1 | mvnsite | 0m 40s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 14s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 4s | the patch passed |
| +1 | javadoc | 0m 23s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 19m 1s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 75m 9s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8672 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946760/YARN-8672.005.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 38bd19105dce 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 989715e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22411/testReport/ |
| Max. process+thread count | 306 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22411/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (YARN-8867) Retrieve the status of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673853#comment-16673853 ]

Hadoop QA commented on YARN-8867:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 11 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 21s | Maven dependency ordering for branch |
| +1 | mvninstall | 20m 28s | trunk passed |
| +1 | compile | 15m 48s | trunk passed |
| +1 | checkstyle | 3m 37s | trunk passed |
| +1 | mvnsite | 5m 27s | trunk passed |
| +1 | shadedclient | 22m 4s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 8m 50s | trunk passed |
| +1 | javadoc | 4m 6s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 27s | the patch passed |
| +1 | compile | 15m 31s | the patch passed |
| +1 | cc | 15m 31s | the patch passed |
| +1 | javac | 15m 31s | the patch passed |
| +1 | checkstyle | 3m 28s | root: The patch generated 0 new + 480 unchanged - 1 fixed = 480 total (was 481) |
| +1 | mvnsite | 5m 28s | the patch passed |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 32s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 9m 32s | the patch passed |
| +1 | javadoc | 3m 59s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 50s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 3m 31s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 2m 30s | hadoop-yarn-server-common in the patch passed. |
| +1 | unit | 18m 50s | hadoop-yarn-server-nodemanager in the patch passed. |
| -1 | unit | 106m 39s | hadoop-yarn-server-resourcemanager in the patch failed. |
| -1 | unit | 37m 24s | hadoop-yarn-client in the patch failed. |
| +1 | unit | 10m 13s | hadoop-mapreduce-client-app in the patch passed. |
| +1 | asflicense | 1m 2s | The patch does not generate ASF License warnings. |
[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml
[ https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673832#comment-16673832 ] Subru Krishnan commented on YARN-7592: -- Thanks [~rahulanand90] for the clarification. Can you update the patch after removing the flag (which I should mention is great) and quickly revalidate that there's no regression? +1 from my side pending that. > yarn.federation.failover.enabled missing in yarn-default.xml > > > Key: YARN-7592 > URL: https://issues.apache.org/jira/browse/YARN-7592 > Project: Hadoop YARN > Issue Type: Bug > Components: federation >Affects Versions: 3.0.0-beta1 >Reporter: Gera Shegalov >Priority: Major > Attachments: IssueReproduce.patch > > > yarn.federation.failover.enabled should be documented in yarn-default.xml. I > am also not sure why it should be true by default and force the HA retry > policy in {{RMProxy#createRMProxy}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
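For illustration, a minimal, self-contained sketch of how a boolean key with a hard-coded default of {{true}} silently takes effect when it is neither documented in yarn-default.xml nor overridden; this is plain JDK code, not the actual {{Configuration}} or {{RMProxy#createRMProxy}} logic:

{code:java}
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a missing yarn-default.xml entry still takes effect via a hard-coded default. */
public class FederationFailoverFlagSketch {

  // Hypothetical stand-in for Configuration.getBoolean(key, defaultValue).
  private static final Map<String, String> CONF = new HashMap<>();

  static boolean isFederationFailoverEnabled() {
    // The key exists in code but is not documented in yarn-default.xml (the gap reported here).
    String value = CONF.get("yarn.federation.failover.enabled");
    // Defaults to true when unset, which is what forces the HA retry policy path.
    return value == null ? true : Boolean.parseBoolean(value);
  }

  public static void main(String[] args) {
    System.out.println("failover enabled = " + isFederationFailoverEnabled());
  }
}
{code}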
[jira] [Commented] (YARN-6900) ZooKeeper based implementation of the FederationStateStore
[ https://issues.apache.org/jira/browse/YARN-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673830#comment-16673830 ] Subru Krishnan commented on YARN-6900: -- [~rahulanand90], I agree with you the parameters are tricky to identify. Programmatically, what we need is a serialize conf as defined [here|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/policies/manager/FederationPolicyManager.java#L101]. Manually, we could start with a key-value map where predefined keys could be router/amrmproxy weights or headroomAlpha. Thoughts? > ZooKeeper based implementation of the FederationStateStore > -- > > Key: YARN-6900 > URL: https://issues.apache.org/jira/browse/YARN-6900 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, nodemanager, resourcemanager >Reporter: Subru Krishnan >Assignee: Íñigo Goiri >Priority: Major > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: YARN-6900-002.patch, YARN-6900-003.patch, > YARN-6900-004.patch, YARN-6900-005.patch, YARN-6900-006.patch, > YARN-6900-007.patch, YARN-6900-008.patch, YARN-6900-009.patch, > YARN-6900-010.patch, YARN-6900-011.patch, YARN-6900-YARN-2915-000.patch, > YARN-6900-YARN-2915-001.patch > > > YARN-5408 defines the unified {{FederationStateStore}} API. Currently we only > support SQL based stores, this JIRA tracks adding a ZooKeeper based > implementation for simplifying deployment as it's already popularly used for > {{RMStateStore}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
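As a concrete sketch of the manual key-value option mentioned above; the key names are only examples of the predefined keys being discussed (router/amrmproxy weights, headroomAlpha), and the serialization format is a placeholder for whatever the FederationPolicyManager serialized conf ends up defining:

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of a key-value policy configuration serialized as a simple string for the state store. */
public class PolicyParamsSketch {
  public static void main(String[] args) {
    Map<String, String> params = new LinkedHashMap<>();
    // Example predefined keys: per-subcluster router/amrmproxy weights and headroomAlpha.
    params.put("router.weight.sc1", "0.7");
    params.put("router.weight.sc2", "0.3");
    params.put("amrmproxy.weight.sc1", "0.5");
    params.put("amrmproxy.weight.sc2", "0.5");
    params.put("headroomAlpha", "1.0");

    // Naive key=value serialization; the real format would be defined by the policy manager.
    StringBuilder sb = new StringBuilder();
    params.forEach((k, v) -> sb.append(k).append('=').append(v).append('\n'));
    System.out.print(sb);
  }
}
{code}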
[jira] [Commented] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out
[ https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673827#comment-16673827 ] Chandni Singh commented on YARN-8672: - Uploaded patch 5 with localizers having private token files. > TestContainerManager#testLocalingResourceWhileContainerRunning occasionally > times out > - > > Key: YARN-8672 > URL: https://issues.apache.org/jira/browse/YARN-8672 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0 >Reporter: Jason Lowe >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8672.001.patch, YARN-8672.002.patch, > YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch > > > Precommit builds have been failing in > TestContainerManager#testLocalingResourceWhileContainerRunning. I have been > able to reproduce the problem without any patch applied if I run the test > enough times. It looks like something is removing container tokens from the > nmPrivate area just as a new localizer starts. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
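A minimal sketch of the idea behind per-localizer private token files as described above; the directory layout and file naming here are illustrative stand-ins, not the actual patch:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: each localizer gets its own token file under the nmPrivate area. */
public class PrivateTokenFileSketch {

  static Path tokenFileFor(Path nmPrivateDir, String containerId, String localizerId)
      throws IOException {
    // Keying the file on the localizer (not just the container) avoids one localizer's cleanup
    // racing with a newly started localizer for the same container, the failure mode described above.
    Files.createDirectories(nmPrivateDir);
    Path tokenFile = nmPrivateDir.resolve(containerId + "." + localizerId + ".tokens");
    return Files.createFile(tokenFile);
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("nmPrivate");
    System.out.println(tokenFileFor(dir, "container_1_0001_01_000002", "localizer_42"));
  }
}
{code}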
[jira] [Updated] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out
[ https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8672: Attachment: YARN-8672.005.patch > TestContainerManager#testLocalingResourceWhileContainerRunning occasionally > times out > - > > Key: YARN-8672 > URL: https://issues.apache.org/jira/browse/YARN-8672 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0 >Reporter: Jason Lowe >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8672.001.patch, YARN-8672.002.patch, > YARN-8672.003.patch, YARN-8672.004.patch, YARN-8672.005.patch > > > Precommit builds have been failing in > TestContainerManager#testLocalingResourceWhileContainerRunning. I have been > able to reproduce the problem without any patch applied if I run the test > enough times. It looks like something is removing container tokens from the > nmPrivate area just as a new localizer starts. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8963) Add flag to disable interactive shell
[ https://issues.apache.org/jira/browse/YARN-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8963: Attachment: YARN-8963.001.patch > Add flag to disable interactive shell > - > > Key: YARN-8963 > URL: https://issues.apache.org/jira/browse/YARN-8963 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Priority: Major > Attachments: YARN-8963.001.patch > > > For some production jobs, the application admin might choose to disable > debugging to prevent developers or system admins from accessing the > containers. It would be nice to add an environment variable flag to disable > the interactive shell during application submission. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
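A sketch of what such a submission-time flag could look like; the environment variable name below is purely hypothetical, since the flag is not named in this thread:

{code:java}
import java.util.HashMap;
import java.util.Map;

/** Sketch: application admin disables the interactive shell via an env flag at submission time. */
public class DisableShellFlagSketch {

  // Hypothetical flag name, for illustration only.
  static final String DISABLE_SHELL_ENV = "YARN_CONTAINER_RUNTIME_INTERACTIVE_SHELL_DISABLED";

  public static void main(String[] args) {
    // Stand-in for the environment map set on the ContainerLaunchContext at submission.
    Map<String, String> containerEnv = new HashMap<>();
    containerEnv.put(DISABLE_SHELL_ENV, "true");

    boolean shellAllowed =
        !Boolean.parseBoolean(containerEnv.getOrDefault(DISABLE_SHELL_ENV, "false"));
    System.out.println("interactive shell allowed = " + shellAllowed);
  }
}
{code}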
[jira] [Updated] (YARN-8893) [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client
[ https://issues.apache.org/jira/browse/YARN-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-8893: --- Fix Version/s: 2.10.0 > [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client > --- > > Key: YARN-8893 > URL: https://issues.apache.org/jira/browse/YARN-8893 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Fix For: 2.10.0, 3.3.0 > > Attachments: YARN-8893.v1.patch, YARN-8893.v2.patch > > > Fix thread leak in AMRMClientRelayer and UAM client used by > FederationInterceptor, when destroying the interceptor pipeline in AMRMProxy. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8893) [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client
[ https://issues.apache.org/jira/browse/YARN-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673807#comment-16673807 ] Botong Huang commented on YARN-8893: Thanks [~giovanni.fumarola] for the review! Committed to branch-2 as well. > [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client > --- > > Key: YARN-8893 > URL: https://issues.apache.org/jira/browse/YARN-8893 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-8893.v1.patch, YARN-8893.v2.patch > > > Fix thread leak in AMRMClientRelayer and UAM client used by > FederationInterceptor, when destroying the interceptor pipeline in AMRMProxy. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8893) [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client
[ https://issues.apache.org/jira/browse/YARN-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673782#comment-16673782 ] Hudson commented on YARN-8893: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15351 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15351/]) YARN-8893. [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM (gifuma: rev 989715ec5066c6ac7868e25ad9234dc64723e61e) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/MockResourceManagerFacade.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/uam/UnmanagedApplicationManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/uam/TestUnmanagedApplicationManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/FederationInterceptor.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/AMRMClientRelayer.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/TestAMRMProxyService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/metrics/TestAMRMClientRelayerMetrics.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/uam/UnmanagedAMPoolManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/TestAMRMClientRelayer.java > [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client > --- > > Key: YARN-8893 > URL: https://issues.apache.org/jira/browse/YARN-8893 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-8893.v1.patch, YARN-8893.v2.patch > > > Fix thread leak in AMRMClientRelayer and UAM client used by > FederationInterceptor, when destroying the interceptor pipeline in AMRMProxy. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8893) [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client
[ https://issues.apache.org/jira/browse/YARN-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673760#comment-16673760 ] Giovanni Matteo Fumarola commented on YARN-8893: Perfect. Committed to trunk. Thanks [~botong] for the patch. > [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client > --- > > Key: YARN-8893 > URL: https://issues.apache.org/jira/browse/YARN-8893 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-8893.v1.patch, YARN-8893.v2.patch > > > Fix thread leak in AMRMClientRelayer and UAM client used by > FederationInterceptor, when destroying the interceptor pipeline in AMRMProxy. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8893) [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client
[ https://issues.apache.org/jira/browse/YARN-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8893: --- Fix Version/s: 3.3.0 > [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client > --- > > Key: YARN-8893 > URL: https://issues.apache.org/jira/browse/YARN-8893 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-8893.v1.patch, YARN-8893.v2.patch > > > Fix thread leak in AMRMClientRelayer and UAM client used by > FederationInterceptor, when destroying the interceptor pipeline in AMRMProxy. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8962) Add ability to use interactive shell with normal yarn container
[ https://issues.apache.org/jira/browse/YARN-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673751#comment-16673751 ] Eric Yang commented on YARN-8962: - Patch 002 added test case. > Add ability to use interactive shell with normal yarn container > --- > > Key: YARN-8962 > URL: https://issues.apache.org/jira/browse/YARN-8962 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Priority: Major > Attachments: YARN-8962.001.patch, YARN-8962.002.patch > > > This task is focusing on extending interactive shell capability to yarn > container without docker. This will improve some aspect of debugging > mapreduce or spark applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8962) Add ability to use interactive shell with normal yarn container
[ https://issues.apache.org/jira/browse/YARN-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8962: Attachment: YARN-8962.002.patch > Add ability to use interactive shell with normal yarn container > --- > > Key: YARN-8962 > URL: https://issues.apache.org/jira/browse/YARN-8962 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Priority: Major > Attachments: YARN-8962.001.patch, YARN-8962.002.patch > > > This task is focusing on extending interactive shell capability to yarn > container without docker. This will improve some aspect of debugging > mapreduce or spark applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8893) [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client
[ https://issues.apache.org/jira/browse/YARN-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673748#comment-16673748 ] Botong Huang commented on YARN-8893: This also performs a proper shutdown and closes the connection in the forceKill case, so that after we force-kill the UAM, our local proxy connection inside the AMRMClientRelayer is not left open. > [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client > --- > > Key: YARN-8893 > URL: https://issues.apache.org/jira/browse/YARN-8893 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8893.v1.patch, YARN-8893.v2.patch > > > Fix thread leak in AMRMClientRelayer and UAM client used by > FederationInterceptor, when destroying the interceptor pipeline in AMRMProxy. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
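A stand-in sketch of that ordering, using stub classes rather than the real FederationInterceptor/UnmanagedApplicationManager code, to show why the relayer shutdown belongs in the forceKill path:

{code:java}
/** Sketch: after force-killing a UAM, also shut down the local relayer so its RM connection is released. */
public class ForceKillShutdownSketch {

  static class AMRMClientRelayerStub {
    private boolean open = true;
    void shutdown() {
      open = false;
      System.out.println("relayer connection closed, heartbeat resources released");
    }
    boolean isOpen() { return open; }
  }

  static class UamStub {
    final AMRMClientRelayerStub rmProxyRelayer = new AMRMClientRelayerStub();

    void forceKillApplication() {
      System.out.println("killing UAM in secondary sub-cluster");
      // Without this call the relayer (and its open RM proxy connection) would leak after the kill.
      rmProxyRelayer.shutdown();
    }
  }

  public static void main(String[] args) {
    UamStub uam = new UamStub();
    uam.forceKillApplication();
    System.out.println("relayer still open = " + uam.rmProxyRelayer.isOpen());
  }
}
{code}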
[jira] [Commented] (YARN-8893) [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client
[ https://issues.apache.org/jira/browse/YARN-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673721#comment-16673721 ] Giovanni Matteo Fumarola commented on YARN-8893: Thanks [~botong]. Why do you need to call this.rmProxyRelayer.shutdown(); from {{forceKillApplication}}? > [AMRMProxy] Fix thread leak in AMRMClientRelayer and UAM client > --- > > Key: YARN-8893 > URL: https://issues.apache.org/jira/browse/YARN-8893 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8893.v1.patch, YARN-8893.v2.patch > > > Fix thread leak in AMRMClientRelayer and UAM client used by > FederationInterceptor, when destroying the interceptor pipeline in AMRMProxy. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage
[ https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673638#comment-16673638 ] Haibo Chen commented on YARN-8932: -- The shaded client issue is unrelated to this patch. It failed before the patch was applied. Checking in this patch to YARN-1011 branch. Will open a Jira if the shaded client issue comes up again. Thanks [~rkanter] for the review! > ResourceUtilization cpu is misused in oversubscription as a percentage > -- > > Key: YARN-8932 > URL: https://issues.apache.org/jira/browse/YARN-8932 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8932-YARN-1011.00.patch, > YARN-8932-YARN-1011.01.patch, YARN-8932-YARN-1011.02.patch > > > The ResourceUtilization javadoc mistakenly documents the cpu as a percentage > represented by a float number in [0, 1.0f], however it is used as the # of > vcores used in reality. > See javadoc and discussion in YARN-8911.
> /**
>  * Get CPU utilization.
>  *
>  * @return CPU utilization normalized to 1 CPU
>  */
> @Public
> @Unstable
> public abstract float getCPU();
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
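To make the mismatch concrete, a small sketch contrasting the two readings of the same cpu field (the numbers are made up):

{code:java}
/** Sketch: the same cpu field read as "vcores used" vs. "fraction of one CPU", the mismatch described above. */
public class CpuUtilizationSketch {
  public static void main(String[] args) {
    float cpuFieldValue = 6.0f; // what the code actually stores today: number of vcores used
    int nodeVcores = 8;

    // Read per the current javadoc ("normalized to 1 CPU"): 6.0 would mean 600% of one CPU.
    float asDocumented = cpuFieldValue;              // javadoc expects a value in [0, 1.0f]
    // Read as vcores: an explicit normalization is needed whenever a percentage is wanted.
    float normalized = cpuFieldValue / nodeVcores;   // 0.75 of the node's CPU capacity

    System.out.println("as documented (fraction of 1 CPU): " + asDocumented);
    System.out.println("as actually used (vcores), normalized to the node: " + normalized);
  }
}
{code}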
[jira] [Commented] (YARN-8776) Container Executor change to create stdin/stdout pipeline
[ https://issues.apache.org/jira/browse/YARN-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673632#comment-16673632 ] Eric Yang commented on YARN-8776: - The failed test case is tracked by YARN-8672. [~billie.rinaldi] can you help with the review? > Container Executor change to create stdin/stdout pipeline > - > > Key: YARN-8776 > URL: https://issues.apache.org/jira/browse/YARN-8776 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: Docker > Attachments: YARN-8776.001.patch, YARN-8776.002.patch, > YARN-8776.003.patch, YARN-8776.004.patch, YARN-8776.005.patch, > YARN-8776.006.patch > > > The pipeline is built to connect the stdin/stdout channel from WebSocket > servlet through container-executor to docker executor. So when the WebSocket > servlet is started, we need to invoke container-executor “dockerExec” method > (which will be implemented) to create a new docker executor and use “docker > exec -it $ContainerId” command which executes an interactive bash shell on > the container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
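For reference, a sketch of the interactive command described above, reduced to argument building; the real plumbing lives in the native container-executor and is not reproduced here:

{code:java}
import java.util.Arrays;
import java.util.List;

/** Sketch: building the interactive exec command described in the issue. */
public class DockerExecCommandSketch {

  static List<String> dockerExecCommand(String containerId) {
    // "docker exec -it <containerId> bash" runs an interactive bash shell inside the running container.
    return Arrays.asList("docker", "exec", "-it", containerId, "bash");
  }

  public static void main(String[] args) {
    System.out.println(String.join(" ", dockerExecCommand("container_e01_0001_01_000002")));
  }
}
{code}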
[jira] [Commented] (YARN-8962) Add ability to use interactive shell with normal yarn container
[ https://issues.apache.org/jira/browse/YARN-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673622#comment-16673622 ] Eric Yang commented on YARN-8962: - The first patch renames run_docker_with_pty to exec_container to support both docker exec -it and plain YARN containers. The YARN container case uses a restricted shell to prevent the user from going out of bounds. DefaultLinuxContainerRuntime implements writing the .cmd file that the C version of container-executor reads to run a shell for the YARN container. > Add ability to use interactive shell with normal yarn container > --- > > Key: YARN-8962 > URL: https://issues.apache.org/jira/browse/YARN-8962 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Priority: Major > Attachments: YARN-8962.001.patch > > > This task is focusing on extending interactive shell capability to yarn > container without docker. This will improve some aspects of debugging > mapreduce or spark applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
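A rough sketch of the DefaultLinuxContainerRuntime side described above, writing a command file for the native container-executor to consume; the section header and key names are illustrative only, since the actual .cmd format is defined by the patch:

{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Sketch: writing a simple key=value .cmd file for the native container-executor to read. */
public class ExecContainerCmdFileSketch {

  static Path writeCmdFile(Path dir, String containerId, String shell) throws IOException {
    // Illustrative layout only; the real contents are whatever container-executor's parser expects.
    String contents = String.join("\n", Arrays.asList(
        "[exec-container]",
        "container-id=" + containerId,
        "shell=" + shell,
        ""));
    Path cmdFile = dir.resolve(containerId + ".cmd");
    return Files.write(cmdFile, contents.getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("nm-cmd");
    Path file = writeCmdFile(dir, "container_e01_0001_01_000002", "bash");
    System.out.println(Files.readAllLines(file));
  }
}
{code}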
[jira] [Updated] (YARN-8962) Add ability to use interactive shell with normal yarn container
[ https://issues.apache.org/jira/browse/YARN-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8962: Attachment: YARN-8962.001.patch > Add ability to use interactive shell with normal yarn container > --- > > Key: YARN-8962 > URL: https://issues.apache.org/jira/browse/YARN-8962 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Priority: Major > Attachments: YARN-8962.001.patch > > > This task is focusing on extending interactive shell capability to yarn > container without docker. This will improve some aspect of debugging > mapreduce or spark applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673617#comment-16673617 ]

Hadoop QA commented on YARN-7225:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 16m 40s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || branch-2.8 Compile Tests ||
| +1 | mvninstall | 6m 49s | branch-2.8 passed |
| +1 | compile | 0m 43s | branch-2.8 passed |
| +1 | checkstyle | 0m 18s | branch-2.8 passed |
| +1 | mvnsite | 0m 43s | branch-2.8 passed |
| +1 | javadoc | 0m 28s | branch-2.8 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 33s | the patch passed |
| +1 | compile | 0m 32s | the patch passed |
| +1 | javac | 0m 32s | the patch passed |
| +1 | checkstyle | 0m 16s | the patch passed |
| +1 | mvnsite | 0m 38s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 58s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 96m 31s | hadoop-yarn-server-resourcemanager in the patch failed. |
| -1 | asflicense | 0m 18s | The patch generated 2 ASF License warnings. |
| | | 126m 24s | |

|| Reason || Tests ||
| Unreaped Processes | hadoop-yarn-server-resourcemanager:1 |
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.security.TestClientToAMTokens |
| | hadoop.yarn.server.resourcemanager.TestRMRestart |
| | hadoop.yarn.server.resourcemanager.TestRMHA |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesHttpStaticUserPermissions |
| | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerQueueACLs |
| | hadoop.yarn.server.resourcemanager.scheduler.TestSchedulerUtils |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| | hadoop.yarn.server.resourcemanager.resourcetracker.TestNMReconnect |
| | hadoop.yarn.server.resourcemanager.resourcetracker.TestNMExpiry |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerQueueACLs |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication |
| | hadoop.yarn.server.resourcemanager.TestApplicationACLs |
| | hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
| | hadoop.yarn.server.resourcemanager.TestClientRMService |
| Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ae3769f |
| JIRA Issue | YARN-7225 |
| JIRA Patch URL |
[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673584#comment-16673584 ] Jonathan Hung commented on YARN-7225: - Thanks, committed to branch-2.8. > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1 >Reporter: Jonathan Hung >Assignee: Eric Payne >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.1.2, 3.3.0, 2.8.6, 3.2.1, 2.9.3 > > Attachments: YARN-7225.001.patch, YARN-7225.002.patch, > YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, > YARN-7225.branch-2.8.001.patch, YARN-7225.branch-2.8.002.patch, > YARN-7225.branch-2.8.003.addendum.patch > > > Right now RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673579#comment-16673579 ] Eric Payne commented on YARN-7225: -- Yes, [~jhung]. Very good catch. Sorry about that. +1 to the update. > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1 >Reporter: Jonathan Hung >Assignee: Eric Payne >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.1.2, 3.3.0, 2.8.6, 3.2.1, 2.9.3 > > Attachments: YARN-7225.001.patch, YARN-7225.002.patch, > YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, > YARN-7225.branch-2.8.001.patch, YARN-7225.branch-2.8.002.patch, > YARN-7225.branch-2.8.003.addendum.patch > > > Right now RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8867) Retrieve the status of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8867: Attachment: YARN-8867.002.patch > Retrieve the status of resource localization > > > Key: YARN-8867 > URL: https://issues.apache.org/jira/browse/YARN-8867 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8867.001.patch, YARN-8867.002.patch, > YARN-8867.wip.patch > > > Refer YARN-3854. > Currently NM does not have an API to retrieve the status of localization. > Unless the client can know when the localization of a resource is complete > irrespective of the type of the resource, it cannot take any appropriate > action. > We need an API in {{ContainerManagementProtocol}} to retrieve the status on > the localization. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
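A sketch of the kind of per-resource status the description asks for; the method name and fields are placeholders, not the signature in the attached patch:

{code:java}
import java.util.Arrays;
import java.util.List;

/** Sketch: a per-resource localization status a client could poll from the NM. */
public class LocalizationStatusSketch {

  enum State { PENDING, COMPLETED, FAILED }

  static class LocalizationStatus {
    final String resourceKey;
    final State state;
    final String diagnostics;

    LocalizationStatus(String resourceKey, State state, String diagnostics) {
      this.resourceKey = resourceKey;
      this.state = state;
      this.diagnostics = diagnostics;
    }

    @Override
    public String toString() {
      return resourceKey + "=" + state + (diagnostics.isEmpty() ? "" : " (" + diagnostics + ")");
    }
  }

  // Placeholder for a ContainerManagementProtocol-style call keyed by container id.
  static List<LocalizationStatus> getLocalizationStatuses(String containerId) {
    return Arrays.asList(
        new LocalizationStatus("hdfs:/apps/app.tar.gz", State.COMPLETED, ""),
        new LocalizationStatus("hdfs:/apps/conf.xml", State.PENDING, ""));
  }

  public static void main(String[] args) {
    getLocalizationStatuses("container_e01_0001_01_000002").forEach(System.out::println);
  }
}
{code}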
[jira] [Commented] (YARN-8897) LoadBasedRouterPolicy throws "NPE" in case of sub cluster unavailability
[ https://issues.apache.org/jira/browse/YARN-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673533#comment-16673533 ] Hudson commented on YARN-8897: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15350 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15350/]) YARN-8897. LoadBasedRouterPolicy throws NPE in case of sub cluster (gifuma: rev aed836efbff775d95899d05ff947f1048df8cf19) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/policies/router/LoadBasedRouterPolicy.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/federation/policies/router/TestLoadBasedRouterPolicy.java > LoadBasedRouterPolicy throws "NPE" in case of sub cluster unavailability > - > > Key: YARN-8897 > URL: https://issues.apache.org/jira/browse/YARN-8897 > Project: Hadoop YARN > Issue Type: Bug > Components: federation, router >Reporter: Akshay Agarwal >Assignee: Bilwa S T >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-8897-001.patch, YARN-8897-002.patch, > YARN-8897-003.patch > > > If no sub clusters are available for "*Load Based Router Policy*" with > *cluster weight* as *1* in Router Based Federation Setup , throwing > "*NullPointerException*". > > *Exception Details:* > {code:java} > java.lang.NullPointerException: java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.federation.policies.router.LoadBasedRouterPolicy.getHomeSubcluster(LoadBasedRouterPolicy.java:99) > at > org.apache.hadoop.yarn.server.federation.policies.RouterPolicyFacade.getHomeSubcluster(RouterPolicyFacade.java:204) > at > org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:362) > at > org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122) > at > 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:297) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) > at com.sun.proxy.$Proxy15.submitApplication(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:288) > at > org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:300) >
[jira] [Updated] (YARN-8897) LoadBasedRouterPolicy throws "NPE" in case of sub cluster unavailability
[ https://issues.apache.org/jira/browse/YARN-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8897: --- Fix Version/s: 3.3.0 > LoadBasedRouterPolicy throws "NPE" in case of sub cluster unavailability > - > > Key: YARN-8897 > URL: https://issues.apache.org/jira/browse/YARN-8897 > Project: Hadoop YARN > Issue Type: Bug > Components: federation, router >Reporter: Akshay Agarwal >Assignee: Bilwa S T >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-8897-001.patch, YARN-8897-002.patch, > YARN-8897-003.patch > > > If no sub clusters are available for "*Load Based Router Policy*" with > *cluster weight* as *1* in Router Based Federation Setup , throwing > "*NullPointerException*". > > *Exception Details:* > {code:java} > java.lang.NullPointerException: java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.federation.policies.router.LoadBasedRouterPolicy.getHomeSubcluster(LoadBasedRouterPolicy.java:99) > at > org.apache.hadoop.yarn.server.federation.policies.RouterPolicyFacade.getHomeSubcluster(RouterPolicyFacade.java:204) > at > org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:362) > at > org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:297) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > at > 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) > at com.sun.proxy.$Proxy15.submitApplication(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:288) > at > org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:300) > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:331) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567) > at
[jira] [Commented] (YARN-8954) Reservations list field in ReservationListInfo is not accessible
[ https://issues.apache.org/jira/browse/YARN-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673518#comment-16673518 ] Hudson commented on YARN-8954: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15349 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15349/]) YARN-8954. Reservations list field in ReservationListInfo is not (gifuma: rev babc946d4017e9c385d19a8e6f7f1ecd5080d619) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ReservationListInfo.java > Reservations list field in ReservationListInfo is not accessible > > > Key: YARN-8954 > URL: https://issues.apache.org/jira/browse/YARN-8954 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, restapi >Reporter: Oleksandr Shevchenko >Assignee: Oleksandr Shevchenko >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-8954.001.patch > > > We need to add the getter for Reservations list field since the field cannot > be accessible after the unmarshal. The similar problem described in YARN-2280. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8897) LoadBasedRouterPolicy throws "NPE" in case of sub cluster unavailability
[ https://issues.apache.org/jira/browse/YARN-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673514#comment-16673514 ] Giovanni Matteo Fumarola commented on YARN-8897: +1 on [^YARN-8897-003.patch] . Committing into trunk. Thanks [~BilwaST] for the patch, and [~bibinchundatt] for the comments. > LoadBasedRouterPolicy throws "NPE" in case of sub cluster unavailability > - > > Key: YARN-8897 > URL: https://issues.apache.org/jira/browse/YARN-8897 > Project: Hadoop YARN > Issue Type: Bug > Components: federation, router >Reporter: Akshay Agarwal >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-8897-001.patch, YARN-8897-002.patch, > YARN-8897-003.patch > > > If no sub clusters are available for "*Load Based Router Policy*" with > *cluster weight* as *1* in Router Based Federation Setup , throwing > "*NullPointerException*". > > *Exception Details:* > {code:java} > java.lang.NullPointerException: java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.federation.policies.router.LoadBasedRouterPolicy.getHomeSubcluster(LoadBasedRouterPolicy.java:99) > at > org.apache.hadoop.yarn.server.federation.policies.RouterPolicyFacade.getHomeSubcluster(RouterPolicyFacade.java:204) > at > org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:362) > at > org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:297) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > at > 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) > at com.sun.proxy.$Proxy15.submitApplication(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:288) > at > org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:300) > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:331) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at >
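A simplified sketch of the guard implied by the stack trace above: fail fast with a policy exception when no sub-cluster is active instead of dereferencing a null result; the real fix lives in LoadBasedRouterPolicy#getHomeSubcluster and its test, not in this snippet:

{code:java}
import java.util.Collections;
import java.util.Map;

/** Sketch: throw a clear policy exception instead of an NPE when no sub-cluster is active. */
public class LoadBasedPolicySketch {

  static class FederationPolicyException extends Exception {
    FederationPolicyException(String msg) { super(msg); }
  }

  static String getHomeSubcluster(Map<String, Long> activeSubclusterLoad)
      throws FederationPolicyException {
    if (activeSubclusterLoad == null || activeSubclusterLoad.isEmpty()) {
      // Previously this fell through and dereferenced a null "best" sub-cluster, producing the NPE above.
      throw new FederationPolicyException("No active sub-cluster available to route the application to");
    }
    // Pick the least-loaded sub-cluster (illustrative load metric).
    return activeSubclusterLoad.entrySet().stream()
        .min(Map.Entry.comparingByValue())
        .get().getKey();
  }

  public static void main(String[] args) {
    try {
      getHomeSubcluster(Collections.emptyMap());
    } catch (FederationPolicyException e) {
      System.out.println("expected: " + e.getMessage());
    }
  }
}
{code}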
[jira] [Commented] (YARN-8954) Reservations list field in ReservationListInfo is not accessible
[ https://issues.apache.org/jira/browse/YARN-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673504#comment-16673504 ] Giovanni Matteo Fumarola commented on YARN-8954: Thanks [~oshevchenko] for the patch. The get method was missing in YARN-4420. No tests needed. Committing into trunk. > Reservations list field in ReservationListInfo is not accessible > > > Key: YARN-8954 > URL: https://issues.apache.org/jira/browse/YARN-8954 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, restapi >Reporter: Oleksandr Shevchenko >Assignee: Oleksandr Shevchenko >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-8954.001.patch > > > We need to add the getter for Reservations list field since the field cannot > be accessible after the unmarshal. The similar problem described in YARN-2280. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-8954) Reservations list field in ReservationListInfo is not accessible
[ https://issues.apache.org/jira/browse/YARN-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola reassigned YARN-8954: -- Assignee: Oleksandr Shevchenko > Reservations list field in ReservationListInfo is not accessible > > > Key: YARN-8954 > URL: https://issues.apache.org/jira/browse/YARN-8954 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, restapi >Reporter: Oleksandr Shevchenko >Assignee: Oleksandr Shevchenko >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-8954.001.patch > > > We need to add the getter for Reservations list field since the field cannot > be accessible after the unmarshal. The similar problem described in YARN-2280. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8954) Reservations list field in ReservationListInfo is not accessible
[ https://issues.apache.org/jira/browse/YARN-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8954: --- Fix Version/s: 3.3.0 > Reservations list field in ReservationListInfo is not accessible > > > Key: YARN-8954 > URL: https://issues.apache.org/jira/browse/YARN-8954 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, restapi >Reporter: Oleksandr Shevchenko >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-8954.001.patch > > > We need to add the getter for Reservations list field since the field cannot > be accessible after the unmarshal. The similar problem described in YARN-2280. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Reopened] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung reopened YARN-7225: - > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1 >Reporter: Jonathan Hung >Assignee: Eric Payne >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.1.2, 3.3.0, 2.8.6, 3.2.1, 2.9.3 > > Attachments: YARN-7225.001.patch, YARN-7225.002.patch, > YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, > YARN-7225.branch-2.8.001.patch, YARN-7225.branch-2.8.002.patch, > YARN-7225.branch-2.8.003.addendum.patch > > > Right now RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673495#comment-16673495 ] Jonathan Hung commented on YARN-7225: - Hey [~eepayne], in the process of backporting the branch-2.8 patch, I inadvertently found another issue: it seems that if we pass queue/partition, we no longer print the IP. I attached a one-line addendum patch, mind giving it a look? Thanks! > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1 >Reporter: Jonathan Hung >Assignee: Eric Payne >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.1.2, 3.3.0, 2.8.6, 3.2.1, 2.9.3 > > Attachments: YARN-7225.001.patch, YARN-7225.002.patch, > YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, > YARN-7225.branch-2.8.001.patch, YARN-7225.branch-2.8.002.patch, > YARN-7225.branch-2.8.003.addendum.patch > > > Right now RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
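A small illustrative sketch of the class of bug being described; the method and field names below are hypothetical and are not the real RMAuditLogger code. When the optional queue/partition arguments take a separate branch, the IP append on the shared path can be skipped by mistake:
{code:java}
// Hypothetical audit-line builder, for illustration only.
static String buildAuditLine(String user, String operation, String ip,
    String queue, String partition) {
  StringBuilder sb = new StringBuilder();
  sb.append("USER=").append(user).append("\tOPERATION=").append(operation);
  if (queue != null) {
    sb.append("\tQUEUE=").append(queue);
    if (partition != null) {
      sb.append("\tPARTITION=").append(partition);
    }
    // Bug pattern: returning here skips the shared tail below, so the IP is
    // never logged whenever queue/partition are passed.
    return sb.toString();
  }
  sb.append("\tIP=").append(ip);
  return sb.toString();
}
{code}
The one-line style of fix is to append the IP on the common path, before the optional fields, so it shows up in both cases.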
[jira] [Updated] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated YARN-7225: Attachment: YARN-7225.branch-2.8.003.addendum.patch > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.9.1, 2.8.4, 3.0.2, 3.1.1 >Reporter: Jonathan Hung >Assignee: Eric Payne >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.1.2, 3.3.0, 2.8.6, 3.2.1, 2.9.3 > > Attachments: YARN-7225.001.patch, YARN-7225.002.patch, > YARN-7225.003.patch, YARN-7225.004.patch, YARN-7225.005.patch, > YARN-7225.branch-2.8.001.patch, YARN-7225.branch-2.8.002.patch, > YARN-7225.branch-2.8.003.addendum.patch > > > Right now RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8932) ResourceUtilization cpu is misused in oversubscription as a percentage
[ https://issues.apache.org/jira/browse/YARN-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673475#comment-16673475 ] Hadoop QA commented on YARN-8932: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 37s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} YARN-1011 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 13s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 10s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 39s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 2s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s{color} | {color:green} YARN-1011 passed {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 13m 25s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} YARN-1011 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 31s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 31s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server generated 16 new + 86 unchanged - 0 fixed = 102 total (was 86) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 58s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 1 new + 203 unchanged - 0 fixed = 204 total (was 203) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 12m 4s{color} | {color:red} patch has errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 22s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 23s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 86m 30s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8932 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946290/YARN-8932-YARN-1011.02.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4050669b7330 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | YARN-1011 / f3d08c7 | | maven | version:
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673366#comment-16673366 ] Hadoop QA commented on YARN-8233: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 43s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 7s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 35s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 5 new + 100 unchanged - 0 fixed = 105 total (was 100) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 48s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}116m 23s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}184m 55s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSchedulingRequestUpdate | | | hadoop.yarn.server.resourcemanager.TestApplicationMasterServiceCapacity | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSurgicalPreemption | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8233 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946674/YARN-8233.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4bdb90b6738e 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d16d5f7 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle |
[jira] [Commented] (YARN-8898) Fix FederationInterceptor#allocate to set application priority in allocateResponse
[ https://issues.apache.org/jira/browse/YARN-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673048#comment-16673048 ] Bibin A Chundatt commented on YARN-8898: Thank you [~botong] for the explanation. Will take a look at YARN-8933. > Fix FederationInterceptor#allocate to set application priority in > allocateResponse > -- > > Key: YARN-8898 > URL: https://issues.apache.org/jira/browse/YARN-8898 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > > In case of FederationInterceptor#mergeAllocateResponses skips > application_priority in response returned -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
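A rough sketch of the kind of change being discussed, assuming AllocateResponse exposes get/setApplicationPriority; the merge method shape below is illustrative and is not the committed patch:
{code:java}
// Illustrative merge fragment; not the actual FederationInterceptor code.
private AllocateResponse mergeAllocateResponses(AllocateResponse homeResponse,
    List<AllocateResponse> secondaryResponses) {
  AllocateResponse merged = Records.newRecord(AllocateResponse.class);
  // ... merge allocated containers, completed containers, updated nodes, ...

  // Carry the application priority from the home sub-cluster response over
  // to the merged response so the AM still receives it.
  if (homeResponse != null && homeResponse.getApplicationPriority() != null) {
    merged.setApplicationPriority(homeResponse.getApplicationPriority());
  }
  return merged;
}
{code}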
[jira] [Commented] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673037#comment-16673037 ] Hadoop QA commented on YARN-8948: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 9 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 33s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 870 unchanged - 1 fixed = 870 total (was 871) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 39s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 28s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}162m 36s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueManagementDynamicEditPolicy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8948 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946649/YARN-8948.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 67af53276c3d 4.4.0-134-generic #160~14.04.1-Ubuntu SMP Fri Aug 17 11:07:07 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d16d5f7 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/22406/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22406/testReport/ | | Max. process+thread count
[jira] [Commented] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673035#comment-16673035 ] Bibin A Chundatt commented on YARN-8948: [~sunilg] Updating the severity . Not required to be a blocker. > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch, > YARN-8948.003.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-8948: --- Priority: Major (was: Critical) > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Major > Attachments: YARN-8948.001.patch, YARN-8948.002.patch, > YARN-8948.003.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-8948: --- Target Version/s: (was: 3.2.0) > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Major > Attachments: YARN-8948.001.patch, YARN-8948.002.patch, > YARN-8948.003.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673031#comment-16673031 ] Tao Yang commented on YARN-8233: Attached v2 patch for review. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8233.001.patch, YARN-8233.002.patch > > > Recently we saw a NPE problem in CapacityScheduler#tryCommit when try to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from > an allocate/reserve proposal. But got null allocatedOrReservedContainer and > thrown NPE. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}} and > allocatedOrReservedContainer is possibly null in async-scheduling process > when node was lost or application was finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add null check for allocateOrReserveContainer before create > allocate/reserve proposals. Besides the allocation process has increase > unconfirmed resource of app when creating an allocate assignment, so if this > check is null, we should decrease the unconfirmed resource of live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
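A sketch of the null check the description proposes, based on the quoted createResourceCommitRequest snippet; type parameters are elided and the unconfirmed-resource handling is indicated only as a comment, so this is not the actual patch:
{code:java}
// Allocated something
List allocations =
    csAssignment.getAssignmentInformation().getAllocationDetails();
if (!allocations.isEmpty()) {
  RMContainer rmContainer = allocations.get(0).rmContainer;
  SchedulerContainer allocatedContainer =
      getSchedulerContainer(rmContainer, true);
  if (allocatedContainer == null) {
    // Node was lost or the application finished between scheduling and
    // commit: skip the proposal, and decrease the unconfirmed resource that
    // was charged to the application when the assignment was created.
    // (The exact decrement call is omitted; it is not quoted in the issue.)
  } else {
    allocated = new ContainerAllocationProposal<>(allocatedContainer,
        getSchedulerContainersToRelease(csAssignment),
        getSchedulerContainer(
            csAssignment.getFulfilledReservedContainer(), false),
        csAssignment.getType(), csAssignment.getRequestLocalityType(),
        csAssignment.getSchedulingMode() != null
            ? csAssignment.getSchedulingMode()
            : SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY,
        csAssignment.getResource());
  }
}
{code}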
[jira] [Updated] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8233: --- Attachment: YARN-8233.002.patch > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8233.001.patch, YARN-8233.002.patch > > > Recently we saw a NPE problem in CapacityScheduler#tryCommit when try to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from > an allocate/reserve proposal. But got null allocatedOrReservedContainer and > thrown NPE. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}} and > allocatedOrReservedContainer is possibly null in async-scheduling process > when node was lost or application was finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add null check for allocateOrReserveContainer before create > allocate/reserve proposals. Besides the allocation process has increase > unconfirmed resource of app when creating an allocate assignment, so if this > check is null, we should decrease the unconfirmed resource of live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673027#comment-16673027 ] Wilfred Spiegelenburg commented on YARN-8948: - [~bibinchundatt] Thank you for picking this up. I had just started looking at this to move the FairScheduler over to the same interface. I have opened a new jira for that already and will start with that as soon as this jira is finished. > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch, > YARN-8948.003.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8967) Change FairScheduler to use PlacementRule interface
Wilfred Spiegelenburg created YARN-8967: --- Summary: Change FairScheduler to use PlacementRule interface Key: YARN-8967 URL: https://issues.apache.org/jira/browse/YARN-8967 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler Reporter: Wilfred Spiegelenburg Assignee: Wilfred Spiegelenburg The PlacementRule interface was introduced to be used by all schedulers as per YARN-3635. The CapacityScheduler is using it but the FairScheduler is not and is using its own rule definition. YARN-8948 cleans up the implementation and removes the CS references which should allow this change to go through. This would be the first step in using one placement rule engine for both schedulers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672961#comment-16672961 ] Sunil Govindan commented on YARN-8948: -- On this note, this is not an interface-level compatibility problem. Ideally we should have done this for all schedulers; however, it does not affect the interface, so I suggest marking this as Major, getting it into all the respective versions, and letting the next release pick it up. If you feel this is really a blocker for 3.2, kindly let me know. > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch, > YARN-8948.003.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672922#comment-16672922 ] Bibin A Chundatt commented on YARN-8948: [~sunilg] {quote} could you please explain how this is breaking wire compatibility {quote} It is not breaking wire compatibility, since nothing is transmitted between processes. In the jira I mentioned that the *interface is supposed to be for all schedulers*; it shouldn't be for CapacityScheduler alone. Otherwise the next version might have to change it again. {quote} If this is not breaking any compatibility, I would like to downgrade the severity to Minor so that the 3.2 release can go on. {quote} https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html Since we haven't marked {{PlacementRule}} as public, interface-level compatibility might not be required. YARN-8016 refined the PlacementRule and made it usable. Another issue: currently UserGroup*Rule is always added, and since the order is not maintained for {{distingushRuleSet}}, a custom configuration will not work as expected. Consider it a functional issue. Let's take a call based on the responses from [~leftnoteasy] and [~suma.shivaprasad] too. > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch, > YARN-8948.003.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
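To make the ordering point concrete, a sketch of how the configured order could be preserved; it mirrors the snippet quoted in the issue but is not the committed fix, and the use of an insertion-ordered set is just one way to keep user-defined rules in configuration order:
{code:java}
// Initialize placement rules, preserving the order the user configured them in.
Collection<String> placementRuleStrs = conf.getStringCollection(
    YarnConfiguration.QUEUE_PLACEMENT_RULES);

// A LinkedHashSet keeps insertion (i.e. configuration) order while still
// de-duplicating rule names; a plain HashSet would lose the ordering.
Set<String> distingushRuleSet = new LinkedHashSet<>(placementRuleStrs);

// Append the user-group mapping rule only if it was not configured explicitly,
// so it is evaluated after the user-defined rules rather than ahead of them.
distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE);
{code}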
[jira] [Commented] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672883#comment-16672883 ] Sunil Govindan commented on YARN-8948: -- [~bibinchundatt] A few questions: # Could you please explain how this is breaking wire compatibility between an old Hadoop version and a new one? # These jiras are already in hadoop-3.1 and other releases? Hence, is this already broken in those versions? # Are there any workarounds if you upgrade to hadoop-3.2 and then use old clients/apps, by which we will not have any breakage? If this is not breaking any compatibility, I would like to downgrade the severity to Minor so that the 3.2 release can go on. cc [~leftnoteasy] [~suma.shivaprasad] > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch, > YARN-8948.003.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672875#comment-16672875 ] Tao Yang commented on YARN-8958: {quote} I am not sure about that... The cached usage is used by the FairComparator to determine the ordering of schedulable entities; we need to make sure that it is updated correctly. {quote} This update only refreshes the entity's own cached pending/used values to its current pending/used values, which is why it cannot be updated incorrectly. {quote} so we don't need to change the #reorderScheduleEntities logic, doesn't that make sense? {quote} I think this change solves most cases but not all of them. In a race condition, the schedulable entity may still exist when it is put into entitiesToReorder but be removed by another thread immediately afterwards. For example: (1) Thread1 -> AbstractComparatorOrderingPolicy#removeSchedulableEntity has just finished its synchronized block but has not yet removed this schedulable entity from schedulableEntities (2) Thread2 -> AbstractComparatorOrderingPolicy#entityRequiresReordering finishes its synchronized block and puts the schedulable entity back into entitiesToReorder (3) Thread1 -> AbstractComparatorOrderingPolicy#removeSchedulableEntity removes this schedulable entity from schedulableEntities Thoughts? > Schedulable entities leak in fair ordering policy when recovering containers > between remove app attempt and remove app > -- > > Key: YARN-8958 > URL: https://issues.apache.org/jira/browse/YARN-8958 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8958.001.patch, YARN-8958.002.patch > > > We found a NPE in ClientRMService#getApplications when querying apps with > specified queue. The cause is that there is one app which can't be found by > calling RMContextImpl#getRMApps(is finished and swapped out of memory) but > still can be queried from fair ordering policy. > To reproduce schedulable entities leak in fair ordering policy: > (1) create app1 and launch container1 on node1 > (2) restart RM > (3) remove app1 attempt, app1 is removed from the schedulable entities. > (4) recover container1 after node1 reconnected to RM, then the state of > contianer1 is changed to COMPLETED, app1 is bring back to entitiesToReorder > after container released, then app1 will be added back into schedulable > entities after calling FairOrderingPolicy#getAssignmentIterator by scheduler. > (5) remove app1 > To solve this problem, we should make sure schedulableEntities can only be > affected by add or remove app attempt, new entity should not be added into > schedulableEntities by reordering process. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > schedulableEntities.add(schedulableEntity); > } > {code} > Related codes above can be improved as follow to make sure only existent > entity can be re-add into schedulableEntities. 
> {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > boolean exists = schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > if (exists) { > schedulableEntities.add(schedulableEntity); > } else { > LOG.info("Skip reordering non-existent schedulable entity: " > + schedulableEntity.getId()); > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672853#comment-16672853 ] Weiwei Yang commented on YARN-8233: --- Thanks [~Tao Yang], sounds good. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8233.001.patch > > > Recently we saw a NPE problem in CapacityScheduler#tryCommit when try to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from > an allocate/reserve proposal. But got null allocatedOrReservedContainer and > thrown NPE. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}} and > allocatedOrReservedContainer is possibly null in async-scheduling process > when node was lost or application was finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add null check for allocateOrReserveContainer before create > allocate/reserve proposals. Besides the allocation process has increase > unconfirmed resource of app when creating an allocate assignment, so if this > check is null, we should decrease the unconfirmed resource of live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672835#comment-16672835 ] Tao Yang commented on YARN-8233: Hi, [~surmountian], [~cheersyang] {quote}I also met this issue in Hadoop 2.9, would you also patch for Hadoop 2.9? Maybe some unit test cases could be added to cover the logic. {quote} Yes, I would like to update this patch with a test case and then backport it to 2.9. {quote}does this issue only happen when async-scheduling is enabled? {quote} I think so. The sync-scheduling process holds the write lock of CapacityScheduler, which makes sure the return value of CapacityScheduler#getSchedulerContainer is not null. {quote}Regarding to the patch, CapacityScheduler#getSchedulerContainer was being called multiple places, I think we need to add null check for all of them. {quote} Thanks for the reminder; CapacityScheduler#getSchedulerContainersToRelease seems to need a null check too. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8233.001.patch > > > Recently we saw a NPE problem in CapacityScheduler#tryCommit when try to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from > an allocate/reserve proposal. But got null allocatedOrReservedContainer and > thrown NPE. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}} and > allocatedOrReservedContainer is possibly null in async-scheduling process > when node was lost or application was finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add null check for allocateOrReserveContainer before create > allocate/reserve proposals. Besides the allocation process has increase > unconfirmed resource of app when creating an allocate assignment, so if this > check is null, we should decrease the unconfirmed resource of live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
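A sketch of the extra null check mentioned for the release path; the loop shape and the containersToRelease local are assumptions, only getSchedulerContainer comes from the issue, and type parameters are elided:
{code:java}
List<SchedulerContainer> toRelease = new ArrayList<>();
for (RMContainer rmContainer : containersToRelease) {
  SchedulerContainer schedulerContainer =
      getSchedulerContainer(rmContainer, false);
  // getSchedulerContainer can return null when the node was lost or the
  // application already finished; skip such entries instead of letting a
  // null land in the commit request.
  if (schedulerContainer != null) {
    toRelease.add(schedulerContainer);
  }
}
{code}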
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672826#comment-16672826 ] Hadoop QA commented on YARN-8233: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 22s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 57s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 0s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}162m 6s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-8233 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12921116/YARN-8233.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux b4d89e094499 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d16d5f7 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/22405/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22405/testReport/ | | Max. process+thread count | 956 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U:
[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672817#comment-16672817 ] Weiwei Yang commented on YARN-8958: --- Hi [~Tao Yang] {quote}Calling it is harmless even for a non-existent schedulable entity, because it only refreshes that entity's own cached resource usage and does not affect others. {quote} I am not sure about that... The cached usage is used by the FairComparator to determine the ordering of schedulable entities, so we need to make sure it is updated correctly. Going back to the fix, I came up with an alternative approach: # {{schedulableEntities}} maintains the full list of apps for the ordering policy; entities are added/removed when an app is added, removed or updated; # {{entitiesToReorder}} maintains the apps that need to be re-ordered; entities are added/removed when a container is allocated, released or updated (resource usage changes) Under that context, {{entitiesToReorder}} should be a subset of {{schedulableEntities}}. So why don't we ensure that with the following change? {code:java} protected void entityRequiresReordering(S schedulableEntity) { synchronized (entitiesToReorder) { if (schedulableEntities.contains(schedulableEntity)) { entitiesToReorder.put(schedulableEntity.getId(), schedulableEntity); } } } {code} That way we don't need to change the #reorderScheduleEntities logic. Doesn't that make sense? > Schedulable entities leak in fair ordering policy when recovering containers > between remove app attempt and remove app > -- > > Key: YARN-8958 > URL: https://issues.apache.org/jira/browse/YARN-8958 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8958.001.patch, YARN-8958.002.patch > > > We found an NPE in ClientRMService#getApplications when querying apps with a > specified queue. The cause is that there is one app which can't be found by > calling RMContextImpl#getRMApps (it is finished and swapped out of memory) but > can still be queried from the fair ordering policy. > To reproduce the schedulable entities leak in the fair ordering policy: > (1) create app1 and launch container1 on node1 > (2) restart RM > (3) remove the app1 attempt, app1 is removed from the schedulable entities. > (4) recover container1 after node1 reconnects to RM, then the state of > container1 is changed to COMPLETED, app1 is brought back to entitiesToReorder > after the container is released, then app1 will be added back into the schedulable > entities after the scheduler calls FairOrderingPolicy#getAssignmentIterator. > (5) remove app1 > To solve this problem, we should make sure schedulableEntities can only be > affected by adding or removing an app attempt; a new entity should not be added into > schedulableEntities by the reordering process. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > schedulableEntities.add(schedulableEntity); > } > {code} > The code above can be improved as follows to make sure only an existing > entity can be re-added into schedulableEntities.
> {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > boolean exists = schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > if (exists) { > schedulableEntities.add(schedulableEntity); > } else { > LOG.info("Skip reordering non-existent schedulable entity: " > + schedulableEntity.getId()); > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
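To make the two proposals above concrete, here is a minimal, self-contained sketch of the bookkeeping being discussed. The class and field names are simplified stand-ins, not the actual AbstractComparatorOrderingPolicy code: reordering only queues entities that are still tracked, and only re-inserts entities that were actually present, so a late container event cannot leak a removed app back into the ordered set.
{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Simplified stand-in for the ordering policy's bookkeeping; entities are
// plain strings here instead of SchedulableEntity implementations.
public class OrderingPolicySketch {
  private final TreeSet<String> schedulableEntities = new TreeSet<>();
  private final Map<String, String> entitiesToReorder = new HashMap<>();

  public synchronized void addSchedulableEntity(String entity) {
    schedulableEntities.add(entity);
  }

  public synchronized void removeSchedulableEntity(String entity) {
    // Removing the app attempt also drops any pending reorder request.
    entitiesToReorder.remove(entity);
    schedulableEntities.remove(entity);
  }

  // Variant from the comment above: only queue entities that are still tracked.
  public synchronized void entityRequiresReordering(String entity) {
    if (schedulableEntities.contains(entity)) {
      entitiesToReorder.put(entity, entity);
    }
  }

  // Variant from the issue description: only re-insert entities that were
  // present, so a removed app can never be added back by the reordering pass.
  public synchronized void reorderScheduleEntities() {
    for (String entity : entitiesToReorder.values()) {
      if (schedulableEntities.remove(entity)) {
        schedulableEntities.add(entity); // re-insert to refresh its position
      }
    }
    entitiesToReorder.clear();
  }
}
{code}
Either guard alone closes the leak described in the reproduction steps; the containment check additionally keeps {{entitiesToReorder}} a subset of {{schedulableEntities}} without touching the reorder logic.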
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672797#comment-16672797 ] Weiwei Yang commented on YARN-8233: --- [~Tao Yang], [~surmountian], does this issue only happen when async-scheduling is enabled? Regarding the patch, {{CapacityScheduler#getSchedulerContainer}} is called in multiple places, so I think we need to add a null check for all of them. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8233.001.patch > > > Recently we saw a NPE problem in CapacityScheduler#tryCommit when try to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from > an allocate/reserve proposal. But got null allocatedOrReservedContainer and > thrown NPE. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}} and > allocatedOrReservedContainer is possibly null in async-scheduling process > when node was lost or application was finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add null check for allocateOrReserveContainer before create > allocate/reserve proposals. Besides the allocation process has increase > unconfirmed resource of app when creating an allocate assignment, so if this > check is null, we should decrease the unconfirmed resource of live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
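The null check being discussed can be illustrated with a small, self-contained toy model (illustrative names only, not the CapacityScheduler APIs): the container looked up during commit may have disappeared between the asynchronous allocation and the commit, so the proposal is only built when the lookup succeeds, and the earlier bookkeeping is rolled back otherwise.
{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model of the race: the container can vanish (node lost, app finished)
// between async allocation and commit, so the lookup may return null.
public class NullCheckSketch {

  private final Map<String, String> liveContainers = new HashMap<>();
  private int unconfirmedResource = 0;

  // Stand-in for CapacityScheduler#getSchedulerContainer: null when not live.
  private String getSchedulerContainer(String containerId) {
    return liveContainers.get(containerId);
  }

  // Stand-in for building the allocate proposal during commit.
  public Optional<String> createAllocationProposal(String containerId, int resource) {
    unconfirmedResource += resource; // allocation already accounted the resource
    String schedulerContainer = getSchedulerContainer(containerId);
    if (schedulerContainer == null) {
      unconfirmedResource -= resource; // roll back instead of throwing an NPE
      return Optional.empty();
    }
    return Optional.of("proposal for " + schedulerContainer);
  }

  public static void main(String[] args) {
    NullCheckSketch sketch = new NullCheckSketch();
    sketch.liveContainers.put("container_1", "container_1 on node_1");
    System.out.println(sketch.createAllocationProposal("container_1", 1024)); // present
    sketch.liveContainers.clear(); // node lost or application finished
    System.out.println(sketch.createAllocationProposal("container_1", 1024)); // empty, no NPE
  }
}
{code}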
[jira] [Assigned] (YARN-8686) Queue Management API - not returning JSON or XML response data when passing Accept header
[ https://issues.apache.org/jira/browse/YARN-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB reassigned YARN-8686: -- Assignee: Akhil PB > Queue Management API - not returning JSON or XML response data when passing > Accept header > - > > Key: YARN-8686 > URL: https://issues.apache.org/jira/browse/YARN-8686 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Akhil PB >Assignee: Akhil PB >Priority: Major > > API should return JSON or XML response data based on Accept header. Instead, > API returns plain text for success as well as error scenarios. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8925) Updating distributed node attributes only when necessary
[ https://issues.apache.org/jira/browse/YARN-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672769#comment-16672769 ] Tao Yang edited comment on YARN-8925 at 11/2/18 8:41 AM: - Hi, [~cheersyang]. I have attached the V2 patch for review. It is not finished yet since some test cases are still needed; could you help take a quick look and give your feedback on this patch? The V2 patch makes the NM report node attributes only when they have been updated or the re-sync interval has elapsed, to avoid the overhead of comparing them in every heartbeat. As you advised, the logic is similar to the node label reporting. Common logic, such as judging whether labels/attributes should be sent to the RM, is extracted into HeartbeatSyncIfNeededHandler so that the label and attribute handlers can extend it and reuse the logic. For test cases, I have created TestNodeStatusUpdaterForAttributes to test the NM side and will add some new test methods in TestResourceTrackerService to test the RM side (similar to the node label test cases in TestResourceTrackerService). Any concerns about that? Looking forward to your feedback. Thanks! was (Author: tao yang): Hi, [~cheersyang]. I have attached V2 patch for review, this patch is not finished yet since some test cases are still needed, can you help to take a simple review and give your feedback about this patch? V2 patch supports reporting node attributes only when updated or re-sync time is elapsed by NM to avoid overhead of the comparing in every heartbeat process. As your advice, logic is similar as reporting node labels. Common logic such as judging whether labels/attributes should be send to RM is extracted as HeartbeatSyncIfNeededHandler, so that handlers of labels and attributes can extend it to reuse the logic. For test cases, I create TestNodeStatusUpdaterForAttributes to test NM side and will add some new test methods in TestResourceTrackerService to test RM side (similar to node labels test cases in TestResourceTrackerService). Thoughts? > Updating distributed node attributes only when necessary > > > Key: YARN-8925 > URL: https://issues.apache.org/jira/browse/YARN-8925 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Labels: performance > Attachments: YARN-8925.001.patch, YARN-8925.002.patch > > > Currently if distributed node attributes exist, even though there is no > change, updating for distributed node attributes will happen in every > heartbeat between NM and RM. Updating process will hold > NodeAttributesManagerImpl#writeLock and may have some influence in a large > cluster. We have found nodes UI of a large cluster is opened slowly and most > time it's waiting for the lock in NodeAttributesManagerImpl. I think this > updating should be called only when necessary to enhance the performance of > related process. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
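A rough sketch of the reporting rule described in the comment (illustrative names only, not the HeartbeatSyncIfNeededHandler added by the patch): the NM resends its distributed attributes only when they differ from the last reported set or when a re-sync interval has elapsed, so an unchanged heartbeat never triggers the NodeAttributesManagerImpl write-lock update on the RM side.
{code:java}
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Illustrative "send only if needed" heartbeat logic for distributed node
// attributes; names do not reflect the patch's actual classes.
public class AttributeSyncSketch {
  private final long resyncIntervalMs;
  private Set<String> lastSentAttributes = Collections.emptySet();
  private long lastSentTimeMs = 0L;

  public AttributeSyncSketch(long resyncIntervalMs) {
    this.resyncIntervalMs = resyncIntervalMs;
  }

  /** Returns the attributes to report in this heartbeat, or null to skip the update. */
  public synchronized Set<String> attributesForHeartbeat(Set<String> current, long nowMs) {
    boolean changed = !current.equals(lastSentAttributes);
    boolean resyncDue = nowMs - lastSentTimeMs >= resyncIntervalMs;
    if (!changed && !resyncDue) {
      return null; // nothing to report, RM side skips the write-lock update
    }
    lastSentAttributes = new HashSet<>(current);
    lastSentTimeMs = nowMs;
    return lastSentAttributes;
  }
}
{code}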
[jira] [Commented] (YARN-8925) Updating distributed node attributes only when necessary
[ https://issues.apache.org/jira/browse/YARN-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672769#comment-16672769 ] Tao Yang commented on YARN-8925: Hi, [~cheersyang]. I have attached V2 patch for review, this patch is not finished yet since some test cases are still needed, can you help to take a simple review and give your feedback about this patch? V2 patch supports reporting node attributes only when updated or re-sync time is elapsed by NM to avoid overhead of the comparing in every heartbeat process. As your advice, logic is similar as reporting node labels. Common logic such as judging whether labels/attributes should be send to RM is extracted as HeartbeatSyncIfNeededHandler, so that handlers of labels and attributes can extend it to reuse the logic. For test cases, I create TestNodeStatusUpdaterForAttributes to test NM side and will add some new test methods in TestResourceTrackerService to test RM side (similar to node labels test cases in TestResourceTrackerService). Thoughts? > Updating distributed node attributes only when necessary > > > Key: YARN-8925 > URL: https://issues.apache.org/jira/browse/YARN-8925 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Labels: performance > Attachments: YARN-8925.001.patch, YARN-8925.002.patch > > > Currently if distributed node attributes exist, even though there is no > change, updating for distributed node attributes will happen in every > heartbeat between NM and RM. Updating process will hold > NodeAttributesManagerImpl#writeLock and may have some influence in a large > cluster. We have found nodes UI of a large cluster is opened slowly and most > time it's waiting for the lock in NodeAttributesManagerImpl. I think this > updating should be called only when necessary to enhance the performance of > related process. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt reassigned YARN-8948: -- Assignee: Bibin A Chundatt > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch, > YARN-8948.003.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
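For illustration of Issue 1 only (a hedged sketch; the signature below is hypothetical and not taken from any attached patch): a scheduler-agnostic contract would initialize the rule from the generic scheduler and configuration rather than from CapacitySchedulerContext, so FairScheduler and other schedulers could provide placement rules through the same interface.
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceScheduler;

// Hypothetical scheduler-agnostic placement rule contract; the method name
// and parameters are illustrative, not the committed interface.
public abstract class SchedulerAgnosticPlacementRule {

  /** Initialize from the generic scheduler, not CapacitySchedulerContext. */
  public abstract boolean initialize(ResourceScheduler scheduler,
      Configuration conf) throws IOException;
}
{code}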
[jira] [Updated] (YARN-8925) Updating distributed node attributes only when necessary
[ https://issues.apache.org/jira/browse/YARN-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8925: --- Attachment: YARN-8925.002.patch > Updating distributed node attributes only when necessary > > > Key: YARN-8925 > URL: https://issues.apache.org/jira/browse/YARN-8925 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Labels: performance > Attachments: YARN-8925.001.patch, YARN-8925.002.patch > > > Currently if distributed node attributes exist, even though there is no > change, updating for distributed node attributes will happen in every > heartbeat between NM and RM. Updating process will hold > NodeAttributesManagerImpl#writeLock and may have some influence in a large > cluster. We have found nodes UI of a large cluster is opened slowly and most > time it's waiting for the lock in NodeAttributesManagerImpl. I think this > updating should be called only when necessary to enhance the performance of > related process. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-8948: --- Attachment: YARN-8948.003.patch > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch, > YARN-8948.003.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-8948: --- Attachment: YARN-8948.003.patch > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-8948: --- Attachment: (was: YARN-8948.003.patch) > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672686#comment-16672686 ] Bibin A Chundatt commented on YARN-8948: [~sunil.gov...@gmail.com] Could you look into this issue.. > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. > {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8948) PlacementRule interface should be for all YarnSchedulers
[ https://issues.apache.org/jira/browse/YARN-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-8948: --- Description: *Issue 1:* YARN-3635 intention was to add PlacementRule interface common for all YarnSchedules. {code} 33public abstract boolean initialize( 34CapacitySchedulerContext schedulerContext) throws IOException; {code} PlacementRule initialization is done using CapacitySchedulerContext binding to CapacityScheduler *Issue 2:* {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity Scheduler {quote} * **Queue Mapping Interface based on Default or User Defined Placement Rules** - This feature allows users to map a job to a specific queue based on some default placement rule. For instance based on user & group, or application name. User can also define their own placement rule. {quote} As per current UserGroupMapping is always added in placementRule. {{CapacityScheduler#updatePlacementRules}} {code} // Initialize placement rules Collection placementRuleStrs = conf.getStringCollection( YarnConfiguration.QUEUE_PLACEMENT_RULES); List placementRules = new ArrayList<>(); ... // add UserGroupMappingPlacementRule if absent distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); {code} PlacementRule configuration order is not maintained was: *Issue 1:* YARN-3635 intention was to add PlacementRule interface common for all YarnSchedules. {code} 33public abstract boolean initialize( 34CapacitySchedulerContext schedulerContext) throws IOException; {code} PlacementRule initialization is done using CapacitySchedulerContext binding to CapacityScheduler *Issue 2:* {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity Scheduler {quote} * **Queue Mapping Interface based on Default or User Defined Placement Rules** - This feature allows users to map a job to a specific queue based on some default placement rule. For instance based on user & group, or application name. User can also define their own placement rule. {quote} As per current UserGroupMapping is always added in placementRule. {{CapacityScheduler#updatePlacementRules}} {code} // Initialize placement rules Collection placementRuleStrs = conf.getStringCollection( YarnConfiguration.QUEUE_PLACEMENT_RULES); List placementRules = new ArrayList<>(); ... // add UserGroupMappingPlacementRule if absent distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); {code} > PlacementRule interface should be for all YarnSchedulers > > > Key: YARN-8948 > URL: https://issues.apache.org/jira/browse/YARN-8948 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Critical > Attachments: YARN-8948.001.patch, YARN-8948.002.patch > > > *Issue 1:* > YARN-3635 intention was to add PlacementRule interface common for all > YarnSchedules. > {code} > 33 public abstract boolean initialize( > 34 CapacitySchedulerContext schedulerContext) throws IOException; > {code} > PlacementRule initialization is done using CapacitySchedulerContext binding > to CapacityScheduler > *Issue 2:* > {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity > Scheduler > {quote} > * **Queue Mapping Interface based on Default or User Defined Placement > Rules** - This feature allows users to map a job to a specific queue based on > some default placement rule. For instance based on user & group, or > application name. User can also define their own placement rule. 
> {quote} > As per current UserGroupMapping is always added in placementRule. > {{CapacityScheduler#updatePlacementRules}} > {code} > // Initialize placement rules > Collection placementRuleStrs = conf.getStringCollection( > YarnConfiguration.QUEUE_PLACEMENT_RULES); > List placementRules = new ArrayList<>(); > ... > // add UserGroupMappingPlacementRule if absent > distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE); > {code} > PlacementRule configuration order is not maintained -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-8686) Queue Management API - not returning JSON or XML response data when passing Accept header
[ https://issues.apache.org/jira/browse/YARN-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB resolved YARN-8686. Resolution: Won't Fix Changing the output of the RESTful API might be incompatible with old clients. > Queue Management API - not returning JSON or XML response data when passing > Accept header > - > > Key: YARN-8686 > URL: https://issues.apache.org/jira/browse/YARN-8686 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Akhil PB >Priority: Major > > API should return JSON or XML response data based on Accept header. Instead, > API returns plain text for success as well as error scenarios. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8905) [Router] Add JvmMetricsInfo and pause monitor
[ https://issues.apache.org/jira/browse/YARN-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672659#comment-16672659 ] Bibin A Chundatt commented on YARN-8905: +1 latest patch looks good to me. Will commit by tomorrow if no objections. > [Router] Add JvmMetricsInfo and pause monitor > - > > Key: YARN-8905 > URL: https://issues.apache.org/jira/browse/YARN-8905 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Minor > Attachments: YARN-8905-001.patch, YARN-8905-002.patch, > YARN-8905-003.patch, YARN-8905-004.patch, YARN-8905-005.patch > > > Similar to resourcemanager and nodemanager serivce we can add JvmMetricsInfo > to router service too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
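For reference, the JvmMetrics and pause-monitor wiring in the RM and NM follows roughly the pattern below; a similar shape would be expected for the Router, though the exact class names and the "Router" metrics prefix used here are assumptions, not details of the attached patches.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.source.JvmMetrics;
import org.apache.hadoop.service.CompositeService;
import org.apache.hadoop.util.JvmPauseMonitor;

// Sketch of registering JVM metrics and a pause monitor in a CompositeService,
// mirroring what ResourceManager and NodeManager do in serviceInit.
public class RouterMetricsSketch extends CompositeService {

  private JvmPauseMonitor pauseMonitor;

  public RouterMetricsSketch() {
    super(RouterMetricsSketch.class.getName());
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    DefaultMetricsSystem.initialize("Router");          // metrics prefix is an assumption
    JvmMetrics jvmMetrics = JvmMetrics.initSingleton("Router", null);
    pauseMonitor = new JvmPauseMonitor();
    addService(pauseMonitor);                           // JvmPauseMonitor is a service in 3.x
    jvmMetrics.setPauseMonitor(pauseMonitor);
    super.serviceInit(conf);
  }
}
{code}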
[jira] [Commented] (YARN-8233) NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal whose allocatedOrReservedContainer is null
[ https://issues.apache.org/jira/browse/YARN-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672650#comment-16672650 ] Xiao Liang commented on YARN-8233: -- Thanks [~Tao Yang], I also met this issue in Hadoop 2.9, would you also patch for Hadoop 2.9? Maybe some unit test cases could be added to cover the logic. > NPE in CapacityScheduler#tryCommit when handling allocate/reserve proposal > whose allocatedOrReservedContainer is null > - > > Key: YARN-8233 > URL: https://issues.apache.org/jira/browse/YARN-8233 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8233.001.patch > > > Recently we saw a NPE problem in CapacityScheduler#tryCommit when try to find > the attemptId by calling {{c.getAllocatedOrReservedContainer().get...}} from > an allocate/reserve proposal. But got null allocatedOrReservedContainer and > thrown NPE. > Reference code: > {code:java} > // find the application to accept and apply the ResourceCommitRequest > if (request.anythingAllocatedOrReserved()) { > ContainerAllocationProposal c = > request.getFirstAllocatedOrReservedContainer(); > attemptId = > c.getAllocatedOrReservedContainer().getSchedulerApplicationAttempt() > .getApplicationAttemptId(); //NPE happens here > } else { ... > {code} > The proposal was constructed in > {{CapacityScheduler#createResourceCommitRequest}} and > allocatedOrReservedContainer is possibly null in async-scheduling process > when node was lost or application was finished (details in > {{CapacityScheduler#getSchedulerContainer}}). > Reference code: > {code:java} > // Allocated something > List allocations = > csAssignment.getAssignmentInformation().getAllocationDetails(); > if (!allocations.isEmpty()) { > RMContainer rmContainer = allocations.get(0).rmContainer; > allocated = new ContainerAllocationProposal<>( > getSchedulerContainer(rmContainer, true), //possibly null > getSchedulerContainersToRelease(csAssignment), > > getSchedulerContainer(csAssignment.getFulfilledReservedContainer(), > false), csAssignment.getType(), > csAssignment.getRequestLocalityType(), > csAssignment.getSchedulingMode() != null ? > csAssignment.getSchedulingMode() : > SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY, > csAssignment.getResource()); > } > {code} > I think we should add null check for allocateOrReserveContainer before create > allocate/reserve proposals. Besides the allocation process has increase > unconfirmed resource of app when creating an allocate assignment, so if this > check is null, we should decrease the unconfirmed resource of live app. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8867) Retrieve the status of resource localization
[ https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672619#comment-16672619 ] Hadoop QA commented on YARN-8867: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 11 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 6m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 52s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 8m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 3s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 15m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 43s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 29s{color} | {color:orange} root: The patch generated 26 new + 481 unchanged - 0 fixed = 507 total (was 481) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 5m 26s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 6s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 9m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 38s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 40s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 17m 22s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}114m 45s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 27m 45s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 11m 2s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 18s{color} |
[jira] [Commented] (YARN-8958) Schedulable entities leak in fair ordering policy when recovering containers between remove app attempt and remove app
[ https://issues.apache.org/jira/browse/YARN-8958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672606#comment-16672606 ] Tao Yang commented on YARN-8958: Hi, [~cheersyang], thanks for the review. {quote} Can we also only do the resource usage updates when the schedulable entity exists? Otherwise, would the resource usage get incorrectly updated? {quote} I think it's OK to update the resource usage for a non-existent schedulable entity; it may not be finished but just moved from the ordering-policy to the pending-ordering-policy, and it may still need this update. Calling it is harmless even for a non-existent schedulable entity, because it only refreshes that entity's own cached resource usage and does not affect others. > Schedulable entities leak in fair ordering policy when recovering containers > between remove app attempt and remove app > -- > > Key: YARN-8958 > URL: https://issues.apache.org/jira/browse/YARN-8958 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8958.001.patch, YARN-8958.002.patch > > > We found an NPE in ClientRMService#getApplications when querying apps with a > specified queue. The cause is that there is one app which can't be found by > calling RMContextImpl#getRMApps (it is finished and swapped out of memory) but > can still be queried from the fair ordering policy. > To reproduce the schedulable entities leak in the fair ordering policy: > (1) create app1 and launch container1 on node1 > (2) restart RM > (3) remove the app1 attempt, app1 is removed from the schedulable entities. > (4) recover container1 after node1 reconnects to RM, then the state of > container1 is changed to COMPLETED, app1 is brought back to entitiesToReorder > after the container is released, then app1 will be added back into the schedulable > entities after the scheduler calls FairOrderingPolicy#getAssignmentIterator. > (5) remove app1 > To solve this problem, we should make sure schedulableEntities can only be > affected by adding or removing an app attempt; a new entity should not be added into > schedulableEntities by the reordering process. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > schedulableEntities.add(schedulableEntity); > } > {code} > The code above can be improved as follows to make sure only an existing > entity can be re-added into schedulableEntities. > {code:java} > protected void reorderSchedulableEntity(S schedulableEntity) { > //remove, update comparable data, and reinsert to update position in order > boolean exists = schedulableEntities.remove(schedulableEntity); > updateSchedulingResourceUsage( > schedulableEntity.getSchedulingResourceUsage()); > if (exists) { > schedulableEntities.add(schedulableEntity); > } else { > LOG.info("Skip reordering non-existent schedulable entity: " > + schedulableEntity.getId()); > } > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org