[jira] [Commented] (YARN-10080) Support show app id on localizer thread pool
[ https://issues.apache.org/jira/browse/YARN-10080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090195#comment-17090195 ] zhoukang commented on YARN-10080: - How can we move this forward, [~adam.antal] [~abmodi]? Thanks. > Support show app id on localizer thread pool > > > Key: YARN-10080 > URL: https://issues.apache.org/jira/browse/YARN-10080 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Attachments: YARN-10080-001.patch, YARN-10080.002.patch > > > Currently, when we troubleshoot a container localizer issue and want to analyze the jstack output with thread detail, we cannot figure out which thread is processing a given container. So I want to add the app ID to the thread name. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
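The idea in YARN-10080 (putting the app ID into the thread name so it shows up in a jstack dump) can be sketched with a plain `java.util.concurrent.ThreadFactory`. This is only an illustration, not the attached patch: the class name `AppIdThreadFactory`, the `ContainerLocalizer-` prefix, and the sample application ID are all assumptions made for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical factory (not from the YARN-10080 patch): embeds an app ID in each thread name. */
final class AppIdThreadFactory implements ThreadFactory {
  private final String appId;
  private final AtomicInteger counter = new AtomicInteger();

  AppIdThreadFactory(String appId) {
    this.appId = appId;
  }

  @Override
  public Thread newThread(Runnable r) {
    // The name is what jstack prints, so the app ID becomes visible in thread dumps.
    Thread t = new Thread(r, "ContainerLocalizer-" + appId + "-" + counter.getAndIncrement());
    t.setDaemon(true);
    return t;
  }
}

final class LocalizerThreadNameDemo {
  public static void main(String[] args) throws Exception {
    // The application ID below is a made-up sample value.
    ExecutorService pool =
        Executors.newFixedThreadPool(2, new AppIdThreadFactory("application_1587000000000_0001"));
    Callable<String> task = () -> Thread.currentThread().getName();
    System.out.println(pool.submit(task).get());
    pool.shutdown();
  }
}
```

A pool built this way needs one factory per application; whether the NodeManager's localizer pool is structured per application is not stated in the message, so treat that as an open design question.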
[jira] [Created] (YARN-10242) CapacityScheduler may call updateClusterResource for every node register event which will cause resource register too slow
zhoukang created YARN-10242: --- Summary: CapacityScheduler may call updateClusterResource for every node register event which will cause resource register too slow Key: YARN-10242 URL: https://issues.apache.org/jira/browse/YARN-10242 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, resourcemanager Reporter: zhoukang
[jira] [Commented] (YARN-9997) Code cleanup in ZKConfigurationStore
[ https://issues.apache.org/jira/browse/YARN-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090045#comment-17090045 ] Hadoop QA commented on YARN-9997: -
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 24s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || branch-3.2 Compile Tests ||
| +1 | mvninstall | 23m 3s | branch-3.2 passed |
| +1 | compile | 0m 47s | branch-3.2 passed |
| +1 | checkstyle | 0m 35s | branch-3.2 passed |
| +1 | mvnsite | 0m 49s | branch-3.2 passed |
| +1 | shadedclient | 15m 0s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 32s | branch-3.2 passed |
| 0 | spotbugs | 1m 47s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 1m 44s | branch-3.2 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 55s | the patch passed |
| +1 | compile | 0m 43s | the patch passed |
| +1 | javac | 0m 43s | the patch passed |
| -0 | checkstyle | 0m 31s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) |
| +1 | mvnsite | 0m 46s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 7s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 31s | the patch passed |
| +1 | findbugs | 1m 52s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 311m 31s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 376m 20s | |
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestNodeBlacklistingOnAMFailures |
| | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 |
| | hadoop.yarn.server.resourcemanager.TestApplicationACLs |
| | hadoop.yarn.server.resourcemanager.TestWorkPreservingUnmanagedAM |
| | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector |
| | hadoop.yarn.server.resourcemanager.placement.TestPlacementManager |
| | hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.TestFSSchedulerConfigurationStore |
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base:
[jira] [Commented] (YARN-10201) Make AMRMProxyPolicy aware of SC load
[ https://issues.apache.org/jira/browse/YARN-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089925#comment-17089925 ] Young Chen commented on YARN-10201: --- Thanks [~goiri] for the feedback - I fixed the more immediate comments first. I'll go over the test cases and clean up/add coverage as necessary today.
* Add javadocs to the new methods in AMRMClientRelayer. I would also extract the values like {{this.remotePendingAsks.get(key)}}.
The extraction was using the container and also relaxing the constraints when matching. Actually, after looking at it again, I think it would be okay to just directly construct the ResourceRequestKey via Container (maybe w/ a new constructor overload). What do you think about this option?
* Javadoc with example for SubClusterId#getShortId(), it would be good if it had examples.
getShortId() wasn't actually being used here, so I've removed it. I'll add it back as part of the patch that depends on it.
* Overall, I think we should make some tests a little more specific; ideas that come to mind are the whole protobuf, ContainerAsksBalancer, and the code path that leads to routeNodeRequestIfNeeded.
Agreed, I'll clean this up in the next patch. Let me know if I missed anything.
> Make AMRMProxyPolicy aware of SC load > - > > Key: YARN-10201 > URL: https://issues.apache.org/jira/browse/YARN-10201 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Young Chen >Assignee: Young Chen >Priority: Major > Attachments: YARN-10201.v0.patch, YARN-10201.v1.patch, > YARN-10201.v2.patch, YARN-10201.v3.patch, YARN-10201.v4.patch, > YARN-10201.v5.patch, YARN-10201.v6.patch > > > LocalityMulticastAMRMProxyPolicy is currently unaware of SC load when splitting resource requests. We propose changes to the policy so that it receives feedback from SCs and can load balance requests across the federated cluster.
[jira] [Reopened] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph reopened YARN-10154: -- > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Manikandan R >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-10154.001.patch, YARN-10154.002.patch, > YARN-10154.003.patch, YARN-10154.addendum-001.patch > > > In CS, ManagedParent Queue and its template cannot take absolute resource values like [memory=8192,vcores=8]. This Jira is to track and improve the configuration reading module of DynamicQueue to support absolute resource values.
[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089904#comment-17089904 ] Prabhu Joseph commented on YARN-10154: -- [~maniraj...@gmail.com] [~sunilg] We found the two issues below on further testing. 1. An application submitted into an AutoCreatedLeafQueue stays in the ACCEPTED state because the queue's capacity and max capacity are set to NaN. This happens because the {{leafQueueTemplate}} created during ManagedParentQueue initialization, as part of ResourceManager startup, sets capacity based on the initial clusterResource of 0. >> Moved the setCapacity call in leafQueueTemplate inside addChildQueue, where it uses the right clusterResource. 2. Max capacity is required to be set in the leaf queue template. If it is not set, the RM fails to start with IllegalArgumentException abs-cap cannot be greater that abs-max-cap. >> When not set, the parent's max resource is used. > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Manikandan R >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-10154.001.patch, YARN-10154.002.patch, > YARN-10154.003.patch, YARN-10154.addendum-001.patch > > > In CS, ManagedParent Queue and its template cannot take absolute resource values like [memory=8192,vcores=8]. This Jira is to track and improve the configuration reading module of DynamicQueue to support absolute resource values.
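As a general illustration of how a capacity can end up non-finite when it is derived from a cluster resource that is still 0, the sketch below shows plain Java float division. The helper names are hypothetical and the arithmetic is not taken from CapacityScheduler; in particular, the guarded variant is only one possible defence and is not what the addendum patch does (per the comment above, the patch moves the setCapacity call to a point where clusterResource is already correct).

```java
final class CapacityRatioDemo {
  /** Hypothetical helper: fraction of the cluster that a queue's absolute memory represents. */
  static float capacityFraction(float queueMemoryMb, float clusterMemoryMb) {
    // Float division never throws; a zero denominator yields NaN or Infinity instead.
    return queueMemoryMb / clusterMemoryMb;
  }

  /** Guarded variant (an assumption, not the patch): report 0 until the cluster has resources. */
  static float safeCapacityFraction(float queueMemoryMb, float clusterMemoryMb) {
    return clusterMemoryMb > 0f ? queueMemoryMb / clusterMemoryMb : 0f;
  }

  public static void main(String[] args) {
    System.out.println(capacityFraction(0f, 0f));            // NaN
    System.out.println(capacityFraction(8192f, 0f));         // Infinity
    System.out.println(safeCapacityFraction(8192f, 16384f)); // 0.5
  }
}
```

Because any comparison against NaN is false, a queue whose capacity is NaN can fail admission checks without raising an error, which would be consistent with an application staying in ACCEPTED.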
[jira] [Updated] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-10154: - Attachment: YARN-10154.addendum-001.patch > CS Dynamic Queues cannot be configured with absolute resources > -- > > Key: YARN-10154 > URL: https://issues.apache.org/jira/browse/YARN-10154 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.3 >Reporter: Sunil G >Assignee: Manikandan R >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-10154.001.patch, YARN-10154.002.patch, > YARN-10154.003.patch, YARN-10154.addendum-001.patch > > > In CS, ManagedParent Queue and its template cannot take absolute resource values like [memory=8192,vcores=8]. This Jira is to track and improve the configuration reading module of DynamicQueue to support absolute resource values.
[jira] [Commented] (YARN-10201) Make AMRMProxyPolicy aware of SC load
[ https://issues.apache.org/jira/browse/YARN-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089763#comment-17089763 ] Íñigo Goiri commented on YARN-10201: There are still a few comments left (including the new checkstyles and the findbugs). Can we also add some documentation to the Federation.md page? > Make AMRMProxyPolicy aware of SC load > - > > Key: YARN-10201 > URL: https://issues.apache.org/jira/browse/YARN-10201 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Young Chen >Assignee: Young Chen >Priority: Major > Attachments: YARN-10201.v0.patch, YARN-10201.v1.patch, > YARN-10201.v2.patch, YARN-10201.v3.patch, YARN-10201.v4.patch, > YARN-10201.v5.patch, YARN-10201.v6.patch > > > LocalityMulticastAMRMProxyPolicy is currently unaware of SC load when > splitting resource requests. We propose changes to the policy so that it > receives feedback from SCs and can load balance requests across the federated > cluster.
[jira] [Commented] (YARN-9997) Code cleanup in ZKConfigurationStore
[ https://issues.apache.org/jira/browse/YARN-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089744#comment-17089744 ] Andras Gyori commented on YARN-9997: As YARN-10002 had been successfully backported, I have uploaded the backport patch as well. > Code cleanup in ZKConfigurationStore > > > Key: YARN-9997 > URL: https://issues.apache.org/jira/browse/YARN-9997 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Andras Gyori >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-9997.001.patch, YARN-9997.002.patch, > YARN-9997.003.patch, YARN-9997.004.patch, YARN-9997.005.patch, > YARN-9997.006.patch, YARN-9997.branch-3.2.001.patch > > > Many things can be improved: > * znodeParentPath could be a local variable > * zkManager could be private, VisibleForTesting annotation is not needed anymore > * Do something with unchecked casts > * zkManager.safeSetData calls have almost the same set of parameters: Simplify this > * Extract zkManager calls to their own methods: They are repeated > * Remove TODOs
[jira] [Updated] (YARN-9997) Code cleanup in ZKConfigurationStore
[ https://issues.apache.org/jira/browse/YARN-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Gyori updated YARN-9997: --- Attachment: YARN-9997.branch-3.2.001.patch > Code cleanup in ZKConfigurationStore > > > Key: YARN-9997 > URL: https://issues.apache.org/jira/browse/YARN-9997 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Andras Gyori >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-9997.001.patch, YARN-9997.002.patch, > YARN-9997.003.patch, YARN-9997.004.patch, YARN-9997.005.patch, > YARN-9997.006.patch, YARN-9997.branch-3.2.001.patch > > > Many things can be improved: > * znodeParentPath could be a local variable > * zkManager could be private, VisibleForTesting annotation is not needed anymore > * Do something with unchecked casts > * zkManager.safeSetData calls have almost the same set of parameters: Simplify this > * Extract zkManager calls to their own methods: They are repeated > * Remove TODOs
[jira] [Commented] (YARN-8417) Should skip passing HDFS_HOME, HADOOP_CONF_DIR, JAVA_HOME, etc. to Docker container.
[ https://issues.apache.org/jira/browse/YARN-8417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089689#comment-17089689 ] Jim Brennan commented on YARN-8417: --- [~leftnoteasy], [~eyang], [~shaneku...@gmail.com], is this something you guys still need for ENTRY_POINT mode? > Should skip passing HDFS_HOME, HADOOP_CONF_DIR, JAVA_HOME, etc. to Docker > container. > > > Key: YARN-8417 > URL: https://issues.apache.org/jira/browse/YARN-8417 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Priority: Critical > > Currently, YARN NM passes JAVA_HOME, HDFS_HOME, CLASSPATH environments before > launching Docker container no matter if ENTRY_POINT is used or not. This will > overwrite environments defined inside Dockerfile (by using {{ENV}}). For > Docker container, it actually doesn't make sense to pass JAVA_HOME, > HDFS_HOME, etc. because inside docker image we have a separate Java/Hadoop > installed or mounted to exactly same directory of host machine.
[jira] [Commented] (YARN-10201) Make AMRMProxyPolicy aware of SC load
[ https://issues.apache.org/jira/browse/YARN-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089635#comment-17089635 ] Hadoop QA commented on YARN-10201: --
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 31s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | prototool | 0m 0s | prototool was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 3s | Maven dependency ordering for branch |
| +1 | mvninstall | 23m 42s | trunk passed |
| +1 | compile | 9m 17s | trunk passed |
| +1 | checkstyle | 1m 46s | trunk passed |
| +1 | mvnsite | 4m 9s | trunk passed |
| +1 | shadedclient | 21m 57s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 3m 6s | trunk passed |
| 0 | spotbugs | 2m 2s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| -1 | findbugs | 1m 22s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in trunk has 1 extant findbugs warnings. |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 24s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 37s | the patch passed |
| +1 | compile | 9m 14s | the patch passed |
| +1 | cc | 9m 14s | the patch passed |
| +1 | javac | 9m 14s | the patch passed |
| -0 | checkstyle | 1m 37s | hadoop-yarn-project/hadoop-yarn: The patch generated 11 new + 241 unchanged - 0 fixed = 252 total (was 241) |
| +1 | mvnsite | 4m 19s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 16m 10s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 3m 22s | the patch passed |
| -1 | findbugs | 2m 27s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) |
|| || || || Other Tests ||
| -1 | unit | 1m 2s | hadoop-yarn-api in the patch passed. |
| -1 | unit | 4m 20s | hadoop-yarn-common in the patch passed. |
| -1 | unit | 2m 59s | hadoop-yarn-server-common in the patch passed. |
| +1 | unit | 23m 0s | hadoop-yarn-server-nodemanager in the patch passed.
[jira] [Commented] (YARN-9996) Code cleanup in QueueAdminConfigurationMutationACLPolicy
[ https://issues.apache.org/jira/browse/YARN-9996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089532#comment-17089532 ] Hadoop QA commented on YARN-9996: -
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 2m 4s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || branch-3.2 Compile Tests ||
| +1 | mvninstall | 27m 50s | branch-3.2 passed |
| +1 | compile | 0m 59s | branch-3.2 passed |
| +1 | checkstyle | 0m 43s | branch-3.2 passed |
| +1 | mvnsite | 1m 2s | branch-3.2 passed |
| +1 | shadedclient | 18m 1s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 40s | branch-3.2 passed |
| 0 | spotbugs | 2m 8s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 2m 6s | branch-3.2 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 4s | the patch passed |
| +1 | compile | 0m 54s | the patch passed |
| +1 | javac | 0m 54s | the patch passed |
| +1 | checkstyle | 0m 42s | the patch passed |
| +1 | mvnsite | 0m 59s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 16m 51s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 37s | the patch passed |
| +1 | findbugs | 2m 21s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 317m 11s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 1m 0s | The patch does not generate ASF License warnings. |
| | | 395m 0s | |
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler |
| | hadoop.yarn.server.resourcemanager.TestNodeBlacklistingOnAMFailures |
| | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 |
| | hadoop.yarn.server.resourcemanager.TestApplicationACLs |
| | hadoop.yarn.server.resourcemanager.TestWorkPreservingUnmanagedAM |
| | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector |
| | hadoop.yarn.server.resourcemanager.placement.TestPlacementManager |
| | hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.TestFSSchedulerConfigurationStore |
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base:
[jira] [Commented] (YARN-8234) Improve RM system metrics publisher's performance by pushing events to timeline server in batch
[ https://issues.apache.org/jira/browse/YARN-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089446#comment-17089446 ] Hadoop QA commented on YARN-8234: -
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | docker | 18m 26s | Docker failed to build yetus/hadoop:c2d96dd50e7. |
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8234 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939409/YARN-8234-branch-2.8.3.004.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25914/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
> Improve RM system metrics publisher's performance by pushing events to timeline server in batch > > Key: YARN-8234 > URL: https://issues.apache.org/jira/browse/YARN-8234 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, timelineserver >Affects Versions: 2.8.3 >Reporter: Hu Ziqian >Assignee: Hu Ziqian >Priority: Critical > Attachments: YARN-8234-branch-2.8.3.001.patch, YARN-8234-branch-2.8.3.002.patch, YARN-8234-branch-2.8.3.003.patch, YARN-8234-branch-2.8.3.004.patch, YARN-8234.001.patch, YARN-8234.002.patch, YARN-8234.003.patch, YARN-8234.004.patch
> When the system metrics publisher is enabled, the RM pushes events to the timeline server via a RESTful API. If the cluster load is heavy, many events are sent to the timeline server and the timeline server's event handler thread gets locked. YARN-7266 discusses this problem in detail. Because of the lock, the timeline server can't receive events as fast as the RM generates them, and many timeline events stay in the RM's memory. Eventually those events consume all of the RM's memory, and the RM starts a full GC (which causes a JVM stop-the-world pause and a timeout from the RM to ZooKeeper) or even hits an OOM.
> The main problem here is that the timeline server can't receive events as fast as they are generated. Currently, the RM system metrics publisher puts only one event in each request, so most of the time is spent handling HTTP headers or other network-connection overhead on the timeline server side. Only a little time is spent on the timeline event itself, which is the truly valuable work.
> In this issue, we add a buffer to the system metrics publisher and let the publisher send events to the timeline server in batches, one request per batch. With the batch size set to 1000, in our experiment the rate at which the timeline server receives events improved 100x. We have implemented this in our production environment, which accepts 2 apps in one hour, and it works fine.
> We add the following configuration:
> * yarn.resourcemanager.system-metrics-publisher.batch-size: the number of events the system metrics publisher sends in one request. The default value is 1000.
> * yarn.resourcemanager.system-metrics-publisher.buffer-size: the size of the event buffer in the system metrics publisher.
> * yarn.resourcemanager.system-metrics-publisher.interval-seconds: when batch publishing is enabled, we must avoid the publisher waiting for a batch to fill up and holding events in the buffer for a long time. So we add another thread which sends the events in the buffer periodically. This config sets the interval of that periodic sending thread. The default value is 60s.
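The buffering scheme described in YARN-8234 (a bounded buffer, a batch size, and a periodic flush so that events never wait too long) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the attached patches: `BatchingPublisher`, its methods, and the `sender` callback standing in for "one HTTP request to the timeline server" are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch of batch publishing; assumes bufferSize >= batchSize. */
final class BatchingPublisher<E> {
  private final BlockingQueue<E> buffer;
  private final int batchSize;
  private final Consumer<List<E>> sender; // stands in for one request carrying a whole batch

  BatchingPublisher(int bufferSize, int batchSize, Consumer<List<E>> sender) {
    this.buffer = new LinkedBlockingQueue<>(bufferSize);
    this.batchSize = batchSize;
    this.sender = sender;
  }

  /** Buffers one event; returns false if the buffer is full and the event was dropped. */
  boolean publish(E event) {
    boolean accepted = buffer.offer(event);
    if (buffer.size() >= batchSize) {
      flush(); // a full batch is available, send it now
    }
    return accepted;
  }

  /** Sends up to batchSize buffered events in a single call to the sender. */
  synchronized void flush() {
    List<E> batch = new ArrayList<>(batchSize);
    buffer.drainTo(batch, batchSize);
    if (!batch.isEmpty()) {
      sender.accept(batch);
    }
  }

  /** Starts a timer thread that flushes partial batches every intervalSeconds. */
  ScheduledExecutorService startPeriodicFlush(long intervalSeconds) {
    ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    timer.scheduleAtFixedRate(this::flush, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
    return timer;
  }
}
```

In this sketch the three constructor and timer arguments play the roles of the `buffer-size`, `batch-size`, and `interval-seconds` settings listed above. Dropping events when the buffer is full is a choice made for the example; the message does not say how the real implementation handles overflow.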
[jira] [Updated] (YARN-10201) Make AMRMProxyPolicy aware of SC load
[ https://issues.apache.org/jira/browse/YARN-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Young Chen updated YARN-10201: -- Attachment: YARN-10201.v6.patch > Make AMRMProxyPolicy aware of SC load > - > > Key: YARN-10201 > URL: https://issues.apache.org/jira/browse/YARN-10201 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Young Chen >Assignee: Young Chen >Priority: Major > Attachments: YARN-10201.v0.patch, YARN-10201.v1.patch, > YARN-10201.v2.patch, YARN-10201.v3.patch, YARN-10201.v4.patch, > YARN-10201.v5.patch, YARN-10201.v6.patch > > > LocalityMulticastAMRMProxyPolicy is currently unaware of SC load when > splitting resource requests. We propose changes to the policy so that it > receives feedback from SCs and can load balance requests across the federated > cluster.
[jira] [Commented] (YARN-8234) Improve RM system metrics publisher's performance by pushing events to timeline server in batch
[ https://issues.apache.org/jira/browse/YARN-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089429#comment-17089429 ] Gabor Bota commented on YARN-8234: -- The Hadoop release 3.1.4 code freeze is today (https://cwiki.apache.org/confluence/display/HADOOP/Roadmap). Please close this issue today or move it to a different target version. Thank you! > Improve RM system metrics publisher's performance by pushing events to > timeline server in batch > --- > > Key: YARN-8234 > URL: https://issues.apache.org/jira/browse/YARN-8234 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, timelineserver >Affects Versions: 2.8.3 >Reporter: Hu Ziqian >Assignee: Hu Ziqian >Priority: Critical > Attachments: YARN-8234-branch-2.8.3.001.patch, > YARN-8234-branch-2.8.3.002.patch, YARN-8234-branch-2.8.3.003.patch, > YARN-8234-branch-2.8.3.004.patch, YARN-8234.001.patch, YARN-8234.002.patch, > YARN-8234.003.patch, YARN-8234.004.patch > > > When the system metrics publisher is enabled, RM pushes events to the timeline > server via a RESTful API. If the cluster load is heavy, many events are sent to > the timeline server and the timeline server's event handler thread gets locked. > YARN-7266 describes this problem in detail. Because of the lock, > the timeline server can't receive events as fast as they are generated in RM, and lots of > timeline events stay in RM's memory. Finally, those events will consume all > RM's memory and RM will start a full GC (which causes a JVM stop-the-world pause and > a timeout from RM to ZooKeeper) or even hit an OOM. > The main problem here is that the timeline server can't receive RM's events > as fast as they are generated. Currently, the RM system metrics publisher puts only one event > in a request, and most of the time is spent handling HTTP headers or > the network connection on the timeline side. Only a little time is spent on processing > the timeline event itself, which is the valuable part.
> In this issue, we add a buffer to the system metrics publisher and let the publisher > send events to the timeline server in batches, several events per request. With the batch > size set to 1000, in our experiment the rate at which the timeline server receives > events improved 100x. We have deployed this feature in our production > environment, which accepts 2 apps in one hour, and it works fine. > We add the following configuration: > * yarn.resourcemanager.system-metrics-publisher.batch-size: the number of > events the system metrics publisher sends in one request. The default value is 1000. > * yarn.resourcemanager.system-metrics-publisher.buffer-size: the size of the > event buffer in the system metrics publisher. > * yarn.resourcemanager.system-metrics-publisher.interval-seconds: When > batch publishing is enabled, we must avoid the publisher waiting for a batch > to fill up and holding events in the buffer for a long time. So we add another > thread which sends the events in the buffer periodically. This config sets the > interval of that periodic sending thread. The default value is 60s.
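The buffering scheme described in YARN-8234 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the attached patches: the class name BatchingPublisher, its constructor parameters, and the sender callback (which stands in for one REST request to the timeline server) are all hypothetical. The three numeric arguments only mirror the roles of the buffer-size, batch-size, and interval-seconds settings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch of batch publishing; not taken from the YARN-8234 patch. */
class BatchingPublisher<E> implements AutoCloseable {
  private final BlockingQueue<E> buffer;   // bounded buffer (role of buffer-size)
  private final int batchSize;             // max events per request (role of batch-size)
  private final Consumer<List<E>> sender;  // assumed stand-in for one request to the server
  private final ScheduledExecutorService timer =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "batch-flush");
        t.setDaemon(true);                 // do not keep the JVM alive for the timer
        return t;
      });

  BatchingPublisher(int bufferSize, int batchSize, long intervalSeconds,
                    Consumer<List<E>> sender) {
    this.buffer = new LinkedBlockingQueue<>(bufferSize);
    this.batchSize = batchSize;
    this.sender = sender;
    // Periodic flush so a partially filled batch is not held for a long time.
    // Note: if the sender throws, later timer runs are cancelled; a real
    // implementation would need error handling here.
    timer.scheduleWithFixedDelay(this::flush, intervalSeconds,
        intervalSeconds, TimeUnit.SECONDS);
  }

  /** Queues an event; returns false if the buffer is full and the event was dropped. */
  boolean publish(E event) {
    boolean accepted = buffer.offer(event);
    if (buffer.size() >= batchSize) {
      flush();
    }
    return accepted;
  }

  /** Drains at most batchSize events and hands them to the sender as one batch. */
  synchronized void flush() {
    List<E> batch = new ArrayList<>(batchSize);
    buffer.drainTo(batch, batchSize);
    if (!batch.isEmpty()) {
      sender.accept(batch);
    }
  }

  /** Stops the timer and sends whatever is still buffered. */
  @Override
  public void close() {
    timer.shutdownNow();
    while (!buffer.isEmpty()) {
      flush();
    }
  }
}
```

In this sketch a full buffer drops the new event and reports it through the return value of publish; whether the real publisher blocks, drops, or does something else in that case is not stated in the issue text.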
[jira] [Commented] (YARN-8417) Should skip passing HDFS_HOME, HADOOP_CONF_DIR, JAVA_HOME, etc. to Docker container.
[ https://issues.apache.org/jira/browse/YARN-8417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089422#comment-17089422 ] Gabor Bota commented on YARN-8417: -- The Hadoop release 3.1.4 code freeze is today (https://cwiki.apache.org/confluence/display/HADOOP/Roadmap). Please close this issue today or move it to a different target branch. Thank you! > Should skip passing HDFS_HOME, HADOOP_CONF_DIR, JAVA_HOME, etc. to Docker > container. > > > Key: YARN-8417 > URL: https://issues.apache.org/jira/browse/YARN-8417 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Priority: Critical > > Currently, YARN NM passes the JAVA_HOME, HDFS_HOME, and CLASSPATH environment variables before > launching a Docker container, whether or not ENTRY_POINT is used. This will > overwrite environment variables defined inside the Dockerfile (by using {{ENV}}). For > a Docker container, it doesn't make sense to pass JAVA_HOME, > HDFS_HOME, etc., because the Docker image either has a separate Java/Hadoop > installation or has one mounted at exactly the same directory as on the host machine.
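One way to picture the behaviour YARN-8417 asks for is a filter applied to the environment map before a Docker container is launched. The sketch below is an illustration under assumptions, not NodeManager code: the class DockerEnvFilter, the method forDocker, and the exact set of skipped names are hypothetical (the issue names JAVA_HOME, HDFS_HOME, HADOOP_CONF_DIR, and CLASSPATH, followed by "etc.").

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; the real NodeManager launch path is not shown in the issue. */
class DockerEnvFilter {
  // Assumed list of host-specific variables that should not reach the container.
  private static final Set<String> HOST_ONLY = new HashSet<>(Arrays.asList(
      "JAVA_HOME", "HDFS_HOME", "HADOOP_CONF_DIR", "CLASSPATH"));

  /** Returns a copy of env without the host-only variables, keeping insertion order. */
  static Map<String, String> forDocker(Map<String, String> env) {
    Map<String, String> filtered = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : env.entrySet()) {
      if (!HOST_ONLY.contains(e.getKey())) {
        filtered.put(e.getKey(), e.getValue());
      }
    }
    return filtered;
  }
}
```

With such a filter in place, values set through ENV in the Dockerfile would no longer be overwritten by host values, which is the effect the issue describes.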
[jira] [Commented] (YARN-8257) Native service should automatically adding escapes for environment/launch cmd before sending to YARN
[ https://issues.apache.org/jira/browse/YARN-8257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089423#comment-17089423 ] Gabor Bota commented on YARN-8257: -- The Hadoop release 3.1.4 code freeze is today (https://cwiki.apache.org/confluence/display/HADOOP/Roadmap). Please close this issue today or move it to a different target branch. Thank you! > Native service should automatically adding escapes for environment/launch cmd > before sending to YARN > > > Key: YARN-8257 > URL: https://issues.apache.org/jira/browse/YARN-8257 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-native-services >Reporter: Wangda Tan >Assignee: Gour Saha >Priority: Critical > > Noticed this issue while using native service: > Basically, when a string for an environment variable or launch command contains chars like > ", /, `: it needs to be escaped twice. > The first escape comes from the JSON spec: because JSON accepts only double quotes around strings, > an embedded double quote needs an escape. > The second escape comes from container launch; what we do for the command line is: > (ContainerLaunch.java) > {code:java} > line("exec /bin/bash -c \"", StringUtils.join(" ", command), "\"");{code} > And for environment: > {code:java} > line("export ", key, "=\"", value, "\"");{code} > An example of launch_command: > {code:java} > "launch_command": "export CLASSPATH=\\`\\$HADOOP_HDFS_HOME/bin/hadoop > classpath --glob\\`"{code} > And an example of environment: > {code:java} > "TF_CONFIG" : "{\\\"cluster\\\": {\\\"master\\\": > [\\\"master-0.distributed-tf.ambari-qa.tensorflow.site:8000\\\"], \\\"ps\\\": > [\\\"ps-0.distributed-tf.ambari-qa.tensorflow.site:8000\\\"], \\\"worker\\\": > [\\\"worker-0.distributed-tf.ambari-qa.tensorflow.site:8000\\\"]}, > \\\"task\\\": {\\\"type\\\":\\\"${COMPONENT_NAME}\\\", > \\\"index\\\":${COMPONENT_ID}}, \\\"environment\\\":\\\"cloud\\\"}",{code} > To improve usability, I think we should auto-escape the input string once. 
> For example, if the user specified > {code} > "TF_CONFIG": "\"key\"" > {code} > We will automatically escape it to: > {code} > "TF_CONFIG": \\\"key\\\" > {code}
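The second level of escaping discussed in YARN-8257 (making a value safe inside the double quotes that ContainerLaunch writes around it) could be automated along these lines. This is only a sketch under assumptions: ShellEscaper, escapeForDoubleQuotes, and exportLine are hypothetical names, and the choice to escape backslash, double quote, and backtick while leaving "$" untouched is an assumption, made because the examples in the issue contain both an escaped "$" and unescaped ${COMPONENT_NAME} placeholders.

```java
/** Hypothetical helper illustrating one-time auto-escaping; not YARN code. */
class ShellEscaper {
  /**
   * Prefixes backslash, double quote, and backtick with a backslash so the
   * value can be placed between double quotes in a bash script line.
   * "$" is deliberately left alone (assumption, see the lead-in).
   */
  static String escapeForDoubleQuotes(String raw) {
    StringBuilder out = new StringBuilder(raw.length() + 8);
    for (int i = 0; i < raw.length(); i++) {
      char c = raw.charAt(i);
      if (c == '\\' || c == '"' || c == '`') {
        out.append('\\');
      }
      out.append(c);
    }
    return out.toString();
  }

  /** Builds a line shaped like the export statement quoted in the issue. */
  static String exportLine(String key, String value) {
    return "export " + key + "=\"" + escapeForDoubleQuotes(value) + "\"";
  }
}
```

With this helper the user would only supply the JSON-level escaping, and the launch-script escaping would be added once on the service side.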
[jira] [Commented] (YARN-8634) Implement yarn cmd for killing all applications in the cluster
[ https://issues.apache.org/jira/browse/YARN-8634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089344#comment-17089344 ] Hadoop QA commented on YARN-8634: -
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red} YARN-8634 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\ \\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8634 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12934764/YARN-8634-001.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25913/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
> Implement yarn cmd for killing all applications in the cluster > -- > > Key: YARN-8634 > URL: https://issues.apache.org/jira/browse/YARN-8634 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-8634-001.patch > > > If we want to kill all running applications, we currently need to give > all application IDs. So instead we can have a command which can kill all > applications. > Command would be *yarn application -killall*
[jira] [Comment Edited] (YARN-8634) Implement yarn cmd for killing all applications in the cluster
[ https://issues.apache.org/jira/browse/YARN-8634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089341#comment-17089341 ] Surendra Singh Lilhore edited comment on YARN-8634 at 4/22/20, 6:39 AM: [~BilwaST], thanks for the Jira. I suggest adding a {{debug}} command, as HDFS does. These commands are for advanced users only. [https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#Debug_Commands] You can add the command below: {noformat} ./yarn debug application -killAll [-queue ] {noformat} This will let you kill all applications, or all applications in a queue. You can also ask the user for confirmation on the console before executing this command. was (Author: surendrasingh): [~BilwaST], thanks for the Jira. I suggest adding a {{debug}} command, as HDFS does. These commands are for advanced users only. [https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#Debug_Commands] You can add the command below: {noformat} ./yarn debug job -killAll [-queue ] {noformat} This will let you kill all jobs, or all jobs in a queue. You can also ask the user for confirmation on the console before executing this command. > Implement yarn cmd for killing all applications in the cluster > -- > > Key: YARN-8634 > URL: https://issues.apache.org/jira/browse/YARN-8634 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-8634-001.patch > > > If we want to kill all running applications, we currently need to give > all application IDs. So instead we can have a command which can kill all > applications. > Command would be *yarn application -killall*
[jira] [Commented] (YARN-8634) Implement yarn cmd for killing all applications in the cluster
[ https://issues.apache.org/jira/browse/YARN-8634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089341#comment-17089341 ] Surendra Singh Lilhore commented on YARN-8634: -- [~BilwaST], thanks for the Jira. I suggest adding a {{debug}} command, as HDFS does. These commands are for advanced users only. [https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#Debug_Commands] You can add the command below: {noformat} ./yarn debug job -killAll [-queue ] {noformat} This will let you kill all jobs, or all jobs in a queue. You can also ask the user for confirmation on the console before executing this command. > Implement yarn cmd for killing all applications in the cluster > -- > > Key: YARN-8634 > URL: https://issues.apache.org/jira/browse/YARN-8634 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bilwa S T >Assignee: Bilwa S T >Priority: Major > Attachments: YARN-8634-001.patch > > > If we want to kill all running applications, we currently need to give > all application IDs. So instead we can have a command which can kill all > applications. > Command would be *yarn application -killall*
[jira] [Commented] (YARN-10237) Add isAbsoluteResource config for queue in scheduler response
[ https://issues.apache.org/jira/browse/YARN-10237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089312#comment-17089312 ] Prabhu Joseph commented on YARN-10237: -- [~sunil.gov...@gmail.com] [~snemeth] Can you review this Jira when you get time? Thanks. > Add isAbsoluteResource config for queue in scheduler response > - > > Key: YARN-10237 > URL: https://issues.apache.org/jira/browse/YARN-10237 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 3.4.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-10237-001.patch, YARN-10237-002.patch > > > Internal config management tools have difficulty managing the capacity > scheduler queue configs if a user toggles a queue between Absolute Resource and > Percentage mode. > This Jira exposes, as part of the scheduler response, whether a queue is configured > with absolute resources.