[jira] [Updated] (YARN-4108) CapacityScheduler: Improve preemption to only kill containers that would satisfy the incoming request
[ https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4108: - Summary: CapacityScheduler: Improve preemption to only kill containers that would satisfy the incoming request (was: CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request) > CapacityScheduler: Improve preemption to only kill containers that would > satisfy the incoming request > - > > Key: YARN-4108 > URL: https://issues.apache.org/jira/browse/YARN-4108 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-4108-design-doc-V3.pdf, > YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.1.patch, > YARN-4108.10.patch, YARN-4108.11.patch, YARN-4108.2.patch, YARN-4108.3.patch, > YARN-4108.4.patch, YARN-4108.5.patch, YARN-4108.6.patch, YARN-4108.7.patch, > YARN-4108.8.patch, YARN-4108.9.patch, YARN-4108.poc.1.patch, > YARN-4108.poc.2-WIP.patch, YARN-4108.poc.3-WIP.patch, > YARN-4108.poc.4-WIP.patch > > > This is a sibling JIRA of YARN-2154. We should make sure container preemption > is more effective. > *Requirements:* > 1) Can handle the case of user-limit preemption > 2) Can handle the case of resource placement requirements, such as: hard-locality > (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I > don't want to use rack1 and host\[1-3\]) > 3) Can handle preemption within a queue: cross-user preemption (YARN-2113), > cross-application preemption (such as priority-based (YARN-1963) / > fairness-based (YARN-3319)). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4390) Consider container request size during CS preemption
[ https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199970#comment-15199970 ] Sunil G commented on YARN-4390: --- That's very good, +1. If this is considering other parameters such as locality, priority, etc., maybe the title and description can be changed. > Consider container request size during CS preemption > > > Key: YARN-4390 > URL: https://issues.apache.org/jira/browse/YARN-4390 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.0.0, 2.8.0, 2.7.3 >Reporter: Eric Payne >Assignee: Wangda Tan > > There are multiple reasons why preemption could unnecessarily preempt > containers. One is that an app could be requesting a large container (say > 8-GB), and the preemption monitor could conceivably preempt multiple > containers (say 8, 1-GB containers) in order to fill the large container > request. These smaller containers would then be rejected by the requesting AM > and potentially given right back to the preempted app. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing
[ https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200544#comment-15200544 ] Junping Du commented on YARN-4837: -- Thanks for calling it out, [~vinodkv]! I also feel uncomfortable about the existing AM blacklisting mechanism (agreed that the name is a bit confusing :(). I put some thoughts (and a design) on YARN-4576 before. Maybe we should check how to consolidate these JIRAs? > User facing aspects of 'AM blacklisting' feature need fixing > > > Key: YARN-4837 > URL: https://issues.apache.org/jira/browse/YARN-4837 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > > Was reviewing the user-facing aspects that we are releasing as part of 2.8.0. > Looking at the 'AM blacklisting feature', I see several things to be fixed > before we release it in 2.8.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing
[ https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202044#comment-15202044 ] Karthik Kambatla commented on YARN-4837: I am in favor of making these changes if it improves usability by making all this handling transparent. > User facing aspects of 'AM blacklisting' feature need fixing > > > Key: YARN-4837 > URL: https://issues.apache.org/jira/browse/YARN-4837 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > > Was reviewing the user-facing aspects that we are releasing as part of 2.8.0. > Looking at the 'AM blacklisting feature', I see several things to be fixed > before we release it in 2.8.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4815) ATS 1.5 timelineclient impl tries to create attempt directory for every event call
[ https://issues.apache.org/jira/browse/YARN-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199541#comment-15199541 ] Junping Du commented on YARN-4815: -- bq. My concern is that we do not need to have separate caches for each use cases if they can be modeled by Guava. I see. There are pros and cons to using a third-party library like Guava. The good thing is that it simplifies the implementation; for example, we don't need to handle synchronization ourselves in this case. However, the general concern is that the consistency (interface and behavior) of its APIs is not that dependable (even the Java API has this problem, but there we have no other choice), and it is also hard to track bug fixes across different versions. I don't have a strong preference for using Guava or not, but that by itself shouldn't hold the patch up, as there is no consensus in the community yet. Any other comments on the patch? > ATS 1.5 timelineclient impl tries to create attempt directory for every event > call > > > Key: YARN-4815 > URL: https://issues.apache.org/jira/browse/YARN-4815 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4815.1.patch > > > The ATS 1.5 timelineclient impl tries to create the attempt directory on every event > call. Since one directory-creation call per attempt is enough, this is > causing a perf issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
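To make the Guava option under discussion concrete, here is a minimal sketch of the kind of loading cache being debated. The class and helper names (AttemptDirCache, createDirOnFs) are hypothetical and not part of the actual patch:

{code:java}
import java.util.concurrent.ExecutionException;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

public class AttemptDirCache {
  // Guava handles the synchronization concern mentioned above: concurrent
  // get() calls for the same attempt block until the single load() finishes.
  private final LoadingCache<String, Boolean> dirCreated =
      CacheBuilder.newBuilder()
          .maximumSize(1000) // bound memory for long-running clients
          .build(new CacheLoader<String, Boolean>() {
            @Override
            public Boolean load(String attemptDir) {
              return createDirOnFs(attemptDir); // runs at most once per attempt
            }
          });

  /** A no-op after the first successful call for a given attempt. */
  public void ensureAttemptDir(String attemptDir) throws ExecutionException {
    dirCreated.get(attemptDir);
  }

  private boolean createDirOnFs(String attemptDir) {
    // Placeholder for the real FileSystem.mkdirs() call.
    return true;
  }
}
{code}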
[jira] [Updated] (YARN-4746) yarn web services should convert parse failures of appId to 400
[ https://issues.apache.org/jira/browse/YARN-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4746: --- Attachment: 0004-YARN-4746.patch Attaching the latest patch with the testcase fix > yarn web services should convert parse failures of appId to 400 > --- > > Key: YARN-4746 > URL: https://issues.apache.org/jira/browse/YARN-4746 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Priority: Minor > Attachments: 0001-YARN-4746.patch, 0002-YARN-4746.patch, > 0003-YARN-4746.patch, 0003-YARN-4746.patch, 0004-YARN-4746.patch > > > Somewhere in my WS API tests I'm seeing an error with exception > conversion of a bad app ID sent in as an argument to a GET. I know it's in > ATS, but a scan of the core RM web services implies the same problem. > {{WebServices.parseApplicationId()}} uses {{ConverterUtils.toApplicationId}} > to convert an argument; this throws IllegalArgumentException, which is then > handled somewhere by jetty as a 500 error. > In fact, it's a bad argument, which should be handled by returning a 400. > This can be done by catching the raised exception and explicitly converting it -- This message was sent by Atlassian JIRA (v6.3.4#6332)
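A minimal sketch of the conversion the description asks for, assuming a hypothetical AppIdParser helper; YARN's web layer maps BadRequestException to an HTTP 400:

{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.webapp.BadRequestException;

public final class AppIdParser {
  private AppIdParser() {}

  /** Parse an appId argument, surfacing bad input as a 400 instead of a 500. */
  public static ApplicationId parse(String appIdStr) {
    try {
      return ConverterUtils.toApplicationId(appIdStr);
    } catch (IllegalArgumentException e) {
      // Without this catch, the IllegalArgumentException escapes to the
      // servlet container and comes back to the caller as a 500 error.
      throw new BadRequestException("Invalid application id: " + appIdStr);
    }
  }
}
{code}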
[jira] [Commented] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing
[ https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200838#comment-15200838 ] Rohith Sharma K S commented on YARN-4837: - Adding to the discussion about AM blacklisting, there are a few corner cases where an application can hang forever, especially if, after blacklisting some nodes, other nodes are removed from the cluster. In such cases, there is no mechanism to remove blacklisted nodes for the AM. See YARN-4685 for one such scenario. > User facing aspects of 'AM blacklisting' feature need fixing > > > Key: YARN-4837 > URL: https://issues.apache.org/jira/browse/YARN-4837 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > > Was reviewing the user-facing aspects that we are releasing as part of 2.8.0. > Looking at the 'AM blacklisting feature', I see several things to be fixed > before we release it in 2.8.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200518#comment-15200518 ] Li Lu commented on YARN-4517: - Hi [~varun_saxena], thanks for the note! Right now I'm fine with moving forward as a POC and keeping all related issues tracked in new JIRAs. > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch > > > We need nodes page added to next generation web UI, similar to existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.
[ https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shiwei Guo updated YARN-3933: - Attachment: YARN-3933.003.patch > Race condition when calling AbstractYarnScheduler.completedContainer. > - > > Key: YARN-3933 > URL: https://issues.apache.org/jira/browse/YARN-3933 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.6.0, 2.7.0, 2.5.2, 2.7.1 >Reporter: Lavkesh Lahngir >Assignee: Shiwei Guo > Attachments: YARN-3933.001.patch, YARN-3933.002.patch, > YARN-3933.003.patch > > > In our cluster we are seeing available memory and cores go negative. > Initial inspection: > Scenario no. 1: > In the capacity scheduler, the method allocateContainersToNode() checks > whether there are excess container reservations for an application; if they > are no longer needed, it calls queue.completedContainer(), which causes > resources to go negative because they were never assigned in the first place. > I am still looking through the code. Can somebody suggest how to simulate > excess container assignments? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
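As an illustration of one way this kind of double-completion can be fenced off (this is not the attached patch's actual fix), a container release can be made idempotent with an atomic first-caller check:

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ContainerReleaseGuard {
  private final ConcurrentMap<String, Boolean> completed =
      new ConcurrentHashMap<String, Boolean>();

  /**
   * Returns true only for the first thread to complete a container; racing
   * callers see false and skip the resource release, so available memory
   * and cores cannot be decremented twice and driven negative.
   */
  public boolean markCompleted(String containerId) {
    return completed.putIfAbsent(containerId, Boolean.TRUE) == null;
  }
}
{code}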
[jira] [Commented] (YARN-4812) TestFairScheduler#testContinuousScheduling fails intermittently
[ https://issues.apache.org/jira/browse/YARN-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199629#comment-15199629 ] Hudson commented on YARN-4812: -- FAILURE: Integrated in Hadoop-trunk-Commit #9472 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9472/]) YARN-4812. TestFairScheduler#testContinuousScheduling fails (kasha: rev f84af8bd588763c4e99305742d8c86ed596e8359) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestContinuousScheduling.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java > TestFairScheduler#testContinuousScheduling fails intermittently > --- > > Key: YARN-4812 > URL: https://issues.apache.org/jira/browse/YARN-4812 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-4812-1.patch > > > This test has failed in the past, and there seem to be more issues. > {noformat} > java.lang.AssertionError: expected:<2> but was:<1> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testContinuousScheduling(TestFairScheduler.java:3816) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4607) AppAttempt page TotalOutstandingResource Requests table support pagination
[ https://issues.apache.org/jira/browse/YARN-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4607: --- Attachment: 0001-YARN-4607.patch [~rohithsharma] Could you please review the attached patch? > AppAttempt page TotalOutstandingResource Requests table support pagination > -- > > Key: YARN-4607 > URL: https://issues.apache.org/jira/browse/YARN-4607 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Minor > Attachments: 0001-YARN-4607.patch > > > Simulating a cluster with 10 racks of 100 nodes using SLS, the Total > Outstanding Resource Requests table consumes the complete page. It would be > good to support pagination for the table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4839) ResourceManager deadlock between RMAppAttemptImpl and SchedulerApplicationAttempt
[ https://issues.apache.org/jira/browse/YARN-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201687#comment-15201687 ] Jason Lowe commented on YARN-4839: -- bq. Could this be the same issue as pointed out by YARN-4247? It is essentially the same core issue, but it wasn't caused by YARN-2005. We don't have that change in our build, but we do have YARN-3116. That's the first time getMasterContainer was called from SchedulerApplicationAttempt. Without the side-effect of YARN-3361 it leads to a deadlock since getMasterContainer tries to grab the lock. > ResourceManager deadlock between RMAppAttemptImpl and > SchedulerApplicationAttempt > - > > Key: YARN-4839 > URL: https://issues.apache.org/jira/browse/YARN-4839 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Jason Lowe >Priority: Blocker > > Hit a deadlock in the ResourceManager as one thread was holding the > SchedulerApplicationAttempt lock and trying to call > RMAppAttemptImpl.getMasterContainer while another thread had the > RMAppAttemptImpl lock and was trying to call > SchedulerApplicationAttempt.getResourceUsageReport. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
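A stripped-down illustration of the lock-ordering cycle described above (generic locks, not the actual RM classes): each path takes the two locks in the opposite order, so two threads can block each other forever.

{code:java}
public class LockOrderDeadlockDemo {
  private final Object schedulerAttemptLock = new Object();
  private final Object appAttemptLock = new Object();

  // Analogous to SchedulerApplicationAttempt.getResourceUsageReport():
  // holds its own lock, then reaches into the app attempt.
  void reportPath() {
    synchronized (schedulerAttemptLock) {
      synchronized (appAttemptLock) {
        // build the usage report ...
      }
    }
  }

  // Analogous to getMasterContainer() being called while the app-attempt
  // lock is held: the acquisition order is reversed.
  void masterContainerPath() {
    synchronized (appAttemptLock) {
      synchronized (schedulerAttemptLock) {
        // read the master container ...
      }
    }
  }
}
{code}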
[jira] [Commented] (YARN-4767) Network issues can cause persistent RM UI outage
[ https://issues.apache.org/jira/browse/YARN-4767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200150#comment-15200150 ] Daniel Templeton commented on YARN-4767: Nevermind. Looks like I have a new round of checkstyle issues to resolve, and I broke some tests. > Network issues can cause persistent RM UI outage > > > Key: YARN-4767 > URL: https://issues.apache.org/jira/browse/YARN-4767 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.9.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > Attachments: YARN-4767.001.patch, YARN-4767.002.patch, > YARN-4767.003.patch > > > If a network issue causes an AM web app to resolve the RM proxy's address to > something other than what's listed in the allowed proxies list, the > AmIpFilter will 302 redirect the RM proxy's request back to the RM proxy. > The RM proxy will then consume all available handler threads connecting to > itself over and over, resulting in an outage of the web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
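One hedged sketch of a guard against the redirect loop described in this issue (the class is illustrative, not the AmIpFilter change itself): resolve every configured proxy host to all of its addresses, so a request arriving from any of them is recognized and never redirected back.

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashSet;
import java.util.Set;

public final class ProxyAddressResolver {
  private ProxyAddressResolver() {}

  /** Expand proxy hostnames into the full set of addresses they resolve to. */
  public static Set<String> resolveAll(Set<String> proxyHosts) {
    Set<String> addrs = new HashSet<String>();
    for (String host : proxyHosts) {
      try {
        for (InetAddress addr : InetAddress.getAllByName(host)) {
          addrs.add(addr.getHostAddress());
        }
      } catch (UnknownHostException e) {
        // Skip hosts that fail to resolve now; re-resolve on the next
        // refresh rather than redirecting a request that may in fact be
        // coming from the proxy itself.
      }
    }
    return addrs;
  }
}
{code}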
[jira] [Commented] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started
[ https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201886#comment-15201886 ] Eric Payne commented on YARN-4686: -- [~ebadger], Thanks for the work you have done resolving this issue. I merged {{YARN-4686.006.patch}} to trunk and cherry-picked to branch-2 and branch-2.8. There were enough conflicts with the cherry-pick to branch-2.7 that I think it would be best if you provided a separate patch. The way it is written now, it has dependencies on JIRAs that were not backported to 2.7 (e.g., YARN-41). > MiniYARNCluster.start() returns before cluster is completely started > > > Key: YARN-4686 > URL: https://issues.apache.org/jira/browse/YARN-4686 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch, > YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch, > YARN-4686.005.patch, YARN-4686.006.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started
[ https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200203#comment-15200203 ] Jason Lowe commented on YARN-4686: -- bq. Still interested in if Jason Lowe or Karthik Kambatla have comments, especially about removal of the (extra) threads in startResourceManager and serviceStart methods. The thread removal is key, IMHO. MiniYARNCluster was a source of flaky tests because those threads allowed the mini cluster to return from its start method before its subcomponents completed their start methods. That means tests that assumed the cluster was started after cluster.start() were making a bad assumption. Removing these threads means the cluster really is started after the start method, assuming the RM and NM start methods correctly return only after they have started. +1 patch looks good to me. I'm OK either way on the blind or checked transition to active since it's a fast no-op in the non-HA case. It will generate an extra "Already in active state" info message in the test logs but is otherwise benign. > MiniYARNCluster.start() returns before cluster is completely started > > > Key: YARN-4686 > URL: https://issues.apache.org/jira/browse/YARN-4686 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch, > YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch, > YARN-4686.005.patch, YARN-4686.006.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
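The thread-removal point can be shown with a generic sketch (not the MiniYARNCluster diff itself): a background-thread start lets serviceStart() return early, while a synchronous start makes cluster.start() a real barrier.

{code:java}
public class StartBarrierDemo {
  // Before: fire-and-forget. serviceStart() returns while the component is
  // still coming up, so tests that assume a started cluster race with it.
  void startAsync(final Runnable componentStart) {
    new Thread(componentStart, "component-starter").start();
  }

  // After: the caller blocks until the component's start method returns,
  // assuming that method itself only returns once the component is up.
  void startSync(Runnable componentStart) {
    componentStart.run();
  }
}
{code}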
[jira] [Commented] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started
[ https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201876#comment-15201876 ] Hudson commented on YARN-4686: -- FAILURE: Integrated in Hadoop-trunk-Commit #9474 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9474/]) YARN-4686. MiniYARNCluster.start() returns before cluster is completely (epayne: rev 92b7e0d41302b6b110927f99de5c2b4a4a93c5fd) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/TestMiniYARNClusterForHA.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java > MiniYARNCluster.start() returns before cluster is completely started > > > Key: YARN-4686 > URL: https://issues.apache.org/jira/browse/YARN-4686 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch, > YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch, > YARN-4686.005.patch, YARN-4686.006.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4685) AM blacklisting result in application to get hanged
[ https://issues.apache.org/jira/browse/YARN-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201117#comment-15201117 ] Vinod Kumar Vavilapalli commented on YARN-4685: --- There are simpler cases which are busted too. For example, if an AM failed on a node, this node will *never* be considered again for launching this app's AM as long as it is within the blacklist threshold. In a busy cluster where this node continues to be the only one free for a while, we will keep on skipping the machine. > AM blacklisting result in application to get hanged > --- > > Key: YARN-4685 > URL: https://issues.apache.org/jira/browse/YARN-4685 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > > AM blacklist addition or removal is updated only when the RMAppAttempt is > scheduled, i.e. {{RMAppAttemptImpl#ScheduleTransition#transition}}. But once > the attempt is scheduled, any removeNode/addNode in the cluster is not > propagated to {{BlackListManager#refreshNodeHostCount}}. This leads > BlackListManager to operate on a stale NM count, and the application stays in > the ACCEPTED state and waits forever even if we add more nodes to the cluster. > The solution is to update the BlacklistManager on every > {{RMAppAttemptImpl#AMContainerAllocatedTransition#transition}} call. This > ensures that any addition/removal of nodes is propagated to the > BlacklistManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
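A simplified sketch mirroring the names used in the description (the field and method shapes are illustrative, not the real blacklist manager): unless refreshNodeHostCount is driven by current cluster membership, the disable threshold is computed against a stale cluster size.

{code:java}
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class BlacklistThresholdSketch {
  private final double disableFailureThreshold; // e.g. 0.8 of the cluster
  private int numberOfNodeManagerHosts;         // must be kept fresh
  private final Set<String> blacklist = new HashSet<String>();

  public BlacklistThresholdSketch(int nodes, double threshold) {
    this.numberOfNodeManagerHosts = nodes;
    this.disableFailureThreshold = threshold;
  }

  /** Must be called as the cluster grows/shrinks, not just at schedule time. */
  public void refreshNodeHostCount(int currentNodes) {
    this.numberOfNodeManagerHosts = currentNodes;
  }

  public void addNode(String host) {
    blacklist.add(host);
  }

  public Set<String> getBlacklistAdditions() {
    // Once the blacklist would cover too much of the *current* cluster,
    // stop applying it; otherwise the AM may never be schedulable again.
    if (blacklist.size() >= disableFailureThreshold * numberOfNodeManagerHosts) {
      return Collections.emptySet();
    }
    return blacklist;
  }
}
{code}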
[jira] [Commented] (YARN-3926) Extend the YARN resource model for easier resource-type management and profiles
[ https://issues.apache.org/jira/browse/YARN-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197750#comment-15197750 ] Arun Suresh commented on YARN-3926: --- bq. RM subset - as long as the NM resource types are a superset of the RM's, the handshake proceeds - I believe this will address your concerns. Correct? That should work... But I feel maybe *allow mismatch* should be the default: if the NM has a superset of the RM's resource types, the extras will just be ignored; if a subset, then for those specific resource types the RM will assign a 0 value for the NM. That would facilitate my other point: allow NMs to dynamically advertise new / disable existing resource types (the NM would learn of these new types via some admin API or self-discovery) as part of the NM heartbeat. Similarly, on the RM side, if a new resource advertised by the NM is unknown to the RM, it just ignores it. We can also add an admin API on the RM to add / remove allowable resource types on the fly. > Extend the YARN resource model for easier resource-type management and > profiles > --- > > Key: YARN-3926 > URL: https://issues.apache.org/jira/browse/YARN-3926 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: Proposal for modifying resource model and profiles.pdf > > > Currently, there are efforts to add support for various resource-types such > as disk(YARN-2139), network(YARN-2140), and HDFS bandwidth(YARN-2681). These > efforts all aim to add support for a new resource type and are fairly > involved efforts. In addition, once support is added, it becomes harder for > users to specify the resources they need. All existing jobs have to be > modified, or have to use the minimum allocation. > This ticket is a proposal to extend the YARN resource model to a more > flexible model which makes it easier to support additional resource-types. It > also considers the related aspect of “resource profiles” which allow users to > easily specify the various resources they need for any given container. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
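A hedged sketch of the handshake policy being discussed (the enum and method are illustrative, not the proposal's API): with ALLOW_MISMATCH, NM-only types are ignored and RM types missing on the NM get a 0 capacity.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

enum MismatchPolicy { STRICT, NM_SUPERSET_OK, ALLOW_MISMATCH }

final class ResourceTypeHandshake {
  static Map<String, Long> reconcile(Set<String> rmTypes,
      Map<String, Long> nmReported, MismatchPolicy policy) {
    Map<String, Long> accepted = new HashMap<String, Long>();
    for (String type : rmTypes) {
      Long value = nmReported.get(type);
      if (value == null) {
        if (policy != MismatchPolicy.ALLOW_MISMATCH) {
          // STRICT and NM_SUPERSET_OK both require the NM to cover RM types.
          throw new IllegalStateException("NM missing resource type " + type);
        }
        value = 0L; // ALLOW_MISMATCH: this NM contributes none of this type
      }
      accepted.put(type, value);
    }
    // Types the NM reports beyond rmTypes are simply dropped (ignored).
    return accepted;
  }
}
{code}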
[jira] [Updated] (YARN-1508) Rename ResourceOption and document resource over-commitment cases
[ https://issues.apache.org/jira/browse/YARN-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-1508: - Priority: Major (was: Minor) > Rename ResourceOption and document resource over-commitment cases > - > > Key: YARN-1508 > URL: https://issues.apache.org/jira/browse/YARN-1508 > Project: Hadoop YARN > Issue Type: Sub-task > Components: graceful, nodemanager, scheduler >Reporter: Junping Du >Assignee: Junping Du > > Per Vinod's comment in > YARN-312(https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087) > and Bikas' comment in > YARN-311(https://issues.apache.org/jira/browse/YARN-311?focusedCommentId=13848615), > the name ResourceOption is not easy to understand. Also, we need to document > resource over-commitment timing and use cases in more detail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200154#comment-15200154 ] Varun Saxena commented on YARN-2962: Thanks, Daniel, for the review. bq. One optimization I might consider is to only add the splits that actually exist to the rmAppRootHierarchies since I would assume that the common case will be to not use splits. That is done in {{loadRMAppState}}: only those hierarchies which have apps are considered. We could go further and never create the directories that are not configured, but I went with this approach. bq. what happens if an app happens to exist in more than one split? That should not happen unless somebody performs manual operations on the state store, and in that case a lot can go wrong with the state store. So we do make some assumptions, as the previous code did too. However, I am not sure whether we need to fence certain operations, as I highlighted in an earlier comment; I could not come up with a scenario, because state store operations are synchronized, so inconsistency should not occur. > ZKRMStateStore: Limit the number of znodes under a znode > > > Key: YARN-2962 > URL: https://issues.apache.org/jira/browse/YARN-2962 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Karthik Kambatla >Assignee: Varun Saxena >Priority: Critical > Attachments: YARN-2962.01.patch, YARN-2962.04.patch, > YARN-2962.2.patch, YARN-2962.3.patch > > > We ran into this issue where we were hitting the default ZK server message > size configs, primarily because the message had too many znodes even though > they individually they were all small. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
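To make the hierarchy idea concrete, a hedged sketch of routing an app znode under a split parent keyed by the last digits of its id (the path layout and names are illustrative, not the patch's exact scheme):

{code:java}
public final class ZnodeSplitSketch {
  private ZnodeSplitSketch() {}

  /**
   * splitIndex = 0 keeps the flat layout; splitIndex = 2 turns
   * application_1452766846_0011 into
   * .../HIERARCHIES/2/application_1452766846_00/11, capping the number of
   * children under any single parent znode.
   */
  public static String appPath(String appRoot, String appIdStr, int splitIndex) {
    if (splitIndex <= 0) {
      return appRoot + "/" + appIdStr;
    }
    int cut = appIdStr.length() - splitIndex;
    return appRoot + "/HIERARCHIES/" + splitIndex + "/"
        + appIdStr.substring(0, cut) + "/" + appIdStr.substring(cut);
  }
}
{code}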
[jira] [Updated] (YARN-4823) Refactor the nested reservation id field in listReservation to simple string field
[ https://issues.apache.org/jira/browse/YARN-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-4823: - Summary: Refactor the nested reservation id field in listReservation to simple string field (was: Rename the nested reservation id field in listReservation to ID) > Refactor the nested reservation id field in listReservation to simple string > field > -- > > Key: YARN-4823 > URL: https://issues.apache.org/jira/browse/YARN-4823 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > > The listReservation REST API returns a ReservationId field which has a nested > id field which is also called ReservationId. This JIRA proposes to rename the > nested field to id. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4732) *ProcessTree classes have too many whitespace issues
[ https://issues.apache.org/jira/browse/YARN-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203040#comment-15203040 ] Gabor Liptak commented on YARN-4732: [~kasha] Any other changes you would like to see? Thanks > *ProcessTree classes have too many whitespace issues > > > Key: YARN-4732 > URL: https://issues.apache.org/jira/browse/YARN-4732 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Karthik Kambatla >Assignee: Gabor Liptak >Priority: Trivial > Labels: newbie > Attachments: YARN-4732.1.patch > > > *ProcessTree classes have too many whitespace issues - extra newlines between > methods, spaces in empty lines etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4829) Add support for binary units
[ https://issues.apache.org/jira/browse/YARN-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-4829: Attachment: YARN-4829-YARN-3926.001.patch Attached a file with the fix. > Add support for binary units > > > Key: YARN-4829 > URL: https://issues.apache.org/jira/browse/YARN-4829 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4829-YARN-3926.001.patch > > > The units conversion util should have support for binary units. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (YARN-3926) Extend the YARN resource model for easier resource-type management and profiles
[ https://issues.apache.org/jira/browse/YARN-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197763#comment-15197763 ] Varun Vasudev edited comment on YARN-3926 at 3/16/16 5:41 PM: -- bq. That should work... But I feel, maybe allow mismatch should be the default. If NM has a super-set of RMs resource types, it will just be ignored, If sub-set, then for those specific resource-types, RM will assign a 0 value for the NM. I don't have any particular preference - I can see scenarios for all 3. I'm fine with making allow mismatch the default. bq. We can also add admin API on the RM to add / remove allowable resource types on the fly. This should be doable, but we need to go through how this will affect running apps. was (Author: vvasudev): bq. That should work... But I feel, maybe allow mismatch should be the default. If NM has a super-set of RMs resource types, it will just be ignored, If sub-set, then for those specific resource-types, RM will assign a 0 value for the NM. I don't have any particular preference - I can see scenarios for all 3. I'm fine with making allow mismatch the default. bq. We can also add admin API on the RM to add / remove allowable resource types on the fly. This should be do-able but we need to go through the affect on running apps. > Extend the YARN resource model for easier resource-type management and > profiles > --- > > Key: YARN-3926 > URL: https://issues.apache.org/jira/browse/YARN-3926 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: Proposal for modifying resource model and profiles.pdf > > > Currently, there are efforts to add support for various resource-types such > as disk(YARN-2139), network(YARN-2140), and HDFS bandwidth(YARN-2681). These > efforts all aim to add support for a new resource type and are fairly > involved efforts. In addition, once support is added, it becomes harder for > users to specify the resources they need. All existing jobs have to be > modified, or have to use the minimum allocation. > This ticket is a proposal to extend the YARN resource model to a more > flexible model which makes it easier to support additional resource-types. It > also considers the related aspect of “resource profiles” which allow users to > easily specify the various resources they need for any given container. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4820) ResourceManager web redirects in HA mode drops query parameters
[ https://issues.apache.org/jira/browse/YARN-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198044#comment-15198044 ] Steve Loughran commented on YARN-4820: -- I was just looking at the redirect logic and noting it was looking at 302s only... If you are treating redirects specially, then 307s ought to be covered as well, as those are what PUT/POST/DELETE verbs should be issuing if they ever need to redirect. > ResourceManager web redirects in HA mode drops query parameters > --- > > Key: YARN-4820 > URL: https://issues.apache.org/jira/browse/YARN-4820 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4820.001.patch > > > The RMWebAppFilter redirects http requests from the standby to the active. > However it drops all the query parameters when it does the redirect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
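A small sketch of the distinction being drawn, assuming a hypothetical helper (this is not the RMWebAppFilter code): 302 for GET/HEAD, 307 for verbs with bodies, with the query string carried along either way.

{code:java}
import java.io.IOException;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public final class RedirectHelper {
  private RedirectHelper() {}

  public static void redirectToActive(HttpServletRequest req,
      HttpServletResponse resp, String activeRmBase) throws IOException {
    String query = req.getQueryString();
    // Re-attach the query string so parameters are not dropped on redirect.
    String location = activeRmBase + req.getRequestURI()
        + (query == null ? "" : "?" + query);
    String method = req.getMethod();
    if ("GET".equals(method) || "HEAD".equals(method)) {
      resp.sendRedirect(location); // 302 is fine for safe methods
    } else {
      // 307 tells the client to replay the same verb and body at the new URL.
      resp.setStatus(HttpServletResponse.SC_TEMPORARY_REDIRECT);
      resp.setHeader("Location", location);
    }
  }
}
{code}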
[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928
[ https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201871#comment-15201871 ] Naganarasimha G R commented on YARN-4712: - Thanks for the review and commit [~varun_saxena], [~sjlee0], [~djp], & [~sunilg]. > CPU Usage Metric is not captured properly in YARN-2928 > -- > > Key: YARN-4712 > URL: https://issues.apache.org/jira/browse/YARN-4712 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > Fix For: YARN-2928 > > Attachments: YARN-4712-YARN-2928.v1.001.patch, > YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, > YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, > YARN-4712-YARN-2928.v1.006.patch > > > There are 2 issues with CPU usage collection > * I was able to observe that that many times CPU usage got from > {{pTree.getCpuUsagePercent()}} is > ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do > the calculation i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore > /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE > check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not > encountered. so proper checks needs to be handled > * {{EntityColumnPrefix.METRIC}} uses always LongConverter but > ContainerMonitor is publishing decimal values for the CPU usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4767) Network issues can cause persistent RM UI outage
[ https://issues.apache.org/jira/browse/YARN-4767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-4767: --- Attachment: YARN-4767.003.patch This patch should resolve the checkstyle issues. I ended up having to refactor some existing code to reduce the method sizes. > Network issues can cause persistent RM UI outage > > > Key: YARN-4767 > URL: https://issues.apache.org/jira/browse/YARN-4767 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.9.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > Attachments: YARN-4767.001.patch, YARN-4767.002.patch, > YARN-4767.003.patch > > > If a network issue causes an AM web app to resolve the RM proxy's address to > something other than what's listed in the allowed proxies list, the > AmIpFilter will 302 redirect the RM proxy's request back to the RM proxy. > The RM proxy will then consume all available handler threads connecting to > itself over and over, resulting in an outage of the web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4390) Consider container request size during CS preemption
[ https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198795#comment-15198795 ] Wangda Tan commented on YARN-4390: -- Hi [~eepayne], YARN-4108 doesn't solve all the issues. (I planned to solve this together with YARN-4108, but YARN-4108 only tackled half of the problem: once containers are selected, only preempt the useful ones.) We still need to select containers more cleverly based on the request. I have been thinking about this recently and plan to make some progress as soon as possible. May I reopen this JIRA and take it over from you? > Consider container request size during CS preemption > > > Key: YARN-4390 > URL: https://issues.apache.org/jira/browse/YARN-4390 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.0.0, 2.8.0, 2.7.3 >Reporter: Eric Payne >Assignee: Eric Payne > > There are multiple reasons why preemption could unnecessarily preempt > containers. One is that an app could be requesting a large container (say > 8-GB), and the preemption monitor could conceivably preempt multiple > containers (say 8, 1-GB containers) in order to fill the large container > request. These smaller containers would then be rejected by the requesting AM > and potentially given right back to the preempted app. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
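An illustrative selection heuristic for the point above (plain Java, not the preemption monitor's actual code): prefer the smallest single container that satisfies the pending ask before falling back to killing several smaller ones.

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public final class PreemptionCandidateSketch {
  private PreemptionCandidateSketch() {}

  /** Pick containers (by size, in MB) to free at least requestMb. */
  public static List<Long> select(List<Long> candidateSizesMb, long requestMb) {
    Long best = null;
    for (Long size : candidateSizesMb) {
      if (size >= requestMb && (best == null || size < best)) {
        best = size; // smallest single container that covers the request
      }
    }
    if (best != null) {
      return Collections.singletonList(best);
    }
    // Otherwise accumulate from the largest down to minimize the kill count.
    // This fallback only helps if all candidates sit on a node where the
    // request can actually be placed; freeing eight 1 GB containers spread
    // across nodes for an 8 GB ask is exactly the case to avoid.
    List<Long> sorted = new ArrayList<Long>(candidateSizesMb);
    Collections.sort(sorted, Collections.<Long>reverseOrder());
    List<Long> picked = new ArrayList<Long>();
    long total = 0;
    for (Long size : sorted) {
      if (total >= requestMb) {
        break;
      }
      picked.add(size);
      total += size;
    }
    return total >= requestMb ? picked : Collections.<Long>emptyList();
  }
}
{code}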
[jira] [Commented] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines
[ https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198111#comment-15198111 ] Inigo Goiri commented on YARN-2965: --- [~rgrandl], [~srikanthkandula], any objections? > Enhance Node Managers to monitor and report the resource usage on machines > -- > > Key: YARN-2965 > URL: https://issues.apache.org/jira/browse/YARN-2965 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Robert Grandl >Assignee: Robert Grandl > Attachments: ddoc_RT.docx > > > This JIRA is about augmenting Node Managers to monitor the resource usage on > the machine, aggregates these reports and exposes them to the RM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4746) yarn web services should convert parse failures of appId to 400
[ https://issues.apache.org/jira/browse/YARN-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4746: --- Attachment: 0003-YARN-4746.patch Attaching the patch after the testcase fix > yarn web services should convert parse failures of appId to 400 > --- > > Key: YARN-4746 > URL: https://issues.apache.org/jira/browse/YARN-4746 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Priority: Minor > Attachments: 0001-YARN-4746.patch, 0002-YARN-4746.patch, > 0003-YARN-4746.patch, 0003-YARN-4746.patch > > > Somewhere in my WS API tests I'm seeing an error with exception > conversion of a bad app ID sent in as an argument to a GET. I know it's in > ATS, but a scan of the core RM web services implies the same problem. > {{WebServices.parseApplicationId()}} uses {{ConverterUtils.toApplicationId}} > to convert an argument; this throws IllegalArgumentException, which is then > handled somewhere by jetty as a 500 error. > In fact, it's a bad argument, which should be handled by returning a 400. > This can be done by catching the raised exception and explicitly converting it -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-796) Allow for (admin) labels on nodes and resource-requests
[ https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197237#comment-15197237 ] Naganarasimha G R commented on YARN-796: Hi [~jameszhouyi], in 2.6.0 label exclusivity is not supported, and I hope you are also aware that labels are supported only in the CapacityScheduler. > Allow for (admin) labels on nodes and resource-requests > --- > > Key: YARN-796 > URL: https://issues.apache.org/jira/browse/YARN-796 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.4.1 >Reporter: Arun C Murthy >Assignee: Wangda Tan > Attachments: LabelBasedScheduling.pdf, > Node-labels-Requirements-Design-doc-V1.pdf, > Node-labels-Requirements-Design-doc-V2.pdf, > Non-exclusive-Node-Partition-Design.pdf, YARN-796-Diagram.pdf, > YARN-796.node-label.consolidate.1.patch, > YARN-796.node-label.consolidate.10.patch, > YARN-796.node-label.consolidate.11.patch, > YARN-796.node-label.consolidate.12.patch, > YARN-796.node-label.consolidate.13.patch, > YARN-796.node-label.consolidate.14.patch, > YARN-796.node-label.consolidate.2.patch, > YARN-796.node-label.consolidate.3.patch, > YARN-796.node-label.consolidate.4.patch, > YARN-796.node-label.consolidate.5.patch, > YARN-796.node-label.consolidate.6.patch, > YARN-796.node-label.consolidate.7.patch, > YARN-796.node-label.consolidate.8.patch, YARN-796.node-label.demo.patch.1, > YARN-796.patch, YARN-796.patch4 > > > It will be useful for admins to specify labels for nodes. Examples of labels > are OS, processor architecture etc. > We should expose these labels and allow applications to specify labels on > resource-requests. > Obviously we need to support admin operations on adding/removing node labels. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4785) inconsistent value type of the "type" field for LeafQueueInfo in response of RM REST API - cluster/scheduler
[ https://issues.apache.org/jira/browse/YARN-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199809#comment-15199809 ] Junping Du commented on YARN-4785: -- Thanks [~vvasudev], the patch for branch-2.6 and branch-2.7 LGTM. Will commit them shortly. > inconsistent value type of the "type" field for LeafQueueInfo in response of > RM REST API - cluster/scheduler > > > Key: YARN-4785 > URL: https://issues.apache.org/jira/browse/YARN-4785 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.6.0 >Reporter: Jayesh >Assignee: Varun Vasudev > Labels: REST_API > Attachments: YARN-4785.001.patch, YARN-4785.branch-2.6.001.patch, > YARN-4785.branch-2.7.001.patch > > > I see inconsistent value type ( String and Array ) of the "type" field for > LeafQueueInfo in response of RM REST API - cluster/scheduler > as per the spec it should be always String. > here is the sample output ( removed non-relevant fields ) > {code} > { > "scheduler": { > "schedulerInfo": { > "type": "capacityScheduler", > "capacity": 100, > ... > "queueName": "root", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 0.1, > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 0.1, > "queueName": "test-queue", > "state": "RUNNING", > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 2.5, > > }, > { > "capacity": 25, > > "state": "RUNNING", > "queues": { > "queue": [ > { > "capacity": 6, > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > > }, > { > "capacity": 6, > ... > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > ... > }, > ... > ] > }, > ... > } > ] > } > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4829) Add support for binary units
[ https://issues.apache.org/jira/browse/YARN-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197959#comment-15197959 ] Arun Suresh commented on YARN-4829: --- bq. ..ones in the IEC standard. It's also the format used by Kubernetes... Ah!! Makes sense. Also, since we are introducing *Mi* here, the following should be fixed as well: * In {{ResourceInformation}}, the static final {{MEMORY_MB}} field should be initialized to {{ResourceInformation.newInstance(MEMORY_URI, "Mi");}} * In {{ResourcePBImpl#getMemory}}, the argument to {{convert()}} should be *Mi*, not *M* > Add support for binary units > > > Key: YARN-4829 > URL: https://issues.apache.org/jira/browse/YARN-4829 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4829-YARN-3926.001.patch, > YARN-4829-YARN-3926.002.patch > > > The units conversion util should have support for binary units. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4830) Add support for resource types in the nodemanager
[ https://issues.apache.org/jira/browse/YARN-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-4830: Component/s: (was: resourcemanager) > Add support for resource types in the nodemanager > - > > Key: YARN-4830 > URL: https://issues.apache.org/jira/browse/YARN-4830 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4829) Add support for binary units
[ https://issues.apache.org/jira/browse/YARN-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197795#comment-15197795 ] Arun Suresh commented on YARN-4829: --- The patch looks mostly good. Thanks, [~vvasudev]. A couple of minor nits: * Maybe rename *Mi, Ti, Pi* to *Me, Te, Pe*, or maybe replace everything with *b (Kb, Mb, ..) to signify binary? * Can we have a test case that converts between binary and non-binary units (K to Ki), for example? > Add support for binary units > > > Key: YARN-4829 > URL: https://issues.apache.org/jira/browse/YARN-4829 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4829-YARN-3926.001.patch > > > The units conversion util should have support for binary units. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
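For the K-to-Ki question, a hedged sketch of IEC-style multipliers and a conversion; this is a toy stand-in for the util under review, whose real name and behavior may differ, and note that integer conversion floors:

{code:java}
public final class UnitsSketch {
  private UnitsSketch() {}

  public static long multiplier(String unit) {
    switch (unit) {
      case "":   return 1L;
      case "K":  return 1000L;                 // decimal units
      case "M":  return 1000L * 1000L;
      case "G":  return 1000L * 1000L * 1000L;
      case "Ki": return 1024L;                 // binary (IEC) units
      case "Mi": return 1024L * 1024L;
      case "Gi": return 1024L * 1024L * 1024L;
      default: throw new IllegalArgumentException("Unknown unit: " + unit);
    }
  }

  /** convert(1, "Gi", "Mi") == 1024; convert(1, "K", "Ki") == 0 (1000/1024 floors). */
  public static long convert(long value, String from, String to) {
    return value * multiplier(from) / multiplier(to);
  }
}
{code}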
[jira] [Commented] (YARN-4820) ResourceManager web redirects in HA mode drops query parameters
[ https://issues.apache.org/jira/browse/YARN-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200204#comment-15200204 ] Steve Loughran commented on YARN-4820: -- ok, I'm wrong, this is a 307, not a 302 ... ignore everything I was complaining about > ResourceManager web redirects in HA mode drops query parameters > --- > > Key: YARN-4820 > URL: https://issues.apache.org/jira/browse/YARN-4820 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4820.001.patch, YARN-4820.002.patch > > > The RMWebAppFilter redirects http requests from the standby to the active. > However it drops all the query parameters when it does the redirect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4839) ResourceManager deadlock between RMAppAttemptImpl and SchedulerApplicationAttempt
[ https://issues.apache.org/jira/browse/YARN-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201669#comment-15201669 ] Sangjin Lee commented on YARN-4839: --- Could this be the same issue as pointed out by YARN-4247? We did see this issue in our environment (which is 2.6 + patches), but that was because we backported YARN-2005 without YARN-3361. Not sure if there has been a more recent regression. > ResourceManager deadlock between RMAppAttemptImpl and > SchedulerApplicationAttempt > - > > Key: YARN-4839 > URL: https://issues.apache.org/jira/browse/YARN-4839 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Jason Lowe >Priority: Blocker > > Hit a deadlock in the ResourceManager as one thread was holding the > SchedulerApplicationAttempt lock and trying to call > RMAppAttemptImpl.getMasterContainer while another thread had the > RMAppAttemptImpl lock and was trying to call > SchedulerApplicationAttempt.getResourceUsageReport. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199761#comment-15199761 ] Varun Saxena commented on YARN-4517: [~gtCarrera9], as per the discussion with Wangda, unification of app/container information can be done after this branch goes into trunk. I think we can definitely unify and have a single container page; I will do this later as part of another JIRA. As for a single NM app page, we will have to see. Regarding this, bq. However, in the new UI, when the node is shutdown, I just could not hold myself to try to find a link to the NM logs to figure out why. I think the workflow here changed slightly, hence the user experience. Some other projects like Apache Ambari may want to maintain those information as well, but in YARN, it will be great if we could provide our users a way out. Maybe something like: "You have to ssh to the missing nodes' /xxx/ dir to look for the logs" would even be helpful. In the old UI, we did not show anything if a node was shut down. So here, the change is that we are showing even the nodes which have been SHUTDOWN. I just thought that this might be useful info for the admin. Now, some information about the path where the user can check the logs may be useful, but currently this information is not available in the RM, and I am not sure how the NM can report this to the RM. You could report it in node registration, but do we need to? Ambari may have this information because I guess it knows exactly where the installation has been done through it. Regarding node labels, we can add a field to the REST response indicating whether labels are enabled or not. We can do this later because it would require another JIRA for REST changes. The 500 error, I think, you are encountering on the app page; I will fix it while doing the AM pages or as a separate JIRA. > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch > > > We need nodes page added to next generation web UI, similar to existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2965) Enhance Node Managers to monitor and report the resource usage on machines
[ https://issues.apache.org/jira/browse/YARN-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198204#comment-15198204 ] Srikanth Kandula commented on YARN-2965: Go for it :-) We have some dummy code that was good enough to get numbers and run experiments, but we are not actively working on pushing that in. Inigo, I will share that code with you offline so you can pick any useful pieces from it if you like. > Enhance Node Managers to monitor and report the resource usage on machines > -- > > Key: YARN-2965 > URL: https://issues.apache.org/jira/browse/YARN-2965 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Robert Grandl >Assignee: Robert Grandl > Attachments: ddoc_RT.docx > > > This JIRA is about augmenting Node Managers to monitor the resource usage on > the machine, aggregates these reports and exposes them to the RM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200736#comment-15200736 ] Hadoop QA commented on YARN-4062: -
| (/) *{color:green}+1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 48s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 12s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 57s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 28s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 42s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 14s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 45s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 58s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 3m 23s {color} | {color:red} hadoop-yarn-project_hadoop-yarn-jdk1.8.0_74 with JDK v1.8.0_74 generated 1 new + 9 unchanged - 0 fixed = 10 total (was 9) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 22s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 5m 45s {color} | {color:red} hadoop-yarn-project_hadoop-yarn-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 new + 10 unchanged - 0 fixed = 11 total (was 10) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s {color} | {color:green} hadoop-yarn-project/hadoop-yarn: patch generated 0 new + 212 unchanged - 1 fixed = 212 total (was 213) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 38s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 0s {color}
[jira] [Commented] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started
[ https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198410#comment-15198410 ] Karthik Kambatla commented on YARN-4686: On a cursory look, the patch looks reasonable. One thing that caught my eye: are we explicitly transitioning the RM to active even when HA is not enabled? Is that required? > MiniYARNCluster.start() returns before cluster is completely started > > > Key: YARN-4686 > URL: https://issues.apache.org/jira/browse/YARN-4686 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch, > YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch, > YARN-4686.005.patch, YARN-4686.006.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
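A common way to make such tests robust until start() blocks properly is to poll for the expected number of live nodes before asserting. A minimal sketch, assuming a {{MiniYARNCluster}} with a known NM count; the helper name and wiring here are illustrative, not taken from the attached patches:
{code}
import com.google.common.base.Supplier;
import org.apache.hadoop.test.GenericTestUtils;
import org.apache.hadoop.yarn.server.MiniYARNCluster;

public class WaitForLiveNodes {
  // Poll every 100 ms, up to 60 s, until all expected NMs have registered
  // with the RM, instead of asserting immediately after cluster.start().
  static void waitForLiveNodes(final MiniYARNCluster cluster,
      final int expected) throws Exception {
    GenericTestUtils.waitFor(new Supplier<Boolean>() {
      @Override
      public Boolean get() {
        return cluster.getResourceManager().getRMContext()
            .getRMNodes().size() >= expected;
      }
    }, 100, 60000);
  }
}
{code}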
[jira] [Commented] (YARN-4808) SchedulerNode can use a few more cosmetic changes
[ https://issues.apache.org/jira/browse/YARN-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202938#comment-15202938 ] Karthik Kambatla commented on YARN-4808: [~leftnoteasy] - will you be able to review this? > SchedulerNode can use a few more cosmetic changes > - > > Key: YARN-4808 > URL: https://issues.apache.org/jira/browse/YARN-4808 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-4808-1.patch > > > We have made some cosmetic changes to SchedulerNode recently. While working > on YARN-4511, realized we could improve it a little more: > # Remove volatile variables - don't see the need for them being volatile > # Some methods end up doing very similar things, so consolidating them > # Renaming totalResource to capacity. YARN-4511 plans to add inflatedCapacity > to include the un-utilized resources, and having two totals can be a little > confusing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4815) ATS 1.5 timeline client impl tries to create attempt directory for every event call
[ https://issues.apache.org/jira/browse/YARN-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-4815: Attachment: YARN-4815.2.patch Rebased the patch. > ATS 1.5 timeline client impl tries to create attempt directory for every event > call > > > Key: YARN-4815 > URL: https://issues.apache.org/jira/browse/YARN-4815 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4815.1.patch, YARN-4815.2.patch > > > The ATS 1.5 timeline client impl tries to create the attempt directory on every event > call. Since one directory-creation call per attempt is enough, this is > causing a perf issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4835) [YARN-3368] REST API related changes for new Web UI
[ https://issues.apache.org/jira/browse/YARN-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4835: --- Description: Following things need to be added for AM related web pages. 1. Support task state query param in REST URL for fetching tasks. 2. Support task attempt state query param in REST URL for fetching task attempts. 3. A new REST endpoint to fetch counters for each task belonging to a job. Also have a query param for counter name. i.e. something like : {{/jobs/\{jobid\}/taskCounters}} 4. A REST endpoint in NM for fetching all log files associated with a container. Useful if logs served by NM. > [YARN-3368] REST API related changes for new Web UI > --- > > Key: YARN-4835 > URL: https://issues.apache.org/jira/browse/YARN-4835 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Reporter: Varun Saxena >Assignee: Varun Saxena > > Following things need to be added for AM related web pages. > 1. Support task state query param in REST URL for fetching tasks. > 2. Support task attempt state query param in REST URL for fetching task > attempts. > 3. A new REST endpoint to fetch counters for each task belonging to a job. > Also have a query param for counter name. >i.e. something like : > {{/jobs/\{jobid\}/taskCounters}} > 4. A REST endpoint in NM for fetching all log files associated with a > container. Useful if logs served by NM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
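For item 1, the endpoint shape would be roughly as follows; a minimal JAX-RS sketch in which the class and helper names ({{TasksResource}}, {{lookupTaskIds}}, {{stateOf}}) are illustrative, not the actual MR webservice code:
{code}
import java.util.ArrayList;
import java.util.List;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;

@Path("/ws/v1/mapreduce")
public class TasksResource {
  // Item 1: list a job's tasks, optionally filtered by a "state" query param.
  @GET
  @Path("/jobs/{jobid}/tasks")
  @Produces(MediaType.APPLICATION_JSON)
  public List<String> getJobTasks(@PathParam("jobid") String jobId,
      @QueryParam("state") String state) {
    List<String> result = new ArrayList<>();
    for (String taskId : lookupTaskIds(jobId)) {
      // No state param means no filtering, as in the existing list endpoints.
      if (state == null || state.equals(stateOf(taskId))) {
        result.add(taskId);
      }
    }
    return result;
  }

  // Stubs standing in for the real task lookup.
  private List<String> lookupTaskIds(String jobId) { return new ArrayList<>(); }
  private String stateOf(String taskId) { return "RUNNING"; }
}
{code}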
[jira] [Commented] (YARN-4823) Refactor the nested reservation id field in listReservation to simple string field
[ https://issues.apache.org/jira/browse/YARN-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200852#comment-15200852 ] Subru Krishnan commented on YARN-4823: -- The test case failures are consistent and unrelated, and are covered in YARN-4478. > Refactor the nested reservation id field in listReservation to simple string > field > -- > > Key: YARN-4823 > URL: https://issues.apache.org/jira/browse/YARN-4823 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > Attachments: YARN-4823-v1.patch > > > The listReservation REST API returns a ReservationId field which has a nested > id field which is also called ReservationId. This JIRA proposes to refactor the > nested field to a simple string, as it's easier to read and moreover is what the > update/delete APIs take as input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
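For illustration, the change flattens the nested object into the canonical {{ReservationId#toString()}} form; the exact JSON field names below are an assumption based on the description, not the committed format:
{code}
Before (nested object): "reservation-id" : { "id" : 1, "cluster-timestamp" : 1458854402 }
After (simple string):  "reservation-id" : "reservation_1458854402_1"
{code}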
[jira] [Commented] (YARN-4785) inconsistent value type of the "type" field for LeafQueueInfo in response of RM REST API - cluster/scheduler
[ https://issues.apache.org/jira/browse/YARN-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199865#comment-15199865 ] Hadoop QA commented on YARN-4785: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} YARN-4785 does not apply to branch-2.6. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12793996/YARN-4785.branch-2.6.001.patch | | JIRA Issue | YARN-4785 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10806/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > inconsistent value type of the "type" field for LeafQueueInfo in response of > RM REST API - cluster/scheduler > > > Key: YARN-4785 > URL: https://issues.apache.org/jira/browse/YARN-4785 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.6.0 >Reporter: Jayesh >Assignee: Varun Vasudev > Labels: REST_API > Attachments: YARN-4785.001.patch, YARN-4785.branch-2.6.001.patch, > YARN-4785.branch-2.7.001.patch > > > I see inconsistent value type ( String and Array ) of the "type" field for > LeafQueueInfo in response of RM REST API - cluster/scheduler > as per the spec it should be always String. > here is the sample output ( removed non-relevant fields ) > {code} > { > "scheduler": { > "schedulerInfo": { > "type": "capacityScheduler", > "capacity": 100, > ... > "queueName": "root", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 0.1, > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 0.1, > "queueName": "test-queue", > "state": "RUNNING", > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 2.5, > > }, > { > "capacity": 25, > > "state": "RUNNING", > "queues": { > "queue": [ > { > "capacity": 6, > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > > }, > { > "capacity": 6, > ... > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > ... > }, > ... > ] > }, > ... > } > ] > } > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
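Until the server side is fixed, clients can read the field defensively. A minimal workaround sketch using Jackson's tree model (an illustration, not the fix this JIRA tracks):
{code}
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class QueueTypeReader {
  // Return the queue's "type" whether the server emitted it as a plain
  // string or as a single-element array, as in the sample output above.
  public static String readType(String queueJson) throws Exception {
    JsonNode type = new ObjectMapper().readTree(queueJson).get("type");
    if (type == null) {
      return null; // parent queues carry no "type" field
    }
    return type.isArray() ? type.get(0).asText() : type.asText();
  }
}
{code}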
[jira] [Commented] (YARN-998) Persistent resource change during NM/RM restart
[ https://issues.apache.org/jira/browse/YARN-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199584#comment-15199584 ] Junping Du commented on YARN-998: - Hi [~jianhe], would you kindly review the patch again? Thanks! > Persistent resource change during NM/RM restart > --- > > Key: YARN-998 > URL: https://issues.apache.org/jira/browse/YARN-998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: graceful, nodemanager, scheduler >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-998-sample.patch, YARN-998-v1.patch, > YARN-998-v2.patch > > > When the NM is restarted, whether planned or after a failure, the previous > dynamic resource setting should be kept for consistency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200205#comment-15200205 ] Varun Saxena commented on YARN-4517: [~sunilg], bq. I will try and see whether we can make a unified patch for all REST changes needed for UI together. Will syncup with you offline and will share summary here. Yup. I will raise a JIRA for the REST changes soon. You can update the changes you plan to make there. bq. From RM, we can get the node ip/hostname. Atleast we can give a relative patch for getting dir for logs (may be from available path from yarn-default.xml). Node IP will anyways be shown with the SHUTDOWN node. Sorry, which configuration are you referring to, from which we can know where NodeManager logs are located? IIUC, the log file location is decided by the yarn.log.dir system property (which is passed while starting the NM daemon), and the RM won't know about it for NodeManagers. Currently, if I am not wrong, the RM won't have this info, and if we need to support this, code will have to be added. > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch > > > We need a nodes page added to the next generation web UI, similar to the existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928
[ https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199768#comment-15199768 ] Varun Saxena commented on YARN-4712: +1 on the latest patch from me too. From the point of view of aggregation, I agree with using container CPU per core. Before committing this, I think we can discuss with others in the meeting today and find out if they agree, and whether we need other metrics at the container level. > CPU Usage Metric is not captured properly in YARN-2928 > -- > > Key: YARN-4712 > URL: https://issues.apache.org/jira/browse/YARN-4712 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > Attachments: YARN-4712-YARN-2928.v1.001.patch, > YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, > YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, > YARN-4712-YARN-2928.v1.006.patch > > > There are 2 issues with CPU usage collection > * I was able to observe that many times the CPU usage got from > {{pTree.getCpuUsagePercent()}} is > ResourceCalculatorProcessTree.UNAVAILABLE (i.e. -1), but ContainersMonitor does > the calculation, i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore > /resourceCalculatorPlugin.getNumProcessors()}}, because of which the UNAVAILABLE > check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not > encountered, so proper checks need to be handled > * {{EntityColumnPrefix.METRIC}} always uses LongConverter, but > ContainerMonitor is publishing decimal values for the CPU usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
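For the first issue, the fix amounts to checking for UNAVAILABLE before the per-core division. A sketch, assuming {{pTree}} and {{resourceCalculatorPlugin}} are in scope as in the description, with a hypothetical publish hook:
{code}
// Skip publishing CPU usage when the process tree reports UNAVAILABLE (-1),
// instead of dividing the sentinel through and reporting a bogus value.
float cpuUsagePercentPerCore = pTree.getCpuUsagePercent();
if (cpuUsagePercentPerCore != ResourceCalculatorProcessTree.UNAVAILABLE) {
  float cpuUsageTotalCoresPercentage =
      cpuUsagePercentPerCore / resourceCalculatorPlugin.getNumProcessors();
  publishCpuMetric(cpuUsageTotalCoresPercentage); // hypothetical publish hook
}
{code}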
[jira] [Commented] (YARN-4746) yarn web services should convert parse failures of appId to 400
[ https://issues.apache.org/jira/browse/YARN-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199226#comment-15199226 ] Hadoop QA commented on YARN-4746: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 51s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 52s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 10s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 55s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 45s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 46s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 10s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | 
{color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 3s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 35s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 55s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s {color} | {color:green} hadoop-yarn-server-common in the patch passed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 4s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 41s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:
[jira] [Commented] (YARN-4785) inconsistent value type of the "type" field for LeafQueueInfo in response of RM REST API - cluster/scheduler
[ https://issues.apache.org/jira/browse/YARN-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200882#comment-15200882 ] Varun Vasudev commented on YARN-4785: - Thanks [~djp]! > inconsistent value type of the "type" field for LeafQueueInfo in response of > RM REST API - cluster/scheduler > > > Key: YARN-4785 > URL: https://issues.apache.org/jira/browse/YARN-4785 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.6.0 >Reporter: Jayesh >Assignee: Varun Vasudev > Labels: REST_API > Fix For: 2.8.0, 2.7.3, 2.6.5 > > Attachments: YARN-4785.001.patch, YARN-4785.branch-2.6.001.patch, > YARN-4785.branch-2.7.001.patch > > > I see inconsistent value type ( String and Array ) of the "type" field for > LeafQueueInfo in response of RM REST API - cluster/scheduler > as per the spec it should be always String. > here is the sample output ( removed non-relevant fields ) > {code} > { > "scheduler": { > "schedulerInfo": { > "type": "capacityScheduler", > "capacity": 100, > ... > "queueName": "root", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 0.1, > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 0.1, > "queueName": "test-queue", > "state": "RUNNING", > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 2.5, > > }, > { > "capacity": 25, > > "state": "RUNNING", > "queues": { > "queue": [ > { > "capacity": 6, > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > > }, > { > "capacity": 6, > ... > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > ... > }, > ... > ] > }, > ... > } > ] > } > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4560) Make scheduler error checking message more user friendly
[ https://issues.apache.org/jira/browse/YARN-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197617#comment-15197617 ] Ray Chiang commented on YARN-4560: -- Thanks for the review everyone! Thanks for the commit Karthik! > Make scheduler error checking message more user friendly > > > Key: YARN-4560 > URL: https://issues.apache.org/jira/browse/YARN-4560 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 2.7.1 >Reporter: Ray Chiang >Assignee: Ray Chiang >Priority: Trivial > Labels: supportability > Fix For: 2.9.0 > > Attachments: YARN-4560.001.patch > > > If the YARN properties below are poorly configured: > {code} > yarn.scheduler.minimum-allocation-mb > yarn.scheduler.maximum-allocation-mb > {code} > The error message that shows up in the RM is: > {panel} > 2016-01-07 14:47:03,711 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Invalid resource > scheduler memory allocation configuration, > yarn.scheduler.minimum-allocation-mb=-1, > yarn.scheduler.maximum-allocation-mb=-3, min should equal greater than 0, max > should be no smaller than min. > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.validateConf(FairScheduler.java:215) > {panel} > While it's technically correct, it's not very user friendly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
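For comparison, a friendlier check might spell out each constraint separately, naming the offending property and value; an illustrative sketch against the {{YarnConfiguration}} keys above, not the committed patch:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.exceptions.YarnRuntimeException;

public class AllocationConfigCheck {
  static void validate(Configuration conf) {
    int minMem = conf.getInt(
        YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB,
        YarnConfiguration.DEFAULT_RM_SCHEDULER_MINIMUM_ALLOCATION_MB);
    int maxMem = conf.getInt(
        YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB,
        YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_MB);
    // Report the specific constraint that was violated rather than one
    // catch-all message covering both properties.
    if (minMem < 0) {
      throw new YarnRuntimeException(
          YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB + "=" + minMem
          + " is invalid; it must be greater than or equal to 0.");
    }
    if (maxMem <= 0 || maxMem < minMem) {
      throw new YarnRuntimeException(
          YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB + "=" + maxMem
          + " is invalid; it must be positive and no smaller than "
          + YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB
          + "=" + minMem + ".");
    }
  }
}
{code}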
[jira] [Commented] (YARN-4825) Remove redundant code in ClientRMService::listReservations
[ https://issues.apache.org/jira/browse/YARN-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200746#comment-15200746 ] Hadoop QA commented on YARN-4825: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 56s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 11s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 153m 7s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL |
[jira] [Commented] (YARN-3926) Extend the YARN resource model for easier resource-type management and profiles
[ https://issues.apache.org/jira/browse/YARN-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197811#comment-15197811 ] Arun Suresh commented on YARN-3926: --- bq. ..how this will affect on running apps Agreed, might not be trivial. But my hunch is, if DRF works correctly, it should be equivalent to a Cluster Capacity / Resource change (in the FairScheduler, IIRC, a re-calculation of Queue and Application fair-shares is done) > Extend the YARN resource model for easier resource-type management and > profiles > --- > > Key: YARN-3926 > URL: https://issues.apache.org/jira/browse/YARN-3926 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: Proposal for modifying resource model and profiles.pdf > > > Currently, there are efforts to add support for various resource-types such > as disk(YARN-2139), network(YARN-2140), and HDFS bandwidth(YARN-2681). These > efforts all aim to add support for a new resource type and are fairly > involved. In addition, once support is added, it becomes harder for > users to specify the resources they need. All existing jobs have to be > modified, or have to use the minimum allocation. > This ticket is a proposal to extend the YARN resource model to a more > flexible model which makes it easier to support additional resource-types. It > also considers the related aspect of “resource profiles” which allow users to > easily specify the various resources they need for any given container. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4831) Recovered containers will be killed after NM stateful restart
Siqi Li created YARN-4831: - Summary: Recovered containers will be killed after NM stateful restart Key: YARN-4831 URL: https://issues.apache.org/jira/browse/YARN-4831 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198442#comment-15198442 ] Daniel Templeton commented on YARN-2962: Thanks for updating the patch, [~varun_saxena]. The overall approach appears faithful to the discussion above. One optimization I might consider is to only add the splits that actually exist to the {{rmAppRootHierarchies}}, since I would assume that the common case will be to not use splits. Implementation comments: Could you also explain in the parameter description why one would want to change it from the default of 0 and how to know what a good split value would be?
{code}
static final int NO_APPID_NODE_SPLIT = 0;
{code}
I'm not sure this constant adds anything. I found it made the code harder to read than just hard-coding in 0. At a minimum, the name could be improved. Took me a bit to realize you weren't abbreviating "number" as "no". At the absolute barest minimum, a comment to explain the constant would help.
{code}
private final static class AppNodeSplitInfo {
  private final String path;
  private final int splitIndex;
  AppNodeSplitInfo(String path, int splitIndex) {
    this.path = path;
    this.splitIndex = splitIndex;
  }
  public String getPath() { return path; }
  public int getSplitIndex() { return splitIndex; }
}
{code}
It may be just me, but for a private holding class like this, I don't think the accessors are needed. Just access the member vars directly.
{code}
if (appIdNodeSplitIndex < 1 || appIdNodeSplitIndex > 4) {
  appIdNodeSplitIndex = NO_APPID_NODE_SPLIT;
}
{code}
This violates the Principle of Least Astonishment. At least log a warning that you're not doing what the user said to. I might even log it as an error.
{code}
rmAppRootHierarchies = new HashMap(5);
{code}
should be
{code}
rmAppRootHierarchies = new HashMap<>(5);
{code}
{code}
if (alternatePathInfo == null) {
  // Unexpected. Assume that app attempt has been deleted.
  return;
}
appIdRemovePath = alternatePathInfo.getPath();
{code}
I'm not a fan of the if-return style of coding. I'd rather you did:
{code}
// Assume that app attempt has been deleted if the path info is null
if (alternatePathInfo != null) {
  appIdRemovePath = alternatePathInfo.getPath();
}
{code}
and then wrap the tail of the method in {{if (appIdRemovePath != null) {}}. Same in {{removeApplicationStateInternal()}} and {{removeApplication()}}. Please include messages in your @throws comments.
{code}
/**
 * Deletes the path. Assumes that path exists.
 * @param path Path to be deleted.
 * @throws Exception
 */
private void safeDeleteIfExists(final String path) throws Exception {
  SafeTransaction transaction = new SafeTransaction();
  transaction.delete(path);
  transaction.commit();
}

/**
 * Deletes the path. Checks for existence of path as well.
 * @param path Path to be deleted.
 * @throws Exception
 */
private void safeDelete(final String path) throws Exception {
  if (exists(path)) {
    safeDeleteIfExists(path);
  }
}
{code}
What I see is that {{safeDelete()}} deletes the path if it exists, and {{safeDeleteIfExists()}} deletes the path blindly. Might want to swap those method names.
{code}
private AppNodeSplitInfo getAlternatePath(String appId) throws Exception {
  for (Map.Entry<Integer, String> entry : rmAppRootHierarchies.entrySet()) {
    // Look for other paths
    int splitIndex = entry.getKey();
    if (splitIndex != appIdNodeSplitIndex) {
      String alternatePath =
          getLeafAppIdNodePath(appId, entry.getValue(), splitIndex, false);
      if (exists(alternatePath)) {
        return new AppNodeSplitInfo(alternatePath, splitIndex);
      }
    }
  }
  return null;
}
{code}
Naïve question: what happens if an app happens to exist in more than one split? I know that's not the expected case, but never underestimate the users... I would also love to see some use of newlines to make the code a little more readable. I would love to see javadoc comments on your test methods.
{code}
HashMap attempts = new HashMap();
{code}
should be
{code}
HashMap attempts = new HashMap<>();
{code}
Your assert methods should have some message text to explain what went wrong. It would be really swell if those two long test methods had some more explanatory comments so that they're easier to understand. > ZKRMStateStore: Limit the number of znodes under a znode > > > Key: YARN-2962 > URL: https://issues.apache.org/jira/browse/YARN-2962 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.6.0
[jira] [Resolved] (YARN-2048) List all of the containers of an application from the yarn web
[ https://issues.apache.org/jira/browse/YARN-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla resolved YARN-2048. Resolution: Duplicate > List all of the containers of an application from the yarn web > -- > > Key: YARN-2048 > URL: https://issues.apache.org/jira/browse/YARN-2048 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, webapp >Affects Versions: 2.3.0, 2.4.0, 2.5.0 >Reporter: Min Zhou >Assignee: Xuan Gong > Attachments: YARN-2048-trunk-v1.patch > > > Currently, YARN doesn't provide a way to list all of the containers of an > application from its web UI. This kind of information is needed by the > application user. They can conveniently know how many containers their > applications have already acquired, as well as which nodes those containers > were launched on. They also want to view the logs of each container of an > application. > One approach is to maintain a container list in RMAppImpl and expose this info > to the Application page. I will submit a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (YARN-2048) List all of the containers of an application from the yarn web
[ https://issues.apache.org/jira/browse/YARN-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reopened YARN-2048: > List all of the containers of an application from the yarn web > -- > > Key: YARN-2048 > URL: https://issues.apache.org/jira/browse/YARN-2048 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, webapp >Affects Versions: 2.3.0, 2.4.0, 2.5.0 >Reporter: Min Zhou >Assignee: Xuan Gong > Attachments: YARN-2048-trunk-v1.patch > > > Currently, YARN doesn't provide a way to list all of the containers of an > application from its web UI. This kind of information is needed by the > application user. They can conveniently know how many containers their > applications have already acquired, as well as which nodes those containers > were launched on. They also want to view the logs of each container of an > application. > One approach is to maintain a container list in RMAppImpl and expose this info > to the Application page. I will submit a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started
[ https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202474#comment-15202474 ] Hadoop QA commented on YARN-4686: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 10s {color} | {color:red} YARN-4686 does not apply to branch-2.7. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12794247/YARN-4686-branch-2.7.006.patch | | JIRA Issue | YARN-4686 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10820/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > MiniYARNCluster.start() returns before cluster is completely started > > > Key: YARN-4686 > URL: https://issues.apache.org/jira/browse/YARN-4686 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch, > YARN-4686-branch-2.7.006.patch, YARN-4686.001.patch, YARN-4686.002.patch, > YARN-4686.003.patch, YARN-4686.004.patch, YARN-4686.005.patch, > YARN-4686.006.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4829) Add support for binary units
[ https://issues.apache.org/jira/browse/YARN-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200878#comment-15200878 ] Varun Vasudev commented on YARN-4829: - [~asuresh] - can you take a look at the latest patch and commit it to branch YARN-3926 if it looks good? Thanks! > Add support for binary units > > > Key: YARN-4829 > URL: https://issues.apache.org/jira/browse/YARN-4829 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4829-YARN-3926.001.patch, > YARN-4829-YARN-3926.002.patch, YARN-4829-YARN-3926.003.patch, > YARN-4829-YARN-3926.004.patch > > > The units conversion util should have support for binary units. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
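Since each binary unit step is a factor of 1024, the conversion itself is small; an illustrative sketch of the idea (the constant and method names are assumptions, not the actual units-conversion util in the patch):
{code}
import java.util.Arrays;
import java.util.List;

public class BinaryUnitsSketch {
  private static final List<String> UNITS =
      Arrays.asList("", "Ki", "Mi", "Gi", "Ti", "Pi");

  // Convert a value between binary units, e.g. convert("Gi", "Mi", 4) == 4096.
  static long convert(String fromUnit, String toUnit, long value) {
    int from = UNITS.indexOf(fromUnit);
    int to = UNITS.indexOf(toUnit);
    if (from < 0 || to < 0) {
      throw new IllegalArgumentException("Unknown binary unit");
    }
    // Each step between adjacent binary units is a factor of 2^10 = 1024;
    // downward conversions truncate.
    int diff = from - to;
    return diff >= 0 ? value << (10 * diff) : value >> (10 * -diff);
  }
}
{code}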
[jira] [Commented] (YARN-4636) Make blacklist tracking policy pluggable for more extensions.
[ https://issues.apache.org/jira/browse/YARN-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201206#comment-15201206 ] Vinod Kumar Vavilapalli commented on YARN-4636: --- -1 for something like this without understanding the use-cases. IMO, the "AM blacklisting" doesn't even need to be user-visible (YARN-4837) let alone be pluggable. > Make blacklist tracking policy pluggable for more extensions. > - > > Key: YARN-4636 > URL: https://issues.apache.org/jira/browse/YARN-4636 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Junping Du >Assignee: Sunil G > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4390) Consider container request size during CS preemption
[ https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4390: - Attachment: YARN-4390.poc-WIP.1.patch Uploaded a WIP POC patch if you're interested to see what the patch will look like (the patch of YARN-4822 needs to be applied first). I haven't done any tests yet; no guarantee that it compiles successfully. Could you please share your thoughts about the proposal? [~eepayne], [~sunilg], [~curino]. > Consider container request size during CS preemption > > > Key: YARN-4390 > URL: https://issues.apache.org/jira/browse/YARN-4390 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Affects Versions: 3.0.0, 2.8.0, 2.7.3 >Reporter: Eric Payne >Assignee: Wangda Tan > Attachments: YARN-4390-design.1.pdf, YARN-4390.poc-WIP.1.patch > > > There are multiple reasons why preemption could unnecessarily preempt > containers. One is that an app could be requesting a large container (say > 8-GB), and the preemption monitor could conceivably preempt multiple > containers (say 8, 1-GB containers) in order to fill the large container > request. These smaller containers would then be rejected by the requesting AM > and potentially given right back to the preempted app. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
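The gist of the proposal is to stop picking preemption candidates blindly and instead check that what gets freed on a node can actually back a pending ask. A node-local sketch using existing {{Resources}} helpers (the method and its wiring are an illustration, not the POC patch):
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

public class PreemptionFitSketch {
  // True once the containers already marked for preemption on this node,
  // plus the candidate, add up to the pending ask; killing containers that
  // never reach this point just churns allocations the AM will reject.
  static boolean completesPendingAsk(Resource pendingAsk,
      Resource markedOnNode, Resource candidate) {
    return Resources.fitsIn(pendingAsk,
        Resources.add(markedOnNode, candidate));
  }
}
{code}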
[jira] [Created] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing
Vinod Kumar Vavilapalli created YARN-4837: - Summary: User facing aspects of 'AM blacklisting' feature need fixing Key: YARN-4837 URL: https://issues.apache.org/jira/browse/YARN-4837 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Was reviewing the user-facing aspects that we are releasing as part of 2.8.0. Looking at the 'AM blacklisting feature', I see several things to be fixed before we release it in 2.8.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200414#comment-15200414 ] Sangjin Lee commented on YARN-4062: --- LGTM pending jenkins. I'll commit it once jenkins comes back clean. > Add the flush and compaction functionality via coprocessors and scanners for > flow run table > --- > > Key: YARN-4062 > URL: https://issues.apache.org/jira/browse/YARN-4062 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Vrushali C > Labels: yarn-2928-1st-milestone > Attachments: YARN-4062-YARN-2928.04.patch, > YARN-4062-YARN-2928.05.patch, YARN-4062-YARN-2928.06.patch, > YARN-4062-YARN-2928.07.patch, YARN-4062-YARN-2928.08.patch, > YARN-4062-YARN-2928.09.patch, YARN-4062-YARN-2928.1.patch, > YARN-4062-feature-YARN-2928.01.patch, YARN-4062-feature-YARN-2928.02.patch, > YARN-4062-feature-YARN-2928.03.patch > > > As part of YARN-3901, a coprocessor and scanner are being added for storing into > the flow_run table. It also needs flush & compaction processing in the > coprocessor and perhaps a new scanner to deal with the data during the flushing > and compaction stages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4820) ResourceManager web redirects in HA mode drops query parameters
[ https://issues.apache.org/jira/browse/YARN-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197235#comment-15197235 ] Varun Vasudev commented on YARN-4820: - [~steve_l] - sorry I didn't understand the case you mentioned. You're talking about a scenario where the active RM web services redirect you to a standby RM? > ResourceManager web redirects in HA mode drops query parameters > --- > > Key: YARN-4820 > URL: https://issues.apache.org/jira/browse/YARN-4820 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4820.001.patch > > > The RMWebAppFilter redirects http requests from the standby to the active. > However it drops all the query parameters when it does the redirect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4823) Refactor the nested reservation id field in listReservation to simple string field
[ https://issues.apache.org/jira/browse/YARN-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-4823: - Attachment: YARN-4823-v1.patch Attaching a patch that refactors the nested reservation id field in listReservation to a simple string field. > Refactor the nested reservation id field in listReservation to simple string > field > -- > > Key: YARN-4823 > URL: https://issues.apache.org/jira/browse/YARN-4823 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > Attachments: YARN-4823-v1.patch > > > The listReservation REST API returns a ReservationId field which has a nested > id field which is also called ReservationId. This JIRA proposes to refactor the > nested field to a simple string, as it's easier to read and moreover is what the > update/delete APIs take as input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4746) yarn web services should convert parse failures of appId to 400
[ https://issues.apache.org/jira/browse/YARN-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201265#comment-15201265 ] Hadoop QA commented on YARN-4746: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 41s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 15s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 7s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 29s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 15s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 12s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 44s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 75m 8s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 16s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 53s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 20s {color} | {color:red} hadoop-yarn-ser
[jira] [Commented] (YARN-4502) Fix two AM containers get allocated when AM restart
[ https://issues.apache.org/jira/browse/YARN-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200423#comment-15200423 ] Vinod Kumar Vavilapalli commented on YARN-4502: --- [~zxu] / [~djp] bq. It looks like the implementation for AbstractYarnScheduler#getApplicationAttempt(ApplicationAttemptId applicationAttemptId) is also confusing. This is by design - see YARN-1041 - we want to route all the events destined for AppAttempt *only* to the current attempt. We should just document this and move on. > Fix two AM containers get allocated when AM restart > --- > > Key: YARN-4502 > URL: https://issues.apache.org/jira/browse/YARN-4502 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yesha Vora >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Fix For: 2.8.0 > > Attachments: YARN-4502-20160114.txt, YARN-4502-20160212.txt > > > Scenario : > * set yarn.resourcemanager.am.max-attempts = 2 > * start dshell application > {code} > yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar > hadoop-yarn-applications-distributedshell-*.jar > -attempt_failures_validity_interval 6 -shell_command "sleep 150" > -num_containers 16 > {code} > * Kill AM pid > * Print container list for 2nd attempt > {code} > yarn container -list appattempt_1450825622869_0001_02 > INFO impl.TimelineClientImpl: Timeline service address: > http://xxx:port/ws/v1/timeline/ > INFO client.RMProxy: Connecting to ResourceManager at xxx/10.10.10.10: > Total number of containers :2 > Container-Id Start Time Finish Time > StateHost Node Http Address >LOG-URL > container_e12_1450825622869_0001_02_02 Tue Dec 22 23:07:35 + 2015 > N/A RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_02/hrt_qa > container_e12_1450825622869_0001_02_01 Tue Dec 22 23:07:34 + 2015 > N/A RUNNINGxxx:25454 http://xxx:8042 > http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_01/hrt_qa > {code} > * look for new AM pid > Here, 2nd AM container was suppose to be started on > container_e12_1450825622869_0001_02_01. But AM was not launched on > container_e12_1450825622869_0001_02_01. It was in AQUIRED state. > On other hand, container_e12_1450825622869_0001_02_02 got the AM running. > Expected behavior: RM should not start 2 containers for starting AM -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4837) User facing aspects of 'AM blacklisting' feature need fixing
[ https://issues.apache.org/jira/browse/YARN-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201546#comment-15201546 ] Sunil G commented on YARN-4837: --- Thanks [~vinodkv] for pitching in. YARN-2005 blacklists nodes if the AM container launch failed due to DISK_FAILED. And after YARN-4284, blacklisting for AM container failure is done for all container failures except PREEMPTED. There were a few discussions on the use-case aspects of this change. If the blacklisting (AM container failure) feature is enabled at the cluster level, all applications are forced to comply with the blacklisting rule. YARN-4389 also had an option to disable this feature from the application end, and to adjust the threshold if it is too strict (or too lenient). Yes, I agree with your point that it is early for the user to make blacklisting decisions without much needed/useful information. But given the feature's current aggressive nature, that change helped applications opt out of it. I agree that this has to be a feature that can be controlled without causing problems in a busy cluster. I think a time-based purging solution may be ideal, to allow the same app to use the node again. > User facing aspects of 'AM blacklisting' feature need fixing > > > Key: YARN-4837 > URL: https://issues.apache.org/jira/browse/YARN-4837 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Vinod Kumar Vavilapalli >Assignee: Vinod Kumar Vavilapalli > > Was reviewing the user-facing aspects that we are releasing as part of 2.8.0. > Looking at the 'AM blacklisting feature', I see several things to be fixed > before we release it in 2.8.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201146#comment-15201146 ] Vinod Kumar Vavilapalli commented on YARN-2005: --- -1 for backporting this, while I understand that the original feature-ask is useful for avoiding AM scheduling getting blocked, there are far too many issues with the feature as it is. Please see my comments on YARN-4576 and YARN-4837. > Blacklisting support for scheduling AMs > --- > > Key: YARN-2005 > URL: https://issues.apache.org/jira/browse/YARN-2005 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 0.23.10, 2.4.0 >Reporter: Jason Lowe >Assignee: Anubhav Dhoot > Fix For: 2.8.0 > > Attachments: YARN-2005.001.patch, YARN-2005.002.patch, > YARN-2005.003.patch, YARN-2005.004.patch, YARN-2005.005.patch, > YARN-2005.006.patch, YARN-2005.006.patch, YARN-2005.007.patch, > YARN-2005.008.patch, YARN-2005.009.patch > > > It would be nice if the RM supported blacklisting a node for an AM launch > after the same node fails a configurable number of AM attempts. This would > be similar to the blacklisting support for scheduling task attempts in the > MapReduce AM but for scheduling AM attempts on the RM side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4820) ResourceManager web redirects in HA mode drops query parameters
[ https://issues.apache.org/jira/browse/YARN-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200374#comment-15200374 ] Hadoop QA commented on YARN-4820: -
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 3s | trunk passed |
| +1 | compile | 1m 57s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 2m 10s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 32s | trunk passed |
| +1 | mvnsite | 0m 57s | trunk passed |
| +1 | mvneclipse | 0m 28s | trunk passed |
| +1 | findbugs | 1m 38s | trunk passed |
| +1 | javadoc | 0m 35s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 42s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 47s | the patch passed |
| +1 | compile | 1m 41s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 1m 41s | the patch passed |
| +1 | compile | 2m 2s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 2s | the patch passed |
| +1 | checkstyle | 0m 31s | the patch passed |
| +1 | mvnsite | 0m 52s | the patch passed |
| +1 | mvneclipse | 0m 25s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 0s | the patch passed |
| +1 | javadoc | 0m 32s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 40s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 72m 7s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 64m 34s | hadoop-yarn-client in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 73m 41s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
| -1 | unit | 64m 44s | hadoop-yarn-client in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 21s | Patch does not generate ASF License warnings. |
| | | 302m 49s | |

|| Reason || Tests |
[jira] [Commented] (YARN-4595) Add support for configurable read-only mounts
[ https://issues.apache.org/jira/browse/YARN-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200385#comment-15200385 ] Hadoop QA commented on YARN-4595: -
(/) +1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 30s | trunk passed |
| +1 | compile | 0m 21s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 0m 26s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 14s | trunk passed |
| +1 | mvnsite | 0m 28s | trunk passed |
| +1 | mvneclipse | 0m 13s | trunk passed |
| +1 | findbugs | 0m 50s | trunk passed |
| +1 | javadoc | 0m 17s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 22s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 24s | the patch passed |
| +1 | compile | 0m 20s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 0m 20s | the patch passed |
| +1 | compile | 0m 23s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 23s | the patch passed |
| +1 | checkstyle | 0m 12s | the patch passed |
| +1 | mvnsite | 0m 25s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 0s | the patch passed |
| +1 | javadoc | 0m 16s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 9m 5s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_74. |
| +1 | unit | 9m 40s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
| | | 33m 20s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12794016/YARN-4595.2.patch |
| JIRA Issue | YARN-4595 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux b1bf64ea3a7a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / dc951e6 |
| Default Java | 1.7.0_95 |
| Multi-JDK versions
[jira] [Created] (YARN-4828) Create a pull request template for github
Steve Loughran created YARN-4828: Summary: Create a pull request template for github Key: YARN-4828 URL: https://issues.apache.org/jira/browse/YARN-4828 Project: Hadoop YARN Issue Type: Improvement Components: build Affects Versions: 3.0.0 Environment: github Reporter: Steve Loughran Priority: Minor
We're starting to see PRs appear without any JIRA reference, explanation, etc. These are going to be ignored without them. It's possible to [create a PR text template](https://help.github.com/articles/creating-a-pull-request-template-for-your-repository/) under {{.github/PULL_REQUEST_TEMPLATE}}. We can add such a template, prompting for summary points such as:
* which JIRA?
* if against an object store, how did you test it?
* if it's a shell script, how did you test it?
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
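Purely as an illustration (the wording below is hypothetical, not an agreed-upon template), such a file could look like:
{code}
<!-- .github/PULL_REQUEST_TEMPLATE (hypothetical example) -->
## Which JIRA issue does this PR relate to?
YARN-NNNN / HADOOP-NNNN / MAPREDUCE-NNNN

## Summary of the change

## How was this patch tested?
- If it touches an object store: which store, and how was it tested?
- If it changes a shell script: how was it tested?
{code}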
[jira] [Updated] (YARN-1508) Document Dynamic Resource Configuration feature
[ https://issues.apache.org/jira/browse/YARN-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-1508: - Summary: Document Dynamic Resource Configuration feature (was: Rename ResourceOption and document resource over-commitment cases)
> Document Dynamic Resource Configuration feature
> ---
>
> Key: YARN-1508
> URL: https://issues.apache.org/jira/browse/YARN-1508
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Junping Du
>
> Per Vinod's comment in YARN-312(https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087) and Bikas' comment in YARN-311(https://issues.apache.org/jira/browse/YARN-311?focusedCommentId=13848615), the name ResourceOption is not clear enough to be easily understood. Also, we need to document more on resource over-commitment timing and use cases.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4820) ResourceManager web redirects in HA mode drops query parameters
[ https://issues.apache.org/jira/browse/YARN-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197561#comment-15197561 ] Hadoop QA commented on YARN-4820: -
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 12s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 11s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 42s | trunk passed |
| +1 | compile | 1m 43s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 2m 7s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 32s | trunk passed |
| +1 | mvnsite | 0m 57s | trunk passed |
| +1 | mvneclipse | 0m 27s | trunk passed |
| +1 | findbugs | 1m 39s | trunk passed |
| +1 | javadoc | 0m 35s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 42s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 47s | the patch passed |
| +1 | compile | 1m 40s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 1m 40s | the patch passed |
| +1 | compile | 2m 5s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 5s | the patch passed |
| +1 | checkstyle | 0m 31s | the patch passed |
| +1 | mvnsite | 0m 53s | the patch passed |
| +1 | mvneclipse | 0m 24s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| -1 | findbugs | 1m 18s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
| +1 | javadoc | 0m 30s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 40s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 71m 28s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 64m 35s | hadoop-yarn-client in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 73m 42s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
| -1 | unit | 64m 45s | hadoop-yarn-client in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 21s | Patch does not generate ASF License warnings. |
[jira] [Created] (YARN-4832) NM side resource value should get updated if change applied in RM side
Junping Du created YARN-4832: Summary: NM side resource value should get updated if change applied in RM side Key: YARN-4832 URL: https://issues.apache.org/jira/browse/YARN-4832 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Reporter: Junping Du Assignee: Junping Du Priority: Critical
Currently, if we use the CLI to update node resources (single or multiple nodes) on the RM side, the NM will not receive any notification. This doesn't affect resource scheduling, but it makes the resource usage metrics reported by the NM a bit weird. We should sync the new resource values between the RM and NM.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4711) NM is going down with NPE's due to single thread processing of events by Timeline client
[ https://issues.apache.org/jira/browse/YARN-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197864#comment-15197864 ] Naganarasimha G R commented on YARN-4711: - offline comments from [~sjlee0] : {quote} The current state of NM timeline integration seems to have quite a few rough edges. I did look at the exceptions and here are my early thoughts. I agree with your other points. (1) NPE in {{NMTimelinePublisher$ContainerEventHandler}} I understand that this happens because the event is handled after the container object was removed in the NM context, correct? As a rule, I think any attempt to retrieve objects from the NM context in the async event handler is inherently dangerous because there is no guarantee that those objects are still there in the context. So we should review the {{NMTimelinePublisher}} code to spot those cases. This is one of them. What this event handler needs is the container's resource and priority. What I would suggest is to add the resource and priority into the event itself. I'm not sure if we need to subclass {{ContainerEvent}} for this purpose... Thoughts? (2) NPE in {{NMTimelinePublisher.putEntity()}} This is the other place in {{NMTimelinePublisher}} where it attempts to retrieve an object from the context, and it fails for a similar reason. My question when I looked at this is, who should own {{TimelineClient}}s? Currently they are owned by the individual {{ApplicationImpl}} instances. I'm not sure if we went back and forth on this, but if {{ApplicationImpl}} goes away but we still need to publish, there doesn't seem to be a way. Since it's really {{NMTimelinePublisher}} that needs the timeline clients, should they be owned and managed by {{NMTImelinePublisher}}? I know it might be a rather big change, but I'm not sure if there is any other way to resolve this. {quote} > NM is going down with NPE's due to single thread processing of events by > Timeline client > > > Key: YARN-4711 > URL: https://issues.apache.org/jira/browse/YARN-4711 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: 4711Analysis.txt > > > After YARN-3367, while testing the latest 2928 branch came across few NPEs > due to which NM is shutting down. 
> {code}
> 2016-02-21 23:19:54,078 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.lang.NullPointerException
> at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ContainerEventHandler.handle(NMTimelinePublisher.java:306)
> at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ContainerEventHandler.handle(NMTimelinePublisher.java:296)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> {code}
> java.lang.NullPointerException
> at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.putEntity(NMTimelinePublisher.java:213)
> at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.publishContainerFinishedEvent(NMTimelinePublisher.java:192)
> at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.access$400(NMTimelinePublisher.java:63)
> at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ApplicationEventHandler.handle(NMTimelinePublisher.java:289)
> at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher$ApplicationEventHandler.handle(NMTimelinePublisher.java:280)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> On analysis, we found that there was a delay in processing events: after YARN-3367, all the events are processed by a single thread inside the timeline client.
> Additionally, we found one scenario where there is a possibility of an NPE:
> * TimelineEntity.toString() when {{real}} is not null
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
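A sketch of [~sjlee0]'s first suggestion above, carrying the resource and priority inside the event itself so the handler never has to look the container up in the NM context. The subclass name is hypothetical, and the base-class constructor signature is assumed from the discussion:
{code}
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEvent;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEventType;

// Hypothetical subclass: the data the async handler needs travels with the
// event, so it stays valid even after the container is removed from the
// NM context.
public class ContainerMetricsEvent extends ContainerEvent {
  private final Resource resource;
  private final Priority priority;

  public ContainerMetricsEvent(ContainerId containerId,
      ContainerEventType type, Resource resource, Priority priority) {
    super(containerId, type);
    this.resource = resource;   // captured at dispatch time
    this.priority = priority;   // captured at dispatch time
  }

  public Resource getResource() { return resource; }
  public Priority getPriority() { return priority; }
}
{code}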
[jira] [Commented] (YARN-4768) getAvailablePhysicalMemorySize can be inaccurate on linux
[ https://issues.apache.org/jira/browse/YARN-4768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199732#comment-15199732 ] Nathan Roberts commented on YARN-4768: -- Any comments on this approach? > getAvailablePhysicalMemorySize can be inaccurate on linux > - > > Key: YARN-4768 > URL: https://issues.apache.org/jira/browse/YARN-4768 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0, 2.7.2 > Environment: Linux >Reporter: Nathan Roberts >Assignee: Nathan Roberts > Attachments: YARN-4768.patch > > > Algorithm currently uses "MemFree" + "Inactive" from /proc/meminfo > "Inactive" may not be a very good indication of how much memory can be > readily freed because it contains both: > - Pages mapped with MAP_SHARED|MAP_ANONYMOUS (regardless of whether they're > being actively accessed or not. Unclear to me why this is the case...) > - Pages mapped MAP_PRIVATE|MAP_ANONYMOUS that have not been accessed recently > Both of these types of pages probably shouldn't be considered "Available". > "Inactive(file)" would seem more accurate but it's not available in all > kernel versions. To keep things simple, maybe just use "Inactive(file)" if > available, otherwise fallback to "Inactive". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
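To make the proposed fallback concrete, a minimal sketch (the parsing helper below is illustrative, not Hadoop's actual implementation): compute available memory as MemFree plus "Inactive(file)" when the kernel exposes that field, otherwise fall back to "Inactive".
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class MemInfo {
  // Returns an estimate of readily-freeable memory, in kB.
  public static long availableKb() throws IOException {
    Map<String, Long> fields = new HashMap<>();
    for (String line : Files.readAllLines(Paths.get("/proc/meminfo"))) {
      // Lines look like "MemFree:        123456 kB"
      String[] parts = line.split(":\\s+");
      if (parts.length >= 2) {
        fields.put(parts[0], Long.parseLong(parts[1].split("\\s")[0]));
      }
    }
    long inactive = fields.containsKey("Inactive(file)")
        ? fields.get("Inactive(file)")          // newer kernels: file pages only
        : fields.getOrDefault("Inactive", 0L);  // fallback for older kernels
    return fields.getOrDefault("MemFree", 0L) + inactive;
  }
}
{code}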
[jira] [Updated] (YARN-4822) Refactor existing Preemption Policy of CS for easier adding new approach to select preemption candidates
[ https://issues.apache.org/jira/browse/YARN-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4822: - Attachment: YARN-4822.1.patch Attached ver.1 patch for review.
> Refactor existing Preemption Policy of CS for easier adding new approach to select preemption candidates
> 
>
> Key: YARN-4822
> URL: https://issues.apache.org/jira/browse/YARN-4822
> Project: Hadoop YARN
> Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4822.1.patch
>
>
> Currently, ProportionalCapacityPreemptionPolicy has hard-coded logic to select the candidates to be preempted (based on FIFO order of applications/containers). It's not simple to add new candidate-selection logic, such as preemption for large containers, intra-queue fairness/policies, etc.
> In this JIRA, I propose the following changes:
> 1) Clean up the code base, consolidating the current logic into 3 stages:
> - Compute the ideal sharing of queues
> - Select to-be-preempted candidates
> - Send preemption/kill events to the scheduler
> 2) Add a new interface, {{PreemptionCandidatesSelectionPolicy}}, for the above "select to-be-preempted candidates" part. Move the existing candidate-selection logic to {{FifoPreemptionCandidatesSelectionPolicy}}.
> 3) Allow multiple PreemptionCandidatesSelectionPolicies to work together in a chain. A preceding PreemptionCandidatesSelectionPolicy has higher priority to select candidates, and later policies can make decisions according to the already-selected candidates and the pre-computed queue ideal shares of resources.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
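A hedged sketch of what the proposed interface could look like (names follow the description above; the exact signatures in the attached patch may differ):
{code}
import java.util.Map;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer;

// Illustrative only: one link in the candidate-selection chain. Policies run
// in order; each sees what earlier (higher-priority) policies already
// selected and may only add to that selection.
public interface PreemptionCandidatesSelectionPolicy {
  Map<ApplicationAttemptId, Set<RMContainer>> selectCandidates(
      Map<ApplicationAttemptId, Set<RMContainer>> alreadySelected,
      Resource clusterResource,
      Resource totalPreemptionAllowed);
}
{code}
The chaining then amounts to folding the selection through the configured policies, so a FIFO selector, a large-container selector, or an intra-queue fairness selector can each be plugged in without touching the ideal-share computation or the kill-event dispatch.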
[jira] [Commented] (YARN-4823) Refactor the nested reservation id field in listReservation to simple string field
[ https://issues.apache.org/jira/browse/YARN-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200786#comment-15200786 ] Hadoop QA commented on YARN-4823: -
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 44s | trunk passed |
| +1 | compile | 0m 27s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 0m 28s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 17s | trunk passed |
| +1 | mvnsite | 0m 34s | trunk passed |
| +1 | mvneclipse | 0m 15s | trunk passed |
| +1 | findbugs | 1m 5s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 26s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 0m 23s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 0m 23s | the patch passed |
| +1 | compile | 0m 26s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 26s | the patch passed |
| +1 | checkstyle | 0m 15s | the patch passed |
| +1 | mvnsite | 0m 31s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 14s | the patch passed |
| +1 | javadoc | 0m 17s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 22s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 67m 36s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 68m 58s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
| | | 152m 46s | |

|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12794057/YARN-4823-v1.patch |
| JIRA Issue | YARN-4823 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs c
[jira] [Commented] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started
[ https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198081#comment-15198081 ] Eric Payne commented on YARN-4686: -- Thanks, [~ebadger], for this patch. +1, LGTM. I will wait a day or so to give [~jlowe], [~kasha], and others a chance to comment, then will commit if there are no further concerns.
> MiniYARNCluster.start() returns before cluster is completely started
> 
>
> Key: YARN-4686
> URL: https://issues.apache.org/jira/browse/YARN-4686
> Project: Hadoop YARN
> Issue Type: Bug
> Components: test
>Reporter: Rohith Sharma K S
>Assignee: Eric Badger
> Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch, YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch, YARN-4686.005.patch, YARN-4686.006.patch
>
>
> TestRMNMInfo fails intermittently. Below is the trace for the failure:
> {noformat}
> testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 sec <<< FAILURE!
> java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but was:<3>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111)
> {noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4831) Recovered containers will be killed after NM stateful restart
[ https://issues.apache.org/jira/browse/YARN-4831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-4831: -- Description: {code} 2016-03-04 19:43:48,130 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1456335621285_0040_01_66 transitioned from NEW to DONE 2016-03-04 19:43:48,130 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=henkins-service OPERATION=Container Finished - Killed TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1456335621285_0040 {code} > Recovered containers will be killed after NM stateful restart > -- > > Key: YARN-4831 > URL: https://issues.apache.org/jira/browse/YARN-4831 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Siqi Li > > {code} > 2016-03-04 19:43:48,130 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: > Container container_1456335621285_0040_01_66 transitioned from NEW to > DONE > 2016-03-04 19:43:48,130 INFO > org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=henkins-service >OPERATION=Container Finished - Killed TARGET=ContainerImpl > RESULT=SUCCESS APPID=application_1456335621285_0040 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200211#comment-15200211 ] Varun Saxena commented on YARN-4517: Filed YARN-4835 > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch > > > We need nodes page added to next generation web UI, similar to existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (YARN-4390) Consider container request size during CS preemption
[ https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne reopened YARN-4390: -- Assignee: Wangda Tan (was: Eric Payne) {quote} YARN-4108 doesn't solve all the issues. (I planned to solve this together with YARN-4108, but YARN-4108 only tackled half of the problem: once containers are selected, preempt only useful containers.) However, we need to select containers more cleverly based on the requirement. I have been thinking about this recently and plan to make some progress as soon as possible. May I reopen this JIRA and take it over from you? {quote} [~leftnoteasy], I had forgotten that we had closed this JIRA in favor of YARN-4108. Yes, I had noticed that the selection of containers to preempt in YARN-4108 does not actually consider the properties of the needed resources, like size or locality. Even so, YARN-4108 is a big improvement and does prevent unnecessary preemption. However, you are correct: implementing this JIRA would eliminate some extra event passing and processing when killable containers are rejected over and over. I am reopening and assigning to you.
> Consider container request size during CS preemption
> 
>
> Key: YARN-4390
> URL: https://issues.apache.org/jira/browse/YARN-4390
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
>Affects Versions: 3.0.0, 2.8.0, 2.7.3
>Reporter: Eric Payne
>Assignee: Wangda Tan
>
> There are multiple reasons why preemption could unnecessarily preempt containers. One is that an app could be requesting a large container (say 8-GB), and the preemption monitor could conceivably preempt multiple containers (say 8, 1-GB containers) in order to fill the large container request. These smaller containers would then be rejected by the requesting AM and potentially given right back to the preempted app.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4746) yarn web services should convert parse failures of appId to 400
[ https://issues.apache.org/jira/browse/YARN-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4746: --- Attachment: 0003-YARN-4746.patch [~ste...@apache.org] Thank you for the review. Uploading a patch with the changes below:
# testInvalidAppAttempts corrected to check an invalid attempt; earlier it was checking an invalid appId
# Moved parse validation for the application ID to WebAppUtil and completed the refactoring
Please do review.
> yarn web services should convert parse failures of appId to 400
> ---
>
> Key: YARN-4746
> URL: https://issues.apache.org/jira/browse/YARN-4746
> Project: Hadoop YARN
> Issue Type: Bug
> Components: webapp
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Minor
> Attachments: 0001-YARN-4746.patch, 0002-YARN-4746.patch, 0003-YARN-4746.patch
>
>
> I'm seeing somewhere in the WS API tests of mine an error with exception conversion of a bad app ID sent in as an argument to a GET. I know it's in ATS, but a scan of the core RM web services implies the same problem.
> {{WebServices.parseApplicationId()}} uses {{ConverterUtils.toApplicationId}} to convert an argument; this throws IllegalArgumentException, which is then handled somewhere by Jetty as a 500 error.
> In fact, it's a bad argument, which should be handled by returning a 400. This can be done by catching the raised exception and explicitly converting it.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
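A sketch of the conversion the patch aims for. The wrapper class name is hypothetical (the patch mentions moving the validation to WebAppUtil), and the web-layer exception types are assumed to behave like YARN's webapp BadRequestException/NotFoundException:
{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.webapp.BadRequestException;
import org.apache.hadoop.yarn.webapp.NotFoundException;

// Hypothetical helper: catch the IllegalArgumentException from the converter
// and rethrow it as a 4xx web error instead of letting it surface as a 500.
public class WebAppUtilSketch {
  public static ApplicationId parseApplicationId(String appId) {
    if (appId == null || appId.isEmpty()) {
      throw new NotFoundException("appId, " + appId + ", is empty or null");
    }
    try {
      return ConverterUtils.toApplicationId(appId);
    } catch (IllegalArgumentException e) {
      // Bad client input => HTTP 400, not an internal server error
      throw new BadRequestException("Invalid appId: " + appId);
    }
  }
}
{code}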
[jira] [Commented] (YARN-4785) inconsistent value type of the "type" field for LeafQueueInfo in response of RM REST API - cluster/scheduler
[ https://issues.apache.org/jira/browse/YARN-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200116#comment-15200116 ] Hudson commented on YARN-4785: -- FAILURE: Integrated in Hadoop-trunk-Commit #9473 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9473/]) YARN-4785. inconsistent value type of the type field for LeafQueueInfo (junping_du: rev ca8106d2dd03458944303d93679daa03b1d82ad5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java > inconsistent value type of the "type" field for LeafQueueInfo in response of > RM REST API - cluster/scheduler > > > Key: YARN-4785 > URL: https://issues.apache.org/jira/browse/YARN-4785 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.6.0 >Reporter: Jayesh >Assignee: Varun Vasudev > Labels: REST_API > Fix For: 2.8.0, 2.7.3, 2.6.5 > > Attachments: YARN-4785.001.patch, YARN-4785.branch-2.6.001.patch, > YARN-4785.branch-2.7.001.patch > > > I see inconsistent value type ( String and Array ) of the "type" field for > LeafQueueInfo in response of RM REST API - cluster/scheduler > as per the spec it should be always String. > here is the sample output ( removed non-relevant fields ) > {code} > { > "scheduler": { > "schedulerInfo": { > "type": "capacityScheduler", > "capacity": 100, > ... > "queueName": "root", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 0.1, > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 0.1, > "queueName": "test-queue", > "state": "RUNNING", > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 2.5, > > }, > { > "capacity": 25, > > "state": "RUNNING", > "queues": { > "queue": [ > { > "capacity": 6, > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > > }, > { > "capacity": 6, > ... > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > ... > }, > ... > ] > }, > ... > } > ] > } > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4576) Enhancement for tracking Blacklist in AM Launching
[ https://issues.apache.org/jira/browse/YARN-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201139#comment-15201139 ] Vinod Kumar Vavilapalli commented on YARN-4576: --- Please read my comments on YARN-4837: this whole "AM blacklisting" feature is unnecessarily blown way out of proportion - we just don't need this amount of complexity. Adding more functionality like global lists (YARN-4635), per-user lists (YARN-4790), pluggable blacklisting ((!)) (YARN-4636) etc. will make things far worse. Containers are marked DISKS_FAILED only if all the disks have become bad, in which case the node itself becomes unhealthy. So there is no need for blacklisting per app at all!! If an AM is killed due to memory overflow, blacklisting the node will not help at all! Overall, like I commented on [the JIRA YARN-4790|https://issues.apache.org/jira/browse/YARN-4790?focusedCommentId=15191217&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15191217], what we need is to not penalize applications for system-related issues. When YARN finds a node with configuration/permission issues, it should itself take action to (a) avoid scheduling on that node and (b) alert administrators, etc. Implementing heuristics for app/user-level blacklisting to work around platform problems should be a last-ditch effort. We did that in Hadoop 1 MapReduce because we didn't have a clear demarcation between app vs. system failures. But that isn't the case with YARN - part of the reason why we never implemented heuristics-based per-app blacklisting in YARN - we left that completely up to applications.
> Enhancement for tracking Blacklist in AM Launching
> --
>
> Key: YARN-4576
> URL: https://issues.apache.org/jira/browse/YARN-4576
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: EnhancementAMLaunchingBlacklist.pdf
>
>
> Before YARN-2005, the YARN blacklist mechanism tracked bad nodes via the AM: if the AM's attempts to launch containers on a specific node failed several times, the AM would blacklist this node in future resource asks. This mechanism works fine for normal containers. However, from our observation of the behavior of several clusters: if a problematic node fails to launch the AM, the RM can pick this problematic node to launch the next AM attempts again and again, causing application failure when other functional nodes are busy. In the normal case, the customized health-checker script cannot be so sensitive as to mark a node unhealthy when one or two container launches fail.
> After YARN-2005, we can have a BlacklistManager in each RMApp, so nodes where AM attempt launches failed for a specific application before will get blacklisted. To get rid of the potential risk that all nodes get blacklisted by the BlacklistManager, a disable-failure-threshold is involved to stop adding more nodes to the blacklist once a certain ratio is already hit.
> There are already some enhancements to this AM blacklist mechanism: YARN-4284 addresses the wider case of AM container launch failure, and YARN-4389 makes the configuration settings changeable by the app to meet app-specific requirements. However, there are still several gaps to address more scenarios:
> 1. We may need a global blacklist instead of each app maintaining a separate one. The reason is: an AM has a higher chance of failing if other AMs failed there before. A quick example: in a busy cluster, all nodes are busy except two problematic nodes, node a and node b; app1 has already submitted and failed two AM attempts on a and b. app2 and other apps should wait for the other busy nodes rather than waste attempts on these two problematic nodes.
> 2. If AM container failure is recognized as a global event instead of an app's own issue, we should consider that the blacklist is not a permanent thing but comes with a specific time window.
> 3. We could have user-defined blacklist policies to address more possible cases and scenarios, so it is reasonable to make the blacklist policy pluggable.
> 4. For some test scenarios, we could have a whitelist mechanism for AM launching.
> 5. Some minor issues: it sounds like NM reconnect won't refresh the blacklist so far.
> Will try to address all these issues here.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-998) Persistent resource change during NM/RM restart
[ https://issues.apache.org/jira/browse/YARN-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197345#comment-15197345 ] Junping Du commented on YARN-998: - Thanks Jian for the review and comments. bq. DynamicResourceConfiguration(configuration, true): the second parameter is not needed because it's always passing 'true'. Nice catch! Will remove it in the next patch. bq. instead of reloading the config again, looks like we can just call resourceTrackerService.set(newConf) to replace the config? newConfig is reloaded earlier in the same call path. I thought of this before, but my original concern was that it is a bit risky to have an API that replaces the config with whatever comes in. Will update it if this is not a valid concern.
> Persistent resource change during NM/RM restart
> ---
>
> Key: YARN-998
> URL: https://issues.apache.org/jira/browse/YARN-998
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-998-sample.patch, YARN-998-v1.patch
>
>
> When the NM is restarted, whether planned or from a failure, the previous dynamic resource setting should be kept for consistency.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199784#comment-15199784 ] Sunil G commented on YARN-4517: --- [~varun_saxena] bq. Regarding node labels, we can add it in the REST response indicating if labels are enabled or not. We can do this later because this would require another JIRA for REST changes For this, I have already made some changes so we can immediately know whether labels are present in the cluster (I am finishing up the node-label page now and will upload a patch soon). I will try to see whether we can make a unified patch for all the REST changes needed for the UI together. Will sync up with you offline and share a summary here. bq. Maybe something like: "You have to ssh to the missing nodes' /xxx/ dir to look for the logs" I am +1 for giving more information. From the RM, we can get the node IP/hostname. At least we can give a relative path for the log dir (maybe from the available path in yarn-default.xml). Users can change it, so the message can be a possible suggestion.
> [YARN-3368] Add nodes page
> --
>
> Key: YARN-4517
> URL: https://issues.apache.org/jira/browse/YARN-4517
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
>Reporter: Wangda Tan
>Assignee: Varun Saxena
> Labels: webui
> Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, Screenshot_after_4709.png, Screenshot_after_4709_1.png, YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch
>
>
> We need a nodes page added to the next-generation web UI, similar to the existing RM/nodes page.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198218#comment-15198218 ] Daniel Templeton commented on YARN-4311: I don't have any further comments. LGTM.
> Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
> -
>
> Key: YARN-4311
> URL: https://issues.apache.org/jira/browse/YARN-4311
> Project: Hadoop YARN
> Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: YARN-4311-v1.patch, YARN-4311-v10.patch, YARN-4311-v11.patch, YARN-4311-v11.patch, YARN-4311-v2.patch, YARN-4311-v3.patch, YARN-4311-v4.patch, YARN-4311-v5.patch, YARN-4311-v6.patch, YARN-4311-v7.patch, YARN-4311-v8.patch, YARN-4311-v9.patch
>
>
> In order to fully forget about a node, removing the node from the include and exclude lists is not sufficient; the RM still lists it under decommissioned nodes. The tricky part that [~jlowe] pointed out was the case when include lists are not used; in that case, we don't want the nodes to fall off if they are not active.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4829) Add support for binary units
[ https://issues.apache.org/jira/browse/YARN-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200326#comment-15200326 ] Hadoop QA commented on YARN-4829: -
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| 0 | mvndep | 1m 27s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 42s | YARN-3926 passed |
| +1 | compile | 1m 49s | YARN-3926 passed with JDK v1.8.0_74 |
| +1 | compile | 2m 10s | YARN-3926 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 37s | YARN-3926 passed |
| +1 | mvnsite | 1m 0s | YARN-3926 passed |
| +1 | mvneclipse | 0m 28s | YARN-3926 passed |
| +1 | findbugs | 2m 43s | YARN-3926 passed |
| +1 | javadoc | 1m 4s | YARN-3926 passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 19s | YARN-3926 passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 49s | the patch passed |
| +1 | compile | 1m 43s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 1m 43s | the patch passed |
| +1 | compile | 2m 6s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 6s | the patch passed |
| -1 | checkstyle | 0m 31s | hadoop-yarn-project/hadoop-yarn: patch generated 1 new + 7 unchanged - 0 fixed = 8 total (was 7) |
| +1 | mvnsite | 0m 56s | the patch passed |
| +1 | mvneclipse | 0m 21s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 3m 2s | the patch passed |
| +1 | javadoc | 1m 2s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 14s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 25s | hadoop-yarn-api in the patch passed with JDK v1.8.0_74. |
| +1 | unit | 2m 5s | hadoop-yarn-common in the patch passed with JDK v1.8.0_74. |
| +1 | unit | 0m 25s | hadoop-yarn-api in the patch passed with JDK v1.7.0_95. |
| +1 | unit | 2m 15s | hadoop-yarn-common in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 19s | Patch does not generate ASF License warnings. |
[jira] [Commented] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started
[ https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197845#comment-15197845 ] Hadoop QA commented on YARN-4686: -
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
| 0 | mvndep | 0m 10s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 39s | trunk passed |
| +1 | compile | 1m 49s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 2m 6s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 33s | trunk passed |
| +1 | mvnsite | 1m 10s | trunk passed |
| +1 | mvneclipse | 0m 38s | trunk passed |
| +1 | findbugs | 1m 49s | trunk passed |
| +1 | javadoc | 0m 46s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 53s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 58s | the patch passed |
| +1 | compile | 1m 50s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 1m 50s | the patch passed |
| +1 | compile | 2m 5s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 5s | the patch passed |
| -1 | checkstyle | 0m 31s | hadoop-yarn-project/hadoop-yarn: patch generated 1 new + 30 unchanged - 0 fixed = 31 total (was 30) |
| +1 | mvnsite | 1m 5s | the patch passed |
| +1 | mvneclipse | 0m 35s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 20s | the patch passed |
| +1 | javadoc | 0m 40s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 49s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 9m 0s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_74. |
| -1 | unit | 6m 20s | hadoop-yarn-server-tests in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 63m 18s | hadoop-yarn-client in the patch failed with JDK v1.8.0_74. |
| +1 | unit | 9m 33s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_95. |
| -1 | unit | 6m 30s | hadoop-yarn-server-tests in the patch failed with JDK v1.7.0_95. |
| -1 | unit
[jira] [Updated] (YARN-4686) MiniYARNCluster.start() returns before cluster is completely started
[ https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-4686: -- Attachment: YARN-4686-branch-2.7.006.patch [~eepayne] Attaching the branch-2.7 patch. It passed all of the tests locally on my machine. > MiniYARNCluster.start() returns before cluster is completely started > > > Key: YARN-4686 > URL: https://issues.apache.org/jira/browse/YARN-4686 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch, > YARN-4686-branch-2.7.006.patch, YARN-4686.001.patch, YARN-4686.002.patch, > YARN-4686.003.patch, YARN-4686.004.patch, YARN-4686.005.patch, > YARN-4686.006.patch > > > TestRMNMInfo fails intermittently. Below is the trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
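Until a fixed MiniYARNCluster.start() is available, tests can guard against the race themselves by polling the RM until every NodeManager has registered. A minimal sketch, assuming the test holds the MiniYARNCluster instance; the helper name, timeout, and polling interval are illustrative and not from the attached patches:
{code}
import java.util.concurrent.TimeoutException;
import org.apache.hadoop.yarn.server.MiniYARNCluster;

public class WaitForNodesSketch {
  // Hypothetical helper: block until the expected number of NodeManagers has
  // registered with the RM, since start() may return before they all do.
  static void waitForLiveNodes(MiniYARNCluster cluster, int expected,
      long timeoutMs) throws InterruptedException, TimeoutException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (cluster.getResourceManager().getRMContext()
        .getRMNodes().size() < expected) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException(
            "NodeManagers did not all register within " + timeoutMs + " ms");
      }
      Thread.sleep(100); // brief back-off between polls
    }
  }
}
{code}
A test like TestRMNMInfo would call this right after cluster.start() and before asserting on the live-node count.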
[jira] [Commented] (YARN-4785) inconsistent value type of the "type" field for LeafQueueInfo in response of RM REST API - cluster/scheduler
[ https://issues.apache.org/jira/browse/YARN-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197781#comment-15197781 ] Jayesh commented on YARN-4785: -- +1 (thanks for explaining the solution in the code comment) > inconsistent value type of the "type" field for LeafQueueInfo in response of > RM REST API - cluster/scheduler > > > Key: YARN-4785 > URL: https://issues.apache.org/jira/browse/YARN-4785 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.6.0 >Reporter: Jayesh >Assignee: Varun Vasudev > Labels: REST_API > Attachments: YARN-4785.001.patch > > > I see inconsistent value type ( String and Array ) of the "type" field for > LeafQueueInfo in response of RM REST API - cluster/scheduler > as per the spec it should always be String. > here is the sample output (non-relevant fields removed) > {code} > { > "scheduler": { > "schedulerInfo": { > "type": "capacityScheduler", > "capacity": 100, > ... > "queueName": "root", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 0.1, > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 0.1, > "queueName": "test-queue", > "state": "RUNNING", > > }, > { > "type": [ > "capacitySchedulerLeafQueueInfo" > ], > "capacity": 2.5, > > }, > { > "capacity": 25, > > "state": "RUNNING", > "queues": { > "queue": [ > { > "capacity": 6, > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > > }, > { > "capacity": 6, > ... > "state": "RUNNING", > "queues": { > "queue": [ > { > "type": "capacitySchedulerLeafQueueInfo", > "capacity": 100, > ... > } > ] > }, > ... > }, > ... > ] > }, > ... > } > ] > } > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
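Until a fixed RM is deployed, REST clients can normalize the field defensively rather than assuming it is always a String. A sketch assuming a Jackson-based client; the helper name is illustrative and not from the patch:
{code}
import com.fasterxml.jackson.databind.JsonNode;

public class QueueTypeSketch {
  // Tolerate both shapes of "type": a plain string or a one-element array.
  static String queueType(JsonNode queue) {
    JsonNode type = queue.get("type");
    if (type == null) {
      return null; // parent queues carry no "type" field in this response
    }
    return type.isArray() ? type.get(0).asText() : type.asText();
  }
}
{code}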
[jira] [Commented] (YARN-4839) ResourceManager deadlock between RMAppAttemptImpl and SchedulerApplicationAttempt
[ https://issues.apache.org/jira/browse/YARN-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201578#comment-15201578 ] Jason Lowe commented on YARN-4839: -- Stack trace of the relevant threads: {noformat} "IPC Server handler 32 on 8030" #153 daemon prio=5 os_prio=0 tid=0x7fb649603800 nid=0x20b1 waiting on condition [0x7fb5888d2000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00036de978f0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283) at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.getMasterContainer(RMAppAttemptImpl.java:779) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.pullNewlyAllocatedContainersAndNMTokens(SchedulerApplicationAttempt.java:467) - locked <0x00032a106f00> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.getAllocation(FiCaSchedulerApp.java:278) - locked <0x00032a106f00> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocate(CapacityScheduler.java:1008) - locked <0x00032a106f00> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp) at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:534) - locked <0x000383ce08b0> (a org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService$AllocateResponseLock) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60) at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:608) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server.call(Server.java:2267) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:648) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:615) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2217) [...] 
"1413615244@qtp-1677286081-37" #40337 daemon prio=5 os_prio=0 tid=0x7fb62c089800 nid=0x1b8d waiting for monitor entry [0x7fb5ca40e000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:580) - waiting to lock <0x00032a106f00> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:267) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.getApplicationResourceUsageReport(RMAppAttemptImpl.java:826) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.createAndGetApplicationReport(RMAppImpl.java:580) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplications(ClientRMService.java:815) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:457) at sun.reflect.GeneratedMethodAccessor142.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvok
[jira] [Commented] (YARN-4751) In 2.7, Labeled queue usage not shown properly in capacity scheduler UI
[ https://issues.apache.org/jira/browse/YARN-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197668#comment-15197668 ] Sunil G commented on YARN-4751: --- Yes, agreed on the point that we would have to aggregate and pick through multiple patches to get the functionality. If 2.7 doesn't need new features/enhancements for labels, we can bring in patches on a use-case basis. cc/[~wangda.tan] > In 2.7, Labeled queue usage not shown properly in capacity scheduler UI > --- > > Key: YARN-4751 > URL: https://issues.apache.org/jira/browse/YARN-4751 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, yarn >Affects Versions: 2.7.3 >Reporter: Eric Payne >Assignee: Eric Payne > Attachments: 2.7 CS UI No BarGraph.jpg, > YARH-4752-branch-2.7.001.patch, YARH-4752-branch-2.7.002.patch > > > In 2.6 and 2.7, the capacity scheduler UI does not have the queue graphs > separated by partition. When applications are running on a labeled queue, no > color is shown in the bar graph, and several of the "Used" metrics are zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4836) [YARN-3368] Add AM related pages
Varun Saxena created YARN-4836: -- Summary: [YARN-3368] Add AM related pages Key: YARN-4836 URL: https://issues.apache.org/jira/browse/YARN-4836 Project: Hadoop YARN Issue Type: Sub-task Components: webapp Reporter: Varun Saxena Assignee: Varun Saxena -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4839) ResourceManager deadlock between RMAppAttemptImpl and SchedulerApplicationAttempt
[ https://issues.apache.org/jira/browse/YARN-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201680#comment-15201680 ] Jason Lowe commented on YARN-4839: -- This appears to have been fixed as a side-effect of YARN-3361, which wasn't in the build that reproduced this issue. That change updated getMasterContainer to avoid locking the RMAppAttemptImpl. > ResourceManager deadlock between RMAppAttemptImpl and > SchedulerApplicationAttempt > - > > Key: YARN-4839 > URL: https://issues.apache.org/jira/browse/YARN-4839 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Jason Lowe >Priority: Blocker > > Hit a deadlock in the ResourceManager as one thread was holding the > SchedulerApplicationAttempt lock and trying to call > RMAppAttemptImpl.getMasterContainer while another thread had the > RMAppAttemptImpl lock and was trying to call > SchedulerApplicationAttempt.getResourceUsageReport. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
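The lock-avoidance pattern that change implies looks roughly like the following. This is a hedged sketch of the idea with assumed field and method names, not the YARN-3361 diff:
{code}
import org.apache.hadoop.yarn.api.records.Container;

// Publish the master container through a volatile field so that readers
// never need the attempt's read lock, breaking one edge of the lock cycle.
class AttemptContainerSketch {
  private volatile Container masterContainer;

  public Container getMasterContainer() {
    return masterContainer; // plain volatile read, no attempt lock taken
  }

  // Writers still update the field from within the attempt's guarded state
  // transitions, so write ordering is unchanged.
  void setMasterContainer(Container container) {
    this.masterContainer = container;
  }
}
{code}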
[jira] [Updated] (YARN-4595) Add support for configurable read-only mounts
[ https://issues.apache.org/jira/browse/YARN-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi updated YARN-4595: - Attachment: YARN-4595.2.patch Rebased patch. > Add support for configurable read-only mounts > - > > Key: YARN-4595 > URL: https://issues.apache.org/jira/browse/YARN-4595 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Billie Rinaldi >Assignee: Billie Rinaldi > Attachments: YARN-4595.1.patch, YARN-4595.2.patch > > > Mounting files or directories from the host is one way of passing > configuration and other information into a docker container. We could allow > the user to set a list of mounts in the environment of ContainerLaunchContext > (e.g. /dir1:/targetdir1,/dir2:/targetdir2). These would be mounted read-only > to the specified target locations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
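As a sketch of how the proposed "/dir1:/targetdir1,/dir2:/targetdir2" environment value could be consumed when building the docker run command; the parsing helper and its surroundings are assumptions for illustration, not taken from the patch:
{code}
import java.util.ArrayList;
import java.util.List;

public class ReadOnlyMountSketch {
  // Hypothetical parser for a "src1:dst1,src2:dst2" mount list; each entry
  // becomes a read-only (:ro) docker bind-mount argument pair.
  static List<String> toReadOnlyMountArgs(String spec) {
    List<String> args = new ArrayList<>();
    for (String mount : spec.split(",")) {
      String[] parts = mount.split(":");
      if (parts.length != 2) {
        throw new IllegalArgumentException("Bad mount spec: " + mount);
      }
      args.add("-v");
      args.add(parts[0] + ":" + parts[1] + ":ro");
    }
    return args;
  }
}
{code}
For the example list above, toReadOnlyMountArgs("/dir1:/targetdir1,/dir2:/targetdir2") would yield -v /dir1:/targetdir1:ro -v /dir2:/targetdir2:ro.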
[jira] [Commented] (YARN-2915) Enable YARN RM scale out via federation using multiple RM's
[ https://issues.apache.org/jira/browse/YARN-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197956#comment-15197956 ] Vinod Kumar Vavilapalli commented on YARN-2915: --- One thing that occurred to me in an offline conversation with [~subru] and [~leftnoteasy] is about the modeling of queues and their shares in different sub-clusters. As seems to be already proposed, it is very desirable to have unified *logical queues* that are applicable across all sub-clusters. With unified logical queues, it looks like there are some proposals for how resources can get sub-divided amongst different sub-clusters. But to me, they already map to an existing concept in YARN - *Node Partitions* / node-labels! Essentially you have *one YARN cluster* -> *multiple sub-clusters* -> *each sub-cluster with multiple node-partitions*. This can further be extended to more levels. For example, we can also unify racks under the same concept. The advantage of unifying this with node-partitions is that we can have - a single administrative view philosophy across sub-clusters, node-partitions, racks, etc. - unified configuration mechanisms: Today we support centralized and distributed node-partition mechanisms, exclusive / non-exclusive access etc. - unified queue-sharing models - today we already can assign X% of a node-partition to a queue. This way we can, again, reuse existing concepts, mental models and allocation policies - instead of creating specific policies for sub-cluster sharing like the user-based share that is proposed. We will have to dig deeper into the details, but it seems to me that node-partition and sub-cluster are equivalence classes except for the fact that two sub-clusters report to two different RMs (physically / implementation-wise), which isn't the case today with node-partitions. Thoughts? /cc [~curino] [~chris.douglas] > Enable YARN RM scale out via federation using multiple RM's > --- > > Key: YARN-2915 > URL: https://issues.apache.org/jira/browse/YARN-2915 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, resourcemanager >Reporter: Sriram Rao >Assignee: Subru Krishnan > Attachments: FEDERATION_CAPACITY_ALLOCATION_JIRA.pdf, > Federation-BoF.pdf, Yarn_federation_design_v1.pdf, federation-prototype.patch > > > This is an umbrella JIRA that proposes to scale out YARN to support large > clusters comprising tens of thousands of nodes. That is, rather than > limiting a YARN managed cluster to about 4k in size, the proposal is to > enable the YARN managed cluster to be elastically scalable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
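On the "assign X% of a node-partition to a queue" point, the existing CapacityScheduler knobs already express this. A sketch using the per-label capacity properties as they exist today; the queue name and partition name are invented for illustration:
{code}
import org.apache.hadoop.conf.Configuration;

public class PartitionCapacitySketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Let root.engineering access the "gpu" partition...
    conf.set(
        "yarn.scheduler.capacity.root.engineering.accessible-node-labels",
        "gpu");
    // ...and grant the queue 30% of that partition's resources.
    conf.set("yarn.scheduler.capacity.root.engineering"
        + ".accessible-node-labels.gpu.capacity", "30");
  }
}
{code}
If sub-clusters were modeled as partitions, the same mechanism could carry a queue's share of each sub-cluster without a new policy layer.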