[jira] [Updated] (YARN-457) Setting updated nodes from null to null causes NPE in AllocateResponsePBImpl
[ https://issues.apache.org/jira/browse/YARN-457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenji Kikushima updated YARN-457: - Attachment: YARN-457-5.patch Added a test for setting updatedNodes to null. Setting updated nodes from null to null causes NPE in AllocateResponsePBImpl Key: YARN-457 URL: https://issues.apache.org/jira/browse/YARN-457 Project: Hadoop YARN Issue Type: Bug Components: api Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Kenji Kikushima Priority: Minor Labels: Newbie Attachments: YARN-457-2.patch, YARN-457-3.patch, YARN-457-4.patch, YARN-457-5.patch, YARN-457.patch {code} if (updatedNodes == null) { this.updatedNodes.clear(); return; } {code} If this.updatedNodes is already null, a NullPointerException is thrown. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
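For context, a null guard along these lines avoids the NPE (a minimal sketch, not necessarily the committed patch; NodeReport is the element type AllocateResponse uses):
{code}
public void setUpdatedNodes(final List<NodeReport> updatedNodes) {
  if (updatedNodes == null) {
    // this.updatedNodes may itself still be null here, so guard before clear().
    if (this.updatedNodes != null) {
      this.updatedNodes.clear();
    }
    return;
  }
  // ... copy the new list into this.updatedNodes as before ...
}
{code}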
[jira] [Commented] (YARN-721) ContainerManagerImpl failed to authorizeRequest
[ https://issues.apache.org/jira/browse/YARN-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13677852#comment-13677852 ] PengZhang commented on YARN-721: I checked YARN-370 and YARN-382, and I think it's the same problem as this issue. I'm using cdh4.2 and found that SchedulerUtils.java in trunk had not changed, so I mistakenly assumed that the bug was not fixed. I agree that YARN-382 is a better fix. ContainerManagerImpl failed to authorizeRequest --- Key: YARN-721 URL: https://issues.apache.org/jira/browse/YARN-721 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.0.3-alpha Reporter: PengZhang Attachments: YARN-721.patch When security is enabled, the resource check fails and the AM container cannot be launched. It reports something like: Expected resource memory:1800, vCores:1 but found memory:1536, vCores:1. I tracked this problem down and found it was introduced in YARN-2. In RMAppAttemptImpl.ScheduleTransition, after allocate(), the scheduler normalized the Resource and created a new Resource object. The Resource objects in the scheduler and in RMAppAttemptImpl are used for scheduling and launching separately. The differing values met in ContainerManagerImpl and caused this problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-553) Have GetNewApplicationResponse generate a directly usable ApplicationSubmissionContext
[ https://issues.apache.org/jira/browse/YARN-553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-553: -- Attachment: yarn-553-2.patch Fix the javadoc warning. Have GetNewApplicationResponse generate a directly usable ApplicationSubmissionContext -- Key: YARN-553 URL: https://issues.apache.org/jira/browse/YARN-553 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.0.3-alpha Reporter: Harsh J Assignee: Karthik Kambatla Priority: Minor Attachments: yarn-553-1.patch, yarn-553-2.patch Right now, we're doing multiple steps to create a relevant ApplicationSubmissionContext for a pre-received GetNewApplicationResponse. {code} GetNewApplicationResponse newApp = yarnClient.getNewApplication(); ApplicationId appId = newApp.getApplicationId(); ApplicationSubmissionContext appContext = Records.newRecord(ApplicationSubmissionContext.class); appContext.setApplicationId(appId); {code} A simplified way may be to have the GetNewApplicationResponse itself provide a helper method that builds a usable ApplicationSubmissionContext for us. Something like: {code} GetNewApplicationResponse newApp = yarnClient.getNewApplication(); ApplicationSubmissionContext appContext = newApp.generateApplicationSubmissionContext(); {code} [The above method can also take an arg for the container launch spec, or perhaps pre-load defaults like min-resource, etc. in the returned object, aside of just associating the application ID automatically.] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
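A minimal sketch of what such a helper on GetNewApplicationResponse might look like (the method name comes from the proposal above; the body is illustrative, not a committed implementation):
{code}
public ApplicationSubmissionContext generateApplicationSubmissionContext() {
  ApplicationSubmissionContext context =
      Records.newRecord(ApplicationSubmissionContext.class);
  // Associate the newly granted application ID automatically.
  context.setApplicationId(getApplicationId());
  // Defaults such as min-resource could be pre-loaded here as well.
  return context;
}
{code}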
[jira] [Commented] (YARN-773) Move YarnRuntimeException from package api.yarn to api.yarn.exceptions
[ https://issues.apache.org/jira/browse/YARN-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13677906#comment-13677906 ] Steve Loughran commented on YARN-773: - # Given the grief I had with the last exception rebasing, I'd clearly like to get my -117 patch in first. # That failing test is one that I'd already patched there, as it was too brittle against the exception type and location in the test; as wrapping of exceptions changed, it would break. The patched version looks for the string with a contains check, not an equals check {code} if(!e.getMessage().contains(errMessage)) { throw e; } {code} Even those tests are brittle against the error text changing; the text should be moved to constant strings in the source files and referenced by the tests Move YarnRuntimeException from package api.yarn to api.yarn.exceptions -- Key: YARN-773 URL: https://issues.apache.org/jira/browse/YARN-773 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-773.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
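The constant-based pattern being suggested would look something like this (a sketch with illustrative names, not code from the patch):
{code}
public class AppService {
  // The error text lives in one place, in the source file under test.
  public static final String ERR_NO_SUCH_APP = "Application not found";

  public void lookup(String id) {
    throw new RuntimeException(ERR_NO_SUCH_APP + ": " + id);
  }
}

// The test then references the constant instead of a cut-and-paste copy:
try {
  new AppService().lookup("app_001");
} catch (RuntimeException e) {
  if (!e.getMessage().contains(AppService.ERR_NO_SUCH_APP)) {
    throw e;
  }
}
{code}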
[jira] [Commented] (YARN-773) Move YarnRuntimeException from package api.yarn to api.yarn.exceptions
[ https://issues.apache.org/jira/browse/YARN-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13677908#comment-13677908 ] Steve Loughran commented on YARN-773: - I like new exception classes for testability. If an existing class is to be used, the error string has to be made a constant, with the test referencing it directly, rather than having a cut-and-paste copy of the string Move YarnRuntimeException from package api.yarn to api.yarn.exceptions -- Key: YARN-773 URL: https://issues.apache.org/jira/browse/YARN-773 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-773.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-773) Move YarnRuntimeException from package api.yarn to api.yarn.exceptions
[ https://issues.apache.org/jira/browse/YARN-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13677917#comment-13677917 ] Steve Loughran commented on YARN-773: - (ignore last comment, wrong JIRA) Move YarnRuntimeException from package api.yarn to api.yarn.exceptions -- Key: YARN-773 URL: https://issues.apache.org/jira/browse/YARN-773 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-773.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-689) Add multiplier unit to resourcecapabilities
[ https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13677937#comment-13677937 ] Tom White commented on YARN-689: +1 to Hitesh and Bikas' points about minimum (and increment) being an internal scheduling artifact, and removing it from the API (or at least making it clear that AMs shouldn't use it as a multiplier). Add multiplier unit to resourcecapabilities --- Key: YARN-689 URL: https://issues.apache.org/jira/browse/YARN-689 Project: Hadoop YARN Issue Type: Sub-task Components: api, scheduler Affects Versions: 2.0.4-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: YARN-689.patch, YARN-689.patch, YARN-689.patch, YARN-689.patch, YARN-689.patch Currently we are overloading the minimum resource value as the actual multiplier used by the scheduler. Today, with minimum memory set to 1GB, requests for 1.5GB are always translated to an allocation of 2GB. We should decouple the minimum allocation from the multiplier. The multiplier should also be exposed to the client via the RegisterApplicationMasterResponse. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
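For reference, the rounding described above amounts to rounding each request up to the next multiple of the increment (a sketch; normalizeMemory is an illustrative name, not YARN's actual API):
{code}
// Round a memory request up to the next multiple of the increment.
// e.g. requestedMb = 1536 (1.5GB) with incrementMb = 1024 (1GB) -> 2048 (2GB)
public static int normalizeMemory(int requestedMb, int incrementMb) {
  return ((requestedMb + incrementMb - 1) / incrementMb) * incrementMb;
}
{code}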
[jira] [Resolved] (YARN-404) Node Manager leaks Data Node connections
[ https://issues.apache.org/jira/browse/YARN-404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K resolved YARN-404. Resolution: Duplicate Node Manager leaks Data Node connections Key: YARN-404 URL: https://issues.apache.org/jira/browse/YARN-404 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Affects Versions: 2.0.2-alpha, 0.23.6 Reporter: Devaraj K Assignee: Devaraj K The RM fails to hand off some applications to the NM for cleanup; because of this, log aggregation does not happen for those applications, and DataNode connections are leaked on the NM side. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-298) ResourceManager Cluster Applications REST API throws NPE
[ https://issues.apache.org/jira/browse/YARN-298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-298: --- Affects Version/s: 2.0.1-alpha ResourceManager Cluster Applications REST API throws NPE Key: YARN-298 URL: https://issues.apache.org/jira/browse/YARN-298 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.0.1-alpha Reporter: Devaraj K {code:xml} 2012-12-28 06:03:49,125 WARN org.apache.hadoop.yarn.webapp.GenericExceptionHandler: INTERNAL_SERVER_ERROR java.lang.NullPointerException at org.apache.hadoop.yarn.server.security.ApplicationACLsManager.checkAccess(ApplicationACLsManager.java:104) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.hasAccess(RMWebServices.java:100) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:362) at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185) at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339) at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:886) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118) at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.util.IpValidationFilter.doFilter(IpValidationFilter.java:60) at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:985) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) ... {code}
[jira] [Resolved] (YARN-781) Expose LOGDIR that containers should use for logging
[ https://issues.apache.org/jira/browse/YARN-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved YARN-781. Resolution: Duplicate Duplicate of YARN-772. Expose LOGDIR that containers should use for logging Key: YARN-781 URL: https://issues.apache.org/jira/browse/YARN-781 Project: Hadoop YARN Issue Type: Sub-task Reporter: Devaraj Das Assignee: Vinod Kumar Vavilapalli The LOGDIR is known. We should expose this to the container's environment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-767) Initialize Application status metrics when QueueMetrics is initialized
[ https://issues.apache.org/jira/browse/YARN-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678013#comment-13678013 ] Arun C Murthy commented on YARN-767: The patch looks good. Nit: initAppStatusMetrics should be a private method. Initialize Application status metrics when QueueMetrics is initialized --- Key: YARN-767 URL: https://issues.apache.org/jira/browse/YARN-767 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Jian He Attachments: YARN-767.1.patch Applications: ResourceManager.QueueMetrics.AppsSubmitted, ResourceManager.QueueMetrics.AppsRunning, ResourceManager.QueueMetrics.AppsPending, ResourceManager.QueueMetrics.AppsCompleted, ResourceManager.QueueMetrics.AppsKilled, ResourceManager.QueueMetrics.AppsFailed For now, these metrics are created only when they are needed; we want them to be visible as soon as QueueMetrics is initialized. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-648) FS: Add documentation for pluggable policy
[ https://issues.apache.org/jira/browse/YARN-648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678096#comment-13678096 ] Karthik Kambatla commented on YARN-648: --- bq. What do you mean by latency? A policy that is compute-intensive can still slow down things for everyone. Agreed; I should probably leave out latency and just say to configure an appropriate policy. However, I believe latency is also one of the reasons. Consider a configuration where root.a and root.b are two parent queues, configured to use fairshare and SP (a sophisticated scheduler with higher latency than fairshare) respectively. root itself has fairshare. Now, the apps under root.a don't suffer from the overhead of SP. Further, in the future, scheduling within a queue can be done asynchronously to achieve even higher throughput, where the time to schedule an app depends on the cumulative latencies of the parent queue schedulers. If I understand it right, Omega (Google) was built to handle such latencies in the presence of sophisticated schedulers. FS: Add documentation for pluggable policy -- Key: YARN-648 URL: https://issues.apache.org/jira/browse/YARN-648 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.4-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: documentation Attachments: yarn-648-1.patch YARN-469 and YARN-482 make the scheduling policy in FS pluggable. Need to add documentation on how to use this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-760) NodeManager throws AvroRuntimeException on failed start
[ https://issues.apache.org/jira/browse/YARN-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678109#comment-13678109 ] Junping Du commented on YARN-760: - Patch looks good to me. Could someone help review and commit it? Thanks! NodeManager throws AvroRuntimeException on failed start --- Key: YARN-760 URL: https://issues.apache.org/jira/browse/YARN-760 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Niranjan Singh Labels: newbie Attachments: YARN-760.patch, YARN-760.patch, YARN-760.patch NodeManager wraps exceptions that occur in its start method in AvroRuntimeExceptions, even though it doesn't use Avro anywhere else. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-783) RM and NM web server /logs pages link not working
Binglin Chang created YARN-783: -- Summary: RM and NM web server /logs pages link not working Key: YARN-783 URL: https://issues.apache.org/jira/browse/YARN-783 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Binglin Chang The RM and NM web servers' /logs pages link to the main default apps page. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-784) YARN does not provide Version info in JMX like hdfs and mapreducev1
Binglin Chang created YARN-784: -- Summary: YARN does not provide Version info in JMX like hdfs and mapreducev1 Key: YARN-784 URL: https://issues.apache.org/jira/browse/YARN-784 Project: Hadoop YARN Issue Type: Improvement Reporter: Binglin Chang Priority: Minor Some third-party tools may still need this version info in JMX. We should add it for backward compatibility. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-760) NodeManager throws AvroRuntimeException on failed start
[ https://issues.apache.org/jira/browse/YARN-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678110#comment-13678110 ] Jason Lowe commented on YARN-760: - +1, will commit shortly NodeManager throws AvroRuntimeException on failed start --- Key: YARN-760 URL: https://issues.apache.org/jira/browse/YARN-760 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Niranjan Singh Labels: newbie Attachments: YARN-760.patch, YARN-760.patch, YARN-760.patch NodeManager wraps exceptions that occur in its start method in AvroRuntimeExceptions, even though it doesn't use Avro anywhere else. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-784) YARN does not provide Version info in JMX like hdfs and mapreducev1
[ https://issues.apache.org/jira/browse/YARN-784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Binglin Chang updated YARN-784: --- Affects Version/s: 2.1.0-beta YARN does not provide Version info in JMX like hdfs and mapreducev1 --- Key: YARN-784 URL: https://issues.apache.org/jira/browse/YARN-784 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.1.0-beta Reporter: Binglin Chang Priority: Minor Some third-party tools may still need this version info in JMX. We should add it for backward compatibility. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-783) RM and NM web server /logs pages link not working
[ https://issues.apache.org/jira/browse/YARN-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678112#comment-13678112 ] Binglin Chang commented on YARN-783: The problem seems to exist only for the extra Jetty context roots, like /logs/ and /static/; non-root paths, like /logs/userlogs/, work fine. RM and NM web server /logs pages link not working - Key: YARN-783 URL: https://issues.apache.org/jira/browse/YARN-783 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Binglin Chang The RM and NM web servers' /logs pages link to the main default apps page. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-760) NodeManager throws AvroRuntimeException on failed start
[ https://issues.apache.org/jira/browse/YARN-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678127#comment-13678127 ] Niranjan Singh commented on YARN-760: - Thanks to Jason Lowe, Junping Du and Sandy Ryza for all your help. NodeManager throws AvroRuntimeException on failed start --- Key: YARN-760 URL: https://issues.apache.org/jira/browse/YARN-760 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Niranjan Singh Labels: newbie Fix For: 2.1.0-beta Attachments: YARN-760.patch, YARN-760.patch, YARN-760.patch NodeManager wraps exceptions that occur in its start method in AvroRuntimeExceptions, even though it doesn't use Avro anywhere else. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (YARN-781) Expose LOGDIR that containers should use for logging
[ https://issues.apache.org/jira/browse/YARN-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reopened YARN-781: -- I read this as more than just documentation. It is about exposing the final LOGDIR as an env var to the containers so that they can decide where to write logs at run-time. As of today, AMs do have a not-so-straightforward way of doing this: an AM can explicitly set SOMEENV=LOGDIR and we promptly do the appropriate string replacement. But I think it is useful to set it explicitly. Expose LOGDIR that containers should use for logging Key: YARN-781 URL: https://issues.apache.org/jira/browse/YARN-781 Project: Hadoop YARN Issue Type: Sub-task Reporter: Devaraj Das Assignee: Vinod Kumar Vavilapalli The LOGDIR is known. We should expose this to the container's environment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
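From the AM side, the existing workaround Vinod describes looks roughly like this (a sketch assuming YARN's <LOG_DIR> expansion token; the variable name MY_APP_LOG_DIR is illustrative):
{code}
Map<String, String> env = new HashMap<String, String>();
// The NM substitutes the container's actual log directory for the token
// when it launches the container.
env.put("MY_APP_LOG_DIR", ApplicationConstants.LOG_DIR_EXPANSION_VAR);
containerLaunchContext.setEnvironment(env);
{code}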
[jira] [Commented] (YARN-642) Fix up /nodes REST API to have 1 param and be consistent with the Java API
[ https://issues.apache.org/jira/browse/YARN-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678176#comment-13678176 ] Vinod Kumar Vavilapalli commented on YARN-642: -- Looking at it now.. Fix up /nodes REST API to have 1 param and be consistent with the Java API -- Key: YARN-642 URL: https://issues.apache.org/jira/browse/YARN-642 Project: Hadoop YARN Issue Type: Bug Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: incompatible Attachments: YARN-642-1.patch, YARN-642-2.patch, YARN-642-2.patch, YARN-642.patch The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-775) stream jobs are not cleaning the Yarn local-dirs after container is released
[ https://issues.apache.org/jira/browse/YARN-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678174#comment-13678174 ] yeshavora commented on YARN-775: Setting yarn.nodemanager.delete.debug-delay-sec solves the issue. Closing this Jira stream jobs are not cleaning the Yarn local-dirs after container is released Key: YARN-775 URL: https://issues.apache.org/jira/browse/YARN-775 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: yeshavora Assignee: Omkar Vinit Joshi Fix For: 2.1.0-beta Run a stream job: hadoop jar hadoop-streaming.jar -files file:///tmp/Tmp.py -input Tmp.py -output /tmp/Tmpout -mapper python Tmp.py -reducer NONE Container Dirs are not being cleaned after Stream job is completed/Killed/Failed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (YARN-775) stream jobs are not cleaning the Yarn local-dirs after container is released
[ https://issues.apache.org/jira/browse/YARN-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yeshavora resolved YARN-775. Resolution: Invalid stream jobs are not cleaning the Yarn local-dirs after container is released Key: YARN-775 URL: https://issues.apache.org/jira/browse/YARN-775 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: yeshavora Assignee: Omkar Vinit Joshi Fix For: 2.1.0-beta Run a stream job: hadoop jar hadoop-streaming.jar -files file:///tmp/Tmp.py -input Tmp.py -output /tmp/Tmpout -mapper python Tmp.py -reducer NONE Container Dirs are not being cleaned after Stream job is completed/Killed/Failed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-767) Initialize Application status metrics when QueueMetrics is initialized
[ https://issues.apache.org/jira/browse/YARN-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-767: - Attachment: YARN-767.2.patch Added a test and fixed the visibility of the method. Initialize Application status metrics when QueueMetrics is initialized --- Key: YARN-767 URL: https://issues.apache.org/jira/browse/YARN-767 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Jian He Attachments: YARN-767.1.patch, YARN-767.2.patch Applications: ResourceManager.QueueMetrics.AppsSubmitted, ResourceManager.QueueMetrics.AppsRunning, ResourceManager.QueueMetrics.AppsPending, ResourceManager.QueueMetrics.AppsCompleted, ResourceManager.QueueMetrics.AppsKilled, ResourceManager.QueueMetrics.AppsFailed For now, these metrics are created only when they are needed; we want them to be visible as soon as QueueMetrics is initialized. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-457) Setting updated nodes from null to null causes NPE in AllocateResponsePBImpl
[ https://issues.apache.org/jira/browse/YARN-457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678199#comment-13678199 ] Hadoop QA commented on YARN-457: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586663/YARN-457-5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1158//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1158//console This message is automatically generated. Setting updated nodes from null to null causes NPE in AllocateResponsePBImpl Key: YARN-457 URL: https://issues.apache.org/jira/browse/YARN-457 Project: Hadoop YARN Issue Type: Bug Components: api Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Kenji Kikushima Priority: Minor Labels: Newbie Attachments: YARN-457-2.patch, YARN-457-3.patch, YARN-457-4.patch, YARN-457-5.patch, YARN-457.patch {code} if (updatedNodes == null) { this.updatedNodes.clear(); return; } {code} If this.updatedNodes is already null, a NullPointerException is thrown. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-553) Have GetNewApplicationResponse generate a directly usable ApplicationSubmissionContext
[ https://issues.apache.org/jira/browse/YARN-553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678201#comment-13678201 ] Hadoop QA commented on YARN-553: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586672/yarn-553-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1157//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1157//console This message is automatically generated. Have GetNewApplicationResponse generate a directly usable ApplicationSubmissionContext -- Key: YARN-553 URL: https://issues.apache.org/jira/browse/YARN-553 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.0.3-alpha Reporter: Harsh J Assignee: Karthik Kambatla Priority: Minor Attachments: yarn-553-1.patch, yarn-553-2.patch Right now, we're doing multiple steps to create a relevant ApplicationSubmissionContext for a pre-received GetNewApplicationResponse. {code} GetNewApplicationResponse newApp = yarnClient.getNewApplication(); ApplicationId appId = newApp.getApplicationId(); ApplicationSubmissionContext appContext = Records.newRecord(ApplicationSubmissionContext.class); appContext.setApplicationId(appId); {code} A simplified way may be to have the GetNewApplicationResponse itself provide a helper method that builds a usable ApplicationSubmissionContext for us. Something like: {code} GetNewApplicationResponse newApp = yarnClient.getNewApplication(); ApplicationSubmissionContext appContext = newApp.generateApplicationSubmissionContext(); {code} [The above method can also take an arg for the container launch spec, or perhaps pre-load defaults like min-resource, etc. in the returned object, aside of just associating the application ID automatically.] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-602) NodeManager should mandatorily set some Environment variables into every container that it launches
[ https://issues.apache.org/jira/browse/YARN-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678204#comment-13678204 ] Hadoop QA commented on YARN-602: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586012/YARN-602.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1156//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1156//console This message is automatically generated. NodeManager should mandatorily set some Environment variables into every container that it launches Key: YARN-602 URL: https://issues.apache.org/jira/browse/YARN-602 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Kenji Kikushima Attachments: YARN-602.patch NodeManager should mandatorily set some environment variables in every container that it launches, such as Environment.user and Environment.pwd. If both the user and the NodeManager set those variables, the value set by the NM should be used. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
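The precedence rule in the description amounts to something like this when the launch environment is assembled (a minimal sketch; the variable names are illustrative):
{code}
// Start from whatever the user provided...
Map<String, String> env = new HashMap<String, String>(userProvidedEnv);
// ...then let the NM-mandated values win over any user-set duplicates.
env.put("USER", containerUser);
env.put("PWD", containerWorkDir);
{code}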
[jira] [Commented] (YARN-641) Make AMLauncher in RM Use NMClient
[ https://issues.apache.org/jira/browse/YARN-641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678212#comment-13678212 ] Siddharth Seth commented on YARN-641: - I don't think yarn-client should have a dependency on the ResourceManager. We could move the tests (maybe to a yarn-client-test module) or refactor the tests to not use the RM at all, though that may be difficult since some of the tests use the MiniYARNCluster. Make AMLauncher in RM Use NMClient -- Key: YARN-641 URL: https://issues.apache.org/jira/browse/YARN-641 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen YARN-422 adds NMClient. RM's AMLauncher is responsible for the interactions with an application's AM container. AMLauncher should also replace the raw ContainerManager proxy with NMClient. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-785) Every startContainer request send a set of information (auxiliary service related information) which is redundant. Can be replace with single API.
Omkar Vinit Joshi created YARN-785: -- Summary: Every startContainer request send a set of information (auxiliary service related information) which is redundant. Can be replace with single API. Key: YARN-785 URL: https://issues.apache.org/jira/browse/YARN-785 Project: Hadoop YARN Issue Type: Improvement Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi At present we are sending a bunch of information, mainly related to auxiliary services, to whoever launches the container. This is an added overhead for the NM. Instead we can expose this as an API; then, using the NMToken, the client can get this information whenever it needs it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-785) Every startContainer request send a set of information (auxiliary service related information) which is redundant. Can be replaced with single NodeManager API.
[ https://issues.apache.org/jira/browse/YARN-785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-785: --- Summary: Every startContainer request send a set of information (auxiliary service related information) which is redundant. Can be replaced with single NodeManager API. (was: Every startContainer request send a set of information (auxiliary service related information) which is redundant. Can be replace with single API.) Every startContainer request send a set of information (auxiliary service related information) which is redundant. Can be replaced with single NodeManager API. --- Key: YARN-785 URL: https://issues.apache.org/jira/browse/YARN-785 Project: Hadoop YARN Issue Type: Improvement Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi At present we are sending a bunch of information, mainly related to auxiliary services, to whoever launches the container. This is an added overhead for the NM. Instead we can expose this as an API; then, using the NMToken, the client can get this information whenever it needs it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-702) minicluster classpath construction requires user to set yarn.is.minicluster in the job conf
[ https://issues.apache.org/jira/browse/YARN-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678221#comment-13678221 ] Siddharth Seth commented on YARN-702: - bq. I don't like the idea of the mr client layer needing to know which settings to extract from the minimrcluster's config, from an encapsulation point of view Do you mean users of the MR client, i.e. HBase, Pig, or anyone trying to submit a job to the MiniMRCluster? I'd agree; clients pulling individual config keys is a terrible way to get these clients working. bq. This sounds great. This would be a method of some sort that we pass a Configuration into, which then mutates it? Either a method to mutate the user-specific configuration, or a method to return a configuration which users can then merge into their configuration (this already exists, but is a giant set of config parameters). I'm OK with either approach. minicluster classpath construction requires user to set yarn.is.minicluster in the job conf --- Key: YARN-702 URL: https://issues.apache.org/jira/browse/YARN-702 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza YARN-129 improved classpath construction for miniclusters by, when yarn.is.minicluster is set, adding the current JVM's classpath to the ContainerLaunchContext for the MR AM and tasks. An issue with this is that it requires the user to set yarn.is.minicluster on the mapreduce side in the job conf, if they are not copying the RM conf into the job conf. I think it would be better to bypass the ContainerLaunchContext and instead have the nodemanager check the property, and if it is true, do the classpath additions there. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
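The two options under discussion would look roughly like this from a client's point of view (a sketch; configureClient is a hypothetical method name, while getConfig() exists on the mini cluster services):
{code}
// Option 1: a method that mutates the user's Configuration in place.
miniCluster.configureClient(conf);  // hypothetical

// Option 2: return the cluster's Configuration and let the user merge it.
Configuration clusterConf = miniCluster.getConfig();
for (Map.Entry<String, String> entry : clusterConf) {
  conf.set(entry.getKey(), entry.getValue());
}
{code}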
[jira] [Assigned] (YARN-339) TestResourceTrackerService is failing intermittently
[ https://issues.apache.org/jira/browse/YARN-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He reassigned YARN-339: Assignee: Jian He TestResourceTrackerService is failing intermittently Key: YARN-339 URL: https://issues.apache.org/jira/browse/YARN-339 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 0.23.5 Reporter: Ravi Prakash Assignee: Jian He The test after testReconnectNode() usually fails. This might be a race condition in the Metrics2 code. Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.127 sec FAILURE! testDecommissionWithIncludeHosts(org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService) Time elapsed: 55 sec ERROR! org.apache.hadoop.metrics2.MetricsException: Metrics source ClusterMetrics already exists! at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:134) at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:115) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:217) at org.apache.hadoop.yarn.server.resourcemanager.ClusterMetrics.registerMetrics(ClusterMetrics.java:71) at org.apache.hadoop.yarn.server.resourcemanager.ClusterMetrics.getMetrics(ClusterMetrics.java:58) at org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testDecommissionWithIncludeHosts(TestResourceTrackerService.java:74) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678228#comment-13678228 ] Siddharth Seth commented on YARN-752: - Can we make this rack resolution optional? 1) If the user has provided racks, just trust that information instead of resolving and then checking whether the user has provided the correct information. 2) Does it make sense to have an option to disable this automatic resolution? I haven't been following the whitelisting / blacklisting closely, but isn't it possible for users to ask for specific hosts without specifying racks (ignoring scheduler implementation here)? In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-777) Remove unreferenced objects from proto
[ https://issues.apache.org/jira/browse/YARN-777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-777: - Attachment: YARN-777.patch After investigation, I found only one unused object: StringURLMapProto in yarn_protos.proto. Remove unreferenced objects from proto -- Key: YARN-777 URL: https://issues.apache.org/jira/browse/YARN-777 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-777.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-649) Make container logs available over HTTP in plain text
[ https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678238#comment-13678238 ] Sandy Ryza commented on YARN-649: - Somehow messed up the patch's name, but the contents are correct. Make container logs available over HTTP in plain text - Key: YARN-649 URL: https://issues.apache.org/jira/browse/YARN-649 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-649.patch, YARN-752-1.patch It would be good to make container logs available over the REST API for MAPREDUCE-4362 and so that they can be accessed programmatically in general. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-641) Make AMLauncher in RM Use NMClient
[ https://issues.apache.org/jira/browse/YARN-641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678257#comment-13678257 ] Vinod Kumar Vavilapalli commented on YARN-641: -- *sigh* It seems like we'll have to move the tests in yarn-client to yarn-server-tests. Maven is imposing a single dependency graph overall, not one per scope. Once we do this, we should perhaps rename server-tests to integration; that's what it does. Make AMLauncher in RM Use NMClient -- Key: YARN-641 URL: https://issues.apache.org/jira/browse/YARN-641 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen YARN-422 adds NMClient. RM's AMLauncher is responsible for the interactions with an application's AM container. AMLauncher should also replace the raw ContainerManager proxy with NMClient. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE
[ https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal reassigned YARN-299: -- Assignee: Mayank Bansal Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE --- Key: YARN-299 URL: https://issues.apache.org/jira/browse/YARN-299 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.0.1-alpha, 2.0.0-alpha Reporter: Devaraj K Assignee: Mayank Bansal {code:xml} 2012-12-31 10:36:27,844 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Can't handle this event at current state: Current: [DONE], eventType: [RESOURCE_FAILED] org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:819) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:71) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:504) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:497) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) at java.lang.Thread.run(Thread.java:662) 2012-12-31 10:36:27,845 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1356792558130_0002_01_01 transitioned from DONE to null {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678263#comment-13678263 ] Maysam Yabandeh commented on YARN-713: -- Thanks [~tgraves]. The changes in this patch are more or less the same as the changes applied in MAPREDUCE-4295. The updates were lost during the merge of an outdated patch in YARN-39. I am submitting the patch to get some reviews. ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Maysam Yabandeh Priority: Critical Attachments: YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-369) Handle (or throw a proper error when receiving) status updates from application masters that have not registered
[ https://issues.apache.org/jira/browse/YARN-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678267#comment-13678267 ] Mayank Bansal commented on YARN-369: Hi [~abhishekkapoor], Are you still working on this? If you are busy, I can take it up. Thanks, Mayank Handle (or throw a proper error when receiving) status updates from application masters that have not registered - Key: YARN-369 URL: https://issues.apache.org/jira/browse/YARN-369 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.0.3-alpha, trunk-win Reporter: Hitesh Shah Assignee: Abhishek Kapoor Attachments: YARN-369.patch Currently, an allocate call from an unregistered application is allowed and the status update for it throws a statemachine error that is silently dropped. org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: STATUS_UPDATE at LAUNCHED at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:588) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:99) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:471) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:452) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77) at java.lang.Thread.run(Thread.java:680) ApplicationMasterService should likely throw an appropriate error for applications' requests that should not be handled in such cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678270#comment-13678270 ] Sandy Ryza commented on YARN-752: - bq. If the user has provided racks - just trust that information instead of resolving and then checking whether the user has provided the correct information. If we do this, I think it would be good at least to log some sort of warnings when the given and expected racks don't match up. The consequences of a user getting it wrong can be pretty subtle. bq. but isn't it possible for users to ask for specific hosts, without specifying racks (Ignoring scheduler implementation here) Currently, it does not make sense for users to ask for specific hosts without specifying racks in any scheduler implementation. Containers may still be scheduled on those nodes, but they will be treated like any other non-local nodes for the purposes of delay-scheduling and locality-specific requests. I believe that this is not just an implementation issue, but a result of the protocol. A scheduler having zero containers requested on rack1, but non-zero containers requested on nodes on rack1, can occur when a single container was requested for any of multiple nodes on rack1. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678273#comment-13678273 ] Bikas Saha commented on YARN-752: - YARN's default and most efficient scheduling behavior comes when locality can be relaxed from node to rack etc. The AMRMClient should simply do the right thing in this context and add racks for the nodes provided by the user, if the user has not already done so. If this is not done, then it is the user's burden to do it, and if the user forgets, YARN will probably not give the app any containers and the job will be stuck. Hence, by default the AMRMClient should always fill in missing racks. If the user has already done the right thing then this functionality will be a no-op. This cannot be optional. The only time this should not be done is when the user has asked for specific nodes/racks. Even there, the AMRMClient will have to add missing racks, but with a special flag, or else the schedulers will not assign containers to the app. Support for that feature is yet to be added in AMRMClient and is tracked by another jira. If the user has specified a mix of nodes and racks, even then we need to ensure that racks are added for nodes whose racks are missing. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678275#comment-13678275 ] Bikas Saha commented on YARN-752: - The patch looks good to me as is. Need to do a final review. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-513) Create common proxy client for communicating with RM
[ https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-513: - Attachment: YARN-513.12.patch rebased on the latest trunk Create common proxy client for communicating with RM Key: YARN-513 URL: https://issues.apache.org/jira/browse/YARN-513 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Jian He Attachments: YARN-513.10.patch, YARN-513.11.patch, YARN-513.12.patch, YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, YARN-513.4.patch, YARN.513.5.patch, YARN-513.6.patch, YARN-513.7.patch, YARN-513.8.patch, YARN-513.9.patch When the RM is restarting, the NM, AM and Clients should wait for some time for the RM to come back up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-786) Expose in application resource usage RM REST API
Sandy Ryza created YARN-786: --- Summary: Expose in application resource usage RM REST API Key: YARN-786 URL: https://issues.apache.org/jira/browse/YARN-786 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza It might be good to require users to explicitly ask for this information, as it's a little more expensive to collect than the other fields in AppInfo. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-786) Expose application resource usage in RM REST API
[ https://issues.apache.org/jira/browse/YARN-786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-786: Summary: Expose application resource usage in RM REST API (was: Expose in application resource usage RM REST API) Expose application resource usage in RM REST API Key: YARN-786 URL: https://issues.apache.org/jira/browse/YARN-786 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza It might be good to require users to explicitly ask for this information, as it's a little more expensive to collect than the other fields in AppInfo. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-502) RM crash with NPE on NODE_REMOVED event
[ https://issues.apache.org/jira/browse/YARN-502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678300#comment-13678300 ] Mayank Bansal commented on YARN-502: Hi [~sandyr], are you working on this? Thanks, Mayank RM crash with NPE on NODE_REMOVED event --- Key: YARN-502 URL: https://issues.apache.org/jira/browse/YARN-502 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.3-alpha Reporter: Lohit Vijayarenu While running some tests and adding/removing nodes, we saw the RM crash with the exception below. We are testing with the fair scheduler and running hadoop-2.0.3-alpha {noformat} 2013-03-22 18:54:27,015 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Deactivating Node :55680 as it is now LOST 2013-03-22 18:54:27,015 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: :55680 Node Transitioned from UNHEALTHY to LOST 2013-03-22 18:54:27,015 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_REMOVED to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeNode(FairScheduler.java:619) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:856) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:375) at java.lang.Thread.run(Thread.java:662) 2013-03-22 18:54:27,016 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye.. 2013-03-22 18:54:27,020 INFO org.mortbay.log: Stopped SelectChannelConnector@:50030 {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
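The trace points at FairScheduler.removeNode dereferencing a node the scheduler no longer tracks. A minimal defensive sketch of that shape of fix, with assumed field and type names (not the committed change):
{code}
// Ignore NODE_REMOVED events for nodes the scheduler is no longer
// (or was never) tracking, instead of dereferencing a null entry.
private synchronized void removeNode(RMNode rmNode) {
  FSSchedulerNode node = nodes.get(rmNode.getNodeID());
  if (node == null) {
    LOG.warn("Ignoring NODE_REMOVED for untracked node " + rmNode.getNodeID());
    return;
  }
  // ... release running containers and subtract the node's resources ...
}
{code}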
[jira] [Commented] (YARN-502) RM crash with NPE on NODE_REMOVED event
[ https://issues.apache.org/jira/browse/YARN-502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678306#comment-13678306 ] Sandy Ryza commented on YARN-502: - [~mayank_bansal], I am not. I didn't know how to reproduce it - have you experienced this as well? RM crash with NPE on NODE_REMOVED event --- Key: YARN-502 URL: https://issues.apache.org/jira/browse/YARN-502 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.0.3-alpha Reporter: Lohit Vijayarenu While running some tests and adding/removing nodes, we saw the RM crash with the exception below. We are testing with the fair scheduler and running hadoop-2.0.3-alpha {noformat} 2013-03-22 18:54:27,015 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Deactivating Node :55680 as it is now LOST 2013-03-22 18:54:27,015 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: :55680 Node Transitioned from UNHEALTHY to LOST 2013-03-22 18:54:27,015 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_REMOVED to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeNode(FairScheduler.java:619) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:856) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:375) at java.lang.Thread.run(Thread.java:662) 2013-03-22 18:54:27,016 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye.. 2013-03-22 18:54:27,020 INFO org.mortbay.log: Stopped SelectChannelConnector@:50030 {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-787) Remove resource min from Yarn client API
Alejandro Abdelnur created YARN-787: --- Summary: Remove resource min from Yarn client API Key: YARN-787 URL: https://issues.apache.org/jira/browse/YARN-787 Project: Hadoop YARN Issue Type: Bug Components: api Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Per discussions in YARN-689 and YARN-769, we should remove the minimum from the API, as it is a scheduler-internal concern. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-788) Rename scheduler resource minimum to increment
Alejandro Abdelnur created YARN-788: --- Summary: Rename scheduler resource minimum to increment Key: YARN-788 URL: https://issues.apache.org/jira/browse/YARN-788 Project: Hadoop YARN Issue Type: Bug Components: api Affects Versions: 2.0.4-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Per discussions in YARN-689, the current name, minimum, is wrong; we should rename it to increment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-777) Remove unreferenced objects from proto
[ https://issues.apache.org/jira/browse/YARN-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678353#comment-13678353 ] Hadoop QA commented on YARN-777: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586757/YARN-777.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1159//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1159//console This message is automatically generated. Remove unreferenced objects from proto -- Key: YARN-777 URL: https://issues.apache.org/jira/browse/YARN-777 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-777.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-789) Add flag to scheduler to allow zero capabilities in resources
Alejandro Abdelnur created YARN-789: --- Summary: Add flag to scheduler to allow zero capabilities in resources Key: YARN-789 URL: https://issues.apache.org/jira/browse/YARN-789 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.0.4-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Per discussion in YARN-689, reposting the updated use case: 1. I have a set of services co-existing with a Yarn cluster. 2. These services run out of band from Yarn. They are not started as Yarn containers and they don't use Yarn containers for processing. 3. These services use, dynamically, different amounts of CPU and memory based on their load. They manage their CPU and memory requirements independently. In other words, depending on their load, they may require more CPU but not memory, or vice-versa. By using YARN as the RM for these services I'm able to share and utilize the resources of the cluster appropriately and in a dynamic way. Yarn keeps tabs on all the resources. These services run an AM that reserves resources on their behalf. When this AM gets the requested resources, the services bump up their CPU/memory utilization out of band from Yarn. If the Yarn allocations are released/preempted, the services back off on their resource utilization. By doing this, Yarn and these services correctly share the cluster resources, with the Yarn RM being the only one that does the overall resource bookkeeping. The services' AMs, so as not to break the container lifecycle, start containers in the corresponding NMs. These container processes basically sleep forever (e.g. sleep 1d). They use almost no CPU or memory (less than 1MB). Thus it is reasonable to assume their required CPU and memory utilization is NIL (more on hard enforcement later). Because of this almost-NIL utilization of CPU and memory, it should be possible to specify zero for one of the dimensions (CPU or memory) when making a request. The current limitation is that the increment is also the minimum. If we set the memory increment to 1MB, then when making a pure CPU request we would have to specify 1MB of memory. That would work, but it would allow arbitrary memory requests without the desired normalization (increments of 256, 512, etc.). If we set the CPU increment to 1 CPU, then when making a pure memory request we would have to specify 1 CPU. CPU amounts are much smaller than memory amounts, and because we don't have fractional CPUs, all my pure memory requests would waste 1 CPU each, reducing the overall utilization of the cluster. Finally, on hard enforcement. * For CPU: hard enforcement can be done via a cgroup cpu controller. Using an absolute minimum of a few CPU shares (i.e. 10) in the LinuxContainerExecutor, we ensure there are enough CPU cycles to run the sleep process. This absolute minimum would only kick in if zero is allowed; otherwise it will never kick in, as the shares for 1 CPU are 1024. * For memory: hard enforcement is currently done by ProcfsBasedProcessTree.java; using an absolute minimum of 1 or 2 MB would take care of zero memory resources. Again, this absolute minimum would only kick in if zero is allowed; otherwise it will never kick in, as the memory increment is several MB if not 1GB. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
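A minimal sketch of the proposed behavior (the flag and method names are hypothetical, not the actual patch): normalization keeps rounding non-zero asks up to the increment, but a zero ask stays zero when the flag is on.
{code}
// Hypothetical flag and method; illustrates the proposal only.
public static int normalizeMemory(int requestedMb, int incrementMb,
    boolean allowZeroCapability) {
  if (requestedMb == 0 && allowZeroCapability) {
    return 0; // pure-CPU request: reserve no memory at all
  }
  // Current behavior: the increment doubles as the minimum.
  int atLeastMinimum = Math.max(requestedMb, incrementMb);
  return ((atLeastMinimum + incrementMb - 1) / incrementMb) * incrementMb;
}

// normalizeMemory(0, 1024, true)     == 0    (only with the flag on)
// normalizeMemory(0, 1024, false)    == 1024 (today's behavior)
// normalizeMemory(1536, 1024, false) == 2048
{code}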
[jira] [Commented] (YARN-767) Initialize Application status metrics when QueueMetrics is initialized
[ https://issues.apache.org/jira/browse/YARN-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678357#comment-13678357 ] Hadoop QA commented on YARN-767: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586751/YARN-767.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1160//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1160//console This message is automatically generated. Initialize Application status metrics when QueueMetrics is initialized --- Key: YARN-767 URL: https://issues.apache.org/jira/browse/YARN-767 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Jian He Attachments: YARN-767.1.patch, YARN-767.2.patch Applications: ResourceManager.QueueMetrics.AppsSubmitted, ResourceManager.QueueMetrics.AppsRunning, ResourceManager.QueueMetrics.AppsPending, ResourceManager.QueueMetrics.AppsCompleted, ResourceManager.QueueMetrics.AppsKilled, ResourceManager.QueueMetrics.AppsFailed For now these metrics are created only when they are needed; we want them to be visible as soon as QueueMetrics is initialized. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
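As a sketch of the general idea in plain Java (the real QueueMetrics uses the Hadoop metrics2 framework, so the names and mechanics here are illustrative only): initialize every application-status counter eagerly in the constructor, so the first metrics snapshot already contains all the keys with zero values.
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class EagerAppMetrics {
  private final ConcurrentHashMap<String, AtomicLong> counters =
      new ConcurrentHashMap<String, AtomicLong>();

  public EagerAppMetrics() {
    // Register every metric up front instead of on first use, so that
    // monitoring systems see AppsSubmitted=0 etc. right after startup.
    for (String name : new String[] {"AppsSubmitted", "AppsRunning",
        "AppsPending", "AppsCompleted", "AppsKilled", "AppsFailed"}) {
      counters.put(name, new AtomicLong(0));
    }
  }

  public void incr(String name) {
    counters.get(name).incrementAndGet();
  }
}
{code}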
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678367#comment-13678367 ] Hadoop QA commented on YARN-713: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12585703/YARN-713.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1162//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1162//console This message is automatically generated. ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Maysam Yabandeh Priority: Critical Fix For: 2.1.0-beta Attachments: YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
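A minimal sketch of the hardening the issue asks for, in plain JDK terms ('LOG' and the surrounding event handler are assumed): catch the resolution failure locally so it never propagates to the AsyncDispatcher, whose behavior on an unhandled exception is to bring the RM down.
{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

// Resolve a node's host defensively; a DNS hiccup is logged and skipped
// instead of crashing the ResourceManager.
static InetAddress resolveHostSafely(String host) {
  try {
    return InetAddress.getByName(host);
  } catch (UnknownHostException e) {
    LOG.error("Could not resolve " + host + "; skipping this event", e);
    return null; // caller drops or retries the event later
  }
}
{code}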
[jira] [Resolved] (YARN-689) Add multiplier unit to resourcecapabilities
[ https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur resolved YARN-689. - Resolution: Won't Fix Closing it and opening other JIRAs to address the discussed changes. Add multiplier unit to resourcecapabilities --- Key: YARN-689 URL: https://issues.apache.org/jira/browse/YARN-689 Project: Hadoop YARN Issue Type: Sub-task Components: api, scheduler Affects Versions: 2.0.4-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: YARN-689.patch, YARN-689.patch, YARN-689.patch, YARN-689.patch, YARN-689.patch Currently we are overloading the minimum resource value as the actual multiplier used by the scheduler. Today, with the minimum memory set to 1GB, requests for 1.5GB are always translated to an allocation of 2GB. We should decouple the minimum allocation from the multiplier. The multiplier should also be exposed to the client via the RegisterApplicationMasterResponse. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
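For a concrete reading of the 1.5GB example, a minimal sketch of the rounding arithmetic (not the scheduler's actual code):
{code}
// With the minimum overloaded as the multiplier, every request is rounded
// up to the next multiple of it.
static int roundUpToMultiplier(int requestedMb, int multiplierMb) {
  return ((requestedMb + multiplierMb - 1) / multiplierMb) * multiplierMb;
}

// roundUpToMultiplier(1536, 1024) == 2048, i.e. a 1.5GB ask becomes 2GB
{code}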
[jira] [Commented] (YARN-513) Create common proxy client for communicating with RM
[ https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678371#comment-13678371 ] Hadoop QA commented on YARN-513: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586767/YARN-513.12.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1161//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1161//console This message is automatically generated. Create common proxy client for communicating with RM Key: YARN-513 URL: https://issues.apache.org/jira/browse/YARN-513 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Jian He Attachments: YARN-513.10.patch, YARN-513.11.patch, YARN-513.12.patch, YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, YARN-513.4.patch, YARN.513.5.patch, YARN-513.6.patch, YARN-513.7.patch, YARN-513.8.patch, YARN-513.9.patch When the RM is restarting, the NM, AM and Clients should wait for some time for the RM to come back up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-787) Remove resource min from Yarn client API
[ https://issues.apache.org/jira/browse/YARN-787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated YARN-787: Attachment: YARN-787.patch Patch removing the minimum resource from the client API. Remove resource min from Yarn client API Key: YARN-787 URL: https://issues.apache.org/jira/browse/YARN-787 Project: Hadoop YARN Issue Type: Bug Components: api Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: YARN-787.patch Per discussions in YARN-689 and YARN-769, we should remove the minimum from the API, as it is a scheduler-internal concern. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (YARN-788) Rename scheduler resource minimum to increment
[ https://issues.apache.org/jira/browse/YARN-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678378#comment-13678378 ] Alejandro Abdelnur edited comment on YARN-788 at 6/7/13 8:09 PM: - Renaming minimum to increment throughout the code, plus some improvements to the normalization code, mostly for clarity and to facilitate YARN-789. was (Author: tucu00): Renaming minimum to increment throughout the code, plus some improvements to the normalization code, mostly for clarity. Rename scheduler resource minimum to increment -- Key: YARN-788 URL: https://issues.apache.org/jira/browse/YARN-788 Project: Hadoop YARN Issue Type: Bug Components: api Affects Versions: 2.0.4-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: YARN-788.patch Per discussions in YARN-689, the current name, minimum, is wrong; we should rename it to increment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-788) Rename scheduler resource minimum to increment
[ https://issues.apache.org/jira/browse/YARN-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated YARN-788: Attachment: YARN-788.patch Renaming minimum to increment throughout the code, plus some improvements to the normalization code, mostly for clarity. Rename scheduler resource minimum to increment -- Key: YARN-788 URL: https://issues.apache.org/jira/browse/YARN-788 Project: Hadoop YARN Issue Type: Bug Components: api Affects Versions: 2.0.4-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: YARN-788.patch Per discussions in YARN-689, the current name, minimum, is wrong; we should rename it to increment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-789) Add flag to scheduler to allow zero capabilities in resources
[ https://issues.apache.org/jira/browse/YARN-789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated YARN-789: Attachment: YARN-789.patch Patch adding a boolean configuration that allows zero capabilities in a resource request. By default it is OFF, so the minimum value is the increment (current behavior). Add flag to scheduler to allow zero capabilities in resources - Key: YARN-789 URL: https://issues.apache.org/jira/browse/YARN-789 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.0.4-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: YARN-789.patch Per discussion in YARN-689, reposting the updated use case: 1. I have a set of services co-existing with a Yarn cluster. 2. These services run out of band from Yarn. They are not started as Yarn containers and they don't use Yarn containers for processing. 3. These services use, dynamically, different amounts of CPU and memory based on their load. They manage their CPU and memory requirements independently. In other words, depending on their load, they may require more CPU but not memory, or vice-versa. By using YARN as the RM for these services I'm able to share and utilize the resources of the cluster appropriately and in a dynamic way. Yarn keeps tabs on all the resources. These services run an AM that reserves resources on their behalf. When this AM gets the requested resources, the services bump up their CPU/memory utilization out of band from Yarn. If the Yarn allocations are released/preempted, the services back off on their resource utilization. By doing this, Yarn and these services correctly share the cluster resources, with the Yarn RM being the only one that does the overall resource bookkeeping. The services' AMs, so as not to break the container lifecycle, start containers in the corresponding NMs. These container processes basically sleep forever (e.g. sleep 1d). They use almost no CPU or memory (less than 1MB). Thus it is reasonable to assume their required CPU and memory utilization is NIL (more on hard enforcement later). Because of this almost-NIL utilization of CPU and memory, it should be possible to specify zero for one of the dimensions (CPU or memory) when making a request. The current limitation is that the increment is also the minimum. If we set the memory increment to 1MB, then when making a pure CPU request we would have to specify 1MB of memory. That would work, but it would allow arbitrary memory requests without the desired normalization (increments of 256, 512, etc.). If we set the CPU increment to 1 CPU, then when making a pure memory request we would have to specify 1 CPU. CPU amounts are much smaller than memory amounts, and because we don't have fractional CPUs, all my pure memory requests would waste 1 CPU each, reducing the overall utilization of the cluster. Finally, on hard enforcement. * For CPU: hard enforcement can be done via a cgroup cpu controller. Using an absolute minimum of a few CPU shares (i.e. 10) in the LinuxContainerExecutor, we ensure there are enough CPU cycles to run the sleep process. This absolute minimum would only kick in if zero is allowed; otherwise it will never kick in, as the shares for 1 CPU are 1024. * For memory: hard enforcement is currently done by ProcfsBasedProcessTree.java; using an absolute minimum of 1 or 2 MB would take care of zero memory resources.
Again, this absolute minimum would only kick in if zero is allowed; otherwise it will never kick in, as the memory increment is several MB if not 1GB. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678401#comment-13678401 ] Alejandro Abdelnur commented on YARN-752: - I agree with [~bikassaha]; the only question left is: what if the user provides the wrong rack for a node? As Sandy mentioned, that could lead to odd/subtle behavior. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678405#comment-13678405 ] Bikas Saha commented on YARN-752: - A user cannot provide a wrong rack for a node; a user can only provide a wrong rack on its own. If a rack is missing, then it will be corrected by the AMRMClient. If there is another, different rack, it is either a valid rack or a rack that does not exist. We should probably resolve nodes and racks to make sure they are all valid locations. Different jira, perhaps. This kind of depends on YARN-435. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678418#comment-13678418 ] Alejandro Abdelnur commented on YARN-752: - Ah, OK. So before this patch, if I made a request with only NODES, allocations would never happen. I think we should clearly document in the API that specifying RACKS is only needed if you want an allocation on any node of that RACK; if you want the allocation on a NODE, you don't need to specify the rack the node is on. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-686) Flatten NodeReport
[ https://issues.apache.org/jira/browse/YARN-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678419#comment-13678419 ] Alejandro Abdelnur commented on YARN-686: - +1 Flatten NodeReport -- Key: YARN-686 URL: https://issues.apache.org/jira/browse/YARN-686 Project: Hadoop YARN Issue Type: Sub-task Components: api Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-686-1.patch, YARN-686.patch, YARN-686.patch The NodeReport returned by getClusterNodes or given to AMs in heartbeat responses includes both a NodeState (enum) and a NodeHealthStatus (object). As UNHEALTHY is already a NodeState, a separate NodeHealthStatus doesn't seem necessary. I propose eliminating NodeHealthStatus#getIsNodeHealthy and moving its two other methods, getHealthReport and getLastHealthReportTime, into NodeReport. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-648) FS: Add documentation for pluggable policy
[ https://issues.apache.org/jira/browse/YARN-648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-648: -- Attachment: yarn-648-2.patch Patch that addresses Sandy's comments. Thanks Sandy. FS: Add documentation for pluggable policy -- Key: YARN-648 URL: https://issues.apache.org/jira/browse/YARN-648 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.4-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: documentation Attachments: yarn-648-1.patch, yarn-648-2.patch YARN-469 and YARN-482 make the scheduling policy in FS pluggable. Need to add documentation on how to use this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-648) FS: Add documentation for pluggable policy
[ https://issues.apache.org/jira/browse/YARN-648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678428#comment-13678428 ] Sandy Ryza commented on YARN-648: - +1, thanks Karthik FS: Add documentation for pluggable policy -- Key: YARN-648 URL: https://issues.apache.org/jira/browse/YARN-648 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.4-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: documentation Attachments: yarn-648-1.patch, yarn-648-2.patch YARN-469 and YARN-482 make the scheduling policy in FS pluggable. Need to add documentation on how to use this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-299) Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE
[ https://issues.apache.org/jira/browse/YARN-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678437#comment-13678437 ] Mayank Bansal commented on YARN-299: Looks like there is a race condition here when a container is killed during the localization process. LocalizerRunner sends RESOURCE_FAILED because the killed container is trying to fetch resources from the already cleaned-up directories. In the meantime the container is killed, and after cleanup it has reached the DONE state. Thanks, Mayank Node Manager throws org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE --- Key: YARN-299 URL: https://issues.apache.org/jira/browse/YARN-299 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.0.1-alpha, 2.0.0-alpha Reporter: Devaraj K Assignee: Mayank Bansal {code:xml} 2012-12-31 10:36:27,844 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Can't handle this event at current state: Current: [DONE], eventType: [RESOURCE_FAILED] org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: RESOURCE_FAILED at DONE at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:819) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:71) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:504) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:497) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) at java.lang.Thread.run(Thread.java:662) 2012-12-31 10:36:27,845 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1356792558130_0002_01_01 transitioned from DONE to null {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
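The usual fix pattern for this class of race in YARN state machines is to declare the late event as a valid no-op transition. A sketch of what that looks like as a fragment of the ContainerImpl StateMachineFactory builder chain (the pattern only, not necessarily the committed YARN-299 change):
{code}
// Tolerate a RESOURCE_FAILED that arrives after the container is already
// DONE: stay in DONE instead of throwing InvalidStateTransitonException.
.addTransition(ContainerState.DONE, ContainerState.DONE,
    ContainerEventType.RESOURCE_FAILED)
{code}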
[jira] [Commented] (YARN-686) Flatten NodeReport
[ https://issues.apache.org/jira/browse/YARN-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678438#comment-13678438 ] Hudson commented on YARN-686: - Integrated in Hadoop-trunk-Commit #3882 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3882/]) YARN-686. Flatten NodeReport. (sandyr via tucu) (Revision 1490827) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1490827 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/NodeReport.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/NodeReportPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/NodeCLI.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/utils/BuilderUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMNMInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodeInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/TestDiskFailures.java Flatten NodeReport -- Key: YARN-686 URL: https://issues.apache.org/jira/browse/YARN-686 Project: Hadoop YARN Issue Type: Sub-task Components: api Affects Versions: 2.0.4-alpha
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678445#comment-13678445 ] Siddharth Seth commented on YARN-752: - Alright - I guess the node-only request will be handled in the other jira tracking it. And from Bikas' explanation - that'll require rack resolution in any case. Seems a little strange, but required based on the scheduler implementation. If we're always doing the rack lookup, we should change the MR AM to avoid rack lookups if possible - may be difficult considering it tracks RACK_LOCAL allocations. Separate jira though. Regarding the rack lookup in the patch itself: shouldn't it be adding a rack only once? E.g., ResourceRequest: h1, h2, h3 - numContainers=1. If all of them resolve to the same rack, the number of containers on the rack would go to 3 after this? Is that correct behaviour? In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-648) FS: Add documentation for pluggable policy
[ https://issues.apache.org/jira/browse/YARN-648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678446#comment-13678446 ] Hadoop QA commented on YARN-648: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586798/yarn-648-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1163//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1163//console This message is automatically generated. FS: Add documentation for pluggable policy -- Key: YARN-648 URL: https://issues.apache.org/jira/browse/YARN-648 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.4-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: documentation Attachments: yarn-648-1.patch, yarn-648-2.patch YARN-469 and YARN-482 make the scheduling policy in FS pluggable. Need to add documentation on how to use this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-777) Remove unreferenced objects from proto
[ https://issues.apache.org/jira/browse/YARN-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678448#comment-13678448 ] Zhijie Shen commented on YARN-777: -- +1. I checked the code base; StringURLMapProto is unreferenced. It's a straightforward change, so it is OK to have no new tests. Remove unreferenced objects from proto -- Key: YARN-777 URL: https://issues.apache.org/jira/browse/YARN-777 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-777.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678468#comment-13678468 ] Sandy Ryza commented on YARN-752: - bq. The rack lookup in the patch itself, shouldn't it be adding a rack only once. You're right. I'll fix this. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678470#comment-13678470 ] Bikas Saha commented on YARN-752: - The user needs to only specify a node (and may end up getting a container on that node, on that node's rack, or somewhere else). There is no need to specify the rack of a node explicitly, but users may do so. If a user wants any node in a set of racks then they should simply directly specify those racks. This code is not quite correct. If nodes n1, n2, n3 (all in rack1) are present then this will end up calling addResourceRequest 3 times for rack1. Should probably create a local set of racks, initialized by CR.racks, add nodes' racks to the set. Then call addResourceRequest for the rack set. {code} +String rack = RackResolver.resolve(host).getNetworkLocation(); +if (rack == null) { + LOG.warn("Failed to resolve rack for host " + host + "."); +} else if (req.racks == null || !req.racks.contains(rack)) { + addResourceRequest(req.priority, rack, req.capability, + req.containerCount, req); +} {code} The test needs to be improved to catch the above case. Also, it should check the container counts for the asks to ensure they are correct (with or without the user specifying the racks). The test is also calling APIs without calling start(). It works now, but will probably break later once we enforce the life cycle inside the client. Things like rack resolution etc. probably depend on topology information coming from the RM upon registration. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
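A sketch of the set-based version Bikas describes, reusing the names from the patch fragment above (illustrative only, not the committed code; java.util.Set/HashSet imports are assumed in the surrounding class):
{code}
// Collect the racks first so that each rack is added exactly once, even
// when several of the requested nodes resolve to the same rack.
Set<String> racks = new HashSet<String>();
if (req.racks != null) {
  racks.addAll(req.racks); // racks the user asked for explicitly
}
for (String host : req.nodes) {
  String rack = RackResolver.resolve(host).getNetworkLocation();
  if (rack == null) {
    LOG.warn("Failed to resolve rack for host " + host + ".");
  } else {
    racks.add(rack); // duplicates collapse in the set
  }
}
for (String rack : racks) {
  addResourceRequest(req.priority, rack, req.capability, req.containerCount, req);
}
{code}
With this shape, the h1/h2/h3 example from the earlier comment produces a single rack-level ask instead of three.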
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678472#comment-13678472 ] Bikas Saha commented on YARN-752: - Would be great if the next patch added the documentation requested by [~tucu00]. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-790) Use ApplicationSubmissionContext to send tokens instead of file
Bikas Saha created YARN-790: --- Summary: Use ApplicationSubmissionContext to send tokens instead of file Key: YARN-790 URL: https://issues.apache.org/jira/browse/YARN-790 Project: Hadoop YARN Issue Type: Bug Reporter: Bikas Saha Assignee: Bikas Saha -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-767) Initialize Application status metrics when QueueMetrics is initialized
[ https://issues.apache.org/jira/browse/YARN-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-767: - Attachment: YARN-767.3.patch Initialize Application status metrics when QueueMetrics is initialized --- Key: YARN-767 URL: https://issues.apache.org/jira/browse/YARN-767 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Jian He Attachments: YARN-767.1.patch, YARN-767.2.patch, YARN-767.3.patch Applications: ResourceManager.QueueMetrics.AppsSubmitted, ResourceManager.QueueMetrics.AppsRunning, ResourceManager.QueueMetrics.AppsPending, ResourceManager.QueueMetrics.AppsCompleted, ResourceManager.QueueMetrics.AppsKilled, ResourceManager.QueueMetrics.AppsFailed For now these metrics are created only when they are needed; we want them to be visible as soon as QueueMetrics is initialized. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
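For context, a hedged sketch of how Hadoop Metrics2 makes annotated metrics visible at source-registration time (the class and metric names below are hypothetical and not the YARN-767 patch itself): once a source is registered, every @Metric field is reported to sinks with its initial value, rather than appearing only after first use.
{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableCounterInt;
import org.apache.hadoop.metrics2.lib.MutableGaugeInt;

// Hypothetical source class for illustration only.
@Metrics(context = "yarn")
class AppStatusMetrics {
  @Metric("# of apps submitted") MutableCounterInt appsSubmitted;
  @Metric("# of apps running") MutableGaugeInt appsRunning;

  static AppStatusMetrics register() {
    // After this call the annotated metrics are visible with value 0,
    // even before any application has been submitted.
    return DefaultMetricsSystem.instance().register(
        "AppStatusMetrics", "Application status metrics",
        new AppStatusMetrics());
  }
}
{code}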
[jira] [Created] (YARN-791) Ensure that RM RPC APIs that return nodes are consistent with /nodes REST API
Sandy Ryza created YARN-791: --- Summary: Ensure that RM RPC APIs that return nodes are consistent with /nodes REST API Key: YARN-791 URL: https://issues.apache.org/jira/browse/YARN-791 Project: Hadoop YARN Issue Type: Improvement Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-791) Ensure that RM RPC APIs that return nodes are consistent with /nodes REST API
[ https://issues.apache.org/jira/browse/YARN-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-791: Issue Type: Sub-task (was: Improvement) Parent: YARN-386 Ensure that RM RPC APIs that return nodes are consistent with /nodes REST API - Key: YARN-791 URL: https://issues.apache.org/jira/browse/YARN-791 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-642) Fix up /nodes REST API to have 1 param and be consistent with the Java API
[ https://issues.apache.org/jira/browse/YARN-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678521#comment-13678521 ] Sandy Ryza commented on YARN-642: - Thanks for taking a look, Vinod. I'll upload a patch that fixes the tests. Regarding the RPCs, I think we might only need to change the semantics of getClusterNodes and not the APIs/protocols. Either way, I filed YARN-791 and I'll post results after investigating there. Fix up /nodes REST API to have 1 param and be consistent with the Java API -- Key: YARN-642 URL: https://issues.apache.org/jira/browse/YARN-642 Project: Hadoop YARN Issue Type: Bug Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: incompatible Attachments: YARN-642-1.patch, YARN-642-2.patch, YARN-642-2.patch, YARN-642.patch The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-792) Move NodeHealthStatus from yarn.api.record to yarn.server.api.record
Jian He created YARN-792: Summary: Move NodeHealthStatus from yarn.api.record to yarn.server.api.record Key: YARN-792 URL: https://issues.apache.org/jira/browse/YARN-792 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-686) Flatten NodeReport
[ https://issues.apache.org/jira/browse/YARN-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678528#comment-13678528 ] Jian He commented on YARN-686: -- We should move NodeHealthStatus to yarn.server.api.record after this patch; also, the 'NodeHealthStatus' mention can be removed from the NodeReport class comment. Created YARN-792 for this. Flatten NodeReport -- Key: YARN-686 URL: https://issues.apache.org/jira/browse/YARN-686 Project: Hadoop YARN Issue Type: Sub-task Components: api Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.1.0-beta Attachments: YARN-686-1.patch, YARN-686.patch, YARN-686.patch The NodeReport returned by getClusterNodes or given to AMs in heartbeat responses includes both a NodeState (enum) and a NodeHealthStatus (object). As UNHEALTHY is already a NodeState, a separate NodeHealthStatus doesn't seem necessary. I propose eliminating NodeHealthStatus#getIsNodeHealthy and moving its two other methods, getHealthReport and getLastHealthReportTime, into NodeReport. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-642) Fix up /nodes REST API to have 1 param and be consistent with the Java API
[ https://issues.apache.org/jira/browse/YARN-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678533#comment-13678533 ] Vinod Kumar Vavilapalli commented on YARN-642: -- Not just semantics. Once we start returning only active nodes, we'll need to add the filter API or just modify getClusterNodes or GetClusterNodesRequest to take in targetState. Fix up /nodes REST API to have 1 param and be consistent with the Java API -- Key: YARN-642 URL: https://issues.apache.org/jira/browse/YARN-642 Project: Hadoop YARN Issue Type: Bug Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: incompatible Attachments: YARN-642-1.patch, YARN-642-2.patch, YARN-642-2.patch, YARN-642.patch The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
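To make the shape of that change concrete, a speculative sketch of a state-filtered getClusterNodes call (the setNodeStates accessor and the rmClient handle are assumptions here; the real API shape was settled in follow-up work):
{code}
// Speculative sketch only: filter cluster nodes by target state.
GetClusterNodesRequest request =
    Records.newRecord(GetClusterNodesRequest.class);
request.setNodeStates(EnumSet.of(NodeState.RUNNING));  // assumed setter
GetClusterNodesResponse response = rmClient.getClusterNodes(request);
List<NodeReport> activeNodes = response.getNodeReports();  // RUNNING only
{code}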
[jira] [Updated] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-752: Attachment: YARN-752-2.patch In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752-2.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678535#comment-13678535 ] Sandy Ryza commented on YARN-752: - Uploading a patch that modifies the test to catch the issue identified by Siddharth, fixes it, and adds the documentation requested by Tucu. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752-2.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-686) Flatten NodeReport
[ https://issues.apache.org/jira/browse/YARN-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678539#comment-13678539 ] Sandy Ryza commented on YARN-686: - Agreed Flatten NodeReport -- Key: YARN-686 URL: https://issues.apache.org/jira/browse/YARN-686 Project: Hadoop YARN Issue Type: Sub-task Components: api Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.1.0-beta Attachments: YARN-686-1.patch, YARN-686.patch, YARN-686.patch The NodeReport returned by getClusterNodes or given to AMs in heartbeat responses includes both a NodeState (enum) and a NodeHealthStatus (object). As UNHEALTHY is already a NodeState, a separate NodeHealthStatus doesn't seem necessary. I propose eliminating NodeHealthStatus#getIsNodeHealthy and moving its two other methods, getHealthReport and getLastHealthReportTime, into NodeReport. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-642) Fix up /nodes REST API to have 1 param and be consistent with the Java API
[ https://issues.apache.org/jira/browse/YARN-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-642: Attachment: YARN-642-3.patch Fix up /nodes REST API to have 1 param and be consistent with the Java API -- Key: YARN-642 URL: https://issues.apache.org/jira/browse/YARN-642 Project: Hadoop YARN Issue Type: Bug Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: incompatible Attachments: YARN-642-1.patch, YARN-642-2.patch, YARN-642-2.patch, YARN-642-3.patch, YARN-642.patch The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-642) Fix up /nodes REST API to have 1 param and be consistent with the Java API
[ https://issues.apache.org/jira/browse/YARN-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678552#comment-13678552 ] Sandy Ryza commented on YARN-642: - OK, that makes sense to me about the APIs. Uploaded a patch that fixes the tests. Fix up /nodes REST API to have 1 param and be consistent with the Java API -- Key: YARN-642 URL: https://issues.apache.org/jira/browse/YARN-642 Project: Hadoop YARN Issue Type: Bug Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: incompatible Attachments: YARN-642-1.patch, YARN-642-2.patch, YARN-642-2.patch, YARN-642-3.patch, YARN-642.patch The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-792) Move NodeHealthStatus from yarn.api.record to yarn.server.api.record
[ https://issues.apache.org/jira/browse/YARN-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-792: - Attachment: YARN-792.patch Moved NodeHealthStatus and its PB to yarn.server.api. Moved the nodeHealthStatus proto to the yarn.server.common proto file. Removed 'NodeHealthStatus' from the NodeReport class comment. Move NodeHealthStatus from yarn.api.record to yarn.server.api.record Key: YARN-792 URL: https://issues.apache.org/jira/browse/YARN-792 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-792.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-642) Fix up /nodes REST API to have 1 param and be consistent with the Java API
[ https://issues.apache.org/jira/browse/YARN-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678554#comment-13678554 ] Hadoop QA commented on YARN-642: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586825/YARN-642-3.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1166//console This message is automatically generated. Fix up /nodes REST API to have 1 param and be consistent with the Java API -- Key: YARN-642 URL: https://issues.apache.org/jira/browse/YARN-642 Project: Hadoop YARN Issue Type: Bug Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: incompatible Attachments: YARN-642-1.patch, YARN-642-2.patch, YARN-642-2.patch, YARN-642-3.patch, YARN-642.patch The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678557#comment-13678557 ] Hadoop QA commented on YARN-752: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586821/YARN-752-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1164//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1164//console This message is automatically generated. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752-2.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-777) Remove unreferenced objects from proto
[ https://issues.apache.org/jira/browse/YARN-777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-777: - Attachment: YARN-777.1.patch Removed ApplicationStatusProto and ApplicationMasterProto as well; thanks for catching this. Remove unreferenced objects from proto -- Key: YARN-777 URL: https://issues.apache.org/jira/browse/YARN-777 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-777.1.patch, YARN-777.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678561#comment-13678561 ] Vinod Kumar Vavilapalli commented on YARN-752: -- Haven't been paying attention, but quickly looked through.
- I think if the user gives a wrong rack, we should throw an error.
- IIUC, this code will change after YARN-521. Or is YARN-521 just a new API instead of changes to the current API with optional flags?
- In any case, the algo should eventually look like
{code}
For each explicitly added rack, if it doesn't exist, throw error;
if (asking for specific nodes) {
  don't add racks
} else {
  add racks
}
{code}
- Either way, I think YARN-521 is a priority; depending on how the API is designed, it can result in API signature changes. If you agree that the logic is based on enabling/disabling strict allocations, the following in the patch won't be correct any longer, right?
{code}
Scheduler
+ * documentation should be consulted for the specifics of how the parameters
+ * are honored.
{code}
In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752-2.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-521) Augment AM - RM client module to be able to request containers only at specific locations
[ https://issues.apache.org/jira/browse/YARN-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-521: - Issue Type: Sub-task (was: Bug) Parent: YARN-386 Augment AM - RM client module to be able to request containers only at specific locations - Key: YARN-521 URL: https://issues.apache.org/jira/browse/YARN-521 Project: Hadoop YARN Issue Type: Sub-task Components: api Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza When YARN-392 and YARN-398 are completed, it would be good for AMRMClient to offer an easy way to access their functionality -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-771) AMRMClient support for resource blacklisting
[ https://issues.apache.org/jira/browse/YARN-771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-771: - Issue Type: Sub-task (was: Improvement) Parent: YARN-386 AMRMClient support for resource blacklisting - Key: YARN-771 URL: https://issues.apache.org/jira/browse/YARN-771 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha After YARN-750, AMRMClient should support blacklisting via the new YARN APIs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-642) Fix up /nodes REST API to have 1 param and be consistent with the Java API
[ https://issues.apache.org/jira/browse/YARN-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-642: Attachment: YARN-642-4.patch Fix up /nodes REST API to have 1 param and be consistent with the Java API -- Key: YARN-642 URL: https://issues.apache.org/jira/browse/YARN-642 Project: Hadoop YARN Issue Type: Bug Components: api, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Labels: incompatible Attachments: YARN-642-1.patch, YARN-642-2.patch, YARN-642-2.patch, YARN-642-3.patch, YARN-642-4.patch, YARN-642.patch The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-339) TestResourceTrackerService is failing intermittently
[ https://issues.apache.org/jira/browse/YARN-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678565#comment-13678565 ] Jian He commented on YARN-339: -- Set miniClusterMode to true so that the "Metrics source already exists" error is ignored in unit tests. TestResourceTrackerService is failing intermittently Key: YARN-339 URL: https://issues.apache.org/jira/browse/YARN-339 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 0.23.5 Reporter: Ravi Prakash Assignee: Jian He Attachments: YARN-339.patch The test after testReconnectNode() is failing usually. This might be a race condition in Metrics2 code. Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.127 sec FAILURE! testDecommissionWithIncludeHosts(org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService) Time elapsed: 55 sec ERROR! org.apache.hadoop.metrics2.MetricsException: Metrics source ClusterMetrics already exists! at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:134) at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:115) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:217) at org.apache.hadoop.yarn.server.resourcemanager.ClusterMetrics.registerMetrics(ClusterMetrics.java:71) at org.apache.hadoop.yarn.server.resourcemanager.ClusterMetrics.getMetrics(ClusterMetrics.java:58) at org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testDecommissionWithIncludeHosts(TestResourceTrackerService.java:74) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
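For reference, a minimal sketch of what that approach looks like in a test class (the class name and setup-hook placement below are assumptions): DefaultMetricsSystem.setMiniClusterMode(true) makes the metrics system tolerate re-registration of an existing source name instead of throwing.
{code}
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.junit.BeforeClass;

public class TestResourceTrackerServiceSetup {  // hypothetical test class
  @BeforeClass
  public static void enableMiniClusterMode() {
    // In mini-cluster mode, registering "ClusterMetrics" a second time is
    // tolerated instead of raising "Metrics source already exists".
    DefaultMetricsSystem.setMiniClusterMode(true);
  }
}
{code}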
[jira] [Updated] (YARN-339) TestResourceTrackerService is failing intermittently
[ https://issues.apache.org/jira/browse/YARN-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-339: - Attachment: YARN-339.patch TestResourceTrackerService is failing intermittently Key: YARN-339 URL: https://issues.apache.org/jira/browse/YARN-339 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 0.23.5 Reporter: Ravi Prakash Assignee: Jian He Attachments: YARN-339.patch The test after testReconnectNode() is failing usually. This might be a race condition in Metrics2 code. Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.127 sec FAILURE! testDecommissionWithIncludeHosts(org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService) Time elapsed: 55 sec ERROR! org.apache.hadoop.metrics2.MetricsException: Metrics source ClusterMetrics already exists! at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:134) at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:115) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:217) at org.apache.hadoop.yarn.server.resourcemanager.ClusterMetrics.registerMetrics(ClusterMetrics.java:71) at org.apache.hadoop.yarn.server.resourcemanager.ClusterMetrics.getMetrics(ClusterMetrics.java:58) at org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testDecommissionWithIncludeHosts(TestResourceTrackerService.java:74) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-777) Remove unreferenced objects from proto
[ https://issues.apache.org/jira/browse/YARN-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678568#comment-13678568 ] Hadoop QA commented on YARN-777: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586829/YARN-777.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1167//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1167//console This message is automatically generated. Remove unreferenced objects from proto -- Key: YARN-777 URL: https://issues.apache.org/jira/browse/YARN-777 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-777.1.patch, YARN-777.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-792) Move NodeHealthStatus from yarn.api.record to yarn.server.api.record
[ https://issues.apache.org/jira/browse/YARN-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678570#comment-13678570 ] Hadoop QA commented on YARN-792: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586826/YARN-792.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1165//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1165//console This message is automatically generated. Move NodeHealthStatus from yarn.api.record to yarn.server.api.record Key: YARN-792 URL: https://issues.apache.org/jira/browse/YARN-792 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Jian He Attachments: YARN-792.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-752) In AMRMClient, automatically add corresponding rack requests for requested nodes
[ https://issues.apache.org/jira/browse/YARN-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678578#comment-13678578 ] Alejandro Abdelnur commented on YARN-752: - bq. I think if the user gives a wrong rack, we should throw an error. Per previous discussion this does not happen (except if you are referring to an invalid rack altogether). If a ContainerRequest has host1 and rack3, it means that the containers are desired either on host1 or in rack3; it does not mean that host1 is in rack3. In AMRMClient, automatically add corresponding rack requests for requested nodes Key: YARN-752 URL: https://issues.apache.org/jira/browse/YARN-752 Project: Hadoop YARN Issue Type: Improvement Components: api, applications Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-752-1.patch, YARN-752-1.patch, YARN-752-2.patch, YARN-752.patch A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
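To illustrate the OR semantics Alejandro describes, a hedged sketch against the 2.0.x-era AMRMClient (treat the exact ContainerRequest constructor signature and the amrmClient handle as assumptions): the request asks for a container on host1 or anywhere in rack3, with no claim that host1 lives in rack3.
{code}
// Sketch only: one container, acceptable on host1 OR anywhere in rack3.
Resource capability = Records.newRecord(Resource.class);
capability.setMemory(1024);
Priority priority = Records.newRecord(Priority.class);
priority.setPriority(0);
AMRMClient.ContainerRequest req = new AMRMClient.ContainerRequest(
    capability,
    new String[] {"host1"},   // desired nodes
    new String[] {"rack3"},   // desired racks, independent of the nodes above
    priority,
    1);                       // container count
amrmClient.addContainerRequest(req);  // amrmClient: assumed started client
{code}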
[jira] [Commented] (YARN-339) TestResourceTrackerService is failing intermittently
[ https://issues.apache.org/jira/browse/YARN-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678582#comment-13678582 ] Hadoop QA commented on YARN-339: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586832/YARN-339.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1169//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1169//console This message is automatically generated. TestResourceTrackerService is failing intermittently Key: YARN-339 URL: https://issues.apache.org/jira/browse/YARN-339 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 0.23.5 Reporter: Ravi Prakash Assignee: Jian He Attachments: YARN-339.patch The test after testReconnectNode() is failing usually. This might be a race condition in Metrics2 code. Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 3.127 sec FAILURE! testDecommissionWithIncludeHosts(org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService) Time elapsed: 55 sec ERROR! org.apache.hadoop.metrics2.MetricsException: Metrics source ClusterMetrics already exists! at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:134) at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:115) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:217) at org.apache.hadoop.yarn.server.resourcemanager.ClusterMetrics.registerMetrics(ClusterMetrics.java:71) at org.apache.hadoop.yarn.server.resourcemanager.ClusterMetrics.getMetrics(ClusterMetrics.java:58) at org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testDecommissionWithIncludeHosts(TestResourceTrackerService.java:74) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira