[jira] [Commented] (YARN-2588) Standby RM does not transitionToActive if previous transitionToActive is failed with ZK exception.
[ https://issues.apache.org/jira/browse/YARN-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189698#comment-14189698 ]

Karthik Kambatla commented on YARN-2588:
----------------------------------------

Agree, we will need to move things around a little to get it right.

Standby RM does not transitionToActive if previous transitionToActive failed with a ZK exception
------------------------------------------------------------------------------------------------

Key: YARN-2588
URL: https://issues.apache.org/jira/browse/YARN-2588
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 3.0.0, 2.6.0, 2.5.1
Reporter: Rohith
Assignee: Rohith
Fix For: 2.6.0
Attachments: YARN-2588.1.patch, YARN-2588.2.patch, YARN-2588.patch

Consider a scenario where the standby RM fails to transition to Active because of a ZK exception (ConnectionLoss or SessionExpired). Any further transition to Active for the same RM then does not move the RM to the Active state.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
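The failure mode described here — an activation attempt that dies partway and leaves the RM unable to become Active later — can be sketched with a minimal, hypothetical state machine (these class and method names are illustrative, not the actual ResourceManager code): on a failed activation, roll the state back to standby so a subsequent transitionToActive is not short-circuited.

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical sketch of the fix idea: if transitioning to Active fails
 * partway (e.g. on a ZK ConnectionLoss), roll back to Standby so a later
 * transitionToActive starts from a clean state instead of being skipped.
 */
class HaStateMachine {
    enum State { STANDBY, ACTIVE }

    /** Stands in for the ZK-backed activation step; may throw. */
    interface ActivationStep { void run() throws Exception; }

    private final AtomicReference<State> state = new AtomicReference<>(State.STANDBY);

    State getState() { return state.get(); }

    /** Returns true if the RM is Active after this call. */
    boolean transitionToActive(ActivationStep step) {
        if (state.get() == State.ACTIVE) {
            return true; // already active, nothing to do
        }
        try {
            step.run();                // start active services, touch ZK, ...
            state.set(State.ACTIVE);
            return true;
        } catch (Exception e) {
            state.set(State.STANDBY);  // roll back so the next attempt is not a no-op
            return false;
        }
    }
}
```

A second transitionToActive after a failed one then succeeds instead of leaving the RM stuck.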
[jira] [Commented] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager
[ https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189707#comment-14189707 ]

Hadoop QA commented on YARN-2753:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12678125/YARN-2753.005.patch
against trunk revision 0126cf1.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5637//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5637//console

This message is automatically generated.
Fix potential issues and code clean up for *NodeLabelsManager
-------------------------------------------------------------

Key: YARN-2753
URL: https://issues.apache.org/jira/browse/YARN-2753
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
Attachments: YARN-2753.000.patch, YARN-2753.001.patch, YARN-2753.002.patch, YARN-2753.003.patch, YARN-2753.004.patch, YARN-2753.005.patch

Issues include:
* CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in labelCollections if the key already exists; otherwise Label.resource will be changed (reset).
* Potential NPE (NullPointerException) in checkRemoveLabelsFromNode of CommonNodeLabelsManager:
** when a Node is created, Node.labels can be null;
** in this case nm.labels may be null, so we need to check that originalLabels is not null before using it (originalLabels.containsAll).
* addToCluserNodeLabels should be protected by writeLock in RMNodeLabelsManager.java, because we should protect labelCollections in RMNodeLabelsManager.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
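The first two issues above can be sketched with a simplified, hypothetical store (plain Java collections standing in for the real CommonNodeLabelsManager structures): putIfAbsent avoids resetting per-label state on a duplicate add, and a null check avoids the NPE when a node has no labels yet.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch of the two fixes (simplified types, not the
 * actual CommonNodeLabelsManager code).
 */
class LabelStore {
    // label name -> mutable per-label state (stands in for Label.resource etc.)
    private final Map<String, Set<String>> labelCollections = new HashMap<>();
    // node -> labels; may legitimately be null for a freshly created node
    private final Map<String, Set<String>> nodeLabels = new HashMap<>();

    /** Fix 1: do not replace an existing entry, or the per-label state is reset. */
    void addToClusterNodeLabels(String label) {
        labelCollections.putIfAbsent(label, new HashSet<>());
    }

    Set<String> getLabelState(String label) {
        return labelCollections.get(label);
    }

    void setNodeLabels(String node, Set<String> labels) {
        nodeLabels.put(node, labels);
    }

    /** Fix 2: null-check before containsAll to avoid the NPE. */
    boolean checkRemoveLabelsFromNode(String node, Set<String> toRemove) {
        Set<String> originalLabels = nodeLabels.get(node);
        return originalLabels != null && originalLabels.containsAll(toRemove);
    }
}
```

The third issue (taking the writeLock around addToCluserNodeLabels) is a locking-discipline change with no small standalone illustration, so it is omitted here.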
[jira] [Commented] (YARN-2712) Adding tests about FSQueue and headroom of FairScheduler to TestWorkPreservingRMRestart
[ https://issues.apache.org/jira/browse/YARN-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189720#comment-14189720 ]

Tsuyoshi OZAWA commented on YARN-2712:
--------------------------------------

[~adhoot] [~kkambatl] [~jianhe] do you have additional comments?

Adding tests about FSQueue and headroom of FairScheduler to TestWorkPreservingRMRestart
---------------------------------------------------------------------------------------

Key: YARN-2712
URL: https://issues.apache.org/jira/browse/YARN-2712
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
Attachments: YARN-2712.1.patch, YARN-2712.2.patch

TestWorkPreservingRMRestart#testSchedulerRecovery partially lacks test cases for FairScheduler. We should add them.

{code}
// Until YARN-1959 is resolved
if (scheduler.getClass() != FairScheduler.class) {
  assertEquals(availableResources, schedulerAttempt.getHeadroom());
}
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-2771) DistributedShell's DSConstants are badly named
[ https://issues.apache.org/jira/browse/YARN-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189724#comment-14189724 ]

Hadoop QA commented on YARN-2771:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12678058/YARN-2771.1.patch
against trunk revision 0126cf1.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5638//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5638//console

This message is automatically generated.
DistributedShell's DSConstants are badly named
----------------------------------------------

Key: YARN-2771
URL: https://issues.apache.org/jira/browse/YARN-2771
Project: Hadoop YARN
Issue Type: Bug
Components: applications/distributed-shell
Reporter: Vinod Kumar Vavilapalli
Assignee: Zhijie Shen
Attachments: YARN-2771.1.patch

I'd rather have underscores (DISTRIBUTED_SHELL_TIMELINE_DOMAIN instead of DISTRIBUTEDSHELLTIMELINEDOMAIN). DISTRIBUTEDSHELLTIMELINEDOMAIN was added in this release; can we rename it to DISTRIBUTED_SHELL_TIMELINE_DOMAIN? For the old envs, we can just add new envs that point to the old ones and deprecate the old ones.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
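The rename-plus-deprecate pattern suggested for the old envs might look like the following sketch (the constant value is a placeholder, and this is not the actual DSConstants source): the readable name becomes canonical, and the old name remains as a deprecated alias so existing scripts keep working.

```java
/**
 * Hypothetical sketch of the renaming pattern: introduce the
 * underscore-separated name and keep the old one as a deprecated alias.
 */
class DSConstantsSketch {
    // New, readable env name (value is a placeholder for illustration).
    static final String DISTRIBUTED_SHELL_TIMELINE_DOMAIN =
        "DISTRIBUTED_SHELL_TIMELINE_DOMAIN";

    /** @deprecated use {@link #DISTRIBUTED_SHELL_TIMELINE_DOMAIN} instead. */
    @Deprecated
    static final String DISTRIBUTEDSHELLTIMELINEDOMAIN =
        DISTRIBUTED_SHELL_TIMELINE_DOMAIN;
}
```

Because the alias shares the same value, code reading either constant resolves to the same environment variable.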
[jira] [Commented] (YARN-2712) Adding tests about FSQueue and headroom of FairScheduler to TestWorkPreservingRMRestart
[ https://issues.apache.org/jira/browse/YARN-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189742#comment-14189742 ]

Karthik Kambatla commented on YARN-2712:
----------------------------------------

LGTM, +1.

Adding tests about FSQueue and headroom of FairScheduler to TestWorkPreservingRMRestart
---------------------------------------------------------------------------------------

Key: YARN-2712
URL: https://issues.apache.org/jira/browse/YARN-2712
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
Attachments: YARN-2712.1.patch, YARN-2712.2.patch

TestWorkPreservingRMRestart#testSchedulerRecovery partially lacks test cases for FairScheduler. We should add them.

{code}
// Until YARN-1959 is resolved
if (scheduler.getClass() != FairScheduler.class) {
  assertEquals(availableResources, schedulerAttempt.getHeadroom());
}
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-2712) TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks
[ https://issues.apache.org/jira/browse/YARN-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-2712:
-----------------------------------
Summary: TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks (was: Adding tests about FSQueue and headroom of FairScheduler to TestWorkPreservingRMRestart)

TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks
----------------------------------------------------------------------------

Key: YARN-2712
URL: https://issues.apache.org/jira/browse/YARN-2712
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
Attachments: YARN-2712.1.patch, YARN-2712.2.patch

TestWorkPreservingRMRestart#testSchedulerRecovery partially lacks test cases for FairScheduler. We should add them.

{code}
// Until YARN-1959 is resolved
if (scheduler.getClass() != FairScheduler.class) {
  assertEquals(availableResources, schedulerAttempt.getHeadroom());
}
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-2712) TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks
[ https://issues.apache.org/jira/browse/YARN-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189754#comment-14189754 ]

Hudson commented on YARN-2712:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #6392 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6392/])
YARN-2712. TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks. (Tsuyoshi Ozawa via kasha) (kasha: rev 179cab81e0bde1af0cba6131f16ff127358a)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java

TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks
----------------------------------------------------------------------------

Key: YARN-2712
URL: https://issues.apache.org/jira/browse/YARN-2712
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
Attachments: YARN-2712.1.patch, YARN-2712.2.patch

TestWorkPreservingRMRestart#testSchedulerRecovery partially lacks test cases for FairScheduler. We should add them.

{code}
// Until YARN-1959 is resolved
if (scheduler.getClass() != FairScheduler.class) {
  assertEquals(availableResources, schedulerAttempt.getHeadroom());
}
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (YARN-2775) There is no close method in NMWebServices#getLogs()
skrho created YARN-2775:
------------------------

Summary: There is no close method in NMWebServices#getLogs()
Key: YARN-2775
URL: https://issues.apache.org/jira/browse/YARN-2775
Project: Hadoop YARN
Issue Type: Improvement
Components: nodemanager
Reporter: skrho
Priority: Minor

If the getLogs method is called, FileInputStream objects accumulate in memory, because the fileInputStream object is never closed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-2775) There is no close method in NMWebServices#getLogs()
[ https://issues.apache.org/jira/browse/YARN-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

skrho updated YARN-2775:
------------------------
Attachment: YARN-2775_001.patch

I added a close call, so the fileInputStream object no longer accumulates in memory. How about that?

There is no close method in NMWebServices#getLogs()
---------------------------------------------------

Key: YARN-2775
URL: https://issues.apache.org/jira/browse/YARN-2775
Project: Hadoop YARN
Issue Type: Improvement
Components: nodemanager
Reporter: skrho
Priority: Minor
Attachments: YARN-2775_001.patch

If the getLogs method is called, FileInputStream objects accumulate in memory, because the fileInputStream object is never closed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
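The usual fix for this kind of leak is try-with-resources, which guarantees the stream is closed even when the copy fails partway. A simplified stand-in (not the actual NMWebServices code; the class and method names here are hypothetical):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical sketch of the leak fix: the stream feeding the response
 * must be closed when the transfer finishes. try-with-resources runs
 * close() on both the success and the failure path.
 */
class LogStreamer {
    static String streamLog(InputStream logStream) throws IOException {
        // The try-with-resources block closes logStream automatically.
        try (InputStream in = logStream) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toString("UTF-8");
        }
    }
}
```

Without the try-with-resources (or an equivalent finally block), each getLogs call would leave an open FileInputStream behind, which is exactly the accumulation the report describes.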
[jira] [Commented] (YARN-2712) TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks
[ https://issues.apache.org/jira/browse/YARN-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189835#comment-14189835 ]

Tsuyoshi OZAWA commented on YARN-2712:
--------------------------------------

Thanks Anubhav and Karthik for the reviews.

TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks
----------------------------------------------------------------------------

Key: YARN-2712
URL: https://issues.apache.org/jira/browse/YARN-2712
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
Fix For: 2.7.0
Attachments: YARN-2712.1.patch, YARN-2712.2.patch

TestWorkPreservingRMRestart#testSchedulerRecovery partially lacks test cases for FairScheduler. We should add them.

{code}
// Until YARN-1959 is resolved
if (scheduler.getClass() != FairScheduler.class) {
  assertEquals(availableResources, schedulerAttempt.getHeadroom());
}
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (YARN-2776) In HA mode, can't set ip but hostname to yarn.resourcemanager.webapp.address.*
meiyoula created YARN-2776:
---------------------------

Summary: In HA mode, can't set ip but hostname to yarn.resourcemanager.webapp.address.*
Key: YARN-2776
URL: https://issues.apache.org/jira/browse/YARN-2776
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: meiyoula
Priority: Critical

In HA mode, when setting yarn.resourcemanager.webapp.address.* with ip:port, I run a Spark application on YARN. The Spark UI in the YARN web UI on 8080 is OK, but Spark's own page has a bug when the address is redirected to the YARN address.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-2776) In HA mode, can't set ip but hostname to yarn.resourcemanager.webapp.address.*
[ https://issues.apache.org/jira/browse/YARN-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

meiyoula updated YARN-2776:
---------------------------
Attachment: YARN-2766.patch

In HA mode, can't set ip but hostname to yarn.resourcemanager.webapp.address.*
------------------------------------------------------------------------------

Key: YARN-2776
URL: https://issues.apache.org/jira/browse/YARN-2776
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: meiyoula
Priority: Critical
Attachments: YARN-2766.patch

In HA mode, when setting yarn.resourcemanager.webapp.address.* with ip:port, I run a Spark application on YARN. The Spark UI in the YARN web UI on 8080 is OK, but Spark's own page has a bug when the address is redirected to the YARN address.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-2776) In HA mode, can't set ip but hostname to yarn.resourcemanager.webapp.address.*
[ https://issues.apache.org/jira/browse/YARN-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

meiyoula updated YARN-2776:
---------------------------
Description:
In HA mode, when setting yarn.resourcemanager.webapp.address.* with ip:port, I run a Spark application on YARN. The Spark UI in the YARN web UI on 8080 is OK, but Spark's own page has a bug when the address is redirected to the YARN address. The exception looks like this:

WARN | [qtp542345580-71] | /stages/ | org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:561)
javax.servlet.ServletException: Could not determine the proxy server for redirection
    at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:183)
    at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:139)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:370)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:745)

was: In HA mode, when setting yarn.resourcemanager.webapp.address.* with ip:port, I run a Spark application on YARN. The Spark UI in the YARN web UI on 8080 is OK, but Spark's own page has a bug when the address is redirected to the YARN address.

In HA mode, can't set ip but hostname to yarn.resourcemanager.webapp.address.*
------------------------------------------------------------------------------

Key: YARN-2776
URL: https://issues.apache.org/jira/browse/YARN-2776
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: meiyoula
Priority: Critical
Attachments: YARN-2766.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-2776) In HA mode, can't set ip but hostname to yarn.resourcemanager.webapp.address.*

[ https://issues.apache.org/jira/browse/YARN-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

meiyoula updated YARN-2776:
---------------------------
Description:
In HA mode, when setting yarn.resourcemanager.webapp.address.* with ip:port, I run a Spark application on YARN. The Spark UI in the YARN web UI on 8080 is OK, but Spark's own page has a bug when the address is redirected to the YARN address.

The error in the web page:
HTTP ERROR: 500
Problem accessing /stages/. Reason: Server Error

The exception in the log:
WARN | [qtp542345580-71] | /stages/ | org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:561)
javax.servlet.ServletException: Could not determine the proxy server for redirection
    at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:183)
    at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:139)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-2776) In HA mode, can't set ip but hostname to yarn.resourcemanager.webapp.address.*

[ https://issues.apache.org/jira/browse/YARN-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

meiyoula updated YARN-2776:
---------------------------
Description:
In HA mode, when setting yarn.resourcemanager.webapp.address.* with ip:port, I run a Spark application on YARN. The Spark UI in the YARN web UI on 8080 is OK, but Spark's own page has a bug when the address is redirected to the YARN address. But when setting yarn.resourcemanager.webapp.address.* with hostname:port, both web pages are OK.

The error in the web page:
HTTP ERROR: 500
Problem accessing /stages/. Reason: Server Error

The exception in the log:
WARN | [qtp542345580-71] | /stages/ | org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:561)
javax.servlet.ServletException: Could not determine the proxy server for redirection
    at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:183)
    at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:139)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
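Per the report, AmIpFilter fails to determine the proxy server for redirection when the webapp addresses are configured as IPs, while hostnames work. A hypothetical yarn-site.xml fragment illustrating the working configuration (the host names and RM ids are placeholders):

```xml
<!-- Hypothetical yarn-site.xml fragment; rm1/rm2 and the hosts are placeholders. -->
<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <!-- hostname:port rather than ip:port, which triggers the redirect failure -->
  <value>rm1.example.com:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>rm2.example.com:8088</value>
</property>
```

This is a workaround implied by the report, not the eventual fix tracked by the attached patch.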
[jira] [Updated] (YARN-2762) Provide RMAdminCLI args validation for NodeLabelManager operations
[ https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-2762: - Issue Type: Sub-task (was: Improvement) Parent: YARN-2492 Provide RMAdminCLI args validation for NodeLabelManager operations -- Key: YARN-2762 URL: https://issues.apache.org/jira/browse/YARN-2762 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Rohith Assignee: Rohith Priority: Minor Attachments: YARN-2762.patch All NodeLabel args validations are done on the server side. The same can be done in RMAdminCLI so that unnecessary RPC calls can be avoided. And for input such as x,y,,z,, there is no need to add an empty string; the empty tokens can be skipped instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
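The empty-token skipping described above could look like this on the client side (a hypothetical sketch, not the code in the attached YARN-2762.patch; class and method names here are illustrative):

```java
// Hypothetical sketch of client-side node-label argument validation:
// split a comma-separated argument such as "x,y,,z," and drop empty
// tokens instead of sending them to the server in an RPC call.
import java.util.ArrayList;
import java.util.List;

public class NodeLabelArgSketch {
  public static List<String> parseLabels(String arg) {
    List<String> labels = new ArrayList<>();
    for (String token : arg.split(",")) {
      String label = token.trim();
      if (!label.isEmpty()) {   // skip the empty entries in "x,y,,z,"
        labels.add(label);
      }
    }
    return labels;
  }

  public static void main(String[] args) {
    System.out.println(parseLabels("x,y,,z,"));  // [x, y, z]
  }
}
```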
[jira] [Commented] (YARN-2765) Add leveldb-based implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189934#comment-14189934 ] Tsuyoshi OZAWA commented on YARN-2765: -- Currently we can assume that LeveldbRMStateStore is accessed from a single process, so failure detection itself is done by EmbeddedElector, which depends on ZooKeeper; in addition, LeveldbRMStateStore has no support for fencing. It means we need to launch ZooKeeper anyway, and it's a normal decision to use ZKRMStateStore in this case. Please correct me if I'm wrong. On another front, if we use RocksDB as a backend db of the timeline server, we don't need to use leveldb and it's a good decision to switch the dependency. Add leveldb-based implementation for RMStateStore - Key: YARN-2765 URL: https://issues.apache.org/jira/browse/YARN-2765 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-2765.patch, YARN-2765v2.patch It would be nice to have a leveldb option to the resourcemanager recovery store. Leveldb would provide some benefits over the existing filesystem store such as better support for atomic operations, fewer I/O ops per state update, and far fewer total files on the filesystem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2742) FairSchedulerConfiguration should allow extra spaces between value and unit
[ https://issues.apache.org/jira/browse/YARN-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189970#comment-14189970 ] Hudson commented on YARN-2742: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #728 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/728/]) YARN-2742. FairSchedulerConfiguration should allow extra spaces between value and unit. (Wei Yan via kasha) (kasha: rev 782971ae7a0247bcf5920e10b434b7e0954dd868) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java * hadoop-yarn-project/CHANGES.txt FairSchedulerConfiguration should allow extra spaces between value and unit --- Key: YARN-2742 URL: https://issues.apache.org/jira/browse/YARN-2742 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Wei Yan Priority: Minor Fix For: 2.7.0 Attachments: YARN-2742-1.patch, YARN-2742-2.patch FairSchedulerConfiguration is very strict about the number of space characters between the value and the unit: 0 or 1 space. For example, for values like the following: {noformat} <maxResources>4096  mb, 2 vcores</maxResources> {noformat} (note 2 spaces) The above line fails to parse: {noformat} 2014-10-24 22:56:40,802 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService: Failed to reload fair scheduler config file - will use existing allocations.
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException: Missing resource: mb
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.findResource(FairSchedulerConfiguration.java:247)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.parseResourceConfigValue(FairSchedulerConfiguration.java:231)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:347)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:381)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:293)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService$1.run(AllocationFileLoaderService.java:117) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
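The relaxation the title asks for amounts to accepting zero or more whitespace characters between the number and the unit. A minimal sketch (hypothetical, not the actual FairSchedulerConfiguration code, which handles more units and forms):

```java
// Hypothetical sketch: match "<number> <unit>" while tolerating any amount
// of whitespace between value and unit. Using "\\s*" (zero or more) instead
// of an optional single space is the essential change YARN-2742 describes.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResourceValueSketch {
  private static final Pattern MB = Pattern.compile("(\\d+)\\s*mb");

  public static int parseMb(String value) {
    Matcher m = MB.matcher(value.toLowerCase());
    if (!m.find()) {
      throw new IllegalArgumentException("Missing resource: mb");
    }
    return Integer.parseInt(m.group(1));
  }

  public static void main(String[] args) {
    System.out.println(parseMb("4096  mb"));  // 4096 -- two spaces parse fine
  }
}
```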
[jira] [Commented] (YARN-2769) Timeline server domain not set correctly when using shell_command on Windows
[ https://issues.apache.org/jira/browse/YARN-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189965#comment-14189965 ] Hudson commented on YARN-2769: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #728 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/728/]) YARN-2769. Fixed the problem that timeline domain is not set in distributed shell AM when using shell_command on Windows. Contributed by Varun Vasudev. (zjshen: rev a8c120222047280234c3411ce1c1c9b17f08c851) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java Timeline server domain not set correctly when using shell_command on Windows Key: YARN-2769 URL: https://issues.apache.org/jira/browse/YARN-2769 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Varun Vasudev Assignee: Varun Vasudev Fix For: 2.6.0 Attachments: apache-yarn-2769.0.patch The bug is caught by one of the unit tests which fails. {noformat} Running org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 37.661 sec FAILURE! - in org.apache.hadoop.yarn.applications.distribut testDSShellWithDomain(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) Time elapsed: 37.366 sec FAILURE! org.junit.ComparisonFailure: expected:[TEST_DOMAIN] but was:[DEFAULT] at org.junit.Assert.assertEquals(Assert.java:115) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:290) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithDomain(TestDistributedShell.java:179) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2712) TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks
[ https://issues.apache.org/jira/browse/YARN-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189973#comment-14189973 ] Hudson commented on YARN-2712: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #728 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/728/]) YARN-2712. TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks. (Tsuyoshi Ozawa via kasha) (kasha: rev 179cab81e0bde1af0cba6131f16ff127358a) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks Key: YARN-2712 URL: https://issues.apache.org/jira/browse/YARN-2712 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.7.0 Attachments: YARN-2712.1.patch, YARN-2712.2.patch TestWorkPreservingRMRestart#testSchedulerRecovery doesn't have test cases about FairScheduler partially. We should support them. {code} // Until YARN-1959 is resolved if (scheduler.getClass() != FairScheduler.class) { assertEquals(availableResources, schedulerAttempt.getHeadroom()); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2742) FairSchedulerConfiguration should allow extra spaces between value and unit
[ https://issues.apache.org/jira/browse/YARN-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190084#comment-14190084 ] Hudson commented on YARN-2742: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1917 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1917/]) YARN-2742. FairSchedulerConfiguration should allow extra spaces between value and unit. (Wei Yan via kasha) (kasha: rev 782971ae7a0247bcf5920e10b434b7e0954dd868) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java FairSchedulerConfiguration should allow extra spaces between value and unit --- Key: YARN-2742 URL: https://issues.apache.org/jira/browse/YARN-2742 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Wei Yan Priority: Minor Fix For: 2.7.0 Attachments: YARN-2742-1.patch, YARN-2742-2.patch FairSchedulerConfiguration is very strict about the number of space characters between the value and the unit: 0 or 1 space. For example, for values like the following: {noformat} maxResources4096 mb, 2 vcoresmaxResources {noformat} (note 2 spaces) This above line fails to parse: {noformat} 2014-10-24 22:56:40,802 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService: Failed to reload fair scheduler config file - will use existing allocations. 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException: Missing resource: mb
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.findResource(FairSchedulerConfiguration.java:247)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.parseResourceConfigValue(FairSchedulerConfiguration.java:231)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:347)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:381)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:293)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService$1.run(AllocationFileLoaderService.java:117) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2712) TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks
[ https://issues.apache.org/jira/browse/YARN-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190087#comment-14190087 ] Hudson commented on YARN-2712: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1917 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1917/]) YARN-2712. TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks. (Tsuyoshi Ozawa via kasha) (kasha: rev 179cab81e0bde1af0cba6131f16ff127358a) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks Key: YARN-2712 URL: https://issues.apache.org/jira/browse/YARN-2712 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.7.0 Attachments: YARN-2712.1.patch, YARN-2712.2.patch TestWorkPreservingRMRestart#testSchedulerRecovery doesn't have test cases about FairScheduler partially. We should support them. {code} // Until YARN-1959 is resolved if (scheduler.getClass() != FairScheduler.class) { assertEquals(availableResources, schedulerAttempt.getHeadroom()); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2769) Timeline server domain not set correctly when using shell_command on Windows
[ https://issues.apache.org/jira/browse/YARN-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190079#comment-14190079 ] Hudson commented on YARN-2769: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1917 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1917/]) YARN-2769. Fixed the problem that timeline domain is not set in distributed shell AM when using shell_command on Windows. Contributed by Varun Vasudev. (zjshen: rev a8c120222047280234c3411ce1c1c9b17f08c851) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java Timeline server domain not set correctly when using shell_command on Windows Key: YARN-2769 URL: https://issues.apache.org/jira/browse/YARN-2769 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Varun Vasudev Assignee: Varun Vasudev Fix For: 2.6.0 Attachments: apache-yarn-2769.0.patch The bug is caught by one of the unit tests which fails. {noformat} Running org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 37.661 sec FAILURE! - in org.apache.hadoop.yarn.applications.distribut testDSShellWithDomain(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) Time elapsed: 37.366 sec FAILURE! org.junit.ComparisonFailure: expected:[TEST_DOMAIN] but was:[DEFAULT] at org.junit.Assert.assertEquals(Assert.java:115) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:290) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithDomain(TestDistributedShell.java:179) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2677) registry punycoding of usernames doesn't fix all usernames to be DNS-valid
[ https://issues.apache.org/jira/browse/YARN-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-2677: - Attachment: YARN-2677-002.patch Updated patch which also converts usernames to lower case (English locale). This handles names like {{Administrator}}, which you can see on Windows. registry punycoding of usernames doesn't fix all usernames to be DNS-valid -- Key: YARN-2677 URL: https://issues.apache.org/jira/browse/YARN-2677 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Affects Versions: 2.6.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: YARN-2677-001.patch, YARN-2677-002.patch The registry restricts names to DNS-valid names only, to retain the future option of DNS exporting of the registry. To handle complex usernames, it punycodes the username first, using Java's {{java.net.IDN}} class. This turns out to only map high unicode to ASCII, and does nothing for ASCII-but-invalid-hostname chars, so it stops users with DNS-illegal names (e.g. with an underscore in them) from being able to register. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
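The behaviour described above can be seen directly: {{java.net.IDN}} only punycodes non-ASCII characters, so ASCII-but-DNS-invalid names pass through untouched, and case is preserved, hence the extra English-locale lowercasing step in the patch. A hedged illustration (the usernames here are examples, not taken from the patch):

```java
// Illustration of the YARN-2677 observation: IDN.toASCII maps high unicode
// to ASCII punycode, but leaves ASCII-but-invalid-hostname characters such
// as underscores alone, and does not change case.
import java.net.IDN;
import java.util.Locale;

public class RegistryNameSketch {
  public static String toDnsish(String username) {
    // IDN handles unicode -> ASCII; lowercasing must be done separately.
    return IDN.toASCII(username).toLowerCase(Locale.ENGLISH);
  }

  public static void main(String[] args) {
    System.out.println(toDnsish("Administrator"));  // administrator
    // underscore survives untouched, so the name is still DNS-invalid:
    System.out.println(toDnsish("bob_smith"));      // bob_smith
  }
}
```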
[jira] [Commented] (YARN-2677) registry punycoding of usernames doesn't fix all usernames to be DNS-valid
[ https://issues.apache.org/jira/browse/YARN-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190151#comment-14190151 ] Hadoop QA commented on YARN-2677: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678180/YARN-2677-002.patch against trunk revision 179cab8. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5639//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5639//console This message is automatically generated. registry punycoding of usernames doesn't fix all usernames to be DNS-valid -- Key: YARN-2677 URL: https://issues.apache.org/jira/browse/YARN-2677 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Affects Versions: 2.6.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: YARN-2677-001.patch, YARN-2677-002.patch The registry has a restriction DNS-valid names only to retain the future option of DNS exporting of the registry. 
To handle complex usernames, it punycodes the username first, using Java's {{java.net.IDN}} class. This turns out to only map high unicode to ASCII, and does nothing for ASCII-but-invalid-hostname chars, so it stops users with DNS-illegal names (e.g. with an underscore in them) from being able to register. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2755) NM fails to clean up usercache_DEL_timestamp dirs after YARN-661
[ https://issues.apache.org/jira/browse/YARN-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190172#comment-14190172 ] Jason Lowe commented on YARN-2755: -- +1, committing this. NM fails to clean up usercache_DEL_timestamp dirs after YARN-661 -- Key: YARN-2755 URL: https://issues.apache.org/jira/browse/YARN-2755 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Priority: Critical Attachments: YARN-2755.v1.patch, YARN-2755.v2.patch, YARN-2755.v3.patch, YARN-2755.v4.patch When NM restarts frequently due to some reason, a large number of directories like these left in /data/disk$num/yarn/local/: /data/disk1/yarn/local/usercache_DEL_1414372756105 /data/disk1/yarn/local/usercache_DEL_1413557901696 /data/disk1/yarn/local/usercache_DEL_1413657004894 /data/disk1/yarn/local/usercache_DEL_1413675321860 /data/disk1/yarn/local/usercache_DEL_1414093167936 /data/disk1/yarn/local/usercache_DEL_1413565841271 These directories are empty, but take up 100M+ due to the number of them. There were 38714 on the machine I looked at per data disk. It appears to be a regression introduced by YARN-661 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2776) In HA mode, can't set ip but hostname to yarn.resourcemanager.webapp.address.*
[ https://issues.apache.org/jira/browse/YARN-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated YARN-2776: --- Priority: Major (was: Critical) In HA mode, can't set ip but hostname to yarn.resourcemanager.webapp.address.* - Key: YARN-2776 URL: https://issues.apache.org/jira/browse/YARN-2776 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: meiyoula Attachments: YARN-2766.patch In HA mode, when setting yarn.resourcemanager.webapp.address.* with ip:port, I run a Spark application on YARN. The Spark UI in the YARN web UI on port 8080 is OK, but Spark's own page breaks when the address redirects to the YARN address. But when setting yarn.resourcemanager.webapp.address.* with hostname:port, both web pages are OK. The error in the web page: HTTP ERROR: 500 Problem accessing /stages/.
Reason: Server Error
The exception in the log:
WARN | [qtp542345580-71] | /stages/ | org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:561)
javax.servlet.ServletException: Could not determine the proxy server for redirection
at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.findRedirectUrl(AmIpFilter.java:183)
at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:139)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:499)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:428)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2769) Timeline server domain not set correctly when using shell_command on Windows
[ https://issues.apache.org/jira/browse/YARN-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190200#comment-14190200 ] Hudson commented on YARN-2769: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1942 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1942/]) YARN-2769. Fixed the problem that timeline domain is not set in distributed shell AM when using shell_command on Windows. Contributed by Varun Vasudev. (zjshen: rev a8c120222047280234c3411ce1c1c9b17f08c851) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java Timeline server domain not set correctly when using shell_command on Windows Key: YARN-2769 URL: https://issues.apache.org/jira/browse/YARN-2769 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Varun Vasudev Assignee: Varun Vasudev Fix For: 2.6.0 Attachments: apache-yarn-2769.0.patch The bug is caught by one of the unit tests which fails. {noformat} Running org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 37.661 sec FAILURE! - in org.apache.hadoop.yarn.applications.distribut testDSShellWithDomain(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) Time elapsed: 37.366 sec FAILURE! org.junit.ComparisonFailure: expected:[TEST_DOMAIN] but was:[DEFAULT] at org.junit.Assert.assertEquals(Assert.java:115) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:290) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithDomain(TestDistributedShell.java:179) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2712) TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks
[ https://issues.apache.org/jira/browse/YARN-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190208#comment-14190208 ] Hudson commented on YARN-2712: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1942 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1942/]) YARN-2712. TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks. (Tsuyoshi Ozawa via kasha) (kasha: rev 179cab81e0bde1af0cba6131f16ff127358a) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/CHANGES.txt TestWorkPreservingRMRestart: Augment FS tests with queue and headroom checks Key: YARN-2712 URL: https://issues.apache.org/jira/browse/YARN-2712 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.7.0 Attachments: YARN-2712.1.patch, YARN-2712.2.patch TestWorkPreservingRMRestart#testSchedulerRecovery doesn't have test cases about FairScheduler partially. We should support them. {code} // Until YARN-1959 is resolved if (scheduler.getClass() != FairScheduler.class) { assertEquals(availableResources, schedulerAttempt.getHeadroom()); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2742) FairSchedulerConfiguration should allow extra spaces between value and unit
[ https://issues.apache.org/jira/browse/YARN-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190205#comment-14190205 ] Hudson commented on YARN-2742: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1942 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1942/]) YARN-2742. FairSchedulerConfiguration should allow extra spaces between value and unit. (Wei Yan via kasha) (kasha: rev 782971ae7a0247bcf5920e10b434b7e0954dd868) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerConfiguration.java FairSchedulerConfiguration should allow extra spaces between value and unit --- Key: YARN-2742 URL: https://issues.apache.org/jira/browse/YARN-2742 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Wei Yan Priority: Minor Fix For: 2.7.0 Attachments: YARN-2742-1.patch, YARN-2742-2.patch FairSchedulerConfiguration is very strict about the number of space characters between the value and the unit: 0 or 1 space. For example, for values like the following: {noformat} maxResources4096 mb, 2 vcoresmaxResources {noformat} (note 2 spaces) This above line fails to parse: {noformat} 2014-10-24 22:56:40,802 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService: Failed to reload fair scheduler config file - will use existing allocations. 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException: Missing resource: mb
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.findResource(FairSchedulerConfiguration.java:247)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.parseResourceConfigValue(FairSchedulerConfiguration.java:231)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:347)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:381)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:293)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService$1.run(AllocationFileLoaderService.java:117) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2755) NM fails to clean up usercache_DEL_timestamp dirs after YARN-661
[ https://issues.apache.org/jira/browse/YARN-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190221#comment-14190221 ] Hudson commented on YARN-2755: -- FAILURE: Integrated in Hadoop-trunk-Commit #6393 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6393/]) YARN-2755. NM fails to clean up usercache_DEL_timestamp dirs after YARN-661. Contributed by Siqi Li (jlowe: rev 73e626ad91cd5c06a005068d8432fd16e06fe6a0) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerReboot.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * hadoop-yarn-project/CHANGES.txt NM fails to clean up usercache_DEL_timestamp dirs after YARN-661 -- Key: YARN-2755 URL: https://issues.apache.org/jira/browse/YARN-2755 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Priority: Critical Fix For: 2.6.0 Attachments: YARN-2755.v1.patch, YARN-2755.v2.patch, YARN-2755.v3.patch, YARN-2755.v4.patch When NM restarts frequently due to some reason, a large number of directories like these left in /data/disk$num/yarn/local/: /data/disk1/yarn/local/usercache_DEL_1414372756105 /data/disk1/yarn/local/usercache_DEL_1413557901696 /data/disk1/yarn/local/usercache_DEL_1413657004894 /data/disk1/yarn/local/usercache_DEL_1413675321860 /data/disk1/yarn/local/usercache_DEL_1414093167936 /data/disk1/yarn/local/usercache_DEL_1413565841271 These directories are empty, but take up 100M+ due to the number of them. There were 38714 on the machine I looked at per data disk. It appears to be a regression introduced by YARN-661 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-661) NM fails to cleanup local directories for users
[ https://issues.apache.org/jira/browse/YARN-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190220#comment-14190220 ] Hudson commented on YARN-661: - FAILURE: Integrated in Hadoop-trunk-Commit #6393 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6393/]) YARN-2755. NM fails to clean up usercache_DEL_timestamp dirs after YARN-661. Contributed by Siqi Li (jlowe: rev 73e626ad91cd5c06a005068d8432fd16e06fe6a0) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerReboot.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * hadoop-yarn-project/CHANGES.txt NM fails to cleanup local directories for users --- Key: YARN-661 URL: https://issues.apache.org/jira/browse/YARN-661 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.0-beta, 0.23.8 Reporter: Jason Lowe Assignee: Omkar Vinit Joshi Fix For: 2.1.0-beta Attachments: YARN-661-20130701.patch, YARN-661-20130708.patch, YARN-661-20130710.1.patch, YARN-661-20130711.1.patch, YARN-661-20130712.1.patch, YARN-661-20130715.1.patch, YARN-661-20130716.1.patch YARN-71 added deletion of local directories on startup, but in practice it fails to delete the directories because of permission problems. The top-level usercache directory is owned by the user but is in a directory that is not writable by the user. Therefore the deletion of the user's usercache directory, as the user, fails due to lack of permissions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2777) Mark the end of individual log in aggregated log
Ted Yu created YARN-2777: Summary: Mark the end of individual log in aggregated log Key: YARN-2777 URL: https://issues.apache.org/jira/browse/YARN-2777 Project: Hadoop YARN Issue Type: Improvement Reporter: Ted Yu Below is a snippet of an aggregated log showing the HBase master log: {code} LogType: hbase-hbase-master-ip-172-31-34-167.log LogUploadTime: 29-Oct-2014 22:31:55 LogLength: 24103045 Log Contents: Wed Oct 29 15:43:57 UTC 2014 Starting master on ip-172-31-34-167 ... at org.apache.hadoop.hbase.master.cleaner.CleanerChore.chore(CleanerChore.java:124) at org.apache.hadoop.hbase.Chore.run(Chore.java:80) at java.lang.Thread.run(Thread.java:745) LogType: hbase-hbase-master-ip-172-31-34-167.out {code} Since logs from various daemons are aggregated into one log file, it would be desirable to mark the end of one log before starting the next, e.g. with a line such as: {code} End of LogType: hbase-hbase-master-ip-172-31-34-167.log {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2698) Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService
[ https://issues.apache.org/jira/browse/YARN-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2698: - Summary: Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService (was: Move getClusterNodeLabels and getNodeToLabels to YARN CLI instead of RMAdminCLI) Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService --- Key: YARN-2698 URL: https://issues.apache.org/jira/browse/YARN-2698 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Attachments: YARN-2698-20141028-1.patch, YARN-2698-20141028-2.patch, YARN-2698-20141028-3.patch, YARN-2698-20141029-1.patch, YARN-2698-20141029-2.patch YARN RMAdminCLI and AdminService should have write API only, for other read APIs, they should be located at YARNCLI and RMClientService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2698) Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService
[ https://issues.apache.org/jira/browse/YARN-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2698: - Description: YARN AdminService should have write APIs only; other read APIs should be located at RM ClientService. Including: 1) getClusterNodeLabels 2) getNodeToLabels 3) getNodeReport should contain labels was:YARN RMAdminCLI and AdminService should have write API only, for other read APIs, they should be located at YARNCLI and RMClientService. Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService --- Key: YARN-2698 URL: https://issues.apache.org/jira/browse/YARN-2698 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Attachments: YARN-2698-20141028-1.patch, YARN-2698-20141028-2.patch, YARN-2698-20141028-3.patch, YARN-2698-20141029-1.patch, YARN-2698-20141029-2.patch YARN AdminService should have write APIs only; other read APIs should be located at RM ClientService. Including: 1) getClusterNodeLabels 2) getNodeToLabels 3) getNodeReport should contain labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2771) DistributedShell's DSConstants are badly named
[ https://issues.apache.org/jira/browse/YARN-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190430#comment-14190430 ] Zhijie Shen commented on YARN-2771: --- Constant name changes. No need for test case. DistributedShell's DSConstants are badly named -- Key: YARN-2771 URL: https://issues.apache.org/jira/browse/YARN-2771 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-2771.1.patch I'd rather have underscores (DISTRIBUTED_SHELL_TIMELINE_DOMAIN instead of DISTRIBUTEDSHELLTIMELINEDOMAIN). DISTRIBUTEDSHELLTIMELINEDOMAIN is added in this release, can we rename it to be DISTRIBUTED_SHELL_TIMELINE_DOMAIN? For the old envs, we can just add new envs that point to the old-one and deprecate the old ones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2778) YARN node CLI should display labels on returned node reports
Wangda Tan created YARN-2778: Summary: YARN node CLI should display labels on returned node reports Key: YARN-2778 URL: https://issues.apache.org/jira/browse/YARN-2778 Project: Hadoop YARN Issue Type: Sub-task Components: client Reporter: Wangda Tan Assignee: Wangda Tan -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2727) In RMAdminCLI usage display, instead of yarn.node-labels.fs-store.root-dir, yarn.node-labels.fs-store.uri is being displayed
[ https://issues.apache.org/jira/browse/YARN-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190473#comment-14190473 ] Naganarasimha G R commented on YARN-2727: - Seems like the failure is not due to this patch... Can we rerun the test cases with the patch again? In RMAdminCLI usage display, instead of yarn.node-labels.fs-store.root-dir, yarn.node-labels.fs-store.uri is being displayed Key: YARN-2727 URL: https://issues.apache.org/jira/browse/YARN-2727 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Naganarasimha G R Assignee: Naganarasimha G R Priority: Minor Attachments: YARN-2727.20141023.1.patch In the org.apache.hadoop.yarn.client.cli.RMAdminCLI usage display, yarn.node-labels.fs-store.uri is being used instead of yarn.node-labels.fs-store.root-dir. Some modifications to the description are also needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2772) DistributedShell's timeline related options are not clear
[ https://issues.apache.org/jira/browse/YARN-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2772: -- Attachment: YARN-2772.2.patch Made one improvement: if a failure happens when creating a domain, the client will reset the domain ID and make the timeline entities go into the DEFAULT domain. DistributedShell's timeline related options are not clear - Key: YARN-2772 URL: https://issues.apache.org/jira/browse/YARN-2772 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-2772.1.patch, YARN-2772.2.patch The new domain and create options are not descriptive at all. It is also not clear when view_acls and modify_acls need to be set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-2495: Attachment: YARN-2495.20141031-1.patch Updated with fixes for all review comments and test cases, and rebased to trunk Allow admin specify labels from each NM (Distributed configuration) --- Key: YARN-2495 URL: https://issues.apache.org/jira/browse/YARN-2495 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, YARN-2495_20141022.1.patch The target of this JIRA is to allow admins to specify labels on each NM; this covers - Users can set labels on each NM (by setting yarn-site.xml or using a script suggested by [~aw]) - NM will send labels to RM via the ResourceTracker API - RM will set labels in NodeLabelManager when the NM registers/updates labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2729) Support script based NodeLabelsProvider Interface in Distributed Node Label Configuration Setup
[ https://issues.apache.org/jira/browse/YARN-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-2729: Attachment: YARN-2729.20141031-1.patch Updated with review comments and test cases Support script based NodeLabelsProvider Interface in Distributed Node Label Configuration Setup --- Key: YARN-2729 URL: https://issues.apache.org/jira/browse/YARN-2729 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Naganarasimha G R Assignee: Naganarasimha G R Attachments: YARN-2729.20141023-1.patch, YARN-2729.20141024-1.patch, YARN-2729.20141031-1.patch Support script based NodeLabelsProvider Interface in Distributed Node Label Configuration Setup . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2772) DistributedShell's timeline related options are not clear
[ https://issues.apache.org/jira/browse/YARN-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190566#comment-14190566 ] Hadoop QA commented on YARN-2772: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678251/YARN-2772.2.patch against trunk revision b811212. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5640//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5640//console This message is automatically generated. DistributedShell's timeline related options are not clear - Key: YARN-2772 URL: https://issues.apache.org/jira/browse/YARN-2772 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-2772.1.patch, YARN-2772.2.patch The new options domain and create options - they are not descriptive at all. It is also not clear when view_acls and modify_acls need to be set. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2698) Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService
[ https://issues.apache.org/jira/browse/YARN-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2698: - Attachment: YARN-2698-20141030-1.patch Attached patch contains the get-node-reports change Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService --- Key: YARN-2698 URL: https://issues.apache.org/jira/browse/YARN-2698 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Attachments: YARN-2698-20141028-1.patch, YARN-2698-20141028-2.patch, YARN-2698-20141028-3.patch, YARN-2698-20141029-1.patch, YARN-2698-20141029-2.patch, YARN-2698-20141030-1.patch YARN AdminService should have write APIs only; other read APIs should be located at RM ClientService. Including: 1) getClusterNodeLabels 2) getNodeToLabels 3) getNodeReport should contain labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2771) DistributedShell's DSConstants are badly named
[ https://issues.apache.org/jira/browse/YARN-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2771: -- Attachment: YARN-2771.2.patch Just realized I've missed one suggestion from Vinod in the description. I think it makes sense to keep the old constants and mark them deprecated. In addition, I altered the logic in the AM to accept both the new and old env vars. DistributedShell's DSConstants are badly named -- Key: YARN-2771 URL: https://issues.apache.org/jira/browse/YARN-2771 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-2771.1.patch, YARN-2771.2.patch I'd rather have underscores (DISTRIBUTED_SHELL_TIMELINE_DOMAIN instead of DISTRIBUTEDSHELLTIMELINEDOMAIN). DISTRIBUTEDSHELLTIMELINEDOMAIN is added in this release, can we rename it to be DISTRIBUTED_SHELL_TIMELINE_DOMAIN? For the old envs, we can just add new envs that point to the old-one and deprecate the old ones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2779) SystemMetricsPublisher needs to renew and cancel timeline DT too
Zhijie Shen created YARN-2779: - Summary: SystemMetricsPublisher needs to renew and cancel timeline DT too Key: YARN-2779 URL: https://issues.apache.org/jira/browse/YARN-2779 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.6.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical SystemMetricsPublisher is going to grab a timeline DT in secure mode as well. The timeline DT will expire after 24h. No DT renewer will handle the renewing work for SystemMetricsPublisher, so it has to handle this itself. In addition, SystemMetricsPublisher should also cancel the timeline DT when it is stopped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2771) DistributedShell's DSConstants are badly named
[ https://issues.apache.org/jira/browse/YARN-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190685#comment-14190685 ] Hadoop QA commented on YARN-2771: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678276/YARN-2771.2.patch against trunk revision c2866ac. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5642//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5642//console This message is automatically generated. 
DistributedShell's DSConstants are badly named -- Key: YARN-2771 URL: https://issues.apache.org/jira/browse/YARN-2771 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-2771.1.patch, YARN-2771.2.patch I'd rather have underscores (DISTRIBUTED_SHELL_TIMELINE_DOMAIN instead of DISTRIBUTEDSHELLTIMELINEDOMAIN). DISTRIBUTEDSHELLTIMELINEDOMAIN is added in this release, can we rename it to be DISTRIBUTED_SHELL_TIMELINE_DOMAIN? For the old envs, we can just add new envs that point to the old-one and deprecate the old ones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2766) ApplicationHistoryManager is expected to return a sorted list of apps/attempts/containers
[ https://issues.apache.org/jira/browse/YARN-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-2766: Attachment: YARN-2766.patch Makes sense. The new patch updates ApplicationHistoryManagerOnTimelineStore instead of the protobuf classes. It replaces the HashMap with a LinkedHashMap, which does maintain order (the data is loaded from the store into the Map in a consistent order). ApplicationHistoryManager is expected to return a sorted list of apps/attempts/containers -- Key: YARN-2766 URL: https://issues.apache.org/jira/browse/YARN-2766 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: 2.6.0 Reporter: Robert Kanter Assignee: Robert Kanter Attachments: YARN-2766.patch, YARN-2766.patch, YARN-2766.patch, YARN-2766.patch {{TestApplicationHistoryClientService.testContainers}} and {{TestApplicationHistoryClientService.testApplicationAttempts}} both fail because the test assertions are assuming a returned Collection is in a certain order. The collection comes from a HashMap, so the order is not guaranteed, plus, according to [this page|http://docs.oracle.com/javase/8/docs/technotes/guides/collections/changes8.html], there are situations where the iteration order of a HashMap will be different between Java 7 and 8. We should fix the test code to not assume a specific ordering. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
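As a side note on the fix above, the difference is easy to demonstrate: {{LinkedHashMap}} guarantees iteration in insertion order, whereas {{HashMap}} does not. A minimal sketch (the class and the attempt IDs below are invented for illustration, not taken from the patch):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class Order {
    // Load entries in the order the store returns them; a LinkedHashMap
    // preserves that order on iteration, unlike a plain HashMap.
    public static Map<String, Integer> insertionOrdered() {
        Map<String, Integer> reports = new LinkedHashMap<>();
        reports.put("appattempt_1_0001_000001", 1);
        reports.put("appattempt_1_0002_000001", 2);
        reports.put("appattempt_1_0003_000001", 3);
        return reports;
    }

    public static void main(String[] args) {
        // Iteration order == insertion order, so callers see a stable view.
        System.out.println(String.join(",", insertionOrdered().keySet()));
    }
}
```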
[jira] [Created] (YARN-2780) Log aggregated resource allocation in rm-appsummary.log
Koji Noguchi created YARN-2780: -- Summary: Log aggregated resource allocation in rm-appsummary.log Key: YARN-2780 URL: https://issues.apache.org/jira/browse/YARN-2780 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Reporter: Koji Noguchi Priority: Minor YARN-415 added useful information about resource usage by applications. Asking to log that info inside rm-appsummary.log. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-2780) Log aggregated resource allocation in rm-appsummary.log
[ https://issues.apache.org/jira/browse/YARN-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne reassigned YARN-2780: Assignee: Eric Payne Log aggregated resource allocation in rm-appsummary.log --- Key: YARN-2780 URL: https://issues.apache.org/jira/browse/YARN-2780 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Reporter: Koji Noguchi Assignee: Eric Payne Priority: Minor YARN-415 added useful information about resource usage by applications. Asking to log that info inside rm-appsummary.log. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2781) support more flexible policy for uploading in shared cache
Sangjin Lee created YARN-2781: - Summary: support more flexible policy for uploading in shared cache Key: YARN-2781 URL: https://issues.apache.org/jira/browse/YARN-2781 Project: Hadoop YARN Issue Type: Sub-task Reporter: Sangjin Lee Today all resources are always uploaded as long as the client wants to upload them. We may want to implement a feature where the shared cache manager can instruct the node managers not to upload under some circumstances. One example may be uploading a resource only if it has been seen more than N times. This doesn't need to be included in the first version of the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
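A purely hypothetical sketch of the "seen more than N times" policy mentioned above (all names are invented; nothing here is part of the shared cache manager's actual API):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical upload policy: admit a resource to the shared cache only
// after its key has been referenced more than `threshold` times.
public class SeenCountPolicy {
    private final int threshold;
    private final Map<String, Integer> seen = new HashMap<>();

    public SeenCountPolicy(int threshold) {
        this.threshold = threshold;
    }

    // Called each time a client references the resource; returns whether
    // the NM should be instructed to upload it.
    public synchronized boolean shouldUpload(String resourceKey) {
        int count = seen.merge(resourceKey, 1, Integer::sum);
        return count > threshold;
    }

    public static void main(String[] args) {
        SeenCountPolicy policy = new SeenCountPolicy(2);
        for (int i = 1; i <= 3; i++) {
            System.out.println("reference " + i + " -> upload? " + policy.shouldUpload("app.jar"));
        }
    }
}
```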
[jira] [Commented] (YARN-2766) ApplicationHistoryManager is expected to return a sorted list of apps/attempts/containers
[ https://issues.apache.org/jira/browse/YARN-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190820#comment-14190820 ] Hadoop QA commented on YARN-2766: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678285/YARN-2766.patch against trunk revision c2866ac. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5643//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5643//console This message is automatically generated. 
ApplicationHistoryManager is expected to return a sorted list of apps/attempts/containers -- Key: YARN-2766 URL: https://issues.apache.org/jira/browse/YARN-2766 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: 2.6.0 Reporter: Robert Kanter Assignee: Robert Kanter Attachments: YARN-2766.patch, YARN-2766.patch, YARN-2766.patch, YARN-2766.patch {{TestApplicationHistoryClientService.testContainers}} and {{TestApplicationHistoryClientService.testApplicationAttempts}} both fail because the test assertions are assuming a returned Collection is in a certain order. The collection comes from a HashMap, so the order is not guaranteed, plus, according to [this page|http://docs.oracle.com/javase/8/docs/technotes/guides/collections/changes8.html], there are situations where the iteration order of a HashMap will be different between Java 7 and 8. We should fix the test code to not assume a specific ordering. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2677) registry punycoding of usernames doesn't fix all usernames to be DNS-valid
[ https://issues.apache.org/jira/browse/YARN-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated YARN-2677: Hadoop Flags: Reviewed +1 for the patch. I verified this on both Mac and Windows. Thanks, Steve! registry punycoding of usernames doesn't fix all usernames to be DNS-valid -- Key: YARN-2677 URL: https://issues.apache.org/jira/browse/YARN-2677 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Affects Versions: 2.6.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: YARN-2677-001.patch, YARN-2677-002.patch The registry restricts names to DNS-valid ones, to retain the future option of DNS export of the registry. To handle complex usernames, it punycodes the username first, using Java's {{java.net.IDN}} class. This turns out to only map high unicode to ASCII, and does nothing for ASCII-but-invalid-hostname chars, thus stopping users with DNS-illegal names (e.g. with an underscore in them) from being able to register -- This message was sent by Atlassian JIRA (v6.3.4#6332)
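A small sketch of the behavior Steve describes, using only {{java.net.IDN}} from the JDK (the usernames are illustrative): {{toASCII}} punycodes high-unicode input, but passes ASCII-yet-DNS-invalid characters such as underscores through unchanged, which is why those names still fail DNS validation downstream.

```java
import java.net.IDN;

public class IdnDemo {
    public static void main(String[] args) {
        // A high-unicode username is mapped to a punycoded ACE form.
        System.out.println(IDN.toASCII("bücher"));      // xn--bcher-kva
        // An ASCII username with an underscore comes back unchanged,
        // even though '_' is not legal in a DNS hostname label.
        System.out.println(IDN.toASCII("mapred_user")); // mapred_user
    }
}
```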
[jira] [Commented] (YARN-2770) Timeline delegation tokens need to be automatically renewed by the RM
[ https://issues.apache.org/jira/browse/YARN-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190855#comment-14190855 ] Jian He commented on YARN-2770: --- Thanks Zhijie! Some comments: - {{SecurityUtil#getServerPrincipal}} may be useful. {code} if (rmPrincipal != null && rmPrincipal.length() > 0) { renewer = new KerberosName(rmPrincipal).getServiceName(); } {code} - We may replace the token only after the renew has really succeeded. {code} if (!timelineDT.equals(token.getDelegationToken())) { token.setDelegationToken((Token) timelineDT); } {code} - In cancelDelegationToken, why replace the token? Also rename {{renewDTAction}} to {{cancelDT}}. {code} // If the timeline DT to renew is different than cached, replace it. // Token to set every time for retry, because when exception happens, // DelegationTokenAuthenticatedURL will reset it to null; if (!timelineDT.equals(token.getDelegationToken())) { token.setDelegationToken((Token) timelineDT); } {code} - The same DelegationTokenAuthenticatedURL is instantiated multiple times; is it possible to store it as a variable? {code} DelegationTokenAuthenticatedURL authUrl = new DelegationTokenAuthenticatedURL(authenticator, connConfigurator); {code} - Similarly for the timeline client instantiation. {code} TimelineClient client = TimelineClient.createTimelineClient(); client.init(conf); client.start(); {code} Timeline delegation tokens need to be automatically renewed by the RM - Key: YARN-2770 URL: https://issues.apache.org/jira/browse/YARN-2770 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: 2.5.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical Attachments: YARN-2770.1.patch YarnClient will automatically grab a timeline DT for the application and pass it to the app AM. Now the timeline DT renew is still dummy. 
If an app is running for more than 24h (the default DT expiry time), the app AM is no longer able to use the expired DT to communicate with the timeline server. Since the RM caches the credentials of each app and renews the DTs for running apps, we should provide renew hooks similar to what the HDFS DT has for the RM, and set the RM user as the renewer when grabbing the timeline DT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2698) Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService
[ https://issues.apache.org/jira/browse/YARN-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190883#comment-14190883 ] Wangda Tan commented on YARN-2698: -- And please note that I removed getClusterNodeLabels and getNodeToLabels from the RM Admin CLI; reader interfaces shouldn't exist in the RM Admin CLI. Getting labels of running NMs can be done via YARN-2778. Wangda Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService --- Key: YARN-2698 URL: https://issues.apache.org/jira/browse/YARN-2698 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Attachments: YARN-2698-20141028-1.patch, YARN-2698-20141028-2.patch, YARN-2698-20141028-3.patch, YARN-2698-20141029-1.patch, YARN-2698-20141029-2.patch, YARN-2698-20141030-1.patch YARN AdminService should have write APIs only; other read APIs should be located at RM ClientService. Including: 1) getClusterNodeLabels 2) getNodeToLabels 3) getNodeReport should contain labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2698) Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService
[ https://issues.apache.org/jira/browse/YARN-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190907#comment-14190907 ] Hadoop QA commented on YARN-2698: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678275/YARN-2698-20141030-1.patch against trunk revision c2866ac. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5641//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5641//console This message is automatically generated. 
Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService --- Key: YARN-2698 URL: https://issues.apache.org/jira/browse/YARN-2698 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Attachments: YARN-2698-20141028-1.patch, YARN-2698-20141028-2.patch, YARN-2698-20141028-3.patch, YARN-2698-20141029-1.patch, YARN-2698-20141029-2.patch, YARN-2698-20141030-1.patch YARN AdminService should have write APIs only; other read APIs should be located at RM ClientService. Including: 1) getClusterNodeLabels 2) getNodeToLabels 3) getNodeReport should contain labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-2779) SystemMetricsPublisher needs to renew and cancel timeline DT too
[ https://issues.apache.org/jira/browse/YARN-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli resolved YARN-2779. --- Resolution: Invalid This doesn't make any sense. RM can simply use Kerberos-based auth to talk to the Timeline service. It doesn't need tokens at all. Closing this as invalid; reopen if you disagree. SystemMetricsPublisher needs to renew and cancel timeline DT too Key: YARN-2779 URL: https://issues.apache.org/jira/browse/YARN-2779 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.6.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical SystemMetricsPublisher is going to grab a timeline DT in secure mode as well. The timeline DT will expire after 24h. No DT renewer will handle renewal for SystemMetricsPublisher, so it has to handle this itself. In addition, SystemMetricsPublisher should cancel the timeline DT when it is stopped, too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2698) Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService
[ https://issues.apache.org/jira/browse/YARN-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190922#comment-14190922 ] Wangda Tan commented on YARN-2698: -- Tried to run the tests locally; they pass, so the failures should not be related to this change Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService --- Key: YARN-2698 URL: https://issues.apache.org/jira/browse/YARN-2698 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Attachments: YARN-2698-20141028-1.patch, YARN-2698-20141028-2.patch, YARN-2698-20141028-3.patch, YARN-2698-20141029-1.patch, YARN-2698-20141029-2.patch, YARN-2698-20141030-1.patch YARN AdminService should have write APIs only; read APIs should be located in the RM ClientService. This includes: 1) getClusterNodeLabels 2) getNodeToLabels 3) getNodeReport should contain labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2730) Only one localizer can run on a NodeManager at a time
[ https://issues.apache.org/jira/browse/YARN-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190947#comment-14190947 ] Siqi Li commented on YARN-2730: --- [~jlowe] Hi Jason, do you have some time to take a look at this patch? Only one localizer can run on a NodeManager at a time - Key: YARN-2730 URL: https://issues.apache.org/jira/browse/YARN-2730 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.4.0 Reporter: Siqi Li Assignee: Siqi Li Priority: Critical Attachments: YARN-2730.v1.patch We are seeing that when one of the LocalizerRunners is stuck, the rest of the LocalizerRunners are blocked. We should remove the synchronized modifier. The synchronized modifier appears to have been added by https://issues.apache.org/jira/browse/MAPREDUCE-3537 It could be removed if the Localizer doesn't depend on the current directory -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2778) YARN node CLI should display labels on returned node reports
[ https://issues.apache.org/jira/browse/YARN-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2778: - Attachment: YARN-2778-20141030-1.patch YARN node CLI should display labels on returned node reports Key: YARN-2778 URL: https://issues.apache.org/jira/browse/YARN-2778 Project: Hadoop YARN Issue Type: Sub-task Components: client Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2778-20141030-1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (YARN-2779) SystemMetricsPublisher needs to renew and cancel timeline DT too
[ https://issues.apache.org/jira/browse/YARN-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen reopened YARN-2779: --- SystemMetricsPublisher needs to renew and cancel timeline DT too Key: YARN-2779 URL: https://issues.apache.org/jira/browse/YARN-2779 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.6.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical SystemMetricsPublisher is going to grab a timeline DT in secure mode as well. The timeline DT will expire after 24h. No DT renewer will handle renewal for SystemMetricsPublisher, so it has to handle this itself. In addition, SystemMetricsPublisher should cancel the timeline DT when it is stopped, too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2779) SystemMetricsPublisher needs to renew and cancel timeline DT too
[ https://issues.apache.org/jira/browse/YARN-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190955#comment-14190955 ] Zhijie Shen commented on YARN-2779: --- [~vinodkv], in the current code base, we're having SystemMetricsPublisher grab a timeline DT to talk to the timeline server in secure mode. That's why we need this Jira to add the renew and cancel work. But thinking about this issue again, it should be okay to let RM talk to the timeline server with Kerberos directly. As this is the only such process, it will not add too much load to the Kerberos server. So instead, let's remove the DT-getting logic and let RM use Kerberos directly. SystemMetricsPublisher needs to renew and cancel timeline DT too Key: YARN-2779 URL: https://issues.apache.org/jira/browse/YARN-2779 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.6.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical SystemMetricsPublisher is going to grab a timeline DT in secure mode as well. The timeline DT will expire after 24h. No DT renewer will handle renewal for SystemMetricsPublisher, so it has to handle this itself. In addition, SystemMetricsPublisher should cancel the timeline DT when it is stopped, too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2779) SystemMetricsPublisher can use Kerberos directly instead of timeline DT
[ https://issues.apache.org/jira/browse/YARN-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2779: -- Summary: SystemMetricsPublisher can use Kerberos directly instead of timeline DT (was: SystemMetricsPublisher needs to renew and cancel timeline DT too) SystemMetricsPublisher can use Kerberos directly instead of timeline DT --- Key: YARN-2779 URL: https://issues.apache.org/jira/browse/YARN-2779 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.6.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical SystemMetricsPublisher is going to grab a timeline DT in secure mode as well. The timeline DT will expire after 24h. No DT renewer will handle renewal for SystemMetricsPublisher, so it has to handle this itself. In addition, SystemMetricsPublisher should cancel the timeline DT when it is stopped, too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2677) registry punycoding of usernames doesn't fix all usernames to be DNS-valid
[ https://issues.apache.org/jira/browse/YARN-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190971#comment-14190971 ] Hudson commented on YARN-2677: -- SUCCESS: Integrated in Hadoop-trunk-Commit #6399 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6399/]) YARN-2677 registry punycoding of usernames doesn't fix all usernames to be DNS-valid (stevel) (stevel: rev 81fe8e414748161f537e6902021d63928f8635f1) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/test/java/org/apache/hadoop/registry/operations/TestRegistryOperations.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/client/impl/zk/RegistryOperationsService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/test/java/org/apache/hadoop/registry/client/binding/TestRegistryOperationUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/client/binding/RegistryUtils.java registry punycoding of usernames doesn't fix all usernames to be DNS-valid -- Key: YARN-2677 URL: https://issues.apache.org/jira/browse/YARN-2677 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Affects Versions: 2.6.0 Reporter: Steve Loughran Assignee: Steve Loughran Fix For: 2.6.0 Attachments: YARN-2677-001.patch, YARN-2677-002.patch The registry restricts names to DNS-valid ones only, to retain the future option of exporting the registry over DNS. To handle complex usernames, it punycodes the username first, using Java's {{java.net.IDN}} class. This turns out to only map high unicode to ASCII, and does nothing for ascii-but-invalid-hostname chars, so it stops users with DNS-illegal names (e.g. with an underscore in them) from being able to register -- This message was sent by Atlassian JIRA (v6.3.4#6332)
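The gap described above can be seen in a couple of lines of Java. This is an illustrative sketch, not registry code (the class name is made up): {{IDN.toASCII}} maps non-ASCII input to an "xn--" punycode label, but passes a DNS-illegal ASCII character such as '_' through untouched.

```java
import java.net.IDN;

// Sketch only: demonstrates the java.net.IDN behavior the JIRA describes,
// not the actual RegistryUtils code path.
public class IdnGapDemo {
    public static void main(String[] args) {
        // Non-ASCII input is converted to a DNS-safe "xn--" punycode label.
        System.out.println(IDN.toASCII("h\u00e9llo"));
        // ASCII input containing a DNS-illegal underscore comes back
        // unchanged, so a registry path built from it is still not DNS-valid.
        System.out.println(IDN.toASCII("user_name"));
    }
}
```

This is why punycoding alone is insufficient and usernames with characters like '_' need additional handling before they can be used as DNS-valid registry names.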
[jira] [Commented] (YARN-2770) Timeline delegation tokens need to be automatically renewed by the RM
[ https://issues.apache.org/jira/browse/YARN-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190998#comment-14190998 ] Vinod Kumar Vavilapalli commented on YARN-2770: --- Quick comments: - Let's make sure the renewer name mangling imitates MR JobClient; it is easy to get this wrong. - It'll be great to also test separately that renewal can work fine when https is enabled. Timeline delegation tokens need to be automatically renewed by the RM - Key: YARN-2770 URL: https://issues.apache.org/jira/browse/YARN-2770 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: 2.5.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical Attachments: YARN-2770.1.patch YarnClient will automatically grab a timeline DT for the application and pass it to the app AM. Now the timeline DT renewal is still a dummy. If an app is running for more than 24h (the default DT expiry time), the app AM is no longer able to use the expired DT to communicate with the timeline server. Since RM will cache the credentials of each app and renew the DTs for running apps, we should provide renew hooks similar to what the HDFS DT has for RM, and set the RM user as the renewer when grabbing the timeline DT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2778) YARN node CLI should display labels on returned node reports
[ https://issues.apache.org/jira/browse/YARN-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2778: - Description: When a user runs yarn node -list .., it should print the labels of the node as well. Building on the changes of YARN-2698, this patch can get node labels from the ResourceManager YARN node CLI should display labels on returned node reports Key: YARN-2778 URL: https://issues.apache.org/jira/browse/YARN-2778 Project: Hadoop YARN Issue Type: Sub-task Components: client Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2778-20141030-1.patch When a user runs yarn node -list .., it should print the labels of the node as well. Building on the changes of YARN-2698, this patch can get node labels from the ResourceManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2771) DistributedShell's DSConstants are badly named
[ https://issues.apache.org/jira/browse/YARN-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2771: -- Attachment: YARN-2771.3.patch Fix the test failure DistributedShell's DSConstants are badly named -- Key: YARN-2771 URL: https://issues.apache.org/jira/browse/YARN-2771 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-2771.1.patch, YARN-2771.2.patch, YARN-2771.3.patch I'd rather have underscores (DISTRIBUTED_SHELL_TIMELINE_DOMAIN instead of DISTRIBUTEDSHELLTIMELINEDOMAIN). DISTRIBUTEDSHELLTIMELINEDOMAIN is added in this release; can we rename it to DISTRIBUTED_SHELL_TIMELINE_DOMAIN? For the old envs, we can just add new envs that point to the old ones and deprecate the old ones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2779) SystemMetricsPublisher can use Kerberos directly instead of timeline DT
[ https://issues.apache.org/jira/browse/YARN-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2779: -- Attachment: YARN-2779.1.patch Uploaded a patch that removes the code for getting the timeline DT in the SystemMetricsPublisher SystemMetricsPublisher can use Kerberos directly instead of timeline DT --- Key: YARN-2779 URL: https://issues.apache.org/jira/browse/YARN-2779 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.6.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical Attachments: YARN-2779.1.patch SystemMetricsPublisher is going to grab a timeline DT in secure mode as well. The timeline DT will expire after 24h. No DT renewer will handle renewal for SystemMetricsPublisher, so it has to handle this itself. In addition, SystemMetricsPublisher should cancel the timeline DT when it is stopped, too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2186) Node Manager uploader service for cache manager
[ https://issues.apache.org/jira/browse/YARN-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-2186: -- Attachment: YARN-2186-trunk-v5.patch v.5 patch posted. It addresses all the review comments from Karthik. To see the diff better, see https://github.com/ctrezzo/hadoop/compare/trunk...sharedcache-4-YARN-2186-uploader To see the changes between v.4 and v.5, see https://github.com/ctrezzo/hadoop/commit/ce090b328505b6acf36f0419a99feeb8ec0cff44 Most of the changes revolve around renaming the protobuf types and the classes. I ended up using SCMUploader as the prefix for most of them. I considered SharedCacheUploader, but it turns out it's a bit too verbose and makes other class names very long. Thanks! Node Manager uploader service for cache manager --- Key: YARN-2186 URL: https://issues.apache.org/jira/browse/YARN-2186 Project: Hadoop YARN Issue Type: Sub-task Reporter: Chris Trezzo Assignee: Chris Trezzo Attachments: YARN-2186-trunk-v1.patch, YARN-2186-trunk-v2.patch, YARN-2186-trunk-v3.patch, YARN-2186-trunk-v4.patch, YARN-2186-trunk-v5.patch Implement the node manager uploader service for the cache manager. This service is responsible for communicating with the node manager when it uploads resources to the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2186) Node Manager uploader service for cache manager
[ https://issues.apache.org/jira/browse/YARN-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191062#comment-14191062 ] Karthik Kambatla commented on YARN-2186: Thanks Sangjin. I have only barely skimmed through the updated patch. A couple of nits before I go through a detailed review: # yarn_server_common_service_protos.proto: Should we rename the upload response content to uploadable? # Thanks for filing the follow-up JIRAs. If you don't mind, can we annotate the TODOs in code with those JIRAs? e.g. {{// TODO (YARN-xyz):}} Node Manager uploader service for cache manager --- Key: YARN-2186 URL: https://issues.apache.org/jira/browse/YARN-2186 Project: Hadoop YARN Issue Type: Sub-task Reporter: Chris Trezzo Assignee: Chris Trezzo Attachments: YARN-2186-trunk-v1.patch, YARN-2186-trunk-v2.patch, YARN-2186-trunk-v3.patch, YARN-2186-trunk-v4.patch, YARN-2186-trunk-v5.patch Implement the node manager uploader service for the cache manager. This service is responsible for communicating with the node manager when it uploads resources to the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2186) Node Manager uploader service for cache manager
[ https://issues.apache.org/jira/browse/YARN-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191066#comment-14191066 ] Sangjin Lee commented on YARN-2186: --- {quote} Thanks for filing the follow up JIRAs. If you don't mind, can we annotate the TODOs in code with those JIRAs. e.g. // TODO (YARN-xyz): {quote} I added the JIRA ids for two, but I missed one. I'll fix it. {quote} yarn_server_common_service_protos.proto: Should we rename the upload response content to uploadable? {quote} I'm not exactly sure which one you're referring to. We have 4 types: SCMUploaderNotifyRequest/Response, and SCMUploaderCanUploadRequest/Response. Are you talking about renaming SCMUploaderCanUploadRequest/Response to SCMUploaderUploadableRequest/Response, or renaming the response field from accepted to uploadable? Node Manager uploader service for cache manager --- Key: YARN-2186 URL: https://issues.apache.org/jira/browse/YARN-2186 Project: Hadoop YARN Issue Type: Sub-task Reporter: Chris Trezzo Assignee: Chris Trezzo Attachments: YARN-2186-trunk-v1.patch, YARN-2186-trunk-v2.patch, YARN-2186-trunk-v3.patch, YARN-2186-trunk-v4.patch, YARN-2186-trunk-v5.patch Implement the node manager uploader service for the cache manager. This service is responsible for communicating with the node manager when it uploads resources to the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2186) Node Manager uploader service for cache manager
[ https://issues.apache.org/jira/browse/YARN-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191070#comment-14191070 ] Karthik Kambatla commented on YARN-2186: The latter: rename the response field of canUpload from accepted to uploadable. Node Manager uploader service for cache manager --- Key: YARN-2186 URL: https://issues.apache.org/jira/browse/YARN-2186 Project: Hadoop YARN Issue Type: Sub-task Reporter: Chris Trezzo Assignee: Chris Trezzo Attachments: YARN-2186-trunk-v1.patch, YARN-2186-trunk-v2.patch, YARN-2186-trunk-v3.patch, YARN-2186-trunk-v4.patch, YARN-2186-trunk-v5.patch Implement the node manager uploader service for the cache manager. This service is responsible for communicating with the node manager when it uploads resources to the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
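For illustration, the rename being discussed might look like this in yarn_server_common_service_protos.proto. This is a hypothetical sketch of the suggestion, not the actual patch contents; the field number and message layout are assumptions:

```proto
// Hypothetical sketch: the can-upload response boolean renamed from
// "accepted" to "uploadable", as suggested in the review.
message SCMUploaderCanUploadResponseProto {
  optional bool uploadable = 1;  // was: optional bool accepted = 1;
}
```

Renaming a protobuf field is wire-compatible as long as the field number stays the same, which is why a review-stage rename like this is cheap.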
[jira] [Commented] (YARN-2186) Node Manager uploader service for cache manager
[ https://issues.apache.org/jira/browse/YARN-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191074#comment-14191074 ] Sangjin Lee commented on YARN-2186: --- Got it. I'll update the patch shortly. Thanks! Node Manager uploader service for cache manager --- Key: YARN-2186 URL: https://issues.apache.org/jira/browse/YARN-2186 Project: Hadoop YARN Issue Type: Sub-task Reporter: Chris Trezzo Assignee: Chris Trezzo Attachments: YARN-2186-trunk-v1.patch, YARN-2186-trunk-v2.patch, YARN-2186-trunk-v3.patch, YARN-2186-trunk-v4.patch, YARN-2186-trunk-v5.patch Implement the node manager uploader service for the cache manager. This service is responsible for communicating with the node manager when it uploads resources to the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2771) DistributedShell's DSConstants are badly named
[ https://issues.apache.org/jira/browse/YARN-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191089#comment-14191089 ] Hadoop QA commented on YARN-2771: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678350/YARN-2771.3.patch against trunk revision 81fe8e4. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5645//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5645//console This message is automatically generated. 
DistributedShell's DSConstants are badly named -- Key: YARN-2771 URL: https://issues.apache.org/jira/browse/YARN-2771 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-2771.1.patch, YARN-2771.2.patch, YARN-2771.3.patch I'd rather have underscores (DISTRIBUTED_SHELL_TIMELINE_DOMAIN instead of DISTRIBUTEDSHELLTIMELINEDOMAIN). DISTRIBUTEDSHELLTIMELINEDOMAIN is added in this release; can we rename it to DISTRIBUTED_SHELL_TIMELINE_DOMAIN? For the old envs, we can just add new envs that point to the old ones and deprecate the old ones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2186) Node Manager uploader service for cache manager
[ https://issues.apache.org/jira/browse/YARN-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-2186: -- Attachment: YARN-2186-trunk-v6.patch v.6 patch posted. Node Manager uploader service for cache manager --- Key: YARN-2186 URL: https://issues.apache.org/jira/browse/YARN-2186 Project: Hadoop YARN Issue Type: Sub-task Reporter: Chris Trezzo Assignee: Chris Trezzo Attachments: YARN-2186-trunk-v1.patch, YARN-2186-trunk-v2.patch, YARN-2186-trunk-v3.patch, YARN-2186-trunk-v4.patch, YARN-2186-trunk-v5.patch, YARN-2186-trunk-v6.patch Implement the node manager uploader service for the cache manager. This service is responsible for communicating with the node manager when it uploads resources to the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2698) Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService
[ https://issues.apache.org/jira/browse/YARN-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191104#comment-14191104 ] Hadoop QA commented on YARN-2698: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678275/YARN-2698-20141030-1.patch against trunk revision 5e3f428. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5644//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5644//console This message is automatically generated. 
Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService --- Key: YARN-2698 URL: https://issues.apache.org/jira/browse/YARN-2698 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Attachments: YARN-2698-20141028-1.patch, YARN-2698-20141028-2.patch, YARN-2698-20141028-3.patch, YARN-2698-20141029-1.patch, YARN-2698-20141029-2.patch, YARN-2698-20141030-1.patch YARN AdminService should have write APIs only; read APIs should be located in the RM ClientService. This includes: 1) getClusterNodeLabels 2) getNodeToLabels 3) getNodeReport should contain labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2186) Node Manager uploader service for cache manager
[ https://issues.apache.org/jira/browse/YARN-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191105#comment-14191105 ] Hadoop QA commented on YARN-2186: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678352/YARN-2186-trunk-v5.patch against trunk revision 81fe8e4. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5647//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5647//console This message is automatically generated. 
Node Manager uploader service for cache manager --- Key: YARN-2186 URL: https://issues.apache.org/jira/browse/YARN-2186 Project: Hadoop YARN Issue Type: Sub-task Reporter: Chris Trezzo Assignee: Chris Trezzo Attachments: YARN-2186-trunk-v1.patch, YARN-2186-trunk-v2.patch, YARN-2186-trunk-v3.patch, YARN-2186-trunk-v4.patch, YARN-2186-trunk-v5.patch, YARN-2186-trunk-v6.patch Implement the node manager uploader service for the cache manager. This service is responsible for communicating with the node manager when it uploads resources to the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2383) Add ability to renew ClientToAMToken
[ https://issues.apache.org/jira/browse/YARN-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191121#comment-14191121 ] Anubhav Dhoot commented on YARN-2383: - Hi [~xgong] can you please rebase the patch Add ability to renew ClientToAMToken Key: YARN-2383 URL: https://issues.apache.org/jira/browse/YARN-2383 Project: Hadoop YARN Issue Type: Bug Components: applications, resourcemanager Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-2383.preview.1.patch, YARN-2383.preview.2.patch, YARN-2383.preview.3.1.patch, YARN-2383.preview.3.2.patch, YARN-2383.preview.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2383) Add ability to renew ClientToAMToken
[ https://issues.apache.org/jira/browse/YARN-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191131#comment-14191131 ] Hadoop QA commented on YARN-2383: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12662875/YARN-2383.preview.3.2.patch against trunk revision a9331fe. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5649//console This message is automatically generated. Add ability to renew ClientToAMToken Key: YARN-2383 URL: https://issues.apache.org/jira/browse/YARN-2383 Project: Hadoop YARN Issue Type: Bug Components: applications, resourcemanager Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-2383.preview.1.patch, YARN-2383.preview.2.patch, YARN-2383.preview.3.1.patch, YARN-2383.preview.3.2.patch, YARN-2383.preview.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2186) Node Manager uploader service for cache manager
[ https://issues.apache.org/jira/browse/YARN-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191135#comment-14191135 ] Hadoop QA commented on YARN-2186: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678363/YARN-2186-trunk-v6.patch against trunk revision 348bfb7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5648//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5648//console This message is automatically generated. 
Node Manager uploader service for cache manager --- Key: YARN-2186 URL: https://issues.apache.org/jira/browse/YARN-2186 Project: Hadoop YARN Issue Type: Sub-task Reporter: Chris Trezzo Assignee: Chris Trezzo Attachments: YARN-2186-trunk-v1.patch, YARN-2186-trunk-v2.patch, YARN-2186-trunk-v3.patch, YARN-2186-trunk-v4.patch, YARN-2186-trunk-v5.patch, YARN-2186-trunk-v6.patch Implement the node manager uploader service for the cache manager. This service is responsible for communicating with the node manager when it uploads resources to the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2779) SystemMetricsPublisher can use Kerberos directly instead of timeline DT
[ https://issues.apache.org/jira/browse/YARN-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191144#comment-14191144 ] Hadoop QA commented on YARN-2779: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678351/YARN-2779.1.patch against trunk revision 81fe8e4. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5646//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5646//console This message is automatically generated. 
SystemMetricsPublisher can use Kerberos directly instead of timeline DT --- Key: YARN-2779 URL: https://issues.apache.org/jira/browse/YARN-2779 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.6.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical Attachments: YARN-2779.1.patch SystemMetricsPublisher is going to grab a timeline DT in secure mode as well. The timeline DT will expire after 24h. No DT renewer will handle the renewal work for SystemMetricsPublisher; it has to handle this itself. In addition, SystemMetricsPublisher should also cancel the timeline DT when it is stopped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager
[ https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191173#comment-14191173 ] zhihai xu commented on YARN-2753: - Hi [~xgong], since [~leftnoteasy] agreed to my patch (YARN-2756 is discussed separately) and the patch also passed Hadoop QA, could you review/commit the patch? Thanks, zhihai Fix potential issues and code clean up for *NodeLabelsManager - Key: YARN-2753 URL: https://issues.apache.org/jira/browse/YARN-2753 Project: Hadoop YARN Issue Type: Sub-task Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-2753.000.patch, YARN-2753.001.patch, YARN-2753.002.patch, YARN-2753.003.patch, YARN-2753.004.patch, YARN-2753.005.patch Issues include: * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in labelCollections if the key already exists; otherwise the Label.resource will be changed (reset). * potential NPE (NullPointerException) in checkRemoveLabelsFromNode of CommonNodeLabelsManager. ** because when a Node is created, Node.labels can be null. ** In this case, nm.labels may be null, so we need to check that originalLabels is not null before using it (originalLabels.containsAll). * addToCluserNodeLabels should be protected by the writeLock in RMNodeLabelsManager.java, because we should protect labelCollections in RMNodeLabelsManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
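The three fixes listed in the YARN-2753 description can be sketched in plain Java. Note this is a simplified stand-in, not the actual *NodeLabelsManager code: the class, method, and field names below are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified sketch of the fixes described in YARN-2753. Names are
// illustrative, not the actual Hadoop sources.
public class LabelManagerSketch {
    private final Map<String, Set<String>> labelCollections = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Fixes 1 and 3: guard the shared map with the write lock, and do not
    // overwrite an existing entry (overwriting would reset its associated
    // state, analogous to resetting Label.resource).
    public void addToClusterNodeLabels(String label, Set<String> value) {
        lock.writeLock().lock();
        try {
            labelCollections.putIfAbsent(label, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public Set<String> getLabels(String label) {
        lock.readLock().lock();
        try {
            return labelCollections.get(label);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Fix 2: a node's label set may be null, so null-check before calling
    // containsAll, avoiding the NPE described in the issue.
    public boolean checkRemoveLabelsFromNode(Set<String> originalLabels,
                                             Set<String> toRemove) {
        return originalLabels != null && originalLabels.containsAll(toRemove);
    }
}
```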
[jira] [Commented] (YARN-2770) Timeline delegation tokens need to be automatically renewed by the RM
[ https://issues.apache.org/jira/browse/YARN-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191174#comment-14191174 ] Zhijie Shen commented on YARN-2770: --- bq. SecurityUtil#getServerPrincipal may be useful. bq. Let's make sure the renewer name mangling imitates MR JobClient, it is easy to get this wrong. I think we should use HadoopKerberosName#getShortName (AbstractDelegationTokenSecretManager uses it as well) and RM_Principal (which should be present in secure mode) to get the RM daemon user; HadoopKerberosName will automatically handle auth_to_local if we need to map the auth name to the real operating-system name. bq. It'll be great to also test separately that renewal can work fine when https is enabled. I've verified it will work with SSL. BTW, SystemMetricsPublisher works fine with SSL too. To make it work, we must make sure the RM has seen the proper configuration for SSL and the truststore. bq. the same DelegationTokenAuthenticatedURL is instantiated multiple times, is it possible to store it as a variable ? It's probably okay to reuse DelegationTokenAuthenticatedURL. However, I'd like to construct one for each request to isolate any possible resource sharing and avoid introducing potential bugs. The Jersey client also constructs a new one for each request. It won't be a big overhead, as it doesn't construct anything deep. bq. similarly for the timeline client instantiation. I'm not sure, but I guess you're talking about TokenRenewer. I'm following the way RMDelegationTokenIdentifier does it. If we don't construct the client per call, we need to make it a service, with separate stages for init/start and stop. That may complicate the change. Please let me know if you want this change. bq. We may replace the token after renew is really succeeded.
According to the design of DelegationTokenAuthenticatedURL, I need to put the DT into the current DelegationTokenAuthenticatedURL.Token, which is fetched internally to do the corresponding operations. So to renew a given DT, I need to set the DT there. However, if it is already cached there, the client can skip the set step. Otherwise, I've addressed the remaining comments. Thanks Jian and Vinod! Timeline delegation tokens need to be automatically renewed by the RM - Key: YARN-2770 URL: https://issues.apache.org/jira/browse/YARN-2770 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: 2.5.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical Attachments: YARN-2770.1.patch YarnClient will automatically grab a timeline DT for the application and pass it to the app AM. Currently the timeline DT renew is still a dummy. If an app runs for more than 24h (the default DT expiry time), the app AM is no longer able to use the expired DT to communicate with the timeline server. Since the RM caches the credentials of each app and renews the DTs for the running app, we should provide renew hooks similar to what the HDFS DT has for the RM, and set the RM user as the renewer when grabbing the timeline DT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
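The review point "we may replace the token after renew is really succeeded" can be sketched generically in plain Java. This is a hypothetical stand-in, not Hadoop's DelegationTokenAuthenticatedURL API: the Token interface and renew call are illustrative only. The idea is that the cached token is only swapped once the renew RPC has actually returned, so a failed renew leaves the previously valid token in place.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

// Generic sketch of "replace the cached token only after renew succeeds".
// The Token type and renew() call are illustrative, not Hadoop's actual API.
public class RenewSketch {
    public interface Token {
        Token renew() throws IOException; // returns the renewed token
    }

    private final AtomicReference<Token> current = new AtomicReference<>();

    public RenewSketch(Token initial) {
        current.set(initial);
    }

    public Token getToken() {
        return current.get();
    }

    // Only swap in the renewed token once renew() has actually succeeded;
    // if renew() throws, the previously cached token stays in place.
    public void renewCurrent() throws IOException {
        Token old = current.get();
        Token renewed = old.renew(); // may throw; we only swap on success
        current.compareAndSet(old, renewed);
    }
}
```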
[jira] [Updated] (YARN-2770) Timeline delegation tokens need to be automatically renewed by the RM
[ https://issues.apache.org/jira/browse/YARN-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2770: -- Attachment: YARN-2770.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2779) SystemMetricsPublisher can use Kerberos directly instead of timeline DT
[ https://issues.apache.org/jira/browse/YARN-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191177#comment-14191177 ] Zhijie Shen commented on YARN-2779: --- I've verified it in a secure cluster, and SystemMetricsPublisher works fine with kerberos directly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2770) Timeline delegation tokens need to be automatically renewed by the RM
[ https://issues.apache.org/jira/browse/YARN-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191224#comment-14191224 ] Hadoop QA commented on YARN-2770: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678376/YARN-2770.2.patch against trunk revision e1f7d65. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5650//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5650//console This message is automatically generated. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2782) TestResourceTrackerOnHA fails on trunk
Zhijie Shen created YARN-2782: - Summary: TestResourceTrackerOnHA fails on trunk Key: YARN-2782 URL: https://issues.apache.org/jira/browse/YARN-2782 Project: Hadoop YARN Issue Type: Test Reporter: Zhijie Shen {code} Running org.apache.hadoop.yarn.client.TestResourceTrackerOnHA Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 12.684 sec FAILURE! - in org.apache.hadoop.yarn.client.TestResourceTrackerOnHA testResourceTrackerOnHA(org.apache.hadoop.yarn.client.TestResourceTrackerOnHA) Time elapsed: 12.518 sec ERROR! java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 to asf905.gq1.ygridcore.net:28031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705) at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521) at org.apache.hadoop.ipc.Client.call(Client.java:1438) at org.apache.hadoop.ipc.Client.call(Client.java:1399) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) at com.sun.proxy.$Proxy87.registerNodeManager(Unknown Source) at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:68) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101) at com.sun.proxy.$Proxy88.registerNodeManager(Unknown Source) at org.apache.hadoop.yarn.client.TestResourceTrackerOnHA.testResourceTrackerOnHA(TestResourceTrackerOnHA.java:64) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2783) TestApplicationClientProtocolOnHA
Zhijie Shen created YARN-2783: - Summary: TestApplicationClientProtocolOnHA Key: YARN-2783 URL: https://issues.apache.org/jira/browse/YARN-2783 Project: Hadoop YARN Issue Type: Test Reporter: Zhijie Shen {code} Running org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA Tests run: 17, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 147.881 sec FAILURE! - in org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA testGetContainersOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA) Time elapsed: 12.928 sec ERROR! java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 to asf905.gq1.ygridcore.net:28032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705) at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521) at org.apache.hadoop.ipc.Client.call(Client.java:1438) at org.apache.hadoop.ipc.Client.call(Client.java:1399) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) at com.sun.proxy.$Proxy17.getContainers(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getContainers(ApplicationClientProtocolPBClientImpl.java:400) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101) at com.sun.proxy.$Proxy18.getContainers(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getContainers(YarnClientImpl.java:639) at org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetContainersOnHA(TestApplicationClientProtocolOnHA.java:154) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2770) Timeline delegation tokens need to be automatically renewed by the RM
[ https://issues.apache.org/jira/browse/YARN-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191308#comment-14191308 ] Zhijie Shen commented on YARN-2770: --- The two test failures are unrelated and happen on other JIRAs, too. I filed two tickets for them: YARN-2782 and YARN-2783. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2634) Test failure for TestClientRMTokens
[ https://issues.apache.org/jira/browse/YARN-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191337#comment-14191337 ] Yang Yifan commented on YARN-2634: -- Facing the same problem. Test failure for TestClientRMTokens --- Key: YARN-2634 URL: https://issues.apache.org/jira/browse/YARN-2634 Project: Hadoop YARN Issue Type: Test Reporter: Junping Du Assignee: Jian He Priority: Blocker The test fails as below: {noformat} --- Test set: org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens --- Tests run: 6, Failures: 3, Errors: 2, Skipped: 0, Time elapsed: 60.184 sec FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens testShortCircuitRenewCancelDifferentHostSamePort(org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens) Time elapsed: 22.693 sec FAILURE! java.lang.AssertionError: expected:getProxy but was:null at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens.checkShortCircuitRenewCancel(TestClientRMTokens.java:319) at org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens.testShortCircuitRenewCancelDifferentHostSamePort(TestClientRMTokens.java:272) testShortCircuitRenewCancelDifferentHostDifferentPort(org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens) Time elapsed: 20.087 sec FAILURE! 
java.lang.AssertionError: expected:getProxy but was:null at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens.checkShortCircuitRenewCancel(TestClientRMTokens.java:319) at org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens.testShortCircuitRenewCancelDifferentHostDifferentPort(TestClientRMTokens.java:283) testShortCircuitRenewCancel(org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens) Time elapsed: 0.031 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier$Renewer.getRmClient(RMDelegationTokenIdentifier.java:148) at org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier$Renewer.renew(RMDelegationTokenIdentifier.java:101) at org.apache.hadoop.security.token.Token.renew(Token.java:377) at org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens.checkShortCircuitRenewCancel(TestClientRMTokens.java:309) at org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens.testShortCircuitRenewCancel(TestClientRMTokens.java:241) testShortCircuitRenewCancelSameHostDifferentPort(org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens) Time elapsed: 0.061 sec FAILURE! 
java.lang.AssertionError: expected:getProxy but was:null at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens.checkShortCircuitRenewCancel(TestClientRMTokens.java:319) at org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens.testShortCircuitRenewCancelSameHostDifferentPort(TestClientRMTokens.java:261) testShortCircuitRenewCancelWildcardAddress(org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens) Time elapsed: 0.07 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.net.NetUtils.isLocalAddress(NetUtils.java:684) at org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier$Renewer.getRmClient(RMDelegationTokenIdentifier.java:149) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2775) There is no close method in NMWebServices#getLogs()
[ https://issues.apache.org/jira/browse/YARN-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191411#comment-14191411 ] Ravi Prakash commented on YARN-2775: Hi skrho! Thanks for the contribution. It seems the JAX-RS spec doesn't say anything about either closing or not closing the OutputStream. Have you seen these objects accumulate on the heap? Are you sure they are never reaped? There is no close method in NMWebServices#getLogs() --- Key: YARN-2775 URL: https://issues.apache.org/jira/browse/YARN-2775 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: skrho Priority: Minor Attachments: YARN-2775_001.patch If the getLogs method is called, fileInputStream objects accumulate in memory, because the fileInputStream object is never closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
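The leak pattern discussed in YARN-2775 is conventionally fixed by closing the input stream deterministically. A minimal try-with-resources sketch in plain Java follows; the class and method names are illustrative, not the actual NMWebServices code:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of streaming a log file while guaranteeing the input stream is
// closed even if the copy throws. Names are illustrative, not NMWebServices.
public class LogStreamer {
    public static long streamLog(Path logFile, OutputStream out) throws IOException {
        // try-with-resources closes the input stream on every exit path,
        // so the object does not linger on the heap waiting for finalization.
        try (InputStream in = Files.newInputStream(logFile)) {
            byte[] buf = new byte[8192];
            long total = 0;
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
            }
            return total;
        }
    }
}
```

Whether the JAX-RS-provided OutputStream should also be closed is the open question in the comment above; the try-with-resources here only covers the FileInputStream side, which is unambiguously owned by the method.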
[jira] [Commented] (YARN-2698) Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService
[ https://issues.apache.org/jira/browse/YARN-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191417#comment-14191417 ] Vinod Kumar Vavilapalli commented on YARN-2698: --- This looks good, +1. Checking this in. Move getClusterNodeLabels and getNodeToLabels to YarnClient instead of AdminService --- Key: YARN-2698 URL: https://issues.apache.org/jira/browse/YARN-2698 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Attachments: YARN-2698-20141028-1.patch, YARN-2698-20141028-2.patch, YARN-2698-20141028-3.patch, YARN-2698-20141029-1.patch, YARN-2698-20141029-2.patch, YARN-2698-20141030-1.patch YARN AdminService should have write APIs only; the read APIs should be located in the RM ClientService. These include: 1) getClusterNodeLabels 2) getNodeToLabels 3) getNodeReport should contain labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)