[jira] [Commented] (YARN-90) NodeManager should identify failed disks becoming good back again
[ https://issues.apache.org/jira/browse/YARN-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13781905#comment-13781905 ] Ravi Prakash commented on YARN-90: -- Hi nijel! For testing I would like to configure a USB drive to be one of the local + log dirs. We can then simulate failure by unplugging the USB drive. When we plug it back in, the NM should start using the recovered disk. Did you experience this behaviour yourself? I'll also try to test this as soon as I get some cycles. NodeManager should identify failed disks becoming good back again - Key: YARN-90 URL: https://issues.apache.org/jira/browse/YARN-90 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Ravi Gummadi Attachments: YARN-90.1.patch, YARN-90.patch MAPREDUCE-3121 makes NodeManager identify disk failures. But once a disk goes down, it is marked as failed forever. To reuse that disk (after it becomes good), the NodeManager needs a restart. This JIRA is to improve NodeManager to reuse good disks (which may have been bad some time back). -- This message was sent by Atlassian JIRA (v6.1#6144)
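A minimal sketch of the kind of periodic re-check the NM could run over previously failed local/log dirs; the class and method names are hypothetical, not the actual NodeManager code. A dir that is again present, readable, writable and executable is moved back to the good list:
{code}
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
public class FailedDirRechecker {
  private final List<String> goodDirs = new ArrayList<>();
  private final List<String> failedDirs = new ArrayList<>();

  public FailedDirRechecker(List<String> configuredDirs) {
    goodDirs.addAll(configuredDirs);
  }

  // Move a dir from the good list to the failed list when a check fails.
  public synchronized void markFailed(String dir) {
    if (goodDirs.remove(dir)) {
      failedDirs.add(dir);
    }
  }

  // Re-test every failed dir; restore the ones that pass a simple probe.
  public synchronized void recheckFailedDirs() {
    List<String> recovered = new ArrayList<>();
    for (String dir : failedDirs) {
      File f = new File(dir);
      if (f.isDirectory() && f.canRead() && f.canWrite() && f.canExecute()) {
        recovered.add(dir);
      }
    }
    failedDirs.removeAll(recovered);
    goodDirs.addAll(recovered);
  }

  public synchronized List<String> getGoodDirs() {
    return new ArrayList<>(goodDirs);
  }
}
{code}
Calling recheckFailedDirs() on the NM's existing health-check interval would let a recovered disk (e.g. the re-plugged USB drive above) rejoin the good list without a restart.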
[jira] [Updated] (YARN-1232) Configuration support for RM HA
[ https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1232: --- Attachment: yarn-1232-5.patch Discussed this with Bikas and Alejandro offline. The consensus was to have all rpc-addresses take the form rpc-address-conf.node-id. Uploading a patch that does that. Configuration support for RM HA --- Key: YARN-1232 URL: https://issues.apache.org/jira/browse/YARN-1232 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, yarn-1232-4.patch, yarn-1232-5.patch We should augment the configuration to allow users specify two RMs and the individual RPC addresses for them. This blocks ConfiguredFailoverProxyProvider. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1252) Secure RM fails to start up in secure HA setup with Renewal request for unknown token exception
Arpit Gupta created YARN-1252: - Summary: Secure RM fails to start up in secure HA setup with Renewal request for unknown token exception Key: YARN-1252 URL: https://issues.apache.org/jira/browse/YARN-1252 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.1-beta Reporter: Arpit Gupta {code} 2013-09-26 08:15:20,507 INFO ipc.Server (Server.java:run(861)) - IPC Server Responder: starting 2013-09-26 08:15:20,521 ERROR security.UserGroupInformation (UserGroupInformation.java:doAs(1486)) - PriviledgedActionException as:rm/host@realm (auth:KERBEROS) cause:org.apache.hadoop.security.token.SecretManager$InvalidToken: Renewal request for unknown token at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:388) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:5934) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:453) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:851) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59650) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1483) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042 {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1232) Configuration support for RM HA
[ https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782032#comment-13782032 ] Hadoop QA commented on YARN-1232: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12605931/yarn-1232-5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2040//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2040//console This message is automatically generated. Configuration support for RM HA --- Key: YARN-1232 URL: https://issues.apache.org/jira/browse/YARN-1232 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, yarn-1232-4.patch, yarn-1232-5.patch We should augment the configuration to allow users specify two RMs and the individual RPC addresses for them. This blocks ConfiguredFailoverProxyProvider. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
Alejandro Abdelnur created YARN-1253: Summary: Changes to LinuxContainerExecutor to use cgroups in unsecure mode Key: YARN-1253 URL: https://issues.apache.org/jira/browse/YARN-1253 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Priority: Blocker Fix For: 2.1.1-beta When using cgroups we require LCE to be configured in the cluster to start containers, and LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in an un-secure setup this presents a couple of issues: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users In particular, the second issue is not desirable, as a user could get access to the ssh keys of other users in the nodes or, if there are NFS mounts, get to other users' data outside of the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782038#comment-13782038 ] Alejandro Abdelnur commented on YARN-1253: -- When using {{LinuxContainerExecutor.java}} in unsecure mode, we should have a {{yarn.nodemanager.linux-container-executor.unsecure-mode.local-user}} property (with {{yarnuser}} as the default) that defines the local user LCE should use to start containers in unsecure mode. The {{container-executor.c}} should receive an extra parameter with the runAsUser, differentiating it from the user (which is used to create the usercache/$USER/ directory). The {{container-executor.c}} code is already prepared to handle this differentiation; the changes are minimal, just passing the extra parameter and wiring it in the right places. The {{yarnuser}} should be provisioned as a system user on the nodes and added to the whitelisted system users in the {{container-executor.cfg}} configuration, YARN-1137. Changes to LinuxContainerExecutor to use cgroups in unsecure mode - Key: YARN-1253 URL: https://issues.apache.org/jira/browse/YARN-1253 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Priority: Blocker Fix For: 2.1.1-beta When using cgroups we require LCE to be configured in the cluster to start containers, and LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in an un-secure setup this presents a couple of issues: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users In particular, the second issue is not desirable, as a user could get access to the ssh keys of other users in the nodes or, if there are NFS mounts, get to other users' data outside of the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
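For illustration, a sketch of how the proposed property could be consumed on the NM side; the property name and the {{yarnuser}} default come from the comment above, while the surrounding class and method are hypothetical, not the real LinuxContainerExecutor code:
{code}
import org.apache.hadoop.conf.Configuration;

// Illustrative helper, not the actual LCE implementation.
public class UnsecureModeRunAsUser {

  static final String NM_LCE_UNSECURE_LOCAL_USER =
      "yarn.nodemanager.linux-container-executor.unsecure-mode.local-user";
  static final String DEFAULT_NM_LCE_UNSECURE_LOCAL_USER = "yarnuser";

  // In unsecure mode every container runs as the configured local user,
  // while the submitting user is still used for the usercache/$USER layout.
  static String getRunAsUser(Configuration conf, boolean securityEnabled,
      String submittingUser) {
    if (securityEnabled) {
      return submittingUser;
    }
    return conf.get(NM_LCE_UNSECURE_LOCAL_USER,
        DEFAULT_NM_LCE_UNSECURE_LOCAL_USER);
  }
}
{code}
The runAsUser returned here would then be passed to container-executor as the extra parameter, separate from the submitting user.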
[jira] [Updated] (YARN-1232) Configuration to support multiple RMs
[ https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1232: --- Description: We should augment the configuration to allow users specify two RMs and the individual RPC addresses for them. (was: We should augment the configuration to allow users specify two RMs and the individual RPC addresses for them. This blocks ConfiguredFailoverProxyProvider.) Configuration to support multiple RMs - Key: YARN-1232 URL: https://issues.apache.org/jira/browse/YARN-1232 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, yarn-1232-4.patch, yarn-1232-5.patch We should augment the configuration to allow users specify two RMs and the individual RPC addresses for them. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1232) Configuration to support multiple RMs
[ https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1232: --- Summary: Configuration to support multiple RMs (was: Configuration support for RM HA) Configuration to support multiple RMs - Key: YARN-1232 URL: https://issues.apache.org/jira/browse/YARN-1232 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, yarn-1232-4.patch, yarn-1232-5.patch We should augment the configuration to allow users specify two RMs and the individual RPC addresses for them. This blocks ConfiguredFailoverProxyProvider. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1232) Configuration to support multiple RMs
[ https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782047#comment-13782047 ] Karthik Kambatla commented on YARN-1232: Summarizing the JIRA to make it simpler for folks to follow. As the description states, the focus of this JIRA is to add configs to allow specifying multiple RMs and their RPC addresses. The approach is to add the notion of RM-ids through {{yarn.resourcemanager.ha.nodes}}, and add a node-id suffix to each RPC address config. When starting the cluster, the server-side config should explicitly set {{yarn.resourcemanager.ha.node.id}} to specify the node-id of the RM being started. I believe the patch is ready for review. Configuration to support multiple RMs - Key: YARN-1232 URL: https://issues.apache.org/jira/browse/YARN-1232 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, yarn-1232-4.patch, yarn-1232-5.patch We should augment the configuration to allow users specify two RMs and the individual RPC addresses for them. -- This message was sent by Atlassian JIRA (v6.1#6144)
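For illustration, the shape such a configuration could take with two RMs; the key names below follow the comment above (a list of RM node-ids plus node-id-suffixed RPC address keys), but they are assumptions and the exact names in the committed patch may differ:
{code}
import org.apache.hadoop.conf.Configuration;

// Illustrative only; key names are assumptions based on the comment above.
public class RmHaConfigExample {
  public static Configuration twoRmExample() {
    Configuration conf = new Configuration(false);
    // Logical ids of the two RMs.
    conf.set("yarn.resourcemanager.ha.nodes", "rm1,rm2");
    // Each RPC address config takes the form <rpc-address-conf>.<node-id>.
    conf.set("yarn.resourcemanager.address.rm1", "rm1.example.com:8032");
    conf.set("yarn.resourcemanager.address.rm2", "rm2.example.com:8032");
    conf.set("yarn.resourcemanager.scheduler.address.rm1", "rm1.example.com:8030");
    conf.set("yarn.resourcemanager.scheduler.address.rm2", "rm2.example.com:8030");
    // On the server side each RM also declares which node-id it is.
    conf.set("yarn.resourcemanager.ha.node.id", "rm1");
    return conf;
  }
}
{code}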
[jira] [Commented] (YARN-1215) Yarn URL should include userinfo
[ https://issues.apache.org/jira/browse/YARN-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782066#comment-13782066 ] Chuan Liu commented on YARN-1215: - I have following test failures in my full test run. None of them seems a regression, i.e. I have the same failures with or without the patch. Yarn: {noformat} TestRMDelegationTokens.testRMDTMasterKeyStateOnRollingMasterKey:114 null TestFairScheduler.testSimpleFairShareCalculation:385 expected:3414 but was:0 TestDiskFailures.testLocalDirsFailures:99-testDirsFailures:179-verifyDisksHealth:247 » NoSuchElement TestContainerManagerSecurity.testContainerManager:113-testNMTokens:222 » IllegalArgument TestContainerManagerSecurity.testContainerManager:113-testNMTokens:222 » IllegalArgument TestNMClient.testNMClientNoCleanupOnStop:199-allocateContainers:233 » IndexOutOfBounds {noformat} Mapred: {noformat} TestFetchFailure.testFetchFailureMultipleReduces:332 expected:SUCCEEDED but was:SCHEDULED TestMRApp.testUpdatedNodes:258 Expecting 2 more completion events for killed expected:4 but was:3 TestCommitterEventHandler.testBasic:263 null TestMiniMRClientCluster.testRestart:146 Address before restart: chuanliu101:0 is different from new address: chuanliu101:53368 expected:chuanliu101:[0] but was:chuanliu101:[53368] TestClusterMRNotificationNotificationTestCase.testMR:163 expected:2 but was:0 TestJobListCache.testAddExisting:39 » test timed out after 1000 milliseconds TestLocalMRNotificationNotificationTestCase.testMR:178 » IO Job cleanup didn'... TestMRJobsWithHistoryService.testJobHistoryData:153 » IO java.net.ConnectExcep... {noformat} Yarn URL should include userinfo Key: YARN-1215 URL: https://issues.apache.org/jira/browse/YARN-1215 Project: Hadoop YARN Issue Type: Bug Components: api Affects Versions: 3.0.0 Reporter: Chuan Liu Assignee: Chuan Liu Attachments: YARN-1215-trunk.2.patch, YARN-1215-trunk.patch In the {{org.apache.hadoop.yarn.api.records.URL}} class, we don't have an userinfo as part of the URL. When converting a {{java.net.URI}} object into the YARN URL object in {{ConverterUtils.getYarnUrlFromURI()}} method, we will set uri host as the url host. If the uri has a userinfo part, the userinfo is discarded. This will lead to information loss if the original uri has the userinfo, e.g. foo://username:passw...@example.com will be converted to foo://example.com and username/password information is lost during the conversion. -- This message was sent by Atlassian JIRA (v6.1#6144)
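To make the information loss concrete, a small self-contained demonstration using plain java.net.URI (not the actual ConverterUtils code): rebuilding a URI from only scheme, host, port and path drops the userinfo component, exactly as described above. The credentials in the example URI are made up for illustration:
{code}
import java.net.URI;

public class UserInfoLossDemo {
  public static void main(String[] args) throws Exception {
    // Hypothetical URI with a userinfo part.
    URI original = new URI("foo://username:secret@example.com/some/path");
    // Conversion that copies only scheme, host, port and path, as a URL
    // record without a userinfo field would do.
    URI converted = new URI(original.getScheme(), null /* userinfo dropped */,
        original.getHost(), original.getPort(), original.getPath(), null, null);
    System.out.println("original : " + original);   // foo://username:secret@example.com/some/path
    System.out.println("converted: " + converted);  // foo://example.com/some/path
  }
}
{code}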
[jira] [Commented] (YARN-1111) NM containerlogs servlet can't handle logs of more than a GB
[ https://issues.apache.org/jira/browse/YARN-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782078#comment-13782078 ] Steve Loughran commented on YARN-1111: -- investigate how the logs get copied back from YARN containers to HDFS NM containerlogs servlet can't handle logs of more than a GB Key: YARN-1111 URL: https://issues.apache.org/jira/browse/YARN-1111 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.1.0-beta Environment: Long-lived service generating lots of log data from HBase running in debug level Reporter: Steve Loughran Priority: Minor If a container is set up to log stdout to a file, the container log servlet will list the file {code} err.txt : Total file length is 551 bytes. out.txt : Total file length is 1572099246 bytes. {code} If you actually click on out.txt then the tail logic takes a *very* long time to react. There is also the question of what will happen if the log fills up that volume -- This message was sent by Atlassian JIRA (v6.1#6144)
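A minimal sketch of a tail-only read that avoids streaming through the whole multi-GB file before showing the last lines: seek close to the end and read only the final N bytes. Plain JDK I/O, illustrative rather than the servlet's actual implementation:
{code}
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
public class LogTail {
  // Return at most maxBytes from the end of the file.
  public static String tail(String path, int maxBytes) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
      long len = raf.length();
      long start = Math.max(0, len - maxBytes);
      raf.seek(start);
      byte[] buf = new byte[(int) (len - start)];
      raf.readFully(buf);
      return new String(buf, StandardCharsets.UTF_8);
    }
  }
}
{code}
With a seek like this, the amount of data read is bounded by the requested tail size regardless of the total log length.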
[jira] [Commented] (YARN-1111) NM containerlogs servlet can't handle logs of more than a GB
[ https://issues.apache.org/jira/browse/YARN-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782080#comment-13782080 ] Steve Loughran commented on YARN-1111: -- ignore that last comment, note to myself NM containerlogs servlet can't handle logs of more than a GB Key: YARN-1111 URL: https://issues.apache.org/jira/browse/YARN-1111 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.1.0-beta Environment: Long-lived service generating lots of log data from HBase running in debug level Reporter: Steve Loughran Priority: Minor If a container is set up to log stdout to a file, the container log servlet will list the file {code} err.txt : Total file length is 551 bytes. out.txt : Total file length is 1572099246 bytes. {code} If you actually click on out.txt then the tail logic takes a *very* long time to react. There is also the question of what will happen if the log fills up that volume -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-1221: -- Attachment: YARN1221_v6.patch With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely - Key: YARN-1221 URL: https://issues.apache.org/jira/browse/YARN-1221 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1232) Configuration to support multiple RMs
[ https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782241#comment-13782241 ] Bikas Saha commented on YARN-1232: -- We should probably de-link HA from the RM id. The RM id is a logical name for the RM that is currently used to separate config, translate tokens, etc. HA utilizes this logical-name concept to reference the RMs. Configuration to support multiple RMs - Key: YARN-1232 URL: https://issues.apache.org/jira/browse/YARN-1232 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1232-1.patch, yarn-1232-2.patch, yarn-1232-3.patch, yarn-1232-4.patch, yarn-1232-5.patch We should augment the configuration to allow users specify two RMs and the individual RPC addresses for them. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1254) NM is polluting container's credentials
Vinod Kumar Vavilapalli created YARN-1254: - Summary: NM is polluting container's credentials Key: YARN-1254 URL: https://issues.apache.org/jira/browse/YARN-1254 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Before launching the container, NM is using the same credential object and so is polluting what container should see. We should fix this. -- This message was sent by Atlassian JIRA (v6.1#6144)
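A sketch of the fix direction implied by the description: operate on a copy of the container's credentials so NM-internal tokens never leak into what the container itself sees. The helper below is hypothetical, not the NM code; it only assumes the {{Credentials}} copy constructor and {{addToken}}:
{code}
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

// Illustrative helper, not the actual NodeManager code.
public class CredentialsIsolation {
  static Credentials nmPrivateView(Credentials containerCredentials,
      Text nmTokenAlias, Token<? extends TokenIdentifier> nmInternalToken) {
    // Copy instead of mutating the container's own Credentials object,
    // so the NM-only token stays out of the container's view.
    Credentials copy = new Credentials(containerCredentials);
    copy.addToken(nmTokenAlias, nmInternalToken);
    return copy;
  }
}
{code}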
[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782250#comment-13782250 ] Hadoop QA commented on YARN-1221: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12605964/YARN1221_v6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2041//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2041//console This message is automatically generated. With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely - Key: YARN-1221 URL: https://issues.apache.org/jira/browse/YARN-1221 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1241: - Attachment: YARN-1241.patch In Fair Scheduler maxRunningApps does not work for non-leaf queues -- Key: YARN-1241 URL: https://issues.apache.org/jira/browse/YARN-1241 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1241.patch Setting the maxRunningApps property on a parent queue should make it that the sum of apps in all subqueues can't exceed it -- This message was sent by Atlassian JIRA (v6.1#6144)
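For illustration, the semantics the description asks for, kept deliberately independent of the real FairScheduler classes (all names below are hypothetical): an app only becomes runnable if every ancestor queue, not just its leaf, is below its own maxRunningApps limit:
{code}
import java.util.ArrayList;
import java.util.List;

// Illustrative model of queue hierarchy limits, not FairScheduler code.
class QueueNode {
  final String name;
  final QueueNode parent;
  final List<QueueNode> children = new ArrayList<>();
  int maxRunningApps = Integer.MAX_VALUE;
  int runningApps = 0;

  QueueNode(String name, QueueNode parent) {
    this.name = name;
    this.parent = parent;
    if (parent != null) {
      parent.children.add(this);
    }
  }

  // True only if this queue and all of its ancestors have head-room.
  boolean canRunMoreApps() {
    for (QueueNode q = this; q != null; q = q.parent) {
      if (q.runningApps >= q.maxRunningApps) {
        return false;
      }
    }
    return true;
  }

  // Count a newly started app against the leaf and every ancestor.
  void appStarted() {
    for (QueueNode q = this; q != null; q = q.parent) {
      q.runningApps++;
    }
  }
}
{code}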
[jira] [Created] (YARN-1255) RM fails to start up with Failed to load/recover state error in a HA setup
Arpit Gupta created YARN-1255: - Summary: RM fails to start up with Failed to load/recover state error in a HA setup Key: YARN-1255 URL: https://issues.apache.org/jira/browse/YARN-1255 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.1-beta Reporter: Arpit Gupta {code} 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:parseQueue(408)) - Initialized queue: default: capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:parseQueue(408)) - Initialized queue: root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:initializeQueues(306)) - Initialized root queue root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:reinitialize(270)) - Initialized CapacityScheduler with calculator=class org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, minimumAllocation=memory:1024, vCores:1, maximumAllocation=memory:8192, vCores:32 2013-09-30 09:12:09,240 INFO event.AsyncDispatcher (AsyncDispatcher.java:register(157)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.RMAppManagerEventType for class org.apache.hadoop.yarn.server.resourcemanager.RMAppManager 2013-09-30 09:12:09,250 INFO event.AsyncDispatcher (AsyncDispatcher.java:register(157)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncherEventType for class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher 2013-09-30 09:12:09,252 INFO resourcemanager.RMNMInfo (RMNMInfo.java:init(63)) - Registered RMNMInfo MBean 2013-09-30 09:12:09,253 INFO util.HostsFileReader (HostsFileReader.java:refresh(84)) - Refreshing hosts (include/exclude) list 2013-09-30 09:12:09,278 INFO security.UserGroupInformation (UserGroupInformation.java:loginUserFromKeytab(843)) - Login successful for user rm/hostname@realm using keytab file /etc/security/keytabs/rm.service.keytab 2013-09-30 09:12:09,278 INFO security.RMContainerTokenSecretManager (RMContainerTokenSecretManager.java:rollMasterKey(103)) - Rolling master-key for container-tokens 2013-09-30 09:12:09,279 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:rollMasterKey(107)) - Rolling master-key for amrm-tokens 2013-09-30 09:12:09,281 INFO security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:rollMasterKey(97)) - Rolling master-key for nm-tokens 2013-09-30 09:12:10,196 INFO recovery.FileSystemRMStateStore (FileSystemRMStateStore.java:loadRMAppState(131)) - Loading application from node: application_1380531989689_0002 2013-09-30 09:12:10,217 INFO recovery.FileSystemRMStateStore (FileSystemRMStateStore.java:loadRMAppState(131)) - Loading application from node: application_1380531989689_0003 2013-09-30 09:12:10,232 INFO security.RMDelegationTokenSecretManager (RMDelegationTokenSecretManager.java:recover(181)) - recovering RMDelegationTokenSecretManager. 
2013-09-30 09:12:10,234 INFO resourcemanager.RMAppManager (RMAppManager.java:recover(329)) - Recovering 2 applications 2013-09-30 09:12:10,234 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(640)) - Failed to load/recover state java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:332) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:842) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:636) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:855) 2013-09-30 09:12:10,236 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1 2013-09-30 09:17:20,144 INFO resourcemanager.ResourceManager (StringUtils.java:startupShutdownMessage(601)) - STARTUP_MSG: {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1252) Secure RM fails to start up in secure HA setup with Renewal request for unknown token exception
[ https://issues.apache.org/jira/browse/YARN-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782263#comment-13782263 ] Jian He commented on YARN-1252: --- It could be that when the application finishes, the NN is failing over and is in SAFEMODE, so at that point the RM is not able to remove the application state (within which we store the HDFSDelegationToken) from the store. The RM goes ahead and finishes the app and adds the token to the cancel queue; when the new NN is up, the token is canceled. Then the RM shuts down. Since the token has already been removed from the HDFS tokenSecretManager, when the RM comes back it reads the application state (which it failed to remove) and tries to renew a non-existent token. Secure RM fails to start up in secure HA setup with Renewal request for unknown token exception --- Key: YARN-1252 URL: https://issues.apache.org/jira/browse/YARN-1252 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.1-beta Reporter: Arpit Gupta {code} 2013-09-26 08:15:20,507 INFO ipc.Server (Server.java:run(861)) - IPC Server Responder: starting 2013-09-26 08:15:20,521 ERROR security.UserGroupInformation (UserGroupInformation.java:doAs(1486)) - PriviledgedActionException as:rm/host@realm (auth:KERBEROS) cause:org.apache.hadoop.security.token.SecretManager$InvalidToken: Renewal request for unknown token at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:388) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:5934) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:453) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:851) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59650) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1483) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042 {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1255) RM fails to start up with Failed to load/recover state error in a HA setup
[ https://issues.apache.org/jira/browse/YARN-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782274#comment-13782274 ] Jian He commented on YARN-1255: --- RM might be killed while it's saving the app data(after the app file is created, before the data is written into the file), when RM recovers it loads an empty file and gets a NULL exception, reproduced this locally and see the same exception stack. RM fails to start up with Failed to load/recover state error in a HA setup -- Key: YARN-1255 URL: https://issues.apache.org/jira/browse/YARN-1255 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.1-beta Reporter: Arpit Gupta {code} 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:parseQueue(408)) - Initialized queue: default: capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:parseQueue(408)) - Initialized queue: root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:initializeQueues(306)) - Initialized root queue root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:reinitialize(270)) - Initialized CapacityScheduler with calculator=class org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, minimumAllocation=memory:1024, vCores:1, maximumAllocation=memory:8192, vCores:32 2013-09-30 09:12:09,240 INFO event.AsyncDispatcher (AsyncDispatcher.java:register(157)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.RMAppManagerEventType for class org.apache.hadoop.yarn.server.resourcemanager.RMAppManager 2013-09-30 09:12:09,250 INFO event.AsyncDispatcher (AsyncDispatcher.java:register(157)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncherEventType for class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher 2013-09-30 09:12:09,252 INFO resourcemanager.RMNMInfo (RMNMInfo.java:init(63)) - Registered RMNMInfo MBean 2013-09-30 09:12:09,253 INFO util.HostsFileReader (HostsFileReader.java:refresh(84)) - Refreshing hosts (include/exclude) list 2013-09-30 09:12:09,278 INFO security.UserGroupInformation (UserGroupInformation.java:loginUserFromKeytab(843)) - Login successful for user rm/hostname@realm using keytab file /etc/security/keytabs/rm.service.keytab 2013-09-30 09:12:09,278 INFO security.RMContainerTokenSecretManager (RMContainerTokenSecretManager.java:rollMasterKey(103)) - Rolling master-key for container-tokens 2013-09-30 09:12:09,279 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:rollMasterKey(107)) - Rolling master-key for amrm-tokens 2013-09-30 09:12:09,281 INFO security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:rollMasterKey(97)) - Rolling master-key for nm-tokens 2013-09-30 09:12:10,196 INFO recovery.FileSystemRMStateStore (FileSystemRMStateStore.java:loadRMAppState(131)) - Loading application from node: application_1380531989689_0002 2013-09-30 09:12:10,217 INFO recovery.FileSystemRMStateStore (FileSystemRMStateStore.java:loadRMAppState(131)) - Loading application from 
node: application_1380531989689_0003 2013-09-30 09:12:10,232 INFO security.RMDelegationTokenSecretManager (RMDelegationTokenSecretManager.java:recover(181)) - recovering RMDelegationTokenSecretManager. 2013-09-30 09:12:10,234 INFO resourcemanager.RMAppManager (RMAppManager.java:recover(329)) - Recovering 2 applications 2013-09-30 09:12:10,234 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(640)) - Failed to load/recover state java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:332) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:842) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:636) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:855) 2013-09-30 09:12:10,236 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1 2013-09-30
[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782283#comment-13782283 ] Sandy Ryza commented on YARN-1221: -- +1 With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely - Key: YARN-1221 URL: https://issues.apache.org/jira/browse/YARN-1221 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1254) NM is polluting container's credentials
[ https://issues.apache.org/jira/browse/YARN-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-1254: Attachment: YARN-1254.20131030.1.patch NM is polluting container's credentials --- Key: YARN-1254 URL: https://issues.apache.org/jira/browse/YARN-1254 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Attachments: YARN-1254.20131030.1.patch Before launching the container, NM is using the same credential object and so is polluting what container should see. We should fix this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1241: - Attachment: YARN-1241-1.patch In Fair Scheduler maxRunningApps does not work for non-leaf queues -- Key: YARN-1241 URL: https://issues.apache.org/jira/browse/YARN-1241 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1241-1.patch, YARN-1241.patch Setting the maxRunningApps property on a parent queue should make it that the sum of apps in all subqueues can't exceed it -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782319#comment-13782319 ] Sandy Ryza commented on YARN-1241: -- Rebased on trunk In Fair Scheduler maxRunningApps does not work for non-leaf queues -- Key: YARN-1241 URL: https://issues.apache.org/jira/browse/YARN-1241 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1241-1.patch, YARN-1241.patch Setting the maxRunningApps property on a parent queue should make it that the sum of apps in all subqueues can't exceed it -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1221: - Assignee: Siqi Li With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely - Key: YARN-1221 URL: https://issues.apache.org/jira/browse/YARN-1221 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Siqi Li Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782322#comment-13782322 ] Sandy Ryza commented on YARN-1221: -- I just committed this to trunk, branch-2, and branch-2.1-beta. Thanks Siqi! With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely - Key: YARN-1221 URL: https://issues.apache.org/jira/browse/YARN-1221 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Siqi Li Fix For: 2.1.2-beta Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782324#comment-13782324 ] Siqi Li commented on YARN-1221: --- you are welcome With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely - Key: YARN-1221 URL: https://issues.apache.org/jira/browse/YARN-1221 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Siqi Li Fix For: 2.1.2-beta Attachments: YARN1221_v1.patch.txt, YARN1221_v2.patch.txt, YARN1221_v3.patch.txt, YARN1221_v4.patch, YARN1221_v5.patch, YARN1221_v6.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1254) NM is polluting container's credentials
[ https://issues.apache.org/jira/browse/YARN-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782325#comment-13782325 ] Hadoop QA commented on YARN-1254: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12605983/YARN-1254.20131030.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2043//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2043//console This message is automatically generated. NM is polluting container's credentials --- Key: YARN-1254 URL: https://issues.apache.org/jira/browse/YARN-1254 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Attachments: YARN-1254.20131030.1.patch Before launching the container, NM is using the same credential object and so is polluting what container should see. We should fix this. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1256) NM silently ignores non-existent service in StartContainerRequest
Bikas Saha created YARN-1256: Summary: NM silently ignores non-existent service in StartContainerRequest Key: YARN-1256 URL: https://issues.apache.org/jira/browse/YARN-1256 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Bikas Saha Priority: Critical Fix For: 2.1.2-beta A container can set token service metadata for a service, say shuffle_service. If that service does not exist then the error is silently ignored. Later, when the next container wants to access data written to shuffle_service by the first task, it fails because the service does not have the token that was supposed to be set by the first task. -- This message was sent by Atlassian JIRA (v6.1#6144)
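A sketch of the fail-fast behaviour the report asks for: reject a container launch that references an auxiliary service the NM has not registered, instead of silently dropping its service data. The helper is hypothetical, not the real ContainerManager code:
{code}
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.Set;

// Illustrative validation helper, not NodeManager code.
public class AuxServiceCheck {
  static void validateServiceData(Map<String, ByteBuffer> requestedServiceData,
      Set<String> registeredAuxServices) {
    for (String service : requestedServiceData.keySet()) {
      if (!registeredAuxServices.contains(service)) {
        // Fail the launch instead of silently ignoring the unknown service.
        throw new IllegalArgumentException(
            "Unknown auxiliary service '" + service
                + "' referenced in StartContainerRequest");
      }
    }
  }
}
{code}
Failing at launch time surfaces the misconfiguration immediately, rather than letting the next container discover the missing token later.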
[jira] [Commented] (YARN-1255) RM fails to start up with Failed to load/recover state error in a HA setup
[ https://issues.apache.org/jira/browse/YARN-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782348#comment-13782348 ] Jason Lowe commented on YARN-1255: -- Is this a dup of YARN-1185? RM fails to start up with Failed to load/recover state error in a HA setup -- Key: YARN-1255 URL: https://issues.apache.org/jira/browse/YARN-1255 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.1-beta Reporter: Arpit Gupta {code} 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:parseQueue(408)) - Initialized queue: default: capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:parseQueue(408)) - Initialized queue: root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:initializeQueues(306)) - Initialized root queue root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:reinitialize(270)) - Initialized CapacityScheduler with calculator=class org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, minimumAllocation=memory:1024, vCores:1, maximumAllocation=memory:8192, vCores:32 2013-09-30 09:12:09,240 INFO event.AsyncDispatcher (AsyncDispatcher.java:register(157)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.RMAppManagerEventType for class org.apache.hadoop.yarn.server.resourcemanager.RMAppManager 2013-09-30 09:12:09,250 INFO event.AsyncDispatcher (AsyncDispatcher.java:register(157)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncherEventType for class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher 2013-09-30 09:12:09,252 INFO resourcemanager.RMNMInfo (RMNMInfo.java:init(63)) - Registered RMNMInfo MBean 2013-09-30 09:12:09,253 INFO util.HostsFileReader (HostsFileReader.java:refresh(84)) - Refreshing hosts (include/exclude) list 2013-09-30 09:12:09,278 INFO security.UserGroupInformation (UserGroupInformation.java:loginUserFromKeytab(843)) - Login successful for user rm/hostname@realm using keytab file /etc/security/keytabs/rm.service.keytab 2013-09-30 09:12:09,278 INFO security.RMContainerTokenSecretManager (RMContainerTokenSecretManager.java:rollMasterKey(103)) - Rolling master-key for container-tokens 2013-09-30 09:12:09,279 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:rollMasterKey(107)) - Rolling master-key for amrm-tokens 2013-09-30 09:12:09,281 INFO security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:rollMasterKey(97)) - Rolling master-key for nm-tokens 2013-09-30 09:12:10,196 INFO recovery.FileSystemRMStateStore (FileSystemRMStateStore.java:loadRMAppState(131)) - Loading application from node: application_1380531989689_0002 2013-09-30 09:12:10,217 INFO recovery.FileSystemRMStateStore (FileSystemRMStateStore.java:loadRMAppState(131)) - Loading application from node: application_1380531989689_0003 2013-09-30 09:12:10,232 INFO security.RMDelegationTokenSecretManager (RMDelegationTokenSecretManager.java:recover(181)) - recovering RMDelegationTokenSecretManager. 
2013-09-30 09:12:10,234 INFO resourcemanager.RMAppManager (RMAppManager.java:recover(329)) - Recovering 2 applications 2013-09-30 09:12:10,234 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(640)) - Failed to load/recover state java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:332) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:842) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:636) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:855) 2013-09-30 09:12:10,236 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1 2013-09-30 09:17:20,144 INFO resourcemanager.ResourceManager (StringUtils.java:startupShutdownMessage(601)) - STARTUP_MSG: {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782357#comment-13782357 ] Hadoop QA commented on YARN-1241: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12605986/YARN-1241-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2044//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/2044//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2044//console This message is automatically generated. In Fair Scheduler maxRunningApps does not work for non-leaf queues -- Key: YARN-1241 URL: https://issues.apache.org/jira/browse/YARN-1241 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1241-1.patch, YARN-1241.patch Setting the maxRunningApps property on a parent queue should make it that the sum of apps in all subqueues can't exceed it -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1247) test-container-executor has gotten out of sync with the changes to container-executor
[ https://issues.apache.org/jira/browse/YARN-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782359#comment-13782359 ] Alejandro Abdelnur commented on YARN-1247: -- +1 LGTM. test-container-executor has gotten out of sync with the changes to container-executor - Key: YARN-1247 URL: https://issues.apache.org/jira/browse/YARN-1247 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik Attachments: 0001-YARN-1247.-test-container-executor-has-gotten-out-of.patch If run under the super-user account test-container-executor.c fails in multiple different places. It would be nice to fix it so that we have better testing of LCE functionality. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (YARN-1255) RM fails to start up with Failed to load/recover state error in a HA setup
[ https://issues.apache.org/jira/browse/YARN-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta resolved YARN-1255. --- Resolution: Duplicate Thanks [~jlowe] it is. RM fails to start up with Failed to load/recover state error in a HA setup -- Key: YARN-1255 URL: https://issues.apache.org/jira/browse/YARN-1255 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.1-beta Reporter: Arpit Gupta {code} 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:parseQueue(408)) - Initialized queue: default: capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:parseQueue(408)) - Initialized queue: root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:initializeQueues(306)) - Initialized root queue root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:reinitialize(270)) - Initialized CapacityScheduler with calculator=class org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, minimumAllocation=memory:1024, vCores:1, maximumAllocation=memory:8192, vCores:32 2013-09-30 09:12:09,240 INFO event.AsyncDispatcher (AsyncDispatcher.java:register(157)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.RMAppManagerEventType for class org.apache.hadoop.yarn.server.resourcemanager.RMAppManager 2013-09-30 09:12:09,250 INFO event.AsyncDispatcher (AsyncDispatcher.java:register(157)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncherEventType for class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher 2013-09-30 09:12:09,252 INFO resourcemanager.RMNMInfo (RMNMInfo.java:init(63)) - Registered RMNMInfo MBean 2013-09-30 09:12:09,253 INFO util.HostsFileReader (HostsFileReader.java:refresh(84)) - Refreshing hosts (include/exclude) list 2013-09-30 09:12:09,278 INFO security.UserGroupInformation (UserGroupInformation.java:loginUserFromKeytab(843)) - Login successful for user rm/hostname@realm using keytab file /etc/security/keytabs/rm.service.keytab 2013-09-30 09:12:09,278 INFO security.RMContainerTokenSecretManager (RMContainerTokenSecretManager.java:rollMasterKey(103)) - Rolling master-key for container-tokens 2013-09-30 09:12:09,279 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:rollMasterKey(107)) - Rolling master-key for amrm-tokens 2013-09-30 09:12:09,281 INFO security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:rollMasterKey(97)) - Rolling master-key for nm-tokens 2013-09-30 09:12:10,196 INFO recovery.FileSystemRMStateStore (FileSystemRMStateStore.java:loadRMAppState(131)) - Loading application from node: application_1380531989689_0002 2013-09-30 09:12:10,217 INFO recovery.FileSystemRMStateStore (FileSystemRMStateStore.java:loadRMAppState(131)) - Loading application from node: application_1380531989689_0003 2013-09-30 09:12:10,232 INFO security.RMDelegationTokenSecretManager (RMDelegationTokenSecretManager.java:recover(181)) - recovering RMDelegationTokenSecretManager. 
2013-09-30 09:12:10,234 INFO resourcemanager.RMAppManager (RMAppManager.java:recover(329)) - Recovering 2 applications 2013-09-30 09:12:10,234 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(640)) - Failed to load/recover state java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:332) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:842) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:636) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:855) 2013-09-30 09:12:10,236 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1 2013-09-30 09:17:20,144 INFO resourcemanager.ResourceManager (StringUtils.java:startupShutdownMessage(601)) - STARTUP_MSG: {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1185) FileSystemRMStateStore can leave partial files that prevent subsequent recovery
[ https://issues.apache.org/jira/browse/YARN-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782363#comment-13782363 ] Arpit Gupta commented on YARN-1185: --- Here is the stack trace from the RM when it tries to recover partially written data {code} 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:parseQueue(408)) - Initialized queue: default: capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:parseQueue(408)) - Initialized queue: root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:initializeQueues(306)) - Initialized root queue root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=memory:0, vCores:0usedCapacity=0.0, numApps=0, numContainers=0 2013-09-30 09:12:09,206 INFO capacity.CapacityScheduler (CapacityScheduler.java:reinitialize(270)) - Initialized CapacityScheduler with calculator=class org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, minimumAllocation=memory:1024, vCores:1, maximumAllocation=memory:8192, vCores:32 2013-09-30 09:12:09,240 INFO event.AsyncDispatcher (AsyncDispatcher.java:register(157)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.RMAppManagerEventType for class org.apache.hadoop.yarn.server.resourcemanager.RMAppManager 2013-09-30 09:12:09,250 INFO event.AsyncDispatcher (AsyncDispatcher.java:register(157)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncherEventType for class org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher 2013-09-30 09:12:09,252 INFO resourcemanager.RMNMInfo (RMNMInfo.java:init(63)) - Registered RMNMInfo MBean 2013-09-30 09:12:09,253 INFO util.HostsFileReader (HostsFileReader.java:refresh(84)) - Refreshing hosts (include/exclude) list 2013-09-30 09:12:09,278 INFO security.UserGroupInformation (UserGroupInformation.java:loginUserFromKeytab(843)) - Login successful for user rm/hostname@realm using keytab file /etc/security/keytabs/rm.service.keytab 2013-09-30 09:12:09,278 INFO security.RMContainerTokenSecretManager (RMContainerTokenSecretManager.java:rollMasterKey(103)) - Rolling master-key for container-tokens 2013-09-30 09:12:09,279 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:rollMasterKey(107)) - Rolling master-key for amrm-tokens 2013-09-30 09:12:09,281 INFO security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:rollMasterKey(97)) - Rolling master-key for nm-tokens 2013-09-30 09:12:10,196 INFO recovery.FileSystemRMStateStore (FileSystemRMStateStore.java:loadRMAppState(131)) - Loading application from node: application_1380531989689_0002 2013-09-30 09:12:10,217 INFO recovery.FileSystemRMStateStore (FileSystemRMStateStore.java:loadRMAppState(131)) - Loading application from node: application_1380531989689_0003 2013-09-30 09:12:10,232 INFO security.RMDelegationTokenSecretManager (RMDelegationTokenSecretManager.java:recover(181)) - recovering RMDelegationTokenSecretManager. 
2013-09-30 09:12:10,234 INFO resourcemanager.RMAppManager (RMAppManager.java:recover(329)) - Recovering 2 applications 2013-09-30 09:12:10,234 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(640)) - Failed to load/recover state java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:332) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:842) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:636) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:855) 2013-09-30 09:12:10,236 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1 2013-09-30 09:17:20,144 INFO resourcemanager.ResourceManager (StringUtils.java:startupShutdownMessage(601)) - STARTUP_MSG: {code} FileSystemRMStateStore can leave partial files that prevent subsequent recovery --- Key: YARN-1185 URL: https://issues.apache.org/jira/browse/YARN-1185 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe FileSystemRMStateStore writes directly to the destination file when storing state. However if the RM
[jira] [Updated] (YARN-1185) FileSystemRMStateStore can leave partial files that prevent subsequent recovery
[ https://issues.apache.org/jira/browse/YARN-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1185: -- Issue Type: Sub-task (was: Bug) Parent: YARN-128 FileSystemRMStateStore can leave partial files that prevent subsequent recovery --- Key: YARN-1185 URL: https://issues.apache.org/jira/browse/YARN-1185 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe FileSystemRMStateStore writes directly to the destination file when storing state. However if the RM were to crash in the middle of the write, the recovery method could encounter a partially-written file and either outright crash during recovery or silently load incomplete state. To avoid this, the data should be written to a temporary file and renamed to the destination file afterwards. -- This message was sent by Atlassian JIRA (v6.1#6144)
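To make the proposed fix concrete, here is a minimal, hedged sketch of the write-to-temp-then-rename pattern the description calls for, using the standard Hadoop FileSystem API; the class, method, and path names are illustrative and are not the actual FileSystemRMStateStore code.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AtomicStateWrite {
  // Write to a temporary file first, then rename it into place, so a recovering
  // reader never observes a partially written destination file.
  public static void writeAtomically(FileSystem fs, Path dest, byte[] data) throws Exception {
    Path tmp = new Path(dest.getParent(), dest.getName() + ".tmp");
    try (FSDataOutputStream out = fs.create(tmp, true)) {
      out.write(data);
    }
    // Rename only after the write completed; recovery code can ignore/delete *.tmp files.
    if (!fs.rename(tmp, dest)) {
      throw new java.io.IOException("Failed to rename " + tmp + " to " + dest);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.getLocal(conf);
    writeAtomically(fs, new Path("/tmp/rmstore/appstate.bin"), "state".getBytes("UTF-8"));
  }
}
{code}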
[jira] [Updated] (YARN-953) [YARN-321] Change ResourceManager to use HistoryStorage to log history data
[ https://issues.apache.org/jira/browse/YARN-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-953: --- Attachment: YARN-953-5.patch Thanks [~zjshen] for the patch. I am updating it with latest YARN-321 branch and fixing some of the compilation failures. Thanks, Mayank [YARN-321] Change ResourceManager to use HistoryStorage to log history data --- Key: YARN-953 URL: https://issues.apache.org/jira/browse/YARN-953 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-953.1.patch, YARN-953.2.patch, YARN-953.3.patch, YARN-953.4.patch, YARN-953-5.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1247) test-container-executor has gotten out of sync with the changes to container-executor
[ https://issues.apache.org/jira/browse/YARN-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782402#comment-13782402 ] Hudson commented on YARN-1247: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4501 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4501/]) YARN-1247. test-container-executor has gotten out of sync with the changes to container-executor. (rvs via tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1527813) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c test-container-executor has gotten out of sync with the changes to container-executor - Key: YARN-1247 URL: https://issues.apache.org/jira/browse/YARN-1247 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik Fix For: 2.1.2-beta Attachments: 0001-YARN-1247.-test-container-executor-has-gotten-out-of.patch If run under the super-user account test-container-executor.c fails in multiple different places. It would be nice to fix it so that we have better testing of LCE functionality. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-953) [YARN-321] Change ResourceManager to use HistoryStorage to log history data
[ https://issues.apache.org/jira/browse/YARN-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782404#comment-13782404 ] Hadoop QA commented on YARN-953: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12605993/YARN-953-5.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2045//console This message is automatically generated. [YARN-321] Change ResourceManager to use HistoryStorage to log history data --- Key: YARN-953 URL: https://issues.apache.org/jira/browse/YARN-953 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-953.1.patch, YARN-953.2.patch, YARN-953.3.patch, YARN-953.4.patch, YARN-953-5.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1257) Avro apps are failing with hadoop2
Mayank Bansal created YARN-1257: --- Summary: Avro apps are failing with hadoop2 Key: YARN-1257 URL: https://issues.apache.org/jira/browse/YARN-1257 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Mayank Bansal Fix For: 2.1.2-beta hi, MR Apps which are using avro is not running. These apps are compile with the hadoop1 jars. Exception in thread main java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected at org.apache.avro.mapreduce.AvroMultipleOutputs.getNamedOutputsList(AvroMultipleOutputs.java:208) at org.apache.avro.mapreduce.AvroMultipleOutputs.checkNamedOutputName(AvroMultipleOutputs.java:195) at org.apache.avro.mapreduce.AvroMultipleOutputs.addNamedOutput(AvroMultipleOutputs.java:259) at com.ebay.sojourner.HadoopJob.addAvroMultipleOutput(HadoopJob.java:157) at com.ebay.sojourner.intraday.IntradayJob.initJob(IntradayJob.java:113) at com.ebay.sojourner.HadoopJob.run(HadoopJob.java:76) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at com.ebay.sojourner.intraday.IntradayJob.main(IntradayJob.java:165) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72) at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152) at com.ebay.sojourner.JobDriver.main(JobDriver.java:52) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1257) Avro apps are failing with hadoop2
[ https://issues.apache.org/jira/browse/YARN-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-1257: Priority: Blocker (was: Major) Avro apps are failing with hadoop2 -- Key: YARN-1257 URL: https://issues.apache.org/jira/browse/YARN-1257 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Mayank Bansal Priority: Blocker Fix For: 2.1.2-beta hi, MR Apps which are using avro is not running. These apps are compile with the hadoop1 jars. Exception in thread main java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected at org.apache.avro.mapreduce.AvroMultipleOutputs.getNamedOutputsList(AvroMultipleOutputs.java:208) at org.apache.avro.mapreduce.AvroMultipleOutputs.checkNamedOutputName(AvroMultipleOutputs.java:195) at org.apache.avro.mapreduce.AvroMultipleOutputs.addNamedOutput(AvroMultipleOutputs.java:259) at com.ebay.sojourner.HadoopJob.addAvroMultipleOutput(HadoopJob.java:157) at com.ebay.sojourner.intraday.IntradayJob.initJob(IntradayJob.java:113) at com.ebay.sojourner.HadoopJob.run(HadoopJob.java:76) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at com.ebay.sojourner.intraday.IntradayJob.main(IntradayJob.java:165) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72) at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152) at com.ebay.sojourner.JobDriver.main(JobDriver.java:52) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated YARN-1253: Fix Version/s: (was: 2.1.1-beta) Changes to LinuxContainerExecutor to use cgroups in unsecure mode - Key: YARN-1253 URL: https://issues.apache.org/jira/browse/YARN-1253 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Priority: Blocker When using cgroups we require LCE to be configured in the cluster to start containers. When LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in an un-secure setup this presents a couple issues: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users Particularly, the second issue is not desirable as a user could get access to ssh keys of other users in the nodes or if there are NFS mounts, get to other users data outside of the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
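For context on the setup the description assumes (cgroups requiring LCE), here is a hedged sketch of the standard yarn-site.xml keys, expressed programmatically, that enable the LinuxContainerExecutor with its cgroups resources handler; the values are illustrative and this is not part of the change proposed in this JIRA.
{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class CgroupsLceConf {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();
    // Use the LinuxContainerExecutor so containers can be placed into cgroups.
    conf.set("yarn.nodemanager.container-executor.class",
        "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor");
    // Cgroups support in LCE is provided by this resources handler.
    conf.set("yarn.nodemanager.linux-container-executor.resources-handler.class",
        "org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler");
    // Group that owns the setuid container-executor binary (illustrative value).
    conf.set("yarn.nodemanager.linux-container-executor.group", "hadoop");
    System.out.println(conf.get("yarn.nodemanager.container-executor.class"));
  }
}
{code}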
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782426#comment-13782426 ] Arun C Murthy commented on YARN-1253: - AFAIK, LCE already works in non-secure mode. Can you please help me understand what is the extra ask here? Is the intent to ensure only the NM can use LCE to run as other users? Changes to LinuxContainerExecutor to use cgroups in unsecure mode - Key: YARN-1253 URL: https://issues.apache.org/jira/browse/YARN-1253 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Priority: Blocker When using cgroups we require LCE to be configured in the cluster to start containers. When LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in an un-secure setup this presents a couple issues: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users Particularly, the second issue is not desirable as a user could get access to ssh keys of other users in the nodes or if there are NFS mounts, get to other users data outside of the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated YARN-1253: Target Version/s: (was: 2.1.1-beta) Changes to LinuxContainerExecutor to use cgroups in unsecure mode - Key: YARN-1253 URL: https://issues.apache.org/jira/browse/YARN-1253 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Priority: Blocker When using cgroups we require LCE to be configured in the cluster to start containers. When LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in an un-secure setup this presents a couple issues: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users Particularly, the second issue is not desirable as a user could get access to ssh keys of other users in the nodes or if there are NFS mounts, get to other users data outside of the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782431#comment-13782431 ] Arun C Murthy commented on YARN-1253: - It would help to understand what LCE+nonsecure doesn't solve yet... Changes to LinuxContainerExecutor to use cgroups in unsecure mode - Key: YARN-1253 URL: https://issues.apache.org/jira/browse/YARN-1253 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Priority: Blocker When using cgroups we require LCE to be configured in the cluster to start containers. When LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in an un-secure setup this presents a couple issues: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users Particularly, the second issue is not desirable as a user could get access to ssh keys of other users in the nodes or if there are NFS mounts, get to other users data outside of the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1070) ContainerImpl State Machine: Invalid event: CONTAINER_KILLED_ON_REQUEST at CONTAINER_CLEANEDUP_AFTER_KILL
[ https://issues.apache.org/jira/browse/YARN-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782434#comment-13782434 ] Vinod Kumar Vavilapalli commented on YARN-1070: --- Patch looks good to me, +1. Re-kicking Jenkins before committing it. ContainerImpl State Machine: Invalid event: CONTAINER_KILLED_ON_REQUEST at CONTAINER_CLEANEDUP_AFTER_KILL - Key: YARN-1070 URL: https://issues.apache.org/jira/browse/YARN-1070 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Hitesh Shah Assignee: Zhijie Shen Attachments: YARN-1070.1.patch, YARN-1070.2.patch, YARN-1070.3.patch, YARN-1070.4.patch, YARN-1070.5.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1010) FairScheduler: decouple container scheduling from nodemanager heartbeats
[ https://issues.apache.org/jira/browse/YARN-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782437#comment-13782437 ] Wei Yan commented on YARN-1010: --- Updates in the patch. (1) The {{FairScheduler}} launches a thread to do continuous scheduling. (2) Several configuration fields: {{yarn.scheduler.fair.continuous.scheduling.enabled}}. Whether to enable continuous scheduling. The default value is false. {{yarn.scheduler.fair.continuous.scheduling.sleep.time.ms}}. The sleep time for each round of continuous scheduling; the default value is 5 ms. Configurations for delay scheduling: {{yarn.scheduler.fair.locality.threshold.node.time.ms}}. Time threshold for node locality. The default value is -1L. {{yarn.scheduler.fair.locality.threshold.rack.time.ms}}. Time threshold for rack locality. The default value is -1L. (3) Add test cases for continuous scheduling in {{TestFairScheduler}}, and for the delay scheduling mechanism in {{TestFSSchedulerApp}}. FairScheduler: decouple container scheduling from nodemanager heartbeats Key: YARN-1010 URL: https://issues.apache.org/jira/browse/YARN-1010 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Wei Yan Priority: Critical Attachments: YARN-1010.patch Currently scheduling for a node is done when a node heartbeats. For a large cluster where the heartbeat interval is set to several seconds, this delays scheduling of incoming allocations significantly. We could have a continuous loop scanning all nodes and doing scheduling. If there is availability, AMs will get the allocation in the next heartbeat after the one that placed the request. -- This message was sent by Atlassian JIRA (v6.1#6144)
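As a hedged illustration of the configuration fields listed in the comment above (property names as given there; the values are examples, not recommendations):
{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ContinuousSchedulingConf {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();
    // Turn on the continuous-scheduling thread (default is false).
    conf.setBoolean("yarn.scheduler.fair.continuous.scheduling.enabled", true);
    // Sleep between scheduling rounds (default is 5 ms).
    conf.setLong("yarn.scheduler.fair.continuous.scheduling.sleep.time.ms", 5L);
    // Time-based delay-scheduling thresholds; -1 (the stated default) keeps the default behaviour.
    conf.setLong("yarn.scheduler.fair.locality.threshold.node.time.ms", 1000L);
    conf.setLong("yarn.scheduler.fair.locality.threshold.rack.time.ms", 3000L);
    System.out.println(conf.getBoolean("yarn.scheduler.fair.continuous.scheduling.enabled", false));
  }
}
{code}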
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782435#comment-13782435 ] Alejandro Abdelnur commented on YARN-1253: -- LCE works in a non-secure setup, but it has two issues as stated in the description of the JIRA: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users Particularly, the second issue is not desirable as a user could get access to ssh keys of other users in the nodes or if there are NFS mounts, get to other users data outside of the cluster. It could be argued that the first one could be a requirement (though, by analogy, it is not for HDFS permissions in unsecure mode). The second issue is, IMO, the severe one, especially for the scenarios mentioned in the follow-up paragraph quoted above. Changes to LinuxContainerExecutor to use cgroups in unsecure mode - Key: YARN-1253 URL: https://issues.apache.org/jira/browse/YARN-1253 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Priority: Blocker When using cgroups we require LCE to be configured in the cluster to start containers. When LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in an un-secure setup this presents a couple issues: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users Particularly, the second issue is not desirable as a user could get access to ssh keys of other users in the nodes or if there are NFS mounts, get to other users data outside of the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782448#comment-13782448 ] Sandy Ryza commented on YARN-1241: -- Uploaded patch to fix findbugs warnings In Fair Scheduler maxRunningApps does not work for non-leaf queues -- Key: YARN-1241 URL: https://issues.apache.org/jira/browse/YARN-1241 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241.patch Setting the maxRunningApps property on a parent queue should make it that the sum of apps in all subqueues can't exceed it -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1258) Allow configuring the Fair Scheduler root queue
Sandy Ryza created YARN-1258: Summary: Allow configuring the Fair Scheduler root queue Key: YARN-1258 URL: https://issues.apache.org/jira/browse/YARN-1258 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.1-beta Reporter: Sandy Ryza This would be useful for acls, maxRunningApps, scheduling modes, etc. The allocation file should be able to accept both: * An implicit root queue * A root queue at the top of the hierarchy with all queues under/inside of it -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1070) ContainerImpl State Machine: Invalid event: CONTAINER_KILLED_ON_REQUEST at CONTAINER_CLEANEDUP_AFTER_KILL
[ https://issues.apache.org/jira/browse/YARN-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782453#comment-13782453 ] Hadoop QA commented on YARN-1070: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12601671/YARN-1070.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2047//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2047//console This message is automatically generated. ContainerImpl State Machine: Invalid event: CONTAINER_KILLED_ON_REQUEST at CONTAINER_CLEANEDUP_AFTER_KILL - Key: YARN-1070 URL: https://issues.apache.org/jira/browse/YARN-1070 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Hitesh Shah Assignee: Zhijie Shen Attachments: YARN-1070.1.patch, YARN-1070.2.patch, YARN-1070.3.patch, YARN-1070.4.patch, YARN-1070.5.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1070) ContainerImpl State Machine: Invalid event: CONTAINER_KILLED_ON_REQUEST at CONTAINER_CLEANEDUP_AFTER_KILL
[ https://issues.apache.org/jira/browse/YARN-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782474#comment-13782474 ] Hudson commented on YARN-1070: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4502 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4502/]) YARN-1070. Fixed race conditions in NodeManager during container-kill. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1527827) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncher.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java ContainerImpl State Machine: Invalid event: CONTAINER_KILLED_ON_REQUEST at CONTAINER_CLEANEDUP_AFTER_KILL - Key: YARN-1070 URL: https://issues.apache.org/jira/browse/YARN-1070 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Hitesh Shah Assignee: Zhijie Shen Fix For: 2.1.2-beta Attachments: YARN-1070.1.patch, YARN-1070.2.patch, YARN-1070.3.patch, YARN-1070.4.patch, YARN-1070.5.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1215) Yarn URL should include userinfo
[ https://issues.apache.org/jira/browse/YARN-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782508#comment-13782508 ] Bikas Saha commented on YARN-1215: -- Hmm. OK. Let's take the patch then. +1. Yarn URL should include userinfo Key: YARN-1215 URL: https://issues.apache.org/jira/browse/YARN-1215 Project: Hadoop YARN Issue Type: Bug Components: api Affects Versions: 3.0.0 Reporter: Chuan Liu Assignee: Chuan Liu Attachments: YARN-1215-trunk.2.patch, YARN-1215-trunk.patch In the {{org.apache.hadoop.yarn.api.records.URL}} class, we don't have a userinfo as part of the URL. When converting a {{java.net.URI}} object into the YARN URL object in the {{ConverterUtils.getYarnUrlFromURI()}} method, we will set the uri host as the url host. If the uri has a userinfo part, the userinfo is discarded. This will lead to information loss if the original uri has the userinfo, e.g. foo://username:passw...@example.com will be converted to foo://example.com and username/password information is lost during the conversion. -- This message was sent by Atlassian JIRA (v6.1#6144)
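A small, hedged sketch of the information loss the description refers to: {{java.net.URI}} exposes the userinfo component that a conversion copying only scheme/host/port/path would drop. The URI and credentials below are placeholders, not the original example.
{code}
import java.net.URI;

public class UserInfoExample {
  public static void main(String[] args) {
    URI uri = URI.create("foo://username:password@example.com/some/path");
    // The URI carries the userinfo...
    System.out.println(uri.getUserInfo()); // username:password
    // ...but a conversion that keeps only scheme, host, and path loses it.
    System.out.println(uri.getScheme() + "://" + uri.getHost() + uri.getPath());
  }
}
{code}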
[jira] [Commented] (YARN-905) Add state filters to nodes CLI
[ https://issues.apache.org/jira/browse/YARN-905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782511#comment-13782511 ] Hadoop QA commented on YARN-905: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606009/YARN-905-addendum.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2048//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2048//console This message is automatically generated. Add state filters to nodes CLI -- Key: YARN-905 URL: https://issues.apache.org/jira/browse/YARN-905 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Wei Yan Attachments: YARN-905-addendum.patch, YARN-905-addendum.patch, YARN-905-addendum.patch, Yarn-905.patch, YARN-905.patch, YARN-905.patch It would be helpful for the nodes CLI to have a node-states option that allows it to return nodes that are not just in the RUNNING state. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1257) Avro apps are failing with hadoop2
[ https://issues.apache.org/jira/browse/YARN-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782513#comment-13782513 ] Arun C Murthy commented on YARN-1257: - [~mayank_bansal]: seems like this is using o.a.h.mapreduce.* apis? If so, you'll have to recompile against hadoop-2... Avro apps are failing with hadoop2 -- Key: YARN-1257 URL: https://issues.apache.org/jira/browse/YARN-1257 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Mayank Bansal Priority: Blocker Fix For: 2.1.2-beta hi, MR Apps which are using avro is not running. These apps are compile with the hadoop1 jars. Exception in thread main java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected at org.apache.avro.mapreduce.AvroMultipleOutputs.getNamedOutputsList(AvroMultipleOutputs.java:208) at org.apache.avro.mapreduce.AvroMultipleOutputs.checkNamedOutputName(AvroMultipleOutputs.java:195) at org.apache.avro.mapreduce.AvroMultipleOutputs.addNamedOutput(AvroMultipleOutputs.java:259) at com.ebay.sojourner.HadoopJob.addAvroMultipleOutput(HadoopJob.java:157) at com.ebay.sojourner.intraday.IntradayJob.initJob(IntradayJob.java:113) at com.ebay.sojourner.HadoopJob.run(HadoopJob.java:76) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at com.ebay.sojourner.intraday.IntradayJob.main(IntradayJob.java:165) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72) at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152) at com.ebay.sojourner.JobDriver.main(JobDriver.java:52) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782542#comment-13782542 ] Hadoop QA commented on YARN-1241: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606007/YARN-1241-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2049//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/2049//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2049//console This message is automatically generated. In Fair Scheduler maxRunningApps does not work for non-leaf queues -- Key: YARN-1241 URL: https://issues.apache.org/jira/browse/YARN-1241 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241.patch Setting the maxRunningApps property on a parent queue should make it that the sum of apps in all subqueues can't exceed it -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1141) Updating resource requests should be decoupled with updating blacklist
[ https://issues.apache.org/jira/browse/YARN-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782544#comment-13782544 ] Hadoop QA commented on YARN-1141: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606004/YARN-1141.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2050//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2050//console This message is automatically generated. Updating resource requests should be decoupled with updating blacklist -- Key: YARN-1141 URL: https://issues.apache.org/jira/browse/YARN-1141 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Attachments: YARN-1141.1.patch, YARN-1141.2.patch Currently, in CapacityScheduler and FifoScheduler, blacklist is updated together with resource requests, only when the incoming resource requests are not empty. Therefore, when the incoming resource requests are empty, the blacklist will not be updated even when blacklist additions and removals are not empty. -- This message was sent by Atlassian JIRA (v6.1#6144)
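To make the scenario above concrete, here is a hedged sketch, assuming it runs inside an ApplicationMaster that has already registered with the RM, of an allocate cycle carrying only blacklist changes and no new resource requests; this empty-ask case is the one the schedulers were ignoring.
{code}
import java.util.Collections;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class BlacklistOnlyHeartbeat {
  // Assumes registerApplicationMaster() has already been called on this client.
  public static AllocateResponse blacklistNode(AMRMClient<ContainerRequest> amrmClient,
      String badNode) throws Exception {
    // Add one node to the blacklist, remove nothing, and ask for no new containers.
    amrmClient.updateBlacklist(Collections.singletonList(badNode),
        Collections.<String>emptyList());
    // The next heartbeat carries an empty ask list plus the blacklist update.
    return amrmClient.allocate(0.0f);
  }
}
{code}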
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782548#comment-13782548 ] Arun C Murthy commented on YARN-1253: - bq. Because users can impersonate other users, any user would have access to any local file of other users You mean by submitting a job as someone else? I think we need to step back - is the requirement that we need to use cgroups in non-secure mode for resource isolation? If so, LCE in non-secure mode is sufficient. Let's not confuse this with security, we already have the problem where someone can delete all data in HDFS in non-secure mode... which is why we have a security in the first place. Changes to LinuxContainerExecutor to use cgroups in unsecure mode - Key: YARN-1253 URL: https://issues.apache.org/jira/browse/YARN-1253 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Priority: Blocker When using cgroups we require LCE to be configured in the cluster to start containers. When LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in an un-secure setup this presents a couple issues: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users Particularly, the second issue is not desirable as a user could get access to ssh keys of other users in the nodes or if there are NFS mounts, get to other users data outside of the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782566#comment-13782566 ] Sandy Ryza commented on YARN-1241: -- Uploaded patch to fix the new findbugs warnings In Fair Scheduler maxRunningApps does not work for non-leaf queues -- Key: YARN-1241 URL: https://issues.apache.org/jira/browse/YARN-1241 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241-3.patch, YARN-1241.patch Setting the maxRunningApps property on a parent queue should make it that the sum of apps in all subqueues can't exceed it -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (YARN-1257) Avro apps are failing with hadoop2
[ https://issues.apache.org/jira/browse/YARN-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved YARN-1257. - Resolution: Not A Problem I'm closing at won't fix, pls re-open if necessary. Thanks. Avro apps are failing with hadoop2 -- Key: YARN-1257 URL: https://issues.apache.org/jira/browse/YARN-1257 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Mayank Bansal Priority: Blocker Fix For: 2.1.2-beta hi, MR Apps which are using avro is not running. These apps are compile with the hadoop1 jars. Exception in thread main java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected at org.apache.avro.mapreduce.AvroMultipleOutputs.getNamedOutputsList(AvroMultipleOutputs.java:208) at org.apache.avro.mapreduce.AvroMultipleOutputs.checkNamedOutputName(AvroMultipleOutputs.java:195) at org.apache.avro.mapreduce.AvroMultipleOutputs.addNamedOutput(AvroMultipleOutputs.java:259) at com.ebay.sojourner.HadoopJob.addAvroMultipleOutput(HadoopJob.java:157) at com.ebay.sojourner.intraday.IntradayJob.initJob(IntradayJob.java:113) at com.ebay.sojourner.HadoopJob.run(HadoopJob.java:76) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at com.ebay.sojourner.intraday.IntradayJob.main(IntradayJob.java:165) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72) at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152) at com.ebay.sojourner.JobDriver.main(JobDriver.java:52) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782572#comment-13782572 ] Vinod Kumar Vavilapalli commented on YARN-1253: --- Agree with [~acmurthy], LCE + unsecure mode can already be used to do cgroup. If there are bugs, we should fix them. bq. LCE requires all Hadoop users submitting jobs to be Unix users in all nodes Yes, this has always been a requirement. I think there is some effort going on in the Windows world of Hadoop to change this, you should look at it. bq. Because users can impersonate other users, any user would have access to any local file of other users Even if the jobs run as a single 'yarnuser', security isn't still there - like Arun said, any body can bomb HDFS directories of other users, any user can kill any other user's tasks/containers, any one can delete any one else's local dirs, log-dir and so on. We could argue which is worse - stealing user's passwords or deleting other user's data on DFS - it depends on who you ask. If you want security, you should enable security. Changes to LinuxContainerExecutor to use cgroups in unsecure mode - Key: YARN-1253 URL: https://issues.apache.org/jira/browse/YARN-1253 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Priority: Blocker When using cgroups we require LCE to be configured in the cluster to start containers. When LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in an un-secure setup this presents a couple issues: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users Particularly, the second issue is not desirable as a user could get access to ssh keys of other users in the nodes or if there are NFS mounts, get to other users data outside of the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782571#comment-13782571 ] Roman Shaposhnik commented on YARN-1253: I've started doing some preliminary work on this JIRA, so hopefully I can explain some of the things that my patch is about to address: # the reason to use LCE in a non-secure mode is to be able to take advantage of cgroups mechanism, now perhaps cgroups functionality should be independent from the rest of LCE functionality, but re-using the current LCE design is also quite easy -- hence lets assume that for cgroups we need LCE # in a fully secure deployment, LCE works perfectly and makes YARN users correspond 1-1 with the local UNIX users provisioned on each worker node # in a non-secure deployment this 1-1 correspondence feels like a burden that doesn't necessarily have to be there Thus, the proposal is really to add a tiny bit of functionality to LCE where in a non-secure case it would be able to run all tasks under a single designated user (different from a user running nodemanager). On top of that, the notion of the YARN user (which no longer has to have a corresponding UNIX user) get preserved in everything else that LCE does (which really boils down to paths in the local filesystem used for localization). Changes to LinuxContainerExecutor to use cgroups in unsecure mode - Key: YARN-1253 URL: https://issues.apache.org/jira/browse/YARN-1253 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Priority: Blocker When using cgroups we require LCE to be configured in the cluster to start containers. When LCE starts containers as the user that submitted the job. While this works correctly in a secure setup, in an un-secure setup this presents a couple issues: * LCE requires all Hadoop users submitting jobs to be Unix users in all nodes * Because users can impersonate other users, any user would have access to any local file of other users Particularly, the second issue is not desirable as a user could get access to ssh keys of other users in the nodes or if there are NFS mounts, get to other users data outside of the cluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1260) RM_HOME link breaks when webapp.https.address related properties are not specified
Yesha Vora created YARN-1260: Summary: RM_HOME link breaks when webapp.https.address related properties are not specified Key: YARN-1260 URL: https://issues.apache.org/jira/browse/YARN-1260 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Yesha Vora This issue happens in multiple node cluster where resource manager and node manager are running on different machines. Steps to reproduce: 1) set yarn.resourcemanager.hostname = resourcemanager host in yarn-site.xml 2) set hadoop.ssl.enabled = true in core-site.xml 3) Do not specify below property in yarn-site.xml yarn.nodemanager.webapp.https.address and yarn.resourcemanager.webapp.https.address Here, the default value of above two property will be considered. 4) Go to nodemanager web UI https:nodemanager host:8044/node 5) Click on RM_HOME link This link redirects to https:nodemanager host:8090/cluster instead https:resourcemanager host:8090/cluster -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-976) Document the meaning of a virtual core
[ https://issues.apache.org/jira/browse/YARN-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-976: Attachment: YARN-976.patch Document the meaning of a virtual core -- Key: YARN-976 URL: https://issues.apache.org/jira/browse/YARN-976 Project: Hadoop YARN Issue Type: Task Components: documentation Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-976.patch As virtual cores are a somewhat novel concept, it would be helpful to have thorough documentation that clarifies their meaning. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1260) RM_HOME link breaks when webapp.https.address related properties are not specified
[ https://issues.apache.org/jira/browse/YARN-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated YARN-1260: - Description: This issue happens in multiple node cluster where resource manager and node manager are running on different machines. Steps to reproduce: 1) set yarn.resourcemanager.hostname = resourcemanager host in yarn-site.xml 2) set hadoop.ssl.enabled = true in core-site.xml 3) Do not specify below property in yarn-site.xml yarn.nodemanager.webapp.https.address and yarn.resourcemanager.webapp.https.address Here, the default value of above two property will be considered. 4) Go to nodemanager web UI https://nodemanager host:8044/node 5) Click on RM_HOME link This link redirects to https://nodemanager host:8090/cluster instead https://resourcemanager host:8090/cluster was: This issue happens in multiple node cluster where resource manager and node manager are running on different machines. Steps to reproduce: 1) set yarn.resourcemanager.hostname = resourcemanager host in yarn-site.xml 2) set hadoop.ssl.enabled = true in core-site.xml 3) Do not specify below property in yarn-site.xml yarn.nodemanager.webapp.https.address and yarn.resourcemanager.webapp.https.address Here, the default value of above two property will be considered. 4) Go to nodemanager web UI https:nodemanager host:8044/node 5) Click on RM_HOME link This link redirects to https:nodemanager host:8090/cluster instead https:resourcemanager host:8090/cluster RM_HOME link breaks when webapp.https.address related properties are not specified -- Key: YARN-1260 URL: https://issues.apache.org/jira/browse/YARN-1260 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Yesha Vora This issue happens in multiple node cluster where resource manager and node manager are running on different machines. Steps to reproduce: 1) set yarn.resourcemanager.hostname = resourcemanager host in yarn-site.xml 2) set hadoop.ssl.enabled = true in core-site.xml 3) Do not specify below property in yarn-site.xml yarn.nodemanager.webapp.https.address and yarn.resourcemanager.webapp.https.address Here, the default value of above two property will be considered. 4) Go to nodemanager web UI https://nodemanager host:8044/node 5) Click on RM_HOME link This link redirects to https://nodemanager host:8090/cluster instead https://resourcemanager host:8090/cluster -- This message was sent by Atlassian JIRA (v6.1#6144)
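A hedged sketch of the workaround implied by the reproduction steps: set the RM and NM webapp https addresses explicitly instead of relying on the defaults, so links such as RM_HOME resolve to the ResourceManager host. The host names below are placeholders; the property keys are the ones named in the description.
{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class HttpsWebappAddresses {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();
    conf.set("yarn.resourcemanager.hostname", "rm-host.example.com");
    // Spell out the https web addresses rather than depending on the defaults.
    conf.set("yarn.resourcemanager.webapp.https.address", "rm-host.example.com:8090");
    conf.set("yarn.nodemanager.webapp.https.address", "0.0.0.0:8044");
    System.out.println(conf.get("yarn.resourcemanager.webapp.https.address"));
  }
}
{code}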
[jira] [Updated] (YARN-976) Document the meaning of a virtual core
[ https://issues.apache.org/jira/browse/YARN-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-976: Issue Type: Sub-task (was: Task) Parent: YARN-1024 Document the meaning of a virtual core -- Key: YARN-976 URL: https://issues.apache.org/jira/browse/YARN-976 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-976.patch As virtual cores are a somewhat novel concept, it would be helpful to have thorough documentation that clarifies their meaning. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores
[ https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1089: - Issue Type: Sub-task (was: Improvement) Parent: YARN-1024 Add YARN compute units alongside virtual cores -- Key: YARN-1089 URL: https://issues.apache.org/jira/browse/YARN-1089 Project: Hadoop YARN Issue Type: Sub-task Components: api Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1089-1.patch, YARN-1089.patch Based on discussion in YARN-1024, we will add YARN compute units as a resource for requesting and scheduling CPU processing power. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1024) Define a CPU resource(s) unambigiously
[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1024: - Summary: Define a CPU resource(s) unambigiously (was: Define a virtual core unambigiously) Define a CPU resource(s) unambigiously -- Key: YARN-1024 URL: https://issues.apache.org/jira/browse/YARN-1024 Project: Hadoop YARN Issue Type: Improvement Reporter: Arun C Murthy Assignee: Arun C Murthy Attachments: CPUasaYARNresource.pdf We need to clearly define the meaning of a virtual core unambiguously so that it's easy to migrate applications between clusters. For e.g. here is Amazon EC2 definition of ECU: http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it Essentially we need to clearly define a YARN Virtual Core (YVC). Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1241) In Fair Scheduler maxRunningApps does not work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782582#comment-13782582 ] Hadoop QA commented on YARN-1241: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606033/YARN-1241-3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2051//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2051//console This message is automatically generated. In Fair Scheduler maxRunningApps does not work for non-leaf queues -- Key: YARN-1241 URL: https://issues.apache.org/jira/browse/YARN-1241 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1241-1.patch, YARN-1241-2.patch, YARN-1241-3.patch, YARN-1241.patch Setting the maxRunningApps property on a parent queue should make it that the sum of apps in all subqueues can't exceed it -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1260) RM_HOME link breaks when webapp.https.address related properties are not specified
[ https://issues.apache.org/jira/browse/YARN-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-1260: Priority: Blocker (was: Major) RM_HOME link breaks when webapp.https.address related properties are not specified -- Key: YARN-1260 URL: https://issues.apache.org/jira/browse/YARN-1260 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta, 2.1.2-beta Reporter: Yesha Vora Priority: Blocker This issue happens in multiple node cluster where resource manager and node manager are running on different machines. Steps to reproduce: 1) set yarn.resourcemanager.hostname = resourcemanager host in yarn-site.xml 2) set hadoop.ssl.enabled = true in core-site.xml 3) Do not specify below property in yarn-site.xml yarn.nodemanager.webapp.https.address and yarn.resourcemanager.webapp.https.address Here, the default value of above two property will be considered. 4) Go to nodemanager web UI https://nodemanager host:8044/node 5) Click on RM_HOME link This link redirects to https://nodemanager host:8090/cluster instead https://resourcemanager host:8090/cluster -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1260) RM_HOME link breaks when webapp.https.address related properties are not specified
[ https://issues.apache.org/jira/browse/YARN-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-1260: Affects Version/s: 2.1.2-beta RM_HOME link breaks when webapp.https.address related properties are not specified -- Key: YARN-1260 URL: https://issues.apache.org/jira/browse/YARN-1260 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta, 2.1.2-beta Reporter: Yesha Vora Priority: Blocker This issue happens in multiple node cluster where resource manager and node manager are running on different machines. Steps to reproduce: 1) set yarn.resourcemanager.hostname = resourcemanager host in yarn-site.xml 2) set hadoop.ssl.enabled = true in core-site.xml 3) Do not specify below property in yarn-site.xml yarn.nodemanager.webapp.https.address and yarn.resourcemanager.webapp.https.address Here, the default value of above two property will be considered. 4) Go to nodemanager web UI https://nodemanager host:8044/node 5) Click on RM_HOME link This link redirects to https://nodemanager host:8090/cluster instead https://resourcemanager host:8090/cluster -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (YARN-1260) RM_HOME link breaks when webapp.https.address related properties are not specified
[ https://issues.apache.org/jira/browse/YARN-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi reassigned YARN-1260: --- Assignee: Omkar Vinit Joshi RM_HOME link breaks when webapp.https.address related properties are not specified -- Key: YARN-1260 URL: https://issues.apache.org/jira/browse/YARN-1260 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta, 2.1.2-beta Reporter: Yesha Vora Assignee: Omkar Vinit Joshi Priority: Blocker This issue happens in multiple node cluster where resource manager and node manager are running on different machines. Steps to reproduce: 1) set yarn.resourcemanager.hostname = resourcemanager host in yarn-site.xml 2) set hadoop.ssl.enabled = true in core-site.xml 3) Do not specify below property in yarn-site.xml yarn.nodemanager.webapp.https.address and yarn.resourcemanager.webapp.https.address Here, the default value of above two property will be considered. 4) Go to nodemanager web UI https://nodemanager host:8044/node 5) Click on RM_HOME link This link redirects to https://nodemanager host:8090/cluster instead https://resourcemanager host:8090/cluster -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-976) Document the meaning of a virtual core
[ https://issues.apache.org/jira/browse/YARN-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782585#comment-13782585 ] Hadoop QA commented on YARN-976: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12606035/YARN-976.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2052//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2052//console This message is automatically generated. Document the meaning of a virtual core -- Key: YARN-976 URL: https://issues.apache.org/jira/browse/YARN-976 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-976.patch As virtual cores are a somewhat novel concept, it would be helpful to have thorough documentation that clarifies their meaning. -- This message was sent by Atlassian JIRA (v6.1#6144)