[jira] [Updated] (YARN-3885) ProportionalCapacityPreemptionPolicy doesn't preempt if queue is more than 2 level
[ https://issues.apache.org/jira/browse/YARN-3885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Priyank Rastogi updated YARN-3885:
--
Description: When the preemption policy is {{ProportionalCapacityPreemptionPolicy.cloneQueues}}, this piece of code, which calculates {{untoucable}}, doesn't consider all the children; it considers only the immediate children.

was: When the preemption policy is {{ProportionalCapacityPreemptionPolicy.cloneQueues}}, this piece of code, which calculates {{untoucable}}, is wrong, as it doesn't consider all the children; it considers only the immediate children.

ProportionalCapacityPreemptionPolicy doesn't preempt if queue is more than 2 level
--
Key: YARN-3885
URL: https://issues.apache.org/jira/browse/YARN-3885
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Affects Versions: 2.8.0
Reporter: Ajith S
Priority: Critical
Attachments: YARN-3885.patch

When the preemption policy is {{ProportionalCapacityPreemptionPolicy.cloneQueues}}, this piece of code, which calculates {{untoucable}}, doesn't consider all the children; it considers only the immediate children.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
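To illustrate the difference the YARN-3885 description points at, here is a minimal Java sketch that contrasts aggregating over immediate children with aggregating over every descendant of a queue. It is not the attached patch and it does not reproduce the real policy code: {{QueueNode}}, its {{amount}} field, and both method names are hypothetical stand-ins introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class QueueTreeSketch {

    /** Hypothetical stand-in for a queue in a hierarchy; not a YARN class. */
    static final class QueueNode {
        final String name;
        final long amount; // hypothetical per-queue quantity to aggregate
        final List<QueueNode> children = new ArrayList<>();

        QueueNode(String name, long amount) {
            this.name = name;
            this.amount = amount;
        }

        QueueNode add(QueueNode child) {
            children.add(child);
            return this;
        }
    }

    /** Looks one level down only, so grandchildren and deeper queues are missed. */
    static long sumImmediateChildren(QueueNode queue) {
        long total = 0;
        for (QueueNode child : queue.children) {
            total += child.amount;
        }
        return total;
    }

    /** Recurses through the whole subtree below the given queue. */
    static long sumAllDescendants(QueueNode queue) {
        long total = 0;
        for (QueueNode child : queue.children) {
            total += child.amount + sumAllDescendants(child);
        }
        return total;
    }

    /** Builds root -> a -> a1 -> a1x, plus a sibling leaf b under root. */
    static QueueNode sampleTree() {
        QueueNode a1x = new QueueNode("a1x", 4);
        QueueNode a1 = new QueueNode("a1", 2).add(a1x);
        QueueNode a = new QueueNode("a", 1).add(a1);
        QueueNode b = new QueueNode("b", 8);
        return new QueueNode("root", 0).add(a).add(b);
    }

    public static void main(String[] args) {
        QueueNode root = sampleTree();
        System.out.println(sumImmediateChildren(root)); // prints 9
        System.out.println(sumAllDescendants(root));    // prints 15
    }
}
```

With a hierarchy deeper than two levels, the one-level sum silently drops the contribution of {{a1}} and {{a1x}}, which is the kind of gap the report describes.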
[jira] [Updated] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.
[ https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Priyank Rastogi updated YARN-3543:
--
Labels: (was: BB2015-05-TBR)

ApplicationReport should be able to tell whether the Application is AM managed or not.
--
Key: YARN-3543
URL: https://issues.apache.org/jira/browse/YARN-3543
Project: Hadoop YARN
Issue Type: Improvement
Components: api
Affects Versions: 2.6.0
Reporter: Spandan Dutta
Assignee: Rohith
Attachments: 0001-YARN-3543.patch, 0001-YARN-3543.patch, 0002-YARN-3543.patch, 0002-YARN-3543.patch, 0003-YARN-3543.patch, 0004-YARN-3543.patch, 0004-YARN-3543.patch, YARN-3543-AH.PNG, YARN-3543-RM.PNG

Currently we can know whether the application submitted by the user is AM managed from the applicationSubmissionContext. This can be only done at the time when the user submits the job. We should have access to this info from the ApplicationReport as well so that we can check whether an app is AM managed or not anytime.
[jira] [Updated] (YARN-3562) unit tests failures and issues found from findbug from earlier ATS checkins
[ https://issues.apache.org/jira/browse/YARN-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Priyank Rastogi updated YARN-3562:
--
Labels: (was: BB2015-05-TBR)

unit tests failures and issues found from findbug from earlier ATS checkins
--
Key: YARN-3562
URL: https://issues.apache.org/jira/browse/YARN-3562
Project: Hadoop YARN
Issue Type: Sub-task
Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Naganarasimha G R
Priority: Minor
Fix For: YARN-2928
Attachments: YARN-3562-YARN-2928.001.patch, YARN-3562-YARN-2928.002.patch, YARN-3562-YARN-2928.003.patch

*Issues reported from MAPREDUCE-6337* :
A bunch of MR unit tests are failing on our branch whenever the mini YARN cluster needs to bring up multiple node managers.
For example, see https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5472/testReport/org.apache.hadoop.mapred/TestClusterMapReduceTestCase/testMapReduceRestarting/
This is because the NMCollectorService uses a fixed port (8048) for the RPC.

*Issues reported from YARN-3044* :
Test case failures and tool (FB, CS) issues found:
# find bugs issue : Comparison of String objects using == or != in ResourceTrackerService.updateAppCollectorsMap
# find bugs issue : Boxing/unboxing to parse a primitive in RMTimelineCollectorManager.postPut. Called method Long.longValue(); should call Long.parseLong(String) instead.
# find bugs issue : DM_DEFAULT_ENCODING Called method new java.io.FileWriter(String, boolean) At FileSystemTimelineWriterImpl.java:\[line 86\]
# hadoop.yarn.server.resourcemanager.TestAppManager, hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions, hadoop.yarn.server.resourcemanager.TestClientRMService, hadoop.yarn.server.resourcemanager.logaggregationstatus.TestRMAppLogAggregationStatus, refer https://builds.apache.org/job/PreCommit-YARN-Build/7534/testReport/
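The FindBugs items and the fixed-port problem listed in YARN-3562 each have a common remedy in plain Java. The sketch below shows those remedies using only JDK classes; it is not taken from the attached patches, and the class and method names ({{FindbugsFixSketch}}, {{sameValue}}, {{parseLongValue}}, {{appendUtf8}}, {{pickEphemeralPort}}) are hypothetical. Binding to port 0 is assumed here as one possible way to avoid a fixed port; the report does not say which approach was taken.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.net.ServerSocket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Objects;

public class FindbugsFixSketch {

    /** Compares String contents (null-safe) instead of references with == or !=. */
    static boolean sameValue(String a, String b) {
        return Objects.equals(a, b);
    }

    /** Parses straight to a primitive, avoiding Long.valueOf(text).longValue(). */
    static long parseLongValue(String text) {
        return Long.parseLong(text);
    }

    /** Appends text with an explicit charset rather than the platform default. */
    static void appendUtf8(Path file, String text) {
        try (Writer writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Asks the OS for a currently free port by binding to port 0.
     * The socket is closed before returning, so a real service would
     * normally bind to port 0 itself and then report the port it got.
     */
    static int pickEphemeralPort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Letting each node manager bind to an OS-assigned port means several of them can start in the same mini cluster without colliding on one fixed port number.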
[jira] [Updated] (YARN-3385) Race condition: KeeperException$NoNodeException will cause RM shutdown during ZK node deletion.
[ https://issues.apache.org/jira/browse/YARN-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Priyank Rastogi updated YARN-3385:
--
Labels: (was: BB2015-05-TBR)

Race condition: KeeperException$NoNodeException will cause RM shutdown during ZK node deletion.
--
Key: YARN-3385
URL: https://issues.apache.org/jira/browse/YARN-3385
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
Fix For: 2.7.1
Attachments: YARN-3385.000.patch, YARN-3385.001.patch, YARN-3385.002.patch, YARN-3385.003.patch, YARN-3385.004.patch

Race condition: KeeperException$NoNodeException will cause RM shutdown during ZK node deletion (Op.delete).
The race condition is similar to YARN-3023. Since the race condition exists for ZK node creation, it should also exist for ZK node deletion.
We see this issue with the following stack trace:
{code}
2015-03-17 19:18:58,958 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type STATE_STORE_OP_FAILED.
Cause: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:945)
    at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:911)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:857)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:854)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:973)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:992)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:854)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.removeApplicationStateInternal(ZKRMStateStore.java:647)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:691)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:766)
    at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:761)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
    at java.lang.Thread.run(Thread.java:745)
2015-03-17 19:18:58,959 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
{code}
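One general way to make a delete survive the kind of race reported in YARN-3385 is to treat "node already gone" as success, so that a repeated or late delete does not escalate into a fatal error. The Java sketch below shows that idea against a hypothetical in-memory store. {{NodeStore}}, its nested {{NoNodeException}}, and {{deleteIfPresent}} are illustrative names only; they are not the ZooKeeper client API, and this is not the change made in the attached patches.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentDeleteSketch {

    /** Hypothetical exception, standing in for a "no such node" error. */
    static final class NoNodeException extends Exception {
        NoNodeException(String path) {
            super("no node: " + path);
        }
    }

    /** Hypothetical in-memory node store; not the ZooKeeper client. */
    static final class NodeStore {
        private final Set<String> nodes = ConcurrentHashMap.newKeySet();

        void create(String path) {
            nodes.add(path);
        }

        /** Strict delete: fails if the node does not exist. */
        void delete(String path) throws NoNodeException {
            if (!nodes.remove(path)) {
                throw new NoNodeException(path);
            }
        }

        boolean exists(String path) {
            return nodes.contains(path);
        }
    }

    /**
     * Deletes the node if it is there. A missing node is treated as
     * "already deleted" instead of being propagated as a failure.
     *
     * @return true if this call removed the node, false if it was already gone
     */
    static boolean deleteIfPresent(NodeStore store, String path) {
        try {
            store.delete(path);
            return true;
        } catch (NoNodeException e) {
            return false; // the desired end state (node absent) already holds
        }
    }

    public static void main(String[] args) {
        NodeStore store = new NodeStore();
        store.create("/app_1");
        System.out.println(deleteIfPresent(store, "/app_1")); // prints true
        System.out.println(deleteIfPresent(store, "/app_1")); // prints false
    }
}
```

The second call models a delete that arrives after the node has already been removed: the caller ends up in the state it wanted, so there is nothing to escalate.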