[jira] [Commented] (YARN-369) Handle (or throw a proper error when receiving) status updates from application masters that have not registered
[ https://issues.apache.org/jira/browse/YARN-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700365#comment-13700365 ]

Bikas Saha commented on YARN-369:
----------------------------------

Some comments:

This javadoc probably needs to change, since the exception is also used for double registration. Maybe just say invalid AM request exception.
{code}
+/**
+ * The exception is thrown when an application Master call allocate without
+ * calling RegisterApplicationMaster.
+ */
{code}

Not quite sure whether this will break other tests or not. Do other tests that use this method continue to pass with this change? We could create a different registerAppAttempt() that does not wait for the LAUNCHED state, and the current registerAppAttempt() could wait and then call the new one.
{code}
 public RegisterApplicationMasterResponse registerAppAttempt() throws Exception {
-    waitForState(RMAppAttemptState.LAUNCHED);
{code}

The test has some code with an author name in it. Please remove it.

Handle (or throw a proper error when receiving) status updates from application masters that have not registered
------------------------------------------------------------------------------------------------------------------
                Key: YARN-369
                URL: https://issues.apache.org/jira/browse/YARN-369
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: resourcemanager
   Affects Versions: 2.0.3-alpha, trunk-win
           Reporter: Hitesh Shah
           Assignee: Mayank Bansal
        Attachments: YARN-369.patch, YARN-369-trunk-1.patch, YARN-369-trunk-2.patch, YARN-369-trunk-3.patch

Currently, an allocate call from an unregistered application is allowed, and the status update for it throws a state-machine error that is silently dropped.

org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: STATUS_UPDATE at LAUNCHED
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:588)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:99)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:471)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:452)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
        at java.lang.Thread.run(Thread.java:680)

ApplicationMasterService should likely throw an appropriate error for application requests that should not be handled in such cases.
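A minimal sketch of the kind of guard being discussed for YARN-369: reject allocate() from an attempt that never registered, and use one "invalid AM request" exception for both that case and double registration. The class and exception names here are illustrative assumptions, not the actual ApplicationMasterService code or the patch under review.
{code}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class AllocateGuardSketch {

  // Hypothetical exception covering any invalid AM request
  // (allocate before register, double registration, ...).
  static class InvalidApplicationMasterRequestException extends Exception {
    InvalidApplicationMasterRequestException(String msg) { super(msg); }
  }

  private final Set<String> registeredAttempts = ConcurrentHashMap.newKeySet();

  void registerApplicationMaster(String attemptId)
      throws InvalidApplicationMasterRequestException {
    if (!registeredAttempts.add(attemptId)) {
      throw new InvalidApplicationMasterRequestException(
          "Application Master is already registered: " + attemptId);
    }
  }

  void allocate(String attemptId) throws InvalidApplicationMasterRequestException {
    if (!registeredAttempts.contains(attemptId)) {
      throw new InvalidApplicationMasterRequestException(
          "Application Master is trying to allocate before registering: " + attemptId);
    }
    // ... proceed with normal allocate handling ...
  }
}
{code}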
[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700371#comment-13700371 ]

Bikas Saha commented on YARN-353:
----------------------------------

I don't think it makes sense to have a default value for this. The ZK location is not something we control, and we cannot assume it is running at some default location. The commented value in the default.xml file is just a syntax example.
{code}
+  public static final String DEFAULT_ZK_RM_STATE_STORE_ADDRESS =
+      "127.0.0.1:2181";
{code}

Wherever we are doing multiple operations, we should probably use the ZK multi APIs to guarantee atomic operations.
{code}
+        + latestSequenceNumber);
+    try {
+      if (dtSequenceNumberPath != null) {
+        deleteWithRetries(dtSequenceNumberPath, 0);
+      }
+      createWithRetries(latestSequenceNumberPath, null, zkAcl,
+          CreateMode.PERSISTENT);
+    } catch (Exception e) {
+      LOG.info("Error in storing " + dtSequenceNumberPath);
+      throw e;
+    }
+    dtSequenceNumberPath = latestSequenceNumberPath;
{code}

Add Zookeeper-based store implementation for RMStateStore
----------------------------------------------------------
                Key: YARN-353
                URL: https://issues.apache.org/jira/browse/YARN-353
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: resourcemanager
           Reporter: Hitesh Shah
           Assignee: Bikas Saha
        Attachments: YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch

Add a store that writes RM state data to ZK.
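A hedged sketch of the ZK multi API suggestion above: submit the delete of the old sequence-number node and the create of the new one as a single multi() transaction so they commit atomically. The method, paths, and ZooKeeper handle are placeholders, not the RMStateStore fields or the retry helpers used in the patch.
{code}
import java.util.ArrayList;
import java.util.List;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Op;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkMultiSketch {
  public static void storeSequenceNumber(ZooKeeper zk, String oldPath, String newPath)
      throws KeeperException, InterruptedException {
    List<Op> ops = new ArrayList<>();
    if (oldPath != null) {
      ops.add(Op.delete(oldPath, -1));          // -1: match any version
    }
    ops.add(Op.create(newPath, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE,
        CreateMode.PERSISTENT));
    zk.multi(ops);                              // all ops succeed or none do
  }
}
{code}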
[jira] [Commented] (YARN-845) RM crash with NPE on NODE_UPDATE
[ https://issues.apache.org/jira/browse/YARN-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700372#comment-13700372 ]

Bikas Saha commented on YARN-845:
----------------------------------

Looks good. +1.

RM crash with NPE on NODE_UPDATE
--------------------------------
                Key: YARN-845
                URL: https://issues.apache.org/jira/browse/YARN-845
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: resourcemanager
   Affects Versions: 3.0.0, 2.1.0-beta
           Reporter: Arpit Gupta
           Assignee: Mayank Bansal
        Attachments: rm.log, YARN-845-trunk-1.patch, YARN-845-trunk-draft.patch

The following stack trace is generated in the RM:
{code}
n, service: 68.142.246.147:45454 }, ] resource=memory:1536, vCores:1 queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=memory:44544, vCores:29 usedCapacity=0.90625, absoluteUsedCapacity=0.90625, numApps=1, numContainers=29 usedCapacity=0.90625 absoluteUsedCapacity=0.90625 used=memory:44544, vCores:29 cluster=memory:49152, vCores:48
2013-06-17 12:43:53,655 INFO capacity.ParentQueue (ParentQueue.java:completedContainer(696)) - completedContainer queue=root usedCapacity=0.90625 absoluteUsedCapacity=0.90625 used=memory:44544, vCores:29 cluster=memory:49152, vCores:48
2013-06-17 12:43:53,656 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(832)) - Application appattempt_1371448527090_0844_01 released container container_1371448527090_0844_01_05 on node: host: hostXX:45454 #containers=4 available=2048 used=6144 with event: FINISHED
2013-06-17 12:43:53,656 INFO capacity.CapacityScheduler (CapacityScheduler.java:nodeUpdate(661)) - Trying to fulfill reservation for application application_1371448527090_0844 on node: hostXX:45454
2013-06-17 12:43:53,656 INFO fica.FiCaSchedulerApp (FiCaSchedulerApp.java:unreserve(435)) - Application application_1371448527090_0844 unreserved on node host: hostXX:45454 #containers=4 available=2048 used=6144, currently has 4 at priority 20; currentReservation memory:6144, vCores:4
2013-06-17 12:43:53,656 INFO scheduler.AppSchedulingInfo (AppSchedulingInfo.java:updateResourceRequests(168)) - checking for deactivate...
2013-06-17 12:43:53,657 FATAL resourcemanager.ResourceManager (ResourceManager.java:run(422)) - Error in handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.unreserve(FiCaSchedulerApp.java:432)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.unreserve(LeafQueue.java:1416)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1346)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1221)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1180)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignReservedContainer(LeafQueue.java:939)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:803)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:665)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:727)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:83)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:413)
        at java.lang.Thread.run(Thread.java:662)
2013-06-17 12:43:53,659 INFO resourcemanager.ResourceManager (ResourceManager.java:run(426)) - Exiting, bbye..
2013-06-17 12:43:53,665 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped SelectChannelConnector@hostXX:8088
2013-06-17 12:43:53,765 ERROR delegation.AbstractDelegationTokenSecretManager (AbstractDelegationTokenSecretManager.java:run(513)) - InterruptedExcpetion recieved for ExpiredTokenRemover thread java.lang.InterruptedException: sleep interrupted
2013-06-17 12:43:53,766 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(200)) - Stopping ResourceManager metrics system...
2013-06-17 12:43:53,767 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(206)) - ResourceManager metrics system stopped.
2013-06-17 12:43:53,767 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(572)) - ResourceManager metrics system shutdown complete.
2013-06-17 12:43:53,768 WARN
[jira] [Commented] (YARN-763) AMRMClientAsync should stop heartbeating after receiving shutdown from RM
[ https://issues.apache.org/jira/browse/YARN-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700377#comment-13700377 ]

Bikas Saha commented on YARN-763:
----------------------------------

By the time the callback thread handles the shutdown request, the heartbeat thread may have already pinged the RM multiple times, and we should ideally avoid that, since each time the RM will end up sending it a resync/shutdown or might fail it. Ideally, the heartbeater thread should check the command and stop as needed so that there are no subsequent heartbeats.

Not quite clear what the test is testing. The thing to be tested is that the heartbeater thread should not make an allocate call after the RM has sent it a shutdown command. I don't quite see anything that verifies this behavior. Secondly, there is a lot of probably unnecessary code in the test. I don't think multiple responses after shutdown or mocking client.getAvailableResources() is required.
{code}
+    final AllocateResponse response1 = createAllocateResponse(
+        new ArrayList<ContainerStatus>(), allocated1, null);
+    final AllocateResponse response2 = createAllocateResponse(completed1,
+        new ArrayList<Container>(), null);
+    final AllocateResponse shutDownResponse = createAllocateResponse(
+        new ArrayList<ContainerStatus>(), new ArrayList<Container>(), null);
+    shutDownResponse.setAMCommand(AMCommand.AM_SHUTDOWN);
+
+    TestCallbackHandler callbackHandler = new TestCallbackHandler();
+    final AMRMClient<ContainerRequest> client = mock(AMRMClientImpl.class);
+    when(client.allocate(anyFloat())).thenReturn(shutDownResponse)
+        .thenReturn(response1).thenReturn(response2);
+
+    when(client.registerApplicationMaster(anyString(), anyInt(), anyString()))
+        .thenReturn(null);
+    when(client.getAvailableResources()).thenAnswer(new Answer<Resource>() {
+      @Override
+      public Resource answer(InvocationOnMock invocation)
+          throws Throwable {
+        // take client lock to simulate behavior of real impl
+        synchronized (client) {
+          Thread.sleep(10);
+        }
+        return null;
+      }
+    });
{code}

On a different note, serviceStop() should not call join() on the heartbeater thread. While serviceStop() blocks on the join() it may be holding application locks in its call tree, and the callback thread might be waiting on those locks as it upcalls into the app code, resulting in a deadlock. However, we should ensure the JVM is not hung because of any issue on this thread, so we should mark the callback thread as a daemon so that the JVM exits even if that thread is still running.

AMRMClientAsync should stop heartbeating after receiving shutdown from RM
--------------------------------------------------------------------------
                Key: YARN-763
                URL: https://issues.apache.org/jira/browse/YARN-763
            Project: Hadoop YARN
         Issue Type: Bug
           Reporter: Bikas Saha
           Assignee: Xuan Gong
        Attachments: YARN-763.1.patch, YARN-763.2.patch
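A minimal sketch of the two points raised above, not the actual AMRMClientAsync code: the heartbeater stops itself as soon as the RM replies with a shutdown/resync command, and the callback thread is marked as a daemon so a stuck upcall cannot keep the JVM alive. The Rm interface and the string commands are stand-ins for the real AMRMClient and AMCommand types.
{code}
import java.util.concurrent.BlockingQueue;

class HeartbeatSketch {
  interface Rm { String allocate(); }            // stand-in for AMRMClient.allocate()

  private volatile boolean keepRunning = true;

  void start(Rm rm, BlockingQueue<String> responses) {
    Thread heartbeater = new Thread(() -> {
      while (keepRunning) {
        String response = rm.allocate();
        if ("AM_SHUTDOWN".equals(response) || "AM_RESYNC".equals(response)) {
          keepRunning = false;                   // no further heartbeats after the command
        }
        responses.offer(response);               // hand off to the callback thread
      }
    });
    Thread callback = new Thread(() -> {
      while (true) {
        try {
          handle(responses.take());              // may block inside application code
        } catch (InterruptedException e) {
          return;
        }
      }
    });
    callback.setDaemon(true);                    // JVM can exit even if a callback hangs
    heartbeater.start();
    callback.start();
  }

  void handle(String response) { /* upcall into application code */ }
}
{code}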
[jira] [Commented] (YARN-845) RM crash with NPE on NODE_UPDATE
[ https://issues.apache.org/jira/browse/YARN-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700380#comment-13700380 ]

Hudson commented on YARN-845:
------------------------------

Integrated in Hadoop-trunk-Commit #4043 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4043/])
YARN-845. RM crash with NPE on NODE_UPDATE (Mayank Bansal via bikas) (Revision 1499886)

Result = SUCCESS
bikas : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1499886
Files :
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java

RM crash with NPE on NODE_UPDATE
--------------------------------
                Key: YARN-845
                URL: https://issues.apache.org/jira/browse/YARN-845
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: resourcemanager
   Affects Versions: 3.0.0, 2.1.0-beta
           Reporter: Arpit Gupta
           Assignee: Mayank Bansal
        Attachments: rm.log, YARN-845-trunk-1.patch, YARN-845-trunk-draft.patch

The following stack trace is generated in the RM:
{code}
n, service: 68.142.246.147:45454 }, ] resource=memory:1536, vCores:1 queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=memory:44544, vCores:29 usedCapacity=0.90625, absoluteUsedCapacity=0.90625, numApps=1, numContainers=29 usedCapacity=0.90625 absoluteUsedCapacity=0.90625 used=memory:44544, vCores:29 cluster=memory:49152, vCores:48
2013-06-17 12:43:53,655 INFO capacity.ParentQueue (ParentQueue.java:completedContainer(696)) - completedContainer queue=root usedCapacity=0.90625 absoluteUsedCapacity=0.90625 used=memory:44544, vCores:29 cluster=memory:49152, vCores:48
2013-06-17 12:43:53,656 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(832)) - Application appattempt_1371448527090_0844_01 released container container_1371448527090_0844_01_05 on node: host: hostXX:45454 #containers=4 available=2048 used=6144 with event: FINISHED
2013-06-17 12:43:53,656 INFO capacity.CapacityScheduler (CapacityScheduler.java:nodeUpdate(661)) - Trying to fulfill reservation for application application_1371448527090_0844 on node: hostXX:45454
2013-06-17 12:43:53,656 INFO fica.FiCaSchedulerApp (FiCaSchedulerApp.java:unreserve(435)) - Application application_1371448527090_0844 unreserved on node host: hostXX:45454 #containers=4 available=2048 used=6144, currently has 4 at priority 20; currentReservation memory:6144, vCores:4
2013-06-17 12:43:53,656 INFO scheduler.AppSchedulingInfo (AppSchedulingInfo.java:updateResourceRequests(168)) - checking for deactivate...
2013-06-17 12:43:53,657 FATAL resourcemanager.ResourceManager (ResourceManager.java:run(422)) - Error in handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.unreserve(FiCaSchedulerApp.java:432)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.unreserve(LeafQueue.java:1416)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1346)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1221)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1180)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignReservedContainer(LeafQueue.java:939)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:803)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:665)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:727)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:83)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:413)
        at java.lang.Thread.run(Thread.java:662)
2013-06-17 12:43:53,659 INFO resourcemanager.ResourceManager (ResourceManager.java:run(426)) - Exiting, bbye..
2013-06-17 12:43:53,665 INFO mortbay.log (Slf4jLog.java:info(67)) -
[jira] [Commented] (YARN-873) YARNClient.getApplicationReport(unknownAppId) returns a null report
[ https://issues.apache.org/jira/browse/YARN-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700382#comment-13700382 ]

Bikas Saha commented on YARN-873:
----------------------------------

There is probably no concept of an error code in the ApplicationReport object. The only current way for the YarnClient method to signal an error is via an exception or a null report, and a null report is unclear as to what actually happened.

YARNClient.getApplicationReport(unknownAppId) returns a null report
--------------------------------------------------------------------
                Key: YARN-873
                URL: https://issues.apache.org/jira/browse/YARN-873
            Project: Hadoop YARN
         Issue Type: Sub-task
   Affects Versions: 2.1.0-beta
           Reporter: Bikas Saha
           Assignee: Xuan Gong

How can the client find out that the app does not exist?
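A small sketch contrasting the two signalling options mentioned above: a null report leaves the caller guessing, while a dedicated exception carries the cause. The ApplicationNotFoundException name and the map-backed lookup are illustrative assumptions, not the YarnClient API.
{code}
import java.util.Map;

public class UnknownAppSketch {
  static class ApplicationNotFoundException extends Exception {
    ApplicationNotFoundException(String msg) { super(msg); }
  }

  // Option 1: return null -- the caller cannot tell "unknown app" from other causes.
  static String reportOrNull(Map<String, String> apps, String appId) {
    return apps.get(appId);
  }

  // Option 2: throw an explicit exception -- the cause travels with the failure.
  static String reportOrThrow(Map<String, String> apps, String appId)
      throws ApplicationNotFoundException {
    String report = apps.get(appId);
    if (report == null) {
      throw new ApplicationNotFoundException(
          "Application with id '" + appId + "' does not exist.");
    }
    return report;
  }
}
{code}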
[jira] [Commented] (YARN-808) ApplicationReport does not clearly tell that the attempt is running or not
[ https://issues.apache.org/jira/browse/YARN-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700389#comment-13700389 ]

Bikas Saha commented on YARN-808:
----------------------------------

We should probably expose the state of the app attempt. We probably need a translation from the internal app attempt state so that we don't expose the internal state machine state. [~vinodkv] Any other ideas?

ApplicationReport does not clearly tell that the attempt is running or not
---------------------------------------------------------------------------
                Key: YARN-808
                URL: https://issues.apache.org/jira/browse/YARN-808
            Project: Hadoop YARN
         Issue Type: Bug
   Affects Versions: 2.1.0-beta
           Reporter: Bikas Saha
           Assignee: Xuan Gong

When an app attempt fails and is being retried, ApplicationReport immediately gives the new attemptId and non-null values of host etc. There is no way for clients to know whether the attempt is running other than connecting to it and timing out on an invalid host. A solution would be to expose the attempt state or return a null value for host instead of "N/A".
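A hedged sketch of the translation being suggested: map the RM's internal attempt states onto a small public enum so a report can say whether the current attempt is actually running, without leaking the internal state machine. Both enums and the grouping chosen here are illustrative assumptions, not the real YARN types.
{code}
public class AttemptStateSketch {
  enum InternalAttemptState { NEW, SUBMITTED, SCHEDULED, ALLOCATED, LAUNCHED,
    RUNNING, FINISHING, FINISHED, FAILED, KILLED }

  enum PublicAttemptState { SCHEDULED, LAUNCHED, RUNNING, FINISHED, FAILED, KILLED }

  // Collapse internal-only states into the coarser public view.
  static PublicAttemptState translate(InternalAttemptState s) {
    switch (s) {
      case NEW:
      case SUBMITTED:
      case SCHEDULED:
      case ALLOCATED:  return PublicAttemptState.SCHEDULED;
      case LAUNCHED:   return PublicAttemptState.LAUNCHED;
      case RUNNING:
      case FINISHING:  return PublicAttemptState.RUNNING;
      case FINISHED:   return PublicAttemptState.FINISHED;
      case FAILED:     return PublicAttemptState.FAILED;
      default:         return PublicAttemptState.KILLED;
    }
  }
}
{code}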
[jira] [Commented] (YARN-818) YARN application classpath should add $PWD/* in addition to $PWD
[ https://issues.apache.org/jira/browse/YARN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700394#comment-13700394 ]

Bikas Saha commented on YARN-818:
----------------------------------

In the YARN application classpath we should include only the YARN client and API jars instead of every jar in YARN. [~vinodkv] any comments?

YARN application classpath should add $PWD/* in addition to $PWD
------------------------------------------------------------------
                Key: YARN-818
                URL: https://issues.apache.org/jira/browse/YARN-818
            Project: Hadoop YARN
         Issue Type: Sub-task
           Reporter: Bikas Saha
           Assignee: Jian He
        Attachments: YARN-818.patch
[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700395#comment-13700395 ]

Jian He commented on YARN-353:
-------------------------------

bq. I dont think it makes sense to have default value for this. ZK location is not something we control and we cannot assume it to be running on some default location.

Yes, we cannot assume which location ZK is running on, but I think the result would be the same whether we provide a default or leave it empty: both cases should raise a connect exception or something similar, which leads the user to configure the real address. One bonus of providing a default is that it might make things easier for users in test mode, where ZK is running on its defaults. Your opinion?

Add Zookeeper-based store implementation for RMStateStore
----------------------------------------------------------
                Key: YARN-353
                URL: https://issues.apache.org/jira/browse/YARN-353
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: resourcemanager
           Reporter: Hitesh Shah
           Assignee: Bikas Saha
        Attachments: YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch

Add a store that writes RM state data to ZK.
[jira] [Commented] (YARN-881) Priority#compareTo method seems to be wrong.
[ https://issues.apache.org/jira/browse/YARN-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700397#comment-13700397 ]

Bikas Saha commented on YARN-881:
----------------------------------

That internal code should probably create its own comparator. The compareTo() method on this class is user facing, and it is inconsistent for users to see compareTo() returning results that are opposite to the declared ordering of priorities in YARN. [~vinodkv] - what do you think?

Priority#compareTo method seems to be wrong.
--------------------------------------------
                Key: YARN-881
                URL: https://issues.apache.org/jira/browse/YARN-881
            Project: Hadoop YARN
         Issue Type: Bug
           Reporter: Jian He
           Assignee: Jian He

If a lower int value means higher priority, shouldn't we return other.getPriority() - this.getPriority()?
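A minimal sketch of the separation suggested above, under the assumption that a lower integer value means higher priority: keep the user-facing compareTo() consistent with that declared ordering, and give internal code its own Comparator if it needs the raw ascending-int order. The Priority class here is a stand-in, not the real org.apache.hadoop.yarn.api type.
{code}
import java.util.Comparator;

public class PrioritySketch {
  static class Priority implements Comparable<Priority> {
    private final int value;                  // lower value means higher priority
    Priority(int value) { this.value = value; }
    int getPriority() { return value; }

    @Override
    public int compareTo(Priority other) {
      // Higher priority (smaller int) sorts as "greater", matching the declared ordering.
      return other.getPriority() - this.getPriority();
    }
  }

  // Internal code that wants plain ascending-int order builds its own comparator.
  static final Comparator<Priority> BY_RAW_VALUE =
      Comparator.comparingInt(Priority::getPriority);
}
{code}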
[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700398#comment-13700398 ]

Bikas Saha commented on YARN-353:
----------------------------------

No. It must be required for the user to specify this. We cannot assume some random address if the user has not specified a value. The code should throw an exception if this is not specified.

Add Zookeeper-based store implementation for RMStateStore
----------------------------------------------------------
                Key: YARN-353
                URL: https://issues.apache.org/jira/browse/YARN-353
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: resourcemanager
           Reporter: Hitesh Shah
           Assignee: Bikas Saha
        Attachments: YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch

Add a store that writes RM state data to ZK.
[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700422#comment-13700422 ]

Jian He commented on YARN-353:
-------------------------------

Any downside of doing that?

Add Zookeeper-based store implementation for RMStateStore
----------------------------------------------------------
                Key: YARN-353
                URL: https://issues.apache.org/jira/browse/YARN-353
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: resourcemanager
           Reporter: Hitesh Shah
           Assignee: Bikas Saha
        Attachments: YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch

Add a store that writes RM state data to ZK.
[jira] [Created] (YARN-901) Active users field in Resourcemanager scheduler UI gives negative values
Nishan Shetty created YARN-901:
-----------------------------------

            Summary: Active users field in Resourcemanager scheduler UI gives negative values
                Key: YARN-901
                URL: https://issues.apache.org/jira/browse/YARN-901
            Project: Hadoop YARN
         Issue Type: Bug
         Components: scheduler
   Affects Versions: 2.0.5-alpha
           Reporter: Nishan Shetty
           Priority: Minor

The Active users field in the Resourcemanager scheduler UI gives negative values after a Resourcemanager restart while a job is in progress.
[jira] [Created] (YARN-902) Used Resources field in Resourcemanager scheduler UI not displaying any values
Nishan Shetty created YARN-902:
-----------------------------------

            Summary: Used Resources field in Resourcemanager scheduler UI not displaying any values
                Key: YARN-902
                URL: https://issues.apache.org/jira/browse/YARN-902
            Project: Hadoop YARN
         Issue Type: Bug
         Components: scheduler
   Affects Versions: 2.0.5-alpha
           Reporter: Nishan Shetty
           Priority: Minor

The Used Resources field in the Resourcemanager scheduler UI is not displaying any values.
[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700440#comment-13700440 ]

Bikas Saha commented on YARN-353:
----------------------------------

Downside of doing what? Throwing a clear exception will alert the user that the address is not configured, and so the RM will not start.

Add Zookeeper-based store implementation for RMStateStore
----------------------------------------------------------
                Key: YARN-353
                URL: https://issues.apache.org/jira/browse/YARN-353
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: resourcemanager
           Reporter: Hitesh Shah
           Assignee: Bikas Saha
        Attachments: YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch

Add a store that writes RM state data to ZK.
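A hedged sketch of the behavior this thread converges on: if the user has not configured the ZooKeeper address, fail fast at startup instead of assuming a default. The config key name, the Properties-based lookup, and the exception type are placeholders, not the actual RMStateStore configuration code.
{code}
import java.util.Properties;

public class ZkAddressCheckSketch {
  static String requireZkAddress(Properties conf) {
    String zkAddress = conf.getProperty("yarn.resourcemanager.zk-state-store.address");
    if (zkAddress == null || zkAddress.trim().isEmpty()) {
      // No default: refuse to start rather than guess an address.
      throw new IllegalStateException(
          "No ZooKeeper address configured for the ZK RM state store; "
          + "the address must be set explicitly.");
    }
    return zkAddress;
  }
}
{code}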
[jira] [Commented] (YARN-901) Active users field in Resourcemanager scheduler UI gives negative values
[ https://issues.apache.org/jira/browse/YARN-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700445#comment-13700445 ]

rohithsharma commented on YARN-901:
------------------------------------

Active users shows a negative value after an RM restart. The active-user value is calculated on the APP_ADDED event and recalculated on the APP_REMOVED event. If the RM is restarted after a job has been submitted, the recalculation leads to a negative value. The problem is the in-memory storage of user info at each queue, which is reset during RM startup.

Active users field in Resourcemanager scheduler UI gives negative values
--------------------------------------------------------------------------
                Key: YARN-901
                URL: https://issues.apache.org/jira/browse/YARN-901
            Project: Hadoop YARN
         Issue Type: Bug
         Components: scheduler
   Affects Versions: 2.0.5-alpha
           Reporter: Nishan Shetty
           Priority: Minor

The Active users field in the Resourcemanager scheduler UI gives negative values after a Resourcemanager restart while a job is in progress.
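A tiny illustration of the mismatch described above (not the actual scheduler code): the per-queue active-user counter lives only in memory, so a restart resets it to zero, and the APP_REMOVED decrement for an app added before the restart drives it negative.
{code}
public class ActiveUsersSketch {
  private int activeUsers = 0;                  // in-memory only, lost on RM restart

  void appAdded()   { activeUsers++; }
  void appRemoved() { activeUsers--; }

  public static void main(String[] args) {
    ActiveUsersSketch beforeRestart = new ActiveUsersSketch();
    beforeRestart.appAdded();                   // activeUsers == 1

    ActiveUsersSketch afterRestart = new ActiveUsersSketch();  // counter reset to 0
    afterRestart.appRemoved();                  // activeUsers == -1 -> shown in the UI
    System.out.println(afterRestart.activeUsers);
  }
}
{code}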
[jira] [Commented] (YARN-818) YARN application classpath should add $PWD/* in addition to $PWD
[ https://issues.apache.org/jira/browse/YARN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700444#comment-13700444 ]

Alejandro Abdelnur commented on YARN-818:
------------------------------------------

agree with Bikas.

YARN application classpath should add $PWD/* in addition to $PWD
------------------------------------------------------------------
                Key: YARN-818
                URL: https://issues.apache.org/jira/browse/YARN-818
            Project: Hadoop YARN
         Issue Type: Sub-task
           Reporter: Bikas Saha
           Assignee: Jian He
        Attachments: YARN-818.patch
[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700465#comment-13700465 ]

Jian He commented on YARN-353:
-------------------------------

Sorry, I meant the downside of giving a default ZK address. Yeah, throwing an exception would be clear.

Add Zookeeper-based store implementation for RMStateStore
----------------------------------------------------------
                Key: YARN-353
                URL: https://issues.apache.org/jira/browse/YARN-353
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: resourcemanager
           Reporter: Hitesh Shah
           Assignee: Bikas Saha
        Attachments: YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch

Add a store that writes RM state data to ZK.
[jira] [Commented] (YARN-902) Used Resources field in Resourcemanager scheduler UI not displaying any values
[ https://issues.apache.org/jira/browse/YARN-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700466#comment-13700466 ]

Sandy Ryza commented on YARN-902:
----------------------------------

[~nishan], which scheduler is this occurring for you with?

Used Resources field in Resourcemanager scheduler UI not displaying any values
--------------------------------------------------------------------------------
                Key: YARN-902
                URL: https://issues.apache.org/jira/browse/YARN-902
            Project: Hadoop YARN
         Issue Type: Bug
         Components: scheduler
   Affects Versions: 2.0.5-alpha
           Reporter: Nishan Shetty
           Priority: Minor

The Used Resources field in the Resourcemanager scheduler UI is not displaying any values.
[jira] [Updated] (YARN-521) Augment AM - RM client module to be able to request containers only at specific locations
[ https://issues.apache.org/jira/browse/YARN-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-521:
-----------------------------

    Attachment: YARN-521.patch

Augment AM - RM client module to be able to request containers only at specific locations
-------------------------------------------------------------------------------------------
                Key: YARN-521
                URL: https://issues.apache.org/jira/browse/YARN-521
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: api
   Affects Versions: 2.0.3-alpha
           Reporter: Sandy Ryza
           Assignee: Sandy Ryza
        Attachments: YARN-521.patch

When YARN-392 and YARN-398 are completed, it would be good for AMRMClient to offer an easy way to access their functionality.