[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763470#comment-13763470 ]

Karthik Kambatla commented on YARN-1027:

Did some testing with several transitions to Standby and Active back and forth, and ran MR jobs when in Active mode.
# The Standby mode (389719 objects worth 46661952 bytes) indeed has fewer objects and uses less memory compared to the Active mode (399819 objects worth 50104584 bytes).
# The applicationId keeps the timestamp from when the RM started, and starts issuing ids from 1. This leads to issues ranging from client-side failures due to entries in .staging/ to jobs hanging. Once enough jobs are killed, subsequent jobs can be run as usual. To address this, I think it is safe to reset the timestamp to when the RM becomes Active.
# The WebUI behaves as expected.

Regarding more involved tests, I was thinking of writing a MiniYARNCluster-based one that checks if the RPC servers are shut down in Standby mode. We can check if a client can request an applicationId etc. Is it okay for these tests to live in hadoop-yarn-client? Or would it make sense to create a separate module for such end-to-end tests, including future HA tests, stress tests etc.?

Implement RMHAServiceProtocol
    Key: YARN-1027
    URL: https://issues.apache.org/jira/browse/YARN-1027
    Project: Hadoop YARN
    Issue Type: Sub-task
    Reporter: Bikas Saha
    Assignee: Karthik Kambatla
    Attachments: test-yarn-1027.patch, yarn-1027-1.patch, yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-4.patch, yarn-1027-5.patch, yarn-1027-including-yarn-1098-3.patch, yarn-1027-in-rm-poc.patch

Implement existing HAServiceProtocol from Hadoop common. This protocol is the single point of interaction between the RM and HA clients/services.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
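The applicationId collision described above can be illustrated with a minimal sketch. AppIdSource and onBecomeActive() are hypothetical names invented for this illustration and are not classes from the patch; the only assumption taken from the thread is that resetting the timestamp on each transition to Active gives the restarted sequence a fresh id namespace.

```java
// Hypothetical stand-in for the RM's application id source; not the real YARN class.
class AppIdSource {
    private long clusterTimestamp;
    private int sequence = 0;

    AppIdSource(long startTimestamp) {
        this.clusterTimestamp = startTimestamp;
    }

    // Proposed fix from the comment: take a new timestamp whenever the RM becomes
    // Active, so ids issued after the transition cannot repeat earlier ones.
    synchronized void onBecomeActive(long nowMillis) {
        clusterTimestamp = nowMillis;
        sequence = 0;
    }

    // Ids follow the application_<timestamp>_<sequence> shape.
    synchronized String nextId() {
        sequence++;
        return "application_" + clusterTimestamp + "_" + String.format("%04d", sequence);
    }
}
```

Without the timestamp reset in onBecomeActive(), the second Active period would issue the same first id as the first one, which matches the stale .staging/ symptom reported in the comment.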
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762557#comment-13762557 ]

Karthik Kambatla commented on YARN-1027:

bq. What happens if we call this method when the RM is in standby mode? I am wondering if we may be able to call this during that time and verify that the RM is indeed not active.

These particular MockRM methods work on any inited RM, even one in standby mode. The tests for the Standby mode should be on a MiniYARNCluster. Will try to work those in.
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761524#comment-13761524 ]

Bikas Saha commented on YARN-1027:

This code is confusing. The state shouldn't be INITIALIZING if the service is stopped. Can we open a jira to add a meaningful state to HAServiceState and refer to that jira in this code so that we can fix it in that jira too?
{code}
+  public void serviceStop() throws Exception {
+    // Stop all services
+    transitionToStandby();
+
+    // Update haState as RM can no longer be active
+    haState = HAServiceState.INITIALIZING;
+    super.serviceStop();
{code}

Let's not leave orphan TODOs in the code. Please refer to YARN-1068 or open a new jira.
{code}
+    // TODO: When automatic failover is enabled, check if transition should be
+    // allowed for this request
{code}

In transitionToStandby() we are changing state to STANDBY after stopping all services. This is fine for now. We must keep this in mind later on when we start having HA-aware alwaysOn services. They need to stop signalling the ActiveServices before we stop them. E.g. RPC services would need to start rejecting requests before we stop the activeServices.

createAndStartActiveServices() and related methods should be package visibility and not protected. Protected would mean that we intend a derived class to see these methods too.

Is the commented code going to be uncommented or removed? The code is valid and should work in an active state. So it should probably be uncommented. What happens if we call this method when the RM is in standby mode? I am wondering if we may be able to call this during that time and verify that the RM is indeed not active.
{code}
+  private void checkActiveRMisFunctional() {
+    try {
+      rm.getNewAppId();
+//      rm.registerNode(node1, 2048);
+      rm.submitApp(1024);
{code}

Locking on the RMHAServiceProtocol is confusing. Some public methods are synchronized while others are not. Will these lead to race conditions in the future? How about we make them all public synchronized, since they are not expected to be high performance and so heavy locking is fine. A caveat to this would be if ZKFC expects getServiceState()/monitorHealth() to work even while the service is transitioning to active/standby. Again, that probably doesn't matter if these operations happen in a reasonably short time.

Depending on your conclusion in YARN-1077 we can keep the approach in 1027-3 or 1027-4 wrt RMHAServiceProtocol being always present or not.

Patch is close to being ready for commit! The main thing to verify (even if manually) is whether the ActiveService objects are being GC'd correctly.
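Two of the review points above, uniform locking on the public methods and a dedicated state for a stopped service, can be sketched as follows. HaStateHolder and its STOPPED value are assumptions made for illustration only; the patch under review uses HAServiceState from Hadoop common and this sketch does not reproduce it.

```java
// Hypothetical sketch: State mirrors the idea of HAServiceState, and STOPPED is an
// assumed extra value, not something the patch under review defines.
class HaStateHolder {
    enum State { INITIALIZING, STANDBY, ACTIVE, STOPPED }

    private State state = State.INITIALIZING;
    private boolean activeServicesRunning = false;

    // Every public method takes the same lock, so a reader never observes a
    // half-finished transition.
    synchronized void transitionToActive() {
        ensureNotStopped();
        if (state != State.ACTIVE) {
            activeServicesRunning = true;   // stand-in for starting the active services
            state = State.ACTIVE;
        }
    }

    synchronized void transitionToStandby() {
        ensureNotStopped();
        activeServicesRunning = false;      // stand-in for stopping the active services
        state = State.STANDBY;
    }

    synchronized void stop() {
        activeServicesRunning = false;
        state = State.STOPPED;              // terminal state instead of reusing INITIALIZING
    }

    synchronized State getServiceState() {
        return state;
    }

    synchronized boolean areActiveServicesRunning() {
        return activeServicesRunning;
    }

    private void ensureNotStopped() {
        if (state == State.STOPPED) {
            throw new IllegalStateException("service is stopped");
        }
    }
}
```

The trade-off named in the review still applies: with a single lock, getServiceState() blocks while a transition is in progress, which is acceptable only if transitions are short.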
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759973#comment-13759973 ]

Bikas Saha commented on YARN-1027:

The patch looks clean overall.

I would suggest keeping the haEnabled concept within the HAServiceProtocol service instead of mixing it between the ResourceManager and HAServiceProtocol. Thus the RM always does addService(HAServiceProtocol). HAServiceProtocol is the one that checks haEnabled in serviceStart(). If enabled, it transitions to standby and waits for the active signal. If not, it directly transitions to active.

Shouldn't we simply call transitionToStandby() here? That would ensure getServiceStatus() returns non-active status for anyone that cares to know.
{code}
+  public void serviceStop() throws Exception {
+    if (rm.haState == HAServiceState.ACTIVE) {
+      rm.stopActiveServices();
{code}

This is fine for now but we might have to invest in a better health check in a different jira. Any ideas?
{code}
   public synchronized void monitorHealth() throws HealthCheckFailedException {
+    if (rm.haState == HAServiceState.ACTIVE && !rm.areActiveServicesRunning()) {
{code}

We probably want the log before the if statement. Should we change state to standby before we stop services? I am assuming that HA-aware services would need to know about this earlier rather than later, so that they can stop signaling Active services and allow them to be drained/stopped.
{code}
+    if (rm.haState == HAServiceState.ACTIVE) {
+      rm.stopActiveServices();
+    }
+
+    LOG.info("Transitioning to standby");
+    rm.haState = HAServiceState.STANDBY;
{code}

Didn't quite get this comment. Is this to do with the change being requested by user/admin/ZKFC?
{code}
+  public void transitionToActive(StateChangeRequestInfo reqInfo) {
+    // TODO: When automatic failover is enabled, check if transition should
+    // be allowed for this request
{code}

What are the pros of making haState a member of ResourceManager instead of HAServiceProtocol? A pro of the latter is that it keeps all HA stuff in one place.

Why is there a lock used in ResourceManager.startActive() etc.? Why are these methods protected? If for testing, then let's add a @VisibleForTesting annotation.

Is there a way to confirm that the active service objects are all being GC'd?

testStartAndTransitions() - How about calling getServiceStatus() and monitorHealth() in addition to checking the internal members, in all places where internal members are being checked? That way we test and exercise those methods too. How about completing Active-Standby-Active-Standby-Active-RM.serviceStop()? This would fully simulate multiple full cycles of transitions and also verify the shutdown case. We can also issue some requests like createApplication() to the RM, when in active state, and verify that the RM is really working.

TestRMHADisabled - It is confusing to read that the RM has started but its haState==INITIALIZING. Also, we can probably move this test into TestRMHA.java to keep related tests in one place.

Minor nits:

LOG instead of print?
{code}
+    } catch (Exception e) {
+      e.printStackTrace();
{code}

RM_HA_PREFIX instead of HA_PREFIX
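The suggestion to keep haEnabled inside the HA service, together with the monitorHealth() condition quoted above, can be sketched in a self-contained way. RmHaService, HealthCheckFailed and simulateActiveServiceCrash() are hypothetical names for this illustration; the real HealthCheckFailedException is a checked exception, and an unchecked one is used here only to keep the sketch short.

```java
// Hypothetical sketch of an HA service that owns the haEnabled decision.
class RmHaService {
    enum State { INITIALIZING, STANDBY, ACTIVE }

    // Simplification: unchecked, unlike the real HealthCheckFailedException.
    static class HealthCheckFailed extends RuntimeException {
        HealthCheckFailed(String message) { super(message); }
    }

    private final boolean haEnabled;
    private State state = State.INITIALIZING;
    private boolean activeServicesRunning = false;

    RmHaService(boolean haEnabled) {
        this.haEnabled = haEnabled;
    }

    // The RM always adds this service; the service decides where to start.
    synchronized void start() {
        if (haEnabled) {
            transitionToStandby();   // wait for an explicit signal to become active
        } else {
            transitionToActive();    // no HA: behave like a plain RM
        }
    }

    synchronized void transitionToActive() {
        activeServicesRunning = true;
        state = State.ACTIVE;
    }

    synchronized void transitionToStandby() {
        activeServicesRunning = false;
        state = State.STANDBY;
    }

    synchronized State getServiceState() {
        return state;
    }

    // Same condition as the reviewed snippet: unhealthy only if we claim to be
    // active while the active services are not running.
    synchronized void monitorHealth() {
        if (state == State.ACTIVE && !activeServicesRunning) {
            throw new HealthCheckFailed("active services are not running");
        }
    }

    // Test hook (assumption): pretend the active services died underneath us.
    synchronized void simulateActiveServiceCrash() {
        activeServicesRunning = false;
    }
}
```

A test in the spirit of the review would drive Active-Standby-Active-Standby-Active and call getServiceState() and monitorHealth() after each step rather than inspecting internal fields.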
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760582#comment-13760582 ]

Karthik Kambatla commented on YARN-1027:

Thanks for the detailed review, [~bikas].

bq. What are the pros of making haState a member of ResourceManager instead of HAServiceProtocol? A pro of the latter is that it keeps all HA stuff in one place.

In the future, when individual external-facing services need to behave based on the HAState, having it in the RM might be useful. However, I think we should move it to RMHAProtocolService now, and move it to the RM or RMContext lazily.

bq. Why is there a lock used in ResourceManager.startActive() etc.? Why are these methods protected? If for testing, then let's add a @VisibleForTesting annotation.

The lock is to protect against concurrent invocations of transitionToActive() and transitionToStandby() due to, say, user input. The methods are protected because they are being accessed from outside the RM - in this case, RMHAProtocolService.

bq. Is there a way to confirm that the active service objects are all being GC'd?

Not sure of a deterministic test. How about using the Runtime memory methods to measure memory usage before and after transitioning to Active and subsequently Standby? I can jmap a real RM on a pseudo-dist cluster and see if they are being cleaned up.

bq. Didn't quite get this comment. Is this to do with the change being requested by user/admin/ZKFC?

If automatic failover is enabled and a user issues a transition command, it should take effect only when it is forced.

Agree with the remaining comments. Will fix them in the next version.
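The rule stated in the reply above (with automatic failover enabled, a user-issued transition takes effect only when forced) can be written as a small predicate. TransitionPolicy and its Source enum are hypothetical names for this sketch; allowing every other combination is an assumption, since the thread only specifies the one rejected case.

```java
// Hypothetical sketch of the check behind the TODO in transitionToActive().
class TransitionPolicy {
    // Illustrative request sources: a plain user request, a forced user request,
    // and a request from the failover controller.
    enum Source { USER, USER_FORCED, ZKFC }

    static boolean isAllowed(boolean autoFailoverEnabled, Source source) {
        // Only case spelled out in the thread: an unforced user request is
        // rejected while automatic failover is enabled.
        if (autoFailoverEnabled && source == Source.USER) {
            return false;
        }
        // Assumption: everything else is allowed.
        return true;
    }
}
```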
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760752#comment-13760752 ]

Bikas Saha commented on YARN-1027:

RMHAProtocolService can be made available via RMContext and thus accessible to everyone who has access to RMContext.

In that case we probably mean package and not protected, since there is no inheritance story here.

I don't think we need a test (although that would be awesome). If we can manually verify, then it should be sufficient for now I guess.
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760774#comment-13760774 ]

Karthik Kambatla commented on YARN-1027:

In yarn-1027-4.patch, the RM always does addService(HAServiceProtocol). HAServiceProtocol is the one that checks haEnabled in serviceStart(). If enabled, it transitions to standby and waits for the active signal. If not, it directly transitions to active.

However, post RM#init(), RM fields are not instantiated (e.g. TokenManagers), leading to a bunch of test failures.
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756655#comment-13756655 ]

Karthik Kambatla commented on YARN-1027:

Just uploaded a patch (yarn-1027-2.patch) that builds on top of YARN-1098 patch, and depends on it.

Patch outline:
# Implement RMHAProtocolService
# When HA is enabled, make this HA-service one of the services managed by the RM. RM no longer manages the activeStateServices directly, these are to be managed by the HA-service.
# Tests to check HA enable/disable and transitions when enabled.
# Included another patch (test-yarn-1027.patch) that I used to force transitionToActive() after RM starts in the Standby mode. Post transitionToActive, the RM behaves normally. I was able to run jobs.
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754306#comment-13754306 ]

Karthik Kambatla commented on YARN-1027:

Discussion with Bikas and Vinod offline: The HA-in-RM approach doesn't seem to be much more disruptive than the wrapper/extension approaches, particularly given the changes in YARN-1098. The implementation can be along the lines of the proof-of-concept patch.
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750240#comment-13750240 ]

Karthik Kambatla commented on YARN-1027:

Thanks for the review, Bikas. Sorry for the delay in response - was on vacation last week.

bq. Taking the hybrid approach drops the simplicity of the wrapper while at the same time making it complex to interact with the ResourceManager.

I see your point. IMO, the extension approach increases the flexibility of the wrapper approach without adding too much complexity. Keeping the HA code separate from the RM avoids complicating the RM, particularly during the period we are stabilizing the HA portion of the code. Once stable, if we think it is appropriate, it is simple enough to merge it all into the RM itself.

bq. Which one is the real ResourceManager? For example, there are many tests that use the ResourceManager but now since they don't use HAResourceManager they are probably not exercising some possibilities. Should they use HAResourceManager?

The HA-specific tests can access the HAResourceManager; it should be okay for the remaining tests to access ResourceManager and not the HAResourceManager.

bq. This shows that HA awareness can be added without significant overhaul in the RM.

Most importantly, my fear is that adding HA to the RM directly leads to a more significant overhaul. Let me draft a patch implementing the same within the RM itself instead of extending it. Any other ideas in the interim would also greatly help.
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750258#comment-13750258 ]

Bikas Saha commented on YARN-1027:

It's a good idea to draft a patch in which the HA protocol becomes another service within the RM. We should think through various startup/transitionToActive()/transitionToStandby() scenarios to determine the best approach to code this. E.g. repeated transitions from active-standby-active for the same RM without bringing the process down. This means that all apps in the RM (i.e. all internal stateful objects like appmanager, scheduler, rmappimpl etc.) should be completely cleaned up during transitionToStandby(). Currently the RM simply shuts down and hence that cleanup is not necessary.

This may also suggest that we logically divide RM internal objects into 2 groups:
1) stuff that can be started once and kept on until the RM stops
2) stuff that needs to be cleaned every time the RM is standby and re-inited when the RM is active.

The second group would contain things like the scheduler while the first would contain things like the RPC services. The first set would be transparent to HA while the second set would need to be aware of HA.

Perhaps before we tackle this jira to completion, we should open and commit another jira that identifies all stateful objects within the RM and adds support to clean them up during RM shutdown. Those cleanup methods can be re-used during transitionToStandby(). This jira can build on top of that.
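The two-group split proposed above can be sketched with plain objects. TwoGroupRm, RpcFrontEnd and Scheduler are hypothetical stand-ins and not the real ResourceManager classes; the sketch only shows the lifecycle difference between a component started once and a stateful component that is discarded on standby and rebuilt on active.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two groups of RM internals described in the comment.
class TwoGroupRm {
    // Group 1: started once and kept until the RM stops (e.g. RPC services).
    static class RpcFrontEnd {
        boolean started;
    }

    // Group 2: stateful, must be cleaned up on standby (e.g. the scheduler).
    static class Scheduler {
        final List<String> apps = new ArrayList<>();
    }

    private final RpcFrontEnd rpc = new RpcFrontEnd();
    private Scheduler scheduler;   // null while in standby

    void start() {
        rpc.started = true;
    }

    void transitionToActive() {
        if (scheduler == null) {
            scheduler = new Scheduler();   // re-inited with no leftover state
        }
    }

    void transitionToStandby() {
        scheduler = null;                  // drop the reference so the old state can be GC'd
    }

    void submit(String app) {
        if (scheduler == null) {
            throw new IllegalStateException("RM is in standby");
        }
        scheduler.apps.add(app);
    }

    int appCount() {
        return scheduler == null ? 0 : scheduler.apps.size();
    }

    boolean isRpcStarted() {
        return rpc.started;
    }
}
```

Replacing the whole object on each transition is the simplest way to guarantee a clean state; the alternative mentioned in the comment, explicit cleanup methods shared with RM shutdown, keeps the objects alive but requires every stateful component to implement cleanup correctly.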
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745841#comment-13745841 ]

Bikas Saha commented on YARN-1027:

First of all, thanks for writing the patch and testing it. This shows that HA awareness can be added without significant overhaul in the RM.

I wish I could say that I like the hybrid approach, but after reading the patch unfortunately that's not the case. Having a pure wrapper approach that simply does a new ResourceManager() upon transitionToActive() has the virtue of being completely separate from the RM and being simple. Having HAService built into ResourceManager as a service integrates it completely with the ResourceManager flow and allows for features like RPC redirect in tandem with other RM services. Taking the hybrid approach drops the simplicity of the wrapper while at the same time making it complex to interact with the ResourceManager. Which one is the real ResourceManager? For example, there are many tests that use the ResourceManager but now since they don't use HAResourceManager they are probably not exercising some possibilities. Should they use HAResourceManager?

Fundamentally, HA is going to be an integral part of the ResourceManager and to me it does not make sense to create a derived impl of the ResourceManager in order to add the HA logic. What other derivations are possible for the RM that motivate the use of inheritance and sub-classing? Why have 2 impls for essentially the same component?

Starting up and stopping services is not super fast and will add time to the failover. So unless there is an obstacle to that path, we should be looking at starting as many (if not all) services on the RM so that the only thing that's blocking failover is populating the state.

As discussed earlier, it's not necessary for all services to be started in the first cut. We can choose to start the HA service only. I would really encourage attempting to make HAService part of ResourceManager itself. I can help with the patch if needed.
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13743547#comment-13743547 ]

Bikas Saha commented on YARN-1027:

[~vinodkv] Do you have any suggestions? Would be great to have them early because this approach will be important for the remaining changes. So best to spend time on this now and make sure we are in the best position for future development.
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742368#comment-13742368 ]

Hadoop QA commented on YARN-1027:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12598376/yarn-1027-1.patch
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1732//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1732//console

This message is automatically generated.
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729294#comment-13729294 ]

Karthik Kambatla commented on YARN-1027:

[~nemon], if you haven't started work on this already, do you mind if I take this up? I have been discussing this with Bikas on YARN-149 and offline, and have started working on it.

Implement RMHAServiceProtocol
    Key: YARN-1027
    URL: https://issues.apache.org/jira/browse/YARN-1027
    Project: Hadoop YARN
    Issue Type: Sub-task
    Reporter: Bikas Saha
    Assignee: nemon lou
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729348#comment-13729348 ]

nemon lou commented on YARN-1027:

I had also started working on this since it was unassigned. It's OK for you to take it up, I will review the patch :)