lakshmi-manasa-g opened a new pull request #1442: URL: https://github.com/apache/samza/pull/1442
**Feature:** The main feature is high availability (HA) for the cluster-based job coordinator (aka AM) (TODO: sep/doc how?). The feature ensures that a new AM can connect to containers that are already running, so that they do not all have to be restarted when the AM dies. This PR enables an already running container to establish a heartbeat connection with the new AM, and introduces a config that gates all of the code for this feature.

**Changes:**

1. New job config.
2. ContainerHeartbeatMonitor: fetches the new AM URL from the coordinator stream and establishes a heartbeat with the new AM.

**Tests:** Added a unit test.

**API changes:**

1. A config can be set for a job to enable the AM HA feature.
2. If the config is enabled, the container reads the coordinator stream until it either times out or successfully re-establishes the heartbeat.

**Usage instructions:** To enable AM HA, set the config to "true". The default value is "false".

**Upgrade instructions:** None
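
The reconnect behaviour described under API changes can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `HeartbeatReestablishSketch`, the method `reestablish`, and all of its parameters are hypothetical names, and the coordinator-stream read and the heartbeat call are passed in as plain functions rather than real Samza APIs.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names are taken from the Samza codebase. */
public class HeartbeatReestablishSketch {

  /**
   * Repeatedly reads the AM URL and tries a heartbeat against it until the
   * heartbeat succeeds or the timeout elapses.
   *
   * @param readAmUrl     assumed stand-in for reading the new AM URL from the coordinator stream
   * @param tryHeartbeat  assumed stand-in for a heartbeat request; returns true on success
   * @param timeout       how long to keep retrying
   * @param retryInterval pause between attempts
   * @return the AM URL the heartbeat succeeded against, or empty on timeout or interrupt
   */
  public static Optional<String> reestablish(
      Supplier<Optional<String>> readAmUrl,
      Predicate<String> tryHeartbeat,
      Duration timeout,
      Duration retryInterval) {
    long start = System.nanoTime();
    do {
      Optional<String> url = readAmUrl.get();
      if (url.isPresent() && tryHeartbeat.test(url.get())) {
        return url; // heartbeat with the new AM is established
      }
      try {
        Thread.sleep(retryInterval.toMillis());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt flag
        return Optional.empty();
      }
    } while (System.nanoTime() - start < timeout.toNanos());
    return Optional.empty(); // timed out without a successful heartbeat
  }
}
```

In this sketch the caller decides what to do when the result is empty; the PR description does not say how the container reacts to a timeout, so that step is left out.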
