| ... Originally, Samza could be run only in distributed mode, with the help of YARN, which is used for cluster management. YARN was responsible for allocating resources for each physical process and coordinating between them. Recently, Samza was released in embedded mode, which enables you to use it as a library. Currently, the coordination services in the embedded version of Samza are written using ZooKeeper. The dependency on ZooKeeper increases our customers’ reliance on that infrastructure and does not help with modularity. Also, ZooKeeper is tedious to maintain and does not help with componentization. The goal of this proposal is to write the same coordination primitives using services provided in Microsoft Azure, in order to make the coordination service pluggable. ... With the 0.13.0 release, Samza introduced a flexible deployment model that enables you to run Samza in containerized environments, with resource managers other than YARN, or in the cloud with the proper coordination primitives. It also enables you to run Samza as a library within your application. The dependency on ZooKeeper for this increases our customers’ reliance on that infrastructure and does not help with modularity. Also, ZooKeeper is tedious to maintain and does not help with componentization. Introducing a coordination service in Azure will help identify issues with the current Job Coordinator design and validate the functionality that Samza Embedded claims to provide. If incorporated with the EventHub connector for Brooklyn, it will give us an end-to-end system running in Azure. Teams in Microsoft will then be able to deploy Samza jobs easily, which gives them more motivation to incorporate Samza into their existing systems. Additionally, we get all the advantages of moving to the cloud infrastructure.
Proposed Changes
- Implement the AzureJobCoordinator on top of the current JobCoordinator. This will include implementing the storage component in Azure Storage and the notification component in (Operations Manager, Application Insights, Azure Monitor, Notification Hubs), as these are still not pluggable in the current API.
- Implement the Latch (lock) and Leader functionality with Lease Blobs in Azure. These are pluggable components.
- Implement the checkpointing mechanism with Azure Table Storage. ??
- Integrate all of this with the EventHubSystemProducer and EventHubSystemConsumer.
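As an illustration of how Leader functionality could sit on top of a lease, here is a minimal Java sketch. It does not call Azure: `InMemoryLeaseStore` is a hypothetical in-memory stand-in for a single lease blob, and `Candidate`, `heartbeat`, and `resign` are names invented for this example, not part of Samza or of the Azure SDK. The only property it assumes is that at most one unexpired lease exists at a time and that only its holder can renew or release it.

```java
import java.util.Optional;
import java.util.UUID;

/** Sketch of lease-based leader election. All names here are hypothetical. */
final class LeaseLeaderElectionSketch {

    /** In-memory stand-in for one lease blob: at most one unexpired lease at a time. */
    static final class InMemoryLeaseStore {
        private String holderLeaseId;  // null when no lease is held
        private long expiresAtMillis;  // meaningful only while holderLeaseId != null

        /** Grants a new lease if none is held or the previous one has expired. */
        synchronized Optional<String> tryAcquire(long nowMillis, long durationMillis) {
            if (holderLeaseId != null && nowMillis < expiresAtMillis) {
                return Optional.empty();
            }
            holderLeaseId = UUID.randomUUID().toString();
            expiresAtMillis = nowMillis + durationMillis;
            return Optional.of(holderLeaseId);
        }

        /** Extends the lease, but only for the current holder and only before expiry. */
        synchronized boolean renew(String leaseId, long nowMillis, long durationMillis) {
            if (leaseId.equals(holderLeaseId) && nowMillis < expiresAtMillis) {
                expiresAtMillis = nowMillis + durationMillis;
                return true;
            }
            return false;
        }

        /** Gives the lease up early so another processor can acquire it at once. */
        synchronized void release(String leaseId) {
            if (leaseId.equals(holderLeaseId)) {
                holderLeaseId = null;
            }
        }
    }

    /** One processor taking part in the election. */
    static final class Candidate {
        private final InMemoryLeaseStore store;
        private final long leaseDurationMillis;
        private String leaseId; // non-null while this candidate believes it is the leader

        Candidate(InMemoryLeaseStore store, long leaseDurationMillis) {
            this.store = store;
            this.leaseDurationMillis = leaseDurationMillis;
        }

        /** Call more often than the lease duration. Returns true while this candidate leads. */
        boolean heartbeat(long nowMillis) {
            if (leaseId != null && store.renew(leaseId, nowMillis, leaseDurationMillis)) {
                return true;
            }
            // Either we never led or we lost the lease: try to take it.
            leaseId = store.tryAcquire(nowMillis, leaseDurationMillis).orElse(null);
            return leaseId != null;
        }

        /** Steps down voluntarily. */
        void resign() {
            if (leaseId != null) {
                store.release(leaseId);
                leaseId = null;
            }
        }
    }

    public static void main(String[] args) {
        InMemoryLeaseStore store = new InMemoryLeaseStore();
        Candidate a = new Candidate(store, 1000);
        Candidate b = new Candidate(store, 1000);
        System.out.println(a.heartbeat(0));    // a acquires the free lease
        System.out.println(b.heartbeat(100));  // lease is still held by a
        System.out.println(b.heartbeat(1500)); // a stopped renewing, so b takes over
    }
}
```

Passing the clock in as `nowMillis` keeps the sketch deterministic. A real implementation would rely on the expiry enforced by the storage service rather than on a clock supplied by the caller.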
...
- Implement the AzureJobCoordinator, LeaderElection and Latch functionality
- Add metrics to monitor the new features
- Implement necessary unit tests and integration tests for the added functionalities (Details: TBD)
Compatibility, Deprecation, and Migration Plan
The changes made in this proposal will be backward compatible. A new config value for the AzureJobCoordinator will be introduced. The client only needs to edit the config file and set job.coordinator.factory to org.apache.samza.azure.AzureJobCoordinatorFactory.
Rejected Alternatives
NA
Future Work
Implementing the checkpointing mechanism with Azure Storage. |
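The configuration change described under Compatibility, Deprecation, and Migration Plan amounts to a single line in the job's properties file. A minimal sketch, assuming the factory class keeps the name given in this proposal:

```properties
# Select the Azure-based job coordinator (class name as proposed above; it may change before release).
job.coordinator.factory=org.apache.samza.azure.AzureJobCoordinatorFactory
```

No other existing settings are expected to change. Any Azure-specific settings the coordinator may need are not specified in this proposal.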