| ... With the 0.13.0 release, Samza introduced a flexible deployment model that lets you run it in containerized environments, with resource managers other than YARN, or in the cloud with the proper coordination primitives. It also lets you run Samza as a library within your application. The dependency on ZooKeeper for this coordination increases our customers' reliance on that infrastructure and works against modularity and componentization. ZooKeeper is also tedious to maintain. Introducing a coordination service in Azure will help identify issues with the current Job Coordinator design and validate the functionality that Samza Embedded claims to provide. If incorporated with the EventHub connector for Brooklin, it will give us an end-to-end system running in Azure, giving teams at Microsoft more motivation to incorporate Samza into their existing systems. Additionally, we get all the advantages of moving to cloud infrastructure. ...
- Implement the AzureJobCoordinator on top of the current JobCoordinator interface.
- Implement the Latch (lock) and Leader functionality with Lease Blobs in Azure. These are pluggable components. A blob in Azure Storage is used for storing large amounts of unstructured data. Lease Blob is an operation that establishes and manages a lock on a blob for write and delete operations. We will use this service to elect the leader when running in Azure.
- Integrate all of this with the EventHubSystemProducer and EventHubSystemConsumer.
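The lease-based leader election described above can be illustrated with a minimal sketch. It does not use the Azure SDK: `InMemoryLeaseBlob` is a hypothetical in-memory stand-in for a lease blob, and `Processor`, `tryBecomeLeader`, `heartbeat`, and `resign` are names assumed here for illustration, not the proposal's actual API. The only property it relies on is the one stated above: at most one caller can hold the lease on a blob at a time, and a lease that is not renewed eventually expires.

```java
import java.util.Optional;
import java.util.UUID;

// Sketch only: an in-memory model of lease semantics, not the Azure Storage client.
final class LeaseElectionSketch {

    /** Hypothetical stand-in for a lease blob: one holder at a time, lease expires unless renewed. */
    static final class InMemoryLeaseBlob {
        private String leaseId;        // null when nobody holds the lease
        private long expiresAtMillis;  // expiry time of the current lease

        /** Returns a new lease id if the blob is free or its lease has expired, else empty. */
        synchronized Optional<String> acquire(long durationMillis, long nowMillis) {
            if (leaseId != null && nowMillis < expiresAtMillis) {
                return Optional.empty();
            }
            leaseId = UUID.randomUUID().toString();
            expiresAtMillis = nowMillis + durationMillis;
            return Optional.of(leaseId);
        }

        /** Extends the lease only for the current, unexpired holder. */
        synchronized boolean renew(String id, long durationMillis, long nowMillis) {
            if (leaseId == null || !leaseId.equals(id) || nowMillis >= expiresAtMillis) {
                return false;
            }
            expiresAtMillis = nowMillis + durationMillis;
            return true;
        }

        /** Frees the lease if the caller is the current holder. */
        synchronized void release(String id) {
            if (leaseId != null && leaseId.equals(id)) {
                leaseId = null;
            }
        }
    }

    /** A processor that becomes leader by winning the lease on the shared blob. */
    static final class Processor {
        private final String name;
        private final InMemoryLeaseBlob blob;
        private String leaseId; // non-null while this processor believes it is leader

        Processor(String name, InMemoryLeaseBlob blob) {
            this.name = name;
            this.blob = blob;
        }

        boolean tryBecomeLeader(long durationMillis, long nowMillis) {
            Optional<String> lease = blob.acquire(durationMillis, nowMillis);
            if (lease.isPresent()) {
                leaseId = lease.get();
            }
            return lease.isPresent();
        }

        /** Renews the lease; a failed renewal means leadership has been lost. */
        boolean heartbeat(long durationMillis, long nowMillis) {
            if (leaseId == null) {
                return false;
            }
            boolean renewed = blob.renew(leaseId, durationMillis, nowMillis);
            if (!renewed) {
                leaseId = null;
            }
            return renewed;
        }

        void resign() {
            if (leaseId != null) {
                blob.release(leaseId);
                leaseId = null;
            }
        }

        /** Local view only; it can be stale until the next heartbeat. */
        boolean isLeader() {
            return leaseId != null;
        }

        String name() {
            return name;
        }
    }

    public static void main(String[] args) {
        InMemoryLeaseBlob blob = new InMemoryLeaseBlob();
        Processor p1 = new Processor("p1", blob);
        Processor p2 = new Processor("p2", blob);

        System.out.println(p1.name() + " leader: " + p1.tryBecomeLeader(60_000L, 0L));
        System.out.println(p2.name() + " leader: " + p2.tryBecomeLeader(60_000L, 1_000L));
        // p1 stops renewing; once the lease has expired, p2 can take over.
        System.out.println(p2.name() + " leader: " + p2.tryBecomeLeader(60_000L, 60_000L));
    }
}
```

Time is passed in explicitly so the expiry behaviour is easy to follow. A real implementation would rely on the storage service's own lease clock and would renew the lease from a background thread.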
... The following interfaces will be implemented for Azure:
- JobCoordinator
  public class AzureJobCoordinator implements JobCoordinator {}
- Latch
  public class AzureLatch implements Latch {}
- LeaderElection
  public class AzureLeaderElector implements LeaderElector {}
Implementation and Test Plan
...
The changes made in this proposal will be backward compatible. A new config value for the AzureJobCoordinator JobCoordinatorFactory will be introduced. The client only needs to change the config file and set job.coordinator.factory to org.apache.samza.azure.AzureJobCoordinatorFactory.
Rejected Alternatives
NA
Future Work
- Implementing the checkpointing mechanism with Azure Storage.
|