cameronlee314 opened a new pull request #1529:
URL: https://github.com/apache/samza/pull/1529


   Feature: Adding a job coordinator which does not do resource management. For 
Samza on YARN, the `ClusterBasedJobCoordinator` does resource management. 
However, for Samza on Kubernetes, it is not necessary to have a job coordinator 
which does resource management, because Kubernetes controllers can take care of 
resource management.
   
   Changes:
   1. Added a `StaticResourceJobCoordinator` which handles responsibilities 
like job model calculation, communicating job model to workers, monitoring 
input streams which may cause the job model to change, and startpoint fanout.
   2. Added new abstraction layer `CoordinatorCommunication` for communication 
between job coordinator and workers. Before, the only communication option was 
an HTTP endpoint. The new abstraction layer allows us to start decoupling the 
coordination from the HTTP endpoint. This PR doesn't expose an option to plug 
in a custom communication layer yet, but there is an interface to start working 
off of.
   
   Testing:
   1. Added unit tests
   2. Tested a Samza job using this new coordinator in Kubernetes
   
   API changes (all changes are backwards compatible):
   1. Set the config `job.coordinator.use.static.resource.job.coordinator` to 
`true` in order to use the new coordinator.
   2. Set the config `job.coordinator.restart.signal.factory` to define how to 
restart the Samza job when an input stream changes which will change the job 
model. This plug-in is dependent on where the Samza job is running (e.g. 
Kubernetes). Currently, there is only a no-op implementation of this restart 
signal.
   
   Note: This PR reuses components like `JobModelHelper`, 
`JobCoordinatorMetadataManager`, `StreamPartitionCountMonitorFactory`, and 
`StartpointManager` across `ClusterBasedJobCoordinator` and 
`StaticResourceJobCoordinator`. I considered consolidating 
`ClusterBasedJobCoordinator` and `StaticResourceJobCoordinator` even further to 
share code which encapsulates usage of multiple of the above components, but 
it's not yet clear how to fit the new and old flows together cleanly. There 
might be further divergence between the coordinators as we iterate on on 
`StaticResourceJobCoordinator`, so I felt it would be easier to leave them more 
decoupled for now. Also, keeping them decoupled reduces risk that a change will 
impact the existing `ClusterBasedJobCoordinator`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to