[ 
https://issues.apache.org/jira/browse/SAMZA-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197894#comment-15197894
 ] 

Jagadish commented on SAMZA-881:
--------------------------------

Thanks Yi for the feedback.
>>1. When we describe mutual exclusiveness in partition assignment, please 
>>exclude broadcast streams from the discussion
Good point, thanks for catching this.
>>Be consistent w/ terms: in page 2, “leader container” vs “leader process”
I'll ensure that the terms are consistent.
>>In the architecture graph, it would be nice to label the text on the edges w/ 
>>execution order
The ordering here is not fully deterministic. The only two steps with a fixed 
order are 1. getJobModel and 2. startContainer, so I'm not sure what a 
'general' ordering would look like. The requestingResources step applies only 
in the Yarn case, and the leader notifications and resource notifications are 
asynchronous. I've captured the ordering in the next swimlane diagram.

>>Where is the container liveness management module in the design of 
>>JobCoordinator? W/ SAMZA-871, requesting for direct heart beat between the 
>>containers and AM (i.e. essentially followers and leaders in the new design), 
>>I think that we should have a separate pluggable module for this, in addition 
>>to ContainerProcessManager, which is just interface to allocate/request 
>>processes.

Great point on the container liveness management piece. The 
ContainerProcessManager API already includes callbacks for notifications during 
container failures. We can move
  + onStreamProcessorLaunch(resourceID)
  + onResourceCompleted(resourcestatus)
into a separate API for liveness. 

The standalone part will implement the liveness APIs using ZooKeeper, while the 
Yarn part will rely on notifications from Yarn.
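
A rough sketch of what the split-out liveness API could look like, in Java. 
Only the two callback names come from the list above. The type names 
ContainerLivenessListener, ResourceStatus and LivenessTracker, the exit-code 
field, and the "non-zero exit means failure" rule are assumptions made for 
illustration and are not part of Samza.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical value type standing in for the "resourcestatus" argument.
final class ResourceStatus {
    final String resourceId;
    final int exitCode;

    ResourceStatus(String resourceId, int exitCode) {
        this.resourceId = resourceId;
        this.exitCode = exitCode;
    }
}

// Hypothetical liveness API: just the two callbacks moved out of
// ContainerProcessManager.
interface ContainerLivenessListener {
    void onStreamProcessorLaunch(String resourceId);

    void onResourceCompleted(ResourceStatus status);
}

// Tracks which processors are live. A ZooKeeper-backed or Yarn-backed
// source would invoke these callbacks when it observes a launch or a
// completion; the tracker itself does not care which one is used.
final class LivenessTracker implements ContainerLivenessListener {
    private final Set<String> live = new LinkedHashSet<>();
    private final List<String> failed = new ArrayList<>();

    @Override
    public synchronized void onStreamProcessorLaunch(String resourceId) {
        live.add(resourceId);
    }

    @Override
    public synchronized void onResourceCompleted(ResourceStatus status) {
        live.remove(status.resourceId);
        if (status.exitCode != 0) {
            // Assumption: a non-zero exit code is treated as a failure.
            failed.add(status.resourceId);
        }
    }

    // Defensive copies so callers cannot mutate internal state.
    synchronized Set<String> liveProcessors() {
        return new LinkedHashSet<>(live);
    }

    synchronized List<String> failedProcessors() {
        return new ArrayList<>(failed);
    }
}
```

With this shape, the JobCoordinator would depend only on the listener 
interface, and the standalone and Yarn implementations would differ only in 
what triggers the callbacks.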

>>Case 3.2 is the prototype implemented in SAMZA-516, right? We should call it 
>>out.
Yes, good catch. I'll call it out.


> Re-think the Samza Job Coordinator
> ----------------------------------
>
>                 Key: SAMZA-881
>                 URL: https://issues.apache.org/jira/browse/SAMZA-881
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Jagadish
>            Assignee: Jagadish
>         Attachments: SamzaJobCoordinatorRe-designProposal.pdf
>
>
> Currently, the only way to run Samza containers in distributed mode is using 
> Yarn. However, there has been interest in running Samza on top of other 
> resource managers, given the recent explosion in the number of such systems. 
> Users have also asked us to support running Samza as a library, and running 
> Samza on Docker containers managed by Kubernetes.
> We must re-think the JobCoordinator functionality as follows:
> 1. ID assignment: Provide an ID to each SamzaContainer.
> 2. JobModel agreement: Ensure containers agree on a JobModel.
> 3. Re-start the SamzaContainer when the job model changes. 
> This will arguably require some leader election (depending on how users 
> choose to run Samza).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
