mynameborat opened a new pull request #1528:
URL: https://github.com/apache/samza/pull/1528


   **Problem**:
   As part of SAMZA-2638, we introduced an optimization that skips container 
stop and restart when a processor's work assignment does not change across 
rebalances. However, we only update the active job model with the proposed job 
model when starting the container as part of `onNewJobModel`. This leads to a 
scenario where the processor's container is stopped but subsequent rebalances 
assume it is still running. The scenario is described in more detail below.
   
   **Description**: 
   Imagine the quorum is in steady state with job model version v1. A new 
rebalance occurs and the leader generates v2. Processor P1's work assignment 
changes in v2, so P1 stops its container as part of job model expiration. 
If that rebalance is unsuccessful (the barrier times out), a new rebalance 
occurs and generates job model version v3. If the work assignment for P1 in v3 
is the same as in v1, the state transition assumes the processor has not stopped 
its container and performs a no-op, leaving the container stopped.
   
   **Changes**:
   - Track job model expiration
   - `onNewJobModel` proceeds with the new job model whenever the active job 
model has expired
   - Handle the no-change-in-work-assignment optimization only in the 
`checkJobModelExpired` flow
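
The intent of these changes can be illustrated with a minimal sketch. This is not the code in this PR: `ProcessorState`, its fields, and the string-valued work assignments are hypothetical stand-ins, and only the method names `checkJobModelExpired` and `onNewJobModel` are borrowed from the description above. It assumes the expiration flag is set when the container is stopped and cleared when a new job model is applied.

```java
import java.util.Objects;

// Hypothetical, simplified model of one processor's rebalance state.
// Not Samza's actual classes; names and fields are illustrative only.
final class ProcessorState {
    private String activeAssignment;        // work assignment in the active job model
    private boolean containerRunning;
    private boolean activeJobModelExpired;  // tracks job model expiration

    ProcessorState(String initialAssignment) {
        this.activeAssignment = initialAssignment;
        this.containerRunning = true;
        this.activeJobModelExpired = false;
    }

    // Expiration flow: the "no change in work assignment" optimization
    // is applied here and nowhere else.
    void checkJobModelExpired(String proposedAssignment) {
        if (activeJobModelExpired) {
            return; // container was already stopped by an earlier rebalance
        }
        if (Objects.equals(activeAssignment, proposedAssignment)) {
            return; // assignment unchanged: keep the container running
        }
        containerRunning = false;
        activeJobModelExpired = true;
    }

    // New job model flow: apply the proposed model whenever the active one
    // has expired, even if the assignment matches the last active model.
    void onNewJobModel(String proposedAssignment) {
        if (activeJobModelExpired) {
            activeAssignment = proposedAssignment;
            containerRunning = true;
            activeJobModelExpired = false;
        }
    }

    boolean isContainerRunning() {
        return containerRunning;
    }
}
```

Replaying the v1, v2, v3 scenario against this sketch: a processor created with assignment `"A"` (v1) stops its container on `checkJobModelExpired("B")` (v2). After the barrier times out, `checkJobModelExpired("A")` followed by `onNewJobModel("A")` (v3) restarts the container because the expiration flag is still set, rather than treating the unchanged assignment as a no-op.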
   
   **Test**:
   - Added unit test to cover the scenario of multiple incomplete rebalances
   
   **API Changes**: None
   
   **Upgrade Instructions**: None

