[ 
https://issues.apache.org/jira/browse/SAMZA-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306843#comment-17306843
 ] 

Bharath Kumarasubramanian commented on SAMZA-2633:
--------------------------------------------------

We will tackle the changes in multiple stages:
 1. Changes to the processor rebalance flow when the work assignment doesn't change [SAMZA-2638]
 2. Changes to the processor startup flow to use the last active job model version
 3. Changes to the leader to add criteria for when to trigger a rebalance

Stages 2 and 3 will be tackled as part of this ticket.

> Rolling deployment/upgrade causes downtime for processors for the entire 
> deployment window
> ------------------------------------------------------------------------------------------
>
>                 Key: SAMZA-2633
>                 URL: https://issues.apache.org/jira/browse/SAMZA-2633
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Bharath Kumarasubramanian
>            Assignee: Bharath Kumarasubramanian
>            Priority: Major
>
> *Problem*:
> At LinkedIn, several standalone users have reported lag/downtime during 
> rolling deployments/upgrades.
> *Description*:
> During rolling upgrades, the current debounce timer gets extended every time 
> there is a quorum change notification. As a result, processors that were 
> upgraded earlier in the deployment window remain unavailable, waiting for work 
> assignment. In some scenarios, this causes processors to be unavailable for 20 
> minutes or so, depending on the size of the quorum and the debounce time 
> configuration.
> *Impact*:
> Partitions that were stopped on the first processors to be upgraded remain 
> unassigned for the entire deployment window, which can result in processing 
> lag.
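
The timer behavior described in the issue can be illustrated with a small simulation. This is only a sketch, not Samza code: the function name `rebalance_time`, the one-notification-per-restart model, and the numbers (20 processors restarted 60 seconds apart, a 90-second debounce) are all assumptions chosen for illustration. It shows how a timer that restarts on every quorum change notification does not fire until after the last processor has been upgraded.

```python
def rebalance_time(notifications, debounce):
    """Return the time (in seconds) at which the debounce timer fires.

    Hypothetical model: every quorum change notification that arrives
    before the timer expires restarts it, so the timer only fires
    `debounce` seconds after the last such notification.
    """
    deadline = None
    for t in sorted(notifications):
        if deadline is not None and t >= deadline:
            break  # the timer already fired before this notification
        deadline = t + debounce  # restart (extend) the timer
    return deadline


# Assumed rolling upgrade: 20 processors restarted 60 s apart,
# each restart producing one quorum change notification.
restarts = [i * 60 for i in range(20)]
print(rebalance_time(restarts, debounce=90))  # 1230 s, about 20 minutes
```

Under these assumed numbers, the processor upgraded at t = 0 waits until t = 1230 for its work assignment. If the restarts were spaced further apart than the debounce time, the timer would fire between restarts instead of being extended across the whole window.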



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
