[jira] [Resolved] (SAMZA-1695) Clear events in debounce queue on session expiration

Shanthoosh Venkataraman (JIRA) Mon, 07 May 2018 19:49:37 -0700

     [ 
https://issues.apache.org/jira/browse/SAMZA-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Shanthoosh Venkataraman resolved SAMZA-1695.
--------------------------------------------
       Resolution: Fixed
    Fix Version/s: 0.15.0

> Clear events in debounce queue on session expiration
> ----------------------------------------------------
>
>                 Key: SAMZA-1695
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1695
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Shanthoosh Venkataraman
>            Assignee: Shanthoosh Venkataraman
>            Priority: Major
>             Fix For: 0.15.0
>
>
> *Scenario:*
> Assume there are three processors in the group [P1, P2, P3] and P1 is the
> leader.
> 1. The leader processor (P1) loses connectivity with a ZooKeeper server in
> the ensemble, and its ephemeral processor node is deleted (due to session
> expiration).
> 2. The immediate successor (P2) to the leader (P1) finds out that the leader
> is dead and declares itself leader. Processor P2 schedules onProcessorChange
> to publish the JobModel.
> 3. The ZkClient connection retry logic helps the former leader (P1) reconnect
> to another ZooKeeper server in the ensemble, and it rejoins as a follower.
> 4. Processor P1 acts on the stale buffered event in the debounce queue (which
> it received while it was the leader) and acts as leader. At this point, two
> processors are acting as leader (P1 and P2). If P1 proceeds to execute leader
> actions before P2, P2 will fail (and in the worst case this can cause state
> corruption).
> *Sample exception logs:*
> [https://gist.github.com/shanthoosh/55410fe4ebf3cfb65281b35f16397cad]
>  
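
The idea in the issue title (dropping buffered events once the session that
produced them has expired) can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the code committed for SAMZA-1695:
the class SessionAwareDebouncer and its methods schedule, onSessionExpired and
shutdown are hypothetical names, and the generation counter is one possible
way to detect stale actions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical debounce queue; not Samza's actual implementation. */
class SessionAwareDebouncer {
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor();
    private final Map<String, ScheduledFuture<?>> pending = new HashMap<>();
    // Incremented on every session expiration; identifies the current session.
    private final AtomicInteger generation = new AtomicInteger();

    /** Schedules an action, replacing any pending action with the same name. */
    synchronized void schedule(String name, long delayMs, Runnable action) {
        final int scheduledIn = generation.get();
        ScheduledFuture<?> future = executor.schedule(() -> {
            // Skip work that was buffered under an earlier session.
            if (scheduledIn == generation.get()) {
                action.run();
            }
        }, delayMs, TimeUnit.MILLISECONDS);
        ScheduledFuture<?> previous = pending.put(name, future);
        if (previous != null) {
            previous.cancel(false);
        }
    }

    /** Assumed to be invoked from the session-expiration callback. */
    synchronized void onSessionExpired() {
        generation.incrementAndGet();
        for (ScheduledFuture<?> future : pending.values()) {
            future.cancel(false);
        }
        pending.clear();
    }

    /** Stops accepting work and waits for the remaining actions to finish. */
    void shutdown() {
        executor.shutdown();
        try {
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In the scenario above, P1 would call onSessionExpired() when its session
expires, so the onProcessorChange action it buffered while it was the leader
never runs after it rejoins as a follower. The generation check is a second
guard for an action that has already been dequeued when the cancellation
happens.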



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
