[ 
https://issues.apache.org/jira/browse/SAMZA-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shanthoosh Venkataraman updated SAMZA-2301:
-------------------------------------------
    Description: -  (was: - We've observed corner cases wherein not all 
instances of a standalone application see the same state in ZooKeeper, 
i.e., some instances see the up-to-date JobModel state and some see an 
outdated, inconsistent state. There is a minuscule propagation delay 
(depending upon network bandwidth) between the leader of the ZooKeeper quorum 
and the other servers in the ensemble. Consider the case where a follower 
undergoes the following execution sequence. 
 - The follower receives a JobModel version change notification from a 
ZooKeeper server on which the JobModel version update had been made by the 
standalone leader processor. 
 - The follower tries to read the JobModel, but receives a session disconnect 
from the up-to-date ZooKeeper server. The I0Itec ZkClient library retries 
connecting to other servers in the ensemble, and a connection to an outdated 
ZooKeeper server is established successfully. The read for the JobModel from 
this outdated ZooKeeper server would return null for the new JobModel 
ZooKeeper path, thereby killing the standalone processor.)
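
The failure mode above, and the kind of non-null check the issue title asks 
for, can be sketched roughly as follows. This is an illustration only: the 
class and method names are hypothetical and are not taken from Samza or the 
ZkClient library, and retrying on a null read is an assumption about one 
possible handling, not a description of the actual patch.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names come from Samza or ZkClient. */
public class JobModelReadSketch {

    /**
     * Calls the reader until it returns a non-null value or the attempts
     * run out. A null result is treated as "this server has not caught up
     * yet" rather than as a fatal condition.
     */
    public static <T> T readNonNull(Supplier<T> reader, int maxAttempts, long backoffMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T value = reader.get();
            if (value != null) {
                return value;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMs);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for JobModel", e);
                }
            }
        }
        throw new IllegalStateException(
            "JobModel still null after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Simulates an outdated server answering twice (null) before an
        // up-to-date server returns the new JobModel.
        Iterator<String> responses =
            Arrays.asList(null, null, "jobModel-v2").iterator();
        String jobModel = readNonNull(responses::next, 5, 10L);
        System.out.println(jobModel); // prints "jobModel-v2"
    }
}
```

Failing with an explicit exception after a bounded number of attempts keeps 
the error visible, instead of letting a null JobModel propagate into the 
processor.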

> Add non-null checks in JobModel read control-flow in standalone.
> ----------------------------------------------------------------
>
>                 Key: SAMZA-2301
>                 URL: https://issues.apache.org/jira/browse/SAMZA-2301
>             Project: Samza
>          Issue Type: New Feature
>            Reporter: Shanthoosh Venkataraman
>            Assignee: Shanthoosh Venkataraman
>            Priority: Major
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> -



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
