shanthoosh opened a new pull request #1137: Improve ZooKeeper metadata 
implementation
URL: https://github.com/apache/samza/pull/1137
 
 
   
   
   
   - We've observed corner cases wherein not all instances of a standalone 
application see the same state in ZooKeeper, i.e., some instances see the 
up-to-date JobModel state while others see an outdated, inconsistent state. 
There is a small propagation delay (depending on network bandwidth) between 
the leader of the ZooKeeper quorum and the other servers in the ensemble. 
Consider the case where a follower undergoes the following execution sequence: 
      - The follower receives a JobModel version change notification from the 
ZooKeeper server on which the standalone leader processor made the JobModel 
version update. 
      - The follower tries to read the JobModel, but receives a session 
disconnect from the up-to-date ZooKeeper server. The I0Itec ZkClient library 
retries connecting to other servers in the ensemble and successfully 
establishes a connection to an outdated ZooKeeper server. The read of the 
JobModel from this outdated ZooKeeper server returns null for the new 
JobModel ZooKeeper path, thereby killing the standalone processor. 
       
   
   This patch solves the problem by adding a fixed number of retries to the 
metadata-store read path. 
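
   A minimal sketch of the idea, not the code in this patch: a read that 
returns null is retried a fixed number of times with a short pause, giving a 
lagging ZooKeeper server time to catch up. The class name `RetryingReader`, 
the method `readWithRetries`, and the attempt count and delay are all 
hypothetical and chosen only for illustration. 

```java
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not the class added by this patch. */
final class RetryingReader {
    private RetryingReader() {}

    /**
     * Calls {@code read} up to {@code maxAttempts} times, sleeping
     * {@code delayMs} between attempts, and returns the first non-null result.
     * Throws IllegalStateException if every attempt returns null.
     */
    static <T> T readWithRetries(Supplier<T> read, int maxAttempts, long delayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T value = read.get();
            if (value != null) {
                return value; // the server we are connected to has the data
            }
            if (attempt < maxAttempts) {
                try {
                    // Fixed pause so an out-of-date server can catch up.
                    Thread.sleep(delayMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    throw new IllegalStateException("Interrupted while waiting to retry read", e);
                }
            }
        }
        throw new IllegalStateException("Read returned null after " + maxAttempts + " attempts");
    }
}
```

   A caller would wrap the existing metadata-store read in the supplier, for 
example `RetryingReader.readWithRetries(() -> store.get(key), 3, 100L)`, where 
`store` and `key` stand in for whatever the read path already uses. 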
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
