> On 2011-01-09 06:48:15, fpj wrote:
> > trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java,
> >  line 336
> > <https://reviews.apache.org/r/240/diff/1/?file=9407#file9407line336>
> >
> >     If this error is fatal, then I was wondering if we shouldn't abort 
> > lookForLeader() and perhaps even propagate it up so that we kill the peer. 
> > What do you think?
> 
> Vishal Kher wrote:
>     Looking at the code, this error case should never happen. But it would be 
> good to propagate fatal errors back. How do you propose to propagate the error 
> and kill the peer?
> 
> fpj wrote:
>     One option is to throw an exception from lookForLeader, catch it in the 
> main loop in QuorumPeer, shut down the peer in the catch block, and exit the 
> main thread. The main difficulty is propagating the error to lookForLeader. 
> The only option I see is propagating it through a special message that FLE 
> receives through the recvQueue.
> 
> fpj wrote:
>     I don't know if you have made any progress here, Vishal, but if you 
> haven't, perhaps we should consider having a separate JIRA for it. I've been 
> thinking that this is a more general problem with error handling between QCM 
> and FLE. How does that sound?

Sounds good to me. Can I get a "ship it" if you are OK with the rest of the code?
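
For whoever picks up the separate JIRA, here is a rough sketch of the idea fpj describes above: the connection manager puts a special "fatal" message on the receive queue, lookForLeader throws when it sees it, and the main loop catches the exception and shuts the peer down. This is only an illustration, not code from the patch. The class name, Message, FatalElectionException, reportFatal and runOnce are all hypothetical, and the queue is a stand-in for the real recvQueue.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class FatalErrorPropagationSketch {
    /** Hypothetical message type; the real FLE notification class differs. */
    static final class Message {
        final boolean fatal;
        final String detail;
        Message(boolean fatal, String detail) {
            this.fatal = fatal;
            this.detail = detail;
        }
    }

    /** Hypothetical exception meaning the election cannot continue. */
    static final class FatalElectionException extends Exception {
        FatalElectionException(String msg) {
            super(msg);
        }
    }

    // Stand-in for the queue shared by the connection manager and the election.
    static final LinkedBlockingQueue<Message> recvQueue = new LinkedBlockingQueue<>();

    /** Connection-manager side: report a fatal error as a special message. */
    static void reportFatal(String detail) {
        recvQueue.offer(new Message(true, detail));
    }

    /** Election side: drain the queue and abort when the special message arrives. */
    static String lookForLeader() throws FatalElectionException, InterruptedException {
        while (true) {
            Message m = recvQueue.poll(200, TimeUnit.MILLISECONDS);
            if (m == null) {
                return "no-decision"; // sketch only: stop once the queue is idle
            }
            if (m.fatal) {
                throw new FatalElectionException(m.detail);
            }
            // Normal vote processing would happen here.
        }
    }

    /** Main-loop side: returns false when the peer should shut down and exit. */
    static boolean runOnce() throws InterruptedException {
        try {
            lookForLeader();
            return true;
        } catch (FatalElectionException e) {
            // The peer shutdown would go here, before the main thread exits.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        recvQueue.offer(new Message(false, "vote"));
        reportFatal("fatal error in connection manager");
        System.out.println(runOnce() ? "peer still running" : "peer shut down");
    }
}
```

Using the existing queue avoids adding a second channel between QCM and FLE, at the cost of every consumer of the queue having to recognize the special message.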


- Vishal


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/240/#review95
-----------------------------------------------------------


On 2011-01-17 03:55:41, Vishal Kher wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/240/
> -----------------------------------------------------------
> 
> (Updated 2011-01-17 03:55:41)
> 
> 
> Review request for zookeeper and fpj.
> 
> 
> Summary
> -------
> 
> QuorumCnxManager performed blocking socket IO in a few places. As a result, 
> QCM on a peer could block forever, which would prevent other peers from 
> connecting to the blocked peer.
> If the peer happens to be the leader, then it will block new peers from 
> becoming followers.
> 
> I have made changes as per ZOOKEEPER-932.
> 
> 
> This addresses bug ZOOKEEPER-932.
>     https://issues.apache.org/jira/browse/ZOOKEEPER-932
> 
> 
> Diffs
> -----
> 
>   trunk/src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java 1040328 
>   trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java 1040328 
>   trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java 1040328 
>   trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 1040328 
>   trunk/src/java/test/org/apache/zookeeper/test/CnxManagerTest.java 1040328 
> 
> Diff: https://reviews.apache.org/r/240/diff
> 
> 
> Testing
> -------
> 
> - ant test-core-java
> - systest
> - basic hand testing
> - rebooted follower/leader several times
> 
> 
> Thanks,
> 
> Vishal
> 
>
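
For readers unfamiliar with the blocking-IO problem described in the summary above, the sketch below shows one common way to bound socket operations in Java: a connect timeout plus a read timeout (SO_TIMEOUT). It is an illustration only and is not taken from the patch. The class name, the readWithTimeout helper and the timeout values are assumptions made for the example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class BoundedSocketIoSketch {
    /**
     * Connects and reads one byte, both with upper bounds on waiting time.
     * Returns true if a byte was read, false on read timeout or end of stream.
     */
    static boolean readWithTimeout(InetSocketAddress addr, int connectMs, int readMs)
            throws IOException {
        try (Socket s = new Socket()) {
            s.connect(addr, connectMs); // bounded connect instead of blocking forever
            s.setSoTimeout(readMs);     // bounded read: read() throws after readMs
            try {
                return s.getInputStream().read() != -1;
            } catch (SocketTimeoutException e) {
                return false;           // the remote side never sent anything
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // A listening socket that never accepts or writes, simulating a stuck peer.
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            InetSocketAddress addr =
                    new InetSocketAddress(silent.getInetAddress(), silent.getLocalPort());
            System.out.println(readWithTimeout(addr, 1000, 300) ? "got data" : "read timed out");
        }
    }
}
```

Without the two timeouts, a read from a silent remote side can wait indefinitely, which is the kind of stall the summary describes.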
