[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-10-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918878#action_12918878
 ] 

Hudson commented on ZOOKEEPER-822:
--

Integrated in ZooKeeper-trunk #959 (See 
[https://hudson.apache.org/hudson/job/ZooKeeper-trunk/959/])
ZOOKEEPER-822. Leader election taking a long time to complete


> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-28 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915796#action_12915796
 ] 

Benjamin Reed commented on ZOOKEEPER-822:
-

looks good overall flavio. just a quick questions: i notice that operations on 
senderWorkerMap in initiateConnection are not synchronized. senderWorkerMap is 
concurrent, but there could be a race between the get, put, and vsw.finish if 
initiateConnection is called concurrently for the same sid. right?

also you need to add a blurb to the config doc for the timeout system variable, 
which should be "zookeeper.cnxtimeout" so that it can be set from the 
configuration file.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-21 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913307#action_12913307
 ] 

Vishal K commented on ZOOKEEPER-822:


Hi Flavio,

+1. Looks good. I remember looking at the socket.connect() method, but I don't 
remember why I ruled it out in the favor of thread.
Minor point - missing space before "error" in LOG.warn("Connection broken: for 
id " + sid + "my id = " + self.getId() + "error..).

Thank you.

-Vishal



> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-20 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12912483#action_12912483
 ] 

Vishal K commented on ZOOKEEPER-822:


Hi Flavio,

Thanks. I will take a look at the patches.

-Vishal

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-14 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909592#action_12909592
 ] 

Mahadev konar commented on ZOOKEEPER-822:
-

vishal,
 I was expecting some commitment from you for making it use a selector :).

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-14 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909560#action_12909560
 ] 

Vishal K commented on ZOOKEEPER-822:


I agree with Mahadev. 

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-14 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909558#action_12909558
 ] 

Mahadev konar commented on ZOOKEEPER-822:
-

visha, flavio,
  If there is just one thread running at one point in time, then its ok. Also, 
I am really worried about the code structure in LeaderElection.java. Its ok to 
have a temporary fix, but it would be great to see some commitment from someone 
on doing it right in 3.4. 

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-14 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909556#action_12909556
 ] 

Vishal K commented on ZOOKEEPER-822:


Hi Flavio,

As I mentioned earlier, this is a temporary patch until the selector based 
approach (non-blocking IO)  is ready.

In general, what is the concern with the current fix? There will be only one 
thread running at a time. The thread just makes sure that we can bound the 
connection time. This patch is working well for us as a temporary fix. Apart 
from the overhead of starting a thread I don't see anything wrong with the fix.

Again, given that this bug is a blocker for us, we certainly cannot wait until 
the non-blocking implementation  is done and released.

Thanks.
-Vishal

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-14 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909434#action_12909434
 ] 

Flavio Junqueira commented on ZOOKEEPER-822:


Hi Vishal, I have taken a look at your patch. As I said before, it sounds good 
to me to make SocketChannel non-blocking, but I don't like very much the 
approach of creating one thread per connection attempt. Instead, I was thinking 
that we should try to use a selector. What do you think?

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-07 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906872#action_12906872
 ] 

Flavio Junqueira commented on ZOOKEEPER-822:


I'll have a look at it, Vishal. Thanks for posting it.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-07 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906854#action_12906854
 ] 

Vishal K commented on ZOOKEEPER-822:


Hi Flavio,

> I think we need some time to converge on problems and fixes. 

I don't think it would take a long time to converge. I think the patch that I 
attached is quite simple. After adding a new property for timeout we should be 
good to go.

> My understanding is that we want to have 3.3.2 out soon, and my feeling is 
> that this is not a blocker for 3.3.2 given Vishal's description and our 
> experience with the system so far, but it would be good to hear from Vishal.

>From our earlier email exchanges I have a feeling that in most cases FLE was 
>tested by restarting the ZooKeeper service (and not by rebooting/shutting down 
>the host). I am a bit concerned that enough time may not have been spent in 
>testing/reproducing this problem. In my opinion, this fix should go in 3.3.2. 
>I know for sure that we won't be able to use the next release as is without 
>this fix.

Thanks.
-Vishal

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-05 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906346#action_12906346
 ] 

Flavio Junqueira commented on ZOOKEEPER-822:


I think we need some time to converge on problems and fixes. My understanding 
is that we want to have 3.3.2 out soon, and my feeling is that this is not a 
blocker for 3.3.2 given Vishal's description and our experience with the system 
so far, but it would be good to hear from Vishal.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-03 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905836#action_12905836
 ] 

Flavio Junqueira commented on ZOOKEEPER-822:


{quote}
1. Blocking connects and accepts:
You are right, when the node is down TCP timeouts rule.

a) The first problem is in manager.toSend(). This invokes connectOne(), which 
does a blocking connect. While testing, I changed the code so that connectOne() 
starts a new thread called AsyncConnct(). AsyncConnect.run() does a 
socketChannel.connect(). After starting AsyncConnect, connectOne starts a 
timer. connectOne continues with normal operations if the connection is 
established before the timer expires, otherwise, when the timer expires it 
interrupts AsyncConnect() thread and returns. In this way, I can have an upper 
bound on the amount of time we need to wait for connect to succeed. Of course, 
this was a quick fix for my testing. Ideally, we should use Selector to do 
non-blocking connects/accepts. I am planning to do that later once we at least 
have a quick fix for the problem and consensus from others for the real fix 
(this problem is big blocker for us). Note that it is OK to do blocking IO in 
SenderWorker and RecvWorker threads since they block IO to the respective pe!
 er.
{quote}

As I commented before, it might be ok to make it asynchronous, especially if we 
have a way of checking that there is an attempt to establish a connection in 
progress. 
I'm also still intrigued about why this is a problem for you. I haven't seen 
any of this being a problem before, which of course doesn't mean we shouldn't 
fix it. It would be nice to understand what's special about your setup or if 
others have seen similar problems and I missed the reports.

{quote}
b) The blocking IO problem is not just restricted to connectOne(), but also in 
receiveConnection(). The Listener thread calls receiveConnection() for each 
incoming connection request. receiveConnection does blocking IO to get peer's 
info (s.read(msgBuffer)). Worse, it invokes connectOne() back to the peer that 
had sent the connection request. All of this is happening from the Listener. In 
short, if a peer fails after initiating a connection, the Listener thread won't 
be able to accept connections from other peers, because it would be stuck in 
read() or connetOne(). Also the code has an inherent cycle. 
initiateConnection() and receiveConnection() will have to be very carefully 
synchronized otherwise, we could run into deadlocks. This code is going to be 
difficult to maintain/modify.
{quote}

If I remember correctly, we currently synchronize connectOne and make all 
connection establishments through connectOne so that we make sure that we do 
one at a time. My understanding is that this should reduce the number of rounds 
of attempts to establish connections, perhaps at the cost of a longer delay in 
some runs. 

{quote}
2. Buggy senderWorkerMap handling:
The code that manages senderWorkerMap is very buggy. It is causing multiple 
election rounds. While debugging I found that sometimes after FLE a node will 
have its sendWorkerMap empty even if it has SenderWorker and RecvWorker threads 
for each peer.
{quote}

I don't think that having multiple rounds is bad; in fact, I think it is 
unavoidable using reasonable timeout values. The second part, however, sounds 
like a problem we should fix. 

{quote}
a) The receiveConnection() method calls the finish() method, which removes an 
entry from the map. Additionally, the thread itself calls finish() which could 
remove the newly added entry from the map. In short, receiveConnection is 
causing the exact condition that you mentioned above.
{quote}

I thought that we were increasing the intervals between notifications, and if 
so I believe the case you mention above should not happen more than a few 
times. Now, to fix it, it sounds like we need to check that the finish call is 
removing the correct object in sendWorkerMap. That is, obj.finish() should 
remove obj and do nothing if the SendWorker object in sendWorkerMap is a 
different one. What do you think?

{quote}
b) Apart from the bug in finish(), receiveConnection is making an entry in 
senderWorkerMap at the wrong place. Here's the buggy code:
SendWorker vsw = senderWorkerMap.get(sid);
senderWorkerMap.put(sid, sw);
if(vsw != null)
vsw.finish();
It makes an entry for the new thread and then calls finish, which causes the 
new thread to be removed from the Map. The old thread will also get terminated 
since finish() will interrupt the thread.
{quote}

See my comment above. Perhaps I should wait to see your proposed modifications, 
but I wonder if works to check that we are removing the correct SendWorker 
object.

{quote}
3. Race condition in receiveConnection and initiateConnection:

In theory, two peers can keep disconnecting each other's connection.

Example:
T0: Peer 0 initia

[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-02 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905528#action_12905528
 ] 

Vishal K commented on ZOOKEEPER-822:


Hi Flavio,

I was planning to send out a mail explaining the problems in the FLE 
implementation that I have found so far. For now, I will put the info here. We 
can create new JIRAs if needed. I am waiting to hear back from our legal 
department to resolve copyright issues so that I can share my fixes as well.

1. Blocking connects and accepts:
You are right, when the node is down TCP timeouts rule.

a) The first problem is in manager.toSend(). This invokes connectOne(), which 
does a blocking connect. While testing, I changed the code so that connectOne() 
starts a new thread called AsyncConnct(). AsyncConnect.run() does a 
socketChannel.connect(). After starting AsyncConnect, connectOne starts a 
timer. connectOne continues with normal operations if the connection is 
established before the timer expires, otherwise, when the timer expires it 
interrupts AsyncConnect() thread and returns. In this way, I can have an upper 
bound on the amount of time we need to wait for connect to succeed. Of course, 
this was a quick fix for my testing. Ideally, we should use Selector to do 
non-blocking connects/accepts. I am planning to do that later once we at least 
have a quick fix for the problem and consensus from others for the real fix 
(this problem is big blocker for us). Note that it is OK to do blocking IO in 
SenderWorker and RecvWorker threads since they block IO to the respective pe!
 er.

b) The blocking IO problem is not just restricted to connectOne(), but also in 
receiveConnection(). The Listener thread calls receiveConnection() for each 
incoming connection request. receiveConnection does blocking IO to get peer's 
info (s.read(msgBuffer)). Worse, it invokes connectOne() back to the peer that 
had sent the connection request. All of this is happening from the Listener. In 
short, if a peer fails after initiating a connection, the Listener thread won't 
be able to accept connections from other peers, because it would be stuck in 
read() or connetOne(). Also the code has an inherent cycle. 
initiateConnection() and receiveConnection() will have to be very carefully 
synchronized otherwise, we could run into deadlocks. This code is going to be 
difficult to maintain/modify.

2. Buggy senderWorkerMap handling:
The code that manages senderWorkerMap is very buggy. It is causing multiple 
election rounds. While debugging I found that sometimes after FLE a node will 
have its sendWorkerMap empty even if it has SenderWorker and RecvWorker threads 
for each peer.

a) The receiveConnection() method calls the finish() method, which removes an 
entry from the map. Additionally, the thread itself calls finish() which could 
remove the newly added entry from the map. In short, receiveConnection is 
causing the exact condition that you mentioned above.

b) Apart from the bug in finish(), receiveConnection is making an entry in 
senderWorkerMap at the wrong place. Here's the buggy code:
SendWorker vsw = senderWorkerMap.get(sid);
senderWorkerMap.put(sid, sw);
if(vsw != null)
vsw.finish();
It makes an entry for the new thread and then calls finish, which causes the 
new thread to be removed from the Map. The old thread will also get terminated 
since finish() will interrupt the thread.

3. Race condition in receiveConnection and initiateConnection:

*In theory*, two peers can keep disconnecting each other's connection.

Example:
T0: Peer 0 initiates a connection (request 1)

   T1: Peer 1 receives connection from peer 0

   T2: Peer 1 calls receiveConnection()
T2: Peer 0 closes connection to Peer 1 because its ID is lower.
T3: Peer 0 re-initiates connection to Peer 1 from manger.toSend() (request 2)
T3: Peer 1 terminates older connection to peer 0
T4: Peer 1 calls connectOne() which starts new sendWorker threads for peer 0
T5: Peer 1 kills connection created in T3 because it receives another (request 
2) connect request from 0

The problem here is that while Peer 0 is accepting a connection from Peer 1 it 
can also be initiating a connection to Peer 1. So if they hit the right 
frequencies they could sit in a connect/disconnect loop and cause multiple 
rounds of leader election.

I think the cause here is again blocking connects()/accepts(). A peer starts to 
take action (to kill existing threads and start new threads) as soon as a 
connection is established at the *TCP level*. That is, it does not give us any 
control to synchronized connect and accepts. We could use non-blocking connects 
and accepts. This will allow us to a) tell a thre

[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-31 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904684#action_12904684
 ] 

Flavio Junqueira commented on ZOOKEEPER-822:


Hi VIshal, Good catches:

1- It sounds right that blocking the connection establishment might increase 
the time to election unnecessarily when the other party is not up. Here is my 
interpretation. If the machine is up but the the zk server is not running, then 
we simply get a connection failure and move on. The same doesn't happen when 
the the machine is down, since we need to wait for the connection establishment 
to time out;
2- It sounds right that a connection can be dropped erroneously due to a race, 
but I don't see in which case it can cause the election time to increase 
substantially, unless the race is triggered multiple times in a row. A server 
will try to connect upon every new notification, and a server only calls 
SendWorker.finish() in receiveNotification if it has a higher identifier. In 
this case, it creates a new connection immediately after, so it would need a 
previous connection being dropped right before to have the case you're 
describing;
3- Servers with higher identifiers decline connection requests from servers 
with lower identifiers; it is part of the protocol. Is this what you're 
referring to?


> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-26 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902988#action_12902988
 ] 

Vishal K commented on ZOOKEEPER-822:


The fix for problem 1 and 2 above eliminates the bug. I will have a patch out 
soon.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-20 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900898#action_12900898
 ] 

Vishal K commented on ZOOKEEPER-822:


Correction:
2. SendWorker.run() calls finish at the end. This could result in finish() 
getting called twice (e.g., finish called from receiveConnection), thus, 
causing senderWorkerMap.remove(sid) called twice and removing an entry that 
should *not* be removed.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-19 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900401#action_12900401
 ] 

Vishal K commented on ZOOKEEPER-822:


Hi Flavio,

Ah! my trunk is quite old then. But I don't think it is necessary to run with 
the latest code for debugging this issue.

I have identified one problem in WorkerSender.process(). This function calls 
manager.toSend() whicih calls connectOne. connectOne does a blocking connect 
(which takes order of minutes to return if a node is down). Thus, 
WorkerSender.run() will  block and not send any successive notifications to 
other nodes.

Let met know what you think

I tired with adding timeouts to connectOne, but I am running into similar issue 
somewhere else. So that didnt fix the problem





> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-18 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900165#action_12900165
 ] 

Flavio Junqueira commented on ZOOKEEPER-822:


Vishal, You don't seem to be using trunk code. The current trunk code would 
report notifications using the following format when report level info is 
enabled:

{noformat}
LOG.info("Notification: " + n.leader + " (n.leader), " + n.zxid +
" (n.zxid), " + n.epoch + " (n.round), " + n.state +
" (n.state), " + n.sid + " (n.sid), " + self.getPeerState() +
" (my state)");
{noformat}

And I'm seeing the following in the excerpt above:

{noformat}
Notification: 0, 34359738368, 4, 0, LOOKING, LOOKING, 0
{noformat}

Also, it would be great if we could use loggraph to visualize what is going on 
in your situation. 

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-18 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900111#action_12900111
 ] 

Vishal K commented on ZOOKEEPER-822:


at line 852 in 10.17.119.101-zookeeper.log WorkSender finally finds something 
in sendqueue and starts sending the notification to server 1.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-18 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900108#action_12900108
 ] 

Vishal K commented on ZOOKEEPER-822:


I am suspecting that one of the node (10.17.119.101) is not sending the 
notification to the other node. sendNotifications() is called to send 
notification to all peers. This functions enteres the notification in 
sendqueue. However, either the entry was not put in the queue (sendqueue.offer 
failed) or the thread that polls sendqueue did not wake up. I am not sure what 
the cause is yet.

I had added extra debug messages. Three messages are of main interest:

1. in sendNotifications():  Print "IN FLE sending notification to server id = 
1" for each server. Also print  "proposedLeader, proposedZxid, logicalclock"
2. In FastLeaderElection.lookForLeader() print "Updating proposa" before 
calling upgradeProposal if (totalOrderPredicate(n.leader, n.zxid, proposedZxid) 
is true
3. in WorkerSender.process(), log -  LOG.info("WorkSender.process() QUEUEING 
m.state= " + m.state + " m.leader=" + m.leader + " m.sid=" + m.sid);

Suppporting log entries from 10.17.119.101-zookeeper.log.  I  have added 
description inline..
--
2010-08-18 14:53:56,451 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@475] - IN FLE sending 
notification to server id = 1
2010-08-18 14:53:56,451 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@476] - proposedLeader, 
proposedZxid, logicalclock 0343597383684
2010-08-18 14:53:56,452 - INFO  [WorkerSender 
Thread:fastleaderelection$messenger$workersen...@352] - WorkSender.process() 
QUEUEING m.state= LOOKING m.leader=0 m.sid=1
2010-08-18 14:53:56,452 - DEBUG [WorkerSender Thread:quorumcnxmana...@347] - 
Opening channel to server 1
2010-08-18 14:53:56,453 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@475] - IN FLE sending 
notification to server id = 2
2010-08-18 14:53:56,453 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@476] - proposedLeader, 
proposedZxid, logicalclock 0343597383684
2010-08-18 14:53:56,453 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@690] - Notification: 1, 
34359738368, 4, 0, LOOKING, LOOKING, 1
2010-08-18 14:53:56,454 - DEBUG 
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@496] - id: 1, proposed id: 
0, zxid: 34359738368, proposed zxid: 34359738368
2010-08-18 14:53:56,454 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@710] - Updating proposal
2010-08-18 14:53:56,454 - INFO  [WorkerSender Thread:quorumcnxmana...@162] - 
Have smaller server identifier, so dropping the connection: (1, 0)
2010-08-18 14:53:56,455 - INFO  [WorkerSender 
Thread:fastleaderelection$messenger$workersen...@352] - WorkSender.process() 
QUEUEING m.state= LOOKING m.leader=0 m.sid=2
2010-08-18 14:53:56,458 - DEBUG [WorkerSender Thread:quorumcnxmana...@347] - 
Opening channel to server 2
2010-08-18 14:53:56,458 - WARN  [Thread-19:quorumcnxmanager$recvwor...@659] - 
Connection broken:
java.io.IOException: Channel eof
at 
org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:631)
2010-08-18 14:53:56,459 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@475] - IN FLE sending 
notification to server id = 0

2010-08-18 14:53:56,460 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@476] - proposedLeader, 
proposedZxid, logicalclock 1343597383684  

* The above line shows that this node (server 0) is going to vote for 1. 
see - proposedLeader, proposedZxid, logicalclock 1 34359738368 4  Forgot to add 
spaces in the message  :-) 


2010-08-18 14:53:56,460 - DEBUG [Thread-1:quorumcnxmanager$liste...@446] - 
Connection request /10.17.119.102:41597
2010-08-18 14:53:56,461 - DEBUG [Thread-1:quorumcnxmanager$liste...@449] - 
Connection request: 0
2010-08-18 14:53:56,461 - DEBUG [Thread-1:quorumcnxmanager$sendwor...@505] - 
Address of remote peer: 1
2010-08-18 14:53:56,461 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@475] - IN FLE sending 
notification to server id = 1
2010-08-18 14:53:56,462 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@476] - proposedLeader, 
proposedZxid, logicalclock 1343597383684 

* Above, server 0 queued a notification to be sent to server 1. The 
notfication is saying that it accepts 1 as the leader. But the notification 
never got sent. process() was not called at all from WorkerSender. Its almost 
as if the notification was never entered in sendqueue (in sendNotifications). 
*

2010-08-18 14:53:56,462 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@475] - IN FLE sending 
notification to server id = 2
2010-08-18 14:53:56,462 - INFO  
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@476] - proposedLeader, 
proposedZxid, logicalclock 1343597383684
2010-08-18 14:53:56,463 - DEBUG 
[QuorumPeer:/0:0:0:0:0:0:0:

[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-18 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900010#action_12900010
 ] 

Flavio Junqueira commented on ZOOKEEPER-822:


Hi Vishal, Thanks for reporting. Here are some quick comments:

Issue 1: I think that just the javadoc message is incorrect. We really just 
want to check that some process has received notifications.

Issue2: The connection will eventually timeout if not established, so setting a 
different value should not make a difference. The point about blocking connect 
is a good one. I think it is worthwhile creating a jira for it.  

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-18 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899983#action_12899983
 ] 

Vishal K commented on ZOOKEEPER-822:


While going through the code yesterday,  I found two potential problems that I 
though might be worth reporting in the context of this bug.

1. In FastLeaderElection.java
/**
 * Check if all queues are empty, indicating that all messages have been 
delivered.
 */
boolean haveDelivered() {
for (ArrayBlockingQueue queue : queueSendMap.values()) {
LOG.debug("Queue size: " + queue.size());
if (queue.size() == 0)
return true;
}

return false;
}

the haveDelivered()  function returns true without checking if rest of the 
queus are empty.

2. QuorumCnxManager.connectAll() function connects to one peer at a time and it 
uses a blocking connect (SocketChannle.open). I added a timeout to the 
SocketChannel.open and that did not fix the problem. 

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-16 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899156#action_12899156
 ] 

Vishal K commented on ZOOKEEPER-822:


Hi Ivan,

Can you describe me your setup?

My setup info:
- 3 ESX boxes
- 1 SLES 11 VMs on each
- Cluster of 3 nodes

I hit this problem consistently after rebooting the leader.

Thanks.
-Vishal


> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, test_zookeeper_1.log, test_zookeeper_2.log, 
> zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-10 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896983#action_12896983
 ] 

Ivan Kelly commented on ZOOKEEPER-822:
--

Actually, ignore that. Read it wrong.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, test_zookeeper_1.log, test_zookeeper_2.log, 
> zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-10 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896980#action_12896980
 ] 

Ivan Kelly commented on ZOOKEEPER-822:
--

Could this be related?

https://issues.apache.org/jira/browse/ZOOKEEPER-785

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, test_zookeeper_1.log, test_zookeeper_2.log, 
> zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-06 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896063#action_12896063
 ] 

Ivan Kelly commented on ZOOKEEPER-822:
--

They logs were of some help, but I don't understand what's happening. I looks 
like multiple nodes are claiming leadership at the same time, but that can't be 
right.

The FLE changes won't fix it, but they do log more information, so they may 
make it easier to see what is happening.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, test_zookeeper_1.log, test_zookeeper_2.log, 
> zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-06 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896061#action_12896061
 ] 

Vishal K commented on ZOOKEEPER-822:


Hi Ivan.

Were the logs of any help? I might be worth having 3 VMs and rebooting the 
leader instead of shutting down the interface. We have seen this on all of our 
dev cluster. Al tough all the dev clusters are based on same VM images. So it 
won't be fair to claim that the problem was seen on different set of machines. 
I will try with the latest trunk and let you know the result. What FLE changes 
do you think would have fixed this problem?

Thanks.

-Vishal




> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, test_zookeeper_1.log, test_zookeeper_2.log, 
> zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-06 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896043#action_12896043
 ] 

Ivan Kelly commented on ZOOKEEPER-822:
--

I've tried to repro this with 3 zookeepers running on the same machine, and 3 
zookeepers running on virtual machine and I cannot get it to repro. I was 
taking out the leader by shutting down the network interface. Have you been 
able to repro this on another set of machines other than the ones you first 
observed it on?

Also, could you try this with the latest trunk? Some improvements were made 
around the FLE which may shed some more light.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, test_zookeeper_1.log, test_zookeeper_2.log, 
> zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-08-02 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894576#action_12894576
 ] 

Vishal K commented on ZOOKEEPER-822:



I have attached new logs. I don't use ntp, but all l the nodes should be at the 
most a few seconds apart. 

I have marked start and end of the faulty election. look at  
zookeeper-192.168.10.3-log and search for "vishal", 

Note - it is super easy to reproduce the bug. Create a 3 node cluster and 
reboot the leader (or shutdown the network interface). You may need to repeat 
the test several times. 

If you do a clean shutdown of the leader (zkServer.sh stop), then you won't see 
this bug. I feel that there is something releated to TCP timeout/ session 
management of failed node that is causing this problem.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: 822.tar.gz, test_zookeeper_1.log, test_zookeeper_2.log, 
> zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-07-28 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893121#action_12893121
 ] 

Ivan Kelly commented on ZOOKEEPER-822:
--

currently the timestamp is used as the zxid isn't always available in the 
message logs. But yes, zxid would be more desirable. Perhaps I can extract that 
from the context some other way. 



> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: test_zookeeper_1.log, test_zookeeper_2.log, 
> zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-07-27 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892830#action_12892830
 ] 

Patrick Hunt commented on ZOOKEEPER-822:


Ivan, do the clocks need to be in sync? Perhaps you should use the xid (cxid) 
instead?

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: test_zookeeper_1.log, test_zookeeper_2.log, 
> zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-07-27 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892761#action_12892761
 ] 

Ivan Kelly commented on ZOOKEEPER-822:
--

Hi Vishal,

The logs in zk_leader_election.tar.gz seem to be from different runs. node0 
starts at 2010-07-22 17:33:54,166, node1 at 2010-07-22 22:21:11,979 and node2 
at 2010-07-22 22:22:17,249. Are the clocks on the machine in sync?

-Ivan

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: test_zookeeper_1.log, test_zookeeper_2.log, 
> zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-07-23 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891582#action_12891582
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-822:
--

Thanks for the logs, Vishal. The jira discussing loggraph is ZOOKEEPER-773.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: test_zookeeper_1.log, test_zookeeper_2.log, 
> zk_leader_election.tar.gz
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-07-20 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890199#action_12890199
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-822:
--

Hi Vishal, Do you think you can uploaded all three log files for a problematic 
run? We would like to put it on loggraph to visualize what's going on there. It 
sounds like it is somehow related to the VM reboots, I don't know why yet.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: test_zookeeper_1.log, test_zookeeper_2.log
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-07-19 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890052#action_12890052
 ] 

Vishal K commented on ZOOKEEPER-822:


Hi Flavio,

I have Zookeeper servers running in a VM. To kill ZK server, I power off a VM. 
On the other hand, I tried several times killing ZK process and restarting it 
and I did not see any issues.
So there is something about the reboot that is causing this problem (TCP 
session not getting cleaned-up?).

I don't see many connection exceptions in the log.

Once the leader election starts  we start seeing "Notification time out" 
messages.

However, before this we do see that the connection was established (show below):

2010-07-19 14:40:52,562 - DEBUG [WorkerSender Thread:quorumcnxmana...@366] - 
There is a connection already for server 0
2010-07-19 14:40:52,563 - DEBUG [WorkerSender Thread:quorumcnxmana...@346] - 
Opening channel to server 2

Do you still think this is a communication problem?

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: test_zookeeper_1.log, test_zookeeper_2.log
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-07-19 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889906#action_12889906
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-822:
--

Vishal, I can't reproduce your problem. I just tried twice to kill the leader 
and rejoin it 20 times each, and I can't see the problem you're mentioning.  I 
wonder if there is anything special about your setup. I also can see in your 
logs lots of exceptions related to connections, and as a first cut, it sounds 
like this is preventing the severs from exchanging notifications, and therefore 
the delay. 

Two minor comments: your log file for server 2 does not contain "START HERE" 
and each file duplicates every message.



> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: test_zookeeper_1.log, test_zookeeper_2.log
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-07-19 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889907#action_12889907
 ] 

Ivan Kelly commented on ZOOKEEPER-822:
--

Could you try putting the logs through loggraph (in zookeeper/src/contrib)? 
Perhaps a graphical view will give some insight?

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: test_zookeeper_1.log, test_zookeeper_2.log
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-07-19 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889902#action_12889902
 ] 

Vishal K commented on ZOOKEEPER-822:


I would like that add that the problem is highly reproducible.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Priority: Blocker
> Attachments: test_zookeeper_1.log, test_zookeeper_2.log
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere 25- 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.