[jira] Commented: (ZOOKEEPER-869) Support for election of leader with arbitrary zxid

2010-09-17 Thread Diogo (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910531#action_12910531
 ] 

Diogo commented on ZOOKEEPER-869:
-

While trying to implement this, I found an interesting issue. Say we have an 
ensemble with 3 nodes, we start all of them together, and their state is 
synchronized, meaning that all replicas return the same value from 
ZKDatabase().getLastLoggedZxid(). It seems that the leader will still send a 
snapshot to all followers, although that is not necessary: they need no state 
transfer.

The leader (quorum/Leader.java:283) reads its lastLoggedZxid(), increments the 
epoch, and stores the result as lastProposed. In LearnerHandler.java:308 the 
handler thread decides whether the replica needs an empty DIFF or a SNAP (I am 
assuming the system state described above). But startForwarding returns 
lastProposed, which is necessarily larger than any other zxid, so SNAP is 
selected and sent.
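
The effect can be sketched as follows. This is a simplified Python model, not 
the LearnerHandler code: it assumes the usual zxid layout (epoch in the high 32 
bits, counter in the low 32 bits), and the names next_epoch_zxid and 
choose_sync are hypothetical.

```python
EPOCH_SHIFT = 32  # assumed layout: high 32 bits = epoch, low 32 bits = counter


def next_epoch_zxid(last_logged_zxid):
    """Return the first zxid of the epoch after the one in last_logged_zxid."""
    epoch = last_logged_zxid >> EPOCH_SHIFT
    return (epoch + 1) << EPOCH_SHIFT


def choose_sync(peer_last_zxid, leader_zxid):
    """Naive rule: an empty DIFF only when the zxids are equal, else SNAP."""
    return "DIFF" if peer_last_zxid == leader_zxid else "SNAP"


# Leader and follower both logged the same last transaction.
last_logged = (3 << EPOCH_SHIFT) | 7
last_proposed = next_epoch_zxid(last_logged)

print(choose_sync(last_logged, last_proposed))  # SNAP, although the states match
print(choose_sync(last_logged, last_logged))    # DIFF when the pre-bump zxid is compared
```

Because the epoch bump happens before the comparison, no follower zxid can 
ever equal the leader's value in this model.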

Here is part of the log output from a run where 2 replicas have the same state 
stored and one is behind.

2010-09-17 12:11:27,296 [myid:3] - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2183:files...@82] - Reading snapshot /tmp/zoo3/version-2/snapshot.7
2010-09-17 12:11:27,298 [myid:3] - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2183:files...@82] - Reading snapshot /tmp/zoo3/version-2/snapshot.7
2010-09-17 12:11:27,301 [myid:3] - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2183:filetxnsnap...@208] - Snapshotting: 7
2010-09-17 12:11:27,303 [myid:3] - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2183:lea...@285] - lastLoggedZxid = 7 lastProposed = 8   -- added line just after leader sets its lastProposed
2010-09-17 12:11:27,309 [myid:3] - INFO  [LearnerHandler-/127.0.0.1:48318:learnerhand...@247] - Follower sid: 1 : info : org.apache.zookeeper.server.quorum.quorumpeer$quorumser...@12d3205
2010-09-17 12:11:27,310 [myid:3] - WARN  [LearnerHandler-/127.0.0.1:48318:learnerhand...@326] - Sending snapshot last zxid of peer is 0x7  zxid of leader is 0x8   -- snapshot being sent!
2010-09-17 12:11:27,312 [myid:3] - WARN  [LearnerHandler-/127.0.0.1:48318:lea...@474] - Commiting zxid 0x8 from /127.0.0.1:2890 not first!
2010-09-17 12:11:27,313 [myid:3] - WARN  [LearnerHandler-/127.0.0.1:48318:lea...@476] - First is 0
2010-09-17 12:11:27,313 [myid:3] - INFO  [LearnerHandler-/127.0.0.1:48318:lea...@500] - Have quorum of supporters; starting up and setting last processed zxid: 34359738368
2010-09-17 12:11:28,290 [myid:3] - INFO  [LearnerHandler-/127.0.0.1:48319:learnerhand...@247] - Follower sid: 2 : info : org.apache.zookeeper.server.quorum.quorumpeer$quorumser...@1319c
2010-09-17 12:11:28,291 [myid:3] - WARN  [LearnerHandler-/127.0.0.1:48319:learnerhand...@326] - Sending snapshot last zxid of peer is 0x6  zxid of leader is 0x8   -- this follower needs the snapshot.


Am I understanding something wrong?

 Support for election of leader with arbitrary zxid
 --

 Key: ZOOKEEPER-869
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-869
 Project: Zookeeper
  Issue Type: New Feature
Reporter: Diogo
Priority: Minor

 Currently, the leader election algorithm implemented guarantees that the 
 leader has the maximum zxid of the ensemble. The state synchronization after 
 the election was built on this assumption. However, other leader 
 election algorithms might elect leaders with an arbitrary zxid. 
 To support other leader election algorithms, the state synchronization should 
 allow the leader to have an arbitrary zxid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-869) Support for election of leader with arbitrary zxid

2010-09-17 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910643#action_12910643
 ] 

Benjamin Reed commented on ZOOKEEPER-869:
-

this is a good observation diogo, but i think you may be characterizing it 
improperly. the problem is that when we do a leadership change we increment the 
epoch and propose a new leader, so the zxids of all other processes will be 
much lower than the leader's. when a follower connects we figure out how far 
behind the follower is by comparing the lastProposed zxids and taking the 
difference. we should really be using the recent history to do the comparison.
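
A comparison against recent history could look like the sketch below. It is an 
illustration only, written in Python: the function name, the sorted list of 
recently committed zxids, and the two-way DIFF/SNAP choice are all assumptions, 
not the actual leader code.

```python
def choose_sync_from_history(peer_last_zxid, committed_zxids):
    """Pick a sync mode from the leader's recent committed history.

    committed_zxids is a sorted list of recently committed zxids, a
    hypothetical stand-in for whatever history the leader keeps in memory.
    Returns (mode, zxids_to_send).
    """
    if not committed_zxids:
        return ("SNAP", [])  # no history to diff against
    oldest, newest = committed_zxids[0], committed_zxids[-1]
    if oldest <= peer_last_zxid <= newest:
        # The follower is inside the known window: send only what it lacks.
        missing = [z for z in committed_zxids if z > peer_last_zxid]
        return ("DIFF", missing)
    return ("SNAP", [])  # too far behind (or otherwise outside the window)


history = [5, 6, 7]
print(choose_sync_from_history(7, history))  # ('DIFF', [])
print(choose_sync_from_history(6, history))  # ('DIFF', [7])
print(choose_sync_from_history(2, history))  # ('SNAP', [])
```

With this rule a follower that is already up to date gets an empty DIFF, 
regardless of how the leader's epoch was changed after the election.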

as a side note, if we were to choose not to take the maximum zxid during 
recovery, we would need to make sure that we still cover all committed messages.




[jira] Created: (ZOOKEEPER-874) FileTxnSnapLog.restore does not call listener

2010-09-17 Thread Diogo (JIRA)
FileTxnSnapLog.restore does not call listener
-

 Key: ZOOKEEPER-874
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-874
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Reporter: Diogo
Priority: Trivial


FileTxnSnapLog.restore() does not call the listener passed as a parameter. As a 
result, the commitLogs list is empty. When a follower connects to the leader, 
the leader is forced to send a snapshot to the follower instead of a couple of 
requests and commits.
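
The intended behaviour can be sketched as below. This is a simplified Python 
model rather than the FileTxnSnapLog code or the attached patch: the function 
signature, the (zxid, txn) log format, and the callback names are assumptions 
made for illustration.

```python
def restore(snapshot_zxid, txn_log, apply_txn, listener):
    """Replay logged transactions that are newer than the snapshot.

    txn_log is an iterable of (zxid, txn) pairs in zxid order.
    Returns the highest zxid applied.
    """
    highest = snapshot_zxid
    for zxid, txn in txn_log:
        if zxid <= snapshot_zxid:
            continue  # already reflected in the snapshot
        apply_txn(zxid, txn)
        listener(zxid, txn)  # the step the report says is missing
        highest = zxid
    return highest


state = {}
committed_log = []
log = [(6, ("set", "/a", 1)), (7, ("set", "/b", 2)), (8, ("set", "/a", 3))]


def apply_txn(zxid, txn):
    _, path, value = txn
    state[path] = value


highest = restore(6, log, apply_txn, lambda zxid, txn: committed_log.append(zxid))
print(highest, committed_log)  # 8 [7, 8]
```

Without the listener call, committed_log would stay empty after the replay, 
which is the situation described above.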

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-874) FileTxnSnapLog.restore does not call listener

2010-09-17 Thread Diogo (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Diogo updated ZOOKEEPER-874:


   Status: Patch Available  (was: Open)
Affects Version/s: 3.3.1
Fix Version/s: 3.4.0




[jira] Updated: (ZOOKEEPER-874) FileTxnSnapLog.restore does not call listener

2010-09-17 Thread Diogo (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Diogo updated ZOOKEEPER-874:


Attachment: commitlog-listener.patch




build.xml from 3.3.1 distribution has version=3.3.2-dev

2010-09-17 Thread Dave Wright
We ended up chasing our tails for a while today because the ant build.xml
in the 3.3.1 distribution on the website has its version set to
3.3.2-dev. Any particular reason for that? The jar included in the
distribution is obviously labeled correctly, and the version number in
its manifest is 3.3.1, so it appears that the build.xml was modified
and bundled up for distribution after the binary was built. Not sure
if it's worth fixing in the distribution, but thought I'd at least
mention it.

-Dave Wright


Re: build.xml from 3.3.1 distribution has version=3.3.2-dev

2010-09-17 Thread Patrick Hunt
Hi Dave, while it may appear that way, it's not the case. When building a
release we run the following command:

ant -Dversion=3.3.1 ...

which overrides any setting in build.xml. This is documented in our release
process (we pretty much follow what hadoop does, although some of the maven
repo details are a bit different):
http://wiki.apache.org/hadoop/ZooKeeper/HowToRelease

Patrick
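
The override works because Ant properties are immutable once set, and values 
given with -D on the command line are set before the build file is read. The 
Python sketch below models only that "first definition wins" rule; the 
function name is hypothetical and the build file is reduced to a list of 
(name, value) pairs.

```python
def resolve_properties(command_line, build_file):
    """Model 'first definition wins': command-line values are set first,
    and later definitions of the same property name are ignored."""
    props = dict(command_line)
    for name, value in build_file:
        props.setdefault(name, value)
    return props


build_file = [("version", "3.3.2-dev")]
print(resolve_properties({"version": "3.3.1"}, build_file)["version"])  # 3.3.1
print(resolve_properties({}, build_file)["version"])                    # 3.3.2-dev
```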




[jira] Updated: (ZOOKEEPER-831) BookKeeper: Throttling improved for reads

2010-09-17 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-831:


Status: Resolved  (was: Patch Available)
Resolution: Fixed

Committed revision 998200.

thanks to flavio for the fix and to ivan for the reviews!

 BookKeeper: Throttling improved for reads
 -

 Key: ZOOKEEPER-831
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-831
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bookkeeper
Affects Versions: 3.3.1
Reporter: Flavio Junqueira
Assignee: Flavio Junqueira
 Fix For: 3.4.0

 Attachments: ZOOKEEPER-831.patch, ZOOKEEPER-831.patch, 
 ZOOKEEPER-831.patch, ZOOKEEPER-831.patch


 Reads and writes in BookKeeper are asymmetric: a write request writes one 
 entry, whereas a read request may read multiple entries. The current 
 implementation of throttling only counts the number of read requests instead 
 of counting the number of entries being read. Consequently, a few read 
 requests reading a large number of entries each will spawn a large number of 
 read-entry requests.
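
 Throttling on entries rather than on requests might look like the sketch 
 below. It is a minimal Python illustration, not the BookKeeper client code or 
 the committed patch; the class name, its methods, and the limit of 100 are 
 assumptions.

```python
import threading


class EntryThrottle:
    """Bounds the number of outstanding entries, not the number of requests."""

    def __init__(self, limit):
        self.limit = limit
        self.outstanding = 0
        self._lock = threading.Lock()

    def try_acquire(self, num_entries):
        """Reserve room for num_entries; return False if that exceeds the limit."""
        with self._lock:
            if self.outstanding + num_entries > self.limit:
                return False
            self.outstanding += num_entries
            return True

    def release(self, num_entries):
        """Give back capacity once the entries have been read."""
        with self._lock:
            self.outstanding -= num_entries


throttle = EntryThrottle(limit=100)
print(throttle.try_acquire(80))  # True: one read request for 80 entries
print(throttle.try_acquire(80))  # False: a second one would exceed 100 entries
throttle.release(80)
print(throttle.try_acquire(80))  # True again after the first read completes
```

 A request counter would have admitted both reads, since each counts as one 
 request regardless of how many entries it covers.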




[jira] Created: (ZOOKEEPER-875) ResponderThread and udpSocket should be moved from QuorumPeer to LeaderElection

2010-09-17 Thread Diogo (JIRA)
ResponderThread and udpSocket should be moved from QuorumPeer to LeaderElection
--

 Key: ZOOKEEPER-875
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-875
 Project: Zookeeper
  Issue Type: Improvement
  Components: leaderElection
Affects Versions: 3.3.1
Reporter: Diogo
Priority: Trivial


Part of the algorithm implemented in the LeaderElection class lives inside 
QuorumPeer. Is there any reason for that? ResponderThread and udpSocket belong 
to the LeaderElection class and should be moved into LeaderElection.java. That 
would make the code cleaner.




[jira] Commented: (ZOOKEEPER-794) Callbacks are not invoked when the client is closed

2010-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910684#action_12910684
 ] 

Hadoop QA commented on ZOOKEEPER-794:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12454780/ZOOKEEPER-794_4.patch.txt
  against trunk revision 998200.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/116/console

This message is automatically generated.

 Callbacks are not invoked when the client is closed
 ---

 Key: ZOOKEEPER-794
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-794
 Project: Zookeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.3.1
Reporter: Alexis Midon
Assignee: Alexis Midon
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-794.patch.txt, ZOOKEEPER-794.txt, 
 ZOOKEEPER-794_2.patch, ZOOKEEPER-794_3.patch, ZOOKEEPER-794_4.patch.txt


 I noticed that ZooKeeper behaves differently when synchronous and 
 asynchronous actions are called on a closed ZooKeeper client.
 A synchronous call throws a session expired exception, while an 
 asynchronous call does nothing: no exception, no callback invocation.
 In fact, even if the EventThread receives the Packet with the session 
 expired error code, the packet is never processed, since the thread has been 
 killed by the eventOfDeath. So the callback is not invoked.
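
 The failure mode can be reproduced with a small model. The Python sketch 
 below is a single-threaded simplification, not the ClientCnxn code: the 
 sentinel, the queue contents, and the function name are assumptions made for 
 illustration.

```python
import queue

EVENT_OF_DEATH = object()  # sentinel that stops the event loop


def run_event_loop(events, invoked):
    """Drain events until the sentinel; anything queued after it is never seen."""
    while True:
        event = events.get()
        if event is EVENT_OF_DEATH:
            return
        invoked.append(event)


events = queue.Queue()
invoked = []
events.put("callback for request 1")
events.put(EVENT_OF_DEATH)               # client closed
events.put("session-expired callback")   # queued after the close
run_event_loop(events, invoked)

print(invoked)         # ['callback for request 1']
print(events.qsize())  # 1
```

 The packet queued after the sentinel stays in the queue, so its callback never 
 runs and the caller sees neither a result nor an error.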




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-17 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Attachment: ZOOKEEPER-822.patch

I'm adding a test to the patch. It tries to send a message to an address for 
which a connection request receives no response, so the attempt has to time 
out. The test then checks that the elapsed time is less than 6s (the timeout 
value is hardcoded to 5s). Raising the timeout from 5s to, say, 7s makes the 
test fail.
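
A check of this shape might look like the sketch below. It is not the test in 
the patch: it is a Python illustration in which the helper names, the loopback 
address, and the short timeout and slack values are all assumptions chosen so 
the example finishes quickly.

```python
import socket
import time


def connect_elapsed(address, timeout_s):
    """Try to connect and return the seconds spent, whether or not it worked."""
    start = time.monotonic()
    try:
        with socket.create_connection(address, timeout=timeout_s):
            pass
    except OSError:
        pass  # refused, unreachable, or timed out
    return time.monotonic() - start


def check_bounded(address, timeout_s, slack_s):
    """True if the attempt gave up within the timeout plus some slack."""
    return connect_elapsed(address, timeout_s) < timeout_s + slack_s


# Hypothetical values: a 0.5s timeout with 1s of slack.
print(check_bounded(("127.0.0.1", 1), timeout_s=0.5, slack_s=1.0))
```

The same structure applies with a 5s timeout and a 6s bound: if the connection 
attempt were allowed to block for longer than the timeout, the bound would be 
exceeded and the check would fail.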

 Leader election taking a long time  to complete
 ---

 Key: ZOOKEEPER-822
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
 Project: Zookeeper
  Issue Type: Bug
  Components: quorum
Affects Versions: 3.3.0
Reporter: Vishal K
Assignee: Vishal K
Priority: Blocker
 Fix For: 3.3.2, 3.4.0

 Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
 test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
 ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch_v1


 Created a 3 node cluster.
 1. Fail the ZK leader.
 2. Let leader election finish. Restart the leader and let it join the 
 3. Repeat.
 After a few rounds leader election takes anywhere from 25 to 60 seconds to 
 finish. Note: we didn't have any ZK clients and no new znodes were created.
 zoo.cfg is shown below:
 #Mon Jul 19 12:15:10 UTC 2010
 server.1=192.168.4.12\:2888\:3888
 server.0=192.168.4.11\:2888\:3888
 clientPort=2181
 dataDir=/var/zookeeper
 syncLimit=2
 server.2=192.168.4.13\:2888\:3888
 initLimit=5
 tickTime=2000
 I have attached logs from two nodes that took a long time to form the cluster 
 after failing the leader. The leader was down anyway, so logs from that node 
 shouldn't matter.
 Look for START HERE. Logs after that point should be of interest.
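
 For reference, initLimit and syncLimit are expressed in ticks, so the 
 configuration above implies the timeouts computed below. The parser is a 
 minimal Python sketch written for this example (the function name is 
 hypothetical); it handles only the simple key=value lines and escaped colons 
 seen in this file.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip().replace("\\:", ":")
    return props


cfg = parse_properties(r"""
#Mon Jul 19 12:15:10 UTC 2010
server.1=192.168.4.12\:2888\:3888
server.0=192.168.4.11\:2888\:3888
clientPort=2181
dataDir=/var/zookeeper
syncLimit=2
server.2=192.168.4.13\:2888\:3888
initLimit=5
tickTime=2000
""")

tick_ms = int(cfg["tickTime"])
print(int(cfg["initLimit"]) * tick_ms)  # 10000 (ms allowed for initial sync)
print(int(cfg["syncLimit"]) * tick_ms)  # 4000 (ms a follower may lag)
print(cfg["server.1"])                  # 192.168.4.12:2888:3888
```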




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-17 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Status: Patch Available  (was: Open)




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-17 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Attachment: ZOOKEEPER-822-3.3.2.patch

Attaching patch for 3.3.2.




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-17 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-822:
---

Attachment: ZOOKEEPER-822.patch
