[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

2009-11-17 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-582:


Attachment: ZOOKEEPER-582.patch

This patch fixes the issue. I'll test the patch tomorrow.

> ZooKeeper can revert to old data when a snapshot is created outside of normal 
> processing
> 
>
> Key: ZOOKEEPER-582
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
> Project: Zookeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.1.1, 3.2.1
> Reporter: Benjamin Reed
> Priority: Blocker
> Fix For: 3.2.2, 3.1.2
>
> Attachments: test.patch, ZOOKEEPER-582.patch
>
>
> When ZooKeeper starts up, it restores the most recent state (latest zxid) 
> it finds in the data directory. Unfortunately, in the quorum version of 
> ZooKeeper, updates are logged using an epoch based on the latest log file in 
> the directory. If there is a snapshot with a higher epoch than the log files, 
> the ZooKeeper server will still start logging using an epoch one higher than 
> that of the highest log file.
> So if a data directory has a snapshot with an epoch of 27 and there are no 
> log files, ZooKeeper will start logging changes using epoch 1. If the cluster 
> restarts, the state will be restored from the snapshot with the epoch of 27, 
> which in effect restores old data.
> Normal operation of ZooKeeper will never result in this situation.
> This does not affect standalone ZooKeeper.
> A fix should make sure to use an epoch one higher than that of the current 
> state, whether it comes from the snapshot or the log, and should include a 
> sanity check to make sure that a follower never connects to a leader that has 
> a lower epoch than its own.
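
The epoch rule proposed in the description can be sketched as follows. This is
a minimal illustration and not the attached patch. It assumes the usual zxid
layout (epoch in the high 32 bits, counter in the low 32 bits), which the
report does not spell out, and all function names are hypothetical.

```python
EPOCH_SHIFT = 32  # assumed layout: high 32 bits epoch, low 32 bits counter


def epoch_of(zxid: int) -> int:
    """Extract the epoch from a 64-bit zxid."""
    return zxid >> EPOCH_SHIFT


def make_zxid(epoch: int, counter: int) -> int:
    """Build a zxid from an epoch and a per-epoch counter."""
    return (epoch << EPOCH_SHIFT) | (counter & 0xFFFFFFFF)


def buggy_next_epoch(log_zxids):
    """Behaviour described in the report: only the log files are consulted."""
    return epoch_of(max(log_zxids, default=0)) + 1


def next_epoch(snapshot_zxid, log_zxids):
    """Proposed behaviour: one higher than the current state, snapshot or log."""
    return epoch_of(max([snapshot_zxid, *log_zxids])) + 1


def check_leader(follower_zxid, leader_zxid):
    """Sanity check: refuse a leader whose epoch is lower than the follower's."""
    if epoch_of(leader_zxid) < epoch_of(follower_zxid):
        raise ValueError("leader epoch is lower than follower epoch")


snapshot = make_zxid(27, 100)    # snapshot at epoch 27, no log files
print(buggy_next_epoch([]))      # 1
print(next_epoch(snapshot, []))  # 28
```

With the snapshot at epoch 27 and no log files, the log-only rule yields epoch
1 (the scenario in the report), while taking the maximum over snapshot and
logs yields 28.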

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

2009-11-17 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-582:


Attachment: test.patch

This patch reproduces the problems outlined in this issue.




[jira] Commented: (ZOOKEEPER-507) Improve error handling of BookKeeper client

2009-11-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779258#action_12779258
 ] 

Hadoop QA commented on ZOOKEEPER-507:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12422953/ZOOKEEPER-507.patch
  against trunk revision 881623.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 36 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/4/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/4/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/4/console

This message is automatically generated.

> Improve error handling of BookKeeper client
> ---
>
> Key: ZOOKEEPER-507
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-507
> Project: Zookeeper
> Issue Type: Improvement
> Components: contrib-bookkeeper
> Reporter: Flavio Paiva Junqueira
> Assignee: Utkarsh Srivastava
> Attachments: ZOOKEEPER-507.patch
>
>
> Error handling is far from ideal currently in the BookKeeper client.




[jira] Commented: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.

2009-11-17 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779236#action_12779236
 ] 

Benjamin Reed commented on ZOOKEEPER-547:
-

Committed revision 881641 (for branch 3.2.2). I had to pull in 
src/java/test/org/apache/zookeeper/PortAssignment.java
from trunk.

> Sanity check in QuorumCnxn Manager and quorum communication port.
> -
>
> Key: ZOOKEEPER-547
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-547
> Project: Zookeeper
> Issue Type: Bug
> Components: leaderElection, server
> Affects Versions: 3.2.0, 3.2.1
> Reporter: Mahadev konar
> Assignee: Mahadev konar
> Fix For: 3.2.2, 3.3.0
>
> Attachments: ZOOKEEPER-547.patch, ZOOKEEPER-547.patch, 
> ZOOKEEPER-547.patch, ZOOKEEPER-547.patch
>
>
> We need to put some sanity checks in QuorumCnxnManager and the other quorum 
> port for rogue clients. Sometimes clients might get misconfigured and they 
> might send random characters on such ports. We need to make sure that such 
> rogue clients do not bring down the servers and need to put in some sanity 
> checks with respect to packet lengths and deserialization.
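
A packet-length sanity check of the kind described above might look like the
sketch below. It is not taken from the committed patch: the framing (a 4-byte
big-endian signed length followed by the payload), the MAX_PACKET_LEN cap and
the BadPacket exception are all assumptions made for illustration.

```python
import io
import struct

MAX_PACKET_LEN = 512 * 1024  # hypothetical upper bound on a sane packet


class BadPacket(Exception):
    """Raised when the bytes on the wire cannot be a valid packet."""


def read_packet(stream, max_len=MAX_PACKET_LEN):
    """Read one length-prefixed packet, rejecting implausible lengths."""
    header = stream.read(4)
    if len(header) != 4:
        raise BadPacket("truncated length header")
    (length,) = struct.unpack(">i", header)  # big-endian signed 32-bit int
    if length <= 0 or length > max_len:
        # Random characters from a misconfigured client usually decode to a
        # huge or negative length; refuse before allocating a buffer.
        raise BadPacket(f"implausible packet length {length}")
    payload = stream.read(length)
    if len(payload) != length:
        raise BadPacket("truncated payload")
    return payload


good = io.BytesIO(struct.pack(">i", 3) + b"abc")
print(read_packet(good))  # b'abc'
```

A client that sends, say, an HTTP request to the port would have its first
four bytes interpreted as a length far above the cap and be rejected without
affecting the server.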




[jira] Commented: (ZOOKEEPER-368) Observers: core functionality

2009-11-17 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779209#action_12779209
 ] 

Mahadev konar commented on ZOOKEEPER-368:
-

That's great, I'll commit the patch tonight!

> Observers: core functionality 
> --
>
> Key: ZOOKEEPER-368
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
> Project: Zookeeper
> Issue Type: New Feature
> Components: quorum
> Reporter: Flavio Paiva Junqueira
> Assignee: Henry Robinson
> Fix For: 3.3.0
>
> Attachments: obs-refactor.patch, observer-refactor.patch, observers 
> sync benchmark.png, observers.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch
>
>
> Edit (Henry Robinson/henryr) 12/11/09:
> This JIRA specifically concerns the implementation of non-voting peers called 
> Observers, their documentation and their tests. 
> Explicit goals are 1. not breaking any current ZK functionality, 2. enabling 
> at least one deployment scenario involving Observers, 3. documentation 
> describing how to use the feature and 4. tests validating the correct 
> behaviour of 2. 
> Non goals of this JIRA are 1. performance optimizations specific to 
> Observers, 2. compatibility with every feature of ZooKeeper (in particular 
> all leader election protocols), which are both to be addressed in future 
> JIRAs. 
> See http://wiki.apache.org/hadoop/ZooKeeper/Observers for more detail of use 
> cases, proposed design and usage.
> See http://wiki.apache.org/hadoop/ZooKeeper/Observers/ReviewGuide for a brief 
> commentary on the current patch. 
> -
> Currently, all servers of an ensemble participate actively in reaching 
> agreement on the order of ZooKeeper transactions. That is, all followers 
> receive proposals, acknowledge them, and receive commit messages from the 
> leader. A leader issues commit messages once it receives acknowledgments from 
> a quorum of followers. For cross-colo operation, it would be useful to have a 
> third role: observer. Using Paxos terminology, observers are similar to 
> learners. An observer does not participate actively in the agreement step of 
> the atomic broadcast protocol. Instead, it only commits proposals that have 
> been accepted by some quorum of followers.
> One simple solution to implement observers is to have the leader forward 
> commit messages not only to followers but also to observers, and have 
> observers apply transactions according to the order followers agreed upon. 
> In the current implementation of the protocol, however, commit messages do 
> not carry their corresponding transaction payload because all servers 
> other than the leader are followers, and followers receive such a payload 
> first through a proposal message. Just forwarding commit messages as they 
> currently are to an observer is consequently not sufficient. We have a couple 
> of options:
> 1- Include the transaction payload in commit messages to observers;
> 2- Send proposals to observers as well.
> Number 2 is simpler to implement because it doesn't require changing the 
> protocol implementation, but it increases traffic slightly. The performance 
> impact due to such an increase might be insignificant, though.
> For scalability purposes, we may consider having followers also forward 
> commit messages to observers. With this option, observers can connect to 
> followers and receive messages from followers. This choice is important to 
> avoid increasing the load on the leader with the number of observers. 
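
Option 2 from the description can be illustrated with a toy model. This is a
sketch only and not the code in the patch: the Peer class and broadcast
function are hypothetical, and counting the leader's own vote toward the
quorum is an assumption of the sketch.

```python
class Peer:
    """A follower (voting) or an observer (non-voting) in a toy ensemble."""

    def __init__(self, name, voting, alive=True):
        self.name = name
        self.voting = voting
        self.alive = alive
        self.pending = {}  # zxid -> payload received in a proposal
        self.applied = []  # payloads applied in commit order

    def on_proposal(self, zxid, payload):
        """Store the payload; only a live, voting peer acknowledges."""
        if not self.alive:
            return False
        self.pending[zxid] = payload
        return self.voting

    def on_commit(self, zxid):
        """Apply a transaction whose payload arrived earlier in a proposal."""
        if zxid in self.pending:
            self.applied.append(self.pending.pop(zxid))


def broadcast(followers, observers, zxid, payload):
    """Send the proposal to everyone, but count acks from followers only."""
    acks = 1  # assumption: the leader counts its own vote
    for peer in followers + observers:
        if peer.on_proposal(zxid, payload):
            acks += 1
    quorum = (len(followers) + 1) // 2 + 1  # majority of the voting members
    if acks < quorum:
        return False
    for peer in followers + observers:
        peer.on_commit(zxid)
    return True


followers = [Peer("f1", voting=True), Peer("f2", voting=True)]
observer = Peer("o1", voting=False)
print(broadcast(followers, [observer], 1, "create /a"))  # True
print(observer.applied)                                  # ['create /a']
```

Because the observer already holds the payload from the proposal, the commit
message itself does not need to change, which is the simplification the
description attributes to option 2.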




[jira] Commented: (ZOOKEEPER-368) Observers: core functionality

2009-11-17 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779205#action_12779205
 ] 

Henry Robinson commented on ZOOKEEPER-368:
--

Mahadev - 

Ah, I understand now. Yes, I just ran that test. Everything works as I expected 
it to, both when two servers are from the old code and when two servers 
are from the new. They can be swapped out for each other with no ill effects, 
AFAIK.

Thanks!

Henry




[jira] Updated: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.

2009-11-17 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-547:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed revision 881623.




[jira] Commented: (ZOOKEEPER-368) Observers: core functionality

2009-11-17 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779194#action_12779194
 ] 

Mahadev konar commented on ZOOKEEPER-368:
-

Thanks, Henry, for the comments and testing! Thanks for all the hard work and 
responses. I have one more question. Sorry, I couldn't find the answer to that, 
so I wanted to ask again. I know from looking at the code that this shouldn't 
be a problem, but I think it is worth running a small test for it.

- The test is to have 2 servers (s1, s2) from the old code and 1 server (s3) 
with the new code and verify that s1, s2 and s3 are all capable of becoming the 
leader and everything works fine with any one of them becoming the leader. This 
could be done by bringing up s1, s2 and s3 at the same time and killing them 
one at a time and bringing the other up. Something like testing a rolling 
upgrade wherein one server is the new code and the other servers are old code. 
This is not testing observers but just testing (though I think it should work 
fine looking at the code) that the older versions will work with the new 
version irrespective of which of them is the leader.




[jira] Commented: (ZOOKEEPER-368) Observers: core functionality

2009-11-17 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779186#action_12779186
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-368:
--

I have just made sure that the docs compile, and after Henry's response, I 
don't have further comments. +1, good job, Henry!




[jira] Commented: (ZOOKEEPER-368) Observers: core functionality

2009-11-17 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779184#action_12779184
 ] 

Henry Robinson commented on ZOOKEEPER-368:
--

Flavio - 

Thanks for your comments!

1. Will do. 
2. QuorumPeerTestBase is extended by QuorumPeerMainTest and ObserverTest; it 
seemed to make sense to introduce a base class rather than have ObserverTest 
extend QuorumPeerMainTest and then have to manually disable the tests that I 
didn't want to run. Also, it makes the test classes themselves shorter and 
easier to reason about.
3. See above.

Henry




[jira] Created: (ZOOKEEPER-583) on resync client should generate session expired exception if there is no server in cluster with acceptable zxid

2009-11-17 Thread Patrick Hunt (JIRA)
on resync client should generate session expired exception if there is no 
server in cluster with acceptable zxid


Key: ZOOKEEPER-583
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-583
Project: Zookeeper
Issue Type: Bug
Components: c client, java client
Affects Versions: 3.2.1, 3.1.1
Reporter: Patrick Hunt
Fix For: 3.3.0


Both the C and Java clients attempt to connect to a server in the cluster by 
iterating through a randomized list of servers as listed in the connect string 
passed to zookeeper_init (C) or the ZooKeeper constructor (Java). The clients 
do this indefinitely, until successfully connecting to a server or until the 
client is close()ed. Additionally, if a client is disconnected from a server, 
it will attempt to reconnect to another server in the cluster. In this case it 
will only connect to a server that has the same, or higher, zxid as seen by 
the client on the previous server that it was connected to (this ensures that 
the client never sees old data).

In some weird cases (in particular where operators reset the server database, 
clearing out the existing snapshots and txnlogs) existing clients will now see 
a much lower zxid (due to the epoch number being reset) regardless of the 
server that the client attempts to connect to. In this case the current client 
will iterate essentially forever.

Instead, the client should throw session expired in this case (and notify any 
watchers). After iterating through all of the servers in the list, if none of 
the servers has an acceptable zxid, the client should expire the session and 
shut down the handle. This will ensure that the client will eventually shut 
down in this unusual, but possible (especially with server operators who don't 
also control the clients) situation.
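
The proposed client behaviour can be sketched roughly as below. This is an
illustration of the idea rather than the C or Java client code. The
pick_server function and the SessionExpired exception are hypothetical, every
server is assumed to be reachable, and how unreachable servers should be
treated is left open, as it is in the description.

```python
import random


class SessionExpired(Exception):
    """Raised when no server in the list has an acceptable zxid."""


def pick_server(server_zxids, last_seen_zxid, rng=random):
    """Return a server whose zxid is at least the client's last seen zxid.

    server_zxids maps a server name to the latest zxid it reports. After one
    full pass over the randomized list with no acceptable server, the session
    is expired instead of iterating forever.
    """
    candidates = list(server_zxids)
    rng.shuffle(candidates)  # clients walk the server list in random order
    for server in candidates:
        if server_zxids[server] >= last_seen_zxid:
            return server
    raise SessionExpired("no server has zxid >= %#x" % last_seen_zxid)


last_seen = (27 << 32) | 10  # client last saw epoch 27, counter 10
healthy = {"s1": (27 << 32) | 5, "s2": (27 << 32) | 12}
print(pick_server(healthy, last_seen))  # s2

reset = {"s1": (1 << 32) | 3, "s2": (1 << 32) | 4}  # database was reset
try:
    pick_server(reset, last_seen)
except SessionExpired as exc:
    print("session expired:", exc)
```

In the second call every server reports a zxid from epoch 1, so no candidate
is acceptable and the sketch raises instead of looping.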





[jira] Commented: (ZOOKEEPER-368) Observers: core functionality

2009-11-17 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779178#action_12779178
 ] 

Henry Robinson commented on ZOOKEEPER-368:
--

Thanks Ben! I agree that 581 is a genuine issue, I'll take it up over on its 
JIRA. 

Mahadev - 

An Observer won't be able to connect to a pre-Observer ensemble because it 
doesn't send FOLLOWERINFO (rather, it sends OBSERVERINFO). The effect is that 
it retries, and is rejected. I have just verified this. 

If a server is brought down and reconnects as an Observer, it will be able to 
connect to the ensemble without problem. The Leader does not validate the type 
of the Learner that connects so it happily accepts the OBSERVERINFO handshake 
and carries on. It is possible that, if the process was restarted very quickly 
and the server was originally the Leader, there might be some confusion 
when the Observer refuses to issue proposals. My belief is that the old Leader 
would be identified as failed. 

This should probably be considered user error? Users must not try and start the 
cluster with different configurations at each node. I can think of similar 
'bugs' in the current code where different servers have different 
configurations and therefore acknowledge different quorum groups, meaning that 
there wouldn't be consensus on who is the Leader, for example. 


[jira] Commented: (ZOOKEEPER-368) Observers: core functionality

2009-11-17 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779176#action_12779176
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-368:
--

Henry, thanks for all the changes, a few quick comments:

# You've added a TODO to leader.java, which was a good catch. I don't want to 
make you generate another patch just to fix the majority check on that message, 
so if you prefer not to do it, could you make sure there is a jira to fix it?
# I didn't quite understand why you moved all that code between 
QuorumPeerMainTest and QuorumPeerTestBase. Would you mind just commenting 
quickly?
# Mahadev gave some good suggestions for tests we should perform before 
committing. Would you mind running those tests?

> Observers: core functionality 
> --
>
> Key: ZOOKEEPER-368
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: quorum
>Reporter: Flavio Paiva Junqueira
>Assignee: Henry Robinson
> Fix For: 3.3.0
>
> Attachments: obs-refactor.patch, observer-refactor.patch, observers 
> sync benchmark.png, observers.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch
>
>
> Edit (Henry Robinson/henryr) 12/11/09:
> This JIRA specifically concerns the implementation of non-voting peers called 
> Observers, their documentation and their tests. 
> Explicit goals are 1. not breaking any current ZK functionality, 2. enabling 
> at least one deployment scenario involving Observers, 3. documentation 
> describing how to use the feature and 4. tests validating the correct 
> behaviour of 2. 
> Non goals of this JIRA are 1. performance optimizations specific to 
> Observers, 2. compatibility with every feature of ZooKeeper (in particular 
> all leader election protocols), which are both to be addressed in future 
> JIRAs. 
> See http://wiki.apache.org/hadoop/ZooKeeper/Observers for more detail of use 
> cases, proposed design and usage.
> See http://wiki.apache.org/hadoop/ZooKeeper/Observers/ReviewGuide for a brief 
> commentary on the current patch. 
> -
> Currently, all servers of an ensemble participate actively in reaching 
> agreement on the order of ZooKeeper transactions. That is, all followers 
> receive proposals, acknowledge them, and receive commit messages from the 
> leader. A leader issues commit messages once it receives acknowledgments from 
> a quorum of followers. For cross-colo operation, it would be useful to have a 
> third role: observer. Using Paxos terminology, observers are similar to 
> learners. An observer does not participate actively in the agreement step of 
> the atomic broadcast protocol. Instead, it only commits proposals that have 
> been accepted by some quorum of followers.
> One simple solution to implement observers is to have the leader forwarding 
> commit messages not only to followers but also to observers, and have 
> observers applying transactions according to the order followers agreed upon. 
> In the current implementation of the protocol, however, commit messages do 
> not carry their corresponding transaction payload because all servers 
> different from the leader are followers and followers receive such a payload 
> first through a proposal message. Just forwarding commit messages as they 
> currently are to an observer consequently is not sufficient. We have a couple 
> of options:
> 1- Include the transaction payload along in commit messages to observers;
> 2- Send proposals to observers as well.
> Number 2 is simpler to implement because it doesn't require changing the 
> protocol implementation, but it increases traffic slightly. The performance 
> impact due to such an increase might be insignificant, though.
> For scalability purposes, we may consider having followers also forwarding 
> commit messages to observers. With this option, observers can connect to 
> followers, and receive messages from followers. This choice is important to 
> avoid increasing the load on the leader with the number of observers. 
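
Option 2 from the description above (send proposals to observers as well, but
count only follower acknowledgments) can be illustrated with a toy model. This
is a sketch under simplifying assumptions, not the patch itself: the Leader
class and its methods are hypothetical, peers are plain strings, and the
leader's own vote is ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Leader:
    followers: List[str]  # voting peers
    observers: List[str]  # non-voting peers
    acks: Dict[int, Set[str]] = field(default_factory=dict)
    committed: List[int] = field(default_factory=list)

    def propose(self, zxid: int) -> List[str]:
        """Return the peers that receive the proposal (payload included).

        Observers are in the list so that a later commit message, which
        carries no payload, is enough for them to apply the transaction.
        """
        self.acks.setdefault(zxid, set())
        return self.followers + self.observers

    def ack(self, zxid: int, peer: str) -> bool:
        """Record an ack; return True if this ack triggers the commit."""
        if peer not in self.followers or zxid in self.committed:
            return False  # observers do not vote; late acks are ignored
        voters = self.acks.setdefault(zxid, set())
        voters.add(peer)
        if len(voters) > len(self.followers) // 2:
            self.committed.append(zxid)
            return True
        return False
```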




[ATTN] If you upgraded from v2 of ZooKeeper to v3 please read this.

2009-11-17 Thread Patrick Hunt
We've found an issue with the migration tool used to migrate users from 
version 2 of ZooKeeper to version 3. This tool was provided to users who 
upgraded from the SourceForge v2 ZK, after we moved to being a 
subproject of Apache Hadoop (which is the same time that we incremented 
the version number from 2.x.y to 3.0.0 - 2+ years ago)


https://issues.apache.org/jira/browse/ZOOKEEPER-582

Please note, this only affects users who migrated their ZK data 
directory from version 2 to version 3. If you have only ever used ZK 
version 3.0.0 and later this does not affect you.


Patrick


[jira] Commented: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

2009-11-17 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779149#action_12779149
 ] 

Patrick Hunt commented on ZOOKEEPER-582:


As Ben mentioned we will never see this situation during normal operation of ZK.

The case where we did see this was a result of a user running the migration 
tool that we provide to upgrade from version 2 to version 3 of ZooKeeper. The 
tool migrates the data by writing a single snapshot file in which the zxid is 
maintained (it does not write a log file). This produces the scenario Ben 
mentioned (a snapshot with no associated log file), which can cause this bug 
to occur. If you have run the migration tool, documented here:
http://hadoop.apache.org/zookeeper/docs/r3.0.0/releasenotes.html#migration_data
you can verify whether or not you have this situation by looking at your 
ZooKeeper data directory.

Here's an example:

-rw-r--r--  1 root search 67108880 Nov 17 19:31 log.300022b61
-rw-r--r--  1 root search 67108880 Nov 17 19:38 log.3000292d0
-rw-r--r--  1 root search  3646608 Nov  5 12:13 snapshot.1db5df6e2d6
-rw-r--r--  1 root search  3616579 Nov 17 19:31 snapshot.3000292c9
-rw-r--r--  1 root search  3616708 Nov 17 19:38 snapshot.300038d32

where the file names are of the form <log|snapshot>.<epoch><xid>, with epoch
and xid both being 4 byte values represented as hex.

Notice that snapshot.1db5df6e2d6 has an epoch of 0x1db, while the other files
have an epoch of 0x3. This is the scenario described in the description of
this JIRA (there is no log file associated with epoch 0x1db).

If you see this in your datadir (a snapshot with an epoch for which there are
no log files with the same epoch) then this bug pertains. If you see
snapshots of a particular epoch and log files with the same epoch then this
bug does NOT pertain.
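
The manual check above can also be scripted. The following is a rough sketch,
not a supported tool: it assumes every file name is "log." or "snapshot."
followed by the zxid in hex, with the low 32 bits being the xid and the
remaining high bits the epoch. The function name orphan_snapshot_epochs is
made up for this example.

```python
import re
from typing import Iterable, Set

# Assumed layout: "<log|snapshot>.<zxid in hex>".
_NAME = re.compile(r"^(log|snapshot)\.([0-9a-fA-F]+)$")


def orphan_snapshot_epochs(file_names: Iterable[str]) -> Set[int]:
    """Return the epochs that have a snapshot but no log file."""
    snapshot_epochs: Set[int] = set()
    log_epochs: Set[int] = set()
    for name in file_names:
        match = _NAME.match(name)
        if match is None:
            continue  # not a snapshot or log file, skip it
        epoch = int(match.group(2), 16) >> 32  # drop the 32-bit xid
        if match.group(1) == "log":
            log_epochs.add(epoch)
        else:
            snapshot_epochs.add(epoch)
    return snapshot_epochs - log_epochs
```

An empty result means every snapshot epoch has at least one log file with the
same epoch. Passing the names from os.listdir() on the data directory is one
way to feed it.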





[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

2009-11-17 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-582:
---

 Priority: Blocker  (was: Major)
Affects Version/s: 3.1.1
Fix Version/s: 3.1.2




[jira] Created: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

2009-11-17 Thread Benjamin Reed (JIRA)
ZooKeeper can revert to old data when a snapshot is created outside of normal 
processing


 Key: ZOOKEEPER-582
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.2.1
Reporter: Benjamin Reed
 Fix For: 3.2.2


when zookeeper starts up it will restore the most recent state (latest zxid) it 
finds in the data directory. unfortunately, in the quorum version of zookeeper, 
updates are logged using an epoch based on the latest log file in a directory. 
if there is a snapshot with a higher epoch than the log files, the zookeeper 
server will start logging using an epoch one higher than the highest log file.

so if a data directory has a snapshot with an epoch of 27 and there are no log 
files, zookeeper will start logging changes using epoch 1. if the cluster 
restarts, the state will be restored from the snapshot with the epoch of 27, 
which, in effect, restores old data.

normal operation of zookeeper will never result in this situation.

this does not affect standalone zookeeper.

a fix should make sure to use an epoch one higher than the current state, 
whether it comes from the snapshot or log, and should include a sanity check to 
make sure that a follower never connects to a leader that has a lower epoch 
than its own.
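
The two parts of the suggested fix can be sketched as below. This is only an
illustration of the idea, not the attached patch: next_epoch and
follower_may_connect are hypothetical helpers, and the sketch assumes the
epoch is the part of the zxid above the low 32 bits.

```python
from typing import Iterable


def epoch_of(zxid: int) -> int:
    """Epoch part of a zxid (assumed to be the bits above the low 32)."""
    return zxid >> 32


def next_epoch(snapshot_zxids: Iterable[int], log_zxids: Iterable[int]) -> int:
    """Epoch to log with: one higher than the current state's epoch.

    The current state is the highest zxid found in either the snapshots or
    the logs, so a snapshot without a matching log file is not overlooked.
    """
    newest = max([*snapshot_zxids, *log_zxids], default=0)
    return epoch_of(newest) + 1


def follower_may_connect(follower_epoch: int, leader_epoch: int) -> bool:
    """Sanity check: a follower never joins a leader with a lower epoch."""
    return leader_epoch >= follower_epoch
```

With a snapshot at epoch 27 and no log files, next_epoch gives 28 rather than
1, and a follower at epoch 27 would refuse a leader that is still at epoch 1.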




[Fwd: Hadoop User Group (Bay Area) - next Wednesday (Nov 18th) at Yahoo!]

2009-11-17 Thread Patrick Hunt
Tomorrow is the BA HUG. If anyone is interested in talking with Mahadev or 
me face to face regarding ZooKeeper, we'll both be in attendance.


Patrick
--- Begin Message ---
Hi all,

We are one week away from the next Bay Area Hadoop User Group  - Yahoo! 
Sunnyvale Campus, next Wednesday (Nov 18th) at 6PM

We have an exciting evening planned:

*Katta, Solr, Lucene and Hadoop - Searching at scale, Jason Rutherglen and 
Jason Venner

*Walking through the New File system API, Sanjay Radia, Yahoo!

*Keep your data in Jute but still use it in python, Paul Tarjan, Yahoo!


Please RSVP here:
http://www.meetup.com/hadoop/calendar/11724002/


Please note that this is the last HUG for 2009, as we will not have a meeting 
in December (due to the holidays).
We will open 2010 with a HUG on Jan 20th.

Looking forward to seeing you next week!

Dekel

--- End Message ---


[jira] Commented: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.

2009-11-17 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779074#action_12779074
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-547:
--

+1

> Sanity check in QuorumCnxn Manager and quorum communication port.
> -
>
> Key: ZOOKEEPER-547
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-547
> Project: Zookeeper
>  Issue Type: Bug
>  Components: leaderElection, server
>Affects Versions: 3.2.0, 3.2.1
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 3.2.2, 3.3.0
>
> Attachments: ZOOKEEPER-547.patch, ZOOKEEPER-547.patch, 
> ZOOKEEPER-547.patch, ZOOKEEPER-547.patch
>
>
> We need to put some sanity checks in QuorumCnxnManager and the other quorum 
> port for rogue clients. Sometimes clients might get misconfigured and they 
> might send random characters on such ports. We need to make sure that such 
> rogue clients do not bring down the servers and need to put in some sanity 
> checks with respect to packet lengths and deserialization.
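
A packet-length sanity check of the kind described could look roughly like
this. It is a generic sketch rather than the QuorumCnxnManager code: the
4-byte big-endian length prefix, the MAX_PACKET_LEN cap and the names
read_packet and MalformedPacket are all assumptions made for illustration.

```python
import struct

# Hypothetical upper bound; a real server would choose its own limit.
MAX_PACKET_LEN = 512 * 1024


class MalformedPacket(Exception):
    """Raised when the bytes received cannot be a valid packet."""


def read_packet(buf: bytes) -> bytes:
    """Validate a length-prefixed packet and return its payload.

    Rejecting bad lengths up front means random bytes from a misconfigured
    client are dropped instead of driving a huge allocation or a
    deserialization error deep inside the server.
    """
    if len(buf) < 4:
        raise MalformedPacket("missing length prefix")
    (length,) = struct.unpack(">i", buf[:4])
    if length <= 0 or length > MAX_PACKET_LEN:
        raise MalformedPacket("unreasonable packet length %d" % length)
    if len(buf) - 4 < length:
        raise MalformedPacket("truncated payload")
    return buf[4:4 + length]
```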




[jira] Commented: (ZOOKEEPER-544) improve client testability - allow test client to access connected server location

2009-11-17 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778993#action_12778993
 ] 

Patrick Hunt commented on ZOOKEEPER-544:


Discoverability. I added the testable* methods to make it easier for clients
to write tests (rather than having to figure out how to get this information).

Also, exposing cnxn makes it harder to change the implementation, as user code
would be closely coupled to the guts of cnxn. This approach allows for changes
under the covers (for common use cases at least).

> improve client testability - allow test client to access connected server 
> location
> --
>
> Key: ZOOKEEPER-544
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-544
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: c client, java client, tests
>Reporter: Patrick Hunt
>Assignee: Patrick Hunt
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-544.patch
>
>
> This came up recently on the user list. If you are developing tests for your 
> zk client you need to be able to access the server that your session is 
> currently connected to. The reason is that your test needs to know which 
> server in the quorum to shut down in order to verify you are handling 
> failover correctly. Similar for session expiration testing.
> However, we should be careful: we prefer not to expose this to all clients, 
> as this is an implementation detail that we typically want to hide. 
> Also, we should provide this in both the C and Java clients.
> I suspect we should add a protected method on ZooKeeper. This will set a 
> higher bar (the user will have to subclass) for the user to access this 
> method. In tests it's fine; typically you want a "TestableZooKeeper" class 
> anyway. In C we unfortunately have fewer options; we can just rely on docs 
> for now. 
> In both cases (C/Java) we need to be very, very clear in the docs that this 
> is for testing only, and to clearly define the semantics.
> We should add the following at the same time:
> A toString() method on ZooKeeper which includes server ip/port, client port, 
> and any other information deemed useful (connection stats like send/recv?).
> The Java ZooKeeper is missing the "deterministic connection order" that the 
> C client has. This is also useful for testing. Again, protected, with clear 
> docs that this is for testing purposes only!
> Any other things we should expose?




[jira] Commented: (ZOOKEEPER-544) improve client testability - allow test client to access connected server location

2009-11-17 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778980#action_12778980
 ] 

Benjamin Reed commented on ZOOKEEPER-544:
-

why do you have testableLocalSocketAddress in ZooKeeper? The cnxn object is 
protected. You don't need that method in ZooKeeper, do you?




[jira] Updated: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.

2009-11-17 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-547:


Hadoop Flags: [Reviewed]

+1 looks good.




[jira] Created: (ZOOKEEPER-581) peerType in configuration file is redundant

2009-11-17 Thread Benjamin Reed (JIRA)
peerType in configuration file is redundant
---

 Key: ZOOKEEPER-581
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-581
 Project: Zookeeper
  Issue Type: Improvement
Reporter: Benjamin Reed
Priority: Minor


to configure a machine to be an observer you must add a peerType=observer to 
the configuration file and an observer tag to the server list. this is 
redundant. if the observer tag is on the entry of a machine it should know it 
is an observer without needing the peerType tag.

on the other hand, do we really need the observers in the server list? they 
don't vote.
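
Deriving the peer type from the server list alone might look like the sketch
below. It is illustrative only: it assumes server entries of the form
server.N=host:port:port with an optional trailing :observer tag, and
peer_types is a hypothetical helper, not part of the configuration parser.

```python
from typing import Dict, Iterable


def peer_types(config_lines: Iterable[str]) -> Dict[int, str]:
    """Map each server id to 'observer' or 'participant'.

    The type is read from the tag on the server entry itself, so no separate
    peerType setting is needed.
    """
    types: Dict[int, str] = {}
    for line in config_lines:
        key, sep, value = line.strip().partition("=")
        key = key.strip()
        if not sep or not key.startswith("server."):
            continue  # some other setting, e.g. tickTime
        server_id = int(key[len("server."):])
        tag = value.strip().rsplit(":", 1)[-1]
        types[server_id] = "observer" if tag == "observer" else "participant"
    return types
```

A server would then look up its own id in the result to decide whether to
start as an observer.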




[jira] Created: (ZOOKEEPER-580) Document reasonable limits on the size and shape of data for a zookeeper ensemble.

2009-11-17 Thread bryan thompson (JIRA)
Document reasonable limits on the size and shape of data for a zookeeper 
ensemble.
--

 Key: ZOOKEEPER-580
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-580
 Project: Zookeeper
  Issue Type: Improvement
  Components: documentation
Reporter: bryan thompson


I would like to have documentation which clarifies the reasonable limits on the 
size and shape of data in a zookeeper ensemble.  Since all zookeeper nodes and 
their data are replicated on each peer in an ensemble, there will be a machine 
limit on the amount of data in a zookeeper instance, but I have not seen any 
guidance on the estimation of that machine limit.  Presumably the machine 
limits are primarily determined by the amount of heap available to the JVM 
before swapping sets in, however there might well be other limits which are 
less obvious in terms of the #of children per node and the depth of the node 
hierarchy (in addition to the already documented limit on the amount of data in 
a node).  There may also be interactions with the hierarchy depth and 
performance, which I have not seen detailed anywhere.

Guidance regarding pragmatic and machine limits would be helpful in choosing 
designs using zookeeper which can scale.  For example, if metadata about each 
shard of a partitioned database architecture is mapped onto a distinct znode in 
zookeeper, then there could be a very large number of znodes for a large 
database deployment.   While this would make it easy to reassign shards to 
services dynamically, the design might impose an unforeseen limit on the #of 
shards in the database.  A similar concern would apply to an attempt to 
maintain metadata about each file in a distributed file system.

Issue [ZOOKEEPER-272] described some problems when nodes have a large number 
of children.  However, it did not elaborate on whether the change to an 
Iterator model would break the atomic semantics of the List of children 
or if the Iterator would be backed by a snapshot of the children as it existed 
at the time the iterator was requested, which would put a memory burden on the 
ensemble.  This raises the related question of when designs which work around 
scaling limits in zookeeper might break desirable semantics, primarily the 
ability to have a consistent view of the distributed state.

Put another way, are there anti-patterns for zookeeper relating to scalability? 
 Too many children?  Too much depth?  Avoid decomposing large numbers of 
children into hierarchies?  Etc.
