Re: Possible race in LETest.java

2009-11-11 Thread Flavio Junqueira

Henry already opened one: ZOOKEEPER-569.

-Flavio

On Nov 11, 2009, at 7:03 AM, Patrick Hunt wrote:


Closing the loop - what's the status on this? Can one of you open a
JIRA and provide a patch for this?

Thanks,

Patrick

Flavio Junqueira wrote:
Hi Henry, Apologies for the delay. Your observation sounds right to
me. Here is how I'm reading it; let me know if it makes sense.

If everyone votes for 3 in the second round and 3 has crashed, then in
countVotes we will remove all votes for 3 and there will be no vote left.
In such a case, there will be no winner as a result of the call to
countVotes and lookForLeader won't change the current vote
(LeaderElection.java:201). This is a situation in which we are stuck.

Does it sound reasonable to add an else to the if statement at
LeaderElection.java:201 to reset the vote? This modification would
implement resetting the vote when countVotes returns no winner, which
should happen only when the replica itself votes for a dead leader.
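
The scenario and the proposed else branch can be sketched roughly as follows. This is an illustration only, not the code in LeaderElection.java: the class VoteResetSketch, the NO_WINNER sentinel, the nextVote helper, and the tie-breaking rule are all assumptions made for the example; only the name countVotes comes from the thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class VoteResetSketch {
    // Hypothetical sentinel meaning "the tally produced no winner".
    static final long NO_WINNER = -1L;

    /**
     * Tallies votes, discarding any vote whose target has not been heard from.
     * Returns the id with the most remaining votes (ties go to the higher id),
     * or NO_WINNER if every vote was discarded.
     */
    static long countVotes(Map<Long, Long> votes, Set<Long> heardFrom) {
        Map<Long, Integer> tally = new HashMap<>();
        for (long candidate : votes.values()) {
            if (heardFrom.contains(candidate)) {
                tally.merge(candidate, 1, Integer::sum);
            }
        }
        long winner = NO_WINNER;
        int best = 0;
        for (Map.Entry<Long, Integer> e : tally.entrySet()) {
            if (e.getValue() > best || (e.getValue() == best && e.getKey() > winner)) {
                best = e.getValue();
                winner = e.getKey();
            }
        }
        return winner;
    }

    /** One pass of the proposed logic: adopt the winner, otherwise vote for self. */
    static long nextVote(long myId, Map<Long, Long> votes, Set<Long> heardFrom) {
        long winner = countVotes(votes, heardFrom);
        if (winner != NO_WINNER) {
            return winner;
        } else {
            return myId; // the proposed else branch: reset the vote
        }
    }

    public static void main(String[] args) {
        // Nodes 1 and 2 both vote for 3, but only 1 and 2 have been heard from.
        Map<Long, Long> votes = new HashMap<>();
        votes.put(1L, 3L);
        votes.put(2L, 3L);
        Set<Long> heardFrom = new HashSet<>(Arrays.asList(1L, 2L));

        System.out.println(countVotes(votes, heardFrom));  // prints -1: no vote is left
        System.out.println(nextVote(1L, votes, heardFrom)); // prints 1: node 1 resets to itself
    }
}
```

Without the else branch, node 1 would keep its vote for 3 and the same empty tally would repeat on every round.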

-Flavio

On Oct 28, 2009, at 7:44 AM, Henry Robinson wrote:


[ Sending this direct since the Apache mailserver is rejecting my
e-mails at the moment ]

As I understand it, 1 and 2 receive a vote for 3 in the first round,
which causes them to vote for 3 in the second round. So in the second
round, all votes cast are for 3. But 3 has died, so all votes for it
are discounted. 1 and 2 continue to vote for 3 ad infinitum, never
resetting their vote.

Does this sound plausible, or am I missing something?

cheers,
Henry

On Tue, Oct 27, 2009 at 3:48 PM, Flavio Junqueira f...@yahoo-inc.com
wrote:
Hi Henry, I don't understand how 1 and 2 do not end up electing 2 in
your situation. If they exclude 3 in countVotes, then countVotes will
end up returning 2 and not 3, assuming there is a vote for 2. What am
I missing?

The problem with QuorumPeer you're pointing at was also an issue with
the FLE tests, and I couldn't see an easy way around it other than
timing out and restarting leader election.

Cheers,
-Flavio


On Oct 27, 2009, at 6:35 AM, Henry Robinson wrote:

I've been working on adding a TCPResponderThread to the leader election
process so that if a deployment needs to be TCP only, it can be and still
use all election types. Testing this has exposed what might be a race
condition in the leader election code that prevents a leader from being
elected.

Here's the behaviour I see in LETest occasionally. With three nodes (reduced
from 30 for ease of debugging), node 3 gets elected before either node 1 or
node 2 finishes its election (there is one round where each node learns that
3 has the highest id, and then 3 completes its second round by receiving
votes for itself from 1 and 2, but 1 and 2 do not receive votes from 3).

Now 3 is killed by the test harness. 1 and 2 are still voting for it, but
every time they try, the vote tally excludes 3 since it hasn't been heard
from. They then spin round the voting process, unable to reset their vote. I
expect that the heartbeat mechanism in a running QuorumPeer takes care of
this when the leader is lost, but the associated QuorumPeers aren't
running.

If this is the case, then there is a simple fix: reset each node's vote to
itself if it is voting for a node that hasn't been heard from. I
don't know why using TCP instead of UDP for the responder thread is
exacerbating this (and we can't rule out my introducing a bug :)); but as
it's a race condition, the different timings associated with waiting on
a TCP socket might just be enough to expose the issue.

Can someone verify this might be possible / figure out what I missed?


cheers,
Henry









[jira] Commented: (ZOOKEEPER-550) Java Queue Recipe

2009-11-11 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776370#action_12776370
 ] 

Mahadev konar commented on ZOOKEEPER-550:
-

By installing the ZooKeeper library, I meant running make install and installing 
it in the standard path of your system.

 Java Queue Recipe
 -

 Key: ZOOKEEPER-550
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-550
 Project: Zookeeper
  Issue Type: New Feature
  Components: java client
Affects Versions: 3.2.1
Reporter: Steven Cheng
Assignee: Steven Cheng
Priority: Minor
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, 
 ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, 
 ZOOKEEPER-550.patch, ZOOKEEPER-550.patch


 This patch adds a recipe for creating a distributed queue with ZooKeeper 
 similar to the WriteLock recipe and some sequential tests.  This early 
 attempt follows the Java BlockingQueue interface, though it doesn't implement 
 it since I don't think there's a good reason for it to be Iterable.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-368) Observers

2009-11-11 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776482#action_12776482
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-368:
--

Thanks for the updated patch and the review guide, Henry. The review guide is 
quite handy.

To me, we need the following to complete this patch:

1. Make it work with FLE, which is the default leader election;
2. Get rid of all hardcoded quorum.size() / 2 checks and replace them with 
containsQuorum();
3. Include a test with hierarchical quorums and observers;
4. Prepare a Forrest document for the feature, describing what it does, how to 
configure ZooKeeper to use it, and perhaps one or two cases in which it would 
be useful to use observers.


 Observers
 -

 Key: ZOOKEEPER-368
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
 Project: Zookeeper
  Issue Type: New Feature
  Components: quorum
Reporter: Flavio Paiva Junqueira
Assignee: Henry Robinson
 Attachments: obs-refactor.patch, observer-refactor.patch, observers 
 sync benchmark.png, observers.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch


 Currently, all servers of an ensemble participate actively in reaching 
 agreement on the order of ZooKeeper transactions. That is, all followers 
 receive proposals, acknowledge them, and receive commit messages from the 
 leader. A leader issues commit messages once it receives acknowledgments from 
 a quorum of followers. For cross-colo operation, it would be useful to have a 
 third role: observer. Using Paxos terminology, observers are similar to 
 learners. An observer does not participate actively in the agreement step of 
 the atomic broadcast protocol. Instead, it only commits proposals that have 
 been accepted by some quorum of followers.
 One simple way to implement observers is to have the leader forward 
 commit messages not only to followers but also to observers, and have 
 observers apply transactions in the order the followers agreed upon. 
 In the current implementation of the protocol, however, commit messages do 
 not carry their corresponding transaction payload, because all servers 
 other than the leader are followers, and followers receive the payload 
 first through a proposal message. Simply forwarding commit messages as they 
 currently are to an observer is consequently not sufficient. We have a couple 
 of options:
 1- Include the transaction payload in commit messages to observers;
 2- Send proposals to observers as well.
 Number 2 is simpler to implement because it doesn't require changing the 
 protocol implementation, but it increases traffic slightly. The performance 
 impact of such an increase might be insignificant, though.
 For scalability, we may consider having followers also forward 
 commit messages to observers. With this option, observers can connect to 
 followers and receive messages from them. This choice is important to 
 avoid increasing the load on the leader as the number of observers grows.
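
 To illustrate why a bare commit message is not enough for an observer, here is a minimal sketch under stated assumptions. The class ObserverSketch, the Learner type, and the methods onProposal and onCommit are hypothetical names invented for this example; they are not ZooKeeper's classes. The sketch models only option 2 (the observer also receives proposals) and leaves out acknowledgments entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ObserverSketch {
    /** A peer that buffers proposal payloads by zxid and applies them on commit. */
    static class Learner {
        final Map<Long, String> pending = new HashMap<>();
        final List<String> applied = new ArrayList<>();

        /** A proposal carries the transaction payload. */
        void onProposal(long zxid, String payload) {
            pending.put(zxid, payload);
        }

        /**
         * A commit carries only the zxid. Returns false when this peer never
         * saw the matching proposal and therefore has nothing to apply.
         */
        boolean onCommit(long zxid) {
            String payload = pending.remove(zxid);
            if (payload == null) {
                return false;
            }
            applied.add(payload);
            return true;
        }
    }

    public static void main(String[] args) {
        // Observer that is only forwarded commit messages: nothing to apply.
        Learner commitOnly = new Learner();
        System.out.println(commitOnly.onCommit(1L)); // prints false

        // Option 2: the observer also receives the proposal, so the commit succeeds.
        Learner withProposals = new Learner();
        withProposals.onProposal(1L, "create /a");
        System.out.println(withProposals.onCommit(1L)); // prints true
        System.out.println(withProposals.applied);      // prints [create /a]
    }
}
```

 Option 1 would instead put the payload into the commit message itself, which changes the message format but saves the extra proposal traffic.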




Re: Possible race in LETest.java

2009-11-11 Thread Henry Robinson
The patch is simple, but a test is harder because it's a race condition. I'm
working on it.

Henry


[jira] Updated: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral

2009-11-11 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-472:
---

Status: Open  (was: Patch Available)

 Making DataNode not instantiate a HashMap when the node is ephmeral
 ---

 Key: ZOOKEEPER-472
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-472
 Project: Zookeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.2.0, 3.1.1
Reporter: Erik Holstad
Assignee: Erik Holstad
Priority: Minor
 Fix For: 3.3.0

 Attachments: zookeeper-472.patch, zookeeper-472.patch, 
 zookeeper-472.patch, zookeeper-472.patch


 Looking at the code, there is an overhead of a HashSet object for each node's 
 children, even though the node might be an ephemeral node and cannot have 
 children.




[jira] Updated: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral

2009-11-11 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-472:
---

Status: Patch Available  (was: Open)




[jira] Updated: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral

2009-11-11 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-472:
---

Attachment: zookeeper-472.patch

Updated patch to compile against latest trunk.

Also cleaned up some finals.

Also reduced the default child HashSet size to 8 rather than 16 (let's be 
conservative as to the average number of subnodes).

Small optimization to getChildren in DataTree: allocate an array list of 
exactly the right size.
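
The idea behind this issue, allocating the child set only when a node actually gets a child, might look roughly like the sketch below. It is not the attached patch: the class LazyChildrenSketch, the Node type, and the hasChildSet helper are hypothetical, and the synchronization is kept on the node purely to keep the example self-contained. Only the initial capacity of 8 and the exactly-sized list come from the comment above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LazyChildrenSketch {
    static class Node {
        // Stays null until the first child is added, so childless nodes
        // (for example ephemeral ones) never pay for a HashSet.
        private Set<String> children;

        synchronized boolean addChild(String name) {
            if (children == null) {
                children = new HashSet<>(8); // smaller default capacity than 16
            }
            return children.add(name);
        }

        synchronized List<String> getChildren() {
            if (children == null) {
                return Collections.emptyList();
            }
            // The copy constructor sizes the list from the source collection.
            return new ArrayList<>(children);
        }

        /** Test helper: reports whether the child set has been allocated. */
        synchronized boolean hasChildSet() {
            return children != null;
        }
    }

    public static void main(String[] args) {
        Node ephemeral = new Node();
        System.out.println(ephemeral.hasChildSet()); // prints false
        System.out.println(ephemeral.getChildren()); // prints []

        Node parent = new Node();
        parent.addChild("a");
        System.out.println(parent.hasChildSet());    // prints true
        System.out.println(parent.getChildren());    // prints [a]
    }
}
```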




[jira] Commented: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral

2009-11-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776510#action_12776510
 ] 

Hadoop QA commented on ZOOKEEPER-472:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424615/zookeeper-472.patch
  against trunk revision 833938.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/61/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/61/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/61/console

This message is automatically generated.




[jira] Commented: (ZOOKEEPER-425) Add OSGi metadata to zookeeper.jar

2009-11-11 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776538#action_12776538
 ] 

Benjamin Reed commented on ZOOKEEPER-425:
-

Sorry I didn't notice this sooner. This is a great idea, and certainly 
reasonable. I think the import and export packages statement is incorrect. We 
should list the exact dependencies.

 Add OSGi metadata to zookeeper.jar
 --

 Key: ZOOKEEPER-425
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-425
 Project: Zookeeper
  Issue Type: Improvement
  Components: build
Affects Versions: 3.1.1
Reporter: David Bosschaert
 Attachments: MANIFEST.MF


 After adding OSGi metadata to zookeeper.jar it can be used as both an OSGi 
 bundle as well as an ordinary jar file. 
 In the CXF/DOSGi project the buildsystem does this using the 
 maven-bundle-plugin: 
 http://svn.apache.org/repos/asf/cxf/dosgi/trunk/discovery/distributed/zookeeper-wrapper/pom.xml
 The MANIFEST.MF generated by maven-bundle-plugin is attached to this bug, 
 this works for the CXF/DOSGi project.
 If your buildsystem isn't using maven, I would advise to use bnd 
 (http://www.aqute.biz/Code/Bnd). BND defines its own ant task in which you 
 should be able to use more or less the same instructions as were used in 
 maven:
 <instructions>
   <Bundle-Name>ZooKeeper bundle</Bundle-Name>
   <Bundle-Description>This bundle contains the ZooKeeper library</Bundle-Description>
   <Bundle-SymbolicName>org.apache.hadoop.zookeeper</Bundle-SymbolicName>
   <Bundle-Version>3.1.1</Bundle-Version>
   <Import-Package>*</Import-Package>
   <Export-Package>*;version=3.1.1</Export-Package>
 </instructions>
 Oh and one other thing. Is it really necessary to put the source code in the 
 Jar file too? I would put that in a separate source distribution :)
 See also: 
 http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200905.mbox/%3c4a2009b1.3030...@yahoo-inc.com%3e




[jira] Updated: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral

2009-11-11 Thread Erik Holstad (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Holstad updated ZOOKEEPER-472:
---

Attachment: zookeeper-472.patch

Fixed the FindBugs warning, hopefully, by moving the synchronization away from 
the node object into the different caller methods.




[jira] Commented: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral

2009-11-11 Thread Erik Holstad (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776549#action_12776549
 ] 

Erik Holstad commented on ZOOKEEPER-472:


Not really sure what to do about the "no tests" -1. It is kind of hard to include 
a new test for something like this and make it portable in a good way. 
Since I'm not adding any new functionality, the old tests should be enough. 
What is the general opinion on this matter?

Erik




[jira] Commented: (ZOOKEEPER-425) Add OSGi metadata to zookeeper.jar

2009-11-11 Thread David Bosschaert (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776548#action_12776548
 ] 

David Bosschaert commented on ZOOKEEPER-425:


Hi Benjamin. Did you actually take a look at the attached MANIFEST.MF? Are the 
import and export statements in there correct? What is in the XML above is just 
the maven-bundle-plugin/BND instructions to create the exact list by 
introspection :)
If you think it would be better to explicitly list everything when creating the 
manifest, please state what you would like to see in there.




[jira] Updated: (ZOOKEEPER-577) LeaderElection code hardcodes majority quorums in at least two places

2009-11-11 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-577:
-

  Component/s: server
Affects Version/s: 3.2.1

 LeaderElection code hardcodes majority quorums in at least two places
 -

 Key: ZOOKEEPER-577
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-577
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.2.1
Reporter: Henry Robinson
Assignee: Henry Robinson

 See e.g. lookForLeader in LeaderElection.java and termPredicate in 
 AuthFastLeaderElection.java
 Should use containsQuorum. 
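
 As a rough illustration of the difference, the sketch below compares a hardcoded size / 2 test with a pluggable check. Everything here is an assumption made for the example: the class QuorumCheckSketch, the local QuorumVerifier interface, and the majorityOf and hardcodedMajority helpers are hypothetical and do not reproduce the actual ZooKeeper code; only the method name containsQuorum comes from the issue.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class QuorumCheckSketch {
    /** Hypothetical stand-in for a pluggable quorum check. */
    interface QuorumVerifier {
        boolean containsQuorum(Set<Long> ackSet);
    }

    /** Majority verifier: strictly more than half of the voting members. */
    static QuorumVerifier majorityOf(Set<Long> votingMembers) {
        return ackSet -> {
            int acks = 0;
            for (Long id : ackSet) {
                if (votingMembers.contains(id)) {
                    acks++;
                }
            }
            return acks > votingMembers.size() / 2;
        };
    }

    /** The hardcoded style of check: compares against half of every known peer. */
    static boolean hardcodedMajority(Set<Long> ackSet, int totalPeers) {
        return ackSet.size() > totalPeers / 2;
    }

    public static void main(String[] args) {
        Set<Long> voters = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        int totalPeers = 5; // three voters plus two non-voting peers
        Set<Long> acks = new HashSet<>(Arrays.asList(1L, 2L));

        System.out.println(hardcodedMajority(acks, totalPeers));     // prints false
        System.out.println(majorityOf(voters).containsQuorum(acks)); // prints true
    }
}
```

 Routing every check through one interface also leaves room for other quorum definitions, such as weighted or hierarchical ones, without touching the election code.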




[jira] Created: (ZOOKEEPER-577) LeaderElection code hardcodes majority quorums in at least two places

2009-11-11 Thread Henry Robinson (JIRA)
LeaderElection code hardcodes majority quorums in at least two places
-

 Key: ZOOKEEPER-577
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-577
 Project: Zookeeper
  Issue Type: Bug
Reporter: Henry Robinson
Assignee: Henry Robinson


See e.g. lookForLeader in LeaderElection.java and termPredicate in 
AuthFastLeaderElection.java

Should use containsQuorum. 




[jira] Commented: (ZOOKEEPER-368) Observers

2009-11-11 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776566#action_12776566
 ] 

Henry Robinson commented on ZOOKEEPER-368:
--

Flavio, Ben - thanks for the comments! Feels like we're getting close with this 
one.

To Flavio's specific points:

1. In order to make this work with FLE, the easiest thing is to have a 
ResponderThread be running all the time. However, a ResponderThread currently 
only runs when electionAlg=0. To make the responder thread run for all 
electionAlg types is easy, but this introduces a UDP dependency which some 
installations do not want. So we need to make ResponderThread be both UDP and 
TCP compliant. This is easy enough (I have written this code), but it also 
makes configuration yet more complicated because there is yet another port that 
needs specifying (there is some port re-use in the code currently that's a bit 
sketchy I think, and that doesn't work in all cases, we need another dedicated 
port). We will have to discuss whether we want to require strings of the form 
server.id:address:port:port:port:learnertype or if it's time to break out the 
per-server configuration into a more structured format. At this point, I feel 
like this is complicated enough, and orthogonal to Observers, to warrant its 
own JIRA - it would make the Observers patch too complicated. Also, this 
feature requires getting the race condition bug fixed. 

I've created https://issues.apache.org/jira/browse/ZOOKEEPER-578 for this issue.

So we can block the Observers patch on this feature, or we can get a reduced 
Observers patch in (and prevent another cycle of refactoring when trunk gets 
updated and the patch no longer applies). Either is good; but I'm probably in 
favour of getting the patch in now and updating once the ResponderThread JIRA 
gets closed. The change to re-enable Observers for all election types is pretty 
trivial.

2. I think this is a great idea - I'd point out that the hardcoded 
quorum.size() / 2 usages predate the Observers patch! For example, see 
termPredicate(..) in AuthFastLeaderElection.java and lookForLeader in 
LeaderElection.java. This should therefore be a separate JIRA (I'm trying to 
avoid having several issues fixed by this patch).

I've created https://issues.apache.org/jira/browse/ZOOKEEPER-577 for this 
issue. 

3. Yes, will do.

4. Yep, will do.

Ben - I didn't take great notes at that meeting (jetlag!), but my recollection 
is: we were trying to reconcile having Observers change roles and join the 
ensemble as voting members with the complications of doing so. Zero-weight 
followers are a great way to do that. However, we decided that this 
might not actually be a feature we wanted. At that point, the optimisations you 
can make with Observers, particularly for WANs, such as batching and the 
single-message INFORM protocol, mean it makes sense to logically separate out 
Observers in the code. We could have special-cased handling of 0-weight clients, 
but we felt that since this would involve a step-change in the behaviour of 
peers as the weight went from 0 to 0+, it would be a bit counterintuitive.




[jira] Created: (ZOOKEEPER-578) ResponderThread should be able to use TCP or UDP

2009-11-11 Thread Henry Robinson (JIRA)
ResponderThread should be able to use TCP or UDP


 Key: ZOOKEEPER-578
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-578
 Project: Zookeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.2.1
Reporter: Henry Robinson
Assignee: Henry Robinson


The ResponderThread, which responds to inquiries for the current leader, 
currently only uses UDP. It also only runs for electionAlg=0. 

Observers will eventually require that a ResponderThread runs for all election 
types. However, this introduces a UDP dependency which some installations do 
not want. This would also allow such installations to use electionAlg=0 
(although this is not a big win as it is the least sophisticated election 
algorithm). 

Therefore we should be able to toggle ResponderThread to use either TCP or UDP. 
Since UDP is more performant, it probably makes sense to retain it. So I 
propose to choose between the two at startup time using a configuration flag 
responderTCP=true.

Fixing this issue exposed ZOOKEEPER-569, on which this JIRA depends. 
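
As a rough illustration of the proposed startup toggle, here is a minimal sketch that reads a responderTCP flag from a properties-style configuration and falls back to UDP when the flag is absent. The class, enum and method names are hypothetical and are not taken from the ZooKeeper code base; only the flag name responderTCP comes from the proposal above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch: none of these type or method names exist in ZooKeeper.
final class ResponderConfigSketch {
    enum Transport { UDP, TCP }

    /** Returns TCP only when responderTCP is "true" (case-insensitive); UDP otherwise. */
    static Transport responderTransport(Properties cfg) {
        String raw = cfg.getProperty("responderTCP", "false").trim();
        return Boolean.parseBoolean(raw) ? Transport.TCP : Transport.UDP;
    }

    /** Parses key=value lines the way a Java properties file would. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException(e);
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(responderTransport(parse("electionAlg=0\nresponderTCP=true\n")));
        System.out.println(responderTransport(parse("electionAlg=0\n")));
    }
}
```

Defaulting to UDP keeps the existing behaviour for installations that do not set the flag.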

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Possible race in LETest.java

2009-11-11 Thread Patrick Hunt

Ok, great. Good to see you guys are on top of this one!

Patrick

Henry Robinson wrote:

The patch is simple, but a test is harder because it's a race condition. I'm
working on it.

Henry

On Wed, Nov 11, 2009 at 12:06 AM, Flavio Junqueira f...@yahoo-inc.com wrote:


Henry already opened one: ZOOKEEPER-569.

-Flavio


On Nov 11, 2009, at 7:03 AM, Patrick Hunt wrote:

Closing the loop - what's the status on this? Can one of you open a
JIRA and provide a patch for this?

Thanks,

Patrick

Flavio Junqueira wrote:


Hi Henry, Apologies for the delay. Your observation sounds right to
me. Here is how I'm reading it; let me know if it makes sense.

If everyone votes for 3 in the second round and 3 has crashed, then in
countVotes we will remove all votes to 3 and there will be no vote left.
In such a case, there will be no winner as a result of the call to
countVotes and lookForLeader won't change the current vote
(LeaderElection.java:201). This is a situation in which we are stuck.

Does it sound reasonable to add an else to the if statement of
LeaderElection.java:201 to reset the vote? This modification would
implement resetting the vote when countVotes returns no winner, which
should happen only when the replica itself votes for a dead leader.
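
To make the stuck case concrete, here is a minimal sketch of a tally that discards votes for servers that were not heard from, plus the proposed else branch that resets the vote. It is an illustration only: the names, the -1 "no winner" marker and the tie-breaking rule are assumptions, not the code in LeaderElection.java.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; it does not mirror the real LeaderElection class.
final class VoteResetSketch {
    static final long NO_WINNER = -1L;

    /** Tallies votes (voter id -> candidate id), ignoring candidates not heard from. */
    static long countVotes(Map<Long, Long> votes, Set<Long> heardFrom) {
        Map<Long, Integer> tally = new HashMap<>();
        for (long candidate : votes.values()) {
            if (heardFrom.contains(candidate)) {
                tally.merge(candidate, 1, Integer::sum);
            }
        }
        long winner = NO_WINNER;
        int best = 0;
        for (Map.Entry<Long, Integer> e : tally.entrySet()) {
            // Assumed rule: most votes wins, ties go to the higher id.
            if (e.getValue() > best || (e.getValue() == best && e.getKey() > winner)) {
                winner = e.getKey();
                best = e.getValue();
            }
        }
        return winner;
    }

    /** Adopts the winner if there is one; otherwise resets the vote to ourselves. */
    static long nextVote(long myId, Map<Long, Long> votes, Set<Long> heardFrom) {
        long winner = countVotes(votes, heardFrom);
        if (winner != NO_WINNER) {
            return winner;
        }
        return myId; // the proposed "else": never keep voting for a dead server
    }
}
```

With servers 1 and 2 both voting for the crashed server 3, countVotes returns no winner, and nextVote moves each server back to voting for itself, so the following round can make progress.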

-Flavio

On Oct 28, 2009, at 7:44 AM, Henry Robinson wrote:

[ Sending this direct since the Apache mailserver is rejecting my
e-mails at the moment ]

As I understand it, 1 and 2 receive a vote for 3 in the first round,
which causes them to vote for 3 in the second round. So in the second
round, all votes cast are for 3. But 3 has died, so all votes for it
are discounted. 1 and 2 continue to vote for 3 ad infinitum, never
resetting their vote.

Does this sound plausible, or am I missing something?

cheers,
Henry

On Tue, Oct 27, 2009 at 3:48 PM, Flavio Junqueira f...@yahoo-inc.com
wrote:
Hi Henry, I don't understand how 1 and 2 do not end up electing 2 in
your situation. If they exclude 3 in countVotes, then countVotes will
end up returning 2 and not 3, assuming there is a vote for 2. What am
I missing?

The problem with QuorumPeer you're pointing at was also an issue with
the FLE tests, and I couldn't see an easy way around it other than
timing out and restarting leader election.

Cheers,
-Flavio


On Oct 27, 2009, at 6:35 AM, Henry Robinson wrote:

I've been working on adding a TCPResponderThread to the leader election
process so that if a deployment needs to be TCP only, it can be and still
use all election types. Testing this has exposed what might be a race
condition in the leader election code that prevents a leader from being
elected.

Here's the behaviour I see in LETest occasionally. With three nodes (reduced
from 30 for ease of debugging), node 3 gets elected before either node 1 or
node 2 finishes its election (there is one round where each node learns that
3 has the highest id, and then 3 completes its second round by receiving
votes for itself from 1 and 2, but 1 and 2 do not receive votes from 3).

Now 3 is killed by the test harness. 1 and 2 are still voting for it, but
every time they try, the vote tally excludes 3 since it hasn't been heard
from. They then spin round the voting process, unable to reset their vote. I
expect that the heartbeat mechanism in a running QuorumPeer takes care of
this when the leader is lost, but the associated QuorumPeers aren't running.

If this is the case, then there is a simple fix to reset the nodes' vote to
themselves if they are voting for a node that hasn't been heard from. I
don't know why using TCP instead of UDP for the responder thread is
exacerbating this (and we can't rule out my introducing a bug :)); but as
it's a race condition the different timings associated with waiting on a TCP
socket might just be enough to expose the issue.

Can someone verify this might be possible / figure out what I missed?

cheers,
Henry









[jira] Updated: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral

2009-11-11 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-472:
---

Status: Patch Available  (was: Open)

 Making DataNode not instantiate a HashMap when the node is ephmeral
 ---

 Key: ZOOKEEPER-472
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-472
 Project: Zookeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.2.0, 3.1.1
Reporter: Erik Holstad
Assignee: Erik Holstad
Priority: Minor
 Fix For: 3.3.0

 Attachments: zookeeper-472.patch, zookeeper-472.patch, 
 zookeeper-472.patch, zookeeper-472.patch, zookeeper-472.patch, 
 zookeeper-472.patch


 Looking at the code, there is an overhead of a HashSet object for that node's 
 children, even though the node might be an ephemeral node and cannot have 
 children.
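
One way to avoid that overhead is to allocate the child set lazily, on the first child. The sketch below only illustrates the idea; the class and its methods are hypothetical and are not the attached patch or the real DataNode.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of lazy child-set allocation; not the real DataNode.
final class LazyChildrenNodeSketch {
    // Stays null until the first child is added, so childless
    // (for example ephemeral) nodes never pay for a HashSet.
    private Set<String> children;

    synchronized boolean addChild(String name) {
        if (children == null) {
            children = new HashSet<>();
        }
        return children.add(name);
    }

    synchronized boolean removeChild(String name) {
        return children != null && children.remove(name);
    }

    /** Returns a snapshot of the child names; empty if no child was ever added. */
    synchronized Set<String> getChildren() {
        if (children == null) {
            return Collections.emptySet();
        }
        return Collections.unmodifiableSet(new HashSet<>(children));
    }

    synchronized boolean hasAllocatedChildSet() {
        return children != null;
    }
}
```

Callers then have to tolerate an empty result instead of relying on a live set, which is the main behavioural difference from eager allocation.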




[jira] Commented: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral

2009-11-11 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776572#action_12776572
 ] 

Patrick Hunt commented on ZOOKEEPER-472:


In general we should have tests, but I'm fine with no test in this case: 
this is an optimization, not a bug fix, and I can't think of any test we could 
add that would be of benefit. We already have good test coverage in this area.





[jira] Commented: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral

2009-11-11 Thread Erik Holstad (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776580#action_12776580
 ] 

Erik Holstad commented on ZOOKEEPER-472:


Sounds good to me.




[jira] Commented: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral

2009-11-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776582#action_12776582
 ] 

Hadoop QA commented on ZOOKEEPER-472:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424625/zookeeper-472.patch
  against trunk revision 833938.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/62/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/62/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/62/console

This message is automatically generated.




[jira] Commented: (ZOOKEEPER-425) Add OSGi metadata to zookeeper.jar

2009-11-11 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776592#action_12776592
 ] 

Patrick Hunt commented on ZOOKEEPER-425:


A dumb q: adding this to the manifest of the zk jar will have no effect on 
non-osgi containers (etc...), correct?

Also, re the original question "Is it really necessary to put the source code 
in the Jar file too?": notice that in the trunk we have changed things a bit 
since this jira was created. We still have the original jar which includes 
sources, but we also have an additional binary-only jar (class files) in 
addition to separate source and javadoc jars. You will see this in the package 
target of the latest build.xml. This was added for Maven -- we should be sure 
to include the manifest changes in those jars (just the jars containing class 
files, I guess?)

 Add OSGi metadata to zookeeper.jar
 --

 Key: ZOOKEEPER-425
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-425
 Project: Zookeeper
  Issue Type: Improvement
  Components: build
Affects Versions: 3.1.1
Reporter: David Bosschaert
 Attachments: MANIFEST.MF


 After adding OSGi metadata to zookeeper.jar it can be used as both an OSGi 
 bundle as well as an ordinary jar file. 
 In the CXF/DOSGi project the buildsystem does this using the 
 maven-bundle-plugin: 
 http://svn.apache.org/repos/asf/cxf/dosgi/trunk/discovery/distributed/zookeeper-wrapper/pom.xml
 The MANIFEST.MF generated by maven-bundle-plugin is attached to this bug, 
 this works for the CXF/DOSGi project.
 If your buildsystem isn't using maven, I would advise to use bnd 
 (http://www.aqute.biz/Code/Bnd). BND defines its own ant task in which you 
 should be able to use more or less the same instructions as were used in 
 maven:
 <instructions>
   <Bundle-Name>ZooKeeper bundle</Bundle-Name>
   <Bundle-Description>This bundle contains the ZooKeeper library</Bundle-Description>
   <Bundle-SymbolicName>org.apache.hadoop.zookeeper</Bundle-SymbolicName>
   <Bundle-Version>3.1.1</Bundle-Version>
   <Import-Package>*</Import-Package>
   <Export-Package>*;version=3.1.1</Export-Package>
 </instructions>
 Oh and one other thing. Is it really necessary to put the source code in the 
 Jar file too? I would put that in a separate source distribution :)
 See also: 
 http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200905.mbox/%3c4a2009b1.3030...@yahoo-inc.com%3e




[jira] Commented: (ZOOKEEPER-425) Add OSGi metadata to zookeeper.jar

2009-11-11 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776610#action_12776610
 ] 

Benjamin Reed commented on ZOOKEEPER-425:
-

right, these are osgi-specific tags that will normally get ignored.




[jira] Commented: (ZOOKEEPER-425) Add OSGi metadata to zookeeper.jar

2009-11-11 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776614#action_12776614
 ] 

Benjamin Reed commented on ZOOKEEPER-425:
-

oh sorry david. so i still have the same concern with the full manifest.mf, but 
before that i was wondering: are you trying to provide the bundle so that other 
bundles can use zookeeper or so that the bundle can start up a zookeeper server?

most of the packages imported and exported are internal to zookeeper and should 
be kept private. if we want to just provide access to the client API we should 
just list org.apache.zookeeper and org.apache.zookeeper.data (possibly 
org.apache.zookeeper.version). we should also use the script to set the version 
rather than hard code it. if you want to start the server, we should really 
have a separate package with just the classes/interfaces needed to manage a 
server instance and export that.

the only import we need is log4j. is there already a standard log4j bundle?




[jira] Commented: (ZOOKEEPER-368) Observers

2009-11-11 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776654#action_12776654
 ] 

Mahadev konar commented on ZOOKEEPER-368:
-

henry,
thanks for the patch.
the patch looks good.

some comments:

- looking at the patch it seems like it would work with servers prior to 
including this patch. Did you try some testing with current servers (killing 
one at a time and bringing them up in a round robin fashion) just to make sure 
it all works fine with the current servers (not including the patch)?
- what happens if a server configured as a follower is suddenly brought down and 
is made an observer, and the other way around as well? Just checking to see if 
we have these scenarios covered, because such mistakes are easy to make when 
setting up servers
- also it would be good to have more javadocs in the code. It's good to have 
javadocs explaining what's going on in each method (we lack that kind of 
documentation in the code, but I do hope we can get more javadoc)
- I think it's fine to do FLE in another jira as long as it gets done. It would 
not be a useful feature if it does not run with FLE. I would have gone with 
making it work with FLE first and then trying to see if it works with LE or not.
- replacing the quorums.getsize()/2 with containsQuorum() can surely be done in 
another jira.
- also, performance benchmarking the code with and without this patch, to make 
sure that this patch doesn't degrade the performance in any way, would be good 
to have

 Observers
 -

 Key: ZOOKEEPER-368
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
 Project: Zookeeper
  Issue Type: New Feature
  Components: quorum
Reporter: Flavio Paiva Junqueira
Assignee: Henry Robinson
 Attachments: obs-refactor.patch, observer-refactor.patch, observers 
 sync benchmark.png, observers.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch


 Currently, all servers of an ensemble participate actively in reaching 
 agreement on the order of ZooKeeper transactions. That is, all followers 
 receive proposals, acknowledge them, and receive commit messages from the 
 leader. A leader issues commit messages once it receives acknowledgments from 
 a quorum of followers. For cross-colo operation, it would be useful to have a 
 third role: observer. Using Paxos terminology, observers are similar to 
 learners. An observer does not participate actively in the agreement step of 
 the atomic broadcast protocol. Instead, it only commits proposals that have 
 been accepted by some quorum of followers.
 One simple solution to implement observers is to have the leader forward 
 commit messages not only to followers but also to observers, and have 
 observers apply transactions according to the order followers agreed upon. 
 In the current implementation of the protocol, however, commit messages do 
 not carry their corresponding transaction payload because all servers 
 different from the leader are followers and followers receive such a payload 
 first through a proposal message. Just forwarding commit messages as they 
 currently are to an observer consequently is not sufficient. We have a couple 
 of options:
 1- Include the transaction payload in commit messages to observers;
 2- Send proposals to observers as well.
 Number 2 is simpler to implement because it doesn't require changing the 
 protocol implementation, but it increases traffic slightly. The performance 
 impact due to such an increase might be insignificant, though.
 For scalability purposes, we may consider having followers also forward 
 commit messages to observers. With this option, observers can connect to 
 followers, and receive messages from followers. This choice is important to 
 avoid increasing the load on the leader with the number of observers. 
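
 The key property described above is that observers receive transactions but 
 never count towards agreement. A minimal sketch of that rule follows, assuming 
 a simple majority quorum over the voting servers only; the class and method 
 names are hypothetical and not taken from the patch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: tracks acknowledgments for one proposal.
final class ObserverQuorumSketch {
    private final Set<Long> voters;               // leader and followers only
    private final Set<Long> acks = new HashSet<>();

    ObserverQuorumSketch(Set<Long> voters) {
        this.voters = new HashSet<>(voters);
    }

    /**
     * Records an acknowledgment and reports whether the proposal can commit.
     * Acks from servers outside the voting set (observers) are ignored.
     */
    boolean ack(long serverId) {
        if (voters.contains(serverId)) {
            acks.add(serverId);
        }
        return containsQuorum();
    }

    /** True once a strict majority of the voting servers has acknowledged. */
    boolean containsQuorum() {
        return acks.size() > voters.size() / 2;
    }
}
```

 Because the observer ids are never in the voting set, adding observers does 
 not change how many acknowledgments a commit needs.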




[jira] Commented: (ZOOKEEPER-368) Observers

2009-11-11 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776690#action_12776690
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-368:
--

# I think it is quite important to have it working with FLE because FLE is the 
default leader election currently. My preference is to have it fixed before we 
get this patch in because it is not an unusual case. I'm happy to work with 
Henry on getting this fixed, btw;
# I didn't realize that the majority checks were the ones of the leader 
election implementations. This is pending at least for AFLE because AFLE does 
not use server identifiers, and there is a jira open to fix this issue 
(ZOOKEEPER-372). Fixing this hasn't been a priority because we haven't been 
able to decide whether we should support all implementations of leader election 
or not. We have been trying to keep FLE in good shape, though. To me, it is ok 
to postpone these changes if the checks are only on the LE and AFLE. 




[jira] Commented: (ZOOKEEPER-550) Java Queue Recipe

2009-11-11 Thread Steven Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776770#action_12776770
 ] 

Steven Cheng commented on ZOOKEEPER-550:


I haven't done a make install yet with zookeeper, so that part should be ok. 
Mahadev, I'll try applying the patch to a fresh checkout.


 Java Queue Recipe
 -

 Key: ZOOKEEPER-550
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-550
 Project: Zookeeper
  Issue Type: New Feature
  Components: java client
Affects Versions: 3.2.1
Reporter: Steven Cheng
Assignee: Steven Cheng
Priority: Minor
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, 
 ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, 
 ZOOKEEPER-550.patch, ZOOKEEPER-550.patch


 This patch adds a recipe for creating a distributed queue with ZooKeeper 
 similar to the WriteLock recipe and some sequential tests.  This early 
 attempt follows the Java BlockingQueue interface, though it doesn't implement 
 it since I don't think there's a good reason for it to be Iterable.  
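
 A queue like this typically stores each element as a sequential znode and 
 treats the child with the lowest sequence number as the head. The sketch below 
 shows only that ordering step, in plain Java with no ZooKeeper calls. The 
 "qn-" prefix and the class name are assumptions made for the example, not 
 details taken from the patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: picks the queue head from a list of child names.
final class QueueHeadSketch {
    static final String PREFIX = "qn-"; // assumed element prefix

    /** Returns the child with the lowest sequence suffix, or null if there is none. */
    static String head(List<String> children) {
        String best = null;
        long bestSeq = Long.MAX_VALUE;
        for (String child : children) {
            if (!child.startsWith(PREFIX)) {
                continue; // not a queue element
            }
            long seq;
            try {
                seq = Long.parseLong(child.substring(PREFIX.length()));
            } catch (NumberFormatException e) {
                continue; // malformed suffix, skip it
            }
            if (seq < bestSeq) {
                bestSeq = seq;
                best = child;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(head(Arrays.asList("qn-0000000012", "qn-0000000003", "other")));
    }
}
```

 Parsing the suffix as a number, rather than sorting the names as strings, 
 keeps the ordering correct even if the prefixes or suffix widths ever differ.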




[jira] Commented: (ZOOKEEPER-368) Observers

2009-11-11 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776840#action_12776840
 ] 

Benjamin Reed commented on ZOOKEEPER-368:
-

jeff, i agree that we shouldn't hold a patch to fix a bug somewhere else, but 
we also generally try to keep our trunk correct, so generally we want to see 
doc, test, and correct behavior before committing especially with something 
that touches the core. having said that i think the missing doc, functionality, 
and testing is confined to the observer function, so i think we should commit 
it and fix the rest of the observer code as separate patches to avoid having to 
refresh the patch.




[jira] Commented: (ZOOKEEPER-550) Java Queue Recipe

2009-11-11 Thread Steven Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776854#action_12776854
 ] 

Steven Cheng commented on ZOOKEEPER-550:


Just tried with a fresh checkout.  The recipes depend on the libraries being in 
src/c, which only happens if you run make in that directory.  Also, 
src/recipes/lock/src/c/tests/zkServer.sh does not have execute permissions on a 
fresh checkout; src/recipes/queue/src/c/tests/zkServer.sh also won't have 
execute permission because it's in a patch.

What would be a good solution to the library issue?  I noticed that the c 
client test builds to build/test/test-cppunit; could the recipe tests get the 
library from there instead of src/c?

