[jira] Updated: (ZOOKEEPER-799) Add tools and recipes for monitoring as a contrib

2010-07-13 Thread Andrei Savu (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Savu updated ZOOKEEPER-799:
--

Attachment: (was: monitoring.tar.gz)

 Add tools and recipes for monitoring as a contrib
 -

 Key: ZOOKEEPER-799
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-799
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib
Reporter: Andrei Savu
Assignee: Andrei Savu
 Attachments: monitoring.tar.gz, ZOOKEEPER-799.patch


 Tools and Recipes for Monitoring ZooKeeper using Cacti, Nagios or Ganglia. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-799) Add tools and recipes for monitoring as a contrib

2010-07-13 Thread Andrei Savu (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Savu updated ZOOKEEPER-799:
--

Attachment: (was: ZOOKEEPER-799.patch)

 Add tools and recipes for monitoring as a contrib
 -

 Key: ZOOKEEPER-799
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-799
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib
Reporter: Andrei Savu
Assignee: Andrei Savu
 Attachments: monitoring.tar.gz, ZOOKEEPER-799.patch


 Tools and Recipes for Monitoring ZooKeeper using Cacti, Nagios or Ganglia. 




[jira] Updated: (ZOOKEEPER-799) Add tools and recipes for monitoring as a contrib

2010-07-13 Thread Andrei Savu (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Savu updated ZOOKEEPER-799:
--

Attachment: monitoring.tar.gz

 Add tools and recipes for monitoring as a contrib
 -

 Key: ZOOKEEPER-799
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-799
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib
Reporter: Andrei Savu
Assignee: Andrei Savu
 Attachments: monitoring.tar.gz, ZOOKEEPER-799.patch


 Tools and Recipes for Monitoring ZooKeeper using Cacti, Nagios or Ganglia. 




[jira] Updated: (ZOOKEEPER-799) Add tools and recipes for monitoring as a contrib

2010-07-13 Thread Andrei Savu (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Savu updated ZOOKEEPER-799:
--

Status: Patch Available  (was: Open)

 Add tools and recipes for monitoring as a contrib
 -

 Key: ZOOKEEPER-799
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-799
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib
Reporter: Andrei Savu
Assignee: Andrei Savu
 Attachments: monitoring.tar.gz, ZOOKEEPER-799.patch


 Tools and Recipes for Monitoring ZooKeeper using Cacti, Nagios or Ganglia. 




[jira] Commented: (ZOOKEEPER-799) Add tools and recipes for monitoring as a contrib

2010-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887847#action_12887847
 ] 

Hadoop QA commented on ZOOKEEPER-799:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12449351/monitoring.tar.gz
  against trunk revision 962697.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/141/console

This message is automatically generated.

 Add tools and recipes for monitoring as a contrib
 -

 Key: ZOOKEEPER-799
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-799
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib
Reporter: Andrei Savu
Assignee: Andrei Savu
 Attachments: monitoring.tar.gz, ZOOKEEPER-799.patch


 Tools and Recipes for Monitoring ZooKeeper using Cacti, Nagios or Ganglia. 




[jira] Commented: (ZOOKEEPER-799) Add tools and recipes for monitoring as a contrib

2010-07-13 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887852#action_12887852
 ] 

Patrick Hunt commented on ZOOKEEPER-799:


I see, both files are necessary to build. I'll take a look at this asap (don't 
worry about hudson).

 Add tools and recipes for monitoring as a contrib
 -

 Key: ZOOKEEPER-799
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-799
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib
Reporter: Andrei Savu
Assignee: Andrei Savu
 Attachments: monitoring.tar.gz, ZOOKEEPER-799.patch


 Tools and Recipes for Monitoring ZooKeeper using Cacti, Nagios or Ganglia. 




[jira] Commented: (ZOOKEEPER-795) eventThread isn't shutdown after a connection session expired event coming

2010-07-13 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887854#action_12887854
 ] 

Mahadev konar commented on ZOOKEEPER-795:
-

ben, can you take a look at this?

 eventThread isn't shutdown after a connection session expired event coming
 

 Key: ZOOKEEPER-795
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-795
 Project: Zookeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.3.1
 Environment: ubuntu 10.04
Reporter: mathieu barcikowski
Priority: Blocker
 Fix For: 3.3.2, 3.4.0

 Attachments: ExpiredSessionThreadLeak.java, ZOOKEEPER-795.patch


 Hi,
 I noticed a problem with the eventThread located in the ClientCnxn.java file.
 The eventThread isn't shut down after a connection "session expired" event 
 arrives (i.e. it never receives the EventOfDeath).
 When a session timeout occurs and the session is marked as expired, the 
 connection is fully closed (socket, SendThread...) except for the eventThread.
 As a result, if I create a new zookeeper object and connect through it, I get 
 a zombie thread which will never be killed (as for the previous zookeeper 
 object, the state is already closed, so calling close again doesn't do anything).
 So every time I create a new zookeeper connection after an expired 
 session, I will have one more zombie EventThread.
 How to reproduce:
 - Start a zookeeper client connection in debug mode
 - Pause the JVM long enough for the expired event to occur
 - Watch the list of threads, for example with jvisualvm: the SendThread is 
 successfully killed, but the EventThread goes into a wait state 
 indefinitely
 - If you reopen a new zookeeper connection and repeat the previous steps, 
 another EventThread will be present in an infinite wait state
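
 The thread check in the reproduction steps can also be done from code instead
 of jvisualvm. The following is only a sketch: EventThreadCounter is a
 hypothetical helper, not part of ZooKeeper, and it assumes that the leaked
 threads carry "EventThread" somewhere in their name. The main method uses a
 stand-in thread rather than a real ZooKeeper client.

```java
import java.util.Set;

// Hypothetical helper (not part of ZooKeeper): counts live threads by name.
final class EventThreadCounter {

    // Returns the number of live threads whose name contains the fragment.
    static long countThreadsNamed(String fragment) {
        Set<Thread> threads = Thread.getAllStackTraces().keySet();
        return threads.stream()
                .filter(Thread::isAlive)
                .filter(t -> t.getName().contains(fragment))
                .count();
    }

    public static void main(String[] args) {
        // Stand-in for a leaked thread: it simply sleeps until interrupted.
        // The name "demo-EventThread" is an assumption made for this sketch.
        Thread standIn = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException ignored) {
                // Interrupted: let the thread finish.
            }
        }, "demo-EventThread");
        standIn.setDaemon(true);
        standIn.start();

        System.out.println("EventThread count: " + countThreadsNamed("EventThread"));
        standIn.interrupt();
    }
}
```

 Calling countThreadsNamed("EventThread") before and after each expired
 session would show whether the count keeps growing.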




[jira] Updated: (ZOOKEEPER-783) committedLog in ZKDatabase is not properly synchronized

2010-07-13 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-783:


Status: Patch Available  (was: Open)

i think we can do without a test on this one. marking it PA

 committedLog in ZKDatabase is not properly synchronized
 ---

 Key: ZOOKEEPER-783
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-783
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.3.1
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Critical
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-783.patch


 ZKDatabase.getCommittedLog() returns a reference to the LinkedList<Proposal> 
 committedLog in ZKDatabase. This is then iterated over by at least one 
 caller. 
 I have seen a bug that causes a NPE in LinkedList.clear on committedLog, 
 which I am pretty sure is due to the lack of synchronization. This bug has 
 not been apparent in normal ZK operation, but in code that I have that starts 
 and stops a ZK server in process repeatedly (clear() is called from 
 ZooKeeperServerMain.shutdown()). 
 It's better style to defensively copy the list in getCommittedLog, and to 
 synchronize on the list in ZKDatabase.clear.
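
 The defensive copy and the synchronized clear suggested above might look
 roughly like this. It is a minimal sketch, not the attached patch:
 CommittedLogHolder is a hypothetical stand-in for ZKDatabase, and String
 stands in for the Proposal type.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for ZKDatabase; String replaces Proposal.
final class CommittedLogHolder {
    private final LinkedList<String> committedLog = new LinkedList<>();

    void add(String proposal) {
        synchronized (committedLog) {
            committedLog.add(proposal);
        }
    }

    // Returns a copy taken under the lock, so callers can iterate over it
    // safely while another thread clears the underlying list.
    List<String> getCommittedLog() {
        synchronized (committedLog) {
            return new LinkedList<>(committedLog);
        }
    }

    // Clears under the same lock that guards add and the copy.
    void clear() {
        synchronized (committedLog) {
            committedLog.clear();
        }
    }

    public static void main(String[] args) {
        CommittedLogHolder db = new CommittedLogHolder();
        db.add("proposal-1");
        List<String> snapshot = db.getCommittedLog();
        db.clear();
        System.out.println("snapshot size: " + snapshot.size());
        System.out.println("live size: " + db.getCommittedLog().size());
    }
}
```

 The copy costs one pass over the list per call, which is the trade-off for
 never handing out the internal list.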




[jira] Commented: (ZOOKEEPER-799) Add tools and recipes for monitoring as a contrib

2010-07-13 Thread Andrei Savu (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887870#action_12887870
 ] 

Andrei Savu commented on ZOOKEEPER-799:
---

Not really. The archive only contains some extra files (screenshots). I don't 
understand why Hudson keeps trying to apply it as a patch even if it's not marked 
for inclusion. 


-original message-
Subject: [jira] Commented: (ZOOKEEPER-799) Add tools and recipes for monitoring 
as a contrib
From: Patrick Hunt (JIRA) j...@apache.org
Date: 13/07/2010 20:13


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887852#action_12887852
 ] 

Patrick Hunt commented on ZOOKEEPER-799:


I see, both files are necessary to build. I'll take a look at this asap (don't 
worry about hudson).







 Add tools and recipes for monitoring as a contrib
 -

 Key: ZOOKEEPER-799
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-799
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib
Reporter: Andrei Savu
Assignee: Andrei Savu
 Attachments: monitoring.tar.gz, ZOOKEEPER-799.patch


 Tools and Recipes for Monitoring ZooKeeper using Cacti, Nagios or Ganglia. 




[jira] Assigned: (ZOOKEEPER-800) zoo_add_auth returns ZOK if zookeeper handle is in ZOO_CLOSED_STATE

2010-07-13 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar reassigned ZOOKEEPER-800:
---

Assignee: Mahadev konar

 zoo_add_auth returns ZOK if zookeeper handle is in ZOO_CLOSED_STATE
 ---

 Key: ZOOKEEPER-800
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-800
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.3.1
Reporter: Michi Mutsuzaki
Assignee: Mahadev konar
Priority: Minor
 Fix For: 3.3.2, 3.4.0


 This happened when I called zoo_add_auth() immediately after 
 zookeeper_init(). It took me a while to figure out that authentication 
 actually failed since zoo_add_auth() returned ZOK. It should return 
 ZINVALIDSTATE instead. 
 --Michi




[jira] Commented: (ZOOKEEPER-783) committedLog in ZKDatabase is not properly synchronized

2010-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887891#action_12887891
 ] 

Hadoop QA commented on ZOOKEEPER-783:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12446054/ZOOKEEPER-783.patch
  against trunk revision 962697.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/142/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/142/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/142/console

This message is automatically generated.

 committedLog in ZKDatabase is not properly synchronized
 ---

 Key: ZOOKEEPER-783
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-783
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.3.1
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Critical
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-783.patch


 ZKDatabase.getCommittedLog() returns a reference to the LinkedList<Proposal> 
 committedLog in ZKDatabase. This is then iterated over by at least one 
 caller. 
 I have seen a bug that causes a NPE in LinkedList.clear on committedLog, 
 which I am pretty sure is due to the lack of synchronization. This bug has 
 not been apparent in normal ZK operation, but in code that I have that starts 
 and stops a ZK server in process repeatedly (clear() is called from 
 ZooKeeperServerMain.shutdown()). 
 It's better style to defensively copy the list in getCommittedLog, and to 
 synchronize on the list in ZKDatabase.clear.




[jira] Commented: (ZOOKEEPER-765) Add python example script

2010-07-13 Thread Andrei Savu (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887896#action_12887896
 ] 

Andrei Savu commented on ZOOKEEPER-765:
---

I think Henry's queue should also be part of this patch: 
http://github.com/henryr/pyzk-recipes What do you think? 

 Add python example script
 -

 Key: ZOOKEEPER-765
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-765
 Project: Zookeeper
  Issue Type: Improvement
  Components: contrib-bindings, documentation
Reporter: Travis Crawford
Assignee: Andrei Savu
Priority: Minor
 Fix For: 3.4.0

 Attachments: zk.py, ZOOKEEPER-765.patch


 When adding some zookeeper-based functionality to a python script I had to 
 figure everything out without guidance, which while doable, would have been a 
 lot easier with an example. I extracted a skeleton program structure out with 
 hopes it's useful to others (maybe add as an example in the source or wiki?).
 This script does an aget() and sets a watch, and hopefully illustrates what's 
 going on, and where to plug in your application code that gets run when the 
 znode changes.
 There are probably some bugs, which if we fix now and provide a well-reviewed 
 example hopefully others will not run into the same mistakes.




[jira] Updated: (ZOOKEEPER-799) Add tools and recipes for monitoring as a contrib

2010-07-13 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-799:
---

Fix Version/s: 3.4.0

 Add tools and recipes for monitoring as a contrib
 -

 Key: ZOOKEEPER-799
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-799
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib
Reporter: Andrei Savu
Assignee: Andrei Savu
 Fix For: 3.4.0

 Attachments: monitoring.tar.gz, ZOOKEEPER-799.patch


 Tools and Recipes for Monitoring ZooKeeper using Cacti, Nagios or Ganglia. 




[jira] Assigned: (ZOOKEEPER-780) zkCli.sh generates a ArrayIndexOutOfBoundsException

2010-07-13 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt reassigned ZOOKEEPER-780:
--

Assignee: Andrei Savu

 zkCli.sh  generates a ArrayIndexOutOfBoundsException 
 -

 Key: ZOOKEEPER-780
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-780
 Project: Zookeeper
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.3.1
 Environment: Linux Ubuntu running in VMPlayer on top of Windows XP
Reporter: Miguel Correia
Assignee: Andrei Savu
Priority: Minor
 Fix For: 3.4.0

 Attachments: ZOOKEEPER-780.patch


 I'm starting to play with Zookeeper so I'm still running it in standalone 
 mode. This is not a big issue, but here it goes for the record. 
 I've run zkCli.sh to run some commands in the server. I created a znode 
 /groups. When I tried to create a znode client_1 inside /groups, I forgot to 
 include the data: an exception was generated and zkCli.sh crashed, instead of 
 just showing an error. I tried a few variations and it seems like the problem 
 is not including the data.
 A copy of the screen:
 [zk: localhost:2181(CONNECTED) 3] create /groups firstgroup
 Created /groups
 [zk: localhost:2181(CONNECTED) 4] create -e /groups/client_1
 Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3
   at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:678)
   at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:581)
   at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:353)
   at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:311)
   at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:270)
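
 One way to turn the crash into an error message is to check the argument
 count before indexing into the array. The sketch below is not the attached
 patch and does not reproduce ZooKeeperMain: CreateArgs, its USAGE text and
 the assumption that "create" takes optional -s/-e flags followed by a path
 and data are all illustrative.

```java
// Hypothetical validator for a "create [-s] [-e] path data" command line.
final class CreateArgs {
    static final String USAGE = "usage: create [-s] [-e] path data";

    // Returns null when the arguments are complete, otherwise a usage
    // message that the shell can print instead of crashing.
    static String validate(String[] args) {
        int i = 1; // args[0] is the command name itself
        while (i < args.length && (args[i].equals("-s") || args[i].equals("-e"))) {
            i++; // skip the optional flags
        }
        // Both a path and the data must remain after the flags.
        return (args.length - i >= 2) ? null : USAGE;
    }

    public static void main(String[] unused) {
        String[] ok = {"create", "/groups", "firstgroup"};
        String[] missingData = {"create", "-e", "/groups/client_1"};
        System.out.println(validate(ok) == null ? "ok" : validate(ok));
        System.out.println(validate(missingData) == null ? "ok" : validate(missingData));
    }
}
```

 With the second command from the screen copy above, validate returns the
 usage string because only the path is left after the -e flag.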




[jira] Commented: (ZOOKEEPER-790) Last processed zxid set prematurely while establishing leadership

2010-07-13 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887941#action_12887941
 ] 

Vishal K commented on ZOOKEEPER-790:


Folks,

Sorry for the delay. The patch did not work. Any other ideas? Thanks.

-Vishal

 Last processed zxid set prematurely while establishing leadership
 -

 Key: ZOOKEEPER-790
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790
 Project: Zookeeper
  Issue Type: Bug
  Components: quorum
Affects Versions: 3.3.1
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
Priority: Blocker
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-790.patch


 The leader code is setting the last processed zxid to the first of the new 
 epoch even before connecting to a quorum of followers. Because the leader 
 code sets this value before connecting to a quorum of followers 
 (Leader.java:281) and the follower code throws an IOException 
 (Follower.java:73) if the leader epoch is smaller, we have that when the 
 false leader drops leadership and becomes a follower, it finds a smaller 
 epoch and kills itself.
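
 The epoch comparison described above can be illustrated with a small sketch.
 It relies on the ZooKeeper convention that the epoch is stored in the high
 32 bits of a zxid and the counter in the low 32 bits. EpochCheck and its
 method names are hypothetical and do not reproduce Leader.java or
 Follower.java; only the "throw IOException when the leader epoch is smaller"
 rule is taken from the description.

```java
import java.io.IOException;

// Hypothetical illustration of the follower-side epoch check.
final class EpochCheck {

    // High 32 bits of a zxid: the epoch.
    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    // Low 32 bits of a zxid: the counter within the epoch.
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    // Rejects a leader whose epoch is smaller than our own.
    static void acceptLeader(long leaderZxid, long myZxid) throws IOException {
        long leaderEpoch = epochOf(leaderZxid);
        long myEpoch = epochOf(myZxid);
        if (leaderEpoch < myEpoch) {
            throw new IOException("Leader epoch " + leaderEpoch
                    + " is less than our epoch " + myEpoch);
        }
    }

    public static void main(String[] args) {
        long current = 77309411393L;       // a zxid of the kind seen in election logs
        long bumped = (epochOf(current) + 1) << 32; // first zxid of the next epoch
        System.out.println("epoch=" + epochOf(current) + " counter=" + counterOf(current));
        try {
            // A peer that already moved to the next epoch meets the old leader.
            acceptLeader(current, bumped);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

 The zxid 77309411393 decodes to epoch 18 and counter 65, so a peer that has
 prematurely moved to epoch 19 rejects a leader that is still in epoch 18.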




[jira] Commented: (ZOOKEEPER-790) Last processed zxid set prematurely while establishing leadership

2010-07-13 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887969#action_12887969
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-790:
--

Vishal, What exactly didn't work? Do you get the same error messages with the 
patch? Do you have a reliable way of reproducing it? In general, it would be 
useful if you could provide more detail.

Thanks!


 Last processed zxid set prematurely while establishing leadership
 -

 Key: ZOOKEEPER-790
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790
 Project: Zookeeper
  Issue Type: Bug
  Components: quorum
Affects Versions: 3.3.1
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
Priority: Blocker
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-790.patch


 The leader code is setting the last processed zxid to the first of the new 
 epoch even before connecting to a quorum of followers. Because the leader 
 code sets this value before connecting to a quorum of followers 
 (Leader.java:281) and the follower code throws an IOException 
 (Follower.java:73) if the leader epoch is smaller, we have that when the 
 false leader drops leadership and becomes a follower, it finds a smaller 
 epoch and kills itself.




Re: [jira] Commented: (ZOOKEEPER-790) Last processed zxid set prematurely while establishing leadership

2010-07-13 Thread Vishal K
Hi Flavio ,

I got the same error messages. I can reproduce this quite easily. I will
capture the logs again. Is there anything else you would like me to provide?
Thanks.

-Vishal

On Tue, Jul 13, 2010 at 3:52 PM, Flavio Paiva Junqueira (JIRA) 
j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887969#action_12887969]

 Flavio Paiva Junqueira commented on ZOOKEEPER-790:
 --

 Vishal, What exactly didn't work? Do you get the same error messages with
 the patch? Do you have a reliable way of reproducing it? In general, it
 would be useful if you could provide more detail.

 Thanks!


  Last processed zxid set prematurely while establishing leadership
  -
 
  Key: ZOOKEEPER-790
  URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790
  Project: Zookeeper
   Issue Type: Bug
   Components: quorum
 Affects Versions: 3.3.1
 Reporter: Flavio Paiva Junqueira
 Assignee: Flavio Paiva Junqueira
 Priority: Blocker
  Fix For: 3.3.2, 3.4.0
 
  Attachments: ZOOKEEPER-790.patch
 
 
  The leader code is setting the last processed zxid to the first of the
 new epoch even before connecting to a quorum of followers. Because the
 leader code sets this value before connecting to a quorum of followers
 (Leader.java:281) and the follower code throws an IOException
 (Follower.java:73) if the leader epoch is smaller, we have that when the
 false leader drops leadership and becomes a follower, it finds a smaller
 epoch and kills itself.





Re: [jira] Commented: (ZOOKEEPER-790) Last processed zxid set prematurely while establishing leadership

2010-07-13 Thread Flavio Junqueira
I forgot if you have already provided a description of how you reproduce it.
If you could point me to that, I would appreciate it.

-Flavio

On Jul 13, 2010, at 11:33 PM, Vishal K wrote:

 Hi Flavio ,

 I got the same error messages. I can reproduce this quite easily. I will
 capture the logs again. Is there anything else you would like me to provide?
 Thanks.

 -Vishal

 On Tue, Jul 13, 2010 at 3:52 PM, Flavio Paiva Junqueira (JIRA) 
 j...@apache.org wrote:

  [
  https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12887969#action_12887969]

  Flavio Paiva Junqueira commented on ZOOKEEPER-790:
  --

  Vishal, What exactly didn't work? Do you get the same error messages with
  the patch? Do you have a reliable way of reproducing it? In general, it
  would be useful if you could provide more detail.

  Thanks!

   Last processed zxid set prematurely while establishing leadership
   -

   Key: ZOOKEEPER-790
   URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790
   Project: Zookeeper
   Issue Type: Bug
   Components: quorum
   Affects Versions: 3.3.1
   Reporter: Flavio Paiva Junqueira
   Assignee: Flavio Paiva Junqueira
   Priority: Blocker
   Fix For: 3.3.2, 3.4.0

   Attachments: ZOOKEEPER-790.patch

   The leader code is setting the last processed zxid to the first of the
   new epoch even before connecting to a quorum of followers. Because the
   leader code sets this value before connecting to a quorum of followers
   (Leader.java:281) and the follower code throws an IOException
   (Follower.java:73) if the leader epoch is smaller, we have that when the
   false leader drops leadership and becomes a follower, it finds a smaller
   epoch and kills itself.

flavio junqueira
research scientist
f...@yahoo-inc.com
direct +34 93-183-8828
avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300
fax (408) 349 3301

[jira] Commented: (ZOOKEEPER-790) Last processed zxid set prematurely while establishing leadership

2010-07-13 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12888119#action_12888119
 ] 

Vishal K commented on ZOOKEEPER-790:


copying comments from email to jira.

Hi Flavio ,

I got the same error messages. I can reproduce this quite easily. I will 
capture the logs again. Is there anything else you would like me to provide? 
Thanks.

-Vishal

 Last processed zxid set prematurely while establishing leadership
 -

 Key: ZOOKEEPER-790
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790
 Project: Zookeeper
  Issue Type: Bug
  Components: quorum
Affects Versions: 3.3.1
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
Priority: Blocker
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-790.patch


 The leader code is setting the last processed zxid to the first of the new 
 epoch even before connecting to a quorum of followers. Because the leader 
 code sets this value before connecting to a quorum of followers 
 (Leader.java:281) and the follower code throws an IOException 
 (Follower.java:73) if the leader epoch is smaller, we have that when the 
 false leader drops leadership and becomes a follower, it finds a smaller 
 epoch and kills itself.




[jira] Commented: (ZOOKEEPER-790) Last processed zxid set prematurely while establishing leadership

2010-07-13 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12888121#action_12888121
 ] 

Vishal K commented on ZOOKEEPER-790:


copying comments from email to jira. 

I forgot if you have already provided a description of how you reproduce it. If 
you could point me to that, I would appreciate it.

-Flavio



 Last processed zxid set prematurely while establishing leadership
 -

 Key: ZOOKEEPER-790
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790
 Project: Zookeeper
  Issue Type: Bug
  Components: quorum
Affects Versions: 3.3.1
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
Priority: Blocker
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-790.patch


 The leader code is setting the last processed zxid to the first of the new 
 epoch even before connecting to a quorum of followers. Because the leader 
 code sets this value before connecting to a quorum of followers 
 (Leader.java:281) and the follower code throws an IOException 
 (Follower.java:73) if the leader epoch is smaller, we have that when the 
 false leader drops leadership and becomes a follower, it finds a smaller 
 epoch and kills itself.




[jira] Commented: (ZOOKEEPER-790) Last processed zxid set prematurely while establishing leadership

2010-07-13 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12888122#action_12888122
 ] 

Vishal K commented on ZOOKEEPER-790:


From ZOOKEEPER-335..

Hi,

I enabled tracing and did some more debugging. Looks like the restarted peer 
(which is trying to join the cluster) determines that it is a leader and increments 
its epoch. However, the rest of the nodes don't acknowledge this node as the 
leader, and hence have an older epoch. I will attach the log. Unfortunately, 
I don't have traces from other nodes. I will repeat the experiment later and 
attach logs from other nodes.

Scenario:

* Form a 3-node cluster. This is not just a ZK cluster. It also involves our 
application cluster that uses ZK.
* Kill one of the followers
* After a minute or so, restart the follower
* Follower rejects leader with "Leader epoch y is less than our epoch y + 1"

From logs:

a) Peer X restarts and starts leader election.
b) For a small window of time, X thinks that it is the new leader! During this 
window, for some reason, the rest of the nodes tell X that they are also trying to 
find a leader, i.e., all 3 nodes are in LOOKING state. After seeing that all 3 
nodes are in LOOKING state, X decides to be a leader?

155 2010-06-20 23:22:46,421 - DEBUG [WorkerSender Thread:quorumcnxmana...@346] 
- Opening channel to server 1
156 2010-06-20 23:22:46,423 - DEBUG [WorkerReceiver 
Thread:fastleaderelection$messenger$workerrecei...@214] - Receive new 
notification message. My id = 0
157 2010-06-20 23:22:46,424 - INFO 
[QuorumPeer:/0.0.0.0:2181:fastleaderelect...@689] - Notification: 0, 
77309411393, 1, 0, LOOKING, LOOKING, 0
158 2010-06-20 23:22:46,424 - DEBUG 
[QuorumPeer:/0.0.0.0:2181:fastleaderelect...@495] - id: 0, proposed id: 0, 
zxid: 77309411393, proposed zxid: 77309411393
159 2010-06-20 23:22:46,424 - DEBUG 
[QuorumPeer:/0.0.0.0:2181:fastleaderelect...@717] - Adding vote: From = 0, 
Proposed leader = 0, Porposed zxid = 77309411393, Proposed epoch = 1
160 2010-06-20 23:22:46,426 - INFO [WorkerSender Thread:quorumcnxmana...@162] - 
Have smaller server identifier, so dropping the connection: (1, 0)
161 2010-06-20 23:22:46,426 - DEBUG [WorkerSender Thread:quorumcnxmana...@346] 
- Opening channel to server 2
162 2010-06-20 23:22:46,427 - DEBUG [Thread-1:quorumcnxmanager$liste...@445] - 
Connection request /192.168.1.182:46701
163 2010-06-20 23:22:46,427 - DEBUG [Thread-1:quorumcnxmanager$liste...@448] - 
Connection request: 0
164 2010-06-20 23:22:46,428 - DEBUG [Thread-1:quorumcnxmanager$sendwor...@504] 
- Address of remote peer: 1
165 2010-06-20 23:22:46,428 - INFO [WorkerSender Thread:quorumcnxmana...@162] - 
Have smaller server identifier, so dropping the connection: (2, 0)
166 2010-06-20 23:22:46,431 - DEBUG [WorkerReceiver 
Thread:fastleaderelection$messenger$workerrecei...@214] - Receive new 
notification message. My id = 0
167 2010-06-20 23:22:46,432 - INFO 
[QuorumPeer:/0.0.0.0:2181:fastleaderelect...@689] - Notification: 1, 
77309411372, 1, 0, LOOKING, LOOKING, 1
168 2010-06-20 23:22:46,432 - DEBUG 
[QuorumPeer:/0.0.0.0:2181:fastleaderelect...@495] - id: 1, proposed id: 0, 
zxid: 77309411372, proposed zxid: 77309411393
169 2010-06-20 23:22:46,432 - DEBUG 
[QuorumPeer:/0.0.0.0:2181:fastleaderelect...@717] - Adding vote: From = 1, 
Proposed leader = 1, Porposed zxid = 77309411372, Proposed epoch = 1
170 2010-06-20 23:22:46,436 - DEBUG [Thread-1:quorumcnxmanager$liste...@445] - 
Connection request /192.168.1.183:44310
171 2010-06-20 23:22:46,436 - DEBUG [Thread-1:quorumcnxmanager$liste...@448] - 
Connection request: 0
172 2010-06-20 23:22:46,436 - DEBUG [Thread-1:quorumcnxmanager$sendwor...@504] 
- Address of remote peer: 2
173 2010-06-20 23:22:46,440 - DEBUG [WorkerReceiver 
Thread:fastleaderelection$messenger$workerrecei...@214] - Receive new 
notification message. My id = 0
174 2010-06-20 23:22:46,440 - INFO 
[QuorumPeer:/0.0.0.0:2181:fastleaderelect...@689] - Notification: 2, 
7301097, 1, 0, LOOKING, LOOKING, 2
175 2010-06-20 23:22:46,440 - DEBUG 
[QuorumPeer:/0.0.0.0:2181:fastleaderelect...@495] - id: 2, proposed id: 0, 
zxid: 7301097, proposed zxid: 77309411393
176 2010-06-20 23:22:46,441 - DEBUG 
[QuorumPeer:/0.0.0.0:2181:fastleaderelect...@717] - Adding vote: From = 2, 
Proposed leader = 2, Porposed zxid = 7301097, Proposed epoch = 1
177 2010-06-20 23:22:46,441 - INFO [QuorumPeer:/0.0.0.0:2181:quorump...@647] - 
LEADING
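
The "id / proposed id / zxid / proposed zxid" DEBUG lines above record a 
comparison between an incoming vote and the peer's current proposal. Below is a 
minimal sketch of such an ordering, applied to the three zxids from the log. It 
assumes that the vote with the larger zxid wins and that ties are broken by the 
larger server id. This is an assumption about the election rule, not a copy of 
FastLeaderElection, and the class and method names are hypothetical.

```java
/** Hypothetical illustration only; not the FastLeaderElection implementation. */
final class VoteOrderSketch {

    /** Assumed rule: a larger zxid wins; on equal zxids the larger server id wins. */
    static boolean supersedes(long newId, long newZxid, long curId, long curZxid) {
        return newZxid > curZxid || (newZxid == curZxid && newId > curId);
    }

    /** Returns the server id that survives pairwise comparison of all votes. */
    static long pickLeader(long[] ids, long[] zxids) {
        long bestId = ids[0];
        long bestZxid = zxids[0];
        for (int i = 1; i < ids.length; i++) {
            if (supersedes(ids[i], zxids[i], bestId, bestZxid)) {
                bestId = ids[i];
                bestZxid = zxids[i];
            }
        }
        return bestId;
    }

    public static void main(String[] args) {
        // Server ids and zxids taken from the three Notification lines in the log.
        long[] ids = {0L, 1L, 2L};
        long[] zxids = {77309411393L, 77309411372L, 7301097L};
        System.out.println("proposed leader = " + pickLeader(ids, zxids));
    }
}
```

Under this ordering peer 0 keeps proposing itself, which is consistent with 
"proposed id: 0" in every DEBUG line. The sketch does not explain why the peer 
then moved to LEADING without the other nodes agreeing.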

b) As a result, X increments its epoch. Worse, since this node decided to be the 
leader, it starts doing transactions. The first set of transactions starts 
removing all ephemeral nodes. But these transactions are only done locally. 
Other peers do not ack these transactions since they know that this peer is not 
the leader.

c) After a few seconds (8 secs), X relinquishes leadership since it does not 
receive any ack from the rest of the peers.
d) It starts leader election 

[jira] Updated: (ZOOKEEPER-733) use netty to handle client connections

2010-07-13 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-733:
---

Attachment: ZOOKEEPER-733.patch

This latest patch cleans up the logging a bit more. It also adds unit tests that 
pair an NIO client with a Netty server; these are a subset of the base tests, 
run using the Netty server cnxn factory.

 use netty to handle client connections
 --

 Key: ZOOKEEPER-733
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-733
 Project: Zookeeper
  Issue Type: Improvement
  Components: server
Reporter: Benjamin Reed
Assignee: Patrick Hunt
 Fix For: 3.4.0

 Attachments: accessive.jar, flowctl.zip, moved.zip, 
 QuorumTestFailed_sessionmoved_TRACE_LOG.txt.gz, ZOOKEEPER-733.patch, 
 ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
 ZOOKEEPER-733.patch, ZOOKEEPER-733.patch


 we currently have our own asynchronous NIO socket engine to be able to handle 
 lots of clients with a single thread. over time the engine has become more 
 complicated. we would also like the engine to use multiple threads on 
 machines with lots of cores. plus, we would like to be able to support things 
 like SSL. if we switch to netty, we can simplify our code and get the 
 previously mentioned benefits.
