[jira] Commented: (ZOOKEEPER-542) c-client can spin when server unresponsive

2009-10-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763444#action_12763444
 ] 

Hudson commented on ZOOKEEPER-542:
--

Integrated in ZooKeeper-trunk #491 (See 
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/491/])
. c-client can spin when server unresponsive (Christian Wiedmann via 
mahadev)


 c-client can spin when server unresponsive
 --

 Key: ZOOKEEPER-542
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-542
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.2.0, 3.2.1
Reporter: Christian Wiedmann
Assignee: Christian Wiedmann
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-542.patch, ZOOKEEPER-542.patch


 Due to a mismatch between zookeeper_interest() and zookeeper_process(), when 
 the zookeeper server is unresponsive the client can spin when reconnecting to 
 the server.
 In particular, zookeeper_interest() adds ZOOKEEPER_WRITE whenever there is 
 data to be sent, but flush_send_queue() only writes the data if the state is 
 ZOO_CONNECTED_STATE.  When in ZOO_ASSOCIATING_STATE, this results in spinning.
 This probably doesn't affect production, but I had a runaway process in a 
 development deployment that caused performance issues on the node.  This is 
 easy to reproduce in a single node environment by doing a kill -STOP on the 
 server and waiting for the session timeout.
 Patch to be added.
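
The mismatch described above can be modelled with a small toy event loop. This is only a sketch, not the C client: the class, the `guard_on_state` flag, and the assumption that the socket is always writable are hypothetical, and the guard shown is one plausible way to avoid the spin, not necessarily what the attached patch does.

```python
# Toy model of the mismatch described above; not the real C client.
# State names mirror the issue text; everything else is hypothetical.
ZOO_ASSOCIATING_STATE = "associating"
ZOO_CONNECTED_STATE = "connected"


class ToyClient:
    def __init__(self, state, pending):
        self.state = state
        self.send_queue = list(pending)

    def wants_write(self, guard_on_state):
        """Stand-in for zookeeper_interest(): should we poll for writability?"""
        if not self.send_queue:
            return False
        if guard_on_state:
            # Possible guard: only ask for write events when a flush can happen.
            return self.state == ZOO_CONNECTED_STATE
        return True

    def flush_send_queue(self):
        """Stand-in for flush_send_queue(): writes only when connected."""
        if self.state != ZOO_CONNECTED_STATE:
            return 0
        sent = len(self.send_queue)
        self.send_queue.clear()
        return sent


def count_idle_wakeups(client, guard_on_state, iterations=1000):
    """Count loop iterations that wake up for write but send nothing."""
    idle = 0
    for _ in range(iterations):
        # The socket is assumed to be always writable, so a write interest
        # makes the poll return immediately.
        if client.wants_write(guard_on_state) and client.flush_send_queue() == 0:
            idle += 1
    return idle
```

Without the guard, a client stuck in the associating state with queued data wakes up on every iteration and does no work, which is the spin. With the guard it never wakes for write until it is connected.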

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (ZOOKEEPER-544) improve client testability - allow test client to access connected server location

2009-10-08 Thread Patrick Hunt (JIRA)
improve client testability - allow test client to access connected server 
location
--

 Key: ZOOKEEPER-544
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-544
 Project: Zookeeper
  Issue Type: Improvement
  Components: c client, java client, tests
Reporter: Patrick Hunt
 Fix For: 3.3.0


This came up recently on the user list. If you are developing tests for your ZK
client, you need to be able to find out which server your session is currently
connected to. Your test needs to know which server in the quorum to shut down in
order to verify that you are handling failover correctly. The same applies to
session expiration testing.

However, we should be careful. We prefer not to expose this to all clients; it
is an implementation detail that we typically want to hide.

We should also provide this in both the C and Java clients.

I suspect we should add a protected method on ZooKeeper. This sets a higher bar
for access, since the user has to subclass. In tests that is fine; you typically
want a TestableZooKeeper class anyway. In C we unfortunately have fewer options,
so we can just rely on docs for now.

In both cases (C/Java) we need to be very clear in the docs that this is for
testing only, and we need to clearly define the semantics.

We should add the following at the same time:

A toString() method on ZooKeeper which includes the server ip/port, the client
port, and any other information deemed useful (connection stats like send/recv?).

The Java ZooKeeper is missing the deterministic connection order that the C
client has. This is also useful for testing. Again, it should be protected, with
clear docs that this is for testing purposes only!


Any other things we should expose?




[jira] Created: (ZOOKEEPER-545) investigate use of realtime gc as the recommended default for server vm

2009-10-08 Thread Patrick Hunt (JIRA)
investigate use of realtime gc as the recommended default for server vm
--

 Key: ZOOKEEPER-545
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-545
 Project: Zookeeper
  Issue Type: Improvement
  Components: server
Reporter: Patrick Hunt
Priority: Critical
 Fix For: 3.3.0


We currently don't recommend that people use the realtime gc when running the 
server; we probably should.

Before we do so we need to verify that it works.
We should make it the default for all our tests.
Compare the concurrent collector against G1 ("garbage first", new in 1.6.0_14 or thereabouts?).
Update all scripts to specify this option.
Update the documentation to include this option, and add a section to the dev/ops docs 
detailing its benefits (in particular the latency effects of gc).

Also, the -server option? Any benefit for us in recommending this as well?




[jira] Created: (ZOOKEEPER-546) add diskless ensemble support

2009-10-08 Thread Patrick Hunt (JIRA)
add diskless ensemble support
---

 Key: ZOOKEEPER-546
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-546
 Project: Zookeeper
  Issue Type: New Feature
  Components: server
Reporter: Patrick Hunt
 Fix For: 3.3.0


In some cases there is no need to have the ZK data persisted to disk. For 
example, if all you are doing is group membership and leader election, the 
data is totally ephemeral and storing it on disk is unnecessary. We've also 
seen cases where any non-ephemeral data can be easily recovered (say, 
configuration data that's generated/read and loaded into ZK), so there is 
less need to worry about recovering the data after a catastrophic failure. 
(Catastrophic meaning _all_ replicas are lost. Remember, with 2n+1 servers 
recovery is automatic if <= n servers fail; even if > n fail, manual recovery 
is still possible as long as at least 1 replica, or a replica backup, can be 
recovered.)

In these cases it makes sense to have a diskless ZooKeeper ensemble. The 
result should be improved write performance and fewer moving parts (no disk 
to fail!), simplifying ops in cases where this can be applied.
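
The 2n+1 arithmetic in the parenthetical above can be checked with a few lines. This is just a sketch of majority-quorum counting; the function names are made up for illustration and are not part of ZooKeeper.

```python
def max_tolerated_failures(ensemble_size):
    """Largest number of failed servers that still leaves a majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be positive")
    return (ensemble_size - 1) // 2


def has_quorum(ensemble_size, failed):
    """True when the surviving servers form a strict majority."""
    alive = ensemble_size - failed
    return alive > ensemble_size // 2
```

For an ensemble of 5 (n = 2), two failures still leave a quorum and three do not. An even-sized ensemble of 4 tolerates only one failure, the same as an ensemble of 3.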





[jira] Commented: (ZOOKEEPER-537) The zookeeper jar includes the java source files

2009-10-08 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763561#action_12763561
 ] 

Patrick Hunt commented on ZOOKEEPER-537:


Here's an idea; Thomas, tell me if this works for you (Maven):

1) do everything the same as today (legacy, not Maven-optimal), except
2) add a new maven directory to the release that contains the 3 jars Thomas 
suggested (signed bin/src/doc jars, pom)

We can then deploy the contents of the maven directory to the Maven repo as 
part of the release process.

Thomas, we use Ivy to generate these. Any idea how to do what you detailed 
(the xml) for Maven, but with Ivy? How about a pointer to a project that you 
think does things right, so I can look at it and make sure we have similar 
output?


 The zookeeper jar includes the java source files
 

 Key: ZOOKEEPER-537
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-537
 Project: Zookeeper
  Issue Type: Bug
Affects Versions: 3.2.1
Reporter: Thomas Dudziak
 Fix For: 3.3.0


 This is a problem if you use zookeeper as a dependency in maven because for 
 whatever reason the maven compiler plugin will pick up the java files in the 
 jar and compile them to the output directory. From there they will land in 
 the generated jar file for whatever project happens to depend on zookeeper 
 thus introducing duplicate classes (once in zookeeper.jar, once in the 
 project's artifact).




[jira] Commented: (ZOOKEEPER-59) Synchronized block in NIOServerCnxn

2009-10-08 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763567#action_12763567
 ] 

Patrick Hunt commented on ZOOKEEPER-59:
---

Hi Flavio, what's the status of this patch?

 Synchronized block in NIOServerCnxn
 ---

 Key: ZOOKEEPER-59
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-59
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-59.patch


 There are two synchronized blocks locking on different objects, and to me 
 they should be guarded by the same object. Here are the parts of the code I'm 
 talking about:
 {noformat}
 nioservercnxn.readrequ...@444
 ...
     synchronized (this) {
         outstandingRequests++;
         // check throttling
         if (zk.getInProcess() > factory.outstandingLimit) {
             disableRecv();
             // following lines should not be needed since we are already
             // reading
             // } else {
             // enableRecv();
         }
     }
 {noformat}
 {noformat}
 nioservercnxn.sendrespo...@740
 ...
     synchronized (this.factory) {
         outstandingRequests--;
         // check throttling
         if (zk.getInProcess() < factory.outstandingLimit
                 || outstandingRequests < 1) {
             sk.selector().wakeup();
             enableRecv();
         }
     }
 {noformat}
 I think the second one is correct, and the first synchronized block should be 
 guarded by this.factory. 
 This could be related to issue ZOOKEEPER-57, but I have no concrete 
 indication that this is the case so far.
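
Why two different monitors give no mutual exclusion can be shown with a deterministic sketch. It is not the server code: `can_overlap` is a hypothetical helper, and a non-reentrant `threading.Lock` probed without blocking stands in for a second thread trying to enter the other synchronized block.

```python
import threading


def can_overlap(lock_for_read_path, lock_for_send_path):
    """Return True if the send path can enter while the read path holds its lock."""
    with lock_for_read_path:
        # Non-blocking attempt, so the check is deterministic and cannot hang.
        # threading.Lock is not reentrant, so this behaves like another thread.
        entered = lock_for_send_path.acquire(blocking=False)
        if entered:
            lock_for_send_path.release()
        return entered
```

With one lock per path (the `this` versus `this.factory` situation), `can_overlap` returns True, so the increment and the decrement of the shared counter can interleave. Passing the same lock for both paths returns False, which is the behaviour the issue asks for.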




[jira] Commented: (ZOOKEEPER-462) Last hint for open ledger

2009-10-08 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763572#action_12763572
 ] 

Patrick Hunt commented on ZOOKEEPER-462:


What's the status on this?

 Last hint for open ledger
 -

 Key: ZOOKEEPER-462
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-462
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib-bookkeeper
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-462.patch


 In some use cases of BookKeeper, it is useful to be able to read from a 
 ledger before closing the ledger. To enable such a feature, the writer has to 
 be able to communicate to a reader how many entries it has been able to write 
 successfully. The main idea of this jira is to continuously update a znode 
 with the number of successful writes, and a reader can, for example, watch 
 the node for changes.
  I was thinking of having a configuration parameter to state how often a 
 writer should update the hint on ZooKeeper (e.g., every 1000 requests, every 
 10,000 requests). Clearly updating more often increases the overhead of 
 writing to ZooKeeper, although the impact on the performance of writes to 
 BookKeeper should be minimal given that we make an asynchronous call to 
 update the hint.
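
The "update the hint every N requests" idea above could look roughly like this. It is a sketch only: `HintingWriter` and the `publish_hint` callback are hypothetical names, and the callback stands in for whatever asynchronous znode update the real writer would issue.

```python
class HintingWriter:
    """Toy ledger writer that publishes a progress hint every `interval` entries."""

    def __init__(self, interval, publish_hint):
        if interval < 1:
            raise ValueError("interval must be at least 1")
        self.interval = interval
        # Hypothetical callback, e.g. an asynchronous znode update.
        self.publish_hint = publish_hint
        self.entries_written = 0

    def add_entry(self, data):
        # A real writer would send `data` to the bookies here.
        self.entries_written += 1
        if self.entries_written % self.interval == 0:
            self.publish_hint(self.entries_written)
```

With an interval of 1000, writing 2500 entries publishes hints at 1000 and 2000. The last 500 entries stay invisible to readers until the next hint, which is the trade-off between staleness and write load on ZooKeeper that the interval parameter controls.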




[jira] Commented: (ZOOKEEPER-462) Last hint for open ledger

2009-10-08 Thread Utkarsh Srivastava (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763595#action_12763595
 ] 

Utkarsh Srivastava commented on ZOOKEEPER-462:
--

Ben and I discussed this and we don't think this is the best design. Under the 
current design, a lot of unnecessary write load would be put on ZK. 

Instead, the bookies already support a method to query for the last entry of a 
particular ledger. Thus, a client that wants to read an unclosed ledger can ask 
the bookies for their last entries and read up to that point. 

 Last hint for open ledger
 -

 Key: ZOOKEEPER-462
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-462
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib-bookkeeper
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-462.patch





[jira] Updated: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.

2009-10-08 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-547:


  Component/s: server
   leaderElection
Affects Version/s: 3.2.0
   3.2.1

 Sanity check in QuorumCnxn Manager and quorum communication port.
 -

 Key: ZOOKEEPER-547
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-547
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection, server
Affects Versions: 3.2.0, 3.2.1
Reporter: Mahadev konar
Assignee: Mahadev konar
 Fix For: 3.3.0


 We need to put some sanity checks in QuorumCnxnManager and the other quorum 
 port to guard against rogue clients. Sometimes a client is misconfigured and 
 sends random characters to these ports. We need to make sure that such rogue 
 clients do not bring down the servers, which means adding sanity checks on 
 packet lengths and deserialization.
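
A length sanity check of the kind described above might look like the following sketch. The framing (a 4-byte big-endian length followed by the payload), the cap value, and all names are assumptions for illustration; the issue does not say what the quorum ports actually expect.

```python
import io
import struct

MAX_PACKET_LEN = 512 * 1024  # hypothetical cap; the issue does not specify one


class BadPacket(Exception):
    """Raised when a peer sends a frame that fails the sanity checks."""


def read_frame(stream, max_len=MAX_PACKET_LEN):
    """Read one frame: a 4-byte big-endian signed length, then the payload."""
    header = stream.read(4)
    if len(header) != 4:
        raise BadPacket("truncated length header")
    (length,) = struct.unpack(">i", header)
    if length < 0 or length > max_len:
        # Reject before allocating, so garbage bytes cannot request huge buffers.
        raise BadPacket("implausible length %d" % length)
    payload = stream.read(length)
    if len(payload) != length:
        raise BadPacket("truncated payload")
    return payload
```

A stray client that sends text such as `GET ` to the port has those four bytes interpreted as a length of over a billion, so the check rejects it instead of trying to allocate a buffer of that size. A real socket reader would also need to loop on short reads; the sketch relies on an in-memory stream.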




[jira] Created: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.

2009-10-08 Thread Mahadev konar (JIRA)
Sanity check in QuorumCnxn Manager and quorum communication port.
-

 Key: ZOOKEEPER-547
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-547
 Project: Zookeeper
  Issue Type: Bug
Reporter: Mahadev konar
Assignee: Mahadev konar


We need to put some sanity checks in QuorumCnxnManager and the other quorum 
port to guard against rogue clients. Sometimes a client is misconfigured and 
sends random characters to these ports. We need to make sure that such rogue 
clients do not bring down the servers, which means adding sanity checks on 
packet lengths and deserialization.




[jira] Updated: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.

2009-10-08 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-547:


Fix Version/s: 3.3.0

 Sanity check in QuorumCnxn Manager and quorum communication port.
 -

 Key: ZOOKEEPER-547
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-547
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection, server
Affects Versions: 3.2.0, 3.2.1
Reporter: Mahadev konar
Assignee: Mahadev konar
 Fix For: 3.3.0





[jira] Commented: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types

2009-10-08 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763743#action_12763743
 ] 

Mahadev konar commented on ZOOKEEPER-510:
-

The patch looks good ...
Henry, do we have a test case for ZOOKEEPER-540?



 zkpython lumps all exceptions as IOError, needs specialized exceptions for 
 KeeperException types
 

 Key: ZOOKEEPER-510
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Assignee: Henry Robinson
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, 
 ZOOKEEPER-510.patch, ZOOKEEPER-510.patch


 The current zkpython bindings always throw IOError(text) exceptions, even 
 for ZK-specific errors such as NODEEXISTS. This makes exception handling in 
 Python code difficult and error prone: you can't easily distinguish a 
 connection loss from a node-exists error, for example. You could match on 
 the error string, but that seems like a bad idea.
 We need to add specific exception types to the Python binding that map 
 directly to the Java KeeperException types. It would also be useful to 
 include the information carried by the KeeperException (such as the path, 
 in some cases) in the error raised to the Python code. It is probably a 
 good idea to stay as close to the Java API as possible when mapping the errors.
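
The kind of mapping asked for above can be sketched in pure Python. The class names and the `error_for` helper are hypothetical and not taken from the attached patch; the two numeric codes are the C client's ZCONNECTIONLOSS (-4) and ZNODEEXISTS (-110).

```python
class ZooKeeperException(Exception):
    """Base class, so callers can still catch every ZooKeeper error at once."""


class NodeExistsException(ZooKeeperException):
    pass


class ConnectionLossException(ZooKeeperException):
    pass


# Codes follow the C client's error enum (ZCONNECTIONLOSS, ZNODEEXISTS).
_CODE_TO_EXCEPTION = {
    -4: ConnectionLossException,
    -110: NodeExistsException,
}


def error_for(code, message, path=None):
    """Build the most specific exception for a return code (hypothetical helper)."""
    exc = _CODE_TO_EXCEPTION.get(code, ZooKeeperException)(message)
    exc.code = code
    exc.path = path
    return exc
```

Callers can then write `except NodeExistsException:` for the case they care about and fall back to `except ZooKeeperException:` for everything else, with no string matching. Unknown codes degrade to the base class rather than failing.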




[jira] Commented: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types

2009-10-08 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763749#action_12763749
 ] 

Henry Robinson commented on ZOOKEEPER-510:
--

Yes - see testconnection in connection_test.py

Two cases: trying to close an already closed handle, and trying to issue 
commands on an already closed handle. Both should raise exceptions. 

 zkpython lumps all exceptions as IOError, needs specialized exceptions for 
 KeeperException types
 

 Key: ZOOKEEPER-510
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Assignee: Henry Robinson
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, 
 ZOOKEEPER-510.patch, ZOOKEEPER-510.patch





[jira] Commented: (ZOOKEEPER-368) Observers

2009-10-08 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763753#action_12763753
 ] 

Mahadev konar commented on ZOOKEEPER-368:
-

Great... let's work to get this in soon, before the codebase changes again... 

 Observers
 -

 Key: ZOOKEEPER-368
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
 Project: Zookeeper
  Issue Type: New Feature
  Components: quorum
Reporter: Flavio Paiva Junqueira
Assignee: Henry Robinson
 Attachments: obs-refactor.patch, observer-refactor.patch, 
 observers.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch


 Currently, all servers of an ensemble participate actively in reaching 
 agreement on the order of ZooKeeper transactions. That is, all followers 
 receive proposals, acknowledge them, and receive commit messages from the 
 leader. A leader issues commit messages once it receives acknowledgments from 
 a quorum of followers. For cross-colo operation, it would be useful to have a 
 third role: observer. In Paxos terminology, observers are similar to 
 learners. An observer does not participate actively in the agreement step of 
 the atomic broadcast protocol. Instead, it only commits proposals that have 
 been accepted by some quorum of followers.
 One simple way to implement observers is to have the leader forward 
 commit messages not only to followers but also to observers, and have 
 observers apply transactions in the order the followers agreed upon. 
 In the current implementation of the protocol, however, commit messages do 
 not carry their corresponding transaction payload, because all servers 
 other than the leader are followers, and followers receive the payload 
 first through a proposal message. Just forwarding commit messages as they 
 currently are to an observer is consequently not sufficient. We have a couple 
 of options:
 1- Include the transaction payload in commit messages to observers;
 2- Send proposals to observers as well.
 Number 2 is simpler to implement because it doesn't require changing the 
 protocol implementation, but it increases traffic slightly. The performance 
 impact of such an increase might be insignificant, though.
 For scalability purposes, we may consider having followers also forward 
 commit messages to observers. With this option, observers can connect to 
 followers and receive messages from them. This choice is important to 
 avoid increasing the load on the leader as the number of observers grows. 
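
Option 1 from the list above (commit messages to observers carry the payload) can be illustrated with a toy model. Nothing here is the real quorum code: the classes and method names are invented, and the wait for a quorum of acknowledgments is reduced to a comment.

```python
class Follower:
    """Receives the payload in a proposal, then applies it on commit."""

    def __init__(self):
        self.pending = {}
        self.applied = []

    def on_proposal(self, zxid, payload):
        self.pending[zxid] = payload

    def on_commit(self, zxid):
        self.applied.append((zxid, self.pending.pop(zxid)))


class Observer:
    """Never sees proposals, so the commit has to carry the payload."""

    def __init__(self):
        self.applied = []

    def on_commit_with_payload(self, zxid, payload):
        self.applied.append((zxid, payload))


def broadcast(zxid, payload, followers, observers):
    """Leader side; acks from a quorum are assumed to have arrived."""
    for f in followers:
        f.on_proposal(zxid, payload)
    # ... a real leader waits here for acks from a quorum of followers ...
    for f in followers:
        f.on_commit(zxid)
    for o in observers:
        o.on_commit_with_payload(zxid, payload)
```

Followers and observers end up applying the same transactions in the same order, but observers never vote. Under option 2 the observer would instead look like a follower whose acknowledgments the leader ignores.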




[jira] Commented: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types

2009-10-08 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763754#action_12763754
 ] 

Mahadev konar commented on ZOOKEEPER-510:
-

great... missed it... my bad... 

 zkpython lumps all exceptions as IOError, needs specialized exceptions for 
 KeeperException types
 

 Key: ZOOKEEPER-510
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Assignee: Henry Robinson
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, 
 ZOOKEEPER-510.patch, ZOOKEEPER-510.patch





[jira] Updated: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types

2009-10-08 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-510:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Henry!

 zkpython lumps all exceptions as IOError, needs specialized exceptions for 
 KeeperException types
 

 Key: ZOOKEEPER-510
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Assignee: Henry Robinson
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, 
 ZOOKEEPER-510.patch, ZOOKEEPER-510.patch





[jira] Resolved: (ZOOKEEPER-548) zookeeper.ZooKeeperException not added to the module in zkpython

2009-10-08 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt resolved ZOOKEEPER-548.


Resolution: Invalid
  Assignee: Patrick Hunt

Looks like this was addressed in ZOOKEEPER-510 (just went in earlier today)

 zookeeper.ZooKeeperException not added to the module in zkpython
 

 Key: ZOOKEEPER-548
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-548
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Critical
 Fix For: 3.3.0


 ZooKeeperException is not being added to the zookeeper module in zookeeper.c 
 (zkpython). The other exceptions
 are added but not ZooKeeperException. Sorry, I missed this in my previous 
 change, I got all the subclasses but not zkex itself.




[jira] Reopened: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types

2009-10-08 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt reopened ZOOKEEPER-510:



Unless I'm missing something, there's a problem with this patch. The following 
is incorrect according to the Python manual:

+#define ADD_EXCEPTION(x) x = PyErr_NewException("zookeeper."#x, ZooKeeperException, NULL); \
+  PyModule_AddObject(module, #x, x);

Specifically, the ref is not being incremented: PyModule_AddObject steals a 
reference, so the pointer kept in x needs one of its own.

See this example in the following Python man page:
http://docs.python.org/extending/extending.html#intermezzo-errors-and-exceptions

SpamError = PyErr_NewException("spam.error", NULL, NULL);
Py_INCREF(SpamError);
PyModule_AddObject(m, "error", SpamError);



 zkpython lumps all exceptions as IOError, needs specialized exceptions for 
 KeeperException types
 

 Key: ZOOKEEPER-510
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Assignee: Henry Robinson
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, 
 ZOOKEEPER-510.patch, ZOOKEEPER-510.patch





[jira] Commented: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types

2009-10-08 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763835#action_12763835
 ] 

Patrick Hunt commented on ZOOKEEPER-510:


I suggest creating an additional patch on this jira to address it. Is this 
testable? The script could clear the exception from the module (so it would be 
gc'd) and then cause the code to raise the exception, which should cause a seg fault?

 zkpython lumps all exceptions as IOError, needs specialized exceptions for 
 KeeperException types
 

 Key: ZOOKEEPER-510
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Assignee: Henry Robinson
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, 
 ZOOKEEPER-510.patch, ZOOKEEPER-510.patch





[jira] Commented: (ZOOKEEPER-541) zkpython limited to 256 handles

2009-10-08 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763838#action_12763838
 ] 

Patrick Hunt commented on ZOOKEEPER-541:


Henry, you want to mark this "Patch Available", not "In Progress". I seem to 
be locked out of doing that; otherwise I would fix it.

 zkpython limited to 256 handles
 ---

 Key: ZOOKEEPER-541
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-541
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.2.1
Reporter: Patrick Hunt
Assignee: Henry Robinson
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-541.patch


 zkpython is currently limited to a max of 256 total handles - not 256 open 
 handles, but rather 256 total handles created
 over the lifetime of the python application.
 In general this isn't a real issue, however in the case of a long lived 
 application which polls the cluster periodically (closing
 the session between calls) this is an issue.
 It would be great if the slots could be reused, or perhaps replaced by a more 
 complex structure, such as a linked list, which would allow
 dynamic growth/shrinkage of the handle list.
 Also see ZOOKEEPER-540
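
A rough sketch of the slot reuse idea, written in python for illustration 
only. The HandleTable class is hypothetical; the real binding keeps its 
handles in C, and the attached patch may take a different approach.

```python
# Hypothetical sketch of a handle table that reuses released slots.

class HandleTable:
    def __init__(self):
        self._slots = []   # index -> handle, or None when the slot is free
        self._free = []    # indices of released slots available for reuse

    def add(self, handle):
        """Store a handle and return its index, reusing a free slot if any."""
        if self._free:
            idx = self._free.pop()
            self._slots[idx] = handle
        else:
            idx = len(self._slots)
            self._slots.append(handle)
        return idx

    def get(self, idx):
        """Return the handle stored at idx, or raise KeyError if released."""
        handle = self._slots[idx]
        if handle is None:
            raise KeyError(idx)
        return handle

    def release(self, idx):
        """Free the slot at idx so that a later add() can reuse it."""
        if self._slots[idx] is None:
            raise KeyError(idx)
        self._slots[idx] = None
        self._free.append(idx)

    def capacity(self):
        """Number of slots ever allocated (open plus free)."""
        return len(self._slots)
```

An application that opens and closes one session per poll would then stay at 
a capacity of 1 no matter how many polls it performs.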




[jira] Commented: (ZOOKEEPER-512) FLE election fails to elect leader

2009-10-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763850#action_12763850
 ] 

Hadoop QA commented on ZOOKEEPER-512:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12417726/ZOOKEEPER-512.patch
  against trunk revision 823371.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/20/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/20/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/20/console

This message is automatically generated.

 FLE election fails to elect leader
 --

 Key: ZOOKEEPER-512
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-512
 Project: Zookeeper
  Issue Type: Bug
  Components: quorum, server
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Assignee: Flavio Paiva Junqueira
Priority: Blocker
 Fix For: 3.3.0

 Attachments: jst.txt, log3_debug.tar.gz, logs.tar.gz, logs2.tar.gz, 
 t5_aj.tar.gz, ZOOKEEPER-512.patch, ZOOKEEPER-512.patch, ZOOKEEPER-512.patch, 
 ZOOKEEPER-512.patch


 I was doing some fault injection testing of 3.2.1 with ZOOKEEPER-508 patch 
 applied and noticed that after some time the ensemble failed to re-elect a 
 leader.
 See the attached log files (5 member ensemble; typically 5 is the leader).
 Notice that after 16:23:50,525 no quorum is formed, even after 20 minutes 
 elapse w/no quorum.
 Environment:
 I was doing fault injection testing using aspectj. The faults are injected 
 into socketchannel read/write; I throw exceptions randomly at a 1/200 ratio 
 (rand.nextFloat() <= .005 => throw IOException).
 You can see when a fault is injected in the log via:
 2009-08-19 16:57:09,568 - INFO  [Thread-74:readrequestfailsintermitten...@38] 
 - READPACKET FORCED FAIL
 vs a read/write that didn't force fail:
 2009-08-19 16:57:09,568 - INFO  [Thread-74:readrequestfailsintermitten...@41] 
 - READPACKET OK
 Otherwise this is standard code/config (straight FLE quorum with 5 members).
 Also see the attached jstack trace, which is for one of the servers. Notice in 
 particular that the number of sendworkers != the number of recv workers.
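
For readers unfamiliar with this style of testing, the random failure rule 
described above can be sketched as follows. This is a python illustration 
only: the actual test used aspectj advice around socketchannel read/write, 
and the FaultInjector class and its method names are hypothetical.

```python
import random

# Hypothetical sketch of injecting an I/O failure at a fixed ratio.

class FaultInjector:
    def __init__(self, rate=0.005, rng=None):
        self.rate = rate                    # 0.005 corresponds to 1/200
        self.rng = rng or random.Random()   # injectable for repeatable runs

    def maybe_fail(self, label):
        """Raise IOError with probability `rate`, otherwise report success."""
        if self.rng.random() <= self.rate:
            raise IOError("%s FORCED FAIL" % label)
        return "%s OK" % label


def count_failures(injector, attempts, label="READPACKET"):
    """Run `attempts` simulated operations and count the forced failures."""
    failures = 0
    for _ in range(attempts):
        try:
            injector.maybe_fail(label)
        except IOError:
            failures += 1
    return failures
```

Passing a seeded random.Random makes a run repeatable, which can help when 
trying to reproduce a particular failure sequence.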
