[jira] Commented: (ZOOKEEPER-542) c-client can spin when server unresponsive
[ https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763444#action_12763444 ] Hudson commented on ZOOKEEPER-542: -- Integrated in ZooKeeper-trunk #491 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/491/]) . c-client can spin when server unresponsive (Christian Wiedmann via mahadev) c-client can spin when server unresponsive -- Key: ZOOKEEPER-542 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-542 Project: Zookeeper Issue Type: Bug Components: c client Affects Versions: 3.2.0, 3.2.1 Reporter: Christian Wiedmann Assignee: Christian Wiedmann Fix For: 3.3.0 Attachments: ZOOKEEPER-542.patch, ZOOKEEPER-542.patch Due to a mismatch between zookeeper_interest() and zookeeper_process(), when the zookeeper server is unresponsive the client can spin when reconnecting to the server. In particular, zookeeper_interest() adds ZOOKEEPER_WRITE whenever there is data to be sent, but flush_send_queue() only writes the data if the state is ZOO_CONNECTED_STATE. When in ZOO_ASSOCIATING_STATE, this results in spinning. This probably doesn't affect production, but I had a runaway process in a development deployment that caused performance issues on the node. This is easy to reproduce in a single node environment by doing a kill -STOP on the server and waiting for the session timeout. Patch to be added. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
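To make the mismatch described in ZOOKEEPER-542 concrete, here is a toy model of the event loop. It is not the C client's code. The state names and the two function names mirror the ones in the report; the Client class, the pass counter, and the gate_on_state flag (one conceivable way to stop the spin) are assumptions made for illustration, and the attached patch may fix the problem differently.

```python
from dataclasses import dataclass, field

ZOO_ASSOCIATING_STATE = "associating"
ZOO_CONNECTED_STATE = "connected"


@dataclass
class Client:
    state: str
    send_queue: list = field(default_factory=list)


def interest(client, gate_on_state=False):
    """Return True if the loop should wait for the socket to become writable."""
    # gate_on_state is a hypothetical fix: only ask for write readiness
    # when flush_send_queue() would actually be allowed to write.
    if gate_on_state and client.state != ZOO_CONNECTED_STATE:
        return False
    return bool(client.send_queue)


def flush_send_queue(client):
    """Write queued data, but only once the session is connected."""
    if client.state != ZOO_CONNECTED_STATE:
        return 0
    sent = len(client.send_queue)
    client.send_queue.clear()
    return sent


def wasted_wakeups(client, passes, gate_on_state=False):
    """Count passes where the loop woke for a write but sent nothing."""
    wasted = 0
    for _ in range(passes):
        # An idle socket is almost always writable, so a write interest
        # wakes the loop immediately instead of letting it block.
        if interest(client, gate_on_state) and flush_send_queue(client) == 0:
            wasted += 1
    return wasted
```

With a queued request and the state stuck at ZOO_ASSOCIATING_STATE, every pass in this model is a wasted wakeup, which corresponds to the spinning the reporter saw; with the gate enabled, none are.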
[jira] Created: (ZOOKEEPER-544) improve client testability - allow test client to access connected server location
improve client testability - allow test client to access connected server location -- Key: ZOOKEEPER-544 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-544 Project: Zookeeper Issue Type: Improvement Components: c client, java client, tests Reporter: Patrick Hunt Fix For: 3.3.0 This came up recently on the user list. If you are developing tests for your zk client, you need to be able to access the server that your session is currently connected to. The reason is that your test needs to know which server in the quorum to shut down in order to verify that you are handling failover correctly. The same applies to session expiration testing. However, we should be careful: we prefer not to expose this to all clients, since it is an implementation detail that we typically want to hide. We should also provide this in both the c and java clients. I suspect we should add a protected method on ZooKeeper. This sets a higher bar for the user to access this method (the user will have to subclass). In tests that's fine; typically you want a TestableZooKeeper class anyway. In c we unfortunately have fewer options; we can just rely on docs for now. In both cases (c/java) we need to be very clear in the docs that this is for testing only, and to clearly define the semantics. We should add the following at the same time: a toString() method on ZooKeeper which includes server ip/port, client port, and any other information deemed useful (connection stats like send/recv?). The java ZooKeeper is also missing the deterministic connection order that the c client has; this is also useful for testing. Again, protected, with clear docs that this is for testing purposes only! Any other things we should expose? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-545) investigate use of realtime gc as the recommened default for server vm
investigate use of realtime gc as the recommened default for server vm -- Key: ZOOKEEPER-545 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-545 Project: Zookeeper Issue Type: Improvement Components: server Reporter: Patrick Hunt Priority: Critical Fix For: 3.3.0 We currently don't recommend that people use the realtime gc when running the server; we probably should. Before we do so we need to verify that it works. We should make it the default for all our tests. concurrent vs g2 or whatever it's called (new in 1.6_15 or something?) Update all scripts to specify this option. Update the documentation to include this option and add a section in the dev/ops docs detailing its benefits (in particular the latency effects of gc). Also, -server option? Any benefit for us to recommend this as well? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-546) add diskless ensemble support
add diskless ensemble support --- Key: ZOOKEEPER-546 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-546 Project: Zookeeper Issue Type: New Feature Components: server Reporter: Patrick Hunt Fix For: 3.3.0 In some cases there is no need to have the ZK data persisted to disk. For example, if all you are doing is group membership and leadership election, the data is totally ephemeral and storing it on disk is unnecessary. We've also seen cases where any non-ephemeral data can be easily recovered (say configuration data that's generated/read and loaded into zk) and there is less need to worry about recovery of the data in the case of catastrophic failure (meaning _all_ replicas are lost; remember, recovery is automatic if there are 2n+1 servers and <= n servers fail, and even if > n fail manual recovery is still possible as long as at least 1 replica, or replica backup, can be recovered). In these cases it makes sense to have a diskless zookeeper ensemble. The result should be improved write performance and fewer moving parts (no disk to fail!), simplifying ops in cases where this can be applied. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
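The 2n+1 remark in the description above is plain majority arithmetic. A minimal sketch, assuming a strict-majority quorum; the function names are made up for this note and are not part of any ZooKeeper API.

```python
def tolerated_failures(ensemble_size):
    """Largest number of failed servers that still leaves a majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble must have at least one server")
    return (ensemble_size - 1) // 2


def has_quorum(ensemble_size, failed):
    """True if the surviving servers form a strict majority."""
    alive = ensemble_size - failed
    return alive > ensemble_size // 2
```

For a 5-server ensemble (n = 2) this gives two tolerated failures; a third failure loses quorum, which is the point where the manual recovery mentioned in the description would come into play.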
[jira] Commented: (ZOOKEEPER-537) The zookeeper jar includes the java source files
[ https://issues.apache.org/jira/browse/ZOOKEEPER-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763561#action_12763561 ] Patrick Hunt commented on ZOOKEEPER-537: Here's an idea, Thomas tell me if this works for you (maven): 1) do everything the same as today (legacy, not maven optimal), except 2) add a new maven directory to the release that contains the 3 jars thomas suggested (signed bin/src/doc jars, pom) We can then deploy the contents of the maven directory to the maven repo as part of the release process. Thomas, we use Ivy to generate, any idea how to do what you detailed (the xml) for maven but with Ivy? How about a pointer to a project that you think does things right that I can look at and make sure we have similar output. The zookeeper jar includes the java source files Key: ZOOKEEPER-537 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-537 Project: Zookeeper Issue Type: Bug Affects Versions: 3.2.1 Reporter: Thomas Dudziak Fix For: 3.3.0 This is a problem if you use zookeeper as a dependency in maven because for whatever reason the maven compiler plugin will pick up the java files in the jar and compile them to the output directory. From there they will land in the generated jar file for whatever project happens to depend on zookeeper thus introducing duplicate classes (once in zookeeper.jar, once in the project's artifact). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-59) Synchronized block in NIOServerCnxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763567#action_12763567 ] Patrick Hunt commented on ZOOKEEPER-59: --- Hi Flavio, what's the status of this patch? Synchronized block in NIOServerCnxn --- Key: ZOOKEEPER-59 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-59 Project: Zookeeper Issue Type: Bug Components: server Reporter: Flavio Paiva Junqueira Assignee: Flavio Paiva Junqueira Fix For: 3.3.0 Attachments: ZOOKEEPER-59.patch There are two synchronized blocks locking on different objects, and to me they should be guarded by the same object. Here are the parts of the code I'm talking about:
{noformat}
nioservercnxn.readrequ...@444
...
synchronized (this) {
    outstandingRequests++;
    // check throttling
    if (zk.getInProcess() > factory.outstandingLimit) {
        disableRecv();
        // following lines should not be needed since we are already
        // reading
        // } else {
        // enableRecv();
    }
}
{noformat}
{noformat}
nioservercnxn.sendrespo...@740
...
synchronized (this.factory) {
    outstandingRequests--;
    // check throttling
    if (zk.getInProcess() < factory.outstandingLimit
            || outstandingRequests < 1) {
        sk.selector().wakeup();
        enableRecv();
    }
}
{noformat}
I think the second one is correct, and the first synchronized block should be guarded by this.factory. This could be related to issue ZOOKEEPER-57, but I have no concrete indication that this is the case so far. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
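The point of ZOOKEEPER-59 is that both code paths touch the same counter and the same throttling decision, so they need one common lock. A small Python sketch of that design follows. The class and attribute names are invented for illustration, and the comparison operators are an assumption about what the quoted Java intends; this is not the actual patch.

```python
import threading


class Factory:
    """Stands in for the shared connection factory; it owns the one lock."""

    def __init__(self, outstanding_limit):
        self.outstanding_limit = outstanding_limit
        self.lock = threading.Lock()


class Connection:
    def __init__(self, factory):
        self.factory = factory
        self.outstanding_requests = 0
        self.recv_enabled = True

    def read_request(self, in_process):
        # Same lock as send_response, so the counter update and the
        # throttling decision never interleave with the other path.
        with self.factory.lock:
            self.outstanding_requests += 1
            if in_process > self.factory.outstanding_limit:
                self.recv_enabled = False

    def send_response(self, in_process):
        with self.factory.lock:
            self.outstanding_requests -= 1
            if (in_process < self.factory.outstanding_limit
                    or self.outstanding_requests < 1):
                self.recv_enabled = True
```

If read_request locked on the connection while send_response locked on the factory, as in the quoted code, the two paths would not exclude each other, which is the concern raised in the issue.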
[jira] Commented: (ZOOKEEPER-462) Last hint for open ledger
[ https://issues.apache.org/jira/browse/ZOOKEEPER-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763572#action_12763572 ] Patrick Hunt commented on ZOOKEEPER-462: What's the status on this? Last hint for open ledger - Key: ZOOKEEPER-462 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-462 Project: Zookeeper Issue Type: New Feature Components: contrib-bookkeeper Reporter: Flavio Paiva Junqueira Assignee: Flavio Paiva Junqueira Fix For: 3.3.0 Attachments: ZOOKEEPER-462.patch In some use cases of BookKeeper, it is useful to be able to read from a ledger before closing the ledger. To enable such a feature, the writer has to be able to communicate to a reader how many entries it has been able to write successfully. The main idea of this jira is to continuously update a znode with the number of successful writes, and a reader can, for example, watch the node for changes. I was thinking of having a configuration parameter to state how often a writer should update the hint on ZooKeeper (e.g., every 1000 requests, every 10,000 requests). Clearly updating more often increases the overhead of writing to ZooKeeper, although the impact on the performance of writes to BookKeeper should be minimal given that we make an asynchronous call to update the hint. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-462) Last hint for open ledger
[ https://issues.apache.org/jira/browse/ZOOKEEPER-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763595#action_12763595 ] Utkarsh Srivastava commented on ZOOKEEPER-462: -- Ben and I discussed this and we don't think this is the best design. Under the current design, a lot of unnecessary write load will be put on ZK. Instead, the bookies already support a method to query for the last entry of a particular ledger. Thus, a client that wants to read an unclosed ledger can ask the bookies for their last entries and read up to that point. Last hint for open ledger - Key: ZOOKEEPER-462 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-462 Project: Zookeeper Issue Type: New Feature Components: contrib-bookkeeper Reporter: Flavio Paiva Junqueira Assignee: Flavio Paiva Junqueira Fix For: 3.3.0 Attachments: ZOOKEEPER-462.patch In some use cases of BookKeeper, it is useful to be able to read from a ledger before closing the ledger. To enable such a feature, the writer has to be able to communicate to a reader how many entries it has been able to write successfully. The main idea of this jira is to continuously update a znode with the number of successful writes, and a reader can, for example, watch the node for changes. I was thinking of having a configuration parameter to state how often a writer should update the hint on ZooKeeper (e.g., every 1000 requests, every 10,000 requests). Clearly updating more often increases the overhead of writing to ZooKeeper, although the impact on the performance of writes to BookKeeper should be minimal given that we make an asynchronous call to update the hint. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated ZOOKEEPER-547: Component/s: server leaderElection Affects Version/s: 3.2.0 3.2.1 Sanity check in QuorumCnxn Manager and quorum communication port. - Key: ZOOKEEPER-547 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-547 Project: Zookeeper Issue Type: Bug Components: leaderElection, server Affects Versions: 3.2.0, 3.2.1 Reporter: Mahadev konar Assignee: Mahadev konar Fix For: 3.3.0 We need to put some sanity checks in QuorumCnxnManager and the other quorum port to guard against rogue clients. Sometimes clients might get misconfigured and send random characters on such ports. We need to make sure that such rogue clients do not bring down the servers, which means adding sanity checks with respect to packet lengths and deserialization. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.
Sanity check in QuorumCnxn Manager and quorum communication port. - Key: ZOOKEEPER-547 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-547 Project: Zookeeper Issue Type: Bug Reporter: Mahadev konar Assignee: Mahadev konar We need to put some sanity checks in QuorumCnxnManager and the other quorum port to guard against rogue clients. Sometimes clients might get misconfigured and send random characters on such ports. We need to make sure that such rogue clients do not bring down the servers, which means adding sanity checks with respect to packet lengths and deserialization. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated ZOOKEEPER-547: Fix Version/s: 3.3.0 Sanity check in QuorumCnxn Manager and quorum communication port. - Key: ZOOKEEPER-547 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-547 Project: Zookeeper Issue Type: Bug Components: leaderElection, server Affects Versions: 3.2.0, 3.2.1 Reporter: Mahadev konar Assignee: Mahadev konar Fix For: 3.3.0 We need to put some sanity checks in QuorumCnxnManager and the other quorum port to guard against rogue clients. Sometimes clients might get misconfigured and send random characters on such ports. We need to make sure that such rogue clients do not bring down the servers, which means adding sanity checks with respect to packet lengths and deserialization. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
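As an illustration of the kind of packet-length check ZOOKEEPER-547 asks for, here is a sketch of reading one length-prefixed frame defensively. The 4-byte big-endian prefix, the MAX_PACKET_LEN cap, and the BadPacket name are assumptions for this example; they are not taken from QuorumCnxnManager.

```python
import io
import struct

MAX_PACKET_LEN = 512 * 1024  # hypothetical cap, not a ZooKeeper constant


class BadPacket(Exception):
    """Raised when a peer sends a frame that fails the sanity checks."""


def read_frame(stream, max_len=MAX_PACKET_LEN):
    """Read a 4-byte big-endian length prefix followed by that many bytes."""
    header = stream.read(4)
    if len(header) != 4:
        raise BadPacket("truncated length prefix")
    (length,) = struct.unpack(">i", header)
    if length < 0 or length > max_len:
        # Reject before allocating: random bytes decode to arbitrary lengths.
        raise BadPacket(f"unreasonable packet length {length}")
    payload = stream.read(length)
    if len(payload) != length:
        raise BadPacket("truncated payload")
    return payload
```

A misconfigured client that sends, say, an HTTP request to the port has its first four bytes ("GET ") decoded as a length of over a billion and is rejected before any buffer is allocated. On a real socket the reads would need to loop until enough bytes arrive; the in-memory stream used here sidesteps that.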
[jira] Commented: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types
[ https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763743#action_12763743 ] Mahadev konar commented on ZOOKEEPER-510: - the patch looks good ... henry, do we have a test case for ZOOKEEPER-540? zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types Key: ZOOKEEPER-510 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510 Project: Zookeeper Issue Type: Bug Components: contrib-bindings Affects Versions: 3.2.0 Reporter: Patrick Hunt Assignee: Henry Robinson Fix For: 3.3.0 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch The current zkpython bindings always throw IOError(text) exceptions, even for ZK specific exceptions such as NODEEXISTS. This makes it difficult (error prone) to handle exceptions in python code. You can't easily pickup a connection loss vs a node exists for example. Of course you could match the error string, but this seems like a bad idea imo. We need to add specific exception types to the python binding that map directly to KeeperException/java types. It would also be useful to include the information provided by the KeeperException (like path in some cases), etc... as part of the error thrown to the python code. Would probably be a good idea to stay as close to java api as possible wrt mapping the errors. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types
[ https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763749#action_12763749 ] Henry Robinson commented on ZOOKEEPER-510: -- Yes - see testconnection in connection_test.py Two cases: trying to close an already closed handle, and trying to issue commands on an already closed handle. Both should raise exceptions. zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types Key: ZOOKEEPER-510 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510 Project: Zookeeper Issue Type: Bug Components: contrib-bindings Affects Versions: 3.2.0 Reporter: Patrick Hunt Assignee: Henry Robinson Fix For: 3.3.0 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch The current zkpython bindings always throw IOError(text) exceptions, even for ZK specific exceptions such as NODEEXISTS. This makes it difficult (error prone) to handle exceptions in python code. You can't easily pickup a connection loss vs a node exists for example. Of course you could match the error string, but this seems like a bad idea imo. We need to add specific exception types to the python binding that map directly to KeeperException/java types. It would also be useful to include the information provided by the KeeperException (like path in some cases), etc... as part of the error thrown to the python code. Would probably be a good idea to stay as close to java api as possible wrt mapping the errors. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-368) Observers
[ https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763753#action_12763753 ] Mahadev konar commented on ZOOKEEPER-368: - great... lets work to get this in soon before the codebase changes again... Observers - Key: ZOOKEEPER-368 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368 Project: Zookeeper Issue Type: New Feature Components: quorum Reporter: Flavio Paiva Junqueira Assignee: Henry Robinson Attachments: obs-refactor.patch, observer-refactor.patch, observers.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch Currently, all servers of an ensemble participate actively in reaching agreement on the order of ZooKeeper transactions. That is, all followers receive proposals, acknowledge them, and receive commit messages from the leader. A leader issues commit messages once it receives acknowledgments from a quorum of followers. For cross-colo operation, it would be useful to have a third role: observer. Using Paxos terminology, observers are similar to learners. An observer does not participate actively in the agreement step of the atomic broadcast protocol. Instead, it only commits proposals that have been accepted by some quorum of followers. One simple solution to implement observers is to have the leader forwarding commit messages not only to followers but also to observers, and have observers applying transactions according to the order followers agreed upon. In the current implementation of the protocol, however, commit messages do not carry their corresponding transaction payload because all servers different from the leader are followers and followers receive such a payload first through a proposal message. Just forwarding commit messages as they currently are to an observer consequently is not sufficient. 
We have a couple of options: 1- Include the transaction payload along in commit messages to observers; 2- Send proposals to observers as well. Number 2 is simpler to implement because it doesn't require changing the protocol implementation, but it increases traffic slightly. The performance impact due to such an increase might be insignificant, though. For scalability purposes, we may consider having followers also forwarding commit messages to observers. With this option, observers can connect to followers, and receive messages from followers. This choice is important to avoid increasing the load on the leader with the number of observers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
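Option 2 from the description above can be pictured with a toy observer that receives proposals, never acknowledges them, and applies a transaction only when the matching commit arrives. The Observer class, its method names, and the use of an integer zxid as the transaction id are assumptions for this sketch, not the design in the attached patches.

```python
class Observer:
    """Learns proposals and applies them only when the commit arrives."""

    def __init__(self):
        self.pending = {}   # zxid -> payload seen in a proposal
        self.applied = []   # (zxid, payload) in commit order

    def on_proposal(self, zxid, payload):
        # Option 2: the observer receives the proposal but never acks it,
        # so it takes no part in the agreement step.
        self.pending[zxid] = payload

    def on_commit(self, zxid):
        # The commit carries no payload, so it must match a known proposal.
        payload = self.pending.pop(zxid)
        self.applied.append((zxid, payload))
```

Under option 1 the pending table would be unnecessary, since the commit itself would carry the payload; the cost is a change to the commit message format.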
[jira] Commented: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types
[ https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763754#action_12763754 ] Mahadev konar commented on ZOOKEEPER-510: - great... missed it... my bad... zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types Key: ZOOKEEPER-510 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510 Project: Zookeeper Issue Type: Bug Components: contrib-bindings Affects Versions: 3.2.0 Reporter: Patrick Hunt Assignee: Henry Robinson Fix For: 3.3.0 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch The current zkpython bindings always throw IOError(text) exceptions, even for ZK specific exceptions such as NODEEXISTS. This makes it difficult (error prone) to handle exceptions in python code. You can't easily pickup a connection loss vs a node exists for example. Of course you could match the error string, but this seems like a bad idea imo. We need to add specific exception types to the python binding that map directly to KeeperException/java types. It would also be useful to include the information provided by the KeeperException (like path in some cases), etc... as part of the error thrown to the python code. Would probably be a good idea to stay as close to java api as possible wrt mapping the errors. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types
[ https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated ZOOKEEPER-510: Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I just committed this. thanks henry! zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types Key: ZOOKEEPER-510 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510 Project: Zookeeper Issue Type: Bug Components: contrib-bindings Affects Versions: 3.2.0 Reporter: Patrick Hunt Assignee: Henry Robinson Fix For: 3.3.0 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch The current zkpython bindings always throw IOError(text) exceptions, even for ZK specific exceptions such as NODEEXISTS. This makes it difficult (error prone) to handle exceptions in python code. You can't easily pickup a connection loss vs a node exists for example. Of course you could match the error string, but this seems like a bad idea imo. We need to add specific exception types to the python binding that map directly to KeeperException/java types. It would also be useful to include the information provided by the KeeperException (like path in some cases), etc... as part of the error thrown to the python code. Would probably be a good idea to stay as close to java api as possible wrt mapping the errors. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
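From the Python side, the request in ZOOKEEPER-510 amounts to an exception hierarchy keyed by error code. The sketch below shows the general shape only. ZooKeeperException is the base-class name mentioned in this thread; the subclass names, the numeric codes, and the error_for helper are hypothetical and may not match what the committed patch exposes.

```python
class ZooKeeperException(Exception):
    """Base class so callers can still catch every ZooKeeper error at once."""

    def __init__(self, message, path=None):
        super().__init__(message)
        self.path = path


class NodeExistsException(ZooKeeperException):
    pass


class ConnectionLossException(ZooKeeperException):
    pass


# Hypothetical numeric codes, chosen only for this sketch.
NODEEXISTS = 1
CONNECTIONLOSS = 2

_BY_CODE = {
    NODEEXISTS: NodeExistsException,
    CONNECTIONLOSS: ConnectionLossException,
}


def error_for(code, message, path=None):
    """Build the most specific exception for a code, else the base class."""
    return _BY_CODE.get(code, ZooKeeperException)(message, path)
```

Calling code can then write `except NodeExistsException:` instead of matching on the text of an IOError, and the optional path attribute carries the extra information the description asks for.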
[jira] Resolved: (ZOOKEEPER-548) zookeeper.ZooKeeperException not added to the module in zkpython
[ https://issues.apache.org/jira/browse/ZOOKEEPER-548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt resolved ZOOKEEPER-548. Resolution: Invalid Assignee: Patrick Hunt Looks like this was addressed in ZOOKEEPER-510 (just went in earlier today) zookeeper.ZooKeeperException not added to the module in zkpython Key: ZOOKEEPER-548 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-548 Project: Zookeeper Issue Type: Bug Components: contrib-bindings Reporter: Patrick Hunt Assignee: Patrick Hunt Priority: Critical Fix For: 3.3.0 ZooKeeperException is not being added to the zookeeper module in zookeeper.c (zkpython). The other exceptions are added but not ZooKeeperException. Sorry, I missed this in my previous change, I got all the subclasses but not zkex itself. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Reopened: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types
[ https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reopened ZOOKEEPER-510: Unless I'm missing something there's a problem with this patch; the following is incorrect according to the python manual:
+#define ADD_EXCEPTION(x) x = PyErr_NewException("zookeeper."#x, ZooKeeperException, NULL); \
+ PyModule_AddObject(module, #x, x);
Specifically, the ref is not being incremented. See this example in the following python man page: http://docs.python.org/extending/extending.html#intermezzo-errors-and-exceptions
SpamError = PyErr_NewException("spam.error", NULL, NULL);
Py_INCREF(SpamError);
PyModule_AddObject(m, "error", SpamError);
zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types Key: ZOOKEEPER-510 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510 Project: Zookeeper Issue Type: Bug Components: contrib-bindings Affects Versions: 3.2.0 Reporter: Patrick Hunt Assignee: Henry Robinson Fix For: 3.3.0 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch The current zkpython bindings always throw IOError(text) exceptions, even for ZK specific exceptions such as NODEEXISTS. This makes it difficult (error prone) to handle exceptions in python code. You can't easily pick up a connection loss vs a node exists for example. Of course you could match the error string, but this seems like a bad idea imo. We need to add specific exception types to the python binding that map directly to KeeperException/java types. It would also be useful to include the information provided by the KeeperException (like path in some cases), etc... as part of the error thrown to the python code. Would probably be a good idea to stay as close to java api as possible wrt mapping the errors. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types
[ https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763835#action_12763835 ] Patrick Hunt commented on ZOOKEEPER-510: I suggest creating an additional patch on this jira to address this. Is this testable? The script could clear the exception from the module (it would be gc'd) and then cause the code to raise the exception, which should cause a seg fault? zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types Key: ZOOKEEPER-510 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510 Project: Zookeeper Issue Type: Bug Components: contrib-bindings Affects Versions: 3.2.0 Reporter: Patrick Hunt Assignee: Henry Robinson Fix For: 3.3.0 Attachments: ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch The current zkpython bindings always throw IOError(text) exceptions, even for ZK specific exceptions such as NODEEXISTS. This makes it difficult (error prone) to handle exceptions in python code. You can't easily pick up a connection loss vs a node exists for example. Of course you could match the error string, but this seems like a bad idea imo. We need to add specific exception types to the python binding that map directly to KeeperException/java types. It would also be useful to include the information provided by the KeeperException (like path in some cases), etc... as part of the error thrown to the python code. Would probably be a good idea to stay as close to java api as possible wrt mapping the errors. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-541) zkpython limited to 256 handles
[ https://issues.apache.org/jira/browse/ZOOKEEPER-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763838#action_12763838 ] Patrick Hunt commented on ZOOKEEPER-541: Henry, you want to make this patch available, not in progress. I seem to be locked out of doing that otw I would fix it. zkpython limited to 256 handles --- Key: ZOOKEEPER-541 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-541 Project: Zookeeper Issue Type: Bug Components: contrib-bindings Affects Versions: 3.2.1 Reporter: Patrick Hunt Assignee: Henry Robinson Fix For: 3.3.0 Attachments: ZOOKEEPER-541.patch zkpython is currently limited to a max of 256 total handles - not 256 open handles, but rather 256 total handles created over the lifetime of the python application. In general this isn't a real issue, however in the case of a long lived application which polls the cluster periodically (closing the session btw calls) this is an issue. it would be great if the slots could be reused? or perhaps a more complex structure, such as a linked list, which would allow dynamic growth/shrinkage of the handle list. Also see ZOOKEEPER-540 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
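The slot-reuse idea floated in ZOOKEEPER-541 can be sketched as a fixed-size table plus a free list. This is written in Python for brevity, whereas the binding itself is C; the HandleTable class and its method names are hypothetical and do not describe the attached patch.

```python
class HandleTable:
    """Fixed-size table whose slots become reusable after close()."""

    def __init__(self, capacity=256):
        self._slots = [None] * capacity
        # Reversed so that pop() hands out slot 0 first.
        self._free = list(range(capacity - 1, -1, -1))

    def open(self, handle):
        if not self._free:
            raise RuntimeError("no free handle slots")
        index = self._free.pop()
        self._slots[index] = handle
        return index

    def close(self, index):
        if self._slots[index] is None:
            raise ValueError("handle already closed")
        self._slots[index] = None
        self._free.append(index)

    def get(self, index):
        handle = self._slots[index]
        if handle is None:
            raise ValueError("handle is closed")
        return handle
```

With this layout the limit applies to handles open at the same time rather than to handles created over the life of the process, so a long-lived application that opens and closes a session on every poll never runs out of slots.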
[jira] Commented: (ZOOKEEPER-512) FLE election fails to elect leader
[ https://issues.apache.org/jira/browse/ZOOKEEPER-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763850#action_12763850 ] Hadoop QA commented on ZOOKEEPER-512: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12417726/ZOOKEEPER-512.patch against trunk revision 823371. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/20/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/20/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/20/console This message is automatically generated. 
FLE election fails to elect leader -- Key: ZOOKEEPER-512 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-512 Project: Zookeeper Issue Type: Bug Components: quorum, server Affects Versions: 3.2.0 Reporter: Patrick Hunt Assignee: Flavio Paiva Junqueira Priority: Blocker Fix For: 3.3.0 Attachments: jst.txt, log3_debug.tar.gz, logs.tar.gz, logs2.tar.gz, t5_aj.tar.gz, ZOOKEEPER-512.patch, ZOOKEEPER-512.patch, ZOOKEEPER-512.patch, ZOOKEEPER-512.patch I was doing some fault injection testing of 3.2.1 with the ZOOKEEPER-508 patch applied and noticed that after some time the ensemble failed to re-elect a leader. See the attached log files - 5 member ensemble; typically 5 is the leader. Notice that after 16:23:50,525 no quorum is formed, even after 20 minutes elapse w/no quorum. Environment: I was doing fault injection testing using aspectj. The faults are injected into socketchannel read/write; I throw exceptions randomly at a 1/200 ratio (rand.nextFloat() <= .005 => throw IOException). You can see when a fault is injected in the log via:
2009-08-19 16:57:09,568 - INFO [Thread-74:readrequestfailsintermitten...@38] - READPACKET FORCED FAIL
vs a read/write that didn't force fail:
2009-08-19 16:57:09,568 - INFO [Thread-74:readrequestfailsintermitten...@41] - READPACKET OK
otw standard code/config (straight fle quorum with 5 members). Also see the attached jstack trace; this is for one of the servers. Notice in particular that the number of sendworkers != the number of recv workers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
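The fault-injection setup described in ZOOKEEPER-512 used aspectj around socketchannel read/write. The following sketch shows only the core idea, failing a wrapped call at a fixed ratio, in Python. The FaultInjector class, its counters, and the seed parameter are assumptions for illustration and are not the reporter's aspect.

```python
import random

FAIL_RATIO = 0.005  # 1 in 200, as in the report


class FaultInjector:
    """Wraps a read/write callable and makes it fail at a fixed ratio."""

    def __init__(self, ratio=FAIL_RATIO, seed=None):
        self.ratio = ratio
        self.rand = random.Random(seed)
        self.forced_failures = 0
        self.ok = 0

    def call(self, operation, *args):
        # Mirrors the report's check: a random draw at or below the ratio
        # forces an I/O error instead of running the real operation.
        if self.rand.random() <= self.ratio:
            self.forced_failures += 1
            raise IOError("FORCED FAIL")
        self.ok += 1
        return operation(*args)
```

Passing a fixed seed makes a failing run repeatable, which helps when trying to reproduce an election hang like the one reported here.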