[jira] Updated: (ZOOKEEPER-563) ant test for recipes is broken.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-563:
---
Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

Looks good, thanks Mahadev.

> ant test for recipes is broken.
> ---
>
> Key: ZOOKEEPER-563
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-563
> Project: Zookeeper
> Issue Type: Bug
> Components: build
> Reporter: Mahadev konar
> Assignee: Mahadev konar
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-563.patch
>
> With ZOOKEEPER-529 checked in, ant test for recipes broke. It's a minor change to the build to include libraries from the new location where jars are downloaded by ivy.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-562) c client can flood server with pings if tcp send queue filled
[ https://issues.apache.org/jira/browse/ZOOKEEPER-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-562:
---
Release Note: The problem is that the client tries to send a ping to the server but cannot, because the tcp send queue is full. The logic responsible for sending occasional pings based on the timeout is confused by this and ends up flooding the server with pings. Eventually this clears up, but it can result in increased load on the server and instability for the affected client.

> c client can flood server with pings if tcp send queue filled
> -
>
> Key: ZOOKEEPER-562
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-562
> Project: Zookeeper
> Issue Type: Bug
> Components: c client
> Affects Versions: 3.2.1
> Reporter: Patrick Hunt
> Assignee: Benjamin Reed
> Priority: Blocker
> Fix For: 3.2.2, 3.3.0
>
> Attachments: ZOOKEEPER-562.patch
>
> The c client can flood the server with pings if the tcp queue is filled.
> Say the cluster is overloaded and shuts down the recv processing. A c client can send a ping, but last_send is only updated when data is successfully pushed into the socket. If flush_send_queue fails to send any data (send_buffer returns 0), then last_send is not updated and zookeeper_interest will again send a ping the next time it is woken. That wait could be 0 if recv_to is close to 0, which could easily happen if the server is not sending data to the client.
[jira] Commented: (ZOOKEEPER-562) c client can flood server with pings if tcp send queue filled
[ https://issues.apache.org/jira/browse/ZOOKEEPER-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771295#action_12771295 ] Patrick Hunt commented on ZOOKEEPER-562: +1, looks good to me - if we are waiting for a result we don't need to send a ping. great! Mahadev, can you take a look at this and commit if no issues found? (both 3.2 branch and trunk)
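The failure mode and the reviewed idea ("if we are waiting for a result we don't need to send a ping") can be sketched with a toy model. This is not the C client code: the class, its fields, and the `guard` flag are hypothetical names chosen for illustration, and the real patch may be structured differently.

```python
from collections import deque


class PingScheduler:
    """Toy model of a client keep-alive loop (hypothetical, not the C client)."""

    def __init__(self, ping_interval_ms):
        self.ping_interval_ms = ping_interval_ms
        self.last_send_ms = 0        # only advanced when bytes really leave the queue
        self.send_queue = deque()
        self.awaiting_reply = 0      # requests queued or sent but not yet answered

    def flush(self, now_ms, socket_writable):
        """Push queued data to the socket; returns how many items were sent."""
        if not socket_writable or not self.send_queue:
            return 0                 # nothing sent, so last_send_ms stays stale
        sent = len(self.send_queue)
        self.send_queue.clear()
        self.last_send_ms = now_ms
        return sent

    def on_reply(self):
        """Record that the server answered one outstanding request."""
        if self.awaiting_reply > 0:
            self.awaiting_reply -= 1

    def on_wakeup(self, now_ms, socket_writable, guard=True):
        """Called each time the event loop wakes up."""
        idle_ms = now_ms - self.last_send_ms
        if idle_ms >= self.ping_interval_ms:
            # With the guard, a ping is skipped while a reply is still pending.
            if not (guard and self.awaiting_reply > 0):
                self.send_queue.append("PING")
                self.awaiting_reply += 1
        self.flush(now_ms, socket_writable)


def queued_pings_after(wakeups, guard):
    """Wake the loop repeatedly while the socket refuses all writes."""
    scheduler = PingScheduler(ping_interval_ms=1000)
    for i in range(wakeups):
        scheduler.on_wakeup(now_ms=1000 + i, socket_writable=False, guard=guard)
    return len(scheduler.send_queue)
```

With the socket blocked, `queued_pings_after(5, guard=False)` queues one ping per wakeup, while `queued_pings_after(5, guard=True)` queues a single ping and then waits.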
[jira] Created: (ZOOKEEPER-566) "reqs" four letter word (command port) returns no information
"reqs" four letter word (command port) returns no information - Key: ZOOKEEPER-566 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-566 Project: Zookeeper Issue Type: Bug Affects Versions: 3.2.1 Reporter: Patrick Hunt Priority: Critical Fix For: 3.3.0 the four letter word "reqs" doesn't do anything - it always returns empty data. Seems that "outstanding" field is always empty and never set. we should remove outstanding and also update the reqs code to correctly output the outstanding requests (if not possible then remove the cmd and update docs - although this is very useful command, hate to see us lose it) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (ZOOKEEPER-558) server "sent" stats not being updated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt reassigned ZOOKEEPER-558:
--
Assignee: Patrick Hunt

> server "sent" stats not being updated
> -
>
> Key: ZOOKEEPER-558
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-558
> Project: Zookeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.2.1
> Reporter: Patrick Hunt
> Assignee: Patrick Hunt
> Priority: Critical
> Fix For: 3.3.0
>
> The server and connection "sent" stat is not being updated. If you run "stat" on the client port, the sent packet count is much lower than it should be.
> It seems that sendbuffer is not updating the stats when it short-circuits the send.
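For readers who want to check the counters themselves, a minimal sketch of issuing a four letter word such as "stat" against the client port follows. The helper name and the timeout default are assumptions; the behaviour relied on is that the server writes its reply and then closes the connection.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a four letter word to a server's client port and return the reply.

    Hypothetical helper for illustration; it reads until the peer closes
    the connection.
    """
    if len(command) != 4:
        raise ValueError("expected a four letter command such as 'stat'")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:             # peer closed: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Against a local server this would be called as `four_letter_word("localhost", 2181, "stat")`, where the host and port are taken from the connect string that appears in the logs elsewhere in this thread.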
[jira] Commented: (ZOOKEEPER-558) server "sent" stats not being updated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771468#action_12771468 ] Patrick Hunt commented on ZOOKEEPER-558: looks like this may be more serious, afaict some sendbuffer(closeconn) will (may) be ignored. not sure this was thought out when the short-circuit optimization was made. esp around four letter words.
[jira] Commented: (ZOOKEEPER-565) Revisit some java client log messages
[ https://issues.apache.org/jira/browse/ZOOKEEPER-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771663#action_12771663 ]

Patrick Hunt commented on ZOOKEEPER-565:

Hm, I want to make sure I understand your issue. There is only one session here: the zk client session. There is also a network connection (the socket) between client and server. This message is (or should be) saying that the zk client lib attempted to reconnect to the server, but the client's ZK session has already expired on the server (timeout exceeded, ephemerals cleaned up, etc.) and is no longer valid (i.e. the client needs to create a new session).

I'm happy to make this better (it can't be much worse), but I want to make sure I grok your request. If this log message said something like

2009-10-29 14:25:54,023 - INFO ClientCnxn - Unable to reconnect to ZooKeeper service, session 0x124a265d8b20001 has expired

would that be better? (Notice the info level, since it's not really an error condition.) In both cases your client watcher code gets the watcher event that notifies it of the session expiration; this is a log entry written by the client library code when it captures that event. Better? Suggestions?

> Revisit some java client log messages
> -
>
> Key: ZOOKEEPER-565
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-565
> Project: Zookeeper
> Issue Type: Improvement
> Affects Versions: 3.2.1
> Reporter: Jean-Daniel Cryans
>
> As discussed during the 10/23 meeting, some messages in the java client logs are mixing up terms from different levels. For example:
> {code}
> 2009-10-14 15:12:43,566 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x1244f619478000d to sun.nio.ch.selectionkeyi...@15e32c4
> java.io.IOException: Session Expired
>     at org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:589)
>     at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:709)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
> {code}
> Which session are we talking about in the first line? Now I know that it's a network-related session and not the ZK one, but I've seen many of our users getting confused over those lines.
[jira] Commented: (ZOOKEEPER-565) Revisit some java client log messages
[ https://issues.apache.org/jira/browse/ZOOKEEPER-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771763#action_12771763 ] Patrick Hunt commented on ZOOKEEPER-565: currently I have changed it to this 2009-10-29 19:53:27,788 - INFO - Client session timed out, have not heard from server in 20001ms for sessionid 0x124a391a9620001, closing socket connection and attempting reconnect as part of a patch that I'm working on that will attempt to improve the client/server session establishment messages
[jira] Commented: (ZOOKEEPER-564) Give more feedback on that current flow of events in java client logs
[ https://issues.apache.org/jira/browse/ZOOKEEPER-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771829#action_12771829 ]

Patrick Hunt commented on ZOOKEEPER-564:

I've worked out the following for session establishment, teardown and expiration handling. I'm not convinced about the numbering (and getting the numbers "right", with no gaps for example, in all cases might be tough). We'd also include some documentation describing the client session establishment, teardown, and expiration handling, which would refer to the messages (a bit of handwaving here because nothing exists yet, but think of it as docs describing what's below). These logs make the steps clearer: first a socket connection is created, then a session is established.

The following is logging at info level.

client side log, create session

2009-10-29 21:37:05,393 - INFO - Initiating client connection, connectString=localhost:2181 sessionTimeout=3 watcher=org.apache.zookeeper.zookeepermain$mywatc...@1608e05
2009-10-29 21:37:05,449 - INFO - Opening socket connection to server localhost/127.0.0.1:2181
2009-10-29 21:37:05,493 - INFO - Socket connection established to localhost/127.0.0.1:2181, initiating session
Welcome to ZooKeeper!
2009-10-29 21:37:05,547 - INFO - Session establishment complete, sessionid = 0x124a3f255ce

client side log, close session

2009-10-29 21:37:08,677 - INFO - Session: 0x124a3f255ce closed

server watching client session creation

2009-10-29 20:57:19,748 - INFO - Accepted socket connection from /127.0.0.1:49641
2009-10-29 20:57:19,784 - INFO - Client attempting to establish new session at /127.0.0.1:49641
2009-10-29 20:57:19,801 - INFO - Established session 0x124a3cdf52d for client /127.0.0.1:49641

server watching client close session

2009-10-29 20:57:49,772 - INFO - Processed session termination for sessionid: 0x124a3cdf52d
2009-10-29 20:57:49,775 - INFO - Closed socket connection for client /127.0.0.1:49641 which had sessionid 0x124a3cdf52d

server expiring client session

2009-10-29 21:00:18,001 - INFO - Expiring session 0x124a3cdf52d0001, timeout of 3ms exceeded
2009-10-29 21:00:18,002 - INFO - Processed session termination for sessionid: 0x124a3cdf52d0001
2009-10-29 21:00:18,004 - INFO - Closed socket connection for client /127.0.0.1:49644 which had sessionid 0x124a3cdf52d0001

server watching client attempt to re-establish expired session

2009-10-29 21:00:28,222 - INFO - Accepted socket connection from /127.0.0.1:51000
2009-10-29 21:00:28,223 - INFO - Client attempting to renew session 0x124a3cdf52d0001 at /127.0.0.1:51000
2009-10-29 21:00:28,225 - INFO - Invalid session 0x124a3cdf52d0001 for client /127.0.0.1:51000, probably expired
2009-10-29 21:00:28,227 - INFO - Closed socket connection for client /127.0.0.1:51000 which had sessionid 0

> Give more feedback on that current flow of events in java client logs
> -
>
> Key: ZOOKEEPER-564
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-564
> Project: Zookeeper
> Issue Type: Improvement
> Affects Versions: 3.2.1
> Reporter: Jean-Daniel Cryans
>
> As discussed during the 10/23 meeting, one issue we have in debugging ZK client logs with HBase is that we have a hard time following the flow of events. It may be obvious for a ZK dev, but from our POV that kind of trace isn't very intuitive:
> {code}
> 2009-09-27 15:41:10,776 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ...
> 2009-09-27 15:41:10,776 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/ ... remote=...
> 2009-09-27 15:41:10,776 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
> 2009-09-27 15:41:10,784 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@2c9b42e6
> {code}
> This excerpt is just an example. We would like to see something like a numbering of the events and possibly, in the case of an exception, at which point it went wrong and what the next step is.
[jira] Created: (ZOOKEEPER-567) javadoc for getchildren2 needs to mention "new in 3.3.0"
javadoc for getchildren2 needs to mention "new in 3.3.0" Key: ZOOKEEPER-567 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-567 Project: Zookeeper Issue Type: Bug Components: c client, java client Reporter: Patrick Hunt Fix For: 3.3.0 the javadoc/cdoc for getchildren2 needs to mention that the methods are "new in 3.3.0"
[jira] Updated: (ZOOKEEPER-537) The zookeeper jar includes the java source files
[ https://issues.apache.org/jira/browse/ZOOKEEPER-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-537:
---
Status: Open (was: Patch Available)

Thomas, there are at least a couple of outstanding comments not addressed by this patch. Specifically, there is interest in not changing the existing jar file, as it is currently used by non-maven users. I put forth the idea of having a separate set of jars for bin/src/jdoc (btw, I don't see the jdoc jar in your patch; isn't that something you had mentioned would be useful?)

https://issues.apache.org/jira/browse/ZOOKEEPER-537?focusedCommentId=12761052&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12761052
https://issues.apache.org/jira/browse/ZOOKEEPER-537?focusedCommentId=12763561&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12763561

Do you see an issue with what I'm suggesting?

> The zookeeper jar includes the java source files
> 
>
> Key: ZOOKEEPER-537
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-537
> Project: Zookeeper
> Issue Type: Bug
> Affects Versions: 3.3.0
> Reporter: Thomas Dudziak
> Fix For: 3.3.0
>
> Attachments: build.patch
>
> This is a problem if you use zookeeper as a dependency in maven because, for whatever reason, the maven compiler plugin will pick up the java files in the jar and compile them to the output directory. From there they will land in the generated jar file for whatever project happens to depend on zookeeper, thus introducing duplicate classes (once in zookeeper.jar, once in the project's artifact).
[jira] Updated: (ZOOKEEPER-558) server "sent" stats not being updated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-558:
---
Attachment: ZOOKEEPER-558.patch

This patch addresses the sent count issue (the counter update was missing). Additionally, I addressed a number of session establishment/teardown issues that I found while fixing this:

1) better logging of session est/teardown. Much of this was based on user feedback such as ZOOKEEPER-565 and ZOOKEEPER-564, and much of it was cleaning up the log text and using common terminology for the messages themselves
1.1) some of 1) required better error handling. A lot of the changed code was throwing general IOExceptions rather than more specific exceptions that allow higher layer code to properly report the issue
2) while closing a client connection, the connection close response was not always sent to the client
3) in some cases the server would not close the connection properly (the closeconn buffer was not being queued). The close would happen anyway when the client closed the connection, but the server should do the right thing regardless
4) calling close more than once just returns after the first call
5) added testableWaitForShutdown to ZooKeeper to allow tests to validate that the client has shut down all threads (this is test only)
6) reading a client socket now reads the entire request in one shot if possible. Previously we would re-poll the selector after reading the length header
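The class of bug behind the missing "sent" count can be sketched as follows: when a send path has a fast route (write directly) and a slow route (queue, then flush), the counter has to be updated on both. The class and method names here are hypothetical and only model the idea; they are not taken from the server code or the patch.

```python
class Connection:
    """Toy connection with a short-circuit send path (hypothetical model)."""

    def __init__(self, write_fn):
        self._write = write_fn       # returns the number of bytes it accepted
        self.queue = []
        self.packets_sent = 0

    def _packet_sent(self):
        self.packets_sent += 1

    def send_buffer(self, payload):
        """Try to write immediately; queue whatever could not be written."""
        if not self.queue:
            written = self._write(payload)
            if written == len(payload):
                # Short-circuit path: the stat must be updated here too.
                self._packet_sent()
                return
            payload = payload[written:]
        self.queue.append(payload)

    def flush(self):
        """Drain the queue, counting each packet once it is fully written."""
        while self.queue:
            head = self.queue[0]
            written = self._write(head)
            if written < len(head):
                self.queue[0] = head[written:]
                return
            self.queue.pop(0)
            self._packet_sent()
```

Keeping the increment in one helper that both paths call makes it harder for a later optimization to bypass the statistic again.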
[jira] Updated: (ZOOKEEPER-558) server "sent" stats not being updated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-558: --- Status: Patch Available (was: Open)
[jira] Created: (ZOOKEEPER-568) SyncRequestProcessor snapping too frequently - counts non-log events as log events
SyncRequestProcessor snapping too frequently - counts non-log events as log events
--
Key: ZOOKEEPER-568
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-568
Project: Zookeeper
Issue Type: Bug
Affects Versions: 3.2.1
Reporter: Patrick Hunt
Fix For: 3.3.0

Noticed the following issues in SyncRequestProcessor:

1) logCount is incremented even for non-log events (say getData). txnlog should return an indication of whether the request was logged or not (if hdr == null it returns without logging)

also:

2) move r.nextInt below logCount++ (i.e. only if it is an actual log event)
3) fix indentation after txnlog.append (for some reason it has an unnecessary 4 char indent)
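Points 1) and 2) can be illustrated with a small model in which append reports whether anything was written, and both the counter and the random draw happen only for real log events. The class names, the dict-based request, and the snapshot threshold formula are assumptions made for this sketch, not the server implementation.

```python
import random


class TxnLog:
    """Toy transaction log: only requests carrying a header are written."""

    def __init__(self):
        self.entries = []

    def append(self, request):
        """Return True if the request was logged, False if it was skipped."""
        if request.get("hdr") is None:
            return False
        self.entries.append(request)
        return True


class SyncProcessor:
    """Toy model of counting log events toward a snapshot (hypothetical)."""

    def __init__(self, snap_count, rng=None):
        self.snap_count = snap_count          # must be at least 2 in this sketch
        self.rng = rng or random.Random()
        self.log = TxnLog()
        self.log_count = 0
        self.snapshots = 0

    def process(self, request):
        if not self.log.append(request):
            return                            # read-only request: not counted
        self.log_count += 1
        # The random draw happens only after a real log event was counted.
        jitter = self.rng.randrange(self.snap_count // 2)
        if self.log_count > self.snap_count // 2 + jitter:
            self.snapshots += 1               # stand-in for roll log + snapshot
            self.log_count = 0
```

In this model a long run of reads leaves `log_count` at zero, so snapshots are driven only by requests that actually reached the log.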
[jira] Updated: (ZOOKEEPER-558) server "sent" stats not being updated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-558: --- Status: Open (was: Patch Available)
[jira] Commented: (ZOOKEEPER-568) SyncRequestProcessor snapping too frequently - counts non-log events as log events
[ https://issues.apache.org/jira/browse/ZOOKEEPER-568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772822#action_12772822 ] Patrick Hunt commented on ZOOKEEPER-568: move traceMask into isTraceEnabled check
[jira] Updated: (ZOOKEEPER-558) server "sent" stats not being updated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-558: --- Attachment: ZOOKEEPER-558.patch Same as last patch except cleared up the findbugs issue.
[jira] Updated: (ZOOKEEPER-558) server "sent" stats not being updated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-558: --- Status: Patch Available (was: Open)
[jira] Commented: (ZOOKEEPER-558) server "sent" stats not being updated
[ https://issues.apache.org/jira/browse/ZOOKEEPER-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773277#action_12773277 ] Patrick Hunt commented on ZOOKEEPER-558: eclipse warns otw if you don't have the uid. no other reason
[jira] Updated: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-547:
---
Status: Open (was: Patch Available)

Why did the unit tests fail?

> Sanity check in QuorumCnxn Manager and quorum communication port.
> -
>
> Key: ZOOKEEPER-547
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-547
> Project: Zookeeper
> Issue Type: Bug
> Components: leaderElection, server
> Affects Versions: 3.2.1, 3.2.0
> Reporter: Mahadev konar
> Assignee: Mahadev konar
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-547.patch, ZOOKEEPER-547.patch, ZOOKEEPER-547.patch
>
> We need to put some sanity checks in QuorumCnxnManager and the other quorum port for rogue clients. Sometimes a client might be misconfigured and send random characters to such ports. We need to make sure that such rogue clients do not bring down the clients, and we need to put in some sanity checks with respect to packet lengths and deserialization.
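A sanity check on packet lengths could look roughly like the sketch below. The wire format (a 4-byte big-endian signed length followed by the body), the size cap, and every name are assumptions for illustration; the issue itself does not spell out the format or the limits.

```python
import io
import struct

MAX_PACKET_LEN = 512 * 1024   # hypothetical upper bound, chosen for the example


class BadPacket(Exception):
    """Raised when incoming bytes do not look like a well-formed frame."""


def read_frame(stream, max_len=MAX_PACKET_LEN):
    """Read one length-prefixed frame, rejecting implausible lengths.

    `stream` is any object with a blocking read(n), for example io.BytesIO
    or the result of socket.makefile("rb").
    """
    header = stream.read(4)
    if len(header) != 4:
        raise BadPacket("truncated length header")
    (length,) = struct.unpack(">i", header)
    if length <= 0 or length > max_len:
        # Random text such as an HTTP request decodes to a huge or negative
        # length; refuse it instead of allocating a buffer of that size.
        raise BadPacket("implausible packet length %d" % length)
    body = stream.read(length)
    if len(body) != length:
        raise BadPacket("truncated packet body")
    return body
```

The caller would catch `BadPacket`, log the peer address, and close only that connection, so a misconfigured peer cannot take the listener down.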
[jira] Updated: (ZOOKEEPER-547) Sanity check in QuorumCnxn Manager and quorum communication port.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-547: --- Fix Version/s: 3.2.2
[jira] Created: (ZOOKEEPER-570) AsyncHammerTest is broken, callbacks need to validate rc parameter
AsyncHammerTest is broken, callbacks need to validate rc parameter
--
Key: ZOOKEEPER-570
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-570
Project: Zookeeper
Issue Type: Bug
Components: tests
Affects Versions: 3.2.1
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Critical
Fix For: 3.2.2, 3.3.0

The AsyncHammerTest is not validating the rc in the callback. More serious is that it uses path in the create callback to delete the node, rather than name (which matters for sequential node creation, as in this case).
[jira] Updated: (ZOOKEEPER-570) AsyncHammerTest is broken, callbacks need to validate rc parameter
[ https://issues.apache.org/jira/browse/ZOOKEEPER-570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-570:
---
Attachment: ZOOKEEPER-570.patch

Fixed the test:

1) delete now uses name rather than path (since create is using the seq flag). This was the main issue previously
2) fail the test if create or delete operations fail
3) don't send messages to the server until connected; otherwise there is a false positive failure due to a queued async op when the client times out
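Why name rather than path matters can be shown with an in-memory stand-in for an async client. Everything here is hypothetical (the `FakeZk` class, its method names, and the synchronous invocation of callbacks); only the idea is taken from the fix list above: a sequential create returns a name that differs from the requested path, and callbacks should check rc.

```python
OK = 0
NO_NODE = -101   # stand-in for the "no node" result code


class FakeZk:
    """In-memory stand-in for an async client (hypothetical)."""

    def __init__(self):
        self.nodes = set()
        self._seq = 0

    def create_sequential(self, path, callback):
        """Create path plus a 10-digit counter; report both path and name."""
        name = "%s%010d" % (path, self._seq)
        self._seq += 1
        self.nodes.add(name)
        callback(OK, path, name)

    def delete(self, path, callback):
        if path in self.nodes:
            self.nodes.remove(path)
            callback(OK, path)
        else:
            callback(NO_NODE, path)


def run_once(zk, use_name=True):
    """Create then delete one sequential node; return recorded failures."""
    failures = []

    def on_delete(rc, path):
        if rc != OK:                       # validate rc instead of ignoring it
            failures.append(("delete", path, rc))

    def on_create(rc, path, name):
        if rc != OK:
            failures.append(("create", path, rc))
            return
        # The node that exists is `name`; `path` is only the requested prefix.
        zk.delete(name if use_name else path, on_delete)

    zk.create_sequential("/test-", on_create)
    return failures
```

Calling `run_once(FakeZk(), use_name=False)` reproduces the original mistake: the delete targets the prefix, fails with the no-node code, and the created node is left behind.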
[jira] Updated: (ZOOKEEPER-570) AsyncHammerTest is broken, callbacks need to validate rc parameter
[ https://issues.apache.org/jira/browse/ZOOKEEPER-570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-570: --- Status: Patch Available (was: Open)
[jira] Updated: (ZOOKEEPER-570) AsyncHammerTest is broken, callbacks need to validate rc parameter
[ https://issues.apache.org/jira/browse/ZOOKEEPER-570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-570: --- Status: Open (was: Patch Available) > AsyncHammerTest is broken, callbacks need to validate rc parameter > -- > > Key: ZOOKEEPER-570 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-570 > Project: Zookeeper > Issue Type: Bug > Components: tests >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Critical > Fix For: 3.2.2, 3.3.0 > > Attachments: ZOOKEEPER-570.patch > > > the asynchammertest is not validating the rc in the callback, more serious is > that it is using path in the create callback > to delete the node, rather than name (which is important in the case of a > sequential node creation as in this case) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-570) AsyncHammerTest is broken, callbacks need to validate rc parameter
[ https://issues.apache.org/jira/browse/ZOOKEEPER-570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-570: --- Attachment: ZOOKEEPER-570.patch same patch as before, however I noticed that there was a serious flaw with the verification phase, which restarted the quorum. The restart code was wrong, it caused a new quorum to be started, rather than restarting the existing quorum. This addresses the original cause of the test failing, plus the original patch's fix for the delete not being correct (etc...) > AsyncHammerTest is broken, callbacks need to validate rc parameter > -- > > Key: ZOOKEEPER-570 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-570 > Project: Zookeeper > Issue Type: Bug > Components: tests >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Critical > Fix For: 3.2.2, 3.3.0 > > Attachments: ZOOKEEPER-570.patch, ZOOKEEPER-570.patch > > > the asynchammertest is not validating the rc in the callback, more serious is > that it is using path in the create callback > to delete the node, rather than name (which is important in the case of a > sequential node creation as in this case) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-570) AsyncHammerTest is broken, callbacks need to validate rc parameter
[ https://issues.apache.org/jira/browse/ZOOKEEPER-570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-570: --- Status: Patch Available (was: Open) > AsyncHammerTest is broken, callbacks need to validate rc parameter > -- > > Key: ZOOKEEPER-570 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-570 > Project: Zookeeper > Issue Type: Bug > Components: tests >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Critical > Fix For: 3.2.2, 3.3.0 > > Attachments: ZOOKEEPER-570.patch, ZOOKEEPER-570.patch > > > the asynchammertest is not validating the rc in the callback, more serious is > that it is using path in the create callback > to delete the node, rather than name (which is important in the case of a > sequential node creation as in this case) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-544) improve client testability - allow test client to access connected server location
[ https://issues.apache.org/jira/browse/ZOOKEEPER-544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-544: --- Assignee: Patrick Hunt Status: Patch Available (was: Open) > improve client testability - allow test client to access connected server > location > -- > > Key: ZOOKEEPER-544 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-544 > Project: Zookeeper > Issue Type: Improvement > Components: c client, java client, tests >Reporter: Patrick Hunt >Assignee: Patrick Hunt > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-544.patch > > > This came up recently on the user list. If you are developing tests for your > zk client you need to be able to access the server that your > session is currently connected to. The reason is that your test needs to know > which server in the quorum to shutdown in order to > verify you are handling failover correctly. Similar for session expiration > testing. > however we should be careful, we prefer not to expose this to all clients, > this is an implementation detail that we typically > want to hide. > also we should provide this in both the c and java clients > I suspect we should add a protected method on ZooKeeper. This will make a > higher bar (user will have to subclass) for > the user to access this method. In tests it's fine, typically you want a > "TestableZooKeeper" class anyway. In c we unfortunately > have less options, we can just rely on docs for now. > In both cases (c/java) we need to be very very clear in the docs that this is > for testing only and to clearly define semantics. > We should add the following at the same time: > toString() method to ZooKeeper which includes server ip/port, client port, > any other information deemed useful (connection stats like send/recv?) > the java ZooKeeper is missing "deterministic connection order" that the c > client has. this is also useful for testing. again, protected and > clear docs that this is for testing purposes only! 
> Any other things we should expose? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-544) improve client testability - allow test client to access connected server location
[ https://issues.apache.org/jira/browse/ZOOKEEPER-544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-544: --- Attachment: ZOOKEEPER-544.patch Added some methods to the java client to improve testability - in particular to get the tostring of zk object and allow access to local/remote ip/port. Also added some stats for tracking sent/recv information - included in tostring. Added tests of these features - see the log output for examples. Note to reviewer - be sure to check some cleanups I did in the code surrounding the send/recv counts. Note my attempt to inform potential users that these methods are meant for testing by 1) making them protected, 2) naming, 3) javadoc comments. Also, it would be great to get c versions of these 3 new methods. Ben? > improve client testability - allow test client to access connected server > location > -- > > Key: ZOOKEEPER-544 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-544 > Project: Zookeeper > Issue Type: Improvement > Components: c client, java client, tests >Reporter: Patrick Hunt > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-544.patch > > > This came up recently on the user list. If you are developing tests for your > zk client you need to be able to access the server that your > session is currently connected to. The reason is that your test needs to know > which server in the quorum to shutdown in order to > verify you are handling failover correctly. Similar for session expiration > testing. > however we should be careful, we prefer not to expose this to all clients, > this is an implementation detail that we typically > want to hide. > also we should provide this in both the c and java clients > I suspect we should add a protected method on ZooKeeper. This will make a > higher bar (user will have to subclass) for > the user to access this method. In tests it's fine, typically you want a > "TestableZooKeeper" class anyway.
In c we unfortunately > have less options, we can just rely on docs for now. > In both cases (c/java) we need to be very very clear in the docs that this is > for testing only and to clearly define semantics. > We should add the following at the same time: > toString() method to ZooKeeper which includes server ip/port, client port, > any other information deemed useful (connection stats like send/recv?) > the java ZooKeeper is missing "deterministic connection order" that the c > client has. this is also useful for testing. again, protected and > clear docs that this is for testing purposes only! > Any other things we should expose? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-571) support balancing of client load across servers in an ensemble
support balancing of client load across servers in an ensemble -- Key: ZOOKEEPER-571 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-571 Project: Zookeeper Issue Type: Improvement Components: quorum, server Reporter: Patrick Hunt Currently the ensemble does not ensure a balanced load across servers in an ensemble. Clients randomly connect to a server, which typically balances the number of sessions. However there are problems with this: 1) session count is balanced, but not session load 2) if server A goes down all of the sessions on that server migrate to other servers in the cluster randomly, this is fine, however when server A comes back into service it will have no sessions, and migration of sessions from other servers may take time The quorum should probably have some way of broadcasting load, and occasionally re-balance the sessions based on this information. Might be tricky though, want to ensure that we aren't constantly ping-ponging sessions to servers. Probably need some hysteresis as well as limit the frequency. Real time tuning would need to be supported. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
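One way to read the hysteresis and frequency-limit idea is sketched below. It is only an illustration of the proposal, not a design taken from the issue: the `Rebalancer` class, the 20% dead band (a simple form of hysteresis) and the 60 second minimum interval are all assumptions made for this example.

```python
class Rebalancer:
    """Decides whether to move sessions between servers, with damping."""

    def __init__(self, threshold=0.2, min_interval=60.0):
        self.threshold = threshold        # relative imbalance needed to act
        self.min_interval = min_interval  # seconds between rebalances
        self.last_rebalance = None

    def should_rebalance(self, loads, now):
        """loads maps server id -> load; now is a timestamp in seconds."""
        if len(loads) < 2:
            return False
        # Frequency limit: never act twice within min_interval.
        if (self.last_rebalance is not None
                and now - self.last_rebalance < self.min_interval):
            return False
        mean = sum(loads.values()) / len(loads)
        if mean == 0:
            return False
        # Dead band: ignore imbalances smaller than the threshold, so
        # sessions are not ping-ponged over small differences.
        spread = (max(loads.values()) - min(loads.values())) / mean
        if spread <= self.threshold:
            return False
        self.last_rebalance = now
        return True
```

In this sketch both parameters are plain attributes, so they could be changed while the system is running, which is one possible reading of the "real time tuning" requirement.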
[jira] Created: (ZOOKEEPER-572) add ability for operator to examine state of watches currently registered with a server
add ability for operator to examine state of watches currently registered with a server --- Key: ZOOKEEPER-572 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-572 Project: Zookeeper Issue Type: Improvement Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 3.3.0 it may be useful for an operator to examine the watches registered with a server by the various connected sessions seems useful to allow: 1) watches on a session 2) watches on a path 3) all watches? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-572) add ability for operator to examine state of watches currently registered with a server
[ https://issues.apache.org/jira/browse/ZOOKEEPER-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-572: --- Component/s: server jmx Description: it may be useful for an operator to examine the watches registered with a server by the various connected sessions seems useful to allow: 1) watches on a session 2) watches on a path 3) all watches? command port and JMX. was: it may be useful for an operator to examine the watches registered with a server by the various connected sessions seems useful to allow: 1) watches on a session 2) watches on a path 3) all watches? > add ability for operator to examine state of watches currently registered > with a server > --- > > Key: ZOOKEEPER-572 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-572 > Project: Zookeeper > Issue Type: Improvement > Components: jmx, server >Reporter: Patrick Hunt >Assignee: Patrick Hunt > Fix For: 3.3.0 > > > it may be useful for an operator to examine the watches registered with a > server by the various connected sessions > seems useful to allow: > 1) watches on a session > 2) watches on a path > 3) all watches? > command port and JMX. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-573) the dump 4letterword is not formatting sessionids in hex
the dump 4letterword is not formatting sessionids in hex Key: ZOOKEEPER-573 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-573 Project: Zookeeper Issue Type: Bug Components: server Affects Versions: 3.2.1 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 3.3.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
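For illustration, the hex rendering this issue asks for looks like the following. The `format_session_id` helper and the sample values are made up for this example and written in Python; the actual change belongs in the Java server code.

```python
def format_session_id(session_id):
    """Render a 64-bit session id as 0x-prefixed lowercase hex."""
    # Mask to 64 bits so a negative (signed) value renders as its
    # two's-complement bit pattern rather than with a minus sign.
    return f"0x{session_id & 0xFFFFFFFFFFFFFFFF:x}"
```

Printing the same id in decimal in one command and in hex in another makes it hard to correlate output, which is presumably why a consistent hex form is wanted.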
[jira] Updated: (ZOOKEEPER-566) "reqs" four letter word (command port) returns no information
[ https://issues.apache.org/jira/browse/ZOOKEEPER-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-566: --- Attachment: ZOOKEEPER-566.patch This patch removes the "reqs" command; at some point the code was refactored in such a way that this information was no longer available. Also: fixed the dump command to output sessionid as hex; added more detailed "cons" information (information on connections); added a "srvr" command that just shows server information; added "crst", which resets the connection stat information (similar to "srst" for server stats). > "reqs" four letter word (command port) returns no information > - > > Key: ZOOKEEPER-566 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-566 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Priority: Critical > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-566.patch > > > the four letter word "reqs" doesn't do anything - it always returns empty > data. Seems that "outstanding" field is always empty and never set. > we should remove outstanding and also update the reqs code to correctly > output the outstanding requests (if not possible then remove the cmd and > update docs - although this is very useful command, hate to see us lose it) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-566) "reqs" four letter word (command port) returns no information
[ https://issues.apache.org/jira/browse/ZOOKEEPER-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-566: --- Assignee: Patrick Hunt Status: Patch Available (was: Open) > "reqs" four letter word (command port) returns no information > - > > Key: ZOOKEEPER-566 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-566 > Project: Zookeeper > Issue Type: Bug >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Critical > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-566.patch > > > the four letter word "reqs" doesn't do anything - it always returns empty > data. Seems that "outstanding" field is always empty and never set. > we should remove outstanding and also update the reqs code to correctly > output the outstanding requests (if not possible then remove the cmd and > update docs - although this is very useful command, hate to see us lose it) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-368) Observers
[ https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12775956#action_12775956 ] Patrick Hunt commented on ZOOKEEPER-368: Henry, I don't see any docs for this in src/docs. I suggest that you start a new document (new xml file) for this feature, it should explain why/how(torun) at the very least -- so that potential users can come up to speed. Flavio, could you also review the comments on this JIRA as part of your commit review? We should make sure that either all of the issues are addressed, or at the very least new JIRAs are created (Henry could you do this?) for the pending items so that we don't lose the comments/concerns/issues that have been identified previously (this is a major new/visible feature so I think it warrants the extra time/effort). > Observers > - > > Key: ZOOKEEPER-368 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368 > Project: Zookeeper > Issue Type: New Feature > Components: quorum >Reporter: Flavio Paiva Junqueira >Assignee: Henry Robinson > Attachments: obs-refactor.patch, observer-refactor.patch, observers > sync benchmark.png, observers.patch, ZOOKEEPER-368.patch, > ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, > ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch > > > Currently, all servers of an ensemble participate actively in reaching > agreement on the order of ZooKeeper transactions. That is, all followers > receive proposals, acknowledge them, and receive commit messages from the > leader. A leader issues commit messages once it receives acknowledgments from > a quorum of followers. For cross-colo operation, it would be useful to have a > third role: observer. Using Paxos terminology, observers are similar to > learners. An observer does not participate actively in the agreement step of > the atomic broadcast protocol. Instead, it only commits proposals that have > been accepted by some quorum of followers. 
> One simple solution to implement observers is to have the leader forwarding > commit messages not only to followers but also to observers, and have > observers applying transactions according to the order followers agreed upon. > In the current implementation of the protocol, however, commit messages do > not carry their corresponding transaction payload because all servers > different from the leader are followers and followers receive such a payload > first through a proposal message. Just forwarding commit messages as they > currently are to an observer consequently is not sufficient. We have a couple > of options: > 1- Include the transaction payload along in commit messages to observers; > 2- Send proposals to observers as well. > Number 2 is simpler to implement because it doesn't require changing the > protocol implementation, but it increases traffic slightly. The performance > impact due to such an increase might be insignificant, though. > For scalability purposes, we may consider having followers also forwarding > commit messages to observers. With this option, observers can connect to > followers, and receive messages from followers. This choice is important to > avoid increasing the load on the leader with the number of observers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
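Option 2 above can be sketched as follows. This is a toy model of the message flow only, not the ZooKeeper implementation: the `Leader` class, its method names, and the way the leader's own vote is counted are assumptions made for the example. Followers and observers both receive the proposal (so both have the payload), but only follower acknowledgments count toward the quorum.

```python
class Leader:
    def __init__(self, followers, observers):
        self.followers = set(followers)   # voting servers
        self.observers = set(observers)   # non-voting servers
        self.acks = {}                    # zxid -> followers that acked
        self.committed = []               # zxids committed, in order

    def propose(self, zxid):
        """Return the servers that receive the proposal and its payload."""
        self.acks[zxid] = set()
        # Option 2: observers get the proposal too, so a later commit
        # message does not need to carry the transaction payload.
        return self.followers | self.observers

    def ack(self, zxid, server):
        """Record an ack; return commit recipients once a quorum is reached."""
        if server not in self.followers:
            return set()  # observers do not vote
        if zxid in self.committed:
            return set()  # already committed, nothing more to send
        self.acks[zxid].add(server)
        # Assumption: voters are the followers plus the leader itself,
        # and the leader implicitly acks its own proposal.
        voters = len(self.followers) + 1
        if len(self.acks[zxid]) + 1 > voters // 2:
            self.committed.append(zxid)
            return self.followers | self.observers
        return set()
```

Adding observers in this model changes who receives messages but not the size of the quorum, which is the property that makes the role attractive for cross-colo deployments.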
[jira] Updated: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral
[ https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-472: --- Status: Patch Available (was: Open) Is this ready? I see some updates, throwing into the patch queue, reviewer please be sure all the comments are addressed. > Making DataNode not instantiate a HashMap when the node is ephmeral > --- > > Key: ZOOKEEPER-472 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-472 > Project: Zookeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.2.0, 3.1.1 >Reporter: Erik Holstad >Assignee: Erik Holstad >Priority: Minor > Fix For: 3.3.0 > > Attachments: zookeeper-472.patch, zookeeper-472.patch, > zookeeper-472.patch, zookeeper-472.patch > > > Looking at the code, there is an overhead of a HashSet object for that nodes > children, even though the node might be an ephmeral node and cannot have > children. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-574) the documentation on snapcount in the admin guide has the wrong default
the documentation on snapcount in the admin guide has the wrong default --- Key: ZOOKEEPER-574 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-574 Project: Zookeeper Issue Type: Bug Reporter: Patrick Hunt Priority: Minor Fix For: 3.3.0 I believe it's 100k, not 10k --- snapCount (Java system property: zookeeper.snapCount) Clients can submit requests faster than ZooKeeper can process them, especially if there are a lot of clients. To prevent ZooKeeper from running out of memory due to queued requests, ZooKeeper will throttle clients so that there is no more than globalOutstandingLimit outstanding requests in the system. The default limit is 1,000. ZooKeeper logs transactions to a transaction log. After snapCount transactions are written to a log file a snapshot is started and a new transaction log file is started. The default snapCount is 10,000. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-575) remove System.exit calls to make the server more container friendly
remove System.exit calls to make the server more container friendly --- Key: ZOOKEEPER-575 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-575 Project: Zookeeper Issue Type: Improvement Components: server Reporter: Patrick Hunt Fix For: 3.3.0 There are a handful of places left in the code that still use System.exit, we should remove these to make the server more container friendly. There are some legitimate places for the exits - in *Main.java for example should be fine - these are the command line main routines. Containers should be embedding code that runs just below this layer (or we should refactor so that it would). The tricky bit is ensuring the server shuts down in case of an unrecoverable error occurring, afaik these are the locations where we still have sys exit calls. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
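The layering described above might look like the following sketch. The names (`UnrecoverableError`, `run_server`, `main`) and the `dataDir` check are hypothetical, and the code is Python rather than the Java server code; it only illustrates the idea that the embeddable layer reports fatal errors to its caller, while the command-line entry point alone turns them into a process exit status.

```python
import sys


class UnrecoverableError(Exception):
    """Raised by the server layer instead of terminating the process."""


def run_server(config):
    """Embeddable entry point: never exits the process itself."""
    if "dataDir" not in config:
        raise UnrecoverableError("dataDir is not configured")
    # ... a real server would run here until it is shut down ...
    return "stopped"


def main(config):
    """Command-line layer: the only place that chooses an exit status."""
    try:
        run_server(config)
    except UnrecoverableError as err:
        print(f"fatal: {err}", file=sys.stderr)
        return 1
    return 0
```

A script would pass the value returned by `main` to `sys.exit`, whereas a container would call `run_server` directly and decide for itself how to react to `UnrecoverableError`.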
[jira] Updated: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral
[ https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-472: --- Status: Open (was: Patch Available) > Making DataNode not instantiate a HashMap when the node is ephmeral > --- > > Key: ZOOKEEPER-472 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-472 > Project: Zookeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.2.0, 3.1.1 >Reporter: Erik Holstad >Assignee: Erik Holstad >Priority: Minor > Fix For: 3.3.0 > > Attachments: zookeeper-472.patch, zookeeper-472.patch, > zookeeper-472.patch, zookeeper-472.patch > > > Looking at the code, there is an overhead of a HashSet object for that nodes > children, even though the node might be an ephmeral node and cannot have > children. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral
[ https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-472: --- Status: Patch Available (was: Open) > Making DataNode not instantiate a HashMap when the node is ephmeral > --- > > Key: ZOOKEEPER-472 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-472 > Project: Zookeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.2.0, 3.1.1 >Reporter: Erik Holstad >Assignee: Erik Holstad >Priority: Minor > Fix For: 3.3.0 > > Attachments: zookeeper-472.patch, zookeeper-472.patch, > zookeeper-472.patch, zookeeper-472.patch, zookeeper-472.patch > > > Looking at the code, there is an overhead of a HashSet object for that nodes > children, even though the node might be an ephmeral node and cannot have > children. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral
[ https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-472: --- Attachment: zookeeper-472.patch Updated patch to compile against latest trunk. Also cleaned up some finals. Also reduced the default child hashset size to 8 rather than 16 (let's be conservative as to the avg number of subnodes). Small optimization to getchildren in datatree - allocate exactly the right size array list > Making DataNode not instantiate a HashMap when the node is ephmeral > --- > > Key: ZOOKEEPER-472 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-472 > Project: Zookeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.1.1, 3.2.0 >Reporter: Erik Holstad >Assignee: Erik Holstad >Priority: Minor > Fix For: 3.3.0 > > Attachments: zookeeper-472.patch, zookeeper-472.patch, > zookeeper-472.patch, zookeeper-472.patch, zookeeper-472.patch > > > Looking at the code, there is an overhead of a HashSet object for that nodes > children, even though the node might be an ephmeral node and cannot have > children. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
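The optimization under discussion amounts to allocating the child set lazily. The sketch below is a Python model with hypothetical method names (`add_child`, `get_children`); it is not the patch itself, which changes the Java `DataNode` class.

```python
class DataNode:
    """Tree node that allocates its child set only when first needed."""

    def __init__(self, data=b"", ephemeral=False):
        self.data = data
        self.ephemeral = ephemeral
        self.children = None  # no set allocated until a child is added

    def add_child(self, name):
        if self.ephemeral:
            raise ValueError("ephemeral nodes cannot have children")
        if self.children is None:
            self.children = set()
        self.children.add(name)

    def get_children(self):
        # Return a sorted copy; an unallocated set means "no children".
        return sorted(self.children) if self.children else []
```

With this shape an ephemeral node never pays for a collection object, and neither does any leaf node that simply has no children yet.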
[jira] Updated: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral
[ https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-472: --- Status: Open (was: Patch Available) > Making DataNode not instantiate a HashMap when the node is ephmeral > --- > > Key: ZOOKEEPER-472 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-472 > Project: Zookeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.2.0, 3.1.1 >Reporter: Erik Holstad >Assignee: Erik Holstad >Priority: Minor > Fix For: 3.3.0 > > Attachments: zookeeper-472.patch, zookeeper-472.patch, > zookeeper-472.patch, zookeeper-472.patch, zookeeper-472.patch, > zookeeper-472.patch > > > Looking at the code, there is an overhead of a HashSet object for that nodes > children, even though the node might be an ephmeral node and cannot have > children. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral
[ https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-472: --- Status: Patch Available (was: Open) > Making DataNode not instantiate a HashMap when the node is ephmeral > --- > > Key: ZOOKEEPER-472 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-472 > Project: Zookeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.2.0, 3.1.1 >Reporter: Erik Holstad >Assignee: Erik Holstad >Priority: Minor > Fix For: 3.3.0 > > Attachments: zookeeper-472.patch, zookeeper-472.patch, > zookeeper-472.patch, zookeeper-472.patch, zookeeper-472.patch, > zookeeper-472.patch > > > Looking at the code, there is an overhead of a HashSet object for that nodes > children, even though the node might be an ephmeral node and cannot have > children. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-472) Making DataNode not instantiate a HashMap when the node is ephmeral
[ https://issues.apache.org/jira/browse/ZOOKEEPER-472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776572#action_12776572 ] Patrick Hunt commented on ZOOKEEPER-472: In general we should have tests but I'm fine with the no-test in this case, this is an optimization not a bug fix and I can't think of any test we could add that would benefit, we already have good test coverage on this area. > Making DataNode not instantiate a HashMap when the node is ephmeral > --- > > Key: ZOOKEEPER-472 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-472 > Project: Zookeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.1.1, 3.2.0 >Reporter: Erik Holstad >Assignee: Erik Holstad >Priority: Minor > Fix For: 3.3.0 > > Attachments: zookeeper-472.patch, zookeeper-472.patch, > zookeeper-472.patch, zookeeper-472.patch, zookeeper-472.patch, > zookeeper-472.patch > > > Looking at the code, there is an overhead of a HashSet object for that nodes > children, even though the node might be an ephmeral node and cannot have > children. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-425) Add OSGi metadata to zookeeper.jar
[ https://issues.apache.org/jira/browse/ZOOKEEPER-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776592#action_12776592 ] Patrick Hunt commented on ZOOKEEPER-425: A dumb q: adding this to the manifest of the zk jar will have no effect on non-osgi containers (etc...) correct? Also, re the original question "Is it really necessary to put the source code in the Jar file too": notice that in the trunk we have changed things a bit since this jira was created. We still have the original jar which includes sources, but we also have an additional binary only jar (class files) in addition to separate source and javadoc jars. You will see this in the "package" target of the latest build.xml This was added for Maven -- we should be sure to include the manifest changes to those jars (just the jars containing class files I guess?) > Add OSGi metadata to zookeeper.jar > -- > > Key: ZOOKEEPER-425 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-425 > Project: Zookeeper > Issue Type: Improvement > Components: build >Affects Versions: 3.1.1 >Reporter: David Bosschaert > Attachments: MANIFEST.MF > > > After adding OSGi metadata to zookeeper.jar it can be used as both an OSGi > bundle as well as an ordinary jar file. > In the CXF/DOSGi project the buildsystem does this using the > maven-bundle-plugin: > http://svn.apache.org/repos/asf/cxf/dosgi/trunk/discovery/distributed/zookeeper-wrapper/pom.xml > The MANIFEST.MF generated by maven-bundle-plugin is attached to this bug, > this works for the CXF/DOSGi project. > If your buildsystem isn't using maven, I would advise to use bnd > (http://www.aqute.biz/Code/Bnd). BND defines its own ant task in which you > should be able to use more or less the same instructions as were used in > maven: > > ZooKeeper bundle > This bundle contains the ZooKeeper > library > org.apache.hadoop.zookeeper > 3.1.1 > * > *;version=3.1.1 > > Oh and one other thing. 
Is it really necessary to put the source code in the > Jar file too? I would put that in a separate source distribution :) > See also: > http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200905.mbox/%3c4a2009b1.3030...@yahoo-inc.com%3e -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-368) Observers
[ https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777091#action_12777091 ] Patrick Hunt commented on ZOOKEEPER-368: Last night I was thinking about this, "368 - why the disconnect?" and I believe I have figured out the underlying issue. JIRAs are primarily problem statements and resolutions (patches for the most part). In this case the solution doesn't fit the problem statement "subject: Observers". This is not observers, really it's more like "phase 1 of Observers - code changes and tests, limited functionality (UDP LE only)" with additional JIRAs to address subsequent to this patch going in. I know when I reviewed this patch, and if you look at my most recent comments, this is the mindset I had - "this is observers", but really that's not Henry's intent. That's fine from my perspective, iterative development is great, improve things but don't break existing functionality, but the JIRA description here (esp subject) doesn't fit and that's throwing people. Creating additional JIRAs would also make this more clear ("obs phase 2 adding ...", "phase 3 finalizing observers, code complete" -- whatever). Changing the subject on this JIRA would make this more clear. Ben had a good summary of next steps so I won't go through that. Flavio and Henry seem to have a plan in place to execute. So let's wrap this up boys and girls. ;-) Finally, I want to point out that if a patch takes 5 months or 5 years, if it's not ready to go in it's not ready, regardless of outside pressure. It's the contributor's responsibility to work with the committers to get a patch committed. It's the committers' responsibility to work with the contributor, review the patch, provide useful feedback and try to get the issues resolved with limited muss/fuss. Henry, you've been doing a great job on this (and support of ZK in general).
I know both you and Flavio (and the rest of the committers) have been spending a lot of time on this - thanks all! So like I said, let's wrap this up and move on. > Observers > - > > Key: ZOOKEEPER-368 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368 > Project: Zookeeper > Issue Type: New Feature > Components: quorum >Reporter: Flavio Paiva Junqueira >Assignee: Henry Robinson > Attachments: obs-refactor.patch, observer-refactor.patch, observers > sync benchmark.png, observers.patch, ZOOKEEPER-368.patch, > ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, > ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, > ZOOKEEPER-368.patch > > > Currently, all servers of an ensemble participate actively in reaching > agreement on the order of ZooKeeper transactions. That is, all followers > receive proposals, acknowledge them, and receive commit messages from the > leader. A leader issues commit messages once it receives acknowledgments from > a quorum of followers. For cross-colo operation, it would be useful to have a > third role: observer. Using Paxos terminology, observers are similar to > learners. An observer does not participate actively in the agreement step of > the atomic broadcast protocol. Instead, it only commits proposals that have > been accepted by some quorum of followers. > One simple solution to implement observers is to have the leader forwarding > commit messages not only to followers but also to observers, and have > observers applying transactions according to the order followers agreed upon. > In the current implementation of the protocol, however, commit messages do > not carry their corresponding transaction payload because all servers > different from the leader are followers and followers receive such a payload > first through a proposal message. Just forwarding commit messages as they > currently are to an observer consequently is not sufficient. 
We have a couple > of options: > 1- Include the transaction payload along in commit messages to observers; > 2- Send proposals to observers as well. > Number 2 is simpler to implement because it doesn't require changing the > protocol implementation, but it increases traffic slightly. The performance > impact due to such an increase might be insignificant, though. > For scalability purposes, we may consider having followers also forwarding > commit messages to observers. With this option, observers can connect to > followers, and receive messages from followers. This choice is important to > avoid increasing the load on the leader with the number of observers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
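The two options in the description can be compared with a toy model. This is a sketch only, not ZooKeeper's implementation: the class and method names (`Observer`, `on_proposal`, `on_commit`, `on_commit_with_payload`) are hypothetical, and zxids are plain integers. It shows why a bare commit message is not sufficient for an observer, and how each option supplies the missing payload.

```python
class Observer:
    """Toy non-voting peer: applies committed transactions, never acks."""

    def __init__(self):
        self.pending = {}   # zxid -> payload, filled only under option 2
        self.applied = []   # (zxid, payload) in commit order

    def on_proposal(self, zxid, payload):
        # Option 2: the observer also receives proposals and caches them.
        self.pending[zxid] = payload

    def on_commit(self, zxid):
        # A bare commit carries no payload, so it can only be applied
        # if the matching proposal was received earlier.
        if zxid not in self.pending:
            raise LookupError("commit for zxid %d without payload" % zxid)
        self.applied.append((zxid, self.pending.pop(zxid)))

    def on_commit_with_payload(self, zxid, payload):
        # Option 1: the commit message itself carries the transaction.
        self.applied.append((zxid, payload))


if __name__ == "__main__":
    opt2 = Observer()
    opt2.on_proposal(1, "create /a")
    opt2.on_commit(1)

    opt1 = Observer()
    opt1.on_commit_with_payload(1, "create /a")

    # Both options leave the observer with the same applied history.
    print(opt1.applied == opt2.applied)
```

Under option 2 the payload crosses the wire before the commit, which is the extra traffic the description mentions; under option 1 the observer holds no pending state at all.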
[jira] Commented: (ZOOKEEPER-550) Java Queue Recipe
[ https://issues.apache.org/jira/browse/ZOOKEEPER-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777231#action_12777231 ] Patrick Hunt commented on ZOOKEEPER-550: Good job. Great to see another recipe implemented as part of the release artifact, this will make our users' lives easier! > Java Queue Recipe > - > > Key: ZOOKEEPER-550 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-550 > Project: Zookeeper > Issue Type: New Feature > Components: java client >Affects Versions: 3.2.1 >Reporter: Steven Cheng >Assignee: Steven Cheng >Priority: Minor > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, > ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, > ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, ZOOKEEPER-550.patch > > > This patch adds a recipe for creating a distributed queue with ZooKeeper > similar to the WriteLock recipe and some sequential tests. This early > attempt follows the Java BlockingQueue interface, though it doesn't implement > it since I don't think there's a good reason for it to be Iterable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-368) Observers: core functionality
[ https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-368: --- Status: Open (was: Patch Available) > Observers: core functionality > -- > > Key: ZOOKEEPER-368 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368 > Project: Zookeeper > Issue Type: New Feature > Components: quorum >Reporter: Flavio Paiva Junqueira >Assignee: Henry Robinson > Attachments: obs-refactor.patch, observer-refactor.patch, observers > sync benchmark.png, observers.patch, ZOOKEEPER-368.patch, > ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, > ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, > ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch > > > Edit (Henry Robinson/henryr) 12/11/09: > This JIRA specifically concerns the implementation of non-voting peers called > Observers, their documentation and their tests. > Explicit goals are 1. not breaking any current ZK functionality, 2. enabling > at least one deployment scenario involving Observers, 3. documentation > describing how to use the feature and 4. tests validating the correct > behaviour of 2. > Non goals of this JIRA are 1. performance optimizations specific to > Observers, 2. compatibility with every feature of ZooKeeper (in particular > all leader election protocols), which are both to be addressed in future > JIRAs. > See http://wiki.apache.org/hadoop/ZooKeeper/Observers for more detail of use > cases, proposed design and usage. > See http://wiki.apache.org/hadoop/ZooKeeper/Observers/ReviewGuide for a brief > commentary on the current patch. > - > Currently, all servers of an ensemble participate actively in reaching > agreement on the order of ZooKeeper transactions. That is, all followers > receive proposals, acknowledge them, and receive commit messages from the > leader. A leader issues commit messages once it receives acknowledgments from > a quorum of followers. 
For cross-colo operation, it would be useful to have a > third role: observer. Using Paxos terminology, observers are similar to > learners. An observer does not participate actively in the agreement step of > the atomic broadcast protocol. Instead, it only commits proposals that have > been accepted by some quorum of followers. > One simple solution to implement observers is to have the leader forwarding > commit messages not only to followers but also to observers, and have > observers applying transactions according to the order followers agreed upon. > In the current implementation of the protocol, however, commit messages do > not carry their corresponding transaction payload because all servers > different from the leader are followers and followers receive such a payload > first through a proposal message. Just forwarding commit messages as they > currently are to an observer consequently is not sufficient. We have a couple > of options: > 1- Include the transaction payload along in commit messages to observers; > 2- Send proposals to observers as well. > Number 2 is simpler to implement because it doesn't require changing the > protocol implementation, but it increases traffic slightly. The performance > impact due to such an increase might be insignificant, though. > For scalability purposes, we may consider having followers also forwarding > commit messages to observers. With this option, observers can connect to > followers, and receive messages from followers. This choice is important to > avoid increasing the load on the leader with the number of observers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-368) Observers: core functionality
[ https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-368: --- Fix Version/s: 3.3.0 Status: Patch Available (was: Open) > Observers: core functionality > -- > > Key: ZOOKEEPER-368 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368 > Project: Zookeeper > Issue Type: New Feature > Components: quorum >Reporter: Flavio Paiva Junqueira >Assignee: Henry Robinson > Fix For: 3.3.0 > > Attachments: obs-refactor.patch, observer-refactor.patch, observers > sync benchmark.png, observers.patch, ZOOKEEPER-368.patch, > ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, > ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, > ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch > > > Edit (Henry Robinson/henryr) 12/11/09: > This JIRA specifically concerns the implementation of non-voting peers called > Observers, their documentation and their tests. > Explicit goals are 1. not breaking any current ZK functionality, 2. enabling > at least one deployment scenario involving Observers, 3. documentation > describing how to use the feature and 4. tests validating the correct > behaviour of 2. > Non goals of this JIRA are 1. performance optimizations specific to > Observers, 2. compatibility with every feature of ZooKeeper (in particular > all leader election protocols), which are both to be addressed in future > JIRAs. > See http://wiki.apache.org/hadoop/ZooKeeper/Observers for more detail of use > cases, proposed design and usage. > See http://wiki.apache.org/hadoop/ZooKeeper/Observers/ReviewGuide for a brief > commentary on the current patch. > - > Currently, all servers of an ensemble participate actively in reaching > agreement on the order of ZooKeeper transactions. That is, all followers > receive proposals, acknowledge them, and receive commit messages from the > leader. 
A leader issues commit messages once it receives acknowledgments from > a quorum of followers. For cross-colo operation, it would be useful to have a > third role: observer. Using Paxos terminology, observers are similar to > learners. An observer does not participate actively in the agreement step of > the atomic broadcast protocol. Instead, it only commits proposals that have > been accepted by some quorum of followers. > One simple solution to implement observers is to have the leader forwarding > commit messages not only to followers but also to observers, and have > observers applying transactions according to the order followers agreed upon. > In the current implementation of the protocol, however, commit messages do > not carry their corresponding transaction payload because all servers > different from the leader are followers and followers receive such a payload > first through a proposal message. Just forwarding commit messages as they > currently are to an observer consequently is not sufficient. We have a couple > of options: > 1- Include the transaction payload along in commit messages to observers; > 2- Send proposals to observers as well. > Number 2 is simpler to implement because it doesn't require changing the > protocol implementation, but it increases traffic slightly. The performance > impact due to such an increase might be insignificant, though. > For scalability purposes, we may consider having followers also forwarding > commit messages to observers. With this option, observers can connect to > followers, and receive messages from followers. This choice is important to > avoid increasing the load on the leader with the number of observers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-507) Improve error handling of BookKeeper client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-507: --- Status: Patch Available (was: Open) > Improve error handling of BookKeeper client > --- > > Key: ZOOKEEPER-507 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-507 > Project: Zookeeper > Issue Type: Improvement > Components: contrib-bookkeeper >Reporter: Flavio Paiva Junqueira >Assignee: Utkarsh Srivastava > Attachments: ZOOKEEPER-507.patch > > > Error handling is far from ideal currently in the BookKeeper client. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-507) Improve error handling of BookKeeper client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-507: --- Status: Open (was: Patch Available) > Improve error handling of BookKeeper client > --- > > Key: ZOOKEEPER-507 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-507 > Project: Zookeeper > Issue Type: Improvement > Components: contrib-bookkeeper >Reporter: Flavio Paiva Junqueira >Assignee: Utkarsh Srivastava > Attachments: ZOOKEEPER-507.patch > > > Error handling is far from ideal currently in the BookKeeper client. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-544) improve client testability - allow test client to access connected server location
[ https://issues.apache.org/jira/browse/ZOOKEEPER-544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778993#action_12778993 ] Patrick Hunt commented on ZOOKEEPER-544: Discoverability. I put testable* to make it easier for clients to write tests (rather than having to figure out how to get this). Also, exposing cnxn means it's harder to make changes to the implementation, as the user code will be closely coupled to the guts of cnxn. This allows for changes under the covers (for common use cases at least). > improve client testability - allow test client to access connected server > location > -- > > Key: ZOOKEEPER-544 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-544 > Project: Zookeeper > Issue Type: Improvement > Components: c client, java client, tests >Reporter: Patrick Hunt >Assignee: Patrick Hunt > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-544.patch > > > This came up recently on the user list. If you are developing tests for your > zk client you need to be able to access the server that your > session is currently connected to. The reason is that your test needs to know > which server in the quorum to shut down in order to > verify you are handling failover correctly. Similar for session expiration > testing. > however we should be careful, we prefer not to expose this to all clients, > this is an implementation detail that we typically > want to hide. > also we should provide this in both the c and java clients > I suspect we should add a protected method on ZooKeeper. This will make a > higher bar (user will have to subclass) for > the user to access this method. In tests it's fine, typically you want a > "TestableZooKeeper" class anyway. In c we unfortunately > have fewer options, we can just rely on docs for now. > In both cases (c/java) we need to be very very clear in the docs that this is > for testing only and to clearly define semantics. 
> We should add the following at the same time: > toString() method to ZooKeeper which includes server ip/port, client port, > any other information deemed useful (connection stats like send/recv?) > the java ZooKeeper is missing "deterministic connection order" that the c > client has. this is also useful for testing. again, protected and > clear docs that this is for testing purposes only! > Any other things we should expose? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing
[ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-582: --- Priority: Blocker (was: Major) Affects Version/s: 3.1.1 Fix Version/s: 3.1.2 > ZooKeeper can revert to old data when a snapshot is created outside of normal > processing > > > Key: ZOOKEEPER-582 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582 > Project: Zookeeper > Issue Type: Bug > Components: server >Affects Versions: 3.1.1, 3.2.1 >Reporter: Benjamin Reed >Priority: Blocker > Fix For: 3.2.2, 3.1.2 > > > when zookeeper starts up it will restore the most recent state (latest zxid) > it finds in the data directory. unfortunately, in the quorum version of > zookeeper updates are logged using an epoch based on the latest log file in a > directory. if there is a snapshot with a higher epoch than the log files, the > zookeeper server will start logging using an epoch one higher than the > highest log file. > so if a data directory has a snapshot with an epoch of 27 and there are no > log files, zookeeper will start logging changes using epoch 1. if the cluster > restarts the state will be restored from the snapshot with the epoch of 27, > which in effect, restores old data. > normal operation of zookeeper will never result in this situation. > this does not affect standalone zookeeper. > a fix should make sure to use an epoch one higher than the current state, > whether it comes from the snapshot or log, and should include a sanity check > to make sure that a follower never connects to a leader that has a lower > epoch than its own. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
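The faulty epoch choice in the description, and the fix proposed in its last paragraph, can be sketched as follows. This is an illustration only: `next_epoch_buggy`, `next_epoch_fixed` and `may_follow` are hypothetical helpers, not ZooKeeper code, and the inputs are simply lists of the epoch numbers found in a data directory.

```python
def next_epoch_buggy(log_epochs, snapshot_epochs):
    """Epoch derived from log files only, as the issue describes.

    snapshot_epochs is deliberately ignored to mirror the bug.
    """
    return max(log_epochs, default=0) + 1


def next_epoch_fixed(log_epochs, snapshot_epochs):
    """Epoch one higher than the current state, whether snapshot or log."""
    return max(list(log_epochs) + list(snapshot_epochs), default=0) + 1


def may_follow(follower_epoch, leader_epoch):
    """Sanity check: a follower never connects to a leader with a lower epoch."""
    return leader_epoch >= follower_epoch


if __name__ == "__main__":
    # Snapshot at epoch 27, no log files: the case from the description.
    print(next_epoch_buggy([], [27]))
    print(next_epoch_fixed([], [27]))
```

With a snapshot at epoch 27 and no log files, the first helper yields 1 (so new changes are later shadowed by the epoch 27 snapshot on restart) and the second yields 28.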
[jira] Commented: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing
[ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779149#action_12779149 ] Patrick Hunt commented on ZOOKEEPER-582:

As Ben mentioned we will never see this situation during normal operation of ZK. The case where we did see this was a result of a user running the migration tool that we provide to upgrade from version 2 to version 3 of ZooKeeper. The tool migrates the data by writing a single snapshot file where the zxid is maintained (it does not write a log file). This produces the scenario Ben mentioned (snap with no associated log file), which can cause this bug to occur.

If you have run the migration tool, documented here: http://hadoop.apache.org/zookeeper/docs/r3.0.0/releasenotes.html#migration_data you can verify whether or not you have this situation by looking at your ZooKeeper data directory. Here's an example:

-rw-r--r-- 1 root search 67108880 Nov 17 19:31 log.300022b61
-rw-r--r-- 1 root search 67108880 Nov 17 19:38 log.3000292d0
-rw-r--r-- 1 root search 3646608 Nov 5 12:13 snapshot.1db5df6e2d6
-rw-r--r-- 1 root search 3616579 Nov 17 19:31 snapshot.3000292c9
-rw-r--r-- 1 root search 3616708 Nov 17 19:38 snapshot.300038d32

where the file name suffix is the epoch followed by the xid, both being 4 byte values represented as hex.

Notice that snapshot.1db5df6e2d6 has an epoch of 0x1db, while the other files have an epoch of 0x3. This is the scenario described in the description of this JIRA (there is no log file associated with epoch 0x1db).

If you see this in your datadir - a snapshot with an epoch where there are no log files with this same epoch - then this bug pertains. If you see snapshots of a particular epoch and log files with the same epoch then this bug does NOT pertain.
> ZooKeeper can revert to old data when a snapshot is created outside of normal > processing > > > Key: ZOOKEEPER-582 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582 > Project: Zookeeper > Issue Type: Bug > Components: server >Affects Versions: 3.1.1, 3.2.1 >Reporter: Benjamin Reed >Priority: Blocker > Fix For: 3.2.2, 3.1.2 > > > when zookeeper starts up it will restore the most recent state (latest zxid) > it finds in the data directory. unfortunately, in the quorum version of > zookeeper updates are logged using an epoch based on the latest log file in a > directory. if there is a snapshot with a higher epoch than the log files, the > zookeeper server will start logging using an epoch one higher than the > highest log file. > so if a data directory has a snapshot with an epoch of 27 and there are no > log files, zookeeper will start logging changes using epoch 1. if the cluster > restarts the state will be restored from the snapshot with the epoch of 27, > which in effect, restores old data. > normal operation of zookeeper will never result in this situation. > this does not affect standalone zookeeper. > a fix should make sure to use an epoch one higher than the current state, > whether it comes from the snapshot or log, and should include a sanity check > to make sure that a follower never connects to a leader that has a lower > epoch than its own. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
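The manual check described in the comment (look for a snapshot whose epoch has no log file) can be automated along these lines. This is a minimal sketch, assuming the file names look like those in the example listing and that the epoch is the upper 4 bytes of the hex suffix; `epoch_of` and `orphan_snapshot_epochs` are hypothetical helper names, not part of any ZooKeeper tool.

```python
import re

# Matches names such as "log.300022b61" or "snapshot.1db5df6e2d6".
_NAME = re.compile(r"^(log|snapshot)\.([0-9a-fA-F]+)$")


def epoch_of(filename):
    """Return (kind, epoch) for a log/snapshot file name, else None."""
    m = _NAME.match(filename)
    if m is None:
        return None
    zxid = int(m.group(2), 16)
    return m.group(1), zxid >> 32   # upper 4 bytes hold the epoch


def orphan_snapshot_epochs(filenames):
    """Return the set of epochs that have a snapshot but no log file."""
    logs, snaps = set(), set()
    for name in filenames:
        parsed = epoch_of(name)
        if parsed is None:
            continue            # ignore unrelated files
        kind, epoch = parsed
        (logs if kind == "log" else snaps).add(epoch)
    return snaps - logs


if __name__ == "__main__":
    listing = [
        "log.300022b61",
        "log.3000292d0",
        "snapshot.1db5df6e2d6",
        "snapshot.3000292c9",
        "snapshot.300038d32",
    ]
    for epoch in sorted(orphan_snapshot_epochs(listing)):
        print("snapshot epoch with no log file: 0x%x" % epoch)
```

For the listing from the comment this reports epoch 0x1db only. To check a real installation, pass the names from `os.listdir()` of the directory that holds these files.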
[jira] Created: (ZOOKEEPER-583) on resync client should generate session expired exception if there is no server in cluster with acceptable zxid
on resync client should generate session expired exception if there is no server in cluster with acceptable zxid

Key: ZOOKEEPER-583
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-583
Project: Zookeeper
Issue Type: Bug
Components: c client, java client
Affects Versions: 3.2.1, 3.1.1
Reporter: Patrick Hunt
Fix For: 3.3.0

Both the c and java clients attempt to connect to a server in the cluster by iterating through a randomized list of servers as listed in the connect string passed to the zookeeper_init (c) or ZooKeeper constructor (java). The clients do this indefinitely, until successfully connecting to a server or until the client is close()ed.

Additionally, if a client is disconnected from a server it will attempt to reconnect to another server in the cluster. In this case it will only connect to a server that has the same, or higher, zxid as seen by the client on the previous server that it was connected to (this ensures that the client never sees old data).

In some weird cases (in particular where operators reset the server database, clearing out the existing snapshots and txnlogs) existing clients will now see a much lower zxid (due to the epoch number being reset) regardless of the server that the client attempts to connect to. In this case the current client will iterate essentially forever.

Instead the client should throw session expired (notify any watchers). After iterating through all of the servers in the list, if none of the servers have an acceptable zxid the client should expire the session and shut down the handle. This will ensure that the client will eventually shut down in this unusual, but possible (esp with server operators who don't also control the clients), situation.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
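The proposed behaviour can be sketched as a single pass over the server list. This is a toy model, not the c or java client code: `try_one_pass` and `SessionExpired` are hypothetical names, each server is a `(name, reachable, zxid)` tuple standing in for a real connection attempt, and the randomization of the list is left out. It also assumes that unreachable servers should not count towards expiry, which the issue does not spell out.

```python
class SessionExpired(Exception):
    """Raised when no server in the list has an acceptable zxid."""


def try_one_pass(servers, last_seen_zxid):
    """Try each server once.

    Returns the name of the first reachable server whose zxid is at
    least last_seen_zxid. Returns None if nothing acceptable was found
    but some server was unreachable (the caller may retry). Raises
    SessionExpired if every server answered and all of them are behind
    the client.
    """
    behind = 0
    for name, reachable, zxid in servers:
        if not reachable:
            continue
        if zxid >= last_seen_zxid:
            return name
        behind += 1
    if servers and behind == len(servers):
        raise SessionExpired("no server has zxid >= %#x" % last_seen_zxid)
    return None


if __name__ == "__main__":
    # Server database was reset: every server is far behind the client.
    reset_cluster = [("s1", True, 0x100000001), ("s2", True, 0x100000002)]
    try:
        try_one_pass(reset_cluster, 0x1db00000000)
    except SessionExpired as exc:
        print("session expired:", exc)
```

Raising only when every server answered keeps today's retry-forever behaviour for ordinary outages, while still ending the loop in the reset-database case.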
[jira] Updated: (ZOOKEEPER-540) zkpython needs better tracking of handle validity
[ https://issues.apache.org/jira/browse/ZOOKEEPER-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-540: --- Fix Version/s: 3.2.2 > zkpython needs better tracking of handle validity > - > > Key: ZOOKEEPER-540 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-540 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bindings >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Henry Robinson > Fix For: 3.2.2, 3.3.0 > > > I was getting a python segfault in one of my scripts. Turns out I was closing > a session handle and then reusing it (async call). This was causing python to > segfault. > zkpython should track handle state and complain, rather than crash, if the > handle is invalid (closed). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
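The requested behaviour (complain rather than crash when a closed handle is reused) amounts to a guard in front of every call. The sketch below is plain Python and only illustrates the idea; the real binding is a C extension, and `Session`, `ClosedHandleError` and `submit` are hypothetical names rather than the zkpython API.

```python
class ClosedHandleError(Exception):
    """Raised when a call is made on a handle that was already closed."""


class Session:
    """Toy session wrapper that tracks whether its handle is still valid."""

    def __init__(self):
        self._closed = False

    def close(self):
        # Closing twice is treated as harmless in this sketch.
        self._closed = True

    def _require_open(self):
        if self._closed:
            raise ClosedHandleError("handle is closed")

    def submit(self, path):
        # Every operation checks validity before touching native state.
        self._require_open()
        return "queued %s" % path


if __name__ == "__main__":
    s = Session()
    print(s.submit("/a"))
    s.close()
    try:
        s.submit("/a")
    except ClosedHandleError as exc:
        print("rejected:", exc)
```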
[jira] Updated: (ZOOKEEPER-554) zkpython can segfault when statting a deleted node
[ https://issues.apache.org/jira/browse/ZOOKEEPER-554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-554: --- Fix Version/s: 3.2.2 > zkpython can segfault when statting a deleted node > -- > > Key: ZOOKEEPER-554 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-554 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bindings >Reporter: Henry Robinson >Assignee: Henry Robinson > Fix For: 3.2.2, 3.3.0 > > Attachments: ZOOKEEPER-554.patch > > > C client returns NULL for stat object for deleted nodes. zookeeper.c blindly > dereferences it. Segfault. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-541) zkpython limited to 256 handles
[ https://issues.apache.org/jira/browse/ZOOKEEPER-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-541: --- Fix Version/s: 3.2.2 > zkpython limited to 256 handles > --- > > Key: ZOOKEEPER-541 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-541 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bindings >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Henry Robinson > Fix For: 3.2.2, 3.3.0 > > Attachments: ZOOKEEPER-541.patch > > > zkpython is currently limited to a max of 256 total handles - not 256 open > handles, but rather 256 total handles created > over the lifetime of the python application. > In general this isn't a real issue, however in the case of a long lived > application which polls the cluster periodically (closing > the session btw calls) this is an issue. > it would be great if the slots could be reused? or perhaps a more complex > structure, such as a linked list, which would allow > dynamic growth/shrinkage of the handle list. > Also see ZOOKEEPER-540 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
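The first suggestion in the description, reusing slots, can be sketched with a fixed-size table plus a free list. This is only an illustration in Python of the bookkeeping involved; `SlotTable`, `acquire` and `release` are hypothetical names, and the actual binding keeps its handle table in C.

```python
class SlotTable:
    """Fixed-size handle table that reuses slots freed by release()."""

    def __init__(self, size=256):
        self._slots = [None] * size
        # Reversed so that pop() hands out slot 0 first.
        self._free = list(range(size - 1, -1, -1))

    def acquire(self, session):
        """Store session in a free slot and return the slot index."""
        if not self._free:
            raise RuntimeError("no free handle slots")
        idx = self._free.pop()
        self._slots[idx] = session
        return idx

    def release(self, idx):
        """Free a slot so that a later acquire() can reuse it."""
        if self._slots[idx] is None:
            raise ValueError("handle %d is not open" % idx)
        self._slots[idx] = None
        self._free.append(idx)


if __name__ == "__main__":
    table = SlotTable()
    # A long-lived poller: far more than 256 sessions over its lifetime,
    # but never more than one open at a time.
    for i in range(1000):
        handle = table.acquire("session-%d" % i)
        table.release(handle)
    print("last handle used:", handle)
```

With reuse, the limit applies to concurrently open handles rather than to the total created over the life of the process. The `release` check also covers the closed-handle case raised in ZOOKEEPER-540.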
[jira] Updated: (ZOOKEEPER-510) zkpython lumps all exceptions as IOError, needs specialized exceptions for KeeperException types
[ https://issues.apache.org/jira/browse/ZOOKEEPER-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-510: --- Fix Version/s: 3.2.2 > zkpython lumps all exceptions as IOError, needs specialized exceptions for > KeeperException types > > > Key: ZOOKEEPER-510 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-510 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bindings >Affects Versions: 3.2.0 >Reporter: Patrick Hunt >Assignee: Henry Robinson > Fix For: 3.2.2, 3.3.0 > > Attachments: ZOOKEEPER-510-incref.patch, ZOOKEEPER-510.patch, > ZOOKEEPER-510.patch, ZOOKEEPER-510.patch, ZOOKEEPER-510.patch > > > The current zkpython bindings always throw "IOError("text")" exceptions, even > for ZK specific exceptions such as NODEEXISTS. This makes it difficult (error > prone) to handle exceptions in python code. You can't easily pickup a > connection loss vs a node exists for example. Of course you could match the > error string, but this seems like a bad idea imo. > We need to add specific exception types to the python binding that map > directly to KeeperException/java types. It would also be useful to include > the information provided by the KeeperException (like path in some cases), > etc... as part of the error thrown to the python code. Would probably be a > good idea to stay as close to java api as possible wrt mapping the errors. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
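One way to get specific exception types is a lookup from the C client's return code to an exception class. The sketch below is an assumption-laden illustration, not the patch attached to this issue: the class names and `error_for` are hypothetical, and the numeric codes are the `ZCONNECTIONLOSS`, `ZNONODE` and `ZNODEEXISTS` values from the C client's `zookeeper.h` as I recall them, so check them against the header before relying on them.

```python
class ZooKeeperError(Exception):
    """Base class; carries the numeric code and, where known, the path."""

    def __init__(self, code, path=None):
        Exception.__init__(self, "error %d (path=%r)" % (code, path))
        self.code = code
        self.path = path


class ConnectionLossError(ZooKeeperError):
    pass


class NoNodeError(ZooKeeperError):
    pass


class NodeExistsError(ZooKeeperError):
    pass


# Return codes assumed to match the C client's zookeeper.h.
ZCONNECTIONLOSS = -4
ZNONODE = -101
ZNODEEXISTS = -110

_BY_CODE = {
    ZCONNECTIONLOSS: ConnectionLossError,
    ZNONODE: NoNodeError,
    ZNODEEXISTS: NodeExistsError,
}


def error_for(code, path=None):
    """Build the most specific exception available for a return code."""
    cls = _BY_CODE.get(code, ZooKeeperError)
    return cls(code, path)


if __name__ == "__main__":
    try:
        raise error_for(ZNODEEXISTS, "/app/lock")
    except NodeExistsError as exc:
        print("node already exists:", exc.path)
    except ConnectionLossError:
        print("connection lost, retry")
```

Callers can then separate a connection loss from a node-exists error with ordinary `except` clauses instead of matching error strings, and unknown codes still surface as the base class.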
[jira] Updated: (ZOOKEEPER-538) zookeeper.async causes python to segfault
[ https://issues.apache.org/jira/browse/ZOOKEEPER-538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-538: --- Fix Version/s: 3.2.2 > zookeeper.async causes python to segfault > - > > Key: ZOOKEEPER-538 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-538 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bindings >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Henry Robinson >Priority: Critical > Fix For: 3.2.2, 3.3.0 > > Attachments: callback.patch, callback.patch > > > Henry, can you take a look at this, am I doing it right? > calling > zookeeper.async(self.handle, path) > causes python to segfault. > see: http://github.com/phunt/zk-smoketest/blob/master/zk-smoketest.py -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-420) build/test should not require install in zkpython
[ https://issues.apache.org/jira/browse/ZOOKEEPER-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-420: --- Fix Version/s: 3.2.2 > build/test should not require install in zkpython > - > > Key: ZOOKEEPER-420 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-420 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bindings >Reporter: Patrick Hunt >Assignee: Henry Robinson > Fix For: 3.2.2, 3.3.0 > > Attachments: build.jpg, ZOOKEEPER-420.patch, ZOOKEEPER-420.patch, > ZOOKEEPER-420.patch, ZOOKEEPER-420.patch, ZOOKEEPER-420.patch > > > Currently you cannot just build and test the zkpython contrib, you need to > actually install the zookeeper client c library as well > as the zkpython lib itself. > There really needs to be 2 steps: > 1) build/test zkpython "encapsulated" within the src repository, there should > be no requirement to actually install anything > (this is esp the case for automated processes and for review by PMC during > release time for example) > 2) build an egg that can be distributed/installed by end user -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing
[ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reassigned ZOOKEEPER-582: -- Assignee: Mahadev konar > ZooKeeper can revert to old data when a snapshot is created outside of normal > processing > > > Key: ZOOKEEPER-582 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582 > Project: Zookeeper > Issue Type: Bug > Components: server >Affects Versions: 3.1.1, 3.2.1 >Reporter: Benjamin Reed >Assignee: Mahadev konar >Priority: Blocker > Fix For: 3.2.2, 3.1.2 > > Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, > ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch > > > when zookeeper starts up it will restore the most recent state (latest zxid) > it finds in the data directory. unfortunately, in the quorum version of > zookeeper updates are logged using an epoch based on the latest log file in a > directory. if there is a snapshot with a higher epoch than the log files, the > zookeeper server will start logging using an epoch one higher than the > highest log file. > so if a data directory has a snapshot with an epoch of 27 and there are no > log files, zookeeper will start logging changes using epoch 1. if the cluster > restarts the state will be restored from the snapshot with the epoch of 27, > which in effect, restores old data. > normal operation of zookeeper will never result in this situation. > this does not affect standalone zookeeper. > a fix should make sure to use an epoch one higher than the current state, > whether it comes from the snapshot or log, and should include a sanity check > to make sure that a follower never connects to a leader that has a lower > epoch than its own. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing
[ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-582:
-----------------------------------

    Fix Version/s: 3.3.0

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-582
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
> Project: Zookeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.1.1, 3.2.1
> Reporter: Benjamin Reed
> Assignee: Mahadev konar
> Priority: Blocker
> Fix For: 3.2.2, 3.3.0, 3.1.2
> Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
> When ZooKeeper starts up it restores the most recent state (latest zxid) it finds in the data directory. Unfortunately, in the quorum version of ZooKeeper, updates are logged using an epoch based on the latest log file in a directory. If there is a snapshot with a higher epoch than the log files, the ZooKeeper server will start logging using an epoch one higher than the highest log file.
> So if a data directory has a snapshot with an epoch of 27 and there are no log files, ZooKeeper will start logging changes using epoch 1. If the cluster restarts, the state will be restored from the snapshot with the epoch of 27, which in effect restores old data.
> Normal operation of ZooKeeper will never result in this situation.
> This does not affect standalone ZooKeeper.
> A fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.
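The fix requested in ZOOKEEPER-582 (use an epoch one higher than the current state, whether that state comes from a snapshot or a log, and refuse a leader with a lower epoch) can be sketched roughly as follows. This is only an illustration, not the attached patch: the function names are hypothetical, and it assumes the usual zxid layout of a 64-bit value with the epoch in the high 32 bits and a counter in the low 32 bits.

```python
EPOCH_SHIFT = 32  # assumed zxid layout: high 32 bits = epoch, low 32 bits = counter


def epoch_of(zxid):
    """Extract the epoch from a 64-bit zxid."""
    return zxid >> EPOCH_SHIFT


def make_zxid(epoch, counter=0):
    """Build a zxid from an epoch and a counter (helper for the example)."""
    return (epoch << EPOCH_SHIFT) | counter


def next_epoch(snapshot_zxids, log_zxids):
    """Pick the epoch to log with: one higher than the highest epoch found
    in either the snapshots or the log files (hypothetical helper)."""
    highest = max(list(snapshot_zxids) + list(log_zxids), default=0)
    return epoch_of(highest) + 1


def follower_may_connect(follower_zxid, leader_zxid):
    """Sanity check: a follower must not follow a leader whose epoch is
    lower than its own."""
    return epoch_of(leader_zxid) >= epoch_of(follower_zxid)
```

With a snapshot at epoch 27 and no log files, `next_epoch([make_zxid(27, 5)], [])` yields 28 rather than 1, which is the behavior the report asks for.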
[jira] Commented: (ZOOKEEPER-425) Add OSGi metadata to zookeeper.jar
[ https://issues.apache.org/jira/browse/ZOOKEEPER-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780227#action_12780227 ] Patrick Hunt commented on ZOOKEEPER-425: 1) would go into build.xml 2) & 3) would be in a contrib package, correct? src/contrib/osgi -- or is this stuff typically implementation specific such as src/contrib/felix ? (assuming not the latter) > Add OSGi metadata to zookeeper.jar > -- > > Key: ZOOKEEPER-425 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-425 > Project: Zookeeper > Issue Type: Improvement > Components: build >Affects Versions: 3.1.1 >Reporter: David Bosschaert > Attachments: MANIFEST.MF, zk_patch3.patch > > > After adding OSGi metadata to zookeeper.jar it can be used as both an OSGi > bundle as well as an ordinary jar file. > In the CXF/DOSGi project the buildsystem does this using the > maven-bundle-plugin: > http://svn.apache.org/repos/asf/cxf/dosgi/trunk/discovery/distributed/zookeeper-wrapper/pom.xml > The MANIFEST.MF generated by maven-bundle-plugin is attached to this bug, > this works for the CXF/DOSGi project. > If your buildsystem isn't using maven, I would advise to use bnd > (http://www.aqute.biz/Code/Bnd). BND defines its own ant task in which you > should be able to use more or less the same instructions as were used in > maven: > > ZooKeeper bundle > This bundle contains the ZooKeeper > library > org.apache.hadoop.zookeeper > 3.1.1 > * > *;version=3.1.1 > > Oh and one other thing. Is it really necessary to put the source code in the > Jar file too? I would put that in a separate source distribution :) > See also: > http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200905.mbox/%3c4a2009b1.3030...@yahoo-inc.com%3e -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-562) c client can flood server with pings if tcp send queue filled
[ https://issues.apache.org/jira/browse/ZOOKEEPER-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-562: --- Fix Version/s: 3.1.2 > c client can flood server with pings if tcp send queue filled > - > > Key: ZOOKEEPER-562 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-562 > Project: Zookeeper > Issue Type: Bug > Components: c client >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Benjamin Reed >Priority: Blocker > Fix For: 3.2.2, 3.3.0, 3.1.2 > > Attachments: ZOOKEEPER-562.patch > > > The c client can flood the server with pings if the tcp queue is filled. > Say the cluster is overloaded and shuts down the recv processing > a c client can send a ping, but since last_send is only updated on successful > pushing of data into the > socket, if flush_send_queue fails to send any data (send_buffer returns 0) > then last_send is not updated > and zookeeper_interest will again send a ping the next time it is woken - > which could be 0 if recv_to is close > to 0, easily could happen if server is not sending data to the client. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
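The ping flood described in ZOOKEEPER-562 can be reproduced with a toy model. The sketch below is not the C client code and not the attached patch: `PingScheduler` and its fields are hypothetical, and remembering the last ping attempt is just one possible remedy, assumed here for illustration. It shows why a keep-alive decision based only on the last successful send queues a ping on every wakeup while the socket is blocked.

```python
class PingScheduler:
    """Toy model of the keep-alive decision (hypothetical, for illustration)."""

    def __init__(self, ping_interval_ms, count_attempts):
        self.ping_interval_ms = ping_interval_ms
        self.count_attempts = count_attempts  # True = remember ping attempts too
        self.last_send = 0           # time of the last successful socket write
        self.last_ping_attempt = 0   # time a ping was last queued
        self.pings_queued = 0

    def on_wakeup(self, now_ms, socket_writable):
        last_activity = self.last_send
        if self.count_attempts:
            last_activity = max(self.last_send, self.last_ping_attempt)
        if now_ms - last_activity >= self.ping_interval_ms:
            self.pings_queued += 1
            self.last_ping_attempt = now_ms
            if socket_writable:
                # only a successful write advances last_send, as in the report
                self.last_send = now_ms


def simulate(count_attempts):
    """Wake every 10 ms for one second while the send queue is full."""
    scheduler = PingScheduler(ping_interval_ms=1000, count_attempts=count_attempts)
    for now in range(1000, 2001, 10):
        scheduler.on_wakeup(now, socket_writable=False)
    return scheduler.pings_queued
```

In this model `simulate(False)` queues a ping on each of the 101 wakeups, while `simulate(True)` queues only 2.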
[jira] Updated: (ZOOKEEPER-570) AsyncHammerTest is broken, callbacks need to validate rc parameter
[ https://issues.apache.org/jira/browse/ZOOKEEPER-570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-570:
-----------------------------------

    Fix Version/s: (was: 3.2.2)

Dropping this from 3.2.2. It's not a bug fix (well, it is a fix to a test), and it depends on refactored code from 3.3; it's not clear how this could be backported easily or successfully.

> AsyncHammerTest is broken, callbacks need to validate rc parameter
> ------------------------------------------------------------------
>
> Key: ZOOKEEPER-570
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-570
> Project: Zookeeper
> Issue Type: Bug
> Components: tests
> Affects Versions: 3.2.1
> Reporter: Patrick Hunt
> Assignee: Patrick Hunt
> Priority: Critical
> Fix For: 3.3.0
> Attachments: ZOOKEEPER-570.patch, ZOOKEEPER-570.patch
>
> The AsyncHammerTest is not validating the rc in the callback. More seriously, it is using path in the create callback to delete the node, rather than name (which matters for sequential node creation, as in this case).
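The two problems named in ZOOKEEPER-570 (ignoring rc, and deleting by path instead of name) can be illustrated with a small sketch. It is not the test's actual code: `on_create`, `delete_fn` and `errors` are hypothetical names, and the argument order simply mirrors a create callback that receives the return code, the requested path, a context object and the name actually created.

```python
OK = 0  # success return code


def on_create(rc, path, ctx, name, delete_fn, errors):
    """Create-callback sketch: validate rc first, then delete by *name*.

    For a sequential node the server appends a sequence number, so `name`
    (what was actually created) differs from `path` (what was requested).
    """
    if rc != OK:
        errors.append((rc, path))  # record the failure instead of ignoring it
        return
    delete_fn(name)


# Usage: the requested path is "/test-", the created node is "/test-0000000001".
deleted, errors = [], []
on_create(OK, "/test-", None, "/test-0000000001", deleted.append, errors)
on_create(-4, "/test-", None, None, deleted.append, errors)  # a failed create
```

After these two calls `deleted` holds only the sequential name and `errors` holds the failed request, so a failure is visible to the test rather than silently dropped.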
[jira] Resolved: (ZOOKEEPER-585) Update README for zkpython in 3.2.2
[ https://issues.apache.org/jira/browse/ZOOKEEPER-585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt resolved ZOOKEEPER-585. Resolution: Fixed Hadoop Flags: [Reviewed] +1, Looks good, added to 3.2.2 > Update README for zkpython in 3.2.2 > --- > > Key: ZOOKEEPER-585 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-585 > Project: Zookeeper > Issue Type: Improvement > Components: contrib-bindings >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Fix For: 3.2.2 > > Attachments: ZOOKEEPER-585.patch > > > zkpython has a few improvements going into 3.2.2, and its README needs a > short update to reflect this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-576) docs need to be updated for session moved exception and how to handle it
[ https://issues.apache.org/jira/browse/ZOOKEEPER-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-576: --- Attachment: ZOOKEEPER-576.patch Updated patch to note this is new in 3.2.0 > docs need to be updated for session moved exception and how to handle it > > > Key: ZOOKEEPER-576 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-576 > Project: Zookeeper > Issue Type: Bug >Reporter: Mahadev konar >Assignee: Benjamin Reed > Fix For: 3.2.2, 3.3.0 > > Attachments: ZOOKEEPER-576.patch, ZOOKEEPER-576.patch > > > the handling and implications of session moved exception should be documented. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-576) docs need to be updated for session moved exception and how to handle it
[ https://issues.apache.org/jira/browse/ZOOKEEPER-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-576: --- Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) +1, looks good, added to 3.2.2 and trunk > docs need to be updated for session moved exception and how to handle it > > > Key: ZOOKEEPER-576 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-576 > Project: Zookeeper > Issue Type: Bug >Reporter: Mahadev konar >Assignee: Benjamin Reed > Fix For: 3.2.2, 3.3.0 > > Attachments: ZOOKEEPER-576.patch, ZOOKEEPER-576.patch > > > the handling and implications of session moved exception should be documented. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-586) c client does not compile under cygwin
[ https://issues.apache.org/jira/browse/ZOOKEEPER-586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-586: --- Attachment: ZOOKEEPER-586.patch c client compiles with this patch but the tests fail > c client does not compile under cygwin > -- > > Key: ZOOKEEPER-586 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-586 > Project: Zookeeper > Issue Type: Bug > Components: c client >Affects Versions: 3.2.1 >Reporter: Patrick Hunt > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-586.patch > > > the c client fails to compile under cygwin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-586) c client does not compile under cygwin
c client does not compile under cygwin -- Key: ZOOKEEPER-586 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-586 Project: Zookeeper Issue Type: Bug Components: c client Affects Versions: 3.2.1 Reporter: Patrick Hunt Fix For: 3.3.0 Attachments: ZOOKEEPER-586.patch the c client fails to compile under cygwin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-587) client should log timeout negotiated with server
client should log timeout negotiated with server Key: ZOOKEEPER-587 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-587 Project: Zookeeper Issue Type: Bug Components: c client, java client Affects Versions: 3.2.1 Reporter: Patrick Hunt Fix For: 3.3.0 The ZK client should log the timeout negotiated with the server if the time is different than the timeout parameter specified by the client. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-588) remove unnecessary/annoying log of tostring error in Request.toString()
remove unnecessary/annoying log of tostring error in Request.toString() --- Key: ZOOKEEPER-588 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-588 Project: Zookeeper Issue Type: Bug Components: server Affects Versions: 3.2.1 Reporter: Patrick Hunt Priority: Minor Fix For: 3.3.0 Why are we logging this? It's unnecessary and just annoying afaict. We should remove it entirely. 2009-11-18 05:37:29,312 WARN org.apache.zookeeper.server.Request: Ignoring exception during toString java.nio.BufferUnderflowException at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127) at java.nio.ByteBuffer.get(ByteBuffer.java:675) at org.apache.zookeeper.server.Request.toString(Request.java:199) at java.lang.String.valueOf(String.java:2827) at java.lang.StringBuilder.append(StringBuilder.java:115) at org.apache.zookeeper.server.quorum.CommitProcessor.processRequest(CommitProcessor.java:167) at org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run(FollowerRequestProcessor.java:68) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-590) review logging to ensure that session related messages include session id
review logging to ensure that session related messages include session id
-------------------------------------------------------------------------

Key: ZOOKEEPER-590
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-590
Project: Zookeeper
Issue Type: Bug
Components: server
Reporter: Patrick Hunt
Fix For: 3.3.0

When the server logs session-related messages it must include the session id in hex form. This greatly simplifies debugging, since a message can then be related back to a particular session. Otherwise there's too much going on and there is no way to determine which messages are related to a particular session.
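A minimal sketch of the convention requested in ZOOKEEPER-590, with the session id printed in hex at the front of every session-related message. The helper name and message layout are hypothetical. The mask assumes session ids are 64-bit values, so that a negative signed value still prints as 16 hex digits.

```python
def format_session_msg(session_id, message):
    """Prefix a log message with the session id in hex form."""
    unsigned = session_id & 0xFFFFFFFFFFFFFFFF  # treat as unsigned 64-bit
    return "session 0x%x: %s" % (unsigned, message)


line = format_session_msg(0x124F0A1B2C3D0001, "expired")
```

Using one helper for every session message keeps the prefix uniform, so a single text search on the hex id finds all messages for that session.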
[jira] Updated: (ZOOKEEPER-587) client should log timeout negotiated with server
[ https://issues.apache.org/jira/browse/ZOOKEEPER-587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-587: --- Assignee: Patrick Hunt Status: Patch Available (was: Open) > client should log timeout negotiated with server > > > Key: ZOOKEEPER-587 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-587 > Project: Zookeeper > Issue Type: Bug > Components: c client, java client >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Patrick Hunt > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-587.patch > > > The ZK client should log the timeout negotiated with the server if the time > is different than the timeout parameter specified by the client. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-587) client should log timeout negotiated with server
[ https://issues.apache.org/jira/browse/ZOOKEEPER-587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-587: --- Attachment: ZOOKEEPER-587.patch this patch adds negotiated timeout to both the c and java log messages I also add the destination server for the session to the java message (already on c log message) > client should log timeout negotiated with server > > > Key: ZOOKEEPER-587 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-587 > Project: Zookeeper > Issue Type: Bug > Components: c client, java client >Affects Versions: 3.2.1 >Reporter: Patrick Hunt > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-587.patch > > > The ZK client should log the timeout negotiated with the server if the time > is different than the timeout parameter specified by the client. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-588) remove unnecessary/annoying log of tostring error in Request.toString()
[ https://issues.apache.org/jira/browse/ZOOKEEPER-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-588: --- Assignee: Patrick Hunt Status: Patch Available (was: Open) > remove unnecessary/annoying log of tostring error in Request.toString() > --- > > Key: ZOOKEEPER-588 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-588 > Project: Zookeeper > Issue Type: Bug > Components: server >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Minor > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-588.patch > > > Why are we logging this? It's unnecessary and just annoying afaict. We should > remove it entirely. > 2009-11-18 05:37:29,312 WARN org.apache.zookeeper.server.Request: Ignoring > exception during toString > java.nio.BufferUnderflowException > at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127) > at java.nio.ByteBuffer.get(ByteBuffer.java:675) > at org.apache.zookeeper.server.Request.toString(Request.java:199) > at java.lang.String.valueOf(String.java:2827) > at java.lang.StringBuilder.append(StringBuilder.java:115) > at > org.apache.zookeeper.server.quorum.CommitProcessor.processRequest(CommitProcessor.java:167) > at > org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run(FollowerRequestProcessor.java:68) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-588) remove unnecessary/annoying log of tostring error in Request.toString()
[ https://issues.apache.org/jira/browse/ZOOKEEPER-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-588:
-----------------------------------

    Attachment: ZOOKEEPER-588.patch

Fixed the toString and also addressed a problem in the caller:

1) the caller now tries to log the path if it knows it (since we may not be able to figure it out in the request toString, as that's sort of a hack)
2) added a few more exceptions to attempts to print reqpath in toString
3) added more sanity checks before attempting to determine the path
4) removed the annoying log message that caused this issue in the first place.

> remove unnecessary/annoying log of tostring error in Request.toString()
> -----------------------------------------------------------------------
>
> Key: ZOOKEEPER-588
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-588
> Project: Zookeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.2.1
> Reporter: Patrick Hunt
> Priority: Minor
> Fix For: 3.3.0
> Attachments: ZOOKEEPER-588.patch
>
> Why are we logging this? It's unnecessary and just annoying afaict. We should remove it entirely.
> 2009-11-18 05:37:29,312 WARN org.apache.zookeeper.server.Request: Ignoring exception during toString
> java.nio.BufferUnderflowException
>     at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127)
>     at java.nio.ByteBuffer.get(ByteBuffer.java:675)
>     at org.apache.zookeeper.server.Request.toString(Request.java:199)
>     at java.lang.String.valueOf(String.java:2827)
>     at java.lang.StringBuilder.append(StringBuilder.java:115)
>     at org.apache.zookeeper.server.quorum.CommitProcessor.processRequest(CommitProcessor.java:167)
>     at org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run(FollowerRequestProcessor.java:68)
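The "sanity checks before attempting to determine path" mentioned for ZOOKEEPER-588 might look something like the sketch below. It is not the patch itself: `peek_path` is a hypothetical helper, and it assumes the request payload starts with a path serialized as a 4-byte big-endian length followed by UTF-8 bytes. The point is to check the available bytes before reading, so a short or unrelated payload yields "no path" instead of a buffer underflow.

```python
import struct


def peek_path(payload):
    """Return the length-prefixed path at the start of payload, or None
    if the payload cannot contain one (hypothetical helper)."""
    if payload is None or len(payload) < 4:
        return None
    (length,) = struct.unpack_from(">i", payload, 0)
    if length < 0 or length > len(payload) - 4:
        return None  # declared length does not fit: do not attempt the read
    try:
        return payload[4:4 + length].decode("utf-8")
    except UnicodeDecodeError:
        return None  # not text, so not a path worth printing
```

A string-conversion routine built on such a helper can fall back to printing "n/a" for the path without logging anything, which avoids the warning shown in the report.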
[jira] Updated: (ZOOKEEPER-587) client should log timeout negotiated with server
[ https://issues.apache.org/jira/browse/ZOOKEEPER-587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-587: --- Status: Open (was: Patch Available) I should be logging this in the server log as well. > client should log timeout negotiated with server > > > Key: ZOOKEEPER-587 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-587 > Project: Zookeeper > Issue Type: Bug > Components: c client, java client >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Patrick Hunt > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-587.patch > > > The ZK client should log the timeout negotiated with the server if the time > is different than the timeout parameter specified by the client. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-587) client should log timeout negotiated with server
[ https://issues.apache.org/jira/browse/ZOOKEEPER-587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-587: --- Attachment: ZOOKEEPER-587.patch Updated server to log message. Note there are no tests for this - it's a logging change only. Already exercised by existing code. > client should log timeout negotiated with server > > > Key: ZOOKEEPER-587 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-587 > Project: Zookeeper > Issue Type: Bug > Components: c client, java client >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Patrick Hunt > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-587.patch, ZOOKEEPER-587.patch > > > The ZK client should log the timeout negotiated with the server if the time > is different than the timeout parameter specified by the client. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-587) client should log timeout negotiated with server
[ https://issues.apache.org/jira/browse/ZOOKEEPER-587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-587: --- Status: Patch Available (was: Open) > client should log timeout negotiated with server > > > Key: ZOOKEEPER-587 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-587 > Project: Zookeeper > Issue Type: Bug > Components: c client, java client >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Patrick Hunt > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-587.patch, ZOOKEEPER-587.patch > > > The ZK client should log the timeout negotiated with the server if the time > is different than the timeout parameter specified by the client. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-593) java client api does not allow client to access negotiated session timeout
java client api does not allow client to access negotiated session timeout -- Key: ZOOKEEPER-593 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-593 Project: Zookeeper Issue Type: Bug Components: java client Reporter: Patrick Hunt Fix For: 3.3.0 The java client api does not allow the client to access the negotiated session timeout (c does allow this). In some cases the client may not get the requested timeout (server applies a min/max bound) in which case the client user code may want to examine the timeout it did receive. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
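The min/max bound mentioned in ZOOKEEPER-593, together with the logging asked for in ZOOKEEPER-587, can be sketched as below. The function names are hypothetical, and the default bounds of 2 and 20 ticks are an assumption for illustration and are not taken from this thread.

```python
def negotiate_timeout(requested_ms, tick_time_ms, min_ticks=2, max_ticks=20):
    """Clamp the client's requested session timeout to the server's bounds.
    The tick multipliers are assumed defaults."""
    lower = min_ticks * tick_time_ms
    upper = max_ticks * tick_time_ms
    return max(lower, min(requested_ms, upper))


def timeout_log_line(requested_ms, negotiated_ms):
    """Return a log line only when the negotiated value differs from the
    requested one; otherwise there is nothing worth reporting."""
    if negotiated_ms == requested_ms:
        return None
    return "requested timeout %d ms, negotiated %d ms" % (requested_ms, negotiated_ms)
```

For example, with a tick time of 2000 ms a request for 100000 ms comes back as 40000 ms under these assumed bounds. That is exactly the case where client code would want to read the value it actually received.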
[jira] Commented: (ZOOKEEPER-425) Add OSGi metadata to zookeeper.jar
[ https://issues.apache.org/jira/browse/ZOOKEEPER-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782053#action_12782053 ]

Patrick Hunt commented on ZOOKEEPER-425:
-----------------------------------------

Just to summarize - what's being applied here (patch)? Just buildxmlpatch.patch and nothing else (the other attachments are obsolete?)

I agree that no new test is necessary, but it would be interesting to know how to try this out. Is there a way to load this into something like Apache Felix yet, or does that have to wait for the follow-on jira? (I would suggest - please create one and link it to this jira)

> Add OSGi metadata to zookeeper.jar
> ----------------------------------
>
> Key: ZOOKEEPER-425
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-425
> Project: Zookeeper
> Issue Type: Improvement
> Components: build
> Reporter: David Bosschaert
> Assignee: Benjamin Reed
> Fix For: 3.3.0
> Attachments: buildxmlpatch.patch, MANIFEST.MF, zk_patch3.patch
>
> After adding OSGi metadata to zookeeper.jar it can be used as both an OSGi bundle as well as an ordinary jar file.
> In the CXF/DOSGi project the buildsystem does this using the maven-bundle-plugin:
> http://svn.apache.org/repos/asf/cxf/dosgi/trunk/discovery/distributed/zookeeper-wrapper/pom.xml
> The MANIFEST.MF generated by maven-bundle-plugin is attached to this bug, this works for the CXF/DOSGi project.
> If your buildsystem isn't using maven, I would advise to use bnd (http://www.aqute.biz/Code/Bnd). BND defines its own ant task in which you should be able to use more or less the same instructions as were used in maven:
>
> ZooKeeper bundle
> This bundle contains the ZooKeeper library
> org.apache.hadoop.zookeeper
> 3.1.1
> *
> *;version=3.1.1
>
> Oh and one other thing. Is it really necessary to put the source code in the Jar file too? I would put that in a separate source distribution :)
> See also:
> http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200905.mbox/%3c4a2009b1.3030...@yahoo-inc.com%3e
[jira] Assigned: (ZOOKEEPER-425) Add OSGi metadata to zookeeper.jar
[ https://issues.apache.org/jira/browse/ZOOKEEPER-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reassigned ZOOKEEPER-425: -- Assignee: David Bosschaert (was: Benjamin Reed) > Add OSGi metadata to zookeeper.jar > -- > > Key: ZOOKEEPER-425 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-425 > Project: Zookeeper > Issue Type: Improvement > Components: build >Reporter: David Bosschaert >Assignee: David Bosschaert > Fix For: 3.3.0 > > Attachments: buildxmlpatch.patch, MANIFEST.MF, zk_patch3.patch > > > After adding OSGi metadata to zookeeper.jar it can be used as both an OSGi > bundle as well as an ordinary jar file. > In the CXF/DOSGi project the buildsystem does this using the > maven-bundle-plugin: > http://svn.apache.org/repos/asf/cxf/dosgi/trunk/discovery/distributed/zookeeper-wrapper/pom.xml > The MANIFEST.MF generated by maven-bundle-plugin is attached to this bug, > this works for the CXF/DOSGi project. > If your buildsystem isn't using maven, I would advise to use bnd > (http://www.aqute.biz/Code/Bnd). BND defines its own ant task in which you > should be able to use more or less the same instructions as were used in > maven: > > ZooKeeper bundle > This bundle contains the ZooKeeper > library > org.apache.hadoop.zookeeper > 3.1.1 > * > *;version=3.1.1 > > Oh and one other thing. Is it really necessary to put the source code in the > Jar file too? I would put that in a separate source distribution :) > See also: > http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200905.mbox/%3c4a2009b1.3030...@yahoo-inc.com%3e -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-588) remove unnecessary/annoying log of tostring error in Request.toString()
[ https://issues.apache.org/jira/browse/ZOOKEEPER-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782057#action_12782057 ] Patrick Hunt commented on ZOOKEEPER-588: I always do that for multi-line conditional, I think it makes it easier to find the block bounds > remove unnecessary/annoying log of tostring error in Request.toString() > --- > > Key: ZOOKEEPER-588 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-588 > Project: Zookeeper > Issue Type: Bug > Components: server >Affects Versions: 3.2.1 >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Minor > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-588.patch > > > Why are we logging this? It's unnecessary and just annoying afaict. We should > remove it entirely. > 2009-11-18 05:37:29,312 WARN org.apache.zookeeper.server.Request: Ignoring > exception during toString > java.nio.BufferUnderflowException > at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127) > at java.nio.ByteBuffer.get(ByteBuffer.java:675) > at org.apache.zookeeper.server.Request.toString(Request.java:199) > at java.lang.String.valueOf(String.java:2827) > at java.lang.StringBuilder.append(StringBuilder.java:115) > at > org.apache.zookeeper.server.quorum.CommitProcessor.processRequest(CommitProcessor.java:167) > at > org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run(FollowerRequestProcessor.java:68) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-595) A means of asking quorum what conifguration it is running with
[ https://issues.apache.org/jira/browse/ZOOKEEPER-595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-595: --- Component/s: server jmx Fix Version/s: 3.3.0 Good idea, we should expose this through both command port and JMX. > A means of asking quorum what conifguration it is running with > -- > > Key: ZOOKEEPER-595 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-595 > Project: Zookeeper > Issue Type: Improvement > Components: jmx, server >Reporter: stack > Fix For: 3.3.0 > > > I'd like to ask a running quorum what its configuration is. I'd want to know > stuff like session timeout and tick times. > Use case is that in hbase there is no zoo.cfg usually; the configuration is > manufactured and piped to the starting zk server. I want to know if all of > the manufactured config. 'took' or how zk interpreted it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-597) ASyncHammerTest is failing intermittently on hudson trunk
ASyncHammerTest is failing intermittently on hudson trunk
---------------------------------------------------------

Key: ZOOKEEPER-597
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-597
Project: Zookeeper
Issue Type: Bug
Components: tests
Reporter: Patrick Hunt
Priority: Critical
Fix For: 3.3.0

ASyncHammerTest is failing intermittently on hudson trunk. There is no clear reason why this is happening, but it seems from the logs that a session connection to a follower is failing during session establishment - the failure seems to be a problem either on the follower or leader. The server gets the session create request, but it stalls in the request processor pipeline (we see it go in, but we do not see it come out).

Unfortunately all efforts to reproduce this on non-hudson trunk have failed. Even trying to reproduce by running on the hudson host itself (manually) has failed.

We need to instrument the client session creation code in the test to dump the thread stack if the session creation fails.
[jira] Updated: (ZOOKEEPER-597) ASyncHammerTest is failing intermittently on hudson trunk
[ https://issues.apache.org/jira/browse/ZOOKEEPER-597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-597:
-----------------------------------

    Attachment: ZOOKEEPER-597.patch

This patch updates the test to log the threads/stacks if an error occurs during session establishment.

> ASyncHammerTest is failing intermittently on hudson trunk
> ---------------------------------------------------------
>
> Key: ZOOKEEPER-597
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-597
> Project: Zookeeper
> Issue Type: Bug
> Components: tests
> Reporter: Patrick Hunt
> Priority: Critical
> Fix For: 3.3.0
> Attachments: ZOOKEEPER-597.patch
>
> ASyncHammerTest is failing intermittently on hudson trunk. There is no clear reason why this is happening, but it seems from the logs that a session connection to a follower is failing during session establishment - the failure seems to be a problem either on the follower or leader. The server gets the session create request, but it stalls in the request processor pipeline (we see it go in, but we do not see it come out).
> Unfortunately all efforts to reproduce this on non-hudson trunk have failed. Even trying to reproduce by running on the hudson host itself (manually) has failed.
> We need to instrument the client session creation code in the test to dump the thread stack if the session creation fails.
[jira] Assigned: (ZOOKEEPER-597) ASyncHammerTest is failing intermittently on hudson trunk
[ https://issues.apache.org/jira/browse/ZOOKEEPER-597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt reassigned ZOOKEEPER-597:
--------------------------------------

    Assignee: Patrick Hunt

> ASyncHammerTest is failing intermittently on hudson trunk
> ---------------------------------------------------------
>
> Key: ZOOKEEPER-597
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-597
> Project: Zookeeper
> Issue Type: Bug
> Components: tests
> Reporter: Patrick Hunt
> Assignee: Patrick Hunt
> Priority: Critical
> Fix For: 3.3.0
> Attachments: ZOOKEEPER-597.patch
>
> ASyncHammerTest is failing intermittently on hudson trunk. There is no clear reason why this is happening, but it seems from the logs that a session connection to a follower is failing during session establishment - the failure seems to be a problem either on the follower or leader. The server gets the session create request, but it stalls in the request processor pipeline (we see it go in, but we do not see it come out).
> Unfortunately all efforts to reproduce this on non-hudson trunk have failed. Even trying to reproduce by running on the hudson host itself (manually) has failed.
> We need to instrument the client session creation code in the test to dump the thread stack if the session creation fails.
[jira] Reopened: (ZOOKEEPER-597) ASyncHammerTest is failing intermittently on hudson trunk
[ https://issues.apache.org/jira/browse/ZOOKEEPER-597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reopened ZOOKEEPER-597: Still an issue, reopening. > ASyncHammerTest is failing intermittently on hudson trunk > - > > Key: ZOOKEEPER-597 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-597 > Project: Zookeeper > Issue Type: Bug > Components: tests >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Critical > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-597.patch > > > ASyncHammerTest is failing intermittently on hudson trunk. There is no clear > reason why this is happening, but > it seems from the logs that a session connection to a follower is failing > during session establishment - the > failure seems to be a problem either on the follower or leader. The server > gets the session create request, but > it stalls in the request processor pipeline. (we see it go in, but we do not > see it come out) > Unfortunately all efforts to reproduce this on non-hudson trunk have failed. > Even trying to reproduce by > running on hudson host itself (manually) has failed. > We need to instrument the client session creation code in the test to dump > the thread stack if the > session creation fails. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-600) TODO pondering about allocation behavior in zkpython may be removed
[ https://issues.apache.org/jira/browse/ZOOKEEPER-600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-600: --- Component/s: contrib-bindings Affects Version/s: 3.2.1 Fix Version/s: 3.3.0 Issue Type: Bug (was: Task) > TODO pondering about allocation behavior in zkpython may be removed > --- > > Key: ZOOKEEPER-600 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-600 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bindings >Affects Versions: 3.2.1 >Reporter: Gustavo Niemeyer >Assignee: Gustavo Niemeyer >Priority: Trivial > Fix For: 3.3.0 > > > I suppose the TODO below is referring to the "path" variable which is passed > in as an output variable to PyArg_ParseTuple right below. The TODO may be > removed, since the code is right. Code using PyArg_ParseTuple will borrow > the reference from the calling code, since there's a stack behind the call to > the enclosing function (pyzoo_get_children in this case) which won't go away > until the function returns. > Index: src/contrib/zkpython/src/c/zookeeper.c > === > --- src/contrib/zkpython/src/c/zookeeper.c(revision 885582) > +++ src/contrib/zkpython/src/c/zookeeper.c(working copy) > @@ -774,8 +774,6 @@ > > static PyObject *pyzoo_get_children(PyObject *self, PyObject *args) > { > - // TO DO: Does Python copy the string or the reference? If it's the former > - // we should free the String_vector >int zkhid; >char *path; >PyObject *watcherfn = Py_None; -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (ZOOKEEPER-600) TODO pondering about allocation behavior in zkpython may be removed
[ https://issues.apache.org/jira/browse/ZOOKEEPER-600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt reassigned ZOOKEEPER-600: -- Assignee: Gustavo Niemeyer > TODO pondering about allocation behavior in zkpython may be removed > --- > > Key: ZOOKEEPER-600 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-600 > Project: Zookeeper > Issue Type: Task > Components: contrib-bindings >Affects Versions: 3.2.1 >Reporter: Gustavo Niemeyer >Assignee: Gustavo Niemeyer >Priority: Trivial > Fix For: 3.3.0 > > > I suppose the TODO below is referring to the "path" variable which is passed > in as an output variable to PyArg_ParseTuple right below. The TODO may be > removed, since the code is right. Code using PyArg_ParseTuple will borrow > the reference from the calling code, since there's a stack behind the call to > the enclosing function (pyzoo_get_children in this case) which won't go away > until the function returns. > Index: src/contrib/zkpython/src/c/zookeeper.c > === > --- src/contrib/zkpython/src/c/zookeeper.c(revision 885582) > +++ src/contrib/zkpython/src/c/zookeeper.c(working copy) > @@ -774,8 +774,6 @@ > > static PyObject *pyzoo_get_children(PyObject *self, PyObject *args) > { > - // TO DO: Does Python copy the string or the reference? If it's the former > - // we should free the String_vector >int zkhid; >char *path; >PyObject *watcherfn = Py_None; -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-600) TODO pondering about allocation behavior in zkpython may be removed
[ https://issues.apache.org/jira/browse/ZOOKEEPER-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783903#action_12783903 ] Patrick Hunt commented on ZOOKEEPER-600: Hi Gustavo, thanks for the submission. I need to point out that we require submissions via patch file, details of which you can find here: http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute (use svn diff to create a patch, attach it to this jira using the "attach file" link on the left hand side of this page) The reason for this is that we need to capture your acceptance of the license grant to ASF. Otherwise we cannot accept the patch (for legal reasons). Also, our automated workflow checks submissions and such; it's triggered by your attaching the file, then clicking on "submit patch". Thanks for your patience. If you could attach your change as a patch file that would be great. > TODO pondering about allocation behavior in zkpython may be removed > --- > > Key: ZOOKEEPER-600 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-600 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bindings >Affects Versions: 3.2.1 >Reporter: Gustavo Niemeyer >Assignee: Gustavo Niemeyer >Priority: Trivial > Fix For: 3.3.0 > > > I suppose the TODO below is referring to the "path" variable which is passed > in as an output variable to PyArg_ParseTuple right below. The TODO may be > removed, since the code is right. Code using PyArg_ParseTuple will borrow > the reference from the calling code, since there's a stack behind the call to > the enclosing function (pyzoo_get_children in this case) which won't go away > until the function returns. > Index: src/contrib/zkpython/src/c/zookeeper.c > === > --- src/contrib/zkpython/src/c/zookeeper.c(revision 885582) > +++ src/contrib/zkpython/src/c/zookeeper.c(working copy) > @@ -774,8 +774,6 @@ > > static PyObject *pyzoo_get_children(PyObject *self, PyObject *args) > { > - // TO DO: Does Python copy the string or the reference? If it's the former > - // we should free the String_vector >int zkhid; >char *path; >PyObject *watcherfn = Py_None; -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-601) allow configuration of session timeout min/max bounds
allow configuration of session timeout min/max bounds - Key: ZOOKEEPER-601 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-601 Project: Zookeeper Issue Type: Improvement Components: server Affects Versions: 3.2.1 Reporter: Patrick Hunt Fix For: 3.3.0 ZK servers currently enforce a min/max boundary on client session timeout relative to the tickTime setting, detailed here: http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions In general there are good reasons for this; however, in some cases, in particular with HBase region servers, we have seen a need to allow this bound to be set differently (higher). The Sun JVM can pause for GC for very long times (in some cases we've seen 4 minutes, even with the "realtime" GC). It would be good to allow this bound to be set via configuration parameters. Note: 4letterword and JMX integration would be needed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
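To make the request above concrete, here is a small sketch of how a server might clamp a client's requested session timeout, first with bounds derived from tickTime and then with an explicitly configured upper bound. The defaults of 2 and 20 ticks are the ones described in the linked programmer's guide; the class SessionTimeoutBounds, its constructor parameters, and the negotiate method are hypothetical names for illustration, not the server's actual code or configuration keys.

```java
final class SessionTimeoutBounds {
    private final int minMs;
    private final int maxMs;

    /**
     * @param tickTimeMs    server tick length in milliseconds
     * @param minOverrideMs hypothetical configured lower bound, or null for 2 * tickTime
     * @param maxOverrideMs hypothetical configured upper bound, or null for 20 * tickTime
     */
    SessionTimeoutBounds(int tickTimeMs, Integer minOverrideMs, Integer maxOverrideMs) {
        this.minMs = (minOverrideMs != null) ? minOverrideMs : 2 * tickTimeMs;
        this.maxMs = (maxOverrideMs != null) ? maxOverrideMs : 20 * tickTimeMs;
        if (minMs > maxMs) {
            throw new IllegalArgumentException("min " + minMs + " exceeds max " + maxMs);
        }
    }

    /** Clamps the timeout requested by the client into [minMs, maxMs]. */
    int negotiate(int requestedMs) {
        return Math.max(minMs, Math.min(maxMs, requestedMs));
    }

    public static void main(String[] args) {
        int tickTimeMs = 2000;
        SessionTimeoutBounds defaults = new SessionTimeoutBounds(tickTimeMs, null, null);
        SessionTimeoutBounds raised = new SessionTimeoutBounds(tickTimeMs, null, 600_000);
        // A client asking for 5 minutes to ride out a long GC pause:
        System.out.println(defaults.negotiate(300_000)); // 40000 (capped at 20 ticks)
        System.out.println(raised.negotiate(300_000));   // 300000 (fits under the raised cap)
    }
}
```

With a 2000 ms tick the default ceiling is 40 seconds, well short of the 4 minute pauses mentioned above, which is why a separately configurable bound is being requested.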
[jira] Commented: (ZOOKEEPER-600) TODO pondering about allocation behavior in zkpython may be removed
[ https://issues.apache.org/jira/browse/ZOOKEEPER-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784056#action_12784056 ] Patrick Hunt commented on ZOOKEEPER-600: no worries, we don't expect first time contributors to know everything. ;-) thanks for the interest. > TODO pondering about allocation behavior in zkpython may be removed > --- > > Key: ZOOKEEPER-600 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-600 > Project: Zookeeper > Issue Type: Bug > Components: contrib-bindings >Affects Versions: 3.2.1 >Reporter: Gustavo Niemeyer >Assignee: Gustavo Niemeyer >Priority: Trivial > Fix For: 3.3.0 > > > I suppose the TODO below is referring to the "path" variable which is passed > in as an output variable to PyArg_ParseTuple right below. The TODO may be > removed, since the code is right. Code using PyArg_ParseTuple will borrow > the reference from the calling code, since there's a stack behind the call to > the enclosing function (pyzoo_get_children in this case) which won't go away > until the function returns. > Index: src/contrib/zkpython/src/c/zookeeper.c > === > --- src/contrib/zkpython/src/c/zookeeper.c(revision 885582) > +++ src/contrib/zkpython/src/c/zookeeper.c(working copy) > @@ -774,8 +774,6 @@ > > static PyObject *pyzoo_get_children(PyObject *self, PyObject *args) > { > - // TO DO: Does Python copy the string or the reference? If it's the former > - // we should free the String_vector >int zkhid; >char *path; >PyObject *watcherfn = Py_None; -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-597) ASyncHammerTest is failing intermittently on hudson trunk
[ https://issues.apache.org/jira/browse/ZOOKEEPER-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784340#action_12784340 ] Patrick Hunt commented on ZOOKEEPER-597: According to the latest log the commit processor thread is exiting. I notice that we are not logging exceptions from that thread. We should include logging the exception as part of this fix. Really we need to add to the ThreadGroup -- handle uncaught exceptions -- log them at error level. > ASyncHammerTest is failing intermittently on hudson trunk > - > > Key: ZOOKEEPER-597 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-597 > Project: Zookeeper > Issue Type: Bug > Components: tests >Reporter: Patrick Hunt >Assignee: Patrick Hunt >Priority: Critical > Fix For: 3.3.0 > > Attachments: ZOOKEEPER-597.patch > > > ASyncHammerTest is failing intermittently on hudson trunk. There is no clear > reason why this is happening, but > it seems from the logs that a session connection to a follower is failing > during session establishment - the > failure seems to be a problem either on the follower or leader. The server > gets the session create request, but > it stalls in the request processor pipeline. (we see it go in, but we do not > see it come out) > Unfortunately all efforts to reproduce this on non-hudson trunk have failed. > Even trying to reproduce by > running on hudson host itself (manually) has failed. > We need to instrument the client session creation code in the test to dump > the thread stack if the > session creation fails. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (ZOOKEEPER-602) log all exceptions not caught by ZK threads
log all exceptions not caught by ZK threads --- Key: ZOOKEEPER-602 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-602 Project: Zookeeper Issue Type: Bug Components: java client, server Affects Versions: 3.2.1 Reporter: Patrick Hunt Priority: Critical Fix For: 3.3.0 the java code should add a ThreadGroup exception handler that logs at ERROR level any uncaught exceptions thrown by Thread run methods. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
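A minimal sketch of the approach described above, assuming a ThreadGroup subclass that overrides uncaughtException. It uses java.util.logging to stay self-contained, so Level.SEVERE stands in for ERROR; the class name LoggingThreadGroup, the lastUncaught field, and the thread name are hypothetical and not taken from any ZooKeeper patch.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggingThreadGroup extends ThreadGroup {
    private static final Logger LOG = Logger.getLogger(LoggingThreadGroup.class.getName());

    // Kept only so the demo (or a test) can observe what was handled.
    volatile Throwable lastUncaught;

    LoggingThreadGroup(String name) {
        super(name);
    }

    /** Invoked by the JVM when a thread in this group dies from an uncaught exception. */
    @Override
    public void uncaughtException(Thread t, Throwable e) {
        lastUncaught = e;
        LOG.log(Level.SEVERE, "Uncaught exception in thread " + t.getName(), e);
    }

    public static void main(String[] args) throws InterruptedException {
        LoggingThreadGroup group = new LoggingThreadGroup("zk-sketch");
        // Hypothetical worker standing in for a server thread whose run() throws.
        Thread worker = new Thread(group, () -> {
            throw new IllegalStateException("simulated failure in run()");
        }, "CommitProcessor-sketch");
        worker.start();
        worker.join();
        System.out.println("handled: " + group.lastUncaught);
    }
}
```

The group's handler is only consulted for threads that have no handler of their own set via setUncaughtExceptionHandler, so threads must be constructed in the group for this to take effect.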
[jira] Commented: (ZOOKEEPER-604) zk needs to prevent export of any symbol not listed in their api
[ https://issues.apache.org/jira/browse/ZOOKEEPER-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784457#action_12784457 ] Patrick Hunt commented on ZOOKEEPER-604: Bummer -- +1 on fixing this for 3.3.0. Alex it would be great if you could provide a patch given you can verify and also are on the cusp of the issue. > zk needs to prevent export of any symbol not listed in their api > > > Key: ZOOKEEPER-604 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-604 > Project: Zookeeper > Issue Type: Bug > Components: c client >Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1, 3.2.2, > 3.3.0, 4.0.0 > Environment: All >Reporter: Alex Newman > Fix For: 3.3.0 > > > Currently the zookeeper seems to be exporting symbols not in the api. An > example of this seems to be the symbol hash, which interferes with me using > memcached and zookeeper in the same program. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-604) zk needs to prevent export of any symbol not listed in their api
[ https://issues.apache.org/jira/browse/ZOOKEEPER-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-604: --- Priority: Critical (was: Major) > zk needs to prevent export of any symbol not listed in their api > > > Key: ZOOKEEPER-604 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-604 > Project: Zookeeper > Issue Type: Bug > Components: c client >Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1, 3.2.2, > 3.3.0, 4.0.0 > Environment: All >Reporter: Alex Newman >Priority: Critical > Fix For: 3.3.0 > > > Currently the zookeeper seems to be exporting symbols not in the api. An > example of this seems to be the symbol hash, which interferes with me using > memcached and zookeeper in the same program. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.