Question on quorum behavior
In short: it seems the leader can treat observers as quorum members.

Steps to reproduce:

1. I have the following ensemble configuration:

# servers list
server.1=localhost:2881:3881
server.2=localhost:2882:3882
server.3=localhost:2883:3883:observer
server.4=localhost:2884:3884
server.5=localhost:2885:3885:observer

2. I bring up servers 1, 2 and 3, which is enough for a quorum (servers 1 and 2 are voters).

3. I shut down whichever of the two voters is the follower.

As I understand it, the expected result is that the leader starts a new election round in order to regain quorum. What actually happens is that it just says goodbye to that follower and remains operable. (When I then shut down the third server, the observer, the leader starts trying to regain a quorum.)

Is this a bug, or a feature?

--
Regards, Sergey
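For readers following along, the expectation in this report can be sketched as a majority count over voting members only. This is an illustration of that expectation, not ZooKeeper's actual implementation; `has_quorum`, `VOTERS` and `OBSERVERS` are hypothetical names that simply mirror the server list in the message.

```python
# Illustrative sketch only: majority quorum counted over voting members.
# VOTERS/OBSERVERS mirror the server list in the report; has_quorum is a
# hypothetical helper, not a ZooKeeper API.
VOTERS = {1, 2, 4}
OBSERVERS = {3, 5}


def has_quorum(alive, voters=VOTERS):
    """Return True if a strict majority of the voting members is alive."""
    alive_voters = set(alive) & set(voters)
    return len(alive_voters) > len(voters) // 2


print(has_quorum({1, 2, 3}))  # step 2: voters 1 and 2 are up -> True
print(has_quorum({1, 3}))     # step 3: only voter 1 and observer 3 remain -> False
```

Under this counting, the observer (server 3) never contributes to the majority, which is why the reporter expected the leader to lose quorum in step 3.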
Re: Question on quorum behavior
Sergey -

Sounds like a bug. Can you open a new JIRA and attach your log files to it?

Thanks,
Henry

On 6 May 2010 07:50, Sergey Doroshenko dors...@gmail.com wrote:
> In short: it seems leader can treat observers as quorum members. [...]

--
Henry Robinson
Software Engineer
Cloudera
415-994-6679
[jira] Created: (ZOOKEEPER-768) zkpython segfault (assertion error in io thread)
zkpython segfault (assertion error in io thread)

Key: ZOOKEEPER-768
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-768
Project: Zookeeper
Issue Type: Bug
Components: contrib-bindings
Affects Versions: 3.4.0
Environment: ubuntu lucid (10.04), zookeeper trunk (java/c/zkpython)
Reporter: Kapil Thangavelu
Attachments: zkpython-segfault-stack-traces.txt, zkpython-segfault.py

While trying to create a test case showing slow average add_auth, I stumbled upon a test case that reliably segfaults for me, albeit after a variable number of iterations (anywhere from 2 to 20). FWIW, I've got about 220 processes in my test environment (ubuntu lucid 10.04). The test case opens a connection, adds authentication to it, and closes the connection, in a loop. I'm including the sample program and the gdb stack traces from the core file. I can upload the core file if that's helpful.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-768) zkpython segfault (assertion error in io thread)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kapil Thangavelu updated ZOOKEEPER-768:
Attachment: zkpython-segfault-client-log.txt

Client log with debug logging on.
[jira] Commented: (ZOOKEEPER-768) zkpython segfault on close (assertion error in io thread)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864849#action_12864849 ]

Henry Robinson commented on ZOOKEEPER-768:

Thanks Kapil - I'll take a look. From the stack trace it looks as though a pending completion callback is null, which suggests that a completion dispatcher is being freed before it has finished being used. As usual I can't reproduce this on my machine, but this is enough information to dig into it.
Re: [GSoC 2010] Zookeeper Read-Only Mode
I created a wiki page that outlines how I'm planning to implement read-only mode: http://wiki.apache.org/hadoop/ZooKeeper/GSoCReadOnlyMode

This feature can affect both the client library and the server side, so before implementing it, it would be great to hear some feedback from the community. I'd like to ask ZK developers what they think about the current approach, and ask ZK users to share their thoughts as well: use cases where read-only mode would be beneficial, how well the current approach fits them, and so on.

P.S. For zookeeper-user@ subscribers who didn't previously see this email: I'm Sergey Doroshenko, an accepted GSoC 2010 applicant, and I'll be implementing read-only mode for Zookeeper. Any feedback is greatly appreciated and would be really helpful.

On Mon, Apr 19, 2010 at 5:51 PM, Vishal K vishalm...@gmail.com wrote:

> Hi Sergey,
>
> This is a very useful feature. We would be happy to have it on our cluster. One small suggestion: it would be nice if you could document things along the way while going through the ZK code or testing ZK behavior. Since it will cover part of the core logic of ZK, it will be helpful to other contributors as well.
>
> Thanks.
> Regards,
> -Vishal
>
> On Fri, Apr 16, 2010 at 6:43 PM, Sergey Doroshenko dors...@gmail.com wrote:
>
>> Hi,
>>
>> I'm Sergey Doroshenko, a GSoC applicant, and I've submitted an application for enabling read-only mode in Zookeeper. I worked with Zookeeper during my internship this winter and liked it a lot. Now I'm very eager to contribute to it.
>>
>> The task for enabling read-only mode in ZK ( https://issues.apache.org/jira/browse/ZOOKEEPER-704 ) looks interesting and, as we discussed with Henry, is of real practical importance.
>>
>> Please check my proposal here: http://docs.google.com/View?id=dghqvqdd_51ffvhcsdb , and let me know if you have any thoughts or suggestions about it.
>>
>> Thanks!
>>
>> --
>> Regards, Sergey

--
Regards, Sergey
[jira] Created: (ZOOKEEPER-769) Leader can treat observers as quorum members
Leader can treat observers as quorum members

Key: ZOOKEEPER-769
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-769
Project: Zookeeper
Issue Type: Bug
Affects Versions: 3.3.0
Environment: Ubuntu Karmic x64
Reporter: Sergey Doroshenko
Fix For: 3.3.0

In short: it seems the leader can treat observers as quorum members.

Steps to reproduce:

1. Server configuration: 3 voters, 2 observers (attached).
2. Bring up 2 voters and one observer. That is enough for a quorum.
3. Shut down whichever of the two voters is the follower.

As I understand it, the expected result is that the leader starts a new election round in order to regain quorum. What actually happens is that it just says goodbye to that follower and remains operable. (When I then shut down the third server, the observer, the leader starts trying to regain a quorum.)

(As expected, if in step 3 we shut down the leader rather than the follower, the remaining follower starts a new leader election, as it should.)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-769) Leader can treat observers as quorum members
[ https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Doroshenko updated ZOOKEEPER-769:
Attachment: zoo1.cfg
[jira] Updated: (ZOOKEEPER-763) Deadlock on close w/ zkpython / c client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-763:
Attachment: (was: ZOOKEEPER-763_3_3_1.patch)

Deadlock on close w/ zkpython / c client

Key: ZOOKEEPER-763
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-763
Project: Zookeeper
Issue Type: Bug
Components: contrib-bindings
Affects Versions: 3.3.0
Environment: ubuntu 10.04, zookeeper 3.3.0 and trunk
Reporter: Kapil Thangavelu
Assignee: Henry Robinson
Fix For: 3.3.1, 3.4.0
Attachments: deadlock.py, deadlock_v2.py, stack-trace-deadlock.txt, ZOOKEEPER-763.patch, ZOOKEEPER-763.patch

Deadlocks occur if we attempt to close a handle while there are any outstanding async requests (aget, acreate, etc). Normally on close both the io thread and the completion thread are terminated and joined; however, with outstanding async requests the completion thread won't be in a joinable state, and we effectively hang when the main thread does the join.

AFAICS the ideal behavior on close of a handle would be to clear out any remaining callbacks and let the completion thread terminate.

I've tried adding some bookkeeping within a Python client to guard against closing while there is an outstanding async completion request, but it's an imperfect solution, since even after the Python callback has executed there is still a window for deadlock before the completion thread finishes the callback.

A simple example to reproduce the deadlock is attached.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
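The client-side bookkeeping mentioned in the report might look roughly like the sketch below. It is only a guess at the general shape: `OutstandingRequests` is a hypothetical helper, not part of zkpython, and it is not the reporter's code. As the report notes, a guard like this still leaves a window between the Python callback returning and the C completion thread finishing, so it narrows the problem rather than fixing it.

```python
import threading


class OutstandingRequests:
    """Counts in-flight async requests so a caller can wait before closing.

    Hypothetical helper for illustration; it is not part of zkpython.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0

    def started(self):
        """Call just before issuing an async request (aget, acreate, ...)."""
        with self._cond:
            self._pending += 1

    def wrap(self, callback):
        """Wrap a completion callback so the counter drops when it returns."""
        def wrapper(*args):
            try:
                return callback(*args)
            finally:
                with self._cond:
                    self._pending -= 1
                    self._cond.notify_all()
        return wrapper

    def wait_idle(self, timeout=None):
        """Block until no requests are pending; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._pending == 0, timeout)


tracker = OutstandingRequests()
tracker.started()
on_done = tracker.wrap(lambda handle, rc: None)
print(tracker.wait_idle(timeout=0.05))  # False: one request still pending
on_done(0, 0)                           # simulate the completion firing
print(tracker.wait_idle(timeout=0.05))  # True: safe to attempt close
```

A client would call `started()` before each async call, pass the wrapped callback to it, and call `wait_idle()` before closing the handle.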
[jira] Commented: (ZOOKEEPER-763) Deadlock on close w/ zkpython / c client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864871#action_12864871 ]

Patrick Hunt commented on ZOOKEEPER-763:

For some reason I got confused on the 3.3 branch (my checkout may not have been up to date); the main patch applies to both just fine. Fixed this in svn.
[jira] Commented: (ZOOKEEPER-768) zkpython segfault on close (assertion error in io thread)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864873#action_12864873 ]

Kapil Thangavelu commented on ZOOKEEPER-768:

I've uploaded the core file here: http://kapilt.com/files/zkpython-segfault-on-close-core.bz2

A little more poking around in gdb shows the packet to be a ping:

(gdb) print hdr
$8 = {xid = -2, zxid = 181, err = 0}
[jira] Updated: (ZOOKEEPER-768) zkpython segfault on close (assertion error in io thread)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kapil Thangavelu updated ZOOKEEPER-768:
Attachment: zkpython-segfault-on-close-core.bz2

Compressed, the core file is small enough to just attach to the ticket.
[jira] Commented: (ZOOKEEPER-769) Leader can treat observers as quorum members
[ https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864878#action_12864878 ]

Henry Robinson commented on ZOOKEEPER-769:

Hi Sergey - Can you attach the logs from (at least) the leader node to this ticket? I'd like to figure this one out asap.

cheers,
Henry
[jira] Updated: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code
[ https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Baskinger updated ZOOKEEPER-767:
Attachment: (was: Lock.java.patch)

Submitting Demo/Recipe Shared / Exclusive Lock Code

Key: ZOOKEEPER-767
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-767
Project: Zookeeper
Issue Type: Improvement
Components: recipes
Affects Versions: 3.3.0
Reporter: Sam Baskinger
Assignee: Sam Baskinger
Priority: Minor
Fix For: 3.4.0

Networked Insights would like to share back some code for shared/exclusive locking that we are using in our labs.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code
[ https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Baskinger updated ZOOKEEPER-767:
Attachment: ZOOKEEPER-767.patch

Unit tests and new SharedExclusiveLock recipe implementation.
[jira] Commented: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code
[ https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864890#action_12864890 ]

Hadoop QA commented on ZOOKEEPER-767:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12443889/ZOOKEEPER-767.patch
against trunk revision 941521.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
-1 release audit. The applied patch generated 26 release audit warnings (more than the trunk's current 24 warnings).
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/85/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/85/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/85/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/85/console

This message is automatically generated.
[jira] Created: (ZOOKEEPER-770) Slow add_auth calls with multi-threaded client
Slow add_auth calls with multi-threaded client

Key: ZOOKEEPER-770
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-770
Project: Zookeeper
Issue Type: Bug
Components: c client, contrib-bindings
Affects Versions: 3.3.0, 3.4.0
Environment: ubuntu lucid (10.04), zk trunk (3.4)
Reporter: Kapil Thangavelu

Calls to add_auth are a bit slow from the C client library. The auth callback typically takes multiple seconds to fire. I instrumented the Java, C binding, and Python binding code with a few log statements to find out where the slowness was occurring ( http://bazaar.launchpad.net/~hazmat/zookeeper/fast-auth-instrumented/revision/647 ).

It looks like when the io thread polls, it doesn't register interest in the incoming packet, so the auth success message from the server and the auth callback are only processed when the poll times out. I tried modifying mt_adapter.c so that the poll registers interest in both events; this causes considerably more wakeups, but it does address the issue of making add_auth fast.

I think the ideal solution would be some sort of additional auth handshake state on the handle, which zookeeper_interest could use to indicate that both POLLIN|POLLOUT are wanted for subsequent calls to poll while the handle is in the auth handshake state.

I'm attaching a script that takes 13s or 1.6s for the auth callback, depending on the session timeout value (which in turn figures into the calculation of the poll timeout).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
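The general effect described in this report, a reply that sits unread until the poll timeout expires because no read interest was registered, can be reproduced with a small standalone sketch. This is a generic illustration of poll semantics using Python's `select.poll` on a local socket pair. It is not the C client's code and does not confirm the root cause; `wait_for_reply` and the 200 ms timeout are assumptions made for the example.

```python
import select
import socket
import time


def wait_for_reply(sock, eventmask, timeout_ms):
    """Poll sock with the given interest mask; return (ready, elapsed seconds)."""
    poller = select.poll()
    poller.register(sock, eventmask)
    start = time.monotonic()
    events = poller.poll(timeout_ms)
    return bool(events), time.monotonic() - start


# A local socket pair stands in for the client/server connection.
client, server = socket.socketpair()
server.sendall(b"auth-ok")  # the reply is already waiting to be read

# No read interest registered: poll only returns when the timeout expires.
print(wait_for_reply(client, 0, 200))
# Read interest registered: poll returns as soon as the reply is readable.
print(wait_for_reply(client, select.POLLIN, 200))

client.close()
server.close()
```

In the first call the elapsed time tracks the poll timeout, which matches the observation that the callback latency depends on the session timeout.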
[jira] Updated: (ZOOKEEPER-770) Slow add_auth calls with multi-threaded client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kapil Thangavelu updated ZOOKEEPER-770:
Attachment: authtest.py

Script that demonstrates that the time for auth callbacks depends on the session timeout, which is used to calculate the poll timeout.
[jira] Updated: (ZOOKEEPER-769) Leader can treat observers as quorum members
[ https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Doroshenko updated ZOOKEEPER-769:
Attachment: follower.log leader.log observer.log

Logs
[jira] Updated: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code
[ https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-767:
Status: Open (was: Patch Available)

Thanks Sam. A release audit failure typically means that you are missing license headers in your new source files. Could you update the patch? (Check out the existing source files for an example of what the license header should be.)
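Before resubmitting a patch, a quick local check for missing headers could look like the sketch below. It is a rough approximation only: the project's release audit does its own, more thorough check, and `missing_license`, the 20-line window and the demo file names are assumptions made for illustration. The marker string is the opening of the standard ASF source header.

```python
import os
import tempfile

# Opening words of the standard ASF source header.
ASF_MARKER = "Licensed to the Apache Software Foundation (ASF)"


def missing_license(paths, marker=ASF_MARKER, head_lines=20):
    """Return the paths whose first head_lines lines lack the marker."""
    missing = []
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as src:
            head = "".join(line for _, line in zip(range(head_lines), src))
        if marker not in head:
            missing.append(path)
    return missing


# Demo on two throwaway files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "Good.java")
    bad = os.path.join(tmp, "Bad.java")
    with open(good, "w", encoding="utf-8") as out:
        out.write("/**\n * " + ASF_MARKER + " under one\n */\nclass Good {}\n")
    with open(bad, "w", encoding="utf-8") as out:
        out.write("class Bad {}\n")
    print([os.path.basename(p) for p in missing_license([good, bad])])  # ['Bad.java']
```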
[jira] Updated: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code
[ https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Baskinger updated ZOOKEEPER-767:
Attachment: (was: ZOOKEEPER-767.patch)
[jira] Commented: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code
[ https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864913#action_12864913 ]

Sam Baskinger commented on ZOOKEEPER-767:

I was wondering about that! Thanks, Patrick. I've got a few moments right now to get that updated. Naive question: do I need to invalidate or reflag the issue as having a patch for the build to pick it up?

Thank you!
Sam Baskinger
[jira] Updated: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code
[ https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Baskinger updated ZOOKEEPER-767:
Status: Patch Available (was: Open)

This patch is the same as the previous one of the same name, but with the license block added to the top of the test class.
[jira] Updated: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code
[ https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Baskinger updated ZOOKEEPER-767:
Attachment: ZOOKEEPER-767.patch

Unit tests and recipe implementation of a SharedExclusiveLock. This new attachment contains copyright/license information for the test class.
[jira] Commented: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code
[ https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864925#action_12864925 ]

Hadoop QA commented on ZOOKEEPER-767:

+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443903/ZOOKEEPER-767.patch against trunk revision 941521.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/86/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/86/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/86/console

This message is automatically generated.

Submitting Demo/Recipe Shared / Exclusive Lock Code
Key: ZOOKEEPER-767
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-767
Project: Zookeeper
Issue Type: Improvement
Components: recipes
Affects Versions: 3.3.0
Reporter: Sam Baskinger
Assignee: Sam Baskinger
Priority: Minor
Fix For: 3.4.0
Attachments: ZOOKEEPER-767.patch

Networked Insights would like to share-back some code for shared/exclusive locking that we are using in our labs.
[jira] Commented: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code
[ https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864936#action_12864936 ]

Patrick Hunt commented on ZOOKEEPER-767:

Sam, yes, you have to cancel and resubmit the patch for the system to pick up the new attachment. Also, you should just upload the new patch with the same name. JIRA will handle this properly (and as a result you can see the history of the patch as changes are made).

Submitting Demo/Recipe Shared / Exclusive Lock Code
Key: ZOOKEEPER-767
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-767
Project: Zookeeper
Issue Type: Improvement
Components: recipes
Affects Versions: 3.3.0
Reporter: Sam Baskinger
Assignee: Sam Baskinger
Priority: Minor
Fix For: 3.4.0
Attachments: ZOOKEEPER-767.patch

Networked Insights would like to share-back some code for shared/exclusive locking that we are using in our labs.
[jira] Commented: (ZOOKEEPER-769) Leader can treat observers as quorum members
[ https://issues.apache.org/jira/browse/ZOOKEEPER-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864953#action_12864953 ]

Henry Robinson commented on ZOOKEEPER-769:

Sergey -

In the cfg files for nodes 3 and 5, did you include the following line?

peerType=observer

See http://hadoop.apache.org/zookeeper/docs/r3.3.0/zookeeperObservers.html for details.

The observer log contains this line:

2010-05-06 22:46:00,876 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2183:quorump...@642] - FOLLOWING

which is a big red flag, because observers should never adopt the FOLLOWING state. If I don't have that line I can reproduce your issue. If I add it, the observers work as expected. Can you check your cfg files?

cheers,
Henry

Leader can treat observers as quorum members
Key: ZOOKEEPER-769
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-769
Project: Zookeeper
Issue Type: Bug
Affects Versions: 3.3.0
Environment: Ubuntu Karmic x64
Reporter: Sergey Doroshenko
Fix For: 3.3.0
Attachments: follower.log, leader.log, observer.log, zoo1.cfg

In short: it seems the leader can treat observers as quorum members.

Steps to repro:
1. Server configuration: 3 voters, 2 observers (attached).
2. Bring up 2 voters and one observer. That is enough for quorum.
3. Shut down the quorum member that is the follower.

As I understand it, the expected result is that the leader starts a new election round in order to regain quorum. In reality, it just says goodbye to that follower and remains operable. (When I shut down the third one, the observer, the leader starts trying to regain a quorum.)

(As expected, if in step 3 we shut down the leader rather than the follower, the remaining follower starts a new leader election, as it should.)
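The check Henry asks for can be sketched in a few lines of Python. This is only an illustration, not ZooKeeper code: the helper names (parse_cfg, observer_misconfigured, has_quorum) are hypothetical, the server list is the one from Sergey's original mail, and the sketch assumes, as the comment above suggests, that a node only behaves as an observer when its own cfg file carries peerType=observer in addition to the :observer suffix on its server line. The has_quorum helper shows the behaviour Sergey expected: only voters count towards the majority.

```python
# Server list from the original report; note there is no peerType line.
EXAMPLE_CFG = """# servers list
server.1=localhost:2881:3881
server.2=localhost:2882:3882
server.3=localhost:2883:3883:observer
server.4=localhost:2884:3884
server.5=localhost:2885:3885:observer
"""


def parse_cfg(text):
    """Split zoo.cfg-style text into plain settings and a server map.

    The server map is {server_id: is_observer}.
    """
    settings, servers = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.startswith("server."):
            server_id = int(key.split(".", 1)[1])
            servers[server_id] = value.split(":")[-1] == "observer"
        else:
            settings[key] = value
    return settings, servers


def observer_misconfigured(text, my_id):
    """True if my_id is listed as an observer but peerType=observer is missing."""
    settings, servers = parse_cfg(text)
    return servers.get(my_id, False) and settings.get("peerType") != "observer"


def has_quorum(servers, live_ids):
    """True if a strict majority of the voters (non-observers) is alive."""
    voters = {sid for sid, is_observer in servers.items() if not is_observer}
    return len(voters & set(live_ids)) > len(voters) // 2


if __name__ == "__main__":
    _, servers = parse_cfg(EXAMPLE_CFG)
    print("node 3 misconfigured:", observer_misconfigured(EXAMPLE_CFG, 3))
    print("quorum with 1, 2, 3 up:", has_quorum(servers, {1, 2, 3}))
    print("quorum with 1, 3 up:", has_quorum(servers, {1, 3}))
```

Running the check against the cfg file of each node tagged :observer (nodes 3 and 5 here) would show whether the peerType line is missing, which is the situation in which Henry could reproduce the report.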
[jira] Commented: (ZOOKEEPER-701) GSoC 2010: Monitoring Recipes and Web-based Administrative Interface
[ https://issues.apache.org/jira/browse/ZOOKEEPER-701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864970#action_12864970 ]

Savu Andrei commented on ZOOKEEPER-701:

I have created a wiki page for tracking my work on this project. You can find it at the following URL: http://wiki.apache.org/hadoop/ZooKeeper/GSoCMonitoringAndWebInterface

GSoC 2010: Monitoring Recipes and Web-based Administrative Interface
Key: ZOOKEEPER-701
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-701
Project: Zookeeper
Issue Type: Wish
Reporter: Henry Robinson
Assignee: Savu Andrei
Attachments: milestones.txt

Monitoring Recipes And Web-based Administrative Interface

Mentor: Patrick Hunt (ph...@apache.org)

Requirements: Modern web platform, e.g. Django. Some design or UI skills would help. Java for adding methods to ZooKeeper.

Description: ZooKeeper is a complex distributed system. Understanding how well it is running is tremendously important. Patrick Hunt has created a Django-based dashboard (see http://github.com/phunt/zookeeper_dashboard) that allows some insight into how ZooKeeper is running. This is a great foundation on which to build; however, there are improvements that could be made! This project would capture much more information from ZooKeeper, adding hooks to retrieve it where necessary, and visualise it in an appealing and useful way. Integration with Ganglia would be a definite plus.