[jira] Commented: (ZOOKEEPER-904) super digest is not actually acting as a full superuser
[ https://issues.apache.org/jira/browse/ZOOKEEPER-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12925323#action_12925323 ]

Hudson commented on ZOOKEEPER-904:
--
Integrated in ZooKeeper-trunk #981 (See [https://hudson.apache.org/hudson/job/ZooKeeper-trunk/981/])
ZOOKEEPER-904. super digest is not actually acting as a full superuser (Camille Fournier via mahadev)

super digest is not actually acting as a full superuser
---
Key: ZOOKEEPER-904
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-904
Project: Zookeeper
Issue Type: Bug
Components: server
Affects Versions: 3.3.1
Reporter: Camille Fournier
Assignee: Camille Fournier
Fix For: 3.3.2, 3.4.0
Attachments: ZOOKEEPER-904-332.patch, ZOOKEEPER-904.patch

The documentation states:

New in 3.2: Enables a ZooKeeper ensemble administrator to access the znode hierarchy as a super user. In particular no ACL checking occurs for a user authenticated as super.

However, if a super user does something like:

zk.setACL("/", Ids.READ_ACL_UNSAFE, -1);

the super user is now bound by the read-only ACL. This is not what I would expect to see given the documentation. It can be fixed by moving the check for the super authId in PrepRequestProcessor.checkACL to before the for(ACL a : acl) loop.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
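The proposed fix can be illustrated with a minimal sketch. This is not the actual PrepRequestProcessor code: the class AclCheckSketch, its nested Acl and AuthId types, and the permission constants are all hypothetical stand-ins, and the matching logic is deliberately simplified. The only point it shows is the ordering: the super check runs before the loop over ACL entries, so the contents of the ACL cannot lock the superuser out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for an ACL check; not ZooKeeper's real classes.
final class AclCheckSketch {
    static final int READ = 1;
    static final int WRITE = 2;

    /** One ACL entry: a scheme/id pair and the permission bits it grants. */
    static final class Acl {
        final int perms;
        final String scheme;
        final String id;
        Acl(int perms, String scheme, String id) {
            this.perms = perms;
            this.scheme = scheme;
            this.id = id;
        }
    }

    /** One identity attached to the caller's session. */
    static final class AuthId {
        final String scheme;
        final String id;
        AuthId(String scheme, String id) {
            this.scheme = scheme;
            this.id = id;
        }
    }

    /** Returns true if a caller holding {@code ids} may perform an operation needing {@code perm}. */
    static boolean isAllowed(List<Acl> acl, int perm, List<AuthId> ids) {
        // Superuser check comes first, before any ACL entry is examined.
        for (AuthId authId : ids) {
            if (authId.scheme.equals("super")) {
                return true;
            }
        }
        for (Acl a : acl) {
            if ((a.perms & perm) == 0) {
                continue; // this entry does not grant the requested permission
            }
            if (a.scheme.equals("world") && a.id.equals("anyone")) {
                return true;
            }
            for (AuthId authId : ids) {
                if (authId.scheme.equals(a.scheme) && authId.id.equals(a.id)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Acl> readOnly = Arrays.asList(new Acl(READ, "world", "anyone"));
        List<AuthId> superUser = Arrays.asList(new AuthId("super", ""));
        List<AuthId> ordinary = Arrays.asList(new AuthId("digest", "alice:hash"));
        System.out.println(isAllowed(readOnly, WRITE, superUser)); // true
        System.out.println(isAllowed(readOnly, WRITE, ordinary));  // false
    }
}
```

If the super check instead sat inside the loop, behind the permission-bit test, a write against a read-only ACL would never reach it, which matches the behaviour described in the report.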
[jira] Updated: (ZOOKEEPER-907) Spurious KeeperErrorCode = Session moved messages
[ https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-907: --- Status: Open (was: Patch Available) Cancelling the patch - still needs a test. Ben could you get back on Vishal's question? (see latest comment) Spurious KeeperErrorCode = Session moved messages --- Key: ZOOKEEPER-907 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907 Project: Zookeeper Issue Type: Bug Affects Versions: 3.3.1 Reporter: Vishal K Assignee: Vishal K Priority: Blocker Fix For: 3.3.2, 3.4.0 Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2 The sync request does not set the session owner in Request. As a result, the leader keeps printing: 2010-07-01 10:55:36,733 - INFO [ProcessThread:-1:preprequestproces...@405] - Got user-level KeeperException when processing sessionid:0x298d3b1fa9 type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error Path:null Error:KeeperErrorCode = Session moved -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: [VOTE] ZooKeeper as TLP?
This passes, with 5 +1 votes and no -1 votes from the ZooKeeper community. I'll now forward this on to the Hadoop PMC for consideration.

Thanks,
Patrick

On Mon, Oct 25, 2010 at 3:27 PM, Benjamin Reed br...@yahoo-inc.com wrote:

+1

On 10/22/2010 02:42 PM, Patrick Hunt wrote:

Please vote as to whether you think ZooKeeper should become a top-level Apache project, as discussed previously on this list. I've included below a draft board resolution. Do folks support sending this request on to the Hadoop PMC?

Patrick

X. Establish the Apache ZooKeeper Project

WHEREAS, the Board of Directors deems it to be in the best interests of the Foundation and consistent with the Foundation's purpose to establish a Project Management Committee charged with the creation and maintenance of open-source software related to distributed system coordination for distribution at no charge to the public.

NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee (PMC), to be known as the Apache ZooKeeper Project, be and hereby is established pursuant to Bylaws of the Foundation; and be it further

RESOLVED, that the Apache ZooKeeper Project be and hereby is responsible for the creation and maintenance of software related to distributed system coordination; and be it further

RESOLVED, that the office of Vice President, Apache ZooKeeper be and hereby is created, the person holding such office to serve at the direction of the Board of Directors as the chair of the Apache ZooKeeper Project, and to have primary responsibility for management of the projects within the scope of responsibility of the Apache ZooKeeper Project; and be it further

RESOLVED, that the persons listed immediately below be and hereby are appointed to serve as the initial members of the Apache ZooKeeper Project:

* Patrick Hunt ph...@apache.org
* Flavio Junqueira f...@apache.org
* Mahadev Konar maha...@apache.org
* Benjamin Reed br...@apache.org
* Henry Robinson he...@apache.org

NOW, THEREFORE, BE IT FURTHER RESOLVED, that Patrick Hunt be appointed to the office of Vice President, Apache ZooKeeper, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed; and be it further

RESOLVED, that the initial Apache ZooKeeper PMC be and hereby is tasked with the creation of a set of bylaws intended to encourage open development and increased participation in the Apache ZooKeeper Project; and be it further

RESOLVED, that the Apache ZooKeeper Project be and hereby is tasked with the migration and rationalization of the Apache Hadoop ZooKeeper sub-project; and be it further

RESOLVED, that all responsibilities pertaining to the Apache Hadoop ZooKeeper sub-project encumbered upon the Apache Hadoop Project are hereafter discharged.
[jira] Commented: (ZOOKEEPER-907) Spurious KeeperErrorCode = Session moved messages
[ https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12925512#action_12925512 ] Vishal K commented on ZOOKEEPER-907: Which return code are you referring to? You will see this message in the log file of the reader. It is not passed on to the caller anywhere. Spurious KeeperErrorCode = Session moved messages --- Key: ZOOKEEPER-907 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907 Project: Zookeeper Issue Type: Bug Affects Versions: 3.3.1 Reporter: Vishal K Assignee: Vishal K Priority: Blocker Fix For: 3.3.2, 3.4.0 Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2 The sync request does not set the session owner in Request. As a result, the leader keeps printing: 2010-07-01 10:55:36,733 - INFO [ProcessThread:-1:preprequestproces...@405] - Got user-level KeeperException when processing sessionid:0x298d3b1fa9 type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error Path:null Error:KeeperErrorCode = Session moved -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-907) Spurious KeeperErrorCode = Session moved messages
[ https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12925540#action_12925540 ] Benjamin Reed commented on ZOOKEEPER-907: - ah got it. ok i was able to reproduce it: the client connects to the follower, issues a sync, the error message shows up in the log of the leader. so there is an additional bug here -- why is the client not getting the session moved error. Spurious KeeperErrorCode = Session moved messages --- Key: ZOOKEEPER-907 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907 Project: Zookeeper Issue Type: Bug Affects Versions: 3.3.1 Reporter: Vishal K Assignee: Vishal K Priority: Blocker Fix For: 3.3.2, 3.4.0 Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2 The sync request does not set the session owner in Request. As a result, the leader keeps printing: 2010-07-01 10:55:36,733 - INFO [ProcessThread:-1:preprequestproces...@405] - Got user-level KeeperException when processing sessionid:0x298d3b1fa9 type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error Path:null Error:KeeperErrorCode = Session moved -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-907) Spurious KeeperErrorCode = Session moved messages
[ https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12925541#action_12925541 ]

Vishal K commented on ZOOKEEPER-907:

It did occur to me. I thought this was by design for sync? sync() is an async call, so the caller never gets any exceptions unless a callback is specified. I might be wrong here though; I am still reading the code to understand how sync works.

Spurious KeeperErrorCode = Session moved messages
---
Key: ZOOKEEPER-907
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
Project: Zookeeper
Issue Type: Bug
Affects Versions: 3.3.1
Reporter: Vishal K
Assignee: Vishal K
Priority: Blocker
Fix For: 3.3.2, 3.4.0
Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2

The sync request does not set the session owner in Request. As a result, the leader keeps printing:

2010-07-01 10:55:36,733 - INFO [ProcessThread:-1:preprequestproces...@405] - Got user-level KeeperException when processing sessionid:0x298d3b1fa9 type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error Path:null Error:KeeperErrorCode = Session moved
[jira] Commented: (ZOOKEEPER-914) QuorumCnxManager blocks forever
[ https://issues.apache.org/jira/browse/ZOOKEEPER-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12925546#action_12925546 ]

Patrick Hunt commented on ZOOKEEPER-914:

Thanks for the bug report. I've yet to find a codebase where I couldn't find what I consider bad programming, so I don't find that a constructive comment. We're happy you've joined the community; let's all work together to address these issues. Thanks.

bq. points out to lack of failure tests for QuorumCnxManager

We can always use more testing. If you want to contribute additional patches just for testing, please do so (I'm sure if you talk with Flavio he could give you some good ideas). Notice that there are a number of tests exercising this code already (around 85% coverage). We'd need to figure out some way to simulate network failures and such, which is difficult in my experience:
https://hudson.apache.org/hudson/view/S-Z/view/ZooKeeper/job/ZooKeeper-trunk/clover/org/apache/zookeeper/server/quorum/QuorumCnxManager.html

QuorumCnxManager blocks forever
Key: ZOOKEEPER-914
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-914
Project: Zookeeper
Issue Type: Bug
Reporter: Vishal K
Assignee: Vishal K
Priority: Blocker

This was a disaster. While testing our application we ran into a scenario where a rebooted follower could not join the cluster. Further debugging showed that the follower could not join because the QuorumCnxManager on the leader was blocked for an indefinite amount of time in receiveConnection():

Thread-3 prio=10 tid=0x7fa920005800 nid=0x11bb runnable [0x7fa9275ed000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
at sun.nio.ch.IOUtil.read(IOUtil.java:206)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
- locked 0x7fa93315f988 (a java.lang.Object)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:210)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:501)

I had pointed out this bug along with several other problems in QuorumCnxManager earlier in https://issues.apache.org/jira/browse/ZOOKEEPER-900 and https://issues.apache.org/jira/browse/ZOOKEEPER-822. I forgot to patch this one as a part of ZOOKEEPER-822. I am working on a fix and a patch will be out soon.

The problem is that QuorumCnxManager is using SocketChannel in blocking mode. It does a read() in receiveConnection() and a write() in initiateConnection(). Sorry, but this is really bad programming. Also, points out to lack of failure tests for QuorumCnxManager.
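To make the failure mode concrete, here is a minimal sketch of one common way to keep an accept-and-read step from blocking forever: putting a read timeout on the accepted socket. It is an illustration under stated assumptions, not the patch attached to this issue. It uses plain java.net sockets rather than the SocketChannel the report mentions, and the class BoundedReadSketch, its method names, the 8-byte peer id and the -1 sentinel are all hypothetical.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical sketch: bound the blocking read performed right after accept().
final class BoundedReadSketch {

    /**
     * Accepts one connection and tries to read an 8-byte peer id from it.
     * Returns the id, or -1 (a sentinel chosen for this sketch) if the peer
     * sends nothing within timeoutMs.
     */
    static long readPeerId(ServerSocket listener, int timeoutMs) throws IOException {
        try (Socket peer = listener.accept()) {
            peer.setSoTimeout(timeoutMs); // every blocking read on this socket is now bounded
            DataInputStream in = new DataInputStream(peer.getInputStream());
            try {
                return in.readLong();
            } catch (SocketTimeoutException e) {
                return -1L; // drop the silent peer so the listener can serve the next one
            }
        }
    }

    /** Demo: a client connects but never writes, as a half-dead peer might. */
    static long demoSilentPeer() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket silent = new Socket(loopback, listener.getLocalPort())) {
            return readPeerId(listener, 200);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoSilentPeer()); // -1: the read timed out instead of hanging
    }
}
```

Without the setSoTimeout call, readLong() in this sketch would wait for as long as the silent peer keeps the connection open, which is the same shape of hang as the stack trace above.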
[jira] Updated: (ZOOKEEPER-914) QuorumCnxManager blocks forever
[ https://issues.apache.org/jira/browse/ZOOKEEPER-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated ZOOKEEPER-914: --- Component/s: server quorum Fix Version/s: 3.4.0 3.3.3 QuorumCnxManager blocks forever Key: ZOOKEEPER-914 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-914 Project: Zookeeper Issue Type: Bug Components: quorum, server Reporter: Vishal K Assignee: Vishal K Priority: Blocker Fix For: 3.3.3, 3.4.0 This was a disaster. While testing our application we ran into a scenario where a rebooted follower could not join the cluster. Further debugging showed that the follower could not join because the QuorumCnxManager on the leader was blocked for indefinite amount of time in receiveConnect() Thread-3 prio=10 tid=0x7fa920005800 nid=0x11bb runnable [0x7fa9275ed000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233) at sun.nio.ch.IOUtil.read(IOUtil.java:206) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236) - locked 0x7fa93315f988 (a java.lang.Object) at org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:210) at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:501) I had pointed out this bug along with several other problems in QuorumCnxManager earlier in https://issues.apache.org/jira/browse/ZOOKEEPER-900 and https://issues.apache.org/jira/browse/ZOOKEEPER-822. I forgot to patch this one as a part of ZOOKEEPER-822. I am working on a fix and a patch will be out soon. The problem is that QuorumCnxManager is using SocketChannel in blocking mode. It does a read() in receiveConnection() and a write() in initiateConnection(). Sorry, but this is really bad programming. Also, points out to lack of failure tests for QuorumCnxManager. 
[jira] Commented: (ZOOKEEPER-885) Zookeeper drops connections under moderate IO load
[ https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12925598#action_12925598 ]

Alexandre Hardy commented on ZOOKEEPER-885:
---
Hi Flavio,

I've set up some ec2 instances to reproduce the problem. I think the problem is related to relative disk performance and load. I have had to use a more aggressive disk benchmark utility to get the problem to occur, and I realise that this is contrary to the zookeeper requirements. However, I think we would like to know why a ping would be affected when no disk access is expected.

Can we discuss access to these instances via e-mail?

Kind regards
Alexandre

Zookeeper drops connections under moderate IO load
--
Key: ZOOKEEPER-885
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
Project: Zookeeper
Issue Type: Bug
Components: server
Affects Versions: 3.2.2, 3.3.1
Environment: Debian (Lenny) 1Gb RAM swap disabled 100Mb heap for zookeeper
Reporter: Alexandre Hardy
Priority: Critical
Attachments: benchmark.csv, tracezklogs.tar.gz, tracezklogs.tar.gz, WatcherTest.java, zklogs.tar.gz

A zookeeper server under minimum load, with a number of clients watching exactly one node, will fail to maintain the connection when the machine is subjected to moderate IO load.

In a specific test example we had three zookeeper servers running on dedicated machines with 45 clients connected, watching exactly one node. The clients would disconnect after moderate load was added to each of the zookeeper servers with the command:

{noformat}
dd if=/dev/urandom of=/dev/mapper/nimbula-test
{noformat}

The {{dd}} command transferred data at a rate of about 4Mb/s. The same thing happens with

{noformat}
dd if=/dev/zero of=/dev/mapper/nimbula-test
{noformat}

It seems strange that such a moderate load should cause instability in the connection. Very few other processes were running; the machines were set up to test the connection instability we have experienced. Clients performed no other read or mutation operations.

Although the documents state that minimal competing IO load should be present on the zookeeper server, it seems reasonable that moderate IO should not cause problems in this case.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-912) ZooKeeper client logs trace and debug messages at level INFO
[ https://issues.apache.org/jira/browse/ZOOKEEPER-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12925699#action_12925699 ] Anthony Urso commented on ZOOKEEPER-912: I don't think you understand the problem. log4j has different log levels for different severities. Debug messages should be logged at level debug. Warnings should be logged at level warn. Errors should be logged at level error. Zookeeper logs nearly everything at least at level info, regardless of severity. This leads to a situation where the uninformative debug message: 10/10/27 21:05:09 INFO zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session Is logged at the same level as the should-be-way-more-noticeable error message: 10/10/27 21:05:09 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect I can use log4j properties to turn off info level messages, but then I won't see those warnings and errors. If you truly feel that these log messages should be all-or-nothing, why not get rid of log4j entirely and use System.out.println? ZooKeeper client logs trace and debug messages at level INFO Key: ZOOKEEPER-912 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-912 Project: Zookeeper Issue Type: Improvement Components: java client Affects Versions: 3.3.1 Reporter: Anthony Urso Assignee: Anthony Urso Priority: Minor Fix For: 3.4.0 Attachments: zk-loglevel.patch ZK logs a lot of uninformative trace and debug messages to level INFO. This fuzzes up everything and makes it easy to miss useful log info. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
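The filtering argument in the comment above can be shown with a small sketch. To stay self-contained it uses java.util.logging from the JDK as a stand-in for log4j, so the level names differ (FINE and WARNING here, debug and warn in log4j); the class LogLevelSketch and its run method are hypothetical. With the threshold set to WARNING, a problem report logged at INFO is filtered out along with the chatter, whereas logging each message at its own severity lets the warning through.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical sketch using java.util.logging as a stand-in for log4j.
final class LogLevelSketch {

    /** Collects the messages that survive level filtering. */
    static final class Collector extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                messages.add(record.getMessage());
            }
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    /**
     * Logs one routine message and one problem report with the threshold at WARNING.
     * If useSeverity is false, both go out at INFO (the behaviour complained about);
     * if true, each is logged at a level matching its severity.
     */
    static List<String> run(boolean useSeverity) {
        Logger log = Logger.getAnonymousLogger();
        log.setUseParentHandlers(false); // keep the demo quiet on the console
        log.setLevel(Level.WARNING);     // the operator only wants warnings and errors
        Collector collector = new Collector();
        log.addHandler(collector);

        log.log(useSeverity ? Level.FINE : Level.INFO,
                "Socket connection established, initiating session");
        log.log(useSeverity ? Level.WARNING : Level.INFO,
                "Unable to read additional data from server, closing socket connection");
        return collector.messages;
    }

    public static void main(String[] args) {
        System.out.println(run(false).size()); // 0: the problem report is lost with the noise
        System.out.println(run(true).size());  // 1: only the warning gets through
    }
}
```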