[jira] Commented: (ZOOKEEPER-904) super digest is not actually acting as a full superuser

2010-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925323#action_12925323
 ] 

Hudson commented on ZOOKEEPER-904:
--

Integrated in ZooKeeper-trunk #981 (See 
[https://hudson.apache.org/hudson/job/ZooKeeper-trunk/981/])
ZOOKEEPER-904. super digest is not actually acting as a full superuser 
(Camille Fournier via mahadev)


 super digest is not actually acting as a full superuser
 ---

 Key: ZOOKEEPER-904
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-904
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.3.1
Reporter: Camille Fournier
Assignee: Camille Fournier
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-904-332.patch, ZOOKEEPER-904.patch


 The documentation states:
 New in 3.2:  Enables a ZooKeeper ensemble administrator to access the znode 
 hierarchy as a super user. In particular no ACL checking occurs for a user 
 authenticated as super.
 However, if a super user does something like:
 zk.setACL("/", Ids.READ_ACL_UNSAFE, -1);
 the super user is now bound by the read-only ACL. This is not what I would expect 
 to see given the documentation. It can be fixed by moving the check for the 
 super authId in PrepRequestProcessor.checkACL to before the for(ACL a : 
 acl) loop.
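
A minimal sketch of the proposed reordering, using a simplified stand-in for the ACL check rather than the real PrepRequestProcessor code. The class SuperAclSketch, its nested Acl and AuthId types, and the permission constants are hypothetical names chosen for illustration; only the idea of testing for the super authId before the for(ACL a : acl) loop comes from the report.

```java
import java.util.List;

// Simplified model of an ACL check; not the real PrepRequestProcessor code.
final class SuperAclSketch {
    // Permission bits used by this sketch only.
    static final int READ = 1;
    static final int WRITE = 2;

    /** Hypothetical stand-in for one ACL entry: permission bits plus scheme and id. */
    static final class Acl {
        final int perms;
        final String scheme;
        final String id;

        Acl(int perms, String scheme, String id) {
            this.perms = perms;
            this.scheme = scheme;
            this.id = id;
        }
    }

    /** Hypothetical stand-in for an identity the session has authenticated as. */
    static final class AuthId {
        final String scheme;
        final String id;

        AuthId(String scheme, String id) {
            this.scheme = scheme;
            this.id = id;
        }
    }

    /** Returns true if any of the caller's identities may perform perm. */
    static boolean checkAcl(List<Acl> acl, int perm, List<AuthId> authIds) {
        // Super check first, before any ACL entry is consulted.
        for (AuthId authId : authIds) {
            if ("super".equals(authId.scheme)) {
                return true;
            }
        }
        // Ordinary evaluation: an entry must grant the permission and match the caller.
        for (Acl a : acl) {
            if ((a.perms & perm) == 0) {
                continue;
            }
            if ("world".equals(a.scheme) && "anyone".equals(a.id)) {
                return true;
            }
            for (AuthId authId : authIds) {
                if (authId.scheme.equals(a.scheme) && authId.id.equals(a.id)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

With the super check placed ahead of the loop, a read-only ACL on a node no longer restricts a session authenticated as super, while other sessions are still evaluated entry by entry.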

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-907) Spurious KeeperErrorCode = Session moved messages

2010-10-27 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-907:
---

Status: Open  (was: Patch Available)

Cancelling the patch - still needs a test.

Ben, could you get back to Vishal on his question? (see latest comment)

 Spurious KeeperErrorCode = Session moved messages
 ---

 Key: ZOOKEEPER-907
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
 Project: Zookeeper
  Issue Type: Bug
Affects Versions: 3.3.1
Reporter: Vishal K
Assignee: Vishal K
Priority: Blocker
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2


 The sync request does not set the session owner in Request.
 As a result, the leader keeps printing:
 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
 Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
 type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
 Path:null Error:KeeperErrorCode = Session moved




Re: [VOTE] ZooKeeper as TLP?

2010-10-27 Thread Patrick Hunt
This passes, with 5 +1 votes and no -1 votes from the ZooKeeper community.

I'll now forward this on to the Hadoop PMC for consideration.

Thanks,

Patrick

On Mon, Oct 25, 2010 at 3:27 PM, Benjamin Reed br...@yahoo-inc.com wrote:
 +1

 On 10/22/2010 02:42 PM, Patrick Hunt wrote:

 Please vote as to whether you think ZooKeeper should become a
 top-level Apache project, as discussed previously on this list. I've
 included below a draft board resolution.

 Do folks support sending this request on to the Hadoop PMC?

 Patrick

 

     X. Establish the Apache ZooKeeper Project

        WHEREAS, the Board of Directors deems it to be in the best
        interests of the Foundation and consistent with the
        Foundation's purpose to establish a Project Management
        Committee charged with the creation and maintenance of
        open-source software related to distributed system coordination
        for distribution at no charge to the public.

        NOW, THEREFORE, BE IT RESOLVED, that a Project Management
        Committee (PMC), to be known as the Apache ZooKeeper Project,
        be and hereby is established pursuant to Bylaws of the
        Foundation; and be it further

        RESOLVED, that the Apache ZooKeeper Project be and hereby is
        responsible for the creation and maintenance of software
        related to distributed system coordination; and be it further

        RESOLVED, that the office of Vice President, Apache ZooKeeper be
        and hereby is created, the person holding such office to
        serve at the direction of the Board of Directors as the chair
        of the Apache ZooKeeper Project, and to have primary responsibility
        for management of the projects within the scope of
        responsibility of the Apache ZooKeeper Project; and be it further

        RESOLVED, that the persons listed immediately below be and
        hereby are appointed to serve as the initial members of the
        Apache ZooKeeper Project:

          * Patrick Hunt <ph...@apache.org>
          * Flavio Junqueira <f...@apache.org>
          * Mahadev Konar <maha...@apache.org>
          * Benjamin Reed <br...@apache.org>
          * Henry Robinson <he...@apache.org>

        NOW, THEREFORE, BE IT FURTHER RESOLVED, that Patrick Hunt
        be appointed to the office of Vice President, Apache ZooKeeper, to
        serve in accordance with and subject to the direction of the
        Board of Directors and the Bylaws of the Foundation until
        death, resignation, retirement, removal or disqualification,
        or until a successor is appointed; and be it further

        RESOLVED, that the initial Apache ZooKeeper PMC be and hereby is
        tasked with the creation of a set of bylaws intended to
        encourage open development and increased participation in the
        Apache ZooKeeper Project; and be it further

        RESOLVED, that the Apache ZooKeeper Project be and hereby
        is tasked with the migration and rationalization of the Apache
        Hadoop ZooKeeper sub-project; and be it further

        RESOLVED, that all responsibilities pertaining to the Apache
        Hadoop ZooKeeper sub-project encumbered upon the
        Apache Hadoop Project are hereafter discharged.




[jira] Commented: (ZOOKEEPER-907) Spurious KeeperErrorCode = Session moved messages

2010-10-27 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925512#action_12925512
 ] 

Vishal K commented on ZOOKEEPER-907:


Which return code are you referring to? You will see this message in the log 
file of the leader. It is not passed on to the caller anywhere.




[jira] Commented: (ZOOKEEPER-907) Spurious KeeperErrorCode = Session moved messages

2010-10-27 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925540#action_12925540
 ] 

Benjamin Reed commented on ZOOKEEPER-907:
-

ah got it. ok i was able to reproduce it: the client connects to the follower, 
issues a sync, the error message shows up in the log of the leader. so there is 
an additional bug here -- why is the client not getting the session moved error?




[jira] Commented: (ZOOKEEPER-907) Spurious KeeperErrorCode = Session moved messages

2010-10-27 Thread Vishal K (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925541#action_12925541
 ] 

Vishal K commented on ZOOKEEPER-907:


It did occur to me. I thought this was by design for sync? sync() is an async 
call. So the caller never gets any exceptions unless a callback is specified.
I might be wrong here though, I am still reading the code to understand how 
sync works.
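
To illustrate the point about asynchronous calls, here is a small self-contained model, not the ZooKeeper client itself. FakeClient, its sync method, and the SESSION_MOVED value are hypothetical stand-ins, and the VoidCallback interface only mirrors the general shape of a result callback. It shows that when a call reports failure solely through a callback, a caller that passes none never learns about the error.

```java
import java.util.ArrayList;
import java.util.List;

final class AsyncSyncSketch {
    /** Hypothetical callback type: receives a result code instead of an exception. */
    interface VoidCallback {
        void processResult(int rc, String path, Object ctx);
    }

    // Illustrative error code for this sketch only.
    static final int SESSION_MOVED = -118;

    /** Hypothetical client whose sync() always fails on the server side. */
    static final class FakeClient {
        void sync(String path, VoidCallback cb, Object ctx) {
            int rc = SESSION_MOVED; // pretend the server rejected the request
            if (cb != null) {
                // The callback is the only channel back to the caller.
                cb.processResult(rc, path, ctx);
            }
            // No exception is thrown in either case.
        }
    }

    /** Issues one sync without a callback and one with; returns the codes the caller saw. */
    static List<Integer> run() {
        List<Integer> observed = new ArrayList<>();
        FakeClient zk = new FakeClient();
        zk.sync("/", null, null);                                // failure goes unnoticed
        zk.sync("/", (rc, path, ctx) -> observed.add(rc), null); // failure is reported
        return observed;
    }
}
```

Only the second call records a result code, which matches the behaviour described above if sync is indeed designed this way.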




[jira] Commented: (ZOOKEEPER-914) QuorumCnxManager blocks forever

2010-10-27 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925546#action_12925546
 ] 

Patrick Hunt commented on ZOOKEEPER-914:


Thanks for the bug report. I've yet to find a codebase where I couldn't find 
what I consider bad programming, so I don't find that a constructive comment. 
We're happy you've joined the community, let's all work together to address these 
issues. Thanks.

bq. points out to lack of failure tests for QuorumCnxManager

We can always use more testing. If you want to contribute additional patches 
just for testing please do so (I'm sure if you talk with Flavio he could give 
you some good ideas). Note that a number of tests already exercise this 
code (around 85% coverage); beyond that, we'd need to figure out some way to 
simulate network failures and such, which is difficult in my experience:
https://hudson.apache.org/hudson/view/S-Z/view/ZooKeeper/job/ZooKeeper-trunk/clover/org/apache/zookeeper/server/quorum/QuorumCnxManager.html

 QuorumCnxManager blocks forever 
 

 Key: ZOOKEEPER-914
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-914
 Project: Zookeeper
  Issue Type: Bug
Reporter: Vishal K
Assignee: Vishal K
Priority: Blocker

 This was a disaster. While testing our application we ran into a scenario 
 where a rebooted follower could not join the cluster. Further debugging 
 showed that the follower could not join because the QuorumCnxManager on the 
 leader was blocked for an indefinite amount of time in receiveConnection():
 Thread-3 prio=10 tid=0x7fa920005800 nid=0x11bb runnable [0x7fa9275ed000]
   java.lang.Thread.State: RUNNABLE
     at sun.nio.ch.FileDispatcher.read0(Native Method)
     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
     at sun.nio.ch.IOUtil.read(IOUtil.java:206)
     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
     - locked 0x7fa93315f988 (a java.lang.Object)
     at org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:210)
     at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:501)
 I had pointed out this bug along with several other problems in 
 QuorumCnxManager earlier in 
 https://issues.apache.org/jira/browse/ZOOKEEPER-900 and 
 https://issues.apache.org/jira/browse/ZOOKEEPER-822.
 I forgot to patch this one as a part of ZOOKEEPER-822. I am working on a fix 
 and a patch will be out soon. 
 The problem is that QuorumCnxManager is using SocketChannel in blocking mode. 
 It does a read() in receiveConnection() and a write() in initiateConnection().
 Sorry, but this is really bad programming. Also, points out to lack of 
 failure tests for QuorumCnxManager.
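
One way to keep a connection handshake from blocking indefinitely is to bound the read with a timeout. The sketch below is an assumption about how that could look with a plain java.net.Socket and setSoTimeout; it is not the patch for this issue, and BoundedReadSketch, readPeerId and demo are hypothetical names. SO_TIMEOUT governs reads made through the socket's input stream, which is why the sketch reads from getInputStream().

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class BoundedReadSketch {
    /**
     * Reads an 8-byte peer id, or returns -1 if it does not arrive within timeoutMs.
     * (-1 is a sentinel for this sketch; real code would need a distinct signal.)
     */
    static long readPeerId(Socket sock, int timeoutMs) throws IOException {
        sock.setSoTimeout(timeoutMs); // bounds each blocking read on the input stream
        try {
            return new DataInputStream(sock.getInputStream()).readLong();
        } catch (SocketTimeoutException e) {
            return -1L; // give up instead of waiting forever
        }
    }

    /**
     * Loopback demo: a peer connects and either sends idToSend or, if it is null,
     * stays silent. Returns what the accepting side read.
     */
    static long demo(Long idToSend, int timeoutMs) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket peer = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            if (idToSend != null) {
                DataOutputStream out = new DataOutputStream(peer.getOutputStream());
                out.writeLong(idToSend);
                out.flush();
            }
            return readPeerId(accepted, timeoutMs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling demo(null, 200) models a peer that connects and then goes quiet, which is the situation in which an unbounded blocking read would never return.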




[jira] Updated: (ZOOKEEPER-914) QuorumCnxManager blocks forever

2010-10-27 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-914:
---

  Component/s: server
   quorum
Fix Version/s: 3.4.0
   3.3.3

 QuorumCnxManager blocks forever 
 

 Key: ZOOKEEPER-914
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-914
 Project: Zookeeper
  Issue Type: Bug
  Components: quorum, server
Reporter: Vishal K
Assignee: Vishal K
Priority: Blocker
 Fix For: 3.3.3, 3.4.0





[jira] Commented: (ZOOKEEPER-885) Zookeeper drops connections under moderate IO load

2010-10-27 Thread Alexandre Hardy (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925598#action_12925598
 ] 

Alexandre Hardy commented on ZOOKEEPER-885:
---

Hi Flavio,

I've set up some ec2 instances to reproduce the problem. I think the
problem is related to relative disk performance and load.

I have had to use a more aggressive disk benchmark utility to get the
problem to occur, and I realise that this is contrary to the zookeeper
requirements. However, I think we would like to know why a ping would
be affected when no disk access is expected.

Can we discuss access to these instances via e-mail?

Kind regards
Alexandre


 Zookeeper drops connections under moderate IO load
 --

 Key: ZOOKEEPER-885
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.2.2, 3.3.1
 Environment: Debian (Lenny)
 1Gb RAM
 swap disabled
 100Mb heap for zookeeper
Reporter: Alexandre Hardy
Priority: Critical
 Attachments: benchmark.csv, tracezklogs.tar.gz, tracezklogs.tar.gz, 
 WatcherTest.java, zklogs.tar.gz


 A zookeeper server under minimum load, with a number of clients watching 
 exactly one node will fail to maintain the connection when the machine is 
 subjected to moderate IO load.
 In a specific test example we had three zookeeper servers running on 
 dedicated machines with 45 clients connected, watching exactly one node. The 
 clients would disconnect after moderate load was added to each of the 
 zookeeper servers with the command:
 {noformat}
 dd if=/dev/urandom of=/dev/mapper/nimbula-test
 {noformat}
 The {{dd}} command transferred data at a rate of about 4Mb/s.
 The same thing happens with
 {noformat}
 dd if=/dev/zero of=/dev/mapper/nimbula-test
 {noformat}
 It seems strange that such a moderate load should cause instability in the 
 connection.
 Very few other processes were running; the machines were set up to test the 
 connection instability we have experienced. Clients performed no other read 
 or mutation operations.
 Although the documents state that minimal competing IO load should be present on 
 the zookeeper server, it seems reasonable that moderate IO should not cause 
 problems in this case.




[jira] Commented: (ZOOKEEPER-912) ZooKeeper client logs trace and debug messages at level INFO

2010-10-27 Thread Anthony Urso (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925699#action_12925699
 ] 

Anthony Urso commented on ZOOKEEPER-912:


I don't think you understand the problem.

log4j has different log levels for different severities.  Debug messages should 
be logged at level debug. Warnings should be logged at level warn. Errors 
should be logged at level error. Zookeeper logs nearly everything at least at 
level info, regardless of severity.

This leads to a situation where the uninformative debug message:

  10/10/27 21:05:09 INFO zookeeper.ClientCnxn: Socket connection established to 
localhost/127.0.0.1:2181, initiating session

is logged at the same level as the should-be-way-more-noticeable error message:

  10/10/27 21:05:09 INFO zookeeper.ClientCnxn: Unable to read additional data 
from server sessionid 0x0, likely server has closed socket, closing socket 
connection and attempting reconnect

I can use log4j properties to turn off info level messages, but then I won't 
see those warnings and errors.  

If you truly feel that these log messages should be all-or-nothing, why not get 
rid of log4j entirely and use System.out.println?
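
The same idea in runnable form, using java.util.logging from the JDK as a stand-in for log4j so the sketch has no dependencies. LogLevelSketch and emitted are hypothetical names, and the message texts are shortened from the examples above. When the connection message is logged at a debug-like level (FINE) and the read failure at a warning level, raising the threshold to WARNING hides the former and keeps the latter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LogLevelSketch {
    /** Logs both messages at severity-appropriate levels; returns those passing threshold. */
    static List<String> emitted(Level threshold) {
        Logger log = Logger.getAnonymousLogger();
        log.setUseParentHandlers(false); // keep output out of the console handler
        log.setLevel(threshold);

        List<String> seen = new ArrayList<>();
        Handler collector = new Handler() {
            @Override public void publish(LogRecord r) {
                seen.add(r.getLevel() + ": " + r.getMessage());
            }
            @Override public void flush() {}
            @Override public void close() {}
        };
        collector.setLevel(Level.ALL); // let the logger's own level do the filtering
        log.addHandler(collector);

        // Routine connection progress: debug-like severity.
        log.fine("Socket connection established, initiating session");
        // Lost connection: something an operator should notice.
        log.warning("Unable to read additional data from server, attempting reconnect");
        return seen;
    }
}
```

Calling emitted(Level.WARNING) keeps only the reconnect warning, whereas emitted(Level.FINE) returns both messages. That separation is not available when every message is logged at INFO.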


 ZooKeeper client logs trace and debug messages at level INFO
 

 Key: ZOOKEEPER-912
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-912
 Project: Zookeeper
  Issue Type: Improvement
  Components: java client
Affects Versions: 3.3.1
Reporter: Anthony Urso
Assignee: Anthony Urso
Priority: Minor
 Fix For: 3.4.0

 Attachments: zk-loglevel.patch


 ZK logs a lot of uninformative trace and debug messages to level INFO.  This 
 fuzzes up everything and makes it easy to miss useful log info. 
