Re: [VOTE] ZooKeeper as TLP?

2010-10-22 Thread Henry Robinson
+1

On 22 October 2010 14:53, Mahadev Konar  wrote:

> +1
>
> On 10/22/10 2:42 PM, "Patrick Hunt"  wrote:
>
> > Please vote as to whether you think ZooKeeper should become a
> > top-level Apache project, as discussed previously on this list. I've
> > included below a draft board resolution.
> >
> > Do folks support sending this request on to the Hadoop PMC?
> >
> > Patrick
> >
> > 
> >
> > X. Establish the Apache ZooKeeper Project
> >
> >WHEREAS, the Board of Directors deems it to be in the best
> >interests of the Foundation and consistent with the
> >Foundation's purpose to establish a Project Management
> >Committee charged with the creation and maintenance of
> >open-source software related to distributed system coordination
> >for distribution at no charge to the public.
> >
> >NOW, THEREFORE, BE IT RESOLVED, that a Project Management
> >Committee (PMC), to be known as the "Apache ZooKeeper Project",
> >be and hereby is established pursuant to Bylaws of the
> >Foundation; and be it further
> >
> >RESOLVED, that the Apache ZooKeeper Project be and hereby is
> >responsible for the creation and maintenance of software
> >related to distributed system coordination; and be it further
> >
> >RESOLVED, that the office of "Vice President, Apache ZooKeeper" be
> >and hereby is created, the person holding such office to
> >serve at the direction of the Board of Directors as the chair
> >of the Apache ZooKeeper Project, and to have primary responsibility
> >for management of the projects within the scope of
> >responsibility of the Apache ZooKeeper Project; and be it further
> >
> >RESOLVED, that the persons listed immediately below be and
> >hereby are appointed to serve as the initial members of the
> >Apache ZooKeeper Project:
> >
> >  * Patrick Hunt 
> >  * Flavio Junqueira 
> >  * Mahadev Konar
> >  * Benjamin Reed
> >  * Henry Robinson   
> >
> >NOW, THEREFORE, BE IT FURTHER RESOLVED, that Patrick Hunt
> >be appointed to the office of Vice President, Apache ZooKeeper, to
> >serve in accordance with and subject to the direction of the
> >Board of Directors and the Bylaws of the Foundation until
> >death, resignation, retirement, removal or disqualification,
> >or until a successor is appointed; and be it further
> >
> >RESOLVED, that the initial Apache ZooKeeper PMC be and hereby is
> >tasked with the creation of a set of bylaws intended to
> >encourage open development and increased participation in the
> >Apache ZooKeeper Project; and be it further
> >
> >RESOLVED, that the Apache ZooKeeper Project be and hereby
> >is tasked with the migration and rationalization of the Apache
> >Hadoop ZooKeeper sub-project; and be it further
> >
> >RESOLVED, that all responsibilities pertaining to the Apache
> >Hadoop ZooKeeper sub-project encumbered upon the
> >Apache Hadoop Project are hereafter discharged.
> >
>
>


-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: [VOTE] ZooKeeper as TLP?

2010-10-22 Thread Mahadev Konar
+1

On 10/22/10 2:42 PM, "Patrick Hunt"  wrote:

> Please vote as to whether you think ZooKeeper should become a
> top-level Apache project, as discussed previously on this list. I've
> included below a draft board resolution.
> 
> Do folks support sending this request on to the Hadoop PMC?
> 
> Patrick
> 
> 
> 
> X. Establish the Apache ZooKeeper Project
> 
>WHEREAS, the Board of Directors deems it to be in the best
>interests of the Foundation and consistent with the
>Foundation's purpose to establish a Project Management
>Committee charged with the creation and maintenance of
>open-source software related to distributed system coordination
>for distribution at no charge to the public.
> 
>NOW, THEREFORE, BE IT RESOLVED, that a Project Management
>Committee (PMC), to be known as the "Apache ZooKeeper Project",
>be and hereby is established pursuant to Bylaws of the
>Foundation; and be it further
> 
>RESOLVED, that the Apache ZooKeeper Project be and hereby is
>responsible for the creation and maintenance of software
>related to distributed system coordination; and be it further
> 
>RESOLVED, that the office of "Vice President, Apache ZooKeeper" be
>and hereby is created, the person holding such office to
>serve at the direction of the Board of Directors as the chair
>of the Apache ZooKeeper Project, and to have primary responsibility
>for management of the projects within the scope of
>responsibility of the Apache ZooKeeper Project; and be it further
> 
>RESOLVED, that the persons listed immediately below be and
>hereby are appointed to serve as the initial members of the
>Apache ZooKeeper Project:
> 
>  * Patrick Hunt 
>  * Flavio Junqueira 
>  * Mahadev Konar
>  * Benjamin Reed
>  * Henry Robinson   
> 
>NOW, THEREFORE, BE IT FURTHER RESOLVED, that Patrick Hunt
>be appointed to the office of Vice President, Apache ZooKeeper, to
>serve in accordance with and subject to the direction of the
>Board of Directors and the Bylaws of the Foundation until
>death, resignation, retirement, removal or disqualification,
>or until a successor is appointed; and be it further
> 
>RESOLVED, that the initial Apache ZooKeeper PMC be and hereby is
>tasked with the creation of a set of bylaws intended to
>encourage open development and increased participation in the
>Apache ZooKeeper Project; and be it further
> 
>RESOLVED, that the Apache ZooKeeper Project be and hereby
>is tasked with the migration and rationalization of the Apache
>Hadoop ZooKeeper sub-project; and be it further
> 
>RESOLVED, that all responsibilities pertaining to the Apache
>Hadoop ZooKeeper sub-project encumbered upon the
>Apache Hadoop Project are hereafter discharged.
> 



[VOTE] ZooKeeper as TLP?

2010-10-22 Thread Patrick Hunt
Please vote as to whether you think ZooKeeper should become a
top-level Apache project, as discussed previously on this list. I've
included below a draft board resolution.

Do folks support sending this request on to the Hadoop PMC?

Patrick



X. Establish the Apache ZooKeeper Project

   WHEREAS, the Board of Directors deems it to be in the best
   interests of the Foundation and consistent with the
   Foundation's purpose to establish a Project Management
   Committee charged with the creation and maintenance of
   open-source software related to distributed system coordination
   for distribution at no charge to the public.

   NOW, THEREFORE, BE IT RESOLVED, that a Project Management
   Committee (PMC), to be known as the "Apache ZooKeeper Project",
   be and hereby is established pursuant to Bylaws of the
   Foundation; and be it further

   RESOLVED, that the Apache ZooKeeper Project be and hereby is
   responsible for the creation and maintenance of software
   related to distributed system coordination; and be it further

   RESOLVED, that the office of "Vice President, Apache ZooKeeper" be
   and hereby is created, the person holding such office to
   serve at the direction of the Board of Directors as the chair
   of the Apache ZooKeeper Project, and to have primary responsibility
   for management of the projects within the scope of
   responsibility of the Apache ZooKeeper Project; and be it further

   RESOLVED, that the persons listed immediately below be and
   hereby are appointed to serve as the initial members of the
   Apache ZooKeeper Project:

 * Patrick Hunt 
 * Flavio Junqueira 
 * Mahadev Konar
 * Benjamin Reed
 * Henry Robinson   

   NOW, THEREFORE, BE IT FURTHER RESOLVED, that Patrick Hunt
   be appointed to the office of Vice President, Apache ZooKeeper, to
   serve in accordance with and subject to the direction of the
   Board of Directors and the Bylaws of the Foundation until
   death, resignation, retirement, removal or disqualification,
   or until a successor is appointed; and be it further

   RESOLVED, that the initial Apache ZooKeeper PMC be and hereby is
   tasked with the creation of a set of bylaws intended to
   encourage open development and increased participation in the
   Apache ZooKeeper Project; and be it further

   RESOLVED, that the Apache ZooKeeper Project be and hereby
   is tasked with the migration and rationalization of the Apache
   Hadoop ZooKeeper sub-project; and be it further

   RESOLVED, that all responsibilities pertaining to the Apache
   Hadoop ZooKeeper sub-project encumbered upon the
   Apache Hadoop Project are hereafter discharged.


[jira] Commented: (ZOOKEEPER-805) four letter words fail with latest ubuntu nc.openbsd

2010-10-22 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924037#action_12924037
 ] 

Patrick Hunt commented on ZOOKEEPER-805:


Hi Mahadev, I don't think that's necessary given you can fall back to 
"traditional" nc, or you can use the -q option as suggested by akovi.

On my ubuntu system (lucid/maverick) I have two executables; nc.openbsd and 
nc.traditional. "nc" links to openbsd version by default.

Honestly I'm not sure why this is no longer working, given that we addressed 
the "nc closes input first" problem in ZOOKEEPER-737.
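
If nc behaviour is in doubt, the four letter words can also be exercised 
without it. Below is a minimal Java sketch, not part of ZooKeeper: the class 
name FourLetterWord and the method send are made up for illustration, and it 
assumes a server is listening on the given host and port. It writes the 
command, keeps its own side of the connection open, and reads until the 
server closes the socket.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not a ZooKeeper class.
public class FourLetterWord {
    /** Sends a four-letter command (e.g. "stat") and returns the server's reply. */
    public static String send(String host, int port, String cmd, int timeoutMs)
            throws IOException {
        if (cmd.length() != 4) {
            throw new IllegalArgumentException("expected a four-letter command: " + cmd);
        }
        try (Socket sock = new Socket()) {
            sock.connect(new InetSocketAddress(host, port), timeoutMs);
            sock.setSoTimeout(timeoutMs);

            OutputStream out = sock.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Do not shut down the output side; just read until the server closes.
            InputStream in = sock.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
        }
    }
}
```

Calling FourLetterWord.send("localhost", 2181, "stat", 5000) would then 
correspond to the "echo stat|nc localhost 2181" check from the issue 
description.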

> four letter words fail with latest ubuntu nc.openbsd
> 
>
> Key: ZOOKEEPER-805
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-805
> Project: Zookeeper
> Issue Type: Bug
> Components: documentation, server
> Affects Versions: 3.3.1, 3.4.0
> Reporter: Patrick Hunt
> Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
>
> In both 3.3 branch and trunk "echo stat|nc localhost 2181" fails against the 
> ZK server on Ubuntu Lucid Lynx.
> I noticed this after upgrading to lucid lynx - which is now shipping openbsd 
> nc as the default:
> OpenBSD netcat (Debian patchlevel 1.89-3ubuntu2)
> vs nc traditional
> [v1.10-38]
> which works fine. Not sure if this is a bug in us or nc.openbsd, but it's 
> currently not working for me. Ugh.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-805) four letter words fail with latest ubuntu nc.openbsd

2010-10-22 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924021#action_12924021
 ] 

Mahadev konar commented on ZOOKEEPER-805:
-

Pat,
   You think this should go into 3.3.2?


> four letter words fail with latest ubuntu nc.openbsd
> 
>
> Key: ZOOKEEPER-805
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-805
> Project: Zookeeper
> Issue Type: Bug
> Components: documentation, server
> Affects Versions: 3.3.1, 3.4.0
> Reporter: Patrick Hunt
> Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
>
> In both 3.3 branch and trunk "echo stat|nc localhost 2181" fails against the 
> ZK server on Ubuntu Lucid Lynx.
> I noticed this after upgrading to lucid lynx - which is now shipping openbsd 
> nc as the default:
> OpenBSD netcat (Debian patchlevel 1.89-3ubuntu2)
> vs nc traditional
> [v1.10-38]
> which works fine. Not sure if this is a bug in us or nc.openbsd, but it's 
> currently not working for me. Ugh.




[jira] Commented: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-10-22 Thread Thomas Koch (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923986#action_12923986
 ] 

Thomas Koch commented on ZOOKEEPER-909:
---

Hi Benjamin,

you mean I should add a class javadoc for ClientCnxnSocket and a javadoc for 
the socket property in ClientCnxn.SendThread? You're right. However, I have 
not yet finished thinking about an elegant structure for the classes 
ClientCnxn, SendThread and ClientCnxnSocket. I believe that the 
ClientCnxnSocket class won't remain for long as it is in this patch.
For example, SendThread and ClientCnxn have a circular dependency, which I 
really don't like. Also, both classes work on the shared properties 
incomingBuffer and outgoingBuffer, which is suboptimal.
So I'd like to ask for forgiveness for sparse (or nonexistent) documentation 
until we settle on a final design.

I also want to start learning the server code now to see whether it makes 
sense to generalize certain things.

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
> Issue Type: Sub-task
> Components: java client
> Reporter: Thomas Koch
> Assignee: Thomas Koch
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extracts all NIO-specific 
> code into the class ClientCnxnSocketNIO, which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so tonight.
> It would be nice if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!




[jira] Assigned: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-10-22 Thread Thomas Koch (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Koch reassigned ZOOKEEPER-909:
-

Assignee: Thomas Koch  (was: Patrick Hunt)

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
> Issue Type: Sub-task
> Components: java client
> Reporter: Thomas Koch
> Assignee: Thomas Koch
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extracts all NIO-specific 
> code into the class ClientCnxnSocketNIO, which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so tonight.
> It would be nice if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!




[jira] Commented: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-10-22 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923975#action_12923975
 ] 

Patrick Hunt commented on ZOOKEEPER-909:


Thomas, I assigned the jira to you because you're doing most/all the work to 
get this done, not as a work token. I believe you should get the credit when 
this patch gets committed.

Typically we use assignment (esp when a patch gets committed) to credit the 
author - that's one of the criteria we monitor when deciding on new committers 
(number and quality of patches, testing, conformance to community style 
guidelines, etc.)

Feel free to reassign this to yourself (please).

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
> Issue Type: Sub-task
> Components: java client
> Reporter: Thomas Koch
> Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extracts all NIO-specific 
> code into the class ClientCnxnSocketNIO, which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so tonight.
> It would be nice if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!




[jira] Updated: (ZOOKEEPER-904) super digest is not actually acting as a full superuser

2010-10-22 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-904:
---

Status: Patch Available  (was: Open)

Thanks for the patch, feel free to click "submit patch" once you have a patch 
ready to go. It transitions the workflow and lets us (committers) know to 
review your patch.

> super digest is not actually acting as a full superuser
> ---
>
> Key: ZOOKEEPER-904
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-904
> Project: Zookeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.3.1
> Reporter: Camille Fournier
> Assignee: Camille Fournier
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-904.patch
>
>
> The documentation states:
> New in 3.2:  Enables a ZooKeeper ensemble administrator to access the znode 
> hierarchy as a "super" user. In particular no ACL checking occurs for a user 
> authenticated as super.
> However, if a super user does something like:
> zk.setACL("/", Ids.READ_ACL_UNSAFE, -1);
> the super user is now bound by read-only ACL. This is not what I would expect 
> to see given the documentation. It can be fixed by moving the check for the 
> "super" authId in PrepRequestProcessor.checkACL to before the for(ACL a : 
> acl) loop.
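
The reordering described in the issue can be sketched roughly as follows. 
This is an illustration under assumptions, not the attached patch: the types 
Id and Acl and the PERM_* constants are simplified stand-ins invented here, 
and the real PrepRequestProcessor.checkACL has a different signature. The 
only point is the ordering: look for the "super" auth id before walking the 
ACL list, so that a read-only ACL cannot lock the super user out.

```java
import java.util.List;

// Simplified, hypothetical model; these are not ZooKeeper's own classes.
public class SuperAclSketch {
    public static final int PERM_READ = 1;
    public static final int PERM_WRITE = 2;

    /** Stand-in for an authenticated identity: scheme plus id. */
    public static final class Id {
        public final String scheme;
        public final String id;
        public Id(String scheme, String id) { this.scheme = scheme; this.id = id; }
    }

    /** Stand-in for one ACL entry: a permission bitmask granted to an identity. */
    public static final class Acl {
        public final int perms;
        public final Id id;
        public Acl(int perms, Id id) { this.perms = perms; this.id = id; }
    }

    /** Returns true if a client holding authInfo may perform perm on a node with this ACL. */
    public static boolean checkAcl(List<Acl> acl, int perm, List<Id> authInfo) {
        // Super check first: it must not depend on what the node's ACL says.
        for (Id auth : authInfo) {
            if ("super".equals(auth.scheme)) {
                return true;
            }
        }
        // Ordinary users: some ACL entry must grant the permission to them.
        for (Acl a : acl) {
            if ((a.perms & perm) == 0) {
                continue;
            }
            if ("world".equals(a.id.scheme) && "anyone".equals(a.id.id)) {
                return true;
            }
            for (Id auth : authInfo) {
                if (auth.scheme.equals(a.id.scheme) && auth.id.equals(a.id.id)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

With a world:anyone read-only ACL on the node, a write check for a super 
identity passes in this sketch, while the same check for an ordinary digest 
identity is refused.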




RE: implications of netty on client connections

2010-10-22 Thread Fournier, Camille F. [Tech]
Yes, that's correct.

C

-----Original Message-----
From: Mahadev Konar [mailto:maha...@yahoo-inc.com] 
Sent: Friday, October 22, 2010 1:39 PM
To: zookeeper-dev@hadoop.apache.org
Subject: Re: implications of netty on client connections

Hi Camille,
   I am a little curious here. Does this mean you tried a single zookeeper
server with 16K clients?

Thanks
mahadev

On 10/20/10 1:07 PM, "Fournier, Camille F. [Tech]" wrote:

> Thanks Patrick, I'll look and see if I can figure out a clean change for this.
> It was the kernel limit for max number of open fds for the process that was
> where the problem shows up (not zk limit). FWIW, we tested with a process fd
> limit of 16K, and ZK performed reasonably well until the fd limit was reached,
> at which point it choked. There was a throughput degradation, but mostly going
> from 0 to 4000 connections. 4000 to 16000 was mostly flat until the sharp
> drop. For our use case it is fine to have a bit of performance loss with huge
> numbers of connections, so long as we can handle the choke, which for initial
> rollout I'm planning on just monitoring for.
> 
> C
> 
> -----Original Message-----
> From: Patrick Hunt [mailto:ph...@apache.org]
> Sent: Wednesday, October 20, 2010 2:06 PM
> To: zookeeper-dev@hadoop.apache.org
> Subject: Re: implications of netty on client connections
> 
> It may just be the case that we haven't tested sufficiently for this case
> (running out of fds) and we need to handle this better even in nio. Probably
> by cutting off "op_connect" in the selector. We should be able to do similar
> in netty.
> 
> Btw, on unix one can access the open/max fd count using this:
> http://download.oracle.com/javase/6/docs/jre/api/management/extension/com/sun/management/UnixOperatingSystemMXBean.html
> 
> 
> Secondly, are you running into a kernel limit or a zk limit? Take a look at
> this post describing 1 million concurrent connections to a box:
> http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3
> 
> specifically:
> --
> 
> During various test with lots of connections, I ended up making some
> additional changes to my sysctl.conf. This was part trial-and-error, I don't
> really know enough about the internals to make especially informed decisions
> about which values to change. My policy was to wait for things to break,
> check /var/log/kern.log and see what mysterious error was reported, then
> increase stuff that sounded sensible after a spot of googling. Here are the
> settings in place during the above test:
> 
> net.core.rmem_max = 33554432
> net.core.wmem_max = 33554432
> net.ipv4.tcp_rmem = 4096 16384 33554432
> net.ipv4.tcp_wmem = 4096 16384 33554432
> net.ipv4.tcp_mem = 786432 1048576 26777216
> net.ipv4.tcp_max_tw_buckets = 36
> net.core.netdev_max_backlog = 2500
> vm.min_free_kbytes = 65536
> vm.swappiness = 0
> net.ipv4.ip_local_port_range = 1024 65535
> 
> --
> 
> 
> I'm guessing that even with this, at some point you'll run into a limit in
> our server implementation. In particular I suspect that we may start to
> respond more slowly to pings, eventually getting so bad it would time out.
> We'd have to debug that and address (optimize).
> 
> Patrick
> 
> On Tue, Oct 19, 2010 at 7:16 AM, Fournier, Camille F. [Tech] <
> camille.fourn...@gs.com> wrote:
> 
>> Hi everyone,
>> 
>> I'm curious what the implications of using netty are going to be for the
>> case where a server gets close to its max available file descriptors. Right
>> now our somewhat limited testing has shown that a ZK server performs fine up
>> to the point when it runs out of available fds, at which point performance
>> degrades sharply and new connections get into a somewhat bad state. Is netty
>> going to enable the server to handle this situation more gracefully (or is
>> there a way to do this already that I haven't found)? Limiting connections
>> from the same client is not enough since we can potentially have far more
>> clients wanting to connect than available fds for certain use cases we might
>> consider.
>> 
>> Thanks,
>> Camille
>> 
>> 
> 
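
For reference, the open/max fd count mentioned in the quoted thread can be 
read from inside the JVM roughly as sketched below. This assumes a Sun/Oracle 
or OpenJDK runtime on a Unix-like system, where the platform MXBean 
implements com.sun.management.UnixOperatingSystemMXBean. The class name 
FdUsage and the nearLimit helper (including its headroom parameter) are made 
up for illustration and are not part of ZooKeeper.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper for illustration only; not a ZooKeeper class.
public class FdUsage {
    /** Returns {open, max} descriptor counts, or null if the JVM does not expose them. */
    public static long[] counts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null; // e.g. a non-Unix platform
    }

    /** True when fewer than 'headroom' descriptors remain; the threshold is arbitrary. */
    public static boolean nearLimit(long open, long max, long headroom) {
        return max - open < headroom;
    }
}
```

A server could poll something like this and stop accepting new connections 
while nearLimit is true, which is one possible reading of the "cut off in the 
selector" idea from the thread; whether that is the right fix is still open.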



[jira] Updated: (ZOOKEEPER-904) super digest is not actually acting as a full superuser

2010-10-22 Thread Camille Fournier (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Camille Fournier updated ZOOKEEPER-904:
---

Attachment: ZOOKEEPER-904.patch

Fix for trunk

> super digest is not actually acting as a full superuser
> ---
>
> Key: ZOOKEEPER-904
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-904
> Project: Zookeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.3.1
> Reporter: Camille Fournier
> Assignee: Camille Fournier
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-904.patch
>
>
> The documentation states:
> New in 3.2:  Enables a ZooKeeper ensemble administrator to access the znode 
> hierarchy as a "super" user. In particular no ACL checking occurs for a user 
> authenticated as super.
> However, if a super user does something like:
> zk.setACL("/", Ids.READ_ACL_UNSAFE, -1);
> the super user is now bound by read-only ACL. This is not what I would expect 
> to see given the documentation. It can be fixed by moving the check for the 
> "super" authId in PrepRequestProcessor.checkACL to before the for(ACL a : 
> acl) loop.




Re: Heisenbugs, Bohrbugs, Mandelbugs?

2010-10-22 Thread Thomas Koch
Mahadev Konar:
> Hi Thomas,
>   Could you verify this by just testing the trunk without your patch? You
> might very well be right that those tests are a little flaky.
> 
> As for the hudson builds, Nigel is working on getting the patch builds for
> zookeeper running. As soon as that gets fixed these flaky tests would show
> up more often.
> 
> Thanks
> mahadev
> 
> On 10/20/10 11:48 PM, "Thomas Koch"  wrote:
> > Hi,
> > 
> > last night I let my hudson server do 42 (sic) builds of ZooKeeper trunk.
> > One of these builds failed:
> > 
> > junit.framework.AssertionFailedError: Leader hasn't joined: 5
> > 
> > at org.apache.zookeeper.test.FLETest.testLE(FLETest.java:312)
> > 
> > I did this many builds of trunk, because in my quest to redo the client
> > netty integration step by step I made one step which resulted in 2
> > failed builds out of 8. The two failures were both:

Hi Mahadev,

as I've written, I did 42 builds of trunk overnight, of which 2 failed, 
and 8 builds of my patch during work time with 2 failures. I also did another 
round of builds of my patch last night and got only 1 failure out of 
~40 successful builds.

So I believe that the high failure rate of 2/8 from the initial round of patch 
builds is because I ran these builds during the day while other developers 
were also using other virtual machines on the same host.

Have a nice weekend,

Thomas Koch, http://www.koch.ro


Re: Heisenbugs, Bohrbugs, Mandelbugs?

2010-10-22 Thread Mahadev Konar
Hi Thomas,
  Could you verify this by just testing the trunk without your patch? You
might very well be right that those tests are a little flaky.

As for the hudson builds, Nigel is working on getting the patch builds for
zookeeper running. As soon as that gets fixed these flaky tests would show up
more often. 

Thanks
mahadev


On 10/20/10 11:48 PM, "Thomas Koch"  wrote:

> Hi,
> 
> last night I let my hudson server do 42 (sic) builds of ZooKeeper trunk. One
> of these builds failed:
> 
> junit.framework.AssertionFailedError: Leader hasn't joined: 5
> at org.apache.zookeeper.test.FLETest.testLE(FLETest.java:312)
> 
> I did this many builds of trunk, because in my quest to redo the client netty
> integration step by step I made one step which resulted in 2 failed builds out
> of 8. The two failures were both:
> 
> junit.framework.AssertionFailedError: Threads didn't join
> at org.apache.zookeeper.test.FLERestartTest.testLERestart(FLERestartTest.java:198)
> 
> I can't find any relationship between the above test and my changes. The test
> does not use the ZooKeeper client code at all. So I begin to believe that
> there are some Heisenbugs, Bohrbugs or Mandelbugs[1] in ZooKeeper that just
> happen to show up from time to time without any relationship to the current
> changes.
> 
> I'll try to investigate the cause further, maybe there is some relationship
> I've not yet found. But if my assumption holds, then these kinds of bugs
> would be a strong argument in favor of refactoring. These bugs are best found
> by cleaning the code, most importantly by implementing strict separation of
> concerns.
> 
> Wouldn't you like to set up Hudson to build ZooKeeper trunk every half hour?
> 
> [1] http://en.wikipedia.org/wiki/Unusual_software_bug
> 
> Best regards,
> 
> Thomas Koch, http://www.koch.ro
> 



Re: implications of netty on client connections

2010-10-22 Thread Mahadev Konar
Hi Camille,
   I am a little curious here. Does this mean you tried a single zookeeper
server with 16K clients?

Thanks
mahadev

On 10/20/10 1:07 PM, "Fournier, Camille F. [Tech]" wrote:

> Thanks Patrick, I'll look and see if I can figure out a clean change for this.
> It was the kernel limit for max number of open fds for the process that was
> where the problem shows up (not zk limit). FWIW, we tested with a process fd
> limit of 16K, and ZK performed reasonably well until the fd limit was reached,
> at which point it choked. There was a throughput degradation, but mostly going
> from 0 to 4000 connections. 4000 to 16000 was mostly flat until the sharp
> drop. For our use case it is fine to have a bit of performance loss with huge
> numbers of connections, so long as we can handle the choke, which for initial
> rollout I'm planning on just monitoring for.
> 
> C
> 
> -----Original Message-----
> From: Patrick Hunt [mailto:ph...@apache.org]
> Sent: Wednesday, October 20, 2010 2:06 PM
> To: zookeeper-dev@hadoop.apache.org
> Subject: Re: implications of netty on client connections
> 
> It may just be the case that we haven't tested sufficiently for this case
> (running out of fds) and we need to handle this better even in nio. Probably
> by cutting off "op_connect" in the selector. We should be able to do similar
> in netty.
> 
> Btw, on unix one can access the open/max fd count using this:
> http://download.oracle.com/javase/6/docs/jre/api/management/extension/com/sun/management/UnixOperatingSystemMXBean.html
> 
> 
> Secondly, are you running into a kernel limit or a zk limit? Take a look at
> this post describing 1 million concurrent connections to a box:
> http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3
> 
> specifically:
> --
> 
> During various test with lots of connections, I ended up making some
> additional changes to my sysctl.conf. This was part trial-and-error, I don't
> really know enough about the internals to make especially informed decisions
> about which values to change. My policy was to wait for things to break,
> check /var/log/kern.log and see what mysterious error was reported, then
> increase stuff that sounded sensible after a spot of googling. Here are the
> settings in place during the above test:
> 
> net.core.rmem_max = 33554432
> net.core.wmem_max = 33554432
> net.ipv4.tcp_rmem = 4096 16384 33554432
> net.ipv4.tcp_wmem = 4096 16384 33554432
> net.ipv4.tcp_mem = 786432 1048576 26777216
> net.ipv4.tcp_max_tw_buckets = 36
> net.core.netdev_max_backlog = 2500
> vm.min_free_kbytes = 65536
> vm.swappiness = 0
> net.ipv4.ip_local_port_range = 1024 65535
> 
> --
> 
> 
> I'm guessing that even with this, at some point you'll run into a limit in
> our server implementation. In particular I suspect that we may start to
> respond more slowly to pings, eventually getting so bad it would time out.
> We'd have to debug that and address (optimize).
> 
> Patrick
> 
> On Tue, Oct 19, 2010 at 7:16 AM, Fournier, Camille F. [Tech] <
> camille.fourn...@gs.com> wrote:
> 
>> Hi everyone,
>> 
>> I'm curious what the implications of using netty are going to be for the
>> case where a server gets close to its max available file descriptors. Right
>> now our somewhat limited testing has shown that a ZK server performs fine up
>> to the point when it runs out of available fds, at which point performance
>> degrades sharply and new connections get into a somewhat bad state. Is netty
>> going to enable the server to handle this situation more gracefully (or is
>> there a way to do this already that I haven't found)? Limiting connections
>> from the same client is not enough since we can potentially have far more
>> clients wanting to connect than available fds for certain use cases we might
>> consider.
>> 
>> Thanks,
>> Camille
>> 
>> 
> 
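
The UnixOperatingSystemMXBean linked in the quoted reply reports the open and maximum file descriptor counts for the JVM process, which is one way to watch how close a server is to the limit described above. Below is a minimal Java sketch, assuming a Unix-like host and a JVM that ships the com.sun.management extension. The class name FdHeadroom, its helper methods, and the 0.85 threshold are illustrative only and are not part of ZooKeeper.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical helper, not ZooKeeper code: reports how much of the
// process file descriptor limit is currently in use.
class FdHeadroom {

    /** Fraction of the fd limit in use, or -1.0 if this JVM/OS does not expose it. */
    static double usedFraction() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            long open = unix.getOpenFileDescriptorCount();
            long max = unix.getMaxFileDescriptorCount();
            return max > 0 ? (double) open / max : -1.0;
        }
        return -1.0;
    }

    /** True when open/max is at or above the given threshold (assumed value, e.g. 0.85). */
    static boolean nearLimit(long open, long max, double threshold) {
        return max > 0 && (double) open / max >= threshold;
    }

    public static void main(String[] args) {
        System.out.println("fd usage fraction: " + usedFraction());
    }
}
```

A monitoring thread could poll usedFraction() and start refusing or shedding connections before the kernel limit is actually hit; whether that is the right policy for a ZK server is a separate question.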



Re: (ZOOKEEPER-905) enhance zkServer.sh for easier zookeeper automation-izing

2010-10-22 Thread Mahadev Konar
Thanks for the patch, Nicholas.


On 10/20/10 1:27 PM, "Nicholas Harteau"  wrote:

> Hi there.  I submitted a patch/jira issue for zkServer.sh (ZOOKEEPER-905).
> I'm not sure what else to say about it that's not covered in the comments.
> 
> p.s. Thanks for the great software - I'm enjoying building my applications
> around it.
> 
> --
> nicholas harteau
> n...@ikami.com
> 
> 
> 
> 



[jira] Updated: (ZOOKEEPER-904) super digest is not actually acting as a full superuser

2010-10-22 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-904:


Fix Version/s: 3.4.0

> super digest is not actually acting as a full superuser
> ---
>
> Key: ZOOKEEPER-904
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-904
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1
>Reporter: Camille Fournier
>Assignee: Camille Fournier
> Fix For: 3.4.0
>
>
> The documentation states:
> New in 3.2:  Enables a ZooKeeper ensemble administrator to access the znode 
> hierarchy as a "super" user. In particular no ACL checking occurs for a user 
> authenticated as super.
> However, if a super user does something like:
> zk.setACL("/", Ids.READ_ACL_UNSAFE, -1);
> the super user is now bound by read-only ACL. This is not what I would expect 
> to see given the documentation. It can be fixed by moving the check for the 
> "super" authId in PrepRequestProcessor.checkACL to before the for(ACL a : 
> acl) loop.
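
To illustrate the ordering problem in the report, here is a simplified, self-contained Java model. It is not the ZooKeeper source: the Id and Acl classes, the permission constants, and both method names are hypothetical stand-ins, and the shape of the pre-fix logic is an assumption based on the report's wording. It only shows why checking for the "super" auth id inside the ACL loop can miss it, while checking before the loop cannot.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the check ordering described in the report.
class SuperAclSketch {
    static final int READ = 1;
    static final int WRITE = 2;

    /** Simplified stand-in for an authentication id: scheme plus identifier. */
    static final class Id {
        final String scheme;
        final String id;
        Id(String scheme, String id) { this.scheme = scheme; this.id = id; }
    }

    /** Simplified stand-in for one ACL entry: permission bits granted to an id. */
    static final class Acl {
        final int perms;
        final Id id;
        Acl(int perms, Id id) { this.perms = perms; this.id = id; }
    }

    /** "super" is only consulted while examining an entry that grants perm. */
    static boolean checkSuperInsideLoop(List<Acl> acl, int perm, List<Id> authIds) {
        for (Acl a : acl) {
            if ((a.perms & perm) == 0) continue; // entry does not grant perm
            if (a.id.scheme.equals("world") && a.id.id.equals("anyone")) return true;
            for (Id authId : authIds) {
                if (authId.scheme.equals("super")) return true;
                if (authId.scheme.equals(a.id.scheme) && authId.id.equals(a.id.id)) return true;
            }
        }
        return false;
    }

    /** "super" is checked before the ACL loop, so no ACL can lock it out. */
    static boolean checkSuperFirst(List<Acl> acl, int perm, List<Id> authIds) {
        for (Id authId : authIds) {
            if (authId.scheme.equals("super")) return true;
        }
        for (Acl a : acl) {
            if ((a.perms & perm) == 0) continue;
            if (a.id.scheme.equals("world") && a.id.id.equals("anyone")) return true;
            for (Id authId : authIds) {
                if (authId.scheme.equals(a.id.scheme) && authId.id.equals(a.id.id)) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Models the state after setACL("/", READ_ACL_UNSAFE, -1): world:anyone may read.
        List<Acl> readOnly = Arrays.asList(new Acl(READ, new Id("world", "anyone")));
        List<Id> superUser = Arrays.asList(new Id("super", "admin"));
        System.out.println(checkSuperInsideLoop(readOnly, WRITE, superUser)); // false
        System.out.println(checkSuperFirst(readOnly, WRITE, superUser));      // true
    }
}
```

With a read-only ACL no entry grants WRITE, so the inner loop that would recognize "super" is never reached in the first variant.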

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-903) Create a testing jar with useful classes from ZK test source

2010-10-22 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-903:


Fix Version/s: 3.4.0

> Create a testing jar with useful classes from ZK test source
> 
>
> Key: ZOOKEEPER-903
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-903
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: tests
>Reporter: Camille Fournier
> Fix For: 3.4.0
>
>
> From mailing list:
> -Original Message-
> From: Benjamin Reed 
> Sent: Monday, October 18, 2010 11:12 AM
> To: zookeeper-u...@hadoop.apache.org
> Subject: Re: Testing zookeeper outside the source distribution?
>   we should be exposing those classes and releasing them as a testing 
> jar. do you want to open up a jira to track this issue?
> ben
> On 10/18/2010 05:17 AM, Anthony Urso wrote:
> > Anyone have any pointers on how to test against ZK outside of the
> > source distribution? All the fun classes (e.g. ClientBase) do not make
> > it into the ZK release jar.
> >
> > Right now I am manually running a ZK node for the unit tests to
> > connect to prior to running my test, but I would rather have something
> > that ant could reliably automate starting and stopping for CI.
> >
> > Thanks,
> > Anthony




[jira] Commented: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-10-22 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923905#action_12923905
 ] 

Benjamin Reed commented on ZOOKEEPER-909:
-

this is looking really nice. i'm not done reviewing, but i did want to note 
that you need to add the zookeeper.clientCnxnSocket property to the doc. You 
should also javadoc that variable. 

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
>  Issue Type: Sub-task
>  Components: java client
>Reporter: Thomas Koch
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extracts all NIO-specific 
> code into the class ClientCnxnSocketNIO, which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so tonight.
> It would be nice if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!
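
The split described above (an abstract ClientCnxnSocket with an NIO subclass now and a Netty subclass later) combined with a system property to pick the implementation can be sketched as follows. This is a hedged illustration, not the patch itself: the nested classes are placeholders that involve no real NIO or Netty code, the property key is assumed to follow the ClientCnxnSocket class name, and defaulting to the NIO variant is also an assumption.

```java
// Hypothetical sketch of selecting a socket implementation via a system property.
class SocketSelectionSketch {
    /** Assumed property key; the patch documentation is the authority on the real name. */
    static final String PROPERTY = "zookeeper.clientCnxnSocket";

    /** Placeholder for the abstract socket class named in the issue. */
    static abstract class CnxnSocket {
        abstract String transport();
    }

    /** Placeholder for the NIO-backed implementation. */
    static class NioSocket extends CnxnSocket {
        String transport() { return "nio"; }
    }

    /** Placeholder for a future Netty-backed implementation. */
    static class NettySocket extends CnxnSocket {
        String transport() { return "netty"; }
    }

    /** Instantiates the class named by the property, falling back to the NIO placeholder. */
    static CnxnSocket create() {
        String name = System.getProperty(PROPERTY, NioSocket.class.getName());
        try {
            return (CnxnSocket) Class.forName(name).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create().transport()); // "nio" unless the property is set
    }
}
```

Keeping the choice behind one factory method means the connection class itself never needs to know which transport is in use, which matches the goal of adding the Netty class later without touching ClientCnxn again.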




[jira] Commented: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-10-22 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923895#action_12923895
 ] 

Benjamin Reed commented on ZOOKEEPER-907:
-

sync doesn't cause any additional traffic over the atomic broadcast. it just 
makes sure that all of the in-process transactions have been sent to the 
follower. when that error happens, the error will be sent back to the follower 
ordered after all of the completed transactions. so rather than being able to 
see the result of all requests initiated before the sync, the follower will see 
all requests completed before the sync. that is why i referred to it as a 
partial sync.

i'm really having problems trying to reproduce this error. can you describe 
more how it happened? i would like to have an end-to-end test rather than the 
test of a particular implementation so that this error doesn't pop up if the 
implementation changes. looking at the code it seems like it should happen 
every time the sync request is sent to a follower, but that doesn't seem to be 
the case.

> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Updated: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-10-22 Thread Thomas Koch (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Koch updated ZOOKEEPER-909:
--

Status: Open  (was: Patch Available)

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
>  Issue Type: Sub-task
>  Components: java client
>Reporter: Thomas Koch
>Assignee: Thomas Koch
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extracts all NIO-specific 
> code into the class ClientCnxnSocketNIO, which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so tonight.
> It would be nice if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!




[jira] Updated: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-10-22 Thread Thomas Koch (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Koch updated ZOOKEEPER-909:
--

Attachment: ZOOKEEPER-909.patch

add copyright blocks, replace deprecated ChannelPipelineCoverage annotation

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
>  Issue Type: Sub-task
>  Components: java client
>Reporter: Thomas Koch
>Assignee: Thomas Koch
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extracts all NIO-specific 
> code into the class ClientCnxnSocketNIO, which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so tonight.
> It would be nice if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!




[jira] Updated: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-10-22 Thread Thomas Koch (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Koch updated ZOOKEEPER-909:
--

Assignee: Patrick Hunt  (was: Thomas Koch)
  Status: Patch Available  (was: Open)

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
>  Issue Type: Sub-task
>  Components: java client
>Reporter: Thomas Koch
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extracts all NIO-specific 
> code into the class ClientCnxnSocketNIO, which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so tonight.
> It would be nice if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!




[jira] Commented: (ZOOKEEPER-800) zoo_add_auth returns ZOK if zookeeper handle is in ZOO_CLOSED_STATE

2010-10-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923814#action_12923814
 ] 

Hudson commented on ZOOKEEPER-800:
--

Integrated in ZooKeeper-trunk #976 (See 
[https://hudson.apache.org/hudson/job/ZooKeeper-trunk/976/])
ZOOKEEPER-800. zoo_add_auth returns ZOK if zookeeper handle is in 
ZOO_CLOSED_STATE (michi mutsuzaki via mahadev konar)


> zoo_add_auth returns ZOK if zookeeper handle is in ZOO_CLOSED_STATE
> ---
>
> Key: ZOOKEEPER-800
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-800
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.3.1
>Reporter: Michi Mutsuzaki
>Assignee: Michi Mutsuzaki
>Priority: Minor
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-800.patch
>
>
> This happened when I called zoo_add_auth() immediately after 
> zookeeper_init(). It took me a while to figure out that authentication 
> actually failed since zoo_add_auth() returned ZOK. It should return 
> ZINVALIDSTATE instead. 
> --Michi
