Re: Retrying sequential znode creation

2010-10-20 Thread Ted Dunning
These corner cases are relatively rare, I would think (I personally keep
logs around for days or longer).

Would it be possible to get a partial solution in place that invokes the
current behavior if logs aren't available?

On Wed, Oct 20, 2010 at 10:42 AM, Patrick Hunt  wrote:

> Hi Ted, Mahadev is in the best position to comment (he looked at it last),
> but IIRC when we started looking into implementing this we immediately ran
> into some big questions. One was what to do if the logs had been cleaned up
> and the individual transactions were no longer available. This could be
> overcome by changes to cleanup, log rotation, etc. There was another, more
> bulletproof option: essentially, keep in memory all the changes that might
> be necessary to implement 22. However, this might mean a significant
> increase in memory requirements and general bookkeeping. It turned out
> (again, correct me if I'm wrong) that more thought was going to be
> necessary, especially around ensuring correct operation in any/all special
> cases.
>
> Patrick
>
> On Wed, Oct 13, 2010 at 12:49 PM, Ted Dunning wrote:
>
> > Patrick,
> >
> > What are these hurdles?  The last comment on ZK-22 was last winter.  Back
> > then, it didn't sound like it was going to be that hard.
> >
> > On Wed, Oct 13, 2010 at 12:08 PM, Patrick Hunt  wrote:
> >
> > > 22 would help with this issue
> > > https://issues.apache.org/jira/browse/ZOOKEEPER-22
> > > however there are some real hurdles to implementing 22 successfully.
> > >
> >
>


RE: Digest user ACL check failing

2010-10-20 Thread Fournier, Camille F. [Tech]
Already did. I think it's for any znode, given the way the bug presents.
https://issues.apache.org/jira/browse/ZOOKEEPER-904
I have a test and fix for this (described in the tracker); if you agree this
is a bug, I will attach it.

C


-----Original Message-----
From: Patrick Hunt [mailto:ph...@apache.org] 
Sent: Wednesday, October 20, 2010 2:08 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: Digest user ACL check failing

Sounds like it might be a bug, was this just for the root or for any znode?
Please file a JIRA, thanks.

Patrick

On Tue, Oct 19, 2010 at 1:01 PM, Fournier, Camille F. [Tech] <
camille.fourn...@gs.com> wrote:

> The ZK documentation says:
> New in 3.2: Enables a ZooKeeper ensemble administrator to access the znode
> hierarchy as a "super" user. In particular no ACL checking occurs for a user
> authenticated as super.
>
> However, in some testing today I created a digest user, logged in as this
> user, set the ACLs for "/" to Ids.READ_ACL_UNSAFE, and now even when I am
> logged in as the superuser, I cannot actually change this ACL or write nodes
> below it on the tree. So it does not actually seem to be the case that
> "super" skips ACL checks. Is this a bug or a feature?
>
> Thanks,
> Camille
>
>
>
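
For anyone trying to reproduce this, here is a minimal sketch of how a digest id such as the "super" entry is usually derived. It assumes the digest scheme stores ids as user:base64(sha1("user:password")) and that the server reads the superuser digest from the zookeeper.DigestAuthenticationProvider.superDigest system property; please check both against your ZooKeeper version. The class name, method name and password below are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class SuperDigestSketch {

    /**
     * Builds "user:base64(sha1(user:password))" from "user:password".
     * Assumption: this mirrors the digest scheme's stored form.
     */
    static String digestFor(String idPassword) {
        int colon = idPassword.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected user:password");
        }
        try {
            byte[] sha1 = MessageDigest.getInstance("SHA-1")
                    .digest(idPassword.getBytes(StandardCharsets.UTF_8));
            return idPassword.substring(0, colon) + ":"
                    + Base64.getEncoder().encodeToString(sha1);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical password, for illustration only.
        String digest = digestFor("super:example-password");
        // Assumed server-side flag that would carry this value.
        System.out.println("-Dzookeeper.DigestAuthenticationProvider.superDigest=" + digest);
    }
}
```

A client would then authenticate with the plain "super:example-password" pair under the digest scheme before retrying the setACL on "/".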


Re: zxid integer overflow

2010-10-20 Thread Patrick Hunt
I'm not aware of anyone sustaining 1k/sec. Ben might know how long the 20k/sec
test runs for (and for how long that rate is sustained). You'd definitely want
to tune the GC; GC-related pauses would be the biggest obstacle for this
(assuming you are using a dedicated log device for the transaction logs).

Patrick

On Tue, Oct 19, 2010 at 3:14 PM, Sandy Pratt  wrote:

> Follow up question: does anyone have a production cluster that handles a
> similar sustained rate of changes?
>
> -----Original Message-----
> From: Benjamin Reed [mailto:br...@yahoo-inc.com]
> Sent: Tuesday, October 19, 2010 2:53 PM
> To: zookeeper-user@hadoop.apache.org
> Subject: Re: zxid integer overflow
>
>  we should put in a test for that. it is certainly a plausible scenario. in
> theory it will just flow into the next epoch and everything will be fine,
> but we should try it and see.
>
> ben
>
> On 10/19/2010 11:33 AM, Sandy Pratt wrote:
> > Just as a thought experiment, I was pondering the following:
> >
> > ZK stamps each change to its managed state with a zxid
> > (http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperInternals.html).
> > That ID consists of a 64 bit number in which the upper 32 bits are the
> > epoch, which changes when the leader does, and the bottom 32 bits are a
> > counter, which is incremented by the leader with every change.  If 1000
> > changes are made to ZK state each second (which is 1/20th of the peak rate
> > advertised), then the counter portion will roll over in 2^32 / (86400 *
> > 1000) = 49 days.
> >
> > Now, assuming that my math is correct, is this an actual concern?  For
> > example, if I'm using ZK to provide locking for a key value store that
> > handles transactions at about that rate, am I setting myself up for
> > failure?
> >
> > Thanks,
> >
> > Sandy
>
>
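
The arithmetic in Sandy's message can be checked with a short sketch. It assumes only the layout described in the thread (upper 32 bits epoch, lower 32 bits counter); the class and method names are made up for illustration and are not ZooKeeper APIs. The sample zxid 0x600193996 is taken from a server log posted elsewhere on this list.

```java
public class ZxidSketch {

    /** Upper 32 bits: the epoch, which changes when the leader does. */
    static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    /** Lower 32 bits: the per-epoch counter incremented on every change. */
    static long counterOf(long zxid) {
        return zxid & 0xFFFFFFFFL;
    }

    /** Days until a 32-bit counter wraps at a constant rate of changes. */
    static double daysUntilRollover(double changesPerSecond) {
        double counterValues = 4294967296.0; // 2^32
        double secondsPerDay = 86400.0;
        return counterValues / (changesPerSecond * secondsPerDay);
    }

    public static void main(String[] args) {
        long zxid = 0x600193996L;
        System.out.printf("epoch=%d counter=0x%x%n", epochOf(zxid), counterOf(zxid));
        System.out.printf("days to rollover at 1000/s: %.2f%n", daysUntilRollover(1000));
        System.out.printf("days to rollover at 20000/s: %.2f%n", daysUntilRollover(20000));
    }
}
```

At 1000 changes per second this works out to a little under 50 days, which matches the estimate in the thread; at the advertised 20k/sec peak the window is 20 times shorter.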


Re: Digest user ACL check failing

2010-10-20 Thread Patrick Hunt
Sounds like it might be a bug, was this just for the root or for any znode?
Please file a JIRA, thanks.

Patrick

On Tue, Oct 19, 2010 at 1:01 PM, Fournier, Camille F. [Tech] <
camille.fourn...@gs.com> wrote:

> The ZK documentation says:
> New in 3.2: Enables a ZooKeeper ensemble administrator to access the znode
> hierarchy as a "super" user. In particular no ACL checking occurs for a user
> authenticated as super.
>
> However, in some testing today I created a digest user, logged in as this
> user, set the ACLs for "/" to Ids.READ_ACL_UNSAFE, and now even when I am
> logged in as the superuser, I cannot actually change this ACL or write nodes
> below it on the tree. So it does not actually seem to be the case that
> "super" skips ACL checks. Is this a bug or a feature?
>
> Thanks,
> Camille
>
>
>


Re: Retrying sequential znode creation

2010-10-20 Thread Patrick Hunt
Hi Ted, Mahadev is in the best position to comment (he looked at it last),
but IIRC when we started looking into implementing this we immediately ran
into some big questions. One was what to do if the logs had been cleaned up
and the individual transactions were no longer available. This could be
overcome by changes to cleanup, log rotation, etc. There was another, more
bulletproof option: essentially, keep in memory all the changes that might
be necessary to implement 22. However, this might mean a significant
increase in memory requirements and general bookkeeping. It turned out
(again, correct me if I'm wrong) that more thought was going to be
necessary, especially around ensuring correct operation in any/all special
cases.

Patrick

On Wed, Oct 13, 2010 at 12:49 PM, Ted Dunning  wrote:

> Patrick,
>
> What are these hurdles?  The last comment on ZK-22 was last winter.  Back
> then, it didn't sound like it was going to be that hard.
>
> On Wed, Oct 13, 2010 at 12:08 PM, Patrick Hunt  wrote:
>
> > 22 would help with this issue
> > https://issues.apache.org/jira/browse/ZOOKEEPER-22
> > however there are some real hurdles to implementing 22 successfully.
> >
>


Re: Unusual exception

2010-10-20 Thread Patrick Hunt
EOS means that the client closed the connection (from the point of view of
the server). The server then tries to clean up by closing the socket
explicitly, which in some cases results in the debug messages you see
afterwards.

EndOfStreamException: Unable to
read additional data from client sessionid 0x0, likely client has closed
socket

Notice that the session id is 0, so either this is a ZK client that failed
before establishing a session or, more likely, it's a monitoring/4letterword
command (these never establish sessions).
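
If it helps when sifting through a noisy log, the check above can be sketched as a small filter. This is only an illustration: the class name, method name and regular expression are hypothetical, and it assumes the "sessionid 0x..." wording of the WARN lines quoted below.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EosLogSketch {

    // Matches the "sessionid 0x..." fragment of the EndOfStreamException message.
    private static final Pattern SESSION =
            Pattern.compile("sessionid (0x[0-9a-fA-F]+)");

    /** True when the line reports session id 0, i.e. no session was established. */
    static boolean isSessionless(String logLine) {
        Matcher m = SESSION.matcher(logLine);
        if (!m.find()) {
            return false; // no session id in the expected form on this line
        }
        String hex = m.group(1).substring(2); // drop the "0x" prefix
        return Long.parseUnsignedLong(hex, 16) == 0L;
    }

    public static void main(String[] args) {
        String probe = "EndOfStreamException: Unable to read additional data "
                + "from client sessionid 0x0, likely client has closed socket";
        String client = "EndOfStreamException: Unable to read additional data "
                + "from client sessionid 0x12b9d1f8b907a5d, likely client has closed socket";
        System.out.println(isSessionless(probe));
        System.out.println(isSessionless(client));
    }
}
```

Lines flagged by this filter are the ones most likely to come from monitoring probes rather than real clients.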

Patrick

On Wed, Oct 13, 2010 at 2:49 PM, Avinash Lakshman <
avinash.laksh...@gmail.com> wrote:

> I started seeing a bunch of these exceptions. What do these mean?
>
> 2010-10-13 14:01:33,426 - WARN [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:5001:nioserverc...@606] - EndOfStreamException: Unable to
> read additional data from client sessionid 0x0, likely client has closed
> socket
> 2010-10-13 14:01:33,426 - INFO [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:5001:nioserverc...@1286] - Closed socket connection for
> client /10.138.34.195:55738 (no session established for client)
> 2010-10-13 14:01:33,426 - DEBUG [CommitProcessor:1:finalrequestproces...@78]
> - Processing request:: sessionid:0x12b9d1f8b907a44 type:closeSession
> cxid:0x0 zxid:0x600193996 txntype:-11 reqpath:n/a
> 2010-10-13 14:01:33,427 - WARN [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:5001:nioserverc...@606] - EndOfStreamException: Unable to
> read additional data from client sessionid 0x12b9d1f8b907a5d, likely client
> has closed socket
> 2010-10-13 14:01:33,427 - INFO [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:5001:nioserverc...@1286] - Closed socket connection for
> client /10.138.34.195:55979 which had sessionid 0x12b9d1f8b907a5d
> 2010-10-13 14:01:33,427 - DEBUG [QuorumPeer:/0.0.0.0:5001:commitproces...@159]
> - Committing request:: sessionid:0x52b90ab45bd51af type:createSession
> cxid:0x0 zxid:0x600193cf9 txntype:-10 reqpath:n/a
> 2010-10-13 14:01:33,427 - DEBUG [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:5001:nioserverc...@1302] - ignoring exception during
> output shutdown
> java.net.SocketException: Transport endpoint is not connected
> at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
> at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
> at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1298)
> at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263)
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609)
> at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262)
> 2010-10-13 14:01:33,428 - DEBUG [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:5001:nioserverc...@1310] - ignoring exception during input
> shutdown
> java.net.SocketException: Transport endpoint is not connected
> at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
> at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
> at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1306)
> at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263)
> at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609)
> at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262)
> 2010-10-13 14:01:33,428 - WARN [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:5001:nioserverc...@606] - EndOfStreamException: Unable to
> read additional data from client sessionid 0x0, likely client has closed
> socket
> 2010-10-13 14:01:33,428 - INFO [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:5001:nioserverc...@1286] - Closed socket connection for
> client /10.138.34.195:55731 (no session established for client)
>

