RE: Re: ephemerals handling after restart

2008-09-18 Thread Benjamin Reed
! Is the session recoverable in case the zk server was restarted in the meantime? Johannes On Sep 12, 2008, at 3:52 PM, Benjamin Reed wrote: If an application does not close the ZooKeeper session before shutting down, ZooKeeper will not clean up the session until it times out. So when an application

RE: Read Write Performance Graphs

2008-10-01 Thread Benjamin Reed
That graph is taken from a paper we will be publishing as a tech report. Here is the missing text: To show the behavior of the system over time as failures are injected we ran a ZooKeeper service made up of 7 machines. We ran the same saturation benchmark as before, but this time we kept the write

RE: What happens when a server loses all its state?

2008-12-17 Thread Benjamin Reed
Thomas, in the scenario you give you have two simultaneous failures with 3 nodes, so it will not recover correctly. A is failed because it is not up. B has failed because it lost all its data. it would be good for ZooKeeper to not come up in that scenario. perhaps what we need is something
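As a quick arithmetic check of the failure count above, here is a minimal sketch. It assumes a plain majority quorum, and the function names are made up for illustration.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 5, 7):
    print(n, quorum_size(n), tolerated_failures(n))
```

With 3 servers only one failure is tolerated, so A being down plus B having lost its data is one failure too many.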

RE: State of the command line?

2009-01-05 Thread Benjamin Reed
The command line is a very simple utility for testing and as an example of how to use the API. these are good suggestions, you should document them in a Jira. ben From: burtona...@gmail.com [burtona...@gmail.com] On Behalf Of Kevin Burton

RE: Simpler ZooKeeper event interface....

2009-01-07 Thread Benjamin Reed
when you shut down the full ensemble the session isn't expired. when things come back up your session will still be active. (it would be bad if the zk service could not survive the bounce of an ensemble.) you are way overthinking this and i fear you are not helping yourself with trying to

RE: Can ConnectionLossException be thrown when using multiple hosts?

2009-01-08 Thread Benjamin Reed
just to clarify: you also get ConnectionLossException from synchronous requests if the request cannot be sent or no response is received. ben -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Wednesday, January 07, 2009 10:16 AM To: zookeeper-user@hadoop.apache.org

RE: Sending data during NodeDataChanged or NodeCreated

2009-01-08 Thread Benjamin Reed
if you do a getData("/a", true) and then /a changes, you will get a watch event. if /a changes again, you will not get an event. so, if you want to monitor /a, you need to do a new getData() after each watch event to reregister the watch and get the new value. (re-registering watches on
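The one-shot behaviour described above can be modelled in a few lines. This is a toy in-memory sketch, not the ZooKeeper client API: `ToyZNode`, `get_data`, `set_data` and `monitor` are hypothetical names chosen for illustration.

```python
class ToyZNode:
    """Toy in-memory model of one-shot watches; NOT the ZooKeeper client API."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self._watchers = []  # callbacks waiting for the next change

    def get_data(self, watcher=None) -> bytes:
        # Passing a watcher registers it for the next change only.
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set_data(self, data: bytes) -> None:
        self.data = data
        # Fire and clear: each registered watch is delivered at most once.
        fired, self._watchers = self._watchers, []
        for callback in fired:
            callback()


def monitor(node: ToyZNode, seen: list) -> None:
    """Read the current value and re-register the watch in one step."""
    def on_change() -> None:
        monitor(node, seen)
    seen.append(node.get_data(watcher=on_change))


events = []
node = ToyZNode(b"v0")
node.get_data(watcher=lambda: events.append("changed"))
node.set_data(b"v1")  # the watch fires
node.set_data(b"v2")  # nothing is registered any more, so no event

seen = []
monitor(node, seen)   # reads v2 and registers a watch
node.set_data(b"v3")  # watch fires; monitor re-reads and re-registers
node.set_data(b"v4")  # same again
print(events, seen)
```

The first watcher sees only one event for two changes, while `monitor` keeps seeing new values because every callback issues a fresh read that carries a new watch.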

RE: Updated NodeWatcher...

2009-01-09 Thread Benjamin Reed
. Then people could suggest abstractions that would essentially put a box around sections of the diagram. However I feel woefully inadequate at the former :(. .. Adam On Thu, Jan 8, 2009 at 4:20 PM, Benjamin Reed br...@yahoo-inc.com wrote: For your first issue if an ensemble goes offline and comes

RE: Updated NodeWatcher...

2009-01-09 Thread Benjamin Reed
: Updated NodeWatcher... Ben this is great, thanks! Do you want to close out this one and point to the faq? https://issues.apache.org/jira/browse/ZOOKEEPER-264 Although IMO this should be moved to the forrest docs. Patrick Benjamin Reed wrote: I'm really bad at creating figures, but i've put up

RE: Distributed queue: how to ensure no lost items?

2009-01-12 Thread Benjamin Reed
'potentially' already been processed. That way he can double check first before he goes off and processes the message again. But adding that info in ZK might be more expensive than doing the double check every time in consumer anyways. On Thu, Jan 8, 2009 at 11:42 AM, Benjamin Reed br...@yahoo

RE: Delaying 3.1 release by 2 to 3 weeks?

2009-01-16 Thread Benjamin Reed
we should delay. it would be good to try out quotas for a bit before we do the release. quotas are also a key part of the release. 3 weeks seem a little long though. ben From: Mahadev Konar [maha...@yahoo-inc.com] Sent: Thursday, January 15, 2009 4:32 PM

RE: Dealing with session expired

2009-02-12 Thread Benjamin Reed
idleness is not a problem. the client library sends heartbeats to keep the session alive. the client library will also handle reconnects automatically if a server dies. since session expiration really is a rare catastrophic event (or at least it should be), it is probably easiest to deal with

RE: Recommended session timeout

2009-02-26 Thread Benjamin Reed
just a quick sanity check. are you sure your memory is not overcommitted? in other words you aren't swapping. since the gc does a bunch of random memory accesses, if you swap at all things will go very slow. ben From: Joey Echeverria [joe...@gmail.com]

Re: Contrib section (nee Re: A modest proposal for simplifying zookeeper :)

2009-02-27 Thread Benjamin Reed
i'm ready to reevaluate it. i did the contrib for fatjar and it was extremely painful! (and that was an extremely simple contrib!) we really want to ramp up the contribs and get a bunch of recipe implementations in, so we need something that makes it really easy. i'm not a fan of maven (they

Re: Contrib section (nee Re: A modest proposal for simplifying zookeeper :)

2009-02-27 Thread Benjamin Reed
... Be aware that the contribution process, release process and other documentation would have to be updated as part of this. For example if we want to push jars to an artifact repo the artifacts/pom/etc... would have to be voted on as part of the release process. Patrick Benjamin Reed wrote

Re: How large an ensemble can one build with Zookeeper?

2009-03-06 Thread Benjamin Reed
I realize this discussion is over, but i did want to make one quick clarification. when we talk about ensembles, we are talking about the servers that make up the zookeeper service. we refer to the servers that use the zookeeper service as clients. we have systems here that use ensembles of

RE: Semantics of ConnectionLoss exception

2009-03-26 Thread Benjamin Reed
it is possible for the time to pass without the session expiring. Imagine a session timeout of 15 seconds. there is a correlated power outage affecting the zookeeper servers. let's say it takes 5 minutes to recover power and reboot. when the service recovers, it resets expiration times, so when

Re: Unique Id Generation

2009-04-24 Thread Benjamin Reed
i'm not exactly clear how you use these ideas, but one source of unique ids that are longs is the zxid. if you create a znode, every time you write to it, you will get a unique zxid in the mzxid member of the stat structure. (you get the stat structure back in the response to the setData.) ben

RE: Moving ZooKeeper Servers

2009-05-06 Thread Benjamin Reed
yes, /zookeeper is part of the reserved namespace for zookeeper internals. you should ignore it for such things. ben From: Satish Bhatti [cthd2...@gmail.com] Sent: Wednesday, May 06, 2009 2:57 PM To: zookeeper-user@hadoop.apache.org Subject: Re: Moving

RE: NodeChildrenChanged WatchedEvent

2009-05-11 Thread Benjamin Reed
good summary ted. just to add a bit. another motivation for the current design is what scott had mentioned earlier: not sending a flood of changes when the value of a node is changing rapidly. implicit in this is the fact that we do not send the value in the events. not only does this make the

Re: Some thoughts on Zookeeper after using it for a while in the CXF/DOSGi subproject

2009-05-29 Thread Benjamin Reed
this is great to hear. it's great to see siblings playing together ;) * In CXF we use Maven to build everything. To depend on Zookeeper we need to pull it in from a Maven repository. I couldn't find Zookeeper in any main Maven repos, so currently we're pulling it in from

Re: Confused about KeeperState.Disconnected and KeeperState.Expired

2009-06-24 Thread Benjamin Reed
sorry to jump in late. if i understand the scenario correctly, you are partitioned from ZK, but you still have access to the NN on which you are holding leases to files. the problem is that even though your ephemeral nodes may time out, you are still holding a lease on the NN and recovery

Re: Question about the sequential flag on create.

2009-07-14 Thread Benjamin Reed
the create is atomic. we just use a data structure that does not store the list of children in order. ben Erik Holstad wrote: Hey Patrik! Thanks for the reply. I understand all the reasons that you posted above and totally agree that nodes should not be sorted since you then have to pay that

RE: c client header location

2009-08-02 Thread Benjamin Reed
Or maybe /usr/local/include/zookeeper but either way c-client-src is weird. Please open a jira. Thanx ben Sent from my phone. -Original Message- From: Michi Mutsuzaki mi...@cs.stanford.edu Sent: Saturday, August 01, 2009 6:15 PM To: zookeeper-user@hadoop.apache.org

RE: exist return true before event comes in

2009-08-03 Thread Benjamin Reed
I assume you are calling the synchronous version of exists. The callbacks for both the watches and async calls are processed by a callback thread, so the ordering is strict. Synchronous call responses are not queued to the callback thread. (this allows you to make synchronous calls in callbacks

Re: Errors when run zookeeper in windows ?

2009-08-19 Thread Benjamin Reed
good point david! zhang can you try david's scripts? we should probably commit those. thanx for pointing them out david. ben David Bosschaert wrote: FWIW, I've uploaded some Windows versions of the zookeeper scripts to https://issues.apache.org/jira/browse/ZOOKEEPER-426 a while ago. They run

RE: A question about Connection timed out and operation timeout

2009-08-20 Thread Benjamin Reed
are you using the single threaded or multithreaded C library? the exceeded deadline message means that our thread was supposed to get control after a certain period, but we got control that many milliseconds late. what is your session timeout? ben

Re: Start problem of Running Replicated ZooKeeper

2009-09-23 Thread Benjamin Reed
The connection refused message, as opposed to no route to host or unknown host, indicates that zookeeper has not been started on the other machines. are the other machines giving similar errors? ben Le Zhou wrote: Hi, I'm trying to install HBase 0.20.0 in fully distributed mode on my cluster.

Re: The idea behind 'myid'

2009-09-25 Thread Benjamin Reed
can you clarify what you are asking for? are you just looking for motivation? or are you trying to find out how to use it? the myid file just has the unique identifier (number) of the server in the cluster. that number is matched against the id in the configuration file. there isn't much to
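To make the matching concrete, here is a small sketch that reads a myid value and looks it up among `server.N=...` lines. The helper names and the example host names are hypothetical; only the `server.N=host:port:port` line shape is taken from ZooKeeper's configuration format.

```python
import re

SERVER_LINE = re.compile(r"^server\.(\d+)=(.+)$")


def parse_servers(config_text: str) -> dict:
    """Map server id -> address spec from server.N=... lines."""
    servers = {}
    for line in config_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            servers[int(match.group(1))] = match.group(2)
    return servers


def lookup_self(myid_text: str, config_text: str) -> str:
    """Return this server's own entry, or raise if myid is not listed."""
    my_id = int(myid_text.strip())
    servers = parse_servers(config_text)
    if my_id not in servers:
        raise ValueError(f"myid {my_id} has no server.{my_id} entry")
    return servers[my_id]


# Hypothetical three-server configuration; host names are placeholders.
EXAMPLE_CONFIG = """\
tickTime=2000
dataDir=/var/lib/zookeeper
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
"""

print(lookup_self("2\n", EXAMPLE_CONFIG))
```

A myid that does not appear in the configuration (for example 0 here) raises an error in this sketch, which is the kind of mismatch worth checking first when a server reports an unexpected id.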

Re: The idea behind 'myid'

2009-09-25 Thread Benjamin Reed
not getting here :-) Regards, Orjan On Fri, Sep 25, 2009 at 3:56 PM, Benjamin Reed br...@yahoo-inc.com wrote: can you clarify what you are asking for? are you just looking for motivation? or are you trying to find out how to use it? the myid file just has the unique identifier (number

Re: How to expire a session

2009-09-25 Thread Benjamin Reed
so you have two problems going on. both have the same root: zookeeper_init returns before a connection and session is established with zookeeper, so you will not be able to fill in myid until a connection is made. you can do something with a mutex in the watcher to wait for a connection, or

Re: Struggling with a simple configuration file.

2009-10-09 Thread Benjamin Reed
right at the beginning of http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperStarted.html it shows you the minimum standalone configuration. that doesn't explain the 0 id. i'd like to try and reproduce it. do you have an empty data directory with a single file, myid, set to 1? ben

Re: Zookeeper Presentation

2009-11-13 Thread Benjamin Reed
there are a bunch of presentations you can grab at http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperPresentations ben Mark Vigeant wrote: Hey Everyone, I'm supposed to give a presentation next week about the basic functionality and uses of zookeeper. I was wondering if anybody out there

Re: size of data / number of znodes

2009-12-15 Thread Benjamin Reed
there aren't any limits on the number of znodes, it's just limited by your memory. there are two things (probably more :) to keep in mind: 1) the 1M limit also applies to the children list. you can't grow the list of children to more than 1M (the sum of the names of all of the children)
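A rough way to see when a children list approaches that limit is to add up the encoded name lengths. This is only an estimate under stated assumptions: the limit is taken as exactly 1024 * 1024 bytes, any per-entry framing on the wire is ignored, and the child naming scheme is invented for the example.

```python
LIMIT_BYTES = 1024 * 1024  # the "1M" from the text; treated as approximate


def children_list_bytes(names) -> int:
    """Sum of the UTF-8 encoded child names (ignores per-entry framing)."""
    return sum(len(name.encode("utf-8")) for name in names)


def fits(names, limit: int = LIMIT_BYTES) -> bool:
    """True if the summed name lengths stay within the limit."""
    return children_list_bytes(names) <= limit


# Hypothetical children named "item-" plus 10 digits: 15 bytes each.
names = [f"item-{i:010d}" for i in range(50_000)]
print(children_list_bytes(names), fits(names))
```

Shorter child names leave room for more children under one parent; splitting children across several parents is the other obvious lever.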

Re: Share Zookeeper instance and Connection Limits

2009-12-16 Thread Benjamin Reed
I agree with Ted, it doesn't seem like a good idea to do in practice. however, you do have a couple of options if you are just testing things: 1) use tmpfs 2) you can set forceSync to no in the configuration file to disable syncing to disk before acknowledging responses 3) if you really want

RE: Does zookeeper support listening on a specified address?

2009-12-21 Thread Benjamin Reed
no please open a jira as a new feature request. sent from my droid -Original Message- From: Steve Chu [stv...@gmail.com] Received: 12/21/09 3:44 AM To: zookeeper-user@hadoop.apache.org [zookeeper-u...@hadoop.apache.org] Subject: Does zookeeper support listening on a specified address?

Re: ZAB kick Paxos butt?

2010-01-20 Thread Benjamin Reed
hi Qing, i'm glad you like the page and Zab. yes, we are very familiar with Paxos. that page is meant to show a weakness of Paxos and a design point for Zab. it is not to say Paxos is not useful. Paxos is used in the real world in production systems. sometimes there are not order

Re: Q about ZK internal: how commit is being remembered

2010-01-28 Thread Benjamin Reed
henry is correct. just to state another way, Zab guarantees that if a quorum of servers have accepted a transaction, the transaction will commit. this means that if less than a quorum of servers have accepted a transaction, we can commit or discard. the only constraint we have in choosing is

Re: how to handle re-add watch fails

2010-02-01 Thread Benjamin Reed
sadly connectionloss is the really ugly part of zookeeper! it is a pain to deal with. i'm not sure we have best practice, but i can tell you what i do :) ZOOKEEPER-22 is meant to alleviate this problem. i usually use the asynch API when handling the watch callback. in the completion function

RE: When session expired event fired?

2010-02-08 Thread Benjamin Reed
i was looking through the docs to see if we talk about handling session expired, but i couldn't find anything. we should probably open a jira to add to the docs, unless i missed something. did i? ben -Original Message- From: Mahadev Konar [mailto:maha...@yahoo-inc.com] Sent: Monday,

Re: Managing multi-site clusters with Zookeeper

2010-03-15 Thread Benjamin Reed
it is a bit confusing but initLimit is the timer that is used when a follower connects to a leader. there may be some state transfers involved to bring the follower up to speed so we need to be able to allow a little extra time for the initial connection. after that we use syncLimit to figure

Re: permanent ZSESSIONMOVED

2010-03-16 Thread Benjamin Reed
do you ever use zookeeper_init() with the clientid field set to something other than null? ben On 03/16/2010 07:43 AM, Łukasz Osipiuk wrote: Hi everyone! I am writing to this group because recently we are getting some strange errors with our production zookeeper setup. From time to time we

Re: permanent ZSESSIONMOVED

2010-03-16 Thread Benjamin Reed
weird, this does sound like a bug. do you have a reliable way of reproducing the problem? thanx ben On 03/16/2010 08:27 AM, Łukasz Osipiuk wrote: nope. I always pass 0 as clientid. Łukasz On Tue, Mar 16, 2010 at 16:20, Benjamin Reed br...@yahoo-inc.com wrote: do you ever use

Re: cluster fails to start - broken snapshot?

2010-03-18 Thread Benjamin Reed
we have updated ZOOKEEPER-713 with much more detail, but the bottom line is that the Invalid snapshot was caused by an OutOfMemoryError. this turns out not to be a problem since we recover using an older snapshot. there are other things that are happening that are the real causes of the problem.

Re: syncLimit explanation needed?

2010-03-18 Thread Benjamin Reed
yes it means in sync with the leader. syncLimit governs the timeout when a follower is actively following a leader. initLimit is the initial connection timeout. because there is the potential for more data that needs to be transmitted during the initial connection, we want to be able to manage
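Both limits are configured as a number of ticks, so the actual time budget is the limit multiplied by tickTime. The sketch below shows that conversion and the reverse calculation for sizing initLimit from a measured state-transfer time. The numeric values are illustrative only, and the function names are made up.

```python
import math


def limit_ms(tick_time_ms: int, limit_ticks: int) -> int:
    """Convert a limit given in ticks into milliseconds."""
    return tick_time_ms * limit_ticks


def ticks_needed(duration_ms: int, tick_time_ms: int) -> int:
    """Smallest whole number of ticks that covers the given duration."""
    return math.ceil(duration_ms / tick_time_ms)


# Illustrative values, not a recommendation.
TICK_TIME_MS = 2000
INIT_LIMIT = 10
SYNC_LIMIT = 5

print("initial connection budget:", limit_ms(TICK_TIME_MS, INIT_LIMIT), "ms")
print("steady-state budget:", limit_ms(TICK_TIME_MS, SYNC_LIMIT), "ms")
print("ticks to cover a 45 s transfer:", ticks_needed(45_000, TICK_TIME_MS))
```

If a follower needs longer than the initial budget to pull the leader's state, the reverse calculation gives a lower bound for initLimit; some headroom on top of it is sensible.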

Re: Solitication for logging/debugging requirements

2010-03-29 Thread Benjamin Reed
awesome! that would be great ivan. i'm sure pat has some more concrete suggestions, but one simple thing to do is to run the unit tests and look at the log messages that get output. there are a couple of categories of things that need to be fixed (this is in no way exhaustive): 1) messages

Re: How to ensure trasaction create-and-update

2010-03-30 Thread Benjamin Reed
i agree with ted. i think he points out some disadvantages with trying to do more. there is a slippery slope with these kinds of things. the implementation is complicated enough even with the simple model that we use. ben On 03/29/2010 08:34 PM, Ted Dunning wrote: I perhaps should not have

Re: Xid out of order. Got 8 expected 7

2010-05-12 Thread Benjamin Reed
is this a bug? shouldn't we be returning an error. ben On 05/12/2010 11:34 AM, Patrick Hunt wrote: I think that explains it then - the server is probably dropping the new (3.3.0) getChildren message (xid 7) as it (3.2.2 server) doesn't know about that message type. Then the server responds to

Re: problem connecting to zookeeper server

2010-05-20 Thread Benjamin Reed
good catch lei! if this helps gregory, can you open a jira to throw an exception in this situation. we should be throwing an invalid argument exception or something in this case. thanx ben On 05/20/2010 09:04 AM, Lei Zhang wrote: Seems you are passing in wrong arguments: Should have been:

Re: zookeeper crash

2010-06-02 Thread Benjamin Reed
charity, do you mind going through your scenario again to give a timeline for the failure? i'm a bit confused as to what happened. ben On 06/02/2010 01:32 PM, Charity Majors wrote: Thanks. That worked for me. I'm a little confused about why it threw the entire cluster into an unusable

Re: Completions in C API

2010-06-03 Thread Benjamin Reed
the call is executed at a later time on a different thread. the zoo_a* calls are non-blocking, so (subject to the thread scheduling) usually they will return before the request completes. ben On 06/03/2010 01:24 PM, Jack Orenstein wrote: I'm trying to figure out how to use zookeeper's C API.

Re: is ZK client thread safe

2010-06-21 Thread Benjamin Reed
yes. (except for the single threaded C-client library :) ben On 06/17/2010 10:16 AM, Jun Rao wrote: Hi, Is ZK client thread safe? Is it ok for multiple threads sharing the same ZK client? Thanks, Jun

Re: integration tests

2010-06-23 Thread Benjamin Reed
we do this in our tests for ZooKeeper. bookkeeper uses the testing classes as well. unfortunately, we haven't documented the interface. ben On 06/22/2010 08:42 PM, Ishaaq Chandy wrote: Hi all, First some background: 1. We use maven as our build tool. 2. We use Hudson as our CI server, it is

Re: Suggested way to simulate client session expiration in unit tests?

2010-07-08 Thread Benjamin Reed
the difference between close and disconnect is that close will actually try to tell the server to kill the session before disconnecting. a paranoid lock implementation doesn't need to test its session. it should just monitor watch events to look for disconnect and expired events. if a client

Re: running the systest

2010-07-09 Thread Benjamin Reed
can you try the following: Index: src/contrib/fatjar/build.xml === --- src/contrib/fatjar/build.xml(revision 962637) +++ src/contrib/fatjar/build.xml(working copy) @@ -46,6 +46,7 @@ fileset

Re: Achieving quorum with only half of the nodes

2010-07-14 Thread Benjamin Reed
by custom QuorumVerifier are you referring to http://hadoop.apache.org/zookeeper/docs/r3.3.1/zookeeperHierarchicalQuorums.html ? ben On 07/14/2010 12:43 PM, Sergei Babovich wrote: Hi, We are currently evaluating use of ZK in our infrastructure. In our setup we have a set of servers running

Re: total # of zknodes

2010-07-15 Thread Benjamin Reed
i think there is a wiki page on this, but for the short answer: the number of znodes impacts two things: memory footprint and recovery time. there is a base overhead to znodes to store its path, pointers to the data, pointers to the acl, etc. i believe that is around 100 bytes. you can't just

RE: cleanup ZK takes 40-60 seconds

2010-07-16 Thread Benjamin Reed
how big is your database? it would be good to know the timing of the two calls. shutdown should take very little time. sent from my droid -Original Message- From: Vishal K [vishalm...@gmail.com] Received: 7/16/10 6:31 PM To: zookeeper-user@hadoop.apache.org

Re: BookKeeper Doubts

2010-07-19 Thread Benjamin Reed
you have concluded correctly. 1) bookkeeper was designed for a process to use as a write-ahead log, so as a simplifying assumption we assume a single writer to a log. we should be throwing an exception if you try to write to a handle that you obtained using openLedger. can you open a jira for

Re: ZK recovery questions

2010-07-21 Thread Benjamin Reed
i did a benchmark a while back to see the effect of turning off the disk. (it wasn't as big as you would think.) i had to modify the code. there is an option to turn off the sync in the config that will get you most of the performance you would get by turning off the disk entirely. ben On

Re: Do implementations of Watcher need to be thread-safe?

2010-07-21 Thread Benjamin Reed
as long as a watcher object is only used with a single ZooKeeper object it will be called by the same thread. ben On 07/21/2010 11:12 AM, Joshua Ball wrote: Hi, Do implementations of Watcher need to be thread-safe, or can I assume that process(...) will always be called by the same thread?

Re: How to handle Node does not exist error?

2010-08-12 Thread Benjamin Reed
i thought there was a jira about supporting embedded zookeeper. (i remember rejecting a patch to fix it. one of the problems is that we have a couple of places that do System.exit().) i can't seem to find it though. one case that would be great for embedding is writing test cases, so i think

Re: A question about Watcher

2010-08-16 Thread Benjamin Reed
zookeeper takes care of reregistering all watchers on reconnect. you don't need to do anything. ben On 08/16/2010 09:04 AM, Qian Ye wrote: Hi all: Will the watchers of a client be lost when the client disconnects from a Zookeeper server? It is said at

Re: A question about Watcher

2010-08-16 Thread Benjamin Reed
good point ted! i should have waited a bit longer before responding :) ben On 08/16/2010 09:20 AM, Ted Dunning wrote: There are two different concepts. One is connection loss. Watchers survive this and the client automatically connects to another member of the ZK cluster. The other is

Re: A question about Watcher

2010-08-16 Thread Benjamin Reed
the client does keep track of the watches that it has outstanding. when it reconnects to a new server it tells the server what it is watching for and the last view of the system that it had. ben On 08/16/2010 09:28 AM, Qian Ye wrote: thx for explanation. Since the watcher can be preserved

Re: Weird ephemeral node issue

2010-08-17 Thread Benjamin Reed
there are two things to keep in mind when thinking about this issue: 1) if a zk client is disconnected from the cluster, the client is essentially in limbo. because the client cannot talk to a server it cannot know if its session is still alive. it also cannot close its session. 2) the

Re: Session expiration caused by time change

2010-08-19 Thread Benjamin Reed
yes, you are right. we could do this. it turns out that the expiration code is very simple: while (running) { currentTime = System.currentTimeMillis(); if (nextExpirationTime > currentTime) { this.wait(nextExpirationTime -

Re: Session expiration caused by time change

2010-08-19 Thread Benjamin Reed
if we can't rely on the clock, we cannot say things like if ... for 5 seconds. also, clients connect to servers, not vice versa, so we cannot say things like server can attempt to reconnect. ben On 08/19/2010 10:17 AM, Vishal K wrote: Hi Ted, I haven't given it a serious thought yet, but I

Re: Session expiration caused by time change

2010-08-19 Thread Benjamin Reed
i'm updating ZOOKEEPER-366 with this discussion and will try to get a patch out. Qing (or anyone else, can you reproduce it pretty easily?) thanx ben On 08/19/2010 09:29 AM, Ted Dunning wrote: Nice (modulo inverting the in your text). Option 2 seems very simple. That always attracts me. On

Re: Session expiration caused by time change

2010-08-20 Thread Benjamin Reed
i put up a patch that should address the problem. now i need to write a test case. the only way i can think of is to change the call to System.currentTimeMillis to a utility class that calls System.currentTimeMillis that i can mock for testing. any better ideas? ben On 08/19/2010 03:53 PM,
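The idea of routing time lookups through something a test can replace can be sketched as follows. This is a Python analogue of the approach described above, not the patch itself; `ExpirationChecker` and `FakeClock` are hypothetical names.

```python
import time


class ExpirationChecker:
    """Sketch of an expiry check whose time source can be swapped in tests."""

    def __init__(self, timeout_ms: int, clock=None) -> None:
        # clock returns "now" in milliseconds; defaults to the wall clock.
        if clock is None:
            clock = lambda: int(time.time() * 1000)
        self._clock = clock
        self._timeout_ms = timeout_ms
        self._deadline = self._clock() + timeout_ms

    def touch(self) -> None:
        """Record a heartbeat and push the deadline out."""
        self._deadline = self._clock() + self._timeout_ms

    def expired(self) -> bool:
        """True once the injected clock has reached the deadline."""
        return self._clock() >= self._deadline


class FakeClock:
    """Test double: time only moves when the test says so."""

    def __init__(self, start_ms: int = 0) -> None:
        self.now_ms = start_ms

    def __call__(self) -> int:
        return self.now_ms

    def advance(self, delta_ms: int) -> None:
        self.now_ms += delta_ms


clock = FakeClock()
checker = ExpirationChecker(timeout_ms=15_000, clock=clock)
clock.advance(10_000)
before_touch = checker.expired()
checker.touch()
clock.advance(10_000)
after_touch = checker.expired()
clock.advance(5_000)
at_deadline = checker.expired()
print(before_touch, after_touch, at_deadline)
```

Because the test controls the clock, it can also jump time backwards or forwards by large amounts to imitate the clock changes discussed in this thread, without sleeping.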

Re: closing session on socket close vs waiting for timeout

2010-09-01 Thread Benjamin Reed
i'm a bit skeptical that this is going to work out properly. a server may receive a socket reset even though the client is still alive: 1) client sends a request to a server 2) client is partitioned from the server 3) server starts trying to send response 4) client reconnects to a different

Re: closing session on socket close vs waiting for timeout

2010-09-06 Thread Benjamin Reed
for this session type (so 4 would fail). Would that address your concern, others? Patrick On 09/01/2010 10:03 AM, Benjamin Reed wrote: i'm a bit skeptical that this is going to work out properly. a server may receive a socket reset even though the client is still alive: 1) client sends a request

Re: closing session on socket close vs waiting for timeout

2010-09-08 Thread Benjamin Reed
@hadoop.apache.org Cc: Benjamin Reed Subject: Re: closing session on socket close vs waiting for timeout This really is, just as Ben says a problem of false positives and false negatives in detecting session expiration. On the other hand, the current algorithm isn't really using all the information available

Re: closing session on socket close vs waiting for timeout

2010-09-10 Thread Benjamin Reed
to waste my time if there's a fundamental reason it's a bad idea. Thanks, Camille -Original Message- From: Benjamin Reed [mailto:br...@yahoo-inc.com] Sent: Wednesday, September 08, 2010 4:03 PM To: zookeeper-user@hadoop.apache.org Subject: Re: closing session on socket close vs waiting

Re: closing session on socket close vs waiting for timeout

2010-09-10 Thread Benjamin Reed
ah dang, i should have said generate a close request for the session and push that through the system. ben On 09/10/2010 01:01 PM, Benjamin Reed wrote: the problem is that followers don't track session timeouts. they track when they last heard from the sessions that are connected to them

Re: ZK compatability

2010-10-01 Thread Benjamin Reed
we should also point out that our ops guys here at yahoo! don't like the break at major clause. i imagine when we do the next major release we will try to be one release backwards compatible. (although we shouldn't promise it until we successfully do it once :) ben On 09/30/2010 10:29 AM,

Re: Zookeeper on 60+Gb mem

2010-10-05 Thread Benjamin Reed
you will need to time how long it takes to read all that state back in and adjust the initLimit accordingly. it will probably take a while to pull all that data into memory. ben On 10/05/2010 11:36 AM, Avinash Lakshman wrote: I have run it over 5 GB of heap with over 10M znodes. We will

Re: Question on production readiness, deployment, data of BookKeeper / Hedwig

2010-10-07 Thread Benjamin Reed
hi amit, sorry for the late response. this week has been crunch time for a lot of different things. here are your answers: production 1. it is still in prototype phase. we are evaluating different aspects, but there is still some work to do to make it production ready. we also need to

Re: Question on production readiness, deployment, data of BookKeeper / Hedwig

2010-10-08 Thread Benjamin Reed
your guess is correct :) for bookkeeper and hedwig we released early to do the development in public. originally we developed bookkeeper as a distributed write ahead log for the NameNode in HDFS, but while we were able to get a proof of concept going, the structure of the code of the NameNode

RE: What does this mean?

2010-10-10 Thread Benjamin Reed
this usually happens when a follower closes its connection to the leader. it is usually caused by the follower shutting down or failing. you may get further insight by looking at the follower logs. you should really run with timestamps on so that you can correlate the logs of the leader and

Re: What does this mean?

2010-10-11 Thread Benjamin Reed
how big is your data? you may be running into the problem where it takes too long to do the state transfer and times out. check the initLimit and the size of your data. ben On 10/10/2010 08:57 AM, Avinash Lakshman wrote: Thanks Ben. I am not mixing processes of different clusters. I just

Re: invalid acl for ZOO_CREATOR_ALL_ACL

2010-10-19 Thread Benjamin Reed
which scheme are you using? ben On 10/18/2010 11:57 PM, FANG Yang wrote: 2010/10/19 FANG Yang fa...@douban.com hi, all I have a simple zk client written in C, which is attachment #1. When i use ZOO_CREATOR_ALL_ACL, the ret code of zoo_create is -114 (Invalid ACL specified, defined in

Re: zxid integer overflow

2010-10-19 Thread Benjamin Reed
we should put in a test for that. it is certainly a plausible scenario. in theory it will just flow into the next epoch and everything will be fine, but we should try it and see. ben On 10/19/2010 11:33 AM, Sandy Pratt wrote: Just as a thought experiment, I was pondering the following: ZK
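The "flow into the next epoch" arithmetic can be shown directly. This sketch assumes the usual zxid layout of a 64-bit value with the epoch in the high 32 bits and a counter in the low 32 bits; the helper names are made up. It only demonstrates the integer arithmetic, not how the servers behave when it happens.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1
ZXID_MASK = (1 << 64) - 1


def make_zxid(epoch: int, counter: int) -> int:
    """Pack an epoch and a counter into one 64-bit value."""
    return ((epoch << COUNTER_BITS) | (counter & COUNTER_MASK)) & ZXID_MASK


def split_zxid(zxid: int):
    """Unpack a zxid into (epoch, counter)."""
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK


last = make_zxid(epoch=5, counter=COUNTER_MASK)
print(split_zxid(last))      # epoch 5 with the counter at its maximum
print(split_zxid(last + 1))  # a plain increment carries into the epoch bits
```

Whether every code path treats a zxid produced this way as belonging to a legitimately elected epoch is the part that only a real test can answer.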

Re: Is it possible to read/write a ledger concurrently

2010-10-21 Thread Benjamin Reed
currently program1 can read and write to an open ledger, but program2 must wait for the ledger to be closed before doing the read. the problem is that program2 needs to know the last valid entry in the ledger. (there may be entries that may not yet be valid.) for performance reasons, only

Re: Is it possible to read/write a ledger concurrently

2010-10-22 Thread Benjamin Reed
in hedwig one hub does both the publish and subscribe for a given topic and therefore is the only process reading and writing from/to a ledger, so there isn't an issue. The ReadAheadCache does read-ahead :) it is so that we can minimize latency when doing sequential reads. ben On

Re: Getting a node exists code on a sequence create

2010-11-01 Thread Benjamin Reed
how were you able to reproduce it? all the znodes in /zkrsm were created with the sequence flag. right? ben On 11/01/2010 02:28 PM, Jeremy Stribling wrote: We were able to reproduce it. A stat on all three servers looks identical: [zk:ip:port(CONNECTED) 0] stat /zkrsm cZxid = 9 ctime = Mon

Re: Getting a node exists code on a sequence create

2010-11-03 Thread Benjamin Reed
sequential znodes. I'm guessing this is pretty well-tested behavior, so there must be something weird or wrong about the way I have stuff setup. I'm happy to provide whatever logs or snapshots might help someone track this down. Thanks, Jeremy On 11/01/2010 02:42 PM, Benjamin Reed wrote: how were

Re: Running cluster behind load balancer

2010-11-04 Thread Benjamin Reed
one thing to note: if you are using a DNS load balancer, some load balancers will return the list of resolved addresses in different orders to do the balancing. the zookeeper client will shuffle that list before it is used, so in reality, using a single DNS hostname resolving to all the
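The effect of the client-side shuffle is that whatever order the resolver hands back is discarded. A minimal sketch of that step, with a hypothetical function name and placeholder addresses:

```python
import random


def connection_order(resolved_addresses, rng=None):
    """Return a shuffled copy; the DNS-provided order is not preserved."""
    if rng is None:
        rng = random.Random()
    order = list(resolved_addresses)
    rng.shuffle(order)
    return order


# Placeholder addresses from the documentation range 192.0.2.0/24.
resolved = ["192.0.2.1:2181", "192.0.2.2:2181", "192.0.2.3:2181"]
print(connection_order(resolved))
```

The same servers are always present in the result; only their order varies from call to call.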

Re: How to reestablish a session

2010-11-18 Thread Benjamin Reed
ah i see. you are manually reestablishing the connection to B using the session identifier for the session with A. the problem is that when you call close on a session, it kills the session. we don't really have a way to close a handle without doing that. (actually there is a test class that