question about ZK robustness

2010-11-20 Thread Ted Dunning
I was just asked a very cogent question of the form "how do you know" and would like somebody who knows better than I do to confirm or deny my response. The only part that I am absolutely sure of is the part at the end where I say "No doubt I have omitted something." With an edit from Ben, this

Re: Persistent watch stream?

2010-11-12 Thread Ted Dunning
Persistent watches were omitted from ZK on purpose because of the perceived danger of not having a load shedding mechanism. Note that when you get a notification, the query you do to get the next state typically sets the next watch. This guarantees that you don't lose anything, but it may mean
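
The read-then-rewatch pattern described above can be sketched with a toy model. This is a minimal Python illustration, not the ZooKeeper client API: ToyZnode, Follower, and their methods are hypothetical names, and the node lives in memory. It assumes one-shot watches that are registered by the read itself.

```python
class ToyZnode:
    """In-memory stand-in for a znode with one-shot watches (hypothetical, not the real API)."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get(self, watcher=None):
        # Reading and registering the watch happen in the same call.
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set(self, data):
        self.data = data
        # One-shot semantics: every registered watch fires once and is dropped.
        fired, self._watchers = self._watchers, []
        for callback in fired:
            callback()


class Follower:
    """Client that re-reads (and so re-registers its watch) after each notification."""

    def __init__(self, node):
        self.node = node
        self.seen = []
        self.dirty = False
        self.read()

    def on_change(self):
        # The notification only says "something changed"; it carries no data.
        self.dirty = True

    def read(self):
        self.dirty = False
        self.seen.append(self.node.get(self.on_change))


node = ToyZnode("a")
follower = Follower(node)
node.set("b")  # fires the watch; the follower is now marked dirty
node.set("c")  # no watch is registered at this point, so no second notification
if follower.dirty:
    follower.read()  # picks up the latest value and sets the next watch
print(follower.seen)
```

In this toy run the follower never observes "b", but it does end up with the latest value and a fresh watch, so a slow client is never flooded with one notification per update.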

Re: Running cluster behind load balancer

2010-11-03 Thread Ted Dunning
DNS round-robin works as well. On Wed, Nov 3, 2010 at 3:45 PM, Benjamin Reed br...@yahoo-inc.com wrote: it would have to be a TCP based load balancer to work with ZooKeeper clients, but other than that it should work really well. The clients will be doing heart beats so the TCP connections

Re: Client seeing wrong data on nodeDataChanged

2010-10-28 Thread Ted Dunning
Client 2 is not guaranteed to see X if it doesn't get to asking before the value has been updated to Y. On Thu, Oct 28, 2010 at 2:39 PM, Stack st...@duboce.net wrote: Client 2 is also watching the znode. It gets notified three times: two nodeDataChanged events(only) and a nodeDeleted event.

Re: Client seeing wrong data on nodeDataChanged

2010-10-28 Thread Ted Dunning
On Thu, Oct 28, 2010 at 9:56 PM, Stack st...@duboce.net wrote: On Thu, Oct 28, 2010 at 7:32 PM, Ted Dunning ted.dunn...@gmail.com wrote: Client 2 is not guaranteed to see X if it doesn't get to asking before the value has been updated to Y. Right, but I wouldn't expect the watch

Re: znode recovery automatically?

2010-10-21 Thread Ted Dunning
On Thu, Oct 21, 2010 at 9:08 AM, Sean Bigdatafun sean.bigdata...@gmail.com wrote: Can a lost znode be recovered automatically? Say, in a 3 znodes Zookeeper cluster, the cluster gets into a critical status if a znode is lost. If I bring that lost znode back into running, can it rejoin the quorum?

Re: Membership using ZK

2010-10-12 Thread Ted Dunning
Yes. You should get that event. You should also debug why you are getting disconnected in the first place. This is often a symptom of something really bad that is happening on your client side such as very long GC's. If these are unavoidable, then you need to adjust the timeouts with ZK to

Re: is zookeeper suitable for my application?

2010-10-08 Thread Ted Dunning
ZK provides all of the coordination you need for this problem, but you should store your data elsewhere. Any key-data store with decent read-write speed will suffice. Memcache would be reasonable for that if you don't need persistence in the presence of failure. Voldemort would be another

Re: Zookeeper on 60+Gb mem

2010-10-05 Thread Ted Dunning
That would be an interesting experiment although it is way outside normal usage as a coordination store. I have used ZK as a session store for PHP with OK results. I never implemented an expiration mechanism so things had to be cleared out manually sometimes. It worked pretty well until things

Re: ZK compatability

2010-09-30 Thread Ted Dunning
Looking forward, I don't think that anybody has even proposed anything that would require a major release yet. That should mean that you have quite a bit of lifetime ahead on the 3.x family. Moreover, it is a cinch to bet that even when a 4.0 is released, it is unlikely to have enough killer

Re: Expiring session... timeout of 600000ms exceeded

2010-09-21 Thread Ted Dunning
Generally, best practice for crawlers is that no process runs more than an hour or five. All crawler processes update a central state store with their progress, but they exit when they reach a time limit knowing that somebody else will take up the work where they leave off. This avoids a

Re: possible bug in zookeeper ?

2010-09-14 Thread Ted Dunning
What was the list of servers that was given originally to open the connection to ZK? On Tue, Sep 14, 2010 at 6:15 AM, Yatir Ben Shlomo yat...@outbrain.com wrote: Hi I am using solrCloud which uses an ensemble of 3 zookeeper instances. I am performing survivability tests: Taking one of the

Re: possible bug in zookeeper ?

2010-09-14 Thread Ted Dunning
. Thanks mahadev On 9/14/10 8:44 AM, Yatir Ben Shlomo yat...@outbrain.com wrote: zook1:2181,zook2:2181,zook3:2181 -Original Message- From: Ted Dunning [mailto:ted.dunn...@gmail.com] Sent: Tuesday, September 14, 2010 4:11 PM To: zookeeper-user@hadoop.apache.org Subject

Re: closing session on socket close vs waiting for timeout

2010-09-10 Thread Ted Dunning
A switch failure could do that, I think. On Fri, Sep 10, 2010 at 1:49 PM, Fournier, Camille F. [Tech] camille.fourn...@gs.com wrote: I am not a networking expert, but in my experience I've seen network glitches that cause sockets to appear to be live that are actually dead, but not

Re: closing session on socket close vs waiting for timeout

2010-09-08 Thread Ted Dunning
(forgive lack of actual code in this email) -Original Message- From: Ted Dunning [mailto:ted.dunn...@gmail.com] Sent: Tuesday, September 07, 2010 1:11 PM To: zookeeper-user@hadoop.apache.org Cc: Benjamin Reed Subject: Re: closing session on socket close vs waiting for timeout

Re: Exception causing close of session

2010-08-27 Thread Ted Dunning
Patrick, Can you clarify what reset means? It doesn't mean just restart, does it? On Thu, Aug 26, 2010 at 5:05 PM, Patrick Hunt ph...@apache.org wrote: Client has seen zxid 0xfa4 our last zxid is 0x42 Someone reset the zk server database without restarting the clients. As a result the

Re: What roles do even nodes play in the ensamble

2010-08-25 Thread Ted Dunning
Just use 3 nodes. Life will be better. You can configure the fourth node in the event of one of the first three failing and bring it on line. Then you can re-configure and restart each of the others one at a time. This gives you flexibility because you have 4 nodes, but doesn't decrease your
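
The arithmetic behind preferring 3 nodes over 4 can be sketched as follows. This is a minimal Python illustration that assumes the usual majority-quorum rule; the function names are hypothetical and not part of any ZooKeeper API.

```python
def quorum_size(n):
    """Smallest number of servers that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many servers can be down while a majority is still reachable."""
    return n - quorum_size(n)


for n in (3, 4, 5):
    print(n, "servers: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
```

Under that assumption a fourth voting member raises the quorum from 2 to 3 without raising the number of failures the ensemble can survive, which is why keeping it as a spare is the more attractive option.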

Re: Non Hadoop scheduling frameworks

2010-08-23 Thread Ted Dunning
These are pretty easy to solve with ZK. Ephemerality, exclusive create, atomic update and file versions allow you to implement most of the semantics you need. I don't know of any recipes available for this, but they would be worthy additions to ZK. On Mon, Aug 23, 2010 at 11:33 PM, Todd Nine

Re: Session expiration caused by time change

2010-08-20 Thread Ted Dunning
of is to change the call to System.currentTimeMillis to a utility class that calls System.currentTimeMillis that i can mock for testing. any better ideas? ben On 08/19/2010 03:53 PM, Ted Dunning wrote: Put in a four letter command that will put the server to sleep for 15 seconds! :-) On Thu, Aug 19

Re: Session expiration caused by time change

2010-08-19 Thread Ted Dunning
You can always increase your timeouts a bit. On Thu, Aug 19, 2010 at 12:52 AM, Qing Yan qing...@gmail.com wrote: Oh.. our servers are also running in a virtualized environment. On Thu, Aug 19, 2010 at 2:58 PM, Martin Waite waite@gmail.com wrote: Hi, I have tripped over similar

Re: Zookeeper stops

2010-08-19 Thread Ted Dunning
Also, /tmp is not a great place to keep things that are intended for persistence. On Thu, Aug 19, 2010 at 7:34 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Wim, It mostly looks like that zookeeper is not able to create files on the /tmp filesystem. Is there a space shortage or is it

Re: Session expiration caused by time change

2010-08-19 Thread Ted Dunning
Another option would be for the cluster to compare times and note when one member seems to be lagging. Restoration of that lag would then be less remarkable. I believe that the pattern of these problems is a slow slippage behind and a sudden jump forward. On Thu, Aug 19, 2010 at 7:51 AM, Vishal

Re: Session expiration caused by time change

2010-08-19 Thread Ted Dunning
True. But it knows that there has been a jump. Quiet time can be distinguished from clock shift by assuming that members of the cluster don't all jump at the same time. I would imagine that a recent clock jump estimate could be kept and buckets that would otherwise expire due to such a jump

Re: Session expiration caused by time change

2010-08-19 Thread Ted Dunning
much of a problem since clients send a ping if they are idle for 1/3 of their session timeout. ben On 08/19/2010 08:39 AM, Ted Dunning wrote: True. But it knows that there has been a jump. Quiet time can be distinguished from clock shift by assuming that members of the cluster don't all

Re: ZK monitoring

2010-08-19 Thread Ted Dunning
It would be nice if it took a list of servers and verified that they all thought that they were part of the same cluster. On Thu, Aug 19, 2010 at 1:46 PM, Patrick Hunt ph...@apache.org wrote: Maybe we should have a contrib pkg for utilities such as this? I could see a python script that, given

Re: Session expiration caused by time change

2010-08-19 Thread Ted Dunning
Ben's approach is really simpler. The client already sends keep-alive messages and we know that some have gone missing or a time shift has happened. Those two possibilities are cleanly distinguished by Ben's suggestion of comparing current time to the bucket expiration. If current time is

Re: Session expiration caused by time change

2010-08-19 Thread Ted Dunning
Put in a four letter command that will put the server to sleep for 15 seconds! :-) On Thu, Aug 19, 2010 at 3:51 PM, Benjamin Reed br...@yahoo-inc.com wrote: i'm updating ZOOKEEPER-366 with this discussion and try to get a patch out. Qing (or anyone else, can you reproduce it pretty easily?)

Re: Session expiration caused by time change

2010-08-18 Thread Ted Dunning
If NTP is changing your time by more than a few milliseconds then you have other problems (big ones). On Wed, Aug 18, 2010 at 1:04 AM, Qing Yan qing...@gmail.com wrote: I guess ZK might rely on timestamp to keep sessions alive, but we have NTP daemon running so machine time can get changed

Re: A question about Watcher

2010-08-16 Thread Ted Dunning
There are two different concepts. One is connection loss. Watchers survive this and the client automatically connects to another member of the ZK cluster. The other is session expiration. Watchers do not survive this. This happens when a client does not provide timely evidence that it is

Re: A question about Watcher

2010-08-16 Thread Ted Dunning
I should correct this. The watchers will deliver a session expiration event, but since the connection is closed at that point no further events will be delivered and the cluster will remove them. This is as good as the watchers disappearing. On Mon, Aug 16, 2010 at 9:20 AM, Ted Dunning ted.dunn

Re: Weird ephemeral node issue

2010-08-16 Thread Ted Dunning
:09 AM, Qing Yan qing...@gmail.com wrote: Hi Ted, Do you mean GC problem can prevent delivery of SESSION EXPIRE event? Hum...so you have met this problem before? I didn't see any OOM though, will look into it more. On Mon, Aug 16, 2010 at 12:46 PM, Ted Dunning ted.dunn...@gmail.com

Re: A question about Watcher

2010-08-16 Thread Ted Dunning
Almost never. There was a bug a while back that could have conceivably caused that under rare circumstances, but I don't know of any current mechanism for this lossage that you are asking about. On Mon, Aug 16, 2010 at 6:34 PM, Qian Ye yeqian@gmail.com wrote: My question is, if the master

Re: Weird ephemeral node issue

2010-08-15 Thread Ted Dunning
I am assuming that you are using ZK from java. Very likely you are having GC problems. Turn on verbose GC logging and see what is happening. You may also want to change the session timeout values. It is very common for the use of ZK to highlight problems that you didn't know that you had. On

Re: How to handle Node does not exist error?

2010-08-12 Thread Ted Dunning
, 2010 at 12:01 AM, Ted Dunning ted.dunn...@gmail.com wrote: Try running the server in non-embedded mode. Also, you are assuming that you know everything about how to configure the quorumPeer. That is going to change and your code will break at that time. If you use a non-embedded

Re: How to handle Node does not exist error?

2010-08-12 Thread Ted Dunning
I am not saying that the API shouldn't support embedded ZK. I am just saying that it is almost always a bad idea. It isn't that I am asking you to not do it, it is just that I am describing the experience I have had and that I have seen others have. In a nutshell, embedding leads to problems

Re: How to handle Node does not exist error?

2010-08-11 Thread Ted Dunning
of those nodes existed. Dr Hao He XPE - the truly SOA platform h...@softtouchit.com http://softtouchit.com http://itunes.com/apps/Scanmobile On 11/08/2010, at 4:38 PM, Ted Dunning wrote: Can you provide some more information? The output of some of the four letter commands and a transcript

Re: How to handle Node does not exist error?

2010-08-11 Thread Ted Dunning
/apps/Scanmobile On 11/08/2010, at 4:38 PM, Ted Dunning wrote: Can you provide some more information? The output of some of the four letter commands and a transcript of what you are doing would be very helpful. Also, there is no way for znodes to exist on one node of a properly

Re: Too many KeeperErrorCode = Session moved messages

2010-08-05 Thread Ted Dunning
I can't comment much on this, except that this is a very odd usage pattern. First, it isn't so unusual, but I find it a particularly bad practice to embed ZK into your application. The problem is that you lose a lot of the virtues of ZK in terms of coordination if ZK goes down with your

Re: Sequence Number Generation With Zookeeper

2010-08-05 Thread Ted Dunning
(b) BUT: Sequential numbering is a special case of now. In large diameters, now gets very expensive. This is a special case of that assertion. If there is a way to get away from this presumption of the need for sequential numbering, you will be miles better off. HOWEVER: ZK can do better

Re: Sequence Number Generation With Zookeeper

2010-08-05 Thread Ted Dunning
that a client would receive the wrong Stat object back? Many thanks again, Jon. On 5 August 2010 16:09, Ted Dunning ted.dunn...@gmail.com wrote: (b) BUT: Sequential numbering is a special case of now. In large diameters, now gets very expensive. This is a special case

Re: Using watcher for being notified of children addition/removal

2010-08-02 Thread Ted Dunning
Another option besides Steve's excellent one would be to keep something like 1000 nodes in your list per znode. Many update patterns will give you the same number of updates, but the ZK transactions that result (getChildren, read znode) will likely be more efficient, especially the getChildren

Re: node symlinks

2010-07-26 Thread Ted Dunning
with the #users of the system, so the clusters can grow sequentially, hence the symlink idea. --Maarten On 07/24/2010 11:12 PM, Ted Dunning wrote: Depending on your application, it might be good to simply hash the node name to decide which ZK cluster to put it on. Also, a scalable key

Re: node symlinks

2010-07-26 Thread Ted Dunning
I think it only mostly disappears. If a user puts 1K files up and is placed on a ZK cluster with 30K free slots then everything is good. But if that user adds 40K files, you have to split or migrate that user. I think that the easy answer is to have more than one location to look for a user's files.

Re: node symlinks

2010-07-24 Thread Ted Dunning
Depending on your application, it might be good to simply hash the node name to decide which ZK cluster to put it on. Also, a scalable key value store like Voldemort or Cassandra might be more appropriate for your application. Unless you need the hard-core guarantees of ZK, they can be better

Re: node symlinks

2010-07-24 Thread Ted Dunning
Depending on what a user needs to see, you can also have parallel structures and select a cluster based on user number. Your insistence on guarantees is worrisome, though. As much as I like ZK, I like getting rid of hard consistency requirements even more. As I tend to put it, the cost of NOW

Re: getChildren() when the number of children is very large

2010-07-21 Thread Ted Dunning
On Tue, Jul 20, 2010 at 8:47 PM, André Oriani aori...@gmail.com wrote: Ted, just to clarify. By file you mean znode, right ? Yes. So you are advising me to try an atomic append to znode's by first calling getData and then trying to conditionally set the data by using the version
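
The read-then-conditionally-set idea in this exchange can be sketched as a retry loop. This is a minimal Python illustration against an in-memory toy node, not the real client API: ToyVersionedNode, BadVersion, and atomic_append are hypothetical names that only mimic the version check.

```python
class BadVersion(Exception):
    """Raised when the caller's expected version is stale (toy stand-in)."""


class ToyVersionedNode:
    """In-memory node whose version increases by one on every successful write."""

    def __init__(self, data=b""):
        self.data = data
        self.version = 0

    def get(self):
        return self.data, self.version

    def set(self, data, expected_version):
        # The write is applied only if nobody else has written since our read.
        if expected_version != self.version:
            raise BadVersion()
        self.data = data
        self.version += 1


def atomic_append(node, item, max_retries=10):
    """Append item to the node's data, retrying if another writer got in first."""
    for _ in range(max_retries):
        data, version = node.get()
        try:
            node.set(data + item, version)
            return version + 1
        except BadVersion:
            continue  # someone else updated the node; re-read and try again
    raise RuntimeError("too much contention, giving up")


node = ToyVersionedNode()
atomic_append(node, b"x")
atomic_append(node, b"y")
print(node.get())
```

A write that carries a stale version is rejected rather than silently overwriting the other update, so each append either lands on top of the latest data or is retried.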

Re: ZK recovery questions

2010-07-21 Thread Ted Dunning
My own experiments in my own environment where ZK is being used purely for coordination at a fairly low transaction rate (tens to hundreds of ops per second, mostly status updates) made me feel that disk throughput would only be detectable as an issue for pretty massively abused ZK applications.

Re: Adding observers

2010-07-21 Thread Ted Dunning
On Wed, Jul 21, 2010 at 10:30 AM, Avinash Lakshman avinash.laksh...@gmail.com wrote: (1) Is it possible to increase the number of observers in the cluster dynamically? Not quite, but practically speaking you can do as good as this. In general, pretty much any ZK configuration change can be

Re: Adding observers

2010-07-21 Thread Ted Dunning
It is really simpler than you can imagine. Something like this should be plenty sufficient. for h in $ZK_HOSTS; do ssh $h $ZK_HOME/bin/zkServer.sh restart; sleep 5; done This is just something I typed in, not something I checked. It is intended to give you the idea. I will

Re: getChildren() when the number of children is very large

2010-07-20 Thread Ted Dunning
Creating a new znode for each update isn't really necessary. Just create a file that will contain all of the updates for the next snapshot and do atomic updates to add to the list of updates belonging to that snapshot. When you complete the snapshot, you will create a new file. After a time you

Re: ZK recovery questions

2010-07-19 Thread Ted Dunning
They don't auto-detect. What is usually done is that the configurations on all the servers are changed and they are re-started one at a time. On Mon, Jul 19, 2010 at 8:35 PM, Ashwin Jayaprakash ashwin.jayaprak...@gmail.com wrote: So, what happens when a new replacement server has to be

Re: ZK recovery questions

2010-07-18 Thread Ted Dunning
On Sun, Jul 18, 2010 at 3:34 PM, Ashwin Jayaprakash ashwin.jayaprak...@gmail.com wrote: - If 1 out of 3 servers crashes and the log files are unrecoverable, how do we provision a replacement server? Just start it and it will download a snapshot from the other servers. - If the

Re: cleanup ZK takes 40-60 seconds

2010-07-16 Thread Ted Dunning
I can't comment on the cleanup time, but I can suggest that it is normally not a very good idea to embed Zookeeper in your application. If your application really is distributed, then having ZK survive the demise of any particular instance is a really nice thing. If ZK goes away with your

Re: Achieving quorum with only half of the nodes

2010-07-14 Thread Ted Dunning
On Wed, Jul 14, 2010 at 2:16 PM, Sergei Babovich sbabov...@demandware.com wrote: Yep... I see. This is a problem. Any better idea? I think that the production of slightly elaborate quorum rules to handle specific failure modes isn't a reasonable thing. What you need to do in conjunction is to

Re: Regarding Leader election and the limit on number of clients without performance degradation

2010-07-12 Thread Ted Dunning
Having 16 clients all wake up and ping ZK is an extremely light load. The warning on the recipes page had more to do with the situation where thousands of nodes wake up at the same time. On Mon, Jul 12, 2010 at 1:30 PM, Srikanth Bondalapati: sbondalap...@tagged.com wrote: Hi, I am using

Re: Guaranteed message delivery until session timeout?

2010-06-30 Thread Ted Dunning
Which API are you talking about? C? I think that the difference between connection loss and session expiration might mess you up slightly in your disjunction here. On Wed, Jun 30, 2010 at 7:45 AM, Bryan Thompson br...@systap.com wrote: Hello, I am wondering what guarantees (if any)

Re: Guaranteed message delivery until session timeout?

2010-06-30 Thread Ted Dunning
Isn't this the same question that you sent this morning? On Wed, Jun 30, 2010 at 3:36 PM, Bryan Thompson br...@systap.com wrote: Hello, I am wondering what guarantees (if any) zookeeper provides for reliable messaging for operation return codes up to a session timeout. Basically, I would

Re: Guaranteed message delivery until session timeout?

2010-06-30 Thread Ted Dunning
Also this: Once an update has been applied, it will persist from that time forward until a client overwrites the update. This guarantee has two corollaries: If a client gets a successful return code, the update will have been applied. On some failures (communication errors, timeouts, etc) the

Re: Guaranteed message delivery until session timeout?

2010-06-30 Thread Ted Dunning
Yes. That is true. In particular, your link to a server (or the server itself) can fail causing your client to switch to a different ZK server and retry there. This can and often does happen without you knowing. On Wed, Jun 30, 2010 at 4:48 PM, Bryan Thompson br...@systap.com wrote: With

Re: Guaranteed message delivery until session timeout?

2010-06-30 Thread Ted Dunning
I think that you are correct, but a real ZK person should answer this. On Wed, Jun 30, 2010 at 4:48 PM, Bryan Thompson br...@systap.com wrote: For example, if a client registers a watch, and a state change which would trigger that watch occurs _after_ the client has successfuly registered the

Re: Receive timed out error while starting zookeeper server

2010-06-27 Thread Ted Dunning
Are you sure that you understand that there really isn't a good concept of a master and slave in zookeeper (at least not by default)? Are you actually starting servers on all of your machines in your cluster? On Sat, Jun 26, 2010 at 6:53 AM, Peeyush Kumar ago...@gmail.com wrote: I have a 6

Re: Free Software Solution to continuously load a large number of feeds with several servers?

2010-06-19 Thread Ted Dunning
You don't say what you mean by feed. The bixo system might be helpful to you. http://bixolabs.com/ On Fri, Jun 18, 2010 at 11:01 AM, Thomas Koch tho...@koch.ro wrote: http://stackoverflow.com/questions/3072042/free-software-solution-to- continuously-load-a-large-number-of-feeds-with-several

Re: Debugging help for SessionExpiredException

2010-06-15 Thread Ted Dunning
Jordan, Good step to get this info. I have to ask, did you have your disconnect problem last night as well? (just checking) What does the stat command on ZK give you for each server? On Tue, Jun 15, 2010 at 10:33 AM, Jordan Zimmerman jzimmer...@proofpoint.com wrote: More on this... I ran

Re: Debugging help for SessionExpiredException

2010-06-10 Thread Ted Dunning
Possibly. I have seen GC times of 4 minutes on some large processes. Better to set the GC parameters so you don't get long pauses. On http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting it mentions using the -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC options. I recommend adding

Re: Debugging help for SessionExpiredException

2010-06-10 Thread Ted Dunning
) that isn't getting a lot of traffic. i.e. 1 zookeeper instance that we're testing with. On Jun 10, 2010, at 4:06 PM, Ted Dunning wrote: Possibly. I have seen GC times of 4 minutes on some large processes. Better to set the GC parameters so you don't get long pauses. On http

Re: Debugging help for SessionExpiredException

2010-06-09 Thread Ted Dunning
This can depend on which kind of instance you invoke as well. The smallest instances disappear for short periods of time and that can lead to surprises. On Wed, Jun 9, 2010 at 3:35 PM, Lei Zhang lzvoya...@gmail.com wrote: On EC2 (still CentOS as guest OS), we consistently run into zk session

Re: Simulating failures?

2010-06-04 Thread Ted Dunning
I use mock objects to create a simulated ZK object. Alternatively, you may be able to sub-class and delegate all ZK calls. That would let you inject faults. On Fri, Jun 4, 2010 at 11:28 AM, Stephen Green eelstretch...@gmail.com wrote: Is there any way to inject failures into the ZK client so

Re: zookeeper crash

2010-06-02 Thread Ted Dunning
This looks a bit like a small bobble we had when upgrading a bit ago. I THINK that the answer here is to mind-wipe the misbehaving node and have it resynch from scratch from the other nodes. Wait for confirmation from somebody real. On Wed, Jun 2, 2010 at 11:11 AM, Charity Majors

Re: zookeeper crash

2010-06-02 Thread Ted Dunning
I knew Patrick would remember to add an important detail. On Wed, Jun 2, 2010 at 11:49 AM, Patrick Hunt ph...@apache.org wrote: As Ted suggested you can remove the datadir -- *only on the affected server* -- and then restart it.

Re: Zookeeper, Maven and dependencies on javax jar files

2010-05-24 Thread Ted Dunning
Which version of maven do you have? I have heard some versions don't follow redirects well. You can try deleting these defective files in your local repository under .m2 and try again. You may need to try with a newer maven to get things right. Another option is to explicitly remove those

Re: Zookeeper, Maven and dependencies on javax jar files

2010-05-24 Thread Ted Dunning
The only one that I think is important is the jmx which enables monitoring of the servers. On Mon, May 24, 2010 at 2:51 PM, Jack Orenstein j...@akiban.com wrote: This at least gets me through the build/install phase. My usage of zookeeper is pretty minimal right now -- just on a single node.

Re: Zookeeper, Maven and dependencies on javax jar files

2010-05-24 Thread Ted Dunning
Same version I use. On Mon, May 24, 2010 at 2:51 PM, Jack Orenstein j...@akiban.com wrote: Ted Dunning wrote: Which version of maven do you have? 2.2.1.

Re: Ping and client session timeouts

2010-05-21 Thread Ted Dunning
You may actually be swapping. That can be even worse than GC! On Fri, May 21, 2010 at 11:32 AM, Stephen Green eelstretch...@gmail.com wrote: Right. The system can be very memory-intensive, but at the time these are occurring, it's not under a really heavy load, and there's plenty of heap

Re: Pathological ZK cluster: 1 server verbosely WARN'ing, other 2 servers pegging CPU

2010-05-12 Thread Ted Dunning
Impressive number here, especially at your quoted few per second rate. Are you sure that you haven't inadvertently synchronized GC on multiple machines? On Wed, May 12, 2010 at 8:30 PM, Aaron Crow dirtyvagab...@yahoo.com wrote: Right now we're at 1.9 million. This isn't a bug of our

Re: Pathological ZK cluster: 1 server verbosely WARN'ing, other 2 servers pegging CPU

2010-05-12 Thread Ted Dunning
Yes. That is roughly what I mean. If one server starts a GC, it can effectively go offline. That might pressure the other servers enough that one of them starts a GC. This is unlikely with your GC settings, but you should turn on the verbose GC logging to be sure. On Wed, May 12, 2010 at

Re: ZKClient

2010-05-04 Thread Ted Dunning
This is used as part of katta where it gets a fair bit of exercise at low update rates with small data. It is used for managing the state of the search cluster. I don't think it has had much external review or use for purposes apart from katta. Katta generally has pretty decent code, though.

Re: ZKClient

2010-05-04 Thread Ted Dunning
I don't think that zk is hard to get right. What is hard is to layer a very different model on top of ZK that changes the semantics significantly and get that translation right. One of the very cool things about ZK is how easy it is to write correct code. I know that Ben and co put a lot of

Re: ZKClient

2010-05-04 Thread Ted Dunning
, 2010 at 2:21 PM, Ted Dunning ted.dunn...@gmail.com wrote: In general, writing this sort of layer on top of ZK is very, very hard to get really right for general use. In a simple use-case, you can probably nail it but distributed systems are a Zoo, to coin a phrase. The problem

Re: Question on maintaining leader/membership status in zookeeper

2010-04-30 Thread Ted Dunning
and Slave(s) are broken while all other connections are still alive, would my system hang after some point? Because no new leader election will be initiated by slaves and the leader can't get the work to slave(s). Thanks, Lei On 4/30/10 1:54 PM, Ted Dunning ted.dunn...@gmail.com

Re: zookeeper consistency model?

2010-04-29 Thread Ted Dunning
In general, the guarantee is that B will do exactly as you say: it will read either the new value or the old value. Your question depends on a definition of now that spans several machines. That is a dangerous concept and if your reasoning requires it, you are headed for trouble. On Thu, Apr 29,

Re: zookeeper consistency model?

2010-04-29 Thread Ted Dunning
, this is my browser homepage ;-) http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing Patrick On 04/29/2010 09:14 AM, Ted Dunning wrote: In general, the guarantee is that B will do exactly as you say it will read the new value or the old value. Your question depends

Re: Using Zookeeper to distribute tasks

2010-04-27 Thread Ted Dunning
The general way to do this is either a) have lots of watchers who all try to create a single file when a watched file changes. This is very simple to code, but leads to a lot of notifications when you have thousands of watchers. b) arrange the watchers in a chain. This is similar to the

Re: Bizarre ZooKeeper Client Behaviour

2010-04-27 Thread Ted Dunning
Lei, A contrary question for you is why you don't just share zk sessions within a single process. On Tue, Apr 27, 2010 at 5:17 PM, Lei Zhang lzvoya...@gmail.com wrote: I am in the process of changing to each thread of each daemon maintaining a zk session. That means we will hit this 10

Re: Embedding ZK in another application

2010-04-23 Thread Ted Dunning
It is, of course, your decision, but a key coordination function is to determine whether your application is up or not. That is very hard to do if Zookeeper is inside your application. On Fri, Apr 23, 2010 at 10:28 AM, Asankha C. Perera asan...@apache.org wrote: However, I believe that both the

odd error message

2010-04-20 Thread Ted Dunning
We have just done an upgrade of ZK to 3.3.0. Previous to this, ZK has been up for about a year with no problems. On two nodes, we killed the previous instance and started the 3.3.0 instance. The first node was a follower and the second a leader. All went according to plan and no clients seemed

Re: Would this work?

2010-04-20 Thread Ted Dunning
I can't comment on the details of your code (but I have run in-process ZK's in the past without problem) Operationally, however, this isn't a great idea. The problem is two-fold: a) firstly, somebody would probably like to look at Zookeeper to understand the state of your service. If the

Re: user cousult

2010-04-01 Thread Ted Dunning
On Thu, Apr 1, 2010 at 7:27 PM, li li liqiyuan...@gmail.com wrote: Now I can handle about 300 clients with one server, when I set the session time out is 3. In your opinion, the session time out is set in which value more suitable? 5-30 seconds is a much more typical value.

Re: the error

2010-03-31 Thread Ted Dunning
Suppose a machine has probability of soft-failure p_1 and catastrophic p_2 p_1. Assume that two machines have independent failure modes. Probability of soft failure of a one machine cluster = p_1, two machine cluster = probability of soft failure of 1 or 2 machines + probability of one machine

Re: the error

2010-03-31 Thread Ted Dunning
As I pointed out in my response, you should distinguish hard and soft failures. If one machine fails even catastrophically, you can provide a new machine to replace it, thus converting a hard failure into a soft one. The conclusion is the same. Three machines is vastly better than one or two.
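
The comparison between one, two, and three machines can be checked numerically. This is a minimal Python sketch under stated assumptions: machines fail independently with the same probability p, the cluster is usable only while a strict majority is up, and hard and soft failures are not distinguished. The function name p_quorum_lost is hypothetical.

```python
from math import comb


def p_quorum_lost(n, p):
    """Probability that fewer than a majority of n independent machines are up."""
    quorum = n // 2 + 1
    max_down = n - quorum  # failures the cluster can absorb
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(max_down + 1, n + 1)
    )


for n in (1, 2, 3):
    print(n, p_quorum_lost(n, 0.01))
```

With p = 0.01 under these assumptions, two machines come out roughly twice as likely to lose quorum as one, because either failure is fatal, while three machines come out more than thirty times less likely than one.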

Re: How to ensure trasaction create-and-update

2010-03-30 Thread Ted Dunning
As usual, Ben says better what I was trying to say. Henry's point that a very limited multi-update would be useful is also true, though. If somebody can come up with a way to do that without making things unreasonably complicated, it would be really nice to have. In the meantime, I will try to

Re: How to ensure trasaction create-and-update

2010-03-29 Thread Ted Dunning
This is not a good thing. ZK gains lots of its power and reliability by not trying to do atomic updates to multiple znodes at once. Can you say more about the update that you want to do? It is common for updates like this to be such that you can order the updates and do without a truly atomic

Re: How to ensure trasaction create-and-update

2010-03-29 Thread Ted Dunning
I perhaps should not have said power, except insofar as ZK's strengths are in reliability which derives from simplicity. There are essentially two common ways to implement multi-node update. The first is the traditional db style with begin-transaction paired with either a commit or a rollback

Re: Re: How to ensure trasaction create-and-update

2010-03-29 Thread Ted Dunning
as a whole. 2010-03-30 Will From: Ted Dunning ted.dunn...@gmail.com Sent: 2010-03-30 10:11 Subject: Re: How to ensure trasaction create-and-update To: zookeeper-user@hadoop.apache.org This is not a good thing. ZK gains lots of its power and reliability by not trying to do atomic updates

Re: Modify ZooKeeper Java client to hold weak references to Watcher objects

2010-03-18 Thread Ted Dunning
This kind of sounds strange to me. My typical idiom is to create a watcher but not retain any references to it outside the client. It sounds to me like your change will cause my watchers to be collected and deactivated when GC happens. On Thu, Mar 18, 2010 at 3:32 AM, Dominic Williams

Re: permanent ZSESSIONMOVED

2010-03-16 Thread Ted Dunning
Hmm... this inspires me to have a thought as well. Łukasz, there isn't any fancy network stuff going on here is there? No NATing or fancy load balancing or reassignment of IP addresses of servers, right? On Tue, Mar 16, 2010 at 4:51 PM, Patrick Hunt ph...@apache.org wrote: It will be good to

Re: java heap size

2010-03-15 Thread Ted Dunning
Your understanding is correct. But if you set a heap size nearly as big as your physical memory (or larger) then java may allocate that heap which will cause swapping. So swapping is definitely done by the OS, but it is the applications like Java that can cause the OS to do it. On Mon, Mar 15,

Re: persistent storage and node recovery

2010-03-15 Thread Ted Dunning
I don't think that you have considered the impact of ordered updates here. On Mon, Mar 15, 2010 at 6:19 PM, Maxime Caron maxime.ca...@gmail.com wrote: So this is all about the operation log so if a node is in minority but have more recent committed value this node is in Veto over the other

Re: persistent storage and node recovery

2010-03-15 Thread Ted Dunning
I like to say that the cost of now goes up dramatically with diameter. On Mon, Mar 15, 2010 at 7:50 PM, Henry Robinson he...@cloudera.com wrote: There is a fundamental tension between synchronicity of updates and scale.

Re: Ok to share ZK nodes with Hadoop nodes?

2010-03-08 Thread Ted Dunning
I have used 5 and 3 in different clusters. Moderate amounts of sharing are reasonable, but sharing with less intensive applications is definitely better. Sharing with the job tracker, for instance, is likely fine since it doesn't abuse disk so much. The namenode is similar, but not quite as nice.

Re: Managing multi-site clusters with Zookeeper

2010-03-07 Thread Ted Dunning
If you can stand the latency for updates then zk should work well for you. It is unlikely that you will be able to do better than zk does and still maintain correctness. Do note that you can probably bias the client to use a local server. That should make things more efficient. Sent from my
