Re: Priority Queue?

2011-07-14 Thread Jordan Zimmerman
I won't get a delete event. I'll get a NodeChildrenChanged on the parent path. -JZ On 7/14/11 5:28 PM, Ted Dunning ted.dunn...@gmail.com wrote: Since you can see what the event is, can you test it to see if it is the delete that was just issued? On Thu, Jul 14, 2011 at 5:24 PM, Jordan Zimmerman

Re: Priority Queue?

2011-07-14 Thread Jordan Zimmerman
if the difference is just the item you deleted. If so, you don't need to reset. Likewise, if something else was deleted that is still in your future, you can just delete it from the list of upcoming items and continue. On Thu, Jul 14, 2011 at 5:31 PM, Jordan Zimmerman jzimmer...@netflix.com wrote

Adding nodes to an ensemble

2011-08-12 Thread Jordan Zimmerman
It seems implicit in the docs that it's not possible to add nodes (even Observers) to an existing ensemble without restarting each node in the ensemble. i.e. The zoo.cfg has to be changed for each node. Is this true? Is there any work being done on a broadcast protocol that doesn't require static

Strange responses from server

2011-08-26 Thread Jordan Zimmerman
I have a three node ensemble. Each node responds to ruok with imok. Yet, each one responds to stat with This ZooKeeper instance is not currently serving requests. What gives? Which one should I trust? How might it get in this state and what's the correct way to fix it? -Jordan

Re: Strange responses from server

2011-08-26 Thread Jordan Zimmerman
means that it is not currently a member of the quorum. You'd need to check the logs to see why it's not participating. Patrick On Fri, Aug 26, 2011 at 11:06 AM, Jordan Zimmerman jzimmer...@netflix.com wrote: I have a three node ensemble. Each node responds to ruok with imok. Yet, each one

Re: Strange responses from server

2011-08-26 Thread Jordan Zimmerman
wrote: Not really. Diagnosing the network is an important step. Diagnosing quorum formation is the next step. On Fri, Aug 26, 2011 at 11:31 AM, Jordan Zimmerman jzimmer...@netflix.com wrote: Thanks - that makes ruok essentially useless. Grrr... On 8/26/11 11:10 AM, Patrick Hunt ph

sync()

2011-08-30 Thread Jordan Zimmerman
When/why does sync() need to be called? I've searched the archive and the docs are non-existent. Any examples would be appreciated. -JZ

Re: sync()

2011-08-30 Thread Jordan Zimmerman
/r3.3.3/zookeeperProgrammers.html#ch_zkGuarantees specifically the section Simultaneously Consistent Cross-Client Views Patrick On Tue, Aug 30, 2011 at 9:52 AM, Jordan Zimmerman jzimmer...@netflix.com wrote: When/why does sync() need to be called? I've searched the archive and the docs are non

'kill' isn't very deadly

2011-09-06 Thread Jordan Zimmerman
I have a script that executes 'kill' on an instance when needed. However, the process isn't dying. After the kill, ps still shows the process running. Any ideas? -JZ

Re: 'kill' isn't very deadly

2011-09-06 Thread Jordan Zimmerman
Smith philip_sm...@apple.com wrote: kill -9 Review the man page to understand how a process responds to the different signals. On Sep 6, 2011, at 10:39 AM, Jordan Zimmerman wrote: I have a script that executes 'kill' on an instance when needed. However, the process isn't dying. After the kill

Re: 'kill' isn't very deadly

2011-09-06 Thread Jordan Zimmerman
Lol - that would explain why it isn't working On 9/6/11 11:12 AM, Ted Dunning ted.dunn...@gmail.com wrote: I believe (but should check) that the kill command is disabled by default. On Tue, Sep 6, 2011 at 6:10 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: I'm referring to the four-letter

Re: Lock recipes and the lock path

2011-09-20 Thread Jordan Zimmerman
I gave it a vote - thanks. On 9/20/11 12:44 PM, Fournier, Camille F. camille.fourn...@gs.com wrote: Well, if you are locking member IDS that are generated strings that will never be reused, the best you can do right now is clean those up after a period of time. If, on the other hand, your

ANN: Curator - Netflix's ZooKeeper library

2011-10-10 Thread Jordan Zimmerman
https://github.com/Netflix/curator What is Curator? Curator n: a keeper or custodian of a museum or other collection - A ZooKeeper Keeper. Curator is a set of Java libraries that make using Apache ZooKeeper much easier. While ZooKeeper comes bundled with a Java client, using the client is

Re: ANN: Curator - Netflix's ZooKeeper library

2011-10-11 Thread Jordan Zimmerman
We'd appreciate any/all feedback, BTW. -JZ On 10/11/11 1:04 PM, Mahadev Konar maha...@hortonworks.com wrote: Nice. Good to see the Apache License. mahadev On Mon, Oct 10, 2011 at 2:22 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: https://github.com/Netflix/curator What is Curator

Re: ANN: Curator - Netflix's ZooKeeper library

2011-10-11 Thread Jordan Zimmerman
The major raison d'être for Curator is to make adding recipes much easier. Using the Curator Framework APIs you can easily build new usages/recipes and not worry about connection management. -JZ On 10/11/11 1:46 PM, Ted Dunning ted.dunn...@gmail.com wrote: What I prefer to see in this context

Re: ANN: Curator - Netflix's ZooKeeper library

2011-10-11 Thread Jordan Zimmerman
Sorry. I didn't mean to come across as prickly. Jordan Zimmerman On Oct 11, 2011, at 4:52 PM, Camille Fournier cami...@apache.org wrote: You asked for feedback, I gave it.

Re: ANN: Curator - Netflix's ZooKeeper library

2011-10-11 Thread Jordan Zimmerman
Good 3 ed. :) Jordan Zimmerman On Oct 11, 2011, at 7:46 PM, Ted Dunning ted.dunn...@gmail.com wrote: Don't worry about being prickly. Camille and I can beat you any day on that account. Just be good hearted and serious about making things good while you are being

Re: ANN: Curator - Netflix's ZooKeeper library

2011-10-11 Thread Jordan Zimmerman
It might be possible to put a Scala wrapper around the recipe classes. I'll see what I can do - or if you want to contribute that ;) Jordan Zimmerman On Oct 11, 2011, at 4:40 PM, Joe Stein charmal...@allthingshadoop.com wrote: It's not written in Scala :( otherwise

Re: ANN: Curator - Netflix's ZooKeeper library

2011-10-13 Thread Jordan Zimmerman
of being unable to delete the directories. C On Tue, Oct 11, 2011 at 10:53 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: Good 3 ed. :) Jordan Zimmerman On Oct 11, 2011, at 7:46 PM, Ted Dunning ted.dunn...@gmail.com wrote: Don't worry about being

Re: Locks based on ephemeral nodes - Handling network outage correctly

2011-10-14 Thread Jordan Zimmerman
FYI - Curator checks for KeeperException.Code.NODEEXISTS in its retry loop and just ignores it treating it as a success. I'm not sure if other libraries do that. So, this is a case that a disconnection can be handled generically. -JZ On 10/14/11 7:20 AM, Fournier, Camille F.

Re: Locks based on ephemeral nodes - Handling network outage correctly

2011-10-14 Thread Jordan Zimmerman
with sequential files since you don't know who created any other znodes out there. On Fri, Oct 14, 2011 at 9:39 AM, Jordan Zimmerman jzimmer...@netflix.com wrote: FYI - Curator checks for KeeperException.Code.NODEEXISTS in its retry loop and just ignores it treating it as a success. I'm not sure if other

Re: Locks based on ephemeral nodes - Handling network outage correctly

2011-10-14 Thread Jordan Zimmerman
Actually, as I think about it, it's incorrect to ignore KeeperException.Code.NODEEXISTS. This is because create() also takes a byte[] to set as the data. Another process may have created the node. I need to rethink this... On 10/14/11 9:51 AM, Jordan Zimmerman jzimmer...@netflix.com wrote

Re: Locks based on ephemeral nodes - Handling network outage correctly

2011-10-14 Thread Jordan Zimmerman
that works for every use case. C -Original Message- From: Jordan Zimmerman [mailto:jzimmer...@netflix.com] Sent: Friday, October 14, 2011 12:39 PM To: user@zookeeper.apache.org; 'Mike Schilli' Subject: Re: Locks based on ephemeral nodes - Handling network outage correctly FYI - Curator checks

Re: Something about zkclient

2011-10-19 Thread Jordan Zimmerman
This group does not support zkClient. You should bring it up on the zkClient website. -JZ On 10/19/11 4:57 PM, nileader nilea...@gmail.com wrote: Nice to hear from you. C. So, why tricky. In zk's original api, if you add an error authInfo, then you will receive an exception of NoAuth. Then

Node not joining ensemble

2011-10-21 Thread Jordan Zimmerman
I have a node that I restarted and it's not joining the ensemble. I ask it 'stat' and it says it isn't serving any requests. Here's a snippet from the log: 2011-10-21 22:46:23,166 - INFO [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@697] - Notification time out: 800 2011-10-21 22:46:23,167 - INFO

Re: Node not joining ensemble

2011-10-21 Thread Jordan Zimmerman
FYI - I turned on DEBUG and here's more log info: 2011-10-21 23:33:06,732 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@510] - id: 3, proposed id: 3, zxid: 12885265585, proposed zxid: 12885265585 2011-10-21 23:33:06,732 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@727] - Adding

Re: Node not joining ensemble

2011-10-21 Thread Jordan Zimmerman
Interesting. I restarted Server 2 in the ensemble and the problem cleared itself. -JZ On 10/21/11 4:34 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: FYI - I turned on DEBUG and here's more log info: 2011-10-21 23:33:06,732 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@510] - id: 3

RE: Distributed lock via ZkClient

2011-10-23 Thread Jordan Zimmerman
FYI - Netflix has just open-sourced a new library that does many recipes: https://github.com/Netflix/curator From: zknewbie [sza...@narus.com] Sent: Sunday, October 23, 2011 8:41 PM To: zookeeper-u...@hadoop.apache.org Subject: Distributed lock via

Major issue with recipe doc

2011-10-27 Thread Jordan Zimmerman
http://zookeeper.apache.org/doc/trunk/recipes.html#sc_recipes_Locks I've just been informed that the recipe for Locking (and probably others) is seriously incorrect. When creating a sequential ephemeral node, there is a possibility that the ZK server can crash before the result is returned to

Re: ANN: Curator - Netflix's ZooKeeper library

2011-11-04 Thread Jordan Zimmerman
could you give more information about your client library? I'd like to include it in my list of work that people were forced to do to work around the shortcomings of the current ZooKeeper client API. You'd be number four after A lot could be solved by adding a retry (injectable please)

More sync questions

2011-11-10 Thread Jordan Zimmerman
A while back I asked about sync() and got responses that said it's only needed for reads (getData, getChildren). I was looking through the source and it appears that this does _not_ apply to exists(). Am I reading that correctly? i.e. will exists() always return an accurate (in terms of the

Re: Missing session state handling in most Leader Election implementations

2011-11-13 Thread Jordan Zimmerman
On 11/13/11 3:40 PM, Jérémie BORDIER jeremie.bord...@gmail.com wrote: As noticed in ZOOKEEPER-1209, this can cause really important issues. As Leader election is one of the most demanded feature / recipe, I would really like to see the official recipe fixed and fully functional. Curator handles

Re: Missing session state handling in most Leader Election implementations

2011-11-13 Thread Jordan Zimmerman
On 11/13/11 4:45 PM, Jérémie BORDIER jeremie.bord...@gmail.com wrote: Hello Jordan, Thanks a lot for your answer. I tried to figure out where the handling of Disconnected / Expired takes place, but so far I understood that to have notifyClientClosing() called from the Lock, an exception needs to

Re: Missing session state handling in most Leader Election implementations

2011-11-18 Thread Jordan Zimmerman
/18/11 9:52 AM, Ted Dunning ted.dunn...@gmail.com wrote: Is the background sync even necessary? The ZK client itself will re-establish connection if it can. I think that LOST should only be sent on session expiration. On Fri, Nov 18, 2011 at 1:07 AM, Jordan Zimmerman jzimmer...@netflix.com wrote

Re: SQS Implementation with ZooKeeper

2011-12-06 Thread Jordan Zimmerman
Curator has both a Queue and a Priority Queue as described on the ZooKeeper recipes page: https://github.com/Netflix/curator/wiki/Recipes -JZ On 12/6/11 11:16 AM, Mike Schilli m...@perlmeister.com wrote: Seems like it should be possible to implement a queue service like Amazon's SQS [1] with

ANN: Curator 1.0

2011-12-31 Thread Jordan Zimmerman
FYI https://github.com/Netflix/curator I've decided to move Curator to 1.0 as it is feature complete and has seen good usage since its initial announcement. Some highlights of developments since the initial announcement: * An implementation for every recipe on the ZooKeeper recipe wiki (sans

Re: ANN: Curator 1.0

2012-01-03 Thread Jordan Zimmerman
at 2:12 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: FYI https://github.com/Netflix/curator I've decided to move Curator to 1.0 as it is feature complete and has seen good usage since its initial announcement. Some highlights of developments since the initial announcement

Re: Curator question

2012-01-03 Thread Jordan Zimmerman
Not currently. I'm working on 3.4 compatibility and should have something soon. -JZ On 1/3/12 11:09 AM, Dima Gutzeit dima.gutz...@mailvision.com wrote: I have a small question to Curator author/users, does it support batch operations introduced in 3.4 ? Thanks in advance. Regards, Dima

Connections take longer in 3.4.x?

2012-01-03 Thread Jordan Zimmerman
I'm updating Curator to 3.4.x (3.4.2 to be exact). A lot of my tests have failed and the culprit is that it appears to be taking a lot longer to get the initial SysConnected. Of course, this is a flaw in my tests, but I thought I'd mention it and see if anyone else has noticed this. -JZ

Re: Connections take longer in 3.4.x?

2012-01-03 Thread Jordan Zimmerman
or just generally slower. No load at all. These are just unit tests, 1 or 2 connections. jdk versions change? java version 1.6.0_29 (Apple) -JZ On 1/3/12 4:20 PM, Patrick Hunt ph...@apache.org wrote: On Tue, Jan 3, 2012 at 4:02 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: I'm updating Curator

Re: Multi doc?

2012-01-04 Thread Jordan Zimmerman
://issues.apache.org/jira/browse/ZOOKEEPER-1336 Contributing docs is a great way to get started as a zk contributor (hint hint ;-) ) Patrick On Wed, Jan 4, 2012 at 7:30 AM, Ted Dunning ted.dunn...@gmail.com wrote: Not really. On Wed, Jan 4, 2012 at 12:47 AM, Jordan Zimmerman jzimmer...@netflix.com wrote

Re: Multi doc?

2012-01-04 Thread Jordan Zimmerman
Did the patch for https://issues.apache.org/jira/browse/ZOOKEEPER-1336 not do a good enough job on that? Missed that - thanks. On 1/4/12 10:20 AM, Patrick Hunt ph...@apache.org wrote: On Wed, Jan 4, 2012 at 10:03 AM, Jordan Zimmerman jzimmer...@netflix.com wrote: * Can someone give more

zoo.cfg Server ID

2012-01-04 Thread Jordan Zimmerman
If I read the code correctly, the server ID in zoo.cfg (i.e. server.n=foo) does _not_ have to be sequential, 1 based, etc. i.e. it can be anything useful to me. For example, I could do: server.10064=foo:288:3888 server.18=bar:288:3888 server.65535=snafu:288:3888 Am I correct? -JZ

RE: Use cases for ZooKeeper

2012-01-04 Thread Jordan Zimmerman
Hi Josh, Second use case: Distributed locking This is one of the most common uses of ZooKeeper. There are many implementations - one included with the ZK distro. Also, there is Curator: https://github.com/Netflix/curator First use case: Distributing work to a cluster of nodes This sounds

Re: Use cases for ZooKeeper

2012-01-05 Thread Jordan Zimmerman
Stone pacesysj...@gmail.com wrote: Thanks for the response. Comments below: On Wed, Jan 4, 2012 at 10:46 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: Hi Josh, Second use case: Distributed locking This is one of the most common uses of ZooKeeper. There are many implementations - one

Re: Use cases for ZooKeeper

2012-01-05 Thread Jordan Zimmerman
Care to work on it? On 1/5/12 12:50 AM, Ted Dunning ted.dunn...@gmail.com wrote: This pattern would make a nice addition to Curator, actually. It comes up repeatedly in different contexts.

Re: Use cases for ZooKeeper

2012-01-05 Thread Jordan Zimmerman
: Is the distributed queue effectively located by a single z-node? What happens when that node goes down? Will a node going down still clear any distributed locks? Josh On Thu, Jan 5, 2012 at 9:41 AM, Jordan Zimmerman jzimmer...@netflix.com wrote: FYI - Curator has a resilient message Queue: https://github.com

ANN: Curator 1.1.0

2012-01-05 Thread Jordan Zimmerman
I've just released Curator 1.1.0 that adds support for ZooKeeper transactions. I'm now going to maintain two branches of Curator: * 1.0.x for ZooKeeper 3.3.x * 1.1.x+ for ZooKeeper 3.4.x+ The Curator Transaction APIs use the same oh-so-cool Fluent style as the rest of Curator. E.g.

Re: Use cases for ZooKeeper

2012-01-12 Thread Jordan Zimmerman
is highly pluggable. This pattern would make a nice addition to Curator, actually. It comes up repeatedly in different contexts. On Thu, Jan 5, 2012 at 12:11 AM, Jordan Zimmerman jzimmer...@netflix.com wrote: OK - so this is two options for doing the same thing. You use a Leader Election algorithm

Backups

2012-01-13 Thread Jordan Zimmerman
As a backup strategy, it seems I would only want to backup snapshots from the leader. Does that make sense? -JZ

Re: Backups

2012-01-16 Thread Jordan Zimmerman
time you want to take a backup? That would be the downside to this strategy I would think. C From my phone On Jan 13, 2012 5:24 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: As a backup strategy, it seems I would only want to backup snapshots from the leader. Does that make sense? -JZ

Re: Backups

2012-01-17 Thread Jordan Zimmerman
OK - I'll give you access to the repo as soon as it's in a reasonable state. On 1/17/12 10:10 AM, Neha Narkhede neha.narkh...@gmail.com wrote: Jordan, I'd be interested in previewing it. Let me know. Thanks, Neha On Mon, Jan 16, 2012 at 5:42 PM, Jordan Zimmerman jzimmer...@netflix.com wrote

Re: Backups

2012-01-19 Thread Jordan Zimmerman
durability would be acceptable. If it is not acceptable and you still want to have a backup, then I don't see a way other than shutting down the clients before you take a backup, which doesn't seem to be what is being proposed here. -Flavio On Jan 18, 2012, at 1:38 AM, Jordan Zimmerman wrote: Neha

Re: Backups

2012-01-19 Thread Jordan Zimmerman
you take a backup, which doesn't seem to be what is being proposed here. -Flavio On Jan 18, 2012, at 1:38 AM, Jordan Zimmerman wrote: Neha - can you send me your email address. Send it to: jzimmer...@netflix.com On 1/17/12 10:10 AM, Neha Narkhede neha.narkh...@gmail.com wrote: Jordan

Re: Backups

2012-01-19 Thread Jordan Zimmerman
Correct On 1/19/12 11:30 AM, Flavio Junqueira f...@yahoo-inc.com wrote: You're not talking about data corruption, are you? It is incorrect data that has been introduced by a user or application by mistake. Am I getting it right? -Flavio On Jan 19, 2012, at 8:07 PM, Jordan Zimmerman wrote

Re: Healthcheck using the stat command

2012-01-23 Thread Jordan Zimmerman
The problem with 'ruok' is that it doesn't tell you the state of the Instance. 'ruok' might return 'imok' but the instance might not be serving due to some other error. Only a 'stat' will tell you that. -JZ On 1/23/12 1:51 PM, Philip Smith philip_sm...@apple.com wrote: There is a batch java

Re: Causing ZSESSIONEXPIRED

2012-01-26 Thread Jordan Zimmerman
The Curator Test module has a class that does this: https://github.com/Netflix/curator/blob/master/curator-test/src/main/java/com/netflix/curator/test/KillSession.java On 1/26/12 2:38 PM, Benjamin Reed br...@apache.org wrote: one easy way is to terminate the session using JMX. ben On Thu,

Re: Healthcheck using the stat command

2012-01-26 Thread Jordan Zimmerman
Is 'srvr' the same as 'stat' but without the clients? I'm relying on it in a monitor app. I need the Mode and the "not currently serving" message. -JZ On 1/25/12 3:32 PM, Patrick Hunt ph...@apache.org wrote: Prefer srvr over stat in most cases - stat returns details on the connections which
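
The health check discussed in this thread can be sketched in Java. This is a minimal sketch, not code from the thread: the class and method names (FourLetterHealthCheck, send, isServing, mode) are hypothetical. It assumes only what the messages in this archive show: a non-serving instance answers 'stat' with text containing "not currently serving requests", and a serving instance includes a "Mode:" line in its reply.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: sends a four-letter word and interprets the reply.
final class FourLetterHealthCheck {
    // Text quoted in the thread for an instance that is up but not in the quorum.
    static final String NOT_SERVING = "not currently serving requests";

    // Opens a plain TCP connection, writes the command, and reads until the server closes.
    static String send(String host, int port, String command, int timeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write(command.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toString("US-ASCII");
        }
    }

    // An empty reply or the "not serving" message both count as unhealthy.
    static boolean isServing(String response) {
        return response != null && !response.isEmpty() && !response.contains(NOT_SERVING);
    }

    // Returns the value of the "Mode:" line (e.g. "follower"), or null if absent.
    static String mode(String response) {
        for (String line : response.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Mode:")) {
                return trimmed.substring("Mode:".length()).trim();
            }
        }
        return null;
    }
}
```

A monitor might call send(host, 2181, "stat", 2000) and treat the instance as healthy only when isServing(...) is true, instead of relying on 'ruok' alone.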

Re: curator leader reconnect

2012-02-05 Thread Jordan Zimmerman
You can either create a new LeaderSelector or call start() again on your existing leader instance. Whatever's easier for your use-case. -Jordan On 2/5/12 8:09 AM, Hartmut Lang hartmut.l...@googlemail.com wrote: Hi, i work on a small demo application using the Curator Leader-Election. What i

RE: curator leader reconnect

2012-02-07 Thread Jordan Zimmerman
. But no ephemeral node in the cluster. /Hartmut Am 6. Februar 2012 18:46 schrieb Jordan Zimmerman jzimmer...@netflix.com: How are you verifying that there is no ephemeral node? -Jordan On 2/6/12 9:28 AM, Hartmut Lang hartmut.l...@googlemail.com wrote: Hi Jordan, thanks for your infos. What

Re: curator leader reconnect

2012-02-07 Thread Jordan Zimmerman
, without redoing the lock in the ZK-cluster. This seems not ok for me. But i'm the newbie here. Would be great if you can have a look. /Hartmut Am 7. Februar 2012 09:05 schrieb Jordan Zimmerman jzimmer...@netflix.com: I just pushed a test that simulates the situation you describe and it works

Re: curator leader reconnect

2012-02-07 Thread Jordan Zimmerman
at this point. There is no reasonable alternative that I can think of. I'm still thinking about this, so stay tuned... On 2/7/12 1:26 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: I really appreciate your help Hartmut. You have, indeed, found a bug. My test case didn't precisely replicate your

Re: use cases for asynchronous API

2012-02-08 Thread Jordan Zimmerman
Some of the Curator recipes use async. For instance, adding an entry into the DistributedQueue is done in the background as there's no reason for the foreground process to wait for this. -JZ On 2/8/12 2:09 PM, Pierre Louis Aublin pierre-louis.aub...@inria.fr wrote: Hello everybody I would like

Re: Unit Testing ZooKeeper Based Application

2012-02-21 Thread Jordan Zimmerman
Curator provides an in-memory ZooKeeper Server and ZooKeeper Cluster: <dependency> <groupId>com.netflix.curator</groupId> <artifactId>curator-test</artifactId> <version>1.1.2</version> </dependency> https://github.com/Netflix/curator/tree/master/curator-test/src/main/java/com/netflix/curator/test

Re: Ephemeral Nesting

2012-03-01 Thread Jordan Zimmerman
Yes, there's an issue in Jira on this: https://issues.apache.org/jira/browse/ZOOKEEPER-723 On 3/1/12 3:38 PM, Shelley, Ryan ryan.shel...@disney.com wrote: I know that Ephemeral nodes can't have children, but I was curious if there's been discussion around this lately, in particular around

Re: ZK as Configuration Service

2012-03-02 Thread Jordan Zimmerman
I suggest starting with Curator. It would make a good foundation. -JZ On 3/2/12 12:39 AM, Christopher Schmidt fakod...@googlemail.com wrote: Hi all, we plan to use Zookeeper and I wonder if there is a Java framework out there to use ZK as a central configuration service (holding paths,

ANN: Exhibitor beta

2012-03-04 Thread Jordan Zimmerman
Announcing the beta release of Exhibitor. https://github.com/Netflix/exhibitor Exhibitor is a Java supervisor system for ZooKeeper. It provides a number of features: * Watches a ZK instance and makes sure it is running * Performs periodic backups * Perform periodic cleaning of ZK log

RE: Are curator framework consumers single threaded?

2012-03-06 Thread Jordan Zimmerman
The ListenerContainer for PathChildrenCache allows you to pass an Executor along with your listener. So, that will give you the desired behavior. DistributedQueue is _supposed_ to allow this as well, but it looks like it doesn't. I view that as a bug. If you don't mind, please post an issue on

Rolling upgrades

2012-03-08 Thread Jordan Zimmerman
I've been reading the archives regarding rolling upgrades. Here's the scenario, given a stable ensemble: ZK1 - ZK2 - ZK3 In the above, the zoo.cfg for each server looks like this (pseudo): server.1=ZK1 server.2=ZK2 server.3=ZK3 I want to add a new server, ZK4. If I understand this correctly,

Re: Rolling upgrades

2012-03-08 Thread Jordan Zimmerman
files of all servers at once and restart them, because a majority of the new config necessarily intersects with a majority of the old one, so a server who has the latest state will be elected leader. Alex -Original Message- From: Jordan Zimmerman [mailto:jzimmer...@netflix.com] Sent

Cool news on the Curator/Exhibitor front - rolling release/upgrade

2012-03-12 Thread Jordan Zimmerman
I'm pretty excited about this one so I want to pre-announce it. I'm currently working on rolling release/upgrade support in Exhibitor and Curator. For Exhibitor: * Ensemble list, config values, etc. are centralized (via Amazon S3, a shared file, etc.) * If the ensemble list changes,

Re: Zookeeper SASL Vs Curator

2012-04-13 Thread Jordan Zimmerman
It sounds like you're not setting the correct ACL in your Curator use. Curator supports ACLs. FYI - this issue might be better to post at https://github.com/Netflix/curator/issues (a Curator mailing list is coming soon). -JZ On 4/13/12 5:14 AM, antoniom antonio...@gmail.com wrote: Hi there,

ANN: Exhibitor

2012-04-16 Thread Jordan Zimmerman
Introducing Exhibitor - A Supervisor System for Apache ZooKeeper Exhibitor provides a number of features that make managing a ZooKeeper ensemble much easier: * Instance Monitoring * Log Cleanup * Backup/Restore * Cluster-wide Configuration * Rolling Ensemble Changes * Visualizer * Curator

Re: How to replace a zookeeper server ?

2012-04-18 Thread Jordan Zimmerman
FYI - making this process more reliable is one of the main reasons for Exhibitor: http://techblog.netflix.com/2012/04/introducing-exhibitor-supervisor-system.html On 4/18/12 8:44 AM, Ted Dunning ted.dunn...@gmail.com wrote: As long as the old quorum constitutes a quorum in the new cluster you

Re: Delegating Load within Quorum

2012-04-20 Thread Jordan Zimmerman
Have a look at the constructor for StaticHostProvider. You'll see that it does a Collections.shuffle(this.serverAddresses) - so each client should connect to a random server in the ensemble. -JZ On 4/20/12 5:25 PM, Matthew Ward m...@pixelpipe.com wrote: Hello, I am running an ensemble of 3
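
The effect of that shuffle can be illustrated with a small stand-alone sketch. This is not the StaticHostProvider source: the class name ShuffledHosts is hypothetical, and walking the shuffled list round-robin on reconnect is an assumption made for illustration. The only part taken from the message above is the use of Collections.shuffle on the configured addresses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of spreading clients across an ensemble by shuffling the host list.
final class ShuffledHosts {
    private final List<String> hosts;
    private int index = -1;

    ShuffledHosts(List<String> configured, Random random) {
        if (configured.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        // Copy first so the caller's list is not reordered.
        this.hosts = new ArrayList<>(configured);
        // Each client ends up with its own random order, so first connections spread out.
        Collections.shuffle(this.hosts, random);
    }

    // Returns the next host to try, wrapping around at the end of the list (assumed behavior).
    String next() {
        index = (index + 1) % hosts.size();
        return hosts.get(index);
    }
}
```

Because every client shuffles independently, clients given the same connect string should still end up spread roughly evenly over the servers rather than all landing on the first one listed.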

Re: Delegating Load within Quorum

2012-04-25 Thread Jordan Zimmerman
zkCli.sh seems to use port 8080. Is there any config/argument to change this? I couldn't find one. It's conflicting with another process on the machine. -JZ

Port 8080

2012-04-25 Thread Jordan Zimmerman
Doh - wrong subject. Is there any way to tell zkCli.sh to not register JMX or to not use port 8080 for it? It's conflicting with another process on our server. -JZ On 4/25/12 2:56 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: zkCli.sh seems to use port 8080. Is there any config/argument

Re: adding/removing a zookeeper server

2012-04-26 Thread Jordan Zimmerman
See #6 in the FAQ: http://wiki.apache.org/hadoop/ZooKeeper/FAQ Also, FYI, we recently open sourced a ZooKeeper supervisor app that makes these kinds of tasks simpler: https://github.com/Netflix/exhibitor -Jordan On 4/26/12 2:46 PM, Mohammad Abdul-Amir (Shamma) mohammadsha...@gmail.com

Re: Zookeeper multiserver setup?

2012-05-08 Thread Jordan Zimmerman
Yes, you must start them all individually. ZooKeeper does not currently have any cluster-wide management tools built in. FYI - we've open sourced a cluster management tool for ZooKeeper. Please have a look: https://github.com/Netflix/exhibitor On 5/8/12 2:31 PM, Something Something

Re: Zookeeper multiserver setup?

2012-05-08 Thread Jordan Zimmerman
: Thanks Jordan. Will definitely look at the cluster management tool. In the mean time, am I correct with my assumption in #4? Once all of them are started they would automagically start talking to each other, right? On Tue, May 8, 2012 at 2:36 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: Yes

Re: Zookeeper multiserver setup?

2012-05-08 Thread Jordan Zimmerman
Received: 4 Sent: 3 Outstanding: 0 Zxid: 0x1 Mode: follower Node count: 11 Connection closed by foreign host. Not sure I understand this output. I have 9 nodes in the ensemble. On Tue, May 8, 2012 at 2:45 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: Yes - they should. Keep in mind

Re: Zookeeper multiserver setup?

2012-05-08 Thread Jordan Zimmerman
understand the difference between Nodes Instance. I expected ZKNodes to be 9. Please explain. Thanks a lot for your help. On Tue, May 8, 2012 at 3:05 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: You have 9 instances? That's a lot. Why so many? I believe the Node field in the stat

Re: Watch not sent immediately?

2012-05-09 Thread Jordan Zimmerman
Interesting - this issue has come up several times with Curator users. I ended up writing a Tech Note on it. https://github.com/Netflix/curator/wiki/Tech-Note-1 -JZ On 5/9/12 1:23 PM, Patrick Hunt ph...@apache.org wrote: I believe the issue is that there is a single thread updating watchers.

Re: curator leader election and thread usage

2012-05-10 Thread Jordan Zimmerman
I've decided to add an alternate version of leader selection that doesn't use threads. It will behave somewhat like a CountDownLatch so I'm calling it LeaderLatch. I'll report back (only on the Curator list) in the next day or so when it's available. -Jordan On 5/9/12 2:51 PM, Patrick Hunt

SaslAuthenticated

2012-05-17 Thread Jordan Zimmerman
When using SASL authentication, waiting for SysConnected isn't enough. Clients need to wait for SaslAuthenticated as well before calling ZK methods. Is there a way for a library such as Curator to know that SASL is enabled so that it can know that it needs to wait for SaslAuthenticated? I

Re: SaslAuthenticated

2012-05-17 Thread Jordan Zimmerman
Never mind - https://issues.apache.org/jira/browse/ZOOKEEPER-1437 fixes the issue From: Netflix jzimmer...@netflix.com Date: Thu, 17 May 2012 00:23:08 -0700 To: user@zookeeper.apache.org Subject: SaslAuthenticated When using SASL

Re: cluster member was switched to standalone, detectable?

2012-05-18 Thread Jordan Zimmerman
ZooKeeper has a telnet style interface for periodic querying. You could also use Exhibitor and query its REST API periodically. I should probably add alerting to Exhibitor for this kind of thing. -JZ On 5/18/12 10:34 AM, Adam Rosien a...@rosien.net wrote: We have a 5-member 3.3.3 cluster. One

New Instance can't sync

2012-05-24 Thread Jordan Zimmerman
I'm trying to add a new instance to the ensemble and it is throwing while trying to sync. Any ideas? 2012-05-24 20:21:08,751 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running

Re: New Instance can't sync

2012-05-24 Thread Jordan Zimmerman
Later on, I get this exception: 2012-05-24 20:23:03,800 - WARN [QuorumPeer:/0.0.0.0:2181:QuorumPeer@497] - Unable to load database java.io.IOException: Transaction log: /mnt/data/zookeeper/version-2/log.d0f357a0f has invalid magic number 0 != 1514884167 at

How to delete ZNode with 200K items

2012-05-24 Thread Jordan Zimmerman
We have a node that has 200K items and would like to delete them. getChildren() keeps failing. Is there anything that can be done? -JZ

Re: How to delete ZNode with 200K items

2012-05-24 Thread Jordan Zimmerman
; if (len < 0 || len > maxBuffer) { throw new IOException("Unreasonable length = " + len); } byte[] arr = new byte[len]; in.readFully(arr); return arr; } On Thu, May 24, 2012 at 11:17 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: We have a node

Re: New Instance can't sync

2012-05-24 Thread Jordan Zimmerman
I found the problem. I needed to increase the values for initLimit and syncLimit. -JZ On 5/24/12 1:23 PM, Jordan Zimmerman jzimmer...@netflix.com wrote: Later on, I get this exception: 2012-05-24 20:23:03,800 - WARN [QuorumPeer:/0.0.0.0:2181:QuorumPeer@497] - Unable to load database

ZOOKEEPER-1367

2012-05-25 Thread Jordan Zimmerman
I think we may be running into ZOOKEEPER-1367 in Production (we're still on 3.3.3). Is there a reliable way to reproduce this in a test environment? -JZ

initLimit/syncLimit

2012-05-30 Thread Jordan Zimmerman
Our ZK data size is currently around 116GB on disk. I find that I'm needing to set initLimit and syncLimit to very big numbers. Currently, the cluster cannot get a quorum on restart unless I have these values: initLimit=300 syncLimit=300 tickTime=2000 Does that seem normal?  -JZ
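
For scale, the settings quoted above can be converted to wall-clock time. This is a small sketch with a hypothetical helper name (TickLimits.limitMillis). It relies on initLimit and syncLimit being expressed in ticks of tickTime milliseconds, as described in the ZooKeeper administrator documentation.

```java
// Hypothetical helper: converts a limit expressed in ticks into milliseconds.
final class TickLimits {
    static long limitMillis(int limitTicks, int tickTimeMs) {
        // Widen before multiplying so large settings cannot overflow an int.
        return (long) limitTicks * tickTimeMs;
    }
}
```

With initLimit=300 and tickTime=2000, limitMillis(300, 2000) is 600,000 ms, so each limit allows about ten minutes.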

Re: initLimit/syncLimit

2012-05-30 Thread Jordan Zimmerman
What's the latest snapshot size look like? 1,378,363,003 What's the size of the ensemble. How many znodes. etc... 3 nodes - AWS m1.xlarge From: Patrick Hunt ph...@apache.org To: user@zookeeper.apache.org; Jordan Zimmerman jor...@jordanzimmerman.com Sent

3.3.4 client vs 3.3.5 client

2012-05-31 Thread Jordan Zimmerman
http://zookeeper.apache.org/doc/r3.3.5/releasenotes.html The bugs fixed in 3.3.5 seem to apply only to the server, but it isn't totally clear. If I were using a 3.3.4 client on a 3.3.5 server might I still see any of these issues? -JZ

Re: Adding to a quorum

2012-06-18 Thread Jordan Zimmerman
FYI - have a look at our Exhibitor system which makes upgrades/additions easier to manage. https://github.com/Netflix/exhibitor On 6/18/12 11:26 AM, Ted Dunning ted.dunn...@gmail.com wrote: Yes. Updating and restarting works fine. One does wonder why you have *four* servers now. That

Powered by Curator?

2012-06-21 Thread Jordan Zimmerman
We'd appreciate knowing what companies are using Curator. So, if you don't mind, please send me an email (jzimmer...@netflix.com). Let me know if you need to keep it private or if we can put it on a Powered By page. Thank you! -Jordan

Re: Dealing with an expired session

2012-06-26 Thread Jordan Zimmerman
All watchers will get called with session expiration, disconnect, etc. Jordan Zimmerman On Jun 26, 2012, at 7:51 AM, David Nickerson davidnickerson4mailingli...@gmail.com wrote: In my locking implementation, if a thread wants to wait for a lock, it will create a watcher

Re: Dealing with an expired session

2012-06-26 Thread Jordan Zimmerman
Even if it did (I don't actually know), I'd be nervous about having that kind of dependency in my app. What's the reason you need this? -JZ On Tue, Jun 26, 2012 at 1:25 PM, David Nickerson davidnickerson4mailingli...@gmail.com wrote: Is there any guarantee of order? For example, does the
