Siddharth Raghavan wrote:
I need to restart a single zookeeper server node on the same port within
my unit tests.
Are you testing c or java client?
I tried stopping the server, having a delay and restarting it on the
same port. But the server doesn't startup. When I re-start on a
different
You have a small typo in your client command, it should be:
bin/zkCli.sh -server 10.16.50.132:2181
(a : not a . prior to the port)
Patrick
chengxiong000 wrote:
Dear zookeepers:
I am a zookeeper user and encountered a problem when starting the zookeeper server and client task.
You might try my ZooKeeper configuration generator if you have python
handy: http://bit.ly/mBEcF
The main issue that I see with your config is that each config file
needs to contain a list of all the servers in the ensemble:
...
syncLimit=2
server.1=host1...
server.2=host2...
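A rough sketch of what "every config file lists every server" looks like in practice. The helper name, the data directory, the third host and the tickTime/initLimit values are all made up for illustration; 2888 and 3888 are the conventional quorum and election ports.

```python
def ensemble_config(hosts, data_dir="/var/zookeeper", client_port=2181):
    """Return the zoo.cfg text that every server in the ensemble shares."""
    lines = [
        "tickTime=2000",   # assumed value, tune for your deployment
        "initLimit=5",     # assumed value
        "syncLimit=2",
        f"dataDir={data_dir}",
        f"clientPort={client_port}",
    ]
    # The point of the advice above: list *all* servers in *each* file,
    # not only the server the file is installed on.
    for server_id, host in enumerate(hosts, start=1):
        lines.append(f"server.{server_id}={host}:2888:3888")
    return "\n".join(lines) + "\n"


hosts = ["host1", "host2", "host3"]  # hypothetical host names
config = ensemble_config(hosts)
print(config)
```

The same text goes on every host; what differs per host is the myid file in the data directory, which holds that server's number.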
FWIW I noticed this on twitter last night, a third party PHP binding for
ZooKeeper is now available (I haven't tried it myself):
http://twitter.com/phunt/status/4906002271
Patrick
You're right, 0 should be something like INITIALIZING_STATE but it's
not in zookeeper.h
zookeeper_init(...) docs:
* This method creates a new handle and a zookeeper session that
corresponds
* to that handle. Session establishment is asynchronous, meaning that the
* session should not be
and Java's KeeperState.
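Until there is a named constant for 0, a caller can map the raw state itself. A minimal sketch, assuming the numeric values below match your C client (they are the ones I recall from the 3.x sources, please verify against your build); the label used for 0 is my own, not an official name.

```python
# Numeric values recalled from the 3.x C client; verify against your version.
STATE_NAMES = {
    -113: "AUTH_FAILED_STATE",
    -112: "EXPIRED_SESSION_STATE",
    1: "CONNECTING_STATE",
    2: "ASSOCIATING_STATE",
    3: "CONNECTED_STATE",
}


def describe_state(code):
    """Turn a raw client state code into something readable in logs."""
    if code == 0:
        # No constant exists for this value; the label here is hypothetical.
        return "INITIALIZING (no named constant)"
    return STATE_NAMES.get(code, f"UNKNOWN_STATE({code})")


for code in (0, 1, 3, -112):
    print(code, describe_state(code))
```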
-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Tuesday, October 13, 2009 5:03 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: C client (via zkpython) returns unknown state
You're right, 0 should be something like INITIALIZING_STATE but it's
I've seen this before. Either you have an old version of ant, or your
JAVA_HOME is not set, or it's set incorrectly (to 1.5 and ant is built
for 1.6, or vice versa).
Patrick
Henry Robinson wrote:
Hi Steven -
I also see that problem if I build on my Mac sometimes. I'm looking into a
proper
Take all of the server.# lines out, including server.1 (no other change
necessary). For standalone you don't need/want this.
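If you would rather script that edit than do it by hand, something along these lines should work. It is only a sketch; the function name and the sample config values are invented for the example.

```python
import re

# Matches "server.1=...", "server.12 = ..." and so on.
SERVER_LINE = re.compile(r"^\s*server\.\d+\s*=")


def to_standalone(config_text):
    """Drop every server.N entry; all other settings pass through unchanged."""
    kept = [line for line in config_text.splitlines()
            if not SERVER_LINE.match(line)]
    return "\n".join(kept) + "\n"


quorum_cfg = """tickTime=2000
dataDir=/tmp/zookeeper
clientPort=2181
server.1=localhost:2888:3888
"""
standalone_cfg = to_standalone(quorum_cfg)
print(standalone_cfg)
```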
Alternately you could use
org.apache.zookeeper.server.ZooKeeperServerMain
(I don't think you even need to change the config file if you do that).
for example:
java
You might want to add a link to zkclient on this page:
http://wiki.apache.org/hadoop/ZooKeeper/UsefulTools
Patrick
Patrick Hunt wrote:
Ted Dunning wrote:
Judging by history and the fact that only 40/127 issues are resolved,
3.3
is probably 3-6 months away. Is that a fair assessment?
Yes
) in particular that I should look at
to see how zkclient is used, and the benefits incurred?
Regards,
Patrick
Patrick Hunt wrote:
Hi Stefan, two suggestions off the bat:
1) fill in something in the README, doesn't have to be final or
polished, but give some insight into the what/why/how/where/goals
Not to harp on this ;-) but this sounds like something that would be a
very helpful addition to the README.
Ted Dunning wrote:
I think that another way to say this is that zkClient is going a bit for the
Spring philosophy that if the caller can't (or won't) be handling the
situation, then they
Ted Dunning wrote:
You may be able to tell if the file is yours by examining the content and
ownership, but this is pretty implementation dependent. In particular, it
makes queues very difficult to implement correctly. If this happens during
the creation of an ephemeral file, the only option
Ted Dunning wrote:
Judging by history and the fact that only 40/127 issues are resolved, 3.3
is probably 3-6 months away. Is that a fair assessment?
Yes, that's fair.
Patrick
On Thu, Oct 1, 2009 at 11:13 AM, Patrick Hunt ph...@apache.org wrote:
One nice thing about ephemeral
That detail is purposefully not exposed through the client api, however
it is output to the log on connection establishment.
Why would your client code need to know which server in the ensemble it
is connected to?
Patrick
Rob Baccus wrote:
How do I determine the server the client is
().toString();
}
Feel free to add a JIRA, I think we could make this a protected method
on ZooKeeper to make testing easier (and not expose internals).
Regards,
Patrick
Todd Greenwood wrote:
Failover testing.
-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent
to...@audiencescience.com wrote:
Failover testing.
-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, October 01, 2009 3:44 PM
To: zookeeper-user@hadoop.apache.org; Rob Baccus
Subject: Re: How do we find the Server the client is connected to?
That detail is purposefully
Hi Hector, looks like a connectivity issue to me: NoRouteToHostException.
3888 is the election port
2888 is the quorum port
basically, the ensemble uses the election port for leader election. Once
a leader is elected it then uses the quorum port for subsequent
communication.
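A quick way to rule connectivity in or out is to try a plain TCP connect to both ports from each host. This is a sketch with made-up function names, using only the standard socket module.

```python
import socket


def port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and "no route to host".
        return False


def check_ensemble_ports(hosts, ports=(2888, 3888)):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {(h, p): port_reachable(h, p) for h in hosts for p in ports}


if __name__ == "__main__":
    # Hypothetical host list; replace with your ensemble members.
    for target, ok in check_ensemble_ports(["localhost"]).items():
        print(target, "reachable" if ok else "NOT reachable")
```

As far as I know only the current leader listens on the quorum port, so a failure on 2888 against a follower is expected; a failure on 3888 is the more telling one.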
Could it be a
Hi Stefan, two suggestions off the bat:
1) fill in something in the README, doesn't have to be final or
polished, but give some insight into the what/why/how/where/goals/etc...
to get things moving quickly for reviewers and new users.
2) you should really discuss on the dev list. It's up to you
Not sure if you'll find this interesting but my zk configuration
generator is available on github:
http://github.com/phunt/zkconf
zkconf.py will generate all of the configuration needed to run a
ZooKeeper ensemble. I mainly use this tool for localhost based testing,
but it can generate
Greenwood [mailto:to...@audiencescience.com]
Sent: Friday, September 18, 2009 11:27 AM
To: Patrick Hunt; zookeeper-...@hadoop.apache.org; zookeeper-
u...@hadoop.apache.org
Subject: RE: ACL question w/ Zookeeper 3.1.1
Patrick / Mahadev,
Thanks for the heads-up!
Apparently I *am* receiving email from
=
{org.apache.zookeeper.proto.createrespo...@1360}'/ACLTest\n
r = {org.apache.zookeeper.proto.replyhea...@1389}2,2,0\n
-Todd
-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Monday, September 21, 2009 4:14 PM
To: zookeeper-user@hadoop.apache.org; Todd Greenwood
What is your client timeout? It may be too low.
also see this section on handling recoverable errors:
http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling
connection loss in particular needs special care since:
When a ZooKeeper client loses a connection to the ZooKeeper server
there may be
are not swapping (see gc pressure), etc...)
Patrick
Satish Bhatti wrote:
Session timeout is 30 seconds.
On Tue, Sep 1, 2009 at 4:26 PM, Patrick Hunt ph...@apache.org wrote:
What is your client timeout? It may be too low.
also see this section on handling recoverable errors:
http
has one server. Not sure if this is exacerbating the problem?
I will check out the trouble shooting link you sent me.
On Tue, Sep 1, 2009 at 5:01 PM, Patrick Hunt ph...@apache.org wrote:
I'm not very familiar with ec2 environment, are you doing any monitoring?
In particular network connectivity
Hi Leonard,
Between 00:43:23,035 and 00:43:23,157 I see client session
0x123730dbe6e0001 get 15 node exists exceptions in a row. Are you
expecting this? (ie are you attempting to create this node 15 times in a
row or is this unexpected? I can't tell from the client snippet you
included)
Nice!
Jean-Daniel Cryans wrote:
Added here http://wiki.apache.org/hadoop/Hbase/Troubleshooting#12
J-D
On Mon, Aug 24, 2009 at 5:20 PM, Patrick Hunt ph...@apache.org wrote:
No worries. The details are actually interesting/useful, you might consider
adding to your docs in case another user runs
Hi Qian, it would be good if you could create a jira for this:
https://issues.apache.org/jira/browse/ZOOKEEPER
include both the client logs and the server logs (for overlapping
client/server time period where you see the problem). also the server
config if you're using a quorum vs standalone. If
-1.2.15.jar.
Program will exit.
$
Thank you
Jeff zhang
On Tue, Aug 18, 2009 at 12:53 PM, Patrick Hunt ph...@apache.org wrote:
you are using java 1.6 right? more detail on the class not found would be
useful (is that missing or just not included in your email?) Also the
command line you're
On Tue, Aug 18, 2009 at 12:53 PM, Patrick Hunt ph...@apache.org wrote:
you are using java 1.6 right? more detail on the class not found would be
useful (is that missing or just not included in your email?) Also the
command line you're using to start the app would be interesting.
Patrick
One more thing, please enter a jira on this so that we can track/fix it.
https://issues.apache.org/jira/browse/ZOOKEEPER
Thanks,
Patrick
Patrick Hunt wrote:
I suspect it has to do with the classpath - specifically having spaces
in the directory name. Notice that one of the lines you included
you are using java 1.6 right? more detail on the class not found would
be useful (is that missing or just not included in your email?) Also the
command line you're using to start the app would be interesting.
Patrick
Mahadev Konar wrote:
Hi Zhang,
Are you using cygwin?
mahadev
On 8/17/09
Please do enter a JIRA. Looking at the source it seems that we log an
error, but the calling code continues. I think this is happening because
the chroot c lib code is not handling znode watches separate from state
change notifications.
The calling code just continues after logging an
if you can wait a week or so...
Regards,
Patrick
Todd Greenwood wrote:
Inline.
-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, July 30, 2009 10:57 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2
Todd Greenwood wrote
with the manual patches to branch-3.2, as I am under
some time constraints to get our infrastructure deployed such that QA
can start playing with it. However, I'll switch to 3.2.1 as soon as I
can.
Understood.
Patrick
-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent
it
always fail?)
I've entered a jira to address this:
https://issues.apache.org/jira/browse/ZOOKEEPER-492
Patrick
Patrick Hunt wrote:
Todd Greenwood wrote:
The build succeeds, but not all of the tests. In previous test runs,
I noticed an error in org.apache.zookeeper.test.FLETest
well try running these two tests individually and see if they always
fail or just occasionally. that will be a good start (and the env detail).
Patrick
Todd Greenwood wrote:
No edits to conf/log4j.properties.
-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent
until the
VM exit.
-Todd
-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, July 30, 2009 10:13 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2
Todd Greenwood wrote:
[Todd] Yes, I believe address in use was the problem
Flavio, please enter a doc jira for this if there are no docs, it should
be in forrest, not twiki btw. It would be good if you could review the
current quorum docs (any type) and create a jira/patch that addresses
any/all shortfall.
Patrick
Flavio Junqueira wrote:
Todd, Some more answers.
Thanks for the report, looks like something we need to address, would
you mind going the extra step and adding a JIRA on this?
https://issues.apache.org/jira/browse/ZOOKEEPER
Thanks,
Patrick
kishore g wrote:
Hi All,
Zookeeper recipe queue code has a bug.
byte[] b = zk.getData(root +
Nodes are maintained un-ordered on the server. A node can store any
subnodes, not exclusively sequential nodes. If we added an ordering
guarantee then the server would have to store the children sorted for
every parent node. This is a problem for a few reasons; 1) in many cases
you don't care
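Since the server hands the children back in no particular order, a client that cares about order sorts them itself. A minimal sketch, assuming every child was created with the sequential flag (which appends a zero-padded 10-digit counter to the name); the names below are invented.

```python
def sequence_number(child_name):
    """Parse the 10-digit counter that is appended to sequential znodes."""
    return int(child_name[-10:])


def sorted_by_sequence(children):
    """Order child names by creation sequence rather than by arrival order."""
    return sorted(children, key=sequence_number)


# What a getChildren call might return: same children, arbitrary order.
children = ["item-0000000012", "item-0000000003", "item-0000000007"]
ordered = sorted_by_sequence(children)
print(ordered)
```

If the parent can also hold non-sequential children, filter those out first, otherwise the int() conversion will raise.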
Yes, this is a strong guarantee:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkGuarantees
Sync is only necessary if client A makes a change, then client B wishes
to read that change with guarantee that it will see the successfully
applied change previously made
Erik, if you'd like to enter a JIRA and take a whack at it go ahead.
Perhaps a subclass of DataNode specific for ephemerals? That way it can
handle any particulars - and should also minimize the number of
if(children==null) type checks that would be needed. (don't neg.
impact performance or b/w
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version
3.2.0.
ZooKeeper is a high-performance coordination service for distributed
applications. It exposes common services - such as naming, configuration
management, synchronization, and group services - in a simple interface
Henry Robinson wrote:
Effectively, EC2 does not introduce any new failure modes but potentially
exacerbates some existing ones. If a majority of EC2 nodes fail (in the
sense that their hard drive images cannot be recovered), there is no way to
restart the cluster, and persistence is lost. As you
Do we have a JIRA for this? If not we should add one for 3.3.
Patrick
Mahadev Konar wrote:
Hi Raghu,
We do have plans to enforce quota in future. Enforcing requires some more
work than just reporting. Reporting is a good enough tool for operations to
manage a zookeeper cluster but we would
The Hadoop summit is Wednesday. If you're attending please feel free to
say hi -- Mahadev is presenting @4, Ben and I will be attending as well.
Also, regardless of whether you're attending or not we'd appreciate any
updates to the powered by page, if you're too busy to update it
yourself
Agree, created a new JIRA for this:
https://issues.apache.org/jira/browse/ZOOKEEPER-430
See the following JIRA for one example why not to do this:
https://issues.apache.org/jira/browse/ZOOKEEPER-327
In general you don't want to create large node sizes since all of the
data/nodes are stored in
Javier, also note that the subsequent getChildren you mention in your
original email is usually not entirely superfluous given that you
generally want to watch the parent node for further changes, and a
getChildren is required to set that watch.
Patrick
Benjamin Reed wrote:
i'm adding a faq
time?
It would really help me much, 3x~
On Fri, Apr 17, 2009 at 1:20 AM, Patrick Hunt ph...@apache.org wrote:
You can generate the doxygen C API docs using make doxygen-doc (see the
README).
Mahadev Konar wrote:
Please take a look at src/c/src/cli.c for some examples on zookeeper c
client
Take a look at this section to start:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_commonProblems
What type of monitoring are you doing on your cluster? You could monitor
at both the host and at the java (jmx) level. That will give you some
insight on where to look;
the server side timeout
is sufficiently long.
Thanks again.
On Thu, Apr 16, 2009 at 10:57 AM, Patrick Hunt ph...@apache.org wrote:
lots of stuff about monitoring ... jmx ... packet loss ... vm latencies ...
timeout details.
... Hope this helps.
Patrick
Jun Rao wrote:
From the ZK web site, it's not clear how to set up a multi-node ZK service.
It seems that one has to add the server entries in the conf file and create
myid files on each node. Then, how should I start the ZK nodes? I tried
issuing zkServer start from each node and that didn't
Hey Chris this is really great! Thanks for making it available to the
community, very cool.
Patrick
Chris Darroch wrote:
Hi --
The http://wiki.apache.org/hadoop/ZooKeeper page includes the
comment that someday we hope to get Python, Perl, and REST interfaces.
I hope I can help with one
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version
3.1.1.
ZooKeeper is a high-performance coordination service for distributed
applications. It exposes common services - such as naming, configuration
management, synchronization, and group services - in a simple interface
Mahadev Konar wrote:
Hi Nitay,
- Does this event happening mean my ephemeral nodes will go away?
No. the client will try connecting to other servers and if it's not able to
reconnect to the servers within the remaining session timeout.
If the client is not able to connect within the
section?
On Thu, Feb 26, 2009 at 10:00 PM, Patrick Hunt ph...@apache.org wrote:
So far we've stayed with the process used by core as this minimizes the
amount of work we need to do re process/build/release, etc... we just copy
the process/build/release etc... used in core, we get all
Ben, you might want to look at buildr, it recently graduated from the
apache incubator:
http://buildr.apache.org/
Buildr is a build system for Java applications. We wanted something
that’s simple and intuitive to use, so we only need to tell it what to
do, and it takes care of the rest. But
we do have an open issue to do this more on the fly without having to
do the bounce, but it is behind other priorities in the work queue.
This is the JIRA:
https://issues.apache.org/jira/browse/ZOOKEEPER-107
in case someone would like to work on this.
Those are very interesting results, good job sleuthing. You might try the
concurrent collector?
http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#available_collectors.selecting
specifically item 4 -XX:+UseConcMarkSweepGC
I've never used this before myself but it's supposed to
So far we've stayed with the process used by core as this minimizes the
amount of work we need to do re process/build/release, etc... we just
copy the process/build/release etc... used in core, we get all that for
free. I'm hesitant to diverge as this will increase the amount of work
we need
The latest docs (3.1.0 has some updates to that section) can be found here:
http://hadoop.apache.org/zookeeper/docs/r3.1.0/zookeeperProgrammers.html#ch_zkSessions
Patrick
Mahadev Konar wrote:
Hi Joey,
here is a link to information on session timeouts.
If you are using ZK and can publicly share this information please
update the wiki PoweredBy page:
http://wiki.apache.org/hadoop/ZooKeeper/PoweredBy
Patrick
Tom White wrote:
If client sets a watcher on a znode by doing a getData operation is it
guaranteed to get the next change after the value it read, or can a
change be missed?
In other words if the value it read had zxid z1 and the next update of
the znode has zxid z2, will the watcher always
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version
3.1.0.
ZooKeeper is a high-performance coordination service for distributed
applications. It exposes common services - such as naming, configuration
management, synchronization, and group services - in a simple interface
Ephemerals and watches are maintained across disconnect/reconnect btw
the client and server however session expiration (or closing the session
explicitly) will trigger deletion of ephemeral nodes associated with the
session.
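One way to keep the distinction straight in application code is a small dispatch table. This is purely illustrative: the state names are the Java client's KeeperState values, while the action labels and the function are hypothetical.

```python
# KeeperState names from the Java client; the action labels are made up.
ACTIONS = {
    "Disconnected": "wait",        # client library reconnects on its own;
                                   # ephemerals and watches are still intact
    "SyncConnected": "resume",     # same session, carry on
    "Expired": "new-session",      # ephemerals are gone; build a new handle
}


def action_for(state):
    """Pick what the application should do for a session state event."""
    return ACTIONS.get(state, "ignore")


for state in ("Disconnected", "SyncConnected", "Expired"):
    print(state, "->", action_for(state))
```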
Right - once the session is expired the id is invalid. You need to
Regardless of frequency Tom's code still has to handle this situation.
I would suggest that the two classes Tom is referring to in his mail,
the ones that use ZK client object, should either be able to
reinitialize with a new zk session, or they themselves should be
discarded and new
have to throw the KeeperException as
a fatal exception rather than letting that client try to re-elect. Or
maybe add in some logic to say if I can't re-elect, _then_ throw an
exception and consider it fatal.
Thanks guys.
-Tom
On Thu, Feb 12, 2009 at 2:39 PM, Patrick Hunt ph...@apache.org wrote
Chris, that's unfortunate re the version number (config.h), but I think
I see why that is -- config.h should only really be visible in the
implementation, not exposed through the includes.
I've created a JIRA for this:
https://issues.apache.org/jira/browse/ZOOKEEPER-293
We'll hold 3.1 for
contributions from anyone. ;-)
Patrick Hunt wrote:
Chris, that's unfortunate re the version number (config.h), but I think
I see why that is -- config.h should only really be visible in the
implementation, not exposed through the includes.
I've created a JIRA for this:
https://issues.apache.org/jira
All 3.1 issues have been resolved, I'll be starting the release process
today, detailed here:
http://wiki.apache.org/hadoop/ZooKeeper/HowToRelease
If voting is timely and successful an official release should be available
early/mid next week. You can follow more closely on the zookeeper-dev list.
Mahadev, can you complete quotas in 2 weeks? This includes completing
the code itself, documentation, tests, and incorporating review feedback?
Patrick
Benjamin Reed wrote:
we should delay. it would be good to try out quotas for a bit before
we do the release. quotas are also a key part of the
There's also been interest in having a chroot type capability as part
of the connect string:
host:port/app/abc,...
where the client's session would be rooted at /app/abc rather than /
This is very useful in multi-tenant situations (more than 1 app sharing
a zk cluster).
Patrick
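To make the idea concrete, here is a sketch of how such a connect string could be pulled apart. It assumes the chroot is a single suffix after the last host:port pair and that a missing port means 2181; both are my assumptions for the example, and the function name is invented.

```python
def split_connect_string(connect_string):
    """Split 'h1:p1,h2:p2/chroot' into ([(host, port), ...], chroot)."""
    slash = connect_string.find("/")
    if slash == -1:
        servers_part, chroot = connect_string, "/"
    else:
        servers_part, chroot = connect_string[:slash], connect_string[slash:]

    servers = []
    for entry in servers_part.split(","):
        host, _, port = entry.strip().partition(":")
        # Assumed default when no port is given.
        servers.append((host, int(port) if port else 2181))
    return servers, chroot


servers, chroot = split_connect_string("host1:2181,host2:2182/app/abc")
print(servers, chroot)
```

With the session rooted at /app/abc, a client asking for /foo would really be touching /app/abc/foo, which is what keeps tenants out of each other's way.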
Benjamin
Kevin Burton wrote:
3) it's possible for your code to get notified of a change, but never
process the change. This might happen if:
a) a node changed watch fires
b) your client code runs an async getData
c) you are disconnected from the server
Also, this seems very confusing...
If I run
To say that it will never return is not correct. The client will be
notified of connectionloss in the callback, however the client will
not know if the operation was successful (from point of view of the
server) or not.
Patrick
Kevin Burton wrote:
On Wed, Jan 7, 2009 at 11:12 AM, Mahadev
Mahadev Konar wrote:
Why would you want the session to expire if all the servers are down (which
should not happen unless you kill all the nodes or the datacenter is down) ?
A more likely case is that the client port on the switch dies and the
client is partitioned from the servers...
with Jute a bit.
Kevin
On Wed, Jan 7, 2009 at 10:07 AM, Patrick Hunt ph...@apache.org wrote:
Thanks for the report, entered as:
https://issues.apache.org/jira/browse/ZOOKEEPER-268
For the time being you can work around this by setting the threshold to
INFO for that class (in log4j.properties
Kevin Burton wrote:
Here's a good reason for each client to know its session status
(connected/disconnected/expired). Depending on the application, if L does
not have a connected session to the ensemble it may need to be careful how
it acts.
connected/disconnected events are given out in
That's great, very cool!
Can you create ZOOKEEPER JIRAs for these items that you've identified?
First look it seems like we should be able to include these in 3.1.0,
perhaps even 3.0.2.
Regards,
Patrick
Hiram Chirino wrote:
FYI:
ActiveMQ has now started using ZooKeeper to do master
I'm not aware of any.
Patrick
Garth Patil wrote:
Hi,
Has anyone created an RPM or a SPEC file for Zookeeper? I thought I'd
ask before I embarked on creating one.
Thanks,
Garth
Patrick
-Original Message- From: Patrick Hunt [EMAIL PROTECTED]
Sent: Wednesday, November 12, 2008 2:11pm To:
zookeeper-user@hadoop.apache.org Subject: Re: Exists Watch Triggered
by Delete
Hi Stu,
The zk server maintains 2 lists of watches, data and child watches:
http
, 2008 at 10:35 PM, Patrick Hunt [EMAIL PROTECTED]
wrote:
Our first official Apache release has shipped and I'm already
looking
forward to 3.1.0. ;-)
In particular I believe we should look at the following for 3.1.0:
1) there are a number of issues that were targeted to 3.1.0 during
the
3.0.0
I've entered a JIRA targeted for ZooKeeper 3.1.0 that will add Java 6
requirement to ZooKeeper (we will drop java5 support). If you have any
feedback (pos or neg) please add comments to the issue:
https://issues.apache.org/jira/browse/ZOOKEEPER-210
Regards,
Patrick