We are now officially an Apache TLP! http://bit.ly/9czN2x
As part of the process for moving out from under Hadoop and into full
TLP status we need to work through the following:
http://incubator.apache.org/guides/graduation.html#new-project-hand-over
If you are involved with the project, esp on
Camille, that's a very good question. Largest cluster I've heard about
is 10k sessions.
Jeremy - largest I've ever tested was a 3 server cluster with ~500
sessions. Each session created 10k znodes (100bytes each znode) and
set 5 watches on each. So 5 million znodes and 25million watches. I
then
:
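The totals quoted above follow directly from the per-session figures. A quick check, using only the numbers in the message (the variable names are illustrative):

```python
# Figures from the message above: ~500 sessions, each creating 10k znodes
# and setting 5 watches on every znode it created.
sessions = 500
znodes_per_session = 10_000
watches_per_znode = 5

total_znodes = sessions * znodes_per_session
total_watches = total_znodes * watches_per_znode

print(total_znodes, total_watches)
```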
Thanks Patrick - it's really nice to have those numbers and test harness
basis.
We're still in architecture mode so some of the details are still in flux,
but I think this gives us an idea.
Thanks very much.
On Nov 18, 2010, at 11:51 AM, Patrick Hunt wrote:
Camille, that's a very good
connections across
that this won't happen. I think maybe there's a JIRA out to deal with this
issue, not sure what the status is.
C
-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, November 18, 2010 2:57 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re
Perhaps something similar to what Ben detailed here? (rendezvous)
http://developer.yahoo.com/blogs/hadoop/posts/2009/05/using_zookeeper_to_tame_system/
Change the key, add child znode(s) that's deleted by the notified
client(s) once it's read the changed value. Some details need to be
worked out
On Wed, Nov 10, 2010 at 10:58 AM, Erwin Tam e...@yahoo-inc.com wrote:
1. Ops tools including monitoring and administration.
Command port (4 letter words) for monitoring has worked extremely well
for zk. Whatever you do put the command port on a separate port, and
make it a full fledged feature
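For readers who have not used the four letter words: a minimal sketch of a monitoring probe, assuming the server accepts the command on whatever host/port you pass in. The function name `four_letter_word` is made up for this example; `ruok` and `stat` are existing ZooKeeper commands.

```python
import socket


def four_letter_word(host: str, port: int, cmd: str = "ruok",
                     timeout: float = 5.0) -> str:
    """Send a four-letter command and return the server's full reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        # The server writes its reply and then closes the connection,
        # so keep reading until EOF.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

A healthy server answers `ruok` with `imok`, so something like `four_letter_word("127.0.0.1", 2181) == "imok"` can serve as a basic liveness check.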
I wanted to highlight a couple recent JIRAs that may have impact on
users (api consumers AND admins of the service) in the 3.4 timeframe.
If you want to weigh in please comment on the respective jira:
1) proposal to move to slf4j (remove/replace log4j)
Hi Chang, thanks for the insights, if you have a few minutes would you
mind updating the FAQ with some of this detail?
http://wiki.apache.org/hadoop/ZooKeeper/FAQ
Thanks!
Patrick
On Thu, Nov 4, 2010 at 6:27 AM, Chang Song tru64...@me.com wrote:
Sorry. I made a mistake on retry timeout in load
In addition to what Mahadev suggested you can also change the
log4j.properties to log to a file rather than the CONSOLE. Although
that just redirects the logs, if there is some output to stdout/stderr
then junit buffering is still in play.
Patrick
On Thu, Nov 4, 2010 at 8:15 AM, Mahadev Konar
resolving to all the server
addresses will probably work just as well as most DNS-based load balancers.
ben
On 11/04/2010 08:26 AM, Patrick Hunt wrote:
Hi Chang, thanks for the insights, if you have a few minutes would you
mind updating the FAQ with some of this detail?
http://wiki.apache.org
...).
Patrick
On Wed, Nov 3, 2010 at 1:13 AM, Qian Ye yeqian@gmail.com wrote:
thanks Patrick, I want to know all watches set by all clients.
I would open a jira and write some design think about it later.
On Tue, Nov 2, 2010 at 11:53 PM, Patrick Hunt ph...@apache.org wrote:
Hi Qian Ye
Hi Qian Ye, yes you should open a JIRA for this. If you want to work
on a patch we could advise you. One thing not clear to me, are you
interested in just the watches set by the particular client, or all
watches set by all clients? The first should be relatively easy to
get, the second would be
Hi Jeremy, this sounds like a bug to me, I don't think you should be
getting the nodeexists when the sequence flag is set.
Looking at the code briefly we use the parent's cversion
(incremented each time the child list is changed, added/removed).
Did you see this error each time you called
- thanks Patrick!
On Thu, Oct 28, 2010 at 6:13 PM, Patrick Hunt ph...@apache.org wrote:
Tim, one other thing you might want to be aware of:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_supervision
Patrick
On Thu, Oct 28, 2010 at 9:11 AM, Patrick Hunt ph...@apache.org
On Thu, Oct 28, 2010 at 2:52 AM, Tim Robertson
timrobertson...@gmail.com wrote:
We are setting up a small Hadoop 13 node cluster running 1 HDFS
master, 9 region servers for HBase and 3 map reduce nodes, and are just
installing zookeeper to perform the HBase coordination and to manage a
few
Tim, one other thing you might want to be aware of:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_supervision
Patrick
On Thu, Oct 28, 2010 at 9:11 AM, Patrick Hunt ph...@apache.org wrote:
On Thu, Oct 28, 2010 at 2:52 AM, Tim Robertson
timrobertson...@gmail.com wrote
they would still have to
code for this corner case.
Patrick
On Wed, Oct 20, 2010 at 10:42 AM, Patrick Hunt phu...@gmail.com wrote:
Hi Ted, Mahadev is in the best position to comment (he looked at it last)
but iirc when we started looking into implementing this we immediately
ran
into so big
Sounds like a useful utility, the closest that I know of is this:
http://hadoop.apache.org/zookeeper/docs/current/api/org/apache/zookeeper/server/LogFormatter.html
but it just dumps the txn log. Seems like it would be cool to be able to
open a shell on the datadir and query it (separate from
On Sat, Oct 23, 2010 at 9:03 PM, jingguo yao yaojing...@gmail.com wrote:
Read requests are handled locally at each Zookeeper server. So it is
possible for a read request to return a stale value even though a more
recent update to the same znode has been committed. Does this statement
still
EOS means that the client closed the connection (from the point of view of
the server). The server then tries to clean up by closing the socket
explicitly; in some cases that results in the debug messages you see subsequently.
EndOfStreamException: Unable to read additional data from client sessionid
I'm not aware of sustained 1k/sec, Ben might know how long the 20k/sec test
runs for (and for how long that rate is sustained). You'd definitely want to
tune the GC, GC related pauses would be the biggest obstacle for this
(assuming you are using a dedicated log device for the transaction logs).
You might checkout a tool I built a while back to be used by operations
teams deploying ZooKeeper: http://bit.ly/a6tGVJ
It's really two tools actually, a smoketester and a latency tester, both of
which are important to verify when deploying a new cluster.
Patrick
On Mon, Oct 18, 2010 at 9:50
On Mon, Oct 11, 2010 at 4:16 PM, Avinash Lakshman
avinash.laksh...@gmail.com wrote:
tickTime = 2000, initLimit = 3000 and the data is around 11GB this is log +
snapshot. So if I need to add a new observer can I transfer state from the
ensemble manually before starting it? If so which files do
On Wed, Oct 13, 2010 at 5:58 AM, Vishal K vishalm...@gmail.com wrote:
However, gets trickier because there is no explicit way (to my knowledge) to
get CreateMode for a znode. As a result, we cannot tell whether a node is
sequential or not.
Sequentials are really just regular znodes with
You probably want to do a rolling restart; this is preferable to
restarting the whole cluster, as the service will not go down.
http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A6
Patrick
On Wed, Oct 6, 2010 at 9:49 PM, Avinash Lakshman avinash.laksh...@gmail.com
Simplified: when a server comes back up it checks its local snaps/logs to
reconstruct as much of the current state as possible. It then checks with
the leader to see how far behind it is, at which point it either gets a diff
or gets a full snapshot (from the leader) depending on how far behind it
= 0
ephemeralOwner = 0x2b7ce57ce4
dataLength = 54
numChildren = 0
Thanks for your help.
-Vishal
On Wed, Oct 6, 2010 at 4:45 PM, Patrick Hunt ph...@apache.org wrote:
Vishal the attachment seems to be getting removed by the list daemon (I
don't have it), can you create a JIRA
On Tue, Oct 5, 2010 at 10:23 AM, Avinash Lakshman
avinash.laksh...@gmail.com wrote:
So shouldn't all servers in another DC just have one session? So even if I
have 50 observers in another DC that should be 50 sessions established since
the IP doesn't change correct? Am I missing something?
Vishal the attachment seems to be getting removed by the list daemon (I
don't have it), can you create a JIRA and attach? Also this is a good
question for the ppl on zookeeper-user. (ccing)
You are aware that ephemeral znodes are tied to the session? And that
sessions only expire after the
Tuning GC is going to be critical, otw all the sessions will timeout (and
potentially expire) during GC pauses.
Patrick
On Tue, Oct 5, 2010 at 1:18 PM, Maarten Koopmans maar...@vrijheid.net wrote:
Yes, and syncing after a crash will be interesting as well. Of note: I am
running it with a 6GB
wrote:
What about major releases going forward? Thanks,
Jun
On Mon, Sep 27, 2010 at 10:32 PM, Patrick Hunt ph...@apache.org wrote:
In general yes, minor and bug fix releases are fully backward compatible.
Patrick
On Sun, Sep 26, 2010 at 9:11 PM, Jun Rao jun...@gmail.com wrote
Seems like a bug to me. Please enter a JIRA (if you haven't already).
Thanks,
Patrick
On Fri, Sep 17, 2010 at 9:10 AM, Michael Xu mx2...@gmail.com wrote:
Hi everyone
in the c client api:
Is it normal for zoo_state() to return zero (not one of the valid state
consts) when it is handling
Sounds like you have an old version of autoconf, try upgrading, see similar
issue here:
http://www.mail-archive.com/thrift-u...@incubator.apache.org/msg00673.html
Patrick
2010/9/24 俊贤 junx...@taobao.com
Hi mahadev,
My
I believe what the author is trying to say is that if the getdata were to
fail (such as the example you give) the watch set as part of the original
call will fire, and this will notify the client that the node was deleted.
(call to process(event))
Patrick
On Mon, Sep 27, 2010 at 6:56 PM, Milind
That is unusual. I don't recall anyone reporting a similar issue, and
looking at the code I don't see any issues off hand. Can you try the
following?
1) on that particular zk client machine resolve the hosts zook1/zook2/zook3,
what ip addresses does this resolve to? (try dig)
2) try running the
No worries, let us know if something else pops up.
Patrick
On Tue, Sep 7, 2010 at 3:10 PM, Stack st...@duboce.net wrote:
Nevermind. I figured it. It was an hbase issue. We were leaking a
client reference.
Sorry for the noise,
St.Ack
On Sat, Sep 4, 2010 at 10:58 AM, Stack
It is good to keep things simple, but we have seen some requests related to
the client api for children use cases that seem reasonable. In particular
the issue of handling large numbers of children efficiently is currently a
problem (queue say). We've seen proposals on this before, just no one's
Hi Andrei, the answer may not be as simple as that. In the case of passive
leader you might want to just wait till you're reconnected before taking
any action. Connection loss indicates that you aren't currently connected to
a server, it doesn't mean that you've lost leadership (if you get expired
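To make the distinction concrete, here is a rough sketch of how a passive leader might react to the two conditions. It assumes leadership is held through the session (for example via an ephemeral znode); the `SessionEvent` and `PassiveLeader` names are hypothetical and not part of any ZooKeeper client API.

```python
from enum import Enum


class SessionEvent(Enum):
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"  # connection loss: the session may still be alive
    EXPIRED = "expired"            # the session is gone


class PassiveLeader:
    """Hypothetical leader that pauses on disconnect and steps down only on expiry."""

    def __init__(self):
        self.is_leader = True
        self.paused = False

    def on_event(self, event: SessionEvent) -> None:
        if event is SessionEvent.DISCONNECTED:
            # Not connected right now; wait for reconnection before acting.
            self.paused = True
        elif event is SessionEvent.CONNECTED:
            # Same session re-established; leadership was never lost.
            self.paused = False
        elif event is SessionEvent.EXPIRED:
            # Assumption: leadership was tied to the expired session.
            self.is_leader = False
            self.paused = True
```

The point of the sketch is only that a disconnect pauses work without giving up leadership, while expiry is the event that actually ends it.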
On 09/01/2010 12:47 PM, Patrick Hunt wrote:
Ben, in this case the session would be tied directly to the connection,
we'd explicitly deny session re-establishment for this session type (so
4 would fail). Would that address your concern, others?
Patrick
On 09/01/2010 10:03 AM, Benjamin Reed
Ben, in this case the session would be tied directly to the connection,
we'd explicitly deny session re-establishment for this session type (so
4 would fail). Would that address your concern, others?
Patrick
On 09/01/2010 10:03 AM, Benjamin Reed wrote:
i'm a bit skeptical that this is going
On Mon, Aug 30, 2010 at 1:11 PM, Avinash Lakshman
avinash.laksh...@gmail.com wrote:
From my understanding when a znode is updated/created a write happens into
the local transaction logs and then some in-memory data structure is updated
to serve the future reads.
Where in the source code can
Depending on your classpath setup:
java org.apache.zookeeper.ZooKeeperMain -server 127.0.0.1:2181
if jline jar is in your classpath (included in the zk release
distribution) you'll get history, auto-complete and such.
Patrick
On 08/31/2010 03:08 PM, Michi Mutsuzaki wrote:
Hello,
I'm
The client (solr in this case) is passing a null path to the
ZooKeeper.getChildren(path, ... ) call.
java.lang.IllegalArgumentException: Path cannot be null
at org.apache.zookeeper.common.PathUtils.validatePath(PathUtils.java:45)
at
On line 64 are you ensuring that the ZooKeeper session is active before
executing that sequence?
zookeeper = new ZooKeeper(...) is async - it returns before you're actually
connected to the server (you get notified of this in your watcher). If you
execute this sequence quickly enough your
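The usual remedy is to block until the watcher reports the connection before issuing any calls. The sketch below shows that pattern in isolation. `FakeAsyncClient` is a hypothetical stand-in (not the ZooKeeper API) whose constructor returns immediately and which calls the watcher later from another thread; the `RuntimeError` it raises is likewise just for illustration.

```python
import threading


class FakeAsyncClient:
    """Hypothetical client: the constructor returns before the connection
    exists, and `watcher("connected")` is invoked later from another thread."""

    def __init__(self, watcher, connect_delay=0.05):
        self.connected = False
        self._watcher = watcher
        threading.Timer(connect_delay, self._finish_connect).start()

    def _finish_connect(self):
        self.connected = True
        self._watcher("connected")

    def create(self, path):
        if not self.connected:
            raise RuntimeError("not connected yet")
        return path


def connect_and_wait(timeout=5.0):
    """Construct the client, then block until the watcher sees 'connected'."""
    ready = threading.Event()

    def watcher(event):
        if event == "connected":
            ready.set()

    client = FakeAsyncClient(watcher)
    if not ready.wait(timeout):
        raise TimeoutError("client did not connect in time")
    return client
```

With a real Java client the same idea is typically expressed with a latch that the watcher releases on the connected event.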
it?
On Thu, Aug 26, 2010 at 5:05 PM, Patrick Hunt ph...@apache.org wrote:
Client has seen zxid 0xfa4 our last zxid is 0x42
Someone reset the zk server database without restarting the clients. As a
result the client is forward in time relative to the cluster.
Patrick
On 08/26/2010 04
Client has seen zxid 0xfa4 our last zxid is 0x42
Someone reset the zk server database without restarting the clients. As
a result the client is forward in time relative to the cluster.
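For anyone decoding that log line: a zxid is a 64-bit number with the epoch in the high 32 bits and a counter in the low 32 bits. A small sketch comparing the two values from the message (the helper name `split_zxid` is made up for this example):

```python
def split_zxid(zxid: int):
    """Split a 64-bit zxid into (epoch, counter): high 32 bits / low 32 bits."""
    return zxid >> 32, zxid & 0xFFFFFFFF


client_seen = int("0xfa4", 16)  # "Client has seen zxid 0xfa4"
server_last = int("0x42", 16)   # "our last zxid is 0x42"

# The client has observed a later transaction than the server knows about.
client_is_ahead = client_seen > server_last

print(split_zxid(client_seen), split_zxid(server_last), client_is_ahead)
```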
Patrick
On 08/26/2010 04:03 PM, Ted Yu wrote:
Hi,
zookeeper-3.2.2 is used out of HBase 0.20.5
Linux
+1 on that Ted. I frequently see this issue crop up as "I just rebooted
my server and lost all my data ..." -- many OSes will clean up tmp on
reboot. :-)
Patrick
On 08/19/2010 07:43 AM, Ted Dunning wrote:
Also, /tmp is not a great place to keep things that are intended for
persistence.
On Thu,
No. You configure it in the server configuration file.
Patrick
On 08/19/2010 01:19 PM, Wim Jongman wrote:
Hi,
But zk does default to /tmp?
Regards,
Wim
On Thursday, August 19, 2010, Patrick Hunt ph...@apache.org wrote:
+1 on that Ted. I frequently see this issue crop up as I just
Maybe we should have a contrib pkg for utilities such as this? I could
see a python script that, given 1 server (might require addl 4letter
words but this would be useful regardless), could collect such
information from the cluster. Create a JIRA?
Patrick
On 08/17/2010 12:14 PM, Andrei Savu
All servers keep a copy - so you can shutdown the zk service entirely
(all servers) and restart it and the sessions are maintained.
Patrick
On 08/16/2010 06:34 PM, Qian Ye wrote:
Thx Mahadev and Benjamin, it seems that I've got some misunderstanding about
the client. I will check it out.
Try using the logs, stat command or JMX to verify that each ZK server is
indeed a leader/follower as expected. You should have one leader and n-1
followers. Verify that you don't have any standalone servers (this is
the most frequent error I see - misconfiguration of a server such that
it
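A rough sketch of automating that check, assuming you have already collected the `stat` output from each server as text (for instance over the command port). The function names are illustrative, and observers are not handled here.

```python
def server_mode(stat_output: str):
    """Return the value of the 'Mode:' line from `stat` output, or None."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def check_ensemble(stat_by_server: dict) -> list:
    """Return a list of problems; an empty list means the roles look sane."""
    modes = {name: server_mode(out) for name, out in stat_by_server.items()}
    problems = []
    leaders = [name for name, mode in modes.items() if mode == "leader"]
    if len(leaders) != 1:
        problems.append(f"expected exactly 1 leader, found {len(leaders)}")
    for name, mode in modes.items():
        # Anything other than leader/follower (e.g. standalone) is flagged.
        if mode not in ("leader", "follower"):
            problems.append(f"{name} reports mode {mode!r}")
    return problems
```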
The session timeout is used for this:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions
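As a rough illustration of how the effective timeout comes about: the client requests a value and the server clamps it to its allowed range. The sketch assumes the documented defaults (minimum 2 x tickTime, maximum 20 x tickTime) and a tickTime of 2000 ms; the function name is made up for this example.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms=2000,
                               min_ms=None, max_ms=None):
    """Clamp the client's requested timeout to the server's allowed range."""
    # Assumed defaults: 2 * tickTime and 20 * tickTime unless overridden.
    lo = 2 * tick_time_ms if min_ms is None else min_ms
    hi = 20 * tick_time_ms if max_ms is None else max_ms
    return max(lo, min(requested_ms, hi))


for requested in (1000, 30000, 60000):
    print(requested, "->", negotiated_session_timeout(requested))
```

A failed client is detected once no heartbeat has arrived within the negotiated timeout, so this value bounds how quickly failure can be noticed.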
Patrick
On 08/16/2010 01:47 PM, Jun Rao wrote:
Hi,
What config parameters in ZK determine how soon a failed client is detected?
Thanks,
Jun
On 08/11/2010 06:49 PM, Adam Rosien wrote:
http://hadoop.apache.org/zookeeper/docs/r3.3.1/zookeeperAdmin.html#sc_dataFileManagement
says that one can copy the contents of the data directory and use it
on another machine. The example states the other instance is not in
the server list; what
Great bug report Ted, the stack trace in particular is very useful.
It looks like a timing bug where the client is not shutting down cleanly
on the close call. I reviewed the code in question but nothing pops out
at me. Also the logs just show us shutting down, nothing else from zk in
there.
On 08/11/2010 03:25 PM, Jordan Zimmerman wrote:
If I use an async version of a call in a cluster (ensemble) what
happens if the server I'm connected to goes down? Does ZK
transparently resubmit the call to the next server in the cluster and
call my async callback or is there something I need to
Great!
Basic details are here (create a jira, attach a patch, click submit
and someone will review and help you get it into a state which we can
commit). Probably you'd put your code into src/recipes or src/contrib
(recipes sounds reasonable).
I suspect this is a bug with the sync call and session moved (the code
path for sync is a bit special). Please enter a JIRA for this. Thanks.
Patrick
On 08/05/2010 01:20 PM, Vishal K wrote:
Hi All,
I am seeing a lot of these messages in our application. I would like to know
if I am doing
You may want to consider adding a distributed queue to your use of ZK.
As was mentioned previously, watches don't notify you of every change,
just that a change was made. For example multiple changes may be
visible when you get the notification.
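A toy simulation of that behaviour may help. `Node` below is an in-memory stand-in with a one-shot watch, not the ZooKeeper API; it only shows why three updates can produce a single notification.

```python
class Node:
    """Toy in-memory znode with a one-shot data watch (not the ZooKeeper API)."""

    def __init__(self, value):
        self.value = value
        self._watch = None

    def get(self, watch=None):
        if watch is not None:
            self._watch = watch  # register a one-shot watch
        return self.value

    def set(self, value):
        self.value = value
        watch, self._watch = self._watch, None  # a watch fires at most once
        if watch is not None:
            watch()


notifications = []
node = Node("v0")
node.get(watch=lambda: notifications.append("changed"))
node.set("v1")  # fires the watch
node.set("v2")  # no watch registered any more, so no notification
node.set("v3")
seen = node.get()  # the reader only observes the latest value

print(notifications, seen)
```

The intermediate values `v1` and `v2` are never seen by the watcher, which is the gap a queue of changes would close.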
A distributed queue would allow you to log
On 07/19/2010 05:04 PM, Rakesh Aggarwal wrote:
javax.management.MBeanServer; was not found
Sounds like you are missing rt.jar for some reason (contains that class).
Try running java -verbose -version and see what jars are being picked
up, I see a number of lines containing:
...
Hi Rich, the version string looks useful to have, thanks! Would you mind
submitting this via jira? Do a svn diff (looks like you did already),
create a jira and attach the diff, then click submit link on the jira.
We'll review and work on getting it into a future release.
I've done some tests with ~600 clients creating 5 million znodes (size
100bytes iirc) and 25million watches. I was using 8gb of memory for
this, however --- in this scenario it's critical that you tune the GC,
in particular you need to turn on CMS and incremental GC options. Otw
when the GC
If you want to simulate expiration use the example I sent.
http://github.com/phunt/zkexamples
Another option is to use a mock.
Patrick
On 07/06/2010 05:42 PM, Jeremy Davis wrote:
Thanks!
That seems to work, but it is approximately the same as zooKeeper.close() in
that there is no
Hi Travis, as Flavio suggested would be great to get the logs. A few
questions:
1) how did you eventually recover, restart the zk servers?
2) was the cluster losing quorum during this time? leader re-election?
3) Any chance this could have been initially triggered by a long GC
pause on one
On 06/30/2010 09:37 AM, Ted Dunning wrote:
Which API are you talking about? C?
I think that the difference between connection loss and session expiration
might mess you up slightly in your disjunction here.
On Wed, Jun 30, 2010 at 7:45 AM, Bryan Thompson br...@systap.com wrote:
I am
On 06/26/2010 06:53 AM, Peeyush Kumar wrote:
I have a 6 node cluster (5 slaves and 1 master). I am trying to
You typically want an odd number given that zk works by majority (even
is fine, but not optimal). So 5 would be great (7 is a bit of overkill).
3 is fine too, but 5 allows
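The majority arithmetic behind the odd-number advice can be sketched as follows (function names are illustrative):

```python
def quorum_size(n: int) -> int:
    """Smallest number of servers that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many servers can be down while a majority is still available."""
    return n - quorum_size(n)


for n in (3, 4, 5, 6, 7):
    print(n, "servers: quorum", quorum_size(n),
          "tolerates", tolerated_failures(n), "failure(s)")
```

Going from 3 to 4 servers (or 5 to 6) raises the quorum size without raising the number of failures tolerated, which is why even sizes are fine but not optimal.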
There are 3 ports that need to be opened
1) the client port (btw client and servers)
2/3) the quorum and election ports - only btw servers
You are setting these three ports in your config file (clientport
defaults to 2181 iirc, unless you override)
Patrick
On 06/22/2010 06:17 AM, Erik Test
I've seen a number of these built as proprietary solutions using
ZooKeeper. It would be great to see something open sourced. HBase/ZK
seems like a good fit. You might also consider ZooKeeper/BookKeeper.
Patrick
On 06/18/2010 11:01 AM, Thomas Koch wrote:
it.
On Jun 2, 2010, at 11:49 AM, Patrick Hunt wrote:
Hi Charity, unfortunately this is a known issue not specific to 3.3 that
we are working to address. See this thread for some background:
http://zookeeper-user.578899.n2.nabble.com/odd-error-message-td4933761.html
I've raised the JIRA
I'm not very experienced personally with running zk on ec2 smalls, Ted
usually has the ec2 related insight. Given these boxes are not loaded or
lightly loaded, and you've ruled out gc/swap, the only thing I can think
of is that something is going on under the covers at the vm level that's
I don't think this should be possible (if it happens it's a bug in zk).
Perhaps, for some reason, there really are 2 change actions (children
created, or the same child created twice) and not just one?
Re-registering the watch inside the watch is fine. The server sends
watch notifications as
Session expiration is due to the server not hearing heartbeats from the
client. So either the client is partitioned from the server, or the
client is not sending heartbeats for some reason, typically this is due
to the client JVM gc'ing or swapping.
Patrick
On 06/10/2010 04:14 PM, Ted
100mb partition? sounds like virtualization. resource starvation
(worse in virtualized env) is a common cause of this. Are your clients
gcing/swapping at all? If a client gc's for long periods of time the
heartbeat thread won't be able to run and the server will expire the
session. There is a
On 06/09/2010 03:35 PM, Lei Zhang wrote:
We've consistently run into issues with vmware workstation (CentOS as guest
OS) on Windows host: just by leaving the cluster idle over night leads to zk
session expire issue. My theory is: windows may have gone to hibernation,
the zk heartbeat logic
Here's how to test session expiration (haven't tried this in a while):
http://github.com/phunt/zkexamples
It would be great to have some test
infrastructure/examples/docs/strategies available for developers (zk
client users). If someone would be interested to work on/contribute this
we'd be
Hi Charles, any luck with this? Re the issues you found with the recipes
please enter a JIRA, it would be good to address the problem(s) you found.
https://issues.apache.org/jira/browse/ZOOKEEPER
re use of session/thread id, might you use some sort of unique token
that's dynamically assigned
On 05/27/2010 09:47 AM, Benjamin Reed wrote:
actually pat hunt took over that issue: ZOOKEEPER-733. pat has made a
lot of progress and the patch looks close to being ready.
This is just the server side though, still need to make similar changes
on the client. That will likely be a separate
Short of someone else stepping up I have it on my todo list. ;-) Still
quite a bit of work to do on 733 though getting it back into shape. (not
to mention layering the ssl on top). Then there's also the server-server
connectivity that also needs to have netty support added
(quorum/election
Hi, this was originally proposed as a google summer of code project, the
slots for gsoc have already been given out, this was not one of the
projects chosen by apache. So you could still work on this if you like,
but not under the gsoc umbrella. We (zk contributor community) would be
happy to
Hi Stephen, my comments inline below:
On 05/21/2010 09:31 AM, Stephen Green wrote:
I feel like I'm missing something fairly fundamental here. I'm
building a clustered application that uses ZooKeeper (3.3.1) to store
its configuration information. There are 33 nodes in the cluster
(Amazon EC2
On 05/21/2010 11:32 AM, Stephen Green wrote:
Right. The system can be very memory-intensive, but at the time these
are occurring, it's not under a really heavy load, and there's plenty
of heap available. However, while looking at a thread dump from one of
the nodes, I realized that a very poor
those JIRAs.
Thanks!
Patrick
-Flavio
On May 20, 2010, at 1:36 AM, Patrick Hunt wrote:
On 05/19/2010 01:23 PM, Flavio Junqueira wrote:
Hi Andre, To guarantee that two clients that read from a ledger will
read the same sequence of entries, we need to make sure that there is
agreement on the end
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version
3.3.1
ZooKeeper is a high-performance coordination service for distributed
applications. It exposes common services - such as naming, configuration
management, synchronization, and group services - in a simple interface
Mahadev pointed out the ZK monitoring details, but on the solr side of
the house I don't think we can provide much insight as solr is acting as
a client of the zk service. Your best bet would be to ask on the solr
user list.
Regards,
Patrick
On 05/14/2010 04:09 AM, Rakhi Khatwani wrote:
Hi Jordan, you've seen this once or frequently? (having the server +
client logs will help a lot)
Patrick
On 05/12/2010 11:08 AM, Jordan Zimmerman wrote:
Sure - if you think it's a bug.
We were using Zookeeper without issue. I then refactored a bunch of
code and this new behavior started. I'm
the server and now all works again.
Sorry to trouble y'all.
-Jordan
On May 12, 2010, at 11:11 AM, Patrick Hunt wrote:
Hi Jordan, you've seen this once or frequently? (having the server
+ client logs will help a lot)
Patrick
On 05/12/2010 11:08 AM, Jordan Zimmerman wrote:
Sure - if you think
that
getChildren (xid 7) got lost.
Patrick
On 05/12/2010 11:30 AM, Jordan Zimmerman wrote:
Oh, OK. When I get a moment I'll restart the 3.2.2 and post logs,
etc.
Yes, we're calling getChildren with the callback.
-JZ
On May 12, 2010, at 11:28 AM, Patrick Hunt wrote:
I'm still interested though... Are you
On May 12, 2010, at 11:41 AM, Benjamin Reed wrote:
is this a bug? shouldn't we be returning an error.
ben
On 05/12/2010 11:34 AM, Patrick Hunt wrote:
I think that explains it then - the server is probably dropping the new
(3.3.0) getChildren message (xid 7) as it (3.2.2 server) doesn't know
Hm, if you don't mind enter that jira, would still like to verify by
looking at the logs.
Patrick
On 05/12/2010 11:52 AM, Jordan Zimmerman wrote:
So, I'm off the Jira hook then?
-JZ
On May 12, 2010, at 11:49 AM, Patrick Hunt wrote:
You're right. Ben, would you mind entering a JIRA
On 05/12/2010 08:30 PM, Aaron Crow wrote:
I may have a better idea of what caused the trouble. I way, WAY
underestimated the number of nodes we collect over time. Right now we're at
1.9 million. This isn't a bug of our application; it's actually a feature
(but perhaps an ill-conceived one).
A
The cases where we've seen this reported in the past the user tracked
the issue down to a firewall problem, I'm not sure what the issue is
here given you've verified that's not the problem. The log is clearly
saying:
Thread:quorumcnxmana...@336] - Cannot open channel to 2 at election
Ok, great, good luck!
Patrick
On 05/10/2010 11:20 PM, chen peng wrote:
My question has been decided.
*I did not know
bin/zkServer start should be executed on each machine!*
*I took him to be very close in function with
Hi Dominic, this looks really interesting thanks for open sourcing it. I
really like the idea of providing higher level concepts. I only just
looked at the code, it wasn't clear on first pass what happens if you
multilock on 3 paths, the first 2 are success, but the third fails. How
are the
Often this is related to the port(s) being blocked by a firewall. Perhaps
you could check this (2888/3888) in both directions? Telnet can help:
https://help.maximumasp.com/KB/a445/connectivity-testing-with-ping-telnet-tracert-and-pathping-.aspx
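If telnet is not at hand, the same check can be scripted. A minimal sketch, to be run from each server towards the others so both directions are covered; the function names are made up for this example and the default ports are the 2888/3888 mentioned above.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False


def check_quorum_ports(hosts, ports=(2888, 3888)) -> dict:
    """Map each (host, port) pair to whether it accepted a connection."""
    return {(h, p): port_open(h, p) for h in hosts for p in ports}
```

Note that a port only accepts connections while a server process is listening on it, so a `False` result can mean either a firewall or a server that is not running.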
Patrick
2010/5/7 chen peng chenpeng0...@hotmail.com
Thanks Travis, I've slated this for 3.4.0, I think it would be useful to
add more examples so feel free to add more if you have any ideas for
useful ones.
For future reference, we ask that contributions come in the form of a patch:
http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute
It's
While I agree DS is hard, I don't think we should lose the useful
feedback given by Jonathan/Adam - that getting started with ZK is
challenging and can be frustrating. We need to learn from this feedback
and create some action items to address. One of the main things I've
heard so far that we
Take a look at this thread for some background.
http://www.mail-archive.com/zookeeper-user@hadoop.apache.org/msg00917.html
There were some concerns at the time, not sure if they have been
addressed since (It has been a while since that discussion).
Patrick
On 05/04/2010 01:48 PM, Jonathan
Thanks Kapil, Mahadev perhaps you could take a look at this as well?
Patrick
On 05/04/2010 06:36 AM, Kapil Thangavelu wrote:
I've constructed a simple example just using the zkpython library with
condition variables, that will deadlock. I've filed a new ticket for it,