We're building a product for users that aren't used to deploying
distributed systems, and we're trying to make it as easy to configure
and use as possible. That means not requiring the full list of IP
addresses for every node at configuration time; instead, each node can
be configured with a s
yes, i think you have summarized the problem nicely jeremy.
i'm curious about your reasoning for running servers in standalone mode
and then merging. can you explain that a bit more?
thanx
ben
On 11/01/2010 04:51 PM, Jeremy Stribling wrote:
I think this is caused by stupid behavior on our application's part, and
the error message just confused me. Here's what I think is happening.
1) 3 servers are up and accepting data, creating sequential znodes under
/zkrsm.
2) 1 server dies, the other 2 continue creating sequential znodes.
3)
Yes, every znode in /zkrsm was created with the sequence flag.
We bring up a cluster of three nodes, though we do it in a slightly odd
manner to support dynamism: each node starts up as a single-node
instance knowing only itself, and then each node is contacted by a
coordinator that kills the
how were you able to reproduce it?
all the znodes in /zkrsm were created with the sequence flag, right?
ben
On 11/01/2010 02:28 PM, Jeremy Stribling wrote:
We were able to reproduce it. A "stat" on all three servers looks
identical:
[zk: (CONNECTED) 0] stat /zkrsm
cZxid = 9
ctime = Mon Nov 01 13:01:57 PDT 2010
mZxid = 9
mtime = Mon Nov 01 13:01:57 PDT 2010
pZxid = 12884902218
cversion = 177
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0
dataLen
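
For reading the zxid fields in the stat output above: a ZooKeeper zxid is a 64-bit number whose high 32 bits hold the leader epoch and whose low 32 bits hold a per-epoch counter. The following is a minimal sketch of that split; the helper name split_zxid is hypothetical and not part of any ZooKeeper client.

```python
def split_zxid(zxid):
    """Split a 64-bit zxid into (epoch, counter).

    Hypothetical helper for illustration only.
    """
    epoch = zxid >> 32            # high 32 bits: leader epoch
    counter = zxid & 0xFFFFFFFF   # low 32 bits: counter within that epoch
    return epoch, counter


# Values taken from the stat output above.
print(split_zxid(9))            # cZxid / mZxid -> (0, 9)
print(split_zxid(12884902218))  # pZxid -> (3, 330)
```

Read this way, /zkrsm itself was created in epoch 0, while the most recent change to its child list happened in epoch 3, which would be consistent with several leader elections having taken place in between.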
Thanks for the reply. It happened every time we called create, not just
once. More than that, we tried restarting each of the nodes in the
system (one-by-one), including the new master, and the problem continued.
Unfortunately we cleaned everything up, and it's not in that state
anymore. We
Hi Jeremy, this sounds like a bug to me. I don't think you should be
getting a NodeExists error when the sequence flag is set.
Looking at the code briefly, we use the parent's "cversion"
(incremented each time the child list changes, i.e. a child is added or removed).
Did you see this error each time you called creat
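
To illustrate the cversion point above: the sequence suffix is a zero-padded 10-digit number appended to the requested path. The sketch below is a toy model, not ZooKeeper's implementation. It assumes the suffix is taken directly from the parent's cversion, and the names sequential_name, create_sequential, NodeExistsError and the "n_" prefix are all hypothetical. It only shows why a create could fail with "node exists" if the server ever used a cversion value that already produced a child name.

```python
class NodeExistsError(Exception):
    """Stand-in for the server's 'node exists' error (hypothetical)."""


def sequential_name(prefix, parent_cversion):
    # Zero-padded 10-digit suffix derived from the parent's cversion.
    return "%s%010d" % (prefix, parent_cversion)


def create_sequential(children, prefix, parent_cversion):
    """Toy create: 'children' is the set of existing child names."""
    name = sequential_name(prefix, parent_cversion)
    if name in children:
        raise NodeExistsError(name)
    children.add(name)
    return name


children = set()
print(create_sequential(children, "n_", 177))  # n_0000000177

# If the cversion does not advance (an assumed failure mode, not a
# confirmed diagnosis), the same name is generated again and the
# create fails.
try:
    create_sequential(children, "n_", 177)
except NodeExistsError as e:
    print("NodeExists:", e)
```

Whether this is what actually happened in the report is not established here; the sketch only shows that a sequential create is collision-free as long as the counter it draws from keeps increasing.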