odd error message

2010-04-20 Thread Ted Dunning
We have just done an upgrade of ZK to 3.3.0.  Prior to this, ZK had been
up for about a year with no problems.

On two nodes, we killed the previous instance and started the 3.3.0
instance.  The first node was a follower and the second a leader.

All went according to plan and no clients seemed to notice anything.  The
stat command showed connections moving around as expected and all other
indicators were normal.

When we did the third node, we saw this in the log:

2010-04-20 14:07:49,010 - FATAL [QuorumPeer:/0.0.0.0:2181:follo...@71] -
Leader epoch 18 is less than our epoch 19

The third node refused all connections.

We brought down the third node, wiped away its snapshot, restarted it, and it
joined without complaint.  Note that the third node was originally a follower
and had never been the leader during the upgrade process.

Does anybody know why this happened?

We are fully upgraded and there was no interruption to normal service, but
this seems strange.


Re: Would this work?

2010-04-20 Thread Ted Dunning
I can't comment on the details of your code (but I have run in-process ZKs
in the past without problems).

Operationally, however, this isn't a great idea.  The problem is two-fold:

a) firstly, somebody will probably want to look at ZooKeeper to understand
the state of your service.  If the service is down, then ZK goes away with it.
That means ZooKeeper can't be used that way, which rates mild to moderate
on the logarithmic international suckitude scale.

b) secondly, if you want to upgrade your server without upgrading ZooKeeper,
you still have to bounce ZooKeeper.  This is probably not a problem, but it
can be a slight pain.

c) thirdly, you can't scale your service independently of how you scale
ZooKeeper.  This may or may not bother you, but it would bother me.

d) fourthly, you will be synchronizing your server restarts with ZK's
restarts.  Moving these events away from each other is likely to make both
slightly more reliable.  There is no failure mode that I know of that would
be tickled here, but your service code will be slightly more complex, since
it has to make sure that ZK is up before it does anything.  If you could
simply assume that ZK is up (or exit if it isn't), things would be simpler.

e) yes, I know that is more than two issues.  That is itself an issue, since
any design where the number of worries grows this fast is suspect on larger
grounds.  If small problems crop up at that rate, a large problem seems more
likely to turn up.

Your choice and your mileage will vary.

On Tue, Apr 20, 2010 at 1:25 PM, Avinash Lakshman 
avinash.laksh...@gmail.com wrote:

 This may sound weird, but I want to know if there is something inherent that
 would preclude this from working.  I want to have a Thrift-based service
 which exposes an API to read/write certain znodes.  I want ZK to run
 within the same process, so I will start ZK from within my main
 using QuorumPeerMain.main().  The implementation of my API would then
 instantiate a ZooKeeper object and try reading/writing specific znodes
 as the case may be.  I tried running this, and as soon as I instantiate my
 ZooKeeper object I get some really weird exceptions.  What is wrong with this
 approach?
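
For concreteness, here is a minimal sketch of the embedded approach described
above, assuming ZooKeeper 3.3.x jars on the classpath and a hypothetical
standalone conf/zoo.cfg.  QuorumPeerMain.main() blocks, so it is started on a
background thread, and the client waits for SyncConnected before touching any
znodes (racing the server startup is a common source of "weird exceptions"
such as ConnectionLoss):

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.server.quorum.QuorumPeerMain;

public class EmbeddedZkExample {
    public static void main(String[] args) throws Exception {
        // Start ZK in-process on its own thread; QuorumPeerMain.main() never
        // returns, so it cannot run on the service's main thread.  The config
        // path is hypothetical -- point it at your own standalone zoo.cfg.
        Thread zkThread = new Thread(new Runnable() {
            public void run() {
                QuorumPeerMain.main(new String[] { "conf/zoo.cfg" });
            }
        }, "embedded-zookeeper");
        zkThread.setDaemon(true);
        zkThread.start();

        // Wait until the client session is actually connected before using it.
        final CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 30000, new Watcher() {
            public void process(WatchedEvent event) {
                if (event.getState() == KeeperState.SyncConnected) {
                    connected.countDown();
                }
            }
        });
        if (!connected.await(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("embedded ZooKeeper did not come up");
        }

        // Simple read/write against a znode, as the Thrift handlers would do.
        if (zk.exists("/demo", false) == null) {
            zk.create("/demo", "hello".getBytes(),
                      Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        }
        byte[] data = zk.getData("/demo", false, null);
        System.out.println(new String(data));
        zk.close();
    }
}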



Re: Would this work?

2010-04-20 Thread Patrick Hunt
There are a small handful of cases where the server code will call
System.exit().  This is typically only if quorum communication fails in
some weird, unrecoverable way.  We've been working to remove these (mainly
so ZK can be deployed in a container), but there are still a few cases left.
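
One hedged illustration of why those exits matter when embedding: the host
process can intercept them with a SecurityManager so that an unrecoverable
failure inside in-process ZK surfaces as an exception instead of silently
terminating the whole JVM.  This is a sketch of a generic Java technique, not
anything ZooKeeper provides, and note that it blocks every System.exit() in
the JVM, including intentional ones:

import java.security.Permission;

// Sketch only: veto System.exit() so an embedded server's exit attempt
// becomes a SecurityException the host service can observe and handle.
public class NoExitSecurityManager extends SecurityManager {
    @Override
    public void checkExit(int status) {
        throw new SecurityException(
                "System.exit(" + status + ") blocked for embedded ZooKeeper");
    }

    @Override
    public void checkPermission(Permission perm) {
        // Allow everything else; we only intercept exits.
    }

    @Override
    public void checkPermission(Permission perm, Object context) {
        // Allow everything else; we only intercept exits.
    }

    public static void install() {
        System.setSecurityManager(new NoExitSecurityManager());
    }
}

Calling NoExitSecurityManager.install() before starting the embedded server is
enough for the sketch; a real deployment would want to scope this more
carefully.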


I don't see any server logs in that log snippet - having that detail 
would shed more light on why the client is unable to connect. Are you 
sure that the server is being started?


Patrick



Re: odd error message

2010-04-20 Thread Mahadev Konar
Ok, I think this is possible.
So here is what happens currently.  This has been a long-standing bug and
should be fixed in 3.4:

https://issues.apache.org/jira/browse/ZOOKEEPER-335

A newly elected leader currently doesn't log the new-leader transaction to
its database.

In your case, the follower (the 3rd server) did log it, but the leader never
did.  When you brought the 3rd server back up, it had that transaction in its
log while the leader did not, so the 3rd server cried foul and shut down.
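
To make the epoch comparison concrete: ZooKeeper packs the leader epoch into
the high 32 bits of each 64-bit zxid, so a restarted follower can compare the
epoch of its last logged transaction against the epoch announced by a newly
elected leader.  A small illustrative sketch (not the server's actual
implementation) of roughly the check behind the FATAL message quoted above:

// Sketch of the zxid layout and the epoch comparison, with made-up values.
public class EpochCheckSketch {
    static long epochOf(long zxid) {
        return zxid >> 32;          // high 32 bits: leader epoch
    }

    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;  // low 32 bits: per-epoch counter
    }

    public static void main(String[] args) {
        long followerLastZxid = (19L << 32) | 5L;  // follower logged into epoch 19
        long leaderEpoch = 18L;                    // leader never logged its new-leader txn

        if (leaderEpoch < epochOf(followerLastZxid)) {
            // Roughly the condition behind "Leader epoch 18 is less than our epoch 19".
            System.err.println("Leader epoch " + leaderEpoch
                    + " is less than our epoch " + epochOf(followerLastZxid));
        }
    }
}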

Removing the DB is totally fine.  For now we should update the 3.3 docs to
mention that this problem might occur during an upgrade, and fix it in 3.4.


Thanks for bringing it up Ted.


Thanks
mahadev
