NullPointerException stopping and starting Zookeeper servers

2008-12-08 Thread Thomas Vinod Johnson

Hi,
I have a replicated zookeeper service consisting of 3 zookeeper (3.0.1) 
servers, all running on the same host for testing purposes. I've created 
exactly one znode in this ensemble. At this point, I stop and then restart 
a single zookeeper server, moving on to the next one a few seconds later. 
A few restarts later (about 4 is usually sufficient), I get the 
following exception on one of the servers, at which point it exits:

java.lang.NullPointerException
   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:447)
   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:358)
   at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:333)
   at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:250)
   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:102)
   at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:183)
   at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:245)
   at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:421)

2008-12-08 14:14:24,880 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2183:[EMAIL PROTECTED] - Shutdown called

java.lang.Exception: shutdown Leader! reason: Forcing shutdown
   at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:336)
   at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:427)
Exception in thread QuorumPeer:/0:0:0:0:0:0:0:0:2183 java.lang.NullPointerException
   at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:339)
   at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:427)


The inputStream field is null, apparently because next() is called at 
line 358 even after an earlier call to next() has returned false. Having 
very little knowledge of the implementation, I don't know whether the 
existence of a record with hdr.getZxid() == zxid is supposed to be an 
invariant across all invocations of the server; however, the following 
change to FileTxnLog.java seems to make the problem go away.

diff FileTxnLog.java /tmp/FileTxnLog.java
358c358,359
<             next();
---
>             if (!next())
>                 return;
447c448,450
<             inputStream.close();
---
>             if (inputStream != null) {
>                 inputStream.close();
>             }
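
To make the intent of those two hunks concrete, here is a minimal, self-contained 
sketch of the same defensive pattern (the class and field names are invented for 
the illustration; this is not the real FileTxnLog code): init() stops advancing 
as soon as next() reports that the log is exhausted, and next() only closes the 
stream if it is still open.

import java.io.Closeable;
import java.io.IOException;

class SafeTxnIterator implements Closeable {
    private Closeable inputStream;   // set to null once the log is exhausted
    private final long[] zxids;      // stand-in for the transaction headers
    private int pos = -1;
    private long currentZxid = -1;

    SafeTxnIterator(long[] zxids, Closeable stream) {
        this.zxids = zxids;
        this.inputStream = stream;
    }

    // Advance to the first record whose zxid is >= startZxid; return cleanly
    // if the log runs out first (the guard added by the first hunk).
    void init(long startZxid) throws IOException {
        while (next()) {
            if (currentZxid >= startZxid) {
                return;
            }
        }
    }

    // Returns false once no records remain, closing the stream at most once
    // (the null check added by the second hunk).
    boolean next() throws IOException {
        pos++;
        if (pos >= zxids.length) {
            if (inputStream != null) {
                inputStream.close();
                inputStream = null;
            }
            return false;
        }
        currentZxid = zxids[pos];
        return true;
    }

    @Override
    public void close() throws IOException {
        if (inputStream != null) {
            inputStream.close();
        }
    }
}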

Is this a bug?

Thanks.



Re: What happens when a server loses all its state?

2008-12-16 Thread Thomas Vinod Johnson
Sorry, I should have been a little more explicit. The situation I'm 
considering is this: out of 3 servers, one server 'A' forgets its 
persistent state (due to a bad disk, say) and restarts. My guess, from 
what I could understand of the internals, was that server 'A' will 
re-synchronize correctly on restart by getting the entire snapshot.


I just wanted to make sure that this is a good assumption to make, or 
find out whether I'm missing corner cases where the fact that A has lost 
all memory could lead to inconsistencies (to take an example, in plain 
Paxos, no acceptor can forget the highest-numbered prepare request to 
which it has responded).


More generally, is it a safe assumption to make that the ZooKeeper 
service will maintain all its guarantees if a minority of servers lose 
persistent state (due to bad disks, etc) and restart at some point in 
the future?


Thanks.
Mahadev Konar wrote:

Hi Thomas,

If a zookeeper server loses all state and there are enough servers in the
ensemble to continue the zookeeper service (like 2 servers in the case of
an ensemble of 3), then the server will get the latest snapshot from the
leader and continue.
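
For a concrete picture of such an ensemble, a hypothetical 3-server setup on a
single host could be configured along these lines (ports and paths are made up
for the illustration; the server.N lines are identical on all three servers,
and each dataDir contains a myid file with that server's number):

# zoo.cfg for server 1 of a 3-server ensemble on one host
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/zookeeper/server1
clientPort=2181
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

If server 1's dataDir is wiped, servers 2 and 3 still form a quorum, and
server 1 fetches the current snapshot from the leader when it rejoins.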


The idea of zookeeper persisting its state on disk is precisely so that it
does not lose state. All the guarantees that zookeeper makes are based on
the understanding that we do not lose the state of the data we store on
the disk.


There might be problems if we lose the state that we stored on the disk.
We might lose transactions that have been committed, and the ensemble
might start with some snapshot from the past.

You might want to read through how zookeeper internals work. This will help
you understand why the persistence guarantees are required.

http://wiki.apache.org/hadoop-data/attachments/ZooKeeper(2f)ZooKeeperPresentations/attachments/zk-talk-upc.pdf

mahadev



On 12/16/08 9:45 AM, Thomas Vinod Johnson thomas.john...@sun.com wrote:

What is the expected behavior if a server in a ZooKeeper service
restarts with all its prior state lost? Empirically, everything seems to
work*. Is this something that one can count on as part of the ZooKeeper
design, or are there known conditions under which this could cause
problems, either with liveness or with violations of ZooKeeper guarantees?

I'm really most interested in a situation where a single server loses
state, but insights into issues when more than one server loses state
and other interesting failure scenarios are appreciated.

Thanks.

* The restarted server appears to catch up to the latest snapshot (from
the current leader?).




Re: Simpler ZooKeeper event interface....

2009-01-09 Thread Thomas Vinod Johnson




In the case of an active leader, L continues to send commands 
(whatever) to the followers. However, a new leader L' has since been 
elected and is also sending commands to the followers. In this case it 
seems like either a) L should not send commands if it's not sync'd to 
the ensemble (and holds the leader token), or b) followers should not 
accept commands from a non-leader (only accept them from the current 
leader). a) seems the right way to go: if L is disconnected it should 
stop sending commands to the followers; if it's resync'd in time it can


That seems to make sense in this particular case (I had some other cases 
in mind that I'm not so sure about, though).


Feel free to discuss...


The thought is not that well formed, so perhaps it does not warrant much 
discussion... It is more a realization that, as far as the leader 
election recipe goes, if *in general* one wants to guarantee not having 
multiple leaders at the same time, certain assumptions have to be made 
about timely reception and processing of events. So, naively, if I wanted 
to use the recipe to ensure that only one system owns an IP address at 
any given time, I think there would be no way to guarantee it without 
making some assumptions about timing. In retrospect, this should have 
been obvious. In practice it may be simple enough to work around these 
problems (I actually now think that in my case an 'at least once' queue 
is more appropriate). Anyway, like I said, half-baked thoughts.
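
One way to make the timing assumption explicit is a lease-style guard: the 
elected process performs leader-only actions (claiming the IP address, sending 
commands) only while the time since its last confirmation of leadership is 
below a bound chosen to be well under the ZooKeeper session timeout. The sketch 
below is a hypothetical illustration of that idea, not part of any ZooKeeper 
recipe; the class and method names are invented for the example.

// Hypothetical guard for leader-only actions. confirm() would be called after
// each operation that proves the election znode still exists; mayAct() refuses
// once too much time has passed since the last such confirmation.
public class LeaderLease {
    private final long safetyBoundNanos;
    private volatile long lastConfirmedNanos;

    public LeaderLease(long safetyBoundMillis) {
        // The bound must be comfortably smaller than the ZooKeeper session
        // timeout, minus an allowance for clock drift and scheduling delays.
        this.safetyBoundNanos = safetyBoundMillis * 1000000L;
        this.lastConfirmedNanos = System.nanoTime();
    }

    // Record that leadership was just re-confirmed.
    public void confirm() {
        lastConfirmedNanos = System.nanoTime();
    }

    // True only while the last confirmation is recent enough to act as leader.
    public boolean mayAct() {
        return System.nanoTime() - lastConfirmedNanos < safetyBoundNanos;
    }
}

Even with such a guard, the "only one leader at a time" guarantee still rests 
on the clock and scheduling assumptions, which is exactly the point of the 
observation above.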