[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400257#comment-13400257
 ] 

Marshall McMullen commented on ZOOKEEPER-1453:
----------------------------------------------

I was able to reproduce this problem. After I power-cycled the server a few 
times, the affected node refused to join the ensemble, and no clients could 
connect to it. When I telnet to that host and issue 'stat', it fails with:

This ZooKeeper instance is not currently serving requests
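
For anyone who wants to script the same check instead of using telnet, here is
a minimal sketch. It assumes the server answers four-letter commands on its
client port (2181 in the log below); the helper names four_letter_word and
is_serving are made up for illustration and are not part of ZooKeeper.

```python
import socket

# Reply text quoted above; a server that is up but not serving returns it.
NOT_SERVING = "This ZooKeeper instance is not currently serving requests"


def four_letter_word(host, port, cmd="stat", timeout=5.0):
    """Send a four-letter command and return the raw text reply.

    Hypothetical helper: it writes the command bytes, then reads until
    the server closes the connection.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the socket, reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def is_serving(reply):
    """Return False when the reply is the 'not serving' message."""
    return NOT_SERVING not in reply
```

With the host from this report, the call would be
four_letter_word("10.10.5.123", 2181), and is_serving() on the reply would be
False while the node is in the state described here.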

I enabled tracing, and the log file shows the following failure during startup:

2012-06-24 20:34:31,734 [myid:1] - INFO  [main:FileSnap@83][] - Reading snapshot /sf/data/zookeeper/10.10.5.123/version-2/snapshot.0
2012-06-24 20:34:31,738 [myid:1] - DEBUG [main:FileTxnLog$FileTxnIterator@575][] - Created new input stream /sf/data/zookeeper/10.10.5.123/version-2/log.100000001
2012-06-24 20:34:31,738 [myid:1] - DEBUG [main:FileTxnLog$FileTxnIterator@578][] - Created new input archive /sf/data/zookeeper/10.10.5.123/version-2/log.100000001
2012-06-24 20:34:31,763 [myid:1] - DEBUG [main:DataTree@951][] - Ignoring processTxn failure hdr: -1 : error: -110
2012-06-24 20:34:31,763 [myid:1] - DEBUG [main:FileTxnSnapLog@241][] - Ignoring processTxn failure hdr: -1 : error: -110
2012-06-24 20:34:31,763 [myid:1] - DEBUG [main:DataTree@951][] - Ignoring processTxn failure hdr: -1 : error: -110

...[ repeats many many times ]...

2012-06-24 20:34:32,065 [myid:1] - DEBUG [main:FileTxnLog$FileTxnIterator@618][] - EOF excepton java.io.EOFException: Failed to read /sf/data/zookeeper/10.10.5.123/version-2/log.100000001
2012-06-24 20:34:32,067 [myid:1] - INFO  [NIOServerCxn.Factory:/10.10.5.123:2181:NIOServerCnxnFactory@227][] - Accepted socket connection from /10.10.5.123:39623
2012-06-24 20:34:32,069 [myid:1] - INFO  [QuorumPeerListener:QuorumCnxManager$Listener@530][] - My election bind port: /10.10.5.123:2183
2012-06-24 20:34:32,071 [myid:1] - WARN  [NIOServerCxn.Factory:/10.10.5.123:2181:NIOServerCnxn@354][] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2012-06-24 20:34:32,071 [myid:1] - DEBUG [NIOServerCxn.Factory:/10.10.5.123:2181:NIOServerCnxn@358][] - IOException stack trace

I also have a copy of the data directory if it would help.

                
> corrupted logs may not be correctly identified by FileTxnIterator
> -----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1453
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1453
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.3
>            Reporter: Patrick Hunt
>            Priority: Critical
>
> See ZOOKEEPER-1449 for background on this issue. The main problem is that 
> during server recovery 
> org.apache.zookeeper.server.persistence.FileTxnLog.FileTxnIterator.next() 
> does not indicate whether the available logs are valid. In some cases (say, 
> a truncated record and a single txnlog in the datadir) we cannot tell a 
> corrupt file apart from one that has simply reached the end of the file.
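
To illustrate the distinction described above, here is a rough sketch of a
reader that separates a clean end of file from a truncated or damaged record.
It uses a simplified, made-up record layout (4-byte length, payload, 4-byte
CRC32), not ZooKeeper's actual txnlog format, and every name in it is
hypothetical.

```python
import io
import struct
import zlib


class TruncatedLogError(Exception):
    """The file ended in the middle of a record."""


class ChecksumError(Exception):
    """A complete record was read but its checksum did not match."""


def write_record(out, payload):
    """Append one record: big-endian length, payload, CRC32 of payload."""
    out.write(struct.pack(">I", len(payload)))
    out.write(payload)
    out.write(struct.pack(">I", zlib.crc32(payload)))


def read_records(stream):
    """Yield payloads until a clean EOF; raise if the log is damaged.

    A clean EOF is the only case where zero bytes are available exactly
    at a record boundary. Any shorter-than-expected read elsewhere is
    reported as truncation instead of being treated as end of file.
    """
    while True:
        header = stream.read(4)
        if header == b"":
            return  # ended exactly on a record boundary
        if len(header) < 4:
            raise TruncatedLogError("partial length header")
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        trailer = stream.read(4)
        if len(payload) < length or len(trailer) < 4:
            raise TruncatedLogError("record cut short")
        if struct.unpack(">I", trailer)[0] != zlib.crc32(payload):
            raise ChecksumError("checksum mismatch")
        yield payload
```

The point of the sketch is that the caller sees three different outcomes
(normal completion, TruncatedLogError, ChecksumError) rather than a single
"no more records" signal, which is the ambiguity this issue describes.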

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
