[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal K updated ZOOKEEPER-919:
-------------------------------

    Attachment: logs.tar.gz


1. Create 3 node cluster
#Tue Dec 07 03:08:43 UTC 2010
server.2=10.150.27.112\:2888\:3888
server.1=10.150.27.111\:2888\:3888
server.0=10.150.27.110\:2888\:3888
initLimit=5
syncLimit=2
snapCount=10000
clientPort=2181
tickTime=2000
dataDir=/var/zookeeper

2. run attached test.sh

zkmgtserver.jar in test.sh is our jar file that starts ZkServer. On
10.150.27.112 we made zkmgtserver also start a client that creates
znode /Online. I was able to repro the problem after a few reboots of
zkmgmtserver on .112. Reboot of zkmgmtserver results in rebooting
zkserver and the client.

File stat.out displays output of stat on /Online. We can see that
after rebooting zkmgmtserver (which reboots zkserver and zkclient),
old /Online znode reappeared.

log4j:WARN No appenders could be found for logger 
(org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
cZxid = 0x50000054a
ctime = Tue Dec 07 03:07:59 UTC 2010
mZxid = 0x50000054a
mtime = Tue Dec 07 03:07:59 UTC 2010
pZxid = 0x50000054a
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x22cbeccc7900000
dataLength = 54
numChildren = 0

log4j:WARN No appenders could be found for logger 
(org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
cZxid = 0x500000557
ctime = Tue Dec 07 03:08:21 UTC 2010
mZxid = 0x500000557
mtime = Tue Dec 07 03:08:21 UTC 2010
pZxid = 0x500000557
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x22cbecd1bb60000
dataLength = 54
numChildren = 0

log4j:WARN No appenders could be found for logger 
(org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
cZxid = 0x50000054a   <==================== OLD zxid
ctime = Tue Dec 07 03:07:59 UTC 2010
mZxid = 0x50000054a
mtime = Tue Dec 07 03:07:59 UTC 2010
pZxid = 0x50000054a
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x22cbeccc7900000
dataLength = 54
numChildren = 0


Similarly, we can see from ZK-10.150.27.112-zookeeper.log that
10.150.27.112 reapplied an old transaction to create znode.

ZK-10.150.27.110-zookeeper.log:2010-12-07 03:07:59,434 - DEBUG 
[QuorumPeer:/0.0.0.0:2181:commitproces...@159] - Committing request:: 
sessionid:0x22cbeccc7900000 type:create cxid:0x3 zxid:0x50000054a txntype:1 
reqpath:n/a
ZK-10.150.27.110-zookeeper.log:2010-12-07 03:07:59,434 - DEBUG 
[CommitProcessor:0:finalrequestproces...@78] - Processing request:: 
sessionid:0x22cbeccc7900000 type:create cxid:0x3 zxid:0x50000054a txntype:1 
reqpath:n/a
ZK-10.150.27.111-zookeeper.log:2010-12-07 03:07:59,881 - DEBUG 
[ProcessThread:-1:commitproces...@169] - Processing request:: 
sessionid:0x22cbeccc7900000 type:create cxid:0x3 zxid:0x50000054a txntype:1 
reqpath:n/a
ZK-10.150.27.111-zookeeper.log:2010-12-07 03:07:59,881 - DEBUG 
[ProcessThread:-1:lea...@647] - Proposing:: sessionid:0x22cbeccc7900000 
type:create cxid:0x3 zxid:0x50000054a txntype:1 reqpath:n/a
ZK-10.150.27.111-zookeeper.log:2010-12-07 03:07:59,883 - DEBUG 
[LearnerHandler-/10.150.27.112:56606:commitproces...@159] - Committing 
request:: sessionid:0x22cbeccc7900000 type:create cxid:0x3 zxid:0x50000054a 
txntype:1 reqpath:n/a
ZK-10.150.27.111-zookeeper.log:2010-12-07 03:07:59,883 - DEBUG 
[CommitProcessor:1:finalrequestproces...@78] - Processing request:: 
sessionid:0x22cbeccc7900000 type:create cxid:0x3 zxid:0x50000054a txntype:1 
reqpath:n/a
ZK-10.150.27.112-zookeeper.log:2010-12-07 03:08:00,481 - DEBUG 
[QuorumPeer:/0.0.0.0:2181:commitproces...@159] - Committing request:: 
sessionid:0x22cbeccc7900000 type:create cxid:0x3 zxid:0x50000054a txntype:1 
reqpath:n/a
ZK-10.150.27.112-zookeeper.log:2010-12-07 03:08:00,481 - DEBUG 
[CommitProcessor:2:finalrequestproces...@78] - Processing request:: 
sessionid:0x22cbeccc7900000 type:create cxid:0x3 zxid:0x50000054a txntype:1 
reqpath:/Online
ZK-10.150.27.112-zookeeper.log:2010-12-07 03:08:00,481 - DEBUG 
[CommitProcessor:2:finalrequestproces...@160] - sessionid:0x22cbeccc7900000 
type:create cxid:0x3 zxid:0x50000054a txntype:1 reqpath:/Online      
<===================== OLD transaction

The client used in the test was simple: - On .112 opens a connection
to local ZkServer and creates a /Online znode if it does not exist or
if the znode is deleted. It creates a new session to the local ZK
server after it is rebooted (session ID is not reused).

> Ephemeral nodes remains in one of ensemble after deliberate SIGKILL
> -------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-919
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-919
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.1
>         Environment: Linux CentOS 5.3 64bit, JDK 1.6.0-22
>            Reporter: Chang Song
>         Attachments: logs.tar.gz
>
>
> I was testing stability of Zookeeper ensemble for production deployment. 
> Three node ensemble cluster configuration.
> In a loop, I kill/restart three Zookeeper clients that created one ephemeral 
> node each, and at the same time,
> I killed Java process on one of ensemble (dont' know if it was a leader or 
> not). Then I restarted Zookeeper on the server,
> It turns out that on two zookeeper ensemble servers, all the ephemeral nodes 
> are gone (it should), but on the newly started
> Zookeeper server, the two old ephemeral nodes stayed.  The zookeeper didn't 
> restart in standalone mode since new ephemeral
> nodes gets created on all ensemble servers. 
> I captured the log.
> 2010-11-04 17:48:50,201 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:17288:nioservercnxn$fact...@250] - 
> Accepted socket connection from /10.25.131.21:11191
> 2010-11-04 17:48:50,202 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:17288:nioserverc...@776] - Client 
> attempting to establish new session at /10.25.131.21:11191
> 2010-11-04 17:48:50,203 - INFO  [CommitProcessor:1:nioserverc...@1579] - 
> Established session 0x12c160c31fc000b with negotiated timeout 30000 for 
> client /10.25.131.21:11191
> 2010-11-04 17:48:50,206 - WARN  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:17288:nioserverc...@633] - 
> EndOfStreamException: Unable to read additional data from client sessionid 
> 0x12c160c31fc000b, likely client has closed socket
> 2010-11-04 17:48:50,207 - INFO  
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:17288:nioserverc...@1434] - Closed 
> socket connection for client /10.25.131.21:11191 which had sessionid 
> 0x12c160c31fc000b
> 2010-11-04 17:48:50,207 - ERROR [CommitProcessor:1:nioserverc...@444] - 
> Unexpected Exception:
> java.nio.channels.CancelledKeyException
>         at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>         at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
>         at 
> org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:417)
>         at 
> org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1508)
>         at 
> org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
>         at 
> org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to