[
https://issues.apache.org/jira/browse/ZOOKEEPER-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261060#comment-13261060
]
Patrick Hunt commented on ZOOKEEPER-1449:
-----------------------------------------
I've looked into this issue and I believe with high confidence that this is
ZOOKEEPER-1156 "Log truncation truncating log too much - can cause data loss",
which is fixed in 3.3.4 and beyond.
ZOOKEEPER-1156 can lead to over-truncation of the txnlog, which is what I'm
seeing in the logs provided by Danny. Here's a corrupted record; when I look
at the binary directly I can see that it is likely the result of file
truncation.
Output from LogFormatter (which dumps the txnlog in ASCII form):
4/16/12 11:00:54 AM PDT session 0x236bc504cb50005 cxid 0x1e zxid 0x40000009b
create
4/16/12 11:00:54 AM PDT session 0x36bc50622f0002 cxid 0x19 zxid 0x40000009c
create
4/16/12 11:00:54 AM PDT session 0x36bc50622f0004 cxid 0x19 zxid 0x40000009d
create
CRC doesn't match 3908382071 vs 1875708430 skipping to next entry
Skipped 181778 bytes
4/17/12 12:03:30 AM PDT session 0x236bc633856000f cxid 0x0 zxid 0x500000330
closeSession
4/17/12 12:03:30 AM PDT session 0x236bc6338560005 cxid 0x0 zxid 0x500000331
closeSession
4/17/12 12:03:32 AM PDT session 0x236bc6338560004 cxid 0x0 zxid 0x500000332
closeSession
which corresponds to this in the server log:
2012-04-17 00:03:47,077 - WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Learner@306] -
Truncating log to get in sync with the leader 0x50000032f
2012-04-17 00:03:47,166 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FileSnap@82] -
Reading snapshot /data1/zookeeper/version-2/snapshot.400000156
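(For reference, the txnlog dump above comes from LogFormatter, which ships
with ZooKeeper and takes a single txnlog file as its argument. The jar names
and log file name below are placeholders for whatever your install has:
java -cp zookeeper-3.3.3.jar:lib/log4j-1.2.15.jar \
    org.apache.zookeeper.server.LogFormatter /data1/zookeeper/version-2/log.<startzxid>
)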
I also took the code that's fixed by ZOOKEEPER-1156, in its unfixed state
(i.e. what's in 3.3.3), and applied it to one of the log files from one of
the other servers. When I do this, applying the same flawed truncation
operation, I see that the calculated offset is very close to the location
above (in terms of the zxid that was chopped/corrupted). This further
convinces me that ZOOKEEPER-1156 is at fault here.
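For anyone wanting to check their own txnlogs for this: the "CRC doesn't
match" line above is exactly how a mid-record chop shows up, because each
record in the txnlog is preceded by an Adler32 checksum of its serialized
bytes. Below is a minimal standalone scanner that walks a log and reports
the first bad record. The on-disk layout it assumes (16-byte file header,
then per record an 8-byte checksum, a 4-byte length, the txn bytes, and a
1-byte 'B' end-of-record marker) is my reading of the 3.3-era FileTxnLog
format, so treat it as a sketch rather than a supported tool:

import java.io.*;
import java.util.zip.Adler32;

public class TxnLogCrcScan {
    public static void main(String[] args) throws IOException {
        DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(args[0])));
        in.skipBytes(16);                     // FileHeader: magic, version, dbid
        long offset = 16;
        try {
            while (true) {
                long stored = in.readLong();  // checksum written at append time
                int len = in.readInt();       // length of the serialized txn
                if (len <= 0) {               // zero preallocated tail: clean end
                    System.out.println("clean end of log at offset " + offset);
                    break;
                }
                byte[] txn = new byte[len];
                in.readFully(txn);
                Adler32 crc = new Adler32();
                crc.update(txn, 0, txn.length);
                if (crc.getValue() != stored) {
                    System.out.println("CRC doesn't match " + crc.getValue()
                            + " vs " + stored + " at offset " + offset);
                    break;
                }
                in.readByte();                // skip the 'B' end-of-record byte
                offset += 8 + 4 + len + 1;
            }
        } catch (EOFException e) {
            System.out.println("hit hard EOF at/after offset " + offset);
        } finally {
            in.close();
        }
    }
}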
As a result of this over-truncation we are missing a number of records,
including the session close record, which would normally have caused the
ephemeral node identified by Danny to be removed.
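To make the failure mode concrete: when a server replays a closeSession txn,
the DataTree deletes every ephemeral owned by that session, so a
truncated-away closeSession record means those deletes simply never happen
on that server. A toy model of the bookkeeping (a sketch of the idea, not
the actual DataTree code):

import java.util.*;

public class EphemeralCleanupSketch {
    final Map<String, Long> nodes = new HashMap<>();          // path -> owner session
    final Map<Long, Set<String>> bySession = new HashMap<>(); // session -> its ephemerals

    void createEphemeral(String path, long sessionId) {
        nodes.put(path, sessionId);
        bySession.computeIfAbsent(sessionId, s -> new HashSet<>()).add(path);
    }

    // What replaying a closeSession txn does, in spirit.
    void replayCloseSession(long sessionId) {
        Set<String> owned = bySession.remove(sessionId);
        if (owned == null) return;
        for (String path : owned) {
            nodes.remove(path);               // the ephemeral znode goes away here
        }
    }

    public static void main(String[] args) {
        EphemeralCleanupSketch tree = new EphemeralCleanupSketch();
        tree.createEphemeral("/siri/Douroucouli/clients/app004:38096",
                0x236bc504cb50002L);
        // If the closeSession record was truncated out of the txnlog, the
        // replay below never runs on that server and the znode lingers.
        tree.replayCloseSession(0x236bc504cb50002L);
        System.out.println("ephemerals left: " + tree.nodes.size());
    }
}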
> Ephemeral znode not deleted after session has expired on one follower (quorum
> is in an inconsistent state)
> -----------------------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-1449
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1449
> Project: ZooKeeper
> Issue Type: Bug
> Reporter: Daniel Lord
>
> I've been running into this situation in our labs fairly regularly where
> we'll get a single follower that will be in an inconsistent state with
> dangling ephemeral znodes. Here is all of the information that I have right
> now. Please ask if there is anything else that is useful.
> Here is a quick snapshot of the state of the ensemble where you can see it is
> out of sync across several znodes:
> -bash-3.2$ echo srvr | nc il23n04sa-zk001 2181
> Zookeeper version: 3.3.3-cdh3u2--1, built on 10/14/2011 05:17 GMT
> Latency min/avg/max: 0/7/25802
> Received: 64002
> Sent: 63985
> Outstanding: 0
> Zxid: 0x500000a41
> Mode: follower
> Node count: 497
> -bash-3.2$ echo srvr | nc il23n04sa-zk002 2181
> Zookeeper version: 3.3.3-cdh3u2--1, built on 10/14/2011 05:17 GMT
> Latency min/avg/max: 0/13/79032
> Received: 74320
> Sent: 74276
> Outstanding: 0
> Zxid: 0x500000a41
> Mode: leader
> Node count: 493
> -bash-3.2$ echo srvr | nc il23n04sa-zk003 2181
> Zookeeper version: 3.3.3-cdh3u2--1, built on 10/14/2011 05:17 GMT
> Latency min/avg/max: 0/2/25234
> Received: 187310
> Sent: 187320
> Outstanding: 0
> Zxid: 0x500000a41
> Mode: follower
> Node count: 493
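> (A quick way to compare all three, using the same srvr command as above:
> for h in il23n04sa-zk001 il23n04sa-zk002 il23n04sa-zk003; do
> echo -n "$h: "; echo srvr | nc $h 2181 | grep "Node count"
> done
> )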
> All of the zxids match up just fine but zk001 has 4 more nodes in its node
> count than the other two (including the leader). When I use a ZooKeeper
> client to connect directly to zk001 I can see the following znode that
> should no longer exist:
> [zk: localhost:2181(CONNECTED) 0] stat
> /siri/Douroucouli/clients/il23n04sa-app004.siri.apple.com:38096
> cZxid = 0x40000001a
> ctime = Mon Apr 16 11:00:47 PDT 2012
> mZxid = 0x40000001a
> mtime = Mon Apr 16 11:00:47 PDT 2012
> pZxid = 0x40000001a
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x236bc504cb50002
> dataLength = 0
> numChildren = 0
> This node does not exist using the client to connect to either of the other
> two members of the ensemble.
> I searched through the logs for that session id and it looks like it was
> established and closed cleanly. There were several leadership/quorum
> problems during the course of the session but it looks like it should have
> been shut down properly. Neither the session nor the znode show up in the
> "dump" on the leader but the problem znode does show up in the "dump" on
> zk001.
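> (The "dump" check here is the four-letter-word command, same pattern as the
> srvr checks above, e.g.:
> echo dump | nc il23n04sa-zk001 2181 | grep 38096
> which lists the sessions and ephemeral nodes that server is tracking.)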
> 2012-04-16 11:00:47,637 - INFO [CommitProcessor:2:NIOServerCnxn@1580] -
> Established session 0x236bc504cb50002 with negotiated timeout 15000 for
> client /17.202.71.201:38971
> 2012-04-16 11:20:59,341 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client
> attempting to renew session 0x236bc504cb50002 at /17.202.71.201:50841
> 2012-04-16 11:20:59,342 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established
> session 0x236bc504cb50002 with negotiated timeout 15000 for client
> /17.202.71.201:50841
> 2012-04-16 11:21:09,343 - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@634] -
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x236bc504cb50002, likely client has closed socket
> 2012-04-16 11:21:09,343 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed
> socket connection for client /17.202.71.201:50841 which had sessionid
> 0x236bc504cb50002
> 2012-04-16 11:21:20,352 - INFO
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:NIOServerCnxn@1435] - Closed socket
> connection for client /17.202.71.201:38971 which had sessionid
> 0x236bc504cb50002
> 2012-04-16 11:21:22,151 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client
> attempting to renew session 0x236bc504cb50002 at /17.202.71.201:38166
> 2012-04-16 11:21:22,152 - INFO
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:NIOServerCnxn@1580] - Established session
> 0x236bc504cb50002 with negotiated timeout 15000 for client
> /17.202.71.201:38166
> 2012-04-16 11:27:17,902 - INFO [ProcessThread:-1:PrepRequestProcessor@387] -
> Processed session termination for sessionid: 0x236bc504cb50002
> 2012-04-16 11:27:17,904 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed
> socket connection for client /17.202.71.201:38166 which had sessionid
> 0x236bc504cb50002
> The only way I've been able to recover from this situation is to shut down
> the problem follower, delete its snapshots and let it resync with the leader.
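> (Roughly, on the bad follower -- the data dir is from the attached configs,
> and clearing the txnlogs along with the snapshots is probably safest given
> the corruption:
> zkServer.sh stop
> rm /data1/zookeeper/version-2/snapshot.* /data1/zookeeper/version-2/log.*
> zkServer.sh start
> )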
> I'll attach the full log4j logs, the txn logs, the snapshots and the conf
> files for each member of the ensemble. Please let me know what other
> information is useful.