[
https://issues.apache.org/jira/browse/ZOOKEEPER-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261060#comment-13261060
]
Patrick Hunt commented on ZOOKEEPER-1449:
-----------------------------------------
I've looked into this issue and I believe with high confidence that this is
ZOOKEEPER-1156 "Log truncation truncating log too much - can cause data loss",
which is fixed in 3.3.4 and beyond.
ZOOKEEPER-1156 can lead to over-truncation of the txnlog, which is what I'm
seeing in the logs provided by Danny. Here's a corrupted record; when I look
at the binary directly I can see that it is likely the result of file
truncation.
Output from LogFormatter (which dumps the txnlog in ASCII form):
4/16/12 11:00:54 AM PDT session 0x236bc504cb50005 cxid 0x1e zxid 0x40000009b
create
4/16/12 11:00:54 AM PDT session 0x36bc50622f0002 cxid 0x19 zxid 0x40000009c
create
4/16/12 11:00:54 AM PDT session 0x36bc50622f0004 cxid 0x19 zxid 0x40000009d
create
CRC doesn't match 3908382071 vs 1875708430 skipping to next entry
Skipped 181778 bytes
4/17/12 12:03:30 AM PDT session 0x236bc633856000f cxid 0x0 zxid 0x500000330
closeSession
4/17/12 12:03:30 AM PDT session 0x236bc6338560005 cxid 0x0 zxid 0x500000331
closeSession
4/17/12 12:03:32 AM PDT session 0x236bc6338560004 cxid 0x0 zxid 0x500000332
closeSession
which corresponds to this in the server log:
2012-04-17 00:03:47,077 - WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Learner@306] -
Truncating log to get in sync with the leader 0x50000032f
2012-04-17 00:03:47,166 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FileSnap@82] -
Reading snapshot /data1/zookeeper/version-2/snapshot.400000156
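(For reference, the txnlog dump above comes from LogFormatter, which ships
with ZooKeeper and takes a single txnlog file as its argument. The jar names
and log file name below are placeholders for whatever your install has:
java -cp zookeeper-3.3.3.jar:lib/log4j-1.2.15.jar \
    org.apache.zookeeper.server.LogFormatter /data1/zookeeper/version-2/log.<startzxid>
)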
I also took the code that's fixed by ZOOKEEPER-1156, in its unfixed state
(i.e. what's in 3.3.3), and applied it to one of the log files from one of
the other servers. When I do this, applying the same flawed truncation
operation, I see that the calculated offset is very close to the location
above (in terms of the zxid that was chopped/corrupted). This further
convinces me that ZOOKEEPER-1156 is at fault here.
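For anyone wanting to check their own txnlogs for this: the "CRC doesn't
match" line above is exactly how a mid-record chop shows up, because each
record in the txnlog is preceded by an Adler32 checksum of its serialized
bytes. Below is a minimal standalone scanner that walks a log and reports
the first bad record. The on-disk layout it assumes (16-byte file header,
then per record an 8-byte checksum, a 4-byte length, the txn bytes, and a
1-byte 'B' end-of-record marker) is my reading of the 3.3-era FileTxnLog
format, so treat it as a sketch rather than a supported tool:

import java.io.*;
import java.util.zip.Adler32;

public class TxnLogCrcScan {
    public static void main(String[] args) throws IOException {
        DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(args[0])));
        in.skipBytes(16);                     // FileHeader: magic, version, dbid
        long offset = 16;
        try {
            while (true) {
                long stored = in.readLong();  // checksum written at append time
                int len = in.readInt();       // length of the serialized txn
                if (len <= 0) {               // zero preallocated tail: clean end
                    System.out.println("clean end of log at offset " + offset);
                    break;
                }
                byte[] txn = new byte[len];
                in.readFully(txn);
                Adler32 crc = new Adler32();
                crc.update(txn, 0, txn.length);
                if (crc.getValue() != stored) {
                    System.out.println("CRC doesn't match " + crc.getValue()
                            + " vs " + stored + " at offset " + offset);
                    break;
                }
                in.readByte();                // skip the 'B' end-of-record byte
                offset += 8 + 4 + len + 1;
            }
        } catch (EOFException e) {
            System.out.println("hit hard EOF at/after offset " + offset);
        } finally {
            in.close();
        }
    }
}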
As a result of this over-truncation we are missing a number of records,
including the session close record, which would normally have caused the
ephemeral node identified by Danny to be removed.
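To make the failure mode concrete: when a server replays a closeSession txn,
the DataTree deletes every ephemeral owned by that session, so a
truncated-away closeSession record means those deletes simply never happen
on that server. A toy model of the bookkeeping (a sketch of the idea, not
the actual DataTree code):

import java.util.*;

public class EphemeralCleanupSketch {
    final Map<String, Long> nodes = new HashMap<>();          // path -> owner session
    final Map<Long, Set<String>> bySession = new HashMap<>(); // session -> its ephemerals

    void createEphemeral(String path, long sessionId) {
        nodes.put(path, sessionId);
        bySession.computeIfAbsent(sessionId, s -> new HashSet<>()).add(path);
    }

    // What replaying a closeSession txn does, in spirit.
    void replayCloseSession(long sessionId) {
        Set<String> owned = bySession.remove(sessionId);
        if (owned == null) return;
        for (String path : owned) {
            nodes.remove(path);               // the ephemeral znode goes away here
        }
    }

    public static void main(String[] args) {
        EphemeralCleanupSketch tree = new EphemeralCleanupSketch();
        tree.createEphemeral("/siri/Douroucouli/clients/app004:38096",
                0x236bc504cb50002L);
        // If the closeSession record was truncated out of the txnlog, the
        // replay below never runs on that server and the znode lingers.
        tree.replayCloseSession(0x236bc504cb50002L);
        System.out.println("ephemerals left: " + tree.nodes.size());
    }
}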
> Ephemeral znode not deleted after session has expired on one follower (quorum
> is in an inconsistent state)
> -----------------------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-1449
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1449
> Project: ZooKeeper
> Issue Type: Bug
> Reporter: Daniel Lord
>
> I've been running into this situation in our labs fairly regularly where
> we'll get a single follower that will be in an inconsistent state with
> dangling ephemeral znodes. Here is all of the information that I have right
> now. Please ask if there is anything else that is useful.
> Here is a quick snapshot of the state of the ensemble where you can see it is
> out of sync across several znodes:
> -bash-3.2$ echo srvr | nc il23n04sa-zk001 2181
> Zookeeper version: 3.3.3-cdh3u2--1, built on 10/14/2011 05:17 GMT
> Latency min/avg/max: 0/7/25802
> Received: 64002
> Sent: 63985
> Outstanding: 0
> Zxid: 0x500000a41
> Mode: follower
> Node count: 497
> -bash-3.2$ echo srvr | nc il23n04sa-zk002 2181
> Zookeeper version: 3.3.3-cdh3u2--1, built on 10/14/2011 05:17 GMT
> Latency min/avg/max: 0/13/79032
> Received: 74320
> Sent: 74276
> Outstanding: 0
> Zxid: 0x500000a41
> Mode: leader
> Node count: 493
> -bash-3.2$ echo srvr | nc il23n04sa-zk003 2181
> Zookeeper version: 3.3.3-cdh3u2--1, built on 10/14/2011 05:17 GMT
> Latency min/avg/max: 0/2/25234
> Received: 187310
> Sent: 187320
> Outstanding: 0
> Zxid: 0x500000a41
> Mode: follower
> Node count: 493
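> (A quick way to compare all three, using the same srvr command as above:
> for h in il23n04sa-zk001 il23n04sa-zk002 il23n04sa-zk003; do
> echo -n "$h: "; echo srvr | nc $h 2181 | grep "Node count"
> done
> )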
> All of the zxids match up just fine but zk001 has 4 more nodes in its node
> count than the other two (including the leader). When I use a ZooKeeper
> client to connect directly to zk001 I can see the following znode that
> should no longer exist:
> [zk: localhost:2181(CONNECTED) 0] stat
> /siri/Douroucouli/clients/il23n04sa-app004.siri.apple.com:38096
> cZxid = 0x40000001a
> ctime = Mon Apr 16 11:00:47 PDT 2012
> mZxid = 0x40000001a
> mtime = Mon Apr 16 11:00:47 PDT 2012
> pZxid = 0x40000001a
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x236bc504cb50002
> dataLength = 0
> numChildren = 0
> This node does not exist using the client to connect to either of the other
> two members of the ensemble.
> I searched through the logs for that session id and it looks like it was
> established and closed cleanly. There were several leadership/quorum
> problems during the course of the session but it looks like it should have
> been shut down properly. Neither the session nor the znode show up in the
> "dump" on the leader but the problem znode does show up in the "dump" on
> zk001.
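> (The "dump" check here is the four-letter-word command, same pattern as the
> srvr checks above, e.g.:
> echo dump | nc il23n04sa-zk001 2181 | grep 38096
> which lists the sessions and ephemeral nodes that server is tracking.)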
> 2012-04-16 11:00:47,637 - INFO [CommitProcessor:2:NIOServerCnxn@1580] -
> Established session 0x236bc504cb50002 with negotiated timeout 15000 for
> client /17.202.71.201:38971
> 2012-04-16 11:20:59,341 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client
> attempting to renew session 0x236bc504cb50002 at /17.202.71.201:50841
> 2012-04-16 11:20:59,342 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established
> session 0x236bc504cb50002 with negotiated timeout 15000 for client
> /17.202.71.201:50841
> 2012-04-16 11:21:09,343 - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@634] -
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x236bc504cb50002, likely client has closed socket
> 2012-04-16 11:21:09,343 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed
> socket connection for client /17.202.71.201:50841 which had sessionid
> 0x236bc504cb50002
> 2012-04-16 11:21:20,352 - INFO
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:NIOServerCnxn@1435] - Closed socket
> connection for client /17.202.71.201:38971 which had sessionid
> 0x236bc504cb50002
> 2012-04-16 11:21:22,151 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client
> attempting to renew session 0x236bc504cb50002 at /17.202.71.201:38166
> 2012-04-16 11:21:22,152 - INFO
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:NIOServerCnxn@1580] - Established session
> 0x236bc504cb50002 with negotiated timeout 15000 for client
> /17.202.71.201:38166
> 2012-04-16 11:27:17,902 - INFO [ProcessThread:-1:PrepRequestProcessor@387] -
> Processed session termination for sessionid: 0x236bc504cb50002
> 2012-04-16 11:27:17,904 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed
> socket connection for client /17.202.71.201:38166 which had sessionid
> 0x236bc504cb50002
> The only way I've been able to recover from this situation is to shut down
> the problem follower, delete its snapshots and let it resync with the leader.
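> (Roughly, on the bad follower -- the data dir is from the attached configs,
> and clearing the txnlogs along with the snapshots is probably safest given
> the corruption:
> zkServer.sh stop
> rm /data1/zookeeper/version-2/snapshot.* /data1/zookeeper/version-2/log.*
> zkServer.sh start
> )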
> I'll attach the full log4j logs, the txn logs, the snapshots and the conf
> files for each member of the ensemble. Please let me know what other
> information is useful.