[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869349#comment-15869349
 ] 

Ryan Zhang commented on ZOOKEEPER-2684:
---------------------------------------

so here is what actually happened

On the server A, the request is processed right after the session connection is 
closed and the request is forwarded to the lead

investigate.log:2017-02-14 23:30:50,373 - INFO [FollowerRequestProcessor:4] - 
Processing request:: sessionid:0x4006b1b17808b0c type:getData cxid:0x14 
zxid:0xfffffffffffffffe txntype:unknown reqpath:/loadgen/load_1775503086
investigate.log:2017-02-14 23:30:54,327 - INFO [FollowerRequestProcessor:4] - 
Processing request:: sessionid:0x4006b1b17808b0c type:getData cxid:0x15 
zxid:0xfffffffffffffffe txntype:unknown reqpath:/loadgen/load_1775503083
investigate.log:2017-02-14 23:30:57,888 - INFO [FollowerRequestProcessor:4] - 
Processing request:: sessionid:0x4006b1b17808b0c type:getData cxid:0x16 
zxid:0xfffffffffffffffe txntype:unknown reqpath:/loadgen/load_1775503084
investigate.log:2017-02-14 23:31:01,583 - INFO [FollowerRequestProcessor:4] - 
Processing request:: sessionid:0x4006b1b17808b0c type:getData cxid:0x17 
zxid:0xfffffffffffffffe txntype:unknown reqpath:/loadgen/load_1775493180
investigate.log:2017-02-14 23:31:04,867 - INFO [FollowerRequestProcessor:4] - 
Processing request:: sessionid:0x4006b1b17808b0c type:getData cxid:0x18 
zxid:0xfffffffffffffffe txntype:unknown reqpath:/loadgen/load_1775493181
investigate.log:2017-02-14 23:31:08,150 - INFO [FollowerRequestProcessor:4] - 
Processing request:: sessionid:0x4006b1b17808b0c type:getData cxid:0x19 
zxid:0xfffffffffffffffe txntype:unknown reqpath:/loadgen/load_1775985079
investigate.log:2017-02-14 23:31:12,672 - INFO [FollowerRequestProcessor:4] - 
Processing request:: sessionid:0x4006b1b17808b0c type:getChildren2 cxid:0x1a 
zxid:0xfffffffffffffffe txntype:unknown reqpath:/loadgen
investigate.log:2017-02-14 23:31:17,293 - WARN [NIOWorkerThread-16] - Exception 
causing close of session 0x4006b1b17808b0c: Broken pipe
investigate.log:2017-02-14 23:31:17,297 - INFO [FollowerRequestProcessor:4] - 
Processing request:: sessionid:0x4006b1b17808b0c type:delete cxid:0x1b 
zxid:0xfffffffffffffffe txntype:unknown reqpath:/loadgen/load_1776069644
investigate.log:2017-02-14 23:31:21,816 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:delete cxid:0x27 zxid:0x385012a1e94 
txntype:2 reqpath:n/a
investigate.log:2017-02-14 23:31:21,816 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:create cxid:0x28 zxid:0x385012a1e95 
txntype:1 reqpath:n/a
investigate.log:2017-02-14 23:31:21,816 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:delete cxid:0x29 zxid:0x385012a1e96 
txntype:2 reqpath:n/a
investigate.log:2017-02-14 23:31:21,816 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:create cxid:0x2a zxid:0x385012a1e98 
txntype:1 reqpath:n/a
investigate.log:2017-02-14 23:31:21,817 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:delete cxid:0x2b zxid:0x385012a1e9b 
txntype:2 reqpath:n/a
investigate.log:2017-02-14 23:31:21,817 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:create cxid:0x2c zxid:0x385012a1e9c 
txntype:1 reqpath:n/a
investigate.log:2017-02-14 23:31:21,818 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:delete cxid:0x2d zxid:0x385012a1ea1 
txntype:2 reqpath:n/a
investigate.log:2017-02-14 23:31:21,822 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:create cxid:0x2e zxid:0x385012a1eb8 
txntype:1 reqpath:n/a
investigate.log:2017-02-14 23:31:21,824 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:delete cxid:0x2f zxid:0x385012a1ec0 
txntype:2 reqpath:n/a
investigate.log:2017-02-14 23:31:21,833 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:create cxid:0x30 zxid:0x385012a1ee9 
txntype:1 reqpath:n/a
investigate.log:2017-02-14 23:31:22,048 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:error cxid:0x1b zxid:0x385012a2156 
txntype:-1 reqpath:n/a
investigate.log:2017-02-14 23:31:22,690 - ERROR [CommitProcessor:4] - Got 
request sessionid:0x4006b1b17808b0c type:delete cxid:0x27 zxid:0x385012a1e94 
txntype:2 reqpath:n/a
investigate.log: but we are expecting request sessionid:0x4006b1b17808b0c 
type:delete cxid:0x1b zxid:0xfffffffffffffffe txntype:unknown 
reqpath:/loadgen/load_1776069644
investigate.log:2017-02-14 23:32:39,172 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:create cxid:0x4c zxid:0x385012b34fa 
txntype:1 reqpath:n/a
investigate.log:2017-02-14 23:32:48,677 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: sessionid:0x4006b1b17808b0c type:delete cxid:0x4e zxid:0x385012b6c31 
txntype:2 reqpath:n/a
investigate.log:2017-02-14 23:32:54,580 - INFO 
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled)] - Committing 
request:: se


while on the lead side, the requests are arrived out of order (in terms of 
Cxid) 

2017-02-14 23:31:21,822 - INFO [LearnerHandler-/10.45.111.116:53431] - 
Committing request:: sessionid:0x4006b1b17808b0c type:create cxid:0x2e 
zxid:0x385012a1eb8 txntype:1 reqpath:n/a
2017-02-14 23:31:21,824 - INFO [LearnerHandler-/10.45.111.116:53431] - Commit a 
proposal: sessionid:0x4006b1b17808b0c type:delete cxid:0x2f zxid:0x385012a1ec0 
txntype:2 reqpath:n/a
2017-02-14 23:31:21,824 - INFO [LearnerHandler-/10.45.111.116:53431] - 
Committing request:: sessionid:0x4006b1b17808b0c type:delete cxid:0x2f 
zxid:0x385012a1ec0 txntype:2 reqpath:n/a
2017-02-14 23:31:21,833 - INFO [LearnerHandler-/10.45.111.116:53431] - Commit a 
proposal: sessionid:0x4006b1b17808b0c type:create cxid:0x30 zxid:0x385012a1ee9 
txntype:1 reqpath:n/a
2017-02-14 23:31:21,833 - INFO [LearnerHandler-/10.45.111.116:53431] - 
Committing request:: sessionid:0x4006b1b17808b0c type:create cxid:0x30 
zxid:0x385012a1ee9 txntype:1 reqpath:n/a
2017-02-14 23:31:21,999 - DEBUG [ProcessThread(sid:1 cport:-1):] - Got 
user-level KeeperException when processing sessionid:0x4006b1b17808b0c 
type:delete cxid:0x1b zxid:0x385012a2156 txntype:-1 
reqpath:/loadgen/load_1776069644 Error Path:null Error:KeeperErrorCode = 
Session moved
investigate.log.1:2017-02-14 23:31:21,999 - INFO [ProcessThread(sid:1 
cport:-1):] - Processing request:: sessionid:0x4006b1b17808b0c type:delete 
cxid:0x1b zxid:0x385012a2156 txntype:-1 reqpath:/loadgen/load_1776069644
investigate.log.1:2017-02-14 23:31:21,999 - INFO [ProcessThread(sid:1 
cport:-1):] - Leader proposing: sessionid:0x4006b1b17808b0c type:delete 
cxid:0x1b zxid:0x385012a2156 txntype:-1 reqpath:/loadgen/load_1776069644
investigate.log.1:2017-02-14 23:31:22,048 - INFO 
[LearnerHandler-/10.45.112.127:53292] - Commit a proposal: 
sessionid:0x4006b1b17808b0c type:delete cxid:0x1b zxid:0x385012a2156 txntype:-1 
reqpath:/loadgen/load_1776069644
investigate.log.1:2017-02-14 23:31:22,048 - INFO 
[LearnerHandler-/10.45.112.127:53292] - Committing request:: 
sessionid:0x4006b1b17808b0c type:delete cxid:0x1b zxid:0x385012a2156 txntype:-1 
reqpath:/loadgen/load_1776069644
2017-02-14 23:31:26,728 - INFO [ProcessThread(sid:1 cport:-1):] - Processing 
request:: sessionid:0x4006b1b17808b0c type:delete cxid:0x32 zxid:0x385012a3bd0 
txntype:2 reqpath:n/a
2017-02-14 23:31:26,728 - INFO [ProcessThread(sid:1 cport:-1):] - Leader 
proposing: sessionid:0x4006b1b17808b0c type:delete cxid:0x32 zxid:0x385012a3bd0 
txntype:2 reqpath:n/a
2017-02-14 23:31:26,728 - INFO [ProcessThread(sid:1 cport:-1):] - Processing 
request:: sessionid:0x4006b1b17808b0c type:create cxid:0x33 zxid:0x385012a3bd1 
txntype:1 reqpath:n/a

Ideally, I would like to check if the server still owns the session (has a 
connection open) but it seems there is no easy way to check that. Any thoughts? 
The worst case is just be like the un-patched version to not check at all...

> Fix a crashing bug in the mixed workloads commit processor
> ----------------------------------------------------------
>
>                 Key: ZOOKEEPER-2684
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2684
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.6.0
>         Environment: with pretty heavy load on a real cluster
>            Reporter: Ryan Zhang
>            Assignee: Ryan Zhang
>            Priority: Blocker
>         Attachments: ZOOKEEPER-2684.patch
>
>
> We deployed our build with ZOOKEEPER-2024 and it quickly started to crash 
> with the following error
> atla-buh-05-sr1.prod.twttr.net: 2017-01-18 22:24:42,305 - ERROR 
> [CommitProcessor:2] 
> -org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:268)
>  – Got cxid 0x119fa expected 0x11fc5 for client session id 1009079ba470055
> atla-buh-05-sr1.prod.twttr.net: 2017-01-18 22:32:04,746 - ERROR 
> [CommitProcessor:2] 
> -org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:268)
>  – Got cxid 0x698 expected 0x928 for client session id 4002eeb3fd0009d
> atla-buh-05-sr1.prod.twttr.net: 2017-01-18 22:34:46,648 - ERROR 
> [CommitProcessor:2] 
> -org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:268)
>  – Got cxid 0x8904 expected 0x8f34 for client session id 51b8905c90251
> atla-buh-05-sr1.prod.twttr.net: 2017-01-18 22:43:46,834 - ERROR 
> [CommitProcessor:2] 
> -org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:268)
>  – Got cxid 0x3a8d expected 0x3ebc for client session id 2051af11af900cc
> clearly something is not right in the new commit processor per session queue 
> implementation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to