Rakesh Kumar Singh created ZOOKEEPER-2546:
---------------------------------------------

             Summary: Started throwing "Error Path:null Error:KeeperErrorCode = 
ReconfigInProgress" error when trying to change the cluster using reconfig and 
the IO hangged at one node
                 Key: ZOOKEEPER-2546
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2546
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.5.1
            Reporter: Rakesh Kumar Singh


Started throwing "Error Path:null Error:KeeperErrorCode = ReconfigInProgress" 
error when trying to change the cluster using reconfig and the IO hangged at 
one node.

Steps:-
1. Start Zookeeper in cluster mode
2. try to reconfig the cluster using "reconfig" command from one node's client 
(194) like
"reconfig -remove 3 -add 3=10.18.221.194:2888:3888;2181
3. make the IO busy for 5-10 secs at 194 node and then release
4. Again execute the above reconfig command
It is failing to execute even after 3-4 mins.
Server log is attached. (Complete server log is attached)

2016-09-02 18:12:05,845 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):QuorumPeer@1074]
 - LEADING
2016-09-02 18:12:05,848 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Leader@63] - 
TCP NoDelay set to: true
2016-09-02 18:12:05,848 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Leader@83] - 
zookeeper.leader.maxConcurrentSnapshots = 10
2016-09-02 18:12:05,848 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Leader@85] - 
zookeeper.leader.maxConcurrentSnapshotTimeout = 5
2016-09-02 18:12:05,849 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):ZooKeeperServer@858]
 - minSessionTimeout set to 4000
2016-09-02 18:12:05,849 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):ZooKeeperServer@867]
 - maxSessionTimeout set to 40000
2016-09-02 18:12:05,849 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):ZooKeeperServer@156]
 - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 
40000 datadir 
/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/data/version-2 
snapdir 
/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/data/version-2
2016-09-02 18:12:05,850 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Leader@412] - 
LEADING - LEADER ELECTION TOOK - 5
2016-09-02 18:12:05,852 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):FileTxnSnapLog@298]
 - Snapshotting: 0x100000001 to 
/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/data/version-2/snapshot.100000001
2016-09-02 18:12:06,854 [myid:2] - INFO  
[LearnerHandler-/10.18.101.80:55632:LearnerHandler@382] - Follower sid: 1 : 
info : 10.18.101.80:2888:3888:participant;0.0.0.0:2181
2016-09-02 18:12:06,869 [myid:2] - INFO  
[LearnerHandler-/10.18.101.80:55632:LearnerHandler@683] - Synchronizing with 
Follower sid: 1 maxCommittedLog=0x100000001 minCommittedLog=0x100000001 
lastProcessedZxid=0x100000001 peerLastZxid=0x100000001
2016-09-02 18:12:06,869 [myid:2] - INFO  
[LearnerHandler-/10.18.101.80:55632:LearnerHandler@727] - Sending DIFF 
zxid=0x100000001 for peer sid: 1
2016-09-02 18:12:06,888 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Leader@1245] 
- Have quorum of supporters, sids: [ [1, 2] ]; starting up and setting last 
processed zxid: 0x200000000
2016-09-02 18:12:06,890 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):CommitProcessor@254]
 - Configuring CommitProcessor with 8 worker threads.
2016-09-02 18:12:06,898 [myid:2] - INFO  
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):ContainerManager@64]
 - Using checkIntervalMs=60000 maxPerMinute=10000
2016-09-02 18:12:18,886 [myid:2] - INFO  
[WorkerReceiver[myid=2]:FastLeaderElection@688] - Notification: 2 (message 
format version), 3 (n.leader), 0x0 (n.zxid), 0xffffffffffffffff (n.round), 
LEADING (n.state), 3 (n.sid), 0x1 (n.peerEPoch), LEADING (my state)200000028 
(n.config version)
2016-09-02 18:13:47,869 [myid:2] - INFO  [ProcessThread(sid:2 
cport:-1)::PrepRequestProcessor@512] - Incremental reconfig
2016-09-02 18:13:47,872 [myid:2] - ERROR [ProcessThread(sid:2 
cport:-1)::QuorumPeer@1383] - setLastSeenQuorumVerifier called with stale 
config 8589934593. Current version: 8589934632
2016-09-02 18:14:15,545 [myid:2] - INFO  [ProcessThread(sid:2 
cport:-1)::PrepRequestProcessor@843] - Got user-level KeeperException when 
processing sessionid:0x1000aa5ce650000 type:reconfig cxid:0x3 zxid:0x200000002 
txntype:-1 reqpath:n/a Error Path:null Error:KeeperErrorCode = 
ReconfigInProgress
2016-09-02 18:14:56,442 [myid:2] - INFO  
[NIOServerCxnFactory.AcceptThread:/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /10.18.219.50:48388
2016-09-02 18:14:56,454 [myid:2] - INFO  [NIOWorkerThread-1:NIOServerCnxn@485] 
- Processing srvr command from /10.18.219.50:48388
2016-09-02 18:14:56,467 [myid:2] - INFO  [NIOWorkerThread-1:NIOServerCnxn@606] 
- Closed socket connection for client /10.18.219.50:48388 (no session 
established for client)
2016-09-02 18:17:18,365 [myid:2] - INFO  [ProcessThread(sid:2 
cport:-1)::PrepRequestProcessor@843] - Got user-level KeeperException when 
processing sessionid:0x1000aa5ce650000 type:reconfig cxid:0x4 zxid:0x200000003 
txntype:-1 reqpath:n/a Error Path:null Error:KeeperErrorCode = 
ReconfigInProgress
2016-09-02 18:19:23,699 [myid:2] - INFO  [ProcessThread(sid:2 
cport:-1)::PrepRequestProcessor@843] - Got user-level KeeperException when 
processing sessionid:0x1000aa5ce650000 type:reconfig cxid:0x5 zxid:0x200000004 
txntype:-1 reqpath:n/a Error Path:null Error:KeeperErrorCode = 
ReconfigInProgress




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to