Rakesh Kumar Singh created ZOOKEEPER-2546:
---------------------------------------------
Summary: Started throwing "Error Path:null Error:KeeperErrorCode =
ReconfigInProgress" error when trying to change the cluster using reconfig and
the IO hangged at one node
Key: ZOOKEEPER-2546
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2546
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.5.1
Reporter: Rakesh Kumar Singh
Started throwing "Error Path:null Error:KeeperErrorCode = ReconfigInProgress"
error when trying to change the cluster using reconfig and the IO hangged at
one node.
Steps:-
1. Start Zookeeper in cluster mode
2. try to reconfig the cluster using "reconfig" command from one node's client
(194) like
"reconfig -remove 3 -add 3=10.18.221.194:2888:3888;2181
3. make the IO busy for 5-10 secs at 194 node and then release
4. Again execute the above reconfig command
It is failing to execute even after 3-4 mins.
Server log is attached. (Complete server log is attached)
2016-09-02 18:12:05,845 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):QuorumPeer@1074]
- LEADING
2016-09-02 18:12:05,848 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Leader@63] -
TCP NoDelay set to: true
2016-09-02 18:12:05,848 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Leader@83] -
zookeeper.leader.maxConcurrentSnapshots = 10
2016-09-02 18:12:05,848 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Leader@85] -
zookeeper.leader.maxConcurrentSnapshotTimeout = 5
2016-09-02 18:12:05,849 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):ZooKeeperServer@858]
- minSessionTimeout set to 4000
2016-09-02 18:12:05,849 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):ZooKeeperServer@867]
- maxSessionTimeout set to 40000
2016-09-02 18:12:05,849 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):ZooKeeperServer@156]
- Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout
40000 datadir
/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/data/version-2
snapdir
/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/data/version-2
2016-09-02 18:12:05,850 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Leader@412] -
LEADING - LEADER ELECTION TOOK - 5
2016-09-02 18:12:05,852 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):FileTxnSnapLog@298]
- Snapshotting: 0x100000001 to
/home/Rakesh/Zookeeper/18_Aug/cluster/zookeeper-3.5.1-alpha/data/version-2/snapshot.100000001
2016-09-02 18:12:06,854 [myid:2] - INFO
[LearnerHandler-/10.18.101.80:55632:LearnerHandler@382] - Follower sid: 1 :
info : 10.18.101.80:2888:3888:participant;0.0.0.0:2181
2016-09-02 18:12:06,869 [myid:2] - INFO
[LearnerHandler-/10.18.101.80:55632:LearnerHandler@683] - Synchronizing with
Follower sid: 1 maxCommittedLog=0x100000001 minCommittedLog=0x100000001
lastProcessedZxid=0x100000001 peerLastZxid=0x100000001
2016-09-02 18:12:06,869 [myid:2] - INFO
[LearnerHandler-/10.18.101.80:55632:LearnerHandler@727] - Sending DIFF
zxid=0x100000001 for peer sid: 1
2016-09-02 18:12:06,888 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Leader@1245]
- Have quorum of supporters, sids: [ [1, 2] ]; starting up and setting last
processed zxid: 0x200000000
2016-09-02 18:12:06,890 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):CommitProcessor@254]
- Configuring CommitProcessor with 8 worker threads.
2016-09-02 18:12:06,898 [myid:2] - INFO
[QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):ContainerManager@64]
- Using checkIntervalMs=60000 maxPerMinute=10000
2016-09-02 18:12:18,886 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection@688] - Notification: 2 (message
format version), 3 (n.leader), 0x0 (n.zxid), 0xffffffffffffffff (n.round),
LEADING (n.state), 3 (n.sid), 0x1 (n.peerEPoch), LEADING (my state)200000028
(n.config version)
2016-09-02 18:13:47,869 [myid:2] - INFO [ProcessThread(sid:2
cport:-1)::PrepRequestProcessor@512] - Incremental reconfig
2016-09-02 18:13:47,872 [myid:2] - ERROR [ProcessThread(sid:2
cport:-1)::QuorumPeer@1383] - setLastSeenQuorumVerifier called with stale
config 8589934593. Current version: 8589934632
2016-09-02 18:14:15,545 [myid:2] - INFO [ProcessThread(sid:2
cport:-1)::PrepRequestProcessor@843] - Got user-level KeeperException when
processing sessionid:0x1000aa5ce650000 type:reconfig cxid:0x3 zxid:0x200000002
txntype:-1 reqpath:n/a Error Path:null Error:KeeperErrorCode =
ReconfigInProgress
2016-09-02 18:14:56,442 [myid:2] - INFO
[NIOServerCxnFactory.AcceptThread:/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296]
- Accepted socket connection from /10.18.219.50:48388
2016-09-02 18:14:56,454 [myid:2] - INFO [NIOWorkerThread-1:NIOServerCnxn@485]
- Processing srvr command from /10.18.219.50:48388
2016-09-02 18:14:56,467 [myid:2] - INFO [NIOWorkerThread-1:NIOServerCnxn@606]
- Closed socket connection for client /10.18.219.50:48388 (no session
established for client)
2016-09-02 18:17:18,365 [myid:2] - INFO [ProcessThread(sid:2
cport:-1)::PrepRequestProcessor@843] - Got user-level KeeperException when
processing sessionid:0x1000aa5ce650000 type:reconfig cxid:0x4 zxid:0x200000003
txntype:-1 reqpath:n/a Error Path:null Error:KeeperErrorCode =
ReconfigInProgress
2016-09-02 18:19:23,699 [myid:2] - INFO [ProcessThread(sid:2
cport:-1)::PrepRequestProcessor@843] - Got user-level KeeperException when
processing sessionid:0x1000aa5ce650000 type:reconfig cxid:0x5 zxid:0x200000004
txntype:-1 reqpath:n/a Error Path:null Error:KeeperErrorCode =
ReconfigInProgress
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)