[
https://issues.apache.org/jira/browse/ZOOKEEPER-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108046#comment-17108046
]
benwang li edited comment on ZOOKEEPER-3829 at 5/15/20, 2:56 PM:
-----------------------------------------------------------------
We start `CommitProcessor`
[here|https://github.com/apache/zookeeper/blob/e87bad6774e7269ef21a156aff9dad089ef54794/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/CommitProcessor.java#L455].
We shut down `CommitProcessor`
[here|https://github.com/apache/zookeeper/blob/e87bad6774e7269ef21a156aff9dad089ef54794/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/CommitProcessor.java#L637].
However, when the `start` method is called again after that shutdown, the `workerPool` no longer works.
I have attached node D's logs as `d.log`, which show this sequence.
{code:java}
2020-05-14 18:04:12,022 [myid:4] - INFO
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):CommitProcessor@362]
- Shutting down
2020-05-14 18:04:12,022 [myid:4] - INFO
[FollowerRequestProcessor:4:FollowerRequestProcessor@110] -
FollowerRequestProcessor exited loop!
2020-05-14 18:04:12,022 [myid:4] - INFO
[CommitProcessor:4:CommitProcessor@195] - CommitProcessor exited loop!
2020-05-14 18:04:12,023 [myid:4] - INFO
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):FinalRequestProcessor@514]
- shutdown of request processor complete
2020-05-14 18:04:12,024 [myid:4] - DEBUG
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):FileTxnLog$FileTxnIterator@655]
- Created new input stream /data1/zookeeper/logs/version-2/log.2a0000000b
2020-05-14 18:04:12,024 [myid:4] - DEBUG
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):FileTxnLog$FileTxnIterator@658]
- Created new input archive /data1/zookeeper/logs/version-2/log.2a0000000b
2020-05-14 18:04:12,024 [myid:4] - DEBUG
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):FileTxnLog$FileTxnIterator@696]
- EOF exception java.io.EOFException: Failed to read
/data1/zookeeper/logs/version-2/log.2a0000000b
--
2020-05-14 18:04:29,000 [myid:4] - DEBUG
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):SessionTrackerImpl@274]
- Adding session 0x3082f5048fc0000
2020-05-14 18:04:29,000 [myid:4] - DEBUG
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):SessionTrackerImpl@274]
- Adding session 0x40a33f8f3f40002
2020-05-14 18:04:29,000 [myid:4] - DEBUG
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):SessionTrackerImpl@274]
- Adding session 0x40a33f8f3f40000
2020-05-14 18:04:29,000 [myid:4] - DEBUG
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):SessionTrackerImpl@274]
- Adding session 0x40a33f8f3f40001
2020-05-14 18:04:29,000 [myid:4] - INFO
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):CommitProcessor@256]
- Configuring CommitProcessor with 24 worker threads.
2020-05-14 18:04:29,002 [myid:4] - INFO
[QuorumPeer[myid=4](plain=/0:0:0:0:0:0:0:0:2183)(secure=disabled):ContainerManager@64]
- Using checkIntervalMs=60000 maxPerMinute=10000
2020-05-14 18:04:29,003 [myid:4] - DEBUG
[LearnerHandler-/146.196.79.232:38708:LearnerHandler@534] - Sending UPTODATE
message to 3
{code}
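The start / shutdown / start sequence described before the log excerpt can be modeled with a small, self-contained sketch. This is not ZooKeeper code: the class `StaleWorkerPoolSketch` is hypothetical, a plain `java.util.concurrent.ExecutorService` stands in for the real `workerPool`, and the sketch assumes that `start` only builds the pool when the field is null (which is what the null-reset fix proposed in the issue implies). ZooKeeper's own pool may react differently to work submitted after shutdown; a standard executor rejects it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Simplified model of the restart problem; NOT the real CommitProcessor.
class StaleWorkerPoolSketch {
    // Stand-in for the processor's workerPool field (assumption: a plain executor).
    private ExecutorService workerPool;

    // Assumed behavior: the pool is created only if the field is still null.
    void start() {
        if (workerPool == null) {
            workerPool = Executors.newFixedThreadPool(2);
        }
    }

    // Stops the pool but keeps the reference, as the comment above describes.
    void shutdown() {
        if (workerPool != null) {
            workerPool.shutdown();
        }
    }

    // Returns true if the pool accepted the task, false if it rejected it.
    boolean submit(Runnable task) {
        try {
            workerPool.execute(task);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        StaleWorkerPoolSketch processor = new StaleWorkerPoolSketch();
        processor.start();
        System.out.println("after first start, accepted:  " + processor.submit(() -> { }));
        processor.shutdown();
        processor.start(); // field is non-null, so the stopped pool is reused
        System.out.println("after second start, accepted: " + processor.submit(() -> { }));
    }
}
```

In this model the second `start` call is a no-op, so every later submission is refused even though the processor appears to have started again.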
> Zookeeper refuses request after node expansion
> ----------------------------------------------
>
> Key: ZOOKEEPER-3829
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3829
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.5.6
> Reporter: benwang li
> Priority: Major
> Attachments: d.log, screenshot-1.png
>
>
> It's easy to reproduce this bug.
> {code:java}
> Step 1. Deploy 3 nodes A, B, C with configuration `A,B,C`.
> Step 2. Deploy node `D` with configuration `A,B,C,D`; the cluster state is OK
> at this point.
> Step 3. Restart nodes A, B, C with configuration `A,B,C,D`. D then becomes the
> leader and the cluster hangs: it still accepts the `mntr` command, but other
> commands such as `ls /` block.
> Step 4. Restart node D; the cluster state returns to normal.
>
> {code}
>
> We looked into the 3.5.6 code and suspect the problem lies with
> `workerPool`.
> When `CommitProcessor` shuts down, it shuts down `workerPool` as well, but
> the `workerPool` object still exists. It never works again, yet the cluster
> still thinks it is OK.
>
> I think the bug may still exist in the master branch.
> We have tested a fix on our machines that resets `workerPool` to null. If
> that is OK, please assign this issue to me and I will create a PR.
>
>
>
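The null-reset idea mentioned in the description can be sketched as follows. This is only an illustration under assumptions, not the actual patch: `ResettableWorkerPoolSketch` is a hypothetical class, a standard `ExecutorService` stands in for `workerPool`, and `start` is assumed to create the pool only when the field is null.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Simplified model of the proposed fix; NOT the real CommitProcessor.
class ResettableWorkerPoolSketch {
    // Stand-in for the processor's workerPool field (assumption: a plain executor).
    private ExecutorService workerPool;

    // Assumed behavior: the pool is created only if the field is null.
    void start() {
        if (workerPool == null) {
            workerPool = Executors.newFixedThreadPool(2);
        }
    }

    // Stops the pool AND clears the reference so the next start() rebuilds it.
    void shutdown() {
        if (workerPool != null) {
            workerPool.shutdown();
            workerPool = null; // the reset described in the issue
        }
    }

    // Returns true if a running pool accepted the task, false otherwise.
    boolean submit(Runnable task) {
        if (workerPool == null) {
            return false; // not started, or already shut down
        }
        try {
            workerPool.execute(task);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }
}
```

With the reference cleared in `shutdown`, a later `start` builds a fresh pool, so submissions are accepted again after a restart. Whether the real fix should null the field or recreate the pool unconditionally in `start` is a design choice this sketch does not settle.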