[
https://issues.apache.org/jira/browse/RATIS-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lokesh Jain updated RATIS-687:
------------------------------
Description:
LifeCycle#checkStateAndClose does not run closeMethod when there is a
transition from NEW to CLOSED state. In case of RaftServerImpl this leads to
leak of resources as the SegmentedRaftLogWorker#workerThread is never closed.
The below log shows that the group creation failed because the group was
removed before the server could be started and transition to RUNNING state.
Therefore the group was removed while the server was still in NEW state and the
state transitioned from NEW to CLOSED without shutting down the server threads.
{code:java}
2019-09-19 12:57:18,936 [grpc-default-executor-2] INFO impl.RaftServerProxy
(RaftServerProxy.java:addNew(89)) - dbb1763a-ad27-40d5-a907-326b490ac078:
addNew
group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423] returns
group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not
completed]
2019-09-19 12:57:27,953 [grpc-default-executor-0] INFO impl.RaftServerProxy
(RaftServerProxy.java:remove(95)) - dbb1763a-ad27-40d5-a907-326b490ac078:
remove group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not
completed, 1 dependents]
2019-09-19 12:59:04,846 [pool-48-thread-1] WARN impl.RaftServerProxy
(RaftServerProxy.java:lambda$groupAddAsync$11(388)) -
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd*
GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
cid=2, seq=0, RW, null,
Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]2019-09-19 12:59:04,846
[pool-48-thread-1] WARN impl.RaftServerProxy
(RaftServerProxy.java:lambda$groupAddAsync$11(388)) -
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd*
GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
cid=2, seq=0, RW, null,
Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]java.util.concurrent.CompletionException:
java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: failed
to start a new impl: null
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null,
voted=null,
raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
at
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:604)
at
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
at
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)Caused by:
java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: failed
to start a new impl: null
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null,
voted=null,
raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at
org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:75) at
org.apache.ratis.server.impl.RaftServerProxy.lambda$groupAddAsync$10(RaftServerProxy.java:382)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
... 5 more
Timestamp: 2019-09-19 01:12:35,583
"8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLogWorker"
prio=5 tid=703 timed_waiting
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at
org.apache.ratis.util.DataBlockingQueue.poll(DataBlockingQueue.java:137)
at
org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.run(SegmentedRaftLogWorker.java:283)
at java.lang.Thread.run(Thread.java:748)
{code}
was:
LifeCycle#checkStateAndClose does not run closeMethod when there is a
transition from NEW to CLOSED state. In case of RaftServerImpl this leads to
leak of resources as the SegmentedRaftLogWorker#workerThread is never closed.
{code:java}
2019-09-19 12:57:18,936 [grpc-default-executor-2] INFO impl.RaftServerProxy
(RaftServerProxy.java:addNew(89)) - dbb1763a-ad27-40d5-a907-326b490ac078:
addNew
group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423] returns
group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not
completed]
2019-09-19 12:57:27,953 [grpc-default-executor-0] INFO impl.RaftServerProxy
(RaftServerProxy.java:remove(95)) - dbb1763a-ad27-40d5-a907-326b490ac078:
remove group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not
completed, 1 dependents]
2019-09-19 12:59:04,846 [pool-48-thread-1] WARN impl.RaftServerProxy
(RaftServerProxy.java:lambda$groupAddAsync$11(388)) -
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd*
GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
cid=2, seq=0, RW, null,
Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]2019-09-19 12:59:04,846
[pool-48-thread-1] WARN impl.RaftServerProxy
(RaftServerProxy.java:lambda$groupAddAsync$11(388)) -
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd*
GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
cid=2, seq=0, RW, null,
Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]java.util.concurrent.CompletionException:
java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: failed
to start a new impl: null
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null,
voted=null,
raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
at
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:604)
at
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
at
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)Caused by:
java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: failed
to start a new impl: null
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null,
voted=null,
raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at
org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:75) at
org.apache.ratis.server.impl.RaftServerProxy.lambda$groupAddAsync$10(RaftServerProxy.java:382)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
... 5 more
Timestamp: 2019-09-19 01:12:35,583
"8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLogWorker"
prio=5 tid=703 timed_waiting
java.lang.Thread.State: TIMED_WAITING
at sun.misc.Unsafe.park(Native Method)
at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at
org.apache.ratis.util.DataBlockingQueue.poll(DataBlockingQueue.java:137)
at
org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.run(SegmentedRaftLogWorker.java:283)
at java.lang.Thread.run(Thread.java:748)
{code}
> RaftServerImpl is not shutdown when transitioning from NEW to CLOSED state
> --------------------------------------------------------------------------
>
> Key: RATIS-687
> URL: https://issues.apache.org/jira/browse/RATIS-687
> Project: Ratis
> Issue Type: Bug
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Major
> Attachments: RATIS-687.001.patch
>
>
> LifeCycle#checkStateAndClose does not run closeMethod when there is a
> transition from NEW to CLOSED state. In case of RaftServerImpl this leads to
> leak of resources as the SegmentedRaftLogWorker#workerThread is never closed.
> The below log shows that the group creation failed because the group was
> removed before the server could be started and transition to RUNNING state.
> Therefore the group was removed while the server was still in NEW state and
> the state transitioned from NEW to CLOSED without shutting down the server
> threads.
> {code:java}
> 2019-09-19 12:57:18,936 [grpc-default-executor-2] INFO impl.RaftServerProxy
> (RaftServerProxy.java:addNew(89)) - dbb1763a-ad27-40d5-a907-326b490ac078:
> addNew
> group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
> dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
> 6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423] returns
> group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not
> completed]
> 2019-09-19 12:57:27,953 [grpc-default-executor-0] INFO impl.RaftServerProxy
> (RaftServerProxy.java:remove(95)) - dbb1763a-ad27-40d5-a907-326b490ac078:
> remove group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not
> completed, 1 dependents]
> 2019-09-19 12:59:04,846 [pool-48-thread-1] WARN impl.RaftServerProxy
> (RaftServerProxy.java:lambda$groupAddAsync$11(388)) -
> 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd*
> GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
> cid=2, seq=0, RW, null,
> Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
> dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
> 6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]2019-09-19
> 12:59:04,846 [pool-48-thread-1] WARN impl.RaftServerProxy
> (RaftServerProxy.java:lambda$groupAddAsync$11(388)) -
> 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd*
> GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
> cid=2, seq=0, RW, null,
> Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
> dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
> 6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]java.util.concurrent.CompletionException:
> java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:
> failed to start a new impl: null
> 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null,
> voted=null,
> raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
> conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
> dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
> 6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
> at
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
> at
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:604)
> at
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
> at
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)Caused by:
> java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: failed
> to start a new impl: null
> 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null,
> voted=null,
> raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
> conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
> dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421,
> 6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:75) at
> org.apache.ratis.server.impl.RaftServerProxy.lambda$groupAddAsync$10(RaftServerProxy.java:382)
> at
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> ... 5 more
> Timestamp: 2019-09-19 01:12:35,583
> "8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLogWorker"
> prio=5 tid=703 timed_waiting
> java.lang.Thread.State: TIMED_WAITING
> at sun.misc.Unsafe.park(Native Method)
> at
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at
> org.apache.ratis.util.DataBlockingQueue.poll(DataBlockingQueue.java:137)
> at
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.run(SegmentedRaftLogWorker.java:283)
> at java.lang.Thread.run(Thread.java:748)
> {code}
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)