[ 
https://issues.apache.org/jira/browse/RATIS-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated RATIS-687:
------------------------------
    Description: 
LifeCycle#checkStateAndClose does not run closeMethod when there is a 
transition from NEW to CLOSED state. In case of RaftServerImpl this leads to 
leak of resources as the SegmentedRaftLogWorker#workerThread is never closed.

 The below log shows that the group creation failed because the group was 
removed before the server could be started and transition to RUNNING state. 
Therefore the group was removed while the server was still in NEW state and the 
state transitioned from NEW to CLOSED without shutting down the server threads.
{code:java}
2019-09-19 12:57:18,936 [grpc-default-executor-2] INFO  impl.RaftServerProxy 
(RaftServerProxy.java:addNew(89)) - dbb1763a-ad27-40d5-a907-326b490ac078: 
addNew 
group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423] returns 
group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not 
completed]
2019-09-19 12:57:27,953 [grpc-default-executor-0] INFO  impl.RaftServerProxy 
(RaftServerProxy.java:remove(95)) - dbb1763a-ad27-40d5-a907-326b490ac078: 
remove group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not 
completed, 1 dependents]
2019-09-19 12:59:04,846 [pool-48-thread-1] WARN  impl.RaftServerProxy 
(RaftServerProxy.java:lambda$groupAddAsync$11(388)) - 
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd* 
GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
 cid=2, seq=0, RW, null, 
Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]2019-09-19 12:59:04,846 
[pool-48-thread-1] WARN  impl.RaftServerProxy 
(RaftServerProxy.java:lambda$groupAddAsync$11(388)) - 
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd* 
GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
 cid=2, seq=0, RW, null, 
Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]java.util.concurrent.CompletionException:
 java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: failed 
to start a new impl:      null 
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null, 
voted=null, 
raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
 conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
 at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
 at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:604) 
at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
 at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748)Caused by: 
java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: failed 
to start a new impl:      null 
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null, 
voted=null, 
raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
 conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at 
org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:75) at 
org.apache.ratis.server.impl.RaftServerProxy.lambda$groupAddAsync$10(RaftServerProxy.java:382)
 at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) 
... 5 more
Timestamp: 2019-09-19 01:12:35,583
"8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLogWorker"
  prio=5 tid=703 timed_waiting
java.lang.Thread.State: TIMED_WAITING
        at sun.misc.Unsafe.park(Native Method)
        at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at 
org.apache.ratis.util.DataBlockingQueue.poll(DataBlockingQueue.java:137)
        at 
org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.run(SegmentedRaftLogWorker.java:283)
        at java.lang.Thread.run(Thread.java:748)
{code}
 

  was:
LifeCycle#checkStateAndClose does not run closeMethod when there is a 
transition from NEW to CLOSED state. In case of RaftServerImpl this leads to 
leak of resources as the SegmentedRaftLogWorker#workerThread is never closed.

 
{code:java}
2019-09-19 12:57:18,936 [grpc-default-executor-2] INFO  impl.RaftServerProxy 
(RaftServerProxy.java:addNew(89)) - dbb1763a-ad27-40d5-a907-326b490ac078: 
addNew 
group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423] returns 
group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not 
completed]
2019-09-19 12:57:27,953 [grpc-default-executor-0] INFO  impl.RaftServerProxy 
(RaftServerProxy.java:remove(95)) - dbb1763a-ad27-40d5-a907-326b490ac078: 
remove group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not 
completed, 1 dependents]
2019-09-19 12:59:04,846 [pool-48-thread-1] WARN  impl.RaftServerProxy 
(RaftServerProxy.java:lambda$groupAddAsync$11(388)) - 
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd* 
GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
 cid=2, seq=0, RW, null, 
Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]2019-09-19 12:59:04,846 
[pool-48-thread-1] WARN  impl.RaftServerProxy 
(RaftServerProxy.java:lambda$groupAddAsync$11(388)) - 
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd* 
GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
 cid=2, seq=0, RW, null, 
Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]java.util.concurrent.CompletionException:
 java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: failed 
to start a new impl:      null 
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null, 
voted=null, 
raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
 conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
 at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
 at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:604) 
at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
 at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748)Caused by: 
java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: failed 
to start a new impl:      null 
8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null, 
voted=null, 
raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
 conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at 
org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:75) at 
org.apache.ratis.server.impl.RaftServerProxy.lambda$groupAddAsync$10(RaftServerProxy.java:382)
 at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) 
... 5 more
Timestamp: 2019-09-19 01:12:35,583
"8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLogWorker"
  prio=5 tid=703 timed_waiting
java.lang.Thread.State: TIMED_WAITING
        at sun.misc.Unsafe.park(Native Method)
        at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at 
org.apache.ratis.util.DataBlockingQueue.poll(DataBlockingQueue.java:137)
        at 
org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.run(SegmentedRaftLogWorker.java:283)
        at java.lang.Thread.run(Thread.java:748)
{code}
 


> RaftServerImpl is not shutdown when transitioning from NEW to CLOSED state
> --------------------------------------------------------------------------
>
>                 Key: RATIS-687
>                 URL: https://issues.apache.org/jira/browse/RATIS-687
>             Project: Ratis
>          Issue Type: Bug
>            Reporter: Lokesh Jain
>            Assignee: Lokesh Jain
>            Priority: Major
>         Attachments: RATIS-687.001.patch
>
>
> LifeCycle#checkStateAndClose does not run closeMethod when there is a 
> transition from NEW to CLOSED state. In case of RaftServerImpl this leads to 
> leak of resources as the SegmentedRaftLogWorker#workerThread is never closed.
>  The below log shows that the group creation failed because the group was 
> removed before the server could be started and transition to RUNNING state. 
> Therefore the group was removed while the server was still in NEW state and 
> the state transitioned from NEW to CLOSED without shutting down the server 
> threads.
> {code:java}
> 2019-09-19 12:57:18,936 [grpc-default-executor-2] INFO  impl.RaftServerProxy 
> (RaftServerProxy.java:addNew(89)) - dbb1763a-ad27-40d5-a907-326b490ac078: 
> addNew 
> group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
> dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
> 6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423] returns 
> group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not 
> completed]
> 2019-09-19 12:57:27,953 [grpc-default-executor-0] INFO  impl.RaftServerProxy 
> (RaftServerProxy.java:remove(95)) - dbb1763a-ad27-40d5-a907-326b490ac078: 
> remove group-B983B2DCF085:java.util.concurrent.CompletableFuture@69914d62[Not 
> completed, 1 dependents]
> 2019-09-19 12:59:04,846 [pool-48-thread-1] WARN  impl.RaftServerProxy 
> (RaftServerProxy.java:lambda$groupAddAsync$11(388)) - 
> 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd* 
> GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
>  cid=2, seq=0, RW, null, 
> Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
>  dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
> 6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]2019-09-19 
> 12:59:04,846 [pool-48-thread-1] WARN  impl.RaftServerProxy 
> (RaftServerProxy.java:lambda$groupAddAsync$11(388)) - 
> 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: Failed groupAdd* 
> GroupManagementRequest:client-256726B37980->8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085,
>  cid=2, seq=0, RW, null, 
> Add:group-B983B2DCF085:[8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422,
>  dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
> 6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423]java.util.concurrent.CompletionException:
>  java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: 
> failed to start a new impl:      null 
> 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null, 
> voted=null, 
> raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
>  conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
> dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
> 6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at 
> java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
>  at 
> java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
>  at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:604) 
> at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>  at 
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)Caused by: 
> java.lang.IllegalStateException: 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed: failed 
> to start a new impl:      null 
> 8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085:t0, leader=null, 
> voted=null, 
> raftlog=8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLog:OPENED:c-1,f-1,i0,
>  conf=-1: [8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed:10.200.5.17:50422, 
> dbb1763a-ad27-40d5-a907-326b490ac078:10.200.5.17:50421, 
> 6f0a09de-30ac-4af3-bd09-c1d256eeede3:10.200.5.17:50423], old=null CLOSED at 
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:75) at 
> org.apache.ratis.server.impl.RaftServerProxy.lambda$groupAddAsync$10(RaftServerProxy.java:382)
>  at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) 
> ... 5 more
> Timestamp: 2019-09-19 01:12:35,583
> "8e6c3d3c-fdb0-4da3-8339-a4c04a7595ed@group-B983B2DCF085-SegmentedRaftLogWorker"
>   prio=5 tid=703 timed_waiting
> java.lang.Thread.State: TIMED_WAITING
>         at sun.misc.Unsafe.park(Native Method)
>         at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>         at 
> org.apache.ratis.util.DataBlockingQueue.poll(DataBlockingQueue.java:137)
>         at 
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.run(SegmentedRaftLogWorker.java:283)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to