[
https://issues.apache.org/jira/browse/CASSANDRA-10756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242893#comment-15242893
]
Alex Petrov commented on CASSANDRA-10756:
-----------------------------------------
In short, here is what is happening. When {{Server::close()}} is called, it
calls {{ConnectionTracker::closeAll}}, which in turn calls
{{DefaultChannelGroup.close}}. Although {{DefaultChannelGroup}} has its own
executor, the {{Futures}} created for {{waitUninterruptibly}} within
{{ConnectionTracker::closeAll}} use the {{Channel}}'s executor
({{workerGroup}}/{{NioExecutor}} in our case).
So adding a guard will indeed fix the test, since the call to {{close}} is
synchronous. Alternatively, we can shut down the {{workerGroup}} gracefully
with a quiet period and a timeout.
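To illustrate the guard idea, here is a minimal sketch that uses only JDK executors as a stand-in for Netty. The {{Service}} class, the {{destroyed}} flag and the {{runConcurrentDestroys}} helper are hypothetical names made up for this example, not the actual Cassandra code. With Netty, the graceful alternative would presumably go through {{shutdownGracefully(quietPeriod, timeout, unit)}} on the {{workerGroup}}; the sketch approximates that with {{shutdown()}} followed by {{awaitTermination}}.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class GuardedShutdownSketch {

    /** Hypothetical stand-in for a service that owns a worker executor. */
    static final class Service {
        private final ExecutorService workerGroup = Executors.newSingleThreadExecutor();
        private final AtomicBoolean destroyed = new AtomicBoolean(false);
        final AtomicInteger closes = new AtomicInteger();
        final AtomicInteger rejections = new AtomicInteger();

        void destroy() {
            // Guard: only the first caller proceeds, so no other thread can be
            // inside closeAll() once the executor has been shut down.
            if (!destroyed.compareAndSet(false, true))
                return;
            closeAll();
            // JDK approximation of a graceful shutdown with a timeout.
            workerGroup.shutdown();
            try {
                workerGroup.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        private void closeAll() {
            closes.incrementAndGet();
            try {
                // Stand-in for the listener notification that is submitted
                // to the channel's executor during close.
                workerGroup.execute(() -> { });
            } catch (RejectedExecutionException e) {
                rejections.incrementAndGet();
            }
        }
    }

    /** Calls destroy() from several threads; returns {closeAll calls, rejected tasks}. */
    static int[] runConcurrentDestroys(int threads) {
        Service service = new Service();
        ExecutorService callers = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            callers.execute(() -> {
                try {
                    start.await();      // release all callers at the same time
                    service.destroy();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        start.countDown();
        callers.shutdown();
        try {
            callers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { service.closes.get(), service.rejections.get() };
    }

    public static void main(String[] args) {
        int[] result = runConcurrentDestroys(8);
        System.out.println("closeAll calls: " + result[0] + ", rejected tasks: " + result[1]);
    }
}
```

With the guard in place, a run should report a single {{closeAll}} call and no rejected tasks. Note that in this sketch callers that lose the race return immediately, without waiting for the first caller to finish the shutdown.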
> Timeout failures in NativeTransportService.testConcurrentDestroys unit test
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-10756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10756
> Project: Cassandra
> Issue Type: Bug
> Reporter: Joel Knighton
> Assignee: Alex Petrov
>
> History of test on trunk
> [here|http://cassci.datastax.com/job/trunk_testall/lastCompletedBuild/testReport/org.apache.cassandra.service/NativeTransportServiceTest/testConcurrentDestroys/history/].
> I've seen these failures across 3.0/trunk for a while. I ran the test looping
> locally for a while and the timeout is fairly easy to reproduce. The timeout
> appears to be an indefinite hang and not a timing issue.
> When the timeout occurs, the following stack trace is at the end of the logs
> for the unit test.
> {code}
> ERROR [ForkJoinPool.commonPool-worker-1] 2015-11-22 21:30:53,635 Failed to submit a listener notification task. Event loop shut down?
> java.util.concurrent.RejectedExecutionException: event executor terminated
>     at io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:745) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:322) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:728) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.util.concurrent.DefaultPromise.execute(DefaultPromise.java:671) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.util.concurrent.DefaultPromise.notifyLateListener(DefaultPromise.java:641) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:138) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:93) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:28) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.group.DefaultChannelGroupFuture.<init>(DefaultChannelGroupFuture.java:116) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.group.DefaultChannelGroup.close(DefaultChannelGroup.java:275) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.group.DefaultChannelGroup.close(DefaultChannelGroup.java:167) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at org.apache.cassandra.transport.Server$ConnectionTracker.closeAll(Server.java:277) [main/:na]
>     at org.apache.cassandra.transport.Server.close(Server.java:180) [main/:na]
>     at org.apache.cassandra.transport.Server.stop(Server.java:116) [main/:na]
>     at java.util.Collections$SingletonSet.forEach(Collections.java:4767) ~[na:1.8.0_60]
>     at org.apache.cassandra.service.NativeTransportService.stop(NativeTransportService.java:136) ~[main/:na]
>     at org.apache.cassandra.service.NativeTransportService.destroy(NativeTransportService.java:144) ~[main/:na]
>     at org.apache.cassandra.service.NativeTransportServiceTest.lambda$withService$102(NativeTransportServiceTest.java:201) ~[classes/:na]
>     at java.util.stream.IntPipeline$3$1.accept(IntPipeline.java:233) ~[na:1.8.0_60]
>     at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110) ~[na:1.8.0_60]
>     at java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:693) ~[na:1.8.0_60]
>     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[na:1.8.0_60]
>     at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[na:1.8.0_60]
>     at java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:747) ~[na:1.8.0_60]
>     at java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:721) ~[na:1.8.0_60]
>     at java.util.stream.AbstractTask.compute(AbstractTask.java:316) ~[na:1.8.0_60]
>     at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731) ~[na:1.8.0_60]
>     at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) ~[na:1.8.0_60]
>     at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) ~[na:1.8.0_60]
>     at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) ~[na:1.8.0_60]
>     at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) ~[na:1.8.0_60]
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)