[
https://issues.apache.org/jira/browse/FLINK-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061119#comment-17061119
]
Gary Yao commented on FLINK-13553:
----------------------------------
At least one test leaks ByteBufs. We use {{EmbeddedChannel}} to capture
outbound data but do not always release the captured buffers after the test
ends.
For example, running {{KvStateServerHandlerTest#testChunkedResponse()}} in a
loop will eventually lead to a
{{org.apache.flink.shaded.netty4.io.netty.util.internal.OutOfDirectMemoryError}},
which in turn [will be
swallowed|https://github.com/apache/flink/blob/6cf07d374a34742a919a1dc1edf4eb1c1f44e831/flink-queryable-state/flink-queryable-state-client-java/src/main/java/org/apache/flink/queryablestate/network/AbstractServerHandler.java#L137-L152].
As a consequence, the test fails with a {{TimeoutException}} instead of
surfacing the actual error.
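To illustrate why a swallowed error shows up as a timeout, here is a minimal plain-Java sketch. It is not the actual {{AbstractServerHandler}} code. The class name {{SwallowedErrorDemo}}, the 200 ms timeout, and the use of a plain {{OutOfMemoryError}} as a stand-in for {{OutOfDirectMemoryError}} are all assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SwallowedErrorDemo {

    /** Returns true if waiting for the (never completed) response timed out. */
    static boolean timesOutWhenErrorIsSwallowed() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Hypothetical stand-in for the response the test is waiting for.
        CompletableFuture<String> response = new CompletableFuture<>();
        try {
            executor.execute(() -> {
                try {
                    // Stand-in for the allocation failure seen in the real test.
                    throw new OutOfMemoryError("simulated: no direct memory left");
                } catch (Throwable t) {
                    // Swallowed: the error is neither rethrown nor used to
                    // complete the future, so the waiting side never hears of it.
                }
            });
            try {
                response.get(200, TimeUnit.MILLISECONDS);
                return false;
            } catch (TimeoutException e) {
                // The only symptom visible to the caller is the timeout.
                return true;
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("timed out: " + timesOutWhenErrorIsSwallowed());
    }
}
```

The waiting side only ever observes the timeout, which is why the root cause is not visible in the test failure itself.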
Enabling the {{ResourceLeakDetector}} confirms that ByteBufs are leaked:
{code}
// PARANOID tracks every buffer (no sampling); set it before the test allocates buffers.
ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
{code}
{noformat}
7931 [main] ERROR
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector [] - LEAK:
ByteBuf.release() was not called before it's garbage-collected. See
https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records:
#1:
org.apache.flink.shaded.netty4.io.netty.buffer.AdvancedLeakAwareByteBuf.writeBytes(AdvancedLeakAwareByteBuf.java:610)
org.apache.flink.queryablestate.network.messages.MessageSerializer.writePayload(MessageSerializer.java:211)
org.apache.flink.queryablestate.network.messages.MessageSerializer.serializeResponse(MessageSerializer.java:113)
org.apache.flink.queryablestate.network.AbstractServerHandler$AsyncRequestTask.lambda$run$0(AbstractServerHandler.java:249)
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:792)
java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2153)
org.apache.flink.queryablestate.network.AbstractServerHandler$AsyncRequestTask.run(AbstractServerHandler.java:236)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
[...]
{noformat}
A possible fix is to release any ByteBufs that are still unreleased at the end
of each test.
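A rough sketch of that cleanup idea, written in plain Java so that it runs without Netty on the classpath. {{RefCountedBuffer}} and {{CapturingChannel}} are hypothetical stand-ins for {{ByteBuf}} and {{EmbeddedChannel}}, and {{releaseOutbound}} is an invented helper name. In the real test, the equivalent would presumably drain the channel with {{EmbeddedChannel#readOutbound()}} and pass each message to {{ReferenceCountUtil.release(...)}}, for example from a teardown method.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class ReleaseOutboundDemo {

    /** Hypothetical stand-in for a reference-counted buffer (not Netty's ByteBuf). */
    static final class RefCountedBuffer {
        private int refCnt = 1;

        int refCnt() {
            return refCnt;
        }

        void release() {
            if (refCnt == 0) {
                throw new IllegalStateException("buffer already released");
            }
            refCnt--;
        }
    }

    /** Hypothetical stand-in for a channel that captures outbound messages. */
    static final class CapturingChannel {
        private final Queue<Object> outbound = new ArrayDeque<>();

        void writeOutbound(Object msg) {
            outbound.add(msg);
        }

        /** Returns the next captured message, or null if none is left. */
        Object readOutbound() {
            return outbound.poll();
        }
    }

    /**
     * Drains all captured outbound messages and releases the reference-counted
     * ones. Returns the number of buffers released.
     */
    static int releaseOutbound(CapturingChannel channel) {
        int released = 0;
        Object msg;
        while ((msg = channel.readOutbound()) != null) {
            if (msg instanceof RefCountedBuffer) {
                ((RefCountedBuffer) msg).release();
                released++;
            }
        }
        return released;
    }

    public static void main(String[] args) {
        CapturingChannel channel = new CapturingChannel();
        channel.writeOutbound(new RefCountedBuffer());
        channel.writeOutbound(new RefCountedBuffer());
        // Would run at the end of each test, after the assertions.
        System.out.println("released buffers: " + releaseOutbound(channel));
    }
}
```

Draining the queue at the end of each test, rather than releasing only the messages a test happens to read, also covers responses that a test never inspects.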
> KvStateServerHandlerTest.readInboundBlocking unstable on Travis
> ---------------------------------------------------------------
>
> Key: FLINK-13553
> URL: https://issues.apache.org/jira/browse/FLINK-13553
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Queryable State
> Affects Versions: 1.10.0, 1.11.0
> Reporter: Till Rohrmann
> Assignee: Gary Yao
> Priority: Critical
> Labels: test-stability
> Fix For: 1.11.0
>
>
> The {{KvStateServerHandlerTest.readInboundBlocking}} and
> {{KvStateServerHandlerTest.testQueryExecutorShutDown}} fail on Travis with a
> {{TimeoutException}}.
> https://api.travis-ci.org/v3/job/566420641/log.txt