[ 
https://issues.apache.org/jira/browse/FLINK-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061119#comment-17061119
 ] 

Gary Yao commented on FLINK-13553:
----------------------------------

At least one test leaks ByteBufs. The tests use {{EmbeddedChannel}} to capture 
outbound data but do not always release the captured buffers when the test ends. 

For example, running {{KvStateServerHandlerTest#testChunkedResponse()}} in a 
loop eventually leads to an 
{{org.apache.flink.shaded.netty4.io.netty.util.internal.OutOfDirectMemoryError}},
 which in turn [is 
swallowed|https://github.com/apache/flink/blob/6cf07d374a34742a919a1dc1edf4eb1c1f44e831/flink-queryable-state/flink-queryable-state-client-java/src/main/java/org/apache/flink/queryablestate/network/AbstractServerHandler.java#L137-L152].
 As a consequence, the test fails with a {{TimeoutException}} instead of the 
underlying error. 

We can furthermore confirm that ByteBufs are leaked by enabling Netty's 
{{ResourceLeakDetector}} at its most sensitive level:

{code}
ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
{code}

{noformat}
7931 [main] ERROR org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector [] - LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records: 
#1:
        org.apache.flink.shaded.netty4.io.netty.buffer.AdvancedLeakAwareByteBuf.writeBytes(AdvancedLeakAwareByteBuf.java:610)
        org.apache.flink.queryablestate.network.messages.MessageSerializer.writePayload(MessageSerializer.java:211)
        org.apache.flink.queryablestate.network.messages.MessageSerializer.serializeResponse(MessageSerializer.java:113)
        org.apache.flink.queryablestate.network.AbstractServerHandler$AsyncRequestTask.lambda$run$0(AbstractServerHandler.java:249)
        java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
        java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:792)
        java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2153)
        org.apache.flink.queryablestate.network.AbstractServerHandler$AsyncRequestTask.run(AbstractServerHandler.java:236)
        java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        java.util.concurrent.FutureTask.run(FutureTask.java:266)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        java.lang.Thread.run(Thread.java:748)
        [...]
{noformat}

A possible fix is to release any ByteBufs that are still unreleased at the end of each test.
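
A minimal sketch of what releasing leftover buffers at the end of a test could look like. It is an illustration, not the actual patch: the class {{EmbeddedChannelCleanup}} and the helper {{drainAndRelease}} are hypothetical names. The imports use plain {{io.netty}}; inside Flink the shaded {{org.apache.flink.shaded.netty4.io.netty}} packages would be used instead. Netty 4.1 is assumed.

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import io.netty.channel.embedded.EmbeddedChannel;
import io.netty.util.ReferenceCountUtil;

// Hypothetical helper class, not part of Flink or Netty.
public class EmbeddedChannelCleanup {

    /**
     * Reads every message still queued on the outbound side of the channel
     * and releases it. Returns the number of messages that were released.
     */
    static int drainAndRelease(EmbeddedChannel channel) {
        int released = 0;
        Object msg;
        while ((msg = channel.readOutbound()) != null) {
            // Decrements the reference count if msg is reference-counted
            // (e.g. a ByteBuf); does nothing for other message types.
            ReferenceCountUtil.release(msg);
            released++;
        }
        return released;
    }

    public static void main(String[] args) {
        EmbeddedChannel channel = new EmbeddedChannel();

        // Simulate a handler response that the test never reads.
        ByteBuf response = Unpooled.directBuffer(16).writeInt(42);
        channel.writeOutbound(response);
        System.out.println("refCnt before cleanup: " + response.refCnt());

        int released = drainAndRelease(channel);
        System.out.println("released " + released
                + " message(s), refCnt after cleanup: " + response.refCnt());
    }
}
```

In a JUnit test this would presumably be called from a teardown method (for example one annotated with {{@After}}) for every {{EmbeddedChannel}} the test created. Netty 4.1 also offers {{EmbeddedChannel#finishAndReleaseAll()}}, which may achieve the same in a single call. Whether that fits here depends on how the tests manage their channels.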



> KvStateServerHandlerTest.readInboundBlocking unstable on Travis
> ---------------------------------------------------------------
>
>                 Key: FLINK-13553
>                 URL: https://issues.apache.org/jira/browse/FLINK-13553
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Queryable State
>    Affects Versions: 1.10.0, 1.11.0
>            Reporter: Till Rohrmann
>            Assignee: Gary Yao
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.11.0
>
>
> The {{KvStateServerHandlerTest.readInboundBlocking}} and 
> {{KvStateServerHandlerTest.testQueryExecutorShutDown}} fail on Travis with a 
> {{TimeoutException}}.
> https://api.travis-ci.org/v3/job/566420641/log.txt



