[ 
https://issues.apache.org/jira/browse/FLINK-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545457#comment-16545457
 ] 

Nico Kruber edited comment on FLINK-9860 at 7/16/18 4:36 PM:
-------------------------------------------------------------

The e2e test that was running when the leak occurred runs with parallelism 1 
on a single TaskManager (TM). Therefore, the leak cannot be in the 
Flink-internal communication between TMs. Also, looking at the logs in more 
detail, the leak is reported in the JobManager (JM) log anyway.

The only call executed at this stage (around job submission) is 
{{flink list -r}}. Unfortunately, I was not able to reproduce the leak, either 
without or with {{env.java.opts: -Dio.netty.leakDetection.level=paranoid}}, 
which would give more details.
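
For context: {{env.java.opts}} hands extra options to the Flink JVMs, so a 
{{-D}} flag like the one above should show up as a JVM system property. The 
following is a minimal sketch, not part of Flink or Netty, of how one could 
confirm that a JVM was started with the flag. The class name 
{{LeakLevelCheck}} and the "<unset>" placeholder are hypothetical, and the 
sketch assumes the property name is used exactly as written in the comment 
above; whether Flink's shaded Netty reads that exact name is not confirmed 
here.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Flink or Netty.
public class LeakLevelCheck {
    // Property name copied from the comment above (assumed, not verified
    // against the shaded Netty classes).
    static final String PROP = "io.netty.leakDetection.level";

    // Returns the configured level in upper case, or "<unset>" if the
    // JVM was started without the -D flag.
    static String configuredLevel() {
        String value = System.getProperty(PROP);
        return value == null ? "<unset>" : value.trim().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(PROP + " = " + configuredLevel());
    }
}
```

Running this with {{java -Dio.netty.leakDetection.level=paranoid LeakLevelCheck}} 
reports the level the JVM received; running it without the flag reports the 
placeholder.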


was (Author: nicok):
The e2e test that was running when the leak occurred actually runs with 
parallelism 1 on 1 taskmanager. Therefore, it cannot be in the Flink-internal 
communication between TMs. Also, looking at the logs in more details, it is 
reported from the JM log anyway.

The only call that is being executed at this stage (around job submission) is 
{{flink list -r}} but, unfortunately, I was not able to reproduce this with or 
without {{env.java.opts: -Dio.netty.leakDetection.level=paranoid}} which would 
give more details.

> Netty resource leak on receiver side
> ------------------------------------
>
>                 Key: FLINK-9860
>                 URL: https://issues.apache.org/jira/browse/FLINK-9860
>             Project: Flink
>          Issue Type: Bug
>          Components: Network
>    Affects Versions: 1.6.0
>            Reporter: Till Rohrmann
>            Assignee: Nico Kruber
>            Priority: Blocker
>              Labels: test-stability
>             Fix For: 1.6.0
>
>
> The Hadoop-free Wordcount end-to-end test fails with the following exception:
> {code}
> ERROR org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector  - LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
> Recent access records:
> Created at:
>       org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
>       org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
>       org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
>       org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:137)
>       org.apache.flink.shaded.netty4.io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
>       org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147)
>       org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
>       org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
>       org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
>       org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
>       org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
>       org.apache.flink.shaded.netty4.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> {code}
> We might have a resource leak on the receiving side of our network stack.
> https://api.travis-ci.org/v3/job/404225956/log.txt



