[ https://issues.apache.org/jira/browse/FLINK-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936964#comment-14936964 ]
ASF GitHub Bot commented on FLINK-2773:
---------------------------------------
GitHub user mxm opened a pull request:
https://github.com/apache/flink/pull/1203
[FLINK-2773] remove strict upper direct memory limit
Setting a strict upper limit for the direct memory size can cause
problems with the direct memory allocation of the Netty network stack
leading to OutOfMemoryExceptions.
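As background for the change (this sketch is not part of the patch, and the class and method names below are hypothetical), a strict cap set via `-XX:MaxDirectMemorySize` makes every direct-buffer allocation a potential failure point: once Netty's cumulative direct allocations exceed the cap, `ByteBuffer.allocateDirect` throws the "Direct buffer memory" error seen in the stack trace below. A minimal reproduction of that mechanism:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative only (not Flink code): shows how a hard direct-memory cap
// turns direct-buffer allocation into
// "java.lang.OutOfMemoryError: Direct buffer memory".
public class DirectMemoryDemo {

    /**
     * Allocates up to maxChunks direct buffers of chunkBytes each, retaining
     * them so they cannot be reclaimed, and returns the number of bytes
     * successfully reserved before the JVM's direct-memory limit was hit.
     */
    public static long allocateUntilLimit(int chunkBytes, int maxChunks) {
        List<ByteBuffer> retained = new ArrayList<>();
        long allocated = 0;
        try {
            for (int i = 0; i < maxChunks; i++) {
                // Each allocation is charged against -XX:MaxDirectMemorySize.
                retained.add(ByteBuffer.allocateDirect(chunkBytes));
                allocated += chunkBytes;
            }
        } catch (OutOfMemoryError e) {
            // With a strict cap (e.g. -XX:MaxDirectMemorySize=8m) the JVM
            // throws here once the cap is exceeded; Netty's unpooled direct
            // buffers fail the same way.
        }
        return allocated;
    }

    public static void main(String[] args) {
        // Under default settings this usually completes (the default limit
        // tracks the max heap size); run with a small -XX:MaxDirectMemorySize
        // to see the allocation stop early.
        System.out.println(allocateUntilLimit(1 << 20, 16));
    }
}
```

Removing the strict upper limit means Netty's allocations are bounded only by the container's overall memory budget rather than by a separately computed cap.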
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mxm/flink direct-memory-fix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/1203.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1203
----
commit e79b2f88d9fd8b70f3efc39ac8be2d907b67e698
Author: Maximilian Michels <[email protected]>
Date: 2015-09-30T14:45:14Z
[FLINK-2773] remove strict upper direct memory limit
Setting a strict upper limit for the direct memory size can cause
problems with the direct memory allocation of the Netty network stack
leading to OutOfMemoryExceptions.
----
> OutOfMemoryError on YARN Session
> --------------------------------
>
> Key: FLINK-2773
> URL: https://issues.apache.org/jira/browse/FLINK-2773
> Project: Flink
> Issue Type: Bug
> Components: YARN Client
> Affects Versions: 0.10
> Reporter: Fabian Hueske
> Assignee: Maximilian Michels
> Priority: Blocker
> Fix For: 0.10
>
>
> When running a Flink program on a detached YARN session using the latest
> master (commit {{0b3ca57b41e09937b9e63f2f443834c8ad1cf497}}), I observed this
> {{OutOfMemoryError}}
> {code}
> java.lang.Exception: The data preparation for task 'CoGroup (coGroup-A68B765B7BAB4E29BF6816965A994776)', caused an error: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: java.lang.OutOfMemoryError: Direct buffer memory
>     at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:464)
>     at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:354)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:579)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: java.lang.OutOfMemoryError: Direct buffer memory
>     at org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:607)
>     at org.apache.flink.runtime.operators.RegularPactTask.getInput(RegularPactTask.java:1089)
>     at org.apache.flink.runtime.operators.CoGroupDriver.prepare(CoGroupDriver.java:97)
>     at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:459)
>     ... 3 more
> Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated due to an exception: java.lang.OutOfMemoryError: Direct buffer memory
>     at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:787)
> Caused by: org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: java.lang.OutOfMemoryError: Direct buffer memory
>     at org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.exceptionCaught(PartitionRequestClientHandler.java:153)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
>     at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:224)
>     at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
>     at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:224)
>     at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
>     at io.netty.channel.AbstractChannelHandlerContext.notifyHandlerException(AbstractChannelHandlerContext.java:737)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:310)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>     at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct buffer memory
>     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:234)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>     ... 9 more
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
>     at java.nio.Bits.reserveMemory(Bits.java:658)
>     at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
>     at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
>     at io.netty.buffer.UnpooledUnsafeDirectByteBuf.allocateDirect(UnpooledUnsafeDirectByteBuf.java:108)
>     at io.netty.buffer.UnpooledUnsafeDirectByteBuf.capacity(UnpooledUnsafeDirectByteBuf.java:157)
>     at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251)
>     at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849)
>     at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841)
>     at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831)
>     at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92)
>     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:228)
>     ... 10 more
> {code}
> Since I know that this feature was working properly until recently, I reverted to commit {{8ca853e0f6c18be8e6b066c6ec0f23badb797323}} and the problem was gone.
> The problem might have been introduced when off-heap memory support was added for YARN (commit {{93c95b6a6f150a2c55dc387e4ef1d603b3ef3f22}}).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)