[
https://issues.apache.org/jira/browse/FLINK-11379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Haibo Suen updated FLINK-11379:
-------------------------------
Description:
When a TM loads a large offloaded TDD, it may throw a
"java.lang.OutOfMemoryError: Direct buffer memory" error. The loading code uses
NIO's _Files.readAllBytes()_ to read the serialized TDD. In the call stack of
_Files.readAllBytes()_, a direct memory buffer is allocated whose size
equals the length of the file. This causes an OutOfMemoryError when
not enough direct memory is available.
To avoid the OutOfMemoryError, the file should be read through a fixed-size
direct buffer, such as an 8K buffer.
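The proposal above could look roughly like the following. This is only a minimal sketch of the idea, not the actual patch: the class name ChunkedFileReader, the method readAllBytesChunked and the CHUNK_SIZE constant are hypothetical names chosen for illustration. It reads the file through a FileChannel into one reusable 8K direct buffer, so the direct memory requested by the read stays constant regardless of the file length.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical helper, not part of Flink: reads a whole file into a heap
// byte array while using only a fixed-size direct buffer.
final class ChunkedFileReader {

    // Assumed chunk size, taken from the "8K buffer" suggestion above.
    private static final int CHUNK_SIZE = 8 * 1024;

    static byte[] readAllBytesChunked(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            long size = channel.size();
            if (size > Integer.MAX_VALUE - 8) {
                throw new IOException("File too large for a byte array: " + size);
            }

            byte[] result = new byte[(int) size];
            // The only direct allocation: 8K, independent of the file length.
            ByteBuffer chunk = ByteBuffer.allocateDirect(CHUNK_SIZE);

            int offset = 0;
            while (offset < result.length) {
                chunk.clear();
                // Never request more than what is still missing.
                chunk.limit(Math.min(CHUNK_SIZE, result.length - offset));
                int n = channel.read(chunk);
                if (n < 0) {
                    break; // file was truncated while reading
                }
                chunk.flip();
                chunk.get(result, offset, n);
                offset += n;
            }
            // Return only the bytes actually read if the file shrank.
            return offset == result.length ? result : Arrays.copyOf(result, offset);
        }
    }
}
```

Because the channel is handed a direct buffer, the JDK has no reason to allocate a temporary direct buffer of its own for the read, which is the allocation that fails in the stack trace.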
The exception stack is as follows (this exception stack is from an old Flink
version, but the master branch has the same problem).
_Caused by: java.lang.OutOfMemoryError: Direct buffer memory_
_at java.nio.Bits.reserveMemory(Bits.java:706)_
_at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)_
_at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)_
_at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:241)_
_at sun.nio.ch.IOUtil.read(IOUtil.java:195)_
_at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:182)_
_at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)_
_at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)_
_at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)_
_at java.nio.file.Files.read(Files.java:3105)_
_at java.nio.file.Files.readAllBytes(Files.java:3158)_
_at org.apache.flink.runtime.deployment.TaskDeploymentDescriptor.loadBigData(TaskDeploymentDescriptor.java:338)_
_at org.apache.flink.runtime.taskexecutor.TaskExecutor.submitTask(TaskExecutor.java:397)_
_at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)_
_at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)_
_at java.lang.reflect.Method.invoke(Method.java:498)_
_at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:211)_
_at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:155)_
_at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:133)_
_at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)_
_at akka.actor.Actor$class.aroundReceive(Actor.scala:502)_
_at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)_
_... 9 more_
> "java.lang.OutOfMemoryError: Direct buffer memory" when TM loads a large size
> TDD
> ---------------------------------------------------------------------------------
>
> Key: FLINK-11379
> URL: https://issues.apache.org/jira/browse/FLINK-11379
> Project: Flink
> Issue Type: Bug
> Components: TaskManager
> Affects Versions: 1.7.0, 1.7.1
> Reporter: Haibo Suen
> Assignee: Haibo Suen
> Priority: Major