[
https://issues.apache.org/jira/browse/FLUME-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Masanobu Horiyama updated FLUME-2731:
-------------------------------------
Description:
The flume agent throws an OutOfMemoryError during load tests.
{noformat}
2015-06-29 15:30:24,590 (New I/O worker #4) [WARN -
org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.exceptionCaught(NettyServer.java:201)]
Unexpected exception from downstream.
java.lang.OutOfMemoryError: Java heap space
at java.util.HashMap.<init>(HashMap.java:187)
at java.util.HashMap.<init>(HashMap.java:199)
at
org.apache.avro.generic.GenericDatumReader.newMap(GenericDatumReader.java:330)
at
org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:239)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
at org.apache.avro.ipc.Responder.respond(Responder.java:124)
at
org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.messageReceived(NettyServer.java:188)
at
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at
org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:173)
at
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at
org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at
org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at
org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at
org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
{noformat}
The test:
A test worker consists of a NettyAvroRpcClient shared by a thread pool of size
12. The rpc client instance will be recreated whenever isActive is false. Flume
events with a timestamp header and a body of 250 random bytes are submitted
continuously. Test workers are started in groups of 20. 5 groups are started in
total with 5 second delays between starts.
Usually, after the first group of 20, we see the OOM error in the agent.
Got the avro-1.8.0-SNAPSHOT source, and added debug logging in the newMap
method to see the size of allocation:
https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumReader.java#L405-L411
And found that in most cases the size was 1, but when the OOM errors start
happening, the size is always 640371331.
The OOM error occurs more frequently when the connect-timeout and
request-timeout are both shorter than 20 seconds.
Seems to be related to
AVRO-1111
FLUME-1259
FLUME-1641
was:
The flume agent throws an OutOfMemoryError during load tests.
{noformat}
2015-06-29 15:30:24,590 (New I/O worker #4) [WARN -
org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.exceptionCaught(NettyServer.java:201)]
Unexpected exception from downstream.
java.lang.OutOfMemoryError: Java heap space
at java.util.HashMap.<init>(HashMap.java:187)
at java.util.HashMap.<init>(HashMap.java:199)
at
org.apache.avro.generic.GenericDatumReader.newMap(GenericDatumReader.java:330)
at
org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:239)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
at org.apache.avro.ipc.Responder.respond(Responder.java:124)
at
org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.messageReceived(NettyServer.java:188)
at
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at
org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:173)
at
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at
org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at
org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at
org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at
org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
{noformat}
The test:
A test worker consists of a NettyAvroRpcClient shared by a thread pool of size
12. The rpc client instance will be recreated whenever isActive is false. Flume
events with a timestamp header and a body of 250 random bytes are submitted
continuously. Test workers are started in groups of 20. 5 groups are started in
total with 5 second delays between starts.
Usually, after the first group of 20, we see the OOM error in the agent.
Got the avro-1.8.0-SNAPSHOT source, and added debug logging in the newMap
method to see the size of allocation:
https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumReader.java#L405-L411
And found that in most cases the size was 1, but when the OOM errors start
happening, the size is always 640371331.
Seems to be related to
AVRO-1111
FLUME-1259
FLUME-1641
> Flume Agent throws OutOfMemoryError during load tests.
> ------------------------------------------------------
>
> Key: FLUME-2731
> URL: https://issues.apache.org/jira/browse/FLUME-2731
> Project: Flume
> Issue Type: Bug
> Components: Node
> Affects Versions: v1.6.0
> Environment: Flume Agent : 1.6.0
> OS : Mac OS X 10.7.5 and CentOS release 6.6 2.6.32-504.1.3.el6.x86_64
> avro : 1.7.4
> avro-ipc : 1.7.4
> JDK: 1.6.0_65 and 1.7.0-45
> Flume Client - NettyAvroRpcClient
> flume-ng-sdk : 1.6.0
> OS : Mac OS X 10.7.5
> avro : 1.7.4
> avro-ipc : 1.7.4
> JDK: 1.6.0_65
> The agent config:
> {noformat}
> # Define a memory channel called ch1 on agent1
> agent1.channels.ch1.type = memory
> agent1.channels.ch1.capacity = 10000
> # Define an Avro source called avro-source1 on agent1 and tell it
> # to bind to 0.0.0.0:41414. Connect it to channel ch1.
> agent1.sources.avro-source1.channels = ch1
> agent1.sources.avro-source1.type = avro
> agent1.sources.avro-source1.bind = 0.0.0.0
> agent1.sources.avro-source1.port = 41414
> # Define a logger sink that simply logs all events it receives
> # and connect it to the other end of the same channel.
> agent1.sinks.log-sink1.channel = ch1
> #agent1.sinks.log-sink1.type = logger
> agent1.sinks.log-sink1.type = null
> agent1.sinks.log-sink1.batchSize = 10
> # Finally, now that we've defined all of our components, tell
> # agent1 which ones we want to activate.
> agent1.channels = ch1
> agent1.sources = avro-source1
> agent1.sinks = log-sink1
> {noformat}
> Reporter: Masanobu Horiyama
>
> The flume agent throws an OutOfMemoryError during load tests.
> {noformat}
> 2015-06-29 15:30:24,590 (New I/O worker #4) [WARN -
> org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.exceptionCaught(NettyServer.java:201)]
> Unexpected exception from downstream.
> java.lang.OutOfMemoryError: Java heap space
> at java.util.HashMap.<init>(HashMap.java:187)
> at java.util.HashMap.<init>(HashMap.java:199)
> at
> org.apache.avro.generic.GenericDatumReader.newMap(GenericDatumReader.java:330)
> at
> org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:239)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
> at org.apache.avro.ipc.Responder.respond(Responder.java:124)
> at
> org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.messageReceived(NettyServer.java:188)
> at
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
> at
> org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:173)
> at
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
> at
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
> at
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
> at
> org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
> at
> org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
> at
> org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
> at
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
> at
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
> at
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
> at
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
> at
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
> at
> org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
> at
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
> at
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
> at
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
> at
> org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
> at
> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
> at
> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at java.lang.Thread.run(Thread.java:695)
> {noformat}
> The test:
> A test worker consists of a NettyAvroRpcClient shared by a thread pool of
> size 12. The rpc client instance will be recreated whenever isActive is
> false. Flume events with a timestamp header and a body of 250 random bytes
> are submitted continuously. Test workers are started in groups of 20. 5
> groups are started in total with 5 second delays between starts.
> Usually, after the first group of 20, we see the OOM error in the agent.
> Got the avro-1.8.0-SNAPSHOT source, and added debug logging in the newMap
> method to see the size of allocation:
> https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumReader.java#L405-L411
> And found that in most cases the size was 1, but when the OOM errors start
> happening, the size is always 640371331.
> The OOM error occurs more frequently when the connect-timeout and
> request-timeout are both shorter than 20 seconds.
> Seems to be related to
> AVRO-1111
> FLUME-1259
> FLUME-1641
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)