[ https://issues.apache.org/jira/browse/FLUME-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masanobu Horiyama updated FLUME-2731:
-------------------------------------
    Description: 
The flume agent throws an OutOfMemoryError during load tests.

{noformat}
2015-06-29 15:30:24,590 (New I/O worker #4) [WARN - org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.exceptionCaught(NettyServer.java:201)] Unexpected exception from downstream.
java.lang.OutOfMemoryError: Java heap space
        at java.util.HashMap.<init>(HashMap.java:187)
        at java.util.HashMap.<init>(HashMap.java:199)
        at org.apache.avro.generic.GenericDatumReader.newMap(GenericDatumReader.java:330)
        at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:239)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
        at org.apache.avro.ipc.Responder.respond(Responder.java:124)
        at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.messageReceived(NettyServer.java:188)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:173)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:695)
{noformat}



The test: 

A test worker consists of a NettyAvroRpcClient shared by a thread pool of size 
12. The RPC client instance is recreated whenever isActive is false. Flume 
events with a timestamp header and a body of 250 random bytes are submitted 
continuously. Test workers are started in groups of 20. Five groups are started 
in total, with a 5 second delay between group starts.
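
For anyone trying to reproduce this, the shape of one test worker is roughly 
the sketch below. It is not the actual test code: EventSender is a hypothetical 
stand-in for NettyAvroRpcClient so that the sketch runs without the Flume SDK, 
the "timestamp" header key is an assumption, and the loop is bounded instead 
of running continuously. Starting the workers in groups of 20 is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

final class LoadTestWorker {
    /** Hypothetical stand-in for the RPC client (NettyAvroRpcClient in the real test). */
    interface EventSender {
        boolean isActive();
        void send(Map<String, String> headers, byte[] body) throws Exception;
    }

    static final int THREADS = 12;     // thread pool size from the description
    static final int BODY_BYTES = 250; // event body size from the description

    private final Supplier<EventSender> factory;
    private EventSender sender;

    LoadTestWorker(Supplier<EventSender> factory) {
        this.factory = factory;
        this.sender = factory.get();
    }

    /** Returns the shared sender, replacing it first if it is no longer active. */
    private synchronized EventSender activeSender() {
        if (!sender.isActive()) {
            sender = factory.get();
        }
        return sender;
    }

    /** Sends eventsPerThread events from each pool thread; returns how many succeeded. */
    long run(int eventsPerThread) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        AtomicLong sent = new AtomicLong();
        for (int t = 0; t < THREADS; t++) {
            pool.execute(() -> {
                Random rnd = new Random();
                for (int i = 0; i < eventsPerThread; i++) {
                    Map<String, String> headers = new HashMap<>();
                    headers.put("timestamp", Long.toString(System.currentTimeMillis()));
                    byte[] body = new byte[BODY_BYTES];
                    rnd.nextBytes(body);
                    try {
                        activeSender().send(headers, body);
                        sent.incrementAndGet();
                    } catch (Exception e) {
                        // Failed sends are dropped; the next iteration re-checks isActive().
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return sent.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // A do-nothing sender so the sketch runs without a Flume agent.
        EventSender noop = new EventSender() {
            public boolean isActive() { return true; }
            public void send(Map<String, String> headers, byte[] body) { }
        };
        System.out.println(new LoadTestWorker(() -> noop).run(100)); // prints 1200
    }
}
```

The point of the sketch is that all 12 threads write through the same client 
instance, which is the condition under which the error shows up.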

Usually, after the first group of 20, we see the OOM error in the agent.

I took the avro-1.8.0-SNAPSHOT source and added debug logging to the newMap 
method to see the requested allocation size:

https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumReader.java#L405-L411

In most cases the size was 1, but once the OOM errors start happening, the 
size is always 640371331.
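
To put that number in perspective, here is a rough estimate of what a HashMap 
asked for 640371331 entries would need for its table alone. This is a sketch 
under assumptions, not a measurement: it assumes the constructor rounds the 
requested capacity up to a power of two (capped at 1 << 30) and allocates the 
table up front, which would be consistent with the error being raised inside 
HashMap.<init>, and it assumes 4 or 8 bytes per reference depending on the JVM. 
MapAllocationEstimate is a made-up name.

```java
final class MapAllocationEstimate {
    // Upper bound HashMap applies to its internal table size.
    static final int MAX_CAPACITY = 1 << 30;

    /** Rounds a requested capacity up to the next power of two, capped at MAX_CAPACITY. */
    static int tableCapacity(int requested) {
        if (requested >= MAX_CAPACITY) {
            return MAX_CAPACITY;
        }
        int capacity = 1;
        while (capacity < requested) {
            capacity <<= 1;
        }
        return capacity;
    }

    /** Bytes needed for the reference array only, ignoring object headers and entries. */
    static long tableBytes(int requested, int bytesPerReference) {
        return (long) tableCapacity(requested) * bytesPerReference;
    }

    public static void main(String[] args) {
        int size = 640371331; // value seen in the newMap debug logging
        System.out.println(tableCapacity(size)); // prints 1073741824
        System.out.println(tableBytes(size, 4)); // prints 4294967296
        System.out.println(tableBytes(size, 8)); // prints 8589934592
    }
}
```

Under those assumptions a single such request needs at least 4 GiB of heap 
for the empty table, so one bad size value is enough to trigger the error.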

This seems to be related to:

AVRO-1111
FLUME-1259
FLUME-1641
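
As background on where such a size can come from: in the Avro binary encoding, 
a map is written as a series of blocks, each starting with a count encoded as 
a variable-length zig-zag long. The sketch below encodes and decodes that 
format with plain Java. It is an illustration of the encoding only, not Avro's 
own decoder, and ZigZagVarint is a made-up name. It assumes the input holds one 
complete, well-formed value.

```java
import java.io.ByteArrayOutputStream;

final class ZigZagVarint {
    /** Encodes a long as zig-zag, then as a little-endian base-128 varint. */
    static byte[] encode(long n) {
        long z = (n << 1) ^ (n >> 63); // zig-zag: small magnitudes become small values
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // low 7 bits plus continuation bit
            z >>>= 7;
        }
        out.write((int) z);
        return out.toByteArray();
    }

    /** Decodes one value from the start of buf (assumes a well-formed varint). */
    static long decode(byte[] buf) {
        long z = 0;
        int shift = 0;
        for (byte b : buf) {
            z |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // last byte of this value
            }
            shift += 7;
        }
        return (z >>> 1) ^ -(z & 1); // undo zig-zag
    }

    public static void main(String[] args) {
        System.out.println(encode(1L).length);          // prints 1
        System.out.println(encode(640371331L).length);  // prints 5
        System.out.println(decode(encode(640371331L))); // prints 640371331
    }
}
```

A count of 1 takes one byte on the wire, while 640371331 takes five, so any 
five bytes read at the wrong position in the stream can decode to a count of 
this magnitude. Whether that is what happens here is not confirmed.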

  was:
The flume agent throws an OutOfMemoryError during load tests.

{noformat}
2015-06-29 15:30:24,590 (New I/O worker #4) [WARN - org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.exceptionCaught(NettyServer.java:201)] Unexpected exception from downstream.
java.lang.OutOfMemoryError: Java heap space
        at java.util.HashMap.<init>(HashMap.java:187)
        at java.util.HashMap.<init>(HashMap.java:199)
        at org.apache.avro.generic.GenericDatumReader.newMap(GenericDatumReader.java:330)
        at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:239)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
        at org.apache.avro.ipc.Responder.respond(Responder.java:124)
        at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.messageReceived(NettyServer.java:188)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:173)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:695)
{noformat}



The test: 

A test worker consists of a NettyAvroRpcClient shared by a thread pool of size 
12. Flume events with a timestamp header and a body of 250 random bytes are 
submitted continuously. Test workers are started in groups of 20. 5 groups are 
started in total with 5 second delays between starts.

Usually, after the first group of 20, we see the OOM error in the agent.

Got the avro-1.8.0-SNAPSHOT source, and added debug logging in the newMap 
method to see the size of allocation:

https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumReader.java#L405-L411

And found that in most cases the size was 1, but when the OOM errors start 
happening, the size is always 640371331.

Seems to be related to

AVRO-1111
FLUME-1259
FLUME-1641


> Flume Agent throws OutOfMemoryError during load tests.
> ------------------------------------------------------
>
>                 Key: FLUME-2731
>                 URL: https://issues.apache.org/jira/browse/FLUME-2731
>             Project: Flume
>          Issue Type: Bug
>          Components: Node
>    Affects Versions: v1.6.0
>         Environment: Flume Agent : 1.6.0
> OS : Mac OS X 10.7.5 and CentOS release 6.6 2.6.32-504.1.3.el6.x86_64
> avro : 1.7.4
> avro-ipc : 1.7.4
> JDK: 1.6.0_65 and 1.7.0-45
> Flume Client - NettyAvroRpcClient
> flume-ng-sdk : 1.6.0
> OS : Mac OS X 10.7.5
> avro : 1.7.4
> avro-ipc : 1.7.4
> JDK: 1.6.0_65
> The agent config:
> {noformat}
> # Define a memory channel called ch1 on agent1
> agent1.channels.ch1.type = memory
> agent1.channels.ch1.capacity = 10000
> # Define an Avro source called avro-source1 on agent1 and tell it
> # to bind to 0.0.0.0:41414. Connect it to channel ch1.
> agent1.sources.avro-source1.channels = ch1
> agent1.sources.avro-source1.type = avro
> agent1.sources.avro-source1.bind = 0.0.0.0
> agent1.sources.avro-source1.port = 41414
> # Define a logger sink that simply logs all events it receives
> # and connect it to the other end of the same channel.
> agent1.sinks.log-sink1.channel = ch1
> #agent1.sinks.log-sink1.type = logger
> agent1.sinks.log-sink1.type = null
> agent1.sinks.log-sink1.batchSize = 10
> # Finally, now that we've defined all of our components, tell
> # agent1 which ones we want to activate.
> agent1.channels = ch1
> agent1.sources = avro-source1
> agent1.sinks = log-sink1
> {noformat}
>            Reporter: Masanobu Horiyama



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
