[
https://issues.apache.org/jira/browse/FLUME-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819026#comment-13819026
]
dave sinclair commented on FLUME-1326:
--------------------------------------
I have determined the cause of the memory leak, at least for my usage, to be
the WriterLinkedHashMap sfWriters. Depending on how your path and filename are
specified, this map can keep growing until the process hits the maximum number
of open files or runs out of memory. I worked around it by setting an
idleTimeout so that idle writers are retired (lowering the maximum number of
open files could also help).
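
As a rough illustration of that workaround, a sink configuration along these
lines should retire idle writers. The agent name (agent1), sink name
(hdfsSink), and the values are placeholders chosen for the example, not the
settings I actually ran with; hdfs.idleTimeout is in seconds (0 disables it)
and hdfs.maxOpenFiles caps the number of writers kept in the map.

```properties
# Hypothetical agent and sink names; adjust to your own configuration.
agent1.sinks.hdfsSink.type = hdfs
# Close a writer after 120 seconds without writes (example value; 0 = never).
agent1.sinks.hdfsSink.hdfs.idleTimeout = 120
# Optionally lower the cap on simultaneously open files (example value).
agent1.sinks.hdfsSink.hdfs.maxOpenFiles = 500
```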
My path was creating a new set of writers every hour, and like clockwork the
heap would increase on the hour and never release that memory. A heap dump and
subsequent inspection led me to the map of writers.
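
The growth pattern can be reproduced with a small stand-alone sketch. This is
not Flume's code: it assumes the writer map behaves like an access-ordered
LinkedHashMap that evicts its eldest entry only once a size cap is exceeded,
and the class names, path scheme, and simulate() helper are all made up for
illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WriterMapSketch {

    /** Stand-in for an open writer; hypothetical, not Flume's BucketWriter. */
    static final class Writer {
        final long lastWriteMillis;

        Writer(long lastWriteMillis) {
            this.lastWriteMillis = lastWriteMillis;
        }
    }

    /** Access-ordered map that drops its eldest entry only past maxOpenFiles. */
    static final class BoundedWriters extends LinkedHashMap<String, Writer> {
        private final int maxOpenFiles;

        BoundedWriters(int maxOpenFiles) {
            super(16, 0.75f, true);
            this.maxOpenFiles = maxOpenFiles;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, Writer> eldest) {
            return size() > maxOpenFiles;
        }
    }

    /**
     * Simulates hourly path rollover: each hour, every source writes to a new
     * path, which creates a new map entry. An idleTimeoutMillis of 0 or less
     * disables idle retirement. Returns the number of writers still held.
     */
    public static int simulate(int hours, int sourcesPerHour,
                               int maxOpenFiles, long idleTimeoutMillis) {
        BoundedWriters writers = new BoundedWriters(maxOpenFiles);
        final long hourMillis = 3_600_000L;
        for (int h = 0; h < hours; h++) {
            final long now = h * hourMillis;
            if (idleTimeoutMillis > 0) {
                // Retire writers that have not been written to recently.
                writers.values().removeIf(
                        w -> now - w.lastWriteMillis >= idleTimeoutMillis);
            }
            for (int s = 0; s < sourcesPerHour; s++) {
                writers.put("host" + s + "/hour-" + h, new Writer(now));
            }
        }
        return writers.size();
    }

    public static void main(String[] args) {
        // No idle timeout, large cap: every hourly writer stays referenced.
        System.out.println(simulate(40, 3, 5000, 0));
        // With a 60 second idle timeout, only the current hour's writers remain.
        System.out.println(simulate(40, 3, 5000, 60_000L));
    }
}
```

With a large cap and no idle timeout, nothing is ever removed, so the map (and
whatever each writer holds on to) grows by one set of entries per hour, which
matches the hourly heap steps described above.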
Let me know if additional details are needed.
dave
> OutOfMemoryError in HDFSSink
> ----------------------------
>
> Key: FLUME-1326
> URL: https://issues.apache.org/jira/browse/FLUME-1326
> Project: Flume
> Issue Type: Bug
> Affects Versions: v1.2.0
> Reporter: Juhani Connolly
> Priority: Critical
>
> We run a 3 node/1 collector test cluster pushing about 350 events/sec per
> node... Not really high stress, but just something to evaluate flume with.
> Consistently our collector has been dying because of an OOMError killing the
> SinkRunner after running for about 30-40 hours (seems pretty consistent as
> we've had it 3 times now).
> Suspected cause would be a memory leak somewhere in HdfsSink. The feeder
> nodes which run AvroSink instead of HdfsSink have been up and running for
> about a week without restarts.
> flume-load/act-wap02/2012-06-26-17.1340697637324.tmp, packetSize=65557,
> chunksPerPacket=127, bytesCurBlock=29731328
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 2012-06-26 17:12:56,080 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:411)] process failed
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.Arrays.copyOfRange(Arrays.java:3209)
>     at java.lang.String.<init>(String.java:215)
>     at java.lang.StringBuilder.toString(StringBuilder.java:430)
>     at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:306)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:367)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:619)
> Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at java.util.Arrays.copyOfRange(Arrays.java:3209)
>     at java.lang.String.<init>(String.java:215)
>     at java.lang.StringBuilder.toString(StringBuilder.java:430)
>     at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:306)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:367)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:619)