[ 
https://issues.apache.org/jira/browse/FLUME-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407986#comment-13407986
 ] 

thinker0 edited comment on FLUME-1326 at 7/6/12 1:46 PM:
---------------------------------------------------------

{panel}
# Give Flume more memory and pre-allocate
JAVA_OPTS="-server -Xms100m -Xmx200m"
JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
JAVA_OPTS="$JAVA_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
{panel}

{code}

2012-07-06T22:27:00.413+0900: [GC [1 CMS-initial-mark: 136575K(136576K)] 
198009K(198016K), 0.0350390 secs] [Times: user=0.03 sys=0.00, real=0.04 secs] 
2012-07-06T22:27:00.449+0900: [Full GC [CMS2012-07-06T22:27:00.608+0900: 
[CMS-concurrent-mark: 0.159/0.160 secs] [Times: user=0.16 sys=0.00, real=0.16 
secs] 
 (concurrent mode failure): 136575K->136575K(136576K), 0.5808260 secs] 
198014K->197895K(198016K), [CMS Perm : 20375K->20375K(34192K)] icms_dc=100 , 
0.5809580 secs] [Times: user=0.58 sys=0.00, real=0.58 secs] 
2012-07-06T22:27:01.030+0900: [GC [1 CMS-initial-mark: 136575K(136576K)] 
197897K(198016K), 0.0354160 secs] [Times: user=0.04 sys=0.00, real=0.03 secs] 
Heap
 par new generation   total 61440K, used 61435K [0x00000000ee600000, 
0x00000000f28a0000, 0x00000000f28a0000)
  eden space 54656K, 100% used [0x00000000ee600000, 0x00000000f1b60000, 
0x00000000f1b60000)
  from space 6784K,  99% used [0x00000000f2200000, 0x00000000f289efa8, 
0x00000000f28a0000)
  to   space 6784K,   0% used [0x00000000f1b60000, 0x00000000f1b60000, 
0x00000000f2200000)
 concurrent mark-sweep generation total 136576K, used 136575K 
[0x00000000f28a0000, 0x00000000fae00000, 0x00000000fae00000)
 concurrent-mark-sweep perm gen total 34192K, used 20375K [0x00000000fae00000, 
0x00000000fcf64000, 0x0000000100000000)

2012-07-06T22:27:01.069+0900: [Full GC [CMS2012-07-06T22:27:01.226+0900: 
[CMS-concurrent-mark: 0.159/0.160 secs] [Times: user=0.16 sys=0.00, real=0.17 
secs] 
 (concurrent mode failure): 136575K->136575K(136576K), 0.5796240 secs] 
198013K->197886K(198016K), [CMS Perm : 20375K->20375K(34192K)] icms_dc=100 , 
0.5797590 secs] [Times: user=0.58 sys=0.00, real=0.58 secs] 
2012-07-06T22:27:01.650+0900: [GC [1 CMS-initial-mark: 136575K(136576K)] 
197967K(198016K), 0.0351150 secs] [Times: user=0.04 sys=0.00, real=0.03 secs] 
2012-07-06T22:27:01.685+0900: [Full GC [CMS2012-07-06T22:27:01.845+0900: 
[CMS-concurrent-mark: 0.159/0.160 secs] [Times: user=0.16 sys=0.00, real=0.16 
secs] 
 (concurrent mode failure): 136575K->136575K(136576K), 0.6167080 secs] 
198008K->197894K(198016K), [CMS Perm : 20375K->20375K(34192K)] icms_dc=100 , 
0.6170540 secs] [Times: user=0.62 sys=0.00, real=0.62 secs] 
2012-07-06T22:27:02.303+0900: [GC [1 CMS-initial-mark: 136575K(136576K)] 
197918K(198016K), 0.0352080 secs] [Times: user=0.03 sys=0.00, real=0.04 secs] 
2012-07-06T22:27:02.339+0900: [Full GC [CMS2012-07-06T22:27:02.498+0900: 
[CMS-concurrent-mark: 0.159/0.160 secs] [Times: user=0.16 sys=0.00, real=0.16 
secs] 
 (concurrent mode failure): 136575K->136575K(136576K), 0.5811630 secs] 
198014K->197913K(198016K), [CMS Perm : 20375K->20375K(34192K)] icms_dc=100 , 
0.5812980 secs] [Times: user=0.58 sys=0.00, real=0.58 secs] 

....



2012-07-06T22:45:21.899+0900: [GC [1 CMS-initial-mark: 175551K(175552K)] 
198883K(201920K), 0.0150690 secs] [Times: user=0.01 sys=0.00, real=0.02 secs] 
Heap
 par new generation   total 26368K, used 25104K [0x00000000ee600000, 
0x00000000f0290000, 0x00000000f0290000)
  eden space 23488K, 100% used [0x00000000ee600000, 0x00000000efcf0000, 
0x00000000efcf0000)
  from space 2880K,  56% used [0x00000000effc0000, 0x00000000f01543d0, 
0x00000000f0290000)
  to   space 2880K,   0% used [0x00000000efcf0000, 0x00000000efcf0000, 
0x00000000effc0000)
 concurrent mark-sweep generation total 175552K, used 175551K 
[0x00000000f0290000, 0x00000000fae00000, 0x00000000fae00000)
 concurrent-mark-sweep perm gen total 34104K, used 20352K [0x00000000fae00000, 
0x00000000fcf4e000, 0x0000000100000000)

2012-07-06T22:45:21.923+0900: [Full GC [CMS2012-07-06T22:45:22.120+0900: 
[CMS-concurrent-mark: 0.205/0.206 secs] [Times: user=0.22 sys=0.01, real=0.20 
secs] 
 (concurrent mode failure): 175551K->175551K(175552K), 0.7612190 secs] 
201913K->198933K(201920K), [CMS Perm : 20352K->20352K(34104K)] icms_dc=100 , 
0.7613540 secs] [Times: user=0.76 sys=0.00, real=0.77 secs] 
2012-07-06T22:45:22.685+0900: [GC [1 CMS-initial-mark: 175551K(175552K)] 
199035K(201920K), 0.0151000 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
2012-07-06T22:45:22.708+0900: [Full GC [CMS2012-07-06T22:45:22.906+0900: 
[CMS-concurrent-mark: 0.204/0.205 secs] [Times: user=0.23 sys=0.00, real=0.21 
secs] 
 (concurrent mode failure): 175551K->175551K(175552K), 0.7694170 secs] 
201919K->199083K(201920K), [CMS Perm : 20352K->20352K(34104K)] icms_dc=100 , 
0.7695470 secs] [Times: user=0.77 sys=0.00, real=0.77 secs] 
2012-07-06T22:45:23.483+0900: [GC [1 CMS-initial-mark: 175551K(175552K)] 
200691K(201920K), 0.0157280 secs] [Times: user=0.02 sys=0.00, real=0.02 secs] 
2012-07-06T22:45:23.502+0900: [Full GC [CMS2012-07-06T22:45:23.716+0900: 
[CMS-concurrent-mark: 0.216/0.217 secs] [Times: user=0.22 sys=0.00, real=0.22 
secs] 
 (concurrent mode failure): 175552K->175551K(175552K), 0.7632440 secs] 
201920K->199203K(201920K), [CMS Perm : 20352K->20352K(34104K)] icms_dc=100 , 
0.7633760 secs] [Times: user=0.76 sys=0.00, real=0.77 secs] 
2012-07-06T22:45:24.268+0900: [GC [1 CMS-initial-mark: 175551K(175552K)] 
199625K(201920K), 0.0156730 secs] [Times: user=0.02 sys=0.00, real=0.01 secs] 
2012-07-06T22:45:24.295+0900: [Full GC [CMS2012-07-06T22:45:24.490+0900: 
[CMS-concurrent-mark: 0.205/0.206 secs] [Times: user=0.22 sys=0.00, real=0.21 
secs] 
 (concurrent mode failure): 175551K->175551K(175552K), 0.7598450 secs] 
201919K->199192K(201920K), [CMS Perm : 20352K->20352K(34104K)] icms_dc=100 , 
0.7599740 secs] [Times: user=0.76 sys=0.00, real=0.76 secs] 
2012-07-06T22:45:25.058+0900: [GC [1 CMS-initial-mark: 175551K(175552K)] 
199986K(201920K), 0.0155300 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
2012-07-06T22:45:25.080+0900: [Full GC [CMS2012-07-06T22:45:25.279+0900: 
[CMS-concurrent-mark: 0.205/0.205 secs] [Times: user=0.22 sys=0.00, real=0.21 
secs] 
 (concurrent mode failure): 175551K->175551K(175552K), 0.7627070 secs] 
201913K->199371K(201920K), [CMS Perm : 20352K->20352K(34104K)] icms_dc=100 , 
0.7628390 secs] [Times: user=0.77 sys=0.00, real=0.76 secs] 
2012-07-06T22:45:25.844+0900: [GC [1 CMS-initial-mark: 175551K(175552K)] 
199571K(201920K), 0.0154090 secs] [Times: user=0.01 sys=0.00, real=0.02 secs] 
2012-07-06T22:45:25.868+0900: [Full GC [CMS2012-07-06T22:45:26.065+0900: 
[CMS-concurrent-mark: 0.205/0.205 secs] [Times: user=0.22 sys=0.01, real=0.21 
secs] 
 (concurrent mode failure): 175551K->175551K(175552K), 0.7624890 secs] 
201912K->199463K(201920K), [CMS Perm : 20352K->20352K(34104K)] icms_dc=100 , 
0.7626150 secs] [Times: user=0.77 sys=0.00, real=0.76 secs] 
{code}
                
      was (Author: thinker0):
    {code}

2012-07-06T22:27:00.413+0900: [GC [1 CMS-initial-mark: 136575K(136576K)] 
198009K(198016K), 0.0350390 secs] [Times: user=0.03 sys=0.00, real=0.04 secs] 
2012-07-06T22:27:00.449+0900: [Full GC [CMS2012-07-06T22:27:00.608+0900: 
[CMS-concurrent-mark: 0.159/0.160 secs] [Times: user=0.16 sys=0.00, real=0.16 
secs] 
 (concurrent mode failure): 136575K->136575K(136576K), 0.5808260 secs] 
198014K->197895K(198016K), [CMS Perm : 20375K->20375K(34192K)] icms_dc=100 , 
0.5809580 secs] [Times: user=0.58 sys=0.00, real=0.58 secs] 
2012-07-06T22:27:01.030+0900: [GC [1 CMS-initial-mark: 136575K(136576K)] 
197897K(198016K), 0.0354160 secs] [Times: user=0.04 sys=0.00, real=0.03 secs] 
Heap
 par new generation   total 61440K, used 61435K [0x00000000ee600000, 
0x00000000f28a0000, 0x00000000f28a0000)
  eden space 54656K, 100% used [0x00000000ee600000, 0x00000000f1b60000, 
0x00000000f1b60000)
  from space 6784K,  99% used [0x00000000f2200000, 0x00000000f289efa8, 
0x00000000f28a0000)
  to   space 6784K,   0% used [0x00000000f1b60000, 0x00000000f1b60000, 
0x00000000f2200000)
 concurrent mark-sweep generation total 136576K, used 136575K 
[0x00000000f28a0000, 0x00000000fae00000, 0x00000000fae00000)
 concurrent-mark-sweep perm gen total 34192K, used 20375K [0x00000000fae00000, 
0x00000000fcf64000, 0x0000000100000000)

2012-07-06T22:27:01.069+0900: [Full GC [CMS2012-07-06T22:27:01.226+0900: 
[CMS-concurrent-mark: 0.159/0.160 secs] [Times: user=0.16 sys=0.00, real=0.17 
secs] 
 (concurrent mode failure): 136575K->136575K(136576K), 0.5796240 secs] 
198013K->197886K(198016K), [CMS Perm : 20375K->20375K(34192K)] icms_dc=100 , 
0.5797590 secs] [Times: user=0.58 sys=0.00, real=0.58 secs] 
2012-07-06T22:27:01.650+0900: [GC [1 CMS-initial-mark: 136575K(136576K)] 
197967K(198016K), 0.0351150 secs] [Times: user=0.04 sys=0.00, real=0.03 secs] 
2012-07-06T22:27:01.685+0900: [Full GC [CMS2012-07-06T22:27:01.845+0900: 
[CMS-concurrent-mark: 0.159/0.160 secs] [Times: user=0.16 sys=0.00, real=0.16 
secs] 
 (concurrent mode failure): 136575K->136575K(136576K), 0.6167080 secs] 
198008K->197894K(198016K), [CMS Perm : 20375K->20375K(34192K)] icms_dc=100 , 
0.6170540 secs] [Times: user=0.62 sys=0.00, real=0.62 secs] 
2012-07-06T22:27:02.303+0900: [GC [1 CMS-initial-mark: 136575K(136576K)] 
197918K(198016K), 0.0352080 secs] [Times: user=0.03 sys=0.00, real=0.04 secs] 
2012-07-06T22:27:02.339+0900: [Full GC [CMS2012-07-06T22:27:02.498+0900: 
[CMS-concurrent-mark: 0.159/0.160 secs] [Times: user=0.16 sys=0.00, real=0.16 
secs] 
 (concurrent mode failure): 136575K->136575K(136576K), 0.5811630 secs] 
198014K->197913K(198016K), [CMS Perm : 20375K->20375K(34192K)] icms_dc=100 , 
0.5812980 secs] [Times: user=0.58 sys=0.00, real=0.58 secs] {code}
                  
> OutOfMemoryError in HDFSSink
> ----------------------------
>
>                 Key: FLUME-1326
>                 URL: https://issues.apache.org/jira/browse/FLUME-1326
>             Project: Flume
>          Issue Type: Bug
>    Affects Versions: v1.2.0
>            Reporter: Juhani Connolly
>            Priority: Critical
>
> We run a 3 node/1 collector test cluster pushing about 350events/sec per 
> node... Not really high stress, but just something to evaluate flume with.
> Consistently our collector has been dying because of an OOMError killing the 
> SinkRunner after running for about 30-40 hours(seems pretty consistent as 
> we've had it 3 times now). 
> Suspected cause would be a memory leak somewhere in HdfsSink. The feeder 
> nodes which run AvroSink instead of HdfsSink have been up and running for 
> about a week without restarts.
> flume-load/act-wap02/2012-06-26-17.1340697637324.tmp, packetSize=65557, 
> chunksPerPacket=127, bytesCurBlock=29731328
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 2012-06-26 17:12:56,080 (SinkRunner-PollingRunner-DefaultSinkProcessor) 
> [ERROR - 
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:411)] 
> process failed
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>         at java.util.Arrays.copyOfRange(Arrays.java:3209)
>         at java.lang.String.<init>(String.java:215)
>         at java.lang.StringBuilder.toString(StringBuilder.java:430)
>         at 
> org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:306)
>         at 
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:367)
>         at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:619)
> Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" 
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>         at java.util.Arrays.copyOfRange(Arrays.java:3209)
>         at java.lang.String.<init>(String.java:215)
>         at java.lang.StringBuilder.toString(StringBuilder.java:430)
>         at 
> org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:306)
>         at 
> org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:367)
>         at 
> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to