[ 
https://issues.apache.org/jira/browse/FLUME-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Lin updated FLUME-2796:
---------------------------
    Description: 
Due to some error, my Flume agent has queued 185204 event messages (more than 1 
TB, about 7.7 MB per event on average) in its file channel. 

I tried restarting the Flume agent with more JVM heap space and letting the 
file channel replay, but I got the following error message:

{noformat}
java.lang.OutOfMemoryError: Java heap space
        at com.google.protobuf.ByteString.copyFrom(ByteString.java:90)
        at com.google.protobuf.ByteString.copyFrom(ByteString.java:99)
        at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:294)
        at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEvent$Builder.mergeFrom(ProtosFactory.java:5136)
        at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEvent$Builder.mergeFrom(ProtosFactory.java:4950)
        at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:275)
        at org.apache.flume.channel.file.proto.ProtosFactory$Put$Builder.mergeFrom(ProtosFactory.java:3312)
        at org.apache.flume.channel.file.proto.ProtosFactory$Put$Builder.mergeFrom(ProtosFactory.java:3164)
        at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:212)
        at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
        at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
        at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
        at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
        at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
        at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
        at org.apache.flume.channel.file.proto.ProtosFactory$Put.parseDelimitedFrom(ProtosFactory.java:3121)
        at org.apache.flume.channel.file.Put.readProtos(Put.java:86)
        at org.apache.flume.channel.file.TransactionEventRecord.fromByteArray(TransactionEventRecord.java:201)
        at org.apache.flume.channel.file.LogFileV3$SequentialReader.doNext(LogFileV3.java:344)
        at org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.java:498)
        at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:245)
        at org.apache.flume.channel.file.Log.doReplay(Log.java:435)
        at org.apache.flume.channel.file.Log.replay(Log.java:382)

{noformat}


Setting in flume-env.sh
{noformat}
JAVA_OPTS="-Xms40000m -Xmx40000m -Xss500k -XX:MaxDirectMemorySize=2000m
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:PermSize=256m -XX:MaxPermSize=512m 
-XX:-UseGCOverheadLimit"
{noformat}


Configuration for filechannel
{noformat}
a1.channels.fc1.type = file
a1.channels.fc1.dataDirs = ../../data
a1.channels.fc1.checkpointDir = ../../check
a1.channels.fc1.maxFileSize = 104857600
a1.channels.fc1.capacity = 1000000
a1.channels.fc1.transactionCapacity = 10000
{noformat}


Is it possible to tune the Flume config or environment settings to replay such a 
large amount of data?


  was:
Due to some error, my Flume agent has queued 185204 event messages (more than 1 
TB, about 7.7 MB per event on average) in its file channel. 

I tried restarting the Flume agent and letting the file channel replay, but I got 
the following error message:

{noformat}
java.lang.OutOfMemoryError: Java heap space
        at com.google.protobuf.ByteString.copyFrom(ByteString.java:90)
        at com.google.protobuf.ByteString.copyFrom(ByteString.java:99)
        at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:294)
        at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEvent$Builder.mergeFrom(ProtosFactory.java:5136)
        at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEvent$Builder.mergeFrom(ProtosFactory.java:4950)
        at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:275)
        at org.apache.flume.channel.file.proto.ProtosFactory$Put$Builder.mergeFrom(ProtosFactory.java:3312)
        at org.apache.flume.channel.file.proto.ProtosFactory$Put$Builder.mergeFrom(ProtosFactory.java:3164)
        at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:212)
        at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
        at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
        at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
        at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
        at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
        at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
        at org.apache.flume.channel.file.proto.ProtosFactory$Put.parseDelimitedFrom(ProtosFactory.java:3121)
        at org.apache.flume.channel.file.Put.readProtos(Put.java:86)
        at org.apache.flume.channel.file.TransactionEventRecord.fromByteArray(TransactionEventRecord.java:201)
        at org.apache.flume.channel.file.LogFileV3$SequentialReader.doNext(LogFileV3.java:344)
        at org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.java:498)
        at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:245)
        at org.apache.flume.channel.file.Log.doReplay(Log.java:435)
        at org.apache.flume.channel.file.Log.replay(Log.java:382)

{noformat}


Setting in flume-env.sh
{noformat}
JAVA_OPTS="-Xms40000m -Xmx40000m -Xss500k -XX:MaxDirectMemorySize=2000m
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:PermSize=256m -XX:MaxPermSize=512m 
-XX:-UseGCOverheadLimit"
{noformat}


Configuration for filechannel
{noformat}
a1.channels.fc1.type = file
a1.channels.fc1.dataDirs = ../../data
a1.channels.fc1.checkpointDir = ../../check
a1.channels.fc1.maxFileSize = 104857600
a1.channels.fc1.capacity = 1000000
a1.channels.fc1.transactionCapacity = 10000
{noformat}


Is it possible to tune the Flume config or environment settings to replay such a 
large amount of data?



> File Channel which queued more than 1TB data files got OOME when doing replay
> -----------------------------------------------------------------------------
>
>                 Key: FLUME-2796
>                 URL: https://issues.apache.org/jira/browse/FLUME-2796
>             Project: Flume
>          Issue Type: Question
>          Components: File Channel
>    Affects Versions: v1.5.2
>         Environment: CDH 5.3
> CentOS
>            Reporter: Max Lin
>            Priority: Blocker
>
> Due to some error, my Flume agent has queued 185204 event messages (more than 
> 1 TB, about 7.7 MB per event on average) in its file channel. 
> I tried restarting the Flume agent with more JVM heap space and letting the 
> file channel replay, but I got the following error message:
> {noformat}
> java.lang.OutOfMemoryError: Java heap space
>         at com.google.protobuf.ByteString.copyFrom(ByteString.java:90)
>         at com.google.protobuf.ByteString.copyFrom(ByteString.java:99)
>         at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:294)
>         at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEvent$Builder.mergeFrom(ProtosFactory.java:5136)
>         at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEvent$Builder.mergeFrom(ProtosFactory.java:4950)
>         at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:275)
>         at org.apache.flume.channel.file.proto.ProtosFactory$Put$Builder.mergeFrom(ProtosFactory.java:3312)
>         at org.apache.flume.channel.file.proto.ProtosFactory$Put$Builder.mergeFrom(ProtosFactory.java:3164)
>         at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:212)
>         at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
>         at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
>         at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
>         at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
>         at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
>         at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
>         at org.apache.flume.channel.file.proto.ProtosFactory$Put.parseDelimitedFrom(ProtosFactory.java:3121)
>         at org.apache.flume.channel.file.Put.readProtos(Put.java:86)
>         at org.apache.flume.channel.file.TransactionEventRecord.fromByteArray(TransactionEventRecord.java:201)
>         at org.apache.flume.channel.file.LogFileV3$SequentialReader.doNext(LogFileV3.java:344)
>         at org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.java:498)
>         at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:245)
>         at org.apache.flume.channel.file.Log.doReplay(Log.java:435)
>         at org.apache.flume.channel.file.Log.replay(Log.java:382)
> {noformat}
> Setting in flume-env.sh
> {noformat}
> JAVA_OPTS="-Xms40000m -Xmx40000m -Xss500k -XX:MaxDirectMemorySize=2000m
> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:PermSize=256m 
> -XX:MaxPermSize=512m -XX:-UseGCOverheadLimit"
> {noformat}
> Configuration for filechannel
> {noformat}
> a1.channels.fc1.type = file
> a1.channels.fc1.dataDirs = ../../data
> a1.channels.fc1.checkpointDir = ../../check
> a1.channels.fc1.maxFileSize = 104857600
> a1.channels.fc1.capacity = 1000000
> a1.channels.fc1.transactionCapacity = 10000
> {noformat}
> Is it possible to tune the Flume config or environment settings to replay such 
> a large amount of data?
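
For a sense of scale, the figures quoted in the report can be put side by side. The sketch below is only back-of-the-envelope arithmetic on the reported numbers (185204 events, about 7.7 MB each, -Xmx40000m); it assumes decimal megabytes, and the function names are made up for illustration. It says nothing about how much memory replay actually needs, which is not established by the report.

```python
# Back-of-the-envelope sizing using only numbers quoted in the report.
# Assumption: "MB" is treated as decimal megabytes throughout.

QUEUED_EVENTS = 185_204   # events reported as queued in the file channel
AVG_EVENT_MB = 7.7        # reported average event size
HEAP_MB = 40_000          # from -Xms40000m / -Xmx40000m in flume-env.sh


def backlog_mb(events: int, avg_event_mb: float) -> float:
    """Total size of the queued backlog in MB."""
    return events * avg_event_mb


def events_fitting_in_heap(heap_mb: float, avg_event_mb: float) -> int:
    """How many average-sized event bodies would fit in the heap at once."""
    return int(heap_mb // avg_event_mb)


if __name__ == "__main__":
    total = backlog_mb(QUEUED_EVENTS, AVG_EVENT_MB)
    fit = events_fitting_in_heap(HEAP_MB, AVG_EVENT_MB)
    print(f"backlog: {total / 1_000_000:.2f} TB")
    print(f"average-sized events that fit in heap: {fit}")
    print(f"heap as a share of backlog: {HEAP_MB / total:.1%}")
```

With the reported figures this comes to roughly 1.43 TB of backlog against a heap that could hold about 5194 average-sized event bodies at once, i.e. under 3% of the queued data.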



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
