[ https://issues.apache.org/jira/browse/FLUME-3185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200265#comment-16200265 ]
Santhosh Chandrasekaran commented on FLUME-3185:
------------------------------------------------

We tried running the File Channel Integrity tool. It fixes the corrupt event, but the corruption recurs frequently in PRODUCTION. The file with the corrupt event removed and the original file (.bak) are the same size.

/apps/osp/apache-flume-1.7.0/bin/flume-ng tool FCINTEGRITYTOOL -l /appsdata/osp/elastic/datastore/flume/esu1l772/ems/d2cpaf_prod23/data/chn-file-xml-dis-log/ -Xms2g -Xmx2g
Warning: No configuration directory set! Use --conf <dir> to override.
Info: Including Hive libraries found via () for Hive access
FLUME_CLASSPATH = /apps/osp/apache-flume-1.7.0/plugins.d/flume_processors/libext/*:/apps/osp/apache-flume-1.7.0/plugins.d/flume_processors/lib/*:/apps/osp/apache-flume-1.7.0/lib/*:/lib/*
+ exec /apps/osp/jdk1.8.0_131/bin/java -Xmx20m -Xms2g -Xmx2g -cp '/apps/osp/apache-flume-1.7.0/plugins.d/flume_processors/libext/*:/apps/osp/apache-flume-1.7.0/plugins.d/flume_processors/lib/*:/apps/osp/apache-flume-1.7.0/lib/*:/lib/*' -Djava.library.path= org.apache.flume.tools.FlumeToolsMain FCINTEGRITYTOOL -l /appsdata/osp/elastic/datastore/flume/esu1l772/ems/d2cpaf_prod23/data/chn-file-xml-dis-log/
log4j:WARN No appenders could be found for logger (org.apache.flume.tools.FileChannelIntegrityTool).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
---------- Summary --------------------
Number of Events in the Channel = 14407
Number of Put Events Processed = 7280
Number of Valid Put Events = 7280
Number of Invalid Put Events = 0
Number of Put Events that threw Exception during validation = 0
Number of Corrupt Events = 1
---------------------------------------

FC Integrity tool logs (offset is 0):

10 Oct 2017 20:03:39,225 WARN [main] (org.apache.flume.tools.FileChannelIntegrityTool.run:136) - Corruption found in /appsdata/osp/elastic/datastore/flume/esu1l772/ems/prod_corr/log-98 at 0
10 Oct 2017 20:03:39,507 INFO [main] (org.apache.flume.channel.file.LogFile$OperationRecordUpdater.markRecordAsNoop:437) - Marking event as 0 at 0 for file /appsdata/osp/elastic/datastore/flume/esu1l772/ems/prod_corr/log-98
10 Oct 2017 20:03:39,511 INFO [main] (org.apache.flume.channel.file.LogFile$SequentialReader.next:683) - Encountered EOF at 86589353 in /appsdata/osp/elastic/datastore/flume/esu1l772/ems/prod_corr/log-98

> Corrupt event found. Please run File Channel Integrity tool
> -----------------------------------------------------------
>
> Key: FLUME-3185
> URL: https://issues.apache.org/jira/browse/FLUME-3185
> Project: Flume
> Issue Type: Bug
> Components: File Channel
> Affects Versions: 1.7.0
> Environment: PRODUCTION
> Reporter: Santhosh Chandrasekaran
>
> We get the below exception in PROD.
> 08 Oct 2017 15:46:09,988 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:158) - Unable to deliver event. Exception follows.
> org.apache.flume.ChannelException: Take failed due to IO error [channel=chn-file-xml-dis-log]
>     at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:534)
>     at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
>     at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
>     at com.macys.daas.flume.sink.elasticsearch.CreateIndex.process(CreateIndex.java:150)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Corrupt event found. Please run File Channel Integrity tool.
>     at org.apache.flume.channel.file.Log.get(Log.java:616)
>     at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:531)
>     ... 6 more
> Caused by: org.apache.flume.channel.file.CorruptEventException: Could not parse event from data file.
>     at org.apache.flume.channel.file.TransactionEventRecord.fromByteArray(TransactionEventRecord.java:212)
>     at org.apache.flume.channel.file.LogFileV3$RandomReader.doGet(LogFileV3.java:303)
>     at org.apache.flume.channel.file.LogFile$RandomReader.get(LogFile.java:501)
>     at org.apache.flume.channel.file.Log.get(Log.java:612)
>     ... 7 more
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
>     at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
>     at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
>     at com.google.protobuf.CodedInputStream.isAtEnd(CodedInputStream.java:701)
>     at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:99)
>     at org.apache.flume.channel.file.proto.ProtosFactory$Put.<init>(ProtosFactory.java:3979)
>     at org.apache.flume.channel.file.proto.ProtosFactory$Put.<init>(ProtosFactory.java:3943)
>     at org.apache.flume.channel.file.proto.ProtosFactory$Put$1.parsePartialFrom(ProtosFactory.java:4039)
>     at org.apache.flume.channel.file.proto.ProtosFactory$Put$1.parsePartialFrom(ProtosFactory.java:4034)
>     at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
>     at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
>     at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>     at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>     at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
>     at org.apache.flume.channel.file.proto.ProtosFactory$Put.parseDelimitedFrom(ProtosFactory.java:4179)
>     at org.apache.flume.channel.file.Put.readProtos(Put.java:97)
>     at org.apache.flume.channel.file.TransactionEventRecord.fromByteArray(TransactionEventRecord.java:206)
>     ... 10 more
> 08 Oct 2017 15:46:14,866 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:158) - Unable to deliver event. Exception follows.
> java.lang.IllegalStateException: Log is closed
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
>     at org.apache.flume.channel.file.Log.getFlumeEventQueue(Log.java:591)
>     at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.<init>(FileChannel.java:442)
>     at org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:359)
>     at org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:122)
>     at com.macys.daas.flume.sink.elasticsearch.CreateIndex.process(CreateIndex.java:136)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
>     at java.lang.Thread.run(Thread.java:748)

--
This message was sent by Atlassian JIRA (v6.4.14#64029)