Did the default channel transactionCapacity change from 1.2 to 1.3? The channel capacity still looks like the 1-million-event default according to the metrics:
CHANNEL.fc_WebLogs: {
    "EventPutSuccessCount": "0",
    "ChannelFillPercentage": "99.994",
    "Type": "CHANNEL",
    "StopTime": "0",
    "EventPutAttemptCount": "0",
    "ChannelSize": "999940",
    "StartTime": "1362096361779",
    "EventTakeSuccessCount": "0",
    "ChannelCapacity": "1000000",
    "EventTakeAttemptCount": "22022"
}
________________________________
From: Hari Shreedharan [mailto:[email protected]]
Sent: Thursday, February 28, 2013 4:07 PM
To: [email protected]
Subject: Re: Take list full error after 1.3 upgrade
You need to increase the transactionCapacity of the channel to at least the batchSize of the HDFS sink. In your case, the channel's transactionCapacity is 1000 while your HDFS sink's batchSize is 10000.
--
Hari Shreedharan
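For example, assuming the file channel is declared under the name fc_WebLogs on the tier2 agent (a hypothetical declaration inferred from the sink's channel setting below), the fix would be a one-line change along these lines:

```properties
# Raise the channel's per-transaction limit to at least the sink's batchSize (10000),
# so a single HDFS sink take transaction can hold a full batch of events.
tier2.channels.fc_WebLogs.transactionCapacity = 10000
```

The agent would need a restart after the change so the channel is reopened with the new transaction capacity.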
On Thursday, February 28, 2013 at 4:00 PM, Paul Chavez wrote:
I have a 2-tier flume setup, with 4 agents feeding into 2
'collector' agents that write to HDFS.
One of the data flows is hung up after an upgrade and restart with the
following error:
3:54:13.497 PM ERROR org.apache.flume.sink.hdfs.HDFSEventSink process failed
org.apache.flume.ChannelException: Take list for FileBackedTransaction, capacity 1000 full, consider committing more frequently, increasing capacity, or increasing thread count. [channel=fc_WebLogs]
    at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:481)
    at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
    at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:386)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
3:54:13.498 PM ERROR org.apache.flume.SinkRunner Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: org.apache.flume.ChannelException: Take list for FileBackedTransaction, capacity 1000 full, consider committing more frequently, increasing capacity, or increasing thread count. [channel=fc_WebLogs]
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:461)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.flume.ChannelException: Take list for FileBackedTransaction, capacity 1000 full, consider committing more frequently, increasing capacity, or increasing thread count. [channel=fc_WebLogs]
    at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:481)
    at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
    at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:386)
    ... 3 more
The relevant part of the config is here:
tier2.sinks.hdfs_WebLogs.type = hdfs
tier2.sinks.hdfs_WebLogs.channel = fc_WebLogs
tier2.sinks.hdfs_WebLogs.hdfs.path = /flume/WebLogs/%Y%m%d/%H%M
tier2.sinks.hdfs_WebLogs.hdfs.round = true
tier2.sinks.hdfs_WebLogs.hdfs.roundValue = 15
tier2.sinks.hdfs_WebLogs.hdfs.roundUnit = minute
tier2.sinks.hdfs_WebLogs.hdfs.rollSize = 67108864
tier2.sinks.hdfs_WebLogs.hdfs.rollCount = 0
tier2.sinks.hdfs_WebLogs.hdfs.rollInterval = 30
tier2.sinks.hdfs_WebLogs.hdfs.batchSize = 10000
tier2.sinks.hdfs_WebLogs.hdfs.fileType = DataStream
tier2.sinks.hdfs_WebLogs.hdfs.writeFormat = Text
The channel is full, and the metrics page shows many take attempts with no successes. I've been in situations before where the channel was full (usually due to lease issues on HDFS files), but I've never hit this particular error; usually an agent restart gets things going again.
Any help appreciated.
Thanks,
Paul Chavez