Hi there,
I'm testing flume with thrift source, file channel, and HDFS sink.
There is also a Flume client that sends events over Thrift; each event is up to 30 MB.
It works fine for a short period, but after a few minutes the following error occurs at org.apache.flume.channel.ChannelProcessor:
| 8:15:36.450 PM | ERROR | org.apache.flume.channel.ChannelProcessor |
I increased the direct memory size up to 2 GB, but it didn't help.
Here's my Flume configuration:
#source
tier1.sources.s1.type = thrift
tier1.sources.s1.bind = 0.0.0.0
tier1.sources.s1.port = 30010
tier1.sources.s1.channels = c0 c1 memdefault
tier1.sources.s1.selector.type = multiplexing
tier1.sources.s1.selector.header = category
tier1.sources.s1.selector.mapping.Log4j = c0
tier1.sources.s1.selector.mapping.Data = c1
tier1.sources.s1.selector.default = memdefault
#channel
tier1.channels.c1.type = file
tier1.channels.c1.checkpointDir=/data/2/flumechannel/checkpoint
tier1.channels.c1.dataDirs=/data/2/flumechannel/data
tier1.channels.c1.transactionCapacity = 1
tier1.channels.c1.maxFileSize = 500000000
#sink
tier1.sinks.k1.type = hdfs
tier1.sinks.k1.channel = c1
tier1.sinks.k1.hdfs.path = /user/soul
tier1.sinks.k1.hdfs.round = false
tier1.sinks.k1.hdfs.fileType = DataStream
tier1.sinks.k1.hdfs.rollCount = 1
tier1.sinks.k1.hdfs.batchSize = 1
tier1.sinks.k1.hdfs.retryInterval = 10
tier1.sinks.k1.hdfs.proxyUser = soul
tier1.sinks.k1.hdfs.maxOpenFiles = 10
tier1.sinks.k1.hdfs.idleTimeout = 900
and the Java options: -Xmx2g -XX:MaxDirectMemorySize=2g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
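For reference, this is how I pass those options to the agent; I'm assuming the standard conf/flume-env.sh script is picked up by my startup scripts:

```shell
# conf/flume-env.sh (assumed location; sourced by the flume-ng startup script)
# JVM heap, direct memory cap, and CMS collector settings for the agent
export JAVA_OPTS="-Xmx2g -XX:MaxDirectMemorySize=2g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC"
```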
When I use a memory channel instead of the file channel, it works great.
I can't understand this behavior.
The only clue I have is that the exception always occurs right after "org.apache.flume.channel.file.Log: Roll end".
I'd appreciate any comments.
Thank you.
