Hi,
I am trying to write Flume events to Amazon S3. The events written to S3 are
in compressed format. My Flume configuration is given below. I am facing data
loss: with the configuration below, if I publish 20000 events, I receive only
1000 and all the other data is lost. But when I disable the rollCount,
rollSize and rollInterval settings, all the events are received, but about
2000 small files are created (I have included that variant after the main
configuration for reference). Is there anything wrong with my configuration
settings? Should I add any other configuration?
injector.sinks.s3_3store.type = hdfs
injector.sinks.s3_3store.channel = disk_backed4
injector.sinks.s3_3store.hdfs.fileType = CompressedStream
injector.sinks.s3_3store.hdfs.codeC = gzip
injector.sinks.s3_3store.serializer = TEXT
injector.sinks.s3_3store.hdfs.path = s3n://CID:SecretKey@bucketName/dth=%Y-%m-%d-%H
injector.sinks.s3_3store.hdfs.filePrefix = events-%{receiver}
# Roll when a file reaches 256 MB; close files that have been idle for 10 minutes
injector.sinks.s3_3store.hdfs.rollCount = 0
injector.sinks.s3_3store.hdfs.idleTimeout = 600
injector.sinks.s3_3store.hdfs.rollSize = 268435456
#injector.sinks.s3_3store.hdfs.rollInterval = 3600
# Flush data to the bucket every 10k events
injector.sinks.s3_3store.hdfs.batchSize = 10000
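
For reference, this is roughly the variant I meant by "disabling" the roll
settings: rollCount, rollSize and rollInterval are all set to 0 (which turns
off count-, size- and time-based rolling in the HDFS sink), with everything
else unchanged:

injector.sinks.s3_3store.type = hdfs
injector.sinks.s3_3store.channel = disk_backed4
injector.sinks.s3_3store.hdfs.fileType = CompressedStream
injector.sinks.s3_3store.hdfs.codeC = gzip
injector.sinks.s3_3store.hdfs.path = s3n://CID:SecretKey@bucketName/dth=%Y-%m-%d-%H
injector.sinks.s3_3store.hdfs.filePrefix = events-%{receiver}
# 0 disables count-, size- and time-based rolling entirely
injector.sinks.s3_3store.hdfs.rollCount = 0
injector.sinks.s3_3store.hdfs.rollSize = 0
injector.sinks.s3_3store.hdfs.rollInterval = 0
# Files still close after 10 minutes of inactivity
injector.sinks.s3_3store.hdfs.idleTimeout = 600
injector.sinks.s3_3store.hdfs.batchSize = 10000

With this variant no events are lost, but the output ends up split across many
small files.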
Thanks
Bala