Are you getting any errors back in the flume log?
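
It may also help to put numbers on the gap before tuning anything. The snippet below is only a rough sketch: it copies the counters from the metrics quoted below into a dict and derives the backlog figures from them. The `summarize` helper and its key names are made up for illustration and are not part of Flume. The gap between take attempts and successful takes might point to empty takes or rolled-back sink transactions, which is why the sink log is worth checking, but that interpretation is an assumption on my part.

```python
# Counters copied from the "CHANNEL.s3-file-channel" metrics quoted below.
# Flume reports every value as a string, so they are converted with int().
metrics = {
    "ChannelCapacity": "15000000",
    "ChannelSize": "1749045",
    "EventPutAttemptCount": "938299",
    "EventPutSuccessCount": "938181",
    "EventTakeAttemptCount": "648801",
    "EventTakeSuccessCount": "635000",
}


def summarize(m):
    """Derive simple backlog figures from channel counters (hypothetical helper)."""
    put_ok = int(m["EventPutSuccessCount"])
    take_ok = int(m["EventTakeSuccessCount"])
    take_try = int(m["EventTakeAttemptCount"])
    return {
        # Events put since start that the sink has not yet taken successfully.
        "put_minus_take": put_ok - take_ok,
        # Fraction of successfully put events that were also taken.
        "take_to_put_ratio": take_ok / put_ok,
        # Take attempts that did not end up counted as successful.
        "unsuccessful_takes": take_try - take_ok,
        # Should agree with the reported ChannelFillPercentage.
        "fill_pct": 100.0 * int(m["ChannelSize"]) / int(m["ChannelCapacity"]),
    }


summary = summarize(metrics)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Running this against two snapshots taken a known interval apart would give the put and take rates per second, which is a more direct measure of how far the sink is falling behind.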

On Wed, Mar 5, 2014 at 10:11 AM, Nikolaos Tsipas <[email protected]> wrote:
> Hello,
>
> We are using flume's HDFS sink to store log data in Amazon S3 and we are
> facing some throughput issues. In our flume config we have an avro source, a
> file channel and the hdfs sink. The file channel is configured on a
> provisioned IOPS EBS volume and we are running on an m1.large EC2 instance
> (flume 1.4.0, java 1.7.0).
>
> Below you will find an example metric from our s3-file-channel. The main
> issue is that the "EventTakeSuccessCount" can't cope with the
> "EventPutSuccessCount" and as a result our "ChannelSize" increases over time.
>
> We tried to use multiple hdfs-sinks but it didn't have any positive effect.
> Strangely, the problem is still there even when a memory channel is used.
> Another interesting fact is that we are also using an identical file-channel
> with the elasticsearch-sink and under the same load we don't have any
> throughput issues.
>
> We would appreciate any suggestions that could help us improve the
> performance of the hdfs sink.
>
> Regards,
> Nick
>
> "CHANNEL.s3-file-channel": {
>     "ChannelCapacity": "15000000",
>     "ChannelFillPercentage": "11.6603",
>     "ChannelSize": "1749045",
>     "EventPutAttemptCount": "938299",
>     "EventPutSuccessCount": "938181",
>     "EventTakeAttemptCount": "648801",
>     "EventTakeSuccessCount": "635000",
>     "StartTime": "1394038826288",
>     "StopTime": "0",
>     "Type": "CHANNEL"
