It would be useful to see your HDFS sink config.
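
To be concrete about which part is meant, here is a rough sketch of an HDFS sink section. It is not your actual setup: the agent name "agent", the sink name "s3-sink", the bucket path and every value are placeholders I made up. Only the channel name comes from your metrics, and the property names are the standard ones from the Flume user guide. The batch size and roll settings are usually the first things worth looking at when takes lag behind puts.

```properties
# Hypothetical agent and sink names; values are illustrative only.
agent.sinks.s3-sink.type = hdfs
agent.sinks.s3-sink.channel = s3-file-channel

# Placeholder bucket and prefix.
agent.sinks.s3-sink.hdfs.path = s3n://example-bucket/flume/events
agent.sinks.s3-sink.hdfs.fileType = DataStream

# Number of events written per flush to the file.
agent.sinks.s3-sink.hdfs.batchSize = 1000

# Roll triggers: seconds, bytes, event count (0 disables a trigger).
agent.sinks.s3-sink.hdfs.rollInterval = 300
agent.sinks.s3-sink.hdfs.rollSize = 0
agent.sinks.s3-sink.hdfs.rollCount = 0

# Timeout in milliseconds for open, write, flush and close calls.
agent.sinks.s3-sink.hdfs.callTimeout = 60000

# The channel's transactionCapacity should be at least the sink batch size.
agent.channels.s3-file-channel.transactionCapacity = 1000
```
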
On Wed, Mar 5, 2014 at 11:47 AM, Nikolaos Tsipas <[email protected]> wrote:

> Thanks for the response. No, we don't get any errors in the flume log,
> everything seems to be ok.. it is just not performing as expected.
> For reference, I'm also attaching some metrics from the elasticsearch
> file-channel.
>
> "CHANNEL.es-file-channel": {
>   "ChannelCapacity": "15000000",
>   "ChannelFillPercentage": "0.0027333333333333333",
>   "ChannelSize": "410",
>   "EventPutAttemptCount": "590760",
>   "EventPutSuccessCount": "590760",
>   "EventTakeAttemptCount": "599977",
>   "EventTakeSuccessCount": "599800",
>   "StartTime": "1394047580119",
>   "StopTime": "0",
>   "Type": "CHANNEL"
> --
> "CHANNEL.s3-file-channel": {
>   "ChannelCapacity": "15000000",
>   "ChannelFillPercentage": "0.7050666666666667",
>   "ChannelSize": "105760",
>   "EventPutAttemptCount": "590760",
>   "EventPutSuccessCount": "590760",
>   "EventTakeAttemptCount": "487249",
>   "EventTakeSuccessCount": "485000",
>   "StartTime": "1394047574563",
>   "StopTime": "0",
>   "Type": "CHANNEL"
>
> ________________________________________
> From: Jeff Lord [[email protected]]
> Sent: Wednesday, March 05, 2014 6:45 PM
> To: Flume Developers Mailing List
> Subject: Re: HDFS sink throughput
>
> Are you getting any errors back in the flume log?
>
> On Wed, Mar 5, 2014 at 10:11 AM, Nikolaos Tsipas
> <[email protected]> wrote:
> > Hello,
> >
> > We are using flume's HDFS sink to store log data in Amazon S3 and we
> > are facing some throughput issues. In our flume config we have an avro
> > source, a file channel and the hdfs sink. The file channel is
> > configured on a provisioned IOPS EBS volume and we are running on an
> > m1.large EC2 instance (flume 1.4.0, java 1.7.0).
> >
> > Below you will find an example metric from our s3-file-channel. The
> > main issue is that the "EventTakeSuccessCount" can't cope with the
> > "EventPutSuccessCount" and as a result our "ChannelSize" increases
> > over time.
> >
> > We tried to use multiple hdfs-sinks but it didn't have any positive
> > effect. Strangely, the problem is still there even when a memory
> > channel is used. Another interesting fact is that we are also using an
> > identical file-channel with the elasticsearch-sink and under the same
> > load we don't have any throughput issues.
> >
> > We would appreciate any suggestions that could help us improve the
> > performance of the hdfs sink.
> >
> > Regards,
> > Nick
> >
> > "CHANNEL.s3-file-channel": {
> >   "ChannelCapacity": "15000000",
> >   "ChannelFillPercentage": "11.6603",
> >   "ChannelSize": "1749045",
> >   "EventPutAttemptCount": "938299",
> >   "EventPutSuccessCount": "938181",
> >   "EventTakeAttemptCount": "648801",
> >   "EventTakeSuccessCount": "635000",
> >   "StartTime": "1394038826288",
> >   "StopTime": "0",
> >   "Type": "CHANNEL"
