Hi Mike, thanks a lot for posting the performance metrics. Are you
planning to run the same tests with the file channel?

Based on:

Load: 58,582 events/sec aggregate == approx. 5,850 events/sec per flow
on average x 10 flows.
Event size: 300 bytes/event.

58,582 events/sec * 300 bytes/event = 17,574,600 bytes/sec = approx.
17.6 MB/sec = approx. 140.6 Mbps (decimal units)
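
As a quick sanity check, the same arithmetic can be sketched in a few
lines of Python. This assumes decimal units (1 MB = 10^6 bytes, 1 Mbit =
10^6 bits) and counts only the event bytes, ignoring any TCP/IP
overhead. The variable names are mine, not anything from the test
harness.

```python
# Back-of-the-envelope bandwidth check using the figures quoted above.
# Assumption: decimal units, event payload only (no protocol overhead).
EVENTS_PER_SEC = 58_582    # aggregate rate across all flows
EVENT_SIZE_BYTES = 300     # bytes per event
FLOWS = 10                 # number of concurrent flows

bytes_per_sec = EVENTS_PER_SEC * EVENT_SIZE_BYTES
mb_per_sec = bytes_per_sec / 1_000_000
mbit_per_sec = bytes_per_sec * 8 / 1_000_000
events_per_flow = EVENTS_PER_SEC / FLOWS

print(f"{bytes_per_sec:,} bytes/sec")
print(f"{mb_per_sec:.2f} MB/sec")
print(f"{mbit_per_sec:.2f} Mbit/sec")
print(f"{events_per_flow:,.0f} events/sec per flow")
```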

Is my math right? I assume the file-based channel would be slower. Any
idea how much slower?

Thanks,

-ari


On Mon, May 7, 2012 at 6:55 PM, Mike Percy <[email protected]> wrote:
> Hi folks,
> Will McQueen and I have been doing some Flume NG stress and performance 
> testing, and we wanted to share some of our recent findings. The focus of the 
> most recent tests has been on the syslog TCP source, memory channel, and HDFS 
> sink.
>
> I wrote some software to generate load in syslog format over TCP and to 
> automate some of the analysis. The first thing we wanted to verify is that no 
> data was lost during these tests (a.k.a. correctness), with a close second 
> priority being of course throughput (performance). I used Pig and AvroStorage 
> from piggybank in the data integrity analysis, and committed the compiled 
> (0.11 trunk) piggybank jar so the load analysis scripts would be relatively 
> easy to use. It seems to be compatible with Pig 0.8.1. I am a little wary of 
> having to maintain that type of thing at the Apache org level so for now I 
> have checked all the code in on GitHub under an ASL 2.0 license:
>
> https://github.com/mpercy/flume-load-gen
>
> I have created a Wiki page with the performance metrics we have come up with 
> so far. The executive summary is that at the time of this writing, we have 
> observed Flume NG on a single machine processing events at a throughput rate 
> of 70,000+ events/sec with no data loss.
>
> https://cwiki.apache.org/confluence/display/FLUME/Flume+NG+Performance+Measurements
>
> I have put more details on the wiki page itself. Please let me know if you 
> want me to add more detail. I'll be looking into improving the performance of 
> these components going forward; however, we wanted to post these results to 
> set a public performance baseline for Flume NG.
>
> If others have done performance testing, we would love to see your results if 
> you can post the details.
>
> Regards,
> Mike
>
