Thanks a lot for the comparison, Eric. It's really good to hear a perspective from a user of both.

On Sep 29, 2011, at 1:25 PM, Eric Hauser wrote:

> Jeremy,
>
> I've used both Flume and Kafka, and I can provide some info for comparison:
>
> Flume
> - The current Flume release 0.9.4 has some pretty nasty bugs in it
> (most have been fixed in trunk).
> - Flume is more complex to maintain operations-wise (IMO) than Kafka
> since you have to set up masters and collectors (you don't necessarily
> need collectors if you aren't writing to HDFS)
> - Flume has a well defined pattern for doing what you want:
> http://www.cloudera.com/blog/2010/09/using-flume-to-collect-apache-2-web-server-logs/
>
> Kafka
> - If you need multiple Kafka partitions for the logs, you will want to
> partition by host so the messages arrive in order for the same host
> - You can use the same piped technique as Flume to publish to Kafka,
> but you'll have to write a little code to publish and subscribe to the
> stream
> - Kafka does not provide any of the file rolling, compression, etc.
> that Flume provides
> - If you ever want to do anything more interesting with those log
> files than just send them to one location, publishing them to Kafka
> would allow you to add additional consumers later. Flume has a
> concept of fanout sinks, but I don't care for the way it works.
>
>
>
> On Thu, Sep 29, 2011 at 1:48 PM, Jun Rao <jun...@gmail.com> wrote:
>> Jeremy,
>>
>> Yes, Kafka will be a good fit for that.
>>
>> Thanks,
>>
>> Jun
>>
>> On Thu, Sep 29, 2011 at 10:12 AM, Jeremy Hanna
>> <jeremy.hanna1...@gmail.com> wrote:
>>
>>> We have a number of web servers in ec2 and periodically we just blow them
>>> away and create new ones. That makes keeping logs problematic. We're
>>> looking for a way to stream the logs from those various sources directly to
>>> a central log server - either just a single server or hdfs or something like
>>> that.
>>>
>>> My question is whether kafka is a good fit for that or should I be looking
>>> more along the lines of flume or scribe?
>>>
>>> Many thanks.
>>>
>>> Jeremy
>>
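
For reference, Eric's point about partitioning by host could be sketched roughly as below. This is only an illustration of the idea, not Kafka client code: `partition_for_host` and `route` are hypothetical names, and the in-memory lists stand in for Kafka partitions. The assumption is that a stable hash of the hostname picks the partition, so every line from one host lands in the same partition and keeps its original order there.

```python
import zlib


def partition_for_host(host: str, num_partitions: int) -> int:
    """Map a hostname to a partition index that is stable across runs."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # crc32 gives the same value in every process, unlike the built-in
    # hash() for strings, which is randomized per interpreter run.
    return zlib.crc32(host.encode("utf-8")) % num_partitions


def route(events, num_partitions):
    """Distribute (host, line) pairs over in-memory stand-in partitions.

    Lines from the same host always go to the same partition, so their
    relative order is preserved within that partition.
    """
    partitions = [[] for _ in range(num_partitions)]
    for host, line in events:
        index = partition_for_host(host, num_partitions)
        partitions[index].append((host, line))
    return partitions


if __name__ == "__main__":
    # Hypothetical log lines from two web servers, interleaved.
    sample = [
        ("web-1", "GET /index.html"),
        ("web-2", "GET /about.html"),
        ("web-1", "GET /logo.png"),
    ]
    for index, contents in enumerate(route(sample, 4)):
        print(index, contents)
```

With a real producer, the hostname would be passed to the client as the partitioning key or through a custom partitioner; how that is done depends on the Kafka client version in use.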