Mark,

>> First and foremost we are currently using rsyslog to aggregate our logs from
>> our application servers.

This is similar to the legacy system we had at LinkedIn, which has now been successfully replaced by Kafka.

>> Although this strategy has been working for our bulk processing needs it
>> doesn't help us much with realtime analysis, something we would really like
>> to introduce.

Kafka is designed to efficiently feed both real-time and offline data pipelines. Being a pub-sub messaging system, it fits the needs of real-time applications well. Its high throughput and built-in consumer parallelism make it a good fit for feeding large systems like Hadoop and data warehouses. At LinkedIn, we use it for activity tracking as well as real-time RPC log analysis.

For more information, please visit our webpage - http://incubator.apache.org/kafka/index.html. It has a detailed design writeup and a quickstart for you to try out.

>> We've tried Flume but that didn't work out too well.

Out of curiosity, what roadblocks did you hit while trying Flume out?

Thanks,
Neha

On Thu, Nov 3, 2011 at 11:58 AM, Mark <static.void....@gmail.com> wrote:
> Neha thanks for the response.
>
> I'll try and explain our use case. First and foremost we are currently using
> rsyslog to aggregate our logs from our application servers. This is
> accomplished using their TCP plugin which sends logs to a cluster of logging
> machines. At the end of the day we then import this into Hadoop. Although
> this strategy has been working for our bulk processing needs it doesn't help
> us much with realtime analysis, something we would really like to introduce.
> We've tried Flume but that didn't work out too well. So now we are in the
> process of looking into alternative technologies that can help us with both
> our bulk and realtime analysis needs.
>
> Does it sound like Kafka would be a nice fit for our use case? Are there any
> examples, documentation on realtime analysis with Kafka?
>
> Thanks.
>
> On 11/3/11 11:37 AM, Neha Narkhede wrote:
>>
>> Mark,
>>
>> For activity on the mailing list, take a look at these metrics -
>> http://mail-archives.apache.org/mod_mbox/incubator-kafka-dev/
>> http://mail-archives.apache.org/mod_mbox/incubator-kafka-users/
>>
>> For activity of the committers and the development -
>>
>> https://issues.apache.org/jira/browse/KAFKA#selectedTab=com.atlassian.jira.plugin.system.project%3Aissues-panel
>>
>> A full-fledged comparison can be quite lengthy. Would you mind
>> describing your case? We can discuss the available alternatives and
>> how Kafka would fit in.
>>
>> Kafka has been deployed in production at LinkedIn for over a year and
>> a half. I believe there are other smaller startups using it too, and
>> more in the pipeline.
>>
>> Thanks,
>> Neha
>>
>>
>> On Thu, Nov 3, 2011 at 11:00 AM, Mark<static.void....@gmail.com> wrote:
>>>
>>> I was wondering what the current state of Kafka is. Is it gaining much
>>> traction? How active is the project, committers and mailing lists? Are
>>> there other more popular alternatives out there? Any comparison would
>>> help.
>>>
>>> Thanks for any input.
>>>
>