Mark,

>> First and foremost we are currently using rsyslog to aggregate our logs from
>> our application servers.

This is similar to the legacy system we had at LinkedIn, now
successfully replaced by Kafka.

>> Although this strategy has been working for our bulk processing needs, it
>> doesn't help us much with realtime analysis, something we would really like
>> to introduce.

Kafka is designed to efficiently feed both real-time and offline data
pipelines. Being a pub-sub messaging system, it fits the needs of
real-time applications well. Its high throughput and built-in
consumer parallelism make it a good fit for feeding large
systems like Hadoop and data warehouses. At LinkedIn, we use it for
activity tracking as well as real-time RPC log analysis.

For more information, please visit our webpage:
http://incubator.apache.org/kafka/index.html. It has a detailed design
writeup and a quickstart for you to try out.

>> We've tried Flume but that didn't work out too well.

Out of curiosity, what roadblocks did you hit while trying Flume
out?

Thanks,
Neha

On Thu, Nov 3, 2011 at 11:58 AM, Mark <static.void....@gmail.com> wrote:
> Neha thanks for the response.
>
> I'll try and explain our use case. First and foremost we are currently using
> rsyslog to aggregate our logs from our application servers. This is
> accomplished using its TCP plugin, which sends logs to a cluster of logging
> machines. At the end of the day we then import this into Hadoop. Although
> this strategy has been working for our bulk processing needs, it doesn't help
> us much with realtime analysis, something we would really like to introduce.
> We've tried Flume but that didn't work out too well. So now we are in the
> process of looking into alternative technologies that can help us with both
> our bulk and realtime analysis needs.
>
> Does it sound like Kafka would be a nice fit for our use case? Are there any
> examples or documentation on realtime analysis with Kafka?
>
> Thanks.
>
> On 11/3/11 11:37 AM, Neha Narkhede wrote:
>>
>> Mark,
>>
>> For activity on the mailing list, take a look at these metrics -
>> http://mail-archives.apache.org/mod_mbox/incubator-kafka-dev/
>> http://mail-archives.apache.org/mod_mbox/incubator-kafka-users/
>>
>> For activity of the committers and the development -
>>
>> https://issues.apache.org/jira/browse/KAFKA#selectedTab=com.atlassian.jira.plugin.system.project%3Aissues-panel
>>
>> A full-fledged comparison can be quite lengthy. Would you mind
>> describing your use case? We can discuss the available alternatives and
>> how Kafka would fit in.
>>
>> Kafka has been deployed in production at LinkedIn for over a year and
>> a half. I believe there are other smaller startups using it too, and
>> more in the pipeline.
>>
>> Thanks,
>> Neha
>>
>>
>> On Thu, Nov 3, 2011 at 11:00 AM, Mark <static.void....@gmail.com> wrote:
>>>
>>> I was wondering what the current state of Kafka is. Is it gaining much
>>> traction? How active are the project, committers and mailing lists? Are
>>> there other more popular alternatives out there? Any comparison would help.
>>>
>>> Thanks for any input.
>>>
>
