Congratulations on this, Chris! Wondering if you will be writing a
blog post on your experience using Kafka to enable live
analytics?

Thanks
Neha

On Fri, Jan 13, 2012 at 11:46 AM, Chris Burroughs
<chris.burrou...@gmail.com> wrote:
> At Clearspring we have been using Apache Kafka since early 2011.  It
> powers the AddThis Live View analytics [1] and the update [2] that
> product recently received involved yet more Kafka (three cheers for the
> log4j appender!).
>
> The project that we originally started investigating Kafka for is
> somewhat larger: taking all of the view activity data generated by
> AddThis sharing tools and replacing pixels on a CDN with direct requests
> to our datacenters. The obvious and exciting benefit is that this gives
> us access to our data in seconds instead of waiting hours for access log
> delivery.
>
> For that we have two datacenters, each with a web tier pushing to 60
> Kafka servers (so 120 in total).  Between the two DCs we employ custom
> bi-directional replication, so that batch and nearline analytics
> processes have access to a full copy of the data.  We are receiving a
> bit over 3 billion events per day, and expect total events ingested by
> the system to grow briskly over the next year.
>
> One choice that appears somewhat unusual and might be notable is that
> we are currently using the low-level producers and consumers
> exclusively. Each web server pushes to a local Kafka broker that it is
> co-located with (we are fans of multi-tenancy where possible and didn't
> want two different "kinds" of boxes; disk-oblivious web services and
> sequential-I/O-oriented Kafka were a natural fit), and our consumers
> all use Clearspring's analytics system [3], which already had
> integrated stream consumption and check-pointing.
>
> Please let me know if you have any questions.  There ought to be some
> blog posts with more details in the coming weeks.
>
> [1]
> http://www.addthis.com/blog/2011/06/21/social-data-in-real-time-with-addthis-live-view/
>
> [2]
> http://www.addthis.com/blog/2011/12/20/expanded-addthis-analytics-now-available-in-live-view/
>
> [3] There are a few blog posts and presentations about analytics at
> Clearspring floating around.  This one is the highest level overview:
> http://www.clearspring.com/blog/2011/05/12/big-data-dc-analytics-at-clearspring/
