Yes, we have done so at Tagged.  I chronicled a bit of our experience here
on the mailing list.  In practice we found that a single machine could not
go above ~20k total topics, though this may be OS-dependent (we use
CentOS 5.x).

Various tweaks we made to go further:

   1. a beefed-up node.js Kafka client/producer implementation -
   https://github.com/tagged/node-kafka - which lies at the heart of our
   Kafka deployment
   2. our own Kafka software load balancer (implemented using said library)
   that shards topics across independent Kafka instances, guaranteeing
   in-order delivery per topic and scaling the # of topics linearly with
   the # of Kafka machines (a sketch of the sharding scheme follows this
   list)
   3. a continuous cleaner that removes old, dead topics completely from
   the filesystem, since the 0.7 cleaner leaves an empty directory/file
   behind, which eats up open file handles and limits the max # of topics
   (sketched below)
   4. (coming soon) a hierarchical topic directory structure to ease the
   pain of too many directories/files in a single directory (this should
   help the ~20k number, though probably by less than you might imagine;
   also sketched below)
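
To make (2) concrete: the core of the balancer is just a stable hash from
topic name to Kafka instance, so every message for a given topic always
lands on the same machine (preserving order) while topic capacity grows
with the number of machines.  A minimal sketch, with a hypothetical broker
list (the real balancer is built on node-kafka):

    // Stable topic -> broker mapping; the broker list is illustrative.
    var crypto = require('crypto');

    var brokers = [
      { host: 'kafka01', port: 9092 },
      { host: 'kafka02', port: 9092 },
      { host: 'kafka03', port: 9092 }
    ];

    function brokerFor(topic) {
      // Hash the topic name and mod by the broker count: same topic,
      // same broker, every time, which is what keeps delivery in order.
      var hex = crypto.createHash('md5').update(topic).digest('hex');
      var n = parseInt(hex.slice(0, 8), 16);
      return brokers[n % brokers.length];
    }

One caveat: plain modulo re-maps most topics when you add a machine, so
ordering is only guaranteed between resizes; consistent hashing softens
that if you resize often.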
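
The cleaner in (3) is conceptually simple.  Roughly speaking (the log dir
path and age threshold here are illustrative, not our real config):

    // Remove topic directories the 0.7 cleaner leaves behind: once every
    // segment file in a topic's directory is empty and the directory has
    // been idle for a while, drop it entirely to free file handles.
    var fs = require('fs');
    var path = require('path');

    var LOG_DIR = '/var/kafka/logs';
    var MAX_AGE_MS = 24 * 60 * 60 * 1000;

    fs.readdirSync(LOG_DIR).forEach(function (name) {
      var dir = path.join(LOG_DIR, name);
      if (!fs.statSync(dir).isDirectory()) return;
      var files = fs.readdirSync(dir).map(function (f) {
        return path.join(dir, f);
      });
      var allEmpty = files.every(function (f) {
        return fs.statSync(f).size === 0;
      });
      var idleMs = Date.now() - fs.statSync(dir).mtime.getTime();
      if (allEmpty && idleMs > MAX_AGE_MS) {
        files.forEach(function (f) { fs.unlinkSync(f); });
        fs.rmdirSync(dir);
      }
    });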
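
And for (4), the planned layout amounts to bucketing topic directories
under a short hash prefix so that no single directory holds tens of
thousands of entries.  Details are still in flux, but the mapping will
look something like:

    // Two-level fan-out caps each directory at 256 entries per level,
    // e.g. <logDir>/a3/f1/my-topic instead of one flat directory.
    var crypto = require('crypto');

    function topicDir(logDir, topic) {
      var hex = crypto.createHash('md5').update(topic).digest('hex');
      return [logDir, hex.slice(0, 2), hex.slice(2, 4), topic].join('/');
    }

Directory lookups get cheaper, but the per-topic open-file-handle cost
doesn't change, which is why we expect this to raise the ~20k ceiling less
than you might hope.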

On our todo list is blogging about this in more detail, and contributing
back more than just the node.js implementation.

On Mon, Jul 30, 2012 at 8:39 AM, Lorenzo Alberton <l.alber...@gmail.com> wrote:

> Is there anyone who tried Kafka with thousands of concurrent topics?
> If so, what are your experiences? How did you tune it?
>
> Thanks!
>
