Hi Thai. Well, actually we didn't solve this problem. We had to use the global topic settings that apply to all topics.
I would really like to see globs (wildcards) supported in the config settings. This is something my team and I have discussed on several occasions. I'm not sure if there is a Kafka JIRA to cover that feature⦠-Taylor On Fri, Feb 17, 2012 at 2:57 AM, Bao Thai Ngo <[email protected]> wrote: > Hi Taylor, > > I found your email and the Kafka use case by chance. Our use case is a > little similar to yours. We actually implement semantic partitioning to > maintain some kind of produced data and we are also running several > thousand topics as you. > > One issue we have been facing is that it is totally inconvenient for us to > maintain and update Kafka server configuration (server.properties) when > running several thousand topics. We have to put number of partitions on a > per-topic in the way Kafka requires: > > ### Overrides for for the default given by num.partitions on a per-topic > basis > topic.partition.count.map = topic1:4, topic2:4, ..., topicn:4 > > I am almost sure that you did meet this issue I have mentioned, so I am > curious to know how you solved it. > > Thanks, > ~Thai > > On Wed, Dec 7, 2011 at 12:34 AM, Taylor Gautier <[email protected]>wrote: > >> We had to isolate topics to specific servers because we are running >> several hundred thousand topics in aggregate. >> >> Due to the directory strategy of Kafka it's not feasible to put that >> many topics in every host since they reside in a single directory. >> >> An improvement we considered making was to make the data directory >> nested which would have alleviated this problem. We also could have >> tried a different filesystem but we weren't confident that would solve >> the problem entirely. >> >> The advantage to our solution is that each host in our Kafka tier is >> literally share nothing. It will scale horizontally for a long, long >> way. >> >> And it's also a contingency plan. Since Kafka was unproven (for us >> anyway at the time) it was easier to build smaller components with >> less overall functionality and glue them together in a scalable way. >> If we had had to we could have out a different message bus in place. >> But we didn't want to do that if we could avoid it :) >> >> >> >> On Dec 6, 2011, at 9:13 AM, Neha Narkhede <[email protected]> >> wrote: >> >> > Taylor, >> > >> > This sounds great ! Congratulations on this launch. >> > >> >>> But basically we have many topics, few messages (relatively) per topic >> > >> > Can you explain your strategy of mapping topics to brokers ? The >> default in >> > Kafka today is to have all brokers host all topics. >> > >> >>> An end user browser makes a long-poll event http connection to receive >> > 1:1 messages and 1:M messages from a specialized http server we built >> for >> > this purpose. 1:M messages are delivered from Kafka. >> > >> > What do you use for receiving 1:1 messages ? >> > >> > Your use case is interesting and different. It will be great if you add >> > relevant details here - >> > https://cwiki.apache.org/confluence/display/KAFKA/Powered+By >> > >> > Thanks, >> > Neha >> > >> > >> > On Tue, Dec 6, 2011 at 8:44 AM, Jun Rao <[email protected]> wrote: >> > >> >> Hi, Taylor, >> >> >> >> Thanks for the update. This is great. Could you update your usage in >> Kafka >> >> wiki? Also, do you delete topics online? If so, how do you do that? >> >> >> >> Jun >> >> >> >> On Tue, Dec 6, 2011 at 8:30 AM, Taylor Gautier <[email protected]> >> >> wrote: >> >> >> >>> I've already mentioned this before, but I wanted to give a quick >> shout to >> >>> let you guys know that our newest game, Deckadence, is 100% live as of >> >>> yesterday. >> >>> >> >>> Check it out at http://www.tagged.com/deckadence.html >> >>> >> >>> A little about our use case: >> >>> >> >>> - Deckadence is a game of buying and selling - or rather trading - >> >>> cards. Every user on Tagged owns a card. There are 100M uses on >> >> Tagged, >> >>> so that means there are 100M cards to trade. >> >>> - Kafka enables real-time delivery of events in the game >> >>> - An end user browser makes a long-poll event http connection to >> >> receive >> >>> 1:1 messages and 1:M messages from a specialized http server we built >> >> for >> >>> this purpose. 1:M messages are delivered from Kafka. >> >>> - Because of this design, we can publish a message anywhere inside >> our >> >>> datacenter and send it directly and immediately to any other system >> >> that >> >>> is >> >>> subscribed to Kafka, or to an end-user browser >> >>> - Every update event for every card is sent to a unique topic that >> >>> represents the users card. >> >>> - When a user is browsing any card or list of cards - say a search >> >>> result - their browser subscribes to all of the cards on screen. >> >>> - The effect of this is that any changes to any card seen on-screen >> are >> >>> seen in real-time by all users of the game >> >>> - Our primary producers and consumers are PHP and NodeJS, >> respectively >> >>> >> >>> Well, I plan to write up more about this use case in the near future. >> As >> >>> you might have guessed, this is just about as far away from the >> original >> >>> intent of Kafka as you could get - we have PHP that sends messages to >> >>> Kafka. Since it's not good to hold a TCP connection open in PHP, we >> had >> >> to >> >>> do some trickery here. There was no existing Node client so we had to >> >>> write our own. And since there are 100 million users registered on >> >> Tagged, >> >>> that means we could have in theory 100M topics. Of course in >> practice we >> >>> have far fewer than that. One of the main things we currently have >> to do >> >>> is aggressively clean topics. But basically we have many topics, few >> >>> messages (relatively) per topic. And order matters, so we had to deal >> >> with >> >>> ensuring that we could handle the number of topics we would create, >> and >> >>> ensure ordered delivery and receipt. >> >>> >> >>> In the future I have big plans for Kafka, another feature is >> currently in >> >>> private test and will be released to the public soon (it uses Kafka >> in a >> >>> more traditional way). And we hope to have many more in 2012... >> >>> >> >> >> > >
