Jun,

No, it's necessary for us to modify the tuning parameters on a per-topic
basis using wildcards, e.g.

topic.flush.intervals.ms=chat*:100,presence*:1000

On Fri, Feb 17, 2012 at 10:09 AM, Jun Rao <[email protected]> wrote:

> Taylor,
>
> We don't have a jira for that. Please open one.
>
> In 0.8, we will have DDLs for creating topics, which you can use to
> customize # partitions. Will that be enough?
>
> Jun
>
> On Fri, Feb 17, 2012 at 9:02 AM, Taylor Gautier <[email protected]>
> wrote:
>
> > Hi Thai.
> >
> > Well, actually we didn't solve this problem. We had to use the global
> > topic settings that apply to all topics.
> >
> > I would really like to see globs (wildcards) supported in the config
> > settings. This is something my team and I have discussed on several
> > occasions.
> >
> > I'm not sure if there is a Kafka JIRA to cover that feature...
> >
> > -Taylor
> >
> > On Fri, Feb 17, 2012 at 2:57 AM, Bao Thai Ngo <[email protected]>
> > wrote:
> >
> > > Hi Taylor,
> > >
> > > I found your email and the Kafka use case by chance. Our use case is
> > > a little similar to yours. We actually implement semantic
> > > partitioning to maintain some kind of produced data and we are also
> > > running several thousand topics, as you are.
> > >
> > > One issue we have been facing is that it is totally inconvenient for
> > > us to maintain and update the Kafka server configuration
> > > (server.properties) when running several thousand topics. We have to
> > > set the number of partitions on a per-topic basis in the way Kafka
> > > requires:
> > >
> > > ### Overrides for the default given by num.partitions on a per-topic basis
> > > topic.partition.count.map = topic1:4, topic2:4, ..., topicn:4
> > >
> > > I am almost sure that you did meet this issue I have mentioned, so I
> > > am curious to know how you solved it.
> > >
> > > Thanks,
> > > ~Thai
> > >
> > > On Wed, Dec 7, 2011 at 12:34 AM, Taylor Gautier <[email protected]>
> > > wrote:
> > >
> > >> We had to isolate topics to specific servers because we are running
> > >> several hundred thousand topics in aggregate.
> > >>
> > >> Due to the directory strategy of Kafka it's not feasible to put that
> > >> many topics in every host since they reside in a single directory.
> > >>
> > >> An improvement we considered making was to make the data directory
> > >> nested which would have alleviated this problem. We also could have
> > >> tried a different filesystem but we weren't confident that would
> > >> solve the problem entirely.
> > >>
> > >> The advantage to our solution is that each host in our Kafka tier is
> > >> literally shared-nothing. It will scale horizontally for a long,
> > >> long way.
> > >>
> > >> And it's also a contingency plan. Since Kafka was unproven (for us
> > >> anyway at the time) it was easier to build smaller components with
> > >> less overall functionality and glue them together in a scalable way.
> > >> If we had had to we could have put a different message bus in place.
> > >> But we didn't want to do that if we could avoid it :)
> > >>
> > >> On Dec 6, 2011, at 9:13 AM, Neha Narkhede <[email protected]>
> > >> wrote:
> > >>
> > >> > Taylor,
> > >> >
> > >> > This sounds great! Congratulations on this launch.
> > >> >
> > >> >>> But basically we have many topics, few messages (relatively) per
> > >> >>> topic
> > >> >
> > >> > Can you explain your strategy of mapping topics to brokers? The
> > >> > default in Kafka today is to have all brokers host all topics.
> > >> >
> > >> >>> An end user browser makes a long-poll event http connection to
> > >> >>> receive 1:1 messages and 1:M messages from a specialized http
> > >> >>> server we built for this purpose. 1:M messages are delivered
> > >> >>> from Kafka.
> > >> >
> > >> > What do you use for receiving 1:1 messages?
> > >> >
> > >> > Your use case is interesting and different.
> > >> > It will be great if you add relevant details here -
> > >> > https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
> > >> >
> > >> > Thanks,
> > >> > Neha
> > >> >
> > >> > On Tue, Dec 6, 2011 at 8:44 AM, Jun Rao <[email protected]> wrote:
> > >> >
> > >> >> Hi, Taylor,
> > >> >>
> > >> >> Thanks for the update. This is great. Could you update your usage
> > >> >> in Kafka wiki? Also, do you delete topics online? If so, how do
> > >> >> you do that?
> > >> >>
> > >> >> Jun
> > >> >>
> > >> >> On Tue, Dec 6, 2011 at 8:30 AM, Taylor Gautier <[email protected]>
> > >> >> wrote:
> > >> >>
> > >> >>> I've already mentioned this before, but I wanted to give a quick
> > >> >>> shout to let you guys know that our newest game, Deckadence, is
> > >> >>> 100% live as of yesterday.
> > >> >>>
> > >> >>> Check it out at http://www.tagged.com/deckadence.html
> > >> >>>
> > >> >>> A little about our use case:
> > >> >>>
> > >> >>> - Deckadence is a game of buying and selling - or rather trading -
> > >> >>>   cards. Every user on Tagged owns a card. There are 100M users
> > >> >>>   on Tagged, so that means there are 100M cards to trade.
> > >> >>> - Kafka enables real-time delivery of events in the game
> > >> >>> - An end user browser makes a long-poll event http connection to
> > >> >>>   receive 1:1 messages and 1:M messages from a specialized http
> > >> >>>   server we built for this purpose. 1:M messages are delivered
> > >> >>>   from Kafka.
> > >> >>> - Because of this design, we can publish a message anywhere
> > >> >>>   inside our datacenter and send it directly and immediately to
> > >> >>>   any other system that is subscribed to Kafka, or to an end-user
> > >> >>>   browser
> > >> >>> - Every update event for every card is sent to a unique topic
> > >> >>>   that represents the user's card.
> > >> >>> - When a user is browsing any card or list of cards - say a
> > >> >>>   search result - their browser subscribes to all of the cards
> > >> >>>   on screen.
> > >> >>> - The effect of this is that any changes to any card seen
> > >> >>>   on-screen are seen in real-time by all users of the game
> > >> >>> - Our primary producers and consumers are PHP and NodeJS,
> > >> >>>   respectively
> > >> >>>
> > >> >>> Well, I plan to write up more about this use case in the near
> > >> >>> future. As you might have guessed, this is just about as far
> > >> >>> away from the original intent of Kafka as you could get - we
> > >> >>> have PHP that sends messages to Kafka. Since it's not good to
> > >> >>> hold a TCP connection open in PHP, we had to do some trickery
> > >> >>> here. There was no existing Node client so we had to write our
> > >> >>> own. And since there are 100 million users registered on Tagged,
> > >> >>> that means we could have in theory 100M topics. Of course in
> > >> >>> practice we have far fewer than that. One of the main things we
> > >> >>> currently have to do is aggressively clean topics. But basically
> > >> >>> we have many topics, few messages (relatively) per topic. And
> > >> >>> order matters, so we had to deal with ensuring that we could
> > >> >>> handle the number of topics we would create, and ensure ordered
> > >> >>> delivery and receipt.
> > >> >>>
> > >> >>> In the future I have big plans for Kafka, another feature is
> > >> >>> currently in private test and will be released to the public
> > >> >>> soon (it uses Kafka in a more traditional way). And we hope to
> > >> >>> have many more in 2012...
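
For illustration only: the wildcard override proposed at the top of this thread (topic.flush.intervals.ms=chat*:100,presence*:1000) is a requested feature, not something the thread shows Kafka supporting. The Python sketch below shows one way such glob patterns might be resolved to a per-topic value. The function names parse_overrides and resolve_override, the first-match-wins rule, and the fallback default of 3000 ms are all assumptions made for this example and are not taken from Kafka.

```python
from fnmatch import fnmatchcase


def parse_overrides(spec):
    """Parse 'pattern:value,pattern:value' into an ordered list of (pattern, int)."""
    overrides = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last colon so the pattern itself may contain colons.
        pattern, sep, value = entry.rpartition(":")
        if not sep or not pattern:
            raise ValueError("expected 'pattern:value', got %r" % entry)
        overrides.append((pattern.strip(), int(value)))
    return overrides


def resolve_override(topic, overrides, default):
    """Return the value of the first glob pattern matching topic, else default.

    First-match-wins is an assumption; a real implementation might prefer
    the most specific pattern instead.
    """
    for pattern, value in overrides:
        if fnmatchcase(topic, pattern):
            return value
    return default


if __name__ == "__main__":
    # Hypothetical global default flush interval, in milliseconds.
    DEFAULT_FLUSH_MS = 3000
    flush_overrides = parse_overrides("chat*:100,presence*:1000")
    for name in ("chat.room42", "presence.user7", "audit"):
        print(name, resolve_override(name, flush_overrides, DEFAULT_FLUSH_MS))
```

With two patterns in place of several thousand explicit topic:value entries, the same lookup would also cover a map such as topic.partition.count.map, which is the maintenance burden Thai describes.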
