Hi, I like the idea Taylor suggested. This will definitely help a lot.
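To make sure I understand the wildcard idea, here is a rough sketch of how I imagine the matching could behave for a setting like the topic.flush.intervals.ms example in Taylor's mail quoted below. It is plain Python for illustration only, not Kafka code: as far as this thread says, Kafka does not support this today. The function names (parse_overrides, setting_for) and the first-match-wins rule are my own assumptions.

```python
from fnmatch import fnmatchcase


def parse_overrides(spec):
    """Parse 'pattern:value,pattern:value' into an ordered list of (pattern, int)."""
    overrides = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last ':' so the value is always the trailing number.
        pattern, _, value = entry.rpartition(":")
        if not pattern:
            raise ValueError("expected pattern:value, got %r" % entry)
        overrides.append((pattern.strip(), int(value)))
    return overrides


def setting_for(topic, overrides, default):
    """Return the value of the first pattern matching the topic, else the default."""
    for pattern, value in overrides:
        if fnmatchcase(topic, pattern):  # shell-style globbing: *, ?, [seq]
            return value
    return default
```

With the overrides parsed from "chat*:100,presence*:1000", a topic named chat.lobby would get 100, presence-42 would get 1000, and any other topic would fall back to whatever global default is passed in.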
Another approach I would suggest is to let Kafka load the topic.partition.count.map information from an external file (plain text, XML, etc.) in a format like:

topic1:#partition
topic2:#partition
....
topicn:#partition

This way, a Kafka user would also be able to modify this information (manually, or automatically by a script) as he/she wants.

What do you think?

Thanks,
~Thai

On Sat, Feb 18, 2012 at 1:52 AM, Taylor Gautier wrote:

> Jun,
>
> No, it's necessary for us to modify the tuning parameters on a per topic
> basis using wildcards, e.g.
>
> topic.flush.intervals.ms=chat*:100,presence*:1000
>
> On Fri, Feb 17, 2012 at 10:09 AM, Jun Rao wrote:
>
> > Taylor,
> >
> > We don't have a jira for that. Please open one.
> >
> > In 0.8, we will have DDLs for creating topics, which you can use to
> > customize # partitions. Will that be enough?
> >
> > Jun
> >
> > On Fri, Feb 17, 2012 at 9:02 AM, Taylor Gautier wrote:
> >
> > > Hi Thai.
> > >
> > > Well, actually we didn't solve this problem. We had to use the global
> > > topic settings that apply to all topics.
> > >
> > > I would really like to see globs (wildcards) supported in the config
> > > settings. This is something my team and I have discussed on several
> > > occasions.
> > >
> > > I'm not sure if there is a Kafka JIRA to cover that feature...
> > >
> > > -Taylor
> > >
> > > On Fri, Feb 17, 2012 at 2:57 AM, Bao Thai Ngo wrote:
> > >
> > > > Hi Taylor,
> > > >
> > > > I found your email and the Kafka use case by chance. Our use case is a
> > > > little similar to yours. We actually implement semantic partitioning to
> > > > maintain some kind of produced data, and we are also running several
> > > > thousand topics, as you are.
> > > >
> > > > One issue we have been facing is that it is very inconvenient for us to
> > > > maintain and update the Kafka server configuration (server.properties)
> > > > when running several thousand topics. We have to set the number of
> > > > partitions on a per-topic basis in the way Kafka requires:
> > > >
> > > > ### Overrides for the default given by num.partitions on a per-topic basis
> > > > topic.partition.count.map = topic1:4, topic2:4, ..., topicn:4
> > > >
> > > > I am almost sure that you ran into the issue I have mentioned, so I am
> > > > curious to know how you solved it.
> > > >
> > > > Thanks,
> > > > ~Thai
> > > >
> > > > On Wed, Dec 7, 2011 at 12:34 AM, Taylor Gautier wrote:
> > > >
> > > >> We had to isolate topics to specific servers because we are running
> > > >> several hundred thousand topics in aggregate.
> > > >>
> > > >> Due to the directory strategy of Kafka it's not feasible to put that
> > > >> many topics in every host since they reside in a single directory.
> > > >>
> > > >> An improvement we considered making was to make the data directory
> > > >> nested which would have alleviated this problem. We also could have
> > > >> tried a different filesystem but we weren't confident that would solve
> > > >> the problem entirely.
> > > >>
> > > >> The advantage to our solution is that each host in our Kafka tier is
> > > >> literally shared-nothing. It will scale horizontally for a long, long
> > > >> way.
> > > >>
> > > >> And it's also a contingency plan. Since Kafka was unproven (for us
> > > >> anyway at the time) it was easier to build smaller components with
> > > >> less overall functionality and glue them together in a scalable way.
> > > >> If we had had to we could have put a different message bus in place.
> > > >> But we didn't want to do that if we could avoid it :)
> > > >>
> > > >> On Dec 6, 2011, at 9:13 AM, Neha Narkhede wrote:
> > > >>
> > > >> > Taylor,
> > > >> >
> > > >> > This sounds great! Congratulations on this launch.
> > > >> >
> > > >> >>> But basically we have many topics, few messages (relatively) per topic
> > > >> >
> > > >> > Can you explain your strategy of mapping topics to brokers? The
> > > >> > default in Kafka today is to have all brokers host all topics.
> > > >> >
> > > >> >>> An end user browser makes a long-poll event http connection to receive
> > > >> >>> 1:1 messages and 1:M messages from a specialized http server we built
> > > >> >>> for this purpose. 1:M messages are delivered from Kafka.
> > > >> >
> > > >> > What do you use for receiving 1:1 messages?
> > > >> >
> > > >> > Your use case is interesting and different. It will be great if you add
> > > >> > relevant details here -
> > > >> > https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
> > > >> >
> > > >> > Thanks,
> > > >> > Neha
> > > >> >
> > > >> > On Tue, Dec 6, 2011 at 8:44 AM, Jun Rao wrote:
> > > >> >
> > > >> >> Hi, Taylor,
> > > >> >>
> > > >> >> Thanks for the update. This is great. Could you update your usage in
> > > >> >> Kafka wiki? Also, do you delete topics online? If so, how do you do that?
> > > >> >>
> > > >> >> Jun
> > > >> >>
> > > >> >> On Tue, Dec 6, 2011 at 8:30 AM, Taylor Gautier wrote:
> > > >> >>
> > > >> >>> I've already mentioned this before, but I wanted to give a quick
> > > >> >>> shout to let you guys know that our newest game, Deckadence, is 100%
> > > >> >>> live as of yesterday.
> > > >> >>>
> > > >> >>> Check it out at http://www.tagged.com/deckadence.html
> > > >> >>>
> > > >> >>> A little about our use case:
> > > >> >>>
> > > >> >>> - Deckadence is a game of buying and selling - or rather trading -
> > > >> >>>   cards. Every user on Tagged owns a card. There are 100M users on
> > > >> >>>   Tagged, so that means there are 100M cards to trade.
> > > >> >>> - Kafka enables real-time delivery of events in the game
> > > >> >>> - An end user browser makes a long-poll event http connection to
> > > >> >>>   receive 1:1 messages and 1:M messages from a specialized http
> > > >> >>>   server we built for this purpose. 1:M messages are delivered from
> > > >> >>>   Kafka.
> > > >> >>> - Because of this design, we can publish a message anywhere inside
> > > >> >>>   our datacenter and send it directly and immediately to any other
> > > >> >>>   system that is subscribed to Kafka, or to an end-user browser
> > > >> >>> - Every update event for every card is sent to a unique topic that
> > > >> >>>   represents the user's card.
> > > >> >>> - When a user is browsing any card or list of cards - say a search
> > > >> >>>   result - their browser subscribes to all of the cards on screen.
> > > >> >>> - The effect of this is that any changes to any card seen on-screen
> > > >> >>>   are seen in real-time by all users of the game
> > > >> >>> - Our primary producers and consumers are PHP and NodeJS,
> > > >> >>>   respectively
> > > >> >>>
> > > >> >>> Well, I plan to write up more about this use case in the near
> > > >> >>> future. As you might have guessed, this is just about as far away
> > > >> >>> from the original intent of Kafka as you could get - we have PHP
> > > >> >>> that sends messages to Kafka. Since it's not good to hold a TCP
> > > >> >>> connection open in PHP, we had to do some trickery here.
> > > >> >>> There was no existing Node client so we had to write our own. And
> > > >> >>> since there are 100 million users registered on Tagged, that means
> > > >> >>> we could have in theory 100M topics. Of course in practice we have
> > > >> >>> far fewer than that. One of the main things we currently have to do
> > > >> >>> is aggressively clean topics. But basically we have many topics,
> > > >> >>> few messages (relatively) per topic. And order matters, so we had
> > > >> >>> to deal with ensuring that we could handle the number of topics we
> > > >> >>> would create, and ensure ordered delivery and receipt.
> > > >> >>>
> > > >> >>> In the future I have big plans for Kafka; another feature is
> > > >> >>> currently in private test and will be released to the public soon
> > > >> >>> (it uses Kafka in a more traditional way). And we hope to have many
> > > >> >>> more in 2012...
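P.S. To make the external-file suggestion at the top of this message a bit more concrete, below is a minimal sketch of what I have in mind. It is Python for illustration only and not part of Kafka. It assumes the file holds whitespace-separated topic:count entries (one per line works too), and the function names are hypothetical. The last helper renders the value in the comma-separated form that topic.partition.count.map uses in server.properties, so a script could regenerate that line from the file.

```python
from pathlib import Path


def parse_partition_counts(text):
    """Parse whitespace-separated 'topic:count' entries into a dict."""
    counts = {}
    for entry in text.split():
        # Split on the last ':' so the count is always the trailing number.
        topic, sep, count = entry.rpartition(":")
        if not sep or not topic:
            raise ValueError("expected topic:count, got %r" % entry)
        counts[topic] = int(count)
    return counts


def load_partition_counts(path):
    """Read the external file (hypothetical location) and parse it."""
    return parse_partition_counts(Path(path).read_text(encoding="utf-8"))


def to_property_value(counts):
    """Render the dict as 'topic1:4, topic2:4', the form used in server.properties."""
    return ", ".join("%s:%d" % (topic, n) for topic, n in counts.items())
```

A cron job or deploy script could then edit the external file and rewrite the single property line, instead of someone hand-editing several thousand entries.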
