Re: Kafka is live in prod @ 100%

Jun Rao Fri, 17 Feb 2012 10:09:54 -0800

Taylor,

We don't have a jira for that. Please open one.


In 0.8, we will have DDLs for creating topics, which you can use to
customize the number of partitions. Will that be enough?

Jun

On Fri, Feb 17, 2012 at 9:02 AM, Taylor Gautier <[email protected]> wrote:

> Hi Thai.
>
> Well, actually we didn't solve this problem.  We had to use the global
> topic settings that apply to all topics.
>
> I would really like to see globs (wildcards) supported in the config
> settings.  This is something my team and I have discussed on several
> occasions.
>
> I'm not sure if there is a Kafka JIRA to cover that feature…
>
> -Taylor
>
> On Fri, Feb 17, 2012 at 2:57 AM, Bao Thai Ngo <[email protected]>
> wrote:
>
> > Hi Taylor,
> >
> > I found your email and the Kafka use case by chance. Our use case is a
> > little similar to yours. We actually implement semantic partitioning to
> > maintain some kind of produced data and we are also running several
> > thousand topics, as you are.
> >
> > One issue we have been facing is that it is very inconvenient for us to
> > maintain and update the Kafka server configuration (server.properties) when
> > running several thousand topics. We have to set the number of partitions on a
> > per-topic basis in the way Kafka requires:
> >
> > ### Overrides for the default given by num.partitions on a per-topic basis
> > topic.partition.count.map = topic1:4, topic2:4, ..., topicn:4
> >
> > I am almost sure that you have run into the issue I mentioned, so I am
> > curious to know how you solved it.
> >
> > Thanks,
> > ~Thai
> >
> > On Wed, Dec 7, 2011 at 12:34 AM, Taylor Gautier <[email protected]>
> > wrote:
> >
> >> We had to isolate topics to specific servers because we are running
> >> several hundred thousand topics in aggregate.
> >>
> >> Due to the directory strategy of Kafka it's not feasible to put that
> >> many topics in every host since they reside in a single directory.
> >>
> >> An improvement we considered making was to make the data directory
> >> nested which would have alleviated this problem.  We also could have
> >> tried a different filesystem but we weren't confident that would solve
> >> the problem entirely.
> >>
> >> The advantage to our solution is that each host in our Kafka tier is
> >> literally shared-nothing. It will scale horizontally for a long, long
> >> way.
> >>
> >> And it's also a contingency plan. Since Kafka was unproven (for us
> >> anyway at the time) it was easier to build smaller components with
> >> less overall functionality and glue them together in a scalable way.
> >> If we had had to we could have put a different message bus in place.
> >> But we didn't want to do that if we could avoid it :)
> >>
> >>
> >>
> >> On Dec 6, 2011, at 9:13 AM, Neha Narkhede <[email protected]>
> >> wrote:
> >>
> >> > Taylor,
> >> >
> >> > This sounds great! Congratulations on this launch.
> >> >
> >> >>> But basically we have many topics, few messages (relatively) per topic
> >> >
> >> > Can you explain your strategy of mapping topics to brokers? The default in
> >> > Kafka today is to have all brokers host all topics.
> >> >
> >> >>> An end user browser makes a long-poll event http connection to receive
> >> >>> 1:1 messages and 1:M messages from a specialized http server we built for
> >> >>> this purpose.  1:M messages are delivered from Kafka.
> >> >
> >> > What do you use for receiving 1:1 messages?
> >> >
> >> > Your use case is interesting and different. It will be great if you add
> >> > relevant details here -
> >> > https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
> >> >
> >> > Thanks,
> >> > Neha
> >> >
> >> >
> >> > On Tue, Dec 6, 2011 at 8:44 AM, Jun Rao <[email protected]> wrote:
> >> >
> >> >> Hi, Taylor,
> >> >>
> >> >> Thanks for the update. This is great. Could you update your usage in the
> >> >> Kafka wiki? Also, do you delete topics online? If so, how do you do that?
> >> >>
> >> >> Jun
> >> >>
> >> >> On Tue, Dec 6, 2011 at 8:30 AM, Taylor Gautier <[email protected]>
> >> >> wrote:
> >> >>
> >> >>> I've already mentioned this before, but I wanted to give a quick shout to
> >> >>> let you guys know that our newest game, Deckadence, is 100% live as of
> >> >>> yesterday.
> >> >>>
> >> >>> Check it out at http://www.tagged.com/deckadence.html
> >> >>>
> >> >>> A little about our use case:
> >> >>>
> >> >>>  - Deckadence is a game of buying and selling - or rather trading -
> >> >>>  cards.  Every user on Tagged owns a card.  There are 100M users on Tagged,
> >> >>>  so that means there are 100M cards to trade.
> >> >>>  - Kafka enables real-time delivery of events in the game
> >> >>>  - An end user browser makes a long-poll event http connection to receive
> >> >>>  1:1 messages and 1:M messages from a specialized http server we built for
> >> >>>  this purpose.  1:M messages are delivered from Kafka.
> >> >>>  - Because of this design, we can publish a message anywhere inside our
> >> >>>  datacenter and send it directly and immediately to any other system that
> >> >>>  is subscribed to Kafka, or to an end-user browser
> >> >>>  - Every update event for every card is sent to a unique topic that
> >> >>>  represents the user's card.
> >> >>>  - When a user is browsing any card or list of cards - say a search
> >> >>>  result - their browser subscribes to all of the cards on screen.
> >> >>>  - The effect of this is that any changes to any card seen on-screen are
> >> >>>  seen in real-time by all users of the game
> >> >>>  - Our primary producers and consumers are PHP and NodeJS, respectively
> >> >>>
> >> >>> Well, I plan to write up more about this use case in the near future.  As
> >> >>> you might have guessed, this is just about as far away from the original
> >> >>> intent of Kafka as you could get - we have PHP that sends messages to
> >> >>> Kafka.  Since it's not good to hold a TCP connection open in PHP, we had to
> >> >>> do some trickery here.  There was no existing Node client so we had to
> >> >>> write our own.  And since there are 100 million users registered on Tagged,
> >> >>> that means we could have in theory 100M topics.  Of course in practice we
> >> >>> have far fewer than that.  One of the main things we currently have to do
> >> >>> is aggressively clean topics.  But basically we have many topics, few
> >> >>> messages (relatively) per topic.  And order matters, so we had to deal with
> >> >>> ensuring that we could handle the number of topics we would create, and
> >> >>> ensure ordered delivery and receipt.
> >> >>>
> >> >>> In the future I have big plans for Kafka. Another feature is currently in
> >> >>> private test and will be released to the public soon (it uses Kafka in a
> >> >>> more traditional way).  And we hope to have many more in 2012...
> >> >>>
> >> >>
> >>
> >
> >
>
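
For readers maintaining the per-topic override quoted above, here is a
minimal sketch (in Python, not from the thread) of generating the
topic.partition.count.map line from a list of topic names instead of
editing it by hand. It also shows how a glob-based override of the kind
Taylor asks for might resolve. The function names and the glob rule
format are hypothetical; the thread indicates Kafka had no such glob
support at the time.

```python
from fnmatch import fnmatchcase


def render_partition_map(topics, partitions):
    """Build the topic.partition.count.map line in the format quoted in the thread."""
    entries = ", ".join(f"{topic}:{partitions}" for topic in topics)
    return f"topic.partition.count.map = {entries}"


def resolve_partitions(topic, glob_rules, default):
    """Hypothetical glob lookup: the first matching pattern wins, else the default.

    glob_rules is an ordered list of (pattern, partition_count) pairs.
    This is an illustration of the requested feature, not a Kafka API.
    """
    for pattern, count in glob_rules:
        if fnmatchcase(topic, pattern):
            return count
    return default


if __name__ == "__main__":
    # Generate the override line for a few (made-up) topic names.
    topics = [f"topic{i}" for i in range(1, 4)]
    print(render_partition_map(topics, 4))

    # One glob rule standing in for thousands of explicit entries.
    rules = [("card.*", 4)]
    print(resolve_partitions("card.12345", rules, default=1))
```

A generated line can be written into server.properties by whatever
deployment tooling is in use; the glob resolver only illustrates how a
single pattern could replace thousands of explicit entries.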
