Re: Strategies for auto generating broker ID

Sriram Subramanian Wed, 02 Oct 2013 11:20:58 -0700

Jason - You should be able to solve that with Jay's proposal below. If you
just persist the id in a meta file, you can copy the meta file over to the
new broker and the new broker will not generate another id.
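
A minimal sketch of that idea, not Kafka code: the file name broker.meta,
the function load_or_create_broker_id, and the use of a UUID as the id are
all assumptions made for illustration. The broker reads its id from the meta
file if one exists and only generates a new id when the file is missing, so
copying the file to a replacement machine carries the identity along.

```python
import os
import shutil
import tempfile
import uuid


def load_or_create_broker_id(meta_path):
    """Return the id stored in meta_path; generate and persist one if absent."""
    if os.path.exists(meta_path):
        with open(meta_path) as f:
            return f.read().strip()
    broker_id = str(uuid.uuid4())  # hypothetical id scheme, for illustration
    with open(meta_path, "w") as f:
        f.write(broker_id + "\n")
    return broker_id


# Two temporary directories stand in for the data directories of the
# failed broker and its replacement.
old_dir = tempfile.mkdtemp()
new_dir = tempfile.mkdtemp()
old_meta = os.path.join(old_dir, "broker.meta")  # hypothetical file name
new_meta = os.path.join(new_dir, "broker.meta")

original_id = load_or_create_broker_id(old_meta)     # first start: generates
shutil.copy(old_meta, new_meta)                      # operator copies the file
replacement_id = load_or_create_broker_id(new_meta)  # replacement: reuses
```

After the copy, the replacement reports the same id as the original, which
is the behaviour the manual "assign the same brokerId" workaround relies on.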


On 10/2/13 11:10 AM, "Jason Rosenberg" <j...@squareup.com> wrote:

>I recently moved away from generating a unique brokerId for each node, in
>favor of assigning ids in configuration.  The reason for this is that in
>0.8, there isn't yet a convenient way to reassign partitions to a new
>brokerId, should one broker fail.  So it seems the only workaround at
>the moment is to bring up a replacement broker and assign it the same
>brokerId as the one that has failed and is no longer running.  The
>cluster will then automatically replicate all the partitions that were
>assigned to the failed broker to the new broker.
>
>This appears to be the only operational way to deal with failed brokers at
>the moment.
>
>Longer term, it would be great if the cluster were self-healing: if a
>broker went down, we could mark it as no longer available somehow, and the
>cluster would then reassign the partitions previously assigned to the
>failed broker and re-replicate them to new brokers.  I expect something
>like this will be available in future versions, but that doesn't appear to
>be the case at present.
>
>Relatedly, it would be nice, in the interests of horizontal scalability,
>to have an easy way for the cluster to dynamically rebalance load when new
>nodes are added to the cluster (or at least to prefer assigning new
>partitions to brokers which have more space available).  I expect this
>will be something to prioritize in future versions as well.
>
>Jason
>
>
>On Wed, Oct 2, 2013 at 1:00 PM, Sriram Subramanian <
>srsubraman...@linkedin.com> wrote:
>
>> I agree that we need a unique id and something independent of the
>> machine. I am not sure you want a dependency on ZK to generate the
>> unique id though. There are other ways to generate a unique id (for
>> example, a UUID). In the (highly unlikely) case of a collision, the node
>> creation in ZK will fail anyway and the broker can generate another id.
>>
>> On 10/2/13 9:52 AM, "Jay Kreps" <jay.kr...@gmail.com> wrote:
>>
>> >There are scenarios in which you want a hostname to change or you want
>> >to move the stored data off one machine onto another. This is why
>> >systems keep a layer of indirection between the location and the
>> >identity of the nodes.
>> >
>> >-Jay
>> >
>> >
>> >On Wed, Oct 2, 2013 at 9:23 AM, Guozhang Wang <wangg...@gmail.com>
>>wrote:
>> >
>> >> Wondering what the reason is for decoupling the node id from its
>> >> physical host (port)? If we found that, for example, node 1 does not
>> >> own any partitions, how would we know which physical machine that node
>> >> is?
>> >>
>> >> Guozhang
>> >>
>> >>
>> >> On Wed, Oct 2, 2013 at 9:07 AM, Jay Kreps <jay.kr...@gmail.com>
>>wrote:
>> >>
>> >> > I'm in favor of doing this if someone is willing to work on it! I
>> >> > agree it would really help with easy provisioning.
>> >> >
>> >> > I filed a bug to discuss and track:
>> >> > https://issues.apache.org/jira/browse/KAFKA-1070
>> >> >
>> >> > Some comments:
>> >> > 1. I'm not in favor of having a pluggable strategy, unless we are
>> >> > really really sure this is an area where people are going to get a
>> >> > lot of value by writing lots of plugins. I am not at all sure why you
>> >> > would want to retain the current behavior if you had a good strategy
>> >> > for automatically generating ids. Basically plugins are an evil we
>> >> > only want to accept when either we don't understand the problem or
>> >> > the solutions have such extreme tradeoffs that there is no single
>> >> > "good approach". Plugins cause problems for upgrades, testing,
>> >> > documentation, user understandability, code understandability, etc.
>> >> > 2. The node id can't be the host or port or anything tied to the
>> >> > physical machine or its location on the network because you need to
>> >> > be able to change these things. I recommend we just keep an integer.
>> >> >
>> >> > -Jay
>> >> >
>> >> >
>> >> > On Tue, Oct 1, 2013 at 7:08 AM, Aniket Bhatnagar <
>> >> > aniket.bhatna...@gmail.com
>> >> > > wrote:
>> >> >
>> >> > > Right. It is currently a Java integer. However, per the previous
>> >> > > thread, it seems possible to change it to a string. In that case,
>> >> > > we can use instance IDs, IP addresses, custom ID generators, etc.
>> >> > > How are you currently generating broker IDs from the IP address? A
>> >> > > Chef script or a custom shell script?
>> >> > >
>> >> > >
>> >> > > On 1 October 2013 18:34, Maxime Brugidou
>><maxime.brugi...@gmail.com
>> >
>> >> > > wrote:
>> >> > >
>> >> > > > I think it is currently a Java (signed) integer, or maybe this
>> >> > > > was ZooKeeper?
>> >> > > > We are generating the id from the IP address for now, but this is
>> >> > > > not ideal (and can cause integer overflow with Java signed ints).
>> >> > > > On Oct 1, 2013 12:52 PM, "Aniket Bhatnagar" <
>> >> > aniket.bhatna...@gmail.com>
>> >> > > > wrote:
>> >> > > >
>> >> > > > > I would like to revive an older thread around auto generating
>> >> > > > > broker IDs. As an AWS user, I would like Kafka to just use the
>> >> > > > > instance's ID, IP, or internal domain (whichever is easier).
>> >> > > > > This would mean I can easily clone from an AMI to launch Kafka
>> >> > > > > instances without having to worry about setting a unique
>> >> > > > > broker ID. This also allows me to set up auto scaling.
>> >> > > > >
>> >> > > > > I realize one size may not fit all in this case. Other
>> >> > > > > strategies that may work for other cloud providers include
>> >> > > > > generating a UUID and persisting it on disk, etc.
>> >> > > > >
>> >> > > > > What I propose is a way to define a broker ID generation
>> >> > > > > strategy in the configuration file, pointing to a class that
>> >> > > > > is responsible for generating the ID. Is this already being
>> >> > > > > worked on?
>> >> > > > >
>> >> > > >
>> >> > >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> -- Guozhang
>> >>
>>
>>
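
For reference, the overflow Maxime mentions when deriving an id from an IP
address can be reproduced with a small sketch. The helper names and the two
sample addresses are assumptions for illustration; the thread does not say
how the ids were actually derived. An IPv4 address is an unsigned 32-bit
value, while a Java int is signed and tops out at 2^31 - 1, so any address
whose first octet is 128 or higher wraps to a negative number.

```python
import ipaddress

JAVA_INT_MAX = 2**31 - 1  # largest value a signed 32-bit Java int can hold


def ip_to_unsigned(ip):
    """Pack a dotted-quad IPv4 address into its unsigned 32-bit value."""
    return int(ipaddress.IPv4Address(ip))


def as_java_int(value):
    """Reinterpret the low 32 bits of value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > JAVA_INT_MAX else value


low = ip_to_unsigned("10.0.0.5")       # first octet below 128: fits
high = ip_to_unsigned("192.168.1.10")  # first octet 128 or above: too large

print(low, as_java_int(low))    # 167772165 167772165
print(high, as_java_int(high))  # 3232235786 -1062731510
```

The second address exceeds the signed range and comes out negative, which is
one reason an IP-derived integer id is fragile.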
