Jason - You should be able to solve that with Jay's proposal below. If you just persist the id in a meta file, you can copy the meta file over to the new broker, and the broker will not regenerate another id.
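A minimal sketch of the meta-file idea, under stated assumptions: the file name `meta.properties`, its one-line `broker.id=` format, and the helper `load_or_create_broker_id` are all hypothetical and are not Kafka's actual implementation. An id is generated only when no meta file exists, so copying the file to a replacement machine carries the identity with it.

```python
import os
import random
import tempfile

META_FILE = "meta.properties"  # hypothetical file name, not Kafka's actual layout


def load_or_create_broker_id(data_dir, generate=lambda: random.randint(0, 2**31 - 1)):
    """Return the persisted broker id, generating and saving one only if absent."""
    path = os.path.join(data_dir, META_FILE)
    if os.path.exists(path):
        with open(path) as f:
            for line in f:
                key, _, value = line.strip().partition("=")
                if key == "broker.id":
                    return int(value)
        raise ValueError(f"{path} has no broker.id entry")
    broker_id = generate()
    os.makedirs(data_dir, exist_ok=True)
    # Write to a temp file and rename so a crash never leaves a half-written id.
    fd, tmp_path = tempfile.mkstemp(dir=data_dir)
    with os.fdopen(fd, "w") as f:
        f.write(f"broker.id={broker_id}\n")
    os.replace(tmp_path, path)
    return broker_id
```

Calling the function twice on the same directory returns the same id, and a new directory that received a copy of the meta file reuses the old id instead of generating a fresh one.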
On 10/2/13 11:10 AM, "Jason Rosenberg" <j...@squareup.com> wrote:

>I recently moved away from generating a unique brokerId for each node, in
>favor of assigning ids in configuration. The reason for this is that in
>0.8, there isn't a convenient way yet to reassign partitions to a new
>brokerId, should one broker have a failure. So, it seems the only
>work-around at the moment is to bring up a replacement broker, assign it
>the same brokerId as one that has failed and is no longer running. The
>cluster will then automatically replicate all the partitions that were
>assigned to the failed broker to the new broker.
>
>This appears the only operational way to deal with failed brokers, at the
>moment.
>
>Longer term, it would be great if the cluster were self-healing, and if a
>broker went down, we could mark it as no longer available somehow, and the
>cluster would then reassign and re-replicate partitions to new brokers,
>that were previously assigned to the failed broker. I expect something
>like this will be available in future versions, but that doesn't appear
>the case at present.
>
>And related, it would be nice, in the interests of horizontal scalability,
>to have an easy way for the cluster to dynamically rebalance load, if new
>nodes are added to the cluster (or to at least prefer assigning new
>partitions to brokers which have more space available). I expect this
>will be something to prioritize in the future versions as well.
>
>Jason
>
>
>On Wed, Oct 2, 2013 at 1:00 PM, Sriram Subramanian <
>srsubraman...@linkedin.com> wrote:
>
>> I agree that we need a unique id and have something independent of the
>> machine. I am not sure you want a dependency on ZK to generate the unique
>> id though. There are other ways to generate a unique id (Example - UUID).
>> In case there was a collision (highly unlikely), the node creation in ZK
>> will anyways fail and the broker can regenerate another id.
>>
>> On 10/2/13 9:52 AM, "Jay Kreps" <jay.kr...@gmail.com> wrote:
>>
>> >There are scenarios in which you want a hostname to change or you want to
>> >move the stored data off one machine onto another. This is the motivation
>> >systems have for having a layer of indirection between the location and
>> >the identity of the nodes.
>> >
>> >-Jay
>> >
>> >
>> >On Wed, Oct 2, 2013 at 9:23 AM, Guozhang Wang <wangg...@gmail.com> wrote:
>> >
>> >> Wondering what is the reason behind decoupling the node id with its
>> >> physical host(port)? If we found that for example, node 1 is not owning
>> >> any partitions, how would we know which physical machine is this node
>> >> then?
>> >>
>> >> Guozhang
>> >>
>> >>
>> >> On Wed, Oct 2, 2013 at 9:07 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
>> >>
>> >> > I'm in favor of doing this if someone is willing to work on it! I
>> >> > agree it would really help with easy provisioning.
>> >> >
>> >> > I filed a bug to discuss and track:
>> >> > https://issues.apache.org/jira/browse/KAFKA-1070
>> >> >
>> >> > Some comments:
>> >> > 1. I'm not in favor of having a pluggable strategy, unless we are
>> >> > really really sure this is an area where people are going to get a
>> >> > lot of value by writing lots of plugins. I am not at all sure why you
>> >> > would want to retain the current behavior if you had a good strategy
>> >> > for automatically generating ids. Basically plugins are an evil we
>> >> > only want to accept when either we don't understand the problem or
>> >> > the solutions have such extreme tradeoffs that there is no single
>> >> > "good approach". Plugins cause problems for upgrades, testing,
>> >> > documentation, user understandability, code understandability, etc.
>> >> > 2. The node id can't be the host or port or anything tied to the
>> >> > physical machine or its location on the network because you need to
>> >> > be able to change these things. I recommend we just keep an integer.
>> >> >
>> >> > -Jay
>> >> >
>> >> >
>> >> > On Tue, Oct 1, 2013 at 7:08 AM, Aniket Bhatnagar
>> >> > <aniket.bhatna...@gmail.com> wrote:
>> >> >
>> >> > > Right. It is currently a Java integer. However, as per the previous
>> >> > > thread, it seems possible to change it to a string. In that case,
>> >> > > we can use instance IDs, IP addresses, custom ID generators, etc.
>> >> > > How are you currently generating broker IDs from IP address? Chef
>> >> > > script or custom shell script?
>> >> > >
>> >> > >
>> >> > > On 1 October 2013 18:34, Maxime Brugidou <maxime.brugi...@gmail.com>
>> >> > > wrote:
>> >> > >
>> >> > > > I think it currently is a java (signed) integer or maybe this was
>> >> > > > zookeeper?
>> >> > > > We are generating the id from IP address for now but this is not
>> >> > > > ideal (and can cause integer overflow with java signed ints)
>> >> > > > On Oct 1, 2013 12:52 PM, "Aniket Bhatnagar"
>> >> > > > <aniket.bhatna...@gmail.com> wrote:
>> >> > > >
>> >> > > > > I would like to revive an older thread around auto generating
>> >> > > > > broker ID. As an AWS user, I would like Kafka to just use the
>> >> > > > > instance's ID or instance's IP or instance's internal domain
>> >> > > > > (whichever is easier). This would mean I can easily clone from
>> >> > > > > an AMI to launch Kafka instances without having to worry about
>> >> > > > > setting a unique broker ID. This also allows me to set up auto
>> >> > > > > scaling.
>> >> > > > >
>> >> > > > > I realize 1 size may not fit all in this case.
>> >> > > > > Other strategies that may work for other cloud providers are
>> >> > > > > to generate a UUID and persist it on disk, etc.
>> >> > > > >
>> >> > > > > What I propose is a way to define a broker ID generation
>> >> > > > > strategy in the configuration file which points to a class file
>> >> > > > > that is responsible for generating the ID. Is this something
>> >> > > > > being already worked upon?
>> >> > > > >
>> >> > > >
>> >> > >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> -- Guozhang
>> >>
>> >>
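Maxime's note in the thread, that deriving the id from an IP address can overflow a Java signed int, can be checked with a little arithmetic. The sketch below assumes the simplest possible scheme, packing the four IPv4 octets into one 32-bit value; the thread does not say how the id is actually derived, and `ip_to_java_int` is a hypothetical helper. Any address at or above 128.0.0.0 exceeds 2^31 - 1 and wraps to a negative number.

```python
import ipaddress

JAVA_INT_MAX = 2**31 - 1  # largest value a Java signed int can hold


def ip_to_java_int(ip):
    """Pack an IPv4 address into 32 bits and reinterpret it as a Java signed int."""
    unsigned = int(ipaddress.IPv4Address(ip))
    # Values above JAVA_INT_MAX wrap around to negative numbers in a signed int.
    return unsigned - 2**32 if unsigned > JAVA_INT_MAX else unsigned


print(ip_to_java_int("10.0.0.1"))      # 167772161
print(ip_to_java_int("192.168.1.10"))  # -1062731510
```

A private 10.x.x.x address fits, but a 192.168.x.x address does not, which is one reason an id independent of the network location is easier to reason about.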