Replica assignments can change during topic creation, adding partitions, and reassign partitions. So, it would be good to integrate rack-aware assignment in those cases.
For compatibility, to do a rolling upgrade of a Kafka cluster means that at a given point of time, some brokers could be registered in the old format and some others could be registered in the new format. Both the consumer and the controller ( https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Controller+Internals) read the broker registration in ZK. So, we need to make sure that the new code can read the old format and the old code can read the new format too. For example, in Broker.createBroker(), if the code is looking for a field "rackid" but couldn't find it, we need to be able to provide a reasonable default. Also, take a look at the format of broker registration in ZK in https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper. Since we are changing the format, we should probably change the version from 1 to 2. However, we can make this change backward compatible. Thanks, Jun On Fri, Jan 24, 2014 at 1:57 PM, Joris VanRemoortere < jvanremoort...@tagged.com> wrote: > Working on the above. > > Since trunk currently allows Topic configs, and altering topics, does it > make sense to you to make this a topic config so that it is kept in sync > between topic alterations and partition adds? > > For compatibility: I can make the zookeeper state parsing backwards / > forwards compatible. I don't think I can make the readFrom / writeTo > ByteBuffer compatible since chained calls of these during api parsing are > size dependent? Can you clarify exactly what you expect to be compatible? > > I'm also not very familiar with the consequences of changing the broker > version in zookeeper registration. Can you please provide a reference to > documentation for this? > > Thanks, > > Joris > > > On Fri, Jan 24, 2014 at 9:43 AM, Jun Rao (JIRA) <j...@apache.org> wrote: > > > > > [ > > > https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881193#comment-13881193 > ] > > > > Jun Rao commented on KAFKA-1215: > > -------------------------------- > > > > Thanks for the patch. Looks good over all. Some comments. > > > > 1. KafkaConfig: > > 1.1 We need a config for default max-rack-replication for auto topic > > creation. > > 1.2 rackId: We probably don't want to make this a required property. So, > > perhaps we can default it to 0? > > > > 2. AdminUtils.assignReplicasToBrokers(): > > 2.1 Could you add some comments on the rack-aware assignment algorithm? > > 2.2 It's a bit weird for this method to take zkclient in the input. We > > probably can pass in a list of Broker objects instead. > > > > 3. Unit tests: I suggest that we leave most existing tests intact by > > keeping the rackId default and add a new test for rack-aware assignment. > > > > 4. Compatibility test: It seems that the changes for the broker format in > > ZK is backward compatible. Could you double check? For example, an old > > reader (controller, consumer, etc) should be able to parse the broker > > registered in new format and a new reader should be able to parse the > > broker registered in the old format. Also, we probably should increase > the > > version in the ZK registration for the broker. > > > > 5. Could you rebase to trunk? > > > > > Rack-Aware replica assignment option > > > ------------------------------------ > > > > > > Key: KAFKA-1215 > > > URL: https://issues.apache.org/jira/browse/KAFKA-1215 > > > Project: Kafka > > > Issue Type: Improvement > > > Components: replication > > > Affects Versions: 0.8.0, 0.8.1 > > > Reporter: Joris Van Remoortere > > > Assignee: Neha Narkhede > > > Fix For: 0.8.0, 0.8.1 > > > > > > Attachments: rack_aware_replica_assignment_v1.patch > > > > > > > > > Adding a rack-id to kafka config. This rack-id can be used during > > replica assignment by using the max-rack-replication argument in the > admin > > scripts (create topic, etc.). By default the original replication > > assignment algorithm is used because max-rack-replication defaults to -1. > > max-rack-replication > -1 is not honored if you are doing manual replica > > assignment (preffered). > > > If this looks good I can add some test cases specific to the rack-aware > > assignment. > > > I can also port this to trunk. We are currently running 0.8.0 in > > production and need this, so i wrote the patch against that. > > > > > > > > -- > > This message was sent by Atlassian JIRA > > (v6.1.5#6160) > > >