[ https://issues.apache.org/jira/browse/KAFKA-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976895#comment-16976895 ]
Vahid Hashemian commented on KAFKA-9205:
----------------------------------------

This will still likely require a KIP since the default behavior could change.

> Add an option to enforce rack-aware partition reassignment
> ----------------------------------------------------------
>
>                 Key: KAFKA-9205
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9205
>             Project: Kafka
>          Issue Type: Improvement
>          Components: admin, tools
>            Reporter: Vahid Hashemian
>            Priority: Minor
>
> One regularly used healing operation on Kafka clusters is replica
> reassignment for topic partitions. For example, when there is a skew in
> the inbound/outbound traffic of a broker, replica reassignment can be used
> to move some leaders/followers off that broker; or, if there is a skew in
> disk usage across brokers, replica reassignment can move some partitions to
> other brokers that have more disk space available.
> In Kafka clusters that span multiple data centers (or availability zones),
> high availability is a priority, in the sense that when a data center goes
> offline the cluster should be able to resume normal operation. This is
> achieved by guaranteeing that each partition has replicas in all data
> centers.
> This guarantee is currently the responsibility of either the on-call
> engineer who performs the reassignment or the tool that automatically
> generates the reassignment plan for improving cluster health (e.g. by
> considering the rack configuration value of each broker in the cluster).
> The former is quite error-prone, and the latter would lead to duplicate
> code in all such admin tools (which are not error-free either). Not all use
> cases can make use of the default assignment strategy used by the
> --generate option, and the current rack-aware enforcement applies to this
> option only.
> It would be great for the built-in replica assignment API and tool provided
> by Kafka to support a rack-aware verification option for the --execute
> scenario that would simply return an error when [some] brokers in any
> replica set share a common rack.
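
The verification proposed above can be sketched roughly as follows. This is a minimal illustration, not Kafka code: the function name `find_rack_violations`, the plan layout (partition name mapped to a list of broker ids), and the choice to ignore brokers that have no rack configured are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Optional


def find_rack_violations(
    replicas_by_partition: Dict[str, List[int]],
    rack_by_broker: Dict[int, Optional[str]],
) -> Dict[str, List[str]]:
    """Return, per partition, the racks that would host more than one replica.

    Hypothetical helper: brokers with no known rack are skipped (an assumption).
    """
    violations: Dict[str, List[str]] = {}
    for partition, replicas in replicas_by_partition.items():
        # Count how many replicas of this partition land on each rack.
        rack_counts = Counter(rack_by_broker.get(broker) for broker in replicas)
        shared = sorted(
            rack
            for rack, count in rack_counts.items()
            if rack is not None and count > 1
        )
        if shared:
            violations[partition] = shared
    return violations


# Example: brokers 1 and 2 are both in rack "dc1".
racks = {1: "dc1", 2: "dc1", 3: "dc2", 4: "dc3"}
plan = {"t-0": [1, 3, 4], "t-1": [1, 2, 3]}
violations = find_rack_violations(plan, racks)
if violations:
    print("rack-aware check failed:", violations)
```

A tool using such a check would refuse to execute the plan whenever the result is non-empty. Whether any shared rack should be an error, or only some cases (for instance when the replication factor exceeds the number of racks), is a policy decision the issue leaves open.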