Vahid Hashemian created KAFKA-9205:
--------------------------------------
Summary: Add an option to enforce rack-aware partition reassignment
Key: KAFKA-9205
URL: https://issues.apache.org/jira/browse/KAFKA-9205
Project: Kafka
Issue Type: Improvement
Components: admin, tools
Reporter: Vahid Hashemian
One regularly used healing operation on Kafka clusters is replica reassignment
for topic partitions. For example, when there is a skew in the inbound/outbound
traffic of a broker, replica reassignment can be used to move some
leaders/followers off that broker; or if there is a skew in disk usage across
brokers, replica reassignment can move some partitions to other brokers that
have more disk space available.
In Kafka clusters that span multiple data centers (or availability zones),
high availability is a priority, in the sense that when a data center goes
offline the cluster should be able to resume normal operation. This requires
guaranteeing that partitions have replicas in all data centers.
This guarantee is currently the responsibility of the on-call engineer who
performs the reassignment, or of the tool that automatically generates the
reassignment plan for improving the cluster health (e.g. by considering the
rack configuration value of each broker in the cluster). The former is quite
error-prone, and the latter would lead to duplicate code in all such admin
tools (which are not error-free either).
It would be great for the built-in replica assignment API and tool provided by
Kafka to support a rack-aware verification option that would simply return an
error when [some] brokers in any replica set share a common rack.
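
To illustrate the kind of check being requested, here is a minimal sketch in
Python. It is not Kafka's implementation: the function name
find_rack_violations and the broker_racks mapping are hypothetical, and the
plan is assumed to follow the JSON layout accepted by the partition
reassignment tool (a "partitions" list with "topic", "partition" and
"replicas" fields). Brokers without a configured rack are ignored here, which
is also an assumption.

```python
from collections import defaultdict


def find_rack_violations(plan, broker_racks):
    """Return (topic, partition, rack, brokers) for each rack that hosts
    more than one replica of the same partition.

    plan         -- dict shaped like a reassignment JSON file (assumed layout)
    broker_racks -- hypothetical mapping of broker id -> rack name
    """
    violations = []
    for entry in plan["partitions"]:
        # Group this partition's replicas by the rack of their broker.
        by_rack = defaultdict(list)
        for broker in entry["replicas"]:
            by_rack[broker_racks.get(broker)].append(broker)
        for rack, brokers in by_rack.items():
            # Assumption: brokers with no rack (None) are not checked.
            if rack is not None and len(brokers) > 1:
                violations.append(
                    (entry["topic"], entry["partition"], rack, brokers)
                )
    return violations


# Illustrative data only; broker ids and rack names are made up.
broker_racks = {1: "dc-a", 2: "dc-a", 3: "dc-b", 4: "dc-c"}
plan = {
    "version": 1,
    "partitions": [
        {"topic": "orders", "partition": 0, "replicas": [1, 3, 4]},
        {"topic": "orders", "partition": 1, "replicas": [1, 2, 3]},
    ],
}

violations = find_rack_violations(plan, broker_racks)
for topic, partition, rack, brokers in violations:
    print(f"{topic}-{partition}: brokers {brokers} share rack {rack}")
```

With an enforcement option of this kind, a plan producing any such violation
would be rejected with an error instead of being executed.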