[ 
https://issues.apache.org/jira/browse/KAFKA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17617017#comment-17617017
 ] 

Viktor Somogyi-Vass commented on KAFKA-14281:
---------------------------------------------

Thanks [~ottomata] for the linked evaluation; I think it's useful material. 
We're currently preparing our KIP, which I hope to publish in a few days. 
We'll create an algorithm that spreads replicas evenly across multiple levels, 
so it would also handle even assignment between racks. If you use Cruise 
Control, I think we'll publish an optimization goal there too if the planned 
KIP is adopted.

> Multi-level rack awareness
> --------------------------
>
>                 Key: KAFKA-14281
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14281
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 3.4.0
>            Reporter: Viktor Somogyi-Vass
>            Assignee: Viktor Somogyi-Vass
>            Priority: Major
>
> h1. Motivation
> With replication services, data can be replicated across independent Kafka 
> clusters in multiple data centers. In addition, many customers need "stretch 
> clusters" - a single Kafka cluster that spans across multiple data centers. 
> This architecture has the following useful characteristics:
>  - Data is natively replicated into all data centers by Kafka topic 
> replication.
>  - No data is lost when 1 DC is lost, and no configuration change is 
> required, because the design relies implicitly on native Kafka replication.
>  - From an operational point of view, it is much easier to configure and 
> operate such a topology than a replication scenario via MM2.
> Kafka should provide "native" support for stretch clusters, covering any 
> special operational aspects of a stretch cluster.
> h2. Multi-level rack awareness
> Additionally, stretch clusters are implemented using the rack awareness 
> feature, where each DC is represented as a rack. This ensures that replicas 
> are spread across DCs evenly. Unfortunately, there are cases where this is 
> too limiting: if there are actual racks inside the DCs, we cannot specify 
> them. Consider having 3 DCs with 2 racks each:
> /DC1/R1, /DC1/R2
> /DC2/R1, /DC2/R2
> /DC3/R1, /DC3/R2
> If we were to use DC1, DC2, DC3 as the rack IDs, we would lose the 
> rack-level information of the setup. This means that with RF=6, the 2 
> replicas assigned to DC1 may both end up in the same rack.
> If we were to use /DC1/R1, /DC1/R2, etc. as the rack IDs, then with RF=3 it 
> is possible that 2 replicas end up in the same DC, e.g. /DC1/R1, /DC1/R2, 
> /DC2/R1.
> Because of this, Kafka should support "multi-level" racks, which means that 
> rack IDs should be able to describe some kind of a hierarchy. With this 
> feature, brokers should be able to:
>  # spread replicas evenly based on the top level of the hierarchy (i.e. 
> first, between DCs)
>  # then inside a top-level unit (DC), if there are multiple replicas, they 
> should be spread evenly among lower-level units (i.e. between racks, then 
> between physical hosts, and so on)
>  ## repeat for all levels
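
The spreading steps listed above could be sketched roughly as follows. This is only an illustration of the idea, not the algorithm from the planned KIP and not existing Kafka code: the names parse_rack and pick_replicas are hypothetical, every rack ID is assumed to have the same number of levels, and leader balancing and per-partition rotation are ignored.

```python
def parse_rack(rack_id):
    """Split a hierarchical rack ID such as "/DC1/R1" into ("DC1", "R1")."""
    return tuple(part for part in rack_id.split("/") if part)


def pick_replicas(brokers, count, level=0):
    """Pick `count` broker IDs, spreading them evenly at every level.

    `brokers` maps a broker ID to its parsed rack path. Hypothetical helper:
    it assumes all paths have the same depth.
    """
    if count == 0:
        return []
    ids = sorted(brokers)
    if count > len(ids):
        raise ValueError("replication factor exceeds available brokers")
    if level == len(brokers[ids[0]]):
        # Below the last rack level only the hosts themselves remain.
        return ids[:count]

    # Group the brokers by the path component at the current level.
    groups = {}
    for broker_id in ids:
        groups.setdefault(brokers[broker_id][level], {})[broker_id] = brokers[broker_id]

    # Hand out replicas one at a time, round-robin, skipping full groups.
    share = dict.fromkeys(groups, 0)
    remaining = count
    while remaining:
        for name in sorted(groups):
            if remaining and share[name] < len(groups[name]):
                share[name] += 1
                remaining -= 1

    # Repeat the same procedure inside each group for the next level.
    chosen = []
    for name in sorted(groups):
        chosen += pick_replicas(groups[name], share[name], level + 1)
    return chosen


# 3 DCs with 2 racks each, as in the description, and 2 brokers per rack.
brokers = {}
next_id = 1
for dc in ("DC1", "DC2", "DC3"):
    for rack in ("R1", "R2"):
        for _ in range(2):
            brokers[next_id] = parse_rack(f"/{dc}/{rack}")
            next_id += 1

print(pick_replicas(brokers, 3))  # [1, 5, 9]: one replica per DC
print(pick_replicas(brokers, 6))  # [1, 3, 5, 7, 9, 11]: one replica per rack
```

With RF=3 each DC receives exactly one replica, and with RF=6 the two replicas in each DC land in different racks, which are the two cases the flat rack IDs above cannot both satisfy.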



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
