[ 
https://issues.apache.org/jira/browse/KAFKA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17613016#comment-17613016
 ] 

Andrew Otto commented on KAFKA-14281:
-------------------------------------

+1.  The Wikimedia Foundation is about to experiment with a Kafka stretch 
setup, and as things stand we'll end up using only DC-level information 
instead of rack-level, so that we can ensure replicas are spread evenly 
between DCs.  It'd be nice to spread evenly between racks within DCs too.

(FWIW, context for our evaluation is 
[here|https://phabricator.wikimedia.org/T307944].)

 

> Multi-level rack awareness
> --------------------------
>
>                 Key: KAFKA-14281
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14281
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 3.4.0
>            Reporter: Viktor Somogyi-Vass
>            Assignee: Viktor Somogyi-Vass
>            Priority: Major
>
> h1. Motivation
> With replication services, data can be replicated across independent Kafka 
> clusters in multiple data centers. In addition, many customers need "stretch 
> clusters": a single Kafka cluster that spans multiple data centers. 
> This architecture has the following useful characteristics:
>  - Data is natively replicated into all data centers by Kafka topic 
> replication.
>  - No data is lost when 1 DC is lost, and no configuration change is 
> required, because the design relies implicitly on native Kafka replication.
>  - From an operational point of view, it is much easier to configure and 
> operate such a topology than a replication scenario via MM2.
> Kafka should provide "native" support for stretch clusters, covering any 
> special aspects of operating a stretch cluster.
> h2. Multi-level rack awareness
> Today, stretch clusters are implemented using the rack awareness 
> feature, where each DC is represented as a rack. This ensures that replicas 
> are spread across DCs evenly. Unfortunately, there are cases where this is 
> too limiting: when there are actual racks inside the DCs, we cannot 
> specify them. Consider having 3 DCs with 2 racks each:
> /DC1/R1, /DC1/R2
> /DC2/R1, /DC2/R2
> /DC3/R1, /DC3/R2
> If we were to use DC1, DC2, DC3 as the racks, we would lose the rack-level 
> information of the setup. This means that, with RF=6, the 2 replicas 
> assigned to DC1 may both end up in the same rack.
> If we were to use /DC1/R1, /DC1/R2, etc. as the racks, then with RF=3 it is 
> possible that 2 replicas end up in the same DC, e.g. /DC1/R1, /DC1/R2, 
> /DC2/R1.
> Because of this, Kafka should support "multi-level" racks, which means that 
> rack IDs should be able to describe some kind of a hierarchy. With this 
> feature, brokers should be able to:
>  # spread replicas evenly based on the top level of the hierarchy (i.e. 
> first, between DCs)
>  # then inside a top-level unit (DC), if there are multiple replicas, they 
> should be spread evenly among lower-level units (i.e. between racks, then 
> between physical hosts, and so on)
>  ## repeat for all levels
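
For illustration only, the two steps listed in the issue could be sketched 
roughly as below. This is not Kafka's assignment code and not the design 
proposed in the ticket. The function names (spread_replicas, _pick) and the 
input shape (a dict of broker id to a "/DC/rack" path) are hypothetical, and 
the sketch picks replicas for a single partition only, without the 
per-partition rotation a real assignor would need.

```python
def spread_replicas(brokers, rf):
    """Pick `rf` broker ids, spreading them level by level.

    `brokers` maps a broker id to a hierarchical rack path such as
    "/DC1/R1" (hypothetical input format, assumed for this sketch).
    """
    if rf > len(brokers):
        raise ValueError("replication factor exceeds broker count")
    # Split each path into its levels, e.g. "/DC1/R1" -> ("DC1", "R1").
    items = [
        (tuple(part for part in path.split("/") if part), broker_id)
        for broker_id, path in sorted(brokers.items())
    ]
    return _pick(items, rf, depth=0)


def _pick(items, count, depth):
    # Group brokers by the name of their unit at this level of the hierarchy.
    groups = {}
    for parts, broker_id in items:
        key = parts[depth] if depth < len(parts) else None
        groups.setdefault(key, []).append((parts, broker_id))

    # No levels left: these brokers are interchangeable, take the first ones.
    if list(groups) == [None]:
        return [broker_id for _, broker_id in items[:count]]

    # Hand out replicas one at a time, round-robin over the units, never
    # giving a unit more replicas than it has brokers.
    quota = {key: 0 for key in groups}
    remaining = count
    while remaining > 0:
        for key, members in groups.items():
            if remaining == 0:
                break
            if quota[key] < len(members):
                quota[key] += 1
                remaining -= 1

    # Repeat the same procedure inside each unit for the next level down.
    chosen = []
    for key, members in groups.items():
        chosen.extend(_pick(members, quota[key], depth + 1))
    return chosen


# 3 DCs with 2 racks each, 2 brokers per rack (ids 1..12).
brokers = {
    i: f"/DC{(i - 1) // 4 + 1}/R{((i - 1) // 2) % 2 + 1}" for i in range(1, 13)
}
print(spread_replicas(brokers, 3))
print(spread_replicas(brokers, 6))
```

With the layout from the issue, RF=3 places one replica in each DC, and 
RF=6 places one replica in each rack of each DC. These are the two cases 
that flat rack IDs cannot satisfy at the same time.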



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
