[ 
https://issues.apache.org/jira/browse/KAFKA-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-7537.
----------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0

Merged the PR to trunk.

> Only include live brokers in the UpdateMetadataRequest sent to existing 
> brokers if there is no change in the partition states
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7537
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7537
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>            Reporter: Zhanxiang (Patrick) Huang
>            Assignee: Zhanxiang (Patrick) Huang
>            Priority: Major
>             Fix For: 2.2.0
>
>
> Currently if when brokers join/leave the cluster without any partition states 
> changes, controller will send out UpdateMetadataRequests containing the 
> states of all partitions to all brokers. But for existing brokers in the 
> cluster, the metadata diff between controller and the broker should only be 
> the "live_brokers" info. Only the brokers with empty metadata cache need the 
> full UpdateMetadataRequest. Sending the full UpdateMetadataRequest to all 
> brokers can place nonnegligible memory pressure on the controller side.
> Let's say in total we have N brokers, M partitions in the cluster and we want 
> to add 1 brand new broker in the cluster. With RF=2, the memory footprint per 
> partition in the UpdateMetadataRequest is ~200 Bytes. In the current 
> controller implementation, if each of the N RequestSendThreads serializes and 
> sends out the UpdateMetadataRequest at roughly the same time (which is very 
> likely the case), we will end up using *(N+1)*M*200B*. In a large kafka 
> cluster, we can have:
> {noformat}
> N=99
> M=100k
> Memory usage to send out UpdateMetadataRequest to all brokers:
> 100 * 100K * 200B = 2G
> However, we only need to send out full UpdateMetadataRequest to the newly 
> added broker. We only need to include live broker ids (4B * 100 brokers) in 
> the UpdateMetadataRequest sent to the existing 99 brokers. So the amount of 
> data that is actully needed will be:
> 1 * 100K * 200B + 99 * (100 * 4B) = ~21M
> We will can potentially reduce 2G / 21M = ~95x memory footprint as well as 
> the data tranferred in the network.{noformat}
>  
> This issue kind of hurts the scalability of a kafka cluster. KIP-380 and 
> KAFKA-7186 also help to further reduce the controller memory footprint.
>  
> In terms of implementation, we can keep some in-memory state in the 
> controller side to differentiate existing brokers and uninitialized brokers 
> (e.g. brand new brokers) so that if there is no change in partition states, 
> we only send out live brokers info to existing brokers.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to