[
https://issues.apache.org/jira/browse/KAFKA-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on KAFKA-3436 started by Jiangjie Qin.
-------------------------------------------
> Speed up controlled shutdown.
> -----------------------------
>
> Key: KAFKA-3436
> URL: https://issues.apache.org/jira/browse/KAFKA-3436
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.9.0.0
> Reporter: Jiangjie Qin
> Assignee: Jiangjie Qin
> Fix For: 0.10.1.0
>
>
> Currently, rolling bouncing a Kafka cluster with tens of thousands of partitions
> can take a very long time (~2 min per broker with ~5000 partitions/broker in
> our environment). The majority of the time is spent shutting down a
> broker. The time to shut down a broker usually consists of the following
> parts:
> T1: During controlled shutdown, people usually want to make sure there are no
> under-replicated partitions, so shutting down a broker during a rolling
> bounce has to wait for the previously restarted broker to catch up. This
> is T1.
> T2: The time to send the controlled shutdown request and receive the controlled
> shutdown response. Currently a controlled shutdown request triggers
> many LeaderAndIsrRequests and UpdateMetadataRequests, and also involves many
> ZooKeeper updates performed serially.
> T3: The actual time to shut down all the components. It is usually small
> compared with T1 and T2.
> T1 is related to:
> A) the inbound throughput on the cluster, and
> B) the "down" time of the broker (the time between when the replica fetchers
> stop and when they restart).
> The larger the traffic, or the longer the broker stops fetching, the
> longer it will take for the broker to catch up and get back into the ISR,
> and therefore the longer T1 will be. Assume:
> * the inbound network traffic is X bytes/second on a broker
> * the "down" time mentioned in T1.B above is T
> Theoretically it will take (X * T) / (NetworkBandwidth - X) =
> InboundNetworkUtilization * T / (1 - InboundNetworkUtilization) for the
> broker to catch up after the restart. While X is out of our control, T is
> largely determined by T2.
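> As a rough, hypothetical illustration (the numbers are made up, not measured):
> with X = 40 MB/s of inbound traffic on a 100 MB/s link (40% utilization) and a
> down time of T = 60 seconds, catch-up takes about (40 * 60) / (100 - 40) =
> 0.4 * 60 / 0.6 = 40 seconds. So cutting T cuts the catch-up time, and thus T1,
> roughly proportionally.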
> The purpose of this ticket is to reduce T2 by:
> 1. Batching the LeaderAndIsrRequests and UpdateMetadataRequests during
> controlled shutdown.
> 2. Using async ZooKeeper writes to pipeline the ZooKeeper updates (see the
> sketch after this list). According to the ZooKeeper wiki
> (https://wiki.apache.org/hadoop/ZooKeeper/Performance), a 3-node ZK cluster
> should be able to handle about 20K writes/second (1K size), so with async
> writes we will likely be able to reduce the ZooKeeper update time to a few
> seconds or even a sub-second level.
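> For illustration only, here is a minimal sketch of the pipelined-write idea
> using the plain ZooKeeper async API. The class name, path list, and
> latch-based wait are assumptions made up for this example, not the actual
> controller code:
> {code:java}
> import java.util.List;
> import java.util.concurrent.CountDownLatch;
> import org.apache.zookeeper.AsyncCallback.StatCallback;
> import org.apache.zookeeper.KeeperException;
> import org.apache.zookeeper.ZooKeeper;
> import org.apache.zookeeper.data.Stat;
>
> public class PipelinedZkWrites {
>
>   // Issue all writes without blocking on each one, then wait once for all
>   // of the callbacks. This is the "pipelining" that async writes buy us,
>   // compared with one synchronous round trip per znode.
>   public static void writeAll(ZooKeeper zk, List<String> paths, byte[] data)
>       throws InterruptedException {
>     final CountDownLatch remaining = new CountDownLatch(paths.size());
>     StatCallback callback = new StatCallback() {
>       @Override
>       public void processResult(int rc, String path, Object ctx, Stat stat) {
>         if (rc != KeeperException.Code.OK.intValue()) {
>           // A real implementation would retry or surface the failure.
>           System.err.println("Async write to " + path + " failed, rc=" + rc);
>         }
>         remaining.countDown();
>       }
>     };
>     for (String path : paths) {
>       // version -1 means "match any version"; real ISR updates would pass
>       // the expected znode version for a conditional update.
>       zk.setData(path, data, -1, callback, null);
>     }
>     remaining.await(); // all writes have been acknowledged (or failed)
>   }
> }
> {code}
> The point is that the N writes overlap on the wire and are bounded by ZK's
> write throughput rather than by N sequential round-trip latencies.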
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)