[ https://issues.apache.org/jira/browse/KAFKA-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658685#comment-14658685 ]
Jun Rao commented on KAFKA-2406:
--------------------------------

[~becket_qin], another way to reduce the overhead is to batch on the broker side. Every time a broker changes an ISR, it just saves the partition in an in-memory map. Periodically, the replica manager can collect all partitions in the map and write a single node in the ISR change path. This should reduce the number of UpdateMetadataRequests the controller sends to the brokers. It also reduces the number of ZK nodes to be deleted and the number of times the ISR change watchers are triggered.

Also, could you provide a bit more detail on the performance issue? Does the issue happen when shutting down the first broker in the cluster, or when another broker has just been restarted? It would also be good to know where the bottleneck is: just in the number of UpdateMetadataRequests, or in processing the watchers as well.

> ISR propagation should be throttled to avoid overwhelming controller.
> ---------------------------------------------------------------------
>
>                 Key: KAFKA-2406
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2406
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jiangjie Qin
>            Assignee: Jiangjie Qin
>            Priority: Blocker
>
> This is a follow up patch for KAFKA-1367.
> We need to throttle the ISR propagation rate to avoid flooding in controller
> to broker traffic. This might significantly increase time of controlled
> shutdown or cluster startup.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
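
For illustration only, the broker-side batching proposed in the comment could look roughly like the sketch below. It is not Kafka's actual code: the class name IsrChangeBatcher, the methods recordIsrChange and flush, and the writer callback (standing in for the single ZK write to the ISR change path) are all hypothetical, and a set is used in place of the in-memory map since only the partition names are needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;

// Hypothetical helper, not Kafka's actual code: collects partitions whose ISR
// changed and publishes them as one batch per flush.
final class IsrChangeBatcher {
    // Pending partitions, e.g. "topic-0". Using a set means repeated ISR
    // changes to the same partition between flushes collapse into one entry.
    private final Set<String> pending = new TreeSet<>();

    // Assumed stand-in for "write a single node in the ISR change path".
    private final Consumer<List<String>> writer;

    IsrChangeBatcher(Consumer<List<String>> writer) {
        this.writer = writer;
    }

    // Called whenever the broker changes an ISR: in-memory only, no ZK write.
    synchronized void recordIsrChange(String topicPartition) {
        pending.add(topicPartition);
    }

    // Called periodically, e.g. from a scheduler thread.
    // Returns the number of partitions published in this batch.
    synchronized int flush() {
        if (pending.isEmpty()) {
            return 0; // nothing changed: no node written, no watcher triggered
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        writer.accept(batch); // one write for the whole batch
        return batch.size();
    }
}
```

With this shape, the number of nodes written (and so the number of watcher firings on the controller) is bounded by the flush rate rather than by the number of ISR changes. The flush interval is a tuning choice that the comment does not specify.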