[ https://issues.apache.org/jira/browse/KAFKA-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673651#comment-13673651 ]
Jun Rao commented on KAFKA-927:
-------------------------------

Thanks for patch v3. A few more comments:

30. KafkaServer:
30.1 Could you combine isShuttingDown and startupComplete?
30.2 In controlledShutdown(), it's not clear that caching the socket channel is worth it. Technically, it's possible for a controller to come back on a broker with the same id but a different broker host/port. It's simpler to always close the socket channel after each ControlledShutdownRequest and create a new channel on retry.

31. KafkaController:
31.1 Remove unused import java.util.concurrent.{Semaphore
31.2 I think we still need to set shuttingDownBrokerIds to empty in onControllerFailover(). A controller may fail over during a controlled shutdown and later regain the controllership. OnBrokerFailure() is only called if the controller is active, so shuttingDownBrokerIds may not be empty when the controllership switches back.

> Integrate controlled shutdown into kafka shutdown hook
> ------------------------------------------------------
>
>                 Key: KAFKA-927
>                 URL: https://issues.apache.org/jira/browse/KAFKA-927
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Sriram Subramanian
>            Assignee: Sriram Subramanian
>         Attachments: KAFKA-927.patch, KAFKA-927-v2.patch,
>                      KAFKA-927-v2-revised.patch, KAFKA-927-v3.patch
>
>
> The controlled shutdown mechanism should be integrated into the software for
> better operational benefits. Also, a few optimizations can be done to reduce
> unnecessary RPC and ZK calls. This patch has been tested in a prod-like
> environment by doing rolling bounces continuously for a day. The average time
> for a rolling bounce with controlled shutdown on a 7-node cluster is 340
> seconds without this patch. With this patch it drops to 220 seconds.
> It also ensures correctness in scenarios where the controller shrinks the ISR
> and the new leader could place the broker being shut down back into the ISR.