Pengwei created KAFKA-3585:
------------------------------

             Summary: Shutdown slow when there is only one broker which is 
controller
                 Key: KAFKA-3585
                 URL: https://issues.apache.org/jira/browse/KAFKA-3585
             Project: Kafka
          Issue Type: Improvement
    Affects Versions: 0.9.0.1
            Reporter: Pengwei
            Priority: Minor
             Fix For: 0.10.0.1


Steps to reproduce:
1. Set up a 3-broker cluster.

2. Create a topic with 3 partitions.

3. Shut down the brokers one by one. The last broker shuts down very slowly
because of the following error:
[2016-04-19 20:30:19,168] INFO [Kafka Server 1], Remaining partitions to move: 
__consumer_offsets-48,__consumer_offsets-13,__consumer_offsets-46,__consumer_offsets-11,__consumer_offsets-44,__consumer_offsets-42,__consumer_offsets-21,__consumer_offsets-19,__consumer_offsets-32,__consumer_offsets-30,__consumer_offsets-28,__consumer_offsets-26,__consumer_offsets-7,__consumer_offsets-40,__consumer_offsets-38,__consumer_offsets-36,__consumer_offsets-1,__consumer_offsets-34,__consumer_offsets-16,__consumer_offsets-45,__consumer_offsets-14,__consumer_offsets-12,__consumer_offsets-41,__consumer_offsets-10,__consumer_offsets-24,__consumer_offsets-22,__consumer_offsets-20,__consumer_offsets-49,__consumer_offsets-18,__consumer_offsets-31,__consumer_offsets-0,test2-0,__consumer_offsets-27,__consumer_offsets-39,__consumer_offsets-8,__consumer_offsets-37,__consumer_offsets-6,__consumer_offsets-4,__consumer_offsets-2
 (kafka.server.KafkaServer)
[2016-04-19 20:30:19,169] INFO [Kafka Server 1], Error code from controller: 0 
(kafka.server.KafkaServer)
[2016-04-19 20:30:24,169] WARN [Kafka Server 1], Retrying controlled shutdown 
after the previous attempt failed... (kafka.server.KafkaServer)
[2016-04-19 20:30:24,171] WARN [Kafka Server 1], Proceeding to do an unclean 
shutdown as all the controlled shutdown attempts failed 
(kafka.server.KafkaServer)

The length of the delay is determined by these broker settings:
controlled.shutdown.retry.backoff.ms=5000
controlled.shutdown.max.retries=3
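
As a rough illustration of how these two settings add up, the sketch below
computes the total time spent in retry backoff. It assumes the broker sleeps
for the full backoff after every failed attempt, which is what the 5-second
gap in the log above suggests. The class and method names are hypothetical and
are not part of Kafka.

```java
// Hypothetical helper, not Kafka code: estimates the time spent sleeping
// between failed controlled shutdown attempts.
final class ControlledShutdownDelay {

    // Assumption: one full backoff sleep follows each failed attempt.
    static long worstCaseBackoffMs(int maxRetries, long retryBackoffMs) {
        return (long) maxRetries * retryBackoffMs;
    }

    public static void main(String[] args) {
        // Values taken from the settings quoted in this report.
        long delayMs = worstCaseBackoffMs(3, 5000L);
        System.out.println("Backoff before unclean shutdown: " + delayMs + " ms");
    }
}
```

Under that assumption, the settings above cost about 15 seconds of backoff
before the broker gives up, not counting the time of the requests themselves.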


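Shutdown is slow because the last broker cannot elect a new leader for its
remaining partitions, so every controlled shutdown attempt fails and is
retried. The last broker could shut down quickly instead: when it is the last
broker, we can skip the controlled shutdown error rather than retry.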
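
A minimal sketch of the proposed check, assuming the shutting-down broker can
see the set of live broker ids. The class name, method name, and parameters
are hypothetical; this is not Kafka's implementation or the eventual patch.

```java
import java.util.Set;

// Hypothetical sketch, not Kafka code: decide whether a controlled shutdown
// is worth attempting at all.
final class LastBrokerShutdownSketch {

    // Returns true only if at least one other live broker exists that could
    // take over leadership of the partitions led by this broker.
    static boolean shouldAttemptControlledShutdown(Set<Integer> liveBrokerIds, int selfId) {
        for (int id : liveBrokerIds) {
            if (id != selfId) {
                return true;
            }
        }
        // No other live broker: leader election cannot succeed, so retrying
        // would only add backoff delay before the shutdown proceeds.
        return false;
    }
}
```

With such a check, the last broker in the cluster would go straight to its
normal shutdown path without waiting through the retries.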



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
