GEORGE LI created KAFKA-8903:
--------------------------------

             Summary: allow the new replica (offset 0) to catch up with current 
leader using latest offset
                 Key: KAFKA-8903
                 URL: https://issues.apache.org/jira/browse/KAFKA-8903
             Project: Kafka
          Issue Type: Improvement
          Components: config, core
    Affects Versions: 2.3.0, 1.1.1, 1.1.0
            Reporter: GEORGE LI
            Assignee: GEORGE LI


It is very common (and sometimes frequent) for a broker to suffer hardware 
failures (disk, memory, CPU, NIC) in large Kafka deployments with thousands of 
brokers.  The failed host is replaced by a new one with the same "broker.id", 
and the new broker starts up empty.  All of its topic partitions start at 
offset 0.  If the current leader's start offset is > 0, the replacement broker 
starts replicating the partition from the leader's earliest (start) offset. 

If the number and size of the partitions hosted on this broker are large, it 
can take quite some time for the ReplicaFetcher threads to pull the data from 
all the leaders in the cluster.  This also puts extra load on the 
brokers/leaders in the cluster and can affect latency and other performance 
metrics.  Once the replacement broker has caught up, preferred leader election 
can be run to move the leaders back to this broker. 

To avoid the performance impact above and make the failed-broker replacement 
process much easier and more scalable, we are proposing a new dynamic config, 
{{replica.start.offset.strategy}}.  The default is Earliest, and it can be 
dynamically set to Latest for a broker (or brokers).  If it is set to Latest, 
then when the empty broker starts up, all of its partitions start from the 
latest offset (LEO, LogEndOffset) of the current leader.  The replacement 
broker's replicas are therefore in the ISR, with 0 TotalLag/MaxLag and 0 URP 
(under-replicated partitions), almost instantly. 
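
The offset selection described above could be sketched roughly as follows. 
This is only an illustration of the proposal, not Kafka code: the class 
ReplicaStartOffset, the Strategy enum, and both methods are hypothetical names, 
and the handling of a non-empty follower is a simplifying assumption (it just 
resumes from its own log end offset, clamped to the leader's start offset). 

```java
import java.util.Locale;

// Hypothetical sketch of the proposed behaviour; none of these names exist in Kafka.
final class ReplicaStartOffset {

    /** Values of the proposed replica.start.offset.strategy config. */
    enum Strategy { EARLIEST, LATEST }

    /** Parses the config value case-insensitively; falls back to the proposed default, Earliest. */
    static Strategy parse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return Strategy.EARLIEST;
        }
        return Strategy.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    /**
     * Picks the offset at which a follower begins fetching from the leader.
     *
     * @param followerLogEndOffset 0 for an empty (replaced) broker
     * @param leaderLogStartOffset earliest offset the leader still retains
     * @param leaderLogEndOffset   the leader's LEO
     */
    static long initialFetchOffset(Strategy strategy,
                                   long followerLogEndOffset,
                                   long leaderLogStartOffset,
                                   long leaderLogEndOffset) {
        // Proposed behaviour: an empty replica skips the backlog and starts at the leader's LEO.
        if (followerLogEndOffset == 0L && strategy == Strategy.LATEST) {
            return leaderLogEndOffset;
        }
        // Earliest (or a non-empty follower): never fetch below the leader's start offset.
        return Math.max(followerLogEndOffset, leaderLogStartOffset);
    }

    public static void main(String[] args) {
        // Empty replacement broker; leader retains offsets [1000, 5000).
        System.out.println(initialFetchOffset(Strategy.EARLIEST, 0L, 1000L, 5000L)); // prints 1000
        System.out.println(initialFetchOffset(parse("Latest"), 0L, 1000L, 5000L));   // prints 5000
    }
}
```

With Latest, the replica holds none of the leader's older data until retention 
removes it from the leader, which is why the preferred leader election concern 
below matters. 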

For preferred leader election, we can wait until the retention time has 
passed and the replacement broker has been replicating for long enough.  A 
better/safer approach is to enable the Preferred Leader Blacklist mentioned in 
KAFKA-8638 / KIP-491, so that until the replacement broker has completely 
caught up, its leadership determination priority is moved to the lowest. 
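
As a rough illustration of "moved to the lowest" priority, the sketch below 
reorders a partition's assigned replicas so that blacklisted brokers come last 
while the relative order of the others is kept.  The class and method names 
are hypothetical, and this is only one reading of the idea, not the KIP-491 
implementation. 

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the KIP-491 implementation.
final class PreferredLeaderOrder {

    /**
     * Returns the assigned replicas with blacklisted broker ids moved to the end.
     * Relative order within each group is preserved, so the first element is the
     * highest-priority leader candidate.
     */
    static List<Integer> deprioritize(List<Integer> assignedReplicas, Set<Integer> blacklist) {
        List<Integer> preferred = new ArrayList<>();
        List<Integer> demoted = new ArrayList<>();
        for (Integer brokerId : assignedReplicas) {
            if (blacklist.contains(brokerId)) {
                demoted.add(brokerId);
            } else {
                preferred.add(brokerId);
            }
        }
        preferred.addAll(demoted);
        return preferred;
    }

    public static void main(String[] args) {
        // Broker 1 was just replaced and has not caught up yet.
        List<Integer> assignment = Arrays.asList(1, 2, 3);
        Set<Integer> blacklist = new HashSet<>(Arrays.asList(1));
        System.out.println(deprioritize(assignment, blacklist)); // prints [2, 3, 1]
    }
}
```

Broker 1 stays in the replica list, so it can still become leader if no other 
replica is available; it is simply considered last. 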


--
This message was sent by Atlassian Jira
(v8.3.2#803003)
