[ 
https://issues.apache.org/jira/browse/KAFKA-8733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Duggana updated KAFKA-8733:
----------------------------------
    Attachment: wio-time.png
                weighted-io-time-2.png

> Offline partitions occur when leader's disk is slow in reads while responding 
> to follower fetch requests.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8733
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8733
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.1.2, 2.4.0
>            Reporter: Satish Duggana
>            Assignee: Satish Duggana
>            Priority: Critical
>         Attachments: weighted-io-time-2.png, wio-time.png
>
>
> We found offline partitions issue multiple times on some of the hosts in our 
> clusters. After going through the broker logs and hosts’s disk stats, it 
> looks like this issue occurs whenever the read/write operations take more 
> time on that disk. In a particular case where read time is more than the 
> replica.lag.time.max.ms, follower replicas will be out of sync as their 
> earlier fetch requests are stuck while reading the local log and their fetch 
> status is not yet updated as mentioned in the below code of `ReplicaManager`. 
> If there is an issue in reading the data from the log for a duration more 
> than replica.lag.time.max.ms then all the replicas will be out of sync and 
> partition becomes offline if min.isr.replicas > 1 and unclean.leader.election 
> is disabled.
>  
> {code:java}
> def readFromLog(): Seq[(TopicPartition, LogReadResult)] = {
>   val result = readFromLocalLog( // this call took more than 
> `replica.lag.time.max.ms`
>   replicaId = replicaId,
>   fetchOnlyFromLeader = fetchOnlyFromLeader,
>   readOnlyCommitted = fetchOnlyCommitted,
>   fetchMaxBytes = fetchMaxBytes,
>   hardMaxBytesLimit = hardMaxBytesLimit,
>   readPartitionInfo = fetchInfos,
>   quota = quota,
>   isolationLevel = isolationLevel)
>   if (isFromFollower) updateFollowerLogReadResults(replicaId, result). // 
> fetch time gets updated here, but mayBeShrinkIsr should have been already 
> called and the replica is removed from sir
>  else result
>  }
> val logReadResults = readFromLog()
> {code}
> I will raise a 
> [KIP|https://cwiki.apache.org/confluence/display/KAFKA/KIP-501+Avoid+offline+partitions+in+the+edgcase+scenario+of+follower+fetch+requests+not+processed+in+time]
>  describing options on how to handle this scenario.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Reply via email to