[jira] [Commented] (KAFKA-6051) ReplicaFetcherThread should close the ReplicaFetcherBlockingSend earlier on shutdown

Maytee Chinavanichkit (JIRA) Wed, 11 Oct 2017 04:01:42 -0700

    [ 
https://issues.apache.org/jira/browse/KAFKA-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200091#comment-16200091
 ]


Maytee Chinavanichkit commented on KAFKA-6051:
----------------------------------------------

https://github.com/apache/kafka/pull/4056

> ReplicaFetcherThread should close the ReplicaFetcherBlockingSend earlier on 
> shutdown
> ------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6051
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6051
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Maytee Chinavanichkit
>
> The ReplicaFetcherBlockingSend works as designed and blocks until it is able 
> to get data. This becomes a problem when we gracefully shut down a broker. 
> The controller will attempt to shut down the fetchers and elect new leaders. 
> When the last partition is removed from a fetcher as part of the 
> {replicaManager.becomeLeaderOrFollower} call, the call then proceeds to shut 
> down any idle ReplicaFetcherThread. The shutdown here can block until the 
> last fetch request completes. This delay is a big problem because 
> {replicaStateChangeLock} and the {mapLock} in {AbstractFetcherManager} are 
> still held, causing latency spikes on multiple brokers.
> At this point we do not need the last response, as the fetcher is shutting 
> down. We should close the leaderEndpoint early, during {initiateShutdown()}, 
> instead of after {super.shutdown()}.
> For example, here the shutdown blocked the broker from processing more 
> replica changes for ~500 ms:
> {code}
> [2017-09-01 18:11:42,879] INFO [ReplicaFetcherThread-0-2], Shutting down 
> (kafka.server.ReplicaFetcherThread) 
> [2017-09-01 18:11:43,314] INFO [ReplicaFetcherThread-0-2], Stopped 
> (kafka.server.ReplicaFetcherThread) 
> [2017-09-01 18:11:43,314] INFO [ReplicaFetcherThread-0-2], Shutdown completed 
> (kafka.server.ReplicaFetcherThread)
> {code}
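
The proposal in the description above (close the endpoint in initiateShutdown() rather than after the thread has been joined) can be illustrated with a small standalone sketch. This is not Kafka code and not the change in the linked pull request: BlockingEndpoint, FetcherSketch, EarlyCloseSketch, and the 400 ms wait are hypothetical stand-ins chosen only to make the effect visible.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a blocking network client (not ReplicaFetcherBlockingSend).
class BlockingEndpoint {
    private final CountDownLatch closed = new CountDownLatch(1);
    final CountDownLatch entered = new CountDownLatch(1);

    // Blocks for up to maxWaitMs, like a fetch waiting for data; fails fast once closed.
    void receive(long maxWaitMs) throws IOException, InterruptedException {
        entered.countDown();
        if (closed.await(maxWaitMs, TimeUnit.MILLISECONDS)) {
            throw new IOException("endpoint closed");
        }
    }

    // Idempotent: wakes up any thread currently blocked in receive().
    void close() {
        closed.countDown();
    }
}

// Hypothetical fetcher thread with the two-step shutdown described in the issue.
class FetcherSketch extends Thread {
    private final BlockingEndpoint endpoint;
    private final boolean closeEarly;
    private volatile boolean running = true;

    FetcherSketch(BlockingEndpoint endpoint, boolean closeEarly) {
        this.endpoint = endpoint;
        this.closeEarly = closeEarly;
    }

    @Override
    public void run() {
        while (running) {
            try {
                endpoint.receive(400); // assumed worst-case wait for one fetch
            } catch (IOException | InterruptedException e) {
                // Endpoint closed or thread interrupted; the loop re-checks the flag.
            }
        }
    }

    void initiateShutdown() {
        running = false;
        if (closeEarly) {
            endpoint.close(); // unblock the in-flight receive immediately
        }
    }

    void shutdown() throws InterruptedException {
        initiateShutdown();
        join();           // waits for the in-flight receive unless it was unblocked
        endpoint.close(); // the "close after shutdown" ordering; harmless if already closed
    }
}

class EarlyCloseSketch {
    // Returns how long shutdown() took, in milliseconds.
    static long shutdownMillis(boolean closeEarly) {
        BlockingEndpoint endpoint = new BlockingEndpoint();
        FetcherSketch fetcher = new FetcherSketch(endpoint, closeEarly);
        fetcher.start();
        try {
            endpoint.entered.await(); // make sure a receive is in flight
            long start = System.nanoTime();
            fetcher.shutdown();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("close after join:          " + shutdownMillis(false) + " ms");
        System.out.println("close in initiateShutdown: " + shutdownMillis(true) + " ms");
    }
}
```

In this sketch the caller of shutdown() is the one holding up everything else, so any lock it holds stays held for the whole join. Closing the endpoint first turns the in-flight receive into an immediate failure, which is acceptable here because the response is no longer needed.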



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
