[ 
https://issues.apache.org/jira/browse/KAFKA-8625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryne Yang updated KAFKA-8625:
-----------------------------
    Description: 
When we used Kafka Cruise Control to invoke Kafka's new intra-broker disk 
balancing feature ([feature 
proposal|https://cwiki.apache.org/confluence/display/KAFKA/KIP-113%3A+Support+replicas+movement+between+log+directories#KIP-113:Supportreplicasmovementbetweenlogdirectories-1)Howtomovereplicabetweenlogdirectoriesonthesamebroker]),
it worked well, but the process seems to get stuck at the last mile. 

We stopped seeing further movements, which suggests the move is done, and our 
monitoring shows well-balanced results. However, some log directories still 
report replicas stuck in the moving state, as in the example below:
{code:java}
{"partition":"LOGSTASH5-4","size":0,"offsetLag":123189442,"isFuture":true}
{code}
There are a handful of these partitions on each broker, and they appear to be 
random.
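
To list every replica in this state across brokers, the log-dir description can be filtered for entries with isFuture set to true and a non-zero offsetLag. The following is a minimal sketch, not a definitive tool. It assumes the JSON has the brokers / logDirs / partitions nesting that the kafka-log-dirs.sh --describe output is understood to use. The function name, the broker ID, the directory paths, and the second partition in the sample are hypothetical; only the LOGSTASH5-4 entry comes from this report.

```python
import json


def stuck_future_replicas(describe_json, min_lag=1):
    """Return (broker, log_dir, partition, offset_lag) for lagging future replicas.

    Assumes the brokers -> logDirs -> partitions layout described above.
    """
    report = json.loads(describe_json)
    stuck = []
    for broker in report.get("brokers", []):
        for log_dir in broker.get("logDirs", []):
            for part in log_dir.get("partitions", []):
                # A future replica is the copy being built in the destination dir.
                if part.get("isFuture") and part.get("offsetLag", 0) >= min_lag:
                    stuck.append((
                        broker.get("broker"),
                        log_dir.get("logDir"),
                        part.get("partition"),
                        part.get("offsetLag"),
                    ))
    return stuck


# Hypothetical sample: broker ID, paths, and OTHER-0 are made up for illustration.
sample = json.dumps({
    "version": 1,
    "brokers": [{
        "broker": 1,
        "logDirs": [
            {"logDir": "/data/kafka-a", "error": None, "partitions": [
                {"partition": "OTHER-0", "size": 1024, "offsetLag": 0, "isFuture": False},
            ]},
            {"logDir": "/data/kafka-b", "error": None, "partitions": [
                {"partition": "LOGSTASH5-4", "size": 0, "offsetLag": 123189442, "isFuture": True},
            ]},
        ],
    }],
})

for entry in stuck_future_replicas(sample):
    print(entry)
```

Running the filter twice some minutes apart and comparing offsetLag for the same partition would show whether a future replica is still catching up or has stopped making progress.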

We have waited for days and they do not go away. However, we have not yet 
tried restarting the controller broker. 

Does anyone know how to resolve this and, more importantly, why it happened?

So far we have only tried this on version 1.1.1; we do not know whether it has 
been fixed in a later version. 


> intra broker data balance stuck
> -------------------------------
>
>                 Key: KAFKA-8625
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8625
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>         Environment: Linux 2.6.32-431.5.1.el6.x86_64 #1 SMP x86_64 x86_64 
> x86_64 GNU/Linux
>            Reporter: Ryne Yang
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
