[ 
https://issues.apache.org/jira/browse/KAFKA-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148680#comment-14148680
 ] 

Ewen Cheslack-Postava commented on KAFKA-1631:
----------------------------------------------

Right. Unfortunately, most of the system isn't aware of the overall change 
(reassign old set -> new set), only of each intermediate state (old set -> old 
set + new set -> new set). As it stands, UnderReplicatedPartitions is computed 
by the Partition class, which is created by ReplicaManager. But the high-level 
reassignment is managed by KafkaController, which looks like the only place the 
necessary state is maintained. I think getting the semantics you want may 
require a much more substantial change, since each partition leader would need 
to know about the partition reassignment rather than just the controller.
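
To make the intermediate-state problem concrete, here is a minimal sketch in 
Python (not Kafka's actual code). The names PartitionState, 
naive_under_replicated and reassignment_aware_under_replicated are 
hypothetical, the broker ids are made up, and comparing the ISR size against 
the size of the target set is only one possible reading of the semantics being 
asked for.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class PartitionState:
    """Leader-local view of a partition: assigned replicas and in-sync replicas."""
    assigned: FrozenSet[int]
    isr: FrozenSet[int]


def naive_under_replicated(p: PartitionState) -> bool:
    # Only sees the current assignment, so (old set + new set) counts in full.
    return len(p.isr) < len(p.assigned)


def reassignment_aware_under_replicated(
    p: PartitionState, target: Optional[FrozenSet[int]]
) -> bool:
    # Hypothetical: `target` is the final replica set of an in-flight
    # reassignment, which in this sketch the leader would have to be told about.
    if target is None:
        return naive_under_replicated(p)
    return len(p.isr) < len(target)


# Made-up broker ids for illustration.
old = frozenset({1, 2, 3})
new = frozenset({4, 5, 6})

# Intermediate state: assigned = old set + new set, new replicas still catching up.
mid = PartitionState(assigned=old | new, isr=old)
# Final state: assigned = new set, all in sync.
done = PartitionState(assigned=new, isr=new)

print(naive_under_replicated(mid))                    # True
print(reassignment_aware_under_replicated(mid, new))  # False
print(naive_under_replicated(done))                   # False
```

The second function only works if the leader is handed the target set, which 
is exactly the state that currently seems to live only in the controller.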

On the other hand, while I think it's less than ideal, the current behavior 
could reasonably be defended: reassignment is not "natively" supported, it's 
just a higher-level operation built up from lower-level steps. In that view, 
the intermediate step is expected, and temporarily reporting under-replication 
makes sense because, for a time, the desired replication of (old set + new 
set) has not been achieved.

> ReplicationFactor and under-replicated partitions incorrect during 
> reassignment
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-1631
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1631
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.1.1
>            Reporter: Ryan Berdeen
>              Labels: newbie
>         Attachments: KAFKA-1631-v1.patch
>
>
> We have a topic with a replication factor of 3. We monitor 
> UnderReplicatedPartitions as recommended by the documentation.
> During a partition reassignment, partitions being reassigned are reported as 
> under-replicated. Running a describe shows:
> {code}
> Topic:activity-wal-1    PartitionCount:15       ReplicationFactor:5     Configs:
>         Topic: activity-wal-1   Partition: 0    Leader: 14      Replicas: 14,13,12,11,15        Isr: 14,12,11,13
>         Topic: activity-wal-1   Partition: 1    Leader: 14      Replicas: 15,14,11      Isr: 14,11
>         Topic: activity-wal-1   Partition: 2    Leader: 11      Replicas: 11,15,12      Isr: 12,11,15
> ...
> {code}
> It looks like the displayed replication factor, 5, is simply the number of 
> replicas listed for the first partition, which includes both brokers in the 
> current list and those onto which the partition is being reassigned. 
> Partition 0 is also included in the list when using the 
> `--under-replicated-partitions` option, even though it is replicated to more 
> brokers than the true replication factor.
> During a reassignment, the under-replicated partitions metric is not usable, 
> meaning that actual under-replicated partitions can go unnoticed.
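
As a rough illustration of the description quoted above, the following Python 
sketch replays the three partitions shown in the describe output. The helper 
names are hypothetical, and taking the header's ReplicationFactor from the 
first partition's replica count is the reporter's inference, treated here as 
an assumption rather than confirmed tool behavior.

```python
# (replicas, isr) per partition, copied from the describe output quoted above.
partitions = {
    0: ([14, 13, 12, 11, 15], [14, 12, 11, 13]),
    1: ([15, 14, 11], [14, 11]),
    2: ([11, 15, 12], [12, 11, 15]),
}


def displayed_replication_factor(parts):
    # Assumption: the header reports the replica count of the first partition.
    first = min(parts)
    return len(parts[first][0])


def reported_under_replicated(parts):
    # A partition is listed when its ISR is smaller than its replica list.
    return sorted(
        pid for pid, (replicas, isr) in parts.items() if len(isr) < len(replicas)
    )


print(displayed_replication_factor(partitions))  # 5
print(reported_under_replicated(partitions))     # [0, 1]
```

Partition 0 is listed even though it has four in-sync replicas, more than the 
topic's replication factor of 3, which is the behavior the issue describes.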


