[ https://issues.apache.org/jira/browse/KAFKA-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148680#comment-14148680 ]

Ewen Cheslack-Postava commented on KAFKA-1631:
----------------------------------------------

Right. Unfortunately most of the system isn't aware of the large-scale change (reassign old set -> new set), only of each intermediate state (old set -> old set + new set -> new set). As it stands, UnderReplicatedPartitions is computed by the Partition class, which is created by ReplicaManager. But the high-level reassignment is managed by KafkaController, which looks like the only place the necessary state is maintained. I think getting the semantics you want may require a much more substantial change, since each partition leader would need to know about the partition reassignment rather than just the controller.

On the other hand, while I think it's less than ideal, the current behavior could certainly be argued to be reasonable -- i.e. reassignment is not "natively" supported, it's just a higher-level operation you can build up. In this case, the intermediate step is expected, and the temporary reporting of under-replication makes sense, since for a time the desired replication of (old set + new set) has not been achieved.

> ReplicationFactor and under-replicated partitions incorrect during reassignment
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-1631
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1631
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.1.1
>            Reporter: Ryan Berdeen
>              Labels: newbie
>         Attachments: KAFKA-1631-v1.patch
>
>
> We have a topic with a replication factor of 3. We monitor
> UnderReplicatedPartitions as recommended by the documentation.
> During a partition reassignment, partitions being reassigned are reported as
> under-replicated.
> Running a describe shows:
> {code}
> Topic:activity-wal-1    PartitionCount:15       ReplicationFactor:5     Configs:
>       Topic: activity-wal-1   Partition: 0    Leader: 14      Replicas: 14,13,12,11,15        Isr: 14,12,11,13
>       Topic: activity-wal-1   Partition: 1    Leader: 14      Replicas: 15,14,11      Isr: 14,11
>       Topic: activity-wal-1   Partition: 2    Leader: 11      Replicas: 11,15,12      Isr: 12,11,15
> ...
> {code}
> It looks like the displayed replication factor, 5, is simply the number of
> replicas listed for the first partition, which includes both brokers in the
> current list and those onto which the partition is being reassigned.
> Partition 0 is also included in the list when using the
> `--under-replicated-partitions` option, even though it is replicated to more
> brokers than the true replication factor.
> During a reassignment, the under-replicated partitions metric is not usable,
> meaning that actual under-replicated partitions can go unnoticed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
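
To make the reported behaviour concrete, here is a minimal Python sketch of the two checks as the issue describes them. It is a toy model, not Kafka's code: the function names and the `expected_rf` parameter are hypothetical, and the rule "ISR smaller than the assigned replica list" is an assumption inferred from the describe output above. The replica and ISR lists are copied from that output.

```python
# Illustrative model only; not Kafka's implementation.
# Replica and ISR lists are taken from the describe output in the issue.
partitions = {
    0: {"replicas": [14, 13, 12, 11, 15], "isr": [14, 12, 11, 13]},
    1: {"replicas": [15, 14, 11], "isr": [14, 11]},
    2: {"replicas": [11, 15, 12], "isr": [12, 11, 15]},
}


def displayed_replication_factor(partitions):
    """Replica count of the first partition, as the issue suggests describe reports."""
    first = min(partitions)
    return len(partitions[first]["replicas"])


def under_replicated_naive(p):
    """Assumed current check: ISR is smaller than the assigned replica list."""
    return len(p["isr"]) < len(p["replicas"])


def under_replicated_vs_target(p, expected_rf):
    """Hypothetical alternative: compare the ISR against the intended factor."""
    return len(p["isr"]) < expected_rf


naive = [pid for pid, p in partitions.items() if under_replicated_naive(p)]
vs_target = [pid for pid, p in partitions.items()
             if under_replicated_vs_target(p, expected_rf=3)]

print(displayed_replication_factor(partitions))  # 5
print(naive)      # [0, 1]
print(vs_target)  # [1]
```

Under the assumed rule, partition 0 (mid-reassignment, 4 of 5 listed replicas in sync) is flagged alongside partition 1 (2 of 3 in sync), so the truly under-replicated partition is not distinguishable. Comparing against the intended factor of 3 flags only partition 1, but as noted in the comment, the partition leader does not currently have that information.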