[jira] [Commented] (KAFKA-1631) ReplicationFactor and under-replicated partitions incorrect during reassignment

2015-04-04 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14396039#comment-14396039
 ] 

Jay Kreps commented on KAFKA-1631:
--

Is this behavior really so bad? I actually think handling a reassignment as an 
add+delete makes some sense...

 ReplicationFactor and under-replicated partitions incorrect during 
 reassignment
 ---

                 Key: KAFKA-1631
                 URL: https://issues.apache.org/jira/browse/KAFKA-1631
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.8.1.1
            Reporter: Ryan Berdeen
            Assignee: Ewen Cheslack-Postava
              Labels: newbie
         Attachments: KAFKA-1631-v1.patch


 We have a topic with a replication factor of 3. We monitor 
 UnderReplicatedPartitions as recommended by the documentation.
 During a partition reassignment, partitions being reassigned are reported as 
 under-replicated. Running a describe shows:
 {code}
 Topic:activity-wal-1    PartitionCount:15    ReplicationFactor:5    Configs:
     Topic: activity-wal-1    Partition: 0    Leader: 14    Replicas: 14,13,12,11,15    Isr: 14,12,11,13
     Topic: activity-wal-1    Partition: 1    Leader: 14    Replicas: 15,14,11    Isr: 14,11
     Topic: activity-wal-1    Partition: 2    Leader: 11    Replicas: 11,15,12    Isr: 12,11,15
 ...
 {code}
 It looks like the displayed replication factor, 5, is simply the number of 
 replicas listed for the first partition, which includes both brokers in the 
 current list and those onto which the partition is being reassigned. 
 Partition 0 is also included in the list when using the 
 `--under-replicated-partitions` option, even though it is replicated to more 
 brokers than the true replication factor.
 During a reassignment, the under-replicated partitions metric is not usable, 
 meaning that actual under-replicated partitions can go unnoticed.
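
The two symptoms described above can be reproduced with a small sketch. It is an illustration in Python, not the actual TopicCommand code: the function names are made up here, and the logic only mirrors what the reporter infers (the replication factor is read off the first partition's replica list, and a partition is flagged whenever its ISR is smaller than its replica list).

```python
# Assignments copied from the describe output above:
# partition id -> (assigned replicas, in-sync replicas)
assignments = {
    0: ([14, 13, 12, 11, 15], [14, 12, 11, 13]),
    1: ([15, 14, 11], [14, 11]),
    2: ([11, 15, 12], [12, 11, 15]),
}


def displayed_replication_factor(assignments):
    """Assumed behaviour: size of the first partition's replica list."""
    first_partition = min(assignments)
    replicas, _isr = assignments[first_partition]
    return len(replicas)


def flagged_under_replicated(assignments):
    """Assumed behaviour: flag a partition when ISR is smaller than its replica list."""
    return [
        partition
        for partition, (replicas, isr) in sorted(assignments.items())
        if len(isr) < len(replicas)
    ]


print(displayed_replication_factor(assignments))  # 5, not the intended 3
print(flagged_under_replicated(assignments))      # [0, 1]
```

Partition 0 is flagged only because its replica list temporarily holds five brokers, while partition 1 is missing a replica in the ordinary sense; the output does not distinguish the two cases.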



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1631) ReplicationFactor and under-replicated partitions incorrect during reassignment

2014-10-04 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14159388#comment-14159388
 ] 

Neha Narkhede commented on KAFKA-1631:
--

The behavior of partition reassignment being old set -> old set + new set -> 
new set is just an implementation detail that users don't need to know and 
understand. However, there are 2 ways to report under-replicated partitions 
today, and this solution fixes one but not the other. For instance, if 
partitions being reassigned are not reported as under-replicated through the 
topics tool (with this patch) but are reported by the broker's mbean, users 
would get confused. An ideal long-term solution would be to define partition 
states as being one of the following: new, initializing, ready, migrating, 
under-replicated (maybe more or fewer), and expose the partition's state 
through the topics tool as well as JMX. It is possible to get away without 
these states if there are just 2 possible states that a partition lives in, 
but as the number of states increases, it is worth exposing them explicitly. 
One of these states is under-replicated, and partitions being reassigned 
should belong to a separate migrating state, not under-replicated.
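
The explicit partition states proposed here could look roughly like the following Python sketch. The enum and the `classify` function are hypothetical names for illustration and do not correspond to any existing Kafka class; the sketch also assumes the caller already knows whether a partition is being reassigned, and it does not attempt to detect the new or initializing states.

```python
from enum import Enum


class PartitionState(Enum):
    # State names taken from the comment above.
    NEW = "new"
    INITIALIZING = "initializing"
    READY = "ready"
    MIGRATING = "migrating"
    UNDER_REPLICATED = "under-replicated"


def classify(replicas, isr, reassigning):
    """Pick one state per partition; reassignment takes precedence over ISR size."""
    if reassigning:
        return PartitionState.MIGRATING
    if len(isr) < len(replicas):
        return PartitionState.UNDER_REPLICATED
    return PartitionState.READY


# Partitions 0 and 1 from the describe output in the issue description.
print(classify([14, 13, 12, 11, 15], [14, 12, 11, 13], reassigning=True))
print(classify([15, 14, 11], [14, 11], reassigning=False))
```

If both the topics tool and the JMX metric reported the same state value, the two views could not disagree in the way described above.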



[jira] [Commented] (KAFKA-1631) ReplicationFactor and under-replicated partitions incorrect during reassignment

2014-09-25 Thread Ryan Berdeen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14148304#comment-14148304
 ] 

Ryan Berdeen commented on KAFKA-1631:
-

Not reporting partitions being reassigned seems even worse--this would lead to 
false negatives! It also doesn't address the fact that replication factor is 
reported incorrectly.

It seems like the right solution would be to store the intended replication 
factor for the topic, and alert if the size of the ISR is less than this.
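
A minimal sketch of that proposal, in Python: compare each ISR against a stored intended replication factor instead of the current replica list. The constant and the function name are assumptions for illustration; Kafka does not store such a value as of this discussion.

```python
# partition id -> (assigned replicas, in-sync replicas), from the describe output
assignments = {
    0: ([14, 13, 12, 11, 15], [14, 12, 11, 13]),
    1: ([15, 14, 11], [14, 11]),
    2: ([11, 15, 12], [12, 11, 15]),
}

# The issue description states the topic was created with replication factor 3.
INTENDED_REPLICATION_FACTOR = 3


def below_intended_replication(assignments, intended_rf):
    """Flag partitions whose ISR is smaller than the intended replication factor."""
    return [
        partition
        for partition, (_replicas, isr) in sorted(assignments.items())
        if len(isr) < intended_rf
    ]


print(below_intended_replication(assignments, INTENDED_REPLICATION_FACTOR))  # [1]
```

Under this definition partition 0, which has four in-sync replicas, is no longer flagged during the reassignment, while partition 1 still is.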



[jira] [Commented] (KAFKA-1631) ReplicationFactor and under-replicated partitions incorrect during reassignment

2014-09-25 Thread Ryan Berdeen (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14148559#comment-14148559
 ] 

Ryan Berdeen commented on KAFKA-1631:
-

The patch does look like an improvement to the {{TopicCommand}}, but doesn't 
address the number of under-replicated partitions reported by the brokers. It 
seems like there shouldn't be multiple definitions of under-replicated 
partition.



[jira] [Commented] (KAFKA-1631) ReplicationFactor and under-replicated partitions incorrect during reassignment

2014-09-25 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14148680#comment-14148680
 ] 

Ewen Cheslack-Postava commented on KAFKA-1631:
--

Right. Unfortunately most of the system isn't aware of the large-scale change 
(reassign old set -> new set), only of each intermediate state (old set -> old 
set + new set -> new set). As it stands, UnderReplicatedPartitions is 
computed by the Partition class, which is created by ReplicaManager. But the 
high-level reassignment is managed by KafkaController, which looks like the 
only place the necessary state is maintained. I think getting the semantics 
you want may require a much more substantial change, since each partition 
leader will need to know about the partition reassignment rather than just the 
controller.

On the other hand, while I think it's less than ideal, the current behavior 
could certainly be argued to be reasonable -- i.e. that reassignment is not 
natively supported, it's just a higher-level operation you can build up. In 
this case, the intermediate step is expected, and the temporary reporting of 
under-replication would make sense since for a time the desired replication of 
(old set + new set) has not been achieved.
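
The intermediate state can be sketched as follows in Python. The broker ids for the old and new sets are hypothetical (the thread does not say which brokers partition 0 is moving between), and the ordering of the combined list is an arbitrary choice for the sketch, not necessarily what the controller does.

```python
def reassignment_steps(old, new):
    """Replica list at each stage: old set, old set + new set, new set."""
    combined = list(old) + [broker for broker in new if broker not in old]
    return [list(old), combined, list(new)]


def is_under_replicated(replicas, isr):
    """Per-partition check that only sees the current replica list."""
    return len(isr) < len(replicas)


old_set = [11, 12, 13]  # hypothetical original assignment
new_set = [13, 14, 15]  # hypothetical target assignment

before, during, after = reassignment_steps(old_set, new_set)
print(during)  # [11, 12, 13, 14, 15]

# While the added brokers are still catching up, only the old set is in sync,
# so the partition is reported as under-replicated against the combined list.
print(is_under_replicated(during, isr=old_set))  # True
```

The check never sees the old or the new set on its own, which is why a component that knows only the current replica list reports the partition as under-replicated.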



[jira] [Commented] (KAFKA-1631) ReplicationFactor and under-replicated partitions incorrect during reassignment

2014-09-11 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131051#comment-14131051
 ] 

Neha Narkhede commented on KAFKA-1631:
--

Thanks for reporting the issue, [~rberdeen]. Since partition reassignment 
involves changing the replicas of a partition, it is tricky to report the 
under-replicated status correctly at all times. However, one possible 
improvement is to change the topics tool so that it does not report partitions 
being reassigned as under-replicated. It is a minor change, feel free to take 
a stab at it.
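
The suggested change to the topics tool amounts to a filter like the one below. This is a Python sketch under the assumption that the tool can obtain the set of partitions currently being reassigned; the function name and the `reassigning` argument are illustrative and not part of the real tool.

```python
def reported_under_replicated(assignments, reassigning):
    """List under-replicated partitions, skipping those being reassigned."""
    return [
        partition
        for partition, (replicas, isr) in sorted(assignments.items())
        if len(isr) < len(replicas) and partition not in reassigning
    ]


# partition id -> (assigned replicas, in-sync replicas), from the issue description
assignments = {
    0: ([14, 13, 12, 11, 15], [14, 12, 11, 13]),
    1: ([15, 14, 11], [14, 11]),
    2: ([11, 15, 12], [12, 11, 15]),
}

# Assuming only partition 0 is mid-reassignment:
print(reported_under_replicated(assignments, reassigning={0}))  # [1]
```

A partition that loses a replica while it is being reassigned would also be skipped by this filter, so the tool would stay silent about it until the reassignment completes.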
