Re: Killing last replica for partition doesn't change ISR/Leadership if replica is running controller

2014-05-14 Thread Alex Demidko
Sure thing - https://issues.apache.org/jira/browse/KAFKA-1452


Killing last replica for partition doesn't change ISR/Leadership if replica is running controller

2014-05-13 Thread Alex Demidko
Hi,

Kafka version is 0.8.1.1. We have three machines: A, B, and C. Say there is a
topic with replication factor 2, and one of its partitions, partition 1, is
placed on brokers A and B. If broker A is already down, then for partition 1 we
have: Leader: B, ISR: [B]. If the current controller is node C, then killing
broker B turns partition 1 into the state Leader: -1, ISR: []. But if the
current controller is node B, then killing it won’t update the leader/ISR for
partition 1, even after the controller fails over to node C, so partition 1
will forever think its leader is node B, which is dead.
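
The two outcomes above can be sketched with a toy in-memory model. This is only an illustration of the reported behavior, not Kafka's controller code: the `cluster` dict layout, `kill_broker`, and `scenario` are hypothetical names, and promoting the first remaining ISR member to leader is a simplifying assumption.

```python
NO_LEADER = -1  # the leader value the message reports for a leaderless partition


def kill_broker(cluster, broker):
    """Kill `broker` in a toy cluster and return the partition states.

    Models only the behavior reported in the message, not Kafka itself.
    """
    cluster["live"].discard(broker)
    if cluster["controller"] == broker:
        # Reported case: the controller dies together with the broker. A new
        # controller takes over, but the partition state is left untouched.
        cluster["controller"] = min(cluster["live"])
        return cluster["partitions"]
    # A surviving controller reacts to the failure: shrink the ISR and
    # replace the leader (assumption: first remaining ISR member, else -1).
    for state in cluster["partitions"].values():
        state["isr"] = [b for b in state["isr"] if b != broker]
        if state["leader"] == broker:
            state["leader"] = state["isr"][0] if state["isr"] else NO_LEADER
    return cluster["partitions"]


def scenario(controller):
    # Broker A is already down, so partition 1 has Leader: B, ISR: [B].
    cluster = {
        "live": {"B", "C"},
        "controller": controller,
        "partitions": {1: {"leader": "B", "isr": ["B"]}},
    }
    return kill_broker(cluster, "B")[1]


print(scenario("C"))  # {'leader': -1, 'isr': []}
print(scenario("B"))  # {'leader': 'B', 'isr': ['B']}
```

With the controller on C the partition ends up leaderless; with the controller on B the stale Leader: B, ISR: [B] survives the failover.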

It looks like KafkaController.onBrokerFailure handles the case where the failed
broker is the partition leader: it sets the new leader value to -1. By
contrast, KafkaController.onControllerFailover never removes the leader from a
partition whose replicas are all offline, presumably because the partition
gets into the ReplicaDeletionIneligible state. Is this intended behavior?

This behavior affects DefaultEventHandler.getPartition in the null-key case:
it can’t tell that partition 1 has no leader, and this results in event send
failures.
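
To make the effect on null-key sends concrete, here is a minimal sketch that assumes the producer picks a random partition among those whose metadata still names a leader. The functions `available_partitions` and `pick_partition` are hypothetical stand-ins for illustration, not the DefaultEventHandler API.

```python
import random

NO_LEADER = -1


def available_partitions(leaders):
    """Partitions whose metadata names a leader (leader != -1)."""
    return sorted(p for p, leader in leaders.items() if leader != NO_LEADER)


def pick_partition(leaders, live_brokers, rng=random):
    """Pick a partition for a null-key message.

    Returns (partition, deliverable); deliverable is False when the
    metadata points at a broker that is actually dead.
    """
    candidates = available_partitions(leaders)
    if not candidates:
        raise RuntimeError("no partition has a leader")
    partition = rng.choice(candidates)
    return partition, leaders[partition] in live_brokers


live = {"C"}
fresh = {0: "C", 1: NO_LEADER}  # controller on C handled the failure
stale = {0: "C", 1: "B"}        # controller on B died; leader never cleared

print(available_partitions(fresh))  # [0]
print(available_partitions(stale))  # [0, 1]
```

With the stale metadata, partition 1 stays in the candidate list, so some sends are routed to the dead broker B and fail.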


What we are trying to achieve is to be able to write data even if some
partitions have lost all replicas, which is a rare yet still possible scenario.
Using a null key looked suitable with minor DefaultEventHandler modifications
(like getting rid of DefaultEventHandler.sendPartitionPerTopicCache to avoid
caching and uneven event distribution), as we neither use log compaction nor
rely on partitioning of the data. We had such behavior with Kafka 0.7: if a
node is down, simply produce to a different one.
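
The caching concern can be illustrated with a rough sketch that compares a cached ("sticky") partition choice with a fresh choice per message. `send_all` is a hypothetical toy producer, and its cache is never refreshed, which exaggerates the skew compared with a cache that expires periodically.

```python
import random
from collections import Counter


def send_all(n_messages, partitions, sticky, rng):
    """Count messages per partition for null-key sends in a toy producer."""
    counts = Counter()
    cached = None
    for _ in range(n_messages):
        if sticky:
            # Cached choice: pick once, then reuse it for every message.
            if cached is None:
                cached = rng.choice(partitions)
            target = cached
        else:
            # No cache: pick a fresh partition for every message.
            target = rng.choice(partitions)
        counts[target] += 1
    return counts


partitions = [0, 1, 2, 3]
sticky_counts = send_all(1000, partitions, sticky=True, rng=random.Random(0))
spread_counts = send_all(1000, partitions, sticky=False, rng=random.Random(0))
print(len(sticky_counts), len(spread_counts))
```

In this model the cached variant sends everything to a single partition, while the per-message variant spreads the load across all of them.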


Thanks, 
Alex



Re: Killing last replica for partition doesn't change ISR/Leadership if replica is running controller

2014-05-13 Thread Jun Rao
Yes, that seems like a real issue. Could you file a jira?

Thanks,

Jun

