[jira] [Commented] (KAFKA-691) Fault tolerance broken with replication factor 1

Maxime Brugidou (JIRA) Wed, 09 Jan 2013 14:28:17 -0800

    [ 
https://issues.apache.org/jira/browse/KAFKA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549105#comment-13549105
 ]


Maxime Brugidou commented on KAFKA-691:
---------------------------------------

I agree with Jun's solution; this would solve 3 (1 and 2 can already be done 
manually: just send a ReassignPartition command when you add a broker).

I could probably implement this very quickly. I'm just not sure how you get 
the availability of a partition, but I'll try to figure it out and submit a 
first patch tomorrow.
                
> Fault tolerance broken with replication factor 1
> ------------------------------------------------
>
>                 Key: KAFKA-691
>                 URL: https://issues.apache.org/jira/browse/KAFKA-691
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Jay Kreps
>
> In 0.7 if a partition was down we would just send the message elsewhere. This 
> meant that the partitioning was really more of a "stickiness" than a hard 
> guarantee. This made it impossible to depend on it for partitioned, stateful 
> processing.
> In 0.8, when running with replication, this should not generally be a 
> problem, as the partitions are now highly available and fail over to other 
> replicas. However, with replication factor = 1 this no longer really works 
> for most cases, as a dead broker will now give errors for that broker.
> I am not sure of the best fix. Intuitively I think this is something that 
> should be handled by the Partitioner interface. However currently the 
> partitioner has no knowledge of which nodes are available. So you could use a 
> random partitioner, but that would keep going back to the down node.
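
The idea in the description above, a partitioner that knows which partitions
are available, can be sketched roughly as follows. This is only an
illustration and not Kafka's Partitioner API: the class
AvailabilityAwarePartitioner, the method partition, and the "available" set
are all hypothetical names, and how the producer would obtain the set of
available partitions is assumed rather than shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: NOT part of Kafka. Names and signature are assumptions.
final class AvailabilityAwarePartitioner {

    /**
     * Picks a partition for the given key.
     *
     * @param key           message key (must not be null)
     * @param numPartitions total number of partitions for the topic
     * @param available     partitions whose broker is currently reachable
     *                      (assumed to be supplied by the caller)
     * @return the chosen partition, or -1 if no partition is available
     */
    static int partition(Object key, int numPartitions, Set<Integer> available) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Normal case: hash the key onto the full partition range.
        int preferred = Math.floorMod(key.hashCode(), numPartitions);
        if (available.contains(preferred)) {
            return preferred;
        }
        if (available.isEmpty()) {
            return -1; // nothing to fall back to
        }
        // Fallback: hash onto the available partitions only, in sorted order
        // so that the choice is deterministic for a given key and set.
        List<Integer> sorted = new ArrayList<>(new TreeSet<>(available));
        return sorted.get(Math.floorMod(key.hashCode(), sorted.size()));
    }
}
```

Note that the fallback branch gives up the hard key-to-partition guarantee
while a broker is down, which is the same "stickiness" trade-off the
description attributes to 0.7.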

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
