[ 
https://issues.apache.org/jira/browse/KAFKA-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858045#comment-13858045
 ] 

Jun Rao commented on KAFKA-1193:
--------------------------------

Do you see the following in the controller log? This indicates an unclean 
leader election and could cause data loss.

"No broker in ISR is alive for ... There's potential data loss."
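
A quick way to check for this is to scan the controller log for the message. The sketch below is only an illustration: the function name is hypothetical, and the regular expression is built from the quoted message alone, so the elided "..." part is matched loosely rather than against a known log layout.

```python
import re

# Built from the message quoted above; the elided "..." portion is captured as-is.
UNCLEAN_ELECTION = re.compile(
    r"No broker in ISR is alive for (.+?)\. There's potential data loss"
)

def find_unclean_elections(lines):
    """Return the elided portion of every line that carries the warning."""
    hits = []
    for line in lines:
        match = UNCLEAN_ELECTION.search(line)
        if match:
            hits.append(match.group(1))
    return hits
```

Passing an open file object for the controller log (its location depends on your log4j setup) returns one entry per warning; an empty list means the message did not appear in that file.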

> Data loss if broker is killed using kill -9
> -------------------------------------------
>
>                 Key: KAFKA-1193
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1193
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.8.0, 0.8.1
>         Environment: Centos 6.3
>            Reporter: Hanish Bansal
>            Assignee: Neha Narkhede
>
> We have a Kafka cluster of 2 nodes (running Kafka 0.8.0).
> Replication Factor: 2
> Number of partitions: 2
> Actual Behaviour:
> -------------------------
> If the leader node of the two goes down, data is lost.
> Steps to Reproduce:
> ------------------------------
> 1. Create a 2 node kafka cluster with replication factor 2
> 2. Start the Kafka cluster
> 3. Create a topic, say "test-trunk111".
> 4. Restart either node.
> 5. Check topic status using kafka-list-topic tool.
> topic isr status is:
> topic: test-trunk111    partition: 0    leader: 0    replicas: 1,0    isr: 0,1
> topic: test-trunk111    partition: 1    leader: 0    replicas: 0,1    isr: 0,1
> If only one broker is in the ISR list, wait a while and check the topic's 
> ISR status again. There should be 2 brokers in the ISR list.
> 6. Start producing data.
> 7. Kill the leader node (broker-0 in our case) while data is being produced.
> 8. After all data has been produced, start the consumer.
> 9. Observe the behaviour. There is data loss.
> After the leader goes down, the topic ISR status is:
> topic: test-trunk111    partition: 0    leader: 1    replicas: 1,0    isr: 1
> topic: test-trunk111    partition: 1    leader: 1    replicas: 0,1    isr: 1
> We tried the following to avoid data loss:
> ----------------------------------------------------------------
> 1. Configured "request.required.acks=-1" in the producer configuration, 
> because according to the documentation at 
> http://kafka.apache.org/documentation.html#producerconfigs, setting this 
> value to -1 guarantees that no messages will be lost.
> 2. Increased the "message.send.max.retries" from 3 to 10 in producer 
> configuration.
> 3. Set "controlled.shutdown.enable" to true in broker configuration.
> 4. Tested with Kafka-0.8.1 after applying patch KAFKA-1188.patch available on 
> https://issues.apache.org/jira/browse/KAFKA-1188 
> None of the above helped when the leader node is killed using 
> "kill -9 <pid>".
> Expected Behaviour:
> ----------------------------
> No data should be lost.
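
The ISR check in step 5 of the report can be automated before the leader is killed. The following is a minimal sketch that assumes the tool's output looks exactly like the status lines quoted in the report; the function names are hypothetical and are not part of Kafka.

```python
def parse_status_line(line):
    """Parse a line such as
    'topic: test-trunk111    partition: 0    leader: 0    replicas: 1,0    isr: 0,1'.
    """
    # Drop an optional mail-quote prefix, then split on whitespace.
    tokens = line.lstrip("> ").split()
    # Tokens alternate between 'key:' and 'value'.
    fields = {key.rstrip(":"): value for key, value in zip(tokens[0::2], tokens[1::2])}
    return {
        "topic": fields["topic"],
        "partition": int(fields["partition"]),
        "leader": int(fields["leader"]),
        "replicas": [int(b) for b in fields["replicas"].split(",")],
        "isr": [int(b) for b in fields["isr"].split(",")],
    }

def fully_replicated(status):
    """True when every assigned replica is currently in the ISR."""
    return set(status["isr"]) == set(status["replicas"])
```

Applied to the lines from step 5, `fully_replicated` is true for both partitions; applied to the lines captured after the leader was killed, it is false because only broker 1 remains in the ISR.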



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
