Klearchos Chaloulos created KAFKA-5564:
------------------------------------------

             Summary: Fail to create topics with error 'While recording the 
replica LEO, the partition [topic2,0] hasn't been created'
                 Key: KAFKA-5564
                 URL: https://issues.apache.org/jira/browse/KAFKA-5564
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.9.0.1
            Reporter: Klearchos Chaloulos


Hello,

*Short version*
We have seen sporadic occurrences of the following issue: topics whose leader 
is a specific broker fail to be created properly, and it is impossible to 
produce to them or consume from them.
The following log message appears on the broker that is the leader of the 
faulty topics:
{noformat}
[2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording the replica LEO, the partition [topic2,0] hasn't been created. (kafka.server.ReplicaManager)
{noformat}

*Detailed version*:
Our setup consists of three brokers with ids 1, 2, 3. Broker 2 is the 
controller. We create 7 topics called topic1, topic2, topic3, topic4, topic5, 
topic6, topic7.

Sporadically, some of the topics are faulty. In the particular example 
described here, the faulty topics are topic6, topic4, topic2 and topic3. 
The faulty topics all have the same leader, broker 3.

If we run kafka-topics.sh --describe on the topics, we see that for the topics 
that do not have broker 3 as leader, the in-sync replica (ISR) list does not 
include broker 3:
{noformat}
bin/kafka-topics.sh --describe --zookeeper zookeeper:2181/kafka
Topic:topic6    PartitionCount:1        ReplicationFactor:3     Configs:
        Topic: topic6   Partition: 0    Leader: 3       Replicas: 3,1,2 Isr: 3,1,2
Topic:topic5    PartitionCount:1        ReplicationFactor:3     Configs:retention.ms=300000
        Topic: topic5   Partition: 0    Leader: 2       Replicas: 2,3,1 Isr: 2,1
Topic:topic7    PartitionCount:1        ReplicationFactor:3     Configs:
        Topic: topic7   Partition: 0    Leader: 1       Replicas: 1,3,2 Isr: 1,2
Topic:topic4    PartitionCount:1        ReplicationFactor:3     Configs:
        Topic: topic4   Partition: 0    Leader: 3       Replicas: 3,1,2 Isr: 3,1,2
Topic:topic1    PartitionCount:1        ReplicationFactor:3     Configs:
        Topic: topic1   Partition: 0    Leader: 2       Replicas: 2,1,3 Isr: 2,1
Topic:topic2    PartitionCount:1        ReplicationFactor:3     Configs:
        Topic: topic2   Partition: 0    Leader: 3       Replicas: 3,1,2 Isr: 3,1,2
Topic:topic3    PartitionCount:1        ReplicationFactor:3     Configs:
        Topic: topic3   Partition: 0    Leader: 3       Replicas: 3,1,2 Isr: 3,1,2
{noformat}
For the faulty topics, however, all replicas are reported as in sync.
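
For anyone trying to reproduce the comparison, here is a minimal sketch that parses the --describe output above and lists the partitions whose ISR is missing one of the assigned replicas. The function names (parse_describe, missing_from_isr) and the regular expression are illustrative assumptions based only on the output format shown in this report; they are not part of Kafka. Note that this check flags the topics led by brokers 1 and 2, and it does not flag the faulty topics, because those report a full ISR.

```python
import re

# Matches one partition line of `kafka-topics.sh --describe` output in the
# format shown in this report (assumption: other versions may differ).
PARTITION_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def _ids(csv):
    """Turn '3,1,2' into [3, 1, 2]; an empty string gives []."""
    return [int(x) for x in csv.split(",") if x]


def parse_describe(text):
    """Return one dict per partition line; topic summary lines are skipped."""
    partitions = []
    for line in text.splitlines():
        m = PARTITION_RE.search(line)
        if m is None:
            continue  # e.g. "Topic:topic6  PartitionCount:1 ..." does not match
        partitions.append({
            "topic": m.group("topic"),
            "partition": int(m.group("partition")),
            "leader": int(m.group("leader")),
            "replicas": _ids(m.group("replicas")),
            "isr": _ids(m.group("isr")),
        })
    return partitions


def missing_from_isr(partitions):
    """Map (topic, partition) to the sorted replica ids absent from the ISR."""
    result = {}
    for p in partitions:
        missing = sorted(set(p["replicas"]) - set(p["isr"]))
        if missing:
            result[(p["topic"], p["partition"])] = missing
    return result


# Three of the topics from the output above, with wrapped lines joined.
SAMPLE = """\
Topic:topic6    PartitionCount:1        ReplicationFactor:3     Configs:
        Topic: topic6   Partition: 0    Leader: 3       Replicas: 3,1,2 Isr: 3,1,2
Topic:topic5    PartitionCount:1        ReplicationFactor:3     Configs:retention.ms=300000
        Topic: topic5   Partition: 0    Leader: 2       Replicas: 2,3,1 Isr: 2,1
Topic:topic1    PartitionCount:1        ReplicationFactor:3     Configs:
        Topic: topic1   Partition: 0    Leader: 2       Replicas: 2,1,3 Isr: 2,1
"""

if __name__ == "__main__":
    for key, brokers in missing_from_isr(parse_describe(SAMPLE)).items():
        print(key, "missing from ISR:", brokers)
```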

Also, the topic directories under the log.dir folder were not created on 
broker 3, the faulty broker.
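
The directory check can be sketched as follows. Kafka keeps each partition in a directory named <topic>-<partition> under the log directory, so the sketch simply tests for those directory names. The function name missing_partition_dirs and the demo layout are hypothetical; on a real broker the path would be whatever log.dir points to, which is not stated in this report.

```python
import tempfile
from pathlib import Path


def missing_partition_dirs(log_dir, partitions):
    """Return the (topic, partition) pairs with no '<topic>-<partition>'
    directory directly under log_dir."""
    root = Path(log_dir)
    return [(t, p) for (t, p) in partitions if not (root / f"{t}-{p}").is_dir()]


def demo():
    """Build a throwaway log dir that lacks topic2-0 and run the check.

    The directory layout here is made up for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "topic1-0").mkdir()
        (Path(tmp) / "topic5-0").mkdir()
        expected = [("topic1", 0), ("topic2", 0), ("topic5", 0)]
        return missing_partition_dirs(tmp, expected)


if __name__ == "__main__":
    print(demo())
```

On the faulty broker, the expected list would contain every partition for which that broker is listed under Replicas in the --describe output.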

We see the following log message on broker 3, which is the leader of the 
faulty topics:
{noformat}
[2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording the replica LEO, the partition [topic2,0] hasn't been created. (kafka.server.ReplicaManager)
{noformat}
The above message is logged continuously.

We also see the following error logs on the other two brokers, which hold the 
follower replicas:
{noformat}
ERROR [ReplicaFetcherThread-0-3], Error for partition [topic3,0] to broker 3:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition
{noformat}
Again, the above message is logged continuously.
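
To see which partitions are affected and how often the two messages repeat, a small sketch like the one below can tally them from broker log lines. The regular expressions are written against the two messages quoted in this report only, and the function name count_messages is hypothetical; the lines would have to be read from wherever the brokers write their logs.

```python
import re
from collections import Counter

# WARN emitted on the leader (broker 3 in this report).
LEO_WARN_RE = re.compile(
    r"While recording the replica LEO, the partition "
    r"\[([^,\]]+),(\d+)\] hasn't been created"
)
# ERROR emitted by the replica fetcher threads on the other brokers.
FETCH_ERR_RE = re.compile(
    r"Error for partition \[([^,\]]+),(\d+)\] to broker (\d+):\s*"
    r"org\.apache\.kafka\.common\.errors\.UnknownTopicOrPartitionException"
)


def count_messages(lines):
    """Return two Counters: LEO warnings keyed by (topic, partition) and
    fetch errors keyed by (topic, partition, target_broker)."""
    leo = Counter()
    fetch = Counter()
    for line in lines:
        m = LEO_WARN_RE.search(line)
        if m:
            leo[(m.group(1), int(m.group(2)))] += 1
            continue
        m = FETCH_ERR_RE.search(line)
        if m:
            fetch[(m.group(1), int(m.group(2)), int(m.group(3)))] += 1
    return leo, fetch


# The two messages from this report; the WARN is repeated to mimic the
# continuous logging, everything else is skipped.
SAMPLE_LINES = [
    "[2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording "
    "the replica LEO, the partition [topic2,0] hasn't been created. "
    "(kafka.server.ReplicaManager)",
    "[2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording "
    "the replica LEO, the partition [topic2,0] hasn't been created. "
    "(kafka.server.ReplicaManager)",
    "ERROR [ReplicaFetcherThread-0-3], Error for partition [topic3,0] to broker "
    "3:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server "
    "does not host this topic-partition",
    "an unrelated log line",
]

if __name__ == "__main__":
    leo_counts, fetch_counts = count_messages(SAMPLE_LINES)
    print("LEO warnings:", dict(leo_counts))
    print("Fetch errors:", dict(fetch_counts))
```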

The issue described above occurs immediately after the deployment of the Kafka 
cluster.
A restart of the faulty broker (3 in this case) fixes the problem, and the 
faulty topics then work normally.

I have also attached the broker configuration we use.

Do you have any idea what might cause this issue?

Best regards,

Klearchos




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
