[jira] [Created] (KAFKA-6969) AdminClient should provide a way to get the version of a node
Charly Molter created KAFKA-6969: Summary: AdminClient should provide a way to get the version of a node Key: KAFKA-6969 URL: https://issues.apache.org/jira/browse/KAFKA-6969 Project: Kafka Issue Type: Improvement Reporter: Charly Molter Currently the AdminClient returns a lot of information about the cluster, topics, etc. It would be nice if it could also return either the result of ApiVersions or a simplified "kafkaVersion" number for a node. This would be useful for admin tools, saving them from having to make the request with their own NetworkClient. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-6445) Remove deprecated metrics in 2.0
Charly Molter created KAFKA-6445: Summary: Remove deprecated metrics in 2.0 Key: KAFKA-6445 URL: https://issues.apache.org/jira/browse/KAFKA-6445 Project: Kafka Issue Type: Bug Affects Versions: 2.0.0 Reporter: Charly Molter Assignee: Charly Molter Fix For: 2.0.0 As part of KIP-225 we've replaced a metric and deprecated the old one. We should remove the deprecated metrics in 2.0.0; this Jira tracks all of the metrics to remove in 2.0.0. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-6310) ConcurrentModificationException when reporting requests-in-flight in producer
Charly Molter created KAFKA-6310: Summary: ConcurrentModificationException when reporting requests-in-flight in producer Key: KAFKA-6310 URL: https://issues.apache.org/jira/browse/KAFKA-6310 Project: Kafka Issue Type: Bug Components: metrics, network Affects Versions: 1.0.0 Reporter: Charly Molter We are running into an issue really similar to KAFKA-4950. We have a producer running, and a MetricsReporter with a background thread which publishes these metrics. The ConcurrentModificationException happens when calling `InFlightRequests.count()` in one thread while a connection or disconnection is happening. In this case one thread is iterating over the map while another is adding to/removing from it, thus causing the exception. We could potentially fix this with a volatile, as in KAFKA-4950. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
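The race described above is easy to reproduce outside Kafka, because HashMap's iterators are fail-fast: any structural modification between iterator creation and `next()` throws. A minimal, self-contained Java sketch of both the failure and the volatile-counter style of fix (class and field names are illustrative, not Kafka's):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class InFlightCountDemo {
    // Counting by iterating the map is unsafe if another thread mutates it.
    static int countByIteration(Map<String, Integer> inFlight) {
        int total = 0;
        for (int n : inFlight.values()) {
            total += n;
        }
        return total;
    }

    // The KAFKA-4950-style fix: maintain a separate volatile counter that the
    // metrics thread reads without touching the map at all.
    static volatile int inFlightCount = 0;

    public static void main(String[] args) {
        Map<String, Integer> inFlight = new HashMap<>();
        inFlight.put("node-1", 3);
        System.out.println("count=" + countByIteration(inFlight)); // prints "count=3"

        // Simulate the race deterministically: start iterating, then mutate.
        Iterator<Integer> it = inFlight.values().iterator();
        inFlight.put("node-2", 1); // a connection established mid-iteration
        try {
            it.next();
        } catch (ConcurrentModificationException e) {
            System.out.println("CME, as in the bug report");
        }

        // The volatile counter is updated on add/remove and read lock-free.
        inFlightCount = 4;
        System.out.println("count=" + inFlightCount); // prints "count=4"
    }
}
```

The trade-off is that the counter can be momentarily stale relative to the map, which is acceptable for a metric.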
[jira] [Created] (KAFKA-6204) Interceptor and MetricsReporter should implement java.io.Closeable
Charly Molter created KAFKA-6204: Summary: Interceptor and MetricsReporter should implement java.io.Closeable Key: KAFKA-6204 URL: https://issues.apache.org/jira/browse/KAFKA-6204 Project: Kafka Issue Type: Improvement Components: clients Reporter: Charly Molter Priority: Minor The serializers and deserializers extend the Closeable interface, and even the internal ConsumerInterceptors and ProducerInterceptors wrapper classes implement it. The ConsumerInterceptor, ProducerInterceptor and MetricsReporter interfaces, however, do not extend Closeable. Maybe they should, for coherence with the rest of the APIs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
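For illustration, here is what the coherence argument buys in plain Java: once a plugin interface extends `java.io.Closeable`, callers can release any plugin uniformly, e.g. with try-with-resources. The `Reporter`/`ConsoleReporter` names below are hypothetical stand-ins, not Kafka interfaces:

```java
import java.io.Closeable;

public class CloseableDemo {
    // A plugin contract that extends Closeable, as the serializers already do.
    // Overriding close() without "throws IOException" narrows the signature,
    // so callers need no checked-exception handling.
    interface Reporter extends Closeable {
        void report(String metric);
        @Override
        void close();
    }

    static class ConsoleReporter implements Reporter {
        boolean closed = false;
        @Override
        public void report(String metric) { System.out.println(metric); }
        @Override
        public void close() { closed = true; }
    }

    public static void main(String[] args) {
        ConsoleReporter r = new ConsoleReporter();
        // try-with-resources works for anything Closeable.
        try (Reporter reporter = r) {
            reporter.report("records-lag");
        }
        System.out.println("closed=" + r.closed); // prints "closed=true"
    }
}
```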
[jira] [Created] (KAFKA-6192) In Config always transform Properties to Map
Charly Molter created KAFKA-6192: Summary: In Config always transform Properties to Map Key: KAFKA-6192 URL: https://issues.apache.org/jira/browse/KAFKA-6192 Project: Kafka Issue Type: Improvement Components: clients Reporter: Charly Molter Priority: Minor Currently there is a lot of duplicated code in AbstractConfig, ConfigDef and the clients that works with both Properties and Map. Properties is a Map
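The deduplication being proposed is essentially a one-time conversion at the API boundary, so that all internal code paths deal with a single type. A minimal sketch of that idea (the helper name `propsToMap` is hypothetical):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class PropsToMapDemo {
    // Convert Properties once at the entry point; everything downstream only
    // ever sees Map<String, Object>, removing the duplicated overloads.
    static Map<String, Object> propsToMap(Properties props) {
        Map<String, Object> map = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            map.put(name, props.getProperty(name));
        }
        return map;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        Map<String, Object> config = propsToMap(props);
        System.out.println(config.get("bootstrap.servers")); // prints "localhost:9092"
    }
}
```

Using `stringPropertyNames()` also folds in any chained default Properties, which a raw cast to `Map` would miss.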
[jira] [Created] (KAFKA-6180) AbstractConfig.getList() should never return null
Charly Molter created KAFKA-6180: Summary: AbstractConfig.getList() should never return null Key: KAFKA-6180 URL: https://issues.apache.org/jira/browse/KAFKA-6180 Project: Kafka Issue Type: Improvement Components: config Affects Versions: 1.0.0, 0.11.0.0, 0.10.2.1, 0.10.2.0, 0.10.1.1, 0.10.1.0 Reporter: Charly Molter Priority: Trivial AbstractConfig.getList returns null if the property is unset and there is no default. This creates a lot of cases where we need to do null checks (and remember to do them). It's good practice to just return an empty list, as code usually handles empty lists naturally. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
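A sketch of the proposed behavior, using a simplified stand-in for AbstractConfig rather than the real class:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GetListDemo {
    // Simplified stand-in for AbstractConfig.getList: when the key is unset
    // and has no default, return an immutable empty list rather than null.
    static List<String> getList(Map<String, List<String>> values, String key) {
        List<String> value = values.get(key);
        return value == null ? Collections.emptyList() : value;
    }

    public static void main(String[] args) {
        Map<String, List<String>> values = new HashMap<>();
        // No null check needed: the loop body simply never runs.
        for (String cls : getList(values, "interceptor.classes")) {
            System.out.println(cls);
        }
        System.out.println(getList(values, "interceptor.classes").isEmpty()); // prints "true"
    }
}
```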
[jira] [Created] (KAFKA-5890) records.lag should use tags for topic and partition rather than using metric name.
Charly Molter created KAFKA-5890: Summary: records.lag should use tags for topic and partition rather than using metric name. Key: KAFKA-5890 URL: https://issues.apache.org/jira/browse/KAFKA-5890 Project: Kafka Issue Type: Bug Components: clients Affects Versions: 0.10.2.0 Reporter: Charly Molter As part of KIP-92 [1] a per-partition lag metric was added. These metrics are really useful; however, the implementation encodes the topic and partition as a prefix of the metric name: https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L1321-L1344 Usually these kinds of metrics use tags, with a name that stays constant across all topics and partitions. We have a custom reporter which aggregates topics/partitions together to avoid an explosion in the number of KPIs, and this KPI doesn't support that because it has no tags, just a complex name. [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-92+-+Add+per+partition+lag+metrics+to+KafkaConsumer -- This message was sent by Atlassian JIRA (v6.4.14#64029)
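The difference the issue is pointing at can be shown with a toy metric model: when the name is constant and topic/partition live in a tag map, a reporter can aggregate by matching on the name alone. This is a simplified model, not Kafka's MetricName class:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MetricTagsDemo {
    // A metric identified by a constant name plus a tag map, in the style the
    // issue proposes.
    static class Metric {
        final String name;
        final Map<String, String> tags;
        final double value;
        Metric(String name, Map<String, String> tags, double value) {
            this.name = name; this.tags = tags; this.value = value;
        }
    }

    // A reporter can aggregate across partitions by matching on the constant
    // name alone, something a name like "mytopic-0.records-lag" prevents.
    static double totalLag(List<Metric> metrics) {
        double total = 0;
        for (Metric m : metrics) {
            if (m.name.equals("records-lag")) total += m.value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Metric> metrics = new ArrayList<>();
        Map<String, String> t0 = new HashMap<>();
        t0.put("topic", "verifiable-test-topic"); t0.put("partition", "0");
        Map<String, String> t1 = new HashMap<>();
        t1.put("topic", "verifiable-test-topic"); t1.put("partition", "1");
        metrics.add(new Metric("records-lag", t0, 12));
        metrics.add(new Metric("records-lag", t1, 30));
        System.out.println("total lag = " + totalLag(metrics)); // prints "total lag = 42.0"
    }
}
```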
[jira] [Created] (KAFKA-5513) Contradicting scalaDoc for AdminUtils.assignReplicasToBrokers
Charly Molter created KAFKA-5513: Summary: Contradicting scalaDoc for AdminUtils.assignReplicasToBrokers Key: KAFKA-5513 URL: https://issues.apache.org/jira/browse/KAFKA-5513 Project: Kafka Issue Type: Improvement Components: core Reporter: Charly Molter Priority: Trivial The documentation for AdminUtils.assignReplicasToBrokers seems to contradict itself. It says in the description: "As the result, if the number of replicas is equal to or greater than the number of racks, it will ensure that each rack will get at least one replica." This means it is possible to get an assignment with multiple replicas in a rack (if there are fewer racks than the replication factor). However, the @throws clause says: "@throws AdminOperationException If rack information is supplied but it is incomplete, or if it is not possible to assign each replica to a unique rack." This seems to contradict the first claim. In practice it doesn't throw when RF > #racks, so that point in the @throws clause should probably be removed. https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/AdminUtils.scala#L121-L130 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
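The behavior described in the first quote can be seen with a toy round-robin assignment over racks: with RF greater than or equal to the number of racks, every rack gets at least one replica, some racks get several, and nothing throws. This models only the documented property, not the real AdminUtils algorithm:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RackAssignDemo {
    // Toy round-robin assignment: distribute replicationFactor replicas over
    // the given racks, counting replicas per rack.
    static Map<String, Integer> assign(List<String> racks, int replicationFactor) {
        Map<String, Integer> perRack = new HashMap<>();
        for (int i = 0; i < replicationFactor; i++) {
            String rack = racks.get(i % racks.size());
            perRack.merge(rack, 1, Integer::sum);
        }
        return perRack;
    }

    public static void main(String[] args) {
        List<String> racks = new ArrayList<>();
        racks.add("rack-a");
        racks.add("rack-b");
        // RF 3 > 2 racks: one rack necessarily holds two replicas, and the
        // assignment still succeeds, contradicting the @throws claim.
        System.out.println(assign(racks, 3));
    }
}
```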
[jira] [Commented] (KAFKA-3992) InstanceAlreadyExistsException Error for Consumers Starting in Parallel
[ https://issues.apache.org/jira/browse/KAFKA-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15946118#comment-15946118 ] Charly Molter commented on KAFKA-3992: -- This is not a contribution. While I understand this limit, asking for a unique client-id per thread seems unreasonable, for the following reasons:
1) Client-id is a concept that survives all the way to the broker. Requiring unique client-ids would mean an explosion in the number of metrics on the brokers.
2) Security and quotas depend heavily on client-id, e.g. from the docs: "can be applied to (user, client-id), user or client-id groups". Adding 1000 quota entries to accommodate an app with 1000 threads might be a bit annoying.
3) The docs clearly do not say it should be unique: "An id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging." or "The client id is a user-specified string sent in each request to help trace calls. It should logically identify the application making the request."
4) KIP-98 introduces a producer-id which is supposed to be unique; if client-id were already unique, what would be the point of producer-id?
It doesn't seem that client-id was created to identify a specific instance but rather an application (which may have multiple client instances). So it's either unclear in the docs or a problem in the metrics API. What do you think [~ewencp]? 
> InstanceAlreadyExistsException Error for Consumers Starting in Parallel > --- > > Key: KAFKA-3992 > URL: https://issues.apache.org/jira/browse/KAFKA-3992 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.9.0.0, 0.10.0.0 >Reporter: Alexander Cook >Assignee: Ewen Cheslack-Postava > > I see the following error sometimes when I start multiple consumers at about > the same time in the same process (separate threads). Everything seems to > work fine afterwards, so should this not actually be an ERROR level message, > or could there be something going wrong that I don't see? > Let me know if I can provide any more info! > Error processing messages: Error registering mbean > kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1 > org.apache.kafka.common.KafkaException: Error registering mbean > kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1 > > Caused by: javax.management.InstanceAlreadyExistsException: > kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1 > Here is the full stack trace: > M[?:com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples:-1] - > Error processing messages: Error registering mbean > kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1 > org.apache.kafka.common.KafkaException: Error registering mbean > kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1 > at > org.apache.kafka.common.metrics.JmxReporter.reregister(JmxReporter.java:159) > at > org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:77) > at > org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:288) > at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:177) > at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:162) > at > org.apache.kafka.common.network.Selector$SelectorMetrics.maybeRegisterConnectionMetrics(Selector.java:641) > at 
org.apache.kafka.common.network.Selector.poll(Selector.java:268) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:270) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:303) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:197) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:187) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:126) > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorKnown(AbstractCoordinator.java:186) > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:857) > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:829) > at > com.ibm.streamsx.messaging.kafka.KafkaConsumerV9.produceTuples(KafkaConsumerV9.java:129) > at > com.ibm.streamsx.messaging.kafka.KafkaConsumerV9$1.run(KafkaConsumerV9.java:70) >
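The root `InstanceAlreadyExistsException` in the stack trace above can be reproduced with the JDK's MBean server alone: registering two MBeans under the same ObjectName throws, which is exactly what happens when two clients in one JVM end up with the same default client-id. A minimal sketch (the MBean itself is a throwaway illustration):

```java
import java.lang.management.ManagementFactory;
import javax.management.InstanceAlreadyExistsException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class MBeanClashDemo {
    // A trivial standard MBean: class Demo, management interface DemoMBean.
    public interface DemoMBean { int getValue(); }
    public static class Demo implements DemoMBean {
        public int getValue() { return 42; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // The same name a second consumer with client-id=consumer-1 computes.
        ObjectName name = new ObjectName(
            "kafka.consumer:type=consumer-node-metrics,client-id=consumer-1,node-id=node--1");
        server.registerMBean(new Demo(), name);
        try {
            server.registerMBean(new Demo(), name); // second registration
        } catch (InstanceAlreadyExistsException e) {
            System.out.println("clash: " + e.getMessage());
        }
        server.unregisterMBean(name); // clean up
    }
}
```

This is why JmxReporter's `reregister` path surfaces the error: the name collision happens at the platform MBean server, below anything Kafka controls.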
[jira] [Commented] (KAFKA-4195) support throttling on request rate
[ https://issues.apache.org/jira/browse/KAFKA-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878008#comment-15878008 ] Charly Molter commented on KAFKA-4195: -- For those looking for the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas > support throttling on request rate > -- > > Key: KAFKA-4195 > URL: https://issues.apache.org/jira/browse/KAFKA-4195 > Project: Kafka > Issue Type: Improvement >Reporter: Jun Rao >Assignee: Rajini Sivaram > Labels: needs-kip > > Currently, we can throttle the client by data volume. However, if a client > sends requests too quickly (e.g., a consumer with min.byte configured to 0), > it can still overwhelm the broker. It would be useful to additionally support > throttling by request rate. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
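The property byte-volume quotas lack, and request-rate throttling adds, can be illustrated with a generic token bucket in which every request costs one token regardless of its byte size. Note this is only an illustration of rate limiting in general; KIP-124 itself ended up throttling on request handler utilization rather than a raw request count:

```java
public class TokenBucketDemo {
    // A generic token bucket: up to `capacity` tokens, refilled at
    // `ratePerSec`. Each request costs one token whatever its size, so a
    // flood of tiny fetch requests is still limited.
    static class TokenBucket {
        private final double capacity;
        private final double ratePerSec;
        private double tokens;
        private long lastNanos;

        TokenBucket(double capacity, double ratePerSec, long nowNanos) {
            this.capacity = capacity;
            this.ratePerSec = ratePerSec;
            this.tokens = capacity;
            this.lastNanos = nowNanos;
        }

        // Returns true if a request arriving at nowNanos is admitted.
        boolean tryAcquire(long nowNanos) {
            double elapsedSec = (nowNanos - lastNanos) / 1e9;
            tokens = Math.min(capacity, tokens + elapsedSec * ratePerSec);
            lastNanos = nowNanos;
            if (tokens >= 1.0) {
                tokens -= 1.0;
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(2, 1, 0);
        System.out.println(bucket.tryAcquire(0));              // true  (burst)
        System.out.println(bucket.tryAcquire(0));              // true  (burst)
        System.out.println(bucket.tryAcquire(0));              // false (empty)
        System.out.println(bucket.tryAcquire(1_000_000_000L)); // true  (1s refill)
    }
}
```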
[jira] [Comment Edited] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.
[ https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467691#comment-15467691 ] Charly Molter edited comment on KAFKA-2729 at 9/6/16 3:38 PM: -- Hi, We had this issue on a test cluster running 0.10.0.0 so I took time to investigate some more. We had a bunch of disconnections from Zookeeper, and two changes of controller in a short time: broker 103 was controller with epoch 44, broker 104 was controller with epoch 45. I looked at one specific partition and found the following pattern: 101 was the broker which thought it was leader, but it kept failing to shrink the ISR with: Partition [verifiable-test-topic,0] on broker 101: Shrinking ISR for partition [verifiable-test-topic,0] from 101,301,201 to 101,201 Partition [verifiable-test-topic,0] on broker 101: Cached zkVersion [185] not equal to that in zookeeper, skip updating ISR Looking at ZK we have: get /brokers/topics/verifiable-test-topic/partitions/0/state {"controller_epoch":44,"leader":301,"version":1,"leader_epoch":96,"isr":[301]} And metadata (from a random broker) says: Topic: verifiable-test-topic Partition: 0 Leader: 301 Replicas: 101,201,301 Isr: 301 Digging in the logs, here's what we think happened: 1. 103 sends becomeFollower to 301 with epoch 44 and leaderEpoch 95 2. 104 sends becomeLeader to 101 with epoch 45 and leaderEpoch 95 (after updating zk!) 3. 103 sends becomeLeader to 301 with epoch 44 and leaderEpoch 96 (after updating zk!) 4. 104 sends becomeFollower to 301 with epoch 45 and leaderEpoch 95 Step 4 is ignored by 301 as the leaderEpoch is older than the current one. We are missing a request: 103 sends becomeFollower to 101 with epoch 44 and leaderEpoch 95. I believe this happened because when the controller steps down it empties its request queue, so this request never left the controller: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L53-L57 So we ended up in a case where 301 and 101 both think they are leader. Naturally 101 wants to update the state in ZK to remove 301 from the ISR, as 301 isn't even fetching from 101. Does this seem correct to you? It seems impossible to guarantee that two controllers never overlap, which makes it quite hard to avoid having two leaders for a short time. Still, there should be a way for this situation to get back to a good state. I believe the impact of this would be: - writes with acks = -1: unavailability - writes with acks != -1: possible log divergence (I'm unsure about this). Hope this helps. While I had to fix the cluster by bouncing a node, I kept most of the logs, so let me know if you need more info.
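The "Cached zkVersion not equal" guard described above is a plain compare-and-set on the znode version: the broker submits the version it last read, and the write is rejected if the node has since been updated by someone else (here, the other "leader"). A self-contained model of that check, not the actual ZooKeeper client API:

```java
public class VersionedWriteDemo {
    // A minimal versioned cell, modeling a znode's (data, version) pair.
    static class VersionedCell {
        private String data;
        private int version = 0;

        VersionedCell(String data) { this.data = data; }

        // Conditional update, like ZooKeeper's setData(path, data, version):
        // succeeds only if expectedVersion matches the current version.
        synchronized boolean setData(String newData, int expectedVersion) {
            if (expectedVersion != version) {
                return false; // "Cached zkVersion not equal ... skip updating"
            }
            data = newData;
            version++;
            return true;
        }

        synchronized int version() { return version; }
    }

    public static void main(String[] args) {
        VersionedCell state = new VersionedCell("isr=[101,201,301]");
        int cachedVersion = state.version(); // broker 101's cached zkVersion

        // The other leadership path updates the znode first...
        state.setData("isr=[301]", cachedVersion);

        // ...so broker 101's stale conditional write is rejected, and without
        // re-reading the znode it retries forever, as in the logs.
        boolean ok = state.setData("isr=[101,201]", cachedVersion);
        System.out.println("shrink ISR succeeded: " + ok); // prints "shrink ISR succeeded: false"
    }
}
```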
[jira] [Commented] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.
[ https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467691#comment-15467691 ] Charly Molter commented on KAFKA-2729: -- Hi, We had this issue on a test cluster so I took time to investigate some more. We had a bunch of disconnections from Zookeeper, and two changes of controller in a short time: broker 103 was controller with epoch 44, broker 104 was controller with epoch 45. I looked at one specific partition and found the following pattern: 101 was the broker which thought it was leader, but it kept failing to shrink the ISR with: Partition [verifiable-test-topic,0] on broker 101: Shrinking ISR for partition [verifiable-test-topic,0] from 101,301,201 to 101,201 Partition [verifiable-test-topic,0] on broker 101: Cached zkVersion [185] not equal to that in zookeeper, skip updating ISR Looking at ZK we have: get /brokers/topics/verifiable-test-topic/partitions/0/state {"controller_epoch":44,"leader":301,"version":1,"leader_epoch":96,"isr":[301]} And metadata (from a random broker) says: Topic: verifiable-test-topic Partition: 0 Leader: 301 Replicas: 101,201,301 Isr: 301 Digging in the logs, here's what we think happened: 1. 103 sends becomeFollower to 301 with epoch 44 and leaderEpoch 95 2. 104 sends becomeLeader to 101 with epoch 45 and leaderEpoch 95 (after updating zk!) 3. 103 sends becomeLeader to 301 with epoch 44 and leaderEpoch 96 (after updating zk!) 4. 104 sends becomeFollower to 301 with epoch 45 and leaderEpoch 95 Step 4 is ignored by 301 as the leaderEpoch is older than the current one. We are missing a request: 103 sends becomeFollower to 101 with epoch 44 and leaderEpoch 95. I believe this happened because when the controller steps down it empties its request queue, so this request never left the controller: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L53-L57 So we ended up in a case where 301 and 101 both think they are leader. Naturally 101 wants to update the state in ZK to remove 301 from the ISR, as 301 isn't even fetching from 101. Does this seem correct to you? It seems impossible to guarantee that two controllers never overlap, which makes it quite hard to avoid having two leaders for a short time. Still, there should be a way for this situation to get back to a good state. I believe the impact of this would be: - writes with acks = -1: unavailability - writes with acks != -1: possible log divergence depending on min in-sync replicas (I'm unsure about this). Hope this helps. While I had to fix the cluster by bouncing a node, I kept most of the logs, so let me know if you need more info. > Cached zkVersion not equal to that in zookeeper, broker not recovering. > --- > > Key: KAFKA-2729 > URL: https://issues.apache.org/jira/browse/KAFKA-2729 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8.2.1 >Reporter: Danil Serdyuchenko > > After a small network wobble where zookeeper nodes couldn't reach each other, > we started seeing a large number of undereplicated partitions. The zookeeper > cluster recovered, however we continued to see a large number of > undereplicated partitions. Two brokers in the kafka cluster were showing this > in the logs: > {code} > [2015-10-27 11:36:00,888] INFO Partition > [__samza_checkpoint_event-creation_1,3] on broker 5: Shrinking ISR for > partition [__samza_checkpoint_event-creation_1,3] from 6,5 to 5 > (kafka.cluster.Partition) > [2015-10-27 11:36:00,891] INFO Partition > [__samza_checkpoint_event-creation_1,3] on broker 5: Cached zkVersion [66] > not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition) > {code} > For all of the topics on the affected brokers. Both brokers only recovered > after a restart. Our own investigation yielded nothing, I was hoping you > could shed some light on this issue. Possibly if it's related to: > https://issues.apache.org/jira/browse/KAFKA-1382 , however we're using > 0.8.2.1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)