[jira] [Created] (KAFKA-12712) KRaft: Missing controller.quorom.voters config not properly handled
Magnus Edenhill created KAFKA-12712: --- Summary: KRaft: Missing controller.quorom.voters config not properly handled Key: KAFKA-12712 URL: https://issues.apache.org/jira/browse/KAFKA-12712 Project: Kafka Issue Type: Bug Components: core Affects Versions: 2.8.0 Reporter: Magnus Edenhill When trying out KRaft in 2.8 I misspelled controller.quorum.voters as controller.quorom.voters, but the broker did not fail to start, nor did it print any warning. Instead it raised this error: {code:java} [2021-04-23 18:25:13,484] INFO Starting controller (kafka.server.ControllerServer)[2021-04-23 18:25:13,484] INFO Starting controller (kafka.server.ControllerServer)[2021-04-23 18:25:13,485] ERROR [kafka-raft-io-thread]: Error due to (kafka.raft.KafkaRaftManager$RaftIoThread)java.lang.IllegalArgumentException: bound must be positive at java.util.Random.nextInt(Random.java:388) at org.apache.kafka.raft.RequestManager.findReadyVoter(RequestManager.java:57) at org.apache.kafka.raft.KafkaRaftClient.maybeSendAnyVoterFetch(KafkaRaftClient.java:1778) at org.apache.kafka.raft.KafkaRaftClient.pollUnattachedAsObserver(KafkaRaftClient.java:2080) at org.apache.kafka.raft.KafkaRaftClient.pollUnattached(KafkaRaftClient.java:2061) at org.apache.kafka.raft.KafkaRaftClient.pollCurrentState(KafkaRaftClient.java:2096) at org.apache.kafka.raft.KafkaRaftClient.poll(KafkaRaftClient.java:2181) at kafka.raft.KafkaRaftManager$RaftIoThread.doWork(RaftManager.scala:53) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96) {code} which I guess eventually (1 minute later) led to this error, which terminated the broker: {code:java} [2021-04-23 18:26:14,435] ERROR [BrokerLifecycleManager id=2] Shutting down because we were unable to register with the controller quorum. (kafka.server.BrokerLifecycleManager)[2021-04-23 18:26:14,435] ERROR [BrokerLifecycleManager id=2] Shutting down because we were unable to register with the controller quorum. (kafka.server.BrokerLifecycleManager)[2021-04-23 18:26:14,436] INFO [BrokerLifecycleManager id=2] registrationTimeout: shutting down event queue. (org.apache.kafka.queue.KafkaEventQueue)[2021-04-23 18:26:14,437] INFO [BrokerLifecycleManager id=2] Transitioning from STARTING to SHUTTING_DOWN. (kafka.server.BrokerLifecycleManager)[2021-04-23 18:26:14,437] INFO [broker-2-to-controller-send-thread]: Shutting down (kafka.server.BrokerToControllerRequestThread)[2021-04-23 18:26:14,438] INFO [broker-2-to-controller-send-thread]: Stopped (kafka.server.BrokerToControllerRequestThread)[2021-04-23 18:26:14,438] INFO [broker-2-to-controller-send-thread]: Shutdown completed (kafka.server.BrokerToControllerRequestThread)[2021-04-23 18:26:14,441] ERROR [BrokerServer id=2] Fatal error during broker startup. Prepare to shutdown (kafka.server.BrokerServer)java.util.concurrent.CancellationException at java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2276) at kafka.server.BrokerLifecycleManager$ShutdownEvent.run(BrokerLifecycleManager.scala:474) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:174) at java.lang.Thread.run(Thread.java:748) {code} But since the client listeners were made available prior to shutting down, the broker was deemed up and operational by the (naive) monitoring tool. So: - The broker should fail on startup on invalid/unknown config properties. I understand this is technically tricky, so at least a warning log should be printed. - Perhaps do not create client listeners before the control plane is somewhat happy.
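For illustration, a minimal hypothetical Java sketch (not the actual org.apache.kafka.raft.RequestManager code) of why an empty voter set ends in the "bound must be positive" error above, and of the kind of startup-time validation the first suggestion asks for:
{code:java}
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class QuorumVotersCheck {
    // Hypothetical illustration: with controller.quorum.voters missing, the
    // configured voter set is empty, and picking a random voter via
    // Random.nextInt(0) throws "IllegalArgumentException: bound must be
    // positive" -- the error seen in the raft IO thread log above.
    public static void main(String[] args) {
        List<String> voters = Collections.emptyList(); // misconfigured: no voters

        // The suggested behaviour: validate the config at startup and fail
        // fast with a clear message instead of letting the raft loop blow up.
        if (voters.isEmpty()) {
            throw new IllegalStateException(
                "controller.quorum.voters must contain at least one voter");
        }

        String voter = voters.get(new Random().nextInt(voters.size()));
        System.out.println("Selected voter: " + voter);
    }
}
{code}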
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (KAFKA-9180) Broker won't start with empty log dir
[ https://issues.apache.org/jira/browse/KAFKA-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Magnus Edenhill resolved KAFKA-9180. Resolution: Invalid Turned out to be old client jars making a mess. > Broker won't start with empty log dir > - > > Key: KAFKA-9180 > URL: https://issues.apache.org/jira/browse/KAFKA-9180 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 2.4.0 >Reporter: Magnus Edenhill >Assignee: Ismael Juma >Priority: Blocker > > On kafka trunk at commit 1675115ec193acf4c7d44e68a57272edfec0b455: > > Attempting to start the broker with an existing but empty log dir yields the > following error and terminates the process: > {code:java} > [2019-11-13 10:42:16,922] ERROR Failed to read meta.properties file under dir > /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs/meta.properties > due to > /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs/meta.properties > (No such file or directory) > (kafka.server.BrokerMetadataCheckpoint)[2019-11-13 10:42:16,924] ERROR Fail > to read meta.properties under log directory > /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs > (kafka.server.KafkaServer)java.io.FileNotFoundException: > /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs/meta.properties > (No such file or directory)at java.io.FileInputStream.open0(Native > Method)at java.io.FileInputStream.open(FileInputStream.java:195) > at java.io.FileInputStream.(FileInputStream.java:138)at > java.io.FileInputStream.(FileInputStream.java:93)at > org.apache.kafka.common.utils.Utils.loadProps(Utils.java:512)at > kafka.server.BrokerMetadataCheckpoint.liftedTree2$1(BrokerMetadataCheckpoint.scala:73) > at > kafka.server.BrokerMetadataCheckpoint.read(BrokerMetadataCheckpoint.scala:72) >at > kafka.server.KafkaServer.$anonfun$getBrokerMetadataAndOfflineDirs$1(KafkaServer.scala:704) > at > kafka.server.KafkaServer.$anonfun$getBrokerMetadataAndOfflineDirs$1$adapted(KafkaServer.scala:702) > at > scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36) > at > scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33) > at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38) > at > kafka.server.KafkaServer.getBrokerMetadataAndOfflineDirs(KafkaServer.scala:702) > at kafka.server.KafkaServer.startup(KafkaServer.scala:214)at > kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44) > at kafka.Kafka$.main(Kafka.scala:84)at > kafka.Kafka.main(Kafka.scala) {code} > > > Changing the catch to FileNotFoundException fixes the issue, here: > [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/BrokerMetadataCheckpoint.scala#L84] > > > This is a regression from 2.3.x. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9180) Broker won't start with empty log dir
Magnus Edenhill created KAFKA-9180: -- Summary: Broker won't start with empty log dir Key: KAFKA-9180 URL: https://issues.apache.org/jira/browse/KAFKA-9180 Project: Kafka Issue Type: Bug Components: core Affects Versions: 2.4.0 Reporter: Magnus Edenhill Assignee: Ismael Juma On kafka trunk at commit 1675115ec193acf4c7d44e68a57272edfec0b455: Attempting to start the broker with an existing but empty log dir yields the following error and terminates the process: {code:java} [2019-11-13 10:42:16,922] ERROR Failed to read meta.properties file under dir /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs/meta.properties due to /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs/meta.properties (No such file or directory) (kafka.server.BrokerMetadataCheckpoint)[2019-11-13 10:42:16,924] ERROR Fail to read meta.properties under log directory /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs (kafka.server.KafkaServer)java.io.FileNotFoundException: /Users/magnus/src/librdkafka/tests/tmp/LibrdkafkaTestCluster/73638126/KafkaBrokerApp/2/logs/meta.properties (No such file or directory)at java.io.FileInputStream.open0(Native Method)at java.io.FileInputStream.open(FileInputStream.java:195) at java.io.FileInputStream.(FileInputStream.java:138)at java.io.FileInputStream.(FileInputStream.java:93)at org.apache.kafka.common.utils.Utils.loadProps(Utils.java:512)at kafka.server.BrokerMetadataCheckpoint.liftedTree2$1(BrokerMetadataCheckpoint.scala:73) at kafka.server.BrokerMetadataCheckpoint.read(BrokerMetadataCheckpoint.scala:72) at kafka.server.KafkaServer.$anonfun$getBrokerMetadataAndOfflineDirs$1(KafkaServer.scala:704) at kafka.server.KafkaServer.$anonfun$getBrokerMetadataAndOfflineDirs$1$adapted(KafkaServer.scala:702) at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36) at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33) at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38) at kafka.server.KafkaServer.getBrokerMetadataAndOfflineDirs(KafkaServer.scala:702) at kafka.server.KafkaServer.startup(KafkaServer.scala:214)at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44) at kafka.Kafka$.main(Kafka.scala:84)at kafka.Kafka.main(Kafka.scala) {code} Changing the catch to FileNotFoundException fixes the issue, here: [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/BrokerMetadataCheckpoint.scala#L84] This is a regression from 2.3.x. -- This message was sent by Atlassian Jira (v8.3.4#803005)
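A minimal sketch of the proposed handling (a hypothetical reader class, not the actual kafka.server.BrokerMetadataCheckpoint code): treat a missing meta.properties in an otherwise empty log dir as "no metadata yet" by catching java.io.FileNotFoundException, which is what FileInputStream actually throws:
{code:java}
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Optional;
import java.util.Properties;

public class MetaPropertiesReader {
    // Returns the broker metadata if meta.properties exists, or empty if the
    // file is absent, instead of treating the absence as a fatal startup error.
    public static Optional<Properties> read(File logDir) throws IOException {
        File metaFile = new File(logDir, "meta.properties");
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(metaFile)) {
            props.load(in);
            return Optional.of(props);
        } catch (FileNotFoundException e) {
            // FileInputStream throws java.io.FileNotFoundException; catching it
            // here (as the report suggests) turns "file missing" into "no metadata".
            return Optional.empty();
        }
    }
}
{code}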
[jira] [Created] (KAFKA-7549) Old ProduceRequest with zstd compression does not return error to client
Magnus Edenhill created KAFKA-7549: -- Summary: Old ProduceRequest with zstd compression does not return error to client Key: KAFKA-7549 URL: https://issues.apache.org/jira/browse/KAFKA-7549 Project: Kafka Issue Type: Bug Components: compression Reporter: Magnus Edenhill Kafka broker v2.1.0rc0. KIP-110 states that: "Zstd will only be allowed for the bumped produce API. That is, for older version clients(=below KAFKA_2_1_IV0), we return UNSUPPORTED_COMPRESSION_TYPE regardless of the message format." However, sending a ProduceRequest V3 with zstd compression (which is a client side bug) closes the connection with the following exception rather than returning UNSUPPORTED_COMPRESSION_TYPE in the ProduceResponse: {noformat} [2018-10-25 11:40:31,813] ERROR Exception while processing request from 127.0.0.1:60723-127.0.0.1:60656-94 (kafka.network.Processor) org.apache.kafka.common.errors.InvalidRequestException: Error getting request for apiKey: PRODUCE, apiVersion: 3, connectionId: 127.0.0.1:60723-127.0.0.1:60656-94, listenerName: ListenerName(PLAINTEXT), principal: User:ANONYMOUS Caused by: org.apache.kafka.common.record.InvalidRecordException: Produce requests with version 3 are note allowed to use ZStandard compression {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
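A hedged sketch of the expected broker-side behaviour (hypothetical names, not Kafka's actual validation path): gate ZStandard on the produce request version and report UNSUPPORTED_COMPRESSION_TYPE in the response rather than throwing and closing the connection:
{code:java}
// Hypothetical types for illustration only.
enum CompressionType { NONE, GZIP, SNAPPY, LZ4, ZSTD }
enum ProduceError { NONE, UNSUPPORTED_COMPRESSION_TYPE }

class ProduceCompressionCheck {
    // KIP-110 introduced ZSTD with the bumped produce API (v7, KAFKA_2_1_IV0),
    // so older request versions should get an error code back in the response.
    static ProduceError validate(short apiVersion, CompressionType type) {
        if (type == CompressionType.ZSTD && apiVersion < 7) {
            return ProduceError.UNSUPPORTED_COMPRESSION_TYPE;
        }
        return ProduceError.NONE;
    }
}
{code}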
[jira] [Resolved] (KAFKA-6885) DescribeConfigs synonyms are identical to parent entry for BROKER resources
[ https://issues.apache.org/jira/browse/KAFKA-6885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Magnus Edenhill resolved KAFKA-6885. Resolution: Invalid > DescribeConfigs synonyms are identical to parent entry for BROKER > resources > --- > > Key: KAFKA-6885 > URL: https://issues.apache.org/jira/browse/KAFKA-6885 > Project: Kafka > Issue Type: Bug > Components: admin >Affects Versions: 1.1.0 >Reporter: Magnus Edenhill >Priority: Major > > The DescribeConfigs protocol response for BROKER resources returns synonyms > for various configuration entries, such as "log.dir". > The list of synonyms returned is identical to the parent configuration > entry, rather than the actual synonyms. > For example, for the broker "log.dir" config entry it returns one synonym, > also named "log.dir" rather than "log.dirs" or whatever the synonym is > supposed to be. > > This does not seem to happen for TOPIC resources. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-6885) DescribeConfigs synonyms are identical to parent entry for BROKER resources
Magnus Edenhill created KAFKA-6885: -- Summary: DescribeConfigs synonyms are identical to parent entry for BROKER resources Key: KAFKA-6885 URL: https://issues.apache.org/jira/browse/KAFKA-6885 Project: Kafka Issue Type: Bug Components: admin Affects Versions: 1.1.0 Reporter: Magnus Edenhill The DescribeConfigs protocol response for BROKER resources returns synonyms for various configuration entries, such as "log.dir". The list of synonyms returned is identical to the parent configuration entry, rather than the actual synonyms. For example, for the broker "log.dir" config entry it returns one synonym, also named "log.dir" rather than "log.dirs" or whatever the synonym is supposed to be. This does not seem to happen for TOPIC resources. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-6778) DescribeConfigs does not return error for non-existent topic
Magnus Edenhill created KAFKA-6778: -- Summary: DescribeConfigs does not return error for non-existent topic Key: KAFKA-6778 URL: https://issues.apache.org/jira/browse/KAFKA-6778 Project: Kafka Issue Type: Bug Components: admin Affects Versions: 1.1.0 Reporter: Magnus Edenhill Sending a DescribeConfigsRequest with a ConfigResource(TOPIC, "non-existent-topic") returns a fully populated ConfigResource back in the response with 24 configuration entries. A resource-level error_code of UnknownTopic.. would be expected instead. {code:java} [0081_admin / 1.143s] ConfigResource #0: type TOPIC (2), "rdkafkatest_rnd3df408bf5d94d696_DescribeConfigs_notexist": 24 ConfigEntries, error NO_ERROR () [0081_admin / 1.144s] #0/24: Source UNKNOWN (5): "compression.type"="producer" [is read-only=n, default=n, sensitive=n, synonym=n] with 1 synonym(s) {code} But the topic does not exist: {code:java} $ $KAFKA_PATH/bin/kafka-topics.sh --zookeeper $ZK_ADDRESS --list | grep rdkafkatest_rnd3df408bf5d94d696_DescribeConfigs_notexist ; echo $? 1 {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
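A sketch of how the expected behaviour would surface to a Java AdminClient caller; the bootstrap address is an assumption, and the catch branch shows what this report argues should happen for an unknown topic:
{code:java}
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

public class DescribeMissingTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        AdminClient admin = AdminClient.create(props);
        try {
            ConfigResource resource =
                new ConfigResource(ConfigResource.Type.TOPIC, "no-such-topic");
            try {
                Config config = admin.describeConfigs(
                    Collections.singleton(resource)).values().get(resource).get();
                // Observed behaviour per this report: default entries, no error.
                System.out.println("Unexpectedly got " + config.entries().size() + " entries");
            } catch (ExecutionException e) {
                // Expected behaviour: an UnknownTopic-style error should surface here.
                System.out.println("Got expected error: " + e.getCause());
            }
        } finally {
            admin.close();
        }
    }
}
{code}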
[jira] [Commented] (KAFKA-4340) Change the default value of log.message.timestamp.difference.max.ms to the same as log.retention.ms
[ https://issues.apache.org/jira/browse/KAFKA-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027366#comment-16027366 ] Magnus Edenhill commented on KAFKA-4340: Since this is a change to the protocol API (change of behaviour), I suspect that a proper KIP is required and would recommend to revert this commit for the upcoming release. > Change the default value of log.message.timestamp.difference.max.ms to the > same as log.retention.ms > --- > > Key: KAFKA-4340 > URL: https://issues.apache.org/jira/browse/KAFKA-4340 > Project: Kafka > Issue Type: Improvement > Components: core >Affects Versions: 0.10.1.0 >Reporter: Jiangjie Qin >Assignee: Jiangjie Qin > Fix For: 0.11.0.0 > > > [~junrao] brought up the following scenario: > If users are pumping data with timestamp already passed log.retention.ms into > Kafka, the messages will be appended to the log but will be immediately > rolled out by log retention thread when it kicks in and the messages will be > deleted. > To avoid this produce-and-deleted scenario, we can set the default value of > log.message.timestamp.difference.max.ms to be the same as log.retention.ms. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-4340) Change the default value of log.message.timestamp.difference.max.ms to the same as log.retention.ms
[ https://issues.apache.org/jira/browse/KAFKA-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16023668#comment-16023668 ] Magnus Edenhill commented on KAFKA-4340: Generally I would agree, but in this case I don't think there is a practical difference between discarding outdated messages and removing them with the retention cleaner; it is just a matter of time frame - discarding and not appending outdated messages to the logs is immediate, while the retention cleaner might kick in immediately or only after whatever maximum interval it is configured with. So from the producer and consumer perspectives the end result is pretty much the same: there is no guarantee that an outdated message will be seen by a consumer. However, rejecting the entire batch means there will be guaranteed data loss in the message stream: the producer will not try to re-send those failed messages, even if all but one message in the batch were actually okay. I strongly feel this is undesired behaviour from the application's point of view. > Change the default value of log.message.timestamp.difference.max.ms to the > same as log.retention.ms > --- > > Key: KAFKA-4340 > URL: https://issues.apache.org/jira/browse/KAFKA-4340 > Project: Kafka > Issue Type: Improvement > Components: core >Affects Versions: 0.10.1.0 >Reporter: Jiangjie Qin >Assignee: Jiangjie Qin > Fix For: 0.11.0.0 > > > [~junrao] brought up the following scenario: > If users are pumping data with timestamp already passed log.retention.ms into > Kafka, the messages will be appended to the log but will be immediately > rolled out by log retention thread when it kicks in and the messages will be > deleted. > To avoid this produce-and-deleted scenario, we can set the default value of > log.message.timestamp.difference.max.ms to be the same as log.retention.ms. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-4340) Change the default value of log.message.timestamp.difference.max.ms to the same as log.retention.ms
[ https://issues.apache.org/jira/browse/KAFKA-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022945#comment-16022945 ] Magnus Edenhill commented on KAFKA-4340: While the idea behind this JIRA is good (as a means of optimization), I think it might be troublesome in practice. If a producer sends a batch of N messages with one message being too old, the entire batch will fail (errors are propagated per partition, not per message) and the producer can't really recover and gracefully retry producing the non-timed-out messages. This problem is not tied to a specific client but rather to the nature of the data being produced: it will manifest itself with old timestamps, such as app-sourced timestamps, or things like MirrorMaker. A better alternative would perhaps be to silently discard the message on the broker instead (which is effectively the same as writing the message to the log and then immediately cleaning it before a consumer picks up the message). > Change the default value of log.message.timestamp.difference.max.ms to the > same as log.retention.ms > --- > > Key: KAFKA-4340 > URL: https://issues.apache.org/jira/browse/KAFKA-4340 > Project: Kafka > Issue Type: Improvement > Components: core >Affects Versions: 0.10.1.0 >Reporter: Jiangjie Qin >Assignee: Jiangjie Qin > Fix For: 0.11.0.0 > > > [~junrao] brought up the following scenario: > If users are pumping data with timestamp already passed log.retention.ms into > Kafka, the messages will be appended to the log but will be immediately > rolled out by log retention thread when it kicks in and the messages will be > deleted. > To avoid this produce-and-deleted scenario, we can set the default value of > log.message.timestamp.difference.max.ms to be the same as log.retention.ms. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
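An illustrative sketch of the failure mode described in that comment (hypothetical names, not the broker's actual validation code): one out-of-range timestamp fails the check, and because produce errors are reported per partition, the whole batch is lost to the producer:
{code:java}
class TimestampDifferenceCheck {
    // Models a bound like log.message.timestamp.difference.max.ms: if any
    // record's timestamp is too far from the broker's current time, the check
    // fails. The producer receives one error for the whole partition batch,
    // so it cannot selectively retry only the acceptable records.
    static void validateBatch(long[] recordTimestamps, long nowMs, long maxDiffMs) {
        for (long ts : recordTimestamps) {
            if (Math.abs(nowMs - ts) > maxDiffMs) {
                throw new IllegalArgumentException(
                    "Record timestamp " + ts + " is outside the allowed difference of "
                        + maxDiffMs + " ms; the entire batch is rejected");
            }
        }
    }
}
{code}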
[jira] [Reopened] (KAFKA-4340) Change the default value of log.message.timestamp.difference.max.ms to the same as log.retention.ms
[ https://issues.apache.org/jira/browse/KAFKA-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Magnus Edenhill reopened KAFKA-4340: > Change the default value of log.message.timestamp.difference.max.ms to the > same as log.retention.ms > --- > > Key: KAFKA-4340 > URL: https://issues.apache.org/jira/browse/KAFKA-4340 > Project: Kafka > Issue Type: Improvement > Components: core >Affects Versions: 0.10.1.0 >Reporter: Jiangjie Qin >Assignee: Jiangjie Qin > Fix For: 0.11.0.0 > > > [~junrao] brought up the following scenario: > If users are pumping data with timestamp already passed log.retention.ms into > Kafka, the messages will be appended to the log but will be immediately > rolled out by log retention thread when it kicks in and the messages will be > deleted. > To avoid this produce-and-deleted scenario, we can set the default value of > log.message.timestamp.difference.max.ms to be the same as log.retention.ms. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Issue Comment Deleted] (KAFKA-4842) Streams integration tests occasionally fail with connection error
[ https://issues.apache.org/jira/browse/KAFKA-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Magnus Edenhill updated KAFKA-4842: --- Comment: was deleted (was: Happened again on a trunk PR, https://github.com/apache/kafka/pull/2765#issuecomment-290381971 : https://builds.apache.org/job/kafka-pr-jdk7-scala2.10/2525/ {noformat} org.apache.kafka.streams.integration.QueryableStateIntegrationTest > queryOnRebalance FAILED java.lang.AssertionError: Condition not met within timeout 3. waiting for metadata, store and value to be non null at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:257) at org.apache.kafka.streams.integration.QueryableStateIntegrationTest.verifyAllKVKeys(QueryableStateIntegrationTest.java:262) at org.apache.kafka.streams.integration.QueryableStateIntegrationTest.queryOnRebalance(QueryableStateIntegrationTest.java:344) {noformat} ) > Streams integration tests occasionally fail with connection error > - > > Key: KAFKA-4842 > URL: https://issues.apache.org/jira/browse/KAFKA-4842 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.10.2.0 >Reporter: Eno Thereska >Assignee: Matthias J. Sax > Fix For: 0.11.0.0 > > > {noformat} > org.apache.kafka.streams.integration.QueryableStateIntegrationTest > > queryOnRebalance[0] FAILED java.lang.IllegalStateException: No entry found > for connection 0. In ClusterConnectionStates.java: node state. > {noformat} > This happens locally. Happened to > KStreamAggregationIntegrationTest.java.shouldReduce() once too. > Not clear this is streams related. Also, it's hard to reproduce since it > doesn't happen all the time. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (KAFKA-4983) Test failure: kafka.api.ConsumerBounceTest.testSubscribeWhenTopicUnavailable
[ https://issues.apache.org/jira/browse/KAFKA-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Magnus Edenhill updated KAFKA-4983: --- Description: The PR builder encountered this test failure: {{kafka.api.ConsumerBounceTest.testSubscribeWhenTopicUnavailable}} https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/2532/testReport/junit/kafka.api/ConsumerBounceTest/testSubscribeWhenTopicUnavailable/ {noformat} [2017-03-30 13:42:40,875] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472) [2017-03-30 13:42:40,884] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472) [2017-03-30 13:42:41,221] FATAL [Kafka Server 1], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer:118) kafka.common.KafkaException: Socket server failed to bind to localhost:42198: Address already in use. at kafka.network.Acceptor.openServerSocket(SocketServer.scala:330) at kafka.network.Acceptor.(SocketServer.scala:255) at kafka.network.SocketServer.$anonfun$startup$1(SocketServer.scala:99) at kafka.network.SocketServer.$anonfun$startup$1$adapted(SocketServer.scala:90) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:59) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:52) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at kafka.network.SocketServer.startup(SocketServer.scala:90) at kafka.server.KafkaServer.startup(KafkaServer.scala:215) at kafka.utils.TestUtils$.createServer(TestUtils.scala:122) at kafka.integration.KafkaServerTestHarness.$anonfun$setUp$1(KafkaServerTestHarness.scala:91) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234) at scala.collection.Iterator.foreach(Iterator.scala:929) at scala.collection.Iterator.foreach$(Iterator.scala:929) at scala.collection.AbstractIterator.foreach(Iterator.scala:1406) at scala.collection.IterableLike.foreach(IterableLike.scala:71) at scala.collection.IterableLike.foreach$(IterableLike.scala:70) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at scala.collection.TraversableLike.map(TraversableLike.scala:234) at scala.collection.TraversableLike.map$(TraversableLike.scala:227) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at kafka.integration.KafkaServerTestHarness.setUp(KafkaServerTestHarness.scala:91) at kafka.api.IntegrationTestHarness.setUp(IntegrationTestHarness.scala:65) at kafka.api.ConsumerBounceTest.setUp(ConsumerBounceTest.scala:67) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) at
[jira] [Created] (KAFKA-4983) Test failure: kafka.api.ConsumerBounceTest.testSubscribeWhenTopicUnavailable
Magnus Edenhill created KAFKA-4983: -- Summary: Test failure: kafka.api.ConsumerBounceTest.testSubscribeWhenTopicUnavailable Key: KAFKA-4983 URL: https://issues.apache.org/jira/browse/KAFKA-4983 Project: Kafka Issue Type: Test Reporter: Magnus Edenhill The PR builder encountered this test failure: {{kafka.api.ConsumerBounceTest.testSubscribeWhenTopicUnavailable}} https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/2532/testReport/junit/kafka.api/ConsumerBounceTest/testSubscribeWhenTopicUnavailable/ {{noformat}} [2017-03-30 13:42:40,875] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472) [2017-03-30 13:42:40,884] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472) [2017-03-30 13:42:41,221] FATAL [Kafka Server 1], Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer:118) kafka.common.KafkaException: Socket server failed to bind to localhost:42198: Address already in use. at kafka.network.Acceptor.openServerSocket(SocketServer.scala:330) at kafka.network.Acceptor.(SocketServer.scala:255) at kafka.network.SocketServer.$anonfun$startup$1(SocketServer.scala:99) at kafka.network.SocketServer.$anonfun$startup$1$adapted(SocketServer.scala:90) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:59) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:52) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at kafka.network.SocketServer.startup(SocketServer.scala:90) at kafka.server.KafkaServer.startup(KafkaServer.scala:215) at kafka.utils.TestUtils$.createServer(TestUtils.scala:122) at kafka.integration.KafkaServerTestHarness.$anonfun$setUp$1(KafkaServerTestHarness.scala:91) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234) at scala.collection.Iterator.foreach(Iterator.scala:929) at scala.collection.Iterator.foreach$(Iterator.scala:929) at scala.collection.AbstractIterator.foreach(Iterator.scala:1406) at scala.collection.IterableLike.foreach(IterableLike.scala:71) at scala.collection.IterableLike.foreach$(IterableLike.scala:70) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at scala.collection.TraversableLike.map(TraversableLike.scala:234) at scala.collection.TraversableLike.map$(TraversableLike.scala:227) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at kafka.integration.KafkaServerTestHarness.setUp(KafkaServerTestHarness.scala:91) at kafka.api.IntegrationTestHarness.setUp(IntegrationTestHarness.scala:65) at kafka.api.ConsumerBounceTest.setUp(ConsumerBounceTest.scala:67) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57) at
[jira] [Commented] (KAFKA-4476) Kafka Streams gets stuck if metadata is missing
[ https://issues.apache.org/jira/browse/KAFKA-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949128#comment-15949128 ] Magnus Edenhill commented on KAFKA-4476: Directed here from KAFKA-4482. Happened again on trunk PR: https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/2536/testReport/junit/org.apache.kafka.streams.integration/ResetIntegrationTest/testReprocessingFromScratchAfterResetWithoutIntermediateUserTopic/ > Kafka Streams gets stuck if metadata is missing > --- > > Key: KAFKA-4476 > URL: https://issues.apache.org/jira/browse/KAFKA-4476 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Critical > Fix For: 0.10.2.0 > > > When a Kafka Streams application gets started for the first time, it can > happen that some topic metadata is missing when > {{StreamPartitionAssigner#assign()}} is called on the group leader instance. > This can result in an infinite loop within > {{StreamPartitionAssigner#assign()}}. This issue was detected by > {{ResetIntegrationTest}} that does have a transient timeout failure (c.f. > https://issues.apache.org/jira/browse/KAFKA-4058 -- this issue was re-opened > multiple times as the problem was expected to be in the test -- however, that > is not the case). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-4842) Streams integration tests occasionally fail with connection error
[ https://issues.apache.org/jira/browse/KAFKA-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948898#comment-15948898 ] Magnus Edenhill commented on KAFKA-4842: Happened again on a trunk PR, https://github.com/apache/kafka/pull/2765#issuecomment-290381971 : https://builds.apache.org/job/kafka-pr-jdk7-scala2.10/2525/ {noformat} org.apache.kafka.streams.integration.QueryableStateIntegrationTest > queryOnRebalance FAILED java.lang.AssertionError: Condition not met within timeout 3. waiting for metadata, store and value to be non null at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:257) at org.apache.kafka.streams.integration.QueryableStateIntegrationTest.verifyAllKVKeys(QueryableStateIntegrationTest.java:262) at org.apache.kafka.streams.integration.QueryableStateIntegrationTest.queryOnRebalance(QueryableStateIntegrationTest.java:344) {noformat} > Streams integration tests occasionally fail with connection error > - > > Key: KAFKA-4842 > URL: https://issues.apache.org/jira/browse/KAFKA-4842 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.10.2.0 >Reporter: Eno Thereska > Fix For: 0.11.0.0 > > > org.apache.kafka.streams.integration.QueryableStateIntegrationTest > > queryOnRebalance[0] FAILED java.lang.IllegalStateException: No entry found > for connection 0. In ClusterConnectionStates.java: node state. This happens > locally. Happened to KStreamAggregationIntegrationTest.java.shouldReduce() > once too. > Not clear this is streams related. Also, it's hard to reproduce since it > doesn't happen all the time. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (KAFKA-4974) System test failure in 0.8.2.2 upgrade tests
[ https://issues.apache.org/jira/browse/KAFKA-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Magnus Edenhill updated KAFKA-4974: --- Description: The 0.10.2 system test failed in one of the upgrade tests from 0.8.2.2: http://testing.confluent.io/confluent-kafka-0-10-2-system-test-results/?prefix=2017-03-21--001.1490092219--apache--0.10.2--4a019bd/TestUpgrade/test_upgrade/from_kafka_version=0.8.2.2.to_message_format_version=None.compression_types=.snappy.new_consumer=False/ {noformat} [INFO - 2017-03-21 07:35:48,802 - runner_client - log - lineno:221]: RunnerClient: kafkatest.tests.core.upgrade_test.TestUpgrade.test_upgrade.from_kafka_version=0.8.2.2.to_message_format_version=None.compression_types=.snappy.new_consumer=False: FAIL: Kafka server didn't finish startup Traceback (most recent call last): File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 123, in run data = self.run_test() File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 176, in run_test return self.test_context.function(self.test) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py", line 321, in wrapper return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/core/upgrade_test.py", line 125, in test_upgrade self.run_produce_consume_validate(core_test_action=lambda: self.perform_upgrade(from_kafka_version, File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 114, in run_produce_consume_validate core_test_action(*args) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/core/upgrade_test.py", line 126, in to_message_format_version)) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/core/upgrade_test.py", line 52, in perform_upgrade self.kafka.start_node(node) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/services/kafka/kafka.py", line 222, in start_node monitor.wait_until("Kafka Server.*started", timeout_sec=30, backoff_sec=.25, err_msg="Kafka server didn't finish startup") File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/cluster/remoteaccount.py", line 642, in wait_until return wait_until(lambda: self.acct.ssh("tail -c +%d %s | grep '%s'" % (self.offset+1, self.log, pattern), allow_fail=True) == 0, **kwargs) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/utils/util.py", line 36, in wait_until {noformat} Logs: {noformat} ==> ./KafkaService-0-140646398705744/worker9/server-start-stdout-stderr.log <== [2017-03-21 07:35:18,250] DEBUG Leaving process event (org.I0Itec.zkclient.ZkClient) [2017-03-21 07:35:18,250] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn) [2017-03-21 07:35:18,250] INFO [Kafka Server 2], shut down completed (kafka.server.KafkaServer) Error: Exception thrown by the agent : java.rmi.server.ExportException: Port already in use: 9192; nested exception is: java.net.BindException: Address already in use {noformat} 
That's from starting the upgraded broker, which seems to indicate that 0.8.2.2 was not properly shut down or has its RMI port in the close-wait state. Since there probably isn't much to do about 0.8.2.2, the test should probably be hardened to either select a random port or wait for the lingering port to become available (netstat can be used for that). This earlier failure from the same 0.8.2.2 invocation might be of interest: {noformat} [2017-03-21 07:35:18,233] DEBUG Writing clean shutdown marker at /mnt/kafka-data-logs (kafka.log.LogManager) [2017-03-21 07:35:18,235] INFO Shutdown complete. (kafka.log.LogManager) [2017-03-21 07:35:18,238] DEBUG Shutting down task scheduler. (kafka.utils.KafkaScheduler) [2017-03-21 07:35:18,243] WARN sleep interrupted (kafka.utils.Utils$) java.lang.InterruptedException: sleep interrupted at java.lang.Thread.sleep(Native Method) at kafka.controller.RequestSendThread$$anonfun$liftedTree1$1$1.apply$mcV$sp(ControllerChannelManager.scala:144) at kafka.utils.Utils$.swallow(Utils.scala:172) at kafka.utils.Logging$class.swallowWarn(Logging.scala:92) at kafka.utils.Utils$.swallowWarn(Utils.scala:45) at kafka.utils.Logging$class.swallow(Logging.scala:94) at kafka.utils.Utils$.swallow(Utils.scala:45)
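A sketch of the "wait for the lingering port" hardening, written in Java for illustration only (the real system tests are ducktape/Python): repeatedly try to bind until the port frees up or a deadline passes:
{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class PortWait {
    // Returns true once the port can be bound locally, false if it is still
    // occupied (e.g. TIME_WAIT/CLOSE_WAIT) when the timeout expires.
    static boolean waitForPortFree(int port, long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            try (ServerSocket socket = new ServerSocket()) {
                socket.setReuseAddress(false);
                socket.bind(new InetSocketAddress("localhost", port));
                return true;            // bind succeeded, the port is free
            } catch (IOException stillInUse) {
                Thread.sleep(250);      // port still held, retry until the deadline
            }
        }
        return false;
    }
}
{code}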
[jira] [Created] (KAFKA-4974) System test failure in 0.8.2.2 upgrade tests
Magnus Edenhill created KAFKA-4974: -- Summary: System test failure in 0.8.2.2 upgrade tests Key: KAFKA-4974 URL: https://issues.apache.org/jira/browse/KAFKA-4974 Project: Kafka Issue Type: Bug Components: system tests Reporter: Magnus Edenhill The 0.10.2 system test failed in one of the upgrade tests from 0.8.2.2: http://testing.confluent.io/confluent-kafka-0-10-2-system-test-results/?prefix=2017-03-21--001.1490092219--apache--0.10.2--4a019bd/TestUpgrade/test_upgrade/from_kafka_version=0.8.2.2.to_message_format_version=None.compression_types=.snappy.new_consumer=False/ {{ [INFO - 2017-03-21 07:35:48,802 - runner_client - log - lineno:221]: RunnerClient: kafkatest.tests.core.upgrade_test.TestUpgrade.test_upgrade.from_kafka_version=0.8.2.2.to_message_format_version=None.compression_types=.snappy.new_consumer=False: FAIL: Kafka server didn't finish startup Traceback (most recent call last): File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 123, in run data = self.run_test() File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 176, in run_test return self.test_context.function(self.test) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py", line 321, in wrapper return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/core/upgrade_test.py", line 125, in test_upgrade self.run_produce_consume_validate(core_test_action=lambda: self.perform_upgrade(from_kafka_version, File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 114, in run_produce_consume_validate core_test_action(*args) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/core/upgrade_test.py", line 126, in to_message_format_version)) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/core/upgrade_test.py", line 52, in perform_upgrade self.kafka.start_node(node) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/services/kafka/kafka.py", line 222, in start_node monitor.wait_until("Kafka Server.*started", timeout_sec=30, backoff_sec=.25, err_msg="Kafka server didn't finish startup") File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/cluster/remoteaccount.py", line 642, in wait_until return wait_until(lambda: self.acct.ssh("tail -c +%d %s | grep '%s'" % (self.offset+1, self.log, pattern), allow_fail=True) == 0, **kwargs) File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/utils/util.py", line 36, in wait_until }} Logs: {{ ==> ./KafkaService-0-140646398705744/worker9/server-start-stdout-stderr.log <== [2017-03-21 07:35:18,250] DEBUG Leaving process event (org.I0Itec.zkclient.ZkClient) [2017-03-21 07:35:18,250] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn) [2017-03-21 07:35:18,250] INFO [Kafka Server 2], shut down completed (kafka.server.KafkaServer) Error: Exception thrown by the agent : java.rmi.server.ExportException: Port already in use: 9192; nested exception is: 
java.net.BindException: Address already in use }} That's from starting the upgraded broker, which seems to indicate that 0.8.2.2 was not properly shut down or has its RMI port in the close-wait state. Since there probably isn't much to do about 0.8.2.2, the test should probably be hardened to either select a random port or wait for the lingering port to become available (netstat can be used for that). This earlier failure from the same 0.8.2.2 invocation might be of interest: {{ [2017-03-21 07:35:18,233] DEBUG Writing clean shutdown marker at /mnt/kafka-data-logs (kafka.log.LogManager) [2017-03-21 07:35:18,235] INFO Shutdown complete. (kafka.log.LogManager) [2017-03-21 07:35:18,238] DEBUG Shutting down task scheduler. (kafka.utils.KafkaScheduler) [2017-03-21 07:35:18,243] WARN sleep interrupted (kafka.utils.Utils$) java.lang.InterruptedException: sleep interrupted at java.lang.Thread.sleep(Native Method) at kafka.controller.RequestSendThread$$anonfun$liftedTree1$1$1.apply$mcV$sp(ControllerChannelManager.scala:144) at kafka.utils.Utils$.swallow(Utils.scala:172) at kafka.utils.Logging$class.swallowWarn(Logging.scala:92) at
[jira] [Commented] (KAFKA-1588) Offset response does not support two requests for the same topic/partition combo
[ https://issues.apache.org/jira/browse/KAFKA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415000#comment-15415000 ] Magnus Edenhill commented on KAFKA-1588: I see now that this behaviour is in fact documented in the protocol spec, sorry about that. I rest my case. > Offset response does not support two requests for the same topic/partition > combo > > > Key: KAFKA-1588 > URL: https://issues.apache.org/jira/browse/KAFKA-1588 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8.2.0 >Reporter: Todd Palino >Assignee: Sriharsha Chintalapani > Labels: newbie > > When performing an OffsetRequest, if you request the same topic and partition > combination in a single request more than once (for example, if you want to > get both the head and tail offsets for a partition in the same request), you > will get a response for both, but they will be the same offset. > We identified that the problem is that when the offset response is assembled, > a map is used to store the offset info before it is converted to the response > format and sent to the client. Therefore, the second request for a > topic/partition combination will overwrite the offset from the first request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1588) Offset response does not support two requests for the same topic/partition combo
[ https://issues.apache.org/jira/browse/KAFKA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414995#comment-15414995 ] Magnus Edenhill commented on KAFKA-1588: Thanks [~ijuma]. But I don't see how the bug fix changes the protocol (the behaviour is undocumented and non-intuitive), or breaks any clients (how can a client rely on or make use of undefined ordering, silent ignoring of duplicate topics, etc.?). For the greater good of the community and ecosystem I feel it is important that the protocol specification is considered authoritative rather than the broker implementation, and any discrepancies between the two should be fixed in the corresponding implementation(s) rather than in the protocol spec, unless the protocol spec is clearly wrong (which would affect all protocol implementations (clients)). > Offset response does not support two requests for the same topic/partition > combo > > > Key: KAFKA-1588 > URL: https://issues.apache.org/jira/browse/KAFKA-1588 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8.2.0 >Reporter: Todd Palino >Assignee: Sriharsha Chintalapani > Labels: newbie > > When performing an OffsetRequest, if you request the same topic and partition > combination in a single request more than once (for example, if you want to > get both the head and tail offsets for a partition in the same request), you > will get a response for both, but they will be the same offset. > We identified that the problem is that when the offset response is assembled, > a map is used to store the offset info before it is converted to the response > format and sent to the client. Therefore, the second request for a > topic/partition combination will overwrite the offset from the first request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1588) Offset response does not support two requests for the same topic/partition combo
[ https://issues.apache.org/jira/browse/KAFKA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414868#comment-15414868 ] Magnus Edenhill commented on KAFKA-1588: [~guozhang] Why would this fix require a protocol change? It is already an array of Topics+partitions in the OffsetRequest and all the broker needs to do is honour the contents and order of that list and respond accordingly, which should be a broker implementation detail only and not affect the protocol definition, right? > Offset response does not support two requests for the same topic/partition > combo > > > Key: KAFKA-1588 > URL: https://issues.apache.org/jira/browse/KAFKA-1588 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.8.2.0 >Reporter: Todd Palino >Assignee: Sriharsha Chintalapani > Labels: newbie > > When performing an OffsetRequest, if you request the same topic and partition > combination in a single request more than once (for example, if you want to > get both the head and tail offsets for a partition in the same request), you > will get a response for both, but they will be the same offset. > We identified that the problem is that when the offset response is assembled, > a map is used to store the offset info before it is converted to the response > format and sent to the client. Therefore, the second request for a > topic/partition combination will overwrite the offset from the first request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
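A toy Java sketch of the behaviour described in the issue (a simplified model, not the broker's actual response-building code): when the response is keyed by topic-partition in a map, a duplicate entry in the request overwrites the first, so both answers come back identical:
{code:java}
import java.util.HashMap;
import java.util.Map;

public class DuplicatePartitionDemo {
    public static void main(String[] args) {
        // A request asking for both the earliest and latest offset of foo-0
        // collapses to one map entry, so the earlier answer is lost.
        Map<String, Long> offsetsByPartition = new HashMap<>();
        offsetsByPartition.put("foo-0", 0L);      // first occurrence: head offset
        offsetsByPartition.put("foo-0", 1234L);   // second occurrence overwrites: tail offset
        System.out.println(offsetsByPartition);   // {foo-0=1234}: only one value survives
    }
}
{code}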
[jira] [Created] (KAFKA-3743) kafka-server-start.sh: Unhelpful error message
Magnus Edenhill created KAFKA-3743: -- Summary: kafka-server-start.sh: Unhelpful error message Key: KAFKA-3743 URL: https://issues.apache.org/jira/browse/KAFKA-3743 Project: Kafka Issue Type: Bug Components: tools Affects Versions: 0.10.0.0 Reporter: Magnus Edenhill Priority: Minor When trying to start Kafka from an uncompiled source tarball rather than the binary, the kafka-server-start.sh command gives a mystical error message: ``` $ bin/kafka-server-start.sh config/server.properties Error: Could not find or load main class config.server.properties ``` This could probably be improved to say something closer to the truth. This is on the 0.10.0.0-rc6 tarball from GitHub. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-3547) Broker does not disconnect client on unknown request
[ https://issues.apache.org/jira/browse/KAFKA-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236783#comment-15236783 ] Magnus Edenhill commented on KAFKA-3547: Fixed in https://github.com/apache/kafka/commit/bb643f83a95e9ebb3a9f8331fe9998c8722c07d4#diff-d0332a0ff31df50afce3809d90505b25R80 > Broker does not disconnect client on unknown request > > > Key: KAFKA-3547 > URL: https://issues.apache.org/jira/browse/KAFKA-3547 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.9.0.0, 0.9.0.1 >Reporter: Magnus Edenhill >Priority: Critical > Fix For: 0.10.0.0 > > > A regression in 0.9.0 causes the broker to not close a client connection when > receiving an unsupported request. > Two effects of this are: > - the client is not informed that the request was not supported (even though > a closed connection is a blunt indication it is infact some indication), the > request will (hopefully) just time out on the client after some time, > stalling sub-sequent operations. > - the broker leaks the connection until the connection reaper brings it down > or the client closes the connection -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-3547) Broker does not disconnect client on unknown request
Magnus Edenhill created KAFKA-3547: -- Summary: Broker does not disconnect client on unknown request Key: KAFKA-3547 URL: https://issues.apache.org/jira/browse/KAFKA-3547 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.9.0.1, 0.9.0.0 Reporter: Magnus Edenhill Priority: Critical Fix For: 0.10.0.0 A regression in 0.9.0 causes the broker to not close a client connection when receiving an unsupported request. Two effects of this are: - the client is not informed that the request was not supported (even though a closed connection is a blunt indication it is infact some indication), the request will (hopefully) just time out on the client after some time, stalling sub-sequent operations. - the broker leaks the connection until the connection reaper brings it down or the client closes the connection -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-3160) Kafka LZ4 framing code miscalculates header checksum
[ https://issues.apache.org/jira/browse/KAFKA-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15234610#comment-15234610 ] Magnus Edenhill commented on KAFKA-3160: [~ijuma] It needs to go through slow-path to "fix-down" (i.e., break the checksum) for older clients connecting to a newer broker. But yeah, the plan is to bundle this functionality with Message format 1, so that Message format v0 attribute flag LZ4 has a broken checksum, while Message format 1 with attribute flag LZ4 has a correct checksum. > Kafka LZ4 framing code miscalculates header checksum > > > Key: KAFKA-3160 > URL: https://issues.apache.org/jira/browse/KAFKA-3160 > Project: Kafka > Issue Type: Bug > Components: compression >Affects Versions: 0.8.2.0, 0.8.2.1, 0.9.0.0, 0.8.2.2, 0.9.0.1 >Reporter: Dana Powers >Assignee: Magnus Edenhill > Labels: compatibility, compression, lz4 > > KAFKA-1493 partially implements the LZ4 framing specification, but it > incorrectly calculates the header checksum. This causes > KafkaLZ4BlockInputStream to raise an error > [IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed > LZ4 data. It also causes KafkaLZ4BlockOutputStream to generate incorrectly > framed LZ4 data, which means clients decoding LZ4 messages from kafka will > always receive incorrectly framed data. > Specifically, the current implementation includes the 4-byte MagicNumber in > the checksum, which is incorrect. > http://cyan4973.github.io/lz4/lz4_Frame_format.html > Third-party clients that attempt to use off-the-shelf lz4 framing find that > brokers reject messages as having a corrupt checksum. So currently non-java > clients must 'fixup' lz4 packets to deal with the broken checksum. > Magnus first identified this issue in librdkafka; kafka-python has the same > problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
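A sketch of the framing detail under discussion, using lz4-java's XXHash32 with seed 0 (not the actual KafkaLZ4BlockOutputStream code): per the LZ4 frame spec, the header checksum byte is the second byte of XXH32 over the frame descriptor only, excluding the 4-byte magic number; hashing from offset 0 reproduces the broken checksum described above:
{code:java}
import net.jpountz.xxhash.XXHashFactory;

public class Lz4HeaderChecksum {
    // HC = second byte of XXH32(frame descriptor, seed 0); the descriptor
    // covers FLG, BD and the optional content-size field, not the magic number.
    static int headerChecksum(byte[] frame, int descriptorOffset, int descriptorLength) {
        int hash = XXHashFactory.fastestInstance().hash32()
                .hash(frame, descriptorOffset, descriptorLength, 0);
        return (hash >> 8) & 0xFF;
    }

    public static void main(String[] args) {
        // Minimal header: magic number (4 bytes, little-endian 0x184D2204) + FLG + BD;
        // the descriptor therefore starts at offset 4 and is 2 bytes long here.
        byte[] header = new byte[] {0x04, 0x22, 0x4D, 0x18, 0x60, 0x70};
        System.out.printf("correct HC = 0x%02X%n", headerChecksum(header, 4, 2));
        // The bug described in the issue hashes from offset 0, including the magic number:
        System.out.printf("broken  HC = 0x%02X%n", headerChecksum(header, 0, 6));
    }
}
{code}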
[jira] [Commented] (KAFKA-3160) Kafka LZ4 framing code miscalculates header checksum
[ https://issues.apache.org/jira/browse/KAFKA-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15234245#comment-15234245 ] Magnus Edenhill commented on KAFKA-3160: [~dana.powers] My broker patch adds a new Attribute bit to specify the fixed LZ4F framing but still relies on clients being backwards compatible with the broken framing format, but that was before KIP-31 was a thing. Your proposed solution of reusing the KIP-31 behaviour is much better; I'd definitely like to see this in broker 0.10. This will also formally add LZ4 support to the protocol (it is not even mentioned in the Kafka protocol docs) and be compatible with KIP-35 (e.g., ProduceRequest >= v2 supports LZ4). I'm a strong +1 on this. > Kafka LZ4 framing code miscalculates header checksum > > > Key: KAFKA-3160 > URL: https://issues.apache.org/jira/browse/KAFKA-3160 > Project: Kafka > Issue Type: Bug > Components: compression >Affects Versions: 0.8.2.0, 0.8.2.1, 0.9.0.0, 0.8.2.2, 0.9.0.1 >Reporter: Dana Powers >Assignee: Magnus Edenhill > Labels: compatibility, compression, lz4 > > KAFKA-1493 partially implements the LZ4 framing specification, but it > incorrectly calculates the header checksum. This causes > KafkaLZ4BlockInputStream to raise an error > [IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed > LZ4 data. It also causes KafkaLZ4BlockOutputStream to generate incorrectly > framed LZ4 data, which means clients decoding LZ4 messages from kafka will > always receive incorrectly framed data. > Specifically, the current implementation includes the 4-byte MagicNumber in > the checksum, which is incorrect. > http://cyan4973.github.io/lz4/lz4_Frame_format.html > Third-party clients that attempt to use off-the-shelf lz4 framing find that > brokers reject messages as having a corrupt checksum. So currently non-java > clients must 'fixup' lz4 packets to deal with the broken checksum. > Magnus first identified this issue in librdkafka; kafka-python has the same > problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-3442) FetchResponse size exceeds max.partition.fetch.bytes
[ https://issues.apache.org/jira/browse/KAFKA-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206838#comment-15206838 ] Magnus Edenhill commented on KAFKA-3442: librdkafka silently ignores partial messages (which are most often seen at the end of a messageset), so it should be improved for this case where it gets stuck on a single message. > FetchResponse size exceeds max.partition.fetch.bytes > > > Key: KAFKA-3442 > URL: https://issues.apache.org/jira/browse/KAFKA-3442 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.0.0 >Reporter: Dana Powers >Assignee: Jiangjie Qin >Priority: Blocker > Fix For: 0.10.0.0 > > > Produce 1 byte message to topic foobar > Fetch foobar w/ max.partition.fetch.bytes=1024 > Test expects to receive a truncated message (~1024 bytes). 0.8 and 0.9 pass > this test, but 0.10 FetchResponse has full message, exceeding the max > specified in the FetchRequest. > I tested with v0 and v1 apis, both fail. Have not tested w/ v2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-3394) Broker fails to parse Null Metadata in OffsetCommit requests
Magnus Edenhill created KAFKA-3394: -- Summary: Broker fails to parse Null Metadata in OffsetCommit requests Key: KAFKA-3394 URL: https://issues.apache.org/jira/browse/KAFKA-3394 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.10.0.0 Reporter: Magnus Edenhill librdkafka sends a Null Metadata string (size -1) in its OffsetCommitRequests when there is no metadata, this unfortunately leads to an exception on the broker that expects a non-null string. {noformat} [2016-03-11 11:11:57,623] ERROR Closing socket for 10.191.0.33:9092-10.191.0.33:56503 because of error (kafka.network.Processor) kafka.network.InvalidRequestException: Error getting request for apiKey: 8 and apiVersion: 1 at kafka.network.RequestChannel$Request.liftedTree2$1(RequestChannel.scala:91) at kafka.network.RequestChannel$Request.(RequestChannel.scala:88) at kafka.network.Processor$$anonfun$run$11.apply(SocketServer.scala:426) at kafka.network.Processor$$anonfun$run$11.apply(SocketServer.scala:421) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at kafka.network.Processor.run(SocketServer.scala:421) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'topics': Error reading field 'partitions': Error reading field 'metadata': java.lang.NegativeArraySizeException at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:73) at org.apache.kafka.common.requests.OffsetCommitRequest.parse(OffsetCommitRequest.java:260) at org.apache.kafka.common.requests.AbstractRequest.getRequest(AbstractRequest.java:50) at kafka.network.RequestChannel$Request.liftedTree2$1(RequestChannel.scala:88) ... 9 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
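The fix on the parsing side is small: the Kafka protocol's nullable STRING is an int16 length prefix where -1 denotes null, so the deserializer must check for -1 before allocating. A minimal sketch (illustrative, not the broker's actual code):
{code:java}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class NullableString {
    static String read(ByteBuffer buf) {
        short len = buf.getShort();
        if (len < 0) {
            return null;                 // size -1 => null metadata, not an error
        }
        byte[] bytes = new byte[len];    // the pre-fix code hit NegativeArraySizeException here
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
{code}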
[jira] [Updated] (KAFKA-3219) Long topic names mess up broker topic state
[ https://issues.apache.org/jira/browse/KAFKA-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Magnus Edenhill updated KAFKA-3219: --- Description: Seems like the broker doesn't like topic names of 254 chars or more when creating using kafka-topics.sh --create. The problem does not seem to arise when topic is created through automatic topic creation. How to reproduce: {code} TOPIC=$(printf 'd%.0s' {1..254} ) ; bin/kafka-topics.sh --zookeeper 0 --create --topic $TOPIC --partitions 1 --replication-factor 1 {code} {code} [2016-02-06 22:00:01,943] INFO [ReplicaFetcherManager on broker 3] Removed fetcher for partitions [dd,0] (kafka.server.ReplicaFetcherManager) [2016-02-06 22:00:01,944] ERROR [KafkaApi-3] Error when handling request {controller_id=3,controller_epoch=12,partition_states=[{topic=dd,partition=0,controller_epoch=12,leader=3,leader_epoch=0,isr=[3],zk_version=0,replicas=[3]}],live_leaders=[{id=3,host=eden,port=9093}]} (kafka.server.KafkaApis) java.lang.NullPointerException at scala.collection.mutable.ArrayOps$ofRef$.length$extension(ArrayOps.scala:114) at scala.collection.mutable.ArrayOps$ofRef.length(ArrayOps.scala:114) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:32) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) at kafka.log.Log.loadSegments(Log.scala:138) at kafka.log.Log.(Log.scala:92) at kafka.log.LogManager.createLog(LogManager.scala:357) at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:96) at kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.apply(Partition.scala:176) at kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.apply(Partition.scala:176) at scala.collection.mutable.HashSet.foreach(HashSet.scala:79) at kafka.cluster.Partition$$anonfun$4.apply(Partition.scala:176) at kafka.cluster.Partition$$anonfun$4.apply(Partition.scala:170) at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:259) at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:267) at kafka.cluster.Partition.makeLeader(Partition.scala:170) at kafka.server.ReplicaManager$$anonfun$makeLeaders$4.apply(ReplicaManager.scala:696) at kafka.server.ReplicaManager$$anonfun$makeLeaders$4.apply(ReplicaManager.scala:695) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at kafka.server.ReplicaManager.makeLeaders(ReplicaManager.scala:695) at kafka.server.ReplicaManager.becomeLeaderOrFollower(ReplicaManager.scala:641) at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:142) at kafka.server.KafkaApis.handle(KafkaApis.scala:79) at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60) at java.lang.Thread.run(Thread.java:745) {code} was: Seems like the broker doesn't like topic names of 254 chars or more when creating using kafka-topics.sh --create. The problem does not seem to arise when topic is created through automatic topic creation. 
How to reproduce: TOPIC=$(printf 'd%.0s' {1..254} ) ; bin/kafka-topics.sh --zookeeper 0 --create --topic $TOPIC --partitions 1 --replication-factor 1 [2016-02-06 22:00:01,943] INFO [ReplicaFetcherManager on broker 3] Removed fetcher for partitions [dd,0] (kafka.server.ReplicaFetcherManager) [2016-02-06 22:00:01,944] ERROR [KafkaApi-3] Error when handling request
[jira] [Created] (KAFKA-3219) Long topic names mess up broker topic state
Magnus Edenhill created KAFKA-3219: -- Summary: Long topic names mess up broker topic state Key: KAFKA-3219 URL: https://issues.apache.org/jira/browse/KAFKA-3219 Project: Kafka Issue Type: Bug Affects Versions: 0.9.0.0 Reporter: Magnus Edenhill Seems like the broker doesn't like topic names of 254 chars or more when creating using kafka-topics.sh --create. The problem does not seem to arise when topic is created through automatic topic creation. How to reproduce: TOPIC=$(printf 'd%.0s' {1..254} ) ; bin/kafka-topics.sh --zookeeper 0 --create --topic $TOPIC --partitions 1 --replication-factor 1 [2016-02-06 22:00:01,943] INFO [ReplicaFetcherManager on broker 3] Removed fetcher for partitions [dd,0] (kafka.server.ReplicaFetcherManager) [2016-02-06 22:00:01,944] ERROR [KafkaApi-3] Error when handling request {controller_id=3,controller_epoch=12,partition_states=[{topic=dd,partition=0,controller_epoch=12,leader=3,leader_epoch=0,isr=[3],zk_version=0,replicas=[3]}],live_leaders=[{id=3,host=eden,port=9093}]} (kafka.server.KafkaApis) java.lang.NullPointerException at scala.collection.mutable.ArrayOps$ofRef$.length$extension(ArrayOps.scala:114) at scala.collection.mutable.ArrayOps$ofRef.length(ArrayOps.scala:114) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:32) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) at kafka.log.Log.loadSegments(Log.scala:138) at kafka.log.Log.(Log.scala:92) at kafka.log.LogManager.createLog(LogManager.scala:357) at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:96) at kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.apply(Partition.scala:176) at kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.apply(Partition.scala:176) at scala.collection.mutable.HashSet.foreach(HashSet.scala:79) at kafka.cluster.Partition$$anonfun$4.apply(Partition.scala:176) at kafka.cluster.Partition$$anonfun$4.apply(Partition.scala:170) at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:259) at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:267) at kafka.cluster.Partition.makeLeader(Partition.scala:170) at kafka.server.ReplicaManager$$anonfun$makeLeaders$4.apply(ReplicaManager.scala:696) at kafka.server.ReplicaManager$$anonfun$makeLeaders$4.apply(ReplicaManager.scala:695) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) at kafka.server.ReplicaManager.makeLeaders(ReplicaManager.scala:695) at kafka.server.ReplicaManager.becomeLeaderOrFollower(ReplicaManager.scala:641) at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:142) at kafka.server.KafkaApis.handle(KafkaApis.scala:79) at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
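The underlying limit is the filesystem's, not Kafka's: partition log directories are named "<topic>-<partition>" and most filesystems cap a single path component at 255 bytes, so names around 254 characters break log creation. A hedged sketch of the kind of pre-flight check a tool could run before --create (later Kafka versions reject names longer than 249 characters; that constant is used here only for illustration):
{code:java}
final class TopicNameGuard {
    private static final int MAX_TOPIC_NAME_LENGTH = 249;

    static void validate(String topic) {
        if (topic.isEmpty() || topic.length() > MAX_TOPIC_NAME_LENGTH) {
            throw new IllegalArgumentException(
                "Invalid topic name length " + topic.length()
                + " (must be 1.." + MAX_TOPIC_NAME_LENGTH
                + " so that \"<topic>-<partition>\" fits in a filename)");
        }
    }
}
{code}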
[jira] [Assigned] (KAFKA-3160) Kafka LZ4 framing code miscalculates header checksum
[ https://issues.apache.org/jira/browse/KAFKA-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Magnus Edenhill reassigned KAFKA-3160: -- Assignee: Magnus Edenhill > Kafka LZ4 framing code miscalculates header checksum > > > Key: KAFKA-3160 > URL: https://issues.apache.org/jira/browse/KAFKA-3160 > Project: Kafka > Issue Type: Bug > Components: compression >Affects Versions: 0.8.2.0, 0.8.2.1, 0.9.0.0, 0.8.2.2 >Reporter: Dana Powers >Assignee: Magnus Edenhill > > KAFKA-1493 implements the LZ4 framing specification, but it incorrectly > calculates the header checksum. Specifically, the current implementation > includes the 4-byte MagicNumber in the checksum, which is incorrect. > http://cyan4973.github.io/lz4/lz4_Frame_format.html > Third-party clients that attempt to use off-the-shelf lz4 framing find that > brokers reject messages as having a corrupt checksum. So currently non-java > clients must 'fixup' lz4 packets to deal with the broken checksum. > Magnus first identified this issue in librdkafka; kafka-python has the same > problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1493) Use a well-documented LZ4 compression format and remove redundant LZ4HC option
[ https://issues.apache.org/jira/browse/KAFKA-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15119920#comment-15119920 ] Magnus Edenhill commented on KAFKA-1493: [~dana.powers] I can confirm this is the case, as you describe it. I suggest creating a new issue for this. I have a patch that adds a new compression.type=lz4f with proper framing. > Use a well-documented LZ4 compression format and remove redundant LZ4HC option > -- > > Key: KAFKA-1493 > URL: https://issues.apache.org/jira/browse/KAFKA-1493 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.8.2.0 >Reporter: James Oliver >Assignee: James Oliver >Priority: Blocker > Fix For: 0.8.2.0 > > Attachments: KAFKA-1493.patch, KAFKA-1493.patch, > KAFKA-1493_2014-10-16_13:49:34.patch, KAFKA-1493_2014-10-16_21:25:23.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-2695) Handle null string/bytes protocol primitives
[ https://issues.apache.org/jira/browse/KAFKA-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073886#comment-15073886 ] Magnus Edenhill commented on KAFKA-2695: This blocks kafka-consumer-groups.sh from describing librdkafka-based consumer groups, see https://github.com/edenhill/librdkafka/issues/479 > Handle null string/bytes protocol primitives > > > Key: KAFKA-2695 > URL: https://issues.apache.org/jira/browse/KAFKA-2695 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson > > The kafka protocol supports null bytes and string primitives by passing -1 as > the size, but the current deserializers implemented in > o.a.k.common.protocol.types.Type do not handle them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1367) Broker topic metadata not kept in sync with ZooKeeper
[ https://issues.apache.org/jira/browse/KAFKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160927#comment-14160927 ] Magnus Edenhill commented on KAFKA-1367: May I suggest not changing the protocol but only sending an empty ISR vector in the MetadataResponse? Broker topic metadata not kept in sync with ZooKeeper - Key: KAFKA-1367 URL: https://issues.apache.org/jira/browse/KAFKA-1367 Project: Kafka Issue Type: Bug Affects Versions: 0.8.0, 0.8.1 Reporter: Ryan Berdeen Labels: newbie++ Attachments: KAFKA-1367.txt When a broker is restarted, the topic metadata responses from the brokers will be incorrect (different from ZooKeeper) until a preferred replica leader election. In the metadata, it looks like leaders are correctly removed from the ISR when a broker disappears, but followers are not. Then, when a broker reappears, the ISR is never updated. I used a variation of the Vagrant setup created by Joe Stein to reproduce this with latest from the 0.8.1 branch: https://github.com/also/kafka/commit/dba36a503a5e22ea039df0f9852560b4fb1e067c -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1367) Broker topic metadata not kept in sync with ZooKeeper
[ https://issues.apache.org/jira/browse/KAFKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14148054#comment-14148054 ] Magnus Edenhill commented on KAFKA-1367: [~guozhang] Yes, KAFKA-1555 looks like a good match. Broker topic metadata not kept in sync with ZooKeeper - Key: KAFKA-1367 URL: https://issues.apache.org/jira/browse/KAFKA-1367 Project: Kafka Issue Type: Bug Affects Versions: 0.8.0, 0.8.1 Reporter: Ryan Berdeen Labels: newbie++ Attachments: KAFKA-1367.txt When a broker is restarted, the topic metadata responses from the brokers will be incorrect (different from ZooKeeper) until a preferred replica leader election. In the metadata, it looks like leaders are correctly removed from the ISR when a broker disappears, but followers are not. Then, when a broker reappears, the ISR is never updated. I used a variation of the Vagrant setup created by Joe Stein to reproduce this with latest from the 0.8.1 branch: https://github.com/also/kafka/commit/dba36a503a5e22ea039df0f9852560b4fb1e067c -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1367) Broker topic metadata not kept in sync with ZooKeeper
[ https://issues.apache.org/jira/browse/KAFKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143218#comment-14143218 ] Magnus Edenhill commented on KAFKA-1367: Apart from supplying the ISR count and list in its metadata API, librdkafka also provides an `enforce.isr.cnt` configuration property that fails produce requests locally before transmission if the currently known ISR count is smaller than the configured value. This is a workaround for the broker not fully honoring `request.required.acks`, i.e., if `request.required.acks=3` and only one broker is available the produce request will not fail. More info in the original issue here: https://github.com/edenhill/librdkafka/issues/91 Generally I would assume that information provided by the broker is correct, otherwise it should not be included at all since it can't be used (reliably). Broker topic metadata not kept in sync with ZooKeeper - Key: KAFKA-1367 URL: https://issues.apache.org/jira/browse/KAFKA-1367 Project: Kafka Issue Type: Bug Affects Versions: 0.8.0, 0.8.1 Reporter: Ryan Berdeen Labels: newbie++ Attachments: KAFKA-1367.txt When a broker is restarted, the topic metadata responses from the brokers will be incorrect (different from ZooKeeper) until a preferred replica leader election. In the metadata, it looks like leaders are correctly removed from the ISR when a broker disappears, but followers are not. Then, when a broker reappears, the ISR is never updated. I used a variation of the Vagrant setup created by Joe Stein to reproduce this with latest from the 0.8.1 branch: https://github.com/also/kafka/commit/dba36a503a5e22ea039df0f9852560b4fb1e067c -- This message was sent by Atlassian JIRA (v6.3.4#6332)
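A minimal sketch of the client-side guard described above (written in Java here; librdkafka implements it in C, and the names below are illustrative rather than its actual API):
{code:java}
final class IsrGuard {
    // Fail a produce locally, before transmission, when the last known ISR for
    // the target partition is smaller than the enforced count (enforce.isr.cnt).
    static void checkBeforeProduce(int knownIsrCount, int enforcedIsrCnt) {
        if (enforcedIsrCnt > 0 && knownIsrCount < enforcedIsrCnt) {
            throw new IllegalStateException(
                "Refusing to produce: known ISR count " + knownIsrCount
                + " is below enforce.isr.cnt=" + enforcedIsrCnt);
        }
    }
}
{code}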
[jira] [Commented] (KAFKA-955) After a leader change, messages sent with ack=0 are lost
[ https://issues.apache.org/jira/browse/KAFKA-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746810#comment-13746810 ] Magnus Edenhill commented on KAFKA-955: --- Hi Guozhang, I understand that you might not want to introduce a new message semantic at this point of the 0.8 beta, but it won't get easier after the release. My proposal is a change of the protocol definition to allow unsolicited metadata response messages to be sent from the broker; this would of course require changes in most clients, but a very small one for those that are not interested in keeping their leader cache up to date. Consider a producer forwarding 100k msgs/s for a number of topics to a broker that suddenly drops the connection because one of those topics changed leader: the producer message queue will quickly build up and might start dropping messages (for topics that didn't lose their leader) due to local queue thresholds, or recover very slowly if the current rate of messages is close to the maximum throughput. In my mind closing the socket because one topic+partition changed leader is a very intrusive way to signal an event for a sub-set of the communication, and it should instead be fixed properly with an unsolicited metadata response message. The unsolicited metadata response message is useful for other scenarios as well, for instance when new brokers and topics are added. My two cents on the topic, thank you. After a leader change, messages sent with ack=0 are lost Key: KAFKA-955 URL: https://issues.apache.org/jira/browse/KAFKA-955 Project: Kafka Issue Type: Bug Reporter: Jason Rosenberg Assignee: Guozhang Wang Attachments: KAFKA-955.v1.patch, KAFKA-955.v1.patch, KAFKA-955.v2.patch, KAFKA-955.v3.patch If the leader changes for a partition, and a producer is sending messages with ack=0, then messages will be lost, since the producer has no active way of knowing that the leader has changed, until its next metadata refresh update. The broker receiving the message, which is no longer the leader, logs a message like this: Produce request with correlation id 7136261 from client on partition [mytopic,0] failed due to Leader not local for partition [mytopic,0] on broker 508818741 This is exacerbated by the controlled shutdown mechanism, which forces an immediate leader change. A possible solution to this would be for a broker which receives a message, for a topic that it is no longer the leader for (and if the ack level is 0), then the broker could just silently forward the message over to the current leader. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
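To make the proposal concrete, a purely hypothetical sketch of the client-side change: instead of the broker closing the connection on a leader change, it would push an unsolicited MetadataResponse and an interested client would simply refresh its leader cache (clients that do not care could ignore the message). None of these types exist in the Kafka protocol; they only illustrate how small the opt-in change could be:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class LeaderCache {
    private final Map<String, Integer> leaderByTopicPartition = new ConcurrentHashMap<>();

    // Invoked for metadata responses that arrive without a matching outstanding
    // request (the hypothetical "unsolicited" push from the broker).
    void onUnsolicitedMetadata(String topic, int partition, int leaderBrokerId) {
        leaderByTopicPartition.put(topic + "-" + partition, leaderBrokerId);
    }

    Integer leaderFor(String topic, int partition) {
        return leaderByTopicPartition.get(topic + "-" + partition);
    }
}
{code}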