[jira] [Commented] (KAFKA-5449) Flaky test TransactionsTest.testReadCommittedConsumerShouldNotSeeUndecidedData
[ https://issues.apache.org/jira/browse/KAFKA-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050890#comment-16050890 ] ASF GitHub Bot commented on KAFKA-5449: --- Github user apurvam closed the pull request at: https://github.com/apache/kafka/pull/3346 > Flaky test TransactionsTest.testReadCommittedConsumerShouldNotSeeUndecidedData > -- > > Key: KAFKA-5449 > URL: https://issues.apache.org/jira/browse/KAFKA-5449 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Reporter: Matthias J. Sax >Assignee: Apurva Mehta > Labels: exactly-once, test > > {noformat} > java.lang.AssertionError: Consumed 6 records until timeout instead of the > expected 8 records > at kafka.utils.TestUtils$.fail(TestUtils.scala:333) > at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:847) > at kafka.utils.TestUtils$.consumeRecords(TestUtils.scala:1357) > at > kafka.api.TransactionsTest.testReadCommittedConsumerShouldNotSeeUndecidedData(TransactionsTest.scala:143) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5442) Streams producer `client.id` are not unique for EOS
[ https://issues.apache.org/jira/browse/KAFKA-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050887#comment-16050887 ] ASF GitHub Bot commented on KAFKA-5442: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3329 > Streams producer `client.id` are not unique for EOS > --- > > Key: KAFKA-5442 > URL: https://issues.apache.org/jira/browse/KAFKA-5442 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Blocker > Fix For: 0.11.0.0 > > > With the producer-per-task model in EOS, the producer `client.id` must encode > the `task.id` to make IDs unique. Currently, only the thread-id is encoded, resulting > in naming conflicts. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
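As an illustration of the fix described above, here is a minimal, hypothetical Java sketch of building a per-task producer `client.id`; the identifier format and helper names are assumptions, not the actual Kafka Streams code.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class ProducerClientIds {
    // Illustrative only: the client.id encodes both the thread client id and the task id,
    // so each per-task producer created under EOS gets a unique id. The exact format used
    // by Kafka Streams may differ.
    static Properties producerConfigFor(String threadClientId, String taskId) {
        Properties props = new Properties();
        props.put(ProducerConfig.CLIENT_ID_CONFIG, threadClientId + "-" + taskId + "-producer");
        // ... other producer configs (bootstrap.servers, transactional.id, ...) would go here
        return props;
    }

    public static void main(String[] args) {
        // e.g. thread client id "my-app-StreamThread-1" and task id "0_1"
        System.out.println(producerConfigFor("my-app-StreamThread-1", "0_1")
                .getProperty(ProducerConfig.CLIENT_ID_CONFIG));
    }
}
```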
[jira] [Commented] (KAFKA-5454) Add a new Kafka Streams example IoT oriented
[ https://issues.apache.org/jira/browse/KAFKA-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050585#comment-16050585 ] ASF GitHub Bot commented on KAFKA-5454: --- GitHub user ppatierno opened a pull request: https://github.com/apache/kafka/pull/3352 KAFKA-5454: Add a new Kafka Streams example IoT oriented Added a Kafka Streams example (IoT oriented) using a "tumbling" window You can merge this pull request into a Git repository by running: $ git pull https://github.com/ppatierno/kafka stream-temperature-example Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3352.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3352 commit 536fbbfc070053e5bbb07ecfa8b49f7c184497c1 Author: ppatierno Date: 2017-06-15T14:56:39Z Added a Kafka Streams example (IoT oriented) using a "tumbling" window > Add a new Kafka Streams example IoT oriented > > > Key: KAFKA-5454 > URL: https://issues.apache.org/jira/browse/KAFKA-5454 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Paolo Patierno >Assignee: Paolo Patierno >Priority: Trivial > > Hi, > I wasn't sure whether to open a JIRA for this, but I have a PR with an > example of using Kafka Streams in a simple IoT scenario: a "tumbling" > window computes the maximum temperature value over the latest 5 seconds and > sends an "alarm" if it's over 20. > Thanks, > Paolo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
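For readers who want the gist of such an example, here is a minimal sketch of the tumbling-window / max-temperature idea from the issue, written against the current Streams DSL rather than the 0.11-era API used in the PR; the topic names, application id and the threshold of 20 are illustrative assumptions.

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class TemperatureAlarm {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "temperature-alarm");      // hypothetical
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");      // hypothetical

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("iot-temperature", Consumed.with(Serdes.String(), Serdes.Double()))
               .groupByKey()
               // tumbling 5-second window, as described in the issue
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(5)))
               // keep the maximum temperature seen in each window
               .reduce((v1, v2) -> Math.max(v1, v2))
               .toStream()
               // raise an "alarm" when the windowed maximum exceeds 20
               .filter((windowedKey, maxTemp) -> maxTemp > 20)
               .map((windowedKey, maxTemp) -> KeyValue.pair(windowedKey.key(), maxTemp))
               .to("iot-temperature-alarm", Produced.with(Serdes.String(), Serdes.Double()));

        new KafkaStreams(builder.build(), props).start();
    }
}
```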
[jira] [Commented] (KAFKA-5275) Review and potentially tweak AdminClient API for the initial release (KIP-117)
[ https://issues.apache.org/jira/browse/KAFKA-5275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049874#comment-16049874 ] ASF GitHub Bot commented on KAFKA-5275: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3339 > Review and potentially tweak AdminClient API for the initial release (KIP-117) > -- > > Key: KAFKA-5275 > URL: https://issues.apache.org/jira/browse/KAFKA-5275 > Project: Kafka > Issue Type: Sub-task >Reporter: Ismael Juma >Assignee: Ismael Juma > Fix For: 0.11.0.0 > > > Once all the pieces are in, we should take a pass and ensure that the APIs > work well together and that they are consistent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5449) Flaky test TransactionsTest.testReadCommittedConsumerShouldNotSeeUndecidedData
[ https://issues.apache.org/jira/browse/KAFKA-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049859#comment-16049859 ] ASF GitHub Bot commented on KAFKA-5449: --- GitHub user apurvam opened a pull request: https://github.com/apache/kafka/pull/3346 MINOR: Cleanups for TransactionsTest The motivation is that KAFKA-5449 seems to indicate that producer instances can be shared across tests, and that producers from one test seem to be hitting brokers in another test. So this patch does two things: (1) make TransactionsTest use random ports in each test case, and (2) clear producers and consumers between tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apurvam/kafka MINOR-transactiontest-should-inherit-from-integration-test-harness Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3346.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3346 commit 2cc3aa4c77d80bd1c806f7ee276cb18ffbeaa540 Author: Apurva Mehta Date: 2017-06-15T00:35:52Z Make transactionsTest use random ports in each test case. Clear producers and consumers between tests > Flaky test TransactionsTest.testReadCommittedConsumerShouldNotSeeUndecidedData > -- > > Key: KAFKA-5449 > URL: https://issues.apache.org/jira/browse/KAFKA-5449 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Reporter: Matthias J. Sax >Assignee: Apurva Mehta > Labels: exactly-once, test > > {noformat} > java.lang.AssertionError: Consumed 6 records until timeout instead of the > expected 8 records > at kafka.utils.TestUtils$.fail(TestUtils.scala:333) > at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:847) > at kafka.utils.TestUtils$.consumeRecords(TestUtils.scala:1357) > at > kafka.api.TransactionsTest.testReadCommittedConsumerShouldNotSeeUndecidedData(TransactionsTest.scala:143) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
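A hedged JUnit-style sketch of the second point (clearing producers and consumers between tests); the actual TransactionsTest is written in Scala and its harness differs, so this is only illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.junit.After;

public class TransactionsTestFixture {
    private final List<KafkaProducer<byte[], byte[]>> producers = new ArrayList<>();
    private final List<KafkaConsumer<byte[], byte[]>> consumers = new ArrayList<>();

    // Tests register every client they create so that nothing leaks into the next test case.
    KafkaProducer<byte[], byte[]> registerProducer(KafkaProducer<byte[], byte[]> producer) {
        producers.add(producer);
        return producer;
    }

    KafkaConsumer<byte[], byte[]> registerConsumer(KafkaConsumer<byte[], byte[]> consumer) {
        consumers.add(consumer);
        return consumer;
    }

    @After
    public void closeClients() {
        producers.forEach(KafkaProducer::close);
        consumers.forEach(KafkaConsumer::close);
        producers.clear();
        consumers.clear();
    }
}
```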
[jira] [Commented] (KAFKA-5450) Scripts to startup Connect in system tests have too short a timeout
[ https://issues.apache.org/jira/browse/KAFKA-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049803#comment-16049803 ] ASF GitHub Bot commented on KAFKA-5450: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3344 > Scripts to startup Connect in system tests have too short a timeout > --- > > Key: KAFKA-5450 > URL: https://issues.apache.org/jira/browse/KAFKA-5450 > Project: Kafka > Issue Type: Bug > Components: system tests >Affects Versions: 0.11.0.0 >Reporter: Randall Hauch >Assignee: Randall Hauch > Fix For: 0.11.0.0, 0.11.1.0 > > > When the system tests start up a Kafka Connect standalone or distributed > worker, the utility starts the process, and if the worker does not start up > within 30 seconds the utility considers it a failure and stops everything. > This is often sufficient when running the system tests against the source > code, as the CLASSPATH for Connect includes only the Kafka Connect runtime > JARs (in addition to all of the connector dirs). However, when running the > system tests against the packaged form of Kafka, the CLASSPATH for Connect > includes all of the Apache Kafka JARs (in addition to all of the connector > dirs). This increases the total number of JARs that have to be scanned by > almost 75%, and the time required to scan all of the JARs nearly > doubles, from ~14sec to ~26sec. (Some of the additional JARs are likely larger > and take longer to scan than those JARs in Connect or the connectors.) > As a result, the 30 second timeout is often not quite sufficient for the > Connect system test utility and should be increased to 60 seconds. This > shouldn't noticeably increase the time of most system tests, since 30 seconds > was nearly sufficient anyway; it will increase the duration of the tests > where Connect does fail to start, but that ideally won't happen much. :-) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5448) TimestampConverter's "type" config conflicts with the basic Transformation "type" config
[ https://issues.apache.org/jira/browse/KAFKA-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049750#comment-16049750 ] ASF GitHub Bot commented on KAFKA-5448: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3342 > TimestampConverter's "type" config conflicts with the basic Transformation > "type" config > > > Key: KAFKA-5448 > URL: https://issues.apache.org/jira/browse/KAFKA-5448 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Reporter: Ewen Cheslack-Postava >Assignee: Ewen Cheslack-Postava >Priority: Blocker > Fix For: 0.11.0.0, 0.11.1.0 > > Original Estimate: 1h > Remaining Estimate: 1h > > [KIP-66|https://cwiki.apache.org/confluence/display/KAFKA/KIP-66%3A+Single+Message+Transforms+for+Kafka+Connect] > defined one of the configs for TimestampConverter to be "type". However, all > transformations are configured with the "type" config specifying the class > that implements them. > We need to modify the naming of the configs so these don't conflict. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
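To make the conflict concrete, here is an illustrative connector configuration built in Java: the Connect framework reserves the per-transformation `type` key for the Transformation class itself, so TimestampConverter's own setting needs a different name. The `target.type` key shown reflects the eventual rename, but the alias and value here are just examples.

```java
import java.util.HashMap;
import java.util.Map;

public class SmtConfigConflict {
    public static void main(String[] args) {
        Map<String, String> connectorConfig = new HashMap<>();
        connectorConfig.put("transforms", "ts");
        // The framework reserves "transforms.<alias>.type" for the Transformation class itself...
        connectorConfig.put("transforms.ts.type",
                "org.apache.kafka.connect.transforms.TimestampConverter$Value");
        // ...so TimestampConverter cannot also define a config named "type"; it needs a
        // distinct key (shown here as "target.type", for illustration).
        connectorConfig.put("transforms.ts.target.type", "Timestamp");
        connectorConfig.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```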
[jira] [Commented] (KAFKA-5450) Scripts to startup Connect in system tests have too short a timeout
[ https://issues.apache.org/jira/browse/KAFKA-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049697#comment-16049697 ] ASF GitHub Bot commented on KAFKA-5450: --- GitHub user rhauch opened a pull request: https://github.com/apache/kafka/pull/3344 KAFKA-5450 Increased timeout of Connect system test utilities Increased the timeout from 30sec to 60sec. When running the system tests with packaged Kafka, Connect workers can take about 30 seconds to start. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rhauch/kafka KAFKA-5450 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3344.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3344 commit 0f1f9f98b175e06373f1f4e5edcb009903b22e6e Author: Randall Hauch Date: 2017-06-14T21:41:11Z KAFKA-5450 Increased timeout of Connect system test utilities Increased the timeout from 30sec to 60sec. When running the system tests with packaged Kafka, Connect workers can take about 30 seconds to start. > Scripts to startup Connect in system tests have too short a timeout > --- > > Key: KAFKA-5450 > URL: https://issues.apache.org/jira/browse/KAFKA-5450 > Project: Kafka > Issue Type: Bug > Components: system tests >Affects Versions: 0.11.0.0 >Reporter: Randall Hauch >Assignee: Randall Hauch > Fix For: 0.11.1.0 > > > When the system tests start up a Kafka Connect standalone or distributed > worker, the utility starts the process, and if the worker does not start up > within 30 seconds the utility considers it a failure and stops everything. > This is often sufficient when running the system tests against the source > code, as the CLASSPATH for Connect includes only the Kafka Connect runtime > JARs (in addition to all of the connector dirs). However, when running the > system tests against the packaged form of Kafka, the CLASSPATH for Connect > includes all of the Apache Kafka JARs (in addition to all of the connector > dirs). This increases the total number of JARs that have to be scanned by > almost 75%, and the time required to scan all of the JARs nearly > doubles, from ~14sec to ~26sec. (Some of the additional JARs are likely larger > and take longer to scan than those JARs in Connect or the connectors.) > As a result, the 30 second timeout is often not quite sufficient for the > Connect system test utility and should be increased to 60 seconds. This > shouldn't noticeably increase the time of most system tests, since 30 seconds > was nearly sufficient anyway; it will increase the duration of the tests > where Connect does fail to start, but that ideally won't happen much. :-) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5449) Flaky test TransactionsTest.testReadCommittedConsumerShouldNotSeeUndecidedData
[ https://issues.apache.org/jira/browse/KAFKA-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049670#comment-16049670 ] ASF GitHub Bot commented on KAFKA-5449: --- GitHub user apurvam opened a pull request: https://github.com/apache/kafka/pull/3343 WIP: KAFKA-5449: fix bad state transition in transaction manager The `kafka.api.TransactionsTest.testReadCommittedConsumerShouldNotSeeUndecidedData` very rarely sees the following. I have run it 700 times locally without failure, so it only happens on Jenkins. This PR adds trace logging to the client. Will keep running the PR builder here and hope that the test fails again so that we can understand what's going on. It is strange that we have an ongoing send when we are in `READY` state. It is even more strange that we see a `ProducerFencedException` in the log. Could it be that some other run is interfering with this one (since multiple test cases use the same producer ids)? ``` [2017-06-13 23:58:09,644] ERROR Aborting producer batches due to fatal error (org.apache.kafka.clients.producer.internals.Sender:381) org.apache.kafka.common.errors.ProducerFencedException: Producer attempted an operation with an old epoch. Either there is a newer producer with the same transactionalId, or the producer's transaction has been expired by the broker. [2017-06-13 23:58:10,177] ERROR [ReplicaFetcherThread-0-0]: Error for partition [topic2,3] to broker 0:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread:99) [2017-06-13 23:58:10,177] ERROR [ReplicaFetcherThread-0-0]: Error for partition [topic2,0] to broker 0:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread:99) [2017-06-13 23:58:10,178] ERROR [ReplicaFetcherThread-0-0]: Error for partition [topic1,2] to broker 0:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread:99) [2017-06-13 23:58:12,128] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472) [2017-06-13 23:58:12,134] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472) [2017-06-13 23:58:12,310] ERROR [ReplicaFetcherThread-0-1]: Error for partition [topic1,0] to broker 1:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. (kafka.server.ReplicaFetcherThread:99) [2017-06-13 23:58:12,311] ERROR [ReplicaFetcherThread-0-1]: Error for partition [topic1,3] to broker 1:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. 
(kafka.server.ReplicaFetcherThread:99) [2017-06-13 23:58:15,998] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472) [2017-06-13 23:58:16,005] ERROR ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes (org.apache.zookeeper.server.ZooKeeperServer:472) [2017-06-13 23:58:16,177] ERROR [ReplicaFetcherThread-0-2]: Error for partition [topic1,2] to broker 2:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. (kafka.server.ReplicaFetcherThread:99) [2017-06-13 23:58:16,177] ERROR [ReplicaFetcherThread-0-0]: Error for partition [topic1,3] to broker 0:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. (kafka.server.ReplicaFetcherThread:99) [2017-06-13 23:58:16,178] ERROR [ReplicaFetcherThread-0-0]: Error for partition [topic1,0] to broker 0:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition. (kafka.server.ReplicaFetcherThread:99) [2017-06-13 23:58:28,177] ERROR Uncaught error in kafka producer I/O thread: (org.apache.kafka.clients.producer.internals.Sender:164) org.apache.kafka.common.KafkaException: Invalid transition attempted from state READY to state ABORTABLE_ERROR at org.apache.kafka.clients.producer.internals.TransactionManager.transitionTo(TransactionManager.java:476) at org.apache.kafka.clients.producer.internals.TransactionManager.transitionToAbortableError(TransactionManager.java:289) at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:601)
[jira] [Commented] (KAFKA-5448) TimestampConverter's "type" config conflicts with the basic Transformation "type" config
[ https://issues.apache.org/jira/browse/KAFKA-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049658#comment-16049658 ] ASF GitHub Bot commented on KAFKA-5448: --- GitHub user ewencp opened a pull request: https://github.com/apache/kafka/pull/3342 KAFKA-5448: Change TimestampConverter configuration name to avoid conflicting with reserved 'type' configuration used by all Transformations You can merge this pull request into a Git repository by running: $ git pull https://github.com/ewencp/kafka kafka-5448-change-timestamp-converter-config-name Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3342.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3342 commit bd560fd7ed2838c22d83003092ede284f092410f Author: Ewen Cheslack-PostavaDate: 2017-06-14T21:05:52Z KAFKA-5448: Change TimestampConverter configuration name to avoid conflicting with reserved 'type' configuration used by all Transformations > TimestampConverter's "type" config conflicts with the basic Transformation > "type" config > > > Key: KAFKA-5448 > URL: https://issues.apache.org/jira/browse/KAFKA-5448 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Reporter: Ewen Cheslack-Postava >Assignee: Ewen Cheslack-Postava >Priority: Blocker > Fix For: 0.11.0.0 > > Original Estimate: 1h > Remaining Estimate: 1h > > [KIP-66|https://cwiki.apache.org/confluence/display/KAFKA/KIP-66%3A+Single+Message+Transforms+for+Kafka+Connect] > defined one of the configs for TimestampConverter to be "type". However, all > transformations are configured with the "type" config specifying the class > that implements them. > We need to modify the naming of the configs so these don't conflict. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-3123) Follower Broker cannot start if offsets are already out of range
[ https://issues.apache.org/jira/browse/KAFKA-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049506#comment-16049506 ] ASF GitHub Bot commented on KAFKA-3123: --- Github user soumyajit-sahu closed the pull request at: https://github.com/apache/kafka/pull/1716 > Follower Broker cannot start if offsets are already out of range > > > Key: KAFKA-3123 > URL: https://issues.apache.org/jira/browse/KAFKA-3123 > Project: Kafka > Issue Type: Bug > Components: core, replication >Affects Versions: 0.9.0.0 >Reporter: Soumyajit Sahu >Assignee: Mickael Maison >Priority: Critical > Labels: patch > Fix For: 0.11.0.0 > > Attachments: > 0001-Fix-Follower-crashes-when-offset-out-of-range-during.patch > > > I was trying to upgrade our test Windows cluster from 0.8.1.1 to 0.9.0 one > machine at a time. Our logs have just 2 hours of retention. I had re-imaged > the test machine under consideration, and got the following error in loop > after starting afresh with 0.9.0 broker: > [2016-01-19 13:57:28,809] WARN [ReplicaFetcherThread-1-169595708], Replica > 15588 for partition [EventLogs4,1] reset its fetch offset from 0 to > current leader 169595708's start offset 334086 > (kafka.server.ReplicaFetcherThread) > [2016-01-19 13:57:28,809] ERROR [ReplicaFetcherThread-1-169595708], Error > getting offset for partition [EventLogs4,1] to broker 169595708 > (kafka.server.ReplicaFetcherThread) > java.lang.IllegalStateException: Compaction for partition [EXO_EventLogs4,1] > cannot be aborted and paused since it is in LogCleaningPaused state. > at > kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply$mcV$sp(LogCleanerManager.scala:149) > at > kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply(LogCleanerManager.scala:140) > at > kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply(LogCleanerManager.scala:140) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262) > at > kafka.log.LogCleanerManager.abortAndPauseCleaning(LogCleanerManager.scala:140) > at kafka.log.LogCleaner.abortAndPauseCleaning(LogCleaner.scala:141) > at kafka.log.LogManager.truncateFullyAndStartAt(LogManager.scala:304) > at > kafka.server.ReplicaFetcherThread.handleOffsetOutOfRange(ReplicaFetcherThread.scala:185) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:152) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:122) > at scala.Option.foreach(Option.scala:236) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:122) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:120) > at > scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) > at > scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) > at > scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) > at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) > at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:120) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120) > at > 
kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262) > at > kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118) > at > kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:93) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63) > I could unblock myself with a code change. I deleted the action for "case s > =>" in LogCleanerManager.scala's abortAndPauseCleaning(). I think we > should not throw an exception if the state is already LogCleaningAborted or > LogCleaningPaused in this function, but instead just let it roll. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
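A rough Java sketch of the tolerant behaviour the reporter suggests, i.e. treating an already-aborted or already-paused state as a no-op instead of throwing; the real LogCleanerManager is Scala and more involved, so this is only an illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class CleanerStateTracker {
    enum State { NONE, LOG_CLEANING_IN_PROGRESS, LOG_CLEANING_ABORTED, LOG_CLEANING_PAUSED }

    private final Map<String, State> stateByPartition = new HashMap<>();

    // Suggested behaviour from the issue: if cleaning is already aborted or paused,
    // leave the state alone instead of throwing an IllegalStateException.
    synchronized void abortAndPauseCleaning(String topicPartition) {
        State current = stateByPartition.getOrDefault(topicPartition, State.NONE);
        switch (current) {
            case NONE:
                stateByPartition.put(topicPartition, State.LOG_CLEANING_PAUSED);
                break;
            case LOG_CLEANING_IN_PROGRESS:
                stateByPartition.put(topicPartition, State.LOG_CLEANING_ABORTED);
                break;
            case LOG_CLEANING_ABORTED:
            case LOG_CLEANING_PAUSED:
                // already aborted/paused: no-op, do not throw
                break;
        }
    }
}
```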
[jira] [Commented] (KAFKA-5443) Consumer should use last offset from batch to set next fetch offset
[ https://issues.apache.org/jira/browse/KAFKA-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049371#comment-16049371 ] ASF GitHub Bot commented on KAFKA-5443: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3331 > Consumer should use last offset from batch to set next fetch offset > --- > > Key: KAFKA-5443 > URL: https://issues.apache.org/jira/browse/KAFKA-5443 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Jason Gustafson > Fix For: 0.11.0.0 > > > With message format v2, the log cleaner preserves the last offset in each > batch even if the last record is removed. Currently when the batch is > consumed by the consumer, we use the last record in the batch to determine > the next offset to fetch. So if the last record in the batch was removed > through compaction, the next fetch offset will still point to an offset in > the current batch and it will be refetched. In the worst case, if the fetch > size has room for that batch, the consumer will not be able to make progress. > To fix this, we should advance the next fetch offset to the last offset from > the batch once we have consumed that batch. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
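A simplified model of the fix, not the actual Fetcher code: once a batch has been fully consumed, the next fetch position should come from the batch's last offset rather than from the last record that happened to survive compaction.

```java
public class FetchPositionExample {
    // Minimal model of a fetched batch; the real client works with RecordBatch internals.
    static final class Batch {
        final long lastOffset;        // preserved by the log cleaner even if the last record is gone
        final long lastRecordOffset;  // offset of the last record actually present in the batch
        Batch(long lastOffset, long lastRecordOffset) {
            this.lastOffset = lastOffset;
            this.lastRecordOffset = lastRecordOffset;
        }
    }

    // Before the fix: the next fetch offset was derived from the last *record*, which can
    // re-fetch the same batch forever if compaction removed the trailing records.
    static long nextFetchOffsetOld(Batch batch) {
        return batch.lastRecordOffset + 1;
    }

    // After the fix: once the batch is fully consumed, advance to the batch's last offset + 1.
    static long nextFetchOffsetNew(Batch batch) {
        return batch.lastOffset + 1;
    }

    public static void main(String[] args) {
        Batch compacted = new Batch(109, 105); // records 106-109 removed by compaction
        System.out.println(nextFetchOffsetOld(compacted)); // 106 -> refetches the same batch
        System.out.println(nextFetchOffsetNew(compacted)); // 110 -> makes progress
    }
}
```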
[jira] [Commented] (KAFKA-5361) Add EOS integration tests for Streams API
[ https://issues.apache.org/jira/browse/KAFKA-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049338#comment-16049338 ] ASF GitHub Bot commented on KAFKA-5361: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3276 > Add EOS integration tests for Streams API > - > > Key: KAFKA-5361 > URL: https://issues.apache.org/jira/browse/KAFKA-5361 > Project: Kafka > Issue Type: Bug > Components: streams, unit tests >Affects Versions: 0.11.0.0 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > We need to add more integration tests for Streams API with exactly-once > enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5274) Review and improve AdminClient Javadoc for the first release (KIP-117)
[ https://issues.apache.org/jira/browse/KAFKA-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049323#comment-16049323 ] ASF GitHub Bot commented on KAFKA-5274: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3316 > Review and improve AdminClient Javadoc for the first release (KIP-117) > -- > > Key: KAFKA-5274 > URL: https://issues.apache.org/jira/browse/KAFKA-5274 > Project: Kafka > Issue Type: Sub-task >Reporter: Ismael Juma >Assignee: Ismael Juma > Fix For: 0.11.0.0, 0.11.1.0 > > > Once all the AdminClient pieces are in, we should take a pass at the Javadoc > and improve it wherever possible. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5275) Review and potentially tweak AdminClient API for the initial release (KIP-117)
[ https://issues.apache.org/jira/browse/KAFKA-5275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049301#comment-16049301 ] ASF GitHub Bot commented on KAFKA-5275: --- GitHub user ijuma opened a pull request: https://github.com/apache/kafka/pull/3339 KAFKA-5275: AdminClient API consistency You can merge this pull request into a Git repository by running: $ git pull https://github.com/ijuma/kafka kafka-5275-admin-client-api-consistency Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3339.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3339 commit 515037d6c8d37c4d088fd1d15b2275a609ac7e26 Author: Ismael JumaDate: 2017-06-13T13:56:05Z Publish javadoc for common.annotation package It includes InterfaceStability annotation. commit 51264deb6a49221d9d524896f91fd70af2439887 Author: Ismael Juma Date: 2017-06-13T14:10:44Z Clarify InterfaceStability commit c5d77c5cb28bf5db2932c1862e0ff44c74513158 Author: Ismael Juma Date: 2017-06-13T14:27:46Z Various javadoc improvements to API classes in clients.admin commit 57d83d031fee20ac4e988c75deb6870c0be8ed41 Author: Ismael Juma Date: 2017-06-13T23:40:40Z Revert assert change commit d5a1bb479f968ee0da86542446befc7e4ea77019 Author: Ismael Juma Date: 2017-06-14T00:06:20Z Add javadoc to some classes in common commit a5bb109c46f90f0a9aa7312b725bc8326feb17e2 Author: Ismael Juma Date: 2017-06-14T00:31:51Z More javadoc for common classes commit baffc69ef6cbaf4d5b8478bef6a7cfc6fa639fed Author: Ismael Juma Date: 2017-06-14T00:33:26Z Address review feedback commit ac064599f1708ae8e7a1194922a138af92999d37 Author: Ismael Juma Date: 2017-06-14T00:56:41Z Document TopicPartitionInfo commit 72a0010ad5a0ddd2b18e0c757c70a923e4876d8c Author: Ismael Juma Date: 2017-06-14T12:34:40Z Document broker requirement for AdminClient methods commit 1a65c1005c65d03196e3db2cee2387389940421c Author: Ismael Juma Date: 2017-06-14T14:38:06Z Add InterfaceStability to more classes commit bdf4d09b3139d13c215f5ec7faea30654aa6db34 Author: Ismael Juma Date: 2017-06-14T13:11:04Z Use List instead of NavigableMap for TopicDescription.partitions commit 5d5bd02bfc7ef657514e3ffd3ccffab18957b486 Author: Ismael Juma Date: 2017-06-14T13:11:37Z Lists exposed by TopicPartitionInfo should be unmodifiable commit f41a6651728eecc0f16bbe167aaf747dc29ed8a1 Author: Ismael Juma Date: 2017-06-14T13:17:15Z Rename TopicListing to TopicListItem Listing doesn't seem to be the right term for what it represents. commit 5bfb60756deee38394fb00efb7dd2d5871edae95 Author: Ismael Juma Date: 2017-06-14T13:18:31Z Rename NewTopic.partitions to NewTopic.numPartitions commit 89b88deafa9cadeef2c4f35f75bc0e0331cb2456 Author: Ismael Juma Date: 2017-06-14T13:24:42Z Replace `description` usage in `ListTopicsResult` `ListTopicsResult` doesn't return `TopicDescription` commit e75985e0a1b6fe40d3309d36bf17e84e856edbd8 Author: Ismael Juma Date: 2017-06-14T13:36:24Z Rename `results()` to `value()` commit 93fa6cafc908958ab81f416a76e4d41c1dae28ed Author: Ismael Juma Date: 2017-06-14T13:43:36Z Don't use JVM level asserts as they are not enabled by default commit 50db3ccac1d168cc12109d99a5a3c40260c5b781 Author: Ismael Juma Date: 2017-06-14T14:07:32Z Make retries configurable commit 8752681528573e937b7ab5c1c7c9548fa21cbf29 Author: Ismael Juma Date: 2017-06-14T15:20:41Z Consistent usage of prefix for boolean accessors The other option is to remove all of the existing prefixes. 
commit c55861a3461931d22e340d6b7af8eb4e17c7b51c Author: Ismael Juma Date: 2017-06-14T15:36:11Z Use `null` for unknown controller or leader > Review and potentially tweak AdminClient API for the initial release (KIP-117) > -- > > Key: KAFKA-5275 > URL: https://issues.apache.org/jira/browse/KAFKA-5275 > Project: Kafka > Issue Type: Sub-task >Reporter: Ismael Juma >Assignee: Ismael Juma > Fix For: 0.11.0.0 > > > Once all the pieces are in, we should take a pass and ensure that the APIs > work well together and that they are consistent. -- This message was sent by Atlassian
[jira] [Commented] (KAFKA-5354) MirrorMaker not preserving headers
[ https://issues.apache.org/jira/browse/KAFKA-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049277#comment-16049277 ] ASF GitHub Bot commented on KAFKA-5354: --- Github user michaelandrepearce closed the pull request at: https://github.com/apache/kafka/pull/3323 > MirrorMaker not preserving headers > -- > > Key: KAFKA-5354 > URL: https://issues.apache.org/jira/browse/KAFKA-5354 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.11.0.0 >Reporter: Jun Rao >Assignee: Michael Andre Pearce >Priority: Blocker > Fix For: 0.11.0.0 > > > Currently, it doesn't seem that MirrorMaker preserves headers in > BaseConsumerRecord. So, headers won't be preserved during mirroring. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
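The gist of the fix, as a hedged sketch rather than the actual MirrorMaker handler code: the headers of the consumed record must be passed through to the ProducerRecord, otherwise they are dropped during mirroring.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.ProducerRecord;

public class HeaderPreservingHandler {
    // Copies key, value and headers to the mirrored record; passing the headers through
    // the ProducerRecord constructor is what preserves them on the target cluster.
    static ProducerRecord<byte[], byte[]> mirror(ConsumerRecord<byte[], byte[]> record) {
        return new ProducerRecord<>(
                record.topic(),        // same topic name on the target cluster
                null,                  // let the producer choose the partition
                record.timestamp(),
                record.key(),
                record.value(),
                record.headers());     // without this argument the headers are dropped
    }
}
```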
[jira] [Commented] (KAFKA-5446) Annoying braces showed on log.error using streams
[ https://issues.apache.org/jira/browse/KAFKA-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049191#comment-16049191 ] ASF GitHub Bot commented on KAFKA-5446: --- GitHub user ppatierno opened a pull request: https://github.com/apache/kafka/pull/3338 KAFKA-5446: Annoying braces showed on log.error using streams Fixed log.error usage with annoying braces You can merge this pull request into a Git repository by running: $ git pull https://github.com/ppatierno/kafka log-error Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3338.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3338 commit dcf8308e04663295f9310632dcfc226ea58cd947 Author: ppatierno Date: 2017-06-14T13:35:11Z Fixed LOG.error usage with annoying braces > Annoying braces showed on log.error using streams > -- > > Key: KAFKA-5446 > URL: https://issues.apache.org/jira/browse/KAFKA-5446 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Paolo Patierno >Priority: Trivial > > Hi, > in the Streams library there seems to be an incorrect usage of the log.error method when > we want to log an exception. There are useless braces at the end of the line > before the exception information, as in the following example: > ERROR task [0_0] Could not close task due to {} > (org.apache.kafka.streams.processor.internals.StreamTask:414) > org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be > completed since the group has already rebalanced and assigned the partitions > to another member. This means that the time between subsequent calls to > poll() was longer than the configured max.poll.interval.ms, which typically > implies that the poll loop is spending too much time message processing. You > can address this either by increasing the session timeout or by reducing the > maximum size of batches returned in poll() with max.poll.records. > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:725) > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:604) > at > org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1146) > as you can see in "due to {}", the braces are not replaced with the > exception info, so they are printed literally. > Thanks, > Paolo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
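The braces appear because SLF4J treats a trailing exception argument as the Throwable parameter and leaves the `{}` placeholder unfilled; a minimal illustration (not the actual StreamTask code):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LogErrorBraces {
    private static final Logger log = LoggerFactory.getLogger(LogErrorBraces.class);

    static void onCloseFailure(RuntimeException e) {
        // Problematic: SLF4J treats the trailing exception as the Throwable parameter, so the
        // "{}" placeholder is left unfilled and printed literally ("... due to {}"), followed
        // by the stack trace.
        log.error("Could not close task due to {}", e);

        // Clearer: drop the placeholder; the exception is still logged with its stack trace.
        log.error("Could not close task due to the following error:", e);
    }
}
```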
[jira] [Commented] (KAFKA-5354) MirrorMaker not preserving headers
[ https://issues.apache.org/jira/browse/KAFKA-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049045#comment-16049045 ] ASF GitHub Bot commented on KAFKA-5354: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3322 > MirrorMaker not preserving headers > -- > > Key: KAFKA-5354 > URL: https://issues.apache.org/jira/browse/KAFKA-5354 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.11.0.0 >Reporter: Jun Rao >Assignee: Michael Andre Pearce >Priority: Blocker > Fix For: 0.11.0.0 > > > Currently, it doesn't seem that MirrorMaker preserves headers in > BaseConsumerRecord. So, headers won't be preserved during mirroring. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5434) Console consumer hangs if not existing partition is specified
[ https://issues.apache.org/jira/browse/KAFKA-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048848#comment-16048848 ] ASF GitHub Bot commented on KAFKA-5434: --- GitHub user ppatierno opened a pull request: https://github.com/apache/kafka/pull/3335 KAFKA-5434: Console consumer hangs if not existing partition is specified Added a check that the partition exists before the assign request You can merge this pull request into a Git repository by running: $ git pull https://github.com/ppatierno/kafka kafka-5434 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3335.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3335 commit 44bbb65994a957866ffc9b7574a4e40870c4f69f Author: ppatierno Date: 2017-06-14T08:22:53Z Added a check that the partition exists before the assign request > Console consumer hangs if not existing partition is specified > - > > Key: KAFKA-5434 > URL: https://issues.apache.org/jira/browse/KAFKA-5434 > Project: Kafka > Issue Type: Bug > Components: tools >Reporter: Paolo Patierno >Assignee: Vahid Hashemian > > Hi, > if I specify the --partition option for the console consumer with a non-existent > partition for a topic, the application hangs indefinitely. > Debugging the code, I see that it asks for metadata, but when it receives the topic > information and doesn't find the requested partition in that metadata, > the code simply retries. > Would it be worth checking whether the partition exists using the partitionsFor > method before calling assign in the seek of the BaseConsumer, and throwing > an exception so that an error is printed on the console? > Thanks, > Paolo -- This message was sent by Atlassian JIRA (v6.4.14#64029)
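A hedged sketch of the suggested check using the new consumer's partitionsFor(); the real console consumer code differs, and the helper and flow here are only illustrative.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class PartitionCheckExample {
    static void assignIfPartitionExists(Properties props, String topic, int partition, long offset) {
        try (KafkaConsumer<byte[], byte[]> consumer =
                     new KafkaConsumer<>(props, new ByteArrayDeserializer(), new ByteArrayDeserializer())) {
            // Suggested check from the issue: look the partition up in the topic metadata first,
            // instead of assigning blindly and waiting forever for it to appear.
            boolean exists = consumer.partitionsFor(topic).stream()
                    .map(PartitionInfo::partition)
                    .anyMatch(p -> p == partition);
            if (!exists) {
                throw new IllegalArgumentException(
                        "Partition " + partition + " does not exist for topic " + topic);
            }
            TopicPartition tp = new TopicPartition(topic, partition);
            consumer.assign(Collections.singleton(tp));
            consumer.seek(tp, offset);
            // ... poll loop would go here
        }
    }
}
```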
[jira] [Commented] (KAFKA-5443) Consumer should use last offset from batch to set next fetch offset
[ https://issues.apache.org/jira/browse/KAFKA-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048688#comment-16048688 ] ASF GitHub Bot commented on KAFKA-5443: --- GitHub user hachikuji opened a pull request: https://github.com/apache/kafka/pull/3331 KAFKA-5443: Consumer should use last offset from batch to set next fetch offset You can merge this pull request into a Git repository by running: $ git pull https://github.com/hachikuji/kafka KAFKA-5443 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3331.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3331 commit b991c8185c5bce1cedb39dc5158afaed1b8db2f4 Author: Jason GustafsonDate: 2017-06-14T04:17:33Z KAFKA-5443: Consumer should use last offset from batch to set next fetch offset > Consumer should use last offset from batch to set next fetch offset > --- > > Key: KAFKA-5443 > URL: https://issues.apache.org/jira/browse/KAFKA-5443 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Jason Gustafson > Fix For: 0.11.0.0 > > > With message format v2, the log cleaner preserves the last offset in each > batch even if the last record is removed. Currently when the batch is > consumed by the consumer, we use the last record in the batch to determine > the next offset to fetch. So if the last record in the batch was removed > through compaction, the next fetch offset will still point to an offset in > the current batch and it will be refetched. In the worst case, if the fetch > size has room for that batch, the consumer will not be able to make progress. > To fix this, we should advance the next fetch offset to the last offset from > the batch once we have consumed that batch. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5438) UnsupportedOperationException in WriteTxnMarkers handler
[ https://issues.apache.org/jira/browse/KAFKA-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048514#comment-16048514 ] ASF GitHub Bot commented on KAFKA-5438: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3313 > UnsupportedOperationException in WriteTxnMarkers handler > > > Key: KAFKA-5438 > URL: https://issues.apache.org/jira/browse/KAFKA-5438 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Apurva Mehta >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > {code} > [2017-06-10 19:16:36,280] ERROR [KafkaApi-2] Error when handling request > {replica_id=1,max_wait_time=500,min_bytes=1,max_bytes=10485760,isolation_level=0,topics=[{topic=__transaction_state,partitions=[{partition=45,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=15,f > etch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=20,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=3,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=21,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=29,fetch_offset=0,lo > g_start_offset=0,max_bytes=1048576},{partition=39,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=38,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=9,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=33,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=32,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=17,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=23,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=47,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=26,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=5,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=27,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=41,fetch_offset=48,log_start_offset=0,max_bytes=1048576},{partition=35,fetch_offset=0,log_start_offset=0,max_bytes=1048576}]},{topic=__consumer_offsets,partitions=[{partition=8,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=21,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=27,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=9,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=41,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=33,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=23,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=47,fetch_offset=41,log_start_offset=0,max_bytes=1048576},{partition=3,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=15,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=17,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=11,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=14,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=20,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=39,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=45,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=5,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=26,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=29,fetch_offset=0,log_start_offset=0,max_bytes=1048576}]},{topic=output-topic,partitions=[{partition=1,fetch_offset=4522,log_start_offset=0,max_bytes=1048576}]}]} > (kafka.server.KafkaApis) > 
java.lang.UnsupportedOperationException > at java.util.AbstractMap.put(AbstractMap.java:203) > at java.util.AbstractMap.putAll(AbstractMap.java:273) > at > kafka.server.KafkaApis.kafka$server$KafkaApis$$sendResponseCallback$13(KafkaApis.scala:1509) > at > kafka.server.KafkaApis$$anonfun$handleWriteTxnMarkersRequest$2$$anonfun$apply$20.apply(KafkaApis.scala:1571) > at > kafka.server.KafkaApis$$anonfun$handleWriteTxnMarkersRequest$2$$anonfun$apply$20.apply(KafkaApis.scala:1571) > at kafka.server.DelayedProduce.onComplete(DelayedProduce.scala:131) > at > kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:66) > at kafka.server.DelayedProduce.tryComplete(DelayedProduce.scala:113) > at > kafka.server.DelayedProduce.safeTryComplete(DelayedProduce.scala:76) > at > kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:338) > at > kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:244) > at > kafka.server.ReplicaManager.tryCompleteDelayedProduce(ReplicaManager.scala:250) > at >
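The stack trace shows AbstractMap.putAll failing because the target map does not support put(); a small, self-contained Java illustration of that failure mode and the usual remedy (copying into a mutable map). This is not the actual KafkaApis code, and the map contents are invented for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class UnmodifiableMapPut {
    public static void main(String[] args) {
        Map<String, Short> mergedErrors = Collections.emptyMap(); // does not support put()

        Map<String, Short> partitionErrors = new HashMap<>();
        partitionErrors.put("output-topic-1", (short) 0);

        try {
            // Mirrors the failure in the stack trace: AbstractMap.putAll -> AbstractMap.put throws.
            mergedErrors.putAll(partitionErrors);
        } catch (UnsupportedOperationException e) {
            System.out.println("putAll on an unmodifiable map: " + e);
        }

        // Usual fix: accumulate into a map that is known to be mutable.
        Map<String, Short> mutable = new HashMap<>(mergedErrors);
        mutable.putAll(partitionErrors);
        System.out.println(mutable);
    }
}
```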
[jira] [Commented] (KAFKA-5442) Streams producer `client.id` are not unique for EOS
[ https://issues.apache.org/jira/browse/KAFKA-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048509#comment-16048509 ] ASF GitHub Bot commented on KAFKA-5442: --- GitHub user mjsax opened a pull request: https://github.com/apache/kafka/pull/3329 KAFKA-5442: Streams producer client.id are not unique for EOS You can merge this pull request into a Git repository by running: $ git pull https://github.com/mjsax/kafka kafka-5442-producer-id-conflict Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3329.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3329 commit f1162dbe5f41d40f644189af5fab3bb171ed9444 Author: Matthias J. Sax Date: 2017-06-13T23:05:11Z KAFKA-5442: Streams producer client.id are not unique for EOS > Streams producer `client.id` are not unique for EOS > --- > > Key: KAFKA-5442 > URL: https://issues.apache.org/jira/browse/KAFKA-5442 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Blocker > Fix For: 0.11.0.0 > > > With the producer-per-task model in EOS, the producer `client.id` must encode > the `task.id` to make IDs unique. Currently, only the thread-id is encoded, resulting > in naming conflicts. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5363) Add ability to batch restore and receive restoration stats.
[ https://issues.apache.org/jira/browse/KAFKA-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048378#comment-16048378 ] ASF GitHub Bot commented on KAFKA-5363: --- GitHub user bbejeck opened a pull request: https://github.com/apache/kafka/pull/3325 KAFKA-5363 [WIP]: Initial cut at implementing bulk load for persistent state stores You can merge this pull request into a Git repository by running: $ git pull https://github.com/bbejeck/kafka KAFKA-5363_add_ability_to_batch_restore Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3325.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3325 commit 012cc64568ed95fc8cc240536f645179aa3a7cec Author: Bill Bejeck Date: 2017-06-06T21:49:25Z KAFKA-5363: Initial cut at implementing bulk load for persistent state stores KAFKA-5363: Initial cut at implementing bulk load for persistent state stores KAFKA-5363: added state store recovery benchmark KAFKA-5363: remove unused import. KAFKA-5363: Separated out listening to restoration events in separate interface, allows for notifications from current state stores and custom state stores. KAFKA-5363: minor cleanup > Add ability to batch restore and receive restoration stats. > --- > > Key: KAFKA-5363 > URL: https://issues.apache.org/jira/browse/KAFKA-5363 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Bill Bejeck >Assignee: Bill Bejeck > Labels: kip > Fix For: 0.11.1.0 > > > Currently, when restoring a state store in a Kafka Streams application, we > put one key-value at a time into the store. > This task aims to make this recovery more efficient by creating a new > interface with "restoreAll" functionality allowing for bulk writes by the > underlying state store implementation. > The proposal will also add "beginRestore" and "endRestore" callback methods, > potentially used for: > tracking when the bulk restoration process begins and ends, and > keeping track of the number of records and last offset restored. > KIP: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-167%3A+Add+interface+for+the+state+store+restoration+process -- This message was sent by Atlassian JIRA (v6.4.14#64029)
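An illustrative Java shape for the proposed callbacks, using the names from the description (restoreAll, beginRestore, endRestore); the final KIP-167 interfaces may use different names and signatures.

```java
import java.util.Collection;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.streams.KeyValue;

// Illustrative shape of the proposed callbacks; names follow the issue description,
// not necessarily the final KIP-167 interfaces.
interface BatchRestoreCallback {
    // Bulk write of restored records, letting the underlying store batch them efficiently.
    void restoreAll(Collection<KeyValue<byte[], byte[]>> records);
}

interface RestoreListener {
    void beginRestore(TopicPartition changelog, String storeName, long startingOffset);

    // Called as batches are applied; useful for tracking records restored so far and the last offset.
    void onBatchRestored(TopicPartition changelog, String storeName, long batchEndOffset, long numRestored);

    void endRestore(TopicPartition changelog, String storeName, long totalRestored);
}
```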
[jira] [Commented] (KAFKA-5354) MirrorMaker not preserving headers
[ https://issues.apache.org/jira/browse/KAFKA-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048318#comment-16048318 ] ASF GitHub Bot commented on KAFKA-5354: --- GitHub user michaelandrepearce opened a pull request: https://github.com/apache/kafka/pull/3323 KAFKA-5354: MirrorMaker not preserving headers Add test case to ensure fix and avoid regression Update mirror maker for new consumer to preserve headers You can merge this pull request into a Git repository by running: $ git pull https://github.com/IG-Group/kafka KAFKA-5354-0.11.0 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3323.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3323 commit 8dc160a74b2cc660f073d56ed77587e545951573 Author: Michael Andre PearceDate: 2017-06-13T19:52:50Z KAFKA-5354: MirrorMaker not preserving headers Add test case to ensure fix and avoid regression Update mirror maker for new consumer to preserve headers > MirrorMaker not preserving headers > -- > > Key: KAFKA-5354 > URL: https://issues.apache.org/jira/browse/KAFKA-5354 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.11.0.0 >Reporter: Jun Rao > > Currently, it doesn't seem that MirrorMaker preserves headers in > BaseConsumerRecord. So, headers won't be preserved during mirroring. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5354) MirrorMaker not preserving headers
[ https://issues.apache.org/jira/browse/KAFKA-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048298#comment-16048298 ] ASF GitHub Bot commented on KAFKA-5354: --- GitHub user michaelandrepearce opened a pull request: https://github.com/apache/kafka/pull/3322 KAFKA-5354: MirrorMaker not preserving headers Add test case to ensure fix and avoid regression Update mirror maker for new consumer to preserve headers You can merge this pull request into a Git repository by running: $ git pull https://github.com/IG-Group/kafka KAFKA-5354 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3322.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3322 commit 9483f4590efa2c1e2662244a34ab046ca24e06c5 Author: Michael Andre PearceDate: 2017-06-13T19:36:19Z KAFKA-5354: MirrorMaker not preserving headers Add test case to ensure fix and avoid regression Update mirror maker for new consumer to preserve headers > MirrorMaker not preserving headers > -- > > Key: KAFKA-5354 > URL: https://issues.apache.org/jira/browse/KAFKA-5354 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.11.0.0 >Reporter: Jun Rao > > Currently, it doesn't seem that MirrorMaker preserves headers in > BaseConsumerRecord. So, headers won't be preserved during mirroring. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5370) Replace uses of old consumer with the new consumer
[ https://issues.apache.org/jira/browse/KAFKA-5370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048236#comment-16048236 ] ASF GitHub Bot commented on KAFKA-5370: --- GitHub user vahidhashemian opened a pull request: https://github.com/apache/kafka/pull/3320 KAFKA-5370 (WIP): Replace uses of the old consumer with the new consumer when possible Also, methods in `ClientUtils` that are called by server or tools code should be introduced in `AdminUtils` with the implementation living in `AdminUtils`. All the existing callers apart from the old clients should call the `AdminUtils` methods. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vahidhashemian/kafka KAFKA-5370 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3320.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3320 commit 8e09061514d7b4f30abe59dd9321c884d4aa9101 Author: Vahid HashemianDate: 2017-06-13T15:36:31Z KAFKA-5370: Replace use of old consumer with the new consumer when possible - Uses of the old consumers in tools and tests where the new consumer would work as well (or better). - Methods in `ClientUtils` that are called by server or tools code should be introduced in `AdminUtils` with the implementation living in `AdminUtils`. All the existing callers apart from the old clients should call the `AdminUtils` methods. > Replace uses of old consumer with the new consumer > --- > > Key: KAFKA-5370 > URL: https://issues.apache.org/jira/browse/KAFKA-5370 > Project: Kafka > Issue Type: Improvement >Reporter: Vahid Hashemian >Assignee: Vahid Hashemian >Priority: Minor > > Where possible, use the new consumer In tools and tests instead of the old > consumer, and remove the deprecation warning. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5274) Review and improve AdminClient Javadoc for the first release (KIP-117)
[ https://issues.apache.org/jira/browse/KAFKA-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047966#comment-16047966 ] ASF GitHub Bot commented on KAFKA-5274: --- GitHub user ijuma opened a pull request: https://github.com/apache/kafka/pull/3316 [WIP] KAFKA-5274: AdminClient Javadoc improvements Also improve the Javadoc for ExtendedSerializer, ExtendedDeserializer and InterfaceStability. Publish Javadoc for common.annotation package, which contains InterfaceStability. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ijuma/kafka kafka-5274-admin-client-javadoc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3316.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3316 commit 3dd7718b7e0dd176307a1717f3a529594ccdd9f5 Author: Ismael JumaDate: 2017-06-13T13:56:05Z Publish javadoc for common.annotation package It includes InterfaceStability annotation. commit f8deae27fa04cdc6fbd62ef55880112e1a4d1a61 Author: Ismael Juma Date: 2017-06-13T14:09:20Z Javadoc for ExtendedSerializer and ExtendedDeserializer and add warning about usage commit 1cf4cf1414c242714c41e5e2faa61c23f83ca0ab Author: Ismael Juma Date: 2017-06-13T14:10:44Z Clarify InterfaceStability commit 7adc0d8297a46d6f61a5451c28fda62fb5ef4ba7 Author: Ismael Juma Date: 2017-06-13T14:27:46Z Various javadoc improvements to API classes introduced for the AdminClient commit 38ba7beed8685a67e606df3306e225885cc19500 Author: Ismael Juma Date: 2017-06-13T14:27:53Z WIP > Review and improve AdminClient Javadoc for the first release (KIP-117) > -- > > Key: KAFKA-5274 > URL: https://issues.apache.org/jira/browse/KAFKA-5274 > Project: Kafka > Issue Type: Sub-task >Reporter: Ismael Juma >Assignee: Ismael Juma > Fix For: 0.11.0.0 > > > Once all the AdminClient pieces are in, we should take a pass at the Javadoc > and improve it wherever possible. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5439) Add checks in integration tests to verify that threads have been shutdown
[ https://issues.apache.org/jira/browse/KAFKA-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047835#comment-16047835 ] ASF GitHub Bot commented on KAFKA-5439: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3314 > Add checks in integration tests to verify that threads have been shutdown > -- > > Key: KAFKA-5439 > URL: https://issues.apache.org/jira/browse/KAFKA-5439 > Project: Kafka > Issue Type: Improvement >Reporter: Rajini Sivaram >Assignee: Rajini Sivaram > Fix For: 0.11.1.0 > > > We have seen several test failures in integration tests due to threads being > left behind because brokers, producers or ZooKeeper clients haven't been > closed properly in tests. Add a check so that these failures can be caught > sooner since transient failures caused by port reuse or update of static JAAS > configuration are much harder to debug. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5418) ZkUtils.getAllPartitions() may fail if a topic is marked for deletion
[ https://issues.apache.org/jira/browse/KAFKA-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047807#comment-16047807 ] ASF GitHub Bot commented on KAFKA-5418: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3295 > ZkUtils.getAllPartitions() may fail if a topic is marked for deletion > - > > Key: KAFKA-5418 > URL: https://issues.apache.org/jira/browse/KAFKA-5418 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.9.0.1, 0.10.2.1 >Reporter: Edoardo Comar >Assignee: Mickael Maison > Fix For: 0.11.0.0 > > > Running {{ZkUtils.getAllPartitions()}} on a cluster which had a topic stuck > in the 'marked for deletion' state > so it was a child of {{/brokers/topics}} > but it had no children, i.e. the path {{/brokers/topics/thistopic/partitions}} > did not exist, throws a ZkNoNodeException while iterating: > {noformat} > rg.I0Itec.zkclient.exception.ZkNoNodeException: > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = > NoNode for /brokers/topics/xyzblahfoo/partitions > at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47) > at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:995) > at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:675) > at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:671) > at kafka.utils.ZkUtils.getChildren(ZkUtils.scala:537) > at > kafka.utils.ZkUtils$$anonfun$getAllPartitions$1.apply(ZkUtils.scala:817) > at > kafka.utils.ZkUtils$$anonfun$getAllPartitions$1.apply(ZkUtils.scala:816) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) > at scala.collection.Iterator$class.foreach(Iterator.scala:742) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1194) > at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) > at scala.collection.AbstractIterable.foreach(Iterable.scala:54) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:245) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at kafka.utils.ZkUtils.getAllPartitions(ZkUtils.scala:816) > ... > at java.lang.Thread.run(Thread.java:809) > Caused by: org.apache.zookeeper.KeeperException$NoNodeException: > KeeperErrorCode = NoNode for /brokers/topics/xyzblahfoo/partitions > at org.apache.zookeeper.KeeperException.create(KeeperException.java:111) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472) > at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500) > at org.I0Itec.zkclient.ZkConnection.getChildren(ZkConnection.java:114) > at org.I0Itec.zkclient.ZkClient$4.call(ZkClient.java:678) > at org.I0Itec.zkclient.ZkClient$4.call(ZkClient.java:675) > at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:985) > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
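For illustration, a minimal sketch of the defensive pattern such a fix needs, assuming the I0Itec ZkClient that ZkUtils wraps: check that the topic's partitions path exists (and still catch ZkNoNodeException for the race) before listing children, so a topic stuck in "marked for deletion" is skipped instead of failing the whole listing. The topic and path layout follow the paths quoted in the issue; the method itself is hypothetical, not the actual ZkUtils code.

{code}
import java.util.Collections;
import java.util.List;

import org.I0Itec.zkclient.ZkClient;
import org.I0Itec.zkclient.exception.ZkNoNodeException;

public class PartitionListingSketch {

    // Return the partition ids of a topic, or an empty list if the
    // /brokers/topics/<topic>/partitions node is missing (e.g. topic marked for deletion).
    static List<String> partitionsOrEmpty(ZkClient zkClient, String topic) {
        String path = "/brokers/topics/" + topic + "/partitions";
        if (!zkClient.exists(path)) {
            return Collections.emptyList();
        }
        try {
            return zkClient.getChildren(path);
        } catch (ZkNoNodeException e) {
            // The node disappeared between exists() and getChildren(): treat as no partitions.
            return Collections.emptyList();
        }
    }
}
{code}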
[jira] [Commented] (KAFKA-5439) Add checks in integration tests to verify that threads have been shutdown
[ https://issues.apache.org/jira/browse/KAFKA-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047555#comment-16047555 ] ASF GitHub Bot commented on KAFKA-5439: --- GitHub user rajinisivaram opened a pull request: https://github.com/apache/kafka/pull/3314 KAFKA-5439: Verify that no unexpected threads are left behind in tests You can merge this pull request into a Git repository by running: $ git pull https://github.com/rajinisivaram/kafka KAFKA-5439 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3314.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3314 commit d76f37bfb46a72245b7e9305bebc1a4fd240b227 Author: Rajini SivaramDate: 2017-06-13T07:33:43Z KAFKA-5439: Verify that no unexpected threads are left behind in tests > Add checks in integration tests to verify that threads have been shutdown > -- > > Key: KAFKA-5439 > URL: https://issues.apache.org/jira/browse/KAFKA-5439 > Project: Kafka > Issue Type: Improvement >Reporter: Rajini Sivaram >Assignee: Rajini Sivaram > > We have seen several test failures in integration tests due to threads being > left behind because brokers, producers or ZooKeeper clients haven't been > closed properly in tests. Add a check so that these failures can be caught > sooner since transient failures caused by port reuse or update of static JAAS > configuration are much harder to debug. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5438) UnsupportedOperationException in WriteTxnMarkers handler
[ https://issues.apache.org/jira/browse/KAFKA-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047417#comment-16047417 ] ASF GitHub Bot commented on KAFKA-5438: --- GitHub user apurvam opened a pull request: https://github.com/apache/kafka/pull/3313 KAFKA-5438: Fix UnsupportedOperationException in writeTxnMarkersRequest Before this patch, the `partitionErrors` was an immutable map. As a result if a single producer had a marker for multiple partitions, and if there were multiple response callbacks for a single append, we would get an `UnsupportedOperationException` in the `writeTxnMarker` handler. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apurvam/kafka KAFKA-5438-fix-unsupportedoperationexception-in-writetxnmarker Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3313.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3313 commit 0fc523aefb8a30f4fdebf166f904a780b40f6965 Author: Apurva MehtaDate: 2017-06-13T04:47:59Z Make partitionErrors a mutable map in KafkaApis.handleWriteTxnMarkerRequest > UnsupportedOperationException in WriteTxnMarkers handler > > > Key: KAFKA-5438 > URL: https://issues.apache.org/jira/browse/KAFKA-5438 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Apurva Mehta >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > {code} > [2017-06-10 19:16:36,280] ERROR [KafkaApi-2] Error when handling request > {replica_id=1,max_wait_time=500,min_bytes=1,max_bytes=10485760,isolation_level=0,topics=[{topic=__transaction_state,partitions=[{partition=45,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=15,f > etch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=20,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=3,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=21,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=29,fetch_offset=0,lo > 
g_start_offset=0,max_bytes=1048576},{partition=39,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=38,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=9,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=33,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=32,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=17,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=23,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=47,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=26,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=5,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=27,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=41,fetch_offset=48,log_start_offset=0,max_bytes=1048576},{partition=35,fetch_offset=0,log_start_offset=0,max_bytes=1048576}]},{topic=__consumer_offsets,partitions=[{partition=8,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=21,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=27,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=9,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=41,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=33,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=23,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=47,fetch_offset=41,log_start_offset=0,max_bytes=1048576},{partition=3,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=15,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=17,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=11,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=14,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=20,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=39,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=45,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=5,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=26,fetch_offset=0,log_start_offset=0,max_bytes=1048576},{partition=29,fetch_offset=0,log_start_offset=0,max_bytes=1048576}]},{topic=output-topic,partitions=[{partition=1,fetch_offset=4522,log_start_offset=0,max_bytes=1048576}]}]} > (kafka.server.KafkaApis) > java.lang.UnsupportedOperationException > at java.util.AbstractMap.put(AbstractMap.java:203) > at java.util.AbstractMap.putAll(AbstractMap.java:273) > at >
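The stack trace above ends in java.util.AbstractMap.put, i.e. a put on a map that does not support mutation. The following self-contained sketch shows that failure mode and the shape of the fix (accumulate per-partition errors in a mutable copy); the key and value types are stand-ins, not the actual KafkaApis code.

{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class MutableErrorsSketch {
    public static void main(String[] args) {
        // An unmodifiable map: any put/putAll throws UnsupportedOperationException,
        // which is what the WriteTxnMarkers handler hit when a second callback added errors.
        Map<Integer, String> immutableErrors = Collections.emptyMap();
        try {
            immutableErrors.put(0, "NONE");
        } catch (UnsupportedOperationException e) {
            System.out.println("put on immutable map failed as expected");
        }

        // Shape of the fix: accumulate into a mutable copy instead.
        Map<Integer, String> partitionErrors = new HashMap<>(immutableErrors);
        partitionErrors.put(0, "NONE");   // first callback
        partitionErrors.put(1, "NONE");   // later callback for the same producer's markers
        System.out.println(partitionErrors);
    }
}
{code}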
[jira] [Commented] (KAFKA-5428) Transactional producer aborts batches incorrectly in abortable error state
[ https://issues.apache.org/jira/browse/KAFKA-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047303#comment-16047303 ] ASF GitHub Bot commented on KAFKA-5428: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3298 > Transactional producer aborts batches incorrectly in abortable error state > -- > > Key: KAFKA-5428 > URL: https://issues.apache.org/jira/browse/KAFKA-5428 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > We currently abort batches blindly if we are in any error state. We should > only do this if we are in a fatal error state. Otherwise, we risk > OutOfOrderSequence errors if a failed produce request had actually been > written successfully to the topic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
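A hedged sketch of the distinction the fix draws, using an invented state enum rather than the real TransactionManager internals: in-flight batches should only be failed outright when the producer is in a fatal error state, not on every error.

{code}
public class AbortOnFatalSketch {

    // Invented for illustration; the real producer tracks more states than this.
    enum State { READY, ABORTABLE_ERROR, FATAL_ERROR }

    private State state = State.READY;

    // Before the fix: batches were aborted in *any* error state, risking
    // OutOfOrderSequence if the produce request had actually succeeded on the broker.
    boolean shouldAbortUndrainedBatchesBuggy() {
        return state == State.ABORTABLE_ERROR || state == State.FATAL_ERROR;
    }

    // After the fix: only a fatal error justifies throwing the batches away.
    boolean shouldAbortUndrainedBatches() {
        return state == State.FATAL_ERROR;
    }

    public static void main(String[] args) {
        AbortOnFatalSketch producer = new AbortOnFatalSketch();
        producer.state = State.ABORTABLE_ERROR;
        System.out.println("buggy behaviour would abort: " + producer.shouldAbortUndrainedBatchesBuggy());
        System.out.println("fixed behaviour aborts: " + producer.shouldAbortUndrainedBatches());
    }
}
{code}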
[jira] [Commented] (KAFKA-5362) Add EOS system tests for Streams API
[ https://issues.apache.org/jira/browse/KAFKA-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047282#comment-16047282 ] ASF GitHub Bot commented on KAFKA-5362: --- GitHub user mjsax opened a pull request: https://github.com/apache/kafka/pull/3310 KAFKA-5362: Add Streams EOS system test with repartitioning topic You can merge this pull request into a Git repository by running: $ git pull https://github.com/mjsax/kafka kafka-5362-add-eos-system-tests-for-streams-api Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3310.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3310 commit 71afa216515d187553b7e6b0415308be8d80c675 Author: Matthias J. SaxDate: 2017-06-10T06:05:45Z KAFKA-5362: Add Streams EOS system test with repartitioning topic > Add EOS system tests for Streams API > > > Key: KAFKA-5362 > URL: https://issues.apache.org/jira/browse/KAFKA-5362 > Project: Kafka > Issue Type: Bug > Components: streams, system tests >Affects Versions: 0.11.0.0 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > We need to add more system tests for Streams API with exactly-once enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5429) Producer IllegalStateException: Batch has already been completed
[ https://issues.apache.org/jira/browse/KAFKA-5429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047222#comment-16047222 ] ASF GitHub Bot commented on KAFKA-5429: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3300 > Producer IllegalStateException: Batch has already been completed > > > Key: KAFKA-5429 > URL: https://issues.apache.org/jira/browse/KAFKA-5429 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Jason Gustafson > Fix For: 0.11.0.0 > > > I've seen this a few times in system tests: > {code} > [2017-06-10 19:47:38,434] ERROR Uncaught error in request completion: > (org.apache.kafka.clients.NetworkClient) > java.lang.IllegalStateException: Batch has already been completed > at > org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:157) > at > org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:576) > at > org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:555) > at > org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:479) > at > org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:75) > at > org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:666) > at > org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101) > at > org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:454) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:446) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:206) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:162) > at java.lang.Thread.run(Thread.java:745) > [2 > {code} > I think this is probably caused by aborting in-flight batches after an error > state. See the following log: > {code} > [2017-06-10 19:47:38,425] ERROR Aborting producer batches due to fatal error > (org.apache.kafka.clients.producer.internals.Sender) > org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker > received an out of order sequence number > [2017-06-10 19:47:38,425] DEBUG [TransactionalId my-first-transactional-id] > Transition from state ABORTABLE_ERROR to ABORTING_TRANSACTION > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:47:38,425] TRACE Produced messages to topic-partition > output-topic-0 with base offset offset -1 and error: {}. 
> (org.apache.kafka.clients.producer.internals.ProducerBatch) > org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker > received an out of order sequence number > [2017-06-10 19:47:38,425] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=EndTxnRequest, > transactionalId=my-first-transactional-id, producerId=2000, producerEpoch=0, > result=ABORT) (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:47:38,426] TRACE [TransactionalId my-first-transactional-id] > Request (type=EndTxnRequest, transactionalId=my-first-transactional-id, > producerId=2000, producerEpoch=0, result=ABORT) dequeued for sending > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:47:38,426] DEBUG [TransactionalId my-first-transactional-id] > Sending transactional request (type=EndTxnRequest, > transactionalId=my-first-transactional-id, producerId=2000, producerEpoch=0, > result=ABORT) to node worker11:9092 (id: 3 rack: null) > (org.apache.kafka.clients.producer.internals.Sender) > [2017-06-10 19:47:38,434] TRACE Received produce response from node 2 with > correlation id 514 (org.apache.kafka.clients.producer.internals.Sender) > [2017-06-10 19:47:38,434] DEBUG Incremented sequence number for > topic-partition output-topic-0 to 4500 > (org.apache.kafka.clients.producer.internals.Sender) > [2017-06-10 19:47:38,434] TRACE Produced messages to topic-partition > output-topic-0 with base offset offset 7033 and error: null. > (org.apache.kafka.clients.producer.internals.ProducerBatch) > [2017-06-10 19:47:38,434] ERROR Uncaught error in request completion: > (org.apache.kafka.clients.NetworkClient) > java.lang.IllegalStateException: Batch has already been completed > {code} > A simple solution is to add a separate flag to indicate that the batch has > been aborted. We can check it when the response returns and skip the callback. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
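A compact sketch of the "separate flag" idea suggested at the end of the description, using an invented Batch class rather than the real ProducerBatch: once a batch has been aborted locally, a late broker response is skipped instead of completing the batch a second time.

{code}
import java.util.concurrent.atomic.AtomicBoolean;

public class AbortableBatchSketch {

    static class Batch {
        private final AtomicBoolean completed = new AtomicBoolean(false);
        private final AtomicBoolean aborted = new AtomicBoolean(false);

        void abort() {
            aborted.set(true);
            complete("aborted locally");
        }

        // Completing twice is the bug reported above.
        void complete(String reason) {
            if (!completed.compareAndSet(false, true)) {
                throw new IllegalStateException("Batch has already been completed");
            }
            System.out.println("batch completed: " + reason);
        }

        // Called from the response handler: skip completion if we already aborted.
        void onBrokerResponse(long baseOffset) {
            if (aborted.get()) {
                System.out.println("ignoring late response for aborted batch at offset " + baseOffset);
                return;
            }
            complete("broker response, base offset " + baseOffset);
        }
    }

    public static void main(String[] args) {
        Batch batch = new Batch();
        batch.abort();                 // fatal error path aborts the in-flight batch
        batch.onBrokerResponse(7033);  // the original response arrives later and is skipped
    }
}
{code}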
[jira] [Commented] (KAFKA-5437) TransactionalMessageCopier should be force killed on test shutdown
[ https://issues.apache.org/jira/browse/KAFKA-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047182#comment-16047182 ] ASF GitHub Bot commented on KAFKA-5437: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3308 > TransactionalMessageCopier should be force killed on test shutdown > -- > > Key: KAFKA-5437 > URL: https://issues.apache.org/jira/browse/KAFKA-5437 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Apurva Mehta > Fix For: 0.11.0.0 > > > We've seen a few cases of the transactional message copier service failing to > be shut down properly in a test case. We should probably kill -9 in > {{clean_node}} like we do with some of the other services. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5437) TransactionalMessageCopier should be force killed on test shutdown
[ https://issues.apache.org/jira/browse/KAFKA-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047171#comment-16047171 ] ASF GitHub Bot commented on KAFKA-5437: --- GitHub user apurvam opened a pull request: https://github.com/apache/kafka/pull/3308 KAFKA-5437: Always send a sig_kill when cleaning the message copier When the message copier hangs (like when there is a bug in the client), it ignores the sigterm and doesn't shut down. This leaves the cluster in an unclean state, causing future tests to fail. In this patch we always send SIGKILL when cleaning the node if the process isn't already dead. This is consistent with the other services. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apurvam/kafka KAFKA-5437-force-kill-message-copier-on-cleanup Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3308.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3308 commit 5364c4830f5f97fbbfbefbf2c92e1d7230490ebf Author: Apurva MehtaDate: 2017-06-12T22:52:45Z Always send a sig_kill when cleaning the message copier > TransactionalMessageCopier should be force killed on test shutdown > -- > > Key: KAFKA-5437 > URL: https://issues.apache.org/jira/browse/KAFKA-5437 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Apurva Mehta > Fix For: 0.11.1.0 > > > We've seen a few cases of the transactional message copier service failing to > be shut down properly in a test case. We should probably kill -9 in > {{clean_node}} like we do with some of the other services. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5427) Transactional producer cannot find coordinator when trying to abort transaction after error
[ https://issues.apache.org/jira/browse/KAFKA-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047131#comment-16047131 ] ASF GitHub Bot commented on KAFKA-5427: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3297 > Transactional producer cannot find coordinator when trying to abort > transaction after error > --- > > Key: KAFKA-5427 > URL: https://issues.apache.org/jira/browse/KAFKA-5427 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > It can happen that we receive an abortable error while we are already > aborting a transaction. In this case, we have an EndTxnRequest queued for > sending when we transition to ABORTABLE_ERROR. It could be that we need to > find the coordinator before sending this EndTxnRequest. The problem is that > we will fail even the FindCoordinatorRequest because we are in an error > state. This causes the following endless loop: > {code} > [2017-06-10 19:29:33,436] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=FindCoordinatorRequest, > coordinatorKey=my-fi > rst-transactional-id, coordinatorType=TRANSACTION) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,436] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=EndTxnRequest, > transactionalId=my-first-tran > sactional-id, producerId=1000, producerEpoch=0, result=ABORT) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,536] TRACE [TransactionalId my-first-transactional-id] > Not sending transactional request (type=FindCoordinatorRequest, > coordinatorKey=my- > first-transactional-id, coordinatorType=TRANSACTION) because we are in an > error state (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,637] TRACE [TransactionalId my-first-transactional-id] > Request (type=EndTxnRequest, transactionalId=my-first-transactional-id, > producerId > =1000, producerEpoch=0, result=ABORT) dequeued for sending > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,637] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=FindCoordinatorRequest, > coordinatorKey=my-fi > rst-transactional-id, coordinatorType=TRANSACTION) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,637] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=EndTxnRequest, > transactionalId=my-first-tran > sactional-id, producerId=1000, producerEpoch=0, result=ABORT) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,737] TRACE [TransactionalId my-first-transactional-id] > Not sending transactional request (type=FindCoordinatorRequest, > coordinatorKey=my- > first-transactional-id, coordinatorType=TRANSACTION) because we are in an > error state (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,837] TRACE [TransactionalId my-first-transactional-id] > Request (type=EndTxnRequest, transactionalId=my-first-transactional-id, > producerId > =1000, producerEpoch=0, result=ABORT) dequeued for sending > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,838] DEBUG [TransactionalId 
my-first-transactional-id] > Enqueuing transactional request (type=FindCoordinatorRequest, > coordinatorKey=my-fi > rst-transactional-id, coordinatorType=TRANSACTION) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,838] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=EndTxnRequest, > transactionalId=my-first-tran > sactional-id, producerId=1000, producerEpoch=0, result=ABORT) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,938] TRACE [TransactionalId my-first-transactional-id] > Not sending transactional request (type=FindCoordinatorRequest, > coordinatorKey=my- > first-transactional-id, coordinatorType=TRANSACTION) because we are in an > error state (org.apache.kafka.clients.producer.internals.TransactionManager) > {code} > A couple suggested improvements: > 1. We should allow FindCoordinator requests regardless of the transaction > state. > 2. It is a bit confusing that we allow EndTxnRequest to be sent in both the > ABORTABLE_ERROR and the ABORTING_TRANSACTION states. Perhaps we should only > allow EndTxnRequest
[jira] [Commented] (KAFKA-3123) Follower Broker cannot start if offsets are already out of range
[ https://issues.apache.org/jira/browse/KAFKA-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047101#comment-16047101 ] ASF GitHub Bot commented on KAFKA-3123: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3296 > Follower Broker cannot start if offsets are already out of range > > > Key: KAFKA-3123 > URL: https://issues.apache.org/jira/browse/KAFKA-3123 > Project: Kafka > Issue Type: Bug > Components: core, replication >Affects Versions: 0.9.0.0 >Reporter: Soumyajit Sahu >Assignee: Soumyajit Sahu >Priority: Critical > Labels: patch > Fix For: 0.11.0.0 > > Attachments: > 0001-Fix-Follower-crashes-when-offset-out-of-range-during.patch > > > I was trying to upgrade our test Windows cluster from 0.8.1.1 to 0.9.0 one > machine at a time. Our logs have just 2 hours of retention. I had re-imaged > the test machine under consideration, and got the following error in loop > after starting afresh with 0.9.0 broker: > [2016-01-19 13:57:28,809] WARN [ReplicaFetcherThread-1-169595708], Replica > 15588 for partition [EventLogs4,1] reset its fetch offset from 0 to > current leader 169595708's start offset 334086 > (kafka.server.ReplicaFetcherThread) > [2016-01-19 13:57:28,809] ERROR [ReplicaFetcherThread-1-169595708], Error > getting offset for partition [EventLogs4,1] to broker 169595708 > (kafka.server.ReplicaFetcherThread) > java.lang.IllegalStateException: Compaction for partition [EXO_EventLogs4,1] > cannot be aborted and paused since it is in LogCleaningPaused state. > at > kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply$mcV$sp(LogCleanerManager.scala:149) > at > kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply(LogCleanerManager.scala:140) > at > kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply(LogCleanerManager.scala:140) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262) > at > kafka.log.LogCleanerManager.abortAndPauseCleaning(LogCleanerManager.scala:140) > at kafka.log.LogCleaner.abortAndPauseCleaning(LogCleaner.scala:141) > at kafka.log.LogManager.truncateFullyAndStartAt(LogManager.scala:304) > at > kafka.server.ReplicaFetcherThread.handleOffsetOutOfRange(ReplicaFetcherThread.scala:185) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:152) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:122) > at scala.Option.foreach(Option.scala:236) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:122) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:120) > at > scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) > at > scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) > at > scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) > at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) > at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:120) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120) > at > 
kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262) > at > kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118) > at > kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:93) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63) > I could unblock myself with a code change. I deleted the action for "case s > =>" in the LogCleanerManager.scala's abortAndPauseCleaning(). I think we > should not throw exception if the state is already LogCleaningAborted or > LogCleaningPaused in this function, but instead just let it roll. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5404) Add more AdminClient checks to ClientCompatibilityTest
[ https://issues.apache.org/jira/browse/KAFKA-5404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047090#comment-16047090 ] ASF GitHub Bot commented on KAFKA-5404: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3263 > Add more AdminClient checks to ClientCompatibilityTest > -- > > Key: KAFKA-5404 > URL: https://issues.apache.org/jira/browse/KAFKA-5404 > Project: Kafka > Issue Type: Bug >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Fix For: 0.11.0.0 > > > Add more AdminClient checks to ClientCompatibilityTest -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-4831) Extract WindowedSerde to public APIs
[ https://issues.apache.org/jira/browse/KAFKA-4831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047003#comment-16047003 ] ASF GitHub Bot commented on KAFKA-4831: --- GitHub user vitaly-pushkar opened a pull request: https://github.com/apache/kafka/pull/3307 KAFKA-4831: Extract WindowedSerde to public APIs Now that we have augmented WindowSerde with non-arg parameters, extract it out as part of the public APIs so that users who want to I/O windowed streams can use it. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vitaly-pushkar/kafka public-windowed-serde Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3307.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3307 commit 1ca0b72b7e119979b4bfe9d05b00295ac9f30ab3 Author: Vitaly PushkarDate: 2017-06-04T22:34:40Z Extract Windowed Serde into the public package API commit 131b273675a773faa1f79eadd1715a1239e7aa6f Author: Vitaly Pushkar Date: 2017-06-12T19:54:08Z Extract Windowed Serde tests into public package > Extract WindowedSerde to public APIs > > > Key: KAFKA-4831 > URL: https://issues.apache.org/jira/browse/KAFKA-4831 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Guozhang Wang >Assignee: Vitaly Pushkar > Labels: newbie, user-experience > > Now that we have augmented WindowSerde with non-arg parameters, the next step > is to extract it out as part of the public APIs so that users who want to > I/O windowed streams can use it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
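To show what "I/O of windowed streams" needs from a public serde, here is a hedged sketch built only on the public Windowed/Window accessors. The WindowedStringSerializer class and its wire format (window start timestamp followed by the key bytes) are hypothetical illustrations, not the class this ticket extracts.

{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;

import org.apache.kafka.common.serialization.Serializer;
import org.apache.kafka.streams.kstream.Windowed;

// Hypothetical serializer for Windowed<String> keys, e.g. to write windowed aggregates to a topic.
public class WindowedStringSerializer implements Serializer<Windowed<String>> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) { }

    @Override
    public byte[] serialize(String topic, Windowed<String> data) {
        byte[] key = data.key().getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(8 + key.length);
        buf.putLong(data.window().start()); // window start timestamp
        buf.put(key);                       // followed by the raw key
        return buf.array();
    }

    @Override
    public void close() { }
}
{code}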
[jira] [Commented] (KAFKA-5416) TransactionCoordinator doesn't complete transition to CompleteCommit
[ https://issues.apache.org/jira/browse/KAFKA-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16046986#comment-16046986 ] ASF GitHub Bot commented on KAFKA-5416: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3287 > TransactionCoordinator doesn't complete transition to CompleteCommit > > > Key: KAFKA-5416 > URL: https://issues.apache.org/jira/browse/KAFKA-5416 > Project: Kafka > Issue Type: Bug >Reporter: Apurva Mehta >Assignee: Guozhang Wang >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > In regard to this system test: > http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/2017-06-09--001.1496974430--apurvam--MINOR-add-logging-to-transaction-coordinator-in-all-failure-cases--02b3816/TransactionsTest/test_transactions/failure_mode=clean_bounce.bounce_target=brokers/4.tgz > Here are the ownership changes for __transaction_state-37: > {noformat} > ./worker1/debug/server.log:3623:[2017-06-09 01:16:36,551] INFO [Transaction > Log Manager 2]: Trying to remove cached transaction metadata for > __transaction_state-37 on follower transition but there is no entries > remaining; it is likely that another process for removing the cached entries > has just executed earlier before > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:16174:[2017-06-09 01:16:39,963] INFO [Transaction > Log Manager 2]: Loading transaction metadata from __transaction_state-37 > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:16194:[2017-06-09 01:16:39,978] INFO [Transaction > Log Manager 2]: Finished loading 1 transaction metadata from > __transaction_state-37 in 15 milliseconds > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:35633:[2017-06-09 01:16:45,308] INFO [Transaction > Log Manager 2]: Removed 1 cached transaction metadata for > __transaction_state-37 on follower transition > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:40353:[2017-06-09 01:16:52,218] INFO [Transaction > Log Manager 2]: Trying to remove cached transaction metadata for > __transaction_state-37 on follower transition but there is no entries > remaining; it is likelythat another process for removing the cached entries > has just executed earlier before > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:40664:[2017-06-09 01:16:52,683] INFO [Transaction > Log Manager 2]: Trying to remove cached transaction metadata for > __transaction_state-37 on follower transition but there is no entries > remaining; it is likelythat another process for removing the cached entries > has just executed earlier before > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:146044:[2017-06-09 01:19:08,844] INFO [Transaction > Log Manager 2]: Loading transaction metadata from __transaction_state-37 > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:146048:[2017-06-09 01:19:08,850] INFO [Transaction > Log Manager 2]: Finished loading 1 transaction metadata from > __transaction_state-37 in 6 milliseconds > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:150147:[2017-06-09 01:19:10,995] INFO [Transaction > Log Manager 2]: Removed 1 cached transaction metadata for > __transaction_state-37 on follower transition > (kafka.coordinator.transaction.TransactionStateManager) > 
./worker2/debug/server.log:3413:[2017-06-09 01:16:36,109] INFO [Transaction > Log Manager 1]: Loading transaction metadata from __transaction_state-37 > (kafka.coordinator.transaction.TransactionStateManager) > ./worker2/debug/server.log:20889:[2017-06-09 01:16:39,991] INFO [Transaction > Log Manager 1]: Removed 1 cached transaction metadata for > __transaction_state-37 on follower transition > (kafka.coordinator.transaction.TransactionStateManager) > ./worker2/debug/server.log:26084:[2017-06-09 01:16:46,682] INFO [Transaction > Log Manager 1]: Trying to remove cached transaction metadata for > __transaction_state-37 on follower transition but there is no entries > remaining; it is likelythat another process for removing the cached entries > has just executed earlier before > (kafka.coordinator.transaction.TransactionStateManager) > ./worker2/debug/server.log:26322:[2017-06-09 01:16:47,153] INFO [Transaction > Log Manager 1]: Trying to remove cached transaction metadata for > __transaction_state-37 on follower transition but there is no entries > remaining; it is likelythat another process for removing the cached entries > has just executed earlier
[jira] [Commented] (KAFKA-5435) Produce state lost if no snapshot retained
[ https://issues.apache.org/jira/browse/KAFKA-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16046868#comment-16046868 ] ASF GitHub Bot commented on KAFKA-5435: --- GitHub user hachikuji opened a pull request: https://github.com/apache/kafka/pull/3306 KAFKA-5435: Ensure producer snapshot retained after truncation You can merge this pull request into a Git repository by running: $ git pull https://github.com/hachikuji/kafka KAFKA-5435 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3306.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3306 commit 2e0a797cb5a0baa12032697500fb4a8c5d87b25a Author: Jason GustafsonDate: 2017-06-12T17:38:33Z KAFKA-5435: Ensure producer snapshot retained after truncation > Produce state lost if no snapshot retained > -- > > Key: KAFKA-5435 > URL: https://issues.apache.org/jira/browse/KAFKA-5435 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Blocker > Fix For: 0.11.0.0 > > > We have an optimization in {{Log}} to avoid the need to scan the log to build > producer state during the upgrade path. Basically, if no producer snapshot > exists, then we assume that it's an upgrade and take a new snapshot from the > end of the log. Unfortunately, it can happen that snapshot files are never > created or are deleted through truncation. Upon reinitialization, this can > cause the optimization above to kick in and we lose the current state of all > producers. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5433) Transient test failure: SaslPlainSslEndToEndAuthorizationTest.testNoProduceWithDescribeAcl
[ https://issues.apache.org/jira/browse/KAFKA-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16046712#comment-16046712 ] ASF GitHub Bot commented on KAFKA-5433: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3304 > Transient test failure: > SaslPlainSslEndToEndAuthorizationTest.testNoProduceWithDescribeAcl > -- > > Key: KAFKA-5433 > URL: https://issues.apache.org/jira/browse/KAFKA-5433 > Project: Kafka > Issue Type: Bug > Components: core >Reporter: Rajini Sivaram >Assignee: Rajini Sivaram > Fix For: 0.11.0.0, 0.11.1.0 > > > We seem to be forever committing tests without proper cleanup. > From https://builds.apache.org/job/kafka-trunk-jdk7/2377/: > {quote} > java.lang.SecurityException: zookeeper.set.acl is true, but the verification > of the JAAS login file failed. > at kafka.server.KafkaServer.initZk(KafkaServer.scala:325) > at kafka.server.KafkaServer.startup(KafkaServer.scala:191) > at kafka.utils.TestUtils$.createServer(TestUtils.scala:133) > at > kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:91) > at > kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:91) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.Iterator$class.foreach(Iterator.scala:891) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) > at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) > at scala.collection.AbstractIterable.foreach(Iterable.scala:54) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at > kafka.integration.KafkaServerTestHarness.setUp(KafkaServerTestHarness.scala:91) > at > kafka.api.IntegrationTestHarness.setUp(IntegrationTestHarness.scala:64) > at > kafka.api.EndToEndAuthorizationTest.setUp(EndToEndAuthorizationTest.scala:158) > at > kafka.api.SaslEndToEndAuthorizationTest.setUp(SaslEndToEndAuthorizationTest.scala:47) > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5433) Transient test failure: SaslPlainSslEndToEndAuthorizationTest.testNoProduceWithDescribeAcl
[ https://issues.apache.org/jira/browse/KAFKA-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16046613#comment-16046613 ] ASF GitHub Bot commented on KAFKA-5433: --- GitHub user rajinisivaram opened a pull request: https://github.com/apache/kafka/pull/3304 KAFKA-5433: Close SimpleAclAuthorizer in test to close ZK client You can merge this pull request into a Git repository by running: $ git pull https://github.com/rajinisivaram/kafka KAFKA-5433 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3304.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3304 commit 4c634f41d4ed63faea1227e8de226a6d5d510b80 Author: Rajini SivaramDate: 2017-06-12T14:19:31Z KAFKA-5433: Close SimpleAclAuthorizer in test to close ZK client > Transient test failure: > SaslPlainSslEndToEndAuthorizationTest.testNoProduceWithDescribeAcl > -- > > Key: KAFKA-5433 > URL: https://issues.apache.org/jira/browse/KAFKA-5433 > Project: Kafka > Issue Type: Bug > Components: core >Reporter: Rajini Sivaram >Assignee: Rajini Sivaram > > We seem to be forever committing tests without proper cleanup. > From https://builds.apache.org/job/kafka-trunk-jdk7/2377/: > {quote} > java.lang.SecurityException: zookeeper.set.acl is true, but the verification > of the JAAS login file failed. > at kafka.server.KafkaServer.initZk(KafkaServer.scala:325) > at kafka.server.KafkaServer.startup(KafkaServer.scala:191) > at kafka.utils.TestUtils$.createServer(TestUtils.scala:133) > at > kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:91) > at > kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:91) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.Iterator$class.foreach(Iterator.scala:891) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) > at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) > at scala.collection.AbstractIterable.foreach(Iterable.scala:54) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at > kafka.integration.KafkaServerTestHarness.setUp(KafkaServerTestHarness.scala:91) > at > kafka.api.IntegrationTestHarness.setUp(IntegrationTestHarness.scala:64) > at > kafka.api.EndToEndAuthorizationTest.setUp(EndToEndAuthorizationTest.scala:158) > at > kafka.api.SaslEndToEndAuthorizationTest.setUp(SaslEndToEndAuthorizationTest.scala:47) > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
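The fix is the standard resource-cleanup pattern in test teardown. Below is a hedged JUnit-style sketch of that pattern with a generic Closeable standing in for the authorizer; the actual change closes SimpleAclAuthorizer so that its ZooKeeper client does not outlive the test and pollute later runs.

{code}
import java.io.Closeable;
import java.io.IOException;

import org.junit.After;

public abstract class AuthorizerCleanupSketch {

    // Stand-in for the SimpleAclAuthorizer instance a test creates in setUp().
    protected Closeable authorizer;

    @After
    public void tearDown() throws IOException {
        // Without this, the authorizer's ZooKeeper client leaks into subsequent tests,
        // showing up as leftover threads or stale JAAS/ZooKeeper state.
        if (authorizer != null) {
            authorizer.close();
        }
    }
}
{code}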
[jira] [Commented] (KAFKA-5402) JmxReporter Fetch metrics for kafka.server should not be created when client quotas are not enabled
[ https://issues.apache.org/jira/browse/KAFKA-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16046540#comment-16046540 ] ASF GitHub Bot commented on KAFKA-5402: --- GitHub user rajinisivaram opened a pull request: https://github.com/apache/kafka/pull/3303 KAFKA-5402: Avoid creating quota related metrics if quotas not enabled You can merge this pull request into a Git repository by running: $ git pull https://github.com/rajinisivaram/kafka KAFKA-5402 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3303.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3303 commit f69011760d67a4b85d5049014e5929fbdc661858 Author: Rajini SivaramDate: 2017-06-12T12:12:14Z KAFKA-5402: Avoid creating quota related metrics if quotas not enabled > JmxReporter Fetch metrics for kafka.server should not be created when client > quotas are not enabled > --- > > Key: KAFKA-5402 > URL: https://issues.apache.org/jira/browse/KAFKA-5402 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.10.1.1 >Reporter: Koelli Mungee >Assignee: Rajini Sivaram > Attachments: Fetch.jpg, Metrics.jpg > > > JMXReporter kafka.server Fetch metrics should not be created when client > quotas are not enforced for client fetch requests. Currently, these metrics > are created and this can cause OutOfMemoryException in the KafkaServer in > cases where a large number of consumers are being created rapidly. > Attaching screenshots from a heapdump showing the > kafka.server:type=Fetch,client-id=consumer-358567 with different client.ids > from a kafkaserver where client quotas were not enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
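A rough sketch of the change's intent using the public org.apache.kafka.common.metrics API: only materialize a per-client sensor when a quota actually applies, so unthrottled clients that are created and discarded rapidly do not each leave a metric behind. The quota lookup and sensor naming below are invented for illustration, not the broker's actual quota manager code.

{code}
import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Rate;

public class LazyQuotaSensorSketch {

    private final Metrics metrics = new Metrics();

    // Invented predicate: in the real broker this is derived from the configured client quotas.
    private boolean quotasEnabled(String clientId) {
        return false;
    }

    // Record fetch throughput for a client, but only create a sensor when quotas apply.
    public void recordFetch(String clientId, double bytes) {
        if (!quotasEnabled(clientId)) {
            return; // no quota -> no per-client Fetch sensor, so consumer churn cannot exhaust the heap
        }
        String name = "Fetch-" + clientId;
        Sensor sensor = metrics.getSensor(name);
        if (sensor == null) {
            sensor = metrics.sensor(name);
            sensor.add(metrics.metricName("byte-rate", "Fetch", "fetch byte rate for " + clientId), new Rate());
        }
        sensor.record(bytes);
    }
}
{code}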
[jira] [Commented] (KAFKA-5409) Providing a custom client-id to the ConsoleProducer tool
[ https://issues.apache.org/jira/browse/KAFKA-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16046434#comment-16046434 ] ASF GitHub Bot commented on KAFKA-5409: --- GitHub user ppatierno opened a pull request: https://github.com/apache/kafka/pull/3302 KAFKA-5409: Providing a custom client-id to the ConsoleProducer tool You can merge this pull request into a Git repository by running: $ git pull https://github.com/ppatierno/kafka kafka-5409 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3302.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3302 commit 591cceaa0fccd76086e4a6a2c232a5e1da85c183 Author: ppatiernoDate: 2017-06-12T10:48:22Z Fixed the possibility to specify the client.id without being overwritten with a fixed one > Providing a custom client-id to the ConsoleProducer tool > > > Key: KAFKA-5409 > URL: https://issues.apache.org/jira/browse/KAFKA-5409 > Project: Kafka > Issue Type: Improvement > Components: tools >Reporter: Paolo Patierno >Priority: Minor > > Hi, > I see that the client-id properties for the ConsoleProducer tool is always > "console-producer". It could be useful having it as parameter on the command > line or generating a random one like happens for the ConsolerConsumer. > If it makes sense to you, I can work on that. > Thanks, > Paolo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
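For reference, a small sketch of what the request amounts to at the producer-config level: the tool currently pins client.id to "console-producer", whereas any caller of the Java producer can choose the value. The ProducerConfig key is the real one; the broker address, topic and client id below are placeholders.

{code}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ClientIdSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // The point of the ticket: let the user pick this instead of a fixed "console-producer".
        props.put(ProducerConfig.CLIENT_ID_CONFIG, "my-console-session");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "hello")); // topic name is a placeholder
        }
    }
}
{code}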
[jira] [Commented] (KAFKA-5428) Transactional producer aborts batches incorrectly in abortable error state
[ https://issues.apache.org/jira/browse/KAFKA-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045857#comment-16045857 ] ASF GitHub Bot commented on KAFKA-5428: --- GitHub user hachikuji opened a pull request: https://github.com/apache/kafka/pull/3298 KAFKA-5428: Transactional producer should only abort batches in fatal error state You can merge this pull request into a Git repository by running: $ git pull https://github.com/hachikuji/kafka KAFKA-5428 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3298.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3298 commit 5bd97c0e10ba73a79299d40d3223718acfe7e97d Author: Jason GustafsonDate: 2017-06-11T07:18:05Z KAFKA-5428: Transactional producer should only abort batches in fatal error state > Transactional producer aborts batches incorrectly in abortable error state > -- > > Key: KAFKA-5428 > URL: https://issues.apache.org/jira/browse/KAFKA-5428 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Blocker > Fix For: 0.11.0.0 > > > We currently abort batches blindly if we are in any error state. We should > only do this if we are in a fatal error state. Otherwise, we risk > OutOfOrderSequence errors if a failed produce request had actually been > written successfully to the topic. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5427) Transactional producer cannot find coordinator when trying to abort transaction after error
[ https://issues.apache.org/jira/browse/KAFKA-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045748#comment-16045748 ] ASF GitHub Bot commented on KAFKA-5427: --- GitHub user hachikuji opened a pull request: https://github.com/apache/kafka/pull/3297 KAFKA-5427: Transactional producer should allow FindCoordinator in error state You can merge this pull request into a Git repository by running: $ git pull https://github.com/hachikuji/kafka KAFKA-5427 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3297.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3297 commit 76ffa96261084474cde0db350b96d8e4df9cbc55 Author: Jason GustafsonDate: 2017-06-10T23:31:32Z KAFKA-5427: Transactional producer should allow FindCoordinator in error state > Transactional producer cannot find coordinator when trying to abort > transaction after error > --- > > Key: KAFKA-5427 > URL: https://issues.apache.org/jira/browse/KAFKA-5427 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Blocker > Fix For: 0.11.0.0 > > > It can happen that we receive an abortable error while we are already > aborting a transaction. In this case, we have an EndTxnRequest queued for > sending when we transition to ABORTABLE_ERROR. It could be that we need to > find the coordinator before sending this EndTxnRequest. The problem is that > we will fail even the FindCoordinatorRequest because we are in an error > state. This causes the following endless loop: > {code} > [2017-06-10 19:29:33,436] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=FindCoordinatorRequest, > coordinatorKey=my-fi > rst-transactional-id, coordinatorType=TRANSACTION) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,436] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=EndTxnRequest, > transactionalId=my-first-tran > sactional-id, producerId=1000, producerEpoch=0, result=ABORT) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,536] TRACE [TransactionalId my-first-transactional-id] > Not sending transactional request (type=FindCoordinatorRequest, > coordinatorKey=my- > first-transactional-id, coordinatorType=TRANSACTION) because we are in an > error state (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,637] TRACE [TransactionalId my-first-transactional-id] > Request (type=EndTxnRequest, transactionalId=my-first-transactional-id, > producerId > =1000, producerEpoch=0, result=ABORT) dequeued for sending > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,637] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=FindCoordinatorRequest, > coordinatorKey=my-fi > rst-transactional-id, coordinatorType=TRANSACTION) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,637] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=EndTxnRequest, > transactionalId=my-first-tran > sactional-id, producerId=1000, producerEpoch=0, result=ABORT) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,737] TRACE [TransactionalId 
my-first-transactional-id] > Not sending transactional request (type=FindCoordinatorRequest, > coordinatorKey=my- > first-transactional-id, coordinatorType=TRANSACTION) because we are in an > error state (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,837] TRACE [TransactionalId my-first-transactional-id] > Request (type=EndTxnRequest, transactionalId=my-first-transactional-id, > producerId > =1000, producerEpoch=0, result=ABORT) dequeued for sending > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,838] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=FindCoordinatorRequest, > coordinatorKey=my-fi > rst-transactional-id, coordinatorType=TRANSACTION) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-10 19:29:33,838] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=EndTxnRequest, > transactionalId=my-first-tran > sactional-id, producerId=1000, producerEpoch=0, result=ABORT) >
[jira] [Commented] (KAFKA-4661) Improve test coverage UsePreviousTimeOnInvalidTimestamp
[ https://issues.apache.org/jira/browse/KAFKA-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045721#comment-16045721 ] ASF GitHub Bot commented on KAFKA-4661: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3288 > Improve test coverage UsePreviousTimeOnInvalidTimestamp > --- > > Key: KAFKA-4661 > URL: https://issues.apache.org/jira/browse/KAFKA-4661 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Damian Guy >Assignee: Jeyhun Karimov >Priority: Minor > Fix For: 0.11.1.0 > > > Exception branch not tested -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-3123) Follower Broker cannot start if offsets are already out of range
[ https://issues.apache.org/jira/browse/KAFKA-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045695#comment-16045695 ] ASF GitHub Bot commented on KAFKA-3123: --- GitHub user mimaison opened a pull request: https://github.com/apache/kafka/pull/3296 KAFKA-3123: Follower Broker cannot start if offsets are already out o… …f range From https://github.com/apache/kafka/pull/1716#discussion_r112000498, ensure the cleaner is restarted if Log.truncateTo throws You can merge this pull request into a Git repository by running: $ git pull https://github.com/mimaison/kafka KAFKA-3123 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3296.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3296 commit e12690320d68c7686ecf9ceebe53a4498b8a5f0d Author: Mickael MaisonDate: 2017-06-10T18:49:43Z KAFKA-3123: Follower Broker cannot start if offsets are already out of range > Follower Broker cannot start if offsets are already out of range > > > Key: KAFKA-3123 > URL: https://issues.apache.org/jira/browse/KAFKA-3123 > Project: Kafka > Issue Type: Bug > Components: core, replication >Affects Versions: 0.9.0.0 >Reporter: Soumyajit Sahu >Assignee: Soumyajit Sahu >Priority: Critical > Labels: patch > Fix For: 0.11.0.1 > > Attachments: > 0001-Fix-Follower-crashes-when-offset-out-of-range-during.patch > > > I was trying to upgrade our test Windows cluster from 0.8.1.1 to 0.9.0 one > machine at a time. Our logs have just 2 hours of retention. I had re-imaged > the test machine under consideration, and got the following error in loop > after starting afresh with 0.9.0 broker: > [2016-01-19 13:57:28,809] WARN [ReplicaFetcherThread-1-169595708], Replica > 15588 for partition [EventLogs4,1] reset its fetch offset from 0 to > current leader 169595708's start offset 334086 > (kafka.server.ReplicaFetcherThread) > [2016-01-19 13:57:28,809] ERROR [ReplicaFetcherThread-1-169595708], Error > getting offset for partition [EventLogs4,1] to broker 169595708 > (kafka.server.ReplicaFetcherThread) > java.lang.IllegalStateException: Compaction for partition [EXO_EventLogs4,1] > cannot be aborted and paused since it is in LogCleaningPaused state. 
> at > kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply$mcV$sp(LogCleanerManager.scala:149) > at > kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply(LogCleanerManager.scala:140) > at > kafka.log.LogCleanerManager$$anonfun$abortAndPauseCleaning$1.apply(LogCleanerManager.scala:140) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262) > at > kafka.log.LogCleanerManager.abortAndPauseCleaning(LogCleanerManager.scala:140) > at kafka.log.LogCleaner.abortAndPauseCleaning(LogCleaner.scala:141) > at kafka.log.LogManager.truncateFullyAndStartAt(LogManager.scala:304) > at > kafka.server.ReplicaFetcherThread.handleOffsetOutOfRange(ReplicaFetcherThread.scala:185) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:152) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:122) > at scala.Option.foreach(Option.scala:236) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:122) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:120) > at > scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) > at > scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) > at > scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) > at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) > at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:120) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:120) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262) > at >
[jira] [Commented] (KAFKA-5418) ZkUtils.getAllPartitions() may fail if a topic is marked for deletion
[ https://issues.apache.org/jira/browse/KAFKA-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045678#comment-16045678 ] ASF GitHub Bot commented on KAFKA-5418: --- GitHub user mimaison opened a pull request: https://github.com/apache/kafka/pull/3295 KAFKA-5418: ZkUtils.getAllPartitions() may fail if a topic is marked for deletion Skip topics that don't have any partitions in zkUtils.getAllPartitions() You can merge this pull request into a Git repository by running: $ git pull https://github.com/mimaison/kafka KAFKA-5418 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3295.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3295 commit 0aeca093c2acf47b8d4fa01b68eaef79f625e091 Author: Mickael MaisonDate: 2017-06-10T20:10:40Z KAFKA-5418: ZkUtils.getAllPartitions() may fail if a topic is marked for deletion > ZkUtils.getAllPartitions() may fail if a topic is marked for deletion > - > > Key: KAFKA-5418 > URL: https://issues.apache.org/jira/browse/KAFKA-5418 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.9.0.1, 0.10.2.1 >Reporter: Edoardo Comar > > Running {{ZkUtils.getAllPartitions()}} on a cluster which had a topic stuck > in the 'marked for deletion' state > so it was a child of {{/brokers/topics}} > but it had no children, i.e. the path {{/brokers/topics/thistopic/partitions}} > did not exist, throws a ZkNoNodeException while iterating: > {noformat} > rg.I0Itec.zkclient.exception.ZkNoNodeException: > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = > NoNode for /brokers/topics/xyzblahfoo/partitions > at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47) > at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:995) > at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:675) > at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:671) > at kafka.utils.ZkUtils.getChildren(ZkUtils.scala:537) > at > kafka.utils.ZkUtils$$anonfun$getAllPartitions$1.apply(ZkUtils.scala:817) > at > kafka.utils.ZkUtils$$anonfun$getAllPartitions$1.apply(ZkUtils.scala:816) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245) > at scala.collection.Iterator$class.foreach(Iterator.scala:742) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1194) > at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) > at scala.collection.AbstractIterable.foreach(Iterable.scala:54) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:245) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at kafka.utils.ZkUtils.getAllPartitions(ZkUtils.scala:816) > ... 
> at java.lang.Thread.run(Thread.java:809) > Caused by: org.apache.zookeeper.KeeperException$NoNodeException: > KeeperErrorCode = NoNode for /brokers/topics/xyzblahfoo/partitions > at org.apache.zookeeper.KeeperException.create(KeeperException.java:111) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472) > at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500) > at org.I0Itec.zkclient.ZkConnection.getChildren(ZkConnection.java:114) > at org.I0Itec.zkclient.ZkClient$4.call(ZkClient.java:678) > at org.I0Itec.zkclient.ZkClient$4.call(ZkClient.java:675) > at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:985) > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
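The PR description says the fix is to skip topics that have no partitions node. A hedged sketch of that guard follows, written in Java against a hypothetical ZooKeeper wrapper (`ZkView`); the real change is in Scala's ZkUtils, so names and types here are assumptions.

{code:java}
import java.util.ArrayList;
import java.util.List;

// Only descend into /brokers/topics/<topic>/partitions when that path exists,
// so a topic stuck in "marked for deletion" no longer throws ZkNoNodeException.
final class AllPartitionsSketch {
    interface ZkView {
        boolean pathExists(String path);
        List<String> getChildren(String path);
    }

    static List<String> getAllPartitions(ZkView zk) {
        List<String> result = new ArrayList<>();
        for (String topic : zk.getChildren("/brokers/topics")) {
            String partitionsPath = "/brokers/topics/" + topic + "/partitions";
            if (!zk.pathExists(partitionsPath))
                continue;                      // topic marked for deletion: skip it
            for (String partition : zk.getChildren(partitionsPath))
                result.add(topic + "-" + partition);
        }
        return result;
    }
}
{code}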
[jira] [Commented] (KAFKA-4653) Improve test coverage of RocksDBStore
[ https://issues.apache.org/jira/browse/KAFKA-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045635#comment-16045635 ] ASF GitHub Bot commented on KAFKA-4653: --- GitHub user jeyhunkarimov opened a pull request: https://github.com/apache/kafka/pull/3294 KAFKA-4653: Improve test coverage of RocksDBStore You can merge this pull request into a Git repository by running: $ git pull https://github.com/jeyhunkarimov/kafka KAFKA-4653 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3294.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3294 commit 77e65eaa07855651cf4449de2cca074a9fe05ece Author: Jeyhun KarimovDate: 2017-06-10T17:54:27Z RocksDBStore putAll covered > Improve test coverage of RocksDBStore > - > > Key: KAFKA-4653 > URL: https://issues.apache.org/jira/browse/KAFKA-4653 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Damian Guy >Priority: Minor > Fix For: 0.11.1.0 > > > {{putAll}} - not covered > {{putInternal} - exceptions -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-4658) Improve test coverage InMemoryKeyValueLoggedStore
[ https://issues.apache.org/jira/browse/KAFKA-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045602#comment-16045602 ] ASF GitHub Bot commented on KAFKA-4658: --- GitHub user jeyhunkarimov opened a pull request: https://github.com/apache/kafka/pull/3293 KAFKA-4658: Improve test coverage InMemoryKeyValueLoggedStore You can merge this pull request into a Git repository by running: $ git pull https://github.com/jeyhunkarimov/kafka KAFKA-4658 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3293.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3293 commit f5e5e8af539d8ce0e27232d8b9aac725ec19f80c Author: Jeyhun KarimovDate: 2017-06-10T16:23:54Z InMemoryKeyValueLoggedStore tests improved with putAll() and persistent() > Improve test coverage InMemoryKeyValueLoggedStore > - > > Key: KAFKA-4658 > URL: https://issues.apache.org/jira/browse/KAFKA-4658 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Damian Guy > Fix For: 0.11.1.0 > > > {{putAll} not covered -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-4656) Improve test coverage of CompositeReadOnlyKeyValueStore
[ https://issues.apache.org/jira/browse/KAFKA-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045583#comment-16045583 ] ASF GitHub Bot commented on KAFKA-4656: --- GitHub user jeyhunkarimov opened a pull request: https://github.com/apache/kafka/pull/3292 KAFKA-4656: Improve test coverage of CompositeReadOnlyKeyValueStore You can merge this pull request into a Git repository by running: $ git pull https://github.com/jeyhunkarimov/kafka KAFKA-4656 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3292.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3292 commit 3fa739bf25c0a92e405635d9c0ae7f9c90ca8027 Author: Jeyhun KarimovDate: 2017-06-10T15:28:55Z Exceptions not covered in CompositeReadOnlyKeyValueStoreTest Exceptions not covered in CompositeReadOnlyKeyValueStoreTest Exceptions not covered in CompositeReadOnlyKeyValueStoreTest > Improve test coverage of CompositeReadOnlyKeyValueStore > --- > > Key: KAFKA-4656 > URL: https://issues.apache.org/jira/browse/KAFKA-4656 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Damian Guy >Priority: Minor > Fix For: 0.11.1.0 > > > exceptions not covered -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-4659) Improve test coverage of CachingKeyValueStore
[ https://issues.apache.org/jira/browse/KAFKA-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045525#comment-16045525 ] ASF GitHub Bot commented on KAFKA-4659: --- GitHub user jeyhunkarimov opened a pull request: https://github.com/apache/kafka/pull/3291 KAFKA-4659: Improve test coverage of CachingKeyValueStore You can merge this pull request into a Git repository by running: $ git pull https://github.com/jeyhunkarimov/kafka KAFKA-4659 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3291.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3291 commit c640cdc430c287e100cbb94a6956005c029e83e5 Author: Jeyhun KarimovDate: 2017-06-10T13:22:51Z putIfAbsent and null pointer exception test cases covered > Improve test coverage of CachingKeyValueStore > - > > Key: KAFKA-4659 > URL: https://issues.apache.org/jira/browse/KAFKA-4659 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Damian Guy >Assignee: Jeyhun Karimov >Priority: Minor > Fix For: 0.11.1.0 > > > {{putIfAbsent}} mostly not covered -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-4655) Improve test coverage of CompositeReadOnlySessionStore
[ https://issues.apache.org/jira/browse/KAFKA-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045501#comment-16045501 ] ASF GitHub Bot commented on KAFKA-4655: --- GitHub user jeyhunkarimov opened a pull request: https://github.com/apache/kafka/pull/3290 KAFKA-4655: Improve test coverage of CompositeReadOnlySessionStore You can merge this pull request into a Git repository by running: $ git pull https://github.com/jeyhunkarimov/kafka KAFKA-4655 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3290.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3290 commit 0627b6b932bde199f26ffead651ce5e1d7f0ea82 Author: Jeyhun KarimovDate: 2017-06-10T12:13:26Z Improved coverage with exceptions in fetch and internal iterator > Improve test coverage of CompositeReadOnlySessionStore > -- > > Key: KAFKA-4655 > URL: https://issues.apache.org/jira/browse/KAFKA-4655 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Damian Guy >Assignee: Jeyhun Karimov > Fix For: 0.11.1.0 > > > exceptions in fetch and internal iterator -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-4661) Improve test coverage UsePreviousTimeOnInvalidTimestamp
[ https://issues.apache.org/jira/browse/KAFKA-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045287#comment-16045287 ] ASF GitHub Bot commented on KAFKA-4661: --- GitHub user jeyhunkarimov opened a pull request: https://github.com/apache/kafka/pull/3288 KAFKA-4661: Improve test coverage UsePreviousTimeOnInvalidTimestamp You can merge this pull request into a Git repository by running: $ git pull https://github.com/jeyhunkarimov/kafka KAFKA-4661 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3288.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3288 commit 2f407df88b1eeb9c91df235cd3dc07b4e6cdf3ed Author: Jeyhun KarimovDate: 2017-06-10T00:55:23Z Exception branch tested on UsePreviousTimeOnInvalidTimestamp > Improve test coverage UsePreviousTimeOnInvalidTimestamp > --- > > Key: KAFKA-4661 > URL: https://issues.apache.org/jira/browse/KAFKA-4661 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Damian Guy >Assignee: Jeyhun Karimov >Priority: Minor > > Exception branch not tested -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5415) TransactionCoordinator doesn't complete transition to PrepareCommit state
[ https://issues.apache.org/jira/browse/KAFKA-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045252#comment-16045252 ] ASF GitHub Bot commented on KAFKA-5415: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3286 > TransactionCoordinator doesn't complete transition to PrepareCommit state > - > > Key: KAFKA-5415 > URL: https://issues.apache.org/jira/browse/KAFKA-5415 > Project: Kafka > Issue Type: Bug >Reporter: Apurva Mehta >Assignee: Apurva Mehta >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > Attachments: 6.tgz > > > This has been revealed by the system test failures on jenkins. > The transaction coordinator seems to get into a path during the handling of > the EndTxnRequest where it returns an error (possibly a NOT_COORDINATOR or > COORDINATOR_NOT_AVAILABLE error, to be revealed by > https://github.com/apache/kafka/pull/3278) to the client. However, due to > network instability, the producer is disconnected before it receives this > error. > As a result, the transaction remains in a `PrepareXX` state, and future > `EndTxn` requests sent by the client after reconnecting result in a > `CONCURRENT_TRANSACTION` error code. Hence the client gets stuck and the > transaction never finishes, as expiration isn't done from a PrepareXX state. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5416) TransactionCoordinator doesn't complete transition to CompleteCommit
[ https://issues.apache.org/jira/browse/KAFKA-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045195#comment-16045195 ] ASF GitHub Bot commented on KAFKA-5416: --- GitHub user guozhangwang opened a pull request: https://github.com/apache/kafka/pull/3287 KAFKA-5416: Re-prepare transition to CompleteCommit/Abort upon retrying append to log In `TransationStateManager`, we reset the pending state if an error occurred while appending to log; this is correct except that for the `TransactionMarkerChannelManager`, as it will retry appending to log and if eventually it succeeded, the transaction metadata's completing transition will throw an IllegalStateException since pending state is None, this will be thrown all the way to the `KafkaApis` and be swallowed. 1. When re-enqueueing to the retry append queue, re-prepare transition to set its pending state. 2. A bunch of log4j improvements based the debugging experience. The main principle is to make sure all error codes that is about to sent to the client will be logged, and unnecessary log4j entries to be removed. 3. Also moved some log entries in ReplicationUtils.scala to `trace`: this is rather orthogonal to this PR but I found it rather annoying while debugging the logs. 4. A couple of unrelated bug fixes as pointed by @hachikuji and @apurvam . You can merge this pull request into a Git repository by running: $ git pull https://github.com/guozhangwang/kafka KHotfix-transaction-coordinator-append-callback Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3287.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3287 commit 755f01201774f6fb5ddcdff87caaa78634847ebe Author: Guozhang WangDate: 2017-06-09T23:02:44Z re-prepare transition to completeXX upon retrying append to log > TransactionCoordinator doesn't complete transition to CompleteCommit > > > Key: KAFKA-5416 > URL: https://issues.apache.org/jira/browse/KAFKA-5416 > Project: Kafka > Issue Type: Bug >Reporter: Apurva Mehta >Assignee: Guozhang Wang >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > In regard to this system test: > http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/2017-06-09--001.1496974430--apurvam--MINOR-add-logging-to-transaction-coordinator-in-all-failure-cases--02b3816/TransactionsTest/test_transactions/failure_mode=clean_bounce.bounce_target=brokers/4.tgz > Here are the ownership changes for __transaction_state-37: > {noformat} > ./worker1/debug/server.log:3623:[2017-06-09 01:16:36,551] INFO [Transaction > Log Manager 2]: Trying to remove cached transaction metadata for > __transaction_state-37 on follower transition but there is no entries > remaining; it is likely that another process for removing the cached entries > has just executed earlier before > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:16174:[2017-06-09 01:16:39,963] INFO [Transaction > Log Manager 2]: Loading transaction metadata from __transaction_state-37 > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:16194:[2017-06-09 01:16:39,978] INFO [Transaction > Log Manager 2]: Finished loading 1 transaction metadata from > __transaction_state-37 in 15 milliseconds > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:35633:[2017-06-09 01:16:45,308] INFO [Transaction > Log Manager 
2]: Removed 1 cached transaction metadata for > __transaction_state-37 on follower transition > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:40353:[2017-06-09 01:16:52,218] INFO [Transaction > Log Manager 2]: Trying to remove cached transaction metadata for > __transaction_state-37 on follower transition but there is no entries > remaining; it is likelythat another process for removing the cached entries > has just executed earlier before > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:40664:[2017-06-09 01:16:52,683] INFO [Transaction > Log Manager 2]: Trying to remove cached transaction metadata for > __transaction_state-37 on follower transition but there is no entries > remaining; it is likelythat another process for removing the cached entries > has just executed earlier before > (kafka.coordinator.transaction.TransactionStateManager) > ./worker1/debug/server.log:146044:[2017-06-09 01:19:08,844] INFO [Transaction > Log Manager 2]: Loading transaction metadata from
[jira] [Commented] (KAFKA-5422) Multiple produce request failures causes invalid state transition in TransactionManager
[ https://issues.apache.org/jira/browse/KAFKA-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045176#comment-16045176 ] ASF GitHub Bot commented on KAFKA-5422: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3285 > Multiple produce request failures causes invalid state transition in > TransactionManager > --- > > Key: KAFKA-5422 > URL: https://issues.apache.org/jira/browse/KAFKA-5422 > Project: Kafka > Issue Type: Bug >Reporter: Apurva Mehta >Assignee: Apurva Mehta > Labels: exactly-once > Fix For: 0.11.0.0 > > > When multiple produce requests fail (for instance when all inflight batches > are expired), each will try to transition to ABORTABLE_ERROR. > However, only the first transition will succeed, the rest will fail with the > following 'invalid transition from ABORTABLE_ERROR to ABORTABLE_ERROR'. > This will be caught in the sender thread and things will continue. However, > the correct thing to do do is to allow multiple transitions to > ABORTABLE_ERROR. > {noformat} > [2017-06-09 01:22:39,327] WARN Connection to node 3 could not be established. > Broker may not be available. (org.apache.kafka.clients.NetworkClient) > [2017-06-09 01:22:39,958] TRACE Expired 2 batches in accumulator > (org.apache.kafka.clients.producer.internals.Sender) > [2017-06-09 01:22:39,958] DEBUG [TransactionalId my-first-transactional-id] > Transition from state COMMITTING_TRANSACTION to error state ABORTABLE_ERROR > (org.apache.kafka.clients.producer.internals.TransactionManager) > org.apache.kafka.common.errors.TimeoutException: Expiring 250 record(s) for > output-topic-0: 30099 ms has passed since batch creation plus linger time > [2017-06-09 01:22:39,960] TRACE Produced messages to topic-partition > output-topic-0 with base offset offset -1 and error: {}. > (org.apache.kafka.clients.producer.internals.ProducerBatch) > org.apache.kafka.common.errors.TimeoutException: Expiring 250 record(s) for > output-topic-0: 30099 ms has passed since batch creation plus linger time > [2017-06-09 01:22:39,960] ERROR Uncaught error in kafka producer I/O thread: > (org.apache.kafka.clients.producer.internals.Sender) > org.apache.kafka.common.KafkaException: Invalid transition attempted from > state ABORTABLE_ERROR to state ABORTABLE_ERROR > at > org.apache.kafka.clients.producer.internals.TransactionManager.transitionTo(TransactionManager.java:475) > at > org.apache.kafka.clients.producer.internals.TransactionManager.transitionToAbortableError(TransactionManager.java:288) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:602) > at > org.apache.kafka.clients.producer.internals.Sender.sendProducerData(Sender.java:271) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:221) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:162) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
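The ticket states that the correct behaviour is to allow repeated transitions into ABORTABLE_ERROR. A minimal, self-contained sketch of that idea is below; the names are illustrative, not the actual TransactionManager code.

{code:java}
// Treat a second transition into ABORTABLE_ERROR as legal (keeping the first
// recorded cause) instead of throwing the "Invalid transition attempted"
// KafkaException seen in the log above.
final class AbortableErrorSketch {
    enum State { IN_TRANSACTION, COMMITTING_TRANSACTION, ABORTABLE_ERROR }

    private State state = State.COMMITTING_TRANSACTION;
    private RuntimeException abortCause;

    synchronized void transitionToAbortableError(RuntimeException cause) {
        if (state == State.ABORTABLE_ERROR) {
            // Already aborting: ignore the repeat rather than failing the sender thread.
            return;
        }
        state = State.ABORTABLE_ERROR;
        abortCause = cause;
    }

    synchronized RuntimeException cause() {
        return abortCause;
    }
}
{code}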
[jira] [Commented] (KAFKA-5415) TransactionCoordinator doesn't complete transition to PrepareCommit state
[ https://issues.apache.org/jira/browse/KAFKA-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045170#comment-16045170 ] ASF GitHub Bot commented on KAFKA-5415: --- GitHub user apurvam opened a pull request: https://github.com/apache/kafka/pull/3286 KAFKA-5415: Remove timestamp check in completeTransitionTo This assertion is hard to get right because the system time can roll backward on a host due to NTP (as shown in the ticket), and also because a transaction can start on one host and complete on another. Getting precise clock times across hosts is virtually impossible, and this check makes things fragile. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apurvam/kafka KAFKA-5415-avoid-timestamp-check-in-completeTransition Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3286.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3286 commit ccf5217d5a5985e7e88b2794c5fe43ff5b1d8a58 Author: Apurva MehtaDate: 2017-06-09T22:51:31Z Remove timestamp check in completeTransitionTo > TransactionCoordinator doesn't complete transition to PrepareCommit state > - > > Key: KAFKA-5415 > URL: https://issues.apache.org/jira/browse/KAFKA-5415 > Project: Kafka > Issue Type: Bug >Reporter: Apurva Mehta >Assignee: Apurva Mehta >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > Attachments: 6.tgz > > > This has been revealed by the system test failures on jenkins. > The transaction coordinator seems to get into a path during the handling of > the EndTxnRequest where it returns an error (possibly a NOT_COORDINATOR or > COORDINATOR_NOT_AVAILABLE error, to be revealed by > https://github.com/apache/kafka/pull/3278) to the client. However, due to > network instability, the producer is disconnected before it receives this > error. > As a result, the transaction remains in a `PrepareXX` state, and future > `EndTxn` requests sent by the client after reconnecting result in a > `CONCURRENT_TRANSACTION` error code. Hence the client gets stuck and the > transaction never finishes, as expiration isn't done from a PrepareXX state. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5422) Multiple produce request failures causes invalid state transition in TransactionManager
[ https://issues.apache.org/jira/browse/KAFKA-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045069#comment-16045069 ] ASF GitHub Bot commented on KAFKA-5422: --- GitHub user apurvam opened a pull request: https://github.com/apache/kafka/pull/3285 KAFKA-5422: Handle multiple transitions to ABORTABLE_ERROR correctly You can merge this pull request into a Git repository by running: $ git pull https://github.com/apurvam/kafka KAFKA-5422-allow-multiple-transitions-to-abortable-error Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3285.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3285 commit a3d0d923a76269d55541294447967167c35baebb Author: Apurva MehtaDate: 2017-06-09T21:39:49Z Handle multiple transitions to ABORTABLE_ERROR correctly > Multiple produce request failures causes invalid state transition in > TransactionManager > --- > > Key: KAFKA-5422 > URL: https://issues.apache.org/jira/browse/KAFKA-5422 > Project: Kafka > Issue Type: Bug >Reporter: Apurva Mehta >Assignee: Apurva Mehta > Labels: exactly-once > Fix For: 0.11.0.0 > > > When multiple produce requests fail (for instance when all inflight batches > are expired), each will try to transition to ABORTABLE_ERROR. > However, only the first transition will succeed, the rest will fail with the > following 'invalid transition from ABORTABLE_ERROR to ABORTABLE_ERROR'. > This will be caught in the sender thread and things will continue. However, > the correct thing to do do is to allow multiple transitions to > ABORTABLE_ERROR. > {noformat} > [2017-06-09 01:22:39,327] WARN Connection to node 3 could not be established. > Broker may not be available. (org.apache.kafka.clients.NetworkClient) > [2017-06-09 01:22:39,958] TRACE Expired 2 batches in accumulator > (org.apache.kafka.clients.producer.internals.Sender) > [2017-06-09 01:22:39,958] DEBUG [TransactionalId my-first-transactional-id] > Transition from state COMMITTING_TRANSACTION to error state ABORTABLE_ERROR > (org.apache.kafka.clients.producer.internals.TransactionManager) > org.apache.kafka.common.errors.TimeoutException: Expiring 250 record(s) for > output-topic-0: 30099 ms has passed since batch creation plus linger time > [2017-06-09 01:22:39,960] TRACE Produced messages to topic-partition > output-topic-0 with base offset offset -1 and error: {}. 
> (org.apache.kafka.clients.producer.internals.ProducerBatch) > org.apache.kafka.common.errors.TimeoutException: Expiring 250 record(s) for > output-topic-0: 30099 ms has passed since batch creation plus linger time > [2017-06-09 01:22:39,960] ERROR Uncaught error in kafka producer I/O thread: > (org.apache.kafka.clients.producer.internals.Sender) > org.apache.kafka.common.KafkaException: Invalid transition attempted from > state ABORTABLE_ERROR to state ABORTABLE_ERROR > at > org.apache.kafka.clients.producer.internals.TransactionManager.transitionTo(TransactionManager.java:475) > at > org.apache.kafka.clients.producer.internals.TransactionManager.transitionToAbortableError(TransactionManager.java:288) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:602) > at > org.apache.kafka.clients.producer.internals.Sender.sendProducerData(Sender.java:271) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:221) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:162) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-2170) 10 LogTest cases failed for file.renameTo failed under windows
[ https://issues.apache.org/jira/browse/KAFKA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044582#comment-16044582 ] ASF GitHub Bot commented on KAFKA-2170: --- GitHub user nxmbriggs404 opened a pull request: https://github.com/apache/kafka/pull/3283 KAFKA-2170: Updated Fixes For Windows Platform During stress testing of kafka 0.10.2.1 on a Windows platform, our group has encountered some issues that appear to be known to the community but not fully addressed by kafka. Using: https://github.com/apache/kafka/pull/154 as a guide, we have made derived changes to the source code and automated tests such that the "clients" and "core" tests pass for us on Windows and Linux platforms. Our stress tests succeed as well. This pull request adapts those changes to merge and build with kafka/trunk. The "clients" and "core" tests from kafka/trunk pass on Linux for us with these changes in place, and all tests pass on Windows except: ConsumerBounceTest (intermittent failures) TransactionsTest DeleteTopicTest.testDeleteTopicWithCleaner EpochDrivenReplicationProtocolAcceptanceTest.offsetsShouldNotGoBackwards Our intention is to help efforts to further kafka support for the Windows platform. Our changes are the work of engineers from Nexidia building upon the work found in the aforementioned pull request link, and they are contributed to the community per kafka's open source license. We welcome all feedback and look forward to working with the kafka community. Matt Briggs Principal Software Engineer Nexidia, a NICE Analytics Company www.nexidia.com You can merge this pull request into a Git repository by running: $ git pull https://github.com/nxmbriggs404/kafka nx-windows-fixes Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3283.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3283 commit 6ee3c167c6e2daa8ce4564d98f9f63967a0efece Author: Matt BriggsDate: 2017-06-06T15:10:58Z Handle log file deletion and renaming on Windows Special treatment is needed for the deletion and renaming of log and index files on Windows, due to the latter's general inability to perform those operations while a file is opened or memory mapped. The changes in this commit are essentially adapted from: https://github.com/apache/kafka/pull/154 More detailed background information on the issues can also be found via that link. commit a0cd773a8d89d7df90fc75ce55a46fd8bb93d368 Author: Matt Briggs Date: 2017-06-06T15:21:23Z Colliding log filenames cause test failures on Windows This commit addresses an edge case with compaction and asynchronous deletion of log files initially encountered when debugging: LogCleanerTest.testRecoveryAfterCrash failures on Windows. It appears that troubles arise when compaction of logs results in two segments having the same base address, hence the same file names, and the segments get scheduled for background deletion. If one segment's files are pending deletion at the time the other segment's files are scheduled for deletion, the file rename attempted during the latter will fail on Windows (due to the typical Windows issues with open/memory-mapped files). It doesn't appear like we can simply close out the former files, as it seems that kafka intends to have them open for concurrent readers until the file deletion interval has fully passed. 
The fix in this commit basically sidesteps the issue by ensuring files scheduled for background delete are renamed uniquely (by injecting a UUID into the filename). Essentially this follows the approach taken with LogManager.asyncDelete and Log.DeleteDirSuffix. Collision related errors were also observed when running a custom stress test on Windows against a standalone kafka server. The test code caused extremely frequent compaction of the __consumer_offsets topic partitions, which resulted in collisions of the latter's log files when they were scheduled for deletion. Use of the UUID was successful in avoiding collision related issues in this context. commit 3633d493bc3c0de3f177eecd11e374be74d4ac32 Author: Matt Briggs Date: 2017-06-06T15:27:59Z Fixing log recovery crash on Windows When a sanity check failure was detected by log recovery code, an attempt to delete index files that were memory-mapped would lead to a crash on Windows. This commit adjusts the code to unmap, delete, recreate, and remap index files such the recovery can continue.
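A short illustration of the rename-with-UUID idea from the second commit message follows. The suffix and helper names are assumptions for the sketch, not the actual patch; the point is only that the rename target is unique, so two segments with the same base offset cannot collide while one is still pending deletion.

{code:java}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.UUID;

final class UniqueDeleteRenameSketch {
    static final String DELETED_SUFFIX = ".deleted";   // assumption: suffix for files awaiting delete

    // Rename to a unique name before scheduling asynchronous deletion.
    static File markForAsyncDelete(File segmentFile) throws IOException {
        File target = new File(segmentFile.getParentFile(),
                segmentFile.getName() + "." + UUID.randomUUID() + DELETED_SUFFIX);
        Files.move(segmentFile.toPath(), target.toPath());
        return target;   // a background task deletes this uniquely named file later
    }
}
{code}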
[jira] [Commented] (KAFKA-5417) Clients get inconsistent connection states when SASL/SSL connection is marked CONNECTED and DISCONNECTED at the same time
[ https://issues.apache.org/jira/browse/KAFKA-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1608#comment-1608 ] ASF GitHub Bot commented on KAFKA-5417: --- GitHub user dongeforever opened a pull request: https://github.com/apache/kafka/pull/3282 [KAFKA-5417] Clients get inconsistent connection states when SASL/SSL… … connection is marked CONECTED and DISCONNECTED at the same time details are in: https://issues.apache.org/jira/browse/KAFKA-5417 You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongeforever/kafka KAFKA-5417 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3282.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3282 commit 7cad7adcba8ebd4b64a7c7012865ffd5315c5dfe Author: zanderDate: 2017-06-09T13:45:59Z [KAFKA-5417] Clients get inconsistent connection states when SASL/SSL connection is marked CONECTED and DISCONNECTED at the same time > Clients get inconsistent connection states when SASL/SSL connection is marked > CONECTED and DISCONNECTED at the same time > > > Key: KAFKA-5417 > URL: https://issues.apache.org/jira/browse/KAFKA-5417 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 0.10.2.1 >Reporter: dongeforever >Priority: Critical > Fix For: 0.10.2.2 > > > Assume the SASL or SSL Connection is established successfully, but be reset > when writing data into it (This will happen frequently in LVS Proxy > environment ) > Selecter poll will act like follows: > try { >... > //finish connect successfully > if (channel.finishConnect()) { > this.connected.add(channel.id());(1) > } > //the prepare will fail, for sasl or ssl will do handshake and write data > //throw exception > if (channel.isConnected() && !channel.ready()) > channel.prepare(); > > } catch { > close(channel); > this.disconnected.add(channel.id()); (2) > } > The code line named (1) and (2) will mark the connection CONNECTED and > DISCONNECTED at the same time. > And the NetworkClient poll will: > handleDisconnections(responses, updatedNow); //remove the channel > handleConnections(); //mark the channel CONNECTED > So get the inconsistent ConnectionStates, and such state will block the > messages sent into this channel in Sender: > For the channel will never be ready and never be connected again: > public boolean ready(Node node, long now) { > if (node.isEmpty()) > throw new IllegalArgumentException("Cannot connect to empty node > " + node); > //return false, for the channel dose not exist actually > if (isReady(node, now)) > return true; > //return false, for the channel is marked CONNECTED > if (connectionStates.canConnect(node.idString(), now)) > // if we are interested in sending to a node and we don't have a > connection to it, initiate one > initiateConnect(node, now); > return false; > } > So all messages sent to such channel will be expired eventually -- This message was sent by Atlassian JIRA (v6.3.15#6346)
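The report boils down to one invariant: a channel id must never come out of a poll() in both the connected and the disconnected set. The self-contained illustration below uses made-up names (not the real Selector fields) to show the bookkeeping direction implied by the ticket: when the SASL/SSL prepare step throws after finishConnect() succeeded, the CONNECTED mark is rolled back before the disconnect is recorded.

{code:java}
import java.util.HashSet;
import java.util.Set;

final class ConnectionStateBookkeeping {
    private final Set<String> connected = new HashSet<>();
    private final Set<String> disconnected = new HashSet<>();

    void onFinishConnect(String channelId) {
        connected.add(channelId);
    }

    // Called when channel.prepare() (SASL/SSL handshake) throws after the TCP connect succeeded.
    void onPrepareFailure(String channelId) {
        connected.remove(channelId);      // undo the earlier CONNECTED mark
        disconnected.add(channelId);
    }

    boolean isConsistent(String channelId) {
        return !(connected.contains(channelId) && disconnected.contains(channelId));
    }
}
{code}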
[jira] [Commented] (KAFKA-5388) Replace zkClient.subscribe*Changes method with an equivalent zkUtils method
[ https://issues.apache.org/jira/browse/KAFKA-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044334#comment-16044334 ] ASF GitHub Bot commented on KAFKA-5388: --- GitHub user baluchicken opened a pull request: https://github.com/apache/kafka/pull/3281 KAFKA-5388 Replace zkClient.subscribe*Changes method with an equivalent zkUtils method @ijuma can you please review if you have time:) ? You can merge this pull request into a Git repository by running: $ git pull https://github.com/baluchicken/kafka-1 KAFKA-5388 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3281.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3281 commit 32d54551746d303b977a5deb1e1cbc19d8dc33f2 Author: Balint MolnarDate: 2017-06-09T11:15:39Z KAFKA-5388 Replace zkClient.subscribe*Changes method with an equivalent zkUtils method > Replace zkClient.subscribe*Changes method with an equivalent zkUtils method > --- > > Key: KAFKA-5388 > URL: https://issues.apache.org/jira/browse/KAFKA-5388 > Project: Kafka > Issue Type: Sub-task > Components: core >Reporter: Balint Molnar >Assignee: Balint Molnar > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5412) Using connect-console-sink/source.properties raises an exception related to "file" property not found
[ https://issues.apache.org/jira/browse/KAFKA-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044181#comment-16044181 ] ASF GitHub Bot commented on KAFKA-5412: --- GitHub user ppatierno opened a pull request: https://github.com/apache/kafka/pull/3279 KAFKA-5412: Using connect-console-sink/source.properties raises an exception related to "file" property not found You can merge this pull request into a Git repository by running: $ git pull https://github.com/ppatierno/kafka kafka-5412 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3279.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3279 commit dcbe72dc72f2fcd2fafb84772b5fe5d9e80e1f75 Author: ppatiernoDate: 2017-06-09T09:13:38Z Added default null value for "file" parameter and more descriptive documentation on its usage > Using connect-console-sink/source.properties raises an exception related to > "file" property not found > - > > Key: KAFKA-5412 > URL: https://issues.apache.org/jira/browse/KAFKA-5412 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 0.11.0.0 >Reporter: Paolo Patierno > > Hi, > with the latest 0.11.1.0-SNAPSHOT it happens that the Kafka Connect example > using connect-console-sink/source.properties doesn't work anymore because the > needed "file" property isn't found. > This is because the underlying used FileStreamSink/Source connector and task > has defined a ConfigDef with "file" as mandatory parameter. In the case of > console example we want to have file=null so that stdin and stdout are used. > One possible solution is set "file=" inside the provided > connect-console-sink/source.properties. > The other one could be modify the FileStreamSink/Source source code in order > to remove the "file" definition from the ConfigDef. > What do you think ? > I can provide a PR for that. > Thanks, > Paolo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
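A hedged sketch of the option the PR takes (a null default for the "file" setting) is shown below. The exact ConfigDef used by the FileStreamSource/Sink connectors may differ; this only shows the shape of the change.

{code:java}
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigDef.Importance;
import org.apache.kafka.common.config.ConfigDef.Type;

public class FileStreamConfigSketch {
    public static final String FILE_CONFIG = "file";

    // A null default makes the property optional, so omitting it falls back to stdin/stdout.
    public static final ConfigDef CONFIG_DEF = new ConfigDef()
            .define(FILE_CONFIG, Type.STRING, null, Importance.HIGH,
                    "Source or destination file. If not set, stdin/stdout is used.");
}
{code}

With such a default, the connect-console-sink/source.properties examples can simply omit the file setting instead of carrying an empty "file=" entry.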
[jira] [Commented] (KAFKA-3199) LoginManager should allow using an existing Subject
[ https://issues.apache.org/jira/browse/KAFKA-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043891#comment-16043891 ] ASF GitHub Bot commented on KAFKA-3199: --- Github user kunickiaj closed the pull request at: https://github.com/apache/kafka/pull/862 > LoginManager should allow using an existing Subject > --- > > Key: KAFKA-3199 > URL: https://issues.apache.org/jira/browse/KAFKA-3199 > Project: Kafka > Issue Type: Improvement > Components: security >Affects Versions: 0.9.0.0 >Reporter: Adam Kunicki >Assignee: Adam Kunicki >Priority: Critical > > LoginManager currently creates a new Login in the constructor which then > performs a login and starts a ticket renewal thread. The problem here is that > because Kafka performs its own login, it doesn't offer the ability to re-use > an existing subject that's already managed by the client application. > The goal of LoginManager appears to be to be able to return a valid Subject. > It would be a simple fix to have LoginManager.acquireLoginManager() check for > a new config e.g. kerberos.use.existing.subject. > This would instead of creating a new Login in the constructor simply call > Subject.getSubject(AccessController.getContext()); to use the already logged > in Subject. > This is also doable without introducing a new configuration and simply > checking if there is already a valid Subject available, but I think it may be > preferable to require that users explicitly request this behavior. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
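The behaviour proposed in the ticket can be sketched as follows. This is not Kafka's actual LoginManager code; `performJaasLogin()` is a hypothetical stand-in for the existing login-and-renew path.

{code:java}
import java.security.AccessController;
import javax.security.auth.Subject;

final class SubjectResolutionSketch {
    static Subject resolveSubject() {
        // Prefer a Subject the hosting application has already logged in.
        Subject existing = Subject.getSubject(AccessController.getContext());
        if (existing != null)
            return existing;
        // Otherwise fall back to Kafka performing its own JAAS login.
        return performJaasLogin();
    }

    private static Subject performJaasLogin() {
        // Placeholder for the current behaviour (login plus ticket renewal thread).
        return new Subject();
    }
}
{code}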
[jira] [Commented] (KAFKA-5414) Console consumer offset commit regression
[ https://issues.apache.org/jira/browse/KAFKA-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043786#comment-16043786 ] ASF GitHub Bot commented on KAFKA-5414: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3277 > Console consumer offset commit regression > - > > Key: KAFKA-5414 > URL: https://issues.apache.org/jira/browse/KAFKA-5414 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Critical > Fix For: 0.11.0.0 > > > In KAFKA-5327, the behavior of console consumer was changed to only commit > offsets when the process closes. Previously we used periodic offset commits. > This breaks existing usage in system tests and probably elsewhere. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5414) Console consumer offset commit regression
[ https://issues.apache.org/jira/browse/KAFKA-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043720#comment-16043720 ] ASF GitHub Bot commented on KAFKA-5414: --- GitHub user hachikuji opened a pull request: https://github.com/apache/kafka/pull/3277 KAFKA-5414: Revert "KAFKA-5327: ConsoleConsumer should manually commi… …t offsets for records that are returned in receive()" This reverts commit d7d1196a0b542adb46d22eeb5b6d12af950b64c9. You can merge this pull request into a Git repository by running: $ git pull https://github.com/hachikuji/kafka KAFKA-5414 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3277.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3277 commit 2676945c37f3526f73e0dbc0da66e14bb4f73d7e Author: Jason GustafsonDate: 2017-06-09T00:26:01Z KAFKA-5414: Revert "KAFKA-5327: ConsoleConsumer should manually commit offsets for records that are returned in receive()" This reverts commit d7d1196a0b542adb46d22eeb5b6d12af950b64c9. > Console consumer offset commit regression > - > > Key: KAFKA-5414 > URL: https://issues.apache.org/jira/browse/KAFKA-5414 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Critical > Fix For: 0.11.0.0 > > > In KAFKA-5327, the behavior of console consumer was changed to only commit > offsets when the process closes. Previously we used periodic offset commits. > This breaks existing usage in system tests and probably elsewhere. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5361) Add EOS integration tests for Streams API
[ https://issues.apache.org/jira/browse/KAFKA-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043677#comment-16043677 ] ASF GitHub Bot commented on KAFKA-5361: --- GitHub user mjsax opened a pull request: https://github.com/apache/kafka/pull/3276 KAFKA-5361: Add more integration tests for Streams EOS - multi-subtopology tests - fencing test You can merge this pull request into a Git repository by running: $ git pull https://github.com/mjsax/kafka kafka-5361-add-eos-integration-tests-for-streams-api Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3276.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3276 commit 93b4decb761d763dabec44c0d9a8b835fdd2a32f Author: Matthias J. SaxDate: 2017-06-02T23:19:34Z KAFKA-5361: Add more integration tests - multi-subtopology tests - fencing test > Add EOS integration tests for Streams API > - > > Key: KAFKA-5361 > URL: https://issues.apache.org/jira/browse/KAFKA-5361 > Project: Kafka > Issue Type: Bug > Components: streams, unit tests >Affects Versions: 0.11.0.0 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > We need to add more integration tests for Streams API with exactly-once > enabled. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-3199) LoginManager should allow using an existing Subject
[ https://issues.apache.org/jira/browse/KAFKA-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043552#comment-16043552 ] ASF GitHub Bot commented on KAFKA-3199: --- GitHub user utenakr opened a pull request: https://github.com/apache/kafka/pull/3274 KAFKA-3199 LoginManager should allow using an existing Subject LoginManager or KerberosLogin (for > kafka 0.10) should allow using an existing Subject. If there's an existing subject, the Jaas configuration won't needed in getService() You can merge this pull request into a Git repository by running: $ git pull https://github.com/utenakr/kafka trunk Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3274.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3274 commit 95d6b98440f02fee23f8063ad082ec3dae4bd0b2 Author: Ji SunDate: 2017-06-08T22:21:50Z KAFKA-3199 LoginManager should allow using an existing Subject > LoginManager should allow using an existing Subject > --- > > Key: KAFKA-3199 > URL: https://issues.apache.org/jira/browse/KAFKA-3199 > Project: Kafka > Issue Type: Improvement > Components: security >Affects Versions: 0.9.0.0 >Reporter: Adam Kunicki >Assignee: Adam Kunicki >Priority: Critical > > LoginManager currently creates a new Login in the constructor which then > performs a login and starts a ticket renewal thread. The problem here is that > because Kafka performs its own login, it doesn't offer the ability to re-use > an existing subject that's already managed by the client application. > The goal of LoginManager appears to be to be able to return a valid Subject. > It would be a simple fix to have LoginManager.acquireLoginManager() check for > a new config e.g. kerberos.use.existing.subject. > This would instead of creating a new Login in the constructor simply call > Subject.getSubject(AccessController.getContext()); to use the already logged > in Subject. > This is also doable without introducing a new configuration and simply > checking if there is already a valid Subject available, but I think it may be > preferable to require that users explicitly request this behavior. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5411) Generate javadoc for AdminClient and show configs in documentation
[ https://issues.apache.org/jira/browse/KAFKA-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043546#comment-16043546 ] ASF GitHub Bot commented on KAFKA-5411: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3271 > Generate javadoc for AdminClient and show configs in documentation > -- > > Key: KAFKA-5411 > URL: https://issues.apache.org/jira/browse/KAFKA-5411 > Project: Kafka > Issue Type: Improvement >Reporter: Ismael Juma >Assignee: Ismael Juma >Priority: Blocker > Fix For: 0.11.0.0, 0.11.1.0 > > > Also fix the table of contents. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5362) Add EOS system tests for Streams API
[ https://issues.apache.org/jira/browse/KAFKA-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043446#comment-16043446 ] ASF GitHub Bot commented on KAFKA-5362: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3201 > Add EOS system tests for Streams API > > > Key: KAFKA-5362 > URL: https://issues.apache.org/jira/browse/KAFKA-5362 > Project: Kafka > Issue Type: Bug > Components: streams, system tests >Affects Versions: 0.11.0.0 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > We need to add more system tests for Streams API with exactly-once enabled. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5357) StackOverFlow error in transaction coordinator
[ https://issues.apache.org/jira/browse/KAFKA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043233#comment-16043233 ] ASF GitHub Bot commented on KAFKA-5357: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3242 > StackOverFlow error in transaction coordinator > -- > > Key: KAFKA-5357 > URL: https://issues.apache.org/jira/browse/KAFKA-5357 > Project: Kafka > Issue Type: Sub-task >Affects Versions: 0.11.0.0 >Reporter: Apurva Mehta >Assignee: Damian Guy >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > Attachments: KAFKA-5357.tar.gz > > > I observed the following in the broker logs: > {noformat} > [2017-06-01 04:10:36,664] ERROR [Replica Manager on Broker 1]: Error > processing append operation on partition __transaction_state-37 > (kafka.server.ReplicaManager) > [2017-06-01 04:10:36,667] ERROR [TxnMarkerSenderThread-1]: Error due to > (kafka.common.InterBrokerSendThread) > java.lang.StackOverflowError > at java.security.AccessController.doPrivileged(Native Method) > at java.io.PrintWriter.(PrintWriter.java:116) > at java.io.PrintWriter.(PrintWriter.java:100) > at > org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:58) > at > org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87) > at > org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413) > at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313) > at > org.apache.log4j.DailyRollingFileAppender.subAppend(DailyRollingFileAppender.java:369) > at org.apache.log4j.WriterAppender.append(WriterAppender.java:162) > at > org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) > at > org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) > at org.apache.log4j.Category.callAppenders(Category.java:206) > at org.apache.log4j.Category.forcedLog(Category.java:391) > at org.apache.log4j.Category.error(Category.java:322) > at kafka.utils.Logging$class.error(Logging.scala:105) > at kafka.server.ReplicaManager.error(ReplicaManager.scala:122) > at > kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:557) > at > kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:505) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.immutable.Map$Map1.foreach(Map.scala:116) > at > scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.AbstractTraversable.map(Traversable.scala:104) > at > kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:505) > at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:346) > at > kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1.apply$mcV$sp(TransactionStateManager.scala:589) > at > kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1.apply(TransactionStateManager.scala:570) > at > kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1.apply(TransactionStateManager.scala:570) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213) > at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:219) > at > kafka.coordinator.transaction.TransactionStateManager.appendTransactionToLog(TransactionStateManager.scala:564) > at > 
kafka.coordinator.transaction.TransactionMarkerChannelManager.kafka$coordinator$transaction$TransactionMarkerChannelManager$$retryAppendCallback$1(TransactionMarkerChannelManager.scala:225) > at > kafka.coordinator.transaction.TransactionMarkerChannelManager$$anonfun$kafka$coordinator$transaction$TransactionMarkerChannelManager$$retryAppendCallback$1$4.apply(TransactionMarkerChannelManager.scala:225) > at > kafka.coordinator.transaction.TransactionMarkerChannelManager$$anonfun$kafka$coordinator$transaction$TransactionMarkerChannelManager$$retryAppendCallback$1$4.apply(TransactionMarkerChannelManager.scala:225) > at > kafka.coordinator.transaction.TransactionStateManager.kafka$coordinator$transaction$TransactionStateManager$$updateCacheCallback$1(TransactionStateManager.scala:561) > at >
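The truncated trace above shows the shape of the problem: the append-failure callback re-invokes the append directly, so every retry adds stack frames until the thread overflows. The following is a generic illustration of why re-enqueueing bounds the stack depth; it is not the actual Kafka fix, and all names are invented for the sketch.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class RetryViaQueueSketch {
    interface AppendAttempt { boolean tryAppend(); }

    private final BlockingQueue<AppendAttempt> retryQueue = new LinkedBlockingQueue<>();

    void submit(AppendAttempt attempt) {
        if (!attempt.tryAppend())
            retryQueue.add(attempt);          // no recursion: retried on a later pass
    }

    // One pass of the worker thread: drain a snapshot so a persistently failing
    // append waits for the next pass instead of looping or recursing here.
    void drainOnce() {
        List<AppendAttempt> batch = new ArrayList<>();
        retryQueue.drainTo(batch);
        for (AppendAttempt attempt : batch)
            submit(attempt);
    }
}
{code}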
[jira] [Commented] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics
[ https://issues.apache.org/jira/browse/KAFKA-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043005#comment-16043005 ] ASF GitHub Bot commented on KAFKA-5246: --- Github user datalorax closed the pull request at: https://github.com/apache/kafka/pull/3061 > Remove backdoor that allows any client to produce to internal topics > - > > Key: KAFKA-5246 > URL: https://issues.apache.org/jira/browse/KAFKA-5246 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0, > 0.10.2.1 >Reporter: Andy Coates >Assignee: Andy Coates >Priority: Minor > > kafka.admin.AdminUtils defines an ‘AdminClientId' val, which looks to be > unused in the code, with the exception of a single use in KafkaApis.scala in > handleProducerRequest, where it looks to allow any client, using the special > ‘__admin_client' client id, to append to internal topics. > This looks like a security risk to me, as it would allow any client to > produce either rogue offsets or even a record containing something other than > group/offset info. > Can we remove this please? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5391) Replace zkClient.delete* method with an equivalent zkUtils method
[ https://issues.apache.org/jira/browse/KAFKA-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042920#comment-16042920 ] ASF GitHub Bot commented on KAFKA-5391: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3245 > Replace zkClient.delete* method with an equivalent zkUtils method > - > > Key: KAFKA-5391 > URL: https://issues.apache.org/jira/browse/KAFKA-5391 > Project: Kafka > Issue Type: Sub-task > Components: core >Reporter: Balint Molnar >Assignee: Balint Molnar > Fix For: 0.11.1.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5411) Generate javadoc for AdminClient and show configs in documentation
[ https://issues.apache.org/jira/browse/KAFKA-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042838#comment-16042838 ] ASF GitHub Bot commented on KAFKA-5411: --- GitHub user ijuma opened a pull request: https://github.com/apache/kafka/pull/3271 KAFKA-5411: AdminClient javadoc and documentation improvements - Show AdminClient configs in the docs. - Update Javadoc config so that public classes exposed by the AdminClient are included. - Version and table of contents fixes. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ijuma/kafka kafka-5411-admin-client-javadoc-configs Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3271.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3271 commit 5f5f0a2339708d31fadafef1013057003ef6afdc Author: Ismael JumaDate: 2017-06-08T15:15:34Z KAFKA-5411: AdminClient javadoc and documentation improvements > Generate javadoc for AdminClient and show configs in documentation > -- > > Key: KAFKA-5411 > URL: https://issues.apache.org/jira/browse/KAFKA-5411 > Project: Kafka > Issue Type: Improvement >Reporter: Ismael Juma >Assignee: Ismael Juma >Priority: Blocker > Fix For: 0.11.0.0 > > > Also fix the table of contents. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5410) Fix taskClass() method name in Connector and flush() signature in SinkTask
[ https://issues.apache.org/jira/browse/KAFKA-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042592#comment-16042592 ] ASF GitHub Bot commented on KAFKA-5410: --- GitHub user ppatierno opened a pull request: https://github.com/apache/kafka/pull/3269 KAFKA-5410: Fix taskClass() method name in Connector and flush() signature in SinkTask You can merge this pull request into a Git repository by running: $ git pull https://github.com/ppatierno/kafka connect-doc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3269.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3269 commit 1cd05d9154bf64c84651bb0aadb380dafb749bfe Author: ppatiernoDate: 2017-06-08T11:50:45Z Fixed method name for taskClass() in Connector class Fixed method signature for flush() in SinkTask class > Fix taskClass() method name in Connector and flush() signature in SinkTask > -- > > Key: KAFKA-5410 > URL: https://issues.apache.org/jira/browse/KAFKA-5410 > Project: Kafka > Issue Type: Improvement > Components: documentation >Reporter: Paolo Patierno > > Hi, > the current documentation refers to getTaskClass() for the Connector class > during the file example. At same time, a different signature is showed for > the flush() method in SinkTask which now has OffsetMetadata as well. > Thanks, > Paolo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5410) Fix taskClass() method name in Connector and flush() signature in SinkTask
[ https://issues.apache.org/jira/browse/KAFKA-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042585#comment-16042585 ] ASF GitHub Bot commented on KAFKA-5410: --- Github user ppatierno closed the pull request at: https://github.com/apache/kafka/pull/3268 > Fix taskClass() method name in Connector and flush() signature in SinkTask > -- > > Key: KAFKA-5410 > URL: https://issues.apache.org/jira/browse/KAFKA-5410 > Project: Kafka > Issue Type: Improvement > Components: documentation >Reporter: Paolo Patierno > > Hi, > the current documentation refers to getTaskClass() for the Connector class > in the file example. At the same time, a different signature is shown for > the flush() method in SinkTask, which now has OffsetMetadata as well. > Thanks, > Paolo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5410) Fix taskClass() method name in Connector and flush() signature in SinkTask
[ https://issues.apache.org/jira/browse/KAFKA-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042538#comment-16042538 ] ASF GitHub Bot commented on KAFKA-5410: --- GitHub user ppatierno opened a pull request: https://github.com/apache/kafka/pull/3268 KAFKA-5410: Fix taskClass() method name in Connector and flush() signature in SinkTask You can merge this pull request into a Git repository by running: $ git pull https://github.com/ppatierno/kafka connect-doc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3268.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3268 commit 52ecadd6ede849903cd705f5b5e2b470a8e138bb Author: ppatiernoDate: 2017-06-07T12:09:53Z Updated .gitignore for excluding out dir commit 4b32e4109f67a3b586c3d018277c502fbfad21a2 Author: ppatierno Date: 2017-06-08T10:34:52Z Merge remote-tracking branch 'upstream/trunk' into trunk commit 812439d03974ef1794601b85aec5b6c1d54b1e54 Author: ppatierno Date: 2017-06-08T10:45:21Z Fixed method name for taskClass() in Connector class Fixed method signature for flush() in SinkTask class > Fix taskClass() method name in Connector and flush() signature in SinkTask > -- > > Key: KAFKA-5410 > URL: https://issues.apache.org/jira/browse/KAFKA-5410 > Project: Kafka > Issue Type: Improvement > Components: documentation >Reporter: Paolo Patierno > > Hi, > the current documentation refers to getTaskClass() for the Connector class > in the file example. At the same time, a different signature is shown for > the flush() method in SinkTask, which now has OffsetMetadata as well. > Thanks, > Paolo. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5314) Improve exception handling for state stores
[ https://issues.apache.org/jira/browse/KAFKA-5314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042179#comment-16042179 ] ASF GitHub Bot commented on KAFKA-5314: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3135 > Improve exception handling for state stores > --- > > Key: KAFKA-5314 > URL: https://issues.apache.org/jira/browse/KAFKA-5314 > Project: Kafka > Issue Type: Sub-task > Components: streams >Affects Versions: 0.11.0.0 >Reporter: Eno Thereska >Assignee: Eno Thereska > Fix For: 0.11.1.0 > > > RocksDbStore.java throws a mix of exceptions like StreamsException and > ProcessorStateException. That needs to be made consistent. > Also the exceptions thrown are not documented in the KeyValueStore interface. > All the stores (RocksDb, InMemory) need to have consistent exception handling. > Today a store error is fatal and halts the stream thread that is processing > the exceptions. We could explore alternatives, like halting part of the > topology that passes through that faulty store, i.e., failing tasks, not the > entire thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5403) Transactions system test should dedup consumed messages by offset
[ https://issues.apache.org/jira/browse/KAFKA-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042147#comment-16042147 ] ASF GitHub Bot commented on KAFKA-5403: --- GitHub user apurvam opened a pull request: https://github.com/apache/kafka/pull/3266 KAFKA-5403: Transaction system test consumer should dedup messages by offset Since the consumer can consume duplicate offsets due to rebalances, we should dedup consumed messages by offset in order to ensure that the test doesn't fail spuriously. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apurvam/kafka KAFKA-5403-dedup-consumed-messages-transactions-system-test Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3266.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3266 commit f4ba943358aad08950e0952f60eb98843efd42f2 Author: Apurva MehtaDate: 2017-06-08T01:40:40Z WIP commit 5a6ac9dcc83c5bb0f62a5c387923612c5a9f1212 Author: Apurva Mehta Date: 2017-06-08T02:51:48Z WIP commit > Transactions system test should dedup consumed messages by offset > - > > Key: KAFKA-5403 > URL: https://issues.apache.org/jira/browse/KAFKA-5403 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.11.0.0 >Reporter: Apurva Mehta >Assignee: Apurva Mehta > Fix For: 0.11.0.1 > > > In KAFKA-5396, we saw that the consumers which verify the data in multiple > topics could read the same offsets multiple times, for instance when a > rebalance happens. > This would detect spurious duplicates, causing the test to fail. We should > dedup the consumed messages by offset and only fail the test if we have > duplicate values for a unique set of offsets. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
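The dedup rule described in KAFKA-5403 can be sketched with a small helper that keys each consumed record by (topic, partition, offset). The class below is purely illustrative and is not part of the Kafka system-test code; it only shows the condition the test should actually fail on: two different values observed at the same offset.
{code}
// Hypothetical illustration of the dedup-by-offset idea; not the actual system-test code.
import java.util.HashMap;
import java.util.Map;

public class OffsetDedup {
    // Maps "topic-partition@offset" to the value first observed at that offset.
    private final Map<String, String> seen = new HashMap<>();

    /**
     * Returns true if the record is consistent: either this offset has not been seen
     * before, or a rebalance simply re-delivered the same value. Returns false only
     * when two different values show up at the same offset.
     */
    public boolean record(String topic, int partition, long offset, String value) {
        String key = topic + "-" + partition + "@" + offset;
        String previous = seen.putIfAbsent(key, value);
        return previous == null || previous.equals(value);
    }
}
{code}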
[jira] [Commented] (KAFKA-5329) Replica list in the metadata cache on the broker may have different order from zookeeper
[ https://issues.apache.org/jira/browse/KAFKA-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042097#comment-16042097 ] ASF GitHub Bot commented on KAFKA-5329: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3257 > Replica list in the metadata cache on the broker may have different order > from zookeeper > > > Key: KAFKA-5329 > URL: https://issues.apache.org/jira/browse/KAFKA-5329 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.2.1 >Reporter: Jiangjie Qin >Assignee: Ismael Juma > Labels: newbie > Fix For: 0.11.0.1 > > > It looks like in {{PartitionStateInfo}} we are storing the replicas in a set > instead of a Seq. This causes the replica order to be lost. In most cases it > is fine, but in the context of preferred leader election, the replica order > determines which replica is the preferred leader of a partition. It would be > useful to make the order always be consistent with the order in zookeeper. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
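The KAFKA-5329 report hinges on the difference between a hash-based set and an ordered sequence. The snippet below is illustrative only and does not reference the real {{PartitionStateInfo}} code: a HashSet gives no ordering guarantee for the broker ids, while a List keeps the ZooKeeper assignment order that preferred leader election relies on.
{code}
// Illustrative only: hash-based sets may reorder replica ids, lists preserve them.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReplicaOrderDemo {
    public static void main(String[] args) {
        // Assignment order as stored in ZooKeeper; the first id is the preferred leader.
        List<Integer> zkOrder = Arrays.asList(3, 1, 2);

        Set<Integer> asSet = new HashSet<>(zkOrder);     // iteration order is not guaranteed
        List<Integer> asList = new ArrayList<>(zkOrder); // keeps 3, 1, 2

        System.out.println("set (arbitrary order): " + asSet);
        System.out.println("list (ZooKeeper order): " + asList);
    }
}
{code}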
[jira] [Commented] (KAFKA-5405) Request log should log throttle time
[ https://issues.apache.org/jira/browse/KAFKA-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042061#comment-16042061 ] ASF GitHub Bot commented on KAFKA-5405: --- GitHub user huxihx opened a pull request: https://github.com/apache/kafka/pull/3265 KAFKA-5405: Request log should log throttle time Record `apiThrottleTime` in RequestChannel. @junrao A trivial change. Please review. Thanks. You can merge this pull request into a Git repository by running: $ git pull https://github.com/huxihx/kafka KAFKA-5405 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3265.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3265 commit 61fe11f475457efbddb321df692e41254fc220bc Author: huxihxDate: 2017-06-08T01:44:46Z KAFKA-5405: Request log should log throttle time Record `apiThrottleTime` in RequestChannel > Request log should log throttle time > > > Key: KAFKA-5405 > URL: https://issues.apache.org/jira/browse/KAFKA-5405 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.11.0.0 >Reporter: Jun Rao > Labels: newbie > > In RequestChannel, when logging the request and the latency, it would be > useful to include the apiThrottleTime as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5292) Fix authorization checks in AdminClient
[ https://issues.apache.org/jira/browse/KAFKA-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16042035#comment-16042035 ] ASF GitHub Bot commented on KAFKA-5292: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3240 > Fix authorization checks in AdminClient > --- > > Key: KAFKA-5292 > URL: https://issues.apache.org/jira/browse/KAFKA-5292 > Project: Kafka > Issue Type: Sub-task >Reporter: Ismael Juma >Assignee: Colin P. McCabe >Priority: Blocker > Fix For: 0.11.0.0 > > > Fix authorization checks in AdminClient. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5404) Add more AdminClient checks to ClientCompatibilityTest
[ https://issues.apache.org/jira/browse/KAFKA-5404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16041718#comment-16041718 ] ASF GitHub Bot commented on KAFKA-5404: --- GitHub user cmccabe opened a pull request: https://github.com/apache/kafka/pull/3263 KAFKA-5404: Add more AdminClient checks to ClientCompatibilityTest You can merge this pull request into a Git repository by running: $ git pull https://github.com/cmccabe/kafka KAFKA-5404 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3263.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3263 commit c115e23ae277875e8b143dfd78f1e8d26e261677 Author: Colin P. MccabeDate: 2017-06-07T21:24:19Z KAFKA-5404: Add more AdminClient checks to ClientCompatibilityTest > Add more AdminClient checks to ClientCompatibilityTest > -- > > Key: KAFKA-5404 > URL: https://issues.apache.org/jira/browse/KAFKA-5404 > Project: Kafka > Issue Type: Bug >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > > Add more AdminClient checks to ClientCompatibilityTest -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5385) Transactional Producer allows batches to expire and commits transactions regardless
[ https://issues.apache.org/jira/browse/KAFKA-5385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16041515#comment-16041515 ] ASF GitHub Bot commented on KAFKA-5385: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3252 > Transactional Producer allows batches to expire and commits transactions > regardless > --- > > Key: KAFKA-5385 > URL: https://issues.apache.org/jira/browse/KAFKA-5385 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.11.0.0 >Reporter: Apurva Mehta >Assignee: Apurva Mehta >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > The transactions system test has revealed a data loss issue. When there is > cluster instability, it can happen that the transactional requests > (AddPartitions, and AddOffsets) can retry for a long time. When they > eventually succeed, the commit message will be dequeued, at which point we > will try to drain the accumulator. However, we would find the batches should > be expired, and just drop them, but commit the transaction anyway. This > causes data loss. > Relevant portion from the producer log is here: > {noformat} > [2017-06-06 01:07:36,275] DEBUG [TransactionalId my-first-transactional-id] > Transition from state IN_TRANSACTION to COMMITTING_TRANSACTION > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-06 01:07:36,275] DEBUG [TransactionalId my-first-transactional-id] > Enqueuing transactional request (type=EndTxnRequest, > transactionalId=my-first-transactional-id, producerId=1001, producerEpoch=0, > result=COMMIT) > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-06 01:07:36,276] TRACE Expired 3 batches in accumulator > (org.apache.kafka.clients.producer.internals.RecordAccumulator) > [2017-06-06 01:07:36,286] TRACE Produced messages to topic-partition > output-topic-0 with base offset offset -1 and error: {}. > (org.apache.kafka.clients.producer.internals.ProducerBatch) > org.apache.kafka.common.errors.TimeoutException: Expiring 250 record(s) for > output-topic-0: 39080 ms has passed since batch creation plus linger time > [2017-06-06 01:07:36,424] TRACE Produced messages to topic-partition > output-topic-1 with base offset offset -1 and error: {}. > (org.apache.kafka.clients.producer.internals.ProducerBatch) > org.apache.kafka.common.errors.TimeoutException: Expiring 250 record(s) for > output-topic-1: 39080 ms has passed since batch creation plus linger time > [2017-06-06 01:07:36,436] TRACE Produced messages to topic-partition > output-topic-2 with base offset offset -1 and error: {}. 
> (org.apache.kafka.clients.producer.internals.ProducerBatch) > org.apache.kafka.common.errors.TimeoutException: Expiring 250 record(s) for > output-topic-2: 39080 ms has passed since batch creation plus linger time > [2017-06-06 01:07:36,444] TRACE [TransactionalId my-first-transactional-id] > Request (type=EndTxnRequest, transactionalId=my-first-transactional-id, > producerId=1001, producerEpoch=0, result=COMMIT) dequeued for sending > (org.apache.kafka.clients.producer.internals.TransactionManager) > [2017-06-06 01:07:36,446] DEBUG [TransactionalId my-first-transactional-id] > Sending transactional request (type=EndTxnRequest, > transactionalId=my-first-transactional-id, producerId=1001, producerEpoch=0, > result=COMMIT) to node knode04:9092 (id: 3 rack: null) > (org.apache.kafka.clients.producer.internals.Sender) > [2017-06-06 01:07:36,449] TRACE [TransactionalId my-first-transactional-id] > Received transactional response EndTxnResponse(error=NOT_COORDINATOR, > throttleTimeMs=0) for request (type=EndTxnRequest, > transactionalId=my-first-transactional-id, producerId=1001, producerEpoch=0, > result=COMMIT) > (org.apache.kafka.clients.producer.internals.TransactionManager) > {noformat} > As you can see, the commit goes ahead even though the batches are never sent. > In this test, we lost 750 messages in the output topic, and they correspond > exactly with the 750 messages in the input topic at the offset in this > portion of the log. > The solution is to either never expire transactional batches, or fail the > transaction if any batches have expired. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
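The fix direction named at the end of KAFKA-5385 ("fail the transaction if any batches have expired") amounts to a pre-commit check. The code below is a hedged sketch with made-up types, not the actual Sender/TransactionManager logic: before an EndTxnRequest with a COMMIT result is sent, the transaction should be failed if any in-flight batch exceeded its delivery timeout.
{code}
// Hypothetical sketch of a pre-commit expiry check; not the real producer internals.
import java.util.List;

public class ExpiredBatchGuard {
    // Minimal stand-in for a producer batch.
    static final class Batch {
        final String topicPartition;
        final long createdMs;
        Batch(String topicPartition, long createdMs) {
            this.topicPartition = topicPartition;
            this.createdMs = createdMs;
        }
    }

    /**
     * Returns true only if no batch has exceeded the timeout; otherwise the caller
     * should abort (or fail) the transaction instead of committing, since committing
     * would silently drop the expired records.
     */
    static boolean safeToCommit(List<Batch> inFlight, long nowMs, long deliveryTimeoutMs) {
        for (Batch batch : inFlight) {
            if (nowMs - batch.createdMs > deliveryTimeoutMs) {
                return false;
            }
        }
        return true;
    }
}
{code}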
[jira] [Commented] (KAFKA-5389) Replace zkClient.exists method with zkUtils.pathExists
[ https://issues.apache.org/jira/browse/KAFKA-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16041384#comment-16041384 ] ASF GitHub Bot commented on KAFKA-5389: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3243 > Replace zkClient.exists method with zkUtils.pathExists > -- > > Key: KAFKA-5389 > URL: https://issues.apache.org/jira/browse/KAFKA-5389 > Project: Kafka > Issue Type: Sub-task > Components: core >Reporter: Balint Molnar >Assignee: Balint Molnar > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5394) KafkaAdminClient#timeoutCallsInFlight does not work as expected
[ https://issues.apache.org/jira/browse/KAFKA-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16041364#comment-16041364 ] ASF GitHub Bot commented on KAFKA-5394: --- Github user ijuma closed the pull request at: https://github.com/apache/kafka/pull/3258 > KafkaAdminClient#timeoutCallsInFlight does not work as expected > --- > > Key: KAFKA-5394 > URL: https://issues.apache.org/jira/browse/KAFKA-5394 > Project: Kafka > Issue Type: Bug > Components: admin >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Fix For: 0.11.0.0 > > > {{KafkaAdminClient#timeoutCallsInFlight}} does not work as expected. The > original idea was that this function would time out a call by closing the > associated socket. Then the following {{NetworkClient#poll}} call would > trigger the call to be removed. However, it turns out that it sometimes > takes time for the {{NetworkClient#poll}} call to return the disconnection > events. This leads to the call lingering for a while, causing us to > repeatedly disconnect connections to that node. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5394) KafkaAdminClient#timeoutCallsInFlight does not work as expected
[ https://issues.apache.org/jira/browse/KAFKA-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16041232#comment-16041232 ] ASF GitHub Bot commented on KAFKA-5394: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3250 > KafkaAdminClient#timeoutCallsInFlight does not work as expected > --- > > Key: KAFKA-5394 > URL: https://issues.apache.org/jira/browse/KAFKA-5394 > Project: Kafka > Issue Type: Bug > Components: admin >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > Fix For: 0.11.0.0 > > > {{KafkaAdminClient#timeoutCallsInFlight}} does not work as expected. The > original idea was that this function would time out a call by closing the > associated socket. Then the following {{NetworkClient#poll}} call would > trigger the call to be removed. However, it turns out that it sometimes > takes time for the {{NetworkClient#poll}} call to return the disconnection > events. This leads to the call lingering for a while, causing us to > repeatedly disconnect connections to that node. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5378) Last Stable Offset not returned in Fetch request
[ https://issues.apache.org/jira/browse/KAFKA-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16041086#comment-16041086 ] ASF GitHub Bot commented on KAFKA-5378: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3248 > Last Stable Offset not returned in Fetch request > > > Key: KAFKA-5378 > URL: https://issues.apache.org/jira/browse/KAFKA-5378 > Project: Kafka > Issue Type: Sub-task > Components: clients, core, producer >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Critical > Fix For: 0.11.0.0 > > > Looks like we didn't update KafkaApis to return the last stable offset in the > fetch response. The consumer doesn't use it for anything at the moment, but > it would still be good to fix for debugging purposes. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5394) KafkaAdminClient#timeoutCallsInFlight does not work as expected
[ https://issues.apache.org/jira/browse/KAFKA-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040999#comment-16040999 ] ASF GitHub Bot commented on KAFKA-5394: --- GitHub user ijuma opened a pull request: https://github.com/apache/kafka/pull/3258 KAFKA-5394; Fix disconnections due to timeouts in AdminClient You can merge this pull request into a Git repository by running: $ git pull https://github.com/ijuma/kafka kafka-5394-admin-client-timeouts Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3258.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3258 > KafkaAdminClient#timeoutCallsInFlight does not work as expected > --- > > Key: KAFKA-5394 > URL: https://issues.apache.org/jira/browse/KAFKA-5394 > Project: Kafka > Issue Type: Bug > Components: admin >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > > {{KafkaAdminClient#timeoutCallsInFlight}} does not work as expected. The > original idea was that this function would time out a call by closing the > associated socket. Then the following {{NetworkClient#poll}} call would > trigger the call to be removed. However, it turns out that it sometimes > takes time for the {{NetworkClient#poll}} call to return the disconnection > events. This leads to the call lingering for a while, causing us to > repeatedly disconnect connections to that node. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5395) Distributed Herder Deadlocks on Shutdown
[ https://issues.apache.org/jira/browse/KAFKA-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040965#comment-16040965 ] ASF GitHub Bot commented on KAFKA-5395: --- Github user rajinisivaram closed the pull request at: https://github.com/apache/kafka/pull/3253 > Distributed Herder Deadlocks on Shutdown > > > Key: KAFKA-5395 > URL: https://issues.apache.org/jira/browse/KAFKA-5395 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 0.10.2.1 >Reporter: Michael Jaschob >Assignee: Rajini Sivaram >Priority: Critical > Fix For: 0.11.0.0, 0.10.2.2 > > Attachments: connect_01021_shutdown_deadlock.txt > > > We're trying to upgrade Kafka Connect to 0.10.2.1 and see that the process > does not shut down cleanly. It hangs instead. From what I can tell > [KAFKA-4786|https://github.com/apache/kafka/commit/ba4eafa7874988374abcd9f48fbab96abb2032a4] > introduced this deadlock. > [close|https://github.com/apache/kafka/blob/0.10.2.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L664] > on the AbstractCoordinator is marked as synchronized and acquires the > coordinator's monitor. The first thing it tries to do is > [join|https://github.com/apache/kafka/blob/0.10.2.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L323] > the heartbeat thread. > Meanwhile, the heartbeat thread is [synchronized on the same > monitor|https://github.com/apache/kafka/blob/0.10.2.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L891], > which it relinquishes when it > [waits|https://github.com/apache/kafka/blob/0.10.2.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L926]. > But for the wait to return (and the run method of the heartbeat to > terminate) it needs to reacquire that monitor. > There's no way for the heartbeat thread to reacquire the monitor since it is > held by the distributed herder thread. And the distributed herder will never > relinquish the monitor since it is waiting for the heartbeat thread to join. > I am attaching a thread dump illustrating the situation. Take note in > particular of threads #178 (the heartbeat thread) and #159 (the herder > thread). The former is BLOCKED trying to reacquire 0x0007406cc0c0, and > the latter is WAITING on the heartbeat thread to join, having itself acquired > 0x0007406cc0c0. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
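The KAFKA-5395 analysis describes a classic join-while-holding-the-monitor deadlock. The toy class below reproduces the pattern with invented names; it is not the AbstractCoordinator code. close() holds the monitor and joins a worker thread, while the worker can only finish after reacquiring that same monitor once its wait() returns.
{code}
// Self-contained toy reproduction of the described deadlock; names are invented.
public class JoinWhileHoldingMonitor {
    private final Object lock = new Object();
    private volatile boolean closed = false;

    private final Thread heartbeat = new Thread(() -> {
        synchronized (lock) {             // same monitor that close() will hold
            while (!closed) {
                try {
                    lock.wait(1000);      // releases the monitor while waiting...
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }                                 // ...but must reacquire it to exit run()
    });

    public void start() {
        heartbeat.start();
    }

    public void close() throws InterruptedException {
        synchronized (lock) {
            closed = true;
            // Deadlock: this thread still holds 'lock', so the heartbeat thread can
            // never reacquire it after wait() returns, and this join() never completes.
            heartbeat.join();
        }
    }
}
{code}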
[jira] [Commented] (KAFKA-5329) Replica list in the metadata cache on the broker may have different order from zookeeper
[ https://issues.apache.org/jira/browse/KAFKA-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040942#comment-16040942 ] ASF GitHub Bot commented on KAFKA-5329: --- GitHub user ijuma opened a pull request: https://github.com/apache/kafka/pull/3257 KAFKA-5329: Fix order of replica list in metadata cache You can merge this pull request into a Git repository by running: $ git pull https://github.com/ijuma/kafka kafka-5329-fix-order-of-replica-list-in-metadata-cache Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3257.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3257 commit 126101d5df7c24f8a3ee720e95cb34e3e4df0efc Author: Ismael JumaDate: 2017-06-07T14:05:03Z KAFKA-5329: Fix order of replica list in metadata cache > Replica list in the metadata cache on the broker may have different order > from zookeeper > > > Key: KAFKA-5329 > URL: https://issues.apache.org/jira/browse/KAFKA-5329 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.2.1 >Reporter: Jiangjie Qin > Labels: newbie > Fix For: 0.11.0.1 > > > It looks like in {{PartitionStateInfo}} we are storing the replicas in a set > instead of a Seq. This causes the replica order to be lost. In most cases it > is fine, but in the context of preferred leader election, the replica order > determines which replica is the preferred leader of a partition. It would be > useful to make the order always be consistent with the order in zookeeper. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5380) Transient test failure: KafkaConsumerTest.testChangingRegexSubscription
[ https://issues.apache.org/jira/browse/KAFKA-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040756#comment-16040756 ] ASF GitHub Bot commented on KAFKA-5380: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3238 > Transient test failure: KafkaConsumerTest.testChangingRegexSubscription > --- > > Key: KAFKA-5380 > URL: https://issues.apache.org/jira/browse/KAFKA-5380 > Project: Kafka > Issue Type: Bug > Components: consumer >Reporter: Rajini Sivaram >Assignee: Rajini Sivaram > Fix For: 0.11.0.0, 0.11.1.0 > > > https://builds.apache.org/job/kafka-trunk-jdk8/1647/testReport/junit/org.apache.kafka.clients.consumer/KafkaConsumerTest/testChangingRegexSubscription/: > {quote} > java.lang.IllegalStateException: Request matcher did not match next-in-line > request > (group_id=mock-group,session_timeout=3,rebalance_timeout=6,member_id=memberId,protocol_type=consumer,group_protocols=[(protocol_name=roundrobin,protocol_metadata=java.nio.HeapByteBuffer(pos=10 > lim=10 cap=10))) > at org.apache.kafka.clients.MockClient.send(MockClient.java:156) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.trySend(ConsumerNetworkClient.java:409) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:223) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:208) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:168) > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:358) > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310) > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:297) > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1051) > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1016) > at > org.apache.kafka.clients.consumer.KafkaConsumerTest.testChangingRegexSubscription(KafkaConsumerTest.java:650) > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-3741) KStream config for changelog min.in.sync.replicas
[ https://issues.apache.org/jira/browse/KAFKA-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040711#comment-16040711 ] ASF GitHub Bot commented on KAFKA-3741: --- GitHub user dguy opened a pull request: https://github.com/apache/kafka/pull/3255 KAFKA-3741: add min.insync.replicas config to Streams Allow users to specify `min.insync.replicas` via StreamsConfig. Default to `null` so that the server setting will be used. If `replication.factor` is lower than `min.insync.replicas` then set `min.insync.replicas` to `replication.factor`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dguy/kafka kafka-3741 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3255.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3255 commit a5f490d0921f5e929eb795b88c7f9b65233c69c7 Author: Damian GuyDate: 2017-06-07T11:04:08Z add min.insync.replicas config > KStream config for changelog min.in.sync.replicas > - > > Key: KAFKA-3741 > URL: https://issues.apache.org/jira/browse/KAFKA-3741 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 0.10.0.0 >Reporter: Roger Hoover >Assignee: Damian Guy > Labels: api > > Kafka Streams currently allows you to specify a replication factor for > changelog and repartition topics that it creates. It should also allow you > to specify min.in.sync.replicas. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
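The clamping rule in the KAFKA-3741 PR description ("if replication.factor is lower than min.insync.replicas then set min.insync.replicas to replication.factor") reduces to a one-line computation. The helper below is illustrative only; the names are not the real StreamsConfig keys or internals.
{code}
// Illustrative helper for the clamping rule; not the actual Streams code.
public final class MinIsrResolution {
    /**
     * Returns the min.insync.replicas value to set on internal topics.
     * A null input means "leave it unset and fall back to the broker default".
     */
    static Integer effectiveMinIsr(Integer configuredMinIsr, int replicationFactor) {
        if (configuredMinIsr == null) {
            return null;
        }
        return Math.min(configuredMinIsr, replicationFactor);
    }

    public static void main(String[] args) {
        System.out.println(effectiveMinIsr(3, 2));    // clamped to 2
        System.out.println(effectiveMinIsr(null, 2)); // null: use the broker setting
    }
}
{code}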
[jira] [Commented] (KAFKA-5366) Add cases for concurrent transactional reads and writes in system tests
[ https://issues.apache.org/jira/browse/KAFKA-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039916#comment-16039916 ] ASF GitHub Bot commented on KAFKA-5366: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3217 > Add cases for concurrent transactional reads and writes in system tests > --- > > Key: KAFKA-5366 > URL: https://issues.apache.org/jira/browse/KAFKA-5366 > Project: Kafka > Issue Type: Test >Affects Versions: 0.11.0.0 >Reporter: Apurva Mehta >Assignee: Apurva Mehta >Priority: Blocker > Labels: exactly-once > Fix For: 0.11.0.0 > > > Currently the transactions system test does a transactional copy while > bouncing brokers and clients, and then does a verifying read on the output > topic to ensure that it exactly matches the input. > We should also have a transactional consumer reading the tail of the output > topic as the writes are happening, and then assert that the values _it_ reads > also exactly match the values in the source topics. > This test really exercises the abort index, and we don't have any of them in > the system or integration tests right now. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-4942) Kafka Connect: Offset committing times out before expected
[ https://issues.apache.org/jira/browse/KAFKA-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039681#comment-16039681 ] ASF GitHub Bot commented on KAFKA-4942: --- Github user simplesteph closed the pull request at: https://github.com/apache/kafka/pull/2730 > Kafka Connect: Offset committing times out before expected > -- > > Key: KAFKA-4942 > URL: https://issues.apache.org/jira/browse/KAFKA-4942 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 0.10.2.0 >Reporter: Stephane Maarek > Fix For: 0.11.0.0, 0.11.1.0 > > > On Kafka 0.10.2.0 > I run a connector that deals with a lot of data, in a Kafka Connect cluster. > When the offsets are getting committed, I get the following: > {code} > [2017-03-23 03:56:25,134] INFO WorkerSinkTask{id=MyConnector-1} Committing > offsets (org.apache.kafka.connect.runtime.WorkerSinkTask) > [2017-03-23 03:56:25,135] WARN Commit of WorkerSinkTask{id=MyConnector-1} > offsets timed out (org.apache.kafka.connect.runtime.WorkerSinkTask) > {code} > If you look at the timestamps, they're 1 ms apart. My settings are the > following: > {code} > offset.flush.interval.ms = 12 > offset.flush.timeout.ms = 6 > offset.storage.topic = _connect_offsets > {code} > It seems the offset flush timeout setting is completely ignored, from the look > of the logs. I would expect the timeout message to happen 60 seconds after > the commit offset INFO message, not 1 millisecond later. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5394) KafkaAdminClient#timeoutCallsInFlight does not work as expected
[ https://issues.apache.org/jira/browse/KAFKA-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039677#comment-16039677 ] ASF GitHub Bot commented on KAFKA-5394: --- GitHub user cmccabe opened a pull request: https://github.com/apache/kafka/pull/3250 KAFKA-5394. KafkaAdminClient#timeoutCallsInFlight does not work as ex… …pected * Rename KafkaClient#close to KafkaClient#forget to emphasize that it forgets the requests on a given connection. * Create KafkaClient#disconnect to tear down a connection and deliver disconnects to all the requests on it. * AdminClient.java: fix mismatched braces in JavaDoc. * Make the AdminClientConfig constructor visible for testing. * KafkaAdminClient: add TimeoutProcessorFactory to make the TimeoutProcessor swappable for testing. * Make TimeoutProcessor a static class rather than an inner class. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cmccabe/kafka KAFKA-5394 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3250.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3250 commit fb3905a69bad5e4c37c28c64fdfb2ab786f7b4ed Author: Colin P. MccabeDate: 2017-06-06T21:19:31Z KAFKA-5394. KafkaAdminClient#timeoutCallsInFlight does not work as expected * Rename KafkaClient#close to KafkaClient#forget to emphasize that it forgets the requests on a given connection. * Create KafkaClient#disconnect to tear down a connection and deliver disconnects to all the requests on it. * AdminClient.java: fix mismatched braces in JavaDoc. * Make the AdminClientConfig constructor visible for testing. * KafkaAdminClient: add TimeoutProcessorFactory to make the TimeoutProcessor swappable for testing. * Make TimeoutProcessor a static class rather than an inner class. > KafkaAdminClient#timeoutCallsInFlight does not work as expected > --- > > Key: KAFKA-5394 > URL: https://issues.apache.org/jira/browse/KAFKA-5394 > Project: Kafka > Issue Type: Bug > Components: admin >Reporter: Colin P. McCabe >Assignee: Colin P. McCabe > > {{KafkaAdminClient#timeoutCallsInFlight}} does not work as expected. The > original idea was that this function would time out a call by closing the > associated socket. Then the following {{NetworkClient#poll}} call would > trigger the call to be removed. However, it turns out that it sometimes > takes time for the {{NetworkClient#poll}} call to return the disconnection > events. This leads to the call lingering for a while, causing us to > repeatedly disconnect connections to that node. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
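The first two bullets of the KAFKA-5394 PR description draw a distinction between forgetting the requests on a connection and actively delivering disconnects to them. The sketch below is hypothetical (the types are not the real KafkaClient/NetworkClient API); it only illustrates why failing pending calls eagerly avoids waiting for a later poll() to surface the disconnection.
{code}
// Hypothetical sketch of "forget" vs "disconnect" semantics; not the real client API.
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

public class ConnectionCalls {
    private final Deque<Consumer<Exception>> inFlight = new ArrayDeque<>();

    void send(Consumer<Exception> completion) {
        inFlight.add(completion);
    }

    /** "forget": drop pending calls without notifying them (the old close() behaviour). */
    void forget() {
        inFlight.clear();
    }

    /**
     * "disconnect": fail every pending call immediately, so a timed-out call does not
     * linger until a later poll() reports the broken connection.
     */
    void disconnect() {
        while (!inFlight.isEmpty()) {
            inFlight.poll().accept(new Exception("connection closed: request timed out"));
        }
    }
}
{code}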
[jira] [Commented] (KAFKA-4942) Kafka Connect: Offset committing times out before expected
[ https://issues.apache.org/jira/browse/KAFKA-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039671#comment-16039671 ] ASF GitHub Bot commented on KAFKA-4942: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/2912 > Kafka Connect: Offset committing times out before expected > -- > > Key: KAFKA-4942 > URL: https://issues.apache.org/jira/browse/KAFKA-4942 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 0.10.2.0 >Reporter: Stephane Maarek > Fix For: 0.11.0.0, 0.11.1.0 > > > On Kafka 0.10.2.0 > I run a connector that deals with a lot of data, in a Kafka Connect cluster. > When the offsets are getting committed, I get the following: > {code} > [2017-03-23 03:56:25,134] INFO WorkerSinkTask{id=MyConnector-1} Committing > offsets (org.apache.kafka.connect.runtime.WorkerSinkTask) > [2017-03-23 03:56:25,135] WARN Commit of WorkerSinkTask{id=MyConnector-1} > offsets timed out (org.apache.kafka.connect.runtime.WorkerSinkTask) > {code} > If you look at the timestamps, they're 1 ms apart. My settings are the > following: > {code} > offset.flush.interval.ms = 12 > offset.flush.timeout.ms = 6 > offset.storage.topic = _connect_offsets > {code} > It seems the offset flush timeout setting is completely ignored, from the look > of the logs. I would expect the timeout message to happen 60 seconds after > the commit offset INFO message, not 1 millisecond later. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
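The complaint in KAFKA-4942 is that the "offsets timed out" warning fires 1 ms after the commit starts instead of after offset.flush.timeout.ms has elapsed. Below is a minimal sketch of the expected deadline check; the names are illustrative, and the 60-second value is taken from the reporter's stated expectation rather than from the (garbled) config block above.
{code}
// Minimal sketch of the expected behaviour; names and the 60s value are illustrative.
public class FlushDeadline {
    /** The warning should only fire once the configured timeout has actually elapsed. */
    static boolean commitTimedOut(long flushStartMs, long nowMs, long flushTimeoutMs) {
        return nowMs - flushStartMs >= flushTimeoutMs;
    }

    public static void main(String[] args) {
        long timeoutMs = 60_000L; // e.g. offset.flush.timeout.ms = 60000
        System.out.println(commitTimedOut(0L, 1L, timeoutMs));      // false: only 1 ms elapsed
        System.out.println(commitTimedOut(0L, 60_001L, timeoutMs)); // true: deadline passed
    }
}
{code}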