[GitHub] [kafka] guozhangwang commented on pull request #8924: KAFKA-10198: guard against recycling dirty state

2020-06-24 Thread GitBox


guozhangwang commented on pull request #8924:
URL: https://github.com/apache/kafka/pull/8924#issuecomment-649199285


   Merged to trunk and cherry-picked to 2.6



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] michael-carter-instaclustr commented on pull request #8844: KAFKA-9887 fix failed task or connector count on startup failure

2020-06-24 Thread GitBox


michael-carter-instaclustr commented on pull request #8844:
URL: https://github.com/apache/kafka/pull/8844#issuecomment-649196880


   I've made changes to those tests now @C0urante. A couple of things worth 
noting: changing the mock of the WorkMetricsGroup to a real object requires a fair 
few more mocks that relate to each other, so I've organised those into a 
@Before method instead of injecting them via a @Mock annotation; I hope that's 
okay. Doing so had the benefit that the tests are now more explicit about the 
values of the metrics being recorded, which may make the deletion of lines in 
WorkerTest more palatable. Looking at the tests I'm modifying there, most 
of them already have expectations on the statusListener being 
called, so I think in that case it's fair to say that the responsibility has 
simply moved to the WorkerGroupMetrics class (and therefore the 
WorkerGroupMetricsTest). For the test that doesn't seem to have any 
expectations on the status listener (testAddRemoveTask), I believe this might 
be because it's the WorkerTask's job to call the status listener once it's 
running, but that aspect is mocked away in that particular test.
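
   For illustration only, a minimal, self-contained sketch of the @Before-wiring 
approach described above. The metrics class here is a hypothetical stand-in, not 
Connect's actual WorkerMetricsGroup, and the EasyMock collaborators are omitted 
for brevity:

import static org.junit.Assert.assertEquals;

import org.junit.Before;
import org.junit.Test;

public class ExampleMetricsGroupTest {

    // Minimal stand-in for a metrics group that counts task startup failures.
    static final class ExampleMetricsGroup {
        private int taskStartupFailures = 0;
        void recordTaskFailure() { taskStartupFailures++; }
        int taskStartupFailures() { return taskStartupFailures; }
    }

    private ExampleMetricsGroup metricsGroup;

    @Before
    public void setup() {
        // Build the real object (and any collaborators it needs) by hand in @Before,
        // instead of relying on @Mock field injection, so the test can assert on
        // the actual recorded metric values.
        metricsGroup = new ExampleMetricsGroup();
    }

    @Test
    public void shouldCountStartupFailures() {
        metricsGroup.recordTaskFailure();
        assertEquals(1, metricsGroup.taskStartupFailures());
    }
}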



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-10198) Dirty tasks may be recycled instead of closed

2020-06-24 Thread Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144625#comment-17144625
 ] 

Sophie Blee-Goldman commented on KAFKA-10198:
-

[~rhauch] the fix has been merged and picked to the 2.6 branch

> Dirty tasks may be recycled instead of closed
> -
>
> Key: KAFKA-10198
> URL: https://issues.apache.org/jira/browse/KAFKA-10198
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Sophie Blee-Goldman
>Assignee: Sophie Blee-Goldman
>Priority: Blocker
> Fix For: 2.6.0
>
>
> We recently added a guard to `Task#closeClean` to make sure we don't 
> accidentally clean-close a dirty task, but we forgot to also add this check 
> to `Task#closeAndRecycleState`. This meant an otherwise dirty task could be 
> closed clean and recycled into a new task when it should have just been 
> closed.
> This manifests as an NPE in our test application. Specifically, task 1_0 was 
> active on StreamThread-2 but reassigned as a standby. During handleRevocation 
> we hit a TaskMigratedException while flushing the tasks and bailed on trying 
> to flush and commit the remainder. This left task 1_0 with dirty keys in the 
> suppression buffer and the `commitNeeded` flag still set to true.
> During handleAssignment, we should have closed all the tasks with pending 
> state as dirty (ie any task with commitNeeded = true). Since we don't know 
> about the TaskMigratedException we hit during handleRevocation, we rely on 
> the guard in `Task#closeClean` to throw an exception and force the task to be 
> closed dirty.
> Unfortunately, we left this guard out of `closeAndRecycleState`, which meant 
> task 1_0 was able to slip through without being closed dirty. Once 
> reinitialized as a standby task, we eventually tried to commit it. The 
> suppression buffer of course tried to flush its remaining dirty keys from its 
> previous life as an active task. But since it's now a standby task, it should 
> not be sending anything to the changelog and has a null RecordCollector. We 
> tried to access it, and hit the NPE.
>  
> The fix is simple: we just need to add the same guard from closeClean to 
> closeAndRecycleState as well.
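
For context, a minimal, self-contained sketch of the guard described above. This is an 
illustration only, not the actual StreamTask code; the real guard throws a 
TaskMigratedException rather than the plain RuntimeException used here to keep the 
sketch dependency-free:

{code:java}
// Illustration of the fix: both "clean" paths share the same uncommitted-work check.
class TaskSketch {
    private boolean commitNeeded;

    private void validateClean() {
        if (commitNeeded) {
            // Force the caller (the TaskManager) to fall back to a dirty close;
            // the real code throws TaskMigratedException here.
            throw new RuntimeException("Tried to close dirty task as clean");
        }
    }

    void closeClean() {
        validateClean();              // guard that already existed
        // ... flush, checkpoint, release the state directory lock ...
    }

    void closeCleanAndRecycleState() {
        validateClean();              // the guard this ticket adds
        // ... hand the state stores over to the recycled (standby) task ...
    }
}
{code}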



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] abbccdda commented on a change in pull request #8712: KAFKA-10006: Don't create internal topics when LeaderNotAvailableException

2020-06-24 Thread GitBox


abbccdda commented on a change in pull request #8712:
URL: https://github.com/apache/kafka/pull/8712#discussion_r445277549



##
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java
##
@@ -179,31 +180,43 @@ public InternalTopicManager(final Admin adminClient, 
final StreamsConfig streams
  * Topics that were not able to get its description will simply not be 
returned
  */
 // visible for testing
-protected Map getNumPartitions(final Set topics) {
-log.debug("Trying to check if topics {} have been created with 
expected number of partitions.", topics);
-
-final DescribeTopicsResult describeTopicsResult = 
adminClient.describeTopics(topics);
+protected Map getNumPartitions(final Set topics,
+final HashSet 
tempUnknownTopics,
+final int 
remainingRetries) {

Review comment:
   We could just pass in a boolean here to indicate whether there are 
remaining retries

##
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java
##
@@ -98,9 +98,10 @@ public InternalTopicManager(final Admin adminClient, final 
StreamsConfig streams
 int remainingRetries = retries;
 Set topicsNotReady = new HashSet<>(topics.keySet());
 final Set newlyCreatedTopics = new HashSet<>();
+final HashSet tempUnknownTopics = new HashSet<>();

Review comment:
   s/HashSet/Set?

##
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java
##
@@ -243,10 +259,18 @@ public InternalTopicManager(final Admin adminClient, 
final StreamsConfig streams
 throw new StreamsException(errorMsg);
 }
 } else {
-topicsToCreate.add(topicName);
+// for the tempUnknownTopics, we'll check again later if 
retries > 0

Review comment:
   Could be merged with above `else`

##
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java
##
@@ -179,31 +180,43 @@ public InternalTopicManager(final Admin adminClient, 
final StreamsConfig streams
  * Topics that were not able to get its description will simply not be 
returned
  */
 // visible for testing
-protected Map getNumPartitions(final Set topics) {
-log.debug("Trying to check if topics {} have been created with 
expected number of partitions.", topics);
-
-final DescribeTopicsResult describeTopicsResult = 
adminClient.describeTopics(topics);
+protected Map getNumPartitions(final Set topics,
+final HashSet 
tempUnknownTopics,
+final int 
remainingRetries) {
+final Set allTopicsToDescribe = new HashSet<>(topics);
+allTopicsToDescribe.addAll(tempUnknownTopics);

Review comment:
   Why do we need `allTopicsToDescribe`? It seems only queried once locally.

##
File path: 
streams/src/test/java/org/apache/kafka/streams/processor/internals/InternalTopicManagerTest.java
##
@@ -287,12 +291,41 @@ public void 
shouldLogWhenTopicNotFoundAndNotThrowException() {
 
 assertThat(
 appender.getMessages(),
-hasItem("stream-thread [" + threadName + "] Topic 
internal-topic is unknown or not found, hence not existed yet:" +
-" 
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: Topic 
internal-topic not found.")
+hasItem("stream-thread [" + threadName + "] Topic 
internal-topic is unknown or not found, hence not existed yet.\n" +
+"Error message was: 
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: Topic 
internal-topic not found.")
 );
 }
 }
 
+@Test
+public void shouldLogWhenTopicLeaderNotAvailableAndThrowException() {
+final String leaderNotAvailableTopic = "LeaderNotAvailableTopic";
+final AdminClient admin = EasyMock.createNiceMock(AdminClient.class);
+final InternalTopicManager topicManager = new 
InternalTopicManager(admin, new StreamsConfig(config));
+
+final KafkaFutureImpl topicDescriptionFailFuture = 
new KafkaFutureImpl<>();
+topicDescriptionFailFuture.completeExceptionally(new 
LeaderNotAvailableException("Leader Not Available!"));
+
+// simulate describeTopics got LeaderNotAvailableException
+
EasyMock.expect(admin.describeTopics(Collections.singleton(leaderNotAvailableTopic)))
+.andReturn(new MockDescribeTopicsResult(

Review comment:
   Use 4 space format to align with other tests.

##
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java

[jira] [Updated] (KAFKA-10017) Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta

2020-06-24 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-10017:

Priority: Critical  (was: Blocker)

> Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta
> ---
>
> Key: KAFKA-10017
> URL: https://issues.apache.org/jira/browse/KAFKA-10017
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.6.0
>Reporter: Sophie Blee-Goldman
>Assignee: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test, unit-test
> Fix For: 2.6.0
>
>
> Creating a new ticket for this since the root cause is different than 
> https://issues.apache.org/jira/browse/KAFKA-9966
> With injectError = true:
> h3. Stacktrace
> java.lang.AssertionError: Did not receive all 20 records from topic 
> multiPartitionOutputTopic within 6 ms Expected: is a value equal to or 
> greater than <20> but: <15> was less than <20> at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:563)
>  at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:429)
>  at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:397)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:559)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:530)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.readResult(EosBetaUpgradeIntegrationTest.java:973)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.verifyCommitted(EosBetaUpgradeIntegrationTest.java:961)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta(EosBetaUpgradeIntegrationTest.java:427)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10017) Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta

2020-06-24 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144590#comment-17144590
 ] 

Matthias J. Sax commented on KAFKA-10017:
-

The test is still subject to other bugs that we are currently working on. So 
it's hard to say atm. Feel free to cut an RC right away (I updated the ticket to 
critical for now). However, if this test surfaces another bug, we might kill an 
RC.

> Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta
> ---
>
> Key: KAFKA-10017
> URL: https://issues.apache.org/jira/browse/KAFKA-10017
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.6.0
>Reporter: Sophie Blee-Goldman
>Assignee: Matthias J. Sax
>Priority: Blocker
>  Labels: flaky-test, unit-test
> Fix For: 2.6.0
>
>
> Creating a new ticket for this since the root cause is different than 
> https://issues.apache.org/jira/browse/KAFKA-9966
> With injectError = true:
> h3. Stacktrace
> java.lang.AssertionError: Did not receive all 20 records from topic 
> multiPartitionOutputTopic within 6 ms Expected: is a value equal to or 
> greater than <20> but: <15> was less than <20> at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:563)
>  at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:429)
>  at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:397)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:559)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:530)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.readResult(EosBetaUpgradeIntegrationTest.java:973)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.verifyCommitted(EosBetaUpgradeIntegrationTest.java:961)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta(EosBetaUpgradeIntegrationTest.java:427)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] guozhangwang merged pull request #8924: KAFKA-10198: guard against recycling dirty state

2020-06-24 Thread GitBox


guozhangwang merged pull request #8924:
URL: https://github.com/apache/kafka/pull/8924


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] ableegoldman commented on a change in pull request #8926: KAFKA-10166: always invoke `postCommit` before closing a task

2020-06-24 Thread GitBox


ableegoldman commented on a change in pull request #8926:
URL: https://github.com/apache/kafka/pull/8926#discussion_r445262784



##
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java
##
@@ -813,27 +821,6 @@ void shutdown(final boolean clean) {
 tasksToCloseDirty.add(task);
 }
 }
-
-for (final Task task : tasksToCommit) {

Review comment:
   We can actually simplify the standby task shutdown a LOT





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] ableegoldman commented on a change in pull request #8926: KAFKA-10166: always invoke `postCommit` before closing a task

2020-06-24 Thread GitBox


ableegoldman commented on a change in pull request #8926:
URL: https://github.com/apache/kafka/pull/8926#discussion_r445262369



##
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java
##
@@ -242,18 +242,16 @@ public void handleAssignment(final Map> activeTasks,
 
 for (final Task task : tasksToClose) {
 try {
-task.suspend(); // Should be a no-op for active tasks since 
they're suspended in handleRevocation
-if (task.commitNeeded()) {
-if (task.isActive()) {
-log.error("Active task {} was revoked and should have 
already been committed", task.id());
-throw new IllegalStateException("Revoked active task 
was not committed during handleRevocation");

Review comment:
   This was another "sort-of bug": if we hit an exception in 
`handleRevocation` we wouldn't finish committing the active tasks, so 
`commitNeeded` could still be true. But of course, if we hit an exception 
earlier, we would have thrown it up to ConsumerCoordinator which would only 
save the first exception, so this didn't really do anything





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Reopened] (KAFKA-8197) Flaky Test kafka.server.DynamicBrokerConfigTest > testPasswordConfigEncoderSecretChange

2020-06-24 Thread Sophie Blee-Goldman (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sophie Blee-Goldman reopened KAFKA-8197:


Saw this fail again:

kafka.server.DynamicBrokerConfigTest > testPasswordConfigEncoderSecretChange FAILED
16:33:33 org.junit.ComparisonFailure: expected:<[staticLoginModule required;]> but was:<[u`??2?e;%h>r/???8e
16:33:33 at org.junit.Assert.assertEquals(Assert.java:117)
16:33:33 at org.junit.Assert.assertEquals(Assert.java:146)
16:33:33 at kafka.server.DynamicBrokerConfigTest.testPasswordConfigEncoderSecretChange(DynamicBrokerConfigTest.scala:309)

> Flaky Test kafka.server.DynamicBrokerConfigTest > 
> testPasswordConfigEncoderSecretChange
> ---
>
> Key: KAFKA-8197
> URL: https://issues.apache.org/jira/browse/KAFKA-8197
> Project: Kafka
>  Issue Type: Improvement
>  Components: core, unit tests
>Affects Versions: 1.1.1
>Reporter: Guozhang Wang
>Priority: Major
>
> {code}
> 09:18:23 kafka.server.DynamicBrokerConfigTest > 
> testPasswordConfigEncoderSecretChange FAILED
> 09:18:23 org.junit.ComparisonFailure: expected:<[staticLoginModule 
> required;]> but was:<[????O?i???A?c'??Ch?|?p]>
> 09:18:23 at org.junit.Assert.assertEquals(Assert.java:115)
> 09:18:23 at org.junit.Assert.assertEquals(Assert.java:144)
> 09:18:23 at 
> kafka.server.DynamicBrokerConfigTest.testPasswordConfigEncoderSecretChange(DynamicBrokerConfigTest.scala:253)
> {code}
> https://builds.apache.org/job/kafka-pr-jdk7-scala2.11/13466/consoleFull



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] ableegoldman commented on pull request #8924: KAFKA-10198: guard against recycling dirty state

2020-06-24 Thread GitBox


ableegoldman commented on pull request #8924:
URL: https://github.com/apache/kafka/pull/8924#issuecomment-649162622


   Two unrelated test failures:
   `MirrorConnectorsIntegrationTest.testReplication`
   `DynamicBrokerConfigTest > testPasswordConfigEncoderSecretChange`



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] ableegoldman opened a new pull request #8926: KAFKA-10166: always invoke `postCommit` before closing a task

2020-06-24 Thread GitBox


ableegoldman opened a new pull request #8926:
URL: https://github.com/apache/kafka/pull/8926


   This should address at least some of the excessive TaskCorruptedExceptions 
we've been seeing lately. Basically, at the moment we only commit tasks if 
`commitNeeded` is true -- this seems true by definition. But the problem is we 
do some essential cleanup in `postCommit` that should always be done before a 
task is closed:
   
   1. clear the PartitionGroup
   2. write the checkpoint
   
   2 is actually fine to skip when `commitNeeded = false` with ALOS, as we will 
have already written a checkpoint during the last commit. But for EOS, we 
_only_ write the checkpoint before a close -- so even if there is no new 
pending data since the last commit, we have to write the current offsets. If we 
don't, the task will be assumed dirty and we will run into our friend the 
TaskCorruptedException during (re)initialization.
   
   To fix this, we should just always call `prepareCommit` and `postCommit` at 
the TaskManager level. Within the task, it can decide whether or not to 
actually do something in those methods based on `commitNeeded`. 
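
   For illustration, a minimal, dependency-free sketch of that shape (an approximation 
of the idea, not the actual TaskManager/Task code):

interface CommittableTask {
    boolean commitNeeded();
    void prepareCommit();   // the task may make this a no-op when commitNeeded() is false
    void postCommit();      // clears the PartitionGroup, writes the checkpoint
    void closeClean();
}

final class TaskManagerSketch {
    void closeTaskClean(final CommittableTask task) {
        // Always drive the full commit lifecycle before closing, even when
        // commitNeeded() is false, so an EOS task still writes its checkpoint
        // and is not treated as corrupted on (re)initialization.
        task.prepareCommit();
        task.postCommit();
        task.closeClean();
    }
}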



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] guozhangwang commented on pull request #8925: KAFKA-9974: Make produce-sync flush

2020-06-24 Thread GitBox


guozhangwang commented on pull request #8925:
URL: https://github.com/apache/kafka/pull/8925#issuecomment-649157606


   test this



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] guozhangwang opened a new pull request #8925: KAFKA-9974: Make produce-sync flush

2020-06-24 Thread GitBox


guozhangwang opened a new pull request #8925:
URL: https://github.com/apache/kafka/pull/8925


   I cannot actually reproduce the failure locally, but by looking at the code 
I think there's an issue in `produceKeyValuesSynchronously`: when EOS is not 
enabled, we need to call `flush` to make sure all records are indeed sent 
"synchronously". If EOS is enabled, the `commitTxn` would flush the records 
already.
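
   For illustration, a hedged sketch of that idea (not the exact IntegrationTestUtils 
code; the helper name and shape are assumptions, and when transactions are used the 
producer is assumed to have had initTransactions() called already):

import java.util.List;

import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.streams.KeyValue;

final class SyncProduceSketch {
    static <K, V> void produceSynchronously(final Producer<K, V> producer,
                                            final String topic,
                                            final List<KeyValue<K, V>> records,
                                            final boolean eosEnabled) {
        if (eosEnabled) {
            producer.beginTransaction();
        }
        for (final KeyValue<K, V> record : records) {
            producer.send(new ProducerRecord<>(topic, record.key, record.value));
        }
        if (eosEnabled) {
            producer.commitTransaction();  // committing the transaction already flushes
        } else {
            producer.flush();              // otherwise records may still sit in the accumulator
        }
    }
}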
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] guozhangwang removed a comment on pull request #8925: KAFKA-9974: Make produce-sync flush

2020-06-24 Thread GitBox


guozhangwang removed a comment on pull request #8925:
URL: https://github.com/apache/kafka/pull/8925#issuecomment-649157091


   test this



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] guozhangwang commented on pull request #8925: KAFKA-9974: Make produce-sync flush

2020-06-24 Thread GitBox


guozhangwang commented on pull request #8925:
URL: https://github.com/apache/kafka/pull/8925#issuecomment-649157235


   test this



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] guozhangwang commented on pull request #8925: KAFKA-9974: Make produce-sync flush

2020-06-24 Thread GitBox


guozhangwang commented on pull request #8925:
URL: https://github.com/apache/kafka/pull/8925#issuecomment-649157091







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-10173) BufferUnderflowException during Kafka Streams Upgrade

2020-06-24 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144573#comment-17144573
 ] 

Boyang Chen commented on KAFKA-10173:
-

cc [~rhauch]

> BufferUnderflowException during Kafka Streams Upgrade
> -
>
> Key: KAFKA-10173
> URL: https://issues.apache.org/jira/browse/KAFKA-10173
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Karsten Schnitter
>Assignee: John Roesler
>Priority: Blocker
>  Labels: suppress
> Fix For: 2.6.0, 2.4.2, 2.5.1
>
>
> I migrated a Kafka Streams application from version 2.3.1 to 2.5.0. I 
> followed the steps described in the upgrade guide and set the property 
> {{migrate.from=2.3}}. On my dev system with just one running instance I got 
> the following exception:
> {noformat}
> stream-thread [0-StreamThread-2] Encountered the following error during 
> processing:
> java.nio.BufferUnderflowException: null
>   at java.base/java.nio.HeapByteBuffer.get(Unknown Source)
>   at java.base/java.nio.ByteBuffer.get(Unknown Source)
>   at 
> org.apache.kafka.streams.state.internals.BufferValue.extractValue(BufferValue.java:94)
>   at 
> org.apache.kafka.streams.state.internals.BufferValue.deserialize(BufferValue.java:83)
>   at 
> org.apache.kafka.streams.state.internals.InMemoryTimeOrderedKeyValueBuffer.restoreBatch(InMemoryTimeOrderedKeyValueBuffer.java:368)
>   at 
> org.apache.kafka.streams.processor.internals.CompositeRestoreListener.restoreBatch(CompositeRestoreListener.java:89)
>   at 
> org.apache.kafka.streams.processor.internals.StateRestorer.restore(StateRestorer.java:92)
>   at 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.processNext(StoreChangelogReader.java:350)
>   at 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(StoreChangelogReader.java:94)
>   at 
> org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:401)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:779)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:697)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:670)
> {noformat}
> I figured out, that this problem only occurs for stores, where I use the 
> suppress feature. If I rename the changelog topics during the migration, the 
> problem will not occur. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10017) Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta

2020-06-24 Thread Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144567#comment-17144567
 ] 

Sophie Blee-Goldman commented on KAFKA-10017:
-

cc [~mjsax]

> Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta
> ---
>
> Key: KAFKA-10017
> URL: https://issues.apache.org/jira/browse/KAFKA-10017
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.6.0
>Reporter: Sophie Blee-Goldman
>Assignee: Matthias J. Sax
>Priority: Blocker
>  Labels: flaky-test, unit-test
> Fix For: 2.6.0
>
>
> Creating a new ticket for this since the root cause is different than 
> https://issues.apache.org/jira/browse/KAFKA-9966
> With injectError = true:
> h3. Stacktrace
> java.lang.AssertionError: Did not receive all 20 records from topic 
> multiPartitionOutputTopic within 6 ms Expected: is a value equal to or 
> greater than <20> but: <15> was less than <20> at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:563)
>  at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:429)
>  at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:397)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:559)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:530)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.readResult(EosBetaUpgradeIntegrationTest.java:973)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.verifyCommitted(EosBetaUpgradeIntegrationTest.java:961)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta(EosBetaUpgradeIntegrationTest.java:427)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10166) Excessive TaskCorruptedException seen in testing

2020-06-24 Thread Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144562#comment-17144562
 ] 

Sophie Blee-Goldman commented on KAFKA-10166:
-

I think this should be a blocker, yes. It's a regression in 2.6 and causes 
Streams to unnecessarily rebuild state from the changelog which can mean a very 
long stall.

One root cause just occurred to me while looking at some related code, which 
I'll open a PR for right away. I'm not sure it's the _only_ root cause but I'll 
begin testing right away to see if it fixes the majority of the problem or not.

[~cadonna] do you want to split up this ticket? There are two kinds of 
TaskCorruptedException, both of which we see more than expected. It probably 
makes sense to look at these individually and in parallel. Can you look into 
the TaskCorruptedException thrown in StoreChangelogReader#restore? I'll 
investigate my theory for the exceptions thrown in ProcessorStateManager.

> Excessive TaskCorruptedException seen in testing
> 
>
> Key: KAFKA-10166
> URL: https://issues.apache.org/jira/browse/KAFKA-10166
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Sophie Blee-Goldman
>Assignee: Bruno Cadonna
>Priority: Blocker
> Fix For: 2.6.0
>
>
> As the title indicates, long-running test applications with injected network 
> "outages" seem to hit TaskCorruptedException more than expected.
> Seen occasionally on the ALOS application (~20 times in two days in one case, 
> for example), and very frequently with EOS (many times per day)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] guozhangwang commented on pull request #8924: KAFKA-10198: guard against recycling dirty state

2020-06-24 Thread GitBox


guozhangwang commented on pull request #8924:
URL: https://github.com/apache/kafka/pull/8924#issuecomment-649137974


   LGTM.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] junrao commented on a change in pull request #8479: KAFKA-9769: Finish operations for leaderEpoch-updated partitions up to point ZK Exception

2020-06-24 Thread GitBox


junrao commented on a change in pull request #8479:
URL: https://github.com/apache/kafka/pull/8479#discussion_r445234702



##
File path: core/src/main/scala/kafka/server/ReplicaManager.scala
##
@@ -1556,6 +1557,11 @@ class ReplicaManager(val config: KafkaConfig,
 error(s"Error while making broker the follower for partition 
$partition with leader " +
   s"$newLeaderBrokerId in dir $dirOpt", e)
 responseMap.put(partition.topicPartition, 
Errors.KAFKA_STORAGE_ERROR)
+  case e: ZooKeeperClientException =>

Review comment:
   It's probably better to do this in Partition.makeFollower() instead of 
here. That way, we only skip partitions that have incurred ZK error.
   
   Also, the same ZK exception can happen in Partition.makeLeader(). So, we 
want to do the same thing there as well.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-10173) BufferUnderflowException during Kafka Streams Upgrade

2020-06-24 Thread John Roesler (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144485#comment-17144485
 ] 

John Roesler commented on KAFKA-10173:
--

Ok, I'm horrified and embarrassed to report that I know what the problem is, as 
well as the solution. I'll update my PR tomorrow, and I'll re-escalate this 
ticket to a blocker for 2.5.1 and 2.6.0.

In a nutshell, although we have a version number on these changelog records so 
that we can deserialize old formats, I subtly changed the serialization format 
in 2.4.0 without updating the version number, so although the serialization 
format is incompatible between 2.3.1 and 2.5.0, they both claim to be at 
"version 2" (I was mistaken about this before).

This mistake also rendered the serialization compatibility test I had written 
ineffectual.
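
For readers unfamiliar with the pattern, a generic, self-contained illustration of 
the version-prefixed records mentioned above and the failure mode they guard against; 
this is NOT the actual suppression-buffer changelog format, just a sketch of the 
general idea:

{code:java}
import java.nio.ByteBuffer;

final class VersionedPayload {

    static ByteBuffer serialize(final byte version, final byte[] payload) {
        return (ByteBuffer) ByteBuffer.allocate(1 + payload.length)
                .put(version)
                .put(payload)
                .flip();
    }

    static byte[] deserialize(final ByteBuffer buffer) {
        final byte version = buffer.get();
        switch (version) {
            case 1:
            case 2:
                // If the byte layout changes but the version byte does not, this branch
                // reads the new bytes with the old rules and can underflow -- which is
                // the failure mode described in this ticket.
                final byte[] payload = new byte[buffer.remaining()];
                buffer.get(payload);
                return payload;
            default:
                throw new IllegalArgumentException("Unknown version " + version);
        }
    }
}
{code}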

And I've also just discovered that our system test that covers application 
upgrades had suffered an oversight that made it skip these versions.

Needless to say, in addition to fixing this bug, I'm fixing the serialization 
test to avoid using the same code paths as the application, and I'm also 
revamping the upgrade system test to be sure we'll have a much more robust test 
going forward.

Although this bug was introduced in 2.4.0, I'd still classify it as a blocker, 
since it's so severe (you simply can't upgrade the application until we fix it).

Thanks for the excellently detailed report, and, again, my sincere apologies,

-John

> BufferUnderflowException during Kafka Streams Upgrade
> -
>
> Key: KAFKA-10173
> URL: https://issues.apache.org/jira/browse/KAFKA-10173
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Karsten Schnitter
>Assignee: John Roesler
>Priority: Major
>  Labels: suppress
> Fix For: 2.5.1
>
>
> I migrated a Kafka Streams application from version 2.3.1 to 2.5.0. I 
> followed the steps described in the upgrade guide and set the property 
> {{migrate.from=2.3}}. On my dev system with just one running instance I got 
> the following exception:
> {noformat}
> stream-thread [0-StreamThread-2] Encountered the following error during 
> processing:
> java.nio.BufferUnderflowException: null
>   at java.base/java.nio.HeapByteBuffer.get(Unknown Source)
>   at java.base/java.nio.ByteBuffer.get(Unknown Source)
>   at 
> org.apache.kafka.streams.state.internals.BufferValue.extractValue(BufferValue.java:94)
>   at 
> org.apache.kafka.streams.state.internals.BufferValue.deserialize(BufferValue.java:83)
>   at 
> org.apache.kafka.streams.state.internals.InMemoryTimeOrderedKeyValueBuffer.restoreBatch(InMemoryTimeOrderedKeyValueBuffer.java:368)
>   at 
> org.apache.kafka.streams.processor.internals.CompositeRestoreListener.restoreBatch(CompositeRestoreListener.java:89)
>   at 
> org.apache.kafka.streams.processor.internals.StateRestorer.restore(StateRestorer.java:92)
>   at 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.processNext(StoreChangelogReader.java:350)
>   at 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(StoreChangelogReader.java:94)
>   at 
> org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:401)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:779)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:697)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:670)
> {noformat}
> I figured out, that this problem only occurs for stores, where I use the 
> suppress feature. If I rename the changelog topics during the migration, the 
> problem will not occur. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10173) BufferUnderflowException during Kafka Streams Upgrade

2020-06-24 Thread John Roesler (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-10173:
-
Fix Version/s: 2.4.2
   2.6.0

> BufferUnderflowException during Kafka Streams Upgrade
> -
>
> Key: KAFKA-10173
> URL: https://issues.apache.org/jira/browse/KAFKA-10173
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Karsten Schnitter
>Assignee: John Roesler
>Priority: Blocker
>  Labels: suppress
> Fix For: 2.6.0, 2.4.2, 2.5.1
>
>
> I migrated a Kafka Streams application from version 2.3.1 to 2.5.0. I 
> followed the steps described in the upgrade guide and set the property 
> {{migrate.from=2.3}}. On my dev system with just one running instance I got 
> the following exception:
> {noformat}
> stream-thread [0-StreamThread-2] Encountered the following error during 
> processing:
> java.nio.BufferUnderflowException: null
>   at java.base/java.nio.HeapByteBuffer.get(Unknown Source)
>   at java.base/java.nio.ByteBuffer.get(Unknown Source)
>   at 
> org.apache.kafka.streams.state.internals.BufferValue.extractValue(BufferValue.java:94)
>   at 
> org.apache.kafka.streams.state.internals.BufferValue.deserialize(BufferValue.java:83)
>   at 
> org.apache.kafka.streams.state.internals.InMemoryTimeOrderedKeyValueBuffer.restoreBatch(InMemoryTimeOrderedKeyValueBuffer.java:368)
>   at 
> org.apache.kafka.streams.processor.internals.CompositeRestoreListener.restoreBatch(CompositeRestoreListener.java:89)
>   at 
> org.apache.kafka.streams.processor.internals.StateRestorer.restore(StateRestorer.java:92)
>   at 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.processNext(StoreChangelogReader.java:350)
>   at 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(StoreChangelogReader.java:94)
>   at 
> org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:401)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:779)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:697)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:670)
> {noformat}
> I figured out, that this problem only occurs for stores, where I use the 
> suppress feature. If I rename the changelog topics during the migration, the 
> problem will not occur. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10173) BufferUnderflowException during Kafka Streams Upgrade

2020-06-24 Thread John Roesler (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-10173:
-
Priority: Blocker  (was: Major)

> BufferUnderflowException during Kafka Streams Upgrade
> -
>
> Key: KAFKA-10173
> URL: https://issues.apache.org/jira/browse/KAFKA-10173
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Karsten Schnitter
>Assignee: John Roesler
>Priority: Blocker
>  Labels: suppress
> Fix For: 2.5.1
>
>
> I migrated a Kafka Streams application from version 2.3.1 to 2.5.0. I 
> followed the steps described in the upgrade guide and set the property 
> {{migrate.from=2.3}}. On my dev system with just one running instance I got 
> the following exception:
> {noformat}
> stream-thread [0-StreamThread-2] Encountered the following error during 
> processing:
> java.nio.BufferUnderflowException: null
>   at java.base/java.nio.HeapByteBuffer.get(Unknown Source)
>   at java.base/java.nio.ByteBuffer.get(Unknown Source)
>   at 
> org.apache.kafka.streams.state.internals.BufferValue.extractValue(BufferValue.java:94)
>   at 
> org.apache.kafka.streams.state.internals.BufferValue.deserialize(BufferValue.java:83)
>   at 
> org.apache.kafka.streams.state.internals.InMemoryTimeOrderedKeyValueBuffer.restoreBatch(InMemoryTimeOrderedKeyValueBuffer.java:368)
>   at 
> org.apache.kafka.streams.processor.internals.CompositeRestoreListener.restoreBatch(CompositeRestoreListener.java:89)
>   at 
> org.apache.kafka.streams.processor.internals.StateRestorer.restore(StateRestorer.java:92)
>   at 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.processNext(StoreChangelogReader.java:350)
>   at 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(StoreChangelogReader.java:94)
>   at 
> org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:401)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:779)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:697)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:670)
> {noformat}
> I figured out, that this problem only occurs for stores, where I use the 
> suppress feature. If I rename the changelog topics during the migration, the 
> problem will not occur. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9313) Set default for client.dns.lookup to use_all_dns_ips

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-9313:
-
Fix Version/s: (was: 2.7.0)
   2.6.0

> Set default for client.dns.lookup to use_all_dns_ips
> 
>
> Key: KAFKA-9313
> URL: https://issues.apache.org/jira/browse/KAFKA-9313
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Yeva Byzek
>Assignee: Badai Aqrandista
>Priority: Minor
> Fix For: 2.6.0
>
>
> The default setting of the configuration parameter {{client.dns.lookup}} is 
> *not* {{use_all_dns_ips}} .  Consequently, by default, if there are multiple 
> IP addresses and the first one fails, the connection will fail.
>  
> It is desirable to change the default to be 
> {{client.dns.lookup=use_all_dns_ips}} for two reasons:
>  # reduce connection failure rates by 
>  # users are often surprised that this is not already the default
>  
>  
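
For illustration, a minimal sketch of opting into this behaviour explicitly today 
(the broker address is a placeholder; the same key applies to consumers and admin 
clients as well):

{code:java}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.serialization.StringSerializer;

public class DnsLookupExample {
    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put("bootstrap.servers", "broker.example.com:9092");  // placeholder broker
        // Opt in explicitly; this ticket proposes making use_all_dns_ips the default.
        props.put("client.dns.lookup", "use_all_dns_ips");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The client will now try every resolved IP for a hostname before failing.
        }
    }
}
{code}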



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9313) Set default for client.dns.lookup to use_all_dns_ips

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-9313:
-
Labels: need-kip  (was: )

> Set default for client.dns.lookup to use_all_dns_ips
> 
>
> Key: KAFKA-9313
> URL: https://issues.apache.org/jira/browse/KAFKA-9313
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Yeva Byzek
>Assignee: Badai Aqrandista
>Priority: Minor
>  Labels: need-kip
> Fix For: 2.6.0
>
>
> The default setting of the configuration parameter {{client.dns.lookup}} is 
> *not* {{use_all_dns_ips}} .  Consequently, by default, if there are multiple 
> IP addresses and the first one fails, the connection will fail.
>  
> It is desirable to change the default to be 
> {{client.dns.lookup=use_all_dns_ips}} for two reasons:
>  # reduce connection failure rates by 
>  # users are often surprised that this is not already the default
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9313) Set default for client.dns.lookup to use_all_dns_ips

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144481#comment-17144481
 ] 

Randall Hauch commented on KAFKA-9313:
--

Actually, this looks like it was merged and this issue was simply not updated. 
So please ignore the previous comment.

> Set default for client.dns.lookup to use_all_dns_ips
> 
>
> Key: KAFKA-9313
> URL: https://issues.apache.org/jira/browse/KAFKA-9313
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Yeva Byzek
>Assignee: Badai Aqrandista
>Priority: Minor
> Fix For: 2.7.0
>
>
> The default setting of the configuration parameter {{client.dns.lookup}} is 
> *not* {{use_all_dns_ips}} .  Consequently, by default, if there are multiple 
> IP addresses and the first one fails, the connection will fail.
>  
> It is desirable to change the default to be 
> {{client.dns.lookup=use_all_dns_ips}} for two reasons:
>  # reduce connection failure rates by 
>  # users are often surprised that this is not already the default
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] mjsax commented on pull request #8924: KAFKA-10198: guard against recycling dirty state

2020-06-24 Thread GitBox


mjsax commented on pull request #8924:
URL: https://github.com/apache/kafka/pull/8924#issuecomment-649120565


   Retest this please.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-10198) Dirty tasks may be recycled instead of closed

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144472#comment-17144472
 ] 

Randall Hauch commented on KAFKA-10198:
---

Thanks, [~ableegoldman]. I agree this should be fixed in 2.6, so merge whenever 
the PR is ready and cherry-pick to the `2.6` branch. (I'm the 2.6 release 
manager.)

> Dirty tasks may be recycled instead of closed
> -
>
> Key: KAFKA-10198
> URL: https://issues.apache.org/jira/browse/KAFKA-10198
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Sophie Blee-Goldman
>Assignee: Sophie Blee-Goldman
>Priority: Blocker
> Fix For: 2.6.0
>
>
> We recently added a guard to `Task#closeClean` to make sure we don't 
> accidentally clean-close a dirty task, but we forgot to also add this check 
> to `Task#closeAndRecycleState`. This meant an otherwise dirty task could be 
> closed clean and recycled into a new task when it should have just been 
> closed.
> This manifests as an NPE in our test application. Specifically, task 1_0 was 
> active on StreamThread-2 but reassigned as a standby. During handleRevocation 
> we hit a TaskMigratedException while flushing the tasks and bailed on trying 
> to flush and commit the remainder. This left task 1_0 with dirty keys in the 
> suppression buffer and the `commitNeeded` flag still set to true.
> During handleAssignment, we should have closed all the tasks with pending 
> state as dirty (ie any task with commitNeeded = true). Since we don't know 
> about the TaskMigratedException we hit during handleRevocation, we rely on 
> the guard in `Task#closeClean` to throw an exception and force the task to be 
> closed dirty.
> Unfortunately, we left this guard out of `closeAndRecycleState`, which meant 
> task 1_0 was able to slip through without being closed dirty. Once 
> reinitialized as a standby task, we eventually tried to commit it. The 
> suppression buffer of course tried to flush its remaining dirty keys from its 
> previous life as an active task. But since it's now a standby task, it should 
> not be sending anything to the changelog and has a null RecordCollector. We 
> tried to access it, and hit the NPE.
>  
> The fix is simple: we just need to add the same guard from closeClean to 
> closeAndRecycleState as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9381) kafka-streams-scala: Javadocs + Scaladocs not published on maven central

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144469#comment-17144469
 ] 

Randall Hauch commented on KAFKA-9381:
--

[~mumrah] can you share the changes you made to get the javadoc and scaladoc to 
build? We're running Gradle 6.5, which is supposed to have the fix for the 
{{MalformedURLException}}.

> kafka-streams-scala: Javadocs + Scaladocs not published on maven central
> 
>
> Key: KAFKA-9381
> URL: https://issues.apache.org/jira/browse/KAFKA-9381
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation, streams
>Reporter: Julien Jean Paul Sirocchi
>Assignee: Randall Hauch
>Priority: Blocker
> Fix For: 2.6.0
>
>
> As per title, empty (aside for MANIFEST, LICENCE and NOTICE) 
> javadocs/scaladocs jars on central for any version (kafka nor scala), e.g.
> [http://repo1.maven.org/maven2/org/apache/kafka/kafka-streams-scala_2.12/2.3.1/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] d8tltanc commented on pull request #8683: KAFKA-9893: Configurable TCP connection timeout and improve the initial metadata fetch

2020-06-24 Thread GitBox


d8tltanc commented on pull request #8683:
URL: https://github.com/apache/kafka/pull/8683#issuecomment-649113870


   @rajinisivaram @dajac The test failures are caused by the connection state 
transition from `CONNECTING` to `CHECKING_API_VERSIONS`, and then to 
`CONNECTED`, instead of to `CONNECTED` directly. In this case, I should remove 
the node from the `connecting` HashSet when this transition happens. I've fixed 
this issue and updated the patch.
   
   Also, I've addressed the latest comments. Please let me know if you have 
more suggestions and if we can re-run the Jenkins tests. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] ableegoldman commented on a change in pull request #8924: KAFKA-10198: guard against recycling dirty state

2020-06-24 Thread GitBox


ableegoldman commented on a change in pull request #8924:
URL: https://github.com/apache/kafka/pull/8924#discussion_r445212765



##
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamTask.java
##
@@ -515,17 +517,20 @@ private void writeCheckpoint() {
 stateMgr.checkpoint(checkpointableOffsets());
 }
 
-/**
- * You must commit a task and checkpoint the state manager before closing 
as this will release the state dir lock
- */
-private void close(final boolean clean) {
-if (clean && commitNeeded) {
-// It may be that we failed to commit a task during 
handleRevocation, but "forgot" this and tried to
-// closeClean in handleAssignment. We should throw if we detect 
this to force the TaskManager to closeDirty
+private void validateClean() {
+// It may be that we failed to commit a task during handleRevocation, 
but "forgot" this and tried to
+// closeClean in handleAssignment. We should throw if we detect this 
to force the TaskManager to closeDirty
+if (commitNeeded) {
 log.debug("Tried to close clean but there was pending uncommitted 
data, this means we failed to"
   + " commit and should close as dirty instead");
 throw new TaskMigratedException("Tried to close dirty task as 
clean");
 }
+}
+
+/**
+ * You must commit a task and checkpoint the state manager before closing 
as this will release the state dir lock
+ */
+private void close(final boolean clean) {

Review comment:
   This diff turned out a bit awkward; basically I just factored this check 
out into a separate method that we should call at the beginning of both flavors 
of clean close.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] ableegoldman opened a new pull request #8924: KAFKA-10198: guard against recycling dirty state

2020-06-24 Thread GitBox


ableegoldman opened a new pull request #8924:
URL: https://github.com/apache/kafka/pull/8924


   We just needed to add the check in `StreamTask#closeClean`  to 
`closeAndRecycleState` as well.
   
   Also renamed `closeAndRecycleState` to `closeCleanAndRecycleState` to drive 
this point home: it needs to be clean



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (KAFKA-10198) Dirty tasks may be recycled instead of closed

2020-06-24 Thread Sophie Blee-Goldman (Jira)
Sophie Blee-Goldman created KAFKA-10198:
---

 Summary: Dirty tasks may be recycled instead of closed
 Key: KAFKA-10198
 URL: https://issues.apache.org/jira/browse/KAFKA-10198
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Sophie Blee-Goldman
Assignee: Sophie Blee-Goldman
 Fix For: 2.6.0


We recently added a guard to `Task#closeClean` to make sure we don't 
accidentally clean-close a dirty task, but we forgot to also add this check to 
`Task#closeAndRecycleState`. This meant an otherwise dirty task could be closed 
clean and recycled into a new task when it should have just been closed.

This manifests as an NPE in our test application. Specifically, task 1_0 was 
active on StreamThread-2 but reassigned as a standby. During handleRevocation 
we hit a TaskMigratedException while flushing the tasks and bailed on trying to 
flush and commit the remainder. This left task 1_0 with dirty keys in the 
suppression buffer and the `commitNeeded` flag still set to true.

During handleAssignment, we should have closed all the tasks with pending state 
as dirty (ie any task with commitNeeded = true). Since we don't know about the 
TaskMigratedException we hit during handleRevocation, we rely on the guard in 
`Task#closeClean` to throw an exception and force the task to be closed dirty.

Unfortunately, we left this guard out of `closeAndRecycleState`, which meant 
task 1_0 was able to slip through without being closed dirty. Once 
reinitialized as a standby task, we eventually tried to commit it. The 
suppression buffer of course tried to flush its remaining dirty keys from its 
previous life as an active task. But since it's now a standby task, it should 
not be sending anything to the changelog and has a null RecordCollector. We 
tried to access it, and hit the NPE.

 

The fix is simple: we just need to add the same guard from closeClean to 
closeAndRecycleState as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] ning2008wisc commented on pull request #7577: KAFKA-9076: support consumer offset sync across clusters in MM 2.0

2020-06-24 Thread GitBox


ning2008wisc commented on pull request #7577:
URL: https://github.com/apache/kafka/pull/7577#issuecomment-649092864


   bump for attention @mimaison ^ given that 
https://issues.apache.org/jira/browse/KAFKA-9076 has slipped to the next release 
(2.7.0) and some people may already be testing/using this feature, I would hope 
it is possible to revisit this PR soon so that it can formally be part of 
Kafka. Thanks



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] vvcephei commented on pull request #8881: KIP-557: Add emit on change support to Kafka Streams

2020-06-24 Thread GitBox


vvcephei commented on pull request #8881:
URL: https://github.com/apache/kafka/pull/8881#issuecomment-649089312


   Hey @ConcurrencyPractitioner , I'm sorry for the delay. I started to look at 
it, but got caught up in stabilizing the 2.6.0 and 2.5.1 releases. I'll get you 
a review ASAP.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] andrewchoi5 commented on pull request #8479: KAFKA-9769: Finish operations for leaderEpoch-updated partitions up to point ZK Exception

2020-06-24 Thread GitBox


andrewchoi5 commented on pull request #8479:
URL: https://github.com/apache/kafka/pull/8479#issuecomment-649082464


   Hello @junrao @hachikuji  -- I have made some updates to address the 
comments. Thanks!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] mjsax commented on a change in pull request #8902: KAFKA-10179: Pass correct changelog topic to state serdes

2020-06-24 Thread GitBox


mjsax commented on a change in pull request #8902:
URL: https://github.com/apache/kafka/pull/8902#discussion_r445179411



##
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/ProcessorStateManager.java
##
@@ -578,4 +577,10 @@ private StateStoreMetadata findStore(final TopicPartition 
changelogPartition) {
 
 return found.isEmpty() ? null : found.get(0);
 }
+
+@Override
+public TopicPartition changelogTopicPartitionFor(final String storeName) {
+final StateStoreMetadata storeMetadata = stores.get(storeName);
+return storeMetadata == null ? null : storeMetadata.changelogPartition;

Review comment:
   From my understanding `storeMetadata` should only be `null` if the store 
was not registered? Thus it seems to indicate a bug?
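
   For illustration, a sketch of the stricter alternative hinted at above, which treats 
a missing registration as a bug rather than returning `null`; this is only a variant of 
the method in the diff, not the actual change:

{code:java}
// Sketch only: StateStoreMetadata and the stores map are the ones shown in the
// diff above, reproduced here purely to illustrate the fail-fast option.
@Override
public TopicPartition changelogTopicPartitionFor(final String storeName) {
    final StateStoreMetadata storeMetadata = stores.get(storeName);
    if (storeMetadata == null) {
        // a lookup for an unregistered store would indicate a bug in the caller
        throw new IllegalStateException("State store " + storeName + " is not registered with this task");
    }
    return storeMetadata.changelogPartition;
}
{code}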





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-9846) Race condition can lead to severe lag underestimate for active tasks

2020-06-24 Thread Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144398#comment-17144398
 ] 

Sophie Blee-Goldman commented on KAFKA-9846:


This is definitely a limitation of the current Affects Version/Fix Version 
system – this actually is fixed in 2.6.0, but has not been fixed in 2.5.0 
(hence the ticket is unresolved).

That said, to avoid interfering with the release process I think we can leave 
it as is for now and then put 2.6.0 back on the fix version once it's released 
so that users know this doesn't affect 2.6+

> Race condition can lead to severe lag underestimate for active tasks
> 
>
> Key: KAFKA-9846
> URL: https://issues.apache.org/jira/browse/KAFKA-9846
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Sophie Blee-Goldman
>Priority: Critical
> Fix For: 2.7.0
>
>
> In KIP-535 we added the ability to query still-restoring and standby tasks. 
> To give users control over how out of date the data they fetch can be, we 
> added an API to KafkaStreams that fetches the end offsets for all changelog 
> partitions and computes the lag for each local state store.
> During this lag computation, we check whether an active task is in RESTORING 
> and calculate the actual lag if so. If not, we assume it's in RUNNING and 
> return a lag of zero. However, tasks may be in other states besides running 
> and restoring; notably they first pass through the CREATED state before 
> getting to RESTORING. A CREATED task may happen to be caught-up to the end 
> offset, but in many cases it is likely to be lagging or even completely 
> uninitialized.
> This introduces a race condition where users may be led to believe that a 
> task has zero lag and is "safe" to query even with the strictest correctness 
> guarantees, while the task is actually lagging by some unknown amount.  
> During transfer of ownership of the task between different threads on the 
> same machine, tasks can actually spend a while in CREATED while the new owner 
> waits to acquire the task directory lock. So, this race condition may not be 
> particularly rare in multi-threaded Streams applications
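
For illustration, a minimal sketch of the lag check described above, using 
illustrative names rather than the actual Streams internals; the point is simply 
that only a RUNNING task may be assumed caught up:

{code:java}
public class StoreLagExample {
    enum TaskState { CREATED, RESTORING, RUNNING }

    // Only a RUNNING active task may be reported as fully caught up; CREATED and
    // RESTORING tasks must report their real lag against the changelog end offset.
    static long lagFor(final TaskState state, final long endOffset, final long currentOffset) {
        if (state == TaskState.RUNNING) {
            return 0L;
        }
        return Math.max(0L, endOffset - currentOffset);
    }

    public static void main(final String[] args) {
        // A CREATED task that has restored nothing must not be reported as lag 0.
        System.out.println(lagFor(TaskState.CREATED, 1_000L, 0L));     // 1000
        System.out.println(lagFor(TaskState.RUNNING, 1_000L, 1_000L)); // 0
    }
}
{code}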



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10068) Verify HighAvailabilityTaskAssignor performance with large clusters and topologies

2020-06-24 Thread Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144394#comment-17144394
 ] 

Sophie Blee-Goldman commented on KAFKA-10068:
-

Moved the fix version back to 2.7.0. I don't think it's critical to have the 
test itself in 2.6.0, so long as we have the test and can verify that it 
doesn't expose a problem.

> Verify HighAvailabilityTaskAssignor performance with large clusters and 
> topologies
> --
>
> Key: KAFKA-10068
> URL: https://issues.apache.org/jira/browse/KAFKA-10068
> Project: Kafka
>  Issue Type: Task
>  Components: streams
>Affects Versions: 2.6.0
>Reporter: John Roesler
>Assignee: Sophie Blee-Goldman
>Priority: Blocker
> Fix For: 2.7.0
>
>
> While reviewing [https://github.com/apache/kafka/pull/8668/files,] I realized 
> that we should have a similar test to make sure that the new task assignor 
> completes well within the default assignment deadline. 30 seconds is a good 
> upper bound.
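
For illustration, a rough shape such a test could take, with a hypothetical 
`runAssignment` helper standing in for building the large cluster/topology and 
invoking the assignor:

{code:java}
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.time.Duration;
import java.time.Instant;
import org.junit.jupiter.api.Test;

public class AssignorPerformanceSketchTest {

    // Hypothetical helper: build the inputs for many clients/partitions and run
    // the task assignor; the real test wires up the actual Streams classes.
    private void runAssignment(final int clients, final int partitions) {
        // ... construct the cluster/topology state and invoke the assignor here ...
    }

    @Test
    public void shouldCompleteWellWithinAssignmentDeadline() {
        final Instant start = Instant.now();
        runAssignment(1_000, 10_000);
        final Duration elapsed = Duration.between(start, Instant.now());
        // 30 seconds is the upper bound suggested in this ticket
        assertTrue(elapsed.compareTo(Duration.ofSeconds(30)) < 0,
                "assignment took " + elapsed);
    }
}
{code}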



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10068) Verify HighAvailabilityTaskAssignor performance with large clusters and topologies

2020-06-24 Thread Sophie Blee-Goldman (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sophie Blee-Goldman updated KAFKA-10068:

Fix Version/s: (was: 2.6.0)
   2.7.0

> Verify HighAvailabilityTaskAssignor performance with large clusters and 
> topologies
> --
>
> Key: KAFKA-10068
> URL: https://issues.apache.org/jira/browse/KAFKA-10068
> Project: Kafka
>  Issue Type: Task
>  Components: streams
>Affects Versions: 2.6.0
>Reporter: John Roesler
>Assignee: Sophie Blee-Goldman
>Priority: Blocker
> Fix For: 2.7.0
>
>
> While reviewing [https://github.com/apache/kafka/pull/8668/files,] I realized 
> that we should have a similar test to make sure that the new task assignor 
> completes well within the default assignment deadline. 30 seconds is a good 
> upper bound.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9076) MirrorMaker 2.0 automated consumer offset sync

2020-06-24 Thread Ning Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144388#comment-17144388
 ] 

Ning Zhang commented on KAFKA-9076:
---

[~rhauch] It is true that this jira is not a blocking factor for the 2.6.0 release, 
and thank you for checking it.

Given that the PR has been proposed for over 6 months and some users are testing it in 
different use cases, I would hope the committers ([~mimaison]) and other 
reviewers can take another look at https://github.com/apache/kafka/pull/7577 
and see if we could iterate faster, so that it can formally be part of Kafka in 
the 2.7.0 release.

> MirrorMaker 2.0 automated consumer offset sync
> --
>
> Key: KAFKA-9076
> URL: https://issues.apache.org/jira/browse/KAFKA-9076
> Project: Kafka
>  Issue Type: Improvement
>  Components: mirrormaker
>Affects Versions: 2.4.0
>Reporter: Ning Zhang
>Assignee: Ning Zhang
>Priority: Major
>  Labels: mirrormaker, pull-request-available
> Fix For: 2.7.0
>
>
> To calculate the translated consumer offsets in the target cluster, currently 
> `MirrorClient` provides a function called "remoteConsumerOffsets()" that is 
> used by "RemoteClusterUtils" for one-time lookups.
> In order to let consumer and stream applications migrate from the source to the 
> target cluster transparently and conveniently, e.g. in the event of a source 
> cluster failure, a background job is proposed to periodically sync the 
> consumer offsets from the source to the target cluster, so that when the consumer 
> and stream applications switch to the target cluster, they will resume 
> consuming from where they left off at the source cluster.
>  KIP: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-545%3A+support+automated+consumer+offset+sync+across+clusters+in+MM+2.0
> [https://github.com/apache/kafka/pull/7577]
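
For illustration, a sketch of the one-time translation available today, assuming 
the `RemoteClusterUtils.translateOffsets` API from KIP-382 and hypothetical 
cluster/group names; the background job proposed here would keep these offsets 
synced automatically instead of requiring this manual step at failover time:

{code:java}
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.mirror.RemoteClusterUtils;

public class FailoverOffsetsExample {
    public static void main(final String[] args) throws Exception {
        final Map<String, Object> props = new HashMap<>();
        props.put("bootstrap.servers", "target-cluster:9092");   // hypothetical target cluster

        // Translate "my-group"'s committed offsets from the "primary" source
        // cluster into the target cluster's coordinates via the checkpoint topic.
        final Map<TopicPartition, OffsetAndMetadata> translated =
            RemoteClusterUtils.translateOffsets(props, "primary", "my-group", Duration.ofSeconds(30));

        final Map<String, Object> consumerProps = new HashMap<>(props);
        consumerProps.put("group.id", "my-group");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.assign(translated.keySet());
            translated.forEach((tp, offset) -> consumer.seek(tp, offset.offset()));
            // ... resume consuming on the target cluster from the translated positions ...
        }
    }
}
{code}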



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-4740) Using new consumer API with a Deserializer that throws SerializationException can lead to infinite loop

2020-06-24 Thread Patrick Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144384#comment-17144384
 ] 

Patrick Taylor commented on KAFKA-4740:
---

I agree with [Andrea's comment on 12/Dec/18|#comment-16719006] that a better 
solution is a customizable exception handler so the client can choose 
whether/how to log and skip unserializable records, or throw an exception as it 
does now, etc.  This would avoid the complexities discussed in earlier comments.
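
For illustration, one possible shape such a handler could take; this is purely 
hypothetical and not an existing Kafka API:

{code:java}
import org.apache.kafka.common.TopicPartition;

// A hypothetical callback the consumer could invoke instead of rethrowing the
// SerializationException from poll(); the client decides whether to skip the
// offending record (e.g. after logging it) or fail as it does today.
public interface RecordDeserializationExceptionHandler {
    enum Response { SKIP, FAIL }

    Response handle(TopicPartition partition, long offset, Exception cause);
}
{code}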

> Using new consumer API with a Deserializer that throws SerializationException 
> can lead to infinite loop
> ---
>
> Key: KAFKA-4740
> URL: https://issues.apache.org/jira/browse/KAFKA-4740
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Affects Versions: 0.9.0.0, 0.9.0.1, 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1
> Environment: Kafka broker 0.10.1.1 (but this bug is not dependent on 
> the broker version)
> Kafka clients 0.9.0.0, 0.9.0.1, 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1
>Reporter: Sébastien Launay
>Assignee: Sébastien Launay
>Priority: Critical
>
> The old consumer supports deserializing records into typed objects and throws 
> a {{SerializationException}} through {{MessageAndMetadata#key()}} and 
> {{MessageAndMetadata#message()}} that can be caught by the client \[1\].
> When using the new consumer API with kafka-clients versions < 0.10.0.1, such 
> an exception is swallowed by the {{NetworkClient}} class and results in an 
> infinite loop which the client has no control over, like:
> {noformat}
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - Resetting offset 
> for partition test2-0 to earliest offset.
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher - Fetched offset 0 
> for partition test2-0
> ERROR org.apache.kafka.clients.NetworkClient - Uncaught error in request 
> completion:
> org.apache.kafka.common.errors.SerializationException: Size of data received 
> by IntegerDeserializer is not 4
> ERROR org.apache.kafka.clients.NetworkClient - Uncaught error in request 
> completion:
> org.apache.kafka.common.errors.SerializationException: Size of data received 
> by IntegerDeserializer is not 4
> ...
> {noformat}
> Thanks to KAFKA-3977, this has been partially fixed in 0.10.1.0 but another 
> issue still remains.
> Indeed, the client can now catch the {{SerializationException}} but the next 
> call to {{Consumer#poll(long)}} will throw the same exception indefinitely.
> The following snippet (full example available on Github \[2\] for most 
> released kafka-clients versions):
> {code:java}
> try (KafkaConsumer<String, Integer> kafkaConsumer = new 
> KafkaConsumer<>(consumerConfig, new StringDeserializer(), new 
> IntegerDeserializer())) {
> kafkaConsumer.subscribe(Arrays.asList("topic"));
> // Will run till the shutdown hook is called
> while (!doStop) {
> try {
> ConsumerRecords<String, Integer> records = 
> kafkaConsumer.poll(1000);
> if (!records.isEmpty()) {
> logger.info("Got {} messages", records.count());
> for (ConsumerRecord<String, Integer> record : records) {
> logger.info("Message with partition: {}, offset: {}, key: 
> {}, value: {}",
> record.partition(), record.offset(), record.key(), 
> record.value());
> }
> } else {
> logger.info("No messages to consume");
> }
> } catch (SerializationException e) {
> logger.warn("Failed polling some records", e);
> }
>  }
> }
> {code}
> when run with the following records (third record has an invalid Integer 
> value):
> {noformat}
> printf "\x00\x00\x00\x00\n" | bin/kafka-console-producer.sh --broker-list 
> localhost:9092 --topic topic
> printf "\x00\x00\x00\x01\n" | bin/kafka-console-producer.sh --broker-list 
> localhost:9092 --topic topic
> printf "\x00\x00\x00\n" | bin/kafka-console-producer.sh --broker-list 
> localhost:9092 --topic topic
> printf "\x00\x00\x00\x02\n" | bin/kafka-console-producer.sh --broker-list 
> localhost:9092 --topic topic
> {noformat}
> will produce the following logs:
> {noformat}
> INFO  consumer.Consumer - Got 2 messages
> INFO  consumer.Consumer - Message with partition: 0, offset: 0, key: null, 
> value: 0
> INFO  consumer.Consumer - Message with partition: 0, offset: 1, key: null, 
> value: 1
> WARN  consumer.Consumer - Failed polling some records
> org.apache.kafka.common.errors.SerializationException: Error deserializing 
> key/value for partition topic-0 at offset 2
> Caused by: org.apache.kafka.common.errors.SerializationException: Size of 
> data received by IntegerDeserializer is not 4
> WARN  consumer.Consumer - Failed polling some records

[jira] [Commented] (KAFKA-9861) Process Simplification - Community Validation of Kafka Release Candidates

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144383#comment-17144383
 ] 

Randall Hauch commented on KAFKA-9861:
--

Since this is not a blocker issue, as release manager I'm trying to complete 
the 2.6.0 release. I'm removing `2.6.0` from the fix version and replacing with 
the next releases, `2.6.1` and `2.7.0`. If this is incorrect, please respond 
and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing 
list thread.

> Process Simplification - Community Validation of Kafka Release Candidates
> -
>
> Key: KAFKA-9861
> URL: https://issues.apache.org/jira/browse/KAFKA-9861
> Project: Kafka
>  Issue Type: Improvement
>  Components: build, documentation, system tests
>Affects Versions: 2.6.0, 2.4.2, 2.5.1
> Environment: Linux, Java 8/11, Scala 2.x
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Minor
> Fix For: 2.4.2, 2.5.1, 2.7.0, 2.6.1
>
>
> When new KAFKA release candidates are published and there is a solicitation 
> for the community to get involved in testing and verifying the release 
> candidates, it would be great to have the test process thoroughly documented 
> for newcomers to participate effectively.
> For new contributors, this can be very daunting and it would be great to have 
> this process clearly documented in a way that lowers the level of effort 
> necessary to get started.
> The goal of this task is to create the documentation and supporting artifacts 
> that would make this goal a reality.
> Going forward for future releases, it would be great to have the link to this 
> documentation included in the RC announcements so that the community 
> (especially end users) can help test and participate in the voting process 
> effectively.
> These are the items that I believe should be included in this documentation
>  * How to set up test environment for unit and functional tests
>  * Java version(s) needed for the tests
>  * Scala version(s) needed for the tests
>  * Gradle version needed
>  * Sample script for running sanity checks and unit tests
>  * Sample Helm Charts for running all the basic components on a Kubernetes
>  * Sample Ansible Script for running all the basic components on Virtual 
> Machines
> The first 4 items will be part of the documentation that shows how to install 
> these dependencies in a Linux VM. The 5th item is a script that will download 
> PGP keys, check signatures, validate checksums and run unit/integration 
> tests. The 6th item is a Helm chart with basic components necessary to 
> validate critical components in the ecosystem (Zookeeper, Brokers, Streams 
> etc) within a Kubernetes cluster. The last item is similar to the 6th item 
> but installs these components on virtual machines instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9861) Process Simplification - Community Validation of Kafka Release Candidates

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-9861:
-
Fix Version/s: (was: 2.6.0)
   2.6.1

> Process Simplification - Community Validation of Kafka Release Candidates
> -
>
> Key: KAFKA-9861
> URL: https://issues.apache.org/jira/browse/KAFKA-9861
> Project: Kafka
>  Issue Type: Improvement
>  Components: build, documentation, system tests
>Affects Versions: 2.6.0, 2.4.2, 2.5.1
> Environment: Linux, Java 8/11, Scala 2.x
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Minor
> Fix For: 2.4.2, 2.5.1, 2.6.1
>
>
> When new KAFKA release candidates are published and there is a solicitation 
> for the community to get involved in testing and verifying the release 
> candidates, it would be great to have the test process thoroughly documented 
> for newcomers to participate effectively.
> For new contributors, this can be very daunting and it would be great to have 
> this process clearly documented in a way that lowers the level of effort 
> necessary to get started.
> The goal of this task is to create the documentation and supporting artifacts 
> that would make this goal a reality.
> Going forward for future releases, it would be great to have the link to this 
> documentation included in the RC announcements so that the community 
> (especially end users) can help test and participate in the voting process 
> effectively.
> These are the items that I believe should be included in this documentation
>  * How to set up test environment for unit and functional tests
>  * Java version(s) needed for the tests
>  * Scala version(s) needed for the tests
>  * Gradle version needed
>  * Sample script for running sanity checks and unit tests
>  * Sample Helm Charts for running all the basic components on a Kubernetes
>  * Sample Ansible Script for running all the basic components on Virtual 
> Machines
> The first 4 items will be part of the documentation that shows how to install 
> these dependencies in a Linux VM. The 5th item is a script that will download 
> PGP keys, check signatures, validate checksums and run unit/integration 
> tests. The 6th item is a Helm chart with basic components necessary to 
> validate critical components in the ecosystem (Zookeeper, Brokers, Streams 
> etc) within a Kubernetes cluster. The last item is similar to the 6th item 
> but installs these components on virtual machines instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9861) Process Simplification - Community Validation of Kafka Release Candidates

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-9861:
-
Fix Version/s: 2.7.0

> Process Simplification - Community Validation of Kafka Release Candidates
> -
>
> Key: KAFKA-9861
> URL: https://issues.apache.org/jira/browse/KAFKA-9861
> Project: Kafka
>  Issue Type: Improvement
>  Components: build, documentation, system tests
>Affects Versions: 2.6.0, 2.4.2, 2.5.1
> Environment: Linux, Java 8/11, Scala 2.x
>Reporter: Israel Ekpo
>Assignee: Israel Ekpo
>Priority: Minor
> Fix For: 2.4.2, 2.5.1, 2.7.0, 2.6.1
>
>
> When new KAFKA release candidates are published and there is a solicitation 
> for the community to get involved in testing and verifying the release 
> candidates, it would be great to have the test process thoroughly documented 
> for newcomers to participate effectively.
> For new contributors, this can be very daunting and it would be great to have 
> this process clearly documented in a way that lowers the level of effort 
> necessary to get started.
> The goal of this task is to create the documentation and supporting artifacts 
> that would make this goal a reality.
> Going forward for future releases, it would be great to have the link to this 
> documentation included in the RC announcements so that the community 
> (especially end users) can help test and participate in the voting process 
> effectively.
> These are the items that I believe should be included in this documentation
>  * How to set up test environment for unit and functional tests
>  * Java version(s) needed for the tests
>  * Scala version(s) needed for the tests
>  * Gradle version needed
>  * Sample script for running sanity checks and unit tests
>  * Sample Helm Charts for running all the basic components on a Kubernetes
>  * Sample Ansible Script for running all the basic components on Virtual 
> Machines
> The first 4 items will be part of the documentation that shows how to install 
> these dependencies in a Linux VM. The 5th item is a script that will download 
> PGP keys, check signatures, validate checksums and run unit/integration 
> tests. The 6th item is a Helm chart with basic components necessary to 
> validate critical components in the ecosystem (Zookeeper, Brokers, Streams 
> etc) within a Kubernetes cluster. The last item is similar to the 6th item 
> but installs these components on virtual machines instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9313) Set default for client.dns.lookup to use_all_dns_ips

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-9313:
-
Fix Version/s: (was: 2.6.0)
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.7.0`. If this is incorrect, please respond and 
discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list 
thread.

> Set default for client.dns.lookup to use_all_dns_ips
> 
>
> Key: KAFKA-9313
> URL: https://issues.apache.org/jira/browse/KAFKA-9313
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Yeva Byzek
>Assignee: Badai Aqrandista
>Priority: Minor
> Fix For: 2.7.0
>
>
> The default setting of the configuration parameter {{client.dns.lookup}} is 
> *not* {{use_all_dns_ips}} .  Consequently, by default, if there are multiple 
> IP addresses and the first one fails, the connection will fail.
>  
> It is desirable to change the default to be 
> {{client.dns.lookup=use_all_dns_ips}} for two reasons:
>  # reduce connection failure rates by 
>  # users are often surprised that this is not already the default
>  
>  
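
Until the default changes, a client can opt in explicitly; a minimal sketch 
(broker address and group are illustrative):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class DnsLookupOptIn {
    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put("bootstrap.servers", "broker.example.com:9092"); // illustrative address
        props.put("group.id", "example-group");                     // illustrative group
        // Opt in to trying every resolved IP until use_all_dns_ips becomes the default
        props.put("client.dns.lookup", "use_all_dns_ips");

        try (KafkaConsumer<String, String> consumer =
                 new KafkaConsumer<>(props, new StringDeserializer(), new StringDeserializer())) {
            // ... subscribe and poll as usual ...
        }
    }
}
{code}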



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9587) Producer configs are omitted in the documentation

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-9587:
-
Fix Version/s: (was: 2.6.0)
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.7.0`. If this is incorrect, please respond and 
discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list 
thread.

> Producer configs are omitted in the documentation
> -
>
> Key: KAFKA-9587
> URL: https://issues.apache.org/jira/browse/KAFKA-9587
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, documentation
>Affects Versions: 2.4.0
>Reporter: Dongjin Lee
>Assignee: Dongjin Lee
>Priority: Minor
> Fix For: 2.7.0
>
>
> As of 2.4, [the KafkaProducer 
> documentation|https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  states:
> {quote}If the request fails, the producer can automatically retry, though 
> since we have specified retries as 0 it won't.
> {quote}
> {quote}... in the code snippet above, likely all 100 records would be sent in 
> a single request since we set our linger time to 1 millisecond.
> {quote}
> However, the code snippet (below) does not include any configuration of 
> '{{retries}}' or '{{linger.ms}}':
> {quote}Properties props = new Properties();
>  props.put("bootstrap.servers", "localhost:9092");
>  props.put("acks", "all");
>  props.put("key.serializer", 
> "org.apache.kafka.common.serialization.StringSerializer");
>  props.put("value.serializer", 
> "org.apache.kafka.common.serialization.StringSerializer");
> {quote}
> The same documentation in [version 
> 2.0|https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  includes the configs; However, 
> [2.1|https://kafka.apache.org/21/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  only includes '{{linger.ms}}' and 
> [2.2|https://kafka.apache.org/22/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  includes none. It seems like it was removed in the middle of two releases.
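
For comparison, a sketch of what the snippet presumably contained before the 
configs were dropped, reconstructed from the surrounding prose (retries = 0, 
linger.ms = 1) in the same statement-level style as the javadoc snippet quoted 
above; the exact original wording would need to be recovered from the 2.0 javadoc:

{code:java}
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("acks", "all");
props.put("retries", 0);     // "since we have specified retries as 0 it won't"
props.put("linger.ms", 1);   // "since we set our linger time to 1 millisecond"
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

Producer<String, String> producer = new KafkaProducer<>(props);
{code}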



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10038) ConsumerPerformance.scala supports the setting of client.id

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-10038:
--
Fix Version/s: (was: 2.6.0)
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.7.0`. If this is incorrect, please respond and 
discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list 
thread.

> ConsumerPerformance.scala supports the setting of client.id
> ---
>
> Key: KAFKA-10038
> URL: https://issues.apache.org/jira/browse/KAFKA-10038
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer, core
>Affects Versions: 2.1.1
> Environment: Trunk branch
>Reporter: tigertan
>Assignee: Luke Chen
>Priority: Minor
>  Labels: newbie, performance
> Fix For: 2.7.0
>
>
> ConsumerPerformance.scala should support setting "client.id", which is a 
> reasonable requirement, and the way "console consumer" and "console producer" 
> handle "client.id" can be unified. "client.id" currently defaults to 
> "perf-consumer-client".
> We often use client.id in quotas; if the 
> kafka-producer-perf-test.sh script supports setting "client.id", we can do 
> quota testing through scripts without writing our own consumer programs. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8930) MM2 documentation

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8930:
-
Fix Version/s: (was: 2.6.0)
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.7.0`. If this is incorrect, please respond and 
discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list 
thread.

> MM2 documentation
> -
>
> Key: KAFKA-8930
> URL: https://issues.apache.org/jira/browse/KAFKA-8930
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation, mirrormaker
>Affects Versions: 2.4.0
>Reporter: Ryanne Dolan
>Assignee: Ryanne Dolan
>Priority: Minor
> Fix For: 2.7.0
>
>
> Expand javadocs for new MirrorMaker (entrypoint) and MirrorMakerConfig 
> classes. Include example usage and example configuration.
> Expand javadocs for MirrorSourceConnector, MirrorCheckpointConnector, and 
> MirrorHeartbeatConnector, including example configuration for running on 
> Connect w/o mm2 driver.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8929) MM2 system tests

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8929:
-
Fix Version/s: (was: 2.6.0)
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.7.0`. If this is incorrect, please respond and 
discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list 
thread.

> MM2 system tests
> 
>
> Key: KAFKA-8929
> URL: https://issues.apache.org/jira/browse/KAFKA-8929
> Project: Kafka
>  Issue Type: Improvement
>  Components: mirrormaker
>Affects Versions: 2.4.0
>Reporter: Ryanne Dolan
>Assignee: Ryanne Dolan
>Priority: Minor
>  Labels: test
> Fix For: 2.7.0
>
>
> Add system tests for MM2 driver. Should resemble existing mirror-maker system 
> tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9018) Kafka Connect - throw clearer exceptions on serialisation errors

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-9018:
-
Fix Version/s: (was: 2.6.0)
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.7.0`. If this is incorrect, please respond and 
discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list 
thread.

> Kafka Connect - throw clearer exceptions on serialisation errors
> 
>
> Key: KAFKA-9018
> URL: https://issues.apache.org/jira/browse/KAFKA-9018
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 2.5.0, 2.4.1
>Reporter: Robin Moffatt
>Assignee: Mario Molina
>Priority: Minor
> Fix For: 2.7.0
>
>
> When Connect fails on a deserialisation error, it doesn't show whether it was 
> the *key or value* that threw the error, nor does it give the user any 
> indication of the *topic/partition/offset* of the message. Kafka Connect 
> should be improved to return this information.
> Example message that user will get (in this case caused by reading non-Avro 
> data with the Avro converter)
> {code:java}
> Caused by: org.apache.kafka.connect.errors.DataException: Failed to 
> deserialize data for topic sample_topic to Avro:
>  at 
> io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:110)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$1(WorkerSinkTask.java:487)
>  at 
> org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
>  at 
> org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
>  ... 13 more
>  Caused by: org.apache.kafka.common.errors.SerializationException: Error 
> deserializing Avro message for id -1
>  Caused by: org.apache.kafka.common.errors.SerializationException: Unknown 
> magic byte!{code}
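
As a stop-gap until the exception itself carries that context, Connect's 
error-handling options from KIP-298 can at least surface the failing record's 
coordinates; a sketch of sink connector settings (the DLQ topic name is illustrative):

{code:java}
import java.util.HashMap;
import java.util.Map;

public class SinkErrorHandlingConfigExample {
    public static Map<String, String> build() {
        final Map<String, String> config = new HashMap<>();
        config.put("errors.tolerance", "all");                               // keep running past bad records
        config.put("errors.log.enable", "true");                             // log each failure
        config.put("errors.log.include.messages", "true");                   // include the record contents in the log
        config.put("errors.deadletterqueue.topic.name", "dlq-my-sink");      // illustrative DLQ topic
        config.put("errors.deadletterqueue.context.headers.enable", "true"); // adds topic/partition/offset headers
        return config;
    }
}
{code}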



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10068) Verify HighAvailabilityTaskAssignor performance with large clusters and topologies

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144375#comment-17144375
 ] 

Randall Hauch commented on KAFKA-10068:
---

[~ableegoldman], actually, I'm still trying to cut the first RC, but we need to 
decide whether this is really a blocker for the `2.6.0` release. If we think it 
still is a blocker, what is an ETA for completing this and is this a risky 
change for so late in the release cycle?

> Verify HighAvailabilityTaskAssignor performance with large clusters and 
> topologies
> --
>
> Key: KAFKA-10068
> URL: https://issues.apache.org/jira/browse/KAFKA-10068
> Project: Kafka
>  Issue Type: Task
>  Components: streams
>Affects Versions: 2.6.0
>Reporter: John Roesler
>Assignee: Sophie Blee-Goldman
>Priority: Blocker
> Fix For: 2.6.0
>
>
> While reviewing [https://github.com/apache/kafka/pull/8668/files,] I realized 
> that we should have a similar test to make sure that the new task assignor 
> completes well within the default assignment deadline. 30 seconds is a good 
> upper bound.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8264) Flaky Test PlaintextConsumerTest#testLowMaxFetchSizeForRequestAndPartition

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8264:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

> Flaky Test PlaintextConsumerTest#testLowMaxFetchSizeForRequestAndPartition
> --
>
> Key: KAFKA-8264
> URL: https://issues.apache.org/jira/browse/KAFKA-8264
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.0.1, 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Luke Chen
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/252/tests]
> {quote}org.apache.kafka.common.errors.TopicExistsException: Topic 'topic3' 
> already exists.{quote}
> STDOUT
>  
> {quote}[2019-04-19 03:54:20,080] ERROR [ReplicaFetcher replicaId=1, 
> leaderId=0, fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:20,080] ERROR [ReplicaFetcher replicaId=2, leaderId=0, 
> fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:20,312] ERROR [ReplicaFetcher replicaId=2, leaderId=0, 
> fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:20,313] ERROR [ReplicaFetcher replicaId=2, leaderId=1, 
> fetcherId=0] Error for partition topic-1 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:20,994] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:21,727] ERROR [ReplicaFetcher replicaId=0, leaderId=2, 
> fetcherId=0] Error for partition topicWithNewMessageFormat-1 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:28,696] ERROR [ReplicaFetcher replicaId=0, leaderId=2, 
> fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:28,699] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
> fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:29,246] ERROR [ReplicaFetcher replicaId=0, leaderId=2, 
> fetcherId=0] Error for partition topic-1 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:29,247] ERROR [ReplicaFetcher replicaId=0, leaderId=1, 
> fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:29,287] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
> fetcherId=0] Error for partition topic-1 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:33,408] ERROR [ReplicaFetcher replicaId=0, leaderId=2, 
> fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:33,408] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
> fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:33,655] ERROR 

[jira] [Commented] (KAFKA-8264) Flaky Test PlaintextConsumerTest#testLowMaxFetchSizeForRequestAndPartition

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144374#comment-17144374
 ] 

Randall Hauch commented on KAFKA-8264:
--

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.7.0`. If this is incorrect, please respond and 
discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list 
thread.

> Flaky Test PlaintextConsumerTest#testLowMaxFetchSizeForRequestAndPartition
> --
>
> Key: KAFKA-8264
> URL: https://issues.apache.org/jira/browse/KAFKA-8264
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.0.1, 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Luke Chen
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/252/tests]
> {quote}org.apache.kafka.common.errors.TopicExistsException: Topic 'topic3' 
> already exists.{quote}
> STDOUT
>  
> {quote}[2019-04-19 03:54:20,080] ERROR [ReplicaFetcher replicaId=1, 
> leaderId=0, fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:20,080] ERROR [ReplicaFetcher replicaId=2, leaderId=0, 
> fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:20,312] ERROR [ReplicaFetcher replicaId=2, leaderId=0, 
> fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:20,313] ERROR [ReplicaFetcher replicaId=2, leaderId=1, 
> fetcherId=0] Error for partition topic-1 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:20,994] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:21,727] ERROR [ReplicaFetcher replicaId=0, leaderId=2, 
> fetcherId=0] Error for partition topicWithNewMessageFormat-1 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:28,696] ERROR [ReplicaFetcher replicaId=0, leaderId=2, 
> fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:28,699] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
> fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:29,246] ERROR [ReplicaFetcher replicaId=0, leaderId=2, 
> fetcherId=0] Error for partition topic-1 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:29,247] ERROR [ReplicaFetcher replicaId=0, leaderId=1, 
> fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:29,287] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
> fetcherId=0] Error for partition topic-1 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:33,408] ERROR [ReplicaFetcher replicaId=0, leaderId=2, 
> fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-19 03:54:33,408] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
> fetcherId=0] Error for partition __consumer_offsets-0 at 

[jira] [Commented] (KAFKA-9943) Enable TLSv.1.3 in system tests "run all" execution.

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144372#comment-17144372
 ] 

Randall Hauch commented on KAFKA-9943:
--

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.7.0`. If this is incorrect, please respond and 
discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list 
thread.

> Enable TLSv.1.3 in system tests "run all" execution.
> 
>
> Key: KAFKA-9943
> URL: https://issues.apache.org/jira/browse/KAFKA-9943
> Project: Kafka
>  Issue Type: Test
>Reporter: Nikolay Izhikov
>Assignee: Nikolay Izhikov
>Priority: Major
> Fix For: 2.7.0
>
>
> We need to enable system tests with the TLSv1.3 in "run all" execution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-5453) Controller may miss requests sent to the broker when zk session timeout happens.

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144373#comment-17144373
 ] 

Randall Hauch commented on KAFKA-5453:
--

Moving this to 2.7.0 as there is no progress.

> Controller may miss requests sent to the broker when zk session timeout 
> happens.
> 
>
> Key: KAFKA-5453
> URL: https://issues.apache.org/jira/browse/KAFKA-5453
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.11.0.0
>Reporter: Jiangjie Qin
>Assignee: Viktor Somogyi-Vass
>Priority: Major
> Fix For: 2.7.0
>
>
> The issue I encountered was the following:
> 1. Partition reassignment was in progress, one replica of a partition is 
> being reassigned from broker 1 to broker 2.
> 2. Controller received an ISR change notification which indicates broker 2 
> has caught up.
> 3. Controller was sending StopReplicaRequest to broker 1.
> 4. Broker 1 zk session timeout occurs. Controller removed broker 1 from the 
> cluster and cleaned up the queue. i.e. the StopReplicaRequest was removed 
> from the ControllerChannelManager.
> 5. Broker 1 reconnected to zk and acted as if it were still a follower replica of 
> the partition. 
> 6. Broker 1 will always receive an exception from the leader because it is not 
> in the replica list.
> Not sure what the correct fix is here. It seems that broker 1 in this case 
> should ask the controller for the latest replica assignment.
> There are two related bugs:
> 1. when a {{NotAssignedReplicaException}} is thrown from 
> {{Partition.updateReplicaLogReadResult()}}, the other partitions in the same 
> request will fail to update the fetch timestamp and offset and thus also 
> drop out of the ISR.
> 2. The {{NotAssignedReplicaException}} was not properly returned to the 
> replicas; instead, an UnknownServerException is returned.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-5453) Controller may miss requests sent to the broker when zk session timeout happens.

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-5453:
-
Fix Version/s: (was: 2.6.0)
   2.7.0

> Controller may miss requests sent to the broker when zk session timeout 
> happens.
> 
>
> Key: KAFKA-5453
> URL: https://issues.apache.org/jira/browse/KAFKA-5453
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.11.0.0
>Reporter: Jiangjie Qin
>Assignee: Viktor Somogyi-Vass
>Priority: Major
> Fix For: 2.7.0
>
>
> The issue I encountered was the following:
> 1. Partition reassignment was in progress, one replica of a partition is 
> being reassigned from broker 1 to broker 2.
> 2. Controller received an ISR change notification which indicates broker 2 
> has caught up.
> 3. Controller was sending StopReplicaRequest to broker 1.
> 4. Broker 1 zk session timeout occurs. Controller removed broker 1 from the 
> cluster and cleaned up the queue. i.e. the StopReplicaRequest was removed 
> from the ControllerChannelManager.
> 5. Broker 1 reconnected to zk and acted as if it were still a follower replica of 
> the partition. 
> 6. Broker 1 will always receive an exception from the leader because it is not 
> in the replica list.
> Not sure what the correct fix is here. It seems that broker 1 in this case 
> should ask the controller for the latest replica assignment.
> There are two related bugs:
> 1. when a {{NotAssignedReplicaException}} is thrown from 
> {{Partition.updateReplicaLogReadResult()}}, the other partitions in the same 
> request will fail to update the fetch timestamp and offset and thus also 
> drop out of the ISR.
> 2. The {{NotAssignedReplicaException}} was not properly returned to the 
> replicas; instead, an UnknownServerException is returned.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9943) Enable TLSv.1.3 in system tests "run all" execution.

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-9943:
-
Fix Version/s: (was: 2.6.0)
   2.7.0

> Enable TLSv.1.3 in system tests "run all" execution.
> 
>
> Key: KAFKA-9943
> URL: https://issues.apache.org/jira/browse/KAFKA-9943
> Project: Kafka
>  Issue Type: Test
>Reporter: Nikolay Izhikov
>Assignee: Nikolay Izhikov
>Priority: Major
> Fix For: 2.7.0
>
>
> We need to enable system tests with the TLSv1.3 in "run all" execution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10185) Streams should log summarized restoration information at info level

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144371#comment-17144371
 ] 

Randall Hauch commented on KAFKA-10185:
---

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.7.0`. If this is incorrect, please respond and 
discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list 
thread.

> Streams should log summarized restoration information at info level
> ---
>
> Key: KAFKA-10185
> URL: https://issues.apache.org/jira/browse/KAFKA-10185
> Project: Kafka
>  Issue Type: Task
>  Components: streams
>Reporter: John Roesler
>Assignee: John Roesler
>Priority: Major
> Fix For: 2.7.0
>
>
> Currently, restoration progress is only visible at debug level in the 
> Consumer's Fetcher logs. Users can register a restoration listener and 
> implement their own logging, but it would substantially improve operability 
> to have some logs available at INFO level.
> Logging each partition in each restore batch at info level would be too much, 
> though, so we should print summarized logs at a decreased interval, like 
> every 10 seconds.
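
In the meantime, a user-side sketch of the summarized logging described above, 
built on the existing StateRestoreListener callback and registered via 
KafkaStreams#setGlobalStateRestoreListener:

{code:java}
import java.util.concurrent.atomic.AtomicLong;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.streams.processor.StateRestoreListener;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SummarizingRestoreListener implements StateRestoreListener {
    private static final Logger log = LoggerFactory.getLogger(SummarizingRestoreListener.class);
    private static final long INTERVAL_MS = 10_000L;    // summarize at most every 10 seconds
    private final AtomicLong lastLogMs = new AtomicLong(0L);
    private final AtomicLong restoredSoFar = new AtomicLong(0L);

    @Override
    public void onRestoreStart(final TopicPartition partition, final String storeName,
                               final long startingOffset, final long endingOffset) {
        log.info("Starting restore of {} for {} (offsets {} to {})",
                 storeName, partition, startingOffset, endingOffset);
    }

    @Override
    public void onBatchRestored(final TopicPartition partition, final String storeName,
                                final long batchEndOffset, final long numRestored) {
        final long total = restoredSoFar.addAndGet(numRestored);
        final long now = System.currentTimeMillis();
        final long last = lastLogMs.get();
        // log a summary at most once per interval, regardless of how many batches arrive
        if (now - last >= INTERVAL_MS && lastLogMs.compareAndSet(last, now)) {
            log.info("Restored {} records so far; {} is at changelog offset {}",
                     total, partition, batchEndOffset);
        }
    }

    @Override
    public void onRestoreEnd(final TopicPartition partition, final String storeName, final long totalRestored) {
        log.info("Finished restore of {} for {} ({} records)", storeName, partition, totalRestored);
    }
}
{code}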



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9458) Kafka crashed in windows environment

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-9458:
-
Fix Version/s: (was: 2.6.0)
   2.7.0

> Kafka crashed in windows environment
> 
>
> Key: KAFKA-9458
> URL: https://issues.apache.org/jira/browse/KAFKA-9458
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 2.4.0
> Environment: Windows Server 2019
>Reporter: hirik
>Priority: Critical
>  Labels: windows
> Fix For: 2.7.0
>
> Attachments: Windows_crash_fix.patch, logs.zip
>
>
> Hi,
> while I was trying to validate the Kafka retention policy, the Kafka server 
> crashed with the exception trace below. 
> [2020-01-21 17:10:40,475] INFO [Log partition=test1-3, 
> dir=C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka] 
> Rolled new log segment at offset 1 in 52 ms. (kafka.log.Log)
> [2020-01-21 17:10:40,484] ERROR Error while deleting segments for test1-3 in 
> dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka 
> (kafka.server.LogDirFailureChannel)
> java.nio.file.FileSystemException: 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex
>  -> 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex.deleted:
>  The process cannot access the file because it is being used by another 
> process.
> at 
> java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
>  at 
> java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
>  at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:395)
>  at 
> java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
>  at java.base/java.nio.file.Files.move(Files.java:1425)
>  at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:795)
>  at kafka.log.AbstractIndex.renameTo(AbstractIndex.scala:209)
>  at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:497)
>  at kafka.log.Log.$anonfun$deleteSegmentFiles$1(Log.scala:2206)
>  at kafka.log.Log.$anonfun$deleteSegmentFiles$1$adapted(Log.scala:2206)
>  at scala.collection.immutable.List.foreach(List.scala:305)
>  at kafka.log.Log.deleteSegmentFiles(Log.scala:2206)
>  at kafka.log.Log.removeAndDeleteSegments(Log.scala:2191)
>  at kafka.log.Log.$anonfun$deleteSegments$2(Log.scala:1700)
>  at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$sp.scala:17)
>  at kafka.log.Log.maybeHandleIOException(Log.scala:2316)
>  at kafka.log.Log.deleteSegments(Log.scala:1691)
>  at kafka.log.Log.deleteOldSegments(Log.scala:1686)
>  at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1763)
>  at kafka.log.Log.deleteOldSegments(Log.scala:1753)
>  at kafka.log.LogManager.$anonfun$cleanupLogs$3(LogManager.scala:982)
>  at kafka.log.LogManager.$anonfun$cleanupLogs$3$adapted(LogManager.scala:979)
>  at scala.collection.immutable.List.foreach(List.scala:305)
>  at kafka.log.LogManager.cleanupLogs(LogManager.scala:979)
>  at kafka.log.LogManager.$anonfun$startup$2(LogManager.scala:403)
>  at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:116)
>  at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:65)
>  at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>  at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at java.base/java.lang.Thread.run(Thread.java:830)
>  Suppressed: java.nio.file.FileSystemException: 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex
>  -> 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex.deleted:
>  The process cannot access the file because it is being used by another 
> process.
> at 
> java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
>  at 
> java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
>  at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309)
>  at 
> java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
>  at java.base/java.nio.file.Files.move(Files.java:1425)
>  at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:792)
>  ... 27 more
> [2020-01-21 17:10:40,495] INFO [ReplicaManager 

[jira] [Updated] (KAFKA-10185) Streams should log summarized restoration information at info level

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-10185:
--
Fix Version/s: (was: 2.6.0)

> Streams should log summarized restoration information at info level
> ---
>
> Key: KAFKA-10185
> URL: https://issues.apache.org/jira/browse/KAFKA-10185
> Project: Kafka
>  Issue Type: Task
>  Components: streams
>Reporter: John Roesler
>Assignee: John Roesler
>Priority: Major
> Fix For: 2.7.0
>
>
> Currently, restoration progress is only visible at debug level in the 
> Consumer's Fetcher logs. Users can register a restoration listener and 
> implement their own logging, but it would substantially improve operability 
> to have some logs available at INFO level.
> Logging each partition in each restore batch at info level would be too much, 
> though, so we should print summarized logs at a decreased interval, like 
> every 10 seconds.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9458) Kafka crashed in windows environment

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144370#comment-17144370
 ] 

Randall Hauch commented on KAFKA-9458:
--

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.7.0`. If this is incorrect, please respond and 
discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list 
thread.

> Kafka crashed in windows environment
> 
>
> Key: KAFKA-9458
> URL: https://issues.apache.org/jira/browse/KAFKA-9458
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 2.4.0
> Environment: Windows Server 2019
>Reporter: hirik
>Priority: Critical
>  Labels: windows
> Fix For: 2.7.0
>
> Attachments: Windows_crash_fix.patch, logs.zip
>
>
> Hi,
> while I was trying to validate the Kafka retention policy, the Kafka server 
> crashed with the exception trace below. 
> [2020-01-21 17:10:40,475] INFO [Log partition=test1-3, 
> dir=C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka] 
> Rolled new log segment at offset 1 in 52 ms. (kafka.log.Log)
> [2020-01-21 17:10:40,484] ERROR Error while deleting segments for test1-3 in 
> dir C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka 
> (kafka.server.LogDirFailureChannel)
> java.nio.file.FileSystemException: 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex
>  -> 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex.deleted:
>  The process cannot access the file because it is being used by another 
> process.
> at 
> java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
>  at 
> java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
>  at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:395)
>  at 
> java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292)
>  at java.base/java.nio.file.Files.move(Files.java:1425)
>  at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:795)
>  at kafka.log.AbstractIndex.renameTo(AbstractIndex.scala:209)
>  at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:497)
>  at kafka.log.Log.$anonfun$deleteSegmentFiles$1(Log.scala:2206)
>  at kafka.log.Log.$anonfun$deleteSegmentFiles$1$adapted(Log.scala:2206)
>  at scala.collection.immutable.List.foreach(List.scala:305)
>  at kafka.log.Log.deleteSegmentFiles(Log.scala:2206)
>  at kafka.log.Log.removeAndDeleteSegments(Log.scala:2191)
>  at kafka.log.Log.$anonfun$deleteSegments$2(Log.scala:1700)
>  at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$sp.scala:17)
>  at kafka.log.Log.maybeHandleIOException(Log.scala:2316)
>  at kafka.log.Log.deleteSegments(Log.scala:1691)
>  at kafka.log.Log.deleteOldSegments(Log.scala:1686)
>  at kafka.log.Log.deleteRetentionMsBreachedSegments(Log.scala:1763)
>  at kafka.log.Log.deleteOldSegments(Log.scala:1753)
>  at kafka.log.LogManager.$anonfun$cleanupLogs$3(LogManager.scala:982)
>  at kafka.log.LogManager.$anonfun$cleanupLogs$3$adapted(LogManager.scala:979)
>  at scala.collection.immutable.List.foreach(List.scala:305)
>  at kafka.log.LogManager.cleanupLogs(LogManager.scala:979)
>  at kafka.log.LogManager.$anonfun$startup$2(LogManager.scala:403)
>  at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:116)
>  at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:65)
>  at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>  at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at java.base/java.lang.Thread.run(Thread.java:830)
>  Suppressed: java.nio.file.FileSystemException: 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex
>  -> 
> C:\Users\Administrator\Downloads\kafka\bin\windows\..\..\data\kafka\test1-3\.timeindex.deleted:
>  The process cannot access the file because it is being used by another 
> process.
> at 
> java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
>  at 
> java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
>  at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309)
>  at 
> 

[jira] [Commented] (KAFKA-10166) Excessive TaskCorruptedException seen in testing

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144358#comment-17144358
 ] 

Randall Hauch commented on KAFKA-10166:
---

[~ableegoldman], [~cadonna]: Do we want to continue treating this as a blocker 
for the 2.6.0 release? If so, what's the timeframe for fixing this?

If this should not block the release, should we downgrade the priority and/or 
change the fix versions to 2.6.1 and/or 2.7.0?

> Excessive TaskCorruptedException seen in testing
> 
>
> Key: KAFKA-10166
> URL: https://issues.apache.org/jira/browse/KAFKA-10166
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Sophie Blee-Goldman
>Assignee: Bruno Cadonna
>Priority: Blocker
> Fix For: 2.6.0
>
>
> As the title indicates, long-running test applications with injected network 
> "outages" seem to hit TaskCorruptedException more than expected.
> Seen occasionally on the ALOS application (~20 times in two days in one case, 
> for example), and very frequently with EOS (many times per day)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10143) Can no longer change replication throttle with reassignment tool

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144357#comment-17144357
 ] 

Randall Hauch commented on KAFKA-10143:
---

[~hachikuji]: do we want to continue treating this as a blocker for the 2.6.0 
release? If so, what's the timeframe for fixing this?

If this should not block the release, should we downgrade the priority and/or 
change the fix versions to 2.6.1 and/or 2.7.0?

> Can no longer change replication throttle with reassignment tool
> 
>
> Key: KAFKA-10143
> URL: https://issues.apache.org/jira/browse/KAFKA-10143
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
> Fix For: 2.6.0
>
>
> Previously we could use --execute with the --throttle option in order to 
> change the quota of an active reassignment. We seem to have lost this with 
> KIP-455. The code has the following comment:
> {code}
> val reassignPartitionsInProgress = zkClient.reassignPartitionsInProgress()
> if (reassignPartitionsInProgress) {
>   // Note: older versions of this tool would modify the broker quotas here (but not
>   // topic quotas, for some reason).  This behavior wasn't documented in the --execute
>   // command line help.  Since it might interfere with other ongoing reassignments,
>   // this behavior was dropped as part of the KIP-455 changes.
>   throw new TerseReassignmentFailureException(cannotExecuteBecauseOfExistingMessage)
> }
> {code}
> Seems like it was a mistake to change this because it breaks compatibility. 
> We probably have to revert. At the same time, we can make the intent clearer 
> both in the code and in the command help output.
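
For reference, the quotas the tool used to adjust are the dynamic broker configs
`leader.replication.throttled.rate` and `follower.replication.throttled.rate`, so they
can in principle be changed directly with the admin client while a reassignment is
active. A rough sketch, assuming broker id "1", a bootstrap server of localhost:9092,
and a 50 MB/s throttle; note the tool also manages the per-topic throttled-replica
lists, which this sketch leaves out:

{code}
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class ReplicationThrottleExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed address

        try (Admin admin = Admin.create(props)) {
            // Broker-level dynamic configs that back the replication throttle
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "1"); // assumed broker id
            Collection<AlterConfigOp> ops = Arrays.asList(
                new AlterConfigOp(new ConfigEntry("leader.replication.throttled.rate", "50000000"),
                                  AlterConfigOp.OpType.SET),
                new AlterConfigOp(new ConfigEntry("follower.replication.throttled.rate", "50000000"),
                                  AlterConfigOp.OpType.SET));

            Map<ConfigResource, Collection<AlterConfigOp>> updates =
                Collections.singletonMap(broker, ops);
            admin.incrementalAlterConfigs(updates).all().get(); // applied without a broker restart
        }
    }
}
{code}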



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10134) High CPU issue during rebalance in Kafka consumer after upgrading to 2.5

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144355#comment-17144355
 ] 

Randall Hauch commented on KAFKA-10134:
---

[~guozhang], [~ijuma], [~seanguo]: what's the status of this? Do we want to 
continue treating this as a blocker for the 2.6.0 release? If so, what's the 
timeframe for fixing this?

If this should not block the release, should we downgrade the priority and/or 
change the fix versions to 2.6.1 and/or 2.7.0?

> High CPU issue during rebalance in Kafka consumer after upgrading to 2.5
> 
>
> Key: KAFKA-10134
> URL: https://issues.apache.org/jira/browse/KAFKA-10134
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.5.0
>Reporter: Sean Guo
>Priority: Blocker
> Fix For: 2.6.0, 2.5.1
>
>
> We want to utilize the new rebalance protocol to mitigate the stop-the-world 
> effect during rebalances, since our tasks are long-running.
> But after the upgrade, when we kill an instance to trigger a rebalance while 
> there is some load (some tasks run longer than 30s), the CPU goes sky-high. 
> It reads ~700% in our metrics, so several threads must be in a tight loop. We 
> have several consumer threads consuming from different partitions during the 
> rebalance. This is reproducible with both the new CooperativeStickyAssignor 
> and the old eager rebalance protocol. The difference is that with the old 
> eager protocol the high CPU usage drops once the rebalance is done. With the 
> cooperative one, the consumer threads seem to be stuck on something and never 
> finish the rebalance, so the CPU usage stays high until we stop our load. A 
> small load without long-running tasks does not cause continuous high CPU 
> usage either, as the rebalance can finish in that case.
>  
> "executor.kafka-consumer-executor-4" #124 daemon prio=5 os_prio=0 
> cpu=76853.07ms elapsed=841.16s tid=0x7fe11f044000 nid=0x1f4 runnable  
> [0x7fe119aab000]"executor.kafka-consumer-executor-4" #124 daemon prio=5 
> os_prio=0 cpu=76853.07ms elapsed=841.16s tid=0x7fe11f044000 nid=0x1f4 
> runnable  [0x7fe119aab000]   java.lang.Thread.State: RUNNABLE at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:467)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1275)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1241) 
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1216) 
> at
>  
> By debugging into the code we found that the clients appear to be in a loop 
> trying to find the coordinator.
> I also tried the old rebalance protocol on the new version; the issue still 
> exists, but the CPU goes back to normal once the rebalance is done.
> I also tried the same on 2.4.1, which does not seem to have this issue, so it 
> appears to be related to something that changed between 2.4.1 and 2.5.0.
>  
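
For context, the cooperative protocol mentioned above is enabled purely through the
consumer's assignor configuration. A minimal sketch of that setup, with placeholder
bootstrap server, group id, and topic names:

{code}
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CooperativeConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "long-running-tasks");      // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Incremental (cooperative) rebalancing instead of the eager stop-the-world protocol
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                  CooperativeStickyAssignor.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("input-topic"));     // placeholder
            // poll loop elided; with long-running records the reporter observed the
            // rebalance looping inside ConsumerCoordinator.poll() on 2.5.0
        }
    }
}
{code}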



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10017) Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144352#comment-17144352
 ] 

Randall Hauch commented on KAFKA-10017:
---

[~vvcephei], [~ableegoldman]: what's the status of this? Is this more than just 
a flaky test, and should we keep this as a blocker for the 2.6.0 release? If 
so, what's the timeframe for fixing this?

> Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta
> ---
>
> Key: KAFKA-10017
> URL: https://issues.apache.org/jira/browse/KAFKA-10017
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.6.0
>Reporter: Sophie Blee-Goldman
>Assignee: Matthias J. Sax
>Priority: Blocker
>  Labels: flaky-test, unit-test
> Fix For: 2.6.0
>
>
> Creating a new ticket for this since the root cause is different than 
> https://issues.apache.org/jira/browse/KAFKA-9966
> With injectError = true:
> h3. Stacktrace
> java.lang.AssertionError: Did not receive all 20 records from topic 
> multiPartitionOutputTopic within 6 ms Expected: is a value equal to or 
> greater than <20> but: <15> was less than <20> at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:563)
>  at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:429)
>  at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:397)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:559)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:530)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.readResult(EosBetaUpgradeIntegrationTest.java:973)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.verifyCommitted(EosBetaUpgradeIntegrationTest.java:961)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta(EosBetaUpgradeIntegrationTest.java:427)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8073) Transient failure in kafka.api.UserQuotaTest.testThrottledProducerConsumer

2020-06-24 Thread Randall Hauch (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144349#comment-17144349
 ] 

Randall Hauch commented on KAFKA-8073:
--

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
removing `2.6.0` from the fix version and adding `2.6.1` and `2.7.0`. If this 
is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 
release" discussion mailing list thread.

> Transient failure in kafka.api.UserQuotaTest.testThrottledProducerConsumer
> --
>
> Key: KAFKA-8073
> URL: https://issues.apache.org/jira/browse/KAFKA-8073
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Bill Bejeck
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 2.2.3, 2.7.0, 2.6.1
>
>
> Failed in build [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/20134/]
>  
> Stacktrace and STDOUT
> {noformat}
> Error Message
> java.lang.AssertionError: Client with id=QuotasTestProducer-1 should have 
> been throttled
> Stacktrace
> java.lang.AssertionError: Client with id=QuotasTestProducer-1 should have 
> been throttled
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at 
> kafka.api.QuotaTestClients.verifyThrottleTimeMetric(BaseQuotaTest.scala:229)
>   at 
> kafka.api.QuotaTestClients.verifyProduceThrottle(BaseQuotaTest.scala:215)
>   at 
> kafka.api.BaseQuotaTest.testThrottledProducerConsumer(BaseQuotaTest.scala:82)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:412)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)

[jira] [Updated] (KAFKA-8073) Transient failure in kafka.api.UserQuotaTest.testThrottledProducerConsumer

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8073:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

> Transient failure in kafka.api.UserQuotaTest.testThrottledProducerConsumer
> --
>
> Key: KAFKA-8073
> URL: https://issues.apache.org/jira/browse/KAFKA-8073
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Bill Bejeck
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 2.2.3, 2.7.0, 2.6.1
>
>
> Failed in build [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/20134/]
>  
> Stacktrace and STDOUT
> {noformat}
> Error Message
> java.lang.AssertionError: Client with id=QuotasTestProducer-1 should have 
> been throttled
> Stacktrace
> java.lang.AssertionError: Client with id=QuotasTestProducer-1 should have 
> been throttled
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at 
> kafka.api.QuotaTestClients.verifyThrottleTimeMetric(BaseQuotaTest.scala:229)
>   at 
> kafka.api.QuotaTestClients.verifyProduceThrottle(BaseQuotaTest.scala:215)
>   at 
> kafka.api.BaseQuotaTest.testThrottledProducerConsumer(BaseQuotaTest.scala:82)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:412)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> 

[jira] [Updated] (KAFKA-10135) Extract Task#executeAndMaybeSwallow to be a general utility function into TaskManager

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-10135:
--
Fix Version/s: 2.6.0

> Extract Task#executeAndMaybeSwallow to be a general utility function into 
> TaskManager
> -
>
> Key: KAFKA-10135
> URL: https://issues.apache.org/jira/browse/KAFKA-10135
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.5.0, 2.6.0
>Reporter: Boyang Chen
>Assignee: feyman
>Priority: Major
> Fix For: 2.6.0, 2.7.0
>
>
> We have a couple of cases where we need to swallow exceptions during 
> operations in both the Task and TaskManager classes. This utility method 
> should be generalized to at least the TaskManager level. See the discussion 
> comment [here|https://github.com/apache/kafka/pull/8833#discussion_r437697665].
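
A rough sketch of the shape such a shared utility could take; the actual method
signature in the Streams codebase may differ, and the class and call-site names here
are hypothetical:

{code}
import org.slf4j.Logger;

// Hypothetical generalized helper: run an action and either rethrow or log-and-swallow,
// depending on whether we are on a clean or a dirty close path.
final class TaskExecutionUtil {
    private TaskExecutionUtil() {}

    static void executeAndMaybeSwallow(final boolean clean,
                                       final Runnable action,
                                       final String name,
                                       final Logger log) {
        try {
            action.run();
        } catch (final RuntimeException e) {
            if (clean) {
                throw e; // clean path: surface the failure to the caller
            }
            log.debug("Ignoring error in unclean {}", name, e); // dirty path: swallow and log
        }
    }
}

// Example call site:
// TaskExecutionUtil.executeAndMaybeSwallow(clean, stateMgr::close, "state manager close", log);
{code}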



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7964) Flaky Test ConsumerBounceTest#testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-7964:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> ConsumerBounceTest#testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize
> --
>
> Key: KAFKA-7964
> URL: https://issues.apache.org/jira/browse/KAFKA-7964
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> To get stable nightly builds for `2.2` release, I create tickets for all 
> observed test failures.
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/21/]
> {quote}java.lang.AssertionError: expected:<100> but was:<0> at 
> org.junit.Assert.fail(Assert.java:88) at 
> org.junit.Assert.failNotEquals(Assert.java:834) at 
> org.junit.Assert.assertEquals(Assert.java:645) at 
> org.junit.Assert.assertEquals(Assert.java:631) at 
> kafka.api.ConsumerBounceTest.receiveExactRecords(ConsumerBounceTest.scala:551)
>  at 
> kafka.api.ConsumerBounceTest.$anonfun$testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize$2(ConsumerBounceTest.scala:409)
>  at 
> kafka.api.ConsumerBounceTest.$anonfun$testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize$2$adapted(ConsumerBounceTest.scala:408)
>  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) 
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at 
> kafka.api.ConsumerBounceTest.testConsumerReceivesFatalExceptionWhenGroupPassesMaxSize(ConsumerBounceTest.scala:408){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8085) Flaky Test ResetConsumerGroupOffsetTest#testResetOffsetsByDuration

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8085:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test ResetConsumerGroupOffsetTest#testResetOffsetsByDuration
> --
>
> Key: KAFKA-8085
> URL: https://issues.apache.org/jira/browse/KAFKA-8085
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/62/testReport/junit/kafka.admin/ResetConsumerGroupOffsetTest/testResetOffsetsByDuration/]
> {quote}java.lang.AssertionError: Expected that consumer group has consumed 
> all messages from topic/partition. at 
> kafka.utils.TestUtils$.fail(TestUtils.scala:381) at 
> kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at 
> kafka.admin.ResetConsumerGroupOffsetTest.awaitConsumerProgress(ResetConsumerGroupOffsetTest.scala:364)
>  at 
> kafka.admin.ResetConsumerGroupOffsetTest.produceConsumeAndShutdown(ResetConsumerGroupOffsetTest.scala:359)
>  at 
> kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsByDuration(ResetConsumerGroupOffsetTest.scala:146){quote}
> STDOUT
> {quote}[2019-03-09 08:39:29,856] WARN Unable to read additional data from 
> client sessionid 0x105f6adb208, likely client has closed socket 
> (org.apache.zookeeper.server.NIOServerCnxn:376) [2019-03-09 08:39:46,373] 
> WARN Unable to read additional data from client sessionid 0x105f6adf4c50001, 
> likely client has closed socket 
> (org.apache.zookeeper.server.NIOServerCnxn:376){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7540) Flaky Test ConsumerBounceTest#testClose

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-7540:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test ConsumerBounceTest#testClose
> ---
>
> Key: KAFKA-7540
> URL: https://issues.apache.org/jira/browse/KAFKA-7540
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, unit tests
>Affects Versions: 2.2.0
>Reporter: John Roesler
>Assignee: Jason Gustafson
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> Observed on Java 8: 
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/17314/testReport/junit/kafka.api/ConsumerBounceTest/testClose/]
>  
> Stacktrace:
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: -1
>   at 
> kafka.integration.KafkaServerTestHarness.killBroker(KafkaServerTestHarness.scala:146)
>   at 
> kafka.api.ConsumerBounceTest.checkCloseWithCoordinatorFailure(ConsumerBounceTest.scala:238)
>   at kafka.api.ConsumerBounceTest.testClose(ConsumerBounceTest.scala:211)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
>   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>   at 
> 

[jira] [Updated] (KAFKA-8140) Flaky Test SaslSslAdminClientIntegrationTest#testDescribeAndAlterConfigs

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8140:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test SaslSslAdminClientIntegrationTest#testDescribeAndAlterConfigs
> 
>
> Key: KAFKA-8140
> URL: https://issues.apache.org/jira/browse/KAFKA-8140
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/80/testReport/junit/kafka.api/SaslSslAdminClientIntegrationTest/testDescribeAndAlterConfigs/]
> {quote}java.lang.IllegalArgumentException: Could not find a 'KafkaServer' or 
> 'sasl_ssl.KafkaServer' entry in the JAAS configuration. System property 
> 'java.security.auth.login.config' is not set at 
> org.apache.kafka.common.security.JaasContext.defaultContext(JaasContext.java:133)
>  at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:98) at 
> org.apache.kafka.common.security.JaasContext.loadServerContext(JaasContext.java:70)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:121)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.serverChannelBuilder(ChannelBuilders.java:85)
>  at kafka.network.Processor.(SocketServer.scala:694) at 
> kafka.network.SocketServer.newProcessor(SocketServer.scala:344) at 
> kafka.network.SocketServer.$anonfun$addDataPlaneProcessors$1(SocketServer.scala:253)
>  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at 
> kafka.network.SocketServer.addDataPlaneProcessors(SocketServer.scala:252) at 
> kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1(SocketServer.scala:216)
>  at 
> kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1$adapted(SocketServer.scala:214)
>  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) 
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at 
> kafka.network.SocketServer.createDataPlaneAcceptorsAndProcessors(SocketServer.scala:214)
>  at kafka.network.SocketServer.startup(SocketServer.scala:114) at 
> kafka.server.KafkaServer.startup(KafkaServer.scala:253) at 
> kafka.utils.TestUtils$.createServer(TestUtils.scala:140) at 
> kafka.integration.KafkaServerTestHarness.$anonfun$setUp$1(KafkaServerTestHarness.scala:101)
>  at scala.collection.Iterator.foreach(Iterator.scala:941) at 
> scala.collection.Iterator.foreach$(Iterator.scala:941) at 
> scala.collection.AbstractIterator.foreach(Iterator.scala:1429) at 
> scala.collection.IterableLike.foreach(IterableLike.scala:74) at 
> scala.collection.IterableLike.foreach$(IterableLike.scala:73) at 
> scala.collection.AbstractIterable.foreach(Iterable.scala:56) at 
> kafka.integration.KafkaServerTestHarness.setUp(KafkaServerTestHarness.scala:100)
>  at kafka.api.IntegrationTestHarness.doSetup(IntegrationTestHarness.scala:81) 
> at kafka.api.IntegrationTestHarness.setUp(IntegrationTestHarness.scala:73) at 
> kafka.api.AdminClientIntegrationTest.setUp(AdminClientIntegrationTest.scala:79)
>  at 
> kafka.api.SaslSslAdminClientIntegrationTest.setUp(SaslSslAdminClientIntegrationTest.scala:64){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8250) Flaky Test DelegationTokenEndToEndAuthorizationTest#testProduceConsumeViaAssign

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8250:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> DelegationTokenEndToEndAuthorizationTest#testProduceConsumeViaAssign
> ---
>
> Key: KAFKA-8250
> URL: https://issues.apache.org/jira/browse/KAFKA-8250
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk11/detail/kafka-trunk-jdk11/442/tests]
> {quote}java.lang.AssertionError: Consumed more records than expected 
> expected:<1> but was:<2>
> at org.junit.Assert.fail(Assert.java:89)
> at org.junit.Assert.failNotEquals(Assert.java:835)
> at org.junit.Assert.assertEquals(Assert.java:647)
> at kafka.utils.TestUtils$.consumeRecords(TestUtils.scala:1288)
> at 
> kafka.api.EndToEndAuthorizationTest.consumeRecords(EndToEndAuthorizationTest.scala:460)
> at 
> kafka.api.EndToEndAuthorizationTest.testProduceConsumeViaAssign(EndToEndAuthorizationTest.scala:209){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8138) Flaky Test PlaintextConsumerTest#testFetchRecordLargerThanFetchMaxBytes

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8138:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test PlaintextConsumerTest#testFetchRecordLargerThanFetchMaxBytes
> ---
>
> Key: KAFKA-8138
> URL: https://issues.apache.org/jira/browse/KAFKA-8138
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/80/testReport/junit/kafka.api/PlaintextConsumerTest/testFetchRecordLargerThanFetchMaxBytes/]
> {quote}java.lang.AssertionError: Partition [topic,0] metadata not propagated 
> after 15000 ms at kafka.utils.TestUtils$.fail(TestUtils.scala:381) at 
> kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at 
> kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at 
> scala.collection.immutable.Range.foreach(Range.scala:158) at 
> scala.collection.TraversableLike.map(TraversableLike.scala:237) at 
> scala.collection.TraversableLike.map$(TraversableLike.scala:230) at 
> scala.collection.AbstractTraversable.map(Traversable.scala:108) at 
> kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at 
> kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:125)
>  at kafka.api.BaseConsumerTest.setUp(BaseConsumerTest.scala:69){quote}
> STDOUT (truncated)
> {quote}[2019-03-20 16:10:19,759] ERROR [ReplicaFetcher replicaId=2, 
> leaderId=0, fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-20 16:10:19,760] ERROR 
> [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition 
> __consumer_offsets-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-20 16:10:19,963] ERROR 
> [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition 
> topic-1 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-20 16:10:19,964] ERROR 
> [ReplicaFetcher replicaId=1, leaderId=2, fetcherId=0] Error for partition 
> topic-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-20 16:10:19,975] ERROR 
> [ReplicaFetcher replicaId=0, leaderId=2, fetcherId=0] Error for partition 
> topic-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.{quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7947) Flaky Test EpochDrivenReplicationProtocolAcceptanceTest#shouldFollowLeaderEpochBasicWorkflow

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-7947:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> EpochDrivenReplicationProtocolAcceptanceTest#shouldFollowLeaderEpochBasicWorkflow
> 
>
> Key: KAFKA-7947
> URL: https://issues.apache.org/jira/browse/KAFKA-7947
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> To get stable nightly builds for `2.2` release, I create tickets for all 
> observed test failures.
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/17/]
> {quote}java.lang.AssertionError: expected: startOffset=0), EpochEntry(epoch=1, startOffset=1))> but 
> was: startOffset=1))> at org.junit.Assert.fail(Assert.java:88) at 
> org.junit.Assert.failNotEquals(Assert.java:834) at 
> org.junit.Assert.assertEquals(Assert.java:118) at 
> org.junit.Assert.assertEquals(Assert.java:144) at 
> kafka.server.epoch.EpochDrivenReplicationProtocolAcceptanceTest.shouldFollowLeaderEpochBasicWorkflow(EpochDrivenReplicationProtocolAcceptanceTest.scala:101){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7969) Flaky Test DescribeConsumerGroupTest#testDescribeOffsetsOfExistingGroupWithNoMembers

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-7969:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> DescribeConsumerGroupTest#testDescribeOffsetsOfExistingGroupWithNoMembers
> 
>
> Key: KAFKA-7969
> URL: https://issues.apache.org/jira/browse/KAFKA-7969
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> To get stable nightly builds for `2.2` release, I create tickets for all 
> observed test failures.
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/24/]
> {quote}java.lang.AssertionError: Expected no active member in describe group 
> results, state: Some(Empty), assignments: Some(List()) at 
> org.junit.Assert.fail(Assert.java:88) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> kafka.admin.DescribeConsumerGroupTest.testDescribeOffsetsOfExistingGroupWithNoMembers(DescribeConsumerGroupTest.scala:278{quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8267) Flaky Test SaslAuthenticatorTest#testUserCredentialsUnavailableForScramMechanism

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8267:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> SaslAuthenticatorTest#testUserCredentialsUnavailableForScramMechanism
> 
>
> Key: KAFKA-8267
> URL: https://issues.apache.org/jira/browse/KAFKA-8267
> Project: Kafka
>  Issue Type: Bug
>  Components: core, security, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3925/testReport/junit/org.apache.kafka.common.security.authenticator/SaslAuthenticatorTest/testUserCredentialsUnavailableForScramMechanism/]
> {quote}java.lang.AssertionError: Metric not updated 
> successful-reauthentication-total expected:<0.0> but was:<1.0> expected:<0.0> 
> but was:<1.0> at org.junit.Assert.fail(Assert.java:89) at 
> org.junit.Assert.failNotEquals(Assert.java:835) at 
> org.junit.Assert.assertEquals(Assert.java:555) at 
> org.apache.kafka.common.network.NioEchoServer.waitForMetrics(NioEchoServer.java:190)
>  at 
> org.apache.kafka.common.network.NioEchoServer.verifyReauthenticationMetrics(NioEchoServer.java:157)
>  at 
> org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest.testUserCredentialsUnavailableForScramMechanism(SaslAuthenticatorTest.java:501){quote}
> STDOUT
> {quote}[2019-04-19 22:15:35,524] ERROR Extensions provided in login context 
> without a token 
> (org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule:318) 
> java.io.IOException: Extensions provided in login context without a token at 
> org.apache.kafka.common.security.oauthbearer.internals.unsecured.OAuthBearerUnsecuredLoginCallbackHandler.handle(OAuthBearerUnsecuredLoginCallbackHandler.java:164)
>  at 
> org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule.identifyToken(OAuthBearerLoginModule.java:316)
>  at 
> org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule.login(OAuthBearerLoginModule.java:301)
>  at 
> java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:726)
>  at 
> java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:665) 
> at 
> java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:663) 
> at java.base/java.security.AccessController.doPrivileged(Native Method) at 
> java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:663)
>  at 
> java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:574) 
> at 
> org.apache.kafka.common.security.authenticator.AbstractLogin.login(AbstractLogin.java:60)
>  at 
> org.apache.kafka.common.security.authenticator.LoginManager.(LoginManager.java:61)
>  at 
> org.apache.kafka.common.security.authenticator.LoginManager.acquireLoginManager(LoginManager.java:104)
>  at 
> org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:149)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:146)
>  at 
> org.apache.kafka.common.network.ChannelBuilders.serverChannelBuilder(ChannelBuilders.java:85)
>  at 
> org.apache.kafka.common.network.NioEchoServer.(NioEchoServer.java:121) 
> at 
> org.apache.kafka.common.network.NioEchoServer.(NioEchoServer.java:97) 
> at 
> org.apache.kafka.common.network.NetworkTestUtils.createEchoServer(NetworkTestUtils.java:49)
>  at 
> org.apache.kafka.common.network.NetworkTestUtils.createEchoServer(NetworkTestUtils.java:43)
>  at 
> org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest.createEchoServer(SaslAuthenticatorTest.java:1851)
>  at 
> org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest.createEchoServer(SaslAuthenticatorTest.java:1847)
>  at 
> org.apache.kafka.common.security.authenticator.SaslAuthenticatorTest.testValidSaslOauthBearerMechanismWithoutServerTokens(SaslAuthenticatorTest.java:1586)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> 

[jira] [Updated] (KAFKA-8268) Flaky Test SaslSslAdminIntegrationTest#testSeekAfterDeleteRecords

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8268:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test SaslSslAdminIntegrationTest#testSeekAfterDeleteRecords
> -
>
> Key: KAFKA-8268
> URL: https://issues.apache.org/jira/browse/KAFKA-8268
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3570/tests]
> {quote}java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.TimeoutException: Aborted due to timeout.
> at 
> org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
>  
> at 
> org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
> at 
> org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
> at 
> org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
> at 
> kafka.api.AdminClientIntegrationTest.testSeekAfterDeleteRecords(AdminClientIntegrationTest.scala:775){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8269) Flaky Test TopicCommandWithAdminClientTest#testDescribeUnderMinIsrPartitionsMixed

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8269:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> TopicCommandWithAdminClientTest#testDescribeUnderMinIsrPartitionsMixed
> -
>
> Key: KAFKA-8269
> URL: https://issues.apache.org/jira/browse/KAFKA-8269
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3573/tests]
> {quote}java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:87)
> at org.junit.Assert.assertTrue(Assert.java:42)
> at org.junit.Assert.assertTrue(Assert.java:53)
> at 
> kafka.admin.TopicCommandWithAdminClientTest.testDescribeUnderMinIsrPartitionsMixed(TopicCommandWithAdminClientTest.scala:659){quote}
> It's a long LOG. This might be interesting:
> {quote}[2019-04-20 21:30:37,936] ERROR [ReplicaFetcher replicaId=4, 
> leaderId=5, fetcherId=0] Error for partition 
> testCreateWithReplicaAssignment-0cpsXnG35w-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-20 21:30:48,600] WARN Unable to read additional data from client 
> sessionid 0x10510a59d3c0004, likely client has closed socket 
> (org.apache.zookeeper.server.NIOServerCnxn:376)
> [2019-04-20 21:30:48,908] WARN Unable to read additional data from client 
> sessionid 0x10510a59d3c0003, likely client has closed socket 
> (org.apache.zookeeper.server.NIOServerCnxn:376)
> [2019-04-20 21:30:48,919] ERROR [RequestSendThread controllerId=0] Controller 
> 0 fails to send a request to broker localhost:43520 (id: 5 rack: rack3) 
> (kafka.controller.RequestSendThread:76)
> java.lang.InterruptedException
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
> at kafka.utils.ShutdownableThread.pause(ShutdownableThread.scala:75)
> at 
> kafka.controller.RequestSendThread.backoff$1(ControllerChannelManager.scala:224)
> at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:252)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89)
> [2019-04-20 21:30:48,920] ERROR [RequestSendThread controllerId=0] Controller 
> 0 fails to send a request to broker localhost:33570 (id: 4 rack: rack3) 
> (kafka.controller.RequestSendThread:76)
> java.lang.InterruptedException
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
> at kafka.utils.ShutdownableThread.pause(ShutdownableThread.scala:75)
> at 
> kafka.controller.RequestSendThread.backoff$1(ControllerChannelManager.scala:224)
> at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:252)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89)
> [2019-04-20 21:31:28,942] ERROR [ReplicaFetcher replicaId=3, leaderId=1, 
> fetcherId=0] Error for partition under-min-isr-topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-04-20 21:31:28,973] ERROR [ReplicaFetcher replicaId=0, leaderId=1, 
> fetcherId=0] Error for partition under-min-isr-topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8033) Flaky Test PlaintextConsumerTest#testFetchInvalidOffset

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8033:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test PlaintextConsumerTest#testFetchInvalidOffset
> ---
>
> Key: KAFKA-8033
> URL: https://issues.apache.org/jira/browse/KAFKA-8033
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/2829/testReport/junit/kafka.api/PlaintextConsumerTest/testFetchInvalidOffset/]
> {quote}org.scalatest.junit.JUnitTestFailedError: Expected exception 
> org.apache.kafka.clients.consumer.NoOffsetForPartitionException to be thrown, 
> but no exception was thrown{quote}
> STDOUT prints this over and over again:
> {quote}[2019-03-02 04:01:25,576] ERROR [ReplicaFetcher replicaId=0, 
> leaderId=1, fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7957) Flaky Test DynamicBrokerReconfigurationTest#testMetricsReporterUpdate

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-7957:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test DynamicBrokerReconfigurationTest#testMetricsReporterUpdate
> -
>
> Key: KAFKA-7957
> URL: https://issues.apache.org/jira/browse/KAFKA-7957
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> To get stable nightly builds for `2.2` release, I create tickets for all 
> observed test failures.
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/18/]
> {quote}java.lang.AssertionError: Messages not sent at 
> kafka.utils.TestUtils$.fail(TestUtils.scala:356) at 
> kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:766) at 
> kafka.server.DynamicBrokerReconfigurationTest.startProduceConsume(DynamicBrokerReconfigurationTest.scala:1270)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.testMetricsReporterUpdate(DynamicBrokerReconfigurationTest.scala:650){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8115) Flaky Test CoordinatorTest#testTaskRequestWithOldStartMsGetsUpdated

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8115:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test CoordinatorTest#testTaskRequestWithOldStartMsGetsUpdated
> ---
>
> Key: KAFKA-8115
> URL: https://issues.apache.org/jira/browse/KAFKA-8115
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3254/testReport/junit/org.apache.kafka.trogdor.coordinator/CoordinatorTest/testTaskRequestWithOldStartMsGetsUpdated/]
> {quote}org.junit.runners.model.TestTimedOutException: test timed out after 
> 12 milliseconds at java.base@11.0.1/jdk.internal.misc.Unsafe.park(Native 
> Method) at 
> java.base@11.0.1/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
>  at 
> java.base@11.0.1/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
>  at 
> java.base@11.0.1/java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1454)
>  at 
> java.base@11.0.1/java.util.concurrent.Executors$DelegatedExecutorService.awaitTermination(Executors.java:709)
>  at 
> app//org.apache.kafka.trogdor.rest.JsonRestServer.waitForShutdown(JsonRestServer.java:157)
>  at app//org.apache.kafka.trogdor.agent.Agent.waitForShutdown(Agent.java:123) 
> at 
> app//org.apache.kafka.trogdor.common.MiniTrogdorCluster.close(MiniTrogdorCluster.java:285)
>  at 
> app//org.apache.kafka.trogdor.coordinator.CoordinatorTest.testTaskRequestWithOldStartMsGetsUpdated(CoordinatorTest.java:596)
>  at 
> java.base@11.0.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base@11.0.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> java.base@11.0.1/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.base@11.0.1/java.lang.reflect.Method.invoke(Method.java:566) at 
> app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>  at 
> app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>  at java.base@11.0.1/java.util.concurrent.FutureTask.run(FutureTask.java:264) 
> at java.base@11.0.1/java.lang.Thread.run(Thread.java:834){quote}
> STDOUT
> {quote}[2019-03-15 09:23:41,364] INFO Creating MiniTrogdorCluster with 
> agents: node02 and coordinator: node01 
> (org.apache.kafka.trogdor.common.MiniTrogdorCluster:135) [2019-03-15 
> 09:23:41,595] INFO Logging initialized @13340ms to 
> org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log:193) 
> [2019-03-15 09:23:41,752] INFO Starting REST server 
> (org.apache.kafka.trogdor.rest.JsonRestServer:89) [2019-03-15 09:23:41,912] 
> INFO Registered resource 
> org.apache.kafka.trogdor.agent.AgentRestResource@3fa38ceb 
> (org.apache.kafka.trogdor.rest.JsonRestServer:94) [2019-03-15 09:23:42,178] 
> INFO jetty-9.4.14.v20181114; built: 2018-11-14T21:20:31.478Z; git: 
> c4550056e785fb5665914545889f21dc136ad9e6; jvm 11.0.1+13-LTS 
> (org.eclipse.jetty.server.Server:370) [2019-03-15 09:23:42,360] INFO 
> DefaultSessionIdManager workerName=node0 
> (org.eclipse.jetty.server.session:365) [2019-03-15 09:23:42,362] INFO No 
> SessionScavenger set, using defaults (org.eclipse.jetty.server.session:370) 
> [2019-03-15 09:23:42,370] INFO node0 Scavenging every 66ms 
> (org.eclipse.jetty.server.session:149) [2019-03-15 09:23:44,412] INFO Started 
> o.e.j.s.ServletContextHandler@335a5293\{/,null,AVAILABLE} 
> (org.eclipse.jetty.server.handler.ContextHandler:855) [2019-03-15 
> 09:23:44,473] INFO Started 
> ServerConnector@79a93bf1\{HTTP/1.1,[http/1.1]}{0.0.0.0:33477} 
> (org.eclipse.jetty.server.AbstractConnector:292) [2019-03-15 09:23:44,474] 
> INFO Started @16219ms (org.eclipse.jetty.server.Server:407) 

[jira] [Updated] (KAFKA-8113) Flaky Test ListOffsetsRequestTest#testResponseIncludesLeaderEpoch

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8113:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test ListOffsetsRequestTest#testResponseIncludesLeaderEpoch
> -
>
> Key: KAFKA-8113
> URL: https://issues.apache.org/jira/browse/KAFKA-8113
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3468/tests]
> {quote}java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:87)
> at org.junit.Assert.assertTrue(Assert.java:42)
> at org.junit.Assert.assertTrue(Assert.java:53)
> at 
> kafka.server.ListOffsetsRequestTest.fetchOffsetAndEpoch$1(ListOffsetsRequestTest.scala:136)
> at 
> kafka.server.ListOffsetsRequestTest.testResponseIncludesLeaderEpoch(ListOffsetsRequestTest.scala:151){quote}
> STDOUT
> {quote}[2019-03-15 17:16:13,029] ERROR [ReplicaFetcher replicaId=2, 
> leaderId=1, fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-15 17:16:13,231] ERROR [KafkaApi-0] Error while responding to offset 
> request (kafka.server.KafkaApis:76)
> org.apache.kafka.common.errors.ReplicaNotAvailableException: Partition 
> topic-0 is not available{quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8082) Flaky Test ProducerFailureHandlingTest#testNotEnoughReplicasAfterBrokerShutdown

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8082:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> ProducerFailureHandlingTest#testNotEnoughReplicasAfterBrokerShutdown
> ---
>
> Key: KAFKA-8082
> URL: https://issues.apache.org/jira/browse/KAFKA-8082
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/61/testReport/junit/kafka.api/ProducerFailureHandlingTest/testNotEnoughReplicasAfterBrokerShutdown/]
> {quote}java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.NotEnoughReplicasAfterAppendException: 
> Messages are written to the log, but to fewer in-sync replicas than required. 
> at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:98)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:67)
>  at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
>  at 
> kafka.api.ProducerFailureHandlingTest.testNotEnoughReplicasAfterBrokerShutdown(ProducerFailureHandlingTest.scala:270){quote}
> STDOUT
> {quote}[2019-03-09 03:59:24,897] ERROR [ReplicaFetcher replicaId=0, 
> leaderId=1, fetcherId=0] Error for partition topic-1-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-09 03:59:28,028] ERROR 
> [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition 
> topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-09 03:59:42,046] ERROR 
> [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition 
> minisrtest-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-09 03:59:42,245] ERROR 
> [ReplicaManager broker=1] Error processing append operation on partition 
> minisrtest-0 (kafka.server.ReplicaManager:76) 
> org.apache.kafka.common.errors.NotEnoughReplicasException: The size of the 
> current ISR Set(1, 0) is insufficient to satisfy the min.isr requirement of 3 
> for partition minisrtest-0 [2019-03-09 04:00:01,212] ERROR [ReplicaFetcher 
> replicaId=1, leaderId=0, fetcherId=0] Error for partition topic-1-0 at offset 
> 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-09 04:00:02,214] ERROR 
> [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition 
> topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-09 04:00:03,216] ERROR 
> [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition 
> topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-09 04:00:23,144] ERROR 
> [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition 
> topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-09 04:00:24,146] ERROR 
> [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition 
> topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-09 04:00:25,148] ERROR 
> [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Error for partition 
> topic-1-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-09 04:00:44,607] ERROR 
> [ReplicaFetcher replicaId=1, leaderId=0, 

[jira] [Updated] (KAFKA-8015) Flaky Test SaslGssapiSslEndToEndAuthorizationTest#testProduceConsumeTopicAutoCreateTopicCreateAcl

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8015:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> SaslGssapiSslEndToEndAuthorizationTest#testProduceConsumeTopicAutoCreateTopicCreateAcl
> -
>
> Key: KAFKA-8015
> URL: https://issues.apache.org/jira/browse/KAFKA-8015
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3422/tests]
> {quote}java.lang.AssertionError: Partition [e2etopic,0] metadata not 
> propagated after 15000 ms
> at kafka.utils.TestUtils$.fail(TestUtils.scala:356)
> at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:766)
> at kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:855)
> at kafka.utils.TestUtils$$anonfun$createTopic$1.apply(TestUtils.scala:303)
> at kafka.utils.TestUtils$$anonfun$createTopic$1.apply(TestUtils.scala:302)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.Range.foreach(Range.scala:160)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> at kafka.utils.TestUtils$.createTopic(TestUtils.scala:302)
> at 
> kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:125)
> at 
> kafka.api.EndToEndAuthorizationTest.setUp(EndToEndAuthorizationTest.scala:189)
> at 
> kafka.api.SaslEndToEndAuthorizationTest.setUp(SaslEndToEndAuthorizationTest.scala:45){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8084) Flaky Test DescribeConsumerGroupTest#testDescribeMembersOfExistingGroupWithNoMembers

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8084:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> DescribeConsumerGroupTest#testDescribeMembersOfExistingGroupWithNoMembers
> 
>
> Key: KAFKA-8084
> URL: https://issues.apache.org/jira/browse/KAFKA-8084
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/62/testReport/junit/kafka.admin/DescribeConsumerGroupTest/testDescribeMembersOfExistingGroupWithNoMembers/]
> {quote}java.lang.AssertionError: Partition [__consumer_offsets,0] metadata 
> not propagated after 15000 ms at 
> kafka.utils.TestUtils$.fail(TestUtils.scala:381) at 
> kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at 
> kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at 
> scala.collection.immutable.Range.foreach(Range.scala:158) at 
> scala.collection.TraversableLike.map(TraversableLike.scala:237) at 
> scala.collection.TraversableLike.map$(TraversableLike.scala:230) at 
> scala.collection.AbstractTraversable.map(Traversable.scala:108) at 
> kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at 
> kafka.utils.TestUtils$.createOffsetsTopic(TestUtils.scala:375) at 
> kafka.admin.DescribeConsumerGroupTest.testDescribeMembersOfExistingGroupWithNoMembers(DescribeConsumerGroupTest.scala:283){quote}
> STDOUT
> {quote}TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST 
> CLIENT-ID foo 0 0 0 0 - - - TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG 
> CONSUMER-ID HOST CLIENT-ID foo 0 0 0 0 - - - COORDINATOR (ID) 
> ASSIGNMENT-STRATEGY STATE #MEMBERS localhost:45812 (0) Empty 0{quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8141) Flaky Test FetchRequestDownConversionConfigTest#testV1FetchWithDownConversionDisabled

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8141:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> FetchRequestDownConversionConfigTest#testV1FetchWithDownConversionDisabled
> -
>
> Key: KAFKA-8141
> URL: https://issues.apache.org/jira/browse/KAFKA-8141
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/80/testReport/junit/kafka.server/FetchRequestDownConversionConfigTest/testV1FetchWithDownConversionDisabled/]
> {quote}java.lang.AssertionError: Partition [__consumer_offsets,0] metadata 
> not propagated after 15000 ms at 
> kafka.utils.TestUtils$.fail(TestUtils.scala:381) at 
> kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at 
> kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at 
> scala.collection.immutable.Range.foreach(Range.scala:158) at 
> scala.collection.TraversableLike.map(TraversableLike.scala:237) at 
> scala.collection.TraversableLike.map$(TraversableLike.scala:230) at 
> scala.collection.AbstractTraversable.map(Traversable.scala:108) at 
> kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at 
> kafka.utils.TestUtils$.createOffsetsTopic(TestUtils.scala:375) at 
> kafka.api.IntegrationTestHarness.doSetup(IntegrationTestHarness.scala:95) at 
> kafka.api.IntegrationTestHarness.setUp(IntegrationTestHarness.scala:73){quote}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-7988) Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-7988:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize
> 
>
> Key: KAFKA-7988
> URL: https://issues.apache.org/jira/browse/KAFKA-7988
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Rajini Sivaram
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> To get stable nightly builds for `2.2` release, I create tickets for all 
> observed test failures.
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/30/]
> {quote}kafka.server.DynamicBrokerReconfigurationTest > testThreadPoolResize 
> FAILED java.lang.AssertionError: Invalid threads: expected 6, got 5: 
> List(ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-1, 
> ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-2, ReplicaFetcherThread-0-1) 
> at org.junit.Assert.fail(Assert.java:88) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> kafka.server.DynamicBrokerReconfigurationTest.verifyThreads(DynamicBrokerReconfigurationTest.scala:1260)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.maybeVerifyThreadPoolSize$1(DynamicBrokerReconfigurationTest.scala:531)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.resizeThreadPool$1(DynamicBrokerReconfigurationTest.scala:550)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.reducePoolSize$1(DynamicBrokerReconfigurationTest.scala:536)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.$anonfun$testThreadPoolResize$3(DynamicBrokerReconfigurationTest.scala:559)
>  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at 
> kafka.server.DynamicBrokerReconfigurationTest.verifyThreadPoolResize$1(DynamicBrokerReconfigurationTest.scala:558)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize(DynamicBrokerReconfigurationTest.scala:572){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8329) Flaky Test LogOffsetTest#testEmptyLogsGetOffsets

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8329:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test LogOffsetTest#testEmptyLogsGetOffsets
> 
>
> Key: KAFKA-8329
> URL: https://issues.apache.org/jira/browse/KAFKA-8329
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/4325/testReport/junit/kafka.server/LogOffsetTest/testEmptyLogsGetOffsets/]
> {quote}org.scalatest.exceptions.TestFailedException: Partition [kafka-,0] 
> metadata not propagated after 15000 ms at 
> org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530) at 
> org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529) 
> at 
> org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) 
> at org.scalatest.Assertions.fail(Assertions.scala:1091) at 
> org.scalatest.Assertions.fail$(Assertions.scala:1087) at 
> org.scalatest.Assertions$.fail(Assertions.scala:1389) at 
> kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:788) at 
> kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:877) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:320) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:319) at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at 
> scala.collection.immutable.Range.foreach(Range.scala:158) at 
> scala.collection.TraversableLike.map(TraversableLike.scala:237) at 
> scala.collection.TraversableLike.map$(TraversableLike.scala:230) at 
> scala.collection.AbstractTraversable.map(Traversable.scala:108) at 
> kafka.utils.TestUtils$.createTopic(TestUtils.scala:319) at 
> kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:125)
>  at 
> kafka.server.LogOffsetTest.testEmptyLogsGetOffsets(LogOffsetTest.scala:141){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8059) Flaky Test DynamicConnectionQuotaTest #testDynamicConnectionQuota

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8059:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test DynamicConnectionQuotaTest #testDynamicConnectionQuota
> -
>
> Key: KAFKA-8059
> URL: https://issues.apache.org/jira/browse/KAFKA-8059
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0, 2.1.1
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/46/tests]
> {quote}org.scalatest.junit.JUnitTestFailedError: Expected exception 
> java.io.IOException to be thrown, but no exception was thrown
> at 
> org.scalatest.junit.AssertionsForJUnit$class.newAssertionFailedException(AssertionsForJUnit.scala:100)
> at 
> org.scalatest.junit.JUnitSuite.newAssertionFailedException(JUnitSuite.scala:71)
> at org.scalatest.Assertions$class.intercept(Assertions.scala:822)
> at org.scalatest.junit.JUnitSuite.intercept(JUnitSuite.scala:71)
> at 
> kafka.network.DynamicConnectionQuotaTest.testDynamicConnectionQuota(DynamicConnectionQuotaTest.scala:82){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8303) Flaky Test SaslSslAdminClientIntegrationTest#testLogStartOffsetCheckpoint

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8303:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test SaslSslAdminClientIntegrationTest#testLogStartOffsetCheckpoint
> -
>
> Key: KAFKA-8303
> URL: https://issues.apache.org/jira/browse/KAFKA-8303
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, security, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/21274/testReport/junit/kafka.api/SaslSslAdminClientIntegrationTest/testLogStartOffsetCheckpoint/]
> {quote}java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.TimeoutException: Aborted due to timeout. at 
> org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
>  at 
> kafka.api.AdminClientIntegrationTest$$anonfun$testLogStartOffsetCheckpoint$2.apply$mcZ$sp(AdminClientIntegrationTest.scala:820)
>  at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:789) at 
> kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpoint(AdminClientIntegrationTest.scala:813){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8110) Flaky Test DescribeConsumerGroupTest#testDescribeMembersWithConsumersWithoutAssignedPartitions

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8110:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> DescribeConsumerGroupTest#testDescribeMembersWithConsumersWithoutAssignedPartitions
> --
>
> Key: KAFKA-8110
> URL: https://issues.apache.org/jira/browse/KAFKA-8110
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/67/testReport/junit/kafka.admin/DescribeConsumerGroupTest/testDescribeMembersWithConsumersWithoutAssignedPartitions/]
> {quote}java.lang.AssertionError: Partition [__consumer_offsets,0] metadata 
> not propagated after 15000 ms at 
> kafka.utils.TestUtils$.fail(TestUtils.scala:381) at 
> kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at 
> kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at 
> scala.collection.immutable.Range.foreach(Range.scala:158) at 
> scala.collection.TraversableLike.map(TraversableLike.scala:237) at 
> scala.collection.TraversableLike.map$(TraversableLike.scala:230) at 
> scala.collection.AbstractTraversable.map(Traversable.scala:108) at 
> kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at 
> kafka.utils.TestUtils$.createOffsetsTopic(TestUtils.scala:375) at 
> kafka.admin.DescribeConsumerGroupTest.testDescribeMembersWithConsumersWithoutAssignedPartitions(DescribeConsumerGroupTest.scala:372){quote}
> STDOUT
> {quote}[2019-03-14 20:01:52,347] WARN Ignoring unexpected runtime exception 
> (org.apache.zookeeper.server.NIOServerCnxnFactory:236) 
> java.nio.channels.CancelledKeyException at 
> sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73) at 
> sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87) at 
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:205)
>  at java.lang.Thread.run(Thread.java:748) TOPIC PARTITION CURRENT-OFFSET 
> LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID foo 0 0 0 0 - - - TOPIC 
> PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID foo 0 
> 0 0 0 - - - COORDINATOR (ID) ASSIGNMENT-STRATEGY STATE #MEMBERS 
> localhost:44669 (0){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8068) Flaky Test DescribeConsumerGroupTest#testDescribeMembersOfExistingGroup

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8068:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test DescribeConsumerGroupTest#testDescribeMembersOfExistingGroup
> ---
>
> Key: KAFKA-8068
> URL: https://issues.apache.org/jira/browse/KAFKA-8068
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/55/testReport/junit/kafka.admin/DescribeConsumerGroupTest/testDescribeMembersOfExistingGroup/]
> {quote}java.lang.AssertionError: Partition [__consumer_offsets,0] metadata 
> not propagated after 15000 ms at 
> kafka.utils.TestUtils$.fail(TestUtils.scala:381) at 
> kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at 
> kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:880) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:318) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:317) at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237) at 
> scala.collection.immutable.Range.foreach(Range.scala:158) at 
> scala.collection.TraversableLike.map(TraversableLike.scala:237) at 
> scala.collection.TraversableLike.map$(TraversableLike.scala:230) at 
> scala.collection.AbstractTraversable.map(Traversable.scala:108) at 
> kafka.utils.TestUtils$.createTopic(TestUtils.scala:317) at 
> kafka.utils.TestUtils$.createOffsetsTopic(TestUtils.scala:375) at 
> kafka.admin.DescribeConsumerGroupTest.testDescribeMembersOfExistingGroup(DescribeConsumerGroupTest.scala:154){quote}
>  
> STDOUT
> {quote}[2019-03-07 18:55:40,194] WARN Unable to read additional data from 
> client sessionid 0x1006fb9a65f0001, likely client has closed socket 
> (org.apache.zookeeper.server.NIOServerCnxn:376) TOPIC PARTITION 
> CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID foo 0 0 0 0 - - 
> - TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST 
> CLIENT-ID foo 0 0 0 0 - - - COORDINATOR (ID) ASSIGNMENT-STRATEGY STATE 
> #MEMBERS localhost:35213 (0) Empty 0 [2019-03-07 18:58:42,206] WARN Unable to 
> read additional data from client sessionid 0x1006fbc6962, likely client 
> has closed socket (org.apache.zookeeper.server.NIOServerCnxn:376)
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8064) Flaky Test DeleteTopicTest #testRecreateTopicAfterDeletion

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8064:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test DeleteTopicTest #testRecreateTopicAfterDeletion
> --
>
> Key: KAFKA-8064
> URL: https://issues.apache.org/jira/browse/KAFKA-8064
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/54/testReport/junit/kafka.admin/DeleteTopicTest/testRecreateTopicAfterDeletion/]
> {quote}java.lang.AssertionError: Admin path /admin/delete_topic/test path not 
> deleted even after a replica is restarted at 
> kafka.utils.TestUtils$.fail(TestUtils.scala:381) at 
> kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:791) at 
> kafka.utils.TestUtils$.verifyTopicDeletion(TestUtils.scala:1056) at 
> kafka.admin.DeleteTopicTest.testRecreateTopicAfterDeletion(DeleteTopicTest.scala:283){quote}
> STDOUT
> {quote}[2019-03-07 16:05:05,661] ERROR [ReplicaFetcher replicaId=1, 
> leaderId=0, fetcherId=0] Error for partition test-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-07 16:05:26,122] WARN Unable to 
> read additional data from client sessionid 0x1006f1dd1a60003, likely client 
> has closed socket (org.apache.zookeeper.server.NIOServerCnxn:376) [2019-03-07 
> 16:05:36,511] ERROR [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] 
> Error for partition test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-07 16:05:36,512] ERROR 
> [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Error for partition 
> test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-07 16:05:43,418] ERROR 
> [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Error for partition 
> test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-07 16:05:43,422] ERROR 
> [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition 
> test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-07 16:05:47,649] ERROR 
> [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition 
> test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-07 16:05:47,649] ERROR 
> [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Error for partition 
> test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-07 16:05:51,668] ERROR 
> [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for partition 
> test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. WARNING: If partitions are increased for 
> a topic that has a key, the partition logic or ordering of the messages will 
> be affected Adding partitions succeeded! [2019-03-07 16:05:56,135] WARN 
> Unable to read additional data from client sessionid 0x1006f1e2abb0006, 
> likely client has closed socket 
> (org.apache.zookeeper.server.NIOServerCnxn:376) [2019-03-07 16:06:00,286] 
> ERROR [ReplicaFetcher replicaId=1, leaderId=0, fetcherId=0] Error for 
> partition test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition. [2019-03-07 16:06:00,357] ERROR 
> [ReplicaFetcher replicaId=2, leaderId=0, fetcherId=0] Error for partition 
> test-0 at offset 0 (kafka.server.ReplicaFetcherThread:76) 
> 

[jira] [Updated] (KAFKA-8701) Flaky Test SaslSslAdminClientIntegrationTest#testDescribeConfigsForTopic

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8701:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test SaslSslAdminClientIntegrationTest#testDescribeConfigsForTopic
> 
>
> Key: KAFKA-8701
> URL: https://issues.apache.org/jira/browse/KAFKA-8701
> Project: Kafka
>  Issue Type: Bug
>  Components: unit tests
>Affects Versions: 2.4.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/477/testReport/junit/kafka.api/SaslSslAdminClientIntegrationTest/testDescribeConfigsForTopic/]
> {quote}org.scalatest.exceptions.TestFailedException: Partition [topic,0] 
> metadata not propagated after 15000 ms at 
> org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530) at 
> org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529) 
> at 
> org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) 
> at org.scalatest.Assertions.fail(Assertions.scala:1091) at 
> org.scalatest.Assertions.fail$(Assertions.scala:1087) at 
> org.scalatest.Assertions$.fail(Assertions.scala:1389) at 
> kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:822) at 
> kafka.utils.TestUtils$.waitUntilMetadataIsPropagated(TestUtils.scala:911) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3(TestUtils.scala:337) at 
> kafka.utils.TestUtils$.$anonfun$createTopic$3$adapted(TestUtils.scala:336) at 
> scala.collection.immutable.Range.map(Range.scala:59) at 
> kafka.utils.TestUtils$.createTopic(TestUtils.scala:336) at 
> kafka.integration.KafkaServerTestHarness.createTopic(KafkaServerTestHarness.scala:126)
>  at 
> kafka.api.AdminClientIntegrationTest.testDescribeConfigsForTopic(AdminClientIntegrationTest.scala:1008){quote}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-8133) Flaky Test MetadataRequestTest#testNoTopicsRequest

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8133:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test MetadataRequestTest#testNoTopicsRequest
> --
>
> Key: KAFKA-8133
> URL: https://issues.apache.org/jira/browse/KAFKA-8133
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.1.1
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.1-jdk8/detail/kafka-2.1-jdk8/151/tests]
> {quote}org.apache.kafka.common.errors.TopicExistsException: Topic 't1' 
> already exists.{quote}
> STDOUT:
> {quote}[2019-03-20 03:49:00,982] ERROR [ReplicaFetcher replicaId=1, 
> leaderId=0, fetcherId=0] Error for partition isr-after-broker-shutdown-0 at 
> offset 0 (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-20 03:49:00,982] ERROR [ReplicaFetcher replicaId=2, leaderId=0, 
> fetcherId=0] Error for partition isr-after-broker-shutdown-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-20 03:49:15,319] ERROR [ReplicaFetcher replicaId=1, leaderId=2, 
> fetcherId=0] Error for partition replicaDown-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-20 03:49:15,319] ERROR [ReplicaFetcher replicaId=0, leaderId=2, 
> fetcherId=0] Error for partition replicaDown-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-20 03:49:20,049] ERROR [ReplicaFetcher replicaId=0, leaderId=1, 
> fetcherId=0] Error for partition testAutoCreate_Topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-20 03:49:27,080] ERROR [ReplicaFetcher replicaId=0, leaderId=2, 
> fetcherId=0] Error for partition __consumer_offsets-1 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-20 03:49:27,080] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition __consumer_offsets-2 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-20 03:49:27,080] ERROR [ReplicaFetcher replicaId=2, leaderId=1, 
> fetcherId=0] Error for partition __consumer_offsets-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-20 03:49:27,538] ERROR [ReplicaFetcher replicaId=2, leaderId=1, 
> fetcherId=0] Error for partition notInternal-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-20 03:49:27,538] ERROR [ReplicaFetcher replicaId=0, leaderId=2, 
> fetcherId=0] Error for partition notInternal-1 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-20 03:49:28,863] WARN Unable to read additional data from client 
> sessionid 0x102fbd81b150003, likely client has closed socket 
> (org.apache.zookeeper.server.NIOServerCnxn:376)
> [2019-03-20 03:49:40,478] ERROR [ReplicaFetcher replicaId=2, leaderId=1, 
> fetcherId=0] Error for partition t1-2 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-20 03:49:40,921] ERROR [ReplicaFetcher replicaId=0, leaderId=1, 
> fetcherId=0] Error for partition t2-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This 

[jira] [Updated] (KAFKA-8075) Flaky Test GroupAuthorizerIntegrationTest#testTransactionalProducerTopicAuthorizationExceptionInCommit

2020-06-24 Thread Randall Hauch (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8075:
-
Fix Version/s: (was: 2.6.0)
   2.6.1
   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm 
changing the fix version to `2.6.1` and `2.7.0`. If this is incorrect, please 
respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion 
mailing list thread.

> Flaky Test 
> GroupAuthorizerIntegrationTest#testTransactionalProducerTopicAuthorizationExceptionInCommit
> --
>
> Key: KAFKA-8075
> URL: https://issues.apache.org/jira/browse/KAFKA-8075
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.7.0, 2.6.1
>
>
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/56/testReport/junit/kafka.api/GroupAuthorizerIntegrationTest/testTransactionalProducerTopicAuthorizationExceptionInCommit/]
> {quote}org.apache.kafka.common.errors.TimeoutException: Timeout expired while 
> initializing transactional state in 3000ms.{quote}
> STDOUT
> {quote}[2019-03-08 01:48:45,226] ERROR [Consumer clientId=consumer-99, 
> groupId=my-group] Offset commit failed on partition topic-0 at offset 5: Not 
> authorized to access topics: [Topic authorization failed.] 
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:812) 
> [2019-03-08 01:48:45,227] ERROR [Consumer clientId=consumer-99, 
> groupId=my-group] Not authorized to commit to topics [topic] 
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:850) 
> [2019-03-08 01:48:57,870] ERROR [KafkaApi-0] Error when handling request: 
> clientId=0, correlationId=0, api=UPDATE_METADATA, 
> body=\{controller_id=0,controller_epoch=1,broker_epoch=25,topic_states=[],live_brokers=[{id=0,end_points=[{port=43610,host=localhost,listener_name=PLAINTEXT,security_protocol_type=0}],rack=null}]}
>  (kafka.server.KafkaApis:76) 
> org.apache.kafka.common.errors.ClusterAuthorizationException: Request 
> Request(processor=0, connectionId=127.0.0.1:43610-127.0.0.1:44870-0, 
> session=Session(Group:testGroup,/127.0.0.1), 
> listenerName=ListenerName(PLAINTEXT), securityProtocol=PLAINTEXT, 
> buffer=null) is not authorized. [2019-03-08 01:49:14,858] ERROR [KafkaApi-0] 
> Error when handling request: clientId=0, correlationId=0, 
> api=UPDATE_METADATA, 
> body=\{controller_id=0,controller_epoch=1,broker_epoch=25,topic_states=[],live_brokers=[{id=0,end_points=[{port=44107,host=localhost,listener_name=PLAINTEXT,security_protocol_type=0}],rack=null}]}
>  (kafka.server.KafkaApis:76) 
> org.apache.kafka.common.errors.ClusterAuthorizationException: Request 
> Request(processor=0, connectionId=127.0.0.1:44107-127.0.0.1:38156-0, 
> session=Session(Group:testGroup,/127.0.0.1), 
> listenerName=ListenerName(PLAINTEXT), securityProtocol=PLAINTEXT, 
> buffer=null) is not authorized. [2019-03-08 01:49:21,984] ERROR [KafkaApi-0] 
> Error when handling request: clientId=0, correlationId=0, 
> api=UPDATE_METADATA, 
> body=\{controller_id=0,controller_epoch=1,broker_epoch=25,topic_states=[],live_brokers=[{id=0,end_points=[{port=39025,host=localhost,listener_name=PLAINTEXT,security_protocol_type=0}],rack=null}]}
>  (kafka.server.KafkaApis:76) 
> org.apache.kafka.common.errors.ClusterAuthorizationException: Request 
> Request(processor=0, connectionId=127.0.0.1:39025-127.0.0.1:41474-0, 
> session=Session(Group:testGroup,/127.0.0.1), 
> listenerName=ListenerName(PLAINTEXT), securityProtocol=PLAINTEXT, 
> buffer=null) is not authorized. [2019-03-08 01:49:39,438] ERROR [KafkaApi-0] 
> Error when handling request: clientId=0, correlationId=0, 
> api=UPDATE_METADATA, 
> body=\{controller_id=0,controller_epoch=1,broker_epoch=25,topic_states=[],live_brokers=[{id=0,end_points=[{port=44798,host=localhost,listener_name=PLAINTEXT,security_protocol_type=0}],rack=null}]}
>  (kafka.server.KafkaApis:76) 
> org.apache.kafka.common.errors.ClusterAuthorizationException: Request 
> Request(processor=0, connectionId=127.0.0.1:44798-127.0.0.1:58496-0, 
> session=Session(Group:testGroup,/127.0.0.1), 
> listenerName=ListenerName(PLAINTEXT), securityProtocol=PLAINTEXT, 
> buffer=null) is not authorized. Error: Consumer group 'my-group' does not 
> exist. [2019-03-08 01:49:55,502] WARN Ignoring unexpected runtime exception 
> (org.apache.zookeeper.server.NIOServerCnxnFactory:236) 
> java.nio.channels.CancelledKeyException at 
> sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73) at 
> sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87) at 
> 
