[GitHub] [kafka] rhauch commented on a change in pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


rhauch commented on a change in pull request #11132:
URL: https://github.com/apache/kafka/pull/11132#discussion_r677104692



##
File path: 
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java
##
@@ -269,7 +269,7 @@ public Response restartConnector(final @PathParam("connector") String connector,
         FutureCallback<Void> cb = new FutureCallback<>();
         herder.restartConnector(connector, cb);
         completeOrForwardRequest(cb, forwardingPath, "POST", headers, null, forward);
-        return Response.ok().build();
+        return Response.noContent().build();

Review comment:
   Awesome. Thanks, folks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] hachikuji commented on a change in pull request #11131: KAFKA-13137: KRaft Controller Metric MBean names incorrectly quoted

2021-07-26 Thread GitBox


hachikuji commented on a change in pull request #11131:
URL: https://github.com/apache/kafka/pull/11131#discussion_r677033977



##
File path: 
metadata/src/main/java/org/apache/kafka/controller/QuorumControllerMetrics.java
##
@@ -22,22 +22,21 @@
 import com.yammer.metrics.core.MetricName;
 import com.yammer.metrics.core.MetricsRegistry;
 
-
 public final class QuorumControllerMetrics implements ControllerMetrics {
-    private final static MetricName ACTIVE_CONTROLLER_COUNT = new MetricName(
-        "kafka.controller", "KafkaController", "ActiveControllerCount", null);
-    private final static MetricName EVENT_QUEUE_TIME_MS = new MetricName(
-        "kafka.controller", "ControllerEventManager", "EventQueueTimeMs", null);
-    private final static MetricName EVENT_QUEUE_PROCESSING_TIME_MS = new MetricName(
-        "kafka.controller", "ControllerEventManager", "EventQueueProcessingTimeMs", null);
-    private final static MetricName GLOBAL_TOPIC_COUNT = new MetricName(
-        "kafka.controller", "KafkaController", "GlobalTopicCount", null);
-    private final static MetricName GLOBAL_PARTITION_COUNT = new MetricName(
-        "kafka.controller", "KafkaController", "GlobalPartitionCount", null);
-    private final static MetricName OFFLINE_PARTITION_COUNT = new MetricName(
-        "kafka.controller", "KafkaController", "OfflinePartitionCount", null);
-    private final static MetricName PREFERRED_REPLICA_IMBALANCE_COUNT = new MetricName(
-        "kafka.controller", "KafkaController", "PreferredReplicaImbalanceCount", null);
+    private final static MetricName ACTIVE_CONTROLLER_COUNT = getMetricName(
+        "kafka.controller", "KafkaController", "ActiveControllerCount");

Review comment:
   I don't feel too strongly about it. I guess rewriting the name might 
prevent some accidental modification cases.
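   For reference, a minimal sketch of the kind of `getMetricName` helper the new code refers to. This is an assumption about its shape (supplying the MBean name explicitly so Yammer does not apply its default quoting), not necessarily the PR's exact implementation:
   ```java
   import com.yammer.metrics.core.MetricName;

   final class MetricNames {
       // Hypothetical sketch: build the MBean name by hand so the registry
       // registers "group:type=Type,name=Name" without extra quoting.
       static MetricName getMetricName(String group, String type, String name) {
           String mBeanName = group + ":type=" + type + ",name=" + name;
           return new MetricName(group, type, name, null, mBeanName);
       }
   }
   ```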




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] hachikuji commented on a change in pull request #11131: KAFKA-13137: KRaft Controller Metric MBean names incorrectly quoted

2021-07-26 Thread GitBox


hachikuji commented on a change in pull request #11131:
URL: https://github.com/apache/kafka/pull/11131#discussion_r677032066



##
File path: 
metadata/src/test/java/org/apache/kafka/controller/QuorumControllerMetricsTest.java
##
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kafka.controller;
+
+import com.yammer.metrics.core.MetricsRegistry;
+import org.apache.kafka.common.utils.Utils;
+import org.junit.jupiter.api.Test;
+
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+
+public class QuorumControllerMetricsTest {
+    @Test
+    public void testKafkaControllerMetricNames() {
+        String expectedGroup = "kafka.controller";
+        String expectedType = "KafkaController";
+        Set<String> expectedMetricNames = Utils.mkSet(
+            "ActiveControllerCount",
+            "GlobalTopicCount",
+            "GlobalPartitionCount",
+            "OfflinePartitionsCount",
+            "PreferredReplicaImbalanceCount");
+        Set<String> missingMetrics = getMissingMetricNames(expectedMetricNames, expectedGroup, expectedType);
+        assertEquals(Collections.emptySet(), missingMetrics, "Expected metrics did not exist");
+    }
+
+    @Test
+    public void testControllerEventManagerMetricNames() {
+        String expectedGroup = "kafka.controller";
+        String expectedType = "ControllerEventManager";
+        Set<String> expectedMetricNames = new HashSet<>(Arrays.asList(
+            "EventQueueTimeMs",
+            "EventQueueProcessingTimeMs"));
+        Set<String> missingMetrics = getMissingMetricNames(expectedMetricNames, expectedGroup, expectedType);
+        assertEquals(Collections.emptySet(), missingMetrics, "Expected metrics did not exist");
+    }
+
+    private static Set<String> getMissingMetricNames(
+            Set<String> expectedMetricNames, String expectedGroup, String expectedType) {
+        MetricsRegistry registry = new MetricsRegistry();
+        new QuorumControllerMetrics(registry); // populates the registry
+        Set<String> foundMetricNames = expectedMetricNames.stream().filter(expectedMetricName ->
+            registry.allMetrics().keySet().stream().anyMatch(metricName -> {
+                if (metricName.getGroup().equals(expectedGroup) && metricName.getType().equals(expectedType)
+                        && metricName.getScope() == null && metricName.getName().equals(expectedMetricName)) {
+                    // It has to exist AND the MBean name has to be correct;
+                    // fail right here if the MBean name doesn't match
+                    String expectedMBeanPrefix = expectedGroup + ":type=" + expectedType + ",name=";
+                    assertEquals(expectedMBeanPrefix + expectedMetricName, metricName.getMBeanName());
+                    return true; // the expected metric name exists and the associated MBean name matches
+                } else {
+                    return false; // this one didn't match

Review comment:
   Could we fail the test right here? It doesn't seem like there is much 
benefit to returning the missing metrics from the method. That would let us 
simplify this a little. Instead of this:
   ```java
   Set<String> missingMetrics = getMissingMetricNames(expectedMetricNames, expectedGroup, expectedType);
   assertEquals(Collections.emptySet(), missingMetrics, "Expected metrics did not exist");
   ```
   we could have:
   ```java
   assertRegisteredMetrics(expectedMetricNames, expectedGroup, expectedType);
   ```
   We could probably also drop `expectedGroup` since we only have 
`kafka.controller`.
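   For illustration, a hedged sketch of what such an `assertRegisteredMetrics` helper could look like; the names and structure are illustrative rather than final test code, and it assumes the same imports and registry setup as the test above (plus `com.yammer.metrics.core.MetricName`):
   ```java
   // Illustrative only: fail immediately for any expected metric that is missing
   // or whose MBean name is not "<group>:type=<type>,name=<name>".
   private static void assertRegisteredMetrics(MetricsRegistry registry,
                                               Set<String> expectedMetricNames,
                                               String expectedGroup,
                                               String expectedType) {
       for (String expectedName : expectedMetricNames) {
           MetricName found = registry.allMetrics().keySet().stream()
               .filter(m -> m.getGroup().equals(expectedGroup)
                   && m.getType().equals(expectedType)
                   && m.getScope() == null
                   && m.getName().equals(expectedName))
               .findFirst()
               .orElseThrow(() -> new AssertionError("Missing metric: " + expectedName));
           assertEquals(expectedGroup + ":type=" + expectedType + ",name=" + expectedName,
               found.getMBeanName());
       }
   }
   ```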




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] mjsax commented on pull request #10207: Fixing documentation source for issue KAFKA-12360

2021-07-26 Thread GitBox


mjsax commented on pull request #10207:
URL: https://github.com/apache/kafka/pull/10207#issuecomment-887104113


   Closing this PR due to inactivity. @nicodds Feel free to re-open if you see 
fit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] mjsax closed pull request #10207: Fixing documentation source for issue KAFKA-12360

2021-07-26 Thread GitBox


mjsax closed pull request #10207:
URL: https://github.com/apache/kafka/pull/10207


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] hachikuji commented on a change in pull request #11126: KAFKA-13132: Upgrading to topic IDs in LISR requests has gaps introduced in 3.0

2021-07-26 Thread GitBox


hachikuji commented on a change in pull request #11126:
URL: https://github.com/apache/kafka/pull/11126#discussion_r677015974



##
File path: core/src/main/scala/kafka/server/ReplicaManager.scala
##
@@ -1367,13 +1367,23 @@ class ReplicaManager(val config: KafkaConfig,
           val currentLeaderEpoch = partition.getLeaderEpoch
           val requestLeaderEpoch = partitionState.leaderEpoch
           val requestTopicId = topicIdFromRequest(topicPartition.topic)
+          val logTopicId = partition.topicId
+
+          // We propagate the partition state down if:
+          // 1. The leader epoch is higher than the current leader epoch of the partition
+          // 2. The leader epoch is same as the current leader epoch but a new topic id is being assigned. This is
+          //    needed to handle the case where a topic id is assigned for the first time after upgrade.
+          def propagatePartitionState(requestLeaderEpoch: Int, currentLeaderEpoch: Int): Boolean = {
+            requestLeaderEpoch > currentLeaderEpoch ||
+              (requestLeaderEpoch == currentLeaderEpoch && logTopicId.isEmpty && requestTopicId.isDefined)
+          }
 
-          if (!hasConsistentTopicId(requestTopicId, partition.topicId)) {
-            stateChangeLogger.error(s"Topic ID in memory: ${partition.topicId.get} does not" +
+          if (!hasConsistentTopicId(requestTopicId, logTopicId)) {
+            stateChangeLogger.error(s"Topic ID in memory: ${logTopicId.get} does not" +
               s" match the topic ID for partition $topicPartition received: " +
               s"${requestTopicId.get}.")
             responseMap.put(topicPartition, Errors.INCONSISTENT_TOPIC_ID)
-          } else if (requestLeaderEpoch > currentLeaderEpoch) {
+          } else if (propagatePartitionState(requestLeaderEpoch, currentLeaderEpoch)) {

Review comment:
   Nevertheless, the path through `makeLeader` seems unsafe without an 
epoch bump. Another case is `updateAssignmentAndIsr`, which blindly overrides 
the current ISR state. This makes sense when there is an epoch bump, but not 
otherwise, and could lead to exposing uncommitted data. We'd also need 
to fix the logging, since it would be confusing otherwise. So, all in all, I 
think it's best to take another route to updating the topicId (as the latest 
commit does).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] kpatelatwork commented on a change in pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


kpatelatwork commented on a change in pull request #11132:
URL: https://github.com/apache/kafka/pull/11132#discussion_r677015394



##
File path: 
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java
##
@@ -269,7 +269,7 @@ public Response restartConnector(final @PathParam("connector") String connector,
         FutureCallback<Void> cb = new FutureCallback<>();
         herder.restartConnector(connector, cb);
         completeOrForwardRequest(cb, forwardingPath, "POST", headers, null, forward);
-        return Response.ok().build();
+        return Response.noContent().build();

Review comment:
   @rhauch and @kkonstantine I confirm from a test against AK 2.8 that the old 
behavior was to return 204. We would need to fix the documentation and the KIP. 
Please see below:
   
   > ➜  kafka_2.13-2.8.0 curl -v -X POST 
http://localhost:8083/connectors/local-file-source/restart
   > *   Trying ::1...
   > * TCP_NODELAY set
   > * Connected to localhost (::1) port 8083 (#0)
   > > POST /connectors/local-file-source/restart HTTP/1.1
   > > Host: localhost:8083
   > > User-Agent: curl/7.64.1
   > > Accept: */*
   > > 
   > < HTTP/1.1 204 No Content
   > < Date: Mon, 26 Jul 2021 23:41:44 GMT
   > < Server: Jetty(9.4.39.v20210325)
   > < 
   > * Connection #0 to host localhost left intact
   > * Closing connection 0
   > ➜  kafka_2.13-2.8.0




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] kpatelatwork commented on a change in pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


kpatelatwork commented on a change in pull request #11132:
URL: https://github.com/apache/kafka/pull/11132#discussion_r677015394



##
File path: 
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java
##
@@ -269,7 +269,7 @@ public Response restartConnector(final @PathParam("connector") String connector,
         FutureCallback<Void> cb = new FutureCallback<>();
         herder.restartConnector(connector, cb);
         completeOrForwardRequest(cb, forwardingPath, "POST", headers, null, forward);
-        return Response.ok().build();
+        return Response.noContent().build();

Review comment:
   @rhauch I confirm from a test against AK 2.8 that the old behavior was to 
return 204. Please see below:
   
   > ➜  kafka_2.13-2.8.0 curl -v -X POST 
http://localhost:8083/connectors/local-file-source/restart
   > *   Trying ::1...
   > * TCP_NODELAY set
   > * Connected to localhost (::1) port 8083 (#0)
   > > POST /connectors/local-file-source/restart HTTP/1.1
   > > Host: localhost:8083
   > > User-Agent: curl/7.64.1
   > > Accept: */*
   > > 
   > < HTTP/1.1 204 No Content
   > < Date: Mon, 26 Jul 2021 23:41:44 GMT
   > < Server: Jetty(9.4.39.v20210325)
   > < 
   > * Connection #0 to host localhost left intact
   > * Closing connection 0
   > ➜  kafka_2.13-2.8.0




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] kpatelatwork commented on a change in pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


kpatelatwork commented on a change in pull request #11132:
URL: https://github.com/apache/kafka/pull/11132#discussion_r677015394



##
File path: 
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java
##
@@ -269,7 +269,7 @@ public Response restartConnector(final @PathParam("connector") String connector,
         FutureCallback<Void> cb = new FutureCallback<>();
         herder.restartConnector(connector, cb);
         completeOrForwardRequest(cb, forwardingPath, "POST", headers, null, forward);
-        return Response.ok().build();
+        return Response.noContent().build();

Review comment:
   @rhauch I confirm from a test against AK 2.8 that the old behavior was to 
return 204. Please see below:
   
   > ➜  kafka_2.13-2.8.0 curl -v -X POST 
http://localhost:8083/connectors/local-file-source/restart
   > *   Trying ::1...
   > * TCP_NODELAY set
   > * Connected to localhost (::1) port 8083 (#0)
   > > POST /connectors/local-file-source/restart HTTP/1.1
   > > Host: localhost:8083
   > > User-Agent: curl/7.64.1
   > > Accept: */*
   > > 
   > < HTTP/1.1 204 No Content
   > < Date: Mon, 26 Jul 2021 23:41:44 GMT
   > < Server: Jetty(9.4.39.v20210325)
   > < 
   > * Connection #0 to host localhost left intact
   > * Closing connection 0
   > ➜  kafka_2.13-2.8.0




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] mjsax commented on pull request #10824: KAFKA-12718: SessionWindows are closed too early

2021-07-26 Thread GitBox


mjsax commented on pull request #10824:
URL: https://github.com/apache/kafka/pull/10824#issuecomment-887099529


   It's very unlikely that there will be another 2.5.x bug-fix release... Thus, 
upgrading to a newer version would be the only way to get the fix.
   
   However, as a workaround, you can manually add the "gap" duration to the 
grace period to get the same effect. Thus, if your code is:
   ```
   SessionWindows.gap(Duration.ofSeconds(5L)).grace(Duration.ofSeconds(10L));
   ```
   you can update it to:
   ```
   SessionWindows.gap(Duration.ofSeconds(5L)).grace(Duration.ofSeconds(15L));
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-9897) Flaky Test StoreQueryIntegrationTest#shouldQuerySpecificActivePartitionStores

2021-07-26 Thread A. Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387652#comment-17387652
 ] 

A. Sophie Blee-Goldman commented on KAFKA-9897:
---

[~mjsax] that last failure is for a different test that was added recently; we 
merged a fix three days ago, which I think was just after that build was 
triggered. But if you see that specific test fail again on a recent build, can 
you reopen KAFKA-13128 instead?

> Flaky Test StoreQueryIntegrationTest#shouldQuerySpecificActivePartitionStores
> -
>
> Key: KAFKA-9897
> URL: https://issues.apache.org/jira/browse/KAFKA-9897
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.6.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
>
> [https://builds.apache.org/job/kafka-pr-jdk14-scala2.13/22/testReport/junit/org.apache.kafka.streams.integration/StoreQueryIntegrationTest/shouldQuerySpecificActivePartitionStores/]
> {quote}org.apache.kafka.streams.errors.InvalidStateStoreException: Cannot get 
> state store source-table because the stream thread is PARTITIONS_ASSIGNED, 
> not RUNNING at 
> org.apache.kafka.streams.state.internals.StreamThreadStateStoreProvider.stores(StreamThreadStateStoreProvider.java:85)
>  at 
> org.apache.kafka.streams.state.internals.QueryableStoreProvider.getStore(QueryableStoreProvider.java:61)
>  at org.apache.kafka.streams.KafkaStreams.store(KafkaStreams.java:1183) at 
> org.apache.kafka.streams.integration.StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores(StoreQueryIntegrationTest.java:178){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13125) Close KeyValueIterator implemented instance in internal tests (cont.)

2021-07-26 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-13125:

Fix Version/s: 3.1.0

> Close KeyValueIterator implemented instance in internal tests (cont.)
> -
>
> Key: KAFKA-13125
> URL: https://issues.apache.org/jira/browse/KAFKA-13125
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Major
> Fix For: 3.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13124) Close KeyValueIterator implemented instance in internal tests

2021-07-26 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-13124:

Fix Version/s: 3.1.0

> Close KeyValueIterator implemented instance in internal tests
> -
>
> Key: KAFKA-13124
> URL: https://issues.apache.org/jira/browse/KAFKA-13124
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Major
> Fix For: 3.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13122) resource leak due to not close KeyValueIterator implemented instances

2021-07-26 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-13122:

Fix Version/s: 3.1.0

> resource leak due to not close KeyValueIterator implemented instances
> -
>
> Key: KAFKA-13122
> URL: https://issues.apache.org/jira/browse/KAFKA-13122
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Major
> Fix For: 3.1.0
>
>
> Found that there are "many" KeyValueIterator-implementing instances that don't 
> explicitly get closed, which causes a resource leak.
> From the Javadoc in KeyValueIterator:
> "Users must call its close() method explicitly upon completeness to release 
> resources."
>  
> This issue mostly happens in tests, because we usually query a state store to get a 
> result iterator and then do the verification, but forget to close the iterator. This issue 
> also *appears in the example code in our developer guide docs*.
>  
> I'll use try-with-resources to fix them. To avoid creating a huge PR, I split 
> this bug into 3 sub-tasks.
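For illustration, a minimal sketch of the try-with-resources pattern described above; the store name, the value types, and the running KafkaStreams instance are assumptions made for the example:

```java
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public final class CloseIteratorExample {
    // "counts-store" is a placeholder; streams is assumed to be a running KafkaStreams instance.
    static void printAll(KafkaStreams streams) {
        ReadOnlyKeyValueStore<String, Long> store = streams.store(
            StoreQueryParameters.fromNameAndType("counts-store", QueryableStoreTypes.keyValueStore()));
        // KeyValueIterator is Closeable, so try-with-resources guarantees close() is called.
        try (KeyValueIterator<String, Long> iterator = store.all()) {
            while (iterator.hasNext()) {
                KeyValue<String, Long> entry = iterator.next();
                System.out.println(entry.key + " -> " + entry.value);
            }
        }
    }
}
```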



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13123) Close KeyValueIterator implemented instance in example codes and some tests

2021-07-26 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-13123:

Fix Version/s: 3.1.0

> Close KeyValueIterator implemented instance in example codes and some tests
> ---
>
> Key: KAFKA-13123
> URL: https://issues.apache.org/jira/browse/KAFKA-13123
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Major
> Fix For: 3.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] mjsax commented on pull request #11106: KAFKA-13124: close KeyValueIterator instance in internals tests (part 1)

2021-07-26 Thread GitBox


mjsax commented on pull request #11106:
URL: https://github.com/apache/kafka/pull/11106#issuecomment-887095417


   Thanks a lot @showuon! Merged to `trunk`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] mjsax commented on pull request #11107: KAFKA-13125: close KeyValueIterator instances in internals tests (part 2)

2021-07-26 Thread GitBox


mjsax commented on pull request #11107:
URL: https://github.com/apache/kafka/pull/11107#issuecomment-887095254


   Thanks a lot @showuon! Merged to `trunk`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] mjsax merged pull request #11107: KAFKA-13125: close KeyValueIterator instances in internals tests (part 2)

2021-07-26 Thread GitBox


mjsax merged pull request #11107:
URL: https://github.com/apache/kafka/pull/11107


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] mjsax merged pull request #11106: KAFKA-13124: close KeyValueIterator instance in internals tests (part 1)

2021-07-26 Thread GitBox


mjsax merged pull request #11106:
URL: https://github.com/apache/kafka/pull/11106


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-10251) Flaky Test kafka.api.TransactionsBounceTest.testWithGroupMetadata

2021-07-26 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387650#comment-17387650
 ] 

Matthias J. Sax commented on KAFKA-10251:
-

https://github.com/apache/kafka/pull/11106/checks?check_run_id=3146651514

> Flaky Test kafka.api.TransactionsBounceTest.testWithGroupMetadata
> -
>
> Key: KAFKA-10251
> URL: https://issues.apache.org/jira/browse/KAFKA-10251
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: A. Sophie Blee-Goldman
>Assignee: Luke Chen
>Priority: Major
>
> h3. Stacktrace
> org.scalatest.exceptions.TestFailedException: Consumed 0 records before 
> timeout instead of the expected 200 records at 
> org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530) at 
> org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529) 
> at 
> org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389) 
> at org.scalatest.Assertions.fail(Assertions.scala:1091) at 
> org.scalatest.Assertions.fail$(Assertions.scala:1087) at 
> org.scalatest.Assertions$.fail(Assertions.scala:1389) at 
> kafka.utils.TestUtils$.pollUntilAtLeastNumRecords(TestUtils.scala:842) at 
> kafka.api.TransactionsBounceTest.testWithGroupMetadata(TransactionsBounceTest.scala:109)
>  
>  
> The logs are pretty much just this on repeat:
> {code:java}
> [2020-07-08 23:41:04,034] ERROR Error when sending message to topic 
> output-topic with key: 9955, value: 9955 with error: 
> (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback:52) 
> org.apache.kafka.common.KafkaException: Failing batch since transaction was 
> aborted at 
> org.apache.kafka.clients.producer.internals.Sender.maybeSendAndPollTransactionalRequest(Sender.java:423)
>  at 
> org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:313) 
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:240) at 
> java.lang.Thread.run(Thread.java:748) [2020-07-08 23:41:04,034] ERROR Error 
> when sending message to topic output-topic with key: 9959, value: 9959 with 
> error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback:52) 
> org.apache.kafka.common.KafkaException: Failing batch since transaction was 
> aborted at 
> org.apache.kafka.clients.producer.internals.Sender.maybeSendAndPollTransactionalRequest(Sender.java:423)
>  at 
> org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:313) 
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:240) at 
> java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] mjsax commented on pull request #11105: KAFKA-13123: close KeyValueIterator instances in example code and tests

2021-07-26 Thread GitBox


mjsax commented on pull request #11105:
URL: https://github.com/apache/kafka/pull/11105#issuecomment-887094098


   Thanks a lot @showuon! Merged to `trunk`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] mjsax merged pull request #11105: KAFKA-13123: close KeyValueIterator instances in example code and tests

2021-07-26 Thread GitBox


mjsax merged pull request #11105:
URL: https://github.com/apache/kafka/pull/11105


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-9926) Flaky test PlaintextAdminIntegrationTest.testCreatePartitions

2021-07-26 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387649#comment-17387649
 ] 

Matthias J. Sax commented on KAFKA-9926:


[https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-11105/4/testReport/kafka.api/PlaintextAdminIntegrationTest/Build___JDK_11_and_Scala_2_13___testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords___2/]

Different test method: 

{{org.opentest4j.AssertionFailedError: Expected follower to catch up to log end 
offset 200 at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:39) 
at org.junit.jupiter.api.Assertions.fail(Assertions.java:117) at 
kafka.api.PlaintextAdminIntegrationTest.waitForFollowerLog$1(PlaintextAdminIntegrationTest.scala:698)
 at 
kafka.api.PlaintextAdminIntegrationTest.testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords(PlaintextAdminIntegrationTest.scala:728)}}

> Flaky test PlaintextAdminIntegrationTest.testCreatePartitions
> -
>
> Key: KAFKA-9926
> URL: https://issues.apache.org/jira/browse/KAFKA-9926
> Project: Kafka
>  Issue Type: Bug
>Reporter: Wang Ge
>Assignee: GeordieMai
>Priority: Major
>
> Flaky test: kafka.api.PlaintextAdminIntegrationTest.testCreatePartitions
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/6007/testReport/junit/kafka.api/PlaintextAdminIntegrationTest/testCreatePartitions/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9897) Flaky Test StoreQueryIntegrationTest#shouldQuerySpecificActivePartitionStores

2021-07-26 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387648#comment-17387648
 ] 

Matthias J. Sax commented on KAFKA-9897:


https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-11105/4/testReport/junit/org.apache.kafka.streams.integration/StoreQueryIntegrationTest/Build___JDK_16_and_Scala_2_13___shouldQueryStoresAfterAddingAndRemovingStreamThread_2/

> Flaky Test StoreQueryIntegrationTest#shouldQuerySpecificActivePartitionStores
> -
>
> Key: KAFKA-9897
> URL: https://issues.apache.org/jira/browse/KAFKA-9897
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.6.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
>
> [https://builds.apache.org/job/kafka-pr-jdk14-scala2.13/22/testReport/junit/org.apache.kafka.streams.integration/StoreQueryIntegrationTest/shouldQuerySpecificActivePartitionStores/]
> {quote}org.apache.kafka.streams.errors.InvalidStateStoreException: Cannot get 
> state store source-table because the stream thread is PARTITIONS_ASSIGNED, 
> not RUNNING at 
> org.apache.kafka.streams.state.internals.StreamThreadStateStoreProvider.stores(StreamThreadStateStoreProvider.java:85)
>  at 
> org.apache.kafka.streams.state.internals.QueryableStoreProvider.getStore(QueryableStoreProvider.java:61)
>  at org.apache.kafka.streams.KafkaStreams.store(KafkaStreams.java:1183) at 
> org.apache.kafka.streams.integration.StoreQueryIntegrationTest.shouldQuerySpecificActivePartitionStores(StoreQueryIntegrationTest.java:178){quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] kkonstantine commented on a change in pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


kkonstantine commented on a change in pull request #11132:
URL: https://github.com/apache/kafka/pull/11132#discussion_r676997082



##
File path: 
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java
##
@@ -269,7 +269,7 @@ public Response restartConnector(final @PathParam("connector") String connector,
         FutureCallback<Void> cb = new FutureCallback<>();
         herder.restartConnector(connector, cb);
         completeOrForwardRequest(cb, forwardingPath, "POST", headers, null, forward);
-        return Response.ok().build();
+        return Response.noContent().build();

Review comment:
   Thanks for the review @rhauch. I had the same questions and concerns. 
   
   Actually, the current behavior for both connector and task restarts is to 
return `HTTP/1.1 204 No Content`. So this fix should retain this behavior for 
the connector restart. 
   
   This seemed the most appropriate thing to do for a POST request whose 
response would have an empty body given the description in the RFC:
   https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5
   
   I also think that an addendum to the KIP would be a good idea once we finish 
the review here. 
   Wdyt?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] kkonstantine commented on a change in pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


kkonstantine commented on a change in pull request #11132:
URL: https://github.com/apache/kafka/pull/11132#discussion_r676997082



##
File path: 
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java
##
@@ -269,7 +269,7 @@ public Response restartConnector(final @PathParam("connector") String connector,
         FutureCallback<Void> cb = new FutureCallback<>();
         herder.restartConnector(connector, cb);
         completeOrForwardRequest(cb, forwardingPath, "POST", headers, null, forward);
-        return Response.ok().build();
+        return Response.noContent().build();

Review comment:
   Thanks for the review @rhauch. I had the same questions and concerns. 
   
   Actually, the current behavior for both connector and task restarts is to 
return `HTTP/1.1 204 No Content`. So this fix should retain this behavior for 
the connector restart. 
   
   This seemed the most appropriate thing to do for a POST request whose 
response would have an empty body, given the description in the RFC:
   https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.5
   
   I also think that an addendum to the KIP would be a good idea once we finish 
the review here. 
   Wdyt?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] ableegoldman merged pull request #11123: MINOR: factor state checks into descriptive methods and clarify javadocs

2021-07-26 Thread GitBox


ableegoldman merged pull request #11123:
URL: https://github.com/apache/kafka/pull/11123


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (KAFKA-10777) Add additional configuration to control MirrorMaker 2 internal topics naming convention

2021-07-26 Thread Omnia Ibrahim (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omnia Ibrahim updated KAFKA-10777:
--
Priority: Major  (was: Minor)

> Add additional configuration to control MirrorMaker 2 internal topics naming 
> convention
> ---
>
> Key: KAFKA-10777
> URL: https://issues.apache.org/jira/browse/KAFKA-10777
> Project: Kafka
>  Issue Type: Improvement
>  Components: mirrormaker
>Affects Versions: 2.6.0
>Reporter: Omnia Ibrahim
>Assignee: Omnia Ibrahim
>Priority: Major
>  Labels: mirror, mirror-maker, mirrormaker
> Fix For: 3.1.0
>
>
> MM2 internal topic names (heartbeats, checkpoints and offset-syncs) are 
> hardcoded in the source code, which makes it hard to run MM2 with any Kafka 
> cluster that has rules around topic naming conventions and doesn't allow 
> auto-creation of topics.
> In this case developers need to create these internal topics up-front 
> manually and make sure they follow the cluster's rules and guidance for 
> topic creation, so MM2 should have the flexibility to let you override the 
> names of the internal topics to follow the cluster's topic naming convention. 
>  
> A way to solve this is under discussion in 
> [KIP-690|https://cwiki.apache.org/confluence/display/KAFKA/KIP-690%3A+Add+additional+configuration+to+control+MirrorMaker+2+internal+topics+naming+convention]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10777) Add additional configuration to control MirrorMaker 2 internal topics naming convention

2021-07-26 Thread Omnia Ibrahim (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omnia Ibrahim updated KAFKA-10777:
--
Fix Version/s: 3.1.0

> Add additional configuration to control MirrorMaker 2 internal topics naming 
> convention
> ---
>
> Key: KAFKA-10777
> URL: https://issues.apache.org/jira/browse/KAFKA-10777
> Project: Kafka
>  Issue Type: Improvement
>  Components: mirrormaker
>Affects Versions: 2.6.0
>Reporter: Omnia Ibrahim
>Assignee: Omnia Ibrahim
>Priority: Minor
>  Labels: mirror, mirror-maker, mirrormaker
> Fix For: 3.1.0
>
>
> MM2 internal topic names (heartbeats, checkpoints and offset-syncs) are 
> hardcoded in the source code, which makes it hard to run MM2 with any Kafka 
> cluster that has rules around topic naming conventions and doesn't allow 
> auto-creation of topics.
> In this case developers need to create these internal topics up-front 
> manually and make sure they follow the cluster's rules and guidance for 
> topic creation, so MM2 should have the flexibility to let you override the 
> names of the internal topics to follow the cluster's topic naming convention. 
>  
> A way to solve this is under discussion in 
> [KIP-690|https://cwiki.apache.org/confluence/display/KAFKA/KIP-690%3A+Add+additional+configuration+to+control+MirrorMaker+2+internal+topics+naming+convention]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] hachikuji commented on a change in pull request #11131: KAFKA-13137: KRaft Controller Metric MBean names incorrectly quoted

2021-07-26 Thread GitBox


hachikuji commented on a change in pull request #11131:
URL: https://github.com/apache/kafka/pull/11131#discussion_r676993092



##
File path: 
metadata/src/main/java/org/apache/kafka/controller/QuorumControllerMetrics.java
##
@@ -34,7 +34,7 @@
 private final static MetricName GLOBAL_PARTITION_COUNT = getMetricName(
 "KafkaController", "GlobalPartitionCount");
 private final static MetricName OFFLINE_PARTITION_COUNT = getMetricName(
-"KafkaController", "OfflinePartitionCount");
+"KafkaController", "OfflinePartitionsCount");

Review comment:
   Good catch




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] rhauch commented on a change in pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


rhauch commented on a change in pull request #11132:
URL: https://github.com/apache/kafka/pull/11132#discussion_r676987061



##
File path: 
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java
##
@@ -269,7 +269,7 @@ public Response restartConnector(final @PathParam("connector") String connector,
         FutureCallback<Void> cb = new FutureCallback<>();
         herder.restartConnector(connector, cb);
         completeOrForwardRequest(cb, forwardingPath, "POST", headers, null, forward);
-        return Response.ok().build();
+        return Response.noContent().build();

Review comment:
   This returns `204 No Content`, right? But doesn't the old code (prior to 
this feature) return `200 OK` with no content? 
   
   
[KIP-745](https://cwiki.apache.org/confluence/display/KAFKA/KIP-745%3A+Connect+API+to+restart+connector+and+tasks)
 mentions using `202 Accepted` for the new behavior, but keeps the `200 OK` 
behavior when `includeTasks=false` and `onlyFailed=false`. 
   
   Are you suggesting that we can't return `200 OK` here if there is no 
response body and should therefore update the KIP, or can this line be left as 
is and we change the `RestClient` to properly handle a null content? The issue 
for this test failure ([KAFKA-111]()) mentions [this 
code](https://github.com/apache/kafka/blob/d89ea74918ce0a1f2309a09473c9500c52af3b10/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/RestClient.java#L134-L136)
 in the `RestClient`, which handles all 200 to <300 status codes the same way, 
so could we just improve that?
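   For illustration only -- a hedged sketch, not Kafka's actual `RestClient` code -- of one way a client could treat an empty body on a 2xx response as "no content" instead of trying to deserialize it:
   ```java
   import java.io.IOException;

   import com.fasterxml.jackson.core.type.TypeReference;
   import com.fasterxml.jackson.databind.ObjectMapper;

   public final class HttpResponseParsing {
       // Hypothetical helper: deserialize only when the successful response actually has a body.
       public static <T> T parseIfPresent(int statusCode, String body, TypeReference<T> type,
                                          ObjectMapper mapper) throws IOException {
           if (statusCode >= 200 && statusCode < 300) {
               if (body == null || body.isEmpty()) {
                   return null; // e.g. 204 No Content, or 200 OK with an empty body
               }
               return mapper.readValue(body, type);
           }
           throw new IllegalStateException("Unexpected HTTP status " + statusCode);
       }
   }
   ```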




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] kpatelatwork edited a comment on pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


kpatelatwork edited a comment on pull request #11132:
URL: https://github.com/apache/kafka/pull/11132#issuecomment-887070057


   @kkonstantine LGTM. Thanks a lot for the fixes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] kpatelatwork commented on pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


kpatelatwork commented on pull request #11132:
URL: https://github.com/apache/kafka/pull/11132#issuecomment-887070057


   @kkonstantine  the change looks ok to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] rondagostino commented on a change in pull request #11131: KAFKA-13137: KRaft Controller Metric MBean names incorrectly quoted

2021-07-26 Thread GitBox


rondagostino commented on a change in pull request #11131:
URL: https://github.com/apache/kafka/pull/11131#discussion_r676985024



##
File path: 
metadata/src/main/java/org/apache/kafka/controller/QuorumControllerMetrics.java
##
@@ -22,22 +22,21 @@
 import com.yammer.metrics.core.MetricName;
 import com.yammer.metrics.core.MetricsRegistry;
 
-
 public final class QuorumControllerMetrics implements ControllerMetrics {
-    private final static MetricName ACTIVE_CONTROLLER_COUNT = new MetricName(
-        "kafka.controller", "KafkaController", "ActiveControllerCount", null);
-    private final static MetricName EVENT_QUEUE_TIME_MS = new MetricName(
-        "kafka.controller", "ControllerEventManager", "EventQueueTimeMs", null);
-    private final static MetricName EVENT_QUEUE_PROCESSING_TIME_MS = new MetricName(
-        "kafka.controller", "ControllerEventManager", "EventQueueProcessingTimeMs", null);
-    private final static MetricName GLOBAL_TOPIC_COUNT = new MetricName(
-        "kafka.controller", "KafkaController", "GlobalTopicCount", null);
-    private final static MetricName GLOBAL_PARTITION_COUNT = new MetricName(
-        "kafka.controller", "KafkaController", "GlobalPartitionCount", null);
-    private final static MetricName OFFLINE_PARTITION_COUNT = new MetricName(
-        "kafka.controller", "KafkaController", "OfflinePartitionCount", null);
-    private final static MetricName PREFERRED_REPLICA_IMBALANCE_COUNT = new MetricName(
-        "kafka.controller", "KafkaController", "PreferredReplicaImbalanceCount", null);
+    private final static MetricName ACTIVE_CONTROLLER_COUNT = getMetricName(
+        "kafka.controller", "KafkaController", "ActiveControllerCount");

Review comment:
   I'm wondering if using constants in the test case would be 
counter-productive -- the test would still succeed if the constant were to 
change. It feels to me that hard-coding the expected string inside the test 
case, as is done now, is the safest thing. Having a constant for de-registration 
makes sense, but maybe we should leave that refactor for later if we aren't 
going to use the constant in multiple places right now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] rondagostino commented on a change in pull request #11131: KAFKA-13137: KRaft Controller Metric MBean names incorrectly quoted

2021-07-26 Thread GitBox


rondagostino commented on a change in pull request #11131:
URL: https://github.com/apache/kafka/pull/11131#discussion_r676982713



##
File path: 
metadata/src/test/java/org/apache/kafka/controller/QuorumControllerMetricsTest.java
##
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kafka.controller;
+
+import com.yammer.metrics.core.MetricsRegistry;
+import org.junit.jupiter.api.Test;
+
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+
+public class QuorumControllerMetricsTest {
+    @Test
+    public void testKafkaControllerMetricNames() {
+        String expectedGroup = "kafka.controller";
+        String expectedType = "KafkaController";
+        Set<String> expectedMetricNames = new HashSet<>(Arrays.asList(
+            "ActiveControllerCount",
+            "GlobalTopicCount",
+            "GlobalPartitionCount",
+            "OfflinePartitionCount",
+            "PreferredReplicaImbalanceCount"));
+        Set<String> missingMetrics = getMissingMetricNames(expectedMetricNames, expectedGroup, expectedType);
+        assertEquals(Collections.emptySet(), missingMetrics, "Expected metrics did not exist");
+    }
+
+    @Test
+    public void testControllerEventManagerMetricNames() {
+        String expectedGroup = "kafka.controller";
+        String expectedType = "ControllerEventManager";
+        Set<String> expectedMetricNames = new HashSet<>(Arrays.asList(
+            "EventQueueTimeMs",
+            "EventQueueProcessingTimeMs"));
+        Set<String> missingMetrics = getMissingMetricNames(expectedMetricNames, expectedGroup, expectedType);
+        assertEquals(Collections.emptySet(), missingMetrics, "Expected metrics did not exist");
+    }
+
+    private static Set<String> getMissingMetricNames(
+            Set<String> expectedMetricNames, String expectedGroup, String expectedType) {
+        MetricsRegistry registry = new MetricsRegistry();
+        new QuorumControllerMetrics(registry); // populates the registry
+        Set<String> foundMetricNames = expectedMetricNames.stream().filter(expectedMetricName ->
+            registry.allMetrics().keySet().stream().anyMatch(metricName -> {
+                if (metricName.getGroup().equals(expectedGroup) && metricName.getType().equals(expectedType)
+                        && metricName.getScope() == null && metricName.getName().equals(expectedMetricName)) {
+                    // It has to exist AND the MBean name has to be correct;
+                    // fail right here if the MBean name doesn't match
+                    String expectedMBeanPrefix = expectedGroup + ":type=" + expectedType + ",name=";
+                    assertEquals(expectedMBeanPrefix + expectedMetricName, metricName.getMBeanName());
+                    return true; // the expected metric name exists and the associated MBean name matches
+                } else {
+                    return false; // this one didn't match
+                }
+            })).collect(Collectors.toSet());
+        if (foundMetricNames.size() == expectedMetricNames.size()) {

Review comment:
   The size check is sufficient, but there's really no significant harm in 
following the other branch in all cases (just unnecessarily copying a small 
set), so I'll go with the cleaner single solution as you suggest.
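   For illustration, a minimal sketch (reusing the variables from the test above) of dropping the size check and always computing the difference:
   ```java
   // Always derive the missing names; when nothing is missing this is just a
   // cheap copy-and-remove of a small set.
   Set<String> missingMetricNames = new HashSet<>(expectedMetricNames);
   missingMetricNames.removeAll(foundMetricNames);
   return missingMetricNames;
   ```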




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] hachikuji commented on a change in pull request #11116: KAFKA-13114: Revert state and reregister raft listener

2021-07-26 Thread GitBox


hachikuji commented on a change in pull request #11116:
URL: https://github.com/apache/kafka/pull/11116#discussion_r676949613



##
File path: 
metadata/src/main/java/org/apache/kafka/controller/QuorumController.java
##
@@ -783,6 +798,11 @@ public void handleLeaderChange(LeaderAndEpoch newLeader) {
             });
         } else if (curClaimEpoch != -1) {
             appendControlEvent("handleRenounce[" + curClaimEpoch + "]", () -> {
+                if (this != metaLogListener) {

Review comment:
   Maybe we can add a little helper? The only difference in all of these 
checks is the type of event.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] ccding commented on a change in pull request #11080: fix NPE when record==null in append

2021-07-26 Thread GitBox


ccding commented on a change in pull request #11080:
URL: https://github.com/apache/kafka/pull/11080#discussion_r676954916



##
File path: 
clients/src/main/java/org/apache/kafka/common/record/DefaultRecord.java
##
@@ -293,7 +293,9 @@ public static DefaultRecord readFrom(ByteBuffer buffer,
                                      Long logAppendTime) {
         int sizeOfBodyInBytes = ByteUtils.readVarint(buffer);
         if (buffer.remaining() < sizeOfBodyInBytes)
-            return null;
+            throw new InvalidRecordException("Invalid record size: expected " + sizeOfBodyInBytes +
+                " bytes in record payload, but instead the buffer has only " + buffer.remaining() +
+                " remaining bytes.");

Review comment:
   @hachikuji do you have time to have a look at this?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] mjsax merged pull request #11075: MINOR: Move off deprecated APIs in StreamsResetter

2021-07-26 Thread GitBox


mjsax merged pull request #11075:
URL: https://github.com/apache/kafka/pull/11075


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] hachikuji commented on a change in pull request #11116: KAFKA-13114: Revert state and reregister raft listener

2021-07-26 Thread GitBox


hachikuji commented on a change in pull request #11116:
URL: https://github.com/apache/kafka/pull/11116#discussion_r676949613



##
File path: 
metadata/src/main/java/org/apache/kafka/controller/QuorumController.java
##
@@ -783,6 +798,11 @@ public void handleLeaderChange(LeaderAndEpoch newLeader) {
             });
         } else if (curClaimEpoch != -1) {
             appendControlEvent("handleRenounce[" + curClaimEpoch + "]", () -> {
+                if (this != metaLogListener) {

Review comment:
   Maybe we can add a little helper? The only difference in all of these checks 
is the type of event.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] mjsax commented on pull request #11117: MINOR: Remove older brokers from upgrade test

2021-07-26 Thread GitBox


mjsax commented on pull request #11117:
URL: https://github.com/apache/kafka/pull/11117#issuecomment-887029595


   Merged to `trunk` and cherry-picked to `3.0` branch. \cc @kkonstantine 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] mjsax merged pull request #11117: MINOR: Remove older brokers from upgrade test

2021-07-26 Thread GitBox


mjsax merged pull request #11117:
URL: https://github.com/apache/kafka/pull/11117


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] hachikuji commented on a change in pull request #11131: KAFKA-13137: KRaft Controller Metric MBean names incorrectly quoted

2021-07-26 Thread GitBox


hachikuji commented on a change in pull request #11131:
URL: https://github.com/apache/kafka/pull/11131#discussion_r676942838



##
File path: 
metadata/src/test/java/org/apache/kafka/controller/QuorumControllerMetricsTest.java
##
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kafka.controller;
+
+import com.yammer.metrics.core.MetricsRegistry;
+import org.junit.jupiter.api.Test;
+
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+
+public class QuorumControllerMetricsTest {
+@Test
+public void testKafkaControllerMetricNames() {
+String expectedGroup = "kafka.controller";
+String expectedType = "KafkaController";
+Set<String> expectedMetricNames = new HashSet<>(Arrays.asList(
+"ActiveControllerCount",
+"GlobalTopicCount",
+"GlobalPartitionCount",
+"OfflinePartitionCount",
+"PreferredReplicaImbalanceCount"));
+Set<String> missingMetrics = getMissingMetricNames(expectedMetricNames, expectedGroup, expectedType);
+assertEquals(Collections.emptySet(), missingMetrics, "Expected metrics did not exist");
+}
+
+@Test
+public void testControllerEventManagerMetricNames() {
+String expectedGroup = "kafka.controller";
+String expectedType = "ControllerEventManager";
+Set<String> expectedMetricNames = new HashSet<>(Arrays.asList(
+"EventQueueTimeMs",
+"EventQueueProcessingTimeMs"));
+Set<String> missingMetrics = getMissingMetricNames(expectedMetricNames, expectedGroup, expectedType);
+assertEquals(Collections.emptySet(), missingMetrics, "Expected metrics did not exist");
+}
+
+private static Set<String> getMissingMetricNames(
+Set<String> expectedMetricNames, String expectedGroup, String expectedType) {
+MetricsRegistry registry = new MetricsRegistry();
+new QuorumControllerMetrics(registry); // populates the registry
+Set<String> foundMetricNames = expectedMetricNames.stream().filter(expectedMetricName ->
+registry.allMetrics().keySet().stream().anyMatch(metricName -> {
+if (metricName.getGroup().equals(expectedGroup) && metricName.getType().equals(expectedType)
+&& metricName.getScope() == null && metricName.getName().equals(expectedMetricName)) {
+// It has to exist AND the MBean name has to be correct;
+// fail right here if the MBean name doesn't match
+String expectedMBeanPrefix = expectedGroup + ":type=" + expectedType + ",name=";
+assertEquals(expectedMBeanPrefix + expectedMetricName, metricName.getMBeanName());
+return true; // the expected metric name exists and the associated MBean name matches
+} else {
+return false; // this one didn't match
+}
+})).collect(Collectors.toSet());
+if (foundMetricNames.size() == expectedMetricNames.size()) {

Review comment:
   nit: is the size check sufficient? Any harm skipping this check and 
following the other branch in all cases?
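
   For reference, one way to skip the size check entirely is to compute the missing names directly; a sketch, not the PR's code:
   ```java
   // Always take the set difference; the "all found" and "some missing" cases then share one path.
   Set<String> missingMetricNames = new HashSet<>(expectedMetricNames);
   missingMetricNames.removeAll(foundMetricNames);
   return missingMetricNames;
   ```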

##
File path: 
metadata/src/main/java/org/apache/kafka/controller/QuorumControllerMetrics.java
##
@@ -22,22 +22,21 @@
 import com.yammer.metrics.core.MetricName;
 import com.yammer.metrics.core.MetricsRegistry;
 
-
 public final class QuorumControllerMetrics implements ControllerMetrics {
-private final static MetricName ACTIVE_CONTROLLER_COUNT = new MetricName(
-"kafka.controller", "KafkaController", "ActiveControllerCount", null);
-private final static MetricName EVENT_QUEUE_TIME_MS = new MetricName(
-"kafka.controller", "ControllerEventManager", "EventQueueTimeMs", 
null);
-private final static MetricName EVENT_QUEUE_PROCESSING_TIME_MS = new 
MetricName(
-"kafka.controller", "ControllerEventManager", 
"EventQueueProcessingTimeMs", null);
-private final static MetricName GLOBAL_TOPIC_COUNT = new MetricName(
-

[GitHub] [kafka] cmccabe edited a comment on pull request #10951: KAFKA-12841: NPE from the provided metadata in client callback in case of ApiException

2021-07-26 Thread GitBox


cmccabe edited a comment on pull request #10951:
URL: https://github.com/apache/kafka/pull/10951#issuecomment-887023640


   If I understand correctly, the PR makes up a fake partition so that 
`TopicPartition` can be non-null in the callback. I don't think that this is 
the right thing to do. Why not change the JavaDoc for the callback to indicate 
that `TopicPartition` can be null if the method fails before it gets assigned?
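
   Roughly along these lines, as a sketch of the suggested javadoc change (wording is illustrative only):
   ```java
   /**
    * @param metadata  the metadata for the record that was sent, or a metadata object whose
    *                  TopicPartition may be null if the send failed before a partition was
    *                  assigned to the record
    * @param exception the exception thrown while processing this record, or null if none
    */
   void onCompletion(RecordMetadata metadata, Exception exception);
   ```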


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] cmccabe commented on pull request #10951: KAFKA-12841: NPE from the provided metadata in client callback in case of ApiException

2021-07-26 Thread GitBox


cmccabe commented on pull request #10951:
URL: https://github.com/apache/kafka/pull/10951#issuecomment-887023640


   This doesn't seem correct. You're making up a fake partition just so that 
`TopicPartition` can be non-null in the callback. Why not change the JavaDoc 
for the callback to indicate that `TopicPartition` can be null if the method 
fails before it gets assigned?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] cmccabe commented on pull request #10980: KAFKA-12989: MockClient should respect the request matcher passed to prepareUnsupportedVersionResponse

2021-07-26 Thread GitBox


cmccabe commented on pull request #10980:
URL: https://github.com/apache/kafka/pull/10980#issuecomment-887020995


   Why is `NetworkClient#send` throwing an exception? It shouldn't be doing 
that, right? Can you explain more about the problem that this PR fixes?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Resolved] (KAFKA-13026) Idempotent producer (KAFKA-10619) follow-up testings

2021-07-26 Thread Rajini Sivaram (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-13026.

Fix Version/s: 3.0.0
 Reviewer: Rajini Sivaram
   Resolution: Fixed

> Idempotent producer (KAFKA-10619) follow-up testings
> 
>
> Key: KAFKA-13026
> URL: https://issues.apache.org/jira/browse/KAFKA-13026
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Cheng Tan
>Assignee: Cheng Tan
>Priority: Major
> Fix For: 3.0.0
>
>
> # Adjust config priority
>  # Adjust the JUnit tests so we get good coverage of the non-default behavior
>  # Similar to 2 for system tests



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-13026) Idempotent producer (KAFKA-10619) follow-up testings

2021-07-26 Thread Rajini Sivaram (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-13026:
--

Assignee: Cheng Tan

> Idempotent producer (KAFKA-10619) follow-up testings
> 
>
> Key: KAFKA-13026
> URL: https://issues.apache.org/jira/browse/KAFKA-13026
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Cheng Tan
>Assignee: Cheng Tan
>Priority: Major
>
> # Adjust config priority
>  # Adjust the JUnit tests so we get good coverage of the non-default behavior
>  # Similar to 2 for system tests



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] rajinisivaram merged pull request #11002: KAFKA-13026: Idempotent producer (KAFKA-10619) follow-up testings

2021-07-26 Thread GitBox


rajinisivaram merged pull request #11002:
URL: https://github.com/apache/kafka/pull/11002


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] jolshan commented on a change in pull request #11126: KAFKA-13132: Upgrading to topic IDs in LISR requests has gaps introduced in 3.0

2021-07-26 Thread GitBox


jolshan commented on a change in pull request #11126:
URL: https://github.com/apache/kafka/pull/11126#discussion_r676920282



##
File path: core/src/main/scala/kafka/server/ReplicaManager.scala
##
@@ -1367,13 +1367,23 @@ class ReplicaManager(val config: KafkaConfig,
   val currentLeaderEpoch = partition.getLeaderEpoch
   val requestLeaderEpoch = partitionState.leaderEpoch
   val requestTopicId = topicIdFromRequest(topicPartition.topic)
+  val logTopicId = partition.topicId
+
+  // We propagate the partition state down if:
+  // 1. The leader epoch is higher than the current leader epoch of the partition
+  // 2. The leader epoch is same as the current leader epoch but a new topic id is being assigned. This is
+  //needed to handle the case where a topic id is assigned for the first time after upgrade.
+  def propagatePartitionState(requestLeaderEpoch: Int, currentLeaderEpoch: Int): Boolean = {
+requestLeaderEpoch > currentLeaderEpoch ||
+  (requestLeaderEpoch == currentLeaderEpoch && logTopicId.isEmpty && requestTopicId.isDefined)
+  }
 
-  if (!hasConsistentTopicId(requestTopicId, partition.topicId)) {
-stateChangeLogger.error(s"Topic ID in memory: ${partition.topicId.get} does not" +
+  if (!hasConsistentTopicId(requestTopicId, logTopicId)) {
+stateChangeLogger.error(s"Topic ID in memory: ${logTopicId.get} does not" +
   s" match the topic ID for partition $topicPartition received: " +
   s"${requestTopicId.get}.")
 responseMap.put(topicPartition, Errors.INCONSISTENT_TOPIC_ID)
-  } else if (requestLeaderEpoch > currentLeaderEpoch) {
+  } else if (propagatePartitionState(requestLeaderEpoch, currentLeaderEpoch)) {

Review comment:
   I took a closer look at the code in makeLeader and it seems that we do 
not reassign the value for a given epoch.
   ```
   /**
    * Assigns the supplied Leader Epoch to the supplied Offset
    * Once the epoch is assigned it cannot be reassigned
    */
   def assign(epoch: Int, startOffset: Long): Unit = {
     val entry = EpochEntry(epoch, startOffset)
     if (assign(entry)) {
       debug(s"Appended new epoch entry $entry. Cache now contains ${epochs.size} entries.")
       flush()
     }
   }
   ```
   It is true that we may not initially assign a value if there is no epoch cache. But I'm not sure if that is a case we need to consider (ie, first request there is not a cache, second request there is)
   ```
   def maybeAssignEpochStartOffset(leaderEpoch: Int, startOffset: Long): Unit = {
     leaderEpochCache.foreach { cache =>
       cache.assign(leaderEpoch, startOffset)
     }
   }
   ```
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] kkonstantine commented on pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


kkonstantine commented on pull request #11132:
URL: https://github.com/apache/kafka/pull/11132#issuecomment-886989269


   cc @kpatelatwork @rhauch @ijuma 
   This looks like a blocker for 3.0


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] kkonstantine commented on pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


kkonstantine commented on pull request #11132:
URL: https://github.com/apache/kafka/pull/11132#issuecomment-886988649


   I'd like to check quickly if this can be covered in unit/integration tests 
as well. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] kkonstantine opened a new pull request #11132: KAFKA-13139: Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread GitBox


kkonstantine opened a new pull request #11132:
URL: https://github.com/apache/kafka/pull/11132


   Currently tested via the system test 
`connect_distributed_test.py::ConnectDistributedTest.test_restart_failed_connector`
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (KAFKA-13139) Empty response after requesting to restart a connector without the tasks results in NPE

2021-07-26 Thread Konstantine Karantasis (Jira)
Konstantine Karantasis created KAFKA-13139:
--

 Summary: Empty response after requesting to restart a connector 
without the tasks results in NPE
 Key: KAFKA-13139
 URL: https://issues.apache.org/jira/browse/KAFKA-13139
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 3.0.0
Reporter: Konstantine Karantasis
Assignee: Konstantine Karantasis
 Fix For: 3.0.0


After https://issues.apache.org/jira/browse/KAFKA-4793, a request to restart only 
the connector (without any tasks) returns OK with an empty body. 

As system test runs revealed, this causes an NPE in 
[https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/RestClient.java#L135]

We should return 204 (NO_CONTENT) instead. 

This is a regression from previous behavior, therefore the ticket is marked as 
a blocker candidate for 3.0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] rondagostino opened a new pull request #11131: KAFKA-13137: KRaft Controller Metric MBean names incorrectly quoted

2021-07-26 Thread GitBox


rondagostino opened a new pull request #11131:
URL: https://github.com/apache/kafka/pull/11131


   Controller metric names that are in common between the ZooKeeper-based and 
KRaft-based controller must remain the same, but they were not in the AK 2.8 
early access release of KRaft. For example, the non-KRaft MBean name 
`kafka.controller:type=KafkaController,name=OfflinePartitionsCount` incorrectly 
became 
`"kafka.controller":type="KafkaController",name="OfflinePartitionsCount"` (note 
the added quotes).  This patch fixes the issue and closes the test gap that 
allowed the divergence to occur.
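
   Roughly, the fix builds the MBean name explicitly instead of letting Yammer quote it. A sketch of such a helper (the PR's exact implementation may differ):
   ```java
   // Sketch: hand-construct the unquoted MBean name, e.g.
   // kafka.controller:type=KafkaController,name=OfflinePartitionsCount
   private static MetricName getMetricName(String group, String type, String name) {
       String mBeanName = group + ":type=" + type + ",name=" + name;
       return new MetricName(group, type, name, null, mBeanName);
   }
   ```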
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka] hachikuji commented on a change in pull request #11126: KAFKA-13132: Upgrading to topic IDs in LISR requests has gaps introduced in 3.0

2021-07-26 Thread GitBox


hachikuji commented on a change in pull request #11126:
URL: https://github.com/apache/kafka/pull/11126#discussion_r676863216



##
File path: core/src/main/scala/kafka/server/ReplicaManager.scala
##
@@ -1367,13 +1367,23 @@ class ReplicaManager(val config: KafkaConfig,
   val currentLeaderEpoch = partition.getLeaderEpoch
   val requestLeaderEpoch = partitionState.leaderEpoch
   val requestTopicId = topicIdFromRequest(topicPartition.topic)
+  val logTopicId = partition.topicId
+
+  // We propagate the partition state down if:
+  // 1. The leader epoch is higher than the current leader epoch of the partition
+  // 2. The leader epoch is same as the current leader epoch but a new topic id is being assigned. This is
+  //needed to handle the case where a topic id is assigned for the first time after upgrade.
+  def propagatePartitionState(requestLeaderEpoch: Int, currentLeaderEpoch: Int): Boolean = {
+requestLeaderEpoch > currentLeaderEpoch ||
+  (requestLeaderEpoch == currentLeaderEpoch && logTopicId.isEmpty && requestTopicId.isDefined)
+  }
 
-  if (!hasConsistentTopicId(requestTopicId, partition.topicId)) {
-stateChangeLogger.error(s"Topic ID in memory: ${partition.topicId.get} does not" +
+  if (!hasConsistentTopicId(requestTopicId, logTopicId)) {
+stateChangeLogger.error(s"Topic ID in memory: ${logTopicId.get} does not" +
   s" match the topic ID for partition $topicPartition received: " +
   s"${requestTopicId.get}.")
 responseMap.put(topicPartition, Errors.INCONSISTENT_TOPIC_ID)
-  } else if (requestLeaderEpoch > currentLeaderEpoch) {
+  } else if (propagatePartitionState(requestLeaderEpoch, currentLeaderEpoch)) {

Review comment:
   I'm a bit concerned about the transition through `Partition.makeLeader` 
when there is no epoch bump. For example, the logic to update the epoch cache 
assumes that the epoch has indeed been bumped. If we end up overwriting the 
start offset of the current epoch, that could affect the truncation logic.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-12980) Allow consumers to return from poll when position advances due to aborted transactions

2021-07-26 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387547#comment-17387547
 ] 

Guozhang Wang commented on KAFKA-12980:
---

Thanks [~ChrisEgerton], this makes sense to me too. And I also feel we do not 
really need a KIP to address this change.

> Allow consumers to return from poll when position advances due to aborted 
> transactions
> --
>
> Key: KAFKA-12980
> URL: https://issues.apache.org/jira/browse/KAFKA-12980
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Chris Egerton
>Assignee: Chris Egerton
>Priority: Major
>
> When {{Consumer::poll}} is invoked on a topic with an open transaction, and 
> then that transaction is aborted, {{poll}} does not return unless there are 
> other records available in that topic after the aborted transaction.
> Instead, {{poll}} could return in this case, even when no records are 
> available.
> This facilitates reads to the end of a topic where the end offsets of a topic 
> are listed and then a consumer for that topic is polled until its 
> [position|https://kafka.apache.org/28/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#position(org.apache.kafka.common.TopicPartition)]
>  is at or beyond each of those offsets (for example, [Connect does 
> this|https://github.com/apache/kafka/blob/fce771579c3e20f20949c4c7e0a5e3a16c57c7f0/connect/runtime/src/main/java/org/apache/kafka/connect/util/KafkaBasedLog.java#L322-L345]
>  when reading to the end of any of its internal topics).
> We could update the existing language in the docs for {{Consumer::poll}} from
> {quote}This method returns immediately if there are records available.
> {quote}
> to
> {quote}This method returns immediately if there are records available or if 
> the position advances past control records.
> {quote}
>  
> A workaround for existing users who would like to see this is to use short 
> poll intervals and manually check the consumer's position in between each 
> poll, but this is fairly tedious and may lead to excess CPU and network 
> utilization depending on the latency requirements for knowing when the end of 
> the topic has been reached.
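
For illustration, the read-to-end pattern described above looks roughly like this (a sketch against the public consumer API, not the Connect code linked above):
{code:java}
// Assumes an already-assigned KafkaConsumer<byte[], byte[]> named consumer.
Map<TopicPartition, Long> endOffsets = consumer.endOffsets(consumer.assignment());
while (endOffsets.entrySet().stream()
        .anyMatch(e -> consumer.position(e.getKey()) < e.getValue())) {
    // Today this poll returns promptly only when real records arrive; the proposal is to
    // also return when the position advances past aborted-transaction control records.
    consumer.poll(Duration.ofMillis(100));
}
{code}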



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] agavra commented on a change in pull request #11120: Add support for infinite endpoints for range queries

2021-07-26 Thread GitBox


agavra commented on a change in pull request #11120:
URL: https://github.com/apache/kafka/pull/11120#discussion_r676851943



##
File path: 
streams/src/main/java/org/apache/kafka/streams/state/internals/NamedCache.java
##
@@ -281,11 +280,27 @@ public boolean isEmpty() {
 }
 
 synchronized Iterator<Bytes> keyRange(final Bytes from, final Bytes to, final boolean toInclusive) {
-return keySetIterator(cache.navigableKeySet().subSet(from, true, to, toInclusive), true);
+final Set<Bytes> rangeSet = computeSubSet(from, to, toInclusive);
+return keySetIterator(rangeSet, true);
 }
 
 synchronized Iterator<Bytes> reverseKeyRange(final Bytes from, final Bytes to) {
-return keySetIterator(cache.navigableKeySet().subSet(from, true, to, true), false);
+final Set<Bytes> rangeSet = computeSubSet(from, to, true);
+return keySetIterator(rangeSet, false);
+}
+
+private Set<Bytes> computeSubSet(final Bytes from, final Bytes to, final boolean toInclusive) {
+if (from == null && to == null) {
+return cache.navigableKeySet();
+} else if (from == null) {
+return cache.headMap(to, toInclusive).keySet();
+} else if (to == null) {
+return cache.tailMap(from, true).keySet();
+} else if (from.compareTo(to) > 0) {
+return new TreeSet<>();

Review comment:
   instead of creating a new set, thoughts on just returning an empty 
collection? (`Collections.emptyNavigableSet()`)
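
   i.e. something like this (sketch):
   ```java
   } else if (from.compareTo(to) > 0) {
       // Avoids allocating a fresh TreeSet for every empty range.
       return Collections.emptyNavigableSet();
   }
   ```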

##
File path: 
streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBRangeIterator.java
##
@@ -63,17 +65,20 @@
 final KeyValue<Bytes, byte[]> next = super.makeNext();
 if (next == null) {
 return allDone();
+} else if (rawLastKey == null) {
+return next;
+
 } else {
 if (forward) {
-if (comparator.compare(next.key.get(), rawLastKey) < 0) {
+if (rawLastKey != null && comparator.compare(next.key.get(), rawLastKey) < 0) {

Review comment:
   how can `rawLastKey` be null here (c.f. branch above)?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-13138) FileConfigProvider#get should preserve IOException on failure

2021-07-26 Thread Colin McCabe (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387529#comment-17387529
 ] 

Colin McCabe commented on KAFKA-13138:
--

https://github.com/apache/kafka/pull/11130/files

> FileConfigProvider#get should preserve IOException on failure
> -
>
> Key: KAFKA-13138
> URL: https://issues.apache.org/jira/browse/KAFKA-13138
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin McCabe
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] cmccabe opened a new pull request #11130: MINOR: FileConfigProvider#get should keep failure exception

2021-07-26 Thread GitBox


cmccabe opened a new pull request #11130:
URL: https://github.com/apache/kafka/pull/11130


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-13120) Flesh out `streams_static_membership_test` to be more robust

2021-07-26 Thread A. Sophie Blee-Goldman (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387506#comment-17387506
 ] 

A. Sophie Blee-Goldman commented on KAFKA-13120:


Thanks [~Reggiehsu111]! I assigned the ticket to you and added you as a 
contributor so you can self-assign tickets from now on. Let me or [~lct45] know 
if you have any questions

> Flesh out `streams_static_membership_test` to be more robust
> 
>
> Key: KAFKA-13120
> URL: https://issues.apache.org/jira/browse/KAFKA-13120
> Project: Kafka
>  Issue Type: Task
>  Components: streams, system tests
>Reporter: Leah Thomas
>Assignee: Reggie Hsu
>Priority: Minor
>  Labels: newbie++
>
> When fixing the `streams_static_membership_test.py` we noticed that the test 
> is pretty bare bones, it creates a streams application but doesn't do much 
> with the streams application, eg has no stateful processing. We should flesh 
> this out a bit to be more realistic and potentially consider testing with EOS 
> as well. The full java test is in `StaticMembershipTestClient`



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-13138) FileConfigProvider#get should preserve IOException on failure

2021-07-26 Thread Colin McCabe (Jira)
Colin McCabe created KAFKA-13138:


 Summary: FileConfigProvider#get should preserve IOException on 
failure
 Key: KAFKA-13138
 URL: https://issues.apache.org/jira/browse/KAFKA-13138
 Project: Kafka
  Issue Type: Bug
Reporter: Colin McCabe






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-13120) Flesh out `streams_static_membership_test` to be more robust

2021-07-26 Thread A. Sophie Blee-Goldman (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

A. Sophie Blee-Goldman reassigned KAFKA-13120:
--

Assignee: Reggie Hsu

> Flesh out `streams_static_membership_test` to be more robust
> 
>
> Key: KAFKA-13120
> URL: https://issues.apache.org/jira/browse/KAFKA-13120
> Project: Kafka
>  Issue Type: Task
>  Components: streams, system tests
>Reporter: Leah Thomas
>Assignee: Reggie Hsu
>Priority: Minor
>  Labels: newbie++
>
> When fixing the `streams_static_membership_test.py` we noticed that the test 
> is pretty bare bones, it creates a streams application but doesn't do much 
> with the streams application, eg has no stateful processing. We should flesh 
> this out a bit to be more realistic and potentially consider testing with EOS 
> as well. The full java test is in `StaticMembershipTestClient`



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-13137) KRaft Controller Metric MBean names are incorrectly quoted

2021-07-26 Thread Ron Dagostino (Jira)
Ron Dagostino created KAFKA-13137:
-

 Summary: KRaft Controller Metric MBean names are incorrectly quoted
 Key: KAFKA-13137
 URL: https://issues.apache.org/jira/browse/KAFKA-13137
 Project: Kafka
  Issue Type: Bug
  Components: controller
Affects Versions: 2.8.0, 3.0.0
Reporter: Ron Dagostino
Assignee: Ron Dagostino
 Fix For: 3.0.0


QuorumControllerMetrics is letting com.yammer.metrics.MetricName create the 
MBean names for all of the controller metrics, and that adds quotes.  We have 
typically used KafkaMetricsGroup to explicitly create the MBean name, and we do 
not add quotes there.  The controller metric names that are in common between 
the old and new controller must remain the same, but they are not.  For 
example, this non-KRaft MBean name:

kafka.controller:type=KafkaController,name=OfflinePartitionsCount

has morphed into this when using KRaft:

"kafka.controller":type="KafkaController",name="OfflinePartitionsCount"





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-13132) Upgrading to topic IDs in LISR requests has gaps introduced in 3.0

2021-07-26 Thread Konstantine Karantasis (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387456#comment-17387456
 ] 

Konstantine Karantasis commented on KAFKA-13132:


Marked as Blocker for now targeting 3.0, as it seems to be a regression. 

> Upgrading to topic IDs in LISR requests has gaps introduced in 3.0
> --
>
> Key: KAFKA-13132
> URL: https://issues.apache.org/jira/browse/KAFKA-13132
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Blocker
> Fix For: 3.0.0
>
>
> With the change in 3.0 to how topic IDs are assigned to logs, a bug was 
> inadvertently introduced. Now, topic IDs will only be assigned on the load of 
> the log to a partition in LISR requests. This means we will only assign topic 
> IDs for newly created topics/partitions, on broker startup, or potentially 
> when a partition is reassigned.
>  
> In the case of upgrading from an IBP before 2.8, we may have a scenario where 
> we upgrade the controller to IBP 3.0 (or even 2.8) last. (Ie, the controller 
> is IBP < 2.8 and all other brokers are on the newest IBP) Upon the last 
> broker upgrading, we will elect a new controller but its LISR request will 
> not result in topic IDs being assigned to logs of existing topics. They will 
> only be assigned in the cases mentioned above.
> *Keep in mind, in this scenario, topic IDs will still be assigned in the 
> controller/ZK to all new and pre-existing topics and will show up in 
> metadata.*  This means we are not ensured the same guarantees we had in 2.8. 
> *It is just the LISR/partition.metadata part of the code that is affected.* 
>  
> The problem is two-fold
>  1. We ignore LISR requests when the partition leader epoch has not increased 
> (previously we assigned the ID before this check)
>  2. We only assign the topic ID when we are associating the log with the 
> partition in replicamanager for the first time. Though in the scenario 
> described above, we have logs associated with partitions that need to be 
> upgraded.
>  
> We should check if the LISR request is resulting in a topic ID addition 
> and add logic to logs already associated to partitions in replica manager.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13132) Upgrading to topic IDs in LISR requests has gaps introduced in 3.0

2021-07-26 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13132:
---
Priority: Blocker  (was: Major)

> Upgrading to topic IDs in LISR requests has gaps introduced in 3.0
> --
>
> Key: KAFKA-13132
> URL: https://issues.apache.org/jira/browse/KAFKA-13132
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Blocker
> Fix For: 3.0.0
>
>
> With the change in 3.0 to how topic IDs are assigned to logs, a bug was 
> inadvertently introduced. Now, topic IDs will only be assigned on the load of 
> the log to a partition in LISR requests. This means we will only assign topic 
> IDs for newly created topics/partitions, on broker startup, or potentially 
> when a partition is reassigned.
>  
> In the case of upgrading from an IBP before 2.8, we may have a scenario where 
> we upgrade the controller to IBP 3.0 (or even 2.8) last. (Ie, the controller 
> is IBP < 2.8 and all other brokers are on the newest IBP) Upon the last 
> broker upgrading, we will elect a new controller but its LISR request will 
> not result in topic IDs being assigned to logs of existing topics. They will 
> only be assigned in the cases mentioned above.
> *Keep in mind, in this scenario, topic IDs will still be assigned in the 
> controller/ZK to all new and pre-existing topics and will show up in 
> metadata.*  This means we are not ensured the same guarantees we had in 2.8. 
> *It is just the LISR/partition.metadata part of the code that is affected.* 
>  
> The problem is two-fold
>  1. We ignore LISR requests when the partition leader epoch has not increased 
> (previously we assigned the ID before this check)
>  2. We only assign the topic ID when we are associating the log with the 
> partition in replicamanager for the first time. Though in the scenario 
> described above, we have logs associated with partitions that need to be 
> upgraded.
>  
> We should check if the LISR request is resulting in a topic ID addition 
> and add logic to logs already associated to partitions in replica manager.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13132) Upgrading to topic IDs in LISR requests has gaps introduced in 3.0

2021-07-26 Thread Konstantine Karantasis (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantine Karantasis updated KAFKA-13132:
---
Fix Version/s: 3.0.0

> Upgrading to topic IDs in LISR requests has gaps introduced in 3.0
> --
>
> Key: KAFKA-13132
> URL: https://issues.apache.org/jira/browse/KAFKA-13132
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Major
> Fix For: 3.0.0
>
>
> With the change in 3.0 to how topic IDs are assigned to logs, a bug was 
> inadvertently introduced. Now, topic IDs will only be assigned on the load of 
> the log to a partition in LISR requests. This means we will only assign topic 
> IDs for newly created topics/partitions, on broker startup, or potentially 
> when a partition is reassigned.
>  
> In the case of upgrading from an IBP before 2.8, we may have a scenario where 
> we upgrade the controller to IBP 3.0 (or even 2.8) last. (Ie, the controller 
> is IBP < 2.8 and all other brokers are on the newest IBP) Upon the last 
> broker upgrading, we will elect a new controller but its LISR request will 
> not result in topic IDs being assigned to logs of existing topics. They will 
> only be assigned in the cases mentioned above.
> *Keep in mind, in this scenario, topic IDs will still be assigned in the 
> controller/ZK to all new and pre-existing topics and will show up in 
> metadata.*  This means we are not ensured the same guarantees we had in 2.8. 
> *It is just the LISR/partition.metadata part of the code that is affected.* 
>  
> The problem is two-fold
>  1. We ignore LISR requests when the partition leader epoch has not increased 
> (previously we assigned the ID before this check)
>  2. We only assign the topic ID when we are associating the log with the 
> partition in replicamanager for the first time. Though in the scenario 
> described above, we have logs associated with partitions that need to be 
> upgraded.
>  
> We should check if the LISR request is resulting in a topic ID addition 
> and add logic to logs already associated to partitions in replica manager.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KAFKA-12851) Flaky Test RaftEventSimulationTest.canMakeProgressIfMajorityIsReachable

2021-07-26 Thread Jose Armando Garcia Sancio (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387433#comment-17387433
 ] 

Jose Armando Garcia Sancio edited comment on KAFKA-12851 at 7/26/21, 4:05 PM:
--

The issue is with the test and not with the Raft implementation. At a high 
level the test runs with 5 voters until it reaches at least a high watermark of 
10. At this point it partitions the network so that 3 nodes cannot send requests 
to 2 nodes and vice versa. With the seed {{137014923570865933}} right before 
the partition the nodes have the following state:
{code:java}
Node(id=0, hw=14, logEndOffset=25)
Node(id=1, hw=10, logEndOffset=14)
Node(id=2, hw=10, logEndOffset=14)
Node(id=3, hw=10, logEndOffset=22)
Node(id=4, hw=10, logEndOffset=14)
Node(id=5, hw=10, logEndOffset=14)
Node(id=6, hw=6, logEndOffset=18){code}
Nodes 5 and 6 are observers and do not participate in quorum or hw values. 
Notice that two nodes have a log end offset greater than 20 (nodes 0 and 3).

The test now partitions the network so that nodes 0 and 1 can send requests to 
each other and nodes 2, 3, 4 can send requests to each other. That means that 
only node 1 needs to reach offset 20 before the election timeout so that the 
leader 0 can advance the high watermark.
{code:java}
Node(id=0, hw=22, logEndOffset=34)
Node(id=1, hw=18, logEndOffset=22)
Node(id=2, hw=14, logEndOffset=18)
Node(id=3, hw=10, logEndOffset=25)
Node(id=4, hw=14, logEndOffset=18)
Node(id=5, hw=18, logEndOffset=25)
Node(id=6, hw=18, logEndOffset=29){code}
 

I think that the best way to fix the test, after the partition, is to wait for 
the high watermark to reach a value much larger than the LEO before the 
partition. In the trace above it would be a value much greater than 25.


was (Author: jagsancio):
The issue is with the test and not with the Raft implementation. At a high 
level the test runs with 5 voters until it reaches at least a high watermark of 
10. At this point it partitions the network so that 3 nodes cannot send requests 
to 2 nodes and vice versa. With the seed {{137014923570865933}} right before 
the partition the nodes have the following state:
{code:java}
Node(id=0, hw=14, logEndOffset=25)
Node(id=1, hw=10, logEndOffset=14)
Node(id=2, hw=10, logEndOffset=14)
Node(id=3, hw=10, logEndOffset=22)
Node(id=4, hw=10, logEndOffset=14)
Node(id=5, hw=10, logEndOffset=14)
Node(id=6, hw=6, logEndOffset=18){code}
Nodes 5 and 6 are observers and do not participate in quorum or hw values. 
Notice that two nodes have a HW greater than 20 (nodes 0 and 3).

The test now partitions the network so that nodes 0 and 1 can send requests to 
each other and nodes 2, 3, 4 can send requests to each other. That means that 
only node 1 needs to reach offset 20 before the election timeout so that the 
leader 0 can advance the high watermark.
{code:java}
Node(id=0, hw=22, logEndOffset=34)
Node(id=1, hw=18, logEndOffset=22)
Node(id=2, hw=14, logEndOffset=18)
Node(id=3, hw=10, logEndOffset=25)
Node(id=4, hw=14, logEndOffset=18)
Node(id=5, hw=18, logEndOffset=25)
Node(id=6, hw=18, logEndOffset=29){code}
 

I think that the best way to fix the test, after the partition, is to wait for 
the high watermark to reach a value much larger than the LEO before the 
partition. In the trace above it would be a value much greater than 25.

> Flaky Test RaftEventSimulationTest.canMakeProgressIfMajorityIsReachable
> ---
>
> Key: KAFKA-12851
> URL: https://issues.apache.org/jira/browse/KAFKA-12851
> Project: Kafka
>  Issue Type: Bug
>  Components: core, kraft
>Reporter: A. Sophie Blee-Goldman
>Assignee: Jose Armando Garcia Sancio
>Priority: Major
>  Labels: kip-500
> Fix For: 3.0.0
>
> Attachments: Capture.PNG
>
>
> Failed twice on a [PR 
> build|https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-10755/6/testReport/]
> h3. Stacktrace
> org.opentest4j.AssertionFailedError: expected:  but was:  at 
> org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at 
> org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at 
> org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at 
> org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at 
> org.apache.kafka.raft.RaftEventSimulationTest.canMakeProgressIfMajorityIsReachable(RaftEventSimulationTest.java:263)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-12851) Flaky Test RaftEventSimulationTest.canMakeProgressIfMajorityIsReachable

2021-07-26 Thread Jose Armando Garcia Sancio (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387433#comment-17387433
 ] 

Jose Armando Garcia Sancio commented on KAFKA-12851:


The issue is with the test and not with the Raft implementation. At a high 
level the test runs with 5 voters until it reaches at least a high watermark of 
10. At this point it partitions the network so that 3 nodes cannot send requests 
to 2 nodes and vice versa. With the seed {{137014923570865933}} right before 
the partition the nodes have the following state:
{code:java}
Node(id=0, hw=14, logEndOffset=25)
Node(id=1, hw=10, logEndOffset=14)
Node(id=2, hw=10, logEndOffset=14)
Node(id=3, hw=10, logEndOffset=22)
Node(id=4, hw=10, logEndOffset=14)
Node(id=5, hw=10, logEndOffset=14)
Node(id=6, hw=6, logEndOffset=18){code}
Nodes 5 and 6 are observers and do not participate in quorum or hw values. 
Notice that two nodes have a HW greater than 20 (nodes 0 and 3).

The test now partitions the network so that nodes 0 and 1 can send requests to 
each other and nodes 2, 3, 4 can send requests to each other. That means that 
only node 1 needs to reach offset 20 before the election timeout so that the 
leader 0 can advance the high watermark.
{code:java}
Node(id=0, hw=22, logEndOffset=34)
Node(id=1, hw=18, logEndOffset=22)
Node(id=2, hw=14, logEndOffset=18)
Node(id=3, hw=10, logEndOffset=25)
Node(id=4, hw=14, logEndOffset=18)
Node(id=5, hw=18, logEndOffset=25)
Node(id=6, hw=18, logEndOffset=29){code}
 

I think that the best way to fix the test, after the partition, is to wait for 
the high watermark to reach a value much larger than the LEO before the 
partition. In the trace above it would be a value much greater than 25.
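
A rough sketch of that idea (the helper names below are hypothetical, not necessarily the simulation test's actual API):
{code:java}
// Hypothetical: remember the largest log end offset observed before the network is partitioned,
// then require the high watermark to move well beyond it afterwards.
long maxLogEndOffsetBeforePartition = cluster.maxLogEndOffset();                 // hypothetical helper
scheduler.runUntil(() -> cluster.anyReachedHighWatermark(maxLogEndOffsetBeforePartition + 10)); // hypothetical
{code}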

> Flaky Test RaftEventSimulationTest.canMakeProgressIfMajorityIsReachable
> ---
>
> Key: KAFKA-12851
> URL: https://issues.apache.org/jira/browse/KAFKA-12851
> Project: Kafka
>  Issue Type: Bug
>  Components: core, kraft
>Reporter: A. Sophie Blee-Goldman
>Assignee: Jose Armando Garcia Sancio
>Priority: Blocker
>  Labels: kip-500
> Fix For: 3.0.0
>
> Attachments: Capture.PNG
>
>
> Failed twice on a [PR 
> build|https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-10755/6/testReport/]
> h3. Stacktrace
> org.opentest4j.AssertionFailedError: expected:  but was:  at 
> org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at 
> org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at 
> org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at 
> org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at 
> org.apache.kafka.raft.RaftEventSimulationTest.canMakeProgressIfMajorityIsReachable(RaftEventSimulationTest.java:263)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12851) Flaky Test RaftEventSimulationTest.canMakeProgressIfMajorityIsReachable

2021-07-26 Thread Jose Armando Garcia Sancio (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Armando Garcia Sancio updated KAFKA-12851:
---
Priority: Major  (was: Blocker)

> Flaky Test RaftEventSimulationTest.canMakeProgressIfMajorityIsReachable
> ---
>
> Key: KAFKA-12851
> URL: https://issues.apache.org/jira/browse/KAFKA-12851
> Project: Kafka
>  Issue Type: Bug
>  Components: core, kraft
>Reporter: A. Sophie Blee-Goldman
>Assignee: Jose Armando Garcia Sancio
>Priority: Major
>  Labels: kip-500
> Fix For: 3.0.0
>
> Attachments: Capture.PNG
>
>
> Failed twice on a [PR 
> build|https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-10755/6/testReport/]
> h3. Stacktrace
> org.opentest4j.AssertionFailedError: expected:  but was:  at 
> org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at 
> org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at 
> org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:35) at 
> org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:162) at 
> org.apache.kafka.raft.RaftEventSimulationTest.canMakeProgressIfMajorityIsReachable(RaftEventSimulationTest.java:263)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] wcarlson5 commented on a change in pull request #11123: MINOR: factor state checks into descriptive methods and clarify javadocs

2021-07-26 Thread GitBox


wcarlson5 commented on a change in pull request #11123:
URL: https://github.com/apache/kafka/pull/11123#discussion_r676562303



##
File path: streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java
##
@@ -235,10 +238,26 @@
 this.validTransitions.addAll(Arrays.asList(validTransitions));
 }
 
+public boolean hasNotStarted() {
+return equals(CREATED);
+}
+
 public boolean isRunningOrRebalancing() {
 return equals(RUNNING) || equals(REBALANCING);
 }
 
+public boolean isShuttingDown() {
+return equals(PENDING_SHUTDOWN) || equals(PENDING_ERROR);
+}
+
+public boolean hasCompletedShutdown() {
+return equals(NOT_RUNNING) || equals(ERROR);
+}
+
+public boolean hasStartedOrFinishedShuttingDown() {

Review comment:
   Maybe have it be `isShuttingDown() || hasCompletedShutdown()` to be less brittle?
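
   e.g. a sketch of that composition (assuming the intent is to reuse the other predicates):
   ```java
   public boolean hasStartedOrFinishedShuttingDown() {
       // Reuse the existing predicates instead of enumerating the states again.
       return isShuttingDown() || hasCompletedShutdown();
   }
   ```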




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (KAFKA-13136) kafka-connect task.max : active task in consumer group is limited by the bigger topic to consume

2021-07-26 Thread raphael auv (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

raphael auv updated KAFKA-13136:

Issue Type: Bug  (was: Improvement)

> kafka-connect task.max : active task in consumer group is limited by the 
> bigger topic to consume
> 
>
> Key: KAFKA-13136
> URL: https://issues.apache.org/jira/browse/KAFKA-13136
> Project: Kafka
>  Issue Type: Bug
>Reporter: raphael auv
>Priority: Major
>
> In kafka-connect 2.7
> *The maximum number of active tasks for a sink connector is equal to the number 
> of partitions of the largest topic it consumes.*
> An active task is a task that has partitions assigned to it in the consumer group 
> of the sink connector.
> Example:
> With 2 topics that each have 10 partitions (20 partitions in total), the maximum 
> number of active tasks is 10 (if I set task.max to 12, there are 10 members of 
> the consumer group consuming partitions and 2 members in the consumer group that 
> do not have partitions to consume).
> If I add a third topic with 15 partitions to the connector config, then all 12 
> members of the consumer group are consuming partitions, and if I then set 
> task.max to 17, only 15 members are active in the consumer group.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-13136) kafka-connect task.max : active task in consumer group is limited by the bigger topic to consume

2021-07-26 Thread raphael auv (Jira)
raphael auv created KAFKA-13136:
---

 Summary: kafka-connect task.max : active task in consumer group is 
limited by the bigger topic to consume
 Key: KAFKA-13136
 URL: https://issues.apache.org/jira/browse/KAFKA-13136
 Project: Kafka
  Issue Type: Improvement
Reporter: raphael auv


In kafka-connect 2.7

*The maximum number of active tasks for a sink connector is equal to the number of 
partitions of the largest topic it consumes.*

An active task is a task that has partitions assigned to it in the consumer group of 
the sink connector.

Example:

With 2 topics that each have 10 partitions (20 partitions in total):

The maximum number of active tasks is 10 (if I set task.max to 12, there are 10 
members of the consumer group consuming partitions and 2 members in the consumer 
group that do not have partitions to consume).

If I add a third topic with 15 partitions to the connector config, then all 12 
members of the consumer group are consuming partitions, and if I then set task.max 
to 17, only 15 members are active in the consumer group.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka] patrickstuedi opened a new pull request #11129: Fix for flaky test in StoreQueryIntegrationTest

2021-07-26 Thread GitBox


patrickstuedi opened a new pull request #11129:
URL: https://github.com/apache/kafka/pull/11129


   This PR fixes a bug in 
StoreQueryIntegrationTest::shouldQueryOnlyActivePartitionStoresByDefault that 
causes the test to fail in the case of a client rebalancing. The changes in 
this PR make sure the test keeps re-trying after a rebalancing operation, 
instead of failing. 
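
   The retry shape is roughly the following (illustrative only; the test uses its own helpers, and the assertion name below is a hypothetical stand-in):
   ```java
   // Sketch: retry the store query/assertion while the group is rebalancing instead of failing fast.
   // (Assume the enclosing test method declares `throws Exception`.)
   final long deadline = System.currentTimeMillis() + 60_000L;
   while (true) {
       try {
           verifyActivePartitionStores();   // hypothetical stand-in for the assertion being retried
           break;
       } catch (final InvalidStateStoreException e) {  // thrown while stores migrate during a rebalance
           if (System.currentTimeMillis() > deadline) {
               throw e;
           }
           Thread.sleep(500L);
       }
   }
   ```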
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KAFKA-12635) Mirrormaker 2 offset sync is incorrect if the target partition is empty

2021-07-26 Thread Alexis Polyzos (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387184#comment-17387184
 ] 

Alexis Polyzos commented on KAFKA-12635:


Hey [~mimaison], any chance you took a look at this bug? Thank you :)

> Mirrormaker 2 offset sync is incorrect if the target partition is empty
> ---
>
> Key: KAFKA-12635
> URL: https://issues.apache.org/jira/browse/KAFKA-12635
> Project: Kafka
>  Issue Type: Bug
>  Components: mirrormaker
>Affects Versions: 2.7.0
>Reporter: Frank Yi
>Assignee: Ning Zhang
>Priority: Major
>
> This bug occurs when using Mirrormaker with "sync.group.offsets.enabled = 
> true".
> If a source partition is empty, but the source consumer group's offset for 
> that partition is non-zero, then Mirrormaker sets the target consumer group's 
> offset for that partition to the literal, not translated, offset of the 
> source consumer group. This state can be reached if the source consumer group 
> consumed some records that were now deleted (like by a retention policy), or 
> if Mirrormaker replication is set to start at "latest". This bug causes the 
> target consumer group's lag for that partition to be negative and breaks 
> offset sync for that partition until lag is positive.
> The correct behavior when the source partition is empty would be to set the 
> target offset to the translated offset, not literal offset, which in this 
> case would always be 0. 
> Original email thread on this issue: 
> https://lists.apache.org/thread.html/r7c54ee5f57227367b911d4abffa72781772d8dd3b72d75eb65ee19f7%40%3Cusers.kafka.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-2376) Add Kafka Connect metrics

2021-07-26 Thread rameshkrishnan muthusamy (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387168#comment-17387168
 ] 

rameshkrishnan muthusamy commented on KAFKA-2376:
-

[~rhauch] can you help clarify why the p95 or P99 metrics were not implemented, 
I can see only avg and max implemented in PR but the KIP has p95, p99 etc

> Add Kafka Connect metrics
> -
>
> Key: KAFKA-2376
> URL: https://issues.apache.org/jira/browse/KAFKA-2376
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.9.0.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Randall Hauch
>Priority: Blocker
>  Labels: needs-kip
> Fix For: 1.0.0
>
>
> Kafka Connect needs good metrics for monitoring since that will be the 
> primary insight into the health of connectors as they copy data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-13120) Flesh out `streams_static_membership_test` to be more robust

2021-07-26 Thread Reggie Hsu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387140#comment-17387140
 ] 

Reggie Hsu commented on KAFKA-13120:


I would like to work on this issue

> Flesh out `streams_static_membership_test` to be more robust
> 
>
> Key: KAFKA-13120
> URL: https://issues.apache.org/jira/browse/KAFKA-13120
> Project: Kafka
>  Issue Type: Task
>  Components: streams, system tests
>Reporter: Leah Thomas
>Priority: Minor
>  Labels: newbie++
>
> When fixing the `streams_static_membership_test.py` we noticed that the test 
> is pretty bare bones, it creates a streams application but doesn't do much 
> with the streams application, eg has no stateful processing. We should flesh 
> this out a bit to be more realistic and potentially consider testing with EOS 
> as well. The full java test is in `StaticMembershipTestClient`



--
This message was sent by Atlassian Jira
(v8.3.4#803005)