Re: [DISCUSS] KIP-876: Time based cluster metadata snapshots

2022-10-13 Thread Luke Chen
Hi José,

Thanks for the KIP.
Adding support to generate snapshots based on time makes sense to me.

The only thing I'd like to point out is the compatibility section.
Since this new config defaults to 1 hour, users who explicitly set
`metadata.log.max.record.bytes.between.snapshots` to a very large value to
avoid snapshot creation will start getting a snapshot every hour after
upgrading. I think this behavior change should be explicitly called out in
the compatibility section. WDYT?

Otherwise, LGTM.

Luke

On Fri, Oct 14, 2022 at 8:14 AM José Armando García Sancio
 wrote:

> Thanks for your feedback David Jacot, Colin McCabe and Niket Goel.
>
> I started the vote thread at
> https://lists.apache.org/thread/yzzhbvdqxg9shttgbzooc2f42l1cv2sj
>
> --
> -José
>


Re: [VOTE] KIP-876: Time based cluster metadata snapshots

2022-10-13 Thread David Jacot
+1 (binding)

Thanks for the KIP!

On Fri, Oct 14, 2022 at 5:47 AM deng ziming  wrote:

> Thanks for this KIP,
>
> +1 for this(binding).
>
> --
> Best,
> Ziming
>
> > On Oct 14, 2022, at 8:11 AM, José Armando García Sancio
>  wrote:
> >
> > Hello all,
> >
> > I would like to start voting for "KIP-876: Time based cluster metadata
> > snapshots."
> >
> > KIP: https://cwiki.apache.org/confluence/x/MY3GDQ
> > Discussion thread:
> > https://lists.apache.org/thread/ww67h9d4xvgw1f7jn4zxwydmt8x1mq72
> >
> > Thanks!
> > --
> > -José
>
>


Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.3 #103

2022-10-13 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-875: First-class offsets support in Kafka Connect

2022-10-13 Thread Ashwin
Thanks for the KIP, Chris - I think this is a useful feature.

Can you please elaborate on the following in the KIP -

1. What would the response of GET /connectors/{connector}/offsets look like
if the worker has both a global and a connector-specific offsets topic?

2. How can we pass reset options like shift-by, to-date-time, etc.
using a REST API like DELETE /connectors/{connector}/offsets?

3. Today, the PAUSE operation on a connector invokes its stop method - will
there be a change here to reduce confusion with the newly proposed STOPPED
state?

Thanks,
Ashwin

On Fri, Oct 14, 2022 at 2:22 AM Chris Egerton 
wrote:

> Hi all,
>
> I noticed a fairly large gap in the first version of this KIP that I
> published last Friday, which has to do with accommodating connectors
> that target different Kafka clusters than the one that the Kafka Connect
> cluster uses for its internal topics and source connectors with dedicated
> offsets topics. I've since updated the KIP to address this gap, which has
> substantially altered the design. Wanted to give a heads-up to anyone
> that's already started reviewing.
>
> Cheers,
>
> Chris
>
> On Fri, Oct 7, 2022 at 1:29 PM Chris Egerton  wrote:
>
> > Hi all,
> >
> > I'd like to begin discussion on a KIP to add offsets support to the Kafka
> > Connect REST API:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-875%3A+First-class+offsets+support+in+Kafka+Connect
> >
> > Cheers,
> >
> > Chris
> >
>


Re: [VOTE] KIP-876: Time based cluster metadata snapshots

2022-10-13 Thread deng ziming
Thanks for this KIP,

+1 for this(binding).

--
Best,
Ziming

> On Oct 14, 2022, at 8:11 AM, José Armando García Sancio 
>  wrote:
> 
> Hello all,
> 
> I would like to start voting for "KIP-876: Time based cluster metadata
> snapshots."
> 
> KIP: https://cwiki.apache.org/confluence/x/MY3GDQ
> Discussion thread:
> https://lists.apache.org/thread/ww67h9d4xvgw1f7jn4zxwydmt8x1mq72
> 
> Thanks!
> -- 
> -José



Jenkins build is back to stable : Kafka » Kafka Branch Builder » trunk #1295

2022-10-13 Thread Apache Jenkins Server
See 




Re: [DISCUSS] (continued) KIP-848: The Next Generation of the Consumer Rebalance Protocol

2022-10-13 Thread Luke Chen
Hi David,

A few more questions:
1. We will now store the "targetAssignment" in the log. But as we know,
there's a max batch size limit (default 1MB), which means we cannot support
1M partitions in one group by default (actually, it should be less than 60k
partitions, since we'll store {topicID + partition id}). How will we handle
that? Do we expect users to adjust the max batch size to support groups with
large numbers of partitions, a change that isn't needed with the old
protocol?
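For context on the less-than-60k figure above, a rough back-of-envelope
sketch, assuming each persisted assignment entry costs roughly a 16-byte
topic UUID plus a 4-byte partition index and ignoring record framing and
member-level overhead (these per-entry sizes are an assumption, not numbers
from the KIP):

public class AssignmentSizeEstimate {
    public static void main(String[] args) {
        int maxBatchBytes = 1024 * 1024; // default max batch size, ~1MB
        int topicIdBytes = 16;           // assumed: one UUID per topic
        int partitionIdBytes = 4;        // assumed: int32 partition index
        int bytesPerPartition = topicIdBytes + partitionIdBytes;

        // Upper bound on partitions per batch before any other overhead,
        // consistent with the "less than 60k" estimate above.
        System.out.println(maxBatchBytes / bytesPerPartition); // ~52,000
    }
}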

I'm also wondering why we need to persist the "targetAssignment" data at
all. If it is only for coordinator failover, could the new coordinator
request the currently owned partitions from each consumer after failing
over? I'm not sure whether the consumer will automatically send its owned
partitions to the new coordinator. If not, maybe we can return an error to
the client from the ConsumerGroupHeartbeat API, e.g. REQUIRE_OWNED_PARTITION,
and ask the client to send its currently owned partitions to the new
coordinator for the new assignment computation. Does that make sense?

Luke

On Fri, Oct 14, 2022 at 12:22 AM Jun Rao  wrote:

> Hi, David,
>
> Thanks for the reply and the updated KIP. A few more comments on the
> interfaces and the protocols.
>
> 60.  On the consumer side, do we need both PartitionAssignor.onAssignment
> and ConsumerRebalanceListener.onPartitionsAssigned? My understanding is
> that the former was added for cooperative rebalance, which is now handled
> by the coordinator. If we do need both, should we make them more consistent
> (e.g. topic name vs topic id, list vs set vs collection)?
>
> 61. group.local.assignors: Could we make it clear that it's the full class
> name that implements PartitionAssignor?
>
> 62. MemberAssignment: It currently has the following method.
> public Set<TopicPartition> topicPartitions()
> We are adding List ownedPartitions. Should we keep the
> naming and the return type consistent?
>
> 63. MemberAssignment.error: should that be reason?
>
> 64. group.remote.assignor: The client may not know what assignors the
> broker supports. Should we default this to what the broker determines (e.g.
> first assignor listed in group.consumer.assignors)?
>
> 65. After the text "When A heartbeats again and acknowledges the
> revocation, the group coordinator transitions him to epoch 2 and releases
> foo-2.", we have the following.
>   B - epoch=2, partitions=[foo-2], pending-partitions=[]
> Should foo-2 be in pending-partitions?
>
> 66. In the Online Migration example, is the first occurrence of "C -
> epoch=23, partitions=[foo-2, foo-5, foo-4], pending-partitions=[]" correct?
> It seems that should happen after C receives a SyncGroup response? If so,
> subsequent examples have the same issue.
>
> 67. ConsumerGroupHeartbeatRequest.RebalanceTimeoutMs : Which config
> controls this? How is this used by the group coordinator since there is no
> sync barrier anymore?
>
> 68. ConsumerGroupHeartbeatResponse:
> 68.1 AssignedTopicPartitions and PendingTopicPartitions are of type
> []TopicPartition. Should they be TopicPartitions?
> 68.2 Assignment.error. Should we also have an errorMessage field?
>
> 69. ConsumerGroupPrepareAssignmentResponse.Members.Assignor: Should it
> include the selected assignor name?
>
> 70. ConsumerGroupInstallAssignmentRequest.GroupEpoch: Should we let the
> client set this? Intuitively, it seems the coordinator should manage the
> group epoch.
>
> 71. ConsumerGroupDescribeResponse:
> 71.1 Members.Assignment.Partitions. Should we include the topic name too
> since it's convenient for building tools? Ditto for TargetAssignment.
> 71.2 Members: Should we include SubscribedTopicRegex too?
>
> 72. OffsetFetchRequest: Is GenerationIdOrMemberEpoch needed since tools may
> also want to issue this request?
>
> Thanks,
>
> Jun
>


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1294

2022-10-13 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-876: Time based cluster metadata snapshots

2022-10-13 Thread José Armando García Sancio
Thanks for your feedback David Jacot, Colin McCabe and Niket Goel.

I started the vote thread at
https://lists.apache.org/thread/yzzhbvdqxg9shttgbzooc2f42l1cv2sj

-- 
-José


[VOTE] KIP-876: Time based cluster metadata snapshots

2022-10-13 Thread José Armando García Sancio
Hello all,

I would like to start voting for "KIP-876: Time based cluster metadata
snapshots."

KIP: https://cwiki.apache.org/confluence/x/MY3GDQ
Discussion thread:
https://lists.apache.org/thread/ww67h9d4xvgw1f7jn4zxwydmt8x1mq72

Thanks!
-- 
-José


[jira] [Resolved] (KAFKA-14296) Partition leaders are not demoted during kraft controlled shutdown

2022-10-13 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-14296.
-
Fix Version/s: 3.4.0
   3.3.2
   Resolution: Fixed

> Partition leaders are not demoted during kraft controlled shutdown
> --
>
> Key: KAFKA-14296
> URL: https://issues.apache.org/jira/browse/KAFKA-14296
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: David Jacot
>Assignee: David Jacot
>Priority: Major
> Fix For: 3.4.0, 3.3.2
>
>
> When the BrokerServer starts its shutdown process, it transitions to 
> SHUTTING_DOWN and sets isShuttingDown to true. With this state change, 
> follower state changes are short-circuited. This means that a broker which 
> was serving as leader would keep acting as leader until controlled shutdown 
> completes. Instead, we want the leader and ISR state to be updated so that 
> requests will return NOT_LEADER and the client can find the new leader.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14301) Simulation validation that raft listener see NO_LEADER on leader

2022-10-13 Thread Jira
José Armando García Sancio created KAFKA-14301:
--

 Summary: Simulation validation that raft listener see NO_LEADER on 
leader
 Key: KAFKA-14301
 URL: https://issues.apache.org/jira/browse/KAFKA-14301
 Project: Kafka
  Issue Type: Task
Reporter: José Armando García Sancio


It would be good to validate in our KRaft simulation that if a replica wins an 
election, the listener will always see NO_LEADER first.

In other words, we want to check and validate that the listener always sees 
the epoch change before the leader is assigned.

At the protocol level this is valid because a replica first increases the 
epoch and transitions to candidate. Only once the replica wins the election 
does it assign a leader for that epoch.

It is important to validate that this is the case because the current 
controller implementation assumes it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1293

2022-10-13 Thread Apache Jenkins Server
See 




RE: Re: [DISCUSS] KIP-875: First-class offsets support in Kafka Connect

2022-10-13 Thread Greg Harris
Hey Chris,

Thanks for the KIP!

I think this is an important feature for both development and operations
use-cases, and it's an obvious gap in the REST feature set.
I also appreciate the incremental nature of the KIP and the future
extensions that will now be possible.

I had a couple of questions about the design and its extensibility:

1. How do you imagine the API will behave with connectors that have
extremely large numbers of partitions (thousands or more) and/or source
connectors with large amounts of data per partition?

2. Does the new STOPPED state need any special integration with the
rebalance subsystem, or can the rebalance algorithms remain ignorant of the
target state of connectors?

And about the implementation:

1. For my own edification, what is the difference between deleting a
consumer group and deleting all known offsets for that group? Does deleting
the group offer better/easier atomicity?

2. For EOS sources, will stopping the connector and tombstoning the task
configs perform a fence-out, or will that fence-out only occur when
performing the offsets DELETE operation?
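Regarding implementation question 1 above: both operations already exist on
the Java Admin client, which may be a useful reference point even though the
KIP will define its own semantics. A minimal sketch (the group and topic
names here are made up):

import java.util.Collections;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.TopicPartition;

public class OffsetDeletionSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Deletes the group as a whole (fails if it still has active
            // members), removing its offsets and metadata in one operation.
            admin.deleteConsumerGroups(Collections.singleton("connect-my-sink"))
                 .all().get();

            // Deletes offsets for selected partitions only; the group itself
            // and any remaining offsets are left in place.
            admin.deleteConsumerGroupOffsets(
                    "connect-my-sink",
                    Set.of(new TopicPartition("some-topic", 0)))
                 .all().get();
        }
    }
}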

Thanks!
Greg

On 2022/10/13 20:52:26 Chris Egerton wrote:
> Hi all,
>
> I noticed a fairly large gap in the first version of this KIP that I
> published last Friday, which has to do with accommodating connectors
> that target different Kafka clusters than the one that the Kafka Connect
> cluster uses for its internal topics and source connectors with dedicated
> offsets topics. I've since updated the KIP to address this gap, which has
> substantially altered the design. Wanted to give a heads-up to anyone
> that's already started reviewing.
>
> Cheers,
>
> Chris
>
> On Fri, Oct 7, 2022 at 1:29 PM Chris Egerton  wrote:
>
> > Hi all,
> >
> > I'd like to begin discussion on a KIP to add offsets support to the
Kafka
> > Connect REST API:
> >
https://cwiki.apache.org/confluence/display/KAFKA/KIP-875%3A+First-class+offsets+support+in+Kafka+Connect
> >
> > Cheers,
> >
> > Chris
> >
>


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.3 #102

2022-10-13 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 501024 lines...]
[2022-10-13T21:05:52.909Z] [INFO] Using 'UTF-8' encoding to copy filtered 
resources.
[2022-10-13T21:05:52.909Z] [INFO] Copying 2 resources
[2022-10-13T21:05:52.909Z] [INFO] Copying 3 resources
[2022-10-13T21:05:52.909Z] [INFO] 
[2022-10-13T21:05:52.909Z] [INFO] --- maven-archetype-plugin:2.2:jar 
(default-jar) @ streams-quickstart-java ---
[2022-10-13T21:05:53.865Z] [INFO] Building archetype jar: 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.3/streams/quickstart/java/target/streams-quickstart-java-3.3.2-SNAPSHOT
[2022-10-13T21:05:53.865Z] [INFO] 
[2022-10-13T21:05:53.865Z] [INFO] --- maven-site-plugin:3.5.1:attach-descriptor 
(attach-descriptor) @ streams-quickstart-java ---
[2022-10-13T21:05:53.865Z] [INFO] 
[2022-10-13T21:05:53.865Z] [INFO] --- 
maven-archetype-plugin:2.2:integration-test (default-integration-test) @ 
streams-quickstart-java ---
[2022-10-13T21:05:53.865Z] [INFO] 
[2022-10-13T21:05:53.865Z] [INFO] --- maven-gpg-plugin:1.6:sign 
(sign-artifacts) @ streams-quickstart-java ---
[2022-10-13T21:05:53.865Z] [INFO] 
[2022-10-13T21:05:53.865Z] [INFO] --- maven-install-plugin:2.5.2:install 
(default-install) @ streams-quickstart-java ---
[2022-10-13T21:05:53.865Z] [INFO] Installing 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.3/streams/quickstart/java/target/streams-quickstart-java-3.3.2-SNAPSHOT.jar
 to 
/home/jenkins/.m2/repository/org/apache/kafka/streams-quickstart-java/3.3.2-SNAPSHOT/streams-quickstart-java-3.3.2-SNAPSHOT.jar
[2022-10-13T21:05:53.865Z] [INFO] Installing 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.3/streams/quickstart/java/pom.xml
 to 
/home/jenkins/.m2/repository/org/apache/kafka/streams-quickstart-java/3.3.2-SNAPSHOT/streams-quickstart-java-3.3.2-SNAPSHOT.pom
[2022-10-13T21:05:53.865Z] [INFO] 
[2022-10-13T21:05:53.865Z] [INFO] --- 
maven-archetype-plugin:2.2:update-local-catalog (default-update-local-catalog) 
@ streams-quickstart-java ---
[2022-10-13T21:05:53.865Z] [INFO] 

[2022-10-13T21:05:53.865Z] [INFO] Reactor Summary for Kafka Streams :: 
Quickstart 3.3.2-SNAPSHOT:
[2022-10-13T21:05:53.865Z] [INFO] 
[2022-10-13T21:05:53.865Z] [INFO] Kafka Streams :: Quickstart 
 SUCCESS [  2.494 s]
[2022-10-13T21:05:53.865Z] [INFO] streams-quickstart-java 
 SUCCESS [  0.974 s]
[2022-10-13T21:05:53.865Z] [INFO] 

[2022-10-13T21:05:53.865Z] [INFO] BUILD SUCCESS
[2022-10-13T21:05:53.865Z] [INFO] 

[2022-10-13T21:05:53.865Z] [INFO] Total time:  3.839 s
[2022-10-13T21:05:53.865Z] [INFO] Finished at: 2022-10-13T21:05:53Z
[2022-10-13T21:05:53.865Z] [INFO] 

[Pipeline] dir
[2022-10-13T21:05:54.395Z] Running in 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.3/streams/quickstart/test-streams-archetype
[Pipeline] {
[Pipeline] sh
[2022-10-13T21:05:55.328Z] 
[2022-10-13T21:05:55.328Z] 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testInnerLeft[caching enabled = false] PASSED
[2022-10-13T21:05:55.328Z] 
[2022-10-13T21:05:55.328Z] 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > 
testOuterInner[caching enabled = false] STARTED
[2022-10-13T21:05:56.598Z] + echo Y
[2022-10-13T21:05:56.598Z] + mvn archetype:generate -DarchetypeCatalog=local 
-DarchetypeGroupId=org.apache.kafka 
-DarchetypeArtifactId=streams-quickstart-java -DarchetypeVersion=3.3.2-SNAPSHOT 
-DgroupId=streams.examples -DartifactId=streams.examples -Dversion=0.1 
-Dpackage=myapps
[2022-10-13T21:05:58.385Z] [INFO] Scanning for projects...
[2022-10-13T21:05:59.340Z] [INFO] 
[2022-10-13T21:05:59.340Z] [INFO] --< 
org.apache.maven:standalone-pom >---
[2022-10-13T21:05:59.340Z] [INFO] Building Maven Stub Project (No POM) 1
[2022-10-13T21:05:59.340Z] [INFO] [ pom 
]-
[2022-10-13T21:05:59.340Z] [INFO] 
[2022-10-13T21:05:59.340Z] [INFO] >>> maven-archetype-plugin:3.2.1:generate 
(default-cli) > generate-sources @ standalone-pom >>>
[2022-10-13T21:05:59.340Z] [INFO] 
[2022-10-13T21:05:59.340Z] [INFO] <<< maven-archetype-plugin:3.2.1:generate 
(default-cli) < generate-sources @ standalone-pom <<<
[2022-10-13T21:05:59.340Z] [INFO] 
[2022-10-13T21:05:59.340Z] [INFO] 
[2022-10-13T21:05:59.340Z] [INFO] --- maven-archetype-plugin:3.2.1:generate 
(default-cli) @ standalone-pom ---
[2022-10-13T21:06:00.298Z] [INFO] Generating project in Interactive mode
[2022-10-13T21:06:00.298Z] [WARNING] Archetype not found in any catalog. 
Falling back to central repository.
[2022-10-13T21:06:00.29

Re: [DISCUSS] KIP-875: First-class offsets support in Kafka Connect

2022-10-13 Thread Chris Egerton
Hi all,

I noticed a fairly large gap in the first version of this KIP that I
published last Friday, which has to do with accommodating connectors
that target different Kafka clusters than the one that the Kafka Connect
cluster uses for its internal topics and source connectors with dedicated
offsets topics. I've since updated the KIP to address this gap, which has
substantially altered the design. Wanted to give a heads-up to anyone
that's already started reviewing.

Cheers,

Chris

On Fri, Oct 7, 2022 at 1:29 PM Chris Egerton  wrote:

> Hi all,
>
> I'd like to begin discussion on a KIP to add offsets support to the Kafka
> Connect REST API:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-875%3A+First-class+offsets+support+in+Kafka+Connect
>
> Cheers,
>
> Chris
>


[jira] [Resolved] (KAFKA-14292) KRaft broker controlled shutdown can be delayed indefinitely

2022-10-13 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-14292.
-
Fix Version/s: 3.4.0
   3.3.2
   Resolution: Fixed

> KRaft broker controlled shutdown can be delayed indefinitely
> 
>
> Key: KAFKA-14292
> URL: https://issues.apache.org/jira/browse/KAFKA-14292
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Alyssa Huang
>Priority: Major
> Fix For: 3.4.0, 3.3.2
>
>
> We noticed when rolling a kraft cluster that it took an unexpectedly long 
> time for one of the brokers to shut down. In the logs, we saw the following:
> {code:java}
> Oct 11, 2022 @ 17:53:38.277   [Controller 1] The request from broker 8 to 
> shut down can not yet be granted because the lowest active offset 2283357 is 
> not greater than the broker's shutdown offset 2283358. 
> org.apache.kafka.controller.BrokerHeartbeatManager  DEBUG   
> 2Oct 11, 2022 @ 17:53:38.277  [Controller 1] Updated the controlled shutdown 
> offset for broker 8 to 2283362.  
> org.apache.kafka.controller.BrokerHeartbeatManager  DEBUG   
> 3Oct 11, 2022 @ 17:53:40.278  [Controller 1] Updated the controlled shutdown 
> offset for broker 8 to 2283366.  
> org.apache.kafka.controller.BrokerHeartbeatManager  DEBUG   
> 4Oct 11, 2022 @ 17:53:40.278  [Controller 1] The request from broker 8 to 
> shut down can not yet be granted because the lowest active offset 2283361 is 
> not greater than the broker's shutdown offset 2283362. 
> org.apache.kafka.controller.BrokerHeartbeatManager  DEBUG   
> 5Oct 11, 2022 @ 17:53:42.279  [Controller 1] The request from broker 8 to 
> shut down can not yet be granted because the lowest active offset 2283365 is 
> not greater than the broker's shutdown offset 2283366. 
> org.apache.kafka.controller.BrokerHeartbeatManager  DEBUG   
> 6Oct 11, 2022 @ 17:53:42.279  [Controller 1] Updated the controlled shutdown 
> offset for broker 8 to 2283370.  
> org.apache.kafka.controller.BrokerHeartbeatManager  DEBUG   
> 7Oct 11, 2022 @ 17:53:44.280  [Controller 1] The request from broker 8 to 
> shut down can not yet be granted because the lowest active offset 2283369 is 
> not greater than the broker's shutdown offset 2283370. 
> org.apache.kafka.controller.BrokerHeartbeatManager  DEBUG   
> 8Oct 11, 2022 @ 17:53:44.281  [Controller 1] Updated the controlled shutdown 
> offset for broker 8 to 2283374.  
> org.apache.kafka.controller.BrokerHeartbeatManager  DEBUG{code}
> From what I can tell, it looks like the controller waits until all brokers 
> have caught up to the {{controlledShutdownOffset}} of the broker that is 
> shutting down before allowing it to proceed. Probably the intent is to make 
> sure they have all the leader and ISR state.
> The problem is that the {{controlledShutdownOffset}} seems to be updated 
> after every heartbeat that the controller receives: 
> https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/QuorumController.java#L1996.
>  Unless all other brokers can catch up to that offset before the next 
> heartbeat from the shutting down broker is received, the broker remains 
> in the shutting down state indefinitely.
> In this case, it took more than 40 minutes before the broker completed 
> shutdown:
> {code:java}
> 1Oct 11, 2022 @ 18:36:36.105  [Controller 1] The request from broker 8 to 
> shut down has been granted since the lowest active offset 2288510 is now 
> greater than the broker's controlled shutdown offset 2288510.  
> org.apache.kafka.controller.BrokerHeartbeatManager  INFO
> 2Oct 11, 2022 @ 18:40:35.197  [Controller 1] The request from broker 8 to 
> unfence has been granted because it has caught up with the offset of it's 
> register broker record 2288906.   
> org.apache.kafka.controller.BrokerHeartbeatManager  INFO{code}
> It seems like the bug here is that we should not keep updating 
> {{controlledShutdownOffset}} if it has already been set.
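A minimal sketch of the fix direction described above, with hypothetical
field and method names rather than the actual QuorumController or
BrokerHeartbeatManager code:
{code:java}
// Hypothetical sketch: record the controlled-shutdown offset only the first
// time the broker asks to shut down, so other brokers catch up to a fixed
// offset instead of chasing a target that advances on every heartbeat.
public class ControlledShutdownOffsetSketch {
    static final long NO_SHUTDOWN_OFFSET = -1L;

    static final class BrokerState {
        long controlledShutdownOffset = NO_SHUTDOWN_OFFSET;
    }

    static void onShutdownHeartbeat(BrokerState broker, long lastCommittedOffset) {
        if (broker.controlledShutdownOffset == NO_SHUTDOWN_OFFSET) {
            broker.controlledShutdownOffset = lastCommittedOffset;
        }
        // Already set: leave it alone, as the ticket suggests.
    }
}
{code}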



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1292

2022-10-13 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 427006 lines...]
[2022-10-13T20:18:30.723Z] 
[2022-10-13T20:18:30.723Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testOuter[caching
 enabled = false] PASSED
[2022-10-13T20:18:30.723Z] 
[2022-10-13T20:18:30.723Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testLeft[caching
 enabled = false] STARTED
[2022-10-13T20:18:36.816Z] 
[2022-10-13T20:18:36.816Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testLeft[caching
 enabled = false] PASSED
[2022-10-13T20:18:36.816Z] 
[2022-10-13T20:18:36.816Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerInner[caching
 enabled = false] STARTED
[2022-10-13T20:18:43.953Z] 
[2022-10-13T20:18:43.953Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerInner[caching
 enabled = false] PASSED
[2022-10-13T20:18:43.953Z] 
[2022-10-13T20:18:43.953Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerOuter[caching
 enabled = false] STARTED
[2022-10-13T20:18:49.936Z] 
[2022-10-13T20:18:49.936Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerOuter[caching
 enabled = false] PASSED
[2022-10-13T20:18:49.936Z] 
[2022-10-13T20:18:49.936Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerLeft[caching
 enabled = false] STARTED
[2022-10-13T20:18:56.571Z] 
[2022-10-13T20:18:56.571Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerLeft[caching
 enabled = false] PASSED
[2022-10-13T20:18:56.571Z] 
[2022-10-13T20:18:56.571Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testOuterInner[caching
 enabled = false] STARTED
[2022-10-13T20:19:02.641Z] 
[2022-10-13T20:19:02.641Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testOuterInner[caching
 enabled = false] PASSED
[2022-10-13T20:19:02.641Z] 
[2022-10-13T20:19:02.641Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testOuterOuter[caching
 enabled = false] STARTED
[2022-10-13T20:19:09.133Z] 
[2022-10-13T20:19:09.133Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TableTableJoinIntegrationTest > [caching enabled = false] > 
org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testOuterOuter[caching
 enabled = false] PASSED
[2022-10-13T20:19:10.093Z] 
[2022-10-13T20:19:10.093Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TaskAssignorIntegrationTest > 
shouldProperlyConfigureTheAssignor STARTED
[2022-10-13T20:19:10.093Z] 
[2022-10-13T20:19:10.093Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TaskAssignorIntegrationTest > 
shouldProperlyConfigureTheAssignor PASSED
[2022-10-13T20:19:11.053Z] 
[2022-10-13T20:19:11.053Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TaskMetadataIntegrationTest > 
shouldReportCorrectEndOffsetInformation STARTED
[2022-10-13T20:19:12.063Z] 
[2022-10-13T20:19:12.063Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > TaskMetadataIntegrationTest > 
shouldReportCorrectEndOffsetInformation PASSED
[2022-10-13T20:19:12.063Z] 
[2022-10-13T20:19:12.063Z] Gradle Test Run :streams:integrationTest > Gradle

[jira] [Created] (KAFKA-14300) KRaft controller snapshot not trigger after resign

2022-10-13 Thread Jira
José Armando García Sancio created KAFKA-14300:
--

 Summary: KRaft controller snapshot not trigger after resign
 Key: KAFKA-14300
 URL: https://issues.apache.org/jira/browse/KAFKA-14300
 Project: Kafka
  Issue Type: Bug
Reporter: José Armando García Sancio
Assignee: José Armando García Sancio


When the active KRaft controller resigns, it resets the 
newBytesSinceLastSnapshot field to zero. This is not accurate, since it is not 
guaranteed that there is an on-disk snapshot at the committed offset.

Since the active controller always reverts to the last committed state when it 
resigns, it should not reset newBytesSinceLastSnapshot; that way, 
newBytesSinceLastSnapshot always represents the committed bytes read since the 
last snapshot.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-866 ZooKeeper to KRaft Migration

2022-10-13 Thread Jun Rao
Hi, Colin,

Thanks for the reply.

10. This is a bit on the implementation side. If you look at the existing
ZK-based controller, most of the logic is around maintaining an in-memory
state of all the resources (broker, topic, partition, etc), reading/writing
to ZK, sending LeaderAndIsr and UpdateMetadata requests and handling the
responses to brokers. So we need all that logic in the dual write mode. One
option is to duplicate all that logic in some new code. This can be a bit
error prone and makes the code a bit harder to maintain if we need to fix
some critical issues in ZK-based controllers. Another option is to try
reusing the existing code in the ZK-based controller. For example, we could
start the EventManager in the ZK-based controller, but let the KRaft
controller ingest new events. This has its own challenges: (1) the existing
logic only logs ZK failures and doesn't expose them to the caller; (2) the
existing logic may add new events to the queue itself and we probably need
to think through how this is coordinated with the KRaft controller; (3) it
registers some ZK listeners unnecessarily (may not be a big concern). So we
need to get around those issues somehow. I am wondering if we have
considered both options and which approach we are leaning towards for the
implementation.

14. Good point, makes sense.

Thanks,

Jun




On Wed, Oct 12, 2022 at 3:27 PM Colin McCabe  wrote:

> Hi Jun,
>
> Thanks for taking a look. I can answer some questions here because I
> collaborated on this a bit, and David is on vacation for a few days.
>
> On Wed, Oct 12, 2022, at 14:41, Jun Rao wrote:
> > Hi, David,
> >
> > Thanks for the KIP. A few comments below.
> >
> > 10. It's still not very clear to me how the KRaft controller works in the
> > dual writes mode to KRaft log and ZK when the brokers still run in ZK
> mode.
> > Does the KRaft controller run a ZK based controller in parallel or do we
> > derive what needs to be written to ZK based on KRaft controller logic?
>
> We derive what needs to be written to ZK based on KRaft controller logic.
>
> > I am also not sure how the KRaft controller handles broker
> > registration/deregistration, since brokers are still running in ZK mode
> and
> > are not heartbeating to the KRaft controller.
>
> The new controller will listen for broker registrations under /brokers.
> This is the only znode watch that the new controller will do.
>
> We did consider changing how ZK-based broker registration worked, but it
> just ended up being too much work for not enough gain.
>
> >
> > 12. "A new set of nodes will be provisioned to host the controller
> quorum."
> > I guess we don't support starting the KRaft controller quorum on existing
> > brokers. It would be useful to make that clear.
> >
>
> Agreed
>
> > 13. "Once the quorum is established and a leader is elected, the
> controller
> > will check the state of the cluster using the MigrationCheck RPC." How
> does
> > the quorum controller detect other brokers? Does the controller node need
> > to be configured with ZK connection string? If so, it would be useful to
> > document the additional configs that the quorum controller needs to set.
> >
>
> Yes, the controllers monitor ZK for broker registrations, as I mentioned
> above. So they need zk.connect and the other ZK connection configurations.
>
> > 14. "In order to prevent further writes to ZK, the first thing the new
> > KRaft quorum must do is take over leadership of the ZK controller. " The
> ZK
> > controller processing changes to /controller update asynchronously. How
> > does the KRaft controller know when the ZK controller has resigned before
> > it can safely copy the ZK data?
> >
>
> This should be done through expectedControllerEpochZkVersion, just like in
> ZK mode, right? We should bump this epoch value so that any writes from the
> old controller will not go through. I agree we should spell this out in the
> KIP.
>
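To make that fencing concrete: ZooKeeper's conditional writes give exactly
this guarantee. A minimal sketch, with an illustrative call site rather than
the actual migration code:

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class FencedZkWriteSketch {
    // Writes to a znode only if its version still matches what the writer
    // last read; a deposed controller's stale writes fail with
    // BadVersionException.
    static void fencedWrite(ZooKeeper zk, String path, byte[] data, int expectedVersion)
            throws KeeperException, InterruptedException {
        try {
            Stat stat = zk.setData(path, data, expectedVersion);
            System.out.println("Write accepted, new version " + stat.getVersion());
        } catch (KeeperException.BadVersionException e) {
            // Someone else bumped the version (e.g. a new controller epoch);
            // this writer must stop and resign.
            System.out.println("Fenced: " + e.getMessage());
        }
    }
}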
> > 15. We have the following sentences. One says ControllerId is a random
> > KRaft broker and the other says it's the active controller. Which one is
> > correct?
> > "UpdateMetadata: for certain metadata changes, the KRaft controller will
> > need to send UpdateMetadataRequests to the ZK brokers. For the
> > “ControllerId” field in this request, the controller should specify a
> > random KRaft broker."
> > "In the UpdateMetadataRequest sent by the KRaft controller to the ZK
> > brokers, the ControllerId will point to the active controller which will
> be
> > used for the inter-broker requests."
> >
>
> Yeah, this seems like an error to me as well. A random value is not really
> useful. Plus the text here is self-contradictory, as you pointed out.
>
> I suspect what we should do here is add a new field, KRaftControllerId,
> and populate it with the real controller ID, and leave the old controllerId
> field as -1. A ZK-based broker that sees this can then consult its
> controller.quorum.voters configuration to see where it should send
> controller-bou

Re: [DISCUSS] KIP-864: Add End-To-End Latency Metrics to Connectors

2022-10-13 Thread Chris Egerton
Hi Jorge,

Thanks for the KIP. I agree with the overall direction and think this would
be a nice improvement to Kafka Connect. Here are my initial thoughts on the
details:

1. The motivation section outlines the gaps in Kafka Connect's task metrics
nicely. I think it'd be useful to include more concrete details on why
these gaps need to be filled in, and in which cases additional metrics
would be helpful. One goal could be to provide enhanced monitoring of
production deployments that allows for cluster administrators to set up
automatic alerts for latency spikes and, if triggered, quickly identify the
root cause of those alerts, reducing the time to remediation. Another goal
could be to provide more insight to developers or cluster administrators
who want to do performance testing on connectors in non-production
environments. It may help guide our decision making process to have a
clearer picture of the goals we're trying to achieve.
2. If we're trying to address the alert-and-diagnose use case, it'd be
useful to have as much information as possible at INFO level, rather than
forcing cluster administrators to possibly reconfigure a connector to emit
DEBUG or TRACE level metrics in order to diagnose a potential
production-impacting performance bottleneck. I can see the rationale for
emitting per-record metrics that track an average value at DEBUG level, but
for per-record metrics that track a maximum value, is there any reason not
to provide this information at INFO level?
3. I'm also curious about the performance testing suggested by Yash to
gauge the potential impact of this change. Have you been able to do any
testing with your draft implementation yet?
4. Just to make sure I understand correctly--does "time when it has been
received by the Sink task" refer to the wallclock time directly after a
call to SinkTask::put has been completed (as opposed to directly before
that call is made, or something else entirely)?
5. If the goal is to identify performance bottlenecks (either in production
or pre-production environments), would it make sense to introduce metrics
for each individual converter (i.e., key/value/header) and transformation?
It's definitely an improvement to be able to identify the total time for
conversion and transformation, but then the immediate follow-up question if
a bottleneck is found in that phase is "which converter/transformation is
responsible?" It'd be nice if we could provide a way to quickly answer that
question.
6. Any thoughts about offering latency metrics for source tasks between
receipt of the record from the task and delivery of the record to Kafka
(which would be tracked by producer callback)? We could also use the record
timestamp either instead of or in addition to receipt time if the task
provides a timestamp with its records.
7. We may end up introducing a way for sink tasks to record per-record
delivery to the sink system (see KIP-767 [1]). I'd like it if we could keep
the names of our metrics very precise in order to avoid confusing users
(who may think that we're providing metrics on actual delivery to the sink
system, which may not be the case if the connector performs asynchronous
writes), and in order to leave room for a metrics on true delivery time by
sink tasks. It'd also be nice if we could remain consistent with existing
metrics such as "put-batch-avg-time-ms". With that in mind, what do you
think about renaming these metrics:
- "sink-record-batch-latency-max-ms" to "put-batch-avg-latency-ms"
- "sink-record-latency-max-ms" to "put-sink-record-latency-max-ms"
- "sink-record-latency-avg-ms" to "put-sink-record-latency-avg-ms"
- "sink-record-convert-transform-time-max-ms" to
"convert-transform-sink-record-time-max-ms"
- "sink-record-convert-transform-time-avg-ms" to
"convert-transform-sink-record-time-avg-ms"
- "source-record-transform-convert-time-max-ms" to
"transform-convert-source-record-time-max-ms"
- "source-record-transform-convert-time-avg-ms" to
"transform-convert-source-record-time-avg-ms"

Thanks again for the KIP! Looking forward to your thoughts.

Cheers,

Chris

[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-767%3A+Connect+Latency+Metrics

On Thu, Sep 15, 2022 at 1:32 PM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Hi everyone,
>
> I've made a slight addition to the KIP based on Yash feedback:
>
> - A new metric is added at INFO level to record the max latency from the
> batch timestamp, by keeping the oldest record timestamp per batch.
> - A draft implementation is linked.
>
> Looking forward to your feedback.
> Also, a kindly reminder that the vote thread is open.
>
> Thanks!
> Jorge.
>
> On Thu, 8 Sept 2022 at 14:25, Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Great. I have updated the KIP to reflect this.
> >
> > Cheers,
> > Jorge.
> >
> > On Thu, 8 Sept 2022 at 12:26, Yash Mayya  wrote:
> >
> >> Thanks, I think it makes sense to define these metrics at a DEBUG
> >> recording
> >> l

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1291

2022-10-13 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 506302 lines...]
[2022-10-13T17:00:50.201Z] 
[2022-10-13T17:00:50.201Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > KStreamAggregationIntegrationTest > 
shouldReduceWindowed(TestInfo) PASSED
[2022-10-13T17:00:50.201Z] 
[2022-10-13T17:00:50.201Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > KStreamAggregationIntegrationTest > 
shouldCountSessionWindows() STARTED
[2022-10-13T17:00:53.600Z] 
[2022-10-13T17:00:53.600Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > KStreamAggregationIntegrationTest > 
shouldCountSessionWindows() PASSED
[2022-10-13T17:00:53.600Z] 
[2022-10-13T17:00:53.600Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > KStreamAggregationIntegrationTest > 
shouldAggregateWindowed(TestInfo) STARTED
[2022-10-13T17:01:00.605Z] 
[2022-10-13T17:01:00.605Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 167 > KStreamAggregationIntegrationTest > 
shouldAggregateWindowed(TestInfo) PASSED
[2022-10-13T17:01:33.587Z] 
[2022-10-13T17:01:33.587Z] FAILURE: Build failed with an exception.
[2022-10-13T17:01:33.587Z] 
[2022-10-13T17:01:33.587Z] * What went wrong:
[2022-10-13T17:01:33.587Z] Execution failed for task ':core:integrationTest'.
[2022-10-13T17:01:33.587Z] > Process 'Gradle Test Executor 161' finished with 
non-zero exit value 1
[2022-10-13T17:01:33.587Z]   This problem might be caused by incorrect test 
process configuration.
[2022-10-13T17:01:33.587Z]   Please refer to the test execution section in the 
User Manual at 
https://docs.gradle.org/7.5.1/userguide/java_testing.html#sec:test_execution
[2022-10-13T17:01:33.587Z] 
[2022-10-13T17:01:33.587Z] * Try:
[2022-10-13T17:01:33.587Z] > Run with --stacktrace option to get the stack 
trace.
[2022-10-13T17:01:33.587Z] > Run with --info or --debug option to get more log 
output.
[2022-10-13T17:01:33.587Z] > Run with --scan to get full insights.
[2022-10-13T17:01:33.587Z] 
[2022-10-13T17:01:33.587Z] * Get more help at https://help.gradle.org
[2022-10-13T17:01:33.587Z] 
[2022-10-13T17:01:33.587Z] BUILD FAILED in 2h 22m 4s
[2022-10-13T17:01:33.587Z] 212 actionable tasks: 115 executed, 97 up-to-date
[2022-10-13T17:01:33.587Z] 
[2022-10-13T17:01:33.587Z] See the profiling report at: 
file:///home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/build/reports/profile/profile-2022-10-13-14-39-32.html
[2022-10-13T17:01:33.587Z] A fine-grained performance profile is available: use 
the --scan option.
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
Failed in branch JDK 17 and Scala 2.12
[2022-10-13T17:01:37.661Z] > Task :core:classes
[2022-10-13T17:01:37.661Z] > Task :core:compileTestJava NO-SOURCE
[2022-10-13T17:02:02.554Z] > Task :core:compileTestScala
[2022-10-13T17:02:47.674Z] > Task :core:testClasses
[2022-10-13T17:02:59.796Z] > Task :streams:compileTestJava
[2022-10-13T17:02:59.796Z] > Task :streams:testClasses
[2022-10-13T17:02:59.796Z] > Task :streams:testJar
[2022-10-13T17:02:59.796Z] > Task :streams:testSrcJar
[2022-10-13T17:02:59.796Z] > Task 
:streams:publishMavenJavaPublicationToMavenLocal
[2022-10-13T17:02:59.796Z] > Task :streams:publishToMavenLocal
[2022-10-13T17:02:59.796Z] 
[2022-10-13T17:02:59.796Z] Deprecated Gradle features were used in this build, 
making it incompatible with Gradle 8.0.
[2022-10-13T17:02:59.796Z] 
[2022-10-13T17:02:59.796Z] You can use '--warning-mode all' to show the 
individual deprecation warnings and determine if they come from your own 
scripts or plugins.
[2022-10-13T17:02:59.796Z] 
[2022-10-13T17:02:59.796Z] See 
https://docs.gradle.org/7.5.1/userguide/command_line_interface.html#sec:command_line_warnings
[2022-10-13T17:02:59.796Z] 
[2022-10-13T17:02:59.796Z] Execution optimizations have been disabled for 2 
invalid unit(s) of work during this build to ensure correctness.
[2022-10-13T17:02:59.796Z] Please consult deprecation warnings for more details.
[2022-10-13T17:02:59.796Z] 
[2022-10-13T17:02:59.796Z] BUILD SUCCESSFUL in 3m 16s
[2022-10-13T17:02:59.796Z] 79 actionable tasks: 37 executed, 42 up-to-date
[Pipeline] sh
[2022-10-13T17:03:03.024Z] + grep ^version= gradle.properties
[2022-10-13T17:03:03.024Z] + cut -d= -f 2
[Pipeline] dir
[2022-10-13T17:03:04.042Z] Running in 
/home/jenkins/workspace/Kafka_kafka_trunk_2/streams/quickstart
[Pipeline] {
[Pipeline] sh
[2022-10-13T17:03:06.595Z] + mvn clean install -Dgpg.skip
[2022-10-13T17:03:07.612Z] [INFO] Scanning for projects...
[2022-10-13T17:03:07.612Z] [INFO] 

[2022-10-13T17:03:07.612Z] [INFO] Reactor Build Order:
[2022-

Kafka Client trying to reconnect resulting in application crash

2022-10-13 Thread Yash Bansal
Hello,

I have a Spring web application using Kafka. I am using the Apache Kafka
client 3.1.0 to connect to Kafka.
I have noticed that if I give a wrong Kafka port to connect to, the client
keeps trying to connect to Kafka, and during that time I am unable to call my
application's API; it returns 404.

So I have the questions below:
1. Why is my application unable to serve any API requests while the Kafka
client is trying to reconnect so frequently?
2. How can I mitigate this? If the Kafka client is unable to connect to the
broker, can it try a few times and then stop trying?

P.S.: If I give the correct port, the application works fine.

Best Regards
Yash Bansal
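One thing worth checking, though it may or may not explain the 404s: the
Kafka client never gives up reconnecting on its own, but you can bound how
long individual calls block and how aggressively it retries the connection.
A sketch of the relevant producer configs (the values are only illustrative):

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class BoundedBlockingProducerProps {
    public static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Cap how long send() and metadata fetches may block the calling thread.
        props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, "5000");
        // Back off between reconnect attempts instead of retrying tightly.
        props.put(ProducerConfig.RECONNECT_BACKOFF_MS_CONFIG, "1000");
        props.put(ProducerConfig.RECONNECT_BACKOFF_MAX_MS_CONFIG, "30000");
        return props;
    }
}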


Re: [jira] [Created] (KAFKA-14293) Basic Auth filter should set the SecurityContext after a successful login

2022-10-13 Thread jacob bogers
On Wednesday, October 12, 2022, Patrik Márton (Jira) 
wrote:

> Patrik Márton created KAFKA-14293:
> -
>
>  Summary: Basic Auth filter should set the SecurityContext
> after a successful login
>  Key: KAFKA-14293
>  URL: https://issues.apache.org/jira/browse/KAFKA-14293
>  Project: Kafka
>   Issue Type: Improvement
> Reporter: Patrik Márton
>
>
> Currently, the JaasBasicAuthFilter does not set the security context of
> the request after a successful login. However, information about the
> authenticated user might be required for further processing, for example to
> perform authorization checks after the authentication.
>
> > The filter should be extended to add the Security Context after a
> successful login.
>
> Another improvement would be to assign the right Priority to the filter.
> The current implementation uses the default priority, which is
> Priorities.USER = 5000. This is a lower priority than for example
> AUTHORIZATION, which means that the basic auth filter would run after
> authorization filters.
>
> > Assign the correct Priorities.AUTHENTICATION = 1000 priority to the
> filter
>
>
>
> --
> This message was sent by Atlassian Jira
> (v8.20.10#820010)
>
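A minimal JAX-RS sketch of the two suggested changes above, purely
illustrative and not Connect's actual JaasBasicAuthFilter (the javax.ws.rs
packages and the authenticate() helper are assumptions here):

import java.security.Principal;
import javax.annotation.Priority;
import javax.ws.rs.Priorities;
import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerRequestFilter;
import javax.ws.rs.core.SecurityContext;

@Priority(Priorities.AUTHENTICATION) // run before authorization filters
public class BasicAuthSecurityContextSketch implements ContainerRequestFilter {

    @Override
    public void filter(ContainerRequestContext requestContext) {
        String user = authenticate(requestContext); // placeholder for the JAAS login
        if (user == null) {
            return; // a real filter would abort with 401 here
        }
        SecurityContext original = requestContext.getSecurityContext();
        requestContext.setSecurityContext(new SecurityContext() {
            @Override public Principal getUserPrincipal() {
                return new Principal() {
                    @Override public String getName() { return user; }
                };
            }
            @Override public boolean isUserInRole(String role) { return false; }
            @Override public boolean isSecure() { return original.isSecure(); }
            @Override public String getAuthenticationScheme() { return SecurityContext.BASIC_AUTH; }
        });
    }

    private String authenticate(ContainerRequestContext ctx) {
        // Hypothetical stand-in for extracting and validating Basic credentials.
        return "alice";
    }
}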


[DISCUSS] (continued) KIP-848: The Next Generation of the Consumer Rebalance Protocol

2022-10-13 Thread Jun Rao
Hi, David,

Thanks for the reply and the updated KIP. A few more comments on the
interfaces and the protocols.

60.  On the consumer side, do we need both PartitionAssignor.onAssignment
and ConsumerRebalanceListener.onPartitionsAssigned? My understanding is
that the former was added for cooperative rebalance, which is now handled
by the coordinator. If we do need both, should we make them more consistent
(e.g. topic name vs topic id, list vs set vs collection)?

61. group.local.assignors: Could we make it clear that it's the full class
name that implements PartitionAssignor?

62. MemberAssignment: It currently has the following method.
public Set<TopicPartition> topicPartitions()
We are adding List ownedPartitions. Should we keep the
naming and the return type consistent?

63. MemberAssignment.error: should that be reason?

64. group.remote.assignor: The client may not know what assignors the
broker supports. Should we default this to what the broker determines (e.g.
first assignor listed in group.consumer.assignors)?

65. After the text "When A heartbeats again and acknowledges the
revocation, the group coordinator transitions him to epoch 2 and releases
foo-2.", we have the following.
  B - epoch=2, partitions=[foo-2], pending-partitions=[]
Should foo-2 be in pending-partitions?

66. In the Online Migration example, is the first occurrence of "C -
epoch=23, partitions=[foo-2, foo-5, foo-4], pending-partitions=[]" correct?
It seems that should happen after C receives a SyncGroup response? If so,
subsequent examples have the same issue.

67. ConsumerGroupHeartbeatRequest.RebalanceTimeoutMs : Which config
controls this? How is this used by the group coordinator since there is no
sync barrier anymore?

68. ConsumerGroupHeartbeatResponse:
68.1 AssignedTopicPartitions and PendingTopicPartitions are of type
[]TopicPartition. Should they be TopicPartitions?
68.2 Assignment.error. Should we also have an errorMessage field?

69. ConsumerGroupPrepareAssignmentResponse.Members.Assignor: Should it
include the selected assignor name?

70. ConsumerGroupInstallAssignmentRequest.GroupEpoch: Should we let the
client set this? Intuitively, it seems the coordinator should manage the
group epoch.

71. ConsumerGroupDescribeResponse:
71.1 Members.Assignment.Partitions. Should we include the topic name too
since it's convenient for building tools? Ditto for TargetAssignment.
71.2 Members: Should we include SubscribedTopicRegex too?

72. OffsetFetchRequest: Is GenerationIdOrMemberEpoch needed since tools may
also want to issue this request?

Thanks,

Jun


[jira] [Created] (KAFKA-14299) Benchmark and stabilize state updater

2022-10-13 Thread Lucas Brutschy (Jira)
Lucas Brutschy created KAFKA-14299:
--

 Summary: Benchmark and stabilize state updater
 Key: KAFKA-14299
 URL: https://issues.apache.org/jira/browse/KAFKA-14299
 Project: Kafka
  Issue Type: Task
Reporter: Lucas Brutschy


We need to benchmark and stabilize the separate state restoration code path.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14298) Getting null pointer exception

2022-10-13 Thread Ramakrishna (Jira)
Ramakrishna created KAFKA-14298:
---

 Summary: Getting null pointer exception
 Key: KAFKA-14298
 URL: https://issues.apache.org/jira/browse/KAFKA-14298
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Ramakrishna


Getting null pointer exception.

java.lang.NullPointerException at 
org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:995) 
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:925) 
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:801) 
at 
com.wellsfargo.dci.opensource.kafka.KafkaUtil$$anonfun$10$$anonfun$11.apply(KafkaUtil.scala:363)
 at 
com.wellsfargo.dci.opensource.kafka.KafkaUtil$$anonfun$10$$anonfun$11.apply(KafkaUtil.scala:361)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14297) Automated protocol should support namespace

2022-10-13 Thread David Jacot (Jira)
David Jacot created KAFKA-14297:
---

 Summary: Automated protocol should support namespace
 Key: KAFKA-14297
 URL: https://issues.apache.org/jira/browse/KAFKA-14297
 Project: Kafka
  Issue Type: Sub-task
Reporter: David Jacot
Assignee: David Jacot


The number of APIs has increased significantly. With KIP-848, we will add a 
few more. It would be great if we could group them into namespaces.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14296) Partition leaders are not demoted during kraft controlled shutdown

2022-10-13 Thread David Jacot (Jira)
David Jacot created KAFKA-14296:
---

 Summary: Partition leaders are not demoted during kraft controlled 
shutdown
 Key: KAFKA-14296
 URL: https://issues.apache.org/jira/browse/KAFKA-14296
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.3.1
Reporter: David Jacot
Assignee: David Jacot


TODO



--
This message was sent by Atlassian Jira
(v8.20.10#820010)