Re: [DISCUSS] KIP-855: Add schema.namespace parameter to SetSchemaMetadata SMT in Kafka Connect

2022-08-22 Thread Michael Negodaev
Hi Mickael,

Thank you for looking into this.
This is definitely a typo; I've just corrected it. Thanks for finding it!

Michael

ср, 17 авг. 2022 г. в 15:28, Mickael Maison :

> Hi Michael,
>
> Thanks for the KIP! Sorry for the delay, I finally took some time to
> take a look.
>
> In both the "Public Interfaces" and "Compatibility, Deprecation, and
> Migration Plan" sections it mentions the new config is
> "transforms.transformschema.schema.name". However if I understand
> correctly the config you propose adding is actually
> "transforms.transformschema.schema.namespace". Is this a typo or am I
> missing something?
>
> Thanks,
> Mickael
>
> On Fri, Jul 22, 2022 at 9:57 AM Michael Negodaev 
> wrote:
> >
> > Hi all,
> >
> > I would like to start the discussion on my design to add
> "schema.namespace"
> > parameter in SetSchemaMetadata Single Message Transform in Kafka Connect.
> >
> > KIP URL: https://cwiki.apache.org/confluence/x/CiT1D
> >
> > Thanks!
> > -Michael
>
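
For readers following along, a connector configuration using the proposed parameter might look like the sketch below (illustrative only: the SMT alias `transformschema` matches the config key discussed in this thread, the `schema.name`/`schema.namespace` values are made up, and `schema.namespace` itself is only proposed in KIP-855, not released):

```properties
# SetSchemaMetadata SMT applied to record values
transforms=transformschema
transforms.transformschema.type=org.apache.kafka.connect.transforms.SetSchemaMetadata$Value
# Existing parameter: replaces the schema name
transforms.transformschema.schema.name=OrderValue
# Proposed in KIP-855: a namespace prefixed onto the schema name
transforms.transformschema.schema.namespace=com.example.orders
```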


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1165

2022-08-22 Thread Apache Jenkins Server
See 




[VOTE] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-08-22 Thread Stanislav Kozlovski
Hello,

I'd like to start a vote on KIP-860, which adds a client-provided option to
the AlterPartitionReassignmentsRequest that allows the user to guard
against an unintentional change in the replication factor during partition
reassignments.

Discuss Thread:
https://lists.apache.org/thread/bhrqjd4vb05xtztkdo8py374m9dgq69r
KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-860%3A+Add+client-provided+option+to+guard+against+replication+factor+change+during+partition+reassignments
JIRA: https://issues.apache.org/jira/browse/KAFKA-14121


-- 
Best,
Stanislav


Re: [DISCUSS] KIP-860: Add client-provided option to guard against unintentional replication factor change during partition reassignments

2022-08-22 Thread Stanislav Kozlovski
Thanks David,

I do prefer the `disallow-replication-factor-change` flag, but only for the
CLI. I assume that's what you're proposing instead of
"disable-replication-factor-change"; I feel the wording in your suggestion is
more natural.
If we were to modify more (e.g. the RPC or Admin API), I think the double
negative would make the code and API a bit less straightforward to reason
about, e.g. `disallowReplicationFactorChange=false`.
I have changed the KIP to mention "disallow".

As for the RPCs, I had only envisioned changes in the request, hence I had
only pasted the `.diff`. I have now added the option to be returned as part
of the response too, and have described both RPC JSONs in the conventional
way we do it.
Hopefully, this looks good.

I'll be starting a voting thread now.

Best,
Stanislav

On Tue, Aug 16, 2022 at 6:17 AM David Jacot 
wrote:

> Thanks Stan. Overall, the KIP looks good to me. I have two minor comments:
>
> * Could we document the different versions in the
> AlterPartitionReassignmentsRequest/Response schemas? You can look at
> other requests/responses to see how we have done this so far.
>
> * I wonder if --disallow-replication-factor-change would be a better
> name. I don't feel strongly about this, so I am happy to go with the
> quorum here.
>
> Best,
> David
>
> On Tue, Aug 16, 2022 at 12:31 AM Stanislav Kozlovski
>  wrote:
> >
> > Thanks for the discussion all,
> >
> > I have updated the KIP to mention throwing an UnsupportedVersionException
> > if the server is running an old version that would not honor the
> configured
> > AllowReplicationFactorChange option.
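
To make the failure mode concrete, here is a minimal, self-contained sketch of the client-side decision being discussed. All names are stand-ins (the exception class below shadows Kafka's real UnsupportedVersionException, and `alterReassignments`/`submit` are hypothetical helpers, not the actual Admin API):

```java
public class ReassignmentClientSketch {
    /** Stand-in for Kafka's real UnsupportedVersionException. */
    public static class UnsupportedVersionException extends RuntimeException {
        public UnsupportedVersionException(String m) { super(m); }
    }

    /** Simulates a broker that may not understand the new request field. */
    public static void alterReassignments(boolean brokerSupportsGuard,
                                          boolean disallowRfChange) {
        if (disallowRfChange && !brokerSupportsGuard) {
            // An old broker would silently ignore the flag, so the client
            // must fail loudly instead of pretending the guard is in place.
            throw new UnsupportedVersionException(
                "Broker does not support disallowReplicationFactorChange");
        }
    }

    /** The caller decides: abort, or retry without the guard. */
    public static String submit(boolean brokerSupportsGuard, boolean retryWithoutGuard) {
        try {
            alterReassignments(brokerSupportsGuard, true);
            return "guarded";
        } catch (UnsupportedVersionException e) {
            if (retryWithoutGuard) {
                alterReassignments(brokerSupportsGuard, false);
                return "unguarded";
            }
            return "aborted";
        }
    }

    public static void main(String[] args) {
        System.out.println(submit(true, false));   // prints "guarded"
        System.out.println(submit(false, true));   // prints "unguarded"
        System.out.println(submit(false, false));  // prints "aborted"
    }
}
```

This mirrors the point made earlier in the thread: throwing gives clients the choice to retry or abort, rather than the AdminClient choosing for them.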
> >
> > Please take a look:
> > - KIP:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-860%3A+Add+client-provided+option+to+guard+against+replication+factor+change+during+partition+reassignments
> > - changes:
> >
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=217392873&selectedPageVersions=4&selectedPageVersions=3
> >
> > If there aren't extra comments, I plan on starting a vote thread by the
> end
> > of this week.
> >
> > Best,
> > Stanislav
> >
> > On Tue, Aug 9, 2022 at 5:06 AM David Jacot 
> > wrote:
> >
> > > Throwing an UnsupportedVersionException with an appropriate message
> > > seems to be the best option when the new API is not supported and
> > > AllowReplicationFactorChange is not set to the default value.
> > >
> > > Cheers,
> > > David
> > >
> > > On Mon, Aug 8, 2022 at 6:25 PM Vikas Singh  >
> > > wrote:
> > > >
> > > > I personally like the UVE option. It provides options for clients to
> go
> > > > either way, retry or abort. If we do it in AdminClient, then users
> have
> > > to
> > > > live with what we have chosen.
> > > >
> > > > > Note this can happen during an RF change too. e.g [1,2,3] =>
> [4,5,6,7]
> > > (RF
> > > > > change, intermediate set is [1,2,3,4,5,6,7]), and we try to do a
> > > > > reassignment to [9,10,11], the logic will compare [4,5,6,7] to
> > > [9,10,11].
> > > > > In such a situation where one wants to cancel the RF increase and
> > > reassign
> > > > > again, one first needs to cancel the existing reassignment via the
> API
> > > (no
> > > > > special action required despite RF change)
> > > >
> > > > Thanks for the explanation. I did realize this nuance and thus
> requested
> > > to
> > > > put that in KIP as it's not mentioned why the choice was made. I am
> fine
> > > if
> > > > you choose to not do it in the interest of brevity.
> > > >
> > > > Vikas
> > > >
> > > > On Sun, Aug 7, 2022 at 9:02 AM Stanislav Kozlovski
> > > >  wrote:
> > > >
> > > > > Thank you for the reviews.
> > > > >
> > > > > Vikas,
> > > > > > > In the case of an already-reassigning partition being
> reassigned
> > > again,
> > > > > the validation compares the targetReplicaSet size of the
> reassignment
> > > to
> > > > > the targetReplicaSet size of the new reassignment and throws if
> those
> > > > > differ.
> > > > > > Can you add more detail to this, or clarify what is
> targetReplicaSet
> > > (for
> > > > > e.g. why not sourceReplicaSet?) and how the target replica set
> will be
> > > > > calculated?
> > > > > If a reassignment is ongoing, such that [1,2,3] => [4,5,6] (the
> > > replica set
> > > > > in Kafka will be [1,2,3,4,5,6] during the reassignment), and you
> try to
> > > > > issue a new reassignment (e.g. [7,8,9]), Kafka should NOT think that
> the
> > > RF
> > > > > of the partition is 6 just because a reassignment is ongoing.
> Hence, we
> > > > > compare [4,5,6]'s length to [7,8,9]'s.
> > > > > The targetReplicaSet is a term we use in KIP-455
> > > > > <
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-455%3A+Create+an+Administrative+API+for+Replica+Reassignment
> > > > > >.
> > > > > It means the desired replica set that a given reassignment is
> trying to
> > > > > achieve. Here we compare said set of the on-going reassignment to
> the
> > > new
> > > > > reassignment.
> > > > >
> > > > > Note this can happen during an RF change too. e.g 
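
The validation described above can be sketched as follows. `ReassignmentGuard` and its method are hypothetical names for illustration, not Kafka code; the KIP's actual check lives server-side:

```java
import java.util.Arrays;
import java.util.List;

public class ReassignmentGuard {
    /**
     * Returns true if the new reassignment keeps the replication factor.
     * The comparison is against the target replica set of any ongoing
     * reassignment, never against the transient union of old and new replicas.
     */
    public static boolean keepsReplicationFactor(List<Integer> currentTarget,
                                                 List<Integer> newTarget) {
        return currentTarget.size() == newTarget.size();
    }

    public static void main(String[] args) {
        // Ongoing reassignment [1,2,3] -> [4,5,6]: the replica set in Kafka is
        // temporarily [1,2,3,4,5,6], but the RF is judged from [4,5,6].
        List<Integer> target = Arrays.asList(4, 5, 6);
        System.out.println(keepsReplicationFactor(target, Arrays.asList(7, 8, 9)));      // prints "true"
        System.out.println(keepsReplicationFactor(target, Arrays.asList(9, 10, 11, 12))); // prints "false"
    }
}
```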

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1164

2022-08-22 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 419041 lines...]
[2022-08-22T20:45:11.955Z] > Task :connect:api:testJar
[2022-08-22T20:45:11.955Z] > Task :connect:api:testSrcJar
[2022-08-22T20:45:11.955Z] > Task 
:connect:api:publishMavenJavaPublicationToMavenLocal
[2022-08-22T20:45:11.955Z] > Task :connect:api:publishToMavenLocal
[2022-08-22T20:45:13.729Z] 
[2022-08-22T20:45:13.729Z] > Task :streams:javadoc
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:890:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:919:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:939:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:854:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:84:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:136:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:147:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Repartitioned.java:101:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/kstream/Repartitioned.java:167:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/TopologyConfig.java:58:
 warning - Tag @link: missing '#': "org.apache.kafka.streams.StreamsBuilder()"
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/TopologyConfig.java:58:
 warning - Tag @link: can't find org.apache.kafka.streams.StreamsBuilder() in 
org.apache.kafka.streams.TopologyConfig
[2022-08-22T20:45:13.729Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/TopologyDescription.java:38:
 warning - Tag @link: reference not found: ProcessorContext#forward(Object, 
Object) forwards
[2022-08-22T20:45:14.676Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/query/Position.java:44:
 warning - Tag @link: can't find query(Query,
[2022-08-22T20:45:14.676Z]  PositionBound, boolean) in 
org.apache.kafka.streams.processor.StateStore
[2022-08-22T20:45:14.676Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/query/QueryResult.java:109:
 warning - Tag @link: reference not found: this#getResult()
[2022-08-22T20:45:14.676Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/query/QueryResult.java:116:
 warning - Tag @link: reference not found: this#getFailureReason()
[2022-08-22T20:45:14.676Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/main/java/org/apache/kafka/streams/query/QueryResult.java:116:
 warning - Tag @link: reference not found: this#getFailureMessage()
[2022-08-22T20:45:14.676Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk/streams/src/mai

RE: Re: [DISCUSS] KIP-855: Add schema.namespace parameter to SetSchemaMetadata SMT in Kafka Connect

2022-08-22 Thread Patrick Magee
This looks like a typo, yes. We are adding the ability to add a ‘namespace’ 
which can be prefixed onto the schema name.

Regards,

Patrick

On 2022/08/17 09:28:27 Mickael Maison wrote:
> Hi Michael,
>
> Thanks for the KIP! Sorry for the delay, I finally took some time to
> take a look.
>
> In both the "Public Interfaces" and "Compatibility, Deprecation, and
> Migration Plan" sections it mentions the new config is
> "transforms.transformschema.schema.name". However if I understand
> correctly the config you propose adding is actually
> "transforms.transformschema.schema.namespace". Is this a typo or am I
> missing something?
>
> Thanks,
> Mickael
>
> On Fri, Jul 22, 2022 at 9:57 AM Michael Negodaev  wrote:
> >
> > Hi all,
> >
> > I would like to start the discussion on my design to add "schema.namespace"
> > parameter in SetSchemaMetadata Single Message Transform in Kafka Connect.
> >
> > KIP URL: https://cwiki.apache.org/confluence/x/CiT1D
> >
> > Thanks!
> > -Michael
>
Sent from Mail for Windows



[jira] [Resolved] (KAFKA-13911) Rate is calculated as NaN for minimum config values

2022-08-22 Thread Jose Armando Garcia Sancio (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jose Armando Garcia Sancio resolved KAFKA-13911.

  Reviewer: Ismael Juma
Resolution: Fixed

Closing as it was merged to trunk and 3.3.

> Rate is calculated as NaN for minimum config values
> ---
>
> Key: KAFKA-13911
> URL: https://issues.apache.org/jira/browse/KAFKA-13911
> Project: Kafka
>  Issue Type: Bug
>Reporter: Divij Vaidya
>Assignee: Divij Vaidya
>Priority: Minor
> Fix For: 3.3.0
>
>
> Implementation of connection creation rate quotas in Kafka is dependent on 
> two configurations:
>  # 
> [quota.window.num|https://kafka.apache.org/documentation.html#brokerconfigs_quota.window.num]
>  # 
> [quota.window.size.seconds|https://kafka.apache.org/documentation.html#brokerconfigs_quota.window.size.seconds]
> The minimum possible value of these configurations is 1 as per the 
> documentation. However, 1 as a minimum value for quota.window.num is invalid 
> and leads to a failure in the rate calculation, as demonstrated below.
> As a proof of the bug, the following unit test fails:
> {code:java}
> @Test
> public void testUseWithMinimumPossibleConfiguration() {
>     final Rate r = new Rate();
>     MetricConfig config = new MetricConfig().samples(1).timeWindow(1, TimeUnit.SECONDS);
>     Time elapsed = new MockTime();
>     r.record(config, 1.0, elapsed.milliseconds());
>     elapsed.sleep(100);
>     r.record(config, 1.0, elapsed.milliseconds());
>     elapsed.sleep(1000);
>     final Double observedRate = r.measure(config, elapsed.milliseconds());
>     assertFalse(Double.isNaN(observedRate));
> }
> {code}
>  
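
As a simplified illustration of how a windowed rate can degenerate to NaN (this models the arithmetic only, not the real `Rate` internals): if every recorded sample has been purged by measurement time, both the retained total and the retained window span are zero, and 0.0 / 0.0 is NaN in Java.

```java
public class RateNaNSketch {
    /**
     * Simplified model of a windowed rate: total events in the retained
     * sample windows divided by the elapsed span of those windows (seconds).
     * With a single 1-second sample window, every recorded value may already
     * be purged by measurement time, leaving 0.0 / 0.0 == NaN.
     */
    public static double measure(double retainedTotal, double retainedWindowMs) {
        return retainedTotal / (retainedWindowMs / 1000.0);
    }

    public static void main(String[] args) {
        // Both samples purged: nothing retained, zero window span.
        System.out.println(measure(0.0, 0.0)); // prints "NaN"
        // Healthy case: 2 events over 1.1 seconds, roughly 1.82 events/sec.
        System.out.println(measure(2.0, 1100.0));
    }
}
```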



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-13410) KRaft Upgrades

2022-08-22 Thread David Arthur (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Arthur resolved KAFKA-13410.
--
Resolution: Fixed

The unfinished tasks from this issue were moved to KAFKA-14175

> KRaft Upgrades
> --
>
> Key: KAFKA-13410
> URL: https://issues.apache.org/jira/browse/KAFKA-13410
> Project: Kafka
>  Issue Type: New Feature
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Major
> Fix For: 3.3.0
>
>
> This is the placeholder JIRA for KIP-778



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-13935) Factor out static IBP usages from broker

2022-08-22 Thread David Arthur (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Arthur resolved KAFKA-13935.
--
  Assignee: David Arthur
Resolution: Fixed

> Factor out static IBP usages from broker
> 
>
> Key: KAFKA-13935
> URL: https://issues.apache.org/jira/browse/KAFKA-13935
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Major
>
> We pass the IBP down to the log layer for checking things like compression 
> support. Currently, we are still reading this from KafkaConfig. In ZK mode 
> this is fine, but in KRaft mode, reading the IBP from the config is not 
> supported.
> Since KRaft only supports IBP/MetadataVersion greater than 3.0 (which 
> supports the compression mode we check for), we may be able to avoid using a 
> dynamic call and/or volatile to get the current version. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14175) KRaft Upgrades Part 2

2022-08-22 Thread David Arthur (Jira)
David Arthur created KAFKA-14175:


 Summary: KRaft Upgrades Part 2
 Key: KAFKA-14175
 URL: https://issues.apache.org/jira/browse/KAFKA-14175
 Project: Kafka
  Issue Type: Bug
Reporter: David Arthur
Assignee: David Arthur
 Fix For: 3.4.0


This is the parent issue for KIP-778 tasks which were not completed for the 3.3 
release.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14174) Documentation for KRaft

2022-08-22 Thread Jose Armando Garcia Sancio (Jira)
Jose Armando Garcia Sancio created KAFKA-14174:
--

 Summary: Documentation for KRaft
 Key: KAFKA-14174
 URL: https://issues.apache.org/jira/browse/KAFKA-14174
 Project: Kafka
  Issue Type: Improvement
Reporter: Jose Armando Garcia Sancio
Assignee: Jose Armando Garcia Sancio
 Fix For: 3.3.0


KRaft documentation for 3.3
 # Disk recovery
 # Talk about KRaft operation



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-13166) EOFException when Controller handles unknown API

2022-08-22 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-13166.
-
Resolution: Fixed

> EOFException when Controller handles unknown API
> 
>
> Key: KAFKA-13166
> URL: https://issues.apache.org/jira/browse/KAFKA-13166
> Project: Kafka
>  Issue Type: Bug
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Minor
> Fix For: 3.3.0, 3.4.0
>
>
> When ControllerApis handles an unsupported RPC, it silently drops the request 
> due to an unhandled exception. 
> The following stack trace was manually printed since this exception was 
> suppressed on the controller. 
> {code}
> java.util.NoSuchElementException: key not found: UpdateFeatures
>   at scala.collection.MapOps.default(Map.scala:274)
>   at scala.collection.MapOps.default$(Map.scala:273)
>   at scala.collection.AbstractMap.default(Map.scala:405)
>   at scala.collection.mutable.HashMap.apply(HashMap.scala:425)
>   at kafka.network.RequestChannel$Metrics.apply(RequestChannel.scala:74)
>   at 
> kafka.network.RequestChannel.$anonfun$updateErrorMetrics$1(RequestChannel.scala:458)
>   at 
> kafka.network.RequestChannel.$anonfun$updateErrorMetrics$1$adapted(RequestChannel.scala:457)
>   at 
> kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62)
>   at 
> scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359)
>   at 
> scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355)
>   at 
> scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309)
>   at 
> kafka.network.RequestChannel.updateErrorMetrics(RequestChannel.scala:457)
>   at kafka.network.RequestChannel.sendResponse(RequestChannel.scala:388)
>   at 
> kafka.server.RequestHandlerHelper.sendErrorOrCloseConnection(RequestHandlerHelper.scala:93)
>   at 
> kafka.server.RequestHandlerHelper.sendErrorResponseMaybeThrottle(RequestHandlerHelper.scala:121)
>   at 
> kafka.server.RequestHandlerHelper.handleError(RequestHandlerHelper.scala:78)
>   at kafka.server.ControllerApis.handle(ControllerApis.scala:116)
>   at 
> kafka.server.ControllerApis.$anonfun$handleEnvelopeRequest$1(ControllerApis.scala:125)
>   at 
> kafka.server.ControllerApis.$anonfun$handleEnvelopeRequest$1$adapted(ControllerApis.scala:125)
>   at 
> kafka.server.EnvelopeUtils$.handleEnvelopeRequest(EnvelopeUtils.scala:65)
>   at 
> kafka.server.ControllerApis.handleEnvelopeRequest(ControllerApis.scala:125)
>   at kafka.server.ControllerApis.handle(ControllerApis.scala:103)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:75)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> This is due to a bug in the metrics code in RequestChannel.
> The result is that the request fails, but no indication is given that it was 
> due to an unsupported API on either the broker, controller, or client.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1163

2022-08-22 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 419361 lines...]
[2022-08-22T17:01:32.580Z] 
[2022-08-22T17:01:32.580Z] StreamsAssignmentScaleTest > 
testStickyTaskAssignorManyStandbys STARTED
[2022-08-22T17:01:39.018Z] 
[2022-08-22T17:01:39.018Z] StreamsAssignmentScaleTest > 
testStickyTaskAssignorManyStandbys PASSED
[2022-08-22T17:01:39.018Z] 
[2022-08-22T17:01:39.018Z] StreamsAssignmentScaleTest > 
testStickyTaskAssignorManyThreadsPerClient STARTED
[2022-08-22T17:01:39.018Z] 
[2022-08-22T17:01:39.018Z] StreamsAssignmentScaleTest > 
testStickyTaskAssignorManyThreadsPerClient PASSED
[2022-08-22T17:01:39.018Z] 
[2022-08-22T17:01:39.018Z] StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorManyThreadsPerClient STARTED
[2022-08-22T17:01:39.964Z] 
[2022-08-22T17:01:39.964Z] StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorManyThreadsPerClient PASSED
[2022-08-22T17:01:39.964Z] 
[2022-08-22T17:01:39.964Z] StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorLargePartitionCount STARTED
[2022-08-22T17:02:03.270Z] 
[2022-08-22T17:02:03.270Z] StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorLargePartitionCount PASSED
[2022-08-22T17:02:03.270Z] 
[2022-08-22T17:02:03.270Z] StreamsAssignmentScaleTest > 
testStickyTaskAssignorLargePartitionCount STARTED
[2022-08-22T17:02:26.243Z] 
[2022-08-22T17:02:26.243Z] StreamsAssignmentScaleTest > 
testStickyTaskAssignorLargePartitionCount PASSED
[2022-08-22T17:02:26.243Z] 
[2022-08-22T17:02:26.243Z] StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorManyStandbys STARTED
[2022-08-22T17:02:32.274Z] 
[2022-08-22T17:02:32.274Z] StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorManyStandbys PASSED
[2022-08-22T17:02:32.274Z] 
[2022-08-22T17:02:32.274Z] StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorManyStandbys STARTED
[2022-08-22T17:02:55.258Z] 
[2022-08-22T17:02:55.258Z] StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorManyStandbys PASSED
[2022-08-22T17:02:55.258Z] 
[2022-08-22T17:02:55.258Z] StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorLargeNumConsumers STARTED
[2022-08-22T17:02:55.258Z] 
[2022-08-22T17:02:55.258Z] StreamsAssignmentScaleTest > 
testFallbackPriorTaskAssignorLargeNumConsumers PASSED
[2022-08-22T17:02:55.258Z] 
[2022-08-22T17:02:55.258Z] StreamsAssignmentScaleTest > 
testStickyTaskAssignorLargeNumConsumers STARTED
[2022-08-22T17:02:57.957Z] 
[2022-08-22T17:02:57.957Z] StreamsAssignmentScaleTest > 
testStickyTaskAssignorLargeNumConsumers PASSED
[2022-08-22T17:02:57.957Z] 
[2022-08-22T17:02:57.957Z] AdjustStreamThreadCountTest > 
testConcurrentlyAccessThreads() STARTED
[2022-08-22T17:02:59.775Z] 
[2022-08-22T17:02:59.775Z] AdjustStreamThreadCountTest > 
testConcurrentlyAccessThreads() PASSED
[2022-08-22T17:02:59.775Z] 
[2022-08-22T17:02:59.775Z] AdjustStreamThreadCountTest > 
shouldResizeCacheAfterThreadReplacement() STARTED
[2022-08-22T17:03:03.550Z] 
[2022-08-22T17:03:03.550Z] AdjustStreamThreadCountTest > 
shouldResizeCacheAfterThreadReplacement() PASSED
[2022-08-22T17:03:03.550Z] 
[2022-08-22T17:03:03.550Z] AdjustStreamThreadCountTest > 
shouldAddAndRemoveThreadsMultipleTimes() STARTED
[2022-08-22T17:03:09.641Z] 
[2022-08-22T17:03:09.641Z] AdjustStreamThreadCountTest > 
shouldAddAndRemoveThreadsMultipleTimes() PASSED
[2022-08-22T17:03:09.641Z] 
[2022-08-22T17:03:09.641Z] AdjustStreamThreadCountTest > 
shouldnNotRemoveStreamThreadWithinTimeout() STARTED
[2022-08-22T17:03:13.488Z] 
[2022-08-22T17:03:13.488Z] AdjustStreamThreadCountTest > 
shouldnNotRemoveStreamThreadWithinTimeout() PASSED
[2022-08-22T17:03:13.488Z] 
[2022-08-22T17:03:13.488Z] AdjustStreamThreadCountTest > 
shouldAddAndRemoveStreamThreadsWhileKeepingNamesCorrect() STARTED
[2022-08-22T17:03:33.604Z] 
[2022-08-22T17:03:33.604Z] AdjustStreamThreadCountTest > 
shouldAddAndRemoveStreamThreadsWhileKeepingNamesCorrect() PASSED
[2022-08-22T17:03:33.604Z] 
[2022-08-22T17:03:33.604Z] AdjustStreamThreadCountTest > 
shouldAddStreamThread() STARTED
[2022-08-22T17:03:37.882Z] 
[2022-08-22T17:03:37.882Z] AdjustStreamThreadCountTest > 
shouldAddStreamThread() PASSED
[2022-08-22T17:03:37.882Z] 
[2022-08-22T17:03:37.882Z] AdjustStreamThreadCountTest > 
shouldRemoveStreamThreadWithStaticMembership() STARTED
[2022-08-22T17:03:42.231Z] 
[2022-08-22T17:03:42.231Z] AdjustStreamThreadCountTest > 
shouldRemoveStreamThreadWithStaticMembership() PASSED
[2022-08-22T17:03:42.231Z] 
[2022-08-22T17:03:42.231Z] AdjustStreamThreadCountTest > 
shouldRemoveStreamThread() STARTED
[2022-08-22T17:03:47.886Z] 
[2022-08-22T17:03:47.886Z] AdjustStreamThreadCountTest > 
shouldRemoveStreamThread() PASSED
[2022-08-22T17:03:47.886Z] 
[2022-08-22T17:03:47.886Z] AdjustStreamThreadCountTest > 
shouldResizeCacheAfterThreadRemovalTimesOut() STARTED
[2022-08-22T17:03:48.908Z] 
[2022-08-22T17:03:48.908Z] AdjustStreamThreadCountTest > 
shouldResizeCacheAfter

RE: ARM/PowerPC builds

2022-08-22 Thread Amit Ramswaroop Baheti
I am planning to raise a PR for re-enabling PowerPC on the pipeline. 

I will be monitoring PowerPC for failures. I hope this resolves the concern 
about build failures on PowerPC. Let me know otherwise.

Thanks,
Amit Baheti

-Original Message-
From: Amit Ramswaroop Baheti  
Sent: 10 August 2022 20:09
To: dev@kafka.apache.org
Subject: [EXTERNAL] RE: ARM/PowerPC builds

I looked at the PR failures on the PowerPC Jenkins node & the issue was due to a 
few stuck gradle daemons. I could find that using "./gradlew --status".
After I cleaned them up using "./gradlew --stop", I was able to build & test 
existing PRs in the jenkins workspace on that node. 

I also pulled another PR, #12488, manually & could execute the build and tests 
successfully on that node. Further, the build and tests execute within 
minutes, so there is no problem on the performance front.

I am wondering if the Apache infra team has any automation to clear stuck gradle 
daemons. If not, perhaps we could add one using a cron job and make such infra 
issues self-heal.
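
If such automation were added, it could be as small as a cron entry on the build node. The schedule and workspace path below are assumptions for illustration only:

```
# Hypothetical crontab entry: stop stuck Gradle daemons nightly on the CI node
0 3 * * * cd /home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk && ./gradlew --stop >/dev/null 2>&1
```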

I think it should be fine now to re-enable the PowerPC node in the pipeline. 

Thanks,
Amit Baheti

-Original Message-
From: Amit Ramswaroop Baheti 
Sent: 05 August 2022 17:52
To: dev@kafka.apache.org
Subject: [EXTERNAL] RE: ARM/PowerPC builds

Hi,

I am looking into the failures on PowerPC. 

I will share more details once I have something concrete & hopefully we would 
be able to enable it again soon.

-Amit Baheti

-Original Message-
From: Colin McCabe 
Sent: 04 August 2022 22:39
To: dev@kafka.apache.org
Subject: [EXTERNAL] Re: ARM/PowerPC builds

Hi Matthew,

Can you open a JIRA for the test failures you have seen on M1?

By the way, I have an M1 myself.

best,
Colin

On Thu, Aug 4, 2022, at 04:12, Matthew Benedict de Detrich wrote:
> Quite happy to have this change gone through since the ARM builds were 
> constantly failing; however, I reiterate what Divij Vaidya is saying. I 
> just recently got a new MacBook M1 laptop that has ARM architecture, 
> and even locally the tests fail (these are the same tests that also 
> failed in Jenkins).
>
> We should get to the root of the issue, especially as more people will
> get newer Apple laptops over time.
>
> --
> Matthew de Detrich
> Aiven Deutschland GmbH
> Immanuelkirchstraße 26, 10405 Berlin
> Amtsgericht Charlottenburg, HRB 209739 B
>
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> m: +491603708037
> w: aiven.io e: matthew.dedetr...@aiven.io On 4. Aug 2022, 12:36 +0200, 
> Divij Vaidya , wrote:
>> Thank you. This would greatly improve the PR experience, since now
>> there is a higher probability that it will be green.
>>
>> Side question though, do we know why ARM tests are timing out? Should 
>> we start a JIRA with Apache Infra to root cause?
>>
>> —
>> Divij Vaidya
>>
>>
>>
>> On Thu, Aug 4, 2022 at 12:42 AM Colin McCabe  wrote:
>>
>> > Just a quick note. Today we committed "MINOR: Remove ARM/PowerPC builds
>> > from Jenkinsfile #12380". This PR removes the ARM and PowerPC builds from
>> > the Jenkinsfile.
>> >
>> > The rationale is that these builds seem to be failing all the time, 
>> > and this is very disruptive. I personally didn't see any successes 
>> > in the last week or two. So I think we need to rethink this integration a 
>> > bit.
>> >
>> > I'd suggest that we run these builds as nightly builds rather than 
>> > on each commit. It's going to be rare that we make a change that 
>> > succeeds on x86 but breaks on PowerPC or ARM. This would let us 
>> > have very long timeouts on our ARM and PowerPC builds (they could 
>> > take all night if necessary), hence avoiding this issue.
>> >
>> > best,
>> > Colin
>> >
>> --
>> Divij Vaidya


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1162

2022-08-22 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-848: The Next Generation of the Consumer Rebalance Protocol

2022-08-22 Thread Sagar
Hi All,

As per the suggestions from David/Guozhang above, I updated the page for
Connect to have its own set of APIs rather than extending the ones from the
consumer. Please review again.

@Luke,

Thank you. I am indeed looking for review comments/thoughts/suggestions,
as the Connect changes are still in the draft stage, and since this is a
wide-ranging change I am sure there are cases I have missed. Also, on this
discussion thread David suggested continuing the discussion here to ensure
Connect stays compatible with whatever is done in KIP-848. I am OK either way.

@Guozhang,

Regarding the migration plan and the link to KIP-415 that you had shared, I
took a look at it. From what I understood, that migration/downgrade path is
for switching the assignment protocol between eager and incremental. In this
case, the migration is to the new rebalance protocol, which also depends on
whether the group coordinator supports the new rebalance protocol. I have
tried to incorporate this aspect in the draft page.

Thanks!
Sagar.




On Mon, Aug 22, 2022 at 1:00 PM Luke Chen  wrote:

> Hi David,
>
> Thanks for the update.
>
> Some more questions:
> 1. In Group Coordinator section, you mentioned:
> > The new group coordinator will have a state machine per
> *__consumer_offsets* partitions, where each state machine is modelled as an
> event loop. Those state machines will be executed in
> *group.coordinator.threads* threads.
>
> 1.1. I think the states are: "Empty, assigning, reconciling, stable,
> dead", mentioned in the Consumer Group States section, right?
> 1.2. What do you mean "each state machine is modelled as an event loop"?
> 1.3. Why do we need a state machine per *__consumer_offsets* partitions?
> Not a state machine "per consumer group" owned by a group coordinator? For
> example, if one group coordinator owns 2 consumer groups, and both exist in
> *__consumer_offsets-0*, will we have 1 state machine for it, or 2?
> 1.4. I know that *group.coordinator.threads* is the number of threads used
> to run the state machines. But I'm wondering, if the purpose of the threads
> is only to keep the state of each consumer group (or *__consumer_offsets*
> partitions?), and there is no heavy computation, why would we need multiple
> threads here?
>
> 2. For the default value in the new configs:
> 2.1. For the consumer session timeout, why is the default session timeout
> not between min (45s) and max (60s)? I thought the min/max session
> timeouts are meant to define its lower/upper bounds, no?
>
> group.consumer.session.timeout.ms int 30s The timeout to detect client
> failures when using the consumer group protocol.
> group.consumer.min.session.timeout.ms int 45s The minimum session timeout.
> group.consumer.max.session.timeout.ms int 60s The maximum session timeout.
>
>
>
> 2.2. The default server side assignors are [range, uniform], which means
> we'll default to the "range" assignor. I'd like to know why not the uniform
> one? I thought users will usually choose the uniform assignor (formerly the
> sticky assignor) for a more even distribution. Is there any other reason we
> chose the range assignor as default?
> group.consumer.assignors List range, uniform The server side assignors.
>
>
>
>
>
>
> Thank you.
> Luke
>
>
>
>
>
>
> On Mon, Aug 22, 2022 at 2:10 PM Luke Chen  wrote:
>
> > Hi Sagar,
> >
> > I have some thoughts about Kafka Connect integrating with KIP-848, but I
> > think we should have a separate discussion thread for the Kafka Connect
> > KIP: Integrating Kafka Connect With New Consumer Rebalance Protocol [1],
> > and let this discussion thread focus on consumer rebalance protocol,
> WDYT?
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/KAFKA/%5BDRAFT%5DIntegrating+Kafka+Connect+With+New+Consumer+Rebalance+Protocol
> >
> > Thank you.
> > Luke
> >
> >
> >
> > On Fri, Aug 12, 2022 at 9:31 PM Sagar  wrote:
> >
> >> Thank you Guozhang/David for the feedback. Looks like there's agreement
> on
> >> using separate APIs for Connect. I would revisit the doc and see what
> >> changes are to be made.
> >>
> >> Thanks!
> >> Sagar.
> >>
> >> On Tue, Aug 9, 2022 at 7:11 PM David Jacot  >
> >> wrote:
> >>
> >> > Hi Sagar,
> >> >
> >> > Thanks for the feedback and the document. That's really helpful. I
> >> > will take a look at it.
> >> >
> >> > Overall, it seems to me that both Connect and the Consumer could share
> >> > the same underlying "engine". The main difference is that the Consumer
> >> > assigns topic-partitions to members whereas Connect assigns tasks to
> >> > workers. I see two ways to move forward:
> >> > 1) We extend the new proposed APIs to support different resource types
> >> > (e.g. partitions, tasks, etc.); or
> >> > 2) We use new dedicated APIs for Connect. The dedicated APIs would be
> >> > similar to the new ones but different on the content/resources and
> >> > they would rely on the same engine on the coordinator side.
> >> >
> >> > I personally lean towards 2) because I am not a fan of overcharging
> >> > APIs to serve different purposes.

[jira] [Created] (KAFKA-14173) TopologyTestDriver does not perform left join on two streams when right side is missing

2022-08-22 Thread Guido Josquin (Jira)
Guido Josquin created KAFKA-14173:
-

 Summary: TopologyTestDriver does not perform left join on two 
streams when right side is missing
 Key: KAFKA-14173
 URL: https://issues.apache.org/jira/browse/KAFKA-14173
 Project: Kafka
  Issue Type: Bug
  Components: streams-test-utils
Affects Versions: 2.3.1
Reporter: Guido Josquin


I am trying to test a stream-stream join with `TopologyTestDriver`. My goal is 
to confirm, without running external services, that my topology performs the 
following left join correctly.

```
bills
  .leftJoin(payments)(
    {
      case (billValue, null) => billValue
      case (billValue, paymentValue) => (billValue.toInt - paymentValue.toInt).toString
    },
    JoinWindows.ofTimeDifferenceWithNoGrace(Duration.ofMillis(100))
  )
  .to("debt")
```

In other words, if we see a `bill` and a `payment` within 100ms, the payment 
should be subtracted from the bill. If we do not see a payment, the debt is 
simply the bill.

Here is the test code.

```
val simpleLeftJoinTopology = new SimpleLeftJoinTopology
val driver = new TopologyTestDriver(simpleLeftJoinTopology.topology)
val serde = Serdes.stringSerde

val bills = driver.createInputTopic("bills", serde.serializer, serde.serializer)
val payments = driver.createInputTopic("payments", serde.serializer, serde.serializer)
val debt = driver.createOutputTopic("debt", serde.deserializer, serde.deserializer)

bills.pipeInput("fred", "100")
bills.pipeInput("george", "20")
payments.pipeInput("fred", "95")

// When in doubt, sleep twice
driver.advanceWallClockTime(Duration.ofMillis(500))
Thread.sleep(500)

val keyValues = debt.readKeyValuesToList()
keyValues should contain theSameElementsAs Seq(
  // This record is present
  new KeyValue[String, String]("fred", "5"),
  // This record is missing
  new KeyValue[String, String]("george", "20")
)
```

Full code available at https://github.com/Oduig/kstreams-left-join-example

It seems that advancing the wall clock time, or sleeping for that matter, never
triggers the join condition when data only arrives on the left side.
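A plausible explanation (a sketch of Streams windowing semantics, not a confirmed diagnosis): the left-join result for an unmatched left-side record is only emitted once *stream time* — which is driven by record timestamps — advances past the window close, and neither `advanceWallClockTime` nor `Thread.sleep` moves stream time. The window-close condition can be sketched like this (the helper name is invented for illustration):

```java
import java.time.Duration;

public class WindowCloseSketch {
    // Hypothetical helper: a join window for a left record at recordTsMs
    // closes once stream time passes recordTsMs + window + grace.
    static boolean windowClosed(long streamTimeMs, long recordTsMs,
                                Duration window, Duration grace) {
        return streamTimeMs > recordTsMs + window.toMillis() + grace.toMillis();
    }

    public static void main(String[] args) {
        Duration window = Duration.ofMillis(100); // as in the reported topology
        Duration grace = Duration.ZERO;           // ofTimeDifferenceWithNoGrace

        long georgeTs = 0L; // the "george" bill has no matching payment

        // Wall-clock sleeps do not advance stream time: window still open.
        System.out.println(windowClosed(georgeTs, georgeTs, window, grace)); // false

        // A later record (e.g. any record at t=201ms) advances stream time
        // past the window close, which should flush "george" -> "20".
        System.out.println(windowClosed(201L, georgeTs, window, grace)); // true
    }
}
```

If this is the cause, piping one more left-side record with a sufficiently later timestamp (e.g. via the `pipeInput` overload that takes a timestamp) before reading the output topic should flush the pending left-join result.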



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-14097) Separate configuration for producer ID expiry

2022-08-22 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-14097.
-
Fix Version/s: 3.4.0
 Reviewer: David Jacot
 Assignee: Justine Olshan
   Resolution: Fixed

>  Separate configuration for producer ID expiry
> --
>
> Key: KAFKA-14097
> URL: https://issues.apache.org/jira/browse/KAFKA-14097
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Major
> Fix For: 3.4.0
>
>
> Ticket to track KIP-854. Currently time-based producer ID expiration is 
> controlled by `transactional.id.expiration.ms` but we want to create a 
> separate config. This can give us finer control over memory usage – 
> especially since producer IDs will be more common with idempotency becoming 
> the default.
> See 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-854+Separate+configuration+for+producer+ID+expiry



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-848: The Next Generation of the Consumer Rebalance Protocol

2022-08-22 Thread Luke Chen
Hi David,

Thanks for the update.

Some more questions:
1. In Group Coordinator section, you mentioned:
> The new group coordinator will have a state machine per
*__consumer_offsets* partitions, where each state machine is modelled as an
event loop. Those state machines will be executed in
*group.coordinator.threads* threads.

1.1. I think the state machines are: "Empty, assigning, reconciling, stable,
dead" mentioned in Consumer Group States section, right?
1.2. What do you mean "each state machine is modelled as an event loop"?
1.3. Why do we need a state machine per *__consumer_offsets* partitions?
Not a state machine "per consumer group" owned by a group coordinator? For
example, if one group coordinator owns 2 consumer groups, and both exist in
*__consumer_offsets-0*, will we have 1 state machine for it, or 2?
1.4. I know that *group.coordinator.threads* is the number of threads used
to run the state machines. But I'm wondering, if the purpose of the threads
is only to keep the state of each consumer group (or *__consumer_offsets*
partitions?), and there is no heavy computation, why would we need multiple
threads here?
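For what it's worth, here is one way to picture the "state machine per partition, executed by N threads" model the KIP describes — a rough sketch with invented names, not the proposed implementation. Each partition is pinned to one of N single-threaded event loops, so events for a given partition are applied strictly in order without per-partition locking:

```java
import java.util.*;
import java.util.concurrent.*;

public class CoordinatorSketch {
    enum State { EMPTY, ASSIGNING, RECONCILING, STABLE, DEAD }

    // Hypothetical per-partition state machine (toy transitions for illustration).
    static final class PartitionStateMachine {
        volatile State state = State.EMPTY;
        void handle(String event) {
            if (event.equals("join")) state = State.ASSIGNING;
            else if (event.equals("assigned")) state = State.RECONCILING;
            else if (event.equals("acked")) state = State.STABLE;
        }
    }

    static Map<Integer, State> run(int numPartitions, int numThreads) {
        Map<Integer, PartitionStateMachine> machines = new HashMap<>();
        for (int p = 0; p < numPartitions; p++) machines.put(p, new PartitionStateMachine());

        // N event loops (analogous to group.coordinator.threads); each partition
        // always submits to the same single-threaded loop, so its events are
        // processed sequentially.
        ExecutorService[] loops = new ExecutorService[numThreads];
        for (int i = 0; i < numThreads; i++) loops[i] = Executors.newSingleThreadExecutor();

        for (int p = 0; p < numPartitions; p++) {
            final int partition = p;
            for (String event : Arrays.asList("join", "assigned", "acked")) {
                loops[partition % numThreads].submit(() -> machines.get(partition).handle(event));
            }
        }
        for (ExecutorService loop : loops) {
            loop.shutdown();
            try {
                loop.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        Map<Integer, State> result = new TreeMap<>();
        machines.forEach((p, m) -> result.put(p, m.state));
        return result;
    }

    public static void main(String[] args) {
        // Every partition sees join -> assigned -> acked in order.
        System.out.println(run(4, 2));
    }
}
```

This is only meant to show why a small, fixed thread pool can serve many per-partition state machines when the work is mostly bookkeeping rather than heavy computation.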

2. For the default value in the new configs:
2.1. For the consumer session timeout, why is the default session timeout
not between min (45s) and max (60s)? I thought the min/max session
timeouts are meant to define its lower/upper bounds, no?

group.consumer.session.timeout.ms int 30s The timeout to detect client
failures when using the consumer group protocol.
group.consumer.min.session.timeout.ms int 45s The minimum session timeout.
group.consumer.max.session.timeout.ms int 60s The maximum session timeout.
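That reading can be checked mechanically: if the min/max configs are meant as bounds, the default should sit inside them. A small sketch of such a bound check (the helper name is invented, not from the KIP):

```java
public class SessionTimeoutBounds {
    // Validate that a default value sits within its advertised [min, max] bounds.
    static boolean withinBounds(long defaultMs, long minMs, long maxMs) {
        return minMs <= defaultMs && defaultMs <= maxMs;
    }

    public static void main(String[] args) {
        long defaultMs = 30_000L; // group.consumer.session.timeout.ms
        long minMs = 45_000L;     // group.consumer.min.session.timeout.ms
        long maxMs = 60_000L;     // group.consumer.max.session.timeout.ms
        // As proposed, the default falls below the minimum, which is the
        // inconsistency being raised above.
        System.out.println(withinBounds(defaultMs, minMs, maxMs)); // false
    }
}
```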



2.2. The default server side assignors are [range, uniform], which means
we'll default to the "range" assignor. I'd like to know why not the uniform
one? I thought users will usually choose the uniform assignor (formerly the
sticky assignor) for a more even distribution. Is there any other reason we
chose the range assignor as default?
group.consumer.assignors List range, uniform The server side assignors.
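To illustrate why the choice matters: with two topics of three partitions each and two consumers, a range-style assignor splits each topic independently (giving one consumer 4 partitions and the other 2), while a uniform-style assignor spreads all 6 partitions evenly. A simplified sketch of both strategies — not the actual assignor implementations:

```java
import java.util.*;

public class AssignorSketch {
    // Simplified range-style assignment: split each topic's partitions
    // independently, with earlier consumers absorbing the remainder.
    static Map<String, List<String>> range(List<String> consumers,
                                           Map<String, Integer> topics) {
        Map<String, List<String>> out = new TreeMap<>();
        consumers.forEach(c -> out.put(c, new ArrayList<>()));
        for (Map.Entry<String, Integer> t : new TreeMap<>(topics).entrySet()) {
            int per = t.getValue() / consumers.size();
            int extra = t.getValue() % consumers.size();
            int p = 0;
            for (int i = 0; i < consumers.size(); i++) {
                int count = per + (i < extra ? 1 : 0);
                for (int j = 0; j < count; j++) {
                    out.get(consumers.get(i)).add(t.getKey() + "-" + p++);
                }
            }
        }
        return out;
    }

    // Simplified uniform-style assignment: round-robin over all partitions
    // across all topics.
    static Map<String, List<String>> uniform(List<String> consumers,
                                             Map<String, Integer> topics) {
        Map<String, List<String>> out = new TreeMap<>();
        consumers.forEach(c -> out.put(c, new ArrayList<>()));
        int i = 0;
        for (Map.Entry<String, Integer> t : new TreeMap<>(topics).entrySet()) {
            for (int p = 0; p < t.getValue(); p++) {
                out.get(consumers.get(i++ % consumers.size())).add(t.getKey() + "-" + p);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> consumers = Arrays.asList("c0", "c1");
        Map<String, Integer> topics = Map.of("A", 3, "B", 3);
        System.out.println("range:   " + range(consumers, topics));   // c0 gets 4, c1 gets 2
        System.out.println("uniform: " + uniform(consumers, topics)); // 3 each
    }
}
```

The skew in the range case grows with the number of topics, which is why a uniform/sticky-style default is often preferred for evenness.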






Thank you.
Luke






On Mon, Aug 22, 2022 at 2:10 PM Luke Chen  wrote:

> Hi Sagar,
>
> I have some thoughts about Kafka Connect integrating with KIP-848, but I
> think we should have a separate discussion thread for the Kafka Connect
> KIP: Integrating Kafka Connect With New Consumer Rebalance Protocol [1],
> and let this discussion thread focus on consumer rebalance protocol, WDYT?
>
> [1]
> https://cwiki.apache.org/confluence/display/KAFKA/%5BDRAFT%5DIntegrating+Kafka+Connect+With+New+Consumer+Rebalance+Protocol
>
> Thank you.
> Luke
>
>
>
> On Fri, Aug 12, 2022 at 9:31 PM Sagar  wrote:
>
>> Thank you Guozhang/David for the feedback. Looks like there's agreement on
>> using separate APIs for Connect. I would revisit the doc and see what
>> changes are to be made.
>>
>> Thanks!
>> Sagar.
>>
>> On Tue, Aug 9, 2022 at 7:11 PM David Jacot 
>> wrote:
>>
>> > Hi Sagar,
>> >
>> > Thanks for the feedback and the document. That's really helpful. I
>> > will take a look at it.
>> >
>> > Overall, it seems to me that both Connect and the Consumer could share
>> > the same underlying "engine". The main difference is that the Consumer
>> > assigns topic-partitions to members whereas Connect assigns tasks to
>> > workers. I see two ways to move forward:
>> > 1) We extend the new proposed APIs to support different resource types
>> > (e.g. partitions, tasks, etc.); or
>> > 2) We use new dedicated APIs for Connect. The dedicated APIs would be
>> > similar to the new ones but different on the content/resources and
>> > they would rely on the same engine on the coordinator side.
>> >
>> > I personally lean towards 2) because I am not a fan of overcharging
>> > APIs to serve different purposes. That being said, I am not opposed to
>> > 1) if we can find an elegant way to do it.
>> >
>> > I think that we can continue to discuss it here for now in order to
>> > ensure that this KIP is compatible with what we will do for Connect in
>> > the future.
>> >
>> > Best,
>> > David
>> >
>> > On Mon, Aug 8, 2022 at 2:41 PM David Jacot  wrote:
>> > >
>> > > Hi all,
>> > >
>> > > I am back from vacation. I will go through and address your comments
>> > > in the coming days. Thanks for your feedback.
>> > >
>> > > Cheers,
>> > > David
>> > >
>> > > On Wed, Aug 3, 2022 at 10:05 PM Gregory Harris > >
>> > wrote:
>> > > >
>> > > > Hey All!
>> > > >
>> > > > Thanks for the KIP, it's wonderful to see cooperative rebalancing
>> > making it
>> > > > down the stack!
>> > > >
>> > > > I had a few questions:
>> > > >
>> > > > 1. The 'Rejected Alternatives' section describes how member epoch
>> > should
>> > > > advance in step with the group epoch and assignment epoch values. I
>> > think
>> > > > that this is a good idea for the reasons described in the KIP. When
>> the
>> > > > protocol is incrementally assigning partitions to a worker, what
>> member
>> > > > epoch does each incremental assignment use? Are member epochs
>> re-used,
>> > and
>> > > > a