[jira] [Resolved] (KAFKA-15881) Make changes in Release Process Wiki and Release Process

2023-12-12 Thread Vedarth Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vedarth Sharma resolved KAFKA-15881.

Fix Version/s: 3.7.0
 Reviewer: Manikumar
   Resolution: Fixed

The Release Process wiki has been updated.

> Make changes in Release Process Wiki and Release Process
> 
>
> Key: KAFKA-15881
> URL: https://issues.apache.org/jira/browse/KAFKA-15881
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Vedarth Sharma
>Assignee: Vedarth Sharma
>Priority: Major
> Fix For: 3.7.0
>
>
> Make changes to Release Process Wiki and docker README.md for detailed 
> release process instructions



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.7 #4

2023-12-12 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 452670 lines...]
> Task :storage:storage-api:compileTestJava
> Task :storage:storage-api:testClasses
> Task :connect:json:testSrcJar
> Task :connect:json:publishMavenJavaPublicationToMavenLocal
> Task :connect:json:publishToMavenLocal
> Task :server:compileTestJava
> Task :server:testClasses
> Task :server-common:compileTestJava
> Task :server-common:testClasses
> Task :raft:compileTestJava
> Task :raft:testClasses
> Task :core:compileScala
> Task :streams:generateMetadataFileForMavenJavaPublication
> Task :group-coordinator:compileTestJava
> Task :group-coordinator:testClasses

> Task :clients:javadoc
/home/jenkins/workspace/Kafka_kafka_3.7/clients/src/main/java/org/apache/kafka/clients/admin/ScramMechanism.java:32:
 warning - Tag @see: missing final '>': "https://cwiki.apache.org/confluence/display/KAFKA/KIP-554%3A+Add+Broker-side+SCRAM+Config+API;>KIP-554:
 Add Broker-side SCRAM Config API

 This code is duplicated in 
org.apache.kafka.common.security.scram.internals.ScramMechanism.
 The type field in both files must match and must not change. The type field
 is used both for passing ScramCredentialUpsertion and for the internal
 UserScramCredentialRecord. Do not change the type field."
/home/jenkins/workspace/Kafka_kafka_3.7/clients/src/main/java/org/apache/kafka/common/security/oauthbearer/secured/package-info.java:21:
 warning - Tag @link: reference not found: 
org.apache.kafka.common.security.oauthbearer
2 warnings

> Task :clients:javadocJar
> Task :metadata:compileTestJava
> Task :metadata:testClasses
> Task :clients:srcJar
> Task :clients:testJar
> Task :clients:testSrcJar
> Task :clients:publishMavenJavaPublicationToMavenLocal
> Task :clients:publishToMavenLocal
> Task :connect:api:generateMetadataFileForMavenJavaPublication
> Task :connect:api:compileTestJava UP-TO-DATE
> Task :connect:api:testClasses UP-TO-DATE
> Task :connect:api:testJar
> Task :connect:api:testSrcJar
> Task :connect:api:publishMavenJavaPublicationToMavenLocal
> Task :connect:api:publishToMavenLocal
> Task :streams:javadoc
> Task :streams:javadocJar
> Task :streams:srcJar
> Task :streams:processTestResources UP-TO-DATE
> Task :core:classes
> Task :core:compileTestJava NO-SOURCE
> Task :core:compileTestScala
> Task :core:testClasses
> Task :streams:compileTestJava
> Task :streams:testClasses
> Task :streams:testJar
> Task :streams:testSrcJar
> Task :streams:publishMavenJavaPublicationToMavenLocal
> Task :streams:publishToMavenLocal

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

For more on this, please refer to 
https://docs.gradle.org/8.5/userguide/command_line_interface.html#sec:command_line_warnings
 in the Gradle documentation.

BUILD SUCCESSFUL in 5m 21s
95 actionable tasks: 41 executed, 54 up-to-date

Publishing build scan...
https://ge.apache.org/s/t6nyom5ofi3k6

[Pipeline] sh
+ grep ^version= gradle.properties
+ cut -d= -f 2
[Pipeline] dir
Running in /home/jenkins/workspace/Kafka_kafka_3.7/streams/quickstart
[Pipeline] {
[Pipeline] sh
+ mvn clean install -Dgpg.skip
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Kafka Streams :: Quickstart[pom]
[INFO] streams-quickstart-java[maven-archetype]
[INFO] 
[INFO] < org.apache.kafka:streams-quickstart >-
[INFO] Building Kafka Streams :: Quickstart 3.7.0-SNAPSHOT[1/2]
[INFO]   from pom.xml
[INFO] [ pom ]-
[INFO] 
[INFO] --- clean:3.0.0:clean (default-clean) @ streams-quickstart ---
[INFO] 
[INFO] --- remote-resources:1.5:process (process-resource-bundles) @ 
streams-quickstart ---
[INFO] 
[INFO] --- site:3.5.1:attach-descriptor (attach-descriptor) @ 
streams-quickstart ---
[INFO] 
[INFO] --- gpg:1.6:sign (sign-artifacts) @ streams-quickstart ---
[INFO] 
[INFO] --- install:2.5.2:install (default-install) @ streams-quickstart ---
[INFO] Installing 
/home/jenkins/workspace/Kafka_kafka_3.7/streams/quickstart/pom.xml to 
/home/jenkins/.m2/repository/org/apache/kafka/streams-quickstart/3.7.0-SNAPSHOT/streams-quickstart-3.7.0-SNAPSHOT.pom
[INFO] 
[INFO] --< org.apache.kafka:streams-quickstart-java >--
[INFO] Building streams-quickstart-java 3.7.0-SNAPSHOT[2/2]
[INFO]   from java/pom.xml
[INFO] --[ maven-archetype ]---
[INFO] 
[INFO] --- clean:3.0.0:clean (default-clean) @ streams-quickstart-java ---
[INFO] 
[INFO] --- remote-resources:1.5:process (process-resource-bundles) @ 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.5 #102

2023-12-12 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2477

2023-12-12 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.7 #3

2023-12-12 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-996: Pre-Vote

2023-12-12 Thread Alyssa Huang
Hello!

Thank you all for your votes. I'm closing the vote as it's been 72 hours
since the start of the voting thread.
KIP-996 has PASSED with 3 binding votes.

Best,
Alyssa

On Fri, Dec 8, 2023 at 12:58 PM Jun Rao  wrote:

> Hi, Alyssa,
>
> Thanks for the KIP. +1
>
> Jun
>
> On Fri, Dec 8, 2023 at 10:52 AM José Armando García Sancio
>  wrote:
>
> > +1.
> >
> > Thanks for the KIP. Looking forward to the implementation!
> >
> > --
> > -José
> >
>


[jira] [Resolved] (KAFKA-15111) Correction kafka examples

2023-12-12 Thread Greg Harris (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris resolved KAFKA-15111.
-
Resolution: Duplicate

> Correction kafka examples
> -
>
> Key: KAFKA-15111
> URL: https://issues.apache.org/jira/browse/KAFKA-15111
> Project: Kafka
>  Issue Type: Task
>Reporter: Dmitry
>Priority: Minor
> Fix For: 3.6.0
>
>
> Need to set TOPIC_NAME = topic1 in the KafkaConsumerProducerDemo class and 
> remove the unused TOPIC field from KafkaProperties.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.7 #2

2023-12-12 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2476

2023-12-12 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-16001) Migrate ConsumerNetworkThreadTestBuilder away from ConsumerTestBuilder

2023-12-12 Thread Lucas Brutschy (Jira)
Lucas Brutschy created KAFKA-16001:
--

 Summary: Migrate ConsumerNetworkThreadTestBuilder away from 
ConsumerTestBuilder
 Key: KAFKA-16001
 URL: https://issues.apache.org/jira/browse/KAFKA-16001
 Project: Kafka
  Issue Type: Sub-task
Reporter: Lucas Brutschy






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16000) Migrate MembershipManagerImpl away from ConsumerTestBuilder

2023-12-12 Thread Lucas Brutschy (Jira)
Lucas Brutschy created KAFKA-16000:
--

 Summary: Migrate MembershipManagerImpl away from 
ConsumerTestBuilder
 Key: KAFKA-16000
 URL: https://issues.apache.org/jira/browse/KAFKA-16000
 Project: Kafka
  Issue Type: Sub-task
Reporter: Lucas Brutschy






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15999) Migrate HeartbeatRequestManagerTest away from ConsumerTestBuilder

2023-12-12 Thread Lucas Brutschy (Jira)
Lucas Brutschy created KAFKA-15999:
--

 Summary: Migrate HeartbeatRequestManagerTest away from 
ConsumerTestBuilder
 Key: KAFKA-15999
 URL: https://issues.apache.org/jira/browse/KAFKA-15999
 Project: Kafka
  Issue Type: Sub-task
Reporter: Lucas Brutschy






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15998) EAGER rebalance onPartitionsAssigned() called with no previous onPartitionsLost() nor onPartitionsRevoked()

2023-12-12 Thread Jonathan Haapala (Jira)
Jonathan Haapala created KAFKA-15998:


 Summary: EAGER rebalance onPartitionsAssigned() called with no 
previous onPartitionsLost() nor onPartitionsRevoked()
 Key: KAFKA-15998
 URL: https://issues.apache.org/jira/browse/KAFKA-15998
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 3.4.0
Reporter: Jonathan Haapala


I ran into a case where {{onPartitionsAssigned()}} was called without first 
calling {{onPartitionsRevoked()}} and there is no indication that 
{{onPartitionsLost()}} was called or had any reason to be called. We are using 
the *EAGER* rebalance protocol and the *StickyAssignor* on kafka 3.4.0.

Our services rely on the API contract that {{{}onPartitionsRevoked(){}}}:
{quote}In eager rebalancing, it will always be called at the start of a 
rebalance and after the consumer stops fetching data.{quote}
We internally keep track of partition states with a state machine, and rely on 
these APIs to assert what expected states we are in. So when a partition is 
Revoked and then re-Assigned, we know that we kept ownership. Moreover, if we 
are assigned partitions in EAGER rebalancing, we expect that entire assignment 
is passed to {{{}onPartitionsAssigned(){}}}, because if 
{{onPartitionsRevoked()}} is always called at the start of a rebalance and 
EAGER protocol always revokes the entire assignment, then by the time we hit 
{{onPartitionsAssigned()}} there should be nothing assigned from the consumer's 
point of view, and therefore the entire assignment is newly added.
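The ownership invariant described above — under EAGER, onPartitionsRevoked() empties the assignment, so everything passed to onPartitionsAssigned() should be new — can be sketched as a tiny state tracker. This is an illustrative stand-in (tracking bare partition numbers rather than TopicPartition objects, with hypothetical method names), not the reporter's actual code:

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of the state tracking described above. Under the EAGER
// protocol, onPartitionsRevoked() is expected to clear the full assignment
// before onPartitionsAssigned() delivers the new one, so every assigned
// partition should look "new" from the tracker's point of view.
class PartitionStateTracker {
    private final Set<Integer> owned = new HashSet<>();

    // Would be called from ConsumerRebalanceListener.onPartitionsRevoked()
    void onRevoked(Collection<Integer> revoked) {
        owned.removeAll(revoked);
    }

    // Would be called from ConsumerRebalanceListener.onPartitionsAssigned().
    // Returns false if the invariant (empty ownership under EAGER) was
    // violated, which is the situation reported in this ticket.
    boolean onAssigned(Collection<Integer> assigned) {
        boolean invariantHeld = owned.isEmpty();
        owned.addAll(assigned);
        return invariantHeld;
    }

    Set<Integer> owned() {
        return new HashSet<>(owned);
    }
}
```

With this tracker, the log sequence in the ticket (a second assignment arriving with no intervening revoke) is exactly the case where onAssigned() returns false.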

However, we recently ran into a situation where we received an assignment while 
the consumer's existing assignment was non-empty:
|     *Pod*
|                                      *Message*
|
|aggregator-ff95b6cf-r2jkm|2023-12-07 23:02:25,715\{UTC} 
[KafkaConsumerAutoService] INFO  o.a.k.c.c.i.ConsumerCoordinator - [Consumer 
clientId=consumer.metric-data-points.metric-aggregator, 
groupId=metric-aggregator] Notifying assignor about the {*}new 
Assignment{*}(partitions=[topic-26, topic-44, topic-60, topic-71, topic-78, 
topic-82, topic-88, topic-101, topic-105, topic-109, topic-113, topic-117, 
topic-123, topic-130, topic-137, topic-141])|
|aggregator-ff95b6cf-r2jkm|2023-12-07 23:02:25,715\{UTC} 
[KafkaConsumerAutoService] INFO  o.a.k.c.c.i.ConsumerCoordinator - [Consumer 
clientId=consumer.metric-data-points.metric-aggregator, 
groupId=metric-aggregator] Adding {*}newly assigned partitions{*}: topic-26, 
topic-44, topic-60, topic-71, topic-78, topic-82, topic-88, topic-101, 
topic-105, topic-109, topic-113, topic-117, topic-123, topic-130, topic-137, 
topic-141|
|aggregator-ff95b6cf-r2jkm|2023-12-07 23:02:31,923\{UTC} 
[kafka-coordinator-heartbeat-thread \| metric-aggregator] INFO  
o.a.k.c.c.i.ConsumerCoordinator - [Consumer 
clientId=consumer.metric-data-points.metric-aggregator, 
groupId=metric-aggregator] *Request joining group* due to: group is already 
rebalancing|
|aggregator-ff95b6cf-r2jkm|2023-12-07 23:02:32,132\{UTC} 
[KafkaConsumerAutoService] INFO  o.a.k.c.c.i.ConsumerCoordinator - [Consumer 
clientId=consumer.metric-data-points.metric-aggregator, 
groupId=metric-aggregator] Successfully joined group with generation 
Generation\{generationId=12417, 
memberId='consumer.metric-data-points.metric-aggregator-a43be1e2-eba1-444c-96dd-ccb52cdba223',
 protocol='sticky'}|
|aggregator-ff95b6cf-r2jkm|2023-12-07 23:02:32,134\{UTC} 
[KafkaConsumerAutoService] INFO  o.a.k.c.c.i.ConsumerCoordinator - [Consumer 
clientId=consumer.metric-data-points.metric-aggregator, 
groupId=metric-aggregator] Successfully synced group in generation 
Generation{generationId={*}12417{*}, 
memberId='consumer.metric-data-points.metric-aggregator-a43be1e2-eba1-444c-96dd-ccb52cdba223',
 protocol='sticky'}|
|aggregator-ff95b6cf-r2jkm|2023-12-07 23:02:32,135\{UTC} 
[KafkaConsumerAutoService] INFO  o.a.k.c.c.i.ConsumerCoordinator - [Consumer 
clientId=consumer.metric-data-points.metric-aggregator, 
groupId=metric-aggregator] Notifying assignor about the \{*}new 
Assignment{*}(partitions=[topic-26, topic-44, topic-60, topic-71, topic-78, 
topic-82, topic-88, topic-101, topic-105, topic-109, topic-113, topic-117])|
|aggregator-ff95b6cf-r2jkm|2023-12-07 23:02:32,135\{UTC} 
[KafkaConsumerAutoService] INFO  o.a.k.c.c.i.ConsumerCoordinator - [Consumer 
clientId=consumer.metric-data-points.metric-aggregator, 
groupId=metric-aggregator] Adding {*}newly assigned partitions{*}: |

Here you can see we get assigned partitions:

  26, 44, 60, 71, 78, 82, 88, 101, 105, 109, 113, 117, 123, 130, 137, 141

And promptly see them all as newly added when passed to 
{{{}onPartitionsAssigned(){}}}. 6 seconds later the heartbeat thread notices 
another rebalance and requests to join. It quickly succeeds and then almost 
immediately successfully syncs. We then get a new assignment:

  26, 44, 60, 71, 78, 82, 88, 101, 105, 109, 113, 

Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.7 #1

2023-12-12 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-974 Docker Image for GraalVM based Native Kafka Broker

2023-12-12 Thread Krishna Agarwal
Hi Ismael,
Would you happen to have any remaining concerns regarding the selection of
the base Docker image?
Alternatively, do you have any additional suggestions or insights?

Regards,
Krishna


On Fri, Nov 24, 2023 at 1:16 AM Krishna Agarwal <
krishna0608agar...@gmail.com> wrote:

> Hi Ismael,
>
> In my pursuit of a lightweight base image, I initially considered Alpine
> and Distroless
>
>1. The next best option I explored is the Ubuntu Docker image(
>https://hub.docker.com/_/ubuntu/tags) which is a more complete image.
>It has a size of 70MB compared to the 15MB of the Alpine image
>(post-installation of glibc and bash), resulting in a difference of 55MB.
>2. To assess performance, I executed produce/consume performance
>scripts on the Kafka native Docker image using both Alpine and Ubuntu, and
>the results indicated comparable performance between the two.
>
> I wanted to check if there's any other image you'd like me to assess for
> consideration. Your input would be greatly appreciated.
>
> Regards,
> Krishna
>
> On Thu, Nov 23, 2023 at 2:31 AM Ismael Juma  wrote:
>
>> Hi Krishna,
>>
>> I am still finding it difficult to evaluate this choice. A couple of
>> things
>> would help:
>>
>> 1. How much smaller is the alpine image compared to the best alternative?
>> 2. Is there any performance impact of going with Alpine?
>>
>> Ismael
>>
>>
>> On Wed, Nov 22, 2023, 8:42 AM Krishna Agarwal <
>> krishna0608agar...@gmail.com>
>> wrote:
>>
>> > Hi Ismael,
>> > Thanks for the feedback.
>> >
>> > The alpine image does present a few drawbacks, such as the use of musl
>> libc
>> > instead of glibc, the absence of bash, and reliance on the less popular
>> > package manager "apk". Considering the advantage of a smaller image size
>> > and installing the missing packages(glibc and bash), I have proposed the
>> > alpine image as the base image. Let me know if you have any suggestions.
>> > I have added a detailed section for the same in the KIP.
>> >
>> > Regards,
>> > Krishna
>> >
>> > On Wed, Nov 22, 2023 at 8:08 PM Ismael Juma  wrote:
>> >
>> > > Hi,
>> > >
>> > > One question I have is regarding the choice to use alpine - it would
>> be
>> > > good to clarify if there are downsides (the upside was explained -
>> images
>> > > are smaller).
>> > >
>> > > Ismael
>> > >
>> > > On Fri, Sep 8, 2023, 12:17 AM Krishna Agarwal <
>> > > krishna0608agar...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hi,
>> > > > I want to submit a KIP to deliver an experimental Apache Kafka
>> docker
>> > > > image.
>> > > > The proposed docker image can launch brokers with sub-second startup
>> > time
>> > > > and minimal memory footprint by leveraging a GraalVM based native
>> Kafka
>> > > > binary.
>> > > >
>> > > > KIP-974: Docker Image for GraalVM based Native Kafka Broker
>> > > > <
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-974%3A+Docker+Image+for+GraalVM+based+Native+Kafka+Broker
>> > > > >
>> > > >
>> > > > Regards,
>> > > > Krishna
>> > > >
>> > >
>> >
>>
>


[jira] [Resolved] (KAFKA-9545) Flaky Test `RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted`

2023-12-12 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-9545.

Resolution: Fixed

> Flaky Test `RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted`
> --
>
> Key: KAFKA-9545
> URL: https://issues.apache.org/jira/browse/KAFKA-9545
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Jason Gustafson
>Assignee: Ashwin Pankaj
>Priority: Major
>
> https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/4678/testReport/org.apache.kafka.streams.integration/RegexSourceIntegrationTest/testRegexMatchesTopicsAWhenDeleted/
> {code}
> java.lang.AssertionError: Condition not met within timeout 15000. Stream 
> tasks not updated
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:26)
>   at 
> org.apache.kafka.test.TestUtils.lambda$waitForCondition$5(TestUtils.java:367)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:415)
>   at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:383)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:366)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:337)
>   at 
> org.apache.kafka.streams.integration.RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted(RegexSourceIntegrationTest.java:224)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15997) Ensure fairness in the uniform assignor

2023-12-12 Thread Emanuele Sabellico (Jira)
Emanuele Sabellico created KAFKA-15997:
--

 Summary: Ensure fairness in the uniform assignor
 Key: KAFKA-15997
 URL: https://issues.apache.org/jira/browse/KAFKA-15997
 Project: Kafka
  Issue Type: Sub-task
Reporter: Emanuele Sabellico


 

 

Fairness has to be ensured in the uniform assignor, as it was in the 
cooperative-sticky one.

There's this test, 0113 subtest u_multiple_subscription_changes, in librdkafka 
where 8 consumers subscribe to the same topic, and it verifies that all of them 
get 2 partitions assigned. But with the new protocol it seems two consumers get 
assigned 3 partitions and one gets zero partitions. The test doesn't configure 
any client.rack.


{code:java}
[0113_cooperative_rebalance  /478.183s] Consumer assignments 
(subscription_variation 0) (stabilized) (no rebalance cb):
[0113_cooperative_rebalance  /478.183s] Consumer C_0#consumer-3 assignment (2): 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [5] (2000msgs), 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [8] (4000msgs)
[0113_cooperative_rebalance  /478.183s] Consumer C_1#consumer-4 assignment (3): 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [0] (1000msgs), 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [3] (2000msgs), 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [13] (1000msgs)
[0113_cooperative_rebalance  /478.184s] Consumer C_2#consumer-5 assignment (2): 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [6] (1000msgs), 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [10] (2000msgs)
[0113_cooperative_rebalance  /478.184s] Consumer C_3#consumer-6 assignment (2): 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [7] (1000msgs), 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [9] (2000msgs)
[0113_cooperative_rebalance  /478.184s] Consumer C_4#consumer-7 assignment (2): 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [11] (1000msgs), 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [14] (3000msgs)
[0113_cooperative_rebalance  /478.184s] Consumer C_5#consumer-8 assignment (3): 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [1] (2000msgs), 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [2] (2000msgs), 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [4] (1000msgs)
[0113_cooperative_rebalance  /478.184s] Consumer C_6#consumer-9 assignment (0): 
[0113_cooperative_rebalance  /478.184s] Consumer C_7#consumer-10 assignment 
(2): rdkafkatest_rnd24419cc75e59d8de_0113u_1 [12] (1000msgs), 
rdkafkatest_rnd24419cc75e59d8de_0113u_1 [15] (1000msgs)
[0113_cooperative_rebalance  /478.184s] 16/32 partitions assigned
[0113_cooperative_rebalance  /478.184s] Consumer C_0#consumer-3 has 2 assigned 
partitions (1 subscribed topic(s)), expecting 2 assigned partitions
[0113_cooperative_rebalance  /478.184s] Consumer C_1#consumer-4 has 3 assigned 
partitions (1 subscribed topic(s)), expecting 2 assigned partitions
[0113_cooperative_rebalance  /478.184s] Consumer C_2#consumer-5 has 2 assigned 
partitions (1 subscribed topic(s)), expecting 2 assigned partitions
[0113_cooperative_rebalance  /478.184s] Consumer C_3#consumer-6 has 2 assigned 
partitions (1 subscribed topic(s)), expecting 2 assigned partitions
[0113_cooperative_rebalance  /478.184s] Consumer C_4#consumer-7 has 2 assigned 
partitions (1 subscribed topic(s)), expecting 2 assigned partitions
[0113_cooperative_rebalance  /478.184s] Consumer C_5#consumer-8 has 3 assigned 
partitions (1 subscribed topic(s)), expecting 2 assigned partitions
[0113_cooperative_rebalance  /478.184s] Consumer C_6#consumer-9 has 0 assigned 
partitions (1 subscribed topic(s)), expecting 2 assigned partitions
[0113_cooperative_rebalance  /478.184s] Consumer C_7#consumer-10 has 2 assigned 
partitions (1 subscribed topic(s)), expecting 2 assigned partitions
[                      /479.057s] 1 test(s) running: 
0113_cooperative_rebalance
[                      /480.057s] 1 test(s) running: 
0113_cooperative_rebalance
[                      /481.057s] 1 test(s) running: 
0113_cooperative_rebalance
[0113_cooperative_rebalance  /482.498s] TEST FAILURE
### Test "0113_cooperative_rebalance (u_multiple_subscription_changes:2390: 
use_rebalance_cb: 0, subscription_variation: 0)" failed at 
test.c:1243:check_test_timeouts() at Thu Dec  7 15:52:15 2023: ###
Test 0113_cooperative_rebalance (u_multiple_subscription_changes:2390: 
use_rebalance_cb: 0, subscription_variation: 0) timed out (timeout set to 480 
seconds)
./run-test.sh: line 62: 3512920 Killed                  $TEST $ARGS
###
### Test ./test-runner in bare mode FAILED! (return code 137) ###
###{code}
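For reference, the balance property the test asserts — 16 partitions over 8 consumers, 2 each — amounts to a spread in which no consumer holds more than ceil(P/C) partitions and none holds fewer than floor(P/C). A minimal round-robin sketch of that property (not the actual uniform assignor implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the balance property only: distribute P partitions
// over C consumers round-robin, so consumer sizes differ by at most one.
// The real uniform assignor must also weigh stickiness and rack awareness.
class BalancedAssignmentSketch {
    static List<List<Integer>> assign(int partitions, int consumers) {
        List<List<Integer>> assignment = new ArrayList<>();
        for (int c = 0; c < consumers; c++) {
            assignment.add(new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            assignment.get(p % consumers).add(p);
        }
        return assignment;
    }
}
```

For 16 partitions and 8 consumers this yields exactly 2 partitions per consumer, the outcome the librdkafka test expects.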
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2475

2023-12-12 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-877: Mechanism for plugins and connectors to register metrics

2023-12-12 Thread Mickael Maison
Hi,

I've not received any feedback since I updated the KIP.
I'll wait a few more days and if there's no further feedback I'll start a vote.

Thanks,
Mickael
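As a rough illustration of the updated proposal — plugins own their PluginMetrics instance and close it from their own stop/close method — here is a hedged sketch. The interface and method names below are simplified stand-ins, not the exact KIP-877 API:

```java
// Hypothetical stand-ins: the real KIP-877 PluginMetrics/Monitorable API may
// differ. The point is the lifecycle: the runtime hands the plugin a
// PluginMetrics instance, and the plugin closes it when it stops.
interface PluginMetrics extends AutoCloseable {
    void addMetric(String name, double value);

    @Override
    void close(); // no checked exception, so callers can close unconditionally
}

class MyConnectorPlugin {
    private PluginMetrics metrics;

    // Analogous to the proposed Monitorable hook: the runtime injects the
    // metrics handle when the plugin is instantiated.
    void withPluginMetrics(PluginMetrics metrics) {
        this.metrics = metrics;
        this.metrics.addMetric("my-plugin-started", 1.0);
    }

    // Per the updated KIP, the plugin (not the runtime) is responsible for
    // releasing its metrics when it is stopped or closed.
    void stop() {
        if (metrics != null) {
            metrics.close();
        }
    }
}
```

This is the trade-off discussed below: simpler runtime changes, in exchange for plugin authors remembering to close the handle.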

On Tue, Nov 7, 2023 at 6:29 PM Mickael Maison  wrote:
>
> Hi,
>
> A bit later than initially planned I finally restarted looking at this KIP.
>
> I made a few significant changes to the proposed APIs.
> I considered Chris' idea of automatically removing metrics but decided
> to leave that responsibility to the plugins. All plugins that will
> support this feature have close/stop methods and will need to close
> their PluginMetrics instance. This simplifies the required changes a
> lot and I think it's not a big ask on users implementing plugins.
>
> Thanks,
> Mickael
>
> On Tue, May 30, 2023 at 11:32 AM Mickael Maison
>  wrote:
> >
> > Hi Jorge,
> >
> > There are a few issues with the current proposal. Once 3.5 is out, I
> > plan to start looking at this again.
> >
> > Thanks,
> > Mickael
> >
> > On Mon, May 15, 2023 at 3:19 PM Jorge Esteban Quilcate Otoya
> >  wrote:
> > >
> > > Hi Mickael,
> > >
> > > Just to check the status of this KIP as it looks very useful. I can see 
> > > how
> > > new Tiered Storage interfaces and plugins may benefit from this.
> > >
> > > Cheers,
> > > Jorge.
> > >
> > > On Mon, 6 Feb 2023 at 23:00, Chris Egerton  
> > > wrote:
> > >
> > > > Hi Mickael,
> > > >
> > > > I agree that adding a getter method for Monitorable isn't great. A few
> > > > alternatives come to mind:
> > > >
> > > > 1. Introduce a new ConfiguredInstance (name subject to change) 
> > > > interface
> > > > that wraps an instance of type T, but also contains a getter method for 
> > > > any
> > > > PluginMetrics instances that the plugin was instantiated with (which may
> > > > return null either if no PluginMetrics instance could be created for the
> > > > plugin, or if it did not implement the Monitorable interface). This can 
> > > > be
> > > > the return type of the new AbstractConfig::getConfiguredInstance 
> > > > variants.
> > > > It would give us room to move forward with other plugin-for-your-plugin
> > > > style interfaces without cluttering things up with getter methods. We 
> > > > could
> > > > even add a close method to this interface which would handle cleanup of 
> > > > all
> > > > extra resources allocated for the plugin by the runtime, and even 
> > > > possibly
> > > > the plugin itself.
> > > >
> > > > 2. Break out the instantiation logic into two separate steps. The first
> > > > step, creating a PluginMetrics instance, can be either private or public
> > > > API. The second step, providing that PluginMetrics instance to a
> > > > newly-created object, can be achieved with a small tweak of the proposed
> > > > new methods for the AbstractConfig class; instead of accepting a Metrics
> > > > instance, they would now accept a PluginMetrics instance. For the first
> > > > step, we might even introduce a new CloseablePluginMetrics interface 
> > > > which
> > > > would be the return type of whatever method we use to create the
> > > > PluginMetrics instance. We can track that CloseablePluginMetrics 
> > > > instance
> > > > in tandem with the plugin it applies to, and close it when we're done 
> > > > with
> > > > the plugin.
> > > >
> > > > I know that this adds some complexity to the API design and some
> > > > bookkeeping responsibilities for our implementation, but I can't shake 
> > > > the
> > > > feeling that if we don't feel comfortable taking on the responsibility 
> > > > to
> > > > clean up these resources ourselves, it's not really fair to ask users to
> > > > handle it for us instead. And with the case of Connect, sometimes 
> > > > Connector
> > > > or Task instances that are scheduled for shutdown block for a while, at
> > > > which point we abandon them and bring up new instances in their place; 
> > > > it'd
> > > > be nice to have a way to forcibly clear out all the metrics allocated by
> > > > that Connector or Task instance before bringing up a new one, in order 
> > > > to
> > > > prevent issues due to naming conflicts.
> > > >
> > > > Regardless, and whether or not it ends up being relevant to this KIP, 
> > > > I'd
> > > > love to see a new Converter::close method. It's irked me for quite a 
> > > > while
> > > > that we don't have one already.
> > > >
> > > > Cheers,
> > > >
> > > > Chris
> > > >
> > > > On Mon, Feb 6, 2023 at 1:50 PM Mickael Maison 
> > > > wrote:
> > > >
> > > > > Hi Chris,
> > > > >
> > > > > I envisioned plugins to be responsible for closing the PluginMetrics
> > > > > instance. This is mostly important for Connect connector plugins as
> > > > > they can be closed while the runtime keeps running (and keeps its
> > > > > Metrics instance). As far as I can tell, other plugins should only be
> > > > > closed when their runtime closes, so we should not be leaking metrics
> > > > > even if those don't explicitly call close().
> > > > >
> > > > > For Connect plugin, as you said, it would be nice 

[DISCUSS] Kafka Connect source task interruption semantics

2023-12-12 Thread Chris Egerton
Hi all,

I'd like to solicit input from users and maintainers on a problem we've
been dealing with for source task cleanup logic.

If you'd like to pore over some Jira history, here's the primary link:
https://issues.apache.org/jira/browse/KAFKA-15090

To summarize, we accidentally introduced a breaking change for Kafka
Connect in https://github.com/apache/kafka/pull/9669. Before that change,
the SourceTask::stop method [1] would be invoked on a separate thread from
the one that did the actual data processing for the task (polling the task
for records, transforming and converting those records, then sending them
to Kafka). After that change, we began invoking SourceTask::stop on the
same thread that handled data processing for the task. This had the effect
that tasks which blocked indefinitely in the SourceTask::poll method [2]
with the expectation that they could stop blocking when SourceTask::stop
was invoked would no longer be capable of graceful shutdown, and may even
hang forever.

This breaking change was introduced in the 3.0.0 release, a little over two
years ago. Since then, source connectors may have been modified to adapt to
the change in behavior by the Connect framework. As a result, we are
hesitant to go back to the prior logic of invoking SourceTask::stop on a
separate thread (see the linked Jira ticket for more detail on this front).

In https://github.com/apache/kafka/pull/14316, I proposed that we begin
interrupting the data processing thread for the source task after it had
exhausted its graceful shutdown timeout (i.e., when the Kafka Connect
runtime decides to cancel [3], [4], [5] the task). I believe this change is
fairly non-controversial--once a task has failed to shut down gracefully,
the runtime can and should do whatever it wants to force a shutdown,
graceful or otherwise.

With all that context out of the way, the question I'd like to ask is: do
we believe it's also appropriate to interrupt the data processing thread
when the task is scheduled for shutdown [6], [7]? This interruption would
ideally be followed up by a graceful shutdown of the task, which may
require the Kafka Connect runtime to handle a potential
InterruptedException from SourceTask::poll. Other exceptions (such as a
wrapped InterruptedException) would be impossible to handle gracefully, and
may lead to spurious error messages in the logs and failed final offset
commits for connectors that do not work well with this new behavior.

Finally, one important note: in the official documentation for
SourceTask::poll, we do already state that this method should not block for
too long:

> If no data is currently available, this method should block but return
control to the caller regularly (by returning null) in order for the task
to transition to the PAUSED state if requested to do so.
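A task that honors this contract blocks only in bounded intervals and can also treat interruption as a shutdown signal. Here is a rough, self-contained sketch (a plain-Java stand-in, not the actual SourceTask API — a real task would return List<SourceRecord> and be driven by the Connect worker):

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Rough illustration (not Connect framework code) of a poll() that blocks
// only in bounded intervals, returning null so the framework regularly
// regains control, and that surfaces InterruptedException so a cancelled
// task's data processing thread can be interrupted cleanly.
class InterruptiblePollExample {
    private final BlockingQueue<String> records = new LinkedBlockingQueue<>();
    private volatile boolean stopping = false;

    // Shape mirrors SourceTask.poll(): return a batch, or null if no data
    // is currently available.
    List<String> poll() throws InterruptedException {
        if (stopping) {
            return null;
        }
        // Bounded wait: return control to the caller regularly instead of
        // blocking indefinitely, as the SourceTask.poll() javadoc advises.
        String record = records.poll(100, TimeUnit.MILLISECONDS);
        return record == null ? null : List.of(record);
    }

    void stop() {
        stopping = true;
    }

    void enqueue(String record) {
        records.add(record);
    }
}
```

A task written this way needs no interruption at all to shut down: the worst-case delay before it observes the stop flag is one bounded wait.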

Looking forward to everyone's thoughts on this tricky issue!

Cheers,

Chris

[1] -
https://kafka.apache.org/36/javadoc/org/apache/kafka/connect/source/SourceTask.html#stop()
[2] -
https://kafka.apache.org/36/javadoc/org/apache/kafka/connect/source/SourceTask.html#poll()
[3] -
https://github.com/apache/kafka/blob/c5ee82cab447b094ad7491200afa319515d5f467/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L1037
[4] -
https://github.com/apache/kafka/blob/c5ee82cab447b094ad7491200afa319515d5f467/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java#L129-L136
[5] -
https://github.com/apache/kafka/blob/c5ee82cab447b094ad7491200afa319515d5f467/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractWorkerSourceTask.java#L284-L297
[6] -
https://github.com/apache/kafka/blob/c5ee82cab447b094ad7491200afa319515d5f467/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L1014
[7] -
https://github.com/apache/kafka/blob/c5ee82cab447b094ad7491200afa319515d5f467/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java#L112-L127


Re: KIP-993: Allow restricting files accessed by File and Directory ConfigProviders

2023-12-12 Thread Gantigmaa Selenge
Thanks Chris. I have added that as well.

I think I can start the vote on this KIP if there are no further comments.

Thanks.
Regards,
Tina

On Tue, Dec 12, 2023 at 1:43 PM Chris Egerton 
wrote:

> Thanks Tina! LGTM as long as we take care to document that recursive access
> to directories will be granted when we release this feature.
>
> On Tue, Dec 12, 2023 at 8:16 AM Gantigmaa Selenge 
> wrote:
>
> > Hi Chris,
> >
> > Thank you for the feedback.
> >
> >
> > 1.  Addressed
> >
> >
> > 2. I have updated the type to be List. The configure() function is more
> > likely to process the value as String and convert to List using the comma
> > separation but I think it makes sense to specify it as List, as that is
> the
> > final type.
> >
> >
> > 3. Yes, it's mentioned under the Public Interfaces section but I also
> added
> > another sentence to make it clearer.
> >
> >
> > 4. Yes, I have just tested this to confirm and it looks like "/" gives
> > access to the entire file system.
> >
> >
> > Thanks.
> > Regards,
> > Tina
> >
> >
> >
> >
> > On Thu, Dec 7, 2023 at 2:58 PM Chris Egerton 
> > wrote:
> >
> > > Hi Tina,
> > >
> > > Thanks for the KIP! Looks good overall. A few minor thoughts:
> > >
> > > 1. We can remove the "This page is meant as a template for writing a
> KIP"
> > > section from the beginning.
> > >
> > > 2. The type of the allowed.paths property is string in the KIP, but the
> > > description mentions it'll contain multiple comma-separated paths.
> > > Shouldn't it be described as a list? Or are we calling it a string in
> > order
> > > to allow for escape syntax for directories that may contain the
> delimiter
> > > character (e.g., ',')?
> > >
> > > 3. I'm guessing the answer is yes but I want to make sure--will users
> be
> > > allowed to specify files in the allowed.paths property?
> > >
> > > 4. Again, guessing the answer is yes but to make sure--if a directory
> is
> > > specified in the allowed.paths property, will all files (nested or
> > > otherwise) be accessible by the config provider? E.g., if I set
> > > allowed.paths to "/", then everything on the entire file system would
> be
> > > accessible, instead of just the files directly inside the root
> directory.
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Thu, Dec 7, 2023 at 9:33 AM Gantigmaa Selenge 
> > > wrote:
> > >
> > > > Thank you Mickael.
> > > >
> > > > I'm going to leave the discussion thread open for a couple more days
> > and
> > > if
> > > > there are no further comments, I would like to start the vote for
> this
> > > KIP.
> > > >
> > > > Thanks.
> > > > Regards,
> > > > Tina
> > > >
> > > > On Wed, Dec 6, 2023 at 10:06 AM Mickael Maison <
> > mickael.mai...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm not aware of any other mechanisms to explore the filesystem. If
> > > > > you have ideas, please reach out to the security list.
> > > > >
> > > > > Thanks,
> > > > > Mickael
> > > > >
> > > > > On Tue, Dec 5, 2023 at 1:05 PM Gantigmaa Selenge <
> > gsele...@redhat.com>
> > > > > wrote:
> > > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > >
> > > > > > Apologies for the very delayed response. Thank you both for the
> > > > feedback.
> > > > > >
> > > > > >
> > > > > > > For clarity it might make sense to mention this feature will be
> > > > useful
> > > > > >
> > > > > > when using a ConfigProvider with Kafka Connect as providers are
> set
> > > in
> > > > > >
> > > > > > the runtime and can then be used by connectors. This feature has
> no
> > > > > >
> > > > > > use when using a ConfigProvider in server.properties or in
> clients.
> > > > > >
> > > > > >
> > > > > > I have updated the KIP to address this suggestion. Please let me
> > know
> > > > if
> > > > > > it's not clear enough.
> > > > > >
> > > > > >
> > > > > > > When trying to use a path not allowed, you propose returning an
> > > > error.
> > > > > >
> > > > > > With Connect does that mean the connector will be failed? The
> > > > > >
> > > > > > EnvVarConfigProvider returns empty string in case a user tries to
> > > > > >
> > > > > > access an environment variable not allowed. I wonder if we should
> > > > > >
> > > > > > follow the same pattern so the behavior is "consistent" across
> all
> > > > > >
> > > > > > built-in providers.
> > > > > >
> > > > > >
> > > > > > I agree with this, it makes sense to have consistent behaviour
> > across
> > > > all
> > > > > > the providers. I made this update.
> > > > > >
> > > > > >
> > > > > > > 1. In the past Connect removed the FileStream connectors in
> order
> > > to
> > > > > >
> > > > > > prevent a REST API attacker from accessing the filesystem. Is
> this
> > > the
> > > > > >
> > > > > > only remaining attack vector for reading the file system?
> Meaning,
> > if
> > > > > >
> > > > > > this feature is configured and all custom plugins are audited for
> > > > > >
> > > > > > filesystem accesses, would someone with access to the REST API be
> > > > > >
> > > > > > 

Re: Apache Kafka 3.7.0 Release

2023-12-12 Thread Stanislav Kozlovski
Hey!

Just notifying everybody on this thread that I have cut the 3.7 branch and
sent a new email thread titled "New Release Branch 3.7" to the mailing list.

Best,
Stanislav

On Wed, Dec 6, 2023 at 11:10 AM Stanislav Kozlovski 
wrote:

> Hello again,
>
> Time is flying by! It is feature freeze day!
>
> By today, we expect to have major features merged and to begin working on
> their stabilisation. Minor features should have PRs.
>
> I am planning to cut the release branch soon - on Monday EU daytime. When
> I do that, I will create a new e-mail thread titled "New release branch
> 3.7.0" to notify you, so be on the lookout for that. I will also notify
> this thread.
>
> Thank you for your contributions. Let's get this release shipped!
>
> Best,
> Stanislav
>
>
> On Fri, Nov 24, 2023 at 6:11 PM Stanislav Kozlovski <
> stanis...@confluent.io> wrote:
>
>> Hey all,
>>
>> The KIP Freeze has passed. I count 31 KIPs that will be going into the
>> 3.7 Release. Thank you all for your hard work!
>>
>> They are the following (some of these were accepted in previous releases
>> and have minor parts going out, some targeting a Preview release and the
>> rest being fully released as regular.):
>>  - KIP-1000: List Client Metrics Configuration Resources
>>  - KIP-1001: Add CurrentControllerId Metric
>>  - KIP-405: Kafka Tiered Storage
>>  - KIP-580: Exponential Backoff for Kafka Clients
>>  - KIP-714: Client metrics and observability
>>  - KIP-770: Replace "buffered.records.per.partition" &
>> "cache.max.bytes.buffering" with
>> "{statestore.cache}/{input.buffer}.max.bytes"
>>  - KIP-848: The Next Generation of the Consumer Rebalance Protocol
>>  - KIP-858: Handle JBOD broker disk failure in KRaft
>>  - KIP-890: Transactions Server-Side Defense
>>  - KIP-892: Transactional StateStores
>>  - KIP-896: Remove old client protocol API versions in Kafka 4.0 -
>> metrics/request log changes to identify deprecated apis
>>  - KIP-925: Rack aware task assignment in Kafka Streams
>>  - KIP-938: Add more metrics for measuring KRaft performance
>>  - KIP-951 - Leader discovery optimizations for the client
>>  - KIP-954: expand default DSL store configuration to custom types
>>  - KIP-959: Add BooleanConverter to Kafka Connect
>>  - KIP-960: Single-key single-timestamp IQv2 for state stores
>>  - KIP-963: Additional metrics in Tiered Storage
>>  - KIP-968: Support single-key_multi-timestamp Interactive Queries (IQv2)
>> for Versioned State Stores
>>  - KIP-970: Deprecate and remove Connect's redundant task configurations
>> endpoint
>>  - KIP-975: Docker Image for Apache Kafka
>>  - KIP-976: Cluster-wide dynamic log adjustment for Kafka Connect
>>  - KIP-978: Allow dynamic reloading of certificates with different DN /
>> SANs
>>  - KIP-979: Allow independently stop KRaft processes
>>  - KIP-980: Allow creating connectors in a stopped state
>>  - KIP-985: Add reverseRange and reverseAll query over kv-store in IQv2
>>  - KIP-988: Streams Standby Update Listener
>>  - KIP-992: Proposal to introduce IQv2 Query Types: TimestampedKeyQuery
>> and TimestampedRangeQuery
>>  - KIP-998: Give ProducerConfig(props, doLog) constructor protected access
>>
>> Notable KIPs that didn't make the Freeze were KIP-977 - it only got 2/3
>> votes.
>>
>> For the full list and latest source of truth, refer to the Release Plan
>> 3.7.0 Document.
>>
>> Thanks for your contributions once again!
>> Best,
>> Stan
>>
>>
>> On Thu, Nov 23, 2023 at 2:27 PM Nick Telford 
>> wrote:
>>
>>> Hi Stan,
>>>
>>> I'd like to propose including KIP-892 in the 3.7 release. The KIP has
>>> been
>>> accepted and I'm just working on rebasing the implementation against
>>> trunk
>>> before I open a PR.
>>>
>>> Regards,
>>> Nick
>>>
>>> On Tue, 21 Nov 2023 at 11:27, Mayank Shekhar Narula <
>>> mayanks.nar...@gmail.com> wrote:
>>>
>>> > Hi Stan
>>> >
>>> > Can you include KIP-951 to the 3.7 release plan? All PRs are merged in
>>> the
>>> > trunk.
>>> >
>>> > On Wed, Nov 15, 2023 at 4:05 PM Stanislav Kozlovski
>>> >  wrote:
>>> >
>>> > > Friendly reminder to everybody that the KIP Freeze is *exactly 7 days
>>> > away*
>>> > > - November 22.
>>> > >
>>> > > A KIP must be accepted by this date in order to be considered for
>>> this
>>> > > release. Note, any KIP that may not be implemented in time, or
>>> otherwise
>>> > > risks heavily destabilizing the release, should be deferred.
>>> > >
>>> > > Best,
>>> > > Stan
>>> > >
>>> > > On Fri, Nov 3, 2023 at 6:03 AM Sophie Blee-Goldman <
>>> > sop...@responsive.dev>
>>> > > wrote:
>>> > >
>>> > > > Looks great, thank you! +1
>>> > > >
>>> > > > On Thu, Nov 2, 2023 at 10:21 AM David Jacot
>>> > >> > > >
>>> > > > wrote:
>>> > > >
>>> > > > > +1 from me as well. Thanks, Stan!
>>> > > > >
>>> > > > > David
>>> > > > >
>>> > > > > On Thu, Nov 2, 2023 at 6:04 PM Ismael Juma 
>>> > wrote:
>>> > > > >
>>> > > 

New Release Branch 3.7

2023-12-12 Thread Stanislav Kozlovski
Hello Kafka developers and friends,

As promised, we now have a release branch for 3.7 release.
Trunk is being bumped to 3.8.0-SNAPSHOT (please help review the PR).

I'll be going over the JIRAs to move every non-blocker from this
release to the next release.

From this point, most changes should go to trunk.
*- Blockers (existing and new that we discover while testing the
release) will be double-committed.*
*- Please discuss with your reviewer whether your PR should go to
trunk or to trunk+release so they can merge accordingly.*
*- Please help us test the release!*

Thanks!

-- 
Best,
Stanislav


Re: KIP-993: Allow restricting files accessed by File and Directory ConfigProviders

2023-12-12 Thread Chris Egerton
Thanks Tina! LGTM as long as we take care to document that recursive access
to directories will be granted when we release this feature.

On Tue, Dec 12, 2023 at 8:16 AM Gantigmaa Selenge 
wrote:

> Hi Chris,
>
> Thank you for the feedback.
>
>
> 1.  Addressed
>
>
> 2. I have updated the type to be List. The configure() function is more
> likely to process the value as String and convert to List using the comma
> separation but I think it makes sense to specify it as List, as that is the
> final type.
>
>
> 3. Yes, it's mentioned under the Public Interfaces section but I also added
> another sentence to make it clearer.
>
>
> 4. Yes, I have just tested this to confirm and it looks like "/" gives
> access to the entire file system.
>
>
> Thanks.
> Regards,
> Tina
>
>
>
>
> On Thu, Dec 7, 2023 at 2:58 PM Chris Egerton 
> wrote:
>
> > Hi Tina,
> >
> > Thanks for the KIP! Looks good overall. A few minor thoughts:
> >
> > 1. We can remove the "This page is meant as a template for writing a KIP"
> > section from the beginning.
> >
> > 2. The type of the allowed.paths property is string in the KIP, but the
> > description mentions it'll contain multiple comma-separated paths.
> > Shouldn't it be described as a list? Or are we calling it a string in
> order
> > to allow for escape syntax for directories that may contain the delimiter
> > character (e.g., ',')?
> >
> > 3. I'm guessing the answer is yes but I want to make sure--will users be
> > allowed to specify files in the allowed.paths property?
> >
> > 4. Again, guessing the answer is yes but to make sure--if a directory is
> > specified in the allowed.paths property, will all files (nested or
> > otherwise) be accessible by the config provider? E.g., if I set
> > allowed.paths to "/", then everything on the entire file system would be
> > accessible, instead of just the files directly inside the root directory.
> >
> > Cheers,
> >
> > Chris
> >
> > On Thu, Dec 7, 2023 at 9:33 AM Gantigmaa Selenge 
> > wrote:
> >
> > > Thank you Mickael.
> > >
> > > I'm going to leave the discussion thread open for a couple more days
> and
> > if
> > > there are no further comments, I would like to start the vote for this
> > KIP.
> > >
> > > Thanks.
> > > Regards,
> > > Tina
> > >
> > > On Wed, Dec 6, 2023 at 10:06 AM Mickael Maison <
> mickael.mai...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm not aware of any other mechanisms to explore the filesystem. If
> > > > you have ideas, please reach out to the security list.
> > > >
> > > > Thanks,
> > > > Mickael
> > > >
> > > > On Tue, Dec 5, 2023 at 1:05 PM Gantigmaa Selenge <
> gsele...@redhat.com>
> > > > wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > >
> > > > > Apologies for the very delayed response. Thank you both for the
> > > feedback.
> > > > >
> > > > >
> > > > > > For clarity it might make sense to mention this feature will be
> > > useful
> > > > >
> > > > > when using a ConfigProvider with Kafka Connect as providers are set
> > in
> > > > >
> > > > > the runtime and can then be used by connectors. This feature has no
> > > > >
> > > > > use when using a ConfigProvider in server.properties or in clients.
> > > > >
> > > > >
> > > > > I have updated the KIP to address this suggestion. Please let me
> know
> > > if
> > > > > it's not clear enough.
> > > > >
> > > > >
> > > > > > When trying to use a path not allowed, you propose returning an
> > > error.
> > > > >
> > > > > With Connect does that mean the connector will be failed? The
> > > > >
> > > > > EnvVarConfigProvider returns empty string in case a user tries to
> > > > >
> > > > > access an environment variable not allowed. I wonder if we should
> > > > >
> > > > > follow the same pattern so the behavior is "consistent" across all
> > > > >
> > > > > built-in providers.
> > > > >
> > > > >
> > > > > I agree with this, it makes sense to have consistent behaviour
> across
> > > all
> > > > > the providers. I made this update.
> > > > >
> > > > >
> > > > > > 1. In the past Connect removed the FileStream connectors in order
> > to
> > > > >
> > > > > prevent a REST API attacker from accessing the filesystem. Is this
> > the
> > > > >
> > > > > only remaining attack vector for reading the file system? Meaning,
> if
> > > > >
> > > > > this feature is configured and all custom plugins are audited for
> > > > >
> > > > > filesystem accesses, would someone with access to the REST API be
> > > > >
> > > > > unable to access arbitrary files on disk?
> > > > >
> > > > >
> > > > > Once this feature is configured, it will stop someone from
> accessing
> > > the
> > > > > file system via config providers.
> > > > >
> > > > > However, I’m not sure whether there are other ways users can access
> > > file
> > > > > systems via REST API.
> > > > >
> > > > >
> > > > > Mickael, perhaps you have some thoughts on this?
> > > > >
> > > > >
> > > > > > 2. Could you explain how this feature would prevent a path
> > traversal
> > > > >
> > > > > 

Re: [DISCUSS] KIP-995: Allow users to specify initial offsets while creating connectors

2023-12-12 Thread Chris Egerton
Hi Ashwin,

LGTM! One small adjustment I'd suggest but we don't have to block on--it
may be clearer to put "Wipe all existing offsets for the connector" in
between steps 2 and 3 of the proposed changes section.

Cheers,

Chris

On Mon, Dec 11, 2023 at 11:24 PM Ashwin 
wrote:

> Thanks for pointing this out Chris.
>
> I have updated the KIP with the correct sequence of steps.
>
> Thanks,
> Ashwin
>
> On Wed, Dec 6, 2023 at 11:48 PM Chris Egerton 
> wrote:
>
> > Hi Ashwin,
> >
> > Regarding point 4--I think this is still a bit unwise. When workers pick
> up
> > a new connector config from the config topic, they participate in a
> > rebalance. It may be safe to write offsets during that rebalance, but in
> > the name of simplicity, do you think we can write the offsets for the
> > connector before its config? The sequence of steps would be something
> like
> > this:
> >
> > 1. Validate offsets and connector config (can be done in any order)
> > 2. Write offsets
> > 3. Write connector config (with whatever initial state is specified in
> the
> > request, or the default if none is specified)
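A sketch of that proposed sequence, with every store and method name invented purely for illustration (none of this is actual herder code) — the point is only the ordering: both validations complete before any write happens, and the offsets land before the config:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "create connector with initial offsets":
// validate everything first, then write offsets, then write the config.
// The ops list records the call order so the sequencing is observable.
public class CreateWithOffsetsSketch {
    static final List<String> ops = new ArrayList<>();

    static void validateConfig(Map<String, String> config)        { ops.add("validate-config"); }
    static void validateOffsets(Map<String, ?> offsets)           { ops.add("validate-offsets"); }
    static void writeOffsets(String name, Map<String, ?> offsets) { ops.add("write-offsets"); }
    static void writeConfig(String name, Map<String, String> cfg) { ops.add("write-config"); }

    static void createConnector(String name, Map<String, String> config, Map<String, ?> offsets) {
        validateConfig(config);       // step 1: both validations up front,
        validateOffsets(offsets);     //         so a bad request writes nothing
        writeOffsets(name, offsets);  // step 2: offsets before config
        writeConfig(name, config);    // step 3: config last, triggering the rebalance
    }

    public static void main(String[] args) {
        createConnector("my-connector", Map.of(), Map.of());
        System.out.println(ops);
        // [validate-config, validate-offsets, write-offsets, write-config]
    }
}
```

Because the config write comes last, the workers only rebalance once the offsets are already in place.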
> >
> > Cheers,
> >
> > Chris
> >
> > On Wed, Dec 6, 2023 at 9:13 AM Ashwin 
> > wrote:
> >
> > > Hello Chris,
> > >
> > > Thanks for the quick and detailed review. Please see my responses below
> > >
> > > High-level thoughts:
> > >
> > > 1. I had not thought of this till now, thanks for pointing it out. I
> > > would lean towards the second option of cleaning previous offsets as
> > > it will result in the fewer surprises for the user.
> > >
> > > 2 and 3. I agree and have updated the wiki to that effect. I just
> > > wanted to use the connector name as a mutex - we can handle the race
> > > in other ways.
> > >
> > > 4. Yes, I meant submitting config to the config topic. Have updated
> > > the wiki to make it clearer.
> > >
> > >
> > > Nits:
> > >
> > > Thanks for pointing these - I have made the necessary changes.
> > >
> > >
> > > Thanks again,
> > >
> > > Ashwin
> > >
> > >
> > > On Mon, Dec 4, 2023 at 9:05 PM Chris Egerton 
> > > wrote:
> > >
> > > > Hi Ashwin,
> > > >
> > > > Thanks for the KIP! This would be a nice simplification to the
> process
> > > for
> > > > migrating connectors enabled by KIP-980, and would also add global
> > > support
> > > > for a feature I've seen implemented by hand in at least a couple
> > > connectors
> > > > over the years.
> > > >
> > > >
> > > > High-level thoughts:
> > > >
> > > > 1. If there are leftover offsets from a previous connector, what will
> > the
> > > > impact on those offsets (if any) be if a new connector is created
> with
> > > the
> > > > same name with initial offsets specified? I can think of at least two
> > > > options: we leave those offsets as-are but allow any of the initial
> > > offsets
> > > > in the new connector request to overwrite them, or we automatically
> > wipe
> > > > all existing offsets for the connector first before writing its
> initial
> > > > offsets and then creating it. I have a slight preference for the
> first
> > > > because it's simpler to implement and aligns with existing precedent
> > for
> > > > offset handling where we never wipe them unless explicitly requested
> by
> > > the
> > > > user or connector, but it could be argued that the second is less
> > likely
> > > to
> > > > generate footguns for users. Interested in your thoughts!
> > > >
> > > > 2. IMO preflight validation (i.e., the "Validate initial_offset before
> > > > creating the connector in STOPPED state" rejected alternative) is a
> > > > must-have for this kind of feature. I acknowledge the possible race
> > > > condition (and I don't think it's worth it to track in-flight
> connector
> > > > creation requests in the herder in order to prevent this race, since
> > > > ever-blocking Connector operations would be cumbersome to deal with),
> > and
> > > > the extra implementation effort. But I don't think either of those
> tip
> > > the
> > > > scales far enough to override the benefit of ensuring the submitted
> > > offsets
> > > > are valid before creating the connector.
> > > >
> > > > 3. On the topic of preflight validation--I also think it's important
> > that
> > > > we validate both the connector config and the initial offsets before
> > > either
> > > > creating the connector or storing the initial offsets. I don't think
> > this
> > > > point requires any changes to the KIP that aren't already proposed
> with
> > > > point 2. above, but wanted to see if we could adopt this as a primary
> > > goal
> > > > for the design of the feature and keep it in mind with future
> changes.
> > > >
> > > > 4. In the proposed changes section, the first step is "Create a
> > connector
> > > > in STOPPED state", and the second step is "Validate the offset...".
> > What
> > > > exactly is entailed by "Create a connector"? Submit the config to the
> > > > config topic (presumably after a preflight validation of the config)?
> > > > Participate in the ensuing rebalance? I'm a 

Re: KIP-993: Allow restricting files accessed by File and Directory ConfigProviders

2023-12-12 Thread Gantigmaa Selenge
Hi Chris,

Thank you for the feedback.


1.  Addressed


2. I have updated the type to be List. The configure() function is more
likely to process the value as String and convert to List using the comma
separation but I think it makes sense to specify it as List, as that is the
final type.


3. Yes, it's mentioned under the Public Interfaces section but I also added
another sentence to make it clearer.


4. Yes, I have just tested this to confirm and it looks like "/" gives
access to the entire file system.
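For illustration, the comma-separated form from point 2 maps onto the final List type with a simple split. This is only a sketch of the parsing step — in Kafka's actual config machinery a LIST-typed ConfigDef property would handle it:

```java
import java.util.Arrays;
import java.util.List;

// Illustrative parsing of a comma-separated allowed.paths value into the
// List form discussed above; trims whitespace and drops empty entries.
public class AllowedPathsParsing {
    public static List<String> parse(String raw) {
        if (raw == null || raw.isBlank()) return List.of();
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(parse("/etc/kafka, /var/run/secrets"));
        // [/etc/kafka, /var/run/secrets]
    }
}
```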


Thanks.
Regards,
Tina




On Thu, Dec 7, 2023 at 2:58 PM Chris Egerton 
wrote:

> Hi Tina,
>
> Thanks for the KIP! Looks good overall. A few minor thoughts:
>
> 1. We can remove the "This page is meant as a template for writing a KIP"
> section from the beginning.
>
> 2. The type of the allowed.paths property is string in the KIP, but the
> description mentions it'll contain multiple comma-separated paths.
> Shouldn't it be described as a list? Or are we calling it a string in order
> to allow for escape syntax for directories that may contain the delimiter
> character (e.g., ',')?
>
> 3. I'm guessing the answer is yes but I want to make sure--will users be
> allowed to specify files in the allowed.paths property?
>
> 4. Again, guessing the answer is yes but to make sure--if a directory is
> specified in the allowed.paths property, will all files (nested or
> otherwise) be accessible by the config provider? E.g., if I set
> allowed.paths to "/", then everything on the entire file system would be
> accessible, instead of just the files directly inside the root directory.
>
> Cheers,
>
> Chris
>
> On Thu, Dec 7, 2023 at 9:33 AM Gantigmaa Selenge 
> wrote:
>
> > Thank you Mickael.
> >
> > I'm going to leave the discussion thread open for a couple more days and
> if
> > there are no further comments, I would like to start the vote for this
> KIP.
> >
> > Thanks.
> > Regards,
> > Tina
> >
> > On Wed, Dec 6, 2023 at 10:06 AM Mickael Maison  >
> > wrote:
> >
> > > Hi,
> > >
> > > I'm not aware of any other mechanisms to explore the filesystem. If
> > > you have ideas, please reach out to the security list.
> > >
> > > Thanks,
> > > Mickael
> > >
> > > On Tue, Dec 5, 2023 at 1:05 PM Gantigmaa Selenge 
> > > wrote:
> > > >
> > > > Hi everyone,
> > > >
> > > >
> > > > Apologies for the very delayed response. Thank you both for the
> > feedback.
> > > >
> > > >
> > > > > For clarity it might make sense to mention this feature will be
> > useful
> > > >
> > > > when using a ConfigProvider with Kafka Connect as providers are set
> in
> > > >
> > > > the runtime and can then be used by connectors. This feature has no
> > > >
> > > > use when using a ConfigProvider in server.properties or in clients.
> > > >
> > > >
> > > > I have updated the KIP to address this suggestion. Please let me know
> > if
> > > > it's not clear enough.
> > > >
> > > >
> > > > > When trying to use a path not allowed, you propose returning an
> > error.
> > > >
> > > > With Connect does that mean the connector will be failed? The
> > > >
> > > > EnvVarConfigProvider returns empty string in case a user tries to
> > > >
> > > > access an environment variable not allowed. I wonder if we should
> > > >
> > > > follow the same pattern so the behavior is "consistent" across all
> > > >
> > > > built-in providers.
> > > >
> > > >
> > > > I agree with this, it makes sense to have consistent behaviour across
> > all
> > > > the providers. I made this update.
> > > >
> > > >
> > > > > 1. In the past Connect removed the FileStream connectors in order
> to
> > > >
> > > > prevent a REST API attacker from accessing the filesystem. Is this
> the
> > > >
> > > > only remaining attack vector for reading the file system? Meaning, if
> > > >
> > > > this feature is configured and all custom plugins are audited for
> > > >
> > > > filesystem accesses, would someone with access to the REST API be
> > > >
> > > > unable to access arbitrary files on disk?
> > > >
> > > >
> > > > Once this feature is configured, it will stop someone from accessing
> > the
> > > > file system via config providers.
> > > >
> > > > However, I’m not sure whether there are other ways users can access
> > file
> > > > systems via REST API.
> > > >
> > > >
> > > > Mickael, perhaps you have some thoughts on this?
> > > >
> > > >
> > > > > 2. Could you explain how this feature would prevent a path
> traversal
> > > >
> > > > attack, and how we will verify that such attacks are not feasible?
> > > >
> > > >
> > > > The intention is to generate File objects based on the String value
> > > > provided for allowed.paths and the String path passed to the get()
> > > function.
> > > >
> > > > This would allow validation of path inclusion within the specified
> > > allowed
> > > > paths using their corresponding Path objects, rather than doing
> String
> > > > comparisons.
> > > >
> > > > This hopefully will mitigate the risk of path traversal. The
> > > implementation
> > > > should include 
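A containment check in that spirit — resolving both sides to normalized Path objects before comparing components — might look like the following sketch. It is illustrative only, not the KIP's implementation, and a real version would also need to consider symlinks (e.g. via toRealPath()):

```java
import java.nio.file.Path;

// Illustrative allowed-paths check: normalize both paths so that ".."
// segments cannot escape the allowed directory, then compare path
// elements (Path.startsWith compares components, not raw string prefixes).
public class AllowedPathsSketch {
    public static boolean isAllowed(String allowed, String requested) {
        Path allowedPath = Path.of(allowed).toAbsolutePath().normalize();
        Path requestedPath = Path.of(requested).toAbsolutePath().normalize();
        return requestedPath.startsWith(allowedPath);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("/etc/kafka", "/etc/kafka/secrets/password")); // true: nested file
        System.out.println(isAllowed("/etc/kafka", "/etc/kafka/../passwd"));        // false: ".." escapes
        System.out.println(isAllowed("/", "/anything/at/all"));                     // true: "/" grants everything
    }
}
```

Note the last case matches the observation in this thread that allowing "/" effectively grants access to the entire file system.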

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2474

2023-12-12 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-15996) json serializer too slow

2023-12-12 Thread WangMinChao (Jira)
WangMinChao created KAFKA-15996:
---

 Summary: json serializer too slow
 Key: KAFKA-15996
 URL: https://issues.apache.org/jira/browse/KAFKA-15996
 Project: Kafka
  Issue Type: Improvement
  Components: connect
Affects Versions: 3.5.1
Reporter: WangMinChao


I use `org.apache.kafka.connect.json.JsonConverter` as the Debezium default 
serializer, but the JSON serializer and deserializer are too slow.

 

Related link:  https://issues.redhat.com/browse/DBZ-7240



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-969: Support range interactive queries for versioned state stores

2023-12-12 Thread Bruno Cadonna

Hi Alieh,

I just realized that your KIP uses allKeys() and KIP-997 uses 
withAllKeys() and the current RangeQuery uses withNoBounds(). We should 
agree on one of those.


I am in favor of withAllKeys() but I am also fine with 
withNoKeyBounds(). I just prefer the former because it is more concise.


Best,
Bruno

On 12/12/23 9:08 AM, Bruno Cadonna wrote:

Hi Alieh,

I think using TimestampedRangeQuery to query the latest versions is 
totally fine. If it is not, users will report it and we can add it then.


Best,
Bruno

On 12/11/23 6:22 PM, Alieh Saeedi wrote:

Thank you all.
I decided to remove the ordering from the KIP and maybe move it to the
subsequent KIPs (based on user demand).
I skimmed over the discussion thread, but we still had an open question
about how a user can retrieve the `latest()` values. I think what Matthias
suggested (using `TimestampedRangeQuery`) can be the solution. What do you
think?

Bests,
Alieh

On Wed, Dec 6, 2023 at 1:57 PM Lucas Brutschy
 wrote:


Hi Alieh,

I think we do not have to restrict ourselves too much for the future
and complicate the implementation. The user can always store away and
sort, so we should only provide the ordering guarantee we can provide
efficiently, and we shouldn't restrict our future evolution too much
by this. I think a global ordering by timestamp is sufficient for this
KIP, so I vote for option 2.

Cheers,
Lucas

On Fri, Dec 1, 2023 at 8:45 PM Alieh Saeedi
 wrote:


Hi all,
I updated the KIP based on the changes made in the former KIP (KIP-968). So
with the `ResultOrder` enum, the class `MultiVersionedRangeQuery` had some
changes both in the defined fields and defined methods.

Based on the PoC PR, what we currently promise in the KIP about ordering
seems like a dream. I intended to enable the user to have a global ordering
based on the key or timestamp (using `orderByKey()` or
`orderByTimestamp()`) and then even have a partial ordering based on the
other parameter. For example, if the user specifies
`orderByKey().withDescendingKey().withAscendingTimestamps()`, then the
global ordering is based on keys in a descending order, and then all the
records with the same key are ordered ascendingly based on ts. The result
will be something like (k3,v1,t1), (k3,v2,t2), (k2,v2,t3), (k1,v1,t1)
(assuming that k1 < k2 < k3).

But in reality, we have limitations for having a global ordering based on
keys since we are iterating over the segments in a lazy manner. Therefore,
when we are processing the current segment, we have no knowledge of the
keys in the next segment.

Now I have two suggestions:
1. Changing the `MultiVersionedRangeQuery` class as follows:

private final ResultOrder *segmentOrder*;
private final contentOrder *segmentContentOrder*; // can be KEY_WISE or TIMESTAMP_WISE
private final ResultOrder  *keyOrder*;
private final ResultOrder *timestampOrder*;

This way, the global ordering is specified by the `segmentOrder`. It means
we either show the results from the oldest to the latest segment
(ASCENDING) or from the latest to the oldest segment (DESCENDING).
Then, inside each segment, we guarantee a `segmentContentOrder` which can
be `KEY_WISE` or `TIMESTAMP_WISE`. The key order and timestamp order are
specified by the `keyOrder` and `timestampOrder` properties, respectively.
If the content of a segment must be ordered key-wise and then we have two
records with the same key (it happens in older segments), then the
`timestampOrder` determines the order between them.

2. We define that global ordering can only be based on timestamps (the
`timestampOrder` property), and if two records have the same timestamp,
the `keyOrder` determines the order between them.

I think the first suggestion gives more flexibility to the user, but it is
more complicated. I mean, it needs good Javadocs.
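To make suggestion 1's example ordering concrete, the promised result order for `orderByKey().withDescendingKey().withAscendingTimestamps()` can be sketched with a plain comparator over (key, value, timestamp) records. This illustrates only the target ordering, not the proposed query API or the lazy segment-wise evaluation that makes it hard to guarantee:

```java
import java.util.Comparator;
import java.util.List;

// Illustrates a global descending order on keys, with ties between
// versions of the same key broken by ascending timestamp.
public class OrderingSketch {
    record VersionedRecord(String key, String value, long timestamp) {}

    public static List<VersionedRecord> order(List<VersionedRecord> records) {
        return records.stream()
                .sorted(Comparator.comparing(VersionedRecord::key).reversed()
                        .thenComparingLong(VersionedRecord::timestamp))
                .toList();
    }

    public static void main(String[] args) {
        order(List.of(
                new VersionedRecord("k1", "v1", 1),
                new VersionedRecord("k3", "v2", 2),
                new VersionedRecord("k2", "v2", 3),
                new VersionedRecord("k3", "v1", 1)))
            .forEach(System.out::println);
        // order: k3@t1, k3@t2, k2@t3, k1@t1
        // i.e. (k3,v1,t1), (k3,v2,t2), (k2,v2,t3), (k1,v1,t1)
    }
}
```

Producing this order in one pass is exactly what lazy segment iteration prevents, since keys in later segments are unknown while the current segment streams out.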

I look forward to your ideas.

Cheers,
Alieh


On Mon, Nov 6, 2023 at 3:08 PM Alieh Saeedi 

wrote:



Thank you, Bruno and Matthias.
I updated the KIP as follows:
1. The one remaining `asOf` word in the KIP is removed.
2. Example 2 is updated. Thanks, Bruno for the correction.

Discussions and open questions
1. Yes, Bruno. We need `orderByKey()` and `orderByTimestamp()` as 
well.
Because the results must have a global ordering. Either based on 
key or

based on ts. For example, we can have
`orderByKey().withDescendingKey().withAscendingTimestamps()`. Then the
global ordering is based on keys in a descending order, and then all

the

records with the same key are ordered ascendingly based on ts. The

result

will be something like (k3,v1,t1), (k3,v2,t2), (k3,v1,t1), (k2,v2,t2),
(k1.v1.t1) (assuming that k1
yet.

Adding a new class or ignoring `latest()` for VersionedRangeQuery and
instead using the `TimestampedRangeQuery` as Matthias suggested.

Cheers,
Alieh

On Sat, Nov 4, 2023 at 1:38 AM Matthias J. Sax 

wrote:



Great discussion. Seems we are making good progress.

I see advantages and disadvantages in splitting out a "single-ts
key-range" 

[VOTE] KIP-1007: Introduce Remote Storage Not Ready Exception

2023-12-12 Thread Kamal Chandraprakash
Hi,

I would like to call a vote for KIP-1007:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1007%3A+Introduce+Remote+Storage+Not+Ready+Exception
This KIP aims to introduce a new error code for retriable remote storage
errors. Thanks to everyone who reviewed the KIP!

--
Kamal
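For readers skimming the thread: in Kafka's Java client, errors that are safe to retry are modeled as subclasses of a retriable base exception, which is what lets callers back off and retry generically. The sketch below is self-contained and illustrative only — `RetriableException` here is a local stand-in for `org.apache.kafka.common.errors.RetriableException`, and the exact class name and placement chosen by KIP-1007 may differ.

```java
public class RemoteStorageDemo {

    // Local stand-in for Kafka's marker base class for retriable errors.
    static class RetriableException extends RuntimeException {
        RetriableException(String message) { super(message); }
    }

    // The kind of error KIP-1007 proposes: remote storage is not ready *yet*,
    // so the caller should back off and retry instead of failing the fetch.
    static class RemoteStorageNotReadyException extends RetriableException {
        RemoteStorageNotReadyException(String message) { super(message); }
    }

    // Callers branch on the marker type, not on each concrete error class.
    static boolean isRetriable(Throwable t) {
        return t instanceof RetriableException;
    }

    public static void main(String[] args) {
        Throwable t = new RemoteStorageNotReadyException("remote log metadata not ready");
        System.out.println("retriable=" + isRetriable(t));
    }
}
```

The point of the marker hierarchy is that existing retry loops pick up the new error code without modification.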


Re: [DISCUSS] KIP-1007: Introduce Remote Storage Not Ready Exception

2023-12-12 Thread Kamal Chandraprakash
Thanks Luke for reviewing this KIP!

If there are no more comments from others, I'll start the VOTE since this
is a minor KIP.

On Mon, Dec 11, 2023 at 1:01 PM Luke Chen  wrote:

> Hi Kamal,
>
> Thanks for the KIP!
> LGTM.
>
> Thanks.
> Luke
>
> On Wed, Nov 22, 2023 at 7:28 PM Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
> > Hi,
> >
> > I would like to start a discussion to introduce a new error code for
> > retriable remote storage errors. Please take a look at the proposal:
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1007%3A+Introduce+Remote+Storage+Not+Ready+Exception
> >
>


[jira] [Created] (KAFKA-15995) Mechanism for plugins and connectors to register metrics

2023-12-12 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-15995:
--

 Summary: Mechanism for plugins and connectors to register metrics
 Key: KAFKA-15995
 URL: https://issues.apache.org/jira/browse/KAFKA-15995
 Project: Kafka
  Issue Type: New Feature
Reporter: Mickael Maison
Assignee: Mickael Maison


Ticket for 
[KIP-877|https://cwiki.apache.org/confluence/display/KAFKA/KIP-877%3A+Mechanism+for+plugins+and+connectors+to+register+metrics]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] Apache Kafka 3.5.2

2023-12-12 Thread Dima Brodsky
Hello,

Is there a "git" issue with 3.5.2? When I look at GitHub I see the 3.5.2
tag, but if I make the repo an upstream remote target I don't see 3.5.2.
Any ideas what could be up?

Thanks!
ttyl
Dima
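One possible explanation (an assumption about this case, not a confirmed diagnosis): a plain `git fetch` only auto-follows tags that point into the histories of the branches it fetches, so a release tag cut on a branch you did not fetch stays invisible until tags are fetched explicitly. A reproducible sketch with a throwaway repository:

```shell
set -eu
tmp=$(mktemp -d)

# Build a toy "upstream" whose 3.5.2 tag lives on a release branch, not main.
git init -q -b main "$tmp/up"
git -C "$tmp/up" -c user.email=a@b -c user.name=a commit -q --allow-empty -m base
git -C "$tmp/up" checkout -q -b 3.5
git -C "$tmp/up" -c user.email=a@b -c user.name=a commit -q --allow-empty -m rel
git -C "$tmp/up" tag 3.5.2
git -C "$tmp/up" checkout -q main

# Clone only main: the tag points at an unfetched commit, so it is absent.
git clone -q --single-branch -b main "$tmp/up" "$tmp/local"
git -C "$tmp/local" tag -l 3.5.2

# Fetching tags explicitly brings it in.
git -C "$tmp/local" fetch -q origin --tags
git -C "$tmp/local" tag -l 3.5.2
```

So `git fetch upstream --tags` (or fetching the `refs/tags/*` refspec) is worth trying before suspecting the release itself.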


On Mon, Dec 11, 2023 at 3:36 AM Luke Chen  wrote:

> The Apache Kafka community is pleased to announce the release for
> Apache Kafka 3.5.2
>
> This is a bugfix release. It contains many bug fixes including
> upgrades the Snappy and Rocksdb dependencies.
>
> All of the changes in this release can be found in the release notes:
> https://www.apache.org/dist/kafka/3.5.2/RELEASE_NOTES.html
>
>
> You can download the source and binary release from:
> https://kafka.apache.org/downloads#3.5.2
>
>
> ---
>
>
> Apache Kafka is a distributed streaming platform with four core APIs:
>
>
> ** The Producer API allows an application to publish a stream of records to
> one or more Kafka topics.
>
> ** The Consumer API allows an application to subscribe to one or more
> topics and process the stream of records produced to them.
>
> ** The Streams API allows an application to act as a stream processor,
> consuming an input stream from one or more topics and producing an
> output stream to one or more output topics, effectively transforming the
> input streams to output streams.
>
> ** The Connector API allows building and running reusable producers or
> consumers that connect Kafka topics to existing applications or data
> systems. For example, a connector to a relational database might
> capture every change to a table.
>
>
> With these APIs, Kafka can be used for two broad classes of application:
>
> ** Building real-time streaming data pipelines that reliably get data
> between systems or applications.
>
> ** Building real-time streaming applications that transform or react
> to the streams of data.
>
>
> Apache Kafka is in use at large and small companies worldwide, including
> Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
> Target, The New York Times, Uber, Yelp, and Zalando, among others.
>
> A big thank you for the following contributors to this release!
>
> A. Sophie Blee-Goldman, atu-sharm, bachmanity1, Calvin Liu, Chase
> Thomas, Chris Egerton, Colin Patrick McCabe, David Arthur, Divij
> Vaidya, Federico Valeri, flashmouse, Florin Akermann, Greg Harris,
> hudeqi, José Armando García Sancio, Levani Kokhreidze, Lucas Brutschy,
> Luke Chen, Manikumar Reddy, Matthias J. Sax, Mickael Maison, Nick
> Telford, Okada Haruki, Omnia G.H Ibrahim, Robert Wagner, Rohan, Said
> Boudjelda, sciclon2, Vincent Jiang, Xiaobing Fang, Yash Mayya
>
> We welcome your help and feedback. For more information on how to
> report problems, and to get involved, visit the project website at
> https://kafka.apache.org/
>
> Thank you!
>
> Regards,
> Luke
>


-- 
ddbrod...@gmail.com

"The price of reliability is the pursuit of the utmost simplicity.
It is a price which the very rich find the most hard to pay."
(Sir Antony Hoare, 1980)


Re: [DISCUSS] KIP-969: Support range interactive queries for versioned state stores

2023-12-12 Thread Bruno Cadonna

Hi Alieh,

I think using TimestampedRangeQuery to query the latest versions is 
totally fine. If it is not, users will report it and we can add it then.


Best,
Bruno

On 12/11/23 6:22 PM, Alieh Saeedi wrote:

Thank you all.
I decided to remove the ordering from the KIP and maybe move it to the
subsequent KIPs (based on user demand).
I skimmed over the discussion thread, but we still had an open question
about how a user can retrieve the `latest()` values. I think what Matthias
suggested (using `TimestampedRangeQuery`) can be the solution. What do you
think?
Bests,
Alieh

On Wed, Dec 6, 2023 at 1:57 PM Lucas Brutschy
 wrote:


Hi Alieh,

I think we do not have to restrict ourselves too much for the future
and complicate the implementation. The user can always store away and
sort, so we should only provide the ordering guarantee we can provide
efficiently, and we shouldn't restrict our future evolution too much
by this. I think a global ordering by timestamp is sufficient for this
KIP, so I vote for option 2.

Cheers,
Lucas

On Fri, Dec 1, 2023 at 8:45 PM Alieh Saeedi
 wrote:


Hi all,
I updated the KIP based on the changes made in the former KIP (KIP-968). So
with the `ResultOrder` enum, the class `MultiVersionedRangeQuery` had some
changes both in the defined fields and defined methods.

Based on the PoC PR, what we currently promise in the KIP about ordering
seems like a dream. I intended to enable the user to have a global ordering
based on the key or timestamp (using `orderByKey()` or
`orderByTimestamp()`) and then even have a partial ordering based on the
other parameter.  For example, if the user specifies
`orderByKey().withDescendingKey().withAscendingTimestamps()`, then the
global ordering is based on keys in a descending order, and then all the
records with the same key are ordered ascendingly based on ts. The result
will be something like (k3,v1,t1), (k3,v2,t2), (k2,v2,t3), (k1,v1,t1)
(assuming that k1 < k2 < k3). But in reality, we have limitations for
having a global ordering based on keys since we are iterating over the
segments in a lazy manner. Therefore, when we are processing the current
segment, we have no knowledge of the keys in the next segment.
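The limitation described above fits in a few lines: if segments are consumed lazily, records can only be emitted in segment order, so a key living in a later segment may sort before keys that were already emitted. A self-contained toy illustration (the segment contents are invented; nothing here is Kafka Streams code):

```java
import java.util.List;
import java.util.stream.Stream;

public class LazySegmentsDemo {

    // Lazily concatenating segments pins the global order to segment order.
    static List<String> emitLazily(List<String> older, List<String> newer) {
        return Stream.concat(older.stream(), newer.stream()).toList();
    }

    public static void main(String[] args) {
        List<String> olderSegment = List.of("k2", "k9");
        List<String> newerSegment = List.of("k1", "k5"); // unknown while olderSegment streams

        // k2 and k9 are emitted before k1 is ever seen; a true global key
        // order would need k1 first, which is impossible without buffering
        // (or pre-reading) every segment.
        System.out.println(emitLazily(olderSegment, newerSegment));
    }
}
```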

Now I have two suggestions:
1. Changing the `MultiVersionedRangeQuery` class as follows:

private final ResultOrder *segmentOrder*;
private final contentOrder *segmentContentOrder*; // can be KEY_WISE or
TIMESTAMP_WISE
private final ResultOrder  *keyOrder*;
private final ResultOrder *timestampOrder*;

This way, the global ordering is specified by the `segmentOrder`. It means
we either show the results from the oldest to the latest segment
(ASCENDING) or from the latest to the oldest segment (DESCENDING).
Then, inside each segment, we guarantee a `segmentContentOrder` which can
be `KEY_WISE` or `TIMESTAMP_WISE`. The key order and timestamp order are
specified by the `keyOrder` and `timestampOrder` properties, respectively.
If the content of a segment must be ordered key-wise and then we have two
records with the same key (it happens in older segments), then the
`timestampOrder` determines the order between them.

2. We define that global ordering can only be based on timestamps (the
`timestampOrder` property), and if two records have the same timestamp,
the `keyOrder` determines the order between them.
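To make the intended semantics concrete, here is a small, self-contained sketch of the key-wise ordering with a timestamp tie-breaker that the email's example assumes. `Rec`, `ResultOrder`, and the method names are illustrative stand-ins, not the API proposed in the KIP:

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class OrderingDemo {

    enum ResultOrder { ASCENDING, DESCENDING }

    record Rec(String key, String value, long ts) {}

    // Key-wise ordering with the timestamp order as a tie-breaker for
    // records sharing a key.
    static Comparator<Rec> keyWise(ResultOrder keyOrder, ResultOrder tsOrder) {
        Comparator<String> k = keyOrder == ResultOrder.ASCENDING
                ? Comparator.naturalOrder() : Comparator.<String>naturalOrder().reversed();
        Comparator<Long> t = tsOrder == ResultOrder.ASCENDING
                ? Comparator.naturalOrder() : Comparator.<Long>naturalOrder().reversed();
        return Comparator.comparing(Rec::key, k).thenComparing(Rec::ts, t);
    }

    static List<Rec> sorted(List<Rec> in, Comparator<Rec> c) {
        return in.stream().sorted(c).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Descending keys, ascending timestamps per key, as in the example:
        // the resulting order is (k3,t1), (k3,t2), (k2,t3), (k1,t1).
        List<Rec> out = sorted(
                List.of(new Rec("k1", "v1", 1), new Rec("k3", "v2", 2),
                        new Rec("k2", "v2", 3), new Rec("k3", "v1", 1)),
                keyWise(ResultOrder.DESCENDING, ResultOrder.ASCENDING));
        System.out.println(out);
    }
}
```

Note this comparator is only applicable per segment (or after buffering), which is exactly the tension the two suggestions try to resolve.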

I think the first suggestion gives more flexibility to the user, but it is
more complicated. I mean, it needs good Javadocs.

I look forward to your ideas.

Cheers,
Alieh


On Mon, Nov 6, 2023 at 3:08 PM Alieh Saeedi 

wrote:



Thank you, Bruno and Matthias.
I updated the KIP as follows:
1. The one remaining `asOf` word in the KIP is removed.
2. Example 2 is updated. Thanks, Bruno for the correction.

Discussions and open questions
1. Yes, Bruno. We need `orderByKey()` and `orderByTimestamp()` as well.
Because the results must have a global ordering. Either based on key or
based on ts. For example, we can have
`orderByKey().withDescendingKey().withAscendingTimestamps()`. Then the
global ordering is based on keys in a descending order, and then all the
records with the same key are ordered ascendingly based on ts. The result
will be something like (k3,v1,t1), (k3,v2,t2), (k3,v1,t1), (k2,v2,t2),
(k1,v1,t1) (assuming that k1 < k2 < k3)
yet.

Adding a new class or ignoring `latest()` for VersionedRangeQuery and
instead using the `TimestampedRangeQuery` as Matthias suggested.

Cheers,
Alieh

On Sat, Nov 4, 2023 at 1:38 AM Matthias J. Sax 

wrote:



Great discussion. Seems we are making good progress.

I see advantages and disadvantages in splitting out a "single-ts
key-range" query type. I guess, the main question might be, what
use-cases do we anticipate and how common we believe they are?

We should also take KIP-992 (adding `TimestampedRangeQuery`) into account.


(1) The most common use case seems to be a key-range over latest. For
this one, we could use `TimestampedRangeQuery` -- it would return a
`ValueAndTimestamp` instead of a `VersionedRecord`, but there won't
be any information loss, because "toTimestamp" would be `null` anyway.
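The "no information loss" argument in (1) can be sketched with local stand-ins for the two result types. These records are illustrative only, not the real `org.apache.kafka.streams` classes:

```java
public class LatestMappingDemo {

    // Minimal stand-ins for the two result shapes discussed above.
    record ValueAndTimestamp<V>(V value, long timestamp) {}
    record VersionedRecord<V>(V value, long timestamp, Long validTo) {}

    // For the latest version of a key, the validity interval is open-ended,
    // i.e. validTo ("toTimestamp") is null -- so a ValueAndTimestamp carries
    // exactly the same information and the mapping loses nothing.
    static <V> VersionedRecord<V> asVersioned(ValueAndTimestamp<V> latest) {
        return new VersionedRecord<>(latest.value(), latest.timestamp(), null);
    }

    public static void main(String[] args) {
        VersionedRecord<String> r = asVersioned(new ValueAndTimestamp<>("v1", 42L));
        System.out.println(r.value() + " @" + r.timestamp() + " validTo=" + r.validTo());
    }
}
```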


(2) The open question is, how common is a 

Re: [REVIEW REQUEST] ConsumerGroupCommand move to tools

2023-12-12 Thread Николай Ижиков
Hello.

Please take the time to help me move ConsumerGroupCommand to tools.
The first patch in the series is waiting for review.


> On Dec 4, 2023, at 11:07, Николай Ижиков wrote:
> 
> Hello.
> 
> Dear Kafka committers.
> As far as I know, we are moving Kafka from Scala to Java.
> 
> I prepared a series of PRs to move `ConsumerGroupCommand` from Scala to Java.
> From my experience, right now, Mickael Maison and Justine Olshan are
> overwhelmed with other activities and can only review my patches from time
> to time.
> 
> So, if any Kafka committer has a chance to review simple patches and keep
> up the pace of moving Scala code to Java, please do.
> Help me move these commands to tools.
> 
> Looks like all the code is prepared and a quality review is all we need :).
> 
>> On Nov 30, 2023, at 13:28, Николай Ижиков wrote:
>> 
>> Hello.
>> 
>> I prepared a PR to move `ConsumerGroupCommand` to the `tools` module.
>> The full change is pretty huge [1], so I split it into several PRs.
>> 
>> Please, take a look at first PR in the series - 
>> https://github.com/apache/kafka/pull/14856
>> 
>> PR adds `ConsumerGroupCommandOptions` class and other command case classes 
>> to `tools` module.
>> 
>> [1] https://github.com/apache/kafka/pull/14471
>> 
>