Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.6 #25

2023-09-05 Thread Apache Jenkins Server
See 




Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2176

2023-09-05 Thread Apache Jenkins Server
See 




Re: Unable to start the Kafka with Kraft in Windows 11

2023-09-05 Thread ziming deng
It seems this is related to KAFKA-14273; there is already a PR for this 
problem, but it has not been merged yet:
https://github.com/apache/kafka/pull/12763

--
Ziming

> On Sep 6, 2023, at 07:25, Greg Harris  wrote:
> 
> Hey Sumanshu,
> 
> Thanks for trying out Kraft! I hope that you can get it working :)
> 
> I am not familiar with Kraft or Windows, but the error appears to
> mention that the file is already in use by another process so maybe we
> can start there.
> 
> 1. Have you verified that no other Kafka processes are running, such
> as in the background or in another terminal?
> 2. Are you setting up multiple Kafka brokers on the same machine in your test?
> 3. Do you see the error if you restart your machine before starting Kafka?
> 4. Do you see the error if you delete the log directory and format it
> again before starting Kafka?
> 5. Have you made any changes to the `server.properties`, such as
> changing the log directories? (I see that the default is
> `/tmp/kraft-combined-logs`, I don't know if that is a valid path for
> Windows).
> 
> Thanks,
> Greg
> 
> On Mon, Sep 4, 2023 at 2:21 PM Sumanshu Nankana
>  wrote:
>> 
>> Hi Team,
>> 
>> I am following the steps mentioned here https://kafka.apache.org/quickstart 
>> to Install the Kafka.
>> 
>> Windows 11
>> Kafka Version 
>> https://www.apache.org/dyn/closer.cgi?path=/kafka/3.5.0/kafka_2.13-3.5.0.tgz
>> 64 Bit Operating System
>> 
>> 
>> Step1: Generate the Cluster UUID
>> 
>> $KAFKA_CLUSTER_ID=.\bin\windows\kafka-storage.bat random-uuid
>> 
>> Step2: Format Log Directories
>> 
>> .\bin\windows\kafka-storage.bat format -t $KAFKA_CLUSTER_ID -c 
>> .\config\kraft\server.properties
>> 
>> Step3: Start the Kafka Server
>> 
>> .\bin\windows\kafka-server-start.bat .\config\kraft\server.properties
>> 
>> I am getting the error. Logs are attached
>> 
>> Could you please help me to sort this error.
>> 
>> Kindly let me know, if you need any more information.
>> 
>> -
>> Best
>> Sumanshu Nankana
>> 
>> 


Re: Re: Re: Re: [DISCUSS] KIP-971 Expose replication-offset-lag MirrorMaker2 metric

2023-09-05 Thread hudeqi
Hey, Viktor.
As far as my implementation is concerned, the default setting is 30s, but I 
added it as a setting in `MirrorConnectorConfig`, so it can be adjusted freely 
according to the load of the source cluster and the number of tasks.

best,
hudeqi
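
For illustration only, here is a rough Java sketch of the scheduled-refresh approach 
described in this thread (class and field names are made up, and a dedicated consumer 
against the source cluster is assumed, since KafkaConsumer is not thread-safe); it is 
not the actual MirrorMaker 2 implementation:

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;

public class OffsetLagTracker implements AutoCloseable {
    private final Consumer<byte[], byte[]> sourceConsumer;          // dedicated consumer for end-offset lookups
    private final Map<TopicPartition, Long> lastReplicatedOffsets;  // updated by the mirror task on commit
    private final Map<TopicPartition, Long> lagByPartition = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public OffsetLagTracker(Consumer<byte[], byte[]> sourceConsumer,
                            Map<TopicPartition, Long> lastReplicatedOffsets,
                            long refreshIntervalMs) {                // e.g. the 30s default mentioned above
        this.sourceConsumer = sourceConsumer;
        this.lastReplicatedOffsets = lastReplicatedOffsets;
        scheduler.scheduleAtFixedRate(this::refresh, refreshIntervalMs, refreshIntervalMs, TimeUnit.MILLISECONDS);
    }

    private void refresh() {
        // One listOffsets round trip for all tracked partitions, instead of one per poll.
        Map<TopicPartition, Long> endOffsets =
                sourceConsumer.endOffsets(lastReplicatedOffsets.keySet(), Duration.ofSeconds(10));
        endOffsets.forEach((tp, leo) ->
                lagByPartition.put(tp, Math.max(0, leo - lastReplicatedOffsets.getOrDefault(tp, leo))));
    }

    public Map<TopicPartition, Long> currentLag() {
        return lagByPartition;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```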

Viktor Somogyi-Vass viktor.somo...@cloudera.com.INVALID wrote:
> Hey Elkhan and hudeqi,
> 
> I'm reading your debate around the implementation. I also think a
> scheduled task would be better in overall accuracy and performance
> (compared to calling endOffsets with every poll).
> Hudeqi, do you have any experience of what works best for you in terms of
> time intervals? I would think refreshing the metric every 5-10sec would be
> overall good and sufficient for the users (as short intervals can be quite
> noisy anyways).
> 
> Best,
> Viktor
> 
> On Mon, Sep 4, 2023 at 11:41 AM hudeqi <16120...@bjtu.edu.cn> wrote:
> 
> > My approach is to create another thread to regularly request and update
> > the end offset of each partition for the `keySet` in the collection
> > `lastReplicatedSourceOffsets` mentioned by your KIP (if there is no update
> > for a long time, it will be removed from `lastReplicatedSourceOffsets`).
> > Obviously, such processing makes the calculation of the partition offset
> > lag less real-time and less accurate.
> > But this also meets our needs, because we need the partition offset lag to
> > analyze the replication performance of the task and which task may have
> > performance problems; and if you monitor the overall offset lag of the
> > topic, then using the
> > "kafka_consumer_consumer_fetch_manager_metrics_records_lag" metric will be
> > more real-time and accurate.
> > This is my suggestion. I hope this rough idea can spark better ones, so that
> > we can come up with a better solution.
> >
> > best,
> > hudeqi
> >
> > Elxan Eminov elxanemino...@gmail.com wrote:
> > > @huqedi replying to your comment on the PR (
> > > https://github.com/apache/kafka/pull/14077#discussion_r1314592488),
> > quote:
> > >
> > > "I guess we have a disagreement about lag? My understanding of lag is:
> > the
> > > real LEO of the source cluster partition minus the LEO that has been
> > > written to the target cluster. It seems that your definition of lag is:
> > the
> > > lag between the mirror task getting data from consumption and writing it
> > to
> > > the target cluster?"
> > >
> > > Yes, this is the case. I've missed the fact that the consumer itself
> > might
> > > be lagging behind the actual data in the partition.
> > > I believe your definition of the lag is more precise, but:
> > > Implementing it this way will come at the cost of an extra listOffsets
> > > request, introducing the overhead that you mentioned in your initial
> > > comment.
> > >
> > > If you have enough insights about this, what would you say are the chances
> > > of the task consumer lagging behind the LEO of the partition?
> > > Are they big enough to justify the extra call to listOffsets?
> > > @Viktor,  any thoughts?
> > >
> > > Thanks,
> > > Elkhan
> > >
> > > On Mon, 4 Sept 2023 at 09:36, Elxan Eminov 
> > wrote:
> > >
> > > > I already have the PR for this so if it will make it easier to discuss,
> > > > feel free to take a look: https://github.com/apache/kafka/pull/14077
> > > >
> > > > On Mon, 4 Sept 2023 at 09:17, hudeqi <16120...@bjtu.edu.cn> wrote:
> > > >
> > > >> But doesn't the offset of the last `ConsumerRecord` obtained in poll only
> > > >> represent the offset of that record in the source cluster? It seems
> > > >> that it cannot represent the LEO of the source cluster for this partition.
> > > >> I understand that the offset lag introduced here should be the LEO of the
> > > >> source cluster minus the offset of the last record to be polled?
> > > >>
> > > >> best,
> > > >> hudeqi
> > > >>
> > > >>
> > > >>  -Original Message-
> > > >>  From: "Elxan Eminov" 
> > > >>  Sent: 2023-09-04 14:52:08 (Monday)
> > > >>  To: dev@kafka.apache.org
> > > >>  Cc:
> > > >>  Subject: Re: [DISCUSS] KIP-971 Expose replication-offset-lag
> > MirrorMaker2
> > > >> metric
> > > >> 
> > > >> 
> > > >
> > > >
> >


Re: KIP-976: Cluster-wide dynamic log adjustment for Kafka Connect

2023-09-05 Thread Sagar
Hey Chris,

Thanks for making the updates.

The updated definition of last_modified looks good to me. As a continuation
of point number 2, could we also mention that this field could be used to get
insight into the propagation of cluster-wide log level updates? It is
implicit, but I feel it is probably better to state it explicitly.

Regarding

It's a little indirect on the front of worker restart behavior, so let me
> know if that especially should be fleshed out a bit more (possibly by
> calling it out in the "Config topic records" section).


Yeah, I would lean toward calling it out explicitly. Since the
behaviour is similar to the current dynamically set log levels (i.e.
resetting to the levels in the log4j config files), we can call out that
similarity just for completeness' sake. It would be useful information for
new and intermediate users reading the KIP, considering that worker restarts
are not uncommon.


Thanks, I'm glad that this seems reasonable. I've tentatively added this to
> the rejected alternatives section, but am happy to continue the
> conversation if anyone wants to explore that possibility further.


+1

I had a nit-level suggestion; I'm not sure it makes sense, but I'll still
call it out. The entire namespace name in the config record keys (along with
the logger-cluster prefix) seems a bit too verbose. My first thought was to
drop the org.apache.kafka.connect prefix from the keys, considering it is the
root namespace. But since logging can be enabled at the root level, could we
instead use initials like o.a.k.c, which is also a standard practice?
Let me know what you think.

Lastly, I was also wondering whether we could introduce a new parameter that
takes a subset of worker IDs and enables logging for them in one go. This is
already achievable by invoking the scope=worker endpoint n times for n workers,
so it may not be a necessary change, but it could be useful on a large cluster.
Do you think this is worth listing in the Future Work section? It's not
important, so it can be ignored as well.

Thanks!
Sagar.
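
As a side note for readers, here is a minimal sketch of how an operator might observe 
that propagation, assuming the existing GET /admin/loggers/{logger} worker endpoint plus 
the last_modified field this KIP adds; the worker URLs, port, and logger name below are 
placeholders, not anything defined by the KIP:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

public class LoggerPropagationCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical worker REST endpoints in the Connect cluster.
        List<String> workers = List.of("http://worker1:8083", "http://worker2:8083");
        HttpClient client = HttpClient.newHttpClient();
        for (String worker : workers) {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(worker + "/admin/loggers/org.apache.kafka.connect"))
                    .GET()
                    .build();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            // Compare last_modified across workers to see whether the cluster-wide
            // update has reached each one yet.
            System.out.println(worker + " -> " + response.body());
        }
    }
}
```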


On Wed, Sep 6, 2023 at 12:08 AM Chris Egerton 
wrote:

> Hi Sagar,
>
> Thanks for your thoughts! Responses inline:
>
> > 1) Considering that you have now clarified here that last_modified field
> would be stored in-memory, it is not mentioned in the KIP. Although that's
> something that's understandable, it wasn't apparent when reading the KIP.
> Probably, we could mention it? Also, what happens if a worker restarts? In
> that case, since the log level update message would be pre-dating the
> startup of the worker, it would be ignored? We should probably mention that
> behaviour as well IMO.
>
> I've tweaked the second paragraph of the "Last modified timestamp" section
> to try to clarify this without getting too verbose: "Modification times
> will be tracked in-memory and determined by when they are applied by the
> worker, as opposed to when they are requested by the user or persisted to
> the config topic (details below). If no modifications to the namespace have
> been made since the worker finished startup, they will be null." Does that
> feel sufficient? It's a little indirect on the front of worker restart
> behavior, so let me know if that especially should be fleshed out a bit
> more (possibly by calling it out in the "Config topic records" section).
>
> > 2) Staying on the last_modified field, it's utility is also not too clear
> to me. Can you add some examples of how it can be useful for debugging etc?
>
> The cluster-wide API relaxes the consistency guarantees of the existing
> worker-local API. With the latter, users can be certain that once they
> receive a 2xx response, the logging level on that worker has been updated.
> With the former, users know that the logging level will eventually be
> updated, but insight into the propagation of that update across the cluster
> is limited. Although it's a little primitive, I'm hoping that the last
> modified timestamp will be enough to help shed some light on this process.
> We could also explore exposing the provenance of logging levels (which maps
> fairly cleanly to the scope of the request from which they originated), but
> that feels like overkill at the moment.
>
> > 3) In the test plan, should we also add a test when the scope parameter
> passed is non-null and neither worker nor cluster? In this case the
> behaviour should be similar to the default case.
>
> Good call! Done.
>
> > 4) I had the same question as Yash regarding persistent cluster-wide
> logging level. I think you have explained it well and we can skip it for
> now.
>
> Thanks, I'm glad that this seems reasonable. I've tentatively added this to
> the rejected alternatives section, but am happy to continue the
> conversation if anyone wants to explore that possibility further.
>
> Cheers,
>
> Chris
>
> On Tue, Sep 5, 2023 at 11:48 AM Sagar  wrote:
>
> > Hey Chris,
> >
> > Thanks for the KIP. Seems like a useful feature. I have some more
> > questions/comments:
> >
> > 1) 

[jira] [Created] (KAFKA-15438) Review exception caching logic used for reset/validate positions in async consumer

2023-09-05 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-15438:
--

 Summary: Review exception caching logic used for reset/validate 
positions in async consumer
 Key: KAFKA-15438
 URL: https://issues.apache.org/jira/browse/KAFKA-15438
 Project: Kafka
  Issue Type: Task
  Components: clients, consumer
Reporter: Lianet Magrans


The refactored async consumer reuses part of the core logic required for 
resetting and validating positions. That logic currently works on the principle of 
async requests that reset/validate positions when responses are received. If 
the responses include errors, or if validation fails (e.g. log truncation is 
detected), exceptions are saved in memory, to be thrown on the next call to 
reset/validate. Note that these functionalities are periodically called as part 
of the poll loop to update fetch positions before fetching records.

 

As an initial implementation, the async consumer reuses this same caching 
logic, as it has the async nature required. This task aims at reviewing the 
processing of `ResetApplicationEvent` and `ValidatePositionsApplicationEvent` 
to evaluate whether they should rely on event completion instead to propagate 
the errors found. That would align with how other application events manage 
async requests and responses/errors for the new async consumer (based on 
CompletableFutures), but with the trade-off of heavily changing caching logic 
that is currently shared by the legacy and the new consumer in 
OffsetFetcherUtils.
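
For illustration, a minimal sketch of the completion-based alternative (the class and 
method names are hypothetical, not the actual consumer internals): the event carries a 
CompletableFuture that is completed exceptionally when the reset/validate response 
contains an error, instead of caching the exception for the next call.

```java
import java.util.concurrent.CompletableFuture;

public class ValidatePositionsEvent {
    private final CompletableFuture<Void> result = new CompletableFuture<>();

    public CompletableFuture<Void> future() {
        return result;
    }

    // Called from the background thread when the response arrives.
    public void onResponse(Exception error) {
        if (error != null) {
            // e.g. a log-truncation error detected during validation
            result.completeExceptionally(error);
        } else {
            result.complete(null);
        }
    }
}
```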



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-953: partition method to be overloaded to accept headers as well.

2023-09-05 Thread Sagar
Hey Jack,

The way I interpreted this thread, there seems to be more alignment on
the DTO-based approach. I spent some time on the suggestion that Ismael had
regarding the usage of ProducerRecord. Did you get a chance to look at the
reply I posted, and does it make sense? Also, checking out the
AdminClient API examples provided by Ismael will give you more context.
Let me know what you think.

Thanks!
Sagar.
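
To make the trade-off concrete, here is a hedged sketch of what a ProducerRecord-based 
overload could look like (the interface name and default-method behaviour are assumptions 
for illustration, not the KIP's API). Note how the default implementation has no serialized 
key/value bytes to pass along, which is the challenge discussed further down this thread:

```java
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.Cluster;

public interface RecordAwarePartitioner extends Partitioner {

    // Hypothetical overload: the DTO carries topic, key, value and headers.
    default int partition(ProducerRecord<?, ?> record, Cluster cluster) {
        // A default implementation can only delegate with null serialized forms,
        // because ProducerRecord exposes only the unserialized K and V.
        return partition(record.topic(), record.key(), null, record.value(), null, cluster);
    }
}
```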

On Thu, Aug 31, 2023 at 12:49 PM Jack Tomy  wrote:

> Hey everyone,
>
> As I see devs favouring the current style of implementation, which is in line
> with the existing code, I would like to go ahead with the approach mentioned
> in the KIP.
> Can I get a few more votes so that I can take the KIP forward?
>
> Thanks
>
>
>
> On Sun, Aug 27, 2023 at 1:38 PM Sagar  wrote:
>
> > Hi Ismael,
> >
> > Thanks for pointing us towards the direction of a DTO based approach. The
> > AdminClient examples seem very neat and extensible in that sense.
> > Personally, I was trying to think only along the lines of how the current
> > Partitioner interface has been designed, i.e having all requisite
> > parameters as separate arguments (Topic, Key, Value etc).
> >
> > Regarding this question of yours:
> >
> > A more concrete question: did we consider having the method `partition`
> > > take `ProduceRecord` as one of its parameters and `Cluster` as the
> other?
> >
> >
> > No, I don't think in the discussion thread it was brought up and as I
> said
> > it appears that could be due to an attempt to keep the new method's
> > signature similar to the existing one within Partitioner. If I understood
> > the intent of the question correctly, are you trying to hint here that
> > `ProducerRecord` already contains all the arguments that the `partition`
> > method accepts and also has a `headers` field within it. So, instead of
> > adding another method for the `headers` field, why not create a new
> method
> > taking ProducerRecord directly?
> >
> > If my understanding is correct, then it seems like a very clean way of
> > adding support for `headers`. Anyways, the partition method within
> > KafkaProducer already takes a ProducerRecord as an argument so that makes
> > it easier. Keeping that in mind, should this new method's (which takes a
> > ProducerRecord as an input) default implementation invoke the existing
> > method ? One challenge I see there is that the existing partition method
> > expects serialized keys and values while ProducerRecord doesn't have
> access
> > to those (It directly operates on K, V).
> >
> > Thanks!
> > Sagar.
> >
> >
> > On Sun, Aug 27, 2023 at 8:51 AM Ismael Juma  wrote:
> >
> > > A more concrete question: did we consider having the method `partition`
> > > take `ProduceRecord` as one of its parameters and `Cluster` as the
> other?
> > >
> > > Ismael
> > >
> > > On Sat, Aug 26, 2023 at 12:50 PM Greg Harris
> >  > > >
> > > wrote:
> > >
> > > > Hey Ismael,
> > > >
> > > > > The mention of "runtime" is specific to Connect. When it comes to
> > > > clients,
> > > > one typically compiles and runs with the same version or runs with a
> > > newer
> > > > version than the one used for compilation. This is standard practice
> in
> > > > Java and not something specific to Kafka.
> > > >
> > > > When I said "older runtimes" I was being lazy, and should have said
> > > > "older versions of clients at runtime," thank you for figuring out
> > > > what I meant.
> > > >
> > > > I don't know how common it is to compile a partitioner against one
> > > > version of clients, and then distribute and run the partitioner with
> > > > older versions of clients and expect graceful degradation of
> features.
> > > > If you say that it is very uncommon and not something that we should
> > > > optimize for, then I won't suggest otherwise.
> > > >
> > > > > With regards to the Admin APIs, they have been extended several
> times
> > > > since introduction (naturally). One of them is:
> > > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/commit/1d22b0d70686aef5689b775ea2ea7610a37f3e8c
> > > >
> > > > Thanks for the example. I also see that includes a migration from
> > > > regular arguments to the DTO style, consistent with your
> > > > recommendation here.
> > > >
> > > > I think the DTO style and the proposed additional argument style are
> > > > both reasonable.
> > > >
> > > > Thanks,
> > > > Greg
> > > >
> > > > On Sat, Aug 26, 2023 at 9:46 AM Ismael Juma 
> wrote:
> > > > >
> > > > > Hi Greg,
> > > > >
> > > > > The mention of "runtime" is specific to Connect. When it comes to
> > > > clients,
> > > > > one typically compiles and runs with the same version or runs with
> a
> > > > newer
> > > > > version than the one used for compilation. This is standard
> practice
> > in
> > > > > Java and not something specific to Kafka.
> > > > >
> > > > > With regards to the Admin APIs, they have been extended several
> times
> > > > since
> > > > > introduction (naturally). One of them is:
> > > > >
> 

Re: Apache Kafka 3.6.0 release

2023-09-05 Thread Satish Duggana
Hi David,
Thanks for bringing this issue to this thread.
I marked https://issues.apache.org/jira/browse/KAFKA-15435 as a blocker.

Thanks,
Satish.

On Tue, 5 Sept 2023 at 21:29, David Arthur  wrote:
>
> Hi Satish. Thanks for running the release!
>
> I'd like to raise this as a blocker for 3.6
> https://issues.apache.org/jira/browse/KAFKA-15435.
>
> It's a very quick fix, so I should be able to post a PR soon.
>
> Thanks!
> David
>
> On Mon, Sep 4, 2023 at 11:44 PM Justine Olshan 
> wrote:
>
> > Thanks Satish. This is done 
> >
> > Justine
> >
> > On Mon, Sep 4, 2023 at 5:16 PM Satish Duggana 
> > wrote:
> >
> > > Hey Justine,
> > > I went through KAFKA-15424 and the PR[1]. It seems there are no
> > > dependent changes missing in 3.6 branch. They seem to be low risk as
> > > you mentioned. Please merge it to the 3.6 branch as well.
> > >
> > > 1. https://github.com/apache/kafka/pull/14324.
> > >
> > > Thanks,
> > > Satish.
> > >
> > > On Tue, 5 Sept 2023 at 05:06, Justine Olshan
> > >  wrote:
> > > >
> > > > Sorry I meant to add the jira as well.
> > > > https://issues.apache.org/jira/browse/KAFKA-15424
> > > >
> > > > Justine
> > > >
> > > > On Mon, Sep 4, 2023 at 4:34 PM Justine Olshan 
> > > wrote:
> > > >
> > > > > Hey Satish,
> > > > >
> > > > > I was working on adding dynamic configuration for
> > > > > transaction verification. The PR is approved and ready to merge into
> > > trunk.
> > > > > I was thinking I could also add it to 3.6 since it is fairly low
> > risk.
> > > > > What do you think?
> > > > >
> > > > > Justine
> > > > >
> > > > > On Sat, Sep 2, 2023 at 6:21 PM Sophie Blee-Goldman <
> > > ableegold...@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> Thanks Satish! The fix has been merged and cherrypicked to 3.6
> > > > >>
> > > > >> On Sat, Sep 2, 2023 at 6:02 AM Satish Duggana <
> > > satish.dugg...@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >> > Hi Sophie,
> > > > >> > Please feel free to add that to 3.6 branch as you say this is a
> > > minor
> > > > >> > change and will not cause any regressions.
> > > > >> >
> > > > >> > Thanks,
> > > > >> > Satish.
> > > > >> >
> > > > >> > On Sat, 2 Sept 2023 at 08:44, Sophie Blee-Goldman
> > > > >> >  wrote:
> > > > >> > >
> > > > >> > > Hey Satish, someone reported a minor bug in the Streams
> > > application
> > > > >> > > shutdown which was a recent regression, though not strictly a
> > new
> > > one:
> > > > >> > was
> > > > >> > > introduced in 3.4 I believe.
> > > > >> > >
> > > > >> > > The fix seems to be super lightweight and low-risk so I was
> > > hoping to
> > > > >> > slip
> > > > >> > > it into 3.6 if that's ok with you? They plan to have the patch
> > > > >> tonight.
> > > > >> > >
> > > > >> > > https://issues.apache.org/jira/browse/KAFKA-15429
> > > > >> > >
> > > > >> > > On Thu, Aug 31, 2023 at 5:45 PM Satish Duggana <
> > > > >> satish.dugg...@gmail.com
> > > > >> > >
> > > > >> > > wrote:
> > > > >> > >
> > > > >> > > > Thanks Chris for bringing this issue here and filing the new
> > > JIRA
> > > > >> for
> > > > >> > > > 3.6.0[1]. It seems to be a blocker for 3.6.0.
> > > > >> > > >
> > > > >> > > > Please help review https://github.com/apache/kafka/pull/14314
> > > as
> > > > >> Chris
> > > > >> > > > requested.
> > > > >> > > >
> > > > >> > > > 1. https://issues.apache.org/jira/browse/KAFKA-15425
> > > > >> > > >
> > > > >> > > > ~Satish.
> > > > >> > > >
> > > > >> > > > On Fri, 1 Sept 2023 at 03:59, Chris Egerton
> > >  > > > >> >
> > > > >> > > > wrote:
> > > > >> > > > >
> > > > >> > > > > Hi all,
> > > > >> > > > >
> > > > >> > > > > Quick update: I've filed a separate ticket,
> > > > >> > > > > https://issues.apache.org/jira/browse/KAFKA-15425, to track
> > > the
> > > > >> > behavior
> > > > >> > > > > change in Admin::listOffsets. For the full history of the
> > > ticket,
> > > > >> > it's
> > > > >> > > > > worth reading the comment thread on the old ticket at
> > > > >> > > > > https://issues.apache.org/jira/browse/KAFKA-12879.
> > > > >> > > > >
> > > > >> > > > > I've also published
> > > https://github.com/apache/kafka/pull/14314
> > > > >> as a
> > > > >> > > > fairly
> > > > >> > > > > lightweight PR to revert the behavior of Admin::listOffsets
> > > > >> without
> > > > >> > also
> > > > >> > > > > reverting the refactoring to use the internal admin driver
> > > API.
> > > > >> Would
> > > > >> > > > > appreciate a review on that if anyone can spare the cycles.
> > > > >> > > > >
> > > > >> > > > > Cheers,
> > > > >> > > > >
> > > > >> > > > > Chris
> > > > >> > > > >
> > > > >> > > > > On Wed, Aug 30, 2023 at 1:01 PM Chris Egerton <
> > > chr...@aiven.io>
> > > > >> > wrote:
> > > > >> > > > >
> > > > >> > > > > > Hi Satish,
> > > > >> > > > > >
> > > > >> > > > > > Wanted to let you know that KAFKA-12879 (
> > > > >> > > > > > https://issues.apache.org/jira/browse/KAFKA-12879), a
> > > breaking
> > > > >> > change
> > > > >> > > > in
> > > > >> > > > > > Admin::listOffsets, has been reintroduced into 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.6 #24

2023-09-05 Thread Apache Jenkins Server
See 




Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.5 #70

2023-09-05 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 283943 lines...]
> Task :clients:generateMetadataFileForMavenJavaPublication
> Task :connect:json:testSrcJar
> Task :streams:srcJar
> Task :streams:compileJava UP-TO-DATE
> Task :streams:classes UP-TO-DATE
> Task :streams:copyDependantLibs UP-TO-DATE
> Task :streams:test-utils:compileJava UP-TO-DATE
> Task :streams:jar UP-TO-DATE
> Task :streams:generateMetadataFileForMavenJavaPublication

> Task :connect:api:javadoc
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.5/connect/api/src/main/java/org/apache/kafka/connect/source/SourceRecord.java:44:
 warning - Tag @link: reference not found: org.apache.kafka.connect.data
1 warning

> Task :connect:api:copyDependantLibs UP-TO-DATE
> Task :connect:api:jar UP-TO-DATE
> Task :connect:api:generateMetadataFileForMavenJavaPublication
> Task :connect:json:copyDependantLibs UP-TO-DATE
> Task :connect:json:jar UP-TO-DATE
> Task :connect:json:generateMetadataFileForMavenJavaPublication
> Task :connect:api:javadocJar
> Task :connect:api:compileTestJava UP-TO-DATE
> Task :connect:json:publishMavenJavaPublicationToMavenLocal
> Task :connect:api:testClasses UP-TO-DATE
> Task :connect:json:publishToMavenLocal
> Task :connect:api:testJar
> Task :connect:api:testSrcJar
> Task :connect:api:publishMavenJavaPublicationToMavenLocal
> Task :connect:api:publishToMavenLocal
> Task :streams:javadoc
> Task :streams:javadocJar

> Task :clients:javadoc
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.5/clients/src/main/java/org/apache/kafka/clients/admin/ScramMechanism.java:32:
 warning - Tag @see: missing final '>': "https://cwiki.apache.org/confluence/display/KAFKA/KIP-554%3A+Add+Broker-side+SCRAM+Config+API;>KIP-554:
 Add Broker-side SCRAM Config API

 This code is duplicated in 
org.apache.kafka.common.security.scram.internals.ScramMechanism.
 The type field in both files must match and must not change. The type field
 is used both for passing ScramCredentialUpsertion and for the internal
 UserScramCredentialRecord. Do not change the type field."
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.5/clients/src/main/java/org/apache/kafka/common/security/oauthbearer/secured/package-info.java:21:
 warning - Tag @link: reference not found: 
org.apache.kafka.common.security.oauthbearer
2 warnings

> Task :clients:javadocJar
> Task :clients:srcJar
> Task :clients:testJar
> Task :clients:testSrcJar
> Task :clients:publishMavenJavaPublicationToMavenLocal
> Task :clients:publishToMavenLocal
> Task :core:compileScala
> Task :core:classes
> Task :core:compileTestJava NO-SOURCE
> Task :core:compileTestScala
> Task :core:testClasses
> Task :streams:compileTestJava UP-TO-DATE
> Task :streams:testClasses UP-TO-DATE
> Task :streams:testJar
> Task :streams:testSrcJar
> Task :streams:publishMavenJavaPublicationToMavenLocal
> Task :streams:publishToMavenLocal

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

See 
https://docs.gradle.org/8.0.2/userguide/command_line_interface.html#sec:command_line_warnings

BUILD SUCCESSFUL in 3m 19s
89 actionable tasks: 33 executed, 56 up-to-date
[Pipeline] sh
+ grep ^version= gradle.properties
+ cut -d= -f 2
[Pipeline] dir
Running in 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.5/streams/quickstart
[Pipeline] {
[Pipeline] sh
+ mvn clean install -Dgpg.skip
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Kafka Streams :: Quickstart[pom]
[INFO] streams-quickstart-java[maven-archetype]
[INFO] 
[INFO] < org.apache.kafka:streams-quickstart >-
[INFO] Building Kafka Streams :: Quickstart 3.5.2-SNAPSHOT[1/2]
[INFO]   from pom.xml
[INFO] [ pom ]-
[INFO] 
[INFO] --- clean:3.0.0:clean (default-clean) @ streams-quickstart ---
[INFO] 
[INFO] --- remote-resources:1.5:process (process-resource-bundles) @ 
streams-quickstart ---
[INFO] 
[INFO] --- site:3.5.1:attach-descriptor (attach-descriptor) @ 
streams-quickstart ---
[INFO] 
[INFO] --- gpg:1.6:sign (sign-artifacts) @ streams-quickstart ---
[INFO] 
[INFO] --- install:2.5.2:install (default-install) @ streams-quickstart ---
[INFO] Installing 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.5/streams/quickstart/pom.xml
 to 
/home/jenkins/.m2/repository/org/apache/kafka/streams-quickstart/3.5.2-SNAPSHOT/streams-quickstart-3.5.2-SNAPSHOT.pom
[INFO] 
[INFO] --< org.apache.kafka:streams-quickstart-java >--
[INFO] Building streams-quickstart-java 3.5.2-SNAPSHOT

Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2175

2023-09-05 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 409366 lines...]
Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldDrainPendingTasksToCreate() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldDrainPendingTasksToCreate() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldKeepAddedTasks() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > TasksTest > 
shouldKeepAddedTasks() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldAssignTasksThatCanBeSystemTimePunctuated() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldAssignTasksThatCanBeSystemTimePunctuated() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldNotUnassignNotOwnedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldNotUnassignNotOwnedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsTwice() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsTwice() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForPunctuationIfPunctuationDisabled() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForPunctuationIfPunctuationDisabled() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldAddTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldAddTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldNotAssignAnyLockedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldNotAssignAnyLockedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldRemoveTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldRemoveTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldNotRemoveAssignedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldNotRemoveAssignedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldAssignTaskThatCanBeProcessed() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldAssignTaskThatCanBeProcessed() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldNotRemoveUnlockedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldNotRemoveUnlockedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldReturnAndClearExceptionsOnDrainExceptions() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldReturnAndClearExceptionsOnDrainExceptions() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldUnassignTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldUnassignTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForProcessingIfProcessingDisabled() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForProcessingIfProcessingDisabled() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsForUnassignedTasks() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 84 > 
DefaultTaskManagerTest > 

Re: Unable to start the Kafka with Kraft in Windows 11

2023-09-05 Thread Greg Harris
Hey Sumanshu,

Thanks for trying out Kraft! I hope that you can get it working :)

I am not familiar with Kraft or Windows, but the error appears to
mention that the file is already in use by another process so maybe we
can start there.

1. Have you verified that no other Kafka processes are running, such
as in the background or in another terminal?
2. Are you setting up multiple Kafka brokers on the same machine in your test?
3. Do you see the error if you restart your machine before starting Kafka?
4. Do you see the error if you delete the log directory and format it
again before starting Kafka?
5. Have you made any changes to the `server.properties`, such as
changing the log directories? (I see that the default is
`/tmp/kraft-combined-logs`, I don't know if that is a valid path for
Windows).

Thanks,
Greg
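
(For readers hitting question 5: a hypothetical override in config\kraft\server.properties 
that points the log directory at a Windows-style path instead of /tmp, to be followed by 
re-running the kafka-storage.bat format step; the exact path below is just an example.)

```properties
# Hypothetical example only: use a Windows-friendly log directory
log.dirs=C:/kafka/kraft-combined-logs
```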

On Mon, Sep 4, 2023 at 2:21 PM Sumanshu Nankana
 wrote:
>
> Hi Team,
>
> I am following the steps mentioned here https://kafka.apache.org/quickstart 
> to Install the Kafka.
>
> Windows 11
> Kafka Version 
> https://www.apache.org/dyn/closer.cgi?path=/kafka/3.5.0/kafka_2.13-3.5.0.tgz
> 64 Bit Operating System
>
>
> Step1: Generate the Cluster UUID
>
> $KAFKA_CLUSTER_ID=.\bin\windows\kafka-storage.bat random-uuid
>
> Step2: Format Log Directories
>
> .\bin\windows\kafka-storage.bat format -t $KAFKA_CLUSTER_ID -c 
> .\config\kraft\server.properties
>
> Step3: Start the Kafka Server
>
> .\bin\windows\kafka-server-start.bat .\config\kraft\server.properties
>
> I am getting the error. Logs are attached
>
> Could you please help me to sort this error.
>
> Kindly let me know, if you need any more information.
>
> -
> Best
> Sumanshu Nankana
>
>


[jira] [Created] (KAFKA-15437) Add metrics about open iterators

2023-09-05 Thread Matthias J. Sax (Jira)
Matthias J. Sax created KAFKA-15437:
---

 Summary: Add metrics about open iterators
 Key: KAFKA-15437
 URL: https://issues.apache.org/jira/browse/KAFKA-15437
 Project: Kafka
  Issue Type: New Feature
  Components: streams
Reporter: Matthias J. Sax


Kafka Streams allows creating iterators over state stores. Those iterators must 
be closed to free up resources (especially for RocksDB). We regularly get 
user reports of "resource leaks" that can be pinned down to leaked (i.e. 
not-closed) iterators.

To simplify monitoring, it would be helpful to add a metric about open 
iterators to allow users to alert on and pinpoint the issue directly (and before 
the actual resource leak is observed).

We might want to have a DEBUG-level per-store metric (to allow identifying the 
store in question quickly), plus an already rolled-up INFO-level metric for the 
whole application.
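
To illustrate the kind of instrumentation this would need, here is a rough sketch (not 
the proposed implementation; the class name and the AtomicLong-backed gauge wiring are 
assumptions) of wrapping a store iterator so an open-iterator count is incremented on 
creation and decremented on close:

```java
import java.util.concurrent.atomic.AtomicLong;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.state.KeyValueIterator;

public class CountingKeyValueIterator<K, V> implements KeyValueIterator<K, V> {
    private final KeyValueIterator<K, V> delegate;
    private final AtomicLong openIterators;   // backing value for a per-store (or per-app) gauge

    public CountingKeyValueIterator(KeyValueIterator<K, V> delegate, AtomicLong openIterators) {
        this.delegate = delegate;
        this.openIterators = openIterators;
        openIterators.incrementAndGet();      // a new iterator is now open
    }

    @Override public boolean hasNext() { return delegate.hasNext(); }
    @Override public KeyValue<K, V> next() { return delegate.next(); }
    @Override public K peekNextKey() { return delegate.peekNextKey(); }

    @Override
    public void close() {
        delegate.close();
        openIterators.decrementAndGet();      // leaked (never-closed) iterators keep the gauge elevated
    }
}
```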



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.5 #69

2023-09-05 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 283412 lines...]
> Task :raft:testClasses UP-TO-DATE
> Task :connect:json:testJar
> Task :connect:json:testSrcJar
> Task :group-coordinator:compileTestJava UP-TO-DATE
> Task :group-coordinator:testClasses UP-TO-DATE
> Task :streams:generateMetadataFileForMavenJavaPublication
> Task :metadata:compileTestJava UP-TO-DATE
> Task :metadata:testClasses UP-TO-DATE
> Task :clients:generateMetadataFileForMavenJavaPublication

> Task :connect:api:javadoc
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.5/connect/api/src/main/java/org/apache/kafka/connect/source/SourceRecord.java:44:
 warning - Tag @link: reference not found: org.apache.kafka.connect.data
1 warning

> Task :connect:api:copyDependantLibs UP-TO-DATE
> Task :connect:api:jar UP-TO-DATE
> Task :connect:api:generateMetadataFileForMavenJavaPublication
> Task :connect:json:copyDependantLibs UP-TO-DATE
> Task :connect:json:jar UP-TO-DATE
> Task :connect:json:generateMetadataFileForMavenJavaPublication
> Task :connect:api:javadocJar
> Task :connect:json:publishMavenJavaPublicationToMavenLocal
> Task :connect:json:publishToMavenLocal
> Task :connect:api:compileTestJava UP-TO-DATE
> Task :connect:api:testClasses UP-TO-DATE
> Task :connect:api:testJar
> Task :connect:api:testSrcJar
> Task :connect:api:publishMavenJavaPublicationToMavenLocal
> Task :connect:api:publishToMavenLocal
> Task :streams:javadoc
> Task :streams:javadocJar

> Task :clients:javadoc
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.5/clients/src/main/java/org/apache/kafka/clients/admin/ScramMechanism.java:32:
 warning - Tag @see: missing final '>': "https://cwiki.apache.org/confluence/display/KAFKA/KIP-554%3A+Add+Broker-side+SCRAM+Config+API;>KIP-554:
 Add Broker-side SCRAM Config API

 This code is duplicated in 
org.apache.kafka.common.security.scram.internals.ScramMechanism.
 The type field in both files must match and must not change. The type field
 is used both for passing ScramCredentialUpsertion and for the internal
 UserScramCredentialRecord. Do not change the type field."
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.5/clients/src/main/java/org/apache/kafka/common/security/oauthbearer/secured/package-info.java:21:
 warning - Tag @link: reference not found: 
org.apache.kafka.common.security.oauthbearer
2 warnings

> Task :clients:javadocJar
> Task :clients:srcJar
> Task :clients:testJar
> Task :clients:testSrcJar
> Task :clients:publishMavenJavaPublicationToMavenLocal
> Task :clients:publishToMavenLocal
> Task :core:compileScala
> Task :core:classes
> Task :core:compileTestJava NO-SOURCE
> Task :core:compileTestScala
> Task :core:testClasses
> Task :streams:compileTestJava UP-TO-DATE
> Task :streams:testClasses UP-TO-DATE
> Task :streams:testJar
> Task :streams:testSrcJar
> Task :streams:publishMavenJavaPublicationToMavenLocal
> Task :streams:publishToMavenLocal

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings 
and determine if they come from your own scripts or plugins.

See 
https://docs.gradle.org/8.0.2/userguide/command_line_interface.html#sec:command_line_warnings

BUILD SUCCESSFUL in 3m 3s
89 actionable tasks: 33 executed, 56 up-to-date
[Pipeline] sh
+ grep ^version= gradle.properties
+ cut -d= -f 2
[Pipeline] dir
Running in 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.5/streams/quickstart
[Pipeline] {
[Pipeline] sh
+ mvn clean install -Dgpg.skip
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Kafka Streams :: Quickstart[pom]
[INFO] streams-quickstart-java[maven-archetype]
[INFO] 
[INFO] < org.apache.kafka:streams-quickstart >-
[INFO] Building Kafka Streams :: Quickstart 3.5.2-SNAPSHOT[1/2]
[INFO]   from pom.xml
[INFO] [ pom ]-
[INFO] 
[INFO] --- clean:3.0.0:clean (default-clean) @ streams-quickstart ---
[INFO] 
[INFO] --- remote-resources:1.5:process (process-resource-bundles) @ 
streams-quickstart ---
[INFO] 
[INFO] --- site:3.5.1:attach-descriptor (attach-descriptor) @ 
streams-quickstart ---
[INFO] 
[INFO] --- gpg:1.6:sign (sign-artifacts) @ streams-quickstart ---
[INFO] 
[INFO] --- install:2.5.2:install (default-install) @ streams-quickstart ---
[INFO] Installing 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.5/streams/quickstart/pom.xml
 to 
/home/jenkins/.m2/repository/org/apache/kafka/streams-quickstart/3.5.2-SNAPSHOT/streams-quickstart-3.5.2-SNAPSHOT.pom
[INFO] 
[INFO] --< org.apache.kafka:streams-quickstart-java >--
[INFO] Building streams-quickstart-java 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.6 #23

2023-09-05 Thread Apache Jenkins Server
See 




Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2174

2023-09-05 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 410419 lines...]
Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldDrainPendingTasksToCreate() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
onlyRemovePendingTaskToRecycleShouldRemoveTaskFromPendingUpdateActions() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseClean() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldAddAndRemovePendingTaskToCloseDirty() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldKeepAddedTasks() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > TasksTest > 
shouldKeepAddedTasks() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTasksThatCanBeSystemTimePunctuated() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTasksThatCanBeSystemTimePunctuated() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotUnassignNotOwnedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotUnassignNotOwnedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsTwice() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsTwice() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForPunctuationIfPunctuationDisabled() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForPunctuationIfPunctuationDisabled() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAddTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAddTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotAssignAnyLockedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotAssignAnyLockedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldRemoveTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldRemoveTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveAssignedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveAssignedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTaskThatCanBeProcessed() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldAssignTaskThatCanBeProcessed() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveUnlockedTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotRemoveUnlockedTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldReturnAndClearExceptionsOnDrainExceptions() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldReturnAndClearExceptionsOnDrainExceptions() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldUnassignTask() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldUnassignTask() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForProcessingIfProcessingDisabled() STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > 
shouldNotAssignTasksForProcessingIfProcessingDisabled() PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsForUnassignedTasks() 
STARTED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 
DefaultTaskManagerTest > shouldNotSetUncaughtExceptionsForUnassignedTasks() 
PASSED

Gradle Test Run :streams:test > Gradle Test Executor 85 > 

Re: [DISCUSS] KIP-939: Support Participation in 2PC

2023-09-05 Thread Alexander Sorokoumov
Hi Artem,

Thanks for publishing this KIP!

Can you please clarify the purpose of having broker-level
transaction.two.phase.commit.enable config in addition to the new ACL? If
the brokers are configured with transaction.two.phase.commit.enable=false,
at what point will a client configured with
transaction.two.phase.commit.enable=true fail? Will it happen at
KafkaProducer#initTransactions?

WDYT about adding an AdminClient method that returns the state of
transaction.two.phase.commit.enable? This way, clients would know in advance
if 2PC is enabled on the brokers.

Best,
Alex
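
For context on the producer-per-concurrency-slot pattern discussed in the quoted thread 
below, here is a hedged sketch; the bootstrap servers, id naming scheme, and the 
transaction.two.phase.commit.enable producer setting proposed by the KIP are assumptions, 
not released APIs:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;

public final class TwoPcProducerFactory {

    public static KafkaProducer<byte[], byte[]> forSlot(String podIdentity, int slot) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.ByteArraySerializer");
        // Stable, application-controlled id: pod identity (e.g. StatefulSet hostname) + slot,
        // so the app can complete or abort the prepared transaction after a crash/restart.
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, podIdentity + "-slot-" + slot);
        // Low-latency setting, since each DB commit waits on the Kafka write.
        props.put(ProducerConfig.LINGER_MS_CONFIG, 0);
        // Proposed by KIP-939 (hypothetical here; not available in released clients).
        props.put("transaction.two.phase.commit.enable", "true");
        return new KafkaProducer<>(props);
    }
}
```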

On Fri, Aug 25, 2023 at 9:40 AM Roger Hoover  wrote:

> Other than supporting multiplexing transactional streams on a single
> producer, I don't see how to improve it.
>
> On Thu, Aug 24, 2023 at 12:12 PM Artem Livshits
>  wrote:
>
> > Hi Roger,
> >
> > Thank you for summarizing the cons.  I agree and I'm curious what would
> be
> > the alternatives to solve these problems better and if they can be
> > incorporated into this proposal (or built independently in addition to or
> > on top of this proposal).  E.g. one potential extension we discussed
> > earlier in the thread could be multiplexing logical transactional
> "streams"
> > with a single producer.
> >
> > -Artem
> >
> > On Wed, Aug 23, 2023 at 4:50 PM Roger Hoover 
> > wrote:
> >
> > > Thanks.  I like that you're moving Kafka toward supporting this
> > dual-write
> > > pattern.  Each use case needs to consider the tradeoffs.  You already
> > > summarized the pros very well in the KIP.  I would summarize the cons
> > > as follows:
> > >
> > > - you sacrifice availability - each write requires both DB and Kafka to be
> > > available, so I think your overall application availability is
> > > (1 - p(DB is unavailable)) * (1 - p(Kafka is unavailable)).
> > > - latency will be higher and throughput lower - each write requires
> both
> > > writes to DB and Kafka while holding an exclusive lock in DB.
> > > - you need to create a producer per unit of concurrency in your app
> which
> > > has some overhead in the app and Kafka side (number of connections,
> poor
> > > batching).  I assume the producers would need to be configured for low
> > > latency (linger.ms=0)
> > > - there's some complexity in managing stable transactional ids for each
> > > producer/concurrency unit in your application.  With k8s deployment,
> you
> > > may need to switch to something like a StatefulSet that gives each pod
> a
> > > stable identity across restarts.  On top of that pod identity which you
> > can
> > > use as a prefix, you then assign unique transactional ids to each
> > > concurrency unit (thread/goroutine).
> > >
> > > On Wed, Aug 23, 2023 at 12:53 PM Artem Livshits
> > >  wrote:
> > >
> > > > Hi Roger,
> > > >
> > > > Thank you for the feedback.  You make a very good point that we also
> > > > discussed internally.  Adding support for multiple concurrent
> > > > transactions in one producer could be valuable but it seems to be a
> > > fairly
> > > > large and independent change that would deserve a separate KIP.  If
> > such
> > > > support is added we could modify 2PC functionality to incorporate
> that.
> > > >
> > > > > Maybe not too bad but a bit of pain to manage these ids inside each
> > > > process and across all application processes.
> > > >
> > > > I'm not sure if supporting multiple transactions in one producer
> would
> > > make
> > > > id management simpler: we'd need to store a piece of data per
> > > transaction,
> > > > so whether it's N producers with a single transaction or N
> transactions
> > > > with a single producer, it's still roughly the same amount of data to
> > > > manage.  In fact, managing transactional ids (current proposal) might
> > be
> > > > easier, because the id is controlled by the application and it knows
> > how
> > > to
> > > > complete the transaction after crash / restart; while a TID would be
> > > > generated by Kafka and that would create a question of starting Kafka
> > > > transaction, but not saving its TID and then crashing, then figuring
> > out
> > > > which transactions to abort and etc.
> > > >
> > > > > 2) creating a separate producer for each concurrency slot in the
> > > > application
> > > >
> > > > This is a very valid concern.  Maybe we'd need to have some
> > multiplexing
> > > of
> > > > transactional logical "streams" over the same connection.  Seems
> like a
> > > > separate KIP, though.
> > > >
> > > > > Otherwise, it seems you're left with single-threaded model per
> > > > application process?
> > > >
> > > > That's a fair assessment.  Not necessarily exactly single-threaded
> per
> > > > application, but a single producer per thread model (i.e. an
> > application
> > > > could have a pool of threads + producers to increase concurrency).
> > > >
> > > > -Artem
> > > >
> > > > On Tue, Aug 22, 2023 at 7:22 PM Roger Hoover  >
> > > > wrote:
> > > >
> > > > > Artem,
> > > > >
> > > > > Thanks for the reply.
> > > > >
> > > > > If I 

[jira] [Created] (KAFKA-15436) Custom ConfigDef validators are invoked with null when user-provided value does not match type

2023-09-05 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-15436:
-

 Summary: Custom ConfigDef validators are invoked with null when 
user-provided value does not match type
 Key: KAFKA-15436
 URL: https://issues.apache.org/jira/browse/KAFKA-15436
 Project: Kafka
  Issue Type: Bug
Reporter: Chris Egerton


Filed in response to [discussion on a tangentially-related 
PR|https://github.com/apache/kafka/pull/14304#discussion_r1310039190].
h3. Background

The [ConfigDef.Validator 
interface|https://kafka.apache.org/35/javadoc/org/apache/kafka/common/config/ConfigDef.Validator.html]
 can be used to add custom per-property validation logic to a 
[ConfigDef|https://kafka.apache.org/35/javadoc/org/apache/kafka/common/config/ConfigDef.html]
 instance. This can serve many uses, including but not limited to:
 * Ensuring that the value for a string property matches the name of a Java 
enum type
 * Ensuring that the value for an integer property falls within the range of 
valid port numbers
 * Ensuring that the value for a class property has a public, no-args 
constructor and/or implements a certain interface

This validation logic can be invoked directly via 
[ConfigDef::validate|https://kafka.apache.org/35/javadoc/org/apache/kafka/common/config/ConfigDef.html#validate(java.util.Map)]
 or 
[ConfigDef::validateAll|https://kafka.apache.org/35/javadoc/org/apache/kafka/common/config/ConfigDef.html#validateAll(java.util.Map)],
 or indirectly when instantiating an 
[AbstractConfig|https://kafka.apache.org/35/javadoc/org/apache/kafka/common/config/AbstractConfig.html].

When a value is validated by a {{ConfigDef}} instance, the {{ConfigDef}} first 
verifies that the value adheres to the expected type. For example, if the "raw" 
value is the string {{"345"}} and the property is defined with the [INT 
type|https://kafka.apache.org/35/javadoc/org/apache/kafka/common/config/ConfigDef.Type.html#INT],
 then the value is valid (it is parsed as the integer {{{}345{}}}). However, if 
the same raw value is used for a property defined with the [BOOLEAN 
type|https://kafka.apache.org/35/javadoc/org/apache/kafka/common/config/ConfigDef.Type.html#BOOLEAN],
 then the value is invalid (it cannot be parsed as a boolean).
h3. Problem

When a raw value is invalid for the type of the property it is used for (e.g., 
{{"345"}} is used for a property defined with the [BOOLEAN 
type|https://kafka.apache.org/35/javadoc/org/apache/kafka/common/config/ConfigDef.Type.html#BOOLEAN]),
 custom validators for the property are still invoked, with a value of 
{{{}null{}}}.

This can lead to some counterintuitive behavior, and may necessitate that 
implementers of the {{ConfigDef.Validator}} interface catch cases where the 
value is {{null}} and choose not to report any errors (with the assumption that 
an error will already be reported by the {{ConfigDef}} regarding its failure to 
parse the raw value with the expected type).

We may consider skipping custom validation altogether when the raw value for a 
property cannot be parsed with the expected type. On the other hand, it's 
unclear if there are compatibility concerns about this kind of change.

If we decide to change this behavior, we should try to assess which code paths 
may lead to custom validators being invoked, which use cases correspond to 
which of these code paths, and whether this behavioral change has a chance to 
negatively impact these use cases.
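
A minimal sketch of the defensive pattern that current validator implementations end up 
needing (the port validator itself is a made-up example, not code from the project):

```java
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigException;

public class PortValidator implements ConfigDef.Validator {
    @Override
    public void ensureValid(String name, Object value) {
        if (value == null) {
            // The ConfigDef already reports the type-parsing failure for the raw value;
            // avoid adding a misleading secondary error here.
            return;
        }
        int port = (Integer) value;
        if (port < 1 || port > 65535) {
            throw new ConfigException(name, value, "must be a valid port number (1-65535)");
        }
    }
}
```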



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-9800) [KIP-580] Client Exponential Backoff Implementation

2023-09-05 Thread Jun Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-9800.

Fix Version/s: 3.7.0
   Resolution: Fixed

Merged [https://github.com/apache/kafka/pull/14111] to trunk.

> [KIP-580] Client Exponential Backoff Implementation
> ---
>
> Key: KAFKA-9800
> URL: https://issues.apache.org/jira/browse/KAFKA-9800
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Cheng Tan
>Assignee: Andrew Schofield
>Priority: Major
>  Labels: KIP-580, client
> Fix For: 3.7.0
>
>
> Design:
> The main idea is to bookkeep the failed attempts. Currently, the retry backoff 
> has two main usage patterns:
>  # Synchronous retries in a blocking loop. The thread will sleep in each 
> iteration for the retry backoff ms.
>  # Async retries. In each poll, retries that have not yet met the backoff will 
> be filtered out. The data class often maintains a 1:1 mapping to a set of 
> requests which are logically associated. (i.e. a set contains only one 
> initial request and only its retries.)
> For type 1, we can utilize a local failure counter of a Java generic data 
> type.
> For case 2, I already wrapped the exponential backoff/timeout util class in 
> my KIP-601 
> [implementation|https://github.com/apache/kafka/pull/8683/files#diff-9ca2b1294653dfa914b9277de62b52e3R28]
>  which takes the number of attempts and returns the backoff/timeout value at 
> the corresponding level. Thus, we can add a new class property to those 
> classes containing retriable data in order to record the number of failed 
> attempts.
>  
> Changes:
> KafkaProducer:
>  # Produce request (ApiKeys.PRODUCE). Currently, the backoff applies to each 
> ProducerBatch in Accumulator, which already has an attribute attempts 
> recording the number of failed attempts. So we can let the Accumulator 
> calculate the new retry backoff for each bach when it enqueues them, to avoid 
> instantiate the util class multiple times.
>  # Transaction request (ApiKeys..*TXN). TxnRequestHandler will have a new 
> class property of type `Long` to record the number of attempts.
> KafkaConsumer:
>  # Some synchronous retry use cases. Record the failed attempts in the 
> blocking loop.
>  # Partition request (ApiKeys.OFFSET_FOR_LEADER_EPOCH, ApiKeys.LIST_OFFSETS). 
> Though the actual requests are packed for each node, the current 
> implementation is applying backoff to each topic partition, where the backoff 
> value is kept by TopicPartitionState. Thus, TopicPartitionState will have the 
> new property recording the number of attempts.
> Metadata:
>  #  Metadata lives as a singleton in many clients. Add a new property 
> recording the number of attempts
>  AdminClient:
>  # AdminClient has its own request abstraction Call. The failed attempts are 
> already kept by the abstraction. So probably clean the Call class logic a bit.
> Existing tests:
>  # If the tests are testing the retry backoff, add a delta to the assertion, 
> considering the existence of the jitter.
>  # If the tests are testing other functionality, we can specify the same 
> value for both `retry.backoff.ms` and `retry.backoff.max.ms` in order to make 
> the retry backoff static. We can use this trick to make the existing tests 
> compatible with the changes.
> There are other common usages that look like client.poll(timeout), where the 
> timeout passed in is the retry backoff value. We won't change these usages 
> since their underlying logic is nioSelector.select(timeout) and 
> nioSelector.selectNow(), which means that if no interested op exists, the client 
> will block for retry backoff milliseconds. This is an optimization for when there's 
> no request that needs to be sent but the client is waiting for responses. 
> Specifically, if the client fails the in-flight requests before the retry 
> backoff milliseconds have passed, it still needs to wait until that amount of time 
> has passed, unless there's a new request that needs to be sent.
>  
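
For readers skimming the thread, a minimal, self-contained sketch of the exponential 
backoff with jitter described above (illustrative only, not the utility class 
referenced in the PR) could look like:

{code}
import java.util.concurrent.ThreadLocalRandom;

// Sketch: backoff grows exponentially with the number of failed attempts,
// is capped at a maximum, and is randomized by a jitter factor so retries
// from many clients do not synchronize.
public final class ExponentialBackoffSketch {
    private final long initialMs;
    private final long maxMs;
    private final double multiplier;
    private final double jitter; // e.g. 0.2 means +/-20%

    public ExponentialBackoffSketch(long initialMs, long maxMs, double multiplier, double jitter) {
        this.initialMs = initialMs;
        this.maxMs = maxMs;
        this.multiplier = multiplier;
        this.jitter = jitter;
    }

    public long backoff(int failedAttempts) {
        double exp = initialMs * Math.pow(multiplier, Math.max(0, failedAttempts));
        double capped = Math.min(exp, maxMs);
        double rand = 1 + jitter * (2 * ThreadLocalRandom.current().nextDouble() - 1);
        return (long) (capped * rand);
    }
}
{code}

In this sketch, setting initialMs equal to maxMs (and jitter to zero) caps the growth 
immediately, which mirrors the trick mentioned above of configuring retry.backoff.ms 
and retry.backoff.max.ms to the same value to keep existing tests deterministic.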



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [kafka-site] C0urante merged pull request #538: MINOR: Add missing entries for Kafka Connect to the documentation's table of contents

2023-09-05 Thread via GitHub


C0urante merged PR #538:
URL: https://github.com/apache/kafka-site/pull/538


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: KIP-976: Cluster-wide dynamic log adjustment for Kafka Connect

2023-09-05 Thread Chris Egerton
Hi Sagar,

Thanks for your thoughts! Responses inline:

> 1) Considering that you have now clarified here that last_modified field
would be stored in-memory, it is not mentioned in the KIP. Although that's
something that's understandable, it wasn't apparent when reading the KIP.
Probably, we could mention it? Also, what happens if a worker restarts? In
that case, since the log level update message would be pre-dating the
startup of the worker, it would be ignored? We should probably mention that
behaviour as well IMO.

I've tweaked the second paragraph of the "Last modified timestamp" section
to try to clarify this without getting too verbose: "Modification times
will be tracked in-memory and determined by when they are applied by the
worker, as opposed to when they are requested by the user or persisted to
the config topic (details below). If no modifications to the namespace have
been made since the worker finished startup, they will be null." Does that
feel sufficient? It's a little indirect on the front of worker restart
behavior, so let me know if that especially should be fleshed out a bit
more (possibly by calling it out in the "Config topic records" section).

> 2) Staying on the last_modified field, its utility is also not too clear
to me. Can you add some examples of how it can be useful for debugging etc.?

The cluster-wide API relaxes the consistency guarantees of the existing
worker-local API. With the latter, users can be certain that once they
receive a 2xx response, the logging level on that worker has been updated.
With the former, users know that the logging level will eventually be
updated, but insight into the propagation of that update across the cluster
is limited. Although it's a little primitive, I'm hoping that the last
modified timestamp will be enough to help shed some light on this process.
We could also explore exposing the provenance of logging levels (which maps
fairly cleanly to the scope of the request from which they originated), but
that feels like overkill at the moment.

> 3) In the test plan, should we also add a test when the scope parameter
passed is non-null and neither worker nor cluster? In this case the
behaviour should be similar to the default case.

Good call! Done.

> 4) I had the same question as Yash regarding persistent cluster-wide
logging level. I think you have explained it well and we can skip it for
now.

Thanks, I'm glad that this seems reasonable. I've tentatively added this to
the rejected alternatives section, but am happy to continue the
conversation if anyone wants to explore that possibility further.

Cheers,

Chris

On Tue, Sep 5, 2023 at 11:48 AM Sagar  wrote:

> Hey Chris,
>
> Thanks for the KIP. Seems like a useful feature. I have some more
> questions/comments:
>
> 1) Considering that you have now clarified here that last_modified field
> would be stored in-memory, it is not mentioned in the KIP. Although that's
> something that's understandable, it wasn't apparent when reading the KIP.
> Probably, we could mention it? Also, what happens if a worker restarts? In
> that case, since the log level update message would be pre-dating the
> startup of the worker, it would be ignored? We should probably mention that
> behaviour as well IMO.
>
> 2) Staying on the last_modified field, its utility is also not too clear
> to me. Can you add some examples of how it can be useful for debugging etc.?
>
> 3) In the test plan, should we also add a test when the scope parameter
> passed is non-null and neither worker nor cluster? In this case the
> behaviour should be similar to the default case.
>
> 4) I had the same question as Yash regarding persistent cluster-wide
> logging level. I think you have explained it well and we can skip it for
> now.
>
> Thanks!
> Sagar.
>
> On Tue, Sep 5, 2023 at 8:49 PM Chris Egerton 
> wrote:
>
> > Hi all,
> >
> > Thank you so much for the generous review comments! Happy to see interest
> > in this feature. Inline responses follow.
> >
> >
> > Ashwin:
> >
> > > Don't you foresee a future scope type which may require cluster
> metadata
> > ?
> > In that case, isn't it better to forward the requests to the leader in
> the
> > initial implementation ?
> >
> > I agree with Yash here: we can cross that bridge when we come to it,
> unless
> > there are problems that we'd encounter then that would be better
> addressed
> > by adding request forwarding now. One potential argument I can think of
> is
> > that the UX would be a little strange if the semantics for this endpoint
> > differ depending on the value of the scope parameter (some values would
> be
> > affected by in-progress rebalances, and some would not), but I don't know
> > if making scope=cluster more brittle in the name of consistency with,
> e.g.,
> > scope=connectorType:foo is really a worthwhile tradeoff. Thoughts?
> >
> > > I would also recommend an additional system test for Standalone herder
> to
> > ensure that the new scope parameter is honored and the 

[GitHub] [kafka-site] yashmayya commented on pull request #538: MINOR: Add missing entries for Kafka Connect to the documentation's table of contents

2023-09-05 Thread via GitHub


yashmayya commented on PR #538:
URL: https://github.com/apache/kafka-site/pull/538#issuecomment-1707109599

   cc @C0urante 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka-site] yashmayya opened a new pull request, #538: MINOR: Add missing entries for Kafka Connect to the documentation's table of contents

2023-09-05 Thread via GitHub


yashmayya opened a new pull request, #538:
URL: https://github.com/apache/kafka-site/pull/538

   - https://github.com/apache/kafka/pull/14337
   - Some of Kafka Connect's top level headings (``) and sub top level 
headings (``) in the documentation weren't added to the documentation's 
table of contents. This patch rectifies that.
   
   Before:
   
   https://github.com/apache/kafka/assets/23502577/7a0d6425-05d0-4ebc-b62f-6495e300aa27
   
   
   After:
   
   https://github.com/apache/kafka/assets/23502577/f0f71e02-06c2-4ea1-9d65-376e09f9cd6f


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.6 #22

2023-09-05 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-15293) Update metrics doc to add tiered storage metrics

2023-09-05 Thread Satish Duggana (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Duggana resolved KAFKA-15293.

Resolution: Fixed

> Update metrics doc to add tiered storage metrics
> 
>
> Key: KAFKA-15293
> URL: https://issues.apache.org/jira/browse/KAFKA-15293
> Project: Kafka
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Abhijeet Kumar
>Assignee: Abhijeet Kumar
>Priority: Critical
> Fix For: 3.6.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2173

2023-09-05 Thread Apache Jenkins Server
See 




RE: Need more clarity in documentation for upgrade/downgrade procedures and limitations across releases.

2023-09-05 Thread Kaushik Srinivas (Nokia)
Hi Team,
Any help on this query ?
Thanks.

From: Kaushik Srinivas (Nokia)
Sent: Tuesday, August 22, 2023 10:27 AM
To: dev@kafka.apache.org
Subject: Need more clarity in documentation for upgrade/downgrade procedures 
and limitations across releases.


Hi Team,

Referring to the upgrade documentation for apache kafka.

https://kafka.apache.org/34/documentation.html#upgrade_3_4_0

There is some confusion with respect to the statements below, taken from the 
above-linked section of the Apache Kafka docs.

"If you are upgrading from a version prior to 2.1.x, please see the note below 
about the change to the schema used to store consumer offsets. Once you have 
changed the inter.broker.protocol.version to the latest version, it will not be 
possible to downgrade to a version prior to 2.1."

The above statement mentions that the downgrade would not be possible to 
version prior to "2.1" in case of "upgrading the inter.broker.protocol.version 
to the latest version".

But, there is another statement made in the documentation in point 4 as below

"Restart the brokers one by one for the new protocol version to take effect. 
Once the brokers begin using the latest protocol version, it will no longer be 
possible to downgrade the cluster to an older version."



These two statements are repeated across many prior releases of Kafka and 
are confusing.

Below are the questions:

  1.  Is a downgrade not possible to "any" older version of Kafka once the 
inter.broker.protocol.version is updated to the latest version, or are downgrades 
only impossible to versions prior to 2.1?
  2.  Suppose one takes an approach similar to the upgrade even for the downgrade 
path, i.e. first downgrade the inter.broker.protocol.version to the previous 
version, then downgrade the Kafka software/code to the previous release 
revision. Does a downgrade work with this approach?

Can these two questions be documented if the results are already known ?

Regards,
Kaushik.



Re: Apache Kafka 3.6.0 release

2023-09-05 Thread David Arthur
Hi Satish. Thanks for running the release!

I'd like to raise this as a blocker for 3.6
https://issues.apache.org/jira/browse/KAFKA-15435.

It's a very quick fix, so I should be able to post a PR soon.

Thanks!
David

On Mon, Sep 4, 2023 at 11:44 PM Justine Olshan 
wrote:

> Thanks Satish. This is done 
>
> Justine
>
> On Mon, Sep 4, 2023 at 5:16 PM Satish Duggana 
> wrote:
>
> > Hey Justine,
> > I went through KAFKA-15424 and the PR[1]. It seems there are no
> > dependent changes missing in 3.6 branch. They seem to be low risk as
> > you mentioned. Please merge it to the 3.6 branch as well.
> >
> > 1. https://github.com/apache/kafka/pull/14324.
> >
> > Thanks,
> > Satish.
> >
> > On Tue, 5 Sept 2023 at 05:06, Justine Olshan
> >  wrote:
> > >
> > > Sorry I meant to add the jira as well.
> > > https://issues.apache.org/jira/browse/KAFKA-15424
> > >
> > > Justine
> > >
> > > On Mon, Sep 4, 2023 at 4:34 PM Justine Olshan 
> > wrote:
> > >
> > > > Hey Satish,
> > > >
> > > > I was working on adding dynamic configuration for
> > > > transaction verification. The PR is approved and ready to merge into
> > trunk.
> > > > I was thinking I could also add it to 3.6 since it is fairly low
> risk.
> > > > What do you think?
> > > >
> > > > Justine
> > > >
> > > > On Sat, Sep 2, 2023 at 6:21 PM Sophie Blee-Goldman <
> > ableegold...@gmail.com>
> > > > wrote:
> > > >
> > > >> Thanks Satish! The fix has been merged and cherrypicked to 3.6
> > > >>
> > > >> On Sat, Sep 2, 2023 at 6:02 AM Satish Duggana <
> > satish.dugg...@gmail.com>
> > > >> wrote:
> > > >>
> > > >> > Hi Sophie,
> > > >> > Please feel free to add that to 3.6 branch as you say this is a
> > minor
> > > >> > change and will not cause any regressions.
> > > >> >
> > > >> > Thanks,
> > > >> > Satish.
> > > >> >
> > > >> > On Sat, 2 Sept 2023 at 08:44, Sophie Blee-Goldman
> > > >> >  wrote:
> > > >> > >
> > > >> > > Hey Satish, someone reported a minor bug in the Streams
> > application
> > > >> > > shutdown which was a recent regression, though not strictly a
> new
> > one:
> > > >> > was
> > > >> > > introduced in 3.4 I believe.
> > > >> > >
> > > >> > > The fix seems to be super lightweight and low-risk so I was
> > hoping to
> > > >> > slip
> > > >> > > it into 3.6 if that's ok with you? They plan to have the patch
> > > >> tonight.
> > > >> > >
> > > >> > > https://issues.apache.org/jira/browse/KAFKA-15429
> > > >> > >
> > > >> > > On Thu, Aug 31, 2023 at 5:45 PM Satish Duggana <
> > > >> satish.dugg...@gmail.com
> > > >> > >
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Thanks Chris for bringing this issue here and filing the new
> > JIRA
> > > >> for
> > > >> > > > 3.6.0[1]. It seems to be a blocker for 3.6.0.
> > > >> > > >
> > > >> > > > Please help review https://github.com/apache/kafka/pull/14314
> > as
> > > >> Chris
> > > >> > > > requested.
> > > >> > > >
> > > >> > > > 1. https://issues.apache.org/jira/browse/KAFKA-15425
> > > >> > > >
> > > >> > > > ~Satish.
> > > >> > > >
> > > >> > > > On Fri, 1 Sept 2023 at 03:59, Chris Egerton
> >  > > >> >
> > > >> > > > wrote:
> > > >> > > > >
> > > >> > > > > Hi all,
> > > >> > > > >
> > > >> > > > > Quick update: I've filed a separate ticket,
> > > >> > > > > https://issues.apache.org/jira/browse/KAFKA-15425, to track
> > the
> > > >> > behavior
> > > >> > > > > change in Admin::listOffsets. For the full history of the
> > ticket,
> > > >> > it's
> > > >> > > > > worth reading the comment thread on the old ticket at
> > > >> > > > > https://issues.apache.org/jira/browse/KAFKA-12879.
> > > >> > > > >
> > > >> > > > > I've also published
> > https://github.com/apache/kafka/pull/14314
> > > >> as a
> > > >> > > > fairly
> > > >> > > > > lightweight PR to revert the behavior of Admin::listOffsets
> > > >> without
> > > >> > also
> > > >> > > > > reverting the refactoring to use the internal admin driver
> > API.
> > > >> Would
> > > >> > > > > appreciate a review on that if anyone can spare the cycles.
> > > >> > > > >
> > > >> > > > > Cheers,
> > > >> > > > >
> > > >> > > > > Chris
> > > >> > > > >
> > > >> > > > > On Wed, Aug 30, 2023 at 1:01 PM Chris Egerton <
> > chr...@aiven.io>
> > > >> > wrote:
> > > >> > > > >
> > > >> > > > > > Hi Satish,
> > > >> > > > > >
> > > >> > > > > > Wanted to let you know that KAFKA-12879 (
> > > >> > > > > > https://issues.apache.org/jira/browse/KAFKA-12879), a
> > breaking
> > > >> > change
> > > >> > > > in
> > > >> > > > > > Admin::listOffsets, has been reintroduced into the code
> > base.
> > > >> > Since we
> > > >> > > > > > haven't yet published a release with this change (at
> least,
> > not
> > > >> the
> > > >> > > > more
> > > >> > > > > > recent instance of it), I was hoping we could treat it as
> a
> > > >> > blocker for
> > > >> > > > > > 3.6.0. I'd also like to solicit the input of people
> familiar
> > > >> with
> > > >> > the
> > > >> > > > admin
> > > >> > > > > > client to weigh in on the Jira ticket about whether we
> > should
> > > >> > continue

[jira] [Created] (KAFKA-15435) KRaft migration record counts in log message are incorrect

2023-09-05 Thread David Arthur (Jira)
David Arthur created KAFKA-15435:


 Summary: KRaft migration record counts in log message are incorrect
 Key: KAFKA-15435
 URL: https://issues.apache.org/jira/browse/KAFKA-15435
 Project: Kafka
  Issue Type: Bug
  Components: kraft
Affects Versions: 3.6.0
Reporter: David Arthur


The counting logic in MigrationManifest is incorrect and produces invalid 
output. This information is critical for users wanting to validate the result 
of a migration.

 
{code}
Completed migration of metadata from ZooKeeper to KRaft. 7117 records were 
generated in 54253 ms across 1629 batches. The record types were 
{TOPIC_RECORD=2, CONFIG_RECORD=2, PARTITION_RECORD=2, 
ACCESS_CONTROL_ENTRY_RECORD=2, PRODUCER_IDS_RECORD=1}. 
{code}

Due to the logic bug, the counts will never exceed 2.
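
(For context, the intended output is a plain frequency count per record type; a 
trivial sketch of such counting, not the MigrationManifest code itself, is shown 
below.)

{code}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: count occurrences of each record type so that the totals
// reported in the log line reflect the actual number of records generated.
public class RecordTypeCounts {
    public static Map<String, Integer> countByType(List<String> recordTypes) {
        Map<String, Integer> counts = new HashMap<>();
        for (String type : recordTypes) {
            counts.merge(type, 1, Integer::sum); // increment, starting from 1
        }
        return counts;
    }
}
{code}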



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: KIP-976: Cluster-wide dynamic log adjustment for Kafka Connect

2023-09-05 Thread Sagar
Hey Chris,

Thanks for the KIP. Seems like a useful feature. I have some more
questions/comments:

1) Considering that you have now clarified here that last_modified field
would be stored in-memory, it is not mentioned in the KIP. Although that's
something that's understandable, it wasn't apparent when reading the KIP.
Probably, we could mention it? Also, what happens if a worker restarts? In
that case, since the log level update message would be pre-dating the
startup of the worker, it would be ignored? We should probably mention that
behaviour as well IMO.

2) Staying on the last_modified field, its utility is also not too clear
to me. Can you add some examples of how it can be useful for debugging etc.?

3) In the test plan, should we also add a test when the scope parameter
passed is non-null and neither worker nor cluster? In this case the
behaviour should be similar to the default case.

4) I had the same question as Yash regarding persistent cluster-wide
logging level. I think you have explained it well and we can skip it for
now.

Thanks!
Sagar.

On Tue, Sep 5, 2023 at 8:49 PM Chris Egerton 
wrote:

> Hi all,
>
> Thank you so much for the generous review comments! Happy to see interest
> in this feature. Inline responses follow.
>
>
> Ashwin:
>
> > Don't you foresee a future scope type which may require cluster metadata
> ?
> In that case, isn't it better to forward the requests to the leader in the
> initial implementation ?
>
> I agree with Yash here: we can cross that bridge when we come to it, unless
> there are problems that we'd encounter then that would be better addressed
> by adding request forwarding now. One potential argument I can think of is
> that the UX would be a little strange if the semantics for this endpoint
> differ depending on the value of the scope parameter (some values would be
> affected by in-progress rebalances, and some would not), but I don't know
> if making scope=cluster more brittle in the name of consistency with, e.g.,
> scope=connectorType:foo is really a worthwhile tradeoff. Thoughts?
>
> > I would also recommend an additional system test for Standalone herder to
> ensure that the new scope parameter is honored and the response contains
> the last modified time.
>
> Ah, great call! I love the new testing plan section. I also share Yash's
> concerns about adding a new system test though (at this point, they're so
> painful to write, test, and debug that in most circumstances I consider
> them a last resort). Do you think it'd be reasonable to add end-to-end
> verification for this with an integration test instead?
>
>
> Yash:
>
> > From the proposed changes section, it isn't very clear to me how we'll be
> tracking this last modified timestamp to be returned in responses for the
> *GET
> /admin/loggers* and *GET /admin/loggers/{logger}* endpoints. Could you
> please elaborate on this? Also, will we track the last modified timestamp
> even for worker scoped modifications where we won't write any records to
> the config topic and the requests will essentially be processed
> synchronously?
>
> RE timestamp tracking: I was thinking we'd store the timestamp for each
> level in-memory and, whenever we change the level for a namespace, update
> its timestamp to the current wallclock time. Concretely, this means we'd
> want the timestamp for some logger `logger` to be as soon as possible after
> the call to `logger.setLevel(level)` for some level `level`. I'm honestly
> unsure how to clarify this further in the KIP; is there anything in there
> that strikes you as particularly ambiguous that we can tweak to be more
> clear?
>
> RE scope distinction for timestamps: I've updated the KIP to clarify this
> point, adding this sentence: "Timestamps will be updated regardless of
> whether the namespace update was applied using scope=worker or
> scope=cluster.". Let me know what you think
>
> > In the current synchronous implementation for the *PUT
> /admin/loggers/{logger} *endpoint, we return a 404 error if the level is
> invalid (i.e. not one of TRACE, DEBUG, WARN etc.). Since the new cluster
> scoped variant of the endpoint will be asynchronous, can we also add a
> validation to synchronously surface erroneous log levels to users?
>
> Good call! I think we don't have to be explicit about this in the proposed
> changes section, but it's a great fit for the testing plan, where I've
> added it: "Ensure that cluster-scoped requests with invalid logging levels
> are rejected with a 404 response"
>
> > I'm curious to know what the rationale here is? In KIP-745, the stated
> reasoning behind ignoring restart requests during worker startup was that
> the worker will anyway be starting all connectors and tasks assigned to it
> so the restart request is essentially meaningless. With the log level API
> however, wouldn't it make more sense to apply any cluster scoped
> modifications to new workers in the cluster too? This also applies to any
> workers that are restarted soon after a 

Re: KIP-976: Cluster-wide dynamic log adjustment for Kafka Connect

2023-09-05 Thread Chris Egerton
Hi all,

Thank you so much for the generous review comments! Happy to see interest
in this feature. Inline responses follow.


Ashwin:

> Don't you foresee a future scope type which may require cluster metadata ?
In that case, isn't it better to forward the requests to the leader in the
initial implementation ?

I agree with Yash here: we can cross that bridge when we come to it, unless
there are problems that we'd encounter then that would be better addressed
by adding request forwarding now. One potential argument I can think of is
that the UX would be a little strange if the semantics for this endpoint
differ depending on the value of the scope parameter (some values would be
affected by in-progress rebalances, and some would not), but I don't know
if making scope=cluster more brittle in the name of consistency with, e.g.,
scope=connectorType:foo is really a worthwhile tradeoff. Thoughts?

> I would also recommend an additional system test for Standalone herder to
ensure that the new scope parameter is honored and the response contains
the last modified time.

Ah, great call! I love the new testing plan section. I also share Yash's
concerns about adding a new system test though (at this point, they're so
painful to write, test, and debug that in most circumstances I consider
them a last resort). Do you think it'd be reasonable to add end-to-end
verification for this with an integration test instead?


Yash:

> From the proposed changes section, it isn't very clear to me how we'll be
tracking this last modified timestamp to be returned in responses for the
*GET
/admin/loggers* and *GET /admin/loggers/{logger}* endpoints. Could you
please elaborate on this? Also, will we track the last modified timestamp
even for worker scoped modifications where we won't write any records to
the config topic and the requests will essentially be processed
synchronously?

RE timestamp tracking: I was thinking we'd store the timestamp for each
level in-memory and, whenever we change the level for a namespace, update
its timestamp to the current wallclock time. Concretely, this means we'd
want the timestamp for some logger `logger` to be as soon as possible after
the call to `logger.setLevel(level)` for some level `level`. I'm honestly
unsure how to clarify this further in the KIP; is there anything in there
that strikes you as particularly ambiguous that we can tweak to be more
clear?
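
As a rough illustration of that idea (not the actual Connect code; the class and 
method names are invented, and it assumes the Log4j 1.x-style API the existing 
endpoint wraps):

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.log4j.Level;
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;

// Sketch: apply a level to a logger namespace and remember, in memory only,
// when it was applied. Restarting the worker naturally clears the map, so
// untouched namespaces report a null last-modified timestamp.
public class LoggerLevelTracker {
    private final Map<String, Long> lastModified = new ConcurrentHashMap<>();

    public void setLevel(String namespace, String level) {
        Logger logger = LogManager.getLogger(namespace);
        logger.setLevel(Level.toLevel(level));
        lastModified.put(namespace, System.currentTimeMillis()); // wallclock time at apply
    }

    public Long lastModified(String namespace) {
        return lastModified.get(namespace); // null if not modified since startup
    }
}
{code}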

RE scope distinction for timestamps: I've updated the KIP to clarify this
point, adding this sentence: "Timestamps will be updated regardless of
whether the namespace update was applied using scope=worker or
scope=cluster.". Let me know what you think

> In the current synchronous implementation for the *PUT
/admin/loggers/{logger} *endpoint, we return a 404 error if the level is
invalid (i.e. not one of TRACE, DEBUG, WARN etc.). Since the new cluster
scoped variant of the endpoint will be asynchronous, can we also add a
validation to synchronously surface erroneous log levels to users?

Good call! I think we don't have to be explicit about this in the proposed
changes section, but it's a great fit for the testing plan, where I've
added it: "Ensure that cluster-scoped requests with invalid logging levels
are rejected with a 404 response"

> I'm curious to know what the rationale here is? In KIP-745, the stated
reasoning behind ignoring restart requests during worker startup was that
the worker will anyway be starting all connectors and tasks assigned to it
so the restart request is essentially meaningless. With the log level API
however, wouldn't it make more sense to apply any cluster scoped
modifications to new workers in the cluster too? This also applies to any
workers that are restarted soon after a request is made to *PUT
/admin/loggers/{logger}?scope=cluster *on another worker. Maybe we could
stack up all the cluster scoped log level modification requests during the
config topic read at worker startup and apply the latest ones per namespace
(assuming older records haven't already been compacted) after we've
finished reading to the end of the config topic?

It's definitely possible to implement persistent cluster-wide logging level
adjustments, with the possible exception that none will be applied until
the worker has finished reading to the end of the config topic. The
rationale for keeping these updates ephemeral is to continue to give
priority to workers' Log4j configuration files, with the underlying
philosophy that this endpoint is still only intended for debugging
purposes, as opposed to cluster-wide configuration. Permanent changes can
already be made by tweaking the Log4j file for a worker and then restarting
it. If a restart is too expensive for a permanent change, then the change
can be applied immediately via the REST API, and staged via the Log4j
configuration file (which will then be used the next time the worker is
restarted, whenever that happens).

We could potentially explore persistent cluster-wide logging adjustments 

Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.6 #21

2023-09-05 Thread Apache Jenkins Server
See 




Re: Re: Re: [DISCUSS] KIP-971 Expose replication-offset-lag MirrorMaker2 metric

2023-09-05 Thread Viktor Somogyi-Vass
Hey Elkhan and hudeqi,

I'm reading your debate around the implementation. I also think a
scheduled task would be better in overall accuracy and performance
(compared to calling endOffsets with every poll).
Hudeqi, do you have any experience of what works best for you in terms of
time intervals? I would think refreshing the metric every 5-10sec would be
overall good and sufficient for the users (as short intervals can be quite
noisy anyways).
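
(For concreteness, a scheduled end-offset refresh along these lines could be 
sketched roughly as below; the class, interval handling and lag arithmetic are 
illustrative only, not the proposed implementation.)

{code}
import java.time.Duration;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;

// Sketch: periodically refresh source-cluster end offsets for the replicated
// partitions so that replication-offset-lag can be computed as
// (source end offset - 1) - last replicated offset, without an extra
// endOffsets/listOffsets call on every poll.
public class EndOffsetRefresher implements AutoCloseable {
    private final Map<TopicPartition, Long> endOffsets = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public EndOffsetRefresher(Consumer<byte[], byte[]> dedicatedConsumer,
                              Set<TopicPartition> partitions,
                              long refreshIntervalMs) {
        // The consumer is only ever used from the single scheduler thread.
        scheduler.scheduleAtFixedRate(
            () -> endOffsets.putAll(dedicatedConsumer.endOffsets(partitions, Duration.ofSeconds(10))),
            0, refreshIntervalMs, TimeUnit.MILLISECONDS);
    }

    public long lag(TopicPartition tp, long lastReplicatedOffset) {
        Long end = endOffsets.get(tp);
        return end == null ? 0L : Math.max(0L, end - 1 - lastReplicatedOffset);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
{code}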

Best,
Viktor

On Mon, Sep 4, 2023 at 11:41 AM hudeqi <16120...@bjtu.edu.cn> wrote:

> My approach is to create another thread to regularly request and update
> the end offset of each partition for the `keySet` in the collection
> `lastReplicatedSourceOffsets` mentioned by your kip (if there is no update
> for a long time, it will be removed from `lastReplicatedSourceOffsets`).
> Obviously, such processing makes the calculation of the partition offset
> lag less real-time and less accurate.
> But this also meets our needs, because we need the partition offset lag to
> analyze the replication performance of each task and identify which task may
> have performance problems; and if you monitor the overall offset lag of the
> topic, then using the
> "kafka_consumer_consumer_fetch_manager_metrics_records_lag" metric will be
> more real-time and accurate.
> This is my suggestion; I hope it serves as a starting point so that together
> we can come up with a better solution.
>
> best,
> hudeqi
>
> Elxan Eminov elxanemino...@gmail.com写道:
> > @huqedi replying to your comment on the PR (
> > https://github.com/apache/kafka/pull/14077#discussion_r1314592488),
> quote:
> >
> > "I guess we have a disagreement about lag? My understanding of lag is:
> the
> > real LEO of the source cluster partition minus the LEO that has been
> > written to the target cluster. It seems that your definition of lag is:
> the
> > lag between the mirror task getting data from consumption and writing it
> to
> > the target cluster?"
> >
> > Yes, this is the case. I've missed the fact that the consumer itself
> might
> > be lagging behind the actual data in the partition.
> > I believe your definition of the lag is more precise, but:
> > Implementing it this way will come at the cost of an extra listOffsets
> > request, introducing the overhead that you mentioned in your initial
> > comment.
> >
> > If you have enough insights about this, what would you say is the chances
> > of the task consumer lagging behind the LEO of the partition?
> > Are they big enough to justify the extra call to listOffsets?
> > @Viktor,  any thoughts?
> >
> > Thanks,
> > Elkhan
> >
> > On Mon, 4 Sept 2023 at 09:36, Elxan Eminov 
> wrote:
> >
> > > I already have the PR for this so if it will make it easier to discuss,
> > > feel free to take a look: https://github.com/apache/kafka/pull/14077
> > >
> > > On Mon, 4 Sept 2023 at 09:17, hudeqi <16120...@bjtu.edu.cn> wrote:
> > >
> > >> But does the offset of the last `ConsumerRecord` obtained in poll not
> > >> only represent the offset of this record in the source cluster? It
> seems
> > >> that it cannot represent the LEO of the source cluster for this
> partition.
> > >> I understand that the offset lag introduced here should be the LEO of
> the
> > >> source cluster minus the offset of the last record to be polled?
> > >>
> > >> best,
> > >> hudeqi
> > >>
> > >>
> > >>  -Original Message-
> > >>  From: "Elxan Eminov" 
> > >>  Sent: 2023-09-04 14:52:08 (Monday)
> > >>  To: dev@kafka.apache.org
> > >>  Cc:
> > >>  Subject: Re: [DISCUSS] KIP-971 Expose replication-offset-lag
> > >> MirrorMaker2 metric
> > >> 
> > >> 
> > >
> > >
>


[jira] [Created] (KAFKA-15434) Tiered Storage Quotas

2023-09-05 Thread Satish Duggana (Jira)
Satish Duggana created KAFKA-15434:
--

 Summary: Tiered Storage Quotas 
 Key: KAFKA-15434
 URL: https://issues.apache.org/jira/browse/KAFKA-15434
 Project: Kafka
  Issue Type: Improvement
Reporter: Satish Duggana
Assignee: Abhijeet Kumar






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15433) Follower fetch from tiered offset

2023-09-05 Thread Satish Duggana (Jira)
Satish Duggana created KAFKA-15433:
--

 Summary: Follower fetch from tiered offset 
 Key: KAFKA-15433
 URL: https://issues.apache.org/jira/browse/KAFKA-15433
 Project: Kafka
  Issue Type: Improvement
Reporter: Satish Duggana
Assignee: Abhijeet Kumar






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-858: Handle JBOD broker disk failure in KRaft

2023-09-05 Thread ziming deng
Hi, Igor
I’m +1(binding) for this, looking forward the PR.

--
Best,
Ziming

> On Jul 26, 2023, at 01:13, Igor Soarez  wrote:
> 
> Hi everyone,
> 
> Following a face-to-face discussion with Ron and Colin,
> I have just made further improvements to this KIP:
> 
> 
> 1. Every log directory gets a random UUID assigned, even if just one
>   log dir is configured in the Broker.
> 
> 2. All online log directories are registered, even if just one if configured.
> 
> 3. Partition-to-directory assignments are only performed if more than
>   one log directory is configured/registered.
> 
> 4. A successful reply from the Controller to a AssignReplicasToDirsRequest
>   is taken as a guarantee that the metadata changes are
>   successfully persisted.
> 
> 5. Replica assignments that refer log directories pending a failure
>   notification are prioritized to guarantee the Controller and Broker
>   agree on the assignments before acting on the failure notification.
> 
> 6. The transition from one log directory to multiple log directories
>   relies on a logical update to efficiently update directory assignments
>   to the previously registered single log directory when that's possible.
> 
> I have also introduced a configuration for the maximum time the broker
> will keep trying to send a log directory notification before shutting down.
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-858%3A+Handle+JBOD+broker+disk+failure+in+KRaft
> 
> Best,
> 
> --
> Igor
> 



[jira] [Created] (KAFKA-15432) RLM Stop partitions should not be invoked for non-tiered storage topics

2023-09-05 Thread Kamal Chandraprakash (Jira)
Kamal Chandraprakash created KAFKA-15432:


 Summary: RLM Stop partitions should not be invoked for non-tiered 
storage topics
 Key: KAFKA-15432
 URL: https://issues.apache.org/jira/browse/KAFKA-15432
 Project: Kafka
  Issue Type: Improvement
Reporter: Kamal Chandraprakash


When a stop-partition request is sent by the controller, it invokes 
RemoteLogManager#stopPartition even for internal and non-tiered-storage-enabled 
topics. The replica manager should not call this method for non-tiered-storage 
topics.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-940: Broker extension point for validating record contents at produce time

2023-09-05 Thread Edoardo Comar
Hi all,

we opened the VOTE thread a few weeks ago, as we hoped that this
DISCUSS thread exchange had been exhaustive.
If so, we would like any interested party to cast a vote there.
Of course we're happy to further progress the KIP discussion if needed.

Thanks
Edo & Adrian

On Wed, 5 Jul 2023 at 16:55, Edoardo Comar  wrote:
>
> Hi Jorge!
>
> On Fri, 30 Jun 2023 at 15:47, Jorge Esteban Quilcate Otoya
>  wrote:
> >
> > Thank you both for the replies! A couple more comments:
> >
> > The current proposal is to have ‘record.validation.policy’ per topic
> > (default null). A flag would be something like
> > ‘record.validation.policy.enable’ (default=false) may be simpler to
> > configure from the user perspective.
> >
> > Also, at the moment, is a bit unclear to me what value the topic config
> > ‘record.validation.policy’ should contain: is the policy class name? How is
> > the policy expected to use the name received?
> >
>
> The 'record.validation.policy' will typically contain a value that is
> meaningful to the policy implementation.
> For example, a schema registry might support different strategies to
> associate a schema with a topic.
> The policy could use this property to determine which strategy is in
> use and then evaluate whether the record is valid.
> We decided to reserve the 'null' value to mean disable validation for
> this topic to avoid the need for introducing a second inter-dependent
> boolean property.
>
> >
> > Thanks! I think adding a simple example of a Policy implementation and how
> > plugin developer may use this hints (and metadata as well) may bring some
> > clarity to the proposal.
> >
>
> We've added a sample to the KIP, hope this helps.
>
> We expect the RecordIntrospectionHints to be a declaration the policy makes,
> which the implementation of the KIP may use to optimise record
> iteration avoiding a full decompression in the case where a message is
> received with compression type matching the topic compression config.
> Currently Kafka optimizes that case by supplying an iterator that does
> not provide access to the record data, only answers hasKey/hasValue
> checks.
>
> HTH,
> best
> Edo & Adrian
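
Purely as a hypothetical illustration of how a policy might interpret the
'record.validation.policy' value described in the quoted reply above (the types
below are invented for the example and are not the KIP's proposed API):

{code}
import java.util.Map;

// Hypothetical record view for illustration only.
interface TopicRecordView {
    boolean hasKey();
    boolean hasValue();
}

public class NamingStrategyPolicy {
    private String strategy;

    // Receives topic-level configs, including record.validation.policy.
    public void configure(Map<String, ?> topicConfigs) {
        this.strategy = (String) topicConfigs.get("record.validation.policy");
    }

    public boolean validate(TopicRecordView record) {
        if (strategy == null) return true; // null reserved to mean validation disabled
        switch (strategy) {
            case "topic-name":  return record.hasValue();                    // schema looked up by topic
            case "record-name": return record.hasKey() && record.hasValue(); // schema looked up per record
            default:            return false;                                // unknown strategy: reject
        }
    }
}
{code}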


[jira] [Resolved] (KAFKA-15261) ReplicaFetcher thread should not block if RLMM is not initialized

2023-09-05 Thread Abhijeet Kumar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhijeet Kumar resolved KAFKA-15261.

Resolution: Fixed

> ReplicaFetcher thread should not block if RLMM is not initialized
> -
>
> Key: KAFKA-15261
> URL: https://issues.apache.org/jira/browse/KAFKA-15261
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Abhijeet Kumar
>Assignee: Abhijeet Kumar
>Priority: Blocker
> Fix For: 3.6.0
>
>
> While building remote log aux state, the replica fetcher fetches the remote 
> log segment metadata. If the TBRLMM is not initialized yet, the call blocks. 
> Since replica fetchers share a common lock, it prevents other replica 
> fetchers from running as well. Also the same lock is shared in the handle 
> LeaderAndISR request path, hence those calls get blocked as well.
> Instead, replica fetcher should check if RLMM is initialized before 
> attempting to fetch the remote log segment metadata. If RLMM is not 
> initialized, it should throw a retryable error so that it can be retried 
> later, and also does not block other operations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2172

2023-09-05 Thread Apache Jenkins Server
See