[jira] [Resolved] (KAFKA-16472) Integration tests in Java don't really run kraft case

2024-04-05 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-16472.

Resolution: Fixed

> Integration tests in Java don't really run kraft case
> -----------------------------------------------------
>
> Key: KAFKA-16472
> URL: https://issues.apache.org/jira/browse/KAFKA-16472
> Project: Kafka
>  Issue Type: Test
>Reporter: PoAn Yang
>Assignee: PoAn Yang
>Priority: Major
> Fix For: 3.8.0, 3.7.1
>
>
> The following test cases don't really run the kraft case. The reason is that
> the test info doesn't contain parameter names, so TestInfoUtils#isKRaft
> always returns false.
>  * TopicCommandIntegrationTest
>  * DeleteConsumerGroupsTest
>  * AuthorizerIntegrationTest
>  * DeleteOffsetsConsumerGroupCommandIntegrationTest
>  
> We can add `options.compilerArgs += '-parameters'` after 
> [https://github.com/apache/kafka/blob/376e9e20dbf7c7aeb6f6f666d47932c445eb6bd1/build.gradle#L264-L273]
>  to fix it.
>  
> Also, we have to add `String quorum` to cases in 
> DeleteOffsetsConsumerGroupCommandIntegrationTest.
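For context on why the `-parameters` flag matters here: without it, javac does not record formal parameter names in the class file, so reflection (which JUnit relies on when building TestInfo display names) only sees synthetic names like `arg0`. A minimal illustration — the test method below is hypothetical, not actual Kafka code:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

public class ParameterNameCheck {
    // Hypothetical stand-in for a JUnit test method that takes a
    // `String quorum` parameter, as in the Kafka integration tests above.
    @SuppressWarnings("unused")
    public void testCreateTopic(String quorum) { }

    public static void main(String[] args) throws Exception {
        Method m = ParameterNameCheck.class.getMethod("testCreateTopic", String.class);
        Parameter p = m.getParameters()[0];
        // Compiled with `-parameters`, isNamePresent() is true and getName()
        // returns "quorum"; without the flag, getName() falls back to the
        // synthetic "arg0", so any check for a parameter named "quorum" fails.
        System.out.println(p.isNamePresent() ? p.getName() : "arg0");
    }
}
```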



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2786

2024-04-05 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-1033: Add Kafka Streams exception handler for exceptions occuring during processing

2024-04-05 Thread Sophie Blee-Goldman
Hi Damien,

First off thanks for the KIP, this is definitely a much needed feature. On the
whole it seems pretty straightforward and I am in favor of the proposal. Just
a few questions and suggestions here and there:

1. One of the #handle method's parameters is "ProcessorNode node", but
ProcessorNode is an internal class (and would expose a lot of internals
that we probably don't want to pass in to an exception handler). Would it
be sufficient to just make this a String and pass in the processor name?

2. Another of the parameters is the ProcessorContext. This would enable
the handler to potentially forward records, which imo should not be done
from the handler since it could only ever call #forward but not control where
the record is actually forwarded to, and could cause confusion if users
aren't aware that the handler is effectively calling from the context of the
processor that threw the exception.
2a. If you don't explicitly want the ability to forward records, I would
suggest changing the type of this parameter to ProcessingContext, which
has all the metadata and useful info of the ProcessorContext but without the
forwarding APIs. This would also let us sidestep the following issue:
2b. If you *do* want the ability to forward records, setting aside whether that
in and of itself makes sense to do, we would need to pass in either a regular
ProcessorContext or a FixedKeyProcessorContext, depending on what kind
of processor it is. I'm not quite sure how we could design a clean API here,
so I'll hold off until you clarify whether you even want forwarding or not.
We would also need to split the input record into a Record vs FixedKeyRecord.

3. One notable difference between this handler and the existing ones you
pointed out, the Deserialization/ProductionExceptionHandler, is that the
records passed in to those are in serialized bytes, whereas the record
here would be POJOs. You account for this by making the parameter
type a Record, but I just wonder how users would be
able to read the key/value and figure out what type it should be. For
example, would they need to maintain a map from processor name to
input record types?

If you could provide an example of this new feature in the KIP, it would be
very helpful in understanding whether we could do something to make it
easier for users to use, or if it would be fine as-is.

4. We should include all the relevant info for a new metric, such as the metric
group and recording level. You can look at other metrics KIPs like KIP-444
and KIP-613 for an example. I suspect you intend for this to be in the
processor group and at the INFO level?
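To make suggestions 1 and 2a above concrete, here is a rough sketch of what such a handler signature could look like — every name here (ProcessingExceptionHandler, ProcessingHandlerResponse, the plain metadata map) is a placeholder for illustration, not the KIP's proposed API:

```java
import java.util.Map;

public class HandlerSketch {

    public enum ProcessingHandlerResponse { CONTINUE, FAIL }

    // Minimal stand-in for a record flowing through the topology.
    public record StreamRecord(Object key, Object value) { }

    public interface ProcessingExceptionHandler {
        // Suggestion 1: pass the processor *name* rather than the internal
        // ProcessorNode; suggestion 2a: pass only read-only metadata
        // (modeled here as a plain Map) rather than a forwarding context.
        ProcessingHandlerResponse handle(String processorName,
                                         Map<String, Object> metadata,
                                         StreamRecord record,
                                         Exception exception);
    }

    public static void main(String[] args) {
        // A trivial handler that swallows the exception and continues.
        ProcessingExceptionHandler handler =
            (name, meta, rec, e) -> ProcessingHandlerResponse.CONTINUE;
        System.out.println(
            handler.handle("my-processor", Map.of(), new StreamRecord("k", "v"),
                           new RuntimeException("boom")));
    }
}
```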

Hope that all makes sense! Thanks again for the KIP

-Sophie

On Fri, Mar 29, 2024 at 6:16 AM Damien Gasparina 
wrote:

> Hi everyone,
>
> After writing quite a few Kafka Streams applications, me and my colleagues
> just created KIP-1033 to introduce a new Exception Handler in Kafka Streams
> to simplify error handling.
> This feature would allow defining an exception handler to automatically
> catch exceptions occurring during the processing of a message.
>
> KIP link:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1033%3A+Add+Kafka+Streams+exception+handler+for+exceptions+occuring+during+processing
>
> Feedbacks and suggestions are welcome,
>
> Cheers,
> Damien, Sebastien and Loic
>


Re: [DISCUSS] KIP-1020: Deprecate `window.size.ms` in `StreamsConfig`

2024-04-05 Thread Sophie Blee-Goldman
Thanks Matthias -- sorry I forgot to get back to you!

Thanks for kicking off the voting thread on this

On Thu, Mar 28, 2024 at 3:58 AM Matthias J. Sax  wrote:

> Thanks. I think you can start a VOTE.
>
> -Matthias
>
>
> On 3/20/24 9:24 PM, Lucia Cerchie wrote:
> > thanks Sophie, I've made the updates, would appreciate one more look
> before
> > submission
> >
> > On Wed, Mar 20, 2024 at 8:36 AM Sophie Blee-Goldman <
> sop...@responsive.dev>
> > wrote:
> >
> >> A few minor notes on the KIP but otherwise I think you can go ahead and
> >> call for a vote!
> >>
> >> 1. Need to update the Motivation section to say "console clients" or
> >> "console consumer/producer" instead of " plain consumer client"
> >> 2. Remove the first paragraph under "Public Interfaces" (ie the
> KIP-writing
> >> instructions) and also list the new config definitions here. You can
> just
> >> add a code snippet for each class (TimeWindowedDe/Serializer) with the
> >> actual variable definition you'll be adding. Maybe also add a code
> snippet
> >> for StreamsConfig with the @Deprecated annotation added to the two
> configs
> >> we're deprecating
> >> 3. nit: remove the "questions" under "Compatibility, Deprecation, and
> >> Migration Plan", ie the stuff from the KIP-writing template. Just makes
> it
> >> easier to read
> >> 4. In "Test Plan" we should also have a unit test to make sure the new
> >> "windowed.inner.de/serializer.class" takes preference in the case that
> >> both it and the old "windowed.inner.serde.class" config are specified
> >>
> >> Also, this is more of a question than a suggestion, but is the KIP title
> >> perhaps a bit misleading in that people might assume these configs will
> no
> >> longer be available for use at all? What do people think about changing
> it
> >> to something like
> >>
> >> *Move `window.size.ms ` and
> >> `windowed.inner.serde.class` from `StreamsConfig` to
> >> TimeWindowedDe/Serializer class*
> >>
> >> A bit long/clunky but at least it gets the point across about where
> folks
> >> can find these configs going forward.
> >>
> >> On Mon, Mar 18, 2024 at 6:26 PM Lucia Cerchie
> >> 
> >> wrote:
> >>
> >>> Thanks for the discussion!
> >>>
> >>> I've updated the KIP here with what I believe are the relevant pieces,
> >>> please let me know if anything is missing:
> >>>
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=290982804
> >>>
> >>> On Sun, Mar 17, 2024 at 7:09 PM Sophie Blee-Goldman <
> >> sop...@responsive.dev
> 
> >>> wrote:
> >>>
>  Sounds good!
> 
>  @Lucia when you have a moment can you update the KIP with
>  the new proposal, including the details that Matthias pointed
>  out in his last response? After that's done I think you can go
>  ahead and call for a vote whenever you're ready!
> 
>  On Sat, Mar 16, 2024 at 7:35 PM Matthias J. Sax 
> >>> wrote:
> 
> > Thanks for the summary. Sounds right to me. That is what I would
> >>> propose.
> >
> > As you pointed out, we of course still need to support the current
> > confis, and we should log a warning when in use (even if the new one
> >> is
> > in use IMHO) -- but that's more an implementation detail.
> >
> > I agree that the new config should take preference in case both are
> > specified. This should be pointed out in the KIP, as it's an
> >> important
> > contract the user needs to understand.
> >
> >
> > -Matthias
> >
> > On 3/14/24 6:18 PM, Sophie Blee-Goldman wrote:
> >>>
> >>> Should we change it to `.serializer` and `.deserializer`?
> >>
> >> That's a good point -- if we're going to split this up by defining
> >>> the
> >> config
> >> in both the TimeWindowedSerializer and TimeWindowedDeserializer,
> >> then it makes perfect sense to go a step further and actually
> >> define
> >> only the relevant de/serializer class instead of the full serde
> >>
> >> Just to put this all together, it sounds like the proposal is to:
> >>
> >> 1) Deprecate both these configs where they appear in StreamsConfig
> >> (as per the original plan in the KIP, just reiterating it here)
> >>
> >> 2) Don't "define" either config in any specific client's Config
> >>> class,
> >> but just define a String variable with the config name in the
> >>> relevant
> >> de/serializer class, and maybe point people to it in the docs
> >>> somewhere
> >>
> >> 3) We would add three new public String variables for three
> >> different
> >> configs across two classes, specifically:
> >>
> >> In TimeWindowedSerializer:
> >> - define a constant for "windowed.inner.serializer.class"
> >> In TimeWindowedDeserializer:
> >> - define a constant for "windowed.inner.deserializer.class"
> >> - define a constant for "window.size.ms"
> >>
> >> 4) Lastly, we would update the windowed de/serializer
> >> 
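The precedence rule agreed on earlier in this thread — the new config wins, with a warning logged whenever the deprecated one is present — could be sketched like this; the resolution logic is purely illustrative, not the actual StreamsConfig implementation:

```java
import java.util.HashMap;
import java.util.Map;

public class WindowedConfigResolution {
    // Config keys taken from the discussion above.
    static final String NEW_CONFIG = "windowed.inner.serializer.class";
    static final String OLD_CONFIG = "windowed.inner.serde.class";

    static Object resolveInnerClass(Map<String, Object> configs) {
        if (configs.containsKey(OLD_CONFIG)) {
            // Per the discussion, warn whenever the deprecated config is
            // present, even if the new one is also set.
            System.err.println("Warning: '" + OLD_CONFIG + "' is deprecated");
        }
        // The new config takes preference when both are specified.
        if (configs.containsKey(NEW_CONFIG)) {
            return configs.get(NEW_CONFIG);
        }
        return configs.get(OLD_CONFIG);
    }

    public static void main(String[] args) {
        Map<String, Object> configs = new HashMap<>();
        configs.put(OLD_CONFIG, "OldSerde");
        configs.put(NEW_CONFIG, "NewSerializer");
        System.out.println(resolveInnerClass(configs)); // prints NewSerializer
    }
}
```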

Re: [DISCUSS] KIP-990: Capability to SUSPEND Tasks on DeserializationException

2024-04-05 Thread Sophie Blee-Goldman
Thanks for the followup Nick!

This is definitely one of those things that feels like it should be easy
and useful for everyone, but "the devil is in the details" as we always
say. It'll be good to have this thread and the discussion so far should
anyone want to attempt it in the future or need something similar

On Thu, Mar 28, 2024 at 6:03 AM Nick Telford  wrote:

> Hi folks,
>
> Sorry I haven't got back to you until now.
>
> It's become clear that I hadn't anticipated a significant number of
> technical challenges that this KIP presents. I think expecting users to
> understand the ramifications on aggregations, joins and windowing
> ultimately kills it: it only becomes a problem under *specific*
> combinations of operations, and those problems would manifest in ways that
> might be difficult for users to detect, let alone diagnose.
>
> I think it's best to abandon this KIP, at least for now. If anyone else
> sees a use for and path forwards for it, feel free to pick it up.
>
> Since I'm abandoning the KIP, I won't update the Motivation section. But I
> will provide a bit of background here on why I originally suggested it, in
> case you're interested:
>
> In my organisation, we don't have schemas for *any* of our data in Kafka.
> Consequently, one of the biggest causes of downtime in our applications are
> "bad" records being written by Producers. We integrate with a lot of
> third-party APIs, and have Producers that just push that data straight to
> Kafka with very little validation. I've lost count of the number of times
> my application has been crashed by a deserialization exception because we
> received a record that looks like '{"error": "Bad gatetway"}' or similar,
> instead of the actual payload we expect.
>
> The difficulty is we can't just use CONTINUE to discard these messages,
> because we also sometimes get deserialization exceptions caused by an
> upstream schema change that is incompatible with the expectations of our
> app. In these cases, we don't want to discard records (which are
> technically valid), but instead need to adjust our application to be
> compatible with the new schema, before processing them.
>
> Crucially, we use a monolithic app, with more than 45 sub-topologies, so
> crashing the entire app just because of one bad record causes downtime on
> potentially unrelated sub-topologies.
>
> This was the motivation for this KIP, which would have enabled users to
> make a decision on what to do about a bad message, *without taking down the
> entire application*.
>
> Obviously, the *correct* solution to this problem is to introduce schemas
> on our topics and have our Producers correctly validate records before
> writing them to the cluster. This is ultimately the solution I am going to
> pursue in lieu of this KIP.
>
> I still think this KIP could have been useful for dealing with an
> incompatible upstream schema change; by pausing only the sub-topologies
> that are affected by the schema change, while leaving others to continue to
> run while the user deploys a fix. However, in practice I think few users
> have monolithic apps like ours, and most instead de-couple unrelated topics
> via different apps, which reduces the impact of incompatible upstream
> schema changes.
>
> Thanks for your reviews and feedback, I've learned a lot, as always; this
> time, mostly about how, when authoring a KIP,  I should always ask myself:
> "yes, but what about timestamp ordering?" :-D
>
> Nick
>
> On Thu, 14 Mar 2024 at 03:27, Sophie Blee-Goldman 
> wrote:
>
> > >
> > > Well, the KIP mentions the ability to either re-try the record (eg,
> > > after applying some external fix that would allow Kafka Streams to now
> > > deserialize the record now) or to skip it by advancing the offset.
> >
> >
> > That's fair -- you're definitely right that what's described in the KIP
> > document
> > right now would not be practical. I just wanted to clarify that this
> > doesn't
> > mean the feature as a whole is impractical, but certainly we'd want to
> > update the proposal to remove the line about resetting offsets via
> external
> > tool and come up with a more concrete approach, and perhaps  describe
> > it in more detail.
> >
> > That's  probably not worth getting into until/unless we decide whether to
> > go forward with this feature in the first place. I'll let Nick reflect on
> > the
> > motivation and your other comments and then decide whether he still
> > wants to pursue it.
> >
> > To Nick: if you want to go through with this KIP and can expand on the
> > motivation so that we understand it better, I'd be happy to help work
> > out the details. For now I'll just wait for your decision
> >
> > On Wed, Mar 13, 2024 at 10:24 AM Matthias J. Sax 
> wrote:
> >
> > > Yes, about the "drop records" case. It's a very common scenario to have
> > > a repartition step before a windowed aggregation or a join with
> > > grace-period.
> > >
> > >
> > > About "add feature vs guard users": it's always a tricky 
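The two failure modes Nick describes above — discardable third-party junk versus records that fail because of an upstream schema change — could in principle be told apart by a handler along these lines. This is an illustrative heuristic with hypothetical names only, not the Kafka Streams DeserializationExceptionHandler API:

```java
import java.nio.charset.StandardCharsets;

public class BadRecordTriage {
    enum Response { CONTINUE, FAIL }

    // Illustrative heuristic only: skip records that look like the
    // third-party error envelopes described above (e.g. {"error": ...}),
    // but fail fast on anything else, since it may be a schema change that
    // needs an application fix rather than a record to discard.
    static Response onDeserializationError(byte[] rawValue) {
        String text = new String(rawValue, StandardCharsets.UTF_8).trim();
        if (text.startsWith("{") && text.contains("\"error\"")) {
            return Response.CONTINUE; // known-bad upstream payload: drop it
        }
        return Response.FAIL; // possibly a schema change: stop and investigate
    }

    public static void main(String[] args) {
        System.out.println(onDeserializationError(
            "{\"error\": \"Bad gateway\"}".getBytes(StandardCharsets.UTF_8)));
        System.out.println(onDeserializationError(new byte[] {0x00, 0x01}));
    }
}
```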

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-05 Thread Sophie Blee-Goldman
Cool, looks good to me!

Seems like there is no further feedback, so maybe we can start to call for
a vote?

However, since as noted we are setting aside time to discuss this during
the sync next Thursday, we can also wait until after that meeting to
officially kick off the vote.

On Fri, Apr 5, 2024 at 12:19 PM Rohan Desai  wrote:

> Thanks for the feedback Sophie!
>
> re1: Totally agree. The fact that it's related to the partition assignor is
> clear from just `task.assignor`. I'll update.
> re3: This is a good point, and something I would find useful personally. I
> think its worth adding an interface that lets the plugin observe the final
> assignment. I'll add that.
> re4: I like the new `NodeAssignment` type. I'll update the KIP with that.
>
> On Thu, Nov 9, 2023 at 11:18 PM Rohan Desai 
> wrote:
>
> > Thanks for the feedback so far! I think pretty much all of it is
> > reasonable. I'll reply to it inline:
> >
> > > 1. All the API logic is granular at the Task level, except the
> > previousOwnerForPartition func. I’m not clear what’s the motivation
> behind
> > it, does our controller also want to change how the partitions->tasks
> > mapping is formed?
> > You're right that this is out of place. I've removed this method as it's
> > not needed by the task assignor.
> >
> > > 2. Just on the API layering itself: it feels a bit weird to have the
> > three built-in functions (defaultStandbyTaskAssignment etc) sitting in
> the
> > ApplicationMetadata class. If we consider them as some default util
> > functions, how about moving those into their own static util
> > methods to separate from the ApplicationMetadata “fact objects” ?
> > Agreed. Updated in the latest revision of the kip. These have been moved
> > to TaskAssignorUtils
> >
> > > 3. I personally prefer `NodeAssignment` to be a read-only object
> > containing the decisions made by the assignor, including the
> > requestFollowupRebalance flag. For manipulating the half-baked results
> > inside the assignor itself, maybe we can just be flexible to let users
> use
> > whatever structs / their own classes even, if they like. WDYT?
> > Agreed. Updated in the latest version of the kip.
> >
> > > 1. For the API, thoughts on changing the method signature to return a
> > (non-Optional) TaskAssignor? Then we can either have the default
> > implementation return new HighAvailabilityTaskAssignor or just have a
> > default implementation class that people can extend if they don't want to
> > implement every method.
> > Based on some other discussion, I actually decided to get rid of the
> > plugin interface, and instead use config to specify individual plugin
> > behaviour. So the method you're referring to is no longer part of the
> > proposal.
> >
> > > 3. Speaking of ApplicationMetadata, the javadoc says it's read only but
> > there are methods that return void on it? It's not totally clear to me how
> > that interface is supposed to be used by the assignor. It'd be nice if we
> > could flip that interface such that it becomes part of the output instead
> > of an input to the plugin.
> > I've moved those methods to a util class. They're really utility methods
> > the assignor might want to call to do some default or optimized
> assignment
> > for some cases like rack-awareness.
> >
> > > 4. We should consider wrapping UUID in a ProcessID class so that we
> > control
> > the interface (there are a few places where UUID is directly used).
> > I like it. Updated the proposal.
> >
> > > 5. What does NodeState#newAssignmentForNode() do? I thought the point
> > was
> > for the plugin to make the assignment? Is that the result of the default
> > logic?
> > It doesn't need to be part of the interface. I've removed it.
> >
> > > re 2/6:
> >
> > I generally agree with these points, but I'd rather hash that out in a PR
> > than in the KIP review, as it'll be clearer what gets used how. It seems
> to
> > me (committers please correct me if I'm wrong) that as long as we're on
> the
> > same page about what information the interfaces are returning, that's ok
> at
> > this level of discussion.
> >
> > On Tue, Nov 7, 2023 at 12:03 PM Rohan Desai 
> > wrote:
> >
> >> Hello All,
> >>
> >> I'd like to start a discussion on KIP-924 (
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-924%3A+customizable+task+assignment+for+Streams
> )
> >> which proposes an interface to allow users to plug into the streams
> >> partition assignor. The motivation section in the KIP goes into some
> more
> >> detail on why we think this is a useful addition. Thanks in advance for
> >> your feedback!
> >>
> >> Best Regards,
> >>
> >> Rohan
> >>
> >>
>
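Point 4 above (wrapping UUID in a ProcessID class so the interface stays under our control) could look roughly like the following — a hypothetical sketch, not the KIP's final API:

```java
import java.util.UUID;

public class ProcessIdSketch {
    // Hypothetical wrapper: callers depend on ProcessId, not on UUID
    // directly, so the underlying representation can change later without
    // touching the public task-assignor interface.
    public record ProcessId(UUID value) {
        public ProcessId {
            if (value == null) throw new IllegalArgumentException("value must not be null");
        }
        public static ProcessId random() {
            return new ProcessId(UUID.randomUUID());
        }
    }

    public static void main(String[] args) {
        UUID raw = UUID.fromString("00000000-0000-0000-0000-000000000001");
        ProcessId id = new ProcessId(raw);
        // Record equality is by value, so two wrappers of the same UUID match.
        System.out.println(id.equals(new ProcessId(raw)));
    }
}
```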


Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-05 Thread Rohan Desai
Thanks for the feedback Sophie!

re1: Totally agree. The fact that it's related to the partition assignor is
clear from just `task.assignor`. I'll update.
re3: This is a good point, and something I would find useful personally. I
think its worth adding an interface that lets the plugin observe the final
assignment. I'll add that.
re4: I like the new `NodeAssignment` type. I'll update the KIP with that.

On Thu, Nov 9, 2023 at 11:18 PM Rohan Desai  wrote:

> Thanks for the feedback so far! I think pretty much all of it is
> reasonable. I'll reply to it inline:
>
> > 1. All the API logic is granular at the Task level, except the
> previousOwnerForPartition func. I’m not clear what’s the motivation behind
> it, does our controller also want to change how the partitions->tasks
> mapping is formed?
> You're right that this is out of place. I've removed this method as it's
> not needed by the task assignor.
>
> > 2. Just on the API layering itself: it feels a bit weird to have the
> three built-in functions (defaultStandbyTaskAssignment etc) sitting in the
> ApplicationMetadata class. If we consider them as some default util
> functions, how about moving those into their own static util
> methods to separate from the ApplicationMetadata “fact objects” ?
> Agreed. Updated in the latest revision of the kip. These have been moved
> to TaskAssignorUtils
>
> > 3. I personally prefer `NodeAssignment` to be a read-only object
> containing the decisions made by the assignor, including the
> requestFollowupRebalance flag. For manipulating the half-baked results
> inside the assignor itself, maybe we can just be flexible to let users use
> whatever structs / their own classes even, if they like. WDYT?
> Agreed. Updated in the latest version of the kip.
>
> > 1. For the API, thoughts on changing the method signature to return a
> (non-Optional) TaskAssignor? Then we can either have the default
> implementation return new HighAvailabilityTaskAssignor or just have a
> default implementation class that people can extend if they don't want to
> implement every method.
> Based on some other discussion, I actually decided to get rid of the
> plugin interface, and instead use config to specify individual plugin
> behaviour. So the method you're referring to is no longer part of the
> proposal.
>
> > 3. Speaking of ApplicationMetadata, the javadoc says it's read only but
> there are methods that return void on it? It's not totally clear to me how
> that interface is supposed to be used by the assignor. It'd be nice if we
> could flip that interface such that it becomes part of the output instead
> of an input to the plugin.
> I've moved those methods to a util class. They're really utility methods
> the assignor might want to call to do some default or optimized assignment
> for some cases like rack-awareness.
>
> > 4. We should consider wrapping UUID in a ProcessID class so that we
> control
> the interface (there are a few places where UUID is directly used).
> I like it. Updated the proposal.
>
> > 5. What does NodeState#newAssignmentForNode() do? I thought the point
> was
> for the plugin to make the assignment? Is that the result of the default
> logic?
> It doesn't need to be part of the interface. I've removed it.
>
> > re 2/6:
>
> I generally agree with these points, but I'd rather hash that out in a PR
> than in the KIP review, as it'll be clearer what gets used how. It seems to
> me (committers please correct me if I'm wrong) that as long as we're on the
> same page about what information the interfaces are returning, that's ok at
> this level of discussion.
>
> On Tue, Nov 7, 2023 at 12:03 PM Rohan Desai 
> wrote:
>
>> Hello All,
>>
>> I'd like to start a discussion on KIP-924 (
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-924%3A+customizable+task+assignment+for+Streams)
>> which proposes an interface to allow users to plug into the streams
>> partition assignor. The motivation section in the KIP goes into some more
>> detail on why we think this is a useful addition. Thanks in advance for
>> your feedback!
>>
>> Best Regards,
>>
>> Rohan
>>
>>


[jira] [Created] (KAFKA-16479) Add pagination supported describeTopic interface

2024-04-05 Thread Calvin Liu (Jira)
Calvin Liu created KAFKA-16479:
--

 Summary: Add pagination supported  describeTopic interface
 Key: KAFKA-16479
 URL: https://issues.apache.org/jira/browse/KAFKA-16479
 Project: Kafka
  Issue Type: Sub-task
Reporter: Calvin Liu


During the DescribeTopicPartitions API implementation, we found it awkward to
place the pagination logic within the current admin client describe topic
interface. So, in order to change the interface, we may need to have a broader
discussion, such as creating a KIP. Or, going a step further, we could discuss
a general client-side pagination framework.
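For illustration, a general client-side pagination shape of the kind alluded to here might use an opaque continuation token — this is purely a sketch, not a proposal for the actual admin client API:

```java
import java.util.List;
import java.util.Optional;

public class PaginationSketch {
    // Illustrative page type: a batch of results plus an opaque token for
    // fetching the next page; an empty token means the listing is complete.
    public record Page<T>(List<T> items, Optional<String> nextToken) { }

    // Toy "server": pages over a fixed topic list, using the index encoded
    // in the token as the cursor.
    static Page<String> describeTopics(List<String> topics, Optional<String> token, int pageSize) {
        int start = token.map(Integer::parseInt).orElse(0);
        int end = Math.min(start + pageSize, topics.size());
        Optional<String> next = end < topics.size() ? Optional.of(String.valueOf(end)) : Optional.empty();
        return new Page<>(topics.subList(start, end), next);
    }

    public static void main(String[] args) {
        List<String> topics = List.of("a", "b", "c", "d", "e");
        int fetched = 0;
        Optional<String> token = Optional.empty();
        // The caller loops until the server stops returning a next token.
        do {
            Page<String> page = describeTopics(topics, token, 2);
            fetched += page.items().size();
            token = page.nextToken();
        } while (token.isPresent());
        System.out.println(fetched); // prints 5
    }
}
```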





[jira] [Resolved] (KAFKA-15583) High watermark can only advance if ISR size is larger than min ISR

2024-04-05 Thread Calvin Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Calvin Liu resolved KAFKA-15583.

Resolution: Fixed

> High watermark can only advance if ISR size is larger than min ISR
> --
>
> Key: KAFKA-15583
> URL: https://issues.apache.org/jira/browse/KAFKA-15583
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Calvin Liu
>Assignee: Calvin Liu
>Priority: Major
>
> This is the new high watermark advancement requirement.





Re: [DISCUSS] Apache Kafka 3.6.2 release

2024-04-05 Thread Manikumar
Hi Divij,

Thanks for letting me know. I missed it because the current announcement
email template does not mention it.
Added the 3.6.2 blog entry:
https://kafka.apache.org/blog#apache_kafka_362_release_announcement

Also updated the Release wiki instructions to include the blog details in the
release announcement email.


Thanks,


On Fri, Apr 5, 2024 at 5:34 PM Divij Vaidya  wrote:

> Hey Manikumar
>
> Are we planning to add the blog entry about the 3.6.2 release at
> https://kafka.apache.org/blog ? Asking because I didn't see this included
> in the release announcement.
>
> --
> Divij Vaidya
>
>
>
> On Wed, Mar 20, 2024 at 12:01 PM Manikumar 
> wrote:
>
> > Hi,
> >
> > We have one non-blocker issue to be resolved.
> > https://issues.apache.org/jira/browse/KAFKA-16073
> >
> > I plan to generate the first release candidate later today/tomorrow.
> >
> > Thanks.
> >
> >
> >
> > On Mon, Mar 18, 2024 at 11:28 PM Edoardo Comar 
> > wrote:
> >
> > > Thanks Manikumar, done and marked the issue as resolved
> > >
> > > On Mon, 18 Mar 2024 at 16:30, Manikumar 
> > wrote:
> > > >
> > > > Hi Edoardo,
> > > >
> > > > sure, pls go ahead and cherry-pick the changes to 3.7 and 3.6
> branches.
> > > >
> > > > Thanks,
> > > >
> > > > On Mon, Mar 18, 2024 at 3:53 PM Edoardo Comar  >
> > > wrote:
> > > >
> > > > > Hi Manikumar,
> > > > > https://issues.apache.org/jira/browse/KAFKA-16369
> > > > > is merged in trunk now.
> > > > > can you please cherry-pick it to 3.6.2 ?
> > > > >
> > > > > I didn't see it included in the plan, would you like me to add it ?
> > > > > I'd consider it a blocker since Kafka may stay up and running but
> > > useless
> > > > > ...
> > > > >
> > > > > On Mon, 18 Mar 2024 at 09:52, Manikumar  >
> > > wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I have backported a couple of CVE related fixes to the 3.6
> branch.
> > > > > > https://issues.apache.org/jira/browse/KAFKA-16210
> > > > > > https://issues.apache.org/jira/browse/KAFKA-16322
> > > > > >
> > > > > >
> > > > > > We have one non-blocker issue to be resolved.
> > > > > > https://issues.apache.org/jira/browse/KAFKA-16073
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > On Fri, Mar 15, 2024 at 10:18 AM Ismael Juma 
> > > wrote:
> > > > > >
> > > > > > > Hi team,
> > > > > > >
> > > > > > > We should focus on regressions and cve fixes for releases like
> > > this one
> > > > > > > (second bug fix and it's not the most recently released
> version).
> > > The
> > > > > idea
> > > > > > > is to reduce the risk of regressions since it may well be the
> > last
> > > > > official
> > > > > > > release on this branch.
> > > > > > >
> > > > > > > Ismael
> > > > > > >
> > > > > > > On Thu, Mar 14, 2024, 1:39 AM Divij Vaidya <
> > > divijvaidy...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Manikumar,
> > > > > > > >
> > > > > > > > 1. Can you please take a look at
> > > > > > > > https://github.com/apache/kafka/pull/15490
> > > > > > > > which is a bug fix specific to the 3.6.x branch?
> > > > > > > > 2. Should we do a one-time update of all dependencies in
> 3.6.x
> > > branch
> > > > > > > > before releasing 3.6.2?
> > > > > > > > 3. We fixed quite a lot of flaky tests in 3.7.x. I will see
> if
> > > any
> > > > > > > > backporting is needed to make the release qualification
> easier.
> > > > > > > > 4. There are a large number of bugs reported as impacting 3.6.1
> > > > > > > > [1] Some of them have attached PRs and pending review. Maybe we
> > > > > > > > can request all committers to take a look at the ones which have
> > > > > > > > a PR attached and see if we can close them in the next few days
> > > > > > > > before 3.6.2. Note that this will be on a best-effort basis and
> > > > > > > > won't block release of 3.6.2.
> > > > > > > > 5. Have you looked at the JIRA marked as "bugs" in 3.7 and
> > > > > > > > triaged whether something needs to be backported? Usually it is
> > > > > > > > the responsibility of the reviewer but I have observed that
> > > > > > > > sometimes we forget to backport important ones as well. I can
> > > > > > > > help with this one early next week.
> > > > > > > >
> > > > > > > > [1]
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://issues.apache.org/jira/browse/KAFKA-16222?jql=project%20%3D%20KAFKA%20AND%20issuetype%20%3D%20Bug%20AND%20resolution%20%3D%20Unresolved%20AND%20affectedVersion%20%3D%203.6.1%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Divij Vaidya
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Mar 14, 2024 at 7:55 AM Manikumar <
> > > manikumar.re...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > Here is the release plan for 3.6.2:
> > > > > > > > >
> > > > >
> 

Re: Gentle bump on KAFKA-16371 (Unstable committed offsets after triggering commits where metadata for some partitions are over the limit)

2024-04-05 Thread David Jacot
Thanks, Michal. Let me add it to my review queue.

BR,
David

On Fri, Apr 5, 2024 at 3:29 PM Michał Łowicki  wrote:

> Hi there!
>
> Created https://issues.apache.org/jira/browse/KAFKA-16371 few weeks back
> but there wasn't any attention. Any chance someone knowing that code could
> take a look at the issue found and the proposed fix? Thanks in advance.
>
> --
> BR,
> Michał Łowicki
>


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2784

2024-04-05 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.7 #128

2024-04-05 Thread Apache Jenkins Server
See 




Gentle bump on KAFKA-16371 (Unstable committed offsets after triggering commits where metadata for some partitions are over the limit)

2024-04-05 Thread Michał Łowicki
Hi there!

Created https://issues.apache.org/jira/browse/KAFKA-16371 few weeks back
but there wasn't any attention. Any chance someone knowing that code could
take a look at the issue found and the proposed fix? Thanks in advance.

-- 
BR,
Michał Łowicki


[jira] [Created] (KAFKA-16478) Links for Kafka 3.5.2 release are broken

2024-04-05 Thread Philipp Trulson (Jira)
Philipp Trulson created KAFKA-16478:
---

 Summary: Links for Kafka 3.5.2 release are broken
 Key: KAFKA-16478
 URL: https://issues.apache.org/jira/browse/KAFKA-16478
 Project: Kafka
  Issue Type: Bug
  Components: website
Affects Versions: 3.5.2
Reporter: Philipp Trulson


While trying to update our setup, I noticed that the download links for the
3.5.2 release are broken. They all point to a different host and also contain
an additional `/kafka` in their URL. Compare:

not working:
[https://downloads.apache.org/kafka/kafka/3.5.2/RELEASE_NOTES.html]

working:
[https://archive.apache.org/dist/kafka/3.5.1/RELEASE_NOTES.html]
[https://downloads.apache.org/kafka/3.6.2/RELEASE_NOTES.html]

This goes for all links in the release - archives, checksums, signatures.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Apache Kafka 3.6.2 release

2024-04-05 Thread Divij Vaidya
Hey Manikumar

Are we planning to add the blog entry about the 3.6.2 release at
https://kafka.apache.org/blog ? Asking because I didn't see this included
in the release announcement.

--
Divij Vaidya



On Wed, Mar 20, 2024 at 12:01 PM Manikumar 
wrote:

> Hi,
>
> We have one non-blocker issue to be resolved.
> https://issues.apache.org/jira/browse/KAFKA-16073
>
> I plan to generate the first release candidate later today/tomorrow.
>
> Thanks.
>
>
>
> On Mon, Mar 18, 2024 at 11:28 PM Edoardo Comar 
> wrote:
>
> > Thanks Manikumar, done and marked the issue as resolved
> >
> > On Mon, 18 Mar 2024 at 16:30, Manikumar 
> wrote:
> > >
> > > Hi Edoardo,
> > >
> > > sure, pls go ahead and cherry-pick the changes to 3.7 and 3.6 branches.
> > >
> > > Thanks,
> > >
> > > On Mon, Mar 18, 2024 at 3:53 PM Edoardo Comar 
> > wrote:
> > >
> > > > Hi Manikumar,
> > > > https://issues.apache.org/jira/browse/KAFKA-16369
> > > > is merged in trunk now.
> > > > can you please cherry-pick it to 3.6.2 ?
> > > >
> > > > I didn't see it included in the plan, would you like me to add it ?
> > > > I'd consider it a blocker since Kafka may stay up and running but
> > useless
> > > > ...
> > > >
> > > > On Mon, 18 Mar 2024 at 09:52, Manikumar 
> > wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I have backported a couple of CVE related fixes to the 3.6 branch.
> > > > > https://issues.apache.org/jira/browse/KAFKA-16210
> > > > > https://issues.apache.org/jira/browse/KAFKA-16322
> > > > >
> > > > >
> > > > > We have one non-blocker issue to be resolved.
> > > > > https://issues.apache.org/jira/browse/KAFKA-16073
> > > > >
> > > > > Thanks,
> > > > >
> > > > > On Fri, Mar 15, 2024 at 10:18 AM Ismael Juma 
> > wrote:
> > > > >
> > > > > > Hi team,
> > > > > >
> > > > > > We should focus on regressions and cve fixes for releases like
> > this one
> > > > > > (second bug fix and it's not the most recently released version).
> > The
> > > > idea
> > > > > > is to reduce the risk of regressions since it may well be the
> last
> > > > official
> > > > > > release on this branch.
> > > > > >
> > > > > > Ismael
> > > > > >
> > > > > > On Thu, Mar 14, 2024, 1:39 AM Divij Vaidya <
> > divijvaidy...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Manikumar,
> > > > > > >
> > > > > > > 1. Can you please take a look at
> > > > > > > https://github.com/apache/kafka/pull/15490
> > > > > > > which is a bug fix specific to the 3.6.x branch?
> > > > > > > 2. Should we do a one-time update of all dependencies in 3.6.x
> > branch
> > > > > > > before releasing 3.6.2?
> > > > > > > 3. We fixed quite a lot of flaky tests in 3.7.x. I will see if
> > any
> > > > > > > backporting is needed to make the release qualification easier.
> > > > > > > 4. There are a large number of bugs reported as impacting 3.6.1
> > [1]
> > > > Some
> > > > > > of
> > > > > > > them have attached PRs and pending review. Maybe we can request
> > all
> > > > > > > committers to take a look at the ones which have a PR attached
> > and
> > > > see if
> > > > > > > we can close them in the next few days before 3.6.2. Note that
> > this
> > > > will
> > > > > > be
> > > > > > > on a best-effort basis and won't block release of 3.6.2.
> > > > > > > 5. Have you looked at the JIRA marked as "bugs" in 3.7 and
> > triaged
> > > > > > whether
> > > > > > > something needs to be backported? Usually it is the
> > responsibility
> > > > of the
> > > > > > > reviewer but I have observed that sometimes we forget to
> backport
> > > > > > important
> > > > > > > onces as well. I can help with this one early next week.
> > > > > > >
> > > > > > > [1]
> > > > > > >
> > > > > > >
> > > > > >
> > > >
> >
> https://issues.apache.org/jira/browse/KAFKA-16222?jql=project%20%3D%20KAFKA%20AND%20issuetype%20%3D%20Bug%20AND%20resolution%20%3D%20Unresolved%20AND%20affectedVersion%20%3D%203.6.1%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Divij Vaidya
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Mar 14, 2024 at 7:55 AM Manikumar <
> > manikumar.re...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > Here is the release plan for 3.6.2:
> > > > > > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/Release+plan+3.6.2
> > > > > > > >
> > > > > > > > Currently there is one open non-blocker issue. I plan to
> > generate
> > > > the
> > > > > > > first
> > > > > > > > release candidate
> > > > > > > > once the issue is resolved and no other issues are raised in
> > the
> > > > > > > meantime.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Manikumar
> > > > > > > >
> > > > > > > > On Thu, Mar 14, 2024 at 6:24 AM Satish Duggana <
> > > > > > satish.dugg...@gmail.com
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > +1, Thanks Mani for volunteering.
> > > > > > > > >
> > > > > > > > > On Thu, 14 Mar 2024 at 06:01, Luke 

[jira] [Created] (KAFKA-16477) Detect thread leaked client-metrics-reaper in tests

2024-04-05 Thread Kuan Po Tseng (Jira)
Kuan Po Tseng created KAFKA-16477:
-

 Summary: Detect thread leaked client-metrics-reaper in tests
 Key: KAFKA-16477
 URL: https://issues.apache.org/jira/browse/KAFKA-16477
 Project: Kafka
  Issue Type: Improvement
Reporter: Kuan Po Tseng
Assignee: Kuan Po Tseng


After profiling the Kafka tests, we found that many `client-metrics-reaper` 
threads are not cleaned up after BrokerServer shutdown.

The thread {{client-metrics-reaper}} comes from 
[ClientMetricsManager#expirationTimer|https://github.com/apache/kafka/blob/a2ee0855ee5e73f3a74555d52294bb4acfd28945/server/src/main/java/org/apache/kafka/server/ClientMetricsManager.java#L115],
 and BrokerServer#shutdown doesn't close the ClientMetricsManager, which leaves 
the timer thread running in the background.
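The fix is straightforward in principle: the server's shutdown path must close the manager so its timer thread terminates. Below is a minimal sketch of the pattern using a generic `java.util.Timer` stand-in; it is a hypothetical illustration, not Kafka's actual ClientMetricsManager code.

```java
import java.util.Timer;
import java.util.TimerTask;

// Sketch of the leak: a named reaper thread owned by a manager object.
// Unless close() is called from the server's shutdown path, the timer
// thread outlives the server instance (hypothetical stand-in class).
public class ClientMetricsReaperSketch implements AutoCloseable {
    private final Timer expirationTimer =
        new Timer("client-metrics-reaper", true); // daemon thread

    public void scheduleExpiration(Runnable task, long delayMs) {
        expirationTimer.schedule(new TimerTask() {
            @Override public void run() { task.run(); }
        }, delayMs);
    }

    @Override
    public void close() {
        // Terminates the timer thread; any later schedule() call
        // throws IllegalStateException.
        expirationTimer.cancel();
    }

    public static void main(String[] args) {
        ClientMetricsReaperSketch reaper = new ClientMetricsReaperSketch();
        reaper.scheduleExpiration(() -> { }, 60_000L);
        reaper.close(); // what BrokerServer#shutdown must also do
        try {
            reaper.scheduleExpiration(() -> { }, 1L);
            throw new AssertionError("timer should be cancelled");
        } catch (IllegalStateException expected) {
            System.out.println("reaper closed"); // prints "reaper closed"
        }
    }
}
```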



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-16305) Optimisation in SslTransportLayer:handshakeUnwrap stalls TLS handshake

2024-04-05 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-16305.

Fix Version/s: 3.7.1
   Resolution: Fixed

Pushed 
https://github.com/apache/kafka/commit/633d2f139c403cbbe2912d04f823d74c561dab76 
to the 3.7 branch.

> Optimisation in SslTransportLayer:handshakeUnwrap stalls TLS handshake
> --
>
> Key: KAFKA-16305
> URL: https://issues.apache.org/jira/browse/KAFKA-16305
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.7.0, 3.6.1
>Reporter: Gaurav Narula
>Assignee: Gaurav Narula
>Priority: Major
> Fix For: 3.8.0, 3.7.1
>
>
> Kafka allows users to configure custom {{SSLEngine}} via the 
> {{SslEngineFactory}} interface. There have been attempts to use an OpenSSL 
> based {{SSLEngine}} using {{netty-handler}} over the JDK based implementation 
> for performance reasons.
> While trying to use a Netty/Openssl based SSLEngine, we observe that the 
> server hangs while performing the TLS handshake.  We observe the following 
> logs
> {code}
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0 state NOT_INITIALIZED
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [INFO ] 
> Selector - [SocketServer listenerType=ZK_BROKER, nodeId=101] 
> XX-Gaurav: calling prepare channelId 127.0.0.1:60045-127.0.0.1:60046-0
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0 state NOT_INITIALIZED
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] SSLHandshake NEED_UNWRAP channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0, appReadBuffer pos 0, netReadBuffer pos 0, 
> netWriteBuffer pos 0
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] SSLHandshake handshakeUnwrap channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0 doRead true
> 2024-02-26 01:40:00,118 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] SSLHandshake post handshakeUnwrap: channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0 handshakeStatus NEED_UNWRAP status 
> BUFFER_UNDERFLOW read 0
> 2024-02-26 01:40:00,118 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] SSLHandshake post unwrap NEED_UNWRAP channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0, handshakeResult Status = BUFFER_UNDERFLOW 
> 

[jira] [Reopened] (KAFKA-16305) Optimisation in SslTransportLayer:handshakeUnwrap stalls TLS handshake

2024-04-05 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai reopened KAFKA-16305:


reopen for backport to 3.7

> Optimisation in SslTransportLayer:handshakeUnwrap stalls TLS handshake
> --
>
> Key: KAFKA-16305
> URL: https://issues.apache.org/jira/browse/KAFKA-16305
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.7.0, 3.6.1
>Reporter: Gaurav Narula
>Assignee: Gaurav Narula
>Priority: Major
> Fix For: 3.8.0
>
>
> Kafka allows users to configure custom {{SSLEngine}} via the 
> {{SslEngineFactory}} interface. There have been attempts to use an OpenSSL 
> based {{SSLEngine}} using {{netty-handler}} over the JDK based implementation 
> for performance reasons.
> While trying to use a Netty/Openssl based SSLEngine, we observe that the 
> server hangs while performing the TLS handshake.  We observe the following 
> logs
> {code}
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0 state NOT_INITIALIZED
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [INFO ] 
> Selector - [SocketServer listenerType=ZK_BROKER, nodeId=101] 
> XX-Gaurav: calling prepare channelId 127.0.0.1:60045-127.0.0.1:60046-0
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0 state NOT_INITIALIZED
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] SSLHandshake NEED_UNWRAP channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0, appReadBuffer pos 0, netReadBuffer pos 0, 
> netWriteBuffer pos 0
> 2024-02-26 01:40:00,117 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] SSLHandshake handshakeUnwrap channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0 doRead true
> 2024-02-26 01:40:00,118 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] SSLHandshake post handshakeUnwrap: channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0 handshakeStatus NEED_UNWRAP status 
> BUFFER_UNDERFLOW read 0
> 2024-02-26 01:40:00,118 
> data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
> SslTransportLayer- [SslTransportLayer 
> channelId=127.0.0.1:60045-127.0.0.1:60046-0 
> key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
> remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
> interestOps=1, readyOps=0] SSLHandshake post unwrap NEED_UNWRAP channelId 
> 127.0.0.1:60045-127.0.0.1:60046-0, handshakeResult Status = BUFFER_UNDERFLOW 
> HandshakeStatus = NEED_UNWRAP bytesConsumed = 0 bytesProduced = 0, 
> appReadBuffer pos 0, netReadBuffer pos 0, netWriteBuffer pos 0 
> 
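The BUFFER_UNDERFLOW status in the trace above is the heart of the stall: unwrap() cannot make progress until a complete TLS record is buffered, so the handshake loop must ensure the network read buffer can hold a full record and then read more bytes, rather than spin. Here is a minimal sketch of the buffer-growth half, written as a hypothetical helper rather than Kafka's actual SslTransportLayer logic (the packet size would normally come from SSLSession#getPacketBufferSize).

```java
import java.nio.ByteBuffer;

// BUFFER_UNDERFLOW from SSLEngine.unwrap() means the network buffer does
// not yet hold a complete TLS record: the handler must (a) make sure the
// buffer can hold a full record and (b) read more bytes before retrying.
// Hypothetical helper illustrating step (a) only.
public class UnderflowHelper {
    static ByteBuffer ensurePacketCapacity(ByteBuffer netReadBuffer, int packetSize) {
        if (netReadBuffer.capacity() >= packetSize) {
            return netReadBuffer; // big enough; caller just needs to read more bytes
        }
        // Grow while preserving the partial record already buffered.
        ByteBuffer larger = ByteBuffer.allocate(packetSize);
        netReadBuffer.flip();
        larger.put(netReadBuffer);
        return larger;
    }

    public static void main(String[] args) {
        ByteBuffer small = ByteBuffer.allocate(16);
        small.put(new byte[] {1, 2, 3}); // 3 bytes of a partial TLS record
        ByteBuffer grown = ensurePacketCapacity(small, 16 * 1024);
        System.out.println(grown.capacity() + " " + grown.position()); // prints "16384 3"
    }
}
```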

Re: [DISCUSS] KIP-932: Queues for Kafka

2024-04-05 Thread Andrew Schofield
Hi Tom,
Thanks for your question. I have not spent any time learning about MM2
previously, so please forgive my ignorance.

The question of an ordering requirement on upgrading clusters seems real to me.
As I understand it, MM2 can mirror topic configurations between clusters. I 
think
you’re correct that no topic configurations have been added during the MM2 era
(possibly tiered storage) so it’s not really been a problem so far. I think 
that it
could potentially be a problem with the 4.0 migration. There are several
configuration parameters which are currently marked as deprecated and planning
to be removed in 4.0. So, expecting the set of topic configurations to be
unchanging is not a safe assumption in general.

KIP-848 introduced the concept of dynamic group configuration such as
“group.consumer.heartbeat.interval.ms”. It seems sensible to me that 
administrators
can control key aspects of the behaviour of consumers, rather than needing
each individual consumer to have the same configuration. So, KIP-932
continues on the same path and I think it’s a good one.

I don’t believe MM2 is at all aware of dynamic group configurations and
does not mirror these configurations across clusters. Personally, I’m slightly
on the fence about whether it should. Group configurations are about
ensuring that the consumers in a group have consistent configuration for
timeouts and so on. So, MM2 could mirror these, but then why would it
not also mirror broker configurations?

I think there are 3 things to consider here:

1) Should MM2 mirror group configurations?
Perhaps it should. MM2 also does not know how to mirror the
persistent share-group state. I think there’s an MM2 KIP for share groups
in the future.

2) How should MM2 cope with mismatches in the set of valid
configurations when the clusters are at different versions?
I have no good answer for this and I defer to people with better knowledge
of MM2.

3) Is “group.type” a sensible configuration at all?
A few reviewers of KIP-932 have picked up on “group.type” and I think the
jury is still out on this one. I expect a consensus to emerge before long.

Thanks,
Andrew


> On 3 Apr 2024, at 23:32, Tom Bentley  wrote:
> 
> Hi Andrew (and Omnia),
> 
> Thanks for the KIP. I hope to provide some feedback on this KIP soon, but I
> had a thought on the specific subject of group configs and MM2. If brokers
> validate for known groups configs then doesn't this induce an ordering
> requirement on upgrading clusters: Wouldn't you have to upgrade a
> destination cluster first, in order that it knew about `group.type`,
> otherwise it would reject attempts to configure an unknown group config
> parameter? A similar issue arises wrt topic configs, but this is the first
> instance (that I'm aware of) of a config being added during the MM2 era, so
> perhaps this is a minor problem worth thinking about.
> 
> Cheers,
> 
> Tom
> 
> On Wed, 3 Apr 2024 at 05:31, Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
> 
>> Hi Omnia,
>> Thanks for your questions.
>> 
>> The DR angle on `group.type` is interesting and I had not considered it.
>> The namespace of groups contains
>> both consumer groups and share groups, so I was trying to ensure that
>> which group type was used was
>> deterministic rather than a race to create the first member. There are
>> already other uses of the group protocol
>> such as Kafka Connect, so it’s all a bit confusing even today.
>> 
>> It is actually KIP-848 which introduces configurations for group resources
>> and KIP-932 is just building on
>> the idea. I think that MM2 will need to sync these configurations. The
>> question of whether `group.type` is
>> a sensible configuration I think is separate.
>> 
>> Imagine that we do have `group.type` as a group configuration. How would
>> we end up with groups with
>> the same ID but different types on the two ends of MM2? Assuming that both
>> ends have KIP-932 enabled,
>> either the configuration was not set, and a consumer group was made on one
>> end while a share group was
>> made on the other, OR, the configuration was set but its value changed,
>> and again we get a divergence.
>> 
>> I think that on balance, having `group.type` as a configuration does at
>> least mean there’s a better chance that
>> the two ends of MM2 do agree on the type of group. I’m happy to consider
>> other ways to do this better. The
>> fact that we have different kinds of group in the same namespace is the
>> tricky thing. I think this was possible
>> before this KIP, but it’s much more likely now.
>> 
>> 
>> Onto the question of memory. There are several different parts to this,
>> all of which are distributed across
>> the cluster.
>> 
>> * For the group coordinator, the memory consumption will be affected by
>> the number of groups,
>> the number of members and the number of topic-partitions to be assigned to
>> the members. The
>> group coordinator is concerned with membership and assignment, so the
>> memory per 
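One defensive answer to the version-mismatch question raised in this thread (how MM2 should cope when source and destination support different config sets) is to filter the configs being mirrored down to the keys the destination recognises, so an older destination never rejects the whole request over one unknown key such as `group.type`. This is a hypothetical sketch only; MM2 has no such filter today.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Drop any config key the destination cluster does not recognise before
// mirroring, instead of letting the destination reject the request.
// Hypothetical illustration, not MM2 code.
public class ConfigMirrorFilter {
    static Map<String, String> supportedOnly(Map<String, String> sourceConfigs,
                                             Set<String> destSupportedKeys) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : sourceConfigs.entrySet()) {
            if (destSupportedKeys.contains(e.getKey())) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> src = new LinkedHashMap<>();
        src.put("group.consumer.heartbeat.interval.ms", "5000");
        src.put("group.type", "share"); // unknown to an older destination
        System.out.println(
            supportedOnly(src, Set.of("group.consumer.heartbeat.interval.ms")));
        // prints "{group.consumer.heartbeat.interval.ms=5000}"
    }
}
```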

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2783

2024-04-05 Thread Apache Jenkins Server
See 




Re: [PR] MINOR: make ZK migrating to KRaft feature as production ready [kafka-site]

2024-04-05 Thread via GitHub


showuon merged PR #594:
URL: https://github.com/apache/kafka-site/pull/594


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-1023: Follower fetch from tiered offset

2024-04-05 Thread Christo Lolov
Hello Abhijeet and Jun,

I have been mulling this KIP over a bit more in recent days!

re: Jun

I wasn't aware we apply 2.1 and 2.2 for reserving new timestamps - in
retrospect it should have been fairly obvious. I would need to go and update
KIP-1005 myself then, thank you for giving the useful reference!

4. I think Abhijeet wants to rebuild state from latest-tiered-offset and
fetch from latest-tiered-offset + 1 only for new replicas (or replicas
which experienced a disk failure) to decrease the time a partition spends
in under-replicated state. In other words, a follower which has just fallen
out of ISR, but has local data will continue using today's Tiered Storage
replication protocol (i.e. fetching from earliest-local). I further believe
he has taken this approach so that local state of replicas which have just
fallen out of ISR isn't forcefully wiped thus leading to situation 1.
Abhijeet, have I understood (and summarised) what you are proposing
correctly?

5. I think in today's Tiered Storage we know the leader's log-start-offset
from the FetchResponse and we can learn its local-log-start-offset from the
ListOffsets by asking for earliest-local timestamp (-4). But granted, this
ought to be added as an additional API call in the KIP.

re: Abhijeet

101. I am still a bit confused as to why you want to include a new offset
(i.e. pending-upload-offset) when you yourself mention that you could use
an already existing offset (i.e. last-tiered-offset + 1). In essence, you
end your Motivation with "In this KIP, we will focus only on the follower
fetch protocol using the *last-tiered-offset*" and then in the following
sections you talk about pending-upload-offset. I understand this might be
classified as an implementation detail, but if you introduce a new offset
(i.e. pending-upload-offset) you have to make a change to the ListOffsets
API (i.e. introduce -6) and thus document it in this KIP as such. However,
the last-tiered-offset ought to already be exposed as part of KIP-1005
(under implementation). Am I misunderstanding something here?

Best,
Christo

On Thu, 4 Apr 2024 at 19:37, Jun Rao  wrote:

> Hi, Abhijeet,
>
> Thanks for the KIP. Left a few comments.
>
> 1. "A drawback of using the last-tiered-offset is that this new follower
> would possess only a limited number of locally stored segments. Should it
> ascend to the role of leader, there is a risk of needing to fetch these
> segments from the remote storage, potentially impacting broker
> performance."
> Since we support consumers fetching from followers, this is a potential
> issue on the follower side too. In theory, it's possible for a segment to
> be tiered immediately after rolling. In that case, there could be very
> little data after last-tiered-offset. It would be useful to think through
> how to address this issue.
>
> 2. ListOffsetsRequest:
> 2.1 Typically, we need to bump up the version of the request if we add a
> new value for timestamp. See
>
> https://github.com/apache/kafka/pull/10760/files#diff-fac7080d67da905a80126d58fc1745c9a1409de7ef7d093c2ac66a888b134633
> .
> 2.2 Since this changes the inter broker request protocol, it would be
> useful to have a section on upgrade (e.g. new IBP/metadata.version).
>
> 3. "Instead of fetching Earliest-Pending-Upload-Offset, it could fetch the
> last-tiered-offset from the leader, and make a separate leader call to
> fetch leader epoch for the following offset."
> Why do we need to make a separate call for the leader epoch?
> ListOffsetsResponse include both the offset and the corresponding epoch.
>
> 4. "Check if the follower replica is empty and if the feature to use
> last-tiered-offset is enabled."
> Why do we need to check if the follower replica is empty?
>
> 5. It can be confirmed by checking if the leader's Log-Start-Offset is the
> same as the Leader's Local-Log-Start-Offset.
> How does the follower know Local-Log-Start-Offset?
>
> Jun
>
> On Sat, Mar 30, 2024 at 5:51 AM Abhijeet Kumar  >
> wrote:
>
> > Hi Christo,
> >
> > Thanks for reviewing the KIP.
> >
> > The follower needs the earliest-pending-upload-offset (and the
> > corresponding leader epoch) from the leader.
> > This is the first offset the follower will have locally.
> >
> > Regards,
> > Abhijeet.
> >
> >
> >
> > On Fri, Mar 29, 2024 at 1:14 PM Christo Lolov 
> > wrote:
> >
> > > Heya!
> > >
> > > First of all, thank you very much for the proposal, you have explained
> > the
> > > problem you want solved very well - I think a faster bootstrap of an
> > empty
> > > replica is definitely an improvement!
> > >
> > > For my understanding, which concrete offset do you want the leader to
> > give
> > > back to a follower - earliest-pending-upload-offset or the
> > > latest-tiered-offset? If it is the second, then I believe KIP-1005
> ought
> > to
> > > already be exposing that offset as part of the ListOffsets API, no?
> > >
> > > Best,
> > > Christo
> > >
> > > On Wed, 27 Mar 2024 at 18:23, Abhijeet Kumar <
> 
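For reference, the ListOffsets API distinguishes its special queries with negative sentinel timestamps. The constants below follow the values discussed in this thread and the referenced KIPs; treat -4, -5, and especially -6 (only a proposal in this discussion, not a released constant) as assumptions rather than settled protocol.

```java
// Sentinel timestamps for ListOffsets "special" queries. -1/-2/-3 are
// long-standing; -4 (earliest-local) came with tiered storage; -5
// (latest-tiered) is under KIP-1005; -6 is only the value this thread
// proposes for earliest-pending-upload. Hypothetical reference class.
public final class ListOffsetsSentinels {
    public static final long LATEST_TIMESTAMP = -1L;
    public static final long EARLIEST_TIMESTAMP = -2L;
    public static final long MAX_TIMESTAMP = -3L;
    public static final long EARLIEST_LOCAL_TIMESTAMP = -4L;
    public static final long LATEST_TIERED_TIMESTAMP = -5L;            // KIP-1005
    public static final long EARLIEST_PENDING_UPLOAD_TIMESTAMP = -6L; // proposed here

    private ListOffsetsSentinels() { }

    public static void main(String[] args) {
        // A follower rebuilding from tiered state would query -4 today,
        // or -6 if this KIP's proposal is adopted.
        System.out.println(EARLIEST_LOCAL_TIMESTAMP); // prints "-4"
    }
}
```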

[jira] [Created] (KAFKA-16476) Check whether TestInfo contains correct parameter name

2024-04-05 Thread PoAn Yang (Jira)
PoAn Yang created KAFKA-16476:
-

 Summary: Check whether TestInfo contains correct parameter name
 Key: KAFKA-16476
 URL: https://issues.apache.org/jira/browse/KAFKA-16476
 Project: Kafka
  Issue Type: Test
Reporter: PoAn Yang
Assignee: PoAn Yang


In KAFKA-16472, we fixed the missing parameter names in `TestInfo#getDisplayName` 
for Java tests. However, we don't have a check for cases where users give an 
empty or incorrect parameter name in a test function.

For example, before the fix in KAFKA-16472, 
`DeleteOffsetsConsumerGroupCommandIntegrationTest#testDeleteOffsetsNonExistingGroup`
 showed `(String)`.

For an incorrect parameter name, if users use a name like `Quorum` instead of 
`quorum`, `TestInfoUtils#isKRaft` still can't work as expected.

We could probably add a check: a display name containing `.` but not `=` 
indicates the empty case, and one containing `zk` or `kraft` but not `quorum` 
indicates the incorrect-name case.
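The heuristic described above could look roughly like the following. This is a hypothetical helper, not the eventual TestInfoUtils change: a display name that mentions a quorum value ("zk"/"kraft") but never "quorum=" indicates the parameter name was lost or misspelled, and the test should fail fast.

```java
// Rough sketch of the proposed display-name sanity check
// (hypothetical helper, not the actual TestInfoUtils change).
public class QuorumDisplayNameCheck {
    static boolean looksValid(String displayName) {
        boolean mentionsQuorumValue =
            displayName.contains("zk") || displayName.contains("kraft");
        if (!mentionsQuorumValue) {
            return true; // not a quorum-parameterized test, nothing to check
        }
        // With '-parameters' enabled and the parameter named "quorum",
        // the display name contains "quorum=kraft" / "quorum=zk".
        return displayName.contains("quorum=");
    }

    public static void main(String[] args) {
        System.out.println(looksValid("testDeleteOffsets(quorum=kraft)")); // prints "true"
        System.out.println(looksValid("testDeleteOffsets(kraft)"));        // prints "false"
    }
}
```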



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.7 #127

2024-04-05 Thread Apache Jenkins Server
See 




[ANNOUNCE] Apache Kafka 3.6.2

2024-04-05 Thread Manikumar
The Apache Kafka community is pleased to announce the release for
Apache Kafka 3.6.2

This is a bug fix release and it includes fixes and improvements from 28 JIRAs.

All of the changes in this release can be found in the release notes:
https://www.apache.org/dist/kafka/3.6.2/RELEASE_NOTES.html


You can download the source and binary release (Scala 2.12 and Scala 2.13) from:
https://kafka.apache.org/downloads#3.6.2

---


Apache Kafka is a distributed streaming platform with four core APIs:


** The Producer API allows an application to publish a stream of records to
one or more Kafka topics.

** The Consumer API allows an application to subscribe to one or more
topics and process the stream of records produced to them.

** The Streams API allows an application to act as a stream processor,
consuming an input stream from one or more topics and producing an
output stream to one or more output topics, effectively transforming the
input streams to output streams.

** The Connector API allows building and running reusable producers or
consumers that connect Kafka topics to existing applications or data
systems. For example, a connector to a relational database might
capture every change to a table.


With these APIs, Kafka can be used for two broad classes of application:

** Building real-time streaming data pipelines that reliably get data
between systems or applications.

** Building real-time streaming applications that transform or react
to the streams of data.


Apache Kafka is in use at large and small companies worldwide, including
Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
Target, The New York Times, Uber, Yelp, and Zalando, among others.

A big thank you for the following 40 contributors to this release!
(Please report an unintended omission)

Akhilesh Chaganti, Andrew Schofield, Anton Liauchuk, Bob Barrett,
Bruno Cadonna, Cheng-Kai, Zhang, Chia-Ping Tsai, Chris Egerton, Colin
P. McCabe, Colin Patrick McCabe, David Arthur, David Jacot, David Mao,
Divij Vaidya, Edoardo Comar, Emma Humber, Gaurav Narula, Greg Harris,
hudeqi, Ismael Juma, Jason Gustafson, Jim Galasyn, Joel Hamill, Johnny
Hsu, José Armando García Sancio, Justine Olshan, Luke Chen, Manikumar
Reddy, Matthias J. Sax, Mayank Shekhar Narula, Mickael Maison, Mike
Lloyd, Paolo Patierno, PoAn Yang, Ron Dagostino, Sagar Rao, Stanislav
Kozlovski, Walker Carlson, Yash Mayya

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
https://kafka.apache.org/

Thank you!


Regards,

Manikumar
Release Manager for Apache Kafka 3.6.2